
Subsections Definition of a Matrix Special Matrices Operations on Matrices Multiplication of Matrices Inverse of a Matrix Some More Special Matrices

Submatrix of a Matrix Block Matrices Miscellaneous Exercises Matrices over Complex Numbers


DEFINITION 1.1.1 (Matrix) A rectangular array of numbers is called a matrix. We shall mostly be concerned with matrices having real numbers as entries. The horizontal arrays of a matrix are called its ROWS and the vertical arrays are called its COLUMNS. A matrix having $m$ rows and $n$ columns is said to have the order $m \times n$. A matrix $A$ of order $m \times n$ can be represented in the following form:

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix},$$

where $a_{ij}$ is the entry at the intersection of the $i^{\text{th}}$ row and $j^{\text{th}}$ column.

In a more concise manner, we also denote the matrix $A$ by $[a_{ij}]$, by suppressing its order.

Remark 1.1.2 Some books also use $(a_{ij})$ to represent a matrix.

Let

Then

and

A matrix having only one column is called a COLUMN VECTOR; and a matrix with only one row is called a ROW VECTOR. WHENEVER A VECTOR IS USED, IT SHOULD BE UNDERSTOOD FROM THE CONTEXT WHETHER IT IS A ROW VECTOR OR A COLUMN VECTOR. DEFINITION 1.1.3 (Equality of two Matrices) Two matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ having the same order $m \times n$ are equal if $a_{ij} = b_{ij}$ for each $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$.

where the point of symmetry is left as an exercise to the reader.

is the centre. The calculation of the planes
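These notions (order, rows, columns and entry-wise equality) can be checked concretely. Below is a minimal NumPy sketch using a made-up $2 \times 3$ matrix; it is an illustration only.

```python
import numpy as np

# A hypothetical 2 x 3 matrix: 2 rows and 3 columns, so its order is 2 x 3.
A = np.array([[1, 4, 5],
              [0, 1, 2]])

print(A.shape)     # (2, 3): the order m x n
print(A[0, :])     # first row (a row vector)
print(A[:, 2])     # third column (a column vector)
print(A[0, 2])     # entry a_{13}: intersection of row 1 and column 3

# Two matrices are equal when they have the same order and all
# corresponding entries agree (Definition 1.1.3).
B = np.array([[1, 4, 5],
              [0, 1, 2]])
print(A.shape == B.shape and np.array_equal(A, B))   # True
```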

DEFINITION 1.1.5
1. A matrix in which each entry is zero is called a zero-matrix, denoted by $\mathbf{0}$. For example, $\mathbf{0}_{2 \times 2} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ and $\mathbf{0}_{2 \times 3} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$.
2. A matrix for which the number of rows equals the number of columns is called a square matrix. So, if $A$ is an $n \times n$ matrix then $A$ is said to have order $n$.
3. In a square matrix $A = [a_{ij}]$ of order $n$, the entries $a_{11}, a_{22}, \ldots, a_{nn}$ are called the diagonal entries and form the principal diagonal of $A$.
4. A square matrix $A = [a_{ij}]$ is said to be a diagonal matrix if $a_{ij} = 0$ for $i \neq j$. In other words, the non-zero entries appear only on the principal diagonal. For example, the zero matrix $\mathbf{0}_n$ and $\begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}$ are a few diagonal matrices. A diagonal matrix $D$ of order $n$ with the diagonal entries $d_1, d_2, \ldots, d_n$ is denoted by $D = \operatorname{diag}(d_1, \ldots, d_n)$. If $d_i = d$ for all $i = 1, 2, \ldots, n$, then the diagonal matrix $D$ is called a scalar matrix.
5. A diagonal matrix $D$ of order $n$ is called an IDENTITY MATRIX if $d_i = 1$ for all $i = 1, 2, \ldots, n$. This matrix is denoted by $I_n$. For example, $I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and $I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$. The subscript $n$ is suppressed in case the order is clear from the context or if no confusion arises.

A square matrix $A = [a_{ij}]$ is said to be an upper triangular matrix if $a_{ij} = 0$ for $i > j$. A square matrix $A = [a_{ij}]$ is said to be a lower triangular matrix if $a_{ij} = 0$ for $i < j$. A square matrix $A$ is said to be triangular if it is an upper or a lower triangular matrix.

For example, $\begin{bmatrix} 2 & 1 & 4 \\ 0 & 3 & -1 \\ 0 & 0 & -2 \end{bmatrix}$ is an upper triangular matrix. An upper triangular matrix will be represented by $\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ 0 & a_{22} & \cdots & a_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & a_{nn} \end{bmatrix}$.

DEFINITION 1.2.1 (Transpose of a Matrix) The transpose of an $m \times n$ matrix $A = [a_{ij}]$ is defined as the $n \times m$ matrix $B = [b_{ij}]$, with $b_{ij} = a_{ji}$ for $1 \le i \le n$ and $1 \le j \le m$. The transpose of $A$ is denoted by $A^t$.

That is, by the transpose of an $m \times n$ matrix $A$, we mean a matrix of order $n \times m$ having the rows of $A$ as its columns and the columns of $A$ as its rows.

For example, if $A = \begin{bmatrix} 1 & 4 & 5 \\ 0 & 1 & 2 \end{bmatrix}$ then $A^t = \begin{bmatrix} 1 & 0 \\ 4 & 1 \\ 5 & 2 \end{bmatrix}$.

Thus, the transpose of a row vector is a column vector and vice-versa.

THEOREM 1.2.2 For any matrix $A$, we have $(A^t)^t = A$.

Proof. Let $A = [a_{ij}]$, $A^t = [b_{ij}]$ and $(A^t)^t = [c_{ij}]$. Then, the definition of transpose gives $c_{ij} = b_{ji} = a_{ij}$ for all $i, j$, and the result follows. $\square$

DEFINITION 1.2.3 (Addition of Matrices) Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be two $m \times n$ matrices. Then the sum $A + B$ is defined to be the matrix $C = [c_{ij}]$ with $c_{ij} = a_{ij} + b_{ij}$.

Note that we define the sum of two matrices only when the orders of the two matrices are the same.

DEFINITION 1.2.4 (Multiplying a Scalar to a Matrix) Let $A = [a_{ij}]$ be an $m \times n$ matrix. Then for any element $k \in \mathbb{R}$, we define $kA = [k\, a_{ij}]$.

For example, if

and

then
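As a small computational illustration of Definitions 1.2.1, 1.2.3 and 1.2.4 (the matrices below are made up, since the original example is not reproduced here): addition is entry-wise and requires equal orders, a scalar multiplies every entry, and transposing twice returns the original matrix.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 0],
              [1, 2]])

# Addition (Definition 1.2.3): defined only when A and B have the same order.
print(A + B)        # [[6 2] [4 6]]

# Scalar multiplication (Definition 1.2.4): k*A multiplies every entry by k.
k = 3
print(k * A)        # [[3 6] [9 12]]

# Transpose (Definition 1.2.1) and Theorem 1.2.2: (A^t)^t = A.
print(np.array_equal(A.T.T, A))   # True
```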

THEOREM 1.2.5 Let $A$, $B$ and $C$ be matrices of order $m \times n$, and let $k, \ell \in \mathbb{R}$. Then
1. $A + B = B + A$ (commutativity).
2. $(A + B) + C = A + (B + C)$ (associativity).
3. $k(\ell A) = (k\ell)A$.
4. $(k + \ell)A = kA + \ell A$.

Proof. Part 1. Let $A = [a_{ij}]$ and $B = [b_{ij}]$. Then $A + B = [a_{ij} + b_{ij}] = [b_{ij} + a_{ij}] = B + A$, as real numbers commute. The reader is required to prove the other parts as all the results follow from the properties of real numbers. $\square$

EXERCISE 1.2.6
1. Suppose $A + B = A$. Then show that $B = \mathbf{0}$.
2. Suppose $A + B = \mathbf{0}$. Then show that $B = (-1)A = [-a_{ij}]$.

DEFINITION 1.2.7 (Additive Inverse) Let $A$ be an $m \times n$ matrix.
1. Then there exists a matrix $B$ with $A + B = \mathbf{0}$. This matrix $B$ is called the additive inverse of $A$, and is denoted by $-A = (-1)A$.
2. Also, for the matrix $\mathbf{0}_{m \times n}$, $A + \mathbf{0} = \mathbf{0} + A = A$. Hence, the matrix $\mathbf{0}_{m \times n}$ is called the additive identity.

DEFINITION 1.2.8 (Matrix Multiplication / Product) Let $A = [a_{ij}]$ be an $m \times n$ matrix and $B = [b_{ij}]$ be an $n \times r$ matrix. The product $AB$ is a matrix $C = [c_{ij}]$ of order $m \times r$ with
$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}, \qquad 1 \le i \le m, \; 1 \le j \le r.$$
That is, the $(i, j)^{\text{th}}$ entry of $AB$ is obtained by multiplying the $i^{\text{th}}$ row of $A$ with the $j^{\text{th}}$ column of $B$, entry by entry, and adding.

Observe that the product $AB$ is defined if and only if

THE NUMBER OF COLUMNS OF $A$ = THE NUMBER OF ROWS OF $B$.

For example, if

and

then

(1.2.1)

(1.2.2)

Observe the following:
1. In this example, while $AB$ is defined, the product $BA$ is not defined. However, for square matrices $A$ and $B$ of the same order, both the products $AB$ and $BA$ are defined.
2. The product $AB$ corresponds to operating on the rows of the matrix $B$ (see 1.2.1).
3. The product $AB$ also corresponds to operating on the columns of the matrix $A$ (see 1.2.2).

DEFINITION 1.2.9 Two square matrices $A$ and $B$ are said to commute if $AB = BA$.

Remark 1.2.10
1. Note that if $A$ is a square matrix of order $n$ then $AI_n = I_n A = A$. Also, a scalar matrix of order $n$ commutes with any square matrix of order $n$.
2. In general, the matrix product is not commutative. For example, take two square matrices $A$ and $B$ of the same order and check that, in general, the matrix products $AB$ and $BA$ are not equal.
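The product formula of Definition 1.2.8 and the failure of commutativity in Remark 1.2.10 can both be checked numerically; the $2 \times 2$ matrices below are made up for illustration.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

# Entry (i, j) of AB is the sum over k of a_{ik} * b_{kj} (Definition 1.2.8).
m, n = A.shape
n2, r = B.shape
assert n == n2, "AB is defined only when columns of A = rows of B"
C = np.zeros((m, r))
for i in range(m):
    for j in range(r):
        C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))

print(np.array_equal(C, A @ B))        # True: matches the built-in product

# In general AB != BA, even for square matrices of the same order.
print(np.array_equal(A @ B, B @ A))    # False for this pair
```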

THEOREM 1.2.11 Suppose that the matrices $A$, $B$ and $C$ are so chosen that the matrix multiplications are defined.
1. Then $(AB)C = A(BC)$. That is, the matrix multiplication is associative.
2. For any $k \in \mathbb{R}$, $(kA)B = k(AB) = A(kB)$.
3. Then $A(B + C) = AB + AC$. That is, multiplication distributes over addition.
4. If $A$ is an $n \times n$ matrix then $AI_n = I_n A = A$.
5. For any square matrix $A$ of order $n$ and $D = \operatorname{diag}(d_1, d_2, \ldots, d_n)$, the first row of $DA$ is $d_1$ times the first row of $A$; and for $1 \le i \le n$, the $i^{\text{th}}$ row of $DA$ is $d_i$ times the $i^{\text{th}}$ row of $A$. A similar statement holds for the columns of $A$ when $A$ is multiplied on the right by $D$.

Proof. Part 1. Let $A = [a_{ij}]_{m \times n}$, $B = [b_{ij}]_{n \times p}$ and $C = [c_{ij}]_{p \times q}$. Then
$$(BC)_{kj} = \sum_{\ell=1}^{p} b_{k\ell}\, c_{\ell j} \quad \text{and} \quad (AB)_{i\ell} = \sum_{k=1}^{n} a_{ik}\, b_{k\ell}.$$
Therefore,
$$\bigl(A(BC)\bigr)_{ij} = \sum_{k=1}^{n} a_{ik} (BC)_{kj} = \sum_{k=1}^{n} \sum_{\ell=1}^{p} a_{ik} b_{k\ell} c_{\ell j} = \sum_{\ell=1}^{p} (AB)_{i\ell}\, c_{\ell j} = \bigl((AB)C\bigr)_{ij}.$$

Part 5. For all $j = 1, 2, \ldots, n$, we have $(DA)_{ij} = \sum_{k=1}^{n} d_{ik} a_{kj} = d_i a_{ij}$, as $d_{ik} = 0$ whenever $i \neq k$. Hence, the required result follows.

The reader is required to prove the other parts. $\square$

EXERCISE 1.2.12
1. Let $A$ and $B$ be two matrices. If the matrix addition $A + B$ is defined, then prove that $(A + B)^t = A^t + B^t$. Also, if the matrix product $AB$ is defined, then prove that $(AB)^t = B^t A^t$.

2. Let

and

Compute the matrix products

and

Let

be a positive integer. Compute

for the following matrices:

Can you guess a formula for

1. Suppose that the matrix product $AB$ is defined. Then the product $BA$ need not be defined.
2. Suppose that the matrix products $AB$ and $BA$ are defined. Then the matrices $AB$ and $BA$ can have different orders.
3. Suppose that the matrices $A$ and $B$ are square matrices of order $n$. Then $AB$ and $BA$ are defined, but $AB$ and $BA$ may or may not be equal.

DEFINITION 1.2.13 (Inverse of a Matrix) Let $A$ be a square matrix of order $n$.
1. A square matrix $B$ is said to be a LEFT INVERSE of $A$ if $BA = I_n$.
2. A square matrix $C$ is called a RIGHT INVERSE of $A$ if $AC = I_n$.
3. A matrix $A$ is said to be INVERTIBLE (or is said to have an INVERSE) if there exists a matrix $B$ such that $AB = BA = I_n$.

LEMMA 1.2.14 Let $A$ be an $n \times n$ matrix. Suppose that there exist $n \times n$ matrices $B$ and $C$ such that $AB = I_n$ and $CA = I_n$. Then $B = C$.

Proof. Note that $C = CI_n = C(AB) = (CA)B = I_n B = B$. $\square$

Remark 1.2.15
1. From the above lemma, we observe that if a matrix $A$ is invertible, then the inverse is unique.
2. As the inverse of a matrix $A$ is unique, we denote it by $A^{-1}$. That is, $AA^{-1} = A^{-1}A = I$.

THEOREM 1.2.16 Let $A$ and $B$ be two matrices with inverses $A^{-1}$ and $B^{-1}$, respectively. Then
1. $(A^{-1})^{-1} = A$;
2. $(AB)^{-1} = B^{-1}A^{-1}$;
3. $(A^t)^{-1} = (A^{-1})^t$.

Proof. Proof of Part 1. By definition, $AA^{-1} = A^{-1}A = I$. Hence, if we denote $A^{-1}$ by $B$, then we get $AB = BA = I$, or equivalently $BA = AB = I$. Thus, the definition implies $B^{-1} = A$, that is, $(A^{-1})^{-1} = A$.

Proof of Part 2. Verify that $(AB)(B^{-1}A^{-1}) = I = (B^{-1}A^{-1})(AB)$.

Proof of Part 3. We know $AA^{-1} = A^{-1}A = I$. Taking transpose, we get $(A^{-1})^t A^t = A^t (A^{-1})^t = I$. Hence, by definition $(A^t)^{-1} = (A^{-1})^t$. $\square$

EXERCISE 1.2.17
1. Let $A_1, A_2, \ldots, A_r$ be invertible matrices. Prove that the product $A_1 A_2 \cdots A_r$ is also an invertible matrix.
2. Let $A$ be an invertible matrix. Then prove that $A$ cannot have a row or column consisting of only zeros.
3. Let $A$ be an invertible matrix and let $c$ be a nonzero real number. Then determine the inverse of the matrix $cA$.

DEFINITION 1.3.1
1. A matrix $A$ over $\mathbb{R}$ is called symmetric if $A^t = A$, and skew-symmetric if $A^t = -A$.
2. A matrix $A$ is said to be orthogonal if $AA^t = A^t A = I$.

EXAMPLE 1.3.2
1. Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & -1 \\ 3 & -1 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 1 & 2 \\ -1 & 0 & -3 \\ -2 & 3 & 0 \end{bmatrix}$. Then $A$ is a symmetric matrix and $B$ is a skew-symmetric matrix.
2. Let $A = \begin{bmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{bmatrix}$. Then $A$ is an orthogonal matrix.

3. Let $A$ be an $n \times n$ matrix with $A \neq \mathbf{0}$, for which a positive integer $k$ exists such that $A^k = \mathbf{0}$ and $A^{\ell} \neq \mathbf{0}$ for $1 \le \ell \le k - 1$. The matrices $A$ that satisfy this condition are called NILPOTENT matrices. The least positive integer $k$ for which $A^k = \mathbf{0}$ is called the ORDER OF NILPOTENCY. For example, $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ is nilpotent of order 2, since $A \neq \mathbf{0}$ and $A^2 = \mathbf{0}$.

4. Let $A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$. Then $A^2 = A$. The matrices that satisfy the condition $A^2 = A$ are called IDEMPOTENT matrices.
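The defining conditions above (symmetric, skew-symmetric, orthogonal, nilpotent, idempotent) are simple equality checks. The sketch below uses made-up matrices purely for illustration.

```python
import numpy as np

S = np.array([[1, 2], [2, 3]])          # symmetric: S^t = S
K = np.array([[0, 1], [-1, 0]])         # skew-symmetric: K^t = -K
t = np.pi / 4
Q = np.array([[np.cos(t), -np.sin(t)],  # rotation matrix: Q Q^t = I
              [np.sin(t),  np.cos(t)]])
N = np.array([[0, 1], [0, 0]])          # nilpotent: N^2 = 0
P = np.array([[1, 0], [0, 0]])          # idempotent: P^2 = P

print(np.array_equal(S.T, S))                   # True
print(np.array_equal(K.T, -K))                  # True
print(np.allclose(Q @ Q.T, np.eye(2)))          # True (orthogonal)
print(np.array_equal(N @ N, np.zeros((2, 2))))  # True (nilpotent)
print(np.array_equal(P @ P, P))                 # True (idempotent)
```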

EXERCISE 1.3.3
1. Show that for any square matrix $A$, $S = \tfrac{1}{2}(A + A^t)$ is symmetric, $T = \tfrac{1}{2}(A - A^t)$ is skew-symmetric, and $A = S + T$.
2. Show that the product of two lower triangular matrices is a lower triangular matrix. A similar statement holds for upper triangular matrices.
3. Let $A$ and $B$ be symmetric matrices. Show that $AB$ is symmetric if and only if $AB = BA$.
4. Show that the diagonal entries of a skew-symmetric matrix are zero.
5. Let $A$, $B$ be skew-symmetric matrices with $AB = BA$. Is the matrix $AB$ skew-symmetric?
6. Let $A$ be a symmetric matrix of order $n$ with $A^2 = \mathbf{0}$. Is it necessarily true that $A = \mathbf{0}$?
7. Let $A$ be a nilpotent matrix. Show that there exists a matrix $B$ such that $B(I + A) = I = (I + A)B$.

DEFINITION 1.3.4 A matrix obtained by deleting some of the rows and/or columns of a matrix is said to be a submatrix of the given matrix. For example, if a few submatrices of are

But the matrices

and

are not submatrices of

Let $A$ be an $n \times m$ matrix and $B$ be an $m \times p$ matrix. Suppose $r < m$. Then, we can decompose the matrices $A$ and $B$ as $A = [P \;\; Q]$ and $B = \begin{bmatrix} H \\ K \end{bmatrix}$, where $P$ has order $n \times r$ and $H$ has order $r \times p$. That is, the matrices $P$ and $Q$ are submatrices of $A$; $P$ consists of the first $r$ columns of $A$ and $Q$ consists of the last $m - r$ columns of $A$. Similarly, $H$ and $K$ are submatrices of $B$; $H$ consists of the first $r$ rows of $B$ and $K$ consists of the last $m - r$ rows of $B$. We now prove the following important theorem.

THEOREM 1.3.5 Let $A = [P \;\; Q]$ and $B = \begin{bmatrix} H \\ K \end{bmatrix}$ be defined as above. Then $AB = PH + QK$.

Proof. First note that the matrices $PH$ and $QK$ are each of order $n \times p$. The matrix products $PH$ and $QK$ are valid as the orders of the matrices $P$, $H$, $Q$ and $K$ are respectively $n \times r$, $r \times p$, $n \times (m - r)$ and $(m - r) \times p$. Let $P = [P_{ij}]$, $Q = [Q_{ij}]$, $H = [H_{ij}]$ and $K = [K_{ij}]$. Then, for $1 \le i \le n$ and $1 \le j \le p$, we have
$$(AB)_{ij} = \sum_{k=1}^{m} a_{ik} b_{kj} = \sum_{k=1}^{r} a_{ik} b_{kj} + \sum_{k=r+1}^{m} a_{ik} b_{kj} = (PH)_{ij} + (QK)_{ij} = (PH + QK)_{ij}. \qquad \square$$

Theorem 1.3.5 is very useful due to the following reasons:
1. The orders of the matrices $P$, $Q$, $H$ and $K$ are smaller than those of $A$ or $B$.
2. It may be possible to block the matrix in such a way that a few blocks are either identity matrices or zero matrices. In this case, it may be easy to handle the matrix product using the block form.
3. When we want to prove results using induction, we may assume the result for $r \times r$ submatrices and then look at $(r + 1) \times (r + 1)$ submatrices, etc.

For example, if

and

Then

If

then

can be decomposed as follows:

or

or

and so on.

Suppose $A = \begin{bmatrix} P & Q \\ R & S \end{bmatrix}$ and $B = \begin{bmatrix} E & F \\ G & H \end{bmatrix}$. Then the matrices $P, Q, R, S$ and $E, F, G, H$ are called the blocks of the matrices $A$ and $B$, respectively.

Even if $A + B$ is defined, the orders of $P$ and $E$ may not be the same and hence, we may not be able to add $A$ and $B$ in the block form. But, if $A + B$ and $P + E$ are defined, then $A + B = \begin{bmatrix} P + E & Q + F \\ R + G & S + H \end{bmatrix}$.

Similarly, if the product $AB$ is defined, the product $PE$ need not be defined. Therefore, we can talk of the matrix product $AB$ as a block product of matrices, if both the products $AB$ and $PE$ are defined. And in this case, we have $AB = \begin{bmatrix} PE + QG & PF + QH \\ RE + SG & RF + SH \end{bmatrix}$.

That is, once a partition of $A$ is fixed, the partition of $B$ has to be properly chosen for purposes of block addition or multiplication.
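Theorem 1.3.5 can be verified numerically: splitting $A$ by columns and $B$ by the matching rows, the full product equals the sum of the block products. The sizes and the split point $r$ in this sketch are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p, r = 3, 4, 2, 2          # A is n x m, B is m x p, split after column r of A
A = rng.integers(-3, 4, size=(n, m))
B = rng.integers(-3, 4, size=(m, p))

P, Q = A[:, :r], A[:, r:]        # P: first r columns of A, Q: last m - r columns
H, K = B[:r, :], B[r:, :]        # H: first r rows of B,   K: last m - r rows

print(np.array_equal(A @ B, P @ H + Q @ K))   # True: AB = PH + QK
```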

Here the entries of the matrix are complex numbers. All the definitions still hold. One just needs to look at the following additional definitions.

DEFINITION 1.4.1 (Conjugate Transpose of a Matrix)
1. Let $A$ be an $m \times n$ matrix over $\mathbb{C}$ with $A = [a_{ij}]$. Then the Conjugate of $A$, denoted by $\overline{A}$, is the matrix $B = [b_{ij}]$ with $b_{ij} = \overline{a_{ij}}$. For example, let $A = \begin{bmatrix} 1 & 4 + 3i \\ 0 & 1 - i \end{bmatrix}$. Then $\overline{A} = \begin{bmatrix} 1 & 4 - 3i \\ 0 & 1 + i \end{bmatrix}$.
2. Let $A$ be an $m \times n$ matrix over $\mathbb{C}$ with $A = [a_{ij}]$. Then the Conjugate Transpose of $A$, denoted by $A^*$, is the matrix $B = [b_{ij}]$ with $b_{ij} = \overline{a_{ji}}$. For example, let $A = \begin{bmatrix} 1 & 4 + 3i \\ 0 & 1 - i \end{bmatrix}$. Then $A^* = \begin{bmatrix} 1 & 0 \\ 4 - 3i & 1 + i \end{bmatrix}$.
3. A square matrix $A$ over $\mathbb{C}$ is called Hermitian if $A^* = A$.
4. A square matrix $A$ over $\mathbb{C}$ is called skew-Hermitian if $A^* = -A$.
5. A square matrix $A$ over $\mathbb{C}$ is called unitary if $AA^* = A^*A = I$.
6. A square matrix $A$ over $\mathbb{C}$ is called Normal if $AA^* = A^*A$.

Remark 1.4.2 If $A = [a_{ij}]$ with $a_{ij} \in \mathbb{R}$, then $A^* = A^t$.
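A short computational sketch of Definition 1.4.1 and Remark 1.4.2, with a made-up complex matrix:

```python
import numpy as np

A = np.array([[2, 1 + 1j],
              [1 - 1j, 3]])

A_conj = np.conj(A)        # the conjugate of A
A_star = np.conj(A).T      # the conjugate transpose A*

print(np.array_equal(A_star, A))             # True: A is Hermitian
print(np.allclose(A @ A_star, A_star @ A))   # True: Hermitian matrices are normal

# A real matrix has A* = A^t (Remark 1.4.2).
B = np.array([[1.0, 2.0], [3.0, 4.0]])
print(np.array_equal(np.conj(B).T, B.T))     # True
```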

Subsections Introduction A Solution Method Row Operations and Equivalent Systems Gauss Elimination Method Row Reduced Echelon Form of a Matrix Gauss-Jordan Elimination Elementary Matrices Rank of a Matrix Existence of Solution of Example Main Theorem Equivalent conditions for Invertibility Inverse and the Gauss-Jordan Method Determinant Adjoint of a Matrix Cramer's Rule Miscellaneous Exercises

Let us look at some examples of linear systems. 1. Suppose 1. If 2. If 1. 2. Consider the system then the system has a UNIQUE SOLUTION and then the system has NO SOLUTION. then the system has INFINITE NUMBER OF SOLUTIONS, namely all equations in unknowns. If one of the coefficients, Thus for the system or is non-zero, then this linear

2. We now consider a system with Consider the equation equation represents a line in

the set of solutions is given by the points of intersection of the two lines. There are three cases to be considered. Each case is illustrated by an example. 1. UNIQUE SOLUTION and The unique solution is Observe that in this case, 2. INFINITE NUMBER OF SOLUTIONS and The set of solutions is with represent the same line. Observe that in this case, 3. NO SOLUTION and no point of intersection. Observe that in this case, 3. As a last example, consider A linear equation arbitrary. In other words, both the equations and The equations represent a pair of parallel lines and hence there is but provided As in the

equations in unknowns. represent a plane in

case of equations in unknowns, we have to look at the points of intersection of the given three planes. Here again, we have three cases. The three cases are illustrated by examples. 1. UNIQUE SOLUTION Consider the system and The unique

solution to this system is 2. INFINITE NUMBER OF SOLUTIONS Consider the system solutions to this system is
THE THREE PLANES INTERSECT ON A LINE. 3. NO SOLUTION

i.e. THE THREE PLANES INTERSECT AT A POINT. and The set of with arbitrary:

The system

and

has no solution. In this

case, we get three parallel lines as intersections of the above planes taken two at a time. The readers are advised to supply the proof.

DEFINITION 2.1.1 (Linear System) A linear system of $m$ equations in $n$ unknowns $x_1, x_2, \ldots, x_n$ is a set of equations of the form
$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\
&\;\;\vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m
\end{aligned} \qquad (2.1.1)$$
where $a_{ij}, b_i \in \mathbb{R}$ for $1 \le i \le m$ and $1 \le j \le n$. Linear System (2.1.1) is called HOMOGENEOUS if $b_1 = b_2 = \cdots = b_m = 0$ and NON-HOMOGENEOUS otherwise. We rewrite the above equations in the form $A\mathbf{x} = \mathbf{b}$, where
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}, \quad \text{and} \quad \mathbf{b} = \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix}.$$
The matrix $A$ is called the COEFFICIENT matrix and the block matrix $[A \;\; \mathbf{b}]$ is the AUGMENTED matrix of the linear system (2.1.1).

Remark 2.1.2 Observe that the $i^{\text{th}}$ row of the augmented matrix $[A \;\; \mathbf{b}]$ represents the $i^{\text{th}}$ equation, and the $j^{\text{th}}$ column of the coefficient matrix $A$ corresponds to coefficients of the variable $x_j$. That is, for $1 \le i \le m$ and $1 \le j \le n$, the entry $a_{ij}$ of the coefficient matrix $A$ corresponds to the $i^{\text{th}}$ equation and the $j^{\text{th}}$ variable $x_j$. For a system of linear equations $A\mathbf{x} = \mathbf{b}$, the system $A\mathbf{x} = \mathbf{0}$ is called the ASSOCIATED HOMOGENEOUS SYSTEM.

DEFINITION 2.1.3 (Solution of a Linear System) A solution of the linear system $A\mathbf{x} = \mathbf{b}$ is a column vector $\mathbf{y}$ with entries $y_1, y_2, \ldots, y_n$ such that the linear system (2.1.1) is satisfied by substituting $y_i$ in place of $x_i$. That is, if $\mathbf{y} = (y_1, \ldots, y_n)^t$ then $A\mathbf{y} = \mathbf{b}$ holds.

Note: The zero $n$-tuple $\mathbf{x} = \mathbf{0}$ is always a solution of the system $A\mathbf{x} = \mathbf{0}$, and is called the TRIVIAL solution. A non-zero $n$-tuple $\mathbf{x}$ satisfying $A\mathbf{x} = \mathbf{0}$ is called a NON-TRIVIAL solution.

EXAMPLE 2.1.4 Let us solve the linear system

and

Solution: 1. The above linear system and the linear system (2.1.2)

have the same set of solutions. (why?) 2. Using the equation, we eliminate from and equation to get the linear system

(2.1.3) This system and the system (2.1.2) has the same set of solution. (why?) 3. Using the equation, we eliminate from the last equation of system (2.1.3) to get the system

(2.1.4) which has the same set of solution as the system (2.1.3). (why?) 4. The system (2.1.4) and system

(2.1.5) has the same set of solution. (why?) 5. Now, solution is implies and Or in terms of a vector, the set of

DEFINITION 2.2.1 (Elementary Operations) The following operations 1, 2 and 3 are called elementary operations.
1. interchange of two equations, say ``interchange the $i^{\text{th}}$ and $j^{\text{th}}$ equations";
(compare the system (2.1.2) with the original system.)
2. multiply a non-zero constant throughout an equation, say ``multiply the $k^{\text{th}}$ equation by $c \neq 0$";
(compare the system (2.1.5) and the system (2.1.4).)
3. replace an equation by itself plus a constant multiple of another equation, say ``replace the $k^{\text{th}}$ equation by the $k^{\text{th}}$ equation plus $c$ times the $j^{\text{th}}$ equation".
(compare the system (2.1.3) with (2.1.2) or the system (2.1.4) with (2.1.3).)

Remark 2.2.2
1. In Example 2.1.4, observe that the elementary operations helped us in getting a linear system (2.1.5), which was easily solvable.
2. Note that at Step 1, if we interchange the first and the second equation, we get back to the linear system from which we had started. This means the operation at Step 1 has an inverse operation. In other words, the INVERSE OPERATION sends us back to the step where we had precisely started. So, in Example 2.1.4, the application of a finite number of elementary operations helped us to obtain a simpler system whose solution can be obtained directly. That is, after applying a finite number of elementary operations, a simpler linear system is obtained which can be easily solved. Note that the three elementary operations defined above have corresponding INVERSE operations, namely,
1. ``interchange the $i^{\text{th}}$ and $j^{\text{th}}$ equations";
2. ``divide the $k^{\text{th}}$ equation by $c$";
3. ``replace the $k^{\text{th}}$ equation by the $k^{\text{th}}$ equation minus $c$ times the $j^{\text{th}}$ equation".

It will be a useful exercise for the reader to IDENTIFY THE INVERSE OPERATIONS at each step in Example 2.1.4. DEFINITION 2.2.3 (Equivalent Linear Systems) Two linear systems are said to be equivalent if one can be

obtained from the other by a finite number of elementary operations. The linear systems at each step in Example 2.1.4 are equivalent to each other and also to the original linear system. LEMMA 2.2.4 Let be the linear system obtained from the linear system by a single and have the same set of solutions. elementary operation. Then the linear systems Proof . We prove the result for the elementary operation ``the times the equation is replaced by equation plus

equation." The reader is advised to prove the result for other elementary operations. and vary only in the Then substituting for equation. Let 's in the and be a

In this case, the systems solution of the linear system equations, we get

's in place of

Therefore, (2.2.1)

But then the

equation of the linear system

is (2.2.2)

Therefore, using Equation (2.2.1), Use a similar argument to show that if also a solution of the linear system

is also a solution for the

Equation (2.2.2). then it is

is a solution of the linear system

Hence, we have the proof in this case. height6pt width 6pt depth 0pt Lemma 2.2.4 is now used as an induction step to prove the main result of this section (Theorem 2.2.5). THEOREM 2.2.5 Two equivalent systems have the same set of solutions. Proof . Let be the number of elementary operations performed on theorem by induction on If suppose Lemma 2.2.4 answers the question. If to get We prove the Now, step from the

assume that the theorem is true for

Apply the Lemma 2.2.4 again at the ``last step" (that is, at the

step) to get the required result using induction. $\square$

Let us formalise the above section which led to Theorem 2.2.5. For solving a linear system of equations, we applied elementary operations to equations. It is observed that in performing the elementary operations, the calculations were made on the COEFFICIENTS (numbers). The variables $x_1, x_2, \ldots, x_n$ and the sign of equality (that is, the symbol $=$) are not disturbed. Therefore, in place of looking at the system of equations as a whole, we just need to work with the coefficients. These coefficients, when arranged in a rectangular array, give us the augmented matrix $[A \;\; \mathbf{b}]$.

DEFINITION 2.2.6 (Elementary Row Operations) The elementary row operations are defined as:
1. interchange of two rows, say ``interchange the $i^{\text{th}}$ and $j^{\text{th}}$ rows", denoted $R_{ij}$;
2. multiply a non-zero constant $c$ throughout a row, say ``multiply the $k^{\text{th}}$ row by $c \neq 0$", denoted $R_k(c)$;
3. replace a row by itself plus a constant multiple of another row, say ``replace the $k^{\text{th}}$ row by the $k^{\text{th}}$ row plus $c$ times the $j^{\text{th}}$ row", denoted $R_{kj}(c)$.

EXERCISE 2.2.7 Find the INVERSE row operations corresponding to the elementary row operations that have been defined just above. DEFINITION 2.2.8 (Row Equivalent Matrices) Two matrices are said to be row-equivalent if one can be obtained from the other by a finite number of elementary row operations. EXAMPLE 2.2.9 The three matrices given below are row equivalent.

Whereas the matrix

is not row equivalent to the matrix

DEFINITION 2.2.10 (Forward/Gauss Elimination Method) Gaussian elimination is a method of solving a linear system (consisting of equations in unknowns) by bringing the augmented matrix

to an upper triangular form

This elimination process is also called the forward elimination method. The following examples illustrate the Gauss elimination procedure. EXAMPLE 2.2.11 Solve the linear system by Gauss elimination method.

Solution: In this case, the augmented matrix is steps. 1. Interchange and equation (or ).

The method proceeds along the following

2. Divide the

equation by

(or

).

3. Add

times the

equation to the

equation (or

).

4. Add

times the

equation to the

equation (or

).

5. Multiply the

equation by

(or

).

The last equation gives Hence the set of solutions is

the second equation now gives

Finally the first equation gives

A UNIQUE SOLUTION.
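The elimination steps used in the example above can be collected into a small routine. The following is a minimal sketch of forward elimination; it assumes a non-zero pivot can always be found by a row interchange, and the system in it is made up (it is not the system of Example 2.2.11).

```python
import numpy as np

def forward_eliminate(aug):
    """Bring an augmented matrix [A | b] to upper triangular form (a sketch)."""
    M = aug.astype(float).copy()
    m, _ = M.shape
    for k in range(m):
        # Operation 1: interchange rows, if needed, to get a non-zero pivot.
        piv = k + np.argmax(np.abs(M[k:, k]))
        if np.isclose(M[piv, k], 0):
            continue
        M[[k, piv]] = M[[piv, k]]
        M[k] = M[k] / M[k, k]              # Operation 2: scale the pivot row.
        for i in range(k + 1, m):          # Operation 3: eliminate below the pivot.
            M[i] = M[i] - M[i, k] * M[k]
    return M

# Hypothetical system: x + y + z = 6, 2y + 5z = -4, 2x + 5y - z = 27.
aug = np.array([[1, 1, 1, 6],
                [0, 2, 5, -4],
                [2, 5, -1, 27]])
print(forward_eliminate(aug))   # upper triangular form; back substitution finishes
```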

EXAMPLE 2.2.12 Solve the linear system by Gauss elimination method.

Solution: In this case, the augmented matrix is

and the method proceeds as follows:

1. Add

times the first equation to the second equation.

2. Add

times the first equation to the third equation.

3. Add

times the second equation to the third equation

Thus, the set of solutions is words, the system has INFINITE NUMBER OF SOLUTIONS. EXAMPLE 2.2.13 Solve the linear system by Gauss elimination method.

with

arbitrary. In other

Solution: In this case, the augmented matrix is

and the method proceeds as follows:

1. Add

times the first equation to the second equation.

2. Add

times the first equation to the third equation.

3. Add

times the second equation to the third equation

The third equation in the last step is

This can never hold for any value of

Hence, the system has NO SOLUTION. one needs to apply only the elementary row

Remark 2.2.14 Note that to solve a linear system, operations to the augmented matrix

DEFINITION 2.3.1 (Row Reduced Form of a Matrix) A matrix

is said to be in the row reduced form if

1. THE FIRST NON-ZERO ENTRY IN EACH ROW OF MATHEND000# IS MATHEND000# 2. THE COLUMN CONTAINING THIS MATHEND000# HAS ALL ITS OTHER ENTRIES ZERO. A matrix in the row reduced form is also called a ROW REDUCED MATRIX. EXAMPLE 2.3.2 1. One of the most important examples of a row reduced matrix is the that the entry of the identity matrix is identity matrix, Recall

is usually referred to as the Kronecker delta function.

2. The matrices

and

are also in row reduced form.

3. The matrix

is not in the row reduced form. (why?)

DEFINITION 2.3.3 (Leading Term, Leading Column) For a row-reduced matrix, the first non-zero entry of any row is called a LEADING TERM. The columns containing the leading terms are called the LEADING COLUMNS. in variables and DEFINITION 2.3.4 (Basic, Free Variables) Consider the linear system equations. Let be the row-reduced matrix obtained by applying the Gauss elimination method to the augmented matrix Then the variables corresponding to the leading columns in the first

columns of variables.

are called the BASIC variables. The variables which are not basic are called FREE

The free variables are called so as they can be assigned arbitrary values and the value of the basic variables can then be written in terms of the free variables. Observation: In Example 2.2.12, the solution set was given by

That is, we had two basic variables,

and

and

as a free variable.

Remark 2.3.5 It is very important to observe that if there are non-zero rows in the row-reduced form of the matrix then there will be leading terms. That is, there will be leading columns. Therefore, IF THERE
ARE MATHEND000# LEADING TERMS AND MATHEND000# VARIABLES, THEN THERE WILL BE MATHEND000# BASIC VARIABLES AND MATHEND000# FREE VARIABLES.

We now start with Step 5 of Example 2.2.11 and apply the elementary operations once again. But this time, we start with the 1. Add row. ). times the third equation to the second equation (or

2. Add

times the third equation to the first equation (or

).

3. From the above matrix, we directly have the set of solution as DEFINITION 2.3.6 (Row Reduced Echelon Form of a Matrix) A matrix reduced echelon form if is said to be in the row

1. is already in the row reduced form; 2. The rows consisting of all zeros comes below all non-zero rows; and 3. the leading terms appear from left to right in successive rows. That is, for leading column of the row. Then

let

be the

EXAMPLE 2.3.7 Suppose

and

are in row reduced form.

Then the corresponding matrices in the row reduced echelon form are respectively,

and

DEFINITION 2.3.8 (Row Reduced Echelon Matrix) A matrix which is in the row reduced echelon form is also called a row reduced echelon matrix. DEFINITION 2.3.9 (Back Substitution/Gauss-Jordan Method) The procedure to get to Step II of Example 2.2.11 from Step 5 of Example 2.2.11 is called the back substitution. The elimination process applied to obtain the row reduced echelon form of the augmented matrix is called the Gauss-Jordan elimination. That is, the Gauss-Jordan elimination method consists of both the forward elimination and the backward substitution. Method to get the row-reduced echelon form of a given matrix Let be an matrix. Then the following method is used to obtain the row-reduced echelon form the matrix Step 1: Consider the first column of the matrix If all the entries in the first column are zero, move to the second column. Else, find a row, say the first row with the whole row by below this row, which contains a non-zero entry in the first column. Now, interchange row. Suppose the non-zero entry in the -entry of the new matrix is -position is Now, use the Divide the to make all the entries

so that the

equal to

Step 2: If all entries in the first column after the first step are zero, consider the right submatrix of the matrix obtained in step 1 and proceed as in step 1. Else, forget the first row and first column. Start with the lower matrix obtained in the first step and proceed as in step 1. Step 3: Keep repeating this process till we reach a stage where all the entries below a particular row, Then has the following form: say , are zero. Suppose at this stage we have obtained a matrix 1. THE FIRST NON-ZERO ENTRY IN EACH ROW of is These 's are the leading terms of and the columns containing these leading terms are the leading columns. 2. THE ENTRIES OF MATHEND000# BELOW THE LEADING TERM ARE ALL ZERO. Step 4: Now use the leading term in the to zero. Step 5: Next, use the leading term in the row to make all entries in the leading column equal leading submatrix of the

row to make all entries in the

column equal to zero and continue till we come to the first leading term or column.

The final matrix is the row-reduced echelon form of the matrix
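The method just described can be expressed as a short routine. This is a minimal sketch, not the notes' own procedure verbatim, that returns the row reduced echelon form of a matrix.

```python
import numpy as np

def rref(A, tol=1e-12):
    """Return the row reduced echelon form of A (a minimal sketch)."""
    M = A.astype(float).copy()
    m, n = M.shape
    row = 0
    for col in range(n):
        if row >= m:
            break
        # Find a row at or below `row` with a non-zero entry in this column.
        piv = row + np.argmax(np.abs(M[row:, col]))
        if abs(M[piv, col]) < tol:
            continue                     # move to the next column
        M[[row, piv]] = M[[piv, row]]    # interchange rows
        M[row] = M[row] / M[row, col]    # make the leading term 1
        for i in range(m):               # make every other entry in the column 0
            if i != row:
                M[i] = M[i] - M[i, col] * M[row]
        row += 1
    return M

print(rref(np.array([[0, 2, 4],
                     [1, 1, 1],
                     [2, 4, 6]])))       # [[1 0 -1], [0 1 2], [0 0 0]]
```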


RIGHT. Hence, if

Remark 2.3.10 Note that the row reduction involves only row operations and proceeds from LEFT TO is a matrix consisting of first columns of a matrix then the row reduced form of will be the first columns of the row reduced form of

The proof of the following theorem is beyond the scope of this book and is omitted. THEOREM 2.3.11 The row reduced echelon form of a matrix is unique. EXERCISE 2.3.12 1. Solve the following linear system. 1. 2. 3. 4. 5. and and and and and

2. Find the row-reduced echelon form of the following matrices.

DEFINITION 2.3.13 A square matrix of order is called an elementary matrix if it is obtained by applying exactly one elementary row operation to the identity matrix, Remark 2.3.14 There are three types of elementary matrices. 1. which is obtained by the application of the elementary row operation to the identity matrix,

Thus, the 2.

entry of

is to the identity

which is obtained by the application of the elementary row operation

matrix,

The

entry of

is

3.

which is obtained by the application of the elementary row operation

to the identity

matrix,

The

entry of

is

In particular, if we start with a

identity matrix

, then

EXAMPLE 2.3.15

1. Let

Then

That is, interchanging the two rows of the matrix is same as multiplying on the left by the corresponding elementary matrix. In other words, we see that the left multiplication of elementary matrices to a matrix results in elementary row operations. 2. Consider the augmented matrix same as the matrix product Then the result of the steps given below is
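The observation that a row operation equals left multiplication by the corresponding elementary matrix, and a column operation equals right multiplication, can be checked directly. The matrix and the chosen operations below are made up for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Elementary matrix for "replace row 3 by row 3 plus (-7) times row 1":
# apply the same row operation to the identity matrix.
E = np.eye(3)
E[2] = E[2] + (-7) * E[0]

# The same row operation applied directly to A.
B = A.astype(float).copy()
B[2] = B[2] + (-7) * B[0]

print(np.array_equal(E @ A, B))   # True: left multiplication = row operation

# Right multiplication by an elementary matrix performs a column operation.
F = np.eye(3)[:, [0, 2, 1]]       # interchange columns 2 and 3 of the identity
print(A @ F)                      # A with its second and third columns interchanged
```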

Now, consider an

matrix

and an elementary matrix

of order

Then multiplying by

on the

right to corresponds to applying column transformation on the matrix matrix, there is a corresponding column transformation. We summarize:

Therefore, for each elementary

DEFINITION 2.3.16 The column transformations obtained by right multiplication of elementary matrices are called elementary column operations.

EXAMPLE 2.3.17 Let

and consider the elementary column operation

which

interchanges the second and the third column of

Then

EXERCISE 2.3.18 1. Let is, be an elementary row operation and let is the matrix obtained from be the corresponding elementary matrix. That Show that

by applying the elementary row operation

2. Show that the Gauss elimination method is same as multiplying by a series of elementary matrices on the left to the augmented matrix. Does the Gauss-Jordan method also corresponds to multiplying by elementary matrices on the left? Give reasons. 3. Let and be two matrices. Then prove that the two matrices are row-equivalent if and only if where is product of elementary matrices. When is this unique? 4. Show that every elementary matrix is invertible. Is the inverse of an elementary matrix, also an elementary matrix?

In previous sections, we solved linear systems using Gauss elimination method or the Gauss-Jordan method. In the examples considered, we have encountered three possibilities, namely 1. existence of a unique solution, 2. existence of an infinite number of solutions, and 3. no solution. Based on the above possibilities, we have the following definition. DEFINITION 2.4.1 (Consistent, Inconsistent) A linear system is called CONSISTENT if it admits a solution and is called INCONSISTENT if it admits no solution. The question arises, as to whether there are conditions under which the linear system is consistent. The answer to this question is in the affirmative. To proceed further, we need a few definitions and remarks. Recall that the row reduced echelon form of a matrix is unique and therefore, the number of non-zero rows is a unique number. Also, note that the number of non-zero rows in either the row reduced form or the row reduced echelon form of a matrix are same. DEFINITION 2.4.2 (Row rank of a Matrix) The number of non-zero rows in the row reduced form of a matrix is called the row-rank of the matrix. By the very definition, it is clear that row-equivalent matrices have the same row-rank. For a matrix write ` EXAMPLE 2.4.3 ' to denote the row-rank of we

1. Determine the row-rank of Solution: To determine the row-rank of 1. we proceed as follows.

2.

3.

4. The last matrix in Step 1d is the row reduced form of which has non-zero rows. Thus, This result can also be easily deduced from the last matrix in Step 1b. 2. Determine the row-rank of Solution: Here we have 1.

2. From the last matrix in Step 2b, we deduce Remark 2.4.4 Let be a linear system with equations and unknowns. Then the row-reduced echelon form of agrees with the first columns of and hence

The reader is advised to supply a proof. Remark 2.4.5 Consider a matrix (see Definition 2.3.16) to the matrix After application of a finite number of elementary column operations we can have a matrix, say which has the following properties:

1. The first nonzero entry in each column is 2. A column containing only 0 's comes after all columns with at least one non-zero entry. 3. The first non-zero entry (the leading term) in each non-zero column moves down in successive columns. Therefore, we can define column-rank of that as the number of non-zero columns in It will be proved later

Thus we are led to the following definition.

DEFINITION 2.4.6 The number of non-zero rows in the row reduced form of a matrix $A$ is called the rank of $A$, denoted $\operatorname{rank}(A)$.

THEOREM 2.4.7 Let $A$ be a matrix of rank $r$. Then there exist elementary matrices $E_1, E_2, \ldots, E_s$ and $F_1, F_2, \ldots, F_{\ell}$ such that
$$E_1 E_2 \cdots E_s \, A \, F_1 F_2 \cdots F_{\ell} = \begin{bmatrix} I_r & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}.$$

Proof . Let be the row reduced echelon matrix obtained by applying elementary row operations to the given matrix As the matrix will have the first rows as the non-zero rows. So by Remark 2.3.5, will have in the will have leading columns, say Note that, for the column

row and zero elsewhere. Let be the matrix obtained from Then the matrix block of by successively and column of for can be written in the form is an identity matrix, the block This gives the required result.

We now apply column operations to the matrix interchanging the where

is a matrix of appropriate size. As the

can be made the zero matrix by application of column operations to height6pt width 6pt depth 0pt COROLLARY 2.4.8 Let be a matrix of rank

Then the system of equations

has

infinite number of solutions. Proof . By Theorem 2.4.7, there exist elementary matrices Define and . Then the matrix such that

as the elementary martices

's are being multiplied on the left of the matrix . Then check that for

Let .

be the columns of the matrix

Hence, we can use the

's which are non-zero (Use Exercise 1.2.17.2) to generate infinite number of

solutions. height6pt width 6pt depth 0pt EXERCISE 2.4.9 1. Determine the ranks of the coefficient and the augmented matrices that appear in Part 1 and Part 2 of Exercise 2.3.12. 2. Let be an matrix with Then prove that is row-equivalent to 3. If and are invertible matrices and and and 5. Let and 1. if 2. if 6. Let be two matrices. Show that is defined, then is defined, then and Then show that there exists invertible matrices and prove that the matrix 7. Let 8. Let be an and and have rank is an matrix of rank is a matrix of size invertible matrix. Then can be written as and is a matrix of size Then show that then
and such that

is defined then show that where

4. Find matrices

which are product of elementary matrices such that

be any matrix of rank

such that Also,

where both

and

be two matrices such that for some matrix for some matrix
and

is defined and is defined and

Similarly, if

[Hint: Choose non-singular matrices

Define

9. If matrices

and

are invertible and the involved partitioned products are defined, then show that

10. Suppose

is the inverse of a matrix

Partition

and

as follows:

If

is invertible and

then show that

and

We try to understand the properties of the set of solutions of a linear system through an example, using the Gauss-Jordan method. Based on this observation, we arrive at the existence and uniqueness results for the linear system This example is more or less a motivation.

Subsections Example Main Theorem Equivalent conditions for Invertibility Inverse and the Gauss-Jordan Method

Consider a linear system with

which after the application of the Gauss-Jordan method reduces to a matrix

For this particular matrix Observations:

we want to see the set of solutions. We start with some observations.

1. The number of non-zero rows in

is

This number is also equal to the number of non-zero rows in and

2. The first non-zero entry in the non-zero rows appear in columns 3. Thus, the respective variables 4. The remaining variables, 5. We assign arbitrary constants Hence, we have the set of solutions as and and and are free variables. to the free variables

are the basic variables. and respectively.

where

and

are arbitrary.

Let

and

Then it can easily be verified that

and for

A similar idea is used in the proof of the next theorem and is omitted. The interested readers can read the proof in Appendix 14.1.

THEOREM 2.5.1 [Existence and Non-existence] Consider a linear system matrix, and are vectors with orders and respectively. Suppose

where

is a and

Then exactly one of the following statement holds: 1. if the set of solutions of the linear system is an infinite set and has the form

where 2. if 3. If

are

vectors satisfying

and vector

for satisfying

the solution set of the linear system has a unique the linear system has no solution. be an matrix and consider the linear system is consistent if and only if

Remark 2.5.2 Let

Then by Theorem 2.5.1,

we see that the linear system

The following corollary of Theorem 2.5.1 is a very important result about the homogeneous linear system COROLLARY 2.5.3 Let solution if and only if rank Proof . Suppose the system assumption, we need to show that has a non-trivial solution, That is, and So, Under this be an matrix. Then the homogeneous system has a non-trivial

On the contrary, assume that rank

Also implies that is a solution of the linear system solution under the condition (see Theorem 2.5.1), we get that was a given non-trivial solution. Then

Hence, by the uniqueness of the A contradiction to the fact

Now, let us assume that rank

So, by Theorem 2.5.1, the solution set of the linear system has infinite number of vectors satisfying From this infinite set, we can choose any vector that is different from Thus, we have a solution That is, we have obtained a non-trivial solution height6pt width 6pt depth 0pt

We now state another important result whose proof is immediate from Theorem 2.5.1 and Corollary 2.5.3. PROPOSITION 2.5.4 Consider the linear system hold together. 1. The system 2. The system Remark 2.5.5 1. Suppose 2. If are two solutions of then of Then is also a solution of That is, differ by a for any has a unique solution for every has a non-trivial solution. Then the two statements given below cannot

are two solutions of for some solution

is a solution of the system That is, any two solutions of

solution of the associated homogeneous system In conclusion, for the set of solutions of the system and is a solution is of the form, where

is a particular solution of EXERCISE 2.5.6 1. For what values of and

-the following systems have

no solution,

a unique solution and

infinite number of solutions. 1. 2. 3. 4. 5. 6. 2. Find the condition on so that the linear system

is consistent. 3. Let be an

matrix. If the system

has a non trivial solution then show that

also has a non trivial solution.

DEFINITION 2.5.7 A square matrix THEOREM 2.5.8 For a square matrix 1. 2. 3. 4. Proof . 1

or order of order

is said to be of full rank if the following statements are equivalent.

is invertible. is of full rank. is row-equivalent to the identity matrix. is a product of elementary matrices. 2 Then there exists an invertible matrix where is an (a product of elementary is invertible, let

Let if possible rank matrices) such that

matrix. Since

where

is an

matrix. Then

(2.5.1)

Thus the matrix

has

rows as zero rows. Hence, is of full rank.

cannot be invertible. A contradiction to

being

a product of invertible matrices. Thus, 2 3

Suppose is of full rank. This implies, the row reduced echelon form of has all non-zero rows. But as many columns as rows and therefore, the last row of the row reduced echelon form of will be Hence, the row reduced echelon form of is the identity matrix. 3 Since 4 is row-equivalent to the identity matrix there exist elementary matrices That is, 4 1 is product of elementary matrices.

has

such that

Suppose

where the

's are elementary matrices. We know that elementary matrices are

invertible and product of invertible matrices is also invertible, we get the required result. height6pt width 6pt depth 0pt The ideas of Theorem 2.5.8 will be used in the next subsection to find the inverse of an invertible matrix. The idea used in the proof of the first part also gives the following important Theorem. We repeat the proof for the sake of clarity. THEOREM 2.5.9 Let be a square matrix of order such that such that Then Then exists. exists.

1. Suppose there exists a matrix 2. Suppose there exists a matrix Proof . Suppose that Let if possible, rank matrices) such that

We will prove that the matrix

is of full rank. That is, (a product of elementary matrix. Then

Then there exists an invertible matrix Let where is an

(2.5.2)

Thus the matrix

has

rows as zero rows. So,

cannot be invertible. A contradiction to

being a

product of invertible matrices. Thus, is an invertible matrix. That is, Using the first part, it is clear that the matrix

That is, as well.

is of full rank. Hence, using Theorem 2.5.8,

in the second part, is invertible. Hence

Thus,

is invertible as well. height6pt width 6pt depth 0pt of order

Remark 2.5.10 This theorem implies the following: ``if we want to show that a square matrix is invertible, it is enough to show the existence of 1. either a matrix 2. or a matrix such that such that of order

THEOREM 2.5.11 The following statements are equivalent for a square matrix 1. 2. is invertible. has only the trivial solution

3. Proof . 1 Since 2

has a solution

for every

is invertible, by Theorem 2.5.8

is of full rank. That is, for the linear system Hence, by Theorem 2.5.1 the system

the number has a unique

of unknowns is equal to the rank of the matrix solution 2 1

Let if possible be non-invertible. Then by Theorem 2.5.8, the matrix is not of full rank. Thus by Corollary 2.5.3, the linear system has infinite number of solutions. This contradicts the assumption has only the trivial solution that 1 Since 3 For 1 define for each and consider the linear system Define a matrix Then By That 3 is invertible, for every the system has a unique solution

assumption, this system has a solution is, the column of

is the solution of the system

Therefore, by Theorem 2.5.9, the matrix EXERCISE 2.5.12 1. Show that a triangular matrix 2. Let be a matrix and invertible? Give reasons. 3. Let be an matrix and and only if the matrix

is invertible. height6pt width 6pt depth 0pt

is invertible if and only if each diagonal entry of is non-zero. be a matrix having positive entries. Which of or is be an matrix. Prove that the matrix is invertible if

is invertible.

We first give a consequence of Theorem 2.5.8 and then use it to find the inverse of an invertible matrix.

COROLLARY 2.5.13 Let $A$ be an invertible matrix. Suppose that a sequence of elementary row-operations reduces $A$ to the identity matrix. Then the same sequence of elementary row-operations, when applied to the identity matrix, yields $A^{-1}$.

Proof. Let $A$ be a square matrix of order $n$. Also, let $E_1, E_2, \ldots, E_k$ be a sequence of elementary row operations such that $E_k \cdots E_2 E_1 A = I_n$. Then $E_k \cdots E_2 E_1 I_n = E_k \cdots E_2 E_1$. This implies $A^{-1} = E_k \cdots E_2 E_1$. $\square$

Summary: Let $A$ be an $n \times n$ matrix. Apply the Gauss-Jordan method to the matrix $[A \;\; I_n]$. Suppose the row reduced echelon form of the matrix $[A \;\; I_n]$ is $[B \;\; C]$. If $B = I_n$, then $A^{-1} = C$, or else $A$ is not invertible.
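The summary above translates into a few lines of code. This sketch row reduces $[A \;\; I_n]$ with row interchanges and reads off the inverse; the $3 \times 3$ matrix is a made-up example.

```python
import numpy as np

def inverse_by_gauss_jordan(A):
    """Row reduce [A | I]; if the left block becomes I, the right block is A^{-1}."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for col in range(n):
        piv = col + np.argmax(np.abs(M[col:, col]))
        if np.isclose(M[piv, col], 0):
            raise ValueError("matrix is not invertible")
        M[[col, piv]] = M[[piv, col]]    # interchange rows
        M[col] = M[col] / M[col, col]    # make the pivot 1
        for i in range(n):               # clear the rest of the column
            if i != col:
                M[i] = M[i] - M[i, col] * M[col]
    return M[:, n:]

A = np.array([[2, 1, 1],
              [1, 3, 2],
              [1, 0, 0]])
Ainv = inverse_by_gauss_jordan(A)
print(np.allclose(A @ Ainv, np.eye(3)))   # True
```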

EXAMPLE 2.5.14 Find the inverse of the matrix

using the Gauss-Jordan method.

Solution: Consider the matrix

A sequence of steps in the Gauss-Jordan method

are:

1.

2.

3.

4.

5.

6.

7.

8. Thus, the inverse of the given matrix is

EXERCISE 2.5.15 Find the inverse of the following matrices using the Gauss-Jordan method.

Notation: For an $n \times n$ matrix $A$, by $A(i|j)$ we mean the submatrix of $A$ which is obtained by deleting the $i^{\text{th}}$ row and $j^{\text{th}}$ column.

EXAMPLE 2.6.1 Consider a matrix

Then

and

DEFINITION 2.6.2 (Determinant of a Square Matrix) Let $A$ be a square matrix of order $n$. With $A$, we associate inductively (on $n$) a number, called the determinant of $A$, written $\det(A)$ (or $|A|$), by
$$\det(A) = \begin{cases} a_{11} & \text{if } n = 1, \\ \displaystyle\sum_{j=1}^{n} (-1)^{1+j} a_{1j} \det\bigl(A(1|j)\bigr) & \text{if } n > 1. \end{cases}$$

EXAMPLE 2.6.3
1. Let $A = [a_{ij}]$ be a $2 \times 2$ matrix. Then $\det(A) = a_{11}\det(A(1|1)) - a_{12}\det(A(1|2)) = a_{11}a_{22} - a_{12}a_{21}$. For example, for $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, $\det(A) = 1 \cdot 4 - 2 \cdot 3 = -2$.
2. Let $A = [a_{ij}]$ be a $3 \times 3$ matrix. Then,
$$\det(A) = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}). \qquad (2.6.1)$$
For example, if $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 2 & 2 \end{bmatrix}$ then $\det(A) = 1\cdot(6 - 2) - 2\cdot(4 - 1) + 3\cdot(4 - 3) = 4 - 6 + 3 = 1$.
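Definition 2.6.2 is directly recursive. The sketch below makes the expansion along the first row explicit (purely for illustration; it is far slower than practical methods) and compares the result with NumPy's determinant.

```python
import numpy as np

def det_by_expansion(A):
    """Determinant by expansion along the first row (Definition 2.6.2)."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0
    for j in range(n):
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)   # A(1|j)
        total += (-1) ** j * A[0, j] * det_by_expansion(minor)
    return total

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 10]])
print(det_by_expansion(A))                                   # -3
print(np.isclose(np.linalg.det(A), det_by_expansion(A)))     # True
```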

EXERCISE 2.6.4 1. Find the determinant of the following matrices.

2. Show that the determinant of a triangular matrix is the product of its diagonal entries. DEFINITION 2.6.5 A matrix is said to be a singular matrix if It is called non-singular if

The proof of the next theorem is omitted. The interested reader is advised to go through Appendix 14.3. THEOREM 2.6.6 Let 1. if 2. if be an matrix. Then by interchanging two rows, then by multiplying a row by then , times the th row, where , ,

is obtained from is obtained from

3. if all the elements of one row or column of 4. if then 5. if is obtained from , by replacing the

are 0 then th row by itself plus

is a square matrix having two rows equal then

Remark 2.6.7

1. Many authors define the determinant using ``Permutations." It turns out that THE WAY WE HAVE DEFINED DETERMINANT is usually called the expansion of the determinant along the first row. 2. Part 1 of Lemma 2.6.6 implies that ``one can also calculate the determinant by expanding along any matrix for every , one also has row." Hence, for an

Remark 2.6.8 1. Let and be two vectors in and Then consider the parallelogram, We

formed by the vertices

Recall that the dot product, vector vectors We denote the length by and then

and With the above notation, if

is the length of the is the angle between the

Which tells us,

Hence, the claim holds. That is, in 2. Let cross product of two vectors in is,

the determinant is and

times the area of the parallelogram. be three elements of Recall that the

Note here that if

then

Let

be the parallelopiped formed with

as a vertex and the vectors

as adjacent

vertices. Then observe that formed by the vectors at formed by where and and

is a vector perpendicular to the plane that contains the parallelogram So, to compute the volume of the parallelopiped we need to look and the normal vector to the parallelogram

is the angle between the vector So,

Hence, Let properties of and let also hold for the volume of an be an matrix. Then the following

-dimensional parallelopiped formed with as adjacent vertices: and then then the determinant of the new matrix is

as one vertex and the vectors 1. If Also, volume of a unit -dimensional cube is 2. If we replace the vector by for some

. This is also true for the volume, as the original volume gets multiplied by 3. If for some then the vectors will give rise to an -dimensional hyperplane.

-dimensional parallelopiped. So, this parallelopiped lies on an Thus, its -dimensional volume will be zero. Also, matrix it can be proved that

In general, for any the

is indeed equal to the volume of

DEFINITION 2.6.9 (Minor, Cofactor of a Matrix) The number . We write The cofactor of be an denoted denoted

is called the is the number with

minor of

DEFINITION 2.6.10 (Adjoint of a Matrix) Let for

matrix. The matrix

is called the Adjoint of

EXAMPLE 2.6.11 Let

Then

as THEOREM 2.6.12 Let 1. for 2. for 3. Thus, and be an matrix. Then

and so on.

(2.6.2)

Proof . Let the row of

be a square matrix with as the row of

the other rows of By the construction of construction again,

are the same as that of two rows ( and ) are equal. By Part 5 of Lemma 2.6.6, for Thus, by Remark 2.6.7, we have By

Now,

Thus,

Since, has an inverse and

Therefore,

has a right

inverse. Hence, by Theorem 2.5.9

height6pt width 6pt depth 0pt

EXAMPLE 2.6.13 Let

Then

and

By Theorem 2.6.12.3,

The next corollary is an easy consequence of Theorem 2.6.12 (recall Theorem 2.5.9). COROLLARY 2.6.14 If is a non-singular matrix, then and

THEOREM 2.6.15 Let Proof . Step 1. Let

and

be square matrices of order

Then

This means, is invertible. Therefore, either matrices (see Theorem 2.5.8). So, let

is an elementary matrix or is a product of elementary be elementary matrices such that

Then, by using Parts 1, 2 and 4 of Lemma 2.6.6 repeatedly, we get

Thus, we get the required result in case Step 2. Suppose Then So,

is non-singular.

is not invertible. Hence, there exists an invertible matrix and therefore

such that

where

Thus, the proof of the theorem is complete. height6pt width 6pt depth 0pt COROLLARY 2.6.16 Let Proof . Suppose inverse. Suppose has an inverse. Then there exists a matrix both sides, we get such that Taking determinant of be a square matrix. Then is non-singular if and only if has an inverse. Thus, has an

is non-singular. Then

and therefore,

This implies that THEOREM 2.6.17 Let Proof . If If

Thus,

is non-singular. height6pt width 6pt depth 0pt

be a square matrix. Then

is a non-singular Corollary 2.6.14 gives Hence, by Corollary 2.6.16, has an inverse then doesn't have an inverse. Therefore, Thus again by

is singular, then

also doesn't have an inverse (for if Corollary 2.6.16, Hence, we have

Therefore, we again have height6pt width 6pt depth 0pt

Recall the following: the linear system $A\mathbf{x} = \mathbf{b}$ has a unique solution for every $\mathbf{b}$ if and only if $A^{-1}$ exists, and $A$ has an inverse if and only if $\det(A) \neq 0$. Thus, $A\mathbf{x} = \mathbf{b}$ has a unique solution FOR EVERY $\mathbf{b}$ if and only if $\det(A) \neq 0$.

The following theorem gives a direct method of finding the solution of the linear system $A\mathbf{x} = \mathbf{b}$ when $\det(A) \neq 0$.

THEOREM 2.6.18 (Cramer's Rule) Let $A\mathbf{x} = \mathbf{b}$ be a linear system with $n$ equations in $n$ unknowns. If $\det(A) \neq 0$, then the unique solution to this system is
$$x_j = \frac{\det(A_j)}{\det(A)}, \qquad j = 1, 2, \ldots, n,$$
where $A_j$ is the matrix obtained from $A$ by replacing the $j^{\text{th}}$ column of $A$ by the column vector $\mathbf{b}$.

Proof. Since $\det(A) \neq 0$, the matrix $A$ is invertible. Thus, the linear system $A\mathbf{x} = \mathbf{b}$ has the solution $\mathbf{x} = A^{-1}\mathbf{b} = \frac{1}{\det(A)} \operatorname{Adj}(A)\,\mathbf{b}$. Hence, the $j^{\text{th}}$ coordinate of $\mathbf{x}$ is given by
$$x_j = \frac{1}{\det(A)} \sum_{i=1}^{n} b_i\, C_{ij} = \frac{\det(A_j)}{\det(A)},$$
where $C_{ij}$ denotes the cofactor of $a_{ij}$. $\square$

The theorem implies that $x_1 = \dfrac{\det(A_1)}{\det(A)}$, and in general $x_j = \dfrac{\det(A_j)}{\det(A)}$ for $j = 1, 2, \ldots, n$.
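Cramer's rule translates into a few lines; the sketch below uses a made-up $2 \times 2$ system and compares the answer with a direct solve.

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b by Cramer's rule (assumes det(A) != 0)."""
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for j in range(len(b)):
        Aj = A.astype(float).copy()
        Aj[:, j] = b                    # replace the j-th column of A by b
        x[j] = np.linalg.det(Aj) / d
    return x

A = np.array([[2, 1], [1, 3]])
b = np.array([3, 5])
print(cramer(A, b))                                        # [0.8 1.4]
print(np.allclose(cramer(A, b), np.linalg.solve(A, b)))    # True
```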

EXAMPLE 2.6.19 Suppose that

and

Use Cramer's rule to find a vector

such that Solution: Check that Therefore

and

That is,

In this chapter, the linear transformations are from a given finite dimensional vector space to itself. Observe that in this case, the matrix of the linear transformation is a square matrix. So, in this chapter, all the for some positive integer matrices are square matrices and a vector means EXAMPLE 6.1.1 Let be a real symmetric matrix. Consider the following problem:

To solve this, consider the Lagrangian

Partially differentiating

with respect to

for

we get

and so on, till

Therefore, to get the points of extrema, we solve for

We therefore need to find a

and

such that

for the extremal problem.

EXAMPLE 6.1.2 Consider a system of

ordinary differential equations of the form

(6.1.1)

where

is a real

matrix and

is a column vector.

To get a solution, let us assume that (6.1.2)

is a solution of (6.1.1) and look into what and has to satisfy, i.e., we are investigating for a necessary condition on and so that (6.1.2) is a solution of (6.1.1). Note here that (6.1.1) has the zero solution, namely and so we are looking for a non-zero Differentiating (6.1.2) with respect to and substituting in (6.1.1), leads to (6.1.3)

So, (6.1.2) is a solution of the given system of differential equations if and only if matrix we are this lead to find a pair such that That is, given an satisfied. Let be a matrix of order In general, we ask the question: For what values of there exist a non-zero vector such that

and

satisfy (6.1.3). and (6.1.3) is

(6.1.4) stands for either the vector space Here, equation over or over Equation (6.1.4) is equivalent to the

By Theorem 2.5.1, this system of linear equations has a non-zero solution, if

So, to solve (6.1.4), we are forced to choose those values of that is a polynomial in of degree

for which

Observe

We are therefore led to the following definition.

DEFINITION 6.1.3 (Characteristic Polynomial) Let $A$ be a matrix of order $n$. The polynomial $\det(A - \lambda I)$ is called the characteristic polynomial of $A$ and is denoted by $p(\lambda)$. The equation $p(\lambda) = 0$ is called the characteristic equation of $A$. If $\lambda \in \mathbb{F}$ is a solution of the characteristic equation $p(\lambda) = 0$, then $\lambda$ is called a characteristic value of $A$.

Some books use the term EIGENVALUE in place of characteristic value.

THEOREM 6.1.4 Let $A$ be a matrix of order $n$ and suppose $\lambda \in \mathbb{F}$ is a root of the characteristic equation. Then there exists a non-zero $\mathbf{v} \in \mathbb{F}^n$ such that $A\mathbf{v} = \lambda\mathbf{v}$.

Proof. Since $\lambda$ is a root of the characteristic equation, $\det(A - \lambda I) = 0$. This shows that the matrix $A - \lambda I$ is singular and therefore by Theorem 2.5.1 the linear system $(A - \lambda I)\mathbf{x} = \mathbf{0}$ has a non-zero solution. $\square$

Remark 6.1.5 Observe that the linear system $A\mathbf{x} = \lambda\mathbf{x}$ has a solution $\mathbf{x} = \mathbf{0}$ for every $\lambda \in \mathbb{F}$. So, we consider only those $\mathbf{x} \in \mathbb{F}^n$ that are non-zero and are solutions of the linear system $A\mathbf{x} = \lambda\mathbf{x}$.

DEFINITION 6.1.6 (Eigenvalue and Eigenvector) If the linear system $A\mathbf{x} = \lambda\mathbf{x}$ has a non-zero solution $\mathbf{x} \in \mathbb{F}^n$ for some $\lambda \in \mathbb{F}$, then
1. $\lambda$ is called an eigenvalue of $A$,
2. $\mathbf{x}$ is called an eigenvector of $A$ corresponding to the eigenvalue $\lambda$, and
3. the tuple $(\lambda, \mathbf{x})$ is called an eigenpair.
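Eigenvalues (the roots of the characteristic equation) and eigenvectors can be computed numerically. The sketch below checks $A\mathbf{v} = \lambda\mathbf{v}$ for a made-up symmetric matrix.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Roots of the characteristic polynomial det(A - lambda*I) = 0 ...
eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)                            # the eigenvalues 3 and 1 (in some order)

# ... and for each root a non-zero vector v with A v = lambda v.
for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))    # True, True
```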

Remark 6.1.7 To understand the difference between a characteristic value and an eigenvalue, we give the following example. Consider the matrix Then the characteristic polynomial of is

Given the matrix

recall the linear transformation

defined by

1. If has

that is, if and

is considered a COMPLEX matrix, then the roots of as eigenpairs.

in

are

So,

2. If

that is, if then

is considered a REAL matrix, then

has no solution in

Therefore, if

has no eigenvalue but it has

as characteristic values. matrix then for any non-zero are eigenvectors of it is easily seen that if Hence, when

Remark 6.1.8 Note that if

is an eigenpair for an Similarly, if

is also an eigenpair for corresponding to the eigenvalue , then

then for any non-zero

is also an eigenvector of

corresponding to the eigenvalue

we talk of eigenvectors corresponding to an eigenvalue Suppose is a root of the characteristic equation Suppose has

we mean LINEARLY INDEPENDENT EIGENVECTORS. Then is singular and

Then by Corollary 4.3.9, the linear system has linearly independent

linearly independent solutions. That is, whenever

eigenvectors corresponding to the eigenvalue EXAMPLE 6.1.9 1. Let with

for

Then

is the

characteristic equation. So, the eigenpairs are

2. Let That is

Then

Hence, the characteristic equation has roots for Hence, from the above is

is a repeated eigenvalue. Now check that the equation And this has the solution

equivalent to the equation remark,

is a representative for the eigenvector. Therefore, HERE WE HAVE TWO EIGENVALUES

MATHEND000# BUT ONLY ONE EIGENVECTOR.

3. Let

Then and we know that from

The characteristic equation has roots for every to get and then

Here,

the matrix that we have is

and we can CHOOSE ANY as the two eigenpairs.

TWO LINEARLY INDEPENDENT VECTORS

In general, if

are linearly independent vectors in are eigenpairs for the identity matrix,

4. Let

Then Now check that the eigenpairs are and

The characteristic equation has roots In this case, we have TWO

DISTINCT EIGENVALUES AND THE CORRESPONDING EIGENVECTORS ARE ALSO LINEARLY INDEPENDENT.

The reader is required to prove the linear independence of the two eigenvectors. Then Hence, over that the eigenpairs are the matrix and

5. Let

The characteristic equation has roots has no eigenvalue. Over the reader is required to show

EXERCISE 6.1.10 1. Find the eigenvalues of a triangular matrix. 2. Find eigenpairs over for each of the following matrices: and 3. Let and be similar matrices. 1. Then prove that and have the same set of eigenvalues. 2. Let be an eigenpair for and be an eigenpair for between the vectors and ? and are similar, then there exists a non-singular matrix

What is the relationship

[Hint: Recall that if the matrices such that ] 4. Let be an

matrix. Suppose that for all

Then prove that

is an eigenvalue of What is the corresponding eigenvector? have the same set of eigenvalues. Construct a 5. Prove that the matrices and

matrix

such

that the eigenvectors of and are different. be a matrix such that ( is called an idempotent matrix). Then prove that its 6. Let eigenvalues are either 0 or or both. 7. Let be a matrix such that ( is called a nilpotent matrix) for some positive integer . Then prove that its eigenvalues are all 0 . THEOREM 6.1.11 Let distinct. Then be an and matrix with eigenvalues not necessarily

Proof . Since

are the

eigenvalues of

by definition, (6.1.5)

(6.1.5) is an identity in

as polynomials. Therefore, by substituting

in (6.1.5), we get

Also,

(6.1.6)

(6.1.7) for some Note that the coefficient of comes from the product

So,

by definition of trace.

But , from (6.1.5) and (6.1.7), we get

(6.1.8) Therefore, comparing the coefficient of we have

Hence, we get the required result. height6pt width 6pt depth 0pt EXERCISE 6.1.12

1. Let 2. Let

be a skew symmetric matrix of order be a orthogonal matrix such that .If

Then prove that 0 is an eigenvalue of , then prove that there exists a

non-zero vector Let be an

matrix. Then in the proof of the above theorem, we observed that the characteristic is a polynomial equation of degree it has the form in Also, for some numbers

equation

Note that, in the expression elements of It turns out that the expression

is an element of

Thus, we can only substitute

by

holds true as a matrix identity. This is a celebrated theorem called the Cayley Hamilton Theorem. We state this theorem without proof and give some implications. THEOREM 6.1.13 (Cayley Hamilton Theorem) Let characteristic equation. That is, be a square matrix of order Then satisfies its

holds true as a matrix identity. Some of the implications of the Cayley Hamilton Theorem are as follows.

Remark 6.1.14
1. Let $A$ be a non-zero square matrix whose only eigenvalue is $0$. Then its characteristic polynomial is $\lambda^n$, and for the function $f(\lambda) = \lambda$ we have $f(\lambda) = 0$ for each eigenvalue $\lambda$ of $A$, while $f(A) = A \neq \mathbf{0}$. This shows that the condition $f(\lambda) = 0$ for each eigenvalue $\lambda$ of $A$ does not imply that $f(A) = \mathbf{0}$; it is the characteristic polynomial itself that annihilates $A$.

2. Suppose we are given a square matrix $A$ of order $n$ and we are interested in calculating $A^{\ell}$, where $\ell$ is large compared to $n$. Then we can use the division algorithm to find numbers $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ and a polynomial $f(\lambda)$ such that
$$\lambda^{\ell} = f(\lambda)\left(\lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1\lambda + c_0\right) + \alpha_{n-1}\lambda^{n-1} + \cdots + \alpha_1\lambda + \alpha_0.$$
Hence, by the Cayley Hamilton Theorem,
$$A^{\ell} = \alpha_{n-1}A^{n-1} + \cdots + \alpha_1 A + \alpha_0 I.$$
That is, we just need to compute the powers of $A$ only up to $A^{n-1}$.

In the language of graph theory, it says the following:

``Let $G$ be a graph on $n$ vertices. Suppose there is no path of length $n - 1$ or less from a vertex $u$ to a vertex $v$. Then there is no path from $u$ to $v$ of any length. That is, the graph is disconnected and $u$ and $v$ are in different components.''
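A rough numerical sketch of Remark 6.1.14.2 follows; the matrix, the exponent and the helper function are illustrative choices, not taken from the notes. The high power is obtained by dividing $\lambda^{\ell}$ by the characteristic polynomial and evaluating only the remainder at $A$.

```python
import numpy as np

def polyval_matrix(coeffs, M):
    """Evaluate a polynomial (coefficients highest degree first) at a square matrix M."""
    result = np.zeros_like(M)
    for c in coeffs:
        result = result @ M + c * np.eye(M.shape[0])
    return result

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
l = 50
char_poly = np.poly(A)                      # monic characteristic polynomial of A
x_to_l = np.zeros(l + 1); x_to_l[0] = 1.0   # coefficients of lambda**l
_, rem = np.polydiv(x_to_l, char_poly)      # remainder has degree < n

A_to_l = polyval_matrix(rem, A)             # uses only I, A, ..., A**(n-1)
print(np.allclose(A_to_l, np.linalg.matrix_power(A, l)))   # True (up to round-off)
```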

3. Let $A$ be a non-singular matrix of order $n$. Then note that $c_0 = (-1)^n \det(A) \neq 0$ and
$$A^{-1} = \frac{-1}{c_0}\left(A^{n-1} + c_{n-1}A^{n-2} + \cdots + c_1 I\right).$$
This matrix identity can be used to calculate the inverse.

Note that the vector $A^{-1}$ (as an element of the vector space of all $n \times n$ matrices) is a linear combination of the vectors $I, A, A^2, \ldots, A^{n-1}$.
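A minimal numerical sketch of this identity (the matrix below is an arbitrary invertible example):

```python
import numpy as np

# Inverse via Cayley-Hamilton: A^{-1} = -(1/c_0)(A^{n-1} + c_{n-1} A^{n-2} + ... + c_1 I).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
n = A.shape[0]

c = np.poly(A)            # monic characteristic polynomial: [1, c_{n-1}, ..., c_1, c_0]
c0 = c[-1]

# Horner evaluation of A^{n-1} + c_{n-1} A^{n-2} + ... + c_1 I
B = np.eye(n)
for coeff in c[1:-1]:
    B = A @ B + coeff * np.eye(n)

A_inv = -B / c0
print(np.allclose(A_inv, np.linalg.inv(A)))   # True
```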

EXERCISE 6.1.15 Find the inverse of the following matrices by using the Cayley Hamilton Theorem:

THEOREM 6.1.16 If $\lambda_1, \lambda_2, \ldots, \lambda_k$ are distinct eigenvalues of a matrix $A$ with corresponding eigenvectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k$, then the set $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k\}$ is linearly independent.

Proof . The proof is by induction on the number $m$ of eigenvalues. The result is obviously true if $m = 1$, as the corresponding eigenvector is non-zero and we know that any set containing exactly one non-zero vector is linearly independent.

Let the result be true for $m = k - 1$. We prove the result for $m = k$. We consider the equation
$$c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \cdots + c_k\mathbf{x}_k = \mathbf{0} \qquad (6.1.9)$$
for the unknowns $c_1, c_2, \ldots, c_k$. We have
$$\mathbf{0} = A\mathbf{0} = A(c_1\mathbf{x}_1 + \cdots + c_k\mathbf{x}_k) = c_1\lambda_1\mathbf{x}_1 + c_2\lambda_2\mathbf{x}_2 + \cdots + c_k\lambda_k\mathbf{x}_k. \qquad (6.1.10)$$
From Equations (6.1.9) and (6.1.10), we get
$$c_2(\lambda_2 - \lambda_1)\mathbf{x}_2 + c_3(\lambda_3 - \lambda_1)\mathbf{x}_3 + \cdots + c_k(\lambda_k - \lambda_1)\mathbf{x}_k = \mathbf{0}.$$
This is an equation in the $k - 1$ eigenvectors $\mathbf{x}_2, \ldots, \mathbf{x}_k$. So, by the induction hypothesis, we have $c_i(\lambda_i - \lambda_1) = 0$ for $2 \leq i \leq k$. But the eigenvalues are distinct implies $\lambda_i - \lambda_1 \neq 0$ for $2 \leq i \leq k$. We therefore get $c_i = 0$ for $2 \leq i \leq k$, and therefore (6.1.9) gives $c_1\mathbf{x}_1 = \mathbf{0}$. As $\mathbf{x}_1 \neq \mathbf{0}$, we also get $c_1 = 0$. Thus, we have the required result.

We are thus led to the following important corollary.

COROLLARY 6.1.17 The eigenvectors corresponding to distinct eigenvalues of an $n \times n$ matrix $A$ are linearly independent.

EXERCISE 6.1.18
1. For an $n \times n$ matrix $A$, prove the following.
   1. $A$ and $A^t$ have the same set of eigenvalues.
   2. If $\lambda$ is an eigenvalue of an invertible matrix $A$ then $\frac{1}{\lambda}$ is an eigenvalue of $A^{-1}$.
   3. If $\lambda$ is an eigenvalue of $A$ then $\lambda^k$ is an eigenvalue of $A^k$ for any positive integer $k$.
   4. If $A$ and $B$ are $n \times n$ matrices with $A$ nonsingular then $AB$ and $BA$ have the same set of eigenvalues.
   In each case, what can you say about the eigenvectors?
2. Let $A$ and $B$ be $2 \times 2$ matrices for which $\det(A) = \det(B)$ and $\operatorname{tr}(A) = \operatorname{tr}(B)$.
   1. Do $A$ and $B$ have the same set of eigenvalues?
   2. Give examples to show that the matrices $A$ and $B$ need not be similar.
3. Let $(\lambda_1, \mathbf{x})$ be an eigenpair for a matrix $A$ and let $(\lambda_2, \mathbf{x})$ be an eigenpair for another matrix $B$.
   1. Then prove that $(\lambda_1 + \lambda_2, \mathbf{x})$ is an eigenpair for the matrix $A + B$.
   2. Give an example to show that if $\lambda_1$ and $\lambda_2$ are respectively the eigenvalues of $A$ and $B$, then $\lambda_1 + \lambda_2$ need not be an eigenvalue of $A + B$.
4. Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be distinct non-zero eigenvalues of an $n \times n$ matrix $A$. Let $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k$ be the corresponding eigenvectors. Then show that $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k\}$ forms a basis of its linear span. If $\mathbf{b} = c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \cdots + c_k\mathbf{x}_k$, then show that the system $A\mathbf{x} = \mathbf{b}$ has the unique solution $\mathbf{x} = \frac{c_1}{\lambda_1}\mathbf{x}_1 + \frac{c_2}{\lambda_2}\mathbf{x}_2 + \cdots + \frac{c_k}{\lambda_k}\mathbf{x}_k$ in that span.
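A quick numerical illustration of Corollary 6.1.17 (a sketch; the matrix is an arbitrary example with distinct eigenvalues): stacking the eigenvectors as columns gives a matrix of full rank, so they are linearly independent.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])
evals, vecs = np.linalg.eig(A)          # three distinct eigenvalues: 2, 3, 5
print(np.linalg.matrix_rank(vecs))      # 3, i.e. the eigenvectors are independent
```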

Let $A$ be a square matrix of order $n$ and let $T_A : \mathbb{R}^n \to \mathbb{R}^n$ be the corresponding linear transformation. In this section, we ask the question ``does there exist a basis $\mathcal{B}$ of $\mathbb{R}^n$ such that the matrix of the linear transformation $T_A$ with respect to $\mathcal{B}$ is in the simplest possible form?''

We know that the simplest forms for a matrix are the identity matrix and the diagonal matrix. In this section, we show that for a certain class of matrices $A$, we can find a basis $\mathcal{B}$ such that the matrix of $T_A$ with respect to $\mathcal{B}$ is a diagonal matrix, consisting of the eigenvalues of $A$. This is equivalent to saying that $A$ is similar to a diagonal matrix. To show the above, we need the following definition.

DEFINITION 6.2.1 (Matrix Diagonalisation) A matrix $A$ is said to be diagonalisable if there exists a non-singular matrix $P$ such that $P^{-1}AP$ is a diagonal matrix.

Remark 6.2.2 Let $A$ be an $n \times n$ diagonalisable matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. By definition, $A$ is similar to a diagonal matrix $D$. Observe that $D = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, as similar matrices have the same set of eigenvalues and the eigenvalues of a diagonal matrix are its diagonal entries.

EXAMPLE 6.2.3 Let $A$ be as given. Then we have the following:

1. Let Then has no real eigenvalue (see Example 6.1.8 and hence doesn't have Hence, there does not exist any non-singular real matrix eigenvectors that are vectors in such that 2. In case, are Define a and is a diagonal matrix. the two complex eigenvalues of respectively. Also, complex matrix by and Then are and the corresponding eigenvectors can be taken as a basis of

THEOREM 6.2.4 Let $A$ be an $n \times n$ matrix. Then $A$ is diagonalisable if and only if $A$ has $n$ linearly independent eigenvectors.

Proof . Let $A$ be diagonalisable. Then there exist matrices $P$ and $D$ such that

$P^{-1}AP = D$, with $D$ a diagonal matrix. Or equivalently, $AP = PD$. Let $P = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n]$ and $D = \operatorname{diag}(d_1, d_2, \ldots, d_n)$. Then $AP = PD$ implies that $A\mathbf{u}_i = d_i\mathbf{u}_i$ for $1 \leq i \leq n$. Since the $\mathbf{u}_i$'s are the columns of a non-singular matrix $P$, they are non-zero and so, for $1 \leq i \leq n$, we get the eigenpairs $(d_i, \mathbf{u}_i)$ of $A$. Since the $\mathbf{u}_i$'s are columns of the non-singular matrix $P$, using Corollary 4.3.9, we get

that $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ are linearly independent.

Thus we have shown that if $A$ is diagonalisable then $A$ has $n$ linearly independent eigenvectors.

Conversely, suppose $A$ has $n$ linearly independent eigenvectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $A\mathbf{u}_i = \lambda_i\mathbf{u}_i$ for $1 \leq i \leq n$. Let $P = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n]$. Since $\mathbf{u}_1, \ldots, \mathbf{u}_n$ are linearly independent, by Corollary 4.3.9, $P$ is non-singular. Also,
$$AP = [A\mathbf{u}_1, A\mathbf{u}_2, \ldots, A\mathbf{u}_n] = [\lambda_1\mathbf{u}_1, \lambda_2\mathbf{u}_2, \ldots, \lambda_n\mathbf{u}_n] = P\operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n).$$
Therefore the matrix $A$ is diagonalisable.

COROLLARY 6.2.5 Let $A$ be an $n \times n$ matrix. Suppose that the eigenvalues of $A$ are distinct. Then $A$ is diagonalisable.

Proof . As $A$ is an $n \times n$ matrix, it has $n$ eigenvalues. Since all the eigenvalues of $A$ are distinct, by Corollary 6.1.17, the $n$ corresponding eigenvectors are linearly independent. Hence, by Theorem 6.2.4, $A$ is diagonalisable.

COROLLARY 6.2.6 Let $A$ be an $n \times n$ matrix with

as its distinct eigenvalues and divides . Then but

its characteristic polynomial. Suppose that for each does not divides for some positive integers

Or equivalently Proof . As is diagonalisable, by Theorem 6.2.4, as has linearly independent eigenvalues. Also, , has exactly linearly

. Hence, for each eigenvalue

independent eigenvectors. Thus, for each has exactly Indeed Now suppose that for each choose

, the homogeneous linear system .

linearly independent vectors in its solution set. Therefore, for follows from a simple counting argument. . Then for each

, we can

linearly independent eigenvectors. Also by Corollary 6.1.17, the eigenvectors corresponding to has linearly independent eigenvectors.

distinct eigenvalues are linearly independent. Hence $A$ has $n$ linearly independent eigenvectors, and so, by Theorem 6.2.4, $A$ is diagonalisable.

EXAMPLE 6.2.7

1. Let It is easily seen that

Then and

Hence,

has eigenvalues

are the only eigenpairs. That is, the matrix Hence, by Theorem 6.2.4,

has exactly one eigenvector corresponding to the repeated eigenvalue the matrix is not diagonalisable. 2. Let can be easily verified that corresponds to the eigenvalue Then and Note that the set Hence,

has eigenvalues and

It

correspond to the eigenvalue

consisting of eigenvectors

corresponding to the eigenvalue are not orthogonal. This set can be replaced by the orthogonal set which still consists of eigenvectors corresponding to the eigenvalue as . Also, the set forms a basis of

So, by Theorem 6.2.4, the matrix

is diagonalisable. Also, if

is the

corresponding unitary matrix then Observe that the matrix is a symmetric matrix. In this case, the eigenvectors are mutually orthogonal. In general, for any real symmetric matrix there always exist eigenvectors and they are mutually orthogonal. This result will be proved later. EXERCISE 6.2.8 1. By finding the eigenvalues of the following matrices, justify whether or not real non-singular matrix and a real diagonal matrix for any with for some

2. Are the two matrices

and

diagonalisable? , where matrix. Suppose if and otherwise. is

3. Find the eigenvalues and eigenvectors of 4. Let be an matrix and an

Then show that

diagonalisable if and only if both and are diagonalisable. 5. Let be a linear transformation with

and

Then 1. determine the eigenvalues of 2. find the number of linearly independent eigenvectors corresponding to each eigenvalue? 3. is diagonalisable? Justify your answer. 6. Let be a non-zero square matrix such that Show that cannot be diagonalised. [Hint: Use Remark 6.2.2.] 7. Are the following matrices diagonalisable?
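As a small numerical illustration of Theorem 6.2.4 (the matrix below is an arbitrary example, not one from the exercises), collecting $n$ linearly independent eigenvectors as the columns of $P$ makes $P^{-1}AP$ diagonal:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
evals, P = np.linalg.eig(A)                   # columns of P are eigenvectors
D = np.linalg.inv(P) @ A @ P
print(np.allclose(D, np.diag(evals)))         # True: A is diagonalisable
```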

In this section, we will look at some special classes of square matrices which are diagonalisable. We will also recall the following be dealing with matrices having complex entries and hence for a matrix definitions. DEFINITION 6.3.1 (Special Matrices) 1. Note that 2. A square matrix with complex entries is called 1. a Hermitian matrix if 2. a unitary matrix if 3. a skew-Hermitian matrix if 4. a normal matrix if 3. A square matrix with real entries is called 1. a symmetric matrix if 2. an orthogonal matrix if 3. a skew-symmetric matrix if Note that a symmetric matrix is always Hermitian, a skew-symmetric matrix is always skew-Hermitian and an orthogonal matrix is always unitary. Each of these matrices are normal. If is a unitary matrix then EXAMPLE 6.3.2 1. Let Then is skew-Hermitian. is called the conjugate transpose of the matrix

2. Let that

and is also a normal matrix.

Then

is a unitary matrix and

is a normal matrix. Note

DEFINITION 6.3.3 (Unitary Equivalence) Let equivalent if there exists a unitary matrix

and

be two

matrices. They are called unitarily

such that

Note that EXERCISE 6.3.4

as

is a unitary matrix. So,

is unitarily similar to the matrix

1. Let be a square matrix such that is a normal matrix. 2. Let be any matrix. Then and

is a diagonal matrix for some unitary matrix where

. Prove that

is the Hermitian part of

is the skew-Hermitian part of where both and are Hermitian matrices.

3. Every matrix can be uniquely expressed as 4. Show that is always skew-Hermitian. such that

5. Does there exist a unitary matrix and

where

PROPOSITION 6.3.5 Let $A$ be an $n \times n$ Hermitian matrix. Then all the eigenvalues of $A$ are real.

Proof . Let $(\lambda, \mathbf{x})$ be an eigenpair. Then $A\mathbf{x} = \lambda\mathbf{x}$ and $A^* = A$ implies
$$\mathbf{x}^* A = \mathbf{x}^* A^* = (A\mathbf{x})^* = (\lambda\mathbf{x})^* = \overline{\lambda}\,\mathbf{x}^*.$$
Hence
$$\lambda\,\mathbf{x}^*\mathbf{x} = \mathbf{x}^*(\lambda\mathbf{x}) = \mathbf{x}^*(A\mathbf{x}) = (\mathbf{x}^* A)\mathbf{x} = \overline{\lambda}\,\mathbf{x}^*\mathbf{x}.$$
But $\mathbf{x}$ is an eigenvector and hence $\mathbf{x} \neq \mathbf{0}$. That is, the real number $\mathbf{x}^*\mathbf{x} = \|\mathbf{x}\|^2$ is non-zero as well. Thus $\lambda = \overline{\lambda}$, that is, $\lambda$ is a real number.
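A quick numerical check of this proposition, and of the unitary diagonalisation asserted by the next theorem (a sketch; the Hermitian matrix below is an arbitrary example):

```python
import numpy as np

A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
assert np.allclose(A, A.conj().T)                        # A is Hermitian

evals, U = np.linalg.eigh(A)                             # eigh is tailored to Hermitian matrices
print(evals)                                             # real eigenvalues
print(np.allclose(U.conj().T @ U, np.eye(2)))            # U is unitary
print(np.allclose(U.conj().T @ A @ U, np.diag(evals)))   # U*AU is diagonal
```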

THEOREM 6.3.6 Let $A$ be an $n \times n$ Hermitian matrix. Then $A$ is unitarily diagonalisable. That is, there exists a unitary matrix $U$ such that $U^* A U = D$, where $D$ is a diagonal matrix with the eigenvalues of $A$ as the diagonal entries. In other words, the eigenvectors of $A$ form an orthonormal basis of $\mathbb{C}^n$.

Proof . We will prove the result by induction on the size of the matrix. The result is clearly true if Let the result be true for we will prove the result in case So, let be a matrix and let be an eigenpair of orthonormal basis with We now extend the linearly independent set (using Gram-Schmidt Orthogonalisation) of . to form an

As

is an orthonormal set,

Therefore, observe that for all

Hence, we also have as columns of

for

Now, define is a unitary matrix and

(with

). Then the matrix

where

is a

matrix. As

,we get

. This condition, . That is, is

together with the fact that

is a real number (use Proposition 6.3.5), implies that

also a Hermitian matrix. Therefore, by induction hypothesis there exists a such that

unitary matrix

Recall that , the entries

for

are the eigenvalues of the matrix

We also know that two are Define

similar matrices have the same set of eigenvalues. Hence, the eigenvalues of Then is a unitary matrix and

Thus,

is a diagonal matrix with diagonal entries

the eigenvalues of

Hence, the

result follows.

COROLLARY 6.3.7 Let $A$ be an $n \times n$ real symmetric matrix. Then
1. the eigenvalues of $A$ are all real,
2. the corresponding eigenvectors can be chosen to have real entries, and
3. the eigenvectors also form an orthonormal basis of $\mathbb{R}^n$.

Proof . As $A$ is symmetric, $A$ is also an Hermitian matrix. Hence, by Proposition 6.3.5, the eigenvalues of $A$ are all real. Let $(\lambda, \mathbf{x})$ be an eigenpair of $A$. Suppose $\mathbf{x} \in \mathbb{C}^n$. Then there exist $\mathbf{y}, \mathbf{z} \in \mathbb{R}^n$ such that $\mathbf{x} = \mathbf{y} + i\mathbf{z}$. So,
$$\lambda(\mathbf{y} + i\mathbf{z}) = \lambda\mathbf{x} = A\mathbf{x} = A(\mathbf{y} + i\mathbf{z}) = A\mathbf{y} + iA\mathbf{z}.$$
Comparing the real and imaginary parts, we get $A\mathbf{y} = \lambda\mathbf{y}$ and $A\mathbf{z} = \lambda\mathbf{z}$. Thus, we can choose the

To prove the orthonormality of the eigenvectors, we proceed on the lines of the proof of Theorem 6.3.6, Hence, the readers are advised to complete the proof. height6pt width 6pt depth 0pt EXERCISE 6.3.8 1. Let be a skew-Hermitian matrix. Then all the eigenvalues of are either zero or purely imaginary. Also, the eigenvectors corresponding to distinct eigenvalues are mutually orthogonal. [Hint: Carefully study the proof of Theorem 6.3.6.] be an unitary matrix. Then 2. Let 1. the rows of form an orthonormal basis of 2. the columns of form an orthonormal basis of 3. for any two vectors 4. for any vector

5. for any eigenvalue 6. the eigenvectors if 3. Let for 4. Show that the matrices and are similar. Is it possible to find a unitary and corresponding to distinct eigenvalues are eigenpairs, with then and and satisfy That is,

are mutually orthogonal. then is an eigenpair

be a normal matrix. Then, show that if

is an eigenpair for

such that matrix 5. Let be a orthogonal matrix. Then prove the following: 1. if then for some

2. if

then there exists a basis of

in which the matrix of

looks like

Or equivalently, reflects the vectors in

for some

In this case, prove that

about a line passing through origin. Also, determine this line.

6. Let 7. Let be a

Determine

orthogonal matrix. Then prove the following: then is a rotation about a fixed axis, in the sense that to the plane has an eigenpair

1. if

such that the restriction of 2. if then the action of

is a two dimensional rotation of followed by

corresponds to a reflection through a plane

a rotation about the line through the origin that is perpendicular to Remark 6.3.9 In the previous exercise, we saw that the matrices and are

similar but not unitarily equivalent, whereas unitary equivalence implies similarity equivalence as But in numerical calculations, unitary transformations are preferred as compared to similarity transformations. The main reasons being: 1. Exercise 6.3.8.2 implies that an orthonormal change of basis leaves unchanged the sum of squares of the absolute values of the entries which need not be true under a non-orthonormal change of basis.

2. As

for a unitary matrix

unitary equivalence is computationally simpler.

3. Also in doing ``conjugate transpose'', the loss of accuracy due to round-off errors doesn't occur.

We next prove Schur's Lemma and use it to show that normal matrices are unitarily diagonalisable.

LEMMA 6.3.10 (Schur's Lemma) Every $n \times n$ complex matrix is unitarily similar to an upper triangular matrix.

Proof . We will prove the result by induction on the size of the matrix. The result is clearly true if $n = 1$. Let the result be true for matrices of order $n - 1$; we will prove the result for matrices of order $n$. So, let $A$ be an $n \times n$ matrix and let $(\lambda_1, \mathbf{x})$ be an eigenpair for $A$ with $\|\mathbf{x}\| = 1$. Then the linearly independent set $\{\mathbf{x}\}$ can be extended, using the Gram-Schmidt Orthogonalisation process, to get an orthonormal basis $\{\mathbf{x}, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ of $\mathbb{C}^n$. Then $U_1 = [\mathbf{x}, \mathbf{u}_2, \ldots, \mathbf{u}_n]$ (with $\mathbf{x}, \mathbf{u}_2, \ldots, \mathbf{u}_n$ as the columns of the matrix $U_1$) is a unitary matrix and
$$U_1^* A U_1 = \begin{pmatrix} \lambda_1 & * \\ \mathbf{0} & B \end{pmatrix},$$
where $B$ is an $(n-1) \times (n-1)$ matrix. By the induction hypothesis there exists an $(n-1) \times (n-1)$ unitary matrix $U_2$ such that $U_2^* B U_2$ is an upper triangular matrix whose diagonal entries are the eigenvalues of $B$. Observe that, since the eigenvalues of $B$ are also eigenvalues of $A$, these diagonal entries are eigenvalues of the matrix $A$. Define
$$U = U_1 \begin{pmatrix} 1 & \mathbf{0} \\ \mathbf{0} & U_2 \end{pmatrix}.$$
Then check that $U$ is a unitary matrix and $U^* A U$ is an upper triangular matrix with diagonal entries the eigenvalues of $A$. Hence, the result follows.

EXERCISE 6.3.11 1. Let $A$ be an $n \times n$

real invertible matrix. Prove that there exists an orthogonal matrix with positive diagonal entries such that and .

and a

diagonal matrix 2. Show that matrices

are unitarily equivalent via the unitary

matrix

Hence, conclude that the upper triangular matrix obtained in the

"Schur's Lemma" need not be unique. 3. Show that the normal matrices are diagonalisable. [Hint: Show that the matrix in the proof of the above theorem is also a normal matrix and if an upper triangular matrix with then has to be a diagonal matrix]. Remark 6.3.12 (The Spectral Theorem for Normal Matrices) Let be an normal matrix. of

is

Then the above exercise shows that there exists an orthonormal basis such that for

be a normal matrix. Prove the following: 4. Let 1. if all the eigenvalues of are then 2. if all the eigenvalues of 5. Let be an 1. if 2. if is Hermitian and are then for all then for all . then . matrix. Prove that

is a real, symmetric matrix and

Do these results hold for arbitrary matrices? We end this chapter with an application of the theory of diagonalisation to the study of conic sections in analytic geometry and the study of maxima and minima in analysis.
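Before turning to that application, here is a compact numerical sketch of the results of this section (the matrix is an arbitrary normal example): Schur's Lemma produces an upper triangular matrix under a unitary similarity, and for a normal matrix that triangular matrix is diagonal, as in Remark 6.3.12.

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])                    # A is normal: A A* = A* A
T, U = schur(A, output='complex')              # A = U T U*, U unitary, T upper triangular
print(np.allclose(A, U @ T @ U.conj().T))      # True
print(np.allclose(T, np.diag(np.diag(T))))     # True here (up to rounding), since A is normal
```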

DEFINITION 6.4.1 (Bilinear Form) Let

be a

matrix with real entries. A bilinear form in

is an expression of the type

Observe that if (the identity matrix) then the bilinear form reduces to the standard real inner product. Also, if we want it to be symmetric in and then it is necessary and sufficient that for all Why? Hence, any symmetric bilinear form is naturally associated with a real symmetric matrix. DEFINITION 6.4.2 (Sesquilinear Form) Let form in be a matrix with complex entries. A sesquilinear is given by

Note that if (the identity matrix) then the sesquilinear form reduces to the standard complex inner product. Also, it can be easily seen that this form is `linear' in the first component and `conjugate linear' in the second component. Also, if we want Note that if The expression and of and then the matrix need to be an Hermitian matrix. , then the sesquilinear form reduces to a bilinear form. is called the quadratic form and in place of and is a real number. , the Hermitian form can be rewritten as the Hermitian form. We generally write

, respectively. It can be easily shown that for any choice

the Hermitian form

Therefore, in matrix notation, for a Hermitian matrix

EXAMPLE 6.4.3 Let the Hermitian form

Then check that

is an Hermitian matrix and for

where `Re' denotes the real part of a complex number. This shows that for every choice of Hermitian form is always real. Why? The main idea is to express Note that if we replace by

the

as sum of squares and hence determine the possible values that it can take. where is any complex number, then for which ( i.e., simply gets multiplied by

and hence one needs to study only those From Exercise 6.3.11.3 one knows that if such that ( know are real). So, taking

is a normalised vector.

is Hermitian) then there exists a unitary matrix with 's the eigenvalues of the matrix which we 's as linear combination of 's with coefficients

(i.e., choosing ), one gets

coming from the entries of the matrix

(6.4.1)

Thus, one knows the possible values that case is a Hermitian matrix. Also, for

can take depending on the eigenvalues of the matrix

in

represents the principal axes of the conic that

they represent in the n-dimensional space. Equation (6.4.1) gives one method of writing as a sum of absolute squares of linearly independent as sum of squares. The

linear forms. One can easily show that there are more than one way of writing question arises, ``what can we say about the coefficients when squares".

has been written as sum of absolute

This question is answered by `Sylvester's law of inertia' which we state as the next lemma.

LEMMA 6.4.4 Every Hermitian form written as

(with

an Hermitian matrix) in

variables can be

where

are linearly independent linear forms in depend only on

and the integers

and

Proof . From Equation (6.4.1) it is easily seen that are uniquely given by

has the required form. Need to show that

and

Hence, let us assume on the contrary that there exist positive integers

with

such that /

Since, find a matrix get such that

and Choose

are linear combinations of . Since such that

we can Theorem 2.5.1, Hence, we

gives the existence of finding nonzero values of

Now, this can hold only if Similarly, the case

which gives a contradiction. Hence

can be resolved.

Note: The integer degree of

is the rank of the matrix

and the number

is sometimes called the inertial

We complete this chapter by understanding the graph of

for

We first look at the following example.

EXAMPLE 6.4.5 Sketch the graph of Solution: Note that

The eigenpairs for

are

Thus,

Let

Then

Thus the given graph reduces to

Therefore, the given graph represents an ellipse with the principal axes principal axes are

and

That is, the

The eccentricity of the ellipse is

the foci are at the points

and

and the equations of the directrices are
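The reduction used above can also be checked numerically. The sketch below uses the conic $3x^2 + 4xy + 3y^2 = 5$ as an assumed stand-in, since the coefficients of the example are not reproduced here; the method is the one described in the solution.

```python
import numpy as np

A = np.array([[3.0, 2.0],
              [2.0, 3.0]])                  # x^t A x = 3x^2 + 4xy + 3y^2
evals, P = np.linalg.eigh(A)                # A = P diag(evals) P^t with P orthogonal
print(evals)                                # [1., 5.]
# In the rotated coordinates (u, v) = P^t (x, y) the conic becomes
# evals[0]*u^2 + evals[1]*v^2 = 5, an ellipse whose principal axes are the
# eigenvector directions (the columns of P).
```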

Figure 6.1: Ellipse DEFINITION 6.4.6 (Associated Quadratic Form) Let equation of a general conic. The quadratic expression be the

is called the quadratic form associated with the given conic. We now consider the general conic. We obtain conditions on the eigenvalues of the associated quadratic form to characterise the different conic sections in (endowed with the standard inner product). PROPOSITION 6.4.7 Consider the general conic

Prove that this conic represents 1. an ellipse if 2. a parabola if 3. a hyperbola if and

Proof . Let

Then the associated quadratic form

As

is a symmetric matrix, by Corollary 6.3.7, the eigenvalues are orthonormal and

of

are both real, the corresponding

eigenvectors

is unitarily diagonalisable with

(6.4.2)

Let

Then

and the equation of the conic section in the

-plane, reduces to

Now, depending on the eigenvalues 1. Substituting

we consider different cases:

in (6.4.2) gives in the -plane.

Thus, the given conic reduces to a straight line

2. In this case, the equation of the conic reduces to

1. If 2. If 1. If 2. If 3. If

then in the

-plane, we get the pair of coincident lines

then we get a pair of parallel lines the solution set corresponding to the given conic is an empty set. Then the given equation is of the form for some translates

and

and thus represents a parabola. implies that the That is,

Also, observe that 3. Let and

Then the equation of the conic can be rewritten as

In this case, we have the following: 1. suppose Then the equation of the conic reduces to

The terms on the left can be written as product of two factors as the given equation represents a pair of intersecting straight lines in the 2. suppose As we can assume

Thus, in this case, -plane.

So, the equation of the conic reduces to

This equation represents a hyperbola in the

-plane, with principal axes

As

we have

4. In this case, the equation of the conic can be rewritten as

we now consider the following cases: 1. suppose Then the equation of the ellipse reduces to a pair of perpendicular lines and 2. suppose in the -plane.

Then there is no solution for the given equation. Hence, we do not get any real

ellipse in the 3. suppose

-plane. In this case, the equation of the conic reduces to

This equation represents an ellipse in the

-plane, with principal axes

Also, the condition

implies that

Remark 6.4.8 Observe that the condition

implies that the principal axes of the conic are functions of the eigenvectors EXERCISE 6.4.9 Sketch the graph of the following surfaces: 1. 2. 3. 4.

and

As a last application, we consider the following problem that helps us in understanding the quadrics. Let (6.4.3)

be a general quadric. Then we need to follow the steps given below to write the above quadric in the standard form and thereby get the picture of the quadric. The steps are:

1. Observe that this equation can be rewritten as

where

2. As the matrix is symmetric matrix, find an orthogonal matrix matrix. 3. Replace the vector by Then writing

such that

is a diagonal

the equation (6.4.3) reduces to (6.4.4)

where

are the eigenvalues of so

4. Complete the squares, if necessary, to write the equation (6.4.4) in terms of the variables

that this equation is in the standard form. 5. Use the relation between the original and the new variables to determine the centre and the planes of symmetry of the quadric in terms of the original coordinate system.
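A short numerical sketch of steps 1-3 (the quadric below is an illustrative stand-in, not the one treated in Example 6.4.10):

```python
import numpy as np

# Assumed quadric: 2x^2 + 2y^2 + 3z^2 + 2xy + 4x - 6z + 1 = 0, i.e. x^t A x + b.x + 1 = 0.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])
b = np.array([4.0, 0.0, -6.0])

evals, P = np.linalg.eigh(A)        # step 2: orthogonal P with P^t A P diagonal
b_new = P.T @ b                     # step 3: replace x by P y; the quadric becomes
print(evals, b_new)                 #   sum_i evals[i]*y_i^2 + (b_new . y) + 1 = 0
# Step 4 (completing squares in each y_i) then puts the equation in standard form.
```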

EXAMPLE 6.4.10 Determine the quadric. Solution: In this case,

and

and

. Check that for the orthonormal matrix

So, the equation of the quadric reduces to

Or equivalently,

So, the equation of the quadric in standard form is

where the point obtained above is the centre. The calculation of the planes of symmetry is left as an exercise to the reader.
