
Quick notes on Linear Algebra

by Vladimir Vargas Calderon - Universidad Nacional de Colombia


Definition Let u and v be two nonzero vectors. Then the projection of u onto v is

proj_v u = ((u · v)/|v|²) v.    (1)
Definition Let A be a set and ∗ an operation. Then A has closure under ∗ if for every x, y ∈ A, (x ∗ y) ∈ A.
Definition A set V is a vector space if it is closed under vector addition and scalar multiplication. If H ⊆ V is a vector space, we say that H is a subspace of V.
Definition Let V be a vector space and let v_1, ..., v_n ∈ V. These vectors are linearly dependent if there are scalars c_1, ..., c_n ∈ R such that:

i) ∑_{j=1}^n |c_j| ≠ 0,

ii) ∑_{j=1}^n c_j v_j = 0.

They are linearly independent if they are not linearly dependent. Condition ii) implies that if we let

A = ( a_11  a_12  ...  a_1n )
    ( a_21  a_22  ...  a_2n )
    ( ...   ...   ...  ...  )
    ( a_m1  a_m2  ...  a_mn ),    (2)

and if we consider the columns of A as vectors, then they are linearly dependent if and only if Ac = 0 for a non-trivial c.
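A small numerical check of this criterion (a sketch with NumPy; the example matrix is hypothetical): the columns are dependent exactly when the rank of A is smaller than the number of columns, i.e. when Ac = 0 has a non-trivial solution.

import numpy as np

# Third column is the sum of the first two, so the columns are linearly dependent.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [2.0, 3.0, 5.0]])

print(np.linalg.matrix_rank(A))   # 2 < 3 columns -> dependent

# A non-trivial c with Ac = 0, e.g. c = (1, 1, -1):
c = np.array([1.0, 1.0, -1.0])
print(A @ c)                      # [0. 0. 0.]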
Definition Let A be an m × n matrix. Then the null space of A is

null A = {x ∈ R^n : Ax = 0}.    (3)

Also dim(null A) is the nullity of A. We shall write null A both for the null space and for the nullity. Note that if null A = {0} as a set, then null A = 0 as a number.
Definition Let A be an m × n matrix. Then the image of A is

image A = {y ∈ R^m : Ax = y for some x ∈ R^n}.    (4)

Also dim(image A) = rank A is the rank of A.


Theorem 1 For any matrix A, its column space C_A equals its image: C_A = image A.
Proof Suppose that y ∈ image A. Then there is an x such that y = Ax. Note that Ax is a linear combination of the columns of A. Hence y ∈ C_A, which implies image A ⊆ C_A. Similarly, if y ∈ C_A then it can be written as a linear combination of the columns of A. Let x be the column of coefficients of this combination; then y = Ax implies that y ∈ image A. Hence C_A ⊆ image A. Therefore C_A = image A.

Theorem 2 Let A be an m × n matrix. Then the dimension of its row space equals the dimension of its column space:

dim R_A = dim C_A.    (5)
Proof Let the rows of A be r_1, ..., r_m and let k = dim R_A. Let S = {s_1, ..., s_k} be a basis for R_A. Then each row of A can be written as a linear combination of the vectors in S, so there are constants α_ij such that

r_1 = α_11 s_1 + α_12 s_2 + ... + α_1k s_k
r_2 = α_21 s_1 + α_22 s_2 + ... + α_2k s_k
...
r_m = α_m1 s_1 + α_m2 s_2 + ... + α_mk s_k.    (6)

Consider the j-th component of r_i, which is a_ij. Together with the fact that s_i = (s_i1, s_i2, ..., s_in), the last equations yield

a_1j = α_11 s_1j + α_12 s_2j + ... + α_1k s_kj
a_2j = α_21 s_1j + α_22 s_2j + ... + α_2k s_kj
...
a_mj = α_m1 s_1j + α_m2 s_2j + ... + α_mk s_kj,    (7)

that is,

(a_1j, ..., a_mj)^T = s_1j (α_11, ..., α_m1)^T + ... + s_kj (α_1k, ..., α_mk)^T.    (8)

Let α_i = (α_1i, ..., α_mi)^T. Note that the left-hand side of the last equation is just the j-th column of A. Since each column of A can be written as a linear combination of α_1, ..., α_k, these vectors span the column space C_A. This means that dim C_A ≤ k = dim R_A. The last inequality holds for any matrix A. In particular, it holds for A^T. Since C_{A^T} = R_A and R_{A^T} = C_A, applying the inequality to A^T gives dim R_A = dim C_{A^T} ≤ dim R_{A^T} = dim C_A. Together with dim C_A ≤ dim R_A, this proves the equality.
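The equality dim R_A = dim C_A can be checked numerically (a sketch with NumPy; the rectangular matrix below is an arbitrary example): the rank of A and of A^T coincide.

import numpy as np

A = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 4.0, 6.0, 8.0],   # twice the first row
              [0.0, 1.0, 1.0, 0.0]])

print(np.linalg.matrix_rank(A))     # 2  (dimension of the column space)
print(np.linalg.matrix_rank(A.T))   # 2  (dimension of the row space)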
Theorem 3 Let A be an m × n matrix. Then every vector in the row space R_A is orthogonal to every vector in the null space null A, and we say that R_A ⊥ null A.
Proof Let x ∈ null A. Then Ax = 0. In particular, the i-th row of the last equation is given by

∑_{j=1}^n a_ij x_j = (a_i1, ..., a_in) · (x_1, ..., x_n)^T = 0,    (9)

so each row of A is orthogonal to x. Since every vector in R_A is a linear combination of the rows of A, it is also orthogonal to x.
Definition Let A be an n × n matrix. The ij cofactor of A, denoted by A_ij, is given by

A_ij = (−1)^(i+j) det(M_ij),    (10)

where M_ij is the (n − 1) × (n − 1) matrix obtained from A by deleting its i-th row and its j-th column.
Definition Let A be an n × n matrix. Then:

det A = a_i1 A_i1 + a_i2 A_i2 + ... + a_in A_in = ∑_{k=1}^n a_ik A_ik    (11)
      = a_1j A_1j + a_2j A_2j + ... + a_nj A_nj = ∑_{k=1}^n a_kj A_kj.    (12)

That is, we can compute det A by expanding by cofactors along any row of A or along any column of A.

Definition Let A be an n × n matrix. Let B be the matrix of cofactors of A:

B = ( A_11  A_12  ...  A_1n )
    ( A_21  A_22  ...  A_2n )
    ( ...   ...   ...  ...  )
    ( A_n1  A_n2  ...  A_nn ).    (13)

Then adj A = B^T is the adjoint (adjugate) matrix of A, and it gives

A^{-1} = (1/det A) adj A,    (14)

where A^{-1} is the inverse of A, defined as the unique matrix that satisfies A A^{-1} = A^{-1} A = I.
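Equation (14) can be verified directly for a small matrix (a sketch with NumPy; the example matrix is arbitrary, and the cofactors are computed by deleting row i and column j as in Equation (10)):

import numpy as np

def adjugate(A):
    """adj A = B^T, where B is the matrix of cofactors of A (Equations (10), (13))."""
    n = A.shape[0]
    B = np.zeros_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            M = np.delete(np.delete(A, i, axis=0), j, axis=1)   # minor M_ij
            B[i, j] = (-1) ** (i + j) * np.linalg.det(M)
    return B.T

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

A_inv = adjugate(A) / np.linalg.det(A)           # Equation (14)
print(np.allclose(A_inv, np.linalg.inv(A)))      # True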
Proposition 4 Let A, B be two n × n matrices. Then:
1. det(AB) = det A det B.
2. det A^T = det A.
3. If any row or column of A is 0, then det A = 0.
4. If any row or column of A is multiplied by c ∈ R, then det A is multiplied by c as well.
5. If any row or column of A is exchanged with another row or column (respectively), then det A is multiplied by −1.
6. If there are two equal rows or columns in A, then det A = 0.
7. In general if a column or row of A is a linear combination of other columns or rows (respectively), then det A = 0.
8. If we add a multiple of one row to another row (or of one column to another column), det A does not change.
9. det A^{-1} = 1/det A.
Proposition 5 (Rank-nullity theorem) Let A be an m × n matrix over some field; then rank A + null A = n. Here rank A = dim(image A), where image A is the image of A as a linear transformation.
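Proposition 5 checked numerically (a sketch; the 3 × 5 example matrix is arbitrary, and scipy.linalg.null_space gives dim(null A) as its number of columns):

import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 0.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0, 0.0],
              [1.0, 1.0, 3.0, 0.0, 1.0]])   # third row = first + second

rank = np.linalg.matrix_rank(A)          # 2
nullity = null_space(A).shape[1]         # 3
print(rank + nullity == A.shape[1])      # True: rank A + null A = n = 5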
Theorem 6 Let A be an n × n matrix. Then the following statements are equivalent:
1. A is invertible.
2. The only solution to the homogeneous system Ax = 0 is x = 0.
3. The system Ax = b has a unique solution for each n-vector b.
4. det A ≠ 0.
5. rank A = n.
6. null A = 0.
Proof We show each implication below; chained together, they prove the theorem.

1 ⇒ 2: If A is invertible then Ax = 0 ⇒ x = A^{-1} 0 = 0.
1 ⇒ 3: If A is invertible then Ax = b ⇒ x = A^{-1} b, so x exists and is unique for each b.
1 ⇒ 4: Since A^{-1} A = I, then det(A^{-1} A) = det A^{-1} det A = det I = 1 ⇒ det A ≠ 0.
1 ⇒ 5: If A is invertible then there is B with AB = I, where B is the inverse of A. We can write B = (b_1 ... b_n), where b_i is the i-th column of B. Then Ab_i = e_i, where e_i is the canonical basis vector corresponding to the i-th column of I. Hence the image of A contains a basis of R^n, so dim(image A) = n = rank A.
2 ⇒ 3: Suppose the only solution to the homogeneous system Ax = 0 is x = 0, and let Ax = Ay = b. Then A(x − y) = 0, so x − y = 0, i.e. x = y; hence A is injective. An injective linear map from R^n to R^n is also surjective, so the system Ax = b has a unique solution for each b.
2 ⇒ 6: If Ax = 0 implies x = 0, then null A = {0}, i.e. null A = 0.
3 ⇒ 5: If for each n-vector b there is a unique x with Ax = b, then image A = R^n, so rank A = n.
4 ⇒ 1: Note that if the E_i are elementary matrices (the ones that add multiples of rows to other rows, or perform permutations), and if T is an upper triangular matrix, then A = E_1 E_2 ... E_m T. We know that

det A = det E_1 det E_2 ... det E_m det T.    (15)

Considering that det E_i ≠ 0 for every i (a permutation can be undone by another permutation, i.e. its inverse; the same holds for a row operation), det A ≠ 0 ⇒ det T ≠ 0. Since det T ≠ 0, the diagonal entries of the upper triangular matrix T are nonzero, so T is invertible; each E_i is invertible, and the product of invertible matrices is invertible, so A must be invertible.
5 ⇒ 1: dim(image A) = n implies that for every i there is v_i with Av_i = e_i. If B is the matrix whose columns are the v_i, then AB = I, hence A is invertible.
5 ⇔ 6: By Proposition 5 the result holds.
Chaining these implications connects all six statements, so they are equivalent (and if one of them is false, the others are false as well).
Proposition 7 Let B_1 = {u_1, u_2, ..., u_n} and B_2 = {v_1, v_2, ..., v_n} be two bases for the vector space V. Let x ∈ V. Then we can write x in terms of the two bases:

x = b_1 u_1 + ... + b_n u_n,    (16)
x = c_1 v_1 + ... + c_n v_n,    (17)

where b_i, c_i ∈ R. We write (x)_{B_1} = (b_1, ..., b_n)^T for the representation of x in terms of the basis B_1 and (x)_{B_2} = (c_1, ..., c_n)^T for the representation of x in terms of the basis B_2. Suppose that w_1 = a_1 u_1 + ... + a_n u_n and w_2 = b_1 u_1 + ... + b_n u_n. Note that w_1 + w_2 = (a_1 + b_1) u_1 + ... + (a_n + b_n) u_n. This shows that the vector space V, written in coordinates with respect to the basis B_1, is closed under vector addition; in fact, in coordinates with respect to any basis it is still a vector space.
Since B_2 is a basis, each u_j ∈ B_1 can be written as a linear combination of the v_i. In particular there are a_1j, a_2j, ..., a_nj such that for j = 1, 2, ..., n:

u_j = a_1j v_1 + ... + a_nj v_n.    (18)

So that:

(u_j)_{B_2} = (a_1j, a_2j, ..., a_nj)^T.    (19)
Now Equation (16) together with Equation (18) yields:

x = ∑_{j=1}^n b_j u_j = ∑_{i,j=1}^n a_ij b_j v_i.    (20)

Note that writing x in B_2 is then just taking c_i = ∑_{j=1}^n a_ij b_j, that is,

(x)_{B_2} = ( a_11  ...  a_1n ) ( b_1 )
            ( ...   ...  ...  ) ( ... ).    (21)
            ( a_n1  ...  a_nn ) ( b_n )

We define the transition matrix from the basis B_1 to B_2 as:

A = ( a_11  ...  a_1n )
    ( ...   ...  ...  ) = ( (u_1)_{B_2}  ...  (u_n)_{B_2} ).    (22)
    ( a_n1  ...  a_nn )

Hence (x)_{B_2} = A (x)_{B_1}. Note that the transition matrix from B_2 to B_1 is just A^{-1}.
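A small change-of-basis computation (a sketch with NumPy; the two bases of R^2 below are arbitrary examples, stacked as the columns of U and V). The columns of the transition matrix A are the B_2-coordinates of the B_1 vectors, as in Equation (22); here they are obtained by solving the linear systems u_j = ∑_i a_ij v_i.

import numpy as np

# Basis vectors as columns: U = (u1 u2), V = (v1 v2).
U = np.array([[1.0, 1.0],
              [0.0, 1.0]])
V = np.array([[1.0,  1.0],
              [1.0, -1.0]])

# The j-th column of A solves V a = u_j, i.e. A = V^{-1} U (columns are (u_j)_{B2}).
A = np.linalg.solve(V, U)

x_B1 = np.array([2.0, 3.0])          # coordinates of some x in the basis B1
x_B2 = A @ x_B1                      # (x)_{B2} = A (x)_{B1}

# Both coordinate vectors describe the same x:
print(np.allclose(U @ x_B1, V @ x_B2))   # True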
Definition A set S = {u_1, ..., u_k} ⊂ R^n is an orthonormal set if

u_i · u_j = 0 if i ≠ j,
u_i · u_j = 1 if i = j.
Definition An n × n matrix Q is said to be orthogonal if Q is invertible and Q^{-1} = Q^T.
Theorem 8 An n × n matrix Q is orthogonal if and only if its columns form an orthonormal basis for K^n.
Proof Let

Q = ( a_11  ...  a_1n )
    ( ...   ...  ...  ).    (23)
    ( a_n1  ...  a_nn )

Then

Q^T = ( a_11  ...  a_n1 )
      ( ...   ...  ...  ).    (24)
      ( a_1n  ...  a_nn )

Let B = (b_ij) = Q^T Q; then

b_ij = a_1i a_1j + a_2i a_2j + ... + a_ni a_nj = c_i · c_j,    (25)

where c_i denotes the i-th column of Q. If the columns of Q are orthonormal, then:

b_ij = 0 if i ≠ j,
b_ij = 1 if i = j.    (26)

This implies that B = I, which is only possible if Q^T = Q^{-1}, so Q is orthogonal. Conversely, if Q is orthogonal, then Q^T = Q^{-1} implies that B = I, which means that Equation (26) holds and therefore the columns of Q are orthonormal.
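A quick check of Theorem 8 with a rotation matrix (a sketch; rotations are a standard example of orthogonal matrices, and the angle is arbitrary):

import numpy as np

theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.allclose(Q.T @ Q, np.eye(2)))        # True: Q^T Q = I
print(np.allclose(np.linalg.inv(Q), Q.T))     # True: Q^{-1} = Q^T
# Columns are orthonormal:
print(np.isclose(Q[:, 0] @ Q[:, 1], 0.0), np.isclose(Q[:, 0] @ Q[:, 0], 1.0))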

Definition Let H be a subspace of K^n (K a field, as always) with orthonormal basis {u_1, u_2, ..., u_k}. If v ∈ K^n, then the orthogonal projection of v onto H is proj_H v = (v · u_1)u_1 + (v · u_2)u_2 + ... + (v · u_k)u_k. Note that proj_H v ∈ H. Compare this with the definition of the projection of a vector onto another vector: since the basis is orthonormal, the factors 1/|u_j|² are just 1.
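A numerical illustration of proj_H v (a sketch; H is taken, as a hypothetical example, to be the plane of R^3 spanned by the first two standard basis vectors, which are already orthonormal):

import numpy as np

u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 0.0])
v = np.array([2.0, -1.0, 4.0])

proj_H_v = (v @ u1) * u1 + (v @ u2) * u2
print(proj_H_v)                                   # [ 2. -1.  0.]
print((v - proj_H_v) @ u1, (v - proj_H_v) @ u2)   # 0.0 0.0: the residual lies in H-perp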
Definition Let H be a subspace of K^n. The orthogonal complement of H is H^⊥ = {x ∈ K^n : x · h = 0 for all h ∈ H}.
Theorem 9 If H is a subspace of K^n then:
i) H^⊥ is a subspace of K^n.
ii) H ∩ H^⊥ = {0}.
iii) dim H^⊥ = n − dim H.
Proof We prove i and iii, since ii is obvious.
i) Let x, y ∈ H^⊥ and h ∈ H. If α ∈ R, then (αx + y) · h = α(x · h) + y · h = 0, so H^⊥ is a subspace indeed.
iii) Let {u_1, ..., u_k} be an orthonormal basis for H. Since H is a subspace, this basis can be extended to an orthonormal basis for K^n: {u_1, ..., u_k, v_{k+1}, ..., v_n}. Now, let x ∈ H^⊥; expanding x in this basis,

x = (x · u_1)u_1 + (x · u_2)u_2 + ... + (x · u_k)u_k + (x · v_{k+1})v_{k+1} + ... + (x · v_n)v_n.

But x · u_j = 0 for j = 1, ..., k. Hence x = (x · v_{k+1})v_{k+1} + ... + (x · v_n)v_n. Since each v_j is orthogonal to every u_i, and hence to every vector of H, the vectors v_{k+1}, ..., v_n lie in H^⊥ and, by the above, span it: {v_{k+1}, ..., v_n} is a basis for H^⊥. Thus dim H^⊥ = n − k.
Theorem 10 Let H be a subspace of K^n, and let v ∈ K^n. Then there is a unique pair of vectors h ∈ H, p ∈ H^⊥ such that v = h + p. In particular h = proj_H v and p = proj_{H^⊥} v.
Proof Let h = proj_H v and let p = v − h. Let {u_1, ..., u_k} be an orthonormal basis for H. Then

h = (v · u_1)u_1 + (v · u_2)u_2 + ... + (v · u_k)u_k.

If x ∈ H, then there are constants α_1, ..., α_k ∈ R such that

x = α_1 u_1 + ... + α_k u_k.

Then

p · x = (v − h) · x = v · x − h · x = ∑_{i=1}^k α_i (v · u_i) − ∑_{i=1}^k α_i (v · u_i) = 0.

Thus p ∈ H^⊥, and from here one can see that p = proj_{H^⊥} v. To prove uniqueness, suppose that v = h_1 + p_1 = h_2 + p_2 with h_1, h_2 ∈ H and p_1, p_2 ∈ H^⊥. Then h_1 − h_2 = p_2 − p_1. But h_1 − h_2 ∈ H and p_2 − p_1 ∈ H^⊥, and since H ∩ H^⊥ = {0}, both differences must be 0: h_1 = h_2 and p_1 = p_2.
Definition Let V and W be two vector spaces. A linear transformation T : V → W is a function v ↦ Tv such that if u, w ∈ V and α is a scalar, then T(u + w) = Tu + Tw and T(αu) = αTu.
Theorem 11 Let T : V → W be a linear transformation. Then:
i) ker T is a subspace of V.
ii) image T is a subspace of W.
Proof We prove both items:
i) Let u, v ∈ ker T and α ∈ R. Then T(αu + v) = αTu + Tv = 0. Thus αu + v ∈ ker T.
ii) Let w, x ∈ image T; then w = Tu and x = Tv for some vectors u, v ∈ V. Then, if α ∈ R, we have that T(αu + v) = αw + x, so αw + x ∈ image T.
Theorem 12 Let T : V → W be a linear transformation. Then T is injective if and only if ker T = {0}.
Proof If ker T = {0} and Tv_1 = Tv_2, then Tv_1 − Tv_2 = T(v_1 − v_2) = 0. Then v_1 − v_2 ∈ ker T. Since the only element in ker T is 0, we get v_1 − v_2 = 0 ⇒ v_1 = v_2, which means that T is injective.
Now, if T is injective and v ∈ ker T, then Tv = 0. But we know that always T0 = 0. Since T is injective, it follows that v = 0.
Definition Let T : V → W be a linear transformation. Then T is an isomorphism if T is bijective.
Definition It is said that the vector spaces V and W are isomorphic if there exists an isomorphism T : V → W, and we write V ≅ W.
Definition A linear transformation T : K^n → K^n is an isometry if for every x ∈ K^n: |Tx| = |x|. More generally, if V and W are two vector spaces with inner product, then the linear transformation T : V → W is an isometry if for every v ∈ V: ‖v‖_V = ‖Tv‖_W.
Theorem 13 Let T be an isometry from K^n to K^n. If x, y ∈ K^n then Tx · Ty = x · y.
Proof Note that |Tx − Ty|² = |Tx|² − 2 Tx · Ty + |Ty|². Also |x − y|² = |x|² − 2 x · y + |y|². Since T is linear and an isometry, |Tx − Ty| = |T(x − y)| = |x − y|, |Tx| = |x| and |Ty| = |y|; it follows that 2 Tx · Ty = 2 x · y ⇒ Tx · Ty = x · y. This means that an isometry preserves the inner product.
Theorem 14 Writing the dot product as ⟨·, ·⟩, if A is an m × n matrix, x ∈ K^n and y ∈ K^m, then ⟨Ax, y⟩ = ⟨x, A^T y⟩.

Proof If x, y are column vectors then ⟨Ax, y⟩ = (Ax)^T y. Note that (Ax)^T = x^T A^T. Hence ⟨Ax, y⟩ = x^T A^T y = ⟨x, A^T y⟩.
Theorem 15 A linear transformation T : K^n → K^n is an isometry if and only if the matrix representation of T is orthogonal.
Proof Write T also for its matrix representation. By Theorem 14, ⟨Tx, Ty⟩ = ⟨x, T^T Ty⟩. If T is an isometry then, by Theorem 13, ⟨Tx, Ty⟩ = ⟨x, y⟩ for all x, y. Combining the last two equations, it has to be the case that T^T T = I, so T is orthogonal. Conversely, if T^T T = I, then |Tx|² = ⟨Tx, Tx⟩ = ⟨x, T^T Tx⟩ = ⟨x, x⟩ = |x|², so T is an isometry.
Theorem 16 Let T : R^n → R^n be an isometry. Then T is an isomorphism.
Proof Let {u_1, u_2, ..., u_n} be an orthonormal basis for R^n. By Theorem 13, ⟨Tu_i, Tu_j⟩ = ⟨u_i, u_j⟩ for all i, j, so {Tu_1, Tu_2, ..., Tu_n} is again an orthonormal set, hence an orthonormal basis for R^n; in particular it spans R^n. Thus image T = R^n, so T is surjective. Now, let x ∈ ker T; then Tx = 0. Since |x| = |Tx| = 0, we get x = 0, so T is injective as well.
Definition It is said that two vector spaces V and W over the same set of scalars are isometrically isomorphic if there exists a linear transformation T : V → W that is both an isometry and an isomorphism.
Definition Let A be an n × n matrix. A scalar λ is an eigenvalue of A if there is a vector v ≠ 0 (called an eigenvector associated with λ) such that

Av = λv.    (27)

Note that the last equation implies Av − λv = 0 ⇒ (A − λI)v = 0, which, by Theorem 6, has solutions v ≠ 0 if and only if det(A − λI) = 0. The eigenvalues are the solutions of this last equation, which is called the characteristic equation of A.
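A small eigenvalue computation (a sketch with NumPy; the 2 × 2 example matrix is arbitrary). The characteristic equation det(A − λI) = 0 of the matrix below is λ² − 5λ + 4 = 0, with roots 1 and 4.

import numpy as np

A = np.array([[2.0, 1.0],
              [2.0, 3.0]])

eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)                             # [1. 4.] (in some order)
# Each column v of eigvecs satisfies Av = lambda v:
for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))     # True, True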

Theorem 17 Let A be an n × n matrix with distinct eigenvalues λ_1, λ_2, ..., λ_m whose corresponding eigenvectors are v_1, v_2, ..., v_m. Then these eigenvectors are linearly independent.
Proof We prove this result by mathematical induction. Let us check the case m = 2. Suppose that c_1 v_1 + c_2 v_2 = 0. Our aim is to show that, according to the definition of linear independence, the last equation is only possible when c_1 and c_2 are 0. If we multiply the equation by A:

0 = A(c_1 v_1 + c_2 v_2) = c_1 Av_1 + c_2 Av_2 = c_1 λ_1 v_1 + c_2 λ_2 v_2.    (28)

If we take c_1 v_1 + c_2 v_2 = 0, multiply it by λ_1 and subtract it from Equation (28):

c_2 (λ_2 − λ_1) v_2 = 0.    (29)

Since v_2 ≠ 0 by definition and the eigenvalues are distinct, c_2 must be 0. Plugging this back into c_1 v_1 + c_2 v_2 = 0 gives c_1 = 0 as well (v_1 ≠ 0). Therefore the vectors are linearly independent. Now suppose the theorem is valid for m = k, and consider

c_1 v_1 + c_2 v_2 + ... + c_k v_k + c_{k+1} v_{k+1} = 0    (30)
⇒ c_1 λ_1 v_1 + c_2 λ_2 v_2 + ... + c_k λ_k v_k + c_{k+1} λ_{k+1} v_{k+1} = 0.    (31)

If we multiply the first of the latter equations by λ_{k+1} and subtract the result from the second:

c_1 (λ_1 − λ_{k+1}) v_1 + c_2 (λ_2 − λ_{k+1}) v_2 + ... + c_k (λ_k − λ_{k+1}) v_k = 0.    (32)

Then, since v_1, ..., v_k are independent, c_1 (λ_1 − λ_{k+1}) = c_2 (λ_2 − λ_{k+1}) = ... = c_k (λ_k − λ_{k+1}) = 0. In particular, given that the eigenvalues are distinct, it must be the case that c_1 = c_2 = ... = c_k = 0. But then c_{k+1} v_{k+1} = 0, so c_{k+1} = 0 as well. Hence the theorem is valid for m = k + 1.
Remark Notice that by Theorem 17, the eigenvectors of an n × n matrix with n distinct eigenvalues always span K^n.
Definition It is said that two n × n matrices A and B are similar if there exists an invertible n × n matrix C such that B = C^{-1} A C (or CB = AC).
Theorem 18 Let A and B be two n × n similar matrices. Then A and B have the same characteristic equation, and therefore the same eigenvalues.
Proof Since A and B are similar, there is a C such that B = C^{-1} A C. This means that:

det(B − λI) = det(C^{-1} A C − λI) = det(C^{-1} A C − C^{-1} (λI) C) = det(C^{-1} (A − λI) C)    (33)
            = det(C^{-1}) det(A − λI) det C = (1/det C) det(A − λI) det C = det(A − λI).    (34)
Definition An n × n matrix A is diagonalisable if there is a diagonal matrix D such that A is similar to D.


Theorem 19 Let A be an n × n matrix. Then A is diagonalisable if and only if it has n linearly independent eigenvectors. In that case, the diagonal matrix D similar to A is

D = diag(λ_1, λ_2, ..., λ_n),    (35)

where λ_1, λ_2, ..., λ_n are the eigenvalues of A. In particular, if C is a matrix whose columns are the eigenvectors of A, then D = C^{-1} A C.
Proof Suppose A has n linearly independent eigenvectors v_1, v_2, ..., v_n corresponding to the eigenvalues λ_1, λ_2, ..., λ_n, and write v_i = (c_1i, c_2i, ..., c_ni)^T. Let C = (v_1 v_2 ... v_n). Then C is invertible because its columns are linearly independent. Moreover, the i-th column of AC is Av_i = λ_i v_i. Thus

AC = ( λ_1 c_11  λ_2 c_12  ...  λ_n c_1n )
     ( λ_1 c_21  λ_2 c_22  ...  λ_n c_2n )
     ( ...       ...       ...  ...      ).    (36)
     ( λ_1 c_n1  λ_2 c_n2  ...  λ_n c_nn )

But note that

CD = ( c_11  c_12  ...  c_1n )
     ( c_21  c_22  ...  c_2n )
     ( ...   ...   ...  ...  ) diag(λ_1, λ_2, ..., λ_n)    (37)
     ( c_n1  c_n2  ...  c_nn )

   = ( λ_1 c_11  λ_2 c_12  ...  λ_n c_1n )
     ( λ_1 c_21  λ_2 c_22  ...  λ_n c_2n )
     ( ...       ...       ...  ...      ).    (38)
     ( λ_1 c_n1  λ_2 c_n2  ...  λ_n c_nn )

This means that CD = AC ⇒ D = C^{-1} A C. Now suppose that A is diagonalisable by some invertible matrix C, that is, AC = CD with D = diag(λ_1, ..., λ_n). If v_1, v_2, ..., v_n are the columns of C, then AC = CD implies that Av_i = λ_i v_i for i = 1, 2, ..., n. Then by definition v_1, v_2, ..., v_n are eigenvectors of A, and they are linearly independent because C is invertible.
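Theorem 19 checked numerically (a sketch; the matrix is an arbitrary 2 × 2 example with distinct eigenvalues, and C collects its eigenvectors as columns):

import numpy as np

A = np.array([[2.0, 1.0],
              [2.0, 3.0]])

eigvals, C = np.linalg.eig(A)        # columns of C are eigenvectors of A
D = np.linalg.inv(C) @ A @ C         # D = C^{-1} A C, as in Theorem 19
print(np.allclose(D, np.diag(eigvals)))   # True: D = diag(lambda_1, lambda_2)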
Theorem 20 Let V be a vector space of finite dimension on which the bases B_1 = {v_1, v_2, ..., v_n} and B_2 = {w_1, w_2, ..., w_n} are defined. Let T : V → V be a linear transformation. Let A_T be the matrix representation of T in the basis B_1 and C_T be the matrix representation of T in the basis B_2; then A_T and C_T are similar.
Proof We have that (Tx)_{B_1} = A_T (x)_{B_1} and (Tx)_{B_2} = C_T (x)_{B_2}. Let M be the transition matrix from B_1 to B_2. This means that (x)_{B_2} = M (x)_{B_1} and, likewise, (Tx)_{B_2} = M (Tx)_{B_1}. Combining the last results yields

M (Tx)_{B_1} = C_T M (x)_{B_1} ⇒ (Tx)_{B_1} = M^{-1} C_T M (x)_{B_1} ⇒ A_T (x)_{B_1} = M^{-1} C_T M (x)_{B_1}.    (39)

Since the last equation holds for every x ∈ V, we conclude that A_T = M^{-1} C_T M, which implies that A_T and C_T are similar.
Definition We say that a matrix A is symmetric if A = A^T. It is antisymmetric if A = −A^T.
Theorem 21 Let A be an n × n real symmetric matrix. Then the eigenvalues of A are real.
Proof We have that A = A^T, or equivalently A_ij = A_ji for 1 ≤ i, j ≤ n. Suppose that z ∈ C^n is an eigenvector associated with the eigenvalue λ ∈ C. Then

∑_{j=1}^n A_ij z_j = λ z_i.    (40)

Note that z̄_i z_i = (Re z_i)² + (Im z_i)² ≥ 0. So if we multiply Equation (40) by z̄_i and sum over i:

∑_{i,j=1}^n z̄_i A_ij z_j = λ ∑_{i=1}^n z̄_i z_i.    (41)

The same equation has to be satisfied by the conjugates. Denote by A* the matrix obtained from A by taking the transpose and then the conjugate of each entry, i.e. (A*)_ij = Ā_ji; since A is real and symmetric, A* = A. Conjugating Equation (41), and using Ā_ij = A_ij and z̄_i z_i = z_i z̄_i, gives

∑_{i,j=1}^n z_i A_ij z̄_j = λ̄ ∑_{i=1}^n z_i z̄_i.    (42)

Since i, j are just summation indices, relabelling i ↔ j in the left-hand side of Equation (42) and using A_ji = A_ij shows that

∑_{i,j=1}^n z_i A_ij z̄_j = ∑_{i,j=1}^n z̄_i A_ij z_j,    (43)

i.e. the left-hand sides of Equations (41) and (42) coincide. Comparing the two equations, and noting that ∑_i z̄_i z_i > 0 because z ≠ 0, we see that λ = λ̄, which is possible only if λ ∈ R.

Theorem 22 Let A be an n × n real symmetric matrix. Then the eigenvectors of A can be taken to be real.
Proof Let z ∈ C^n be an eigenvector of A associated to λ ∈ R. Then Az = λz means:

A(Re z + j Im z) = λ(Re z + j Im z) ⇒ A Re z + j A Im z = λ Re z + j λ Im z.    (44)

Hence A Re z = λ Re z and A Im z = λ Im z. Since at least one of Re z, Im z is nonzero, the actual eigenvectors can be taken to be Re z and/or Im z, which are real and not complex.
Definition The Kronecker delta is defined as:

δ_ij = 0 if i ≠ j,
δ_ij = 1 if i = j.    (45)
Theorem 23 The following statements are equivalent:
i) A is an n × n real symmetric matrix.
ii) There exists an orthonormal basis for R^n consisting of eigenvectors of A.
iii) There exists an orthogonal matrix P such that D = P^T A P = diag(λ_1, ..., λ_n) is diagonal (its entries are the eigenvalues of A).
Proof iii) ⇒ i): Suppose there exists an orthogonal matrix P such that D = P^T A P = diag(λ_1, ..., λ_n) is diagonal. Then A = P D P^T, and taking transposes:

A^T = (P D P^T)^T = P D^T P^T = P D P^T = A.    (46)

Then A is symmetric, which proves that iii) ⇒ i).

ii) ⇒ iii): Suppose that v_1, ..., v_n are eigenvectors of A which at the same time form an orthonormal basis for R^n. Let P = (v_1 ... v_n). Note that P e_i = v_i. Now ⟨v_i, v_j⟩ = δ_ij for every i, j. The last dot product can be written as

e_i^T P^T P e_j = δ_ij ⇒ P^T P = I.    (47)

Moreover,

Av_i = λ_i v_i ⇒ A P e_i = λ_i P e_i ⇒ P^T A P e_i = λ_i e_i ⇒ P^T A P = diag(λ_1, ..., λ_n),    (48)

which proves ii) ⇒ iii).

iii) ⇒ ii): Note that (P^T P)_ij = v_i^T v_j = ⟨v_i, v_j⟩ = δ_ij for every i, j, which means that the columns of P are orthonormal, in particular an orthonormal basis for R^n. If there exists an orthogonal matrix P such that D = P^T A P = diag(λ_1, ..., λ_n) is diagonal, then we can read Equation (48) from right to left: the columns of P are eigenvectors of A, which proves that iii) ⇒ ii).

Now we want to prove that i) ⇒ ii), which completes the proof. We proceed by mathematical induction. For the case n = 1: if A is a 1 × 1 real symmetric matrix then, by the fundamental theorem of algebra, A has at least one eigenvalue with a normalised eigenvector (both real because of the previous theorems). This means that there are λ_1, x_1 with Ax_1 = λ_1 x_1. Let P = (x_1); then P^T A P = x_1^T A x_1 = λ_1 x_1^T x_1 = λ_1 = diag(λ_1), and x_1 alone is an orthonormal basis of eigenvectors.

Now suppose that the statement is true for n = k − 1 (we will see what this means later in the proof), and let us prove it for n = k. Again by the fundamental theorem of algebra, the characteristic equation of A has at least one root λ (which is real as well!), i.e. an eigenvalue of A with a corresponding unit eigenvector u. Let V = (span u)^⊥. Then V is a proper subspace of R^k because dim V = k − 1 < dim R^k. Let v ∈ V. From Theorem 14, combined with A = A^T, we know that

0 = λ⟨v, u⟩ = ⟨v, λu⟩ = ⟨v, Au⟩ = ⟨Av, u⟩.    (49)

The last equation shows that Av ∈ V (we say that V is an A-invariant subspace of R^k). This means that A acts as a symmetric operator on V, which has dimension k − 1, so we may apply the induction hypothesis: V has an orthonormal basis {b_1, ..., b_{k−1}} consisting of eigenvectors of A. If we let b_k = u, then {b_1, ..., b_k} is an orthonormal basis of R^k consisting of eigenvectors of A.
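Theorem 23 in action (a sketch; numpy.linalg.eigh is designed for symmetric matrices and returns real eigenvalues together with an orthogonal matrix of eigenvectors; the matrix below is an arbitrary example):

import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])      # real symmetric

eigvals, P = np.linalg.eigh(A)       # columns of P: orthonormal eigenvectors
print(np.allclose(P.T @ P, np.eye(3)))              # True: P is orthogonal
print(np.allclose(P.T @ A @ P, np.diag(eigvals)))   # True: P^T A P = diag(lambda_i)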
Definition A quadratic form over a field K is a homogeneous polynomial (each monomial of the polynomial has the same degree) of degree 2 in n variables with coefficients in K:

q(x_1, ..., x_n) = ∑_{i,j=1}^n A_ij x_i x_j,   A_ij ∈ K,   i.e.   q_A(x) = x^T A x.    (50)

Remark Theorems 23 and 19 are extremely useful, since they tell us that a real symmetric matrix A is diagonalised by a matrix P whose columns are the eigenvectors of A. Moreover, the eigenvectors of A define an orthonormal basis B_E, and P is the transition matrix from B_E to the canonical basis (so P^{-1} = P^T is the transition matrix from the canonical basis to B_E).
Definition Let A and B be two n × n matrices over K. Let v ∈ K^n, v ≠ 0. We say that v is a generalised eigenvector of B with respect to A if there is λ ∈ K such that Bv = λAv, and we say that λ is the generalised eigenvalue associated to v.
Definition A real quadratic form (or its matrix A) is said to be positive definite if q_A(x) > 0 for every x ∈ R^n with x ≠ 0.
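A quadratic form and a positive-definiteness check (a sketch; the symmetric matrix below is an arbitrary example, and the eigenvalue test uses the standard consequence of Theorem 23 that a real symmetric matrix is positive definite exactly when all its eigenvalues are positive):

import numpy as np

A = np.array([[ 2.0, -1.0],
              [-1.0,  2.0]])          # real symmetric

def q(x, A):
    """Quadratic form q_A(x) = x^T A x, Equation (50)."""
    return x @ A @ x

print(q(np.array([1.0, 1.0]), A))            # 2.0 > 0
print(q(np.array([1.0, -1.0]), A))           # 6.0 > 0
# Positive definite <=> all eigenvalues positive (A is symmetric):
print(np.all(np.linalg.eigvalsh(A) > 0))     # True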
Definition If for a matrix A there exists a matrix A^g such that:
i) A A^g A = A,
ii) A^g A A^g = A^g,
then we say that A^g is a generalised and reflexive inverse of A.
Lemma 24 Let B^g be a symmetric and reflexive g-inverse of a symmetric and positive definite matrix B. Then there is a rank factorisation of B^g, B^g = Y Y^T, and there exists a left inverse C of Y such that B = C^T C is a rank factorisation of B.
Proof Since B is a positive definite matrix, by Theorem 23 there exists an orthogonal matrix U such that B = U F U^T, where F is a diagonal matrix (with positive diagonal entries). Let D = F^{1/2} U^T. Then D^T D = U F^{1/2} F^{1/2} U^T = U F U^T = B, which means that D^T D is a rank factorisation of B. Similarly, B^g = (U F U^T)^g = U F^g U^T. Let Y = U (F^g)^{1/2}. Then Y Y^T is a rank factorisation of B^g. Note also that if O is an orthogonal matrix, then (Y O)(Y O)^T = Y O O^T Y^T = B^g, so Y O also gives a rank factorisation of B^g (the rank factorisation is not unique); in what follows we write Y = U (F^g)^{1/2} O.

Now note that Y Y^T D^T D Y Y^T = B^g B B^g = B^g = Y Y^T, by reflexivity. Also note that

Y^T D^T D Y = O^T (F^g)^{1/2} U^T U F^{1/2} F^{1/2} U^T U (F^g)^{1/2} O = O^T (F^g)^{1/2} F (F^g)^{1/2} O.    (51)

But F^g F F^g = F^g and, since F is diagonal with positive entries, F^g = F^{-1}, so (F^g)^{1/2} F (F^g)^{1/2} = I. Therefore Y^T D^T D Y = O^T O = I. Let L = Y^T D^T; then L is orthogonal. Note that C = L D satisfies C Y = L D Y = Y^T D^T D Y = I, as expected because C is a left inverse of Y. And, more importantly:

C^T C = D^T L^T L D = D^T D Y Y^T D^T D = B B^g B = B.    (52)

Theorem 25 Let x^T A x and x^T B x be two quadratic forms, where rank A = rank B = n. If B is positive definite, there exists a matrix L such that L B L^T = I and L A L^T = Λ is diagonal.

Proof From Lemma 24 we know that there exists a matrix C which is the left inverse of the matrix Y = U (F^g)^{1/2} O, where F is a diagonal matrix with the eigenvalues of B, U is an orthogonal matrix whose columns are eigenvectors of B associated with the eigenvalues in F, and O is any orthogonal matrix. This matrix C is such that B = C^T C. Since rank B = n, C is invertible; let D be the inverse of C^T (so D C^T = C^T D = I). The matrix D A D^T is symmetric (A is the symmetric matrix of a quadratic form), so by Theorem 23 there is an orthogonal matrix M such that M D A D^T M^T = Λ = diag(λ_1, ..., λ_n), where det(D A D^T − λ_i I) = 0. Let L = M D; then L A L^T = M D A D^T M^T = Λ and L B L^T = M D C^T C D^T M^T = M M^T = I. More explicitly, L = M D, where M is the orthogonal matrix whose columns are the normalised eigenvectors of D A D^T and D is the inverse of C^T. Choosing O = I gives C = F^{1/2} U^T, hence C^T = U F^{1/2}, D = F^{-1/2} U^T and D A D^T = F^{-1/2} U^T A U F^{-1/2}.
Theorem 26 Let B be an n × n real symmetric positive definite matrix, and A an n × n real symmetric matrix. Then there are n generalised eigenvalues λ_1, ..., λ_n of A with respect to B, with corresponding generalised eigenvectors w_1, ..., w_n, each of which is standardised, i.e. w_i^T B w_i = 1. Let W be the n × n matrix whose columns are the generalised eigenvectors, normalised with respect to B. Then

W^T B W = I,    W^T A W = Λ = diag(λ_1, ..., λ_n).    (53)

Proof First of all, we can obtain the normalised eigenvectors from eigenvectors v_1, ..., v_n of arbitrary norm. Let

w_i = v_i / √⟨v_i, B v_i⟩.    (54)

Then

w_i^T B w_i = ⟨v_i / √⟨v_i, B v_i⟩, B v_i / √⟨v_i, B v_i⟩⟩ = ⟨v_i, B v_i⟩ / ⟨v_i, B v_i⟩ = 1.    (55)

Now, by Theorem 25 we know that there exists a matrix P such that P A P^T = Λ = U_1 is diagonal and P B P^T = I = U_2. Consider the generalised eigenvalue problem U_1 x = λ U_2 x. It is easy to see that the generalised eigenvalues of U_1 with respect to U_2 are the diagonal entries Λ_ii of Λ: indeed, U_1 e_i = Λ_ii e_i = Λ_ii U_2 e_i, so e_i is a generalised eigenvector with generalised eigenvalue Λ_ii. Collecting these n equations (the e_i are the columns of I),

U_1 I = U_2 I Λ ⇒ P A P^T I = P B P^T I Λ.    (56)

Let W = P^T (P is invertible because P B P^T = I with B invertible); therefore Equation (56) yields

A W = B W Λ ⇒ W^T A W = W^T B W Λ.    (57)

The first equation says, column by column, that A w_i = λ_i B w_i, so the λ_i = Λ_ii are the generalised eigenvalues of A with respect to B and the columns of W are the corresponding generalised eigenvectors. Moreover:

P A P^T = Λ = W^T A W    and    P B P^T = I = W^T B W,    (58)

so the columns of W are standardised (W^T B W = I) and W^T A W = Λ.
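Theorem 26 checked numerically (a sketch; the matrices A and B below are arbitrary examples, and scipy.linalg.eigh solves the generalised problem A w = λ B w, returning eigenvectors that satisfy w_i^T B w_i = 1 under its default normalisation, so W^T B W = I and W^T A W = Λ):

import numpy as np
from scipy.linalg import eigh

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])           # real symmetric
B = np.array([[2.0, 0.0],
              [0.0, 1.0]])           # real symmetric positive definite

lam, W = eigh(A, B)                  # generalised problem A w = lambda B w
print(np.allclose(W.T @ B @ W, np.eye(2)))       # True: W^T B W = I
print(np.allclose(W.T @ A @ W, np.diag(lam)))    # True: W^T A W = diag(lambda_i)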
