LECTURES ON LINEAR ALGEBRA
I. M. GELFAND
Academy of Sciences, Moscow, U.S.S.R.
COPYRIGHT © 1961 BY INTERSCIENCE PUBLISHERS, INC.
ALL RIGHTS RESERVED
LIBRARY OF CONGRESS CATALOG CARD NUMBER 61-8630
SECOND PRINTING 1963
PREFACE TO THE SECOND EDITION
The second edition differs from the first in two ways. Some of the
material was substantially revised and new material was added. The
major additions include two appendices at the end of the book dealing
with computational methods in linear algebra and the theory of pertur-
bations, a section on extremal properties of eigenvalues, and a section
on polynomial matrices (§§ 17 and 21). As for major revisions, the
chapter dealing with the Jordan canonical form of a linear transforma-
tion was entirely rewritten and Chapter IV was reworked. Minor
changes and additions were also made. The new text was written in
collaboration with Z. Ja. Shapiro.
I wish to thank A. G. Kurosh for making available his lecture notes
on tensor algebra. I am grateful to S. V. Fomin for a number of valuable
comments. Finally, my thanks go to M. L. Tzeitlin for assistance in
the preparation of the manuscript and for a number of suggestions.
PREFACE TO THE FIRST EDITION
TABLE OF CONTENTS
Preface to the second edition
Preface to the first edition
CHAPTER I
n-DIMENSIONAL SPACES
x = (0, 0, ⋯, 1)
are easily seen to be linearly independent. On the other hand, any
m vectors in R, m > n, are linearly dependent. Indeed, let
y₁ = (η₁₁, η₁₂, ⋯, η₁ₙ),
y₂ = (η₂₁, η₂₂, ⋯, η₂ₙ),
e₁ = (1, 1, 1, ⋯, 1),
e₂ = (0, 1, 1, ⋯, 1),
⋯⋯⋯⋯⋯⋯
eₙ = (0, 0, 0, ⋯, 1),
and then compute the coordinates η₁, η₂, ⋯, ηₙ of the vector
x = (ξ₁, ξ₂, ⋯, ξₙ) relative to the basis e₁, e₂, ⋯, eₙ. By definition
x = η₁e₁ + η₂e₂ + ⋯ + ηₙeₙ;
i.e.,
(ξ₁, ξ₂, ⋯, ξₙ) = η₁(1, 1, ⋯, 1)
+ η₂(0, 1, ⋯, 1)
+ ⋯⋯⋯⋯
+ ηₙ(0, 0, ⋯, 1)
= (η₁, η₁ + η₂, ⋯, η₁ + η₂ + ⋯ + ηₙ).
The numbers (η₁, η₂, ⋯, ηₙ) must satisfy the relations
η₁ = ξ₁,
η₁ + η₂ = ξ₂,
⋯⋯⋯⋯
η₁ + η₂ + ⋯ + ηₙ = ξₙ.
Consequently,
η₁ = ξ₁, η₂ = ξ₂ − ξ₁, ⋯, ηₙ = ξₙ − ξₙ₋₁.
eₙ = (0, 0, ⋯, 1).
Then
x = (ξ₁, ξ₂, ⋯, ξₙ)
= ξ₁(1, 0, ⋯, 0) + ξ₂(0, 1, ⋯, 0) + ⋯ + ξₙ(0, 0, ⋯, 1)
= ξ₁e₁ + ξ₂e₂ + ⋯ + ξₙeₙ.
It follows that in the space R of n-tuples (ξ₁, ξ₂, ⋯, ξₙ) the numbers
A very simple basis in this space is the basis whose elements are
the vectors e₁ = 1, e₂ = t, ⋯, eₙ = tⁿ⁻¹. It is easy to see that the
coordinates of the polynomial P(t) = a₀tⁿ⁻¹ + a₁tⁿ⁻² + ⋯ + aₙ₋₁
in this basis are the coefficients aₙ₋₁, aₙ₋₂, ⋯, a₀.
Let us now select another basis for R:
e′₁ = 1, e′₂ = t − a, e′₃ = (t − a)², ⋯, e′ₙ = (t − a)ⁿ⁻¹.
Expanding P(t) in powers of (t − a) we find that
P(t) = P(a) + P′(a)(t − a) + ⋯ + [P⁽ⁿ⁻¹⁾(a)/(n − 1)!](t − a)ⁿ⁻¹.
Thus the coordinates of P(t) in this basis are
P(a), P′(a), ⋯, P⁽ⁿ⁻¹⁾(a)/(n − 1)!.
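This change of basis is easy to check numerically; the sketch below (helper name and sample polynomial are ours) computes the coordinates P(a), P′(a), ⋯, P⁽ⁿ⁻¹⁾(a)/(n − 1)! directly from the ordinary coefficients:

```python
from math import factorial

def shifted_coords(asc, a):
    """Coordinates of a polynomial relative to 1, (t-a), ..., (t-a)^(n-1).
    `asc` holds ascending coefficients: asc[k] is the coefficient of t^k.
    The k-th coordinate is the k-th derivative at a, divided by k!."""
    n = len(asc)
    return [sum(asc[j] * factorial(j) // factorial(j - k) * a ** (j - k)
                for j in range(k, n)) // factorial(k)
            for k in range(n)]

# P(t) = t^2 + 1 about a = 2: coordinates P(2) = 5, P'(2) = 4, P''(2)/2! = 1
coords = shifted_coords([1, 0, 1], 2)
# re-evaluating the expansion at t = 3 recovers P(3) = 10
value = sum(c * (3 - 2) ** k for k, c in enumerate(coords))
```
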
Isomorphism of n-dimensional vector spaces. In the examples
considered above some of the spaces are identical with others when
it comes to the properties we have investigated so far. One instance
of this type is supplied by the ordinary three-dimensional space R
considered in Example 1 and the space R′ whose elements are
triples of real numbers. Indeed, once a basis has been selected in
R we can associate with a vector in R its coordinates relative to
that basis; i.e., we can associate with a vector in R a vector in R'.
When vectors are added their coordinates are added. When a
vector is multiplied by a scalar all of its coordinates are multiplied
by that scalar. This implies a parallelism between the geometric
properties of R and appropriate properties of R'.
We shall now formulate precisely the notion of "sameness" or of
"isomorphism" of vector spaces.
If we ignore null spaces, then the simplest vector spaces are one-
dimensional vector spaces. A basis of such a space is a single
vector e₁ ≠ 0. Thus a one-dimensional vector space consists of
all vectors αe₁, where α is an arbitrary scalar.
Consider the set of vectors of the form x = x₀ + αe₁, where x₀
and e₁ ≠ 0 are fixed vectors and α ranges over all scalars. It is
natural to call this set of vectors, by analogy with three-
dimensional space, a line in the vector space R.
Then
η₁ = b₁₁ξ₁ + b₁₂ξ₂ + ⋯ + b₁ₙξₙ,
η₂ = b₂₁ξ₁ + b₂₂ξ₂ + ⋯ + b₂ₙξₙ,
⋯⋯⋯⋯
§ 2. Euclidean space
1. Definition of Euclidean space. In the preceding section a
vector space was defined as a collection of elements (vectors) for
which there are defined the operations of addition and multipli-
cation by scalars.
By means of these operations it is possible to define in a vector
space the concepts of line, plane, dimension, parallelism of lines,
etc. However, many concepts of so-called Euclidean geometry
cannot be formulated in terms of addition and multiplication by
scalars. Instances of such concepts are: length of a vector, angles
between vectors, the inner product of vectors. The simplest way
of introducing these concepts is the following.
We take as our fundamental concept the concept of an inner
product of vectors. We define this concept axiomatically. Using
the inner product operation in addition to the operations of addition and multiplication by scalars.
i.e., we put
(5) cos φ = (x, y)/(|x| |y|).
i.e., that the square of the length of the diagonal of a rectangle is equal
to the sum of the squares of the lengths of its two non-parallel sides
(the theorem of Pythagoras).
Proof: By definition of the length of a vector,
|x + y|² = (x + y, x + y).
In view of the distributivity property of inner products (Axiom 3),
(x + y, x + y) = (x, x) + (x, y) + (y, x) + (y, y).
Since x and y are supposed orthogonal,
(x, y) = (y, x) = 0.
Thus
|x + y|² = (x, x) + (y, y) = |x|² + |y|²,
This inequality implies that the polynomial cannot have two dis-
tinct real roots. Consequently, the discriminant of the equation
t²(y, y) + 2t(x, y) + (x, x) = 0
cannot be positive; i.e.,
(x, y)² − (x, x)(y, y) ≤ 0,
which is what we wished to prove.
EXERCISE. Prove that a necessary and sufficient condition for
(x, y)² = (x, x)(y, y) is the linear dependence of the vectors x and y.
EXAMPLES. We have proved the validity of (6) for an axiomatically
defined Euclidean space. It is now appropriate to interpret this inequality
in the various concrete Euclidean spaces in para. 1.
1. In the case of Example 1, inequality (6) tells us nothing new (cf.
the remark preceding the proof of the Schwarz inequality).
(x, y) = ξ₁η₁ + ξ₂η₂ + ⋯ + ξₙηₙ.
It follows that
(x, x) = ξ₁² + ξ₂² + ⋯ + ξₙ², (y, y) = η₁² + η₂² + ⋯ + ηₙ²,
and inequality (6) becomes
(ξ₁η₁ + ⋯ + ξₙηₙ)² ≤ (ξ₁² + ⋯ + ξₙ²)(η₁² + ⋯ + ηₙ²).
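A quick numerical illustration of inequality (6) in this space (the sample vectors are ours):

```python
def inner(x, y):
    # the inner product of Example 2: sum of products of coordinates
    return sum(a * b for a, b in zip(x, y))

x = [1, 2, -1, 3]
y = [2, 0, 1, -1]

lhs = inner(x, y) ** 2            # (x, y)^2
rhs = inner(x, x) * inner(y, y)   # (x, x)(y, y)

# equality holds exactly when the vectors are linearly dependent
z = [2 * t for t in x]
equal = inner(x, z) ** 2 == inner(x, x) * inner(z, z)
```
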
In Example 3 the inner product was defined as
(1) (x, y) = Σᵢ,ₖ₌₁ⁿ aᵢₖξᵢηₖ,
where
(2) aᵢₖ = aₖᵢ
and
(3) Σᵢ,ₖ₌₁ⁿ aᵢₖξᵢξₖ > 0
for any choice of the ξᵢ not all zero. Hence (6) implies that
if the numbers aᵢₖ satisfy conditions (2) and (3), then the following inequality
holds:
(Σᵢ,ₖ₌₁ⁿ aᵢₖξᵢηₖ)² ≤ (Σᵢ,ₖ₌₁ⁿ aᵢₖξᵢξₖ)(Σᵢ,ₖ₌₁ⁿ aᵢₖηᵢηₖ).
EXERCISE. Show that if the numbers aᵢₖ satisfy conditions (2) and (3), then
aᵢₖ² ≤ aᵢᵢaₖₖ. (Hint: Assign suitable values to the numbers ξ₁, ξ₂, ⋯, ξₙ
and η₁, η₂, ⋯, ηₙ in the inequality just derived.)
In Example 4 the inner product was defined by means of the integral
∫ₐᵇ f(t)g(t) dt. Hence (6) takes the form
Proof:
|x + y|² = (x + y, x + y) = (x, x) + 2(x, y) + (y, y).
Since 2(x, y) ≤ 2|x| |y|, it follows that
|x + y|² = (x + y, x + y) ≤ (x, x) + 2|x| |y| + (y, y) = (|x| + |y|)²,
i.e., |x + y| ≤ |x| + |y|, which is the desired conclusion.
EXERCISE. Interpret inequality (7) in each of the concrete Euclidean
spaces considered in the beginning of this section.
In geometry the distance between two points x and y (note the
use of the same symbol to denote a vector (drawn from the origin)
and a point, the tip of that vector) is defined as the length of the
vector x − y. In the general case of an n-dimensional Euclidean
space we define the distance between x and y by the relation
d = |x − y|.
(eᵢ, eₖ) = 1 if i = k,
(eᵢ, eₖ) = 0 if i ≠ k.
For this definition to be correct we must prove that the vectors
e₁, e₂, ⋯, eₙ of the definition actually form a basis, i.e., are
linearly independent.
Thus, let
(2) λ₁e₁ + λ₂e₂ + ⋯ + λₙeₙ = 0.
We wish to show that (2) implies λ₁ = λ₂ = ⋯ = λₙ = 0. To
this end we multiply both sides of (2) by e₁ (i.e., form the inner
product of each side of (2) with e₁). The result is
λ₁(e₁, e₁) + λ₂(e₁, e₂) + ⋯ + λₙ(e₁, eₙ) = 0.
Since the vectors e₁, e₂, ⋯ are pairwise orthogonal, the latter
equalities become:
(fₖ, e₁) + λ₁(e₁, e₁) = 0,
(fₖ, e₂) + λ₂(e₂, e₂) = 0,
⋯⋯⋯⋯
0 = (t + α·1, 1) = ∫₋₁¹ (t + α) dt = 2α,
it follows that α = 0, i.e., e₂ = t. Finally we put e₃ = t² + βt + γ·1.
The orthogonality requirements imply β = 0 and γ = −1/3, i.e., e₃ = t² − 1/3.
Thus 1, t, t² − 1/3 is an orthogonal basis in R. By dividing each basis
vector by its length we obtain an orthonormal basis for R.
Let R be the space of polynomials of degree not exceeding n − 1.
We define the inner product of two vectors in this space as in the preceding
example.
We select as basis the vectors 1, t, ⋯, tⁿ⁻¹. As in Example 2 the process
of orthogonalization leads to the sequence of polynomials
1, t, t² − 1/3, t³ − (3/5)t, ⋯.
Apart from multiplicative constants these polynomials coincide with the
Legendre polynomials
(1/(2ᵏ k!)) dᵏ(t² − 1)ᵏ/dtᵏ.
The Legendre polynomials form an orthogonal, but not orthonormal basis
in R. Multiplying each Legendre polynomial by a suitable constant we
obtain an orthonormal basis in R. We shall denote the kth element of this
basis by Pk(t).
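The orthogonalization just described can be carried out exactly in rational arithmetic; the sketch below (names are ours) reproduces 1, t, t² − 1/3, t³ − (3/5)t from the monomials:

```python
from fractions import Fraction

def ip(p, q):
    """(p, q) = integral of p(t)q(t) over [-1, 1]; polynomials are given
    as ascending coefficient lists of equal length."""
    total = Fraction(0)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if (i + j) % 2 == 0:      # odd powers integrate to zero
                total += Fraction(2, i + j + 1) * pi * qj
    return total

n = 4
mono = [[Fraction(int(i == k)) for i in range(n)] for k in range(n)]  # 1, t, t^2, t^3

ortho = []
for f in mono:
    f = f[:]
    for e in ortho:
        c = ip(f, e) / ip(e, e)       # projection coefficient
        f = [fi - c * ei for fi, ei in zip(f, e)]
    ortho.append(f)
# ortho now holds 1, t, t^2 - 1/3, t^3 - (3/5)t
```
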
it follows that
(x, y) = ξ₁η₁ + ξ₂η₂ + ⋯ + ξₙηₙ.
Thus, the inner product of two vectors relative to an orthonormal basis
is equal to the sum of the products of the corresponding coordinates of
these vectors (cf. Example 2, § 2).
EXERCISES. 1. Show that if f₁, f₂, ⋯, fₙ is an arbitrary basis, then
(x, y) = Σᵢ,ₖ₌₁ⁿ aᵢₖξᵢηₖ,
where aᵢₖ = aₖᵢ and ξ₁, ξ₂, ⋯, ξₙ and η₁, η₂, ⋯, ηₙ are the coordinates of
x and y respectively.
2. Show that if in some basis f₁, f₂, ⋯, fₙ
(x, y) = ξ₁η₁ + ξ₂η₂ + ⋯ + ξₙηₙ
for every x = ξ₁f₁ + ⋯ + ξₙfₙ and y = η₁f₁ + ⋯ + ηₙfₙ, then this
basis is orthonormal.
We shall now find the coordinates of a vector x relative to an
orthonormal basis e₁, e₂, ⋯, eₙ.
Let
x = ξ₁e₁ + ξ₂e₂ + ⋯ + ξₙeₙ.
Multiplying both sides of this equation by e₁ we get
(x, e₁) = ξ₁(e₁, e₁) + ξ₂(e₂, e₁) + ⋯ + ξₙ(eₙ, e₁) = ξ₁,
and, similarly,
(7) ξ₂ = (x, e₂), ⋯, ξₙ = (x, eₙ).
Thus the kth coordinate of a vector relative to an orthonormal basis is
the inner product of this vector and the kth basis vector.
It is natural to call the inner product of a vector x and a vector e
of length 1 the projection of x on e. The result just proved may be
stated as follows: The coordinates of a vector relative to an orthonor-
mal basis are the projections of this vector on the basis vectors. This
is the exact analog of a statement with which we are familiar from
analytic geometry, except that there we speak of projections on
the coordinate axes rather than on the basis vectors.
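A small numerical illustration of formula (7); the orthonormal basis and sample vector below are assumed values of our own:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# an orthonormal basis of three-dimensional space (values assumed)
e = [[1.0, 0.0, 0.0],
     [0.0, 0.6, 0.8],
     [0.0, 0.8, -0.6]]

x = [2.0, 1.0, 3.0]
xi = [dot(x, ek) for ek in e]     # formula (7): xi_k = (x, e_k)

# the projections reassemble the vector
recon = [sum(xi[k] * e[k][i] for k in range(3)) for i in range(3)]
```
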
EXAMPLES. 1. Let P₀(t), P₁(t), ⋯, Pₙ(t) be the normed Legendre
polynomials of degree 0, 1, ⋯, n. Further, let Q(t) be an arbitrary polynomial.
Since
∫₀²π sin² kt dt = ∫₀²π cos² kt dt = π and ∫₀²π dt = 2π,
it follows that the functions
(8′) 1/√(2π), (1/√π) cos t, (1/√π) sin t, ⋯, (1/√π) cos nt, (1/√π) sin nt
are an orthonormal basis for R.
x₁₁c₁ + x₁₂c₂ + ⋯ + x₁ₘcₘ = y₁,
⋯⋯⋯⋯
xₙ₁c₁ + xₙ₂c₂ + ⋯ + xₙₘcₘ = yₙ.
However usually the number n of measurements exceeds the
number m of unknowns and the results of the measurements are
never free from error. Thus, the system (13) is usually incompati-
ble and can be solved only approximately. There arises the problem
of determining c₁, c₂, ⋯, cₘ so that the left sides of the equations
in (13) are as "close" as possible to the corresponding right sides.
As a measure of "closeness" we take the so-called mean
deviation of the left sides of the equations from the corresponding
free terms, i.e., the quantity
which solve this problem are found from the system of equations
(e₁, e₁)c₁ + (e₂, e₁)c₂ + ⋯ + (eₘ, e₁)cₘ = (f, e₁),
(15) (e₁, e₂)c₁ + (e₂, e₂)c₂ + ⋯ + (eₘ, e₂)cₘ = (f, e₂),
⋯⋯⋯⋯
(e₁, eₘ)c₁ + (e₂, eₘ)c₂ + ⋯ + (eₘ, eₘ)cₘ = (f, eₘ).
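For two unknowns the system (15) can be written down and solved directly; the sketch below (sample measurements ours) fits y ≈ c₁ + c₂t by this method:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# overdetermined fit y ~ c1 + c2 * t at four measurement points
t = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 1.9, 3.2, 3.8]
e1 = [1.0, 1.0, 1.0, 1.0]   # "vector" multiplying c1
e2 = t                      # "vector" multiplying c2

# system (15): inner products of e1, e2 against the free term f = y
A = [[dot(e1, e1), dot(e2, e1)],
     [dot(e1, e2), dot(e2, e2)]]
b = [dot(y, e1), dot(y, e2)]

# Cramer's rule is enough for two unknowns
d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
c1 = (b[0] * A[1][1] - b[1] * A[0][1]) / d
c2 = (A[0][0] * b[1] - A[1][0] * b[0]) / d
```
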
c₀ = (1/√(2π)) ∫₀²π f(t) dt; c₂ₖ₋₁ = (1/√π) ∫₀²π f(t) cos kt dt;
c₂ₖ = (1/√π) ∫₀²π f(t) sin kt dt.
Thus, for the mean deviation of the trigonometric polynomial
P(t) = a₀/2 + Σₖ₌₁ⁿ (aₖ cos kt + bₖ sin kt)
from f(t) to be a minimum the coefficients aₖ and bₖ must have the values
a₀ = (1/π) ∫₀²π f(t) dt; aₖ = (1/π) ∫₀²π f(t) cos kt dt;
bₖ = (1/π) ∫₀²π f(t) sin kt dt.
The numbers aₖ and bₖ defined above are called the Fourier coefficients of
the function f(t).
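The Fourier coefficients can be approximated by replacing the integrals with sums; a minimal sketch (function name and test function are ours):

```python
from math import cos, sin, pi

def fourier_coeffs(f, n, m=4096):
    """Approximate a_0, ..., a_n and b_1, ..., b_n by the rectangle rule,
    which is essentially exact for smooth 2*pi-periodic integrands."""
    h = 2 * pi / m
    ts = [i * h for i in range(m)]
    a = [h / pi * sum(f(t) * cos(k * t) for t in ts) for k in range(n + 1)]
    b = [h / pi * sum(f(t) * sin(k * t) for t in ts) for k in range(1, n + 1)]
    return a, b

# f(t) = sin t + 0.5 cos 2t: b_1 = 1, a_2 = 0.5, all other coefficients 0
a, b = fourier_coeffs(lambda t: sin(t) + 0.5 * cos(2 * t), 3)
```
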
Thus
(x′, y′) = (x, y);
i.e., the inner products of corresponding pairs of vectors have
indeed the same value.
This completes the proof of our theorem.
EXERCISE. Prove this theorem by a method analogous to that used in
para. 4, § 1.
does not exceed the sum of the lengths of its two non-parallel sides,
and is therefore valid in every Euclidean space. To illustrate, the
inequality
√(∫ₐᵇ [f(t) + g(t)]² dt) ≤ √(∫ₐᵇ [f(t)]² dt) + √(∫ₐᵇ [g(t)]² dt),
A(x; y) = Σᵢ,ₖ₌₁ⁿ A(eᵢ; eₖ)ξᵢηₖ;
i.e.,
A(x; y) = Σᵢ,ₖ₌₁ⁿ aᵢₖξᵢηₖ.
a₁₁ = 1·1 + 2·1·1 + 3·(−1)·(−1) = 6,
a₁₂ = a₂₁ = 1·1 + 2·1·0 + 3·(−1)·(−1) = 4,
a₁₃ = a₃₁ = 1·1 + 2·1·(−1) + 3·(−1)·(−1) = 2,
and similarly a₂₂ = 4, a₂₃ = a₃₂ = 4, a₃₃ = 6, so that
𝒜 = | 6 4 2 |
    | 4 4 4 |
    | 2 4 6 |.
It follows that if the coordinates of x and y relative to the basis e₁, e₂, e₃
are denoted by ξ′₁, ξ′₂, ξ′₃ and η′₁, η′₂, η′₃, respectively, then
A(x; y) = 6ξ′₁η′₁ + 4ξ′₁η′₂ + 2ξ′₁η′₃ + 4ξ′₂η′₁ + 4ξ′₂η′₂ + 4ξ′₂η′₃
+ 2ξ′₃η′₁ + 4ξ′₃η′₂ + 6ξ′₃η′₃.
𝒞 = | c₁₁ c₁₂ ⋯ c₁ₙ |
    | c₂₁ c₂₂ ⋯ c₂ₙ |
    | ⋯⋯⋯⋯ |
    | cₙ₁ cₙ₂ ⋯ cₙₙ |
is referred to as the matrix of transition from the basis e₁, e₂, ⋯, eₙ
to the basis f₁, f₂, ⋯, fₙ.
Let 𝒜 = ||aᵢₖ|| be the matrix of a bilinear form A(x; y) relative
to the basis e₁, e₂, ⋯, eₙ and ℬ = ||bᵢₖ|| the matrix of that form
relative to the basis f₁, f₂, ⋯, fₙ. Our problem consists in finding
the matrix ||bᵢₖ|| given the matrix ||aᵢₖ||.
By definition [eq. (4)] b_pq = A(f_p; f_q); i.e., b_pq is the value of our
bilinear form for x = f_p, y = f_q. To find this value we make use
of (3), where in place of the ξᵢ and ηᵢ we put the coordinates of f_p
and f_q relative to the basis e₁, e₂, ⋯, eₙ, i.e., the numbers
c_1p, c_2p, ⋯, c_np and c_1q, c_2q, ⋯, c_nq. It follows that
(6) b_pq = A(f_p; f_q) = Σᵢ,ₖ₌₁ⁿ a_ik c_ip c_kq.
We shall now express our result in matrix form. To this end
we put c′_pi = c_ip. The c′_pi are, of course, the elements of the
transpose 𝒞′ of 𝒞. Now b_pq becomes ⁴

⁴ As is well known, the element c_ik of a matrix 𝒞 which is the product of
two matrices 𝒜 = ||a_ik|| and ℬ = ||b_ik|| is defined as
c_ik = Σ_α a_iα b_αk.
Using this definition twice one can show that if 𝒟 = 𝒜ℬ𝒞, then
d_ik = Σ_α,β a_iα b_αβ c_βk.
(7*) b_pq = Σᵢ,ₖ₌₁ⁿ c′_pi a_ik c_kq.
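The rule expressed by (7*), namely ℬ = 𝒞′𝒜𝒞, can be checked on a small example; the matrices below are assumed sample values of our own:

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

# matrix of a bilinear form relative to e1, e2 (an assumed example)
A = [[2, 1],
     [1, 3]]
# transition matrix: the j-th column holds the coordinates of f_j
C = [[1, 1],
     [0, 2]]

B = matmul(matmul(transpose(C), A), C)   # B = C' A C

# direct check: b_pq = A(f_p; f_q) = sum over i, k of a_ik c_ip c_kq
def form(u, v):
    return sum(A[i][k] * u[i] * v[k] for i in range(2) for k in range(2))

f = transpose(C)   # rows are the coordinates of f_1, f_2
direct = [[form(fp, fq) for fq in f] for fp in f]
```
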
A(x; x) = Σᵢ,ₖ₌₁ⁿ aᵢₖξᵢξₖ
(3) A(x; x) = (1/a₁₁)(a₁₁η₁ + a₁₂η₂ + ⋯ + a₁ₙηₙ)² + B.
It is clear that B contains only squares and products of the terms
a₁₂η₂, ⋯, a₁ₙηₙ, so that upon substitution of the right side of (3)
in (2) the quadratic form under consideration becomes
A(x; x) = (1/a₁₁)(a₁₁η₁ + ⋯ + a₁ₙηₙ)² + ⋯,
where the dots stand for a sum of terms in the variables η₂, ⋯, ηₙ.
If we put
η₁* = a₁₁η₁ + a₁₂η₂ + ⋯ + a₁ₙηₙ,
η₂* = η₂,
⋯⋯⋯⋯
ηₙ* = ηₙ,
then our quadratic form goes over into
A(x; x) = (1/a₁₁)η₁*² + Σᵢ,ₖ₌₂ⁿ aᵢₖ*ηᵢ*ηₖ*.
Repeating the process, we put
η₁** = η₁*,
η₂** = a₂₂*η₂* + ⋯ + a₂ₙ*ηₙ*,
ηₖ** = ηₖ* (k = 3, ⋯, n),
and obtain
A(x; x) = (1/a₁₁)η₁**² + (1/a₂₂*)η₂**² + Σᵢ,ₖ₌₃ⁿ aᵢₖ**ηᵢ**ηₖ**.
After a finite number of steps of the type just described our ex-
pression will finally take the form
A(x; x) = λ₁ξ₁² + λ₂ξ₂² + ⋯ + λₘξₘ²,
where m ≤ n.
We leave it as an exercise for the reader to write out the basis
transformation corresponding to each of the coordinate transfor-
mations utilized in the process of reduction of A (x; x) (cf. para. 6,
§ 1) and to see that each change leads from basis to basis, i.e., to
n linearly independent vectors.
If m < n, we put λₘ₊₁ = ⋯ = λₙ = 0. We may now sum up
our conclusions as follows:
THEOREM 1. Let A (x; x) be a quadratic form in an n-dimensional
space R. Then there exists a basis e₁, e₂, ⋯, eₙ of R relative to
which A(x; x) has the form
A(x; x) = λ₁ξ₁² + λ₂ξ₂² + ⋯ + λₙξₙ²,
η₃ = η′₃,
then
A(x; x) = η′₁² + 2η′₁η′₂ + 4η′₁η′₃ + 8η′₂η′₃ + 4η′₃².
Again, if
η₁* = η′₁ + η′₂ + 2η′₃,
η₂* = η′₂,
η₃* = η′₃,
then
A(x; x) = η₁*² − η₂*² + 4η₂*η₃*.
Finally, if
ξ₁ = η₁*,
ξ₂ = η₂* − 2η₃*,
ξ₃ = η₃*,
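A chain of substitutions of this kind is easy to verify numerically. The sketch below uses our reading of the coefficients, which may differ from the original example, and checks that the substitutions carry the form into ξ₁² − ξ₂² + 4ξ₃²:

```python
import random

def A(h1, h2, h3):
    # the quadratic form in the eta' coordinates (coefficients assumed)
    return h1*h1 + 2*h1*h2 + 4*h1*h3 + 8*h2*h3 + 4*h3*h3

random.seed(0)
ok = True
for _ in range(200):
    h1, h2, h3 = (random.randint(-9, 9) for _ in range(3))
    s1, s2, s3 = h1 + h2 + 2*h3, h2, h3      # the starred variables
    x1, x2, x3 = s1, s2 - 2*s3, s3           # the final xi variables
    ok = ok and A(h1, h2, h3) == x1*x1 - x2*x2 + 4*x3*x3
```
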
eₙ = dₙ₁f₁ + dₙ₂f₂ + ⋯ + dₙₙfₙ.
If the form A (x; x) is such that at no stage of the reduction
process is there need to "create squares" or to change the number-
ing of the basis elements (cf. the beginning of the description of
the reduction process in this section), then the expressions for
ξ₁, ξ₂, ⋯, ξₙ in terms of η₁, η₂, ⋯, ηₙ take the form
then
A(eₖ; eᵢ) = A(eₖ; αᵢ₁f₁ + αᵢ₂f₂ + ⋯ + αᵢᵢfᵢ)
= αᵢ₁A(eₖ; f₁) + αᵢ₂A(eₖ; f₂) + ⋯ + αᵢᵢA(eₖ; fᵢ).
Thus if A(eₖ; fᵢ) = 0 for every k and for all i < k, then
A(eₖ; eᵢ) = 0 for i < k and therefore, in view of the symmetry of
the bilinear form, also for i > k; i.e., e₁, e₂, ⋯, eₙ is the required
basis. Our problem then is to find coefficients αₖ₁, αₖ₂, ⋯, αₖₖ
such that the vector
eₖ = αₖ₁f₁ + αₖ₂f₂ + ⋯ + αₖₖfₖ
satisfies the relations
(4) A(eₖ; fᵢ) = 0 (i = 1, 2, ⋯, k − 1).
We assert that conditions (4) determine the vector eₖ to within
a constant multiplier. To fix this multiplier we add the condition
(5) A(eₖ; fₖ) = 1.
We claim that conditions (4) and (5) determine the vector eₖ
Thus
bₖₖ = A(eₖ; eₖ) = Δₖ₋₁/Δₖ.
To sum up:
THEOREM 1. Let A(x; x) be a quadratic form defined relative to
some basis f₁, f₂, ⋯, fₙ by the equation
A(x; x) = Σᵢ,ₖ₌₁ⁿ aᵢₖηᵢηₖ, where aᵢₖ = aₖᵢ,
and let the determinants
Δ₁ = a₁₁, Δ₂ = |a₁₁ a₁₂; a₂₁ a₂₂|, ⋯, Δₙ = |a₁₁ ⋯ a₁ₙ; ⋯; aₙ₁ ⋯ aₙₙ|
be all different from zero. Then there exists a basis e₁, e₂, ⋯, eₙ
relative to which A(x; x) is expressed as a sum of squares,
(8) A(x; x) = (Δ₀/Δ₁)ξ₁² + (Δ₁/Δ₂)ξ₂² + ⋯ + (Δₙ₋₁/Δₙ)ξₙ² (Δ₀ = 1).
Here ξᵢ are the coordinates of x in the basis e₁, e₂, ⋯, eₙ.
i.e., 2α = 1, or α = 1/2, and
e₁ = (1/2)f₁ = (1/2, 0, 0).
Next α₂₁ and α₂₂ are determined from the equations
A(e₂; f₁) = 0 and A(e₂; f₂) = 1,
or
2α₂₁ + (3/2)α₂₂ = 0,
(3/2)α₂₁ + α₂₂ = 1,
whence
α₂₁ = 6, α₂₂ = −8,
and
e₂ = 6f₁ − 8f₂ = (6, −8, 0).
Finally, α₃₁, α₃₂, α₃₃ are determined from the equations
A(e₃; f₁) = 0, A(e₃; f₂) = 0, A(e₃; f₃) = 1,
or
2α₃₁ + (3/2)α₃₂ + 2α₃₃ = 0,
(3/2)α₃₁ + α₃₂ = 0,
2α₃₁ + α₃₃ = 1,
whence
α₃₁ = 8/17, α₃₂ = −12/17, α₃₃ = 1/17,
and
e₃ = (8/17)f₁ − (12/17)f₂ + (1/17)f₃ = (8/17, −12/17, 1/17).
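The computations in this example determine the matrix of the form only implicitly. Assuming the matrix below (our inference from the equations, not stated explicitly in the text), conditions (4) and (5) can be checked exactly:

```python
from fractions import Fraction as F

# matrix of the form relative to f1, f2, f3 (an assumed reconstruction)
a = [[F(2), F(3, 2), F(2)],
     [F(3, 2), F(1), F(0)],
     [F(2), F(0), F(1)]]

def A(u, v):
    return sum(a[i][k] * u[i] * v[k] for i in range(3) for k in range(3))

f = [[F(1), F(0), F(0)], [F(0), F(1), F(0)], [F(0), F(0), F(1)]]
e1 = [F(1, 2), F(0), F(0)]
e2 = [F(6), F(-8), F(0)]
e3 = [F(8, 17), F(-12, 17), F(1, 17)]

# conditions (4) and (5): A(e_k; f_i) = 0 for i < k and A(e_k; f_k) = 1
checks = [A(e1, f[0]) == 1,
          A(e2, f[0]) == 0, A(e2, f[1]) == 1,
          A(e3, f[0]) == 0, A(e3, f[1]) == 0, A(e3, f[2]) == 1]
```
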
It is clear that if Δₖ₋₁ and Δₖ have the same sign then the coefficient
of ξₖ² is positive and that if Δₖ₋₁ and Δₖ have opposite signs, then
this coefficient is negative. Hence,
THEOREM 2. The number of negative coefficients which appear in
the canonical form (8) of a quadratic form is equal to the number of
changes of sign in the sequence
1, Δ₁, Δ₂, ⋯, Δₙ.
Actually all we have shown is how to compute the number of
positive and negative squares for a particular mode of reducing a
quadratic form to a sum of squares. In the next section we shall
show that the number of positive and negative squares is independ-
ent of the method used in reducing the form to a sum of squares.
Assume that Δ₁ > 0, Δ₂ > 0, ⋯, Δₙ > 0. Then there exists a
basis e₁, e₂, ⋯, eₙ in which A(x; x) takes the form
A(x; x) = λ₁ξ₁² + λ₂ξ₂² + ⋯ + λₙξₙ²,
where all the λᵢ are positive. Hence A(x; x) ≥ 0 for all x, and
A(x; x) = Σᵢ₌₁ⁿ λᵢξᵢ² = 0
is equivalent to
ξ₁ = ξ₂ = ⋯ = ξₙ = 0.
In other words,
If Δ₁ > 0, Δ₂ > 0, ⋯, Δₙ > 0, then the quadratic form A(x; x)
is positive definite.
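Both the positive-definiteness criterion and the sign-change count of Theorem 2 can be checked mechanically; the sample matrix below is ours:

```python
def det(m):
    # determinant by cofactor expansion along the first row
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

a = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]
# leading principal minors Delta_1, Delta_2, Delta_3
deltas = [det([row[:k] for row in a[:k]]) for k in range(1, 4)]

# Theorem 2: number of negative squares = sign changes in 1, D1, ..., Dn
seq = [1] + deltas
sign_changes = sum(1 for u, v in zip(seq, seq[1:]) if u * v < 0)
positive_definite = all(d > 0 for d in deltas)
```
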
| (x, x) (x, y) (x, z) |
| (y, x) (y, y) (y, z) |
| (z, x) (z, y) (z, z) |.
Thus the Gram determinant of three vectors x, y, z is the square of the
volume of the parallelepiped on these vectors.
Similarly, it is possible to show that the Gram determinant of k vectors
x, y, ⋯, w in a k-dimensional space R is the square of the determinant
(9) | x₁ x₂ ⋯ xₖ |
    | y₁ y₂ ⋯ yₖ |
    | ⋯⋯⋯ |
    | w₁ w₂ ⋯ wₖ |
where the xᵢ are the coordinates of x in some orthonormal basis, the yᵢ are the
coordinates of y in that basis, etc.
(It is clear that the space R need not be k-dimensional. R may, indeed,
be even infinite-dimensional since our considerations involve only the
subspace generated by the k vectors x, y, , w.)
By analogy with the three-dimensional case, the determinant (9) is
referred to as the volume of the k-dimensional parallelepiped determined by
the vectors x, y, ⋯, w.
3. In the space of functions (Example 4, § 2) the Gram determinant
takes the form
| ∫ₐᵇ f₁²(t) dt       ∫ₐᵇ f₁(t)f₂(t) dt   ⋯  ∫ₐᵇ f₁(t)fₖ(t) dt |
| ∫ₐᵇ f₂(t)f₁(t) dt   ∫ₐᵇ f₂²(t) dt       ⋯  ∫ₐᵇ f₂(t)fₖ(t) dt |
| ⋯⋯⋯ |
| ∫ₐᵇ fₖ(t)f₁(t) dt   ∫ₐᵇ fₖ(t)f₂(t) dt   ⋯  ∫ₐᵇ fₖ²(t) dt |
and the theorem just proved implies that:
The Gram determinant of a system of functions is always ≥ 0. For a
system of functions to be linearly dependent it is necessary and sufficient that
their Gram determinant vanish.
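A numerical illustration (quadrature step and sample functions are ours): the determinant is comfortably positive for independent functions and vanishes, up to rounding, for dependent ones:

```python
from math import sin, cos, pi

def ip(f, g, m=2000):
    # (f, g) = integral of f(t)g(t) over [0, 2*pi], rectangle rule
    h = 2 * pi / m
    return h * sum(f(i * h) * g(i * h) for i in range(m))

def gram_det3(f1, f2, f3):
    g = [[ip(u, v) for v in (f1, f2, f3)] for u in (f1, f2, f3)]
    # 3x3 determinant of the matrix of inner products
    return (g[0][0] * (g[1][1] * g[2][2] - g[1][2] * g[2][1])
            - g[0][1] * (g[1][0] * g[2][2] - g[1][2] * g[2][0])
            + g[0][2] * (g[1][0] * g[2][1] - g[1][1] * g[2][0]))

independent = gram_det3(sin, cos, lambda t: sin(t) ** 2)
dependent = gram_det3(sin, cos, lambda t: 2 * sin(t) - cos(t))
```
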
Δ₁ = a₁₁, Δ₂ = |a₁₁ a₁₂; a₂₁ a₂₂|, ⋯, Δₙ = |a₁₁ ⋯ a₁ₙ; ⋯; aₙ₁ ⋯ aₙₙ|
are different from zero. Then, as was shown in para. 2, § 6, all λᵢ
in formula (1) are different from zero and the number of positive
coefficients obtained after reduction of A (x; x) to a sum of squares
by the method described in that section is equal to the number of
changes of sign in the sequence 1, Δ₁, Δ₂, ⋯, Δₙ.
Now, suppose some other basis e′₁, e′₂, ⋯, e′ₙ were chosen.
Then a certain matrix ||a′ᵢₖ|| would take the place of ||aᵢₖ|| and
certain determinants
Δ′₁, Δ′₂, ⋯, Δ′ₙ
would replace the determinants Δ₁, Δ₂, ⋯, Δₙ. There arises the
question of the connection (if any) between the number of changes
of sign in the sequences 1, Δ′₁, Δ′₂, ⋯, Δ′ₙ and 1, Δ₁, Δ₂, ⋯, Δₙ.
The following theorem, known as the law of inertia of quadratic
forms, answers the question just raised.
THEOREM 1. If a quadratic form is reduced by two different
methods (i.e., in two different bases) to a sum of squares, then the
number of positive coefficients as well as the number of negative
coefficients is the same in both cases.
We shall now try to get a better insight into the space Ro.
If f₁, f₂, ⋯, fₙ is a basis of R, then for a vector
(6) y = η₁f₁ + η₂f₂ + ⋯ + ηₙfₙ
to belong to the null space of A(x; y) it suffices that
(7) A(fᵢ; y) = 0 for i = 1, 2, ⋯, n.
Replacing y in (7) by (6) we obtain the following system of
equations:
A(f₁; η₁f₁ + η₂f₂ + ⋯ + ηₙfₙ) = 0,
A(f₂; η₁f₁ + η₂f₂ + ⋯ + ηₙfₙ) = 0,
⋯⋯⋯⋯
A(fₙ; η₁f₁ + η₂f₂ + ⋯ + ηₙfₙ) = 0.
If we put A(fᵢ; fₖ) = aᵢₖ, the above system goes over into
a₁₁η₁ + a₁₂η₂ + ⋯ + a₁ₙηₙ = 0,
a₂₁η₁ + a₂₂η₂ + ⋯ + a₂ₙηₙ = 0,
⋯⋯⋯⋯
aₙ₁η₁ + aₙ₂η₂ + ⋯ + aₙₙηₙ = 0.
where aᵢₖ are given complex numbers satisfying the following two
conditions:
(a) aᵢₖ = āₖᵢ;
(b) Σᵢ,ₖ₌₁ⁿ aᵢₖξᵢξ̄ₖ ≥ 0 for every n-tuple ξ₁, ξ₂, ⋯, ξₙ and takes on
the value zero only if ξ₁ = ξ₂ = ⋯ = ξₙ = 0.
3. Let R be the set of complex valued functions of a real
variable t defined and integrable on an interval [a, b]. It is easy to
see that R becomes a unitary space if we put
then
(x, eᵢ) = (ξ₁e₁ + ξ₂e₂ + ⋯ + ξₙeₙ, eᵢ) = ξ₁(e₁, eᵢ)
+ ξ₂(e₂, eᵢ) + ⋯ + ξₙ(eₙ, eᵢ),
so that
(x, eᵢ) = ξᵢ.
Using the method of § 3 we prove that all unitary spaces of
dimension n are isomorphic.
4. Bilinear and quadratic forms. With the exception of positive
definiteness all the concepts introduced in § 4 retain meaning for
vector spaces over arbitrary fields and in particular for complex
vector spaces. However, in the case of complex vector spaces
there is another and for us more important way of introducing
these concepts.
Linear functions of the first and second kind. A complex valued
function f defined on a complex space is said to be a linear function
of the first kind if
1. f(x + y) = f(x) + f(y),
2. f(λx) = λf(x),
and a linear function of the second kind if
1. f(x + y) = f(x) + f(y),
2. f(λx) = λ̄f(x).
Using the method of § 4 one can prove that every linear function
of the first kind can be written in the form
f(x) = a₁ξ₁ + a₂ξ₂ + ⋯ + aₙξₙ,
where ξᵢ are the coordinates of the vector x relative to the basis
e₁, e₂, ⋯, eₙ and the aᵢ are constants, aᵢ = f(eᵢ), and that every
linear function of the second kind can be written in the form
f(x) = b₁ξ̄₁ + b₂ξ̄₂ + ⋯ + bₙξ̄ₙ.
DEFINITION 1. We shall say that A(x; y) is a bilinear form
(function) of the vectors x and y if:
for any fixed y, A (x; y) is a linear function of the first kind of x,
for any fixed x, A (x; y) is a linear function of the second kind
of y. In other words,
1. A(x₁ + x₂; y) = A(x₁; y) + A(x₂; y),
A(λx; y) = λA(x; y),
A(x; y) = Σᵢ,ₖ₌₁ⁿ aᵢₖξᵢη̄ₖ
aᵢₖ = āₖᵢ, then the same must be true for the matrix of this form
relative to any other basis. Indeed, aᵢₖ = āₖᵢ relative to some basis
implies that A(x; y) is a Hermitian bilinear form, but then aᵢₖ = āₖᵢ
relative to any other basis.
If a bilinear form is Hermitian, then the associated quadratic
form is also called Hermitian. The following result holds:
For a bilinear form A(x; y) to be Hermitian it is necessary
and sufficient that A(x; x) be real for every vector x.
Proof: Let the form A(x; y) be Hermitian; i.e., let A(x; y)
= A̅(y; x). Then A(x; x) = A̅(x; x), so that the number
A(x; x) is real. Conversely, if A(x; x) is real for all x, then, in
particular, A(x + y; x + y), A(x − y; x − y), A(x + iy; x + iy),
A(x − iy; x − iy) are all real and it is easy to see from formulas
(1) and (2) that A(x; y) = A̅(y; x).
COROLLARY. A quadratic form is Hermitian if and only if it is real
valued.
The proof is a direct consequence of the fact just proved that for
a bilinear form to be Hermitian it is necessary and sufficient that
A(x; x) be real for all x.
One example of a Hermitian quadratic form is the form
A (x; x) = (x, x),
where (x, x) denotes the inner product of x with itself. In fact,
axioms 1 through 3 for the inner product in a complex Euclidean
space say in effect that (x, y) is a Hermitian bilinear form so that
(x, x) is a Hermitian quadratic form.
If, as in § 4, we call a quadratic form A(x; x) positive definite
when
A(x; x) > 0 for x ≠ 0,
then a complex Euclidean space can be defined as a complex
vector space with a positive definite Hermitian quadratic form.
If 𝒜 is the matrix of a bilinear form A(x; y) relative to the
basis e1, e2, ⋯, en and ℬ the matrix of A(x; y) relative to the
basis f1, f2, ⋯, fn, and if fj = Σi cijei (j = 1, ⋯, n), then
ℬ = 𝒞′𝒜𝒞̄,
where 𝒞 = ||cij||, 𝒞′ is the transpose of 𝒞 and 𝒞̄ the matrix with
entries c̄ij.
relative to two bases, then the number of positive, negative and zero
coefficients is the same in both cases.
The proof of this theorem is the same as the proof of the corre-
sponding theorem in § 7.
The concept of rank of a quadratic form introduced in § 7 for real
spaces can be extended without change to complex spaces.
CHAPTER II
Linear Transformations
§ 9. Linear transformations. Operations on linear
transformations
1. Fundamental definitions. In the preceding chapter we stud-
ied functions which associate numbers with points in an n-
dimensional vector space. In many cases, however, it is necessary
to consider functions which associate points of a vector space with
points of that same vector space. The simplest functions of this
type are linear transformations.
DEFINITION 1. If with every vector x of a vector space R there is
associated a (unique) vector y in R, then the mapping y = A(x) is
called a transformation of the space R.
This transformation is said to be linear if the following two condi-
tions hold:
1. A(x1 + x2) = A(x1) + A(x2),
2. A(λx) = λA(x).
Whenever there is no danger of confusion the symbol A (x) is
replaced by the symbol Ax.
EXAMPLES. 1. Consider a rotation of three-dimensional Eucli-
dean space R about an axis through the origin. If x is any vector
in R, then Ax stands for the vector into which x is taken by this
rotation. It is easy to see that conditions 1 and 2 hold for this
mapping. Let us check condition 1, say. The left side of 1 is the
result of first adding x1 and x2 and then rotating the sum. The
right side of 1 is the result of first rotating x1 and x2 and then
adding the results. Clearly, both procedures yield the same vector.
2. Let R' be a plane in the space R (of Example 1) passing
through the origin. We associate with x in R its projection
x' = Ax on the plane R'. It is again easy to see that conditions
1 and 2 hold.
Aek = a1ke1 + a2ke2 + ⋯ + anken.
It is easy to see that the null transformation is always represented
by the matrix all of whose entries are zero.
Let R be the space of polynomials of degree ≦ n − 1. Let A
be the differentiation transformation, i.e.,
AP(t) = P′(t).
We choose the following basis in R:
e1 = 1, e2 = t, e3 = t²/2!, ⋯, en = tⁿ⁻¹/(n − 1)!.
PDF compression, OCR, web optimization using a watermarked evaluation copy of CVISION PDFCompressor
74 LECTURES ON LINEAR ALGEBRA
Then
Ae1 = 1′ = 0, Ae2 = t′ = e1, Ae3 = (t²/2!)′ = t = e2, ⋯,
Aen = (tⁿ⁻¹/(n − 1)!)′ = tⁿ⁻²/(n − 2)! = en−1.
Hence relative to this basis A is represented by the matrix
0 1 0 ⋯ 0
0 0 1 ⋯ 0
⋯ ⋯ ⋯ ⋯ ⋯
0 0 0 ⋯ 1
0 0 0 ⋯ 0
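The action of the differentiation matrix on the coordinates of a polynomial in the basis e1 = 1, e2 = t, ⋯, en = tⁿ⁻¹/(n − 1)! can be checked directly; a small sketch in pure Python (n = 4):

```python
# Matrix of d/dt in the basis e1 = 1, e2 = t, ..., en = t^(n-1)/(n-1)!.
# Since Ae1 = 0 and Aek = e_(k-1), the 1's sit just above the diagonal.
n = 4
A = [[1 if k == i + 1 else 0 for k in range(n)] for i in range(n)]

def apply(M, xs):
    return [sum(M[i][k] * xs[k] for k in range(n)) for i in range(n)]

# P(t) = 1 + t + t^2/2! + t^3/3! has coordinates (1, 1, 1, 1);
# P'(t) = 1 + t + t^2/2! has coordinates (1, 1, 1, 0).
print(apply(A, [1, 1, 1, 1]))   # [1, 1, 1, 0]
```

Applying the matrix shifts the coordinate vector one place, exactly as differentiation shifts the basis.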
Let A be a linear transformation, e1, e2, ⋯, en a basis in R and
||aik|| the matrix which represents A relative to this basis. Let
(4) x = ξ1e1 + ξ2e2 + ⋯ + ξnen,
(4′) Ax = η1e1 + η2e2 + ⋯ + ηnen.
We wish to express the coordinates ηi of Ax by means of the coor-
dinates ξi of x. Now
Ax = A(ξ1e1 + ξ2e2 + ⋯ + ξnen)
= ξ1(a11e1 + a21e2 + ⋯ + an1en)
+ ξ2(a12e1 + a22e2 + ⋯ + an2en) + ⋯
so that
cik = aik + bik.
The matrix ||aik + bik|| is called the sum of the matrices ||aik|| and
||bik||. Thus the matrix of the sum of two linear transformations is the
sum of the matrices associated with the summands.
Addition and multiplication of linear transformations have
some of the properties usually associated with these operations.
Thus
1. A + B = B + A;
2. (A + B) + C = A + (B + C);
3. A(BC) = (AB)C;
4. (A + B)C = AC + BC,
   C(A + B) = CA + CB.
We could easily prove these equalities directly but this is unnec-
essary. We recall that we have established the existence of a
one-to-one correspondence between linear transformations and
matrices which preserves sums and products. Since properties
1 through 4 are proved for matrices in a course in algebra, the iso-
morphism between matrices and linear transformations just
mentioned allows us to claim the validity of 1 through 4 for linear
transformations.
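Property 4, for instance, can be spot-checked on matrices (the entries below are arbitrary):

```python
# Spot-check of property 4, (A + B)C = AC + BC, on 2x2 matrices
# (entries chosen arbitrarily).
def madd(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def mmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]
B = [[0, -1], [5, 2]]
C = [[2, 0], [1, 1]]

print(mmul(madd(A, B), C) == madd(mmul(A, C), mmul(B, C)))   # True
```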
We now define the product of a number λ and a linear transfor-
mation A. Thus by λA we mean the transformation which associ-
ates with every vector x the vector λ(Ax). It is clear that if A is
represented by the matrix ||aik||, then λA is represented by the
matrix ||λaik||.
If P(t) = a0tᵐ + a1tᵐ⁻¹ + ⋯ + am is an arbitrary polynomial
and A is a transformation, we define the symbol P(A) by the
equation
P(A) = a0Aᵐ + a1Aᵐ⁻¹ + ⋯ + amE.
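A minimal sketch of evaluating P(A) for a matrix, using Horner's scheme (the polynomial and matrix are arbitrary illustrative choices):

```python
# Evaluating P(A) = a0*A^m + a1*A^(m-1) + ... + am*E by Horner's scheme.
# Polynomial and matrix below are arbitrary illustrative choices.
def mmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, A):
    # coeffs = [a0, a1, ..., am] for P(t) = a0*t^m + ... + am
    n = len(A)
    R = [[coeffs[0] if i == j else 0 for j in range(n)] for i in range(n)]
    for c in coeffs[1:]:
        R = mmul(R, A)
        for i in range(n):
            R[i][i] += c
    return R

A = [[2, 1], [0, 3]]
# P(t) = t^2 - 3t + 2 = (t - 1)(t - 2); 2 is an eigenvalue of A,
# so P(A) comes out singular.
print(poly_of_matrix([1, -3, 2], A))   # [[0, 2], [0, 2]]
```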
EXAMPLE. Consider the space R of functions defined and
infinitely differentiable on an interval (a, b). Let D be the linear
mapping defined on R by the equation
Df(t) = f′(t).
For a diagonal matrix
𝒜 =
λ1 0 ⋯ 0
0 λ2 ⋯ 0
⋯ ⋯ ⋯ ⋯
0 0 ⋯ λn
the power 𝒜ᵏ is again diagonal, with entries λ1ᵏ, λ2ᵏ, ⋯, λnᵏ;
it follows that
P(𝒜) =
P(λ1) 0 ⋯ 0
0 P(λ2) ⋯ 0
⋯ ⋯ ⋯ ⋯
0 0 ⋯ P(λn).
It is possible to give reasonable definitions not only for a
polynomial in a matrix 𝒜 but also for any function of a matrix 𝒜
such as exp 𝒜, sin 𝒜, etc.
As was already mentioned in § 1, Example 5, all matrices of
order n with the usual definitions of addition and multiplication
by a scalar form a vector space of dimension n². Hence any
n² + 1 matrices are linearly dependent. Now consider the
following set of powers of some matrix 𝒜:
(10′) Aek = a1ke1 + a2ke2 + ⋯ + anken,
(10″) Afk = b1kf1 + b2kf2 + ⋯ + bnkfn.
We wish to express the matrix ℬ in terms of the matrices 𝒜 and 𝒞.
To this end we rewrite (10″) as
ACek = b1kCe1 + b2kCe2 + ⋯ + bnkCen,
or
C⁻¹ACek = b1ke1 + b2ke2 + ⋯ + bnken.
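The computation begun here leads to the relation ℬ = 𝒞⁻¹𝒜𝒞 between the two matrices. A numerical sketch (the matrices are arbitrary; the columns of C are chosen to be eigenvectors of A, so B comes out diagonal):

```python
# Change of basis for a linear transformation: B = C^(-1) A C.
# Entries are illustrative; the columns of C are eigenvectors of A.
def mmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(C):
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    return [[ C[1][1] / det, -C[0][1] / det],
            [-C[1][0] / det,  C[0][0] / det]]

A = [[4.0, 1.0], [2.0, 3.0]]   # matrix of the transformation in e1, e2
C = [[1.0, 1.0], [1.0, -2.0]]  # columns: coordinates of f1, f2
B = mmul(inv2(C), mmul(A, C))
print(B)   # approximately [[5, 0], [0, 2]]
```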
a11 ⋯ a1k a1,k+1 ⋯ a1n
⋯ ⋯ ⋯ ⋯ ⋯
ak1 ⋯ akk ak,k+1 ⋯ akn
0 ⋯ 0 ak+1,k+1 ⋯ ak+1,n
⋯ ⋯ ⋯ ⋯ ⋯
0 ⋯ 0 an,k+1 ⋯ ann
In this case the subspace generated by the vectors e1, e2, ⋯, ek is
invariant under A. The proof is left to the reader. If also
aij = 0 (i ≦ k; j > k),
then the subspace generated by ek+1, ek+2, ⋯, en would also be
invariant under A.
2. Eigenvectors and eigenvalues. In the sequel one-dimensional
invariant subspaces will play a special role.
Let R1 be a one-dimensional subspace generated by some vector
x ≠ 0. Then R1 consists of all vectors of the form αx. It is clear
that for R1 to be invariant it is necessary and sufficient that the
vector Ax be in R1, i.e., that
Ax = λx.
DEFINITION 2. A vector x ≠ 0 satisfying the relation Ax = λx
is called an eigenvector of A. The number λ is called an eigenvalue
of A.
Thus if x is an eigenvector, then the vectors αx form a one-
dimensional invariant subspace.
Conversely, all non-zero vectors of a one-dimensional invariant
subspace are eigenvectors.
THEOREM 1. If A is a linear transformation on a complex¹ space
R, then A has at least one eigenvector.
Proof: Let e1, e2, ⋯, en be a basis in R. Relative to this basis A
is represented by some matrix ||aik||. Let
x = ξ1e1 + ξ2e2 + ⋯ + ξnen
be any vector in R. Then the coordinates η1, η2, ⋯, ηn of the
vector Ax are given by
¹ The proof holds for a vector space over any algebraically closed field
since it makes use only of the fact that equation (2) has a solution.
η1 = a11ξ1 + a12ξ2 + ⋯ + a1nξn,
η2 = a21ξ1 + a22ξ2 + ⋯ + a2nξn,
⋯ ⋯ ⋯ ⋯
ηn = an1ξ1 + an2ξ2 + ⋯ + annξn
(Cf. para. 3 of § 9).
The equation
Ax = λx,
which expresses the condition for x to be an eigenvector, is equiv-
alent to the system of equations:
a11ξ1 + a12ξ2 + ⋯ + a1nξn = λξ1,
a21ξ1 + a22ξ2 + ⋯ + a2nξn = λξ2,
⋯ ⋯ ⋯ ⋯
an1ξ1 + an2ξ2 + ⋯ + annξn = λξn,
or
(a11 − λ)ξ1 + a12ξ2 + ⋯ + a1nξn = 0,
a21ξ1 + (a22 − λ)ξ2 + ⋯ + a2nξn = 0,
⋯ ⋯ ⋯ ⋯
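This homogeneous system has a nontrivial solution only when its determinant vanishes; for a 2×2 matrix that condition is a quadratic in λ. A sketch (matrix chosen arbitrarily):

```python
import math

# det(A - lam*E) = 0 for a 2x2 matrix is the quadratic
# lam^2 - (a11 + a22)*lam + (a11*a22 - a12*a21) = 0.
a11, a12, a21, a22 = 2.0, 1.0, 1.0, 2.0    # arbitrary entries
tr = a11 + a22
det = a11 * a22 - a12 * a21
disc = math.sqrt(tr * tr - 4 * det)
lam1 = (tr + disc) / 2
lam2 = (tr - disc) / 2
print(lam1, lam2)   # 3.0 1.0

# xi = (1, 1) is a nontrivial solution of the system for lam1 = 3:
xi = (1.0, 1.0)
print((a11 - lam1) * xi[0] + a12 * xi[1],
      a21 * xi[0] + (a22 - lam1) * xi[1])   # 0.0 0.0
```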
basis by a diagonal matrix, then the vectors of this basis are eigen-
vectors of A.
NOTE: There is one important case in which a linear transforma-
tion is certain to have n linearly independent eigenvectors. We
lead up to this case by observing that
If e1, e2, ⋯, ek are eigenvectors of a transformation A and the
corresponding eigenvalues λ1, λ2, ⋯, λk are distinct, then e1, e2, ⋯,
ek are linearly independent.
For k = 1 this assertion is obviously true. We assume its
validity for k − 1 vectors and prove it for the case of k vectors.
If our assertion were false in the case of k vectors, then there
would exist k numbers α1, α2, ⋯, αk with α1 ≠ 0, say, such that
(3) α1e1 + α2e2 + ⋯ + αkek = 0.
Applying A to both sides of equation (3) we get
A(α1e1 + α2e2 + ⋯ + αkek) = 0,
or
a1 a2 ⋯ an−1 an
1 0 ⋯ 0 0
0 1 ⋯ 0 0
⋯ ⋯ ⋯ ⋯ ⋯
0 0 ⋯ 1 0
Solution: (−1)ⁿ(λⁿ − a1λⁿ⁻¹ − a2λⁿ⁻² − ⋯ − an).
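The stated solution is easy to verify for n = 2, where the matrix of the exercise reduces to [[a1, a2], [1, 0]]; a sketch with the arbitrary values a1 = 5, a2 = −6:

```python
# For n = 2 the matrix of the exercise is [[a1, a2], [1, 0]], and
# det(A - lam*E) = (a1 - lam)(-lam) - a2 = lam^2 - a1*lam - a2,
# i.e. (-1)^2 (lam^2 - a1*lam - a2), matching the stated solution.
a1, a2 = 5.0, -6.0   # arbitrary; lam^2 - 5*lam + 6 has roots 2 and 3

def char_poly(lam):
    return (a1 - lam) * (-lam) - a2

print(char_poly(2.0), char_poly(3.0))   # 0.0 0.0
```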
We shall now find an explicit expression for the characteristic
polynomial in terms of the entries in some representation 𝒜 of A.
If we multiply the first of these equations on the left by E, the second by 𝒜,
the third by 𝒜², ⋯, the last by 𝒜ᵐ and add the resulting equations, we get
0 on the left, and P(𝒜) = amE + am−1𝒜 + ⋯ + a0𝒜ᵐ on the right.
Thus P(𝒜) = 0 and our lemma is proved.
THEOREM 3. If P(λ) is the characteristic polynomial of 𝒜, then P(𝒜) = 0.
Proof: Consider the inverse of the matrix 𝒜 − λℰ. We have
(𝒜 − λℰ)(𝒜 − λℰ)⁻¹ = ℰ. As is well known, the inverse matrix can be
written in the form
(𝒜 − λℰ)⁻¹ = (1/P(λ)) 𝒞(λ),
where 𝒞(λ) is the matrix of the cofactors of the elements of 𝒜 − λℰ and
P(λ) the determinant of 𝒜 − λℰ, i.e., the characteristic polynomial of 𝒜.
Hence
(𝒜 − λℰ)𝒞(λ) = P(λ)ℰ.
Since the elements of 𝒞(λ) are polynomials of degree ≦ n − 1 in λ, we
conclude on the basis of our lemma that
P(𝒜) = 0.
This completes the proof.
We note that if the characteristic polynomial of the matrix 𝒜 has no
multiple roots, then there exists no polynomial Q(λ) of degree less than n
such that Q(𝒜) = 0 (cf. the exercise below).
EXERCISE. Let 𝒜 be a diagonal matrix
𝒜 =
λ1 0 ⋯ 0
0 λ2 ⋯ 0
⋯ ⋯ ⋯ ⋯
0 0 ⋯ λn
where all the λi are distinct. Find a polynomial P(t) of lowest degree for
which P(𝒜) = 0 (cf. para. 3, § 9).
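Theorem 3 can be verified concretely for a 2×2 matrix, where P(λ) = λ² − (a11 + a22)λ + Det 𝒜; a sketch:

```python
# Cayley-Hamilton for an arbitrary 2x2 matrix:
# P(lam) = lam^2 - (a11 + a22)*lam + Det A, and P(A) = 0.
A = [[1, 2], [3, 4]]
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]

def mmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A2 = mmul(A, A)
P_of_A = [[A2[i][j] - tr * A[i][j] + (det if i == j else 0)
           for j in range(2)] for i in range(2)]
print(P_of_A)   # [[0, 0], [0, 0]]
```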
⋯ + an1ξnη̄1 + an2ξnη̄2 + ⋯ + annξnη̄n.
We shall now try to represent the above expression as an inner
product. To this end we rewrite it as follows:
A(x; y) = (a11ξ1 + a21ξ2 + ⋯ + an1ξn)η̄1
+ (a12ξ1 + a22ξ2 + ⋯ + an2ξn)η̄2
+ ⋯ + (a1nξ1 + a2nξ2 + ⋯ + annξn)η̄n.
Now we introduce the vector z with coordinates
ζ1 = a11ξ1 + a21ξ2 + ⋯ + an1ξn,
ζ2 = a12ξ1 + a22ξ2 + ⋯ + an2ξn,
⋯ ⋯ ⋯ ⋯
ζn = a1nξ1 + a2nξ2 + ⋯ + annξn.
It is clear that z is obtained by applying to x a linear transforma-
tion whose matrix is the transpose of the matrix ||aik|| of the
bilinear form A(x; y). We shall denote this linear transformation
4 Relative to a given basis both linear transformations and bilinear forms
are given by matrices. One could therefore try to associate with a given
linear transformation the bilinear form determined by the same matrix as
the transformation in question. However, such correspondence would be
without significance. In fact, if a linear transformation and a bilinear form
are represented relative to some basis by a matrix 𝒜, then, upon change of
basis, the linear transformation is represented by 𝒞⁻¹𝒜𝒞 (cf. § 9) and the
bilinear form is represented by 𝒞′𝒜𝒞 (cf. § 4). Here 𝒞′ is the transpose of 𝒞.
The careful reader will notice that the correspondence between bilinear
forms and linear transformations in Euclidean space considered below
associates bilinear forms and linear transformations whose matrices relative
to an orthonormal basis are transposes of one another. This correspondence
is shown to be independent of the choice of basis.
(Bx, y) = (Ax, y)
for all y. But this means that Ax − Bx = 0 for all x. Hence
Ax = Bx for all x, which is the same as saying that A = B. This
proves the uniqueness assertion.
We can now sum up our results in the following
THEOREM 1. The equation
(2) A (x; y) = (Ax, y)
establishes a one-to-one correspondence between bilinear forms and
linear transformations on a Euclidean vector space.
⋯ + ξn(ān1η1 + ān2η2 + ⋯ + ānnηn)
= ξ1(ā11η1 + ā12η2 + ⋯ + ā1nηn)
+ ξ2(ā21η1 + ā22η2 + ⋯ + ā2nηn) + ⋯
or
λ(x, x) = λ̄(x, x).
Since (x, x) ≠ 0, it follows that λ = λ̄, which proves that λ is real.
LEMMA 2. Let A be a self-adjoint transformation on an n-dimen-
sional Euclidean vector space R and let e be an eigenvector of A.
The totality R1 of vectors x orthogonal to e form an (n − 1)-dimen-
sional subspace invariant under A.
Proof: The totality R1 of vectors x orthogonal to e form an
(n − 1)-dimensional subspace of R.
We show that R1 is invariant under A. Let x ∈ R1. This means
that (x, e) = 0. We have to show that Ax ∈ R1, that is, (Ax, e)
= 0. Indeed,
(Ax, e) = (x, A*e) = (x, Ae) = (x, λe) = λ(x, e) = 0.
THEOREM 1. Let A be a self-adjoint transformation on an n-
dimensional Euclidean space. Then there exist n pairwise orthogonal
eigenvectors of A. The corresponding eigenvalues of A are all real.
Proof: According to Theorem 1, § 10, there exists at least one
eigenvector e1 of A. By Lemma 2, the totality of vectors orthogo-
nal to e1 form an (n − 1)-dimensional invariant subspace R1.
We now consider our transformation A on R1 only. In R1 there
exists a vector e2 which is an eigenvector of A (cf. note to Theorem
1, § 10). The totality of vectors of R1 orthogonal to e2 form an
(n − 2)-dimensional invariant subspace R2. In R2 there exists an
eigenvector e3 of A, etc.
In this manner we obtain n pairwise orthogonal eigenvectors
e1, e2, , en. By Lemma 1, the corresponding eigenvalues are
real. This proves Theorem 1.
Since the product of an eigenvector by any non-zero number is
again an eigenvector, we can select the vectors ei so that each of
them is of length one.
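For a real symmetric 2×2 matrix both conclusions of Theorem 1 — real eigenvalues and orthogonal eigenvectors — can be exhibited in closed form; a sketch (entries arbitrary, b ≠ 0):

```python
import math

# Self-adjoint 2x2 matrix [[a, b], [b, c]]: the discriminant
# (a - c)^2 + 4b^2 is non-negative, so both eigenvalues are real.
a, b, c = 2.0, 1.0, 2.0              # arbitrary entries, b != 0
disc = math.sqrt((a - c) ** 2 + 4 * b * b)
lam1 = (a + c + disc) / 2
lam2 = (a + c - disc) / 2

# For b != 0, (b, lam - a) is an eigenvector for eigenvalue lam;
# the two eigenvectors are orthogonal.
e1 = (b, lam1 - a)
e2 = (b, lam2 - a)
print(lam1, lam2)                       # 3.0 1.0
print(e1[0] * e2[0] + e1[1] * e2[1])    # 0.0
```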
THEOREM 2. Let A be a linear transformation on an n-dimensional
Euclidean space R. For A to be self-adjoint it is necessary and
sufficient that there exists an orthogonal basis relative to which the
matrix of A is diagonal and real.
Necessity: Let A be self-adjoint. Select in R a basis consisting of
to the 28th power. Hint: Bring the matrix to its diagonal form, raise it to
the proper power, and then revert to the original basis.
2. Reduction to principal axes. Simultaneous reduction of a pair
of quadratic forms to a sum of squares. We now apply the results
obtained in para. 1 to quadratic forms.
We know that we can associate with each Hermitian bilinear
form a self-adjoint transformation. Theorem 2 permits us now to
state the important
THEOREM 3. Let A (x; y) be a Hermitian bilinear form defined on
an n-dimensional Euclidean space R. Then there exists an orthonor-
mal basis in R relative to which the corresponding quadratic form can
be written as a sum of squares,
A(x; x) = λ1|ξ1|² + λ2|ξ2|² + ⋯ + λn|ξn|²,
where the λi are real, and the ξi are the coordinates of the vector
x.⁶
Proof: Let A(x; y) be a Hermitian bilinear form, i.e.,
A(x; y) = A̅(y; x).
⁶ We have shown in § 8 that in any vector space a Hermitian quadratic
form can be written in an appropriate basis as a sum of squares. In the case
of a Euclidean space we can state a stronger result, namely, we can assert
the existence of an orthonormal basis relative to which a given Hermitian
quadratic form can be reduced to a sum of squares.
Since
(ei, ek) = 1 for i = k, 0 for i ≠ k,
we get
A(x; y) = (Ax, y)
= (ξ1Ae1 + ξ2Ae2 + ⋯ + ξnAen, η1e1 + η2e2 + ⋯ + ηnen)
= (λ1ξ1e1 + λ2ξ2e2 + ⋯ + λnξnen, η1e1 + η2e2 + ⋯ + ηnen)
= λ1ξ1η̄1 + λ2ξ2η̄2 + ⋯ + λnξnη̄n.
In particular,
A(x; x) = (Ax, x) = λ1|ξ1|² + λ2|ξ2|² + ⋯ + λn|ξn|².
This proves the theorem.
The process of finding an orthonormal basis in a Euclidean
space relative to which a given quadratic form can be represented
as a sum of squares is called reduction to principal axes.
THEOREM 4. Let A (x; x) and B(x; x) be two Hermitian quadratic
forms on an n-dimensional vector space R and assume B(x; x) to be
positive definite. Then there exists a basis in R relative to which
each form can be written as a sum of squares.
Proof: We introduce in R an inner product by putting (x, y)
= B(x; y), where B(x; y) is the bilinear form corresponding to
B(x; x). This can be done since the axioms for an inner product
state that (x, y) is a Hermitian bilinear form corresponding to a
positive definite quadratic form (§ 8). With the introduction of an
inner product our space R becomes a Euclidean vector space. By
Consequently,
(4) Det (𝒜 − λℬ) = (λ1 − λ)(λ2 − λ) ⋯ (λn − λ).
Under a change of basis the matrices of the Hermitian quadratic
forms A and B go over into the matrices 𝒜1 = 𝒞′𝒜𝒞̄ and
ℬ1 = 𝒞′ℬ𝒞̄. Hence, if e1, e2, ⋯, en is an arbitrary basis, then
with respect to this basis
Det (𝒜1 − λℬ1) = Det 𝒞′ · Det (𝒜 − λℬ) · Det 𝒞̄,
i.e., Det (𝒜1 − λℬ1) differs from (4) by a multiplicative constant.
It follows that the numbers λ1, λ2, ⋯, λn are the roots of the equation
a11 − λb11  a12 − λb12  ⋯  a1n − λb1n
a21 − λb21  a22 − λb22  ⋯  a2n − λb2n
⋯ ⋯ ⋯ ⋯
a11 a12 ⋯ a1n
a21 a22 ⋯ a2n
⋯ ⋯ ⋯ ⋯
an1 an2 ⋯ ann
be the matrix of the transformation U relative to this basis. Then
the matrix of the adjoint U* relative to the same basis is
ā11 ā21 ⋯ ān1
ā12 ā22 ⋯ ān2
⋯ ⋯ ⋯ ⋯
Then
(x, x) = (Ux, Ux) = (λx, λx) = λλ̄(x, x),
that is, λλ̄ = 1, or |λ| = 1.
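A concrete instance: the rotation matrix [[cos φ, −sin φ], [sin φ, cos φ]] is unitary, and its eigenvalues cos φ ± i sin φ indeed have modulus 1. A sketch:

```python
import cmath
import math

# The rotation matrix [[cos p, -sin p], [sin p, cos p]] is unitary;
# its eigenvalues are cos p +/- i*sin p, of modulus 1 (p is arbitrary).
p = 0.7
tr = 2 * math.cos(p)                 # trace of the matrix
det = 1.0                            # its determinant
disc = cmath.sqrt(tr * tr - 4 * det)
lam1 = (tr + disc) / 2
lam2 = (tr - disc) / 2
print(abs(lam1), abs(lam2))          # both ~1.0
```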
LEMMA 2. Let U be a unitary transformation on an n-dimensional
space R and e its eigenvector, i.e.,
Ue = λe, e ≠ 0.
Then the (n − 1)-dimensional subspace R1 of R consisting of all
vectors x orthogonal to e is invariant under U.
so that
AA* = H².
Consequently, in order to find H one has to "extract the square
root" of AA*. Having found H, we put U = H⁻¹A.
Before proving Theorem 1 we establish three lemmas.
LEMMA 1. Given any linear transformation A, the transformation
AA* is positive definite. If A is non-singular then so is AA*.
Proof: The transformation AA* is positive definite. Indeed,
(AA*)* = A**A* = AA*,
that is, AA* is self-adjoint. Furthermore,
(AA*x, x) = (A*x, A*x) ≥ 0,
NOTE: It is clear from equality (1) that if all the λi are positive
then the transformation B is non-singular and, conversely, if B is
positive definite and non-singular then the λi are positive.
LEMMA 3. Given any positive definite transformation B, there
exists a positive definite transformation H such that H² = B (in
this case we write H = B^{1/2}). In addition, if B is non-singular
then H is non-singular.
Proof: We select in R an orthogonal basis relative to which B is
of the form
B =
λ1 0 ⋯ 0
0 λ2 ⋯ 0
⋯ ⋯ ⋯ ⋯
0 0 ⋯ λn
where λ1, λ2, ⋯, λn are the eigenvalues of B. By Lemma 2 all
λi ≥ 0. Put
H =
√λ1 0 ⋯ 0
0 √λ2 ⋯ 0
⋯ ⋯ ⋯ ⋯
0 0 ⋯ √λn
Applying Lemma 2 again we conclude that H is positive definite.
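In the basis that diagonalizes B the square root is computed entrywise; a minimal sketch (eigenvalues chosen arbitrarily):

```python
import math

# Square root of a positive definite B, in a basis where B is diagonal:
# H = diag(sqrt(lam_i)) satisfies H^2 = B. Eigenvalues are arbitrary.
lams = [4.0, 9.0, 0.25]
n = len(lams)
B = [[lams[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
H = [[math.sqrt(lams[i]) if i == j else 0.0 for j in range(n)]
     for i in range(n)]

H2 = [[sum(H[i][k] * H[k][j] for k in range(n)) for j in range(n)]
      for i in range(n)]
print(H2 == B)   # True
```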
β ≠ 0. Let
ζ1 = ξ1 + iη1, ζ2 = ξ2 + iη2, ⋯, ζn = ξn + iηn
be a solution of (1). Replacing ζ1, ζ2, ⋯, ζn in (1) by these
numbers and separating the real and imaginary parts we get
a11ξ1 + a12ξ2 + ⋯ + a1nξn = αξ1 − βη1,
(2) a21ξ1 + a22ξ2 + ⋯ + a2nξn = αξ2 − βη2,
⋯ ⋯ ⋯ ⋯
an1ξ1 + an2ξ2 + ⋯ + annξn = αξn − βηn,
and
a11η1 + a12η2 + ⋯ + a1nηn = αη1 + βξ1,
(2)′ a21η1 + a22η2 + ⋯ + a2nηn = αη2 + βξ2,
A(x; y) = Σ aikξiηk (i, k = 1, 2, ⋯, n),
where ||aik|| is the matrix of A relative to the basis e1, e2, ⋯, en.
It follows that
Similarly,
where aik = aki. Comparing (5) and (6) we obtain the following
result:
Given a symmetric bilinear form A(x; y) there exists a self-adjoint
transformation A such that
A(x; y) = (Ax, y).
We shall make use of this result in the proof of Theorem 3 of
this section.
We shall now show that given a self-adjoint transformation
there exists an orthogonal basis relative to which the matrix of
the transformation is diagonal. The proof of this statement will be
based on the material of para. 1. A different proof which does not
depend on the results of para. 1 and is thus independent of the
theorem asserting the existence of the root of an algebraic equation
is given in § 17.
We first prove two lemmas.
Since
cos φ = (x, y) / (|x| |y|)
and since neither the numerator nor the denominator in the
expression above is changed under an orthogonal transformation,
it follows that an orthogonal transformation preserves the angle
between two vectors.
Let e1, e2, ⋯, en be an orthonormal basis. Since an orthogonal
transformation A preserves the angles between vectors and the
length of vectors, it follows that the vectors Ae1, Ae2, ⋯, Aen
likewise form an orthonormal basis, i.e.,
(11) (Aei, Aek) = 1 for i = k, 0 for i ≠ k.
Now let ||aik|| be the matrix of A relative to the basis e1, e2, ⋯,
en. Since the columns of this matrix are the coordinates of the
vectors Aei, conditions (11) can be rewritten as follows:
(12) Σα aαiaαk = 1 for i = k, 0 for i ≠ k.
EXERCISE. Show that conditions (11) and, consequently, conditions (12)
are sufficient for a transformation to be orthogonal.
Conditions (12) can be written in matrix form. Indeed, the sums
Σα aαiaαk are the elements of the product of the transpose of the
matrix of A by the matrix of A. Conditions (12) imply that
this product is the unit matrix. Since the determinant of the prod-
uct of two matrices is equal to the product of the determinants,
it follows that the square of the determinant of a matrix of an
orthogonal transformation is equal to one, i.e., the determinant of a
matrix of an orthogonal transformation is equal to ±1.
An orthogonal transformation whose determinant is equal to
+1 is called a proper orthogonal transformation, whereas an ortho-
gonal transformation whose determinant is equal to −1 is called
improper.
EXERCISE. Show that the product of two proper or two improper
orthogonal transformations is a proper orthogonal transformation and the
product of a proper by an improper orthogonal transformation is an
improper orthogonal transformation.
(13)
[α β]
[γ δ]
be the matrix of A relative to that basis.
We first study the case when A is a proper orthogonal trans-
formation, i.e., we assume that αδ − βγ = 1.
The orthogonality condition implies that the product of the
matrix (13) by its transpose is equal to the unit matrix, i.e., that
(14)
[α β]⁻¹   [α γ]
[γ δ]   = [β δ]
PDF compression, OCR, web optimization using a watermarked evaluation copy of CVISION PDFCompressor
LINEAR TRANSFORMATIONS 123
Since αδ − βγ = 1, the inverse matrix is
(15)
[α β]⁻¹   [ δ −β]
[γ δ]   = [−γ  α].
It follows from (14) and (15) that in this case the matrix of the
transformation is
[ α β]
[−β α],
where α² + β² = 1. Putting α = cos φ, β = −sin φ we find that
the matrix of a proper orthogonal transformation on a two-dimensional
space relative to an orthogonal basis is of the form
[cos φ −sin φ]
[sin φ  cos φ]
(a rotation of the plane by an angle φ).
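That this matrix is proper orthogonal — orthonormal columns and determinant +1 — can be confirmed numerically; a sketch:

```python
import math

# The matrix [[cos p, -sin p], [sin p, cos p]] has orthonormal columns
# and determinant +1, i.e. it is proper orthogonal (p is arbitrary).
p = 1.1
A = [[math.cos(p), -math.sin(p)],
     [math.sin(p),  math.cos(p)]]

col = lambda j: (A[0][j], A[1][j])
dot = lambda u, v: u[0] * v[0] + u[1] * v[1]

print(round(dot(col(0), col(0)), 12))                    # 1.0 (unit column)
print(dot(col(0), col(1)))                               # 0.0 (orthogonal)
print(round(A[0][0] * A[1][1] - A[0][1] * A[1][0], 12))  # 1.0 (determinant)
```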
Assume now that A is an improper orthogonal transformation,
that is, that αδ − βγ = −1. In this case the characteristic
equation of the matrix (13) is λ² − (α + δ)λ − 1 = 0 and, thus,
has real roots. This means that the transformation A has an
eigenvector e, Ae = λe. Since A is orthogonal it follows that
Ae = ±e. Furthermore, an orthogonal transformation preserves
the angles between vectors and their length. Therefore any vector
e1 orthogonal to e is transformed by A into a vector orthogonal to
Ae = ±e, i.e., Ae1 = ±e1. Hence the matrix of A relative to the
basis e, e1 has the form
[±1  0]
[ 0 ±1].
Since the determinant of an improper transformation is equal to
−1, the canonical form of the matrix of an improper orthogonal
transformation in two-dimensional space is
[1  0]      [−1 0]
[0 −1]  or  [ 0 1]
(a reflection in one of the axes).
We now find the simplest form of the matrix of an orthogonal
transformation defined on a space of arbitrary dimension.
1
  ⋱
    1
      −1
        ⋱
          −1
            cos φ1 −sin φ1
            sin φ1  cos φ1
                     ⋱
                       cos φk −sin φk
                       sin φk  cos φk
(all entries outside the indicated diagonal blocks are zero)
cos φ −sin φ
sin φ  cos φ
             1
               ⋱
                 1
Making use of Theorem 5 one can easily show that every orthogonal
transformation can be written as the product of a number of simple rota-
tions and simple reflections. The proof is left to the reader.
But then the minimum of (Ax, x) for x on the unit sphere in Rk must be
equal to or less than λk.
To sum up: If Rk is an (n − k + 1)-dimensional subspace and x varies
over all vectors in Rk for which (x, x) = 1, then
min (Ax, x) ≤ λk.
Note that among all the subspaces of dimension n − k + 1 there exists
one for which min (Ax, x), (x, x) = 1, x ∈ Rk, is actually equal to λk.
This is the subspace consisting of all vectors orthogonal to the first k − 1
eigenvectors e1, e2, ⋯, ek−1. Indeed, we showed in this section that min
(Ax, x), (x, x) = 1, taken over all vectors orthogonal to e1, e2, ⋯, ek−1,
is equal to λk.
We have thus proved the following theorem:
THEOREM. Let Rk be an (n − k + 1)-dimensional subspace of the space R.
Then min (Ax, x) for all x ∈ Rk, (x, x) = 1, is less than or equal to λk. The
subspace Rk can be chosen so that min (Ax, x) is equal to λk.
Our theorem can be expressed by the formula
(3) max min (Ax, x) = λk.
In this formula the minimum is taken over all x ∈ Rk, (x, x) = 1, and
the maximum over all subspaces Rk of dimension n − k + 1.
As a consequence of our theorem we have:
Let A be a self-adjoint linear transformation and B a positive definite linear transformation. Let λ₁ ≤ λ₂ ≤ ⋯ ≤ λ_n be the eigenvalues of A and μ₁ ≤ μ₂ ≤ ⋯ ≤ μ_n the eigenvalues of A + B. Then λ_k ≤ μ_k.
Indeed,

(Ax, x) ≤ ((A + B)x, x)

for all x. Hence for any (n − k + 1)-dimensional subspace R_k we have

$$\min_{\substack{x \in R_k\\ (x,x)=1}} (Ax, x) \le \min_{\substack{x \in R_k\\ (x,x)=1}} ((A + B)x, x).$$

It follows that the maximum of the expression on the left side taken over all subspaces R_k does not exceed the maximum of the right side. Since, by formula (3), the maximum of the left side is equal to λ_k and the maximum of the right side is equal to μ_k, we have λ_k ≤ μ_k.
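The monotonicity λ_k ≤ μ_k can be checked on a small example. A sketch for 2×2 symmetric matrices, with eigenvalues computed from the characteristic polynomial (the matrices A and B below are illustrative choices; B = [[2, 1], [1, 2]] has eigenvalues 1 and 3, so it is positive definite):

```python
import math

def eig_sym2(a, b, c):
    """Eigenvalues, in ascending order, of the symmetric matrix [[a, b], [b, c]]."""
    tr, det = a + c, a * c - b * b
    d = math.sqrt(tr * tr - 4 * det)  # discriminant is non-negative for symmetric matrices
    return (tr - d) / 2, (tr + d) / 2

# A self-adjoint, B positive definite; A + B is formed entrywise
lam = eig_sym2(1.0, 2.0, -1.0)                      # eigenvalues of A = [[1, 2], [2, -1]]
mu = eig_sym2(1.0 + 2.0, 2.0 + 1.0, -1.0 + 2.0)    # eigenvalues of A + B

# λ_k <= μ_k for every k
assert lam[0] <= mu[0] and lam[1] <= mu[1]
```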
We now extend our results to the case of a complex space.
CHAPTER III
The Canonical Form of an Arbitrary Linear Transformation
sets of vectors e, f, ⋯, h corresponding to the eigenvalues λ₁, λ₂, ⋯, λ_k. Then there exists a basis consisting of k sets of vectors

(1)  e₁, ⋯, e_p;  f₁, ⋯, f_q;  ⋯;  h₁, ⋯, h_s,

relative to which the transformation A has the form:

(2)
$$\begin{aligned}
Ae_1 &= \lambda_1 e_1, & Ae_2 &= e_1 + \lambda_1 e_2, & \cdots, && Ae_p &= e_{p-1} + \lambda_1 e_p;\\
Af_1 &= \lambda_2 f_1, & Af_2 &= f_1 + \lambda_2 f_2, & \cdots, && Af_q &= f_{q-1} + \lambda_2 f_q;\\
&\cdots\\
Ah_1 &= \lambda_k h_1, & Ah_2 &= h_1 + \lambda_k h_2, & \cdots, && Ah_s &= h_{s-1} + \lambda_k h_s.
\end{aligned}$$
$$\begin{aligned}
c_1\lambda_1 + c_2 &= \lambda c_1,\\
c_2\lambda_1 + c_3 &= \lambda c_2,\\
&\cdots\\
c_{p-1}\lambda_1 + c_p &= \lambda c_{p-1},\\
c_p\lambda_1 &= \lambda c_p.
\end{aligned}$$
We first show that λ = λ₁. Indeed, if λ ≠ λ₁, then it would follow from the last equation that c_p = 0 and from the remaining equations that c_{p−1} = c_{p−2} = ⋯ = c₂ = c₁ = 0. Hence λ = λ₁. Substituting this value for λ we get from the first equation c₂ = 0, from the second, c₃ = 0, ⋯, and from the next to last, c_p = 0. This means that the eigenvector is equal to c₁e₁ and, therefore, coincides (to within a multiplicative constant) with the first vector of the corresponding set.
We now write down the matrix of the transformation (2). Since the vectors of each set are transformed into linear combinations of vectors of the same set, it follows that in the first p columns the row indices of possible non-zero elements are 1, 2, ⋯, p; in the next q columns the row indices of possible non-zero elements are p + 1, p + 2, ⋯, p + q; and so on. Thus, the matrix of the transformation relative to the basis (1) has k boxes along the main diagonal. The elements of the matrix which are outside these boxes are equal to zero.
To find out what the elements in each box are it suffices to note
how A transforms the vectors of the appropriate set. We have
$$\begin{aligned}
Ae_1 &= \lambda_1 e_1,\\
Ae_2 &= e_1 + \lambda_1 e_2,\\
&\cdots\\
Ae_{p-1} &= e_{p-2} + \lambda_1 e_{p-1},\\
Ae_p &= e_{p-1} + \lambda_1 e_p.
\end{aligned}$$
Recalling how one constructs the matrix of a transformation
relative to a given basis we see that the box corresponding to the
set of vectors e₁, e₂, ⋯, e_p has the form

(3)
$$\begin{pmatrix}
\lambda_1 & 1 & 0 & \cdots & 0 & 0\\
0 & \lambda_1 & 1 & \cdots & 0 & 0\\
\vdots & & & \ddots & & \vdots\\
0 & 0 & 0 & \cdots & \lambda_1 & 1\\
0 & 0 & 0 & \cdots & 0 & \lambda_1
\end{pmatrix}$$
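The defining relations Ae₁ = λ₁e₁, Aeᵢ = eᵢ₋₁ + λ₁eᵢ can be read off the columns of the box (3). A minimal sketch, with an assumed eigenvalue λ = 5 and order p = 3:

```python
def jordan_block(lam, p):
    """p x p Jordan box: lam on the diagonal, 1 on the superdiagonal."""
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(p)]
            for i in range(p)]

def apply(matrix, vec):
    """Apply a matrix to a coordinate column."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]

J = jordan_block(5, 3)
e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]

assert apply(J, e1) == [5, 0, 0]   # A e1 = λ e1
assert apply(J, e2) == [1, 5, 0]   # A e2 = e1 + λ e2
assert apply(J, e3) == [0, 1, 5]   # A e3 = e2 + λ e3
```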
(4)
$$\begin{pmatrix}
\begin{smallmatrix}\lambda_1 & 1 & & \\ & \lambda_1 & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_1\end{smallmatrix} & & & \\
& \begin{smallmatrix}\lambda_2 & 1 & & \\ & \lambda_2 & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_2\end{smallmatrix} & & \\
& & \ddots & \\
& & & \begin{smallmatrix}\lambda_k & 1 & & \\ & \lambda_k & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_k\end{smallmatrix}
\end{pmatrix}$$
where the 𝒜ᵢ are square boxes and all other elements are zero. Then

𝒜² = diag(𝒜₁², 𝒜₂², ⋯, 𝒜_k²),

and similarly for higher powers; that is, in order to raise the matrix 𝒜 to some power all one has to do is raise each one of the boxes to that power. Now let P(t) = a₀ + a₁t + ⋯ + a_m t^m be any polynomial. It is easy to see that
P(𝒜) = diag(P(𝒜₁), P(𝒜₂), ⋯, P(𝒜_k)).
We now show how to compute P(𝒜₁), say. First we write the matrix 𝒜₁ in the form

𝒜₁ = λ₁ℰ + ℐ,

where ℰ is the unit matrix of order p and where the matrix ℐ has the form

$$\mathscr{I} = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0\\
0 & 0 & 1 & \cdots & 0\\
\vdots & & & \ddots & \vdots\\
0 & 0 & 0 & \cdots & 1\\
0 & 0 & 0 & \cdots & 0
\end{pmatrix}.$$
We note that the matrices ℐ², ℐ³, ⋯, ℐ^{p−1} are of the form

$$\mathscr{I}^2 = \begin{pmatrix}
0 & 0 & 1 & \cdots & 0\\
\vdots & & & \ddots & \vdots\\
0 & 0 & 0 & \cdots & 1\\
0 & 0 & 0 & \cdots & 0\\
0 & 0 & 0 & \cdots & 0
\end{pmatrix}, \quad \cdots, \quad
\mathscr{I}^{p-1} = \begin{pmatrix}
0 & 0 & \cdots & 0 & 1\\
0 & 0 & \cdots & 0 & 0\\
\vdots & & & & \vdots\\
0 & 0 & \cdots & 0 & 0
\end{pmatrix},$$

with the 1's moving one place further above the main diagonal at each step, and

ℐ^p = ℐ^{p+1} = ⋯ = 0.
It is now easy to compute P(𝒜₁). In view of Taylor's formula a polynomial P(t) can be written as

$$P(t) = P(\lambda_1) + (t - \lambda_1)P'(\lambda_1) + \frac{(t - \lambda_1)^2}{2!}P''(\lambda_1) + \cdots + \frac{(t - \lambda_1)^n}{n!}P^{(n)}(\lambda_1),$$

where n is the degree of P(t). Substituting for t the matrix 𝒜₁ we get

$$P(\mathscr{A}_1) = P(\lambda_1)\mathscr{E} + (\mathscr{A}_1 - \lambda_1\mathscr{E})P'(\lambda_1) + \frac{(\mathscr{A}_1 - \lambda_1\mathscr{E})^2}{2!}P''(\lambda_1) + \cdots.$$
Since 𝒜₁ − λ₁ℰ = ℐ, this yields

$$P(\mathscr{A}_1) = \begin{pmatrix}
P(\lambda_1) & \dfrac{P'(\lambda_1)}{1!} & \dfrac{P''(\lambda_1)}{2!} & \cdots & \dfrac{P^{(p-1)}(\lambda_1)}{(p-1)!}\\
0 & P(\lambda_1) & \dfrac{P'(\lambda_1)}{1!} & \cdots & \vdots\\
\vdots & & \ddots & & \\
0 & 0 & 0 & \cdots & P(\lambda_1)
\end{pmatrix}.$$
Thus in order to compute P(𝒜₁), where 𝒜₁ has order p, it suffices to know the value of P(t) and its first p − 1 derivatives at the point λ₁, where λ₁ is the eigenvalue of 𝒜₁. It follows that if the matrix has canonical form (4) with boxes of order p, q, ⋯, s, then to compute P(𝒜) one has to know the value of P(t) at the points t = λ₁, λ₂, ⋯, λ_k as well as the values of the first p − 1 derivatives at λ₁, the first q − 1 derivatives at λ₂, ⋯, and the first s − 1 derivatives at λ_k.
3 The main idea for the proof of this theorem is due to I. G. Petrovsky.
See I. G. Petrovsky, Lectures on the Theory of Ordinary Differential Equa-
tions, chapter 6.
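The Taylor-formula recipe can be checked directly. A sketch for a box of order p = 3 with an assumed eigenvalue λ₁ = 2 and P(t) = t², comparing the matrix built from P(λ₁), P′(λ₁)/1!, P″(λ₁)/2! against P(𝒜₁) computed by matrix multiplication:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Jordan box of order p = 3 with eigenvalue 2
p = 3
J = [[2, 1, 0],
     [0, 2, 1],
     [0, 0, 2]]

# P(t) = t^2: P(2) = 4, P'(2) = 4, P''(2)/2! = 1.
# P(J) carries P(λ) on the diagonal, P'(λ)/1! on the first superdiagonal,
# P''(λ)/2! on the second.
derivs = [4, 4, 1]  # P^(m)(2) / m! for m = 0, 1, 2
P_taylor = [[derivs[j - i] if 0 <= j - i < p else 0 for j in range(p)]
            for i in range(p)]

assert P_taylor == matmul(J, J)  # agrees with evaluating P(J) = J·J directly
```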
$$\begin{aligned}
Ah_1 &= \lambda_k h_1,\\
Ah_2 &= h_1 + \lambda_k h_2,\\
&\cdots\\
Ah_s &= h_{s-1} + \lambda_k h_s.
\end{aligned}$$
We now pick a vector e which together with the vectors

e₁, e₂, ⋯, e_p;  f₁, f₂, ⋯, f_q;  ⋯;  h₁, h₂, ⋯, h_s

forms a basis in R.
Applying the transformation A to e we get
⁴ We assume here that R is Euclidean, i.e., that an inner product is defined on R. However, by changing the proof slightly we can show that the Lemma holds for any vector space R.
e'
e' = cc,e. + ßf, + yrgr,
= Aet_7+2 = Gip ep_,.±, + 41, f,.+1 y,g1,
= Aet_.+1 = e,_, + fg_r,
0 o 1 ].
rooAo A,
where 𝒜₁ and 𝒜₂ are of order n₁ and n₂, then the mth order non-zero
O 1 O
=
O O O I
0 0 A, A_
⁷ Of course, a non-zero kth order minor of 𝒜 may have the form Δ = Δ_k⁽¹⁾, i.e., it may be entirely made up of elements of 𝒜₁. In this case we shall write it formally as Δ = Δ_k⁽¹⁾Δ₀⁽²⁾, where Δ₀⁽²⁾ = 1.
The expressions for the D_k(λ) show that in place of the D_k(λ) it is more convenient to consider their ratios

E_k(λ) = D_k(λ) / D_{k−1}(λ).

The E_k(λ) are called elementary divisors. Thus if the Jordan canonical form of a matrix 𝒜 contains p boxes of order n₁, n₂, ⋯, n_p (n₁ ≥ n₂ ≥ ⋯ ≥ n_p) corresponding to the eigenvalue λ₁, q boxes of order m₁, m₂, ⋯, m_q (m₁ ≥ m₂ ≥ ⋯ ≥ m_q) corresponding to the eigenvalue λ₂, etc., then the elementary divisors E_k(λ) are
$$\lambda c_i = \sum_{k=1}^{n} a_{ik} c_k.$$
The matrix of this system of equations is A − λE, with A the matrix of coefficients in the system (1). Thus the study of the system of differential equations (1) is closely linked to polynomial matrices of degree one, namely, those of the form A − λE.
Similarly, the study of higher order systems of differential equations leads
to polynomial matrices of degree higher than one. Thus the study of the
system
$$\sum_{k=1}^{n} a_{ik}\frac{d^2 y_k}{dx^2} + \sum_{k=1}^{n} b_{ik}\frac{dy_k}{dx} + \sum_{k=1}^{n} c_{ik} y_k = 0$$

is synonymous with the study of the polynomial matrix Aλ² + Bλ + C, where A = ||a_{ik}||, B = ||b_{ik}||, C = ||c_{ik}||.
(4)
$$\begin{pmatrix}
E_1(\lambda) & 0 & 0 & \cdots & 0\\
0 & E_2(\lambda) & 0 & \cdots & 0\\
0 & 0 & E_3(\lambda) & \cdots & 0\\
\vdots & & & \ddots & \vdots\\
0 & 0 & 0 & \cdots & E_n(\lambda)
\end{pmatrix}$$

Here the diagonal elements E_k(λ) are monic polynomials and E₁(λ) divides E₂(λ), E₂(λ) divides E₃(λ), etc. This form of a polynomial matrix is called its canonical diagonal form.
It may, of course, happen that

E_{r+1}(λ) = E_{r+2}(λ) = ⋯ = E_n(λ) = 0

for some value of r.
REMARK: We have brought A(λ) to a diagonal form in which every diagonal element is divisible by its predecessor. If we dispense with the latter requirement the process of diagonalization can be considerably simplified.
$$\begin{pmatrix} 1 & 0\\ 0 & (\lambda - \lambda_1)(\lambda - \lambda_2)\end{pmatrix}.$$
its sign or replace it with another kth order minor. In all these
cases the greatest common divisor of all kth order minors remains
unchanged. Likewise, elementary transformations of type 3 do
not change D_k(λ) since under such transformations the minors are
at most multiplied by a constant. Now consider elementary
transformations of type 2. Specifically, consider addition of the
jth column multiplied by T(A) to the ith column. If some particular
kth order minor contains none of these columns or if it contains
both of them it is not affected by the transformation in question.
If it contains the ith column but not the jth column we can write
it as a combination of minors each of which appears in the original
matrix. Thus in this case, too, the greatest common divisor of the
kth order minors remains unchanged.
If all kth order minors and, consequently, all minors of order higher than k are zero, then we put D_k(λ) = D_{k+1}(λ) = ⋯ = D_n(λ) = 0. We observe that equality of the D_k(λ) for all
equivalent matrices implies that equivalent matrices have the
same rank.
We compute the polynomials D_k(λ) for a matrix in canonical form

(5)
$$\begin{pmatrix}
E_1(\lambda) & 0 & \cdots & 0\\
0 & E_2(\lambda) & \cdots & 0\\
\vdots & & \ddots & \vdots\\
0 & 0 & \cdots & E_n(\lambda)
\end{pmatrix}.$$
We observe that in the case of a diagonal matrix the only non-zero minors are the principal minors, that is, minors made up of like numbered rows and columns. These minors are of the form

E_{i₁}(λ) E_{i₂}(λ) ⋯ E_{i_k}(λ),  i₁ < i₂ < ⋯ < i_k.

Since E₂(λ) is divisible by E₁(λ), E₃(λ) is divisible by E₂(λ), etc., it follows that the greatest common divisor D₁(λ) of all minors of order one is E₁(λ). Since all the polynomials E_k(λ) are divisible by E₁(λ) and all polynomials other than E₁(λ) are divisible by E₂(λ), the product E_i(λ)E_j(λ) (i < j) is always divisible by the minor E₁(λ)E₂(λ). Hence D₂(λ) = E₁(λ)E₂(λ). Since all E_i(λ) other than E₁(λ) and E₂(λ) are divisible by E₃(λ), the product E_i(λ)E_j(λ)E_k(λ) (i < j < k) is divisible by the minor E₁(λ)E₂(λ)E₃(λ), and so D₃(λ) = E₁(λ)E₂(λ)E₃(λ).
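The relations D_k(λ) = E₁(λ) ⋯ E_k(λ) and E_k(λ) = D_k(λ)/D_{k−1}(λ) can be sketched numerically. The representation below (a monic polynomial as a root-to-multiplicity dict) is a simplifying assumption that only covers products of linear factors, and the diagonal entries E₁, E₂, E₃ are illustrative choices:

```python
from functools import reduce
from itertools import combinations

# A monic polynomial that splits into linear factors is a {root: multiplicity}
# dict, e.g. (λ-2)²(λ-3) → {2: 2, 3: 1}; the empty dict {} stands for 1.
def mul(p, q):
    out = dict(p)
    for r, m in q.items():
        out[r] = out.get(r, 0) + m
    return out

def gcd(p, q):
    return {r: min(m, q[r]) for r, m in p.items() if r in q}

def div(p, q):
    return {r: m - q.get(r, 0) for r, m in p.items() if m > q.get(r, 0)}

# Canonical diagonal form diag(E1, E2, E3) with E1 | E2 | E3:
E = [{}, {2: 1}, {2: 2, 3: 1}]  # E1 = 1, E2 = (λ-2), E3 = (λ-2)²(λ-3)

def D(k):
    # For a diagonal λ-matrix the non-zero k-th order minors are products of
    # k diagonal entries; D_k is their greatest common divisor.
    prods = [reduce(mul, combo, {}) for combo in combinations(E, k)]
    return reduce(gcd, prods)

assert D(1) == {}                  # D1 = E1 = 1
assert D(2) == {2: 1}              # D2 = E1·E2 = (λ-2)
assert D(3) == {2: 3, 3: 1}        # D3 = E1·E2·E3 = (λ-2)³(λ-3)
assert div(D(2), D(1)) == E[1]     # E2 = D2 / D1
assert div(D(3), D(2)) == E[2]     # E3 = D3 / D2
```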
THEOREM 2. A necessary and sufficient condition for two polynomial matrices A(λ) and B(λ) to be equivalent is that the polynomials D₁(λ), D₂(λ), ⋯, D_n(λ) be the same for both matrices.
Indeed, if the polynomials D_k(λ) are the same for A(λ) and B(λ), then both of these matrices are equivalent to the same canonical diagonal matrix and are therefore equivalent (to one another).
3. A polynomial matrix P(λ) is said to be invertible if the matrix [P(λ)]⁻¹ is also a polynomial matrix. If det P(λ) is a constant other than zero, then P(λ) is invertible. Indeed, the elements of the inverse matrix are, apart from sign, the (n − 1)st order minors divided by det P(λ). In our case these quotients would be polynomials and [P(λ)]⁻¹ would be a polynomial matrix. Conversely, if P(λ) is invertible, then det P(λ) = const ≠ 0. Indeed, let [P(λ)]⁻¹ = P₁(λ). Then det P(λ) det P₁(λ) = 1, and a product of two polynomials equals one only if the polynomials in question are non-zero constants. We have thus shown that a polynomial matrix is invertible if and only if its determinant is a non-zero constant.
All invertible matrices are equivalent to the unit matrix. Indeed, the determinant of an invertible matrix is a non-zero constant, so that D_n(λ) = 1. Since D_n(λ) is divisible by D_k(λ), D_k(λ) = 1 (k = 1, 2, ⋯, n). It follows that all the elementary divisors E_k(λ) of an invertible matrix are equal to one and the canonical diagonal form of such a matrix is therefore the unit matrix.
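The invertibility criterion can be illustrated on a small example. A sketch with polynomials stored as coefficient lists, using the assumed matrix P(λ) = [[1, λ], [0, 1]], whose determinant is the non-zero constant 1 and whose inverse [[1, −λ], [0, 1]] is again polynomial:

```python
# Polynomials as coefficient lists, lowest degree first: [0, 1] is λ.
def padd(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def pneg(p):
    return [-a for a in p]

def trim(p):
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

P = [[[1], [0, 1]], [[0], [1]]]        # P(λ)  = [[1,  λ], [0, 1]]
Pinv = [[[1], [0, -1]], [[0], [1]]]    # P(λ)⁻¹ = [[1, -λ], [0, 1]], polynomial

# det P = 1·1 - λ·0 = 1, a non-zero constant, so P is invertible
det = trim(padd(pmul(P[0][0], P[1][1]), pneg(pmul(P[0][1], P[1][0]))))
assert det == [1]

# Check P · Pinv = E entrywise
for i in range(2):
    for j in range(2):
        entry = trim(padd(pmul(P[i][0], Pinv[0][j]), pmul(P[i][1], Pinv[1][j])))
        assert entry == ([1] if i == j else [0])
```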
THEOREM 3. Two polynomial matrices A(λ) and B(λ) are equivalent if and only if there exist invertible polynomial matrices P(λ) and Q(λ) such that

(7) A(λ) = P(λ)B(λ)Q(λ).
Proof: We first show that if A(λ) and B(λ) are equivalent, then there exist invertible matrices P(λ) and Q(λ) such that (7) holds. To this end we observe that every elementary transformation of a polynomial matrix A(λ) can be realized by multiplying A(λ) on the right or on the left by a suitable invertible polynomial matrix, namely, by the matrix of the elementary transformation in question.
We illustrate this for all three types of elementary transformations. Thus let there be given a polynomial matrix A(λ)
(8)
$$\begin{pmatrix}
1 & 0 & 0 & 0\\
0 & a & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}$$

obtained from the unit matrix by multiplying its second column (or, what amounts to the same thing, row) by a.
Finally, to add to the first column of A(λ) the second column multiplied by φ(λ) we must multiply A(λ) on the right by the matrix

(10)
$$\begin{pmatrix}
1 & 0 & 0 & 0\\
\varphi(\lambda) & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}.$$
Let

P(λ) = P₀λⁿ + P₁λⁿ⁻¹ + ⋯ + Pₙ,

where the Pᵢ are constant matrices.
It is easy to see that the polynomial matrix

P(λ) + (A − λE)P₀λⁿ⁻¹

is of degree not higher than n − 1. If

P(λ) + (A − λE)P₀λⁿ⁻¹ = P′₀λⁿ⁻¹ + P′₁λⁿ⁻² + ⋯ + P′ₙ₋₁,

then the polynomial matrix

P(λ) + (A − λE)P₀λⁿ⁻¹ + (A − λE)P′₀λⁿ⁻²

is of degree not higher than n − 2. Continuing this process we obtain a polynomial matrix

P(λ) + (A − λE)(P₀λⁿ⁻¹ + P′₀λⁿ⁻² + ⋯)

which is independent of λ; denoting it by R and setting S(λ) = −(P₀λⁿ⁻¹ + P′₀λⁿ⁻² + ⋯), we get

P(λ) = (A − λE)S(λ) + R.

This proves our lemma.
A similar proof holds for the possibility of division on the right; i.e., there exist matrices S₁(λ) and R₁ such that

P(λ) = S₁(λ)(A − λE) + R₁.

We note that in our case, just as in the ordinary theorem of Bezout, we can claim that

R = R₁ = P(A).
THEOREM 4. The polynomial matrices A − λE and B − λE are equivalent if and only if the matrices A and B are similar.
Proof: The sufficiency part of the proof was given in the beginning of this paragraph. It remains to prove necessity. This means that we must show that the equivalence of A − λE and B − λE implies the similarity of A and B. By Theorem 3 there exist invertible polynomial matrices P(λ) and Q(λ) such that
CHAPTER IV
Introduction to Tensors
§ 22. The dual space
1. Definition of the dual space. Let R be a vector space. To-
gether with R one frequently considers another space called the
dual space which is closely connected with R. The starting point
for the definition of a dual space is the notion of a linear function
introduced in para. 1, § 4.
We recall that a function f(x), x E R, is called linear if it satisfies
the following conditions:
f(x + y) = f(x) + f(y),
f(λx) = λf(x).
Let e₁, e₂, ⋯, eₙ be a basis in an n-dimensional space R. If

x = ξ¹e₁ + ξ²e₂ + ⋯ + ξⁿeₙ

is a vector in R and f is a linear function on R, then (cf. § 4) we can write

(1) f(x) = f(ξ¹e₁ + ξ²e₂ + ⋯ + ξⁿeₙ) = a₁ξ¹ + a₂ξ² + ⋯ + aₙξⁿ,

where the coefficients a₁, a₂, ⋯, aₙ which determine the linear function are given by

(2) a₁ = f(e₁), a₂ = f(e₂), ⋯, aₙ = f(eₙ).

It is clear from (1) that given a basis e₁, e₂, ⋯, eₙ every n-tuple a₁, a₂, ⋯, aₙ determines a unique linear function.
Let f and g be linear functions. By the sum h of f and g we mean the function which associates with a vector x the number f(x) + g(x). By the product of f by a number α we mean the function which associates with a vector x the number αf(x).
Obviously the sum of two linear functions and the product of a
function by a number are again linear functions. Also, if f is
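In coordinates, a linear function is just the n-tuple (a₁, ⋯, aₙ) of formula (2). A minimal sketch (the coefficient tuples below are illustrative choices):

```python
# A linear function on an n-dimensional space is determined by the n-tuple
# (a1, ..., an) of its values on a basis: f(x) = a1·ξ¹ + ... + an·ξⁿ.
def linear(coeffs):
    return lambda x: sum(a * xi for a, xi in zip(coeffs, x))

f = linear([1, 2, 3])
g = linear([0, 1, -1])

x, y = [1, 1, 1], [2, 0, 1]

# Linearity: f(x + y) = f(x) + f(y) and f(λx) = λ f(x)
xpy = [a + b for a, b in zip(x, y)]
assert f(xpy) == f(x) + f(y)
assert f([5 * a for a in x]) == 5 * f(x)

# The sum of two linear functions corresponds to adding their n-tuples
h = linear([1 + 0, 2 + 1, 3 + (-1)])
assert h(x) == f(x) + g(x)
```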
then

(fᵏ, x) = (fᵏ, Σᵢ ξⁱeᵢ) = Σᵢ ξⁱ(fᵏ, eᵢ) = ξᵏ.

To repeat:
If e₁, e₂, ⋯, eₙ is a basis in R and f¹, f², ⋯, fⁿ its dual basis in R̄, then

(4) (f, x) = η₁ξ¹ + η₂ξ² + ⋯ + ηₙξⁿ,

where ξ¹, ξ², ⋯, ξⁿ are the coordinates of x ∈ R relative to the basis e₁, e₂, ⋯, eₙ and η₁, η₂, ⋯, ηₙ are the coordinates of f ∈ R̄ relative to the basis f¹, f², ⋯, fⁿ.
NOTE: For arbitrary bases e₁, e₂, ⋯, eₙ and f¹, f², ⋯, fⁿ in R and R̄ respectively,

(f, x) = Σ_{k,i} a_{ki} η_k ξⁱ,

where a_{ki} = (fᵏ, eᵢ).
3. Interchangeability of R and R̄. We now show that it is possible to interchange the roles of R and R̄ without affecting the theory developed so far.
R̄ was defined as the totality of linear functions on R. We wish
Let f¹, f², ⋯, fⁿ be the dual basis of e₁, e₂, ⋯, eₙ and f′¹, f′², ⋯, f′ⁿ the dual basis of e′₁, e′₂, ⋯, e′ₙ. We wish to find the matrix ||bᵢᵏ|| of transition from the fⁱ basis to the f′ⁱ basis. We first find its inverse, the matrix of transition from the basis f′¹, f′², ⋯, f′ⁿ to the basis f¹, f², ⋯, fⁿ:

(6′) fᵏ = Σᵢ uᵢᵏ f′ⁱ.

To this end we compute (fᵏ, e′ᵢ) in two ways:

(fᵏ, e′ᵢ) = (fᵏ, Σ_α cᵢ^α e_α) = Σ_α cᵢ^α (fᵏ, e_α) = cᵢᵏ;
(fᵏ, e′ᵢ) = (Σ_α u_α^k f′^α, e′ᵢ) = uᵢᵏ.

Hence cᵢᵏ = uᵢᵏ, i.e., the matrix in (6′) is the transpose¹ of the transition matrix in (6). It follows that the matrix of the transition from f¹, f², ⋯, fⁿ to f′¹, f′², ⋯, f′ⁿ is equal to the inverse of the transpose of the matrix in (6).
¹ This is seen by comparing the matrices in (6) and (6′). We say that the matrix ||uᵢᵏ|| in (6′) is the transpose of the transition matrix in (6) because the summation indices in (6) and (6′) are different.
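The inverse-transpose rule can be checked numerically. A sketch in two dimensions with an assumed transition matrix C, verifying that the dual basis rebuilt with (Cᵀ)⁻¹ pairs with the new basis to give δᵢᵏ:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

C = [[2.0, 1.0], [1.0, 1.0]]   # rows: coordinates of e'_i in the e basis
F = inverse2(transpose(C))     # rows: coordinates of f'^k in the f basis

# Pairing (f'^k, e'_i) computed in the original coordinates is F · Cᵀ,
# which must be the unit matrix.
pair = matmul(F, transpose(C))
for k in range(2):
    for i in range(2):
        assert abs(pair[k][i] - (1.0 if k == i else 0.0)) < 1e-12
```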
§ 23. Tensors
1. Multilinear functions. In the first chapter we studied linear
and bilinear functions on an n-dimensional vector space. A natural
² If R is an n-dimensional vector space, then R̄ is also n-dimensional and
It follows that

Ae′ᵢ = A(Σ_α cᵢ^α e_α) = Σ_α cᵢ^α Ae_α = Σ_{α,β} cᵢ^α a_α^β e_β = Σ_{α,β,k} cᵢ^α a_α^β b_β^k e′ₖ.

This means that the matrix of A relative to the e′ᵢ basis takes the form

a′ᵢᵏ = Σ_{α,β} cᵢ^α a_α^β b_β^k,
l(x, y, ⋯, z, ⋯; f, g, ⋯, h, ⋯) = l′(x, y, ⋯; f, g, ⋯) · l″(z, ⋯; h, ⋯).

l is a multilinear function of p′ + p″ vectors in R and q′ + q″ vectors in R̄. To see this we need only vary in l one vector at a time keeping all other vectors fixed.
We shall now express the components of the tensor corresponding to the product of the multilinear functions l′ and l″ in terms of the components of the tensors corresponding to l′ and l″. Since

a′_{ij⋯}^{rs⋯} = l′(eᵢ, eⱼ, ⋯; fʳ, fˢ, ⋯)
and
a″_{kl⋯}^{tu⋯} = l″(eₖ, eₗ, ⋯; fᵗ, fᵘ, ⋯),

it follows that

a_{ij⋯kl⋯}^{rs⋯tu⋯} = a′_{ij⋯}^{rs⋯} a″_{kl⋯}^{tu⋯}.

This formula defines the product of two tensors.
Contraction of tensors. Let l(x, y, ⋯; f, g, ⋯) be a multilinear function of p vectors in R (p ≥ 1) and q vectors in R̄ (q ≥ 1). We use l to define a new multilinear function of p − 1 vectors in R and q − 1 vectors in R̄. To this end we choose a basis e₁, e₂, ⋯, eₙ in R and its dual basis f¹, f², ⋯, fⁿ in R̄ and consider the sum

(7) l′(y, ⋯; g, ⋯) = l(e₁, y, ⋯; f¹, g, ⋯) + l(e₂, y, ⋯; f², g, ⋯) + ⋯ + l(eₙ, y, ⋯; fⁿ, g, ⋯) = Σ_α l(e_α, y, ⋯; f^α, g, ⋯).
Since each summand is a multilinear function of y, ⋯ and g, ⋯, the same is true of the sum l′. We now show that whereas each summand depends on the choice of basis, the sum does not. Let us choose a new basis e′₁, e′₂, ⋯, e′ₙ and denote its dual basis by f′¹, f′², ⋯, f′ⁿ. Since the vectors y, ⋯ and g, ⋯ remain fixed, we need only prove our contention for a bilinear form A(x; f). Specifically, we must show that

Σ_α A(e_α; f^α) = Σ_α A(e′_α; f′^α).

We recall that if

e′_α = Σₖ c_α^k eₖ,

then

fᵏ = Σ_α c_α^k f′^α.

Therefore

Σ_α A(e′_α; f′^α) = Σ_α A(Σₖ c_α^k eₖ; f′^α) = Σₖ A(eₖ; Σ_α c_α^k f′^α) = Σₖ A(eₖ; fᵏ),

i.e., Σ_α A(e_α; f^α) is indeed independent of the choice of basis.
We now express the coefficients of the form (7) in terms of the coefficients of the form l(x, y, ⋯; f, g, ⋯). Since

a′_{j⋯}^{s⋯} = l′(eⱼ, ⋯; fˢ, ⋯)
and

l′(eⱼ, ⋯; fˢ, ⋯) = Σ_α l(e_α, eⱼ, ⋯; f^α, fˢ, ⋯),

it follows that

(8) a′_{j⋯}^{s⋯} = Σ_α a_{αj⋯}^{αs⋯}.

The tensor a′_{j⋯}^{s⋯} obtained from a_{ij⋯}^{rs⋯} as per (8) is called a contraction of the tensor a_{ij⋯}^{rs⋯}.
It is clear that the summation in the process of contraction may
involve any covariant index and any contravariant index. How-
ever, if one tried to sum over two covariant indices, say, the result-
ing system of numbers would no longer form a tensor (for upon
change of basis this system of numbers would not transform in
accordance with the prescribed law of transformation for tensors).
We observe that contraction of a tensor of rank two leads to a
tensor of rank zero (scalar), i.e., to a number independent of
coordinate systems.
The operation of lowering indices discussed in para. 4 of this section can be viewed as contraction of the product of some tensor by the metric tensor g_{ik} (repeated as a factor an appropriate number of times). Likewise the raising of indices can be viewed as contraction of the product of some tensor by the tensor g^{ik}.
Another example. Let a_{ij}^k be a tensor of rank three and b_l^m a tensor of rank two. Their product c_{ijl}^{km} = a_{ij}^k b_l^m is a tensor of rank five. The result of contracting this tensor over the indices i and m, say, would be a tensor of rank three. Another contraction, over the indices j and k, say, would lead to a tensor of rank one (a vector).
Let a_i^j and b_k^l be two tensors of rank two. By multiplication and contraction these yield a new tensor of rank two:

c_i^k = a_i^α b_α^k.

If the tensors a_i^j and b_k^l are looked upon as matrices of linear transformations, then the tensor c_i^k is the matrix of the product of these linear transformations.
With any tensor a_i^j of rank two we can associate a sequence of invariants (i.e., numbers independent of choice of basis, simply scalars):

a_α^α,  a_α^β a_β^α,  a_α^β a_β^γ a_γ^α,  ⋯.
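These invariants are the traces of the successive powers of the matrix ||a_i^j||. A sketch (with illustrative matrices A and C) checking that they survive the change of basis A → C⁻¹AC:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def inverse2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1.0, 2.0], [3.0, 4.0]]
C = [[1.0, 1.0], [0.0, 1.0]]          # an invertible change of basis
Aprime = matmul(matmul(inverse2(C), A), C)

# tr(A) and tr(A²) are unchanged by the change of basis
assert abs(trace(A) - trace(Aprime)) < 1e-9
assert abs(trace(matmul(A, A)) - trace(matmul(Aprime, Aprime))) < 1e-9
```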
(10) a_{i₁i₂⋯iₙ} = ±a_{12⋯n},

with the sign determined by the parity of the permutation i₁, i₂, ⋯, iₙ of the integers 1, 2, ⋯, n, and if we put a_{12⋯n} = a, then

l(x, y, ⋯, z) = a Σ (±1) ξ^{i₁} η^{i₂} ⋯ ζ^{iₙ},

where the sum is precisely the determinant of the coordinates of the vectors x, y, ⋯, z.
This proves the fact that apart from a multiplicative constant the
only skew symmetric multilinear function of n vectors in an n-
dimensional vector space is the determinant of the coordinates of
these vectors.
The operation of symmetrization. Given a tensor one can always construct another tensor symmetric with respect to a preassigned group of indices. This operation is called symmetrization and consists in the following.
Let the given tensor be a_{i₁i₂⋯i_q}, say. To symmetrize it with
The tensor a^{[i₁i₂⋯i_k]} does not change when we add to one of the vectors ξ, η, ⋯ any linear combination of the remaining vectors.
Consider a k-dimensional subspace of an n-dimensional space R. We wish to characterize this subspace by means of a system of numbers, i.e., we wish to coordinatize it.
A k-dimensional subspace is generated by k linearly independent vectors ξ₁, ξ₂, ⋯, ξₖ. Different systems of k linearly independent vectors may generate the same subspace. However, it is easy to show (the proof is left to the reader) that if two such systems of vectors generate the same subspace, the tensors constructed from each of these systems differ by a non-zero multiplicative constant only.
Thus the skew symmetric tensor a^{[i₁i₂⋯iₖ]} constructed on the generators ξ₁, ξ₂, ⋯, ξₖ of the subspace defines this subspace.
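For k = 2 in a 3-dimensional space the components a^{[ij]} are the 2×2 minors of the coordinate matrix of the generators (the Plücker coordinates of the plane they span). A sketch (with illustrative generators) checking that a different generating pair of the same plane rescales all components by one non-zero constant:

```python
from itertools import combinations

def plucker2(u, v):
    """2x2 minors of the 2×n matrix with rows u, v, in lexicographic column order."""
    return [u[i] * v[j] - u[j] * v[i] for i, j in combinations(range(len(u)), 2)]

u, v = [1, 0, 2], [0, 1, 3]
p = plucker2(u, v)

# Another pair of generators of the same plane: u' = u + 2v, v' = 3v
u2 = [a + 2 * b for a, b in zip(u, v)]
v2 = [3 * b for b in v]
q = plucker2(u2, v2)

# The two systems of components differ by a single multiplicative constant,
# namely the determinant (= 3) of the change of generators.
assert q == [3 * c for c in p]
```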