LINEAR ALGEBRA
With Geometric Applications
PURE AND APPLIED MATHEMATICS
A Program of Monographs, Textbooks, and Lecture Notes
EXECUTIVE EDITORS-MONOGRAPHS, TEXTBOOKS, AND LECTURE NOTES
Earl J. Taft
Rutgers University
New Brunswick, New Jersey
Edwin Hewitt
University of Washington
Seattle, Washington
CHAIRMAN OF THE EDITORIAL BOARD
S. Kobayashi
University of California, Berkeley
Berkeley, California
EDITORIAL BOARD
Masanao Aoki
University of California, Los Angeles
Glen E. Bredon
Rutgers University
Sigurdur Helgason
Massachusetts Institute of Technology
G. Leitmann
University of California, Berkeley
Marvin Marcus
University of California, Santa Barbara
W. S. Massey
Yale University
Irving Reiner
University of Illinois at Urbana-Champaign
Paul J. Sally, Jr.
University of Chicago
Jane Cronin Scanlon
Rutgers University
Martin Schechter
Yeshiva University
Julius L. Shaneson
Rutgers University
MONOGRAPHS AND TEXTBOOKS IN
PURE AND APPLIED MATHEMATICS
1. K. YANO. Integral Formulas in Riemannian Geometry (1970)
2. S. KOBAYASHI. Hyperbolic Manifolds and Holomorphic Mappings (1970)
3. V. S. VLADIMIROV. Equations of Mathematical Physics (A. Jeffrey, editor; A. Littlewood, translator) (1970)
4. B. N. PSHENICHNYI. Necessary Conditions for an Extremum (L. Neustadt, translation editor; K. Makowski, translator) (1971)
5. L. NARICI, E. BECKENSTEIN, and G. BACHMAN. Functional Analysis and Valuation Theory (1971)
6. D. S. PASSMAN. Infinite Group Rings (1971)
7. L. DORNHOFF. Group Representation Theory (in two parts). Part A: Ordinary Representation Theory. Part B: Modular Representation Theory (1971, 1972)
8. W. BOOTHBY and G. L. WEISS (eds.). Symmetric Spaces: Short Courses Presented at Washington University (1972)
9. Y. MATSUSHIMA. Differentiable Manifolds (E. T. Kobayashi, translator) (1972)
10. L. E. WARD, JR. Topology: An Outline for a First Course (1972)
11. A. BABAKHANIAN. Cohomological Methods in Group Theory (1972)
12. R. GILMER. Multiplicative Ideal Theory (1972)
13. J. YEH. Stochastic Processes and the Wiener Integral (1973)
14. J. BARROS-NETO. Introduction to the Theory of Distributions (1973)
15. R. LARSEN. Functional Analysis: An Introduction (1973)
16. K. YANO and S. ISHIHARA. Tangent and Cotangent Bundles: Differential Geometry (1973)
17. C. PROCESI. Rings with Polynomial Identities (1973)
18. R. HERMANN. Geometry, Physics, and Systems (1973)
19. N. R. WALLACH. Harmonic Analysis on Homogeneous Spaces (1973)
20. J. DIEUDONNÉ. Introduction to the Theory of Formal Groups (1973)
21. I. VAISMAN. Cohomology and Differential Forms (1973)
22. B.-Y. CHEN. Geometry of Submanifolds (1973)
23. M. MARCUS. Finite Dimensional Multilinear Algebra (in two parts) (1973, 1975)
24. R. O. KUJALA and A. L. VITTER (eds.). Value Distribution Theory: Part A; Part B. Deficit and Bezout Estimates by Wilhelm Stoll (1973)
24. R. LARSEN. Banach Algebras: An Introduction (1973)
25. R. O. KUJALA and A. L. VITTER (eds.). Value Distribution Theory: Part A; Part B. Deficit and Bezout Estimates by Wilhelm Stoll (1973)
26. K. B. STOLARSKY. Algebraic Numbers and Diophantine Approximation (1974)
27. A. R. MAGID. The Separable Galois Theory of Commutative Rings (1974)
28. B. R. MCDONALD. Finite Rings with Identity (1974)
29. I. SATAKE. Linear Algebra (S. Koh, T. Akiba, and S. Ihara, translators) (1975)
30. J. S. GOLAN. Localization of Noncommutative Rings (1975)
31. G. KLAMBAUER. Mathematical Analysis (1975)
32. M. K. AGOSTON. Algebraic Topology: A First Course (1976)
33. K. R. GOODEARL. Ring Theory: Nonsingular Rings and Modules (1976)
34. L. E. MANSFIELD. Linear Algebra with Geometric Applications (1976)
LINEAR ALGEBRA
With Geometric Applications
LARRY E. MANSFIELD
Queens College of the City
University of New York
Flushing, New York
MARCEL DEKKER, INC. New York and Basel
COPYRIGHT 1976 MARCEL DEKKER, INC. ALL RIGHTS RESERVED.
Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage and retrieval system, without permission in writing from the publisher.
MARCEL DEKKER, INC.
270 Madison Avenue, New York, New York 10016
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 75-10345
ISBN: 0-8247-6321-1
Current printing (last digit):
10 9 8 7 6 5 4 3 2 1
PRINTED IN THE UNITED STATES OF AMERICA
Contents

Preface vii

Chapter 1. A Geometric Model
1. The Field of Real Numbers 2
2. The Vector Space 𝒱₂ 4
3. Geometric Representation of Vectors of 𝒱₂ 7
4. The Plane Viewed as a Vector Space 12
5. Angle Measurement in ℰ₂ 19
6. Generalization of 𝒱₂ and ℰ₂ 22

Chapter 2. Real Vector Spaces
1. Definition and Examples 32
2. Subspaces 37
3. Linear Dependence 44
4. Bases and Dimension 51
5. Coordinates 58
6. Direct Sums 65

Chapter 3. Systems of Linear Equations
1. Notation 74
2. Row-equivalence and Rank 83
3. Gaussian Elimination 94
4. Geometric Interpretation 101
5. Determinants: Definition 106
6. Determinants: Properties and Applications 116
7. Alternate Approaches to Rank 120

Chapter 4. Linear Transformations
1. Definitions and Examples 130
2. The Image and Null Spaces 137
3. Algebra of Linear Transformations 146
4. Matrices of a Linear Transformation 155
5. Composition of Maps 166
6. Matrix Multiplication 176
7. Elementary Matrices 187

Chapter 5. Change of Basis
1. Change in Coordinates 200
2. Change in the Matrix of a Map 208
3. Equivalence Relations 214
4. Similarity 222
5. Invariants for Similarity 231

Chapter 6. Inner Product Spaces
1. Definition and Examples 242
2. The Norm and Orthogonality 247
3. The Gram-Schmidt Process 255
4. Orthogonal Transformations 261
5. Vector Spaces over Arbitrary Fields 270
6. Complex Inner Product Spaces and Hermitian Transformations 277

Chapter 7. Second Degree Curves and Surfaces
1. Quadratic Forms 290
2. Congruence 298
3. Rigid Motions 305
4. Conics 312
5. Quadric Surfaces 321
6. Chords, Rulings, and Centers 329

Chapter 8. Canonical Forms Under Similarity
1. Invariant Subspaces 342
2. Polynomials 348
3. Minimum Polynomials 360
4. Elementary Divisors 367
5. Canonical Forms 377
6. Applications of the Jordan Form 388

Appendix: Determinants
1. A Determinant Function 406
2. Permutations and Existence 412
3. Cofactors, Existence, and Applications 418

Answers and Suggestions 429
Symbol Index 489
Subject Index 491
Preface
Until recently an introduction to linear algebra was devoted primarily to solving systems of linear equations and to the evaluation of determinants. But now a more theoretical approach is usually taken and linear algebra is to a large extent the study of an abstract mathematical object called a vector space. This modern approach encompasses the former, but it has the advantage of a much wider applicability, for it is possible to apply conclusions derived from the study of an abstract system to diverse problems arising in various branches of mathematics and the sciences.

Linear Algebra with Geometric Applications was developed as a text for a sophomore level, introductory course from dittoed material used by several classes. Very little mathematical background is assumed aside from that obtained in the usual high school algebra and geometry courses. Although a few examples are drawn from the calculus, they are not essential and may be skipped if one is unfamiliar with the ideas. This means that very little mathematical sophistication is required. However, a major objective of the text is to develop one's mathematical maturity and convey a sense of what constitutes modern mathematics. This can be accomplished by determining how one goes about solving problems and what constitutes a proof, while mastering computational techniques and the underlying concepts. The study of linear algebra is well suited to this task for it is based on the simple arithmetic properties of addition and multiplication.

Although linear algebra is grounded in arithmetic, so many new concepts
must be introduced that the underlying simplicity can be obscured by terminology. Therefore every effort has been made to introduce new terms only when necessary and then only with sufficient motivation. For example, systems of linear equations are not considered until it is clear how they arise, matrix multiplication is not defined until one sees how it will be used, and complex scalars are not introduced until they are actually needed. In addition, examples are presented with each new term. These examples are usually either algebraic or geometric in nature. Heavy reliance is placed on geometric examples because geometric ideas are familiar and they provide good interpretations of linear algebraic concepts. Examples employing polynomials or functions are also easily understood and they supply nongeometric interpretations. Occasionally examples are drawn from other fields to suggest the range of possible application, but this is not done often because it is difficult to clarify a new concept while motivating and solving problems in another field.

The first seven chapters follow a natural development beginning with an algebraic approach to geometry and ending with an algebraic analysis of second degree curves and surfaces. Chapter 8 develops canonical forms for matrices under similarity and might be covered at any point after Chapter 5. It is by far the most difficult chapter in the book. The appendix on determinants refers to concepts found in Chapters 4 and 6, but it could be taken up when determinants are introduced in Chapter 3.
Importance of Problems  The role of problems in the study of mathematics cannot be overemphasized. They should not be regarded simply as hurdles to be overcome in assignments and tests. Rather they are the means to understanding the material being presented and to appreciating how ideas can be used. Once the role of problems is understood, it will be seen that the first place to look for problems is not necessarily in problem sets. It is important to be able to find and solve problems while reading the text. For example, when a new concept is introduced, ask yourself what it really means; look for an example in which the property is not present as well as one in which it is, and then note the differences. Numerical examples can be made from almost any abstract expression. Whenever an abstract expression from the text or one of your own seems unclear, replace the symbols with particular numerical expressions. This usually transforms the abstraction into an exercise in arithmetic or a system of linear equations. The next place to look for problems is in worked out examples and proved theorems. In fact, the best way to understand either is by working through the computation or deduction on paper as you read, filling in any steps that may have been omitted. Most of our theorems will have fairly simple proofs which can be constructed with little more than a good understanding of what is being claimed and the knowledge of how each term is defined. This does not mean
that you should be able to prove each theorem when you first encounter it; however, the attempt to construct a proof will usually aid in understanding the given proof. The problems at the end of each section should be considered next. Solve enough of the computational problems to master the computational techniques, and work on as many of the remaining problems as possible. At the very least, read each problem and determine exactly what is being claimed. Finally you should often try to gain an overview of what you are doing; set yourself the problem of determining how and why a particular concept or technique has come about. In other words, ask yourself what has been achieved, what terms had to be introduced, and what facts were required. This is a good time to see if you can prove some of the essential facts or outline a proof of the main result.
At times you will not find a solution immediately, but simply attempting to set up an example, prove a theorem, or solve a problem can be very useful. Such an attempt can point out that a term or concept is not well understood and thus lead to further examination of some idea. Such an examination will often provide the basis for finding a solution, but even if it does not, it should lead to a fuller understanding of some aspect of linear algebra.

Because problems are so important, an extensive solution section is provided for the problem sets. It contains full answers for all computational problems and some theoretical problems. However, when a problem requires a proof, the actual development of the argument is one objective of the problem. Therefore you will often find a suggestion as to how to begin rather than a complete solution. The way in which such a solution begins is very important; too often an assumption is made at the beginning of an argument which amounts to assuming what is to be proved, or a hypothesis is either misused or omitted entirely. One should keep in mind that a proof is viewed in its entirety, so that an argument which begins incorrectly cannot become a proof no matter what is claimed in the last line about having solved the problem. A given suggestion or proof should be used as a last resort, for once you see a completed argument you can no longer create it yourself; creating a proof not only extends your knowledge, but it amounts to participating in the development of linear algebra.
Acknowledgments  At this point I would like to acknowledge the invaluable assistance I have received from the many students who worked through my original lecture notes. Their observations when answers did not check or when arguments were not clear have led to many changes and revisions. I would also like to thank Professor Robert B. Gardner for his many helpful comments and suggestions.
A Geometric Model

1. The Field of Real Numbers
2. The Vector Space 𝒱₂
3. Geometric Representation of Vectors of 𝒱₂
4. The Plane Viewed as a Vector Space
5. Angle Measurement in ℰ₂
6. Generalization of 𝒱₂ and ℰ₂
Before beginning an abstract study of vector spaces, it is helpful to have a concrete example to use as a guide. Therefore we will begin by defining a particular vector space and after examining a few of its properties, we will see how it may be used in the study of plane geometry.
1. The Field of Real Numbers
Our study of linear algebra is based on the arithmetic properties of real numbers, and several important terms are derived directly from these properties. Therefore we begin by examining the basic properties of arithmetic. The set of all real numbers will be denoted by R, and the symbol "∈" will mean "is a member of." Thus √2 ∈ R can be read as "√2 is a member of the real number system" or more simply as "√2 is a real number." Now if r, s, and t are any real numbers, then the following properties are satisfied:
Properties of addition:

r + s ∈ R        or R is closed under addition
r + (s + t) = (r + s) + t        or addition is associative
r + s = s + r        or addition is commutative
r + 0 = r        or 0 is an additive identity

For any r ∈ R, there is an additive inverse −r ∈ R such that r + (−r) = 0.
Properties of multiplication:

rs ∈ R        or R is closed under multiplication
r(st) = (rs)t        or multiplication is associative
rs = sr        or multiplication is commutative
r1 = r        or 1 is a multiplicative identity

For any r ∈ R, r ≠ 0, there is a multiplicative inverse r⁻¹ ∈ R such that r(r⁻¹) = 1.
The final property states that multiplication distributes over addition and ties the two operations together:

r(s + t) = rs + rt        a distributive law.
This is a rather special list of properties. On one hand, none of the properties can be derived from the others, while on the other, many properties of real numbers are omitted. For example, it does not contain properties of order or the fact that every real number can be expressed as a decimal. Only certain properties of the real number system have been included, and many other mathematical systems share them. Thus if r, s, and t are thought of as complex numbers and R is replaced by C, representing the set of all complex numbers, then all the above properties are still valid. In general, an algebraic system satisfying all the preceding properties is called a field. The real number system and the complex number system are two different fields, and there are many others. However, we will consider only the field of real numbers in the first five chapters.

Addition and multiplication are binary operations; that is, they are defined only for two elements. This explains the need for associative laws. For if addition were not associative, then r + (s + t) need not equal (r + s) + t and r + s + t would be undefined. The field properties listed above may seem obvious, but it is not too difficult to find binary operations that violate any or all of them.
One phrase in the preceding list which will appear repeatedly is the statement that a set is closed under an operation. The statement is defined for a set of numbers and the operations of addition and multiplication as follows:

Definition  Let S be a set of real numbers. S is closed under addition if r + t ∈ S for all r, t ∈ S. S is closed under multiplication if rt ∈ S for all r, t ∈ S.
For example, if S is the set containing only the numbers 1, 3, 4, then S is not closed under either addition or multiplication. For 3 + 4 = 7 ∉ S and 3·4 = 12 ∉ S, yet both 3 and 4 are in S. As another example, the set of all odd integers is closed under multiplication but is not closed under addition.
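The closure test in this definition is mechanical, so it can be illustrated with a short computational sketch (plain Python; the helper name is_closed is ours, not the text's):

```python
def is_closed(s, op):
    """Return True if op(a, b) lies in s for every pair a, b drawn from s."""
    return all(op(a, b) in s for a in s for b in s)

s = {1, 3, 4}
print(is_closed(s, lambda a, b: a + b))  # False: 3 + 4 = 7 is not in s
print(is_closed(s, lambda a, b: a * b))  # False: 3 * 4 = 12 is not in s

# The odd integers are closed under multiplication but not addition:
odds = {2 * n + 1 for n in range(-50, 50)}
print(is_closed(odds, lambda a, b: a + b))  # False: e.g., 1 + 3 = 4 is even
# while every product of odd integers is again odd.
print(all((a * b) % 2 == 1 for a in [1, 3, 5, -7] for b in [1, 3, 5, -7]))  # True
```

Note that a finite sample can only demonstrate a failure of closure; the positive claim for the odd integers still requires the proof asked for in Problem 2 below.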
Some notation is useful when working with sets. When the elements are easily listed, the set will be denoted by writing the elements within brackets. Therefore {1, 3, 4} denotes the set containing only the numbers 1, 3, and 4. For larger sets, a set-building notation is used which denotes an arbitrary member of the set and states the conditions which must be satisfied by any member of the set. This notation is { | } and may be read as "the set of all ... such that ...." Thus the set of odd integers could be written as {x | x is an odd integer}, "the set of all x such that x is an odd integer." Or it could be written as {2n + 1 | n is an integer}.
Problems

1. Write out the following notations in words:
a. 7 ∈ R.  b. √−6 ∉ R.  c. {0, 5}.  d. {x | x ∈ R, x < 0}.
e. {x ∈ R | x² = −1}.

2. a. Show by example that the set of odd integers is not closed under addition.
b. Prove that the set of odd integers is closed under multiplication.

3. Determine if the following sets are closed under addition or multiplication:
a. {1, −1}.  b. {5}.  c. {x ∈ R | x < 0}.  d. {2n | n is an integer}.
e. {x ∈ R | x ≥ 0}.

4. Using the property of addition as a guide, give a formal definition of what it means to say that "addition of real numbers is commutative."

5. A distributive law is included in the properties of the real number system. State another distributive law which holds and explain why it was not included.
2. The Vector Space 𝒱₂

It would be possible to begin a study of linear algebra with a formal definition of an abstract vector space. However, it is more fruitful to consider an example of a particular vector space first. The formal definition will essentially be a selection of properties possessed by the example. The idea is the same as that used in defining a field by selecting certain properties of real numbers. The mathematical problem is to select enough properties to give the essential character of the example while at the same time not taking so many that there are few examples that share them. This procedure obviously cannot be carried out with only one example in hand, but even with several examples the resulting definition might appear arbitrary. Therefore one should not expect the example to point directly to the definition of an abstract vector space; rather it should provide a first place to interpret abstract concepts.

As with the real number system, a vector space will be more than just a collection of elements; it will also include the algebraic structure imposed by operations on the elements. Therefore to define the vector space 𝒱₂, both its elements and its operations must be given.
The elements of 𝒱₂ are defined to be all ordered pairs of real numbers and are called vectors. The operations of 𝒱₂ are addition and scalar multiplication as defined below:

Vector addition: The sum of two vectors (a₁, b₁) and (a₂, b₂) is defined by: (a₁, b₁) + (a₂, b₂) = (a₁ + a₂, b₁ + b₂).

For example, (2, −5) + (4, 7) = (2 + 4, −5 + 7) = (6, 2).

Scalar multiplication: For any real number r, called a scalar, and any vector (a, b) in 𝒱₂, r(a, b) is a scalar multiple and is defined by r(a, b) = (ra, rb).

For example, 5(3, −4) = (15, −20).
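Since both operations of 𝒱₂ are defined componentwise, the definitions can be transcribed directly into a few lines of code (a sketch in Python; the function names are ours):

```python
def vec_add(u, v):
    """Vector addition in V2: (a1, b1) + (a2, b2) = (a1 + a2, b1 + b2)."""
    return (u[0] + v[0], u[1] + v[1])

def scalar_mul(r, u):
    """Scalar multiplication in V2: r(a, b) = (ra, rb)."""
    return (r * u[0], r * u[1])

print(vec_add((2, -5), (4, 7)))  # (6, 2), as in the example above
print(scalar_mul(5, (3, -4)))    # (15, -20)
```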
Now the set of all ordered pairs of real numbers together with the operations of vector addition and scalar multiplication forms the vector space 𝒱₂. The numbers a and b in the vector (a, b) are called the components of the vector. Since vectors are ordered pairs, two vectors (a, b) and (c, d) are equal if their corresponding components are equal, that is if a = c and b = d.

One point that should be made immediately is that the term "vector" may be applied to many different objects, so in other situations the term may apply to something quite different from an ordered pair of real numbers. In this regard it is commonly said that a vector has magnitude and direction, but this is not true for vectors in 𝒱₂.

The strong similarity between the vectors of 𝒱₂ and the names for points in the Cartesian plane will be utilized in time. However, these are quite different mathematical objects, for 𝒱₂ has no geometric properties and the Cartesian plane does not have algebraic properties. Before relating the vector space 𝒱₂ with geometry, much can be said of its algebraic structure.
Theorem 1.1 (Basic properties of addition in 𝒱₂)  If U, V, and W are any vectors in 𝒱₂, then

1. U + V ∈ 𝒱₂
2. U + (V + W) = (U + V) + W
3. U + V = V + U
4. U + (0, 0) = U
5. For any vector U ∈ 𝒱₂, there exists a vector −U ∈ 𝒱₂ such that U + (−U) = (0, 0).
Proof  Each of these is easily proved using the definition of addition in 𝒱₂ and the properties of addition for real numbers. For example, to prove part 3, let U = (a, b) and V = (c, d) where a, b, c, d ∈ R; then

U + V = (a, b) + (c, d)
= (a + c, b + d)    Definition of vector addition
= (c + a, d + b)    Addition in R is commutative
= (c, d) + (a, b)    Definition of addition in 𝒱₂
= V + U.

The proof of 2 is similar, using the fact that addition in R is associative, and 4 follows from the fact that zero is an additive identity in R. Using the above notation, U + V = (a + c, b + d) and a + c, b + d ∈ R since R is closed under addition. Therefore U + V ∈ 𝒱₂ if U, V ∈ 𝒱₂ and 1 holds. Part 5 follows from the fact that every real number has an additive inverse; thus if U = (a, b),

U + (−a, −b) = (a, b) + (−a, −b) = (a − a, b − b) = (0, 0)

and (−a, −b) can be called −U.
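Because each property in Theorem 1.1 is inherited componentwise from R, the theorem can be spot-checked numerically. The sketch below (our own code, using integer components so that equality tests are exact) verifies properties 2 through 5 on a batch of random vectors:

```python
import random

def vec_add(u, v):
    # componentwise addition, as defined for V2
    return (u[0] + v[0], u[1] + v[1])

random.seed(1)
for _ in range(100):
    u, v, w = [(random.randint(-9, 9), random.randint(-9, 9)) for _ in range(3)]
    assert vec_add(u, vec_add(v, w)) == vec_add(vec_add(u, v), w)  # associative (2)
    assert vec_add(u, v) == vec_add(v, u)                          # commutative (3)
    assert vec_add(u, (0, 0)) == u                                 # identity (4)
    assert vec_add(u, (-u[0], -u[1])) == (0, 0)                    # inverse (5)
print("Theorem 1.1 holds on all sampled vectors")
```

A check of this kind is not a proof, of course; it is the numerical experiment the Preface recommends for making an abstract statement concrete.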
Each property in Theorem 1.1 arises from a property of addition in R and gives rise to similar terminology. (1) states that 𝒱₂ is closed under addition. From (2) and (3) we say that addition in 𝒱₂ is associative and commutative, respectively. The fourth property shows that the vector (0, 0) is an identity for addition in 𝒱₂. Therefore (0, 0) is called the zero vector of the vector space 𝒱₂ and it will be denoted by 0. Finally the fifth property states that every vector U has an additive inverse denoted by −U.

Other properties of addition and a list of basic properties for scalar multiplication can be found in the problems below.
Problems

1. Find the following vector sums:
a. (2, −5) + (3, 2).  c. (2, −3) + (−2, 3).
b. (−6, 1) + (5, −1).  d. (1/2, 1/3) + (−1/4, 2).

2. Find the following scalar multiples:
a. ½(4, −5).  b. 0(2, 6).  c. 3(2, −1/3).  d. (−1)(3, −6).

3. Solve the following equations for the vector U:
a. U + (2, −3) = (4, 7).  c. (−5, 1) + U = (0, 0).
b. 3U + (2, 1) = (1, 0).  d. 2U + (−4)(3, −1) = (1, 6).

4. Show that for all vectors U ∈ 𝒱₂, U + 0 = U.

5. Prove that vector addition in 𝒱₂ is associative; give a reason for each step.

6. Suppose an operation were defined on pairs of vectors from 𝒱₂ by the formula (a, b)∘(c, d) = ac + bd. Would 𝒱₂ be closed under this operation?

7. The following is a proof of the fact that the additive identity of 𝒱₂ is unique, that is, there is only one vector that is an additive identity. Find the reasons for the six indicated steps.

Suppose there is a vector W ∈ 𝒱₂ such that U + W = U for all U ∈ 𝒱₂. Then, since 0 is an additive identity, it must be shown that W = 0.

Let U = (a, b) and W = (x, y)    a.?
then U + W = (a, b) + (x, y)
= (a + x, b + y)    b.?
but U + W = (a, b)    c.?
therefore a + x = a and b + y = b    d.?
so x = 0 and y = 0    e.?
and W = 0.    f.?

8. Following the pattern in problem 7, prove that each vector in 𝒱₂ has a unique additive inverse.

9. Prove that the following properties of scalar multiplication hold for any vectors U, V ∈ 𝒱₂ and any scalars r, s ∈ R:
a. rU ∈ 𝒱₂
b. r(U + V) = rU + rV
c. (r + s)U = rU + sU
d. (rs)U = r(sU)
e. 1U = U where 1 is the number 1.

10. Show that, for all U ∈ 𝒱₂:  a. 0U = 0.  b. −U = (−1)U.

11. Addition in R and therefore in 𝒱₂ is both associative and commutative, but not all operations have these properties. Show by examples that if subtraction is viewed as an operation on real numbers, then it is neither commutative nor associative. Do the same for division on the set of all positive real numbers.
3. Geometric Representation of Vectors of 𝒱₂

The definition of an abstract vector space is to be based on the example provided by 𝒱₂, and we have now obtained all the properties necessary for the definition. However, aside from the fact that 𝒱₂ has a rather simple definition, the preceding discussion gives no indication as to why one would want to study it, let alone something more abstract. Therefore the remainder of this chapter is devoted to examining one of the applications for linear algebra, by drawing the connection between linear algebra and Euclidean geometry. We will first find that the Cartesian plane serves as a good model for algebraic concepts, and then begin to see how algebraic techniques can be used to solve geometric problems.
Let E² denote the Cartesian plane, that is the Euclidean plane with a Cartesian coordinate system. Every point of the plane E² is named by an ordered pair of numbers, the coordinates of the point, and thus can be used to represent a vector of 𝒱₂ pictorially. That is, for every vector U = (a, b), there is a point in E² with Cartesian coordinates a and b that can be used as a picture of U. And conversely, for every point with coordinates (x, y) in the plane, there is a vector in 𝒱₂ which has x and y as components. Now if the vectors of 𝒱₂ are represented as points in E², how are the operations of vector addition and scalar multiplication represented?

Suppose U = (a, b) and V = (c, d) are two vectors in 𝒱₂; then U + V = (a + c, b + d), and Figure 1 gives a picture of these three vectors in E². A little plane geometry shows that when the four vectors 0, U, V, and U + V are viewed as points in E², they lie at the vertices of a parallelogram, as in Figure 2. Thus the sum of two vectors U and V can be pictured as the fourth vertex of the parallelogram having the two lines from the origin to U and V as two sides.

To see how scalar multiples are represented, let U = (a, b) and r ∈ R; then rU = (ra, rb). If a ≠ 0, then the components of the scalar multiple rU satisfy the equation rb = (b/a)ra. That is, if rU = (x, y), then the components
[Figure 1: the vectors U = (a, b), V = (c, d), and U + V = (a + c, b + d) plotted as points in E².]
[Figure 2: the parallelogram with vertices 0, U, V, and U + V.]

[Figure 3: scalar multiples of U, including 0U = 0, lying on a line through the origin.]
satisfy the equation of the line given by y = (b/a)x. Conversely, any point on the line with equation y = (b/a)x has coordinates (x, (b/a)x), and the vector (x, (b/a)x) equals (x/a)(a, b) which is a scalar multiple of U. In Figure 3 several scalar multiples of the vector U = (2, 1) are shown on the line which represents all the scalar multiples of (2, 1). If a = 0 and b ≠ 0, then the set of all multiples of U is represented by the vertical axis in E². If U = 0, then all multiples are again 0. Therefore in general, the set of all scalar multiples of a nonzero vector U are represented by the points on the line in E² through the origin and the point representing the vector U.

This interpretation of scalar multiplication provides vector equations for lines through the origin. Suppose ℓ is a line in E² passing through the
origin and U is any nonzero vector represented by a point on ℓ. Then identifying points and vectors, every point on ℓ is tU for some scalar t. Letting P denote this variable point, P = tU, t ∈ R is a vector equation for ℓ, and the variable t is called a parameter.

Example 1  Find a vector equation for the line ℓ with Cartesian equation y = 4x.

ℓ passes through the origin and the point (1, 4). Therefore P = t(1, 4), t ∈ R is a vector equation for ℓ. If the variable point P is called (x, y), then (x, y) = t(1, 4) yields x = t, y = 4t and eliminating the parameter t gives y = 4x. Actually there are many vector equations for ℓ. Since (−3, −12) is on ℓ, P = s(−3, −12), s ∈ R is another vector equation for ℓ. But s(−3, −12) = (−3s)(1, 4), so the two equations have the same graph.

Using the geometric interpretation of vector addition, we can write a vector equation for any line in the plane. Given a line ℓ through the origin, suppose we wish to find a vector equation for the line m parallel to ℓ passing through the point V. If U is a nonzero vector on ℓ, then tU is on ℓ for each t ∈ R. Now 0, tU, V, and tU + V are vertices of a parallelogram, Figure 4, and m is parallel to ℓ. Therefore tU + V is on m for each value of t. In fact every point on m can be expressed as tU + V for some t, so P = tU + V, t ∈ R is a vector equation for the line m.
Example 2  Find a vector equation for the line m passing through the point (1, 3) and parallel to the line ℓ with Cartesian equation x = 2y.

The point (2, 1) is on ℓ so P = t(2, 1) + (1, 3), t ∈ R is a vector equation
[Figure 4: the parallelogram with vertices 0, tU, V, and tU + V, showing the line m through V parallel to ℓ.]
for m. Setting P = (x, y) and eliminating the parameter t from x = 2t + 1 and y = t + 3 gives the Cartesian equation x = 2y − 5. The same Cartesian equation for m can be obtained using the point-slope formula.
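Eliminating the parameter as in Example 2 can be double-checked by sampling the parametric form directly (a small Python check of our own):

```python
# P = t(2, 1) + (1, 3) gives x = 2t + 1, y = t + 3; every sampled point
# should satisfy the eliminated Cartesian equation x = 2y - 5.
points = [(2 * t + 1, t + 3) for t in range(-10, 11)]
assert all(x == 2 * y - 5 for x, y in points)
print("all sampled points of m satisfy x = 2y - 5")
```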
Notice the use of the vector U in the equation P = tU + V for m and P = tU for ℓ in Example 2. U determines the direction of m and ℓ, or rather its use results from the fact that they are parallel. Measurement of direction for lines is a relative matter; for example, slope only measures direction relative to given directions, whereas the fact that lines are parallel is independent of coordinate axes. For the vectors in 𝒱₂, direction is best left as an undefined term. However, each nonzero vector of 𝒱₂ can be used to express the direction of a class of parallel lines in E².
Definition  If m is a line with vector equation P = tU + V, t ∈ R, then any nonzero scalar multiple of U gives direction numbers for m.

Example 3  The line ℓ with Cartesian equation y = 3x has P = t(1, 3) as a vector equation. So (1, 3), 2(1, 3) = (2, 6), −3(1, 3) = (−3, −9) are all direction numbers for ℓ. Notice that 3/1 = 6/2 = −9/−3 = 3 is the slope of ℓ.
In general, if a line has slope p, then it has direction numbers (1, p).
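The relationship between slope and direction numbers is easy to confirm numerically: any nonzero multiple of (1, p) has the same ratio of components, hence determines the same slope (a brief sketch of our own, with p = 3 as in Example 3):

```python
p = 3  # slope of the line in Example 3
# each nonzero multiple r(1, p) gives direction numbers (r, rp),
# and the component ratio rp/r recovers the slope p
for r in (1, 2, -3):
    dx, dy = r * 1, r * p
    assert dy / dx == p
print("every nonzero multiple of (1, 3) has component ratio 3")
```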
Example 4  Find a vector equation for the line m with Cartesian
equation y = 3x + 5.
m is parallel to y = 3x, so (1, 3) gives direction numbers for m. It passes
through the point (−1, 2), so P = t(1, 3) + (−1, 2), t ∈ R is a vector equation
for m. To check the result, let P = (x, y); then x = t − 1 and y = 3t + 2.
Eliminating t yields y = 3x + 5.
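As a quick numerical sketch (mine, not the book's), the check in Example 4 can be automated: every point generated by the vector equation P = t(1, 3) + (−1, 2) should satisfy the Cartesian equation y = 3x + 5. The helper name is hypothetical.

```python
# Hypothetical helper tracing the line P = t(1, 3) + (-1, 2) and checking
# that each sample point satisfies y = 3x + 5.
def point_on_m(t):
    return (t * 1 - 1, t * 3 + 2)  # (x, y) = t(1, 3) + (-1, 2)

for t in [-2.0, 0.0, 0.5, 3.0]:
    x, y = point_on_m(t)
    assert abs(y - (3 * x + 5)) < 1e-12
```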
Aside from providing a parameterization for lines, this vector approach
will generalize easily to yield equations for lines in 3-space. In contrast, the
approach to lines in the plane found in analytic geometry does not generalize
to lines in 3-space.
There are two possible points of confusion in this representation of
vectors with points. The first is in the nature of coordinates. To say that a
point P has coordinates (x, y) is not to say that P equals (x, y). An object is
not equal to its name. However, a statement such as "consider the point with
coordinates (x, y)" is often shortened to "consider the point (x, y)." But such
a simplification in terminology should not be made until it is clear that points
12 1. A GEOMETRIC MODEL
and coordinates are different. The second point of confusion is between
vectors and points. There is often the feeling that "a vector really is a point
isn't it?" The answer is emphatically no. If this is not clear, consider the situa-
tion with a number line such as a coordinate axis. When the identification is
made between real numbers and points on a line, no one goes away saying
"but of course, numbers are really points!"
Problems
1. For each of the vectors (3, −4), (0, −1/2), and (2, 3) as U:
a. Plot the points in E² which represent U, −U, ½U, 0U, and 2U.
b. Without using a formula, find a Cartesian equation for the line representing
all scalar multiples of U.
2. Plot the points in E² which represent the vectors O, U, V, and U + V and sketch
the parallelogram they determine when
a. U = (1, 4); V = (3, 2).  b. U = (1, 1); V = (−1, 1).
3. How are the points representing U, V, and U + V related if U is a scalar multi-
ple of V and V ≠ O?
4. Find a vector equation for the lines with the following Cartesian equations:
a. y = x.  b. y = 5x.  c. 2x − 3y = 0.  d. y = 0.
e. y = 5x − 6.  f. x = 4.  g. 2x − 4y + 3 = 0.
5. Find direction numbers for each of the lines in problem 4.
6. Find a vector equation for the line through
a. (5, −2) and parallel to 3x = 2y.
b. (4, 0) and parallel to y = x.
c. (1, 2) and parallel to x = 5.
7. Find Cartesian equations for the lines with the given vector equations.
a. P = t(6, 2) + (7, 3), t ∈ R.
b. P = t(5, 5) + (5, 5), t ∈ R.
c. P = t(2, 8) + (1, 2), t ∈ R.
8. Show that if P = tU + V, t ∈ R and P = sA + B, s ∈ R are two vector equa-
tions for the same line, then U is a scalar multiple of A.
4. The Plane Viewed as a Vector Space
Pictures of vectors as points in the Cartesian plane contain Euclidean
properties which are lacking in 𝒱₂. For example, the distance from the origin
to the point (a, b) in E² is √(a² + b²), but this number does not represent any-
thing in 𝒱₂. Similarly angles can be measured in E², while angle has no mean-
ing in 𝒱₂. In order to bring the geometry of E² into the vector space, we
could define the length of a vector (a, b) to be √(a² + b²), but this definition
would come from the plane rather than the vector space. The algebraic
approach is to add to the algebraic structure of the vector space so that the
properties of length and angle in the plane will represent properties of vectors.
The following operation will do just that.
Definition  Let U = (a, b) and V = (c, d); then the dot product of U
and V is U ∘ V = (a, b) ∘ (c, d) = ac + bd.
For example,

(2, −6) ∘ (2, 1) = 2·2 + (−6)·1 = −2;
(3, −4) ∘ (−4, −3) = −12 + 12 = 0; and (1, 1) ∘ (1, 1) = 2.
Notice that the dot product of two vectors is a scalar or number, not a
vector. Therefore the dot product is not a multiplication of vectors in the
usual sense of the word; for when two things are multiplied, the product
should be the same type of object.
The algebraic system consisting of all ordered pairs of real numbers
together with the operations of addition, scalar multiplication, and the dot
product is not really 𝒱₂. This new space will be denoted by ℰ₂. The vector
space ℰ₂ differs from 𝒱₂ only in that it possesses the dot product and 𝒱₂ does
not. The similarity between the notations E² and ℰ₂ is not accidental; it will
soon become evident that ℰ₂ is essentially the Cartesian plane together with
the algebraic structure of a vector space.
Notice that for any vector U = (a, b) in ℰ₂, U ∘ U = a² + b². This is
the square of what we thought of calling the length of U.
Definition  The length or norm of a vector U ∈ ℰ₂, denoted ‖U‖, is defined
by ‖U‖ = √(U ∘ U).
The notation ‖U‖ may be read "the length of U." In terms of compo-
nents, if U = (a, b), then ‖(a, b)‖ = √(a² + b²). For example,

‖(2, −1)‖ = √(4 + 1) = √5,  ‖(3/5, 4/5)‖ = 1,  and  ‖O‖ = 0.
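The definitions of dot product and norm translate directly into a short program. The sketch below is my own (the helper names are not from the text) and checks the worked examples above.

```python
import math

# Dot product and norm on ordered pairs, as defined for E2:
# U o V = ac + bd and ||U|| = sqrt(U o U).
def dot(u, v):
    (a, b), (c, d) = u, v
    return a * c + b * d

def norm(u):
    return math.sqrt(dot(u, u))

assert dot((2, -6), (2, 1)) == -2
assert dot((3, -4), (-4, -3)) == 0
assert math.isclose(norm((2, -1)), math.sqrt(5))
assert math.isclose(norm((3/5, 4/5)), 1.0)
```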
Now with the length of a vector U = (a, b) equal to the distance between
the origin and the point with coordinates (a, b) in E², the line segment between
the origin and (a, b) could be used to represent both the vector U and its
Figure 5
length. But each nonzero vector of 𝒱₂, and hence ℰ₂, determines a direction in
the Cartesian plane. Therefore the vector U = (a, b) can be represented picto-
rially by an arrow from the origin to the point with coordinates (a, b). Using
this representation for vectors from ℰ₂, the picture for addition of vectors in
Figure 2 can be redrawn. In Figure 5, the sum U + V is represented by an
arrow which is the diagonal of the parallelogram determined by O, U, V, and
U + V. This figure illustrates the parallelogram rule of addition, which states
that the sum of two vectors, viewed as arrows, is the diagonal (at the origin)
of the parallelogram they determine.

An arrow is not necessarily the best way to represent a vector from ℰ₂.
Viewing vectors as arrows would not improve the picture of all the scalar
multiples of a vector. The choice of which representation to use depends on
the situation.
An arrow which is not based at the origin may be viewed as representing
a vector. This arises from a representation of the difference of two vectors and
leads to many geometric applications. For U, V ∈ ℰ₂, U − V = U + (−V),
so that viewing U, V, −V, and U − V as arrows in E² gives a picture such as
that in Figure 6. If these four vectors are represented by points and a few lines
are added, then two congruent triangles with corresponding sides parallel are
obtained, as shown in Figure 7. Therefore the line segment from V to U has
the same length and is parallel to the line from the origin to U − V. In other
words, the arrow from V to U has the same length and direction as the arrow
representing the difference U − V. Therefore the arrow from V to U can also
be used to represent U − V, as shown in Figure 8. This representation can be
quite confusing if one forgets that the arrow is only a picture for an ordered
pair of numbers; it is not the vector itself. For example, when U = (4, 1) and
V = (3, 7), the difference U − V = (1, −6) is an ordered pair. The simplest
representation of the vector (1, −6) is either as a point or as an arrow at the
origin ending at the point with coordinates (1, −6). There is nothing in the
Figure 6

Figure 7
vector (1, -6) which indicates that it should be viewed as some other arrow.
A vector equation for the line determined by two points can be obtained
using the above representation of a vector difference. Suppose A and B are
two points on a line ℓ in E². Viewing these points as vectors, the difference
B − A may be represented by an arrow from A to B; see Figure 9. Or B − A
is on the line parallel to ℓ which passes through the origin. Therefore the
Figure 8

Figure 9
difference B − A gives direction numbers for the line ℓ, and a vector equa-
tion for ℓ is P = t(B − A) + A, t ∈ R.
Example 1  Find a vector equation for the line ℓ through (−2, 3)
and (4, 6).
ℓ has direction numbers (6, 3) = (4, 6) − (−2, 3), and (−2, 3) is a
point on the line. Thus P = t(6, 3) + (−2, 3), t ∈ R is a vector equation for
ℓ. Several other equations could be obtained for ℓ, such as P = s(−6, −3)
+ (4, 6), s ∈ R. These equations have the same graph, so for each particular
value of t, there is a value of s which gives the same point. In this case
this occurs when s = 1 − t.
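A small numeric sketch (my own check, not from the text) confirms that the two vector equations in Example 1 trace the same line, with parameters related by s = 1 − t.

```python
# Two parameterizations of the same line from Example 1.
def p1(t):  # P = t(6, 3) + (-2, 3)
    return (6 * t - 2, 3 * t + 3)

def p2(s):  # P = s(-6, -3) + (4, 6)
    return (-6 * s + 4, -3 * s + 6)

for t in [-1.0, 0.0, 0.25, 2.0]:
    assert p1(t) == p2(1 - t)
```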
Figure 10
Example 2  Find the midpoint M of the line segment joining the
two points A and B in the plane E².
If A, B, and M are viewed as vectors, then B − A can be thought of as
an arrow from A to B. M is in the middle of this arrow, but ½(B − A) is not
M. Rather, ½(B − A) can be represented by the arrow from A to M, as in
Figure 10. Therefore ½(B − A) = M − A, or A must be added to ½(B − A)
to obtain M; and M = A + ½(B − A) = ½(A + B).
M = ½(A + B) is a simple vector formula for the midpoint of a line
segment which does not explicitly involve the coordinates of points. One of
the main strengths of the vector approach to geometry is the ability to express
relationships in a form free of coordinates. On the other hand, the vector
formula can easily be converted to the usual expression for the coordinates of
a midpoint in analytic geometry. For if the point A has coordinates (x₁, y₁)
and B has coordinates (x₂, y₂), then M = ½(A + B) = (½(x₁ + x₂), ½(y₁ + y₂)).
The vector form of the midpoint formula is used in the next example to obtain
a short proof of a geometric theorem.
Example 3  Prove that the line joining the midpoints of two sides
of a triangle is parallel to and half the length of the third side.
Suppose A, B, and C are the vertices of a triangle, M is the midpoint of
side AB and N is the midpoint of side BC, as in Figure 11. Then it is necessary
to show that the line MN is half the length of and parallel to AC. In terms of
vectors this is equivalent to showing that N − M = ½(C − A). But we know
that M = ½(A + B) and N = ½(B + C), so

N − M = ½(B + C) − ½(A + B) = ½B + ½C − ½A − ½B
      = ½(C − A).
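A numeric check of Example 3 for one sample triangle (my own sketch): the vector from the midpoint of AB to the midpoint of BC equals half of C − A.

```python
# Midpoint formula M = (A + B)/2 and the midsegment identity N - M = (C - A)/2.
def midpoint(p, q):
    return tuple((a + b) / 2 for a, b in zip(p, q))

A, B, C = (0.0, 0.0), (4.0, 2.0), (6.0, -2.0)
M = midpoint(A, B)          # midpoint of side AB
N = midpoint(B, C)          # midpoint of side BC
NM = tuple(n - m for n, m in zip(N, M))
half_CA = tuple((c - a) / 2 for c, a in zip(C, A))
assert NM == half_CA
```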
Figure 11
Problems
1. Compute the following dot products:
a. (2, −1) ∘ (3, 4).  b. (0, 0) ∘ (1, 7).
c. (2, −1) ∘ (1, 2).  d. (5, −1) ∘ (−4, 3).
2. Find the lengths of the following vectors from ℰ₂:
a. (1, 0).  b. (1/√2, 1/√2).  c. (−4, 3).  d. (1, −3).
3. Show that ‖rU‖ = |r| ‖U‖ for any vector U ∈ ℰ₂ and any scalar r.
4. Let U, V, W ∈ ℰ₂.
a. What is the meaning of U ∘ (V ∘ W)?
b. Show that U ∘ (V + W) = U ∘ V + U ∘ W.
5. Use arrows from the origin in E² to represent the vectors U, V, U + V, −V,
and U − V = U + (−V) when
a. U = (2, 4), V = (−2, 3).  b. U = (5, 1), V = (2, −4).
6. Find a vector equation for the line through the given pair of points.
a. (2, -5), ( -1, 4). b. (4, 7), (10, 7). c. (6, 0), (3, 7).
7. Find the midpoint of each line segment joining the pairs of points in problem 6.
8. Let A and B be two points. Without using components, find a vector formula
for the point Q which is on the line segment between A and B and one-fifth
the distance to A from B.
9. Without using components, prove that the midpoints of the sides of any
quadrilateral are the vertices of a parallelogram.
10. Let A, B, and C be the vertices of a triangle. Without using components, show
that the point on a median which is 2/3 of the distance from the vertex to the
opposite side can be expressed as ⅓(A + B + C). Conclude from this that
the medians of a triangle are concurrent.
11. Prove without using components that the diagonals of a parallelogram bisect
each other. Suggestion: Show that the midpoints of the diagonals coincide.
5. Angle Measurement in ℰ₂
The dot product was used to define length for vectors in ℰ₂ after first
finding what the length should be in order to have geometric meaning. The
definition of angle can be motivated in the same way. We need the properties
listed below, but a detailed discussion of them will be postponed to Chapter 6.
Theorem 1.2  Let U, V, W ∈ ℰ₂ and r ∈ R; then
1. U ∘ V = V ∘ U.
2. U ∘ (V + W) = U ∘ V + U ∘ W.
3. (rU) ∘ V = r(U ∘ V).
4. U ∘ U ≥ 0, and U ∘ U = 0 if and only if U = O.
To motivate the definition of an angle in ℰ₂, suppose U and V are two
vectors such that one is not a scalar multiple of the other. Representing U, V,
and U − V with arrows in E² yields a triangle as in Figure 12. Then the angle
θ between the arrows representing U and V is a well-defined angle inside the
triangle. Since the three sides of the triangle can be expressed in terms of the
given vectors, the law of cosines has the form

‖U − V‖² = ‖U‖² + ‖V‖² − 2‖U‖‖V‖ cos θ.

Figure 12
Using Theorem 1.2 and the definition of length,

‖U − V‖² = (U − V) ∘ (U − V) = U ∘ U − U ∘ V − V ∘ U + V ∘ V
         = ‖U‖² − 2U ∘ V + ‖V‖²,

so the law of cosines gives us the relation U ∘ V = ‖U‖‖V‖ cos θ.
This relation ties together the angle between the arrows and the vectors they
represent, making it clear how angles should be defined in ℰ₂.

Definition  If U and V are nonzero vectors in ℰ₂, the angle between them
is the angle θ which satisfies 0° ≤ θ ≤ 180° and

cos θ = U ∘ V/(‖U‖‖V‖).

If U or V is the zero vector, then the angle between them is taken to be zero.
With this definition, angles in ℰ₂ will have the usual geometric inter-
pretation when the vectors of ℰ₂ are represented by arrows in E². The second
part of the definition insures that an angle is defined for any pair of vectors,
and we can write U ∘ V = ‖U‖‖V‖ cos θ for all vectors U, V from ℰ₂.
Example 1  If U = (1, 2) and V = (−1, 3), then the angle between
U and V satisfies

cos θ = (1, 2) ∘ (−1, 3)/(‖(1, 2)‖‖(−1, 3)‖) = 1/√2.

Example 2  If U = (1, 1), V = (−1, 2) and θ is the angle between
U and V, then cos θ = 1/√10, or θ = cos⁻¹(1/√10). It would be possible
to obtain a decimal approximation for θ if required.

Example 3  When U = (4, 6) and V = (−3, 2), cos θ = 0. There-
fore the angle between U and V is 90°, and if U and V are represented by
arrows in E², then the arrows are perpendicular.
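The angle formula can be sketched in a few lines (the helper names are mine, not the book's); the assertions mirror Examples 1 and 3 above.

```python
import math

# Angle between vectors via cos(theta) = U o V / (||U|| ||V||).
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def angle_deg(u, v):
    c = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    return math.degrees(math.acos(c))

assert math.isclose(angle_deg((1, 2), (-1, 3)), 45.0)   # cos = 1/sqrt(2)
assert math.isclose(angle_deg((4, 6), (-3, 2)), 90.0)   # perpendicular pair
```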
Definition  Two nonzero vectors in ℰ₂ are perpendicular or orthogonal
if the angle between them is 90°. The zero vector is taken to be perpendicular
or orthogonal to every vector.
The zero vector is defined to be perpendicular to every vector for con-
venience, for then it is not necessary to continually refer to it as a special case.
Although the definition of perpendicular has a good geometric interpretation,
it is easier to use the following algebraic condition.
Theorem 1.3  Two vectors U and V in ℰ₂ are perpendicular if and
only if U ∘ V = 0.
Proof  (⇒)* Suppose U and V are perpendicular. Then the angle
θ between U and V is 90° and cos θ = 0, or either U or V is the zero vector
and ‖U‖ or ‖V‖ is 0. In each case U ∘ V = ‖U‖‖V‖ cos θ = 0.
(⇐) Suppose U ∘ V = 0. Then since U ∘ V = ‖U‖‖V‖ cos θ, one of the
three factors is 0. If ‖U‖ or ‖V‖ is 0, then U or V is the zero vector, so U and V
are perpendicular. If cos θ = 0, then θ = 90° and U is again perpendicular
to V.
Example 4  Let ℓ be the line given by P = t(2, 5) + (7, 1), t ∈ R. Find
a vector equation for the line m passing through (1, 0) which is perpendicular
to ℓ.
ℓ has direction numbers (2, 5); therefore the direction numbers (a, b) for
m must satisfy 0 = (2, 5) ∘ (a, b) = 2a + 5b. There are many solutions for
this equation besides a = b = 0. Taking (a, b) = (5, −2) yields the equation
P = t(5, −2) + (1, 0), t ∈ R, for the line m.
Problems
1. Find the angles between the following pairs of vectors.
a. (1, 5), (−10, 2).  b. (1, −1), (−1, 3).  c. (1, √3), (√6, √2).
2. Find a vector equation for the line passing through the given point and perpen-
dicular to the given line.
a. (0, 0), P = t(1, −3), t ∈ R.  b. (5, 4), P = t(4, 4), t ∈ R.
c. (2, 1), P = t(0, 7) + (4, 2), t ∈ R.
*The statement "P if and only if Q" contains two assertions; first "P only if Q" or
"P implies Q," which may be denoted by P ⇒ Q, and second "P if Q" or "P is implied by Q,"
which may be denoted by P ⇐ Q. Each assertion is the converse of the other. An assertion and
its converse are independent statements; therefore a theorem containing an if and only if
statement requires two separate proofs.
3. Show that, in ℰ₂, the vectors (a, b) and (b, −a) are perpendicular.
4. The proof of Theorem 1.3 uses the fact that if U ∈ ℰ₂ and ‖U‖ = 0, then U = O.
Show that this is true.
5. Give a vector proof of the Pythagorean theorem without the use of components.
(The proof is simplest if the vertex with the right angle represents the zero
vector of ℰ₂.)
6. Show that the diagonals of a rhombus (a parallelogram with equal sides) are
perpendicular without using components.
7. Without using components, prove that any triangle inscribed in a circle with
one side a diameter is a right triangle having the diameter as hypotenuse.
(Suppose the center of the circle represents the zero vector of ℰ₂.)
6. Generalization of 𝒱₂ and ℰ₂
At this time the examples provided by 𝒱₂ and ℰ₂ are sufficient to begin
a general discussion of vector spaces. But before leaving the geometric discus-
sion, a few points concerning 3-dimensional Euclidean geometry should
be covered. The vector techniques developed for two dimensions generalize to
n dimensions as easily as to three. Thus 𝒱₂ can be generalized in one process
to an infinite number of vector spaces.
Definition  An ordered n-tuple of numbers has the form (a₁, …, aₙ), where
a₁, …, aₙ ∈ R, and (a₁, …, aₙ) = (b₁, …, bₙ) if and only if a₁ = b₁, …,
aₙ = bₙ.
Thus when n = 1, the ordered n-tuple is essentially a real number; when
n = 2, it is an ordered pair; and when n = 3, it is an ordered triple.
Definition  For each positive integer n, the vector space 𝒱ₙ consists of all
ordered n-tuples of real numbers, called vectors, together with the operation
of vector addition:

(a₁, …, aₙ) + (b₁, …, bₙ) = (a₁ + b₁, …, aₙ + bₙ),

and the operation of scalar multiplication:

r(a₁, …, aₙ) = (ra₁, …, raₙ)  for any r ∈ R.
This definition gives an infinite number of vector spaces, one for each
positive integer. Thus in the vector space 𝒱₄,

(−2, 0, 5, 3) + (8, −3, −6, 1) = (6, −3, −1, 4).

And in 𝒱₆,

2(0, 1, 0, −7, 3, 4) = (0, 2, 0, −14, 6, 8).
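Componentwise addition and scalar multiplication in 𝒱ₙ are easy to sketch in code (function names are my own); the assertions reproduce the two computations above.

```python
# Vector addition and scalar multiplication on n-tuples, as in the
# definition of the space V_n.
def add(u, v):
    assert len(u) == len(v)
    return tuple(a + b for a, b in zip(u, v))

def scale(r, u):
    return tuple(r * a for a in u)

assert add((-2, 0, 5, 3), (8, -3, -6, 1)) == (6, -3, -1, 4)
assert scale(2, (0, 1, 0, -7, 3, 4)) == (0, 2, 0, -14, 6, 8)
```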
The numbers in an ordered n-tuple from 𝒱ₙ will be called components.
Thus 4 is the second component of the vector (3, 4, √2, 0, 1) from 𝒱₅.
Since the vectors of 𝒱₃ are ordered triples, they can be represented by
points in Cartesian 3-space. Such a space has three mutually perpendicular
coordinate axes, with each pair of axes determining a coordinate plane. A
point P is given coordinates (a, b, c) if a, b, and c are the directed distances
from the three coordinate planes to P. If a rectangular box is drawn at the
origin with height, width, and depth equal to |c|, |b|, and |a|, then P is at the
vertex furthest from the origin, as in Figure 13. Now P can be used as a
pictorial representation of the vector in 𝒱₃ with components a, b, and c.
Since we live in a 3-dimensional world, we cannot draw such pictures
for vectors from 𝒱ₙ when n > 3. This does not affect the usefulness of these
spaces, but at first you may wish to think of 𝒱ₙ as either 𝒱₂ or 𝒱₃.
The vector notation used in the previous sections is free of components.
That is, capital letters were used to denote vectors rather than ordered pairs.
For example, the midpoint formula was written as M = ½(A + B) and the
equation of a line as P = tU + V, t ∈ R. Now that there are other vector
Figure 13
spaces, the question arises: when can these vectors be interpreted as coming
from 𝒱ₙ? The answer is that all the general statements about 𝒱₂ carry over
to 𝒱ₙ. Thus it can be shown that addition in 𝒱ₙ is associative and commuta-
tive, or that r(U + V) = rU + rV for all U, V ∈ 𝒱ₙ and r ∈ R. Everything
carries over directly to 𝒱ₙ.
The properties of a line are the same in all dimensions. A line is either
determined by two points or there is a unique line parallel to a given line
through a given point. Therefore, the various vector statements about lines in
the plane can be regarded as statements about lines in higher dimensions.
This means that the set of all scalar multiples of a nonzero vector from 𝒱ₙ
can be pictured as a line through the origin in n-dimensional Cartesian space.
Example 1  The three coordinate axes of 3-space have vector equations
P = (t, 0, 0), P = (0, t, 0), and P = (0, 0, t) as t runs through the real num-
bers. The line passing through the point with coordinates (1, 3, −2) and the
origin has the vector equation P = t(1, 3, −2), t ∈ R.
In particular, each nonzero vector in 𝒱ₙ gives direction numbers for a
set of parallel lines in n-dimensional Cartesian space. Notice that the measure-
ment of slope used in the plane is a ratio requiring only two dimensions and
cannot be applied to lines in three or more dimensions.
Example 2  The graph of the vector equation P = t(1, 3, −2) +
(4, 0, 1), t ∈ R is the line with direction numbers (1, 3, −2), passing through
the point with coordinates (4, 0, 1). It is also parallel to the line considered in
Example 1.
Euclidean properties, that is, length and angle, are again introduced
using the dot product.
Definition  The dot product of the n-tuples U = (a₁, …, aₙ) and V =
(b₁, …, bₙ) is U ∘ V = a₁b₁ + ⋯ + aₙbₙ. The algebraic system consisting
of all ordered n-tuples, vector addition, scalar multiplication, and the dot
product will be denoted by ℰₙ.
The space ℰ₂ was obtained from 𝒱₂ using the plane E² as a guide, and
3-dimensional geometric ideas will provide a basis for viewing properties
of ℰ₃. But for higher dimensions there is little to build on. Therefore, for
dimensions larger than three the procedure can be reversed, and properties
of n-dimensional Cartesian space can be found as geometric properties of ℰₙ.
That is, we can gain geometric insights by studying the vector spaces ℰₙ when
n > 3.
As with 𝒱₂ and the spaces 𝒱ₙ, the properties of ℰ₂ carry over directly
to ℰₙ. The terms length, angle, and perpendicular are defined in just the same
way. Thus the vectors (4, 1, 0, 2) and (−1, 2, 7, 1) are perpendicular in ℰ₄
because

(4, 1, 0, 2) ∘ (−1, 2, 7, 1) = −4 + 2 + 0 + 2 = 0.

Arrows, triangles, and parallelograms look the same in two, three, or higher
dimensions, so the geometric results obtained in the plane also hold in each
of the other spaces.
Example 3  Find a vector equation for the line ℓ in 3-space passing
through the points with coordinates (2, 5, −4) and (7, −1, 2).
Direction numbers for ℓ are given by (7, −1, 2) − (2, 5, −4) =
(5, −6, 6), and ℓ passes through the point (7, −1, 2). Therefore, P =
t(5, −6, 6) + (7, −1, 2), t ∈ R is a vector equation for ℓ. As in the 2-
dimensional case, there are many other equations for ℓ. But notice that
setting P = (x, y, z) yields three parametric equations

x = 5t + 7,  y = −6t − 1,  z = 6t + 2,  t ∈ R.

It is not possible to eliminate t from these three equations and obtain a
single Cartesian equation. In fact three equations are obtained by eliminating
t, which may be written in the form

(x − 7)/5 = (y + 1)/(−6) = (z − 2)/6.

These are called symmetric equations for ℓ. They are the equations of three
planes that intersect in ℓ.
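A quick numerical sketch (my own check, not from the text): points of the parametric line in Example 3 satisfy the symmetric equations, since all three ratios reduce to t.

```python
# Points of P = t(5, -6, 6) + (7, -1, 2) satisfy
# (x - 7)/5 = (y + 1)/(-6) = (z - 2)/6.
def point(t):
    return (5 * t + 7, -6 * t - 1, 6 * t + 2)

for t in [-1.0, 0.5, 2.0]:
    x, y, z = point(t)
    r1, r2, r3 = (x - 7) / 5, (y + 1) / -6, (z - 2) / 6
    assert abs(r1 - r2) < 1e-12 and abs(r2 - r3) < 1e-12
```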
Next in complexity after points and lines in 3-space come the planes. We
will use the following characterization of a plane in 3-space: a plane π is
determined by a normal direction N and a point A; that is, π consists of all
lines through A perpendicular to N. This is a perfect situation for the vector
space approach. If P is any point on π, then viewing the points as vectors in
ℰ₃, P − A is perpendicular to N. This gives rise to a simple vector equation,
for P is on the plane π if and only if (P − A) ∘ N = 0. The direction N is
Figure 14
called a normal to the plane, and it makes a nice picture if N is represented by
an arrow perpendicular to π, as in Figure 14. However, it must be remembered
that N is an ordered triple, not an arrow.
Example 4  Find a vector equation and a Cartesian equation for the
plane passing through the point A = (1, −2, 3) with normal N = (4, −1, 2).
A vector equation for this plane is

[P − (1, −2, 3)] ∘ (4, −1, 2) = 0.

Now suppose the arbitrary point P has coordinates (x, y, z). Then the vector
equation

[(x, y, z) − (1, −2, 3)] ∘ (4, −1, 2) = 0

becomes

(x − 1)4 + (y + 2)(−1) + (z − 3)2 = 0

or

4x − y + 2z = 12.

That is, the equation of this plane is a first degree or linear equation
in the coordinates.
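A numeric sketch of Example 4 (my own check): sample points satisfying the Cartesian equation 4x − y + 2z = 12 also satisfy the vector equation (P − A) ∘ N = 0.

```python
# Plane through A = (1, -2, 3) with normal N = (4, -1, 2).
def dot3(u, v):
    return sum(a * b for a, b in zip(u, v))

A, N = (1, -2, 3), (4, -1, 2)
for P in [(1, -2, 3), (3, 0, 0), (0, 0, 6), (2, 2, 3)]:
    assert 4 * P[0] - P[1] + 2 * P[2] == 12                      # Cartesian form
    assert dot3(tuple(p - a for p, a in zip(P, A)), N) == 0      # vector form
```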
This example suggests that the graph of a first degree equation in 3-space
is a plane, not a line.
Theorem 1.4  The Cartesian equation of a plane in 3-space is a linear
equation. Conversely, the locus of points (x, y, z) satisfying the linear equa-
tion ax + by + cz = d is a plane with normal (a, b, c).
Proof  The first part is left to the reader; see problem 7 at the end of
this section. For the converse, suppose A = (x₀, y₀, z₀) is a point on the locus;
then ax₀ + by₀ + cz₀ = d. If P = (x, y, z) is a general point on the locus,
then ax + by + cz = d, and putting the equations together gives

ax + by + cz = ax₀ + by₀ + cz₀

or

a(x − x₀) + b(y − y₀) + c(z − z₀) = 0

or

(a, b, c) ∘ [(x, y, z) − (x₀, y₀, z₀)] = 0,

which is N ∘ (P − A) = 0 when N = (a, b, c). That is, P satisfies the equa-
tion of the plane passing through the point A with normal N.
The point of Theorem 1.4 is that planes, not lines, have linear Cartesian
equations in ℰ₃. Since lines in the plane have linear equations, it is often as-
sumed that lines in 3-space must also have linear equations. But a linear equa-
tion should not be associated with a line; rather it is associated with the
figure determined by a point and a normal direction.
Definition  A hyperplane in n-space is the set of all points P satisfying
the vector equation (P − A) ∘ N = 0, where P, A, N ∈ ℰₙ and N ≠ O.

For n = 3 a hyperplane is simply a plane in ℰ₃.
Every hyperplane has a linear Cartesian equation, and conversely.
For P, N, A in the equation (P − A) ∘ N = 0, let P = (x₁, …, xₙ), N =
(a₁, …, aₙ), and A ∘ N = d. Then (P − A) ∘ N = 0, or P ∘ N = A ∘ N, be-
comes a₁x₁ + ⋯ + aₙxₙ = d, a general linear equation in the n coordinates
of P. The converse follows just as in the proof of Theorem 1.4.
This means that a line in the plane E² can be regarded as a hyperplane.
For example, consider the line ℓ in E² with equation y = mx + b. Then

0 = y − mx − b = (x, y) ∘ (−m, 1) − b
  = (x, y) ∘ (−m, 1) − (0, b) ∘ (−m, 1)
  = [(x, y) − (0, b)] ∘ (−m, 1) = (P − A) ∘ N.

Here A = (0, b) is a point on ℓ and N = (−m, 1) is the direction perpen-
dicular to ℓ.
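The line-as-hyperplane view can be sketched numerically (my own check, with hypothetical helper names): points of y = mx + b satisfy (P − A) ∘ N = 0 with A = (0, b) and N = (−m, 1).

```python
# Membership test for the hyperplane (P - A) o N = 0.
def on_hyperplane(P, A, N):
    return sum((p - a) * n for p, a, n in zip(P, A, N)) == 0

m, b = 2, -3
for x in [-2, 0, 1, 5]:
    P = (x, m * x + b)
    assert on_hyperplane(P, (0, b), (-m, 1))
```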
Problems
1. Perform the indicated operations and determine the vector space from which
the vectors come.
a. (2, −1, 1) + (6, 2, 5).
b. 4(2, −3, 1, 0) + 3(−1, 4, 2, 6).
c. (2, 4, −1) ∘ (3, −3, −6).
d. (1, 3, 2, 0, 1) − (1, 4, 3, 6, 5).
e. (2/3, −1/3, 2/3) ∘ (2/3, −1/3, 2/3).
f. (3, 1, −4, 2) ∘ (−1, 1, 3, 4).
2. Prove that U + V = V + U when U, V ∈ 𝒱₃.
3. Prove that (r + s)U = rU + sU when r, s ∈ R and U ∈ 𝒱ₙ.
-12 - 8 - 48 = -68.
Example 3  Show that this definition of determinant agrees with the
definition for 2 × 2 matrices.

|a₁₁ a₁₂|
|a₂₁ a₂₂| = a₁₁A₁₁ + a₁₂A₁₂
          = a₁₁(−1)¹⁺¹|a₂₂| + a₁₂(−1)¹⁺²|a₂₁|
          = a₁₁a₂₂ − a₁₂a₂₁.
5. Determinants: Definition
Example 4  Compute the determinant of

    (3 0 1 0)
A = (1 2 0 0)
    (0 1 3 0)
    (0 0 2 1)

det A = 3A₁₁ + 0A₁₂ + 1A₁₃ + 0A₁₄

             |2 0 0|            |1 2 0|
  = 3(−1)¹⁺¹ |1 3 0| + 1(−1)¹⁺³ |0 1 0|
             |0 2 1|            |0 0 1|

  = 3(6) + 1(1) = 19.
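The definition lends itself to a short recursive program. The sketch below is my own (it always expands along the first row, skipping zero entries as the text's short cut suggests) and reproduces the value 19 from Example 4.

```python
# Recursive cofactor expansion of det(A) along the first row.
def det(m):
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        if m[0][j] == 0:
            continue  # zero entries contribute nothing
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

A = [[3, 0, 1, 0],
     [1, 2, 0, 0],
     [0, 1, 3, 0],
     [0, 0, 2, 1]]
assert det(A) == 19
assert det([[1, 2], [3, 4]]) == -2   # ad - bc for the 2 x 2 case
```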
If it were not for the presence of several zeros in the above matrix, the
evaluation of the 4 × 4 determinant would involve a considerable amount of
arithmetic. A 2 × 2 determinant involves two products. A 3 × 3 determinant
is the sum of three 2 × 2 determinants, each multiplied by a real number,
and therefore involves 3·2 products, each with 3 factors. Similarly, a 4 × 4
determinant involves 4·3·2 = 24 products of four numbers each. In general,
to evaluate an n × n determinant it is necessary to compute n·(n − 1)·
⋯·3·2 = n! products of n factors each. Fortunately there are short cuts,
so that a determinant need not be evaluated in exactly the way it has been
defined. The following theorem is proved in the appendix on determinants,
but at this point it is stated without proof.
Theorem 3.13  Let A = (aᵢⱼ) be an n × n matrix; then for each i,
1 ≤ i ≤ n,

det A = aᵢ₁Aᵢ₁ + aᵢ₂Aᵢ₂ + ⋯ + aᵢₙAᵢₙ.

This is called the expansion of the determinant along the ith row of A.
And for each j, 1 ≤ j ≤ n,

det A = a₁ⱼA₁ⱼ + a₂ⱼA₂ⱼ + ⋯ + aₙⱼAₙⱼ.

This is the expansion of the determinant along the jth column of A.
114 3. SYSTEMS OF LINEAR EQUATIONS
Example 5  Let

    (2 5 1  7)
A = (3 1 0 −1)
    (2 0 0  0)
    (1 3 0  2)

The expansion of det A along the first row, as in the definition, results in four
3 × 3 determinants. But if det A is expanded along the 3rd column or the 3rd
row, then only one 3 × 3 cofactor need be evaluated. Expansion along the
3rd column of A gives

                 |3 1 −1|
det A = 1(−1)¹⁺³ |2 0  0| = 2(−1)²⁺¹ |1 −1| = −2(2 + 3) = −10.
                 |1 3  2|            |3  2|

Expansion along the 3rd row of A gives

                 |5 1  7|
det A = 2(−1)³⁺¹ |1 0 −1| = 2(−1)¹⁺² |1 −1| = −2(2 + 3) = −10.
                 |3 0  2|            |3  2|

The values are the same, as predicted by Theorem 3.13. In the expansions,
the first 3 × 3 minor was expanded along its 2nd row and the second along
its 2nd column.
Two corollaries follow easily from Theorem 3.13.
Corollary 3.14  If a row or column of an n × n matrix A contains only
0's, then det A = 0.
Corollary 3.15  If A is an n × n matrix and B is obtained from A by
multiplying a row or column of A by a scalar r, then det B = r det A.
The second corollary agrees with the idea that a determinant measures
volume. For if the edges of a parallelogram or parallelepiped determined by
one direction are multiplied by r, r > 0, then the area or volume is also
multiplied by r.
Problems
Evaluate the following determinants:
/i il
12 o 31 11 5 91
5. ~
7
g/
l. 2. j(-5)j. 3. 7 2 9 . 4. o 1 2 . 1
5 o 8 o o 1 2
1 2 8 -1 1 o o 1
11 2 31
6.
o o 2 o
7.
o 1 1 1
8. 1 2 3 .
o 3 7
1 .
o 1
o o .
2 1 3 -3 1 1 1 o
5 4 1
9. Show that the four points A, B, C, and D are the vertices of a parallelogram
with D opposite A and find its area when
a. A = (0, 0), B = (2, 5), C = (1, −4), D = (3, 1).
b. A = (2, −5), B = (1, 6), C = (4, 7), D = (3, 18).
c. A = (1, 1), B = (2, 4), C = (3, −2), D = (4, 1).
10. Find the volume of the parallelepiped in 3-space determined by the vectors
a. (1, 0, 0), (0, 1, 0), (0, 0, 1).
b. (3, 1, −1), (4, 2, 2), (−1, 2, −3).
11. Write out the six terms in the expansion of an arbitrary 3 × 3 determinant
and notice that this sum contains every possible product that can be formed
by taking one element from each row and each column. Does this idea
generalize to n × n determinants?
12. Prove Corollary 3.15.
13. Let A be an n × n matrix. Use induction to prove that |Aᵀ| = |A|.
14. For B, C ∈ ℰ₃, let B × C denote the cross product of B and C. Compute the
following cross products:
a. (1, 0, 0) × (0, 1, 0).  d. (2, 3, 2) × (−2, 1, 0).
b. (0, 1, 0) × (1, 0, 0).  e. (−4, 2, −8) × (6, −3, 12).
c. (1, 2, 3) × (1, 2, 3).  f. (3, 0, 2) × (−5, 0, 3).
15. Suppose the definition of a 3 × 3 determinant is extended to allow vectors as
entries in the first row; then the value of such a determinant is a linear com-
bination of the vectors.
a. Show that if B = (b₁, b₂, b₃) and C = (c₁, c₂, c₃), then

           |E₁ E₂ E₃|
   B × C = |b₁ b₂ b₃|
           |c₁ c₂ c₃|

   where {E₁, E₂, E₃} is the standard basis for ℰ₃.
b. Let A = (a₁, a₂, a₃) and show that

   |a₁ a₂ a₃|
   |b₁ b₂ b₃| = A ∘ (B × C).
   |c₁ c₂ c₃|

   A ∘ (B × C) is the scalar triple product of A, B, and C and the volume of
   the parallelepiped determined by A, B, and C in ℰ₃.
16. Use a geometric argument to show that {A, B, C} is linearly dependent if
A ∘ (B × C) = 0.
17. a. Compute [(0, 1, 2) × (3, 0, −1)] × (2, 4, 1) and (0, 1, 2) × [(3, 0, −1) ×
(2, 4, 1)].
b. What can be concluded from part a?
c. What can be said about the expression A × B × C?
6. Determinants: Properties and Applications
The evaluation of a determinant is greatly simplified if it can be expanded
along a row or column with only one nonzero entry. If no such row or
column exists, one could be obtained by performing elementary row opera-
tions on the matrix. The following theorem shows how such operations affect
the value of the determinant.
Theorem 3.16  Suppose A is an n × n matrix and B is obtained from
A by a single elementary row operation; then
1. |B| = −|A| if the elementary row operation is of type I.
2. |B| = r|A| if a row of A is multiplied by a scalar r.
3. |B| = |A| if the elementary row operation is of type III.
Proof  The proof of 1 is by induction on the number of rows in A.
It is not possible to interchange two rows of a 1 × 1 matrix, so consider
2 × 2 matrices. If

    A = (a b)   and   B = (c d),
        (c d)             (a b)

then

    |B| = cb − da = −(ad − bc) = −|A|.
Therefore the property holds for 2 × 2 matrices. Now assume that 1 holds
for all (n − 1) × (n − 1) matrices with n ≥ 3. Let A be an n × n matrix
A = (a_{ij}) and let B = (b_{ij}) be the matrix obtained from A by interchanging
two rows. Since n > 2, at least one row is not involved in the interchange;
say the kth is such a row, and expand |B| along this row. Then |B| =
b_{k1}B_{k1} + b_{k2}B_{k2} + ··· + b_{kn}B_{kn}. For each index j, the cofactor B_{kj} differs from the
cofactor A_{kj} of a_{kj} in A only in that two rows are interchanged. These co-
factors are determinants of (n − 1) × (n − 1) matrices, so by the induction
assumption, B_{kj} = −A_{kj}. Since the kth row of B equals the kth row of A, the
above expansion for |B| becomes

    |B| = a_{k1}(−A_{k1}) + a_{k2}(−A_{k2}) + ··· + a_{kn}(−A_{kn}),

which is the negative of the expansion of |A| along the kth row of A.
Now 1 holds for all 2 × 2 matrices, and if it holds for all (n − 1) ×
(n − 1) matrices (n ≥ 3), then it holds for all n × n matrices. Therefore 1
is proved by induction.
Part 2 was proved in the previous section (Corollary 3.15), and the induc-
tion proof of 3 is left as a problem.
Example 1

    |−5  2 −3|   |7 −4 0|
    | 4 −2  1| = |4 −2 1| = 1·(−1)²⁺³ |7 −4| = −29.
    | 2  3  0|   |2  3 0|             |2  3|

Here the second determinant is obtained from the first by adding three times
the second row to the first row.
Elementary row operations can be used to obtain a matrix with zeros in
a column but usually not in a row. To do this it is necessary to make changes
in the columns of the matrix. Elementary column operations are defined just
as elementary row operations except they are applied to the columns of a
matrix rather than the rows.
Example 2  The following transformation is made with elementary
column operations of type III:

    (3 2 −1)    (3 2 3)    (−1 2 3)
    (2 1 −2) → (2 1 0) → ( 0 1 0).
    (4 3 −5)    (4 3 1)    (−2 3 1)

First 2 times the 2nd column is added to the 3rd, and then −2 times the 2nd
column is added to the 1st.
Theorem 3.17  Suppose A is an n × n matrix and B is obtained from A
with a single elementary column operation; then
1. |B| = −|A| if the elementary column operation is of type I.
2. |B| = r|A| if a column of A is multiplied by r.
3. |B| = |A| if the elementary column operation is of type III.

Proof  Performing an elementary column operation on a matrix has
the same effect as performing an elementary row operation on its transpose.
Therefore this theorem follows from the corresponding theorem for ele-
mentary row operations and the fact that |Aᵀ| = |A| for all n × n matrices A.
Theorem 3.18  If two rows or two columns of an n × n matrix A are
proportional, then |A| = 0.

Proof  If A satisfies the hypothesis, then there is an elementary row or
column operation of type III which yields a matrix having a row or column
of zeros. Thus |A| = 0.
It is possible to evaluate any determinant using only elementary row or
elementary column operations, that is, without computing any cofactors. To
see how this can be done, we will say that the elements a_{ii} of a matrix A =
(a_{ij}) are on the main diagonal of A, and that A is upper triangular if all the
elements below the main diagonal are zero, that is, if a_{ij} = 0 when i > j.
Lower triangular is similarly defined. Then we have the following theorem.

Theorem 3.19  If A = (a_{ij}) is an n × n upper (lower) triangular matrix,
then |A| = a₁₁a₂₂···a_{nn}.

Proof  By induction and left as a problem.
Example 3  Use elementary row operations to evaluate a determi-
nant.

    |2 4 −6|     |1 2 −3|     |1 2 −3|
    |3 1 −4| = 2 |3 1 −4| = 2 |0 −5  5|
    |2 2  5|     |2 2  5|     |0 −2 11|

                     |1 2 −3|       |1 2 −3|
               = −10 |0 1 −1| = −10 |0 1 −1|
                     |0 −2 11|      |0 0  9|

               = −10 · 1 · 1 · 9 = −90.
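The technique of Example 3 can be stated as an algorithm: reduce the matrix to upper triangular form, tracking the sign changes and scalar factors described in Theorem 3.16, then multiply the diagonal as in Theorem 3.19. A sketch in plain Python; the function name is our own.

```python
# Determinant by row reduction to triangular form (a sketch of Example 3).
def det_by_row_reduction(rows):
    a = [row[:] for row in rows]         # work on a copy
    n = len(a)
    det = 1.0
    for j in range(n):
        # find a row at or below position j with a nonzero entry in column j
        pivot = next((i for i in range(j, n) if a[i][j] != 0), None)
        if pivot is None:
            return 0.0                   # a zero column: the determinant is 0
        if pivot != j:
            a[j], a[pivot] = a[pivot], a[j]
            det = -det                   # a type I operation changes the sign
        det *= a[j][j]                   # factor out the pivot (type II)
        p = a[j][j]
        a[j] = [x / p for x in a[j]]
        for i in range(j + 1, n):
            m = a[i][j]                  # type III operations leave det unchanged
            a[i] = [x - m * y for x, y in zip(a[i], a[j])]
    return det

# The matrix of Example 3:
print(det_by_row_reduction([[2, 4, -6], [3, 1, -4], [2, 2, 5]]))  # -90.0
```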
The technique shown in Example 3 is the same as that used in finding the
echelon form of a matrix, except that it is necessary to keep track of the
changes in value that arise when elementary row operations of types I and II
are used.
Elementary row operations change the value of a determinant by at
most a nonzero scalar multiple. Therefore, the determinant of an n × n
matrix A and the determinant of its echelon form can differ at most by a
nonzero multiple. Thus if the echelon form for A has a row of zeros, then
|A| = 0. This gives the following theorem.
Theorem 3.20  The rank of an n × n matrix A is n if and only if
|A| ≠ 0.
Now using determinants, two results from the theory of systems of
linear equations can be restated.
Theorem 3.21  1. A system of n linear equations in n unknowns,
AX = B, has a unique solution if and only if |A| ≠ 0.
2. A homogeneous system of n linear equations in n unknowns, AX = 0,
has a nontrivial solution if and only if |A| = 0.
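Theorem 3.21 can be illustrated numerically. The following sketch assumes numpy; the two matrices are our own examples.

```python
# Illustrating Theorem 3.21 with numpy (assumed): AX = B has a unique
# solution exactly when |A| != 0, and AX = 0 has a nontrivial solution
# exactly when |A| = 0.
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])       # |A| = 5, nonzero
print(np.linalg.det(A))                       # nonzero, so AX = B is uniquely solvable
X = np.linalg.solve(A, np.array([1.0, 2.0]))
print(np.allclose(A @ X, [1.0, 2.0]))

S = np.array([[1.0, 2.0], [2.0, 4.0]])       # proportional rows, so |S| = 0
print(np.isclose(np.linalg.det(S), 0.0))
print(np.allclose(S @ np.array([2.0, -1.0]), 0))  # (2, -1) is a nontrivial solution
```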
Example 4  Determine if

    {(2, 1, 0, −1), (5, 8, 6, −3), (3, 3, 0, 4), (1, 2, 0, 5)}

is a basis for 𝒱₄.
A set of 4 vectors is a basis for 𝒱₄ if it is linearly independent, or equiva-
lently if the matrix with the vectors as rows has rank 4, that is, a nonzero
determinant. But

    |2 1 0 −1|
    |5 8 6 −3|            |2 1 −1|        | 0 1 0|
    |3 3 0  4| = 6(−1)²⁺³ |3 3  4| = −6 |−3 3 7| = 0.
    |1 2 0  5|            |1 2  5|        |−3 2 7|

Since two column operations yield a 3 × 3 matrix with two columns propor-
tional, the determinant is zero and the set fails to be a basis.
Problems

1. Use elementary row and column operations to simplify the evaluation of the
following determinants:

   c. |1 1 0 0|   d. |1 1 0 1|   f. |0 1 2 3|
      |1 1 1 1|      |0 1 1 0|      |1 2 3 0|
      |0 1 1 0|      |1 1 0 0|      |2 3 0 1|
      |0 0 1 1|      |1 1 1 1|      |3 0 1 2|

   [The entries of parts a, b, and e are illegible in this copy.]
2. Let U, V ∈ ℰ₃, and use problem 15 of Section 5 along with the properties of
determinants to prove
a. U × V = −V × U.
b. U × V is perpendicular to both U and V.
c. If U and V are linearly dependent, then U × V = 0.
3. Use determinants to determine if the following sets are linearly independent:
a. {(4, 2, 3), (1, −6, 1), (2, 3, 1)}.
b. {3 + 4i, 2 + i}.
c. {t⁵ + 2t, t³ + t + 1, 2t⁵ + 5t, t³ + 1}.
4. Is the set {(4, 0, 0, 0, 0), (3, −2, 0, 0, 0), (8, 2, 5, 0, 0), (−3, 9, 4, 2, 0),
(0, 5, 3, 1, 7)} a basis for 𝒱₅?
5. Give induction proofs for the following statements.
a. An elementary row operation of type III does not affect the value of a
determinant.
b. The determinant of an upper triangular n × n matrix is the product of the
elements on its main diagonal.
6. Suppose A = (a_{ij}) is an n × n matrix with a₁₁ ≠ 0. Show that det A =
(1/(a₁₁)ⁿ⁻²) det B, where B = (b_{ij}) is the (n − 1) × (n − 1) matrix with 2 ≤ i,
j ≤ n and b_{ij} = a₁₁a_{ij} − a_{i1}a_{1j}. This procedure, called pivotal condensation,
might be used by a computer to reduce the order of a determinant while
introducing only one fraction.
7. Use pivotal condensation to evaluate

   a. |3 7 2|   b. |5 4 2|   c. |2 1 3 2|
      |5 1 4|      |2 7 2|      |1 3 2 1|
      |9 6 8|      |3 2 3|      |2 2 1 1|
                                |3 1 2 1|
8. Prove that interchanging two columns in a determinant changes the sign of the
determinant. (This does not require induction.)
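The pivotal condensation of problem 6 can be sketched as a recursive routine, assuming the (1, 1) entry is nonzero at every step. The function name is our own.

```python
# A sketch of pivotal condensation: det A = det B / a11^(n-2), where
# b_ij = a11*a_ij - a_i1*a_1j for 2 <= i, j <= n (assuming a11 != 0).
def pivotal_condensation(a):
    n = len(a)
    if n == 1:
        return a[0][0]
    p = a[0][0]
    if p == 0:
        raise ValueError("pivot a11 is zero; interchange rows first")
    b = [[p * a[i][j] - a[i][0] * a[0][j] for j in range(1, n)]
         for i in range(1, n)]
    return pivotal_condensation(b) / p ** (n - 2)

# A 2x2 check: |2 1; 5 1| = 2 - 5 = -3.
print(pivotal_condensation([[2, 1], [5, 1]]))          # -3.0
# A 3x3 check against direct expansion.
print(pivotal_condensation([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -3.0
```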
7. Alternate Approaches to Rank
The rank of a matrix is defined as the number of nonzero rows in its
echelon form, which equals the dimension of its row space. For an n × n
matrix A, the rank is n provided |A| ≠ 0, and |A| = 0 says only that the
rank is less than n. However, even if |A| = 0, or A is n × m and the determi-
nant of A is not defined, it is possible to use determinants to find the rank of A.
Definition  A k × k submatrix of an n × m matrix A is a k × k matrix
obtained by deleting n − k rows and m − k columns from A.
The 2 × 3 matrix shown here [entries illegible] has three 2 × 2 submatrices.
It has six 1 × 1 submatrices: (2), (1), (5), etc.
Definition  The determinant rank of an n × m matrix A is the largest
value of k for which some k × k submatrix of A has a nonzero determinant.
Example 1  Let

    A = (4  1 3)
        (2  3 1)
        (2 −2 2).

Then

    |4 1| = 10 ≠ 0   and   |4  1 3|
    |2 3|                  |2  3 1| = 0.
                           |2 −2 2|

Therefore the determinant rank of A is 2.
The rank or row rank of A is also 2, for the echelon form for A is

    (1 0  4/5)
    (0 1 −1/5).
    (0 0   0 )

Theorem 3.22  For any matrix A, the determinant rank of A equals the
rank of A.
Proof  Let A be an n × m matrix with determinant rank s and rank
r. We will show first that s ≤ r and then that r ≤ s.
If the determinant rank of A is s, then A has an s × s submatrix with
nonzero determinant. The s rows of this submatrix are therefore linearly
independent. Hence there are s rows of A that are linearly independent, and
the dimension of the row space of A is at least s. That is, the rank of A is at
least s, or s ≤ r.
For the second inequality, the rank of A is r. Therefore the echelon form
for A, call it B, has r nonzero rows and r columns containing the leading 1's
for these rows. So B has an r × r submatrix, call it C, with 1's on the main
diagonal and 0's elsewhere. Thus |C| = 1 ≠ 0. Now the r nonzero rows of
B span the rows of A; therefore there is a sequence of elementary row opera-
tions which transforms these r rows to r rows of A, leaving the remaining
n − r rows all zero. This transforms the r × r submatrix C to an r × r sub-
matrix of A, and since elementary row operations change the value of a
determinant by at most a nonzero multiple, this r × r submatrix of A has
nonzero determinant. That is, the determinant rank of A is at least r, or r ≤ s.
Example 2  It may be helpful to look at an example of the pro-
cedure used in the second part of the preceding proof.
Let A be a 3 × 4 matrix whose first two rows are (2, 4, 1, 5) and
(5, 10, 1, 8) [the third row is illegible in this copy]. Then

    B = (1 2 0 1)
        (0 0 1 3)
        (0 0 0 0)
is the echelon form for A, and the rank of A is 2. Now

    C = (1 0)
        (0 1)

is a 2 × 2 submatrix of B (columns 1 and 3) with nonzero determinant. A can be transformed to
B with a sequence of elementary row operations; working backward we
obtain a sequence that transforms the nonzero rows of B to the first two
rows of A (in this case any pair of rows in A could be obtained).
    (1 2 0 1) → (1 2 1/2 5/2) → (1 2  1/2  5/2) → (1  2 1/2 5/2) → (2  4 1 5)
    (0 0 1 3)   (0 0  1   3 )   (0 0 −3/2 −9/2)   (5 10  1   8 )   (5 10 1 8).

Performing this sequence of operations on the submatrix C we obtain

    (1 0) → (1 1/2) → (1  1/2 ) → (1 1/2) → (2 1)
    (0 1)   (0  1 )   (0 −3/2)    (5  1 )   (5 1).
The matrix

    (2 1)
    (5 1)

is a 2 × 2 submatrix of A, which must have nonzero determinant; in fact,
the determinant is −3. Thus the determinant rank of A cannot be less than 2,
the rank of A.
Determinant rank might be used to quickly show that the rank of a
3 × 3 matrix with zero determinant is 2, as in Example 1. But in general,
it is not practical to determine the rank of a matrix by finding the determinant
rank. For example, if the rank of a 3 × 5 matrix is 2, it would be necessary
to show that all ten 3 × 3 submatrices have zero determinant before con-
sidering a 2 × 2 submatrix.
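The definition of determinant rank translates directly into a brute-force search over submatrices, which also shows why the method is impractical for large matrices. A sketch assuming numpy; the 3 × 3 matrix with zero determinant is our own example, and the function name is our own.

```python
# Brute-force determinant rank: the largest k for which some k x k
# submatrix has nonzero determinant. numpy and itertools assumed.
import numpy as np
from itertools import combinations

def determinant_rank(A, tol=1e-9):
    n, m = A.shape
    for k in range(min(n, m), 0, -1):
        for rows in combinations(range(n), k):
            for cols in combinations(range(m), k):
                if abs(np.linalg.det(A[np.ix_(rows, cols)])) > tol:
                    return k
    return 0

# A 3 x 3 matrix with zero determinant but a nonzero 2 x 2 submatrix.
A = np.array([[4.0, 1.0, 3.0], [2.0, 3.0, 1.0], [2.0, -2.0, 2.0]])
print(determinant_rank(A))            # 2
print(np.linalg.matrix_rank(A))       # agrees with the row rank (Theorem 3.22)
```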
In this chapter, matrices have been used primarily to express systems of
linear equations. Thus the rows of a matrix contain the important informa-
tion, and rank is defined in terms of rows. But in the next chapter, the
columns will contain the information needed, so the final approach to rank
is in terms of the columns.
Definition  The column space of an n × m matrix A is the subspace of
ℳ_{n×1} spanned by the m columns of A. The column rank of A is the dimension
of its column space.
Example 3  [The two matrices of this example and their column spaces
are illegible in this copy; each matrix has column rank 2.]
The transpose of a matrix turns columns into rows; therefore the column
rank of a matrix A is the rank (row rank) of its transpose Aᵀ. Also, transpos-
ing a k × k matrix does not affect its determinant (problem 13, page 115),
so the determinant ranks of Aᵀ and A are equal. Therefore,

    column rank A = (row) rank Aᵀ = determinant rank Aᵀ
                  = determinant rank A = (row) rank A,

proving the following theorem.
Theorem 3.23  The column rank of a matrix equals its rank.
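Theorem 3.23 can be checked numerically with numpy (assumed); the matrix below is our own example.

```python
# Theorem 3.23 checked numerically: the rank of A and of its transpose agree,
# so row rank equals column rank.
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 1.0],
              [3.0, 6.0, 1.0]])    # row 3 = row 1 + row 2, so rank 2
print(np.linalg.matrix_rank(A))        # 2
print(np.linalg.matrix_rank(A.T))      # 2: column rank equals row rank
```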
Example 4  Show that no set of n − 1 vectors can span the space 𝒱ₙ.
Suppose {U₁, ..., U_{n−1}} ⊂ 𝒱ₙ. It is necessary to find a vector W ∈ 𝒱ₙ
such that the vector equation x₁U₁ + ··· + x_{n−1}U_{n−1} = W has no solution
for x₁, ..., x_{n−1}. This vector equation yields a system of n linear equations
in n − 1 unknowns. If the system is denoted by AX = B, then A is an
n × (n − 1) matrix, and we are looking for B = Wᵀ so that the rank of A*
is not equal to the rank of A.
Suppose the rank of A is k; then k ≤ n − 1, and A has k linearly in-
dependent columns. But the columns of A are vectors in ℳ_{n×1}, which has
dimension n. So there exists a column vector B in ℳ_{n×1} which is independent
of the columns of A. Let this column vector B determine W by W = Bᵀ.
Then the rank of the augmented matrix A* is k + 1, and for the vector W,
the system is inconsistent. That is, x₁U₁ + ··· + x_{n−1}U_{n−1} = W has no
solution, and the n − 1 vectors do not span 𝒱ₙ.
Using elementary column operations, it is possible to define two new
relations between matrices following the pattern of row-equivalence. A
matrix A is column-equivalent to B if B can be obtained from A with a finite
sequence of elementary column operations, and A is equivalent to B if B can
be obtained from A with a finite sequence of elementary row and column
operations. The relation of equivalence for matrices will appear later in
another context.
Example 5  Show that the 3 × 3 matrix given here [illegible] is
equivalent to

    (1 0 0)
    (0 1 0).
    (0 0 0)

Using column and row operations, we obtain a sequence of matrices
[illegible in this copy] ending in the matrix above.
Problems

1. Find a matrix equivalent to the given matrix with 1's on the main diagonal and
0's elsewhere.

   d. (1 2 3 4 0)
      (2 4 0 6 2)
      (3 1 2 1 5)

   [The matrices of parts a, b, c, and e are illegible in this copy.]

2. Find the column space and thus the rank of the following matrices:

   b. ( 2 1 3)   c. (4 0 3)   d. (3 1 2 5)
      (−1 1 0)      (0 2 1)      (0 2 1 1)
      ( 1 5 6)      (0 0 5)      (2 0 1 3)

   [Part a is illegible.]
3. Why is it impossible for the column space of A to equal the row space of A,
unless A is a 1 x 1 matrix?
4. The column space of a matrix A is essentially the same as the row space of Aᵀ.
Show that even for n × n matrices, the row space of A need not equal the row
space of Aᵀ by considering the two matrices given here [entries illegible].
5. How do you read Aᵀ?
6. Use determinant rank to prove that if U × V = 0 for U, V ∈ ℰ₃, then U and V
are linearly dependent.
7. Use the theory of systems of linear equations to prove that if {U₁, ..., Uₙ}
spans 𝒱ₙ, then {U₁, ..., Uₙ} is linearly independent. In your proof, write out
the systems used, showing the coefficients and the unknowns.
Review Problems

1. Suppose AX = B is a system of n linear equations in 2 unknowns. Each
equation determines a line in the plane. Consider all possible values for the
ranks of A and A* and in each case determine how the n lines intersect.
2. Suppose AX = 0 is a system of linear equations obtained in determining if the
set S ⊂ Rₙ[t] is linearly independent. Find a set S for which this system is
obtained if the coefficient matrix A is each of the matrices given here
[entries illegible].
3. Suppose S = {V₁, ..., V_m} ⊂ 𝒱ₙ and m < n. Use the theory of linear equa-
tions and column rank to prove that S does not span 𝒱ₙ.
4. Show that the line in the plane through the points with coordinates (x₁, y₁)
and (x₂, y₂) has the equation

   |x  y  1|
   |x₁ y₁ 1| = 0.
   |x₂ y₂ 1|

5. Use problem 4 to find an equation for the line through the points with co-
ordinates:
a. (1, 0), (0, 1).   b. (3, 4), (2, −5).   c. (5, 1), (8, 9).
6. Using problem 4 as a guide, devise a determinant equation for the plane
through the three points with coordinates (x₁, y₁, z₁), (x₂, y₂, z₂), and
(x₃, y₃, z₃). (What would happen in the equation if the three points were
collinear?)
7. Use elementary row operations to find a Cartesian equation for the follow-
ing planes:
a. {(8a + 8b, 4a + 8b, 8a + 11b) | a, b ∈ R}.
b. ℒ{(5, 15, 7), (2, 6, 8), (3, 9, −2)}.
c. {(2a − 4b + 6c, a + b − 2c, −7a + 2b − c) | a, b, c ∈ R}.
d. ℒ{(2, 3, −2), (−5, −4, 5)}.
8. Show that the plane through the points with coordinates (x₁, y₁, z₁),
(x₂, y₂, z₂), and (x₃, y₃, z₃) has the equation

   |x − x₁   y − y₁   z − z₁ |
   |x₂ − x₁  y₂ − y₁  z₂ − z₁| = 0.
   |x₃ − x₁  y₃ − y₁  z₃ − z₁|
9. Given a system of n linear equations in n unknowns, why is one justified in
expecting a unique solution?
10. In the next chapter, we will find that matrix multiplication distributes over
matrix addition. That is, A(U + V) = AU + AV provided the products are
defined. Use this fact to prove the following:
a. The solution set for a homogeneous system of linear equations is closed
under addition (Theorem 3.2, page 80).
b. If X = C₁ is a solution for AX = B, and X = C₂ is a solution for AX = 0,
then X = C₁ + C₂ is a solution for AX = B (problem 6, page 105).
11. Instead of viewing both A and B as being fixed in the system of linear equa-
tions AX = B, consider only A to be fixed. Then AX = B defines a cor-
respondence from ℳ_{m×1} to ℳ_{n×1}.
a. To what vectors do the three given column vectors [illegible] correspond
in ℳ_{3×1} if A is the displayed matrix [illegible]?
b. To what vectors do the three given column vectors [illegible] correspond
in ℳ_{2×1} if A is the displayed matrix [illegible]?
c. Suppose this correspondence satisfies the second condition in the definition
for an isomorphism, page 62. What can be concluded about the rank of A,
and the relative values of n and m?
Linear Transformations
1. Definitions and Examples
2. The lmage and Null Spaces
3. Algebra of Linear Transformations
4. Matrices of a Linear Transformation
5. Composition of Maps
6. Matrix Multiplication
7. Elementary Matrices
A special type of transformation, the isomorphism, has been touched
upon as a natural consequence of introducing coordinates. We now turn to
a general study of transformations between vector spaces.
1. Definitions and Examples
Definition  Let 𝒱 and 𝒲 be vector spaces. If T(V) is a unique vector in
𝒲 for each V ∈ 𝒱, then T is a function, transformation, or map from 𝒱, the
domain, to 𝒲, the codomain. If T(V) = W, then W is the image of V under
T and V is a preimage of W. The set of all images in 𝒲, {T(V) | V ∈ 𝒱}, is the
image or range of T. The notation T: 𝒱 → 𝒲 will be used to denote a trans-
formation from 𝒱 to 𝒲.
According to this definition, each vector in the domain has one and only
one image. However, a vector in the codomain may have many preimages
or none at all. For example, consider the map T: 𝒱₃ → 𝒱₂ defined by
T(a, b, c) = (a + b − c, 2c − 2a − 2b). For this map, T(3, 1, 2) = (2, −4).
Therefore (2, −4) is the image of (3, 1, 2) and (3, 1, 2) is a preimage of (2, −4).
But T(1, 2, 1) = (2, −4), so (1, 2, 1) is also a preimage of (2, −4). Can you
find any other preimages of (2, −4)? On the other hand, (1, 0) has no
preimage, for T(a, b, c) = (1, 0) yields the inconsistent system a + b − c = 1,
2c − 2a − 2b = 0. Thus some vectors in the codomain 𝒱₂ have many
preimages, while others have none.
In the preceding definition, 𝒱 and 𝒲 could be arbitrary sets and the
definition would still be meaningful provided only that the word "vector"
were replaced by "element." In fact, elementary calculus is the study of func-
tions from a set of real numbers to the set of real numbers. But a transforma-
tion between vector spaces, instead of simply sets of vectors, should be more
than simply a rule giving a vector in 𝒲 for each vector in 𝒱; it should also
preserve the vector space structure. Therefore, we begin by restricting our
attention to maps that preserve this structure.
Definition  Let 𝒱 and 𝒲 be vector spaces and T: 𝒱 → 𝒲. T is a linear
transformation or a linear map if for every U, V ∈ 𝒱 and r ∈ R:
1. T(U + V) = T(U) + T(V); and
2. T(rU) = rT(U).
Thus, under a linear transformation, the image of the sum of vectors is
the sum of the images and the image of a scalar multiple is the scalar multiple
of the image. We say that a linear transformation "preserves addition and
scalar multiplication."
Almost none of the functions studied in calculus are linear maps from
the vector space R to R. In fact, f: R → R is linear only if f(x) = ax for some
real number a.
The two conditions in the definition of a linear transformation are in-
cluded in the definition of an isomorphism (see page 62). That is, isomor-
phisms are linear transformations. But the converse is not true.
Example 1  Define T: 𝒱₃ → 𝒱₂ by T(x, y, z) = (2x + y, 4z). Show
that T is linear but not an isomorphism.
If U = (a₁, a₂, a₃) and V = (b₁, b₂, b₃), then

T(U + V) = T(a₁ + b₁, a₂ + b₂, a₃ + b₃)        Definition of + in 𝒱₃
         = (2[a₁ + b₁] + a₂ + b₂, 4[a₃ + b₃])   Definition of T
         = (2a₁ + a₂, 4a₃) + (2b₁ + b₂, 4b₃)    Definition of + in 𝒱₂
         = T(a₁, a₂, a₃) + T(b₁, b₂, b₃)        Definition of T
         = T(U) + T(V).
Therefore T preserves addition of vectors. T also preserves scalar multiplica-
tion, for if r ∈ R, then

T(rU) = T(ra₁, ra₂, ra₃)       Definition of scalar multiplication in 𝒱₃
      = (2ra₁ + ra₂, 4ra₃)     Definition of T
      = r(2a₁ + a₂, 4a₃)       Definition of scalar multiplication in 𝒱₂
      = rT(a₁, a₂, a₃)         Definition of T
      = rT(U).
Thus T is a linear map. However, T is not an isomorphism, for there is not
a unique preimage for every vector in the codomain 𝒱₂. Given (a, b) ∈ 𝒱₂,
there exists a vector (x, y, z) ∈ 𝒱₃ such that T(x, y, z) = (a, b) provided
2x + y = a and 4z = b. Given a and b, this system of two equations in the
three unknowns x, y, z cannot have a unique solution. Therefore, if a
vector (a, b) has one preimage, then it has many. For example, T(3, 0, 1)
= (6, 4) = T(2, 2, 1).
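The map T of this example can also be written as multiplication by a 2 × 3 matrix, which makes the linearity checks one-line computations. A sketch assuming numpy; the test vectors are the ones used in the text.

```python
# T(x, y, z) = (2x + y, 4z) as multiplication by a 2 x 3 matrix (numpy assumed).
import numpy as np

M = np.array([[2.0, 1.0, 0.0],
              [0.0, 0.0, 4.0]])

def T(v):
    return M @ v

U = np.array([3.0, 0.0, 1.0])
V = np.array([2.0, 2.0, 1.0])
print(T(U), T(V))                           # both give (6, 4): T is not 1-1
print(np.allclose(T(U + V), T(U) + T(V)))   # T preserves addition
print(np.allclose(T(2.5 * U), 2.5 * T(U)))  # T preserves scalar multiplication
```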
Definition  A transformation T: 𝒱 → 𝒲 is one to one, written 1-1, if
T(U) = T(V) implies U = V.
A map from 𝒱 to 𝒲 is one to one if each vector in the codomain 𝒲 has
at most one preimage in the domain 𝒱. Thus the map in Example 1 is not
1-1. The property of being one to one is inverse to the property of being a
function. For T: 𝒱 → 𝒲 is a function provided T(U) = W and T(U) = X
implies W = X. In fact, the function notation T(U) is used only when U has
a unique image.
A map may be linear and one to one and still not be an isomorphism.
Example 2  The map T: 𝒱₂ → 𝒱₃ defined by T(x, y) = (x, 3x − y, 2x)
is linear and one to one, but not an isomorphism.
To show that T preserves addition, let U = (a₁, a₂) and V = (b₁, b₂);
then

T(U + V) = T(a₁ + b₁, a₂ + b₂)
         = (a₁ + b₁, 3[a₁ + b₁] − [a₂ + b₂], 2[a₁ + b₁])
         = (a₁, 3a₁ − a₂, 2a₁) + (b₁, 3b₁ − b₂, 2b₁)
         = T(a₁, a₂) + T(b₁, b₂) = T(U) + T(V).
The proof that T preserves scalar multiplication is similar; thus T is linear.
To see that T is 1-1, suppose T(U) = T(V). Then (a₁, 3a₁ − a₂, 2a₁) =
(b₁, 3b₁ − b₂, 2b₁) yields a₁ = b₁, 3a₁ − a₂ = 3b₁ − b₂, and 2a₁ = 2b₁.
These equations have a₁ = b₁, a₂ = b₂ as the only solution, so U = V and T
is 1-1.
However, T is not an isomorphism because some vectors in 𝒱₃ have no
preimage under T. Given (a, b, c) ∈ 𝒱₃, there exists a vector (x, y) ∈ 𝒱₂ such
that T(x, y) = (a, b, c) provided x = a, 3x − y = b, and 2x = c. For any
choice of a, b, c, this is a system of three linear equations in two unknowns,
x and y, which need not be consistent. In this case, if 2a ≠ c, there is no
solution. For example, (1, 6, 3) has no preimage in 𝒱₂ under T. Therefore
T does not satisfy the definition of an isomorphism.
Definition  A transformation T: 𝒱 → 𝒲 is onto if for every vector
W ∈ 𝒲, there exists a vector V ∈ 𝒱 such that T(V) = W.
T maps 𝒱 onto the codomain 𝒲 if every vector in 𝒲 is the image of
some vector in 𝒱. That is, T is onto if its image or range equals its codomain.
The map in Example 2 is not onto, for the image does not contain (1, 6, 3), a
vector in the codomain.
The second condition in our definition of an isomorphism requires that
an isomorphism be both one to one and onto. It is now possible to give a
simple restatement of the definition.

Definition  An isomorphism from a vector space 𝒱 to a vector space
𝒲 is a one to one, onto, linear map from 𝒱 to 𝒲.
An isomorphism is a very special transformation in that isomorphic
vector spaces are algebraically identical. But an arbitrary linear transforma-
tion may fail to be either one to one or onto.
Example 3  The map T: 𝒱₂ → 𝒱₂ defined by T(a, b) = (0, 0) is
linear, but it is neither one to one nor onto.
Example 4  Let

    T(a₀ + a₁t + ··· + aₙtⁿ) = a₁ + 2a₂t + ··· + naₙtⁿ⁻¹.

Then T is a linear map from R[t] to itself which is onto but not 1-1.
The image of a polynomial p ∈ R[t] under T is the derivative of p with
respect to t. One can prove directly that T is linear, but this is also implied
by the theorems on differentiation obtained in elementary calculus. T maps
R[t] onto R[t]; for given any polynomial b₀ + b₁t + ··· + b_mt^m ∈ R[t], the
polynomial

    b₀t + (b₁/2)t² + ··· + (b_m/(m + 1))t^{m+1}

is sent to it by T. Thus, every polynomial has a preimage and T is an onto
map. But T fails to be one to one. (Why?)
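The differentiation map of Example 4 can be sketched on coefficient lists [a₀, a₁, ..., aₙ] (our own encoding of R[t]). The example shows both that T is onto and that it is not one to one: polynomials differing by a constant have the same image.

```python
# Differentiation on coefficient lists [a0, a1, ..., an] (our own encoding).
def derivative(p):
    return [k * p[k] for k in range(1, len(p))] or [0]

def antiderivative(p):                 # a preimage with constant term 0
    return [0] + [p[k] / (k + 1) for k in range(len(p))]

p = [1, 2, 3]                          # 1 + 2t + 3t^2
print(derivative(p))                   # [2, 6], i.e. 2 + 6t
print(derivative([5, 2, 3]))           # also [2, 6]: T is not one to one
print(derivative(antiderivative(p)))   # [1.0, 2.0, 3.0]: T is onto
```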
A direct sum decomposition of 𝒱 given by 𝒱 = 𝒮 ⊕ 𝒯 determines
two linear transformations, P₁: 𝒱 → 𝒮 and P₂: 𝒱 → 𝒯. These projection
maps are defined by P₁(V) = U and P₂(V) = W, where V = U + W, U ∈ 𝒮
and W ∈ 𝒯. P₁ and P₂ are well-defined maps because, in a direct sum, U
and W are uniquely determined by V. The proof that a projection is linear
is left to the reader.
Example 5  Let 𝒮 = {(x, y, z) | x + y + z = 0} and 𝒯 =
ℒ{(2, 5, 7)}. Then 𝒮 + 𝒯 is direct and ℰ₃ = 𝒮 ⊕ 𝒯 expresses ℰ₃ as the
direct sum of a plane and a line. Thus there are two projections P₁: ℰ₃ → 𝒮
and P₂: ℰ₃ → 𝒯. Given (7, 3, 4) ∈ ℰ₃, one finds that (7, 3, 4) = (5, −2, −3)
+ (2, 5, 7) with (5, −2, −3) ∈ 𝒮 and (2, 5, 7) ∈ 𝒯. Therefore, P₁(7, 3, 4)
= (5, −2, −3) and P₂(7, 3, 4) = (2, 5, 7).
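The decomposition in Example 5 can be computed directly: every vector splits uniquely as a plane component plus a multiple of (2, 5, 7), and the scalar is forced by the plane equation x + y + z = 0. A sketch assuming numpy; the function names are our own.

```python
# The projections of Example 5, computed numerically (numpy assumed).
import numpy as np

D = np.array([2.0, 5.0, 7.0])          # direction spanning the line T

def P2(v):                             # projection onto the line
    t = v.sum() / D.sum()              # (v - t*D).sum() == 0 forces this t
    return t * D

def P1(v):                             # projection onto the plane
    return v - P2(v)

V = np.array([7.0, 3.0, 4.0])
print(P1(V))                           # the plane component, (5, -2, -3)
print(P2(V))                           # the line component, (2, 5, 7)
```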
A projection map is much more general than the perpendicular projec-
tion found in Euclidean geometry. However, a perpendicular projection can
be expressed as a projection map. For example, if 𝒯 in Example 5 is replaced
by ℒ{(1, 1, 1)}, the subspace (line) normal to the plane 𝒮, then P₁(V) is at
the foot of the perpendicular dropped from V to the plane 𝒮. That is, P₁
is the perpendicular projection of ℰ₃ onto 𝒮.
A projection map is necessarily onto; for given any vector U ∈ 𝒮, U
is in 𝒱 and U = U + 0, 0 ∈ 𝒯. Therefore P₁(U) = U, and every vector in 𝒮
has a preimage in 𝒱. On the other hand, if 𝒯 ≠ {0}, then P₁ fails to be
1-1. This is easily seen, for every vector in 𝒯 is sent to the same vector
by P₁, namely 0. That is, if W ∈ 𝒯, then W = 0 + W with 0 ∈ 𝒮 and
W ∈ 𝒯, so P₁(W) = 0.
The linearity conditions have a good geometric interpretation. Gener-
ally, a linear map sends parallel lines into parallel lines. To see this, consider
the line ℓ with vector equation P = kU + V, k ∈ R, for U, V ∈ 𝒱ₙ, U ≠ 0.
If U and V are viewed as points in Euclidean space, then ℓ is the line through
V with direction U. Now suppose T: 𝒱ₙ → 𝒱ₙ is a linear map; then
T(P) = T(kU + V) = T(kU) + T(V) = kT(U) + T(V) for all k ∈ R.
Therefore if T(U) ≠ 0, then T sends ℓ into the line through T(V) with direc-
tion T(U). Now any line parallel to ℓ has direction U and is sent to a line with
direction T(U). What would happen to the parallel line given by P = kU + W,
k ∈ R, if T(V) = T(W)? What would happen to ℓ if T(U) = 0?
We can use this geometric interpretation to show that a rotation of
the plane about the origin must be linear. Certainly a rotation sends parallel
lines into parallel lines, but the converse of the above argument is false.
That is, a map may send parallel lines into parallel lines without being linear;
consider a translation. However, if T: ℰ₂ → ℰ₂ is a rotation about the origin,
then T(0) = 0 and T moves geometric figures without distortion. Thus the
parallelogram determined by 0, U, V, and U + V is sent to the parallelogram
determined by 0, T(U), T(V), and T(U + V). So by the parallelogram rule
for addition, T(U + V) = T(U) + T(V), and a rotation preserves addition.
T also preserves scalar multiplication because a rotation moves a line passing
through the origin, {rU | r ∈ R}, to a line through the origin without changing
[Figure 1: the unit vectors (1, 0) and (0, 1) are rotated through φ to (cos φ, sin φ) and (−sin φ, cos φ).]
lengths or the relative position of points. Thus a rotation of ℰ₂ about the
origin is a linear map, and we can use this fact to obtain a componentwise
expression for T. Suppose φ is the angle of rotation, with positive angles
measured in the counterclockwise direction. Then T(1, 0) = (cos φ, sin φ)
and T(0, 1) = (−sin φ, cos φ), as in Figure 1. Now using the linearity of T,

T(a, b) = T(a(1, 0) + b(0, 1))
        = aT(1, 0) + bT(0, 1)
        = a(cos φ, sin φ) + b(−sin φ, cos φ)
        = (a cos φ − b sin φ, a sin φ + b cos φ).
From the geometric point of view, a rotation in ℰ₂ must be 1-1 and onto, but
this could also be obtained from the componentwise definition.
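The componentwise expression for a rotation can be checked numerically, both for linearity and for the absence of distortion (lengths are unchanged). A sketch in plain Python; the angle and vectors are our own examples.

```python
# T(a, b) = (a cos p - b sin p, a sin p + b cos p), checked numerically.
from math import cos, sin, pi, isclose, hypot

def rotate(v, p):
    a, b = v
    return (a * cos(p) - b * sin(p), a * sin(p) + b * cos(p))

p = pi / 3                              # a 60-degree rotation
u, v = (3.0, 1.0), (-2.0, 4.0)

# additivity: T(u + v) = T(u) + T(v)
lhs = rotate((u[0] + v[0], u[1] + v[1]), p)
rhs = tuple(x + y for x, y in zip(rotate(u, p), rotate(v, p)))
print(all(isclose(a, b) for a, b in zip(lhs, rhs)))

# a rotation moves figures without distortion: lengths are preserved
print(isclose(hypot(*rotate(u, p)), hypot(*u)))
```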
Problems

1. Determine which of the following transformations are linear.
a. T: 𝒱₂ → 𝒱₂; T(a, b) = (3a − b, b²).
b. T: 𝒱₃ → 𝒱₂; T(a, b, c) = (3a + 2c, b).
c. [illegible].
d. T: R₃[t] → R₃[t]; T(a + bt + ct²) = (ab)t² + 4ct − b.
e. T: 𝒱₃ → 𝒱₄; T(a, b, c) = (a − 5b, 7, a, 0).
f. T: C → R; T(a + bi) = a + b.
g. T: C → C; T(a + bi) = i − a.
2. Suppose T: 𝒱 → 𝒲. Show that T is linear if and only if T(aU + bV) = aT(U)
+ bT(V) for all a, b ∈ R and all U, V ∈ 𝒱.
3. How do you read "T: 𝒱 → 𝒲"?
4. Show that each of the following maps is linear. Find all preimages for the given
vector W in the codomain.
a. T(a, b) = (2a − b, 3b − 4a); W = (−1, 5).
b. T(a, b, c) = (a + b, 3a − c); W = (0, 0).
c. T(a, b) = (a − b, a + b, 2a); W = (2, 5, 3).
d. T(a + bt + ct²) = (a + b)t + (c − a)t²; W = t.
e. T(a + bt + ct²) = b + 2ct; W = t.
f. T(a + bt + ct²) = a + b + (b + c)t + (c − a)t²; W = 3 − 3t².
5. State what it would mean for each of the following maps to be onto.
a. T: 𝒱₃ → 𝒱₄.   b. T: ℳ₂ₓ₃ → ℳ₂ₓ₂.   c. T: R₃[t] → R₂[t].
6. Show that each of the following maps is linear. Determine if the maps are one
to one or onto.
a. T: ℳ₂ₓ₃ → ℳ₂ₓ₂; [the formula for T is illegible].
b. T: C → C; T(a + bi) = 0.
c. T: 𝒱₂ → 𝒱₃; T(a, b) = (2a + b, 3b, b − 4a).
d. T: R₃[t] → R₃[t]; T(a + bt + ct²) = a + b + (b + c)t + (a + c)t².
e. T: 𝒱₃ → 𝒱₃; T(a, b, c) = (2a + 4b, 2a + 3c, 4b − 3c).
f. T: R[t] → R[t]; T(a₀ + a₁t + ··· + aₙtⁿ) = a₀t + (a₁/2)t² + ··· + (aₙ/(n + 1))tⁿ⁺¹.
7. Find the following images for the projection maps in Example 5.
a. P₁(18, 2, 8).   b. P₂(18, 2, 8).   c. P₁(−1, −1, −12).
d. P₂(0, 0, 14).   e. P₁(x, y, −x − y).   f. P₂(x, y, −x − y).
8. a. Show that a projection map is a linear transformation.
   b. Suppose 𝒱 = 𝒮 + 𝒯 but the sum is not direct. Why isn't there a
      projection map from 𝒱 to 𝒮?
9. a. What map defines the perpendicular projection of ℰ₂ onto the x axis?
   b. What map defines the perpendicular projection of ℰ₂ onto the line with
      equation y = 3x?
10. a. Find the map T that rotates the plane through the angle φ = 60°. What
       is T(1, 0)?
    b. Use the componentwise expression for a rotation to show that any
       rotation of the plane about the origin is linear.
    c. Prove that a rotation about the origin is one to one and onto.
11. A translation in the plane is given by T(x, y) = (x + h, y + k) for some
h, k ∈ R. Is a translation linear? 1-1? Onto?
12. The reflection of ℰ₂ in the line with equation y = x is given by T(x, y) = (y, x).
Is this reflection linear? 1-1? Onto?
13. Suppose T: 𝒱 → 𝒲 is linear and there exists a nonzero vector U ∈ 𝒱 such
that T(U) = 0. Prove that T is not one to one.
14. Suppose T: f - if is a linear map, U, ... , Uk E "f" and r, ... , rk E R.
Use induction to prove that if k ;;::: 2, then
T(rU + + rkUk) = rT(U) + + rkT(Uk).
15. Suppose T: f - "'f is linear and { V1, ... , Vn} is a basis for f. Pro ve that the
image or range of T is !l' {T(V), ... , T(Vn)}. That is, the image of T is a
subspace of the codomain if.
2. The Image and Null Spaces
The requirement that a transformation be linear and thus preserve the
algebraic structure of a vector space is quite restrictive. In particular, the
image of the zero vector under a linear map must be the zero vector. This
generally involves two different zero vectors. That is, if T: 𝒱 → 𝒲 is linear
and 0_𝒱, 0_𝒲 are the zero vectors in 𝒱 and 𝒲, respectively, then T(0_𝒱) = 0_𝒲.
For, using any vector V ∈ 𝒱, we obtain T(0_𝒱) = T(0V) = 0T(V) = 0_𝒲. The
use of different symbols for the zero vectors is helpful here but not necessary.
Knowing that T is a map from 𝒱 to 𝒲 is sufficient to determine what 0
means in T(0) = 0.
The requirement that a map preserve the vector space structure also
guarantees that the image of a vector space is again a vector space. In general,
if T is a map and the set S is contained in its domain, then the set T[S]
= {T(U) | U ∈ S} is called the image of S. The claim is that if S is a vector
space, then so is the image T[S].
Example 1. Define T: 𝒱₂ → 𝒱₃ by

T(a, b) = (2a - b, 3b - a, a + b).

If 𝒮 = ℒ{(1, 2)}, then

T[𝒮] = {T(U) | U ∈ 𝒮} = {T(k, 2k) | k ∈ R}
= {(2k - 2k, 3(2k) - k, k + 2k) | k ∈ R}
= ℒ{(0, 5, 3)}.
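Example 1 is easy to spot-check numerically. The short Python sketch below (not from the text) evaluates T on several multiples of (1, 2) and confirms that each image is the corresponding multiple of (0, 5, 3):

```python
def T(a, b):
    # The linear map of Example 1
    return (2*a - b, 3*b - a, a + b)

# Every vector in the span of (1, 2) has the form (k, 2k);
# its image should be k * (0, 5, 3).
for k in [-2, 0, 1, 3, 10]:
    assert T(k, 2*k) == (0, 5*k, 3*k)
print("T[span{(1, 2)}] lies in span{(0, 5, 3)}")
```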
4. LINEAR TRANSFORMATIONS
Note that, in this case, the image of the subspace 𝒮 of 𝒱₂ is a subspace of 𝒱₃.

Theorem 4.1. If T: 𝒱 → 𝒲 is linear and 𝒮 is a subspace of 𝒱, then the image of 𝒮, T[𝒮], is a subspace of 𝒲.

Proof. It is necessary to show that T[𝒮] is nonempty and closed under addition and scalar multiplication. Since 𝒮 is a subspace of 𝒱, 0 ∈ 𝒮 and T(0) = 0 because T is linear. Thus 0 ∈ T[𝒮] and T[𝒮] is nonempty. To complete the proof it is sufficient to show that if U, V ∈ T[𝒮] and r, s ∈ R, then rU + sV ∈ T[𝒮]. (Why is this sufficient?) But U, V ∈ T[𝒮] implies that there exist W, X ∈ 𝒮 such that T(W) = U and T(X) = V. And 𝒮 is closed under addition and scalar multiplication, so rW + sX ∈ 𝒮. Therefore T(rW + sX) ∈ T[𝒮]. Now T is linear, so T(rW + sX) = rT(W) + sT(X) = rU + sV, completing the proof.

In particular, this theorem shows that the image of T, T[𝒱], is a subspace of the codomain 𝒲. Thus it might be called the image space of T. We will denote the image space of T by ℐ_T. Using this notation, T: 𝒱 → 𝒲 is onto if and only if ℐ_T = 𝒲.
A map fails to be onto because of the choice of its codomain. If T: 𝒱 → 𝒲 is not onto, the codomain might be changed to ℐ_T; then T: 𝒱 → ℐ_T is onto, for any transformation maps its domain onto its image. Hence if T: 𝒱 → 𝒲 is linear and one to one, then T: 𝒱 → ℐ_T is an isomorphism.
Example 2. Define T: 𝒱₂ → 𝒱₃ by

T(a, b) = (2a + b, b - a, 3a + 4b).

Find the image space of T.

ℐ_T = {T(a, b) | a, b ∈ R}
= {(2a + b, b - a, 3a + 4b) | a, b ∈ R}
= ℒ{(2, -1, 3), (1, 1, 4)}
= ℒ{(1, 0, 7/3), (0, 1, 5/3)}
= {(x, y, z) | 7x + 5y - 3z = 0}.

It can be shown that T is linear and 1-1. Thus the fact that ℐ_T is represented by a plane agrees with the fact that ℐ_T is isomorphic to 𝒱₂.
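As a quick numeric check of Example 2, every image of T should satisfy the plane equation 7x + 5y - 3z = 0. A minimal Python sketch (not from the text):

```python
def T(a, b):
    # The map of Example 2
    return (2*a + b, b - a, 3*a + 4*b)

# Each image (x, y, z) should lie on the plane 7x + 5y - 3z = 0.
for a, b in [(1, 0), (0, 1), (2, -5), (7, 3)]:
    x, y, z = T(a, b)
    assert 7*x + 5*y - 3*z == 0
print("all sampled images satisfy 7x + 5y - 3z = 0")
```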
Example 3. Consider the linear map T: 𝒱₂ → 𝒱₂ given by T(a, b) = (2a - b, 3b - 6a).
T does not map 𝒱₂ onto 𝒱₂ because

ℐ_T = {T(a, b) | a, b ∈ R} = {(2a - b, 3b - 6a) | a, b ∈ R}
= ℒ{(2, -6), (-1, 3)} = ℒ{(1, -3)} ≠ 𝒱₂.

The image of T can be viewed as the line in E² with Cartesian equation y = -3x. Each point on this line has an entire line of preimages. To see this, consider an arbitrary point W = (k, -3k) in the image. The set of all preimages of W is

{V ∈ 𝒱₂ | T(V) = W} = {(x, y) | T(x, y) = (k, -3k)} = {(x, y) | 2x - y = k}.

Therefore the entire line with equation y = 2x - k is sent by T to (k, -3k). If ℓₖ denotes this line and Pₖ = (k, -3k), then T[ℓₖ] = Pₖ. Every point of the domain is on one of the lines ℓₖ, so the effect of T on the plane is to collapse each line ℓₖ to the point Pₖ. This is illustrated in Figure 2. The line ℓ₀
Figure 2
passes through the origin and thus represents a subspace of the domain. This subspace determines much of the nature of T, for every line parallel to it is sent to a point. This idea generalizes to all linear maps. That is, knowing which vectors are sent to 0 tells much about a linear transformation.
The following theorem is easily proved and left to the reader.

Theorem 4.2. If T: 𝒱 → 𝒲 is linear, then {V ∈ 𝒱 | T(V) = 0} is a subspace of 𝒱.

Definition. The null space 𝒩_T of a linear map T: 𝒱 → 𝒲 is the set of all preimages of 0_𝒲. That is, 𝒩_T = {V ∈ 𝒱 | T(V) = 0}.

The null space is often called the kernel of the map.
In Example 3, the null space of T is represented by the line ℓ₀, and every line parallel to the null space is sent to a single point. Example 5 and problem 13 at the end of this section examine similar situations in 3-space.
As indicated, the null space provides information about a linear map.
The simplest result is given by the next theorem.
Theorem 4.3. Suppose T: 𝒱 → 𝒲 is linear. T is one to one if and only if 𝒩_T = {0}. That is, T is 1-1 if and only if 0_𝒱 is the only vector that is mapped to 0_𝒲 by T.

Proof. (⇒) Suppose T is 1-1. We must show that {0} ⊆ 𝒩_T and 𝒩_T ⊆ {0}. The first containment holds because T(0) = 0 for any linear map. So suppose U ∈ 𝒩_T, that is, T(U) = 0. Then since T is 1-1, T(0) = 0 = T(U) implies 0 = U and 𝒩_T ⊆ {0}. Thus 𝒩_T = {0}.
(⇐) Suppose 𝒩_T = {0}. If T(U) = T(V) for some U, V ∈ 𝒱, then 0 = T(U) - T(V) = T(U - V). That is, U - V ∈ 𝒩_T and U - V = 0. Therefore U = V and T is 1-1.
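Theorem 4.3 can be illustrated with the map of Example 3, whose null space contains (1, 2): a nonzero null-space vector immediately produces two distinct vectors with the same image. A small sketch (not from the text):

```python
def T(x, y):
    # The map of Example 3; its null space is the line spanned by (1, 2).
    return (2*x - y, 3*y - 6*x)

assert T(1, 2) == (0, 0)           # (1, 2) is in the null space
# As Theorem 4.3 predicts, T cannot be one to one:
# any two vectors differing by (1, 2) share an image.
assert T(5, 7) == T(5 + 1, 7 + 2)
```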
Example 4. Determine if the map T: R₃[t] → R₃[t] is one to one if

T(a + bt + ct²) = a - b + (b - c)t + (c - a)t².

If a + bt + ct² ∈ 𝒩_T, then T(a + bt + ct²) = 0. This yields a system of three homogeneous linear equations in three unknowns:

a - b = 0
b - c = 0
c - a = 0

which has coefficient matrix

A = (  1  -1   0 )
    (  0   1  -1 )
    ( -1   0   1 )

The rank of A is 2; therefore there are nontrivial solutions and 𝒩_T ≠ {0}. So T is not one to one.
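The rank computation in Example 4 can be checked by row reduction. The sketch below uses exact rational arithmetic; the helper `rank` is hypothetical, not part of the text:

```python
from fractions import Fraction

def rank(rows):
    # Row-reduce over exact rationals and count the nonzero pivot rows.
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Coefficient matrix of a - b = 0, b - c = 0, c - a = 0
A = [[1, -1, 0], [0, 1, -1], [-1, 0, 1]]
assert rank(A) == 2                # rank 2 < 3 unknowns: nontrivial solutions
a = b = c = 1                      # the polynomial 1 + t + t^2 is in the null space
assert (a - b, b - c, c - a) == (0, 0, 0)
```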
A second source of information about a map is provided by the dimen-
sion of its null space.
Theorem 4.4. If T: 𝒱 → 𝒲 is linear and 𝒱 is finite dimensional, then dim 𝒱 = dim 𝒩_T + dim ℐ_T.

Proof. Dimension is the number of vectors in a basis, therefore we must choose bases for the three vector spaces. The best place to start is with 𝒩_T, for then a basis for 𝒱 can be obtained by extension. Therefore, suppose {V₁, ..., Vₖ} is a basis for 𝒩_T. Then dim 𝒩_T = k. If 𝒩_T = {0}, then k = 0 and this basis is empty. Since the dimension of 𝒱 is finite, there exists a basis V₁, ..., Vₖ, Vₖ₊₁, ..., Vₙ for 𝒱. Then dim 𝒱 = n and we need a basis for ℐ_T. If W ∈ ℐ_T, there exists a vector U ∈ 𝒱 such that T(U) = W, and there exist scalars a₁, ..., aₙ such that U = a₁V₁ + ··· + aₙVₙ. By problem 14, page 137,

W = T(U) = a₁T(V₁) + ··· + aₖT(Vₖ) + aₖ₊₁T(Vₖ₊₁) + ··· + aₙT(Vₙ)
= aₖ₊₁T(Vₖ₊₁) + ··· + aₙT(Vₙ).

Therefore the set B = {T(Vₖ₊₁), ..., T(Vₙ)} spans ℐ_T. To show that B is linearly independent, suppose

rₖ₊₁T(Vₖ₊₁) + ··· + rₙT(Vₙ) = 0.

Then

T(rₖ₊₁Vₖ₊₁ + ··· + rₙVₙ) = 0

and rₖ₊₁Vₖ₊₁ + ··· + rₙVₙ ∈ 𝒩_T. Therefore there exist scalars r₁, ..., rₖ ∈ R such that

rₖ₊₁Vₖ₊₁ + ··· + rₙVₙ = r₁V₁ + ··· + rₖVₖ.
(Why?) But {V₁, ..., Vₙ} is linearly independent as a basis for 𝒱, so rₖ₊₁ = ··· = rₙ = 0 and B is a basis for ℐ_T. Now dim 𝒩_T + dim ℐ_T = k + (n - k) = n = dim 𝒱 as claimed.
Definition. The dimension of the null space of T, 𝒩_T, is the nullity of T; and the dimension of the image space ℐ_T is the rank of T.

Thus, Theorem 4.4 states that the nullity of T plus the rank of T equals the dimension of the domain of T.
Example 5. Find the rank and nullity of the map T: 𝒱₃ → 𝒱₃ given by

T(a, b, c) = (2a - b + 3c, a - b + c, 3a - 4b + 2c).

The vector (x, y, z) is in 𝒩_T if

2x - y + 3z = 0
x - y + z = 0
3x - 4y + 2z = 0.

This system is equivalent to x + 2z = 0, y + z = 0. Therefore

𝒩_T = {(x, y, z) | x + 2z = 0, y + z = 0} = ℒ{(2, 1, -1)}.

So the nullity of T is 1, and by Theorem 4.4, the rank of T is 2 = 3 - 1 = dim 𝒱₃ - dim 𝒩_T.
If T is thought of as a map sending points in 3-space to points in 3-space, then 𝒩_T is a line ℓ with equation P = t(2, 1, -1), t ∈ R. This line is sent to the origin by T. Moreover, as in Example 3, every line parallel to ℓ is sent to a single point. For a line parallel to ℓ has the vector equation P = t(2, 1, -1) + V for some vector V ∈ 𝒱₃, and

T(P) = tT(2, 1, -1) + T(V) = 0 + T(V) = T(V).

Further, since the rank of this map is 2, we know that ℐ_T is a plane through the origin. In fact,

ℐ_T = {(2a - b + 3c, a - b + c, 3a - 4b + 2c) | a, b, c ∈ R}
= {(x, y, z) ∈ 𝒱₃ | x - 5y + z = 0}.

(Be sure to check this computation.)
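Example 5, and the rank-plus-nullity count of Theorem 4.4, can be verified computationally. A sketch with a hypothetical `rank` helper (not from the text):

```python
from fractions import Fraction

def rank(rows):
    # Exact Gaussian elimination; returns the number of pivots.
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def T(a, b, c):
    # The map of Example 5
    return (2*a - b + 3*c, a - b + c, 3*a - 4*b + 2*c)

# The matrix of T has rank 2, so nullity = dim V3 - rank = 1 (Theorem 4.4).
M = [[2, -1, 3], [1, -1, 1], [3, -4, 2]]
assert rank(M) == 2
assert T(2, 1, -1) == (0, 0, 0)    # (2, 1, -1) spans the null space
```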
Several conclusions follow at once from Theorem 4.4, using the fact that the dimension of ℐ_T cannot exceed the dimension of the codomain.

Corollary 4.5. Suppose T: 𝒱 → 𝒲 is linear and 𝒱 is finite dimensional. Then
1. dim ℐ_T ≤ dim 𝒱.
2. T is 1-1 if and only if dim ℐ_T = dim 𝒱.
3. T is onto if and only if dim ℐ_T = dim 𝒲.
4. If dim 𝒱 > dim 𝒲, then T is not 1-1.
5. If dim 𝒱 < dim 𝒲, then T is not onto.
6. If 𝒱 = 𝒲, then T is 1-1 if and only if T is onto.

This corollary provides a simple proof that 𝒱ₙ is not isomorphic to 𝒱ₘ when n ≠ m, for any linear map from 𝒱ₙ to 𝒱ₘ is either not 1-1 when n > m, or not onto when n < m.
The result of problem 14, page 137, which was used in the proof of Theorem 4.4, is important enough to be stated as a theorem.

Theorem 4.6. Suppose T: 𝒱 → 𝒲 is linear and {V₁, ..., Vₙ} is a basis for 𝒱. If V = a₁V₁ + ··· + aₙVₙ, then T(V) = a₁T(V₁) + ··· + aₙT(Vₙ).
This states that a linear map is completely determined by its action on a basis. If the images of the basis vectors are known, then the image of any other vector can be found. Again we see how restrictive the linearity requirement is. Compare with the functions studied in calculus, for which even a large number of functional values need not determine the function.
The fact that a linear map is determined by its action on a basis means that a linear map can be obtained by arbitrarily assigning images to the vectors in a basis. Suppose B = {V₁, ..., Vₙ} is a basis for 𝒱 and W₁, ..., Wₙ are any n vectors from a vector space 𝒲. Then set T(V₁) = W₁, ..., T(Vₙ) = Wₙ and define T on the rest of 𝒱 by

T(V) = a₁W₁ + ··· + aₙWₙ   if   V = a₁V₁ + ··· + aₙVₙ.

It is not difficult to show that T is a linear map from 𝒱 to 𝒲 which sends Vᵢ to Wᵢ, 1 ≤ i ≤ n.
Example 6. Given the basis {(2, 6), (0, 1)} for 𝒱₂, define a linear map T: 𝒱₂ → 𝒱₃ which sends (2, 6) to (8, 10, 4) and (0, 1) to (-1, 4, 2).

Since (a, b) = (a/2)(2, 6) + [b - 3a](0, 1), we define T by

T(a, b) = (a/2)(8, 10, 4) + [b - 3a](-1, 4, 2) = (7a - b, 4b - 7a, 2b - 4a).

Then T is seen to be a linear map from 𝒱₂ to 𝒱₃ with

T(2, 6) = (14 - 6, 24 - 14, 12 - 8) = (8, 10, 4)   and   T(0, 1) = (-1, 4, 2),

as desired.
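The extension procedure of Example 6 mechanizes directly: express (a, b) in the given basis and recombine the prescribed images with the same coefficients. A Python sketch (the function name is illustrative, not from the text):

```python
from fractions import Fraction

def extend_linearly(a, b):
    # (a, b) = (a/2)(2, 6) + (b - 3a)(0, 1), so send
    # (2, 6) -> (8, 10, 4) and (0, 1) -> (-1, 4, 2) and combine.
    c1 = Fraction(a, 2)
    c2 = b - 3 * a
    return tuple(c1 * u + c2 * v for u, v in zip((8, 10, 4), (-1, 4, 2)))

assert extend_linearly(2, 6) == (8, 10, 4)
assert extend_linearly(0, 1) == (-1, 4, 2)
# Matches the closed form T(a, b) = (7a - b, 4b - 7a, 2b - 4a):
a, b = 5, 9
assert extend_linearly(a, b) == (7*a - b, 4*b - 7*a, 2*b - 4*a)
```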
We will frequently want to find a linear map that has a certain effect on the vectors of a basis. In such cases we will define the map on the basis and "extend it linearly" to the entire vector space by the foregoing procedure. In general, if S ⊆ 𝒱 and T: S → 𝒲 is given, then to extend T linearly to 𝒱 is to define T on all of 𝒱 in such a way that T: 𝒱 → 𝒲 is linear, and for each U ∈ S, T(U) has the given value. However, if S is not a basis for 𝒱, then it may not be possible to extend T linearly to 𝒱; see problem 10 at the end of this section.
If T: 𝒱 → 𝒲 is the extension of T: S → 𝒲, then T: S → 𝒲 is the "restriction of T to S." We will most often wish to restrict a map to a subspace of the domain: If T: 𝒱 → 𝒲 and 𝒮 is a subspace of 𝒱, then the restriction of T to 𝒮 is the map T_𝒮: 𝒮 → 𝒲 defined by T_𝒮(U) = T(U) for all U ∈ 𝒮. Since the domain of the restriction map T_𝒮 is a vector space, if T is linear, then so is T_𝒮.
Problems

1. Find 𝒩_T and ℐ_T for the following linear maps:
a. T: 𝒱₃ → 𝒱₂; T(a, b, c) = (3a - 2c, b + c).
b. T: 𝒱₂ → 𝒱₃; T(a, b) = (a - 4b, a + 2b, 2a - b).
c. T: ℂ → ℂ; T(a + bi) = a + b.
d. T: R₃[t] → R₃[t]; T(a + bt + ct²) = a + 2c + (b - c)t + (a + 2b)t².
2. Let T: 𝒱₂ → 𝒱₂ be defined by T(a, b) = (4a - 2b, 3b - 6a) and find the images of the following lines under T:
a. {t(1, 5) | t ∈ R}. d. {t(2, 1) + (3, 5) | t ∈ R}.
b. {t(1, 2) | t ∈ R}. e. {t(5, 10) + (1, 4) | t ∈ R}.
c. {t(1, 2) + (3, 5) | t ∈ R}.
f. Under what conditions is one of these lines sent to a single point?
3. Find T[𝒮] when
a. T(a, b, c) = (2a + b, 3b, 4b - c); 𝒮 = ℒ{(-1, 3, 5), (1, 0, 4)}.
b. T as in part a and 𝒮 = {(x, y, z) | y = x + z}.
c. T(a, b, c) = (a + b - c, 2c - b); 𝒮 = ℒ{(-1, 2, 1)}.
d. T as in part c and 𝒮 = {(x, y, z) | z = 0}.
4. Determine if T is 1-1 by examining the null space 𝒩_T:
a. T(a + bt + ct²) = 2a + b + (a + b + c)t².
b. T(a + bt + ct²) = c + (a - b)t + at².
c. T(a, b, c, d) = (c - 3d, 0, 2a + b, c, 0, c + d).
d. T(a + bt) = a + 2at + (a - b)t² + bt³.
e. T(a, b, c) = (a + 2b + c, 3a - b + 2c).
5. How could one determine that the maps in parts a and e of problem 4 are not 1-1 by inspection?
6. Suppose 𝒱 = 𝒮 ⊕ 𝒯 and P: 𝒱 → 𝒯 is a projection map. Find ℐ_P and 𝒩_P.
7. Suppose T: 𝒱 → 𝒲 is linear and S ⊆ 𝒲. T⁻¹[S] (read "the complete inverse image of S") is the set of all preimages of vectors in S; that is, T⁻¹[S] = {V ∈ 𝒱 | T(V) ∈ S}.
a. What is T⁻¹[{0}]?
b. Show that if 𝒮 is a subspace of 𝒲, then T⁻¹[𝒮] is a subspace of 𝒱.
c. What condition must be satisfied by T⁻¹[{W}] for any W ∈ 𝒲 if T is to map 𝒱 onto 𝒲?
d. What condition must be satisfied by T⁻¹[{W}] for any W ∈ 𝒲 if T is to be one to one?
8. B = {(4, 3), (2, 2)} is a basis for 𝒱₂. Define T on B by T(4, 3) = (7, 5) and T(2, 2) = (2, 2). Extend T linearly to 𝒱₂ and find T(a, b).
9. Set T(4, 1) = (2, 1, -1), T(3, 1) = (4, -2, 1), and extend T linearly to a map from 𝒱₂ to 𝒱₃. What is T(a, b)?
10. a. If we set T(2, 4) = (6, -1, 3) and T(3, 6) = (3, 0, 1), can T be extended linearly to a linear transformation from 𝒱₂ to 𝒱₃?
b. If T(2, 4) = (2, 0, 6) and T(3, 6) = (3, 0, 9), can T be extended to a linear transformation from 𝒱₂ to 𝒱₃?
11. Show that T: 𝒱₂ → 𝒱₂ is linear if and only if there exist scalars a, b, c, d such that T(x, y) = (ax + by, cx + dy).
12. Prove that if T: 𝒱ₙ → 𝒱ₘ and the ith component of T(x₁, ..., xₙ) is of the form aᵢ₁x₁ + aᵢ₂x₂ + ··· + aᵢₙxₙ for some scalars aᵢ₁, ..., aᵢₙ, then T is linear.
13. Define T: 𝒱₃ → 𝒱₃ by T(a, b, c) = (4a + 2b - 6c, 0, 6a + 3b - 9c).
a. Show that 𝒩_T is represented by a plane through the origin.
b. Show that every plane parallel to 𝒩_T is sent by T to a single point.
c. Let 𝒮 be the subspace normal to 𝒩_T. Show that the restriction map T_𝒮 is 1-1 and the image space of T_𝒮 is the image space of T.
d. Could 𝒮 in part c be replaced by any other subspace?
14. Suppose 𝒱 is a finite-dimensional vector space and T: 𝒱 → 𝒲 is linear but not 1-1. Find a subspace 𝒮 of 𝒱 such that T_𝒮 is 1-1 and the image of the restriction T_𝒮 equals the image of T. Is there a unique choice for 𝒮?
15. Suppose T: 𝒱 → 𝒲 is linear and {U₁, ..., Uₙ} is a basis for 𝒱. Show that the vectors T(U₁), ..., T(Uₙ) span the image space ℐ_T.
3. Algebra of Linear Transformations
Recall the construction of the vector space ℱ, in which vectors are continuous functions from [0, 1] to R. The algebraic structure of ℱ was defined using addition and multiplication in the codomain of the functions, R. Further, the proof that ℱ is a vector space relied on the fact that R is itself a vector space. This idea can easily be generalized to other collections of functions with images in a vector space. In particular, the algebraic structure of a vector space can be introduced in the set of all linear transformations from one vector space to another.
A linear transformation preserves the algebraic structure of a vector space and thus belongs to a broad class of functions between algebraic systems which preserve the structure of the system. Such functions are called homomorphisms. A map between the field of real numbers and the field of complex numbers would be a homomorphism if it preserved the field structure. Since a linear transformation is an example of a homomorphism, the set of all linear transformations from 𝒱 to 𝒲 is often denoted Hom(𝒱, 𝒲).

Definition. If 𝒱 and 𝒲 are vector spaces, then

Hom(𝒱, 𝒲) = {T | T is a linear map from 𝒱 to 𝒲}.

When 𝒱 = 𝒲, Hom(𝒱, 𝒱) will be written as Hom(𝒱).

Note that the statement T ∈ Hom(𝒱, 𝒲) is simply a short way to write "T is a linear transformation from 𝒱 to 𝒲."
To give Hom(𝒱, 𝒲) the structure of a vector space, we must define operations of addition and scalar multiplication:
For S, T ∈ Hom(𝒱, 𝒲) define S + T by [S + T](V) = S(V) + T(V) for all V ∈ 𝒱. For T ∈ Hom(𝒱, 𝒲) and r ∈ R, define rT by [rT](V) = r[T(V)] for all V ∈ 𝒱.
Example 1. Let S, T ∈ Hom(𝒱₂, 𝒱₃) be given by

S(a, b) = (a - b, 2a, 3b - a)   and   T(a, b) = (b - a, b, a - b).

Then S + T is given by

[S + T](a, b) = S(a, b) + T(a, b) = (0, 2a + b, 2b)

and

[rT](a, b) = (rb - ra, rb, ra - rb).
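Because S + T and rT are defined pointwise, the operations on Hom(𝒱₂, 𝒱₃) translate directly into code. A sketch using the maps of Example 1 (helper names are illustrative, not from the text):

```python
def S(a, b):
    return (a - b, 2*a, 3*b - a)

def T(a, b):
    return (b - a, b, a - b)

def add(F, G):
    # [F + G](V) = F(V) + G(V), computed componentwise in the codomain
    return lambda a, b: tuple(x + y for x, y in zip(F(a, b), G(a, b)))

def scale(r, F):
    # [rF](V) = r * F(V)
    return lambda a, b: tuple(r * x for x in F(a, b))

for a, b in [(1, 0), (0, 1), (3, -2)]:
    assert add(S, T)(a, b) == (0, 2*a + b, 2*b)
    assert scale(5, T)(a, b) == (5*b - 5*a, 5*b, 5*a - 5*b)
```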
Example 2. Let S, T ∈ Hom(R₃[t]) be defined by

S(a + bt + ct²) = (a + b)t + at²   and   T(a + bt + ct²) = (c - a)t² - bt.

Then

[S + T](a + bt + ct²) = at + ct².

For example, [S + T](3 - 2t + t²) = 3t + t². The maps 7S and 4T have the following effect on the vector 3 - 2t + t²:

[7S](3 - 2t + t²) = 7[S(3 - 2t + t²)] = 7[t + 3t²] = 7t + 21t²

and

[4T](3 - 2t + t²) = 4[T(3 - 2t + t²)] = 4[2t - 2t²] = 8t - 8t².
Theorem 4.7. The set of linear transformations Hom(𝒱, 𝒲) together with addition and scalar multiplication as defined above is a vector space.

Proof. It is necessary to prove that all nine conditions in the definition of a vector space hold; these are listed on page 32.
Hom(𝒱, 𝒲) is closed under addition if, for any two maps S, T ∈ Hom(𝒱, 𝒲), S + T: 𝒱 → 𝒲 and S + T is linear. S + T is defined for each V ∈ 𝒱 and, since 𝒲 is closed under addition, the codomain of S + T is 𝒲; therefore S + T: 𝒱 → 𝒲. To show that S + T is linear, suppose U, V ∈ 𝒱 and a, b ∈ R. Then

[S + T](aU + bV)
= S(aU + bV) + T(aU + bV)                      (definition of + in Hom(𝒱, 𝒲))
= [a[S(U)] + b[S(V)]] + [a[T(U)] + b[T(V)]]    (S, T ∈ Hom(𝒱, 𝒲))
= [a[S(U)] + a[T(U)]] + [b[S(V)] + b[T(V)]]    (commutativity and associativity of + in 𝒲)
= a[S(U) + T(U)] + b[S(V) + T(V)]              (distributive law in 𝒲)
= a[[S + T](U)] + b[[S + T](V)]                (definition of + in Hom(𝒱, 𝒲)).

Therefore S + T is linear and S + T ∈ Hom(𝒱, 𝒲).
To prove that addition is commutative, that is, S + T = T + S for all S, T ∈ Hom(𝒱, 𝒲), it is necessary to show that [S + T](V) = [T + S](V) for all V ∈ 𝒱. That is, two maps are equal if and only if they have the same effect on every vector. But

[S + T](V) = S(V) + T(V) = T(V) + S(V) = [T + S](V).

(At what point in this chain of equalities is the commutativity of addition in 𝒲 used?) Therefore addition is commutative in Hom(𝒱, 𝒲). The proof that addition is associative in Hom(𝒱, 𝒲) is similar.
Define the zero map 0: 𝒱 → 𝒲 by 0(V) = 0_𝒲 for all V ∈ 𝒱. The map 0 is easily seen to be linear and thus a member of Hom(𝒱, 𝒲). Further,

[T + 0](V) = T(V) + 0(V) = T(V) + 0 = T(V)   for all V ∈ 𝒱.

Therefore T + 0 = T for any T ∈ Hom(𝒱, 𝒲), and 0 is the additive identity for Hom(𝒱, 𝒲). Notice that, in general, there are four different zeros associated with Hom(𝒱, 𝒲): the number zero, the zero vectors in 𝒱 and 𝒲, and the zero map in Hom(𝒱, 𝒲).
For the final property of addition, define -T by [-T](V) = -[T(V)] for all V ∈ 𝒱. It is easily shown that if T ∈ Hom(𝒱, 𝒲), then -T ∈ Hom(𝒱, 𝒲) and T + (-T) = 0. Therefore every element has an additive inverse in Hom(𝒱, 𝒲).
To complete the proof, it is necessary to prove that all the properties for scalar multiplication hold in Hom(𝒱, 𝒲). Only one of the distributive laws will be proved here, with the other properties left to the reader. To prove that r(S + T) = rS + rT for r ∈ R and S, T ∈ Hom(𝒱, 𝒲), it is necessary to show that [r(S + T)](V) = [rS + rT](V) for any V ∈ 𝒱:

[r(S + T)](V)
= r[(S + T)(V)]        (definition of scalar multiplication in Hom(𝒱, 𝒲))
= r[S(V) + T(V)]       (definition of addition in Hom(𝒱, 𝒲))
= r[S(V)] + r[T(V)]    (distributive law in 𝒲)
= [rS](V) + [rT](V)    (definition of scalar multiplication in Hom(𝒱, 𝒲))
= [rS + rT](V)         (definition of addition in Hom(𝒱, 𝒲)).

Therefore the maps r(S + T) and rS + rT yield the same image for each vector V and so are equal.
With Theorem 4.7 we have a new class of vector spaces, and the term vector applies to a linear transformation. Moreover, any definition or theorem about abstract vector spaces now applies to a space of linear transformations, Hom(𝒱, 𝒲).
Example 3. Determine if the vectors T₁, T₂, T₃ ∈ Hom(𝒱₂) are linearly independent when T₁(a, b) = (2a, a - b), T₂(a, b) = (4a + b, a), and T₃(a, b) = (b, 3a).

Consider S = x₁T₁ + x₂T₂ + x₃T₃, a linear combination of these vectors. The vectors are linearly independent if S = 0 implies x₁ = x₂ = x₃ = 0. If S = 0, then S(a, b) = 0(a, b) = (0, 0) for all (a, b) in 𝒱₂. But S is determined by what it does to a basis. Using {Eᵢ} we have

S(1, 0) = (2x₁ + 4x₂, x₁ + x₂ + 3x₃) = (0, 0)

and

S(0, 1) = (x₂ + x₃, -x₁) = (0, 0).

These vector equations yield the homogeneous system

2x₁ + 4x₂ = 0,   x₁ + x₂ + 3x₃ = 0,   x₂ + x₃ = 0,   -x₁ = 0,

which has only the trivial solution. Thus the set {T₁, T₂, T₃} is linearly independent in Hom(𝒱₂).
Why does Example 3 show that the dimension of the vector space Hom(𝒱₂) is at least 3?
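The test in Example 3 can be automated: evaluate each map on the standard basis, stack the coefficients into the 4 x 3 homogeneous system, and check that its rank equals the number of unknowns. A sketch with a hypothetical `rank` helper (not from the text):

```python
from fractions import Fraction

def rank(rows):
    # Exact Gaussian elimination; returns the number of pivots.
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

maps = [lambda a, b: (2*a, a - b),        # T1
        lambda a, b: (4*a + b, a),        # T2
        lambda a, b: (b, 3*a)]            # T3

# One row per component of each basis image; one column per map.
rows = []
for E in [(1, 0), (0, 1)]:
    images = [Ti(*E) for Ti in maps]
    for comp in range(2):
        rows.append([img[comp] for img in images])

# Rank 3 = number of unknowns: only the trivial solution exists, so
# {T1, T2, T3} is linearly independent, as Example 3 concludes.
assert rank(rows) == 3
```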
Example 4. Consider the three vectors in Hom(𝒱₂, 𝒱₃) given by T₁(a, b) = (a - b, a, 2a - 3b), T₂(a, b) = (3a - b, b, a + b), and T₃(a, b) = (a + b, b - 2a, 7b - 3a). Is the set {T₁, T₂, T₃} linearly independent?

Suppose S = x₁T₁ + x₂T₂ + x₃T₃ = 0. Then S(1, 0) = S(0, 1) = (0, 0, 0) yields two equations in 𝒱₃:

x₁(1, 1, 2) + x₂(3, 0, 1) + x₃(1, -2, -3) = (0, 0, 0)

and

x₁(-1, 0, -3) + x₂(-1, 1, 1) + x₃(1, 1, 7) = (0, 0, 0).

Setting components equal, we obtain a homogeneous system of six linear equations in three unknowns. But the rank of the coefficient matrix is 2, so there are nontrivial solutions. That is, S = 0 does not imply that x₁ = x₂ = x₃ = 0, and the set {T₁, T₂, T₃} is linearly dependent. (In fact, T₃ = -2T₁ + T₂.)
If one looks to the last two examples for general ideas, the first point to notice is that again the problem of determining linear dependence or independence leads to a homogeneous system of linear equations. Second, the number of homogeneous equations obtained, 4 and 6, is the product of the dimension of the domain and the dimension of the codomain: 4 = 2·2 and 6 = 2·3. In general, if the maps come from Hom(𝒱ₙ, 𝒱ₘ), then each of the n vectors in a basis for 𝒱ₙ yields a vector equation in 𝒱ₘ, and the m components in each vector equation yield m linear equations. Since a system of nm homogeneous linear equations has a nontrivial solution if there are more than nm unknowns, we can conclude that dim Hom(𝒱ₙ, 𝒱ₘ) ≤ nm. We could prove that the dimension of Hom(𝒱₂) is 4 by extending the set in Example 3 to a basis. But it will be more informative to construct a standard basis for Hom(𝒱₂) from the standard basis for 𝒱₂.
Example 5. Show that dim Hom(𝒱₂) = 4.

If we follow the pattern set in 𝒱ₙ, R[t], and ℳₙₓₘ of finding a simple basis, then we should look for a collection of simple nonzero maps in Hom(𝒱₂). Using the standard basis {Eᵢ} for 𝒱₂, we can obtain a map by choosing images for E₁ and E₂ and then extending linearly. The simplest choice is to send one vector to zero and the other to either E₁ or E₂. This yields four maps Tᵢⱼ: 𝒱₂ → 𝒱₂ with Tᵢⱼ(Eᵢ) = Eⱼ and Tᵢⱼ(Eₖ) = (0, 0) for k ≠ i; i, j = 1, 2. For example, T₁₂ sends the first basis vector in {Eᵢ} to the second vector in {Eᵢ}, and it sends the second vector in {Eᵢ} to zero. Extending this map linearly to 𝒱₂ gives

T₁₂(a, b) = aT₁₂(1, 0) + bT₁₂(0, 1) = a(0, 1) + b·0 = (0, a).

You might check that the other maps are just as simple, with T₁₁(a, b) = (a, 0), T₂₁(a, b) = (b, 0), and T₂₂(a, b) = (0, b).
The claim is that B = {T₁₁, T₁₂, T₂₁, T₂₂} is a basis for Hom(𝒱₂).
The proof that B is linearly independent follows the pattern set in Examples 3 and 4. To show that B spans Hom(𝒱₂), suppose T ∈ Hom(𝒱₂). Then there exist scalars aᵢⱼ such that T(1, 0) = (a₁₁, a₁₂) and T(0, 1) = (a₂₁, a₂₂). (Since T is determined by its action on a basis, these scalars completely determine T.) Now for any vector (x, y):

T(x, y) = xT(1, 0) + yT(0, 1)
= (xa₁₁ + ya₂₁, xa₁₂ + ya₂₂)
= a₁₁(x, 0) + a₁₂(0, x) + a₂₁(y, 0) + a₂₂(0, y)
= a₁₁T₁₁(x, y) + a₁₂T₁₂(x, y) + a₂₁T₂₁(x, y) + a₂₂T₂₂(x, y)
= [a₁₁T₁₁ + a₁₂T₁₂ + a₂₁T₂₁ + a₂₂T₂₂](x, y),

and B spans Hom(𝒱₂). Thus B is a basis for Hom(𝒱₂) and the dimension is 4.
In showing that B is a basis for Hom(𝒱₂), we have discovered how to find the coordinates of a map with respect to B. In fact, T: (a₁₁, a₁₂, a₂₁, a₂₂)_B. Hence if T(x, y) = (3x - 4y, 2x), then T(1, 0) = (3, 2) and T(0, 1) = (-4, 0), so T: (3, 2, -4, 0)_B. In the next section we will essentially arrange these coordinates into a matrix to obtain a matrix representation for T.
The technique for constructing a basis for Hom(𝒱₂) can easily be generalized.

Theorem 4.8. If dim 𝒱 = n and dim 𝒲 = m, then dim Hom(𝒱, 𝒲) = nm.

Proof. It is necessary to find a basis for Hom(𝒱, 𝒲). This means defining nm linear maps from 𝒱 to 𝒲. Since all we know about 𝒱 and 𝒲 is their dimensions, the natural place to start is with bases for 𝒱 and 𝒲.
So suppose {V₁, ..., Vₙ} and {W₁, ..., Wₘ} are bases for 𝒱 and 𝒲, respectively. For each i, 1 ≤ i ≤ n, and each j, 1 ≤ j ≤ m, we obtain a map Tᵢⱼ as follows: Let Tᵢⱼ(Vᵢ) = Wⱼ, Tᵢⱼ(Vₖ) = 0 if k ≠ i, and extend Tᵢⱼ to a linear map from 𝒱 to 𝒲. This gives us nm vectors in Hom(𝒱, 𝒲), and it is only necessary to show that they are linearly independent and that they span Hom(𝒱, 𝒲).
Suppose a linear combination of these vectors is the zero map, say

S = r₁₁T₁₁ + r₁₂T₁₂ + ··· + rₙₘTₙₘ = Σᵢ₌₁ⁿ Σⱼ₌₁ᵐ rᵢⱼTᵢⱼ = 0.

Then S(Vₕ) = 0(Vₕ) = 0_𝒲 for each basis vector Vₕ. Therefore,

0 = Σⱼ₌₁ᵐ [r₁ⱼT₁ⱼ(Vₕ) + r₂ⱼT₂ⱼ(Vₕ) + ··· + rₕⱼTₕⱼ(Vₕ) + ··· + rₙⱼTₙⱼ(Vₕ)]
= Σⱼ₌₁ᵐ [0 + 0 + ··· + 0 + rₕⱼWⱼ + 0 + ··· + 0]
= rₕ₁W₁ + rₕ₂W₂ + ··· + rₕₘWₘ.

The W's form a basis for 𝒲, so for each integer h, 1 ≤ h ≤ n, rₕ₁ = rₕ₂ = ··· = rₕₘ = 0 and the maps Tᵢⱼ are linearly independent.
To prove that these maps span Hom(𝒱, 𝒲), suppose T ∈ Hom(𝒱, 𝒲). Since T(Vᵢ) ∈ 𝒲 and {W₁, ..., Wₘ} spans 𝒲, there exist scalars aᵢⱼ such that T(Vᵢ) = aᵢ₁W₁ + aᵢ₂W₂ + ··· + aᵢₘWₘ for 1 ≤ i ≤ n. Now T and the linear combination Σᵢ₌₁ⁿ Σⱼ₌₁ᵐ aᵢⱼTᵢⱼ have the same effect on the basis {V₁, ..., Vₙ}, for if 1 ≤ h ≤ n, then

[Σᵢ₌₁ⁿ Σⱼ₌₁ᵐ aᵢⱼTᵢⱼ](Vₕ) = Σⱼ₌₁ᵐ aₕⱼWⱼ = T(Vₕ).

Therefore T = Σᵢ₌₁ⁿ Σⱼ₌₁ᵐ aᵢⱼTᵢⱼ and the maps Tᵢⱼ span Hom(𝒱, 𝒲).
Thus the nm maps Tᵢⱼ form a basis for Hom(𝒱, 𝒲), and the dimension of Hom(𝒱, 𝒲) is nm.
Example 6. Suppose B = {T₁₁, T₁₂, T₁₃, T₂₁, T₂₂, T₂₃} is the basis for Hom(𝒱₂, 𝒱₃) obtained using the bases B₁ = {(3, 1), (1, 0)} for 𝒱₂ and B₂ = {(1, 1, 1), (1, 1, 0), (1, 0, 0)} for 𝒱₃. Find T₂₁, T₁₃, and the coordinates of T with respect to B, given that T(x, y) = (3x - 2y, 5y, 4y - x).

For T₂₁, we know that T₂₁(3, 1) = (0, 0, 0) and T₂₁(1, 0) = (1, 1, 1). Since (a, b): (b, a - 3b)_{B₁}, T₂₁ is defined by

T₂₁(a, b) = b(0, 0, 0) + [a - 3b](1, 1, 1) = (a - 3b, a - 3b, a - 3b).

For example, T₂₁(15, 3) = (6, 6, 6). For T₁₃,

T₁₃(a, b) = bT₁₃(3, 1) + [a - 3b]T₁₃(1, 0)
= b(1, 0, 0) + (0, 0, 0)
= (b, 0, 0).

Therefore T₁₃(15, 3) = (3, 0, 0).
To find the coordinates of T with respect to B, notice that

T(3, 1) = (7, 5, 1)
= 1(1, 1, 1) + 4(1, 1, 0) + 2(1, 0, 0)
= 1T₁₁(3, 1) + 4T₁₂(3, 1) + 2T₁₃(3, 1)
= [1T₁₁ + 4T₁₂ + 2T₁₃](3, 1).

So if T: (a₁₁, a₁₂, a₁₃, a₂₁, a₂₂, a₂₃)_B, then a₁₁ = 1, a₁₂ = 4, and a₁₃ = 2. For the second vector in B₁, T(1, 0) = (3, 0, -1): (-1, 1, 3)_{B₂}; therefore the coordinates of T with respect to B are given by T: (1, 4, 2, -1, 1, 3)_B.
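The coordinate computations in Example 6 reduce to solving a triangular system, since each vector of B₂ introduces one new nonzero component. A sketch (the helper name is illustrative, not from the text):

```python
def coords_B2(x, y, z):
    # Write (x, y, z) = c1(1,1,1) + c2(1,1,0) + c3(1,0,0).
    # The third component forces c1 = z, the second forces c2 = y - c1,
    # and the first forces c3 = x - c1 - c2.
    c1 = z
    c2 = y - c1
    c3 = x - c1 - c2
    return (c1, c2, c3)

def T(x, y):
    # The map of Example 6
    return (3*x - 2*y, 5*y, 4*y - x)

assert coords_B2(*T(3, 1)) == (1, 4, 2)    # coordinates of T(3, 1) = (7, 5, 1)
assert coords_B2(*T(1, 0)) == (-1, 1, 3)   # coordinates of T(1, 0) = (3, 0, -1)
```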
In the process of obtaining the basis {Tᵢⱼ} for Hom(𝒱, 𝒲) we have found how to obtain the coordinates aᵢⱼ of a map with respect to {Tᵢⱼ}. The double index notation was natural since two bases were used, but it also strongly suggests the matrix (aᵢⱼ). In the next section we will see how to associate the transpose of this matrix with the map.

Problems

1. Suppose T₁, T₂, T₃ ∈ Hom(𝒱₂, 𝒱₃) are given by T₁(a, b) = (3a - b, 2a + 4b, a - 8b), T₂(a, b) = (a + b, b, a - 4b), and T₃(a, b) = (a + 5b, -2a, 3a - 8b). Find the following vectors:
a. T₁ + T₂. b. 5T₁. c. T₃ - 3T₂. d. T₁ - 4T₂ + T₃.
2. Determine if the following sets of vectors are linearly independent:
a. {T₁, T₂, T₃, T₄} ⊆ Hom(𝒱₂); T₁(a, b) = (a + b, a), T₂(a, b) = (a, a + b), T₃(a, b) = (a + b, 0), T₄(a, b) = (0, b).
b. {T₁, T₂, T₃} ⊆ Hom(𝒱₂); T₁(a, b) = (a, b), T₂(a, b) = (b, a), T₃(a, b) = (2a - b, 2b - a).
c. {T₁, T₂, T₃, T₄} ⊆ Hom(𝒱₂, 𝒱₃); T₁(a, b) = (a, b, b), T₂(a, b) = (a, a, b), T₃(a, b) = (b, a, a), T₄(a, b) = (b, b, a).
d. {T₁, T₂, T₃} ⊆ Hom(𝒱₂, 𝒱₃); T₁(a, b) = (a + b, 2b, b - a), T₂(a, b) = (a - b, 2a, a + b), T₃(a, b) = (a, a + b, b).
3. What does the sentence "let T ∈ Hom(𝒱, 𝒲)" mean?
4. Let S, T ∈ Hom(𝒱, 𝒲) and U, V ∈ 𝒱.
a. Identify three different meanings for the plus sign in the equation
[S + T](U + V) = [S + T](U) + [S + T](V).
b. What are the meanings of the symbol 0 in 0(U + 0) = 0(U) = 0?
5. a. Prove that Hom(𝒱, 𝒲) is closed under scalar multiplication.
b. Prove that [a + b]T = aT + bT for any T ∈ Hom(𝒱, 𝒲) and a, b ∈ R.
6. Let B = {T₁₁, T₁₂, T₂₁, T₂₂} be the basis constructed for Hom(𝒱₂) in Example 5. Find the following images:
a. T₁₂(3, 5). b. T₂₂(4, -6).
c. [2T₁₂ - 3T₂₁ + T₂₂](1, 3). d. [5T₁₁ + 7T₁₂ + 4T₂₁](0, -3).
Find the coordinates of the following vectors with respect to B:
e. T(a, b) = (4a, b - 3a). g. T(a, b) = (4a + b, 0).
f. T(a, b) = (0, 0). h. T(a, b) = (2a + 3b, a - 5b).
7. Write out the following sums:
c. Σᵢ₌₁³ Σⱼ₌₁² aᵢⱼTᵢⱼ.
8. Write the following sums using the sigma notation:
a. r₁₁T₁₁ + r₁₂T₁₂ + r₂₁T₂₁ + r₂₂T₂₂.
b. a₄₁T₄₁ + a₄₂T₄₂ + a₄₃T₄₃ + a₄₄T₄₄.
c. r₁a₁₁T₁₁ + r₁a₁₂T₁₂ + r₂a₂₁T₂₁ + r₂a₂₂T₂₂ + r₃a₃₁T₃₁ + r₃a₃₂T₃₂.
9. Suppose Tᵢⱼ are the maps used in Example 6.
a. Find T₁₂. b. Find T₂₂. c. T₂₃(4, 7) = ? d. T₁₁(5, 9) = ?
10. If B is the basis for Hom(𝒱₂, 𝒱₃) from Example 6, find the coordinates of T with respect to B when
a. T(x, y) = (x - y, 4x + y, 3y). c. T(x, y) = (x + 2y, x, x - y).
b. T(x, y) = (5x - 3y, 0, 2x + 2y). d. T(x, y) = (3x - 2y, 4y, 2x).
11. Use the bases {1, t} and {1, t, t²} for R₂[t] and R₃[t] to obtain the basis B = {T₁₁, T₁₂, T₁₃, T₂₁, T₂₂, T₂₃} for Hom(R₂[t], R₃[t]).
a. What are the maps T₁₁, T₁₃, and T₂₁?
b. What is the vector 2T₁₁ + 3T₂₁ - 4T₁₃?
Express the following maps as a linear combination of the Tᵢⱼ's:
c. T(a + bt) = at + bt². d. T(a + bt) = 2b + (a - 4b)t + 5at².
Find the coordinates of the following vectors with respect to B:
e. T(a + bt) = (a + b)t². g. T(a + bt) = 0.
f. T(a + bt) = 3a - b + (4b - 3a)t + (7a + 5b)t².
12. Use the bases B1 = {(1, 2, 1), (0, 1, -1), (1, 4, 0)} for 𝒱3 and B2 = {(0, 1), (1, 2)} for 𝒱2 to define the maps in the basis B = {T11, T12, T21, T22, T31, T32} for Hom(𝒱3, 𝒱2). Find the following images:
a. T21(1, 4, 0). b. T11(1, 4, 0). c. T11(0, -5, 5).
d. T22(3, 4, 5). e. T11(1, 0, 0).
Find the coordinates of the following maps with respect to B.
f. T(x, y, z) = (3x - y + z, 2y - x + 4z).
g. T(x, y, z) = (y - 2z, 3x + 4z).
13. What are the 4 zeros used in the vector space
a. Hom(𝒱2, 𝒱4)? b. Hom(𝒱3, R3[t])? c. Hom(ℳ2×3, ℳ2×2)?
4. Matrices of a Linear Transformation
We have seen that if 𝒱 and 𝒲 are finite-dimensional vector spaces, then so is the space of maps Hom(𝒱, 𝒲). Further, each choice of bases for 𝒱 and 𝒲 yields a basis for Hom(𝒱, 𝒲). The coordinates of a linear transformation T from 𝒱 to 𝒲 with respect to such a basis can be arranged as a matrix in several ways. The following example shows which way we should choose and illustrates one of the major uses for such a matrix.
Example 1 Suppose T: R2[t] → R3[t] is defined by

T(a + bt) = 3a + b + (a - 4b)t + (a + 2b)t².

An arbitrary image vector has the form x + yt + zt², and the equation T(a + bt) = x + yt + zt² yields the system of linear equations

3a + b = x
a - 4b = y
a + 2b = z

or, in matrix form,

(3  1)( a )   (x)
(1 -4)( b ) = (y)
(1  2)        (z)

Therefore the image of any vector a + bt in R2[t] can be found by matrix multiplication. As an example,

T(4 - 6t) = 12 - 6 + (4 + 24)t + (4 - 12)t² = 6 + 28t - 8t²

and

(3  1)( 4)   (12 - 6 )   ( 6)
(1 -4)(-6) = ( 4 + 24) = (28)
(1  2)       ( 4 - 12)   (-8)

Further, since

(3  1)(1)   (  6)
(1 -4)(3) = (-11)
(1  2)      (  7)

we can conclude that T(1 + 3t) = 6 - 11t + 7t².
In using the matrix

A = (3  1)
    (1 -4)
    (1  2)

to find images under T, it is clear that the vectors are not being used directly. Instead the columns contain coordinates with respect to the bases B1 = {1, t} and B2 = {1, t, t²}. Thus 4 - 6t: (4, -6)_B1, and 6 + 28t - 8t²: (6, 28, -8)_B2.
In general, the column vector

(a)
(b)

contains the coordinates of a + bt with respect to B1, and

(x)
(y)
(z)

contains the coordinates of x + yt + zt² with respect to B2. Moreover, the matrix A is related to these bases, for T(1) = 3 + t + t² has coordinates (3, 1, 1)_B2 and T(t) = 1 - 4t + 2t² has coordinates (1, -4, 2)_B2. Therefore the two columns in A contain the coordinates with respect to B2 of T(1) and T(t), the images of the two vectors in B1.
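The arithmetic in Example 1 is easy to check by machine. Below is a minimal sketch in Python (the names `A` and `matvec` are ours, not from the text): the matrix acts on B1-coordinates and returns B2-coordinates.

```python
# Matrix of T(a + bt) = 3a + b + (a - 4b)t + (a + 2b)t^2
# with respect to B1 = {1, t} and B2 = {1, t, t^2}.
A = [[3, 1],
     [1, -4],
     [1, 2]]

def matvec(M, x):
    """Multiply a matrix (stored as a list of rows) by a coordinate vector."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in M]

print(matvec(A, [4, -6]))  # [6, 28, -8] -- coordinates of T(4 - 6t)
print(matvec(A, [1, 3]))   # [6, -11, 7] -- coordinates of T(1 + 3t)
```

The output lists are the B2-coordinates of 6 + 28t - 8t² and 6 - 11t + 7t², agreeing with the computations above.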
Using Example 1 as a guide, we associate matrices with a map as follows:
Definition Let T ∈ Hom(𝒱, 𝒲) and suppose B1 = {V1, ..., Vn} and B2 = {W1, ..., Wm} are bases for 𝒱 and 𝒲, respectively. If T(Vj): (a1j, a2j, ..., amj)_B2 for 1 ≤ j ≤ n, then the m × n matrix (aij) is the matrix of T with respect to the bases B1 and B2.

Therefore A is the matrix of T with respect to B1 and B2 if the jth column of A contains the coordinates with respect to B2 for the image of the jth vector in B1. Since a linear map T is determined by its action on any basis, the matrix for T with respect to B1 and B2 contains enough information, in coordinate form, to determine the image of any vector under T.
Example 2 Suppose T ∈ Hom(𝒱2, 𝒱3) is given by T(a, b) = (2a + b, 3a + b, a + 2b). Find the matrix A of T with respect to the basis B1 = {(4, 1), (1, 0)} in the domain and the basis B2 = {(1, 1, 1), (1, 1, 0), (1, 0, 0)} in the codomain.
Since

T(4, 1) = (9, 13, 6): (6, 7, -4)_B2  and  T(1, 0) = (2, 3, 1): (1, 2, -1)_B2,

the matrix A is

A = ( 6  1)
    ( 7  2)
    (-4 -1)

Following Example 1, we should expect that if V: (a, b)_B1 and

( 6  1)(a)   (x)
( 7  2)(b) = (y)
(-4 -1)      (z)

then T(V): (x, y, z)_B2. To check this, choose V = (4, -6) arbitrarily. Then V = (4, -6): (-6, 28)_B1 and T(V) = (2, 6, -8): (-8, 14, -4)_B2. In this case we do have

( 6  1)(-6)   (-8)
( 7  2)(28) = (14)
(-4 -1)       (-4)

The proof that coordinates for an image can always be obtained in this way is given by Theorem 4.9.
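Each column of A in Example 2 is found by solving a linear system for coordinates; for this particular B2 the system is triangular, so back-substitution suffices. A sketch (the function names are ours, and the coordinate formula is special to this B2):

```python
def T(a, b):
    """The map of Example 2."""
    return (2*a + b, 3*a + b, a + 2*b)

def coords_B2(w):
    """Coordinates of w with respect to B2 = {(1,1,1), (1,1,0), (1,0,0)}:
    w = x(1,1,1) + y(1,1,0) + z(1,0,0), solved by back-substitution."""
    x = w[2]
    y = w[1] - x
    z = w[0] - x - y
    return [x, y, z]

# The columns of A are the B2-coordinates of the images of B1 = {(4,1), (1,0)}.
col1 = coords_B2(T(4, 1))    # [6, 7, -4]
col2 = coords_B2(T(1, 0))    # [1, 2, -1]

# Check Y = AX for V = (4, -6), which has B1-coordinates X = (-6, 28).
X = [-6, 28]
Y = [col1[i]*X[0] + col2[i]*X[1] for i in range(3)]
print(Y)                     # [-8, 14, -4]
print(coords_B2(T(4, -6)))   # [-8, 14, -4], the same coordinates
```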
Example 3 Let T be as in Example 2. Find the matrix A' of T with respect to the standard bases for 𝒱2 and 𝒱3.
We have

T(1, 0) = (2, 3, 1): (2, 3, 1)_{Ei}  and  T(0, 1) = (1, 1, 2): (1, 1, 2)_{Ei},

therefore

A' = (2 1)
     (3 1)
     (1 2)
Examples 2 and 3 show that different bases may give rise to different matrices for a given map. Therefore, one cannot speak of a matrix for a map without reference to a choice of bases. As a further example, consider the map T of Example 1 with the two bases {1, t} and {3 + t + t², 1 - 4t + 2t², 5 + t + 2t²}; the matrix for T with respect to these two bases is

(1 0)
(0 1)
(0 0)

Thus by a careful choice of bases, we have obtained a particularly simple matrix for T. We could continue to produce matrices for T in infinite variety by choosing different bases, but not every 3 × 2 matrix can be a matrix for T. Can you think of a 3 × 2 matrix which is not the matrix of T with respect to some choice of bases? What about a matrix with a column of zeros?
Theorem 4.9 Let A = (aij) be the matrix of T with respect to B1 and B2. If V: (x1, ..., xn)_B1 and T(V): (y1, ..., ym)_B2, set X = (x1, ..., xn)^t and Y = (y1, ..., ym)^t. Then Y = AX.
Proof Suppose B1 = {V1, ..., Vn} and B2 = {W1, ..., Wm}. Then by assumption

V = ∑_{j=1}^{n} xjVj  and  T(V) = ∑_{i=1}^{m} yiWi.

Further, by the definition of the matrix A,

T(Vj) = ∑_{i=1}^{m} aijWi.

Therefore

T(V) = T(∑_{j=1}^{n} xjVj) = ∑_{j=1}^{n} xjT(Vj) = ∑_{j=1}^{n} xj[∑_{i=1}^{m} aijWi] = ∑_{i=1}^{m} [∑_{j=1}^{n} aijxj]Wi.

So yi, the ith coordinate of T(V) with respect to B2, is ∑_{j=1}^{n} aijxj. Now

y1 = ∑_{j=1}^{n} a1jxj = a11x1 + a12x2 + ··· + a1nxn
y2 = ∑_{j=1}^{n} a2jxj = a21x1 + a22x2 + ··· + a2nxn
  ⋮
ym = ∑_{j=1}^{n} amjxj = am1x1 + am2x2 + ··· + amnxn

or

(y1)   (a11 a12 ··· a1n)(x1)
(y2) = (a21 a22 ··· a2n)(x2)
( ⋮)   ( ⋮           ⋮ )( ⋮)
(ym)   (am1 am2 ··· amn)(xn)

that is, Y = AX.
Loosely, the matrix equation Y = AX states that the product of a matrix A for T times coordinates X for V equals coordinates Y for T(V). Notice the similarity between the matrix equation for a linear transformation, Y = AX, and the equation for a linear function from R to R, y = ax. We should also observe that the matrix equation AX = Y is in the form of a system of m linear equations in n unknowns; see problem 9 at the end of this section.
Suppose T ∈ Hom(𝒱n, 𝒱m) has matrix A with respect to any basis B = {V1, ..., Vn} for 𝒱n and the standard basis {Ei} for 𝒱m. In this special case, T(Vj) is the transpose of the jth column of A. Then since the rank of T equals the dimension of 𝒥T = ℒ{T(V1), ..., T(Vn)}, the rank of T equals the (column) rank of A.
Example 4 Suppose T ∈ Hom(𝒱4, 𝒱3) is given by

T(a, b, c, d) = (a - b + 2c, 2a - b + d, a - 2c + d).

Suppose A is the matrix of T with respect to

B = {(1, 2, 0, 1), (3, 0, 2, 1), (0, -2, 0, 2), (1, 0, -2, 1)}

in the domain and {Ei} in the codomain. Show that the rank of A equals the rank of T.
First,

T(1, 2, 0, 1) = (-1, 1, 2),    T(0, -2, 0, 2) = (2, 4, 2),
T(3, 0, 2, 1) = (7, 7, 0),     T(1, 0, -2, 1) = (-3, 3, 6).

Therefore

A = (-1  7  2  -3)
    ( 1  7  4   3)
    ( 2  0  2   6)

Since A is column-equivalent to

A' = ( 1  0  0  0)
     ( 0  1  0  0)
     (-1  1  0  0)

the rank of A is 2. On the other hand, the image space of T is

𝒥T = ℒ{(-1, 1, 2), (7, 7, 0), (2, 4, 2), (-3, 3, 6)} = ℒ{(1, 0, -1), (0, 1, 1)}.

So the rank of T is also 2. Notice that 𝒥T is spanned by the transposed columns of A'. This happens only because A was obtained using the standard basis in the codomain.
In general, if A is a matrix for T, the columns of A do not span 𝒥T, but the ranks of A and T must be equal.
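Since the rank of a matrix is unchanged by elementary operations, the computation in Example 4 can be automated. A sketch with a hand-rolled reduction routine (ours, written with exact fractions to avoid round-off):

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) by Gauss-Jordan elimination,
    using exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# Matrix of Example 4: columns are the images T(V1), ..., T(V4).
A = [[-1, 7, 2, -3],
     [ 1, 7, 4,  3],
     [ 2, 0, 2,  6]]
print(rank(A))   # 2, the rank of T
```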
Theorem 4.10 Suppose T ∈ Hom(𝒱, 𝒲) has A as its matrix with respect to B1 and B2; then rank T = rank A.

Proof If B1 = {V1, ..., Vn}, then 𝒥T = ℒ{T(V1), ..., T(Vn)} and the jth column of A contains the coordinates of T(Vj) with respect to B2. The columns of A and the spanning vectors of 𝒥T are related by the isomorphism L: ℳm×1 → 𝒲 where L((a1, ..., am)^t) = W if W: (a1, ..., am)_B2. (The proof that L is an isomorphism is the same as the proof that an m-dimensional vector space is isomorphic to 𝒱m, Theorem 2.14, page 64.) If A = (aij), then T(Vj): (a1j, ..., amj)_B2. Therefore L((a1j, ..., amj)^t) = T(Vj), or L sends the jth column of A to the jth spanning vector of 𝒥T. This means that the restriction of L to the column space of A is a 1-1, onto map from the column space of A to 𝒥T. Therefore dim(column space of A) = dim 𝒥T, or rank A = rank T.
Example 5 Find the rank of T: R3[t] → R3[t] using a matrix for T, if

T(a + bt + ct²) = 3a + b + 5c + (2a + b + 4c)t + (a + b + 3c)t².

The matrix of T with respect to the basis B = {1, t, t²} in both the domain and codomain is

A = (3 1 5)
    (2 1 4)
    (1 1 3)

and

A' = ( 1 0 0)
     ( 0 1 0)
     (-1 2 0)

is column-equivalent to A. Hence the rank of A is 2 and rank T = 2. Since dim 𝒥T = 2 < dim R3[t], T is neither 1-1 nor onto.
The matrix A' can be used to find the image space for T. Since elementary column operations do not change the column space of a matrix, the columns of A' span the column space of A and are coordinates of spanning vectors for 𝒥T. That is, 𝒥T = ℒ{U, V} where U: (1, 0, -1)_B and V: (0, 1, 2)_B. These coordinates give U = 1 - t² and V = t + 2t², so 𝒥T = ℒ{1 - t², t + 2t²}. This basis for 𝒥T is analogous to the special basis obtained for the image space in Example 4.
The matrix A' is not the matrix of T with respect to {1, t, t²} for both the domain and codomain. But bases could be found which would give A' as the matrix for T. In fact, an even simpler matrix can be obtained.
Example 6 Let T and A be as in Example 5. The matrix

A'' = (1 0 0)
      (0 1 0)
      (0 0 0)

is equivalent to A. That is, A'' can be obtained from A with elementary row and column operations. Find bases B1 and B2 for R3[t] with respect to which A'' is the matrix of T.
Since the third column of A'' has all zeros, the third vector in B1 must be in 𝒩T = ℒ{1 + 2t - t²}. Therefore take 1 + 2t - t² as the third vector of B1. Then the first two vectors in B1 may be chosen arbitrarily provided the set obtained is linearly independent. Say we take B1 = {t, t², 1 + 2t - t²}. Now since T(t) = 1 + t + t² and T(t²) = 5 + 4t + 3t², these must be taken as the first two vectors in B2. (Why do these two images have to be independent, even though T is not one to one?) The third vector in B2 may be any vector in R3[t] so long as B2 is linearly independent. So let B2 = {1 + t + t², 5 + 4t + 3t², 7}. Then A'' is the matrix of T with respect to B1 and B2.
In order to obtain a simple matrix for a map, such as A'' in Example 6, complicated bases may be needed, while the simple basis {1, t, t²} used in Example 5 produced a 3 × 3 matrix with no special form for its entries. In some situations a simple matrix is needed, while in others the bases should be simple. But in general, it is not possible to have both.
We have been considering various matrices for a particular linear transformation. Suppose instead we fix a choice for the bases B1 and B2, and consider all maps in Hom(𝒱, 𝒲). If dim 𝒱 = n and dim 𝒲 = m, then each map T ∈ Hom(𝒱, 𝒲) determines an m × n matrix A_T which is the matrix of T with respect to B1 and B2. Thus there is a map φ defined from Hom(𝒱, 𝒲) to ℳm×n which sends T to the matrix A_T.
Example 7 Consider the map φ: Hom(𝒱2) → ℳ2×2 obtained using B1 = {(2, 4), (5, -3)} and B2 = {(1, 0), (3, 1)}. Then for T ∈ Hom(𝒱2), φ(T) is the matrix of T with respect to B1 and B2.
Suppose T1, T2, T3, T4 ∈ Hom(𝒱2) are given by

T1(a, b) = (a, b),      T2(a, b) = (2a - b, a + b),
T3(a, b) = (2a, b),     T4(a, b) = (-b, a).

Then

φ(T1) = (-10  14)     φ(T2) = (-18   7)
        (  4  -3),            (  6   2),

φ(T3) = ( -8  19)     φ(T4) = (-10 -12)
        (  4  -3),            (  2   5).

Notice that T2 = T3 + T4 and

φ(T2) = φ(T3) + φ(T4).

That is, φ(T3 + T4) = φ(T3) + φ(T4), suggesting that φ preserves addition. Further, the map 4T1 given by [4T1](a, b) = (4a, 4b) has

(-40  56)
( 16 -12)

as its matrix with respect to B1 and B2. So φ(4T1) = 4[φ(T1)], suggesting that φ is a linear transformation.
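The suggestion that φ is linear can be checked directly for Example 7. A sketch (the helper names `phi`, `coords`, and `add` are ours; the coordinate formula uses the triangular form of B2 = {(1, 0), (3, 1)}):

```python
def phi(T):
    """Matrix of T wrt B1 = {(2,4), (5,-3)} and B2 = {(1,0), (3,1)}:
    the columns are the B2-coordinates of T(2,4) and T(5,-3)."""
    def coords(w):              # w = c1*(1,0) + c2*(3,1)
        c2 = w[1]
        return [w[0] - 3*c2, c2]
    cols = [coords(T(2, 4)), coords(T(5, -3))]
    return [[cols[0][0], cols[1][0]],
            [cols[0][1], cols[1][1]]]

T1 = lambda a, b: (a, b)
T3 = lambda a, b: (2*a, b)
T4 = lambda a, b: (-b, a)
T2 = lambda a, b: (2*a - b, a + b)        # T2 = T3 + T4

add = lambda M, N: [[m + n for m, n in zip(r, s)] for r, s in zip(M, N)]

print(phi(T1))                            # [[-10, 14], [4, -3]]
print(phi(T2) == add(phi(T3), phi(T4)))   # True: phi preserves addition here
```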
The map φ is essentially the correspondence that sends maps from Hom(𝒱, 𝒲) to their coordinates with respect to a basis {Tij}. The only difference is that instead of writing the coordinates in a row, they are broken up into the columns of a matrix. From this point of view, the correspondence φ should be an isomorphism.
Theorem 4.11 If dim 𝒱 = n and dim 𝒲 = m, then Hom(𝒱, 𝒲) is isomorphic to ℳm×n.

Proof Suppose B1 and B2 are bases for 𝒱 and 𝒲, respectively. For each T ∈ Hom(𝒱, 𝒲), let φ(T) be the matrix of T with respect to B1 and B2. It must be shown that φ: Hom(𝒱, 𝒲) → ℳm×n is an isomorphism.
The proof that φ is linear is straightforward and left to the reader.
To show that φ is 1-1, suppose T ∈ Hom(𝒱, 𝒲) and φ(T) = 0_{ℳm×n}. That is, the matrix of T with respect to B1 and B2 is the zero matrix. Hence the matrix equation for T is Y = 0X. Since every image under T has coordinates Y = 0_{ℳm×1}, T is the zero map 0_{Hom(𝒱,𝒲)}. This means that the null space of φ contains only the zero vector, so φ is 1-1.
To prove that φ is onto, let {Tij} be the basis for Hom(𝒱, 𝒲) constructed using the bases B1 and B2. Then φ(Tij) is the m × n matrix with 1 in the ith column and jth row, and zeros everywhere else. The set of all such m × n matrices spans ℳm×n, so φ maps Hom(𝒱, 𝒲) onto ℳm×n. Thus φ is an isomorphism and Hom(𝒱, 𝒲) ≅ ℳm×n.
Theorem 4.11 is an important result, for it tells us that the study of linear transformations from one finite-dimensional vector space to another may be carried out in a space of matrices, and conversely. Computations with maps are often handled most easily with matrices, while several theoretical results for matrices are easily derived with linear maps.
Problems

1. Find the matrices of the following maps with respect to the standard bases for the n-tuple spaces.
a. T(a, b) = (a, b). b. T(a, b) = (0, a).
c. T(a, b, c) = (0, c, 0). d. T(a, b) = (2a - b, 3a + 4b, a + b).
2. Use the bases B1 = {1, t} and B2 = {1, t, t²} to find matrices for the following linear transformations from Hom(R2[t], R3[t]).
a. T(a + bt) = a + bt. b. T(a + bt) = (a + b)t².
c. T(a + bt) = 0. d. T(a + bt) = 3a - b + (4b - 3a)t + (7a + 5b)t².
3. Find the matrix of T ∈ Hom(𝒱2, 𝒱3) with respect to the bases {(7, 3), (5, 2)} and {(1, 3, 1), (1, 4, -2), (2, 6, 3)} when
a. T(a, b) = (0, 0, 0).
b. T(a, b) = (5b - 2a, 20b - 8a, 4a - 10b).
c. T(a, b) = (a - 2b, 0, b).
4. Find the matrix of T(a, b, c) = (a + 3b, b - 2c, a + 6c) with respect to B1 and B2 when:
a. B1 = B2 = {Ei}.
b. B1 = {(4, 1, -2), (3, -2, 1), (1, 1, 0)}, B2 = {Ei}.
c. B1 = {(4, 1, -2), (3, -2, 1), (1, 1, 0)}, B2 = {(7, 5, -8), (-3, -4, 9), (1, 2, 3)}.
d. B1 = {(2, 1, 0), (4, 0, 1), (-12, 4, 2)}, B2 = {(5, 1, 2), (4, -2, 10), (1, 0, 1)}.
e. B1 = {(1, 0, 0), (0, 1, 0), (6, -2, -1)}, B2 = {(1, 0, 1), (3, 1, 0), (8, 4, 2)}.
5. Let A be the matrix of T(a, b, c) = (b - 3a + 2c, 4c - 3a - b, a - c) with respect to the standard basis for 𝒱3. Use the matrix formula Y = AX to find:
a. T(0, 1, 0). b. T(2, -1, 3). c. T(2, 2, 2).
6. Let A be the matrix for the map T(a, b) = (3a - b, b - 4a) with respect to the basis {(1, 1), (1, 0)} for 𝒱2 (in both the domain and codomain). Use the matrix formula Y = AX to find:
a. T(1, 1). b. T(4, 0). c. T(3, 2). d. T(0, 1).
7. Let A be the matrix of T(a + bt + ct²) = 2a - c + (3a + b)t + (c - a - 3b)t² with respect to {1, t, t²}. Use Y = AX to obtain:
a. T(t). b. T(1 + t²). c. T(3t - 4). d. T(0).
8. Suppose A is the matrix of T with respect to the basis B1 = {V1, ..., Vn} in the domain and B2 = {W1, ..., Wm} in the codomain. What does it mean if:
a. The 3rd column of A has all zeros?
b. The 4th column of A has all zeros except for a 7 in the 2nd row?
c. The 1st column of A contains only two nonzero entries?
d. All nonzero entries of A occur in the first and last rows?
e. A = (aij) and aij = 0 unless i = j?
9. Suppose A is the matrix of a map T with respect to B1 and B2. View the matrix equation Y = AX as a system of linear equations in the coordinates of vectors. What can be inferred about T if:
a. AX = Y is consistent for all Y?
b. AX = Y has a unique solution if consistent?
c. AX = Y is not always consistent?
d. There is at least one parameter in the solution for AX = Y for any Y?
10. The matrix

A' = (1 0)
     (0 1)

is equivalent to the matrix A in problem 6. Find bases B1 and B2 for 𝒱2 with respect to which A' is the matrix for T.
11. The matrix

A' = (1 0 0)
     (0 1 0)
     (0 0 0)

is equivalent to the matrix A in problem 5. Find bases B1 and B2 for 𝒱3 with respect to which T has matrix A'.
12. Let T, A, and A' be as in Example 4. Therefore, A' is column-equivalent to A. Find bases B1 and B2 for 𝒱4 and 𝒱3, respectively, so that A' is the matrix of T with respect to B1 and B2.
13. Let B1 = {(2, 1, 1), (-1, 3, 1), (1, 0, 2)} and B2 = {(4, 1), (1, 0)}. Suppose φ(T) is the matrix of T with respect to B1 and B2. Define T1 and T2 by T1(a, b, c) = (3a - b + c, 3c - 4a) and T2(a, b, c) = (c - b - 2a, a - 2c).
a. Find φ(T1), φ(T2), φ(T1 + T2), and φ(2T1 + 3T2).
b. Show that φ(T1 + T2) = φ(T1) + φ(T2).
c. Show that φ(2T1 + 3T2) = 2φ(T1) + 3φ(T2).
14. Show that the map φ: Hom(𝒱, 𝒲) → ℳm×n is linear.
15. Suppose T, B1, and B2 are as in Example 6, page 153.
a. Find the matrix A of T with respect to B1 and B2.
b. How does A compare with the coordinates of T with respect to the basis {Tij}?
16. Suppose A = (aij) is the matrix of T ∈ Hom(𝒱3, 𝒱2) with respect to the bases B1 and B2, and B = {T11, T12, T21, T22, T31, T32} is the basis for Hom(𝒱3, 𝒱2) constructed using B1 and B2. Write the coordinates of T with respect to B in terms of the entries aij in A.
5. Composition of Maps
Suppose S and T are maps such that the image of T is contained in the domain of S. Then S and T can be "composed" to obtain a new map. This operation yields a multiplication for some pairs of maps, and using the association between matrices and maps, it leads directly to a general definition for matrix multiplication.

Definition Let T: 𝒱 → 𝒲 and S: 𝒲 → 𝒰. The composition of S and T is the map S∘T: 𝒱 → 𝒰 defined by [S∘T](V) = S(T(V)) for every vector V ∈ 𝒱.
Example 1 Let T: 𝒱3 → 𝒱2 and S: 𝒱2 → 𝒱3 be given by

T(a, b, c) = (a + b, a - c)  and  S(x, y) = (x - y, 2x - y, 3y).

Then

[S∘T](a, b, c) = S(T(a, b, c)) = S(a + b, a - c)
              = ([a + b] - [a - c], 2[a + b] - [a - c], 3[a - c])
              = (b + c, a + 2b + c, 3a - 3c).

The composition S∘T could be thought of as sending a vector V ∈ 𝒱3 to [S∘T](V) by first sending it to 𝒱2 with T and then sending T(V) to 𝒱3 with S. When V = (1, 2, 0), then T(1, 2, 0) = (3, 1), S(3, 1) = (2, 5, 3), and [S∘T](1, 2, 0) = (2, 5, 3). This is represented pictorially in Figure 3.
The composition T∘S is also defined for the maps in Example 1, with

[T∘S](x, y) = T(x - y, 2x - y, 3y) = (3x - 2y, x - 4y).

Notice that not only are T∘S and S∘T not equal, but they have different domains and codomains, for T∘S: 𝒱2 → 𝒱2 and S∘T: 𝒱3 → 𝒱3.

Figure 3
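Composition is conveniently modeled with functions. A sketch of Example 1 (the `compose` helper is ours):

```python
def T(a, b, c):
    return (a + b, a - c)

def S(x, y):
    return (x - y, 2*x - y, 3*y)

def compose(f, g):
    """Return f o g, the map sending V to f(g(V))."""
    return lambda *v: f(*g(*v))

ST = compose(S, T)   # S o T : V3 -> V3
TS = compose(T, S)   # T o S : V2 -> V2

print(ST(1, 2, 0))   # (2, 5, 3), as in the text
print(TS(1, 1))      # (1, -3), agreeing with (3x - 2y, x - 4y)
```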
Example 2 Suppose

T(a, b, c) = (a - c, 0, b + c)  and  S(x, y) = (y, -y, x + y).

Then S∘T is undefined because the codomain of T is 𝒱3 while the domain of S is 𝒱2. However, T∘S is defined, with [T∘S](x, y) = (-x, 0, x). Thus it may be possible to compose two maps in only one way.
Composed maps are encountered frequently in calculus. The chain rule for differentiation is introduced in order to differentiate composed functions. For example, the function y = √(sin x) is the composition of z = f(x) and y = g(z), where f(x) = sin x and g(z) = √z. For then,

y = [g∘f](x) = g(f(x)) = g(sin x) = √(sin x).
Composition may be viewed as a multiplication of maps. This is in contrast to scalar multiplication, which is only the multiplication of a number times a map. The most obvious characteristic of this multiplication is that it is not defined for all pairs of maps. But when composition is defined, it satisfies some of the properties associated with multiplication of real numbers.
Theorem 4.12
1. The composition of linear maps is linear.
2. Composition of maps is associative; if T1∘[T2∘T3] and [T1∘T2]∘T3 are defined, then

T1∘[T2∘T3] = [T1∘T2]∘T3.

3. For linear maps, composition distributes over addition:

S∘[T1 + T2] = S∘T1 + S∘T2.

Similarly

[S1 + S2]∘T = S1∘T + S2∘T,

provided the maps may be composed and added.
Proof Parts 1 and 3 are left to the reader. For 2, suppose V is in the domain of T3; then

(T1∘[T2∘T3])(V) = T1([T2∘T3](V)) = T1(T2(T3(V))) = [T1∘T2](T3(V)) = ([T1∘T2]∘T3)(V).

Since the maps T1∘[T2∘T3] and [T1∘T2]∘T3 have the same effect on V, they are equal, and composition of maps is associative.
If S and T are linear maps from a vector space to itself, the product S∘T is always defined. Further, S∘T: 𝒱 → 𝒱 is linear, so Hom(𝒱) is closed under composition. Thus composition defines a multiplication on the space of linear transformations Hom(𝒱). This is a special situation, for up to this point we have considered only one other multiplicative operation within a vector space, namely, the cross product in 𝒱3.
Consider the algebraic system consisting of Hom(𝒱) together with addition and multiplication given by composition, while disregarding scalar multiplication. This algebraic system is not a vector space. It is rather a set of elements, maps, on which two operations, addition and multiplication, are defined, and as such it is similar to the real number system. If we check through the basic properties of the real numbers listed on page 2, we find that addition of maps satisfies all the properties listed for addition in R. Of the properties listed for multiplication in R, we have seen, in Theorem 4.12, that Hom(𝒱) is closed under multiplication and that the associative law for multiplication holds in Hom(𝒱). Moreover, the distributive laws hold in Hom(𝒱). An algebraic system satisfying just these properties of addition and multiplication is called a ring. Thus the set of linear maps, Hom(𝒱), together with addition and multiplication (composition), form a ring.
It is easy to see that Hom(𝒱) does not satisfy all the field properties listed for R.
Example 3 Let S, T ∈ Hom(𝒱2) be given by

S(a, b) = (2a - b, a + 3b)  and  T(a, b) = (a + b, 2b - a).

Show that S∘T ≠ T∘S.
We have

[S∘T](a, b) = S(T(a, b)) = S(a + b, 2b - a) = (3a, 7b - 2a)

and

[T∘S](a, b) = T(S(a, b)) = T(2a - b, a + 3b) = (3a + 2b, 7b).

Therefore S∘T ≠ T∘S, and multiplication is not commutative in Hom(𝒱2). You might check to see that the same result is obtained with almost any pair of maps from Hom(𝒱2).
A ring of maps does share one additional property with the real numbers; it has a multiplicative identity. We define the identity map I: 𝒱 → 𝒱 by I(V) = V for all V ∈ 𝒱. It is then immediate that I ∈ Hom(𝒱) and I∘T = T∘I = T for any map T ∈ Hom(𝒱). Since a multiplicative identity exists, the ring Hom(𝒱) is called a ring with identity.
Since there is an identity for composition, we can look for multiplicative inverses.

Definition Given T: 𝒱 → 𝒲, suppose S: 𝒲 → 𝒱 satisfies S∘T = I_𝒱 and T∘S = I_𝒲. Then S is the inverse of T; T is said to be invertible, and its inverse is denoted by T⁻¹.
Example 4 Suppose T: 𝒱2 → 𝒱2 is given by T(a, b) = (2a + 3b, 3a + 4b). Show that T is invertible by finding T⁻¹.
Write T(a, b) = (x, y). If a map S exists such that S∘T = I, then S(x, y) = (a, b). Therefore, S may be obtained by solving

2a + 3b = x
3a + 4b = y

for a and b in terms of x and y. These equations have the unique solution

a = -4x + 3y
b = 3x - 2y.

Therefore S should be defined by

S(x, y) = (-4x + 3y, 3x - 2y).

Using this definition,

[S∘T](a, b) = S(T(a, b)) = S(2a + 3b, 3a + 4b)
            = (-4(2a + 3b) + 3(3a + 4b), 3(2a + 3b) - 2(3a + 4b))
            = (a, b).

That is, S∘T = I as desired. Since composition is not commutative, it is necessary to check that T∘S = I also, in order to show that S is T⁻¹.
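Checking S∘T = I and T∘S = I on sample vectors is easily automated; of course a few samples only suggest the identities, which the text verifies algebraically. A sketch for Example 4 (names ours):

```python
T = lambda a, b: (2*a + 3*b, 3*a + 4*b)
S = lambda x, y: (-4*x + 3*y, 3*x - 2*y)   # candidate for T^-1

# Check S o T = I and T o S = I on a few vectors.
for v in [(1, 0), (0, 1), (7, -5)]:
    assert S(*T(*v)) == v
    assert T(*S(*v)) == v
print("S inverts T on all sample vectors")
```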
Example 5 Define T ∈ Hom(R3[t], 𝒱3) by

T(a + bt + ct²) = (a + b + c, 2a + b + 2c, a + b).

Given that T is invertible, find T⁻¹.
If T(a + bt + ct²) = (x, y, z), then a, b, and c must be expressed in terms of x, y, and z. The system

a + b + c = x,  2a + b + 2c = y,  a + b = z

has the solution

a = -2x + y + z,  b = 2x - y,  c = x - z

for any x, y, and z. Therefore, T⁻¹ should be defined by

T⁻¹(x, y, z) = -2x + y + z + (2x - y)t + (x - z)t².

It must now be shown that T⁻¹∘T = I_{R3[t]} and T∘T⁻¹ = I_{𝒱3}. This is left to the reader.
In the real number system, every nonzero number has a multiplicative inverse. However, a similar statement cannot be made for the nonzero maps in Hom(𝒱2).

Example 6 Suppose T ∈ Hom(𝒱2) is defined by T(a, b) = (a, 0). Show that even though T ≠ 0, T has no inverse.
Assume a map S ∈ Hom(𝒱2) exists such that S∘T = I. Then S(T(V)) = V for all V ∈ 𝒱2. But 𝒩T ≠ {0}; for example, T(0, 1) = (0, 0) = T(0, 2). Therefore S(0, 0) must be (0, 1) on the one hand and (0, 2) on the other. Since this violates the definition of a map, S cannot exist. In this case, S fails to exist because T is not one to one.
Suppose we had asked that T∘S = I. Then T(S(W)) = W for all W ∈ 𝒱2. But now 𝒥T ≠ 𝒱2; in particular, (0, 1) ∉ 𝒥T. So if W is taken to be (0, 1), then W has no preimage under T. Therefore no matter how S(0, 1) is defined, T(S(0, 1)) ≠ (0, 1), and this map S cannot exist either. Here S fails to exist because T is not onto.
Before generalizing the implications of this example, in Theorem 4.14, a few facts concerning inverses should be stated. They are not difficult to prove, so their proofs are left to the reader.
Theorem 4.13
1. If T is invertible, then T has a unique inverse.
2. If T is linear and invertible, then T⁻¹ is linear and invertible with (T⁻¹)⁻¹ = T.
3. If S and T are invertible and S∘T is defined, then S∘T is invertible with (S∘T)⁻¹ = T⁻¹∘S⁻¹.

Notice that the inverse of a product is the product of the inverses in the reverse order. This change in order is necessary because composition is not commutative. In fact, even if the composition (S⁻¹∘T⁻¹)∘(S∘T) is defined, it need not be the identity map.
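The reversal of order in part 3 can be illustrated with a concrete pair of maps. A sketch (the maps S and T here are our own simple choices, not taken from the text):

```python
S = lambda a, b: (a + b, b)        # invertible shear
Sinv = lambda x, y: (x - y, y)     # its inverse, solved as in Example 4
T = lambda a, b: (a, a + b)        # another invertible shear
Tinv = lambda x, y: (x, y - x)

comp = lambda f, g: (lambda a, b: f(*g(a, b)))
ST = comp(S, T)

# (S o T)^-1 = T^-1 o S^-1: undo S first, then undo T.
for v in [(1, 0), (0, 1), (2, -3)]:
    assert comp(Tinv, Sinv)(*ST(*v)) == v

# Reversing the order fails: S^-1 o T^-1 does not undo S o T.
print(comp(Sinv, Tinv)(*ST(1, 0)))   # not (1, 0)
```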
Theorem 4.14 A linear map T: 𝒱 → 𝒲 is invertible if and only if it is one to one and onto.

Proof (⇒) Assume T⁻¹ exists. To see that T is 1-1, suppose V ∈ 𝒩T. Then since T⁻¹ is linear, T⁻¹(T(V)) = T⁻¹(0) = 0. But

T⁻¹(T(V)) = [T⁻¹∘T](V) = I(V) = V.

Therefore V = 0 and 𝒩T = {0}, so T is 1-1.
To see that T is onto, suppose W ∈ 𝒲. Since T⁻¹ is defined on 𝒲, there is a vector U ∈ 𝒱 such that T⁻¹(W) = U. But then

T(U) = T(T⁻¹(W)) = [T∘T⁻¹](W) = I(W) = W.

Thus 𝒲 ⊂ 𝒥T and T is onto.
(⇐) Assume T is 1-1 and onto. For each W ∈ 𝒲, there exists a vector V ∈ 𝒱 such that T(V) = W, because T maps 𝒱 onto 𝒲. Further, V is uniquely determined by W since T is 1-1. Therefore we may define S: 𝒲 → 𝒱 by S(W) = V where T(V) = W. Now if X ∈ 𝒱, [S∘T](X) = S(T(X)) = X, and if Y ∈ 𝒲, then [T∘S](Y) = T(S(Y)) = Y. Thus S is T⁻¹ and T is invertible.
Theorem 4.14 shows that a linear map is invertible if and only if it is an isomorphism. However, the essential requirement is that the map be one to one, for if T: 𝒱 → 𝒲 is one to one but not onto, then T: 𝒱 → 𝒥T is both one to one and onto. So an invertible map may be obtained without changing how T is defined on 𝒱. On the other hand, if T is not 1-1, then there is no such simple change which will correct the defect. Since being 1-1 is essential to inverting a map, a linear transformation is said to be nonsingular if it is one to one; otherwise it is singular. A singular map cannot be inverted even if its codomain is redefined, for it collapses a subspace of dimension greater than zero to the single vector 0.
The property of being invertible is not equivalent to being nonsingular or 1-1. However, in the important case of a map from a finite-dimensional vector space to itself, the two conditions are identical.
Theorem 4.15 If 𝒱 is finite dimensional and T ∈ Hom(𝒱), then the following conditions are equivalent:
1. T is invertible.
2. T is 1-1. (Nullity T = 0 or 𝒩T = {0}.)
3. T is onto. (Rank T = dim 𝒱.)
4. If A is a matrix for T with respect to some choice of bases, then det A ≠ 0.
5. T is nonsingular.

Proof These follow at once from Theorem 4.14 together with the two facts:

dim 𝒱 = rank T + nullity T  and  rank T = rank A.
We can see how the conditions in Theorem 4.15 are related by considering an arbitrary map T ∈ Hom(𝒱2). From problem 11, page 145, we know that T(a, b) = (ra + sb, ta + ub) for some scalars r, s, t, u. If we write T(a, b) = (x, y), then T⁻¹ exists provided the equations

ra + sb = x
ta + ub = y

can be solved for a and b given any values for x and y. (This was done in Example 4.) From our work with systems of linear equations, we know that such a solution exists if and only if the coefficient matrix

A = (r s)
    (t u)

has rank 2. But A is the matrix of T with respect to the standard basis for 𝒱2. So T⁻¹ exists if and only if det A ≠ 0 or rank T = 2. Further, the rank of A is 2 if and only if the equations have a unique solution for every choice of x and y. That is, if and only if T is 1-1 and onto.
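Condition 4 of Theorem 4.15 thus gives a one-line invertibility test in the 2 × 2 case (the function below is our illustration):

```python
def is_invertible(r, s, t, u):
    """T(a, b) = (ra + sb, ta + ub) is invertible iff det (r s; t u) != 0."""
    return r*u - s*t != 0

print(is_invertible(2, 3, 3, 4))   # True: the map of Example 4, det = -1
print(is_invertible(1, 0, 0, 0))   # False: Example 6, T(a, b) = (a, 0)
```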
Returning to the ring of maps Hom(𝒱), Examples 3 and 6 can be generalized to show that if the dimension of 𝒱 is at least 2, then multiplication in Hom(𝒱) is not commutative and nonzero elements need not have multiplicative inverses. So the ring Hom(𝒱) differs from the field of real numbers in two important respects. Suppose we briefly consider the set of nonsingular maps in Hom(𝒱) together with the single operation of composition. This algebraic system is similar to the set of all nonzero real numbers together with the single operation of multiplication. The only difference is that composition of maps is not commutative. That is, for the set of all nonsingular maps in Hom(𝒱) together with composition: (1) the set is closed under composition; (2) composition is associative; (3) there is an identity element; and (4) every element has an inverse under composition. Such an algebraic system with one operation, satisfying these four properties, is called a group. The nonzero real numbers together with multiplication also form a group. However, since multiplication of numbers is commutative, this system is called a commutative group. Still another commutative group is formed by the set of all real numbers together with the single operation of addition.
A general study of groups, rings, and fields is usually called modern algebra. We will not need to make a general study of such systems, but in time we will have occasion to examine general fields, rings of polynomials, and groups of permutations.
Problems

1. Find S∘T and/or T∘S when
a. T(a, b) = (2a - b, 3b), S(x, y) = (y - x, 4x).
b. T(a, b, c) = (a - b + c, a + c), S(x, y) = (x, x - y, y).
c. T(a, b) = (a, 3b + a, 2a - 4b, b), S(a, b, c) = (2a, b).
d. T(a + bt) = 2a - bt², S(a + bt + ct²) = c + bt² - 3at³.
2. For T ∈ Hom(𝒱), Tⁿ is the map obtained by composing T with itself n times. If 𝒱 = 𝒱2 and T(a, b) = (2a, b - a), find
a. T². b. T³. c. Tⁿ. d. T² - 2T. e. 4T + 3I.
3. Show that T² = I when T(a, b) = (a, 3a - b).
4. Is T² meaningful for any linear map T?
5. Suppose T ∈ Hom(𝒱, 𝒲) and S ∈ Hom(𝒲, 𝒰). Prove that S∘T ∈ Hom(𝒱, 𝒰). That is, prove that the composition of linear maps is linear.
6. Suppose T1(a, b, c) = (a, b + c, a - c), T2(a, b) = (2a - b, a, b - a), and T3(a, b) = (0, 3a - b, 2b). Show that the maps T1∘[T2 + T3] and T1∘T2 + T1∘T3 are equal.
7. a. Prove that the distributive law in problem 6 holds for any three linear maps for which the expressions are defined.
b. Why are there two distributive laws given for the ring of maps Hom(𝒱) when only one was stated for the field of real numbers R?
8. Find all maps S ∈ Hom(𝒱2) such that S∘T = T∘S when
a. T(a, b) = (a, 0). b. T(a, b) = (2a, b).
c. T(a, b) = (2a - b, 3a). d. T(a, b) = (a + 2b, 3b - 2a).
9. Suppose T ∈ Hom(𝒱) and let 𝒮 = {S ∈ Hom(𝒱) | S∘T = T∘S}.
a. Prove that 𝒮 is a subspace of Hom(𝒱).
b. Show that if dim 𝒱 ≥ 1, then dim 𝒮 ≥ 1 also.
10. Determine if T is invertible and if so find its inverse.
a. T(a, b, c) = (a - 2b + c, 2a + b - c, b - 3a).
b. T(a, b) = (a - 2b, b - a).
c. T(a, b, c) = a + c + bt + (-a - b)t².
d. T(a + bt) = (2a + b, 3a + b).
e. T(a + bt) = a + 3at + bt².
f. T(a, b, c) = (a - 2c, 2a + b, b + 3c).
11. Suppose T: 𝒱 → 𝒲 is linear and invertible.
a. Prove that T has only one inverse.
b. Prove that T⁻¹ is linear.
12. Show by example that the set of all nonsingular maps in Hom(𝒱₂) together with 0 is not closed under addition. Therefore the set of nonsingular maps together with addition and composition does not form a ring.
13. Find an example of a linear map from a vector space to itself which is
a. one to one but not onto.
b. onto but not one to one.
14. Suppose the composition S o T is one to one and onto.
a. Prove that T is one to one.
b. Prove that S is onto.
c. Show by example that T need not be onto and S need not be 1-1.
15. Suppose 𝒱 is a finite-dimensional vector space and T ∈ Hom(𝒱). Prove that if there exists a map S ∈ Hom(𝒱) such that S ∘ T = I, then T ∘ S = I. That is, if S ∘ T = I (or T ∘ S = I), then T is nonsingular and S = T⁻¹.
176
4. LINEAR TRANSFORMATIONS
16. Suppose S(a, b) = (4a + b, 3a + b) and T(a, b) = (3a + b, 5a + 2b).
a. Find S ∘ T.
b. Find S⁻¹, T⁻¹, and (S ∘ T)⁻¹.
c. Find T⁻¹ ∘ S⁻¹ and check that it equals (S ∘ T)⁻¹.
d. Show that S⁻¹ ∘ T⁻¹ ∘ S ∘ T ≠ I. That is, (S ∘ T)⁻¹ ≠ S⁻¹ ∘ T⁻¹.
17. Prove that if S and T are invertible and S ∘ T is defined, then (S ∘ T)⁻¹ = T⁻¹ ∘ S⁻¹.
18. We know from analytic geometry that if T_α and T_β are rotations of the plane through the angles α and β, respectively, then T_α ∘ T_β = T_{α+β}. Show that the equation T_α ∘ T_β(1, 0) = T_{α+β}(1, 0) yields the addition formulas for the sine and cosine functions.
6. Matrix Multiplication
The multiplication of linear transformations, given by composition, in conjunction with the correspondence between maps and matrices, yields a general definition for matrix multiplication. The product of matrices should be defined so that the isomorphisms between spaces of maps and spaces of matrices preserve multiplication as well as addition and scalar multiplication. That is, if S and T are maps with matrices A and B with respect to some choice of bases, then S ∘ T should have the matrix product AB as its matrix with respect to the same choice of bases. An example will lead us to the proper definition.
Example 1   Let S, T ∈ Hom(𝒱₂) be given by

    S(x, y) = (a₁₁x + a₁₂y, a₂₁x + a₂₂y),
    T(x, y) = (b₁₁x + b₁₂y, b₂₁x + b₂₂y),

then

    A_S = ( a₁₁  a₁₂ )      and      A_T = ( b₁₁  b₁₂ )
          ( a₂₁  a₂₂ )                     ( b₂₁  b₂₂ )

are the matrices of S and T with respect to the standard basis for 𝒱₂. And since

    S ∘ T(x, y) = ([a₁₁b₁₁ + a₁₂b₂₁]x + [a₁₁b₁₂ + a₁₂b₂₂]y,
                   [a₂₁b₁₁ + a₂₂b₂₁]x + [a₂₁b₁₂ + a₂₂b₂₂]y),

the matrix of S ∘ T with respect to {Eᵢ} is

    A_{S∘T} = ( a₁₁b₁₁ + a₁₂b₂₁   a₁₁b₁₂ + a₁₂b₂₂ )
              ( a₂₁b₁₁ + a₂₂b₂₁   a₂₁b₁₂ + a₂₂b₂₂ ).
Therefore the product of A_S and A_T should be defined so that

    A_S A_T = ( a₁₁  a₁₂ )( b₁₁  b₁₂ ) = ( a₁₁b₁₁ + a₁₂b₂₁   a₁₁b₁₂ + a₁₂b₂₂ ) = A_{S∘T}.
              ( a₂₁  a₂₂ )( b₂₁  b₂₂ )   ( a₂₁b₁₁ + a₂₂b₂₁   a₂₁b₁₂ + a₂₂b₂₂ )

Notice that the product should be defined with the same row times column rule used to represent systems of linear equations in matrix form. For the first column of A_{S∘T} is the matrix product

    ( a₁₁  a₁₂ )( b₁₁ )
    ( a₂₁  a₂₂ )( b₂₁ )

and the second column is

    ( a₁₁  a₁₂ )( b₁₂ )
    ( a₂₁  a₂₂ )( b₂₂ ).
In other words, the element in the ith row and jth column of A_{S∘T} is the sum of the products of the elements in the ith row of A_S times the elements in the jth column of A_T. This "row by column" rule will be used to define multiplication in general. The summation notation is very useful for this. If we set A_{S∘T} = (cᵢⱼ) above, then cᵢⱼ = aᵢ₁b₁ⱼ + aᵢ₂b₂ⱼ for each i and j, therefore

    cᵢⱼ = Σₖ₌₁² aᵢₖbₖⱼ.

A sum of this form will be characteristic of an entry in a matrix product. Notice that in the sum Σₖ₌₁² aᵢₖbₖⱼ the elements aᵢₖ come from the ith row of the first matrix and the elements bₖⱼ come from the jth column of the second.
Definition   Let A = (aᵢⱼ) be an n × m matrix and B = (bᵢⱼ) an m × p matrix. The product AB of A and B is the n × p matrix (cᵢⱼ) where

    cᵢⱼ = Σₖ₌₁ᵐ aᵢₖbₖⱼ = aᵢ₁b₁ⱼ + aᵢ₂b₂ⱼ + ⋯ + aᵢₘbₘⱼ,    1 ≤ i ≤ n, 1 ≤ j ≤ p.
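The row-by-column rule translates directly into code. The following is a minimal sketch (the function name is ours, not the text's): entry (i, j) of AB is the sum over k of aᵢₖbₖⱼ, exactly as in the definition.

```python
# Sketch of the definition above: entry (i, j) of AB is the sum over k
# of a_ik * b_kj.  Matrices are represented as lists of rows.

def mat_mul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    # AB is defined only when the number of columns of A equals the rows of B
    assert all(len(row) == m for row in A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

# A small 2 x 2 check: (3 -1; 2 -4)(2 2; 1 3) = (5 3; 0 -8)
print(mat_mul([[3, -1], [2, -4]], [[2, 2], [1, 3]]))   # [[5, 3], [0, -8]]
```

Note that the triple loop (over i, j, and k) mirrors the three indices n, p, and m in the definition.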
This definition yields A_S A_T = A_{S∘T} when A_S, A_T, and A_{S∘T} are as in Example 1. The definition also agrees with our previous definition of matrix product when B is m × 1. Not all matrices can be multiplied; the product AB is defined only if the number of columns in A equals the number of rows in B. What does this correspond to in the composition of maps?
Example 2   Let

    A = (aᵢⱼ) = ( 2   4 )
                ( 1   3 )
                ( 2  −1 )

and

    B = (bᵢⱼ) = (  1  2  −1   0 )
                ( −1  1   2  −2 ).

Since A is 3 × 2 and B is 2 × 4, AB is defined but BA is not. The product AB = (cᵢⱼ) is a 3 × 4 matrix with

    c₁₁ = Σₖ₌₁² a₁ₖbₖ₁ = 2·1 + 4·(−1) = −2

and

    c₃₄ = Σₖ₌₁² a₃ₖbₖ₄ = 2·0 + (−1)(−2) = 2.

Computing all the entries cᵢⱼ in this way gives the full 3 × 4 product AB.
Example 3   Some other examples of matrix products are:

    ( 3  −1 )( 2  2 )   ( 5   3 )
    ( 2  −4 )( 1  3 ) = ( 0  −8 ),

    ( 2 )              ( 2  6  0 )
    ( 1 )( 1  3  0 ) = ( 1  3  0 ),

    ( 4  1  −2 ) (  2  −3 )   (  0  −14 )
    ( 1  3   5 ) ( −2   0 ) = ( 11    2 ),
                 (  3   1 )

    ( −2  3 )               ( −5   7   19 )
    (  2  0 )( 4  1  −2 ) = (  8   2   −4 )
    (  3  1 )( 1  3   5 )   ( 13   6   −1 ).
The complicated row by column definition for matrix multiplication is
chosen to correspond to composition of linear transformations. Once this
correspondence is established in the next theorem, all the properties relating
to composition, such as associativity or invertibility, can be carried over to
matrix multiplication.
Theorem 4.16   Let T: 𝒱 → 𝒲 and S: 𝒲 → 𝒰 be linear with dim 𝒱 = p, dim 𝒲 = m, and dim 𝒰 = n. Suppose B₁, B₂, B₃ are bases of 𝒱, 𝒲, 𝒰, respectively, and that A_T is the m × p matrix of T with respect to B₁ and B₂, A_S is the n × m matrix of S with respect to B₂ and B₃, and A_{S∘T} is the n × p matrix of S ∘ T with respect to B₁ and B₃. Then A_S A_T = A_{S∘T}.
Proof   The proof is of necessity a little involved, but it only uses the definitions for a matrix of a map and matrix multiplication.
Let B₁ = {V₁, …, Vₚ}, B₂ = {W₁, …, Wₘ}, and B₃ = {U₁, …, Uₙ}, with

    T(Vⱼ) = Σₖ₌₁ᵐ bₖⱼWₖ   for 1 ≤ j ≤ p,
    S(Wₖ) = Σₕ₌₁ⁿ aₕₖUₕ   for 1 ≤ k ≤ m.

And if A_{S∘T} = (cₕⱼ), then

    S ∘ T(Vⱼ) = Σₕ₌₁ⁿ cₕⱼUₕ   for 1 ≤ j ≤ p.
The proof consists in finding a second expression for S ∘ T(Vⱼ):

    S ∘ T(Vⱼ) = S(T(Vⱼ)) = S(Σₖ₌₁ᵐ bₖⱼWₖ) = Σₖ₌₁ᵐ bₖⱼS(Wₖ)
              = Σₖ₌₁ᵐ bₖⱼ[Σₕ₌₁ⁿ aₕₖUₕ] = Σₕ₌₁ⁿ [Σₖ₌₁ᵐ aₕₖbₖⱼ]Uₕ.

The last equality is obtained by rearranging the nm terms in the previous sum. Now the two expressions for S ∘ T(Vⱼ) give

    Σₕ₌₁ⁿ cₕⱼUₕ = Σₕ₌₁ⁿ [Σₖ₌₁ᵐ aₕₖbₖⱼ]Uₕ.

But the U's are linearly independent, so cₕⱼ = Σₖ₌₁ᵐ aₕₖbₖⱼ for each h and j. That is, (aₕₖ)(bₖⱼ) = (cₕⱼ), or A_S A_T = A_{S∘T}.
If this proof appears mysterious, try showing that it works for the spaces and maps in Example 1, or you might consider the case 𝒱 = 𝒱₂, 𝒲 = 𝒱₃, and 𝒰 = 𝒱₂.
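Theorem 4.16 is also easy to check numerically. The sketch below (helper names are ours) builds the standard-basis matrix of a linear map by applying it to the standard basis vectors, then compares the matrix of S ∘ T with the product A_S A_T.

```python
# Numerical check of Theorem 4.16: the matrix of a composition equals the
# product of the matrices (standard bases, 0-based indexing).

def matrix_of(f, dim_in):
    """Columns are the images of the standard basis vectors."""
    cols = [f(tuple(1 if i == j else 0 for i in range(dim_in)))
            for j in range(dim_in)]
    return [[col[i] for col in cols] for i in range(len(cols[0]))]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

S = lambda v: (2 * v[0] - v[1], 3 * v[1])   # S(x, y) = (2x - y, 3y)
T = lambda v: (v[1] - v[0], 4 * v[0])       # T(x, y) = (y - x, 4x)
A_S, A_T = matrix_of(S, 2), matrix_of(T, 2)
A_ST = matrix_of(lambda v: S(T(v)), 2)      # matrix of S o T
assert A_ST == mat_mul(A_S, A_T)
```

The two maps here are our own small examples; any linear maps for which the composition is defined would do.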
For all positive integers n and m, let φ: Hom(𝒱ₘ, 𝒱ₙ) → ℳₙₓₘ with φ(T) the matrix of T with respect to the standard bases. For any particular n and m we know that φ is an isomorphism, and with the above theorem, φ preserves multiplication where it is defined. That is, if T ∈ Hom(𝒱ₚ, 𝒱ₘ) and S ∈ Hom(𝒱ₘ, 𝒱ₙ), then φ(S ∘ T) = φ(S)φ(T). Thus φ can be used to carry the properties for composition of maps over to multiplication of matrices.
Theorem 4.17   Matrix multiplication is associative and distributive over addition. That is,

    A(BC) = (AB)C,    A(B + C) = AB + AC,    (A + B)C = AC + BC

hold for all matrices A, B, and C for which the expressions are defined.
Proof   Consider one of the distributive laws. Suppose A ∈ ℳₙₓₘ and B, C ∈ ℳₘₓₚ. Then there exist maps T ∈ Hom(𝒱ₘ, 𝒱ₙ) and S, U ∈ Hom(𝒱ₚ, 𝒱ₘ) such that φ(T) = A, φ(S) = B, and φ(U) = C. Using these maps,

    A(B + C) = φ(T)[φ(S) + φ(U)]
             = φ(T)φ(S + U)            φ preserves addition
             = φ(T ∘ [S + U])          φ preserves multiplication
             = φ(T ∘ S + T ∘ U)        Distributive law for maps
             = φ(T ∘ S) + φ(T ∘ U)     φ preserves addition
             = φ(T)φ(S) + φ(T)φ(U)     φ preserves multiplication
             = AB + AC.
The map φ carries the algebraic properties of Hom(𝒱ₙ) to ℳₙₓₙ, making ℳₙₓₙ a ring with identity. Thus φ: Hom(𝒱ₙ) → ℳₙₓₙ is a (ring) homomorphism. The identity in ℳₙₓₙ is φ(I), the n × n matrix with 1's on the main diagonal and 0's elsewhere. This matrix is denoted by Iₙ and is called the n × n identity matrix. If A is an n × n matrix, then AIₙ = IₙA = A. If B is an n × m matrix with n ≠ m, then IₙB = B and the product of B times Iₙ is not defined, but BIₘ = B.
Definition   Let A be an n × n matrix and suppose there exists an n × n matrix B such that AB = BA = Iₙ. Then B is the inverse of A, written A⁻¹, and A is said to be nonsingular. Otherwise A is singular.
Theorem 4.18   Let T ∈ Hom(𝒱ₙ) and let A ∈ ℳₙₓₙ be the matrix of T with respect to the standard basis for 𝒱ₙ, i.e., A = φ(T). Then A is nonsingular if and only if T is nonsingular.
Corollary 4.19   An n × n matrix A is nonsingular if and only if |A| ≠ 0.
These two facts are easily proved and left as problems.
Example 4   Show that

    A = ( 2  1 )
        ( 5  3 )

is nonsingular and find A⁻¹.
We have that |A| = 1, so A is nonsingular and there exist scalars a, b, c, d such that

    ( 2  1 )( a  b )        ( 1  0 )
    ( 5  3 )( c  d ) = I₂ = ( 0  1 ).

This matrix equation yields four linear equations which have the unique solution a = 3, b = −1, c = −5, and d = 2. One can verify that if

    B = (  3  −1 )
        ( −5   2 ),

then AB = BA = I₂ and B = A⁻¹. This is not a good way to find the inverse of any matrix. In particular, we will see that the inverse of a 2 × 2 matrix A can be written down directly from A.
The definition of A⁻¹ requires that AB = Iₙ and BA = Iₙ, but it is not necessary to check both products. Problem 15 on page 175 yields the following result for matrices.
Theorem 4.20   Let A, B ∈ ℳₙₓₙ. If AB = Iₙ or BA = Iₙ, then A is nonsingular and B = A⁻¹.
The amount of computation needed to invert an n × n nonsingular matrix increases as n increases, and although there is generally no quick way to find an inverse, there are two general techniques that have several useful consequences. The first technique results from the following fact.
Theorem 4.21   Suppose A = (aᵢⱼ) is an n × n matrix with Aᵢⱼ the cofactor of aᵢⱼ. If h ≠ k, then

    Σⱼ₌₁ⁿ aₕⱼAₖⱼ = 0   and   Σᵢ₌₁ⁿ aᵢₕAᵢₖ = 0.

(The first equation shows that the sum of the elements from the hth row times the cofactors from the kth row is zero, if h ≠ k. Recall that if h = k, then Σⱼ₌₁ⁿ aₕⱼAₕⱼ = det A.)
Proof   The proof consists in showing that Σⱼ₌₁ⁿ aₕⱼAₖⱼ is the expansion of a determinant with two equal rows.
Consider the matrix B = (bᵢⱼ) obtained from A by replacing its kth row with its hth row. (This is not an elementary row operation.) Then bᵢⱼ = aᵢⱼ when i ≠ k and bₖⱼ = aₕⱼ for all j. Since B and A differ only in their kth rows, the cofactors of corresponding elements from their kth rows are equal. That is, Bₖⱼ = Aₖⱼ for 1 ≤ j ≤ n. Therefore, expanding |B| along the kth row yields

    |B| = Σⱼ₌₁ⁿ bₖⱼBₖⱼ = Σⱼ₌₁ⁿ aₕⱼAₖⱼ.

This gives the first equation, for B has two equal rows, so |B| = 0. The second equation contains the corresponding result for columns, and it may be obtained similarly.
Using Theorem 4.21, we see that the sum of any row in A times a row of cofactors is either 0 or |A|. Therefore using a matrix with the rows of cofactors written as columns yields the following:

    ( a₁₁  a₁₂  ⋯  a₁ₙ )( A₁₁  A₂₁  ⋯  Aₙ₁ )   ( |A|   0   ⋯   0  )
    ( a₂₁  a₂₂  ⋯  a₂ₙ )( A₁₂  A₂₂  ⋯  Aₙ₂ ) = (  0   |A|  ⋯   0  ) = |A| Iₙ.
    (  ⋮               )(  ⋮               )   (  ⋮               )
    ( aₙ₁  aₙ₂  ⋯  aₙₙ )( A₁ₙ  A₂ₙ  ⋯  Aₙₙ )   (  0    0   ⋯  |A| )

So when |A| ≠ 0, the matrix of cofactors can be used to obtain A⁻¹.
Definition   Let A = (aᵢⱼ) be an n × n matrix. The adjoint (matrix) of A, denoted A^adj, is the transpose of the cofactor matrix (Aᵢⱼ). That is, A^adj = (Aᵢⱼ)ᵀ, and the element in the ith row and jth column of A^adj is the cofactor of the element in the jth row and ith column of A.
Now the above matrix equation can be rewritten as A A^adj = |A| Iₙ. Therefore if |A| ≠ 0, we can divide by |A| to obtain

    A[(1/|A|)A^adj] = Iₙ,   or   A⁻¹ = (1/|A|)A^adj.

This is the adjoint formula for finding the inverse of a matrix.
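The adjoint formula can be sketched in a few lines of code for small matrices, using cofactor expansion for the determinant and exact rational arithmetic. The function names are ours, not the text's.

```python
# Sketch of A^{-1} = (1/|A|) A^adj: entry (i, j) of the adjoint is the
# cofactor of entry (j, i).  Exact arithmetic via Fraction.
from fractions import Fraction

def minor(A, i, j):
    """Delete row i and column j."""
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    if len(A) == 1:
        return A[0][0]
    # cofactor expansion along the first row
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def inverse(A):
    d = det(A)
    if d == 0:
        raise ValueError("singular matrix")
    n = len(A)
    # adjoint = transpose of the cofactor matrix, then divide by |A|
    return [[Fraction((-1) ** (i + j) * det(minor(A, j, i)), d)
             for j in range(n)] for i in range(n)]
```

For the matrix of Example 4, `inverse([[2, 1], [5, 3]])` gives `[[3, -1], [-5, 2]]`, in agreement with the computation above.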
Example 5   Consider the matrix

    A = ( 2  1 )
        ( 5  3 )

from Example 4. Since |A| = 1,

    A⁻¹ = A^adj = (  3  −1 )
                  ( −5   2 ).
In general, for 2 × 2 matrices

    ( a  b )^adj   (  d  −b )
    ( c  d )     = ( −c   a ),

so if

    ( a  b )
    ( c  d )

is nonsingular, then

    ( a  b )⁻¹       1     (  d  −b )
    ( c  d )    = ─────── ( −c   a ).
                  ad − bc

As an example,

    (  2  3 )⁻¹   1  ( 7  −3 )   ( 7/20  −3/20 )
    ( −2  7 )   = ── ( 2   2 ) = ( 1/10   1/10 ).
                  20
Example 6   Find the inverse of the matrix

    A = ( 2  1  3 )
        ( 2  4  0 )
        ( 1  2  1 ).

We have |A| = 6, so

    A⁻¹ = (1/6)A^adj = (1/6) (  4   5  −12 )   (  2/3   5/6  −2 )
                             ( −2  −1    6 ) = ( −1/3  −1/6   1 )
                             (  0  −3    6 )   (   0  −1/2    1 ).
The adjoint formula for the inverse of a matrix can be applied to certain systems of linear equations. Suppose AX = B is a consistent system of n independent linear equations in n unknowns. Then |A| ≠ 0 and A⁻¹ exists. Therefore, we can multiply the equation AX = B by A⁻¹ to obtain X = A⁻¹B. This is the unique solution, for if Y is any solution, then AY = B and Y = A⁻¹AY = A⁻¹B. The solution X = A⁻¹B is the matrix form of Cramer's rule.
Theorem 4.22 (Cramer's rule)   Let AX = B be a consistent system of n independent linear equations in n unknowns with A = (aᵢⱼ), X = (xⱼ), and B = (bᵢ). Then the unique solution is given by

         | a₁₁  ⋯  a₁,ⱼ₋₁  b₁  a₁,ⱼ₊₁  ⋯  a₁ₙ |
         | a₂₁  ⋯  a₂,ⱼ₋₁  b₂  a₂,ⱼ₊₁  ⋯  a₂ₙ |
         |  ⋮                                  |
         | aₙ₁  ⋯  aₙ,ⱼ₋₁  bₙ  aₙ,ⱼ₊₁  ⋯  aₙₙ |
    xⱼ = ───────────────────────────────────────
                          |A|

for 1 ≤ j ≤ n. That is, xⱼ is the ratio of two determinants, with the determinant in the numerator obtained from |A| by replacing the jth column with the column of constant terms, B.
Proof   The proof simply consists in showing that the jth entry in the matrix solution X = A⁻¹B has the required form. Since A⁻¹ = (1/|A|)A^adj, and the jth row of A^adj is (A₁ⱼ, A₂ⱼ, …, Aₙⱼ),

    xⱼ = (1/|A|)(b₁A₁ⱼ + b₂A₂ⱼ + ⋯ + bₙAₙⱼ)

for each j. But b₁A₁ⱼ + b₂A₂ⱼ + ⋯ + bₙAₙⱼ is the expansion, along the jth column, of the determinant of the matrix obtained from A by replacing the jth column with B. This is the desired expression for xⱼ.
Example 7   Use Cramer's rule to solve the following system for x, y, and z:

    2x + y − z = 3
    x + 3y + 2z = 1
    x + 2y + 2z = 2.

The determinant of the coefficient matrix is 5, therefore the equations are independent and Cramer's rule can be applied:

        | 3  1  −1 |        | 2  3  −1 |        | 2  1  3 |
        | 1  3   2 |        | 1  1   2 |        | 1  3  1 |
        | 2  2   2 |        | 1  2   2 |        | 1  2  2 |
    x = ────────────,   y = ────────────,   z = ───────────.
             5                   5                   5

This gives x = 12/5, y = −1, and z = 4/5.
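Cramer's rule is mechanical enough to code directly. The sketch below (function names are ours) replaces column j of A with B, takes determinants, and divides by |A|, using exact fractions.

```python
# Sketch of Cramer's rule: x_j = det(A with column j replaced by B) / det A.
from fractions import Fraction

def det(A):
    if len(A) == 1:
        return A[0][0]
    # cofactor expansion along the first row
    return sum((-1) ** j * A[0][j]
               * det([r[:j] + r[j+1:] for r in A[1:]]) for j in range(len(A)))

def cramer(A, b):
    d = det(A)
    solution = []
    for j in range(len(b)):
        Aj = [row[:] for row in A]          # copy A ...
        for i in range(len(b)):
            Aj[i][j] = b[i]                 # ... and replace column j with b
        solution.append(Fraction(det(Aj), d))
    return solution
```

Running `cramer([[2, 1, -1], [1, 3, 2], [1, 2, 2]], [3, 1, 2])` reproduces the solution of Example 7: 12/5, −1, 4/5.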
Problems
1. Compute the following matrix products:
a. ( 1  −1 )( 1  4 )
   ( 3   5 )( 3  0 ).
b. ( 2  1  3 )(  2 )
              ( −1 )
              (  1 ).
c. (  4  −1 )( 2  1 )
   ( −1   2 )( 5  1 ).
d. (  1 )
   ( −1 )( 4  2  −3 ).
e. ( 6  9 )(  3  −6 )
   ( 4  6 )( −2   4 ).
f. ( 2  1  5 )( −1   4  3 )
   ( 0  4  2 )(  6  −3  0 )
   ( 1  1  0 )(  3   2  2 ).
2. Let S(x, y) = (2x − 3y, x + 2y, y − 3x), T(x, y, z) = (x + y + z, 2x − y + 3z), and suppose A_S, A_T are the matrices of S and T with respect to the standard bases for 𝒱₂ and 𝒱₃. Show that the product A_T A_S is the matrix of T ∘ S with respect to the standard bases.
3. Use the adjoint formula to find the inverse of:
a. ( 2  1 )
   ( 1  1 ).
d. ( 2  0  1 )
   ( 1  1  0 )
   ( 0  1  3 ).
e. ( 1  0  2 )
   ( 0  1  0 )
   ( 3  1  1 ).
f. ( 1  2  0 )
   ( 0  1  4 )
   ( 3  0  1 ).
4. a. Prove that the product of two n × n nonsingular matrices is nonsingular.
b. Suppose A₁, A₂, …, Aₖ are n × n nonsingular matrices. Show that A₁A₂⋯Aₖ is nonsingular and

    (A₁A₂⋯Aₖ)⁻¹ = Aₖ⁻¹Aₖ₋₁⁻¹⋯A₂⁻¹A₁⁻¹.
5. a. Prove Theorem 4.18.
b. Prove Corollary 4.19.
c. Prove Theorem 4.20.
6. Use Cramer's rule to solve the following systems of equations:
a. 2x − 3y = 5
   x + 2y = 4.
b. 3x − 5y − 2z = 1
   x + y − z = 4
   −x + 2y + z = 5.
c. 3a + 2c = 5
   4b + 5d = 1
   a − 2c + d = 0
   4b + 5c = 5.
7. B = {(1, −2, −2), (1, −1, 1), (2, −3, 0)} is a basis for 𝒱₃. If the vector (a, b, c) has coordinates x, y, z with respect to B, then they satisfy the following system of linear equations:

    x + y + 2z = a,   −2x − y − 3z = b,   −2x + y = c.

Use the matrix form of Cramer's rule to find the coordinates of
a. (1, 0, 0), b. (1, 0, 3), c. (1, 1, 8), d. (4, −6, 0),
with respect to the basis B.
8. Let B = {(4, 1, 4), (−3, 5, 2), (3, 2, 4)} and use the matrix form of Cramer's rule to find the coordinates with respect to B of a. (4, −1, 2). b. (−3, 5, 2). c. (0, 1, 0). d. (0, 0, 2).
9. Let T(a, b) = (2a − 3b, 3b − 3a). If A is the matrix of T with respect to the standard basis for 𝒱₂, then A⁻¹ is the matrix of T⁻¹ with respect to the standard basis. Find A and A⁻¹, and use A⁻¹ to find a. T⁻¹(0, 1). b. T⁻¹(−5, 6). c. T⁻¹(1, 3). d. T⁻¹(x, y).
10. Let T(a, b, c) = (2a + b + 3c, 2a + 2b + c, 4a + 2b + 4c) and use the same technique as in problem 9 to find a. T⁻¹(0, 0, 8). b. T⁻¹(9, 1, 12). c. T⁻¹(x, y, z).
7. Elementary Matrices
The second useful technique for finding the inverse of a matrix is based on the fact that every nonsingular n × n matrix is row-equivalent to Iₙ. That is, the n × n identity matrix is the echelon form for an n × n matrix of rank n. It is now possible to represent the elementary row operations that transform a matrix to its echelon form with multiplication by special matrices called "elementary matrices." Multiplication on the left by an elementary matrix will give an elementary row operation, while multiplication on the right yields an elementary column operation. There are three basic types of elementary matrices, corresponding to the three types of elementary operations, and each is obtained from Iₙ with an elementary row operation.
An n × n elementary matrix of type I, E_{h,k}, is obtained by interchanging the hth and kth rows of Iₙ. If A is an n × m matrix, then multiplying A on the left by E_{h,k} interchanges the hth and kth rows of A.
Example 1

    E₂,₄ ( 2  5 )   ( 1  0  0  0 )( 2  5 )   ( 2  5 )
         ( 7  1 ) = ( 0  0  0  1 )( 7  1 ) = ( 1  4 )
         ( 4  2 )   ( 0  0  1  0 )( 4  2 )   ( 4  2 )
         ( 1  4 )   ( 0  1  0  0 )( 1  4 )   ( 7  1 ).
An n × n elementary matrix of type II, E_h(r), is obtained from Iₙ by multiplying the hth row by r, a nonzero scalar. Multiplication of a matrix on the left by an elementary matrix of type II multiplies a row of the matrix by a nonzero scalar.
Example 2

    E₁(1/2) ( 2  4  8 )   ( 1/2  0 )( 2  4  8 )   ( 1  2  4 )
            ( 4  5  3 ) = (  0   1 )( 4  5  3 ) = ( 4  5  3 ).

    E₂(−3) (  1  2 )   ( 1   0  0 )(  1  2 )   ( 1   2 )
           ( −2  1 ) = ( 0  −3  0 )( −2  1 ) = ( 6  −3 )
           (  5  0 )   ( 0   0  1 )(  5  0 )   ( 5   0 ).
An n × n elementary matrix of type III, E_{h,k}(r), is obtained from Iₙ by adding r times the hth row to the kth row. Multiplication of a matrix on the left by an elementary matrix of type III performs an elementary row operation of type III.
Example 3

    E₁,₂(3) ( 1  2 )   ( 1  0 )( 1  2 )   ( 1   2 )
            ( 0  7 ) = ( 3  1 )( 0  7 ) = ( 3  13 ).
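The three types of elementary matrices are easy to construct and test in code. The sketch below uses 0-based indices (the text numbers rows from 1) and our own function names; left multiplication performs the corresponding row operation.

```python
# Sketch: the three types of elementary matrices, built from the identity.
# Left multiplication by each performs the matching elementary row operation.

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def swap(n, h, k):        # type I:  interchange rows h and k of I_n
    E = identity(n)
    E[h], E[k] = E[k], E[h]
    return E

def scale(n, h, r):       # type II: multiply row h of I_n by r (r nonzero)
    E = identity(n)
    E[h][h] = r
    return E

def add_row(n, h, k, r):  # type III: add r times row h to row k
    E = identity(n)
    E[k][h] = r
    return E

def mat_mul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# E.g., add 3 times row 0 to row 1 of (1 2; 0 7), as in Example 3:
print(mat_mul(add_row(2, 0, 1, 3), [[1, 2], [0, 7]]))   # [[1, 2], [3, 13]]
```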
Theorem 4.23   Elementary matrices are nonsingular, and the inverse of an elementary matrix is an elementary matrix of the same type.

Proof   One need only check that (E_{h,k})⁻¹ = E_{h,k}, (E_h(r))⁻¹ = E_h(1/r), and (E_{h,k}(r))⁻¹ = E_{h,k}(−r).
The notations used above for the various elementary matrices are useful in defining the three types, but it is generally not necessary to be so explicit in representing an elementary matrix. For example, suppose A is an n × n nonsingular matrix. Then A is row-equivalent to Iₙ, and there is a sequence of elementary row operations that transforms A to Iₙ. This can be expressed as Eₚ⋯E₂E₁A = Iₙ, where E₁, E₂, …, Eₚ are the elementary matrices that perform the required elementary row operations. In this case, it is neither important nor useful to specify exactly which elementary matrices are being used. This example also provides a method for obtaining the inverse of A. For if Eₚ⋯E₂E₁A = Iₙ, then the product Eₚ⋯E₂E₁ is the inverse of A.
Therefore

    A⁻¹ = Eₚ⋯E₂E₁ = Eₚ⋯E₂E₁Iₙ.

That is, the inverse of A can be obtained from Iₙ by performing the same sequence of elementary row operations on Iₙ that must be applied to A in order to transform A to Iₙ. In practice it is not necessary to keep track of the elementary matrices E₁, …, Eₚ, for the elementary row operations can be performed simultaneously on A and Iₙ.
Example 4   Use elementary row operations to find the inverse of the matrix of Examples 4 and 5 in the previous section.
Place A and I₂ together and transform A to I₂ with elementary row operations:

    ( 2  1 | 1  0 )    ( 1  1/2 | 1/2  0 )    ( 1  1/2 |  1/2  0 )
    ( 5  3 | 0  1 ) →  ( 5   3  |  0   1 ) →  ( 0  1/2 | −5/2  1 )

                    →  ( 1  1/2 | 1/2  0 )    ( 1  0 |  3  −1 )
                       ( 0   1  | −5   2 ) →  ( 0  1 | −5   2 ).

Therefore

    ( 2  1 )⁻¹   (  3  −1 )
    ( 5  3 )   = ( −5   2 ).
This agrees with our previous results. Comparing the three ways in which A⁻¹ has been found should make it clear that the adjoint formula provides the simplest way to obtain the inverse of a 2 × 2 matrix.
For larger matrices the use of elementary row operations can be simpler than the adjoint method. However, it is much easier to locate and correct errors in the computation of the adjoint matrix than in a sequence of elementary row operations.
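The [A | Iₙ] procedure can be sketched compactly in code. The version below (names are ours) row-reduces the augmented matrix with exact fractions; when the left half becomes Iₙ, the right half is A⁻¹.

```python
# Sketch of inversion by row reduction of the augmented matrix [A | I_n].
from fractions import Fraction

def invert(A):
    n = len(A)
    # build [A | I_n] with exact entries
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # find a nonzero pivot (raises StopIteration if A is singular)
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]       # scale pivot row to 1
        for r in range(n):
            if r != col and M[r][col] != 0:              # clear the column
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]                        # right half is A^{-1}
```

For the matrix of Example 4, `invert([[2, 1], [5, 3]])` returns `[[3, -1], [-5, 2]]`, matching the hand computation.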
Example 5   Use elementary row operations to find

    ( 2  3  1 )⁻¹
    ( 1  2  0 )
    ( 2  4  2 ).

We have

    ( 2  3  1 | 1  0  0 )    ( 1  2  0 | 0  1  0 )    ( 1   2  0 | 0   1  0 )
    ( 1  2  0 | 0  1  0 ) →  ( 2  3  1 | 1  0  0 ) →  ( 0  −1  1 | 1  −2  0 )
    ( 2  4  2 | 0  0  1 )    ( 2  4  2 | 0  0  1 )    ( 0   0  2 | 0  −2  1 )

      ( 1  0   2 |  2  −3   0  )    ( 1  0  0 |  2  −1   −1 )
    → ( 0  1  −1 | −1   2   0  ) →  ( 0  1  0 | −1   1  1/2 )
      ( 0  0   1 |  0  −1  1/2 )    ( 0  0  1 |  0  −1  1/2 ).

So

    ( 2  3  1 )⁻¹   (  2  −1   −1 )
    ( 1  2  0 )   = ( −1   1  1/2 )
    ( 2  4  2 )     (  0  −1  1/2 ).
This technique for finding an inverse is based on the fact that if A is nonsingular, then A⁻¹ = Eₚ⋯E₂E₁ for some sequence of elementary matrices. This expression can be rewritten in the form A = (Eₚ⋯E₂E₁)⁻¹, and using problem 4 on page 186 we have A = E₁⁻¹E₂⁻¹⋯Eₚ⁻¹. Since the inverse of an elementary matrix is itself an elementary matrix, we have proved the following.
Theorem 4.24 Every nonsingular matrix can be expressed as a product
of elementary matrices.
The importance of this theorem lies in the existence of such a representation. However, it is possible to find an explicit representation for a nonsingular matrix A by keeping track of the elementary matrices used to transform A to Iₙ. This also provides a good exercise in the use of elementary matrices.
Example 6   Express

    A = ( 2  1 )
        ( 5  3 )

as a product of elementary matrices.
Row-reducing A to I₂ as in Example 4 gives E₄E₃E₂E₁A = I₂ with E₁ = E₁(1/2), E₂ = E₁,₂(−5), E₃ = E₂(2), and E₄ = E₂,₁(−1/2). Therefore

    A = E₁⁻¹E₂⁻¹E₃⁻¹E₄⁻¹ = E₁(2)E₁,₂(5)E₂(1/2)E₂,₁(1/2)

      = ( 2  0 )( 1  0 )( 1   0  )( 1  1/2 )
        ( 0  1 )( 5  1 )( 0  1/2 )( 0   1  ).

Is this expression unique?
If a matrix A is multiplied on the right by an elementary matrix, the result is equivalent to performing an elementary column operation. Multiplying A on the right by E_{h,k} interchanges the hth and kth columns of A, multiplication by E_h(r) on the right multiplies the hth column of A by r, and multiplication by E_{h,k}(r) on the right adds r times the kth column to the hth column (note the change in order).
Example 7   Let

    A = ( 1  0 )
        ( 2  2 )
        ( 3  5 ),

then

    A E₁,₂ = ( 1  0 )( 0  1 )   ( 0  1 )
             ( 2  2 )( 1  0 ) = ( 2  2 )
             ( 3  5 )           ( 5  3 ),

    A E₂(1/2) = ( 1  0 )( 1   0  )   ( 1   0  )
                ( 2  2 )( 0  1/2 ) = ( 2   1  )
                ( 3  5 )             ( 3  5/2 ),

    A E₂,₁(−5) = ( 1  0 )( 1  −5 )   ( 1   −5 )
                 ( 2  2 )( 0   1 ) = ( 2   −8 )
                 ( 3  5 )            ( 3  −10 ).
The fact that a nonsingular matrix can be expressed as a product of elementary matrices is used to prove that the determinant of a product is the product of the determinants. But first we need the following special case.
Lemma 4.25   If A is an n × n matrix and E is an n × n elementary matrix, then |AE| = |A||E| = |EA|.

Proof   First, since an elementary matrix is obtained from Iₙ with a single elementary row operation, the properties of determinants give

    |E_{h,k}| = −1,    |E_h(r)| = r,    |E_{h,k}(r)| = 1.

Then AE and EA are obtained from A with a single column and row operation, so

    |AE_{h,k}| = −|A| = |E_{h,k}A|,
    |AE_h(r)| = r|A| = |E_h(r)A|,
    |AE_{h,k}(r)| = |A| = |E_{h,k}(r)A|.

Combining these two sets of equalities proves that |AE| = |A||E| = |EA|.
Theorem 4.26   If A and B are n × n matrices, then |AB| = |A||B|.
Proof   Case 1. Suppose B is nonsingular. Then B = E₁E₂⋯E_q for some sequence of elementary matrices E₁, …, E_q, and the proof is by induction on q.
When q = 1, B = E₁ and |AB| = |A||B| by the lemma.
Assume |AB| = |A||B| for any matrix B which is the product of q − 1 elementary matrices, q ≥ 2. Then

    |AB| = |AE₁⋯E_{q−1}E_q| = |(AE₁⋯E_{q−1})E_q|
         = |AE₁⋯E_{q−1}| |E_q|        By the lemma
         = |A| |E₁⋯E_{q−1}| |E_q|     By assumption
         = |A| |(E₁⋯E_{q−1})E_q|      By the lemma
         = |A||B|.
Case 2. Suppose B is singular. Let S and T be the maps in Hom(𝒱ₙ) which have A and B as their matrices with respect to the standard basis. B is singular, therefore T is singular, which implies that S ∘ T is singular. [If T(U) = T(V), then S ∘ T(U) = S ∘ T(V).] Therefore AB, the matrix of S ∘ T with respect to the standard basis, is singular. Thus |AB| = 0; but B is singular, so |B| = 0, and we have |AB| = |A||B|.
Since every n × n matrix is either singular or nonsingular, the proof is complete.
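Theorem 4.26 is easy to spot-check numerically. The sketch below (names are ours) uses cofactor expansion for the determinant and the row-by-column rule for the product.

```python
# A quick numerical check of |AB| = |A||B| on a small example.

def det(A):
    if len(A) == 1:
        return A[0][0]
    # cofactor expansion along the first row
    return sum((-1) ** j * A[0][j]
               * det([r[:j] + r[j+1:] for r in A[1:]]) for j in range(len(A)))

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 1], [5, 3]]
B = [[3, -1], [-5, 2]]
assert det(mat_mul(A, B)) == det(A) * det(B)   # here 1 = 1 * 1, since AB = I_2
```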
Corollary 4.27   If A is nonsingular, then |A⁻¹| = 1/|A|.

Corollary 4.28   If A and B are n × n matrices such that the rank of AB is n, then A and B each have rank n.
The proofs for these corollaries are left to the reader.
The second corollary is a special case of the following theorem.
Theorem 4.29   If A and B are matrices for which AB is defined, then rank AB ≤ rank A and rank AB ≤ rank B.
Rather than prove this theorem directly, it is easier to prove the corresponding result for linear transformations.
Theorem 4.30   Let S and T be linear maps defined on finite-dimensional vector spaces for which S ∘ T is defined; then rank S ∘ T ≤ rank T and rank S ∘ T ≤ rank S.

Proof   Suppose {U₁, …, Uₙ} is a basis for the domain of T, and rank T = k. Then dim ℒ{T(U₁), …, T(Uₙ)} = k, and since a linear map cannot increase dimension, rank S ∘ T = dim ℒ{S ∘ T(U₁), …, S ∘ T(Uₙ)} cannot exceed k. That is, rank S ∘ T ≤ rank T. On the other hand, {S ∘ T(U₁), …, S ∘ T(Uₙ)} spans the image space 𝒥_{S∘T} and is a subset of the image space 𝒥_S. Therefore dim 𝒥_{S∘T} ≤ dim 𝒥_S, or rank S ∘ T ≤ rank S.
Now Theorem 4.29 for matrices follows from Theorem 4.30 using the
isomorphism between linear transformations and matrices.
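The rank inequality of Theorem 4.29 can also be checked directly. The sketch below computes rank by Gaussian elimination (function names are ours) and verifies the inequality on a small example.

```python
# Sketch: rank via Gaussian elimination, then a check of Theorem 4.29,
# rank AB <= rank A and rank AB <= rank B.
from fractions import Fraction

def rank(A):
    M = [[Fraction(x) for x in row] for row in A]
    r = 0
    for col in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if piv is None:
            continue                       # no pivot in this column
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [2, 4]]                       # rank 1
B = [[1, 0], [0, 1]]                       # rank 2
assert rank(mat_mul(A, B)) <= min(rank(A), rank(B))
```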
Problems
1. Let

    A = ( 2  4 ),   B = ( 1  2 ),   and   C = ( 1  0  2 ).
        ( 5  1 )        ( 3  4 )              ( 3  1  1 )
                        ( 5  6 )              ( 0  4  8 )

Find the elementary matrix that performs each of the following operations and use it to carry out the operation:
a. Add -5/2 times the first row of A to the second row.
b. Interchange the first and second rows of B.
c. Divide the first row of A by 2.
d. Interchange the second and third rows of C.
e. Subtract twice the first row of C from the second row.
f. Add -3 times the secm;td row of B to the third row.
g. Divide the third row of C by 4.
2. a. Find the elementary matrices E₁, E₂, E₃, E₄ used to perform a sequence of elementary row operations transforming

    ( 2  6 )
    ( 5  9 )

to I₂.
b. Compute the product E₄E₃E₂E₁ and check that it is

    ( 2  6 )⁻¹
    ( 5  9 ).

c. Find the inverses of E₁, E₂, E₃, and E₄.
d. Compute the product E₁⁻¹E₂⁻¹E₃⁻¹E₄⁻¹.
3. Write the following matrices as products of elementary matrices:
a. ( 2  1 )
   ( 5  3 ).
b. ( 1  3 )
   ( 2  4 ).
c. ( 0  1 )
   ( 1  1 ).
d. ( 0  1  0 )
   ( 1  0  0 )
   ( 0  0  1 ).
4. Use elementary row operations to find the inverse of:
a. ( 3  1  4 )
   ( 1  0  2 )
   ( 2  5  0 ).
b. ( 2  3 )
   ( 1  2 ).
c. ( 1  3 )
   ( 0  6 ).
d. ( 1  0  1 )
   ( 2  5  3 )
   ( 0  1  1 ).
5. We know that if A, B ∈ ℳₙₓₙ and AB = Iₙ, then BA = Iₙ. What is wrong with the following "proof" of this fact?
Since AB = Iₙ, (AB)A = IₙA = A = AIₙ. Therefore A(BA) − AIₙ = 0 or A(BA − Iₙ) = 0. But A ≠ 0 since AB = Iₙ ≠ 0, therefore BA − Iₙ = 0 and BA = Iₙ.
6. Let A, B ∈ ℳₙₓₙ. Prove that if AB = 0 and A is nonsingular, then B = 0.
7. a. Prove that if A is row-equivalent to B, then there exists a nonsingular
matrix Q such that B = QA.
b. Show that Q is unique if A is n x n and nonsingular.
c. Show by example that Q need not be unique if A is n x n and singular.
8. Find a nonsingular matrix Q such that B = QA when
a. A = ( 1  2  1 ), B = ( 1  0  −1 ).
       ( 2  3  1 )      ( 0  1   1 )
b. A = ( 1  3 ), B = ( 1  3 ).
       ( 2  6 )      ( 0  0 )
c. A = ( 1  2  −1 ), B = ( 1  2  −1 ).
       ( 3  4  −5 )      ( 0  1   1 )
       ( 4  6  −6 )      ( 0  0   0 )
d. A = ( 2  4 ), B = ( 1  2 ).
       ( 1  2 )      ( 0  0 )
9. Let

    A = ( 1  2  0 )   and   B = ( 0  1  −3 ).
        ( 3  4  1 )             ( 1  2  −4 )

Find the elementary matrix which performs the following elementary column operations and perform the operation by multiplication: a. Divide the second column of A by 2. b. Subtract twice the first column of A from the second. c. Interchange the first and second columns of B. d. Multiply the second column of B by 3 and add it to the third column. e. Interchange the second and third columns of A.
10. Suppose B = AP, where A, B, and P are matrices with P nonsingular. Prove that A is column-equivalent to B.
11. Find a nonsingular matrix P such that B = AP when
a. A = ( 1  2 ), B = ( 1  0 ).
       ( 3  4 )      ( 0  1 )
b. A = ( 1  2 ), B = ( 1  0 ).
       ( 3  6 )      ( 3  0 )
12. Find nonsingular matrices Q and P such that B = QAP when
a. A as in 8a, B = ( 1  0  0 ).
                   ( 0  1  0 )
b. A as in 8c, B = ( 1  0  0 ).
                   ( 0  1  0 )
                   ( 0  0  0 )
c. A as in 11b, B = ( 1  0 ).
                    ( 0  0 )
13. Prove that if A is nonsingular, then |A⁻¹| = 1/|A|.
14. Prove that rank AB ≤ rank A using the corresponding result for linear transformations.
15. Use rank to prove that if A, B ∈ ℳₙₓₙ and AB = Iₙ, then A is nonsingular.
Review Problems
1. Let

    ( 2  3 )
    ( 1  4 )
    ( 0  1 )

be the matrix of T ∈ Hom(𝒱₂, 𝒱₃) with respect to the bases {(0, 1), (1, 3)} and {(−2, 1, 0), (3, 1, −4), (0, −2, 3)}, and find: a. T(2, 5). b. T(1, 3). c. T(a, b).
2. Find the linear map T: 𝒱₃ → 𝒱₃ which has I₃ as its matrix with respect to B₁ and B₂ if:
a. B₁ = B₂.
b. B₁ = {(1, 0, 1), (1, 1, 0), (0, 0, 1)}, B₂ = {(2, −3, 5), (0, 4, 2), (−4, 3, 1)}.
3. Prove the following properties of isomorphism for any vector spaces 𝒱, 𝒲, and 𝒰:
a. 𝒱 ≅ 𝒱.
b. If 𝒱 ≅ 𝒲, then 𝒲 ≅ 𝒱.
c. If 𝒱 ≅ 𝒲 and 𝒲 ≅ 𝒰, then 𝒱 ≅ 𝒰.
4. Define T ∈ Hom(𝒱₂) by T(a, b) = (2a − 3b, b − a).
a. Find the matrix A of T with respect to the standard basis.
b. Prove that there do not exist bases for 𝒱₂ with respect to which

    ( 1  0 )
    ( 0  0 )

is the matrix for T.
c. Find bases for 𝒱₂ with respect to which

    ( 1  0 )
    ( 0  1 )

is the matrix for T.
5. Suppose 𝒱 and 𝒲 are finite-dimensional vector spaces. Prove that 𝒱 ≅ 𝒲 if and only if dim 𝒱 = dim 𝒲.
6. Let T(a, b, c) = (2a − 3c, b + 4c, 5a + 3b).
a. Express 𝒩_T as the span of independent vectors.
b. Find bases B₁ and B₂ for 𝒱₃ with respect to which

    ( −1  0  0 )
    (  0  1  0 )
    (  0  0  1 )

is the matrix of T.
7. Let T(a, b, c, d) = (a + 3b + d, 2b − c + 2d, 2a + 3c − 4d, 2a + 4b + c). Find bases for 𝒱₄ with respect to which A is the matrix of T when
a. A = ( 1  0  0  0 )      b. A = ( 0  1  0  0 )
       ( 0  1  0  0 )             ( 0  0  1  0 )
       ( 0  0  0  0 )             ( 0  0  0  0 )
       ( 0  0  0  0 ).            ( 0  0  0  0 ).
8. Let A be an n × n matrix such that A² = A. Prove that either A = Iₙ or A is singular. Find a 2 × 2 matrix A with all nonzero entries such that A² = A.
9. Suppose T(a, b, c) = (2a + 3b + c, 6a + 9b + 3c, 5a − 3b + c).
a. Show that the image space 𝒥_T is a plane. Use elementary row operations on a matrix with 𝒥_T as row space to find a Cartesian equation for 𝒥_T.
b. Use Gaussian elimination to show that the null space 𝒩_T is a line.
c. Show that any line parallel to 𝒩_T is sent to a single point by T.
d. Find bases for 𝒱₃ with respect to which

    ( 1  0  0 )
    ( 0  1  0 )
    ( 0  0  0 )

is the matrix for T.
e. Find bases for 𝒱₃ with respect to which

    ( 0  1  0 )
    ( 0  0  1 )
    ( 0  0  0 )

is the matrix for T.
10. Suppose T ∈ Hom(𝒱). Prove that 𝒩_T ∩ 𝒥_T = {0} if and only if 𝒩_{T²} = 𝒩_T.
11. Define T: 𝒱₃ → 𝒱₃ by T(a, b, c) = (a + b, b + c, a − c).
a. Show that T⁻¹ does not exist.
b. Find T⁻¹[𝒮] if 𝒮 = {(x, y, z) | x − y + z = 0} and show that T[T⁻¹[𝒮]] ≠ 𝒮. (The complete inverse image of a set is defined in problem 7, page 145.)
c. Find T⁻¹[𝒯] if 𝒯 = {(x, y, z) | 3x − 4y + z = 0}.
d. Find T⁻¹[{(x, y, z) | x − y − z = 0}].
12. For which values of k does T fail to be an isomorphism?
a. T ∈ Hom(𝒱₂); T(a, b) = (3a + b, ka − 4b).
b. T ∈ Hom(𝒱₂); T(a, b) = (2a − kb, ka + b).
c. T ∈ Hom(𝒱₃); T(a, b, c) = (ka + 3c, 2a − kb, 3c − kb).
13. The definition of determinant was motivated by showing that it gives area in the plane and should give volume in 3-space. Relate this geometric idea with the fact that S ∈ Hom(ℰ₂) or S ∈ Hom(ℰ₃) is singular if and only if any matrix for S has zero determinant.
14. For the given matrix A, show that some matrix in the sequence I, A, A², A³, … is a linear combination of those preceding it.
a. A = ( 2  1 ).      b. A = ( 1  1 ).      c. A = ( 0  1  0 )
       ( 1  2 )              ( 0  1 )              ( 0  0  1 )
                                                   ( 0  0  2 ).
d. A = ( 0  0  0  0 )      e. A = ( 1  1 ).      f. A = ( 1  1  −1 )
       ( 1  0  0  0 )             ( 1  1 )              ( 0  1   1 )
       ( 0  1  0  0 )                                   ( 1  0   2 ).
       ( 0  0  1  0 ).
15. Suppose A ∈ ℳₙₓₙ. Show that there exists an integer k ≤ n² and scalars a₀, a₁, …, aₖ such that

    a₀Iₙ + a₁A + a₂A² + ⋯ + aₖAᵏ = 0.
16. a. Suppose T ∈ Hom(𝒱) and the dimension of 𝒱 is n. Restate problem 15 in terms of T.
b. Find two different proofs for this fact.
17. Suppose 𝒱 is a three-dimensional vector space, not necessarily 𝒱₃, and 𝒲 is a vector space of infinite dimension. Show that there is a one to one, linear map from 𝒱 to 𝒲.
18. Associate with each ordered pair of complex numbers (z, w) the 2 × 2 matrix

    (  z   w )
    ( −w̄   z̄ ),

where the bar denotes the complex conjugate (defined on page 277). Use this association to define the "quaternionic multiplication"

    (z, w)(u, v) = (zu − wv̄, zv + wū).

a. Show that the definition of this product follows from a matrix product.
b. Use Theorem 4.17 to show that this multiplication is associative.
c. Show by example that quaternionic multiplication is not commutative.
19. a. Suppose A is n × n and B is n × m. Show that if A is nonsingular, then rank AB = rank B.
b. Suppose C is n × m and D is m × m. Show that if D is nonsingular, then rank CD = rank C.
c. Is it true that if rank E ≤ rank F, then rank EF = rank E?
Change of Basis
1. Change of Coordinates
2. Change in the Matrix of a Map
3. Equivalence Relations
4. Similarity
5. Invariants for Similarity
In this chapter we will restrict our attention to finite-dimensional vector spaces. Therefore, given a basis, each vector can be associated with an ordered n-tuple of coordinates. Since coordinates do not depend on the vector alone, they reflect in part the choice of basis. The situation is similar to the measure of slope in analytic geometry, where the slope of a line depends on the position of the coordinate axes. In order to determine the influence of a choice of basis, we must examine how different choices affect the coordinates of vectors and the matrices of maps.
1. Change in Coordinates
Let B be a basis for the vector space 𝒱, with dim 𝒱 = n. Then each vector V ∈ 𝒱 has coordinates with respect to B, say V: (x₁, ..., xₙ)_B. If a new basis B′ is chosen for 𝒱, and V: (x₁′, ..., xₙ′)_B′, then the problem is to determine how the new coordinates xᵢ′ are related to the old coordinates xᵢ. Suppose B = {U₁, ..., Uₙ} and B′ = {U₁′, ..., Uₙ′}; then

x₁U₁ + ⋯ + xₙUₙ = V = x₁′U₁′ + ⋯ + xₙ′Uₙ′.

So a relationship between the two sets of coordinates can be obtained by expressing the vectors from one basis in terms of the other basis. Therefore, suppose that for each j, 1 ≤ j ≤ n,

Uⱼ′ = p₁ⱼU₁ + ⋯ + pₙⱼUₙ = Σᵢ₌₁ⁿ pᵢⱼUᵢ.

That is, Uⱼ′: (p₁ⱼ, ..., pₙⱼ)_B. The n × n matrix P = (pᵢⱼ) will be called the transition matrix from the basis B to the basis B′. Notice that the jth column of the transition matrix P contains the coordinates of Uⱼ′ with respect to B. The scalars pᵢⱼ could be indexed in other ways, but this notation will conform with the notation for matrices of a linear transformation.

Now the equation

x₁U₁ + ⋯ + xₙUₙ = x₁′U₁′ + ⋯ + xₙ′Uₙ′

becomes

x₁U₁ + ⋯ + xₙUₙ = x₁′(Σᵢ₌₁ⁿ pᵢ₁Uᵢ) + ⋯ + xₙ′(Σᵢ₌₁ⁿ pᵢₙUᵢ),

and since {U₁, ..., Uₙ} is linearly independent, the coefficient of Uᵢ on the left equals the coefficient of Uᵢ on the right. For example, the terms involving U₁ on the right are x₁′p₁₁U₁ + ⋯ + xₙ′p₁ₙU₁.
b. U = x, V = √(1 − x²) in ℱ.

3. Show that the Schwarz inequality for vectors in ℰⁿ yields the inequality

(Σᵢ₌₁ⁿ aᵢbᵢ)² ≤ (Σᵢ₌₁ⁿ aᵢ²)(Σᵢ₌₁ⁿ bᵢ²)

for all real numbers aᵢ, bᵢ.

4. Prove that ⟨U, V⟩² = ‖U‖²‖V‖² if and only if U and V are linearly dependent.

5. Find two orthonormal bases for ℰ² containing the vector (3/5, 4/5).

6. Show that each of the following sets is orthonormal in ℰ³ and for each set find an orthonormal basis for ℰ³ containing the two vectors.
a. {(1/√2, 0, −1/√2), (0, 1, 0)}.
b. {(−1/3, 2/3, −2/3), (0, 1/√2, 1/√2)}.
c. {(2/7, 3/7, 6/7), (6/7, 2/7, −3/7)}.

7. Let S be a nonempty subset of an inner product space (𝒱, ⟨ , ⟩). Show that the set of all vectors orthogonal to every vector in S is a subspace of 𝒱.

8. B = {(4, 1), (2, 3)} is a basis for 𝒱₂.
a. Find an inner product defined on 𝒱₂ for which B is an orthonormal basis.
b. What is the norm of (2, −7) in the inner product space obtained in a?

9. Let S = {(1, 0, 2, 1), (2, 3, −1, 0), (−3, 2, 0, 3)}.
a. Show that S is an orthogonal subset of ℰ⁴.
b. Obtain an orthonormal set from S.

10. Following Example 4:
a. Show that if n ≠ m, then sin 2nπx is orthogonal to sin 2mπx.
b. Find the Fourier series expansion of f(x) = x.
3. The Gram-Schmidt Process
We have seen that orthonormal bases are the most useful bases for a
given inner product space. However, an inner product space often arises with
a basis that is not orthonormal. In such a case it is usually helpful to first
obtain an orthonormal basis from the given basis. The Gram-Schmidt process
provides a procedure for doing this. The existence of such a procedure will
also show that every finite-dimensional inner product space has an orthonor-
mal basis.
For an indication of how this procedure should be defined, suppose 𝒮 = ℒ{U₁, U₂} is a 2-dimensional subspace of an inner product space. Then the problem is to find an orthonormal basis {V₁, V₂} for 𝒮. Setting V₁ = U₁/‖U₁‖ yields a unit vector. To obtain V₂, consider W₂ = U₂ + xV₁. For any x ∈ R, ℒ{V₁, W₂} = ℒ{U₁, U₂}. Therefore it is necessary to find a value for x such that ⟨W₂, V₁⟩ = 0. Then V₂ can be set equal to W₂/‖W₂‖. (How do we know that W₂ cannot be 0?) W₂ is orthogonal to V₁ if

0 = ⟨W₂, V₁⟩ = ⟨U₂ + xV₁, V₁⟩ = ⟨U₂, V₁⟩ + x⟨V₁, V₁⟩ = ⟨U₂, V₁⟩ + x.

Therefore when x = −⟨U₂, V₁⟩, ⟨W₂, V₁⟩ = 0.
Example 1  Suppose 𝒮 = ℒ{U₁, U₂} is the subspace of ℰ³ spanned by U₁ = (2, 1, −2) and U₂ = (0, 1, 2). Using the above technique,

V₁ = U₁/‖U₁‖ = (2/3, 1/3, −2/3)

and W₂ is given by

W₂ = U₂ − (U₂ ∘ V₁)V₁
   = (0, 1, 2) − [(0, 1, 2) ∘ (2/3, 1/3, −2/3)](2/3, 1/3, −2/3)
   = (2/3, 4/3, 4/3).
256 6. INNER PRODUCT SPACES
Setting V₂ = W₂/‖W₂‖ = (1/3, 2/3, 2/3) yields

{V₁, V₂} = {(2/3, 1/3, −2/3), (1/3, 2/3, 2/3)},

an orthonormal basis for 𝒮 constructed from U₁ and U₂.
There is a good geometric illustration which shows why W₂ = U₂ − ⟨U₂, V₁⟩V₁ is orthogonal to V₁. Since V₁ is a unit vector,

⟨U₂, V₁⟩V₁ = (‖U₂‖ cos θ)V₁,

where θ is the angle between U₂ and V₁. Since cos θ is the ratio of the side adjacent to the angle θ and the hypotenuse, this vector is represented by the point P at the foot of the perpendicular dropped from U₂ to the line determined by 0 and V₁; see Figure 1. Thus W₂ = U₂ − ⟨U₂, V₁⟩V₁ can be represented by the arrow from P to U₂. That is, W₂ and V₁ are represented by perpendicular arrows. The vector ⟨U₂, V₁⟩V₁ is called the "orthogonal projection" of U₂ on V₁. Figure 1 shows that subtracting the orthogonal projection of U₂ on V₁ from U₂ yields a vector orthogonal to V₁. In general, the orthogonal projection of one vector on another is defined as in ℰ² with V₁ replaced by U₁/‖U₁‖.

Definition  If U₁, U₂ are vectors from an inner product space with U₁ ≠ 0, then the orthogonal projection of U₂ on U₁ is the vector

⟨U₂, U₁/‖U₁‖⟩ U₁/‖U₁‖ = (⟨U₂, U₁⟩/‖U₁‖²)U₁.

Figure 1
Suppose now that it is desired to find an orthonormal basis for the 3-dimensional space ℒ{U₁, U₂, U₃}. Using the above procedure, we can obtain V₁ and V₂ such that {V₁, V₂} is an orthonormal set and ℒ{V₁, V₂} = ℒ{U₁, U₂}. To obtain a vector W₃ from U₃ orthogonal to V₁ and V₂, one might consider subtracting the orthogonal projections of U₃ on V₁ and V₂ from U₃. Therefore let

W₃ = U₃ − ⟨U₃, V₁⟩V₁ − ⟨U₃, V₂⟩V₂.

Then

⟨W₃, V₁⟩ = ⟨U₃, V₁⟩ − ⟨U₃, V₁⟩⟨V₁, V₁⟩ − ⟨U₃, V₂⟩⟨V₂, V₁⟩
         = ⟨U₃, V₁⟩ − ⟨U₃, V₁⟩ = 0.

So W₃ is orthogonal to V₁. Similarly W₃ is orthogonal to V₂, and setting V₃ = W₃/‖W₃‖ yields an orthonormal basis {V₁, V₂, V₃} for ℒ{U₁, U₂, U₃}. This construction should indicate how the following theorem is proved by induction.
Theorem 6.8 (Gram-Schmidt Process)  Let ℰ be a finite-dimensional inner product space with basis {U₁, ..., Uₙ}. Then ℰ has an orthonormal basis {V₁, ..., Vₙ} defined inductively by V₁ = U₁/‖U₁‖ and, for each k, 1 < k ≤ n, Vₖ = Wₖ/‖Wₖ‖, where

Wₖ = Uₖ − ⟨Uₖ, V₁⟩V₁ − ⋯ − ⟨Uₖ, Vₖ₋₁⟩Vₖ₋₁.
Example 2  Use the Gram-Schmidt process to obtain an orthonormal basis for

ℒ{(1, 0, 1, 0), (3, 1, −1, 0), (8, −7, 0, 3)} ⊂ ℰ⁴.

Set V₁ = (1/√2, 0, 1/√2, 0). Then

W₂ = (3, 1, −1, 0) − [(3, 1, −1, 0) ∘ (1/√2, 0, 1/√2, 0)](1/√2, 0, 1/√2, 0)
   = (2, 1, −2, 0),

and V₂ = W₂/‖W₂‖ = (2/3, 1/3, −2/3, 0) is a unit vector orthogonal to V₁. Finally, subtracting the orthogonal projections of (8, −7, 0, 3) on V₁
and V₂ from (8, −7, 0, 3) gives

W₃ = (8, −7, 0, 3) − [(8, −7, 0, 3) ∘ (1/√2, 0, 1/√2, 0)](1/√2, 0, 1/√2, 0)
     − [(8, −7, 0, 3) ∘ (2/3, 1/3, −2/3, 0)](2/3, 1/3, −2/3, 0)
   = (8, −7, 0, 3) − (4, 0, 4, 0) − (2, 1, −2, 0) = (2, −8, −2, 3).

So V₃ = W₃/‖W₃‖ = (2/9, −8/9, −2/9, 1/3) and the orthonormal basis is

{(1/√2, 0, 1/√2, 0), (2/3, 1/3, −2/3, 0), (2/9, −8/9, −2/9, 1/3)}.
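The arithmetic in Example 2 is easy to check numerically. The following is a small sketch of the process of Theorem 6.8 in Python (the function name `gram_schmidt` is ours, not the text's), using the standard dot product on ℰⁿ:

```python
import math

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors in E^n
    by the process of Theorem 6.8."""
    basis = []
    for u in vectors:
        # W_k = U_k - <U_k, V_1>V_1 - ... - <U_k, V_{k-1}>V_{k-1}
        w = list(u)
        for v in basis:
            c = sum(ui * vi for ui, vi in zip(u, v))
            w = [wi - c * vi for wi, vi in zip(w, v)]
        norm = math.sqrt(sum(wi * wi for wi in w))   # V_k = W_k / ||W_k||
        basis.append([wi / norm for wi in w])
    return basis

# The basis of Example 2 in E^4.
V1, V2, V3 = gram_schmidt([(1, 0, 1, 0), (3, 1, -1, 0), (8, -7, 0, 3)])
print(V2)  # approximately (2/3, 1/3, -2/3, 0)
print(V3)  # approximately (2/9, -8/9, -2/9, 1/3)
```

Applied to the vectors of Example 2, the sketch reproduces V₁ = (1/√2, 0, 1/√2, 0), V₂ = (2/3, 1/3, −2/3, 0), and V₃ = (2/9, −8/9, −2/9, 1/3) up to rounding.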
In practice it is often easier to obtain an orthogonal basis such as {U₁, W₂, ..., Wₙ} and then divide each vector by its length to get an orthonormal basis. This postpones the introduction of radicals and even allows for the elimination of fractions by scalar multiplication. That is, if (1/2, 0, 1/3, 0) is orthogonal to a set of vectors, then so is (3, 0, 2, 0). In Example 2 the second and third orthogonal vectors are given by

(3, 1, −1, 0) − [(3, 1, −1, 0) ∘ (1, 0, 1, 0)]/[(1, 0, 1, 0) ∘ (1, 0, 1, 0)] (1, 0, 1, 0)
   = (3, 1, −1, 0) − (1, 0, 1, 0) = (2, 1, −2, 0)

and

(8, −7, 0, 3) − [(8, −7, 0, 3) ∘ (1, 0, 1, 0)]/[(1, 0, 1, 0) ∘ (1, 0, 1, 0)] (1, 0, 1, 0)
     − [(8, −7, 0, 3) ∘ (2, 1, −2, 0)]/[(2, 1, −2, 0) ∘ (2, 1, −2, 0)] (2, 1, −2, 0)
   = (8, −7, 0, 3) − (4, 0, 4, 0) − (2, 1, −2, 0)
   = (2, −8, −2, 3).

Now the desired orthonormal basis is obtained from the orthogonal set {(1, 0, 1, 0), (2, 1, −2, 0), (2, −8, −2, 3)} when each vector is divided by its length.
Corollary 6.9
1. Every finite-dimensional inner product space has an orthonormal basis.
2. An orthonormal subset of a finite-dimensional inner product space ℰ can be extended to an orthonormal basis for ℰ.

Proof  These follow at once from the corresponding results for vector spaces, using the Gram-Schmidt process.
Example 3  Suppose ℰ is the inner product space obtained by defining ⟨ , ⟩ on 𝒱₂ by

⟨(a, b), (c, d)⟩ = 3ac − 2ad − 2bc + 5bd.

Find an orthonormal basis for ℰ.

An orthonormal basis can be constructed from the standard basis for 𝒱₂. A vector orthogonal to (1, 0) is given by

(0, 1) − [⟨(0, 1), (1, 0)⟩/⟨(1, 0), (1, 0)⟩](1, 0) = (0, 1) − (−2/3, 0) = (2/3, 1).

Therefore {(1, 0), (2, 3)} is an orthogonal basis for ℰ, and dividing each vector by its length gives the orthonormal basis {(1/√3, 0), (2/√33, 3/√33)} for ℰ.
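Example 3 can be verified the same way as the earlier examples; the only change is that the dot product is replaced by the given inner product. The helper name `ip` below is our own:

```python
import math

def ip(u, v):
    """The inner product of Example 3 on V_2: <(a,b),(c,d)> = 3ac - 2ad - 2bc + 5bd."""
    (a, b), (c, d) = u, v
    return 3*a*c - 2*a*d - 2*b*c + 5*b*d

u1, u2 = (1, 0), (0, 1)          # the standard basis

# Subtract from u2 its projection on u1, measured with ip.
c = ip(u2, u1) / ip(u1, u1)       # -2/3
w2 = (u2[0] - c*u1[0], u2[1] - c*u1[1])
print(w2)                         # approximately (2/3, 1); scaling gives (2, 3)

# Normalize both vectors in the norm induced by ip.
v1 = tuple(x / math.sqrt(ip(u1, u1)) for x in u1)    # (1/sqrt(3), 0)
v2 = tuple(x / math.sqrt(ip(w2, w2)) for x in w2)    # (2/sqrt(33), 3/sqrt(33))
```

A quick check confirms that {v1, v2} is orthonormal with respect to this inner product, even though the two vectors are not perpendicular in the ordinary dot-product sense.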
Definition  Let 𝒮 be a subspace of an inner product space ℰ. The set

𝒮⊥ = {V ∈ ℰ | ⟨U, V⟩ = 0 for all U ∈ 𝒮}

is the orthogonal complement of 𝒮. The notation 𝒮⊥ is often read "𝒮 perp."
Example 4  If 𝒮 is the line ℒ{(1, 3, −2)} in ℰ³, then what is the orthogonal complement of 𝒮?

Suppose (x, y, z) ∈ 𝒮⊥; then (x, y, z) ∘ (1, 3, −2) = 0, or x + 3y − 2z = 0. Conversely, if a, b, c satisfy the equation a + 3b − 2c = 0, then (a, b, c) ∘ (1, 3, −2) = 0 and (a, b, c) ∈ 𝒮⊥. Therefore 𝒮⊥ is the plane with Cartesian equation x + 3y − 2z = 0. (1, 3, −2) are direction numbers normal to this plane, so 𝒮⊥ is perpendicular to the line 𝒮 in the geometric sense.
As a consequence of problem 7, page 254, the orthogonal complement of a subspace is also a subspace. So if 𝒮 is a subspace of an inner product space ℰ, then the subspace 𝒮 + 𝒮⊥ exists. This sum must be direct, for if U ∈ 𝒮 and U ∈ 𝒮⊥, then ⟨U, U⟩ = 0 and U = 0. (Why?) If ℰ is finite dimensional, then Corollary 6.9 can be used to show that 𝒮 ⊕ 𝒮⊥ is a direct sum decomposition of ℰ.
Theorem 6.10  If 𝒮 is a subspace of a finite-dimensional inner product space ℰ, then ℰ = 𝒮 ⊕ 𝒮⊥.

Proof  As a subspace of ℰ, 𝒮 is an inner product space and therefore has an orthonormal basis {V₁, ..., Vₖ}. This basis can be extended to an orthonormal basis {V₁, ..., Vₖ, Vₖ₊₁, ..., Vₙ} for ℰ. The proof will be complete if it can be shown that the vectors Vₖ₊₁, ..., Vₙ ∈ 𝒮⊥. (Why is this?) Therefore consider Vⱼ with j > k. If U ∈ 𝒮, then there exist scalars a₁, ..., aₖ such that U = a₁V₁ + ⋯ + aₖVₖ, so

⟨U, Vⱼ⟩ = a₁⟨V₁, Vⱼ⟩ + ⋯ + aₖ⟨Vₖ, Vⱼ⟩ = 0.

Therefore Vⱼ ∈ 𝒮⊥ for k < j ≤ n and ℰ = 𝒮 ⊕ 𝒮⊥.
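The decomposition ℰ = 𝒮 ⊕ 𝒮⊥ of Theorem 6.10 can be illustrated numerically with the line 𝒮 of Example 4; the sample vector U below is our own choice, not the text's:

```python
def dot(u, v):
    return sum(a*b for a, b in zip(u, v))

# S is the line L{(1, 3, -2)} of Example 4; decompose U = (4, 1, 1).
n = (1, 3, -2)
U = (4, 1, 1)

c = dot(U, n) / dot(n, n)                        # coefficient of the projection
proj = tuple(c * x for x in n)                   # the component in S
perp = tuple(a - b for a, b in zip(U, proj))     # the component in S-perp
print(proj, perp)
# perp satisfies x + 3y - 2z = 0 (it lies in S-perp), and U = proj + perp.
```

Since the component in 𝒮⊥ is orthogonal to (1, 3, −2) and the two components sum back to U, the decomposition is direct, as the theorem asserts.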
Problems

1. Use the Gram-Schmidt process to obtain an orthonormal basis for the spaces spanned by the following subsets of ℰⁿ.
a. {(2, 0, 0), (3, 0, 5)}.
b. {(2, 3), (1, 2)}.
c. {(1, −1, 1), (2, 1, 5)}.
d. {(2, 2, 2, 2), (3, 2, 0, 3), (0, −2, 0, 6)}.

2. Construct an orthonormal basis from {1, t, t²} for the inner product space (R₃[t], ⟨ , ⟩) where ⟨P, Q⟩ = ∫₀¹ P(t)Q(t) dt.

3. Let ℰ = (𝒱₂, ⟨ , ⟩), where the inner product is given by

⟨(a, b), (c, d)⟩ = 4ac − ad − bc + 2bd.

Use the Gram-Schmidt process to find an orthonormal basis for ℰ
a. starting with the standard basis for 𝒱₂.
b. starting with the basis {(2, 2), (−3, 7)}.

4. Define an inner product on 𝒱₃ by

⟨(a₁, a₂, a₃), (b₁, b₂, b₃)⟩ = a₁b₁ − 2a₁b₂ − 2a₂b₁ + 5a₂b₂ + a₂b₃ + a₃b₂ + 4a₃b₃.

Use the Gram-Schmidt process to construct an orthonormal basis for the inner product space (𝒱₃, ⟨ , ⟩) from the standard basis for 𝒱₃.

5. Carefully write an induction proof for the Gram-Schmidt process.

6. Find 𝒮⊥ for the following subsets of ℰⁿ:
a. 𝒮 = ℒ{(2, 1, 4), (1, 3, 1)}.
b. 𝒮 = ℒ{(1, 3)}.
c. 𝒮 = ℒ{(2, −1, 3), (−4, 2, −6)}.
d. 𝒮 = ℒ{(1, 0, −2)}.
e. 𝒮 = ℒ{(2, 0, 1, 0), (1, 3, 0, 2), (0, 2, 2, 0)}.
7. Let T be a linear map from ℰ³ to ℰ³ such that T(U) ∘ T(V) = U ∘ V for all U, V ∈ ℰ³. Suppose a subspace 𝒮 is invariant under T, and prove that 𝒮⊥ is also invariant.
8. a. Verify that T(a, b, e) = Ga - b, !a + !e) satisfies the hypotheses of
problem 7.
b. Find a line 𝒮 and the plane 𝒮⊥ that are invariant under this T.

9. Verify that ℰ² = 𝒮 ⊕ 𝒮⊥ when 𝒮 = ℒ{(1, 4)}. What is the geometric interpretation of this direct sum decomposition of the plane?

10. Verify that ℰ⁴ = 𝒮 ⊕ 𝒮⊥ when 𝒮 is the plane spanned by (1, 0, 2, 4) and (2, 0, −1, 0).
4. Orthogonal Transformations
What type of map sends an inner product space into an inner product space? Since an inner product space is a vector space together with an inner product, the map should preserve both the vector space structure and the inner product. That is, it must be linear, and the inner product of the images of two vectors must equal the inner product of the two vectors.

Definition  Let (𝒱, ⟨ , ⟩_𝒱) and (𝒲, ⟨ , ⟩_𝒲) be inner product spaces and T: 𝒱 → 𝒲 a linear map. T is orthogonal if for all U, V ∈ 𝒱,

⟨T(U), T(V)⟩_𝒲 = ⟨U, V⟩_𝒱.

Since the norm of a vector is defined using the inner product, it is preserved by an orthogonal map. Thus the image of a nonzero vector is also nonzero, implying that every orthogonal map is nonsingular.
Example 1  The rotation about the origin in ℰ² through the angle φ is an orthogonal transformation of ℰ² to itself.

It is not difficult to show that T(U) ∘ T(V) = U ∘ V for all U, V ∈ ℰ² when

T(a, b) = (a cos φ − b sin φ, a sin φ + b cos φ).

Use the fact that sin²φ + cos²φ = 1.
The distance between two points A and B in the plane (or ℰⁿ) is the norm of the difference A − B. If T: ℰ² → ℰ² is orthogonal, then

‖T(A) − T(B)‖ = ‖T(A − B)‖ = ‖A − B‖.

That is, an orthogonal map of ℰ² to itself preserves the Euclidean distance between points.
Example 2  The reflection of ℰ² in a line through the origin is an orthogonal transformation.

Suppose T is the reflection of ℰ² in the line ℓ with vector equation P = tU, t ∈ R, where U is a unit vector. For any V ∈ ℰ², let W = (V ∘ U)U; then W is the orthogonal projection of V on U. Since W is the midpoint of the segment joining V and T(V), W = ½(V + T(V)); see Figure 2. Therefore

T(V) = 2W − V = 2(V ∘ U)U − V.

The reflection T is linear, for

T(r₁V₁ + r₂V₂) = 2[(r₁V₁ + r₂V₂) ∘ U]U − (r₁V₁ + r₂V₂)
  = r₁2(V₁ ∘ U)U − r₁V₁ + r₂2(V₂ ∘ U)U − r₂V₂
  = r₁T(V₁) + r₂T(V₂),

and T preserves the dot product, for

T(V₁) ∘ T(V₂) = [2(V₁ ∘ U)U − V₁] ∘ [2(V₂ ∘ U)U − V₂]
  = 4(V₁ ∘ U)(V₂ ∘ U) − 2(V₁ ∘ U)(U ∘ V₂) − 2(V₂ ∘ U)(V₁ ∘ U) + V₁ ∘ V₂
  = V₁ ∘ V₂.

Thus T is an orthogonal transformation. If U = (cos α, sin α), as in Figure 2, and V = (a, b), then a short computation shows that the reflection T is given by

T(a, b) = (a cos 2α + b sin 2α, a sin 2α − b cos 2α).

Figure 2

Notice that T transforms the standard basis for ℰ² to {(cos φ, sin φ), (sin φ, −cos φ)} when φ = 2α. This is the second form for an orthonormal basis for ℰ² obtained in Example 2, page 251.
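The two descriptions of the reflection obtained in Example 2, the vector formula T(V) = 2(V ∘ U)U − V and the closed form in the angle 2α, can be compared numerically; in this sketch the angle and the test vector are arbitrary choices of ours:

```python
import math

def reflect(v, alpha):
    """Reflection of E^2 in the line with unit direction U = (cos a, sin a),
    computed by the vector formula T(V) = 2(V . U)U - V."""
    u = (math.cos(alpha), math.sin(alpha))
    c = 2 * (v[0]*u[0] + v[1]*u[1])
    return (c*u[0] - v[0], c*u[1] - v[1])

def reflect2(v, alpha):
    """The closed form T(a, b) = (a cos 2a + b sin 2a, a sin 2a - b cos 2a)."""
    a, b = v
    return (a*math.cos(2*alpha) + b*math.sin(2*alpha),
            a*math.sin(2*alpha) - b*math.cos(2*alpha))

alpha = 0.7
V, W = (3.0, -2.0), (1.0, 4.0)
print(reflect(V, alpha), reflect2(V, alpha))   # the two forms agree
# The dot product is preserved: T(V) . T(W) equals V . W.
```

Identifying the two forms is exactly the double-angle computation alluded to in the text: 2 cos²α − 1 = cos 2α and 2 sin α cos α = sin 2α.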
We will find in Theorem 6.13 that the only orthogonal transformations of ℰ² to itself are rotations about the origin and reflections in lines through the origin.
The dot product was defined in ℰ² to introduce the geometric properties of length and angle. Since an orthogonal map from ℰ² to itself preserves the dot product, it preserves these geometric properties. However, the converse is not true. That is, there are transformations of the plane into itself that preserve length and angle but are not linear. A translation, given by T(V) = V + P with P a nonzero fixed vector, is such a transformation.
Theorem 6.11  Let T ∈ Hom(𝒱, 𝒲) and suppose B = {V₁, ..., Vₙ} is an orthonormal basis for (𝒱, ⟨ , ⟩_𝒱). Then T is orthogonal if and only if {T(V₁), ..., T(Vₙ)} is an orthonormal set in (𝒲, ⟨ , ⟩_𝒲).

Proof  (⇒) If T is orthogonal, then ⟨T(Vᵢ), T(Vⱼ)⟩_𝒲 = ⟨Vᵢ, Vⱼ⟩_𝒱 = δᵢⱼ, so the images form an orthonormal set.

(⇐) Suppose ⟨T(Vᵢ), T(Vⱼ)⟩_𝒲 = δᵢⱼ and let U, V ∈ 𝒱. If U = Σᵢ₌₁ⁿ aᵢVᵢ and V = Σⱼ₌₁ⁿ bⱼVⱼ, then

⟨T(U), T(V)⟩_𝒲 = Σᵢ₌₁ⁿ aᵢ Σⱼ₌₁ⁿ bⱼ⟨T(Vᵢ), T(Vⱼ)⟩_𝒲
  = Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ aᵢbⱼδᵢⱼ = Σᵢ₌₁ⁿ aᵢbᵢ = ⟨U, V⟩_𝒱.
Definition  Two inner product spaces are isomorphic if there exists an orthogonal map from one onto the other.
Theorem 6.12  Any two n-dimensional inner product spaces are isomorphic.

Proof  Suppose ℰ and ℰ′ are n-dimensional inner product spaces. Let B = {V₁, ..., Vₙ} and B′ = {V₁′, ..., Vₙ′} be orthonormal bases for ℰ and ℰ′, respectively. Set T(Vᵢ) = Vᵢ′, 1 ≤ i ≤ n, and extend T linearly to all of ℰ. Then by Theorem 6.11, T is orthogonal. Since T is clearly onto, the spaces are isomorphic.

Thus any n-dimensional inner product space is algebraically indistinguishable from ℰⁿ. This result corresponds to the fact that every n-dimensional vector space is isomorphic to the space of n-tuples, 𝒱ₙ. It of course does not say that ℰⁿ is the only n-dimensional inner product space.
Suppose T: ℰⁿ → ℰⁿ is an orthogonal map and B = {V₁, ..., Vₙ} is an orthonormal basis for ℰⁿ. What is the nature of the matrix A for T with respect to B? If A = (aᵢⱼ), then T(Vₖ): (a₁ₖ, ..., aₙₖ)_B. The fact that ⟨ , ⟩ = ⟨ , ⟩_B and ⟨T(Vᵢ), T(Vⱼ)⟩ = δᵢⱼ implies a₁ᵢa₁ⱼ + ⋯ + aₙᵢaₙⱼ = δᵢⱼ. But this sum is the entry from the ith row and jth column of the matrix product AᵀA. That is, AᵀA = (δᵢⱼ) = Iₙ, or the inverse of A is its transpose.
Definition  A nonsingular matrix A is orthogonal if A⁻¹ = Aᵀ.
Notice that the condition A⁻¹ = Aᵀ implies that the n rows (or the n transposed columns) of A form an orthonormal basis for ℰⁿ, and conversely. Using this idea, the orthogonal maps from ℰ² to itself can easily be characterized.
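The condition A⁻¹ = Aᵀ is easy to test numerically. Here is a minimal sketch using the rotation matrix of Example 1 (the helper names are ours):

```python
import math

phi = 0.9   # an arbitrary angle
A = [[math.cos(phi), -math.sin(phi)],
     [math.sin(phi),  math.cos(phi)]]

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(M, N):
    return [[sum(M[i][k]*N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

# A^T A should be the identity; equivalently A^{-1} = A^T,
# and the rows of A are an orthonormal basis for E^2.
P = matmul(transpose(A), A)
print(P)  # approximately [[1, 0], [0, 1]]
```

The same check works for any size: an n × n matrix is orthogonal exactly when this product is Iₙ.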
Theorem 6.13  If T: ℰ² → ℰ² is orthogonal, then T is either a rotation or a reflection.

Proof  If A is the matrix of T with respect to the standard basis, then A is orthogonal. Thus if

A = ( a  c )
    ( b  d ),

then B = {(a, b), (c, d)} is an orthonormal basis for ℰ². Now Example 2, page 251, states that there exists an angle φ such that

B = {(cos φ, sin φ), ±(−sin φ, cos φ)}.

Taking the plus sign for B gives

A = ( cos φ  −sin φ )
    ( sin φ   cos φ ).

Since A is the matrix of T with respect to the standard basis,

T(x, y) = (x cos φ − y sin φ, x sin φ + y cos φ).

That is, T is the rotation about the origin through the angle φ.

Using the minus sign for B gives

A = ( cos φ   sin φ )
    ( sin φ  −cos φ ),

and T is given by

T(x, y) = (x cos φ + y sin φ, x sin φ − y cos φ).

That is, T is the reflection in the line through the origin with direction numbers (cos φ/2, sin φ/2).
Example 3  Determine the nature of the orthogonal map from ℰ² to itself given by

T(x, y) = (−(4/5)x + (3/5)y, (3/5)x + (4/5)y).

The matrix of T with respect to the standard basis is

( −4/5  3/5 )
(  3/5  4/5 ).

With cos φ = −4/5 and sin φ = 3/5, this is seen to be the matrix of a reflection in a line ℓ. The slope of ℓ is given by

tan(φ/2) = sin φ/(1 + cos φ) = (3/5)/(1/5) = 3.

Since φ is in the second quadrant, φ/2 is in the first quadrant and the slope of ℓ is 3. Therefore T is the reflection in the line with equation y = 3x.
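The conclusion of Example 3 can be checked directly: a reflection in the line y = 3x must fix the direction (1, 3) and reverse the perpendicular direction (3, −1).

```python
def T(x, y):
    # the orthogonal map of Example 3
    return (-4/5*x + 3/5*y, 3/5*x + 4/5*y)

print(T(1, 3))    # approximately (1, 3): the line y = 3x is fixed
print(T(3, -1))   # approximately (-3, 1): the perpendicular is reversed
```

Since one direction is fixed and the perpendicular one is negated, T is indeed a reflection and not a rotation.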
It should be clear that any reflection in a line through the origin of ℰ² has a matrix of the form

( 1   0 )
( 0  −1 )

with respect to some orthonormal basis, for one line (subspace) is fixed and another is reflected into itself. This result can also be obtained by considering the matrix A for T with respect to the standard basis. Since

A = ( cos φ   sin φ )
    ( sin φ  −cos φ ),

the characteristic equation for A is λ² = 1. Thus ±1 are the characteristic roots of A, and A is similar to diag(1, −1). For the map in Example 3, the orthonormal basis {(1/√10, 3/√10), (3/√10, −1/√10)} yields the matrix diag(1, −1) for T.

The previous argument shows that if A is a matrix for a reflection in the plane, then det A = −1. On the other hand, if A is the matrix of a rotation in the plane, then det A = cos²φ + sin²φ = 1. So the matrix of an orthogonal map in the plane has determinant ±1. This is a general property of orthogonal matrices following from the fact that AAᵀ = Iₙ.
Theorem 6.14  If A is an orthogonal matrix, then det A = ±1.
Characterizing the orthogonal transformations of 3-space, ℰ³, is only a little more difficult than for the plane. Suppose T: ℰ³ → ℰ³ is orthogonal. The characteristic polynomial of T must have a nonzero real characteristic root r, since it is a polynomial of odd degree. Let V₁ be a unit characteristic vector corresponding to r, and let B = {V₁, V₂, V₃} be an orthonormal basis for ℰ³ (B exists by Corollary 6.9 on page 258). Then the matrix A of T with respect to B has the form

A = ( r  x  y )
    ( 0  a  b )
    ( 0  c  d ).

Since A is orthogonal, its columns are orthonormal in ℰ³. Therefore r = ±1, x = y = 0, and the 2 × 2 block (a b; c d) is orthogonal. That is, the line ℒ{V₁} is invariant under T, and its orthogonal complement, the plane 𝒮 = ℒ{V₂, V₃}, is sent into itself by the orthogonal restriction map T_𝒮: 𝒮 → 𝒮.
This already says quite a bit about an arbitrary orthogonal transformation of ℰ³. But if

det T_𝒮 = 1,

then T_𝒮 is a rotation in the plane 𝒮 and there is an angle φ such that

A = ( ±1    0       0    )
    (  0  cos φ  −sin φ  )
    (  0  sin φ   cos φ  ).

On the other hand, if det T_𝒮 = −1, then T_𝒮 reflects the plane ℒ{V₂, V₃} into itself. Therefore there exist vectors V₂′ and V₃′ such that B′ = {V₁, V₂′, V₃′} is an orthonormal basis for ℰ³ and the matrix of T with respect to B′ is

A′ = ( ±1  0   0 )
     (  0  1   0 )
     (  0  0  −1 ).

Since one of the matrices denoted by A′ is similar to one denoted by A when φ = π, there are three different types of matrix for T. So we have found that if T: ℰ³ → ℰ³ is orthogonal, then T has one of the following matrices with respect to some orthonormal basis {U₁, U₂, U₃} for ℰ³:

( 1    0       0    )    ( −1  0  0 )       ( −1    0       0    )
( 0  cos φ  −sin φ  ),   (  0  1  0 ),  or  (  0  cos φ  −sin φ  ).
( 0  sin φ   cos φ  )    (  0  0  1 )       (  0  sin φ   cos φ  )

If T has the first matrix, then T is called a rotation of ℰ³. Thus a rotation of ℰ³ is the identity on a line, ℒ{U₁}, called the axis of the rotation. Further, the rotation T rotates every plane parallel to ℒ{U₂, U₃} into itself, through the angle φ. This angle φ is called the angle of rotation of T, with the positive direction being measured from U₂ to U₃. If T has the second matrix above, then T is a reflection of ℰ³ in the plane ℒ{U₂, U₃}. Including the third form for the matrix of T, we have the following.
Theorem 6.15  An orthogonal transformation of ℰ³ to itself is either a rotation, a reflection in a plane, or the composition of a rotation and a reflection in a plane.
An orthogonal transformation in ℰ² or ℰ³ is a rotation if its matrix with respect to an orthonormal basis has determinant +1. Therefore the following definition is made.

Definition  Let T: ℰⁿ → ℰⁿ be an orthogonal map, B an orthonormal basis for ℰⁿ, and A the matrix of T with respect to B. Then T is a rotation of ℰⁿ if det A = 1.

Can you guess how a rotation acts in 4-space, ℰ⁴?
Example 4  Suppose T: ℰ³ → ℰ³ is the orthogonal transformation having the matrix

A = (  1/3   2/3  −2/3 )
    ( −2/3   2/3   1/3 )
    (  2/3   1/3   2/3 )

with respect to the standard basis. Determine how T acts on ℰ³.

Since T is orthogonal and det A = 1, T is a rotation. If U lies on the axis of T, then T(U) = U. This equation has the solutions U = t(0, 1, 1), t ∈ R. Therefore the axis of T is ℒ{(0, 1, 1)}, and T rotates each plane with equation y + z = constant into itself. To find the angle of T, let

B = {(0, 1/√2, 1/√2), (0, −1/√2, 1/√2), (1, 0, 0)}.

This is an orthonormal basis for ℰ³ with the first vector on the axis of rotation. Computing the matrix A′ of T with respect to this basis yields

A′ = ( 1    0      0   )
     ( 0   1/3   √8/3  )
     ( 0  −√8/3   1/3  ).

Therefore the angle φ of T satisfies cos φ = 1/3 and sin φ = −√8/3.
The sign of φ is meaningful only in reference to a choice of basis. If the last two vectors in B are interchanged, the sign of φ is changed, for then cos φ = 1/3, sin φ = √8/3. Thus the amount of rotation might be found without reference to the sense of the rotation. This can be done by simply finding the angle between any nonzero vector V in ℒ{(0, 1, 1)}⊥ and its image T(V). If V is taken to be (1, 0, 0), then the angle φ satisfies

cos φ = V ∘ T(V)/(‖V‖‖T(V)‖) = (1, 0, 0) ∘ (1/3, −2/3, 2/3) = 1/3.
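The computations of Example 4 can be repeated numerically: the axis is a fixed vector of the matrix, and the size of the angle comes from V ∘ T(V) for a unit vector V perpendicular to the axis. The helper `matvec` is ours:

```python
def matvec(A, v):
    return tuple(sum(A[i][j]*v[j] for j in range(3)) for i in range(3))

# the rotation matrix of Example 4
A = [[ 1/3,  2/3, -2/3],
     [-2/3,  2/3,  1/3],
     [ 2/3,  1/3,  2/3]]

axis = (0, 1, 1)
print(matvec(A, axis))   # approximately (0, 1, 1): the axis is fixed

V = (1, 0, 0)            # a unit vector perpendicular to the axis
TV = matvec(A, V)
cos_phi = sum(a*b for a, b in zip(V, TV))
print(cos_phi)           # approximately 1/3
```

As the text notes, this recovers only the amount of rotation, cos φ = 1/3; the sense of the rotation still depends on a choice of orthonormal basis.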
One further general result should be stated to complete this section.

Theorem 6.16  The matrix of an orthogonal transformation with respect to an orthonormal basis is orthogonal. Conversely, given an n × n orthogonal matrix A and an n-dimensional inner product space ℰ with an orthonormal basis B, there exists an orthogonal map T: ℰ → ℰ with matrix A with respect to B.

Proof  The first statement has already been derived. For the second, suppose A = (aᵢⱼ) is orthogonal and B = {V₁, ..., Vₙ} is an orthonormal basis for ℰ. Define T by T(Vⱼ) = a₁ⱼV₁ + ⋯ + aₙⱼVₙ and extend linearly to all of ℰ. Then T is linear and has matrix A with respect to B, so it is only necessary to show that T preserves the inner product. It is sufficient to show that ⟨T(Vᵢ), T(Vⱼ)⟩ = δᵢⱼ, and

⟨T(Vᵢ), T(Vⱼ)⟩ = a₁ᵢa₁ⱼ + ⋯ + aₙᵢaₙⱼ,

which is the element from the ith row and jth column of the product AᵀA. But AᵀA = Iₙ, so ⟨T(Vᵢ), T(Vⱼ)⟩ = δᵢⱼ and T preserves the inner product of ℰ.
Problems

1. Determine the nature of the following orthogonal maps in ℰ².
a. T(x, y) = (½x + (√3/2)y, (√3/2)x − ½y).
b. T(x, y) = (½x + (√3/2)y, −(√3/2)x + ½y).
c. T(x, y) = (−½x + (√3/2)y, −(√3/2)x − ½y).
d. T(x, y) = (x, y).
e. T(x, y) = (−y, −x).

2. a. Show that the composition of two orthogonal maps is orthogonal.
b. Show that the composition of two rotations in ℰⁿ is a rotation.
c. Is the composition of two reflections a reflection?

3. Suppose T: 𝒱 → 𝒱 is linear and preserves the norm; show that T is orthogonal.

4. Fill in the missing entries to obtain an orthogonal matrix.

a. ( 1/3  2/3  a   )      b. ( 1/√3  1/√3  a    )
   ( b    2/3  c   )         ( b     c     1/√2 )
   ( 2/3  d    2/3 )         ( 1/√6  d     1/√6 )

5. Determine which of the following maps of ℰ³ are orthogonal and give a geometric description of those that are orthogonal.
a. T(a, b, c) = (a/√2 + b/√2, b/√2 − a/√2, −c).
b. T(a, b, c) = (b/√2 + c/√2, a, b/√2 + c/√2).
c. T(a, b, c) = (b/√2 − a/√2, a/√2 + b/√2, −c).
d. T(a, b, c) = (a/√2 − c/√2, b, a/√2 + c/√2).
e. T(a, b, c) = (a/√2 + b/√2, b/√2 − c/√2, a/√2 + c/√2).

6. Give two different proofs of the fact that the product of two n × n orthogonal matrices is orthogonal.

7. Determine how the orthogonal transformation T acts on ℰ³ if the matrix of T with respect to the standard basis is

a. (  6/7  2/7   3/7 )      b. (  6/7  2/7  −3/7 )
   (  2/7  3/7  −6/7 )         (  2/7  3/7   6/7 )
   ( −3/7  6/7   2/7 )         ( −3/7  6/7  −2/7 )

c. (  1/3  2/3  −2/3 )      d. ( 2/3   1/3  −2/3 )
   (  2/3  1/3   2/3 )         ( 2/3  −2/3   1/3 )
   ( −2/3  2/3   1/3 )         ( 1/3   2/3   2/3 )

e. ( −1/9  −8/9   4/9 )     f. (  1/9  8/9   4/9 )
   ( −8/9  −1/9  −4/9 )        (  8/9  1/9  −4/9 )
   (  4/9  −4/9  −7/9 )        ( −4/9  4/9  −7/9 )

8. Given P ∈ ℰⁿ, the map T defined by T(V) = V + P for all V ∈ ℰⁿ is called a translation.
a. Show that a translation preserves the distance between points.
b. When is a translation orthogonal?

9. Find a vector expression for a reflection of the plane ℰ² in a line that does not pass through the origin. Show that this map preserves distance between points but is not linear and does not preserve the dot product.

10. Let T: ℰ⁴ → ℰ⁴ be orthogonal and suppose T has a real characteristic root.
a. Show that T has two real characteristic roots.
b. Describe the possible action of T in terms of a rotation about a plane and reflections in hyperplanes.

11. Prove Euler's theorem: If T: ℰⁿ → ℰⁿ is a rotation and n is odd, then 1 is a characteristic value for T. That is, any rotation of ℰⁿ fixes some line, when n is odd.
5. Vector Spaces over Arbitrary Fields

Although the point has not been stressed, all the vector spaces considered to this point have been real vector spaces. That is, the scalars have been real numbers. But there are many situations in which it is necessary to use scalars from other "number systems." For example, we have ignored complex roots of a characteristic equation simply because of the requirement that a scalar be a real number. Because of the existence of complex roots, it will be necessary to introduce complex inner product spaces to prove an important theorem about diagonalizability.

Before considering the complex case in particular, it is worthwhile to see exactly how our definition of a vector space might be generalized. Instead of using scalars from the real number system, the scalars could be chosen from any system that shares certain properties with the real numbers. The abstract algebraic system satisfying these properties is called a field.

Definition  Let F be a set on which two operations called addition and multiplication are defined such that for all a, b ∈ F, a + b ∈ F and ab ∈ F. The system (F, +, ·) is a field if the following properties hold for all a, b, c ∈ F:
1. a + (b + c) = (a + b) + c and a(bc) = (ab)c.
2. a + b = b + a and ab = ba.
3. There exists an element 0 ∈ F such that a + 0 = a, and there exists an element 1 ∈ F, 1 ≠ 0, such that a1 = a.
4. For each a ∈ F, there exists −a ∈ F such that a + (−a) = 0, and for each a ∈ F, if a ≠ 0, there exists a⁻¹ ∈ F such that aa⁻¹ = 1.
5. a(b + c) = ab + ac.

It should be clear that the real number system is a field. In fact, the properties required of a field are the same as those listed for R on page 2 in preparation for defining the vector space 𝒱₂. It is not difficult to show that the complex numbers C = {a + bi | a, b ∈ R} form a field with addition and multiplication defined by

(a + bi) + (c + di) = a + c + (b + d)i

and

(a + bi)(c + di) = ac − bd + (ad + bc)i.

There are many fields besides the real and complex numbers. For example, the set of all rational numbers, denoted by Q, forms a field within R. The sum and product of rational numbers is rational, so Q is closed under addition and multiplication. The identities 0 and 1 of R are in Q, and the additive and multiplicative inverses of rational numbers are rational numbers. Thus Q is a "subfield" of R.
The fields R, C, and Q are infinite fields in that they contain an infinite number of elements. But there are also finite fields. The simplest finite fields, denoted by Zₚ, are the fields of integers modulo p, where p is a prime. Zₚ is the set {0, 1, 2, ..., p − 1} together with addition and multiplication defined using the operations in the integers as follows:

a + b = c  if a + b = kp + c for some k ∈ Z, 0 ≤ c < p,

and

ab = d  if ab = hp + d for some h ∈ Z, 0 ≤ d < p.

That is, a + b and ab in Zₚ are the remainders obtained when a + b and ab are divided by p in Z. For example, in Z₇, 2 + 3 = 5, 6 + 5 = 4, 5·2 = 3, 5·6 = 2, and 2·3 = 6. A proof that Zₚ is a field is omitted, since these finite fields are introduced here simply to indicate the diversity of fields. The field Z₅ is considered in problem 6 at the end of this section, and you might note that the elements of Z₅ are essentially the equivalence classes obtained in problem 3, page 221. The simplest field is Z₂, which contains only 0 and 1. In Z₂, 1 + 1 = 0, so 1 is its own additive inverse! This fact leads to some interesting complications when statements are made for arbitrary fields.
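Arithmetic in Zₚ is just remainder arithmetic, so the Z₇ computations above are easy to reproduce; the inverse table at the end is what makes Z₇ a field, since every nonzero element has a multiplicative inverse:

```python
p = 7  # arithmetic in Z_7: reduce sums and products mod p

def add(a, b):
    return (a + b) % p

def mul(a, b):
    return (a * b) % p

print(add(2, 3), add(6, 5))              # 5 4
print(mul(5, 2), mul(5, 6), mul(2, 3))   # 3 2 6

# Each nonzero a has a b with ab = 1 in Z_7.
inverses = {a: next(b for b in range(1, p) if mul(a, b) == 1)
            for a in range(1, p)}
print(inverses)
```

If p is not prime the inverse search fails for some elements (in Z₆, for instance, 2 has no inverse), which is why the text requires p to be a prime.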
The real number system and any field F share the properties used in defining a vector space. Therefore the concept of a vector space can be generalized by simply replacing R by F in our original definition. This yields the definition of a vector space 𝒱 over a field F. This is an algebraic system consisting of a set 𝒱, whose elements are called vectors, a field F, whose elements are called scalars, and operations of vector addition and scalar multiplication.
Given any field F, the set of ordered n-tuples can be made into a vector space over F for each positive integer n. Let

𝒱ₙ(F) = {(a₁, ..., aₙ) | a₁, ..., aₙ ∈ F}

and define

(a₁, ..., aₙ) + (b₁, ..., bₙ) = (a₁ + b₁, ..., aₙ + bₙ),
c(a₁, ..., aₙ) = (ca₁, ..., caₙ),  for c ∈ F.

The proof that 𝒱ₙ(F) together with these two operations is a vector space over F is exactly like the proof that 𝒱ₙ is a real vector space. In fact, 𝒱ₙ(R) is 𝒱ₙ.
Similarly, the set of all polynomials in t with coefficients from F, denoted by F[t], can be turned into a vector space over the field F. The set of all n × m matrices with elements from F, denoted ℳₙₓₘ(F), forms a vector space over F, and if 𝒱 and 𝒲 are vector spaces over F, then so is Hom(𝒱, 𝒲).
Example 1 Consider the vector space V_3(Z_2). Since the field contains
only two elements, it is possible to list all the vectors in this space. They
are: (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1).
Therefore V_3(Z_2) is finite in the sense that there are only eight vectors. It
must also be finite dimensional since a basis could contain at most seven
vectors. However, the seven nonzero vectors are not linearly independent.
For example, the set {(1, 1, 0), (1, 0, 1), (0, 1, 1)} is linearly dependent in
V_3(Z_2). In fact, (1, 1, 0) + (1, 0, 1) + (0, 1, 1) = (1 + 1, 1 + 1, 1 + 1)
= (0, 0, 0).
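The dependence relation in Example 1 can be confirmed componentwise, reducing each entry mod 2 (a quick sketch with my own helper name, not part of the text):

```python
# Add vectors in V_3(Z_2) componentwise, reducing each entry mod 2.
def add_mod2(u, v):
    return tuple((a + b) % 2 for a, b in zip(u, v))

total = (0, 0, 0)
for vec in [(1, 1, 0), (1, 0, 1), (0, 1, 1)]:
    total = add_mod2(total, vec)

print(total)  # (0, 0, 0): the three nonzero vectors sum to zero, so they are dependent
```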
Example 1 implies that the basic properties of V_n(R) carry over to
V_n(F). In particular, the standard basis {E_i} for V_n(R) is a basis for V_n(F).
Therefore the dimension of V_n(F) is n.
Example 2 Consider the vector space V_2(C). The vectors in this
space are ordered pairs of complex numbers, such as (3 - i, 2i - 6), (1, 5),
(6i, i). Given an arbitrary vector (z, w) with z, w ∈ C, (z, w) = z(1, 0) + w(0, 1),
so {E_i} spans V_2(C). And if

z(1, 0) + w(0, 1) = 0 = (0, 0),

then z = 0 and w = 0 by the definition of an ordered pair, so {E_i} is linearly
independent. So {E_i} is a basis for V_2(C) and the vector space has dimension
2.
Example 3 Let T: V_2(C) → V_2(C) be defined by

T(z, w) = (iz + 3w, (1 - i)w).

The image of any vector under T is easily computed. For example,

T(2 + 3i, 3 - i) = (i(2 + 3i) + 3(3 - i), (1 - i)(3 - i))
                 = (6 - i, 2 - 4i).
274    6. INNER PRODUCT SPACES
T is a linear map, for if U = (z_1, w_1) and V = (z_2, w_2), then

T(U + V) = T(z_1 + z_2, w_1 + w_2)
         = (i[z_1 + z_2] + 3[w_1 + w_2], (1 - i)[w_1 + w_2])
         = ([iz_1 + 3w_1] + [iz_2 + 3w_2], (1 - i)w_1 + (1 - i)w_2)
         = (iz_1 + 3w_1, (1 - i)w_1) + (iz_2 + 3w_2, (1 - i)w_2)
         = T(U) + T(V).
Similarly, T(zU) = zT(U) for any z ∈ C.
Since T(1, 0) = (i, 0) and T(0, 1) = (3, 1 - i), the matrix of T with
respect to the standard basis {E_i} is

A = ( i    3   )
    ( 0  1 - i )  ∈ M_{2×2}(C).

The determinant of A is i(1 - i) = 1 + i, so T is nonsingular and the matrix
of T⁻¹ with respect to {E_i} is

A⁻¹ = 1/(1 + i) ( 1 - i  -3 )   ( -i  (3i - 3)/2 )
                (   0     i ) = (  0  (1 + i)/2  ).

Therefore

T⁻¹(a, b) = (-ia + ½(3i - 3)b, ½(1 + i)b).
You might check to see that either T ∘ T⁻¹ = I or T⁻¹ ∘ T = I.
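That check is easy to automate with Python's built-in complex type (1j denotes i); this sketch, with my own function names, verifies T ∘ T⁻¹ = I on a sample vector:

```python
# T and its inverse from Example 3, using Python complex numbers (1j = i).
def T(z, w):
    return (1j * z + 3 * w, (1 - 1j) * w)

def T_inv(a, b):
    return (-1j * a + (3j - 3) / 2 * b, (1 + 1j) / 2 * b)

z, w = 2 + 3j, 3 - 1j
a, b = T_inv(*T(z, w))
print(abs(a - z) < 1e-12 and abs(b - w) < 1e-12)  # True: T_inv undoes T
```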
The characteristic polynomial of T is

(i - λ)(1 - i - λ) = λ² - λ + 1 + i.

Since T is defined on a complex vector space, its characteristic values may
be complex numbers. In this case i and 1 - i are the characteristic values for
T. These values are distinct, so T is diagonalizable, although its characteristic
vectors will have complex components. (1, 0) is a characteristic vector
corresponding to i, as is (i, 0) or (z, 0) for any complex number z ≠ 0, and
(3, 1 - 2i) is a characteristic vector corresponding to 1 - i. Therefore the
matrix of T with respect to the basis {(1, 0), (3, 1 - 2i)} is the diagonal matrix

( i    0   )
( 0  1 - i ).
Example 3 serves to show that anything that has been done in real vector
spaces can be done in a complex vector space. C could have been replaced by
an arbitrary field, and a similar statement could be made for a vector space
over any field F. Notice this means that all the results on systems of linear
equations apply to systems with coefficients from an arbitrary field. However,
when it comes to defining an inner product on an arbitrary vector space, the
nature of the field must be taken into account. One way to see why is to
consider the requirement that an inner product be positive definite. This
condition requires an order relation on the elements of the field. But not all
fields can be ordered; in particular, the complex numbers cannot be ordered,
see problem 15 below.
Problems
1. Compute the following.
a. (2 + i, 3i - 1) - (i - 5, 2 - 4i).
b. (3 + i)(1 + i, 2i - 4, 3 - i).
c. …
d. …
2. a. Show that B = {(1 + i, 3i), (2, 3 + 2i)} is a basis for V_2(C).
b. Find the coordinates of (4i - 2, 6i - 9) and (4i, 7i - 4) with respect to B.
3. a. Determine if S = {(i, 1 - i, 2), (2, 1, -i), (5 - 2i, 4, -1 - i)} is linearly
independent in V_3(C).
b. Determine if (3 + i, 4, 2) is in the span of S.
4. Prove that V_n(F) is a vector space over F, noting all the points at which it is
necessary to use some field property of F.
5. Define F_n[t] as R_n[t] was defined and show that dim F_n[t] = n.
6. Find the following elements in Z_5:
a. 3 + 4, 2 + 3, and 3 + 3.
b. 2·4, 3·4, 4², and 2⁶ (note 6 ∉ Z_5).
c. -1, -4, and -3.
d. 2⁻¹, 3⁻¹, and 4⁻¹.
e. Prove that Z_5 is a field.
7. Define Z_n, for any positive integer n, in the same way Z_p was defined. Are
Z_4 and Z_6 fields?
8. a. Find all vectors in ℒ{(1, 4)}, if (1, 4) ∈ V_2(Z_5).
b. Is {(2, 3), (3, 2)} a basis for V_2(Z_5)?
c. How many vectors are there in the vector space V_2(Z_5)?
d. Is {(2, 1, 3, 1), (4, 2, 1, 0), (1, 3, 4, 2)} linearly independent in V_4(Z_5)?
9. Let T: V_2(C) → V_2(C) be defined by T(z, w) = (w, -z).
a. Show that T is linear.
b. Find the characteristic equation of T.
c. Find the characteristic values and corresponding characteristic vectors
for T.
d. Find a basis for V_2(C) with respect to which T has a diagonal matrix.
10. Follow the directions from problem 9 for the map given by
T(z, w) = ([i - 1]z + [3i - 1]w, 2iz + 4w).
11. Let T(a + bi, c + di) = (a + d - ci, b + (a - d)i). Show that T is not in
Hom(V_2(C)).
12. Let V be the algebraic system whose elements are ordered pairs of real
numbers. Define addition as in V_2(R) and scalar multiplication by z(a, b) =
(za, zb), z ∈ C, a, b ∈ R. Determine if V is a vector space over C.
13. Let V be the algebraic system whose elements are ordered pairs of complex
numbers. Define addition as in V_2(C) and scalar multiplication by r(z, w) =
(rz, rw), r ∈ R, z, w ∈ C.
a. Show that V is a vector space over R.
b. What is the dimension of V?
14. Let V be the algebraic system whose elements are ordered pairs of real
numbers. Define addition as in V_2(R) and scalar multiplication by q(a, b) =
(qa, qb), q ∈ Q, a, b ∈ R. (Q denotes the set of rationals.)
a. Show that V is a vector space over Q.
b. How does V differ from V_2(Q)?
c. Is V a subspace of V_2(R)?
d. Show that {(1, 0), (√2, 0)} is linearly independent in V.
e. Is (π, π) in the subspace of V given by ℒ{(1, 0), (0, 1)}?
f. What can be concluded from parts d and e about the dimension of V?
15. A field F can be ordered if it contains a subset P (for positive) which is closed
under addition and multiplication, and for every x ∈ F exactly one of the
following holds: x ∈ P, x = 0, or -x ∈ P. Show that the field of complex
numbers cannot be ordered.
16. For each nonzero vector U ∈ E_2, suppose θ_U is an angle satisfying
U = (a, b) = ||U||(a/||U||, b/||U||) = ||U||(cos θ_U, sin θ_U).
Let T_{θ_U} denote the rotation of E_2 through the angle θ_U, and define a
multiplication in E_2 by
UV = ||U||T_{θ_U}(V)  if U ≠ 0 and V ∈ E_2
and
UV = 0  when U = 0.
That is, to multiply U times V, rotate V through the angle θ_U and then multiply
by the length of U.
a. Show that the set of all ordered pairs in E_2 together with vector addition
and the above multiplication is a field.
b. Show that {(a, 0) | a ∈ R} is a subfield which is isomorphic to the field of
real numbers.
c. Show that (0, 1)² = -1. Therefore if we set i = (0, 1), then i² = -1
using the identification given in part b.
d. Show that this field of ordered pairs is isomorphic to the field of complex
numbers.
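The multiplication in problem 16 can be explored numerically; the sketch below (function names are mine) rotates V through θ_U, scales by ||U||, and compares the result with ordinary complex multiplication:

```python
import math

# Multiply U by V: rotate V through the angle of U, then scale by ||U||.
def field_mul(U, V):
    a, b = U
    n = math.hypot(a, b)          # ||U||
    if n == 0:
        return (0.0, 0.0)
    c, s = a / n, b / n           # cos(theta_U), sin(theta_U)
    x, y = V
    # rotation through theta_U followed by scaling by ||U||
    return (n * (c * x - s * y), n * (s * x + c * y))

U, V = (1.0, 2.0), (3.0, -1.0)
prod = field_mul(U, V)
zw = complex(*U) * complex(*V)    # ordinary complex multiplication
print(abs(prod[0] - zw.real) < 1e-9 and abs(prod[1] - zw.imag) < 1e-9)  # True
```

This agreement for every pair of vectors is exactly the isomorphism asked for in part d.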
6. Complex Inner Product Spaces and Hermitian
Transformations
The dot product of E_n cannot be directly generalized to give an inner
product on V_n(C) because the square of a complex number need not be real,
let alone positive, e.g., (1 - i)² = -2i. However a positive definite function
can be defined on V_n(C) using the fact that the product of a + bi and
a - bi is both real and nonnegative.
Definition The conjugate of the complex number z = a + bi is
z̄ = a - bi.

Thus the conjugate of 5 - 3i is 5 + 3i, the conjugate of 2i - 7 is -2i - 7,
and the conjugate of 6 is 6.
All of the following properties of the complex conjugate are easily derived
from the definition.

Theorem 6.17 Suppose z and w are complex numbers, then
1. conj(z + w) = z̄ + w̄.
2. conj(zw) = z̄w̄.
3. zz̄ = a² + b² if z = a + bi, a, b ∈ R.
4. conj(z̄) = z.
5. z̄ = z if and only if z ∈ R.
Definition For Z = (z_1, ..., z_n) and W = (w_1, ..., w_n) in V_n(C),
let Z ∘ W = z_1w̄_1 + ··· + z_nw̄_n.

For two ordered triples of complex numbers,

(3 - i, 2, 1 + i) ∘ (4, 2 + i, 3i - 1)
  = (3 - i)4 + 2(2 - i) + (1 + i)(-3i - 1)
  = 18 - 10i.
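With Python complex numbers (conjugate() is built in), the triple computation above can be checked directly; cdot is my own name for the product:

```python
# Standard inner product on V_n(C): sum of z_k times the conjugate of w_k.
def cdot(Z, W):
    return sum(z * w.conjugate() for z, w in zip(Z, W))

Z = (3 - 1j, 2, 1 + 1j)
W = (4, 2 + 1j, 3j - 1)
print(cdot(Z, W))  # (18-10j), matching the computation in the text
```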
We will call this operation the standard inner product for ordered n-
tuples of complex numbers and denote the vector space V_n(C) together with
this inner product by E_n(C). But it should be quickly noted that this inner
product does not satisfy all the properties of the inner product of E_n.
Theorem 6.18 For all vectors Z, W, U in E_n(C):
1. Z ∘ W ∈ C.
2. Z ∘ W = conj(W ∘ Z).
3. Z ∘ (W + U) = Z ∘ W + Z ∘ U and (Z + W) ∘ U = Z ∘ U + W ∘ U.
4. (aZ) ∘ W = a(Z ∘ W) and Z ∘ (aW) = ā(Z ∘ W) for all a ∈ C.
5. Z ∘ Z is real and Z ∘ Z > 0 if Z ≠ 0.
Proof of (2) If Z = (z_1, ..., z_n) and W = (w_1, ..., w_n), then

Z ∘ W = Σ_{i=1}^n z_iw̄_i = Σ_{i=1}^n conj(w_iz̄_i) = conj( Σ_{i=1}^n w_iz̄_i ) = conj(W ∘ Z).

Can you fill in a reason for each step?
Notice that the dot product in E_n satisfies all the preceding properties,
since the conjugate of a real number is the number itself. The dot product
of E_n was said to be symmetric and bilinear. The corresponding properties
for the inner product of E_n(C) are numbered 2, 3, and 4 in Theorem 6.18, and
the inner product of E_n(C) is said to be hermitian. Theorem 6.18 provides the
essential properties which should be required of any inner product on a
complex vector space.
Definition Let V be a complex vector space and ⟨ , ⟩ a function such
that ⟨Z, W⟩ ∈ C for all Z, W ∈ V, which satisfies
1. ⟨Z, W⟩ = conj(⟨W, Z⟩).
2. ⟨Z, aW + bU⟩ = ā⟨Z, W⟩ + b̄⟨Z, U⟩, for all a, b ∈ C, and Z, W,
U ∈ V.
3. ⟨Z, Z⟩ > 0 if Z ≠ 0.
Then ⟨ , ⟩ is a complex inner product on V, and (V, ⟨ , ⟩) is a complex
inner product space. ⟨ , ⟩ may also be called a positive definite hermitian
product on V, and (V, ⟨ , ⟩) may be called a unitary space.
The second condition in this definition shows that a complex inner
product is not linear in both variables. However, you might verify that the
first and second conditions imply

⟨aZ + bW, U⟩ = a⟨Z, U⟩ + b⟨W, U⟩

for all a, b ∈ C, and Z, W, U ∈ V.
Complex inner product spaces can be constructed from complex vector
spaces in the same ways that real inner product spaces were obtained from
real vector spaces.
Example 1 The set of all continuous complex-valued functions
on [0, 1] might be denoted by ℱ(C). This is a complex vector space
with the operations defined as in ℱ. Define ⟨ , ⟩ on ℱ(C) by

⟨f, g⟩ = ∫₀¹ f(x)g̅(x) dx,

so that

⟨ix², 3 + ix⟩ = ∫₀¹ ix²(3 - ix) dx = ∫₀¹ (3ix² + x³) dx = i + 1/4.

It can be shown that the function ⟨ , ⟩ is a complex inner product for the
space of functions ℱ(C).
The definitions of terms such as norm, orthogonal, and orthonormal
basis are all the same for either a real or a complex inner product space.
Therefore the Gram-Schmidt process can be applied to linearly independent
subsets of complex inner product spaces.
Example 2 Given

𝒮 = ℒ{(1 + i, 3i, 2 - i), (2 - 3i, 10 + 2i, 5 - i)},

a subspace of E_3(C), find an orthonormal basis for 𝒮.
We have ||(1 + i, 3i, 2 - i)|| = 4, so ¼(1 + i, 3i, 2 - i) is a unit vector.
The orthogonal projection of (2 - 3i, 10 + 2i, 5 - i) on this vector is

[(2 - 3i, 10 + 2i, 5 - i) ∘ ¼(1 + i, 3i, 2 - i)] ¼(1 + i, 3i, 2 - i)
  = [1 - 2i](1 + i, 3i, 2 - i) = (3 - i, 6 + 3i, -5i).

Subtracting this projection from (2 - 3i, 10 + 2i, 5 - i) yields (-1 - 2i,
4 - i, 5 + 4i), which has norm √63. Therefore an orthonormal basis for
𝒮 is

{¼(1 + i, 3i, 2 - i), (1/√63)(-1 - 2i, 4 - i, 5 + 4i)}.
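Example 2 can be replayed step by step; this sketch (the helper names cdot and norm are mine) runs the same projection and confirms the resulting pair is orthonormal:

```python
import math

# Hermitian dot product and norm on complex n-tuples.
def cdot(Z, W):
    return sum(z * w.conjugate() for z, w in zip(Z, W))

def norm(Z):
    return math.sqrt(cdot(Z, Z).real)

U = (1 + 1j, 3j, 2 - 1j)
V = (2 - 3j, 10 + 2j, 5 - 1j)

u1 = tuple(z / norm(U) for z in U)               # first unit vector; norm(U) = 4
proj = cdot(V, u1)                               # projection coefficient
W = tuple(v - proj * z for v, z in zip(V, u1))   # subtract the projection
u2 = tuple(z / norm(W) for z in W)               # norm(W) = sqrt(63)

# Orthonormal: unit lengths and orthogonality (up to rounding).
print(abs(norm(u1) - 1) < 1e-12, abs(norm(u2) - 1) < 1e-12, abs(cdot(u1, u2)) < 1e-12)
```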
The primary reason for introducing complex inner product spaces here
is to show that matrices of a certain type are always diagonalizable. The next
step is to consider the following class of linear transformations.
Definition Let (V, ⟨ , ⟩) be a complex inner product space and
T: V → V a linear map. T is hermitian if ⟨T(U), V⟩ = ⟨U, T(V)⟩ for all
vectors U, V ∈ V.
The identity map and the zero map are obviously hermitian maps, and
we will soon see that there are many other examples. But first it should be
shown why hermitian transformations are of interest.
Theorem 6.19 The characteristic values of a hermitian map are real.

Proof Suppose T is a hermitian transformation defined on the
complex inner product space (V, ⟨ , ⟩). Then a characteristic value z for
T might well be complex. However, if W is a characteristic vector for z,
then

z||W||² = z⟨W, W⟩ = ⟨zW, W⟩ = ⟨T(W), W⟩ = ⟨W, T(W)⟩
        = ⟨W, zW⟩ = z̄⟨W, W⟩ = z̄||W||².

Since W is a nonzero vector, z = z̄, hence the characteristic value z is real.
From this theorem we see that if T: E_n(C) → E_n(C) is hermitian, then a matrix
of T with respect to any basis has only real characteristic roots. This implies
that there is a collection of complex matrices that have only real
characteristic roots. To determine the nature of these matrices, suppose A = (a_ij)
is the matrix of T with respect to the standard basis {E_i}. Then T(E_j)
= (a_1j, ..., a_nj), and since T is hermitian,

a_ji = (a_1i, ..., a_ni) ∘ E_j = T(E_i) ∘ E_j = E_i ∘ T(E_j) = E_i ∘ (a_1j, ..., a_nj) = ā_ij.

Therefore, if we set Ā = (ā_ij), the matrix A satisfies Āᵀ = A.
Definition An n × n matrix A with entries from C is hermitian if
Āᵀ = A.

Thus

(   2     3 - 2i )
( 3 + 2i    5    )

is an example of a hermitian matrix.
Theorem 6.20 Suppose T ∈ Hom(E_n(C)) has matrix A with respect to the
standard basis. T is hermitian if and only if A is hermitian.

Proof (⇒) This direction has been obtained above.
(⇐) Given that A = (a_ij) is hermitian, it is necessary to show that
T(U) ∘ V = U ∘ T(V) for all U, V ∈ E_n(C).
Let U = (z_1, ..., z_n) and V = (w_1, ..., w_n); then

T(U) ∘ V = Σ_{i=1}^n ( Σ_{j=1}^n a_ij z_j ) w̄_i = Σ_{i,j} a_ij z_j w̄_i,

while

U ∘ T(V) = Σ_{j=1}^n z_j conj( Σ_{i=1}^n a_ji w_i ) = Σ_{i,j} ā_ji z_j w̄_i.

Since A is hermitian, ā_ji = a_ij, so the two expressions are equal.
Thus T is hermitian and the proof is complete.
Therefore, every hermitian matrix gives rise to a hermitian transformation
on E_n(C). Since a hermitian transformation has only real characteristic roots,
we have:

Theorem 6.21 The characteristic values or characteristic roots of a
hermitian matrix are real.
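Theorem 6.21 can be illustrated on the 2 × 2 hermitian matrix displayed above: for a 2 × 2 matrix the characteristic equation is λ² - (tr A)λ + det A = 0, and a pure-Python sketch shows its discriminant is positive, so both roots are real:

```python
import math

# A = [[2, 3-2i], [3+2i, 5]] is hermitian: its conjugate transpose equals A.
a11, a12 = 2, 3 - 2j
a21, a22 = 3 + 2j, 5

trace = (a11 + a22).real            # diagonal entries of a hermitian matrix are real
det = (a11 * a22 - a12 * a21).real  # the determinant is real as well

# Characteristic equation: lambda^2 - trace*lambda + det = 0.
disc = trace ** 2 - 4 * det
print(disc > 0)                     # True: both characteristic roots are real
roots = ((trace + math.sqrt(disc)) / 2, (trace - math.sqrt(disc)) / 2)
print(roots)
```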
A matrix with real entries is hermitian if it equals its transpose, because
conjugation does not change a real number. An n × n matrix A is said to be
symmetric if Aᵀ = A. For a real symmetric matrix, Theorem 6.21 becomes:

Theorem 6.22 The characteristic equation of a real symmetric matrix
has only real roots.
This means that the characteristic roots of matrices such as

( 2  1  3 )
( 1  4  7 )
( 3  7  0 )

are all real. But a symmetric matrix with complex entries, such as one with
entries 1 + i and 4 + i, need not have real characteristic roots.
The important fact is that not only are the characteristic roots of a real
symmetric matrix real, but such a matrix is always diagonalizable. Again,
the result is obtained first in the complex case.
Theorem 6.23 If T: E_n(C) → E_n(C) is hermitian, then T is diagonalizable.

Proof T is diagonalizable if it has n linearly independent
characteristic vectors in E_n(C). Let λ_1, ..., λ_n be the n real characteristic values for
T. Suppose V_1 is a characteristic vector corresponding to λ_1. If n = 1, the
proof is complete; otherwise continue by induction.
Suppose V_1, ..., V_k, k < n, have been obtained, with V_i a characteristic
vector corresponding to λ_i, 1 ≤ i ≤ k, and V_1, ..., V_k linearly independent.
The proof consists in showing that there is a characteristic vector for λ_{k+1}
in the orthogonal complement of ℒ{V_1, ..., V_k}. Therefore, suppose
𝒮_{k+1} = ℒ{V_1, ..., V_k}⊥ and let T_{k+1} be the restriction of T to 𝒮_{k+1}. That
is, T_{k+1}(U) = T(U) for each U ∈ 𝒮_{k+1}. This map sends 𝒮_{k+1} into itself, for
if U ∈ 𝒮_{k+1} and i ≤ k, then

T_{k+1}(U) ∘ V_i = T(U) ∘ V_i = U ∘ T(V_i) = U ∘ λ_iV_i = λ_i(U ∘ V_i) = 0,

since λ_i is real. That is, T_{k+1}(U) is in the orthogonal complement of ℒ{V_1, ..., V_k}, and
T_{k+1}: 𝒮_{k+1} → 𝒮_{k+1}. As a map from 𝒮_{k+1} to itself, T_{k+1} has n - k
characteristic values, which must be λ_{k+1}, ..., λ_n. Now if V_{k+1} is a characteristic
vector for T_{k+1} corresponding to λ_{k+1}, V_{k+1} is a characteristic vector for
T and V_{k+1} ∈ ℒ{V_1, ..., V_k}⊥. Therefore V_1, ..., V_k, V_{k+1} are k + 1
linearly independent characteristic vectors for T, and the proof is complete
by induction.
Corollary 6.24 Every hermitian matrix is diagonalizable.

This corollary refers to diagonalizability over the field of complex
numbers. However, it implies that real symmetric matrices are diagonalizable
over the field of real numbers.
Theorem 6.25 If A is an n × n real symmetric matrix, then there exists
a real orthogonal matrix P such that P⁻¹AP is a real diagonal matrix.

Proof Since A is real and symmetric, A is hermitian and hence
diagonalizable over the field of complex numbers. That is, there is a matrix
P with entries from C such that P⁻¹AP is diag(λ_1, ..., λ_n), where λ_1, ..., λ_n
are the (real) characteristic roots of A. But the jth column of P is a solution of
the system of linear equations AX = λ_jX. Since λ_j and the entries of A are
real, the entries of P may be taken to be real. Therefore, A is diagonalizable over
the field of real numbers. From Theorem 6.23 we know that the columns of
P are orthogonal in E_n; therefore it is only necessary to use unit characteristic
vectors in the columns of P to obtain an orthogonal matrix which diagonalizes
A.
So far there has been little indication that symmetric matrices are of
particular importance. Symmetric matrices are associated with inner
products, as suggested in problem 9, page 247. But because of Theorem 6.25
it is often advantageous to introduce a symmetric matrix whenever possible.
Example 3 Find a rotation of the plane that transforms the
polynomial x² + 6xy + y² into a polynomial without an xy or cross product
term.
First notice that x² + 6xy + y² = XᵀAX, where

X = (x)   and   A = ( 1  3 )
    (y)             ( 3  1 ).

The characteristic values for the symmetric matrix A are 4 and -2, and
(1/√2, 1/√2) and (-1/√2, 1/√2) are corresponding unit characteristic
vectors. Constructing P from these vectors yields the orthogonal matrix

P = ( 1/√2  -1/√2 )
    ( 1/√2   1/√2 ),

and P⁻¹AP = diag(4, -2). Since det P = 1, P is the matrix of a rotation in the
plane, given by X = PX' with

X' = (x')
     (y').

Now if the given polynomial is written in terms of the new variables x' and
y', we obtain

x² + 6xy + y² = XᵀAX = (PX')ᵀA(PX') = X'ᵀPᵀAPX' = X'ᵀP⁻¹APX' = 4x'² - 2y'².

The polynomial in x' and y' does not have an x'y' term, as desired. If the
graph of the equation x² + 6xy + y² = k is "equivalent" to the graph
of 4x'² - 2y'² = k for a constant k, then it would clearly be much easier to
find the second graph. It may be noted that no rotation is explicitly given
here; this situation is investigated in problem 10 at the end of this section.
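The diagonalization in Example 3 can be verified numerically; this pure-Python sketch (helper names are mine) computes PᵀAP and checks that it equals diag(4, -2):

```python
import math

# A is the symmetric matrix of x^2 + 6xy + y^2; P holds unit characteristic vectors.
A = [[1, 3], [3, 1]]
r = 1 / math.sqrt(2)
P = [[r, -r], [r, r]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Pt = [[P[j][i] for j in range(2)] for i in range(2)]   # transpose of P
D = matmul(matmul(Pt, A), P)                           # P^T A P = P^{-1} A P, P orthogonal

print([[round(x, 10) + 0.0 for x in row] for row in D])  # [[4.0, 0.0], [0.0, -2.0]]
```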
It should be pointed out that the major theorems of this section could
have been stated for an arbitrary finite-dimensional complex inner product
space instead of the n-tuple space E_n(C). However, since each such space is
isomorphic to one of the spaces E_n(C), only notational changes would be
involved in the statements of such theorems.
Problems
1. Compute:
a. (2, 1 + i, 3i) ∘ (2 - i, 4i, 2 - 3i) in E_3(C).
b. (1 - i, 3, i + 2, 3i) ∘ (5, 2i - 4, 0, i) in E_4(C).
2. Find the orthogonal complement of 𝒫 in E_3(C) when
a. 𝒫 = ℒ{(i, 2, 1 + i)}.
b. 𝒫 = ℒ{(2i, i, 4), (1 + i, 0, 1 - i)}.
3. Show that if Z, W, U ∈ E_n(C) and a, b ∈ C, then Z ∘ (aW + bU) = ā(Z ∘ W) +
b̄(Z ∘ U).
4. a. Define a complex inner product ⟨ , ⟩ on V_2(C) such that B = {(3 + i, 2i),
(1, i - 2)} is an orthonormal basis for (V_2(C), ⟨ , ⟩).
b. What is the norm of (4i, 2) in this complex inner product space?
5. Which of the following matrices are hermitian and/or symmetric?

a. (   2    1 - i )    b. ( 1  2 )    c. (   i    2 + i )
   ( 1 + i    4i  )       ( 2  3 )       ( 2 + i     3  )

d. …    e. …    f. …
6. Suppose A is hermitian. a. Why are the diagonal entries of A real? b. Show
that the determinant of A is real.
7. a. Show that the matrix A = … cannot be diagonalized over the
field of real numbers.
b. Find a complex matrix P such that P⁻¹AP is diagonal.
8. Suppose T: E_2(C) → E_2(C) is given by T(z, w) = (2z + (1 + i)w, (1 - i)z + 3w).
a. Find the matrix A of T with respect to the standard basis, and conclude
that T is hermitian.
b. Find the characteristic values of A and a matrix P such that P⁻¹AP is
diagonal.
9. Do there exist matrices that cannot be diagonalized over the field of complex
numbers?
10. In Example 3, page 283, it is stated that P is the matrix of a rotation, yet since
no basis is mentioned, no map is defined.
a. Define a rotation T: E_2 → E_2 such that P is the matrix of T with respect to
the standard basis for E_2.
b. Let B = {T(1, 0), T(0, 1)}. Show that for each W ∈ E_2, if W: (x, y)_{E_i}
and (x, y)ᵀ = P(x', y')ᵀ, then W: (x', y')_B.
c. Sketch the coordinate systems associated with the bases {E_i} and B in the
plane, together with the curve that has the equation x² + 6xy + y² = 36
in coordinates with respect to {E_i} and the equation 4x'² - 2y'² = 36 in
coordinates with respect to B.
11. Let G be the subset of E_2 defined by

G = {V ∈ E_2 | V: (x, y)_{E_i} and 5x² + 4xy + 2y² = 24}.

a. Find an orthonormal basis B for E_2 such that the polynomial equation for
G in coordinates with respect to B does not have a cross product
term.
b. Sketch the coordinate systems representing {E_i} and B in the plane along
with the points representing the vectors in G.
12. A linear map T defined on a real inner product space (V, ⟨ , ⟩) is symmetric if
⟨T(U), V⟩ = ⟨U, T(V)⟩ for all U, V ∈ V.
a. Show that the matrix of T with respect to any orthonormal basis for V is
symmetric if T is symmetric.
b. Suppose T: V → V is orthogonal, and prove that T is symmetric if and
only if T² = T ∘ T = I.
c. What are the possible symmetric, orthogonal transformations from E_3
into itself?
Review Problems
1. For which values of r is ⟨(a, b), (c, d)⟩ = ac + bc + ad + rbd an inner product
on V_2?
2. Use the Gram-Schmidt process to construct an orthonormal basis for
(V_2, ⟨ , ⟩) from the standard basis when:
a. ⟨(a, b), (c, d)⟩ = 2ac - 3ad - 3bc + 5bd.
b. ⟨(a, b), (c, d)⟩ = (a  b)(…)(c  d)ᵀ.
3. Prove that ⟨U + V, U - V⟩ = 0 if and only if ||U|| = ||V||. What is the
geometric interpretation of this statement?
4. Show that ⟨A, B⟩ = tr(ABᵀ) defines an inner product on the space of all real
n × n matrices.
5. Prove the parallelogram law:
||U + V||² + ||U - V||² = 2||U||² + 2||V||².
6. Suppose (V, ⟨ , ⟩) is a real finite-dimensional inner product space. An element
of Hom(V, R) is called a linear functional on V. Show that for every linear
functional T, there exists a vector U ∈ V such that T(V) = ⟨U, V⟩ for all
V ∈ V.
7. Suppose 𝒮 and 𝒯 are subspaces of an inner product space E. Prove the
following statements:
a. 𝒮 ⊂ (𝒮⊥)⊥, and 𝒮 = (𝒮⊥)⊥ if E is finite dimensional.
b. (𝒮 + 𝒯)⊥ = 𝒮⊥ ∩ 𝒯⊥.
c. (𝒮 ∩ 𝒯)⊥ ⊃ 𝒮⊥ + 𝒯⊥, and (𝒮 ∩ 𝒯)⊥ = 𝒮⊥ + 𝒯⊥ if E is finite
dimensional.
8. If T ∈ Hom(V) and there exists T* ∈ Hom(V) such that ⟨V, T*(U)⟩ =
⟨T(V), U⟩ for all U, V ∈ V, then T* is called the adjoint of T.
a. Find T* if T ∈ Hom(E_2) is given by T(a, b) = (2a + 3b, 5a - b).
b. Suppose V is finite dimensional with orthonormal basis B. If A is the
matrix of T with respect to B, determine how the matrix of T* with respect
to B is related to A when V is a real inner product space; when V is a
complex inner product space.
c. Show that (T*)* = T if T ∈ Hom(E_n).
9. Partition n × n matrices into blocks

( A  B )
( C  D ),

where r + s = n, A is r × r, B is r × s, C is s × r, and D is s × s.
a. Write the product

( A_1  B_1 )( A_2  B_2 )
( C_1  D_1 )( C_2  D_2 )

in block form.
b. Suppose the partitioned matrix is symmetric and A is nonsingular. Find
matrices E and F such that

( A  B )   ( I  0 )( A  0 )( I  Eᵀ )
( C  D ) = ( E  I )( 0  F )( 0  I  ).

c. Use part b to obtain a formula for the determinant of a symmetric matrix
partitioned into such blocks.
10. Use the formula obtained in problem 9 to find the determinant of the following
partitioned symmetric matrices.
a. …    b. …    c. …
Second Degree Curves
and Surfaces

1. Quadratic Forms
2. Congruence
3. Rigid Motions
4. Conics
5. Quadric Surfaces
6. Chords, Rulings, and Centers
Our geometric illustrations have been confined primarily to points,
lines, planes, and hyperplanes, which are the graphs of linear equations or
systems of linear equations. Most other types of curves and surfaces are the
graphs of nonlinear equations and therefore lie outside the study of vector
spaces. However, linear techniques can be used to examine second degree
equations and their graphs. Such a study will not only provide
a good geometric application of linear algebra, but it also leads naturally to
several new concepts.
All vector spaces considered in this chapter will be over the real number
field and finite dimensional.
1. Quadratic Forms
Consider the most general second degree polynomial equation in the
coordinates x_1, ..., x_n of points in Euclidean n-space. Such an equation might
be written in the form

Σ_{i,j=1}^n a_ij x_i x_j + Σ_{i=1}^n b_i x_i + c = 0,  where a_ij, b_i, c ∈ R.

For example, in the equation 2x_1² - x_1x_2 + 9x_2x_1 - 2x_2 + 7 = 0, n = 2,
a_11 = 2, a_12 = -1, a_21 = 9, a_22 = b_1 = 0, b_2 = -2, and c = 7. The
problem is to determine the nature of the graph of the points
whose coordinates satisfy such an equation. We already know that if x and y
are the coordinates of points in the Cartesian plane, then the graphs of
x² + y² - 4 = 0 and x² + 4x + y² = 0 are circles of radius 2. But what
might the graph of 4yz + z² + x - 3z + 2 = 0 look like in 3-space? Before
considering such questions we will examine the second degree terms of the
general equation. The first step is to write these terms, Σ_{i,j=1}^n a_ij x_i x_j, in a
form using familiar notation. Notice that the polynomial Σ_{i,j=1}^n a_ij x_i x_j
can be rewritten as

x_1(a_11x_1 + ··· + a_1nx_n) + ··· + x_n(a_n1x_1 + ··· + a_nnx_n),

suggesting a product of matrices. In fact, if we let X = (x_1, ..., x_n)ᵀ and
A = (a_ij), then

Σ_{i,j=1}^n a_ij x_i x_j = XᵀAX.
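The identity Σ a_ij x_i x_j = XᵀAX is easy to test numerically; a small sketch using the matrix of the example that follows (2x_1² - x_1x_2 + 9x_2x_1) and a sample coordinate vector of my choosing:

```python
# Compare the double sum with the matrix product X^T A X for a sample A and X.
A = [[2, -1], [9, 0]]     # coefficients a_ij
X = [5, 7]                # sample coordinates x_1, x_2

double_sum = sum(A[i][j] * X[i] * X[j] for i in range(2) for j in range(2))

AX = [sum(A[i][j] * X[j] for j in range(2)) for i in range(2)]
xtAx = sum(X[i] * AX[i] for i in range(2))

print(double_sum == xtAx, double_sum)  # True 330
```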
For example, the second degree polynomial 2x_1² - x_1x_2 + 9x_2x_1 can be
written as

(x_1  x_2)( 2  -1 )(x_1)
          ( 9   0 )(x_2).
Although XᵀAX is a 1 × 1 matrix, the matrix notation is dropped here for
simplicity. There should be no question about the meaning of the equation
Σ_{i,j=1}^n a_ij x_i x_j = XᵀAX. This equation states that XᵀAX is a homogeneous
second degree polynomial in x_1, ..., x_n, and the coefficient of x_ix_j
is the element from the ith row and the jth column of A. (In general, a
polynomial is said to be homogeneous of degree k if every term in the polynomial
is of degree k.)
Definition A quadratic form on an n-dimensional vector space V, with
basis B, is an expression of the form Σ_{i,j=1}^n a_ij x_i x_j where x_1, ..., x_n are
the coordinates of an arbitrary vector with respect to B. If X = (x_1, ..., x_n)ᵀ
and A = (a_ij), then

Σ_{i,j=1}^n a_ij x_i x_j = XᵀAX

and A is the matrix of the quadratic form with respect to B.
Thus a quadratic form is a homogeneous second degree polynomial in
coordinates; it is not simply a polynomial. In fact, any one homogeneous
second degree polynomial usually yields different quadratic forms when
associated with different bases.
Example 1 Consider the polynomial 2x_1² - x_1x_2 + 9x_2x_1. If x_1
and x_2 are coordinates with respect to the basis B = {(2, 1), (-1, 0)}, then
this polynomial determines a quadratic form on V_2. This quadratic form
assigns a real number to each vector in V_2. The value of the quadratic form
at (2, 1) is 2, since (2, 1): (1, 0)_B gives x_1 = 1 and x_2 = 0. And since (-1, 1):
(1, 3)_B, the value of the quadratic form at (-1, 1) is 2(1)² - 1(3) + 9(3)1 = 26.
For an arbitrary vector (a, b), x_1 = b and x_2 = 2b - a (Why?). Therefore
the value of the quadratic form 2x_1² - x_1x_2 + 9x_2x_1 at (a, b) is 18b² - 8ab.
This gives the value of the quadratic form at (-1, 1) to be 18 + 8 or 26, as
before.
Since the coordinates of (a, b) with respect to the standard basis are a
and b, the polynomial 18b² - 8ab is also a quadratic form on V_2.
2x_1² - x_1x_2 + 9x_2x_1 and 18b² - 8ab are different quadratic forms in that
they are different polynomials in coordinates for different bases, but they
assign the same value to each vector in V_2. That is, they both define a
particular function from V_2 to the real numbers, called a quadratic function.

Definition A quadratic function q on a vector space V with basis B
is a function defined by

q(V) = Σ_{i,j=1}^n a_ij x_i x_j  for each V ∈ V,

where V: (x_1, ..., x_n)_B.

That is, a quadratic function is defined in terms of a quadratic form.
The two quadratic forms above determine a quadratic function q on V_2
for which q(2, 1) = 2, q(-1, 1) = 26, and in general q(a, b) = 18b² - 8ab,
so that q: V_2 → R.
The relation between a quadratic function (r = q(V)) and a quadratic
form (r = XᵀAX) is much the same as the relation between a linear
transformation (W = T(V)) and its matrix representation (Y = AX). But there
are an infinite number of possible matrix representations for any quadratic
function, in contrast to the linear case where each basis determines a unique
matrix for a map. For example, the above quadratic function is given by the
quadratic forms 18b² - 8ab, 18b² - 4ab - 4ba, and 18b² + 12ab - 20ba.
Therefore

q(a, b) = (a  b)( 0  -8 )(a)   = (a  b)(  0  -4 )(a)   = (a  b)(   0  12 )(a)
                ( 0  18 )(b)           ( -4  18 )(b)           ( -20  18 )(b).
However, out of all the possible matrix representations for a quadratic
function there is a best choice, for it is always possible to use a symmetric
matrix. This should be clear, for if in Σ_{i,j=1}^n a_ij x_i x_j, a_hk ≠ a_kh for some h
and k, then the terms a_hk x_h x_k + a_kh x_k x_h can be replaced by

½(a_hk + a_kh)x_h x_k + ½(a_hk + a_kh)x_k x_h,

which have equal coefficients. For example, the polynomial 8x² + 7xy + 3yx
+ y² can be written as

(x  y)( 8  7 )(x)
      ( 3  1 )(y)
using a nonsymmetric matrix, but the polynomial 8x² + 5xy + 5yx + y²
yields the same values for given values of x and y and can be expressed as

(x  y)( 8  5 )(x)
      ( 5  1 )(y)

using a symmetric matrix. Choosing a symmetric matrix to represent a
quadratic function will make it possible to find a simple representation by a change
of basis, for every symmetric matrix is diagonalizable.
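Replacing A by its symmetric part never changes the values of the form; a sketch with the 8x² + 7xy + 3yx + y² example (sample points are mine):

```python
# Symmetrize A by averaging a_hk and a_kh; the values of X^T A X are unchanged.
A = [[8, 7], [3, 1]]
S = [[(A[i][j] + A[j][i]) / 2 for j in range(2)] for i in range(2)]
print(S)  # [[8.0, 5.0], [5.0, 1.0]]

def form(M, x, y):
    return M[0][0]*x*x + M[0][1]*x*y + M[1][0]*y*x + M[1][1]*y*y

for (x, y) in [(1, 2), (-3, 5), (4, 0)]:
    assert form(A, x, y) == form(S, x, y)
```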
Definition The matrix of a quadratic function q with respect to a basis B
is the symmetric matrix A = (a_ij) for which

q(V) = Σ_{i,j=1}^n a_ij x_i x_j

with V: (x_1, ..., x_n)_B.

The quadratic form Σ_{i,j=1}^n a_ij x_i x_j is symmetric if a_ij = a_ji for all i and j.
Example 2 Recall the quadratic form considered in Example 1, given
by 2x_1² - x_1x_2 + 9x_2x_1 in terms of the basis B. The quadratic function q
on V_2 determined by this quadratic form is defined by

q(V) = (x_1  x_2)( 2  -1 )(x_1)
                 ( 9   0 )(x_2),  when V: (x_1, x_2)_B.

However, this matrix is not the matrix of q with respect to B because it is not
symmetric. The matrix of q with respect to B is

( 2  4 )
( 4  0 ),

since it is symmetric, and when V: (x_1, x_2)_B,

q(V) = (x_1  x_2)( 2  4 )(x_1)
                 ( 4  0 )(x_2).
We have defined quadratic functions in terms of coordinates, so the
definition of each quadratic function is in part a reflection of the particular
basis used. But a quadratic function is a function of vectors, and as such it
should be possible to obtain a description without reference to a basis. An
analogous situation would be to have defined linear transformations in
terms of matrix multiplication rather than requiring them to be maps that
preserve the algebraic structure of a vector space. Our coordinate
approach to quadratic functions followed naturally from an examination
of second degree polynomial equations, but there should also be a
coordinate-free characterization. We can obtain such a characterization by
referring to bilinear functions. It might be noted that several inner
products considered in Chapter 6 appear similar to quadratic forms. For example,
the inner product in Example 6 on page 245 is expressed as a polynomial
in the coordinates of vectors with respect to the standard basis.
Definition  A bilinear form on an n-dimensional vector space 𝒱, with basis B, is an expression of the form Σ_{i,j=1}^n a_ij x_i y_j, where x₁, ..., x_n and y₁, ..., y_n are the coordinates of two arbitrary vectors with respect to B. A bilinear form may be written as XᵀAY with X = (x₁, ..., x_n)ᵀ, Y = (y₁, ..., y_n)ᵀ, and A = (a_ij). A is the matrix of the bilinear form with respect to B, and the bilinear form is symmetric if A is symmetric.
Thus a bilinear form is a homogeneous second degree polynomial in the coordinates of two vectors. The bilinear form suggested by Example 6, page 245, is given by 3x₁y₁ − x₁y₂ − x₂y₁ + 2x₂y₂, where x₁, x₂ and y₁, y₂ are the coordinates of two vectors from 𝒱₂ with respect to the standard basis. This bilinear form yields a number for each pair of vectors in 𝒱₂ and was used to obtain an inner product on 𝒱₂. However, a bilinear form need not determine an inner product. Consider the bilinear form 4x₁y₂ − 3x₂y₂ defined on 𝒱₂ in terms of coordinates with respect to {E_i}. This bilinear form defines a bilinear function on 𝒱₂ with values in R, but it is neither symmetric nor positive definite.
The name "bilinear form" is justified by the fact that a bilinear form is simply a coordinate representation of a real-valued, bilinear function. That is, a function b for which:

1. b(U, V) ∈ R for all U, V ∈ 𝒱.
2. For each W ∈ 𝒱, the map T given by T(U) = b(U, W), for all U ∈ 𝒱, is linear.
3. For each W ∈ 𝒱, the map S given by S(U) = b(W, U), for all U ∈ 𝒱, is linear.

It is not difficult to obtain this relationship between bilinear maps and bilinear forms. It is only necessary to show that if b is a bilinear function on 𝒱, and B = {U₁, ..., U_n} is a basis for 𝒱, then

   b(U, V) = Σ_{i,j=1}^n a_ij x_i y_j

where U: (x₁, ..., x_n)_B, V: (y₁, ..., y_n)_B, and a_ij = b(U_i, U_j).
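This recipe for the matrix of a bilinear function can be illustrated with the form from Example 6; the code below (a sketch, not the book's) builds A from a_ij = b(U_i, U_j) and checks b(U, V) = XᵀAY on the standard basis:

```python
# A sketch (not the book's code): recovering the matrix of a bilinear function
# from a basis via a_ij = b(U_i, U_j), then checking b(U, V) = X^T A Y.

def b(u, v):
    # the bilinear function suggested by Example 6:
    # 3*x1*y1 - x1*y2 - x2*y1 + 2*x2*y2
    return 3*u[0]*v[0] - u[0]*v[1] - u[1]*v[0] + 2*u[1]*v[1]

basis = [(1, 0), (0, 1)]                       # the standard basis
A = [[b(Ui, Uj) for Uj in basis] for Ui in basis]
assert A == [[3, -1], [-1, 2]]                 # the matrix of the form

def form(A, x, y):
    """Evaluate the bilinear form sum_{i,j} a_ij * x_i * y_j."""
    n = len(x)
    return sum(A[i][j] * x[i] * y[j] for i in range(n) for j in range(n))

U, V = (2, -1), (4, 3)
assert b(U, V) == form(A, U, V)                # both give the same number
```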
Example 3  Construct a function b on 𝒱₂ as follows: Suppose T is the linear map given by T(u, v) = (2u − v, 5v − 3u) and set b(U, V) = U ∘ T(V) for all U, V ∈ 𝒱₂. The linearity of T and the bilinearity of the dot product guarantee that b is bilinear. Therefore the function b gives rise to a bilinear form for each basis for 𝒱₂.

   (x′  y′  1)(P̂ᵀÂP̂)(x′  y′  1)ᵀ = 2x′² − 5y′² + 31/8.
This is the polynomial obtained in Example 2.
The composition of the rotation X = PX′ followed by the translation (x″, y″) = (x′ + h′, y′ + k′) is given in matrix form by a single nonsingular 3 × 3 matrix P̂. If this coordinate expression is written as X̂ = P̂X̂″, then the polynomial X̂ᵀÂX̂ is transformed to X̂″ᵀ(P̂ᵀÂP̂)X̂″. That is, a rigid motion transforms Â to P̂ᵀÂP̂. Since P̂ is a nonsingular matrix, the rank of Â is an invariant of the polynomial under rigid motions of the plane. This invariant together with the discriminant and the property of having a graph (an equation has a graph if the coordinates for some point satisfy the equation) distinguish the nine types of canonical forms for second degree polynomials in two variables. This is summarized in Table I.
Table I

   b² − 4ac   Rank   Graph   Canonical form        Type of graph
   > 0        3      Yes     x²/a² − y²/b² = 1     Hyperbola
   > 0        2      Yes     x²/a² − y²/b² = 0     Intersecting lines
   < 0        3      Yes     x²/a² + y²/b² = 1     (Real) ellipse
   < 0        3      No      x²/a² + y²/b² = −1    Imaginary ellipse
   < 0        2      Yes     x²/a² + y²/b² = 0     Point ellipse
   = 0        3      Yes     x² = 4py              Parabola
   = 0        2      Yes     x² = a²               Parallel lines
   = 0        2      No      x² = −a²              Imaginary lines
   = 0        1      Yes     x² = 0                Coincident lines
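Table I can be read off mechanically from the discriminant and the rank of the 3 × 3 symmetric matrix of the equation. Below is a sketch (mine; the rank routine and function names are not from the text) that computes these two invariants for ax² + bxy + cy² + dx + ey + f = 0:

```python
# A sketch, not the book's code: compute the two invariants used in Table I,
# the discriminant b^2 - 4ac and the rank of the 3x3 symmetric matrix,
# for the conic a*x^2 + b*xy + c*y^2 + d*x + e*y + f = 0.

def rank(M, tol=1e-9):
    """Rank of a matrix by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    m, n = len(M), len(M[0])
    r = 0
    for col in range(n):
        if r == m:
            break
        pivot = max(range(r, m), key=lambda i: abs(M[i][col]))
        if abs(M[pivot][col]) < tol:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, m):
            factor = M[i][col] / M[r][col]
            M[i] = [M[i][j] - factor * M[r][j] for j in range(n)]
        r += 1
    return r

def conic_invariants(a, b, c, d, e, f):
    Ahat = [[a, b / 2, d / 2],
            [b / 2, c, e / 2],
            [d / 2, e / 2, f]]
    return b * b - 4 * a * c, rank(Ahat)

disc, r = conic_invariants(0.25, 0, 1 / 9, 0, 0, -1)   # x^2/4 + y^2/9 = 1
assert disc < 0 and r == 3                             # the (real) ellipse row
disc, r = conic_invariants(1, 0, -1, 0, 0, 0)          # x^2 - y^2 = 0
assert disc > 0 and r == 2                             # intersecting lines row
```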
Problems

1. Find the matrices A, B, and C that are used to express the following polynomials in the forms XᵀAX + BX + C and X̂ᵀÂX̂.
   a. 3x² − 4xy + 7y + 8.   b. 8xy − 4x + 6y.

2. Find a rotation that transforms 14x² + 36xy − y² − 26 = 0 to x′² − y′²/2 = 1. Sketch both coordinate systems and the graph of the two equations.
3. Find a rigid motion that transforms the equation

   16x² − 24xy + 9y² − 200x − 100y − 500 = 0

   to the canonical form for a parabola. Sketch the graph and the coordinate systems corresponding to the various coordinates used.

4. Write the polynomial 5x² − 3y² − 10x − 12y − 7 in the form X̂ᵀÂX̂ and transform it to a polynomial without linear terms with a translation given by X̂ = P̂X̂′.
5. For each of the following equations, use Table I, without obtaining the canonical form, to determine the nature of its graph.
   a. 2x² − 6xy + 4y² + 2x + 4y − 24 = 0.
   b. 4x² + 12xy + 9y² + 3x − 4 = 0.
   c. 7x² − 5xy + y² + 8y = 0.
   d. x² + 4xy + 4y² + 2x + 4y + 5 = 0.
   e. x² + 3xy + y² + 2x = 0.
6. The equations XᵀAX + BX + C = 0 and r(XᵀAX + BX + C) = 0 have the same graph for every nonzero scalar r. Should the fact that det A ≠ det(rA) have been taken into consideration when the discriminant was called a geometric invariant?

7. The line in the plane through (a, b) with direction numbers (α, β) can be written as X̂ = tK̂ + Û, t ∈ R, with

   X̂ = (x, y, 1)ᵀ,   Û = (a, b, 1)ᵀ,   K̂ = (α, β, 0)ᵀ.

   a. Show that this line intersects the graph of X̂ᵀÂX̂ = 0 at the points for which the parameter t satisfies

      (K̂ᵀÂK̂)t² + (2K̂ᵀÂÛ)t + ÛᵀÂÛ = 0.

   b. Use the equation in part a to find the points at which the line through (−2, 3) with direction (1, −2) intersects the graph of x² + 4xy − 2y + 5 = 0.
8. Find the canonical form for the following and sketch the graph together with any coordinate systems used.
   a. 2x² + 4xy + 2y² = 36.   b. 28x² + 12xy + 12y² = 90.
5. Quadric Surfaces
The general second degree equation in three variables may be written in the form X̂ᵀÂX̂ = 0, where Â is the symmetric matrix

   Â = [ A   Bᵀ ]
       [ B   C  ]

and X̂ = (x, y, z, 1)ᵀ. If A = (a_ij), B = (b₁, b₂, b₃), and C = (c), then

   X̂ᵀÂX̂ = XᵀAX + 2BX + C,   with X = (x, y, z)ᵀ.

Notice that both A and Â are symmetric. If x, y, and z are viewed as Cartesian coordinates in three-space, then the graph of X̂ᵀÂX̂ = 0 is a quadric surface. In general, this is actually a surface, but as with the conic sections some equations either have no graph or their graphs degenerate to lines or points.

We know that for any equation X̂ᵀÂX̂ = 0 there exists a rigid motion which transforms it to an equation in which each variable that appears is either a perfect square or in the linear portion. Therefore it is not difficult to make a list of choices for canonical forms, as shown in Table II.
This list contains 17 types of equations, ranked first according to the rank of Â and then by the rank of A. (This relationship is shown later in Table III.) Each of the possible combinations of signatures for A, linear terms, and constant terms is included in Table II. (Recall that a constant term can be absorbed in a linear term using a translation.) Although it may appear that there is no form for an equation of the type ax² + by + cz = 0, it is possible to reduce the number of linear terms in this equation by using a rotation that fixes the x axis. To see this, write by + cz in the form

   √(b² + c²) [ (b/√(b² + c²)) y + (c/√(b² + c²)) z ].

Then the rotation given by

   x′ = x,   y′ = (by + cz)/√(b² + c²),   z′ = (−cy + bz)/√(b² + c²)
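The rotation fixing the x axis can be checked with sample values; this sketch (mine, not the book's) verifies that by + cz becomes √(b² + c²)·y′ and that lengths are preserved:

```python
# A numerical check (not from the text) that the rotation fixing the x axis
# turns by + cz into sqrt(b^2 + c^2) * y', leaving a single linear term.
import math

b, c = 3.0, 4.0
r = math.hypot(b, c)                  # sqrt(b^2 + c^2) = 5.0 here
for y, z in [(1.0, 2.0), (-3.5, 0.25), (7.0, -7.0)]:
    y_new = (b * y + c * z) / r       # y' coordinate after the rotation
    z_new = (-c * y + b * z) / r      # z' coordinate after the rotation
    # by + cz must equal r * y', and the rotation preserves lengths
    assert abs((b * y + c * z) - r * y_new) < 1e-12
    assert abs((y * y + z * z) - (y_new**2 + z_new**2)) < 1e-12
```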
Table II

   Type of equation                 Name for graph
   x²/a² + y²/b² + z²/c² = 1        Ellipsoid
   x²/a² + y²/b² + z²/c² = −1       No graph (imaginary ellipsoid)
   x²/a² + y²/b² − z²/c² = −1       Hyperboloid of 2 sheets
   x²/a² + y²/b² − z²/c² = 1        Hyperboloid of 1 sheet
   x²/a² + y²/b² = 2z               Elliptic paraboloid
   x²/a² − y²/b² = 2z               Hyperbolic paraboloid
   x²/a² + y²/b² + z²/c² = 0        Point (imaginary elliptic cone)
   x²/a² + y²/b² − z²/c² = 0        Elliptic cone
   x²/a² + y²/b² = 1                Elliptic cylinder
   x²/a² + y²/b² = −1               No graph (imaginary elliptic cylinder)
   x²/a² − y²/b² = 1                Hyperbolic cylinder
   x² = 4py                         Parabolic cylinder
   x²/a² + y²/b² = 0                Line (imaginary intersecting planes)
   x²/a² − y²/b² = 0                Intersecting planes
   x² = a²                          Parallel planes
   x² = −a²                         No graph (imaginary parallel planes)
   x² = 0                           Coincident planes
transforms ax² + by + cz = 0 to ax′² + √(b² + c²) y′ = 0. Thus the given equation can be put into the standard form for a parabolic cylinder.
Quadric surfaces are not as familiar as conics and at least one is rather complicated; therefore it may be useful to briefly consider the surfaces listed in Table II. Starting with the simplest, there should be no problem with those graphs composed of planes. For example, the two intersecting planes given by x²/a² − y²/b² = 0 are the planes with equations x/a + y/b = 0 and x/a − y/b = 0.

A cylinder is determined by a plane curve called a directrix and a direction. A line passing through the directrix with the given direction is called a generator, and the cylinder is the set of all points which lie on some generator. For the cylinders listed in Table II the generators are parallel to the z axis and the directrix may be taken as a conic section in the xy plane. These three types of cylinders are sketched in Figure 7, showing that the name of each surface is derived from the type of conic used as directrix.

The nature of the remaining quadric surfaces is most easily determined by examining how various planes intersect the surface. Given a surface and a plane, points which lie in both are subject to two constraints. Therefore a surface and a plane generally intersect in a curve. Such a curve is called a section of the surface, and a section made by a coordinate plane is called a trace. For example, the directrix is the trace in the xy plane for each of the three cylinders described above, and every section made by a plane parallel
Figure 7  The elliptic, parabolic (x² = 4py, p > 0), and hyperbolic cylinders, each with generators parallel to the z axis and directrix in the xy plane.
to the xy plane is congruent to the directrix. Further, if a plane parallel to the z axis intersects one of these cylinders, then the section is either one or two generators. The sections of a surface in planes parallel to the xy plane are often called level curves. If you are familiar with contour maps, you will recognize that the lines on a contour map are the level curves of the land surface given by parallel planes a fixed number of feet apart.

The traces of the ellipsoid x²/a² + y²/b² + z²/c² = 1 in the three coordinate planes are ellipses, and a reasonably good sketch of the surface is obtained by simply sketching the three traces, Figure 8. If planes parallel to a coordinate plane intersect the ellipsoid, then the sections are ellipses, which get smaller as the distance from the origin increases. For example, the level curve in the plane z = k is the ellipse x²/a² + y²/b² = 1 − k²/c², z = k.
The elliptic cone x²/a² + y²/b² − z²/c² = 0 is also easily sketched. The traces in the planes y = 0 and x = 0 are pairs of lines intersecting at the origin. And the level curve for z = k is the ellipse x²/a² + y²/b² = k²/c², z = k, as shown in Figure 9.
The hyperboloids given by x²/a² + y²/b² − z²/c² = ±1 are so named because the sections in planes parallel to the yz plane and the xz plane are hyperbolas. (This could also be said of the elliptic cone; see problems 5 and 8 at the end of this section.) In particular, the traces in x = 0 and y = 0 are the hyperbolas given by y²/b² − z²/c² = 1, x = 0 and x²/a² − z²/c² = 1, y = 0, respectively. The hyperboloid of 2 sheets, x²/a² + y²/b² − z²/c² = −1, has level curves in the plane z = k only if k² > c². Such a section is an ellipse with the equation x²/a² + y²/b² = −1 + k²/c², z = k. As k² increases, these level curves increase in size, so that the surface is in two parts opening away from each other, Figure 10(a). For the hyperboloid of 1 sheet with the equation x²/a² + y²/b² − z²/c² = 1, there is an elliptic section for every plane parallel to the xy plane. Therefore the surface is in one piece, as shown in Figure 10(b).
Sections of the paraboloids x²/a² + y²/b² = 2z and x²/a² − y²/b² = 2z in planes parallel to the yz and xz planes, i.e., x = k and y = k, are parabolas. Planes parallel to the xy plane intersect the elliptic paraboloid x²/a² + y²/b² = 2z in ellipses given by x²/a² + y²/b² = 2k, z = k, provided k ≥ 0. When k < 0 there is no section, and the surface is bowl-shaped, Figure 11(a). For the hyperbolic paraboloid with equation x²/a² − y²/b² = 2z, every plane parallel to the xy plane intersects the surface in a hyperbola. The trace in the xy plane is the degenerate hyperbola composed of two intersecting lines; while above the xy plane the hyperbolas open in the direction of the x axis, and below they open in the direction of the y axis. The hyperbolic paraboloid is called a saddle surface because the region near the origin has the shape of a saddle, Figure 11(b).
Figure 8  The ellipsoid, sketched from its traces in x = 0, y = 0, and z = 0.

Figure 9  The elliptic cone, with its traces in x = 0 and y = 0 and a level curve in z = k.

Figure 10  (a) The hyperboloid of 2 sheets, with sections in z = k; (b) the hyperboloid of 1 sheet, with its trace in z = 0.

Figure 11  (a) The elliptic paraboloid, with a section in z = k, k > 0; (b) the hyperbolic paraboloid.
Given the equation of a quadric surface, X̂ᵀÂX̂ = 0, one can determine the type of surface by obtaining values for at most five invariants. These are the rank of Â, the rank of A, the absolute value of the signature of A, the existence of a graph, and the sign of the determinant of Â. This is shown in Table III.
Table III

   Rank Â   Rank A   Abs(sA)   Graph   Det Â   Type of graph
   4        3        3         Yes     < 0     Ellipsoid
   4        3        3         No      > 0     Imaginary ellipsoid
   4        3        1         Yes     < 0     Hyperboloid of 2 sheets
   4        3        1         Yes     > 0     Hyperboloid of 1 sheet
   4        2        2         Yes     < 0     Elliptic paraboloid
   4        2        0         Yes     > 0     Hyperbolic paraboloid
   3        3        3         Yes             Point
   3        3        1         Yes             Elliptic cone
   3        2        2         Yes             Elliptic cylinder
   3        2        2         No              Imaginary elliptic cylinder
   3        2        0         Yes             Hyperbolic cylinder
   3        1        1         Yes             Parabolic cylinder
   2        2        2         Yes             Line
   2        2        0         Yes             Intersecting planes
   2        1        1         Yes             Parallel planes
   2        1        1         No              Imaginary parallel planes
   1        1        1         Yes             Coincident planes
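Several rows of Table III can be verified directly on canonical forms whose matrix Â is diagonal; the sketch below (mine, taking unit coefficients a = b = c = 1 as an assumption) computes rank Â, rank A, abs(sA), and the sign of det Â:

```python
# A sketch, not the book's code: check some rows of Table III on canonical
# forms whose matrix A-hat is diagonal, with unit coefficients a = b = c = 1.

def invariants(d):
    """(rank A-hat, rank A, abs signature of A) for A-hat = diag(d1, d2, d3, d4)."""
    A = d[:3]                       # the quadratic part is diag(d1, d2, d3)
    rank_Ahat = sum(1 for x in d if x != 0)
    rank_A = sum(1 for x in A if x != 0)
    sig = sum(1 for x in A if x > 0) - sum(1 for x in A if x < 0)
    return rank_Ahat, rank_A, abs(sig)

def det_sign(d):
    p = 1
    for x in d:
        p *= x
    return (p > 0) - (p < 0)

ellipsoid    = [1, 1, 1, -1]    # x^2 + y^2 + z^2 = 1
hyp_2_sheets = [1, 1, -1, 1]    # x^2 + y^2 - z^2 = -1
hyp_1_sheet  = [1, 1, -1, -1]   # x^2 + y^2 - z^2 = 1
cone         = [1, 1, -1, 0]    # x^2 + y^2 - z^2 = 0

assert invariants(ellipsoid) == (4, 3, 3) and det_sign(ellipsoid) == -1
assert invariants(hyp_2_sheets) == (4, 3, 1) and det_sign(hyp_2_sheets) == -1
assert invariants(hyp_1_sheet) == (4, 3, 1) and det_sign(hyp_1_sheet) == 1
assert invariants(cone) == (3, 3, 1)
```

Note that the two hyperboloids agree in the first three invariants and are separated only by the sign of det Â, exactly as the text observes.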
The five invariants arise quite naturally. A rigid motion is given in coordinate form by X̂′ = P̂X̂ with P̂ nonsingular. The other operation used to obtain the canonical forms is multiplication of the equation by a nonzero scalar. This has no effect on the graph or the rank of a matrix, but since A is 3 × 3, it can change the sign of its determinant. Therefore the ranks of A and Â are invariants, but the sign of det A is not. Recall that for the conics, det A gives rise to the discriminant. The signature of A is invariant under rigid motions, but it may change sign when the equation is multiplied by a negative scalar. Therefore the absolute value of the signature, abs(sA), is an invariant of the equation. These three invariants together with the existence or nonexistence of a graph are sufficient to distinguish between all types of surface except for the two hyperboloids. The hyperboloids are distinguished by the sign of det Â, which is an invariant because Â is a 4 × 4 matrix.
Problems
1. For the following equations find both the canonical form and the transformation used to obtain it.
   a. x² − y²/4 − z²/9 = 1.
   b. x²/5 − z² = 1.
   c. 4z² + 2x + 3y + 8z = 0.
   d. 3x² + 6y² + 2z² + 12x + 12y + 12z + 42 = 0.
   e. −3x² + 3y² − 12xz + 12yz + 4x + 4y − 2z = 0.
2. Show that every plane parallel to the z axis intersects the paraboloids x²/a² ± y²/b² = 2z in parabolas.

3. Show that every plane parallel to the z axis intersects the hyperboloids x²/a² + y²/b² − z²/c² = ±1 in hyperbolas.

4. If intersecting planes, parallel planes, and coincident planes are viewed as cylinders, then what degenerate conics would be taken as the directrices?

5. Justify the name "conic section" for the graph of a second degree polynomial equation in two variables by showing that every real conic, except for parallel lines, may be obtained as the plane section of the elliptic cone x²/a² + y²/b² − z²/c² = 0.
6. Find the canonical form for the following equations.
   a. 2xy + 2xz + 2yz − 8 = 0.   b. 2x² − 2xy + 2xz + 4yz = 0.

7. Determine the type of quadric surface given by
   a. x² + 4xy + 4y² + 2xz + 4yz + z² + 8x + 4z = 0.
   b. x² + 6xy − 4xz + yz + 4z² + 2x − 4z + 5 = 0.
   c. 3x² + 2xy + 2y² + 6yz + 7z² + 2x + 2y + 4z + 1 = 0.
   d. 2x² + 8xy − 2y² + 12xz + 4yz + 8z² + 4x + 8y + 12z + 2 = 0.
8. Show that the elliptic cone x²/a² + y²/b² − z²/c² = 0 is the "asymptotic cone" for both hyperboloids given by x²/a² + y²/b² − z²/c² = ±1. That is, show that the distance between the cone and each hyperboloid approaches zero as the distance from the origin increases.
6. Chords, Rulings, and Centers
There are several ways in which a line can intersect or fail to intersect a given quadric surface, X̂ᵀÂX̂ = 0. To investigate this we need a vector equation for a line in terms of X̂. Recall that the line through the point U = (a, b, c) with direction numbers V = (α, β, γ) has the equation X = tV + U, or (x, y, z) = t(α, β, γ) + (a, b, c), t ∈ R. Written in terms of 4 × 1 matrices this equation becomes

   X̂ = tK̂ + Û,   with X̂ = (x, y, z, 1)ᵀ, K̂ = (α, β, γ, 0)ᵀ, and Û = (a, b, c, 1)ᵀ.

The fourth entry in the direction, K̂, must be zero to obtain an equality; therefore direction numbers and points now have different representations.

The line given by X̂ = tK̂ + Û intersects the quadric surface given by X̂ᵀÂX̂ = 0 in points for which the parameter t satisfies

   (tK̂ + Û)ᵀÂ(tK̂ + Û) = 0.

Using the properties of the transpose and matrix multiplication, and the fact that a 1 × 1 matrix equals its own transpose, this condition may be written in the form

   (K̂ᵀÂK̂)t² + (2K̂ᵀÂÛ)t + ÛᵀÂÛ = 0.

The coefficients of this equation determine exactly how the line intersects the quadric surface.

If the direction K̂ satisfies K̂ᵀÂK̂ ≠ 0, then there are two solutions for t. These solutions may be real and distinct, real and coincident, or complex conjugates. Therefore if K̂ᵀÂK̂ ≠ 0, then the line intersects the surface in two real points, either distinct or coincident, or in two imaginary points. In all three cases we will say the line contains a chord of the surface, with the chord being the line segment between the two points of intersection. In only one case is the chord a real line segment. However, with this definition, every line with direction numbers K̂ contains a chord of the surface X̂ᵀÂX̂ = 0 provided K̂ᵀÂK̂ ≠ 0.
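The three coefficients of this quadratic in t are straightforward to compute; the following sketch (mine, not the book's) forms them for the unit sphere and the x axis:

```python
# A sketch (not the book's code): forming the quadratic in t for the line
# X-hat = t*K-hat + U-hat meeting the surface X-hat^T A-hat X-hat = 0.

def mat_vec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def chord_coefficients(Ahat, K, U):
    """Return (K^T A K, 2 K^T A U, U^T A U), the coefficients of t^2, t, 1."""
    AK, AU = mat_vec(Ahat, K), mat_vec(Ahat, U)
    return dot(K, AK), 2 * dot(K, AU), dot(U, AU)

# unit sphere x^2 + y^2 + z^2 = 1; line through the origin along the x axis
Ahat = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
K, U = [1, 0, 0, 0], [0, 0, 0, 1]
a, b, c = chord_coefficients(Ahat, K, U)
assert (a, b, c) == (1, 0, -1)   # t^2 - 1 = 0: the surface is met at t = 1, -1
```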
Example 1  Show that every line contains a chord of an arbitrary ellipsoid.

It is sufficient to consider an ellipsoid given by x²/a² + y²/b² + z²/c² = 1 (Why?). For this equation Â = diag(1/a², 1/b², 1/c², −1), so if K̂ = (α, β, γ, 0)ᵀ is an arbitrary direction, then K̂ᵀÂK̂ = α²/a² + β²/b² + γ²/c². This is zero only if α = β = γ = 0, but as direction numbers, K̂ ≠ 0. Therefore, K̂ᵀÂK̂ ≠ 0 for all directions K̂.

A line near the origin intersects this ellipsoid in two real points, far from the origin it intersects the surface in two imaginary points, and somewhere in between, for each direction, there are (tangent) lines which intersect the ellipsoid in two coincident points.
If the line ℓ given by X̂ = tK̂ + Û intersects X̂ᵀÂX̂ = 0 when t = t₁ and t = t₂, then the point for which t = ½(t₁ + t₂) is the midpoint of the chord. This midpoint is always real, for even if t₁ and t₂ are not real, they are complex conjugates. Thus if K̂ᵀÂK̂ ≠ 0, then every line with direction K̂ contains the real midpoint of a chord. It is not difficult to show that the locus of all these midpoints is a plane. For suppose M̂ is the midpoint of the chord on the line ℓ. If ℓ is parameterized so that X̂ = M̂ when t = 0, then the equation for ℓ is X̂ = tK̂ + M̂. ℓ intersects the surface X̂ᵀÂX̂ = 0 at the points for which t satisfies

   (K̂ᵀÂK̂)t² + (2K̂ᵀÂM̂)t + M̂ᵀÂM̂ = 0.

Since these points are the endpoints of the chord, the coefficient of t in this equation must vanish. That is, if ℓ intersects the surface when t = t₁ and t = t₂, then ½(t₁ + t₂) = 0 implies that t₁ = −t₂. Therefore the midpoint M̂ satisfies K̂ᵀÂM̂ = 0. Conversely, if X̂ = sK̂ + Û is another parameterization with K̂ᵀÂÛ = 0, then ℓ intersects the surface when s = ±s₁ for some s₁, and Û is the midpoint of the chord. Thus we have proved the following.

Theorem 7.7  If K̂ᵀÂK̂ ≠ 0 for the quadric surface given by X̂ᵀÂX̂ = 0, then the locus of midpoints of all the chords with direction K̂ is the plane with equation K̂ᵀÂX̂ = 0.
Definition  The plane that bisects a set of parallel chords of a quadric surface is a diametral plane of the surface.

Example 2  Find the equation of the diametral plane for the ellipsoid x²/9 + y² + z²/4 = 1 that bisects the chords with direction numbers (3, 5, 4).

   K̂ᵀÂX̂ = (3, 5, 4, 0) diag(1/9, 1, 1/4, −1) (x, y, z, 1)ᵀ
         = (1/3, 5, 1, 0) (x, y, z, 1)ᵀ
         = x/3 + 5y + z.

Therefore, the diametral plane is given by x/3 + 5y + z = 0. Notice that the direction K̂ is not normal to the diametral plane in this case.
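Example 2 can be checked by computing the row vector K̂ᵀÂ directly; a sketch (mine, not the book's):

```python
# A check (not from the text) of Example 2: the diametral-plane row vector
# K-hat^T A-hat for the ellipsoid x^2/9 + y^2 + z^2/4 = 1, direction (3, 5, 4).

Ahat = [[1/9, 0, 0, 0],
        [0,   1, 0, 0],
        [0,   0, 1/4, 0],
        [0,   0, 0, -1]]
K = [3, 5, 4, 0]

# row vector K^T * A-hat
row = [sum(K[i] * Ahat[i][j] for i in range(4)) for j in range(4)]
assert abs(row[0] - 1/3) < 1e-12      # the x coefficient is 1/3
assert row[1:] == [5, 1, 0]           # so the plane is x/3 + 5y + z = 0
```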
Suppose the line X̂ = tK̂ + Û does not contain a chord of X̂ᵀÂX̂ = 0; then K̂ᵀÂK̂ = 0, and the nature of the intersection depends on the values of K̂ᵀÂÛ and ÛᵀÂÛ. If K̂ᵀÂÛ ≠ 0, then the line intersects the surface in only one point. (Why isn't this a tangent line?)

Example 3  Determine how lines with direction numbers (0, 2, 3) intersect the hyperboloid of two sheets given by x² + y²/4 − z²/9 = −1.

For this surface Â = diag(1, 1/4, −1/9, 1). Therefore, with K̂ = (0, 2, 3, 0)ᵀ, K̂ᵀÂK̂ = 4/4 − 9/9 = 0, and lines with direction K̂ do not contain chords. For an arbitrary point Û = (a, b, c, 1)ᵀ, K̂ᵀÂÛ = 2b/4 − 3c/9 = b/2 − c/3. This is nonzero if (a, b, c) is not in the plane with equation 3y − 2z = 0. Therefore, every line with direction numbers (0, 2, 3) that is not in the plane 3y − 2z = 0 intersects the hyperboloid in exactly one point. Each line with direction (0, 2, 3) that is in this plane fails to intersect the surface. For such a line has the equation X̂ = tK̂ + Û with Û = (a, 0, 0, 1)ᵀ. In this case, the equation

   (K̂ᵀÂK̂)t² + (2K̂ᵀÂÛ)t + ÛᵀÂÛ = 0

becomes 0t² + 0t + a² + 1 = 0, which has no solution. In particular, when a = 0 the line is in the asymptotic cone, x² + y²/4 − z²/9 = 0, for the surface (see problem 8, page 329).
This example gives a third relationship between a line and a quadric surface. If K̂ᵀÂK̂ = K̂ᵀÂÛ = 0 but ÛᵀÂÛ ≠ 0, then there are no values for t, real or complex, that satisfy the equation (K̂ᵀÂK̂)t² + (2K̂ᵀÂÛ)t + ÛᵀÂÛ = 0. So the line X̂ = tK̂ + Û fails to intersect the surface X̂ᵀÂX̂ = 0. The last possibility is that K̂ᵀÂK̂ = K̂ᵀÂÛ = ÛᵀÂÛ = 0, for which all t ∈ R satisfy the equation and the line lies in the surface.
Definition A line that lies entirely in a surface is a ruling of the surface.
Example 4  Show that a line with direction numbers (0, 0, 1) is either a ruling of the cylinder x²/4 − y²/9 = 1 or it does not intersect the surface.

For this surface Â = diag(1/4, −1/9, 0, −1). For the direction K̂ = (0, 0, 1, 0)ᵀ and any point Û, K̂ᵀÂK̂ = K̂ᵀÂÛ = 0, and ÛᵀÂÛ = 0 if and only if Û is on the surface. That is, X̂ = tK̂ + Û is either a ruling of the cylinder or it does not intersect the surface. Since Û could be taken as an arbitrary point on the cylinder, this shows that there is a ruling through every point of the hyperbolic cylinder.
Definition  A quadric surface is said to be ruled if every point on the surface lies on a ruling.

The definition of a cylinder implies that all cylinders are ruled surfaces; are any other quadric surfaces ruled? Certainly an ellipsoid is not, since every line contains a chord, but a cone appears to be ruled. For the cone given by x²/a² + y²/b² − z²/c² = 0, Â = diag(1/a², 1/b², −1/c², 0). So if K̂ is the direction given by K̂ = (α, β, γ, 0)ᵀ, then K̂ᵀÂK̂ = α²/a² + β²/b² − γ²/c². This vanishes if and only if the point (α, β, γ) is on the cone. Therefore, if Û = (α, β, γ, 1)ᵀ is a point on the cone, not the origin, then K̂ᵀÂK̂ = K̂ᵀÂÛ = ÛᵀÂÛ = 0, and the line X̂ = tK̂ + Û is a ruling through Û. Since all these rulings pass through the origin, every point lies on a ruling and the cone is a ruled surface.
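The fact that the cone's rulings pass through the origin can be checked numerically; the sketch below (mine, using the particular cone x² + y²/4 − z²/4 = 0 as an assumed example) verifies that scaling a point of the cone stays on the cone:

```python
# A sketch (not from the text): each point (alpha, beta, gamma) of the cone
# x^2 + y^2/4 - z^2/4 = 0 lies on the ruling t -> (t + 1) * (alpha, beta, gamma),
# the line through the point and the origin.

def on_cone(p):
    x, y, z = p
    return abs(x * x + y * y / 4 - z * z / 4) < 1e-9

point = (1.0, 2.0, 2.0 * 2**0.5)      # a point of the cone: 1 + 1 - 2 = 0
assert on_cone(point)
for t in (-2.0, -1.0, -0.5, 0.0, 3.0):
    moved = tuple((t + 1) * c for c in point)
    assert on_cone(moved)             # every point of the line is on the cone
```

This works because the cone's equation is homogeneous of degree 2 in x, y, z, so scaling a solution gives another solution.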
Although the figures do not suggest it, there are two other ruled quadric surfaces. Consider first the two hyperboloids with equations of the form x²/a² + y²/b² − z²/c² = ±1, taking the plus sign for the hyperboloid of 1 sheet. For these surfaces Â = diag(1/a², 1/b², −1/c², ∓1). Suppose Û = (x, y, z, 1)ᵀ is an arbitrary point on either of these surfaces, i.e., ÛᵀÂÛ = 0, and consider an arbitrary line through Û, X̂ = tK̂ + Û with K̂ = (α, β, γ, 0)ᵀ. These surfaces are ruled if for each point Û there exists a direction K̂ such that K̂ᵀÂK̂ = K̂ᵀÂÛ = ÛᵀÂÛ = 0. Suppose these conditions are satisfied; then

   K̂ᵀÂK̂ = α²/a² + β²/b² − γ²/c² = 0,
   K̂ᵀÂÛ = αx/a² + βy/b² − γz/c² = 0,
   ÛᵀÂÛ = x²/a² + y²/b² − z²/c² ∓ 1 = 0.

So

   αx/a² + βy/b² = γz/c² = (γ/c)(z/c) = ±√(α²/a² + β²/b²) √(x²/a² + y²/b² ∓ 1).

Squaring both sides and combining terms yields

   (αy − βx)² = ±(α²b² + β²a²).

Since a and b are nonzero and α = β = 0 implies γ = 0, there is no solution for K̂ using the minus sign, which means that the hyperboloid of 2 sheets is not a ruled surface. Thus the question becomes, can α and β be found so that (αy − βx)² − (α²b² + β²a²) = 0? If so, then the hyperboloid of 1 sheet is a ruled surface. If |y| ≠ b, view this condition as a quadratic equation for α in terms of β,

   (y² − b²)α² − 2xyβα + (x² − a²)β² = 0.

Then the discriminant in the quadratic formula is 4β²(a²y² + x²b² − a²b²). This is always nonnegative, for x²/a² + y²/b² = 1 is the level curve of the surface in z = 0 and all other level curves are larger. That is, if (x, y, z) is on the surface, then x²/a² + y²/b² ≥ 1. Therefore, if Û is not on the trace in z = 0 and |y| ≠ b, then there are two different solutions for α in terms of β. Further, for each solution there is only one value for γ satisfying K̂ᵀÂK̂ = K̂ᵀÂÛ = 0, so in this case there are two rulings through Û. If you check the remaining cases, you will find that there are two distinct rulings through every point on the surface. Thus not only is the hyperboloid of 1 sheet a ruled surface, but it is "doubly ruled." This is illustrated in Figure 12, but it is much better shown in a three-dimensional model.

Figure 12  Rulings on the hyperboloid of 1 sheet.
Now turn to the two types of paraboloids given by x²/a² ± y²/b² = 2z. For these surfaces

   Â = [ 1/a²    0      0    0 ]
       [ 0     ±1/b²    0    0 ]
       [ 0       0      0   −1 ]
       [ 0       0     −1    0 ]

So if K̂ = (α, β, γ, 0)ᵀ, then K̂ᵀÂK̂ = α²/a² ± β²/b². For the plus sign this vanishes only if α = β = 0. But K̂ᵀÂÛ = αx/a² ± βy/b² − γ. Therefore there cannot be a ruling through Û if Û lies on the elliptic paraboloid. For the hyperbolic paraboloid, α, β, and γ need only satisfy α²/a² − β²/b² = 0 and αx/a² − βy/b² − γ = 0. Again there are two rulings through Û, with the directions K̂ = (a, b, x/a − y/b, 0)ᵀ and K̂ = (a, −b, x/a + y/b, 0)ᵀ. Thus the hyperbolic paraboloid is also a doubly ruled surface, as shown in Figure 13. Since a portion of a hyperbolic paraboloid can be made of straight lines, the surface can be manufactured without curves. This fact has been used in the design of roofs for some modern buildings, Figure 14.

Figure 13  Rulings on the hyperbolic paraboloid.
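The two stated ruling directions can be verified numerically; the following sketch (mine, with sample values a = 2, b = 3 and an arbitrary surface point as assumptions) checks that each line lies entirely in the hyperbolic paraboloid:

```python
# A check (not the book's code) that the two ruling directions lie in the
# hyperbolic paraboloid x^2/a^2 - y^2/b^2 = 2z through a surface point.

a, b = 2.0, 3.0
x0, y0 = 1.0, 4.0
z0 = (x0*x0/(a*a) - y0*y0/(b*b)) / 2          # puts (x0, y0, z0) on the surface

def on_surface(p):
    x, y, z = p
    return abs(x*x/(a*a) - y*y/(b*b) - 2*z) < 1e-9

assert on_surface((x0, y0, z0))
for K in [(a, b, x0/a - y0/b), (a, -b, x0/a + y0/b)]:
    for t in (-1.0, 0.5, 2.0):
        p = (x0 + t*K[0], y0 + t*K[1], z0 + t*K[2])
        assert on_surface(p)                  # the whole line lies in the surface
```

Expanding (x0 + ta)²/a² − (y0 ± tb)²/b² − 2(z0 + tγ) shows the t and t² terms cancel for exactly these two choices of γ, which is why both lines stay in the surface.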
We will conclude this section with a determination of when a quadric
surface has one or more centers.
Definition A point U is a center of a quadric surface S if every line
through U either
1. contains a chord of S with midpoint U, or
2. fails to intersect S, or
3. is a ruling of S.

That is, a center is a point about which the quadric surface is symmetric.

If Û is a center for the surface given by X̂ᵀÂX̂ = 0 and K̂ is some direction, then each of the above three conditions implies that K̂ᵀÂÛ = 0. For K̂ᵀÂÛ to vanish for all directions K̂ implies that the first three entries of ÂÛ must be zero, or ÂÛ = (0, 0, 0, *)ᵀ. Conversely, suppose P̂ is a point for which ÂP̂ = (0, 0, 0, *)ᵀ. Then for any direction K̂ = (α, β, γ, 0)ᵀ, K̂ᵀÂP̂ = 0. Therefore, the line X̂ = tK̂ + P̂ (1) contains a chord with midpoint P̂ if K̂ᵀÂK̂ ≠ 0, or (2) does not intersect the surface if K̂ᵀÂK̂ = 0 and P̂ᵀÂP̂ ≠ 0, or (3) is a ruling of the surface if K̂ᵀÂK̂ = P̂ᵀÂP̂ = 0. This proves:

Theorem 7.8  The point Û is a center of the quadric surface given by X̂ᵀÂX̂ = 0 if and only if ÂÛ = (0, 0, 0, *)ᵀ.
Example 5  Determine if the quadric surface given by

   2x² + 6xy + 6y² − 2xz + 2z² − 2x + 4z + 5 = 0

has any centers.

For this polynomial,

   Â = [  2   3  −1  −1 ]
       [  3   6   0   0 ]
       [ −1   0   2   2 ]
       [ −1   0   2   5 ]

Therefore x, y, z are coordinates of a center provided ÂX̂ = (0, 0, 0, *)ᵀ, or

   2x + 3y − z − 1 = 0,   3x + 6y = 0,   −x + 2z + 2 = 0.

One finds that this system of equations is consistent, so the surface has at least one center. In fact, there is a line of centers, for the solution is given by x = 2t + 2, y = −t − 1, and z = t for any t ∈ R. This quadric surface is an elliptic cylinder.
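The line of centers found in Example 5 can be verified against the first three rows of Â, derived here from the polynomial's coefficients; a sketch (mine, not the book's):

```python
# A check (not from the text) of Example 5: every point (2t + 2, -t - 1, t)
# satisfies the center system for
# 2x^2 + 6xy + 6y^2 - 2xz + 2z^2 - 2x + 4z + 5 = 0.

rows = [                      # the first three rows of A-hat
    (2.0, 3.0, -1.0, -1.0),   # 2x + 3y - z - 1 = 0
    (3.0, 6.0,  0.0,  0.0),   # 3x + 6y = 0
    (-1.0, 0.0, 2.0,  2.0),   # -x + 2z + 2 = 0
]

for t in (-1.0, 0.0, 2.5):
    X = (2*t + 2, -t - 1, t, 1.0)     # a point on the claimed line of centers
    for r in rows:
        assert abs(sum(ri * xi for ri, xi in zip(r, X))) < 1e-12
```

Since the solution set is a one-parameter family, the surface has a line of centers, consistent with rank A = 2 for a cylinder.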
Example 6  Determine if the hyperboloid of 2 sheets given by x² − 6xz + 4y² + 2yz + 10x − 12y + 4z = 0 has any centers.

For this equation

   Â = [  1   0  −3   5 ]
       [  0   4   1  −6 ]
       [ −3   1   0   2 ]
       [  5  −6   2   0 ]

and

   x − 3z + 5 = 0,   4y + z − 6 = 0,   −3x + y + 2 = 0

is the system of linear equations given by ÂX̂ = (0, 0, 0, *)ᵀ. This system has the unique solution x = y = 1, z = 2. Therefore this hyperboloid has exactly one center.
Definition  A quadric surface with a single center is a central quadric.

The system of equations ÂX̂ = (0, 0, 0, *)ᵀ has a unique solution if and only if the rank of A is 3. Therefore, ellipsoids, hyperboloids, and cones are the only central quadric surfaces. A quadric surface has at least one center if the rank of A equals the rank of the first three rows of Â; that is, when the system ÂX̂ = (0, 0, 0, *)ᵀ is consistent. Further, a surface with at least one center has a line of centers if rank A = 2 and a plane of centers if rank A = 1.
Problems
1. Determine how the line intersects the quadric surface. If the line contains a chord, find the corresponding diametral plane.
   a. (x, y, z) = t(1, −1, 2) + (−2, −2, 2);  2x² + 2xy + 6yz + z² + 2x + 8 = 0.
   b. (x, y, z) = t(1, −1, 1) + (0, 1, −2);  2xy + 2xz − 2yz − 2z² + 8x + 6z + 2 = 0.
   c. (x, y, z) = t(2, √2, 0);  x² − 2xz + 2y² + 8yz + 8z² + 4y + 1 = 0.
2. Determine how the family of lines with direction numbers (1, −2, 1) intersects the quadric surface given by 4xy + y² − 2yz + 6x + 4z + 5 = 0.
3. Show that every diametral plane for a sphere is perpendicular to the chords it
bisects.
4. For each surface find the direction numbers of the rulings through the given point.
   a. x²/5 + y²/3 − z²/2 = 0;  (5, 3, 4).
   b. x² − y²/8 = 2z;  (4, 4, 7).
   c. 4xy + 2xz − 6yz + 2y − 4 = 0;  (1, 0, 2).
   d. x² − 8xy + 2yz − z² − 6x − 6y − 6 = 0;  (−1, 0, 1).
5. Find all the centers of the following quadric surfaces.
   a. 3x² + 4xy + y² + 10xz + 6yz + 8z² + 4y − 6z + 2 = 0.
   b. x² + 2xy + y² + 2xz + 2yz + z² + 2x + 2y + 2z = 0.
   c. x² + 6xy − 4yz + 3z² − 8x + 6y − 14z + 13 = 0.
   d. 4xy + 6y² − 8xz − 12yz + 8x + 12y = 0.
6. Suppose Û is a point on the surface given by X̂ᵀÂX̂ = 0.
   a. What conditions should a direction K̂ satisfy for the line X̂ = tK̂ + Û to be tangent to the surface?
   b. Why should Û not be a center if there is to be a tangent line through Û?
   A point of a quadric surface which is not a center is called a regular point of the surface.
   c. What is the equation of the tangent plane to X̂ᵀÂX̂ = 0 at a regular point Û?

7. If the given point is a regular point for the surface, find the equation of the tangent plane at that point; see problem 6.
   a. x² + y² + z² = 1;  (0, 0, 1).
   b. x² + 6xy + 2y² + 4yz + z² − 8x − 6y − 2z + 6 = 0;  (1, 1, −1).
   c. 4xy − 7y² + 3z² − 6x + 8y − 2 = 0;  (1, 2, −2).
   d. 3x² − 5xz + y² + 3yz − 8x + 2y − 6z = 0;  (2, 4, 5).
8. Find the two lines in the plane x − 4y + z = 0, passing through the point (2, 1, 2), that are tangent to the ellipsoid given by x²/4 + y² + z²/4 = 1.
Review Problems
1. Find a symmetric matrix A and a matrix B that express the given polynomial as the sum of a symmetric bilinear form XᵀAX and a linear form BX.
   a. 3x² − 5xy + 2y² + 5x − 7y.
   b. x² + 4xy + 6y.
   c. 4x² + 4xy − y² + 10yz + 4x − 9z.
   d. 3xy + 3y² − 4xz + 2yz − 5x + y − 2z.
2. Prove that if S, T ∈ Hom(𝒱, R), then the function b defined by b(U, V) = S(U)T(V) is bilinear. Is such a function ever symmetric and positive definite?
3. Determine which of the following functions define inner products.
a. b((x_1, x_2), (y_1, y_2)) = 5x_1y_1 + 6x_1y_2 + 6x_2y_1 + 9x_2y_2.
b. b((x_1, x_2), (y_1, y_2)) = 6x_1y_1 + x_1y_2 + 3x_2y_1 + 5x_2y_2.
c. b((x_1, x_2), (y_1, y_2)) = 4x_1y_1 + 3x_1y_2 + 3x_2y_1 + 2x_2y_2.
4. [Problem statement illegible in the source.]
5. Transform the following equations for conics into canonical form. Sketch the
curve and the coordinate systems used.
a. x^2 - 4y^2 + 4x + 24y - 48 = 0.
b. x^2 - y^2 - 10x - 4y + 21 = 0.
c. 9x^2 - 24xy + 16y^2 - 4x - 3y = 0.
d. x^2 + 2xy + y^2 - 8 = 0.
e. 9x^2 - 4xy + 6y^2 + 12√5 x + 4√5 y - 20 = 0.
6. a. Define center for a conic.
b. Find a system of equations that gives the centers of conics.
c. Which types of conics have centers?
7. Find the centers of the following conics; see Problem 6.
a. 2x^2 + 2xy + 5y^2 - 12x + 12y + 4 = 0.
b. 3x^2 + 12xy + 12y^2 + 2x + 4y = 0.
c. 4x^2 - 10xy + 7y^2 + 2x - 4y + 9 = 0.
8. Identify the quadric surface given by
a. 2x^2 + 2xy + 3y^2 - 4xz + 2yz + 2x - 2y + 4z = 0.
b. x^2 + 2xy + y^2 + 2xz + 2yz + z^2 + 2x + 2y + 2z - 3 = 0.
c. x^2 + 4xy + 3y^2 - 2xz - 3z^2 + 2y - 4z + 2 = 0.
9. Given the equation for an arbitrary quadric surface, which of the 17 types
would it most likely be?
10. Find all diametral planes for the following quadric surfaces, assuming each
equation is in standard form.
a. Real elliptic cylinder.
b. Elliptic paraboloid.
c. Hyperbolic paraboloid.
11. How do lines with direction numbers (3, 6, 0) intersect the quadric surface
with equation 4xz - 2yz + 6x = 0?
12. Find a rigid motion of the plane, in the form X = PX' + V, which transforms
52x^2 - 72xy + 73y^2 - 200x + 100y = 0 into a polynomial equation without
cross product or linear terms. What is the canonical form for this equation?
Canonical Forms Under
Similarity
1. Invariant Subspaces
2. Polynomials
3. Minimum Polynomials
4. Elementary Divisors
5. Canonical Forms
6. Applications of the Jordan Form
In Chapter 5 we found several invariants for similarity, but they were
insufficient to distinguish between similar and nonsimilar matrices. Therefore
it was not possible to define a set of canonical forms for matrices under
similarity. However, there are good reasons for taking up the problem again.
From a mathematical point of view, our examination of linear transformations
is left open by not having a complete set of invariants under similarity. On
the other hand, the existence of a canonical form that is nearly diagonal
permits the application of matrix algebra to several problems. Some of these
applications will be considered in the final section of this chapter.

Throughout this chapter 𝒱 will be a finite-dimensional vector space
over an arbitrary field F.
1. Invariant Subspaces
A set of canonical forms was easily found for matrices under equivalence.
The set consisted of all diagonal matrices with 1's and then 0's on the main
diagonal. It soon became clear that diagonal matrices could not play the
same role for similarity. However, we can use the diagonalizable case to show
how we should proceed in general.

Since two matrices are similar if and only if they are matrices for some
map T ∈ Hom(𝒱), it will be sufficient to look for properties that characterize
maps in Hom(𝒱), rather than n × n matrices. Therefore suppose T ∈ Hom(𝒱)
is a diagonalizable map. Let λ_1, ..., λ_k, k ≤ dim 𝒱, be the distinct charac-
teristic values of T. For each characteristic value λ_i, the subspace 𝒮(λ_i) =
{V ∈ 𝒱 | T(V) = λ_iV} is invariant under T. To say that 𝒮(λ_i) is a T-invariant
subspace means that if U ∈ 𝒮(λ_i), then T(U) ∈ 𝒮(λ_i). Call 𝒮(λ_i) the charac-
teristic space of λ_i (it is also called the eigenspace of λ_i). Then the characteristic
spaces of the diagonalizable map T yield a direct sum decomposition of 𝒱
in the following sense.
Definition   Let 𝒮_1, ..., 𝒮_h be subspaces of 𝒱. The sum of these spaces
is direct, written 𝒮_1 ⊕ ··· ⊕ 𝒮_h, if U_1 + ··· + U_h = U_1' + ··· + U_h'
implies U_i = U_i' when U_i, U_i' ∈ 𝒮_i, 1 ≤ i ≤ h. 𝒮_1 ⊕ ··· ⊕ 𝒮_h is a direct
sum decomposition of 𝒱 if 𝒱 = 𝒮_1 ⊕ ··· ⊕ 𝒮_h.
The following theorem gives another characterization of when a sum
is direct. The proof is not difficult and is left to the reader.

Theorem 8.1   The sum 𝒮_1 + ··· + 𝒮_h is direct if and only if, for
U_i ∈ 𝒮_i, 1 ≤ i ≤ h, U_1 + ··· + U_h = 0 implies U_1 = ··· = U_h = 0.
Now the above statement regarding characteristic spaces becomes:

Theorem 8.2   If T ∈ Hom(𝒱) is diagonalizable and λ_1, ..., λ_k are
the distinct characteristic values of T, then 𝒱 = 𝒮(λ_1) ⊕ ··· ⊕ 𝒮(λ_k).

Proof   Since 𝒱 has a basis B of characteristic vectors,
𝒱 ⊂ 𝒮(λ_1) + ··· + 𝒮(λ_k); therefore 𝒱 = 𝒮(λ_1) + ··· + 𝒮(λ_k).
To show the sum is direct, suppose U_1 + ··· + U_k = 0 with U_i ∈ 𝒮(λ_i).
Each U_i is a linear combination of the characteristic vectors in B correspond-
ing to λ_i. Characteristic vectors corresponding to different characteristic
values are linearly independent; therefore replacing each U_i in U_1 + ···
+ U_k with this linear combination of characteristic vectors yields a linear
combination of the vectors in B. Since B is a basis, all the coefficients vanish
and U_i = 0 for each i. So by Theorem 8.1 the sum is direct.
Example 1   Suppose T ∈ Hom(𝒱_3) is given by

T(a, b, c) = (3a - 2b - c, -2a + 6b + 2c, 4a - 10b - 3c).

T has three distinct characteristic values, 1, 2, and 3, and so is diagonalizable.
Therefore 𝒱_3 = 𝒮(1) ⊕ 𝒮(2) ⊕ 𝒮(3). In fact one can show that

𝒮(1) = ℒ{(1, -2, 6)},  𝒮(2) = ℒ{(0, 1, -2)},  𝒮(3) = ℒ{(1, -2, 4)},

and {(1, -2, 6), (0, 1, -2), (1, -2, 4)} is a basis for 𝒱_3.
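The claims of Example 1 can be checked directly; a minimal numerical sketch, with the matrix of T with respect to the standard basis read off from the formula for T:

```python
# Matrix of T from Example 1 with respect to the standard basis, read off
# from T(a, b, c) = (3a - 2b - c, -2a + 6b + 2c, 4a - 10b - 3c).
A = [[3, -2, -1],
     [-2, 6, 2],
     [4, -10, -3]]

def apply(M, X):
    """Matrix-vector product M X."""
    return [sum(M[i][j] * X[j] for j in range(3)) for i in range(3)]

# Each claimed spanning vector of S(lambda) is a characteristic vector:
# T(V) equals lambda * V.
for lam, V in [(1, [1, -2, 6]), (2, [0, 1, -2]), (3, [1, -2, 4])]:
    assert apply(A, V) == [lam * v for v in V]
```

Since the characteristic values 1, 2, 3 are distinct, these three characteristic vectors are automatically linearly independent, which is why they form a basis for 𝒱_3.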
Return to a diagonalizable map T with distinct characteristic values
λ_1, ..., λ_k. Then T determines the direct sum decomposition 𝒱 =
𝒮(λ_1) ⊕ ··· ⊕ 𝒮(λ_k), and this decomposition yields a diagonal matrix for
T as follows: Choose bases arbitrarily for 𝒮(λ_1), ..., 𝒮(λ_k) and let B
denote the basis for 𝒱 composed of these k bases. Then the matrix of T
with respect to B is the diagonal matrix with the characteristic values
λ_1, ..., λ_k on the main diagonal. In Example 1 such a basis is given by
B = {(1, -2, 6), (0, 1, -2), (1, -2, 4)}, and the matrix of T with respect to
B is diag(1, 2, 3).
How might this be generalized if T is not diagonalizable? Suppose
enough T-invariant subspaces could be found to express 𝒱 as a direct sum
of T-invariant subspaces. Then a basis B could be constructed for 𝒱 from
bases for the T-invariant subspaces in the decomposition. Since T sends
each of these subspaces into itself, the matrix for T with respect to B will
have its nonzero entries confined to blocks along the main diagonal. To
see this, suppose 𝒮_1, ..., 𝒮_h are T-invariant subspaces, although not neces-
sarily characteristic spaces, and 𝒱 = 𝒮_1 ⊕ ··· ⊕ 𝒮_h. If dim 𝒮_i = a_i,
choose a basis B_i for each 𝒮_i, and let B be the basis for 𝒱 composed of
B_1, ..., B_h. Then the matrix A of T with respect to B has its nonzero
entries confined to blocks, A_1, ..., A_h, which contain the main diagonal
of A. The block A_i is the matrix of the restriction of T to 𝒮_i with respect
to the basis B_i. (Since 𝒮_i is T-invariant, T: 𝒮_i → 𝒮_i.)
Thus

    A  =  ( A_1            0  )
          (       A_2         )
          (  0            A_h )

We will call A a diagonal block matrix and denote it by diag(A_1, ..., A_h).
A diagonal block matrix is also called a "direct sum of matrices" and denoted
by A_1 ⊕ ··· ⊕ A_h, but this notation will not be used here.
Example 2   Suppose T: ℰ_3 → ℰ_3 is rotation with axis ℒ{U} = 𝒮
and angle θ ≠ 0. Let 𝒯 = 𝒮⊥; then 𝒯 is not a characteristic space for T,
but 𝒮 and 𝒯 are T-invariant subspaces and ℰ_3 = 𝒮 ⊕ 𝒯. Now suppose
{U, V, W} is an orthonormal basis for ℰ_3 with 𝒯 = ℒ{V, W}. Then the
matrix of T with respect to {U, V, W} is

    ( 1     0        0     )
    ( 0   cos θ   -sin θ  )
    ( 0   sin θ    cos θ  )
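The block structure in Example 2 can be verified numerically for a sample angle; the value θ = π/3 below is an arbitrary choice:

```python
import math

# Matrix of the rotation in Example 2 with respect to the orthonormal
# basis {U, V, W}; theta = pi/3 is an arbitrary sample angle.
theta = math.pi / 3
c, s = math.cos(theta), math.sin(theta)
A = [[1, 0, 0],
     [0, c, -s],
     [0, s, c]]

def apply(M, X):
    """Matrix-vector product M X."""
    return [sum(M[i][j] * X[j] for j in range(3)) for i in range(3)]

# The axis S = span{U} is T-invariant: U = (1, 0, 0) is fixed.
assert apply(A, [1, 0, 0]) == [1, 0, 0]

# The plane T = span{V, W} is T-invariant: the images of V and W have
# zero first coordinate, so they stay in the plane.
for X in ([0, 1, 0], [0, 0, 1]):
    assert apply(A, X)[0] == 0
```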
Thus a decomposition of 𝒱 into a direct sum of T-invariant subspaces
gives rise to a diagonal block matrix. The procedure for obtaining canonical
forms under similarity now becomes one of selecting T-invariant subspaces
in some well-defined way and then choosing bases which make the blocks
relatively simple. The problem is in the selection of the subspaces, for it is
easy to generate T-invariant subspaces and obtain bases in the process.

For any nonzero U ∈ 𝒱, there is an integer k ≤ dim 𝒱 such that the
set {U, T(U), ..., T^{k-1}(U)} is linearly independent and the set {U, T(U), ...,
T^{k-1}(U), T^k(U)} is linearly dependent. Set 𝒮(U, T) = ℒ{U, T(U), ...,
T^{k-1}(U)}. Then it is a simple matter to show that 𝒮(U, T) is a T-invariant
subspace, called the T-cyclic subspace generated by U. Moreover the set
{U, T(U), ..., T^{k-1}(U)} is a good choice as a basis for 𝒮(U, T). Problem 4
at the end of this section provides an example of how a direct sum of T-cyclic
subspaces leads to a simple matrix for T.
There is a polynomial implicit in the definition of the T-cyclic subspace
𝒮(U, T), which arises when a matrix is found for T using the vectors in
{U, T(U), ..., T^{k-1}(U)}. This polynomial is of fundamental importance
in determining how we should select the T-invariant subspaces for a direct
sum decomposition of 𝒱. Since T^k(U) ∈ 𝒮(U, T), there exist scalars
a_{k-1}, ..., a_1, a_0 such that

T^k(U) = a_{k-1}T^{k-1}(U) + ··· + a_1T(U) + a_0U,

or

T^k(U) - a_{k-1}T^{k-1}(U) - ··· - a_1T(U) - a_0U = 0.

Therefore U determines a polynomial

a(x) = x^k - a_{k-1}x^{k-1} - ··· - a_1x - a_0

such that when the symbol x is replaced by the map T, a linear map a(T)
is obtained which sends U to 0. This implies that a(T) sends all of 𝒮(U, T)
to 0, or that 𝒮(U, T) is contained in 𝒩_{a(T)}, the null space of a(T).
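As a concrete illustration (our choice of U, not one made in the text), take T to be the map of Example 1 and U = (1, 0, 0). One finds T^2(U) = 5T(U) - 6U, so 𝒮(U, T) = ℒ{U, T(U)} is two-dimensional and a(x) = x^2 - 5x + 6 = (x - 2)(x - 3):

```python
# The map T of Example 1 and an arbitrarily chosen starting vector U.
A = [[3, -2, -1],
     [-2, 6, 2],
     [4, -10, -3]]

def apply(M, X):
    """Matrix-vector product M X."""
    return [sum(M[i][j] * X[j] for j in range(3)) for i in range(3)]

U = [1, 0, 0]
TU = apply(A, U)      # T(U)   = (3, -2, 4); {U, T(U)} is independent
TTU = apply(A, TU)    # T^2(U) = (9, -10, 20)

# T^2(U) = 5 T(U) - 6 U, so the set {U, T(U), T^2(U)} is dependent
# and a(x) = x^2 - 5x + 6.
assert TTU == [5 * t - 6 * u for t, u in zip(TU, U)]

# a(T) sends U to 0: a(T)U = T^2(U) - 5T(U) + 6U = 0.
aTU = [tt - 5 * t + 6 * u for tt, t, u in zip(TTU, TU, U)]
assert aTU == [0, 0, 0]
```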
Example 3   Consider the map T ∈ Hom(𝒱_3) given by T = 2I.
For any nonzero vector U ∈ 𝒱_3, T(U) = 2U. Therefore the polynomial
associated with U is a(x) = x - 2. Then a(T) = T - 2I and

a(T)U = (T - 2I)U = T(U) - 2U = 2U - 2U = 0.

Since T = 2I, 𝒮(U, T) = ℒ{U} and 𝒩_{a(T)} = 𝒱_3. Thus the null space
contains the T-cyclic space, but they need not be equal.
The null space of a(T) in Example 3 is all of 𝒱_3 and is, therefore, a
T-invariant subspace. In fact, the null space of such a polynomial in
T is always T-invariant.
Theorem 8.3   Suppose T ∈ Hom(𝒱) and b_m, ..., b_1, b_0 ∈ F. Let
b(T) = b_mT^m + ··· + b_1T + b_0I. Then b(T) ∈ Hom(𝒱) and 𝒩_{b(T)} is a
T-invariant subspace of 𝒱.

Proof   Since Hom(𝒱) is a vector space and closed under composi-
tion, a simple induction argument shows that b(T) ∈ Hom(𝒱).
To show that 𝒩_{b(T)} is T-invariant, let W ∈ 𝒩_{b(T)}; then

b(T)T(W) = (b_mT^m + ··· + b_1T + b_0I)T(W)
         = b_mT^m(T(W)) + ··· + b_1T(T(W)) + b_0I(T(W))
         = b_mT(T^m(W)) + ··· + b_1T(T(W)) + b_0T(I(W))
         = T(b_mT^m(W) + ··· + b_1T(W) + b_0I(W))
         = T(b(T)W) = T(0) = 0.

Therefore T(W) ∈ 𝒩_{b(T)} and 𝒩_{b(T)} is T-invariant.
Example 4   Let T be the map in Example 1 and b(x) = x^2 - 3x.
Find 𝒩_{b(T)}.
b(T) = T^2 - 3T. A little computation shows that

b(T)(a, b, c) = (-2b - c, -4a + 2b + 2c, 8a - 8b - 6c),

and that 𝒩_{b(T)} = ℒ{(1, -2, 4)}. In Example 1 we found that (1, -2, 4) is
a characteristic vector for T, which confirms that 𝒩_{b(T)} is T-invariant.
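Example 4 can be checked by direct computation; a sketch:

```python
# Matrix of T from Example 1; b(T) = T^2 - 3T as in Example 4.
A = [[3, -2, -1],
     [-2, 6, 2],
     [4, -10, -3]]

def apply(M, X):
    """Matrix-vector product M X."""
    return [sum(M[i][j] * X[j] for j in range(3)) for i in range(3)]

def bT(X):
    """b(T)X = T(T(X)) - 3 T(X)."""
    TX = apply(A, X)
    return [tt - 3 * t for tt, t in zip(apply(A, TX), TX)]

# The formula given in Example 4 holds on a sample of vectors.
for X in ([1, 0, 0], [0, 1, 0], [0, 0, 1], [2, -1, 3]):
    a, b, c = X
    assert bT(X) == [-2*b - c, -4*a + 2*b + 2*c, 8*a - 8*b - 6*c]

# (1, -2, 4) lies in the null space of b(T).
assert bT([1, -2, 4]) == [0, 0, 0]
```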
Our procedure for obtaining canonical forms under similarity is to obtain
a direct sum decomposition of 𝒱 into T-invariant subspaces and choose
bases for these subspaces in a well-defined way. The preceding discussion
shows that every nonzero vector U generates a T-invariant subspace 𝒮(U, T),
and each polynomial a(x) yields a T-invariant subspace 𝒩_{a(T)}. Although
these two methods of producing T-invariant subspaces are related, Example
3 shows that they are not identical. Selecting a sequence of T-invariant sub-
spaces will use both methods; it will be based on a polynomial associated
with T and on basic properties of polynomials. These properties closely parallel
properties of the integers and would be obtained in a general study of rings.
However, we cannot proceed without them. Therefore the needed properties
are derived in the next section, culminating in the unique factorization theorem
for polynomials. It would be possible to go quickly through this discussion
and then refer back to it as questions arise.
Problems
1. Suppose

[3 × 3 matrix, illegible in the source except the last row, (1 1 2)]

is the matrix of T ∈ Hom(𝒱_3) with respect to {E_i}. For the given vector U, find
𝒮(U, T) and the polynomial a(x) associated with U.
a. U = (1, 0, 0).  b. U = (1, -1, 0).  c. U = (-2, 0, 1).  d. U = (0, 1, 0).
2. Let T ∈ Hom(𝒱_3) have matrix

[3 × 3 matrix illegible in the source]

with respect to {E_i}; find 𝒮(U, T) and the polynomial a(x) associated with U
when:
a. U = (-1, 0, 1).  b. U = (2, 0, -1).  c. U = (1, 1, 1).
d. What is the characteristic polynomial of T?
3. Show that 𝒮(U, T) is the smallest T-invariant subspace of 𝒱 that contains U.
That is, if 𝒯 is a T-invariant subspace and U ∈ 𝒯, then 𝒮(U, T) ⊂ 𝒯.
4. Suppose

[5 × 5 matrix, only scattered entries legible in the source]

is the matrix of T ∈ Hom(𝒱_5) with respect to {E_i}.
a. Show that 𝒱_5 = 𝒮(U_1, T) ⊕ 𝒮(U_2, T) ⊕ 𝒮(U_3, T) when
U_1 = (0, 0, -1, 1, 0), U_2 = (1, 0, -1, 0, 0), U_3 = (-1, 2, 3, -1, 0).
b. Find the diagonal block matrix of T with respect to the basis
B = {U_1, T(U_1), U_2, U_3, T(U_3)}.
5. Prove that for a diagonal block matrix,
det diag(A_1, A_2, ..., A_k) = (det A_1)(det A_2) ··· (det A_k).
6. Suppose T ∈ Hom(𝒱_3) has

[3 × 3 matrix, illegible in the source except the last row, (1 2 2)]

as its matrix with respect to {E_i}. Find the null space 𝒩_{p(T)} when
a. p(x) = x + 3.  b. p(x) = x^2 - 5x + 9.
c. p(x) = x^2 - 4.  d. p(x) = (x + 3)(x^2 - 5x + 9).
7. Suppose T ∈ Hom(𝒱_3) has

[3 × 3 matrix, illegible in the source except the last row, (1 0 1)]

as its matrix with respect to {E_i}. Find 𝒩_{p(T)} when
a. p(x) = x - 2.  b. p(x) = (x - 2)^2.  c. p(x) = x.
d. p(x) = x(x - 2).  e. p(x) = x(x - 2)^2.
8. Suppose T ∈ Hom(𝒱) and p(x) and q(x) are polynomials such that 𝒱 =
𝒩_{p(T)} + 𝒩_{q(T)}. What can be concluded about the map p(T) ∘ q(T)?
9. Suppose T ∈ Hom(𝒱_3) has

[3 × 3 matrix, illegible in the source except the last row, (0 1 2)]

as its matrix with respect to {E_i}. Find 𝒩_{p(T)} when
a. p(x) = x - 2.  b. p(x) = (x - 2)^2.  c. p(x) = x^2 - 3x + 3.
d. p(x) = x - 1.  e. Find p(x) such that p(T) = 0.
2. Polynomials

We have encountered polynomials in two situations: first as vectors in
the vector space R[t], for which t is an indeterminate symbol never allowed
to be a member of the field R, and then in polynomial equations used to find
characteristic values, for which the symbol λ was regarded as an unknown
variable in the field. A general study of polynomials must combine both
roles for the symbol used. Therefore suppose that the symbol x is an indeter-
minate, which may be assigned values from the field F, and denote by F[x]
the set of all polynomials in x with coefficients from F. That is, p(x) ∈ F[x] if
p(x) = a_nx^n + ··· + a_1x + a_0 with a_n, ..., a_1, a_0 ∈ F.
Definition   Two polynomials from F[x], p(x) = a_nx^n + ··· + a_1x + a_0
and q(x) = b_mx^m + ··· + b_1x + b_0, are equal if n = m and a_i = b_i for all
i, 0 ≤ i ≤ n. The zero polynomial 0(x) of F[x] satisfies the condition that if
0(x) = a_nx^n + ··· + a_1x + a_0, then a_n = ··· = a_1 = a_0 = 0.
The two different roles for x are nicely illustrated by the two statements
p(x) = 0(x) and p(x) = 0. The first means that p(x) is the zero polynomial,
while the second will be regarded as an equation for x as an element in F;
that is, p(x) is the zero element of F. The distinction is very clear over a finite
field, where a polynomial may vanish for every element and yet not be the
zero polynomial; see Problem 1 at the end of this section.
Two operations are defined on the polynomials of F[x]. Addition in
F[x] is defined as in the vector space F[t], and multiplication is defined as
follows: If p(x) = a_nx^n + ··· + a_1x + a_0 and q(x) = b_mx^m + ··· + b_1x + b_0,
then

p(x)q(x) = c_{n+m}x^{n+m} + ··· + c_1x + c_0

with

c_k = Σ_{i+j=k} a_ib_j,   0 ≤ k ≤ n + m.

For example, if p(x) = x^3 - x^2 + 3x - 2 and q(x) = 2x^2 + 4x + 3, then

p(x)q(x) = 1·2x^5 + (1·4 - 1·2)x^4 + (1·3 - 1·4 + 3·2)x^3
           + (-1·3 + 3·4 - 2·2)x^2 + (3·3 - 2·4)x - 2·3
         = 2x^5 + 2x^4 + 5x^3 + 5x^2 + x - 6.
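The coefficient formula for c_k is a convolution of the two coefficient sequences; a minimal sketch, with coefficients listed from the constant term upward:

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists [a0, a1, ..., an],
    using c_k = sum of a_i * b_j over i + j = k."""
    c = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            c[i + j] += a * b
    return c

# p(x) = x^3 - x^2 + 3x - 2 and q(x) = 2x^2 + 4x + 3 as above.
p = [-2, 3, -1, 1]
q = [3, 4, 2]

# p(x)q(x) = 2x^5 + 2x^4 + 5x^3 + 5x^2 + x - 6
assert poly_mul(p, q) == [-6, 1, 5, 5, 2, 2]
```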
The algebraic system composed of F[x] together with the operations of addition
and multiplication of polynomials will also be denoted by F[x]. Thus F[x]
is not a vector space but is an algebraic system with the same type of opera-
tions as the field F. In fact, F[x] satisfies all but one of the defining properties
of a field; in general, polynomials do not have multiplicative inverses in F[x].
Theorem 8.4   F[x] is a commutative ring with identity in which the can-
cellation law for multiplication holds.

Proof   The additive properties for a ring are the same as the addi-
tive properties for the vector space F[t] and therefore carry over to F[x]. This
leaves six properties to be established. Let p(x), q(x), and r(x) be arbitrary
elements from F[x]; then the remaining properties for a ring are:

1. p(x)q(x) ∈ F[x].
2. p(x)(q(x)r(x)) = (p(x)q(x))r(x).
3. p(x)(q(x) + r(x)) = p(x)q(x) + p(x)r(x) and
(p(x) + q(x))r(x) = p(x)r(x) + q(x)r(x).

F[x] is commutative if

4. p(x)q(x) = q(x)p(x).

F[x] has an identity if

5. there exists 1(x) ∈ F[x] such that p(x)1(x) = 1(x)p(x) = p(x).

The cancellation law for multiplication holds in F[x] if

6. p(x)q(x) = p(x)r(x) and p(x) ≠ 0(x) imply q(x) = r(x), and
q(x)p(x) = r(x)p(x) and p(x) ≠ 0(x) imply q(x) = r(x).

Parts 1 and 4 are obvious from the definition of multiplication and the
corresponding properties of F; thus, by 4, only half of parts 3 and 6 need be
established. For 5, take 1(x) to be the multiplicative identity of F. Parts 2
and 3 follow with a little work from the corresponding properties in F, and
this is left to the reader. This leaves 6, which may be proved as follows:

Suppose p(x)q(x) = p(x)r(x); then p(x)(q(x) - r(x)) = 0(x), and it is
sufficient to show that if p(x)s(x) = 0(x) and p(x) ≠ 0(x), then s(x) = 0(x).
Therefore, suppose p(x)s(x) = 0(x), p(x) = a_nx^n + ··· + a_1x + a_0 with
a_n ≠ 0, so p(x) ≠ 0(x), and s(x) = b_mx^m + ··· + b_1x + b_0. Assume that
s(x) ≠ 0(x), so that b_m ≠ 0. Then p(x)s(x) = a_nb_mx^{n+m} + terms in lower powers
of x. p(x)s(x) = 0(x) implies that a_nb_m = 0, but a_n ≠ 0 and F is a field, so
b_m = 0, contradicting the assumption that s(x) ≠ 0(x). Therefore s(x) = 0(x),
and the cancellation law holds in F[x].
Definition   An integral domain is a commutative ring with identity in
which the cancellation law for multiplication holds.

Hence a polynomial ring F[x] is an integral domain. The integers together
with the arithmetic operations of addition and multiplication also com-
prise an integral domain. Actually, the term integral domain is derived from
the fact that such an algebraic system is an abstraction of the integers. Our
main objective in this section is to prove that there is a unique way to factor
any polynomial, just as there is a unique factorization of any integer into
primes. Since every element of F is a polynomial in F[x], and F is a field, there
are many factorizations that are not of interest. For example, 2x + 3
= 4(x/2 + 3/4) = (1/5)(10x + 15).
Definition   Let p(x) = a_nx^n + ··· + a_1x + a_0 ∈ F[x]. If p(x) ≠ 0(x)
and a_n ≠ 0, then a_n is the leading coefficient of p(x) and p(x) has degree n,
denoted deg p = n. If the leading coefficient of p(x) is 1, then p(x) is monic.
For a(x), b(x) ∈ F[x], b(x) is a factor of a(x), or b(x) divides a(x), denoted
b(x)|a(x), if there exists q(x) ∈ F[x] such that a(x) = b(x)q(x).

There is no number that can reasonably be taken for the degree of the
zero polynomial. However, it is useful if the degree of 0(x) is taken to be
minus infinity.
The nonzero elements of F are polynomials of degree 0 in F[x]. F is a
field, so any nonzero element of F is a factor of any polynomial in F[x].
However, a polynomial such as 2x + 3 above can be factored into the pro-
duct of an element from F and a monic polynomial in only one way aside from
order; 2x + 3 = 2(x + 3/2). Thus, given a monic polynomial, we will only
look for monic factors.

In deriving our main result on factorization we will need two properties
of the integers which are equivalent to induction; see Problem 13 at the end
of this section.
Theorem 8.5   The set of positive integers is well ordered. That is, every
nonempty subset of the positive integers contains a smallest element.

Proof   The proof is by induction. Suppose A is a nonempty subset
of positive integers. Assume A has no smallest element, and let S = {n | n is
a positive integer and n < a for all a ∈ A}. Then 1 ∈ S, for if not then 1 ∈ A
and A has a smallest element. If k ∈ S and k + 1 ∉ S, then k + 1 ∈ A and
again A has a smallest element. Therefore k ∈ S implies k + 1 ∈ S, and S is
the set of all positive integers by induction. This contradicts the fact that A
is nonempty; therefore A has a smallest element, as claimed.

Many who find induction difficult to believe regard this principle of well
ordering as "obvious." For them it is a paradox that the two principles are
equivalent.

There are two principles of induction for the positive integers. The
induction used in the above proof may be called the "first principle of induc-
tion."

Theorem 8.6 (Second Principle of Induction)   Suppose B is a set
of positive integers such that

1. 1 ∈ B, and
2. if m is a positive integer such that k ∈ B for all k < m, then m ∈ B.

Then B is the set of all positive integers.

Proof   Let A = {n | n is a positive integer and n ∉ B}. If A is non-
empty, then by well ordering A has a smallest element; call it m. m ∉ B, so
m ≠ 1, and if k is a positive integer less than m, then k ∈ B. Therefore, by
2, m ∈ B, and the assumption that A is nonempty leads to a contradiction.
Thus B is the set of all positive integers.
As with the first principle of induction, both the well-ordering property
and the second principle of induction can be applied to any set of integers
obtained by taking all integers greater than some number.
Theorem 8.7 (Division Algorithm for Polynomials)   Let a(x), b(x)
∈ F[x] with b(x) ≠ 0(x). Then there exist polynomials q(x), called the
quotient, and r(x), called the remainder, such that

a(x) = b(x)q(x) + r(x)   with deg r < deg b.

Moreover the polynomials q(x) and r(x) are uniquely determined by a(x)
and b(x).

Proof   If deg a < deg b, then q(x) = 0(x) and r(x) = a(x) satisfy
the existence part of the theorem. Therefore, suppose deg a ≥ deg b and
apply the second principle of induction to the set of non-negative integers

B = {n | if deg a = n, then q(x) and r(x) exist}.

If deg a = 0, there is nothing to prove, for a(x) and b(x) are in F and we
can take q(x) = a(x)/b(x), r(x) = 0(x). So 0 ∈ B.

Now assume k ∈ B if k < m, that is, that q(x) and r(x) exist for c(x) if
deg c = k. Let a(x) = a_mx^m + ··· + a_1x + a_0 and b(x) = b_nx^n + ··· + b_1x
+ b_0 with m ≥ n. Consider the polynomial a'(x) = a(x) - (a_m/b_n)x^{m-n}b(x).
Deg a' < m, so by the induction assumption there exist polynomials q'(x)
and r(x) such that a'(x) = b(x)q'(x) + r(x) with deg r < deg b. Therefore,
if q(x) = q'(x) + (a_m/b_n)x^{m-n}, then a(x) = b(x)q(x) + r(x) with deg r <
deg b. Therefore k < m and k ∈ B implies m ∈ B, and by the second principle
of induction, B is the set of all non-negative integers. That is, q(x) and r(x)
exist for all polynomials a(x) such that deg a ≥ deg b.

The proof of uniqueness is left to Problem 5 at the end of this section.

In practice the quotient and remainder are easily found by long division.
Example 1   Find q(x) and r(x) if a(x) = 4x^3 + x^2 - x + 2 and
b(x) = x^2 + x - 2.

                    4x - 3
    x^2 + x - 2 ) 4x^3 +  x^2 -  x + 2
                  4x^3 + 4x^2 - 8x
                       - 3x^2 + 7x + 2
                       - 3x^2 - 3x + 6
                               10x - 4

Therefore q(x) = 4x - 3 and r(x) = 10x - 4.
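The long-division loop is exactly the inductive step in the proof of Theorem 8.7: subtract (a_m/b_n)x^{m-n}·b(x) and repeat. A sketch over the rationals, with coefficients listed from the constant term upward:

```python
from fractions import Fraction

def poly_divmod(a, b):
    """Division algorithm: return (q, r) with a = b*q + r, deg r < deg b.
    Polynomials are coefficient lists [c0, c1, ...] with nonzero leading
    coefficient; [] stands for the zero polynomial."""
    a = [Fraction(c) for c in a]
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        # Leading term of the current remainder over leading term of b.
        coef = a[-1] / b[-1]
        d = len(a) - len(b)
        q[d] = coef
        # Subtract coef * x^d * b(x); the leading term cancels.
        for i, bc in enumerate(b):
            a[i + d] -= coef * bc
        while a and a[-1] == 0:
            a.pop()
    return q, a

# Example 1: a(x) = 4x^3 + x^2 - x + 2, b(x) = x^2 + x - 2.
q, r = poly_divmod([2, -1, 1, 4], [-2, 1, 1])
assert q == [-3, 4]      # q(x) = 4x - 3
assert r == [-4, 10]     # r(x) = 10x - 4
```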
Corollary 8.8 (Remainder Theorem)   Let c ∈ F and p(x) ∈ F[x]. Then
the remainder of the division p(x)/(x - c) is p(c).

Proof   Since x - c ∈ F[x], there exist polynomials q(x) and r(x)
such that p(x) = (x - c)q(x) + r(x) with deg r(x) < deg(x - c); but
deg(x - c) = 1, so r(x) is an element of F. Since r(x) is constant, its value
can be obtained by setting x equal to c: p(c) = (c - c)q(c) + r(c). Thus
the remainder r(x) = p(c).
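Horner's method (synthetic division) evaluates p(c) by nested multiplication and, in the light of Corollary 8.8, produces the quotient coefficients of p(x)/(x - c) at the same time. A sketch, with a polynomial of our own choosing and coefficients listed leading term first:

```python
def horner(p, c):
    """p given as [a_n, ..., a_1, a_0] (leading coefficient first).
    Returns (quotient coefficients, remainder) for p(x) / (x - c);
    by the Remainder Theorem the remainder equals p(c)."""
    acc = p[0]
    partial = [acc]
    for a in p[1:]:
        acc = acc * c + a
        partial.append(acc)
    return partial[:-1], partial[-1]

# p(x) = x^3 - 2x^2 - 5x + 6 = (x - 1)(x + 2)(x - 3)
p = [1, -2, -5, 6]
q, r = horner(p, 1)
assert r == 0              # 1 is a root (Factor Theorem)
assert q == [1, -1, -6]    # p(x) = (x - 1)(x^2 - x - 6)
q, r = horner(p, 2)
assert r == 2**3 - 2*2**2 - 5*2 + 6   # the remainder is p(2)
```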
Corollary 8.9 (Factor Theorem)   Let p(x) ∈ F[x] and c ∈ F. x - c
is a factor of p(x) if and only if p(c) = 0.

Definition   Given p(x) ∈ F[x] and c ∈ F, c is a root of p(x) if x - c is
a factor of p(x).

A third corollary is easily proved using the first principle of induction.

Corollary 8.10   If p(x) ∈ F[x] and deg p(x) = n > 0, then p(x) has at
most n roots in F.
Of course a polynomial need have no roots in a particular field. For
example, x^2 + 1 has no roots in R. However, every polynomial in R[x] has
at least one root in C, the complex numbers. This follows from the Funda-
mental Theorem of Algebra, since R[x] ⊂ C[x]. This theorem is quite dif-
ficult to prove and will only be stated here.

Theorem 8.11 (Fundamental Theorem of Algebra)   If p(x) ∈ C[x],
then p(x) has at least one root in C.

Definition   A field F is algebraically closed if every polynomial of
positive degree in F[x] has a root in F.

The essential part of this definition is the requirement that the root be
in F. Thus the complex numbers are algebraically closed while the real
numbers are not. We will find that the simplest canonical form under similarity
is obtained when the matrix is considered to be over an algebraically closed
field.
If the polynomial p(x) has c as a root, then x - c is a factor, and there
exists a polynomial q(x) such that p(x) = (x - c)q(x). Since deg q =
deg p - 1, an induction argument can be used to extend Theorem 8.11 as
follows.

Corollary 8.12   If p(x) ∈ C[x] and deg p = n > 0, then p(x) has
exactly n roots. (The n roots of p(x) need not all be distinct.)

Thus an nth degree polynomial p(x) ∈ C[x] can be factored into the
product of n linear factors of the form x - c_i, 1 ≤ i ≤ n, with p(c_i) = 0. A
polynomial p(x) in R[x] is also in C[x], and it can be shown that any complex
roots of p(x) occur in conjugate pairs. Therefore p(x) can be expressed as a
product of linear and quadratic factors in R[x].
Before obtaining a factorization for every polynomial in F[x], we must
investigate greatest common divisors of polynomials. This is done with the
aid of subsets called ideals.

Definition   An ideal in F[x] is a nonempty subset S of F[x] such that

1. if p(x), q(x) ∈ S, then p(x) - q(x) ∈ S, and
2. if r(x) ∈ F[x] and p(x) ∈ S, then r(x)p(x) ∈ S.

Example 2   Show that S = {p(x) ∈ R[x] | 4 is a root of p(x)} is an
ideal in R[x].
If p(x) and q(x) ∈ S, then (x - 4)|p(x) and (x - 4)|q(x) imply that
(x - 4)|(p(x) - q(x)), so p(x) - q(x) ∈ S, and (x - 4)|r(x)p(x) for any
r(x) ∈ R[x], so r(x)p(x) ∈ S. Therefore both conditions are satisfied and S
is an ideal.

Any set of polynomials can be used to generate an ideal in F[x]. For
example, given a(x) and b(x) in F[x], define S by

S = {u(x)a(x) + v(x)b(x) | u(x), v(x) ∈ F[x]}.

It is not difficult to check that S is an ideal in F[x] containing both a(x) and
b(x). Notice that if c is a root of both a(x) and b(x), then c is a root of every
polynomial in S. In fact this ideal is determined by the roots or factors com-
mon to a(x) and b(x). This is shown in the next two theorems.
Theorem 8.13   F[x] is a principal ideal domain. That is, if S is an ideal in
F[x], then there exists a polynomial d(x) such that S = {q(x)d(x) | q(x) ∈ F[x]}.

Proof   If S = {0(x)}, there is nothing to prove; therefore suppose
S ≠ {0(x)}. Then the well-ordering principle implies that S contains a poly-
nomial of least finite degree. If d(x) denotes such a polynomial, then it is
only necessary to show that d(x) divides every element of S.
If p(x) ∈ S, then there exist polynomials q(x) and r(x) such that p(x)
= d(x)q(x) + r(x) with deg r < deg d. But r(x) is in the ideal, since r(x)
= p(x) - q(x)d(x) and p(x), d(x) ∈ S. Now since d(x) has the minimal finite
degree of any polynomial in S and deg r < deg d, r(x) = 0(x) and d(x)|p(x),
as desired.

The ideal in Example 2 is principal with d(x) = x - 4 as a single
generator. For an ideal generated by two polynomials, a single generator
guaranteed by Theorem 8.13 is also a greatest common divisor.
Theorem 8.14   Any two nonzero polynomials a(x), b(x) ∈ F[x] have
a greatest common divisor (g.c.d.) d(x) satisfying

1. d(x)|a(x) and d(x)|b(x);
2. if p(x)|a(x) and p(x)|b(x), then p(x)|d(x).

Moreover there exist polynomials s(x), t(x) ∈ F[x] such that

d(x) = s(x)a(x) + t(x)b(x).

Proof   The set

S = {u(x)a(x) + v(x)b(x) | u(x), v(x) ∈ F[x]}

is an ideal in F[x] containing a(x) and b(x). Therefore, by Theorem 8.13,
there exists a nonzero polynomial d(x) ∈ S such that d(x) divides every element
of S. In particular, d(x)|a(x) and d(x)|b(x).
Since d(x) ∈ S, there exist polynomials s(x), t(x) ∈ F[x] such that d(x)
= s(x)a(x) + t(x)b(x), and it only remains to show that 2 is satisfied. But
if p(x)|a(x) and p(x)|b(x), then p(x) divides s(x)a(x) + t(x)b(x) = d(x).

Two polynomials will have many g.c.d.'s, but they have a unique monic
g.c.d. In practice a g.c.d. is most easily found if the polynomials are given
in a factored form. Thus the monic g.c.d. of a(x) = 4(x - 2)^3(x^2 + 1) and
b(x) = 2(x - 2)^2(x^2 + 1)^5 is (x - 2)^2(x^2 + 1).
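When the polynomials are not given in factored form, the monic g.c.d. can be computed without factoring, by the Euclidean algorithm: repeatedly replace the pair (a, b) by (b, remainder of a/b), using Theorem 8.7, and scale the last nonzero remainder to be monic. A sketch, with coefficients listed from the constant term upward and test polynomials of our own choosing:

```python
from fractions import Fraction

def poly_mod(a, b):
    """Remainder of a(x) / b(x); coefficient lists [c0, c1, ...]."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        coef, d = a[-1] / b[-1], len(a) - len(b)
        for i, bc in enumerate(b):
            a[i + d] -= coef * bc
        while a and a[-1] == 0:
            a.pop()
    return a

def poly_gcd(a, b):
    """Monic greatest common divisor of two nonzero polynomials."""
    while b:
        a, b = b, poly_mod(a, b)
    lead = a[-1]
    return [c / lead for c in a]

# x^2 - 4 and x^2 - 9 are relatively prime: the monic g.c.d. is 1.
assert poly_gcd([-4, 0, 1], [-9, 0, 1]) == [1]

# (x^2 - 4)(x + 1) and (x + 1)(x + 2) have g.c.d. (x + 1)(x + 2).
assert poly_gcd([-4, -4, 1, 1], [2, 3, 1]) == [2, 3, 1]
```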
Definition   Two nonzero polynomials in F[x] are relatively prime if
their only common factors in F[x] are elements of F.

Thus x^2 - 4 and x^2 - 9 are relatively prime even though each has
polynomial factors of positive degree.
Theorem 8.15   If a(x) and b(x) are relatively prime in F[x], then there
exist s(x), t(x) ∈ F[x] such that 1 = s(x)a(x) + t(x)b(x).

Proof   Since a(x) and b(x) are relatively prime, 1 is a g.c.d., and
Theorem 8.14 provides the required s(x) and t(x).
We are finally ready to define the analogue in F[x] of a prime number
and to prove the analogue of the Fundamental Theorem of Arithmetic.
Recall that this theorem states that every integer greater than 1 may be ex-
pressed as a product of primes in a unique way, up to order.

Definition   A polynomial p(x) ∈ F[x] of positive degree is irreducible
over F if p(x) is relatively prime to all nonzero polynomials in F[x] of degree
less than the degree of p(x).

That is, if deg p > 0, then p(x) is irreducible over F if its only factors
from F[x] are constant multiples of p(x) and constants from F.
Examples
1. All polynomials of degree 1 in F[x] are irreducible over F. (Why?)
2. The only irreducible polynomials in C[x] are polynomials of degree
1. (Why?)
3. ax^2 + bx + c ∈ R[x] is irreducible over R if the discriminant
b^2 - 4ac < 0.
4. x^2 + 1 is irreducible over R but not over C.
5. x^2 - 2 is irreducible over the field of rationals but not over R or C.
Theorem 8.16   Suppose a(x), b(x), p(x) ∈ F[x] with p(x) irreducible over
F. If p(x)|a(x)b(x), then either p(x)|a(x) or p(x)|b(x).

Proof   If p(x)|a(x), we are done. Therefore, suppose p(x) does not
divide a(x). Then p(x) and a(x) are relatively prime, so there exist s(x), t(x) ∈ F[x]
such that 1 = s(x)p(x) + t(x)a(x). Multiply this equation by b(x) to obtain
b(x) = (b(x)s(x))p(x) + t(x)(a(x)b(x)). Now p(x) divides the right side of this
equation, so p(x)|b(x), completing the proof.
Theorem 8.17  (Unique Factorization Theorem)  Every polynomial of positive degree in F[x] can be expressed as an element of F times a product of irreducible monic polynomials. This expression is unique except for the order of the factors.

Proof  Every polynomial in F[x] can be expressed in a unique way as ca(x) where c ∈ F and a(x) is monic. Therefore it is sufficient to consider a(x) a monic polynomial of positive degree. The existence proof uses the second principle of induction on the degree of a(x). If deg a = 1, then a(x) is irreducible, therefore an expression exists when a(x) has degree 1. So assume every monic polynomial of degree less than m has an expression in terms of irreducible monic polynomials, and suppose deg a = m, m > 1. If a(x) is irreducible over F, then it is expressed in the proper form. If a(x) is not irreducible over F, then a(x) = p(x)q(x) where p(x) and q(x) are monic polynomials of positive degree. Since deg a = deg p + deg q (see problem 2 at the end of this section), p(x) and q(x) have degrees less than m. So by the induction assumption p(x) and q(x) may be expressed as products of irreducible monic polynomials. This gives such an expression for a(x), which has degree m. Therefore, by the second principle of induction, every monic polynomial of positive degree may be expressed as a product of irreducible monic polynomials.
For uniqueness, suppose a(x) = p_1(x)p_2(x) ··· p_h(x) and a(x) = q_1(x)q_2(x) ··· q_k(x) with p_1(x), ..., p_h(x), q_1(x), ..., q_k(x) irreducible monic polynomials. p_1(x) | a(x), so p_1(x) | q_1(x) ··· q_k(x). Since p_1(x) is irreducible, an extension of Theorem 8.16 implies that p_1(x) divides one of the q's. Say p_1(x) | q_1(x); then since both are irreducible and monic, p_1(x) = q_1(x), and by the cancellation law, p_2(x) ··· p_h(x) = q_2(x) ··· q_k(x). Continuing this with an induction argument yields h = k and p_1(x), ..., p_h(x) equal to q_1(x), ..., q_k(x) in some order.
Corollary 8.18  If a(x) ∈ F[x] has positive degree, then there exist distinct irreducible (over F) monic polynomials p_1(x), ..., p_k(x), positive integers e_1, ..., e_k, and an element c ∈ F such that a(x) = c p_1^{e_1}(x) ··· p_k^{e_k}(x). This expression is unique up to the order of the polynomials p_1, ..., p_k.

Definition  If a(x) = c p_1^{e_1}(x) ··· p_k^{e_k}(x) with c ∈ F, p_1(x), ..., p_k(x) distinct irreducible monic polynomials, and e_1, ..., e_k positive integers, then e_i is the multiplicity of p_i(x), 1 ≤ i ≤ k.
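The shape c p_1^{e_1}(x) ··· p_k^{e_k}(x) of Corollary 8.18 can be observed directly with a computer algebra system. A short SymPy sketch; the example polynomial is our own, not from the text:

```python
from sympy import symbols, factor_list, expand

x = symbols('x')

# a(x) = 2(x - 1)^2 (x^2 + 1), expanded so the factorization is hidden:
a = expand(2 * (x - 1)**2 * (x**2 + 1))

c, factors = factor_list(a)   # factorization over the rationals
print(c)        # the element c of F (here 2)
print(factors)  # pairs (p_i(x), e_i): the monic irreducible factors with multiplicities
```

Here factor_list returns the constant together with the list of irreducible monic factors and their multiplicities, mirroring the statement of the corollary for F the rationals.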
Finding roots for an arbitrary polynomial from R[x] or C[x] is a difficult task. The quadratic formula provides complex roots for any second degree polynomial, and there are rather complicated formulas for third and fourth degree polynomials. But there are no formulas for general polynomials of degree larger than four. Therefore, unless roots can be found by good guesses, approximation techniques are necessary. One should not expect an arbitrary polynomial to have rational roots, let alone integral roots. In fact, for a polynomial with rational coefficients there are few rational numbers which could be roots; see problem 10.
Problems

1. Suppose p(x) ∈ F[x] satisfies p(c) = 0 for all c ∈ F. Show that if F is a finite field, then p(x) need not be 0(x). What if F were infinite?
2. Show that if a(x), b(x) ∈ F[x], then deg ab = deg a + deg b. (Take r − ∞ = −∞ for any r ∈ R.)
3. Let F be a field.
a. Show that the cancellation law for multiplication holds in F.
b. Suppose a, b ∈ F and ab = 0; show that either a = 0 or b = 0.
4. Find the quotient and remainder when a(x) is divided by b(x).
a. a(x) = 3x^4 + 5x^3 − 3x^2 + 6x + 1, b(x) = x^2 + 2x − 1.
b. a(x) = x^2 + 2x + 1, b(x) = x^3 − 3x.
c. a(x) = x^5 + x^3 − 2x^2 + x − 2, b(x) = x^2 + 1.
d. a(x) = x^4 − 2x^3 − 3x^2 + 10x − 8, b(x) = x − 2.
5. Show that the quotient and remainder given by the division algorithm for polynomials are unique.
6. Prove that any two nonzero polynomials a(x), b(x) ∈ F[x] have a least common multiple (l.c.m.) m(x) satisfying
1. a(x) | m(x) and b(x) | m(x);
2. if a(x) | p(x) and b(x) | p(x), then m(x) | p(x).
7. Find the monic g.c.d. and the monic l.c.m. of a(x) and b(x).
a. a(x) = x(x − 1)^4(x + 2)^2, b(x) = (x − 1)^3(x + 2)^4(x + 5).
b. a(x) = (x^2 − 16)(5x^3 − 5), b(x) = 5(x − 4)^3(x − 1).
8. Prove Euclid's algorithm: If a(x), b(x) ∈ F[x] with deg a ≥ deg b > 0, and the polynomials q_i(x), r_i(x) ∈ F[x] satisfy
a(x) = b(x)q_1(x) + r_1(x),   0 ≤ deg r_1 < deg b
b(x) = r_1(x)q_2(x) + r_2(x),   0 ≤ deg r_2 < deg r_1
r_1(x) = r_2(x)q_3(x) + r_3(x),   0 ≤ deg r_3 < deg r_2
  ⋮
r_{k−2}(x) = r_{k−1}(x)q_k(x) + r_k(x)
r_{k−1}(x) = r_k(x)q_{k+1}(x) + 0(x),
then r_k(x) is a g.c.d. of a(x) and b(x).
9. Use Euclid's algorithm to find a g.c.d. of
a. a(x) = 2x^5 − 2x^4 + x^3 − 4x^2 − x − 2, b(x) = x^4 − x^3 − x − 1.
b. a(x) = 2x^4 + 5x^3 − 6x^2 − 6x + 22, b(x) = x^4 + 2x^3 − 3x^2 − x + 8.
c. a(x) = 6x^4 + 7x^3 + 2x^2 + 8x + 4, b(x) = 6x^3 − x^2 + 4x + 3.
10. Suppose p(x) = a_n x^n + ··· + a_1 x + a_0 ∈ R[x] has integral coefficients. Show that if the rational number c/d, in lowest terms, is a root of p(x), then c | a_0 and d | a_n. How might this be extended to polynomials with rational coefficients?
11. Using problem 10, list all possible rational roots of
a. 2x^5 + 3x^4 − 6x + 6.   c. x^4 − 3x^3 + x − 8.
b. x^n + 1.   d. 9x^3 + x − 1.
12. Is the field of integers modulo 2, that is, the field containing only 0 and 1, algebraically closed?
13. Prove that the first principle of induction, the second principle of induction, and the well-ordering principle are equivalent properties of the positive integers.
14. Let 𝒱 be a vector space over F, U ∈ 𝒱, T ∈ Hom(𝒱), and show that {p(T)U | p(x) ∈ F[x]} is the T-cyclic subspace 𝒮(U, T).
15. Let T ∈ Hom(𝒱) and U ∈ 𝒱.
a. Show that S = {q(x) ∈ F[x] | q(T)U = 0} is an ideal in F[x].
b. Find the polynomial that generates S as a principal ideal if S ≠ {0(x)}.
16. You should be able to prove any of the following for Z, the integral domain of integers.
a. The division algorithm for integers.
b. Z is a principal ideal domain.
c. If a, b ∈ Z are relatively prime, then there exist s, t ∈ Z such that 1 = sa + tb.
d. The Fundamental Theorem of Arithmetic as stated on page 356.
17. Extend the definition of F[x] to F[x, y] and F[x, y, z], the rings of polynomials in two and three variables, respectively.
a. Show that F[x, y] is not a principal ideal domain. For example, consider the ideal generated by xy and x + y.
b. Suppose S is the ideal in R[x, y, z] generated by a(x, y, z) = x^2 + y^2 − 4 and b(x, y, z) = x^2 + y^2 − z, and let G be the graph of the intersection of a(x, y, z) = 0 and b(x, y, z) = 0. Show that S can be taken as an algebraic description of G in that every element of S is zero on G.
c. Show that the polynomials z − 4, x^2 + y^2 + (z − 4)^2 − 4, and 4x^2 + 4y^2 − z^2 are in the ideal of part b. Note that the zeros of these polynomials give the plane, a sphere, and a cone through G.
d. Give a geometric argument to show that the ideal S in part b is not principal.
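The rational root test of problem 10 is easy to mechanize: every rational root c/d (in lowest terms) of an integer polynomial must have c dividing the constant term and d dividing the leading coefficient. A Python sketch; the coefficient-list convention (constant term first) and function names are ours:

```python
from fractions import Fraction

def divisors(n):
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_root_candidates(coeffs):
    """Candidates c/d for rational roots of a_n x^n + ... + a_0 with
    integer coefficients, listed from a_0 up: c | a_0 and d | a_n."""
    a0, an = coeffs[0], coeffs[-1]
    cands = set()
    for c in divisors(a0):
        for d in divisors(an):
            cands.add(Fraction(c, d))
            cands.add(Fraction(-c, d))
    return sorted(cands)

# Problem 11a: 2x^5 + 3x^4 - 6x + 6 has a_0 = 6 and a_5 = 2.
print(rational_root_candidates([6, -6, 0, 0, 3, 2]))
```

Only these finitely many rationals can possibly be roots; each candidate can then be tested by substitution.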
3. Minimum Polynomials
Now return to the polynomial associated with the T-cyclic subspace 𝒮(U, T). Recall that if 𝒮(U, T) = ℒ{U, T(U), ..., T^{k−1}(U)} and T^k(U) + a_{k−1}T^{k−1}(U) + ··· + a_1T(U) + a_0U = 0, then this equation may be written as a(T)U = 0 where a(x) = x^k + a_{k−1}x^{k−1} + ··· + a_1x + a_0. Notice that a(x) is monic, and since 𝒮(U, T) has dimension k, there does not exist a nonzero polynomial b(x) of lower degree for which b(T)U = 0. Therefore we will call a(x) the T-minimum polynomial of U.
Theorem 8.19  If a(x) is the T-minimum polynomial of U and b(T)U = 0, then a(x) | b(x). The monic polynomial a(x) is uniquely determined by U.

Proof  By the division algorithm there exist polynomials q(x) and r(x) such that b(x) = a(x)q(x) + r(x) with deg r < deg a. But r(T)U = 0, for

r(T)U = (b(T) − q(T) ∘ a(T))U = b(T)U − q(T)(a(T)U).

If r(x) ≠ 0(x), then r(x) is a nonzero polynomial of degree less than deg a and r(T)U = 0, which contradicts the remarks preceding this theorem. Therefore a(x) divides b(x). This also gives the uniqueness of a(x), for if deg b = deg a, then they differ by a constant multiple. Since a(x) is monic, it is uniquely determined by U.
The T-minimum polynomial of U determines a map that sends the entire subspace 𝒮(U, T) to 0. There is a corresponding polynomial for the vector space that plays the central role in obtaining a canonical form under similarity. We see that there is a polynomial in T that maps all of 𝒱 to 0 as follows: If dim 𝒱 = n, then dim Hom(𝒱) = n^2 and the set {I, T, T^2, ..., T^{n^2}} is linearly dependent. That is, there is a linear combination of these maps (a polynomial in T) that is the zero map on 𝒱.

Theorem 8.20  Given a nonzero map T ∈ Hom(𝒱), there exists a unique monic polynomial m(x) such that m(T) = 0_{Hom(𝒱)}, and if p(T) = 0_{Hom(𝒱)} for some polynomial p(x), then m(x) | p(x).
Proof  By the above remarks there exists a polynomial p(x) such that deg p ≤ n^2 and p(T) = 0. Therefore the set

B = {k | there exists p(x) ∈ F[x], deg p = k > 0, and p(T) = 0}

is nonempty, and since the positive integers are well ordered, B has a least element. Let m(x) be a monic polynomial of least positive degree such that m(T) = 0. This proof is completed in the same manner as the corresponding properties were obtained for the T-minimum polynomial of U.

Is it clear that the symbol 0 in this proof denotes quite a different vector from 0 in the preceding proof?
Definition  The minimum polynomial m(x) of a nonzero map T ∈ Hom(𝒱) is the unique monic polynomial of lowest positive degree such that m(T) = 0_{Hom(𝒱)}.

If the minimum polynomial of T is the key to obtaining a canonical form under similarity, then a method of computation is required. Fortunately it is not necessary to search through all n^2 + 1 maps I, T, ..., T^{n^2} for a dependency. We will find that the degree of the minimum polynomial of T cannot exceed the dimension of 𝒱, Corollary 8.28, page 378. However, this does leave n + 1 maps or matrices to consider.
Example 1  Find the minimum polynomial of T ∈ Hom(𝒱_3) if the matrix of T with respect to {E_i} is

A = (  1  3  2
      −1  1 −2
       0 −1  1 ).

Multiplication gives

A^2 = ( −2  4 −2              A^3 = ( −6  0 −14
        −2  0 −6     and             −2  0 −10
         1 −2  3 )                    3 −2   9 ).

It is clear that the sets {I_3, A} and {I_3, A, A^2} are linearly independent, and a little computation shows that {I_3, A, A^2, A^3} is linearly dependent with A^3 = 3A^2 − 4A + 4I_3. Therefore {I, T, T^2} is linearly independent and T^3 − 3T^2 + 4T − 4I = 0, so the minimum polynomial of T is x^3 − 3x^2 + 4x − 4.
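The dependency claimed in Example 1 can be checked mechanically. A NumPy sketch, assuming the matrix A of Example 1 reads as below (the entries are faint in this copy, but they reproduce the A^2 and A^3 displays of the example):

```python
import numpy as np

# Assumed reading of the matrix of Example 1:
A = np.array([[1, 3, 2],
              [-1, 1, -2],
              [0, -1, 1]])
I3 = np.eye(3, dtype=int)
A2 = A @ A
A3 = A @ A2

# {I, A, A^2} is independent, while A^3 = 3A^2 - 4A + 4I,
# giving minimum polynomial x^3 - 3x^2 + 4x - 4.
print(np.array_equal(A3, 3 * A2 - 4 * A + 4 * I3))  # True
```

The same search could be automated for any matrix by testing, for each k, whether A^k lies in the span of the lower powers.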
Another approach to finding the minimum polynomial of a map is given in the next two theorems.

Theorem 8.21  Suppose T ∈ Hom(𝒱) and U ∈ 𝒱. If m(x) is the minimum polynomial of T and a(x) is the T-minimum polynomial of U, then a(x) | m(x).

Proof  By the division algorithm there exist polynomials q(x) and r(x) such that m(x) = a(x)q(x) + r(x) with deg r < deg a. Since m(T)U = a(T)U = 0 and r(T) = m(T) − q(T) ∘ a(T), we have r(T)U = 0. But a(x) is the T-minimum polynomial of U and deg r < deg a. Therefore r(x) = 0(x) and a(x) | m(x).
Theorem 8.22  Let {U_1, ..., U_n} be a basis for 𝒱 and T ∈ Hom(𝒱). If a_1(x), ..., a_n(x) are the T-minimum polynomials of U_1, ..., U_n, respectively, then the minimum polynomial of T is the monic least common multiple of a_1(x), ..., a_n(x).

Proof  Suppose m(x) is the minimum polynomial of T and s(x) is the monic l.c.m. of a_1(x), ..., a_n(x). Each polynomial a_i(x) divides s(x), so s(T)U_i = 0 for 1 ≤ i ≤ n. Therefore s(T) = 0 and m(x) | s(x).

To show that m(x) = s(x), suppose p_1(x), ..., p_k(x) are the distinct irreducible monic factors of the polynomials a_1(x), ..., a_n(x). Then

a_j(x) = p_1^{e_{1j}}(x) ··· p_k^{e_{kj}}(x)   for 1 ≤ j ≤ n,

with e_{ij} = 0 allowed. Since s(x) is the l.c.m. of a_1(x), ..., a_n(x),

s(x) = p_1^{e_1}(x) ··· p_k^{e_k}(x),

where e_i = max{e_{i1}, e_{i2}, ..., e_{in}}. Now m(x) | s(x) implies that there exist integers f_1, ..., f_k such that

m(x) = p_1^{f_1}(x)p_2^{f_2}(x) ··· p_k^{f_k}(x).

m(x) divides s(x) and both are monic, so if m(x) ≠ s(x), then f_h < e_h for some index h, and f_h < e_{hj} for some index j. But then a_j(x) ∤ m(x), contradicting Theorem 8.21. Therefore m(x) = s(x) as claimed.
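Theorem 8.22 reduces finding m(x) to a polynomial l.c.m. computation. A short SymPy sketch; the three T-minimum polynomials are hypothetical inputs, not from the text:

```python
from sympy import symbols, lcm

x = symbols('x')

# Hypothetical T-minimum polynomials of three basis vectors:
a1 = (x - 1)**2
a2 = (x - 1) * (x + 2)
a3 = x + 2

# By Theorem 8.22, the minimum polynomial is their monic l.c.m.
m = lcm(lcm(a1, a2), a3)
print(m.factor())   # factors as (x - 1)^2 (x + 2)
```

Each irreducible factor appears in the l.c.m. with the largest multiplicity it has in any a_i(x), exactly as in the proof above.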
Example 2  Use these theorems to find the minimum polynomial of the map in Example 1.

Consider the basis vector E_1: E_1 = (1, 0, 0), T(E_1) = (1, −1, 0), and T^2(E_1) = (−2, −2, 1) are linearly independent. T^3(E_1) = (−6, −2, 3) and T^3(E_1) = 3T^2(E_1) − 4T(E_1) + 4E_1. Therefore the T-minimum polynomial of E_1 is x^3 − 3x^2 + 4x − 4. This polynomial divides the minimum polynomial of T, m(x), and deg m ≤ 3; therefore m(x) = x^3 − 3x^2 + 4x − 4.
Example 3  Find the minimum polynomial of T ∈ Hom(𝒱_3) if the matrix of T with respect to {E_i} is

(  3  2 −1
  −2 −2  2
   3  6 −1 ).

E_1 = (1, 0, 0), T(E_1) = (3, −2, 3), T^2(E_1) = (2, 4, −6), and T^2(E_1) = −2T(E_1) + 8E_1. Since E_1 and T(E_1) are linearly independent, the T-minimum polynomial of E_1 is x^2 + 2x − 8. Further computation shows that E_2 and E_3 also have x^2 + 2x − 8 as their T-minimum polynomial, so this is the minimum polynomial of T.
The map T in Example 3 was considered in Example 4, page 229. In that example we found that the characteristic polynomial of T is −(x − 2)^2(x + 4). Since the minimum polynomial of T is (x − 2)(x + 4), the minimum polynomial of a map need not equal its characteristic polynomial. However, the minimum polynomial of a map always divides its characteristic polynomial. This is a consequence of the Cayley-Hamilton theorem, Theorem 8.31, page 380.
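The relation between the two polynomials can be seen concretely. A NumPy sketch, assuming the matrix of Example 3 reads as below (it is hard to make out in this copy, but it is consistent with T(E_1) = (3, −2, 3) and T^2(E_1) = (2, 4, −6)):

```python
import numpy as np

# Assumed reading of the matrix of Example 3:
A = np.array([[3, 2, -1],
              [-2, -2, 2],
              [3, 6, -1]])
I3 = np.eye(3, dtype=int)

# The characteristic polynomial -(x - 2)^2(x + 4) annihilates A
# (Cayley-Hamilton), but its divisor (x - 2)(x + 4) already does:
print(np.array_equal((A - 2 * I3) @ (A + 4 * I3), np.zeros((3, 3))))  # True
# Neither linear factor alone annihilates A, e.g. x - 2 does not:
print(np.array_equal(A - 2 * I3, np.zeros((3, 3))))  # False
```

So (x − 2)(x + 4) is the minimum polynomial: it annihilates T while no proper monic divisor of it does.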
Recall that our approach to obtaining canonical forms under similarity relies on finding a direct sum decomposition of 𝒱 into T-invariant subspaces. This decomposition is accomplished in two steps, with the first given by a factorization of the minimum polynomial of T.

Theorem 8.23  (Primary Decomposition Theorem)  Suppose m(x) is the minimum polynomial of T ∈ Hom(𝒱) and m(x) = p_1^{e_1}(x) ··· p_k^{e_k}(x), where p_1(x), ..., p_k(x) are the distinct irreducible monic factors of m(x) with multiplicities e_1, ..., e_k, respectively. Then

𝒱 = 𝒩_{p_1^{e_1}(T)} ⊕ ··· ⊕ 𝒩_{p_k^{e_k}(T)}.
Proof  There are two parts to the proof: (1) show that the sum of the null spaces equals 𝒱, and (2) show that the sum is direct.

Define q_i(x) for 1 ≤ i ≤ k by m(x) = p_i^{e_i}(x)q_i(x). Then for any U ∈ 𝒱, 0 = m(T)U = p_i^{e_i}(T)(q_i(T)U); therefore q_i(T)U ∈ 𝒩_{p_i^{e_i}(T)}. This fact is used to obtain (1) as follows: The monic g.c.d. of q_1(x), ..., q_k(x) is 1, so by an extension of Theorem 8.15 there exist polynomials s_1(x), ..., s_k(x) such that

1 = s_1(x)q_1(x) + ··· + s_k(x)q_k(x).

Therefore

I = s_1(T) ∘ q_1(T) + ··· + s_k(T) ∘ q_k(T),

and applying this map to the vector U yields

U = I(U) = s_1(T)(q_1(T)U) + ··· + s_k(T)(q_k(T)U),

which establishes (1).

For (2), suppose W_1 + ··· + W_k = 0 with W_i ∈ 𝒩_{p_i^{e_i}(T)} for 1 ≤ i ≤ k. q_i(x) and p_i^{e_i}(x) are relatively prime, so there exist s(x) and t(x) such that 1 = s(x)q_i(x) + t(x)p_i^{e_i}(x). Applying the map

I = s(T) ∘ q_i(T) + t(T) ∘ p_i^{e_i}(T)

to W_i ∈ 𝒩_{p_i^{e_i}(T)} yields

W_i = s(T) ∘ q_i(T)W_i + t(T) ∘ p_i^{e_i}(T)W_i = s(T) ∘ q_i(T)W_i + 0.

But p_j^{e_j}(x) | q_i(x) if j ≠ i, so q_i(T)W_j = 0 if i ≠ j, and we have

0 = q_i(T)0 = q_i(T)(W_1 + ··· + W_k) = q_i(T)W_i.

Therefore W_i = s(T)(q_i(T)W_i) = s(T)0 = 0, proving that the sum is direct and completing the proof of the theorem.
It should be clear that none of the direct summands 𝒩_{p_i^{e_i}(T)} in this decomposition can be the zero subspace.

This decomposition theorem provides a diagonal block matrix for T as follows. Suppose a basis B_i is chosen for each null space 𝒩_{p_i^{e_i}(T)}, and these bases are combined in order to give a basis B for 𝒱. Then the matrix of T with respect to B is diag(A_1, ..., A_k), where A_i is the matrix of the restriction

T: 𝒩_{p_i^{e_i}(T)} → 𝒩_{p_i^{e_i}(T)}

with respect to the basis B_i. In general this cannot be regarded as a canonical form because no invariant procedure is given for choosing the bases B_i. The second decomposition theorem will provide such a choice for each B_i. However, there is one case for which choosing the bases presents no problems, namely, when the irreducible factors are all linear and of multiplicity 1.
Suppose the minimum polynomial of T factors to p_1(x) ··· p_k(x) with p_i(x) = x − λ_i, 1 ≤ i ≤ k, and λ_1, ..., λ_k distinct elements from the field F. Then the null space 𝒩_{p_i(T)} is the characteristic space 𝒮(λ_i). Therefore if B_i is any basis for 𝒩_{p_i(T)}, then the block A_i is diag(λ_i, ..., λ_i), and the diagonal block matrix diag(A_1, ..., A_k) is a diagonal matrix. This establishes half of the following theorem.
Theorem 8.24  T ∈ Hom(𝒱) is diagonalizable if and only if the minimum polynomial of T factors into linear factors, each of multiplicity 1.

Proof  (⇒) If T is diagonalizable, then 𝒱 has a basis B = {U_1, ..., U_n} of characteristic vectors for T. Let λ_1, ..., λ_k be the distinct characteristic values for T and consider the polynomial

q(x) = (x − λ_1) ··· (x − λ_k).

Since each vector U_j ∈ B is a characteristic vector, q(T)U_j = 0. (Multiplication of polynomials is commutative, thus the factor involving the characteristic value for U_j can be written last in q(T).) Therefore q(T) = 0 and the minimum polynomial of T divides q(x). But this implies the minimum polynomial has only linear factors of multiplicity 1.

(⇐) This part of the proof was obtained above.
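Theorem 8.24 gives a computable diagonalizability test once m(x) is known: check whether m(x) has a repeated irreducible factor. Over a field of characteristic 0 this is equivalent to gcd(m, m′) being nonconstant. A SymPy sketch (the helper name is ours):

```python
from sympy import symbols, gcd, diff

x = symbols('x')

def has_repeated_factor(m):
    """In characteristic 0, m(x) has an irreducible factor of
    multiplicity greater than 1 exactly when gcd(m, m') is nonconstant."""
    return gcd(m, diff(m, x)).has(x)

print(has_repeated_factor((x - 2) * (x + 4)))     # False: such a T is diagonalizable
print(has_repeated_factor((x - 2)**2 * (x - 1)))  # True: such a T is not
```

The two sample polynomials are the minimum polynomials of Example 3 and Example 4 of this section, respectively.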
Example 4  Suppose the matrix of T ∈ Hom(𝒱_3) with respect to {E_i} is

[3 × 3 matrix, not legible in this copy; it is the matrix of the map treated in Example 5, page 230]
Determine if T is diagonalizable by examining the minimum polynomial of T.

Choosing E_1 arbitrarily, we find that the T-minimum polynomial of E_1 is x^3 − 5x^2 + 8x − 4 = (x − 2)^2(x − 1). Since this polynomial divides the minimum polynomial of T, the minimum polynomial of T contains a linear factor with multiplicity greater than 1. Therefore T is not diagonalizable. The same conclusion was obtained in Example 5, page 230.
Problems

1. Suppose the matrix of T ∈ Hom(𝒱_3) with respect to {E_i} is

[3 × 3 matrix, not legible in this copy]

Find the T-minimum polynomial of the following vectors.
a. (1, 0, 0).   b. (1, 0, −2).   c. (−3, 1, 3).   d. (1, −1, −1).
2. Suppose T ∈ Hom(𝒱_2) is given by T(a, b) = (3a − 2b, 7a − 4b).
a. Find the T-minimum polynomial of (1, 0) and (0, 1).
b. Show that every nonzero vector in 𝒱_2 has the same T-minimum polynomial.
c. If T is viewed as a map in Hom(𝒱_2(C)), does every nonzero vector have the same T-minimum polynomial?
3. Find the minimum polynomial for T ∈ Hom(𝒱_n) given the matrix for T with respect to the standard basis for 𝒱_n.

[seven small matrices, a.–g., not legible in this copy]
4. Use the minimum polynomial to determine which of the maps in problem 3 are diagonalizable.
5. For each map T in problem 3, find the direct sum decomposition of 𝒱_n given by the primary decomposition theorem.
6. Let T be the map in problem 3e; then the decomposition of 𝒱_3 obtained in problem 5 is of the form 𝒮(U_1, T) ⊕ 𝒮(U_2, T) for some U_1, U_2 ∈ 𝒱_3. Find the matrix of T with respect to the basis {U_1, T(U_1), U_2}.
7. Suppose dim 𝒱 = n, T ∈ Hom(𝒱), and the minimum polynomial of T is irreducible over F and of degree n.
a. Show that if U ∈ 𝒱, U ≠ 0, then 𝒱 = 𝒮(U, T).
b. Find the matrix of T with respect to the basis {U, T(U), ..., T^{n−1}(U)}.
c. What is this matrix if 𝒱 = 𝒱_2 and the minimum polynomial of T is x^2 − 5x + 7?
8. For the map T of Example 4, page 365, the T-minimum polynomial of E_1 is (x − 2)^2(x − 1). Use this fact to obtain a vector from E_1 which has as its T-minimum polynomial
a. (x − 2)^2.   b. x − 1.   c. x − 2.
9. Suppose T ∈ Hom(𝒱_n(C)) and every nonzero vector in 𝒱_n(C) has the same T-minimum polynomial. Show that this implies that T = zI for some z ∈ C.
10. Suppose T ∈ Hom(𝒱) and the T-minimum polynomial of U is irreducible over F. Prove that 𝒮(U, T) cannot be decomposed into a direct sum of nonzero T-invariant subspaces.
11. Let T ∈ Hom(𝒱) and suppose p_i(x) is the T-minimum polynomial of U_i, U_i ∈ 𝒱, for i = 1, 2. Prove that p_1(x) and p_2(x) are relatively prime if and only if p_1(x)p_2(x) is the T-minimum polynomial of U_1 + U_2.
12. A map T ∈ Hom(𝒱) is a projection (idempotent) if T^2 = T.
a. Find the three possible minimum polynomials for a projection.
b. Show that every projection is diagonalizable.
13. Show that T ∈ Hom(𝒱) is a projection if and only if there exist subspaces 𝒮 and 𝒯 such that 𝒱 = 𝒮 ⊕ 𝒯 and T(V) = U when V = U + W with U ∈ 𝒮, W ∈ 𝒯.
14. Show that 𝒱 = 𝒮_1 ⊕ ··· ⊕ 𝒮_k if and only if there exist k projection maps T_1, ..., T_k such that
i. T_i(𝒱) = 𝒮_i.
ii. T_i ∘ T_j = 0 when i ≠ j.
iii. I = T_1 + ··· + T_k.
15. Prove the primary decomposition theorem using the result of problem 14.
4. Elementary Divisors
Given T ∈ Hom(𝒱), the primary decomposition theorem expresses 𝒱 as a direct sum of the null spaces 𝒩_{p_i^{e_i}(T)}, where p_i(x) is an irreducible factor of the minimum polynomial of T with multiplicity e_i. The next problem is to decompose each of these null spaces into a direct sum of T-invariant subspaces in some well-defined way. There are several approaches to this problem, but none of them can be regarded as simple. The problem is somewhat simplified by assuming the field F is algebraically closed. In that case each p_i(x) is linear, and the most useful canonical form is obtained at once. However, this approach does not determine canonical forms for matrices over an arbitrary field, such as R, and so will not be followed here. Instead we will choose a sequence of T-cyclic subspaces for each null space in such a way that each subspace is of maximal dimension.
A decomposition must be found for each of the null spaces in the primary decomposition theorem. Therefore it will be sufficient to begin by assuming that the minimum polynomial of T ∈ Hom(𝒱) is p^e(x) with p(x) irreducible over F. Then 𝒩_{p^e(T)} = 𝒱. It is easy to show that 𝒱 contains a T-cyclic subspace of maximal dimension. Since p^e(x) is the minimum polynomial of T, there exists a vector U_1 ∈ 𝒱 such that p^{e−1}(T)U_1 ≠ 0. Therefore, the T-minimum polynomial of U_1 is p^e(x). (Why?) Suppose the degree of p(x) is b; then deg p^e = eb and dim 𝒮(U_1, T) = eb. Since no element of 𝒱 can have a T-minimum polynomial of degree greater than eb, 𝒮(U_1, T) is a T-cyclic subspace of maximal dimension in 𝒱. If 𝒱 = 𝒮(U_1, T), then we have obtained the desired decomposition.
Example 1  Suppose

[3 × 3 matrix, not legible in this copy]

is the matrix of T ∈ Hom(𝒱_3) with respect to the standard basis. The minimum polynomial of T is found to be x^3 − 6x^2 + 12x − 8 = (x − 2)^3, which is of the form p^e(x) with p(x) = x − 2 and e = 3. The T-minimum polynomial of E_1 is also (x − 2)^3, so we may take U_1 = (1, 0, 0). Then dim 𝒮(U_1, T) = 3·1 = 3 and 𝒮(U_1, T) = 𝒱_3. Although T is not diagonalizable, the matrix of T with respect to the basis {U_1, T(U_1), T^2(U_1)} has the simple form

( 0  0   8
  1  0 −12
  0  1   6 ).

Notice that the elements in the last column come from the minimum polynomial of T, for

T^3(U_1) = 6T^2(U_1) − 12T(U_1) + 8U_1.

Therefore this matrix for T does not depend on the choice of U_1, provided the T-minimum polynomial of U_1 is (x − 2)^3.
Definition  The companion matrix of the monic polynomial q(x) = x^h + a_{h−1}x^{h−1} + ··· + a_1x + a_0 is the h × h matrix

C(q) = ( 0 0 ··· 0 0   −a_0
         1 0 ··· 0 0   −a_1
         0 1 ··· 0 0   −a_2
         ⋮              ⋮
         0 0 ··· 1 0   −a_{h−2}
         0 0 ··· 0 1   −a_{h−1} ).

That is, the companion matrix of q(x) has 1's on the "subdiagonal" below the main diagonal and negatives of coefficients from q(x) in the last column.
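The definition above translates directly into code. A small Python sketch (the coefficient-list convention, constant term first, is ours):

```python
def companion(coeffs):
    """Companion matrix C(q) of the monic polynomial
    q(x) = x^h + a_{h-1}x^{h-1} + ... + a_1 x + a_0,
    with coefficients supplied as [a_0, a_1, ..., a_{h-1}]."""
    h = len(coeffs)
    C = [[0] * h for _ in range(h)]
    for i in range(1, h):
        C[i][i - 1] = 1              # 1's on the subdiagonal
    for i in range(h):
        C[i][h - 1] = -coeffs[i]     # negated coefficients in the last column
    return C

# q(x) = x^3 - 6x^2 + 12x - 8 = (x - 2)^3, as in Example 1 above:
print(companion([-8, 12, -6]))  # [[0, 0, 8], [1, 0, -12], [0, 1, 6]]
```

This reproduces the matrix displayed in Example 1 for the basis {U_1, T(U_1), T^2(U_1)}.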
If the minimum polynomial of T ∈ Hom(𝒱) is p(x) and deg p = dim 𝒱, then the companion matrix of p(x) is the matrix of T with respect to a basis for 𝒱. In this case C(p) can be taken as the canonical form for the matrices of T. However, it is not always so simple. For example, if T ∈ Hom(𝒱_3) has

( −2 −4  2
   2  4 −1
  −4 −4  4 )

as its matrix with respect to {E_i}, then the minimum polynomial of T is (x − 2)^2. Therefore T is not diagonalizable, and if U_1 is chosen as above, the dimension of 𝒮(U_1, T) is only 2. So a T-cyclic subspace of maximal dimension in 𝒱_3 is a proper subspace of 𝒱_3.
Returning to the general case, suppose 𝒮(U_1, T) ≠ 𝒱. Then we must find a vector U_2 that generates a T-cyclic subspace of maximal dimension satisfying 𝒮(U_1, T) ∩ 𝒮(U_2, T) = {0}. The first step in searching for such a vector is to observe that if V ∉ 𝒮(U_1, T), then when k is sufficiently large, p^k(T)V ∈ 𝒮(U_1, T). In particular, p^e(T)V ∈ 𝒮(U_1, T). Thus for each V ∉ 𝒮(U_1, T), there exists an integer k, 1 ≤ k ≤ e, such that p^k(T)V ∈ 𝒮(U_1, T) and p^{k−1}(T)V ∉ 𝒮(U_1, T). Let r_2 be the largest of all such integers k. Then 1 ≤ r_2 ≤ e and there exists a vector W ∈ 𝒱 such that p^{r_2}(T)W ∈ 𝒮(U_1, T) and p^{r_2−1}(T)W ∉ 𝒮(U_1, T). If p^{r_2}(T)W = 0, then we may set U_2 = W and we will be able to show that the sum 𝒮(U_1, T) + 𝒮(U_2, T) is direct. However, if p^{r_2}(T)W ≠ 0, then we must subtract a vector from W to obtain a choice for U_2. To find this vector, notice that p^{r_2}(T)W ∈ 𝒮(U_1, T) means that there exists a polynomial f(x) ∈ F[x] such that

p^{r_2}(T)W = f(T)U_1.   (1)
(See problem 14, page 359.) Applying the map p^{e−r_2}(T) to Eq. (1) gives

0 = p^e(T)W = p^{e−r_2}(T) ∘ f(T)U_1.

Since p^e(x) is the T-minimum polynomial of U_1, p^e(x) | p^{e−r_2}(x)f(x), and therefore p^{r_2}(x) | f(x). Suppose f(x) = p^{r_2}(x)q(x); then Eq. (1) becomes

p^{r_2}(T)W = f(T)U_1 = p^{r_2}(T) ∘ q(T)U_1.

Therefore if we set U_2 = W − q(T)U_1, then

p^{r_2}(T)U_2 = p^{r_2}(T)W − p^{r_2}(T) ∘ q(T)U_1 = 0

as desired, and it is only necessary to check that p^{r_2−1}(T)U_2 ∉ 𝒮(U_1, T). But

p^{r_2−1}(T)U_2 = p^{r_2−1}(T)(W − q(T)U_1) = p^{r_2−1}(T)W − p^{r_2−1}(T) ∘ q(T)U_1.

So if p^{r_2−1}(T)U_2 ∈ 𝒮(U_1, T), then so is p^{r_2−1}(T)W, contradicting the choice of W. Therefore there exists a vector U_2 with minimum polynomial p^{r_2}(x), 1 ≤ r_2 ≤ e, such that p^{r_2−1}(T)U_2 ∉ 𝒮(U_1, T) and r_2 is the largest integer such that p^{r_2−1}(T)V ∉ 𝒮(U_1, T) for some V ∈ 𝒱.
It remains to be shown that 𝒮(U_1, T) ∩ 𝒮(U_2, T) = {0}. Therefore suppose V ≠ 0 is in the intersection. Then V = g(T)U_2 for some g(x) ∈ F[x]. Suppose the g.c.d. of g(x) and p^{r_2}(x) is p^a(x). Then there exist polynomials s(x) and t(x) such that

p^a(x) = s(x)g(x) + t(x)p^{r_2}(x).

Replacing x by T and applying the resulting map to U_2 yields

p^a(T)U_2 = s(T) ∘ g(T)U_2 + t(T) ∘ p^{r_2}(T)U_2 = s(T)V + 0.

Since V is in the intersection, p^a(T)U_2 ∈ 𝒮(U_1, T) and a ≥ r_2. But a < r_2, for g(T)U_2 = V ≠ 0 implies that p^{r_2}(x), the T-minimum polynomial of U_2, does not divide g(x). This contradiction shows that the intersection contains only 0, and so the sum 𝒮(U_1, T) + 𝒮(U_2, T) is direct. If 𝒱 = 𝒮(U_1, T) ⊕ 𝒮(U_2, T), we have obtained the desired decomposition; otherwise this procedure may be continued to prove the following.
Theorem 8.25  (Secondary Decomposition Theorem)  Suppose the minimum polynomial of T ∈ Hom(𝒱) is p^e(x) where p(x) is irreducible over F. Then there exist h vectors U_1, ..., U_h such that

𝒱 = 𝒮(U_1, T) ⊕ 𝒮(U_2, T) ⊕ ··· ⊕ 𝒮(U_h, T)

and 𝒮(U_i, T) is a T-cyclic subspace of maximal dimension contained in 𝒮(U_i, T) ⊕ 𝒮(U_{i+1}, T) ⊕ ··· ⊕ 𝒮(U_h, T), 1 ≤ i < h. Further, if p^{r_1}(x), ..., p^{r_h}(x) are the T-minimum polynomials of U_1, ..., U_h, respectively, then e = r_1 ≥ ··· ≥ r_h ≥ 1.
It is too soon to claim that we have a method for choosing a canonical form under similarity, since the decomposition given by Theorem 8.25 is not unique. But we will show that the T-minimum polynomials p^{r_1}(x), ..., p^{r_h}(x) are uniquely determined by T and thereby obtain a canonical form. Both these points are illustrated in Example 2 below. The candidate for a canonical form is the diagonal block matrix given by this decomposition. Call {U_i, T(U_i), ..., T^{r_i b − 1}(U_i)} a cyclic basis for the T-cyclic space 𝒮(U_i, T), 1 ≤ i ≤ h, where deg p = b. Then the restriction of T to 𝒮(U_i, T) has the companion matrix C(p^{r_i}) as its matrix with respect to this cyclic basis. Therefore, if we build a cyclic basis B for 𝒱 by combining these h cyclic bases in order, then the matrix of T with respect to B is diag(C(p^{r_1}), C(p^{r_2}), ..., C(p^{r_h})).
Example 2  If

(  6 −9  6
  −1  6 −2
  −3  9 −3 )

is the matrix of T ∈ Hom(𝒱_3) with respect to {E_i}, then (x − 3)^2 is the minimum polynomial of T. Find the matrix of T with respect to a cyclic basis for 𝒱_3 given by the secondary decomposition theorem.

The degree of (x − 3)^2 is 2, so U_1 must be a vector that generates a T-cyclic subspace of dimension 2. Clearly any of the standard basis vectors could be used. (Why?) If we take U_1 = (1, 0, 0), then 𝒮(U_1, T) = ℒ{(1, 0, 0), (6, −1, −3)}. Since the dimension of 𝒱_3 is 3, U_2 must generate a 1-dimensional T-cyclic subspace. Therefore U_2 can be any characteristic vector for the characteristic value 3 which is not in 𝒮(U_1, T). The characteristic space is

𝒮(3) = {(a, b, c) | a − 3b + 2c = 0}

and 𝒮(3) ∩ 𝒮(U_1, T) = ℒ{(3, −1, −3)}. Therefore we may set U_2 = (3, 1, 0)
and

𝒱_3 = 𝒮((1, 0, 0), T) ⊕ 𝒮((3, 1, 0), T)

is a decomposition of 𝒱_3 given by the secondary decomposition theorem.

This decomposition determines the cyclic basis B = {U_1, T(U_1), U_2}. Now the T-minimum polynomial of U_1 is (x − 3)^2, so the restriction of T to 𝒮(U_1, T) has C((x − 3)^2) as its matrix with respect to {U_1, T(U_1)}. And the T-minimum polynomial of U_2 is x − 3, so C(x − 3) is the matrix of T on 𝒮(U_2, T). Therefore the matrix of T on 𝒱_3 with respect to the cyclic basis B is

diag(C((x − 3)^2), C(x − 3)) = ( 0 −9 0
                                 1  6 0
                                 0  0 3 ).
Note that almost any vector in 𝒱_3 could have been chosen for U_1 in this example. That is, U_1 could have been any vector not in the 2-dimensional characteristic space 𝒮(3). And then U_2 could have been any vector in 𝒮(3) not on a particular line. So the vectors U_1 and U_2 are far from uniquely determined. However, no matter how they are chosen, their T-minimum polynomials will always be (x − 3)^2 and x − 3.
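The diagonal block matrix diag(C(p^{r_1}), ..., C(p^{r_h})) can be assembled mechanically once the elementary divisors are known. A NumPy sketch (the helper names are ours), reproducing the matrix found in Example 2:

```python
import numpy as np

def companion(coeffs):
    """C(q) for monic q(x), coefficients given as [a_0, ..., a_{h-1}]."""
    h = len(coeffs)
    C = np.zeros((h, h), dtype=int)
    if h > 1:
        C[1:, :-1] = np.eye(h - 1, dtype=int)   # subdiagonal of 1's
    C[:, -1] = [-a for a in coeffs]             # negated coefficients
    return C

def block_diag(blocks):
    """Assemble diag(B_1, ..., B_m) from square integer blocks."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n), dtype=int)
    i = 0
    for b in blocks:
        k = b.shape[0]
        out[i:i + k, i:i + k] = b
        i += k
    return out

# Example 2: p(x) = x - 3 with r1 = 2, r2 = 1.
# (x - 3)^2 = x^2 - 6x + 9 has coefficient list [9, -6]; x - 3 has [-3].
M = block_diag([companion([9, -6]), companion([-3])])
print(M)
```

The printed matrix agrees with diag(C((x − 3)^2), C(x − 3)) computed in the example.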
The secondary decomposition theorem provides the basis for the claim made in the last section that the degree of the minimum polynomial for T ∈ Hom(𝒱) cannot exceed the dimension of 𝒱. For if p^e(x) is the minimum polynomial of T, then

dim 𝒱 ≥ dim 𝒮(U_1, T) = deg p^{r_1} = deg p^e.

In Example 2, p^e(x) = (x − 3)^2 has degree 2 and 𝒱 = 𝒱_3 has dimension 3.
Example 3  If the matrix of T ∈ Hom(𝒱_5) is

(  14   8 −1 −6  2
  −12  −4  2  8 −1
    8  −2  0 −9  0
    8   8  0  0  2
   −8  −4  0  4  0 )

with respect to {E_i}, then the minimum polynomial of T is p^3(x) = (x − 2)^3. Find a direct sum decomposition for 𝒱_5 following the procedure given by the secondary decomposition theorem, and find the corresponding diagonal block matrix.
U_1 must be a vector such that p^2(T)U_1 ≠ 0. Choosing E_1 arbitrarily (the definition of T does not suggest a choice for U_1), we find that

p^2(T)E_1 = (T^2 − 4T + 4I)E_1 = (−24, 16, 32, −32, 0) ≠ 0.

Therefore we set U_1 = E_1 and obtain

𝒮(U_1, T) = ℒ{(1, 0, 0, 0, 0), (14, −12, 8, 8, −8), (28, −32, 64, 0, −32)}
          = ℒ{(1, 0, 0, 0, 0), (0, 1, 2, −2, 0), (0, 0, −4, 2, 1)}.
In looking for a vector W, dimensional restrictions leave only two possibilities; either p(T)V ∈ 𝒮(U_1, T) for all V ∉ 𝒮(U_1, T), or there exists a vector V such that p(T)V ∉ 𝒮(U_1, T). Again we arbitrarily choose a vector not in 𝒮(U_1, T), say E_2. Then

p(T)E_2 = (T − 2I)E_2 = (8, −6, −2, 8, −4),

and it is easily shown that p(T)E_2 ∉ 𝒮(U_1, T). Therefore set W = E_2 and r_2 = 2. Since p^2(T)W = (−6, 4, 8, −8, 0) is not 0, we cannot let U_2 equal W. [Note that p^2(T)W = ¼T^2(U_1) − T(U_1) + U_1, showing that p^2(T)W ∈ 𝒮(U_1, T) as expected. Is it clear that r_2 = 2 is maximal? What would p^2(T)W ∉ 𝒮(U_1, T) and r_2 ≥ 3 imply about the dimension of 𝒱_5?] Still following the proof for the secondary decomposition theorem, it is now necessary to subtract an element of 𝒮(U_1, T) from W. Equation (1) in this case is p^2(T)W = f(T)U_1 with f(x) = ¼x^2 − x + 1. Therefore the polynomial q(x) satisfying f(x) = p^{r_2}(x)q(x) is the constant polynomial q(x) = ¼, and we should take

U_2 = W − q(T)U_1 = E_2 − ¼E_1 = (−¼, 1, 0, 0, 0).
Then since dim 𝒮(U_1, T) = 3, dim 𝒮(U_2, T) = 2, and dim 𝒱_5 = 5, we have

𝒱_5 = 𝒮((1, 0, 0, 0, 0), T) ⊕ 𝒮((−¼, 1, 0, 0, 0), T).

The cyclic basis given by this decomposition is B = {U_1, T(U_1), T^2(U_1), U_2, T(U_2)} and the matrix of T with respect to B is

diag(C(p^3), C(p^2)) = ( 0 0   8 0  0
                         1 0 −12 0  0
                         0 1   6 0  0
                         0 0   0 0 −4
                         0 0   0 1  4 ).
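The arithmetic of Example 3 is easy to verify numerically; note in particular that the constant q(x) must be ¼ for p^2(T)U_2 to vanish. A NumPy check, scaled by 4 so everything stays in integers:

```python
import numpy as np

# The 5x5 matrix of Example 3:
A = np.array([[14, 8, -1, -6, 2],
              [-12, -4, 2, 8, -1],
              [8, -2, 0, -9, 0],
              [8, 8, 0, 0, 2],
              [-8, -4, 0, 4, 0]])
I5 = np.eye(5, dtype=int)
p = A - 2 * I5                     # the map p(T) = T - 2I

E1 = np.array([1, 0, 0, 0, 0])
E2 = np.array([0, 1, 0, 0, 0])

print(p @ p @ E1)                  # (-24, 16, 32, -32, 0) != 0, so U1 = E1 works
# U2 = E2 - (1/4)E1; checking 4*U2 = 4E2 - E1 avoids fractions:
print(p @ p @ (4 * E2 - E1))       # the zero vector, so p^2(T)U2 = 0
```

Since p^2(T)U_2 = 0 while p(T)U_2 ≠ 0, the T-minimum polynomial of U_2 is (x − 2)^2, as the example requires.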
The two previous examples suggest quite strongly that the T-minimum polynomials associated with 𝒮(U_1, T), ..., 𝒮(U_h, T) are independent of U_1, ..., U_h, and that they are determined entirely by the map T. If so, then the matrix of T with respect to a cyclic basis for 𝒱 can be taken as the canonical form for the equivalence class under similarity containing all possible matrices for T.

Theorem 8.26  The numbers h, r_1, ..., r_h in the secondary decomposition theorem depend only on T.

That is, if 𝒱 = 𝒮(U′_1, T) ⊕ ··· ⊕ 𝒮(U′_{h′}, T) is another decomposition for 𝒱 satisfying the secondary decomposition theorem and the T-minimum polynomial of U′_i is p^{r′_i}(x), then h′ = h and r′_1 = r_1, ..., r′_h = r_h.
Proof  The proof uses the second principle of induction on the dimension of 𝒱.

If dim 𝒱 = 1, the theorem is clearly true.

Therefore suppose the theorem holds for all T-invariant spaces of dimension less than n, and let dim 𝒱 = n, n > 1. Consider the null space of p(T), 𝒩_{p(T)}. If 𝒩_{p(T)} = 𝒱, then e = 1 and the dimension of each T-cyclic space is deg p. In this case h·deg p = dim 𝒱 = h′·deg p and the theorem is true. If 𝒩_{p(T)} ≠ 𝒱, let V ∈ 𝒩_{p(T)}. Then V ∈ 𝒮(U_1, T) ⊕ ··· ⊕ 𝒮(U_h, T) implies there exist polynomials f_1(x), ..., f_h(x) such that V = f_1(T)U_1 + ··· + f_h(T)U_h and

0 = p(T)V = p(T) ∘ f_1(T)U_1 + ··· + p(T) ∘ f_h(T)U_h.

Since there is only one expression for 0 in a direct sum, p(T) ∘ f_i(T)U_i = 0 for 1 ≤ i ≤ h. Therefore the minimum polynomial of U_i, p^{r_i}(x), divides p(x)f_i(x), implying that p^{r_i−1}(x) divides f_i(x). Thus there exist polynomials q_1(x), ..., q_h(x) such that

f_i(x) = p^{r_i−1}(x)q_i(x),   1 ≤ i ≤ h.

Therefore

V = q_1(T)(p^{r_1−1}(T)U_1) + ··· + q_h(T)(p^{r_h−1}(T)U_h),

which means that

𝒩_{p(T)} ⊂ 𝒮(p^{r_1−1}(T)U_1, T) ⊕ ··· ⊕ 𝒮(p^{r_h−1}(T)U_h, T).
4. Elementary Divisors
The reverse inclusion is immediate so that

𝒩_{p(T)} = 𝒮(p^{r_1 - 1}(T)U_1, T) ⊕ ... ⊕ 𝒮(p^{r_h - 1}(T)U_h, T).

Similarly,

𝒩_{p(T)} = 𝒮(p^{r_1' - 1}(T)U_1', T) ⊕ ... ⊕ 𝒮(p^{r_{h'}' - 1}(T)U_{h'}', T).

These direct sum decompositions satisfy the conditions of the secondary decomposition theorem, and 𝒩_{p(T)} is a T-invariant space of dimension less than n, so the induction assumption applies and h' = h.
Now consider the image space 𝒥_{p(T)} = p(T)[𝒱]. By a similar argument it can be shown that

𝒥_{p(T)} = 𝒮(p(T)U_1, T) ⊕ ... ⊕ 𝒮(p(T)U_k, T),

where k is the index such that r_k > 1 and r_{k+1} = 1, or k = h with r_h > 1. Then the T-minimum polynomial of p(T)U_i is p^{r_i - 1}(x), 1 ≤ i ≤ k. Also,

𝒥_{p(T)} = 𝒮(p(T)U_1', T) ⊕ ... ⊕ 𝒮(p(T)U_{k'}', T)

with k' defined similarly. Now 𝒥_{p(T)} is also a T-invariant space, and since 𝒩_{p(T)} ≠ {0}, 𝒥_{p(T)} is a proper subspace of 𝒱. Therefore the induction assumption yields k' = k and r_1' = r_1, ..., r_k' = r_k, with r_{k+1}' = ... = r_{h}' = r_{k+1} = ... = r_h = 1, which completes the proof of the theorem.
Definition If the minimum polynomial of T ∈ Hom(𝒱) is p^e(x) with p(x) irreducible over F, then the polynomials p^e(x) = p^{r_1}(x), p^{r_2}(x), ..., p^{r_h}(x) from the secondary decomposition theorem are the p(x)-elementary divisors of T.

From the comments that have been made above, it should be clear that if p^e(x) is the minimum polynomial of T, then a diagonal block matrix can be given for T once the p(x)-elementary divisors are known.
Example 4 If T ∈ Hom(𝒱_8) has p(x)-elementary divisors (x^2 - x + 3)^2, x^2 - x + 3, x^2 - x + 3, then the diagonal block matrix
of T with respect to a cyclic basis for 𝒱_8 is diag(C(p^2), C(p), C(p)), where

C(p^2) = ( 0 0 0 -9 )
         ( 1 0 0  6 )
         ( 0 1 0 -7 )
         ( 0 0 1  2 )

and

C(p) = ( 0 -3 )
       ( 1  1 ).

Here p(x) = x^2 - x + 3 and p^2(x) = x^4 - 2x^3 + 7x^2 - 6x + 9.
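The bookkeeping in this example is easy to automate. The following sketch (the helper names are mine, not the text's) builds a companion matrix in the book's convention — 1's on the subdiagonal and the negated coefficients in the last column — and recovers p^2(x) by polynomial multiplication.

```python
# Companion matrix in the book's convention, for a monic polynomial
# p(x) = x^b + a_{b-1}x^{b-1} + ... + a_0 given as coeffs = [a_0, ..., a_{b-1}].
def companion(coeffs):
    b = len(coeffs)
    C = [[0] * b for _ in range(b)]
    for i in range(1, b):
        C[i][i - 1] = 1              # 1's on the subdiagonal
    for i in range(b):
        C[i][b - 1] = -coeffs[i]     # negated coefficients in the last column
    return C

def poly_mult(p, q):
    """Multiply polynomials stored low-order-first: [c_0, c_1, ...]."""
    r = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

p = [3, -1, 1]                       # x^2 - x + 3
p2 = poly_mult(p, p)                 # x^4 - 2x^3 + 7x^2 - 6x + 9
print(p2)                            # [9, -6, 7, -2, 1]
print(companion(p2[:-1]))            # the 4x4 block C(p^2) of this example
```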
Problems

1. Find the companion matrices of the following polynomials.
   a. x - 5.  b. (x - 5)^2.  c. (x - 5)^3.  d. x^2 - 3x + 4.  e. (x^2 - 3x + 4)^2.

2. Suppose the companion matrix C(q) ∈ M_{h×h} is the matrix of T ∈ Hom(𝒱) with respect to some basis B.
   a. Show that q(x) is the minimum polynomial of T.
   b. Show that (-1)^h q(x) is the characteristic polynomial of T.

3. Find a cyclic basis B for 𝒱_3 and the matrix of T with respect to B if the matrix of T with respect to {E_i} is
a. -1 - b. (-i -: -T).
o o 2 -4 -4 4
4. Suppose T ∈ Hom(𝒱) and B is a basis for 𝒱. Let U_1, ..., U_h be vectors satisfying the secondary decomposition theorem, with U_2, ..., U_h obtained from W_2, ..., W_h as in the proof.
   a. Why is there always a vector in B that can be used as U_1?
   b. Show that W_2, ..., W_h can always be chosen from B.

5. Suppose T ∈ Hom(𝒱_6) and p(x) is the minimum polynomial of T. List all possible diagonal block matrices for T (up to order) composed of companion matrices if
   a. p(x) = (x + 3)^2.  b. p(x) = x - 4.  c. p(x) = (x^2 + x + 1)^2.  d. p(x) = (x^2 + 4)^3.  e. p(x) = (x - 1)^3.

6. Suppose (x - c)^r is the T-minimum polynomial of U, c ∈ F.
   a. Show that 𝒮(U, T) contains a characteristic vector for c.
   b. Show that 𝒮(U, T) cannot contain two linearly independent characteristic vectors for c.
7. Find the (x + 1)-elementary divisors of T ∈ Hom(𝒱_4) given the matrix of T with respect to {E_i} and the fact that its minimum polynomial is (x + 1)^2.
   a. (  ·   ·   ·   · )      b. (  ·   ·   ·   · )
      ( -1  -1  -2  -2 )         ( -9  10  -7  -4 )
      (  ·   ·   ·   · )         (  ·   ·   ·   · )
      (  0   0   0  -1 )         (  0   1   0  -2 )
8. Find a diagonal block matrix for a map with the given elementary divisors.
   a. x^2 + x + 1, x^2 + x + 1, x^2 + x + 1.
   b. (x + 3)^3, (x + 3)^2, (x + 3)^2.
   c. (x^2 - 4x + 7)^2, x^2 - 4x + 7.

9. Combine the primary and secondary decomposition theorems to find a diagonal block matrix composed of companion matrices similar to
a. (-! -f b. _:
-4 o -4 7 6 o 6 -10
5. Canonical Forms
The primary and secondary decomposition theorems determine a direct sum decomposition of 𝒱 for any map T ∈ Hom(𝒱). This decomposition yields a diagonal block matrix for T, which is unique up to the order of the blocks, and thus may be used as a canonical representation for the equivalence class of matrices for T.
Theorem 8.27 Suppose m(x) is the minimum polynomial of T ∈ Hom(𝒱) and m(x) = p_1^{e_1}(x)···p_k^{e_k}(x) is a factorization for which p_1(x), ..., p_k(x) are distinct monic polynomials, irreducible over F. Then there exist unique positive integers h_i, r_{ij} with e_i = r_{i1} ≥ ... ≥ r_{ih_i} ≥ 1 for 1 ≤ i ≤ k and 1 ≤ j ≤ h_i, and there exist (not uniquely) vectors U_{ij} ∈ 𝒱 such that 𝒱 is the direct sum of the T-cyclic subspaces 𝒮(U_{ij}, T) and the T-minimum polynomial of U_{ij} is p_i^{r_{ij}}(x) for 1 ≤ i ≤ k, 1 ≤ j ≤ h_i.
Proof The factorization m(x) = p_1^{e_1}(x)···p_k^{e_k}(x) is unique up to the order of the irreducible factors, and the primary decomposition theorem gives

(1)  𝒱 = 𝒩_{p_1^{e_1}(T)} ⊕ ... ⊕ 𝒩_{p_k^{e_k}(T)}.

Now consider the restriction of T to each T-invariant space 𝒩_{p_i^{e_i}(T)}. The
minimum polynomial of this restriction map is p_i^{e_i}(x). So from the previous section we know that there exist unique positive integers h_i, e_i = r_{i1} ≥ ... ≥ r_{ih_i} ≥ 1, and there exist vectors U_{i1}, ..., U_{ih_i} ∈ 𝒩_{p_i^{e_i}(T)} ⊂ 𝒱 with T-minimum polynomials p_i^{r_{i1}}(x), ..., p_i^{r_{ih_i}}(x), respectively, such that

𝒩_{p_i^{e_i}(T)} = 𝒮(U_{i1}, T) ⊕ ... ⊕ 𝒮(U_{ih_i}, T).

Replacing each null space in the sum (1) by such direct sums of T-cyclic subspaces completes the proof of the theorem.
Corollary 8.28 The degree of the minimum polynomial of T ∈ Hom(𝒱) cannot exceed the dimension of 𝒱.

Proof

dim 𝒱 = Σ_{i=1}^{k} Σ_{j=1}^{h_i} dim 𝒮(U_{ij}, T) ≥ Σ_{i=1}^{k} dim 𝒮(U_{i1}, T) = Σ_{i=1}^{k} deg p_i^{e_i} = deg m.
The collection of all p_i(x)-elementary divisors of T, 1 ≤ i ≤ k, will be referred to as the elementary divisors of T. These polynomials are determined by the map T and therefore must be reflected in any matrix for T. Since two matrices are similar if and only if they are matrices for some map, we have obtained a set of invariants for matrices under similarity. It is a simple matter to extend our terminology to n × n matrices with entries from the field F. Every matrix A ∈ M_{n×n}(F) determines a map from the vector space M_{n×1}(F) to itself with A(V) = AV for all V ∈ M_{n×1}(F). Therefore we can speak of the minimum polynomial of A and of the elementary divisors of A. In fact, the examples and problems of the two previous sections were stated in terms of matrices for simplicity, and they show that the results for a map were first obtained for the matrix.
Theorem 8.29 Two matrices from M_{n×n}(F) are similar if and only if they have the same elementary divisors.

Proof (⇒) If A is the matrix of T with respect to the basis B, then Y = AX is the matrix equation for W = T(V). Therefore, the elementary divisors of A equal the elementary divisors of T. If A and A' are similar, then they are matrices for some map T and hence they have the same elementary divisors.
(⇐) Let A be an n × n matrix with elementary divisors p_i^{r_{ij}}(x) for 1 ≤ j ≤ h_i, 1 ≤ i ≤ k. Suppose A is the matrix of T ∈ Hom(𝒱) with respect to some basis. Then there exists a basis for 𝒱 with respect to which T has the matrix

D = diag(C(p_1^{r_{11}}), ..., C(p_k^{r_{kh_k}})).

Thus A is similar to D. If A' ∈ M_{n×n}(F) has the same elementary divisors as A, then by the same argument A' is similar to D and hence similar to A.
This theorem shows that the elementary divisors of a matrix form a complete set of invariants under similarity, and it provides a canonical form for each matrix.

Definition If A ∈ M_{n×n}(F) has elementary divisors p_i^{r_{ij}}(x), 1 ≤ j ≤ h_i and 1 ≤ i ≤ k, then the diagonal block matrix

diag(C(p_1^{r_{11}}), ..., C(p_1^{r_{1h_1}}), ..., C(p_k^{r_{k1}}), ..., C(p_k^{r_{kh_k}}))

is a rational canonical form for A.

Two rational canonical forms for a matrix A differ only in the order of the blocks. It is possible to specify an order, but there is little need to do so here. It should be noticed that the rational canonical form for a matrix A depends on the nature of the field F. For example, if A ∈ M_{n×n}(R), then A is also in M_{n×n}(C), and two quite different rational canonical forms can be obtained for A by changing the base field. This is the case in Example 1 below.

The following result is included in the notion of a canonical form but might be stated for completeness.
Theorem 8.30 Two matrices from M_{n×n}(F) are similar if and only if they have the same rational canonical form.

Example 1 Find a rational canonical form for

A = (  ·  -1   0  -1   0 )
    (  ·   0   4   0  -4 )
    (  ·   0   3  -2  -1 )   ∈ M_{5×5}(R).
    (  ·   0  -1   2   · )
    ( -2   1   3  -2  -2 )
Suppose {E_i} denotes the standard basis for M_{5×1}(R). Then a little computation shows that the A-minimum polynomial of E_1 and E_4 is (x - 1)^2, the A-minimum polynomial of E_2 is x^2 + 4, and the A-minimum polynomial of E_3 and E_5 is (x - 1)^2(x^2 + 4). Therefore the minimum polynomial of A is m(x) = (x - 1)^2(x^2 + 4). The polynomials p_1(x) = x - 1 and p_2(x) = x^2 + 4 are irreducible over R and m(x) = p_1^2(x)p_2(x). Thus p_1^2(x) and p_2(x) are two elementary divisors of A. The degrees of the elementary divisors of A must add up to 5, so A has a third elementary divisor of degree 1. This must be x - 1. Since the elementary divisors of A are p_1^2(x), p_1(x), and p_2(x), a rational canonical form for A is

D = diag(C((x - 1)^2), C(x - 1), C(x^2 + 4)) = ( 0 -1 0 0  0 )
                                               ( 1  2 0 0  0 )
                                               ( 0  0 1 0  0 )
                                               ( 0  0 0 0 -4 )
                                               ( 0  0 0 1  0 ).

If the matrix of this example were given as a member of M_{5×5}(C), then the minimum polynomial would remain the same but its irreducible factors would change. Over the complex numbers,

m(x) = (x - 1)^2(x + 2i)(x - 2i).

Then (x - 1)^2, x + 2i, and x - 2i are elementary divisors of A, and again there must be another elementary divisor of degree 1. Why is it impossible for the fourth elementary divisor to be either x + 2i or x - 2i? With elementary divisors (x - 1)^2, x - 1, x + 2i, and x - 2i, a rational canonical form for A ∈ M_{5×5}(C) is obtained from D by replacing the block

C(x^2 + 4) = ( 0 -4 )
             ( 1  0 )

with

( 2i   0 )
(  0 -2i ).
Theorem 8.31 (Cayley-Hamilton Theorem) An n × n matrix [or a map in Hom(𝒱)] satisfies its characteristic equation.

That is, if q(λ) = 0 is the characteristic equation for the matrix A, then q(A) = 0.

Proof Suppose the rational canonical form for A is D = diag(C(p_1^{r_{11}}), ..., C(p_k^{r_{kh_k}})). Then since A is similar to D, q(λ) = det(D - λI_n). The determinant of a diagonal block matrix is the product of the determinants of the blocks (problem 5, page 347), so if we set b_{ij} = deg p_i^{r_{ij}}, then

q(λ) = Π_{i=1}^{k} Π_{j=1}^{h_i} det(C(p_i^{r_{ij}}) - λI_{b_{ij}}).

But det(C(p_i^{r_{ij}}) - λI_{b_{ij}}) is the characteristic polynomial of C(p_i^{r_{ij}}), and by problem 2, page 376, this polynomial is (-1)^{b_{ij}}p_i^{r_{ij}}(λ). So q(λ) = (-1)^n p_1^{r_{11}}(λ)···p_k^{r_{kh_k}}(λ). But if m(x) is the minimum polynomial of A, then m(x) = p_1^{e_1}(x)···p_k^{e_k}(x) with e_i = r_{i1}. Therefore q(λ) = m(λ)s(λ) for some polynomial s(λ), and m(A) = 0 implies q(A) = 0 as desired.
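The theorem is easy to test numerically. In this sketch (the test matrix is mine, not the text's), np.poly gives the coefficients of the characteristic polynomial and Horner's rule evaluates it at A.

```python
# Numerical illustration of the Cayley-Hamilton theorem.
import numpy as np

A = np.array([[6., -1.],
              [2.,  3.]])            # an arbitrary 2x2 test matrix

coeffs = np.poly(A)                  # characteristic polynomial, leading coefficient first
q_of_A = np.zeros((2, 2))
for c in coeffs:                     # Horner's rule with A substituted for the variable
    q_of_A = q_of_A @ A + c * np.eye(2)

print(np.allclose(q_of_A, 0))        # True: A satisfies its characteristic equation
```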
Corollary 8.32 If p_i^{r_{ij}}(x), 1 ≤ j ≤ h_i, 1 ≤ i ≤ k, are the elementary divisors for an n × n matrix A, then the characteristic polynomial of A is (-1)^n Π_{i=1}^{k} Π_{j=1}^{h_i} p_i^{r_{ij}}(x).
Although we have used the rational canonical form to prove the Cayley-Hamilton theorem, the form itself leaves something to be desired. As a diagonal block matrix most of its entries are 0, but a general companion matrix is quite different from a diagonal matrix. This is true even over an algebraically closed field; for example, the companion matrix of (x - a)^4 is

( 0 0 0  -a^4 )
( 1 0 0  4a^3 )
( 0 1 0 -6a^2 )
( 0 0 1  4a   ).
For practical purposes it would be better to have a matrix that is more nearly diagonal. One might hope to obtain a diagonal block matrix similar to the companion matrix C(p^e) which has e blocks of C(p) on the diagonal. Although this is not possible, there is a similar matrix which comes close to this goal.
Consider a map T ∈ Hom(𝒱) for which 𝒱 = 𝒮(U, T) and p^e(x) is the T-minimum polynomial of U. Then C(p^e) is the matrix of T with respect to the cyclic basis {U, T(U), ..., T^{eb-1}(U)}, where b is the degree of p(x). We will obtain a simpler matrix for T by choosing a more complicated basis. Suppose p(x) = x^b + a_{b-1}x^{b-1} + ... + a_1x + a_0. Replace x by T and apply the resulting map to U to obtain

p(T)U = T^b(U) + a_{b-1}T^{b-1}(U) + ... + a_1T(U) + a_0U.
Now repeatedly apply the map p(T) to obtain the following relations:

p^2(T)U = p(T)T^b(U) + a_{b-1}p(T)T^{b-1}(U) + ... + a_1p(T)T(U) + a_0p(T)U
p^3(T)U = p^2(T)T^b(U) + a_{b-1}p^2(T)T^{b-1}(U) + ... + a_1p^2(T)T(U) + a_0p^2(T)U
...
p^{e-1}(T)U = p^{e-2}(T)T^b(U) + a_{b-1}p^{e-2}(T)T^{b-1}(U) + ... + a_1p^{e-2}(T)T(U) + a_0p^{e-2}(T)U.

Notice that the entries from the last column of C(p) appear in each equation, and the b + 1 vectors on the right side of each equation are like the vectors of a cyclic basis. Also the vector on the left side of each equation appears in the next equation. Therefore, consider the following set of vectors:

U            T(U)            T^2(U)            ...  T^{b-1}(U)
p(T)U        p(T)T(U)        p(T)T^2(U)        ...  p(T)T^{b-1}(U)
p^2(T)U      p^2(T)T(U)      p^2(T)T^2(U)      ...  p^2(T)T^{b-1}(U)
...
p^{e-1}(T)U  p^{e-1}(T)T(U)  p^{e-1}(T)T^2(U)  ...  p^{e-1}(T)T^{b-1}(U).
There are eb vectors in this set, so if they are linearly independent, they form a basis for 𝒱. A linear combination of these vectors may be written as q(T)U for some polynomial q(x) ∈ F[x]. Since the highest power of T occurs in p^{e-1}(T)T^{b-1}(U), the degree of q(x) cannot exceed eb - 1. Therefore, if the set is linearly dependent, there exists a nonzero polynomial q(x) of degree less than that of p^e(x) for which q(T)U = 0. This contradicts the fact that p^e(x) is the T-minimum polynomial of U, so the set is a basis for 𝒱. Order these vectors from left to right, row by row, starting at the top, and denote the basis by B_c. We will call B_c a canonical basis for 𝒱. Let A_c be the matrix of T with respect to this canonical basis B_c. The first b - 1 vectors of B_c are the same as the first b - 1 vectors of the cyclic basis, but

T(T^{b-1}(U)) = T^b(U) = p(T)U - a_{b-1}T^{b-1}(U) - ... - a_1T(U) - a_0U.

Therefore the b × b submatrix in the upper left corner of A_c is C(p), and the entry in the (b + 1)st row and bth column is 1. This pattern is repeated; each equation above gives rise to a block C(p) and each equation except for the last adds a 1 on the subdiagonal between two blocks. For example, the second row of elements in B_c begins like a cyclic basis with T(p(T)U) = p(T)T(U), T(p(T)T(U)) = p(T)T^2(U), T(p(T)T^2(U)) = p(T)T^3(U), and
so on. Then using the second equation above, we have

T(p(T)T^{b-1}(U)) = p(T)T^b(U) = p^2(T)U - a_{b-1}p(T)T^{b-1}(U) - ... - a_0p(T)U.

Therefore A_c is the diagonal block matrix with e blocks C(p) on the diagonal and a 1 in each subdiagonal position where two consecutive blocks meet. Call this matrix the hypercompanion matrix of p^e(x) and denote it by C_h(p^e).
Example 2 If p(x) = x^2 - 2x + 3, then p(x) is irreducible over R. Find the companion matrix of p^3(x), C(p^3), and the hypercompanion matrix of p^3(x), C_h(p^3).

Since p^3(x) = x^6 - 6x^5 + 21x^4 - 44x^3 + 63x^2 - 54x + 27,

C(p^3) = ( 0 0 0 0 0 -27 )
         ( 1 0 0 0 0  54 )
         ( 0 1 0 0 0 -63 )
         ( 0 0 1 0 0  44 )
         ( 0 0 0 1 0 -21 )
         ( 0 0 0 0 1   6 )

and

C_h(p^3) = ( 0 -3 0  0 0  0 )
           ( 1  2 0  0 0  0 )
           ( 0  1 0 -3 0  0 )
           ( 0  0 1  2 0  0 )
           ( 0  0 0  1 0 -3 )
           ( 0  0 0  0 1  2 ).
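The pattern in C_h(p^3) can be generated mechanically. A sketch (the helper is mine, not the text's): place e copies of C(p) on the diagonal and a 1 on the subdiagonal wherever two consecutive blocks meet.

```python
# Build the hypercompanion matrix C_h(p^e) from coeffs = [a_0, ..., a_{b-1}]
# of the monic irreducible polynomial p(x).
def hypercompanion(coeffs, e):
    b = len(coeffs)
    n = e * b
    H = [[0] * n for _ in range(n)]
    for k in range(e):                       # k-th diagonal block C(p)
        off = k * b
        for i in range(1, b):
            H[off + i][off + i - 1] = 1      # 1's on the subdiagonal of the block
        for i in range(b):
            H[off + i][off + b - 1] = -coeffs[i]
        if k > 0:
            H[off][off - 1] = 1              # connecting 1 between blocks
    return H

# p(x) = x^2 - 2x + 3, e = 3 reproduces the 6x6 matrix C_h(p^3) of Example 2.
for row in hypercompanion([3, -2], 3):
    print(row)
```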
If the companion matrices of a rational canonical form are replaced by hypercompanion matrices for the same polynomials, then a new canonical form is obtained, which has its nonzero entries clustered more closely about the main diagonal.

Definition If A ∈ M_{n×n}(F) has elementary divisors p_i^{r_{ij}}(x), 1 ≤ j ≤ h_i and 1 ≤ i ≤ k, then the diagonal block matrix

diag(C_h(p_1^{r_{11}}), ..., C_h(p_k^{r_{kh_k}}))

is a classical canonical form for A.
Example 3 Suppose T ∈ Hom(𝒱_6) has elementary divisors (x^2 + x + 2)^2 and (x - 5)^2. Find a matrix D_1 for T in rational canonical form and a matrix D_2 for T in classical canonical form.

Since (x^2 + x + 2)^2 = x^4 + 2x^3 + 5x^2 + 4x + 4 and (x - 5)^2 = x^2 - 10x + 25,

D_1 = ( 0 0 0 -4 |        )
      ( 1 0 0 -4 |        )
      ( 0 1 0 -5 |        )
      ( 0 0 1 -2 |        )
      (          | 0 -25  )
      (          | 1  10  )

and

D_2 = ( 0 -2 0  0 |     )
      ( 1 -1 0  0 |     )
      ( 0  1 0 -2 |     )
      ( 0  0 1 -1 |     )
      (           | 5 0 )
      (           | 1 5 ).
If p(x) is a first degree polynomial, say p(x) = x - c, then the hypercompanion matrix C_h(p^e) is particularly simple, with the characteristic value c on the main diagonal and 1's on the subdiagonal. Thus if all irreducible factors of the minimum polynomial for A are linear, then the nonzero entries in a classical canonical form for A are confined to the main and subdiagonals. If A is not diagonalizable, then such a form is the best that can be hoped for in computational and theoretical applications. In particular, this always occurs over an algebraically closed field such as the complex numbers.

Definition If the minimum polynomial (or characteristic polynomial) of A ∈ M_{n×n}(F) factors into powers of first degree polynomials over F, then a classical canonical form for A is called a Jordan canonical form or simply a Jordan form.

Every matrix with complex entries has a Jordan form. So every matrix with real entries has a Jordan form if complex entries are permitted. However, not every matrix in M_{n×n}(R) has a Jordan form in M_{n×n}(R). For example, the equivalence class of 6 × 6 real matrices containing D_1 and D_2 of Example 3 does not contain a matrix in Jordan form.
Example 4 Suppose T ∈ Hom(𝒱_5) has elementary divisors (x - 2)^3 and (x - 2)^2, as in Example 3, page 372. Then a matrix for T in classical canonical form is a Jordan form, and there is a canonical basis for 𝒱_5 with respect to which the matrix of T is

diag(C_h(p^3), C_h(p^2)) = ( 2 0 0 0 0 )
                           ( 1 2 0 0 0 )
                           ( 0 1 2 0 0 )
                           ( 0 0 0 2 0 )
                           ( 0 0 0 1 2 ),    p(x) = x - 2.
Example 5 Let

A = (  ·  -3   8  -8 )
    (  0   0   1 -24 )
    ( -1  -1   3 -16 )
    (  0   1  -1   0 ).

Find a classical canonical form for A as a member of M_{4×4}(R) and a Jordan form for A as a member of M_{4×4}(C).
The A-minimum polynomial of (1/8, -1/8, 0, 0)^T is easily found to be x^4 + 8x^2 + 16. Since the degree of the minimum polynomial for A cannot exceed 4, x^4 + 8x^2 + 16 is the minimum polynomial for A. Over the real numbers this polynomial factors into (x^2 + 4)^2, whereas over the complex numbers it factors into (x - 2i)^2(x + 2i)^2. Therefore the classical canonical form for A is
C_h((x^2 + 4)^2) = ( 0 -4 0  0 )
                   ( 1  0 0  0 )
                   ( 0  1 0 -4 )
                   ( 0  0 1  0 )

in M_{4×4}(R), and

diag(C_h((x - 2i)^2), C_h((x + 2i)^2)) = ( 2i  0   0   0 )
                                         (  1 2i   0   0 )
                                         (  0  0 -2i   0 )
                                         (  0  0   1 -2i )

in M_{4×4}(C). The second matrix is a Jordan form, as required.
If a map T has a matrix in Jordan form, then this fact can be used to express T as the sum of a diagonalizable map and a map which becomes zero when raised to some power.

Definition A map T ∈ Hom(𝒱) is nilpotent if T^k = 0 for some positive integer k. The index of a nilpotent map T is 1 if T = 0, and the integer k if T^k = 0 but T^{k-1} ≠ 0, k > 1.

Nilpotent matrices are defined similarly.

Theorem 8.33 If the minimum polynomial of T ∈ Hom(𝒱) has only linear irreducible factors, then there exists a nilpotent map T_n and a diagonalizable map T_d such that T = T_n + T_d.

Proof Since the minimum polynomial has only linear factors, there exists a basis B_c for 𝒱 such that if A is the matrix for T with respect to B_c, then A is in Jordan canonical form. Write A = N + D where D is the diagonal matrix containing the diagonal entries of A and N is the "subdiagonal matrix" containing the subdiagonal entries of A. It is easily shown that N is a nilpotent matrix (problem 14 below). So if T_n ∈ Hom(𝒱) has N as its matrix with respect to B_c and T_d ∈ Hom(𝒱) has D as its matrix with respect to B_c, then T = T_n + T_d.

Theorem 8.33 may be proved without reference to the existence of a Jordan form. This fact can then be used to derive the existence of a Jordan form over an algebraically closed field. See problem 16 below.
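For a matrix already in Jordan form, the splitting in Theorem 8.33 is just "diagonal part plus subdiagonal part". A sketch (the matrix used is the Jordan form of Example 4 above):

```python
# Split a Jordan-form matrix into its diagonalizable and nilpotent parts.
import numpy as np

A = np.array([[2., 0., 0., 0., 0.],
              [1., 2., 0., 0., 0.],
              [0., 1., 2., 0., 0.],
              [0., 0., 0., 2., 0.],
              [0., 0., 0., 1., 2.]])   # blocks of sizes 3 and 2 for lambda = 2

D = np.diag(np.diag(A))                # diagonal entries: the diagonalizable part
N = A - D                              # subdiagonal entries: the nilpotent part

print(np.allclose(np.linalg.matrix_power(N, 3), 0))   # True: N^3 = 0
print(np.allclose(D @ N, N @ D))                      # True: here D = 2I commutes with N
```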
Problems

1. Given the elementary divisors of A ∈ M_{n×n}(R), find a rational canonical form D_1 for A and a classical canonical form D_2 for A.
   a. (x - 3)^3, x - 3, x - 3.
   b. (x^2 + 4)^2, (x + 2)^2.
   c. (x^2 + 16)^2, x^2 + 16.
   d. (x^2 + 2x + 2)^3.

2. Suppose the matrices of problem 1 are in M_{n×n}(C). Determine the elementary divisors of A and find a rational canonical form D_3 for A and a Jordan canonical form D_4 for A.
3. Suppose the matrix of T ∈ Hom(𝒱_4) with respect to {E_i} is
(
! -g =f).
o 1 o 3
   a. Show that the minimum polynomial of T is (x - 2)^4.
   b. Find the canonical basis B_c for 𝒱_4 which begins with the standard basis vector E_1.
   c. Show that the matrix of T with respect to B_c is a Jordan canonical form.
4. Find a classical canonical form for A as a member of M_{4×4}(R) and a Jordan form for A as a member of M_{4×4}(C) when
A = f
-1 1 o o
5. Determine conditions which insure that the minimum polynomial of an n × n matrix A equals (-1)^n times the characteristic polynomial of A.

6. Use the characteristic polynomial to find the minimum polynomial and the Jordan form of the following matrices from M_{3×3}(C).
a. ( 6 -i). b. (-i c. (=! 16) ..
-1 1 o -1 1 o 2 -2 -3
7. Suppose the characteristic polynomial of A is (x - 2)^6.
   a. Find all possible sets of elementary divisors for A.
   b. Which sets of elementary divisors are determined entirely by the number of linearly independent characteristic vectors for 2?
8. Compute ( · )^n for any positive integer n.
9. For the following matrices in Jordan form, determine by inspection their elementary divisors and the number of linearly independent characteristic vectors for each characteristic value.
a. (! 8). b. (! f 8).
0012 0002
o o)
o o
2 o.
o 2
10. Suppose the matrix of T with respect to {E_i} is

    (  ·  ·  ·  · )
    (  ·  ·  ·  · )
    (  ·  ·  ·  · )
    ( -5  2  1  1 ).

    a. Show that the minimum polynomial of T is (x^2 + 2)^2.
    b. With T ∈ Hom(𝒱_4(R)), find a canonical basis B_c for 𝒱_4 beginning with (0, 0, 1, -1) and find the matrix of T with respect to B_c.
    c. With T ∈ Hom(𝒱_4(C)), find a canonical basis B_c for 𝒱_4(C) and the matrix of T with respect to B_c.

11. a. Show that every characteristic value of a nilpotent map is zero.
    b. Suppose T ∈ Hom(𝒱) is nilpotent, and dim 𝒱 = 4. List all possible sets of elementary divisors for T and find a matrix in Jordan form for each.
    c. Prove that if T is nilpotent, then T has a matrix with 0's off the subdiagonal and 1's as the only nonzero entries on the subdiagonal, for some choice of basis.
12. Show that the following matrices are nilpotent and find a Jordan form for each.
    a. (  6 -4  3 )
       (  6 -4  1 )
       ( -3  2 -2 ).
(
1 -1 1) ( 1 1 -1 2)
-2 -2 2 -4
b. = 1 1 = 1 . c. 5 3 - 1 . . 2 .
1 1 1
3 2 -1 . 2
13. Find a nilpotent map T_n and a diagonalizable map T_d such that T = T_n + T_d when
    a. T(a, b) = (5a - b, a + 3b).
    b. T(a, b, c) = (3a - b - c, 4b + c, a - 3b).

14. Suppose A is an n × n matrix with all entries on and above the main diagonal equal to 0. Show that A is nilpotent.

15. Suppose the minimum polynomial of T ∈ Hom(𝒱) has only linear irreducible factors. Let T_1, ..., T_k be the projection maps that determine a primary decomposition of 𝒱, as obtained in problem 15 on page 367. Thus if λ_1, ..., λ_k are the k distinct characteristic values of T, then T_i: 𝒱 → 𝒩_{(T - λ_i I)^{e_i}}.
    a. Prove that T_d = λ_1T_1 + ... + λ_kT_k is diagonalizable.
    b. Prove that T_n = T - T_d is nilpotent. Suggestion: Write T = T∘I = T∘T_1 + ... + T∘T_k and show that for any positive integer r,

       T_n^r = (T - λ_1I)^r∘T_1 + ... + (T - λ_kI)^r∘T_k.

16. Combine problems 11c and 15 to show how the decomposition T = T_d + T_n yields a matrix in Jordan form. (The statement of 11c can be strengthened to specify how the 1's appear, and then proved without reference to a Jordan form. See problem 17.)
17. Suppose T ∈ Hom(𝒱) is nilpotent of index r and nullity s.
    a. Show that {0} = 𝒩_{T^0} ⊂ 𝒩_{T} ⊂ 𝒩_{T^2} ⊂ ... ⊂ 𝒩_{T^r} = 𝒱.
    b. Show that 𝒱 has a basis

       B_1 = {U_1, ..., U_{a_1}, U_{a_1+1}, ..., U_{a_2}, ..., U_{a_{r-1}+1}, ..., U_{a_r}},

       with {U_1, ..., U_{a_i}} a basis for 𝒩_{T^i} for i = 1, ..., r.
    c. Prove that the set {U_1, ..., U_{a_{i-1}}, T(U_{a_i+1}), ..., T(U_{a_{i+1}})} may be extended to a basis for 𝒩_{T^i} for i = 1, ..., r - 1.
    d. Use part c to obtain a basis B_2 = {W_1, ..., W_{a_1}, ..., W_{a_{r-1}+1}, ..., W_{a_r}} such that {W_1, ..., W_{a_i}} is a basis for 𝒩_{T^i} for each i, and if W ∈ B_2 is in the basis for 𝒩_{T^i} but not the basis for 𝒩_{T^{i-1}}, then

       T(W) ∈ {W_{a_{i-2}+1}, ..., W_{a_{i-1}}},  T^2(W) ∈ {W_{a_{i-3}+1}, ..., W_{a_{i-2}}},

       and so on.
    e. Find a basis B_3 for 𝒱 such that the matrix of T with respect to B_3 is diag(N_1, ..., N_s), where N_i has 1's on the subdiagonal and 0's elsewhere, and N_i is b_i × b_i with b_1 ≥ ... ≥ b_s.
6. Applications of the Jordan Form

Our examples have been chosen primarily from Euclidean geometry to provide clarification of abstract concepts and a context in which to test new ideas. However, there are no simple geometric applications of the Jordan form for a nondiagonalizable matrix. On the other hand, there are many places where the Jordan form is used, and a brief discussion of them will show how linear techniques are applied in other fields of study. Such an application properly belongs in an orderly presentation of the field in question, so no attempt will be made here to give a rigorous presentation. We will briefly consider applications to differential equations, probability, and numerical analysis.
Differential equations: Suppose we are given the problem of finding x_1, ..., x_n as functions of t when

D_t x_i = dx_i/dt = a_{i1}x_1 + ... + a_{in}x_n

for each i, 1 ≤ i ≤ n. If the a_{ij}'s are all scalars, then this system of equations is called a homogeneous system of linear first-order differential equations with constant coefficients. The conditions x_i(t_0) = c_i are the initial conditions at t = t_0. Such a system may be written in matrix form by setting A = (a_{ij}), X = (x_1, ..., x_n)^T, and C = (c_1, ..., c_n)^T. Then the system becomes

D_t X = AX  or  D_t X(t) = AX(t)  with  X(t_0) = C.
Example 1 Such a system is given by

D_t x = 4x - 3y - z,     x(0) = 2
D_t y = x - z,           y(0) = 1
D_t z = -x + 2y + 3z,    z(0) = 4.

This system is written D_t X = AX, with initial conditions at t = 0 given by X(0) = C, if

A = (  4 -3 -1 )        C = ( 2 )
    (  1  0 -1 ),           ( 1 )
    ( -1  2  3 )            ( 4 ).
A solution for the system D_t X = AX with initial conditions X(t_0) = C is a set of n functions f_1, ..., f_n such that x_1 = f_1(t), ..., x_n = f_n(t) satisfies the equations and f_1(t_0) = c_1, ..., f_n(t_0) = c_n. Usually D_t X = AX cannot be solved by inspection or by simple computation of antiderivatives. However, the system would be considerably simplified if the coefficient matrix A could be replaced by its Jordan form. Such a change in the coefficient matrix amounts to a change in variables.
Theorem 8.34 Given the system of differential equations D_t X = AX with initial conditions X(t_0) = C, let X = PX' define a new set of variables, P nonsingular. Then X is a solution of D_t X = AX, X(t_0) = C if and only if X' is a solution of D_t X' = P^{-1}APX', X'(t_0) = P^{-1}C.

Proof Since differentiation is a linear operation and P is a matrix of constants, P(D_t X') = D_t(PX'). Therefore D_t X' = P^{-1}APX' if and only if D_t(PX') = A(PX'), that is, if and only if X = PX' satisfies the equation D_t X = AX.
Example 2 Find x, y, z if

D_t X = (  8  2  -8 ) X = AX    and    X(0) = C = ( 4 )
        ( 13  3 -14 )                             ( 1 )
        ( 10  2 -10 )                             ( 3 ).

For this problem t_0 = 0.

The characteristic values of A are 1, 2, and -2, so A is diagonalizable. In fact, if

P = ( 2 1 1 )
    ( 1 1 3 )
    ( 2 1 2 ),

then

P^{-1}AP = ( 1 0  0 )
           ( 0 2  0 )
           ( 0 0 -2 ).

Therefore if new variables x', y', z' are defined by X = PX', then it is necessary to solve D_t X' = diag(1, 2, -2)X' with initial conditions

X'(0) = P^{-1}C = ( -1 -1  2 ) ( 4 )   (  1 )
                  (  4  2 -5 ) ( 1 ) = (  3 )
                  ( -1  0  1 ) ( 3 )   ( -1 ).
The equations in the new coordinates are simply D_t x' = x', D_t y' = 2y', and D_t z' = -2z', which have general solutions x' = ae^t, y' = be^{2t}, z' = ce^{-2t}, where a, b, c are arbitrary constants. But the initial conditions yield x' = e^t, y' = 3e^{2t}, and z' = -e^{-2t}; therefore the solution of the original system is

X = PX' = ( 2 1 1 ) (  e^t     )
          ( 1 1 3 ) (  3e^{2t} )
          ( 2 1 2 ) ( -e^{-2t} )

or

x = 2e^t + 3e^{2t} - e^{-2t}
y = e^t + 3e^{2t} - 3e^{-2t}
z = 2e^t + 3e^{2t} - 2e^{-2t}.
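A numerical check of Example 2 (a sketch: A is taken as P diag(1, 2, -2)P^{-1}, consistent with the change of variables used in the example):

```python
# Check that X(t) = P diag(e^t, e^{2t}, e^{-2t}) P^{-1} C solves D_t X = AX, X(0) = C.
import numpy as np

P = np.array([[2., 1., 1.],
              [1., 1., 3.],
              [2., 1., 2.]])
Pinv = np.linalg.inv(P)
A = P @ np.diag([1., 2., -2.]) @ Pinv
C = np.array([4., 1., 3.])

def X(t):
    return P @ np.diag([np.exp(t), np.exp(2*t), np.exp(-2*t)]) @ Pinv @ C

print(np.allclose(X(0), C))                        # initial condition holds
t, h = 0.7, 1e-6
deriv = (X(t + h) - X(t - h)) / (2 * h)            # central-difference derivative
print(np.allclose(deriv, A @ X(t), atol=1e-4))     # D_t X = AX numerically
```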
Example 3 Consider the system given in Example 1. A Jordan form for A is given by P^{-1}AP with

P = ( 1 -1  2 )               ( 2 0 0 )
    ( 1 -1  1 )    P^{-1}AP = ( 1 2 0 )
    ( 0  1 -1 ),              ( 0 0 3 ).

Thus A is not diagonalizable. If new variables are given by X = PX', then it is necessary to solve the system

D_t x' = 2x',    D_t y' = x' + 2y',    D_t z' = 3z'

with

( x'(0) )             ( 0  1 1 ) ( 2 )   ( 5 )
( y'(0) ) = P^{-1}C = ( 1 -1 1 ) ( 1 ) = ( 5 )
( z'(0) )             ( 1 -1 0 ) ( 4 )   ( 1 ).

The first and last equations are easily solved: x' = 5e^{2t} and z' = e^{3t}. The second equation is D_t y' = x' + 2y' or D_t y' - 2y' = 5e^{2t}. Using e^{-2t} as an integrating factor,

D_t(y'e^{-2t}) = 5    or    D_t(y'e^{-2t}) = D_t(5t).

Therefore y'e^{-2t} = 5t + b, but y' = 5 when t = 0, so y' = 5(t + 1)e^{2t}, and the solution for the original system is

X = PX' = ( 1 -1  2 ) ( 5e^{2t}        )
          ( 1 -1  1 ) ( 5(t + 1)e^{2t} )
          ( 0  1 -1 ) ( e^{3t}         ).
The general solution of D_t x = ax is x = ce^{at}, where c is an arbitrary constant. The surprising fact is that if e^{tA} is properly defined, then the general solution of D_t X = AX is given by e^{tA}C, where C is a column matrix of constants. There is no point in proving this here; however, computation of e^{tA} is most easily accomplished using a Jordan form for A.

Computation of e^A and e^{tA}: The matrix e^A is defined by following the pattern set for e^x in the Maclaurin series expansion:

e^x = Σ_{k=0}^{∞} x^k/k!.

Given any n × n matrix A, we would set

e^A = Σ_{k=0}^{∞} (1/k!)A^k,

if this infinite sum of matrices can be made meaningful.
Definition Suppose A_k ∈ M_{n×n} for k = 0, 1, 2, ..., and A_k = (a_{ij,k}). If lim_{k→∞} a_{ij,k} exists for all i and j, then we write lim_{k→∞} A_k = B, where B = (b_{ij}) and b_{ij} = lim_{k→∞} a_{ij,k}.

S_k = Σ_{h=0}^{k} A_h is a partial sum of the infinite series of matrices Σ_{k=0}^{∞} A_k. This series converges if lim_{k→∞} S_k exists. If lim_{k→∞} S_k = S, then we write Σ_{k=0}^{∞} A_k = S.
Theorem 8.35 The series Σ_{k=0}^{∞} (1/k!)A^k converges for every A ∈ M_{n×n}.

Proof Let A = (a_{ij}) and A^k = (a_{ij,k}). The entries in A must be bounded, say by M = max{|a_{ij}| : 1 ≤ i, j ≤ n}. Then the entries in A^2 are bounded by nM^2, for

|a_{ij,2}| = |Σ_{h=1}^{n} a_{ih}a_{hj}| ≤ nM^2.

An induction argument gives |a_{ij,k}| ≤ n^{k-1}M^k. Therefore,

|(1/k!)a_{ij,k}| ≤ (nM)^k/k!.

But (nM)^k/k! is the (k + 1)st term in the series expansion for e^{nM}, which always converges. Therefore, by the comparison test the series Σ_{k=0}^{∞} (1/k!)a_{ij,k} converges absolutely, hence it converges for all i and j. That is, the series Σ_{k=0}^{∞} (1/k!)A^k converges.
Thus we may define

e^A = Σ_{k=0}^{∞} (1/k!)A^k    and    e^{tA} = Σ_{k=0}^{∞} (t^k/k!)A^k.

If A = diag(d_1, ..., d_n) is diagonal, then e^A is easily computed, for A^k = diag(d_1^k, ..., d_n^k) and therefore e^A = diag(e^{d_1}, ..., e^{d_n}). But if A is not diagonal, then e^A can be computed using a Jordan form for A.

Theorem 8.36 If B = P^{-1}AP, then e^B = P^{-1}e^AP.

Proof

e^B = Σ_{k=0}^{∞} (1/k!)(P^{-1}AP)^k = Σ_{k=0}^{∞} (1/k!)P^{-1}A^kP = P^{-1}(Σ_{k=0}^{∞} (1/k!)A^k)P = P^{-1}e^AP.

Example 4 Let A and P be as in Example 2. Then P^{-1}AP = diag(1, 2, -2) and

e^{tA} = P(e^{t·diag(1,2,-2)})P^{-1} = P diag(e^t, e^{2t}, e^{-2t})P^{-1}.

Now with C as in Example 2, X = e^{tA}C is the solution obtained for the system of differential equations given in Example 2.
The calculation of e^A is more difficult if A is not diagonalizable. Suppose J = diag(J_1, ..., J_h) is a Jordan form for A, where each block J_i contains 1's on the subdiagonal and a characteristic value of A on the diagonal. Then e^J = diag(e^{J_1}, ..., e^{J_h}). Thus it is sufficient to consider J an n × n matrix with λ on the diagonal and 1's on the subdiagonal. Write J = λI_n + N; then N is nilpotent of index n. It can be shown that e^{A+B} = e^Ae^B provided that AB = BA, therefore

e^J = e^{λI_n}e^N = e^λe^N

and
e^N = I_n + N + (1/2!)N^2 + ... + (1/(n-1)!)N^{n-1}, the lower triangular matrix with 1's on the diagonal, 1's on the first subdiagonal, 1/2! on the second subdiagonal, and so on down to 1/(n-1)! in the lower left corner.

Example 5
Let A and P be the matrices of Example 3. Then

P^{-1}AP = ( 2 0 0 )
           ( 1 2 0 )
           ( 0 0 3 ).

Consider the 2 × 2 block

J_1 = ( 2 0 ) = 2I_2 + N    with    N = ( 0 0 )
      ( 1 2 )                           ( 1 0 ).

Then

e^{tJ_1} = e^{2t}(I_2 + tN) = (  e^{2t}      0  )
                              ( te^{2t}  e^{2t} )

and

e^{tA} = P diag(e^{tJ_1}, e^{3t})P^{-1}.

With

C = ( 2 )
    ( 1 )
    ( 4 ),

X = e^{tA}C gives the solution obtained for the system of differential equations in Example 3.
Markov chains: In probability theory, a stochastic process concerns events that change with time. If the probability that an object will be in a certain state at time t depends only on the state it occupied at time t − 1, then the process is called a Markov process. If, in addition, there are only a finite number of states, then the process is called a Markov chain. Possible states for a Markov chain might be positions an object can occupy, or qualities that change with time, as in Example 6.

If a Markov chain involves n states, let a_{ij} denote the probability that an object in the ith state at time t was in the jth state at time t − 1. Then a_{ij} is called a transition probability. A Markov chain with n states is then characterized by the set of all n² transition probabilities and the initial distribution in the various states. Suppose x_1, ..., x_n give the distribution at t = 0, and set X = (x_1, ..., x_n)^t; then the distribution at time t = 1 is given by the product AX, where A = (a_{ij}) is the n × n matrix of transition probabilities. When t = 2, the distribution is given by A²X, and in general, the distribution when t = k is given by the product A^kX.
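The evolution A^kX is easy to simulate directly. A sketch with a hypothetical two-state chain (the matrix entries here are illustrative, not from the text):

```python
def step(A, X):
    """One step of a Markov chain: (AX)_i = sum_j a_ij x_j, where a_ij is
    the probability of moving from state j to state i."""
    n = len(X)
    return [sum(A[i][j] * X[j] for j in range(n)) for i in range(n)]

def distribution(A, X, k):
    """The distribution after k steps, A^k X."""
    for _ in range(k):
        X = step(A, X)
    return X

# Hypothetical two-state chain; each column sums to 1.
A = [[0.9, 0.2],
     [0.1, 0.8]]
print(distribution(A, [0.5, 0.5], 1))   # one step: approximately [0.55, 0.45]
```

Note that each step preserves the total probability, since the columns of A sum to 1.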
396 8. CANONICAL FORMS UNDER SIMILARITY
The matrix A of transition probabilities is called a stochastic matrix. The sum of the elements in any column of a stochastic matrix must equal 1. This follows from the fact that the sum a_{1j} + a_{2j} + ⋯ + a_{nj} is the probability that an element in the jth state at time t was in some state at time t − 1.

If A is the stochastic matrix for a Markov process, then the ij entry in A^k is the probability of moving from the jth state to the ith state in k steps. If lim_{k→∞} A^k exists, then the process approaches a "steady-state" distribution. That is, if B = lim_{k→∞} A^k, then the distribution Y = BX is unchanged by the process, and AY = Y. This is the situation in Example 6.
If A is a diagonal matrix, then it is a simple matter to determine if a steady state is approached. For if A = diag(d_1, ..., d_n), then A^k = diag(d_1^k, ..., d_n^k). So for a diagonal matrix, lim_{k→∞} A^k exists provided |d| < 1 or d = 1 for all the diagonal entries d. This condition can easily be extended to diagonalizable matrices.
Theorem 8.37 If B = P^{-1}AP and lim_{k→∞} A^k exists, then

lim_{k→∞} B^k = P^{-1}(lim_{k→∞} A^k)P.

Proof. lim_{k→∞} B^k = lim_{k→∞} (P^{-1}AP)^k = lim_{k→∞} P^{-1}A^kP = P^{-1}(lim_{k→∞} A^k)P.

Using Theorem 8.37, we have the following condition.
Theorem 8.38 If A is diagonalizable and |λ| < 1 or λ = 1 for each characteristic value λ of A, then lim_{k→∞} A^k exists.
Example 6 Construct a Markov chain as follows: Call being brown-eyed state 1, and being blue-eyed state 2. Then a transition occurs from one generation to the next. Suppose the probability of a brown-eyed parent having a brown-eyed child is 5/6, so a_{11} = 5/6; and the probability of having a blue-eyed child is 1/6, so a_{21} = 1/6. Suppose further that the probability of a blue-eyed parent having a brown-eyed child is 1/3 = a_{12}; and the probability of having a blue-eyed child is 2/3 = a_{22}.
If the initial population is half brown-eyed and half blue-eyed, then X = (1/2, 1/2)^t gives the initial distribution. Therefore, the product

AX = \begin{pmatrix} 5/6 & 1/3 \\ 1/6 & 2/3 \end{pmatrix}\begin{pmatrix} 1/2 \\ 1/2 \end{pmatrix} = \begin{pmatrix} 7/12 \\ 5/12 \end{pmatrix}

indicates that after one generation, 7/12 of the population should be brown-eyed and 5/12 blue-eyed. After k generations, the proportions would be given by A^k(1/2, 1/2)^t.
The stochastic matrix for this Markov chain is diagonalizable. If

P = \begin{pmatrix} 2 & -1 \\ 1 & 1 \end{pmatrix},

then

P^{-1}AP = \begin{pmatrix} 1 & 0 \\ 0 & 1/2 \end{pmatrix}.

Therefore the hypotheses of Theorem 8.38 are satisfied, and

lim_{k→∞} A^k = P\Bigl[\lim_{k→∞}\begin{pmatrix} 1 & 0 \\ 0 & 1/2 \end{pmatrix}^{k}\Bigr]P^{-1} = P\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}P^{-1} = \begin{pmatrix} 2/3 & 2/3 \\ 1/3 & 1/3 \end{pmatrix}.

Hence this chain approaches a steady-state distribution. In fact,

\begin{pmatrix} 2/3 & 2/3 \\ 1/3 & 1/3 \end{pmatrix}\begin{pmatrix} 1/2 \\ 1/2 \end{pmatrix} = \begin{pmatrix} 2/3 \\ 1/3 \end{pmatrix},

so that, in time, the distribution should approach two brown-eyed individuals for each blue-eyed individual. Notice that, in this case, the steady-state distribution is independent of the initial distribution. That is, for any a with 0 ≤ a ≤ 1,

\begin{pmatrix} 2/3 & 2/3 \\ 1/3 & 1/3 \end{pmatrix}\begin{pmatrix} 1 - a \\ a \end{pmatrix} = \begin{pmatrix} 2/3 \\ 1/3 \end{pmatrix}.
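The steady state of Example 6 can be checked numerically: iterating the chain from any starting distribution approaches (2/3, 1/3). A sketch (the starting point 0.9, 0.1 is an arbitrary choice of ours):

```python
def step(A, X):
    """One step of the chain: (AX)_i = sum_j a_ij x_j."""
    return [sum(A[i][j] * X[j] for j in range(len(X))) for i in range(len(X))]

A = [[5/6, 1/3],
     [1/6, 2/3]]

# Start from an arbitrary distribution (a, 1 - a); the limit is the same.
X = [0.9, 0.1]
for _ in range(60):
    X = step(A, X)
print([round(x, 6) for x in X])   # approximately [2/3, 1/3]
```

Convergence is fast because the second characteristic value is 1/2, so the error shrinks by half at every step.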
Numerical analysis: A stochastic matrix A must have 1 as a characteristic value (see problem 8 at the end of this section), therefore lim_{k→∞} A^k cannot be 0. In contrast, our application of the Jordan form to numerical analysis requires that A^k approach the zero matrix.
Theorem 8.39 Suppose A is a matrix with complex entries. lim_{k→∞} A^k = 0 if and only if |λ| < 1 for every characteristic value λ of A.

Proof. If J = diag(J_1, ..., J_h) is a Jordan form for A, then lim_{k→∞} A^k = 0 if and only if lim_{k→∞} J^k = 0. But J^k = diag(J_1^k, ..., J_h^k), so it is sufficient to consider a single Jordan block λI_n + N, where λ is a characteristic value of A and N contains 1's on the subdiagonal and 0's elsewhere. That is, it must be shown that

lim_{k→∞} (λI_n + N)^k = 0 if and only if |λ| < 1.
If k ≥ n, then since the index of N is n,

(λI_n + N)^k = λ^k I_n + \binom{k}{1}λ^{k-1}N + \binom{k}{2}λ^{k-2}N^2 + \cdots + \binom{k}{n-1}λ^{k-n+1}N^{n-1},

where \binom{k}{h} is the binomial coefficient k!/(h!(k − h)!). Therefore

(λI_n + N)^k = \begin{pmatrix} λ^k & 0 & \cdots & 0 \\ \binom{k}{1}λ^{k-1} & λ^k & \cdots & 0 \\ \binom{k}{2}λ^{k-2} & \binom{k}{1}λ^{k-1} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ \binom{k}{n-1}λ^{k-n+1} & \binom{k}{n-2}λ^{k-n+2} & \cdots & λ^k \end{pmatrix}.

If lim_{k→∞} (λI_n + N)^k = 0, then |λ| < 1 since λ^k is on the diagonal. Conversely, if |λ| < 1, then

lim_{k→∞} \binom{k}{i}λ^{k-i} = 0

follows from repeated applications of L'Hôpital's rule for 1 ≤ i ≤ n, so that lim_{k→∞} (λI_n + N)^k = 0.
Using this result, we can show that the series Σ_{k=0}^∞ A^k behaves very much like a geometric series of the form Σ_{k=0}^∞ ar^k, a and r scalars.
Theorem 8.40 The series Σ_{k=0}^∞ A^k converges if and only if lim_{k→∞} A^k = 0. Further, when the series converges,

Σ_{k=0}^∞ A^k = (I_n − A)^{-1}.
Proof. Let S_k be the partial sum I_n + A + ⋯ + A^k.

(⇒) If Σ_{k=0}^∞ A^k converges, say Σ_{k=0}^∞ A^k = S, then for k > 1, A^k = S_k − S_{k−1} and

lim_{k→∞} A^k = lim_{k→∞} (S_k − S_{k−1}) = S − S = 0.

(⇐) Suppose lim_{k→∞} A^k = 0 and consider

(I_n − A)S_k = (I_n − A)(I_n + A + ⋯ + A^k) = I_n − A^{k+1}.

Taking the limit gives

I_n = lim_{k→∞}(I_n − A^{k+1}) = lim_{k→∞} (I_n − A)S_k = (I_n − A) lim_{k→∞} S_k.

Therefore the limit of the partial sums exists and is equal to (I_n − A)^{-1}, that is,

Σ_{k=0}^∞ A^k = (I_n − A)^{-1}.
Corollary 8.41 Σ_{k=0}^∞ A^k = (I_n − A)^{-1} if and only if |λ| < 1 for every characteristic value λ of A.
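Corollary 8.41 can be checked numerically for a small matrix whose characteristic values lie inside the unit circle: the partial sums of Σ A^k approach (I − A)^{-1}. A sketch (the sample matrix is our own):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

# Characteristic values 1/2 and 1/3, both of absolute value less than 1.
A = [[0.5, 0.0],
     [1.0, 1/3]]
I = [[1.0, 0.0],
     [0.0, 1.0]]

S, P = I, I            # partial sum of the series and the current power A^k
for _ in range(200):
    P = matmul(P, A)
    S = matadd(S, P)

# For this A, I - A = [[0.5, 0], [-1, 2/3]], whose inverse is [[2, 0], [3, 1.5]].
print([[round(x, 6) for x in row] for row in S])
```

The printed sum agrees with (I − A)^{-1} to the displayed precision.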
The above theorems are employed in iterative techniques. For example, suppose EX = B is a system of n linear equations in n unknowns. If A = I_n − E, then X = AX + B is equivalent to the original system. Given a nonzero vector X_0, define X_k by X_k = AX_{k−1} + B for k = 1, 2, .... If the sequence of vectors X_0, X_1, X_2, ... converges, then it converges to a solution of the given system. For if lim_{h→∞} X_h = C, then

C = AC + B = (I_n − E)C + B,

or EC = B. This is a typical iterative technique of numerical analysis.
Theorem 8.42 The sequence X_0, X_1, X_2, ... converges to a solution of EC = B if |λ| < 1 for every characteristic value λ of A = I_n − E.
Proof.

X_1 = AX_0 + B,
X_2 = AX_1 + B = A²X_0 + AB + B,
X_3 = AX_2 + B = A³X_0 + (A² + A + I_n)B,

and by induction

X_k = A^kX_0 + (A^{k−1} + ⋯ + A + I_n)B.

With the given assumption, lim_{k→∞} A^k = 0 and Σ_{k=0}^∞ A^k = (I_n − A)^{-1} by Theorem 8.40. Therefore lim_{k→∞} X_k = (I_n − A)^{-1}B, and (I_n − A)^{-1}B is a solution for EX = B, for E = I_n − A gives E(I_n − A)^{-1}B = B.
Notice that the solution obtained above agrees with Cramer's rule, because (I_n − A)^{-1}B = E^{-1}B. What if E were singular?
Because of our choice of problems and examples, it may not be clear
why one would elect to use such an iterative technique to solve a system of
linear equations. But suppose the coefficients of a system were given only
approximately, say to four or five decimal places. Then this technique might
provide a method for approximating a solution involving only matrix
multiplication and addition. This is obviously not something to be done by
hand, but a computer can be programmed to perform such a sequence of
operations.
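The iteration described above, X_k = AX_{k−1} + B with A = I_n − E, can be sketched in a few lines (the sample system is hypothetical, chosen so the characteristic values of A have absolute value less than 1):

```python
def iterate_solution(E, B, steps=100):
    """Solve EX = B by X_k = A X_{k-1} + B with A = I_n - E; converges when
    every characteristic value of A has absolute value less than 1."""
    n = len(B)
    A = [[(1.0 if i == j else 0.0) - E[i][j] for j in range(n)] for i in range(n)]
    X = [0.0] * n
    for _ in range(steps):
        X = [sum(A[i][j] * X[j] for j in range(n)) + B[i] for i in range(n)]
    return X

# Hypothetical system with E near the identity, so A = I - E is "small".
E = [[1.2, 0.1],
     [0.2, 0.9]]
B = [1.3, 1.1]          # chosen so the exact solution is X = (1, 1)
print([round(x, 6) for x in iterate_solution(E, B)])
```

Only matrix-vector multiplications and additions are used, which is exactly the appeal of the method discussed above.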
Problems
1. Solve the following systems of differential equations by making a change of variables.
a. D_t x = x − 6y, x(0) = 1
   D_t y = 2x − 6y, y(0) = 5.
b. D_t x = 7x − 4y, x(0) = 2
   D_t y = 9x − 5y, y(0) = 5.
c. D_t x = x − 2y + 2z, x(0) = 3
   D_t y = 5y − 4z, y(0) = 1
   D_t z = 6y − 5z, z(0) = 2.
d. D_t x = 5y − z, x(0) = 1
   D_t y = −x + 5y − z, y(0) = 2
   D_t z = −x + 3y + z, z(0) = 1.
2. Compute e^A and e^{tA} if A is

a. \begin{pmatrix} -1 & 4 \\ -2 & 5 \end{pmatrix}. b. c. \begin{pmatrix} 4 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 1 & 5 \end{pmatrix}. d. \begin{pmatrix} 3 & 0 & 0 & 0 \\ 1 & 3 & 0 & 0 \\ 0 & 1 & 3 & 0 \\ 0 & 0 & 1 & 3 \end{pmatrix}.
3. For the systems D_tX = AX, X(0) = C, of problem 1, show that e^{tA}C is the solution obtained by making a change of variables.
4. An nth order linear homogeneous differential equation with constant coefficients has the form (1). Define x_1 = x, x_2 = D_t x, x_3 = D_t²x, ..., x_n =
..., U_{k−1}, V, U_{k+1}, ..., U_n) is linear for each k, 1 ≤ k ≤ n, and for each choice of vectors U_1, ..., U_{k−1}, U_{k+1}, ..., U_n ∈ 𝒱. Therefore the claim is that det is a multilinear function of the n columns in an arbitrary matrix. We will not prove this, but simply note that this result was obtained for 2 × 2 matrices in problem 6, page 246.
We found that an elementary column operation of type 1, interchanging
columns, changes the sign of det A. This property will be introduced with
the following concept.
Definition A multilinear function f is alternating if f(U_1, ..., U_n) = 0 whenever at least two of the variables U_1, ..., U_n are equal.
408 APPENDIX: DETERMINANTS
If two columns of a real matrix are equal, then the determinant is zero, so if the function det is multilinear, then it is alternating.
We are now prepared to redefine determinants.
Definition A determinant function D from ℳ_{n×n}(F) to F is an alternating multilinear function of the n columns for which D(I_n) = 1.

Therefore a determinant function D must satisfy the following conditions for all A_1, ..., A_n ∈ ℳ_{n×1}(F) and all k, 1 ≤ k ≤ n:

1. D(A_1, ..., A_k + A_k′, ..., A_n) = D(A_1, ..., A_k, ..., A_n) + D(A_1, ..., A_k′, ..., A_n).
2. D(A_1, ..., cA_k, ..., A_n) = cD(A_1, ..., A_k, ..., A_n), for all c ∈ F.
3. D(A_1, ..., A_n) = 0 if at least two columns of A = (A_1, ..., A_n) are equal.
4. D(I_n) = D(E_1, ..., E_n) = 1, where E_1, ..., E_n is the standard basis for ℳ_{n×1}(F). These vectors are actually E_1^t, ..., E_n^t, but the transpose notation will be dropped for simplicity.
So little is required of a determinant function that one might easily
doubt that D could equal det. In fact there is no assurance that a determinant
function D exists. One cannot say, "of course one exists, det is obviously a
determinant function," for the proof of this statement relies on the unproved
result about expansions along arbitrary columns.
Assuming that a determinant function exists, is it clear why the condition D(I_n) = 1 is assumed? Suppose there are many alternating multilinear functions of the columns of a matrix. This is actually the case, and such a condition is required to obtain uniqueness and equality with det.
We will proceed by assuming that a determinant function exists and
search for a characterization in terms of the entries from the matrix A. The
first step is to note that D must be skew-symmetric. That is, interchanging two
variables changes the sign of the functional value.
Theorem A1 If D exists, then it is skew-symmetric. That is, interchanging two columns of A changes D(A) to −D(A).

Proof. It must be shown that if 1 ≤ h < k ≤ n, then

D(A_1, ..., A_h, ..., A_k, ..., A_n) = −D(A_1, ..., A_k, ..., A_h, ..., A_n).
1. A Determinant Function 409
For the matrix with A_h + A_k in both its hth and kth columns,

0 = D(A_1, ..., A_h + A_k, ..., A_h + A_k, ..., A_n)
  = D(A_1, ..., A_h, ..., A_h, ..., A_n) + D(A_1, ..., A_h, ..., A_k, ..., A_n)
  + D(A_1, ..., A_k, ..., A_h, ..., A_n) + D(A_1, ..., A_k, ..., A_k, ..., A_n)
  = D(A_1, ..., A_h, ..., A_k, ..., A_n) + D(A_1, ..., A_k, ..., A_h, ..., A_n).

This shows that the two functional values are additive inverses in F, establishing the theorem.
Skew-symmetry is not equivalent to alternating. However, over a field satisfying 1 + 1 ≠ 0, a skew-symmetric multilinear function is alternating; see problem 6 at the end of this section.
To find an expression for D(A), write each column A_k of A in the form

A_k = \sum_{i=1}^{n} a_{ik}E_i.

For example, if the third column of A is (2, 8, 5)^t, then A_3 = 2E_1 + 8E_2 + 5E_3. When every column is written in this way it will be necessary to keep track of which column contributes which vector E_i. Therefore we introduce a second index and write

A_k = \sum_{i_k=1}^{n} a_{i_k k}E_{i_k}.

So in the above example, A_3 = 2E_{1_3} + 8E_{2_3} + 5E_{3_3}. With this notation, A takes the form

A = (A_1, \ldots, A_n) = \Bigl(\sum_{i_1=1}^{n} a_{i_1 1}E_{i_1}, \ldots, \sum_{i_n=1}^{n} a_{i_n n}E_{i_n}\Bigr).

Now if a determinant function D exists, it is linear in each variable. Therefore, writing A in this way shows that

D(A) = \sum_{i_1=1}^{n} \sum_{i_2=1}^{n} \cdots \sum_{i_n=1}^{n} a_{i_1 1}a_{i_2 2}\cdots a_{i_n n}\,D(E_{i_1}, E_{i_2}, \ldots, E_{i_n}).
Example 2 Use the above formula to compute D(A) when A is the 2 × 2 matrix

A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}.

In this case the expression A_k = Σ_{i_k=1}^2 a_{i_k k}E_{i_k} becomes

A_1 = a_{11}E_1 + a_{21}E_2 and A_2 = a_{12}E_1 + a_{22}E_2.

Therefore

D(A) = \sum_{i_1=1}^{2}\sum_{i_2=1}^{2} a_{i_1 1}a_{i_2 2}\,D(E_{i_1}, E_{i_2})
     = a_{11}a_{12}D(E_1, E_1) + a_{11}a_{22}D(E_1, E_2) + a_{21}a_{12}D(E_2, E_1) + a_{21}a_{22}D(E_2, E_2).
But D must be skew-symmetric and D(E_1, E_2) = 1. Therefore,

D(A) = a_{11}a_{22} − a_{21}a_{12}.

That is, if D exists, then it equals det on 2 × 2 matrices.
This example suggests the general situation, but it may be useful to compute D(A) for an arbitrary 3 × 3 matrix using the above formula. You should find that D(A) is again equal to det A when A is 3 × 3.

There are now two ways to proceed in showing that a determinant function exists. The classical approach is to show that such a function can only assign one value to each matrix of the form (E_{i_1}, ..., E_{i_n}). Of course, if two columns are equal, then the value must be 0. But if no two columns are equal, then all the vectors E_1, ..., E_n are in the matrix. So, using enough interchanges, the sequence E_{i_1}, ..., E_{i_n} can be rearranged to E_1, ..., E_n. That is,

D(E_{i_1}, ..., E_{i_n}) = ±D(E_1, ..., E_n) = ±1.

In the next section we will show that this sign is determined only by the order of the indices i_1, ..., i_n. A more abstract approach to the problem of existence will be considered in Section 3. Using this approach we will ignore this sign problem and obtain an expression for D(A) in terms of cofactors.
Before proceeding to show that a determinant function exists, it should be pointed out that if such a function exists, then it must be unique, for a function is single valued. So if D exists, then D(E_{i_1}, ..., E_{i_n}) has a unique value for each choice of i_1, ..., i_n. This value depends only on the indices i_1, ..., i_n and the value of D(E_1, ..., E_n). Since D(E_1, ..., E_n) is required to be 1, we have:
Theorem A2 If a determinant function exists, then it is unique.
Problems

1. Assuming a determinant function exists, let {E_i} be the standard basis in ℳ_{3×1}(F) and find
a. D(E_1, E_3, E_2). b. D(E_2, E_1, E_2). c. D(E_3, E_2, E_1). d. D(E_3, E_2).

2. Assuming a determinant function exists, let {E_i} be the standard basis in ℳ_{4×1}(F) and find
a. D(E_3, E_4, E_2). b. D(E_3, E_2, E_1, E_3). c. D(E_2, E_4, E_1, E_3). d. D(E, E_1).
3. Show that the function W ↦ det(U_1, ..., U_{k−1}, W, U_{k+1}, ..., U_n) is a linear function from ℳ_{n×1}(R) to R for any n − 1 vectors U_1, ..., U_{k−1}, U_{k+1}, ..., U_n from ℳ_{n×1}(R).

4. Show that conditions 1 and 2 following the definition of a determinant function are equivalent to conditions 1* and 2 if 1* is given by:
1.* D(A_1, ..., A_h + cA_k, ..., A_n) = D(A_1, ..., A_h, ..., A_n) for any two columns A_h and A_k, h ≠ k, and any c ∈ F.

5. Prove that if U_1, ..., U_n are linearly dependent in ℳ_{n×1}(F) and a determinant function exists, then D(U_1, ..., U_n) = 0.

6. Suppose f is a multilinear function of n variables, and F is a field in which 1 + 1 ≠ 0. Prove that if f is skew-symmetric, then f is alternating.

7. Could the definition of a determinant function have been made in terms of rows instead of columns?

8. Suppose f is a multilinear function such that f(U_1, ..., U_n) = 0 whenever two adjacent columns are equal. Prove that f is alternating.

9. Suppose f is a skew-symmetric function of n variables. Prove that if f is linear in the first variable, then f is multilinear.
2. Permutations and Existence

Definition A permutation is a one-to-one transformation of a finite set onto itself.

We are interested in permutations of the first n positive integers and will let 𝒫_n denote the set of all permutations of {1, ..., n}. For example, if σ(1) = 3, σ(2) = 4, σ(3) = 1, σ(4) = 5, and σ(5) = 2, then σ is a member of 𝒫_5. Another member of 𝒫_5 is given by τ(1) = 2, τ(2) = 1, τ(3) = 3, τ(4) = 5, τ(5) = 4. Since permutations are 1–1 maps, their compositions are 1–1, so two permutations may be composed to obtain a permutation. For σ and τ above, σ∘τ is the permutation μ ∈ 𝒫_5 given by μ(1) = 4, μ(2) = 3, μ(3) = 1, μ(4) = 2, and μ(5) = 5. Find τ∘σ, and note that it is not μ.
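The compositions just computed can be checked directly; a sketch representing each permutation as a dictionary on {1, ..., 5}:

```python
def compose(sigma, tau):
    """(sigma o tau)(i) = sigma(tau(i)); permutations stored as dicts."""
    return {i: sigma[tau[i]] for i in tau}

sigma = {1: 3, 2: 4, 3: 1, 4: 5, 5: 2}
tau   = {1: 2, 2: 1, 3: 3, 4: 5, 5: 4}

mu = compose(sigma, tau)       # the mu of the text
print(mu)                      # {1: 4, 2: 3, 3: 1, 4: 2, 5: 5}
print(compose(tau, sigma))     # tau o sigma, which is not mu
```

Comparing the two printed results shows at once that composition in 𝒫_5 is not commutative.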
Theorem A3 The set of permutations 𝒫_n together with the operation of composition is a group. If n > 2, then the group is not commutative.

Proof. Closure of 𝒫_n under composition was obtained above. Associativity is a general property of composition. The identity map I is the identity in 𝒫_n. And finally, since permutations are 1–1, they have inverses, so 𝒫_n is a group. The proof of noncommutativity is left as a problem.
In the previous section we found that D(A) is expressed in terms of D(E_{i_1}, ..., E_{i_n}), and the value of D(E_{i_1}, ..., E_{i_n}) depends on the arrangement of the integers i_1, ..., i_n. If this arrangement contains all n integers, then it is a permutation in 𝒫_n given by σ(1) = i_1, ..., σ(n) = i_n. Therefore the expression obtained for D(A) might be written as

D(A) = \sum_{σ∈𝒫_n} a_{σ(1)1}a_{σ(2)2} \cdots a_{σ(n)n}\,D(E_{σ(1)}, \ldots, E_{σ(n)}).

Still assuming such a function D exists, the problem now is to determine how the sign of D(E_{σ(1)}, ..., E_{σ(n)}) depends on σ.

Definition A transposition is a permutation that interchanges two elements and sends every other element to itself.

For a given permutation σ ∈ 𝒫_n, or i_1, ..., i_n, there is a sequence of transpositions that rearranges i_1, ..., i_n into the normal order 1, ..., n. That is, there is a sequence of transpositions which, when composed with the permutation σ, yields the identity permutation I.
Example 1 Use a sequence of transpositions to rearrange the permutation 4, 2, 1, 5, 3 into 1, 2, 3, 4, 5.

This may be done with three transpositions:

4, 2, 1, 5, 3 → 1, 2, 4, 5, 3   interchange 4 and 1
            → 1, 2, 4, 3, 5   interchange 5 and 3
            → 1, 2, 3, 4, 5   interchange 4 and 3.

But this sequence of transpositions is not unique, for example:

4, 2, 1, 5, 3 →_{2,1} 4, 1, 2, 5, 3 →_{4,1} 1, 4, 2, 5, 3 →_{4,2} 1, 2, 4, 5, 3 →_{5,3} 1, 2, 4, 3, 5 →_{4,3} 1, 2, 3, 4, 5

where the numbers under each arrow indicate the pair interchanged by a transposition.
Since both sequences in Example 1 contain an odd number of transpositions, either sequence would give

D(E_4, E_2, E_1, E_5, E_3) = (−1)^5 D(E_1, E_2, E_3, E_4, E_5) = −1.

But if 4, 2, 1, 5, 3 could be transformed to 1, 2, 3, 4, 5 with an even number of transpositions, then the determinant function D would not exist. Therefore it is necessary to prove that only an odd number of transpositions will do the job. The general proof of such an "invariance of parity" relies on a special polynomial.
The product of all factors x_i − x_j for 1 ≤ i < j ≤ n is written as

\prod_{i<j} (x_i − x_j).

Denote this polynomial by P(x_1, ..., x_n). Then

P(x_1, x_2) = x_1 − x_2,
P(x_1, x_2, x_3) = (x_1 − x_2)(x_1 − x_3)(x_2 − x_3),

and so on.
For σ ∈ 𝒫_n, apply σ to the indices in the polynomial P(x_1, ..., x_n). Then for the indices in x_{σ(i)} − x_{σ(j)}, either σ(i) < σ(j) or σ(j) < σ(i), and x_{σ(i)} − x_{σ(j)} = −(x_{σ(j)} − x_{σ(i)}). Therefore the factors in \prod_{i<j}(x_{σ(i)} − x_{σ(j)}) differ from the factors in \prod_{i<j}(x_i − x_j) by at most multiples of −1 and

P(x_{σ(1)}, ..., x_{σ(n)}) = ±P(x_1, ..., x_n).

Definition A permutation σ ∈ 𝒫_n is even if

P(x_{σ(1)}, ..., x_{σ(n)}) = P(x_1, ..., x_n)

and odd if

P(x_{σ(1)}, ..., x_{σ(n)}) = −P(x_1, ..., x_n).
Example 2 Let σ ∈ 𝒫_3 be given by σ(1) = 3, σ(2) = 2, and σ(3) = 1; then σ is odd, for

P(x_{σ(1)}, x_{σ(2)}, x_{σ(3)}) = (x_{σ(1)} − x_{σ(2)})(x_{σ(1)} − x_{σ(3)})(x_{σ(2)} − x_{σ(3)})
= (x_3 − x_2)(x_3 − x_1)(x_2 − x_1)
= (−1)^3 (x_2 − x_3)(x_1 − x_3)(x_1 − x_2)
= −P(x_1, x_2, x_3).
Theorem A4 Every transposition is an odd permutation.

Proof. Suppose σ ∈ 𝒫_n is a transposition interchanging h and k with h < k. Then there are 2(k − h − 1) + 1 (an odd number) factors in \prod_{i<j}(x_{σ(i)} − x_{σ(j)}) that must be multiplied by −1 to obtain \prod_{i<j}(x_i − x_j). If h < i < k, then k − h − 1 of these factors are

x_{σ(h)} − x_{σ(i)} = x_k − x_i.

Another k − h − 1 factors in the wrong order are

x_{σ(i)} − x_{σ(k)} = x_i − x_h,

and finally

x_{σ(h)} − x_{σ(k)} = x_k − x_h.

These are all the factors that must be multiplied by −1, since all other choices for i < j yield σ(i) < σ(j). Thus σ is an odd permutation.
Theorem A5 If i_1, ..., i_n gives an even (odd) permutation in 𝒫_n and there exist m transpositions that transform i_1, ..., i_n into 1, ..., n, then m is even (odd).

Proof. Suppose σ is an even permutation given by σ(1) = i_1, ..., σ(n) = i_n, and τ_1, ..., τ_m are transpositions such that τ_m ∘ ⋯ ∘ τ_1 ∘ σ = I. Since I is an even permutation, applying the permutation τ_m ∘ ⋯ ∘ τ_1 ∘ σ to the indices in P(x_1, ..., x_n) leaves the polynomial unchanged. But σ leaves P(x_1, ..., x_n) unchanged, and each transposition changes the sign. Therefore the composition changes the polynomial to (−1)^m P(x_1, ..., x_n), and m must be even. Similarly, m is odd if σ is an odd permutation.
Definition The sign or signum of a permutation σ, denoted sgn σ, is 1 if σ is even and −1 if σ is odd.

If there exist m transpositions that transform a permutation into the normal order, then sgn σ = (−1)^m, and we can write

P(x_{σ(1)}, ..., x_{σ(n)}) = (sgn σ)P(x_1, ..., x_n).

A value for m can be obtained by counting the number of inversions in the permutation σ. An inversion in a permutation occurs whenever an integer precedes a smaller integer. Thus the permutation 2, 1, 5, 4, 3 contains four inversions. Each inversion in a permutation can be turned around with a single transposition. Therefore if σ has m inversions, then sgn σ = (−1)^m.

Example 3 The permutation given in Example 1 contains five inversions, therefore sgn(4, 2, 1, 5, 3) = (−1)^5 = −1. The second sequence of transpositions used to transform 4, 2, 1, 5, 3 to 1, 2, 3, 4, 5 is a sequence that inverts each of the five inversions.
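Counting inversions gives a quick computation of sgn, matching Example 3:

```python
def sgn(perm):
    """sgn = (-1)^m where m is the number of inversions: pairs appearing
    with the larger integer first."""
    m = sum(1
            for i in range(len(perm))
            for j in range(i + 1, len(perm))
            if perm[i] > perm[j])
    return 1 if m % 2 == 0 else -1

print(sgn([2, 1, 5, 4, 3]))   # four inversions, sign +1
print(sgn([4, 2, 1, 5, 3]))   # five inversions (Example 3), sign -1
```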
Now if E_{i_1}, ..., E_{i_n} is a rearrangement of E_1, ..., E_n, then the number of interchanges required to obtain the standard order depends on whether i_1, ..., i_n is an even or odd permutation. Therefore

D(E_{i_1}, ..., E_{i_n}) = sgn(i_1, ..., i_n)D(E_1, ..., E_n) = sgn(i_1, ..., i_n),

and the formula for D(A) obtained on page 410 becomes

D(A) = \sum_{σ∈𝒫_n} (sgn σ)a_{σ(1)1}a_{σ(2)2} \cdots a_{σ(n)n}.

So to prove the existence of a determinant function it is only necessary to show that this formula does in fact define such a function.
Theorem A6 The function D: ℳ_{n×n}(F) → F defined by

D(A) = \sum_{σ∈𝒫_n} (sgn σ)a_{σ(1)1} \cdots a_{σ(n)n} for A = (a_{ij})

is a determinant function.

Proof. Showing that D is multilinear and that D(I_n) = 1 is not difficult and is left to the reader.

In proving that D is alternating we will use the alternating group of even permutations, 𝒜_n = {μ ∈ 𝒫_n | sgn μ = 1}. If τ is a transposition in 𝒫_n, then every odd permutation can be written as τ∘μ for some μ ∈ 𝒜_n; see problem 8 below.
To show that D is alternating, suppose the hth and kth columns of A are equal. Then if τ is the transposition interchanging h and k,

D(A) = \sum_{μ∈𝒜_n} (sgn μ)a_{μ(1)1} \cdots a_{μ(n)n} + \sum_{μ∈𝒜_n} (sgn τ∘μ)a_{τ∘μ(1)1} \cdots a_{τ∘μ(n)n}.

For each μ ∈ 𝒜_n, sgn μ = 1 and sgn τ∘μ = −1. Further, since the hth and kth columns of A are equal,

a_{μ(1)1} \cdots a_{μ(n)n} = a_{τ∘μ(1)1} \cdots a_{τ∘μ(n)n}.

Therefore D(A) = 0, and D is alternating.
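The formula of Theorem A6 can be evaluated directly for small matrices; a sketch (the 3 × 3 sample matrix is our own):

```python
from itertools import permutations

def sgn(perm):
    m = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
            if perm[i] > perm[j])
    return -1 if m % 2 else 1

def det_by_permutations(A):
    """D(A) = sum over sigma of (sgn sigma) a_{sigma(1)1} ... a_{sigma(n)n}."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = sgn(sigma)
        for col in range(n):
            term *= A[sigma[col]][col]
        total += term
    return total

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
print(det_by_permutations(A))                  # -3
print(det_by_permutations([[1, 1], [2, 2]]))   # 0: equal columns, D alternating
```

The second call illustrates the alternating property just proved.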
Problems

1. Write out all the permutations in 𝒫_3 and 𝒫_4 and determine which are even and which are odd.

2. Let σ, τ ∈ 𝒫_5 be given by σ(1) = 4, σ(2) = 2, σ(3) = 5, σ(4) = 1, σ(5) = 3 and τ(1) = 5, τ(2) = 1, τ(3) = 2, τ(4) = 4, τ(5) = 3.
Find a. σ∘τ. b. τ∘σ. c. σ^{-1}. d. τ^{-1}.

3. Suppose σ ∈ 𝒫_3 satisfies σ(1) = 2, σ(2) = 3, and σ(3) = 1. Find all permutations in 𝒫_3 that commute with σ, that is, all τ ∈ 𝒫_3 for which σ∘τ = τ∘σ.

4. Prove that if n > 2, then 𝒫_n is not a commutative group.

5. Show that sgn σ∘τ = (sgn σ)(sgn τ) for all σ, τ ∈ 𝒫_n.

6. a. Write out P(x_1, x_2, x_3, x_4).
b. Use this polynomial to find the sign of σ and τ if σ is given by σ(1) = 4, σ(2) = 2, σ(3) = 1, σ(4) = 3 and τ is given by 1, 4, 3, 2.

7. How many elements are there in 𝒫_n?

8. a. Show that 𝒜_n = {μ ∈ 𝒫_n | μ is even} is a subgroup of 𝒫_n.
b. Suppose τ ∈ 𝒫_n is a transposition. Show that each odd permutation σ ∈ 𝒫_n can be expressed as τ∘μ for some μ ∈ 𝒜_n.

9. Find the number of inversions in the following permutations. Use the number of inversions to determine the sign of each.
a. 5, 4, 3, 2, 1. b. 6, 5, 4, 3, 2, 1. c. 5, 1, 4, 2, 3. d. 4, 3, 6, 2, 1, 5.

10. Suppose A = (A_1, ..., A_n) and B = (b_{ij}) are n × n matrices.
a. Show that the product may be written as

AB = \Bigl(\sum_{i=1}^{n} b_{i1}A_i, \ldots, \sum_{i=1}^{n} b_{in}A_i\Bigr).

b. Use the definition given for D in Theorem A6 and part a to prove that D(AB) = D(A)D(B).

11. Suppose f is an alternating multilinear function of k variables and σ ∈ 𝒫_k. Prove that f(V_{σ(1)}, ..., V_{σ(k)}) = (sgn σ)f(V_1, ..., V_k).
3. Cofactors, Existence, and Applications
We will obtain another definition for D(A) by again assuming that a determinant function exists. First, since D is to be linear in the first variable,

D(A) = D\Bigl(\sum_{k=1}^{n} a_{k1}E_k, A_2, \ldots, A_n\Bigr) = \sum_{k=1}^{n} a_{k1}D(E_k, A_2, \ldots, A_n).
This expresses D(A) in terms of D(E_k, A_2, ..., A_n). But since D is alternating, D(E_k, A_2, ..., A_n) does not involve the kth entries of A_2, ..., A_n. Therefore, if A_h^{(k)} denotes the hth column of A with the kth entry deleted, A_h^{(k)} = \sum_{i≠k} a_{ih}E_i, and D(E_k, A_2, ..., A_n) differs from D(A_2^{(k)}, ..., A_n^{(k)}) by at most a sign. Write

D(E_k, A_2, ..., A_n) = cD(A_2^{(k)}, ..., A_n^{(k)}).

The value of c can be found by setting the columns of A equal to basis vectors. For

D(E_k, E_1, ..., E_{k−1}, E_{k+1}, ..., E_n) = cD(E_1^{(k)}, ..., E_{k−1}^{(k)}, E_{k+1}^{(k)}, ..., E_n^{(k)}) = cD(I_{(n−1)×(n−1)}) = c.

On the other hand, D must be skew-symmetric, so

D(E_k, E_1, ..., E_{k−1}, E_{k+1}, ..., E_n) = (−1)^{k−1}D(E_1, ..., E_{k−1}, E_k, E_{k+1}, ..., E_n) = (−1)^{k−1}D(I_n) = (−1)^{k−1}.

This gives c = (−1)^{k−1}, but it will be better to write c in the form (−1)^{k+1}.
Therefore

D(A) = \sum_{k=1}^{n} a_{k1}(−1)^{k+1}D(A_2^{(k)}, \ldots, A_n^{(k)}).

Now since (A_2^{(k)}, ..., A_n^{(k)}) is the (n − 1) × (n − 1) matrix obtained from A by deleting the kth row and the 1st column, we can give an inductive definition of D just as was done in Chapter 3.

Theorem A7 Define D: ℳ_{n×n}(F) → F inductively by D(A) = a_{11} when n = 1 and A = (a_{11}); and

D(A) = D(A_1, A_2, \ldots, A_n) = \sum_{k=1}^{n} a_{k1}(−1)^{k+1}D(A_2^{(k)}, \ldots, A_n^{(k)}) when n > 1.

Then D is a determinant function for each n.

Proof. It is clear from the definition that D is linear in the first variable, and a straightforward induction argument shows that D is linear in every other variable.

To prove that D is alternating, first note that the definition gives

D\begin{pmatrix} a & b \\ c & d \end{pmatrix} = a(−1)^{1+1}D(d) + c(−1)^{1+2}D(b) = ad − cb.

So

D\begin{pmatrix} a & a \\ c & c \end{pmatrix} = ac − ca = 0,

and D is alternating for 2 × 2 matrices. Therefore assume D is alternating for (n − 1) × (n − 1) matrices, n > 2. If A is an n × n matrix with two columns equal, then there are two cases.

Case I: Neither of the equal columns is the first column of A. In this case two columns of (A_2^{(k)}, ..., A_n^{(k)}) are equal, so by the induction assumption D(A_2^{(k)}, ..., A_n^{(k)}) = 0 for each k, and D(A) = 0.

Case II: The first column of A equals some other column. Since Case I and the multilinearity of D imply that D is skew-symmetric in all variables but the first, it is sufficient to assume the first two columns of A are equal.
Thus

D(A) = D(A_1, A_1, A_3, \ldots, A_n) = \sum_{k=1}^{n} a_{k1}(−1)^{k+1}D(A_1^{(k)}, A_3^{(k)}, \ldots, A_n^{(k)}).

To evaluate the terms in this expression, let A_j^{(k,h)} denote the jth column of A with the hth and kth entries deleted, so

A_j^{(k,h)} = \sum_{i≠k,h} a_{ij}E_i.

Then

D(A_1^{(k)}, A_3^{(k)}, \ldots, A_n^{(k)}) = \sum_{h=1}^{k−1} a_{h1}(−1)^{h+1}D(A_3^{(k,h)}, \ldots, A_n^{(k,h)}) + \sum_{h=k+1}^{n} a_{h1}(−1)^{h}D(A_3^{(k,h)}, \ldots, A_n^{(k,h)}),

and

D(A) = D(A_1, A_1, A_3, \ldots, A_n)
     = \sum_{k=1}^{n}\sum_{h=1}^{k−1} a_{k1}a_{h1}(−1)^{k+h}D(A_3^{(k,h)}, \ldots, A_n^{(k,h)})
     + \sum_{k=1}^{n}\sum_{h=k+1}^{n} a_{k1}a_{h1}(−1)^{k+h+1}D(A_3^{(k,h)}, \ldots, A_n^{(k,h)}).

The product a_{i1}a_{j1} appears twice in this expression for each i ≠ j, and since the coefficients are (−1)^{i+j}D(A_3^{(i,j)}, ..., A_n^{(i,j)}) and (−1)^{i+j+1}D(A_3^{(i,j)}, ..., A_n^{(i,j)}), the two terms cancel and D(A) = 0. Therefore D is alternating.

To complete the proof it is necessary to show that D(I_n) = 1. This is proved by induction, noting that if k ≠ 1, then the multilinearity of D implies that D(E_2^{(k)}, ..., E_n^{(k)}) = 0.
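The inductive definition of Theorem A7 translates directly into a recursive computation; on the sample matrix used earlier for the permutation formula it gives the same value, as uniqueness requires. (A sketch; with zero-based rows, (−1)^k plays the role of (−1)^{k+1}.)

```python
def D(A):
    """Theorem A7's inductive definition: expand along the first column."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for k in range(n):
        minor = [row[1:] for i, row in enumerate(A) if i != k]   # delete row k, column 1
        total += A[k][0] * (-1) ** k * D(minor)
    return total

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
print(D(A))   # -3, the same value the permutation formula of Theorem A6 gives
```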
Since there is only one determinant function, the functions defined in Theorems A6 and A7 must be equal. The first step in showing that this function equals det is to generalize the idea used above. If we write the hth column A_h as \sum_{k=1}^{n} a_{kh}E_k, then

D(A) = \sum_{k=1}^{n} a_{kh}D(A_1, \ldots, A_{h−1}, E_k, A_{h+1}, \ldots, A_n)
     = \sum_{k=1}^{n} a_{kh}(−1)^{k+h}D(A_1^{(k)}, \ldots, A_{h−1}^{(k)}, A_{h+1}^{(k)}, \ldots, A_n^{(k)}).

For the last equality, note that if k < h, then

D(E_1, \ldots, E_{k−1}, E_{k+1}, \ldots, E_h, E_k, E_{h+1}, \ldots, E_n)
= (−1)^{h−(k+1)+1}D(E_1, \ldots, E_{k−1}, E_k, E_{k+1}, \ldots, E_h, E_{h+1}, \ldots, E_n)
= (−1)^{h−k}D(I_n) = (−1)^{h+k}.

The case k > h is handled similarly. Now the matrix (A_1^{(k)}, ..., A_{h−1}^{(k)}, A_{h+1}^{(k)}, ..., A_n^{(k)}) is obtained from A by deleting the kth row and the hth column. Therefore we will denote D(A_1^{(k)}, ..., A_{h−1}^{(k)}, A_{h+1}^{(k)}, ...,
empty.
2. a. 1 + 1 = 2. b. Consider the product (2n_1 + 1)(2n_2 + 1).
3. a. Closed under multiplication but not addition. b. Neither. c. Closed under addition but not multiplication. d. Both. e. Both.
4. Addition in R is commutative if r + s = s + r for all r, s ∈ R.
5. (r + s)t = rt + st can be proved using the given distributive law and the fact that multiplication is commutative. Can you prove this?
2 Page 6
1. a. (5, −3). b. (−1, 0). c. 0. d. (1/4, 7/3).
2. a. (2, −5/2). b. 0. c. (6, −1). d. (−3, 6).
3. a. (2, 10). b. (−1/3, −1/3). c. (5, −1). d. (13/2, 1).
4. Suppose U is the ordered pair (a, b), then
U + 0 = (a, b) + (0, 0)   Definition of 0
      = (a + 0, b + 0)    Definition of addition in 𝒱_2
      = (a, b)            Zero is the additive identity in R
      = U.
5. Set U = (a, b), V = (c, d), and W = (e, f) and perform the indicated operations, thus
U + (V + W) = (a, b) + ((c, d) + (e, f))
            = (a, b) + (c + e, d + f)   Reason?
Continue until you obtain (U + V) + W; it should take four more steps, with reasons, and the fact that addition is associative in R must be used.
6. No. The operation yields scalars (numbers) and vectors are ordered pairs of numbers.
7. a. Vectors in 𝒱_2 are ordered pairs.
b. Definition of vector addition in 𝒱_2.
c. By assumption U + W = U.
d. Definition of equality of ordered pairs.
Chapter 1 431
e. Property of 0 ∈ R and the fact that equations have unique solutions in R. (The latter can be derived from the properties of addition.)
f. Definition of the zero vector.
8. For U = (a, b) ∈ 𝒱_2, assume there is a vector T ∈ 𝒱_2 such that U + T = 0 and show that T = (−a, −b).
9. All these properties are proved using properties of real numbers. To show that (r + s)U = rU + sU, let U = (a, b), then
[r + s]U = [r + s](a, b)
         = ([r + s]a, [r + s]b)    Definition of scalar multiplication in 𝒱_2
         = (ra + sa, rb + sb)      Distributive law in R
         = (ra, rb) + (sa, sb)     Definition of addition in 𝒱_2
         = r(a, b) + s(a, b)       Definition of scalar multiplication in 𝒱_2
         = rU + sU.
10. b. If U = (a, b), then from problem 8, −U = (−a, −b). But (−1)U = (−1)(a, b) = ((−1)a, (−1)b) = (−a, −b).
11. Subtraction would be commutative if x − y = y − x for all x, y ∈ R. This does not hold, for 2 − 5 = −3 while 5 − 2 = 3. This shows why we use signed numbers.
3 Page 12
1. b. If (x, y) is a point on the line representing all scalar multiples of (3, −4), then (x, y) = r(3, −4) for some r ∈ R. Therefore x = 3r and y = −4r, so x/3 = r = y/(−4) and y = (−4/3)x.
3. The three points are collinear with the origin.
4. Remember that there are many equations for each line.
a. P = (t, t), t ∈ R. b. P = t(1, 5), t ∈ R. c. P = t(3, 2), t ∈ R. d. P = (t, 0), t ∈ R. e. P = t(1, 5) + (1, −1), t ∈ R. f. P = (4, t), t ∈ R. g. P = t(1, 1/2) + (1/2, 1), t ∈ R.
5. a. (1, 1). b. (1, 5). c. (3, 2). d. (1, 0). e. (1, 5). f. (0, 1). g. (2, 1).
6. a. P = t(2, 3) + (5, −2), t ∈ R. b. P = t(1, 1) + (4, 0), t ∈ R. c. P = t(0, 1) + (1, 2), t ∈ R.
7. a. x − 3y + 2 = 0. b. y = x. c. 4x − y − 2 = 0.
8. The equations have the same graph, ℓ, and the vectors V and B are on ℓ.
Therefore there exist scalars s_0 and t_0 such that V = s_0A + B and B = t_0U + V. This leads to U = (−s_0/t_0)A. What if t_0 = 0?

4 Page 18
1. a. 2. b. 0. c. 0. d. −23.
2. a. 1. b. 1. c. 5. d. √10.
432 ANSWERS AND SUGGESTIONS
3. Let U = (a, b) and compute the length of rU; remember that √(r²) = |r|.
4. a. It has none. b. Use the same procedure as when proving the properties of addition in 𝒱_2.
6. a. P = t(1, −3) + (2, −5), t ∈ R. b. P = t(1, 0) + (4, 7), t ∈ R. c. P = t(3, −7) + (6, 0), t ∈ R.
7. a. (1/2, −1/2). b. (7, 7). c. (9/2, 7/2).
8. Q = H
11. Notice that if A, B, C, and D are the vertices of a parallelogram, with B opposite D, then A − B = D − C. Write out what must be proved in vector form and use this relation.
5 Page 21
1. a. 90°. b. cos⁻¹(−2/√5). c. 30°.
2. a. P = t(3, 1), t ∈ R. b. P = t(1, −1) + (5, 4), t ∈ R. c. P = t(1, 0) + (2, 1), t ∈ R.
7. If A, B, C are the vertices of the triangle with A and B at the ends of the diameter and O is the center, then A = −B. Consider (A − C) ∘ (B − C).
6 Page 28
1. a. (8, 1, 6), 𝒱_3 or ℰ_3. b. (5, 0, 10, 18), 𝒱_4 or ℰ_4. c. 0, ℰ_3. d. (0, −1, −1, −6, −4), 𝒱_5 or ℰ_5. e. 1, ℰ_3. f. −6, ℰ_4.
2. Since U, V ∈ 𝒱_3, let U = (a, b, c), V = (d, e, f) and proceed as in the corresponding proof for 𝒱_2.
4. a. √3/2. b. √14. c. 1.
5. Choose a coordinate system with a vertex of the cube at the origin and the coordinate axes along edges of the cube. cos⁻¹(1/√3).
6. a. P = t(5, 3, −8) + (2, −1, 4), t ∈ R. d. P = t(2, 8, −5, −7, −5) + (1, −1, 4, 2, 5), t ∈ R.
8. a. The intersection of the two walls. b. The floor. c. The plane parallel to and 4 feet above the floor. d. The plane parallel to and 3 feet in front of the wall representing the yz coordinate plane. e. The line on the wall representing the yz coordinate plane which intersects the floor and the other wall in points 3 feet from the corner. f. The plane parallel to and 1 foot behind the wall representing the xz coordinate plane. g. The plane which bisects the angle between the two walls. h. The intersection of the plane in part g with the floor. i. The plane which intersects the floor and two walls in an equilateral triangle with side 3√2.
9. a. [P - (1, 2, 0)] ∘ (0, 0, 1) = 0; z = 0.
b. [P - (4, 0, 8)] ∘ (-3, 2, 1) = 0; 3x - 2y - z = 4.
c. [P - (1, 1)] ∘ (2, -3) = 0; 2x - 3y = -1.
d. [P - (-7, 0)] ∘ (1, 0) = 0; x = -7.
e. [P - (2, 1, 1, 0)] ∘ (3, 1, 0, 2) = 0; 3x + y + 2w = 7, P = (x, y, z, w).
11. a. Yes, at (0, 0, 0). b. Yes, at (2, -2, 4), since the three equations t = s + 4, -t = 3s + 4, and 2t = -2s are all satisfied by t = -s = 2. c. No, for a solution for any two of the equations does not satisfy the third. d. No.
e. Yes, at (1, 1, 1).
12. a. [P - (0, -3)] ∘ (4, -1) = 0.
b. [P - (0, 7)] ∘ (0, 1) = 0.
c. [P - (6, 1)] ∘ (-2, 3) = 0.
d. [P - (1, 1)] ∘ (-3, 1) = 0.
e. P ∘ (1, -1) = 0.
13. b. (2/3, 2/3, 1/3), (3/5, 4/5), (1/√3, 1/√3, 1/√3).
14. The angle between U and the positive x axis is the angle between U and (1, 0, 0).
15. a. (-1/3, 2/3, 2/3) or (1/3, -2/3, -2/3).
b. (1/√2, 0, 1/√2). c. (0, 1, 0).
16. (a, b, 0), (a, 0, c), (0, b, c).
17. As points on a directed line.
Chapter 2
1 Page 36
1. a. (7, -2, -4, 4, -8). b. 2 + 2t + 6t². c. 3i - 5. d. cos x + 4 - √x.
2. a. (-2, 4, 0, -2, -1), (-5, -2, 4, -2, 9). b. 3t⁴ - 6t² - 2, -2t - 3t⁴.
c. -2, 7 - 4i. d. -cos x, √x - 4.
5. a. Definition of addition in ℱ; Definition of 0; Property of zero in R.
f + (-f) = 0 if (f + (-f))(x) = 0(x) for all x ∈ [0, 1].
(f + (-f))(x) = f(x) + (-f)(x) = f(x) + (-f(x)) = 0 = 0(x).
What is the reason for each step?
c. The fact that ℱ is closed under scalar multiplication is proved in any introductory calculus course. To show that r(f + g) = rf + rg, for all r ∈ R and f, g ∈ ℱ, consider the value of [r(f + g)](x).
[r(f + g)](x) = r[(f + g)(x)]   Definition of scalar multiplication in ℱ
= r[f(x) + g(x)]   Definition of addition in ℱ
= r[f(x)] + r[g(x)]   Distributive law in R
= [rf](x) + [rg](x)   Definition of scalar multiplication in ℱ
= [rf + rg](x)   Definition of addition in ℱ.
Therefore the functions r(f + g) and rf + rg are equal.
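The pointwise derivation above can be spot-checked numerically. The following sketch (not part of the original text) models vectors of the function space as ordinary Python callables, an assumption of this illustration only:

```python
# Vectors of the function space modeled as callables, with the
# operations defined pointwise, as in the derivation above.

def add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(r, f):
    """Pointwise scalar multiple: (rf)(x) = r * f(x)."""
    return lambda x: r * f(x)

f = lambda x: x * x
g = lambda x: 3 * x + 1
r = 2

lhs = scale(r, add(f, g))            # r(f + g)
rhs = add(scale(r, f), scale(r, g))  # rf + rg
samples = [0, 0.25, 0.5, 1]
agree = all(lhs(x) == rhs(x) for x in samples)
```

The two callables agree at every sample point, mirroring the equality of functions proved step by step above.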
6. a. (0, 0, 0). b. 0(x) = 0, for all x ∈ [0, 1]. c. 0 = 0 + 0i.
d. 0 ∈ R. e. (0, 0, 0, 0, 0). f. {a_n} where a_n = 0 for all n.
g. The origin.
7. a. The distributive law (r + s)U = rU + sU fails to hold.
b. If b ≠ 0, then 1(a, b) ≠ (a, b).
c. Addition is neither commutative nor associative.
d. Not closed under addition or scalar multiplication. There is no zero element and there are no additive inverses.
8. Define (a b; c d) + (e f; g h) = (a+e b+f; c+g d+h) and r(a b; c d) = (ra rb; rc rd).
2 Page 42
1. a. Call the set 𝒮. (4, 3, 0) ∈ 𝒮, so 𝒮 is nonempty. If U, V ∈ 𝒮, then U = (a, b, 0) and V = (c, d, 0) for some scalars a, b, c, d ∈ R. Since the third component of U + V = (a + c, b + d, 0) is zero, U + V is in 𝒮, and 𝒮 is closed under addition. Similarly, 𝒮 is closed under scalar multiplication. (How does 𝒮 compare with 𝒱₂?)
b. Call the set 𝒯. If U ∈ 𝒯, then U = (a, b, c) with a = 3b and a + b = c. Suppose rU = (x, y, z); then (x, y, z) = (ra, rb, rc). Now x = ra = r(3b) = 3(rb) = 3y and x + y = ra + rb = r(a + b) = rc = z. Thus rU ∈ 𝒯 and 𝒯 is closed under scalar multiplication.
2. If N is normal to the plane, then the problem is to show that {U ∈ 𝒱₃ | U ∘ N = 0} is a subspace of 𝒱₃.
3. The question is, does the vector equation
x(1, 1, 1) + y(0, 1, 1) + z(0, 0, 1) = (a, b, c)
have a solution for x, y, z in terms of a, b, c, for all possible scalars a, b, c? That is, is (a, b, c) ∈ 𝒮 for all (a, b, c) ∈ 𝒱₃? The answer is yes, for x = a, y = b - a, and z = c - b is a solution.
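The forward substitution x = a, y = b - a, z = c - b is easy to confirm; a short sketch (illustrative only, with hypothetical helper names):

```python
# Check that x = a, y = b - a, z = c - b solves
# x(1,1,1) + y(0,1,1) + z(0,0,1) = (a, b, c).

def combo(x, y, z):
    # componentwise the left side is (x, x + y, x + y + z)
    return (x, x + y, x + y + z)

def solve(a, b, c):
    return (a, b - a, c - b)

result = combo(*solve(4, -1, 7))
```

Any triple (a, b, c) works the same way, which is exactly why the span is all of 𝒱₃.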
4. a. Improper: (a, b, c) = a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1).
b. Proper. For example, it need not contain (7, 0, 0).
c. Proper. If b ≠ 0, then (a, b, c) need not be a member.
d. Improper: 2(a, b, c) = [a + b - c](1, 1, 0) + [a - b + c](1, 0, 1) + [-a + b + c](0, 1, 1).
5. a. (a, b, a + 4b) for some a, b ∈ R.
b. (x, 5y, ~x, y) for some x, y ∈ R.
c. (-t, 2t, 1) for some t ∈ R. d. (-4t, 2t, 1) for some t ∈ R.
6. Only one set forms a subspace.
8. Show that if 𝒮 is a subspace of 𝒱₁, and 𝒮 contains a nonzero vector, then 𝒱₁ ⊆ 𝒮.
9. a. 𝒮 is the subspace of all constant functions; it is essentially the vector space R.
b. The function f(x) = -5/3 is in 𝒮, so 𝒮 is nonempty. But 𝒮 is neither closed under addition nor scalar multiplication.
c. 𝒮 is a subspace. d. 𝒮 is not a subspace.
10. a. Since 𝒮 and 𝒯 are nonempty, so is the sum 𝒮 + 𝒯. If A, B ∈ (𝒮 + 𝒯), then A = U₁ + V₁ and B = U₂ + V₂ for some U₁, U₂ ∈ 𝒮 and some V₁, V₂ ∈ 𝒯. (Why is this true?) Now A + B = (U₁ + V₁) + (U₂ + V₂) = (U₁ + U₂) + (V₁ + V₂). Since 𝒮 and 𝒯 are vector spaces, U₁ + U₂ ∈ 𝒮 and V₁ + V₂ ∈ 𝒯, therefore 𝒮 + 𝒯 is closed under addition. Closure under scalar multiplication follows similarly.
11. a. Def. of 0; Def. of -U; Assoc. of +; Comm. of +; Def. of -U; Comm. of +; Def. of 0. b. Def. of additive inverse; Result just established; Assoc. of +; Def. of 0U; Def. of 0.
12. Suppose rU = 0; then either r = 0 or r ≠ 0. Show that if r ≠ 0, then U = 0.
13. If r ≠ s, is rU ≠ sU?
14. You want a chain of equalities yielding Z = 0. Start with
Z = Z + 0 = Z + (U + (-U)) = ···.
3 Page 49
1. a. (0, 3, 3) is in the span if there exist x, y ∈ R such that (0, 3, 3) = x(-2, 1, -9) + y(2, -4, 6). This equation gives -2x + 2y = 0, x - 4y = 3, and -9x + 6y = 3. These equations have the common solution x = y = -1. Thus (0, 3, 3) is in the span.
b. Here the equations are -2x + 2y = 1, x - 4y = 0, and -9x + 6y = 1. The first two equations are satisfied only by y = -1/6 and x = -2/3, but these values do not satisfy the third equation. Thus (1, 0, 1) is not in the span.
c. 0 is in the span of any set. d. Not in the span.
e. x = 1, y = 3/2. f. x = 2/3, y = 7/6.
2. a. Suppose t² = x(t³ - t + 1) + y(3t² + 2t) + zt³. Then collecting terms in t gives (x + z)t³ + (3y - 1)t² + (2y - x)t + x = 0. Therefore x + z = 0, 3y - 1 = 0, 2y - x = 0, and x = 0. Since these equations imply y = 1/3 and y = 0, there is no solution for x, y, and z, and the vector t² is not in the span.
b. With x, y, z as above, x = -1, y = 0, and z = 1.
c. x = 4, y = 2, z = 1. d. Not in the span.
e. x = -1, y = 1, z = 2.
3. a. It spans; (a, b, c) = [b - c](1, 1, 0) + [b - a](0, 1, 1) + [a - b + c](1, 1, 1).
b. It spans, with (a, b, c) = a(1, 0, 0) + 0(4, 2, 0) + b(0, 1, 0) + (0, 0, 0) + c(0, 0, 1).
c. It does not span. From the vector equation (a, b, c) = x(1, 1, 0) + y(2, 0, -1) + z(4, 4, 0) + w(5, 3, -1) can be obtained the two equations -y - w = c and 2y + 2w = a - b. Therefore (a, b, c) is in the span only if a - b + 2c = 0.
d. It does not span. (a, b, c) is in the span only if a - 2b + c = 0.
4. a. "U is a linear combination of V and W," or "U is in the span of V and W," or "U is linearly dependent on V and W."
b. "𝒮 is contained in the span of U and V."
c. "𝒮 is spanned by U and V."
d. "The set of all linear combinations of the vectors U₁, ..., U_k" or "the span of U₁, ..., U_k."
5. a. Your proof must hold for any finite subset of R[t], including a set such as {t + 5t⁶, t² - 1}.
6. Because addition of vectors is associative.
7. {(a, b), (c, d)} is linearly dependent if and only if there exist x and y, not both zero, such that x(a, b) + y(c, d) = (0, 0). This vector equation yields the system of linear equations
ax + cy = 0
bx + dy = 0.
Eliminating one variable from each equation yields the system
(ad - bc)x = 0
(ad - bc)y = 0,
which has x = y = 0 as the only solution if and only if ad - bc ≠ 0.
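The criterion ad - bc ≠ 0 is easy to evaluate directly; a minimal sketch (illustrative, not from the text):

```python
def det2(a, b, c, d):
    """ad - bc for the pair {(a, b), (c, d)}."""
    return a * d - b * c

# (2, -4) and (1, 3): nonzero value, so the pair is independent.
indep = det2(2, -4, 1, 3)

# (2, 6) and (1, 3): zero, and indeed 2(1, 3) = (2, 6) is dependent.
dep = det2(2, 6, 1, 3)
```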
8. a. a ≠ 1 and b ≠ 2. b. 2a - b - 4 ≠ 0. c. ab ≠ 0.
9. a. You cannot choose U₁ or U_n to be the "some vector"; you must use U_i where i is any integer between 1 and n.
c. Besides the fact that the statement in part a is rather cumbersome to write out, it does not imply that {0} is linearly dependent.
10. a. Independent. b. Dependent. c. Dependent. d. Dependent.
e. Independent. f. Independent. g. Independent.
11. a. The vector equation x(2, -4) + y(1, 3) + z(-6, -3) = 0 has many solutions. They can be found, in this case, by letting z have an arbitrary value and solving for x and y in terms of z. Each solution is of the form x = 3k, y = 6k, z = 2k for some real number k.
b. (5, 0) = a(2, -4) + b(1, 3) + c(-6, -3) with a = 3k + 3/2, b = 6k + 2, and c = 2k, for any k ∈ R. That is, (5, 0) = (5, 0) + 0, and (5, 0) = (3/2)(2, -4) + 2(1, 3).
For (0, -20), a = 3k + 2, b = 6k - 4, and c = 2k, k ∈ R.
d. ℒ(S) ⊆ 𝒱₂ is immediate, and part c gives 𝒱₂ ⊆ ℒ(S).
12. a. 3k(6, 2, -3) + 5k(-2, -4, 1) - 2k(4, -7, -2) = 0 for all k ∈ R.
b. If (0, 0, 1) is in the span, then there must be scalars x, y, z such that 6x - 2y + 4z = 0 and -3x + y - 2z = 1.
c. Let x, y, z be the scalars needed in the linear combination. For 0, see part a. For (0, -1, 0): x = 3k + 1/10, y = 5k + 3/10, and z = -2k, k ∈ R. For (2, 0, -1): x = 3k + 2/5, y = 5k + 1/5, and z = -2k, k ∈ R. For (6, 2, -3): x = 3k + 1, y = 5k, and z = -2k, k ∈ R.
13. In problems such as this, you may not be able to complete a proof at first, but you should be able to start. Here you are given two conditions and asked to prove a third. Start by writing out exactly what these conditions mean. In this situation, this entails writing out expressions involving linear combinations and making statements about the coefficients. From these expressions, you should be able to complete a proof, but without them a proof is impossible.
14. The same remarks apply here as were given for problem 13.
15. Note that for any n vectors, 0 = 0U₁ + ··· + 0U_n.
16. The first principle of mathematical induction states that if S is a set of integers which are all greater than or equal to b, and S satisfies:
1. the integer b is in S,
2. whenever n is in S, then n + 1 is also in S,
then S contains all integers greater than or equal to b.
There is a nice analogy for the first principle of induction. Suppose an infinite collection of dominoes is set up so that whenever one falls over, it knocks the next one over. The principle is then equivalent to saying that if the first domino is pushed over, they all fall over.
For this problem, we are asked to prove that the set S = {k | r(U₁ + ··· + U_k) = rU₁ + ··· + rU_k} includes all positive integers. (Here b = 1, although it would be possible to take b = 2.) Thus there are two parts to the proof:
1. Prove that the statement r(U₁ + ··· + U_k) = rU₁ + ··· + rU_k is true when k = 1.
2. Prove that if the statement is true for k = n, then it is true for k = n + 1. That is, prove that r(U₁ + ··· + U_n) = rU₁ + ··· + rU_n implies r(U₁ + ··· + U_n + U_{n+1}) = rU₁ + ··· + rU_n + rU_{n+1}.
For the second part, notice that
r(U₁ + ··· + U_n + U_{n+1}) = r([U₁ + ··· + U_n] + U_{n+1}).
4 Page 57
1. Show that {1, i} is a basis.
3. The dimension of a subspace must be an integer.
4. a. No; dim 𝒱₃ = 3. b. Yes. Show that the set is linearly independent, then use problem 1 and Corollary 2.8.
c. No; by Corollary 2.9. d. No; 10(-9) - (-15)(6) = 0.
e. No; it cannot span. f. Yes; 2(4) - 1(-3) = 11 ≠ 0.
5. a. True. b. False. A counterexample: 0(1, 4) = 0, but the set {(1, 4)} is not linearly dependent. c. False.
6. a. Let dim 𝒱 = n and S = {U₁, ..., U_n} be a spanning set for 𝒱. Suppose a₁U₁ + ··· + a_nU_n = 0 and show that if one of the scalars is nonzero, then 𝒱 is spanned by n - 1 vectors. This implies that dim 𝒱 ≤ n - 1 < n.
b. Let dim 𝒱 = n and S = {U₁, ..., U_n} be linearly independent in 𝒱. Suppose S does not span 𝒱 and show that this implies the existence of a linearly independent set with n + 1 vectors within 𝒱.
7. ( … )
8. Consider the set {(1 0; 0 0), (0 1; 0 0), (0 0; 1 0), (0 0; 0 1)}.
9. a. Add any vector (a, b) such that 4b + 7a ≠ 0.
b. Add any vector (a, b, c) such that 2a - 5b - 3c ≠ 0.
c. Add any vector a + bt + ct² such that a - 4c ≠ 0.
d. Add any vector a + bt + ct² such that a + b - 5c ≠ 0.
10. If 𝒮 ∩ 𝒯 has S = {U₁, ..., U_k} as a basis, then dim(𝒮 ∩ 𝒯) = k. 𝒮 ∩ 𝒯 is a subspace of 𝒮, therefore S can be extended to a basis of 𝒮. Call this basis {U₁, ..., U_k, V_{k+1}, ..., V_n}, so dim 𝒮 = n. Similarly S can be extended to a basis {U₁, ..., U_k, W_{k+1}, ..., W_m} of 𝒯, and dim 𝒯 = m.
You should now be able to show that the set {U₁, ..., U_k, V_{k+1}, ..., V_n, W_{k+1}, ..., W_m} is a basis for 𝒮 + 𝒯. Notice that if a₁U₁ + ··· + a_kU_k + b_{k+1}V_{k+1} + ··· + b_nV_n + c_{k+1}W_{k+1} + ··· + c_mW_m = 0, then the vector a₁U₁ + ··· + a_kU_k + b_{k+1}V_{k+1} + ··· + b_nV_n = -(c_{k+1}W_{k+1} + ··· + c_mW_m) is in both 𝒮 and 𝒯.
11. A plane through the origin in 3-space can be viewed as a 2-dimensional subspace of 𝒱₃.
5 Page 64
1. a. You need to find a and b such that (2, -5) = a(6, -5) + b(2, 5). The solution is a = -b = 1/2, so (2, -5) = (1/2, -1/2)_{(6,-5),(2,5)}.
b. (1/10, -17/20)_{(3,1),(-2,6)}. c. (2, -5)_{Ei}. d. (0, 1)_{(1,0),(2,-5)}.
2. a. (3/40, 11/40)_{(6,-5),(2,5)}. b. (2/5, 1/10)_{(3,1),(-2,6)}.
c. (1, 1)_{Ei}. d. (7/5, -1/5)_{(1,0),(2,-5)}.
3. a. (0, 1)_B. b. (1, -4)_B. c. (-6, 27)_B. d. (8, -30)_B. e. (-5, 27)_B.
4. a. t² + 1: (0, 0, 1, 0)_B. b. t²: (1, 0, 0, 0)_B. c. 4: (4, -4, 0, 4)_B.
d. ( … ): (-1, 1, 1, -1)_B. e. t³ - t²: (2, -1, -1, 1)_B. f. t² - t: (0, 0, 1, -1)_B.
5. a. (0, -1)_B. b. (-3/5, -1/5)_B. c. (1, 2)_B. d. (-2, -1)_B. e. (-3, -2)_B.
6. a. (5, -8, 5)_{B1}. b. (0, -1/2, 0)_{B2}. c. (-3, 0, 5)_{B1}. d. (2, 2, 1)_{B2}.
7. B = {(2/3, 5/21), (1, 1/7)}.
8. B = {(1, 4), (1/2, 5/2)}.
9. If B = {U, V, W}, then t² = U + W, t = V + W, and 1 = U + V, so
B = {½t² - ½t + ½, -½t² + ½t + ½, ½t² + ½t - ½}.
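The three equations for U, V, W can be solved by adding them all; the sketch below (an illustration only, representing a + bt + ct² by the triple (a, b, c)) reproduces the computation:

```python
from fractions import Fraction as F

# a + bt + ct^2 is represented by the coefficient triple (a, b, c).
t2, t1, one = (F(0), F(0), F(1)), (F(0), F(1), F(0)), (F(1), F(0), F(0))

def add(p, q):
    return tuple(x + y for x, y in zip(p, q))

def scale(r, p):
    return tuple(F(r) * x for x in p)

# Adding U + W = t^2, V + W = t, U + V = 1 gives 2(U + V + W) = t^2 + t + 1.
s = scale(F(1, 2), add(add(t2, t1), one))  # s = U + V + W
U = add(s, scale(-1, t1))   # subtract V + W = t
V = add(s, scale(-1, t2))   # subtract U + W = t^2
W = add(s, scale(-1, one))  # subtract U + V = 1
```

Recovering U = ½t² - ½t + ½, and similarly for V and W.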
10. B = {4 + 2t, 2 + 3t}.
11. (Notice that T is not defined using coordinates.) To establish condition 2 in the definition of isomorphism, suppose T(a, b) = T(c, d). Then a + bt = c + dt and a = c, b = d. Therefore, T(a, b) = T(c, d) implies (a, b) = (c, d). For the other part of condition 2, suppose W ∈ R₂[t], say W = e + ft. Then e, f ∈ R, so there is a vector (e, f) ∈ 𝒱₂ and T(e, f) = W. Thus condition 2 holds.
12. If 𝒮 and 𝒯 are two such subspaces, then there exist nonzero vectors U, V ∈ 𝒱₂ such that 𝒮 = {tU | t ∈ R} and 𝒯 = {tV | t ∈ R}.
13. a. The coordinates of (4, 7) with respect to B are 1 and 5.
b. The vector 2t² - 2 has coordinates -2, 1, and 1 with respect to the basis {t + 2, 3t², 4 - t²}.
14. a. Although T(a, b) = T(c, d) implies (a, b) = (c, d), not every vector of 𝒱₃ corresponds to a vector in 𝒱₂, e.g. consider (0, 0, 1). Thus condition 2 does not hold.
Condition 3 holds, for T(U + V) = T((a, b) + (c, d)) = T(a + c, b + d) = (a + c, b + d, 0) = (a, b, 0) + (c, d, 0) = T(a, b) + T(c, d) = T(U) + T(V).
b. For condition 2, suppose W = (x, y) ∈ 𝒱₂; then (x, 0, 2x - y) is a vector in 𝒱₃ and T(x, 0, 2x - y) = (x, y). Therefore, half the condition holds. But if T(a, b, c) = T(d, e, f), then (a, b, c) need not equal (d, e, f). For T(a, b, c) = T(d, e, f) implies a + b = d + e and 2a - c = 2d - f. And these equations have many solutions besides a = d, b = e, c = f. For example a = d + 1, b = e - 1, c = f + 2; thus T(4, 6, 3) = T(5, 5, 5) and condition 2 fails to hold. T satisfies condition 4 since T(rU) = T(r(a, b, c)) = T(ra, rb, rc) = (ra + rb, 2(ra) - rc) = r(a + b, 2a - c) = rT(a, b, c) = rT(U).
15. a. 2 + t and 4 + 6t.
16. Suppose T is a correspondence from 𝒱₂ to 𝒱₃ satisfying conditions 1, 3, and 4 in the definition of isomorphism (the map in problem 14a is an example). Show that T fails to satisfy condition 2 by showing that the set {T(V) | V ∈ 𝒱₂} cannot contain three linearly independent vectors.
6 Page 70

1. (a, b) ∈ ℒ{(1, 7)} ∩ ℒ{(2, 3)} if (a, b) = x(1, 7) and (a, b) = y(2, 3) for some x, y ∈ R. Thus x(1, 7) = y(2, 3), but {(1, 7), (2, 3)} is linearly independent, so x = y = 0 and the sum is direct. Since {(1, 7), (2, 3)} must span 𝒱₂, for any W ∈ 𝒱₂ there exist a, b ∈ R such that W = a(1, 7) + b(2, 3). Now a(1, 7) ∈ ℒ{(1, 7)} and b(2, 3) ∈ ℒ{(2, 3)}, so 𝒱₂ is contained in the sum and 𝒱₂ = ℒ{(1, 7)} ⊕ ℒ{(2, 3)}.
2. No.
3. a. If (a, b, c) ∈ 𝒮 ∩ 𝒯, then there exist r, s, t, u ∈ R such that
(a, b, c) = r(1, 0, 0) + s(0, 1, 0) = t(0, 0, 1) + u(1, 1, 1).
Therefore (r, s, 0) = (u, u, t + u) and 𝒮 ∩ 𝒯 = ℒ{(1, 1, 0)}.
b. Direct sum. c. Direct sum. d. 𝒮 ∩ 𝒯 = ℒ{(1, 1, 4)}.
5. If W = 0, it is immediate, so assume W ≠ 0. Let θ be the angle between W and N; see Figure A1. Then cos θ = ±‖V₂‖/‖W‖ with the sign depending on whether θ is acute or obtuse. And V₂ = ±‖V₂‖(N/‖N‖) with the sign depending on whether W and N are on the same or opposite sides of the plane 𝒮, i.e., on whether θ is acute or obtuse.
6. a. 𝒱₃ = 𝒮 + 𝒯; the sum is not direct. b. 𝒱₃ ≠ 𝒮 + 𝒯.
c. 𝒱₃ = 𝒮 ⊕ 𝒯. d. 𝒱₃ ≠ 𝒮 + 𝒯.
7. a. U = (4a, a, 3a) and V = (2b + c, b, b + 2c) for some a, b, c ∈ R. U = (-20, -5, -15) and V = (16, 4, 20).
b. U = (1, -8, 6) and V = (-1, 2, -1).
8. U = (a, b, a + b) and V = (c, d, 3c - d) for some a, b, c, d ∈ R. Since the sum 𝒮 + 𝒯 is not direct, there are many solutions. For example, taking d = 0 gives U = (2, 1, 3) and V = (-1, 0, -3).
9. Choose a basis for 𝒮 and extend it to a basis for 𝒱.
10. Two possibilities are R₄[t] ⊕ ℒ{t⁴, t⁵, ..., tⁿ, ...} and
ℒ{1, t², t⁴, ..., t²ⁿ, ...} ⊕ ℒ{t, t³, t⁵, ..., t²ⁿ⁺¹, ...}.
Review Problems Page 71
1. a, b, and c are false. d is true but incomplete and therefore not very useful. e and f are false.
2. b, d, g, h, i, and p are meaningful uses of the notation. s is meaningful if U = V = 0. The other expressions are meaningless.
3. b. Note that showing S is linearly dependent amounts to showing that 0 may be expressed as a linear combination of the vectors in S in an infinite number of ways. t = (-1/4 - 4k)(2t + 2) + (1/2 + 5k)(3t + 1) + k(3 - 7t) for any k ∈ R.
c. Use the fact that 2t + 2 = (5/4)(3t + 1) + (1/4)(3 - 7t). Compare with problem 14 on page 51.
4. Only three of the statements are true.
a. A counterexample for this statement is given by {0, t} in 𝒱 = R₂[t] or {(1, 3), (2, 6)} in 𝒱 = 𝒱₂.
5. b. If U = (a, 2a + 4b, b) and V = (c, 3d, 5d), then there are many solutions to the equations a + c = 2, 2a + 4b + 3d = 9, and b + 5d = 6. One solution is given by a = b = c = d = 1.
7. a. For n = 3, k = 1, choose 𝒱 = 𝒱₃, {U₁, U₂, U₃} = {Ei}, and 𝒮 = ℒ{(1, 1, 7)}.
Chapter 3

1 Page 81

1. a. 3y = 0
x + y + 4z = 3
x + 7z = 0.
b. 6x + 7y = 2
9x - 2y = 3.
c. x + 2y = 2
3x + y = 1
6x + y = -4.
2. a. 3y = 0, x + y + z = 0, z = 0, -5y = 0.
b. 3x - y + 6z = 0, 4x + 2y - z = 0.
c. Suppose ae^x + be^(2x) + ce^(5x) = 0(x) for all x ∈ [0, 1]. Then x = 0 gives a + b + c = 0; x = 1 gives ea + e²b + e⁵c = 0, and so on.
3. a11 = 2, a23 = 1, a13 =O, a24 = -7, and a21 =O.
4. a. A = (-5 0 6; 3 1 -7), A* = (-5 0 6 -6; 3 1 -7 8).
b. A = ( … ), A* = ( … ).
c. A = ( … ), A* = ( … ).
6. a. ( … ). c. ( … ).
7. a. In determining if S ⊆ 𝒱_n is linearly dependent, one equation is obtained for each component. A is 2 × 3, so AX = 0 has two equations and n = 2. Further, there is one unknown for each vector. Since A has three columns, S must contain three vectors from 𝒱₂. If the first equation is obtained from the first component, then S = {(5, 1), (2, 3), (7, 8)}.
b. {(2, 1), (4, 3)}.
c. {(2, 1, 2), (5, 3, 1), (7, 0, 0), (8, 2, 6)}.
d. {(5, 2, 1, 6), (1, 4, 0, 1), (3, 9, 7, 5)}.
8. a. W = (3, 2), U₁ = (2, 1), and U₂ = (-1, 1).
b. W = (0, 0, 0), U₁ = (1, 2, 4), and U₂ = (5, 3, 9).
c. W = (4, 8, 6, 0), U₁ = (9, 2, 1, 0), and U₂ = (5, 1, 7, 2).
10. a. (2 -1 8; 4 11 3). b. ( … ).
11. (0 0; 0 0), (0; 0), (0, 0, 0, 0), and ( … ).
12. The zero in AX = 0 is n × 1; the zero in X = 0 is m × 1.
14. a. 0 ∉ S. b. A plane not through the origin.
c. S may be empty, or a plane or line not passing through the origin.
2 Page 93
1. a. (1 1 1; 4 1 3)(3 - 2k, 3 - k, 3k)ᵀ = ((3 - 2k) + (3 - k) + 3k, 4(3 - 2k) + (3 - k) + 9k)ᵀ = (6, 15)ᵀ.
Therefore all the solutions satisfy this system.
b. ( … ; 1 -2 0)(3 - 2k, 3 - k, 3k)ᵀ = ( … ; -3) holds for all values of k.
c. (1 1 -2; 3 1 1)(3 - 2k, 3 - k, 3k)ᵀ = (6 - 9k, 12 - 4k)ᵀ ≠ (0, 0)ᵀ for all k.
Therefore the solutions do not satisfy this system.
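The substitution in part a can be automated. The sketch below (illustrative; the coefficient rows (1 1 1) and (4 1 3) are reconstructed from the computation above) checks the whole family of solutions at once:

```python
def dot(row, x):
    return sum(r * v for r, v in zip(row, x))

A = [(1, 1, 1), (4, 1, 3)]  # coefficient rows, as reconstructed above
B = (6, 15)

# X(k) = (3 - 2k, 3 - k, 3k) should satisfy both equations for every k.
ok = all(dot(row, (3 - 2*k, 3 - k, 3*k)) == b
         for k in range(-5, 6)
         for row, b in zip(A, B))
```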
2. A₁X = B₁ and A₂X = B₂ are the same except for their hth equations, which are a_{h1}x₁ + ··· + a_{hm}x_m = b_h and (a_{h1} + ra_{k1})x₁ + ··· + (a_{hm} + ra_{km})x_m = b_h + rb_k, respectively. Therefore it is necessary to show that if A₁C = B₁, then X = C is a solution of the hth equation in A₂X = B₂, and if A₂C = B₂, then X = C is a solution of the hth equation of A₁X = B₁.
3. a. ( … ). b. ( … ). c. ( … ). d. ( … ).
4. a. From the echelon form (1 0 0 2; 0 1 0 3; 0 0 1 1), x = 2, y = 3, and z = 1.
b. ( … ).
c. (1 0 -1 0; 0 1 2 0; 0 0 0 1), no solution.
5. The only 2 × 2 matrices in echelon form are (0 0; 0 0), (0 1; 0 0), (1 k; 0 0), and (1 0; 0 1), where k is any real number.
6. Call the span 𝒮. Then (x, y, z) ∈ 𝒮 if (x, y, z) = a(1, 0, -4) + b(0, 1, 2) = (a, b, -4a + 2b). Therefore 𝒮 = {(x, y, z) | z = 2y - 4x}, i.e., 𝒮 is the plane in 3-space with the equation 4x - 2y + z = 0.
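Every combination a(1, 0, -4) + b(0, 1, 2) can be checked against that plane equation directly (a minimal sketch, illustrative only):

```python
def plane(x, y, z):
    """Left side of 4x - 2y + z = 0."""
    return 4*x - 2*y + z

# (a, b, -4a + 2b) lies on the plane for every choice of a and b.
ok = all(plane(a, b, -4*a + 2*b) == 0
         for a in range(-3, 4) for b in range(-3, 4))
```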
7. a. The echelon form is (1 0; 0 1), so the row space is 𝒱₂.
b. (1 0 -2; 0 1 3; 0 0 0) is the echelon form; therefore U is in the row space if U = a(1, 0, -2) + b(0, 1, 3), and the row space is the set {(x, y, z) | 2x - 3y + z = 0}.
c. (1 0 … ; 0 1 …); {(x, y, z) | z = x + 5y}. d. 𝒱₃.
e. (1 0 … ; 0 1 … ; 0 0 1 3); {(x, y, z, w) | w = x + 2y + 3z}.
f. ( … ); {(x, y, z, w) | z = 2x and w = 2x + 6y}.
8. a. 2. b. 2. c. 2. d. 3. e. 3. f. 2.
11. a. The rank of ( … ) is 1, so the vectors are linearly dependent.
b. Rank is 2. c. Rank is 2. d. Rank is 3.
12. "A is an n by m matrix."
13. Let U₁, ..., U_n be the rows of A and U₁, ..., U_h + rU_k, ..., U_n the rows of B. Then it is necessary to show
ℒ{U₁, ..., U_k, ..., U_h, ..., U_n} ⊆ ℒ{U₁, ..., U_k, ..., U_h + rU_k, ..., U_n}
and
ℒ{U₁, ..., U_k, ..., U_h + rU_k, ..., U_n} ⊆ ℒ{U₁, ..., U_k, ..., U_h, ..., U_n}.
14. The idea is to find a solution for one system which does not satisfy the other by setting some of the unknowns equal to 0. Show that it is sufficient to consider the following two cases.
Case 1: e_{kh} is the leading entry of 1 in the kth row of E, and the leading 1 in the kth row of F is f_{kj} with j > h.
Case 2: Neither e_{kh} nor f_{kh} is a leading entry of 1.
In the first case, show that x_j must be a parameter in the solutions of EX = 0, but when x_{j+1} = ··· = x_m = 0 in a solution for FX = 0, then x_j = 0.
3 Page 100
1. x = 3, y = -2.
2. x = 1 + 2t, y = t, t ∈ R.
3. x = 1/2, y = 1/3.
4. Inconsistent; it is equivalent to
x - 3y = 0
0 = 1.
5. Equivalent to
x - (3/2)y = -2
z = -3,
so x = 3t - 2, y = 2t, and z = -3, for any t ∈ R.
6. Inconsistent. 7. x = 3 + 2s - t, y = s, z = t, s, t ∈ R.
8. x = y = z = 1. 9. x = -2t, y = 3t - 2, z = t, w = 4, t ∈ R.
10. The system is equivalent to
x - z + w = 0
y - 5z + w = 0,
therefore the solutions are given by x = s - t, y = 5s - t, z = s, w = t, for s, t ∈ R.
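The two-parameter solution in problem 10 can be verified by substitution; the sketch below assumes the reduced system x - z + w = 0, y - 5z + w = 0 as reconstructed from the stated solution:

```python
# Substitute x = s - t, y = 5s - t, z = s, w = t into the reconstructed
# reduced system and confirm both residuals vanish for any s, t.

def residuals(s, t):
    x, y, z, w = s - t, 5*s - t, s, t
    return (x - z + w, y - 5*z + w)

ok = all(residuals(s, t) == (0, 0)
         for s in range(-4, 5) for t in range(-4, 5))
```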
11. a. Independent. b, c, and d are dependent.
12. a. ( … ). b. ( … ). c. (2, 5, 6). d. ( … ). e. ( … ).
13. a. 2. b. 2. c. 1. d. 2. e. 3.
15. From the definition of the span of a set, obtain a system of n linear equations in n unknowns. Use the fact that {V₁, ..., V_n} is linearly independent to show that such a system is consistent.
4 Page 105
1. a. 2; a point in the plane. b. 1; a line in the plane.
c. 2; a line through the origin in 3-space.
d. 1; a plane through the origin in 3-space.
e. 1; a hyperplane in 4-space. f. 3; the origin in 3-space.
2. a. In a line. b. The planes coincide.
c. At the origin. d. In a line.
3. Rank A = rank A* = 3; the planes intersect in a point.
Rank A = rank A* = 2; the planes intersect in a line.
(Rank A = rank A* = 1 is excluded since the planes are distinct.)
Rank A = 1, rank A* = 2; three parallel planes.
Rank A = 2, rank A* = 3; either two planes are parallel and intersect the third in a pair of parallel lines, or each pair of planes intersects in one of three parallel lines.
4. a. Set 𝒮 = {(2a + 3b, 3a + 5b, 4a + 5b) | a, b ∈ R}. Then 𝒮 = ℒ{(2, 3, 4), (3, 5, 5)} is the row space of A = (2 3 4; 3 5 5). Since A is row-equivalent to (1 0 5; 0 1 -2), 𝒮 = ℒ{(1, 0, 5), (0, 1, -2)}. That is, (x, y, z) ∈ 𝒮 if (x, y, z) = x(1, 0, 5) + y(0, 1, -2). This gives the equation z = 5x - 2y.
b. ( … ) is row-equivalent to ( … ), and an equation is x - 2y - 3z = 0.
c. x - 3y + z = 0. Notice that although the space is defined using 3 parameters, only two are necessary.
d. 4x - 2y + 3z - w = 0. e. x + y + z + w = 0.
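The plane equation found in part 4a can be spot-checked against the original generators (a minimal sketch, illustrative only):

```python
rows = ((2, 3, 4), (3, 5, 5))  # generators of the row space in part 4a

def combo(a, b):
    r1, r2 = rows
    return tuple(a*u + b*v for u, v in zip(r1, r2))

# Every combination a(2,3,4) + b(3,5,5) satisfies z = 5x - 2y.
ok = all(p[2] == 5*p[0] - 2*p[1]
         for a in range(-3, 4) for b in range(-3, 4)
         for p in [combo(a, b)])
```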
5. a. (x, y, z, w) = (4, 0, 0, 2) + t(-3, 2, 1, 0), t ∈ R. This is the line with direction numbers (-3, 2, 1, 0) which passes through the point (4, 0, 0, 2).
b. (x, y, z, w) = (-3, 1, 0, 0) + t(-1, -2, 1, 0) + s(-2, 0, 0, 1), t, s ∈ R. This is the plane in 4-space, parallel to ℒ{(-1, -2, 1, 0), (-2, 0, 0, 1)} and through the point (-3, 1, 0, 0).
c. (x₁, x₂, x₃, x₄, x₅) = (1, 2, 0, 1, 0) + t(-3, -1, 1, 0, 0) + s(0, -2, 0, 0, 1), t, s ∈ R. The plane in 5-space through the point (1, 2, 0, 1, 0) and parallel to the plane ℒ{(-3, -1, 1, 0, 0), (0, -2, 0, 0, 1)}.
6. a. An arbitrary point on the line is given by t(P - Q) + P = (1 + t)P - tQ. If S = (x_j), P = (p_j) and Q = (q_j), show that x_j = (1 + t)p_j - tq_j, for 1 ≤ j ≤ n, satisfies an arbitrary equation in the system AX = B.
b. Recall that both S ⊆ {U + V | V ∈ 𝒮} and {U + V | V ∈ 𝒮} ⊆ S must be established. For the first containment, notice that, for any W ∈ S, W = U + (W - U).
8. a. S = {(-3 - 7k, -4 - 4k, k) | k ∈ R}. This is the line with vector equation P = t(-7, -4, 1) + (-3, -4, 0), t ∈ R.
b. 𝒮 is the line with equation P = t(-7, -4, 1), t ∈ R.
9. a. S = {(4 - 2a + b, 3 - 3a + 2b, a, b) | a, b ∈ R};
𝒮 = ℒ{(-2, -3, 1, 0), (1, 2, 0, 1)}.
b. S = {(4, 2, 0, 0) + (b - 2a, 2b - 3a, a, b) | a, b ∈ R}.
c. S is a plane.
5 Page 115
1. 2. 2. -5. 3. 2. 4. 1. 5. 0. 6. 0. 7. 2. 8. 0.
9. a. A parallelogram, since A is the origin and D = B + C. Area = 13.
b. A parallelogram, since B - A = D - C. The area is the same as the area of the parallelogram determined by B - A and C - A. The area is 34.
c. Area = 9.
10. a. 1. b. 30.
12. Suppose the hth row of A is multiplied by r, A = (a_{ij}) and B = (b_{ij}). Expand B along the hth row, where b_{hj} = ra_{hj} and B_{hj} = A_{hj}. The corollary can also be proved using problem 11.
13. The statement is immediate for 1 × 1 matrices, therefore assume the property holds for all (n - 1) × (n - 1) matrices, n ≥ 2. To complete the proof by induction, it must be shown that if A = (a_{ij}) is an n × n matrix, then |Aᵀ| = |A|.
Let Aᵀ = (b_{ij}), so b_{ij} = a_{ji}; then
|Aᵀ| = b₁₁B₁₁ + b₁₂B₁₂ + ··· + b₁ₙB₁ₙ
= a₁₁B₁₁ + a₂₁B₁₂ + ··· + a_{n1}B₁ₙ.
The cofactor B_{1j} of b_{1j} is (-1)^{1+j}M_{1j}, where M_{1j} is the minor of b_{1j}. The minor M_{1j} is the determinant of the (n - 1) × (n - 1) matrix obtained from Aᵀ by deleting the 1st row and jth column. But if the 1st column and jth row of A are deleted, the transpose of M_{1j} is obtained. Since these matrices are (n - 1) × (n - 1), the induction assumption implies that M_{1j} is equal to the minor of a_{j1} in A. That is, B_{1j} = (-1)^{1+j}M_{1j} = (-1)^{j+1}M_{1j} = A_{j1}. Therefore the above expansion becomes |Aᵀ| = a₁₁A₁₁ + a₂₁A₂₁ + ··· + a_{n1}A_{n1}, which is the expansion of |A| along the first column.
Now the statement is true for all 1 × 1 matrices, and if it is true for all (n - 1) × (n - 1) matrices (n ≥ 2), it is true for all n × n matrices. Therefore the proof is complete by induction.
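The result |Aᵀ| = |A| can be spot-checked on a concrete matrix with a cofactor-expansion determinant (a sketch for illustration; the matrix chosen here is arbitrary, not from the text):

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # minor: delete row 0 and column j
        minor = [row[:j] + row[j+1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def transpose(m):
    return [list(col) for col in zip(*m)]

A = [[2, 1, 3], [1, 3, 2], [3, 1, 2]]
d, dT = det(A), det(transpose(A))
```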
14. a. ( … ) = (0, 0, 1). b. (0, 0, -1).
Note that parts a and b show that the cross product is not commutative.
c. 0. d. (-2, -4, 8). e. 0. f. (0, -19, 0).
15. a. If the matrix is called A = (a_{ij}), then A₁₁a₁₁ = ··· = [b₂c₃ - b₃c₂](1, 0, 0).
16. A ∘ (B × C) is the volume of the parallelepiped determined by A, B, and C. Since there is no volume, A, B, and C must lie in a plane that passes through the origin.
17. a. (18, -5, -16) and (22, 8, -4).
b. The cross product is not an associative operation.
c. The expression is not defined because the order in which the operations are to be performed has not been indicated.
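The failures of commutativity (problem 14) and associativity (problem 17) are easy to exhibit with a componentwise cross product; the particular test vectors below are chosen for this sketch, not taken from the problems:

```python
def cross(u, v):
    """Componentwise cross product in 3-space."""
    (a, b, c), (d, e, f) = u, v
    return (b*f - c*e, c*d - a*f, a*e - b*d)

i, j = (1, 0, 0), (0, 1, 0)
ij, ji = cross(i, j), cross(j, i)   # (0,0,1) vs (0,0,-1): not commutative
left = cross(cross(i, i), j)        # (i x i) x j = 0
right = cross(i, cross(i, j))       # i x (i x j) = (0, -1, 0)
```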
6 Page 119
1. a. 13. b. 15. c. 0. d. 1. e. -144. f. 96.
3. a. The determinant is -11, so the set is linearly independent. b. The determinant is -5. c. The determinant is 0.
4. The set yields a lower triangular matrix with all the elements on the main diagonal nonzero.
5. a. Follow the pattern set in the proof of Theorem 3.12, that interchanging rows changes the sign of a determinant, and the proof that |Aᵀ| = |A|, problem 13, page 115.
6. |A| = a₁₁ times the determinant
| 1    a₁₂/a₁₁    ···    a₁ₙ/a₁₁ |
| 0    a₂₂ - a₂₁(a₁₂/a₁₁)    ···    a₂ₙ - a₂₁(a₁ₙ/a₁₁) |
| ⋮ |
| 0    a_{n2} - a_{n1}(a₁₂/a₁₁)    ···    a_{nn} - a_{n1}(a₁ₙ/a₁₁) |
7. a. ( … ) = -34. c. | 2 1 3 2; 1 3 2 1; 2 2 1 1; 3 1 2 1 | = ( … ).
7 Page 126

1. a. ( … ). b. ( … ). c. ( … ). e. ( … ).
4. a. The row space of A = ℒ{(1, 0, 11/3), (0, 1, -1/3)} and the row space of Aᵀ = ℒ{(1, 0, -1), (0, 1, 1)}.
b. The row space of A = ℒ{(1, 0, 1), (0, 1, -2)} and the row space of Aᵀ = ℒ{(1, 0, -3), (0, 1, 1)}.
5. "A transpose" or "the transpose of A."
6. If U = (a, b, c) and V = (d, e, f), then the components of U × V are plus or minus the determinants of the 2 × 2 submatrices of (a b c; d e f).
7. That is, if x₁U₁ + ··· + x_nU_n = W has a solution for all W ∈ 𝒱_n, show that x₁U₁ + ··· + x_nU_n = 0 implies x₁ = ··· = x_n = 0.
Review Problems Page 126
2. a. The two equations in two unknowns would be obtained in determining if {4 + t, 5 + 3t} (or {1 + 4t, 3 + 5t}) is linearly independent.
b. S = {2 + 9t + t², 8 + 3t + 7t²} ⊆ R₃[t].
c. S = {3 + 2t, 1 - t, 4 + 6t} ⊆ R₂[t].
5. a. | x y 1; 1 0 1; 0 1 1 | = 1 - y - x. So an equation is y = 1 - x.
b. y = 9x - 23. c. 3y - 8x + 37 = 0.
7. a. 5x + 6y - 8z = 0. b. 3x - y = 0.
c. 3x + 8y + 2z = 0. d. x + z = 0.
9. If AX = B is a system of n linear equations in n unknowns, then there is a unique solution if and only if the determinant of A is nonzero. Since |A| may equal any real number, it is reasonable to expect that a given n × n matrix has nonzero determinant. That is, it is the exceptional case when the value of the determinant of an n × n matrix is 0, or any other particular number.
11. a. ( … ) and ( … ), respectively.
b. ( … ), ( … ), and ( … ), respectively.
c. The claim is that, for each B ∈ ℳ_{n×1} there exists one and only one X ∈ ℳ_{m×1} such that AX = B. What does this say about A when B = 0? Is there always an X when m < n?
Chapter 4

1 Page 135

1. a, c, e, and g are not linear. b, d, and f are linear maps.
2. Remember that an if and only if statement requires a proof for both implications.
3. "T is a map from 𝒱 to 𝒲," or "T sends 𝒱 into 𝒲."
4. a. Suppose T(a, b) = (-1, 5). Then 2a - b = -1 and 3b - 4a = 5. These equations have a unique solution, and (1, 3) is the only preimage.
b. {(1, -t, 3t) | t ∈ R}. c. No preimage.
d. The problem is to determine which vectors in R₃[t] are sent to t by the map T. That is, for what values of a, b, c, if any, is T(a + bt + ct²) = t? This happens only if a + b = 1 and c - a = 0. Therefore any vector in {r + (1 - r)t + rt² | r ∈ R} is a preimage of t under the map T.
e. {½t² + k | k ∈ R}. f. {3 + k - kt + kt² | k ∈ R}.
5. a. For each ordered 4-tuple W, there is at least one ordered triple V such that T(V) = W.
b. For each 2 × 2 matrix A, there is a 2 × 3 matrix B such that T(B) = A.
c. For each polynomial P of degree less than 2, there is a polynomial Q of degree less than 3, such that T(Q) = P.
6. a. Onto but not 1-1. For example, T( … ) = (0 0; 0 0) for any scalar k.
b. Neither 1-1 nor onto. c. 1-1, but not onto.
d. 1-1 and onto. e. Neither. f. 1-1, but not onto.
7. a. (14, -8, -6). b. (4, 10, 14). c. (1, 4, -5).
d. (2, 5, 7). e. (x, y, -x - y). f. (0, 0, 0).
8. a. Suppose 𝒱 = 𝒮 ⊕ 𝒯 and V = U + W, with U ∈ 𝒮, W ∈ 𝒯. Then
P₁(V) = U. If r ∈ R, then rV = r(U + W) = rU + rW, and rU ∈ 𝒮, rW ∈ 𝒯.
(Why?) So P₁(rV) = rU = rP₁(V), and P₁ preserves scalar multiplication.
The proof that a projection map preserves addition is similar.
9. a. If 𝒮 = ℒ{(1, 0)}, then 𝒯 = ℒ{(0, 1)} is perpendicular to 𝒮 and
ℰ² = 𝒮 ⊕ 𝒯. Thus P₁ is the desired projection. Since (a, b) = (a, 0) + (0, b),
P₁(a, b) = (a, 0).
b. P₁(a, b) = (a/10 + 3b/10, 3a/10 + 9b/10).
10. a. T(a, b) = (a/2 − √3 b/2, √3 a/2 + b/2), T(1, 0) = (1/2, √3/2).
c. To show that a rotation is 1-1 and onto, let (a, b) be any vector in the
codomain, and suppose T(x, y) = (a, b). That is,
(x cos θ − y sin θ, x sin θ + y cos θ) = (a, b).
This vector equation yields a system of linear equations. Consider the coefficient
matrix of this system and show that, for any a, b ∈ R, there is a unique
solution for x and y.
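The suggested computation can be sketched numerically: the determinant of the coefficient matrix is cos²θ + sin²θ = 1 for every θ, so the system always has a unique solution.

```python
import math

# Determinant of the rotation system's coefficient matrix
#   (cos t  -sin t; sin t  cos t)
# is cos^2 t + sin^2 t = 1, so a rotation is 1-1 and onto.
def rotation_det(t):
    return math.cos(t) ** 2 + math.sin(t) ** 2

for t in (0.0, 0.7, math.pi / 3, 2.5):
    assert math.isclose(rotation_det(t), 1.0)
```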
11. A translation is linear only if h = k = 0. Every translation is both one to one
and onto.
12. The reflection is linear, 1-1, and onto.
13. Show that T(U) = T(0).
14. This is proved for n = 2 in problem 2. The induction assumption is that it
is true for any linear combination of n − 1 terms, n ≥ 3.
2 Page 144
1. a. ℐ_T = {(3a − 2c, b + c) | a, b, c ∈ R} = ℒ{(3, 0), (0, 1), (−2, 1)} = 𝒱₂.
𝒩_T = {(a, b, c) | 3a − 2c = 0, b + c = 0} = {t(2, −3, 3) | t ∈ R} = ℒ{(2, −3, 3)}.
b. ℐ_T = {(x, y, z) | 5x + 7y − 6z = 0}; 𝒩_T = {0}.
c. ℐ_T = R; 𝒩_T = {a − ai | a ∈ R} = ℒ{1 − i}.
d. ℐ_T = ℒ{1 + t², t + 2t², 2 − t} = ℒ{1 + t², 2 − t}.
𝒩_T = {a + bt + ct² | b = c, a = −2c} = ℒ{2 − t − t²}.
2. a. {(4t − 2(5t), 3(5t) − 6t) | t ∈ R} = {t(2, −3) | t ∈ R} = ℒ{(2, −3)}.
b. {0}. c. {(2, −3)}. d. ℒ{(2, −3)}. e. {(−4, 6)}.
f. If it is parallel to ℒ{(1, 2)}, i.e., the line with equation y = 2x.
3. a. ℒ{(1, 0, −2), (0, 1, 1)} = {(x, y, z) | 2x − y + z = 0}.
b. ℒ{(1, 0, 1/2), (0, 1, 5/6)} = {(x, y, z) | 3x + 5y − 6z = 0}.
c. {0}. d. 𝒱₂.
4. a. (a, b, c) ∈ 𝒩_T if 2a + b = 0 and a + b + c = 0. A homogeneous system
of 2 equations in 3 unknowns has nontrivial solutions, so T is not 1-1.
b. One to one. c. T(a, −2a, 0, 0) = 0 for all a ∈ R.
d. One to one. e. Not one to one.
5. Each is a map from a 3-dimensional space to a 2-dimensional space.
6. ℐ_P = 𝒯 and 𝒩_P = 𝒮.
7. a. 𝒩_T. b. Why is T⁻¹[𝒯] nonempty? For closure under addition and
scalar multiplication, is it sufficient to take U, V ∈ T⁻¹[𝒯] and show that
rU + sV ∈ T⁻¹[𝒯] for any r, s ∈ R?
c. T⁻¹[{W}] must contain at least one vector.
8. (x, y) = [x − y](4, 3) + [2y − (3/2)x](2, 2); therefore T(x, y) = (4x − 3y, 2x − y).
9. (x, y) = [x − 3y](4, 1) + [4y − x](3, 1), so T(x, y) = (10y − 2x, 3x − 11y,
7y − 2x).
10. a. (2, 4) = (2/3)(3, 6), but T(2, 4) ≠ (2/3)T(3, 6). Therefore any extension of T
to all of 𝒱₂ will fail to be linear.
b. Yes, but not uniquely, for now T(2, 4) = (…, 6). An extension can be
obtained by extending {(2, 4)} to a basis for 𝒱₂, say to {(2, 4), (5, 7)}, and
then assigning an image in 𝒱₃ to the second basis vector, say T(5, 7) = (0, 3, 8).
For these choices, T may be extended linearly to obtain the map
T(x, y) = (…, 2x − y, …).
11. Observe that since T(1, 0) ∈ 𝒱₂, there exist a, c ∈ R such that T(1, 0) = (a, c).
13. a. 𝒩_T = {(x, y, z) | 2x + y − 3z = 0}.
b. Recall that a plane parallel to 𝒩_T has the Cartesian equation 2x + y − 3z
= k for some k ∈ R; or that for any fixed vector U ∈ 𝒱₃, {V + U | V ∈ 𝒩_T}
is a plane parallel to 𝒩_T.
d. 𝒮 could be any subspace of dimension 1 which is not contained in 𝒩_T.
3 Page 153
1. a. [T₁ + T₂](a, b) = (4a, 2a + 5b, 2a − 12b).
b. [5T₁](a, b) = (15a − 5b, 10a + 20b, 5a − 40b).
c. [T₁ − 3T₂](a, b) = (2b − 2a, −2a − 3b, 4b).
d. 0_{Hom(𝒱₂, 𝒱₃)}, the zero map.
2. a. x₁T₁ + x₂T₂ + x₃T₃ + x₄T₄ = 0 yields the equations x₁ + x₂ + x₃ = 0,
x₁ + x₃ = 0, x₁ + x₂ = 0, and x₂ + x₄ = 0. This system has only the
trivial solution, so the vectors are linearly independent. b. 2T₁ − T₂ − T₃ = 0.
c. T₁ − T₂ + T₃ − T₄ = 0. d. T₁ + T₂ − 2T₃ = 0.
3. "Let T be a linear map from 𝒱 to 𝒲."
4. b. Since we cannot multiply vectors, the first zero must be the zero map in
Hom(𝒱, 𝒲), the second is the zero vector in 𝒱, and the fourth is the zero vector
in 𝒲.
5. a. If T ∈ Hom(𝒱, 𝒲) and r ∈ R, then it is necessary to show that rT: 𝒱 → 𝒲,
and that the map rT is linear.
b. Why is it necessary to show that [(a + b)T](V) = [aT + bT](V) for all
V ∈ 𝒱? Where is a distributive law in 𝒲 used?
6. a. (0, 3). b. (0, −6). c. (−9, 5). d. (−12, 0).
e. (4, −3, 0, 1)_B. f. (0, 0, 0, 0)_B. g. (4, 0, 1, 0)_B. h. (2, 1, 3, −5)_B.
7. a. a₁₂T₁₂ + a₂₂T₂₂ + a₃₂T₃₂. b. a₃₁T₃₁ + a₃₂T₃₂.
c. a₁₁T₁₁ + a₂₁T₂₁ + a₃₁T₃₁ + a₁₂T₁₂ + a₂₂T₂₂ + a₃₂T₃₂.
8. a. Σᵢ₌₁² Σⱼ₌₁² yᵢⱼTᵢⱼ. b. Σᵢ₌₁⁴ a₄ᵢT₄ᵢ. c. Σᵢ₌₁³ Σⱼ₌₁² rᵢaᵢⱼTᵢⱼ.
9. a. T₁₁(x, y) = (y, y, 0). b. T₂₂(x, y) = (x − 3y, x − 3y, 0).
c. T₂₃(4, 7) = (−17, 0, 0). d. T₁₁(5, 9) = (9, 9, 9).
10. a. T(3, 1) = (2, 13, 3): (3, 10, −11)_{B₂}; therefore T(3, 1) = [3T₁₁ + 10T₁₂
− 11T₁₃](3, 1). And T(1, 0) = (1, 4, 0): (0, 4, −3)_{B₂}, so T(1, 0) = [0T₂₁ + 4T₂₂
− 3T₂₃](1, 0). Therefore T = 3T₁₁ + 10T₁₂ − 11T₁₃ + 0T₂₁ + 4T₂₂ − 3T₂₃,
and T: (3, 10, −11, 0, 4, −3)_B. You should be obtaining the coordinates of
vectors with respect to B₂ by inspection.
b. T: (8, −8, 12, 2, −2, 5)_B. c. T: (2, 1, 2, 1, 0, 0)_B.
d. T: (6, −2, 3, 2, −2, 3)_B.
11. a. T₁₁(a + bt) = a. T₁₃(a + bt) = at². T₂₁(a + bt) = b.
b. [2T₁₁ + 3T₂₁ − 4T₁₃](a + bt) = 2a + 3b − 4at².
c. T = T₁₂ + T₂₃. d. T = 2T₂₁ + T₁₂ − 4T₂₂ + 5T₁₃.
e. (0, 0, 1, 0, 0, 1)_B. f. (3, −3, 7, −1, 4, 5)_B. g. (0, 0, 0, 0, 0, 0)_B.
12. a. (0, 0). b. (1, 2). c. (−5, −10). d. You must first find the
coordinates of (3, 4, 5) with respect to B₁; T₂₂(3, 4, 5) = −2(1, 2) = (−2, −4).
e. T₁₂(1, 0, 0) = (4, 8). f. T: (3, 2, 2, −2, 9, −1)_B.
g. T: (7, 0, −10, 3, −5, 4)_B.
13. a. 0 ∈ R, (0, 0) ∈ 𝒱₂, (0, 0, 0, 0) ∈ 𝒱₄, and the map in Hom(𝒱₂, 𝒱₄) that
sends (x, y) to (0, 0, 0, 0) for all (x, y) ∈ 𝒱₂.
4 Page 164
l. a. (b b. 8)
2. a. (g b)
b.
3. a. (g g) b. (b g)
4. a. (? b b.
d. ande. (g b g)
o o)
o 1 .
o o
8)
(
18
c. -3
-7
-8
d.
d. ( -1)
-3 4)
-4 l .
9 1
c. (g b l)
5. a. A(0, 1, 0)ᵀ = (1, −1, 0)ᵀ, so T(0, 1, 0) = (1, −1, 0).
b. (−1, 7, −1). c. (0, 0, 0).
6. a. −3(1, 1) + 5(1, 0) = (2, −3), so T(1, 1) = (2, −3).
b. T(4, 0) = (12, −16). c. T(3, 2) = (7, −10). d. T(0, 1) = (−1, 1).
7. a. The matrix of T with respect to B₁ and B₂ is A.
b. A(1, 0, 1)ᵀ = (1, 3, 0)ᵀ, so T(1 + t²) = 1 + 3t.
c. A(−4, 3, 0)ᵀ = (−8, −9, −5)ᵀ, so T(3t − 4) = −8 − 9t − 5t².
d. A(0, 0, 0)ᵀ = (0, 0, 0)ᵀ, so T(0) = 0.
8. a. T(V₃) = 0. b. T(V₄) = 7W₂. c. T(V) is a linear combination of
two vectors in B₂. d. ℐ_T ⊂ ℒ{W₁, …, Wₘ}. e. T(V₁) = a₁W₁, …,
T(Vₙ) = aₙWₙ, provided n ≤ m. What if n > m?
9. a. T is onto, but not 1-1. b. T is 1-1. c. T is not onto.
d. T is onto, but not 1-1.
10. There are an infinite number of answers. If B₁ = {Eᵢ}, then B₂ must be
{(3, −4), (−1, 1)}.
11. If B₁ = {V₁, V₂, V₃}, then V₃ must be in the null space of T. One possible
answer is B₁ = {(1, 0, 0), (0, 1, 0), (1, 1, 1)} and B₂ = {(−3, −3, 1), (1, …),
(1, 0, 0)}. Check that B₁ and B₂ are in fact bases and that they give the desired
result.
12. If B₁ = {V₁, V₂, V₃, V₄} and B₂ = {W₁, W₂, W₃}, then V₃, V₄ ∈ 𝒩_T and
T(V₁) = W₁ − W₃, T(V₂) = W₂ + W₃.
13. (−5 7 2; 26 −33 −3), (0 −3 −3; −4 12 12), (−5 4 −1; 22 −21 9),
and (−10 5 −5; 40 −30 30).
14. Let B₁ = {V₁, …, Vₙ} and B₂ = {W₁, …, Wₘ}. If φ(T) = A = (aᵢⱼ),
then T(Vⱼ) = Σᵢ₌₁ᵐ aᵢⱼWᵢ. To show that φ preserves scalar multiplication,
notice that [rT](Vⱼ) = r[T(Vⱼ)] = r Σᵢ₌₁ᵐ aᵢⱼWᵢ = Σᵢ₌₁ᵐ raᵢⱼWᵢ. Therefore
[rT](Vⱼ): (ra₁ⱼ, …, raₘⱼ)_{B₂} and φ(rT) = (raᵢⱼ). But (raᵢⱼ) = r(aᵢⱼ) = rA
= rφ(T).
15. a. A= (i
16. If T: (b₁₁, b₁₂, b₁₃, b₂₁, b₂₂, b₂₃)_B, then T = Σₕ₌₁² Σₖ₌₁³ bₕₖTₕₖ. Therefore
T(Vⱼ) = Σₕ,ₖ bₕₖTₕₖ(Vⱼ) = Σₖ bⱼₖWₖ.
Hence T(Vⱼ): (bⱼ₁, bⱼ₂, bⱼ₃)_{B₂}, and these coordinates form the j-th column
of the matrix of T.
5 Page 174
1. a. S ∘ T(a, b) = (4b − 2a, 8a − 4b), T ∘ S(x, y) = (2y − 6x, 12x).
b. S ∘ T(a, b, c) = (a − b + c, −b, a + c), T ∘ S(x, y) = (2y, x + y).
c. S ∘ T is undefined; T ∘ S(a, b, c) = (2a, 2a + 3b, 4a − 4b, b).
d. S ∘ T(a + bt) = −b − 6at²; T ∘ S is undefined.
2. a. T²(a, b) = (4a, b − 3a). b. T³(a, b) = (8a, b − 7a).
c. Tⁿ(a, b) = (2ⁿa, b − [2ⁿ − 1]a). d. [T² − 2T](a, b) = (0, −a − b).
e. [T + 3I](a, b) = (5a, 4b − a).
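The closed form in part c can be checked by iterating T directly. The sketch below assumes T(a, b) = (2a, b − a), which is consistent with the answers given for T² and T³:

```python
# Check T^n(a, b) = (2^n a, b - (2^n - 1) a) by iteration,
# assuming T(a, b) = (2a, b - a) (consistent with parts a and b above).
def T(a, b):
    return (2 * a, b - a)

def T_power(n, a, b):
    for _ in range(n):
        a, b = T(a, b)
    return (a, b)

for n in range(8):
    assert T_power(n, 3, 5) == (2 ** n * 3, 5 - (2 ** n - 1) * 3)
```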
5. Since, by definition, S ∘ T: 𝒱 → 𝒲, it need only be shown that if U, V ∈ 𝒱
and a, b ∈ R, then [S ∘ T](aU + bV) = a([S ∘ T](U)) + b([S ∘ T](V)).
7. b. When multiplication is commutative, a(b + c) = (b + c)a.
8. If S ∈ Hom(𝒱₂), then there exist scalars x, y, z, and w such that S(a, b)
= (xa + yb, za + wb) (problem 11, page 145), and since a linear map is
determined by its action on a basis, S ∘ T = T ∘ S if and only if S ∘ T(Eᵢ)
= T ∘ S(Eᵢ), i = 1, 2, where {Eᵢ} is the standard basis for 𝒱₂.
a. {S | S(a, b) = (xa, wb) for some x, w ∈ R}. b. Same as a.
c. {S | S(a, b) = (xa + yb, za + wb), x + 2y − w = 0, 3y + z = 0}.
d. {S | S(a, b) = (xa + yb, za + wb), x + y − w = 0, y + z = 0}.
9. b. For any map T, T ∘ I = I ∘ T.
10. a. Not invertible.
b. Solving the system a − 2b = x, b − a = y for a and b gives
T⁻¹(x, y) = (−x − 2y, −x − y).
c. Suppose T⁻¹(x + yt + zt²) = (a, b, c); then a + c = x, b = y, and
−a − b = z. So T⁻¹(x + yt + zt²) = (−y − z, y, x + y + z).
d. T⁻¹(x, y) = y − x + [3x − 2y]t.
e. A map from R₂[t] to R₃[t] is not invertible.
f. T⁻¹(x, y, z) = (−3x + 2y − 2z, 6x − 3y + 4z, −2x + y − z).
11. a. Assume S₁ and S₂ are inverses for T and consider the fact that
S₁ = S₁ ∘ I = S₁ ∘ [T ∘ S₂].
13. The vector space cannot be finite dimensional in either a or b.
15. If V = [T ∘ S](W), consider S(V). Where do you need problem 14?
16. a. S ∘ T(a, b) = (17a + 6b, 14a + 5b).
b. S⁻¹(a, b) = (a − b, 4b − 3a). T⁻¹(a, b) = (2a − b, 3b − 5a).
(S ∘ T)⁻¹(a, b) = (5a − 6b, 17b − 14a).
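The formulas in 16 can be checked numerically: the stated inverse undoes S ∘ T, and it agrees with T⁻¹ ∘ S⁻¹.

```python
# Check of problem 16, using the formulas given above.
def ST(a, b):      return (17 * a + 6 * b, 14 * a + 5 * b)
def S_inv(a, b):   return (a - b, 4 * b - 3 * a)
def T_inv(a, b):   return (2 * a - b, 3 * b - 5 * a)
def ST_inv(a, b):  return (5 * a - 6 * b, 17 * b - 14 * a)

for a, b in [(1, 0), (0, 1), (3, -7), (2, 5)]:
    assert ST_inv(*ST(a, b)) == (a, b)           # the inverse undoes S o T
    assert T_inv(*S_inv(a, b)) == ST_inv(a, b)   # (S o T)^{-1} = T^{-1} o S^{-1}
```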
6 Page 186
c.
(
-1 8)
1
a. 18 12
b. (6). d.
12 6 -9
(
19 15 16)
f. 30 -8 4 .
5 1 3
e. (g 8)
(
3/2 -2) ( 2/11 1/11)
3
a. -1/2 1 b. -3/11 4/11
d. -: e. (b g
-3 6 -1 o 1 o
-2 8)
1 -4.
6 1
4. a. Show that the product B⁻¹A⁻¹ is the inverse of AB.
b. By induction on k ≥ 2. Use the fact that (A₁A₂⋯AₖAₖ₊₁)⁻¹
= ((A₁A₂⋯Aₖ)Aₖ₊₁)⁻¹.
5. For Theorem 4.18: (⇒) Suppose T is nonsingular; then T⁻¹ exists. Let
φ(T⁻¹) = B and show that B = A⁻¹.
(⇐) Suppose A⁻¹ exists. Then there exists S such that φ(S) = A⁻¹. (Why?)
Show that S = T⁻¹.
6. a. = (i -
1
m = H-i =
b. x = 14, y = 3, z = 13. c. a = 9/8, b = −3/8, c = 13/16, d = 1/2.
7. a. (1, 0, 0): (3, 6, −4)_B.
b. (0, 3, −1)_B. c. (−3, 2, 1)_B. d. (0, 0, 2)_B.
8. a.
9 10 -23/2 2 3
b. (0, 1, 0)_B. c. (−9, −2, 10)_B. d. (21, 5, −23)_B.
9. A⁻¹ gives: a. T⁻¹(0, 1) = (−1, −2/3). b. (−1, 1). c. (−4, −3).
d. (−x − y, −x − (2/3)y).
10. The matrices of T and T⁻¹ with respect to the standard basis for 𝒱₃ are
(2 1 3; 2 2 1; 4 2 4) and (−3/2 −1/2 5/4; 1 1 −1; 1 0 −1/2),
respectively.
a. (10, −8, −4). b. (1, −2, 3).
c. T⁻¹(x, y, z) = (−(3/2)x − (1/2)y + (5/4)z, x + y − z, x − (1/2)z).
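The pair of matrices in problem 10 can be checked with exact rational arithmetic: their product should be the identity.

```python
from fractions import Fraction as F

# Verify that the two matrices of problem 10 are inverses.
A = [[2, 1, 3],
     [2, 2, 1],
     [4, 2, 4]]
B = [[F(-3, 2), F(-1, 2), F(5, 4)],
     [F(1),     F(1),     F(-1)],
     [F(1),     F(0),     F(-1, 2)]]

AB = [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
      for i in range(3)]
assert AB == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```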
7 Page 194
1. a. … b. (0 1 0; 1 0 0; 0 0 1). c. (1/2 0; 0 1). e. …
f. (1 0 0; 0 1 0; 0 −3 1). g. (1 0 0; 0 1 0; 0 0 1/4).
2. a. E₁ = (1/2 0; 0 1), E₂ = (1 0; −5 1), E₃ = (1 0; 0 −1/6), E₄ = (1 −3; 0 1).
c. E₁⁻¹ = (2 0; 0 1), E₂⁻¹ = (1 0; 5 1), E₃⁻¹ = (1 0; 0 −6), E₄⁻¹ = (1 3; 0 1).
d. …
3. There are many possible answers for each matrix.
a. ( (? ) ( ( -f) b. G ( ( j).
c. (z l ?) l ?) (8 ? l) (8 ! ?)(8 r ?)(8 r ?) .
d. l g) b) r b (g b
4. a. (l
o 1 o)
1 o o -
o o 1
Row reduction of the augmented matrix gives
(3 1 4; 1 0 2; 2 5 0)⁻¹ = (5/3 −10/3 −1/3; −2/3 4/3 1/3; −5/6 13/6 1/6).
b. (2 −1; −5/2 3/2).
c. (−3/8 1 1/8 −17/12; −1/2 0 1/2 −1/3; 3/16 −1/2 −1/16 21/24;
3/8 0 −1/8 1/12).
d. (9/2 8 −15/2; −3 −5 5; 2 3 −3).
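The row-reduction procedure for computing an inverse can be sketched as code; run on (3 1 4; 1 0 2; 2 5 0) it reproduces the inverse stated above.

```python
from fractions import Fraction as F

def gauss_jordan_inverse(M):
    """Invert a square matrix by row-reducing (M | I)."""
    n = len(M)
    aug = [[F(x) for x in row] + [F(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]   # interchange rows
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]          # scale the pivot row
        for r in range(n):
            if r != col and aug[r][col] != 0:         # clear the column
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

Ainv = gauss_jordan_inverse([[3, 1, 4], [1, 0, 2], [2, 5, 0]])
assert Ainv == [[F(5, 3), F(-10, 3), F(-1, 3)],
                [F(-2, 3), F(4, 3), F(1, 3)],
                [F(-5, 6), F(13, 6), F(1, 6)]]
```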
5. Consider the product of the two given matrices.
6. Let φ be the isomorphism φ: Hom(𝒱ₙ) → 𝓜ₙₓₙ with φ(S) = A and φ(T) = B.
Show that T = 0, so that B = φ(0) = 0.
7. a. Since A is row-equivalent to B, there exists a sequence of elementary row
operations that transforms A to B. If this sequence of operations is performed
using the elementary matrices E₁, …, Eₖ, show that Q = Eₖ⋯E₂E₁ is
nonsingular and satisfies the equation B = QA.
b. Suppose Q₁A = Q₂A; if A is nonsingular, then A⁻¹ exists.
c. ( -1 G ) = ( = G )
( 3/2 -2) ( 5
-4
2
8). 8. a.
-1/2
1 .
b. -1 1 c. -1
-13 10 -2 1
d.
A-1.
(g
o
(g
-2
?)
(o
1
?)
(g
o
1)
9. a. 1/2 b. 1 c. 1 o d. 1
o o o o o
(g
o o)
e. o 1 .
1 o
10. See problem 7a.
-4/3
(J
-5)
11. a. 1 b.
2 .
o
12. a. With Q as in 8a, P = (1 0 −8; 0 1 2; 0 0 1). b. With Q as in 8c,
P = (1 0 18; 0 1 …; 0 0 −5). c. With P as in 11b, Q = (1 0 0; 0 1 0; … 6 1).
Review Problems Page 196
1. a. T(2, 5) = 4(−2, 1, 0) + 7(3, 1, −4) − 3(0, −2, 3) = (13, 17, −37).
b. (6, 5, −13). c. (9a − b, 26a − 7b, −46a + 11b).
2. a. T is the identity map.
b. T(a, b, c) = (6a − 6b − 4c, −6a + 10b + 3c, 4a − 2b + c).
3. a. Show that I is an isomorphism.
b. If T: 𝒱 → 𝒲 is an isomorphism, show that T⁻¹ exists and is an isomorphism.
c. Show that the composition of two isomorphisms is an isomorphism.
4. a. A = (…). b. The rank of A is 2. d. B₁ = {(5, 2), (3, 4)}, B₂ = {Eᵢ}.
6. a. 𝒩_T = ℒ{(3, −8, 2)}.
b. There are many possible answers. B₁ can be taken to be {(1, 0, 0), (0, 1, 0),
(3, −8, 2)}, which is clearly a basis for 𝒱₃. Then since T(0, 1, 0) = (0, 1, 3),
the basis B₂ must have the form {U, (0, 1, 3), V}, with T(1, 0, 0) = 2U −
(0, 1, 3) + 3V. There are now an infinite number of choices for B₂ given by the
solutions of this equation, satisfying the condition that U, (0, 1, 3), V are
linearly independent.
8. For the first part, suppose A² = A and A is nonsingular. Show that this
implies A = Iₙ. The 2 × 2 matrix satisfies these conditions if a + d = 1 and
bc = a − a².
9. a. ℐ_T = {(x, y, z) | y = 3x}. b. 𝒩_T = ℒ{(2, 1, −7)}.
c. The line parallel to 𝒩_T that passes through U has the vector equation
P = t(2, 1, −7) + U, t ∈ R.
11. b. T(a, b, c) ∈ 𝒮 if (a + b) − (b + c) + (a − c) = 0, or a = c. Therefore
T⁻¹[𝒮] = {(a, b, c) | a = c} and T[T⁻¹[𝒮]] = {T(a, b, a) | a, b ∈ R} = ℒ{(1, 1, 0)}
≠ 𝒮.
c. T⁻¹[𝒮] = {(a, b, c) | 4a − b − 8c = 0}. d. 𝒱₃.
12. a. k = −12. b. No values. c. k = 0, −2.
13. Suppose S ∈ Hom(ℰ²). We can show that if S is nonsingular, then S sends
parallelograms to parallelograms; and if S is singular, then it collapses parallelograms
into lines.
For U, V ∈ ℰ², write (Uᵀ, Vᵀ) for the matrix with U and V as columns.
Then area(0, U, V, U + V) = |det(Uᵀ, Vᵀ)| is the area of the parallelogram
with vertices 0, U, V, and U + V.
If A is the matrix of S with respect to the standard basis, then (S(U))ᵀ
= AUᵀ. That is, in this case, coordinates are simply the components of the
vector.
S sends the vertices 0, U, V, U + V to 0 = S(0), S(U), S(V), S(U) + S(V)
= S(U + V). Now the area of the figure with these four vertices is given by
|det((S(U))ᵀ, (S(V))ᵀ)| = |det(AUᵀ, AVᵀ)| = |det A(Uᵀ, Vᵀ)|
= |det A · det(Uᵀ, Vᵀ)| = |det A| area(0, U, V, U + V).
So if det A = 0 (and S is singular), then the image of every parallelogram has
no area.
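The area computation can be checked numerically; the matrix A and the vectors below are chosen for illustration, not taken from the text.

```python
# Check that S multiplies parallelogram area by |det A|.
def det(U, V):
    return U[0] * V[1] - U[1] * V[0]

def apply(A, U):
    return (A[0][0] * U[0] + A[0][1] * U[1],
            A[1][0] * U[0] + A[1][1] * U[1])

A = [[2, 1], [1, 3]]                  # det A = 5 (illustrative matrix)
U, V = (1, 2), (4, -1)
area = abs(det(U, V))                 # area of (0, U, V, U + V)
image_area = abs(det(apply(A, U), apply(A, V)))
assert image_area == abs(2 * 3 - 1 * 1) * area
```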
14. a. A² = 4A − 3I₂. b. A² = 7A. c. A³ = 2A².
d. A³ = 4A² − 9I₂. e. A³ = 4A² − A − 6I₃. f. A⁴ = 0.
Chapter 5
1 Page 207
1. For each part, except b, you should be able to find the coordinates of the vectors
in B′ with respect to B by inspection.
( 2/9 -1/9)
1
-V
a. b. c. 3
-1/9 5/9 .
-5
o 1)
el
-1
-2)
(i
2
d. -4 2 . e. -3 2 1 . f. -4
1 o 4 2 o 5
2. a. The transition matrix from B to {Eᵢ} is (3 5; 1 2)⁻¹ = (2 −5; −1 3), and
(2 −5; −1 3)(4, 3)ᵀ = (−7, 5)ᵀ, so (4, 3): (−7, 5)_B.
b. (−13, 8)_B. c. (4, −3)_B. d. (2a − 5b, 3b − a)_B.
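Part a can be checked directly: applying (2 −5; −1 3) to (4, 3) gives the B-coordinates, and those coordinates recover (4, 3) from the basis {(3, 1), (5, 2)}.

```python
# Coordinates of (4, 3) with respect to B = {(3, 1), (5, 2)};
# (2 -5; -1 3) is the inverse of (3 5; 1 2).
x1, x2 = 2 * 4 + (-5) * 3, (-1) * 4 + 3 * 3   # (2 -5; -1 3)(4, 3)^T
assert (x1, x2) == (-7, 5)
# Check: x1*(3, 1) + x2*(5, 2) recovers (4, 3).
assert (x1 * 3 + x2 * 5, x1 * 1 + x2 * 2) == (4, 3)
```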
3. a. (2, −2, 1)_B. b. (1, −1, 1)_B. c. (2, 0, 1)_B.
4. a. For this choice, (0, 1) = (2, 1) − 2(1, 0) and (1, 4) = 4(2, 1) − 7(1, 0).
If V: (x₁, x₂)_B and V: (x₁′, x₂′)_{B′}, then
V = x₁(2, 1) + x₂(1, 0) = x₁′(0, 1) + x₂′(1, 4)
= x₁′[(2, 1) − 2(1, 0)] + x₂′[4(2, 1) − 7(1, 0)]
= [x₁′ + 4x₂′](2, 1) + [−2x₁′ − 7x₂′](1, 0).
Therefore x₁ = x₁′ + 4x₂′ and x₂ = −2x₁′ − 7x₂′, or
X = (1 4; −2 −7)X′ = PX′.
5. a. To prove (AB)ᵀ = BᵀAᵀ, let A = (aᵢⱼ), B = (bᵢⱼ), Aᵀ = (cᵢⱼ), Bᵀ = (dᵢⱼ),
AB = (eᵢⱼ), (AB)ᵀ = (fᵢⱼ), and BᵀAᵀ = (gᵢⱼ). Then express fᵢⱼ and gᵢⱼ in terms
of the elements of A and B to show that fᵢⱼ = gᵢⱼ.
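The identity can be checked numerically for a sample pair of matrices (chosen here for illustration):

```python
# Check (AB)^T = B^T A^T on a sample 2x3 and 3x2 pair.
def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def tr(X):
    return [list(col) for col in zip(*X)]

A = [[1, 2, 0], [3, -1, 4]]
B = [[2, 1], [0, 5], [-3, 2]]
assert tr(mul(A, B)) == mul(tr(B), tr(A))
```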
6. a. The transformation changes length, angle, and area. See Figure A2.
b. The transformation is called a "shear." The points on one axis are fixed,
so not all lengths are changed. See Figure A3.
c. The transformation is a "dilation." Lengths are doubled, but angles
remain unchanged.
Figure A2
Figure A3
d. The transformation is a rotation through 45°; length, angle, and area
remain unchanged by a rotation.
e. The transformation is a reflection in the y axis. Length and area are
unchanged, but the sense of angles is reversed.
f. This transformation can be regarded as either a rotation through 180°
or a reflection in the origin.
8. Suppose P is the transition matrix from B to B′, where B and B′ are bases for
𝒱 and dim 𝒱 = n. Then the columns of P are coordinates of n independent
vectors with respect to B. Use an isomorphism between 𝒱 and 𝓜ₙₓ₁ to show
that the n columns of P are linearly independent.
2 Page 213
-1)
c6
-S) (2 1)
'1
1
l. -4,
A'= 1i ' p = 3 2' Q =
o
1 o
2.
A=G
-3)
1 '
4' (-1
. = 1 p = G
l)
3. If P is nonsingular, P is the product of elementary matrices, P = E₁E₂⋯Eₖ,
and AP = AE₁E₂⋯Eₖ is a matrix obtained from A with k elementary column
operations.
4. a. Suppose Q = I₂ and P = (2 1; 3 4)⁻¹.
5. a. If T(a, b, c) = (a + b + c, −3a + b + 5c, 2a + b), then the matrix of
T with respect to {(1, 0, 0), (0, 1, 0), (1, −2, 1)} and {(1, −3, 2), (1, 1, 1),
(1, 0, 0)} is the desired matrix.
7. If λI = P⁻¹AP, then P(λI)P⁻¹ = A.
12. T(U_;) = rpj.
3 Page 221
1. a. Reflexive and transitive, but not symmetric.
b. Symmetric, but neither reflexive nor transitive.
c. The relation is an equivalence relation.
d. Symmetric and transitive, but not reflexive.
e. Reflexive and symmetric, but not transitive.
2. Suppose a is not related to anything. Consider problem 1d.
3. a. For symmetry: if a ~ b, then a − b = 5c for some c ∈ Z. Therefore
b − a = 5(−c). Since −c is an integer, b ~ a. This equivalence relation
is usually denoted a ≡ b mod 5, read "a is congruent to b modulo 5."
b. Five. For example, the equivalence class with representative 7 is
[7] = {p ∈ Z | 7 − p = 5k for some k ∈ Z}
= {7 − 5k | k ∈ Z} = {2 + 5h | h ∈ Z} = [2].
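The five classes, and the fact that [7] = [2], can be checked directly:

```python
# a ~ b iff a - b is a multiple of 5; there are five classes,
# one for each remainder, and 7 lies in the class of 2.
def related(a, b):
    return (a - b) % 5 == 0

assert related(7, 2)                                # [7] = [2]
assert len({a % 5 for a in range(-50, 50)}) == 5    # five classes in all
assert all(related(a, a % 5) for a in range(-50, 50))
```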
Figure A4
d. The property of having a certain remainder when divided by 5.
4. a. For transitivity, suppose a ~ b and b ~ c; then there exist rational
numbers r, s such that a − b = r and b − c = s. The rationals are closed
under addition, so r + s is rational. Since a − c = r + s, a ~ c and ~ is
transitive.
b. [1] = [2/3] = {r | r is rational}, [√2] = {√2 + r | r is rational}.
c. The number is uncountably infinite.
5. b. Given PQ and any point X ∈ ℰ², there exists a point Y such that PQ ~ XY.
Therefore drawing all the line segments in [PQ] would completely fill the
plane. So only representatives for [PQ] can be drawn.
c. The set of all arrows starting at the origin is a good set of canonical forms.
Compare with Example 2, page 33.
d. It is possible to define addition and scalar multiplication for equivalence
classes of directed line segments in such a way that the set of equivalence
classes becomes a vector space. For example, [PQ] + [XY] can be defined to
be [PT], where XY ~ PS and P, Q, S, T are vertices of a parallelogram as in
Figure A4. In this vector space any arrow is a representative of some vector,
but the vectors are not arrows.
6. b. The lines with equations y = 2x + b, where b ∈ R is arbitrary.
7. a. 𝒩_T. b. All planes with normal (2, −1, −4).
8. a. Row-equivalence is a special case of equivalence with P = Iₙ.
b. Rank, because of the reason for part a.
c. The (row) echelon form.
d. An infinite number, since (0 0; 0 0), (0 1; 0 0), (1 0; 0 1), and (1 x; 0 0), for each
x ∈ R, belong to different equivalence classes.
9. A few are area, perimeter, being isosceles, or having a certain value for the
lengths of two sides and the included angle.
10. Write the matrix as (Iₖ 0₁; 0₂ 0₃); then 0₁ is k × (m − k), 0₂ is (n − k) × k,
and 0₃ is (n − k) × (m − k).
4 Page 230
1. For each part, there are many choices for P, and the diagonal entries in D
might be in any order.
a. … b. … c. P = (…), D = diag(3, 5, 0). d. P = (…), D = diag(2, −4, 10).
2. a. There are no real characteristic roots; therefore it is not diagonalizable.
b. The characteristic equation has three distinct real roots, since the discriminant
of the quadratic factor is 28. c. (a, b, c, d)ᵀ is a characteristic
vector for 1 if 4a − 3c + d = 0, and this equation has three independent
solutions. d. There are only two independent characteristic vectors.
3. All vectors of the form (h, k, h + 2k)ᵀ, with h² + k² ≠ 0, are characteristic
vectors for 2, and (k, −2k, 3k)ᵀ, with k ≠ 0, are characteristic vectors for −4.
If P = (…), then P⁻¹AP = diag(2, 2, −4).
4. (2k, k, −k)ᵀ, with k ≠ 0, are the characteristic vectors for 2, and (k, k, k)ᵀ,
k ≠ 0, are the characteristic vectors for 1.
5. A0 = λ0 for all matrices A and all real numbers λ.
6. Not necessarily; consider (…).
7. The matrix A − λI is singular for each characteristic value λ; therefore the
system of linear equations (A − λI)X = 0 has at most n − 1 independent
equations and n unknowns.
8. a. 2, 4, and 3, for subtracting any of these numbers from the main diagonal
yields a singular matrix. b. 5 and 0; for 0, note that two columns are equal.
c. 7 and −2; subtracting −2 yields two equal rows.
9. a. Every polynomial equation of odd degree has at least one real root.
b. (…) and (…) give two different types of examples.
10. a. The diagonal entries. b. They are invariant under similarity.
11. a. T is singular. b. ff T
5 Page 237
1. a. −7 − λ + λ². c. −11 + 2λ + λ². d. −5λ − λ³.
2. a. 2 − 5λ + λ². d. −23 − 3λ + λ².
3. a. sₙ₋₁ = Σᵢ₌₁ⁿ Aᵢᵢ, where Aᵢᵢ is the cofactor of aᵢᵢ.
b. The characteristic polynomial of A is |A| − s₂λ + (tr A)λ² − λ³.
4. i. −2 + 6λ + 2λ² − λ³. ii. 35λ − λ³. iii. 1 − 3λ + 3λ² − λ³.
5. You should be able to find two matrices with the same characteristic polynomial
but different ranks.
7. a. The characteristic polynomial is 2 − 4λ + 3λ² − λ³, and 𝒮(1) = ℒ{(1, 0, 1)}.
b. The plane has the equation x + z = 0.
8. The maps in a, b, and e are diagonalizable.
9. b. 𝒮(1) = ℒ{(1, −1, −1)} and 𝒮(3) = ℒ{(1, 0, 1), (0, 1, 0)}. Since
{(1, −1, −1), (1, 0, 1), (0, 1, 0)} is a basis for 𝒱₃, 𝒱₃ = 𝒮(1) ⊕ 𝒮(3). This
also implies that 𝒮(1) ∩ 𝒮(3) = {0}, but to prove it directly, suppose
U ∈ 𝒮(1) ∩ 𝒮(3). Then U = (a, −a, −a) and U = (b, c, b) for some a, b, c ∈ R.
But then a = b = −a = c implies U = 0 and 𝒮(1) ∩ 𝒮(3) = {0}. Thus
the sum is direct.
10. Any plane π parallel to 𝒯 can be written as {U + P | U ∈ 𝒯} for some fixed
P ∈ 𝒮(1).
11. a. T(U) = (1, 2, 2), T²(U) = (−1, 3, 7). b. (…).
c. T²(U) ∈ ℒ{U, T(U)}. d. (…). Notice that the characteristic
values for T are on the main diagonal.
Review Problems Page 238
1. a. (…). b. There are many answers. If B₁ = {(1, 0, 0), (0, 1, 0),
(1, −2, −1)}, then B₂ = {(2, 1), (−1, 2)}.
2. b. For consistent systems, choose the equations exhibiting the solutions
using the system with augmented matrix in echelon form. What about
inconsistent systems?
3. c. ( ? 8) for each x in R.
0 0 X
5. The discriminant of the characteristic equation is −4 sin²θ. Therefore a
rotation in the plane has characteristic values only if θ = nπ, where n is an
integer. There are two cases, n even or n odd.
6. b, c, and d are diagonalizable.
7. Define the sum by 𝒮 + 𝒯 + 𝒰 = {U + V + W | U ∈ 𝒮, V ∈ 𝒯, W ∈ 𝒰}.
Two equivalent definitions for the direct sum are: 𝒮 + 𝒯 + 𝒰 is a direct
sum if 𝒮 ∩ (𝒯 + 𝒰) = 𝒯 ∩ (𝒮 + 𝒰) = 𝒰 ∩ (𝒮 + 𝒯) = {0}. Or 𝒮 + 𝒯
+ 𝒰 is a direct sum if for every vector V ∈ 𝒱, there exist unique vectors X ∈ 𝒮,
Y ∈ 𝒯, Z ∈ 𝒰 such that V = X + Y + Z.
8. a. There are three distinct characteristic values.
b. 𝒮(2) = ℒ{(0, 0, 1, 0), (1, 0, 0, 1)}, 𝒮(−2) = ℒ{(1, 0, 3, −3)}, and
𝒮(−3) = ℒ{(0, 1, 0, 0)}.
c. Follows since {(0, 0, 1, 0), (1, 0, 0, 1), (1, 0, 3, −3), (0, 1, 0, 0)} is a
basis for 𝒱₄.
9. a. 8 − 4λ + 2λ² − λ³. d. (…).
12. a. T(a + bt) = −33a − 5b + (203a + 31b)t.
b. B′ = {5 − 29t, 1 − 7t}.
Chapter 6
1 Page 246
1. a. 2. b. 0. c. 13. d. −4.
2. a. i. 1. ii. 10. iii. 1. iv. 0. v. 8. vi. −14.
b. For linearity in the first variable, let r, s ∈ R and U, V, W ∈ 𝒱₂. Suppose
U: (a₁, a₂)_B, V: (b₁, b₂)_B, and W: (c₁, c₂)_B; then (rU + sV): (ra₁ + sb₁,
ra₂ + sb₂)_B (Why?). Using this fact, (rU + sV, W)_B = (ra₁ + sb₁)c₁
+ (ra₂ + sb₂)c₂ and r(U, W)_B + s(V, W)_B = r(a₁c₁ + a₂c₂) + s(b₁c₁ + b₂c₂).
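The two expressions in 2b are equal, as a quick numerical check over random coordinates confirms:

```python
import random

# (U, V)_B = a1*b1 + a2*b2 in B-coordinates is linear in the first variable.
def ip(U, V):
    return U[0] * V[0] + U[1] * V[1]

random.seed(1)
for _ in range(100):
    r, s = random.randint(-9, 9), random.randint(-9, 9)
    U, V, W = [(random.randint(-9, 9), random.randint(-9, 9)) for _ in range(3)]
    rUsV = (r * U[0] + s * V[0], r * U[1] + s * V[1])
    assert ip(rUsV, W) == r * ip(U, W) + s * ip(V, W)
```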
4. a. 1. b. 0. c. 9. d. −14.
5. Recall that 0·V = 0.
6. a. 5, 0, −5. b. Since b is not symmetric, it is necessary to show that
b(U, V) is linear in each variable.
7. a. 2, sin 1. b. b(rf + sg, h) = (rf(1) + sg(1))h(1) = (rf(1))h(1) + sg(1)h(1)
= rb(f, h) + sb(g, h); fill in the reasons. Since b is symmetric, this implies b
is bilinear. c. Consider f(x) = sin πx.
8. a. 8, 0.
10. a. x = z = 1, y = 0.
b. x = 13/100, y = −11/100, z = 17/100.
c. x = 3, y = −1, z = 2.
11. f is symmetric and bilinear, but it should be easy to find values for x, y, and
z for which f is not positive definite.
2 Page 254
1. a. 1/√3. b. √8/3. c. 1/√2.
2. a. (U, V)² = 25, ‖U‖² = 10, and ‖V‖² = 5.
b. (U, V)² = 1/9, ‖U‖² = 1/3, and ‖V‖² = 2/3.
4. The direction (⇐) is not difficult. For (⇒), consider the case (U, V) = ‖U‖ ‖V‖
with U, V ≠ 0. The problem is to find nonzero scalars a and b such that
aU + bV = 0. If such scalars exist, then 0 = (aU + bV, V) = ‖V‖(a‖U‖
+ b‖V‖). Therefore a possible choice is a = ‖V‖ and b = −‖U‖. Consider
(aU + bV, aU + bV) for this choice.
5. The second vector (x, y) must satisfy (x, y) ∘ (3/5, 4/5) = 0 and x² + y² = 1.
Solutions: (4/5, −3/5) and (−4/5, 3/5).
6. a. There are two vectors which complete an orthonormal basis, ±(1/√2, 0,
1/√2). Note that these are the two possible cross products of the given
vectors.
b. (1/(3√2))(4, 1, −1). c. (3/7, −6/7, 2/7).
7. Prove that {V | (V, U) = 0 for all U ∈ S} is a subspace of 𝒱.
8. a. Use ( , )_B. b. Since (2, −7): (2, −3)_B, ‖(2, −7)‖ = √13.
9. b. {(1/√6)(1, 0, 2, 1), (1/√14)(2, 3, −1, 0), (1/√22)(−3, 2, 0, 3)}.
10. a. Use the identity 2 sin A sin B = cos(A − B) − cos(A + B). Why
isn't the integral zero when n = m?
b. 1/2 − (1/π) Σₙ₌₁^∞ (1/n) sin 2nπx. Notice that this series does not equal f(x)
when x = 0 or x = 1; at these points the series equals (1/2)(f(0) + f(1)).
3 Page 260
1. a. V₁ = (1, 0, 0) and V₂ = (0, 0, 1), for W₂ = (3, 0, 5) −
[(3, 0, 5) ∘ (1, 0, 0)](1, 0, 0) = (0, 0, 5).
b. {(2/√13, 3/√13), (−3/√13, 2/√13)}.
c. {(1/√3, −1/√3, 1/√3), (0, 1/√2, 1/√2)}.
d. {(1/2, 1/2, 1/2, 1/2), (1/√6, 0, −2/√6, 1/√6), (−2/√30, −3/√30, 1/√30,
4/√30)}.
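The set given in 1d can be checked for orthonormality directly:

```python
import math

# Check that the three vectors in 1d are orthonormal in R^4.
basis = [
    (1/2, 1/2, 1/2, 1/2),
    (1/math.sqrt(6), 0.0, -2/math.sqrt(6), 1/math.sqrt(6)),
    (-2/math.sqrt(30), -3/math.sqrt(30), 1/math.sqrt(30), 4/math.sqrt(30)),
]

def dot(U, V):
    return sum(u * v for u, v in zip(U, V))

for i, U in enumerate(basis):
    for j, V in enumerate(basis):
        expected = 1.0 if i == j else 0.0
        assert math.isclose(dot(U, V), expected, abs_tol=1e-12)
```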
2. {1, 2√3 t − √3, 6√5 t² − 6√5 t + √5}.
3. a. {(1/2, 0), (l/._/2S, 2N7)}.
b. Check that your set is orthonormal.
4. {(1, o, O), (2, t, O), (-2..3, -tN3, tN3)}.
6. a. 𝒮⊥ = {(x, y, z) | 2x + y + 4z = 0, x + 3y + z = 0} = ℒ{(11, −2, −5)}.
b. ℒ{(3, −1)}. c. {(x, y, z) | 2x − y + 3z = 0}.
d. {(x, y, z) | x = 2z}. e. ℒ{(−2, −4, 4, 7)}.
8. b. ℒ{(0, 1, 0)} is invariant under T. ℒ{(0, 1, 0)}⊥ = {(x, y, z) | y = 0}.
9. 𝒮⊥ = ℒ{(4, −1)}. ℰ² = 𝒮 + 𝒮⊥ because {(1, 4), (4, −1)} spans ℰ². The
sum is direct, that is, 𝒮 ∩ 𝒮⊥ = {0}, because {(1, 4), (4, −1)} is linearly
independent.
10. 𝒮⊥ = ℒ{(0, 1, 0, 0), (4, 0, 8, −5)}.
4 Page 269
1. a. Since the matrix of T with respect to {Eᵢ} is (…), T is a
reflection. The line reflected in is y = (1/√3)x.
b. The rotation through the angle −π/3.
c. The rotation through φ, where cos φ = −4/5 and sin φ = −3/5.
d. The identity map, φ = 0. e. Reflection in the line y = −x.
3. You need an expression for the inner product in terms of the norm. Consider
‖U − V‖².
4. a. (1/3 2/3 −2/3; −2/3 2/3 1/3; 2/3 1/3 2/3).
b. (1/√3 1/√3 1/√3; −1/√2 0 1/√2; 1/√6 −2/√6 1/√6).
5. a. The composition of the reflection of ℰ³ in the xy plane and the rotation
with the z axis as the axis of rotation and angle −π/4.
b. Not orthogonal.
c. The rotation with axis ℒ{(1 − √2, −1, 0)} and angle π.
d. The rotation about the y axis with angle π/4.
e. Not orthogonal.
6. Use the definition of orthogonal for one, and problem 2 for the other.
7. a. The determinant is 1, so T is a rotation. (2, 1, 0) is a characteristic vector
for 1, so the axis of the rotation is ℒ{(2, 1, 0)}. The orthogonal complement of
the axis is the plane 𝒮 = ℒ{(1, −2, 0), (0, 0, 1)}; and the restriction T_𝒮
rotates 𝒮 through the angle φ satisfying cos φ = (0, 0, 1) ∘ T(0, 0, 1) = 2/7.
The direction of this rotation can be determined by examining the matrix of
T with respect to the orthonormal basis {(2/√5, 1/√5, 0), (1/√5, −2/√5, 0),
(0, 0, 1)}.
b. The determinant is −1, so −1 is a characteristic value for T. The characteristic
polynomial for T is −1 + λ + λ² − λ³ = −(λ + 1)(λ² − 2λ + 1),
so the other two characteristic values are both 1. Therefore T is the reflection
in the plane through the origin which has a characteristic vector corresponding
to −1 as normal. (1, −2, 3) is such a vector, so T is the reflection of ℰ³ in the
plane x − 2y + 3z = 0.
c. The determinant is −1, so −1 is a characteristic value for T. The characteristic
polynomial is as in part b. Therefore T is a reflection. T(1, −1, 1)
= −(1, −1, 1); hence T reflects ℰ³ in the plane with equation x − y + z = 0.
d. The determinant is −1 and the characteristic polynomial of T is −1 + (2/3)λ
+ (2/3)λ² − λ³ = −(λ + 1)(λ² − (5/3)λ + 1). Since the discriminant of the
second degree factor is negative, there is only one real characteristic root.
Thus T is the composition of a reflection and a rotation. A characteristic
vector corresponding to −1 is (1, −3, 1). Therefore the reflection is in the
plane x − 3y + z = 0 and the axis of the rotation is ℒ{(1, −3, 1)}. (1, 0, −1)
∈ ℒ{(1, −3, 1)}⊥, so the angle of the rotation φ, disregarding its sense, is the
angle between (1, 0, −1) and T(1, 0, −1) = (4/3, 1/3, −1/3). Thus cos φ = 5/6.
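The final value in part d can be recomputed directly from the two vectors:

```python
import math

# Angle between (1, 0, -1) and T(1, 0, -1) = (4/3, 1/3, -1/3): cos = 5/6.
U = (1.0, 0.0, -1.0)
TU = (4/3, 1/3, -1/3)
dot = sum(u * v for u, v in zip(U, TU))
cos_phi = dot / (math.dist((0, 0, 0), U) * math.dist((0, 0, 0), TU))
assert math.isclose(cos_phi, 5/6)
```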
e. T is the rotation with axis ℒ{(2, −2, 1)} and angle π.
f. T is the rotation with axis ℒ{(1, 1, 0)} and angle φ satisfying cos φ = −7/9.
9. Suppose T is the reflection of ℰ² in the line ℓ with equation P = tU + A,
t ∈ R. Consider Figure A5.
10. b. An orthonormal basis {U₁, U₂, U₃, U₄} can be found for ℰ⁴ for which
the matrix of T is either
(1 0 0 0; 0 1 0 0; 0 0 a c; 0 0 b d), (−1 0 0 0; 0 1 0 0; 0 0 a c; 0 0 b d), or
(−1 0 0 0; 0 −1 0 0; 0 0 a c; 0 0 b d),
where (a c; b d) = (cos φ −sin φ; sin φ cos φ) for some angle φ. The first matrix
gives a rotation in each plane parallel to ℒ{U₃, U₄}, that is, a rotation about the
plane ℒ{U₁, U₂}, and the second matrix gives a reflection of ℰ⁴ in the hyperplane
ℒ{U₂, U₃, U₄}.
11. This may be proved without using the property that complex roots come
in conjugate pairs. Combine the facts det A = 1 and det(Iₙ − A) =
(−1)ⁿ det(A − Iₙ) to show that det(A − Iₙ) = 0.
5 Page 275
1. a. (7, 7i − 3). b. (2 + 4i, 2i − 14, 10). c. (8i − 6, 8i).
d. (1 − i, 5 − 6i, 1 + 4i; 2 + 5i, 13 − i, 4 + 13i; i, 1 + 3i, i − 1).
2. a. Det(…) = 1 − i ≠ 0, so the set is linearly independent.
Since the dimension of 𝒱₂(C) is 2, the set is a basis.
b. (1 + i, 3i): (4, −3)_B, (4i, 7i − 4): (1 − i, 2i − 1)_B.
1
,-..
:::..
'-'
f-.
3. b. For every w ∈ C, [1 + i - iw](i, 1 - i, 2) + [2 + (i - 3)w](2, 1, -i)
+ w(5 - 2i, 4, -1 - i) = (3 + i, 4, 2).
6. a. 3 + 4 = 2, 2 + 3 = 0, 3 + 3 = 1.
b. 2·4 = 3, 3·4 = 2, 4² = 1, 2⁶ = 4.
c. -1 = 4, -4 = 1, -3 = 2. d. 2⁻¹ = 3, 3⁻¹ = 2, 4⁻¹ = 4.
7. 2² = 0 in Z₄. Why does this imply that Z₄ is not a field?
8. a. ℒ{(1, 4)} = {(0, 0), (1, 4), (2, 3), (3, 2), (4, 1)}.
b. No. c. 25. d. No.
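The Z₅ computations in problems 6 and 8 can be replayed directly; a Python sketch (the span and inverses are exactly as listed above):

```python
# Arithmetic in Z5 (problem 6) and the span of (1, 4) in V2(Z5) (problem 8a).
p = 5
assert (3 + 4) % p == 2 and (2 + 3) % p == 0 and (3 + 3) % p == 1       # 6a
assert (2*4) % p == 3 and (3*4) % p == 2 and pow(4, 2, p) == 1          # 6b
assert pow(2, 6, p) == 4
assert (-1) % p == 4 and (-4) % p == 1 and (-3) % p == 2                # 6c
assert pow(2, -1, p) == 3 and pow(3, -1, p) == 2 and pow(4, -1, p) == 4 # 6d
# 8a: the span of (1, 4) consists of the 5 scalar multiples
span = {((t*1) % p, (t*4) % p) for t in range(p)}
print(sorted(span))
assert span == {(0, 0), (1, 4), (2, 3), (3, 2), (4, 1)}
```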
9. a. If U = (z, w) and a ∈ C, then T(aU) = T(a(z, w)) = T(az, aw) = (aw, -az)
= a(w, -z) = aT(U). Similarly T(U + V) = T(U) + T(V).
b. λ² + 1. c. If z is any nonzero complex number, then z(1, i) are
characteristic vectors for the characteristic value i, and z(1, -i) are charac-
teristic vectors for -i.
d. The matrix of T with respect to the basis {(1, i), (1, -i)} is diag(i, -i).
10. b. λ² - (3 + i)λ + 2 + 6i.
c. The discriminant is -18i. To find √(-18i), solve (a + bi)² = -18i for
a and b. If z is any nonzero complex number, z(i - 2, i) are characteristic
vectors for the characteristic value 2i, and z(1 + i, -2i) are characteristic
vectors for 3 - i.
d. The matrix of T with respect to the basis {(i - 2, i), (1 + i, -2i)} is
diag(2i, 3 - i).
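For 10c, the square root of the discriminant -18i can be found numerically as well as by solving (a + bi)² = -18i; a sketch:

```python
# Discriminant of x^2 - (3+i)x + (2+6i) and the two characteristic values.
import cmath

disc = (3 + 1j)**2 - 4*(2 + 6j)
assert disc == -18j
root = cmath.sqrt(disc)              # a square root of -18i, here 3 - 3i
lam1 = ((3 + 1j) + root) / 2
lam2 = ((3 + 1j) - root) / 2
print(lam1, lam2)
assert abs(lam1 - (3 - 1j)) < 1e-9   # 3 - i
assert abs(lam2 - 2j) < 1e-9         # 2i
```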
13. b. Dim 𝒱 = 4. Notice that 𝒱 ≠ 𝒱₂(C).
14. b. Vectors such as (√2, 0) and (1, e) are in 𝒱 but they are not in 𝒱₂(Q).
c. No. d. If (a/b)1 + (c/d)√2 = 0 with a, b, c, d ∈ Z and c ≠ 0, then
√2 = -ad/bc, but √2 ∉ Q. e. No, since π ∉ Q. f. The dimension
is at least 3. In fact, 𝒱 is an infinite-dimensional vector space even though the
vectors are ordered pairs of real numbers.
15. Suppose there exists a set P ⊂ C closed under addition and multiplication.
Show that either i ∈ P or -i ∈ P leads to both 1 and -1 being in P.
6 Page 285
1. a. 4i - 1. b. -4 - 11i.
2. a. {(x, y, z) | -ix + 2y + (1 - i)z = 0}. b. ℒ{(i, 2i, -1)}.
4. a. Use ⟨ , ⟩_R. b. (4i, 2): (1 + i, -2)_B, so ||(4i, 2)|| = √6.
5. a. Neither. b. Both. c. Symmetric. d. Hermitian. e. Neither.
f. Neither.
7. a. The characteristic equation for A is λ² + 9 = 0, so A has no real charac-
teristic values.
b. If P is a matrix whose columns are characteristic vectors for 3i and
-3i, then P⁻¹AP = diag(3i, -3i).
Figure A6
8. a. (1 i 1 t ).
b. λ = 1, 4. If P is a matrix whose columns are characteristic vectors for
1 and 4, then P⁻¹AP = diag(1, 4).
9. Yes. Examples may be found with real or complex entries.
10. a. T(a, b) = (a/√2 - b/√2, a/√2 + b/√2).
b. B = {(1/√2, 1/√2), (-1/√2, 1/√2)}.
c. The curve is a hyperbola with vertices at the points with coordinates
(±3, 0)_B or (±3/√2, ±3/√2)_E and asymptotes given by y' = ±√2 x' in
terms of coordinates with respect to B.
11. a. If B = {(2/√5, 1/√5), (1/√5, -2/√5)}, then G = {V ∈ E₂ | V: (x', y')_B
and 6x'² + y'² = 24}.
b. Figure A6.
Review Problems Page 286
1. r > 1.
2. a. {(1/√2, 0), (3/√2, √2)}. b. {(1/√3, 0), (-1/√15, √35)}.
7. a. To see that 𝒮 need not equal (𝒮⊥)⊥, consider the inner product space of
continuous functions ℱ with the integral inner product. Let 𝒮 = R[x], the set
of all polynomial functions. Then using the Maclaurin series expansion for
sin x, it can be shown that the sine function is in (𝒮⊥)⊥, but it is not in 𝒮.
8. a. T*(x, y) = (2x + 5y, 3x - y). b. The matrix of T* is the transpose of
the matrix of T.
9. a. (…)
b. E = -A⁻¹B, F = D - BᵀA⁻¹B.
c. det = det A · det(D - BᵀA⁻¹B).
10. a. -277. b. 85. c. 162.
Chapter 7
1 Page 297
l. a. (X
b. (X
X2)(7j2 762)
e 5/2 0) (X)
( 5 o
(f)
c. (x y
z) 562 .
d. (x y z) O 4
-1 1/2
2. A = (…) where x + y = -5.
3. a. 3a² + 2ab + 3b². b. (1/25)(99a² + 14ab + 51b²). This is obtained
using (a, b): ((1/5)(3a + 4b), (1/5)(4a - 3b))_B. c. 4a² + 2b². d. a² + b².
5. When n = 2, the sum Σ²ᵢ,ⱼ₌₁ aᵢⱼxᵢxⱼ is
    Σ²ⱼ₌₁ a₁ⱼx₁xⱼ + Σ²ⱼ₌₁ a₂ⱼx₂xⱼ = a₁₁x₁x₁ + a₁₂x₁x₂ + a₂₁x₂x₁ + a₂₂x₂x₂.
Three other ways to write the sum are
    Σ²ᵢ₌₁ Σ²ⱼ₌₁ aᵢⱼxᵢxⱼ,   Σ²ᵢ₌₁ xᵢ Σ²ⱼ₌₁ aᵢⱼxⱼ,   Σ²ⱼ₌₁ xⱼ Σ²ᵢ₌₁ aᵢⱼxᵢ.
7. a. If b(U, V) = XᵀAY with X and Y the coordinates of U and V with respect
to B, then b(U, V) = 21x₁y₁ + 13x₁y₂ + 21x₂y₁ + 30x₂y₂. For example,
a₁₂ = (4, 1) ∘ T(1, -2) = (4, 1) ∘ (4, -3) = 13.
b. 14x₁y₁ + x₁y₂ + 5x₂y₁ + 3x₂y₂.
9. a. q(x, y) = 4xy - 5y². b. q(x, y, z) = 3x² - 2xz + 2y² + 8yz.
10. a. x₁y₁ - 4x₁y₂ - 4x₂y₁ + 7x₂y₂.
b. x₁y₁ + x₁y₂ + x₂y₁ - x₂y₂ - 6x₂y₃ - 6x₃y₂.
12. Consider Example 4 on page 296.
13. a. It should be a homogeneous 1st degree polynomial in the coordinates of
an arbitrary vector. Thus a linear form on 𝒱 with basis B is an expression of
the form a₁x₁ + ··· + aₙxₙ where x₁, ..., xₙ are the coordinates of an
arbitrary vector with respect to B.
b. If U: (x₁, ..., xₙ)_B and T is defined by T(U) = Σⁿᵢ₌₁ aᵢxᵢ for U ∈ 𝒱,
then T ∈ Hom(𝒱, R) and (a₁, ..., aₙ)X = AX is the matrix expression
for T in terms of coordinates with respect to B.
c. Show that if W is held fixed, then T(U) = b(U, W) and S(U) = b(W, U)
are expressed as linear forms.

2 Page 304
1. b. A' = (…). c. B' = {(2/5, 1/5), (…)}.
d. q(a, b) = x₁² - x₂² if (a, b): (x₁, x₂)_B'.
2. b. The characteristic values are 9, 18, 0; A' = diag(1, 1, 0).
c. B' = {(…), (√2/9, √2/18, √2/9), (-2/3, 2/3, 1/3)}.
d. q(a, b, c) = x₁² + x₂² if (a, b, c): (x₁, x₂, x₃)_B'.
3. a. A' = PᵀAP = (…).
c. q(a + bt) = -4x₁² + x₂² if a + bt: (x₁, x₂)_B'.
5. Consider letting X = Eᵢ and Y = Eⱼ, with Eᵢ and Eⱼ from {Eᵢ}.
6. a. r = 2, s = 0, diag(1, -1). b. r = 2, s = -2, diag(-1, -1).
c. det A = -3, so r = 3; tr A = 6, so s = 1; diag(1, 1, -1).
d. The characteristic polynomial of A is -13 + 5λ - 3λ² - λ³.
Det A = -13, so r = 3. Tr A = -3, so the roots might be all negative or
only one might be negative. But the second symmetric function s₂ of the roots
is -5. If all the roots were negative, s₂ would be positive, so the canonical form
is diag(1, 1, -1).
7. a. The matrix of q is singular.
b. The characteristic polynomial for the matrix A of q with respect to {Eᵢ}
is 8 - 21λ + 11λ² - λ³. This polynomial is positive for all values of λ ≤ 0,
therefore all three characteristic roots are positive and A is congruent to I₃.
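The claim in 7b that all three characteristic roots are positive can be checked by looking at the sign of the polynomial; a sketch (the sample points are chosen for illustration):

```python
# The polynomial 8 - 21x + 11x^2 - x^3 is positive for x <= 0 and changes
# sign three times on (0, 9), so all three roots are positive.
def p(x):
    return 8 - 21*x + 11*x**2 - x**3

assert all(p(x) > 0 for x in [0, -1, -5, -100])   # no root for x <= 0
samples = [0, 1, 2, 8, 9]
values = [p(x) for x in samples]
print(values)                                      # [8, -3, 2, 32, -19]
sign_changes = sum(1 for a, b in zip(values, values[1:]) if a*b < 0)
assert sign_changes == 3                           # three positive real roots
```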
3 Page 311
1. If P is a transition matrix, what are the possible signs of det P?
2. Figure A7.
3. a. A line has the equation P = tU + V for some U, V ∈ Eₙ, t ∈ R. Show
that T(P) lies on some line ℓ for all t ∈ R and that if W ∈ ℓ, then T(tU + V)
= W for some t ∈ R.
b. That is, does ||T(U) - T(V)|| = ||U - V|| for all U, V ∈ Eₙ?
4. Suppose A, B, and C are collinear with B between A and C. Show that if T(B)
is not between T(A) and T(C), a contradiction results.
5. a. Given two lines ℓ₁ and ℓ₂, find a translation which sends ℓ₁ to a line
through the origin. Show that there exists a rotation that takes the translated
line to a line parallel to ℓ₂, and then translate this line onto ℓ₂.
Figure A7
b. Use the normal to obtain the rotation. In this case you must show that
your map sends the first hyperplane onto the second. (Why wasn't this necessary
in a?)
6. Use the fact that 0, U, V, U + V, and 0, T(U), T(V), T(U) + T(V) determine
parallelograms to show that T(U + V) = T(U) + T(V).
7. Show that two spheres are congruent if and only if they have the same radius.
Recall that X is on the sphere with center C and radius r if ||X - C|| = r.
4 Page 320
1. a. A = (…), B = (0, 7). b. A = (…), B = (-4, 6).
c. A = (  3  -2   0
         -2   0  7/2
          0  7/2  8  ).
d. A = (  0   4  -2
          4   0   3
         -2   3   0  ).
2. The characteristic values of A are 26 and 13 with corresponding characteristic
vectors (3, 2) and (-2, 3). Therefore the rotation given by
    ( x )   ( 3/√13  -2/√13 ) ( x' )
    ( y ) = ( 2/√13   3/√13 ) ( y' )
yields 26x'² - 13y'² - 26 = 0, or x'² - y'²/2 = 1. See Figure A8.
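In problem 2, the stated characteristic values and vectors determine the symmetric matrix A (A itself is not repeated above); a sketch reconstructing it as 26·uuᵀ/|u|² + 13·vvᵀ/|v|² and checking the eigen-relations:

```python
# Rebuild the symmetric matrix from its eigen-data with exact arithmetic.
from fractions import Fraction as F

u = [3, 2]    # characteristic vector for 26
v = [-2, 3]   # characteristic vector for 13
n = 13        # |u|^2 = |v|^2 = 13
A = [[F(26*u[i]*u[j] + 13*v[i]*v[j], n) for j in range(2)] for i in range(2)]
print([[int(x) for x in row] for row in A])   # [[22, 6], [6, 17]]
# A u = 26 u and A v = 13 v:
assert [sum(A[i][j]*u[j] for j in range(2)) for i in range(2)] == [26*x for x in u]
assert [sum(A[i][j]*v[j] for j in range(2)) for i in range(2)] == [13*x for x in v]
```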
3. The rigid motion given yields x''² = 4py'' with p = 2. Let x', y' be
coordinates with respect to B' = {(4/5, -3/5), (3/5, 4/5)}; then B'' is obtained
from B' by the translation (x'', y'') = (x' - 2, y' + 3). See Figure A9.
Figure A8

Figure A9
4. If P = (…) and X = PX', then X'ᵀ(PᵀAP)X' = 5x'² - 3y'².
5. a. Intersecting lines. b. Parabola. c. Ellipse, (0, 0) satisfies the
equation. d. No graph (imaginary parallel lines). e. Hyperbola.
6. Probably, but since A is a 2 × 2 matrix, det(rA) = r² det A, so the sign of the
determinant is unchanged if r ≠ 0.
7. b. The equation is -7t² + 28t - 21 = 0. The values t = 1, 3 give (-1, 1)
and (1, -3) as the points of intersection.
8. a. B' = {(1/√2, 1/√2), (-1/√2, 1/√2)}. The canonical form is x'² = 9.
See Figure A10(a).
b. B' = {(1/√10, -3/√10), (3/√10, 1/√10)}. The canonical form is
x'²/3² + y'²/(√10)² = 1. See Figure A10(b).
5 Page 329
1. a. Interchanging x and y and multiplying by -1 yields
x'²/4 + y'²/9 - z'² = -1.
b. (…) yields x'²/5 - y'² = 1.
c. The translation (x', y', z') = (x - 2, y, z + 1) yields 4z'² + 2x' + 3y' = 0,
and 4x''² - √13 y'' = 0, or x''² = 4py'' with p = √13/16, results from
    ( x'' )   (    0        0      1 ) ( x' )
    ( y'' ) = ( -2/√13   -3/√13   0 ) ( y' )
    ( z'' )   (  3/√13   -2/√13   0 ) ( z' )
d. (x', y', z') = (x + 2, y + 1, z + 3), x'²/2 + y'² + z'²/3 = -1.
e. The rotation whose matrix has (-2/3, 2/3, 1/3) as a row yields
9x'² - 9y'² = 18z', or x'² - y'² = 2z'.
2. The plane y = mx + k, m, k ∈ R, intersects the paraboloid x²/a² - y²/b² = 2z
in the curve given by
    (1/a² - m²/b²)x² - 2mkx/b² - k²/b² = 2z,   y = mx + k.
If 1/a² ≠ m²/b², then there is a translation under which the equation becomes
x'² = 4pz', y' = m'x' + k'. This is the equation of a parabola in the given
plane. If 1/a² = m²/b², the section is a line which might be viewed as a de-
generate parabola.
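The substitution in problem 2 is routine but easy to get a sign wrong in; a sketch checking the identity at integer points with exact rational arithmetic (the values a = 2, b = 3, m = 5, k = 7 are arbitrary):

```python
# With y = mx + k, x^2/a^2 - y^2/b^2 must equal
# (1/a^2 - m^2/b^2)x^2 - 2mkx/b^2 - k^2/b^2.
from fractions import Fraction as F

def lhs(x, a, b, m, k):
    return F(x*x, a*a) - F((m*x + k)**2, b*b)

def rhs(x, a, b, m, k):
    return (F(1, a*a) - F(m*m, b*b))*x*x - F(2*m*k*x, b*b) - F(k*k, b*b)

for x in range(-3, 4):
    assert lhs(x, 2, 3, 5, 7) == rhs(x, 2, 3, 5, 7)
print("substitution identity verified")
```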
Figure A10
4. Intersecting lines, parallel lines, and coincident lines.
6. a. x'²/8 + y'²/8 - z'²/4 = -1.
7. a. Parabolic cylinder. b. Hyperboloid of 2 sheets.
c. Point. d. Intersecting planes.
8. Consider a section of the two surfaces in the plane given by y = mx. It is suf-
ficient to show that as x approaches infinity, the difference between the z co-
ordinates approaches zero.
6 Page 338
1. a. (-2, -2, 2) and (-25/4, 9/4, -13/2). The diametral plane is given by
x + 7y - z + 1 = 0.
b. (1, 0, -1) is a single point of intersection.
c. (-√2/2, -1/2, 0), two coincident points of intersection. The diametral
plane is given by 2x + 2√2y + (4√2 - 2)z + 2√2 = 0.
2. Suppose the line passes through the point O; then if O is not on the plane with
equation 4x + y - 2z = 5, the line intersects the surface in a single point. If
O is on this plane but not on the line with equations 4x + y - 2z = 5, 7x - 7z
= 15, then the line does not intersect the surface. If O is on the line given above,
then the line through it with direction (1, -2, 1) is a ruling. (Does this give
an entire plane of rulings?)
4. a. For A = diag(1/5, 1/3, -1/2, 0) and O = (5, 3, 4, 1)ᵀ we must find K
= (α, β, γ, 0)ᵀ such that KᵀAK = KᵀAO = 0. These equations can be com-
bined to yield (3γ - 4β)² = 0, α = 2γ - β; therefore K may be taken as
(5, 3, 4, 0)ᵀ as expected.
b. K = (1, √5, 4 - √2, 0)ᵀ or (1, -√5, 4 + √2, 0)ᵀ.
c. K = (1, 1, 1, 0)ᵀ or (9, 2, -12, 0)ᵀ.
d. K = (0, 1, 2, 0)ᵀ.
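The reduction in 4a can be verified with exact arithmetic; a sketch (the ranges of b and c are arbitrary):

```python
# With A = diag(1/5, 1/3, -1/2, 0), O = (5, 3, 4, 1) and K = (2c - b, b, c, 0),
# K^T A O = 0 automatically and K^T A K = 0 reduces to (3c - 4b)^2 = 0.
from fractions import Fraction as F

d = [F(1, 5), F(1, 3), F(-1, 2), F(0)]   # the diagonal of A
O = [5, 3, 4, 1]
for b in range(-4, 5):
    for c in range(-4, 5):
        K = [2*c - b, b, c, 0]
        assert sum(d[i]*K[i]*O[i] for i in range(4)) == 0       # K^T A O = 0
        quad = sum(d[i]*K[i]*K[i] for i in range(4))            # K^T A K
        assert 30*quad == (3*c - 4*b)**2
print("reduction to (3c - 4b)^2 verified")
```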
5. a. There are no centers. b. A plane, x + y + z = -1, of centers.
c. (1, 1, 3). d. A line, X = t(-3, 2, 1) + (3, -2, 0), of centers.
6. a. KᵀAO = 0, for then the line either intersects the surface in 2 coincident
points (KᵀAK ≠ 0) or it is a ruling (KᵀAK = 0).
b. If O is a center, then KᵀAO = 0 for every direction K.
c. The point X is on the tangent plane if X - O is the direction of a tangent
or a ruling (X ≠ O). Therefore (X - O)ᵀAO = 0. But O is on the surface,
so the equation is XᵀAO = 0. Is the converse clear? That if VᵀAO = 0, then
V is a point on the tangent plane.
7. a. z = 1. b. (1, 1, -1) is a singular point, the center or vertex of the
cone. c. x - 8y - 6z + 3 = 0. d. 21x - 25y + 4z + 38 = 0.
8. If A = diag(1/4, 1, 1/4, -1) and O = (2, 1, 2, 1)ᵀ, then we must find K such
that (2KᵀAO)² - 4(KᵀAK)(OᵀAO) = 0. (Why?) The tangent lines are given
by X = t(1, 0, 1) + (2, 1, 2) and X = t(9, 2, 1) + (2, 1, 2).
Review Problems Page 339
1. a. A = (…), B = (5, -7). b. A = (…), B = (0, 6).
c. A = (…), B = (4, 0, -9).
d. A = (  0   3/2  -2
         3/2   3    1
         -2    1    0  ), B = (-5, 1, -2).
3. a. An inner product. b. Not symmetric. c. Not positive definite.
4. a. -2; diag(-1, -1). b. 0; diag(1, -1). c. 2; diag(1, 1, 0).
5. a. x'²/16 - y'²/4 = 1; x' = x + 2, y' = y - 3.
b. x'² - y'² = 0; x' = x - 5, y' = y + 2.
c. y' = 5x'²; x = (1/5)(3x' - 4y'), y = (1/5)(4x' + 3y').
d. x'² = 4√2 y'; x = (1/√2)(x' + y'), y = (1/√2)(y' - x').
e. x'²/5 + y'²/10 = 1; x = (1/√5)(2x' + y' - 4), y =
(1/√5)(-x' + 2y' - 3).
7. a. (4, -2). b. {(x, y) | 3x + 6y + 1 = 0}. c. (1, 1).
8. a. Hyperboloid of 2 sheets. b. Parallel planes. c. Hyperbolic cylinder.
9. Ellipsoid or hyperboloid. See problem 9, page 127.
10. a. Any plane containing the z axis.
b. Any plane parallel to the z axis.
c. Any plane parallel to the z axis but not parallel to either trace of the
surface in the xy plane.
11. Each line intersects the surface in a single point.
12. P = (…); x'²/8 + y'²/2 = 1.
Chapter 8
1 Page 347
1. a. 𝒮(U, T) = ℒ{(1, 0, 0), (3, 1, 1), (10, 8, 6)} and
a(x) = x³ - 8x² + 18x - 12 = (x - 2)(x² - 6x + 6).
b. 𝒮(U, T) = ℒ{U}; a(x) = x - 2.
c. 𝒮(U, T) = ℒ{U, (-6, 0, 0), (-18, -6, -6)}; a(x) as in part a.
d. 𝒮(U, T) = ℒ{(0, 1, 0), (1, 3, 1)}; a(x) = x² - 6x + 6.
2. a. 𝒮(U, T) = ℒ{(-1, 0, 1), (0, 1, 2), (3, 7, 8)} and
a(x) = x³ - 8x² + 16x - 9 = (x - 1)(x² - 7x + 9).
b. 𝒮(U, T) = ℒ{U}; a(x) = x - 1.
c. 𝒮(U, T) = ℒ{(1, 1, 1), (3, 6, 6)}; a(x) = x² - 7x + 9.
d. x³ - 8x² + 16x - 9.
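The factorizations in 1a and 2a can be checked by multiplying out; a sketch with a small hand-rolled polynomial product (`mul` is a hypothetical helper; coefficient lists are in increasing powers):

```python
# Multiply the stated factors back out and compare coefficients.
def mul(p, q):
    # product of two polynomials given as coefficient lists
    r = [0]*(len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a*b
    return r

# (x - 2)(x^2 - 6x + 6) = x^3 - 8x^2 + 18x - 12
assert mul([-2, 1], [6, -6, 1]) == [-12, 18, -8, 1]
# (x - 1)(x^2 - 7x + 9) = x^3 - 8x^2 + 16x - 9
assert mul([-1, 1], [9, -7, 1]) == [-9, 16, -8, 1]
print("both factorizations check")
```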
o o
L ..... } __L ......... .
4. b. : -1:
......... ----y:.:.:T
o 1 1
5. For the two block case, suppose A₁ ∈ ℳ_{r×r} and A₂ ∈ ℳ_{s×s}, and use the fact
that
    ( A₁  0  )   ( A₁  0  ) ( I_r  0  )
    ( 0   A₂ ) = ( 0   I_s ) ( 0    A₂ ).
6. a. ℒ{(1, -3, 1)}. b. {(x, y, z) | x - 3y + z = 0}.
c. {0}. d. 𝒱₃.
7. a. ℒ{(0, 1, 0)}. b. ℒ{(0, 1, 0), (1, 0, 1)}. c. ℒ{(2, 1, -2)}.
d. ℒ{(0, 1, 0), (2, 1, -2)}. e. 𝒱₃, for p(T) = 0.
8. It is the zero map.
9. a. ℒ{(1, 0, 2)}. b. ℒ{(1, 0, 2)}, compare with 7a and b.
c. ℒ{(1, 0, 1), (0, 1, 0)}. d. {0}.
e. Use problem 8 with parts a and c.
2 Page 358
1. Consider x² - x ∈ Z₂[x] or in general the polynomial (x - c₁)(x - c₂)···
(x - c_h) when F = {c₁, ..., c_h}.
4. a. q(x) = 3x² - x + 2, r(x) = x + 3. b. q(x) = O(x), r(x) = x² + 2x + 1.
c. q(x) = x³ - 2, r(x) = x. d. q(x) = x³ - 3x + 4, r(x) = O(x).
5. Suppose a(x) = b(x)q₁(x) + r₁(x) with deg r₁ < deg b and a(x) = b(x)q₂(x)
+ r₂(x) with deg r₂ < deg b. Show that q₁(x) ≠ q₂(x) and r₁(x) ≠ r₂(x) leads
to a contradiction involving degree.
6. Use the unique factorization theorem.
7. a. g.c.d.: (x - 1)³(x + 2)²; l.c.m.: x(x - 1)⁴(x + 2)⁴(x² + 5).
b. g.c.d.: (x - 4)(x - 1); l.c.m.: (x² - 16)(x - 4)²(x³ - 1).
8. The last equation implies that r_k(x) is a g.c.d. of r_k(x) and r_{k-1}(x); then the next
to the last equation implies that r_k(x) is a g.c.d. of r_{k-1}(x) and r_{k-2}(x).
9. a. x² + 1 is a g.c.d. The polynomials obtained are:
q₁(x) = 2x, r₁(x) = x³ - 2x² + x - 2; q₂(x) = x + 1, r₂(x) = x² + 1;
q₃(x) = x - 2, r₃(x) = O(x).
b. Relatively prime. q₁(x) = 2, r₁(x) = x³ - 4x + 6; q₂(x) = x + 2,
r₂(x) = x² + x - 4; q₃(x) = x - 1, r₃(x) = x + 2; q₄(x) = x - 1, r₄(x)
= -2; q₅(x) = -½(x + 2), r₅(x) = O(x).
c. x + ½ is a g.c.d. q₁(x) = x + 1, r₁(x) = 2x³ - x² + x + 1; q₂(x) = 3x;
r₂(x) = 2x² + x; q₃(x) = x - 1, r₃(x) = 2x + 1; q₄(x) = x.
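The division chain in 9a determines a(x) and b(x), which are not shown above; a sketch rebuilding them from the listed quotients and remainders and confirming the chain is consistent (`mul` and `add` are hypothetical helpers; coefficients in increasing powers):

```python
# Euclid's algorithm run backwards: r1 = q3*r2 + r3, b = q2*r1 + r2, a = q1*b + r1.
from fractions import Fraction as F

def mul(p, q):
    r = [F(0)]*(len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            r[i + j] += x*y
    return r

def add(p, q):
    n = max(len(p), len(q))
    p = p + [F(0)]*(n - len(p)); q = q + [F(0)]*(n - len(q))
    return [x + y for x, y in zip(p, q)]

r3 = [F(0)]
r2 = [F(1), F(0), F(1)]                    # x^2 + 1, the g.c.d.
r1 = add(mul([F(-2), F(1)], r2), r3)       # q3 = x - 2
b  = add(mul([F(1), F(1)], r1), r2)        # q2 = x + 1
a  = add(mul([F(0), F(2)], b), r1)         # q1 = 2x
assert r1 == [-2, 1, -2, 1]                # x^3 - 2x^2 + x - 2, as listed
assert b == [-1, -1, 0, -1, 1]             # a reconstructed b(x)
assert a == [-2, -1, -4, 1, -2, 2]         # a reconstructed a(x)
print("division chain consistent with g.c.d. x^2 + 1")
```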
10. Suppose c/d is in lowest terms, that is, c and d are relatively prime as integers,
and consider dⁿp(c/d).
11. a. 6, 3, 3/2, 2, 1/2, 1. b. 1. c. 8, 4, 2, 1. d. 1/9, 1/3, 1.
12. Consider x² + x + 1.
13. Why is it only necessary to prove that well ordering implies the first principle
of induction? Given A such that 1 ∈ A and k ∈ A implies k + 1 ∈ A, consider
the set B of all positive integers not in A.
17. c. z - 4 = a - b, x² + y² + (z - 4)² - 4 = (z - 3)a + (4 - z)b, and
4x² + 4y² - z² = (z + 4)b - za.
d. A curve in 3-space cannot be described by a single equation.
3 Page 366
1. a. x³ - 5x² + 7x - 3 = (x - 1)²(x - 3). b. x - 3. c. (x - 1)².
d. x - 1.
2. a. x² + x + 2. c. If T ∈ Hom(𝒱₂(C)), then T is diagonalizable.
3. a. (x + 4)(x - 3). b. (x + 2)². c. x³ - 3x² + 3x - 1.
d. (x² + 8x + 16)(x + 5). e. (x² + x + 1)(x + 2).
f. x⁴ + 6x² + 9. g. x² - x - 2.
4. 3a and 3g.
5. a. 𝒩_{T+4I} ⊕ 𝒩_{T-3I} = 𝒮(-4) ⊕ 𝒮(3) = ℒ{(1, -1)} ⊕ ℒ{(1, 6)}.
b. 𝒩_{(T+2I)²} = 𝒱₂. c. 𝒩_{(T-I)³} = 𝒱₃.
d. 𝒩_{(T+4I)²} ⊕ 𝒩_{T+5I} = ℒ{(1, 0, 0), (0, 1, 0)} ⊕ ℒ{(0, 0, 1)}.
e. 𝒩_{T²+T+I} ⊕ 𝒩_{T+2I} = ℒ{(1, 0, -1), (0, 1, 0)} ⊕ ℒ{(1, 0, -2)}.
f. 𝒩_{(T²+3I)²} = 𝒱₄.
g. 𝒩_{T-2I} ⊕ 𝒩_{T+I} = {(x, y, z) | 5x + 10y - 2z = 0} ⊕ ℒ{(1, -1, -1)}.
6. (…). 7. c. (…).
8. a. (2, -1, -5) = (T - I)E₁. b. (3, 3, 3). c. (4, 2, -2).
15. Define Tᵢ = sᵢ(T) ∘ qᵢ(T); then I = T₁ + ··· + T_k and therefore each Tᵢ is
a projection. Tᵢ ∘ Tⱼ = 0 for i ≠ j follows from the fact that p_h(x) | q_g(x) when
h ≠ g. m(x) = pᵢ^{eᵢ}(x)qᵢ(x) implies 𝒩_{pᵢ^{eᵢ}(T)} ⊂ Tᵢ[𝒱], and it remains to obtain
the reverse containment.
4 Page 376
1. a. (5). b. ( 0  -25
               1   10 ).
c. ( 0  0  125
     1  0  -75
     0  1   15 ).
d. ( 0  -4
     1   3 ).
e. ( 0  0  0  -16
     1  0  0   24
     0  1  0  -17
     0  0  1    6 ).
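Parts b and c above follow the book's companion-matrix convention: 1's below the diagonal and the negatives of the coefficients of the monic polynomial in the last column. A sketch (`companion` is a hypothetical helper):

```python
# Build C(p) from the coefficients [a_0, ..., a_{h-1}] of a monic p(x).
def companion(coeffs):
    h = len(coeffs)
    return [[(1 if i == j + 1 else 0) if j < h - 1 else -coeffs[i]
             for j in range(h)] for i in range(h)]

# 1b: p(x) = (x - 5)^2 = x^2 - 10x + 25
assert companion([25, -10]) == [[0, -25], [1, 10]]
# 1c: p(x) = (x - 5)^3 = x^3 - 15x^2 + 75x - 125
assert companion([-125, 75, -15]) == [[0, 0, 125], [1, 0, -75], [0, 1, 15]]
print("companion matrices match parts b and c")
```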
2. a. Show that if B = {U₁, ..., U_h}, then B = {U₁, T(U₁), ..., T^{h-1}(U₁)}
and if q(x) is not the minimum polynomial of T, then B is linearly dependent.
b. Expand det(C(q) - λI_h) along the last column and show that the cofactor
of the element in the kth position (occupied by -a_{k-1}) is
(-1)^{k+h}(-λ)^{k-1}(1)^{h-k} when k < h, and (-1)^{h+h}(-λ)^{h-1} when k = h.
3. a. Since the minimum polynomial is (x - 2)³, 𝒱₃ = 𝒮(U₁, T) with U₁ any
vector such that (T - 2I)²U₁ ≠ 0. A possible cyclic basis is {(0, 0, 1), (-1, 1, 2),
(-6, 5, 4)}, which yields the matrix
    ( 0  0    8
      1  0  -12
      0  1    6 ).
b. The minimum polynomial is (x - 2)². For U₁ ∉ 𝒮(2), choose U₂ ∈ 𝒮(2)
such that U₂ ∉ 𝒮(U₁, T). For example, B = {(1, 0, 0), (-2, 2, -4), (1, -1, 0)}
yields the matrix
    ( 0  -4  0
      1   4  0
      0   0  2 ).
4. b. Suppose U₁, ..., U_{i-1} have been chosen for 1 < i ≤ h and 𝒱
= 𝒮(U₁, T) ⊕ ··· ⊕ 𝒮(U_{i-1}, T). Then a maximal integer rᵢ exists such that
p^{rᵢ-1}(T)W ∉ 𝒮 and p^{rᵢ}(T)W ∈ 𝒮 for some W ∈ 𝒱. Show that p^{rᵢ-1}(T)V = 0
for all V ∈ B leads to a contradiction.
5. a. diag(C(p²), C(p²), C(p²)), diag(C(p²), C(p²), C(p), C(p)), diag(C(p²), -3I₄).
b. 4h. c. diag(C(p²), C(p)). d. C(p³).
e. diag(C(p³), C(p³)), diag(C(p³), C(p²), C(p)), diag(C(p³), I₃).
6. a. (T - cI)^r U = 0 implies (T - cI)((T - cI)^{r-1}U) = 0.
7. a. There are three linearly independent characteristic vectors for the charac-
teristic value -1, therefore the (x + 1)-elementary divisors are (x + 1)²,
x + 1, x + 1.
b. There are two linearly independent characteristic vectors for -1 so the
(x + 1)-elementary divisors are (x + 1)², (x + 1)².
8. a. A diagonal block matrix with blocks including C((x + 3)³) =
    ( 0  0  -27
      1  0  -27
      0  1   -9 ).
b. diag(C((x² - 4x + 7)²), C(x² - 4x + 7)), where C((x² - 4x + 7)²) is the
4 × 4 companion matrix with last column (-49, 56, -30, 8)ᵀ and
C(x² - 4x + 7) = ( 0  -7
                   1   4 ).
c. A diagonal block matrix with blocks including C((x + 3)²) = ( 0  -9
                                                                 1  -6 ).
9.
Suppose the given matrix is the matrix of T ∈ Hom(𝒱₄) with respect to the
standard basis.
a. The minimum polynomial of T is (x + 1)²(x - 1)². Therefore
diag(C((x + 1)²), C((x - 1)²)) is the matrix of T with respect to some basis.
This matrix is similar to the given matrix since both are matrices for T.
b. The minimum polynomial of T is (x - 2)²(x + 4) and T has only one
characteristic vector for the characteristic value -4. Therefore T has only one
(x + 4)-elementary divisor and because of dimension restrictions, there are
two (x - 2)-elementary divisors, (x - 2)² and x - 2. Thus diag(C((x - 2)²),
C(x - 2), C(x + 4)) is a diagonal block matrix similar to the given matrix.
5 Page 386
1. a.–d. The matrices D₁ and D₂ are diagonal block matrices built from
companion and hypercompanion blocks; for example, C((x - 3)³) with last
column (27, -27, 9)ᵀ appears in a, D₁ = C((x² + 4)²) with last column
(-16, 0, -8, 0)ᵀ in b, and D₁ = C((x² + 2x + 2)³) with last column
(-8, -24, -36, -32, -18, -6)ᵀ in d.
2. a. No change, D₃ = D₁ and D₄ = D₂.
b. The elementary divisors are (x - 2i)², (x + 2i)², (x + 2)²; over C the
classical form is built from the blocks
    ( 2i  0 )   ( -2i    0  )   ( -2   0 )
    ( 1  2i ),  (  1   -2i ),   (  1  -2 ).
c. The elementary divisors are (x - 4i)², x - 4i, (x + 4i)², x + 4i.
The classical form over C for c is built from the blocks
    ( 4i  0 )   ( -4i    0  )
    ( 1  4i ),  (  1   -4i ),   (4i),   (-4i).
d. (x + 1 - i)³ and (x + 1 + i)³ are the elementary divisors. The classical
form over C is diag(D₃, D₄), where D₃ is the 3 × 3 block with i - 1 on the
diagonal and 1's below, and D₄ is its conjugate with -i - 1 on the diagonal.
3. b. B_C = {(1, 0, 0, 0), (0, 1, 0, 0), (-1, -2, 0, 1), (0, 1, -1, -1)}.
4. The minimum polynomial of A is x² + 9, i.e., A² + 9I₄ = 0. A classical form
over R is diag(C(x² + 9), C(x² + 9)) and a Jordan form over C is
diag(3i, 3i, -3i, -3i).
6. a. The characteristic polynomial is -(x - 2)³. Since there is only one
linearly independent characteristic vector for 2, the minimum polynomial is
(x - 2)³, and the Jordan form is
    ( 2  0  0
      1  2  0
      0  1  2 ).
b. The characteristic polynomial is -(x - 2)³, but there are two linearly
independent characteristic vectors for 2. So the minimum polynomial is
(x - 2)² and a Jordan form is
    ( 2  0  0
      1  2  0
      0  0  2 ).
c. The characteristic polynomial is -(x - 1)(x - i)(x + i) and this must
be the negative of the minimum polynomial. A Jordan form is diag(1, i, -i).
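The reasoning in 6a can be illustrated on the Jordan form itself: with J as below, (J - 2I)² ≠ 0 while (J - 2I)³ = 0, so the minimum polynomial has degree 3. A sketch:

```python
# Nilpotency index of J - 2I for the lower Jordan block with eigenvalue 2.
def matmul(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

J = [[2, 0, 0], [1, 2, 0], [0, 1, 2]]
N = [[J[i][j] - (2 if i == j else 0) for j in range(3)] for i in range(3)]
N2 = matmul(N, N)
N3 = matmul(N2, N)
assert any(x != 0 for row in N2 for x in row)   # (J - 2I)^2 != 0
assert all(x == 0 for row in N3 for x in row)   # (J - 2I)^3 = 0
print("minimum polynomial of J is (x - 2)^3")
```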
7. a. There are 11 possibilities, from (x - 2)⁶ to x - 2, x - 2, x - 2, x - 2,
x - 2, x - 2.
b. The first and last (as in part a) and (x - 2)², x - 2, x - 2, x - 2, x - 2.
8. Show by induction that
=
9. Let k denote the number of independent characteristic vectors for 2.
a. (x - 2)², (x - 2)², k = 2.
b. (x - 2)³, x - 2, k = 2.
c. (x - 2)², x - 2, x - 2, k = 3.
10. b. B_C = {(0, 0, 1, -1), (0, 1, 0, 0), (0, 0, -1, 0), (-1, -2, 0, -1)}.
c. Let p₁(x) = x - √2i and p₂(x) = x + √2i, B_C = {U₁, ..., U₄}. If
W = (0, 0, 1, -1) and U₁, U₃ are obtained from W using p₂(T) and p₁(T), then
B_C = {(0, 2√2i, -5, 4), (-1, -2, -√2i, -1), (0, -2√2i, -5, 4),
(-1, -2, √2i, -1)}.
11. b. There are five possible sets: x⁴; x³, x; x², x²; x², x, x; and x, x, x, x.
The corresponding Jordan forms are:
    ( 0 0 0 0 )  ( 0 0 0 0 )  ( 0 0 0 0 )  ( 0 0 0 0 )  ( 0 0 0 0 )
    ( 1 0 0 0 )  ( 1 0 0 0 )  ( 1 0 0 0 )  ( 1 0 0 0 )  ( 0 0 0 0 )
    ( 0 1 0 0 ), ( 0 1 0 0 ), ( 0 0 0 0 ), ( 0 0 0 0 ), ( 0 0 0 0 )
    ( 0 0 1 0 )  ( 0 0 0 0 )  ( 0 0 1 0 )  ( 0 0 0 0 )  ( 0 0 0 0 ).
12. The elementary divisors for the given matrix are:
a. x³. b. x², x. c. x², x².
13. a. If B_C = {(1, 0), (1, 1)}, the matrix of T with respect to B_C is (…).
Tₙ(a, b) = (a - b, a - b) is the map with matrix (…) with respect to B_C,
and T_d(a, b) = (4a, 4b) has (…) as its matrix with respect to B_C. Thus
T = Tₙ + T_d. Notice that this answer does not depend on the choice of basis
B_C. Will this always be true?
b. The matrix of T with respect to B_C = {(0, 0, 1), (-1, 1, -2), (0, 1, -1)}
is (…). If (…) is the matrix of Tₙ with respect to B_C, and diag(2, 2, 3)
is the matrix of T_d with respect to B_C, then T = Tₙ + T_d, and
Tₙ(a, b, c) = (a - b - c, b - a + c, 2a - 2b - 2c) and T_d(a, b, c) = (2a, a + 3b,
2c - a - b).
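The decomposition in 13b can be verified on the standard-basis matrices of Tₙ and T_d: Tₙ² = 0 and the two parts commute. A sketch:

```python
# Nilpotent part N and diagonalizable part D of 13b in the standard basis.
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k]*B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

N = [[1, -1, -1], [-1, 1, 1], [2, -2, -2]]   # T_n(a,b,c) as a matrix
D = [[2, 0, 0], [1, 3, 0], [-1, -1, 2]]      # T_d(a,b,c) as a matrix
assert matmul(N, N) == [[0, 0, 0], [0, 0, 0], [0, 0, 0]]   # T_n^2 = 0
assert matmul(N, D) == matmul(D, N)                         # the parts commute
print("T_n is nilpotent and commutes with T_d")
```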
14. There are two approaches. Either show that a matrix with all characteristic
roots zero is nilpotent, or show that if A = (a_{ij}) and A² = (b_{ij}), then a_{ij} = 0
for i ≤ j implies b_{ij} = 0 for i - 1 ≤ j, and continue by induction.
6 Page 401
1. a. P = ( 2  3
            1  2 ), P⁻¹AP = ( -2   0
                               0  -3 ), x' = -13e^{-2t}, y' = 9e^{-3t};
X = P(-13e^{-2t}, 9e^{-3t})ᵀ.
b. P = (…), P⁻¹AP = (…); x' = 4e^t, D_t y' - y' = 4e^t and y'(0) = 9 yields
y' = (4t + 9)e^t; X = P(4e^t, (4t + 9)e^t)ᵀ.
c. P = (…), P⁻¹AP = diag(1, 1, -1); X = P(5e^t, -6e^t, e^{-t})ᵀ.
d. P = ( 2  1  2
         1  1  1
         0  1  1 ), P⁻¹AP = ( 2  0  0
                              1  2  0
                              0  1  2 ), X = PX', x' = e^{2t}, y' = (t + 3)e^{2t},
z' = (t²/2 + 3t - 2)e^{2t}.
4. a. Aᵀ = C(p) with p(λ) = λⁿ + a_{n-1}λⁿ⁻¹ + ··· + a₁λ + a₀.
5. a. x = 7e^{2t} - 5e^{3t}. b. x = (6t + 7)e^{4t}. c. x = e^t + 2e^{-t} - e^{-2t}.
6. a. diag(1, 0). b. Does not exist. c. 0; the characteristic values are
1/2 and 1/3. d. Does not exist. e. (…). f. (…).
7. b. For the converse, consider diag(1/4, 3).
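For 6c and 7b: powers of a matrix whose characteristic values all have absolute value less than 1 tend to 0, while diag(1/4, 3) shows that det A < 1 alone is not enough. A sketch with stand-in diagonal matrices (the original matrices are not reproduced above):

```python
# Limits of powers of diagonal matrices, with exact arithmetic.
from fractions import Fraction as F

A = [[F(1, 2), F(0)], [F(0), F(1, 3)]]   # characteristic values 1/2 and 1/3
powers = (A[0][0]**20, A[1][1]**20)
assert all(x < F(1, 1000) for x in powers)   # A^k -> 0
B = [[F(1, 4), F(0)], [F(0), F(3)]]      # det B = 3/4 < 1
assert B[0][0]*B[1][1] < 1
assert B[1][1]**20 > 10**9               # yet B^k does not converge
print("limits behave as claimed")
```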
9. Both limits exist.
Review Problems Page 402
1. d. (…). e. No. g. (…).
3. a. (x - 2)²(x + 1)². b. Why is it also (x - 2)²(x + 1)²?
c. diag( ( 2  0 ), ( -1   0 ) ).   d. diag( ( 0  -4 ), ( 0  -1 ) ).
         ( 1  2 )  (  1  -1 )               ( 1   4 )  ( 1  -2 )
4. c. Notice that if T(V) = λV and p(x) = Σᵢ aᵢxⁱ, then p(T)V = Σᵢ aᵢλⁱV
= p(λ)V.
5. a. ( 0  0   -8        ( 3  0  0
        1  0  -12 ),       1  3  0
        0  1   -6 )        0  1  3 ).
b. ( 0  0  0  -1         ( 0  -1  0   0
     1  0  0  -2 ),        1  -1  0   0
     0  1  0  -3           0   1  0  -1
     0  0  1  -2 )         0   0  1  -1 ).
6. a. diag((…), (…)). b. diag((…), (2), (2)).
7. a. (x + 4)³, (x + 1)²; one independent vector each for -4 and -1.
b. (x + 4)³, (x + 4)², (x + 1)²; 2 for -4 and 1 for -1.
8. a. (x + 4)³, x + 4, (x + 1)², x + 1; 2 for -4 and 2 for -1.
b. (x + 4)³, (x + 1)², (x + 1)²; 1 for -4 and 2 for -1.
c. (x + 4)³, (x + 1)², x + 1, x + 1; 1 for -4 and 3 for -1.
o -4 ; o 2i o o ; o
1 o! 12iO!
rny-:.:.:ir' o 1 2i :
1 o b. ; y o
-ri o:.:.:.4 i 1 -2i o
o 1 o o o 1 -2i
3 o o! o
3 o o; o
9. a. (1
1 3 o i
b. 9. ... L ... .. .L......... ,
:3 o
o l, 1 3
1 3 o!
o 1 3 j
:r--o
o i o 3
11. a. subdiag(1, 1, 0, 0, 0). b. subdiag(1, 0, 1, 0, 1, 0).
c. subdiag(1, 1, 0, 1, 0, 1) or subdiag(1, 1, 0, 1, 1, 0).
12. a. (…).
13. Consider problem 8, page 387.
Appendix

1 Page 411
1. a. -1. b. 0. c. -1. d. 1.
2. a. 1. b. 0. c. -1. d. 0.
7. Yes.

2 Page 417
1. The even permutations in 𝒫₃ are 1, 2, 3; 3, 1, 2; and 2, 3, 1. The odd permuta-
tions are 1, 3, 2; 3, 2, 1; and 2, 1, 3.
2. a. 3, 4, 2, 1, 5. b. 4, 1, 3, 5, 2. c. σ. d. 2, 3, 5, 4, 1.
3. 1, σ, σ².
6. a. (x₁ - x₂)(x₁ - x₃)(x₁ - x₄)(x₂ - x₃)(x₂ - x₄)(x₃ - x₄).
b. For σ: (x₄ - x₂)(x₄ - x₁)(x₄ - x₃)(x₂ - x₁)(x₂ - x₃)(x₁ - x₃)
= (-1)⁴P(x₁, x₂, x₃, x₄), so sgn σ = 1. Sgn τ = (-1)³ = -1.
7. n!
8. b. If σ is odd, τ∘σ is even. Set τ∘σ = μ; then σ = τ⁻¹∘μ = τ∘μ.
9. a. 10; +1. b. 15; -1. c. 6; +1. d. 9; -1.
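Problem 1 can be checked by counting inversions, since sgn σ = (-1)^(number of inversions); a sketch:

```python
# Sign of a permutation via its inversion count.
def sgn(p):
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
              if p[i] > p[j])
    return 1 if inv % 2 == 0 else -1

perms = [(1, 2, 3), (3, 1, 2), (2, 3, 1), (1, 3, 2), (3, 2, 1), (2, 1, 3)]
evens = [p for p in perms if sgn(p) == 1]
assert evens == [(1, 2, 3), (3, 1, 2), (2, 3, 1)]          # the three listed
assert sgn((1, 3, 2)) == sgn((3, 2, 1)) == sgn((2, 1, 3)) == -1
print("parities match problem 1")
```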
3 Page 426
4. If AZ = Iₙ and Z = (Z₁, ..., Zₙ), then AZⱼ = Eⱼ for 1 ≤ j ≤ n.
7. b. 3√61. c. ½||(A - B) × (C - B)||. Would ½||(A - B) × (B - C)||
also give the area? d. √11/2.
8. The only property it satisfies is closure.
9. a. For U, V, W ∈ 𝒱₄, let U × V × W be the vector satisfying (U × V × W) ∘ Y
= det(Uᵀ, Vᵀ, Wᵀ, Yᵀ) for all Y ∈ 𝒱₄. Such a vector exists since the map
Y → det(Uᵀ, Vᵀ, Wᵀ, Yᵀ) is linear. The value of U × V × W may be obtained
by computing (U × V × W) ∘ Eᵢ for 1 ≤ i ≤ 4.
c. (0, 0, 0, 1). d. (3, 4, -6, -6).
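The definition in 9a can be implemented directly from the determinant formula; a sketch (`det4` and `cross4` are hypothetical helpers; the sample vectors are arbitrary, and the E₁, E₂, E₃ case reproduces answer c):

```python
# 4-dimensional cross product: component i of U x V x W is det(U, V, W, E_i).
def det4(rows):
    # Laplace expansion along the first row; fine for small integer matrices
    if len(rows) == 1:
        return rows[0][0]
    return sum((-1)**j * rows[0][j] *
               det4([r[:j] + r[j+1:] for r in rows[1:]]) for j in range(len(rows)))

def cross4(U, V, W):
    E = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
    return [det4([U, V, W, E[i]]) for i in range(4)]

assert cross4([1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]) == [0, 0, 0, 1]  # 9c
U, V, W = [1, 0, 0, 2], [0, 1, 3, 0], [2, 1, 0, 1]
X = cross4(U, V, W)
for Z in (U, V, W):                 # X is orthogonal to U, V, and W
    assert sum(x*z for x, z in zip(X, Z)) == 0
print(X)
```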
Symbol Index†

Vector Spaces
ℰ(C) 278
ℰ₂ 13
ℰₙ 24
ℱ 34
ℱ(C) 279
Hom(𝒱, 𝒲), Hom(𝒱) 146
ℐ_T 138
ℒ(S) 44
ℳ_{n×m} 80
ℳ_{n×m}(F) 273
𝒩_T 140
R[t] 33
Rₙ[t] 40
𝒮 ∩ 𝒯 43(10)
𝒮 + 𝒯 65
𝒮 ⊕ 𝒯 67
𝒮₁ ⊕ ··· ⊕ 𝒮ₖ 342
𝒮(U, T) 236, 345
𝒮(λ) 235
𝒮⊥ 259
𝒱₂ 5
𝒱ₙ 22
𝒱ₙ(F) 272

Matrix Notation
Adj A 183
Aᵢⱼ 111
Aᵀ 99
A⁻¹ 181
A* 76
C(p) 368
Cₕ(p) 383
det A or |A| 111
D(A) 408
diag(d₁, ..., dₙ) 223
diag(A₁, ..., Aₛ) 344
e^A 393
Iₙ 181
tr A 232

Maps
I or 1_𝒱 170
T: 𝒱 → 𝒲 130
T[S] 137
T⁻¹ 170
T⁻¹[S] 145(7)
T ∘ S 167
T_𝒮 144

Vectors
||U|| 13, 247
U × V 110, 423
U ∘ V 13, 24
⟨U, V⟩ 242
(U, V) 246(3)
U: (a₁, ..., aₙ)_B 59

Miscellaneous
[a] 215
𝒜ₙ 416
a ~ b 214
b(x)|a(x) 350
C 3
δᵢⱼ 250
dim 𝒱 54
E² 8
{Eᵢ} 55
F[x] 348
g.c.d. 355
l.c.m. 358(6)
𝒫ₙ 412
Q 271
R 2
sgn σ 415
Z 221(3)
Zₚ 272
z̄ 277
1 - 1 132
A ⊂ B 37
𝒱 ≅ 𝒲 62
{ ... | ... } 3
(⇒) 21

† Numbers in parentheses refer to problem numbers.
Subject Index†

A
Adjoint matrix, 183
Alternating function, 407
Alternating subgroup, 416
Angle, 20, 249
Associative laws, 2, 6, 32
Augmented matrix, 76

B
Basis, 51
canonical, 382
change of, 200
cyclic, 371
extension to a, 56
orthonormal, 250
standard, 55
Bilinear form, 294
Bilinear function, 243
Binary operation, 3

C
Canonical forms, 218
classical, 384
echelon, 88
Jordan, 384
rational, 379
Cayley-Hamilton theorem, 380
Center (geometric), 335, 339(6)
Central quadric surface, 337
Characteristic equation, 227, 233
Characteristic polynomial, 227, 233
Characteristic root, 227, 233
Characteristic space, 342
Characteristic value, 224, 233
Characteristic vector, 224, 233
Chord, 330
Closure under an operation, 3, 39
Codomain, 130
Coefficient matrix, 76
Cofactor, 111, 421
Column-equivalence, 125
Column operations, 117
Column rank, 124
Column space, 124
Column vector, 81
Commutative laws, 2, 6, 32
Companion matrix, 368
Complete inverse image, 145(7)
Components, 5, 23
Composition of maps, 167
Cone, 323
asymptotic, 329(8)
Congruent figures, 310
† Numbers in parentheses refer to problem numbers.
Congruent matrices, 298
Conic, 312
Conic sections, 329(5)
Conjugate of a complex number, 277
Consistent equations, 79
Contrapositive, 67
Converse, 21
Coordinates, 59
Cramer's rule, 185, 423
Cross product, 110, 423
Cylinder, 322
D
Decomposition of a vector space
direct sum, 67, 342
primary, 363
secondary, 370
Degree of freedom, 102
Degree of a polynomial, 350
Determinant, 107, 111
expansion by cofactors, 113, 421
of a product, 193, 422
of a transposed matrix, 115(13)
Determinant function, 408, 416, 419
Diagonalizable map, 234
Diagonalizable matrix, 223
Diametral plane, 331
Differential equations, 389, 401(4)
Dimension, 54
Direction cosines, 29(14)
Direction numbers, 11, 24
Directrix, 322
Direct sum, 67, 342
Discriminant, 314
Distributive laws, 3, 32, 168, 180
Division algorithm for polynomials, 352
Divisor, 350
elementary, 375, 378
Domain, 130
Dot product, 13, 24
Echelon form, 88
Eigenspace, 342
Eigenvector, 224
E
Elementary divisors, 375, 378
Elementary matrices, 187, 188
Ellipse, 314
Ellipsoid, 323
Equivalence class, 215
Equivalence relation, 214
Equivalent matrices, 125, 210
Equivalent systems of equations, 83
Euclidean space, 242
Euclid's algorithm, 358(8)
Euler's theorem, 270(11)
Extension
to a basis, 56
of a map, 144
Exterior product, 427(10)
Factor theorem, 353

F
Fields
abstract, 271
algebraically closed, 353
complex number, 271
finite, 272
of integers modulo a prime, 272
ordered, 276(15)
rational number, 271
real number, 3
Finite dimension, 54
Fourier series, 254
Functions, 130
addition of, 34
alternating, 407
bilinear, 243
determinant, 408, 416, 419
equality of, 35
linear, see Linear transformations
multilinear, 407
quadratic, 292
scalar multiple of, 35
skew-symmetric, 408
Fundamental theorem of algebra, 353
G
Gaussian elimination, 95
Generator of a cylinder, 322
Generators of a vector space, 46
Gram-Schmidt process, 257
Greatest common divisor, 355
Group, 174
alternating, 416
commutative, 174
of nonsingular maps, 174
permutation, 412
H
Hermitian inner product, 278
Homogeneous linear equations, 79
Homogeneous polynomial, 291
Homomorphism, 146
Hyperbola, 314
Hyperboloid, 323
Hypercompanion matrix, 383
Hyperplane, 27
I
Ideal, 354
Identity
additive, 2, 6
multiplicative, 2
Identity map, 170
Identity matrix, 181
Image
of a set, 137
of a vector, 130
Image space, 138
Inconsistent equations, 79
Index of nilpotency, 386
Induction
first principle of, 437(16)
second principle of, 351
Infinite dimension, 54
Inner product
bilinearity of, 243
complex, 278
hermitian, 278
positive definiteness of, 242
real, 242
symmetry of, 242
Integers modulo a prime, 272
Integral domain, 350
of polynomials, 349
Intersection, 43(10)
Invariants, 220
complete set of, 220
Invariant subspace, 235
Inverse
additive, 2, 6, 32
adjoint form for, 183
computation with row operations, 189
of a map, 170
of a matrix, 181
multiplicative, 2
Irreducible polynomial, 356
Isometry, 305
Isomorphism
inner product space, 263
vector space, 62, 133
J
Jordan form, 384

K
Kernel, 140
Kronecker delta, 250
L
Law of cosines, 249
Leading coefficient, 350
Least common multiple, 358(6)
Left-hand rule, 306
Length, 13
Level curve, 323
Line
direction cosines for, 29(14)
direction numbers for, 11, 24
symmetric equations for, 25
vector equation for, 10, 24
Linear combination, 44
Linear dependence, 47, 48
Linear equations
consistent, 79
criterion for solution of, 95
equivalent, 83
Gaussian elimination in, 95
homogeneous, 79
inconsistent, 79
independent, 104
nontrivial solution of, 79
operations on, 84
solution of, 78
solution set of, 82(14), 101
solution space of, 81
Linear form, 297(13)
Linear functional, 286(6)
Linear independence, 48
Linear transformations, 130
addition of, 146
adjoint of, 287(8)
composition of, 167
diagonalizable, 234
extension of, 144
group of nonsingular, 174
hermitian, 280
idempotent, 367(12)
identity, 170
image space of, 138
inverse of, 170
invertible, 170
matrix of, 157
nilpotent, 386
nonsingular, 173
null space of, 140
one to one, 132
onto, 132
orientation preserving, 307, 308
orthogonal, 261
projection, 133, 367(12)
rank of, 142
restriction of, 144
ring of, 169
rotation, 135, 267, 268
scalar multiplication of, 146
singular, 173
symmetric, 286(12)
vector space of, 147
zero, 148
M
Main diagonal, 118
Map, 130; see also Linear transformation
translation, 270(8)
Markov chain, 395
Matrices, 76
addition of, 80
adjoint, 183
augmented, 76
of a bilinear form, 294
coefficient, 76
companion, 368
congruence of, 298
determinant of, 111
diagonal, 223
diagonal block, 344
diagonalizable, 223
echelon form of, 88
elementary, 187, 188
equality of, 77
equivalence of, 125, 210
hermitian, 281
hypercompanion, 383
identity, 181
inverse, 181
Jordan, 361
of a linear transformation, 157
main diagonal of, 118
multiplication of, 77, 177
nilpotent, 386
nonsingular, 181
orthogonal, 264
of a quadratic form, 291
of a quadratic function, 293
rank of, 90
scalar multiplication of, 80
series of, 392
similarity of, 211
singular, 181
stochastic, 396
subdiagonal, 386
subdiagonal of, 369
symmetric, 282
trace of, 232
transition, 200
transpose of, 99
triangular, 118
vector space of, 80
Matrix equation, 78
Matrix multiplication, 77, 177
Midpoint, 17
Minor, 111
Monic polynomial, 350
Multilinear function, 407
Multiplicity of a factor, 357
N
Nonsingular map, 173
Nonsingular matrix, 181
Nontrivial solution, 79
Norm, 13, 247
Normal to a plane, 26
n-tuple, 22
Nullity, 142
Null space, 140
Numerical analysis, 397
O
Orientation, 305
Orientation preserving map, 307, 308
Orthogonal complement, 259
Orthogonal map, 261
Orthogonal matrix, 264
Orthogonal set, 252
Orthogonal vectors, 20, 249
Orthonormal basis, 250
Orthonormal set, 252
P
Parabola, 314
Paraboloid, 323
Parallelepiped, volume in 3-space, 111
Parallelogram
area in plane, 108
area in 3-space, 427(7)
Parallelogram law, 286(5)
Parallelogram rule for addition of vectors, 14
Parameter, 10
Partition of a set, 218
Permutation, 412
even and odd, 414
Perpendicular vectors, 20
Pivotal condensation, 120(6)
Plane, 103
normal to, 26
Polynomials, 33, 348
characteristic, 227, 233
degree of, 350
equality of, 348
factors of, 350
greatest common divisor of, 355
homogeneous, 291
irreducible, 356
leading coefficient of, 350
least common multiple of, 358(6)
minimum, 361
monic, 350
relatively prime, 356
roots of, 353
skew-symmetric, 408
T-minimum, 360
zero, 348
Positive definite inner product, 242
Preimage, 130
Principal axis theorem, 313
Principal ideal domain, 354
Projection, 133, 367(12)
orthogonal, 256
Proper subspace, 38
Pythagorean theorem, 250
Q
Quadratic form, 291
Quadratic function, 292
symmetric, 293
Quadric surface, 312, 321
Quaternionic multiplication, 198(18)
Quotient, 352
R
Range, 130
Rank
column, 124
determinant, 121
of a map, 142
matrix, 90
row, 91
Reflection, 262, 268
Regular point, 338(6)
Relation, 214
Relatively prime polynomials, 356
Remainder, 352
Remainder theorem, 353
Restriction map, 144
Right-hand rule, 306
Rigid motion, 305, 310
Ring, 169
ideal in, 354
with identity, 170
of maps, 169
of polynomials, 349
Root, 353
characteristic, 227, 233
Rotation, 135, 267, 268
Row-equivalent matrices, 88
Row operations, 85
Row reduction, 95
Row space, 90
Row vector, 81
Ruled surface, 333
Ruling, 332
S
Scalar, 5, 272
Scalar multiplication, 5, 22, 32
of maps, 146
of matrices, 80
Scalar product, 242
Scalar triple product, 116(15)
Schwartz inequality, 247
Section of a surface, 323
Series of matrices, 392
Sets
equality of, 38
intersection of, 43(10)
orthogonal, 252
orthonormal, 252
partition of, 218
subsets of, 27
union of, 43(10)
well ordered, 351
Sign or signum, 415
Signature, 302
Similar matrices, 211
Singular map, 173
Singular matrix, 181
Skew-symmetric function, 408
Solution of a system of linear equations, 78
Span, 44, 46
Standard basis, 51, 55
Standard position for a graph, 312
Subdiagonal, 369
Submatrix, 121
Subset, 37
Subspaces, 37
direct sum of, 67, 342
invariant, 235
nontrivial, 38
proper, 38
sum of, 43(10)
T-cyclic, 345
trivial, 38
zero, 38
Sum
direct, 67, 342
of maps, 146
of matrices, 80
of subspaces, 43(10)
Summand, 65
Symmetric equations for a line, 25
Symmetric functions of characteristic roots, 232
Symmetric matrix, 282
Systems of linear equations, see Linear
equations
T
T-cyclic subspace, 345
T-minimum polynomial of U, 360
Trace
of a matrix, 232
of a surface, 323
Transformation, 130; see also Linear transformations
Transition matrix, 200
Translation, 270(8)
Transpose, 99
Transposition, 413
Triangle inequality, 248
Trivial solution, 79
Trivial subspace, 38
U
Unique factorization theorem for polynomials, 357
Unit vector, 29(13), 247
Unitary space, 278
V
Vectors
addition of, 5, 22, 32
angle between, 20, 249
characteristic, 224, 233
column, 81
components of, 5, 23
coordinates of, 59
linear dependence of, 48
norm of, 13, 247
orthogonal, 20, 249
scalar multiplication of, 5, 22, 32
scalar triple product of, 116(15)
T-minimum polynomial of, 360
unit, 29(13), 247
zero, 6, 32
Vector spaces, 32, 272
over an arbitrary field, 272
basis for, 51
dimension of, 54
direct sum of, 67, 342
generators of, 46
intersection of, 43(10)
isomorphic, 62, 133
orthonormal basis for, 250
real, 32
subspace of, 37
sum of, 43(10)
zero, 35
W
Well ordered set, 351
Z
Zero map, 148
Zero subspace, 38
Zero vector, 6, 32
Zero vector space, 35