
Linear Algebra (part 2) : Vector Spaces and Linear Transformations

(by Evan Dummit, 2012, v. 1.00)

Contents

1 Vector Spaces and Linear Transformations
  1.1 Review of Vectors in R^n
  1.2 Formal Definition of Vector Spaces
  1.3 Subspaces
  1.4 Span, Independence, Bases, Dimension
    1.4.1 Linear Combinations and Span
    1.4.2 Linear Independence
    1.4.3 Bases and Dimension
  1.5 Linear Transformations
    1.5.1 Kernel and Image
    1.5.2 The Derivative as a Linear Transformation

1 Vector Spaces and Linear Transformations

1.1 Review of Vectors in R^n

A vector, as we typically think of it, is a quantity which has both a magnitude and a direction. This is in
contrast to a scalar, which carries only a magnitude.

Real-valued vectors are extremely useful in just about every aspect of the physical sciences, since just
about everything in Newtonian physics is a vector: position, velocity, acceleration, forces, etc. There
is also vector calculus (namely, calculus in the context of vector fields), which is typically part of a
multivariable calculus course; it has many applications to physics as well.

We often think of vectors geometrically, as directed line segments (having a starting point and an endpoint).

We denote the n-dimensional vector from the origin to the point (a1, a2, ..., an) as ~v = ⟨a1, a2, ..., an⟩, where the ai are scalars.

Some vectors: ⟨1, 2⟩, ⟨3, 5, −1⟩, and ⟨−4/3, e^2, 27, 3, π, 0, 0, −1⟩.

Notation: I prefer to use angle brackets ⟨ ⟩ rather than parentheses ( ) so as to draw a visual distinction
between a vector and the coordinates of a point in space. I also draw arrows above vectors or typeset
them in boldface (thus ~v or v), in order to set them apart from scalars. This is not standard notation
everywhere; many other authors use regular parentheses for vectors.

Note/Warning: Vectors are a little bit different from directed line segments, because we don't care where a
vector starts: we only care about the difference between the starting and ending positions. Thus the directed
segment whose start is (0, 0) and end is (1, 1) and the segment starting at (1, 1) and ending at (2, 2) represent
the same vector, ⟨1, 1⟩.

We can add vectors (provided they are of the same length!) in the obvious way, one component at a time: if
~v = ⟨a1, ..., an⟩ and ~w = ⟨b1, ..., bn⟩ then ~v + ~w = ⟨a1 + b1, ..., an + bn⟩.

We can justify this using our geometric idea of what a vector does: ~v moves us from the origin to the point
(a1, ..., an). Then ~w tells us to add ⟨b1, ..., bn⟩ to the coordinates of our current position, and so ~w
moves us from (a1, ..., an) to (a1 + b1, ..., an + bn). So the net result is that the sum vector ~v + ~w moves
us from the origin to (a1 + b1, ..., an + bn), meaning that it is just the vector ⟨a1 + b1, ..., an + bn⟩.

Another way (though it's really the same way) to think of vector addition is via the parallelogram
diagram, whose pairs of parallel sides are ~v and ~w, and whose long diagonal is ~v + ~w.

We can also 'scale' a vector by a scalar, one component at a time: if r is a scalar, then we have
r · ~v = ⟨r a1, ..., r an⟩.

Again, we can justify this by our geometric idea of what a vector does: if ~v moves us some amount in a
direction, then (1/2) ~v should move us half as far in that direction. Analogously, 2~v should move us twice as
far in that direction, and −~v should move us exactly as far, but in the opposite direction.

Example: If ~v = ⟨−1, 2, 2⟩ and ~w = ⟨3, 0, −4⟩ then 2~w = ⟨6, 0, −8⟩, and ~v + ~w = ⟨2, 2, −2⟩. Furthermore, ~v − 2~w = ⟨−7, 2, 10⟩.
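Aside (computational sketch): the componentwise arithmetic above is easy to check by machine. Assuming Python with numpy is available, the following snippet simply reproduces the example; it is an illustration, not part of the original handout.

    import numpy as np

    v = np.array([-1, 2, 2])
    w = np.array([3, 0, -4])

    print(2 * w)      # componentwise scaling: [ 6  0 -8]
    print(v + w)      # componentwise addition: [ 2  2 -2]
    print(v - 2 * w)  # combination of both:    [-7  2 10]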

The arithmetic of vectors in R^n satisfies several algebraic properties (which follow more or less directly from
the definition):

Addition of vectors is commutative and associative.

There is a zero vector (namely, the vector with all entries zero), and every vector has an additive inverse.

Scalar multiplication distributes over addition of both vectors and scalars.

1.2 Formal Definition of Vector Spaces

The two operations of addition and scalar multiplication (and the various algebraic properties they satisfy)
are the key properties of vectors in R^n. We would like to investigate other collections of things which possess
the same properties as vectors in R^n.

Definition: A (real) vector space is a collection V of vectors together with two binary operations, addition of
vectors (+) and scalar multiplication of a vector by a real number (·), satisfying the following axioms. Let
~v, ~v1, ~v2, ~v3 be any vectors and α, α1, α2 be any (real number) scalars.

Note: The statement that + and · are binary operations means that ~v1 + ~v2 and α · ~v are always defined.

[A1] Addition is commutative: ~v1 + ~v2 = ~v2 + ~v1.
[A2] Addition is associative: (~v1 + ~v2) + ~v3 = ~v1 + (~v2 + ~v3).
[A3] There exists a zero vector ~0, with ~v + ~0 = ~v.
[A4] Every vector ~v has an additive inverse −~v, with ~v + (−~v) = ~0.
[M1] Scalar multiplication is consistent with regular multiplication: α1 · (α2 · ~v) = (α1 α2) · ~v.
[M2] Addition of scalars distributes: (α1 + α2) · ~v = α1 · ~v + α2 · ~v.
[M3] Addition of vectors distributes: α · (~v1 + ~v2) = α · ~v1 + α · ~v2.
[M4] The scalar 1 acts like the identity on vectors: 1 · ~v = ~v.

Important Remark: One may also consider vector spaces where the collection of scalars is something other
than the real numbers. For example, there is an equally important notion of a complex vector space,
whose scalars are the complex numbers. (The axioms are the same.)

We will principally consider real vector spaces, in which the scalars are the real numbers.

The most general notion of a vector space involves scalars from a field, which is a collection of numbers
possessing addition and multiplication operations which are commutative, associative, and distributive, with
an additive identity 0 and a multiplicative identity 1, such that every element has an additive inverse and
every nonzero element has a multiplicative inverse.

Aside from the real and complex numbers, another example of a field is the rational numbers (i.e.,
fractions). One can formulate an equally interesting theory of vector spaces over the rational numbers.

Examples: Here are some examples of vector spaces:

The vectors in R^n are a vector space, for any n > 0. (This had better be true!)

In particular, if we take n = 1, then we see that the real numbers themselves are a vector space.

Note: For simplicity I will demonstrate all of the axioms for vectors in R^2; there, the vectors are of
the form ⟨x, y⟩ and scalar multiplication is defined as α · ⟨x, y⟩ = ⟨αx, αy⟩.

[A1]: We have ⟨x1, y1⟩ + ⟨x2, y2⟩ = ⟨x1 + x2, y1 + y2⟩ = ⟨x2, y2⟩ + ⟨x1, y1⟩.
[A2]: We have (⟨x1, y1⟩ + ⟨x2, y2⟩) + ⟨x3, y3⟩ = ⟨x1 + x2 + x3, y1 + y2 + y3⟩ = ⟨x1, y1⟩ + (⟨x2, y2⟩ + ⟨x3, y3⟩).
[A3]: The zero vector is ⟨0, 0⟩, and clearly ⟨x, y⟩ + ⟨0, 0⟩ = ⟨x, y⟩.
[A4]: The additive inverse of ⟨x, y⟩ is ⟨−x, −y⟩, since ⟨x, y⟩ + ⟨−x, −y⟩ = ⟨0, 0⟩.
[M1]: We have α1 · (α2 · ⟨x, y⟩) = ⟨α1 α2 x, α1 α2 y⟩ = (α1 α2) · ⟨x, y⟩.
[M2]: We have (α1 + α2) · ⟨x, y⟩ = ⟨(α1 + α2) x, (α1 + α2) y⟩ = α1 · ⟨x, y⟩ + α2 · ⟨x, y⟩.
[M3]: We have α · (⟨x1, y1⟩ + ⟨x2, y2⟩) = ⟨α(x1 + x2), α(y1 + y2)⟩ = α · ⟨x1, y1⟩ + α · ⟨x2, y2⟩.
[M4]: Finally, we have 1 · ⟨x, y⟩ = ⟨x, y⟩.

The zero space with a single element ~0, with ~0 + ~0 = ~0 and α · ~0 = ~0 for every α, is a vector space.

All of the axioms in this case eventually boil down to ~0 = ~0. This space is rather boring: since it only
contains one element, there's really not much to say about it.

The set of m × n matrices, for any m and any n, forms a vector space.

The various algebraic properties we know about matrix addition give [A1] and [A2] along with [M1],
[M2], [M3], and [M4].

The zero vector in this vector space is the zero matrix (with all entries zero), and [A3] and [A4]
follow easily.

Note of course that in some cases we can also multiply matrices by other matrices. However, the
requirements for being a vector space don't care that we can multiply matrices by other matrices!
(All we need to be able to do is add them and multiply them by scalars.)

The complex numbers a + bi, where i^2 = −1, are a vector space.

The axioms all follow from the standard properties of complex numbers. As might be expected, the
zero vector is just the complex number 0 = 0 + 0i.

Again, note that the complex numbers have more structure to them, because we can also multiply
two complex numbers, and the multiplication is also commutative, associative, and distributive over
addition. However, the requirements for being a vector space don't care that the complex numbers
have these additional properties.

The collection of all real-valued functions on any part of the real line is a vector space, where we define
the sum of two functions as (f + g)(x) = f(x) + g(x) for every x, and scalar multiplication as (α f)(x) = α f(x).

To illustrate: if f(x) = x and g(x) = x^2, then f + g is the function with (f + g)(x) = x + x^2, and 2f is
the function with (2f)(x) = 2x.

The axioms follow from the properties of functions and real numbers. The zero vector in this space
is the zero function; namely, the function z which has z(x) = 0 for every x.

For example (just to demonstrate a few of the axioms), for any value of x in [a, b] and any functions f
and g, we have

[A1]: (f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x).
[M3]: (α · (f + g))(x) = α f(x) + α g(x) = (α f)(x) + (α g)(x).
[M4]: (1 · f)(x) = f(x).
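Aside (computational sketch): the two function-space operations can be written out in a few lines of Python. The helper names add_funcs and scale_func below are purely illustrative choices, not standard library functions.

    def add_funcs(f, g):
        # pointwise sum: (f + g)(x) = f(x) + g(x)
        return lambda x: f(x) + g(x)

    def scale_func(alpha, f):
        # pointwise scaling: (alpha * f)(x) = alpha * f(x)
        return lambda x: alpha * f(x)

    f = lambda x: x       # f(x) = x
    g = lambda x: x**2    # g(x) = x^2
    h = add_funcs(f, g)   # h(x) = x + x^2

    print(h(3))                  # 12
    print(scale_func(2, f)(3))   # 6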

There are many simple algebraic properties that can be derived from the axioms (and thus are true in every
vector space), using some amount of cleverness. For example:

1. Addition has a cancellation law: for any vector ~v, if ~v + ~a = ~v + ~b then ~a = ~b.

Idea: Add −~v to both sides and then use [A1]-[A4] to rearrange (~v + ~a) + (−~v) = (~v + ~b) + (−~v) to ~a = ~b.

2. The zero vector is unique: for any vector ~v, if ~v + ~a = ~v, then ~a = ~0.

Idea: Use property (1) applied when ~b = ~0.

3. The additive inverse is unique: for any vector ~v, if ~v + ~a = ~0, then ~a = −~v.

Idea: Use property (1) applied when ~b = −~v.

4. The scalar 0 times any vector gives the zero vector: 0 · ~v = ~0 for any vector ~v.

Idea: Expand ~v = (1 + 0) · ~v = ~v + 0 · ~v via [M2] and then apply property (2).

5. Any scalar times the zero vector is the zero vector: α · ~0 = ~0 for any scalar α.

Idea: Expand α · ~0 = α · (~0 + ~0) = α · ~0 + α · ~0 via [M3] and then apply property (2).

6. The scalar −1 times any vector gives the additive inverse: (−1) · ~v = −~v for any vector ~v.

Idea: Use [M2] and [M4] to write ~0 = 0 · ~v = (1 + (−1)) · ~v = ~v + (−1) · ~v, and then use property (3) with ~a = (−1) · ~v.

7. The additive inverse of the additive inverse is the original vector: −(−~v) = ~v for any vector ~v.

Idea: Use property (6) and [M1], [M4] to write −(−~v) = (−1) · (−~v) = (−1) · ((−1) · ~v) = ((−1)(−1)) · ~v = 1 · ~v = ~v.

1.3 Subspaces

Definition: A subspace W of a vector space V is a subset of the vector space V which, under the same addition
and scalar multiplication operations as V, is itself a vector space.

Very often, if we want to check that something is a vector space, it is much easier to verify that it
is a subspace of something else we already know is a vector space.

We will make use of this idea when we talk about the solutions to a homogeneous linear differential
equation (see the examples below), and prove that the solutions form a vector space merely by
checking that they are a subspace of the set of all functions, rather than going through all of the
axioms.

We are aided by the following criterion, which tells us exactly what properties a subspace must satisfy:

Theorem (Subspace Criterion): To check that W is a subspace of V, it is enough to check the following three
properties:

[S1] W contains the zero vector of V.
[S2] W is closed under addition: For any ~w1 and ~w2 in W, the vector ~w1 + ~w2 is also in W.
[S3] W is closed under scalar multiplication: For any scalar α and ~w in W, the vector α · ~w is also in W.

The reason we don't need to check everything to verify that a collection of vectors forms a subspace is that
most of the axioms will automatically be satisfied in W because they're true in V.

As long as all of the operations are defined, axioms [A1]-[A2] and [M1]-[M4] will hold in W because they
hold in V. But we need to make sure we can always add and scalar-multiply, which is why we need [S2]
and [S3].

In order to get axiom [A3] for W, we need to know that the zero vector is in W, which is why we need
[S1].

In order to get axiom [A4] for W, we can use the result that (−1) · ~w = −~w, to see that closure under
scalar multiplication automatically gives additive inverses.

Remark: Any vector space V automatically has two easy subspaces: the entire space V, and the trivial
subspace consisting only of the zero vector.

Examples: Here is a rather long list of examples of less trivial subspaces (of vector spaces which are of interest
to us):

The vectors of the form ⟨t, t, t⟩ are a subspace of R^3. [This is the line x = y = z.]

[S1]: The zero vector is of this form: take t = 0.
[S2]: We have ⟨t1, t1, t1⟩ + ⟨t2, t2, t2⟩ = ⟨t1 + t2, t1 + t2, t1 + t2⟩, which is again of the same form if
we take t = t1 + t2.
[S3]: We have α · ⟨t1, t1, t1⟩ = ⟨αt1, αt1, αt1⟩, which is again of the same form if we take t = αt1.

The vectors of the form ⟨s, t, 0⟩ are a subspace of R^3. [This is the xy-plane, aka the plane z = 0.]

[S1]: The zero vector is of this form: take s = t = 0.
[S2]: We have ⟨s1, t1, 0⟩ + ⟨s2, t2, 0⟩ = ⟨s1 + s2, t1 + t2, 0⟩, which is again of the same form, if we take
s = s1 + s2 and t = t1 + t2.
[S3]: We have α · ⟨s1, t1, 0⟩ = ⟨αs1, αt1, 0⟩, which is again of the same form, if we take s = αs1 and
t = αt1.

The vectors ⟨x, y, z⟩ with 2x − y + z = 0 are a subspace of R^3.

[S1]: The zero vector is of this form, since 2(0) − 0 + 0 = 0.
[S2]: If ⟨x1, y1, z1⟩ and ⟨x2, y2, z2⟩ have 2x1 − y1 + z1 = 0 and 2x2 − y2 + z2 = 0, then adding the
equations shows that the sum ⟨x1 + x2, y1 + y2, z1 + z2⟩ also lies in the space.
[S3]: If ⟨x1, y1, z1⟩ has 2x1 − y1 + z1 = 0, then scaling the equation by α shows that ⟨αx1, αy1, αz1⟩
also lies in the space.

More generally, the collection of solution vectors ⟨x1, ..., xn⟩ to any homogeneous equation, or system
of homogeneous equations, forms a subspace of R^n.

It is possible to check this directly by working with equations. But it is much easier to use matrices:
write the system in matrix form, as A~x = ~0, where ~x = ⟨x1, ..., xn⟩ is a solution vector.

[S1]: We have A~0 = ~0, by the properties of the zero vector.
[S2]: If ~x and ~y are two solutions, the properties of matrix arithmetic imply A(~x + ~y) = A~x + A~y =
~0 + ~0 = ~0, so that ~x + ~y is also a solution.
[S3]: If α is a scalar and ~x is a solution, then A(α · ~x) = α(A~x) = α · ~0 = ~0, so that α · ~x is also a
solution.
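Aside (computational sketch): the same closure properties can be observed numerically. Assuming numpy and scipy are available (the matrix below is an arbitrary example, not one from the handout):

    import numpy as np
    from scipy.linalg import null_space

    # coefficient matrix of a homogeneous system A x = 0
    A = np.array([[1.0, 2.0, -1.0, 0.0],
                  [0.0, 1.0,  1.0, -1.0]])

    N = null_space(A)          # columns form a basis of the solution space
    x, y = N[:, 0], N[:, 1]    # two particular solutions

    # sums and scalar multiples of solutions are again solutions
    print(np.allclose(A @ (x + y), 0))   # True
    print(np.allclose(A @ (3 * x), 0))   # True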

The collection of 2 × 2 matrices of the form [ a  b ; 0  a ] is a subspace of the space of all 2 × 2 matrices.

[S1]: The zero matrix is of this form, with a = b = 0.
[S2]: We have [ a1  b1 ; 0  a1 ] + [ a2  b2 ; 0  a2 ] = [ a1 + a2  b1 + b2 ; 0  a1 + a2 ], which is also of this form.
[S3]: We have α · [ a1  b1 ; 0  a1 ] = [ αa1  αb1 ; 0  αa1 ], which is also of this form.

The collection of complex numbers of the form a + 2ai is a subspace of the complex numbers.

The three requirements should be second nature by now!

The collection of continuous functions on [a, b] is a subspace of the space of all functions on [a, b].

[S1]: The zero function is continuous.
[S2]: The sum of two continuous functions is continuous, from basic calculus.
[S3]: The product of continuous functions is continuous, so in particular a constant times a continuous
function is continuous.

The collection of n-times differentiable functions on [a, b] is a subspace of the space of continuous functions
on [a, b], for any positive integer n.

The zero function is differentiable, as are the sum and product of any two functions which are
differentiable n times.

The collection of all polynomials is a vector space.

Observe that polynomials are functions on the entire real line. Therefore, it is sufficient to verify the
subspace criteria.

The zero function is a polynomial, as is the sum of two polynomials, and any scalar multiple of a
polynomial.

The collection of solutions to the (homogeneous, linear) differential equation y'' + 6y' + 5y = 0 forms a
vector space.

We show this by verifying that the solutions form a subspace of the space of all functions.

[S1]: The zero function is a solution.
[S2]: If y1 and y2 are solutions, then y1'' + 6y1' + 5y1 = 0 and y2'' + 6y2' + 5y2 = 0, so adding and using
properties of derivatives shows that (y1 + y2)'' + 6(y1 + y2)' + 5(y1 + y2) = 0, so y1 + y2 is also a
solution.
[S3]: If α is a scalar and y1 is a solution, then scaling y1'' + 6y1' + 5y1 = 0 by α and using properties
of derivatives shows that (αy1)'' + 6(αy1)' + 5(αy1) = 0, so αy1 is also a solution.

Note: Observe that we can say something about what the set of solutions to this equation looks like,
namely that it is a vector space, without actually solving it!

For completeness, the solutions are y = A e^(−x) + B e^(−5x) for any constants A and B. From here,
if we wanted to, we could directly verify that such functions form a vector space.
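Aside (computational sketch): the claimed solutions can be verified symbolically. Assuming sympy is available, the check below plugs y = A e^(−x) + B e^(−5x) back into the equation:

    import sympy as sp

    x, A, B = sp.symbols('x A B')
    y = A * sp.exp(-x) + B * sp.exp(-5 * x)

    # y'' + 6y' + 5y should simplify to 0 for every choice of A and B
    residual = sp.diff(y, x, 2) + 6 * sp.diff(y, x) + 5 * y
    print(sp.simplify(residual))   # 0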

The collection of solutions to any nth-order homogeneous linear differential equation
y^(n) + Pn(x) y^(n−1) + ... + P2(x) y' + P1(x) y = 0, for continuous functions P1(x), ..., Pn(x), forms a
vector space.

Note that y^(n) means the nth derivative of y.

As in the previous example, we show this by verifying that the solutions form a subspace of the
space of all functions.

[S1]: The zero function is a solution.
[S2]: If y1 and y2 are solutions, then adding the equations y1^(n) + Pn(x) y1^(n−1) + ... + P1(x) y1 = 0
and y2^(n) + Pn(x) y2^(n−1) + ... + P1(x) y2 = 0 and using properties of derivatives shows that
(y1 + y2)^(n) + Pn(x) (y1 + y2)^(n−1) + ... + P1(x) (y1 + y2) = 0, so y1 + y2 is also a solution.
[S3]: If α is a scalar and y1 is a solution, then scaling y1^(n) + Pn(x) y1^(n−1) + ... + P2(x) y1' + P1(x) y1 = 0
by α and using properties of derivatives shows that (αy1)^(n) + Pn(x) (αy1)^(n−1) + ... + P1(x) (αy1) = 0,
so αy1 is also a solution.

Note: This example is a fairly significant amount of the reason we are interested in linear algebra
(as it relates to differential equations): because the solutions to homogeneous linear differential
equations form a vector space. In general, for arbitrary functions P1(x), ..., Pn(x), it is not possible
to solve the differential equation explicitly for y; nonetheless, we can still say something about what
the solutions look like.

1.4 Span, Independence, Bases, Dimension

One thing we would like to know, now that we have the definition of a vector space and a subspace, is what
else we can say about elements of a vector space; that is, we would like to know what kind of structure the
elements of a vector space have.

In some of the earlier examples we saw that, in R^n and a few other vector spaces, subspaces could all be
written down in terms of one or more parameters. In order to discuss this idea more precisely, we first need
some terminology.

1.4.1 Linear Combinations and Span

Definition: Given a set ~v1, ..., ~vn of vectors, we say a vector ~w is a linear combination of ~v1, ..., ~vn if there
exist scalars a1, ..., an such that ~w = a1 · ~v1 + ... + an · ~vn.

Example: In R^2, the vector ⟨1, 1⟩ is a linear combination of ⟨1, 0⟩ and ⟨0, 1⟩, because ⟨1, 1⟩ = 1 · ⟨1, 0⟩ + 1 · ⟨0, 1⟩.

Non-Example: In R^3, the vector ⟨0, 0, 1⟩ is not a linear combination of ⟨1, 1, 0⟩ and ⟨0, 1, 1⟩ because there
exist no scalars a1 and a2 for which a1 · ⟨1, 1, 0⟩ + a2 · ⟨0, 1, 1⟩ = ⟨0, 0, 1⟩: this would require a common
solution to the three equations a1 = 0, a1 + a2 = 0, and a2 = 1, and this system has no solution.

Example: In R^4, the vector ⟨4, 0, 5, 9⟩ is a linear combination of ⟨1, −1, 2, 3⟩, ⟨0, 1, 0, 0⟩, and ⟨1, 1, 1, 2⟩,
because ⟨4, 0, 5, 9⟩ = 1 · ⟨1, −1, 2, 3⟩ − 2 · ⟨0, 1, 0, 0⟩ + 3 · ⟨1, 1, 1, 2⟩.
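Aside (computational sketch): deciding whether a vector is a linear combination of given vectors amounts to solving a linear system whose columns are those vectors. Assuming numpy is available, the sketch below applies this to the R^3 non-example above (the helper name in_span is purely illustrative, and the second test vector is an extra example not taken from the handout):

    import numpy as np

    # columns are the vectors <1,1,0> and <0,1,1>
    V = np.column_stack(([1, 1, 0], [0, 1, 1]))

    def in_span(V, w):
        # w is a linear combination of the columns of V exactly when the
        # least-squares solution reproduces w exactly (zero residual)
        coeffs, *_ = np.linalg.lstsq(V, w, rcond=None)
        return np.allclose(V @ coeffs, w)

    print(in_span(V, np.array([0, 0, 1])))   # False: not a linear combination
    print(in_span(V, np.array([1, 2, 1])))   # True: 1*<1,1,0> + 1*<0,1,1>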

Definition: We define the span of vectors ~v1, ..., ~vn, denoted span(~v1, ..., ~vn), to be the set W of all vectors
which are linear combinations of ~v1, ..., ~vn. Explicitly, the span is the set of vectors of the form
a1 · ~v1 + ... + an · ~vn, for some scalars a1, ..., an.

Remark 1: The span is always a subspace: the zero vector can be written as 0 · ~v1 + ... + 0 · ~vn, and
the span is closed under addition and scalar multiplication.

Remark 2: The span is, in fact, the smallest subspace W containing the vectors ~v1, ..., ~vn: for any
scalars a1, ..., an, closure under scalar multiplication requires each of a1~v1, a2~v2, ..., an~vn to be in W.
Then closure under vector addition forces the sum a1~v1 + ... + an~vn to be in W.

Remark 3: For technical reasons, we define the span of the empty set to be the zero vector.

Example: The span of the vectors ⟨1, 0, 0⟩ and ⟨0, 1, 0⟩ in R^3 is the set of vectors of the form
a · ⟨1, 0, 0⟩ + b · ⟨0, 1, 0⟩ = ⟨a, b, 0⟩.

Equivalently, the span of these vectors is the set of vectors whose z-coordinate is zero, i.e., the plane z = 0.

Definition: Given a vector space V, if the span of vectors ~v1, ..., ~vn is all of V, we say that ~v1, ..., ~vn are a
generating set for V, or that they generate V.

Example: The three vectors ⟨1, 0, 0⟩, ⟨0, 1, 0⟩, and ⟨0, 0, 1⟩ generate R^3, since for any vector ⟨a, b, c⟩ we
can write ⟨a, b, c⟩ = a · ⟨1, 0, 0⟩ + b · ⟨0, 1, 0⟩ + c · ⟨0, 0, 1⟩.

1.4.2 Linear Independence

Definition: We say a finite set of vectors ~v1, ..., ~vn is linearly independent if a1 · ~v1 + ... + an · ~vn = ~0 implies
a1 = ... = an = 0. (Otherwise, we say the collection is linearly dependent.)

Note: For an infinite set of vectors, we say it is linearly independent if every finite subset is linearly
independent (per the definition above); otherwise (if some finite subset displays a dependence) we say it
is dependent.

In other words, ~v1, ..., ~vn are linearly independent precisely when the only way to form the zero vector
as a linear combination of ~v1, ..., ~vn is to have all the scalars equal to zero.

An equivalent way of thinking of linear (in)dependence is that a set is dependent if one of the vectors is a
linear combination of the others, i.e., it depends on the others. Explicitly, if a1 · ~v1 + a2 · ~v2 + ... + an · ~vn = ~0
and a1 ≠ 0, then we can rearrange to see that ~v1 = −(1/a1) · (a2 · ~v2 + ... + an · ~vn).

Example: The vectors ⟨1, 1, 0⟩ and ⟨0, 2, 1⟩ in R^3 are linearly independent, because if we have scalars a
and b with a · ⟨1, 1, 0⟩ + b · ⟨0, 2, 1⟩ = ⟨0, 0, 0⟩, then comparing the two sides requires a = 0, a + 2b = 0,
and b = 0, which has only the solution a = b = 0.

Example: The vectors ⟨1, 1, 0⟩ and ⟨2, 2, 0⟩ in R^3 are linearly dependent, because we can write 2 · ⟨1, 1, 0⟩ +
(−1) · ⟨2, 2, 0⟩ = ⟨0, 0, 0⟩. Or, in the equivalent formulation, we have ⟨2, 2, 0⟩ = 2 · ⟨1, 1, 0⟩.

Example: The vectors ⟨1, 0, 2, 2⟩, ⟨2, −2, 0, 3⟩, ⟨0, 3, 3, 1⟩, and ⟨0, 4, 2, 1⟩ in R^4 are linearly dependent,
because we can write 2 · ⟨1, 0, 2, 2⟩ + (−1) · ⟨2, −2, 0, 3⟩ + (−2) · ⟨0, 3, 3, 1⟩ + 1 · ⟨0, 4, 2, 1⟩ = ⟨0, 0, 0, 0⟩.

Theorem: The vectors ~v1, ..., ~vn are linearly independent if and only if every vector ~w in the span of ~v1, ..., ~vn
may be uniquely written as a sum ~w = a1 · ~v1 + a2 · ~v2 + ... + an · ~vn.

For one direction, if the decomposition is always unique, then a1 · ~v1 + a2 · ~v2 + ... + an · ~vn = ~0 implies
a1 = ... = an = 0, because 0 · ~v1 + ... + 0 · ~vn = ~0 is by assumption the only decomposition of ~0.

For the other direction, suppose we had two different ways of decomposing a vector ~w, say as ~w =
a1 · ~v1 + a2 · ~v2 + ... + an · ~vn and ~w = b1 · ~v1 + b2 · ~v2 + ... + bn · ~vn.

Then subtracting and then rearranging the difference between these two equations yields ~w − ~w =
(a1 − b1) · ~v1 + ... + (an − bn) · ~vn.

Now ~w − ~w is the zero vector, so we have (a1 − b1) · ~v1 + ... + (an − bn) · ~vn = ~0. But because ~v1, ..., ~vn
are linearly independent, we see that all of the scalar coefficients a1 − b1, ..., an − bn are zero. But this says
a1 = b1, a2 = b2, ..., an = bn, which is to say, the two decompositions are actually the same.
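Aside (computational sketch): for vectors in R^n, linear independence can be tested by computing the rank of the matrix having the vectors as columns. Assuming numpy is available, applied to the examples above:

    import numpy as np

    # independent pair: the rank equals the number of vectors
    A = np.column_stack(([1, 1, 0], [0, 2, 1]))
    print(np.linalg.matrix_rank(A))   # 2, so the two vectors are independent

    # dependent set from the R^4 example: the rank drops below 4
    B = np.column_stack(([1, 0, 2, 2], [2, -2, 0, 3], [0, 3, 3, 1], [0, 4, 2, 1]))
    print(np.linalg.matrix_rank(B))   # 3, so the four vectors are dependent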

1.4.3 Bases and Dimension

Definition: A linearly independent set of vectors which generate V is called a basis for V.

Terminology Note: The plural form of the (singular) word basis is bases.

Example: The three vectors ⟨1, 0, 0⟩, ⟨0, 1, 0⟩, and ⟨0, 0, 1⟩ generate R^3, as we saw above. They are also
linearly independent, since a · ⟨1, 0, 0⟩ + b · ⟨0, 1, 0⟩ + c · ⟨0, 0, 1⟩ is the zero vector only when a = b = c = 0.
Thus, these three vectors are a basis for R^3.

Example: More generally, in R^n, the standard unit vectors e1, e2, ..., en (where ej has a 1 in the jth
coordinate and 0s elsewhere) are a basis.

Non-Example: The vectors ⟨1, 1, 0⟩ and ⟨0, 2, 1⟩ in R^3 are not a basis, as they fail to generate V: it is
not possible to obtain the vector ⟨1, 0, 0⟩ as a linear combination of ⟨1, 1, 0⟩ and ⟨0, 2, 1⟩.

Non-Example: The vectors ⟨1, 0, 0⟩, ⟨0, 1, 0⟩, ⟨0, 0, 1⟩, and ⟨1, 1, 1⟩ in R^3 are not a basis, as they are
linearly dependent: we have 1 · ⟨1, 0, 0⟩ + 1 · ⟨0, 1, 0⟩ + 1 · ⟨0, 0, 1⟩ + (−1) · ⟨1, 1, 1⟩ = ⟨0, 0, 0⟩.

Example: The polynomials 1, x, x^2, x^3, ... are a basis for the vector space of all polynomials.

First observe that 1, x, x^2, x^3, ... certainly generate the set of all polynomials (by definition of a
polynomial).

Now we want to see that these polynomials are linearly independent. So suppose we had scalars
a0, a1, ..., an such that a0 · 1 + a1 x + ... + an x^n = 0, for all values of x.

Then if we take the nth derivative of both sides (which is allowable because a0 · 1 + a1 x + ... + an x^n = 0
is assumed to be true for all x), then we obtain n! · an = 0, from which we see that an = 0.

Then repeat by taking the (n−1)st derivative to see a_{n−1} = 0, and so on, until finally we are left with
just a0 = 0. Hence the only way to form the zero function as a linear combination of 1, x, x^2, ..., x^n
is with all coefficients zero, which says that 1, x, x^2, x^3, ... is a linearly independent set.

Theorem: A collection of n vectors ~v1, ..., ~vn in R^n is a basis if and only if the n × n matrix B, whose columns
are the vectors ~v1, ..., ~vn, is an invertible matrix.

The idea behind the theorem is to multiply out and compare coordinates, and then analyze the resulting
system of equations.

So suppose we are looking for scalars a1, ..., an such that a1~v1 + ... + an~vn = ~w, for some vector ~w in R^n.
This vector equation is the same as the matrix equation B · ~a = ~w, where B is the matrix whose columns
are the vectors ~v1, ..., ~vn, ~a is the column vector whose entries are the scalars a1, ..., an, and ~w is
thought of as a column vector.

Now from what we know about matrix equations, we know that B is an invertible matrix precisely when
B · ~a = ~w has a unique solution for every ~w.

But having a unique way to write any vector as a linear combination of vectors in a set is precisely the
statement that the set is a basis. So we are done.
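Aside (computational sketch): the theorem gives a practical test. Assuming numpy is available, put the candidate vectors into the columns of a square matrix and check invertibility, e.g. via the determinant:

    import numpy as np

    B1 = np.column_stack(([1, 0, 0], [0, 1, 0], [0, 0, 1]))   # the standard basis of R^3
    B2 = np.column_stack(([1, 1, 0], [2, 2, 0], [0, 0, 1]))   # contains a dependent pair

    print(np.linalg.det(B1))   # 1.0: invertible, so the columns form a basis
    print(np.linalg.det(B2))   # 0.0: singular, so the columns do not form a basis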

Theorem: Every vector space V has a basis. Any two bases of V contain the same number of elements. Any
generating set for V contains a basis. Any linearly independent set of vectors can be extended to a basis.

Remark: If you only remember one thing about vector spaces, remember that every vector space has a basis!

Remark: That a basis always exists is really, really, really useful. It is without a doubt the most useful
fact about vector spaces: vector spaces in the abstract are very hard to think about, but a vector space
with a basis is something very concrete (since then we know exactly what the elements of the vector
space look like).

To show the first and last parts of the theorem, we show that we can build any set of linearly independent
vectors into a basis:

Start with S being some set of linearly independent vectors. (In any vector space, the empty set is
always linearly independent.)

1. If S spans V, then we are done, because then S is a linearly independent generating set, i.e., a
basis.

2. If S does not span V, there is an element ~v of V which is not in the span of S. Then if we put ~v
in S, the new S is still linearly independent. Then start over.

To justify this statement in general, some fairly technical and advanced machinery may be needed,
but it can be proven that we will eventually land in case (1).

If V has dimension n (see below), then we will always be able to construct a basis in at most n
steps; it is in the case when V has infinite dimension that things get tricky and confusing, and the
argument requires what is called the axiom of choice.

To show the third part of the theorem, the idea is to imagine going through the list of elements in a
generating set and removing elements until it becomes linearly independent.

This idea is not so easy to formulate with an infinite list, but if we have a finite generating set,
then we can go through the elements of the generating set one at a time, throwing out an element
if it is linearly dependent with the elements that came before it. Then, once we have gotten to the
end of the generating set, the collection of elements which we have not thrown away will still be a
generating set (since removing a dependent element will not change the span), but the collection will
also now be linearly independent (since we threw away elements which were dependent).

To show the second part of the theorem, we will show that if B is a basis with n elements and a1, ..., am is
a collection of m vectors with m > n, then a1, ..., am is linearly dependent.

To see this, since B = {b1, ..., bn} is a basis, we can write every element ai as a linear combination of the
elements of B, say as ai = Σ_{j=1}^{n} c_{i,j} bj for 1 ≤ i ≤ m.

Now we would like to see that there is some linear combination of the ai which is the zero vector; that is,
some choice of scalars dk, not all zero, such that Σ_{k=1}^{m} dk ak = ~0.

If we substitute in for the vectors in B, then we obtain a linear combination of the elements of B
equalling the zero vector. Since B is a basis, this means each coefficient of bj in the resulting
expression must be zero.

If we tabulate the resulting system, we can check that it is equivalent to the matrix equation C · ~d = ~0,
where C is the n × m matrix of coefficients (whose (j, k) entry is c_{k,j}) and ~d is the m × 1 column vector
whose entries are the scalars dk.

Now since C is a matrix which has more columns than rows, by the assumption that m > n, we see
that the homogeneous system C · ~d = ~0 has a solution vector ~d which is not the zero vector.

But then we have Σ_{k=1}^{m} dk ak = ~0 for scalars dk not all zero, so the vectors a1, ..., am are linearly dependent.

Definition: We define the number of elements in any basis of V to be the dimension of V.

The theorem above assures us that this quantity is always well-defined.

Example: The dimension of R^n is n, since the standard unit vectors form a basis.

This says that the term dimension is reasonable, since it is the same as our usual notion of dimension.

Example: The dimension of the vector space of m × n matrices is mn, because there is a basis consisting
of the mn matrices E_{i,j}, where E_{i,j} is the matrix with a 1 in the (i, j)-entry and 0s elsewhere.

Example: The dimension of the vector space of all polynomials is infinite, because the (infinite list of)
polynomials 1, x, x^2, x^3, ... are a basis for the space.

1.5 Linear Transformations

Now that we have a reasonably good idea of what the structure of a vector space is, the next natural question
is: what do maps from one vector space to another look like?

It turns out that we don't want to ask about arbitrary functions, but about functions from one vector space
to another which preserve the structure (namely, addition and scalar multiplication) of the vector space.

The analogy to the real numbers is: once we know what the real numbers look like, what can we say
about arbitrary real-valued functions?

The answer is, not much, unless we specify that the functions preserve the structure of the real numbers,
which is abstract math-speak for saying that we want to talk about continuous functions; these turn
out to behave much more nicely.

This is the idea behind the definition of a linear transformation: it is a map that preserves the structure of a
vector space.

Definition: If V and W are vector spaces, we say a map T from V to W (denoted T : V → W) is a
linear transformation if, for any vectors ~v, ~v1, ~v2 and scalar α, we have the two properties:

[T1] The map respects addition of vectors: T(~v1 + ~v2) = T(~v1) + T(~v2).
[T2] The map respects scalar multiplication: T(α · ~v) = α · T(~v).

Remark: Like with the definition of a vector space, one can show a few simple algebraic properties of linear
transformations. For example, any linear transformation sends the zero vector (of V) to the zero vector
(of W).

Example: If V = W = R^2, then the map T which sends ⟨x, y⟩ to ⟨x, x + y⟩ is a linear transformation.

Let ~v = ⟨x, y⟩, ~v1 = ⟨x1, y1⟩, and ~v2 = ⟨x2, y2⟩, so that ~v1 + ~v2 = ⟨x1 + x2, y1 + y2⟩.

[T1]: We have T(~v1 + ~v2) = ⟨x1 + x2, x1 + x2 + y1 + y2⟩ = ⟨x1, x1 + y1⟩ + ⟨x2, x2 + y2⟩ = T(~v1) + T(~v2).
[T2]: We have T(α · ~v) = ⟨αx, αx + αy⟩ = α · ⟨x, x + y⟩ = α · T(~v).

More General Example: If V = W = R^2, then the map which sends ⟨x, y⟩ to ⟨ax + by, cx + dy⟩ is a linear
transformation.

Just like in the previous example, we can work out the calculations explicitly.

But another way we can think of this map is as a matrix map: this map sends the column vector [ x ; y ]
to the column vector [ ax + by ; cx + dy ] = [ a  b ; c  d ] · [ x ; y ]. So, in fact, this map is really just
(left) multiplication by the matrix [ a  b ; c  d ].

When we think of the map in this way, it is easier to see what is happening:

[T1]: We have T(~v1 + ~v2) = [ a  b ; c  d ] · (~v1 + ~v2) = [ a  b ; c  d ] · ~v1 + [ a  b ; c  d ] · ~v2 = T(~v1) + T(~v2).
[T2]: Also, T(α · ~v) = [ a  b ; c  d ] · (α · ~v) = α · ([ a  b ; c  d ] · ~v) = α · T(~v).

Really General Example: If V = R^m (thought of as m × 1 matrices) and W = R^n (thought of as n × 1
matrices) and A is any n × m matrix, then the map T sending ~v to A · ~v is a linear transformation.

The verification is exactly the same as in the previous example.

[T1]: We have T(~v1 + ~v2) = A · (~v1 + ~v2) = A · ~v1 + A · ~v2 = T(~v1) + T(~v2).
[T2]: Also, T(α · ~v) = A · (α · ~v) = α (A · ~v) = α · T(~v).

This last example is very general: in fact, it is so general that every linear transformation from R^m to R^n is
of this form! Namely, if T is a linear transformation from R^m to R^n, then there is some n × m matrix A such
that T acts by sending ~v to A · ~v.

The reason is actually very simple, and it is easy to write down what the matrix A is: it is just the n × m
matrix whose columns are the vectors T(e1), T(e2), ..., T(em), where e1, ..., em are the standard
basis elements of R^m (ej is the vector with a 1 in the jth position and 0s elsewhere).

To see that this choice of A works, note that every vector ~v in R^m can be written as a unique linear
combination ~v = Σ_{j=1}^{m} aj ej of the basis elements. Then, after applying the two properties of a linear
transformation, we obtain T(~v) = Σ_{j=1}^{m} aj T(ej). But this is precisely the matrix product of the matrix A
with the coefficients of ~v.
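Aside (computational sketch): this recipe is easy to carry out by machine. Assuming numpy is available, the snippet below builds the matrix of the earlier map T(⟨x, y⟩) = ⟨x, x + y⟩ from the images of the standard basis vectors and checks it on an arbitrary input:

    import numpy as np

    def T(v):
        # the linear map T(<x, y>) = <x, x + y> from the earlier example
        x, y = v
        return np.array([x, x + y])

    # the columns of A are T(e1) and T(e2)
    e1, e2 = np.array([1, 0]), np.array([0, 1])
    A = np.column_stack((T(e1), T(e2)))
    print(A)                             # [[1 0], [1 1]]

    v = np.array([3, -2])
    print(np.array_equal(A @ v, T(v)))   # True: A reproduces T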

Tangential Remark: If we write down the map explicitly, we see that each coordinate of the output in W is a
linear function of the coordinates in V; e.g., if A = [ a  b ; c  d ], then the linear functions are ax + by
and cx + dy.

This is the reason that linear transformations are named so: because they are really just
linear functions, in the traditional sense.

In fact, we can state something even more general: the argument above shows that, if we take any m-
dimensional vector space V and any n-dimensional vector space W and choose bases for each space, then a
linear transformation from V to W behaves just like multiplication by (some) n × m matrix A.

Remark 1: This result underlines one of the reasons that matrices and vector spaces (which initially seem
like they have almost nothing to do with one another) are actually closely related: because matrices
describe the maps from one vector space to another.

Remark 2: One can also use this relationship between maps on vector spaces and matrices to provide
almost trivial proofs of some of the algebraic properties of matrix multiplication which are hard to prove
by direct computation.

For example: the composition of linear transformations is associative (because linear transformations
are functions, and function composition is associative). Multiplication of matrices is the same as
composition of functions. Hence multiplication of matrices is associative.

1.5.1 Kernel and Image

Definition: If T : V → W is a linear transformation, then the kernel of T, denoted ker(T), is the set of
elements v in V with T(v) = ~0. The image of T, denoted im(T), is the set of elements w in W such that there
exists a v in V with T(v) = w.

Intuitively, the kernel is the elements which are sent to zero by T, and the image is the elements in W
which are hit by T (the range of T).

Essentially (see below), the kernel measures how far from one-to-one the map T is, and the image
measures how far from onto the map T is.

One of the reasons we care about these subspaces is that (for example) the set of solutions to a set of
homogeneous linear equations A~x = ~0 is the kernel of the linear transformation of multiplication by A.
Another reason is that they will say something about the subspace structure of V and W (see below).

The kernel is a subspace of V.

[S1] We have T(~0) = ~0, by simple properties of linear transformations.
[S2] If v1 and v2 are in the kernel, then T(v1) = ~0 and T(v2) = ~0. Hence T(v1 + v2) = T(v1) + T(v2) =
~0 + ~0 = ~0. Therefore, v1 + v2 is also in the kernel.
[S3] If α is a scalar and v is in the kernel, then T(v) = ~0. Therefore, T(α · v) = α · T(v) = α · ~0 = ~0,
so α · v is also in the kernel.

The image is a subspace of W.

[S1] We have T(~0) = ~0, by simple properties of linear transformations.
[S2] If w1 and w2 are in the image, then there exist v1 and v2 such that T(v1) = w1 and T(v2) = w2.
Then T(v1 + v2) = T(v1) + T(v2) = w1 + w2, so that w1 + w2 is also in the image.
[S3] If α is a scalar and w is in the image, then there exists v with T(v) = w. Then T(α · v) = α · T(v) = α · w,
so α · w is also in the image.

Theorem: The kernel ker(T) consists of only the zero vector if and only if the map T is one-to-one. The image
im(T) consists of all of W if and only if the map T is onto.

The statement about the image is just the definition of onto.

If T is one-to-one, then (at most) one element of V maps to ~0. But since the zero vector is taken to the
zero vector, we see that T cannot send anything else to ~0. Thus ker(T) = {~0}.

If ker(T) is only the zero vector, then since T is a linear transformation, the statement T(v1) = T(v2) is
equivalent to the statement that T(v1) − T(v2) = T(v1 − v2) is the zero vector. But, by the definition of
the kernel, T(v1 − v2) = ~0 precisely when v1 − v2 is in the kernel. However, this means v1 − v2 = ~0, or
v1 = v2. Hence T(v1) = T(v2) implies v1 = v2, which means T is one-to-one.
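Aside (computational sketch): for a matrix map T(~v) = A~v, both subspaces are easy to compute, and doing so also previews the dimension count in the next theorem. Assuming numpy and scipy are available (the matrix is an arbitrary example):

    import numpy as np
    from scipy.linalg import null_space

    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0]])       # a 2 x 3 matrix, so dim(V) = 3

    rank = np.linalg.matrix_rank(A)       # dim(im T): dimension of the column space
    nullity = null_space(A).shape[1]      # dim(ker T): dimension of the solution space of A x = 0
    print(rank, nullity, rank + nullity)  # 1 2 3, and 1 + 2 = 3 = dim(V)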

Theorem (Rank-Nullity): For any linear transformation T : V → W, dim(ker(T)) + dim(im(T)) = dim(V).

The idea behind this theorem is that if we have a basis for im(T), say w1, ..., wk, then there exist
v1, ..., vk with T(v1) = w1, ..., T(vk) = wk. Then if a1, ..., al is a basis for ker(T), the goal is to show
that the set of vectors {v1, ..., vk, a1, ..., al} is a basis for V.

To do this, given any v, write T(v) = Σ_{j=1}^{k} βj wj = Σ_{j=1}^{k} βj T(vj) = T(Σ_{j=1}^{k} βj vj), where the βj
are unique.

Then subtraction shows that T(v − Σ_{j=1}^{k} βj vj) = ~0, so that v − Σ_{j=1}^{k} βj vj is in ker(T), hence can be written
as a sum Σ_{i=1}^{l} γi ai, where the γi are unique.

Putting all this together shows v = Σ_{j=1}^{k} βj vj + Σ_{i=1}^{l} γi ai for unique scalars βj and γi, which says that
{v1, ..., vk, a1, ..., al} is a basis for V.

1.5.2 The Derivative as a Linear Transformation

Example: If V and W are any vector spaces, and T1 and T2 are any linear transformations from V to W,
then T1 + T2 and α · T1 are also linear transformations, for any scalar α.

These follow from the criteria. (They are somewhat confusing to follow when written down, so I won't
bother.)

Example: If V is the vector space of real-valued functions and W = R, then the evaluation at 0 map taking f
to the value f(0) is a linear transformation.

[T1]: We have T(f1 + f2) = (f1 + f2)(0) = f1(0) + f2(0) = T(f1) + T(f2).
[T2]: Also, T(α f) = (α f)(0) = α f(0) = α T(f).

Note of course that being a linear transformation has nothing to do with the fact that we are evaluating
at 0. We could just as well evaluate at 1, or anywhere else, and the map would still be a linear transformation.

Example: If V and W are both the vector space of real-valued functions and P(x) is any real-valued function,
then the map taking f(x) to the function P(x) f(x) is a linear transformation.

[T1]: We have T(f1 + f2) = P(x)(f1 + f2)(x) = P(x) f1(x) + P(x) f2(x) = T(f1) + T(f2).
[T2]: Also, T(α f) = P(x)(α f)(x) = α P(x) f(x) = α T(f).

Example: If V is the vector space of all n-times differentiable functions and W is the vector space of all
functions, then the nth derivative map, taking f(x) to its nth derivative f^(n)(x), is a linear transformation.

[T1]: The nth derivative of the sum is the sum of the nth derivatives, so we have T(f1 + f2) =
(f1 + f2)^(n)(x) = f1^(n)(x) + f2^(n)(x) = T(f1) + T(f2).
[T2]: Also, T(α f) = (α f)^(n)(x) = α f^(n)(x) = α T(f).

If we combine the results from the previous four examples, we can show that if V is the vector space of all
n-times differentiable functions, then the map T which sends a function y to the function y^(n) + Pn(x) y^(n−1) +
... + P2(x) y' + P1(x) y is a linear transformation, for any functions Pn(x), ..., P1(x).

In particular, the kernel of this linear transformation is the collection of all functions y such that y^(n) +
Pn(x) y^(n−1) + ... + P2(x) y' + P1(x) y = 0, i.e., the set of solutions to this differential equation.

Note that since we know the kernel is a vector space (as it is a subspace of V), we see that the set of
solutions to y^(n) + Pn(x) y^(n−1) + ... + P2(x) y' + P1(x) y = 0 forms a vector space. (Of course, we could
just show this statement directly, by checking the subspace criteria.)

However, it is very useful to be able to think of this linear differential operator, which sends y to
y^(n) + Pn(x) y^(n−1) + ... + P2(x) y' + P1(x) y, as a linear transformation.
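Aside (computational sketch): the linearity of such a differential operator can be watched in action. Assuming sympy is available, the check below uses the sample operator T(y) = y'' + 6y' + 5y from earlier, with arbitrarily chosen test functions:

    import sympy as sp

    x = sp.symbols('x')

    def T(y):
        # the differential operator y -> y'' + 6y' + 5y
        return sp.diff(y, x, 2) + 6 * sp.diff(y, x) + 5 * y

    f = sp.sin(x)       # arbitrary test functions
    g = sp.exp(2 * x)
    alpha = 7

    print(sp.simplify(T(f + g) - (T(f) + T(g))))     # 0: respects addition
    print(sp.simplify(T(alpha * f) - alpha * T(f)))  # 0: respects scalar multiplication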

Well, you're at the end of my handout. Hope it was helpful.


Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material
without my express permission.
