
Chapter 7

Orthogonality

7.1 Orthogonal transformations. Orthogonal matrices

We have already seen that when we define an inner product in a vector space we can
generalize the concepts of lengths and angles. In particular, we know that the length
of a vector x is given by
    ||x|| = √(x · x),

and that the angle θ between two vectors x and y is found from

    cos θ = (x · y) / (||x|| ||y||).

Thus, the angle between two vectors is a right angle if and only if their inner product
is zero, and in that case we say that those vectors are orthogonal. Recall as well that
the projection of a vector v on a given vector u is the vector:
    P_u(v) = ((u · v) / (u · u)) u.
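
As a quick illustration, the projection formula can be computed numerically. A minimal sketch using NumPy (the function name proj and the sample vectors are our own choices):

import numpy as np

def proj(u, v):
    """Orthogonal projection of v onto the line spanned by u."""
    return (np.dot(u, v) / np.dot(u, u)) * u

u = np.array([1.0, 1.0, 0.0])
v = np.array([2.0, 0.0, 3.0])
p = proj(u, v)
print(p)                 # [1. 1. 0.]
print(np.dot(v - p, u))  # 0.0: the residual v - p is orthogonal to u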

A linear transformation T from Rⁿ to Rⁿ is called orthogonal (or an isometry) if it preserves the length of vectors:

    ||T(x)|| = ||x||,   for all x ∈ Rⁿ.

In the previous chapter, we defined an orthogonal matrix as a matrix A verifying AᵗA = I. Such matrices are tightly related to orthogonal transformations. Here we present an important result for matrices associated to a linear transformation:

Theorem: If a matrix AT represents an orthogonal transformation T (that is, ||AT x|| = ||x|| for all x ∈ Rⁿ), then AT is an orthogonal matrix.
Orthogonal matrices (and the linear transformations that they represent) have many interesting properties; a numerical check follows the list:

- Orthogonal transformations preserve the dot product: if T is orthogonal, then T(x) · T(y) = x · y for all x and y in Rⁿ.

- The product of two orthogonal matrices is orthogonal. Equivalently, the composition of two orthogonal transformations is orthogonal.

- The inverse of an orthogonal matrix is orthogonal; equivalently, the inverse of an orthogonal transformation is an orthogonal transformation.
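
The following sketch checks these properties on rotation matrices, which are orthogonal (a minimal illustration with NumPy; the angles and test vectors are our own choices):

import numpy as np

def rot(t):
    # 2x2 counterclockwise rotation by angle t
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

A, B = rot(0.7), rot(1.3)
x, y = np.array([1.0, 2.0]), np.array([-3.0, 0.5])

print(np.allclose(A.T @ A, np.eye(2)))              # True: A'A = I
print(np.isclose((A @ x) @ (A @ y), x @ y))         # True: dot product preserved
print(np.allclose((A @ B).T @ (A @ B), np.eye(2)))  # True: product is orthogonal
print(np.allclose(np.linalg.inv(A), A.T))           # True: the inverse is A', also orthogonal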

7.2 Orthogonal complement

Let A be an n × m matrix and let x = (x1, ..., xm)′ ∈ N(A), where N(A) is the nullspace of A. Then we have Ax = 0, and this implies that the dot product of x with each row of A is zero; in other words, x is orthogonal to every row of A, and thus x is orthogonal to every linear combination of the row vectors of A. This proves the result that we have already seen in chapter 6: every vector in the nullspace of A is orthogonal to every vector in the row space of A. This property is generalized with the following definition.

Two subspaces S1 and S2 of Rⁿ are said to be orthogonal subspaces if s1ᵗ s2 = 0 for every s1 ∈ S1 and every s2 ∈ S2. If S1 and S2 are orthogonal, we write S1 ⊥ S2.

Let S be a subspace of Rⁿ. The set of all vectors in Rⁿ that are orthogonal to every vector in S is denoted by S⊥ and is called the orthogonal complement of S. Thus

    S⊥ = {x ∈ Rⁿ : xᵗs = 0, for every s ∈ S}.

Example: Let S1 = Span(e1) and S2 = Span(e2) be two subspaces of R³. It is easy to see that S1 and S2 are orthogonal. If s1 ∈ S1 we have s1 = (α, 0, 0)′, and if s2 ∈ S2 we have s2 = (0, β, 0)′; then

    s1ᵗ s2 = α·0 + 0·β + 0·0 = 0,

and S1 ⊥ S2. However, S1 and S2 are not orthogonal complements. Indeed, S1⊥ = Span(e2, e3) and S2⊥ = Span(e1, e3).

The following are important properties of orthogonal subspaces (a computational sketch follows the list):

- If S1 and S2 are orthogonal subspaces of Rⁿ, then S1 ∩ S2 = {0}.

- If S is a subspace of Rⁿ, then S⊥ is also a subspace of Rⁿ.

- If S is a subspace of Rⁿ, then dim(S) + dim(S⊥) = n. Furthermore, if {v1, ..., vr} is a basis for S and {vr+1, ..., vn} is a basis for S⊥, then {v1, ..., vr, vr+1, ..., vn} is a basis for Rⁿ.
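
Computationally, a basis for S⊥ can be obtained as the null space of a matrix whose rows form a basis of S, since x ∈ S⊥ exactly when x is orthogonal to every such row. A minimal sketch (using scipy.linalg.null_space; the subspace S is our own example):

import numpy as np
from scipy.linalg import null_space

# Rows of M form a basis for S = Span((1,0,0), (0,0,1)) in R^3.
M = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])

B = null_space(M)               # columns form an orthonormal basis of S-perp
print(B)                        # one column, proportional to (0, 1, 0)
print(M.shape[0] + B.shape[1])  # dim(S) + dim(S-perp) = 3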

7.3 Orthogonal bases. Orthonormal bases

A set of vectors {u1, ..., up} in Rⁿ is said to be an orthogonal set if each pair of distinct vectors from the set is orthogonal, that is, if ui · uj = 0 whenever i ≠ j.

Example: Let us show that {u1, u2, u3} is an orthogonal set, where u1 = (3, 1, 1)′, u2 = (−1, 2, 1)′ and u3 = (−1/2, −2, 7/2)′. We have to consider the three possible pairs of distinct vectors:

    u1 · u2 = 3·(−1) + 1·2 + 1·1 = 0,
    u1 · u3 = 3·(−1/2) + 1·(−2) + 1·(7/2) = 0,
    u2 · u3 = (−1)·(−1/2) + 2·(−2) + 1·(7/2) = 0.

Therefore, {u1, u2, u3} is an orthogonal set.

Theorem: If S = {u1, ..., up} is an orthogonal set of nonzero vectors in Rⁿ, then S is linearly independent, and hence is a basis for the subspace spanned by S.

Let S be an orthogonal set of nonzero vectors in Rⁿ, and let W be the subspace spanned by S. Then S is called an orthogonal basis for W because it is both an orthogonal set and a basis for W. If there are n vectors in S, then W = Rⁿ and S is an orthogonal basis for Rⁿ.

Theorem: Let {u1, ..., up} be an orthogonal basis for a subspace W of Rⁿ. Then each w in W has a unique representation as a linear combination of u1, ..., up. In fact, if

    w = c1 u1 + ··· + cp up,

then

    cj = (w · uj) / (uj · uj),   (j = 1, ..., p).

Example: We have seen that the vectors u1 = (3, 1, 1)′, u2 = (−1, 2, 1)′ and u3 = (−1/2, −2, 7/2)′ form an orthogonal set, and thus they are linearly independent and, consequently, an orthogonal basis for R³. Let us express the vector v = (6, 1, −8)′ as a linear combination of these vectors. We have

    v · u1 = 11;    v · u2 = −12;    v · u3 = −33;
    u1 · u1 = 11;   u2 · u2 = 6;     u3 · u3 = 33/2.

Then

    v = (11/11) u1 + (−12/6) u2 + (−33/(33/2)) u3 = u1 − 2u2 − 2u3.
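
A quick numerical check of this computation (a sketch with NumPy):

import numpy as np

u = [np.array([3.0, 1.0, 1.0]),
     np.array([-1.0, 2.0, 1.0]),
     np.array([-0.5, -2.0, 3.5])]
v = np.array([6.0, 1.0, -8.0])

# c_j = (v . u_j) / (u_j . u_j), valid because the basis is orthogonal
c = [np.dot(v, uj) / np.dot(uj, uj) for uj in u]
print(c)  # [1.0, -2.0, -2.0]
print(np.allclose(sum(cj * uj for cj, uj in zip(c, u)), v))  # True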


A set {u1, ..., up} is an orthonormal set if it is an orthogonal set of unit vectors. If S is the subspace spanned by such a set, then we say that the set is an orthonormal basis for S. Matrices whose columns form an orthonormal set are important in applications and in computer algorithms for matrix computations. Their main properties are:

Theorem: An n × m matrix A has orthonormal columns if and only if AᵗA = I.

Theorem: Let A be an n × m matrix with orthonormal columns, and let x and y be in Rᵐ. Then:

1) ||Ax|| = ||x||,
2) (Ax) · (Ay) = x · y,
3) (Ax) · (Ay) = 0 if and only if x · y = 0.

Properties 1) and 3) say that the linear mapping x ↦ Ax preserves lengths and orthogonality.

Theorem: A is an orthogonal matrix if and only if the columns of A form an orthonormal basis of Rⁿ.
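A sketch checking these theorems on a 3 × 2 matrix with orthonormal columns (the matrix and test vectors are our own example):

import numpy as np

# Columns are orthonormal, but A is not square (so it is not an orthogonal matrix).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
x, y = np.array([3.0, -1.0]), np.array([2.0, 6.0])

print(np.allclose(A.T @ A, np.eye(2)))                       # True: A'A = I
print(np.isclose(np.linalg.norm(A @ x), np.linalg.norm(x)))  # True: lengths preserved
print(np.isclose((A @ x) @ (A @ y), x @ y))                  # True: dot products preserved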

7.4 Gram-Schmidt process

The Gram-Schmidt process is a simple algorithm for producing an orthogonal or orthonormal basis for any subspace of Rⁿ. To illustrate how it works, consider the following example:

Example: Let S = Span(v1, v2), with v1 = (3, 6, 0)′ and v2 = (1, 2, 2)′, linearly independent and not orthogonal, since their inner product does not equal zero. If we want to construct an orthogonal basis for S, we can consider

    p = P_v1(v2) = (15/45)(3, 6, 0)′ = (1, 2, 0)′,

and v2 − p is orthogonal to v1 (see the picture below).

[Figure: v2 decomposed into its projection p = P_v1(v2) along v1 and the orthogonal component v2 − p.]

Obviously w2 = v2 − p = (0, 0, 2)′ is in S, since it is a linear combination of v1 (with coefficient −(v2 · v1)/(v1 · v1) = −1/3) and v2 (with coefficient 1), and it is orthogonal to v1. Then the set {v1, w2} is an orthogonal basis for S.

The Gram-Schmidt process: Given a basis {v1, ..., vp} for a subspace S of Rⁿ, define:

    w1 = v1,

    w2 = v2 − ((v2 · w1)/(w1 · w1)) w1,

    w3 = v3 − ((v3 · w1)/(w1 · w1)) w1 − ((v3 · w2)/(w2 · w2)) w2,

    ...

    wp = vp − ((vp · w1)/(w1 · w1)) w1 − ((vp · w2)/(w2 · w2)) w2 − ··· − ((vp · w_{p−1})/(w_{p−1} · w_{p−1})) w_{p−1}.

Then {w1, ..., wp} is an orthogonal basis for S. In addition,

    Span(v1, ..., vk) = Span(w1, ..., wk),   1 ≤ k ≤ p.

Once an orthogonal basis has been found, it is easy to obtain an orthonormal basis:
simply normalize all the vectors.
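
A direct implementation of the process (a minimal sketch of classical Gram-Schmidt exactly as written above; in floating-point arithmetic this variant can lose orthogonality for nearly dependent inputs):

import numpy as np

def gram_schmidt(V):
    """Orthogonalize the columns of V, assumed linearly independent."""
    W = []
    for v in V.T:                # iterate over the columns of V
        w = v.astype(float)
        for u in W:
            w = w - (np.dot(v, u) / np.dot(u, u)) * u  # subtract the projection on u
        W.append(w)
    return np.column_stack(W)

V = np.column_stack([[3.0, 6.0, 0.0], [1.0, 2.0, 2.0]])
W = gram_schmidt(V)
print(W)                         # columns (3, 6, 0) and (0, 0, 2), as in the example
print(np.dot(W[:, 0], W[:, 1]))  # 0.0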

7.4.1 QR factorization

If A1, ..., Am are the columns of an n × m matrix A, then applying the Gram-Schmidt process with normalizations to A1, ..., Am results in factoring A as described in the next theorem. This factorization is very useful for simplifying several problems, such as solving equations and others that we will face later in this course.

Theorem (The QR factorization): If A is an n × m matrix with linearly independent columns, then A may be factored as A = QR, where Q is an n × m matrix whose columns form an orthonormal basis for C(A) and R is an m × m upper triangular invertible matrix with positive entries on its diagonal.

Example: Find a QR factorization of

    A = [ 1  0  0 ]
        [ 1  1  0 ]
        [ 1  1  1 ]
        [ 1  1  1 ].

The columns of A are linearly independent, so let us find a basis of orthonormal vectors for Span(A1, A2, A3) = C(A). First set w1 = A1 = (1, 1, 1, 1)′. Then

    w2 = A2 − ((A2 · w1)/(w1 · w1)) w1 = (0, 1, 1, 1)′ − (3/4)(1, 1, 1, 1)′ = (−3/4, 1/4, 1/4, 1/4)′.

To simplify computations, we can scale w2 by a factor of 4, so w2 = (−3, 1, 1, 1)′. Finally, we obtain the third vector of the orthogonal basis by

    w3 = A3 − ((A3 · w1)/(w1 · w1)) w1 − ((A3 · w2)/(w2 · w2)) w2
       = (0, 0, 1, 1)′ − (2/4)(1, 1, 1, 1)′ − (2/12)(−3, 1, 1, 1)′
       = (0, −2/3, 1/3, 1/3)′.
Now we can normalize the basis, and the resulting vectors will be the columns of Q in the factorization:

    Q = [ 1/2  −3/√12     0    ]
        [ 1/2   1/√12  −2/√6   ]
        [ 1/2   1/√12   1/√6   ]
        [ 1/2   1/√12   1/√6   ].
Since the columns of Q are orthonormal, we have that QᵗQ = I. Now we need to find the upper triangular matrix R verifying A = QR. If we multiply both sides of this expression by Qᵗ we get

    QᵗA = QᵗQR = IR = R.

Thus

    R = QᵗA = [ 2   3/2      1    ]
              [ 0  3/√12   2/√12  ]
              [ 0    0      2/√6  ].
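
As a sanity check, the same factorization can be computed with a library routine (a sketch; numpy.linalg.qr may flip the signs of some columns of Q and rows of R, since the factorization is only unique once the diagonal of R is forced to be positive):

import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [1.0, 1.0, 1.0]])

Q, R = np.linalg.qr(A)                  # "reduced" QR: Q is 4 x 3, R is 3 x 3
print(np.allclose(Q @ R, A))            # True
print(np.allclose(Q.T @ Q, np.eye(3)))  # True: orthonormal columns
print(R)                                # upper triangular, matching our R up to signs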


7.5 Geometry of linear maps

The following pictures show the effect of several mappings from R to R; two of them, f1(x) = eˣ and f2(x) = x², are not linear, and it can be seen that they distort the domain when it is transformed into the range. On the other hand, f3(x) = 2x and f4(x) = −2x are linear transformations, and clearly they spread the domain evenly, always by the same factor.

[Figure: the images of the real line under f1(x) = eˣ, f2(x) = x², f3(x) = 2x and f4(x) = −2x.]

The only linear maps from R to R are multiplications by a scalar, but in higher dimensions more can happen. For instance, the linear transformation from R² to R² defined by

    [ x1 ]      [ cos θ  −sin θ ] [ x1 ]
    [ x2 ]  ↦   [ sin θ   cos θ ] [ x2 ]

rotates vectors counterclockwise by an angle θ.

[Figure: rotation by an angle θ = π/3.]
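
A small numerical sketch applying this rotation (θ = π/3, as in the picture):

import numpy as np

theta = np.pi / 3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([1.0, 0.0])
print(R @ x)                  # [0.5  0.866...]: e1 rotated by 60 degrees
print(np.linalg.norm(R @ x))  # 1.0: rotations preserve length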


In the following subsections, we are going to focus on the most important linear transformations in R2 and R3 , but the analysis can be extended to Rn for any n.

7.5.1 Reflection

A reflection is an orthogonal linear transformation from a vector space V to itself whose set of fixed points (points whose images coincide with the points themselves) is a hyperplane; this set is called the axis (in dimension 2) or the plane (in dimension 3) of reflection. Intuitively, we can visualize the image of a vector under a reflection as its mirror image in the axis or plane of reflection.

Some special (and simple) reflections are the following:

R²

[Figure: reflection about the x-axis, mapping (x, y) to (x, −y), and reflection about the y-axis, mapping (x, y) to (−x, y).]

- Reflection about the x-axis: This reflection transforms every vector in R² with coordinates (x, y) into the vector (x, −y). The points on the x-axis are the fixed points. The matrix associated to this linear transformation is given by AT = (T(e1), T(e2)). Clearly,

      AT = [ 1   0 ]
           [ 0  −1 ].

- Reflection about the y-axis: This reflection transforms every vector in R² with coordinates (x, y) into the vector (−x, y). The points on the y-axis are the fixed points. The matrix associated to this linear transformation is given by AT = (T(e1), T(e2)). In this situation,

      AT = [ −1  0 ]
           [  0  1 ].

[Figure: reflection about the line y = x, mapping (x, y) to (y, x), and reflection about the origin, mapping (x, y) to (−x, −y).]

- Reflection about the line y = x: In this reflection, the set of fixed points consists of the points on the line y = x. The images of the vectors in the standard basis are given by T(e1) = e2 and T(e2) = e1. Thus, the associated matrix is

      AT = [ 0  1 ]
           [ 1  0 ].

- Reflection about the origin: The images of the vectors in the standard basis are T(e1) = −e1 and T(e2) = −e2. The only fixed point is the origin. Thus, the matrix associated to this isometry is

      AT = [ −1   0 ]
           [  0  −1 ].
R³

In R³, when we reflect about a given plane, say the xy-plane, we move from above the plane to below it (or vice versa); this simply means changing the sign of the remaining variable (z in the case of the xy-plane).

- Reflection about the xy-plane: The associated matrix is

      AT = [ 1  0   0 ]
           [ 0  1   0 ]
           [ 0  0  −1 ].

- Reflection about the yz-plane: The matrix of the linear transformation is

      AT = [ −1  0  0 ]
           [  0  1  0 ]
           [  0  0  1 ].

- Reflection about the xz-plane: The associated matrix is

      AT = [ 1   0  0 ]
           [ 0  −1  0 ]
           [ 0   0  1 ].
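
All of these reflection matrices are orthogonal, and applying a reflection twice gives the identity. A quick sketch checking this for the xy-plane reflection (our own verification):

import numpy as np

A = np.diag([1.0, 1.0, -1.0])  # reflection about the xy-plane: flips the sign of z

print(np.allclose(A.T @ A, np.eye(3)))  # True: A is orthogonal
print(np.allclose(A @ A, np.eye(3)))    # True: reflecting twice is the identity
print(A @ np.array([1.0, 2.0, 3.0]))    # [ 1.  2. -3.]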

7.5.2 Orthogonal Projections

A projection is a linear transformation T from a vector space V to itself verifying T∘T = T. An orthogonal projection is a projection for which the range R(T) and the null space ker(T) are orthogonal subspaces. A projection is orthogonal if and only if the associated matrix is symmetric (relative to an orthonormal basis, like the standard one): AT = ATᵗ.

Some special orthogonal projections are the following:

R²

[Figure: projection on the x-axis, mapping (x, y) to (x, 0), and projection on the y-axis, mapping (x, y) to (0, y).]

- Projection on the x-axis: This projection transforms every vector in R² with coordinates (x, y) into the vector (x, 0). The matrix associated to this linear transformation is given by AT = (T(e1), T(e2)). Clearly,

      AT = [ 1  0 ]
           [ 0  0 ].

- Projection on the y-axis: This projection transforms every vector in R² with coordinates (x, y) into the vector (0, y). The matrix associated to this linear transformation is given by AT = (T(e1), T(e2)). Now,

      AT = [ 0  0 ]
           [ 0  1 ].

R³

In this vector space, the simplest projections are those in which we project onto one of the three axes or onto one of the three coordinate planes.

- Projection on the x-axis: This projection transforms every vector in R³ with coordinates (x, y, z) into the vector (x, 0, 0). The matrix associated to this linear transformation is given by AT = (T(e1), T(e2), T(e3)):

      AT = [ 1  0  0 ]
           [ 0  0  0 ]
           [ 0  0  0 ].
- Projection on the y-axis: This projection transforms every vector in R³ with coordinates (x, y, z) into the vector (0, y, 0). The matrix associated to this linear transformation is given by AT = (T(e1), T(e2), T(e3)):

      AT = [ 0  0  0 ]
           [ 0  1  0 ]
           [ 0  0  0 ].
- Projection on the z-axis: This projection transforms every vector in R³ with coordinates (x, y, z) into the vector (0, 0, z). The matrix associated to this linear transformation is given by AT = (T(e1), T(e2), T(e3)):

      AT = [ 0  0  0 ]
           [ 0  0  0 ]
           [ 0  0  1 ].
- Projection on the xy-plane: This projection transforms every vector in R³ with coordinates (x, y, z) into the vector (x, y, 0). The matrix associated to this linear transformation is given by AT = (T(e1), T(e2), T(e3)):

      AT = [ 1  0  0 ]
           [ 0  1  0 ]
           [ 0  0  0 ].

- Projection on the xz-plane: This projection transforms every vector in R³ with coordinates (x, y, z) into the vector (x, 0, z). The matrix associated to this linear transformation is given by AT = (T(e1), T(e2), T(e3)):

      AT = [ 1  0  0 ]
           [ 0  0  0 ]
           [ 0  0  1 ].
- Projection on the yz-plane: This projection transforms every vector in R³ with coordinates (x, y, z) into the vector (0, y, z). The matrix associated to this linear transformation is given by AT = (T(e1), T(e2), T(e3)):

      AT = [ 0  0  0 ]
           [ 0  1  0 ]
           [ 0  0  1 ].
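
A brief sketch verifying the two defining properties (T∘T = T and symmetry of the matrix) for the projection on the xy-plane (our own check):

import numpy as np

P = np.diag([1.0, 1.0, 0.0])  # projection on the xy-plane: drops the z-coordinate

print(np.allclose(P @ P, P))          # True: projections are idempotent
print(np.allclose(P, P.T))            # True: orthogonal projections are symmetric
print(P @ np.array([1.0, 2.0, 3.0]))  # [1. 2. 0.]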

7.5.3 Contractions and Dilations

These transformations correspond to scalar multiplication: T(x) = cx, where c is a nonnegative scalar. The transformation is called a contraction if 0 ≤ c < 1 and a dilation if c ≥ 1. The coordinates (x, y) of a vector in R² become (cx, cy), and thus the matrix associated to this linear transformation is given, as usual, by AT = (T(e1), T(e2)):

    AT = [ c  0 ]
         [ 0  c ].

Analogously, in R³ the matrix associated to a dilation/contraction is given by:

    AT = [ c  0  0 ]
         [ 0  c  0 ]
         [ 0  0  c ].

7.5.4 Rotations

R²
Let v be a vector with coordinates (x, y), and let T be the linear transformation that performs a counterclockwise rotation by an angle θ. The associated matrix is characterized by T(e1) = (cos θ, sin θ) and T(e2) = (−sin θ, cos θ); therefore

    AT = [ cos θ  −sin θ ]
         [ sin θ   cos θ ].
R³

In 3 dimensions, we will consider the following rotations about the three positive coordinate axes.

- Counterclockwise rotation by an angle θ about the positive x-axis: The corresponding matrix associated to this linear transformation is given by

      AT = [ 1    0       0    ]
           [ 0  cos θ  −sin θ  ]
           [ 0  sin θ   cos θ  ].
- Counterclockwise rotation by an angle θ about the positive y-axis: In this case we have

      AT = [  cos θ  0  sin θ ]
           [    0    1    0   ]
           [ −sin θ  0  cos θ ].

- Counterclockwise rotation by an angle θ about the positive z-axis: Finally, we have

      AT = [ cos θ  −sin θ  0 ]
           [ sin θ   cos θ  0 ]
           [   0       0    1 ].
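
A closing sketch composing two of these rotations; the composition is again orthogonal, but the order of the factors matters (our own illustration):

import numpy as np

def rot_x(t):
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, np.cos(t), -np.sin(t)],
                     [0.0, np.sin(t),  np.cos(t)]])

def rot_z(t):
    return np.array([[np.cos(t), -np.sin(t), 0.0],
                     [np.sin(t),  np.cos(t), 0.0],
                     [0.0, 0.0, 1.0]])

A = rot_x(0.4) @ rot_z(1.1)
print(np.allclose(A.T @ A, np.eye(3)))      # True: the composition is orthogonal
print(np.allclose(rot_x(0.4) @ rot_z(1.1),
                  rot_z(1.1) @ rot_x(0.4))) # False: rotations about different axes do not commute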
