1 Homogeneous Coordinates

Homogeneous coordinates have played an important role in the development
of projective geometry [Max63, Max51] and are used extensively in the field of
computer graphics [RA76, NS79]. Although used in a more restrictive sense in
the field of robotics, they nevertheless play a key role in the representation of
three-dimensional objects. The position of three-dimensional objects is commonly specified by three Cartesian coordinates, denoted by x, y, and z. It is
sometimes useful, however, to augment this description with a fourth coordinate
which may be considered as a scale factor. This results in a set of homogeneous
coordinates, which will be denoted here by the symbols wx, wy, wz, and w.
The relationship between the homogeneous coordinates and the original ordinary
coordinates is given by

\[
x = \frac{wx}{w}, \tag{1}
\]

\[
y = \frac{wy}{w}, \tag{2}
\]

and

\[
z = \frac{wz}{w}. \tag{3}
\]

In general, a homogeneous representation refers to any n-dimensional space
problem which is described by an (n + 1)-dimensional space problem. The term
homogeneous comes from the fact that the mathematical equations used to
describe such problems contain no explicit constants. The motivation for using
such a representation lies in the fact that many geometrical problems possess
simpler solutions in the higher dimensional space which can then be projected
back into the original representation.

A simple example which illustrates one such benefit of homogeneous coordinates
is the ease with which infinity can be represented. When using ordinary
coordinates there is no satisfactory representation of infinity. However, in
homogeneous coordinates, infinity can be represented by w = 0 with the ratio
of the other three coordinates preserved in order to specify direction. While
the ability to represent infinity may not be significant for many applications,
it does serve to illustrate some of the practical issues involved in physically
storing positional coordinates in a fixed-bit computer representation. In this
respect, the scale factor w represents a convenient method of implementing a
tradeoff between resolution and dynamic range given a fixed number of bits for
the representation.

From (1) to (3) it is clear that for any given set of ordinary coordinates there
is a one-dimensional infinity of homogeneous coordinate representations, one for
each non-zero value of w. Geometrically, one can consider the mapping from
four-dimensional homogeneous coordinate space to three-dimensional ordinary
coordinate space as a projection of a point through the origin onto the
hyperplane w = 1. This non-linear projective transformation is the key to
producing perspective displays for computer graphic simulations.
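
The mapping in (1) to (3) can be sketched in a few lines of Python. This is only an illustration, not part of the original formulation: the function names `to_homogeneous`, `to_cartesian`, and `direction_at_infinity` are hypothetical, and points are assumed to be plain tuples of floats.

```python
import math


def to_homogeneous(p, w=1.0):
    """Return the homogeneous coordinates (wx, wy, wz, w) of p for a chosen w."""
    x, y, z = p
    return (w * x, w * y, w * z, w)


def to_cartesian(h):
    """Recover (x, y, z) from (wx, wy, wz, w) by dividing by w, as in (1) to (3)."""
    wx, wy, wz, w = h
    if w == 0:
        raise ValueError("w = 0 represents a point at infinity")
    return (wx / w, wy / w, wz / w)


def direction_at_infinity(h):
    """For w = 0, return the unit direction fixed by the ratio wx : wy : wz."""
    wx, wy, wz, w = h
    if w != 0:
        raise ValueError("w must be 0 for a point at infinity")
    norm = math.sqrt(wx * wx + wy * wy + wz * wz)
    return (wx / norm, wy / norm, wz / norm)


p = (1.0, 2.0, 3.0)
# Different scale factors give different 4-tuples for the same ordinary point.
recovered = [to_cartesian(to_homogeneous(p, w)) for w in (1.0, 2.0, -0.5)]
```

Every entry of `recovered` is the same ordinary point, which is the one-dimensional family of representations described above.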
As stated above, the use of homogeneous coordinates in robotics is usually
restricted such that the scale factor w is set identically equal to 1. Thus in
the remainder of this work the homogeneous coordinates representing a
three-dimensional position will be given by the vector notation
$[x \; y \; z \; 1]^T$. Although at first glance this representation may appear
trivial, it still provides significant advantages when relating the positions
and orientations of objects through the use of homogeneous transformations,
which is the topic of the next section.

2 Homogeneous Transformations

Homogeneous transformations are linear transformations which are instrumental
in relating the representation of positions and orientations from one coordinate
system to another. This mapping between coordinate systems is represented by
a 4 × 4 matrix, A, which can be partitioned in the following manner:

\[
A = \begin{bmatrix} R_{3 \times 3} & p_{3 \times 1} \\ f_{1 \times 3} & s_{1 \times 1} \end{bmatrix} \tag{4}
\]

The 3 × 3 orthogonal matrix R is sometimes called the rotation matrix and
is composed of the direction cosines which relate the two coordinate systems.
This matrix can be considered to be composed of three columns as follows:

\[
R = \begin{bmatrix} n_x & o_x & a_x \\ n_y & o_y & a_y \\ n_z & o_z & a_z \end{bmatrix} \tag{5}
\]

where these columns represent the coordinate axes of one system with respect
to another. The notational convention comes from the terms approach vector,
used for the z axis, orientation vector, for the y axis, and normal vector, for
the x axis, the last denoting the common normal required to fully specify a
right-handed coordinate system [Pau81]. The term rotation matrix results from
its ability to specify an arbitrary rotation transformation about any axis
which goes through the origin. In particular, the three basic rotation matrices
which represent a rotation of $\theta$ about the x, y, and z axes,
respectively, are given by the matrices

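\[
R(x, \theta) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}, \tag{6}
\]

\[
R(y, \theta) = \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}, \tag{7}
\]

and

\[
R(z, \theta) = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{8}
\]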
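
As an illustration, the basic rotation matrices (6) to (8) can be written directly as nested lists. The sketch below assumes angles in radians and row-major 3 × 3 lists; the helper names (`rot_x`, `mat_mul`, and so on) are hypothetical and chosen only for this example.

```python
import math


def rot_x(theta):
    """Basic rotation of theta radians about the x axis, equation (6)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0],
            [0.0, c, -s],
            [0.0, s, c]]


def rot_y(theta):
    """Basic rotation of theta radians about the y axis, equation (7)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]


def rot_z(theta):
    """Basic rotation of theta radians about the z axis, equation (8)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def mat_mul(a, b):
    """Product of two 3 x 3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    """Transpose of a 3 x 3 matrix."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def mat_vec(m, v):
    """Product of a 3 x 3 matrix and a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


# A quarter turn about z carries the x axis onto the y axis.
rotated_x_axis = mat_vec(rot_z(math.pi / 2), [1.0, 0.0, 0.0])
```

Multiplying any of these matrices by its transpose gives the identity to within rounding, which is the orthonormality property referred to in the text.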

Since any orientation change can be represented by a single rotation about a
fixed axis, it is sometimes useful to represent a rotation of $\theta$ about an
arbitrary axis k by the transformation

\[
R(k, \theta) = \begin{bmatrix}
k_x^2 + (1 - k_x^2)\cos\theta & k_x k_y (1 - \cos\theta) - k_z \sin\theta & k_x k_z (1 - \cos\theta) + k_y \sin\theta \\
k_x k_y (1 - \cos\theta) + k_z \sin\theta & k_y^2 + (1 - k_y^2)\cos\theta & k_y k_z (1 - \cos\theta) - k_x \sin\theta \\
k_x k_z (1 - \cos\theta) - k_y \sin\theta & k_y k_z (1 - \cos\theta) + k_x \sin\theta & k_z^2 + (1 - k_z^2)\cos\theta
\end{bmatrix} \tag{9}
\]

where $k_x$, $k_y$, and $k_z$ are the direction cosines of the axis of rotation
[Pau81]. Equation (9) clearly reduces to (6) to (8) when k is equal to one of
the coordinate axes.
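
A minimal sketch of (9) follows, assuming k is already of unit length and the angle is in radians. The function name `rot_axis` is hypothetical.

```python
import math


def rot_axis(k, theta):
    """Rotation of theta radians about the unit axis k, equation (9)."""
    kx, ky, kz = k
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c  # the factor (1 - cos theta) shared by the off-diagonal terms
    return [
        [kx * kx + (1 - kx * kx) * c, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [kx * ky * v + kz * s, ky * ky + (1 - ky * ky) * c, ky * kz * v - kx * s],
        [kx * kz * v - ky * s, ky * kz * v + kx * s, kz * kz + (1 - kz * kz) * c],
    ]


# With k along z the result has the same entries as the basic rotation (8).
about_z = rot_axis((0.0, 0.0, 1.0), 0.3)

# A 120 degree turn about the (1, 1, 1) diagonal permutes the coordinate axes.
d = 1.0 / math.sqrt(3.0)
about_diagonal = rot_axis((d, d, d), 2.0 * math.pi / 3.0)
```

The first column of `about_diagonal` is approximately (0, 1, 0), meaning the x axis is carried onto the y axis.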
In robotics work the R matrix is usually restricted to its interpretation as
a rotation transformation, which is a result of the orthonormality constraint.
It should be noted, however, that this is a special case of a more general
transformation. In particular, the diagonal elements of this matrix can produce
a scaling transformation on the x, y, and z axes independently, with the
off-diagonal elements producing shear in three dimensions [RA76].
At this point it is instructive to consider the fact that without the use of
homogeneous coordinates this would be the limit of the available transformation
operations. All of the above transformations are such that the position of the
origin is preserved; that is, there is no provision for translational
transformations. However, by virtue of using homogeneous coordinates, the 3 × 1
column vector p in the homogeneous transformation matrix can represent an
arbitrary three-dimensional translation between coordinate systems. This is
clearly illustrated in the following equation, which utilizes the basic
translation matrix:

\[
\begin{bmatrix} x + p_x \\ y + p_y \\ z + p_z \\ 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 & p_x \\ 0 & 1 & 0 & p_y \\ 0 & 0 & 1 & p_z \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \tag{10}
\]
The remaining components of the homogeneous transformation matrix are
the 1 × 3 row vector f and the scalar s. The vector f is useful in specifying
perspective transformations by modifying the homogeneous coordinate component
w to be a function of the other three coordinates. This results in a non-linear
projective transformation when the three-dimensional coordinates are normalized
by dividing by the scale factor w. While used extensively when generating
computer graphic displays, in robotics work the vector f is usually defined as
zero. Likewise, the scalar s, which represents an overall scale factor in the
transformation, is typically restricted to 1 so that the overall homogeneous
transformation preserves a unity value for the fourth homogeneous coordinate.
Due to the above considerations, it is not unusual to see the 4 × 4 homogeneous
transformation A stored as a set of four vectors, namely n, o, a, and p, with
the fourth row implied.
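
The storage scheme just described can be sketched as follows. The function names `make_transform` and `transform_point` are hypothetical, and the example frame (a quarter turn about z plus a translation) is chosen only for illustration.

```python
def make_transform(n, o, a, p):
    """Assemble the 4 x 4 matrix A from the columns n, o, a, and p.

    The fourth row (f = 0, s = 1) is implied and filled in here.
    """
    return [
        [n[0], o[0], a[0], p[0]],
        [n[1], o[1], a[1], p[1]],
        [n[2], o[2], a[2], p[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(A, v):
    """Apply A to the ordinary point v by way of the vector [x y z 1]^T."""
    h = [v[0], v[1], v[2], 1.0]
    out = [sum(A[i][j] * h[j] for j in range(4)) for i in range(4)]
    return out[:3]  # the fourth component stays 1 because f = 0 and s = 1


# Frame rotated a quarter turn about z and displaced by p.
n = (0.0, 1.0, 0.0)
o = (-1.0, 0.0, 0.0)
a = (0.0, 0.0, 1.0)
p = (10.0, 0.0, 5.0)
A = make_transform(n, o, a, p)
q = transform_point(A, (1.0, 0.0, 0.0))
```

Here `q` is the point one unit along the rotated x axis (the vector n), offset by p.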
As stated above, a homogeneous transformation A specifies a mapping from
one coordinate system to another. It is sometimes useful, however, to obtain
the inverse mapping, $A^{-1}$. Due to the orthonormality of the rotation matrix
and the restriction on the fourth row of A, this inverse is easily obtained by
applying the equation

\[
A^{-1} = \begin{bmatrix}
 & & & -p \cdot n \\
 & R^T & & -p \cdot o \\
 & & & -p \cdot a \\
0 & 0 & 0 & 1
\end{bmatrix} \tag{11}
\]
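
Equation (11) avoids a general matrix inversion: only a transpose and three dot products are needed. A small sketch, with hypothetical helper names and the same tuple-based storage of n, o, a, and p assumed above:

```python
def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def make_transform(n, o, a, p):
    """The 4 x 4 matrix A with columns n, o, a, p and fourth row [0 0 0 1]."""
    return [
        [n[0], o[0], a[0], p[0]],
        [n[1], o[1], a[1], p[1]],
        [n[2], o[2], a[2], p[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def invert_transform(n, o, a, p):
    """The inverse of A from equation (11): rows of R^T and -p.n, -p.o, -p.a."""
    return [
        [n[0], n[1], n[2], -dot(p, n)],
        [o[0], o[1], o[2], -dot(p, o)],
        [a[0], a[1], a[2], -dot(p, a)],
        [0.0, 0.0, 0.0, 1.0],
    ]


def mat_mul4(x, y):
    """Product of two 4 x 4 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


n, o, a = (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
p = (10.0, 0.0, 5.0)
product = mat_mul4(make_transform(n, o, a, p), invert_transform(n, o, a, p))
```

The variable `product` is the 4 × 4 identity, confirming that (11) undoes the original mapping for this example.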
In order to avoid ambiguity when using homogeneous transformation matrices to
describe the relationships between coordinate systems, it is conventional to
attach the names of the related coordinate systems to the variable specifying
the transformation. The name of the source coordinate system is typically
appended as a subscript, with the destination coordinate system added as a
preceding superscript [Pau81]. Thus the transformation denoted by ${}^0A_1$
refers to the transformation A which maps objects described in coordinate
system 1 to coordinate system 0. Likewise, vectors are preceded by a
superscript which identifies the particular coordinate system with respect to
which they are specified. Thus the descriptions of a vector v in two different
coordinate systems can be related by the equation

\[
{}^0v = {}^0A_1 \, {}^1v. \tag{12}
\]

3 Orientation Specification

The specification of position is usually taken for granted as being represented
as a three-dimensional vector. The specification of orientation, however,
requires more careful consideration since there exists no vector representation
for orientations which satisfies the rules of vector algebra. The rotation
matrix R, which consists of the direction cosines of one coordinate system with
respect to another, is a convenient method of orientation specification which
is used extensively in robotics work. It is not, however, without its
disadvantages. One such disadvantage is the high degree of redundancy implicit
in this representation. Since orientation can be shown to possess only three
independent degrees of freedom, it can be represented by three variables. The
extra six elements in the rotation matrix are accounted for by the
orthonormality constraint, which imposes six conditions: three to maintain unit
length of the columns (rows) and three to maintain orthogonality of the columns
(rows). This results not only in extra data storage but also creates
difficulties when interpolating between two orientations.

In addition to the above difficulties, the rotation matrix specification of
orientation does not explicitly represent the axis or angle of rotation required
to achieve the given orientation. This axis and angle are, of course, implicitly
included in the matrix R and can be computed by setting (5) equal to (9) and
solving for them in terms of the matrix elements. This results in the angle of
rotation given by

\[
\tan\theta = \frac{\sqrt{(o_z - a_y)^2 + (a_x - n_z)^2 + (n_y - o_x)^2}}{n_x + o_y + a_z - 1} \tag{13}
\]

and the axis by

\[
k_x = \frac{o_z - a_y}{2 \sin\theta}, \tag{14}
\]

\[
k_y = \frac{a_x - n_z}{2 \sin\theta}, \tag{15}
\]

and

\[
k_z = \frac{n_y - o_x}{2 \sin\theta}. \tag{16}
\]

The above equations clearly reflect the fact that the axis of rotation is not
physically well defined for small angles of rotation. Note, however, that the
same ill-conditioning occurs when $\theta$ approaches 180°, a case which is
physically well defined. To avoid these ill-conditioned equations, an alternate
formulation can be applied [Pau81]. In any event, the above discussion
illustrates some of the drawbacks inherent in a rotation matrix specification
of orientation.
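
The extraction in (13) to (16) can be sketched as below. Two choices here are assumptions of this example rather than part of the text: the two-argument arctangent `math.atan2` is used so that the angle lands in the range 0 to 180°, and a tolerance `eps` decides when the axis is reported as ill-conditioned. The function name is hypothetical.

```python
import math


def axis_angle_from_matrix(R, eps=1e-9):
    """Return (theta, k) for a rotation matrix R using (13) to (16).

    R is a row-major 3 x 3 list whose columns are n, o, and a.
    """
    nx, ox, ax = R[0]
    ny, oy, ay = R[1]
    nz, oz, az = R[2]
    # Numerator and denominator of (13) equal 2 sin(theta) and 2 cos(theta).
    two_sin = math.sqrt((oz - ay) ** 2 + (ax - nz) ** 2 + (ny - ox) ** 2)
    two_cos = nx + oy + az - 1.0
    theta = math.atan2(two_sin, two_cos)
    if two_sin < eps:
        # sin(theta) is close to zero: theta is near 0 or near 180 degrees.
        raise ValueError("axis is ill-conditioned for this rotation")
    k = ((oz - ay) / two_sin, (ax - nz) / two_sin, (ny - ox) / two_sin)
    return theta, k


c, s = math.cos(0.3), math.sin(0.3)
theta, k = axis_angle_from_matrix([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
```

For the identity matrix the function raises `ValueError`, which mirrors the statement that the axis is not well defined for small angles.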

3.1 Euler Angles

Euler angles, sometimes referred to as orientation angles, are a compact method
of specifying orientation which is commonly used in a number of different fields.
This specification consists of the rotation angles around three independent axes
which are required to achieve a desired orientation. Unfortunately, the choice
of axes about which these rotations are performed is somewhat arbitrary resulting in as many as 24 different possibilities. A useful taxonomy is presented by
Kane [KLL83] in which each case is classified as belonging to one of four sets,
each of which has six members. These sets are differentiated by two independent properties. The first property determines whether the rotations are being
performed around an axis fixed in space or fixed in the body undergoing the
rotation. The second property is based on whether the rotations are performed
around three distinct axes or whether the third axis of rotation is identical to
the first. Thus the four sets are known as space-three angles, body-three angles,
space-two angles, and body-two angles.
One set of space-three angles commonly used in the aeronautical field, and
which is sometimes encountered in robotics work, is one in which the three
consecutive rotations are applied to the x, y, and z axes in that order. These
angles are commonly referred to as roll, pitch, and yaw, indicating the amount
of rotation about the z, y, and x axes, respectively. The rotation matrix for an
orientation specified in this manner can be obtained by multiplying the
appropriate basic rotation matrices (see (6) to (8)) in the proper order. In
this case, the resultant rotation matrix is given by

\[
R(\phi, \theta, \psi) = R(z, \phi) \, R(y, \theta) \, R(x, \psi) \tag{17}
\]

\[
= \begin{bmatrix}
\cos\phi \cos\theta & \cos\phi \sin\theta \sin\psi - \sin\phi \cos\psi & \cos\phi \sin\theta \cos\psi + \sin\phi \sin\psi \\
\sin\phi \cos\theta & \sin\phi \sin\theta \sin\psi + \cos\phi \cos\psi & \sin\phi \sin\theta \cos\psi - \cos\phi \sin\psi \\
-\sin\theta & \cos\theta \sin\psi & \cos\theta \cos\psi
\end{bmatrix} \tag{18}
\]

where $\phi$, $\theta$, and $\psi$ correspond to roll, pitch, and yaw, respectively.
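
The closed form (18) can be checked numerically against the product (17). The sketch below assumes angles in radians; all function names are hypothetical.

```python
import math


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rpy_to_matrix(phi, theta, psi):
    """Closed form (18): roll phi about z, pitch theta about y, yaw psi about x."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [
        [cf * ct, cf * st * sp - sf * cp, cf * st * cp + sf * sp],
        [sf * ct, sf * st * sp + cf * cp, sf * st * cp - cf * sp],
        [-st, ct * sp, ct * cp],
    ]


phi, theta, psi = 0.4, -0.7, 1.2
closed_form = rpy_to_matrix(phi, theta, psi)
# Equation (17): the same matrix built from the three basic rotations.
chained = mat_mul(mat_mul(rot_z(phi), rot_y(theta)), rot_x(psi))
```

The two matrices agree element by element to within rounding error.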


An examination of the transformation from rotation matrix specification
of orientation to Euler angle representation serves to illustrate some of the
difficulties associated with Euler angles. By equating the elements of (5) with
those of (18), the values of roll, pitch, and yaw for a specified rotation matrix
are given by

\[
\phi = \arctan\left(\frac{n_y}{n_x}\right), \tag{19}
\]

\[
\theta = \arctan\left(\frac{-n_z}{n_x \cos\phi + n_y \sin\phi}\right), \tag{20}
\]

\[
\psi = \arctan\left(\frac{o_z}{a_z}\right). \tag{21}
\]

Note that the above equations are singular when $\cos\theta = 0$. This type of
difficulty is not unique to the roll, pitch, and yaw convention but is shared by
all 24 possible combinations. All sets of space-three and body-three angles have
a singularity when the cosine of the second angle is equal to zero. Likewise,
all sets of space-two and body-two angles have a singularity when the sine of
the second angle is equal to zero. Physically, this occurs because these cases
result in an alignment of the first axis of rotation with the third axis of
rotation. Thus the first and third angles of rotation are not independent and
cannot be distinguished, a case often referred to as gimbal lock. Consequently,
for some applications two sets of Euler angles are used, one from the three-axis
set and one from the two-axis set, with the least ill-conditioned set of
equations being employed. In addition to the above difficulties, there is also
no simple way of combining two rotations which are described in terms of Euler
angles.
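
A sketch of (19) to (21) is given below, together with a check for the singular case. Two details are assumptions of this example: `math.atan2` replaces the single-argument arctangent so that quadrants are resolved, and the tolerance `eps` used to detect $\cos\theta = 0$ is arbitrary. Function names are hypothetical.

```python
import math


def rpy_to_matrix(phi, theta, psi):
    """Closed form (18), repeated here so the example is self-contained."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [
        [cf * ct, cf * st * sp - sf * cp, cf * st * cp + sf * sp],
        [sf * ct, sf * st * sp + cf * cp, sf * st * cp - cf * sp],
        [-st, ct * sp, ct * cp],
    ]


def matrix_to_rpy(R, eps=1e-9):
    """Roll, pitch, and yaw from a rotation matrix using (19) to (21)."""
    nx, ox, ax = R[0]
    ny, oy, ay = R[1]
    nz, oz, az = R[2]
    # hypot(nx, ny) equals |cos(theta)|; zero means the equations are singular.
    if math.hypot(nx, ny) < eps:
        raise ValueError("cos(theta) = 0: roll and yaw cannot be separated")
    phi = math.atan2(ny, nx)
    theta = math.atan2(-nz, nx * math.cos(phi) + ny * math.sin(phi))
    psi = math.atan2(oz, az)
    return phi, theta, psi


angles = (0.4, -0.7, 1.2)
recovered = matrix_to_rpy(rpy_to_matrix(*angles))
```

For a pitch of exactly 90° the function raises `ValueError`, which is the gimbal lock condition described above.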

3.2 Quaternions

Since the axis and angle of rotation required to achieve a given orientation have
a physically meaningful interpretation, they too are used to specify orientation.
While the magnitude of the angle of rotation can be used to scale the axis of
rotation, these quantities are frequently represented as the pair $(\theta, k)$
where k is of unit length. The vector $u'$ which is the result of applying a
rotation of $\theta$ to the vector u around the axis k is then given by

\[
u' = u \cos\theta + (k \times u) \sin\theta + k (k \cdot u)(1 - \cos\theta). \tag{22}
\]

Unfortunately, the equations for combining two rotations specified in this
manner are unduly complex. For this reason, representations which scale the
pair by trigonometric functions of $\theta/2$ are commonly employed. One such
representation, called the Rodrigues vector [KLL83], is a result of multiplying
the axis of rotation k by the quantity $\tan(\theta/2)$. Unfortunately, this
representation can result in numerical inaccuracy since the magnitude of the
Rodrigues vector becomes infinite as $\theta$ approaches 180°.
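
Equation (22) and the growth of the Rodrigues vector can both be illustrated briefly. The sketch assumes a unit axis k, radians, and hypothetical function names.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def rotate_axis_angle(u, k, theta):
    """Rotate u by theta radians about the unit axis k using (22)."""
    c, s = math.cos(theta), math.sin(theta)
    kxu = cross(k, u)
    kdu = dot(k, u)
    return tuple(u[i] * c + kxu[i] * s + k[i] * kdu * (1.0 - c) for i in range(3))


def rodrigues_vector(k, theta):
    """The axis k scaled by tan(theta / 2)."""
    t = math.tan(theta / 2.0)
    return tuple(t * ki for ki in k)


z_axis = (0.0, 0.0, 1.0)
rotated = rotate_axis_angle((1.0, 0.0, 0.0), z_axis, math.pi / 2)
# The scaling factor tan(theta / 2) blows up as theta approaches 180 degrees.
near_half_turn = rodrigues_vector(z_axis, math.pi - 1e-6)
```

The vector `rotated` is the y axis to within rounding, while the z component of `near_half_turn` is already in the millions.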
Ideally, one would like a representation which explicitly contains the
geometric information of axis and angle of rotation in a numerically
well-conditioned manner along with a straightforward method of combining
successive rotations. Such a representation is possible through the use of
quaternions, a mathematical entity composed of a scalar and vector pair denoted
here as (s, v). The axis and angle of rotation required to achieve an
orientation can be represented by the unit quaternion
$(\cos(\theta/2), k \sin(\theta/2))$, where the magnitude of a quaternion is
defined as

\[
|(s, v)|^2 = s^2 + v \cdot v. \tag{23}
\]

The vector portion of this pair is sometimes called the Euler vector, with the
set of four scalar quantities also known as Euler parameters [KLL83].
For calculations involving quaternions, the following four operations are
defined:

\[
(s_1, v_1) \circ (s_2, v_2) = (s_1 s_2 - v_1 \cdot v_2, \; s_1 v_2 + s_2 v_1 + v_1 \times v_2) \tag{24}
\]

\[
(s_1, v_1) + (s_2, v_2) = (s_1 + s_2, \; v_1 + v_2) \tag{25}
\]

\[
s_1 (s_2, v_2) = (s_1 s_2, \; s_1 v_2) \tag{26}
\]

\[
(s, v)^{-1} = \frac{(s, -v)}{|(s, v)|^2} \tag{27}
\]

where quaternion multiplication is denoted by $\circ$ in order to avoid
confusion with either the vector dot or cross product. Using the above
definitions, it can be easily shown that quaternion multiplication results in
rotation. Given an arbitrary vector u in quaternion notation (0, u),
premultiplication by the unit quaternion $(\cos(\theta/2), k \sin(\theta/2))$
and postmultiplication by its inverse results in

\[
(\cos(\theta/2), k \sin(\theta/2)) \circ (0, u) \circ (\cos(\theta/2), -k \sin(\theta/2)) \tag{28}
\]

\[
= (-\sin(\theta/2) \, k \cdot u, \; \cos(\theta/2) \, u + \sin(\theta/2) \, k \times u) \circ (\cos(\theta/2), -k \sin(\theta/2)) \tag{29}
\]

\[
= (0, \; u \cos\theta + (k \times u) \sin\theta + k (k \cdot u)(1 - \cos\theta)) \tag{30}
\]

which is identical to (22). The associativity of quaternion multiplication
allows successive rotations to be combined by simply multiplying their
respective quaternion representations in the same manner as homogeneous
transformations are combined. Thus the resultant effect of the n rotations
denoted by $(s_i, v_i)$ can be calculated using

\[
(s, v) = (s_1, v_1) \circ (s_2, v_2) \circ \cdots \circ (s_n, v_n). \tag{31}
\]

Quaternion multiplication, once again like matrix multiplication, is not
commutative.
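
The operations (24) and (27) and the rotation in (28) to (30) can be sketched as follows. A quaternion is assumed to be stored as a pair `(s, v)` with `v` a 3-tuple; all function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def q_mul(q1, q2):
    """Quaternion product, equation (24)."""
    s1, v1 = q1
    s2, v2 = q2
    vec = tuple(s1 * b + s2 * a + c for a, b, c in zip(v1, v2, cross(v1, v2)))
    return (s1 * s2 - dot(v1, v2), vec)


def q_inv(q):
    """Quaternion inverse, equation (27)."""
    s, v = q
    n2 = s * s + dot(v, v)
    return (s / n2, tuple(-x / n2 for x in v))


def q_from_axis_angle(k, theta):
    """Unit quaternion (cos(theta/2), k sin(theta/2)) for a unit axis k."""
    h = theta / 2.0
    return (math.cos(h), tuple(math.sin(h) * x for x in k))


def q_rotate(q, u):
    """Rotate u by q: premultiply (0, u) by q, postmultiply by its inverse."""
    return q_mul(q_mul(q, (0.0, tuple(u))), q_inv(q))[1]


qx = q_from_axis_angle((1.0, 0.0, 0.0), math.pi / 2)
qz = q_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
y_axis = (0.0, 1.0, 0.0)
# In a product the right-hand factor acts on the vector first.
x_then_z = q_rotate(q_mul(qz, qx), y_axis)
z_then_x = q_rotate(q_mul(qx, qz), y_axis)
```

The two results differ (approximately the z axis and the negative x axis, respectively), which illustrates the lack of commutativity.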


The elegance of the above formulation also results in practical advantages
due to the reduced number of computations as compared to rotation matrix
multiplication [Tay79]. Conversion from quaternions to rotation matrices is
relatively straightforward. By substituting the definition of a unit quaternion
into (9), the formula for a rotation matrix in terms of a unit quaternion (s, v)
can be obtained:

\[
R = \begin{bmatrix}
1 - 2(v_y^2 + v_z^2) & 2(v_x v_y - s v_z) & 2(v_x v_z + s v_y) \\
2(v_x v_y + s v_z) & 1 - 2(v_x^2 + v_z^2) & 2(v_y v_z - s v_x) \\
2(v_x v_z - s v_y) & 2(v_y v_z + s v_x) & 1 - 2(v_x^2 + v_y^2)
\end{bmatrix} \tag{32}
\]
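
Equation (32) translates directly into code. The sketch assumes a unit quaternion passed as a scalar `s` and a 3-tuple `v`; the function name is hypothetical.

```python
import math


def quat_to_matrix(s, v):
    """Rotation matrix for the unit quaternion (s, v), equation (32)."""
    vx, vy, vz = v
    return [
        [1 - 2 * (vy * vy + vz * vz), 2 * (vx * vy - s * vz), 2 * (vx * vz + s * vy)],
        [2 * (vx * vy + s * vz), 1 - 2 * (vx * vx + vz * vz), 2 * (vy * vz - s * vx)],
        [2 * (vx * vz - s * vy), 2 * (vy * vz + s * vx), 1 - 2 * (vx * vx + vy * vy)],
    ]


# A rotation of 0.3 rad about z gives the same entries as the basic matrix (8).
angle = 0.3
R = quat_to_matrix(math.cos(angle / 2), (0.0, 0.0, math.sin(angle / 2)))
```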
The conversion from rotation matrices to quaternions, however, requires slightly
more thought. Four formulations for the conversion are presented in Table ??.
The choice of which formulation to use is based on isolating the Euler parameter
with the largest magnitude by choosing the appropriate linear combination of
the matrix diagonal. Since the squares of the four parameters sum to one, at
least one of them is guaranteed to have a magnitude greater than or equal to
0.5, so ill-conditioned equations can be avoided.
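
The table itself is not reproduced here, so the following is only one common way of realising four such formulations, derived by inverting (32); it should not be read as the table's exact entries. The function name and the list `four_sq` are hypothetical.

```python
import math


def matrix_to_quat(R):
    """Return a unit quaternion (s, v) for the rotation matrix R.

    The Euler parameter with the largest magnitude is found from the
    diagonal first, and the other three are divided by it.
    """
    (nx, ox, ax), (ny, oy, ay), (nz, oz, az) = R
    # Four times the square of s, vx, vy, and vz, from the diagonal of (32).
    four_sq = [1 + nx + oy + az,
               1 + nx - oy - az,
               1 - nx + oy - az,
               1 - nx - oy + az]
    i = max(range(4), key=lambda j: four_sq[j])
    big = 0.5 * math.sqrt(four_sq[i])  # magnitude is at least 0.5
    d = 4.0 * big
    if i == 0:
        return (big, ((oz - ay) / d, (ax - nz) / d, (ny - ox) / d))
    if i == 1:
        return ((oz - ay) / d, (big, (ox + ny) / d, (ax + nz) / d))
    if i == 2:
        return ((ax - nz) / d, ((ox + ny) / d, big, (ay + oz) / d))
    return ((ny - ox) / d, ((ax + nz) / d, (ay + oz) / d, big))


# A half turn about x has s = 0, so the second formulation is selected.
half_turn_about_x = matrix_to_quat([[1.0, 0.0, 0.0],
                                    [0.0, -1.0, 0.0],
                                    [0.0, 0.0, -1.0]])
```

The result is determined only up to sign, since (s, v) and (-s, -v) describe the same rotation; this sketch returns the one whose largest parameter is positive.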

4 Conclusion

Due to the advantages of homogeneous coordinates which have been outlined
in the above sections, they are used exclusively throughout this work for
describing three-dimensional positions. The scale factor is set identically
equal to one and is frequently suppressed in the notation. Homogeneous
transformations are employed for specifying end effector position and
orientation and describing the relationship between various coordinate systems.
These transformations are stored as four three-dimensional vectors with the
fourth row implied. Despite the redundancy in the rotation matrix of the
transformation, this representation is perhaps the most physically intuitive
for visualizing the orientation of the end effector due to the explicit
representation of the approach and orientation vectors. For calculating the
results of successive rotations, however, the quaternion representation has a
decided advantage. The explicit information of axis and angle of rotation,
provided in a compact representation, justifies the use of quaternions for all
rotation computations. Quaternions are also the most desirable format for
interpolating between sets of given orientations to obtain a rotational
velocity trajectory.

References
[KLL83] T R Kane, P W Likins, and D A Levinson. Spacecraft Dynamics.
McGraw-Hill, New York, 1983.
[Max51] E A Maxwell. General Homogeneous Coordinates in Space of Three
Dimensions. Cambridge University Press, London, 1951.
[Max63] E A Maxwell. The Methods of Plane Projective Geometry Based on
the Use of General Homogeneous Coordinates. Cambridge University
Press, London, 1963.
[NS79] W M Newman and R F Sproull. Principles of Interactive Computer
Graphics. McGraw-Hill, New York, 1979.
[Pau81] R P Paul. Robot Manipulators: Mathematics, Programming, and
Control. MIT Press, Cambridge, Mass., 1981.
[RA76] D F Rogers and J A Adams. Mathematical Elements for Computer
Graphics. McGraw-Hill, New York, 1976.
[Tay79] R H Taylor. Planning and execution of straight line manipulator
trajectories. IBM Journal of Research and Development, 23(4):424-436,
1979.
