
Computer Graphics

Alexandre Hardy
August 24, 2006
Written in LaTeX, diagrams in MetaPost
Chapter 1
Introduction

Display models
- Rectangular array of pixels (colours)
- Vector graphics (lines)
- Polygonal modelling, usually:
  - planar (the polygon lies in a plane and is not bent)
  - convex (a line intersects the polygon in no more than two locations)
  - simple (edges do not cross)
Right hand coordinate system
[Diagram: right-hand coordinate system with x, y and z axes meeting at the origin (0, 0, 0)]
Note: a point is not a pixel!
Some issues in computer graphics
- Hidden surfaces
  - Painter's algorithm (limitations)
  - Z-Buffer
  - Polygon face orientations (backface/frontface)

Double Buffering for Animation
Animation: a timed display of several images
- Use two images for display
- Render to one while the other is displayed
- Avoids shearing
Chapter 2
Transformations and Viewing
The rendering pipeline

Modelling → View Selection → Perspective Division → Displaying

- Modelling (vertices, lines, polygons) + transforms
- View Selection (camera / viewpoint)
- Perspective division (homogeneous 3D coordinate → 2D coordinate)
- Displaying (physical drawing on screen or device)
2D Transformations
A transformation on R² is any mapping A : R² → R². That is, each point x ∈ R² is mapped to a unique point, A(x), also in R².
Definition 1 Let A be a transformation. A is a linear transformation if and only if
- For all α ∈ R and all x ∈ R², A(αx) = αA(x).
- For all x, y ∈ R², A(x + y) = A(x) + A(y).

This implies that A(0) = 0 since A(0x) = 0A(x). To apply a transform to an object, we apply the transform to each of the vertices in the object.
What is the difference between a point and a vector?
Definition 2 A transformation A is a translation if ∃u ∈ R² so that ∀x ∈ R², A(x) = x + u. A translation moves all vectors or points by a fixed amount.
Definition 3 An affine transformation is a transformation that can be written as A(x) = T(L(x)) where L is a linear transform and T is a translation. This can also be written as A = T_u ∘ L, i.e.

A(x) = L(x) + u

Proposition 4 If A is an affine transform then u and L are uniquely determined by A.

Proof A(0) = T_u(L(0)) = T_u(0) = u. So u is uniquely determined. L = T_u⁻¹ ∘ A = T₋u ∘ A and L is uniquely determined.
Let x̂ = (1, 0) and ŷ = (0, 1); then any vector v can be written as v = (v_x, v_y) = v_x x̂ + v_y ŷ. Now we can express any linear transform (but not an affine one) as a matrix

M = [ m₁₁ m₁₂ ]
    [ m₂₁ m₂₂ ]

Vectors are actually column vectors, and so we should write (0, 1)ᵀ. We use (0, 1) as a convenient shorthand for the column vector.

Let u = A(x̂) and v = A(ŷ), then

M = [ u₁ v₁ ]
    [ u₂ v₂ ]
A rotation is a transformation that rotates a point around the origin by a fixed angle θ. A rotation can be written as

R_θ = [ cos θ  −sin θ ]
      [ sin θ   cos θ ]

We have the useful properties R_θ⁻¹ = R₋θ and R_{θ+φ} = R_θ R_φ.
Definition 5 A rigid transformation is a transformation that preserves
- Distances between points
- Angles between lines

A transform can also be orientation preserving, in which case the direction of an angle is preserved after transformation. Examples of rigid transforms are rotation and translation.

A linear transform is a rigid transform if for all x, y ∈ R², x·y = A(x)·A(y).
Theorem 6 Every rigid, orientation preserving linear transformation is a rotation.

Proof Let (a, b) = A(x̂). From rigidity we have A(x̂)·A(x̂) = a² + b² = 1. A(ŷ) is obtained by rotating A(x̂) by 90°, so A(ŷ) = (−b, a). The transform can thus be expressed as the matrix

[ a  −b ]
[ b   a ]

Since cos²θ + sin²θ = 1 and a² + b² = 1 there must be an angle θ so that a = cos θ and b = sin θ. This is clearly a rotation.
Corollary 7 Every rigid, orientation-preserving affine transformation can be (uniquely) expressed as the composition of a translation and a rotation.

Definition 8 A generalised rotation is a rotation about an arbitrary point u. A generalised rotation can be written as R_θᵘ = T_u R_θ T₋u.
Theorem 9 Every rigid, orientation preserving, affine transformation is either a translation or a generalised rotation.

Proof Let A be a rigid, orientation preserving, affine transformation. Let u = A(0). We have two cases:
- u = 0. Then A is a linear transformation and, by Theorem 6, A is a rotation.
- u ≠ 0. We must prove that A is a translation or has a fixed point v, i.e. A(v) = v. We consider a line L that contains 0 and u. Again we consider two cases:
  - A maps L to itself. Rigidity implies that distance is preserved, so ‖u − 0‖ = ‖u‖ = ‖A(u) − A(0)‖. Since A(0) = u we must have A(u) = u + u or A(u) = 0. If A(u) = u + u then A = T_u since angles are preserved (rigid transform). If A(u) = 0 then v = ½u is a fixed point of A, since v is on L and distance preservation implies that v is the point halfway between A(u) and A(0) (which is v).
  - A maps L to L′. Let L make an angle θ with L′. Let L₂ and L₂′ be perpendicular lines to L at 0 and u. Let L₃ and L₃′ be L₂ and L₂′ rotated by −θ/2 and θ/2 respectively. The angle between L and L₃ is equal to the angle between L′ and L₃′. This can be seen since the angle between L and L₃ is 90° − θ/2, and the angle between L′ and L₃′ is 90° + θ/2 − θ = 90° − θ/2. The line L₃ is mapped to L₃′ by A. L₃ and L₃′ intersect at v (they are not parallel). v is equidistant from u and 0 by construction and so A(v) = v.

[Diagram: the lines L, L₂, L₂′, L₃ and L₃′, with 0, u = A(0), A(u), the fixed point v and the angle θ/2]
Homogeneous coordinates

Definition 10 If x, y, w ∈ R and w ≠ 0, then (x, y, w) is a homogeneous coordinate representation of the point (x/w, y/w).

The homogeneous representation allows us to represent affine transforms as a matrix, i.e. we can represent translation as a matrix.
Any affine transform A can be expressed as A = T_u ∘ B where B is a linear transform. Now consider the matrix

N = [ a b e ]
    [ c d f ]
    [ 0 0 1 ]

Now we multiply the homogeneous coordinate (x, y, 1) by N:

[ a b e ][ x ]   [ ax + by + e ]   [ B(x, y)ᵀ + u ]
[ c d f ][ y ] = [ cx + dy + f ] = [              ]
[ 0 0 1 ][ 1 ]   [      1      ]   [       1      ]

where B is the linear transform with matrix [a b; c d] and u = (e, f)ᵀ. We also have the useful property:

N (αx, αy, α)ᵀ = (α(ax + by + e), α(cx + dy + f), α)ᵀ
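As a small concrete illustration (a sketch; the function name and row-major layout are my own, not from the notes), applying such a matrix N to a 2D point in C:

#include <stdio.h>

/* Apply a 3x3 homogeneous matrix N (row-major) to the point (x, y, 1).
   For an affine N the last row is (0 0 1), so w stays 1; we divide by
   w anyway to handle a general homogeneous result. */
static void apply_affine(const double N[3][3], double x, double y,
                         double *ox, double *oy)
{
    double xh = N[0][0]*x + N[0][1]*y + N[0][2];  /* ax + by + e */
    double yh = N[1][0]*x + N[1][1]*y + N[1][2];  /* cx + dy + f */
    double w  = N[2][0]*x + N[2][1]*y + N[2][2];  /* 1 for affine */
    *ox = xh / w;
    *oy = yh / w;
}

int main(void)
{
    /* Scale by 2 followed by translation by (3, 4): A = T_u B. */
    const double N[3][3] = { {2, 0, 3}, {0, 2, 4}, {0, 0, 1} };
    double x, y;
    apply_affine(N, 1.0, 1.0, &x, &y);
    printf("(%g, %g)\n", x, y);   /* prints (5, 6) */
    return 0;
}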
Projective geometry

A projective geometry is a system of points and lines that satisfy the following axioms (amongst others):
- Any two distinct points lie on exactly one line.
- Any two distinct lines contain exactly one common point.

We construct a projective geometry from R² as follows: for each pair of parallel lines add a new point at infinity to each of the parallel lines. Also add the line at infinity that consists of all points at infinity.

A line L can be specified by a starting point u and a direction v:

{u + αv : α ∈ R} = {(u_x + αv_x, u_y + αv_y) : α ∈ R}.

A possible homogeneous representation of the line is (u_x/α + v_x, u_y/α + v_y, 1/α).
- As α → ∞ the triple approaches (v_x, v_y, 0). This is the point at infinity.
- As α → −∞ the triple approaches (−v_x, −v_y, 0). This is equivalent to the first point since multiplying by −1 yields the above point.

Two triples (x, y, w) and (x′, y′, w′) are equivalent if there is an α ≠ 0 such that x = αx′, y = αy′ and w = αw′. The class of all points equivalent to (x, y, w) is written as (x, y, w)_P. A projective line L can now be defined as

L = {(x, y, w)_P : ax + by + cw = 0, x, y, w not all zero}.
3D Transformations
For three dimensions we have coordinates of the form (x, y, z) with 0 = (0, 0, 0). A transformation on R³ is any mapping from R³ to R³. Linear transforms, translations and affine transforms are defined almost identically to those for 2D coordinates. Three dimensional homogeneous coordinates are expressed as (x, y, z, w), which represents the point (x/w, y/w, z/w) in R³. A linear transform can now be represented by a matrix

M = [ u₁ v₁ w₁ ]
    [ u₂ v₂ w₂ ]
    [ u₃ v₃ w₃ ]

A rigid transform preserves distances between points and angles between lines. An equivalent definition is that a transform is rigid if it preserves dot products, i.e. A(x)·A(y) = x·y.

Theorem 11 M = (u v w) represents a rigid transform if and only if ‖u‖ = ‖v‖ = ‖w‖ = 1 and u·v = v·w = u·w = 0. This implies that M⁻¹ = Mᵀ.

Proof In class.

Definition 12 An orientation-preserving transformation is one that preserves right-handedness. A is orientation preserving if (A(u) × A(v)) · A(u × v) > 0 for all non-collinear u, v (u ≠ v).
We can now use homogeneous 4×4 matrices to represent affine transforms (linear transform + translation):

[ x ]     [ a b c u ][ x ]
[ y ]  ↦  [ d e f v ][ y ]
[ z ]     [ g h i w ][ z ]
[ 1 ]     [ 0 0 0 1 ][ 1 ]
Rotation matrix

The transform to rotate by an angle θ around the axis u (a unit vector) is given by

R_{θ,u} = [ (1−c)u₁² + c       (1−c)u₁u₂ − su₃    (1−c)u₁u₃ + su₂    0 ]
          [ (1−c)u₁u₂ + su₃    (1−c)u₂² + c       (1−c)u₂u₃ − su₁    0 ]
          [ (1−c)u₁u₃ − su₂    (1−c)u₂u₃ + su₁    (1−c)u₃² + c       0 ]
          [ 0                  0                  0                  1 ]

with c = cos θ and s = sin θ. We can derive the rotation matrix by analysing w = R_{θ,u}v. Let v = v₁ + v₂ where v₁ is parallel to u and v₂ is orthogonal to u.

We have
- v₁ = (u·v)u = u(u·v) = u(uᵀv) = (uuᵀ)v. Define Proj_u := uuᵀ.
- v₁ = Proj_u v and v₂ = (I − Proj_u)v.
- R_{θ,u} v₁ = v₁.
- v₃ = u × v₂ = u × v.
- R_{θ,u} v₂ = (cos θ)v₂ + (sin θ)v₃. This is a straightforward 2D rotation using the axis system we have built with v₂ and v₃ as axes.
Therefore

R_{θ,u} v = R_{θ,u} v₁ + R_{θ,u} v₂
          = v₁ + (cos θ)v₂ + (sin θ)v₃
          = Proj_u v + (cos θ)(I − Proj_u)v + (sin θ)(u × v).

Letting

M_u = [  0   −u₃   u₂ ]
      [  u₃   0   −u₁ ]
      [ −u₂   u₁   0  ]

we have

R_{θ,u} v = [Proj_u + (cos θ)(I − Proj_u) + (sin θ)M_u] v
          = [(1 − cos θ)Proj_u + (cos θ)I + (sin θ)M_u] v

which after simplification and conversion to a homogeneous matrix yields the desired result.
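A sketch of how this formula might be coded (the helper name and row-major layout are my own):

#include <math.h>

/* Fill a row-major 4x4 homogeneous matrix R with the rotation by
   angle theta (radians) about the unit axis u = (u1, u2, u3). */
void rotation_matrix(double R[4][4], double theta, const double u[3])
{
    double c = cos(theta), s = sin(theta), t = 1.0 - c;

    R[0][0] = t*u[0]*u[0] + c;      R[0][1] = t*u[0]*u[1] - s*u[2];
    R[0][2] = t*u[0]*u[2] + s*u[1]; R[0][3] = 0.0;

    R[1][0] = t*u[0]*u[1] + s*u[2]; R[1][1] = t*u[1]*u[1] + c;
    R[1][2] = t*u[1]*u[2] - s*u[0]; R[1][3] = 0.0;

    R[2][0] = t*u[0]*u[2] - s*u[1]; R[2][1] = t*u[1]*u[2] + s*u[0];
    R[2][2] = t*u[2]*u[2] + c;      R[2][3] = 0.0;

    R[3][0] = R[3][1] = R[3][2] = 0.0; R[3][3] = 1.0;
}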
Note that multiplying matrices all the time can cause inaccuracies, especially with small values of θ. You should keep the original rotation of the object and determine one transform that is required to move the object to its desired position and orientation. Continuously deforming the object will destroy it because of numerical inaccuracy. Rotation matrices can become shear matrices after a while when many matrices are multiplied together. The matrices should be normalised in some way. In this sense a quaternion offers some advantages.
Quaternions
Quaternions are an extension to complex numbers.
- We define i² = j² = k² = −1 and ij = k, jk = i, ki = j, ji = −k, kj = −i, ik = −j.
- Quaternions are of the form ai + bj + ck + d where a, b, c, d ∈ R.
- This can be written in short as (u, v) with u = d and v = ai + bj + ck.
- Two quaternions are equal if their components are equal.
- α(u, v) = (αu, αv).
- Multiplication is defined by q₁q₂ = (u₁u₂ − v₁·v₂, u₁v₂ + u₂v₁ + v₁×v₂).
- We define ‖q‖ = √(a² + b² + c² + d²).
- We have q(r + s) = qr + qs.
- We define the conjugate q* = (u, −v).
- q⁻¹q = 1, therefore q⁻¹ = (1/‖q‖²) q*.

Unit quaternions can be used to represent rotation.
- To rotate around the axis (unit vector) u by an angle of 2θ, construct the unit quaternion q = (cos θ, (sin θ)u).
- To rotate a homogeneous coordinate p (which can be directly translated to a quaternion), calculate p′ = qpq⁻¹ = qpq*.
- Compound rotations can be applied. We have the useful property (q₁q₂)* = q₂*q₁*.
Advantages? Numerical stability?
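A minimal C sketch of these operations (struct layout and names are my own, not from the notes):

#include <math.h>

typedef struct { double w, x, y, z; } Quat;  /* (u, v): u = w, v = (x, y, z) */

/* q1 q2 = (u1 u2 - v1.v2, u1 v2 + u2 v1 + v1 x v2). */
Quat quat_mul(Quat a, Quat b)
{
    Quat r;
    r.w = a.w*b.w - (a.x*b.x + a.y*b.y + a.z*b.z);
    r.x = a.w*b.x + b.w*a.x + (a.y*b.z - a.z*b.y);
    r.y = a.w*b.y + b.w*a.y + (a.z*b.x - a.x*b.z);
    r.z = a.w*b.z + b.w*a.z + (a.x*b.y - a.y*b.x);
    return r;
}

/* Rotate the point p by angle 2*theta about the unit axis u: p' = q p q*. */
void quat_rotate(const double u[3], double theta, double p[3])
{
    Quat q  = { cos(theta), sin(theta)*u[0], sin(theta)*u[1], sin(theta)*u[2] };
    Quat qc = { q.w, -q.x, -q.y, -q.z };   /* conjugate = inverse for unit q */
    Quat pq = { 0.0, p[0], p[1], p[2] };   /* embed the point as a quaternion */
    Quat r  = quat_mul(quat_mul(q, pq), qc);
    p[0] = r.x; p[1] = r.y; p[2] = r.z;
}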
Slerp

Theorem 13 (Euler's theorem) If A is a rigid orientation-preserving linear transformation of R³, then A is the same as some rotation R_{θ,u}.

Proof Similar to Theorem 9, but we consider the mapping of all points on the unit sphere.
Three Dimensional Projective Geometry
Once again we create a point at infinity for parallel lines, so that the new lines in the projective geometry intersect at a point. We also create a line at infinity (consisting of points at infinity) for parallel planes. And so we get a plane at infinity consisting of all the points and lines at infinity. Clearly any two planes intersect in a line.

The homogeneous coordinate is thus of the form (x, y, z, w). If w = 0 then we have a point at infinity.
Viewing Transformations and Perspective
Using the homogeneous coordinate (x, y, z, w) we apply perspective division to obtain the point in R³:

(x, y, z, w) → (x/w, y/w, z/w).

Coordinates are mapped into a 2×2×2 unit cube centred at the origin (the canonical view volume). Simple scaling and translating can then be used to map the cube into the final position on the screen, where the primitives can be rasterized. We are interested in two kinds of viewing transforms:
- Orthographic projection: relative size and angles are preserved.
- Perspective projection: objects appear to become smaller the greater the distance to the object.
Orthographic Viewing Transformations
If we wish to preserve relative sizes and angles, then if we look down the z-axis at two points that only differ in the z component, it appears that these two points are in the same position. This leads to the transform

P_o = [ 1 0 0 0 ]
      [ 0 1 0 0 ]
      [ 0 0 0 0 ]
      [ 0 0 0 1 ]

where the z component is simply discarded. Since visibility information is also discarded, this transform is rarely useful. We rather scale the points into the canonical view volume (2×2×2 cube). First we decide on the visible space by selecting f, n, t, b, l and r to select the view volume defined by

l ≤ x ≤ r,  b ≤ y ≤ t,  n ≤ z ≤ f.

After defining the view volume, we can use the constraints as clipping planes, which define what is considered in the view volume and what is not. Now the transform can be defined in terms of a scale and translate to obtain the 2×2×2 cube centred at the origin:
P_o = S_o T_o

    = [ 2/(r−l)   0         0        0 ] [ 1 0 0 −(l+r)/2 ]
      [ 0         2/(t−b)   0        0 ] [ 0 1 0 −(t+b)/2 ]
      [ 0         0         2/(f−n)  0 ] [ 0 0 1 −(f+n)/2 ]
      [ 0         0         0        1 ] [ 0 0 0    1     ]

    = [ 2/(r−l)   0         0        −(r+l)/(r−l) ]
      [ 0         2/(t−b)   0        −(t+b)/(t−b) ]
      [ 0         0         2/(f−n)  −(f+n)/(f−n) ]
      [ 0         0         0          1          ]
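A sketch in C of building P_o (the function name and row-major layout are my own):

/* Orthographic projection matrix mapping [l,r]x[b,t]x[n,f] onto the
   2x2x2 canonical view volume, as derived above. */
void ortho_matrix(double P[4][4], double l, double r,
                  double b, double t, double n, double f)
{
    int i, j;
    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++)
            P[i][j] = 0.0;
    P[0][0] = 2.0/(r - l);  P[0][3] = -(r + l)/(r - l);
    P[1][1] = 2.0/(t - b);  P[1][3] = -(t + b)/(t - b);
    P[2][2] = 2.0/(f - n);  P[2][3] = -(f + n)/(f - n);
    P[3][3] = 1.0;
}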
In certain documents (this one, for example) it is desirable to emphasise the z coordinate in the orthographic projection. This is traditionally done with a shear transform:

x′ = x + s_x z
y′ = y + s_y z

This is easily represented as a matrix:

H = [ 1 0 s_x 0 ]
    [ 0 1 s_y 0 ]
    [ 0 0  1  0 ]
    [ 0 0  0  1 ]

We can simply multiply by this matrix to achieve the desired effect. If we render a unit cube, we get

[Diagram: a unit cube drawn with sheared x, y and z axes]

instead of a square.
Perspective Transformations

If we consider the diagram

[Diagram: a vertex (x, y, z) projected through the origin 0 onto the viewscreen plane z = −d at the point (−d·x/z, −d·y/z, −d)]

we see that the transformed coordinate can be easily calculated using similar triangles to get

x′ = −d·x/z
y′ = −d·y/z
(Compare this to the raytracing document, where this computation is done implicitly.) What remains is the decision regarding the z-component. Since we are looking down the negative z axis, the notion of distance is not preserved by z′ = z. Also, we would like lines to map to lines when applying the perspective transform. This is useful for hardware to be able to linearly interpolate the calculated depth values. We use the following pseudo-distance function which has the required properties:

pd(z) = A + B/z.

Once again we can define a view volume, but in this case it is a frustum using the values l, r, n, f, t and b. Our target is to produce the canonical view volume, so

pd(−n) = A − B/n = −1  and  pd(−f) = A − B/f = 1.

Solving for A and B we get

A = (f + n)/(f − n)  and  B = 2fn/(f − n).

Using the homogeneous coordinate representation we can multiply by −z without affecting the point which is represented. We obtain

(x, y, z, 1) ↦ (d·x, d·y, −(A·z + B), −z).

The perspective division will yield the desired coordinates. Using the fact that (x, y, z, w) is the same point as (x/w, y/w, z/w, 1) we have

(x/w, y/w, z/w, 1) ↦ (d·x/w, d·y/w, −(A·(z/w) + B), −z/w).

Multiplying by w gives

(x, y, z, w) ↦ (d·x, d·y, −(A·z + B·w), −z).
The transform can be represented by the matrix

P = [ d 0  0   0 ]
    [ 0 d  0   0 ]
    [ 0 0 −A  −B ]
    [ 0 0 −1   0 ]
Projections are also useful for projecting the shadow of an object onto a plane. The matrix can be used as is after transforming the scene so that the plane is parallel to the xy plane and at a distance d from the origin. By using the values from the view frustum to scale and translate the points into the canonical view volume, we get the perspective projection matrix

P = [ 2n/(r−l)   0          (r+l)/(r−l)    0           ]
    [ 0          2n/(t−b)   (t+b)/(t−b)    0           ]
    [ 0          0         −(f+n)/(f−n)   −2fn/(f−n)   ]
    [ 0          0         −1              0           ]
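A sketch in C of building this matrix (the function name and row-major layout are my own):

/* Perspective projection matrix for the frustum l, r, b, t, n, f,
   matching the matrix above. */
void frustum_matrix(double P[4][4], double l, double r,
                    double b, double t, double n, double f)
{
    int i, j;
    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++)
            P[i][j] = 0.0;
    P[0][0] = 2.0*n/(r - l);  P[0][2] = (r + l)/(r - l);
    P[1][1] = 2.0*n/(t - b);  P[1][2] = (t + b)/(t - b);
    P[2][2] = -(f + n)/(f - n);
    P[2][3] = -2.0*f*n/(f - n);
    P[3][2] = -1.0;
}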
Figure 2.1 was rendered with r = 0.03, l = −0.03, t = 0.03, b = −0.03, n = 0.1 and f = 100 using the above perspective projection matrix.

[Figure 2.1: Perspective projection]

Note that OpenGL uses slightly different matrices to negate the z values obtained.
Chapter 3
Rasterization
In this section, we briefly take a look at rendering lines and polygons on raster displays. We assume that the viewing transforms in the previous chapter have been applied, so that only the x and y coordinates need be considered when deciding how pixels should be coloured.
Lines
We can use two approaches to rasterizing lines. In the first approach, we use a parametric description (explicit) of a line between two points p₁ = (x₁, y₁) and p₂ = (x₂, y₂),

p(t) = p₁ + t(p₂ − p₁).

The line can be drawn by plotting p(t) for successive values of t on the display. The distances between values of t should be sufficiently small so that there are no visible gaps in the line.
This proves to be inefficient. If x₂ − x₁ > y₂ − y₁ then we expect each column to have only one pixel coloured for the line. A similar case occurs if x₂ − x₁ < y₂ − y₁, in which case we expect only one pixel in each row to be coloured. Since these cases are so similar, we consider only the first case.
Using the implicit representation of the line

y = mx + c

where m = (y₂ − y₁)/(x₂ − x₁) and c = y₁ − mx₁, we can calculate the y coordinate for each column i on the display (between x₁ and x₂) as y = mi + c. For our algorithm we have
float m=(y2-y1)/(x2-x1);
float c=y1-m*x1;
float y;
int i;
for (i=x1; i<=x2; i++) {
y=m*i+c;
plot(i, y);
}
Note that plot will round (or truncate) the floating point values to discrete integers. We can avoid the multiply if we calculate y_{i+1} − y_i = m. This leads to
float m=(y2-y1)/(x2-x1);
float c=y1-m*x1;
float y=m*x1+c;
int i;
for (i=x1; i<=x2; i++) {
plot(i, y);
y=y+m;
}
Floating point operations are normally much more expensive than integer operations. In the following we assume that m > 0 and x₂ − x₁ > 0. The algorithm is easily adapted for m < 0. In the case of x₂ − x₁ < 0 we simply swap p₁ and p₂. We replace the floating point operations by integer operations by noting that f/(x₂ − x₁) ≥ 1 if and only if f ≥ x₂ − x₁. In other words, the increase in y is less than one if the cumulative increase so far (f) is such that f < x₂ − x₁. If f ≥ x₂ − x₁ then we must increase y by one and set f := f − (x₂ − x₁).
int dx=(x2-x1);
int dy=y2-y1;
int thresh=dx;
int ry=0;
int y=y1;
int i;
for (i=x1; i<=x2; i++) {
plot(i, y);
ry=ry+dy;
if (ry>=thresh) {
y=y+1;
ry=ry-dx;
}
}
If we allow for rounding up then

round(f/(x₂ − x₁)) ≥ 1  ⟺  ⌊f/(x₂ − x₁) + 0.5⌋ ≥ 1  ⟺  ⌊(f + 0.5(x₂ − x₁))/(x₂ − x₁)⌋ ≥ 1

if and only if

f + 0.5(x₂ − x₁) ≥ x₂ − x₁,  i.e.  f ≥ 0.5(x₂ − x₁).

We still set f := f − (x₂ − x₁). Only the criterion for incrementing y changes. This gives us Bresenham's algorithm:
int dx=(x2-x1);
int dy=y2-y1;
int thresh=dx/2;
int ry=0;
int y=y1;
int i;
for (i=x1; i<=x2; i++) {
plot(i, y);
ry=ry+dy;
if (ry>=thresh) {
y=y+1;
ry=ry-dx;
}
}
Polygons
Polygons can be rendered by interpolating the lines forming the edges of the polygon, to determine boundary edges for each row. The boundary coordinates can then be interpolated to fill the polygon. This allows colour to also be interpolated, which is useful for lighting, so that per pixel lighting calculations are approximated by the interpolation with a corresponding gain in speed. Colours, texture coordinates and z values may be interpolated using the Bresenham algorithm. This technique is known as Gouraud shading.
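As an illustration of the fill step (a simplified sketch, not from the notes: it assumes the x-extents and end colours of one scanline are already known, and plot_rgb is a hypothetical device primitive analogous to plot):

extern void plot_rgb(int x, int y, const float c[3]);

/* Fill one scanline from (xl, y) to (xr, y), linearly interpolating
   the RGB colours cl[] and cr[] found on the left and right edges. */
void fill_span(int y, int xl, int xr,
               const float cl[3], const float cr[3])
{
    int x, k;
    for (x = xl; x <= xr; x++) {
        float a = (xr == xl) ? 0.0f : (float)(x - xl) / (float)(xr - xl);
        float c[3];
        for (k = 0; k < 3; k++)
            c[k] = (1.0f - a)*cl[k] + a*cr[k];   /* (1-a)c1 + a c2 */
        plot_rgb(x, y, c);
    }
}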
Chapter 4
Lighting, Illumination and Shading
Shading
Shading refers to the process whereby the interior colour of the polygon is inferred (on a per pixel level) using the properties of the vertices. This allows us to treat polygonal objects as if they were curved (if necessary) and adds to the perceived shape of the object. There are two main shading techniques:

Gouraud Shading The colours computed at the vertices of the polygon are interpolated (perhaps using the Bresenham algorithm). The colour of a pixel is thus a linear combination of the vertices' colours. For a line with colours c₁ and c₂ at the end points, this would be (1 − α)c₁ + αc₂, where α is computed from the current screen coordinates.
Phong Shading The normals at the vertices of the polygon are interpolated. The interpolated normal is then used in the lighting model (per pixel lighting). We usually want a unit normal, so interpolation of the normal would be of the form

n_α = ((1 − α)n₀ + αn₁) / ‖(1 − α)n₀ + αn₁‖.
Since Phong shading applies the lighting model per pixel, whereas Gouraud shading applies the lighting model per vertex, the results for Phong shading are usually superior. Gouraud shading is much faster and usually implemented in hardware. Two polygons, one very large and another quite small, can map to precisely the same location on screen due to viewing transforms. In this case, a slanted large polygon would be expected to vary in colour differently to a smaller slanted polygon mapped to the same coordinates. These algorithms do not take this into account. A later section on interpolation will address this issue. One way to partially address this issue with Phong shading is to interpolate the x and y components of the normal, and calculate the z component so that the resulting vector is normalised.

Complex, curved objects can be approximated by polygons. If the normals at the vertices are adjusted so that they match the normals of the curved surface, Phong or Gouraud shading will enhance the appearance that the object is indeed curved.
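A small C sketch of the per-pixel normal interpolation above (the helper name is mine; n0 and n1 are assumed to be unit vertex normals):

#include <math.h>

/* n_a = ((1-a)n0 + a n1) / ||(1-a)n0 + a n1||, for a in [0,1].
   Assumes n0 != -n1, so the interpolated vector is never zero. */
void lerp_normal(const double n0[3], const double n1[3],
                 double a, double out[3])
{
    double len;
    int k;
    for (k = 0; k < 3; k++)
        out[k] = (1.0 - a)*n0[k] + a*n1[k];
    len = sqrt(out[0]*out[0] + out[1]*out[1] + out[2]*out[2]);
    for (k = 0; k < 3; k++)
        out[k] /= len;
}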
A Note on Affine Transforms and Normal Vectors

Using the homogeneous coordinate system, normal vectors can be represented as (n_x, n_y, n_z, 0). Using a 0 for the last component (w) allows us to use the normal 4×4 matrix for transforming both the vectors and the normals. For rigid body transforms, no translation will be applied. Linear transforms do not necessarily maintain angles between lines however (but rigid body transforms do), so after the transform the normal vector may not be perpendicular to the surface. We can however determine the transform that will leave the normal perpendicular.
Theorem 14 Let B be a linear transformation represented by the invertible matrix M. Let N equal (Mᵀ)⁻¹ = (M⁻¹)ᵀ. Let P be a plane and n be orthogonal to P. Then Nn is orthogonal to the image B(P) of the plane P under the map B.

Proof Suppose x is a vector lying in the plane P. Then n·x = 0. We need to show that (Nn)·(Mx) = 0. This follows immediately from

(Nn)·(Mx) = ((M⁻¹)ᵀn)·(Mx)
           = ((M⁻¹)ᵀn)ᵀ(Mx)
           = (nᵀM⁻¹)(Mx)
           = nᵀ(M⁻¹Mx)
           = nᵀx
           = n·x = 0

as required. We use the fact that xᵀy = x·y.
The Phong Lighting Model
We ignore colour for a large part of the following discussions, since light consists of several frequencies. We can consider specific frequencies and apply the lighting model to these frequencies. Superposition is then used to combine the results. This means that we can use a colour model such as RGB by individually calculating the colour components (red, green and blue frequencies) for each surface or light and then simply use the resulting colour for the colour of the pixel. Both the Phong lighting model and the Cook-Torrance lighting model are local lighting models. In other words, only the lighting contributions made by identified light sources are used in the calculations. Direct reflection from a light source to the eye is calculated, but not light reflecting off other surfaces. Global lighting models address these problems. Global lighting models include
- Ray-Tracing (can include Monte Carlo, photon mapping)
- Radiosity

They attempt to model lighting in a physically accurate way, but do not use physics based modelling. Only point light sources (emitting light equally in all directions) are dealt with, but we show later how this can be adapted for spot lights and directional lights. The model considers several properties of the light and object material. Each of these is discussed below.
Emissive properties

The object may not be considered a light source, but may emit light of its own. This means that the object will be visible even if there is no light source present. The intensity of the emitted light will be indicated by I_e.
Ambient properties

Ambient light refers to lighting of the object due to indirect lighting, in other words reflections of light from other objects. However, this is an approximation. A general level of ambient light I_a is contributed by each light to the scene. It may also be necessary to specify a general level of ambient light I_{a,global} irrespective of the number of lights in the scene.

[Figure 4.1: Diffuse reflection. (a) the vectors l and n at a surface point p; (b) and (c) light rays spaced l and l/cos θ apart as the angle of incidence changes]
Diffuse reflection

Diffuse reflection models matte surfaces. It models light that is reflected equally in all directions. The position of the viewer is, therefore, irrelevant. The position of the light source does affect the perceived intensity. Figure 4.1 (a) illustrates the necessary information.

The intensity with respect to diffuse reflection is given by

I_d = ρ_d I_d^in cos θ = ρ_d I_d^in (l·n)

where
- I_d^in is the diffuse component of the light source,
- ρ_d is the diffuse reflectivity coefficient of the material,
- l and n are unit vectors,
- n is the normal to the surface,
- l is the direction of the light source.
[Figure 4.2: Specular reflection and the reflection vector. (a) the vectors l, n and r = 2(n·l)n − l at p; (b) the vectors l, n, r, v and the angles θ and φ]

We normally measure light intensity in terms of energy flux per unit area. If we observe what happens in figure 4.1 (b) and (c), we see that if the light reflects off the surface at an angle, then the density of light rays hitting a unit area decreases. In this example we measure the density by looking at the distance between the light rays. This explains the cosine term. This is known as Lambert's Law.
Specular reflection

The specular reflection component models glossy surfaces, and in particular the specular highlight due to the light reflected from the light source. In this case light is not reflected equally in all directions. The light is reflected most in the vicinity of the reflection vector

r = 2(l·n)n − l.

The reflection vector is easily determined from figure 4.2 (a). The position of the viewer influences the perceived intensity. The closer the viewer is to the reflection vector, the brighter the perceived lighting. So we have

I_s = ρ_s I_s^in (cos φ)^f = ρ_s I_s^in (v·r)^f,

where
- I_s^in is the specular component of the light source,
- ρ_s is the specular reflectivity coefficient of the material,
- l, r, v and n are unit vectors,
- f is a factor that determines the size of the specular highlight (or reflectivity); f is determined experimentally,
- r is the direction of the reflected ray,
- v is the direction of the viewer.
We can avoid calculating the reflection vector if we note that there is a correspondence between the angle between v and r, and the angle between n and h, where

h = (l + v)/‖l + v‖.

The specular contribution is then given by

I_s = ρ_s I_s^in (cos θ)^f = ρ_s I_s^in (h·n)^f,

with

(v·r)^f ≈ (h·n)^{4f}.

Figure 4.3 shows different coefficients for the Phong model. In the left hand image, the specular reflection coefficient increases from left to right and the diffuse reflection coefficient increases from top to bottom. In the right hand image, the specular reflection coefficient increases from left to right and the specular highlight coefficient f increases from top to bottom.
[Figure 4.3: Phong lighting model]
Multiple coloured light sources

We can specify a material's properties in terms of ρ_s and ρ_d to get the right balance between matte and glossy surface. We can also specify how the light from the light source reacts with glossy and specular surfaces by specifying I_s^in and I_d^in. Although we would not realistically expect light to consist of a diffuse component and a specular component, it may be useful to model lights that emit white light with a blue element. If we consider different frequencies of light, then each coefficient can be represented as a vector. Each component of the vector is the coefficient (response) to that frequency. For example, a blue matte object could be specified by ρ_d = (0.0, 0.0, 0.9), ρ_s = (0.0, 0.0, 0.1) and f = 8, assuming an RGB colour model. Effects of multiple lights are additive, so we have

I = I_e + ρ_a ⊗ I_{a,global}^in + ρ_a ⊗ Σ_i I_a^{in,i}
        + ρ_d ⊗ Σ_i I_d^{in,i} (l_i · n)
        + ρ_s ⊗ Σ_i I_s^{in,i} (r_i · v)^f,

where ⊗ denotes component wise multiplication, i.e.

(r₁, g₁, b₁) ⊗ (r₂, g₂, b₂) = (r₁r₂, g₁g₂, b₁b₂).

Naturally, we only keep the diffuse contribution if l_i · n > 0, otherwise the light is behind the surface. Likewise, we only keep the specular contribution for each light if r_i · v > 0.
Attenuation, spot lights

Since light is an electromagnetic wave, we know that the energy must decrease with distance. This is known as attenuation. To model attenuation we multiply the contribution of each light source by

d = 1 / (a_c + a_l ‖l_i‖ + a_q ‖l_i‖²),

where
- a_c is the constant attenuation factor,
- a_l is the linear attenuation factor,
- a_q is the quadratic attenuation factor.

We can then choose attenuation that is suitable for our needs (but not necessarily realistic).
Spot lights are similar to point lights, except they have a direction in which the light is pointed and an exponential falloff factor which describes how tight the beam is. For each spotlight we can calculate the constant

c_spot = max(−l · l_dir, 0)^{c_exp},

where
- l_dir is the direction in which the spotlight is pointing,
- c_exp is the factor controlling the tightness of the beam.

Each spotlight's contribution is multiplied by c_spot to take into account the directional and falloff characteristics of the light.
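A combined attenuation and spotlight factor might be coded as follows (a sketch; names are mine, and L is the unnormalised vector from the surface point to the light so that ‖L‖ is the distance):

#include <math.h>

/* ldir is the unit spotlight direction, cexp the falloff exponent,
   ac/al/aq the attenuation factors. */
double light_factor(const double L[3], const double ldir[3],
                    double ac, double al, double aq, double cexp)
{
    double d2 = L[0]*L[0] + L[1]*L[1] + L[2]*L[2];
    double d  = sqrt(d2);
    double att = 1.0 / (ac + al*d + aq*d2);

    /* -L/d points from the light toward the surface; its dot product
       with the beam axis gives the spotlight cosine. */
    double c = -(L[0]*ldir[0] + L[1]*ldir[1] + L[2]*ldir[2]) / d;
    double spot = (c > 0.0) ? pow(c, cexp) : 0.0;
    return att * spot;
}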
Transparent objects

If objects are transparent, then we should use a coefficient of transparency ρ_t and a coefficient of reflectivity ρ_r to model the characteristics of the object. Now the colour of the surface will be ρ_r I + ρ_t T, where T is the intensity of the transmitted light. We expect that ρ_r + ρ_t = 1, but this may be adjusted to obtain the desired effects.
The Cook-Torrance Lighting Model

The Cook-Torrance lighting model can better represent a wider range of surface materials. It uses a microfacet model for the surface of the object. The object surface is assumed to consist of many facets that are perfect mirrors, as illustrated in figure 4.4. These facets are much smaller than the eye can see. If we model the interaction between the light and the facets, we model how the surface appears when lit. This lighting model tends to handle rough surfaces, metallic surfaces and changes in reflection due to grazing angles (angles nearly perpendicular to the normal) better than the Phong model.

[Figure 4.4: Microfacets]
Bidirectional reflectivity

We would like to describe how light from a source in the direction of the unit vector l is reflected in the direction of a unit vector v. We can use a single BRIDF (Bidirectional Reflected Intensity Distribution Function) to do so.

The parameters to the BRIDF function are
- the incoming direction l,
- the outgoing direction v,
- the colour or wavelength λ of the incoming light,
- the properties of the reflecting surface, including the normal and orientation.

We write the BRIDF function as BRIDF(l, v, λ) and omit the surface properties. The BRIDF function returns the ratio of the intensity of the outgoing light in direction v to the intensity of the incoming light pointed to by l.

The BRIDF has the potential to model more surface characteristics, such as anisotropic surfaces. Anisotropic surfaces have different reflective characteristics dependent on the direction of the incoming light. This may be caused by parallel grooves in the surface or other artifacts. Examples of anisotropic surfaces include some types of cloth (velvet), CDs, hair, feathers and fur. Although the BRIDF has the potential to simulate a wide range of materials, physical attributes such as subsurface scattering, polarisation and diffraction are not taken into account. Note that the Phong lighting model is a simple example of a BRIDF.
The Cook-Torrance Model

The Cook-Torrance model considers a surface to consist of small flat pieces called facets. The assumption is made that light hitting a microfacet is either reflected or can enter into the surface. The reflection is presumed to be a perfect reflection. Light that enters the surface is presumed to reflect many times internally before exiting the surface, in an unpredictable direction. This is considered to be the diffuse reflection.

The Cook-Torrance model also considers various components to the light:
- Ambient lighting: computed in the same way as the Phong model.
- Diffuse lighting: computed in the same way as the Phong model.
- Specular lighting: this considers the microfacets of the surface and is calculated by

  I_s = ((n·l)/(n·v)) · s · F · G · D · I_s^in,

  where
  - s is a scalar constant,
  - (n·l) is used to calculate the intensity of light hitting a unit area of the surface,
  - (n·v) is used to calculate the intensity of light leaving a unit area of the surface,
  - D = D(l, v) is the microfacet distribution term, or the fraction of microfacets that are oriented correctly for the specular reflection from l to v,
  - G = G(l, v) is the geometric term and measures the effect that shadowing and masking have on the outgoing light,
  - F = F(l, v, λ) is the Fresnel coefficient. This is a useful term for modelling lighting at grazing angles, possibly making the light more specular at grazing angles (near perpendicular to the normal).
The Microfacet Distribution Term

The amount of light reflected in the direction v is assumed to be proportional to the number of microfacets correctly oriented for a mirror reflection in that direction. We use the halfway vector

h = (l + v)/‖l + v‖

for this calculation. For perfect reflection the microfacet normal must be equal to h. We let ψ = cos⁻¹(h·n), where n is the surface normal (not the microfacet normal), so that we can write D = D(ψ). Possible functions include

- The Gaussian distribution function

  D(ψ) = c e^{−ψ²/m²}

- The Beckmann distribution

  D(ψ) = (1/(πm² cos⁴ψ)) e^{−tan²ψ/m²}

where c and m are constants selected for the material. Some books use 4 instead of π for the Beckmann distribution function.
The Geometric Surface Occlusion Term

This term takes into account shadowing and masking caused by the microfacets. To simplify the calculation, we assume that v, l and n are coplanar, and that the facets form symmetrical V shaped grooves (not necessarily of the same depth) as shown in figure 4.5. The tops of the grooves are assumed to be at the same height.

[Figure 4.5: V shaped grooves]

Figure 4.6 shows the different masking and shadowing combinations that can occur. In each diagram, 6 rays represent the full incoming light. If there are fewer rays, they have been shadowed or masked as illustrated. We do not need to take into account different V shaped grooves; only those that reflect perfectly need be considered. The fraction of V shaped grooves that do not reflect perfectly are removed by the microfacet distribution term (D). The classic Cook-Torrance model takes the minimum of light that is not shadowed, and light that is not masked.
[Figure 4.6: Shadowing and masking. (a) No shadowing or masking. (b) Only masking. (c) Only shadowing. (d) Both shadowing and masking.]

[Figure 4.7: Blinn's geometric lemma: the groove with points A, B, C, D, facet normals h and h′, surface normal n, viewing direction v and angles α and β]
We will instead follow the approach that Buss [2] recommends. To calculate the amount of shadowing and masking we need the following lemma by Blinn.

Lemma 15 We consider figure 4.7. Let ‖AB‖ be the distance from A to B; then

‖BC‖/‖AC‖ = 2(n·h)(n·v)/(h·v).

Proof We define h′ as the unit vector normal to the opposite side of the groove. Since the groove is symmetric, it is clear that h′ is the reflection of h around the normal to the surface n, so that h′ = 2(n·h)n − h. From the symmetry of the groove and the law of sines we have

‖AB‖/‖AC‖ = ‖AB‖/‖AD‖ = sin α / sin β.

We also have sin α = −v·h′ and sin β = v·h (each angle differs by 90° from the angle between v and the corresponding facet normal). Using these results we get:

‖BC‖/‖AC‖ = 1 − ‖AB‖/‖AC‖   (ratios of lengths must add up to one)
           = 1 − sin α / sin β
           = 1 + (v·h′)/(v·h)
           = 1 + (v·(2(n·h)n − h))/(v·h)
           = (v·h + v·(2(n·h)n − h))/(v·h)
           = (v·(2(n·h)n))/(v·h)
           = 2(n·h)(v·n)/(v·h)

as required.
Masking

Masking only occurs if v·h′ < 0. Figure 4.6(b) illustrates this clearly. The fraction of the side that is not masked is given by

2(n·h)(n·v)/(h·v)   (lemma).

Shadowing

Shadowing only occurs if l·h′ < 0. Figure 4.6(c) illustrates this clearly. The fraction of the side that is not in shadow is given by

2(n·h)(n·l)/(h·l)   (lemma).

If there is no shadowing or masking then G = 1. If there is only masking then

G = 2(n·h)(n·v)/(h·v).

If there is only shadowing, then according to Buss [2], there is no decrease in the intensity of the reflected light. Any light that is shadowed by the edge of one facet will be reflected by the facet just before it. So in this case G = 1.

If there is both shadowing and masking, then the decrease in intensity is the amount of masking of the unshadowed light, so that

G = (2(n·h)(n·v)/(h·v)) / (2(n·h)(n·l)/(h·l)) = (n·v)/(n·l)

since h·v = h·l, h being the half vector of v and l. G should be such that 0 ≤ G ≤ 1, so this formula is only applied if n·v < n·l.
Final formula

Our final formula for coplanar n, l and v is thus

G = 1                        if v·h′ ≥ 0 or n·v ≥ n·l
G = 2(n·h)(n·v)/(h·v)        if v·h′ < 0 and l·h′ ≥ 0
G = (n·v)/(n·l)              if v·h′ < 0, l·h′ < 0 and n·v < n·l

If the vectors are not coplanar, we project the normal onto the plane created by v and l to get

n₀ = ((n·l)l + (n·v)v − (v·l)(v·n)l − (v·l)(l·n)v) / (1 − (v·l)²).

In the formula above, we use m = n₀/‖n₀‖ instead of n. Although the geometric term is only applied to specular reflection in the Cook-Torrance model, it can be applied to diffuse reflection as well for non-Lambertian surfaces.
The Fresnel Term

The Fresnel equations describe what fraction of incident light is specularly reflected from a flat surface. For a particular wavelength λ, this can be defined in terms of a function T:

F(l, v, λ) = T(θ, η)

where θ = cos⁻¹(l·h), and η is the index of refraction of the surface. For materials that are not electrically conducting we have

T = (1/2) ( sin²(θ − φ)/sin²(θ + φ) + tan²(θ − φ)/tan²(θ + φ) )

which applies to unpolarised light. From Snell's law we know that

sin θ / sin φ = η.

If we let c = cos θ and g = √(η² + c² − 1), we find that g = η cos φ,

sin(θ − φ)/sin(θ + φ) = (g − c)/(g + c),

and

cos(θ − φ)/cos(θ + φ) = (c(g − c) + 1)/(c(g + c) − 1).

The Fresnel equation becomes

T = (1/2) · ((g − c)²/(g + c)²) · ( 1 + [c(g + c) − 1]²/[c(g − c) + 1]² ).

If η > 1 then g is well defined. If η < 1 then we can have total internal reflection and we should set T = 1. Conducting materials need an imaginary component (the extinction coefficient) in the index of refraction and are not considered here. Some values of η for different wavelengths and materials are listed in [2].
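A sketch of this Fresnel factor in C (the function name is mine; c = cos θ = l·h):

#include <math.h>

/* Fresnel factor T(theta, eta) for unpolarised light on a dielectric,
   as in the formula above. Returns 1.0 when g would be imaginary
   (total internal reflection). */
double fresnel_T(double c, double eta)
{
    double g2 = eta*eta + c*c - 1.0;
    if (g2 < 0.0)
        return 1.0;
    double g = sqrt(g2);
    double a = (g - c)/(g + c);
    double b = (c*(g + c) - 1.0)/(c*(g - c) + 1.0);
    return 0.5*a*a*(1.0 + b*b);
}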
Figure 4.8 compares Phong shading on the left and Cook-Torrance shading on the right.

[Figure 4.8: Phong compared to the Cook-Torrance lighting model]
Chapter 5
Averaging and interpolation
We often wish to calculate the value of a function at intermediate points. This can be approximated by taking the average value of the function at extreme points. We have already seen interpolation applied in Gouraud shading and Phong shading.

Linear interpolation

We parametrise a line segment between two points by

x(α) = (1 − α)x₁ + αx₂  or  x(α) = x₁ + α(x₂ − x₁)

for 0 ≤ α ≤ 1. Any point with 0 ≤ α ≤ 1 interpolates the end points. We can also extrapolate a value for the line with α > 1 or α < 0. We can determine a value for α given a point u on the line: from u − x₁ = α(x₂ − x₁),

α = ((u − x₁)·(x₂ − x₁)) / (x₂ − x₁)².

We can also linearly interpolate a function f(u) on a line segment x₁x₂ given the values at the end points f(x₁) and f(x₂). We express u as u = (1 − α)x₁ + αx₂ and linear interpolation for f gives

f(u) = (1 − α)f(x₁) + αf(x₂).

We can generalise interpolation to more than two points.
Definition 16 Let x₁, x₂, ..., x_k be points, and a₁, a₂, ..., a_k be real numbers. Then

a₁x₁ + a₂x₂ + ··· + a_k x_k

is a linear combination of x₁, x₂, ..., x_k. If Σᵢ aᵢ = 1 then this is called an affine combination of x₁, x₂, ..., x_k. If Σᵢ aᵢ = 1 and aᵢ ≥ 0 then this is called a weighted average of x₁, x₂, ..., x_k.
Theorem 17 Affine combinations are preserved under affine transformations. That is, if

a₁x₁ + a₂x₂ + ··· + a_k x_k

is an affine combination and A is an affine transformation then

a₁A(x₁) + a₂A(x₂) + ··· + a_k A(x_k) = A(a₁x₁ + a₂x₂ + ··· + a_k x_k).

The affine combination of the transformed points is the same as the affine combination of the points transformed. In other words, the combination is such that it is irrelevant whether an affine transformation is applied before or after forming the affine combination.
Proof An affine transformation A can be written as A(x) = B(x) + A(0) where B is a linear transform. So we have

A(a₁x₁ + a₂x₂ + ··· + a_k x_k)
  = B(a₁x₁ + a₂x₂ + ··· + a_k x_k) + A(0)
  = a₁B(x₁) + a₂B(x₂) + ··· + a_k B(x_k) + A(0)
  = a₁B(x₁) + a₂B(x₂) + ··· + a_k B(x_k) + Σᵢ aᵢ A(0)
  = a₁B(x₁) + a₁A(0) + a₂B(x₂) + a₂A(0) + ··· + a_k B(x_k) + a_k A(0)
  = a₁A(x₁) + a₂A(x₂) + ··· + a_k A(x_k).

Note that interpolation does not work with homogeneous coordinates!

[Figure 5.1: Barycentric coordinates: the triangle x, y, z with the point w on the edge xy and the point u on the segment wz]
To interpolate on three points (a triangle) we use barycentric coordinates. A point on the triangle is specified by

u = αx + βy + γz

with α + β + γ = 1 and α, β, γ all positive.

Theorem 18 Let x, y and z be non-collinear points and let T be the triangle formed by these three points.

(a) Let u be a point on T or in the interior of T. Then u can be expressed as a weighted average of the three vertices x, y, z with α, β, γ ≥ 0 and α + β + γ = 1.

(b) Let u be any point on the plane containing T. Then u can be expressed as an affine combination of the three vertices with α + β + γ = 1.

Proof

(a) If u is on an edge of T then it is a weighted average of those two vertices. So we consider the case where u is in the interior of T. Refer to figure 5.1. We form the line containing u and z. Let this line intersect xy at the point w. Then w can be written as

w = ax + by

with a + b = 1 and a, b ≥ 0. u is clearly a weighted average of w and z, so that

u = cw + dz

with c + d = 1 and c, d ≥ 0. So we have

u = cw + dz = c(ax + by) + dz = αx + βy + γz

with α = ac, β = bc and γ = d. We have c + d = 1 and a + b = 1, so that (a + b)c + d = 1. Thus ac + bc + d = 1 = α + β + γ. Since a, b, c and d are nonnegative, so are α, β and γ.

We note that α + β + γ = 1, so that one parameter can be eliminated by γ = 1 − α − β.

(b) The proof of part (b) is similar.
[Figure 5.2: Barycentric coordinates based on area: the triangle x, y, z divided by the interior point u into subtriangles of area A (opposite x), B (opposite y) and C (opposite z)]

Theorem 19 Let u be a point on the plane containing T. Then there are unique values for α, β and γ so that α + β + γ = 1 and u = αx + βy + γz.

Theorem 20 Suppose we have the situation illustrated in figure 5.2. Then

α = A/(A + B + C),  β = B/(A + B + C),  γ = C/(A + B + C).

We can calculate the barycentric coordinates using this theorem. We have the following result:

D = ‖½(z − x) × (y − x)‖
A = ‖½(z − y) × (u − y)‖
B = ‖½(z − x) × (u − x)‖
C = ‖½(y − x) × (u − x)‖

where D = A + B + C.
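A C sketch of Theorem 20 (the helper names are mine), valid when u lies inside the triangle; the ½ factors cancel in the ratios and are omitted:

#include <math.h>

static void sub3(const double a[3], const double b[3], double r[3])
{ r[0] = a[0]-b[0]; r[1] = a[1]-b[1]; r[2] = a[2]-b[2]; }

static double cross_len(const double a[3], const double b[3])
{
    double cx = a[1]*b[2] - a[2]*b[1];
    double cy = a[2]*b[0] - a[0]*b[2];
    double cz = a[0]*b[1] - a[1]*b[0];
    return sqrt(cx*cx + cy*cy + cz*cz);
}

/* Barycentric coordinates of u in the triangle (x, y, z) via the
   areas of the subtriangles opposite each vertex. */
void barycentric(const double x[3], const double y[3], const double z[3],
                 const double u[3],
                 double *alpha, double *beta, double *gamma)
{
    double e1[3], e2[3], A, B, C;
    sub3(z, y, e1); sub3(u, y, e2); A = cross_len(e1, e2);
    sub3(z, x, e1); sub3(u, x, e2); B = cross_len(e1, e2);
    sub3(y, x, e1); sub3(u, x, e2); C = cross_len(e1, e2);
    *alpha = A/(A + B + C);
    *beta  = B/(A + B + C);
    *gamma = C/(A + B + C);
}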
We can simplify this calculation by factoring out terms and noting that only the magnitude is required. Transforming the problem to avoid the cross product also avoids unnecessary calculation. Another technique is provided in the separate document concerning raytracing.

[Figure 5.3: Bilinear interpolation: the quadrilateral x, y, z, w with interior point u and the interpolation parameters α and β along the edges]
Bilinear and Trilinear Interpolation

We can interpolate four points as in figure 5.3 using

u = (1 − β)[(1 − α)x + αy] + β[(1 − α)w + αz]
  = (1 − α)[(1 − β)x + βw] + α[(1 − β)y + βz]
  = (1 − α)(1 − β)x + α(1 − β)y + αβz + (1 − α)βw

for 0 ≤ α ≤ 1 and 0 ≤ β ≤ 1. It is important to see that the results are the same irrespective of which edges are interpolated first. There are several theorems discussing bilinear interpolation when the vertices of the quadrilateral are not coplanar and not convex. We will assume that the vertices of the quadrilateral are coplanar and that the quadrilateral is convex. To test if a quadrilateral is convex we define

v₁ = y − x
v₂ = z − y
v₃ = w − z
v₄ = x − w.

If the quadrilateral is convex then (v₁ × v₂)·n, (v₂ × v₃)·n, (v₃ × v₄)·n and (v₄ × v₁)·n all have the same sign. In this case n is the normal to the quadrilateral.
Inverting bilinear interpolation

Obtaining the parameters α and β is very important for applications such as texture mapping. Buss [2] discusses bilinear interpolation of four points, including the case where the points are not coplanar. In this discussion, we will assume the points are coplanar. u is obtained by linearly interpolating two points on the border of the quadrilateral, namely s₁ and s₂. These points are also obtained by linear interpolation, that is

s₁(β) = x − βv₄
s₂(β) = y + βv₂

If u is obtained by linear interpolation of s₁ and s₂, then s₁, s₂ and u must be collinear, that is

0 = (s₁(β) − u) × (s₂(β) − u)
  = (x − βv₄ − u) × (y + βv₂ − u)
  = −(v₄ × v₂)β² − [v₄ × (y − u) + v₂ × (x − u)]β + (x − u) × (y − u).

This is a quadratic equation Aβ² + Bβ + C = 0 in β which can be solved using the formula

β = (−B ± √(B² − 4AC)) / (2A)
or, for better numerical stability (see Buss),

β = 2C / (−B + √(B² − 4AC)).

Naturally, we only want the answer for which 0 ≤ β ≤ 1.

Note that the quadratic equation actually provides us with three equations in the 3D case. One solution is to project the vectors onto a plane and then perform the computation. Another one is to determine which equation gives us the most stable solution (e.g. large absolute value for A) and solve for β accordingly. Once β has been determined we can compute α using

α = ((u − s₁(β))·(s₂(β) − s₁(β))) / (s₂(β) − s₁(β))².
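A sketch of the inversion for the planar 2D case, where each cross product reduces to a scalar (names are mine; it assumes a convex, non-degenerate quadrilateral so that a root with 0 ≤ β ≤ 1 exists):

#include <math.h>

static double cross2(const double a[2], const double b[2])
{ return a[0]*b[1] - a[1]*b[0]; }

/* Recover alpha and beta for the interior point u of the planar
   quadrilateral x, y, z, w. */
void invert_bilinear(const double x[2], const double y[2],
                     const double z[2], const double w[2],
                     const double u[2], double *alpha, double *beta)
{
    double v2[2] = { z[0]-y[0], z[1]-y[1] };   /* v2 = z - y */
    double v4[2] = { x[0]-w[0], x[1]-w[1] };   /* v4 = x - w */
    double xu[2] = { x[0]-u[0], x[1]-u[1] };
    double yu[2] = { y[0]-u[0], y[1]-u[1] };

    double A = -cross2(v4, v2);
    double B = -(cross2(v4, yu) + cross2(v2, xu));
    double C = cross2(xu, yu);
    double b;

    if (fabs(A) < 1e-12) {
        b = -C / B;                             /* equation is linear */
    } else {
        b = 2.0*C / (-B + sqrt(B*B - 4.0*A*C)); /* stable root */
        if (b < 0.0 || b > 1.0)                 /* take the other root */
            b = (-B + sqrt(B*B - 4.0*A*C)) / (2.0*A);
    }
    *beta = b;

    /* s1 = (1-b)x + b w, s2 = (1-b)y + b z; project u onto s1 s2. */
    {
        double s1[2] = { x[0]-b*v4[0], x[1]-b*v4[1] };
        double s2[2] = { y[0]+b*v2[0], y[1]+b*v2[1] };
        double d[2]  = { s2[0]-s1[0], s2[1]-s1[1] };
        *alpha = ((u[0]-s1[0])*d[0] + (u[1]-s1[1])*d[1])
               / (d[0]*d[0] + d[1]*d[1]);
    }
}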
Trilinear interpolation

We generalise bilinear interpolation to three dimensions by interpolating 8 points. We expect that the 8 points should be arranged in the form of a rectangular prism. We define trilinear interpolation as

u(α, β, γ) = Σ_{i,j,k} w_i(α) w_j(β) w_k(γ) x_{i,j,k}

with i, j, k ∈ {0, 1} and, for n ∈ {0, 1},

w_n(ν) = 1 − ν  if n = 0
w_n(ν) = ν      if n = 1

Trilinear interpolation is important for real-time graphics and mipmapping. See [1] for further details.
Convex sets

Definition 21 Let A be a set of points in R^d. The set A is convex if and only if for any two points x and y in A, the line segment joining x and y is entirely in A.

Definition 22 The convex hull of A is the smallest convex set containing A.

Definition 23 Let A be a set and x a point. We say that x is a weighted average of points in A if and only if there is a finite set of points y₁, ..., y_k in A such that x is equal to a weighted average of y₁, ..., y_k.

Theorem 24 The convex hull of A is precisely the set of points that are weighted averages of points in A.

Proof We follow a rather informal approach; see [2] for a formal proof.

We consider any two points in A. The line segment joining the two points consists of weighted averages of these two points. Since these points are by definition in A and the convex hull of A, the convex hull of A contains all weighted averages of the points in A. Suppose the convex hull of A contains a point x that is not a weighted average of the points in A. This would imply that every point in the convex hull can be expressed as a weighted average of x and other points in A. We know that A is a subset of the convex hull, and that all elements of A can be expressed as a weighted average of points in A. So if x is in the convex hull, then the convex hull is not the smallest convex set containing A, contrary to the definition.
Interpolation and homogeneous coordinates

We note that every point x in R³ can be represented by the homogeneous coordinate (wx, w). If we consider the weighted sum

α₁(w₁x₁, w₁) + ··· + α_k(w_k x_k, w_k)
  = (α₁w₁x₁, α₁w₁) + ··· + (α_k w_k x_k, α_k w_k)
  = (α₁w₁x₁ + ··· + α_k w_k x_k, α₁w₁ + ··· + α_k w_k)
  ≡ ( (α₁w₁x₁ + ··· + α_k w_k x_k) / (α₁w₁ + ··· + α_k w_k), 1 ),

it is clear that the homogeneous values w_i change the importance, or relative weight, of each term. This is a weighted sum, since the coefficients of the x_i,

γ_i = α_i w_i / (α₁w₁ + ··· + α_k w_k),

sum to 1, and each coefficient is in the interval [0, 1]. This will be useful later, to allow Bezier curves and splines to model conic sections.
Hyperbolic interpolation

We see that interpolation of homogeneous coordinates does not provide the expected result. So given the homogeneous coordinates (x_i, w_i) representing the points y_i = x_i/w_i, and the affine combination

z = Σ_i α_i y_i  with  Σ_i α_i = 1,

can we determine coefficients β_i such that

Σ_i β_i (x_i, w_i)

is a homogeneous representation of z? From the previous section it is evident that we can use

β_i = (α_i/w_i) / (Σ_j α_j/w_j).

First we note that x_i α_i/w_i = y_i α_i. Secondly, we desire a weighted average, which can be achieved by dividing by the sum of the coefficients Σ_j α_j/w_j. Hyperbolic interpolation (or rational linear interpolation) can be used when interpolating in screen space, to take into account perspective distortion.
Spherical linear interpolation

In some situations, we wish to interpolate between a unit vector x and a unit vector y. At each stage of the interpolation we expect a unit vector. Thus, we want to rotate along the geodesic (shortest path on a unit sphere) between x and y. Simply normalising the weighted average results in a nonconstant rate of interpolation along the geodesic. Instead we must find a formula to rotate by an angle αφ, where 0 ≤ φ ≤ 180° is the angle between x and y. We let v be the component of y that is perpendicular to x, and w the unit vector in the direction of v:

v = y − (cos φ)x = y − (y·x)x
w = v/sin φ = v/√(v·v)

Now spherical linear interpolation can be defined as

slerp(x, y, α) = cos(αφ)x + sin(αφ)w.

We simplify as follows:

slerp(x, y, α) = cos(αφ)x + sin(αφ)(y − (cos φ)x)/sin φ
  = (cos(αφ) − sin(αφ) cos φ / sin φ)x + (sin(αφ)/sin φ)y
  = ((sin φ cos(αφ) − sin(αφ) cos φ)/sin φ)x + (sin(αφ)/sin φ)y
  = (sin(φ − αφ)/sin φ)x + (sin(αφ)/sin φ)y
  = (sin((1 − α)φ)/sin φ)x + (sin(αφ)/sin φ)y

Spherical linear interpolation can be used for quaternions to interpolate from one orientation to another smoothly.
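A sketch of the final formula in C for unit 3-vectors (the helper name and the small-angle fallback are mine):

#include <math.h>

/* slerp(x, y, a) = sin((1-a)phi)/sin(phi) x + sin(a phi)/sin(phi) y.
   Falls back to linear interpolation when phi is tiny and
   sin(phi) ~ 0 would make the division unstable. */
void slerp3(const double x[3], const double y[3], double a, double out[3])
{
    double d = x[0]*y[0] + x[1]*y[1] + x[2]*y[2];   /* cos(phi) */
    double phi, s, wx, wy;
    int k;
    if (d >  1.0) d =  1.0;
    if (d < -1.0) d = -1.0;
    phi = acos(d);
    s = sin(phi);
    if (s < 1e-6) { wx = 1.0 - a; wy = a; }          /* nearly parallel */
    else { wx = sin((1.0 - a)*phi)/s; wy = sin(a*phi)/s; }
    for (k = 0; k < 3; k++)
        out[k] = wx*x[k] + wy*y[k];
}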
Chapter 6
Texture Mapping
Texture mapping applies some sort of image to the surface of an object. We are not restricted to replacing a colour, but can modify any portion of the lighting equation. This includes colour, diffuse coefficients, specular coefficients, transparency and surface normals. Storing surface normals in a texture map would allow us to give a flat surface a bumpy appearance. We would like to use hyperbolic interpolation if we rasterize polygons, so that textures are rendered in a perspective correct fashion.

Assigning texture coordinates to surfaces

Essentially, we need to find a mapping p(s, t) from the texture space to the object space. We normally have s, t ∈ [0, 1]. This corresponds to the row and column in the texture (usual coordinates in R²). The mappings for some common objects are:

Triangle Use barycentric coordinates. Remember: only two parameters are needed.

Quadrilateral Use bilinear interpolation. This is the common application of textures. Does bilinear interpolation yield the same results as splitting the quadrilateral into two triangles and using barycentric coordinates?

Cylinder Use the mapping

p(s, t) = (r sin θ, y, r cos θ)

with θ = 360s and y = ht − h/2, where h is the height of the cylinder and r is the radius of the cylinder. This is only for the side surface.

Sphere Use the mapping

p(s, t) = (r sin θ cos φ, r sin φ, r cos θ cos φ)

with θ = 360s, φ = 180t − 90, and r is the radius of the sphere. We use latitude and longitude. Another option is to apply the cylindrical mapping and degenerate the cylinder to form the sphere. In this case we have the same as above except that sin φ = 2t − 1.

Torus A torus can be described by the mapping

p(s, t) = ((R + r cos φ) sin θ, r sin φ, (R + r cos φ) cos θ)

where R is the major radius, r is the minor radius, θ = 360s and φ = 360t.
If we choose to vary the texture coordinates so that the texture only maps to a patch on the object surface, then we need to decide what to do when we exceed our texture coordinates (s, t ∉ [0, 1]). We have a few options:

Border Choose a fixed colour.
Clamp If s > 1 then use s = 1. If s < 0 use s = 0. Likewise for t.
Tile Use s := s mod 1.0 and t := t mod 1.0.
Mirror Use s := s mod 1.0 if ⌊s⌋ is even, and s := 1.0 − (s mod 1.0) if ⌊s⌋ is odd.
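These policies might be coded as follows (a sketch; the enum and function names are mine, and the Border case is left to the caller since it selects a fixed colour rather than a coordinate):

#include <math.h>

typedef enum { WRAP_CLAMP, WRAP_TILE, WRAP_MIRROR } WrapMode;

/* Map an out-of-range texture coordinate back into [0, 1]. */
double wrap_coord(double s, WrapMode mode)
{
    switch (mode) {
    case WRAP_CLAMP:
        return s < 0.0 ? 0.0 : (s > 1.0 ? 1.0 : s);
    case WRAP_TILE:
        return s - floor(s);                    /* s mod 1.0 */
    case WRAP_MIRROR: {
        double f = s - floor(s);
        /* even integer part: keep; odd: reflect */
        return ((long)floor(s)) % 2 == 0 ? f : 1.0 - f;
    }
    }
    return s;
}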
When rasterizing, we normally interpolate the values of s and t while interpolating the vertices, colours etc. It is also quite common that we have the surface coordinate, but not s and t (as in raytracing). The equations above would then have to be solved for s and t to obtain the required texture coordinates.
Magnification and minification

If we have the situation that more than one texel in the texture map maps to a pixel on the display, then we have minification. If one texel maps to more than one pixel on the screen we have magnification. In the case of magnification, the texture appears blocky. With minification we have a problem deciding which texel should be used. This can cause interference patterns, graininess and similar distortions. With animation the effects can be even more disastrous, as subtle changes in position result in markedly different texels being selected. The following approaches may be used to eliminate these aliasing artifacts to some extent.
Bilinear interpolation

With bilinear interpolation, we find the four nearest texels to (s, t) and bilinearly interpolate the colour values. This approach works well for magnification, but for minification we need the weighted average of far more texels, and this proves to be expensive.
Mipmapping: multum in parvo (many in one)

With mipmapping, we attempt to precompute the effect of sampling a large number of texels in the texture. The texture is assumed to be of dimension 2^m × 2^m. We divide the image width and height in half and resample to obtain a reduced texture. The original image forms the first level of the texture, the reduced image forms the second level of the texture. We repeat the process with the reduced texture, reducing the size repeatedly until the size of the texture is 1×1. When we render the texture we select the mipmap level whose texels most closely correspond to the size of a pixel on the screen. In other words, when the object is further away, a low resolution mipmap will be used. When the object comes closer a high resolution mipmap will be used. Bilinear interpolation can still be applied within the mipmap level. The change in mipmap levels can cause an apparent jump in appearance. To solve this problem, bilinear interpolation can be used between mipmap levels. Mipmaps are not expensive to use, since each level reduces the amount of information by a factor of four:

1 + 1/4 + 1/16 + ··· = Σ_i (1/4)^i = 1⅓.

We only use 33 percent more memory with this technique. Mipmapping fails somewhat for oblique surfaces. Ripmapping [1] can compensate to some extent for this.
[Figure 6.1: Supersampling]
[Figure 6.2: Stochastic supersampling]

Stochastic supersampling

For supersampling we take more than one sample from the texture per pixel (this can also be applied more generally). We may decide to take nine samples per pixel as in figure 6.1. We then take an average value of the resulting colours. We may still see artifacts due to the fixed frequency at which we sample. If we randomly displace the sample points within the subpixel, we sample different frequencies and the results are usually better. Figure 6.2 illustrates the stochastic (random) supersampling or jitter process.
CHAPTER 6. TEXTURE MAPPING 62
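A sketch of 3×3 jittered sampling for one pixel; sample(x, y) is a hypothetical function that evaluates the texture or scene at a point:
import random

def jittered_average(sample, px, py, n=3):
    """Average n*n jittered samples over the unit pixel at (px, py)."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            # one random sample inside each of the n x n subpixels
            x = px + (i + random.random()) / n
            y = py + (j + random.random()) / n
            total += sample(x, y)
    return total / (n * n)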
Bump mapping
For bump mapping we store the height of the surface at each point. We can deform the surface accordingly and calculate the resulting normals to apply to the lighting equation. Buss [2] discusses the effect on an arbitrary surface.
Environment mapping
If we use the normal vector and calculate the reflection vector, we can use the reflection vector as an index into a texture map which represents the environment around us. If we assume that the viewer is in a fixed direction for all points this can be done rapidly. The texture used is often a spherical texture or a cube map.
Spherical environment mapping
For spherical environment mapping, we assume that the environment is infinitely far away. A reflective sphere is placed somewhere in the environment and a photo is taken of the sphere (the sphere map). Since the environment is infinitely far away, it does not matter where the sphere is: the reflection will be the same. The sphere will only be visible in a portion of the image (see figure 6.3); the portions outside of the sphere will not be used. A small amount of texture memory will thus be wasted when using a sphere map.
Figure 6.3: A sphere map.
Figure 6.4: The use of the normal in a sphere map.
We can now use the sphere map to determine the reflection of the environment on any surface. If the unit normal to the surface we are considering is n, then only one point on the sphere will have the same normal. We assume that all vectors have been transformed into the coordinate system of the sphere. Let us assume that the point with the same normal is p (relative to the centre of the sphere). The normal to the sphere at this point is p. We may at this point assume that the sphere is a unit sphere. We also assume that no projection takes place. Under these assumptions we must have p = n. If no projection takes place, then n = (n_x, n_y, n_z) can be used to directly find the corresponding point in the sphere map. The point is simply ((n_x + 1.0)/2, (n_y + 1.0)/2). Since this point reflects the environment with the same normal as the surface under consideration, the reflection of the environment in the surface has exactly the same colour at this point. We can apply this formula to the three vertices of a triangle and interpolate the texture coordinates across the surface to obtain a fast approximate reflection of the environment.
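The lookup itself is tiny; a sketch, assuming the normal is already unit length and expressed in the sphere's coordinate system:
def sphere_map_uv(n):
    """Texture coordinates on the sphere map for the unit normal n = (nx, ny, nz)."""
    nx, ny, nz = n
    # the sphere occupies the disc nx^2 + ny^2 <= 1 of the image
    return ((nx + 1.0) / 2.0, (ny + 1.0) / 2.0)
In practice this is evaluated per vertex and the resulting (u, v) pairs are interpolated across the triangle, as described above.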
The disadvantages of spherical environment maps include the wasted texture memory, and the lower sampling rate of the environment near the edge of the sphere. The lower sampling rate should not be too much of an inconvenience, since the reflective surface is then almost turned away from the viewer. Another disadvantage is the fact that the sphere map is view dependent. We thus need a different sphere map for different views.
Cubic environment mapping
For cubic environment mapping, we assume that the environment may be represented by an infinitely large cube. If the environment does not conform, we can simply project the environment onto the cube to obtain a cube. The six faces of the cube form the cubic environment map (figure 6.5).
Figure 6.5: The cubic environment map. (a) Cube; (b) unfolded cube.
To map the environment onto the surface, we need to determine which faces of the cube are reflected and what the texture coordinates on these faces are. We will assume that only one face of the cube is necessary. If we have a unit reflection vector r, then we can select the face to use according to the coordinate with the largest absolute value (of r). The sign of this coordinate determines which of the two parallel faces should be selected. Now that we have the face, we can begin computing which point on the face the reflected ray hits. Since the face is infinitely far away, we may assume that the reflected ray comes from the origin and intersects the plane. We will consider the plane with normal n = (0, −1, 0); calculations for the other planes proceed in a similar fashion. If we compute the relative position of the intercept, we obtain a texture coordinate. That is, we compute u = p_x / w, where w is the width of the cube face and p_x is the x coordinate of the point of intersection. If the width of the cube face is w, then the distance to the cube face is w/2. The point of intersection of the ray with the plane is thus given by the parameter (standard ray-plane intersection)
t = (w/2) / (r · n).
We find that t = (w/2)/(−r_y). The position p is then p = ((w/2)/(−r_y)) (r_x, r_y, r_z). The relative position u = p_x / w is u = −r_x / (2r_y), and likewise the other relative position is v = −r_z / (2r_y). These two coordinates are easily converted to texture coordinates using u + 0.5 and v + 0.5.
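A sketch of the whole lookup for all six faces; the assignment of (u, v) axes per face is my own convention (a real API such as OpenGL cube maps fixes its own orientation per face):
def cube_map_lookup(r):
    """Select the cube face hit by direction r; return (face, u, v) with u, v in [0, 1]."""
    rx, ry, rz = r
    ax, ay, az = abs(rx), abs(ry), abs(rz)
    if ax >= ay and ax >= az:
        face, major, s, t = ("+x" if rx > 0 else "-x"), rx, ry, rz
    elif ay >= az:
        face, major, s, t = ("+y" if ry > 0 else "-y"), ry, rx, rz
    else:
        face, major, s, t = ("+z" if rz > 0 else "-z"), rz, rx, ry
    # Intersecting t*r with the plane at distance w/2 along the major axis and
    # dividing by w cancels the cube width, leaving values in [-1/2, 1/2].
    u = s / (2.0 * abs(major)) + 0.5
    v = t / (2.0 * abs(major)) + 0.5
    return face, u, v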
Cubic maps provide a much better sampling of the environment and offer a view independent environment map. However, it may be difficult to work with six different textures, especially if we plan to interpolate textures across a triangle and we discover that each vertex of the triangle reflects a different face of the cube environment.
Chapter 7
Bezier Curves
Splines are a useful tool for defining curves and surfaces. Some of the advantages are:
Reduced storage size
Control via a few points
Smooth
Simple to find points on the spline
Bezier curves of degree three
The most common Bezier curves are those of degree 3. These curves usually provide sufficient modelling possibilities and are fast and easy to implement. We will also see that we can join Bezier curves so that some degree of continuity is maintained. A Bezier curve of degree 3 is specified by four control points p_0, p_1, p_2 and p_3. The Bezier curve interpolates the end points. The other two points influence the shape of the curve.
Figure 7.1: Blending functions for Bezier curves of degree 3.
Degree 3 Bezier curves are parametrically defined by
q(u) = ∑_{k=0}^{3} B_k(u) p_k
where the four functions B_k(u) are blending functions defined by
B_k(u) = C(3, k) u^k (1 − u)^{3−k}.
The B_k(u) are known as Bernstein polynomials, with the binomial coefficient
C(n, m) = n! / (m! (n − m)!).
p can be in any space R^d. If we multiply the polynomials out, we see that q(u) is a polynomial of degree 3 in u.
The blending functions, given by
B_0(u) = (1 − u)^3
B_1(u) = 3u(1 − u)^2
B_2(u) = 3u^2(1 − u)
B_3(u) = u^3
are illustrated in figure 7.1. We have the following properties of the blending functions:
B_k(u) ∈ [0, 1] for u ∈ [0, 1].
∑_{k=0}^{3} B_k(u) = ∑_{k=0}^{3} C(3, k) u^k (1 − u)^{3−k} = (1 − u)^3 + 3u(1 − u)^2 + 3u^2(1 − u) + u^3 = 1, which can be verified by multiplying out all the terms and cancelling, or directly:
∑_{k=0}^{3} B_k(u) = ∑_{k=0}^{3} C(3, k) u^k (1 − u)^{3−k} = (u + (1 − u))^3 = 1 (Binomial theorem).
B_0(0) = 1 and B_3(1) = 1, which shows that the end points are interpolated.
The derivatives are:
B'_0(u) = −3(1 − u)^2
B'_1(u) = 3(1 − u)^2 − 6u(1 − u)
B'_2(u) = 6u(1 − u) − 3u^2
B'_3(u) = 3u^2
At the endpoints the derivatives evaluate to:
B'_0(0) = −3, B'_1(0) = 3, B'_2(0) = 0, B'_3(0) = 0
B'_0(1) = 0, B'_1(1) = 0, B'_2(1) = −3, B'_3(1) = 3
We can easily calculate the derivative of q(u):
q'(u) = ∑_{k=0}^{3} B'_k(u) p_k.
We have the following interesting properties:
q'(0) = 3(p_1 − p_0)
q'(1) = 3(p_3 − p_2)
so that we can easily calculate the tangent to the curve at the endpoints.
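A direct sketch of evaluating the curve and its tangent from these formulas (control points as coordinate tuples):
def bezier3(p, u):
    """Point on the cubic Bezier curve with control points p[0..3] at parameter u."""
    b = [(1 - u)**3, 3*u*(1 - u)**2, 3*u**2*(1 - u), u**3]
    return tuple(sum(b[k] * p[k][i] for k in range(4)) for i in range(len(p[0])))

def bezier3_tangent(p, u):
    """Derivative q'(u) of the cubic Bezier curve."""
    db = [-3*(1 - u)**2,
          3*(1 - u)**2 - 6*u*(1 - u),
          6*u*(1 - u) - 3*u**2,
          3*u**2]
    return tuple(sum(db[k] * p[k][i] for k in range(4)) for i in range(len(p[0])))
For example, bezier3_tangent(p, 0) reproduces 3(p_1 − p_0).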
Piecewise smooth Bezier curves
To create more complex curves, we could use Bezier curves of higher degree. Before we do that, we consider another option: piecewise smooth curves. First we have to define smooth, or continuous, curves.
Figure 7.2: Example of a Bezier curve.
Definition 25 Let k ≥ 0. A function f(u) is C^k continuous if f has the kth derivative defined and continuous everywhere in the domain of f. C^0 continuity is simply the usual definition of continuity. f is C^∞ continuous if f is C^k continuous for all k ≥ 0.
Definition 26 A function f(u) is G^1 continuous provided f is continuous and there is a function t = t(u) that is continuous and strictly increasing such that the function g(u) = f(t(u)) has a continuous, nonzero first derivative everywhere in its domain.
We often simply require that the tangents at the point where we join the two curves be scalar multiples of each other. If we have two curves defined by
q(u) = ∑_{k=0}^{3} B_k(u) p_k
and
r(u) = ∑_{k=0}^{3} B_k(u) s_k,
with r following q, we simply set (p_3 − p_2) = (s_1 − s_0) to get C^1 continuity. We can place many curves in sequence to achieve the desired results. To obtain G^1 continuity we set (p_3 − p_2) = a(s_1 − s_0) for some scalar a > 0. We don't know what t(u) is, but we know that at the point where the curves are joined, u_j, we have g'(u_j) = f'(t(u_j)) t'(u_j), so we expect the one tangent to be a scalar multiple of the other. We naturally also require that p_3 = s_0.
Examples of piecewise smooth curves are shown in figure 7.3.
Figure 7.3: Piecewise smooth Bezier curves. (a) C^1 continuity; (b) G^1 continuity.
Bezier curves of higher degree
It is straightforward to generalise Bezier curves to higher degree. Degree n Bezier curves are parametrically defined by
q(u) = ∑_{k=0}^{n} B^n_k(u) p_k
where the functions B^n_k(u) are blending functions defined by
B^n_k(u) = C(n, k) u^k (1 − u)^{n−k}.
Once again we have the properties
B^n_0(0) = 1 = B^n_n(1).
∑_{k=0}^{n} B^n_k(u) = 1 for all u.
B^n_k(u) ≥ 0 for all 0 ≤ u ≤ 1.
This implies that the curve will start at p_0 and end at p_n, and that the curve lies in the convex hull of its control points.
The derivative is given by
q'(u) = ∑_{k=0}^{n} (B^n_k)'(u) p_k
 = ∑_{k=0}^{n} (d/du)[C(n, k) u^k (1 − u)^{n−k}] p_k
 = ∑_{k=0}^{n} C(n, k) (k u^{k−1} (1 − u)^{n−k} − (n − k) u^k (1 − u)^{n−k−1}) p_k
 = ∑_{k=1}^{n} C(n, k) k u^{k−1} (1 − u)^{n−k} p_k − ∑_{k=0}^{n−1} C(n, k) (n − k) u^k (1 − u)^{n−k−1} p_k
 = ∑_{k=0}^{n−1} C(n, k+1) (k + 1) u^k (1 − u)^{n−k−1} p_{k+1} − ∑_{k=0}^{n−1} C(n, k) (n − k) u^k (1 − u)^{n−k−1} p_k
 = ∑_{k=0}^{n−1} n C(n−1, k) u^k (1 − u)^{n−k−1} p_{k+1} − ∑_{k=0}^{n−1} n C(n−1, k) u^k (1 − u)^{n−k−1} p_k
 = n ∑_{k=0}^{n−1} B^{n−1}_k(u) (p_{k+1} − p_k),
using the identities C(n, k+1)(k + 1) = n C(n−1, k) and C(n, k)(n − k) = n C(n−1, k).
Once again we can easily create piecewise continuous curves. It is also interesting to note that the derivative is a Bezier curve of one degree less. We also have q'(0) = n(p_1 − p_0) and q'(1) = n(p_n − p_{n−1}).
Bezier curves and affine transforms
An important question to ask is what happens to a Bezier curve if it is transformed by an affine transformation. Do all the points on the curve have to be transformed individually? As it so happens, Bezier curves are affine invariant. In other words, we can transform the control points by an affine transform, and then draw the Bezier curve. We get precisely the same result as generating the points on the Bezier curve and then transforming these points.
Theorem 27 The Bezier curve of degree three is affine invariant.
Proof A Bezier curve of degree three is given by
q(u) = ∑_{k=0}^{3} B_k(u) p_k,
where p_k are the control points and B_k(u) are the Bernstein polynomials
B_k(u) = C(3, k) u^k (1 − u)^{3−k}.
Let A be an affine transform, i.e. Ax = Lx + t (where L is a linear transform). We must show that A ∑_{k=0}^{3} B_k(u) p_k = ∑_{k=0}^{3} B_k(u) A p_k. So we have
A ∑_{k=0}^{3} B_k(u) p_k = L ∑_{k=0}^{3} B_k(u) p_k + t
 = ∑_{k=0}^{3} B_k(u) L p_k + t
 = ∑_{k=0}^{3} B_k(u) L p_k + ∑_{k=0}^{3} B_k(u) t  (property of the basis functions: ∑_k B_k(u) = 1)
 = ∑_{k=0}^{3} B_k(u) (L p_k + t)
 = ∑_{k=0}^{3} B_k(u) A p_k
as required.
It is straightforward to extend this proof to general Bezier curves.
Bezier curves also have the variation diminishing property: for any line L in R^2 (plane P in R^3), the number of times the curve crosses the line (plane) is less than or equal to the number of times the control polygon crosses the line (plane).
de Casteljau's Method
We consider once again Bezier curves of degree 3. We can calculate the point on the Bezier curve for u ∈ [0, 1] using de Casteljau's method as follows:
r^{j+1}_k = (1 − u) r^j_k + u r^j_{k+1}
with r^0_k = p_k. The final point on the curve is given by r^3_0. This algorithm can be applied to Bezier curves of any degree.
Figure 7.4: de Casteljau's method.
Figure 7.4 illustrates the process of calculating the point on the Bezier curve using de Casteljau's method.
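A sketch of the recurrence for a curve of any degree (points as tuples; each pass blends adjacent points until one remains):
def de_casteljau(points, u):
    """Evaluate a Bezier curve of arbitrary degree at u by repeated interpolation."""
    pts = [tuple(p) for p in points]
    while len(pts) > 1:
        pts = [tuple((1 - u) * a[i] + u * b[i] for i in range(len(a)))
               for a, b in zip(pts, pts[1:])]
    return pts[0]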
To prove that de Casteljau's algorithm yields the same points we must prove that q(u) = p^n_0(u), with
p^{j+1}_k(u) = (1 − u) p^j_k(u) + u p^j_{k+1}(u).
We prove a more general statement:
p^r_k(u) = ∑_{j=0}^{r} B^r_j(u) p_{k+j}.
For r = n and k = 0 we have the desired result.
Theorem 28 Let 0 ≤ r ≤ n and 0 ≤ k ≤ n − r. Then
p^r_k(u) = ∑_{j=0}^{r} B^r_j(u) p_{k+j}.
Proof We use induction.
For r = 0:
∑_{j=0}^{0} B^0_j(u) p_{k+j} = B^0_0(u) p_k = p_k = p^0_k(u) = p^r_k(u).
Suppose the equation is true for r. Then
p^{r+1}_k(u) = (1 − u) p^r_k(u) + u p^r_{k+1}(u)
 = ∑_{j=0}^{r} (1 − u) B^r_j(u) p_{k+j} + ∑_{j=0}^{r} u B^r_j(u) p_{k+j+1}
 = ∑_{j=0}^{r+1} ((1 − u) B^r_j(u) + u B^r_{j−1}(u)) p_{k+j}
 = ∑_{j=0}^{r+1} B^{r+1}_j(u) p_{k+j}.
In the last portion we define C(r, r+1) = C(r, −1) = 0, and thus also B^r_{−1}(u) = B^r_{r+1}(u) = 0. Also, since C(r, j) + C(r, j−1) = C(r+1, j), we have (1 − u) B^r_j(u) + u B^r_{j−1}(u) = B^{r+1}_j(u).
Recursive subdivision
It is useful to be able to divide a Bezier curve into two parts, each of which is also a Bezier curve. This will prove to be useful for rendering Bezier curves and for intersection tests. If we have a Bezier curve q(u) of degree n, then
q_1(u) = q(u/2) and q_2(u) = q((u + 1)/2)
are both Bezier curves of degree n. We have then successfully divided the curve in two.
Drawing a Bezier curve can now be achieved by recursively subdividing the Bezier curve in two until the portions of the Bezier curve are as close to a straight line as possible. One way to determine if the curve is close enough to a straight line is to define an error value ε. If
‖q(1/2) − (p_0 + p_3)/2‖ < ε
then we assume the distance from the curve to the line will be less than ε pixels, and the curve can thus be approximated by a straight line segment. This test may fail on occasion; another test that can be used is to determine how far the interior points of the control polygon are from the line connecting the first and last control point.
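A sketch of the drawing procedure, reusing de_casteljau from the earlier sketch; the split control points come from the de Casteljau triangle (see Theorem 29 below), and draw_line is a hypothetical output routine:
def split(points, u=0.5):
    """Split a Bezier curve at u; the de Casteljau triangle supplies both halves."""
    left, right, pts = [], [], [tuple(p) for p in points]
    while pts:
        left.append(pts[0])
        right.append(pts[-1])
        pts = [tuple((1 - u) * a[i] + u * b[i] for i in range(len(a)))
               for a, b in zip(pts, pts[1:])]
    return left, right[::-1]

def draw_bezier(points, draw_line, eps=0.5):
    """Recursively subdivide until the flatness test passes, then emit a segment."""
    mid = de_casteljau(points, 0.5)                       # q(1/2)
    chord = [(points[0][i] + points[-1][i]) / 2 for i in range(len(mid))]
    if sum((m - c)**2 for m, c in zip(mid, chord)) < eps**2:
        draw_line(points[0], points[-1])
    else:
        a, b = split(points)
        draw_bezier(a, draw_line, eps)
        draw_bezier(b, draw_line, eps)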
Now we prove that the two portions of the curve are indeed Bezier curves (in general).
Theorem 29 Let q_1(u) = q(u_0 u) and q_2(u) = q(u_0 + (1 − u_0)u).
a) The curve q_1(u) is equal to the degree n Bezier curve with control points p^0_0, p^1_0, p^2_0, ..., p^n_0.
b) The curve q_2(u) is equal to the degree n Bezier curve with control points p^n_0, p^{n−1}_1, p^{n−2}_2, ..., p^0_n.
Proof Only part (a) will be proven.
We want to show that
q(u_0 u) = ∑_{j=0}^{n} B^n_j(u) p^j_0(u_0).
This is equivalent to showing that
∑_{i=0}^{n} B^n_i(u_0 u) p_i = ∑_{j=0}^{n} B^n_j(u) ∑_{i=0}^{j} B^j_i(u_0) p_i
 = ∑_{i=0}^{n} ∑_{j=i}^{n} B^n_j(u) B^j_i(u_0) p_i.
The coefficients of p_i should be equal, so we must show that
B^n_i(u_0 u) = ∑_{j=i}^{n} B^n_j(u) B^j_i(u_0),
that is,
C(n, i) (u_0 u)^i (1 − u_0 u)^{n−i} = ∑_{j=i}^{n} C(n, j) C(j, i) u^j u_0^i (1 − u)^{n−j} (1 − u_0)^{j−i}.
Use C(n, j) C(j, i) = C(n, i) C(n−i, j−i) and divide both sides by C(n, i) (u_0 u)^i, so that it remains to show
(1 − u_0 u)^{n−i} = ∑_{j=i}^{n} C(n−i, j−i) u^{j−i} (1 − u)^{n−j} (1 − u_0)^{j−i}.
So we calculate
∑_{j=i}^{n} C(n−i, j−i) u^{j−i} (1 − u)^{n−j} (1 − u_0)^{j−i}
 = ∑_{j=0}^{n−i} C(n−i, j) u^j (1 − u)^{n−j−i} (1 − u_0)^j
 = ((u − u_0 u) + (1 − u))^{n−i}  (Binomial theorem)
 = (1 − u_0 u)^{n−i},
which is what we wanted to prove.
Degree elevation
If we have a Bezier curve of degree n, we can create a Bezier curve of degree n + 1 which yields the same curve. Suppose we have a Bezier curve of degree n defined by the control points p_0, p_1, ..., p_n. Then we can create a Bezier curve of degree n + 1 with control points r_0, r_1, ..., r_{n+1} which is the same curve by defining:
r_0 = p_0
r_{n+1} = p_n
r_i = (i / (n + 1)) p_{i−1} + ((n − i + 1) / (n + 1)) p_i.
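A sketch of the construction, using (n − i + 1)/(n + 1) = 1 − i/(n + 1):
def elevate_degree(points):
    """Control points of the degree n+1 Bezier curve describing the same curve."""
    n = len(points) - 1
    r = [tuple(points[0])]
    for i in range(1, n + 1):
        a = i / (n + 1.0)
        r.append(tuple(a * points[i - 1][c] + (1 - a) * points[i][c]
                       for c in range(len(points[0]))))
    r.append(tuple(points[n]))
    return r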
Bezier surface patches
We can also have Bezier curves of Bezier curves: the control points for one family of Bezier curves are determined by other Bezier curves. In this way we create a surface. A Bezier patch of degree three is described as follows:
q(u, v) = ∑_{i=0}^{3} ∑_{j=0}^{3} B_i(u) B_j(v) p_{i,j}.
Subdivision, degree elevation and many of the other techniques can be applied. We have to be careful of edge cracking, though.
Chapter 8
Raytracing
See the associated raytracing document.
Chapter 9
Radiosity
Raytracing yields impressive results, but fails to capture some of the lighting effects that we observe. In particular, raytracing handles mirror reflections well, but the environments we model have few highly reflective objects. Instead, most of the lighting is diffuse. We observe many subtle effects, such as colour bleeding, in a closed diffuse environment. Raytracing does not model this effect since we only model certain kinds of light transport.
Light transport notation
We need to describe our models in terms of the possible light paths that can be modelled. We can use the notation of Heckbert for this purpose. The vertices of a light path can be:
L - a light source.
E - the eye.
S - a specular reflection.
D - a diffuse reflection.
We are only interested in light paths that begin at a light source and end at the eye. We can represent the light paths as regular expressions. Raytracing models light paths of the form LD?S*E, or light (possibly reflected off a diffuse surface) reflected specularly zero or more times before reaching the eye. This is not a global illumination model, since indirect illumination from diffuse surfaces is not considered.
Radiosity attempts to model light paths of the form L(D+)E.
The radiosity matrix
Radiosity algorithms approximate the lighting of a scene by dividing the scene into patches and calculating the energy transfer between patches. The radiosity of a patch, B, is the total rate of energy leaving a surface and is equal to the sum of emitted and reflected energies. Consider two patches P_i and P_j:
B_i dA_i = E_i dA_i + ρ_i B_j dA_j F_{dA_j dA_i}.
The contribution of every patch P_j to patch P_i is then
B_i dA_i = E_i dA_i + ρ_i ∫_j B_j dA_j F_{dA_j dA_i},
i.e. radiosity × area = emitted energy + reflected energy, where
E_i is the light emitted from patch i.
ρ_i is the reflectivity of patch i.
F_{dA_j dA_i} is the form factor between dA_i and dA_j, the fraction of energy leaving dA_j that arrives at dA_i.
In a closed environment this can be repeated for each patch to produce a system of equations. Normally the environment is discretized into n patches and constant radiosity over each patch is assumed:
B_i A_i = E_i A_i + ρ_i ∑_{j=1}^{n} B_j F_{ji} A_j.
With the reciprocity relation F_{ij} A_i = F_{ji} A_j, i.e. F_{ij} = F_{ji} A_j / A_i, this becomes
B_i = E_i + ρ_i ∑_{j=1}^{n} B_j F_{ij},
with F_{ii} = 0, since we assume the surface is flat (or convex).
This is the equation for one patch. For all the patches we have n simultaneous equations in n unknown B_i values:
[ 1 − ρ_1 F_11    −ρ_1 F_12   ...    −ρ_1 F_1n  ] [ B_1 ]   [ E_1 ]
[  −ρ_2 F_21    1 − ρ_2 F_22  ...    −ρ_2 F_2n  ] [ B_2 ] = [ E_2 ]
[      ...            ...     ...       ...     ] [ ... ]   [ ... ]
[  −ρ_n F_n1     −ρ_n F_n2    ...  1 − ρ_n F_nn ] [ B_n ]   [ E_n ]
Solving for radiosity values
Solving for the radiosity values will give us intensity values for the patches. Gauss-Jordan elimination, or Gaussian elimination with back substitution, can be used to calculate the radiosity values. In practice many of the form factors are zero, and other properties of the matrix (it is diagonally dominant) allow techniques such as the Gauss-Seidel method to be used.
Solving: The Jacobi method
The Jacobi method calculates the total incoming energy from each of the other patches to obtain the new radiosity value for a particular patch. This process is repeated for each patch. The calculated values are stored in a separate array and then copied back to the first array for the next iteration. Below, M_{i,j} = ρ_i F_{ij}.
for i = 1 to n do
  B[i] ← E[i]
end for
while (not converged enough) do
  for i = 1 to n do
    B_new[i] = E[i] + ∑_{j=1}^{n} M_{i,j} B[j]
  end for
  for i = 1 to n do
    B[i] ← B_new[i]
  end for
end while
Solving: Gauss-Seidel iteration
Gauss-Seidel iteration converges more quickly to a solution than the Jacobi method. It can also be applied in place; no separate array is required.
for i = 1 to n do
  B[i] ← E[i]
end for
while (not converged enough) do
  for i = 1 to n do
    B[i] = E[i] + ∑_{j=1}^{n} M_{i,j} B[j]
  end for
end while
This technique can be applied because the sum of each row of form factors should equal 1 (since we cannot have more visibility than what we can see). Also, ρ_i should be less than 1.
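A small sketch of the iteration with dense matrices (E, rho and F as plain Python lists; a real solver would exploit the many zero form factors):
def gauss_seidel_radiosity(E, rho, F, iterations=50):
    """Solve B = E + diag(rho) F B in place by Gauss-Seidel sweeps."""
    n = len(E)
    B = E[:]                       # start from the emitted energy
    for _ in range(iterations):
        for i in range(n):
            gathered = sum(F[i][j] * B[j] for j in range(n))
            B[i] = E[i] + rho[i] * gathered
    return B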
Solving: The shooting method
Instead of determining how much energy a patch receives from other patches, we can determine how much energy a patch radiates to other patches. In this way, we can select bright patches to transmit energy first, and we should converge more quickly. The radiosity of a patch i at any time is B[i] + ΔB[i].
for i = 1 to n do
  B[i] ← 0
  ΔB[i] ← E[i]
end for
while (not converged enough) do
  choose j so that ΔB[j] A_j is maximised
  B[j] = B[j] + ΔB[j]
  for i = 1 to n do
    ΔB[i] = ΔB[i] + M_{i,j} ΔB[j]
  end for
  ΔB[j] = 0
end while
Form factors
So far we have assumed that the form factors, or the visible portion of each patch from each other patch, are known. We will have to calculate these values in some way.
Form factors are defined as
F_{A_i A_j} = F_{ij} = (radiative energy reaching A_j from A_i) / (total radiative energy leaving A_i in all directions),
F_{ij} = (1 / A_i) ∫_{A_i} ∫_{A_j} (cos θ_i cos θ_j / (π r^2)) dA_j dA_i,
where r is the distance between the two patches, and θ_i, θ_j are the angles between the patch normals and the line connecting the two points under consideration. This approach doesn't take into account patches that may be in between the two patches for which we want to calculate the form factor.
The rest of this section presents some options for form factor calculation.
Numerical solution
Convert the double area integral into a double contour integral (Stokes' theorem):
F_{ij} = (1 / (2π A_i)) ∮_{C_j} ∮_{C_i} ln(r) dx_i dx_j + ln(r) dy_i dy_j + ln(r) dz_i dz_j.
This technique is expensive and requires the calculation of the contour of the visible area of the patch.
The raytracing method
In this method we use an existing raytracing engine to approximate the form factor. We simply cast a number of rays from various points on patch j to patch i. Jittering should be used when selecting rays from a regular grid to obtain better results. Let the percentage of rays that make it from patch j to patch i be V_{ij}; we have V_{ij} = V_{ji}. Let r be the distance between the centres of the patches; then we approximate the form factor by
F_{ij} = V_{ij} (cos θ_i cos θ_j A_j) / (π r^2).
The π r^2 term comes from the portion of the hemisphere above the centre of the patch through which the other patch is visible (the area of the hemisphere is 2π) and the reduced visible size of the patch due to distance (inverse square law).
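A sketch of the approximation; the patch attributes (.centre, .normal, .area) and the visibility callback (the ray-cast estimate V_ij) are assumptions of this illustration:
import math

def form_factor(patch_i, patch_j, visibility):
    """Approximate F_ij = V_ij cos(t_i) cos(t_j) A_j / (pi r^2)."""
    d = [b - a for a, b in zip(patch_i.centre, patch_j.centre)]
    r2 = sum(c * c for c in d)
    r = math.sqrt(r2)
    dir_ij = [c / r for c in d]                 # unit direction from i to j
    cos_i = max(0.0, sum(n * c for n, c in zip(patch_i.normal, dir_ij)))
    cos_j = max(0.0, -sum(n * c for n, c in zip(patch_j.normal, dir_ij)))
    return visibility(patch_i, patch_j) * cos_i * cos_j * patch_j.area / (math.pi * r2)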
The hemicube method
Ray casting can be an intensive process. The hemicube method uses rasterization techniques and can thus be accelerated by common graphics hardware. Instead of using a hemisphere on which projections are calculated, we use a hemicube (half a cube of side length 2). The hemicube is centred above the centre of the patch. Now we render the scene from the viewpoint of the patch onto each side of the hemicube. A standard Z-buffer can be used for the visibility calculations. Each patch is assigned a unique colour (typically one of 16 million colours for a 24-bit colour buffer). It is trivial to calculate the visibility V_{ij} by simply determining how many pixels on the surface of the hemicube come from the patch we investigate, in proportion to the total number of pixels. The hemicube is illustrated in figure 9.1.
Figure 9.1: The hemicube method.
Figure 9.2: Determining radiosity values (interior vertices average the surrounding patch radiosities, e.g. (B_a + B_b + B_c + B_d)/4 or (B_b + B_d)/2; boundary vertices are extrapolated, e.g. 2B_2 − B_1 and 2B_e − B_3).
Figure 9.3: Examples of radiosity from [6].
Rendering
To render the scene, we simply render the patches using a rasterizer. We have the radiosity (brightness) for each patch, but this seldom provides smooth results. Instead we should try to calculate the radiosity value for each pixel using interpolation. Another option is to determine the radiosity at each vertex and then interpolate using Gouraud shading. Vertices that are not surrounded by patches can be calculated via extrapolation. The calculations are illustrated in figure 9.2. One drawback of the radiosity method is shadows: shadows are seldom well defined. Subdivision of the patches in areas where the radiosity varies greatly can reduce the problem.
Chapter 10
Photon mapping
One way to obtain better results for various scenes is to combine radiosity and raytracing; radiosity then provides the diffuse term. Another option is provided by photon mapping. The key feature of photon mapping is that the existing raytracing algorithm can be used for global illumination. Also, since the differences provided by global illumination tend to be rather subtle, we can choose how much detail (and quality) we wish to expend on this stage. To introduce photon mapping, we first examine path tracing, an expensive (high quality) solution to the global illumination equation.
Path tracing
The standard raytracing algorithm uses a local lighting model in combination with pure reflection. To support full global illumination we must sample the light coming in from all directions. In this way we sample light paths of the form L(D|S)*E. We can do this by numerically integrating over the visible space of the object we are considering, i.e. we cast rays in several random directions and see what answers we get. The answers should, of course, be adjusted by the properties of the surface (the BRDF function). So we should calculate the diffuse and specular contribution according to our lighting equations. The more rays we cast out, the better our approximation of global illumination. This turns out to be costly: the recursive nature of ray tracing implies an exponential growth in the number of rays that need to be cast.
Instead of using this computationally expensive approach, we sample light in only one direction. Now we only have to follow a single path to its destination. However, this is clearly not sufficient to model the lighting in the scene. If we use many such paths and average the result, we get an estimate for the radiosity at that point. Usually 1000+ rays are needed to get acceptable quality. One problem with this random sampling technique is the amount of noise created in the image. Although path tracing is far faster than the naive approach it is still very expensive (1000 times slower for 1000 rays).
This approach does illustrate that raytracing techniques can produce global illumination results.
Photon mapping
To get global illumination and effects such as caustics, we could trace rays of light from the light sources through the scene until they hit the eye. This technique was rejected for raytracing because a very small fraction of the light rays would ever reach the eye, so instead we trace rays backward from the eye. Photon mapping reintroduces the forward raytrace (but in a limited fashion). We implement the renderer in two passes:
Trace photons into the scene (and store their positions).
Render the scene using standard raytracing, but use the traced photons to improve the accuracy of the lighting calculations.
The advantages of this technique are:
All global illumination effects can be simulated.
Arbitrary geometry is supported (we are not forced to use patches).
Low memory consumption.
Little noise (variance).
Consistency (it will converge to the correct solution with enough photons).
These phases are discussed in more detail in the following sections.
Photon tracing
Photon emission
Photons are created at light sources. Any type of light source can be used: point lights, spotlights or even area lights. The power of the light source is divided among the emitted photons so that each carries a fraction of the light source power. For a point light we can simply emit photons evenly in all directions. For area lights we have to distribute the origin of the photons across the area of the light. For spotlights we concentrate photons in a particular direction. Both source points and directions of the photons can be generated randomly using the properties of the light. Rejection testing can be used to simplify the creation of the photons. Projection maps can be used to send photons in the direction of geometry (where they will yield the greatest results).
Photon scattering
An emitted photon is traced through the scene using standard raytracing techniques. When a photon hits an object it can be
Reflected
Transmitted
Absorbed
The power of the photon should be scaled by the reflectivity (or other relevant property). The photon is not stored for specular (mirror) reflection. The photon is stored for diffuse reflections; in that case the photon can be reflected in a random direction, and its power is scaled by the diffuse reflectivity coefficient.
Russian roulette can be used to reduce the number of photon traces required. Instead of scaling the power of the photon by the surface property (e.g. the diffuse reflection coefficient), we determine whether the photon should be reflected at all. If the surface has property 0 ≤ ρ ≤ 1, then we generate a random number 0 ≤ ξ ≤ 1. If ξ < ρ, then the photon is reflected, otherwise the photon is absorbed. The same approach can be used to decide whether specular or diffuse reflection should be followed for the photon. If the diffuse coefficient is ρ_d and the specular coefficient is ρ_s, then if
ξ ∈ [0, ρ_d], we use diffuse reflection;
ξ ∈ (ρ_d, ρ_d + ρ_s], we use specular reflection;
ξ ∈ (ρ_d + ρ_s, 1], the photon is absorbed.
For coloured surfaces we can use the average of the coefficient values for the different frequencies as the threshold value. The power should also be scaled according to the average value and the coefficient (unless Russian roulette is used).
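The roulette decision itself is a one-liner per case; a sketch:
import random

def scatter_decision(rho_d, rho_s):
    """Russian roulette: decide the photon's fate at a surface hit."""
    xi = random.random()
    if xi < rho_d:
        return "diffuse"            # reflect diffusely, power unchanged
    if xi < rho_d + rho_s:
        return "specular"           # reflect specularly, power unchanged
    return "absorbed"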
Photon storing
Photons are stored in the photon map when they hit a diffuse (non-specular) surface. Specular reflections are handled (better) by standard raytracing. The photons are stored at every point where they hit a diffuse surface. Photons represent the incoming flux at the surface, so photons are not really isolated packets of light.
Photon map data structure
One of the strengths of the photon map is that it is decoupled from the geometry involved. The information regarding the photons is stored in a separate data structure, and not in the data structure of the geometry. The photon map and ray tracing algorithm allow any geometry to be used during rendering (even, for example, point clouds!).
Photons are stored during the photon tracing phase; they then have to be queried for nearest neighbours. The structure should thus be efficient both in query time and in size (we plan to store millions of photons). The kd-tree is a suitable structure for this purpose. The query for the k nearest neighbours is O(k + log n) on average, where n is the number of photons.
The kd-tree stores one photon at each node in the tree. Each node then divides the space into two half spaces. The left child holds all photons on the left (positive) side of the splitting plane, and the right child holds all the photons in the right (negative) half space. Each split occurs along one of the coordinate axes: first x, then y, then z and then x again for each level of the kd-tree. We thus use a 3d-tree. It may be more convenient to decide on the split based on the distribution of the points in the set, i.e. split along the axis in which the points are spread the furthest.
Since the kd-tree will be used many times when rendering, it is important that the kd-tree be balanced. We store the photons in a simple structure during the photon trace phase, and then construct a balanced tree from this information. If we store a balanced tree, then pointers to the child nodes are not necessary, since we can store the photon map in an arrangement similar to a heap: the root is at index 1, and a node in position i has 2i as left child and 2i + 1 as right child. To build the balanced tree, we simply repeatedly split the set using the median value of the data.
We need to be able to locate photons to get an approximation of the lighting near a point. The following algorithm finds the nearest photons within maximum distance d of the point x:
locate_photons(p)
  δ = signed distance from x to the splitting plane of node p
  if δ < 0 then
    x is on the left hand side: examine the left subtree
    locate_photons(2p)
    if δ^2 < d^2 then
      the right subtree is within range: check it
      locate_photons(2p + 1)
    end if
  else
    x is on the right hand side: examine the right subtree
    locate_photons(2p + 1)
    if δ^2 < d^2 then
      the left subtree is within range: check it
      locate_photons(2p)
    end if
  end if
  check if this photon is within range and should be stored:
  δ^2 = squared distance from photon p to x
  if δ^2 < d^2 then
    insert photon p into the list of nearest photons
    reduce the maximum range to check, prune the search:
    d^2 = squared distance to the furthest photon in the (full) list of nearest photons
  end if
The radiance estimate
We calculate the radiance estimate using the photon map by finding a number of photons within a certain range of the intersection point (during raytracing). This is a spherical volume. We want to calculate a density estimate using the area that the photons are in. The sphere projects onto the flat surface as a circle with area π d^2. We can sum the values of the photons and divide by the area to get an approximation of the flux. We integrate this into the lighting equation as follows:
L(x, ω_r) ≈ (1 / (π d^2)) ∑_{p=1}^{n} f(x, ω_i, ω_r) Φ_p(x, ω_i)
where
d is the maximum radius of the nearest neighbour search.
n is the number of photons found.
x is the point of intersection.
L is the luminance.
f is the BRDF (Phong, Cook-Torrance).
ω_i is the incoming direction of the photon (stored with the photon).
ω_r is the view direction (direction for evaluation).
Φ_p(x, ω_i) is the power of photon p, which hit from direction ω_i.
Photon mapping allows many other global illumination phenomena to be rendered (as illustrated in figures 10.1, 10.2 and 10.3):
Caustics (focused light).
Soft shadows.
Participating media.
Subsurface scattering.
Figure 10.1: Examples of photon mapping from [7] (ray tracing, soft shadows, caustics, global illumination).
Figure 10.2: Examples of photon mapping from [7] (Cornell box, indoor scene, caustics).
Figure 10.3: Examples of photon mapping from [7] (volume caustic in participating media, indoor scene).
Chapter 11
Animation
Animation is the process of controlling the movement of objects. The objective of animation is to produce a series of images which can be displayed in rapid succession, thus giving the appearance of motion. In other words, we just need to render a suitable set of frames on the computer and then play back these frames rapidly. The difficult task is to define the motion that we wish to represent, and to allow the motion to be defined easily. Some of the traditional animation techniques can still be applied.
Traditional animation techniques
In this section we discuss a few of the traditional animation
techniques that are easily applied in a computer environment.
Keyframing
Each image in an animation sequence is known as a frame. Keyframes are important frames in the animation that indicate key portions of the motion. After keyframes have been produced, the inbetween frames can be interpolated from these frames. The inbetween frames must be generated carefully to provide the desired motion.
For computer animation we can specify the position and shape of objects at key frames. We then use some form of interpolation to obtain frames at inbetween values. In the worst case linear interpolation can be used. However, cubic splines (Bezier curves etc.) can also be used to provide smooth interpolation over time. We can move objects along various paths using piecewise smooth cubic curves. Interpolation on the computer is not restricted to position only: the velocity or acceleration of the object can also be modelled.
Motion capture
Not all data need be fed into the animation system manually. Motion capture attempts to capture the motion of live actors. The actors have devices placed on them that provide measurements of the position and/or orientation of various parts of the body. The information is then recorded and processed by the computer to provide animation. Sometimes spline curves will be used to approximate the data that has been acquired. Various sensors can be used. Sensors may measure their position with respect to a magnetic field. Another approach is an optical system: reflective spheres are placed on the actor, and the actor is recorded by several cameras. The information from the cameras allows the location of the spheres to be computed. However, this is not a trivial process, since information from several cameras must be combined to provide reliable tracking information. As a result the data may contain a lot of noise.
Physics models
For simple rigid objects, we can use classical physics models; namely, compute the force and torque on the objects. The force determines acceleration, acceleration determines velocity and velocity determines position. We can then detect collisions between objects. Conservation of momentum, friction and other techniques can then be used to determine the new forces that apply to the object. Torque on the object determines rotation. The difficult part is the computation of collisions; refer to Real-Time Rendering [1] for details on collision detection algorithms.
If the force on an object is F then we have
F = m a_t,
so the acceleration can be computed. F is naturally the sum of all forces acting on the object. The velocity is then
v_{t+1} = v_t + Δt a_t
and the position of the object
p_{t+1} = p_t + Δt v_t.
Similar calculations can be performed to determine rotation using the torque.
Animation of position
We can control camera movement or object movement by specifying curves that trace out the path of the object. These curves usually specify the centre of mass of the object. The derivatives of cubic spline curves are available, so it is possible to also modify the velocity of the object; the curve will then alter accordingly.
Sometimes we want fairly dynamic situations. For example, we may want the camera to follow a certain object. It can be tedious and difficult to do so by hand. Another approach is to automatically build the spline curves required to follow the object.
As an example we can consider ease-in. In this case, we would like an object with some initial velocity to approach a point smoothly and stop at that point. This can be achieved using the formula
q(u) = p_0 H_0(u) + v_0 H_1(u) + p_1 H_3(u),
where p_0 is the initial position of the object, v_0 is the initial velocity of the object and p_1 is the stopping point. The Hermite polynomials H_i are
H_0(u) = (1 + 2u)(1 − u)^2
H_1(u) = u(1 − u)^2
H_2(u) = −u^2(1 − u)
H_3(u) = u^2(3 − 2u).
We can change the time interval from 0 ≤ u ≤ 1 with the simple substitution
J_i(u) = H_i((u − u_0) / (u_1 − u_0)).
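A sketch of the ease-in evaluation (positions and the velocity as coordinate tuples):
def ease_in(p0, v0, p1, u):
    """Hermite ease-in: start at p0 with velocity v0, stop smoothly at p1 (0 <= u <= 1)."""
    h0 = (1 + 2*u) * (1 - u)**2
    h1 = u * (1 - u)**2
    h3 = u**2 * (3 - 2*u)
    return tuple(h0 * a + h1 * b + h3 * c for a, b, c in zip(p0, v0, p1))
At u = 1 all three weights give q(1) = p_1 with zero derivative, so the object stops exactly at the target.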
In the case of a moving target, this approach is less effective. If the target does not change direction or velocity rapidly, linear interpolation may provide a suitable solution:
c_{i+1} = (1 − α) c_i + α t_{i+1},
where subscripts indicate time, α is a fixed parameter and t_i is the target position at time i.
Arc length parametrisation
Splines do not have equal speed across the path of the curve. Velocity curves are thus also not entirely easy to work with. If we compare |q(t_1 + Δt) − q(t_1)| and |q(t_2 + Δt) − q(t_2)| for small Δt, we may very well find that the distance differs, even though we may not want this to be the case. Greater distance implies greater velocity. Obtaining constant speed by adjusting the control points may require the path to change, and this may not be desirable. The solution is to reparametrise the curve so that we have constant speed throughout the curve. The parametrisation q̃(s) = q(t(s)) is known as the arclength parametrisation of the curve.
The arclength of a curve is given by
s(t') = ∫_{t_0}^{t'} √((dx/dt)^2 + (dy/dt)^2 + (dz/dt)^2) dt.
There is often no analytical solution to this equation, so the answer has to be found numerically. You can use the trapezoid rule, Simpson's rule or Romberg integration to integrate the function numerically, depending on the desired accuracy. Computing s is not our primary concern, however; rather, we wish to compute t(s), the value of t which will produce the arclength s. Since the function s(t) is often not analytic, neither is t(s). We can use the bisection method (rather than Newton-Raphson) to numerically find t(s). Start with two guesses bracketing t; let the interval be given by [t_1, t_2]. Compute the parameter halfway along the interval, t_m = 0.5(t_1 + t_2), and compute s_m for this parameter. If s_m > s then apply the bisection method to [t_1, t_m]; otherwise, apply the bisection method to [t_m, t_2].
This approach may be time consuming, so forward differencing can be used to quickly determine points on the curve that can be used as estimates for arclength parametrisation. The estimate is given by
s(u_i) ≈ ∑_{k=1}^{i} |q(u_k) − q(u_{k−1})|.
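A sketch of the bisection search for t(s); s_of(t) is a hypothetical callable evaluating the (monotonically increasing) arclength, e.g. by numerical integration of |q'(t)|:
def t_for_arclength(s_of, s_target, t1, t2, tol=1e-6):
    """Bisection: find t in [t1, t2] such that s_of(t) == s_target."""
    while t2 - t1 > tol:
        tm = 0.5 * (t1 + t2)
        if s_of(tm) > s_target:
            t2 = tm          # the target arclength lies in the lower half
        else:
            t1 = tm          # the target arclength lies in the upper half
    return 0.5 * (t1 + t2)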
The book by Watt and Watt [3] provides considerably more
detail than we have presented here.
Orientation
We have already discussed the orientation of objects to some extent. We briefly list some of the results.
Rotation matrices can be used to represent the orientation of an object. To do so we need a consistent way to represent the rotation of the object. One technique is to use Euler angles. Euler angles represent yaw, pitch and roll angles that correspond to rotation around the y, x and z axes respectively. We can then specify our angles for an object and it will be rotated accordingly. However, we may encounter gimbal lock, where we lose a degree of freedom. This is due to the limited way in which the orientation is represented.
Quaternions offer an alternative solution to representing rotations. One of the advantages of quaternions is the relative simplicity of the calculation. Matrices may also be multiplied in the same way, however, and in addition matrices can represent translation too (which quaternions cannot do). So what are the advantages of quaternions? The advantages are:
Numerical stability - A unit quaternion is a rotation. If the quaternion is not a unit quaternion, normalise it.
Spherical linear interpolation - We can easily interpolate between quaternions, providing a smooth transition from one orientation to another. I know of no good way to do this with matrices.
In terms of animation, it is relatively easy to construct a quaternion to indicate the direction in which a camera should be looking. Once the destination direction has been determined, we can use spherical linear interpolation to smoothly transition from the one view to another. This technique can be used very effectively for a pilot that needs to look over his shoulder.
It is also possible to convert between representations, e.g. between a quaternion and a matrix. We do not have to exclusively use one technique or the other.
Articulated structures (kinematics)
An articulated structure, or multibody, is a series of rigid links which are joined in a tree structure. We will consider joints with one degree of freedom. An example of an articulated structure is shown in figure 11.1.
Figure 11.1: An articulated structure.
We will consider the use of kinematics for the animation of articulated structures. Kinematics refers to the motion without respect to physical properties such as mass, acceleration etc. We can use one of several techniques to animate articulated structures:
Forward kinematics - Given the angles of all joints, determine their positions.
Inverse kinematics - Given the desired position of one or more joints, determine possible angles that satisfy these constraints.
Forward dynamics - Given initial positions, velocities, forces, masses etc., determine the position of each joint at a particular time.
Inverse dynamics - Given the motion of an object, determine what forces must be applied to the joints to achieve this motion.
Articulated structures are very useful for animation as they allow a structure or skeleton to be used to drive the animation. The skeleton can be used to deform a model.
Forward kinematics
To describe the position of the model we specify the angle θ_i for each link. The angle is specified relative to the orientation of the parent link. The relative positions of the joints (initial state) are specified by the vectors r_i. Using these values we can determine the positions of the joints and end effectors s_i for the articulated structure. The calculation is trivially performed using the recursive evaluation
M_i = M_p R_{θ_i} T_{r_i}
of the matrix to be applied to each joint. In this case p represents the parent joint. The point s_i is then given by
s_i = M_i 0,
i.e. M_i applied to the origin. This calculation assumes that we are always rotating around the same axis, which is correct for a two dimensional model. For a three dimensional model, we need to store the axis of rotation v_i. During recursive evaluation, we first determine the axis of rotation w_i:
w_i = M_i v_i.
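A sketch of the two dimensional case for a simple chain (a tree would carry the parent matrix per branch); 3×3 homogeneous matrices are used:
import math

def fk_positions(angles, offsets):
    """2D forward kinematics: M_i = M_parent R(theta_i) T(r_i); returns joint positions."""
    def mul(A, B):
        return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
                for r in range(3)]
    M = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]      # identity at the root
    positions = []
    for theta, (rx, ry) in zip(angles, offsets):
        R = [[math.cos(theta), -math.sin(theta), 0],
             [math.sin(theta),  math.cos(theta), 0],
             [0, 0, 1]]
        T = [[1, 0, rx], [0, 1, ry], [0, 0, 1]]
        M = mul(M, mul(R, T))
        positions.append((M[0][2], M[1][2]))   # M_i applied to the origin
    return positions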
Vertex blending
Using the articulated structure we can manipulate a model. We determine which links (or bones) affect each vertex in the mesh representing the object. We might use a cylinder to determine the area of influence of a link. If several links affect a vertex, then we assign a weight w_i to each link for this vertex. The weight might be determined by the distance from the link to the vertex. Typically we would expect that
∑_{i=0}^{n−1} w_i = 1, w_i ≥ 0
for a particular vertex. The position of the vertex when the articulated structure is moved is a weighted sum of the effects of the individual links:
v_t = ∑_{i=0}^{n−1} w_i B_i(t) M_i^{−1} p,
where p is the original vertex, M_i transforms from link i's coordinate system into the world coordinate system, and B_i(t) is the transform of the link at time t. B_i can be determined by the formulae in the previous section.
Other techniques, such as the free form deformation, can also be used once the position of the articulated structure has been determined.
Inverse kinematics
Inverse kinematics attempts to find Δθ_i such that we obtain a desired set of joint positions s'_i or angles θ'_i (some may not be specified), given the current state s_i and θ_i.
We can determine the effect of changing the angles θ_i by a small amount using the Jacobian J:
dX = J(θ) dθ,
where X is the vector of joint positions s_i and end effectors, and θ is the current set of angles θ_i. The entries of the Jacobian are given by
J_{ij} = ∂f_i / ∂θ_j,
where f_i is the function that determines s_i given the angles θ_i. We thus have
X = f(θ).
If we can compute the Jacobian, then we can compute the change in angles to achieve the desired position by
Δθ = J^{−1} ΔX.
The updated angles are given by
θ' = θ + εΔθ.
We use the ε term to try to keep the process numerically stable. We can determine the function f from the matrix M_i calculated above. Computing the symbolic form of the matrix can be difficult. One option is to use a numerical approximation, but this is likely to be pretty unstable. We can use the formulation above to obtain a simpler technique for calculating the partial derivative. A more suitable option is to use the chain rule.
Mass Spring systems
Not all objects are rigid; some objects are deformable. A typical example of a deformable object is a piece of cloth. The same techniques that are applied to rigid objects can be applied to deformable objects. Often the representation of the deformable object differs to accommodate the deformation.
One representation for deformable objects is the mass spring system. In the mass spring system we have a number of point masses which are connected to other point masses by springs. If we arrange the masses linearly we may approximate a rope. If we arrange the point masses in a grid, then we can approximate a cloth. The springs in the system give flexibility and allow the object to stretch; with suitable constants, the springs may also be very rigid.
The force that a spring exerts is given by Hooke's law:
F = −k (x/|x|)(|x| − ℓ),
where k is the spring constant that determines the force exerted by the spring, ℓ is the rest length of the spring and x is the position of the endpoint of the spring under consideration (relative to the other endpoint).
The point masses may also be under the influence of gravity or other forces. Using these equations we can determine the motion of a rope or piece of cloth. More sophisticated objects may be simulated in a similar fashion. A cushion may be modelled using a cloth surface and a rigid interior, with springs that connect to the point masses.
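A sketch of the force evaluation and a simple explicit Euler step for one 3D point mass (the gravity constant is an assumption of this illustration):
import math

def spring_force(xa, xb, k, rest):
    """Hooke force on the mass at xa due to a spring connecting it to xb."""
    d = [b - a for a, b in zip(xa, xb)]
    length = math.sqrt(sum(c * c for c in d))
    # pulls xa toward xb when stretched, pushes away when compressed
    scale = k * (length - rest) / length
    return [scale * c for c in d]

def euler_step(pos, vel, force, mass, dt, gravity=(0.0, -9.8, 0.0)):
    """One explicit Euler integration step for a single point mass."""
    acc = [f / mass + g for f, g in zip(force, gravity)]
    vel = [v + dt * a for v, a in zip(vel, acc)]
    pos = [p + dt * v for p, v in zip(pos, vel)]
    return pos, vel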
Particle systems
A number of natural phenomena may be modelled quickly (but not necessarily accurately) by particles. Particles are small objects that are animated under certain rules. These rules may be physics based or other suitable rules. Usually a large number of particles is required to produce the desired effect.
A simple example of particles is a water fountain. A number of water particles are shot into the air and fall back to earth under the effect of gravity and air resistance. A pure physics model can be used in this case. Because there is minimal interaction between the fluid particles in this scenario, it is sufficient to simply model the water as numerous particles. If we want a more realistic simulation of water we may consider using the Navier-Stokes equations.
Fire may also be modelled by particles. In this case, the particles are fired out in the direction of the fire, probably with some turbulent motion, until they cool sufficiently to be no longer visible. Instead of removing the particles, the particles may be used to produce smoke.
Free Form Deformations
Many of the tools described in the previous sections used physics to model some environment. The user of such a system has limited control. The free form deformation (FFD) is a tool that allows an animator to choose the shape of an object using conventional tools. The FFD may be used to deform a rigid or soft object. The deformation is totally under the control of the animator and not physics based at all.
The great advantage of the FFD is that it may be applied to any object and is very simple to implement. Free form deformations have been used successfully as an animation tool, as well as an automated control for more physically realistic applications. FFDs were used successfully for modelling muscles when controlled by a skeleton based kinematic system [3].
The free form deformation is based on Bezier volumes. In the same way that we extend Bezier curves to Bezier surfaces, we can further extend the Bezier surface to a Bezier volume. A Bezier volume is simply given by
q(u, v, w) = ∑_{i=0}^{3} ∑_{j=0}^{3} ∑_{k=0}^{3} B_i(u) B_j(v) B_k(w) p_{i,j,k}.
Naturally we are not restricted to cubic curves, but this is the form we will use for the FFD. There are thus three parameters describing the volume. Initially we arrange the control points in a grid fashion so that the boundaries are given by straight lines and planes. The Bezier volume is created so that it surrounds the object to be deformed. The object is represented in the coordinate system of the Bezier volume: for each vertex s we find s_u, s_v and s_w so that
s = ∑_{i=0}^{3} ∑_{j=0}^{3} ∑_{k=0}^{3} B_i(s_u) B_j(s_v) B_k(s_w) p_{i,j,k}.
We then use these parameters to represent the vertex rather than the world coordinates. It is trivial to determine these parameters due to the (grid) linear arrangement of the control points. To deform the object, we simply move the control points of the FFD, thereby moving and warping the space in which the object lies. When we render the object we use the equation
q(u, v, w) = ∑_{i=0}^{3} ∑_{j=0}^{3} ∑_{k=0}^{3} B_i(u) B_j(v) B_k(w) p_{i,j,k}
to convert from the object space (s_u, s_v, s_w) to the world coordinates. The free form deformation can thus be applied to various kinds of objects.
It is important to remember that the space in which the object lies is warped. Straight lines in object space are not necessarily straight in the warped space produced by the FFD. We cannot necessarily just transform the control points; rather, the rendering process should be carefully controlled.
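A sketch of the Bezier volume evaluation used at render time, assuming a 4×4×4 control lattice control[i][j][k] of 3D points (possibly displaced by the animator):
def bernstein3(u):
    """The four cubic Bernstein polynomials at u."""
    return [(1 - u)**3, 3*u*(1 - u)**2, 3*u**2*(1 - u), u**3]

def ffd_point(control, su, sv, sw):
    """Deformed world position of a vertex with FFD coordinates (su, sv, sw)."""
    bu, bv, bw = bernstein3(su), bernstein3(sv), bernstein3(sw)
    out = [0.0, 0.0, 0.0]
    for i in range(4):
        for j in range(4):
            for k in range(4):
                weight = bu[i] * bv[j] * bw[k]
                for c in range(3):
                    out[c] += weight * control[i][j][k][c]
    return tuple(out)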
Bibliography
[1] Real-Time Rendering, Tomas Akenine-Möller and Eric Haines, Second Edition, AK Peters, 2002.
[2] 3D Computer Graphics - A Mathematical Introduction with OpenGL, Samuel R. Buss, Cambridge University Press, 2003.
[3] Advanced Animation and Rendering Techniques - Theory and Practice, Alan Watt and Mark Watt, Addison-Wesley, 1992.
[4] Computer Graphics - Principles and Practice, Foley, van Dam, Feiner and Hughes, Second Edition, Addison-Wesley, 1996.
[5] Realistic Image Synthesis Using Photon Mapping, Henrik Wann Jensen, AK Peters, 2001.
[6] http://www.graphics.cornell.edu/online/research/, Radiosity images.
[7] http://graphics.ucsd.edu/henrik/, Photon mapping images.