Computer Graphics & Image Processing
- Sixteen lectures
- Part IB
- Part II (General)

What are Computer Graphics & Image Processing?
[diagram: scene description and image, linked by computer graphics and image processing]
What are Computer Graphics & Image Processing?
[diagram: scene description → (computer graphics) → image; image → (image analysis & computer vision) → scene description; image → (image processing) → image]

Why bother with CG & IP?
- all visual computer output depends on computer graphics
  - printed output
  - monitor (CRT/LCD/whatever)
- all visual computer output consists of real images
What are CG & IP used for?
- 2D computer graphics
  - graphical user interfaces: Mac, Windows, X,…
  - graphic design: posters, cereal packets,…
  - typesetting: book publishing, report writing,…
- Image processing
  - photograph retouching: publishing, posters,…
  - photocollaging: satellite imagery,…
  - art: new forms of artwork based on digitised images
- 3D computer graphics
  - visualisation: scientific, medical, architectural,…
  - Computer Aided Design (CAD)
  - entertainment: special effects, games, movies,…

Course Structure
- Background [3L]: images, human vision, displays
- 2D computer graphics [4L]: lines, curves, clipping, polygon filling, transformations
- 3D computer graphics [6L]: projection (3D→2D), surfaces, clipping, transformations, lighting, filling, ray tracing, texture mapping
- Image processing [3L]: filtering, compositing, half-toning, dithering, encoding, compression
Course books
- Computer Graphics: Principles & Practice
  - Foley, van Dam, Feiner & Hughes, Addison-Wesley, 1990
  - older version: Fundamentals of Interactive Computer Graphics, Foley & van Dam, Addison-Wesley, 1982
- Computer Graphics & Virtual Environments
  - Slater, Steed & Chrysanthou, Addison-Wesley, 2002

Past exam questions
- Dr Dodgson has been lecturing the course since 1996
  - the course changed considerably between 1996 and 1997
- all questions from 1997 onwards are good examples of his question setting style
- do not worry about the last 5 marks of 97/5/2
  - this is now part of the Advanced Graphics syllabus
Background
- what is a digital image?
- what are the constraints on digital images?
- how does human vision work?
- what are the limits of human vision?
- what can we get away with given these constraints & limits?
- how do displays & printers work?
- how do we fool the human eye into seeing what we want it to see?

What is an image?
- two dimensional function
- value at any point is an intensity or colour
- not digital!
What is a digital image?
- a contradiction in terms
  - if you can see it, it’s not digital
  - if it’s digital, it’s just a collection of numbers
- a sampled and quantised version of a real image
- a rectangular array of intensity or colour values

Image capture
- a variety of devices can be used
  - scanners
    - line CCD in a flatbed scanner
    - spot detector in a drum scanner
  - cameras
    - area CCD
Image capture example
[array of pixel intensity values: 103 59 12 80 56 12 34 30 1 78 79 21 145 156 52 …]

Image display
- a digital image is an array of integers, how do you display it?
- reconstruct a real image on some sort of display device
  - CRT: computer monitor, TV set
  - LCD: portable computer
  - printer: dot matrix, laser printer, dye sublimation
Image display example
[the same array of pixel values as on the previous slide, shown next to the image it produces when displayed on a CRT]

Different ways of displaying the same digital image
- nearest-neighbour (e.g. LCD)
- Gaussian (e.g. CRT)
- half-toning (e.g. laser printer)
Sampling
- a digital image is a rectangular array of intensity values
- each value is called a pixel
  - “picture element”
- sampling resolution is normally measured in pixels per inch (ppi) or dots per inch (dpi)
  - computer monitors have a resolution around 100 ppi
  - laser printers have resolutions between 300 and 1200 ppi

Sampling resolution
[the same image sampled at 256×256, 128×128, 64×64, and 32×32]
Quantisation
- each intensity value is a number

Quantisation levels
[the same image quantised to 8 bits (256 levels), 7 bits (128 levels), 6 bits (64 levels), and 5 bits (32 levels)]
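As a quick illustration (not part of the original slides), quantisation to a smaller number of bits can be sketched in Python: each 8-bit intensity is snapped down to the bottom of its quantisation band, so only 2^bits distinct values survive.

```python
def quantise(value, bits):
    """Snap an 8-bit intensity (0-255) to the bottom of its band
    when only `bits` bits (2**bits levels) are available."""
    step = 256 // (1 << bits)      # width of each quantisation band
    return (value // step) * step
```

With 8 bits every value is unchanged; at 5 bits neighbouring intensities collapse onto one of 32 levels, which is what produces the visible banding in the slide's example images.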
The workings of the human visual system
- to understand the requirements of displays (resolution, quantisation and colour) we need to know how the human eye works...
- the lens of the eye forms an image of the world on the retina: the back surface of the eye
[GW Fig 2.1, 2.2; Sec 2.1.1. FLS Fig 35-2]

The retina
- consists of ~150 million light receptors
- retina outputs information to the brain along the optic nerve
  - there are ~1 million nerve fibres in the optic nerve
- the retina performs significant pre-processing to reduce the number of signals from 150M to 1M
- pre-processing includes:
  - averaging multiple inputs together
  - colour signal processing
  - edge detection
Some of the processing in the eye
- discrimination
  - discriminates between different intensities and colours
- adaptation
  - adapts to changes in illumination level and colour
  - can see about 1:100 contrast at any given time
  - but can adapt to see light over a range of 10^10
- persistence
  - integrates light over a period of about 1/30 second
- edge detection and edge enhancement
  - visible in e.g. Mach banding effects
[GW Fig 2.4]

Simultaneous contrast
- as well as responding to changes in overall light, the eye responds to local changes
- the centre square is the same intensity in all four cases
[GLA Fig 1.17]
Mach bands
- show the effect of edge enhancement in the retina’s pre-processing

Ghost squares
- another effect caused by retinal pre-processing
Light detectors in the retina
- two classes
  - rods
  - cones
- cones come in three types
  - sensitive to short, medium and long wavelengths
- the fovea is a densely packed region in the centre of the retina
  - contains the highest density of cones
  - provides the highest resolution vision
[GW Fig 2.1, 2.2]

Foveal vision
- 150,000 cones per square millimetre in the fovea
  - high resolution
  - colour
- outside fovea: mostly rods
  - lower resolution
  - principally monochromatic
  - provides peripheral vision
    - allows you to keep the high resolution region in context
    - allows you to avoid being hit by passing branches
Summary of what human eyes do...
- sample the image that is projected onto the retina

What is required for vision?
- illumination
  - some source of light
Light: wavelengths & spectra
- light is electromagnetic radiation
  - visible light is a tiny part of the electromagnetic spectrum
  - visible light ranges in wavelength from 700nm (red end of spectrum) to 400nm (violet end)
- every light has a spectrum of wavelengths that it emits
- every object has a spectrum of wavelengths that it reflects (or transmits)
- the combination of the two gives the spectrum of wavelengths that arrive at the eye
[MIN Fig 22a; MIN Examples 1 & 2]

Classifying colours
- we want some way of classifying colours and, preferably, quantifying them
- we will discuss:
  - Munsell’s artists’ scheme
    - which classifies colours on a perceptual basis
  - the mechanism of colour vision
    - how colour perception works
  - various colour spaces
    - which quantify colour based on either physical or perceptual models of colour
Munsell’s colour classification system
- three axes
  - hue: the dominant colour
  - lightness: bright colours/dark colours
  - saturation: vivid colours/dull colours
- can represent this as a 3D graph
- any two adjacent colours are a standard “perceptual” distance apart
[MIN Fig 4]

Colour vision
- three types of cone
- each responds to a different spectrum
  - very roughly long, medium, and short wavelengths
  - each has a response function l(λ), m(λ), s(λ)
- different numbers of the different types
  - far fewer of the short wavelength receptors
  - so cannot see fine detail in blue
[JMF Fig 20b]
Colour signals sent to the brain
- the signal that is sent to the brain is pre-processed by the retina
  - long + medium + short = luminance
  - long − medium = red-green
  - long + medium − short = yellow-blue
- this theory explains:
  - colour-blindness effects
  - why red, yellow, green and blue are perceptually important
  - why you can see e.g. a yellowish red but not a greenish red

Chromatic metamerism
- many different spectra will induce the same response in our cones
  - the values of the three perceived values can be calculated as:
    - l = k ∫ P(λ) l(λ) dλ
    - m = k ∫ P(λ) m(λ) dλ
    - s = k ∫ P(λ) s(λ) dλ
  - k is some constant, P(λ) is the spectrum of the light incident on the retina
  - two different spectra (e.g. P1(λ) and P2(λ)) can give the same values of l, m, s
- we can thus fool the eye into seeing (almost) any colour by mixing correct proportions of some small number of lights
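The integrals above can be illustrated numerically. The sketch below is not real cone data: the response curve is a toy Gaussian, and the two spectra are deliberately contrived mirror images so that they produce identical responses, i.e. a (contrived) metamer pair.

```python
import math

LAMBDAS = [400 + i for i in range(301)]        # 400-700 nm, 1 nm steps

def response(P, bar, k=1.0):
    """Riemann-sum version of  l = k * integral of P(lam) * bar(lam)."""
    return k * sum(P(lam) * bar(lam) for lam in LAMBDAS)   # d(lam) = 1 nm

def gaussian(centre, width):
    return lambda lam: math.exp(-((lam - centre) / width) ** 2)

l_bar = gaussian(550, 40)     # toy 'long'-cone response, NOT measured data

# Two different spectra, mirror images about 550 nm: with this symmetric
# toy response they give exactly the same cone signal.
P1 = gaussian(500, 10)
P2 = gaussian(600, 10)
```

P1 and P2 are clearly different spectra, yet `response(P1, l_bar)` equals `response(P2, l_bar)`, which is the metamerism point the slide is making.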
Mixing coloured lights
- by mixing different amounts of red, green, and blue lights we can generate a wide range of responses in the human eye

XYZ colour space
- not every wavelength can be represented as a mix of red, green, and blue
- but matching & defining coloured light with a mixture of three fixed primaries is desirable
- CIE define three standard primaries: X, Y, Z
  - Y matches the human eye’s response to light of a constant intensity at each wavelength (the luminous-efficiency function)
[FvDFH Sec 13.2.2; Figs 13.20, 13.22, 13.23]
CIE chromaticity diagram
- chromaticity values are defined in terms of x, y, z
  - x = X/(X+Y+Z), y = Y/(X+Y+Z), z = Z/(X+Y+Z), ∴ x + y + z = 1
- ignores luminance
  - can be plotted as a 2D function
- pure colours (single wavelength) lie along the outer curve
- all other colours are a mix of pure colours and hence lie inside the curve
- points outside the curve do not exist as colours
[FvDFH Fig 13.24; Colour plate 2]

RGB in XYZ space
- CRTs and LCDs mix red, green, and blue to make all other colours
- the red, green, and blue primaries each map to a point in XYZ space
- any colour within the resulting triangle can be displayed
  - any colour outside the triangle cannot be displayed
  - for example: CRTs cannot display very saturated purples, blues, or greens
[FvDFH Figs 13.26, 13.27]
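The projection from XYZ onto the chromaticity diagram is a one-liner; a small sketch (illustration only, not from the slides):

```python
def chromaticity(X, Y, Z):
    """Project a colour in XYZ space onto the chromaticity diagram:
    x = X/(X+Y+Z), y = Y/(X+Y+Z), z = Z/(X+Y+Z)."""
    s = X + Y + Z
    return X / s, Y / s, Z / s
```

Since x + y + z = 1, the pair (x, y) is enough, and scaling (X, Y, Z) by any factor leaves (x, y, z) unchanged, which is exactly the sense in which the diagram "ignores luminance".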
Colour spaces
- CIE: XYZ, Yxy
- Pragmatic
  - used because they relate directly to the way that the hardware works
  - RGB, CMY, CMYK
- Munsell-like
  - considered by many to be easier for people to use than the pragmatic colour spaces
  - HSV, HLS
- Uniform
  - equal steps in any direction make equal perceptual differences
  - L*a*b*, L*u*v*
[FvDFH Fig 13.28; Figs 13.30, 13.35. GLA Figs 2.1, 2.2; Colour plates 3 & 4]

Summary of colour spaces
- the eye has three types of colour receptor
- therefore we can validly use a three-dimensional co-ordinate system to represent colour
- XYZ is one such co-ordinate system
  - Y is the eye’s response to intensity (luminance)
  - X and Z are, therefore, the colour co-ordinates
    - same Y, change X or Z ⇒ same intensity, different colour
    - same X and Z, change Y ⇒ same colour, different intensity
- some other systems use three colour co-ordinates
  - luminance can then be derived as some function of the three
    - e.g. in RGB: Y = 0.299 R + 0.587 G + 0.114 B
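The RGB luminance formula at the end of the slide translates directly into code (a sketch for illustration):

```python
def luminance(r, g, b):
    """Luminance as the weighted sum from the slide; the weights sum
    to one and reflect the eye's differing sensitivity to the
    three primaries (green counts most, blue least)."""
    return 0.299 * r + 0.587 * g + 0.114 * b
```

Note that pure green comes out brighter than pure red, which in turn is brighter than pure blue, even though all three have the same 8-bit value.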
Implications of vision on resolution
- in theory you can see about 600 dpi, 30 cm from your eye
- in practice, opticians say that the acuity of the eye is measured as the ability to see a white gap, 1 minute wide, between two black lines
  - about 300 dpi at 30 cm
- resolution decreases as contrast decreases
- colour resolution is much worse than intensity resolution
  - this is exploited in TV broadcast

Implications of vision on quantisation
- humans can distinguish, at best, about a 2% change in intensity
  - not so good at distinguishing colour differences
- for TV ⇒ 10 bits of intensity information
  - 8 bits is usually sufficient
  - why use only 8 bits? why is it usually acceptable?
- for movie film ⇒ 14 bits of intensity information
- for TV the brightest white is about 25× as bright as the darkest black
  - movie film has about 10× the contrast ratio of TV
Storing images in memory
- 8 bits has become a de facto standard for greyscale images

Colour images
- tend to be 24 bits per pixel
  - 3 bytes: one red, one green, one blue
The frame buffer
- most computers have a special piece of memory reserved for storage of the current image being displayed
[diagram: BUS → frame buffer → output stage (e.g. DAC) → display]
- the frame buffer normally consists of dual-ported Dynamic RAM (DRAM)
  - sometimes referred to as Video RAM (VRAM)

Double buffering
- if we allow the currently displayed image to be updated then we may see bits of the image being displayed halfway through the update
  - this can be visually disturbing, especially if we want the illusion of smooth animation
- double buffering solves this problem: we draw into one frame buffer and display from the other
  - when drawing is complete we flip buffers
[diagram: BUS → Buffer A / Buffer B → output stage (e.g. DAC) → display]
Image display
- a handful of technologies cover over 99% of all display devices
- active displays
  - cathode ray tube: most common, declining use
  - liquid crystal display: rapidly increasing use
  - plasma displays: still rare, but increasing use
  - special displays: e.g. LEDs for special applications
- printers (passive displays)
  - laser printers
  - ink jet printers
  - several other technologies

Liquid crystal display
- liquid crystal can twist the polarisation of light
- control is by the voltage that is applied across the liquid crystal
  - either on or off: transparent or opaque
- greyscale can be achieved in some liquid crystals by varying the voltage
- colour is achieved with colour filters
- low power consumption but image quality not as good as cathode ray tubes
[JMF Figs 90, 91]
Cathode ray tubes
- focus an electron gun on a phosphor screen
  - produces a bright spot
- scan the spot back and forth, up and down to cover the whole screen
- vary the intensity of the electron beam to change the intensity of the spot

How fast do CRTs need to be?
- speed at which the entire screen is updated is called the “refresh rate”
- 50Hz (PAL TV, used in most of Europe)
  - many people can see a slight flicker
- 60Hz (NTSC TV, used in USA and Japan)
  - better
- flicker/resolution trade-off: PAL 50Hz at 768×576, NTSC 60Hz at 640×480
Colour CRTs: shadow masks
- use three electron guns & colour phosphors
- electrons have no colour
  - use shadow mask to direct electrons from each gun onto the appropriate phosphor
- the electron beams’ spots are bigger than the shadow mask pitch
  - can get spot size down to 7/4 of the pitch
  - pitch can get down to 0.25mm with delta arrangement of phosphor dots
  - with a flat tension shadow mask can reduce this to 0.15mm
[FvDFH Fig 4.14]

Printers
- many types of printer
  - ink jet
    - sprays ink onto paper
  - dot matrix
    - pushes pins against an ink ribbon and onto the paper
  - laser printer
    - uses a laser to lay down a pattern of charge on a drum; this picks up charged toner which is then pressed onto the paper
- all make marks on paper
  - essentially binary devices: mark/no mark
Printer resolution
- laser printer
  - up to 1200 dpi, generally 600 dpi
- ink jet
  - used to be lower resolution & quality than laser printers but now have comparable resolution
- phototypesetter
  - up to about 3000 dpi
- bi-level devices: each pixel is either black or white

What about greyscale?
- achieved by halftoning
  - divide image into cells; in each cell draw a spot of the appropriate size for the intensity of that cell
  - on a printer each cell is m×m pixels, allowing m²+1 different intensity levels
  - e.g. 300 dpi with 4×4 cells ⇒ 75 cells per inch, 17 intensity levels
  - phototypesetters can make 256 intensity levels in cells so small you can only just see them
- an alternative method is dithering
  - dithering photocopies badly, halftoning photocopies well
- will discuss halftoning and dithering in the Image Processing section of the course
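The m²+1 levels claim can be made concrete with a toy halftone cell (an illustration, not the course's algorithm; real screens grow a roughly round spot from the cell centre rather than filling row by row):

```python
def halftone_cell(intensity, m=4):
    """m x m halftone cell for an intensity in [0, 1]: the number of
    'on' dots is proportional to intensity, so the cell can show
    m*m + 1 distinct levels (0 dots up to all m*m dots).
    Fill order here is naive row-by-row."""
    dots = round(intensity * m * m)
    cell = [[0] * m for _ in range(m)]
    for i in range(dots):
        cell[i // m][i % m] = 1
    return cell
```

With m = 4 there are exactly 17 distinguishable cell patterns, matching the slide's 300 dpi / 4×4 example.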
Dye sublimation printers: true greyscale
- dye sublimation gives true greyscale
[diagram: pixel-sized heater, dye sheet, direction of travel, special paper]
- dye sublimes off the dye sheet and onto the paper in proportion to the heat level

What about colour?
- generally use cyan, magenta, yellow, and black inks (CMYK)
- inks absorb colour
  - c.f. lights, which emit colour
  - CMY is the inverse of RGB
- why is black (K) necessary?
  - inks are not perfect absorbers
[JMF Fig 9b]
How do you produce halftoned colour?
- print four halftone screens, one in each colour
[Colour plate 5]

2D Computer Graphics
- lines
Drawing a straight line
- a straight line can be defined by: y = mx + c
  - m is the slope of the line, c its intercept on the y axis
- a mathematical line is “length without breadth”
- a computer graphics line is a set of pixels
- which pixels do we need to turn on to draw a given line?

Which pixels do we use?
- there are two reasonably sensible alternatives:
  - every pixel through which the line passes (can have either one or two pixels in each column)
  - the “closest” pixel to the line in each column (always have just one pixel in every column)
- in general, use the second
A line drawing algorithm – preparation 1
- pixel (x,y) has its centre at real co-ordinate (x,y)
- it thus stretches from (x−½, y−½) to (x+½, y+½)
[diagram: pixel (x,y) on a grid running from x−1½ to x+1½ and y−½ to y+1½]

A line drawing algorithm – preparation 2
- the line goes from (x0,y0) to (x1,y1)
- the line lies in the first octant (0 ≤ m ≤ 1)
- x0 < x1
Bresenham’s line drawing algorithm 1
- naïve algorithm involves floating point arithmetic & rounding inside the loop ⇒ slow
- assumes integer end points

    Initialisation:
        d = (y1 - y0) / (x1 - x0)
        x = x0
        yi = y0
        y = y0
        DRAW(x,y)

    Iteration:
        WHILE x < x1 DO
            x = x + 1
            yi = yi + d
            y = ROUND(yi)
            DRAW(x,y)
        END WHILE

[J. E. Bresenham, “Algorithm for Computer Control of a Digital Plotter”, IBM Systems Journal, 4(1), 1965]

Bresenham’s line drawing algorithm 2
- Speed up A: separate the integer and fractional parts of yi (into y and yf) and replace rounding by an IF
  - removes need to do rounding

    d = (y1 - y0) / (x1 - x0)
    x = x0
    yf = 0
    y = y0
    DRAW(x,y)
    WHILE x < x1 DO
        x = x + 1
        yf = yf + d
        IF ( yf > ½ ) THEN
            y = y + 1
            yf = yf - 1
        END IF
        DRAW(x,y)
    END WHILE
Bresenham’s line drawing algorithm 3
- Speed up B: multiply all operations involving yf by 2(x1 - x0)
  - yf = yf + dy/dx → yf = yf + 2dy
  - yf > ½ → yf > dx
  - yf = yf - 1 → yf = yf - 2dx
- removes need to do floating point arithmetic if end-points have integer co-ordinates

    dy = (y1 - y0)
    dx = (x1 - x0)
    x = x0
    yf = 0
    y = y0
    DRAW(x,y)
    WHILE x < x1 DO
        x = x + 1
        yf = yf + 2dy
        IF ( yf > dx ) THEN
            y = y + 1
            yf = yf - 2dx
        END IF
        DRAW(x,y)
    END WHILE

Bresenham’s algorithm for floating point end points

    d = (y1 - y0) / (x1 - x0)
    x = ROUND(x0)
    yi = y0 + d * (x - x0)
    y = ROUND(yi)
    yf = yi - y
    DRAW(x,y)
    WHILE x < (x1 - ½) DO
        x = x + 1
        yf = yf + d
        IF ( yf > ½ ) THEN
            y = y + 1
            yf = yf - 1
        END IF
        DRAW(x,y)
    END WHILE
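The integer-only "speed up B" pseudocode translates almost line for line into Python. A sketch, assuming (as the slides do) the first octant and integer endpoints:

```python
def bresenham(x0, y0, x1, y1):
    """Integer-only Bresenham line for the first octant
    (0 <= slope <= 1, x0 < x1): yf is the fractional error
    scaled by 2*dx, so no floating point is needed."""
    dy, dx = y1 - y0, x1 - x0
    x, y, yf = x0, y0, 0
    pixels = [(x, y)]
    while x < x1:
        x += 1
        yf += 2 * dy                 # yf = yf + dy/dx, scaled by 2*dx
        if yf > dx:                  # yf > 1/2, scaled by 2*dx
            y += 1
            yf -= 2 * dx             # yf = yf - 1, scaled by 2*dx
        pixels.append((x, y))
    return pixels
```

For the line from (0,0) to (4,2) this selects one pixel per column, stepping up whenever the accumulated error crosses the pixel midpoint.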
Bresenham’s algorithm — more details
- we assumed that the line is in the first octant
  - can do fifth octant by swapping end points

A second line drawing algorithm
- a line can be specified using an equation of the form: k = ax + by + c
Midpoint line drawing algorithm 1
- given that a particular pixel is on the line, the next pixel must be either immediately to the right (E) or to the right and up one (NE)
- use a decision variable (based on k) to determine which way to go
- evaluate the decision variable at the midpoint between the two candidate pixels
  - if ≥ 0 then go NE
  - if < 0 then go E

Midpoint line drawing algorithm 2
- decision variable needs to make a decision at point (x+1, y+½)
  - d = a(x+1) + b(y+½) + c
- if go E then the new decision variable is at (x+2, y+½)
  - d′ = a(x+2) + b(y+½) + c = d + a
- if go NE then the new decision variable is at (x+2, y+1½)
  - d′ = a(x+2) + b(y+1½) + c = d + a + b
Midpoint line drawing algorithm 3

    Initialisation:
        a = (y1 - y0)
        b = -(x1 - x0)
        c = x1 y0 - x0 y1
        x = ROUND(x0)
        y = ROUND(y0 - (x - x0)(a / b))
        d = a * (x+1) + b * (y+½) + c
        DRAW(x,y)

    Iteration:
        WHILE x < (x1 - ½) DO
            x = x + 1
            IF d < 0 THEN
                d = d + a          (E case: just increment x)
            ELSE
                d = d + a + b      (NE case: increment x & y)
                y = y + 1
            END IF
            DRAW(x,y)
        END WHILE

Midpoint – comments
- this version only works for lines in the first octant
  - extend to other octants as for Bresenham
- Sproull has proven that Bresenham and Midpoint give identical results
- Midpoint algorithm can be generalised to draw arbitrary circles & ellipses
  - Bresenham can only be generalised to draw circles with integer radii
- if end-points have integer co-ordinates then all operations can be in integer arithmetic
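The midpoint pseudocode above can be sketched in Python. Assuming integer endpoints in the first octant (so the ROUND calls in the initialisation are unnecessary):

```python
def midpoint_line(x0, y0, x1, y1):
    """Midpoint line algorithm for the first octant, integer endpoints.
    The line is k = ax + by + c; d is that expression evaluated at the
    midpoint (x+1, y+1/2) between the two candidate next pixels."""
    a = y1 - y0
    b = -(x1 - x0)
    c = x1 * y0 - x0 * y1
    x, y = x0, y0
    d = a * (x + 1) + b * (y + 0.5) + c
    pixels = [(x, y)]
    while x < x1:
        x += 1
        if d < 0:
            d += a          # E: stay on this row
        else:
            d += a + b      # NE: step up as well
            y += 1
        pixels.append((x, y))
    return pixels
```

Away from exact midpoint ties this selects the same pixels as Bresenham, consistent with Sproull's equivalence result mentioned above.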
Curves
- circles & ellipses
- Bezier cubics

Midpoint circle algorithm 1
- equation of a circle is x² + y² = r²
  - centred at the origin
Midpoint circle algorithm 2
- decision variable needs to make a decision at point (x+1, y−½)
  - d = (x+1)² + (y−½)² − r²
- if go E then the new decision variable is at (x+2, y−½)
  - d′ = (x+2)² + (y−½)² − r² = d + 2x + 3

Taking circles further
- the algorithm can be easily extended to circles not centred at the origin
- a similar method can be derived for ovals
  - but: cannot naïvely use octants
  - use points of 45° slope to divide the oval into eight sections
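A sketch of the resulting algorithm for one octant. The chunk above only derives the E update; the SE update below follows from the same derivation (evaluate d at (x+2, y−1½)) and is an assumption on my part rather than text from the slides:

```python
def midpoint_circle_octant(r):
    """Pixels of the second octant (from (0, r) round to the 45 degree
    point) of a circle of integer radius r centred at the origin,
    using d = (x+1)**2 + (y-1/2)**2 - r**2."""
    x, y = 0, r
    d = 1 + (y - 0.5) ** 2 - r ** 2     # decision variable at (1, r-1/2)
    pixels = [(x, y)]
    while x < y:
        x += 1
        if d < 0:
            d += 2 * x + 1              # E: same row
        else:
            y -= 1
            d += 2 * x + 1 - 2 * y      # SE: step down as well
        pixels.append((x, y))
    return pixels
```

The other seven octants follow by symmetry (reflecting each (x, y) pixel); scaling d by 4 would remove the ½ offsets and make the arithmetic purely integer.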
Are circles & ellipses enough?
- simple drawing packages use ellipses & segments of ellipses
- for graphic design & CAD need something with more flexibility
  - use cubic polynomials

Why cubics?
- lower orders cannot:
  - have a point of inflection
  - match both position and slope at both ends of a segment
  - be non-planar in 3D
- higher orders:
  - can wiggle too much
  - take longer to compute
Hermite cubic
- the Hermite form of the cubic is defined by its two end-points and by the tangent vectors at these end-points:
  - P(t) = (2t³ − 3t² + 1) P0 + (−2t³ + 3t²) P1 + (t³ − 2t² + t) T0 + (t³ − t²) T1
- two Hermite cubics can be smoothly joined by matching both position and tangent at an end point of each cubic
[Charles Hermite, mathematician, 1822–1901]

Bezier cubic
- difficult to think in terms of tangent vectors
- Bezier defined by two end points and two other control points
  - P(t) = (1−t)³ P0 + 3t(1−t)² P1 + 3t²(1−t) P2 + t³ P3
  - where Pi ≡ (xi, yi)
[Pierre Bézier worked for Citroën in the 1960s]
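Evaluating the Bezier formula is a direct transcription (a sketch for illustration; points are tuples of coordinates):

```python
def bezier(p0, p1, p2, p3, t):
    """Point on the Bezier cubic at parameter t, using the Bernstein
    weights b0..b3 (which sum to one for every t)."""
    b0 = (1 - t) ** 3
    b1 = 3 * t * (1 - t) ** 2
    b2 = 3 * t ** 2 * (1 - t)
    b3 = t ** 3
    return tuple(b0 * u + b1 * v + b2 * w + b3 * z
                 for u, v, w, z in zip(p0, p1, p2, p3))
```

At t = 0 only b0 is non-zero and at t = 1 only b3 is, so the curve interpolates its two end points but in general only approaches the two middle control points.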
Bezier properties
- Bezier is equivalent to Hermite
  - T0 = 3(P1 − P0), T1 = 3(P3 − P2)
- weighting functions are Bernstein polynomials
  - b0(t) = (1−t)³, b1(t) = 3t(1−t)², b2(t) = 3t²(1−t), b3(t) = t³
- weighting functions sum to one
  - Σ(i=0..3) bi(t) = 1

Types of curve join
- each curve is smooth within itself
- joins at endpoints can be:
  - C1: continuous in both position and tangent vector
    - smooth join
  - C0: continuous in position
    - “corner”
  - discontinuous in position
Drawing a Bezier cubic – naïve method
- draw as a set of short line segments equispaced in parameter space, t

    (x0,y0) = Bezier(0)
    FOR t = 0.05 TO 1 STEP 0.05 DO
        (x1,y1) = Bezier(t)
        DrawLine( (x0,y0), (x1,y1) )
        (x0,y0) = (x1,y1)
    END FOR

- problems:
  - cannot fix a number of segments that is appropriate for all possible Beziers: too many or too few segments
  - distance in real space, (x,y), is not linearly related to distance in parameter space, t

Drawing a Bezier cubic – sensible method
- adaptive subdivision
  - check if a straight line between P0 and P3 is an adequate approximation to the Bezier
  - if so: draw the straight line
  - if not: divide the Bezier into two halves, each a Bezier, and repeat for the two new Beziers
- need to specify some tolerance for when a straight line is an adequate approximation
  - when the Bezier lies within half a pixel width of the straight line along its entire length
Drawing a Bezier cubic (continued)

    Procedure DrawCurve( Bezier curve )
    VAR Bezier left, right
    BEGIN DrawCurve
        IF Flat( curve ) THEN
            DrawLine( curve )
        ELSE
            SubdivideCurve( curve, left, right )
            DrawCurve( left )
            DrawCurve( right )
        END IF
    END DrawCurve

- a suitable flatness test: e.g. if P1 and P2 both lie within half a pixel width of the line joining P0 to P3, then draw a line between P0 and P3: we already know how to do this
  - Exercise: how do you calculate the distance from P1 to P0P3?
- how do we subdivide the curve? see the next slide…

Subdividing a Bezier cubic into two halves
- a Bezier cubic can be easily subdivided into two smaller Bezier cubics:

    Q0 = P0                            R0 = ⅛P0 + ⅜P1 + ⅜P2 + ⅛P3
    Q1 = ½P0 + ½P1                     R1 = ¼P1 + ½P2 + ¼P3
    Q2 = ¼P0 + ½P1 + ¼P2               R2 = ½P2 + ½P3
    Q3 = ⅛P0 + ⅜P1 + ⅜P2 + ⅛P3        R3 = P3

- Exercise: prove that the Bezier cubic curves defined by Q0, Q1, Q2, Q3 and R0, R1, R2, R3 match the Bezier cubic curve defined by P0, P1, P2, P3 over the ranges t∈[0,½] and t∈[½,1] respectively
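The subdivision weights above can be computed by repeated midpoints (this is de Casteljau's construction at t = ½); a sketch with points as coordinate tuples:

```python
def subdivide(p0, p1, p2, p3):
    """Split a Bezier cubic at t = 1/2 into two Bezier cubics (Q, R)
    by repeated midpoints; equivalent to the weights on the slide."""
    mid = lambda a, b: tuple((u + v) / 2 for u, v in zip(a, b))
    q1 = mid(p0, p1)          # 1/2 P0 + 1/2 P1
    m = mid(p1, p2)
    r2 = mid(p2, p3)          # 1/2 P2 + 1/2 P3
    q2 = mid(q1, m)           # 1/4 P0 + 1/2 P1 + 1/4 P2
    r1 = mid(m, r2)           # 1/4 P1 + 1/2 P2 + 1/4 P3
    q3 = mid(q2, r1)          # 1/8 P0 + 3/8 P1 + 3/8 P2 + 1/8 P3 = R0
    return (p0, q1, q2, q3), (q3, r1, r2, p3)
```

Q3 and R0 are the same point, the point on the curve at t = ½, so the two halves join with no gap; using only halving and addition also means the subdivision can be done without general multiplication.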
What if we have no tangent vectors?
- base each cubic piece on the four surrounding data points
- at each data point the curve must depend solely on the three surrounding data points
  - define the tangent at each point as the direction from the preceding point to the succeeding point
  - tangent at P1 is ½(P2 − P0), at P2 is ½(P3 − P1)
- this is the basis of Overhauser’s cubic

Overhauser’s cubic
- method
  - calculate the appropriate Bezier or Hermite values from the given points
  - e.g. given points A, B, C, D, the Bezier control points are:
    - P0 = B,  P1 = B + (C−A)/6
    - P3 = C,  P2 = C − (D−B)/6
    - Why?
- (potential) problem
  - moving a single point modifies the surrounding four curve segments (c.f. Bezier where moving a single point modifies just the two segments connected to that point)
- good for control of movement in animation
[Overhauser worked for the Ford motor company in the 1960s]
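The control-point construction for the B-to-C segment is a two-liner; a sketch with points as coordinate tuples:

```python
def overhauser_controls(a, b, c, d):
    """Bezier control points for the segment from B to C, derived
    from the four surrounding data points A, B, C, D as on the slide:
    P1 = B + (C-A)/6 and P2 = C - (D-B)/6."""
    p0 = b
    p1 = tuple(bi + (ci - ai) / 6 for ai, bi, ci in zip(a, b, c))
    p2 = tuple(ci - (di - bi) / 6 for bi, ci, di in zip(b, c, d))
    p3 = c
    return p0, p1, p2, p3
```

This answers the slide's "Why?": since the Bezier tangent at P0 is T0 = 3(P1 − P0) = (C − A)/2, the /6 factor is exactly what makes the tangent at B equal ½(C − A), the tangent definition from the previous slide.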
Simplifying line chains
- the problem: you are given a chain of line segments at a very high resolution; how can you reduce the number of line segments without compromising the quality of the line?
  - e.g. given the coastline of Britain defined as a chain of line segments at 10m resolution, draw the entire outline on a 1280×1024 pixel screen
- the solution: Douglas & Pücker’s line chain simplification algorithm
- this can also be applied to chains of Bezier curves at high resolution: most of the curves will each be approximated (by the previous algorithm) as a single line segment, and Douglas & Pücker’s algorithm can then be used to further simplify the line chain

Douglas & Pücker’s algorithm
- find the point, C, at greatest distance from the line AB
- if the distance from C to AB is more than some specified tolerance then subdivide into AC and CB, and repeat for each of the two subdivisions
- otherwise approximate the entire chain from A to B by the single line segment AB
- Exercises: (1) How do you calculate the distance from C to AB? (2) What special cases need to be considered? How should they be handled?
[Douglas & Pücker, Canadian Cartographer, 10(2), 1973]
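A recursive sketch of the algorithm (illustration only; the perpendicular-distance formula and the A = B special case are one answer to the exercises above, not text from the slides):

```python
import math

def simplify(chain, tol):
    """Douglas & Pücker simplification of a chain of (x, y) points;
    tol is the distance tolerance."""
    (ax, ay), (bx, by) = chain[0], chain[-1]

    def dist(p):
        # perpendicular distance from p to the line AB (via the cross
        # product); if A == B, fall back to the distance to the point
        px, py = p
        dx, dy = bx - ax, by - ay
        length = math.hypot(dx, dy)
        if length == 0:
            return math.hypot(px - ax, py - ay)
        return abs(dx * (py - ay) - dy * (px - ax)) / length

    if len(chain) < 3:
        return list(chain)
    # intermediate point C at greatest distance from AB
    i, c = max(enumerate(chain[1:-1], 1), key=lambda ic: dist(ic[1]))
    if dist(c) <= tol:
        return [chain[0], chain[-1]]        # AB is a good enough fit
    left = simplify(chain[:i + 1], tol)     # A..C
    right = simplify(chain[i:], tol)        # C..B
    return left[:-1] + right                # C appears only once
```

Points that wander less than the tolerance from the overall line collapse to a single segment, while genuine corners survive.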
Clipping
- what about lines that go off the edge of the screen?
  - need to clip them so that we only draw the part of the line that is actually on the screen
- clipping points against a rectangle: need to check four inequalities
  - x ≥ xL
  - x ≤ xR
  - y ≥ yB
  - y ≤ yT

Clipping lines against a rectangle
[diagram: lines crossing the clip rectangle bounded by x = xL, x = xR, y = yB, y = yT]
Cohen-Sutherland clipper 1
- make a four bit code, one bit for each inequality
  - A ≡ x < xL,  B ≡ x > xR,  C ≡ y < yB,  D ≡ y > yT
[Ivan Sutherland is one of the founders of Evans & Sutherland, manufacturers of flight simulator systems]

Cohen-Sutherland clipper 2
- Q1 = Q2 = 0
  - both ends in rectangle: ACCEPT
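The outcode test can be sketched in a few lines of Python. The trivial-REJECT case (both endpoint codes sharing a set bit, so both ends are off the same side) is standard Cohen-Sutherland but is not spelled out in the excerpt above, so treat it as an addition here:

```python
def outcode(x, y, xl, xr, yb, yt):
    """Four-bit code from the slide: A (x < xL), B (x > xR),
    C (y < yB), D (y > yT), with A packed as the top bit."""
    code = 0
    if x < xl: code |= 8   # A
    if x > xr: code |= 4   # B
    if y < yb: code |= 2   # C
    if y > yt: code |= 1   # D
    return code

def trivial(c1, c2):
    """'ACCEPT' if both codes are zero; 'REJECT' if the codes share a
    bit (both ends off the same side); None means the line must be
    clipped against an edge and tested again."""
    if c1 == 0 and c2 == 0:
        return "ACCEPT"
    if c1 & c2:
        return "REJECT"
    return None
```

A line with one end to the left and one to the right has codes with no common bit, so it falls into the None case: it may or may not cross the rectangle, which is exactly when the clip-and-loop step of the next slide is needed.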
Cohen-Sutherland clipper 3
- if a code has more than a single 1 then you cannot tell which clip is the best: simply select one and loop again
- horizontal and vertical lines are not a problem
  - Why?
- need a line drawing algorithm that can cope with floating-point endpoint co-ordinates
  - Why?

Polygon filling
- which pixels do we turn on?
Scanline polygon fill algorithm
1. take all polygon edges and place them in an edge list (EL), sorted on lowest y value
2. start with the first scanline that intersects the polygon; get all edges which intersect that scanline and move them to an active edge list (AEL)
3. for each edge in the AEL: find the intersection point with the current scanline; sort these into ascending order on the x value
4. fill between pairs of intersection points
5. move to the next scanline (increment y); remove edges from the AEL if endpoint < y; move new edges from EL to AEL if start point ≤ y; if any edges remain in the AEL go back to step 3

Scanline polygon fill example
[diagram]
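The steps above can be sketched without the EL/AEL bookkeeping by simply intersecting every edge with every scanline (the edge lists make this incremental; the filled spans are the same). An illustration, assuming a simple polygon given as a vertex list:

```python
import math

def scanline_fill(polygon):
    """Simplified scanline fill: polygon is a list of (x, y) vertices.
    For each integer scanline, intersect all edges, sort the x values,
    and fill between pairs. Returns (y, x_start, x_end) spans."""
    ys = [y for _, y in polygon]
    spans = []
    for y in range(math.ceil(min(ys)), math.ceil(max(ys))):
        xs = []
        for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
            # half-open span test, so a shared vertex is counted once
            if (y0 <= y < y1) or (y1 <= y < y0):
                xs.append(x0 + (y - y0) * (x1 - x0) / (y1 - y0))
        xs.sort()
        spans.extend((y, xa, xb) for xa, xb in zip(xs[::2], xs[1::2]))
    return spans
```

The half-open test also answers one of the classic special cases: a scanline passing exactly through a vertex is counted for only one of the two edges meeting there, so the fill-between-pairs step stays consistent.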
Scanline polygon fill details
- how do we efficiently calculate the intersection points?
  - use a line drawing algorithm to do incremental calculation

Clipping polygons
103 104
Sutherland-Hodgman polygon clipping 1
clips an arbitrary polygon against an arbitrary convex polygon
the basic algorithm clips an arbitrary polygon against a single infinite clip edge
the polygon is clipped against one edge at a time, passing the result on to the next stage

Sutherland-Hodgman polygon clipping 2
the algorithm progresses around the polygon checking if each edge crosses the clipping line and outputting the appropriate points
[figure: the four inside/outside cases for a polygon edge from s to e, outputting e, the intersection point i, both i and e, or nothing]
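The one-edge-at-a-time pipeline above might look like this in Python. As an assumption of mine, each clip edge is represented as a half-plane (a, b, c) meaning "keep points with a·x + b·y + c ≥ 0"; the course's slides use a geometric clip line instead.

```python
# Sutherland-Hodgman polygon clipping against a convex set of clip edges.
def clip_edge(polygon, a, b, c):
    out = []
    n = len(polygon)
    for i in range(n):
        s, e = polygon[i], polygon[(i + 1) % n]
        s_val = a * s[0] + b * s[1] + c
        e_val = a * e[0] + b * e[1] + c
        if (s_val >= 0) != (e_val >= 0):   # edge crosses the clip line:
            t = s_val / (s_val - e_val)    # output the intersection point i
            out.append((s[0] + t * (e[0] - s[0]), s[1] + t * (e[1] - s[1])))
        if e_val >= 0:                     # end point e is inside: output it
            out.append(e)
    return out

def clip_polygon(polygon, clip_edges):
    # clip against one edge at a time, passing the result to the next stage
    for a, b, c in clip_edges:
        polygon = clip_edge(polygon, a, b, c)
    return polygon
```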
2D transformations
why? it is extremely useful to be able to transform predefined objects to an arbitrary location, orientation, and size
any reasonable graphics package will include transforms
2D ⇒ PostScript
3D ⇒ OpenGL

Basic 2D transformations
scale about origin by factor m:  x' = mx,  y' = my
rotate about origin by angle θ:  x' = x cos θ − y sin θ,  y' = x sin θ + y cos θ
translate along vector (xo, yo):  x' = x + xo,  y' = y + yo
shear parallel to x axis by factor a:  x' = x + ay,  y' = y
Matrix representation of transformations
scale about origin, factor m:        rotate about origin, angle θ:
[x']   [m 0] [x]                     [x']   [cos θ  −sin θ] [x]
[y'] = [0 m] [y]                     [y'] = [sin θ   cos θ] [y]

do nothing (identity):               shear parallel to x axis, factor a:
[x']   [1 0] [x]                     [x']   [1 a] [x]
[y'] = [0 1] [y]                     [y'] = [0 1] [y]

Homogeneous 2D co-ordinates
translations cannot be represented using simple 2D matrix multiplication on 2D vectors, so we switch to homogeneous co-ordinates
(x, y, w) ≡ (x/w, y/w)
an infinite number of homogeneous co-ordinates map to every 2D point
w = 0 represents a point at infinity
usually take the inverse transform to be: (x, y) ≡ (x, y, 1)
Matrices in homogeneous co-ordinates
scale about origin, factor m:        rotate about origin, angle θ:
[x']   [m 0 0] [x]                   [x']   [cos θ  −sin θ  0] [x]
[y'] = [0 m 0] [y]                   [y'] = [sin θ   cos θ  0] [y]
[w']   [0 0 1] [w]                   [w']   [0       0      1] [w]

do nothing (identity):               shear parallel to x axis, factor a:
[x']   [1 0 0] [x]                   [x']   [1 a 0] [x]
[y'] = [0 1 0] [y]                   [y'] = [0 1 0] [y]
[w']   [0 0 1] [w]                   [w']   [0 0 1] [w]

Translation by matrix algebra
[x']   [1 0 xo] [x]
[y'] = [0 1 yo] [y]
[w']   [0 0 1 ] [w]

in homogeneous coordinates:  x' = x + w·xo,  y' = y + w·yo,  w' = w
in conventional coordinates:  x'/w' = x/w + xo,  y'/w' = y/w + yo
Concatenating transformations
often necessary to perform more than one transformation on the same object
can concatenate transformations by multiplying their matrices
e.g. a shear followed by a scaling:

scale                        shear
[x'']   [m 0 0] [x']         [x']   [1 a 0] [x]
[y''] = [0 m 0] [y']         [y'] = [0 1 0] [y]
[w'']   [0 0 1] [w']         [w']   [0 0 1] [w]

both together
[x'']   [m 0 0] [1 a 0] [x]   [m ma 0] [x]
[y''] = [0 m 0] [0 1 0] [y] = [0 m  0] [y]
[w'']   [0 0 1] [0 0 1] [w]   [0 0  1] [w]

Concatenation is not commutative
be careful of the order in which you concatenate transformations
e.g. rotate by 45° then scale by 2 along the x axis:
[√2    −√2   0]   [2 0 0] [√2/2  −√2/2  0]
[√2/2   √2/2 0] = [0 1 0] [√2/2   √2/2  0]
[0      0    1]   [0 0 1] [0      0     1]

but scale by 2 along the x axis then rotate by 45°:
[√2   −√2/2  0]   [√2/2  −√2/2  0] [2 0 0]
[√2    √2/2  0] = [√2/2   √2/2  0] [0 1 0]
[0     0     1]   [0      0     1] [0 0 1]
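Concatenation in code: multiply the 3×3 matrices, remembering that the rightmost transform is applied first. The sketch below, with names of my own, reproduces the non-commutativity example above and also pre-empts the classic concatenation pattern of transforming about an arbitrary point.

```python
from math import cos, sin, radians

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotate(t):         return [[cos(t), -sin(t), 0],
                               [sin(t),  cos(t), 0], [0, 0, 1]]
def scale_x(m):        return [[m, 0, 0], [0, 1, 0], [0, 0, 1]]
def scale(m):          return [[m, 0, 0], [0, m, 0], [0, 0, 1]]
def translate(xo, yo): return [[1, 0, xo], [0, 1, yo], [0, 0, 1]]

def apply(M, x, y):
    w = M[2][0] * x + M[2][1] * y + M[2][2]
    return ((M[0][0] * x + M[0][1] * y + M[0][2]) / w,
            (M[1][0] * x + M[1][1] * y + M[1][2]) / w)

# rotate by 45 degrees then scale by 2 along x, and the reverse order:
A = matmul(scale_x(2), rotate(radians(45)))   # rotate first, then scale
B = matmul(rotate(radians(45)), scale_x(2))   # scale first, then rotate

# scaling about an arbitrary point (xo, yo): translate, scale, translate back
def scale_about(m, xo, yo):
    return matmul(translate(xo, yo), matmul(scale(m), translate(-xo, -yo)))
```

Applying A and B to the point (1, 0) gives different results, confirming that the order of concatenation matters.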
Scaling about an arbitrary point
scale by a factor m about point (xo, yo):
1. translate point (xo, yo) to the origin
2. scale by a factor m about the origin
3. translate the origin back to (xo, yo)

[x''']   [1 0 xo] [m 0 0] [1 0 −xo] [x]
[y'''] = [0 1 yo] [0 m 0] [0 1 −yo] [y]
[w''']   [0 0 1 ] [0 0 1] [0 0  1 ] [w]

Exercise: show how to perform rotation about an arbitrary point

Bounding boxes
when working with complex objects, bounding boxes can be used to speed up some operations
Clipping with bounding boxes
do a quick accept/reject/unsure test against the bounding box, then apply clipping to only the unsure objects
BBL > xR ∨ BBR < xL ∨ BBB > yT ∨ BBT < yB ⇒ REJECT
BBL ≥ xL ∧ BBR ≤ xR ∧ BBB ≥ yB ∧ BBT ≤ yT ⇒ ACCEPT
otherwise ⇒ clip at next higher level of detail

Object inclusion with bounding boxes
including one object (e.g. a graphics file) inside another can be easily done if bounding boxes are known and used
use the eight values (the included object's BBL, BBR, BBB, BBT and the destination region's PL, PR, PB, PT) to translate and scale the original to the appropriate position in the destination document
[figure: a compass logo placed into a company letterhead by mapping its bounding box onto the destination rectangle]
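The accept/reject/unsure decision above is a pair of cheap comparisons per axis. A sketch with my own naming; the clip rectangle is [xl, xr] × [yb, yt].

```python
# Trivial accept / reject / "clip further" test of a bounding box
# (bbl, bbr, bbb, bbt) against a clip rectangle.
def bbox_test(bbl, bbr, bbb, bbt, xl, xr, yb, yt):
    if bbl > xr or bbr < xl or bbb > yt or bbt < yb:
        return "REJECT"        # wholly outside: draw nothing
    if bbl >= xl and bbr <= xr and bbb >= yb and bbt <= yt:
        return "ACCEPT"        # wholly inside: draw without clipping
    return "CLIP"              # unsure: clip at the next level of detail
```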
Bit block transfer (BitBlT)
it is sometimes preferable to predraw something and then copy the image to the correct position on the screen as and when required
e.g. icons, games
copying an image from place to place is essentially a memory operation
can be made very fast
e.g. a 32×32 pixel icon can be copied, say, 8 adjacent pixels at a time, if there is an appropriate memory copy operation

XOR drawing
generally we draw objects in the appropriate colours, overwriting what was already there
sometimes, usually in HCI, we want to draw something temporarily, with the intention of wiping it out (almost) immediately, e.g. when drawing a rubber-band line
if we bitwise XOR the object’s colour with the colour already in the frame buffer we will draw an object of the correct shape (but wrong colour)
if we do this twice we will restore the original frame buffer
saves drawing the whole screen twice
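Why drawing twice restores the buffer: XORing the same colour into a pixel twice is the identity, since c ⊕ v ⊕ v = c. A one-line sketch on a list-of-ints "frame buffer" of my own devising.

```python
# XOR rubber-band drawing: draw once to show, XOR again to erase.
def xor_draw(framebuffer, x, colour):
    framebuffer[x] ^= colour
```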
Application 1: user interface
user interfaces tend to use objects that are quick to draw
straight lines
filled rectangles

Application 2: typography
typeface: a family of letters designed to look good together
usually has upright (roman/regular), italic (oblique), bold and bold-italic members
abcd efgh ijkl mnop - Helvetica
abcd efgh ijkl mnop - Times
Application 3: PostScript

3D Computer Graphics
3D → 2D projection
to make a picture, the 3D world is projected to a 2D image
like a camera taking a photograph
the three dimensional world is projected onto a plane

Types of projection
parallel
e.g. (x, y, z) → (x, y)
useful in CAD, architecture, etc
looks unrealistic

Viewing volume
the rectangular pyramid from the eye point (camera point) through the viewing plane (screen plane) is the viewing volume
everything within the viewing volume is projected onto the viewing plane

Geometry of perspective projection
with the eye at the origin (0,0,0) and the viewing plane at distance d, a point (x, y, z) projects to (x', y', d) where:
x' = x × d/z
y' = y × d/z
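The projection formulas above are a single divide per axis. A minimal sketch, assuming the same set-up as the slides (eye at the origin, screen plane at z = d, and z > 0 in front of the eye).

```python
# Perspective projection of a 3D point onto the screen plane z = d.
def project(x, y, z, d):
    # x' = x * d / z,  y' = y * d / z
    return (x * d / z, y * d / z)
```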
Perspective projection with an arbitrary camera
we have assumed that:
screen centre at (0,0,d)
screen parallel to xy-plane
z-axis into screen
y-axis up and x-axis to the right
eye (camera) at origin (0,0,0)
for an arbitrary camera we can either:
work out equations for projecting objects about an arbitrary point onto an arbitrary plane
transform all objects into our standard co-ordinate system (viewing co-ordinates) and use the above assumptions

3D transformations
3D homogeneous co-ordinates: (x, y, z, w) ≡ (x/w, y/w, z/w)
3D transformation matrices:

translation           identity              scale
[1 0 0 tx]            [1 0 0 0]             [mx 0  0  0]
[0 1 0 ty]            [0 1 0 0]             [0  my 0  0]
[0 0 1 tz]            [0 0 1 0]             [0  0  mz 0]
[0 0 0 1 ]            [0 0 0 1]             [0  0  0  1]

rotation about x-axis        rotation about y-axis        rotation about z-axis
[1  0      0       0]        [ cos θ  0  sin θ  0]        [cos θ  −sin θ  0  0]
[0  cos θ  −sin θ  0]        [ 0      1  0      0]        [sin θ   cos θ  0  0]
[0  sin θ   cos θ  0]        [−sin θ  0  cos θ  0]        [0       0      1  0]
[0  0      0       1]        [ 0      0  0      1]        [0       0      0  1]
3D transformations are not commutative
e.g. a 90° rotation about the z-axis followed by a 90° rotation about the x-axis gives a different result from a 90° rotation about the x-axis followed by a 90° rotation about the z-axis
[figure: the two orders of rotation leave the object facing opposite ways]

Viewing transform 1
the problem: to transform an arbitrary co-ordinate system (world co-ordinates) to the default viewing co-ordinate system
camera specification in world co-ordinates:
eye (camera) at (ex, ey, ez)
look point (centre of screen) at (lx, ly, lz)
up along vector (ux, uy, uz), perpendicular to el
Viewing transform 2
translate eye point, (ex, ey, ez), to origin, (0,0,0):
    [1 0 0 −ex]
T = [0 1 0 −ey]
    [0 0 1 −ez]
    [0 0 0  1 ]
scale so that the eye point to look point distance, |el|, is the distance from origin to screen centre, d

Viewing transform 3
need to align line el with the z-axis
first transform e and l into the new co-ordinate system:
e'' = S × T × e = 0        l'' = S × T × l
then rotate e''l'' into the yz-plane, rotating about the y-axis:
     [cos θ  0  −sin θ  0]
R1 = [0      1   0      0]        θ = arccos( l''z / √(l''x² + l''z²) )
     [sin θ  0   cos θ  0]
     [0      0   0      1]
this takes (l''x, 0, l''z) to (0, 0, √(l''x² + l''z²))
Viewing transform 4
having rotated the viewing vector onto the yz-plane, rotate it about the x-axis so that it aligns with the z-axis:
l''' = R1 × l''
     [1  0       0      0]
R2 = [0  cos φ   sin φ  0]        φ = arccos( l'''z / √(l'''y² + l'''z²) )
     [0  −sin φ  cos φ  0]
     [0  0       0      1]
this takes (0, l'''y, l'''z) to (0, 0, √(l'''y² + l'''z²)) = (0, 0, d)

Viewing transform 5
the final step is to ensure that the up vector actually points up, i.e. along the positive y-axis
actually need to rotate the up vector about the z-axis so that it lies in the positive y half of the yz-plane
u'''' = R2 × R1 × u
why don’t we need to multiply u by S or T?
     [ cos ψ  sin ψ  0  0]
R3 = [−sin ψ  cos ψ  0  0]        ψ = arccos( u''''y / √(u''''x² + u''''y²) )
     [ 0      0      1  0]
     [ 0      0      0  1]
Viewing transform 6
we can now transform any point in world co-ordinates to the equivalent point in viewing co-ordinates:
[x']                         [x]
[y'] = R3 × R2 × R1 × S × T  [y]
[z']                         [z]
[w']                         [w]
in particular:  e → (0,0,0)   l → (0,0,d)
the matrices depend only on e, l, and u, so they can be pre-multiplied into a single viewing transform matrix

Another transformation example
a well known graphics package (Open Inventor) defines a cylinder to be:
centre at the origin, (0,0,0)
radius 1 unit
height 2 units, aligned along the y-axis
this is the only cylinder that can be drawn, but the package has a complete set of 3D transformations
we want to draw a cylinder of:
radius 2 units
the centres of its two ends located at (1,2,3) and (2,4,5)
its length is thus 3 units
A variety of transformations
object in object co-ordinates
  → modelling transform → object in world co-ordinates
  → viewing transform → object in viewing co-ordinates
  → projection → object in 2D screen co-ordinates
the modelling transform and viewing transform can be multiplied together to produce a single matrix taking an object directly from object co-ordinates into viewing co-ordinates
either or both of the modelling transform and viewing transform matrices can be the identity matrix
e.g. objects can be specified directly in viewing co-ordinates, or directly in world co-ordinates
this is a useful set of transforms, not a hard and fast model of how things should be done

Clipping in 3D
clipping against a volume in viewing co-ordinates
for a screen 2a wide and 2b high at distance d from the eye, a point (x, y, z) can be clipped against the pyramid by checking it against four planes:
x > −z a/d        x < z a/d
y > −z b/d        y < z b/d
What about clipping in z?

Clipping in 3D — two methods: which is best?
Bounding volumes & clipping
bounding volumes can be very useful for reducing the amount of work involved in clipping
what kind of bounding volume?
axis-aligned box
sphere
can have multiple levels of bounding volume

Curves in 3D
same as curves in 2D, with an extra co-ordinate for each point
e.g. Bezier cubic in 3D:
P(t) = (1−t)³ P0 + 3t(1−t)² P1 + 3t²(1−t) P2 + t³ P3
where:  Pi ≡ (xi, yi, zi)
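The 3D Bezier cubic above evaluates with exactly the same weights as the 2D case, applied to each co-ordinate. A sketch with control points as (x, y, z) tuples.

```python
# Evaluate a 3D Bezier cubic P(t) from its four control points.
def bezier3(p0, p1, p2, p3, t):
    b0 = (1 - t) ** 3
    b1 = 3 * t * (1 - t) ** 2
    b2 = 3 * t ** 2 * (1 - t)
    b3 = t ** 3
    return tuple(b0 * a + b1 * b + b2 * c + b3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))
```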
Surfaces in 3D: polygons
lines generalise to planar polygons
3 vertices (a triangle) must be planar
> 3 vertices, not necessarily planar
[figure: a non-planar “polygon” (one vertex in front of the other three, which are all in the same plane) rotated about the vertical axis: should the result be this or this?]

Splitting polygons into triangles
some graphics processors accept only triangles
an arbitrary polygon with more than three vertices isn’t guaranteed to be planar; a triangle is
[figure: two possible triangulations of the same quadrilateral: which is preferable?]
Surfaces in 3D: patches
curves generalise to patches
a Bezier patch has a Bezier curve running along each of its four edges and four extra internal control points

Bezier patch definition
the Bezier patch defined by the sixteen control points, P0,0, P0,1, …, P3,3, is:
P(s,t) = Σi=0..3 Σj=0..3 bi(s) bj(t) Pi,j
where:  b0(t) = (1−t)³   b1(t) = 3t(1−t)²   b2(t) = 3t²(1−t)   b3(t) = t³
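The double sum above is short enough to evaluate directly. A sketch assuming the control net is stored as a 4×4 grid of (x, y, z) tuples.

```python
# Evaluate a Bezier patch P(s, t); P[i][j] is control point Pi,j.
def basis(t):
    return ((1 - t) ** 3, 3 * t * (1 - t) ** 2, 3 * t ** 2 * (1 - t), t ** 3)

def bezier_patch(P, s, t):
    bs, bt = basis(s), basis(t)
    return tuple(sum(bs[i] * bt[j] * P[i][j][k]
                     for i in range(4) for j in range(4))
                 for k in range(3))
```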
Continuity between Bezier patches
each patch is smooth within itself
ensuring continuity in 3D:
C0 – continuous in position
the four edge control points must match
C1 – continuous in both position and tangent vector
the four edge control points must match
the two control points on either side of each of the four edge control points must be co-linear with both the edge point and each other, and be equidistant from the edge point

Drawing Bezier patches
in a similar fashion to Bezier curves, Bezier patches can be drawn by approximating them with planar polygons
method:
check if the Bezier patch is sufficiently well approximated by a quadrilateral; if so use that quadrilateral
if not then subdivide it into two smaller Bezier patches and repeat on each
subdivide in different dimensions on alternate calls to the subdivision function
having approximated the whole Bezier patch as a set of (non-planar) quadrilaterals, further subdivide these into (planar) triangles
be careful not to leave any gaps in the resulting surface!
Subdividing a Bezier patch - example
[figure: six successive stages of subdivision, 1-6]

Triangulating the subdivided patch
[figure: final quadrilateral mesh, naïve triangulation, more intelligent triangulation]
3D scan conversion
lines
polygons
depth sort
Binary Space-Partitioning tree
z-buffer
A-buffer
ray tracing

3D line drawing
given a list of 3D lines we draw them by:
projecting end points onto the 2D screen
using a line drawing algorithm on the resulting 2D lines
this produces a wireframe version of whatever objects are represented by the lines
Hidden line removal
by careful use of cunning algorithms, lines that are hidden by surfaces can be removed from the projected version of the objects
still just a line drawing
will not be covered further in this course

3D polygon drawing
given a list of 3D polygons we draw them by:
projecting vertices onto the 2D screen, but also keeping the z information
using a 2D polygon scan conversion algorithm on the resulting 2D polygons
in what order do we draw the polygons?
some sort of order on z:
depth sort
Binary Space-Partitioning tree
is there a method in which order does not matter?
z-buffer
Depth sort algorithm
1. transform all polygon vertices into viewing co-ordinates and project these into 2D, keeping z information
2. calculate a depth ordering for polygons, based on the most distant z co-ordinate in each polygon
3. resolve any ambiguities caused by polygons overlapping in z
4. draw the polygons in depth order from back to front
“painter’s algorithm”: later polygons draw on top of earlier polygons
steps 1 and 2 are simple, step 4 is 2D polygon scan conversion, step 3 requires more thought

Resolving ambiguities in depth sort
may need to split polygons into smaller polygons to make a coherent depth ordering
[figure: two configurations where splitting a polygon produces a coherent depth ordering]
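The outer loop of the depth sort (steps 2 and 4, deliberately ignoring step 3's ambiguity resolution) might be sketched as below. The polygon representation and the `draw_polygon_2d` callback are assumptions of mine.

```python
# Painter's algorithm skeleton: sort on most distant z, draw back to front.
def depth_sort_draw(polygons, draw_polygon_2d):
    # each polygon is a list of (x', y', z) projected vertices
    ordered = sorted(polygons,
                     key=lambda poly: max(v[2] for v in poly),
                     reverse=True)            # most distant (largest z) first
    for poly in ordered:
        draw_polygon_2d(poly)                 # later polygons draw on top
    return ordered
```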
Resolving ambiguities: algorithm
for the rearmost polygon, P, in the list, compare it with each polygon, Q, which overlaps P in z
the question is: can I draw P before Q?
the tests get progressively more expensive:
1. do the polygons’ y extents not overlap?
2. do the polygons’ x extents not overlap?
3. is P entirely on the opposite side of Q’s plane from the viewpoint?
4. is Q entirely on the same side of P’s plane as the viewpoint?
5. do the projections of the two polygons into the xy plane not overlap?
if all 5 tests fail, repeat tests 3 and 4 with P and Q swapped (i.e. can I draw Q before P?); if true, swap P and Q
otherwise split either P or Q by the plane of the other, throw away the original polygon and insert the two pieces into the list
draw the rearmost polygon once it has been completely checked

Depth sort: comments
the depth sort algorithm produces a list of polygons which can be scan-converted in 2D, backmost to frontmost, to produce the correct image
reasonably cheap for a small number of polygons, becomes expensive for large numbers of polygons
the ordering is only valid from one particular viewpoint
Back face culling: a time-saving trick
if a polygon is a face of a closed polyhedron and faces backwards with respect to the viewpoint, then it need not be drawn at all, because front facing faces would later obscure it anyway
saves drawing time at the cost of one extra test per polygon
assumes that we know which way a polygon is oriented
back face culling can be used in combination with any 3D scan-conversion algorithm

Binary Space-Partitioning trees
BSP trees provide a way of quickly calculating the correct depth order:
for a collection of static polygons
from an arbitrary viewpoint
the BSP tree trades off an initial time- and space-intensive pre-processing step against a linear display algorithm (O(N)) which is executed whenever a new viewpoint is specified
the BSP tree allows you to easily determine the correct order in which to draw polygons by traversing the tree in a simple way
BSP tree: basic idea
a given polygon will be correctly scan-converted if:
all polygons on the far side of it from the viewer are scan-converted first
then it is scan-converted
then all the polygons on the near side of it are scan-converted

Making a BSP tree
given a set of polygons:
select an arbitrary polygon as the root of the tree
divide all remaining polygons into two subsets:
those in front of the selected polygon’s plane
those behind the selected polygon’s plane
any polygons through which the plane passes are split into two polygons and the two parts put into the appropriate subsets
make two BSP trees, one from each of the two subsets
these become the front and back subtrees of the root
Drawing a BSP tree
if the viewpoint is in front of the root’s polygon’s plane then:
draw the BSP tree for the back child of the root
draw the root’s polygon
draw the BSP tree for the front child of the root
otherwise:
draw the BSP tree for the front child of the root
draw the root’s polygon
draw the BSP tree for the back child of the root

Scan-line algorithms
instead of drawing one polygon at a time: modify the 2D polygon scan-conversion algorithm to handle all of the polygons at once
the algorithm keeps a list of the active edges in all polygons and proceeds one scan-line at a time
there is thus one large active edge list and one (even larger) edge list
enormous memory requirements
still fill in pixels between adjacent pairs of edges on the scan-line but:
need to be intelligent about which polygon is in front and therefore what colours to put in the pixels
every edge is used in two pairs: one to the left and one to the right of it
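The BSP build and traversal described in the preceding slides can be sketched as follows. Two simplifying assumptions of mine: each "polygon" carries its plane (normal, offset) plus one representative point, and no polygon straddles another's plane, so the splitting step is omitted.

```python
# BSP-tree build and back-to-front ("far child, node, near child") traversal.
def side(plane, point):
    n, d = plane                                  # plane: n . p + d = 0
    return sum(a * b for a, b in zip(n, point)) + d

class Node:
    def __init__(self, poly, front=None, back=None):
        self.poly, self.front, self.back = poly, front, back

def build(polys):
    if not polys:
        return None
    root, rest = polys[0], polys[1:]              # arbitrary root polygon
    front = [p for p in rest if side(root["plane"], p["point"]) > 0]
    back = [p for p in rest if side(root["plane"], p["point"]) <= 0]
    return Node(root, build(front), build(back))

def draw_order(node, viewpoint, out):
    if node is None:
        return out
    if side(node.poly["plane"], viewpoint) > 0:   # viewer in front of plane
        draw_order(node.back, viewpoint, out)
        out.append(node.poly["name"])
        draw_order(node.front, viewpoint, out)
    else:                                         # viewer behind the plane
        draw_order(node.front, viewpoint, out)
        out.append(node.poly["name"])
        draw_order(node.back, viewpoint, out)
    return out
```

Rebuilding is never needed when the viewpoint moves; only the traversal order changes.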
z-buffer polygon scan conversion
depth sort and BSP-tree methods involve clever sorting algorithms followed by the invocation of the standard 2D polygon scan conversion algorithm
by modifying the 2D scan conversion algorithm we can remove the need to sort the polygons
makes hardware implementation easier

z-buffer basics
store both colour and depth at each pixel
when scan converting a polygon:
calculate the polygon’s depth at each pixel
if the polygon is closer than the current depth stored at that pixel, then store both the polygon’s colour and depth at that pixel
otherwise do nothing
z-buffer algorithm
FOR every pixel (x,y)
    Colour[x,y] = background colour ;
    Depth[x,y] = infinity ;
END FOR ;
FOR each polygon
    FOR every pixel (x,y) in the polygon’s projection
        z = polygon’s z-value at pixel (x,y) ;
        IF z < Depth[x,y] THEN
            Depth[x,y] = z ;
            Colour[x,y] = polygon’s colour at (x,y) ;
        END IF ;
    END FOR ;
END FOR ;
this is essentially the 2D polygon scan conversion algorithm with depth calculation and depth comparison added

z-buffer example
[figure: successive states of a small depth buffer as three polygons are scan-converted]
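The pseudocode above runs directly as a miniature in Python. As an assumption of my own, each "polygon" is pre-scan-converted into a list of (pixel, depth, colour) samples, so the sketch shows only the depth comparison.

```python
# Runnable miniature of the z-buffer algorithm.
INF = float("inf")

def zbuffer(width, height, polygons, background=" "):
    colour = [[background] * width for _ in range(height)]
    depth = [[INF] * width for _ in range(height)]
    for poly in polygons:
        for (x, y), z, c in poly:       # pixels in the polygon's projection
            if z < depth[y][x]:         # closer than the stored depth?
                depth[y][x] = z
                colour[y][x] = c
    return colour, depth
```

Note that the result is independent of the order in which the polygons are presented.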
Interpolating depth values 1
just as we incrementally interpolate x as we move down the edges of the polygon, we can incrementally interpolate z:
as we move down the edges of the polygon
as we move across the polygon’s projection
each vertex (xa, ya, za) projects to:
xa' = (d/za) xa
ya' = (d/za) ya

Interpolating depth values 2
we thus have 2D vertices, with added depth information: [(xa', ya'), za]
we can interpolate x and y in 2D:
x' = (1−t) x1' + (t) x2'
y' = (1−t) y1' + (t) y2'
but z must be interpolated in 3D:
1/z = (1−t) (1/z1) + (t) (1/z2)
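The rule above (screen co-ordinates interpolate linearly, depth interpolates as 1/z) is a two-line computation. A sketch, with projected vertices as (x', y', z) tuples.

```python
# Perspective-correct depth interpolation between two projected vertices.
def interp_screen(p1, p2, t):
    (x1, y1, z1), (x2, y2, z2) = p1, p2
    x = (1 - t) * x1 + t * x2              # x' and y' interpolate in 2D
    y = (1 - t) * y1 + t * y2
    z = 1.0 / ((1 - t) / z1 + t / z2)      # but 1/z is what is linear
    return (x, y, z)
```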
Comparison of methods

Algorithm    Complexity    Notes
Depth sort   O(N log N)    Need to resolve ambiguities
Scan line    O(N log N)    Memory intensive
BSP tree     O(N)          O(N log N) pre-processing step
z-buffer     O(N)          Easy to implement in hardware

BSP is only useful for scenes which do not change
as the number of polygons increases, the average size of a polygon decreases, so the time to draw a single polygon decreases
z-buffer is easy to implement in hardware: simply give it polygons in any order you like
other algorithms need to know about all the polygons before drawing a single one, so that they can sort them into order

Putting it all together - a summary
a 3D polygon scan conversion algorithm needs to include:
a 2D polygon scan conversion algorithm
2D or 3D polygon clipping
projection from 3D to 2D
some method of ordering the polygons so that they are drawn in the correct order
Sampling
all of the methods so far take a single sample for each pixel at the precise centre of the pixel
i.e. the value for each pixel is the colour of the polygon which happens to lie exactly under the centre of the pixel
this leads to:
stair step (jagged) edges to polygons
small polygons being missed completely
thin polygons being missed completely or split into small pieces

Anti-aliasing
these artefacts (and others) are jointly known as aliasing
methods of ameliorating the effects of aliasing are known as anti-aliasing
in signal processing, aliasing is a precisely defined technical term for a particular kind of artefact
in computer graphics its meaning has expanded to include most undesirable effects that can occur in the image
this is because the same anti-aliasing techniques which ameliorate true aliasing artefacts also ameliorate most of the other artefacts
Anti-aliasing method 1: area averaging
average the contributions of all polygons to each pixel
e.g. assume pixels are square and we just want the average colour in the square
Ed Catmull developed an algorithm which does this:
works a scan-line at a time
clips all polygons to the scan-line
determines the fragment of each polygon which projects to each pixel
determines the amount of the pixel covered by the visible part of each fragment
pixel’s colour is a weighted sum of the visible parts
an expensive algorithm!

Anti-aliasing method 2: super-sampling
sample on a finer grid, then average the samples in each pixel to produce the final colour
for an n×n sub-pixel grid, the algorithm would take roughly n² times as long as just taking one sample per pixel
can simply average all of the sub-pixels in a pixel or can do some sort of weighted average
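The simple (unweighted) averaging variant above amounts to a box filter over each n×n block of sub-samples. A sketch assuming the sub-pixel image is a list of lists of numeric samples.

```python
# Box-average an n-times finer grid of sub-samples down to pixel values.
def downsample(subpixels, n):
    h, w = len(subpixels) // n, len(subpixels[0]) // n
    return [[sum(subpixels[y * n + j][x * n + i]
                 for j in range(n) for i in range(n)) / (n * n)
             for x in range(w)]
            for y in range(h)]
```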
The A-buffer
a significant modification of the z-buffer, which allows for sub-pixel sampling without as high an overhead as straightforward super-sampling
basic observation: a given polygon will cover a pixel:
totally
partially
not at all
sub-pixel sampling is only required in the case of pixels which are partially covered by the polygon
L. Carpenter, “The A-buffer: an antialiased hidden surface method”, SIGGRAPH 84, 103–8

A-buffer: details
for each pixel, a list of masks is stored
each mask shows how much of a polygon covers the pixel
the masks are sorted in depth order
(need to store both colour and depth in addition to the mask)
a mask is a 4×8 array of bits:
1 = polygon covers this sub-pixel
0 = polygon doesn’t cover this sub-pixel
sampling is done at the centre of each of the sub-pixels
A-buffer: example
to get the final colour of the pixel you need to average together all visible bits of polygons
with masks A (frontmost), B, and C (backmost):
A        = 11111111 00011111 00000011 00000000    A covers 15/32 of the pixel
B        = 00000011 00000111 00001111 00011111    ¬A∧B covers 7/32 of the pixel
C        = 00000000 00000000 11111111 11111111    ¬A∧¬B∧C covers 7/32 of the pixel
¬A∧B     = 00000000 00000000 00001100 00011111
¬A∧¬B∧C  = 00000000 00000000 11110000 11100000

Making the A-buffer more efficient
if a polygon totally covers a pixel then:
do not need to calculate a mask, because the mask is all 1s
all masks currently in the list which are behind this polygon can be discarded
any subsequent polygons which are behind this polygon can be immediately discounted (without calculating a mask)
in most scenes, therefore, the majority of pixels will have only a single entry in their list of masks
the polygon scan-conversion algorithm can be structured so that it is immediately obvious whether a pixel is totally or partially within a polygon
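The mask arithmetic in the example above reproduces directly with integer bit operations, one 32-bit word per 4×8 mask.

```python
# A-buffer mask arithmetic on 32-bit masks, using the example's values.
FULL = 0xFFFFFFFF

def coverage(mask):
    return bin(mask & FULL).count("1")   # sub-pixels covered, out of 32

A = 0b11111111_00011111_00000011_00000000
B = 0b00000011_00000111_00001111_00011111
C = 0b00000000_00000000_11111111_11111111

visible_B = ~A & B & FULL                # parts of B not hidden by A
visible_C = ~A & ~B & C & FULL           # parts of C hidden by neither
```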
A-buffer: calculating masks
clip the polygon to the pixel
calculate the mask for each edge, bounded by the right hand side of the pixel
there are few enough of these that they can be stored in a look-up table
XOR all the edge masks together to get the polygon’s mask
[figure: three edge masks XORed together to give the polygon’s mask]

A-buffer: comments
the A-buffer algorithm essentially adds anti-aliasing to the z-buffer algorithm in an efficient way
most operations on masks are AND, OR, NOT, XOR: very efficient boolean operations
why 4×8?
the algorithm was originally implemented on a machine with 32-bit registers (VAX 11/780)
on a 64-bit register machine, 8×8 seems more sensible
what does the A stand for in A-buffer?
anti-aliased, area averaged, accumulator
A-buffer: extensions
as presented, the algorithm assumes that a mask has a constant depth (z value)
can modify the algorithm and perform approximate intersection between polygons
can save memory by combining fragments which start life in the same primitive
e.g. two triangles that are part of the decomposition of a Bezier patch
can extend to allow transparent objects

Illumination & shading
until now we have assumed that each polygon is a uniform colour and have not thought about how that colour is determined
things look more realistic if there is some sort of illumination in the scene
we therefore need a mechanism for determining the colour of a polygon based on its surface properties and the positions of the lights
we will, as a consequence, need to find ways to shade polygons which do not have a uniform colour
Illumination & shading (continued)
in the real world every light source emits millions of photons every second
these photons bounce off objects, pass through objects, and are absorbed by objects
a tiny proportion of these photons enter your eyes, allowing you to see the objects

How do surfaces reflect light?
perfect reflection (mirror)    specular reflection    diffuse reflection (Lambertian reflection)
Comments on reflection
the surface can absorb some wavelengths of light
e.g. shiny gold or shiny copper
specular reflection has “interesting” properties at glancing angles owing to occlusion of micro-facets by one another
plastics are good examples of surfaces with:
specular reflection in the light’s colour
diffuse reflection in the plastic’s colour

Calculating the shading of a polygon
gross assumptions:
there is only diffuse (Lambertian) reflection
all light falling on a polygon comes directly from a light source
there is no interaction between polygons
no polygon casts shadows on any other, so we can treat each polygon as if it were the only polygon in the scene
light sources are considered to be infinitely distant from the polygon
the vector to the light is the same across the whole polygon
observation: the colour of a flat polygon will be uniform across its surface, dependent only on the colour & position of the polygon and the colour & position of the light sources
Diffuse shading calculation
L is a normalised vector pointing in the direction of the light source
N is the normal to the polygon
Il is the intensity of the light source
kd is the proportion of light which is diffusely reflected by the surface
I is the intensity of the light reflected by the surface
I = Il kd cos θ = Il kd (N · L)
use this equation to set the colour of the whole polygon and draw the polygon using a standard polygon scan-conversion routine

Diffuse shading: comments
can have different Il and different kd for different wavelengths (colours)
watch out for cos θ < 0
implies that the light is behind the polygon and so it cannot illuminate this side of the polygon
do you use one-sided or two-sided polygons?
one-sided: only the side in the direction of the normal vector can be illuminated
if cos θ < 0 then both sides are black
two-sided: the sign of cos θ determines which side of the polygon is illuminated
need to invert the sign of the intensity for the back side
Gouraud shading
for a polygonal model, calculate the diffuse illumination at each vertex rather than for each polygon
calculate the normal at the vertex, and use this to calculate the diffuse illumination at that point
the normal can be calculated directly if the polygonal model was derived from a curved surface
interpolate the colour across the polygon, in a similar manner to that used to interpolate z
each projected vertex carries [(x', y'), z, (r, g, b)]
the surface will look smoothly curved, rather than looking like a set of polygons
the surface outline will still look polygonal
Henri Gouraud, “Continuous Shading of Curved Surfaces”, IEEE Trans Computers, 20(6), 1971

Specular reflection
Phong developed an easy-to-calculate approximation to specular reflection
L is a normalised vector pointing in the direction of the light source
R is the vector of perfect reflection
N is the normal to the polygon
V is a normalised vector pointing at the viewer
Il is the intensity of the light source
ks is the proportion of light which is specularly reflected by the surface
n is Phong’s ad hoc “roughness” coefficient
I is the intensity of the specularly reflected light
I = Il ks cosⁿ α = Il ks (R · V)ⁿ
Phong Bui-Tuong, “Illumination for computer generated pictures”, CACM, 18(6), 1975, 311–7
Phong shading
similar to Gouraud shading, but calculate the specular component in addition to the diffuse component
therefore need to interpolate the normal across the polygon in order to be able to calculate the reflection vector
each projected vertex carries [(x', y'), z, (r, g, b), N]

The gross assumptions revisited
only diffuse reflection
now have a method of approximating specular reflection as well
no shadows
need to do ray tracing to get shadows
lights at infinity
can add local lights at the expense of more calculation
need to interpolate the L vector
Shading: overall equation
the overall shading equation can thus be considered to be the ambient illumination plus the diffuse and specular reflections from each light source:
I = Ia ka + Σi Ii kd (Li · N) + Σi Ii ks (Ri · V)ⁿ
the more lights there are in the scene, the longer this calculation will take

Illumination & shading: comments
how good is this shading equation?
gives reasonable results, but most objects tend to look as if they are made out of plastic
Cook & Torrance have developed a more realistic (and more expensive) shading model which takes into account:
micro-facet geometry (which models, amongst other things, the roughness of the surface)
Fresnel’s formulas for reflectance off a surface
there are other, even more complex, models
is there a better way to handle inter-object interaction?
“ambient illumination” is, frankly, a gross approximation
distributed ray tracing can handle specular inter-reflection
radiosity can handle diffuse inter-reflection
197 198
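The overall shading equation above can be evaluated directly. The following is an illustrative Python sketch, not part of the original notes; the vector helpers and parameter defaults are assumptions for the example:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def scale(v, s):
    return tuple(x * s for x in v)

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def shade(N, V, lights, Ia=0.1, ka=1.0, kd=0.6, ks=0.3, n=20):
    """Evaluate I = Ia ka + sum_i Ii kd (Li.N) + sum_i Ii ks (Ri.V)^n.
    N, V and each Li must be unit vectors; lights is [(Ii, Li), ...]."""
    I = Ia * ka
    for Ii, L in lights:
        diff = dot(L, N)
        if diff <= 0:                    # light is behind the surface
            continue
        R = sub(scale(N, 2 * diff), L)   # mirror L about the normal N
        spec = max(dot(R, V), 0.0) ** n
        I += Ii * kd * diff + Ii * ks * spec
    return I
```

With one light shining straight down the normal and the viewer on the same axis, both the diffuse and specular terms reach their maximum of Ii.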
Ray tracing
- a powerful alternative to polygon scan-conversion techniques
- given a set of 3D objects, shoot a ray from the eye through the
  centre of every pixel and see what it hits; whatever the ray hits
  determines the colour of that pixel

Ray tracing algorithm
    select an eye point and a screen plane
    FOR every pixel in the screen plane
      determine the ray from the eye through the pixel's centre
      FOR each object in the scene
        IF the object is intersected by the ray
          IF the intersection is the closest (so far) to the eye
            record intersection point and object
          END IF ;
        END IF ;
      END FOR ;
      set pixel's colour to that of the object at the closest intersection point
    END FOR ;
Intersection of a ray with an object 1
- ray: P = O + sD, s ≥ 0
- plane: P · N + d = 0

    s = −(d + N · O) / (N · D)

- box, polygon, polyhedron: defined as a set of bounded planes

Intersection of a ray with an object 2
- ray: P = O + sD, s ≥ 0
- sphere (centre C, radius r): (P − C) · (P − C) − r² = 0
- substituting the ray equation into the sphere equation gives a
  quadratic in s:

    a = D · D
    b = 2D · (O − C)
    c = (O − C) · (O − C) − r²
    d = b² − 4ac
    s1 = (−b + √d) / 2a
    s2 = (−b − √d) / 2a

- √d real: the ray intersects the sphere; √d imaginary (d < 0): it
  misses
- cylinder, cone, torus: all similar to sphere
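The ray-sphere quadratic above can be coded directly. A minimal Python sketch (not from the notes; the function name and return convention are assumptions):

```python
import math

def ray_sphere(O, D, C, r):
    """Return the smallest s >= 0 with O + s*D on the sphere |P - C| = r,
    or None if the ray misses (d < 0, or both roots behind the origin)."""
    OC = tuple(o - c for o, c in zip(O, C))
    a = sum(d * d for d in D)
    b = 2 * sum(d * oc for d, oc in zip(D, OC))
    c = sum(oc * oc for oc in OC) - r * r
    disc = b * b - 4 * a * c          # d = b^2 - 4ac
    if disc < 0:
        return None                   # sqrt(d) imaginary: no intersection
    root = math.sqrt(disc)
    s1, s2 = (-b - root) / (2 * a), (-b + root) / (2 * a)
    hits = [s for s in (s1, s2) if s >= 0]
    return min(hits) if hits else None
```

A ray starting inside the sphere returns the single forward root; a ray pointing away from the sphere returns None.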
Ray tracing: shading
- once you have the intersection of a ray with the nearest object you
  can also:
  - calculate the normal to the object at that intersection point
  - shoot rays from that point to all of the light sources, and
    calculate the diffuse and specular reflections off the object at
    that point
    - this (plus ambient illumination) gives the colour of the object
      (at that point)

Ray tracing: shadows
- because you are tracing rays from the intersection point to the
  light, you can check whether another object is between the
  intersection and the light and is hence casting a shadow
- also need to watch for self-shadowing
Ray tracing: reflection
- if a surface is totally or partially reflective then new rays can be
  spawned to find the contribution to the pixel's colour given by the
  reflection
- this is perfect (mirror) reflection

Ray tracing: transparency & refraction
- objects can be totally or partially transparent
  - this allows objects behind the current one to be seen through it
- transparent objects can have refractive indices
  - bending the rays as they pass through the objects
- transparency + reflection means that a ray can split into two parts
Sampling in ray tracing
- single point
  - shoot a single ray through the pixel's centre
- super-sampling for anti-aliasing
  - shoot multiple rays through the pixel and average the result
  - regular grid, random, jittered, Poisson disc
- adaptive super-sampling
  - shoot a few rays through the pixel, check the variance of the
    resulting values; if similar enough stop, otherwise shoot some
    more rays

Types of super-sampling 1
- regular grid
  - divide the pixel into a number of sub-pixels and shoot a ray
    through the centre of each
  - problem: can still lead to noticeable aliasing unless a very high
    resolution sub-pixel grid is used
- random
  - shoot N rays at random points in the pixel
  - replaces aliasing artefacts with noise artefacts
    - the eye is far less sensitive to noise than to aliasing
Types of super-sampling 2
- Poisson disc
  - shoot N rays at random points in the pixel with the proviso that
    no two rays shall pass through the pixel closer than ε to one
    another
  - for N rays this produces a better looking image than pure random
    sampling
  - very hard to implement properly

Types of super-sampling 3
- jittered
  - divide the pixel into N sub-pixels and shoot one ray at a random
    point in each sub-pixel
  - an approximation to Poisson disc sampling
  - for N rays it is better than pure random sampling
  - easy to implement
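Jittered sampling really is easy to implement. An illustrative Python sketch (not from the notes; the n×n sub-pixel layout and unit-pixel coordinates are assumptions for the example):

```python
import random

def jittered_samples(n):
    """Generate n*n sample points in the unit pixel [0,1) x [0,1):
    one uniformly random point inside each of the n x n sub-pixels."""
    pts = []
    for i in range(n):
        for j in range(n):
            pts.append(((i + random.random()) / n,
                        (j + random.random()) / n))
    return pts
```

Every sub-pixel gets exactly one sample, which is what makes this an approximation to Poisson disc sampling: two samples can still be close, but only across a sub-pixel boundary.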
Distributed ray tracing
- previously we could only calculate the effect of perfect reflection
- for specular reflection we can now distribute the reflected rays over
  the range of directions from which specularly reflected light could
  come
- provides a method of handling some of the inter-reflections between
  objects in the scene
- requires a very large number of rays per pixel

Handling direct illumination
- diffuse reflection
  - handled by ray tracing and polygon scan conversion
  - assumes that the object is a perfect Lambertian reflector
- specular reflection
  - also handled by ray tracing and polygon scan conversion
  - use Phong's approximation to true specular reflection
Handling indirect illumination: 1
- diffuse to specular
  - handled by distributed ray tracing
- specular to specular
  - also handled by distributed ray tracing

Handling indirect illumination: 2
- diffuse to diffuse
  - handled by radiosity
  - covered in the Part II Advanced Graphics course
- specular to diffuse
  - handled by no usable algorithm
  - some research work has been done on this but it uses enormous
    amounts of CPU time
Multiple inter-reflection
- light may reflect off many surfaces on its way from the light to the
  camera: (diffuse | specular)*
- standard ray tracing and polygon scan conversion can handle a single
  diffuse or specular bounce: diffuse | specular
- distributed ray tracing can handle multiple specular bounces:
  (diffuse | specular) (specular)*
- radiosity can handle multiple diffuse bounces: (diffuse)*
- the general case, (diffuse | specular)*, cannot be handled by any
  efficient algorithm

Hybrid algorithms
- polygon scan conversion and ray tracing are the two principal 3D
  rendering mechanisms
- each has its advantages
  - polygon scan conversion is faster
  - ray tracing produces more realistic looking results
- hybrid algorithms exist
  - these generally use the speed of polygon scan conversion for most
    of the work and use ray tracing only to achieve particular special
    effects
Surface detail
- so far we have assumed perfectly smooth, uniformly coloured surfaces
- real life isn't like that:
  - multicoloured surfaces
    - e.g. a painting, a food can, a page in a book
  - bumpy surfaces
    - e.g. almost any surface! (very few things are perfectly smooth)
  - textured surfaces
    - e.g. wood, marble

Texture mapping
- without: all surfaces are smooth and of uniform colour
- with: most surfaces are textured with 2D texture maps; the pillars
  are textured with a solid texture
Basic texture mapping
- a texture is simply an image, with a 2D coordinate system (u,v)
- each 3D object is parameterised in (u,v) space
- each pixel maps to some part of the surface
- that part of the surface maps to part of the texture

Parameterising a primitive
- polygon: give (u,v) coordinates for three vertices, or treat as part
  of a plane
- plane: give u-axis and v-axis directions in the plane
- cylinder: one axis goes up the cylinder, the other around the
  cylinder
Sampling texture space
- find the (u,v) coordinate of the sample point on the object and map
  this into texture space

Sampling texture space: finding the value
- nearest neighbour: the sample value is the nearest pixel value to
  the sample point
- bilinear reconstruction: the sample value is the weighted mean of
  the pixels around the sample point: pixels (i,j), (i+1,j), (i,j+1),
  (i+1,j+1), weighted by the fractional offsets s and t
Sampling texture space: interpolation methods
- nearest neighbour
  - fast, with many artefacts
- bilinear
  - reasonably fast, blurry
- can we get better results?
  - bicubic gives better results
    - uses 16 values (4×4) around the sample location
    - but runs at one quarter the speed of bilinear
  - biquadratic
    - uses 9 values (3×3) around the sample location
    - faster than bicubic, slower than bilinear; results seem to be
      nearly as good as bicubic

Texture mapping examples
- nearest-neighbour compared with bilinear
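Bilinear reconstruction, as described above, can be sketched in a few lines of Python. This is an illustrative example rather than anything from the notes; the texture is assumed to be a plain 2D list indexed [j][i], with (u,v) given in texel coordinates:

```python
def bilinear(tex, u, v):
    """Sample texture `tex` at real-valued (u, v): the weighted mean of
    the four pixels around the sample point, with weights given by the
    fractional offsets s and t."""
    i, j = int(u), int(v)
    s, t = u - i, v - j
    p00 = tex[j][i]          # pixel (i,   j)
    p10 = tex[j][i + 1]      # pixel (i+1, j)
    p01 = tex[j + 1][i]      # pixel (i,   j+1)
    p11 = tex[j + 1][i + 1]  # pixel (i+1, j+1)
    return ((1 - s) * (1 - t) * p00 + s * (1 - t) * p10
            + (1 - s) * t * p01 + s * t * p11)
```

Sampling exactly at a texel centre returns that texel; sampling midway between four texels returns their mean.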
Down-sampling
- if the pixel covers quite a large area of the texture, then it will
  be necessary to average the texture across that area, not just take
  a sample in the middle of the area

Multi-resolution texture
- rather than down-sampling every time you need to, have multiple
  versions of the texture at different resolutions and pick the
  appropriate resolution to sample from…
- you can use tri-linear interpolation to get an even better result:
  that is, use bi-linear interpolation in the two nearest levels and
  then linearly interpolate between the two interpolated values
The MIP map
- an efficient memory arrangement for a multi-resolution colour image
- pixel (x,y) is a bottom level pixel location (level 0); for an image
  of size (m,n), it is stored at these locations in level k:

    Red:   ( (m+x)/2^k , y/2^k )
    Green: ( (m+x)/2^k , (n+y)/2^k )
    Blue:  ( x/2^k , (n+y)/2^k )

Solid textures
- texture mapping applies a 2D texture to a surface
  - colour = f(u,v)
- solid textures have colour defined for every point in space
  - colour = f(x,y,z)
- permits the modelling of objects which appear to be carved out of a
  material
What can a texture map modify?
- any (or all) of the colour components
  - ambient, diffuse, specular
- transparency
  - "transparency mapping"
- reflectiveness
- but also the surface normal
  - "bump mapping"

Bump mapping
- the surface normal is used in calculating both diffuse and specular
  reflection
- bump mapping modifies the direction of the surface normal so that
  the surface appears more or less bumpy
- rather than using a texture map, a 2D function can be used which
  varies the surface normal smoothly across the plane
- but bump mapping doesn't change the object's outline
Image Processing
- filtering
  - convolution
  - nonlinear filtering
- point processing
  - intensity/colour correction
- compositing
- halftoning & dithering
- compression
  - various coding schemes

Filtering
- move a filter over the image, calculating a new value for every
  pixel
Filters - discrete convolution
- convolve a discrete filter with the image to produce a new image
- in one dimension:

    f'(x) = Σ_{i=−∞}^{+∞} h(i) × f(x − i)

  where h(i) is the filter
- in two dimensions:

    f'(x,y) = Σ_{i=−∞}^{+∞} Σ_{j=−∞}^{+∞} h(i,j) × f(x − i, y − j)

Example filters - averaging/blurring
- basic 3×3 blurring filter:

    1/9 ×  1 1 1
           1 1 1
           1 1 1

- Gaussian 3×3 blurring filter:

    1/16 ×  1 2 1
            2 4 2
            1 2 1

- Gaussian 5×5 blurring filter:

    1/112 ×  1 2  4 2 1
             2 6  9 6 2
             4 9 16 9 4
             2 6  9 6 2
             1 2  4 2 1
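The 2D convolution sum above translates directly into code. A minimal Python sketch, not part of the notes; it assumes pixels outside the image are zero and that the filter is indexed with its centre at (0,0):

```python
def convolve2d(image, h):
    """Discrete 2D convolution f'(x,y) = sum_ij h(i,j) f(x-i, y-j).
    `image` and `h` are 2D lists; out-of-range pixels count as zero."""
    H, W = len(image), len(image[0])
    kh, kw = len(h), len(h[0])
    cy, cx = kh // 2, kw // 2          # centre of the filter
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    sy, sx = y - (j - cy), x - (i - cx)
                    if 0 <= sy < H and 0 <= sx < W:
                        acc += h[j][i] * image[sy][sx]
            out[y][x] = acc
    return out
```

With the basic 1/9 blurring filter, a constant image stays constant in the interior; at the corners only four of the nine taps land inside the image, so the output darkens there (a visible consequence of the zero-boundary assumption).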
Example filters - edge detection
- Prewitt filters:

    horizontal     vertical      diagonal
     1  1  1        1  0 -1       1  1  0
     0  0  0        1  0 -1       1  0 -1
    -1 -1 -1        1  0 -1       0 -1 -1

- Sobel filters:

    horizontal     vertical      diagonal
     1  2  1        1  0 -1       2  1  0
     0  0  0        2  0 -2       1  0 -1
    -1 -2 -1        1  0 -1       0 -1 -2

- Roberts filters, e.g.:

     0  1     1  0
    -1  0     0 -1

Example filter - horizontal edge detection
- convolving the horizontal edge detection filter with an image picks
  out horizontal boundaries: the result is 0 in regions of constant
  intensity (whether 100 or 0) and large (300 for a 100-to-0 step)
  along horizontal edges, with intermediate values (100, 200) where
  the boundary runs diagonally
Example filter - horizontal edge detection
- (original image shown alongside the result of a 3×3 Prewitt
  horizontal edge detection filter)
- mid-grey = no edge, black or white = strong edge

Median filtering
- not a convolution method
- the new value of a pixel is the median of the values of all the
  pixels in its neighbourhood
- e.g. 3×3 median filter:

    10 15 17 21 24 27        16 21 24 27
    12 16 20 25 99 37        20 25 36 39
    15 22 23 25 38 42   →    23 36 39 41
    18 37 36 39 40 44
    34  2 40 41 43 47

  e.g. the highlighted pixel's 3×3 neighbourhood (16,20,25,22,23,25,
  37,36,39) is sorted into order (16,20,22,23,25,25,36,37,39) and the
  median, 25, is taken
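The 3×3 median filter is short enough to write out in full. An illustrative Python sketch (not from the notes); for brevity it only produces values for the interior pixels, so the output is two rows and two columns smaller than the input:

```python
import statistics

def median_filter3(image):
    """3x3 median filter: each interior pixel becomes the median of the
    9 values in its neighbourhood (border pixels are omitted)."""
    H, W = len(image), len(image[0])
    out = []
    for y in range(1, H - 1):
        row = []
        for x in range(1, W - 1):
            neigh = [image[y + j][x + i]
                     for j in (-1, 0, 1) for i in (-1, 0, 1)]
            row.append(statistics.median(neigh))
        out.append(row)
    return out
```

Running it on the grid from the slide reproduces the 4×3 result shown there, including the removal of the shot-noise value 99.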
Median filter - example
- (original image with shot noise, shown alongside the median-filtered
  version and a Gaussian-blurred version)

Median filter - limitations
- copes well with shot (impulse) noise
- not so good at other types of noise
- a Gaussian filter eliminates the noise, but only at the expense of
  excessive blurring
Point processing
- each pixel's value is modified
- the modification function only takes that pixel's value into
  account:

    p'(i,j) = f{ p(i,j) }

  where p(i,j) is the value of the pixel and p'(i,j) is the modified
  value
- the modification function, f(p), can perform any operation that maps
  one intensity value to another

Point processing: inverting an image
- f(p) maps black to white and white to black: a straight line from
  (black, white) down to (white, black)
Point processing: improving an image's contrast
- a suitable f(p) stretches a dark histogram out into an improved
  histogram covering the full intensity range

Point processing: modifying the output of a filter
- e.g. after edge detection, where black or white = edge and mid-grey
  = no edge, f(p) can remap to black = edge, white = no edge, grey =
  indeterminate
- or, where black = edge and white = no edge, f(p) can threshold the
  result
Point processing: gamma correction
- the intensity displayed on a CRT is related to the voltage on the
  electron gun by:

    i ∝ V^γ

- the voltage is directly related to the pixel value:

    V ∝ p

- gamma correction modifies pixel values in the inverse manner:

    p' = p^(1/γ)

- thus generating the appropriate intensity on the CRT:

    i ∝ V^γ ∝ p'^γ ∝ p

- CRTs generally have gamma values around 2.0

Image compositing
- merging two or more images together
- what does this operator do?
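The gamma correction relation is a one-liner. An illustrative Python sketch, not part of the notes, assuming pixel values normalised to [0,1]:

```python
def gamma_correct(p, gamma=2.0):
    """Gamma-correct a pixel value p in [0,1]: p' = p**(1/gamma), so
    that the CRT's response i ∝ (p')**gamma reproduces i ∝ p."""
    return p ** (1.0 / gamma)
```

The round trip gamma_correct(p)**gamma recovers p, which is exactly the point: the correction cancels the display's nonlinearity.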
Simple compositing
- copy pixels from one image to another
  - only copying the pixels you want
  - use a mask to specify the desired pixels

Alpha blending for compositing
- instead of a simple boolean mask, use an alpha mask
  - the value of the alpha mask determines how much of each image (a
    and b) to blend together to produce the final pixel d
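As a concrete sketch of alpha blending (not from the notes; the linear blend d = α·a + (1−α)·b is the standard formulation assumed here):

```python
def alpha_blend(a, b, alpha):
    """Blend images a and b with an alpha mask: where alpha = 1 the
    result is pure a, where alpha = 0 pure b, with a linear mix in
    between (d = alpha*a + (1-alpha)*b at each pixel)."""
    return [[alpha[y][x] * a[y][x] + (1 - alpha[y][x]) * b[y][x]
             for x in range(len(a[0]))] for y in range(len(a))]
```

A boolean mask is just the special case where every alpha value is 0 or 1.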
Arithmetic operations
- images can be manipulated arithmetically
  - simply apply the operation to each pixel location in turn
- multiplication
  - used in masking
- subtraction (difference)
  - used to compare images
  - e.g. comparing two x-ray images

Difference example
- the two images are taken from slightly different viewpoints
- take the difference between the two images: d = 1 − |a − b|
  - black = large difference, white = no difference
Halftoning & dithering
- mainly used to convert greyscale to binary
  - e.g. printing greyscale pictures on a laser printer
  - 8-bit to 1-bit
- is also used in colour printing, normally with four colours:
  - cyan, magenta, yellow, black

Halftoning
- each greyscale pixel maps to a square of binary pixels
  - e.g. five intensity levels can be approximated by a 2×2 pixel
    square
  - 1-to-4 pixel mapping
Halftoning dither matrix
- one possible set of patterns for the 3×3 case can be represented by
  the dither matrix:

    7 9 5
    2 1 4
    6 3 8

- 1-to-9 pixel mapping

Rules for halftone pattern design
- mustn't introduce visual artefacts in areas of constant intensity
  - e.g. a pattern of isolated stripes won't work very well
- every on pixel in intensity level j must also be on in levels > j
  - i.e. on pixels form a growth sequence
- pattern must grow outward from the centre
  - simulates a dot getting bigger
- all on pixels must be connected to one another
  - this is essential for printing, as isolated on pixels will not
    print very well (if at all)
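The dither matrix encodes all nine patterns at once: a cell is on when its matrix entry is at most the intensity level. A small Python sketch, not from the notes:

```python
def halftone_pattern(level, dither=((7, 9, 5), (2, 1, 4), (6, 3, 8))):
    """Return the 3x3 binary pattern for intensity `level` in 0..9:
    cell (i,j) is on iff its dither-matrix entry is <= level. Because
    the entries rank the cells, the on pixels form a growth sequence
    that grows outward from the centre."""
    return [[1 if dither[j][i] <= level else 0 for i in range(3)]
            for j in range(3)]
```

Level 0 gives an all-off pattern, level 1 turns on only the centre cell, and each subsequent level adds exactly one more pixel, satisfying the growth-sequence rule above.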
Ordered dither
- halftone prints and photocopies well, at the expense of large dots
- an ordered dither matrix produces a nicer visual result than a
  halftone dither matrix

    ordered dither:   1  9  3 11      halftone:  16  8 11 14
                     15  5 13  7                 12  1  2  5
                      4 12  2 10                  7  4  3 10
                     14  8 16  6                 15  9  6 13

1-to-1 pixel mapping
- a simple modification of the ordered dither method can be used
  - turn a pixel on if its intensity is greater than (or equal to) the
    value of the corresponding cell in the dither matrix d(m,n)
- e.g.:
  - quantise the 8-bit pixel value:  q(i,j) = p(i,j) div 15
  - find the binary value:  b(i,j) = ( q(i,j) ≥ d(i mod 4, j mod 4) )
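The 1-to-1 ordered dither mapping above is a direct lookup. An illustrative Python sketch, not part of the notes (it indexes the matrix as D[row][column], i.e. d(j mod 4, i mod 4), which is equivalent for this symmetric use):

```python
D = ((1, 9, 3, 11),
     (15, 5, 13, 7),
     (4, 12, 2, 10),
     (14, 8, 16, 6))          # 4x4 ordered dither matrix

def ordered_dither(image):
    """1-to-1 pixel mapping: quantise each 8-bit pixel to q = p div 15
    (range 0..17) and turn the pixel on iff q >= the dither cell that
    the pixel falls on."""
    return [[1 if (image[j][i] // 15) >= D[j % 4][i % 4] else 0
             for i in range(len(image[0]))]
            for j in range(len(image))]
```

A constant mid-grey input produces a fixed on/off pattern whose density tracks the grey level, which is exactly the visible "regular dither pattern" mentioned later in the examples.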
Error diffusion
- error diffusion gives a more pleasing visual result than ordered
  dither
- method:
  - work left to right, top to bottom
  - map each pixel to the closest quantised value
  - pass the quantisation error on to the pixels to the right and
    below, and add in the errors before quantising these pixels

Error diffusion - example (1)
- map 8-bit pixels to 1-bit pixels
- quantise and calculate new error values:

    8-bit value f(i,j)    1-bit value b(i,j)    error e(i,j)
    0-127                 0                     f(i,j)
    128-255               1                     f(i,j) − 255

- each 8-bit value is calculated from the pixel and error values:

    f(i,j) = p(i,j) + ½ e(i−1,j) + ½ e(i,j−1)

  in this example the errors from the pixels to the left and above are
  taken into account
Error diffusion - example (2)
- original image:    60  80
                    107 100

- process pixel (0,0): f = 60 → output 0, error +60; +30 goes right
  and +30 goes down:
                      0 110
                    137 100

- process pixel (1,0): f = 110 → output 0, error +110; +55 goes down:
                      0   0
                    137 155

- process pixel (0,1): f = 137 → output 1, error 137 − 255 = −118;
  −59 goes right:
                      0   0
                      1  96

- process pixel (1,1): f = 96 → output 0 (its error, +48 per
  neighbour, is passed out of the block)

Error diffusion (Floyd & Steinberg)
- Floyd & Steinberg developed the error diffusion method in 1975
  - often called the "Floyd-Steinberg algorithm"
- their original method diffused the errors in the following
  proportions (× marks the current pixel; the row above has already
  been processed, the row below is still to be processed):

          ×    7/16
    3/16 5/16  1/16
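The Floyd-Steinberg proportions plug straight into the error diffusion method. An illustrative Python sketch, not from the notes; it works on a float copy of the input and simply drops error that would fall outside the image:

```python
def floyd_steinberg(image):
    """Map 8-bit pixels to 1-bit with Floyd-Steinberg error diffusion:
    quantisation error is passed on as 7/16 right, 3/16 below-left,
    5/16 below, 1/16 below-right (left to right, top to bottom)."""
    H, W = len(image), len(image[0])
    f = [[float(v) for v in row] for row in image]   # working copy
    out = [[0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            old = f[y][x]
            new = 255 if old >= 128 else 0           # closest quantised value
            out[y][x] = 1 if new else 0
            err = old - new
            for dx, dy, w in ((1, 0, 7), (-1, 1, 3), (0, 1, 5), (1, 1, 1)):
                nx, ny = x + dx, y + dy
                if 0 <= nx < W and 0 <= ny < H:
                    f[ny][nx] += err * w / 16.0
    return out
```

On constant black or white input no error accumulates and the output is trivially all 0s or all 1s; on mid-greys the propagated error produces the characteristic irregular dot pattern.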
Halftoning & dithering — examples
- original
  - halftoned with a very fine screen
- thresholding
  - <128 ⇒ black, ≥128 ⇒ white
- halftoning (4×4 cells) and halftoning (5×5 cells)
  - the larger the cell size, the more intensity levels available
  - the smaller the cell, the less noticeable the halftone dots
- ordered dither
  - the regular dither pattern is clearly visible
- error diffused
  - more random than ordered dither and therefore looks more
    attractive to the human eye
Encoding & compression
- introduction
- various coding schemes
  - difference, predictive, run-length, quadtree
- transform coding
  - Fourier, cosine, wavelets, JPEG

What you should note about image data
- there's lots of it!
  - an A4 page scanned at 300 ppi produces:
    - 24MB of data in 24 bit per pixel colour
    - 1MB of data at 1 bit per pixel
  - the Encyclopaedia Britannica would require 25GB at 300 ppi, 1 bit
    per pixel
- adjacent pixels tend to be very similar
Encoding - overview
- image → Mapper → Quantiser → Symbol encoder → encoded image (fewer
  bits than the original image)
- mapper
  - maps pixel values to some other set of values
  - designed to reduce inter-pixel redundancies
- quantiser
  - reduces the accuracy of the mapper's output
  - designed to reduce psychovisual redundancies
- symbol encoder
  - encodes the quantiser's output
  - designed to reduce symbol redundancies
- all three operations are optional

Lossless vs lossy compression
- lossless
  - allows you to exactly reconstruct the pixel values from the
    encoded data
  - implies no quantisation stage and no losses in either of the other
    stages
- lossy
  - loses some data; you cannot exactly reconstruct the original pixel
    values
Raw image data
- can be stored simply as a sequence of pixel values
  - no mapping, quantisation, or encoding
- e.g.:

    5 54 5 18 5 30 16 69 43 58 40 33 18 13 16 3 16 9 7 189 119 69 44 60 42 68 161 149 70 37 48 35 57 2
    56 12 15 64 41 21 14 4 3 218 57 64 6 54 57 46 118 149 140 32 45 39 24 199 156 81 16 12 29 12 15 42
    …

Symbol encoding on raw data (an example of symbol encoding)
- pixels are encoded by variable length symbols
  - the length of the symbol is determined by the frequency of the
    pixel value's occurrence
- e.g.:

    p    P(p)    Code 1    Code 2
    0    0.19    000       11
    1    0.25    001       01
    …

  with Code 1 each pixel requires 3 bits; Code 2 gives shorter codes
  to the more frequent pixel values
Quantisation as a compression method (an example of quantisation)
- quantisation, on its own, is not normally used for compression
  because of the visual degradation of the resulting image
- however, an 8-bit to 4-bit quantisation using error diffusion would
  compress an image to 50% of the space

Difference mapping (an example of mapping)
- every pixel in an image will be very similar to those either side of
  it
- a simple mapping is to store the first pixel value and, for every
  other pixel, the difference between it and the previous pixel
- e.g.:

    pixels:       67 73 74 69  53 54 52 49 127 125 125 126
    differences:  67 +6 +1 -5 -16 +1 -2 -3 +78  -2   0  +1
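Difference mapping and its inverse are each a couple of lines. An illustrative Python sketch, not from the notes:

```python
def difference_map(pixels):
    """Difference mapping: keep the first pixel value and replace every
    other pixel by its difference from the previous pixel."""
    return [pixels[0]] + [b - a for a, b in zip(pixels, pixels[1:])]

def difference_unmap(diffs):
    """Inverse mapping: a running sum reconstructs the original pixels,
    so the mapping on its own is lossless."""
    out = [diffs[0]]
    for d in diffs[1:]:
        out.append(out[-1] + d)
    return out
```

On the example row above, the mapped values are mostly small, which is what makes a subsequent variable-length symbol encoding effective.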
Difference mapping - example (1)

Difference mapping - example (2)
(an example of mapping and symbol encoding combined)
Predictive mapping (an example of mapping)
- when transmitting an image left-to-right, top-to-bottom, we already
  know the values above and to the left of the current pixel
- predictive mapping uses those known pixel values to predict the
  current pixel value, and maps each pixel value to the difference
  between its actual value and the prediction
- e.g.:

    prediction:  p̂(i,j) = ½ p(i−1,j) + ½ p(i,j−1)
    difference (this is what we transmit):  d(i,j) = p(i,j) − p̂(i,j)

Run-length encoding (an example of symbol encoding)
- based on the idea that images often contain runs of identical pixel
  values
- method:
  - encode runs of identical pixels as run length and pixel value
  - encode runs of non-identical pixels as run length and pixel values
- e.g.:

    original pixels:      34 36 37 38 38 38 38 39 40 40 40 40 40 49 57 65 65 65 65
    run-length encoding:  3 34 36 37  4 38  1 39  5 40  2 49 57  4 65
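The run-length method above can be sketched as follows. This is an illustrative Python version, not from the notes; identical runs are emitted as (length, value) and non-identical runs as (length, list-of-values), mirroring the example rather than the 8-bit packing described on the next slide:

```python
def rle(pixels):
    """Run-length encode: runs of identical pixels become
    (length, value); runs of non-identical pixels become
    (length, [values]). A lone pixel counts as a non-identical run."""
    out, i, n = [], 0, len(pixels)
    while i < n:
        if i + 1 < n and pixels[i] == pixels[i + 1]:
            j = i                                 # identical run
            while j + 1 < n and pixels[j + 1] == pixels[i]:
                j += 1
            out.append((j - i + 1, pixels[i]))
            i = j + 1
        else:                                     # non-identical run
            lits = [pixels[i]]
            i += 1
            while i < n and not (i + 1 < n and pixels[i] == pixels[i + 1]):
                lits.append(pixels[i])
                i += 1
            out.append((len(lits), lits))
    return out
```

On the slide's example sequence this produces exactly the runs shown: 3 literals, four 38s, one 39, five 40s, 2 literals, four 65s.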
Run-length encoding - example (1)
- run length is encoded as an 8-bit value:
  - first bit determines type of run
    - 0 = identical pixels, 1 = non-identical pixels
  - other seven bits code length of run
    - binary value of run length − 1 (run length ∈ {1,…,128})
- pixels are encoded as 8-bit values

Run-length encoding - example (2)
- works well for computer generated imagery
- not so good for real-life imagery
- especially bad for noisy images
CCITT fax encoding
- fax images are binary
- 1D CCITT group 3

Transform coding
- transform N pixel values into coefficients of a set of N basis
  functions
- e.g. the eight pixel values 79 73 63 71 73 79 81 89 are represented
  as a weighted sum of basis functions; the first (constant) basis
  function carries the mean, coefficient 76, the next +1.5, …
Mathematical foundations
- each of the N pixels, f(x), is represented as a weighted sum of
  coefficients, F(u):

    f(x) = Σ_{u=0}^{N−1} F(u) H(u,x)     (inverse transform)

- H(u,x) is the array of weights
- e.g. H(u,x) for N=8:

            x:  0  1  2  3  4  5  6  7
        u=0    +1 +1 +1 +1 +1 +1 +1 +1
        u=1    +1 +1 +1 +1 -1 -1 -1 -1
        u=2    +1 +1 -1 -1 +1 +1 -1 -1
        u=3    +1 +1 -1 -1 -1 -1 +1 +1
        u=4    +1 -1 +1 -1 +1 -1 +1 -1
        u=5    +1 -1 +1 -1 -1 +1 -1 +1
        u=6    +1 -1 -1 +1 +1 -1 -1 +1
        u=7    +1 -1 -1 +1 -1 +1 +1 -1

Calculating the coefficients
- the coefficients can be calculated from the pixel values using this
  equation:

    F(u) = Σ_{x=0}^{N−1} f(x) h(x,u)     (forward transform)

- compare this with the equation for a pixel value, from the previous
  slide:

    f(x) = Σ_{u=0}^{N−1} F(u) H(u,x)     (inverse transform)
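The forward/inverse pair above can be sketched directly, here using the N=8 weight matrix from the slide with the Walsh-Hadamard choice h(x,u) = (1/N) H(u,x) mentioned on the next slide. An illustrative Python example, not from the notes:

```python
# N=8 weight matrix H[u][x] from the slide
H = [[1, 1, 1, 1, 1, 1, 1, 1],
     [1, 1, 1, 1, -1, -1, -1, -1],
     [1, 1, -1, -1, 1, 1, -1, -1],
     [1, 1, -1, -1, -1, -1, 1, 1],
     [1, -1, 1, -1, 1, -1, 1, -1],
     [1, -1, 1, -1, -1, 1, -1, 1],
     [1, -1, -1, 1, 1, -1, -1, 1],
     [1, -1, -1, 1, -1, 1, 1, -1]]

def forward(f, h):
    """Forward transform F(u) = sum_x f(x) h(x,u); h indexed h[x][u]."""
    N = len(f)
    return [sum(f[x] * h[x][u] for x in range(N)) for u in range(N)]

def inverse(F, H):
    """Inverse transform f(x) = sum_u F(u) H(u,x); H indexed H[u][x]."""
    N = len(F)
    return [sum(F[u] * H[u][x] for u in range(N)) for x in range(N)]

# Walsh-Hadamard: h(x,u) = (1/N) H(u,x)
h = [[H[u][x] / 8 for u in range(8)] for x in range(8)]
```

Because the rows of H are mutually orthogonal, inverse(forward(f, h), H) reconstructs f exactly; F(0) is the mean of the pixel values, matching the value 76 in the transform coding example.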
Walsh-Hadamard transform
- "square wave" transform
- h(x,u) = 1/N H(u,x)
- Walsh and Hadamard use the same basis functions, but numbered
  differently!
- invented by Walsh (1923) and Hadamard (1893); the two variants give
  the same results for N a power of 2

2D transforms
- the two-dimensional versions of the transforms are an extension of
  the one-dimensional cases:

    one dimension:   f(x) = Σ_{u=0}^{N−1} F(u) H(u,x)
    two dimensions:  f(x,y) = Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} F(u,v) H(u,v,x,y)
2D Walsh basis functions
- (these are the Walsh basis functions for N=4)
- in general, there are N² basis functions for an N×N image

Discrete Fourier transform (DFT)
- forward transform:

    F(u) = Σ_{x=0}^{N−1} f(x) e^(−i 2π ux / N) / N

- inverse transform:

    f(x) = Σ_{u=0}^{N−1} F(u) e^(i 2π xu / N)
DFT - alternative interpretation
- the DFT uses complex coefficients to represent real pixel values
- it can be reinterpreted as:

    f(x) = Σ_{u=0}^{N/2} A(u) cos( 2π ux / N + θ(u) )

  where A(u) and θ(u) are real values: an amplitude and a phase for
  each frequency

Discrete cosine transform (DCT)
- forward transform:

    F(u) = Σ_{x=0}^{N−1} f(x) cos( (2x+1) u π / 2N )

- inverse transform:

    f(x) = Σ_{u=0}^{N−1} F(u) α(u) cos( (2x+1) u π / 2N )

  where α(0) = 1/N and α(u) = 2/N for u ∈ {1, 2, … N−1}
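The DCT pair above can be checked numerically. An illustrative Python sketch, not from the notes:

```python
import math

def dct(f):
    """Forward DCT: F(u) = sum_x f(x) cos((2x+1) u pi / 2N)."""
    N = len(f)
    return [sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                for x in range(N)) for u in range(N)]

def idct(F):
    """Inverse DCT: f(x) = sum_u alpha(u) F(u) cos((2x+1) u pi / 2N),
    with alpha(0) = 1/N and alpha(u) = 2/N otherwise."""
    N = len(F)
    return [sum((1.0 / N if u == 0 else 2.0 / N) * F[u]
                * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                for u in range(N)) for x in range(N)]
```

With this normalisation F(0) is simply the sum of the pixel values, and idct(dct(f)) reconstructs f to floating-point accuracy, confirming that the α(u) weights are consistent with the forward transform.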
DCT basis functions
- (plots of h(u,x) for N=8; smooth cosine curves, in contrast to the
  square-wave Hadamard functions)

Haar transform: wavelets
- (plots of the Haar basis functions)
- Haar basis functions get progressively more local
Haar basis functions
- the first sixteen Haar basis functions (numbered 0-15)

Karhunen-Loève transform (KLT)
- "eigenvector", "principal component", "Hotelling" transform
- based on statistical properties of the image source
- theoretically the best transform encoding method
- but different basis functions for every different image source
- first derived by Hotelling (1933) for discrete data; by Karhunen
  (1947) and Loève (1948) for continuous data
JPEG: a practical example
- compression standard
  - JPEG = Joint Photographic Expert Group
- three different coding schemes, including:
  - baseline coding scheme
    - based on DCT, lossy
    - adequate for most compression applications

JPEG sequential baseline scheme
- input and output pixel data limited to 8 bits
- DCT coefficients restricted to 11 bits
- three step method:

    image → DCT transform → Quantisation → Variable length encoding
          → encoded JPEG image
JPEG example: DCT transform
- subtract 128 from each (8-bit) pixel value
- subdivide the image into 8×8 pixel blocks

JPEG example: quantisation
- quantise each coefficient, F(u,v), using the values in the
  quantisation matrix Z(u,v), which begins:

    16 11 10 16 24 40 51 61
    12 12 14 19 26 58 60 55
    14 13 16 24 40 57 69 56
JPEG example: symbol encoding
- the DC coefficient (mean intensity) is coded relative to the DC
  coefficient of the previous 8×8 block
- each non-zero AC coefficient is encoded by a variable length code
  representing both the coefficient's value and the number of
  preceding zeroes in the sequence
  - this is to take advantage of the fact that the sequence of 63 AC
    coefficients will normally contain long runs of zeroes