
Lehrstuhl für Informatik 4

Kommunikation und verteilte Systeme


Chapter 2: Basics
  Audio Technology
  Images and Graphics
  Video and Animation

2.2: Images and Graphics
  Digital image representation
  Image formats and color models
  JPEG, JPEG2000
  Image synthesis and graphics systems
  Image analysis

Chapter 3: Multimedia Systems - Communication Aspects and Services

Chapter 4: Multimedia Systems - Storage Aspects

Chapter 5: Multimedia Usage

Digital Image Representation

A digital image is a spatial representation of an object (2D, 3D scene or another image - real or virtual).

Definition of digital image:
Let I, J, K ⊂ Z be finite intervals. Let G ⊂ N0 with |G| < ∞ be the set of grey scale levels / the color depth (intensity value of a picture element = a pixel) of the image.
(1) A 2D-image is a function f: I × J → G
(2) A 3D-image is a function f: I × J × K → G
(3) If G = {0,1}, the function is a binary (or bit) image, otherwise it is a pixel image
The resolution depends on the size of I and J (and K) and describes the number of pixels per row and per column.

Example
To display a 525-line television picture (NTSC) without noticeable degradation with a Video Graphics Array (VGA) video controller, 640 × 480 pixels and 256 discrete grey levels give an array of 307,200 8-bit numbers and a total of 2,457,600 bit.
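
The storage figure of the VGA example can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name `raw_image_bits` is chosen here for illustration and does not come from the slides.

```python
def raw_image_bits(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of an uncompressed raster image in bits."""
    return width * height * bits_per_pixel


# VGA example: 640 x 480 pixels with 256 grey levels (8 bit per pixel).
pixels = 640 * 480
bits = raw_image_bits(640, 480, 8)
print(pixels, bits)
```

The same function applies to any other resolution and color depth, e.g. a 1024 × 768 image with 24 bit color depth.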

Chapter 2.2: Images and Graphics


Image Representation

An Image Capturing Format is specified by:
  spatial resolution (pixel x pixel) and color encoding (bits per pixel)

Example: captured image of a DVD video with 4:3 picture size:
  spatial resolution: 768 x 576 pixel
  color encoding: 1-bit (binary image), 8-bit (color or grayscale), 24-bit (color-RGB)

An Image Storing Format is a 2-dimensional array of values representing the image in a bitmap or pixmap, respectively (also called raster graphics). Each field of a bitmap holds a binary digit; data in a pixmap may be a collection of:
  3 numbers representing the intensities of the red, green, and blue components of the color
  3 numbers representing indices to tables of red, green and blue intensities
  Single numbers as index to a table of color triples
  Single numbers as index to any other data structure that represents a color / color system
Further properties can be assigned to the whole image: width, height, depth, version, etc.

Color Models

Why store values for red, green and blue?
  Color perception by the human brain is possible through the additive composition of red, green and blue light (RGB system). The relative intensities of the RGB values are transmitted to the monitor where they are reproduced at each point in time.
  On a computer monitor, each pixel is given as an overlay of those three image tones with different intensities; by this, any color can be reproduced.

But: another possible color model: CMYK
  When printing an image, other color components are used: cyan, magenta, yellow and key (black), which together can also reproduce all colors.
  Thus, many image processing programs and also some image storing formats support this model.


Color Models

Another possibility is to use a different representation of color information by means of the YUV system, where
  Y is the brightness (or luminance) information
  U and V are color difference signals (chrominance)
  Y, U and V are functions of R, G and B
Why? As the human eye is more sensitive to brightness than to chrominance, separate the brightness information from the color information and code the more important luminance with more bits than the chrominance. This can save bits in the representation format.


Usual scheme:
  Y = 0.30 R + 0.59 G + 0.11 B (the color sensitivity of the human eye is considered)
  U = c1 (B - Y); V = c2 (R - Y)
  c1, c2 = constants reflecting perception aspects of the human eye and the human brain
Possible coding (YUV signal):
  Y = 0.30 R + 0.59 G + 0.11 B
  U = 0.493 (B - Y) = -0.148 R - 0.29 G + 0.439 B
  V = 0.877 (R - Y) = 0.614 R - 0.517 G - 0.096 B
This is a system of 3 equations for determining Y, U, V from R, G, B or for recalculating R, G, B from Y, U, V.
The resolution of Y is more important than the resolution of U and V: spend more bits for Y than for U and V (Y : U : V = 4 : 2 : 2).
The weighting factors in the calculation of the Y signal compensate for the unbalanced color perception of the human eye.
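
The three equations of the YUV coding can be written down directly. The following is a minimal sketch using the constants 0.493 and 0.877 given above; the function names `rgb_to_yuv` and `yuv_to_rgb` are illustrative, and values are kept as floating point numbers without any rounding or clipping.

```python
def rgb_to_yuv(r: float, g: float, b: float) -> tuple[float, float, float]:
    """Compute Y, U, V from R, G, B with the weights used on the slide."""
    y = 0.30 * r + 0.59 * g + 0.11 * b
    u = 0.493 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def yuv_to_rgb(y: float, u: float, v: float) -> tuple[float, float, float]:
    """Recalculate R, G, B by solving the same three equations."""
    b = y + u / 0.493
    r = y + v / 0.877
    g = (y - 0.30 * r - 0.11 * b) / 0.59
    return r, g, b


# A grey pixel carries no color difference information: U and V vanish.
print(rgb_to_yuv(128, 128, 128))
```

For any grey value (R = G = B) the luminance equals that value and both chrominance signals are zero, which is why the chrominance channels can often be stored with fewer bits.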


Image Formats

Lots of different image formats are in use today, e.g.
  GIF (Graphics Interchange Format)
    Compressed with some basic lossless compression techniques to 20 - 25% of the original picture size. Supports up to 256 colors per image, selected from a 24-bit palette.
  BMP (Bitmap)
    Device-independent representation of an image: uses the RGB color model, without compression. Color depth up to 24 bit, with the additional option of specifying a color table to use.
  TIFF (Tagged Image File Format)
    Supports grey levels, RGB, and the CMYK color model. Also supports lots of different compression methods. Additionally contains a descriptive part with properties a display should provide to show the image.
  PostScript
    Images are described without reference to special properties such as resolution. A nice feature for printers, but hard to include into documents where you have to know the image size.
  JPEG (Joint Photographic Experts Group)
    Lots of possible compressions, mostly with loss!

Why Compression?

High-resolution image: e.g. 1024 × 768 pixel, 24 bit color depth
  1024 × 768 × 24 = 18,874,368 bit
Image formats like GIF:
  Lossless compression (entropy encoding) for reducing the data amount while keeping the image quality
JPEG:
  Lossy compression: remove some image details to achieve a higher compression rate by suppressing higher frequencies
  Combined with lossless techniques
  Trade-off between file size and quality
  JPEG is a joint standard of ISO and ITU-T
  In June 1987, an adaptive transformation coding technique based on DCT was adopted for JPEG
  In 1992, JPEG became an ISO international standard

JPEG

Implementation
  Independent of image size
  Applicable to any image and pixel aspect ratio
Color representation
  JPEG applies to color and grey-scale still images
Image content
  Of any complexity, with any statistical characteristics
Properties of JPEG
  State-of-the-art regarding compression factor and image quality
  Runs on as many available standard processors as possible
  Compression mechanisms are available as software-only packages or together with specific hardware support - use of specialized hardware should speed up image decompression
  Encoded data stream has a fixed interchange format
  Fast coding is also used for video sequences: Motion JPEG

How could we compress?

Entropy encoding
  Data stream is considered to be a simple digital sequence without semantics
  Lossless coding, the decompression process regenerates the data completely
  Used regardless of the media's specific characteristics
  Examples: Run-length encoding, Huffman encoding, Arithmetic encoding
Source encoding
  Semantics of the data are taken into account
  Lossy coding (encoded data are not identical with original data)
  Degree of compression depends on the data contents
  Example: Discrete Cosine Transformation (DCT) as transformation technique of the spatial domain into the two-dimensional frequency domain
Hybrid encoding
  Used by most multimedia systems
  Combination of entropy and source encoding
  Examples: JPEG, MPEG, H.261

Compression Steps in JPEG

[Figure: Uncompressed Image → Image Preparation (Pixel; Block, MCU) → Image Processing (Predictor; DCT) → Quantization (approximation of real numbers by rational numbers) → Entropy Encoding (Run-length, Huffman, Arithmetic) → Compressed Image]

MCU: Minimum Coded Unit
DCT: Discrete Cosine Transform

Image Preparation
  Analog-to-digital conversion
  Image division into blocks of N×N pixels
  Suitable structuring and ordering of image information
Image Processing - Source Encoding
  Transformation from the spatial to the frequency domain using DCT
  In principle no compression itself, but computation of new coefficients as input for the compression process
Quantization
  Mapping of real numbers into rational numbers (approximation)
  A certain loss of precision will in general be unavoidable
Entropy Encoding
  Lossless compression of a sequential digital data stream

The Principle

[Figure: encoding chain Original → Transformation → Quantization (using a Quantization Table; get rid of invisible details) → Encode (Huffman, Run Length Encoding) → JPEG Picture]
[Figure: the opposite direction JPEG → Decoder → Dequantization (the details cannot be reconstructed) → Retransformation → Original]

Without quantization: encoding gain would be very poor (or nonexistent)
Transformation and retransformation must be inverse to each other
Task of the transformation: produce a picture representation which may be encoded with a high gain of reduction

Variants of Image Compression

JPEG is not a single format; one of a number of modes can be chosen:
Lossy sequential DCT-based mode (baseline process)
  Must be supported by every JPEG implementation
  Block, MCU, FDCT, Run-length, Huffman
Expanded lossy DCT-based mode
  Enhancement to the baseline process by adding progressive encoding
Lossless mode
  Low compression ratio, perfect reconstruction of the original image
  No DCT, but differential encoding by prediction
Hierarchical mode
  Accommodates images of different resolutions
  Selects its algorithms from the three other modes

First Step: Image Preparation

General image model
  Independence from image parameters like size and pixel ratio
  Description of most of the well-known picture representations
  Source picture consists of 1 to 255 components (planes) Ci
  Components may be assigned to RGB or YUV values
    For example, C1 may be assigned to red color information
  Each component Ci can have a different number of superpixels Xi, Yi
    (A superpixel is a rectangle of pixels which all have the same value)

[Figure: components C1, C2, ..., CN; each component Ci is a grid of Xi columns and Yi rows of superpixels]

Picture Preparation - Components

Resolution of the components may be different:
[Figure: three components C1, C2, C3 with X1 = 2 X2 = 2 X3 and Y1 = Y2 = Y3]

A grey-scale image consists (in most cases) of a single component
RGB color representation has three components with equal resolution
YUV color image processing uses
  Y1 = 4 Y2 = 4 Y3 and X1 = 4 X2 = 4 X3

Image Preparation - Dimensions

Dimensions of a compressed image are defined by
  X (maximum of all Xi),
  Y (maximum of all Yi),
  Hi and Vi (relative horizontal and vertical sampling ratios for each component i) with
\[ H_i = \frac{X_i}{\min_j X_j} \quad\text{and}\quad V_i = \frac{Y_i}{\min_j Y_j} \]
Hi and Vi must be integers in the range of 1 to 4. This restriction is needed for the interleaving of components.

Example: Y = 4 pixels, X = 6 pixels
  C1: X1 = 6, Y1 = 4, H1 = 2, V1 = 2
  C2: X2 = 6, Y2 = 2, H2 = 2, V2 = 1
  C3: X3 = 3, Y3 = 2, H3 = 1, V3 = 1

Image Preparation - Data Ordering

An image is divided into several components which can be processed one by one. But: how to prepare a component for processing?
Observation for most parts of an image: not so much difference between the values in a rectangle of N×N pixels
For further processing: divide each component of an image into blocks of N×N pixels
Thus, the image is divided into data units (blocks):
  Lossless mode uses one pixel as one data unit
  Lossy mode uses blocks of 8×8 pixels (with 8 or 12 bits per pixel)
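
The relative sampling ratios Hi and Vi from the dimension example can be computed from the component sizes. The sketch below assumes that every component size is an integer multiple of the smallest one; the function name `sampling_factors` is illustrative only.

```python
def sampling_factors(sizes: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Return (H_i, V_i) for each component given its size (X_i, Y_i)."""
    min_x = min(x for x, _ in sizes)
    min_y = min(y for _, y in sizes)
    factors = []
    for x, y in sizes:
        if x % min_x or y % min_y:
            raise ValueError("component size is not a multiple of the smallest one")
        h, v = x // min_x, y // min_y
        # The ratios must be integers in the range 1..4 (needed for interleaving).
        if not (1 <= h <= 4 and 1 <= v <= 4):
            raise ValueError("sampling ratios must lie in the range 1 to 4")
        factors.append((h, v))
    return factors


# Component sizes (X_i, Y_i) of C1, C2, C3 from the example with X = 6, Y = 4.
print(sampling_factors([(6, 4), (6, 2), (3, 2)]))
```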


Image Preparation - Data Ordering

Non-interleaved data ordering:
  The easiest but not the most convenient sequence of data processing
  Data units are processed component by component
  For one component, the processing order is left-to-right and top-to-bottom
With the non-interleaved technique, an RGB-encoded image is processed by:
  First the red component only
  Then the blue component, followed by the green component
This is (for speed reasons) less suitable than data unit interleaving
Often more suitable: interleave data units

Interleaved Data Ordering

Interleaving means: don't process all blocks component by component, but mix data units from all components
Interleaved data units of different components:
  Combination to Minimum Coded Units (MCUs)
  If all components have the same resolution
    MCU consists of one data unit for each component
  If components have different resolutions
    1. For each component, regions of data units are determined; data units in one region are ordered left-to-right and top-to-bottom
    2. Each component consists of the same number of regions
    3. MCU consists of one region in each component
Up to 4 components can be encoded in interleaved mode (according to JPEG)
Each MCU consists of at most ten data units


Compression Steps in JPEG

[Figure: Uncompressed Image → Image Preparation (Pixel; Block, MCU) → Image Processing (Predictor; DCT) → Quantization (approximation of real numbers by rational numbers) → Entropy Encoding (Run-length, Huffman, Arithmetic) → Compressed Image]

MCU: Minimum Coded Unit
DCT: Discrete Cosine Transform

Image Preparation - MCUs

MCU example: four components C1, C2, C3, C4
  C1: H1 = 2, V1 = 2
  C2: H2 = 2, V2 = 1
  C3: H3 = 1, V3 = 2
  C4: H4 = 1, V4 = 1
with
\[ H_i = \frac{x_i}{\min_j x_j}, \qquad V_i = \frac{y_i}{\min_j y_j} \]

[Figure: grids of data units of the four components, with the regions forming the first MCU: a00 a01 a10 a11, b00 b01, c00 c10, d00]

MCUs (9 data units per MCU):
  MCU1 = a00 a01 a10 a11 b00 b01 c00 c10 d00
  MCU2 = a02 a03 a12 a13 b02 b03 c01 c11 d01
  MCU3 = a04 a05 a14 a15 b04 b05 c02 c12 d02
  MCU4 = a20 a21 a30 a31 b10 b11 c20 c30 d10
where
  aij: data units of C1
  bij: data units of C2
  cij: data units of C3
  dij: data units of C4
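
The composition of the MCUs in the four-component example can be reproduced with a short sketch. It assumes that the image is three MCUs wide (this is inferred from MCU4 starting in the next row of regions) and that row and column indices stay single-digit; the function name `mcu_sequence` and the component letters are illustrative.

```python
def mcu_sequence(factors, mcus_per_row, mcu_count, names="abcd"):
    """List the data units of each MCU for components with sampling ratios (H, V).

    Each MCU takes one region of H x V data units from every component;
    inside a region the order is left-to-right and top-to-bottom.
    """
    mcus = []
    for k in range(mcu_count):
        mcu_row, mcu_col = divmod(k, mcus_per_row)
        units = []
        for name, (h, v) in zip(names, factors):
            for r in range(v):          # rows inside the region
                for c in range(h):      # columns inside the region
                    units.append(f"{name}{mcu_row * v + r}{mcu_col * h + c}")
        mcus.append(units)
    return mcus


# (H_i, V_i) of C1..C4 from the example; three MCUs per row is an assumption.
factors = [(2, 2), (2, 1), (1, 2), (1, 1)]
for number, mcu in enumerate(mcu_sequence(factors, mcus_per_row=3, mcu_count=4), 1):
    print(f"MCU{number} =", " ".join(mcu))
```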

Result of image preparation: sequence of 8×8 blocks, the order is defined by MCUs
The samples are encoded with 8 bit/pixel
Next step: image processing by source encoding

Source Encoding - Transformation

Encoding by transformation: data are transformed into another mathematical domain which is more suitable for compression.
  The inverse transformation must exist and must be easy to calculate
  Most widely known example: Fourier transformation
\[ F_{uv} = \sum_{x=0}^{m-1} \sum_{y=0}^{n-1} f_{xy} \, e^{-\frac{2\pi i}{m} u x} \, e^{-\frac{2\pi i}{n} v y} \]
  The parameters m and n indicate the granularity
  Fast Fourier Transformation (FFT)
Most effective transformation for image compression: Discrete Cosine Transformation (DCT)
\[ F_{uv} = \frac{2}{\sqrt{nm}} \sum_{x=0}^{m-1} \sum_{y=0}^{n-1} f_{xy} \cos\frac{(2x+1)u\pi}{2m} \cos\frac{(2y+1)v\pi}{2n} \]

Discrete Cosine Transformation

Let f_xy be a pixel (x, y) in the original picture (0 ≤ x ≤ N-1; 0 ≤ y ≤ N-1).
\[ F_{uv} := \frac{2}{N} \, c_u c_v \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f_{xy} \cos\frac{(2x+1)u\pi}{2N} \cos\frac{(2y+1)v\pi}{2N}, \qquad u, v \in \{0, \dots, N-1\}, \]
\[ c_u = \begin{cases} \frac{1}{\sqrt{2}} & u = 0 \\ 1 & u > 0 \end{cases} \qquad (c_v \text{ analogously}) \]
f_xy: space domain (i.e. geometric)
F_uv: frequency domain (indicates how fast the information moves inside the rectangle)
F_00 is the lowest frequency in both directions, i.e. a measure of the average pixel value
F_uv with small total frequency (i.e. u+v small) are (in general) larger than F_uv with large u+v
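
The N×N cosine transformation and its inverse can be written as a direct (unoptimized) double sum. The sketch below uses the normalization 2/N · c_u · c_v; the function names `dct2` and `idct2` are illustrative, and no fast algorithm is attempted.

```python
import math


def _c(k: int) -> float:
    """Normalization factor: 1/sqrt(2) for index 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dct2(block):
    """Forward DCT of an N x N block f[x][y], returning F[u][v]."""
    n = len(block)
    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = 2.0 / n * _c(u) * _c(v) * total
    return coeffs


def idct2(coeffs):
    """Inverse DCT, recovering f[x][y] from F[u][v]."""
    n = len(coeffs)
    block = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            total = 0.0
            for u in range(n):
                for v in range(n):
                    total += (_c(u) * _c(v) * coeffs[u][v]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            block[x][y] = 2.0 / n * total
    return block


# A constant block has only one non-zero coefficient, F00 = N * f.
constant = [[100.0] * 8 for _ in range(8)]
print(round(dct2(constant)[0][0], 6))
```

For a constant block all AC coefficients vanish up to floating point error, which illustrates why smooth image regions compress well after the transformation.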


Retransformation: Inverse Cosine Transformation

\[ f_{xy} = \frac{2}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} c_u c_v F_{uv} \cos\frac{(2x+1)u\pi}{2N} \cos\frac{(2y+1)v\pi}{2N} \]

Simplest example (just for demonstration): let f_xy = f = constant. Then
\[ F_{00} = \frac{2}{N} \cdot \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f \cos(0) \cos(0) = \frac{2}{N} \cdot \frac{1}{2} \cdot N^2 f = N f \]
and all other F_uv = 0.
Retransformation:
\[ f = f_{00} = \frac{2}{N} \sum_{u} \sum_{v} c_u c_v F_{uv} \cos(\dots) \cos(\dots) = \frac{2}{N} \, c_0 c_0 F_{00} \cdot 1 \cdot 1 = \frac{2}{N} \cdot \frac{1}{2} \cdot N f = f \]

Example

N = 2:
\[ F_{uv} = c_u c_v \sum_{x=0}^{1} \sum_{y=0}^{1} f_{xy} \cos\frac{(2x+1)u\pi}{4} \cos\frac{(2y+1)v\pi}{4} \]
\[ = c_u c_v \left[ f_{00} \cos\frac{u\pi}{4}\cos\frac{v\pi}{4} + f_{01} \cos\frac{u\pi}{4}\cos\frac{3v\pi}{4} + f_{10} \cos\frac{3u\pi}{4}\cos\frac{v\pi}{4} + f_{11} \cos\frac{3u\pi}{4}\cos\frac{3v\pi}{4} \right] \]
\[ F_{00} = \frac{1}{2} \left[ f_{00} + f_{01} + f_{10} + f_{11} \right] \]
i.e. 2f if f_xy ≈ f

Transformed values can be much smaller than original values:
\[ F_{01} = \frac{1}{\sqrt{2}} \left[ f_{00} \cos\frac{\pi}{4} + f_{01} \cos\frac{3\pi}{4} + f_{10} \cos\frac{\pi}{4} + f_{11} \cos\frac{3\pi}{4} \right] = \frac{1}{2} \left[ f_{00} - f_{01} + f_{10} - f_{11} \right] \]
2 positive + 2 negative terms, i.e. F01 ≈ 0 if f_xy ≈ f

How can DCT be useful for JPEG? - F_uv for larger values of u and v are often very small!

Baseline Process - Image Processing

First step of image processing:
  Samples are encoded with 8 bits/pixel; each pixel is an integer in the range [0, 255]
  Pixel values are shifted to the range [-128, 127] (two's complement representation)
  Data units of 8 x 8 pixel values are defined by f_xy ∈ [-128, 127], where x, y are in the range [0, 7]
  Each value is transformed using the Forward DCT (FDCT):
\[ F_{uv} = \frac{1}{4} \, c_u c_v \sum_{x=0}^{7} \sum_{y=0}^{7} f_{xy} \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16} \]
  where c_u, c_v = 1/√2 for u, v = 0 and 1 otherwise, and u, v ∈ [0, 7]
  Cosine expressions are independent of f_xy → fast calculation is possible
  Result: from 64 coefficients f_xy we get 64 coefficients F_uv in the frequency domain

Meaning of Coefficients

N = 8 (standard):
\[ F_{uv} = \frac{1}{4} \, c_u c_v \dots \]
[Figure: transformation of an 8×8 block to frequencies; F00 is in the top-left corner, frequencies increase from low to high towards the right and towards the bottom (F01, F10, F12, F23, ...)]

Baseline Process - Image Processing

Coefficient F00: DC-coefficient
  Corresponds to the lowest frequency in both dimensions
  Determines the fundamental color of the data unit of 64 pixels
  Normally the values for F00 are very similar in neighboring blocks
Other coefficients (Fuv for u+v > 0): AC-coefficients
  Non-zero frequency in one or both dimensions
Reconstruction of the image: Inverse DCT (IDCT)
  If FDCT and IDCT could be calculated with full precision, DCT would be lossless
  In practice: precision is restricted (real numbers!), thus DCT is lossy
  → different implementations of JPEG decoders may produce different images
Reason for the transformation:
  Experience shows that many AC-coefficients have a value of almost zero, i.e. they are zero after quantization
  → entropy encoding may lead to significant data reduction.

Compression Steps in JPEG

[Figure: Uncompressed Image → Image Preparation (Pixel; Block, MCU) → Image Processing (Predictor; DCT) → Quantization (approximation of real numbers by rational numbers) → Entropy Encoding (Run-length, Huffman, Arithmetic) → Compressed Image]

MCU: Minimum Coded Unit
DCT: Discrete Cosine Transform

Result of image processing: 8×8 blocks of DC/AC coefficients
Till now, no compression is done; this task is enabled by quantization

Quantization

Observation:
[Figure: N×N matrix of coefficients F_uv (indices 0 ... N-1); the values get smaller towards higher frequencies, most values are zero]

How to enforce that even more values are zero?
Answer: by quantization. Divide F_uv by Quantum_uv = Q_uv and take the nearest integer as the result:
\[ F^Q_{uv} = \left[ \frac{F_{uv}}{Q_{uv}} \right] \]
Dequantization:
\[ F^*_{uv} = F^Q_{uv} \cdot Q_{uv} \]
(only an approximation of F_uv)

Example: N = 8; quantization step = 2, Q_uv = 2(u+v) + 3

           3   5   7   9  ...  17
           5   7   9  ...
  Q_uv =   7   9  ...
          ...
          17  ...              31

Baseline Process - Quantization

Quantization process:
  Divide the DCT-coefficient value F_uv by an integer number Q_uv and round the result to the nearest integer
Quantization of all DCT-coefficients results in a lossy transformation: some image details given by higher frequencies are cut off.
JPEG application provides a table with 64 entries, each used for quantization of one DCT-coefficient
  → each coefficient can be adjusted separately
A high compression factor is achievable at the expense of image quality
  → large quantization numbers: high data reduction but information loss increases
No default values for quantization tables are specified in JPEG
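
Quantization and dequantization with the example table Q_uv = 2(u+v) + 3 can be sketched as follows. The function names are illustrative; "nearest integer" is implemented here as floor(x + 0.5), which is one possible rounding convention and an assumption of this sketch.

```python
import math


def quantization_table(n: int = 8):
    """Example table from the slide: Q[u][v] = 2 * (u + v) + 3."""
    return [[2 * (u + v) + 3 for v in range(n)] for u in range(n)]


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to the nearest integer."""
    return [[math.floor(f / q + 0.5) for f, q in zip(f_row, q_row)]
            for f_row, q_row in zip(coeffs, table)]


def dequantize(quantized, table):
    """Multiply back; the result only approximates the original coefficients."""
    return [[fq * q for fq, q in zip(fq_row, q_row)]
            for fq_row, q_row in zip(quantized, table)]


table = quantization_table()
# First row of FDCT output values, taken from the example of this chapter.
first_row = [[186, -18, 15, -9, 23, -9, -14, -19]]
q = quantize(first_row, table[:1])
print(q[0])
print(dequantize(q, table[:1])[0])
```

The dequantized row differs from the input row in several positions (for instance -20 instead of -18), which is exactly the information loss introduced by this step.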


Example

1. Input values from exemplary grey-scale image:

  140 144 147 140 140 155 179 175
  144 152 140 147 140 148 167 179
  152 155 136 167 163 162 152 172
  168 145 156 160 152 155 136 160
  162 148 156 148 140 136 147 162
  147 167 140 155 155 140 136 162
  136 156 123 167 162 144 140 147
  148 155 136 155 152 147 147 136

First: subtract 128 from each element
Then: perform FDCT

2. FDCT output values F_uv (for space reasons only the integer part). First row, starting with the DC coefficient:

  186 -18 15 -9 23 -9 -14 -19

[Matrix: remaining seven rows of the FDCT output values]

3. Quantization matrix for quality level 2, Q_uv = 2(u+v) + 3:

   3  5  7  9 11 13 15 17
   5  7  9 11 13 15 17 19
   7  9 11 13 15 17 19 21
   9 11 13 15 17 19 21 23
  11 13 15 17 19 21 23 25
  13 15 17 19 21 23 25 27
  15 17 19 21 23 25 27 29
  17 19 21 23 25 27 29 31

Example

4. Quantized matrix F^Q_uv = [F_uv / Q_uv]. First row:

  62 -4 2 -1 2 -1 -1 -1

[Matrix: remaining seven rows of the quantized matrix]

5. F*_uv: reconstruction after dequantization. First row:

  186 -20 14 -9 22 -13 -15 -17

[Matrix: remaining seven rows of the dequantized matrix]

Effects of quantization (indication of quality loss):
[Annotations in the figure: "instead of -18" (at the value -20 in the first row) and "correct value was -11"]

6. Reconstructed image after performing the inverse DCT:

  146 157 161 155 149 162 181 185
  152 176 148 176 167 176 186 209
  161 183 161 193 188 187 180 203
  169 170 180 178 175 176 169 183
  172 174 175 169 163 161 163 186
  158 188 152 188 181 162 159 190
  143 180 140 186 186 160 155 171
  157 179 159 172 181 167 175 168

Error in reconstruction (reconstructed value minus input value):

   6 13 14 15  9  7  2 10
   8 24  8 29 27 28 19 30
   9 28 25 26 25 25 28 31
   1 25 24 18 23 21 33 23
  10 26 19 21 23 25 16 24
  11 21 12 33 26 22 23 28
   7 24 17 19 24 16 15 24
   9 24 23 17 29 20 28 32

Problem of Quantization

Cutting of higher frequencies leads to partly wrong color information
  → the higher the quantization coefficients, the more disturbance is in an 8×8 block
Result: edges of blocks can be seen

Compression Steps in JPEG

[Figure: Uncompressed Image → Image Preparation (Pixel; Block, MCU) → Image Processing (Predictor; DCT) → Quantization (approximation of real numbers by rational numbers) → Entropy Encoding (Run-length, Huffman, Arithmetic) → Compressed Image]

MCU: Minimum Coded Unit
DCT: Discrete Cosine Transform

Result of quantization: 8×8 blocks of DC/AC coefficients with lots of zeros
How to process and encode the data efficiently?

Baseline Process - Entropy Encoding

Entropy Encoding
  Initial step: map the 8×8 block of transformed values F^Q_uv to a 64 element vector which can be further processed by entropy encoding
  DC-coefficients determine the basic color of the data units in a frame; the variation between DC-coefficients of successive data units is typically small
  → The DC-coefficient is encoded as difference between the current coefficient and the previous one
AC-coefficients: processing order uses zig-zag sequence
[Figure: zig-zag path through the 8×8 block, starting at the DC-coefficient in the top-left corner and running through the AC-coefficients towards higher frequencies]
Coefficients with lower frequencies are encoded first, followed by higher frequencies.
Result: sequence of similar data bytes → efficient entropy encoding

Example

Zig-zag ordering
[Figure: zig-zag path through the quantized 8×8 matrix with DC-coefficient 62]

DC-coefficient: code coefficients for one block as difference to the previous one
AC-coefficients: consider each block separately, order data using zig-zag sequence to achieve long sequences of zero values:
  -3 4 -1 -5 2 -1 3 -3 -1 0 0 0 -1 2 -1 -1
  0 1 1 0 0 0 0 -1 -1 1 -1 1 1 0 0 0 -1
  0 0 0 0 0 -1 0 -1 0 0 0 1 0 0 0 0 0 0
  1 0 1 0 0 0 0 0 0 0 0 0

Entropy encoding:
  Run-length encoding of zero values of quantized AC-coefficients
  Huffman encoding on DC- and AC-coefficients
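
The zig-zag sequence used for ordering the coefficients can be generated by walking the anti-diagonals of the block in alternating directions. The following is a sketch with illustrative function names; positions are given as (row, column) pairs, and the first step after the DC-coefficient is assumed to go to the right.

```python
def zigzag_order(n: int = 8):
    """Return the (row, column) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):                     # s = row + column of a diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)                  # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def zigzag_scan(block):
    """Map an n x n block to a vector of n*n values in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


print(zigzag_order(8)[:10])
```

The first element of the resulting vector is the DC-coefficient; the remaining 63 elements are the AC-coefficients sorted roughly by increasing frequency, so that the zeros produced by quantization tend to cluster at the end.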


Run-length Encoding

Run-length encoding is a content-dependent coding technique
  Sequences of the same bytes are replaced by the number of their occurrences
  A special flag byte is used which doesn't occur in the byte stream itself
Coding procedure:
  If a byte occurs at least four consecutive times, the number of occurrences − 4 (offset = 4) is counted
  The compressed data contain this byte followed by the special flag and the number of occurrences − 4
As a consequence: representation of 4 to 259 bytes with three bytes is possible (with corresponding compression effect)
Example with ! as special flag:
  Uncompressed sequence:      ABCCCCCCCCDEFGGG
  Run-length coded sequence:  ABC!4DEFGGG
Offset of 4, since for smaller blocks there would be no reduction effect; e.g. with offset 3:
  DDD would be coded as D!0 (both strings have the same length)
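
The coding procedure with flag byte and offset 4 can be sketched on byte strings. In this sketch the count is stored as a single raw byte (so the example run of eight C's yields the byte value 4 rather than the character "4"); the names `rle_encode`, `rle_decode`, `FLAG` and `OFFSET` are illustrative.

```python
FLAG = ord("!")   # assumed flag byte; it must not occur in the input data
OFFSET = 4        # runs shorter than this are copied unchanged


def rle_encode(data: bytes) -> bytes:
    if FLAG in data:
        raise ValueError("flag byte occurs in the input")
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        # One count byte covers runs of 4 to 259 identical bytes.
        while (i + run < len(data) and data[i + run] == data[i]
               and run < OFFSET + 255):
            run += 1
        if run >= OFFSET:
            out += bytes([data[i], FLAG, run - OFFSET])
        else:
            out += data[i:i + run]
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        if i + 1 < len(data) and data[i + 1] == FLAG:
            out += bytes([data[i]]) * (data[i + 2] + OFFSET)
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


print(rle_encode(b"ABCCCCCCCCDEFGGG"))
```

Only the run of eight C's is replaced; the three G's at the end stay as they are because a run of three would not become shorter.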


Run-length Encoding

It is done similarly in JPEG:
  The zero value is the only one appearing in longer sequences, thus use a more efficient coding by only compressing zero sequences: code nonzero coefficients together with their run-length, i.e. the number of zeros preceding the nonzero value
  Run-length ∈ {0,...,15}, i.e. 4 bit for representing the length of zero sequences
  Coded sequence: run-length, size, amplitude with
    run-length: number of subsequent zero-coefficients
    size: number of bits used for representing the following coefficient
    amplitude: value of that following coefficient using size bits
  Adapting the size of the representation of a coefficient to its value achieves a further compression, because most coefficients for higher frequencies have very small values
  If (run-length, size) = (15, 0) then there are more than 15 zeros after each other.
  (0,0) = EoB symbol (End of Block) indicates the termination of the actual rectangle (EoB is very frequently used)

Example

  Size i   Amplitude
  1        -1                        1
  2        -3, -2                    2, 3
  3        -7,...,-4                 4,...,7
  4        -15,...,-8                8,...,15
  i        -2^i+1,...,-2^(i-1)       2^(i-1),...,2^i-1
  10       -1023,...,-512            512,...,1023

4 bits 1-complement representation (other representations are possible):
  0000 = 8       1000 = -15
  0001 = 9       1001 = -14
  0010 = 10      1010 = -13
  0011 = 11      1011 = -12
  0100 = 12      1100 = -11
  0101 = 13      1101 = -10
  0110 = 14      1110 = -9
  0111 = 15      1111 = -8

11 is for instance represented by: size = 4, amplitude = 0011
The sequence 0 ... 0 121 0 ... (with 35 zeros in front of 121) is encoded by
  15, 0, 15, 0, 5, 7, 57
  35 zeros in all, followed by a value represented using 7 bit
  With 7 bit, 121 is 64 + 57
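
The mapping from a coefficient to its (size, amplitude) pair can be sketched as follows. The sketch follows the representation of this example (positive values are stored as their offset from 2^(size-1), negative values as the bitwise complement of that offset), which is only one of the possible representations; the function name `size_and_amplitude` is illustrative.

```python
def size_and_amplitude(value: int) -> tuple[int, str]:
    """Return the size in bits and the amplitude bit string of a nonzero coefficient."""
    if value == 0:
        raise ValueError("zero coefficients are covered by the run-length")
    size = abs(value).bit_length()
    # Position of |value| inside its range 2^(size-1) .. 2^size - 1.
    offset = abs(value) - (1 << (size - 1))
    if value < 0:
        # Negative values use the bitwise complement of the positive code.
        offset = ~offset & ((1 << size) - 1)
    return size, format(offset, f"0{size}b")


for coefficient in (11, -15, 121):
    print(coefficient, size_and_amplitude(coefficient))
```

For 121 the size is 7 and the amplitude bits encode 57, matching the decomposition 121 = 64 + 57.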

Huffman Encoding

In a second step, the string may be reduced further by Huffman encoding principles.

The Huffman code is an optimal code using the minimum number of bits for a string of data with given probabilities per character
Statistical encoding method:
  For each character, a probability of occurrence is known by encoder and decoder
  Frequently occurring characters are coded with shorter strings than seldom occurring characters
  Successive characters are coded independently of each other
  Resulting code is prefix free → unique decoding is guaranteed
A binary tree is constructed to determine the Huffman codewords of the characters:
  Leaves represent the characters that are to be encoded
  Nodes contain the occurrence probability of the characters belonging to the subtree
  Edges of the tree are assigned with 0 and 1

Huffman Encoding

Algorithm for computing the Huffman code:
  1.) List all characters as well as their frequencies
  2.) Select the two list elements with the smallest frequency and remove them from the list
  3.) Make them the leaves of a tree, whereby the probabilities of both elements are added; place the tree into the list
  4.) Repeat steps 2 and 3 until the list contains only one element
  5.) Mark all edges:
      Parent → left child with 0
      Parent → right child with 1
The code words result from the path from the root to the leaves
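
The five steps of the algorithm can be sketched with a priority queue. This is one possible implementation, not the only one: the element removed first becomes the left child (edge 0), the second the right child (edge 1), and a running counter is used only to break ties. The function name `huffman_code` is illustrative.

```python
import heapq
from itertools import count


def huffman_code(probabilities: dict) -> dict:
    """Build a Huffman code for the given symbol probabilities."""
    tiebreak = count()
    # Step 1: list all characters with their frequencies.
    heap = [(p, next(tiebreak), symbol) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    # Steps 2-4: repeatedly merge the two least probable elements.
    while len(heap) > 1:
        p_left, _, left = heapq.heappop(heap)
        p_right, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p_left + p_right, next(tiebreak), (left, right)))
    _, _, root = heap[0]

    # Step 5: mark edges (left = 0, right = 1) and read off the code words.
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"

    walk(root, "")
    return codes


codes = huffman_code({"A": 0.27, "B": 0.36, "C": 0.16, "D": 0.14, "E": 0.07})
for symbol in sorted(codes):
    print(symbol, codes[symbol])
```

With other tie-breaking or left/right conventions the individual code words may differ, but the code word lengths, and therefore the average code length, stay the same.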

Huffman Encoding Example

Suppose that characters A, B, C, D and E occur with probabilities
  p(A) = 0.27, p(B) = 0.36, p(C) = 0.16, p(D) = 0.14, p(E) = 0.07

[Figure: Huffman tree]
  p(ADCEB) = 1.00
    0: p(CED) = 0.37
      0: p(C) = 0.16
      1: p(ED) = 0.21
        0: p(E) = 0.07
        1: p(D) = 0.14
    1: p(AB) = 0.63
      0: p(A) = 0.27
      1: p(B) = 0.36

Resulting Code:
  x    w(x)
  A    10
  B    11
  C    00
  D    011
  E    010

Huffman Encoding in JPEG

Coding of run-length (∈ {0,...,15}), size (∈ {0,...,10})
  (i,j): i preceding zeros (0 ≤ i ≤ 15) in front of a nonzero value coded with j bits
  The table has 10·16 + 2 = 162 entries with significantly different occurrence probabilities
    EoB is relatively frequent
    ZRL: at least 16 successive zeros, i.e. ZRL = (15,0)
    Some values such as (15,10) are extremely rare: 15 preceding zeros in front of a very large value is practically impossible! The same holds for most of the combinations in the table.
  Thus: Huffman coding of the table entries will lead to significant further compression!

[Table: rows = run-length 0, 1, 2, ..., 14, 15; columns = size 0, 1, ..., 10. In the column for size 0: (0,0) = EoB, (15,0) = ZRL, run-lengths 1 to 14 are impossible. The other cells hold the pairs (i,j), e.g. (1,3), up to (15,10)]

Huffman Encoding in JPEG

Different Huffman tables for (run-length, size) are used for different 8×8 blocks, based on their contents
  Thus the coding begins with a HTN (Huffman-table-number)
The coding of amplitudes may also change from block to block
  Amplitude codes are stored in the preceding (run-length, size) coding table
An 8×8 block thus is coded as follows:
  [VLC, DC coefficient, sequence of (run-length, size, amplitude) for the AC coefficients]
VLC = variable length code:
  contains actual HTN + actual VLI (Variable Length Integer), i.e. coding method for next amplitude


Alternative to Huffman: Arithmetic Coding

Characteristics:
Achieves the same optimality (coding rate) as Huffman coding
Difference to Huffman: the entire data stream has an assigned probability, which consists of the probabilities of the contained characters. Coding a character takes place with consideration of all previous characters.
The data are coded as an interval of real numbers between 0 and 1. Each value within the interval can be used as code word.
The minimum length of the code is determined by the assigned probability.
Disadvantage: the data stream can be decoded only as a whole.

Arithmetic Coding: Example

Code data ACAB with pA = 0.5, pB = 0.2, pC = 0.3

Successive subdivision of the interval [0, 1):

Prefix    Probability        Interval
A         pA = 0.5           [0, 0.5)
B         pB = 0.2           [0.5, 0.7)
C         pC = 0.3           [0.7, 1)
AA        pAA = 0.25         [0, 0.25)
AB        pAB = 0.1          [0.25, 0.35)
AC        pAC = 0.15         [0.35, 0.5)
ACA       pACA = 0.075       [0.35, 0.425)
ACB       pACB = 0.03        [0.425, 0.455)
ACC       pACC = 0.045       [0.455, 0.5)
ACAA      pACAA = 0.0375     [0.35, 0.3875)
ACAB      pACAB = 0.015      [0.3875, 0.4025)
ACAC      pACAC = 0.0225     [0.4025, 0.425)

(The intervals starting with B and C are subdivided in the same way: BA, BB, BC at 0.5, 0.6, 0.66, 0.7 and CA, CB, CC at 0.7, 0.85, 0.91, 1.)

ACAB can be coded by each binary number from the interval [0.3875, 0.4025). The required length is -log2(pACAB) = 6.06, rounded up to 7 bit, e.g. 0.0110010
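
The interval subdivision of the example can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the function name `arithmetic_interval` is hypothetical, the symbols are laid out in the order A, B, C as in the example, and exact fractions are used to avoid rounding effects.

```python
from fractions import Fraction
from math import ceil, log2

def arithmetic_interval(message, probabilities):
    """Return (low, high) of the half-open interval [low, high) coding message."""
    # Lower bound of each symbol inside [0, 1), in the order of the dict.
    cumulative, running = {}, Fraction(0)
    for symbol, p in probabilities.items():
        cumulative[symbol] = running
        running += p
    low, width = Fraction(0), Fraction(1)
    for symbol in message:
        # Narrow the current interval to the sub-interval of this symbol.
        low += width * cumulative[symbol]
        width *= probabilities[symbol]
    return low, low + width

probs = {"A": Fraction(1, 2), "B": Fraction(1, 5), "C": Fraction(3, 10)}
low, high = arithmetic_interval("ACAB", probs)
bits = ceil(-log2(float(high - low)))  # minimum code length in bits
print(float(low), float(high), bits)
```

The width of the final interval equals the probability of the whole message, which is why the code length is bounded by the negative logarithm of that probability.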


Variants of Image Compression

JPEG is not a single format; one of several modes can be chosen:

Lossy sequential DCT-based mode (baseline process)
    Presented before, but not the only method
Expanded lossy DCT-based mode
    Enhancement to the baseline process by adding progressive encoding
Lossless mode
    Low compression ratio, but perfect reconstruction of the original image
    No DCT, but differential encoding
Hierarchical mode
    Accommodates images of different resolutions
    Selects its algorithms from the three other modes

Variants: Expanded Lossy DCT-based Mode

With sequential encoding as in the baseline process the whole image is coded and decoded in a single run. An alternative to sequential encoding is progressive encoding, done in the entropy encoding step.

Two alternatives for progressive encoding are possible:
Spectral selection
    At first, coefficients of low frequencies are passed to entropy encoding; coefficients of higher frequencies are processed in successive runs
Successive approximation
    All coefficients are transferred in one run, but most-significant bits are encoded prior to less-significant bits

12 possible coding alternatives in the expanded mode (3 x 2 x 2):
    Using sequential encoding, spectral selection, or successive approximation (3 variants)
    Using Huffman or Arithmetic encoding (2 variants)
    Using 8 or 12 bits for representing the samples (2 variants)

Most popular mode: sequential display mode with 8 bits/sample and Huffman encoding


Expanded Lossy DCT-based Mode (Example)

Sequential encoding: image is coded and decoded in a single run
Progressive encoding: image is coded and decoded in refining steps
[Figure: Step 1, Step 2, Step 3 for sequential encoding and for progressive encoding]

Variants: Lossless Mode

Lossless mode uses differential encoding
(Differential encoding is also known as prediction or relative encoding)
    Applies to a sequence of characters whose values are different from zero, but which do not differ much
    Calculate only the difference with respect to the previous value (used also for DC-coefficients)

Differential encoding for still images:
    Avoid using DCT/quantization
    Instead: calculation of differences between nearby pixels or pixel groups
    Edges are represented by large values
    Areas with similar luminance and chrominance are represented by small values
    A homogeneous area is represented by a large number of zeros; further compression with run-length encoding is possible as for DCT

Variants: Lossless Mode

Uses data units of single pixels for image preparation
Any precision between 2 and 16 bits/pixel can be used
Image processing and quantization use a predictive technique instead of transformation encoding
8 predictors are specified for each pixel X by means of a combination of the already known adjacent samples A, B, and C

Position of the samples:
    C  B
    A  X

predictor    predicted value of X
0            no prediction
1            A
2            B
3            C
4            A+B-C
5            A+(B-C)/2
6            B+(A-C)/2
7            (A+B)/2

The actual predictor should give the best approximation of X by the already known values A, B, C
The number of the chosen predictor and the difference of the prediction to the actual value are passed to entropy encoding (Huffman or Arithmetic Encoding)

Uncompressed data -> Predictor -> Entropy encoder -> Compressed data

Example:
(4,0): X is exactly given by A+B-C
(7,1): X is (A+B)/2+1

Variants: Hierarchical Mode

The Hierarchical mode uses either the lossy DCT-based algorithms or the lossless compression technique
The idea: encoding of an image at different resolutions

Algorithm:
    Image is initially sampled at a low resolution
    Subsequently, the resolution is raised and the compressed image is subtracted from the previous result
    The process is repeated until the full resolution of the image is obtained in a compressed form
Disadvantage:
    Requires substantially more storage capacity
Advantage:
    Compressed image is immediately available at different resolutions; scaling becomes cheap
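
The predictor table of the lossless mode (predictors 0 to 7 built from the neighbours A, B, C) can be illustrated with a small Python sketch. The function names `predict`, `encode_pixel` and `decode_pixel` are hypothetical, integer floor division is assumed for the halving, and the predictor is chosen per pixel as the slide describes; this is an illustration, not the procedure of any particular codec.

```python
def predict(selector, a, b, c):
    """Predicted value of X from A (left), B (above) and C (upper left)."""
    table = {
        0: 0,                  # no prediction
        1: a,
        2: b,
        3: c,
        4: a + b - c,
        5: a + (b - c) // 2,   # assumption: integer floor division
        6: b + (a - c) // 2,
        7: (a + b) // 2,
    }
    return table[selector]

def encode_pixel(x, a, b, c):
    """Return (selector, difference) for the predictor with the smallest error."""
    selector = min(range(1, 8), key=lambda s: abs(x - predict(s, a, b, c)))
    return selector, x - predict(selector, a, b, c)

def decode_pixel(selector, difference, a, b, c):
    """Invert encode_pixel: prediction plus transmitted difference."""
    return predict(selector, a, b, c) + difference

a, b, c = 10, 12, 9
print(encode_pixel(13, a, b, c))        # X equals A+B-C here
print(decode_pixel(7, 1, a, b, c))      # X = (A+B)/2 + 1
```

Only the pair (selector, difference) has to be passed to the entropy encoder, and small differences dominate in smooth image areas.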


JPEG 2000

Improvement of the original JPEG standard:
16 bit color depth per channel (48 bit per pixel, i.e. up to about 281 trillion colors)
Progressive mode is not only an option, but mandatory
Definition of Regions of Interest: choose an image region which will be less compressed than the rest of the image. Thus, images can be individually coded, improving the subjective image quality
Integration of watermarks: invisibly embed additional information which can be recognized by certain programs; watermarks cannot be removed from the image
Resync Marker: improve fault tolerance and error correction by setting markers; if a transmission error corrupts the data, not all following data are lost as in normal JPEG
Most important: better compression and a faster coding and decoding process by using wavelets instead of DCT. Wavelets are transforming functions which describe how fast the image information is changing. No pixel blocks are used in compression; infinite functions are applied to finite image regions. Thus the wavelets can describe edges and hard changes better than DCT.

Wavelets

Wavelets are mathematical tools which allow to decompose functions in a hierarchical way, i.e. to describe a function by means of:
[overall shape, first detail, second detail, ...], e.g. by
(most important features, refinements, less important topics, ...)
Thus, the most important aspects are available with a relatively small number of bits
Application of Wavelets:
    Image compression
    Image editing
    Animation
    Signal processing
    ...

Haar Transforms

The simplest example of wavelet technology is the Haar transform. There are
    one-dimensional Haar wavelet transforms (which allow the compression of the representation of piecewise constant functions)
    two-dimensional Haar wavelet transforms (for image compression, ...)
    higher-dimensional Haar wavelet transforms

Example (one-dimensional):
Simple example to show how we can reduce the amount of bits needed for representation by transformation and compression
Let a string of pixels be given as follows:
9 7 3 5 6 8 6 4
Then we do (recursively):
    calculate the average of successive pairs
    calculate the distance from the average (detail)

1D Haar Transforms

In the first step of this procedure we get:
[average1; average2; ...; detail1; detail2; ...]
Thereafter we get:
[average1,2; detail1,2; ...]
where average1,2 is the mean value of average1 and average2
and detail1,2 is the distance of average1 from average1,2

This procedure may be continued. The detail values become less and less important
From the transformed strings we can fully reconstruct the original
Very often the details are very small and may be suppressed. In such a case we cannot exactly reconstruct the original but the errors are comparatively small.


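Example

Transformation of the string 9 7 3 5 6 8 6 4:

Resolution    Averages                   Detail coefficients
8             [9, 7, 3, 5, 6, 8, 6, 4]
4             [8, 4, 7, 5]               [1, -1, -1, 1]   (finest resolution)
2             [6, 6]                     [2, 1]
1             [6]   (global average)     [0]              (coarsest resolution)

Thus the sequence
[9, 7, 3, 5, 6, 8, 6, 4]
has been transformed to
[6; 0; 2, 1; 1, -1, -1, 1]

Example

Reconstruction from 6; 0; 2, 1; 1, -1, -1, 1:

Global average 6, detail 0:            6 = 6 + 0,  6 = 6 - 0
Averages 6, 6, details 2, 1:           8 = 6 + 2,  4 = 6 - 2,  7 = 6 + 1,  5 = 6 - 1
Averages 8, 4, 7, 5, details 1, -1, -1, 1:
    9 = 8 + 1,     7 = 8 - 1
    3 = 4 + (-1),  5 = 4 - (-1)
    6 = 7 + (-1),  8 = 7 - (-1)
    6 = 5 + 1,     4 = 5 - 1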
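
The averaging and differencing procedure of the example can be written down directly. The following Python sketch is an illustration only (the function names `haar_decompose` and `haar_reconstruct` are assumptions); it uses the unnormalized transform, i.e. plain averages and distances from the average, and expects a length that is a power of two.

```python
def haar_decompose(values):
    """Unnormalized 1D Haar transform of a sequence of length 2^n."""
    coeffs = list(values)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        # Average of each successive pair and its distance from the average.
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / 2 for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / 2 for i in range(half)]
        coeffs[:n] = averages + details
        n = half  # recurse on the averages only
    return coeffs

def haar_reconstruct(coeffs):
    """Invert haar_decompose."""
    values = list(coeffs)
    n = 1
    while n < len(values):
        merged = []
        for average, detail in zip(values[:n], values[n:2 * n]):
            merged += [average + detail, average - detail]
        values[:2 * n] = merged
        n *= 2
    return values

transformed = haar_decompose([9, 7, 3, 5, 6, 8, 6, 4])
print(transformed)
print(haar_reconstruct(transformed))
# Suppressing the finest details gives only an approximation of the original.
print(haar_reconstruct(transformed[:4] + [0, 0, 0, 0]))
```

Setting the four finest detail coefficients to zero before reconstruction shows the lossy case: each pair of pixels is replaced by its average.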

If we suppress the finest detail coefficients (i.e., 1, -1, -1, 1), then we reconstruct:
8 8 4 4 7 7 5 5
instead of:
9 7 3 5 6 8 6 4

Generalization

Generalization to piecewise constant functions (instead of strings):
Let $V^j$ be the vector space of all functions which are piecewise constant on the $2^j$ equal subintervals of $[0,1)$
Every one-dimensional image with $2^j$ pixels can be considered as an element of $V^j$ (e.g. a two-pixel image has two constant parts over the intervals [0,0.5) and [0.5,1))
Obviously: $V^3$ is a refinement of $V^2$ etc., i.e. $V^0 \subset V^1 \subset V^2 \subset V^3 \subset \dots$
The piecewise constant sections might be an approximation of the function f(x) according to a given norm
[Figure: an element of $V^3$, a piecewise constant function f(x) with values between 0 and 10 on the interval from 0 to 1]

Generalization

A basis for the vector space $V^j$ is a set of functions by which all functions of $V^j$ may be represented as linear combinations of the basis functions.
A simple basis of $V^j$ (there are many other alternatives for a basis) is:

$$\phi_i^j(x) = \begin{cases} 1 & \text{if } \frac{i}{2^j} \le x < \frac{i+1}{2^j} \\ 0 & \text{otherwise} \end{cases} \qquad \text{for } i = 0, \dots, 2^j - 1$$

[Figure: the basis functions $\phi_0^3$ and $\phi_4^3$ on the interval from 0 to 1]

Definition:
1. The inner product of two functions $f, g \in V^j$ is defined as $\langle f | g \rangle := \int_0^1 f(x)\, g(x)\, dx$
2. The $L^2$ norm is defined by $\| u(x) \|_2 := \langle u | u \rangle^{1/2} = \left( \int_0^1 u(x)^2\, dx \right)^{1/2}$

Very often we try to approximate a given function $f(x)$ by a function $\tilde f(x)$ according to the $L^2$ norm.
I.e. find $\tilde f(x)$ such that $\| f(x) - \tilde f(x) \|_2$ is minimal over all admissible $\tilde f(x)$


Generalization

In the following we will use another basis for $V^j$ than the very simple basis which consists only of the elementary components shown before
This will lead us to the concept of a Haar Wavelet basis

Construction procedure of a suitable basis for $V^{j+1}$ from a basis of $V^j$:
Suppose that we already have a basis for $V^j$
Then define a new vector space $W^j$ as orthogonal complement of $V^j$ in $V^{j+1}$, i.e.
$W^j$ := vector space of functions in $V^{j+1}$ which are orthogonal to all functions in $V^j$
("orthogonal" means that $\langle u | v \rangle = 0$ if $u \in V^j$, $v \in W^j$)
$W^j$ is the detail of $V^{j+1}$ which cannot be expressed by $V^j$
The basis functions of $V^j$ together with the basis functions of $W^j$ form a basis of $V^{j+1}$

Generalization

Definition:
The elements of a basis of $W^j$ (i.e. linearly independent functions $\psi_i^j(x)$ which span $W^j$) are called Wavelets.

Immediate consequence:
1. $\psi_i^j(x)$ together with the basis of $V^j$ are a basis of $V^{j+1}$
2. $\psi_i^j(x)$ is orthogonal to $\phi_k^j(x)$, i.e. $\int_0^1 \psi_i^j(x)\, \phi_k^j(x)\, dx = 0$

The detail coefficients are coefficients of the wavelet basis functions.


Generalization

Wavelets for $W^j$ (i.e. elements of $V^{j+1}$ which are orthogonal to basis elements of $V^j$) are denoted as Haar wavelets
Haar wavelets of $W^j$ (there would be other possibilities for basis functions):

$$\psi_i^j(x) := \begin{cases} +1 & \text{if } \frac{i}{2^j} \le x < \frac{i + \frac{1}{2}}{2^j} \\ -1 & \text{if } \frac{i + \frac{1}{2}}{2^j} \le x < \frac{i+1}{2^j} \\ 0 & \text{otherwise} \end{cases} \qquad \text{for } i = 0, \dots, 2^j - 1$$

Example

[Figure: Haar wavelets for $W^2$: $\psi_0^2, \psi_1^2, \psi_2^2, \psi_3^2$, each taking the values +1 and -1 on its support]
[Figure: Example: $V^2$ basis functions $\phi_0^2, \phi_1^2, \phi_2^2, \phi_3^2$]
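
The orthogonality properties of the box functions and the Haar wavelets can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names `phi`, `psi` and `inner` are hypothetical, and the inner product on [0, 1) is evaluated with a midpoint rule, which is exact for these piecewise constant functions as long as the number of samples is a sufficiently large power of two.

```python
def phi(j, i):
    """Box function phi_i^j: 1 on [i/2^j, (i+1)/2^j), 0 elsewhere."""
    return lambda x: 1.0 if i / 2**j <= x < (i + 1) / 2**j else 0.0

def psi(j, i):
    """Haar wavelet psi_i^j: +1 on the left half of its support, -1 on the right."""
    def wavelet(x):
        if i / 2**j <= x < (i + 0.5) / 2**j:
            return 1.0
        if (i + 0.5) / 2**j <= x < (i + 1) / 2**j:
            return -1.0
        return 0.0
    return wavelet

def inner(f, g, samples=1024):
    """Inner product <f|g> on [0, 1) by the midpoint rule."""
    step = 1.0 / samples
    total = sum(f((k + 0.5) * step) * g((k + 0.5) * step) for k in range(samples))
    return total * step

print(inner(psi(2, 0), phi(2, 0)))   # wavelet against scaling function
print(inner(psi(1, 0), psi(2, 0)))   # wavelets of different levels
print(inner(psi(2, 0), psi(2, 0)))   # squared norm, equal to 2^(-j)
```

The squared norm of a level-j wavelet is the length of its support, which is what the later normalization factor compensates.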


Example

We can (using these functions) redo the string example as follows:

S(x) = string (written as a function of x)

$$\begin{aligned}
S(x) &= 9\phi_0^3(x) + 7\phi_1^3(x) + 3\phi_2^3(x) + 5\phi_3^3(x) + 6\phi_4^3(x) + 8\phi_5^3(x) + 6\phi_6^3(x) + 4\phi_7^3(x) \\
&= 8\phi_0^2(x) + 4\phi_1^2(x) + 7\phi_2^2(x) + 5\phi_3^2(x) + 1\psi_0^2(x) - 1\psi_1^2(x) - 1\psi_2^2(x) + 1\psi_3^2(x) \\
&= 6\phi_0^1(x) + 6\phi_1^1(x) + 2\psi_0^1(x) + 1\psi_1^1(x) + 1\psi_0^2(x) - 1\psi_1^2(x) - 1\psi_2^2(x) + 1\psi_3^2(x) \\
&= 6\phi_0^0(x) + 0\psi_0^0(x) + 2\psi_0^1(x) + 1\psi_1^1(x) + 1\psi_0^2(x) - 1\psi_1^2(x) - 1\psi_2^2(x) + 1\psi_3^2(x)
\end{aligned}$$

Example

Graphical representation:
[Figure: S(x) as the sum 6 x $\phi_0^0$ + 0 x $\psi_0^0$ + 2 x $\psi_0^1$ + 1 x $\psi_1^1$ + 1 x $\psi_0^2$ - 1 x $\psi_1^2$ - 1 x $\psi_2^2$ + 1 x $\psi_3^2$]


Example

Starting with the basis function $\phi_0^0$ for $V^0$ we can refine it by the $W^0$ detail coefficient, i.e. by $\psi_0^0$, to the basis $\phi_0^0, \psi_0^0$ for $V^1$
This in turn can be refined by two detail coefficients $\psi_0^1, \psi_1^1$ of $W^1$ to the basis of $V^2$: $\phi_0^0, \psi_0^0, \psi_0^1, \psi_1^1$
With four more detail coefficients of $W^2$ we get a basis for $V^3$ consisting of the eight basis functions $\phi_0^0; \psi_0^0; \psi_0^1, \psi_1^1; \psi_0^2, \psi_1^2, \psi_2^2, \psi_3^2$
With eight more detail coefficients of $W^3$ we get a basis for $V^4$ etc.
This procedure leads to a better approximation of a given function (or of a 1D image)

Approximation Of Continuous Functions

Approximation of a continuous function f(x) (dotted line) by averages and detail coefficients
We start with $V^0$. Then:
    $V^0$-appr. + $W^0$-detail = $V^1$-appr.
    $V^1$-appr. + $W^1$-detail = $V^2$-appr.
    $V^2$-appr. + $W^2$-detail = $V^3$-appr.
    $V^3$-appr. + $W^3$-detail = $V^4$-appr.
We can also work backwards, i.e. start with the $V^4$ approximation and go to the $V^3$ approximation by suppressing the $W^3$ detail coefficients etc.
[Figure: $V^4$ approximation; $V^3$ approximation and $W^3$ detail coefficients; $V^2$ approximation and $W^2$ detail coefficients; $V^1$ approximation and $W^1$ detail coefficients; $V^0$ approximation (average of f(x)) and $W^0$ detail coefficients]


Approximation Of Continuous Functions

All Haar basis functions $\phi_0^0, \psi_0^0, \psi_0^1, \psi_1^1, \dots$ are orthogonal to each other (of course not to themselves); i.e. $\langle f | g \rangle = 0$ if $f \ne g$ are basis functions
Orthogonality to all other basis functions is not necessarily valid for other systems of basis functions
In addition to orthogonality we can provide orthonormality, i.e. $\langle f | f \rangle = 1$
The basis functions

$$\phi_i^{j*}(x) := 2^{j/2}\, \phi_i^j(x), \qquad \psi_i^{j*}(x) := 2^{j/2}\, \psi_i^j(x)$$

are orthonormal
If the basis functions are orthonormalized, then the detail coefficients and the average coefficient with superscript j have to be multiplied by $2^{-j/2}$
In our example we then get:

$$\begin{aligned}
S(x) &= 6\phi_0^0 + 0\psi_0^0 + 2\psi_0^1 + 1\psi_1^1 + 1\psi_0^2 - 1\psi_1^2 - 1\psi_2^2 + 1\psi_3^2 \\
&= 6\phi_0^{0*} + 0\psi_0^{0*} + \tfrac{2}{\sqrt{2}}\psi_0^{1*} + \tfrac{1}{\sqrt{2}}\psi_1^{1*} + \tfrac{1}{2}\psi_0^{2*} - \tfrac{1}{2}\psi_1^{2*} - \tfrac{1}{2}\psi_2^{2*} + \tfrac{1}{2}\psi_3^{2*}
\end{aligned}$$

Decomposition

Decomposition of a sequence of h numbers together with a normalization: each original coefficient with superscript j is multiplied by $2^{-j/2}$

proc DecompositionStep(C: array[1..h] of reals)
    // C' is a temporary array of the same size as C
    for i:=1 to h/2 do
        C'[i]     := (C[2i-1]+C[2i])/sqrt(2)    // scaled average
        C'[h/2+i] := (C[2i-1]-C[2i])/sqrt(2)    // scaled detail
    end for
    C := C';
end proc

proc Decomposition(C: array[1..h] of reals)
    C := C/sqrt(h);    // normalize input coefficients
    while h>1 do
        DecompositionStep(C[1..h]);
        h := h/2;
    end while;
end proc
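
As a cross-check of the normalized decomposition, here is a minimal Python sketch of the two procedures. It assumes that each step divides the sum and the difference of a pair by sqrt(2); this is the variant that reproduces the normalized coefficients (6; 0; 2/sqrt(2), 1/sqrt(2); 1/2, -1/2, -1/2, 1/2) of the string example. The function names are hypothetical.

```python
from math import sqrt

def decomposition_step(c, h):
    """One Haar step on the first h entries of c (in place)."""
    source = c[:h]  # copy, so pairs are read from the old values
    half = h // 2
    for i in range(half):
        c[i] = (source[2 * i] + source[2 * i + 1]) / sqrt(2)         # scaled average
        c[half + i] = (source[2 * i] - source[2 * i + 1]) / sqrt(2)  # scaled detail

def decomposition(values):
    """Normalized 1D Haar decomposition of a sequence of length 2^n."""
    h = len(values)
    c = [v / sqrt(h) for v in values]  # normalize input coefficients
    while h > 1:
        decomposition_step(c, h)
        h //= 2
    return c

coefficients = decomposition([9, 7, 3, 5, 6, 8, 6, 4])
print([round(v, 4) for v in coefficients])
```

The first entry of the result is the global average of the input; every detail coefficient of level j equals the unnormalized detail multiplied by 2^(-j/2).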


Usage for Data Compression

Haar wavelets for data compression:
Suppose that a function f(x) is given by a linear combination of m basis functions, e.g. by a linear combination of m Haar wavelet functions:

$$f(x) = \sum_{i=1}^{m} c_i\, u_i(x)$$

for example: $u_1(x) = \phi_0^0(x), \dots, u_8(x) = \psi_3^2(x)$

We want to reduce the number of coefficients, i.e. we seek a function $\tilde f(x)$ which:
1. Is similar to f(x), i.e. $\| f(x) - \tilde f(x) \| \le \varepsilon$ for some norm
2. May be represented by fewer coefficients (possibly with other basis functions $\tilde u_i(x)$ instead of $u_i(x)$):

$$\tilde f(x) = \sum_{i=1}^{\tilde m} \tilde c_i\, \tilde u_i(x) \qquad \text{with } \tilde m < m$$

Finding the best $\tilde f(x)$ is a difficult problem if all possible $\tilde u_i(x)$ are taken into account.
Thus we restrict ourselves to the Haar Wavelet basis

Data Compression

If we do not change the basis functions but reduce the number of coefficients then it is easy to show that:
If the basis functions are orthonormal, then for $\tilde m < m$ coefficients and for the $L^2$ norm, i.e. $\| f - \tilde f \|_2^2 = \langle f(x) - \tilde f(x) \,|\, f(x) - \tilde f(x) \rangle$, the error is minimized if we keep those $\tilde m$ coefficients of $c_1, \dots, c_m$ whose absolute values are the largest ones

Proof: Let $\pi(1), \dots, \pi(m)$ be a permutation of $1, \dots, m$ and let

$$\tilde f(x) = \sum_{i=1}^{\tilde m} c_{\pi(i)}\, u_{\pi(i)}$$

Then

$$\begin{aligned}
\| f(x) - \tilde f(x) \|_2^2 &= \Big\langle \sum_{i=\tilde m+1}^{m} c_{\pi(i)} u_{\pi(i)} \,\Big|\, \sum_{j=\tilde m+1}^{m} c_{\pi(j)} u_{\pi(j)} \Big\rangle \\
&= \sum_{i=\tilde m+1}^{m} \sum_{j=\tilde m+1}^{m} c_{\pi(i)}\, c_{\pi(j)}\, \langle u_{\pi(i)} | u_{\pi(j)} \rangle \\
&= \sum_{i=\tilde m+1}^{m} \big( c_{\pi(i)} \big)^2 \qquad \text{since the } u_i \text{ are orthonormal}
\end{aligned}$$

The error is minimized if for $c_{\pi(\tilde m+1)}, \dots, c_{\pi(m)}$ the smallest coefficients in absolute value are used


Data Compression

Approximation of f(x) by $V^3$, $W^3$, i.e. by 16 coefficients:
    8 averages $\phi_0^3(x), \dots, \phi_7^3(x)$
    8 details $\psi_0^3(x), \dots, \psi_7^3(x)$
And (further on) by:
    $\phi_0^0; \psi_0^0; \psi_0^1, \psi_1^1; \psi_0^2, \dots, \psi_3^2; \psi_0^3, \dots, \psi_7^3$
    orthogonal functions!! (for simplicity the * has been suppressed)
The sequence of pictures shows the effect (i.e. loss of exactitude) if more and more coefficients are eliminated
[Figure: sequence of approximations of f(x) with a decreasing number of coefficients]

2D Haar Wavelets

We have shown how to use one-dimensional Haar wavelets for representation and compression of one-dimensional functions
Now we generalize to 2D-images
First we show how to apply the (averaging+detail) technique to 2D-images

First approach: Standard decomposition
1. Apply the 1D-approach to each row.
   Result: Average + details for each row
2. Apply the 1D-approach to each column of the results obtained by step 1
I.e.: first the rows, then the columns
[Figure: a 2D-image (each entry a pixel) is transformed row by row, then column by column; the result holds the global average in one corner, the remaining entries are detail coefficients of different resolution (see previous examples)]

2D Haar Wavelets

Second Approach: Nonstandard decomposition
Alternate between rows and columns
1. Out of the 2n values per row: calculate n averages + n details
2. Take the result and do the same for the columns: i.e. calculate m averages + m details for the 2m values per column
[Figure: result of step 1 (averages in the left half, detail coefficients in the right half) and of step 2 (shown for the left half only)]
3. Repeat steps 1 and 2 for the averages part, i.e. for the left upper corner
4. Do that recursively until only one global average is left over
Remark: This technique works only for square images, i.e. $2^n \times 2^n$ pixels

2D Haar Wavelet Basis

There are two methods for the construction of a two-dimensional basis:
1. The standard construction
2. The nonstandard construction
Regarding $V^i$ which has $2^i$ basis functions in 1D, namely
$\phi_0^0; \psi_0^0; \psi_0^1, \psi_1^1; \psi_0^2, \dots, \psi_3^2; \dots; \psi_0^{i-1}, \dots, \psi_{2^{i-1}-1}^{i-1}$
we get $2^i \times 2^i$ basis functions for $V^i$ in 2D by combining the 1D basis functions with each other.
Example for $V^2$ where the 2D-basis will consist of 4x4 = 16 basis functions (since there are 4 basis functions $\phi_0^0, \psi_0^0, \psi_0^1, \psi_1^1$ in $V^2$)


The Standard Construction

Let $u_1, \dots, u_{2^i}$ be the 1D basis functions
Then $v_j(x,y) = u_{j_1}(x)\, u_{j_2}(y)$ for $j_1, j_2 \in \{1, \dots, 2^i\}$ is a basis function for 2D images
There are $2^i \times 2^i = 2^{2i}$ basis functions for $V^i$ in 2D.

Example:
[Figure: the product $\psi_0^1(x)\, \psi_1^1(y)$ on the unit square; + means +1, - means -1, 0 elsewhere]
Doing that for $\phi_0^0(x), \psi_0^0(x), \psi_0^1(x), \psi_1^1(x)$ combined with $\phi_0^0(y), \psi_0^0(y), \psi_0^1(y), \psi_1^1(y)$ we get all the 16 basis functions for $V^2$ in 2D

The Nonstandard Construction

The nonstandard construction of a two-dimensional basis defines
a two-dimensional scaling function

$$\phi\phi(x,y) := \phi(x)\,\phi(y)$$

and three wavelet functions

$$\phi\psi(x,y) := \phi(x)\,\psi(y), \qquad \psi\phi(x,y) := \psi(x)\,\phi(y), \qquad \psi\psi(x,y) := \psi(x)\,\psi(y)$$

Furthermore we define scaling by a superscript j and horizontal and vertical translations by a pair of subscripts k and l
The nonstandard basis consists of $\phi\phi_{0,0}^0(x,y) := \phi\phi(x,y)$ as well as of scaled and translated versions of the wavelet functions $\phi\psi$, $\psi\phi$, and $\psi\psi$.
The result is

$$\begin{aligned}
\phi\psi_{kl}^j(x,y) &:= 2^j\, \phi\psi(2^j x - k,\, 2^j y - l) \\
\psi\phi_{kl}^j(x,y) &:= 2^j\, \psi\phi(2^j x - k,\, 2^j y - l) \\
\psi\psi_{kl}^j(x,y) &:= 2^j\, \psi\psi(2^j x - k,\, 2^j y - l)
\end{aligned}$$

The Standard Construction

The standard construction of a 2D Haar wavelet basis for $V^2$
We calculate the products of the one-dimensional basis $\phi_0^0, \psi_0^0, \psi_0^1, \psi_1^1$ for both dimensions

[Figure: the 16 basis functions, arranged as]
$\phi_0^0(x)\psi_1^1(y)$    $\psi_0^0(x)\psi_1^1(y)$    $\psi_0^1(x)\psi_1^1(y)$    $\psi_1^1(x)\psi_1^1(y)$
$\phi_0^0(x)\psi_0^1(y)$    $\psi_0^0(x)\psi_0^1(y)$    $\psi_0^1(x)\psi_0^1(y)$    $\psi_1^1(x)\psi_0^1(y)$
$\phi_0^0(x)\psi_0^0(y)$    $\psi_0^0(x)\psi_0^0(y)$    $\psi_0^1(x)\psi_0^0(y)$    $\psi_1^1(x)\psi_0^0(y)$
$\phi_0^0(x)\phi_0^0(y)$    $\psi_0^0(x)\phi_0^0(y)$    $\psi_0^1(x)\phi_0^0(y)$    $\psi_1^1(x)\phi_0^0(y)$
($\phi_0^0(x)\phi_0^0(y)$ is the function for the global average)

A detail: The portion of the square [0,1]x[0,1] where the basis function is different from zero is not a square for all 16 functions
If we apply the standard construction to an orthonormal basis in one dimension, we get an orthonormal basis in two dimensions

Nonstandard Construction Of 2D Haar Wavelets

[Figure: the 16 nonstandard basis functions for $V^2$: $\phi\phi_{0,0}^0(x,y)$ (function for the global average); $\phi\psi_{0,0}^0$, $\psi\phi_{0,0}^0$, $\psi\psi_{0,0}^0$; and $\phi\psi_{k,l}^1$, $\psi\phi_{k,l}^1$, $\psi\psi_{k,l}^1$ for $k, l \in \{0, 1\}$]


Comparison

The standard decomposition of an m x m image requires one-dimensional transforms on all rows and then on all columns. $4(m^2 - m)$ assignment operations are needed
The nonstandard decomposition requires only $\frac{8}{3}(m^2 - 1)$ assignment operations
In the nonstandard case, all the nonstandard basis functions have square support (support = area where the function is nonzero). This is not the case for standard basis functions.

Decomposition

As a reminder: decomposition of a sequence of h numbers together with a normalization.
These procedures are needed in the 2-dimensional decomposition.

proc DecompositionStep(C: array[1..h] of reals)
    // C' is a temporary array of the same size as C
    for i:=1 to h/2 do
        C'[i]     := (C[2i-1]+C[2i])/sqrt(2)    // scaled average
        C'[h/2+i] := (C[2i-1]-C[2i])/sqrt(2)    // scaled detail
    end for
    C := C';
end proc

proc Decomposition(C: array[1..h] of reals)
    C := C/sqrt(h);    // normalize input coefficients
    while h>1 do
        DecompositionStep(C[1..h]);
        h := h/2;
    end while;
end proc


Decomposition

The standard decomposition of a 2D picture of h x w pixels

proc StandardDecomposition(C: array[1..h,1..w] of reals)
    for row:=1 to h do
        Decomposition(C[row,1..w]);
    end for
    for col:=1 to w do
        Decomposition(C[1..h,col]);
    end for
end proc

The nonstandard decomposition of a 2D picture of h x h pixels

proc NonstandardDecomposition(C: array[1..h,1..h] of reals)
    C := C/h;    // normalize input coefficients
    while h>1 do
        for row:=1 to h do
            DecompositionStep(C[row,1..h]);
        end for
        for col:=1 to h do
            DecompositionStep(C[1..h,col]);
        end for
        h := h/2;    // continue with the averages in the upper left corner
    end while
end proc

Decomposition

Instead of the unnormalized Haar Wavelet functions

$$\phi_i^j(x) := \phi(2^j x - i), \qquad \psi_i^j(x) := \psi(2^j x - i)$$

where

$$\phi(x) := \begin{cases} 1 & \text{for } 0 \le x < 1 \\ 0 & \text{otherwise} \end{cases} \qquad
\psi(x) := \begin{cases} 1 & \text{for } 0 \le x < \frac{1}{2} \\ -1 & \text{for } \frac{1}{2} \le x < 1 \\ 0 & \text{otherwise} \end{cases}$$

we can use the normalized Haar wavelet functions

$$\phi_i^{j*}(x) := 2^{j/2}\, \phi(2^j x - i), \qquad \psi_i^{j*}(x) := 2^{j/2}\, \psi(2^j x - i) \qquad (*)$$

As a compensation we have to multiply each unnormalized coefficient with superscript j by $2^{-j/2}$

Example:
Unnormalized version:
    coefficients       (6; 0; 2, 1; 1, -1, -1, 1)
    basis functions    $(\phi_0^0; \psi_0^0; \psi_0^1, \psi_1^1; \psi_0^2, \psi_1^2, \psi_2^2, \psi_3^2)$
Normalized version:
    coefficients       $(6; 0; \frac{2}{\sqrt{2}}, \frac{1}{\sqrt{2}}; \frac{1}{2}, -\frac{1}{2}, -\frac{1}{2}, \frac{1}{2})$
    basis functions    $(\phi_0^{0*}; \psi_0^{0*}; \psi_0^{1*}, \psi_1^{1*}; \psi_0^{2*}, \psi_1^{2*}, \psi_2^{2*}, \psi_3^{2*})$, normalized according to (*)
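
The standard and the nonstandard 2D decomposition can be compared with the following Python sketch. It is an illustration only: images are plain lists of lists, the function names are hypothetical, each step divides by sqrt(2) (an assumption consistent with the normalized example coefficients), and the nonstandard variant halves h after every row and column pass.

```python
from math import sqrt

def decomposition_step(c, h):
    """One normalized Haar step on the first h entries of the list c (in place)."""
    source = c[:h]
    half = h // 2
    for i in range(half):
        c[i] = (source[2 * i] + source[2 * i + 1]) / sqrt(2)
        c[half + i] = (source[2 * i] - source[2 * i + 1]) / sqrt(2)

def decomposition(c):
    """Full normalized 1D decomposition of the list c (in place)."""
    h = len(c)
    c[:] = [v / sqrt(h) for v in c]
    while h > 1:
        decomposition_step(c, h)
        h //= 2

def standard_decomposition(image):
    """Transform every row completely, then every column completely."""
    c = [list(row) for row in image]
    for row in c:
        decomposition(row)
    for col in range(len(c[0])):
        column = [row[col] for row in c]
        decomposition(column)
        for r, value in enumerate(column):
            c[r][col] = value
    return c

def nonstandard_decomposition(image):
    """Alternate one row step and one column step; image must be 2^n x 2^n."""
    h = len(image)
    c = [[v / h for v in row] for row in image]
    while h > 1:
        for row in c[:h]:
            decomposition_step(row, h)
        for col in range(h):
            column = [c[r][col] for r in range(h)]
            decomposition_step(column, h)
            for r in range(h):
                c[r][col] = column[r]
        h //= 2  # continue with the averages in the upper left corner
    return c

image = [[9, 7, 3, 5], [6, 8, 6, 4], [1, 2, 3, 4], [4, 3, 2, 1]]
print(standard_decomposition(image)[0][0], nonstandard_decomposition(image)[0][0])
```

In both variants the entry in the upper left corner is the global average of the image; the remaining coefficients differ because the two constructions use different 2D basis functions.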


Decomposition

Comment on input normalization

C'[i] := (C[2i-1]+C[2i])/sqrt(2), i.e. the average $\frac{C[2i-1]+C[2i]}{2}$ is multiplied by $\sqrt{2}$
At the beginning we normalize by $C := C / \sqrt{h}$

Example: Let $h = 2^n$, e.g. $h = 2^4$; basis functions $(\phi_0^0; \psi_0^0; \dots; \psi_7^3)$
Then the highest superscript is n - 1, e.g. 3
We normalize first:

$$C := \frac{C}{\sqrt{h}} = \frac{C}{\sqrt{2^n}} = \frac{C}{2^{n/2}}$$

Then in the first run we multiply the first detail by $\sqrt{2}$, then the coefficient is normalized by

$$\frac{C \sqrt{2}}{2^{n/2}} = \frac{C}{2^{(n-1)/2}}$$

The same holds for the averages. In successive runs the other detail levels and the global average are correctly evaluated by means of successive multiplications with $\sqrt{2}$

2D Image Compression

2D-image compression with 2D-Haar basis functions
(Generalization of the corresponding technique for 1D-images)

Step 1: Compute coefficients $c_1, \dots, c_m$ which represent an image in a normalized Haar basis. (Image $= c_1 u_1(x,y) + c_2 u_2(x,y) + \dots + c_m u_m(x,y)$)
Step 2: Sort $c_1, \dots, c_m$ in order of decreasing magnitude. Result: $c_{\pi(1)}, \dots, c_{\pi(m)}$
Step 3: Find the smallest $\tilde m$ with

$$\sum_{i=\tilde m+1}^{m} \big( c_{\pi(i)} \big)^2 \le \varepsilon^2$$

where $\varepsilon$ is the allowed error regarding the $L^2$ norm, i.e. we require $\| f - \tilde f \|_2^2 \le \varepsilon^2$


Decomposition

A fast procedure (by binary search) for finding the threshold $\tau$ below which coefficients are negligible

proc Compress(C: array[1..m] of reals, ε: real)
    τmin := min{|C[i]|}
    τmax := max{|C[i]|}
    do
        τ := (τmin + τmax)/2
        s := 0
        for i:=1 to m do
            if |C[i]| < τ then
                s := s + (C[i])^2
        end for
        if s < ε^2 then
            τmin := τ
        else
            τmax := τ
    until τmin ≈ τmax
    for i:=1 to m do
        if |C[i]| < τ then C[i] := 0;
    end for
end proc

Decomposition

This procedure starts with coefficients C[1], …, C[m] and finds the smallest $\tilde m$ for which

$$\sum_{i=\tilde m+1}^{m} \big( c_{\pi(i)} \big)^2 \le \varepsilon^2$$

where $\pi(1), \dots, \pi(m)$ is a permutation of $1, \dots, m$ such that $|c_{\pi(1)}| \ge |c_{\pi(2)}| \ge \dots \ge |c_{\pi(m)}|$ and where $\varepsilon$ is a tolerable $L^2$ error

The algorithm works as follows:
1. It starts with a threshold $\tau$ which is the average between the smallest and the largest coefficient magnitude
2. It computes the squared $L^2$ error that would result if all coefficients smaller in magnitude than $\tau$ were discarded
3. If squared error < $\varepsilon^2$ then we continue with the right half of the interval
4. If squared error ≥ $\varepsilon^2$ then we continue with the left half of the interval
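
The binary search for the threshold can be sketched in Python as follows. This is a minimal illustration with assumptions: the function name `compress` and the termination tolerance are hypothetical, and the final pass discards everything below the lower bound of the search interval, so the squared error of the result is guaranteed to stay below the allowed value.

```python
def compress(coefficients, epsilon, tolerance=1e-9):
    """Zero out the smallest coefficients while the squared error stays < epsilon^2."""
    magnitudes = [abs(c) for c in coefficients]
    tau_min, tau_max = min(magnitudes), max(magnitudes)
    while tau_max - tau_min > tolerance:
        tau = (tau_min + tau_max) / 2
        # Squared L2 error if all coefficients below tau were discarded.
        s = sum(c * c for c in coefficients if abs(c) < tau)
        if s < epsilon ** 2:
            tau_min = tau   # error still small: try a larger threshold
        else:
            tau_max = tau   # error too large: try a smaller threshold
    return [0.0 if abs(c) < tau_min else c for c in coefficients]

normalized = [6, 0, 1.414, 0.707, 0.5, -0.5, -0.5, 0.5]
print(compress(normalized, epsilon=1.05))
```

In the usage line, the four finest detail coefficients contribute a squared error of 1.0, which is below 1.05 squared, while also dropping 0.707 would exceed the bound.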


Decomposition

Let (without any loss of generality) $c_1, c_2, \dots, c_m$ be already sorted such that
$|c_1|$ (max) $\ge |c_2| \ge \dots \ge |c_m|$ (min)

Let s := squared error caused by discarding all coefficients smaller in magnitude than $\tau$, i.e. all coefficients on the left side of $\tau$ (on an axis running from min to max)

Case 1: $s < \varepsilon^2$ (still very small error?)
We could even discard (possibly) more than those coefficients! Hence we discard the part on the left of $\tau$. We set $\tau_{new,min} := \tau$ and restart with the new interval.
[Figure: axis from min to max with $\tau$ in the middle; the new interval is $[\tau, max]$]

Case 2: $s > \varepsilon^2$ (error too large?)
By discarding the elements left of the actual $\tau$ we would discard too many elements.
The elements to the right of $\tau$ (the larger magnitudes) must be kept anyway. Set $\tau_{new,max} := \tau$ and restart.
[Figure: axis from min to max with $\tau$ in the middle; the new interval is $[min, \tau]$]

Multiresolution Analysis

Multiresolution analysis means analyzing a signal at different frequencies giving different resolutions
Consider a nested set of vector spaces $V^0 \subset V^1 \subset V^2 \subset \dots$
Basis functions of $V^j$ are sometimes called scaling functions
$W^j$ := orthogonal complement of $V^j$ in $V^{j+1}$
[orthogonal according to some definition of an inner product]
The functions which we choose as a basis for $W^j$ are called wavelets
Example (Haar Wavelets):
[Figure: the Haar wavelets $\psi_0^2, \psi_1^2, \psi_2^2, \psi_3^2$]

If the bases of $V^j$ and $W^j$ consist of $m_j$ and $n_j$ elements respectively, then we may combine the basis functions into single row matrices
Matrix notation:

$$\Phi^j(x) := [\phi_0^j(x), \dots, \phi_{m_j-1}^j(x)]; \qquad \Psi^j(x) := [\psi_0^j(x), \dots, \psi_{n_j-1}^j(x)]$$


Matrix Formulation

Some immediate consequences:
1. $V^j \oplus W^j = V^{j+1}$; $W^j$ orthogonal to $V^j$, hence $m_j + n_j = m_{j+1}$
2. $V^{j-1} \subset V^j$: the elements of $V^{j-1}$ must be expressible as linear combinations of the finer functions of $V^j$, i.e. the scaling functions must be refinable
   There must be a matrix $P^j$ with
   (a) $\Phi^{j-1}(x) = \Phi^j(x)\, P^j$
   where $P^j$ is an $m_j \times m_{j-1}$ matrix

Example:
Scaling functions of $V^1$: $\phi_0^1, \phi_1^1$
Scaling functions of $V^2$: $\phi_0^2, \phi_1^2, \phi_2^2, \phi_3^2$
We have:

$$[\phi_0^1 \;\; \phi_1^1] = [\phi_0^2 \;\; \phi_1^2 \;\; \phi_2^2 \;\; \phi_3^2] \begin{pmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{pmatrix}$$

Matrix Formulation

3. $W^{j-1} \subset V^j$: elements of $W^{j-1}$ can be written as linear combinations of the scaling functions of $V^j$
   There is an $m_j \times n_{j-1}$ matrix $Q^j$ with
   (b) $\Psi^{j-1}(x) = \Phi^j(x)\, Q^j$

In our example: $\Phi^1(x) = \Phi^2(x)\, P^2$ and $\Psi^1(x) = \Phi^2(x)\, Q^2$

$$\text{with: } P^2 = \begin{pmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{pmatrix} \text{ and } Q^2 = \begin{pmatrix} 1 & 0 \\ -1 & 0 \\ 0 & 1 \\ 0 & -1 \end{pmatrix}$$

4. (a) and (b) can be combined to:

$$[\Phi^{j-1} \,|\, \Psi^{j-1}] = \Phi^j\, [P^j \,|\, Q^j]$$

Example:

$$[\phi_0^1 \;\; \phi_1^1 \;\; \psi_0^1 \;\; \psi_1^1] = [\phi_0^2 \;\; \phi_1^2 \;\; \phi_2^2 \;\; \phi_3^2] \begin{pmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & -1 \end{pmatrix}$$

5. All functions of $\Phi^{j-1}(x)$ must be orthogonal to all functions of $\Psi^{j-1}(x)$:
   $$[\langle \Phi^{j-1} \mid \Psi^{j-1} \rangle] = 0$$
   (matrix whose $(k,l)$ entry is $\langle \phi_k^{j-1} \mid \psi_l^{j-1} \rangle$)
6. (from (b)):
   $$[\langle \Phi^{j-1} \mid \Phi^{j} \rangle]\, Q^j = 0$$

Definition
- Orthogonal wavelet basis: all functions (wavelets + scaling functions) are orthogonal to
  each other
- Semi-orthogonal wavelet basis: wavelets orthogonal to scaling functions but not to
  each other
Haar basis = orthogonal; spline basis = semi-orthogonal

$[\langle \Phi^{j-1} \mid \Phi^{j} \rangle]\, Q^j = 0$ is a homogeneous system of equations. The set of
solutions is called the null space of $[\langle \Phi^{j-1} \mid \Phi^{j} \rangle]$.
The columns of $Q^j$ are a basis for the null space. There are many alternatives for $Q^j$, thus
there are many different wavelet bases for a given $W^j$. The Haar wavelets are defined
by the additional requirement that the number of consecutive nonzero values in $Q^j$ per
column is minimal:
$$Q^2 = \begin{bmatrix} 1 & 0 \\ -1 & 0 \\ 0 & 1 \\ 0 & -1 \end{bmatrix}$$
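As a worked check of the null-space condition for the Haar basis, the following sketch assumes unnormalised box functions on $[0,1)$ (so $\phi_k^1$ is 1 on $[k/2, (k+1)/2)$ and $\phi_l^2$ is 1 on $[l/4, (l+1)/4)$) and the standard inner product $\langle f \mid g \rangle = \int_0^1 f(x)\,g(x)\,dx$; both are assumptions, since the slides leave the inner product open.

```latex
% Each fine box function overlaps exactly one coarse box function on an
% interval of length 1/4, hence every nonzero inner product equals 1/4.
[\langle \Phi^{1} \mid \Phi^{2} \rangle]
  = \frac{1}{4}\begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix},
\qquad
Q^{2} = \begin{bmatrix} 1 & 0 \\ -1 & 0 \\ 0 & 1 \\ 0 & -1 \end{bmatrix}

[\langle \Phi^{1} \mid \Phi^{2} \rangle]\, Q^{2}
  = \frac{1}{4}\begin{bmatrix} 1-1 & 0 \\ 0 & 1-1 \end{bmatrix}
  = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
```

The two columns of $Q^2$ are linearly independent and the matrix of inner products has rank 2 with 4 columns, so they span the whole two-dimensional null space.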

A matrix notation for the approximation of functions

We have shown earlier that (and how) it is possible to approximate a function $f(x)$
in $V^j$ by $f(x) = c_0^j u_0(x) + \dots + c_{m_j-1}^j u_{m_j-1}(x)$,
where $u_i(x)$ are basis functions of $V^j$
The coefficients $c_i^j$ might be, for example, pixel colors or y-coordinates of the
function $f(x)$
We can write the coefficients $c_i^j$ as $C^j = [c_0^j, \dots, c_{m_j-1}^j]^T$
$f(x) = U^j C^j$ where $U^j = [u_0(x), \dots, u_{m_j-1}(x)]$

We may want to express $f(x)$ within $V^{j-1}$ (a lower-resolution version, i.e. less accurate,
with fewer coefficients: $m_{j-1}$ instead of $m_j$)
The standard procedure for creating the $m_{j-1}$ coefficients of $C^{j-1}$ out of the $m_j$ coefficients
of $C^j$ consists of:
- linear filtering of the coefficients of $C^j$
- downsampling
This may be expressed as a matrix equation: $C^{j-1} = A^j C^j$,
where $A^j$ is an $m_{j-1} \times m_j$ matrix
Since $C^{j-1}$ is smaller than $C^j$, this filtering process loses some detail.
The lost detail may be expressed in a second matrix $D^{j-1}$ with $D^{j-1} = B^j C^j$, where $B^j$ is an
$n_{j-1} \times m_j$ matrix

$C^{j-1} = A^j C^j$ (low resolution of $C^j$)
$D^{j-1} = B^j C^j$ (detail of $C^j$ which cannot be expressed by $C^{j-1}$)

If $A^j$ and $B^j$ are appropriately chosen, $C^j$ may be recovered from $C^{j-1}$ and $D^{j-1}$:
$C^j = P^j C^{j-1} + Q^j D^{j-1}$
The process of calculating $C^{j-1}$ and $D^{j-1}$ from $C^j$ is called decomposition of $C^j$ or
<analysis>
The process of calculating $C^j$ from $C^{j-1}$ and $D^{j-1}$ is called reconstruction of $C^j$ or
<synthesis>
The global procedure looks as follows:
$$C^j \xrightarrow{A^j} C^{j-1} \xrightarrow{A^{j-1}} C^{j-2} \cdots C^1 \xrightarrow{A^1} C^0$$
$$C^j \xrightarrow{B^j} D^{j-1}, \quad C^{j-1} \xrightarrow{B^{j-1}} D^{j-2}, \quad \dots, \quad C^1 \xrightarrow{B^1} D^0$$
This recursive procedure is called a filter bank
The coefficients $C^j$ can be reconstructed from: $C^0, D^0, D^1, D^2, \dots, D^{j-1}$
This sequence is called wavelet transform. It has the same size as $C^j$.

Example

For the Haar basis of $V^2$ we have
$$A^2 = \frac{1}{2}\begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}, \qquad B^2 = \frac{1}{2}\begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix}$$
$C^1 = A^2 C^2$
$D^1 = B^2 C^2$
$A^2$ is the averaging operation
$B^2$ is the differencing operation
(this explains the factor $\tfrac{1}{2}$)

A general relation which must be satisfied by the matrices $A^j$ and $B^j$ is:
$$[\Phi^{j-1} \mid \Psi^{j-1}] \begin{bmatrix} A^j \\ B^j \end{bmatrix} = \Phi^j$$
Thus since
$$[\Phi^{j-1} \mid \Psi^{j-1}] = \Phi^j \, [P^j \mid Q^j]$$
(shown earlier) we must have
$$\begin{bmatrix} A^j \\ B^j \end{bmatrix} = [P^j \mid Q^j]^{-1}$$
i.e. $\begin{bmatrix} A^j \\ B^j \end{bmatrix}$ must be invertible
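The decomposition and reconstruction steps can be sketched for the unnormalised Haar basis, where $A^j$ averages and $B^j$ differences neighbouring coefficients. The function names below are hypothetical and the sketch assumes that the number of coefficients is a power of two.

```python
def haar_analysis(c):
    """One decomposition step: C^j -> (C^{j-1}, D^{j-1})."""
    if len(c) % 2:
        raise ValueError("number of coefficients must be even")
    coarse = [(c[i] + c[i + 1]) / 2 for i in range(0, len(c), 2)]  # A^j C^j
    detail = [(c[i] - c[i + 1]) / 2 for i in range(0, len(c), 2)]  # B^j C^j
    return coarse, detail


def haar_synthesis(coarse, detail):
    """One reconstruction step: C^j = P^j C^{j-1} + Q^j D^{j-1}."""
    c = []
    for a, d in zip(coarse, detail):
        c.extend([a + d, a - d])
    return c


def haar_transform(c):
    """Filter bank: returns the sequence C^0, D^0, D^1, ..., D^{j-1}."""
    details = []
    while len(c) > 1:
        c, d = haar_analysis(c)
        details = d + details  # coarser detail levels go to the front
    return c + details


def haar_inverse(w):
    """Rebuilds C^j from the wavelet transform produced by haar_transform."""
    c, pos = w[:1], 1
    while pos < len(w):
        d = w[pos:pos + len(c)]
        pos += len(c)
        c = haar_synthesis(c, d)
    return c


if __name__ == "__main__":
    w = haar_transform([9, 7, 3, 5])
    print(w, haar_inverse(w))
```

The transform has the same number of entries as the input, and applying `haar_inverse` to it returns the original coefficients, which reflects that the stacked analysis matrices are the inverse of $[P^j \mid Q^j]$.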

Multiresolution Analysis

How to choose a suitable technique for transformation? Suitable for a particular
application.

The following steps have to be made:

1. Select the scaling functions $\Phi^j(x)$ $(j = 0, 1, \dots)$
   (this defines $V^j$ and $P^j$)
2. Select an inner product for $V^0, V^1, \dots$
   (this defines the $L^2$ norm as well as $W^j$)
3. Select a set of wavelets $\Psi^j(x)$ which are a basis for $W^j$ $(j = 0, 1, \dots)$
   (this defines $Q^j$)
Remark: $P^j$ and $Q^j$ define $A^j$ and $B^j$ since $\begin{bmatrix} A^j \\ B^j \end{bmatrix} = [P^j \mid Q^j]^{-1}$

It is desirable for reasons of data compression that:

1. Wavelets are an orthogonal basis for $W^j$
2. Wavelets have a small support (the support of a function $f(x)$ is the set of points
   where $f(x) \ne 0$)

But: orthogonality very often comes with large support. Thus it may be better to
sacrifice the orthogonality. An example of this are the spline wavelets, which:
- have minimum support
- are not orthogonal to each other (except for degree = 0)

B-Spline Wavelets

Haar type wavelets have many advantages:
- simplicity
- orthogonality
- very small supports
- non-overlapping scaling functions for a given level $j$
- non-overlapping wavelets for a given level $j$

However, Haar transforms are not well suited for animation and for curve editing since
they do not have enough smoothness
Therefore, we are interested in wavelets which have several continuous derivatives
Such wavelets can be derived from piecewise polynomial splines (B-splines)

B-spline scaling functions and B-spline wavelets can be defined for different degrees
d = 0, d = 1, d = 2, ...
The higher the degree the smoother the transform
With degree $d$, the scaling functions have $d - 1$ continuous derivatives
d = 0 is the Haar transform

Non-Uniform B-Spline Scaling Functions

The non-uniform B-spline basis functions for degree $d$ are constructed as follows:
Choose positive integers $k$ and $d$ $(k \ge d)$ and values $x_0 \le \dots \le x_{k+d+1}$
These values are called knots
Then we define recursively the non-uniform B-spline basis functions of degree $d$:
$$N_i^0(x) := \begin{cases} 1 & x_i \le x < x_{i+1} \\ 0 & \text{otherwise} \end{cases}$$
$$N_i^r(x) := \frac{x - x_i}{x_{i+r} - x_i}\, N_i^{r-1}(x) + \frac{x_{i+r+1} - x}{x_{i+r+1} - x_{i+1}}\, N_{i+1}^{r-1}(x)$$
$(i = 0, \dots, k;\; r = 1, \dots, d)$
Remark: If the denominator is 0 then the whole term is defined to be zero
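The recursive definition of the non-uniform B-spline basis functions translates almost literally into code. The following is a minimal sketch; the function name `bspline_basis` and the representation of the knots as a Python list are assumptions, and the plain recursion is chosen for readability rather than speed.

```python
def bspline_basis(i, r, x, knots):
    """Value of N_i^r(x) for the given knot list x_0 <= ... <= x_{k+d+1}."""
    if r == 0:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left = right = 0.0
    den_left = knots[i + r] - knots[i]
    if den_left != 0:  # a term with zero denominator is defined to be zero
        left = (x - knots[i]) / den_left * bspline_basis(i, r - 1, x, knots)
    den_right = knots[i + r + 1] - knots[i + 1]
    if den_right != 0:
        right = ((knots[i + r + 1] - x) / den_right
                 * bspline_basis(i + 1, r - 1, x, knots))
    return left + right


if __name__ == "__main__":
    # Quadratic case with repeated end knots (first and last d + 1 knots equal)
    knots = [0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0]
    values = [bspline_basis(i, 2, 0.3, knots) for i in range(4)]
    print(values, sum(values))
```

For the quadratic example the four basis functions sum to 1 at every point of $[0, 1)$, which is a convenient sanity check for an implementation.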

Uniform B-Spline Scaling Functions

Endpoint-interpolating B-splines of degree $d$ on the interval $[0,1]$ are defined by
setting the first $d + 1$ knots to 0 and the last $d + 1$ knots to 1
$N_0^d(x), \dots, N_k^d(x)$ form a basis for the space of piecewise polynomials of degree $d$
with $d - 1$ continuous derivatives
Finally, uniformly spaced B-splines are constructed by selecting $k = 2^j + d - 1$ and by
making the interior knots $x_{d+1}, \dots, x_k$ equally spaced.
This gives $2^j + d$ B-spline functions $N_0^d(x), \dots, N_{2^j+d-1}^d(x)$ which are a basis for a
particular degree $d$ and a level $j$, i.e. for $V^j(d)$.
If $V^j(d)$ denotes the space which is spanned by the B-spline scaling functions of
degree $d$ with $2^j$ uniform intervals, then the spaces $V^0(d), V^1(d), \dots$ are nested:
$V^0(d) \subset V^1(d) \subset V^2(d) \subset \dots$

B-Spline Scaling Functions

Example
$j = 1$, i.e. $2^1 = 2$ subintervals: $[0 : \tfrac{1}{2})$, $[\tfrac{1}{2} : 1)$
B-spline scaling functions: $N_0^d(x), \dots, N_{2^j+d-1}^d(x)$, where the last function is $N_{d+1}^d(x)$
The functions $N_i^d(x)$ have $d - 1$ continuous derivatives.
$N_0^d(x), \dots, N_{d+1}^d(x)$ are a basis for $V^1(d)$
[Figure: scaling functions for degrees d = 0, 1, 2, 3: $N_0^1, N_1^1, N_2^1$ (degree 1); $N_0^2, \dots, N_3^2$ (degree 2); $N_0^3, \dots, N_4^3$ (degree 3)]
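The knot sequence of the uniform endpoint-interpolating B-splines follows directly from the construction above. The sketch below uses hypothetical function names and simply builds the knots for a given level j and degree d.

```python
def uniform_endpoint_knots(j, d):
    """Knots x_0, ..., x_{k+d+1} on [0, 1] with k = 2**j + d - 1."""
    intervals = 2 ** j
    k = intervals + d - 1
    # interior knots x_{d+1}, ..., x_k are equally spaced
    interior = [i / intervals for i in range(1, intervals)]
    knots = [0.0] * (d + 1) + interior + [1.0] * (d + 1)
    assert len(knots) == k + d + 2
    return knots


def num_scaling_functions(j, d):
    """Number of B-spline scaling functions N_0^d, ..., N_k^d, i.e. k + 1."""
    return 2 ** j + d


if __name__ == "__main__":
    print(uniform_endpoint_knots(1, 2))
    print([num_scaling_functions(3, d) for d in range(4)])
```

For j = 3 the counts are 8, 9, 10 and 11 scaling functions for the degrees 0 to 3.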

Scaling and Wavelet Functions for V3(d)

In the following we show B-spline scaling functions for $j = 3$ and for degrees
- d = 0 (Haar wavelet functions)
- d = 1 (linear splines)
- d = 2 (quadratic splines)
- d = 3 (cubic splines)

We get $2^j + d = 8 + d$ scaling functions and $2^j = 8$ wavelets for each case
The wavelets are determined by matrices $Q^j$ which satisfy $[\langle \Phi^{j-1} \mid \Phi^{j} \rangle]\, Q^j = 0$
The solution of this equation system is not unique

B-Spline Scaling Functions for V3(0)

Degree d = 0 (Haar wavelets) - not a continuous function:
$$\lim_{\epsilon \to 0} f(x + \epsilon) \ne \lim_{\epsilon \to 0} f(x - \epsilon) \quad \text{if } x = \tfrac{1}{8}, \tfrac{2}{8}, \dots, \tfrac{7}{8}$$
[Figure: 8 scaling functions, 8 wavelets]

B-Spline Scaling Functions for V3(1)

Degree d = 1 (linear B-spline wavelets)
[Figure: 9 scaling functions, 8 wavelets]

B-Spline Scaling Functions for V3(2)

Degree d = 2 (quadratic B-spline) - can be continuously differentiated once
[Figure: 10 scaling functions, 8 wavelets]

B-Spline Scaling Functions for V3(d)

Degree d = 3 (cubic B-spline) - can be differentiated continuously two times:
[Figure: 11 scaling functions, 8 wavelets]

There are $2^j + d$ scaling functions if the unit interval $[0:1)$ is subdivided into $2^j$
subintervals. The scaling functions have $d - 1$ continuous derivatives

Wavelets in JPEG2000

Using wavelets, images can be compressed better than with DCT.
Used in compression here: nonstandard decomposition

Decomposition

[Figure: wavelet decomposition of an image into subbands]
- Coarser details
- Vertical information (fine details)
- Horizontal information (fine details)
- Diagonal information (finest details)

JPEG vs. JPEG2000

[Figure: panels (a), (b), (c)]
(a) Original image, 256x256 pixels, 24-bit RGB
(b) JPEG - compressed with compression ratio 43:1
(c) JPEG2000 - compressed with compression ratio 43:1

But

Disadvantages of using wavelets
- The cost of computing may be higher than for DCT
- The use of larger basis functions or wavelet filters produces blurring and ringing
  noise near edge regions in images or video frames
- Longer compression time
- Lower quality than JPEG at low compression rates

Graphics Format

Graphics image formats are specified through:
- Graphics primitives: lines, rectangles, circles, ellipses, text strings (2D), polyhedron
  (3D), vectors
- Attributes: line style, line width, color

Graphics primitives and their attributes allow a higher-level, abstract representation
of an image. The graphics package determines which primitives are supported.
Formats
- IGES (Initial Graphics Exchange Specification), for transferring 2D/3D CAD data
- DXF: developed for 2D/3D AutoCAD data
- HPGL (Hewlett Packard Graphics Language), developed for addressing plotters
- Also possible: mixed formats (vector and raster graphics), e.g. WMF (Windows
  Metafile)

Raster and Graphics Formats

Advantages of graphics format
- Reduction of the graphical image data
- Easier manipulation of graphical images
  (enlarging, scaling, ...)

Disadvantages of graphics format
- Higher computation effort for display on a monitor: transformation to a raster
  representation is required

[Figure: the same image in raster format and in graphics (vector) format]

Image Synthesis

Image synthesis (generation) is an integral part of all computer graphical user interfaces
and indispensable for visualising 2D, 3D and higher dimensional objects. E.g.:

Graphical User Interface (GUI)
- Desktop windows system with icons and menu items
Office Automation and Electronic Publishing: desktop publishing, Hypermedia
systems
Simulation and Animation for Scientific Visualisation and Entertainment
- Pictures can be dynamically varied by adjusting the animation speed, portion of the
  total scene in view, the amount of details shown etc.
- Motion Dynamics: Objects are moved and enabled with respect to a stationary or
  also dynamic observer, e.g. flight simulator
- Update Dynamics: Objects being viewed are changed in shape, color, or other
  properties, e.g. deformation of an in-flight aeroplane structure

Components of Interactive Graphics Systems

Application model
- Represents data or objects to be pictured (stored in an application database)
- Stores graphics image formats and connectivity relationships of the components
- Should be application specific and independent of any particular display system
- Converts image database representations to the graphics system format
Application program
- Handles user inputs by sending commands to the graphics system describing what to
  display and how these objects should appear
Graphics system
- Intermediary component between application programs and the display
- Performs an output transformation on objects in the application model
- Performs an input transformation on user actions to the application
- Consists of output subroutines collected in a graphics package to display images
Graphics hardware
- Receives input from interaction devices and outputs images to display device

Graphics Hardware

Input:
- Mouse, keyboard, data tablet, touch-sensitive panel on the screen (2D input)
- Track-balls, space-balls, data glove, etc. (3D and higher-dimensional input)
Output: raster display

[Figure: display commands and interaction data pass through the computer-host interface; the display controller (DC) writes the sampled image (a bit pattern) into the refresh buffer, from which the video controller drives the display; mouse and keyboard provide the interaction data]

Raster Display

[Figure: raster scan with start of raster scan, raster lines, horizontal retrace, vertical retrace and a pixel]

To avoid flickering of images, a 60 Hz (or higher) refresh rate is recommended
For each pixel, the beam's intensity is set to reflect the pixel's intensity
In color screens, three beams are controlled (red, green, blue)
Monochrome displays use achromatic light that is determined by the quantity of the light
(intensity and luminance)

Object Representation - Bresenham Line

Problem: an exact representation of a geometrical object on a raster display is impossible!
How to determine a good approximation by choosing nearby pixels?

Bresenham Algorithm. Determine the pixels nearest to a real line as follows:

Construct a line from left to right with a gradient $0 \le m \le 1$ (at most 45°); all other
lines are achieved through reflections.

Next pixel is the one in the east: $d_{j+1} \le \tfrac{1}{2} \Rightarrow y_{j+1} = y_j$
Next pixel is the one in the north-east: $d_{j+1} > \tfrac{1}{2} \Rightarrow y_{j+1} = y_j + 1$
[Figure: pixel grid with the cases $d_j = 0$ and $0 < d_j$]

Let $Y_i$ be the true value of the curve and $y_i$ the row of the chosen pixel,
then $Y_{i+1} = y_i + d_{i+1}$, where $d_{i+1}$ is the offset of the line from the row $y_i$
(with the recurrences below, $-\tfrac{1}{2} < d_{i+1} \le \tfrac{3}{2}$)

$Y_{i+1} = Y_i + m$; $y_i + d_{i+1} = y_{i-1} + d_i + m$
$$d_{i+1} = d_i + m + (y_{i-1} - y_i) = \begin{cases} d_i + m & \text{if } y_i = y_{i-1} \\ d_i + m - 1 & \text{if } y_i = y_{i-1} + 1 \end{cases}$$

Changes of $y_i$:
$$y_{i+1} = \begin{cases} y_i & \text{if } d_{i+1} \le \tfrac{1}{2} \\ y_i + 1 & \text{if } d_{i+1} > \tfrac{1}{2} \end{cases}$$
$$d_{i+1} = \begin{cases} d_i + m & \text{if } d_i \le \tfrac{1}{2} \text{ (and } y_i = y_{i-1}\text{)} \\ d_i + m - 1 & \text{if } d_i > \tfrac{1}{2} \text{ (and } y_i = y_{i-1} + 1\text{)} \end{cases}$$

[Figure: three consecutive steps of the line through the pixel grid]
$d_{i+1} = d_i + m$; $y_{i+1} = y_i$ since $d_{i+1} \le \tfrac{1}{2}$
$d_{i+2} = d_{i+1} + m = d_i + 2m$; $y_{i+2} = y_{i+1} + 1$ since $d_{i+2} > \tfrac{1}{2}$
$d_{i+3} = d_{i+2} + m - 1 = d_i + 3m - 1$; $y_{i+3} = y_{i+2}$ since $d_{i+3} \le \tfrac{1}{2}$

Easier calculation by $D_i$ instead of $d_i$ as follows:
$D_i := (d_i - \tfrac{1}{2}) \cdot 2\Delta x$, hence $D_0 = (0 - \tfrac{1}{2}) \cdot 2\Delta x = -\Delta x$, $M = m \cdot 2\Delta x = 2\Delta y$
$$D_{i+1} = \begin{cases} D_i + M & \text{if } D_i \le 0 \\ D_i + M - 2\Delta x & \text{if } D_i > 0 \end{cases}$$
and
$$y_{i+1} = \begin{cases} y_i & \text{if } D_{i+1} \le 0 \\ y_i + 1 & \text{if } D_{i+1} > 0 \end{cases}$$
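The update rules for $D_i$ and $y_i$ can be sketched as follows. The function name `bresenham_line` and the restriction to integer end points with $0 \le \Delta y \le \Delta x$ are assumptions of this sketch; other lines would be handled by reflections as stated above.

```python
def bresenham_line(x0, y0, x1, y1):
    """Pixels of a line from (x0, y0) to (x1, y1) with gradient 0 <= m <= 1."""
    dx, dy = x1 - x0, y1 - y0
    if not 0 <= dy <= dx:
        raise ValueError("only lines with 0 <= m <= 1, drawn left to right")
    D = -dx      # D_0 = -delta x
    M = 2 * dy   # M = 2 * delta y
    y = y0
    pixels = [(x0, y0)]
    for x in range(x0 + 1, x1 + 1):
        # D_{i+1} = D_i + M if D_i <= 0, else D_i + M - 2 * delta x
        D = D + M if D <= 0 else D + M - 2 * dx
        if D > 0:  # north-east step, otherwise east step
            y += 1
        pixels.append((x, y))
    return pixels


if __name__ == "__main__":
    print(bresenham_line(0, 0, 10, 2))
```

Only integer additions and comparisons are needed, which is the point of replacing $d_i$ by $D_i$.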

Example:
$m = 0.2$, $\Delta x = 1$, $\Delta y = 0.2$,
$M = 2\Delta y = 0.4$, $D_0 = -\Delta x = -1$

x_i    y_i    D_i
  0      0    -1.00
  1      0    -0.60
  2      0    -0.20
  3      1     0.20
  4      1    -1.40
  5      1    -1.00
  6      1    -0.60
  7      1    -0.20
  8      2     0.20
  9      2    -1.40
 10      2    -1.00
 11      2    -0.60
 12      2    -0.20
 13      3     0.20
 14      3    -1.40
 15      3    -1.00
 16      3    -0.60

[Figure: plot of the chosen pixels $y_i$ over $x_i$]

Dithering

Dithering is a kind of cheating to achieve more than two intensity levels if only two
intensity levels can be represented, or more generally to increase the apparent color depth.
Dithering exploits the inability of the human eye to distinguish between
neighboring pixels if the resolution is fine enough.

Halftoning (clustered-dot ordered dithering) uses this weakness of the human eye to
produce different intensity levels, e.g. used by laser printers that are not able to display
individual dots.
Example: 2 x 2 dither patterns (5 intensity levels) represented in dither matrices

Dispersed-dot ordered dithering (monochrome dithering) is used e.g. to extend the
number of available colors of displays (at the expense of resolution).
Example: consider 3 bits (red, green, blue) per pixel and a 2 x 2 pattern area - each
pattern can display 5 intensities for each color resulting in 5 x 5 x 5 = 125 colors.
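The 2 x 2 dither patterns with 5 intensity levels can be sketched with a dither matrix. The matrix used below is one commonly used 2 x 2 choice and is an assumption, since the patterns drawn on the slide are not recoverable; the function names and the mapping of 0..255 grey values to the 5 levels are likewise illustrative.

```python
# Assumed 2 x 2 dither matrix: a cell is switched on if the level exceeds its entry
DITHER_2X2 = [[0, 2],
              [3, 1]]


def dither_pattern(level):
    """2 x 2 on/off pattern for an intensity level in 0..4."""
    if not 0 <= level <= 4:
        raise ValueError("level must be in 0..4")
    return [[1 if level > DITHER_2X2[r][c] else 0 for c in range(2)]
            for r in range(2)]


def ordered_dither(image, max_value=255):
    """Dispersed-dot ordered dithering of a grey image to a two-level image."""
    out = []
    for y, row in enumerate(image):
        out.append([
            # map the grey value to a level 0..4, then compare with the matrix
            1 if (v * 5) // (max_value + 1) > DITHER_2X2[y % 2][x % 2] else 0
            for x, v in enumerate(row)
        ])
    return out


if __name__ == "__main__":
    for level in range(5):
        print(level, dither_pattern(level))
    print(ordered_dither([[128, 128], [128, 128]]))
```

Each level switches on exactly as many cells as its number, so the average brightness of a 2 x 2 area takes 5 distinct values; applied to the three color channels separately this gives the 5 x 5 x 5 = 125 colors mentioned above.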

Image Analysis

Pixel information is very often insufficient! There is no information about:
- shape of an object
- orientation of an object
- three-dimensional interpretation of an object
Image Analysis (Image Processing) is required.

Some applications or special topics are:
- Image enhancement
- Pattern recognition
- Scene analysis
- Computer vision

Example:
Traffic scenes taken by a camera installed in a car
Problems:
- Is there a traffic sign visible? Which traffic sign?
- Is a moving car in front of our car? Which type of car?
- What is its speed relative to our speed?

Image analysis is concerned with techniques for extracting descriptions from images:
- Computation of perceived brightness and color
- Partial or complete recovery of 3D data in a scene
- Location of discontinuities corresponding to objects in a scene
- Characterisation of the properties of uniform regions in an image

Image Enhancement
Improves image quality by eliminating noise (extraneous or missing pixels) or by
enhancing contrast, e.g. X-ray images, computerized axial tomography (CAT)

Scene Analysis and Computer Vision
Deals with recognising and reconstructing 3D models of a scene from several 2D
images, e.g. industrial robot sensing (relative sizes, shapes, positions, colors)

Pattern Detection and Recognition
Deals with detecting and clarifying standard patterns and finding distortions from these
patterns, e.g. Optical Character Recognition (OCR)
Static character recognition (OCR), dynamic recognition (handwriting)

Target Detection

Turning radar detector:
000000000000000000000000111111111111111111100000000000000000000000000000000
(the run of 1s is an object)

Imperfect radar: some 0/1 mistakes:
000000010000110000000000111111001101111110010000000001000001110000000000000

Elimination of mistakes by digital filtering

Simplest version: symmetrical (2m-1, m) filter
$a_1 a_2 \dots a_i \dots a_n \;\rightarrow\; a'_1 a'_2 \dots a'_h \dots a'_n$
$$\text{with } a'_h = \begin{cases} 1 & \text{if } \sum_{i=h-m+1}^{h+m-1} a_i \ge m \\ 0 & \text{else} \end{cases}$$

Example:
000001000010001110110110111100010 (unimproved)
Filtering with (5,3)-filter
000000000000001111111111111100000 (improved)

Image Recognition Steps

[Figure: steps Source of Image -> Formatting -> Conditioning -> Labelling -> Grouping -> Extracting -> Matching -> Target Image; the data passes from a digital image data structure to a logical data structure and ends as a symbolic representation]
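The symmetrical (2m-1, m) filter can be sketched as a sliding majority vote over a bit string. The function name `majority_filter` and the treatment of positions outside the sequence as 0 are assumptions; the slide does not say how the borders are handled.

```python
def majority_filter(bits, m):
    """Symmetrical (2m-1, m) filter on a string of '0'/'1' characters.

    Output bit h is 1 if at least m of the 2m-1 input bits centred on h are 1.
    Positions outside the sequence count as 0 (assumed border handling).
    """
    n = len(bits)
    out = []
    for h in range(n):
        window = bits[max(0, h - m + 1):min(n, h + m)]
        out.append("1" if window.count("1") >= m else "0")
    return "".join(out)


if __name__ == "__main__":
    noisy = "000001000010001110110110111100010"
    print(majority_filter(noisy, 3))
```

Applied with m = 3, i.e. as a (5,3)-filter, to the unimproved example sequence, this sketch reproduces the improved sequence given above.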

Digital Image Data Structures

Formatting
- Capturing of an image and transforming to a digital representation

Conditioning
- Based on a model that assumes that the observed image is composed of an
  informative pattern modified by uninteresting variations
- Estimates informative pattern on the basis of the observed image
- Suppresses noise and performs background normalization by suppressing
  uninteresting systematic or patterned variations
- Applied uniformly and context-independent

Labelling
- Based on a model that assumes that the informative pattern has structure as a
  spatial arrangement of events
- Determines in what kind of spatial events each pixel participates
- Labelling operations:
  edge detection, corner detection
  thresholding (e.g. filters significant edges and labels them)
  identification of pixels that participate in various shape primitives

Image Transformation

Digital Image Data Structures are produced through image transformations:
Let $I, J, G \subset \mathbb{N}$ and $f : I \times J \rightarrow G$ an image.
$\Theta : G^{I \times J} \rightarrow G^{I \times J}$ is an image transformation.

Examples
1. Location dependent weakening of intensity (e.g. effect with lens):
   $\Theta(f(x,y)) = f(x,y) + \text{intensityfunction}(x,y)$

2. Threshold transformation (e.g. filtering of greyscales/colors):
   Let $G := \{0, \dots, r\}$, $r$ the maximum intensity and $s \in G$ the threshold,
   $f : I \times J \rightarrow G$ an image and
   $t : G \rightarrow G'$ (here $G' := \{0, 1\}$) is defined as:
   $$t(i) := \begin{cases} 0 & \text{for } 0 \le i \le s \\ 1 & \text{for } s < i \le r \end{cases}$$
   so that $h_{t(f)}$ is the new histogram obtained from $h_f$
   [Figure: $t(i)$ over $i$, with a step at $s$]

Grouping and Extracting

Grouping
- Identifies events by collecting or identifying maximal connected sets of pixels
  participating in the same kind of event (neural networks!)
- Determines new sets of entities
- Changes the logical data structure
- Entities of interest after grouping are sets of pixels
- E.g. line-fitting is a grouping operation, where edges are grouped into lines

Extracting
- Computes for each group of pixels a list of properties:
  Centroid, area, orientation, spatial moments, grey tone moments, spatial-grey tone
  moments, circumscribing circle, inscribing circle, number of holes in a region, average
  curvature in an arc, etc.
- Measures topological or spatial relationships between two or more groupings, i.e.
  clarifies whether two groupings touch, are spatially close or layered

Image Transformation

3. Histogram spreading (e.g. to increase the contrast of an image)
   Let $G := \{0, \dots, r\}$, $r \in \mathbb{N}$ and $f : I \times J \rightarrow G$ an image.
   $h_f : G \rightarrow \mathbb{N}$ with $h_f(i) = |f^{-1}(i)|$ = number of pixels with grey value $i$.
   Assume the image $f$ uses only grey values in the range $0 < a \le i \le b < r$.
   With $t$ defined as follows, the whole greyscale range of $G$ is used:
   $$t(i) := \begin{cases} 0 & \text{for } 0 \le i \le a \\ \frac{r}{b-a}\,(i - a) & \text{for } a < i < b \\ r & \text{for } b \le i \le r \end{cases}$$
   so that $h_{t(f)}$ is the new histogram, $t(i) \in [0, r]$
   [Figure: $t(i)$ over $i$, mapping the range of old values to the range of new greyscale values]

Analogous to spreading, the greyscale can be compressed and used only in partial intervals.
A combination of both is e.g. useful to manipulate film materials.
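The threshold transformation and histogram spreading can be sketched for images stored as lists of rows. The function names are hypothetical, and rounding the spread values down to integers is an assumption made so that the result stays within $G$.

```python
from collections import Counter


def histogram(image):
    """h_f: number of pixels for each grey value that occurs in the image."""
    return Counter(v for row in image for v in row)


def threshold(image, s):
    """Threshold transformation t: 0 for values <= s, 1 for values > s."""
    return [[0 if v <= s else 1 for v in row] for row in image]


def spread(image, a, b, r):
    """Histogram spreading of the used range [a, b] to the full range [0, r]."""
    if not a < b:
        raise ValueError("a must be smaller than b")

    def t(i):
        if i <= a:
            return 0
        if i >= b:
            return r
        return r * (i - a) // (b - a)  # rounded down (assumption)

    return [[t(v) for v in row] for row in image]


if __name__ == "__main__":
    img = [[50, 75], [100, 150]]
    print(threshold(img, 75))
    print(spread(img, 50, 150, 255))
    print(histogram(spread(img, 50, 150, 255)))
```

After spreading, the smallest used grey value maps to 0 and the largest to r, so the histogram of the transformed image covers the whole greyscale.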

Matching

Matching
- Determines the interpretation of some related set of image events recognised
  previously in the extracting step
- Associates events with some given 3D objects or 2D shapes
- Template matching is a classical example of a wide variety of matching operations:
  it compares the examined pattern with known and stored models and chooses the best
  match
Conclusion
- Conditioning, labelling, grouping, extracting and matching constitute a canonical
  decomposition of the image recognition problem
- Each step prepares and transforms the data to facilitate the next step
- On any level the transformation is a unit process and data are prepared for the unit
  transformation to the next higher level
- Depending on the application, the sequence of steps has more than one level of
  recognition and description process

Detection of License Plates

[Figure: the recognition steps applied to a license plate]
- Formatting
- Conditioning: some purification has been done
- Labelling: structures such as corners, edges ... are detected
- Grouping: maximal group of events (a group)
- Extracting: properties such as size, orientation ... are detected
- Matching: meaning detected: it is the letter A

Conclusion

Images are represented by...
- Resolution (pixel x pixel)
- Color depth (bit)
- Color model: greyscale, RGB, CMYK
- Graphics primitives / vectors in graphics image formats
- Lots of image formats, with lossless as well as lossy compression schemes
- JPEG / JPEG2000 as examples for common formats
Image synthesis...
- Is needed to bring an image to the monitor
- Can use dithering to artificially enhance image quality
Image analysis...
- Tries to detect patterns in an image representation
- Can be used for improving image quality by suppressing noise or rescaling
- Can be used to recognize characters / text