Many kinds of data benefit from compression: images, video streams, audio signals, and others. Without compression many applications would not be feasible! The goal of compression: minimise the bit rate.
Example 1: facsimile image transmission. The document is scanned and digitised. Typically, an 8.5 x 11 inch page is scanned at 200 dpi, resulting in 3.74 Mbits. Transmitting this data over a low-cost 14.4 kbits/s modem would require about 4.3 minutes. With compression, the transmission time can be reduced to 17 seconds. This results in substantial savings in transmission costs.
Example 2: video-based CD-ROM application. Full-motion video at 30 fps and 720 x 480 resolution, with 16 bits per pixel, generates data at 20.736 Mbytes/s. At this rate, only about 31 seconds of video can be stored on a 650 MByte CD-ROM. Compression technology can increase the storage capacity to 74 minutes, at VHS-grade video quality.
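The arithmetic behind the CD-ROM example can be checked with a short script (assuming, as the figures imply, 16 bits per pixel and 1 MByte = 10^6 bytes):

```python
# Uncompressed rate of 720x480, 30 fps video at 2 bytes/pixel.
bytes_per_pixel = 2
rate = 720 * 480 * 30 * bytes_per_pixel   # bytes per second
print(rate / 1e6)                          # 20.736 MBytes/s

# Seconds of uncompressed video that fit on a 650 MByte CD-ROM.
seconds = 650e6 / rate
print(round(seconds))                      # about 31 seconds
```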
Why is compression possible? Because there is considerable redundancy in the signal!
1. Within a single image or a single video frame, there exists correlation among neighbouring samples (spatial correlation).
2. For data acquired from multiple sensors (such as multispectral images), there exists correlation amongst the samples from these sensors (spectral correlation).
3. For temporal data (video), there is correlation amongst samples in different segments of time (temporal correlation).
Compression ratio:

    cr = (size of original data) / (size of compressed data)

This definition is somewhat ambiguous: "size" depends on the data type and on the specific compression method. Example: for a still image, size could refer to the bits needed to represent the entire image; for video, size could refer to the bits needed to represent one frame of video, or to the bits needed to represent one second of video.
Figure: The encoder consists of a source coder followed by a channel coder.
Data Rate

Application                                              Uncompressed    Compressed
Voice (8 ksamples/s, 8 bits/sample)                      64 kbps         2-4 kbps
Slow-motion video (10 fps, 176x120, 8 bits/pixel)        5.07 Mbps       8-16 kbps
Audio conference (8 ksamples/s, 8 bits/sample)           64 kbps         16-64 kbps
Video conference (15 fps, 352x240, 8 bits/pixel)         30.41 Mbps      64-768 kbps
Digital audio (44.1 ksamples/s, 16 bits/sample)          1.5 Mbps        1.28-1.5 Mbps
Video file transfer (15 fps, 352x240, 8 bits/pixel)      30.41 Mbps      384 kbps
Digital video on CD-ROM (30 fps, 352x240, 8 bits/pixel)  60.83 Mbps      1.5-4 Mbps
Broadcast video (30 fps, 720x480, 8 bits/pixel)          248.83 Mbps     3-8 Mbps
HDTV (59.94 fps, 1280x720, 8 bits/pixel)                 1.33 Gbps       20 Mbps
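The uncompressed rates above follow directly from the sampling parameters. A small check (the rates imply three 8-bit colour components per pixel for the video entries; `video_rate` is an illustrative helper name):

```python
def video_rate(w, h, fps, bits_per_component=8, components=3):
    """Uncompressed video bit rate in bits per second."""
    return w * h * bits_per_component * components * fps

print(video_rate(352, 240, 15) / 1e6)      # video conference: 30.41 Mbps
print(video_rate(352, 240, 30) / 1e6)      # digital video on CD-ROM: 60.83 Mbps
print(video_rate(720, 480, 30) / 1e6)      # broadcast video: 248.83 Mbps
print(video_rate(1280, 720, 59.94) / 1e9)  # HDTV: about 1.33 Gbps
```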
The compression problem is thus a bit rate minimisation problem subject to several constraints, such as:

Specified level of signal quality. This constraint is usually applied at the decoder.

Implementation complexity. This constraint is often applied at the decoder, and in some instances at both the encoder and the decoder.

Communication delay. This constraint refers to the end-to-end delay, measured from the start of encoding a sample to the complete decoding of that sample.
Lossless compression The reconstructed data and the original data must be identical in value for each and every data sample. This is also referred to as a reversible process.
Lossy compression: some amount of loss is permitted in the reconstructed data; the process is irreversible. Its design involves trade-offs among signal quality, bit rate, coding delay, and implementation complexity.

Figure: Trade-offs in lossy compression.

Signal Quality. This term is often used to characterise the signal at the output of the decoder. There is no universally accepted measure for signal quality; candidate measures include the bit error probability and the signal-to-noise ratio (SNR):

    SNR = 10 log10 (signal energy / noise energy)  dB

where noise = encoder input signal - decoder output signal. In the case of images or video, the peak signal-to-noise ratio (PSNR) is used instead of the SNR.
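As a sketch, the two quality measures can be computed as follows for 1-D sample sequences (`snr_db` and `psnr_db` are illustrative names, not part of any standard library):

```python
import math

def snr_db(original, decoded):
    """SNR = 10 log10(signal energy / noise energy); noise = input - output."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, decoded))
    return 10 * math.log10(signal / noise)

def psnr_db(original, decoded, peak=255):
    """PSNR replaces signal energy with the peak value (255 for 8-bit samples)."""
    mse = sum((x - y) ** 2 for x, y in zip(original, decoded)) / len(original)
    return 10 * math.log10(peak ** 2 / mse)

print(round(psnr_db([100, 200], [101, 199]), 2))  # MSE = 1, so PSNR = 48.13 dB
```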
MEAN OPINION SCORE. The performance of a compression process can also be characterised by the subjective quality of the decoded signal. For instance, a five-point scale such as

    5  imperceptible
    4  perceptible but not annoying
    3  slightly annoying
    2  annoying
    1  very annoying

might be used to characterise the impairments in the decoder output.
Figure: A generic representation of a source coder, with three stages:
1. Decomposition / transformation and feature selection: e.g., predictive coding, transform-based coding, fractal coding, sub-band coding.
2. Quantisation: the stage-1 output is mapped into a finite set of symbols.
3. Symbol encoding (could be, e.g., Huffman coding), yielding the compressed image.

ELEMENTS OF INFORMATION THEORY

Any information generating process can be viewed as a source that emits a sequence of symbols chosen from a finite alphabet. Example: text is a sequence of ASCII symbols; an n-bit image is a sequence drawn from an alphabet of 2^n symbols.
The simplest form of an information source is the discrete memoryless source (DMS): successive symbols produced by such a source are statistically independent. A DMS is completely specified by the source alphabet

    S = {s1, s2, ..., sn}

and the associated symbol probabilities {p1, p2, ..., pn}. The self-information of symbol si is

    I(si) = log2(1/pi)  bits
The information of independent events taken as a single event equals the sum of the information of the individual events. Example: let

    sk = {si, sj},  pk = pi pj

Then

    I(sk) = log2 1/(pi pj) = log2(1/pi) + log2(1/pj) = I(si) + I(sj)
The entropy of the source is the average self-information:

    H(S) = sum_{i=1..n} pi I(si) = - sum_{i=1..n} pi log2(pi)  bits/symbol

This is the average amount of information per symbol, i.e., the number of bits an observer needs to spend to remove the uncertainty in the source.
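A minimal sketch of these two definitions (function names are my own):

```python
import math

def self_information(p):
    """I(s) = log2(1/p) bits: rarer symbols carry more information."""
    return math.log2(1 / p)

def entropy(probs):
    """H(S) = sum p_i I(s_i) = -sum p_i log2 p_i, in bits/symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # fair binary source: 1.0 bits/symbol
print(entropy([0.9, 0.1]))   # biased binary source: about 0.469 bits/symbol
```

A biased source has lower entropy than a uniform one, which is exactly the redundancy that entropy coding exploits.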
Consider blocks of N successive source symbols (the Nth extension of the source). Each block can now be considered as a single source symbol generated by a source S^N with alphabet size n^N and entropy N H(S). Noiseless source coding theorem: for any ε > 0, it is possible, by choosing N large enough, to construct a code such that the average number of bits per original source symbol l_avg satisfies

    H(S) <= l_avg <= H(S) + ε

In all cases l_avg >= H(S): the entropy is a lower bound on the bit rate of any lossless code.
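The theorem can be illustrated numerically: Huffman-code the Nth extension of a biased binary DMS and watch the rate per original symbol fall toward H(S). The sketch below (helper name is my own) uses the identity that the average Huffman codeword length equals the sum of the probabilities of all merged nodes:

```python
import heapq, itertools, math

def huffman_avg_length(probs):
    """Average codeword length of a Huffman code for the given distribution."""
    heap = [(p, i) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        p1, _ = heapq.heappop(heap)
        p2, i = heapq.heappop(heap)
        total += p1 + p2           # each merge adds one bit to all merged symbols
        heapq.heappush(heap, (p1 + p2, i))
    return total

p = 0.9                                                  # DMS: P(0) = 0.9, P(1) = 0.1
H = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))     # about 0.469 bits/symbol
for N in (1, 2, 3, 4):
    block_probs = [math.prod(t) for t in itertools.product([p, 1 - p], repeat=N)]
    l_avg = huffman_avg_length(block_probs) / N          # bits per original symbol
    print(N, round(l_avg, 3))                            # decreases toward H
```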
METHODS AND STANDARDS FOR LOSSLESS COMPRESSION

Some applications require lossless compression, e.g., digitized medical data. Bitonal image transmission via a facsimile device also imposes such requirements.

Figure: A generic model for lossless compression (input image, symbols, probability model, symbol-to-codeword mapping, codewords).

The combination of the probability modeling and the symbol-to-codeword mapping functions is usually referred to as entropy coding.

Message-to-Symbol Partitioning. As noted before, entropy coding is performed on a symbol-by-symbol basis. Appropriate partitioning of the input messages into symbols is very important for efficient coding.
Example: in an image with 8-bit pixels, one could form multi-pixel symbols; viewing each pair of pixels as a single symbol yields an alphabet 256 x 256 = 65536 symbols long. However, it is very difficult to provide probability models for such large alphabets. In practice, we typically view an image as a string of symbols drawn from the alphabet {0, 1, 2, ..., 255}.
Differential Coding. If, say, the pixels in the image are in the order x1, x2, ..., xN, one might instead process the sequence of differentials yi = xi - x(i-1), where i = 1, 2, ..., N and x0 = 0. The sequence yi is referred to as the prediction residual. This preprocessing concentrates values near zero and typically lowers the entropy; note, however, that the residuals now range over {-255, ..., 255} instead of {0, ..., 255}.
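A minimal sketch of differential coding and its (lossless) inverse, using the x0 = 0 convention above:

```python
def differentials(pixels):
    """y_i = x_i - x_{i-1}, with x_0 = 0 (so the first residual is x_1 itself)."""
    prev, out = 0, []
    for x in pixels:
        out.append(x - prev)
        prev = x
    return out

def reconstruct(residuals):
    """Inverse mapping: a running sum recovers the pixels exactly."""
    out, acc = [], 0
    for y in residuals:
        acc += y
        out.append(acc)
    return out

row = [100, 102, 101, 105, 105]
print(differentials(row))                  # [100, 2, -1, 4, 0]
assert reconstruct(differentials(row)) == row
```

Note how the residuals after the first one cluster around zero even for this smooth row.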
The Lossless JPEG Standard. Differential coding is used to form prediction residuals; the residuals are then coded with either a Huffman coder or an arithmetic coder. In lossless JPEG, one forms a prediction residual using "previous" pixels in the current line and/or the previous line. For a pixel X with neighbours

    c b
    a X

(a = left, b = above, c = above-left), the prediction residual is r = y - X, where the prediction y is one of:

    y = 0
    y = a
    y = b
    y = c
    y = a + b - c
    y = a + (b - c)/2
    y = b + (a - c)/2
    y = (a + b)/2
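The eight predictors can be sketched as follows, indexed in the order listed above (the use of floor division for the halved terms is an assumption; the standard uses integer arithmetic):

```python
def predict(a, b, c, mode):
    """Lossless JPEG-style predictors: a = left, b = above, c = above-left."""
    return [0, a, b, c,
            a + b - c,
            a + (b - c) // 2,
            b + (a - c) // 2,
            (a + b) // 2][mode]

# With a = 100, b = 191, c = 100, predictor (a + b)/2 gives y = 145.
print(predict(100, 191, 100, 7))   # 145
```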
The residual is expressed as a pair of symbols: the category and the actual value (magnitude). The category represents the number of bits needed to encode the magnitude, and it is this category that is Huffman coded. Example: the residual 42 has category 6 and is coded as (Huffman code for 6, 6-bit code for 42). If the residual is negative, then the code for the magnitude is the one's complement of its absolute value; codewords for negative residuals always start with a zero bit.
Example: Consider the neighbourhood

    c b        with  a = 100,  b = 191,  c = 100,  X = 180
    a X

Using the predictor y = (a + b)/2 gives y = 145, so r = 145 - 180 = -35, which falls in category 6.
Category   Prediction Residual
0          0
1          -1, 1
2          -3, -2, 2, 3
3          -7, ..., -4, 4, ..., 7
4          -15, ..., -8, 8, ..., 15
5          -31, ..., -16, 16, ..., 31
6          -63, ..., -32, 32, ..., 63
7          -127, ..., -64, 64, ..., 127
8          -255, ..., -128, 128, ..., 255
9          -511, ..., -256, 256, ..., 511
10         -1023, ..., -512, 512, ..., 1023
11         -2047, ..., -1024, 1024, ..., 2047
12         -4095, ..., -2048, 2048, ..., 4095
13         -8191, ..., -4096, 4096, ..., 8191
14         -16383, ..., -8192, 8192, ..., 16383
15         -32767, ..., -16384, 16384, ..., 32767
16         32768
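The category is simply the number of bits in the binary representation of the residual's magnitude, and the one's-complement rule for negative residuals follows directly (helper names are my own):

```python
def category(residual):
    """Number of bits needed to represent |residual| (0 for a zero residual)."""
    return abs(residual).bit_length()

def magnitude_bits(residual):
    """Magnitude code: the value itself if non-negative; the one's complement
    of |residual| if negative, so negative codes always start with a 0 bit."""
    n = category(residual)
    if n == 0:
        return ''                                   # category 0 carries no bits
    if residual >= 0:
        return format(residual, '0%db' % n)
    return format((~abs(residual)) & ((1 << n) - 1), '0%db' % n)

print(category(42), magnitude_bits(42))     # 6 101010
print(category(-35), magnitude_bits(-35))   # 6 011100
```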
STANDARDS FOR LOSSLESS COMPRESSION

Facsimile Compression Standards and the Run-Length Coding Scheme. In every bitonal image there are large regions that are either all white or all black. Instead of coding each pixel as a (position, value) pair, a scanline can be mapped into a sequence of (run, value) pairs; such a mapping scheme is referred to as a run-length coding scheme.

Figure: Sample scanline of a bitonal image.

The combination of a run-length coding scheme followed by a Huffman coder forms the basis of the image coding standards for facsimile applications.
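The (run, value) mapping of a scanline can be sketched as follows (the 1 = white, 0 = black convention is an assumption):

```python
def run_lengths(scanline):
    """Map a bitonal scanline to a list of (run, value) pairs."""
    runs = []
    for pixel in scanline:
        if runs and runs[-1][1] == pixel:
            runs[-1][0] += 1          # extend the current run
        else:
            runs.append([1, pixel])   # start a new run
    return [tuple(r) for r in runs]

line = [1] * 7 + [0] * 4 + [1] * 9
print(run_lengths(line))              # [(7, 1), (4, 0), (9, 1)]
```

Three pairs replace twenty pixels; a Huffman coder then assigns short codewords to the most frequent run lengths.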
FACSIMILE COMPRESSION STANDARDS

ITU-T Rec. T.4 (also known as Group 3) specifies two coding schemes:
1. Modified Huffman (MH) code.
2. Modified READ (MR) code.

JBIG (Joint Bi-level Image Experts Group).