compression
Why compression?; Lossy or lossless compression; Entropy encoding and source encoding; Source encoding; Suppression of repetitive sequences; Statistical encoding; Pattern substitution; Huffman encoding; Transform encoding; Transformations; Differential or predictive encoding; Different types of differential encoding; Vector quantization; Errors in vector quantization; Vector quantization; Fractal compression; Asymmetry in compression/decompression
Why compression?
Minimize the bit-rate!
CD-ROM: 648 MB, or 72 minutes of uncompressed stereophonic CD-quality sound,
BUT only 30 seconds of uncompressed studio-quality digital TV.
A 90-minute movie would take about 120 GB, which is about 189 CD-ROMs.
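The figures above can be checked with a few lines of arithmetic. This is only a sketch: it assumes 1 GB = 1024 MB (the text does not say which convention it uses), and the function names are made up for illustration. Under that assumption a 90-minute movie lands between 110 and 120 GB, which the text rounds to about 120 GB, and 120 GB corresponds to roughly 189 CD-ROMs.

```python
CD_ROM_MB = 648          # CD-ROM capacity quoted in the text
TV_SECONDS_PER_CD = 30   # seconds of studio-quality TV per CD-ROM, from the text
MB_PER_GB = 1024         # assumption: binary gigabytes


def tv_rate_mb_per_s():
    """Uncompressed studio-quality TV data rate implied by the text."""
    return CD_ROM_MB / TV_SECONDS_PER_CD


def movie_size_gb(minutes):
    """Uncompressed size of a movie of the given length, in GB."""
    return tv_rate_mb_per_s() * minutes * 60 / MB_PER_GB


def cd_roms_needed(size_gb):
    """Number of 648 MB CD-ROMs needed to hold size_gb gigabytes."""
    return size_gb * MB_PER_GB / CD_ROM_MB


print(f"TV data rate: {tv_rate_mb_per_s():.1f} MB/s")
print(f"90-minute movie: {movie_size_gb(90):.1f} GB")
print(f"CD-ROMs for 120 GB: {cd_roms_needed(120):.1f}")
```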
We need compression!!
Lossless: All information is saved and the compression is reversible.
Lossy: Some information is "thrown away" based on the perceptual response of an observer. The compression is irreversible.
Source encoding
The data is transformed based on the source and its known characteristics
Lossy or lossless
Source encoding
Examples:
remove silent parts in an audio sequence
find common blocks in two video-frames
Classification:
Transform encoding
Differential encoding
Vector quantization
Run-length encoding:
Same as above, but the replaced character is also entered into the code.
The number of sequential occurrences must be higher than 3.
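A minimal Python sketch of this kind of run-length encoding. Several details are assumptions, not taken from the slides: the marker character `!` is hypothetical and must not occur in the data, the count is a single decimal digit (so longer runs are split), and the function names are made up. Runs of four or more characters ("higher than 3") are written as the character, the marker, and the count.

```python
def rle_encode(text, marker="!", min_run=4, max_run=9):
    """Replace runs of min_run..max_run equal characters by char+marker+count."""
    if marker in text:
        raise ValueError("marker must not occur in the input")
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        run = 1
        # Count how often ch repeats, capped so the count stays one digit.
        while i + run < len(text) and text[i + run] == ch and run < max_run:
            run += 1
        if run >= min_run:
            out.append(f"{ch}{marker}{run}")
        else:
            out.append(ch * run)  # short runs are copied unchanged
        i += run
    return "".join(out)


def rle_decode(code, marker="!"):
    """Invert rle_encode."""
    out = []
    i = 0
    while i < len(code):
        ch = code[i]
        if i + 2 < len(code) and code[i + 1] == marker:
            out.append(ch * int(code[i + 2]))
            i += 3
        else:
            out.append(ch)
            i += 1
    return "".join(out)


original = "ABCCCCCCCCDEFGGG"
encoded = rle_encode(original)
print(encoded)  # ABC!8DEFGGG
```

The three trailing `G`s stay as they are, because replacing them would not save any space.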
Statistical encoding
The sequences of data that occur most frequently use the shortest codes.
A code-book is generated either in advance:
Morse alphabet
Pattern substitution
Used when encoding text
Frequent words are replaced with a shorter codeword.
"Multimedia" is replaced with "*M" and "Network" with "*N".
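The substitution can be sketched in a few lines of Python. The two code-book entries are the ones from the example; everything else is an assumption for illustration, in particular that the escape character `*` never occurs in the plain text and that the function names are made up.

```python
# Code-book taken from the example in the text.
CODEBOOK = {"Multimedia": "*M", "Network": "*N"}


def substitute(text, codebook=CODEBOOK):
    """Replace every frequent word by its shorter codeword."""
    if "*" in text:
        raise ValueError("the escape character '*' must not occur in the text")
    for word, code in codebook.items():
        text = text.replace(word, code)
    return text


def restore(text, codebook=CODEBOOK):
    """Expand the codewords back into the original words."""
    for word, code in codebook.items():
        text = text.replace(code, word)
    return text


packed = substitute("Multimedia Network")
print(packed)           # *M *N
print(restore(packed))  # Multimedia Network
```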
Huffman encoding
The code-book is created dynamically and is sent with the compressed information.
Used for images, movies and program-files.
In the case of movies, the code-book can be calculated per frame or per movie.
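A small Python sketch of how such a code-book can be built from the data itself. The slides do not spell out the construction; what follows is the usual greedy merge of the two least frequent entries, and all function names are hypothetical. The returned code-book is what would be sent along with the compressed bits.

```python
import heapq
from collections import Counter


def huffman_codebook(data):
    """Build a prefix-free code-book {symbol: bit string} from symbol counts."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, partial code-book for that subtree).
    heap = [(count, idx, {sym: ""}) for idx, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, codes1 = heapq.heappop(heap)
        c2, _, codes2 = heapq.heappop(heap)
        # Merge the two rarest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_encode(data, codebook):
    return "".join(codebook[s] for s in data)


def huffman_decode(bits, codebook):
    inverse = {code: sym for sym, code in codebook.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free: the first match is the symbol
            out.append(inverse[current])
            current = ""
    return "".join(out)


text = "aaaaaaaabbbc"
book = huffman_codebook(text)
bits = huffman_encode(text, book)
print(book, len(bits), "bits instead of", 8 * len(text))
```

The most frequent symbol (`a`) ends up with the shortest code, which is the statistical-encoding idea described earlier.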
Transform encoding
Some data is easier to compress in the frequency domain.
The data is translated from the spatial or temporal domain to the frequency domain.
Transformations
Mathematical transformations
Fourier, cosine
Lossy or lossless
Some frequencies are coded with lower precision or are removed completely
Discrete Cosine Transform - DCT
Used when coding images
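To illustrate the idea of removing frequencies, here is a one-dimensional DCT sketch in plain Python. It uses the orthonormal DCT-II and its inverse on a single row of eight samples; image coders apply the transform in two dimensions, which is not shown. The function names and the sample values are made up for illustration.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct()."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1 / n)
        s += sum(
            c[k] * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def truncate(coeffs, keep):
    """Lossy step: keep the first `keep` coefficients, zero the rest."""
    return coeffs[:keep] + [0.0] * (len(coeffs) - keep)


samples = [10, 12, 14, 16, 18, 20, 22, 24]   # a smooth ramp
coeffs = dct(samples)
approx = idct(truncate(coeffs, 4))           # high frequencies removed
print([round(v, 2) for v in coeffs])
print([round(v, 2) for v in approx])
```

For smooth data most of the energy sits in the first few coefficients, so dropping the rest changes the reconstructed samples only slightly.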
Differential pulse code modulation - DPCM
Delta modulation
Adaptive differential pulse code modulation - ADPCM
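A rough Python sketch of the first two techniques, to show what "differential" means in practice. It is simplified on purpose: the DPCM version uses the previous sample as the prediction and does not quantize the differences (so it is lossless), the first prediction is assumed to be 0, and the delta modulator uses a fixed step of 1. Function names are hypothetical.

```python
def dpcm_encode(samples):
    """Send the difference to the previous sample instead of the sample."""
    previous = 0  # assumption: initial prediction is 0
    diffs = []
    for s in samples:
        diffs.append(s - previous)
        previous = s
    return diffs


def dpcm_decode(diffs):
    """Rebuild the samples by accumulating the differences."""
    previous = 0
    samples = []
    for d in diffs:
        previous += d
        samples.append(previous)
    return samples


def delta_modulate(samples, step=1):
    """One bit per sample: 1 = approximation goes up by step, 0 = down."""
    approx = 0
    bits = []
    for s in samples:
        bit = 1 if s > approx else 0
        approx += step if bit else -step
        bits.append(bit)
    return bits


signal = [100, 102, 105, 105, 103]
print(dpcm_encode(signal))               # [100, 2, 3, 0, -2]
print(delta_modulate([1, 2, 3, 3, 2]))   # [1, 1, 1, 0, 0]
```

After the first value the differences are small numbers, which need fewer bits than the samples themselves.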
Vector quantization
Special case of pattern substitution
The data is divided into vectors
These vectors are compared to a table
The best matching pattern in the code-book is used
The code-book can be constructed in advance or dynamically
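A minimal Python sketch of the steps above, assuming a small code-book built in advance, vectors of two samples, and squared Euclidean distance as the matching criterion. None of these choices come from the slides, and the function names are hypothetical. The input length is assumed to be a multiple of the vector size.

```python
def nearest_index(vector, codebook):
    """Index of the code-book entry closest to the vector."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist2(vector, codebook[i]))


def vq_encode(data, codebook, dim=2):
    """Cut the data into vectors and replace each by a code-book index."""
    vectors = [tuple(data[i:i + dim]) for i in range(0, len(data), dim)]
    return [nearest_index(v, codebook) for v in vectors]


def vq_decode(indices, codebook):
    """Look each index up again; the result is only an approximation."""
    out = []
    for i in indices:
        out.extend(codebook[i])
    return out


codebook = [(0, 0), (10, 10), (0, 10), (10, 0)]
data = [1, 0, 9, 11, 2, 8, 10, 1]
indices = vq_encode(data, codebook)
print(indices)                       # [0, 1, 2, 3]
print(vq_decode(indices, codebook))  # [0, 0, 10, 10, 0, 10, 10, 0]
```

Only the indices are transmitted. The difference between `data` and the decoded values is the quantization error.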
Vector quantization
Works well with data with known characteristics, where the code-book can be generated in advance
Useful with speech
General problems are:
How to construct an optimal code-book?
Which algorithm to use to find the best matching index?
Fractal compression
Fractals have normally been used to create images, not to compress them.
Fractal transformations:
Divide the image into several small parts.
Compare each individual part to other parts within the same image:
translated, shrunk, slanted, rotated or mirrored.
A virtual code-book is created, and it depends on each coded image.
Requires a large amount of computing power!
Audio compression
ITU-T G.7XX Recommendations; GSM compression standard; Code excited linear prediction (CELP) voice coder; VAT; Higher-quality audio compression standards; Compression techniques used by MPEG-audio; Performance and quality; Objective of each MPEG-layer
G.727: Embedded ADPCM
G.728: 16 Kbps, 3.4 kHz, LD-CELP
How it works:
Both methods use a form of vector quantization with predefined code-books.
In 1016, the error is transmitted with the code-word.
The resulting quality of 1016 is equivalent to that of the 32 Kbps ADPCM algorithm used in G.721.
VAT
Format  Bit-rate  Encoding
pcm     78 Kb/s   8-bit µ-law encoded 8 kHz PCM (20 ms frames)
pcm2    71 Kb/s   8-bit µ-law encoded 8 kHz PCM (40 ms frames)
pcm4    68 Kb/s   8-bit µ-law encoded 8 kHz PCM (80 ms frames)
dvi     46 Kb/s   Intel DVI ADPCM (20 ms frames)
dvi2    39 Kb/s   Intel DVI ADPCM (40 ms frames)
dvi4    36 Kb/s   Intel DVI ADPCM (80 ms frames)
gsm     17 Kb/s   GSM (80 ms frames)
lpc4     9 Kb/s   Linear Predictive Coder (80 ms frames)
Higher-quality audio compression standards
Moving Picture Experts Group (MPEG) family of compression techniques
Targets not only speech but sound in general
MPEG-1 is described in IS 11172-3
MPEG-1 contains a family of three audio encoding and compression schemes
All three layers are hierarchically compatible
Performance and quality
The target bit-rate ranges between 32 and 448 Kbps per monophonic channel.
The sampling rate for MPEG-1 is 32, 44.1 or 48 kHz; MPEG-2 adds 16, 22.05 and 24 kHz.
Two audio channels for MPEG-1.
Five audio channels for MPEG-2, plus a low-frequency enhancement channel.
MPEG-audio compression schemes are lossy, but they can achieve perceptually lossless quality.
Objective of each MPEG-layer
Layer I: 192 or 256 Kbps
Layer II: 96 or 128 Kbps
Layer III: 64 Kbps per audio channel
The quality is very close to CD quality! (Nickname mp3)
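To put the target bit-rates listed above in perspective, the following sketch compares them with one channel of uncompressed CD audio. It assumes CD audio means 44.1 kHz sampling with 16 bits per sample and that "Kbps" means 1000 bits per second; the function names are made up for illustration.

```python
CD_SAMPLE_RATE_HZ = 44_100   # assumption: standard CD audio sampling rate
CD_BITS_PER_SAMPLE = 16      # assumption: standard CD audio sample size


def cd_kbps_per_channel():
    """Uncompressed CD bit-rate of one audio channel in Kbps."""
    return CD_SAMPLE_RATE_HZ * CD_BITS_PER_SAMPLE / 1000


def compression_ratio(target_kbps):
    """How much smaller than one CD channel a given target bit-rate is."""
    return cd_kbps_per_channel() / target_kbps


print(f"One CD channel: {cd_kbps_per_channel():.1f} Kbps")
for kbps in (256, 192, 128, 96, 64):
    print(f"{kbps:>3} Kbps -> ratio {compression_ratio(kbps):.2f}:1")
```

At 64 Kbps per channel the data is reduced by a factor of about 11 while, according to the slides, the quality stays very close to CD quality.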
JPEG; MPEG; Achievements MPEG; MPEG-1; MPEG-2; MPEG-2 Scalable Extensions; MPEG-4; MPEG-7; H.320 - H.261; H.320 - H.221, H.230; H.320 - H.231, H.242, H.233, G.7XX; H.261; H.261 vs. MPEG Data Rates; H.263; PNG, SVG, JPEG-2000
JPEG
Joint Photographic Experts Group
A standard for compression of both bitonal and continuous-tone images
Either lossy or lossless!
MPEG
ISO/IEC Joint Technical Committee 1, Sub Committee 29, Work Group 11
MPEG means Moving Picture Experts Group
3-5 one-week meetings a year, held worldwide, with 300-400 people
MPEG-1
MPEG-1 is a standard for storage and retrieval of moving pictures and audio on storage media.
Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to 1.5 Mbps
VCR-quality video
Standard Interchange Format (SIF)
4:1:1 subsampling
352 x 240 or 288, but can deal with images up to 4095 x 4095
Progressive, in contrast to broadcast TV (interlaced)
Compression rate: 26:1
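The 26:1 figure can be made plausible with some rough arithmetic. The sketch below rests on assumptions that are not stated in the slides: 30 frames per second for 352 x 240 and 25 for 352 x 288, 8-bit samples with the two chroma components at a quarter of the luma resolution (12 bits per pixel on average), and about 1.15 Mbps of the 1.5 Mbps total being spent on video. The names are hypothetical.

```python
def raw_video_bps(width, height, fps, bits_per_pixel=12):
    """Uncompressed video bit-rate in bits per second."""
    return width * height * fps * bits_per_pixel


# Assumed frame rates for the two SIF picture sizes.
sif_240 = raw_video_bps(352, 240, 30)
sif_288 = raw_video_bps(352, 288, 25)

# Assumption: roughly 1.15 Mbps of the 1.5 Mbps budget goes to video.
VIDEO_BUDGET_BPS = 1_150_000
ratio = sif_240 / VIDEO_BUDGET_BPS

print(f"Raw SIF video: {sif_240 / 1e6:.1f} Mbps")
print(f"Compression ratio: about {ratio:.0f}:1")
```

Both picture sizes give the same raw bit-rate under these assumptions, a little over 30 Mbps.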
MPEG-2
MPEG-2 Scalable Extensions
Support for more than one layer, with layers of more or less complexity.
A lower layer and an enhanced layer are supported for each of:
Data partitioning
SNR scalability
Spatial scalability
Temporal scalability
MPEG-4
Note: Changed focus for MPEG-4!
MPEG-4 is a standard for multimedia applications.
Very important for the future!
Earlier:
Very low bit rate Audio-Visual Coding
4.8-64 kbps, QCIF, 10 frames/second
H.320 - H.261
ITU standard family for video telephony
56 - 1930 Kbps
H.320:
The main document for the whole standard with references to other documents
H.261:
Encoding and compression of video p x 64 H.263
H.230:
Same as H.221 but for non-audio/video data
H.242:
How to make a connection
H.261
Optimized for p*64 Kbps, where p=1..30 (ISDN)
Three picture components:
Y:Cb:Cr with 4:1:1 subsampling
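Two small calculations follow from these figures. The sketch assumes 8-bit samples and reads "4:1:1" as each chroma component carrying a quarter as many samples as the luma component; the function names are made up for illustration.

```python
def h261_bitrate_kbps(p):
    """Bit-rate of a p x 64 Kbps channel, for p = 1..30."""
    if not 1 <= p <= 30:
        raise ValueError("p must be between 1 and 30")
    return p * 64


def bits_per_pixel(sample_bits=8, chroma_fraction=0.25):
    """Average bits per pixel: one luma sample plus two subsampled chroma samples."""
    return sample_bits * (1 + 2 * chroma_fraction)


print(h261_bitrate_kbps(1), "to", h261_bitrate_kbps(30), "Kbps")
print(bits_per_pixel(), "bits per pixel instead of", 3 * 8)
```

Subsampling the two chroma components alone halves the raw data, from 24 to 12 bits per pixel, before any further compression is applied.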
H.261 vs. MPEG Data Rates
Note the MPEG I-frame peaks!
A lost I-frame packet is severe
Better optimization H.263+, Video Coding for Low Bit Rate Communication
PNG, SVG, JPEG-2000
PNG, Portable Network Graphics
JPEG-2000
SVG, Scalable Vector Graphics
JPEG
JPEG; Encoding-modes of JPEG; Progressive example; Encoding-modes ...; Overview JPEG; Preparation of the data-blocks; Discrete Cosine Transform; More DCT; Why?; Quantization step; DPCM encoding; Run-length encoding; Notes on JPEG; Overview JPEG; Compression of moving images; Implementations
JPEG
Joint Photographic Experts Group
A standard for compression of both bitonal and continuous-tone images
DCT + quantization + run-length + Huffman
Either lossy or lossless!
Progressive encoding
Allows the image to be rebuilt in multiple coarse-to-clear passes
Lossy
Hierarchical encoding
Includes multiple resolution levels which can be decompressed separately.
The components are divided into blocks to make the transform easier
8x8 samples
Non-interleaved ordering
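The block preparation can be sketched as follows. This is an illustration under assumptions: each component is given as a list of rows, its width and height are multiples of 8 (padding is not shown), and non-interleaved ordering is read as "all blocks of one component before the blocks of the next". The function names are hypothetical.

```python
def split_into_blocks(plane, size=8):
    """Cut one image component into size x size blocks, left to right, top to bottom."""
    height, width = len(plane), len(plane[0])
    if height % size or width % size:
        raise ValueError("dimensions must be multiples of the block size")
    blocks = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            blocks.append([row[left:left + size] for row in plane[top:top + size]])
    return blocks


def non_interleaved(components):
    """Yield (name, block) pairs, finishing one component before the next."""
    for name, plane in components:
        for block in split_into_blocks(plane):
            yield name, block


# A 16 x 24 test component whose sample value encodes its position.
plane = [[y * 24 + x for x in range(24)] for y in range(16)]
blocks = split_into_blocks(plane)
print(len(blocks), "blocks of", len(blocks[0]), "x", len(blocks[0][0]))

order = [name for name, _ in non_interleaved([("Y", plane), ("Cb", plane)])]
print(order)
```

Each 8x8 block is then transformed on its own, which keeps the transform small and fast.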