
UNIT-II

SOURCE CODING : TEXT, AUDIO & SPEECH

Syllabus
TEXT: Adaptive Huffman coding, Arithmetic coding, LZW algorithm
AUDIO: Perceptual coding (masking techniques, psychoacoustic model), MPEG audio Layers I, II & III, Dolby AC-3
SPEECH: Channel vocoder, Linear Predictive Coding

INTRODUCTION
Compression is performed either to reduce the volume of information to be transmitted (text, fax and images) or to reduce the bandwidth that is required for its transmission (speech, audio and video).

Compression principles:
- source encoders and destination decoders
- lossless and lossy compression
- entropy encoding
- source encoding

Encoder and Decoder

The main function carried out by the source encoder is the application of the compression algorithm; the corresponding decompression algorithm is carried out by the destination decoder.

When the time required to perform compression and decompression is not critical, the algorithms can be implemented in software, e.g. for text and image files.

When a software implementation of the compression and decompression algorithms is too slow, they must be performed by special processors in separate units, e.g. for speech, audio and video.

Lossless and Lossy Compression


Compression algorithms can be classified as either:
- lossless (reversible): the amount of source information to be transmitted is reduced with no loss of information, e.g. transfer of a text file over a network, or
- lossy: an exact copy of the source cannot be produced after decompression; instead, a version is reproduced that is perceived by the recipient as a true copy, e.g. digitized images, audio and video streams.

Entropy Encoding - Run-length encoding -Lossless


Run-length encoding is used when the source information comprises long substrings of the same character or binary digit. The source string is transmitted as a set of codewords that indicate not only the character but also the number of repetitions in the substring. Provided the destination knows the set of codewords being used, it simply interprets each codeword received and outputs the appropriate number of characters/bits, e.g. the output from the scanner in a fax machine.

For example, the bit string 000000011111111110000011 would be represented as 0,7 1,10 0,5 1,2. If we ensure the first substring is always of 0s, this can be shortened to 7,10,5,2, and the individual decimal counts are then sent in binary form.
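The run-length scheme above can be sketched in a few lines of Python. This is a minimal illustration of the "first run is always 0s" convention from the example; the function names are illustrative:

```python
def rle_encode(bits: str) -> list[int]:
    """Encode a bit string as a list of run lengths.

    By convention the first run counts 0s; if the string starts
    with '1', a zero-length first run keeps the decoder in sync.
    """
    runs = []
    current = "0"
    count = 0
    for b in bits:
        if b == current:
            count += 1
        else:
            runs.append(count)
            current = b
            count = 1
    runs.append(count)
    return runs


def rle_decode(runs: list[int]) -> str:
    """Rebuild the bit string, alternating runs of 0s and 1s."""
    out = []
    bit = "0"
    for n in runs:
        out.append(bit * n)
        bit = "1" if bit == "0" else "0"
    return "".join(out)


print(rle_encode("000000011111111110000011"))  # [7, 10, 5, 2]
```

The decoder needs no table: because the first run is always 0s, the counts alone are enough to reconstruct the string exactly.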

Entropy Encoding statistical encoding


A set of ASCII codewords is often used for the transmission of strings of characters. However, the symbols, and hence the codewords, in the source information do not occur with the same frequency: 'A', for example, may occur more frequently than 'P', which may occur more frequently than 'Q'. Statistical encoding exploits this property by using a set of variable-length codewords, the shortest codewords representing the most frequently occurring symbols, e.g. prefix codes and Huffman codes.
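The Huffman code mentioned above can be sketched with a small priority-queue construction: repeatedly merge the two least frequent subtrees, prefixing their codewords with 0 and 1. This is a minimal illustrative sketch, not a full adaptive coder:

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict[str, str]:
    """Build a Huffman code table: frequent symbols get shorter codewords."""
    freq = Counter(text)
    if len(freq) == 1:  # degenerate single-symbol case
        return {next(iter(freq)): "0"}
    # heap items: (frequency, tie_breaker, {symbol: codeword_so_far})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)  # least frequent subtree
        f2, _, c2 = heapq.heappop(heap)  # next least frequent
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]


print(huffman_codes("AAAABBC"))
```

For "AAAABBC", the most frequent symbol 'A' receives a 1-bit codeword while 'B' and 'C' receive 2-bit codewords, and no codeword is a prefix of another, so the bit stream can be decoded unambiguously.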

Differential Encoding

Differential encoding uses smaller codewords to represent the difference signal; it can be lossy or lossless. This type of coding is used where the amplitude of a signal covers a large range but the difference between successive values is small. Instead of using large codewords, a set of smaller codewords representing only the difference in amplitude is used. For example, if digitization of the analog signal requires 12 bits per sample but the difference signal requires only 3 bits, there is a saving of 75% in transmission bandwidth.

Transform Encoding

Transform encoding involves transforming the source information from one form into another, where the other form lends itself more readily to the application of compression.

As we scan across a set of pixel locations, the rate of change in magnitude varies: it is zero if all the pixel values remain the same, low if, say, one half of the scan differs from the next half, and high if each pixel changes magnitude from one location to the next. The rate of change in magnitude as one traverses the matrix gives rise to a term known as the spatial frequency. Hence, by identifying and eliminating the higher-frequency components, the volume of information transmitted can be reduced.

The human eye is less sensitive to the higher spatial-frequency components in an image; moreover, if the amplitude of a component falls below a certain threshold, it will not be detected by the eye at all.

Transform Coding: DCT Principles

The Discrete Cosine Transform (DCT) is used to transform a two-dimensional matrix of pixel values into an equivalent matrix of spatial-frequency components (coefficients). At this point, any frequency components with amplitudes below the threshold can be dropped; this is where the loss occurs.
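A naive (textbook, not optimized) sketch of the 2-D DCT-II follows; it shows how a block of identical pixel values collapses into a single DC coefficient, with every higher spatial-frequency coefficient at zero and therefore droppable:

```python
import math


def dct2(block: list[list[float]]) -> list[list[float]]:
    """Naive 2-D DCT-II: square pixel block -> spatial-frequency coefficients."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            # orthonormal scaling factors
            cu = math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
            cv = math.sqrt(1 / n) if v == 0 else math.sqrt(2 / n)
            out[u][v] = cu * cv * s
    return out


# A flat 4x4 block: all the energy lands in the DC coefficient out[0][0]
coeffs = dct2([[10.0] * 4 for _ in range(4)])
print(coeffs[0][0])  # 40.0 (all other coefficients are ~0)
```

Dropping coefficients below a threshold before transmission is then a simple comparison against each `out[u][v]`; real codecs use a fast factored DCT rather than this quadruple loop.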

Text Compression: Lempel-Ziv Coding

The LZ algorithm uses strings of characters instead of single characters. For text transfer, for example, a table containing all possible character strings is held in both the encoder and the decoder. As each word appears, instead of sending its ASCII codes, the encoder sends only the index of the word in the table. The decoder uses this index value to reconstruct the text in its original form. This algorithm is therefore also known as dictionary-based coding.

Text Compression: LZW Coding

The principle of the Lempel-Ziv-Welch (LZW) coding algorithm is for the encoder and decoder to build the contents of the dictionary dynamically as the text is being transferred. Initially, the dictionary held by both sides contains only the character set, e.g. ASCII; the remaining entries are built dynamically by the encoder and decoder.

For example, to send the word "THIS", the encoder first sends the indices of the four characters T, H, I, S. When it sends the following space character, which is detected as a non-alphanumeric character, it transmits the character using its index as before but in addition interprets it as terminating the first word, which is then stored in the next free location in the dictionary. The same procedure is followed by both the encoder and the decoder. In applications starting with 128 characters, the dictionary initially uses 8-bit indices and 256 entries: 128 for the individual characters and the remaining 128 for words.

Text Compression: LZW Compression Algorithm

A key issue in determining the level of compression achieved is the number of entries in the dictionary, since this determines the number of bits required for each index.
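The dictionary-building described above can be sketched as follows. This is a minimal character-level LZW (indices 0-127 are the ASCII set, new entries start at 128); a practical coder would also pack the indices into a growing number of bits:

```python
def lzw_encode(text: str) -> list[int]:
    """Encoder: grow the current match w until w+c is unknown, then
    emit the index of w and add w+c to the dictionary."""
    dictionary = {chr(i): i for i in range(128)}
    next_code = 128
    w = ""
    out = []
    for c in text:
        if w + c in dictionary:
            w += c
        else:
            out.append(dictionary[w])
            dictionary[w + c] = next_code
            next_code += 1
            w = c
    if w:
        out.append(dictionary[w])
    return out


def lzw_decode(codes: list[int]) -> str:
    """Decoder: rebuild the same dictionary one step behind the encoder."""
    dictionary = {i: chr(i) for i in range(128)}
    next_code = 128
    w = dictionary[codes[0]]
    out = [w]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:
            # special case: code was only just created by the encoder
            entry = w + w[0]
        out.append(entry)
        dictionary[next_code] = w + entry[0]
        next_code += 1
        w = entry
    return "".join(out)
```

On repetitive text the output index list is shorter than the character count, and compression improves as the dictionary fills with longer strings.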

SOURCE CODING - AUDIO

Perceptual Coding (PC)

Perceptual coders are designed for the compression of general audio, such as that associated with a digital television broadcast. Sampled segments of the source audio waveform are analysed, but only those features that are perceptible to the ear are transmitted. For example, although the human ear is sensitive to signals in the range 15 Hz to 20 kHz, its level of sensitivity is non-linear; that is, the ear is more sensitive to some signals than to others.

When multiple signals are present, as in audio, a strong signal may reduce the level of sensitivity of the ear to other signals that are near to it in frequency. This effect is known as frequency masking.

When the ear hears a loud sound, a short but finite time passes before it can hear a quieter sound. This effect is known as temporal masking.


Sensitivity of the ear


The dynamic range of the ear is defined as the ratio of the maximum amplitude of a signal to the minimum amplitude it can perceive, measured in decibels (dB); that is, the ratio of the loudest sound it can hear to the quietest. The sensitivity of the ear varies with the frequency of the signal: the ear is most sensitive to signals in the range 2-5 kHz, so the signals in this band are the quietest the ear can detect. On the sensitivity curve, the vertical axis gives all other signal amplitudes relative to this 2-5 kHz band. In the example, signal A is above the hearing threshold and signal B is below it.


Audio Compression: Perceptual Properties of the Human Ear

When an audio sound consisting of multiple frequency signals is present, the sensitivity of the ear changes and varies with the relative amplitudes of the signals.

In the figure, signal B is larger than signal A. This causes the basic sensitivity curve of the ear to be distorted in the region of signal B, and signal A will no longer be heard as it lies within the distortion band.

Variation of Frequency Masking with Frequency

The width of each masking curve at a particular signal level is known as the critical bandwidth for that frequency. It has been observed that for frequencies less than 500 Hz the critical bandwidth is around 100 Hz; for frequencies greater than 500 Hz, the bandwidth increases linearly in multiples of 100 Hz. Hence, if the magnitudes of the frequency components that make up an audio sound can be determined, it becomes possible to determine which frequencies will be masked and therefore need not be transmitted.
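The qualitative rule above can be written down as a small function. Note that the text gives only "around 100 Hz below 500 Hz, growing linearly above"; the exact linear factor used here (bandwidth scaling with f/500) is an assumption for illustration, not a figure from the source:

```python
def critical_bandwidth(f_hz: float) -> float:
    """Approximate critical bandwidth (Hz) at centre frequency f_hz.

    Below 500 Hz: ~100 Hz (as stated in the text).
    Above 500 Hz: grows linearly; the f/500 scaling is an
    illustrative assumption, not a measured value.
    """
    if f_hz < 500:
        return 100.0
    return 100.0 * (f_hz / 500.0)


print(critical_bandwidth(200))   # 100.0 Hz
print(critical_bandwidth(1000))  # 200.0 Hz
```

A psychoacoustic model uses such bandwidths to decide which neighbouring components fall inside a strong signal's masking curve and can be left untransmitted.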

Temporal Masking

After the ear hears a loud sound, a further short time passes before it can hear a quieter sound; this is known as temporal masking. After the loud sound ceases, it takes a short period of time for the signal amplitude to decay. During this time, signals whose amplitudes are less than the decay envelope will not be heard and hence need not be transmitted. To exploit this, the input audio waveform must be processed over a time period comparable with that associated with temporal masking.


Audio Compression MPEG perceptual coder schematic

MPEG audio coder


The audio input signal is first sampled and quantized using PCM. The bandwidth available for transmission is divided into a number of frequency subbands using a bank of analysis filters, which maps each set of 32 (time-related) PCM samples into an equivalent set of 32 frequency samples. Processing associated with both frequency and temporal masking is carried out by the psychoacoustic model. In the basic encoder, the time duration of each sampled segment of the audio input signal is equal to the time required to accumulate 12 successive sets of 32 PCM samples.
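The frame arithmetic implied above (12 sets of 32 samples) can be checked directly. The constant and function names below are illustrative, not from the standard:

```python
SUBBANDS = 32        # frequency subbands produced by the analysis filter bank
SETS_PER_FRAME = 12  # successive sets of 32 PCM samples per sampled segment


def frame_duration_ms(sample_rate_hz: int) -> float:
    """Duration of one sampled segment (12 x 32 = 384 PCM samples)."""
    samples = SUBBANDS * SETS_PER_FRAME  # 384 samples
    return 1000.0 * samples / sample_rate_hz


print(frame_duration_ms(48000))  # 8.0 ms at a 48 kHz sampling rate
```

At lower sampling rates the same 384-sample segment simply spans a longer time, e.g. 12 ms at 32 kHz, which sets the time window over which temporal masking can be exploited.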

MPEG audio coder


The output of the psychoacoustic model is a set of what are known as signal-to-mask ratios (SMRs), which indicate those frequency components whose amplitude is below the audible threshold. These ratios allow more bits to be allocated to the regions of highest sensitivity than to the less sensitive regions. In the encoder, all the frequency components are carried in a frame.


MPEG audio coder frame format


The header contains information such as the sampling frequency that has been used. The quantization is performed in two stages using a form of companding: the peak amplitude level in each subband is first quantized using 6 bits, and a further 4 bits are then used to quantize each of the 12 frequency components in the subband relative to this level. Collectively, this is known as the subband sample (SBS) format. The ancillary data field at the end of the frame is optional; it is used to carry additional coded samples associated with any surround sound that is present.
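The bit budget of the SBS format described above works out as follows; this is simple arithmetic on the figures in the text (a 6-bit peak level plus 4 bits for each of the 12 samples), with illustrative names:

```python
def subband_sample_bits(scale_bits: int = 6, sample_bits: int = 4,
                        samples: int = 12) -> int:
    """Bits per subband in the SBS format: peak level + relative samples."""
    return scale_bits + sample_bits * samples


per_subband = subband_sample_bits()
print(per_subband)        # 54 bits per subband
print(32 * per_subband)   # 1728 bits of sample data for all 32 subbands
```

Quantizing the 12 samples relative to the 6-bit peak is the companding step: each sample keeps only 4 bits, yet its effective resolution tracks the subband's actual amplitude range.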

MPEG audio decoder


At the decoder, the dequantizers determine the magnitude of each signal, and the bank of synthesis filters then reproduces the PCM samples.
