
UNIT-III

SOURCE CODING: IMAGE AND VIDEO


Image and Video Formats GIF, TIFF, SIF, CIF, QCIF Image compression: READ, JPEG Video Compression: Principles-I,B,P frames, Motion estimation, Motion compensation, H.261, MPEG standard

IMAGE COMPRESSION

Image Compression GIF


Although colour images comprising 24-bit pixels are supported, GIF reduces the number of possible colours by choosing the 256 entries from the original set of 2^24 colours that most closely match those used in the original image. Hence, instead of sending each pixel as a 24-bit colour value, only an 8-bit index to the table entry that contains the closest match is sent. This results in a 3:1 compression ratio. The contents of the table are sent in addition to the screen size and aspect ratio information. The image can also be transferred over the network using an interlaced mode.
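The table-lookup idea can be sketched as follows. This is a toy illustration only: the 4-entry palette stands in for the 256-entry GIF colour table, and real GIF encoders choose the palette with more sophisticated methods (e.g. median cut) rather than the simple nearest-match search shown here.

```python
def nearest_index(pixel, palette):
    """Return the index of the palette entry closest to a 24-bit RGB pixel."""
    r, g, b = pixel
    return min(range(len(palette)),
               key=lambda i: (palette[i][0] - r) ** 2 +
                             (palette[i][1] - g) ** 2 +
                             (palette[i][2] - b) ** 2)

# A toy 4-entry table stands in for the 256-entry GIF colour table.
palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (255, 255, 255)]
image = [(250, 10, 5), (3, 2, 1), (10, 240, 12)]
indices = [nearest_index(p, palette) for p in image]
# Each pixel is now sent as one index byte instead of three colour bytes (3:1).
```
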

Image Compression GIF Compression Dynamic mode using LZW coding

LZW coding can be applied to the 8-bit index values to obtain further levels of compression.
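The dictionary-growing step at the heart of LZW can be sketched in a few lines. This is a hedged, minimal illustration of the principle only; the actual GIF variant additionally uses variable-width codes, a clear code, and an end-of-information code.

```python
def lzw_encode(data: str) -> list[int]:
    """Basic LZW: grow a dictionary of strings seen so far, emit their codes."""
    dictionary = {chr(i): i for i in range(256)}   # single-byte entries
    next_code = 256
    w, out = "", []
    for c in data:
        wc = w + c
        if wc in dictionary:
            w = wc                      # keep extending the current match
        else:
            out.append(dictionary[w])   # emit code for the longest match
            dictionary[wc] = next_code  # add the new string to the dictionary
            next_code += 1
            w = c
    if w:
        out.append(dictionary[w])
    return out

codes = lzw_encode("ABABAB")   # repeated "AB" pairs reuse dictionary entry 256
```
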

Image Compression GIF interlaced mode

The image is transferred in four passes: the first two passes each carry 1/8 of the rows of the compressed image,

GIF also allows an image to be stored and subsequently transferred over the network in an interlaced mode; this is useful over low bit-rate channels or over the Internet, which provides a variable transmission rate.

Image Compression GIF interlaced mode

followed by a further 1/4 and finally the remaining 1/2 of the image rows.

The compressed image data is organized so that the decompressed image is built up in a progressive way as the data arrives.
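The row order used by GIF's four interlace passes (every 8th row from row 0, every 8th from row 4, every 4th from row 2, every 2nd from row 1) can be generated as a short sketch:

```python
def gif_interlace_order(height: int) -> list[int]:
    """Row transmission order for GIF's four interlace passes."""
    order = []
    for start, step in ((0, 8), (4, 8), (2, 4), (1, 2)):
        order.extend(range(start, height, step))
    return order

rows = gif_interlace_order(8)   # pass boundaries: [0], [4], [2, 6], [1, 3, 5, 7]
```
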

Tagged Image File Format(TIFF)


Used for the transfer of images and digitized documents. Supports up to 48 bits of pixel resolution: 16 bits for each of the R, G, B colours. A code number indicates the particular format used:
- Code 1: uncompressed format
- Codes 2, 3, 4: digitized documents
- Code 5: LZW-compressed format

SIF
Source Intermediate Format (SIF), defined in MPEG-1, is a video format that was developed to allow the storage and transmission of digital video. The SIF format gives a picture quality comparable with that obtained from a VCR. It uses half the spatial resolution of the 4:2:0 subsampling format and half the temporal resolution (refresh rate). The frame refresh rate is 30 Hz for a 525-line system and 25 Hz for a 625-line system.

Format (4:2:0 digitization):

525-line system:
Y = 360 x 240
Cb = Cr = 180 x 120

625-line system:
Y = 360 x 288
Cb = Cr = 180 x 144

Worst-case bit rate = 81 Mbps

CIF (Common Intermediate Format)


CIF (Common Intermediate Format), also known as FCIF (Full Common Intermediate Format), is a format used to standardize the horizontal and vertical resolutions in pixels of YCbCr sequences in video signals. It is commonly used in video teleconferencing systems. It was first proposed in the H.261 standard.

CIF was designed to be easy to convert to PAL or NTSC standards. It is derived from SIF: it combines the spatial resolution of the 625-line SIF with the temporal resolution of the 525-line SIF. CIF defines a video sequence with a resolution of Y = 360 x 288, Cb = Cr = 180 x 144. The temporal resolution is 30 Hz. Worst-case bit rate = 81 Mbps.

QCIF
QCIF means "Quarter CIF". To have one fourth of the area, "quarter" implies the height and width of the frame are each halved. It is used in video telephony. QCIF defines a video sequence with a resolution of Y = 180 x 144, Cb = Cr = 90 x 72. The temporal resolution may also be divided by 2 or 4, i.e. 15 Hz or 7.5 Hz. Worst-case bit rate = 40.5 Mbps.

A lower-resolution version of QCIF, sub-QCIF, is also defined:

Y = 128 x 96
Cb = Cr = 64 x 48
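As a rough cross-check, the raw (uncompressed) bit rate of any of these formats can be computed from its matrix sizes and refresh rate. This sketch assumes 8 bits per sample; note that the figures it produces are raw sample rates and differ from the worst-case values quoted above, which are based on different assumptions.

```python
def raw_bit_rate(y_w, y_h, c_w, c_h, fps, bits=8):
    """Uncompressed rate: one Y matrix plus two chrominance matrices per frame."""
    samples_per_frame = y_w * y_h + 2 * c_w * c_h
    return samples_per_frame * bits * fps

cif_bps  = raw_bit_rate(360, 288, 180, 144, 30)   # 37,324,800 bps raw
qcif_bps = raw_bit_rate(180, 144, 90, 72, 15)     #  4,665,600 bps raw
```
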

Image Compression JPEG encoder schematic

The JPEG standard, produced by the Joint Photographic Experts Group, forms the basis of most video compression algorithms.

Image Compression Image Preparation

Block preparation is necessary since computing the transformed value for each position in a matrix requires the values in all the locations to be processed.

Image Compression Image/block preparation


The source image is made up of one or more 2-D matrices of values:
- For a grey-scale image, a single 2-D matrix is required to store the set of 8-bit grey-level values that represent the image.
- For a colour image, if a CLUT is used then a single matrix of index values is required.
- If the image is represented in R, G, B format then three matrices are required, one each for R, G and B.
- If the Y, Cb, Cr format is used then the matrix size for each of the chrominance components is smaller than the Y matrix (reduced representation).

JPEG Image/block preparation


Once the image format is selected, the values in each matrix are compressed separately using the DCT. In order to make the transformation more efficient, a second step known as block preparation is carried out before the DCT. In block preparation each global matrix is divided into a set of smaller 8 x 8 submatrices (blocks) which are fed sequentially to the DCT.

Image Compression Image Preparation

Once the source image format has been selected and prepared (one of four alternative forms of representation), the set of values in each matrix is compressed separately using the DCT.

Forward DCT
Each pixel value is quantized using 8 bits, which produces a value in the range 0 to 255 for R, G, B (or Y) and a value in the range -128 to +127 for the two chrominance signals Cb and Cr. If the input matrix is P[x,y] and the transformed matrix is F[i,j], then the DCT of each 8 x 8 block is computed using the expression:

F[i,j] = (1/4) C(i) C(j) Σ(x=0..7) Σ(y=0..7) P[x,y] cos[(2x+1)iπ/16] cos[(2y+1)jπ/16]

where C(z) = 1/√2 for z = 0 and C(z) = 1 otherwise.

Image Compression Image Preparation

The pixel values are first centred around zero by subtracting 128 from each intensity/luminance value.
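The expression above can be evaluated directly. This sketch uses the straightforward O(N^4) form for clarity; real encoders use fast factorizations of the DCT.

```python
import math

def forward_dct(P):
    """Direct evaluation of the 8 x 8 forward DCT (values already centred on 0)."""
    def C(u):
        return 1 / math.sqrt(2) if u == 0 else 1.0
    F = [[0.0] * 8 for _ in range(8)]
    for i in range(8):
        for j in range(8):
            s = sum(P[x][y]
                    * math.cos((2 * x + 1) * i * math.pi / 16)
                    * math.cos((2 * y + 1) * j * math.pi / 16)
                    for x in range(8) for y in range(8))
            F[i][j] = 0.25 * C(i) * C(j) * s
    return F

# A uniform block (255 everywhere, centred to 127) has only a DC term.
centred = [[255 - 128] * 8 for _ in range(8)]
coeffs = forward_dct(centred)   # coeffs[0][0] == 1016; all AC terms ~0
```
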

Forward DCT
All 64 values in the input matrix P[x,y] contribute to each entry in the transformed matrix F[i,j].

For i = j = 0 the two cosine terms are both 1 (cos 0 = 1) and hence the value in location F[0,0] of the transformed matrix is simply a function of the summation of all the values in the input matrix. This is essentially the mean of all 64 values in the matrix and is known as the DC coefficient. Since the values in all the other locations of the transformed matrix have a frequency coefficient associated with them, they are known as AC coefficients.

Image Compression Forward DCT


For j = 0 only the horizontal frequency coefficients are present; for i = 0 only the vertical frequency coefficients are present. For all the other locations both horizontal and vertical frequency coefficients are present.

Image Compression Quantization


In addition to classifying the spatial frequency components, the quantization process aims to reduce the size of the DC and AC coefficients so that less bandwidth is required for their transmission (by using a divisor). The sensitivity of the eye varies with spatial frequency, and hence the amplitude threshold below which the eye will detect a particular spatial frequency also varies.

The threshold values vary for each of the 64 DCT coefficients and are held in a 2-D matrix known as the quantization table, with the threshold value to be used with a particular DCT coefficient in the corresponding position in the matrix.
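The divide-and-round step can be sketched as follows. The flat table of threshold value 16 and the single-coefficient input are illustrative only; a real JPEG quantization table holds 64 different thresholds that grow with spatial frequency.

```python
def quantize(F, Q):
    """Divide each DCT coefficient by its table entry, rounding to nearest."""
    def nearest(x):
        return int(x + 0.5) if x >= 0 else -int(-x + 0.5)
    return [[nearest(F[i][j] / Q[i][j]) for j in range(8)] for i in range(8)]

# Illustrative inputs: a lone DC coefficient and a flat table of threshold 16.
F = [[0.0] * 8 for _ in range(8)]
F[0][0] = 1016.0
Q = [[16] * 8 for _ in range(8)]
quantized = quantize(F, Q)   # quantized[0][0] == 64 (1016/16 = 63.5, rounded)
```
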

Image Compression Example computation of a set of quantized DCT coefficients

Image Compression Quantization

From the quantization table and the DCT and quantized coefficients, a number of observations can be made:
- The computation of the quantized coefficients involves rounding the quotients to the nearest integer value.
- The threshold values used increase in magnitude with increasing spatial frequency.
- The DC coefficient in the transformed matrix is the largest.
- Many of the higher-frequency coefficients are zero.

Image Compression Entropy Encoding

Entropy encoding consists of four stages: vectoring, differential encoding, run-length encoding and Huffman encoding.

Vectoring
The entropy encoding operates on a one-dimensional string of values (a vector). However, the output of the quantization stage is a 2-D matrix, and hence this has to be represented in a 1-D form; this is known as vectoring.

Differential encoding
In this stage only the difference in magnitude of the DC coefficient in a quantized block relative to the value in the preceding block is encoded. This reduces the number of bits required to encode the relatively large DC magnitudes. The difference values are then encoded in the form (SSS, value), where SSS indicates the number of bits needed and value is the actual bits that represent the difference. For example, if the sequence of DC coefficients in consecutive quantized blocks was 12, 13, 11, 11, 10, ..., the difference values would be 12, 1, -2, 0, -1.

Run-length encoding
The remaining 63 values in the vector are the AC coefficients. Because of the large number of 0s among the AC coefficients, they are encoded as a string of pairs of values. Each pair is made up of (skip, value), where skip is the number of zeros in the run and value is the next non-zero coefficient. For example, the vector 6, 7, 3, 3, 3, 2, 2, 2, 2, 0, 0, ..., 0 would be encoded as:

(0,6) (0,7) (0,3) (0,3) (0,3) (0,2) (0,2) (0,2) (0,2) (0,0)

The final pair (0,0) indicates the end of the string for this block.
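The (skip, value) scheme above can be sketched directly:

```python
def runlength_encode(ac_coeffs):
    """Encode the 63 AC coefficients as (skip, value) pairs; (0, 0) terminates."""
    pairs, skip = [], 0
    for v in ac_coeffs:
        if v == 0:
            skip += 1          # extend the current run of zeros
        else:
            pairs.append((skip, v))
            skip = 0
    pairs.append((0, 0))       # end-of-block marker
    return pairs

vector = [6, 7, 3, 3, 3, 2, 2, 2, 2] + [0] * 54   # the example vector above
pairs = runlength_encode(vector)
```
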

Huffman encoding
Significant levels of compression can be obtained by replacing long strings of binary digits with much shorter codewords. The length of each codeword is a function of its relative frequency of occurrence. Normally, a table of codewords is used, with the set of codewords precomputed using the Huffman coding algorithm.
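The precomputation step can be sketched with the standard greedy Huffman construction (repeatedly merging the two least frequent subtrees). The symbol frequencies below are illustrative, not from any JPEG table.

```python
import heapq

def huffman_code_lengths(freqs):
    """Greedy Huffman construction; returns codeword length per symbol."""
    heap = [(f, i, {sym: 0}) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)     # two least frequent subtrees
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

# Frequent symbols get shorter codewords.
lengths = huffman_code_lengths({'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5})
```
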

Image Compression Frame Building


In order for the remote computer to interpret all the different fields and tables that make up the bitstream, it is necessary to delimit each field and set of table values in a defined way.

The JPEG standard includes a definition of the structure of the total bitstream relating to a particular image/picture; this is known as a frame. The role of the frame builder is to encapsulate all the information relating to an encoded image/picture.

Image Compression Frame Building


At the top level, the complete frame-plus-header is encapsulated between a start-of-frame and an end-of-frame delimiter, which allows the receiver to determine the start and end of all the information relating to a complete image. The frame header contains a number of fields:
- the overall width and height of the image in pixels
- the number and type of components (CLUT, R/G/B, Y/Cb/Cr)
- the digitization format used (4:2:2, 4:2:0 etc.)

Image Compression Frame Building


At the next level, a frame consists of a number of components, each of which is known as a scan. The level-two header contains fields that include:
- the identity of the components
- the number of bits used to digitize each component
- the quantization table of values used to encode each component

Each scan comprises one or more segments, each of which can contain a group of (8 x 8) blocks preceded by a header. This header contains the set of Huffman codewords for each block.

Image Compression JPEG decoder

The JPEG decoder is made up of a number of stages which are simply the corresponding decoder sections of those used in the encoder. The frame decoder first identifies the encoded bitstream and its associated control information and tables within the various headers. It then loads the contents of each table into the related table and passes the control information to the image builder.

The Huffman decoder then carries out the decompression operation using either the preloaded or the default tables of codewords.

The two decompressed streams containing the DC and AC coefficients of each block are then passed to the differential and run-length decoders.

The resulting matrix of values is then dequantized using either the default or the preloaded values in the quantization table. Each resulting block of 8 x 8 spatial frequency coefficients is passed in turn to the inverse DCT, which transforms the block back to its spatial form. The image builder then reconstructs the image from these blocks using the control information passed to it by the frame decoder.

Although complex, JPEG compression ratios of 20:1 can be obtained while still retaining a good-quality image. This level (20:1) applies to images with few colour transitions; for more complicated images, compression ratios of 10:1 are more common.

As with GIF images, it is possible to encode and rebuild the image in a progressive manner. This can be achieved with two different modes: progressive mode and hierarchical mode.

Progressive mode: first the DC and low-frequency coefficients of each block are sent, followed by the high-frequency coefficients.

Hierarchical mode: the total image is first sent at a low resolution, e.g. 320 x 240, and then at a higher resolution, e.g. 640 x 480.

One approach to compressing a video source is to apply the JPEG algorithm to each frame independently; this is known as moving JPEG or MJPEG. There are two types of compressed frames:
- those that are compressed independently (I-frames)
- those that are predicted (P-frames and B-frames)

Video Compression

Video Compression Example frame sequences I and P frames

In the context of compression, since video is simply a sequence of digitized pictures, video is also referred to as moving pictures and the terms frame and picture are used interchangeably.

I frames
I-frames (intracoded frames) are encoded without reference to any other frames. Each frame is treated as a separate picture and the Y, Cb and Cr matrices are encoded separately using JPEG. With I-frames the compression level is relatively small. They are good for the first frame relating to a new scene in a movie. I-frames must be repeated at regular intervals to avoid losing the whole picture, since a frame can become corrupted during transmission and hence be lost. The number of frames/pictures between successive I-frames is known as a group of pictures (GOP). Typical values of GOP are 3 to 12.

P frames

The encoding of a P-frame is relative to the contents of either a preceding I-frame or a preceding P-frame. P-frames are encoded using a combination of motion estimation and motion compensation. The accuracy of the prediction operation is determined by how well any movement between successive frames is estimated; this is known as motion estimation. Since the estimation is not exact, additional information must also be sent to indicate any small differences between the predicted and actual positions of the moving segments involved; this is known as motion compensation. The number of P-frames between successive I-frames is limited to avoid error propagation.

Frame Sequences I-, P- and B-frames

Each frame is treated as a separate (digitized) picture and the Y, Cb and Cr matrices are encoded independently using the JPEG algorithm (DCT, quantization, entropy encoding), except that the quantization threshold values used are the same for all DCT coefficients.

PB-Frames

A fourth type of frame, known as a PB-frame, has also been defined. It does not refer to a new frame type as such, but rather to the way two neighbouring P- and B-frames are encoded as if they were a single frame.

Video Compression
Motion estimation involves comparing small segments of two consecutive frames for differences and, should a difference be detected, carrying out a search to determine into which neighbouring segment the original segment has moved. To limit the search time, the comparison is limited to a few segments.

This works well in slow-moving applications such as video telephony. For fast-moving video it does not work effectively; hence B-frames (bidirectional frames) are used. Their contents are predicted using both past and future frames. B-frames provide the highest level of compression and, because they are not involved in the coding of other frames, they do not propagate errors.
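The search step described above can be sketched as a full-search block match that minimises the sum of absolute differences (SAD). The 4 x 4 block size and the +/-2 search window here are illustrative only; the standards use 16 x 16 macroblocks and larger windows.

```python
def best_match(ref, block, tx, ty, n=4, search=2):
    """Full search over a +/-search window: return (SAD, dx, dy) of the
    reference-frame position that best matches the target block."""
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            sx, sy = tx + dx, ty + dy
            if not (0 <= sx and sx + n <= len(ref[0])
                    and 0 <= sy and sy + n <= len(ref)):
                continue   # candidate block would fall outside the frame
            sad = sum(abs(ref[sy + r][sx + c] - block[r][c])
                      for r in range(n) for c in range(n))
            if best is None or sad < best[0]:
                best = (sad, dx, dy)
    return best

# Synthetic frame: a gradient, so each 4 x 4 region is unique within the window.
ref = [[x + 10 * y for x in range(8)] for y in range(8)]
# Target macroblock at (3, 2) holds content that has moved by (+1, +1).
target = [[ref[3 + r][4 + c] for c in range(4)] for r in range(4)]
motion = best_match(ref, target, 3, 2)   # → (0, 1, 1): exact match, vector (1, 1)
```
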

P-frame encoding

The digitized contents of the Y matrix associated with each frame are first divided into a two-dimensional matrix of 16 x 16 pixels known as a macroblock.

P-frame encoding
Four DCT blocks for the luminance signal and one each for the two chrominance signals are used per macroblock. To encode a P-frame, the contents of each macroblock in the frame (known as the target frame) are compared on a pixel-by-pixel basis with the contents of the I- or P-frame (the reference frame). If a close match is found, then only the address of the macroblock is encoded. If a match is not found, the search is extended to cover an area around the macroblock in the reference frame.

P-frame encoding

To encode a P-frame, the contents of each macroblock in the target frame are compared on a pixel-by-pixel basis with the contents of the corresponding macroblock in the preceding I- or P-frame.

B-frame encoding
To encode a B-frame, any motion is estimated with reference to both the preceding I- or P-frame and the succeeding P- or I-frame. The motion vector and difference matrices are computed using first the preceding frame as the reference and then the succeeding frame as the reference. A third motion vector and set of difference matrices are then computed using the target and the mean of the two other predicted sets of values. The set with the lowest set of difference matrices is chosen and encoded.

Decoding of I, P, and B frames


- I-frames: decoded immediately to recreate the original frame.
- P-frames: the received information is decoded and the result is used together with the decoded contents of the preceding I- or P-frame (two buffers are used).
- B-frames: the received information is decoded and the result is used together with the decoded contents of the preceding and succeeding P- or I-frames (three buffers are used).
- PB-frames: two neighbouring P- and B-frames are decoded as if they were a single frame.

MPEG
MPEG-1 (ISO Recommendation 11172) uses a resolution of 352 x 288 pixels and is used for VHS-quality audio and video on CD-ROM at a bit rate of 1.5 Mbps.

MPEG-2 (ISO Recommendation 13818) is used for the recording and transmission of studio-quality audio and video. Different levels of video resolution are possible:
- Low: 352 x 288 pixels, comparable with MPEG-1
- Main: 720 x 576 pixels, studio-quality video and audio, bit rates up to 15 Mbps
- High: 1920 x 1152 pixels, used in wide-screen HDTV, bit rates of up to 80 Mbps

MPEG
MPEG-4 is used for interactive multimedia applications over the Internet and over various entertainment networks. The standard contains features that enable a user not only to passively access a video sequence (using, for example, start/stop) but also to manipulate the individual elements that make up a scene within a video. In MPEG-4 each video frame is segmented into a number of video object planes (VOPs), each of which corresponds to an audio-visual object (AVO) of interest. Each audio and video object has a separate object descriptor associated with it, which allows the object to be manipulated by the viewer prior to being decoded and played out, provided the creator of the audio and/or video has provided this facility.

Video Compression MPEG-1 video bitstream structure: composition

The compressed bitstream produced by the video encoder is hierarchical: at the top level is the complete compressed video (sequence), which consists of a string of groups of pictures.

Video Compression MPEG-1 video bitstream structure: format

In order for the decoder to decompress the received bitstream, each data structure must be clearly identified within the bitstream

Video Compression MPEG-4 coding principles

Content based video coding principles showing how a frame/scene is defined in the form of multiple video object planes

Video Compression MPEG 4 encoder/decoder schematic

Before being compressed each scene is defined in the form of a background and one or more foreground audio-visual objects (AVOs)

Video Compression MPEG VOP encoder

The audio associated with an AVO is compressed using one of the algorithms described before and depends on the available bit rate of the transmission channel and the sound quality required
