
MPEG Compression Standards

NAME- MD. SAHJAD FAROUQUI


CLASS- CS-S4(A)
REGISTRATION NO- 12150035
Video Coding Basics

Video signals differ from image signals in several important characteristics. The
most important difference is that video signals have a camera frame rate
of anywhere from 15 to 60 frames/s, which provides the illusion of smooth motion
in the displayed signal. Another difference between images and video is the ability
to exploit temporal redundancy as well as spatial redundancy in designing compression
methods for video. For example, we can take advantage of the fact that objects
in video sequences tend to move in predictable patterns, and can therefore be
motion-compensated from frame to frame if we can detect the object and its motion
trajectory over time.
Need for Video Compression

Uncompressed video (and audio) data are huge. In HDTV, the bit rate easily exceeds 1 Gbps,
which poses big problems for storage and network communications.
For example: one of the formats defined for HDTV broadcasting within the United States is
1920 pixels horizontally by 1080 lines vertically, at 30 frames per second.
If these numbers are all multiplied together, along with 8 bits for each of the three primary
colors, the total data rate required would be approximately 1.5 Gb/sec.
Because of the 6 MHz channel bandwidth allocated, each channel will only support a data rate
of 19.2 Mb/sec, which is further reduced to 18 Mb/sec by the fact that the channel must also
support audio, transport, and ancillary data information.
As can be seen, this restriction in data rate means that the original signal must be compressed
by a figure of approximately 83:1. This number seems all the more impressive when it is
realized that the intent is to deliver very high quality video to the end user, with as few visible
artifacts as possible.
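The arithmetic above can be checked with a short sketch. The function names are illustrative only; the frame size, frame rate, bit depth and the 18 Mb/sec usable channel rate are the figures quoted in the text.

```python
# Figures quoted in the text: 1920 x 1080 pixels, 30 frames/s,
# 8 bits for each of the three primary colors.
WIDTH, HEIGHT, FPS = 1920, 1080, 30
BITS_PER_PIXEL = 8 * 3
USABLE_CHANNEL_BPS = 18_000_000  # 18 Mb/sec left for video, per the text


def raw_bitrate_bps(width, height, fps, bits_per_pixel):
    """Uncompressed data rate in bits per second."""
    return width * height * fps * bits_per_pixel


def compression_ratio(raw_bps, channel_bps):
    """How many times the raw stream must shrink to fit the channel."""
    return raw_bps / channel_bps


raw = raw_bitrate_bps(WIDTH, HEIGHT, FPS, BITS_PER_PIXEL)
ratio = compression_ratio(raw, USABLE_CHANNEL_BPS)
print(f"raw: {raw / 1e9:.2f} Gb/s, required ratio: {ratio:.0f}:1")
```

The raw rate comes out just under 1.5 Gb/sec, and dividing by 18 Mb/sec gives the ratio of roughly 83:1 stated above.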
MPEG

MPEG stands for Moving Picture Experts Group and is pronounced
m-peg. It is a working
group of authorities that was formed by ISO and IEC to set standards
for audio and video compression and transmission. It was established
in 1988 by the initiative of Hiroshi Yasuda and Leonardo Chiariglione.
The term also refers to the family of digital video compression
standards and file formats developed by the group. MPEG generally
produces better-quality video than competing formats, such as Video
for Windows, Indeo and QuickTime. Previously, MPEG files on PCs
needed hardware decoders (codecs) for MPEG processing. Today,
however, PCs can use software-only codecs, including those in products
such as RealNetworks players, QuickTime or Windows Media Player.
Evolution Of MPEG

MPEG-1: The most common implementations of the MPEG-1 standard provide a
video resolution of 352-by-240 at 30 frames per second (fps). This produces video
quality slightly below the quality of conventional VCR videos.
MPEG-2: Offers resolutions of 720x480 and 1280x720 at 60 fps, with full CD-
quality audio. This is sufficient for all the major TV standards, including NTSC, and
even HDTV. MPEG-2 is used by DVD-ROMs. MPEG-2 can compress a 2 hour video
into a few gigabytes. While decompressing an MPEG-2 data stream requires only
modest computing power, encoding video in MPEG-2 format requires significantly
more processing power.
MPEG-3: Was designed for HDTV but was abandoned in favor of using MPEG-2 for
HDTV.
MPEG-4: A graphics and video compression algorithm standard that is based on
MPEG-1 and MPEG-2 and Apple QuickTime technology. MPEG-4
files are designed to be smaller than JPEG or QuickTime files, so they can transmit video
and images over a narrower bandwidth and can mix video with text, graphics and 2-D
and 3-D animation layers. MPEG-4 was standardized in October 1998 in the ISO/IEC
document 14496.
MPEG-7: Formally called the Multimedia Content Description Interface, MPEG-7
provides a tool set for completely describing multimedia content. MPEG-7 is
designed to be generic and not targeted to a specific application.
MPEG-21: Includes a Rights Expression Language (REL) and a Rights Data
Dictionary. Unlike other MPEG standards that describe compression coding methods,
MPEG-21 describes a standard that defines the description of content and also
processes for accessing, searching, storing and protecting the copyrights of content.
Definitions

Bitrate
Information stored/transmitted per unit time
Usually measured in Mbps (Megabits per second)
Ranges from < 1 Mbps to > 40 Mbps
Resolution
Number of pixels per frame
Ranges from 160x120 to 1920x1080
FPS (frames per second)
Usually 24, 25, 30, or 60
Don't need more because of limitations of the human eye
Lossy Compression
It is the class of data encoding methods that uses inexact approximations (or partial
data discarding) to represent the content. These techniques are used to reduce data
size for storage, handling, and transmitting content.
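To make the bitrate definition concrete, here is a minimal sketch that converts a bitrate and a duration into a stream size. The 5 Mbps figure is an assumed example value inside the range given above, not a number from the slides, and `stream_size_gb` is a hypothetical helper name.

```python
def stream_size_gb(bitrate_mbps, duration_s):
    """Size in gigabytes (10**9 bytes) of a stream at a constant bitrate."""
    bits = bitrate_mbps * 1_000_000 * duration_s
    return bits / 8 / 1_000_000_000


# Assumed example: a 2-hour video at a constant 5 Mbps.
size = stream_size_gb(5, 2 * 60 * 60)
print(f"{size:.1f} GB")
```

At this assumed rate a two-hour video occupies a few gigabytes, which is the order of magnitude mentioned for MPEG-2 earlier.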
Lossy Codecs

Most video codecs are necessarily lossy, because it is usually
impractical to store and transmit uncompressed video signals. Even
though most codecs lose some information in the video signal, the
goal is to make this information loss visually imperceptible.
When codec algorithms are developed, they are fine-tuned based
on analyses of human vision and perception. For example, if the
human eye cannot differentiate between lots of subtle variation in
the red channel, a codec may throw away some of that
information and viewers may never notice.
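One simple way to discard subtle variation is coarse quantization. The sketch below is an illustration under assumptions: the slides do not say which mechanism a codec uses for this, and the sample values and step size of 8 are made up for the example.

```python
def quantize(samples, step):
    """Map each sample to the index of its nearest multiple of `step`."""
    return [round(s / step) for s in samples]


def dequantize(levels, step):
    """Reconstruct approximate samples from quantization indices."""
    return [q * step for q in levels]


# Hypothetical red-channel values with only subtle variation.
red = [200, 201, 199, 202, 200, 198]
levels = quantize(red, 8)
restored = dequantize(levels, 8)
worst_error = max(abs(a - b) for a, b in zip(red, restored))
print(levels, restored, worst_error)
```

All six samples collapse to the same level, so they are far cheaper to code, and the reconstruction differs from the original by at most a couple of units.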
MPEG Compression

MPEG encoding is based on eliminating redundant video information, not
only within a frame but over a period of time. In a shot where there is little
motion, such as an interview, most of the video content does not change
from frame to frame, and MPEG encoding can compress the video by a
huge ratio with little or no perceptible quality loss.
MPEG compression reduces video data rates in two ways:
Spatial (intraframe) compression: Compresses individual frames.
Temporal (interframe) compression: Compresses groups of frames together
by eliminating redundant visual data across multiple frames.
Intraframe and Interframe Compression

Intraframe Compression
Within a single frame, areas of similar color and texture can be coded with
fewer bits than the original, thus reducing the data rate with minimal loss in
noticeable visual quality.
Interframe Compression
Instead of storing complete frames, temporal compression stores only what
has changed from one frame to the next, which dramatically reduces the
amount of data that needs to be stored while still achieving high-quality
images.
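The idea of storing only what has changed can be sketched as follows. This is a deliberately simplified illustration: frames are short lists of pixel values and the difference is taken pixel by pixel, whereas MPEG itself works on blocks with motion compensation. The helper names are hypothetical.

```python
def frame_delta(prev, curr):
    """Return {pixel_index: new_value} for the pixels that changed."""
    return {i: c for i, (p, c) in enumerate(zip(prev, curr)) if p != c}


def apply_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    out = list(prev)
    for i, value in delta.items():
        out[i] = value
    return out


frame1 = [10, 10, 10, 10, 50, 50, 10, 10]
frame2 = [10, 10, 10, 10, 10, 50, 50, 10]  # the bright patch moved one pixel right

delta = frame_delta(frame1, frame2)
print(delta)  # only two of the eight pixels need to be stored
```

Applying `apply_delta(frame1, delta)` reproduces `frame2` exactly, so in this toy case nothing is lost by storing the changes instead of the full frame.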
Working Of MPEG

MPEG algorithms compress data into a smaller form that
can be easily transmitted and then decompressed. MPEG
achieves its high compression rate by storing only the
changes from one frame to another, instead of each entire
frame. The video information is then encoded using a
technique called the Discrete Cosine Transform (DCT).
MPEG uses a type of lossy compression, since some data
is removed, but the loss of data is generally
imperceptible to the human eye.
Types Of Frames

I frame (intra-coded)
Coded without reference to other frames
P frame (predictive-coded)
Coded with reference to a previous reference frame (either I or P)
Size is usually about 1/3 of an I frame
B frame (bi-directional predictive-coded)
Coded with reference to both previous and future reference frames (either I or P)
Size is usually about 1/6 of an I frame
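Using the rough size ratios above, the saving from mixing frame types can be estimated. The 12-frame pattern below is an assumed example, and the result is only as accurate as the "about 1/3" and "about 1/6" rules of thumb.

```python
from fractions import Fraction

# Rule-of-thumb sizes from the list above, relative to one I frame.
RELATIVE_SIZE = {"I": Fraction(1), "P": Fraction(1, 3), "B": Fraction(1, 6)}


def gop_relative_size(pattern):
    """Total size of a frame sequence, measured in I-frame units."""
    return sum(RELATIVE_SIZE[frame] for frame in pattern)


pattern = "IBBPBBPBBPBB"  # assumed example: 1 I, 3 P and 8 B frames
total = gop_relative_size(pattern)
saving = 1 - total / len(pattern)  # compared with coding every frame as I
print(f"{float(total):.2f} I-frame units, {float(saving):.0%} smaller")
```

Under these assumptions the 12 frames cost about as much as 3.3 I frames, roughly 72% less than coding every frame as an I frame.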
Group Of Pictures

In video coding, a group of pictures, or GOP structure, specifies the order
in which intra- and inter-frames are arranged. The GOP is a group of
successive pictures within a coded video stream. Each coded video
stream consists of successive GOPs. From the pictures contained in it, the
visible frames are generated.
Group Of Pictures (GOP)

A GOP can contain the following picture types:
I picture or I frame (intra-coded picture): a picture that is coded independently of all other pictures.
Each GOP begins (in decoding order) with this type of picture.
P picture or P frame (predictive-coded picture): contains motion-compensated difference
information relative to previously decoded pictures. In older designs such as MPEG-1, H.262/MPEG-2
and H.263, each P picture can only reference one picture, and that picture must precede the P
picture in display order as well as in decoding order and must be an I or P picture. These constraints
do not apply in the newer standards H.264/MPEG-4 AVC and HEVC.
B picture or B frame (bipredictive-coded picture): contains motion-compensated difference
information relative to previously decoded pictures. In older designs such as MPEG-1 and
H.262/MPEG-2, each B picture can only reference two pictures, one of which must precede the B
picture in display order and the other must follow the B picture in display order, and all pictures that
are referenced must be I or P pictures. These constraints do not apply in the newer
standards H.264/MPEG-4 AVC and HEVC.
D picture or D frame (DC direct-coded picture): serves as a fast-access representation of a picture
for loss robustness or fast-forward. D pictures are only used in MPEG-1 video.
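Because a B picture in the older designs references a picture that follows it in display order, that later I or P picture has to be decoded first. The sketch below reorders a display-order sequence into one possible decoding order under the MPEG-1/MPEG-2 constraints described above. The function name and the input string are illustrative, and trailing B pictures are simply emitted at the end.

```python
def decode_order(display):
    """Reorder frame types from display order to a valid decoding order.

    `display` is a string such as "IBBPBBP". Each I or P picture is moved
    ahead of the B pictures that precede it in display order, because those
    B pictures need it as their future reference.
    """
    order = []
    pending_b = []
    for index, kind in enumerate(display):
        if kind == "B":
            pending_b.append((index, kind))  # wait for the next I or P
        else:
            order.append((index, kind))      # reference picture goes first
            order.extend(pending_b)          # then the B pictures that use it
            pending_b.clear()
    order.extend(pending_b)  # simplification: B pictures with no later reference
    return order


result = decode_order("IBBPBBP")
print(" ".join(f"{kind}{index}" for index, kind in result))
```

For the display sequence I0 B1 B2 P3 B4 B5 P6 this yields I0 P3 B1 B2 P6 B4 B5, which is why B frames add decoding delay.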
Working Of GOP

An I frame indicates the beginning of a
GOP. Afterwards several P and B frames
follow. In older designs, the allowed ordering
and referencing structure is relatively
constrained.
The I frames contain the full image and do
not require any additional information to
reconstruct it. Typically, encoders use GOP
structures that cause each I frame to be a
"clean random access point", such that any
errors within the GOP structure are corrected
by the next I frame.
Generally, the more I frames the video
stream has, the more editable it is. However,
having more I frames substantially increases
the bit rate needed to code the video.
GOP Pattern

A GOP pattern is defined by the ratio of P- to B-frames within a GOP.
Common patterns used for DVD are IBP and IBBP. All three frame types do
not have to be used in a pattern. For example, an IP pattern can be used.
IBP and IBBP GOP patterns, in conjunction with longer GOP lengths,
encode video very efficiently. Smaller GOP patterns with shorter GOP
lengths work better with video that has quick movements, but they don't
compress the data rate as much.
Some encoders can force I-frames to be added sporadically throughout a
stream's GOPs. These I-frames can be placed manually during editing or
automatically by an encoder detecting abrupt visual changes such as
cuts, transitions, and fast camera movements.
GOP Length

Longer GOP lengths encode video more efficiently by reducing the number of
I-frames but are less desirable during short-duration effects such as fast
transitions or quick camera pans. MPEG video may be classified as long-GOP
or short-GOP. The term long-GOP refers to the fact that several P- and B-frames
are used between I-frame intervals. At the other end of the spectrum,
short-GOP MPEG is synonymous with I-frame-only MPEG. Formats such as IMX
use I-frame-only MPEG-2, which reduces temporal artifacts and improves
editing performance. However, I-frame-only formats have a significantly higher
data rate because each frame must store enough data to be completely
self-contained. Therefore, although the decoding demands on your computer are
decreased, there is a greater demand for scratch disk speed and capacity.
Maximum GOP length depends on the specifications of the playback device.
The minimum GOP length depends on the GOP pattern. For example, an IP
pattern can have a length as short as two frames.
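A GOP can be generated from two numbers: its length and the spacing between reference (I or P) frames. The sketch below uses the common N/M parameterization, which is an assumption here since the slides describe patterns only by name. `gop_pattern` is a hypothetical helper, and the output is in display order.

```python
def gop_pattern(n, m):
    """Build a GOP of `n` frames with a reference frame every `m` frames.

    Frame 0 is the I frame; every m-th frame after it is a P frame and the
    frames in between are B frames. With m == 1 there are no B frames.
    """
    if n < 1 or m < 1:
        raise ValueError("n and m must be positive")
    frames = []
    for i in range(n):
        if i == 0:
            frames.append("I")
        elif i % m == 0:
            frames.append("P")
        else:
            frames.append("B")
    return "".join(frames)


print(gop_pattern(12, 3))  # long GOP built from the IBBP pattern
print(gop_pattern(2, 1))   # shortest IP pattern, two frames
print(gop_pattern(1, 1))   # I-frame-only
```

Increasing `n` spreads the cost of the single I frame over more frames, which is the efficiency gain of long-GOP encoding described above.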
Motion Vector

In video compression, a motion vector is the key element in the motion
estimation process. It is used to represent a macroblock in a picture based
on the position of this macroblock (or a similar one) in another picture,
called the reference picture.
The H.264/MPEG-4 AVC standard defines a motion vector as:
motion vector: A two-dimensional vector used for inter prediction that
provides an offset from the coordinates in the decoded picture to the
coordinates in a reference picture.
Prediction

Only use a motion vector if a close match can be found
Evaluate closeness with MSE or another metric
Can't search all possible blocks, so need a smart algorithm
If no suitable match is found, just code the macroblock as an I-block
If a scene change is detected, start fresh
Don't want too many P or B frames in a row
Predictive error will keep propagating until the next I frame
Delay in decoding
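The first few points above can be sketched as block matching. This is a minimal illustration, not an encoder's actual search: it tries every offset in a small window (the "smart algorithm" the slide asks for would visit far fewer candidates), and the block size, window and MSE threshold are assumed values. The returned vector follows the H.264 wording quoted earlier: an offset from the block's coordinates in the current picture to the matching coordinates in the reference picture.

```python
def mse(block_a, block_b):
    """Mean squared error between two equally sized blocks."""
    count = len(block_a) * len(block_a[0])
    total = sum((a - b) ** 2
                for row_a, row_b in zip(block_a, block_b)
                for a, b in zip(row_a, row_b))
    return total / count


def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame (a list of rows)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def find_motion_vector(ref, cur, top, left, size, search, max_mse):
    """Return (dx, dy) of the best match in `ref`, or None if none is close."""
    target = get_block(cur, top, left, size)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > len(ref) or x + size > len(ref[0]):
                continue  # candidate block would fall outside the picture
            err = mse(target, get_block(ref, y, x, size))
            if best is None or err < best[0]:
                best = (err, dx, dy)
    if best is None or best[0] > max_mse:
        return None  # no suitable match: code the block as an I-block
    return (best[1], best[2])


def make_frame(height, width, top, left, size, value=200):
    """Black frame with one bright square, used as test data."""
    frame = [[0] * width for _ in range(height)]
    for y in range(top, top + size):
        for x in range(left, left + size):
            frame[y][x] = value
    return frame


ref = make_frame(8, 8, top=2, left=2, size=2)  # square at row 2, column 2
cur = make_frame(8, 8, top=3, left=4, size=2)  # moved 2 right and 1 down
mv = find_motion_vector(ref, cur, top=3, left=4, size=2, search=2, max_mse=10.0)
print(mv)
```

The vector points back to where the square was in the reference picture. If the reference contains nothing similar, the function returns `None`, which corresponds to coding the macroblock as an I-block.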
Discrete Cosine Transform

A discrete cosine transform (DCT) expresses a finite sequence of data points in
terms of a sum of cosine functions oscillating at different frequencies. DCTs are
important to numerous applications, mainly lossy compression.
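The definition can be tried out numerically. The sketch below implements a one-dimensional DCT (type II, with orthonormal scaling) and its inverse in plain Python. The scaling convention is an assumption, since the slides do not state one, and the direct summation is chosen for clarity rather than speed.

```python
import math


def _scale(k, n):
    """Orthonormal scaling factor for coefficient k of an n-point DCT."""
    return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)


def dct_ii(x):
    """Forward DCT: weights of the cosine functions that sum to x."""
    n = len(x)
    return [
        _scale(k, n) * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                           for i in range(n))
        for k in range(n)
    ]


def idct_ii(c):
    """Inverse DCT: rebuild the samples from the cosine weights."""
    n = len(c)
    return [
        sum(_scale(k, n) * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k in range(n))
        for i in range(n)
    ]


flat = dct_ii([5] * 8)            # a constant signal
ramp = [1, 2, 3, 4, 5, 6, 7, 8]
round_trip = idct_ii(dct_ii(ramp))
print([round(v, 3) for v in flat])
```

For a constant input, only the first (zero-frequency) coefficient is nonzero. Smooth image regions behave similarly, which is why most of their DCT coefficients can be coded with few bits.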

DCT Formula:
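The formula on the original slide was not recovered. As a stand-in, this is the standard two-dimensional DCT for an 8 x 8 block, the block size commonly used in MPEG video coding; it is assumed, not confirmed, to be the form the slide showed. Here f(x, y) is the sample at position (x, y) and F(u, v) is the coefficient for horizontal frequency u and vertical frequency v.

```latex
F(u,v) = \frac{1}{4}\, C(u)\, C(v)
         \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)\,
         \cos\!\left[\frac{(2x+1)\,u\,\pi}{16}\right]
         \cos\!\left[\frac{(2y+1)\,v\,\pi}{16}\right],
\qquad
C(k) =
\begin{cases}
  \dfrac{1}{\sqrt{2}} & k = 0 \\[4pt]
  1 & k \neq 0
\end{cases}
```

F(0, 0) is proportional to the average of the block, and the remaining coefficients describe progressively finer detail.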
MPEG Block Diagram
MPEG-1

MPEG-1 is the earliest format specification in the family of MPEG formats. Because
of its low bit rate, MPEG-1 has been popular for online distribution and in formats
such as Video CD (VCD). DVDs can also store MPEG-1 video, though MPEG-2 is
more commonly used. Although the MPEG-1 standard actually allows high
resolutions, almost all applications use NTSC- or PAL-compatible image dimensions
at quarter resolution or lower.
Common MPEG-1 formats include 320 x 240, 352 x 240 at 29.97 fps (NTSC), and
352 x 288 at 25 fps (PAL). Maximum data rates are often limited to around
1.5 Mbps. MPEG-1 only supports progressive-scan video.
MPEG-1 supports three layers of audio compression, called MPEG-1 Layers 1, 2,
and 3. MPEG-1 Layer 2 audio is used in some formats such as HDV and DVD, but
MPEG-1 Layer 3 (also known as MP3) is by far the most common. In fact, MP3
audio compression has become so popular that it is usually used independently of
video.
MPEG-1 elementary stream files often have extensions such as .m1v and .m1a, for
video and audio, respectively.
MPEG-2

The MPEG-2 standard made many improvements to the MPEG-1 standard,
including:
Support for interlaced video
Higher data rates and larger frame sizes, including internationally accepted
standard definition and high definition profiles
Two kinds of multiplexed system streams: Transport Streams (TS) for unreliable
network transmission such as broadcast digital television, and Program Streams
(PS) for local, reliable media access (such as DVD playback)
MPEG-4

MPEG-4 inherited many of the features in MPEG-1 and MPEG-2 and
then added a rich set of multimedia features such as discrete
object encoding, scene description, rich metadata, and digital
rights management (DRM). Most applications support only a subset
of all the features available in MPEG-4.
Compared to MPEG-1 and MPEG-2, MPEG-4 video compression
(known as MPEG-4 Part 2) provides superior quality at low bit rates.
However, MPEG-4 supports high-resolution video as well. For
example, Sony HDCAM SR uses a form of MPEG-4 compression.
Advantages and Disadvantages

Advantages
Occupies less disk space.
Reading and writing is faster.
File transferring is faster.
The order of bytes is independent.
Overall sharp picture
Audio and video stay in sync with each other

Disadvantages
Picture flashes, blurs when there is too much movement on screen
Higher bitrate often does not solve this problem
