
5/4/2014

Virtual Environment and Multimedia System

S M TALHA JUBAED
DEPT. OF COMPUTER SCIENCE & ENGINEERING (CSE)
UNIVERSITY OF RAJSHAHI
HOTLINE: + 088- 01911 088 706

Suggestion for CSE-310: Virtual Environment and Multimedia Systems


Question-01: What is motion compensation in video compression? What do you mean by I-frame, P-frame and B-frame? 5 Marks CSE-2009 *** 288 PAGES
Motion Compensation in Video Compression:
Motion compensation is an algorithmic technique employed in the encoding of video data for
video compression, for example in the generation of MPEG-2 files. Motion compensation describes a
picture in terms of the transformation of a reference picture to the current picture. The reference picture
may be previous in time or even from the future. When images can be accurately synthesised from
previously transmitted/stored images, the compression efficiency can be improved.
How it works:
Motion compensation exploits the fact that, often, for many frames of a movie, the only
difference between one frame and another is the result of either the camera moving or an object in the
frame moving. In reference to a video file, this means much of the information that represents one frame
will be the same as the information used in the next frame.
Using motion compensation, a video stream will contain some full (reference) frames; then
the only information stored for the frames in between would be the information needed to transform
the previous frame into the next frame.
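As a concrete illustration, block-based motion estimation searches a small window of the reference frame for the block that best matches each block of the current frame; the best offset found is the motion vector. A minimal exhaustive-search sketch in Python (the function name, window size and the SAD matching criterion are illustrative choices, not part of any standard):

```python
import numpy as np

def best_match(ref, block, top, left, search=4):
    """Find the motion vector for `block` (located at (top, left) in the
    current frame) by exhaustively searching a +/- `search` pixel window
    in the reference frame, minimising the sum of absolute differences."""
    h, w = block.shape
    best_sad, best_mv = float("inf"), (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > ref.shape[0] or x + w > ref.shape[1]:
                continue  # candidate block falls outside the reference frame
            sad = np.abs(ref[y:y + h, x:x + w].astype(int)
                         - block.astype(int)).sum()
            if sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv, best_sad
```

The encoder then stores only the motion vector plus the (usually small) residual between the matched block and the actual block, instead of the raw pixels.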
I-frames: (Intra-Coded Frame):
It is an independent frame that is not related to any other frame.
They are present at regular intervals.
I-frames are independent of other frames and cannot be constructed from other frames.
P-frames: (Predicted Frame):
It is related to the preceding I-frame or P-frame.
Each P-frame contains only the changes from the preceding frame.
P-frames can be constructed only from previous I- or P-frames.
B-frames: (Bidirectional Frame):
It is related to the preceding and following I-frame or P-frame.
Each B-frame is relative to the past and the future.
A B-frame is never related to another B-frame.
Question-02: What is Watermarking? Briefly discuss the main features that are possessed by watermarks.
5 Marks CSE-2009 ***
Watermarking:
A watermark is a secret code described by a digital signal carrying information about the
copyright property of the product. The watermark is embedded in the digital data such that it is
perceptually invisible. The copyright holder is the only person who can demonstrate the existence of his
own watermark and thereby prove the origin of the product.
Reproduction of digital products is easy and inexpensive. In a network environment,
like the web, retransmission of copies all throughout the world is easy. The problem of protecting the
intellectual property of digital products can be handled by the notion of watermarks.
Main Features of Watermarking:
Watermarks are digital signals that are superimposed on a digital image, causing alterations to the
original data. A particular watermark belongs exclusively to one owner, who is the only person that can
perform trustworthy detection of the personal watermark and thus prove ownership of the digital data.
Watermarks should possess the following features:

Perceptual Invisibility:
The modification caused by watermark embedding should not degrade the
perceived image quality. However, even hardly visible differences may become apparent when the
original image is directly compared to the watermarked one.
Trustworthy Detection:
Watermarks should constitute sufficient and trustworthy proof of ownership of a particular product.
False alarms in detection should be extremely rare. Watermark signals should be characterized by
great complexity. This is necessary in order to be able to produce an extensive set of sufficiently well
distinguishable watermarks. An enormous set of watermarks prevents the recovery of a particular
watermark by a trial-and-error procedure.
Associated Key:
Watermarks should be associated with an identification number called the watermark key. The key is used
to cast, detect and remove a watermark. Consequently, the key should be private and should exclusively
characterize the legal owner. Any digital signal extracted from a digital image is assumed to be a valid
watermark if and only if it is associated with a key using a well-established algorithm.
Automated Detection/Search:
Watermarks should combine easily with a search procedure that scans any publicly accessible domain
in a network environment for illegal deposition of an owner's product.
Statistical Invisibility:
Watermarks should not be recoverable using statistical methods. For example, possession of a great
number of digital products watermarked with the same key should not disclose the watermark through
statistical analysis. Therefore, watermarks should be image dependent.
Multiple Watermarking:
We should be able to embed a sufficient number of different watermarks in the same image. This
feature seems necessary because we cannot prevent someone from watermarking an already
watermarked image. It is also convenient when the copyright property is transferred from one owner
to another.
Robustness:
A watermark that is of some practical use should be robust to image modifications up to a certain degree.
The most common image manipulations are compression, filtering, color quantization/color brightness
modifications, geometric distortions and format change. A digital image can undergo a great deal of
different modifications that may deliberately affect the embedded watermark. Obviously, a watermark
that is to be used as a means of copyright protection should be detectable up to the point that the host
image quality remains within acceptable limits.
Question: Write down the Huffman coding algorithm used in lossless compression. Draw the coding tree
for HELLO using the Huffman coding algorithm. 7 Marks CSE-2009 ***
Huffman Coding Algorithm:
1. Initialization: Put all symbols on a list sorted according to their frequency counts.
2. Repeat until the list has only one symbol left:
(a) From the list pick two symbols with the lowest frequency counts. Form a Huffman subtree
that has these two symbols as child nodes and create a parent node.
(b) Assign the sum of the children's frequency counts to the parent and insert it into the list
such that the order is maintained.
(c) Delete the children from the list.
3. Assign a codeword for each leaf based on the path from the root.
Coding Tree for HELLO using the Huffman Algorithm

Figure: Coding Tree for HELLO using the Huffman Algorithm
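The algorithm can be sketched in Python using a min-heap of partial trees (an illustrative implementation, not taken from any library):

```python
import heapq
import itertools
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code table by repeatedly merging the two
    lowest-frequency subtrees, as in steps 1-3 above."""
    counter = itertools.count()      # tie-breaker so the heap never compares dicts
    heap = [(n, next(counter), {sym: ""}) for sym, n in Counter(text).items()]
    heapq.heapify(heap)
    if len(heap) == 1:               # degenerate single-symbol input
        ((_, _, table),) = heap
        return {s: "0" for s in table}
    while len(heap) > 1:
        n1, _, t1 = heapq.heappop(heap)   # two lowest-frequency subtrees
        n2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}        # left child: 0 bit
        merged.update({s: "1" + c for s, c in t2.items()})  # right child: 1 bit
        heapq.heappush(heap, (n1 + n2, next(counter), merged))
    return heap[0][2]
```

For "HELLO" (frequencies L:2, H:1, E:1, O:1), the exact codes depend on how ties are broken, but any valid Huffman tree encodes the whole string in 10 bits.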


Question: What are the advantages of adaptive Huffman coding over Huffman coding? 2 Marks CSE-2012
***
Advantages of adaptive Huffman coding over Huffman coding:
1. The Huffman algorithm requires prior statistical knowledge about the information source and
such information is often not available. This is particularly true in multimedia applications,
where future data is unknown before its arrival as for example in live (or streaming) audio and
video. Even when the statistics are available, the transmission of the symbol table could
represent heavy overhead. In adaptive Huffman coding, statistics are gathered and updated
dynamically as the data stream arrives. The probabilities are no longer based on prior
knowledge but on the actual data received so far. In adaptive Huffman coding, if the probability
distribution of the received symbols changes, symbols will be given new (longer or shorter)
codes. This is especially desirable for multimedia data, when the content (the music or the color
of the scene) and hence the statistics can change rapidly.
2. Adaptive Huffman coding uses defined word schemes which determine the mapping from
source messages to code words based upon a running estimate of the source message
probabilities. The code is adaptive and changes so as to remain optimal for the current estimates.
In this way, the adaptive Huffman codes respond to locality and the encoder thus learns the
characteristics of the source data. The decoder must then learn along with the encoder by
continually updating the Huffman tree so as to stay in synchronization with the encoder.
3. A third advantage of adaptive Huffman coding is that it requires only a single pass over the data.
In many cases the adaptive Huffman method actually gives a better performance, in terms of
number of bits transmitted, than static Huffman coding.
Question: Write down the LZW compression algorithm and generate the output code for the string
CSERURUCSE. The initial dictionary is C-1, S-2, E-3, R-4, U-5. What are the limitations of the LZW
compression? 6 Marks CSE-2012 ***
LZW Compression Algorithm:

BEGIN
    s = next input character;
    while not EOF
    {
        c = next input character;
        if s + c exists in the dictionary
            s = s + c;
        else
        {
            output the code for s;
            add string s + c to the dictionary;
            s = c;
        }
    }
    output the code for s;
END

Second Part:
Let's start with a very simple dictionary (also referred to as a "string table"), initially containing 5
characters, with codes as follows:

Code    String
-------------------
1       C
2       S
3       E
4       R
5       U

Now if the input string is CSERURUCSE, the LZW compression algorithm works as follows:

S       C       OUTPUT  CODE    STRING
----------------------------------------
                        1       C
                        2       S
                        3       E
                        4       R
                        5       U
----------------------------------------
C       S       1       6       CS
S       E       2       7       SE
E       R       3       8       ER
R       U       4       9       RU
U       R       5       10      UR
R       U
RU      C       9       11      RUC
C       S
CS      E       6       12      CSE
E       EOF     3
----------------------------------------

The output codes are 1 2 3 4 5 9 6 3.
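A minimal Python sketch of this procedure (function name illustrative): grow the current string s while s + c is still in the dictionary, otherwise emit the code for s and add s + c as a new entry.

```python
def lzw_compress(text, dictionary):
    """LZW compression over `text`; `dictionary` maps initial strings to
    codes and is extended in place as longer strings are encountered."""
    next_code = max(dictionary.values()) + 1
    s, output = "", []
    for c in text:
        if s + c in dictionary:
            s += c                          # keep extending the current string
        else:
            output.append(dictionary[s])
            dictionary[s + c] = next_code   # new dictionary entry
            next_code += 1
            s = c
    if s:
        output.append(dictionary[s])        # flush the final string at EOF
    return output
```

With the initial dictionary {C:1, S:2, E:3, R:4, U:5}, compressing CSERURUCSE yields the codes 1 2 3 4 5 9 6 3, where the final 3 corresponds to the trailing E.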
Similar Important Question: Generate the output code for the string ABABBABCABABBA. The initial
dictionary is A-1, B-2, C-3.
Let's start with a very simple dictionary (also referred to as a "string table"), initially containing only 3
characters, with codes as follows:

Code    String
-------------------
1       A
2       B
3       C

Now if the input string is ABABBABCABABBA, the LZW compression algorithm works as follows:

S       C       OUTPUT  CODE    STRING
----------------------------------------
A       B       1       4       AB
B       A       2       5       BA
A       B
AB      B       4       6       ABB
B       A
BA      B       5       7       BAB
B       C       2       8       BC
C       A       3       9       CA
A       B
AB      A       4       10      ABA
A       B
AB      B
ABB     A       6       11      ABBA
A       EOF     1
----------------------------------------

The output codes are 1 2 4 5 2 3 4 6 1.


Limitations of the LZW Compression:
What happens when the dictionary gets too large (i.e., when all the 4096 locations have been used)?
Here are some options usually implemented:
Simply forget about adding any more entries and use the table as is.
Throw the dictionary away when it reaches a certain size.
Throw the dictionary away when it is no longer effective at compression.
Clear entries 256-4095 and start building the dictionary again.
Some clever schemes rebuild a string table from the last N input characters.
Question: Define the 2D Discrete Cosine Transform (DCT). Is DCT a part of DFT? 2 Marks CSE-2009 ***
2D Discrete Cosine Transform (DCT):
Given an input function f(i, j) over two integer variables i and j (a piece of an image), the 2D
DCT transforms it into a new function F(u, v), with integer u and v running over the same range as i and
j. The general definition of the transform is:

    F(u, v) = (2 C(u) C(v) / sqrt(MN)) * SUM_{i=0..M-1} SUM_{j=0..N-1}
              cos((2i+1) u pi / 2M) * cos((2j+1) v pi / 2N) * f(i, j)

where i, u = 0, 1, ..., M-1 and j, v = 0, 1, ..., N-1, and the constants C(u) and C(v) equal
sqrt(2)/2 when the argument is 0, and 1 otherwise.

Second Part: The discrete cosine transform (DCT), a widely used transform coding technique, is able to
perform decorrelation of the input signal in a data-independent manner. Because of this, it has gained
tremendous popularity. The DCT is a close counterpart to the Discrete Fourier Transform (DFT); loosely
speaking, the DCT is a transform that involves only the real (cosine) part of the DFT.
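For illustration, the 2D DCT definition can be evaluated directly (and inefficiently) in Python; C(u) is taken as sqrt(2)/2 for u = 0 and 1 otherwise, the usual normalization:

```python
import math

def dct2(f):
    """Direct 2D DCT of an M x N block, term by term from the definition."""
    M, N = len(f), len(f[0])

    def C(x):
        return math.sqrt(0.5) if x == 0 else 1.0

    F = [[0.0] * N for _ in range(M)]
    for u in range(M):
        for v in range(N):
            s = sum(
                math.cos((2 * i + 1) * u * math.pi / (2 * M))
                * math.cos((2 * j + 1) * v * math.pi / (2 * N))
                * f[i][j]
                for i in range(M) for j in range(N)
            )
            F[u][v] = (2 * C(u) * C(v) / math.sqrt(M * N)) * s
    return F
```

For a constant 8x8 block of value 100, only the DC coefficient F(0, 0) is non-zero (it equals 800 under this normalization); all AC coefficients vanish, which is exactly the decorrelation property JPEG exploits.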
Question: Is DCT a part of DFT? Justify the answer. 2 Marks CSE-2012 ***
Question: Write down the properties of Huffman coding with example. 3 Marks CSE-2012 ***
Properties of Huffman coding:
The following are important properties of Huffman Coding:
1. Unique Prefix Property: No Huffman code is a prefix of any other Huffman code. This
precludes any ambiguity in decoding.
2. Optimality: The Huffman code is a minimum-redundancy code. It has been proved optimal for a
given data model (i.e., a given, accurate, probability distribution):
The two least frequent symbols will have the same length for their Huffman codes, differing
only at the last bit.
Symbols that occur more frequently will have shorter Huffman codes than symbols that occur
less frequently.
The average code length l for an information source S is strictly less than eta + 1, where
eta is the entropy of the source. Combined with the entropy lower bound eta <= l, we have:

    eta <= l < eta + 1
Question: What is the entropy of an information source with alphabet S = {s1, s2, ..., sn}? 2 Marks
CSE-2012 ***
Entropy of an Information source:
The entropy eta of an information source with alphabet S = {s1, s2, ..., sn} is:

    eta = H(S) = SUM_{i=1..n} p_i * log2(1 / p_i) = - SUM_{i=1..n} p_i * log2(p_i)

where p_i is the probability that symbol s_i in S will occur. The term log2(1/p_i) indicates the
amount of information contained in s_i, which corresponds to the number of bits needed to
encode s_i.
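The formula translates directly to Python (helper name illustrative):

```python
import math

def entropy(probs):
    """Entropy in bits of a source whose symbols occur with the given
    probabilities (which must sum to 1); zero-probability terms contribute 0."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)
```

For probabilities {1/2, 1/4, 1/4} the entropy is 1.5 bits, and for eight equiprobable symbols it is 3 bits, matching the intuition that 3 bits are needed per symbol.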


Question: What do you mean by JPEG? Write down the main steps required for JPEG encoder and draw
the block diagram for JPEG encoder. 8 Marks CSE-2009 ***
JPEG:
JPEG is an image compression standard that was developed by the "Joint Photographic Experts
Group". JPEG was formally accepted as an international standard in 1992. JPEG is a lossy image
compression method. It employs a transform coding method using the DCT (Discrete Cosine
Transform).
An image is a function of i and j (or conventionally x and y) in the spatial domain. The 2D DCT
is used as one step in JPEG in order to yield a frequency response which is a function F (u, v) in the
spatial frequency domain, indexed by two integers u and v.
Main steps required for JPEG encoder:
The main steps required for JPEG encoder are as follows:
Transform RGB to YIQ or YUV and subsample color.
Perform DCT on image blocks.
Apply Quantization.
Perform Zig-zag ordering and run-length encoding.
Perform Entropy coding.
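Among these steps, the zig-zag ordering is easy to make concrete: coefficients are read along anti-diagonals so that high-frequency (mostly zero) coefficients cluster at the end of the sequence, helping run-length encoding. A Python sketch (function name illustrative):

```python
def zigzag(block):
    """Return the 64 coefficients of an 8x8 block in JPEG zig-zag order."""
    n = 8
    order = sorted(
        ((i, j) for i in range(n) for j in range(n)),
        # primary key: anti-diagonal index i+j; secondary key alternates the
        # traversal direction on odd/even diagonals
        key=lambda p: (p[0] + p[1],
                       p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [block[i][j] for i, j in order]
```

Indexing an 8x8 block whose entry (i, j) holds the value 8i + j, the first few outputs are 0, 1, 8, 16, 9, 2, i.e., positions (0,0), (0,1), (1,0), (2,0), (1,1), (0,2).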
The block diagram for JPEG encoder:

Question: Write short notes on the following topics: Progressive JPEG CSE-2011
Progressive JPEG:
The JPEG standard supports numerous modes (variations). Some of the commonly used ones are:
1. Sequential Mode
2. Progressive Mode
3. Hierarchical Mode
4. Lossless Mode

Progressive JPEG delivers low quality versions of the image quickly, followed by higher quality passes
and has become widely supported in web browsers. Such multiple scans of an image are of course useful
when the speed of the communication line is low. In progressive mode, the first few scans carry only a
few bits and deliver a rough picture of what is to follow. After each additional scan, more data is received
and image quality is gradually enhanced. The advantage is that the user-end has a choice whether to
continue receiving image data after the first scan.
Progressive JPEG can be realized in one of the following two ways:
1. Spectral selection: This scheme takes advantage of the "spectral" (spatial frequency spectrum)
characteristics of the DCT coefficients: higher AC components provide detail information.
Scan 1: Encode DC and first few AC components, e.g., AC1, AC2.
Scan 2: Encode a few more AC components, e.g., AC3, AC4, AC5.
...
Scan k: Encode the last few ACs, e.g., AC61, AC62, AC63.
2. Successive approximation: Instead of gradually encoding spectral bands, all DCT coefficients are
encoded simultaneously but with their most significant bits (MSBs) first.
Scan 1: Encode the first few MSBs, e.g., Bits 7, 6, 5, 4.
Scan 2: Encode a few more less significant bits, e.g., Bit 3.
...
Scan m: Encode the least significant bit (LSB), Bit 0.
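As a toy illustration of the successive-approximation idea (real JPEG operates on quantized DCT coefficients and handles signs and refinement scans more carefully), splitting non-negative values into bit-planes from MSB to LSB can be sketched as:

```python
def bit_planes(coeffs, bits=8):
    """Split non-negative integer coefficients into bit-planes, most
    significant bit first, mimicking successive-approximation scans."""
    planes = []
    for b in range(bits - 1, -1, -1):          # bit 7 down to bit 0
        planes.append([(c >> b) & 1 for c in coeffs])
    return planes
```

For example, bit_planes([5], bits=4) returns [[0], [1], [0], [1]], i.e., bits 3, 2, 1, 0 of binary 0101; a decoder that has received only the first planes already has a coarse approximation of every coefficient.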
Question: Briefly discuss the parameters measured in Quality of services for multimedia data
transmission. 4 Marks CSE-2011 CSE-2009 ***
Parameters measured in Quality of services for multimedia data transmission:
Quality of Service (QoS) for multimedia data transmission depends on many parameters. Some of the
most important are:
Data rate: A measure of transmission speed, often measured in kbps or Mbps.
Latency (maximum frame/packet delay): the maximum time needed from transmission to reception,
often measured in milliseconds (msec).
For example, when the round-trip delay exceeds 50 msec, echo becomes a noticeable problem;
when the one-way delay is longer than 250 msec, talker overlap will occur, since each talker will
talk without knowing the other is also talking.
Packet loss or error: A measure (in percentage) of the error rate of the packetized data transmission.
Packets get lost or garbled, for example over the Internet. They may also be delivered late or in the
wrong order. Since retransmission is often undesirable, a simple error-recovery method for real-time
multimedia is to replay the last packet, hoping the error is not noticeable.
In general, for uncompressed audio/video, a packet loss below 10^-2 is tolerable. For compressed
multimedia and ordinary data, the desirable packet loss is less than 10^-7 to 10^-8.
Jitter (or delay jitter): A measure of smoothness of the audio/video playback. Technically, jitter
is related to the variance of frame/packet delays. A large buffer (jitter buffer) can be used to hold
enough frames to allow the frame with the longest delay to arrive, to reduce playback jitter. However,
this increases the latency and may not be desirable in real-time and interactive applications.
Sync skew: A measure of multimedia data synchronization, often measured in milliseconds
(msec). For good lip synchronization, the limit of the sync skew is +/- 80 msec between audio
and video. In general, +/- 200 msec is still acceptable.
Question: When is Real Time Streaming Protocol (RTSP) used? Briefly discuss its operation. 4 Marks CSE-2009 ***
When Real Time Streaming Protocol (RTSP) is used:
Streaming audio and video are audio and video data that are transmitted from a stored media server to
the client in a data stream that is decoded almost instantly. RTSP is used to set up and control such
streaming sessions between the client and the stored media server.
The operations of Real Time Streaming Protocol (RTSP):
RTSP is for communication between a client and a stored media server. Fig. 16.5 illustrates a possible
scenario of four RTSP operations:
1. Requesting presentation description: the client issues a DESCRIBE request to the Stored Media
Server to obtain the presentation description such as media types (audio, video, graphics etc),
frame rate, resolution, codec, etc. from the server.
2. Session setup: the client issues a SETUP to inform the server of the destination IP address, port
number, protocols, TTL (for multicast). The session is set up when the server returns a session
ID.
3. Requesting and receiving media: after receiving a PLAY request, the server starts to transmit
streaming audio/video data using RTP. This may be followed by a RECORD or PAUSE request. Other VCR
commands, such as FAST-FORWARD and REWIND, are also supported. During the session, the client
periodically sends an RTCP packet to the server, to provide feedback information about the QoS received.
4. Session closure: TEARDOWN closes the session.

Question: Describe the SIP Protocol in Internet Telephony. 6 Marks CSE-2011


SIP Protocol in Internet Telephony:
SIP is an application-layer control protocol in charge of the establishment and termination of
sessions in Internet telephony. These sessions are not limited to VoIP communications; they also include
multimedia conferences and multimedia distributions. SIP is a text-based protocol; it is also a
client-server protocol, different from H.323. SIP can advertise its session using email, news groups,
web pages or directories, or SAP (a multicast protocol).
The methods (commands) for clients to invoke:
INVITE: invites callee(s) to participate in a call.
ACK: acknowledges the invitation.
OPTIONS: enquires media capabilities without setting up a call.
CANCEL: terminates the invitation.
BYE: terminates a call.
REGISTER: sends user's location info to a Registrar (a SIP server).

Scenario of a SIP Session


Fig. 16.7 illustrates a scenario when a caller initiates a SIP session:
Step 1. Caller sends an "INVITE john@home.ca" to the local Proxy server P1.
Step 2. The proxy uses its DNS (Domain Name Service) to locate the server for john@home.ca
and sends the request to it.
Step 3,4. john@home.ca is currently not logged on the server. A request is sent to the nearby
location server. John's current address john@work.ca is located.
Step 5. Since the server is a Redirect server, it returns the address john@work.ca to the Proxy
server P1.
Step 6. Try the next Proxy server P2 for john@work.ca.
Step 7,8. P2 consults its Location server and obtains John's local address john doe@my.work.ca.
Step 9,10. The next-hop Proxy server P3 is contacted, it in turn forwards the invitation to where
the client (callee) is.
Step 11-14. John accepts the call at his current location (at work) and the acknowledgments are
returned to the caller.
Question: Write short notes on the following topics: RSVP CSE-2011
RSVP (Resource ReserVation Protocol)
RSVP is a setup protocol for internet resource reservation. RSVP is developed to guarantee desirable
QoS, mostly for multicast although also applicable to unicast.
A general communication model supported by RSVP consists of m senders and n receivers, possibly in
various multicast groups (e.g. Fig.16.4(a)). The most important messages of RSVP:
1. Path Message: A Path message is initiated by the sender, and contains information about the
sender and the path (e.g., the previous RSVP hop).
2. Resv message: A Resv message is sent by a receiver that wishes to make a reservation.
Main Challenges of RSVP
a. There can be a large number of senders and receivers competing for the limited network
bandwidth.
b. The receivers can be heterogeneous in demanding different contents with different QoS.
c. They can be dynamic by joining or quitting multicast groups at any time.

Figure 16.4: A Scenario of network resource reservation with RSVP.


Fig. 16.4 depicts a simple network with 2 senders (S1, S2), three receivers (R1, R2, and R3) and 4 routers
(A, B, C, D):
1. In (a), Path messages are sent by both S1 and S2 along their paths to R1, R2, and R3.
2. In (b) and (c), R1 and R2 send out Resv messages to S1 and S2 respectively to make reservations
for S1 and S2 resources. Note that from C to A, two separate channels must be reserved since R1
and R2 requested different data streams.
3. In (d), R2 and R3 send out their Resv messages to S1 to make additional requests. R3's request
was merged with R1's previous request at A and R2's was merged with R1's at C.
Question: What do you mean by MIDI message? Discuss different types of MIDI message. 4 Marks CSE-2011 ***
MIDI Message:
MIDI messages are used by MIDI devices to communicate with each other.
Structure of MIDI messages:
A MIDI message includes a status byte and up to two data bytes.
Status byte
o The most significant bit of status byte is set to 1.
o The 4 low-order bits identify which channel it belongs to (four bits produce 16 possible
channels).
o The 3 remaining bits identify the message.
The most significant bit of data byte is set to 0.
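The byte layout above can be decoded with a small hypothetical helper (not part of any MIDI library):

```python
def parse_status_byte(b):
    """Decode a MIDI channel-message status byte: the MSB must be 1;
    the next 3 bits select the message type, the low 4 bits the channel."""
    if not b & 0x80:
        raise ValueError("not a status byte (MSB is 0)")
    opcode = (b >> 4) & 0x07   # 3 message-type bits (e.g. 0b001 = Note On)
    channel = b & 0x0F         # 16 possible channels
    return opcode, channel
```

For example, status byte &H93 (binary 1001 0011) decodes to message type 1 (Note On) on channel 3, counting channels from 0.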
Different types of MIDI message:
Channel messages:
A Channel message can have up to 3 bytes: the first byte is the status byte (the opcode, as it
were); it has its most significant bit set to 1.
The 4 low-order bits identify which channel this message belongs to (for 16 possible channels).
The 3 remaining bits hold the message. For a data byte, the most significant bit is set to 0.
Voice messages:
This type of channel message controls a voice, i.e., sends information specifying which note to
play or to turn off, and encodes key pressure.
Voice messages are also used to specify controller effects such as sustain, vibrato, tremolo, and
the pitch wheel. Table 6.3 lists these operations.

Table 6.3: MIDI Voice Message


Channel mode messages:
Channel mode messages form a special case of the Control Change message and therefore all
mode messages have opcode B (the message is &HBn, or 1011nnnn).
However, a Channel Mode message has its first data byte in 121 through 127 (&H79 to &H7F).
Channel mode messages determine how an instrument processes MIDI voice messages.
Some examples include respond to all messages, respond just to the correct channel, don't
respond at all, or go over to local control of the instrument.
The data bytes have meanings as shown in Table 6.4.

System Messages:
System messages have no channel number and are meant for commands that are not channel
specific, such as timing signals for synchronization, positioning information in prerecorded MIDI
sequences, and detailed setup information for the destination device.
Opcodes for all system messages start with &HF.
System messages are divided into three classifications, according to their use:
1. System common messages
2. System real-time messages
3. System exclusive messages
System Common Message: Table 6.5 sets out these messages, which relate to timing or positioning.

Table 6.5: MIDI System Common Message


System Real-Time Messages: Table 6.6 sets out system real time messages, which are related to
synchronization

Table 6.6: MIDI system real time messages.


System exclusive messages:
System exclusive messages are included so that the MIDI standard can be extended by
manufacturers.
After the initial code, a stream of any specific messages can be inserted that apply to their own
product.
A System Exclusive message is supposed to be terminated by a terminator byte &HF7, as
specified in Table 6.5
The terminator is optional and the data stream may simply be ended by sending the status byte
of the next message.

Question: Define Multimedia System. What are the desirable features of a multimedia system? 5 Marks
CSE-2012 CSE-2011 CSE-2010 ***
Multimedia System:
A Multimedia System is a system capable of processing multimedia data and
applications. A Multimedia System is characterised by the processing, storage, generation, manipulation
and rendition of Multimedia information.
Characteristics of a Multimedia System
A Multimedia system has four basic characteristics:
1. Multimedia systems must be computer controlled.
2. Multimedia systems are integrated.
3. The information they handle must be represented digitally.
4. The interface to the final presentation of media is usually interactive.

Desirable Features for a Multimedia System

The following features are desirable (if not prerequisites) for a Multimedia System:
1. Very High Processing Power: needed to deal with large data processing and real-time
delivery of media. Special hardware is commonplace.
2. Multimedia Capable File System: needed to deliver real-time media, e.g. video/audio
streaming.
3. Special Hardware/Software: needed, e.g. RAID technology.
4. Data Representations: file formats that support multimedia should be easy to handle yet
allow for compression/decompression in real time.
5. Efficient and High I/O: input and output to the file subsystem needs to be efficient and
fast. It needs to allow for real-time recording as well as playback of data, e.g.
direct-to-disk recording systems.
6. Special Operating System: needed to allow access to the file system and process data
efficiently and quickly. It needs to support direct transfers to disk, real-time scheduling,
fast interrupt processing, I/O streaming, etc.
7. Storage and Memory: large storage units (of the order of hundreds of Tb, if not more) and
large memory (several Gb or more). Large caches and high-speed buses are also required for
efficient management.
8. Network Support: client-server systems are common, as are distributed systems.
9. Software Tools: user-friendly tools are needed to handle media, design and develop
applications, and deliver media.
Question: What do you mean by multimedia authoring? What are the functions of the authoring tools?
Name some multimedia authoring systems/tools. 2 Marks CSE-2012 ***
Multimedia authoring:
Multimedia authoring involves assembling, arranging and presenting information in the
structure of a digital multimedia production, which can include text, audio, as well as moving images.
This process requires a tool known as authorware, a program that helps in writing hypertext or
multimedia applications. Tools that provide the capability for creating a complete multimedia
presentation, including interactive user control, are called authoring programs.
Multimedia Authoring Tools:
Macromedia Flash: allows users to create interactive movies by using the score metaphor, i.e., a
timeline arranged in parallel event sequences.
Macromedia Director: uses a movie metaphor to create interactive presentations. It is very
powerful and includes a built-in scripting language, Lingo, that allows creation of complex
interactive movies.
Authorware: a mature, well-supported authoring product based on the Iconic/Flow-control
metaphor.
Quest: similar to Authorware in many ways; uses a type of flowcharting metaphor. However, the
flowchart nodes can encapsulate information in a more abstract way (called "frames") than
simply subroutine levels.
