Downloaded from Digital Engineering Library @ McGraw-Hill (www.digitalengineeringlibrary.com)
Copyright © 2004 The McGraw-Hill Companies. All rights reserved.
Any use is subject to the Terms of Use as given at the website.
Compression System Constraints and Performance Issues
Compression Technologies for Video and Audio
[Figure 11.5.7: process-flow diagram. Stage labels: I-frame only 4:2:2; any GOP 4:2:2 or 4:2:0; transcode to I-frame 4:2:2; transcode to 4:2:2; transcode to emission format.]
Figure 11.5.7 Current practice for standard-definition compressed video process flow. (From [12]. Used with permission.)
ble quality. It is also possible to transcode the incoming feed to permit a standard native storage format. However, care must be taken to ensure that the quality is not degraded.
• Editing. Cuts-only editing is simple to perform in the MPEG-2 domain in the case of I-frame-only VTRs or servers. Where more complex editing is required, the signal must be processed at video baseband. This can be achieved using either I-frame or long GOP MPEG-2 provided that the system has sufficient headroom. Where this headroom does not exist, it is necessary to use the MPEG-2 recoding parameters as defined in SMPTE 327M if MPEG-2 concatenation artifacts are to be avoided.
• Program storage. Program storage should also be made in the editing format or in the transmission format so that the number of transcoding stages is reduced to a minimum. The key options available to the program maker can therefore be summarized as 1) an I-frame-only or long GOP system with a sufficiently high data rate to allow multiple naive decoding/recoding processes, and 2) a long GOP system using lower data rates but passing forward recoding information as described in SMPTE 327M to minimize MPEG-2 concatenation artifacts.
11.5.6b SMPTE RP 213
SMPTE Recommended Practice 213 was developed to address the issues outlined in the previous section. The document specifies the structure and parameters of the data for interfacing MPEG-2 4:2:2 profile and digital audio in the professional environment [13]. The purpose of this RP is to facilitate video and audio bitstream interchange between MPEG-2 compliant equipment.
The combination of RP 213 and related documents is intended to assist the design and application of MPEG-2-based professional television equipment that facilitates bitstream interchange among different applications and over a wide set of user requirements. The RP is limited to the video and audio parameters of such a system.
RP 213 also specifies MPEG-2 operating ranges that are defined to be subsets of ISO/MPEG profiles and levels. It defines two operating ranges for standard-definition television and three operating ranges for high-definition television. All of the MPEG-2 data structures addressed in this practice are ISO/IEC 13818-2 Amendment 2 4:2:2 profile compliant and as such are decodable by MPEG-2 4:2:2 profile compliant stand-alone decoders at the appropriate level. Inasmuch as the 4:2:2 profile also requires stand-alone decoders to decode main profile structures (4:2:0), existing main profile sources can be accommodated.
Application of RP 213
The flexibility of MPEG-2 compression allows MPEG-2-based equipment to meet the diverse operational requirements of a broad range of professional television applications [13]. Although some applications might be served by choosing a specific operating point, different users have different constraints and objectives, and may choose different specific operating parameters. Cognizant of these considerations, RP 213 specifies the following:
• Operating ranges, including constrained bit rates and GOP structures
• Operating ranges created for random access and editing capability
• Spatial alignment of coded images
• Use of 48-kHz sampled digital audio
This practice describes parameter choices available in MPEG-2 and the factors to be taken into account when defining an MPEG-2-based system. Specific operating parameter choices will depend on the individual application requirements, including editing capability, storage capacity, contribution feeds, and distribution/emission bandwidth.
In making this selection for a given application environment, it is further recognized that tradeoffs among many different parameters must be considered. Such considerations include the bitstream overhead imposed by various operating range constraints, the required degree of bitstream interoperability among various types of broadcast equipment, and overall system complexity.
For audio, no single worldwide compressed standard has been adopted; various transmission systems are in use depending upon geographic area. Global audio interchange can, therefore, only be achieved by specifying a noncompressed audio format.
MPEG-2 Video Parameters
Within professional applications of MPEG-2, including the HDTV extensions to MPEG-2 as defined by SMPTE 308M, five operating ranges are defined by RP 213 (Figure 11.5.8). Separate long- and short-GOP ranges are defined for both main level and high level systems. Additional operating ranges may be added as required to meet future HDTV requirements.
Operating ranges 1 and 2 cover the MPEG-2 4:2:2P@ML options including the standard 525-line and 625-line SDTV formats.
Operating ranges 3 and 4 cover the MPEG-2 4:2:2P@HL including:
• 480-line progressive scan
• 576-line progressive scan
• 720-line progressive scan
• 1080-line interlaced scan
• 1080-line progressive scan (up to 30-Hz frame rate)
Figure 11.5.8 SMPTE operating ranges specified in RP 213. (From [13]. Used with permission.)
Operating range   Format   Bit rate             GOP structure
1                 SDTV     Up to 50 Mbits/s     Any GOP structure
2                 SDTV     Up to 50 Mbits/s     I-only coding
3A                HDTV     Up to 80 Mbits/s     Any GOP structure
3B                HDTV     Up to 175 Mbits/s    Any GOP structure
4                 HDTV     Up to 300 Mbits/s    I-only coding
Relationships among different operating ranges are illustrated in Figure 11.5.9. Operating range 2 is a subset of operating ranges 1 and 4. Operating range 1 is a subset of operating ranges 3A and 3B. Operating range 3A is a subset of operating range 3B.
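The subset relationships among the RP 213 operating ranges can be captured in a small lookup structure. The following Python sketch is illustrative only: the names `DIRECT_SUPERSETS`, `all_supersets`, and `is_subset` are hypothetical, and treating the relationships as transitive (for example, range 2 being contained in range 3B via range 1) is an assumption based on ordinary set logic, not a statement taken from RP 213.

```python
# Direct "is a subset of" relationships as stated in the text.
# Keys are operating ranges; values are the ranges that directly contain them.
DIRECT_SUPERSETS = {
    "2": {"1", "4"},
    "1": {"3A", "3B"},
    "3A": {"3B"},
    "3B": set(),
    "4": set(),
}


def all_supersets(rng):
    """Return every operating range that contains `rng`, followed transitively."""
    seen = set()
    stack = list(DIRECT_SUPERSETS[rng])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(DIRECT_SUPERSETS[parent])
    return seen


def is_subset(inner, outer):
    """True if `inner` equals `outer` or is contained in it (assumed transitive)."""
    return inner == outer or outer in all_supersets(inner)


if __name__ == "__main__":
    for rng in DIRECT_SUPERSETS:
        print(rng, "is contained in", sorted(all_supersets(rng)))
```

Under these assumptions, a device that accepts every stream in range 3B would also accept streams in ranges 3A, 1, and 2, but not necessarily those in range 4.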
Figure 11.5.9 Relationship among operating ranges in SMPTE RP 213. (From [13]. Used with permission.)
11.5.7 SMPTE Documents Relating to MPEG-2
The following sections list the primary SMPTE standards relating to MPEG-2. For additional information, visit the SMPTE Web site at http://www.smpte.org.
SMPTE 302M: Linear PCM Digital Audio in an MPEG-2 Transport Stream
This standard specifies the transport of uncompressed (linear PCM) digital audio in an MPEG-2 transport system. Some applications may require linear PCM (pulse code modulated) digital audio in conjunction with compressed video specified in the MPEG-2 4:2:2 profile. The MPEG audio standard defines compressed audio, but does not define uncompressed audio for carriage in an MPEG-2 transport system. This standard augments the MPEG standards to address the requirement for linear PCM digital audio.
SMPTE 308M: MPEG-2 4:2:2 Profile at High Level
ISO/IEC 13818-2, commonly known as MPEG-2 video, includes specification of the MPEG-2 4:2:2 profile. Based on ISO/IEC 13818-2, this standard provides additional specification for the MPEG-2 4:2:2 profile at high level. It is intended for use in high-definition television production, contribution, and distribution applications. As in ISO/IEC 13818-2, this standard defines bit-streams, including their syntax and semantics, together with the requirements for a compliant decoder for 4:2:2 profile at high level, but does not specify particular encoder operating parameters.
SMPTE 310M: Synchronous Serial Interface for MPEG-2 Digital Transport Stream
This standard describes the physical interface and modulation characteristics for a synchronous serial interface to carry MPEG-2 transport bit streams at rates up to 40 Mbits/s. It is a point-to-point interface intended for use in a low-noise environment. The low-noise environment is defined as a noise level that would corrupt no more than one MPEG-2 data packet per day at the transport clock rate. When other transmission systems (e.g., studio-to-transmitter microwave links, etc.) are interposed between devices employing this interface, higher noise levels may be encountered. In such cases, it is recommended that appropriate error correcting methods be used.
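To put the "one corrupted packet per day" criterion in numerical terms, the short Python sketch below converts it to a packet error ratio. It assumes standard 188-byte MPEG-2 transport stream packets sent back to back at the 40 Mbits/s maximum rate; the function names are hypothetical and the result is an order-of-magnitude illustration, not a figure quoted from SMPTE 310M.

```python
TS_PACKET_BYTES = 188            # size of one MPEG-2 transport stream packet
SECONDS_PER_DAY = 24 * 60 * 60


def packets_per_day(bit_rate_bps):
    """Number of transport packets carried in one day at a constant bit rate."""
    return bit_rate_bps * SECONDS_PER_DAY / (TS_PACKET_BYTES * 8)


def max_packet_error_ratio(bit_rate_bps):
    """Largest packet error ratio that still means at most one bad packet per day."""
    return 1.0 / packets_per_day(bit_rate_bps)


if __name__ == "__main__":
    rate = 40e6  # assumed: the interface running at its 40 Mbits/s maximum
    print(f"packets per day:        {packets_per_day(rate):.3e}")
    print(f"max packet error ratio: {max_packet_error_ratio(rate):.2e}")
```

At 40 Mbits/s this works out to roughly 2.3 billion packets per day, so the low-noise environment corresponds to a packet error ratio on the order of 4 in 10 billion.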
SMPTE RP 202: Video Alignment for MPEG-2 Coding
Equipment conforming to this practice will minimize artifacts in multiple generations of encoding and decoding by optimizing macroblock alignment. As MPEG-2 becomes pervasive in emission, contribution, and distribution of video content, multiple compression and decompression (codec) cycles will be required. Concatenation of codecs may be needed for production, post-production, transcoding, or format conversion. Any time video transitions to or from the coefficient domain of MPEG-2 are performed, care must be exercised in alignment of the video both horizontally and vertically as it is coded from the raster format or decoded and placed in the raster format.
SMPTE RP 204: SDTI-CP MPEG Decoder Templates
This practice defines decoder templates for the encoding of SDTI content packages (SDTI-CP) with MPEG coded picture streams.
SMPTE RP 213: MPEG-2 Operating Ranges
This practice specifies the structure and parameters of the data for interfacing MPEG-2 4:2:2 profile and digital audio in the professional environment. The purpose of this practice is to facilitate video and audio bitstream interchange between MPEG-2 compliant equipment.
SMPTE RP 217: Nonsynchronized Mapping of KLV Packets into MPEG-2 Systems Streams
This practice describes a means for mapping SMPTE metadata and other data essence, encoded in the SMPTE KLV protocol, into MPEG-2 systems streams. Use of synchronized streams and their syntax and semantics is beyond the scope of this practice.
SMPTE EG 38: MPEG-2 Operating Range Applications
The aim of this document is to provide practical guidelines to users of MPEG-2 in studio and in other professional applications. This guideline provides a system overview, detailing the elements to be considered when choosing an MPEG-2 operating range. This guideline describes how the structure and parameters defined in SMPTE RP 213 may be configured to meet a selected operating point. This is achieved by giving specific, but representative, implementation examples planned or in use around the world.
11.5.7a MPEG-2 Editing and Splicing
SMPTE 312M: Splice Points for MPEG-2 Transport Streams
This standard defines constraints on the encoding of and syntax for MPEG-2 transport streams such that they may be spliced without modifying the PES packet payload. Generic MPEG-2 transport streams, which do not comply with the constraints in this standard, may require more sophisticated techniques for splicing.
SMPTE 328M: MPEG-2 Video Elementary Stream Editing Information
This standard defines the MPEG video elementary stream (ES) information to facilitate seamless edits under defined circumstances. The video ES, as defined by the MPEG standards, is supplemented with additional information for professional studio applications. Supplementary information will be carried within the sequence header and the user data area of the video ES. This standard defines the data to be carried and the location of the data.
11.5.7b MPEG-2 Recoding
SMPTE 319M: Transporting MPEG-2 Recoding Information through 4:2:2 Component Digital Interfaces
This standard specifies an embedded transport mechanism for the MPEG-2 recoding data set as defined in SMPTE 327M for the representation of MPEG-2 recoding information in ITU-R BT.656, 4:2:2 component digital interfaces.
SMPTE 327M: MPEG-2 Video Recoding Data Set
This standard specifies the content of the picture related recoding data set for the representation of ISO/IEC 13818-2 MPEG coding information for the purpose of optimally cascading decoders and recoders at any bit rate or GOP structure. The coding information is as derived from an ISO/IEC 13818 compliant MPEG bit stream during the picture decoding process, as described in ISO/IEC 13818-2.
SMPTE 329M: MPEG-2 Video Recoding Data Set - Compressed Stream Format
This standard specifies the stream format of the MPEG-2 recoding data set for the representation of compressed ISO/IEC 13818-2 MPEG coding information, as used in applications requiring transport systems of reduced data capacity.
SMPTE 351M: Television - Transporting MPEG-2 Recoding Information through High Definition Digital Interfaces
This standard specifies an embedded transport mechanism for the MPEG-2 recoding data set as defined in SMPTE 327M for the representation of MPEG-2 recoding information on a SMPTE 274M interface and subsequently upon a SMPTE 292M bit-serial digital interface. The recoding data set is derived from an ISO/IEC 13818-1/2 compliant MPEG bitstream during the decoding process, as described in the ISO/IEC 13818-1/2 standards.
SMPTE 353M: Television - Transport of MPEG-2 Recoding Information as Ancillary Data Packets
This standard specifies the mechanism for the transport of MPEG-2 video recoding information as ancillary data packets in an ancillary data space (for example, through ITU-R BT.656 / SMPTE 259M interfaces). The video recoding information transported through this mechanism is for the purpose of preserving picture quality at re-encoding stages when cascading MPEG-2 decoders and encoders. Although the specified mechanism operates on 10-bit digital video interfaces, it is by design transparent to systems limited to 8-bit operation.
11.5.8 References
1. Epstein, Steve: "Editing MPEG Bitstreams," Broadcast Engineering, Intertec Publishing, Overland Park, Kan., pp. 37-42, October 1997.
2. Cugnini, Aldo G.: "MPEG-2 Bitstream Splicing," Proceedings of the Digital Television '97 Conference, Intertec Publishing, Overland Park, Kan., December 1997.
3. SMPTE Recommended Practice: RP 202, "Video Alignment for MPEG-2 Coding," SMPTE, White Plains, N.Y., 2000.
4. SMPTE Standard: SMPTE 312M, "Splice Points for MPEG-2 Transport Streams," SMPTE, White Plains, N.Y., 2001.
5. Ward, Christopher, C. Pecota, X. Lee, and G. Hughes: "Seamless Splicing for MPEG-2 Transport Stream Video Servers," Proceedings, 33rd SMPTE Advanced Motion Imaging Conference, SMPTE, White Plains, N.Y., 2000.
6. SMPTE Standard: SMPTE 328M-2000, "MPEG-2 Video Elementary Stream Editing Information," SMPTE, White Plains, N.Y., 2000.
7. SMPTE 327M-2000, "MPEG-2 Video Recoding Data Set," SMPTE, White Plains, N.Y., 2000.
8. SMPTE 329M-2000, "MPEG-2 Video Recoding Data Set - Compressed Stream Format," SMPTE, White Plains, N.Y., 2000.
9. SMPTE 319M-2000, "Transporting MPEG-2 Recoding Information Through 4:2:2 Component Digital Interfaces," SMPTE, White Plains, N.Y., 2000.
10. SMPTE 351M, "Transporting MPEG-2 Recoding Information through High-Definition Digital Interfaces," SMPTE, White Plains, N.Y., 2000.
11. SMPTE 353M, "Transport of MPEG-2 Recoding Information as Ancillary Data Packets," SMPTE, White Plains, N.Y., 2000.
12. SMPTE Engineering Guideline: EG 38, "MPEG-2 Operating Range Applications," Society of Motion Picture and Television Engineers, White Plains, N.Y., 2001.
13. SMPTE Recommended Practice: RP 213, "MPEG-2 Operating Ranges," Society of Motion Picture and Television Engineers, White Plains, N.Y., 2001.
Source: Standard Handbook of Video and Television Engineering
Chapter 11.6
Audio Compression Systems
Fred Wylie
Jerry C. Whitaker, Editor-in-Chief
11.6.1 Introduction
As with video, high on the list of priorities for the professional audio industry is to refine and extend the range of digital equipment capable of the capture, storage, post production, exchange, distribution, and transmission of high-quality audio, be it mono, stereo, or 5.1 channel [1]. This demand is being driven by end-users, broadcasters, film makers, and the recording industry alike, who are moving rapidly towards a "tapeless" environment. Over the last two decades, there have been continuing advances in DSP technology, which have supported research engineers in their endeavors to produce the necessary hardware, particularly in the field of digital audio data compression or, as it is often referred to, bit-rate reduction. There exist a number of real-time or, in reality, near instantaneous compression coding algorithms. These can significantly lower the circuit bandwidth and storage requirements for the transmission, distribution, and exchange of high-quality audio.
The introduction in 1983 of the compact disc (CD) digital audio format set a quality benchmark that the manufacturers of subsequent professional audio equipment strive to match or improve upon. The discerning consumer now expects the same quality from radio and television receivers. This leaves the broadcaster with an enormous challenge.
11.6.1a PCM Versus Compression
It can be an expensive and complex technical exercise to fully implement a linear pulse code modulation (PCM) infrastructure, except over very short distances and within studio areas [1]. To demonstrate the advantages of distributing compressed digital audio over wireless or wired systems and networks, consider again the CD format as a reference. The CD is a 16 bit linear PCM process, but has one major handicap: the amount of circuit bandwidth the digital signal occupies in a transmission system. A stereo CD transfers information (data) at 1.411 Mbits/s, which would require a circuit with a bandwidth of approximately 700 kHz to avoid distortion of the digital signal. In practice, additional bits are added to the signal for channel coding, synchronization, and error correction; this increases the bandwidth demands yet again. 1.5 MHz is the commonly quoted bandwidth figure for a circuit capable of carrying a CD or similarly coded linear PCM digital stereo signal. This can be compared with the 20 kHz needed for each of two circuits to distribute the same stereo audio in the analog format, a 75-fold increase in bandwidth requirements.
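The bandwidth comparison can be checked with a few lines of arithmetic. The Python sketch below uses only the figures quoted above; the constant and function names are hypothetical. It also shows, as an added observation rather than a claim from the source, that the 75-fold figure is obtained by comparing against a single 20 kHz analog circuit, while comparing against the pair of circuits gives half that ratio.

```python
CD_PCM_BANDWIDTH_HZ = 1.5e6          # commonly quoted figure for linear PCM stereo
ANALOG_CIRCUIT_BANDWIDTH_HZ = 20e3   # one analog audio circuit (one channel)


def bandwidth_ratio(digital_hz, analog_hz):
    """How many times more bandwidth the digital circuit needs."""
    return digital_hz / analog_hz


# Ratio against a single analog circuit (the 75-fold figure in the text).
ratio_per_circuit = bandwidth_ratio(CD_PCM_BANDWIDTH_HZ, ANALOG_CIRCUIT_BANDWIDTH_HZ)

# Ratio against both analog circuits taken together (2 x 20 kHz).
ratio_vs_pair = bandwidth_ratio(CD_PCM_BANDWIDTH_HZ, 2 * ANALOG_CIRCUIT_BANDWIDTH_HZ)

if __name__ == "__main__":
    print("vs. one analog circuit: ", ratio_per_circuit)
    print("vs. two analog circuits:", ratio_vs_pair)
```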
11.6.1b Audio Bit Rate Reduction
In general, analog audio transmission requires fixed input and output bandwidths [2]. This condition implies that in a real-time compression system, the quality, bandwidth, and distortion/noise level of both the original and the decoded output sound should not be subjectively different, thus giving the appearance of a lossless and real-time process.
In a technical sense, all practical real-time bit-rate-reduction systems can be referred to as "lossy." In other words, the digital audio signal at the output is not identical to the input signal data stream. However, some compression algorithms are, for all intents and purposes, lossless; they lose as little as 2 percent of the original signal. Others remove approximately 80 percent of the original signal.
Redundancy and Irrelevancy
A complex audio signal contains a great deal of information, some of which, because the human ear cannot hear it, is deemed irrelevant [2]. The same signal, depending on its complexity, also contains information that is highly predictable and, therefore, can be made redundant.
Redundancy, measurable and quantifiable, can be removed in the coder and replaced in the decoder; this process often is referred to as statistical compression. Irrelevancy, on the other hand, referred to as perceptual coding, once removed from the signal cannot be replaced and is lost, irretrievably. This is entirely a subjective process, with each proprietary algorithm using a different psychoacoustic model.
Critically perceived signals, such as pure tones, are high in redundancy and low in irrelevancy. They compress quite easily, almost totally a statistical compression process. Conversely, noncritically perceived signals, such as complex audio or noisy signals, are low in redundancy and high in irrelevancy. These compress easily in the perceptual coder, but with the total loss of all the irrelevancy content.
Human Auditory System
The sensitivity of the human ear is biased toward the lower end of the audible frequency spectrum, around 3 kHz [2]. At 50 Hz, the bottom end of the spectrum, and at 17 kHz at the top end, the sensitivity of the ear is down by approximately 50 dB relative to its sensitivity at 3 kHz (Figure 11.6.1). Additionally, very few audio signals, music- or speech-based, carry fundamental frequencies above 4 kHz. Taking advantage of these characteristics of the ear, the structure of audible sounds, and the redundancy content of the PCM signal is the basis used by the designers of the predictive range of compression algorithms.
Figure 11.6.1 Generalized frequency response of the human ear. Note how the PCM process captures signals that the ear cannot distinguish. (From [2]. Used with permission.)
Another well-known feature of the hearing process is that loud sounds mask out quieter sounds at a similar or nearby frequency. This compares with the action of an automatic gain control, turning the gain down when subjected to loud sounds, thus making quieter sounds less likely to be heard. For example, as illustrated in Figure 11.6.2, if we assume a 1 kHz tone at a level of 70 dBu, levels of greater than 40 dBu at 750 Hz and 2 kHz would be required for those frequencies to be heard. The ear also exercises a degree of temporal masking, being exceptionally tolerant of sharp transient sounds.
It is by mimicking these additional psychoacoustic features of the human ear and identifying the irrelevancy content of the input signal that the transform range of low bit-rate algorithms operate, adopting the principle that if the ear is unable to hear the sound then there is no point in transmitting it in the first place.
Figure 11.6.2 Example of the masking effect of a high-level sound. (From [2]. Used with permission.)
Quantization
Quantization is the process of converting an analog signal to its representative digital format or, as in the case with compression, the requantizing of an already converted signal [2]. This process is the limiting of a finite level measurement of a signal sample to a specific preset integer value. This means that the actual level of the sample may be greater or smaller than the preset reference level it is being compared with. The difference between these two levels, called the quantization error, is compounded in the decoded signal as quantization noise.
Quantization noise, therefore, will be injected into the audio signal after each A/D and D/A conversion, the level of that noise being governed by the bit allocation associated with the coding process (i.e., the number of bits allocated to represent the level of each sample taken of the analog signal). For linear PCM, the bit allocation is commonly 16. The level of each audio sample, therefore, will be compared with one of 2^16 or 65,536 discrete levels or steps.
Compression or bit-rate reduction of the PCM signal leads to the requantizing of an already quantized signal, which will unavoidably inject further quantization noise. It always has been good operating practice to restrict the number of A/D and D/A conversions in an audio chain. Nothing has changed in this regard, and now the number of compression stages also should be kept to a minimum. Additionally, the bit rates of these stages should be set as high as practical; put another way, the compression ratio should be as low as possible.
Sooner or later, after a finite number of A/D, D/A conversions and passes of compression coding, of whatever type, the accumulation of quantization noise and other unpredictable signal degradations eventually will break through the noise/signal threshold, be interpreted as part of the audio signal, be processed as such, and be heard by the listener.
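The effect of quantization and requantization on noise can be illustrated numerically. The Python sketch below is a simplified model under stated assumptions: it uses a plain uniform quantizer on samples normalized to the range -1.0 to 1.0, a 1 kHz test tone, and a crude word-length reduction from 16 to 8 bits as a stand-in for a compression stage. Real bit-rate-reduction codecs requantize in far more sophisticated ways, and the function names here are hypothetical.

```python
import math


def quantize(sample, bits):
    """Round a sample in [-1.0, 1.0) to the nearest of 2**bits uniform levels."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = math.floor(sample / step + 0.5)
    # Clamp to the representable signed range, e.g. -32768..32767 for 16 bits.
    index = max(-levels // 2, min(levels // 2 - 1, index))
    return index * step


def rms_difference(reference, processed):
    """Root-mean-square difference between two equal-length sample lists."""
    total = sum((r - p) ** 2 for r, p in zip(reference, processed))
    return math.sqrt(total / len(reference))


# Assumed test signal: 0.1 s of a 1 kHz sine at 44.1 kHz, peak amplitude 0.9.
signal = [0.9 * math.sin(2 * math.pi * 1000 * n / 44100) for n in range(4410)]

once = [quantize(s, 16) for s in signal]      # a single 16-bit conversion
twice = [quantize(q, 8) for q in once]        # requantized to 8 bits afterwards

noise_16 = rms_difference(signal, once)
noise_requantized = rms_difference(signal, twice)

if __name__ == "__main__":
    print("levels at 16 bits:      ", 2 ** 16)
    print("RMS error, 16-bit:      ", noise_16)
    print("RMS error, requantized: ", noise_requantized)
```

In this model the error after the second, coarser stage is far larger than after the first, which is the numerical counterpart of the advice to keep conversion and compression stages to a minimum and bit rates as high as practical.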
Sampling Frequency and Bit Rate
The bit rate of a digital signal is defined by
    sampling frequency × bit resolution × number of audio channels
The rules regarding the selection of a sampling frequency are based on Nyquist's theorem [2]. This ensures that, in particular, the lower sideband of the sampling frequency does not encroach into the baseband audio. Objectionable and audible aliasing effects would occur if the two bands were to overlap. In practice, the sampling rate is set slightly above twice the highest audible frequency, which makes the filter designs less complex and less expensive.
In the case of a stereo CD with the audio signal having been sampled at 44.1 kHz, this sampling rate produces audio bandwidths of approximately 20 kHz for each channel. The resulting audio bit rate = 44.1 kHz × 16 × 2 = 1.411 Mbits/s, as discussed previously.
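The bit-rate relationship and the Nyquist limit can be written as two small functions. This Python sketch is only a worked check of the arithmetic above; the function names are hypothetical, and the 48 kHz line is an extra illustration using the sampling rate mentioned elsewhere in this handbook for professional digital audio.

```python
def pcm_bit_rate(sampling_hz, bits_per_sample, channels):
    """Linear PCM bit rate in bits per second."""
    return sampling_hz * bits_per_sample * channels


def nyquist_limit(sampling_hz):
    """Highest audio frequency representable without aliasing."""
    return sampling_hz / 2


cd_rate = pcm_bit_rate(44_100, 16, 2)     # stereo CD
pro_rate = pcm_bit_rate(48_000, 16, 2)    # 48 kHz stereo, 16-bit (illustrative)

if __name__ == "__main__":
    print("CD bit rate:     ", cd_rate / 1e6, "Mbits/s")
    print("48 kHz bit rate: ", pro_rate / 1e6, "Mbits/s")
    print("CD Nyquist limit:", nyquist_limit(44_100), "Hz")
```

The Nyquist limit for 44.1 kHz sampling is 22.05 kHz; the usable audio bandwidth of approximately 20 kHz sits slightly below it, leaving room for the anti-aliasing filter.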