
Chapter 6.

Quantization

To convert an analog signal to a digital signal, the following three procedures are
required. First, the signal is passed through a lowpass filter to prevent aliasing. Second,
the signal is sampled by a sample-and-hold circuit. Finally, the samples are quantized by
an analog to digital converter (ADC) in order to be represented in digital form as shown
in Figure 6.1.

x(t) (Analog Signal) → Anti-aliasing Filter (LPF) → Sample-and-hold Circuit → x(n) (Discrete-time Signal) → A/D Converter → x̂(n) (Digital Signal)

Figure 6.1. Typical analog to digital conversion process

Many quantization techniques are available. This chapter describes linear
quantization, nonlinear quantization, delta modulation, and sigma-delta modulation.
It also explains two efficient methods, adaptive quantization and differential
quantization, and describes Adaptive Differential Pulse Code Modulation (ADPCM).

6.1 Linear Quantization

In linear or uniform quantization, the quantization step size is fixed. The constant
quantization step size is used no matter what the instantaneous signal amplitude is.
Linear quantization with $2^m$ quantization levels is shown in Figure 6.2 where ∆ is the
quantization step size and m is the number of bits in a quantization word.

Sample value in volts [V]

 (2^m − 1)∆/2    (Positive peak value = 2^m ∆/2: loudest)
 (2^m − 3)∆/2
 ...
 3∆/2
 ∆/2             (softest)
 −∆/2            (softest)
 −3∆/2
 ...
 −(2^m − 3)∆/2
 −(2^m − 1)∆/2   (Negative peak value = −2^m ∆/2: loudest)

Figure 6.2. Constant quantization step size ∆ is used for linear quantization.
(m is the number of bits used for quantization)

For example, if the peak to peak value of a signal is 4 [V] and m = 3, then the signal may
be quantized according to the rule shown in Figure 6.3.

Digital value after quantization

(1.75) 111

(1.25) 110

(0.75) 101

(0.25) 100
Sample value before
−2 −1.5 −1 −0.5 0 0.5 1 1.5 2 quantization [V]
011 (−0.25)

010 (−0.75)

001 (−1.25)

000 (−1.75)

Figure 6.3. Three-bit linear quantization example.


(offset binary representation is used: the codes count upward from 000 at the most negative level)

In this case ∆ = 0.5 [V]. A value between 0 and 0.5 is approximated (or quantized) by
the quantization level ∆/2 (0.25) and encoded by 100. A value between 0.5 and 1.0 is
quantized by the quantization level 3∆/2 (0.75) and encoded by 101. Likewise, a value
between −2 and −1.5 is quantized by the level −7∆/2 (−1.75) and encoded by 000 and so
on. The process of sampling, quantization and encoding is referred to as the pulse code
modulation (PCM).
Keeping the step size fixed is essential for high-fidelity digital audio. One
notable example is the compact disc (CD) format. The CD format uses 16-bit linear
quantization, so the range between the negative peak and the positive peak is divided
into $2^{16}$ (65,536) uniform quantization levels.

6.2 Quantization Noise and SQNR

Suppose that the signal to be quantized has a peak-to-peak value of 2V [V] and that
the number of bits in a quantization word is m. If there are $2^m$ quantization levels, then
the quantization step size is given by

$$\Delta = \frac{2V}{2^m}. \tag{6.1}$$

As shown in Figure 6.1, the quantizer input is x(n) and the output is x̂(n). The output can
be expressed by the following equation:

$$\hat{x}(n) = x(n) + e(n) \tag{6.2}$$

where e(n) is termed the quantization noise or quantization error. The noise sequence,
e(n), is assumed to be uncorrelated with the sequence x(n). After quantization, the actual
value of the error is lost. However, the statistics of the noise are known. The noise lies
between −∆/2 and ∆/2 and is uniformly distributed. The probability density function of
the noise or error, e(n), is shown in Fig. 6.4.

[Figure: pe(e) versus e, uniform between e = −∆/2 and e = ∆/2]

Figure 6.4 Probability density function of the quantization noise e(n)

The mean of the noise is

$$\langle e(n)\rangle = m_e = \int_{-\infty}^{\infty} e\,p_e(e)\,de = \int_{-\Delta/2}^{\Delta/2} e\,\frac{1}{\Delta}\,de = 0. \tag{6.3}$$

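The noise power (which is the same as the variance because the mean is zero in this case)
is given by

$$\langle e^2(n)\rangle = \sigma_e^2 = \int_{-\infty}^{\infty} e^2\,p_e(e)\,de = \int_{-\Delta/2}^{\Delta/2} e^2\,\frac{1}{\Delta}\,de = \frac{\Delta^2}{12} = \frac{(2V)^2}{12\cdot 2^{2m}}. \tag{6.4}$$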

The signal to quantization noise ratio (SQNR) in dB is given by

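$$\begin{aligned}
\text{SQNR} &= 10\log\left(\frac{\sigma_x^2}{\sigma_e^2}\right) = 10\log\left(\frac{2^{2m}\,\sigma_x^2}{(2V)^2/12}\right) \\
&= 10\log\left(2^{2m}\right) + 10\log\left(\frac{\sigma_x^2}{V^2/3}\right) \\
&= 6m + 10\log\left(\frac{\sigma_x^2}{V^2/3}\right)\ \text{[dB]}
\end{aligned} \tag{6.5}$$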

where $\sigma_x^2$ is the signal power (or variance assuming that the mean is zero). The SQNR in dB
increases linearly with the number of bits m in the ADC. For each extra bit of resolution
in the ADC, there is an improvement of about 6 dB in the SQNR. Let us assume that the signal is
uniformly distributed between −V and V. The signal power is given by

$$\sigma_x^2 = \frac{(2V)^2}{12} = \frac{V^2}{3}. \tag{6.6}$$

Thus, the SQNR of the signal that is uniformly distributed between the negative and
positive peak values becomes

SQNRuniform = 6m [dB]. (6.7)

The dynamic range of the ADC is a measure of the range of input amplitudes for
which the ADC produces a positive SQNR. In the case when the signal is uniformly
distributed between the negative and the positive peak values, the dynamic range is
defined as the ratio of the loudest amplitude to the softest one. The dynamic range in
decibels is given by

$$DR_{uniform} = 20\log_{10}\left(\frac{2^m\,\Delta/2}{\Delta/2}\right) = 20m\log_{10}2 \approx 6m\ \text{[dB]}. \tag{6.8}$$

Note that the SQNR and the dynamic range are identical. By adding one extra bit,
dynamic range is increased by 6 dB. In the case of the CD format, the dynamic range is
about 16(6) = 96 dB. In comparison, the dynamic range of a typical cassette player is
about 60 dB and the dynamic range of the audio part of a typical hi-fi VHS VCR is about
80 dB.
For sinusoidal inputs, the dynamic range of the ADC is defined as the ratio of the
signal power of a full-scale sinusoid to the signal power of a small sinusoidal input that
results in an SQNR of 0 dB. The signal power of a full-scale sinusoid is $V^2/2$. To have an
SQNR of 0 dB, the smallest sinusoidal signal power is $\Delta^2/12$.
Thus, the dynamic range for sinusoidal input is the same as the SQNR when the
signal is a sinusoid with the amplitude V.

$$DR_{sinusoid} = \text{SQNR}_{\text{full-scale sinusoid}} = 6m + 10\log\left(\frac{V^2/2}{V^2/3}\right) = 6m + 1.76\ \text{[dB]}. \tag{6.9}$$

In other words, the dynamic range is the peak SQNR of the ADC for a sinusoidal input.

6.3 Nonlinear Quantization

When people talk over the telephone, they seldom yell into the mouthpiece all the time.
Thus, speech sample values are mostly concentrated in the soft or medium range of
amplitude. For efficient quantization of a speech signal, a nonuniform quantization step
size is often used. A smaller step size is used for a softer sound and a bigger step size is
used for a louder sound. A small difference may be noticeable when the sound is soft,
but the same difference may not be noticeable if the sound is loud.
Two methods, the µ-law and the A-law, are widely used for nonlinear quantization in
telephone systems. The µ-law is used in North America and the A-law is used in Europe.
In both cases, the sample is quantized linearly and then compressed according to the
nonlinear compression rule. The µ-law compresses a 13 to 14-bit linearly quantized
speech sample to an 8-bit word. On the other hand, the A-law compresses a 12-bit
linearly quantized speech sample to an 8-bit word. At the receiver the compressed words
are expanded for reconstruction of original speech. A special chip which performs both
the compression and the expansion is called the compander.
Figure 6.5 shows the 8-bit representation of the A-law nonlinear quantization. The
first bit is used for the sign bit, the next three bits are used for the segment identifier
which can vary between 0 and 7, and the last four bits are used to represent numbers
between 0 and 15.

| Sign bit (s) | Segment identifier (3 bits) | A B C D (4 bits) |

Figure 6.5 8-bit word for A-law nonlinear quantization

TABLE 6.1 summarizes the compression rule and the expansion rule for A-law
quantization.

Segment   12-bit original    8-bit compressed   12-bit expanded
0         s000 0000 ABCD     s000 ABCD          s000 0000 ABCD
1         s000 0001 ABCD     s001 ABCD          s000 0001 ABCD
2         s000 001A BCDx     s010 ABCD          s000 001A BCD1
3         s000 01AB CDxx     s011 ABCD          s000 01AB CD10
4         s000 1ABC Dxxx     s100 ABCD          s000 1ABC D100
5         s001 ABCD xxxx     s101 ABCD          s001 ABCD 1000
6         s01A BCDx xxxx     s110 ABCD          s01A BCD1 0000
7         s1AB CDxx xxxx     s111 ABCD          s1AB CD10 0000

TABLE 6.1. A-law compression and expansion rule


(x denotes the don’t care term which can be either 0 or 1)

As the segment identifier increases, the signal sample amplitude increases or gets louder.
For a louder signal, the effective quantization step size increases. For example, an original
sample with segment identifier 2 has a quantization step size twice that of
a sample with segment identifier 1 or 0. The µ-law is very similar to the A-law
but is more complicated. The dynamic range of the A-law PCM is 72 dB even though it
uses eight bits per sample. Eight bits in linear quantization would give a dynamic range
of only 48 dB.
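The compression and expansion rules of TABLE 6.1 can be sketched with a few bit operations. The code below follows the table only: it assumes the 12-bit input is held in sign-magnitude form with the sign in bit 11, and it does not reproduce other details of deployed A-law codecs. The function names are assumptions introduced for illustration.

```c
#include <stdint.h>

/* Compress a 12-bit sign-magnitude sample (bit 11 = sign s, bits 10..0 =
   magnitude) to the 8-bit word  s | segment | ABCD  of TABLE 6.1. */
uint8_t alaw_compress(uint16_t sample12)
{
    unsigned sign = (sample12 >> 11) & 1u;
    unsigned mag = sample12 & 0x7FFu;
    unsigned seg = 0, abcd;

    if (mag < 16u) {
        abcd = mag;                        /* segment 0: keep the 4 LSBs */
    } else {
        unsigned top = 10;                 /* find the leading 1 (bit 4..10) */
        while (((mag >> top) & 1u) == 0u) top--;
        seg = top - 3u;
        abcd = (mag >> (top - 4u)) & 0xFu; /* four bits after the leading 1 */
    }
    return (uint8_t)((sign << 7) | (seg << 4) | abcd);
}

/* Expand an 8-bit word back to 12 bits; the discarded "x" bits are
   replaced by 1 followed by 0s, as in the last column of TABLE 6.1. */
uint16_t alaw_expand(uint8_t code)
{
    unsigned sign = (code >> 7) & 1u;
    unsigned seg = (code >> 4) & 7u;
    unsigned abcd = code & 0xFu;
    unsigned mag;

    if (seg == 0u)      mag = abcd;
    else if (seg == 1u) mag = 0x10u | abcd;
    else                mag = ((0x10u | abcd) << (seg - 1u)) | (1u << (seg - 2u));
    return (uint16_t)((sign << 11) | mag);
}
```

For example, the magnitude 0000 0011 0101 lies in segment 2 with ABCD = 1010, so it compresses to 0010 1010. Expanding that word restores the discarded last bit as 1, which here happens to reproduce the original value.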

6.4 Delta Modulation (DM)

Delta modulation is obtained from a staircase approximation xq(t) of the continuous-time
signal x(t) as shown in Figure 6.6. Here ∆ is the step size of the staircase
approximation and T is the sampling interval.

[Figure: staircase approximation xq(t) tracking the continuous-time signal x(t); vertical axis marked in steps of ∆ from −2∆ to 3∆, horizontal axis t marked from T to 15T]

Figure 6.6 Staircase approximation of an analog signal for ∆ modulation.

As in Figure 6.6, the initial value of xq(t) is zero at t = 0. We keep this value for T
seconds and compare this to the actual signal value at t = T. Because x(T) is greater than
xq(0), xq(t) is increased by ∆ at t = T to follow x(t) as closely as possible. The same thing
happens at t = 2T and t = 3T. At t = 4T, xq(3T) is compared to x(4T). Because x(4T) is
smaller than xq(3T), xq(t) is decreased by ∆ at t = 4T to follow closely the original signal.
This process continues. To follow the original signal closely with the staircase
approximation, the sampling interval T needs to be kept small. The sampling rate may
therefore be much higher than twice the highest frequency of x(t). This is called oversampling.
We do not send or store the staircase approximation. Instead, the information about the
increment or decrement at each sampling instant is transmitted or stored. Only one bit is
required to quantize this information as in Figure 6.7.

Forward path:  x(t) (analog signal) → Σ (+) → x(t) − xq(t−T) → One-bit Quantizer → DM signal out (binary signal)
Feedback path: DM signal out → Decoder → ±∆ → Accumulator → xq(t−T) → Σ (−)

(a) Modulator

DM signal → Decoder → ±∆ → Accumulator → xq(t) → Lowpass Filter → x̂(t) (reconstructed analog signal)

(b) Demodulator

Figure 6.7 Delta Modulation and Demodulation

If the increment is denoted by 1 and the decrement is denoted by 0, then the example
shown in Figure 6.6 will produce one-bit quantizer output (or DM output) as shown in
TABLE 6.2. At the receiver, the staircase approximation is reconstructed from the binary
signal. The staircase signal is in turn smoothed by the lowpass filter.

Samp. Inst.   T    2T   3T   4T   5T   6T   7T   8T   9T   10T  11T  12T  13T  14T
Change        +∆   +∆   +∆   −∆   −∆   −∆   −∆   −∆   +∆   +∆   +∆   +∆   −∆   −∆
DM output     1    1    1    0    0    0    0    0    1    1    1    1    0    0

TABLE 6.2 Output binary sequence of the delta modulator
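The modulator and the accumulator stage of the demodulator can be sketched in discrete time as follows. The function names are assumptions introduced here, the input array is taken to hold the samples x(T), x(2T), and so on, and the final lowpass smoothing filter is left out.

```c
/* Delta modulator: bits[k] = 1 for an increment, 0 for a decrement.
   The staircase starts at zero, as in Figure 6.6. */
void dm_encode(const double *x, int n, double delta, unsigned char *bits)
{
    double xq = 0.0;                      /* staircase value from the previous instant */
    for (int k = 0; k < n; k++) {
        bits[k] = (x[k] > xq) ? 1 : 0;    /* one-bit quantizer */
        xq += bits[k] ? delta : -delta;   /* accumulator */
    }
}

/* Demodulator without the lowpass filter: rebuild the staircase. */
void dm_decode(const unsigned char *bits, int n, double delta, double *xq_out)
{
    double xq = 0.0;
    for (int k = 0; k < n; k++) {
        xq += bits[k] ? delta : -delta;   /* decoder (+/- delta) and accumulator */
        xq_out[k] = xq;
    }
}
```

Because the decoder repeats exactly the accumulation done in the encoder's feedback path, both sides hold the same staircase as long as the bit stream arrives without errors.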

6.5 Oversampling

There are two main reasons why oversampling is used in A/D conversion. First, with
oversampling the analog anti-aliasing filter does not need a sharp cutoff characteristic.
A simple passive RC filter instead of an op-amp based active filter can be used for
anti-aliasing. Second, the quantization noise is reduced because it spreads over a wider band.
Fig. 6.8 shows the simplified Fourier spectrum of the sampled signal whose sampling
frequency is slightly greater than twice the highest frequency of the analog signal. It also
shows the required lowpass filter response for anti-aliasing.

[Figure: spectrum of the sampled signal versus f [Hz], marked at multiples of F1 from −5F1 to 5F1, together with the required analog lowpass filter response]

Figure 6.8 Fourier spectrum of the sampled signal (Sampling frequency: F1)

Now suppose that the same analog signal is sampled at the rate of F2 which is four times
F1. The Fourier spectrum of the sampled signal is shown in Fig. 6.9. The required
lowpass filter response for anti-aliasing has much wider transition band. A simple
passive lowpass filter may be good enough for anti-aliasing.

[Figure: spectrum of the sampled signal versus f [Hz], marked from −4F1 (−F2) to 4F1 (F2), together with the required analog lowpass filter response]

Figure 6.9 Fourier spectrum of the sampled signal (Sampling frequency: F2 = 4F1)

After sampling, each sample is quantized by an ADC. The oversampled digital signal
must be downsampled later, since we do not want to keep more samples than necessary.
Before downsampling, the oversampled signal must be passed through a
digital lowpass filter for anti-aliasing. The anti-aliasing digital filter is usually an FIR
filter to ensure linear phase. The combined operation of lowpass filtering and
downsampling is known as decimation.
When each sample is quantized by an ADC, quantization noise results. Let the
quantization noise power be $\sigma_e^2$. The noise power is the same no matter what the
sampling rate is as long as the ADC resolution is fixed. However, its frequency
distribution is different because of the different sampling rates. Because the quantization
noise is assumed to be white, the noise power is uniformly distributed between −Fs/2 and Fs/2
where Fs is the sampling frequency. Thus, the power spectral density is given by

$$P_e(f) = \frac{\sigma_e^2}{F_s}\ \text{[W/Hz]} \tag{6.10}$$

Fig. 6.10 shows the power spectral density (PSD) of the quantization noise for two
different sampling frequencies.

[Figure: Pe(f) versus f [Hz]. For sampling frequency F1 the PSD is a rectangle of height σe²/F1 between −F1/2 and F1/2; for sampling frequency F2 = 4F1 it is a rectangle of height σe²/F2 between −F2/2 and F2/2.]

Figure 6.10 Quantization power spectral density for two PCM conversions
(Area of each rectangle is σe²)

If the digital lowpass filter used for decimation has the cutoff frequency F1/2, then the
decimated signal will have one fourth the quantization noise power of the original
oversampled signal because the noise power outside of the passband is eliminated. In
general, the in-band noise power (after lowpass filtering) is given by

$$\frac{F_1}{F_2}\,\sigma_e^2 = \frac{1}{2^r}\,\sigma_e^2 \tag{6.11}$$

where $2^r = F_2/F_1$ is termed the oversampling ratio. Thus, the effective SQNR is

$$\text{SQNR} = 10\log\left(\frac{\sigma_x^2}{(F_1/F_2)\,\sigma_e^2}\right) = 10\log\left(\frac{\sigma_x^2}{\sigma_e^2}\right) + 3r\ \text{[dB]} \tag{6.12}$$

For every doubling of the oversampling ratio, i.e., for every increment in r, the SQNR
improves by 3 dB, or the resolution improves by a half bit.

6.6 Sigma-Delta Modulation (One-bit Quantization)

The delta modulator described in Section 6.4 is reproduced below. Note that the
signals are in the discrete-time domain for convenience.

Forward path:  x(n) (analog signal) → Σ (+) → x(n) − xq(n−1) → One-bit ADC → DM signal out (binary signal)
Feedback path: DM signal out → DAC → ±∆ → Accumulator → xq(n−1) → Σ (−)

(a) Modulator

DM signal → DAC → ±∆ → Accumulator → xq(n) → Lowpass Filter → x̂(n) (reconstructed signal)

(b) Demodulator

Figure 6.11 Delta Modulator and Demodulator in the discrete-time domain

Because the accumulator in the demodulator is a linear system, it can be moved and
placed in front of the modulator without changing the overall performance.

Forward path:  x(n) (analog signal) → Accumulator → Σ (+) → One-bit ADC → binary signal (one-bit resolution)
Feedback path: binary signal → DAC → Accumulator → Σ (−)

(a) Modulator

Binary signal → DAC → Lowpass Filter → x̂(n) (reconstructed signal)

(b) Demodulator

Figure 6.12 Modified delta modulator and demodulator in the discrete-time domain.

Because the two accumulators in Fig. 6.12 are linear, they can be combined into one
accumulator placed after the adder and before the one-bit ADC as shown in Fig. 6.13. The resulting
modulator is referred to as the sigma-delta (Σ-∆) or delta-sigma modulator.

Forward path:  x(n) (analog signal) → Σ (+) → Accumulator → One-bit ADC → y(n) (binary signal, one-bit resolution)
Feedback path: y(n−1) → DAC → Σ (−)

Fig. 6.13 Sigma-delta modulator A/D system
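A first-order sigma-delta modulator of this form takes only a few lines to simulate. The sketch below is an illustration under stated assumptions: the input is scaled to lie between −1 and 1, the one-bit output is represented as +1 or −1, the feedback is zero before the first sample, and a plain average stands in for the decimation lowpass filter. The function names are hypothetical.

```c
/* First-order sigma-delta modulator following Fig. 6.13.
   x[] should lie within [-1, 1]; y[] receives the one-bit output as +1/-1. */
void sigma_delta_1(const double *x, int n, double *y)
{
    double acc = 0.0;    /* accumulator state */
    double fb = 0.0;     /* DAC output for y(n-1); zero before the first sample */
    for (int k = 0; k < n; k++) {
        acc += x[k] - fb;                   /* accumulate input minus feedback */
        y[k] = (acc >= 0.0) ? 1.0 : -1.0;   /* one-bit ADC */
        fb = y[k];                          /* one-bit DAC, used at the next sample */
    }
}

/* Crude stand-in for the decimation filter: the average of the bit stream. */
double mean_of(const double *y, int n)
{
    double s = 0.0;
    for (int k = 0; k < n; k++) s += y[k];
    return s / n;
}
```

For a constant input of 0.5 the output settles into the repeating pattern +1, +1, −1, +1, whose average is 0.5. The density of +1 values in the bit stream follows the input level.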

Let xi(n) be an input and xo(n) be an output of the accumulator. The input-output relation
of the accumulator in the time domain is given by

xo(n) = xo(n−1) + xi(n). (6.13)

In the z-transform domain, the relation becomes

$$X_o(z) = z^{-1}X_o(z) + X_i(z). \tag{6.14}$$

The transfer function of the accumulator is

$$H(z) = \frac{X_o(z)}{X_i(z)} = \frac{1}{1 - z^{-1}}. \tag{6.15}$$

An equivalent block diagram of the Σ-∆ modulator is obtained by replacing the
accumulator with its transfer function and the ADC with the additive quantization noise
model as shown in Fig. 6.15.

Forward path:  x(n) → Σ (+) → 1/(1 − z^{-1}) → Σ (+ e(n)) → y(n)
Feedback path: y(n) → z^{-1} → y(n−1) → first Σ (−)

Fig. 6.15 Equivalent sigma-delta modulator A/D system

One can show that

$$Y(z) = \frac{1}{1 - z^{-1}}\left[X(z) - z^{-1}Y(z)\right] + E(z) \tag{6.16}$$

where X(z), Y(z), and E(z) are z-transforms of x(n), y(n), and e(n), respectively. By
simplifying the above equation, the following is obtained.

$$Y(z) = X(z) + (1 - z^{-1})E(z). \tag{6.17}$$

Note that the resulting noise is given by

$$N(z) = (1 - z^{-1})E(z). \tag{6.18}$$

Now the noise transfer function, $H_n(z)$, is given by

$$H_n(z) = 1 - z^{-1}.$$

The frequency response is given by

$$H_n(\theta) = 1 - e^{-j\theta} = j2\,\frac{e^{j\theta/2} - e^{-j\theta/2}}{j2}\,e^{-j\theta/2} = j2\sin\left(\frac{\theta}{2}\right)e^{-j\theta/2}$$

The magnitude response is

$$|H_n(\theta)| = 2\left|\sin\left(\frac{\theta}{2}\right)\right|. \tag{6.19}$$

Because the sampling frequency was F2, the magnitude response can be given in terms of
the analog frequency, f, as (remember that θ = ω/F2 = 2πf/F2)

$$|H_n(f)| = 2\left|\sin\left(\frac{\pi f}{F_2}\right)\right|. \tag{6.20}$$

Note:

Let H(f) be the transfer function of a linear system and Pi(f) be the power
spectral density of the input of the system. The power spectral density of
the output of the system is given by

$$P_o(f) = |H(f)|^2\,P_i(f)$$

Thus, the power spectral density of the noise is given by

$$P_n(f) = |H_n(f)|^2\,\frac{\sigma_e^2}{F_2} = 2\left[1 - \cos\left(\frac{2\pi f}{F_2}\right)\right]\frac{\sigma_e^2}{F_2} \tag{6.21}$$

The power spectral density of the noise is shown in Fig. 6.16. Note that the gain at DC
is zero and the large attenuation is achieved at low frequencies. There is amplification in
higher frequencies but high frequencies are going to be removed by a digital lowpass
filter. The in-band noise power is obtained by calculating the shaded area and is given by

$$\sigma_n^2 = \int_{-0.5F_1}^{0.5F_1} P_n(f)\,df. \tag{6.22}$$

[Figure: Pn(f) versus f over −0.5F2 ≤ f ≤ 0.5F2, vertical axis marked from 0 to 4σe²/F2; the region −0.5F1 ≤ f ≤ 0.5F1 is shaded]

Figure 6.16 Power spectral density of the noise (Shaded area is the in-band noise power).

The in-band noise power at the output of a Σ-∆ modulator is approximately given by

$$\sigma_n^2 = \sigma_e^2\,\frac{\pi^2}{3}\left(\frac{F_1}{F_2}\right)^3 \quad \text{for } F_1 \ll F_2. \tag{6.23}$$

The SQNR in dB is

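$$\begin{aligned}
\text{SQNR} &= 10\log\left(\frac{\sigma_x^2}{\sigma_e^2}\right) - 10\log\left(\frac{\pi^2}{3}\right) + 30\log\left(\frac{F_2}{F_1}\right) \\
&= 10\log\left(\frac{\sigma_x^2}{\sigma_e^2}\right) - 10\log\left(\frac{\pi^2}{3}\right) + 9r\ \text{[dB]}.
\end{aligned} \tag{6.24}$$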

For every doubling of the oversampling ratio, i.e., for every increment in r, the SQNR
improves by 9 dB, or equivalently, the resolution improves by 1.5 bits. To increase the
resolution further, the second-order sigma-delta modulator as shown in Fig. 6.17, for
example, is considered.

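[Block diagram: input x(n), two cascaded accumulator stages built from adders Σ and delays z^{-1}, each stage preceded by an adder that subtracts the output fed back through the DAC; the quantization noise e(n) is added at the last adder to give the output y(n)]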

Figure 6.17 Second-order sigma-delta modulator

In this case, the SQNR becomes

$$\text{SQNR} = 10\log\left(\frac{\sigma_x^2}{\sigma_e^2}\right) - 10\log\left(\frac{\pi^4}{5}\right) + 15r\ \text{[dB]}. \tag{6.25}$$

For every doubling of the oversampling ratio, the SQNR is improved by 15 dB. To
achieve the SQNR of 96 dB as in the case of 16-bit linear quantization, the oversampling
ratio can be chosen as 64 (= $2^6$). At this rate one-bit quantization is as good as 16-bit
linear quantization. This technique can be applied to the CD player. A sequence stored
on a CD can be upsampled by 64 and interpolated. The interpolated version of the
sequence is the input to a sigma-delta modulator such as the one shown in Fig. 6.17.
The output y(n) is then converted to an analog signal. This is called the one-bit DAC. The
Super Audio Compact Disc (SACD) format uses the 1-bit Direct Stream Digital
technique, which is similar to this.

6.7 Adaptive Quantization

Speech signals are nonstationary. The standard deviation of a typical speech signal
varies with time. The quantization step size can be adapted to the signal’s dynamics.
The adaptation can be done at every sample or every few samples. Or it can be done in
longer intervals, e.g. 10-20 ms.
There are mainly two kinds of adaptation methods available: feedforward adaptation
and feedback adaptation. In the feedforward adaptation, the adaptation is computed from
the incoming signal into the encoder as shown in Figure 6.18.

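Signal path:     x(n) → Quantizer → c(n) → transmission channel → c(n) → Decoder → x̂(n)
Adaptation path: x(n) → Adaptation → ∆(n) → Quantizer; ∆(n) is also sent over the channel to the Decoder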

Figure 6.18 Feedforward Adaptive Quantizer

In this case, x(n) is the signal into the encoder, x̂(n) is the reconstructed signal at the
decoder, and c(n) is the coded binary sequence. The step size at time n is given by

$$\Delta(n) = K\sigma(n) \tag{6.26}$$

where σ(n) is the estimate of the standard deviation of the signal and K is a constant
chosen by the designer. The variance of the signal can be computed from the segment of signal
to be quantized:

$$\sigma^2(n) = \frac{1}{M}\sum_{m=0}^{M-1} x^2(n+m). \tag{6.27}$$

Note that equation (6.27) uses M future samples. This means that there will be a delay of
M and the variance is updated every M-th sample. To update the variance for every
sample and to use past samples instead of future samples, an alternative method can be
used:
$$\hat{\sigma}^2(n) = \sum_{m=0}^{M-1} \alpha^m x^2(n-m) \tag{6.28}$$

where 0 < α < 1. Equation (6.28) can be computed recursively. By taking the z-
transform of equation (6.28), one will have

$$Z\{\hat{\sigma}^2(n)\} = \sum_{m=0}^{M-1} (\alpha z^{-1})^m\,Z\{x^2(n)\} = \frac{1 - (\alpha z^{-1})^M}{1 - \alpha z^{-1}}\,Z\{x^2(n)\} \tag{6.29}$$

where Z{⋅} is the z-transform operation. By approximating the numerator by one (for
large enough M), Equation (6.29) becomes

$$(1 - \alpha z^{-1})\,Z\{\hat{\sigma}^2(n)\} = Z\{x^2(n)\} \tag{6.30}$$

The inverse z-transform of equation (6.30) is given by

$$\hat{\sigma}^2(n) = \alpha\,\hat{\sigma}^2(n-1) + x^2(n). \tag{6.31}$$

The smaller the α, the faster the quantizer can track changes in the signal. A typical value
of α is 0.9. ∆(n) is usually restricted to a range ∆min < ∆(n) < ∆max. The ratio ∆max/∆min is
typically about 100. One major drawback of feedforward adaptation is that it is necessary to
transmit information about ∆(n) as well as c(n). This transmission increases the bit rate.
On the other hand, in the feedback adaptation, the adaptation is computed from the
outgoing signal as shown in Fig. 6.19.

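Signal path:     x(n) → Quantizer → c(n) → transmission channel → c(n) → Decoder → x̂(n)
Adaptation path: at the encoder, c(n) → Adaptation → ∆(n) → Quantizer; at the decoder, c(n) → Adaptation → ∆(n) → Decoder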

Figure 6.19. Feedback adaptive quantizer.

Feedback stepsize adaptation has the advantage that no additional information needs to
be transmitted besides the quantized signal. Typically the stepsize is adapted according
to the rule

∆(n) = P×∆(n−1). (6.32)

The value of the multiplier P depends only on |c(n−1)|, the
magnitude of the codeword at the previous time instant. As an example, when the
quantizer is a 3-bit (or 8-level) quantizer as in Section 6.1, P can be taken from TABLE
6.3 shown below.

|c(n−1)| 00 01 10 11
P 0.85 1 1 1.5

TABLE 6.3 Multiplier P of the 3-Bit Quantization Stepsize

The rationale behind this multiplication is that, for small |c(n−1)|, the signal is soft and
we use P<1 to diminish the step size and achieve a finer quantization. On the other hand,
for large |c(n−1)|, P>1 because the signal is already loud and needs bigger quantization
step size.
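The rule (6.32) with the multipliers of TABLE 6.3 amounts to a table lookup. In the sketch below, the function name `feedback_step` and the limits `dmin` and `dmax` are assumptions added here to keep the step size bounded; the text does not state them for the feedback case.

```c
/* Step-size multipliers of TABLE 6.3, indexed by the two magnitude bits
   |c(n-1)| of the previous 3-bit codeword. */
static const double P_TABLE[4] = { 0.85, 1.0, 1.0, 1.5 };

/* Rule (6.32): Delta(n) = P * Delta(n-1), kept inside [dmin, dmax]. */
double feedback_step(double delta_prev, unsigned mag_prev,
                     double dmin, double dmax)
{
    double delta = P_TABLE[mag_prev & 3u] * delta_prev;
    if (delta < dmin) delta = dmin;   /* assumed lower limit */
    if (delta > dmax) delta = dmax;   /* assumed upper limit */
    return delta;
}
```

Because the decoder sees the same codewords c(n−1), it can run the identical update and stay in step with the encoder without any side information.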

6.8 Differential Quantization

In the speech signal, especially in the voiced or vowel sound, there is a relatively
smooth change from one speech sample to the next. In other words, there is considerable
correlation between adjacent samples. As a result, it is expected that the difference of
adjacent samples will have a smaller variance and dynamic range than the speech
samples themselves. This motivates the quantization of the difference d(n) = x(n) − x̃(n)
instead of the speech sample x(n), where x̃(n) is the estimate or prediction of x(n) as in
Figure 6.20.

Transmitter: d(n) = x(n) − x̃(n) → Quantizer → d̂(n) → Encoder → c(n) → channel
             x̂(n) = d̂(n) + x̃(n) → Predictor → x̃(n)
Receiver:    c(n) → Decoder → d̂(n); x̂(n) = d̂(n) + x̃(n); x̂(n) → Predictor → x̃(n)

Figure 6.20 Differential Quantization (Transmitter and Receiver)

A typical prediction rule is given by
$$\tilde{x}(n) = \sum_{m=1}^{P} \alpha_m\,\hat{x}(n-m). \tag{6.33}$$

The prediction of the incoming sample is based on the P past decoded samples. The
prediction coefficients αm are chosen to minimize the average squared prediction error.
By writing the equations around the adders of Figure 6.20, it can be shown that the
quantization error x(n) − x̂(n) of the speech signal is equal to the quantization error
d(n) − d̂(n) of the difference signal. The difference signal has a smaller variance, and so
does the corresponding quantization error. With this approach we decrease the
quantization error and increase the SNR.

6.9 Adaptive Differential Quantization

So far, differential quantization has used a fixed predictor and a fixed quantizer.
However, the characteristics of a speech signal change with time. The adaptation can be
performed on both the quantizer and the predictor. This results in adaptive differential
quantization. A system that incorporates adaptive differential quantization is called
Adaptive Differential Pulse Code Modulation (ADPCM). The CCITT G.721 ADPCM
standard employs feedback adaptation of step size and prediction. The bit rate in this case is
32 kbps, which is half that of the CCITT A-law or µ-law PCM standard. A detailed
description of the ADPCM standard can be found in CCITT Recommendation G.721 “32
kbps Adaptive Differential Pulse Code Modulation,” October 1985.

6.10 Data compression

Lossless data compression algorithms preserve all the information in the data so that
it can be reconstructed without error. In lossless data compression, compression rate is
only a modest 2:1 to 8:1, depending upon the redundancy of the information source and
compression techniques' capabilities. Lossless algorithms are mandatory for transmitting
or storing such data as computer programs, documents, medical image and numerical
information, where a single bad bit could lead to disaster.
Lossy compression techniques do not offer perfect reproduction, but can compress
data into as little as 1 percent of its uncoded length. The information recovered only
approximates the source material, but that is enough in many applications – for images
and sounds destined for human eyes and ears, for example.
Run-length encoding is effective whenever a particular character is repeated many
times in succession. Instead of repeating the character, run-length encoding uses an
escape sequence to specify it and how many times to repeat it. The repeated character is
replaced by an escape character followed by 2 bytes: the byte for the character to be

duplicated, and a byte specifying how many times to repeat it. Using run-length
encoding, the 35-byte sequence abcde000000000000000000000000000000 reduces to
abcde<Esc> 0 30 which is only 8 bytes long.
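A run-length encoder along these lines can be sketched as follows. The escape byte value 0x1B, the minimum run length of 4, the cap of 255 repeats per escape sequence, and the function name are all assumptions made here; the text fixes only the three-byte form of the escape sequence.

```c
#include <stddef.h>

#define RLE_ESC 0x1B   /* escape byte (assumed; any reserved value works) */

/* Run-length encode in[0..n-1] into out[]; returns the encoded length.
   Runs of 4 or more bytes, and any literal RLE_ESC, become
   <RLE_ESC><byte><count>; everything else is copied unchanged.
   out[] must have room for up to 3*n bytes in the worst case. */
size_t rle_encode(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i = 0, o = 0;
    while (i < n) {
        size_t run = 1;
        while (i + run < n && in[i + run] == in[i] && run < 255) run++;
        if (run >= 4 || in[i] == RLE_ESC) {
            out[o++] = RLE_ESC;              /* escape character */
            out[o++] = in[i];                /* byte to be repeated */
            out[o++] = (unsigned char)run;   /* repeat count */
        } else {
            for (size_t k = 0; k < run; k++) out[o++] = in[i];
        }
        i += run;
    }
    return o;
}
```

Shorter runs are copied as they are because an escape sequence costs three bytes, so replacing a run of three or fewer would not save anything.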

6.11 Shannon’s information theory

How much compression can we reasonably expect to get? In the late 1940s Claude
E. Shannon discovered that the extent to which a message can be compressed and then
accurately restored is limited by its entropy. Entropy is a measure of the message’s
information content or the average information of the message. The information is
expressed in bits as the base 2 logarithm of the inverse of message’s probability.

For example, suppose that at any given time one out of four possible letters is transmitted: A, B,
C, and D. The probability of transmitting A is 1/2 (or P(A) = 0.5) and the probabilities of
transmitting B, C, and D are 1/4, 1/8, and 1/8, respectively. In this case, the information
of A is

I (A) = log2(1/P(A)) = log2(2) = 1 bit

The information of the other letters is

I (B) = log2(1/P(B)) = log2(4) = 2 bits

I (C) = log2(1/P(C)) = log2(8) = 3 bits = I(D).

Note that the more probable the letter, the lower its information. Now the entropy
(average information) is computed as

Entropy = P(A)I(A) + P(B)I(B) + P(C)I(C) + P(D)I(D) = 0.5 + 0.5 + 0.375 + 0.375 = 1.75 bits.

In this case, the simplest way to encode the four letters is as follows.

A – 00; B – 01; C – 10; D – 11.

Note that two bits are required to encode the message.


Another way to encode the message is as follows:

A – 0; B – 10; C – 110; D – 111.

With this kind of encoding, any binary sequence can be uniquely decoded. For example,
a binary sequence

00110101001111010111

can be parsed as

0/0/110/10/10/0/111/10/10/111.

That will be decoded as

AACBBADBBD.
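Decoding this variable-length code needs no separators between codewords because no codeword is the beginning of another one. A minimal decoder for the code A = 0, B = 10, C = 110, D = 111 is sketched below; the function name and the error convention (returning −1) are assumptions introduced here.

```c
#include <stddef.h>

/* Decode a string of '0'/'1' characters using the prefix code
   A = 0, B = 10, C = 110, D = 111. Returns the number of symbols written
   to out (which is NUL-terminated), or -1 on a malformed input. */
int prefix_decode(const char *bits, char *out)
{
    int ones = 0;      /* consecutive 1s seen in the current codeword */
    int count = 0;
    for (size_t i = 0; bits[i] != '\0'; i++) {
        if (bits[i] == '0') {
            out[count++] = (char)('A' + ones);   /* 0 -> A, 10 -> B, 110 -> C */
            ones = 0;
        } else if (bits[i] == '1') {
            if (++ones == 3) {                   /* 111 -> D */
                out[count++] = 'D';
                ones = 0;
            }
        } else {
            return -1;                           /* not a binary digit */
        }
    }
    out[count] = '\0';
    return (ones == 0) ? count : -1;             /* -1 if a codeword is cut off */
}
```

The decoder only has to count consecutive 1s: a 0 ends the codeword, and a third 1 ends it as well. The output buffer must hold one character per input bit plus the terminator in the worst case.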

It looks like we need more bits to encode the message. However, in this particular
example, the average number of bits required to encode the message is the same as the entropy.
Hence, this kind of encoding is called entropy coding. One systematic way to encode
a message so that the average number of bits approaches the entropy is Huffman coding,
which was developed by David Huffman as part of a class assignment at MIT in 1951. In
general, the average bit rate of Huffman coding is greater than or equal to the entropy.

6.12 Huffman Coding

Suppose symbols and their corresponding probabilities are given below.

A - 0.4        B - 0.2        C - 0.15
D - 0.1        E - 0.1        F - 0.05

Huffman code is derived from a binary tree that is built corresponding to the symbol
probabilities.

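The tree is built by repeatedly merging the two least probable nodes (branch labels in brackets):

E (0.1)  + F (0.05)  →  (0.15)    [E: 0, F: 1]
D (0.1)  + (0.15)    →  (0.25)    [D: 0, (0.15): 1]
B (0.2)  + C (0.15)  →  (0.35)    [B: 0, C: 1]
(0.35)   + (0.25)    →  (0.6)     [(0.35): 0, (0.25): 1]
A (0.4)  + (0.6)     →  (1.0)     [A: 0, (0.6): 1]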

The resulting codes are as follows:

A: 0 B: 100 C: 101
D: 110 E: 1110 F: 1111

Example
(a) Find the average length of the codeword.
Average code length = (.4)(1) + (.2)(3) + (.15)(3) + (.1)(3) + (.1)(4) + (.05)(4)
= 2.35 bits
(b) Find the entropy.
Entropy = (.4)(−log2 0.4) + (.2)(−log2 0.2) + (.15)(−log2 0.15) + (.1)(−log2 0.1)
+ (.1)(−log2 0.1) + (.05)(−log2 0.05)
= (.4)(1.322) + (.2)(2.322) + (.15)(2.737) + (.1)(3.322) + (.1)(3.322) + (.05)(4.322)
= 2.2843 bits
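The tree construction can be sketched as a routine that returns only the codeword lengths, which is all that is needed to compute the average length. This is an illustration under assumptions: the function name, the fixed limit `MAX_SYMBOLS`, and the simple search for the two smallest nodes are choices made here. When probabilities tie, the individual lengths may come out differently from the codes listed above, but the average length is the same.

```c
#define MAX_SYMBOLS 64

/* Compute Huffman codeword lengths for probabilities p[0..n-1]
   (2 <= n <= MAX_SYMBOLS) by repeatedly merging the two least probable
   nodes. len[i] receives the codeword length of symbol i. */
void huffman_lengths(const double *p, int n, int *len)
{
    double w[2 * MAX_SYMBOLS];      /* node weights: leaves, then merged nodes */
    int parent[2 * MAX_SYMBOLS];
    int alive[2 * MAX_SYMBOLS];
    int nodes = n;

    for (int i = 0; i < n; i++) { w[i] = p[i]; parent[i] = -1; alive[i] = 1; }

    for (int merge = 0; merge < n - 1; merge++) {
        int a = -1, b = -1;         /* indices of the two smallest live nodes */
        for (int i = 0; i < nodes; i++) {
            if (!alive[i]) continue;
            if (a < 0 || w[i] < w[a]) { b = a; a = i; }
            else if (b < 0 || w[i] < w[b]) { b = i; }
        }
        w[nodes] = w[a] + w[b];     /* new internal node */
        parent[nodes] = -1;
        alive[nodes] = 1;
        parent[a] = parent[b] = nodes;
        alive[a] = alive[b] = 0;
        nodes++;
    }

    for (int i = 0; i < n; i++) {   /* codeword length = depth of the leaf */
        len[i] = 0;
        for (int j = i; parent[j] >= 0; j = parent[j]) len[i]++;
    }
}
```

For the six probabilities of this section, weighting the returned lengths by the probabilities gives 2.35 bits, in agreement with part (a) of the example.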

Note

The WAVE file is the Windows standard file for recording and playing a quantized
signal using Sound Blaster compatible cards. The header, which is 44 bytes long,
contains the information about stereo or mono recording, 16-bit or 8-bit quantization, A-
law or µ-law compression, sampling rate, and so on. The following program converts an
8-bit binary WAVE file to a decimal data file.
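#include <stdio.h>

int main(void)
{
    int n, c;
    double speech;
    FILE *in, *out;

    in = fopen("input.wav", "rb");
    out = fopen("output.dat", "w");
    if (in == NULL || out == NULL)          /* Stop if a file cannot be opened */
    {
        fprintf(stderr, "Cannot open input.wav or output.dat\n");
        return 1;
    }
    for (n = 0; n < 44; n++) getc(in);      /* Skip the 44-byte header */
    while ((c = getc(in)) != EOF)           /* Read one byte at a time until end of file */
    {
        speech = c - 128.;                  /* Make the mean value of speech zero */
        fprintf(out, "%f\n", speech);
    }
    fclose(in);
    fclose(out);
    return 0;
}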

Computer Assignment 4

Compute and plot a magnitude spectrum of the first 1024 points of the wave file,
C:\WINDOWS\ringin.wav, or any waveform you recorded. Use FFT function.

C program to compute the magnitude spectrum of speech signals

The following C program is to compute the magnitude spectrum of a speech signal.


#include <stdio.h>
#include <math.h>
#include "fft.cpp"
#define N 1024
void fft(double xr[], double xi[], int npt, int inv);

int main(void)
{
    int n;
    double mag, xr[N], xi[N];
    FILE *in, *out;

    in = fopen("ah.wav", "rb");
    out = fopen("magnit.dat", "w");
    if (in == NULL || out == NULL) return 1;    /* Stop if a file cannot be opened */
    for (n = 0; n < 44; n++) getc(in);          /* Skip the 44-byte header */
    for (n = 0; n < N; n++)
    {
        xr[n] = getc(in) - 127.5;   /* Make the mean value of speech zero */
        xi[n] = 0;                  /* Make the imaginary part zero */
    }
    fft(xr, xi, N, 0);              /* Forward FFT, computed in place */
    for (n = 0; n < N; n++)
    {
        mag = xr[n]*xr[n] + xi[n]*xi[n];    /* Squared magnitude */
        mag = 10.*log10(mag);               /* Convert to dB */
        fprintf(out, "%f\n", mag);
    }
    fclose(in);
    fclose(out);
    return 0;
}

[Figure: two plots, each with a vertical axis from 0 to 100 and a horizontal axis from 0 to 1200]

(a) Spectrum of “ah” sound

(b) Spectrum of “oh” sound

PROBLEMS

6.1 The probability density function of the noise, e(n), is given below.

[Figure: pe(e) versus e, with the points a and b marked on the e axis]

(a) Find the mean of the noise.


(b) Find the variance of the noise.

6.2 The signal to be quantized has uniform distribution and is sampled at 8 kHz. 8-bit
linear quantization is used for each sample.

(a) Find the signal to quantization noise ratio.


(b) Find the bit rate of the digital signal.

6.3 An analog signal is to be quantized and transmitted over a digital system with a
dynamic range of at least 58 dB. The analog signal has an absolute bandwidth of
1,000Hz and an amplitude range of −5 to 5 V.

(a) Determine the minimum sampling rate needed.


(b) Determine the number of bits needed for quantization.
(c) Determine the quantization step size.
(d) Determine the minimum bit rate required in the digital system to transmit the
signal.

6.4 Assume that the input to a delta modulator is $0.4t^3 - t$ [V]. The step size of the
DM is 0.1 [V] and the sampler operates at 10 samples/sec. Over a time interval of 0
to 2 sec, sketch the input waveform and the staircase approximation. Find also the
delta modulator output.

6.5 A binary sequence is 1110101000100111. Sketch the resulting analog waveform
that appears at the delta demodulator output.

6.6 Consider the first order Σ-∆ modulator shown in Fig. 6.15.

(a) Show that the in-band quantization noise power is given by

$$\frac{2\sigma_e^2}{\pi}\left(\frac{\pi F_1}{F_2} - \sin\frac{\pi F_1}{F_2}\right)$$

(b) Using the Taylor series expansion of the sine function and assuming that F1 <<
F2, show that the in-band noise power is simplified to

$$\sigma_e^2\,\frac{\pi^2}{3}\left(\frac{F_1}{F_2}\right)^3$$

Taylor series: $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$

6.7 Suppose symbols and their corresponding probabilities are given below.

A − 0.4 B − 0.4 C – 0.2

(a) Find the entropy of the symbols.


(b) Find the average length of the codeword.

6.8 Find the Huffman code for the following symbols.

A - 0.4 B - 0.2 C - 0.2


D - 0.1 E - 0.1
