
A digital filter is a basic
building block in any Digital
Signal Processing (DSP) sys-
tem. The frequency response
of the filter depends on the value of its
coefficients, or taps. Many software
programs can compute the values of the
coefficients based on the desired fre-
quency response. These values are typi-
cally floating point numbers and they
are represented with a fairly high
degree of precision.
However, when a digital filter is imple-
mented in hardware, the designer wants
to represent the coefficients (and also
the data) with the smallest number of
bits that still gives acceptable resolution
for the numbers. This is because repre-
senting a number with excess bits
increases the size of the registers, buses,
adders, multipliers and other hardware
used to process that signal. The bigger
sizes result in a chip with a larger die
size, which translates into increased
power consumption and a higher chip
price. Thus, the bit precisions used to
represent numbers are important in the
performance of real-world signal pro-
cessing designs.
FIR digital filters
A Finite Impulse Response (FIR)
digital filter is one whose impulse
response is of finite duration. This can
be stated mathematically as
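$$h(n) = 0 \quad \text{for } n < n_1 \text{ and } n > n_2, \qquad -\infty < n_1 < n_2 < +\infty \tag{1}$$
where h(n) denotes the impulse
response of the digital filter, n is the
discrete time index, and $n_1$ and $n_2$ are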
constants. A difference equation is the
discrete time equivalent of a continuous
time differential equation. The general
difference equation for a FIR digital fil-
ter is
$$y(n) = \sum_{k=0}^{M-1} b_k\, x(n-k) \tag{2}$$
where y(n) is the filter output at discrete
time instance n, $b_k$ is the k-th feedforward
tap, or filter coefficient, and x(n-k) is the
filter input delayed by k samples. The $\Sigma$
denotes summation from k = 0 to k = M-1,
where M is the number of feedforward taps
in the FIR filter. Note that the FIR filter
output depends only on the M most recent
inputs (the current input and the previous
M-1). This feature is why the impulse
response for a FIR filter is finite.
When the input to a FIR filter is the
Kronecker delta function $\delta(n)$, the
impulse ripples through the tapped
delay line of the filter, and the output at
time k (for k = 0 to M-1) is the value of
the k-th tap. (The function is defined as
$\delta(n) = 1$ for n = 0, and $\delta(n) = 0$ for $n \neq 0$.)
Once the impulse passes through the
tapped delay line, the output of the filter
is zero. This is because the tapped delay
line is (and remains) filled with zeros.
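The tapped delay line behavior described above can be sketched in a few lines of Python. This is only an illustration: the function name `fir_filter` and the three tap values are assumptions made for the example, not values from the article.

```python
def fir_filter(taps, samples):
    """Direct-form FIR: y(n) = sum over k of b_k * x(n - k)."""
    delay_line = [0.0] * len(taps)  # tapped delay line, newest sample first
    output = []
    for x in samples:
        # Shift the new sample in and drop the oldest one.
        delay_line = [x] + delay_line[:-1]
        output.append(sum(b * d for b, d in zip(taps, delay_line)))
    return output


taps = [0.25, 0.5, 0.25]                    # hypothetical 3-tap filter (M = 3)
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]    # Kronecker delta input
response = fir_filter(taps, impulse)
print(response)  # [0.25, 0.5, 0.25, 0.0, 0.0, 0.0]
```

The first M outputs reproduce the taps in order, and every later output is zero because the delay line holds only zeros.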
Advantages of FIR filters
FIR filters are simple to design and
they are guaranteed to be bounded
input-bounded output (BIBO) stable.
By designing the filter taps to be sym-
metrical about the center tap position, a
FIR filter can be guaranteed to have lin-
ear phase. This is a desirable property
for many applications such as music
and video processing.
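One way to check the linear phase property numerically is to remove the expected linear phase term from the frequency response and confirm that what remains is purely real. The sketch below assumes a hypothetical set of five symmetric taps; the helper name `frequency_response` is also an assumption for illustration.

```python
import cmath
import math


def frequency_response(taps, omega):
    """H(e^{j*omega}) = sum over k of b_k * exp(-j*omega*k)."""
    return sum(b * cmath.exp(-1j * omega * k) for k, b in enumerate(taps))


taps = [0.1, 0.2, 0.4, 0.2, 0.1]   # hypothetical taps, symmetric about the center
delay = (len(taps) - 1) / 2        # expected delay in samples

max_imag = 0.0
for i in range(1, 50):
    omega = math.pi * i / 50
    # Undo the linear phase term exp(-j*omega*delay); a real result means
    # the phase of H is exactly linear in omega (up to sign changes).
    zero_phase = frequency_response(taps, omega) * cmath.exp(1j * omega * delay)
    max_imag = max(max_imag, abs(zero_phase.imag))

print(max_imag < 1e-12)  # True
```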
FIR filters also have a low sensitivity
to filter coefficient quantization errors.
This is an important property to have
when implementing a filter on a DSP
processor or on an integrated circuit.
IIR digital filters
Another type of digital filter is the
Infinite Impulse Response (IIR) filter.
As you may have guessed, the impulse
response of an IIR filter is of infinite
duration. Mathematically speaking, this
means that either $n_1$ or $n_2$ in (1) is
infinite. The general difference equation
for an IIR digital filter is
$$y(n) = \sum_{k=1}^{N-1} a_k\, y(n-k) + \sum_{k=0}^{M-1} b_k\, x(n-k) \tag{3}$$
where $a_k$ is the k-th feedback tap. The
left $\Sigma$ denotes summation from k = 1 to

28 0278-6648/00/$10.00 2000 IEEE IEEE POTENTIALS


FIR and IIR digital filters
The effects of finite bit precision
Louis Litwin

Digital Vision Ltd./PhotoDisc, Composite: D. Cantillo
OCTOBER/NOVEMBER 2000 0278-6648/00/$10.00 2000 IEEE 29
k = N -1 where N is the number of feed-
back taps in the IIR filter. The right
denotes summation from k = 0 to k = M
-1 where M is the number of feedfor-
ward taps.
Note that, unlike the FIR filter, the
output of an IIR filter depends on both
the M most recent inputs and the N-1
previous outputs that appear in (3). This feedback mechanism is
inherent in any IIR structure. (Feedback
occurs when a scaled version of the out-
put is fed back into the input.) It is
responsible for the infinite duration of
the impulse response.
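The feedback mechanism can be made concrete with a small Python sketch of the general IIR difference equation. The function name `iir_filter`, the list-based argument layout, and the single feedback tap value of 0.5 are assumptions chosen for illustration.

```python
def iir_filter(feedback, feedforward, samples):
    """y(n) = sum_{k>=1} a_k * y(n-k) + sum_{k>=0} b_k * x(n-k).

    feedback[0] holds a_1, feedback[1] holds a_2, and so on;
    feedforward[0] holds b_0.
    """
    x_hist = [0.0] * len(feedforward)   # x(n), x(n-1), ...
    y_hist = [0.0] * len(feedback)      # y(n-1), y(n-2), ...
    output = []
    for x in samples:
        x_hist = [x] + x_hist[:-1]
        y = sum(a * yp for a, yp in zip(feedback, y_hist))
        y += sum(b * xp for b, xp in zip(feedforward, x_hist))
        if y_hist:
            # Feed the new output back for use at later time instances.
            y_hist = [y] + y_hist[:-1]
        output.append(y)
    return output


impulse = [1.0] + [0.0] * 5
print(iir_filter([0.5], [1.0], impulse))
# [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
```

Although the input is non-zero only at n = 0, each output is fed back and scaled, so the response keeps halving instead of dropping to zero.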
How do we show that an IIR filter
can have an infinite duration impulse
response? The easiest way is to consider
a simple two-tap IIR filter with the fol-
lowing difference equation
$$y(n) = \alpha\, y(n-1) + x(n) \tag{4}$$
where $\alpha$ is the value of the single feedback
tap $a_1$ in (3) and $b_0 = 1$ is the value
of the single feedforward tap. The input
to the filter is $\delta(n)$ and the output of the
filter at various times is shown in Table
1. Note that although the input to the filter
had only one non-zero value (at time
n = 0), the output of the filter is non-zero
for all time $0 \le n < \infty$. This is due
to the feedback nature of the IIR filter.
From a mathematical standpoint, this
impulse response takes on non-zero val-
ues for an infinite duration. However,
the response of this IIR filter when real-
ized in practice will eventually die out.
This happens when the numbers
become too small to be represented with
the finite precision of the filter.
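A rough way to see the response die out is to quantize the fed-back value at every step. The sketch below assumes truncation toward minus infinity, a feedback tap of 0.5, and 7 fractional bits; none of these choices come from the article, and a different rounding rule can behave differently (for example, it may settle on a small non-zero value instead of reaching zero).

```python
import math


def truncate(value, frac_bits):
    """Keep frac_bits fractional bits, truncating toward minus infinity (assumed rule)."""
    scale = 1 << frac_bits
    return math.floor(value * scale) / scale


def nonzero_output_count(alpha, frac_bits, max_steps=1000):
    """Count non-zero outputs of y(n) = Q{alpha * y(n-1)} + x(n) for x(n) = delta(n)."""
    y = 1.0          # y(0) = x(0) = 1
    count = 0
    while y != 0.0 and count < max_steps:
        count += 1
        y = truncate(alpha * y, frac_bits)
    return count


count = nonzero_output_count(0.5, frac_bits=7)
print(count)  # 8
```

With these assumed settings the outputs are 1, 1/2, ..., 1/128, after which the next value is too small to represent and the response stays at zero.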
Advantages of IIR filters
IIR filters are useful for high-speed
designs because they typically require
fewer multiplies than FIR filters with a
comparable response. IIR filters can also be
designed to have a frequency response
that is a discrete version of the frequen-
cy response of an analog filter.
Unfortunately, IIR filters do not have
linear phase and they can be unstable if
not designed properly. IIR filters also
are very sensitive to filter coefficient
quantization errors that occur due to
using a finite number of bits to repre-
sent the filter coefficients. One way to
reduce this sensitivity is to use a cascad-
ed design. That is, the IIR filter is
implemented as a series of lower-order
IIR filters as opposed to one high-order
section. The effect of this implementa-
tion will be shown later in this article.
Poles and zeros
A detailed discussion of what poles
and zeros are is beyond the scope of this
article. The interested reader can find
out more by looking in any DSP book.
(See "Read more about it" for some possible
sources.) Only a very brief description
will be given here.
Suppose that a filter has the follow-
ing transfer function
$$H(z) = \frac{(z - z_1)(z - z_2)(z - z_3)}{(z - p_1)(z - p_2)(z - p_3)} \tag{5}$$
where $z_n$ and $p_n$ are values on the complex
plane (z-domain). The zeros of the
function are the values of z for which
H(z) equals zero. Hence, the zeros of
this function are at z = {$z_1$, $z_2$, $z_3$}.
Similarly, the poles of the function
are those values of z for which H(z)
becomes infinite. Thus, the poles of this
function are at z = {$p_1$, $p_2$, $p_3$}.
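The factored form of the transfer function is easy to evaluate directly. The following sketch uses hypothetical zero and pole locations picked only for illustration; it shows H(z) vanishing at a zero and growing very large close to a pole.

```python
def transfer_function(z, zeros, poles):
    """Evaluate H(z) as a product of (z - zero) terms over (z - pole) terms."""
    numerator = 1.0 + 0j
    for zero in zeros:
        numerator *= (z - zero)
    denominator = 1.0 + 0j
    for pole in poles:
        denominator *= (z - pole)
    return numerator / denominator


zeros = [-1.0, 1j, -1j]                  # hypothetical zero locations
poles = [0.5, 0.4 + 0.3j, 0.4 - 0.3j]    # hypothetical pole locations

at_zero = abs(transfer_function(1j, zeros, poles))
near_pole = abs(transfer_function(0.5 + 1e-6, zeros, poles))
print(at_zero)            # 0.0
print(near_pole > 1e6)    # True
```

Evaluating exactly at a pole would divide by zero, which is why the example probes a point slightly away from it.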
Fig. 1 Results for floating point FIR and IIR filters
[Figure: magnitude responses, |H(w)| (dB) versus digital frequency (Hz), and pole-zero plots, imaginary axis versus real axis, of the floating point FIR filter and the floating point IIR filter.]
Fig. 2 Results for bit accurate FIR filters
[Figure: magnitude responses of the bit accurate FIR filters, |H(w)| (dB) versus digital frequency (Hz), and the corresponding pole-zero plots, imaginary axis versus real axis, each comparing the floating point filter with the 4, 10 and 16 bit filters.]
When plotting the pole-zero plot of a
function, a circle (o) is used to denote
the location of a zero, and a cross (x) is
used to denote a pole.
Implementation
Implementing a digital filter in prac-
tice typically involves using software to
determine the filter coefficients based
on various user-defined parameters. The
filter design software usually computes
and displays the filter coefficients with
a high degree of precision. If the digital
filter can be implemented using that
same degree of precision, then the filter
will behave as predicted by the filter
design software.
In practice, only a finite number of
bits can be used to represent the digital
filter coefficients. This reduction in
each coefficient's precision causes the
frequency response of the filter to differ
from the ideal response due to coefficient
quantization errors.
When using $B_{coeff}$ bits to represent
the filter coefficients, the total number
of possible values that the filter coefficients
can take on is $2^{B_{coeff}}$. Thus,
instead of being free to take any value,
the coefficients are constrained to one
of the $2^{B_{coeff}}$ levels. The locations of
the poles and zeros of the filter are also
quantized, because they depend on the
values of the filter coefficients.
The quantization of the pole and zero
locations will typically move the poles
and zeros of the filter to locations that
are different from the ideal setting.
This can have drastic effects on the per-
formance of the filter.
For example, suppose a pole is to be
located at a distance of 0.999 from the
origin. However, the quantization of
that pole's location can cause the pole
to be moved to a location that is at a
distance of 1 from the origin (since
Q{0.999} = 1 for certain bit precisions,
where Q{.} denotes quantization). Since
filters with poles on or outside the unit
circle are not BIBO stable, quantization can
cause stable filter designs to become unstable
when actually implemented. Because of the
effects of filter coefficient quantization,
the number of bits to assign to the filter
coefficients (and for the data as well)
must be carefully chosen.
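The Q{0.999} = 1 example can be reproduced with a simple rounding quantizer. The sketch below assumes a format with `frac_bits` fractional bits and enough integer bits to hold the value 1; the function name and the two bit widths are illustrative choices, not the article's.

```python
def quantize(value, frac_bits):
    """Round to the nearest multiple of 2**-frac_bits (assumed rounding rule)."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


pole_radius = 0.999
coarse = quantize(pole_radius, 8)    # step of 1/256
fine = quantize(pole_radius, 12)     # step of 1/4096

for frac_bits, radius in ((8, coarse), (12, fine)):
    verdict = "inside the unit circle" if radius < 1.0 else "on the unit circle"
    print(frac_bits, radius, verdict)
# 8 1.0 on the unit circle
# 12 0.9990234375 inside the unit circle
```

With the coarser step, the nearest representable level to 0.999 is exactly 1, so the pole lands on the unit circle; four more fractional bits keep it inside.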
Fixed point arithmetic
Fixed point arithmetic is a typical for-
mat used to implement digital filters on
both DSP processors and in VLSI imple-
mentations. (Another popular format is
floating point arithmetic in which num-
bers are represented using two parts:
mantissa and exponent.) A very popular
form of fixed point arithmetic is the
two's complement fixed point format.
In this format, the most significant bit
(MSB) of a B-bit number indicates the sign
of the number: it carries a weight of
$-2^{B-1}$, while the lower B-1 bits carry the
usual positive binary weights. Using this
format, a B-bit number can represent signed
numbers in the range from $-2^{B-1}$ to $2^{B-1}-1$.
In two's complement arithmetic, the
negative of a binary number is formed by
inverting each bit of the number and then
adding 1 at the least significant bit (LSB).
Table 2 shows all the numbers that can be
represented with 3 bits using the two's
complement fixed point format.
Fig. 4 Results for bit accurate cascaded IIR filters
[Figure: magnitude responses of the bit accurate cascaded IIR filters, |H(w)| (dB) versus digital frequency (Hz), and the corresponding pole-zero plots, imaginary axis versus real axis, each comparing the floating point filter with the 10, 15 and 20 bit filters.]
Fig. 3 Results for bit accurate IIR filters
[Figure: magnitude responses of the bit accurate IIR filters, |H(w)| (dB) versus digital frequency (Hz), and the corresponding pole-zero plots, imaginary axis versus real axis, each comparing the floating point filter with the 10, 15 and 20 bit filters.]
Simulation results
Two ninth-order lowpass filters were
designed to demonstrate the effects of
using finite bit precisions to represent
digital filter coefficients. The first filter
is a FIR filter, and the second one is an
IIR filter. Both were designed by plac-
ing the poles and zeros to get a filter
with a lowpass response.
Figure 1 shows the frequency
response and the pole-zero plots for
these filters. These filters are referred to
as the floating point filters: The coeffi-
cients for these filters are represented in
floating point format with the full preci-
sion of the computer. The frequency
responses shown in Fig. 1 represent the
desired response. The other figures show
how using a reduced precision adversely
affects each filter's response. These filters
are referred to as bit accurate.
The coefficients for these filters are
represented using the two's complement
fixed point format with a finite number
of bits. Three different bit precisions are
used for each bit accurate filter to show
the performance for different precisions.
Figure 2 shows the results for the bit
accurate FIR filters. The three plots on
the left show the frequency response for
the floating point filter in blue and for
the bit accurate filters in red. The bit
precision of each bit accurate filter is
shown in the plot's legend.
The three plots on the right show the
pole-zero plots for the various FIR fil-
ters. Again, the blue corresponds to the
locations for the floating point filter and
the red corresponds to the locations for
the bit accurate filters.
At 16 bits, the bit accurate response
matches that of the floating point filter.
When the precision drops to 10 bits, the
location of the zeros changes. The fre-
quency response is altered slightly.
Using only 4 bits for the filter coeffi-
cients has a drastic effect on the loca-
tion of the zeros. (Some are moved into
different quadrants and even outside of
the unit circle.) The frequency response
flattens out considerably.
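The same kind of experiment can be sketched in Python. The article does not list its filter coefficients, so the example below substitutes a hypothetical 10-tap Hamming-windowed sinc lowpass and an assumed rounding quantizer with B-1 fractional bits; it reports how far the quantized frequency response strays from the floating point one.

```python
import cmath
import math


def lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc lowpass; cutoff in cycles/sample (hypothetical design)."""
    center = (num_taps - 1) / 2
    taps = []
    for k in range(num_taps):
        t = k - center
        ideal = 2 * cutoff if t == 0 else math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def quantize_taps(taps, bits):
    """Round each tap to a B-bit two's complement value with B-1 fractional bits."""
    scale = 1 << (bits - 1)
    return [min(max(round(t * scale), -scale), scale - 1) / scale for t in taps]


def response(taps, omega):
    return sum(b * cmath.exp(-1j * omega * k) for k, b in enumerate(taps))


def max_response_error(reference, test, points=256):
    grid = [math.pi * i / points for i in range(points + 1)]
    return max(abs(response(reference, w) - response(test, w)) for w in grid)


taps = lowpass_taps(10, 0.125)   # ninth-order FIR with an assumed cutoff
errors = {bits: max_response_error(taps, quantize_taps(taps, bits))
          for bits in (4, 10, 16)}
for bits, err in errors.items():
    print(f"{bits:2d} bits: max |H_float - H_quantized| = {err:.2e}")
```

Because each tap moves by at most half a quantization step, the response error is bounded by the number of taps times 2^-B, which is why the 16 bit curve is expected to sit on top of the floating point one.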
The results for the IIR filters are
shown in Fig. 3. As the bottom plot
shows, 20 bits gives a frequency
response that matches the floating point
filter's response. At 15 bits there is a
change in the position of some of the
poles. The frequency response is shifted
upward. Using 10 bits leads to a slight
change in the position of a few zeros
and significant changes in the positions
of the poles. The frequency response is
shifted upward even more.
Note that the bit accurate FIR filter's
response matches the floating point
response when using 16 bits. Also, there
is only a slight degradation in the per-
formance when using 10 bits. However,
for the IIR filter, 20 bits matches the
floating point response, but dropping
the precision to 15 bits causes the fre-
quency response to change.
These plots demonstrate that IIR filters
are particularly susceptible to finite
bit precision effects. This is due to the
feedback nature of the IIR structure, which
magnifies the coefficient quantization
effects. One way to reduce this sensitivity
is to implement the IIR filter as a
cascaded series of lower-order filters
instead of one higher-order filter.
For the plots shown in Fig. 3, the IIR
filter was implemented as one ninth-
order filter. Figure 4 shows the results
from implementing the same filter (same
zero and pole locations) as a cascade of
four second-order filters and one first-
order filter. The bit accurate frequency
responses match the response of the
floating point filter at precisions of 20
and 15 bits (contrast this to the plots in
Fig. 3). Dropping the precision to 10 bits
shows a slight movement in the location
of some poles and zeros. But the effect
on the frequency response is negligible.
From these results, we see that the
cascaded implementation requires a
lower bit precision when compared with
the precision needed for implementing
the IIR filter as a single ninth-order fil-
ter. The differences in the results shown
in Figs. 3 and 4 highlight the effect that
implementing a filter with different
structures can have on its performance.
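A small numerical experiment along these lines can be sketched as follows. It is not the article's ninth-order filter: the two pole pairs, the 6 fractional bits, and all helper names are assumptions for illustration. The same denominator is quantized once as a single expanded polynomial and once section by section, and the gain error of each version is measured on the unit circle.

```python
import cmath
import math


def quantize(value, frac_bits):
    scale = 1 << frac_bits
    return round(value * scale) / scale


def poly_mul(p, q):
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_eval(coeffs, z):
    """Horner evaluation of coeffs[0]*z^n + ... + coeffs[n]."""
    acc = 0j
    for c in coeffs:
        acc = acc * z + c
    return acc


def section(radius, angle):
    """Denominator z^2 - 2 r cos(theta) z + r^2 of one conjugate pole pair."""
    return [1.0, -2 * radius * math.cos(angle), radius * radius]


def gain_direct(coeffs, omega):
    return 1 / abs(poly_eval(coeffs, cmath.exp(1j * omega)))


def gain_cascade(secs, omega):
    z = cmath.exp(1j * omega)
    gain = 1.0
    for s in secs:
        gain /= abs(poly_eval(s, z))
    return gain


# Hypothetical pole pairs close to the unit circle.
sections = [section(0.95, 0.20 * math.pi), section(0.97, 0.25 * math.pi)]
direct = poly_mul(sections[0], sections[1])   # single fourth-order denominator

frac_bits = 6
direct_q = [quantize(c, frac_bits) for c in direct]
sections_q = [[quantize(c, frac_bits) for c in s] for s in sections]

omegas = [math.pi * i / 200 for i in range(201)]
err_direct = max(abs(gain_direct(direct_q, w) - gain_direct(direct, w)) for w in omegas)
err_cascade = max(abs(gain_cascade(sections_q, w) - gain_direct(direct, w)) for w in omegas)
print("direct form gain error:", err_direct)
print("cascade gain error:    ", err_cascade)
```

Before quantization the two forms describe the same filter. Comparing the two printed errors for different pole locations and bit widths gives a feel for how much the choice of structure matters, although a toy fourth-order case will not necessarily mirror the article's ninth-order results.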
Conclusions
The simulation results presented
show how finite bit precisions can affect
the performance of a digital filter. IIR
filters were shown to be even more sus-
ceptible to finite bit precision effects
than FIR filters. However, these effects
can be reduced by implementing the IIR
filter with a cascaded structure.
Acknowledgments
The author would like to thank Dr.
Tom "Fear & Loathing" Endres and Dr.
Samir Hulyalkar (Sarnoff Digital Com-
munications) for first getting the author
interested in the wonderful world of
finite bit precisions. Thanks also to
Tom Krauss (Purdue University) for
numerous discussions on finite bit pre-
cisions, cascaded filters, and the finan-
cial implications of the Year 2000 bug.
Read more about it
Ifeachor, E., and Jervis, B., Digital Signal Processing: A Practical Approach, Addison-Wesley, 1995.
Litwin, L., Endres, T., Hulyalkar, S., and Zoltowski, M., "The Effects Of Finite Bit Precision For A Fixed Point VLSI Implementation Of The Constant Modulus Algorithm," International Conference on Acoustics, Speech, and Signal Processing, Phoenix, AZ, Vol. 4, pp. 2013-2016, March 1999.
Lyons, R., Understanding Digital Signal Processing, Addison-Wesley, 1997.
Proakis, J., and Manolakis, D., Digital Signal Processing: Principles, Algorithms, and Applications, Prentice-Hall, 1996.
About the author
Louis Litwin is a Member of the Technical
Staff in the Corporate Research
department at Thomson Multimedia where
he works on wireless digital home networking
technology. Mr. Litwin received
his M.S. degree in Electrical Engineering
from Purdue University in 1999 and his B.S.
degree in Electrical Engineering with distinction
from Drexel University in 1997. He
was named by Eta Kappa Nu as the Alton
B. Zerby and Carl T. Koerner Outstanding
Electrical Engineering Student for 1997. His
interests include digital signal processing
and digital VLSI design. He often annoys
the IRS by computing his income taxes
using only 3 bits of precision.
Discrete time    Filter input    Filter output
instance n       x(n)            y(n)
0                1               1
1                0               $\alpha$
2                0               $\alpha^2$
3                0               $\alpha^3$
n                0               $\alpha^n$
Table 1 Output of example IIR filter for an input of $\delta(n)$
Decimal format    Fixed point format
 0                000
 1                001
 2                010
 3                011
-4                100
-3                101
-2                110
-1                111
Table 2 Two's complement fixed point format for 3 bit numbers
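The entries of Table 2 and the invert-and-add-one negation rule can be checked with a short Python sketch. The function names are assumptions made for this example.

```python
def to_twos_complement(value, bits):
    """Return the two's complement bit string of a signed integer."""
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking a negative Python int yields its two's complement bit pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def negate(bit_string):
    """Invert every bit, then add 1 at the LSB, wrapping to the same width."""
    bits = len(bit_string)
    mask = (1 << bits) - 1
    inverted = int(bit_string, 2) ^ mask
    return format((inverted + 1) & mask, f"0{bits}b")


for value in (0, 1, 2, 3, -4, -3, -2, -1):
    print(f"{value:2d} {to_twos_complement(value, 3)}")

print(negate("011"))  # 101, the pattern for -3
```

Note that negating the most negative value (100 for 3 bits) wraps back to itself, since +4 is outside the representable range.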
