
When computational resources are limited, especially multipliers, distributed arithmetic (DA) is used in lieu of the
typical multiplier-based filtering structures. Several attempts have been made to accelerate updating the memory, but
at the expense of additional memory usage and of convergence speed. DA is commonly used for signal processing
algorithms where computing the inner product of two vectors comprises most of the computational workload. This
type of computing profile describes a large portion of signal processing algorithms, so the potential usage of
distributed arithmetic is tremendous.
Adaptive Filters using Distributed Arithmetic
Another important signal processing area is adaptive filtering. Adaptive filtering is extensively used in several signal
processing applications including acoustic echo cancellation, signal de-noising, sonar signal processing, clutter
rejection in radars, and channel equalization.
LUT BASED MULTIPLIER FOR DIGITAL FIR FILTER
This chapter describes the complete structure of the proposed LUT-based multiplier for the implementation of a
digital FIR filter.
3.1 Introduction
The memory-based implementation of a digital FIR filter offers low area utilization and high throughput.
DA-based and LUT-based multipliers are two variants of memory-based multiplication. Both use look-up tables
(LUTs) for their operation and are well suited to high-speed hardware implementation of FIR filters, provided the
filter coefficients are fixed. The inner-product terms are stored in a LUT. The memory size of the
LUT-multiplier-based implementation increases exponentially with the word length of the input values, while that of
the DA-based approach increases exponentially with the inner-product length. The proposed LUT-based multiplier
has several advantages for digital FIR filter design.
When synthesized with the Xilinx synthesis tool, this multiplier occupies less area and has less combinational
delay than non-memory-based multipliers. The proposed structure uses a LUT that stores only the odd multiples of
the filter coefficient; the remaining even multiples are obtained by shifting, so the area utilization is lower than
that of the DA-based multiplier.
3.2 Memory Based Multiplication Using LUT
The basic principle of memory-based multiplication is depicted in Fig. 3.1.

Fig 3.1 Memory based Multiplier
Memory-based multiplication is done with a fixed filter coefficient. Let A be a fixed coefficient and X be an input
word to be multiplied with A. If we assume X to be an unsigned binary number of word-length L, there can be $2^L$
possible values of X and, accordingly, $2^L$ possible values of the product $C = A \cdot X$. Therefore, for the
conventional implementation of memory-based multiplication, a memory unit of $2^L$ words is required as a
look-up table consisting of pre-computed product values corresponding to all possible values of X. The product word
$A \cdot X_i$, for $0 \le X_i \le 2^L - 1$, is stored at the memory location whose address is the same as the binary
value of $X_i$, so that if the L-bit binary value of $X_i$ is used as the address for the memory unit, the
corresponding product value is read out from the memory. Although the $2^L$ possible values of X correspond to $2^L$
possible values of the product $C = A \cdot X$, not all of these words need to be stored, as discussed next.
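The conventional scheme described above can be sketched in a few lines of Python. This is only an illustrative software model, not the hardware design; the coefficient value `A = 13` and the helper names `build_full_lut` and `lut_multiply` are assumptions made for the example.

```python
def build_full_lut(a, word_length):
    """Pre-compute A*X for every unsigned L-bit input X (2**L words)."""
    return [a * x for x in range(1 << word_length)]


def lut_multiply(lut, x):
    """Multiply by reading the word stored at address X."""
    return lut[x]


A = 13   # hypothetical fixed coefficient
L = 4    # input word length in bits

full_lut = build_full_lut(A, L)

# The table holds one word per possible input value.
assert len(full_lut) == 2 ** L
# Every read-out equals the arithmetic product.
assert all(lut_multiply(full_lut, x) == A * x for x in range(2 ** L))
```

The table size doubles with every extra input bit, which is the cost the proposed scheme sets out to reduce.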
3.2.1 Proposed LUT Based Multiplier
The general memory-based multiplication using a LUT is shown in Figure 3.1, where the LUT needs to store $2^L$
words. In the proposed LUT-based multiplier, the LUT needs to store only $2^L/2$ words, corresponding to the odd
multiples of A. One of the remaining product words is zero, and the other $2^L/2 - 1$ are even multiples of A, which
can be derived by left-shift operations on one of the odd multiples of A.
Table 3.1 LUT words and product values for input word length L=4 where s0, s1 are control inputs of logarithmic
barrel shifter.
As Table 3.1 shows, there are eight memory locations for the eight odd multiples $A \times (2i + 1)$, which are
stored as $P_i$ for i = 0, 1, 2, ..., 7. The even multiples 2A, 4A and 8A are derived by left-shift operations of A.
Similarly, 6A and 12A are derived by left shifting 3A, while 10A and 14A are derived by left shifting 5A and 7A,
respectively. The address X = (0000) corresponds to $A \cdot X = 0$, which can be obtained by resetting the LUT
output. Similarly, for an input multiplicand of word-size L, only $2^L/2$ odd multiple values need to be stored in
the memory core of the LUT, while the other $2^L/2 - 1$ non-zero values can be derived by left-shift operations of
the stored values.
Based on the above, a LUT for the multiplication of an L-bit input with a W-bit coefficient is designed by the
following strategy:
A memory unit of ($2^L/2$) words of (W+L)-bit width is used to store all the odd multiples of A.
A barrel shifter for producing a maximum of (L − 1) left shifts is used to derive all the even multiples of A.
The L-bit input word is mapped to an (L − 1)-bit LUT address by an encoder.
The control bits for the barrel shifter are derived by a control circuit to perform the necessary shifts of the LUT output.
Besides, a RESET signal is generated by the same control circuit to reset the LUT output when X = 0.
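The strategy above can be illustrated with a short behavioural model in Python. It is a sketch under stated assumptions rather than the hardware itself: the coefficient `A = 13` and the function names are hypothetical, and the encoder is modelled arithmetically (odd part of X) instead of with gates.

```python
def split_odd_shift(x):
    """Write a non-zero x as odd * 2**shift and return (odd, shift)."""
    shift = 0
    while x % 2 == 0:
        x //= 2
        shift += 1
    return x, shift


def build_odd_lut(a, word_length):
    """Store only the odd multiples A*(2i+1), i = 0 .. 2**(L-1) - 1."""
    return [a * (2 * i + 1) for i in range(1 << (word_length - 1))]


def odd_lut_multiply(lut, x):
    """Return A*x using the odd-multiple LUT, a left shift, and a reset."""
    if x == 0:                      # RESET case: the product is zero
        return 0
    odd, shift = split_odd_shift(x)
    address = (odd - 1) // 2        # encoder: (L-1)-bit LUT address
    return lut[address] << shift    # barrel shifter: at most L-1 shifts


A, L = 13, 4                        # hypothetical coefficient, 4-bit input
odd_lut = build_odd_lut(A, L)

# Half the words of the conventional table reproduce every product.
assert len(odd_lut) == 2 ** L // 2
for x in range(1 << L):
    assert odd_lut_multiply(odd_lut, x) == A * x
```

For L = 4 the largest shift occurs at X = 8, which is A shifted left three times, in line with the (L − 1) bound stated above.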
3.2.1.1 Structure of Proposed LUT Based Multiplier
The proposed LUT-based multiplier for input word-size L = 4 is shown in Fig. 3.2. It consists of a memory
array of eight words of (W+4)-bit width and a 3-to-8 line address decoder, along with a NOR cell, a barrel shifter, a
4-to-3 bit encoder to map the 4-bit input operand to the 3-bit LUT address, and a control circuit for generating the
control word (s0, s1) for the barrel shifter and the RESET signal for the NOR cell.
Figure 3.2 Proposed LUT multiplier for L= 4
3.2.1.2 Structure of 4- to- 3 bit Encoder
The encoder receives a four-bit input word (x3, x2, x1, x0) and maps it onto the three-bit address word (d2, d1, d0)
according to the logical relations below, where an overbar denotes the logical complement:

$$d_0 = (x_0 \cdot x_1) + (x_1 \cdot x_2) + (\overline{x_0} \cdot x_2 \cdot x_3) \tag{3.1(a)}$$

$$d_1 = (x_0 \cdot x_2) + (\overline{x_0} \cdot x_1 \cdot x_3) \tag{3.1(b)}$$

$$d_2 = x_0 \cdot x_3 \tag{3.1(c)}$$
The structure of the 4-to-3 bit encoder that maps the 4-bit input operand to the 3-bit LUT address according to the
above equations is given in Figure 3.3.

Figure 3.3 4-to-3 bit encoder
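The encoder mapping can be checked exhaustively in software. The sketch below assumes sum-of-products expressions for d0, d1 and d2 that are consistent with the odd-multiple mapping described for Table 3.1 (address i selects the word for (2i + 1)A); the function name `encode` is hypothetical, and the gate-level form in Figure 3.3 may differ.

```python
def encode(x3, x2, x1, x0):
    """4-to-3 bit encoder model; returns (d2, d1, d0) for a 4-bit input."""
    n0 = 1 - x0                                   # complement of x0
    d0 = (x0 & x1) | (x1 & x2) | (n0 & x2 & x3)
    d1 = (x0 & x2) | (n0 & x1 & x3)
    d2 = x0 & x3
    return d2, d1, d0


# Check every non-zero input: the address must index the odd part of X.
for x in range(1, 16):
    x3, x2, x1, x0 = [(x >> k) & 1 for k in (3, 2, 1, 0)]
    d2, d1, d0 = encode(x3, x2, x1, x0)
    odd = x
    while odd % 2 == 0:
        odd //= 2
    assert 4 * d2 + 2 * d1 + d0 == (odd - 1) // 2
```

The input X = 0 is excluded from the check because its product is produced by the RESET path rather than by a LUT read.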
The decoder takes the 3-bit address from the input encoder and generates 8 word-select signals, $w_i$ for
$0 \le i \le 7$, to select the referenced word from the memory array.
3.2.1.3 Structure of Control Circuit and Logarithmic Barrel Shifter
A barrel shifter provides the left shifts needed to derive all the even multiples of the filter coefficient.
The control bits for the barrel shifter are derived by a control circuit to perform the necessary shifts of
the LUT output. Besides, a RESET signal is generated by the same control circuit to reset the LUT output when
X = 0. As Figure 3.2 shows, the output of the memory array is either $A \cdot X$ or its sub-multiple, in
bit-inverted form, depending on the value of X. From Table 3.1, the LUT output must be shifted one location to the
left when the input operand X is one of the values {(0 0 1 0), (0 1 1 0), (1 0 1 0), (1 1 1 0)}. Two left shifts are
required if X is either (0 1 0 0) or (1 1 0 0). Only when the input word X = (1 0 0 0) are three shifts required. For
all other possible input operands, no shifts are required. Since the maximum number of left shifts required on the
stored word is three, a two-stage logarithmic barrel shifter is adequate to perform the necessary left-shift
operations. Figure 3.5 shows the logarithmic barrel shifter. The number of shifts to be performed on the output of
the LUT and the control bits s0 and s1 for different values of X are shown in Table 3.1. The control circuit is shown
in Figure 3.4. It generates the control bits given by
3.2(a)
3.2(b)

Figure 3.4 Control circuit
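The shift counts listed above fix what the control bits must achieve. The following Python sketch uses one set of Boolean expressions that reproduces those counts; the expressions are an assumption of this example (they are not taken from Equations 3.2(a), 3.2(b) and 3.3), as is the convention that s0 enables a shift by one position and s1 a shift by two.

```python
def control_bits(x3, x2, x1, x0):
    """Return (s1, s0, reset) for a 4-bit input word (assumed logic)."""
    n0, n1, n2, n3 = 1 - x0, 1 - x1, 1 - x2, 1 - x3
    s0 = n0 & (x1 | n2)          # stage-1 control: shift by one position
    s1 = n0 & n1                 # stage-2 control: shift by two positions
    reset = n0 & n1 & n2 & n3    # active-high only when X = 0
    return s1, s0, reset


# Shift counts stated in the text for L = 4; every other input needs none.
expected_shifts = {0b0010: 1, 0b0110: 1, 0b1010: 1, 0b1110: 1,
                   0b0100: 2, 0b1100: 2, 0b1000: 3}

for x in range(1, 16):
    x3, x2, x1, x0 = [(x >> k) & 1 for k in (3, 2, 1, 0)]
    s1, s0, reset = control_bits(x3, x2, x1, x0)
    assert s0 + 2 * s1 == expected_shifts.get(x, 0)
    assert reset == 0

assert control_bits(0, 0, 0, 0)[2] == 1   # RESET asserted for X = 0
```

For X = 0 the values of s0 and s1 do not matter, because the RESET path forces the shifter input to zero.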
Figure 3.5 shows the logarithmic barrel shifter, which produces the left shifts used to derive all the even
multiples of A. It consists of two stages of 2-to-1 line bit-level multiplexers with inverted output, where each of
the two stages involves (W+4) 2-input AND-OR-INVERT (AOI) gates. The control bits $(s_0, \overline{s_0})$ and
$(s_1, \overline{s_1})$ are fed to the AOI gates of stage 1 and stage 2 of the barrel shifter, respectively.
Since each stage of AOI gates performs inverted multiplexing, after two stages of inverted multiplexing the outputs
with the desired number of shifts are produced by the barrel shifter in the usual un-inverted form.
Figure 3.5 Logarithmic barrel shifter
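The effect of two inverting multiplexer stages can be modelled at word level as follows. This is a behavioural sketch, not a gate netlist: the function name, the 8-bit width and the choice to fill the vacated bits of stage 2 with ones (so that they read as zeros after the second inversion) are assumptions of the example.

```python
def barrel_shift(word, s0, s1, width):
    """Two-stage logarithmic left shifter with inverting multiplexers."""
    mask = (1 << width) - 1

    # Stage 1: select word or word << 1, then invert the selected word.
    chosen1 = (word << 1) & mask if s0 else word & mask
    stage1 = ~chosen1 & mask

    # Stage 2 operates on inverted data. Vacated low bits are filled with
    # ones (assumption) so they become zeros after the second inversion.
    chosen2 = ((stage1 << 2) | 0b11) & mask if s1 else stage1
    return ~chosen2 & mask


WIDTH = 8   # (W + 4) bits for a hypothetical 4-bit coefficient

# Two inversions cancel, leaving a plain left shift by s0 + 2*s1 positions.
for word in range(32):
    for s0 in (0, 1):
        for s1 in (0, 1):
            expected = (word << (s0 + 2 * s1)) & ((1 << WIDTH) - 1)
            assert barrel_shift(word, s0, s1, WIDTH) == expected
```

Because the two inversions cancel, no separate inverter stage is needed at the shifter output.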
3.2.1.4 Structure of RESET Circuit
The input X = (0 0 0 0) corresponds to multiplication by X = 0, which results in the product value
$A \cdot X = 0$. Therefore, when the input operand word X = (0 0 0 0), the output of the LUT must be reset. The
reset function is implemented by a NOR cell consisting of (W+4) NOR gates, as shown in Figure 3.6, using an
active-high RESET.
Figure 3.6 NOR cell
The RESET bit is fed as one of the inputs of all those NOR gates, and the other (W+4) input lines of the NOR
gates of the NOR cell are fed with the (W+4) bits of the LUT output in parallel. The control circuit in Figure 3.4
generates an active-high RESET according to the logic expression:
(3.3)
When RESET = 1, the outputs of all the NOR gates become 0, so that the barrel shifter is fed with (W+4) zeros.
When RESET = 0, the outputs of all the NOR gates become the complement of the LUT output bits. Note that, keeping
this in view, the product values are stored in the LUT in bit-inverted form. The reset function could be
implemented by an array of 2-input AND gates in a straightforward way, but NOR gates have a simpler CMOS
implementation than AND gates.
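The behaviour of the NOR cell on a bit-inverted stored word can be sketched as below. The names `nor_cell`, `WIDTH` and the values `A = 13` and odd multiple 3 are hypothetical and chosen only for illustration.

```python
def nor_cell(stored_inverted, reset, width):
    """Bitwise NOR of the inverted LUT word with the RESET line."""
    mask = (1 << width) - 1
    reset_lines = mask if reset else 0      # RESET fans out to every gate
    return ~(stored_inverted | reset_lines) & mask


WIDTH = 8                                   # (W + 4) bits, assumed W = 4
A = 13                                      # hypothetical coefficient
product = A * 3                             # odd multiple 3A
stored = ~product & ((1 << WIDTH) - 1)      # LUT keeps the inverted word

# RESET = 0: the NOR gates undo the inversion and pass the product.
assert nor_cell(stored, 0, WIDTH) == product
# RESET = 1: every output is forced to zero, whatever the LUT holds.
assert nor_cell(stored, 1, WIDTH) == 0
```

Storing the words in inverted form is what lets a plain NOR gate serve as both the reset element and the restoring inverter.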
3.3 Realization of Digital FIR Filter Using Proposed LUT Based Multiplier
The digital FIR filter is realized with the proposed LUT-based multiplier using the direct-form realization
structure. The FIR filter is defined by the following equation, which gives the output sequence y[n] in
terms of the input sequence x[n]:

$$y[n] = \sum_{k=0}^{N-1} h[k]\, x[n-k] \tag{3.4}$$
where x[n] is the input signal, y[n] is the output signal, h[k] are the coefficients of the FIR filter (its impulse
response), and N is the number of filter taps, so that the filter order is N − 1. Figure 3.7 shows the direct-form
realization of the digital FIR filter.
Figure 3.7 FIR filter in direct form
In Figure 3.7, the input X is delayed and fed to the multipliers. Each multiplier produces the product corresponding
to a different filter coefficient, and all these products are accumulated to give the FIR filter output. Figure 3.8
shows the realization of the digital FIR filter using the proposed LUT-based multiplier.
Figure 3.8 Realization of digital FIR filter using proposed LUT based multiplier
The proposed LUT multiplier is used in Figure 3.8, in which each multiplier has a fixed filter coefficient
and the inputs are delayed and fed to the LUT multipliers. A memory unit of ($2^L/2$) words of (W+L)-bit
width is used to store all the odd multiples of the filter coefficient. The L-bit input word is mapped to an
(L − 1)-bit LUT address by an encoder. The barrel shifter is used to derive all the even multiples of the filter
coefficient. The required control bits for the barrel shifter are derived by the control circuit to perform the
necessary shifts of the LUT output. A RESET signal is generated by the same control circuit to reset the LUT output
when X = 0. The product stored in the LUT for the particular input given to the LUT-based multiplier circuit of
Figure 3.2 is thereby obtained. These products are finally accumulated to give the FIR filter output, according to
the number of taps of the given filter. In this project, 4-tap and 8-tap FIR filters are realized using this LUT
multiplier and synthesized using the Xilinx tool.
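The realization described above can be modelled in software as a direct-form filter whose multipliers are odd-multiple LUTs. The sketch below is a behavioural illustration only; the 4-tap coefficient set, the input samples and all function names are assumptions, and both coefficients and samples are taken to be unsigned.

```python
def make_lut_multiplier(coefficient, word_length):
    """Build a multiplier for one fixed coefficient from odd multiples."""
    odd_lut = [coefficient * (2 * i + 1)
               for i in range(1 << (word_length - 1))]

    def multiply(x):
        if x == 0:                       # RESET path
            return 0
        shift = 0
        while x % 2 == 0:                # strip factors of two
            x //= 2
            shift += 1
        return odd_lut[(x - 1) // 2] << shift

    return multiply


def fir_filter(coefficients, samples, word_length):
    """Direct-form FIR: delay line, one LUT multiplier per tap, adder."""
    multipliers = [make_lut_multiplier(h, word_length) for h in coefficients]
    delay_line = [0] * len(coefficients)
    output = []
    for sample in samples:
        delay_line = [sample] + delay_line[:-1]      # shift in new input
        output.append(sum(m(x) for m, x in zip(multipliers, delay_line)))
    return output


h = [1, 3, 3, 1]                 # hypothetical 4-tap coefficients
x = [5, 0, 12, 7, 15, 2]         # hypothetical 4-bit unsigned samples
y = fir_filter(h, x, word_length=4)

# Reference: evaluate the convolution sum with ordinary multiplication.
reference = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
             for n in range(len(x))]
assert y == reference
```

An 8-tap filter follows the same pattern with eight coefficients, at the cost of eight LUTs of $2^L/2$ words each.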
Results
Simulation Results
The figures below show the simulation results of the test cases applied to the DUT. Figure 9.1 shows the
response of the device for the control test case at the USB interface. Figure 6.2 shows the master transmitter sending
random data to the external slave device.