Professional Documents
Culture Documents
Pre-Encoded Multipliers
Based on Non-Redundant Radix-4
Signed-Digit Encoding
K. Tsoumanis, N. Axelos, N. Moshopoulos, G. Zervakis and K. Pekmestzi
AbstractIn this paper, we introduce an architecture of pre-encoded multipliers for Digital Signal Processing applications based
on off-line encoding of coefficients. To this extend, the Non-Redundant radix-4 Signed-Digit (NR4SD) encoding technique, which
uses the digit values {1, 0, +1, +2} or {2, 1, 0, +1}, is proposed leading to a multiplier design with less complex partial
products implementation. Extensive experimental analysis verifies that the proposed pre-encoded NR4SD multipliers, including
the coefficients memory, are more area and power efficient than the conventional Modified Booth scheme.
Index TermsMultiplying circuits, Modified Booth encoding, Pre-Encoded multipliers, VLSI implementation
I NTRODUCTION
0018-9340 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation
information: DOI 10.1109/TC.2015.2428691, IEEE Transactions on Computers
TSOUMANIS ET AL., PRE-ENCODED MULTIPLIERS BASED ON NON-REDUNDANT RADIX-4 SIGNED-DIGIT ENCODING
TABLE 1
Modified Booth Encoding
n2 j 1
c2j+2
b2j+1
0
0
0
0
1
1
1
1
b2j
0
0
1
1
0
0
1
1
b2j1
0
1
0
1
0
1
0
1
B
bM
j
0
+1
+1
+2
-2
-1
-1
0
sj
0
0
0
0
1
1
1
1
onej
0
1
1
0
0
1
1
0
twoj
0
0
0
1
1
0
0
0
c2k-2
+ +
MB
b k
1 {-2,-1,0,+1,+2}
MB form
n
b
NR
k 2
b2j+1
c2k-4
c2j+2
2k 4
n2 j 1
n
+
{-2,-1,0,+1}
b2j
2's complement
to NR4SD- form
NR
j
b1
c2j
n2 j
+
{-2,-1,0,+1}
c2
n
b
b0
2's complement to
NR4SD- form
NR
0
n
+
{-2,-1,0,+1}
NR4SD- form
(b)
2's complement to
NR4SD+ form
+
HA*
C
S
-
c2j+1
+
c2j
+
n2 j
NR
NR4SD+ digit b j {-1,0,+1,+2}
(a)
2's complement form
bi 2i
i=0
B
MB
= hbM
iM B =
k1 . . . b0
2's complement
to NR4SD- form
2 k 3
k1
X
c2j
+
b2k-1 b2k-2
c2j+2
2k2
X
c2j+1
+
(a)
2's complement to
NR4SD- form
+
HA
C
S
+
n2 j
NR4SD- digit b NR
{-2,-1,0,+1}
j
b2k-3 b2k-4
b2k-1 b2k-2
c2k-2
(1)
B 2j
bM
2 .
j
j=0
c2j+2
n2k 3
+ +
b {-2,-1,0,+1,+2}
MB form
MB
k1
2's complement
to NR4SD+ form
b2j+1
c2k-4
n2k 4
+
b kNR2 {-1,0,+1,+2}
b2j
2's complement
to NR4SD+ form
b1
c2j
n2 j
+
b NR
{-1,0,+1,+2}
j
n2 j 1
c2
b0
2's complement to
NR4SD+ form
n1
n0
+
b 0NR {-1,0,+1,+2}
NR4SD+ form
B
Digits bM
{2, 1, 0, +1, +2}, 0 j k 1, are
j
formed as follows:
(b)
B
bM
= 2b2j+1 + b2j + b2j1 ,
j
(2)
(3)
3 N ON -R EDUNDANT
D IGIT A LGORITHM
R ADIX -4
(4)
S IGNED -
R
digits bN
take one of four values: {2, 1, 0, +1}
j
N R+
or bj
{1, 0, +1, +2} at the NR4SD or NR4SD+
algorithm, respectively. Only four different values are
used and not five as in MB algorithm, which leads
to 0 j k 2. As we need to cover the dynamic
range of the 2s complement form, the most significant
B
digit is MB encoded (i.e., bM
k1 {2, 1, 0, +1, +2}).
+
The NR4SD and NR4SD encoding algorithms are
illustrated in detail in Fig. 1 and 2, respectively.
NR4SD Algorithm
Step 1: Consider the initial values j = 0 and c0 =0.
Step 2: Calculate the carry c2j+1 and the sum n+
2j of a
Half Adder (HA) with inputs b2j and c2j (Fig. 1a).
c2j+1 = b2j c2j , n+
2j = b2j c2j .
Step 3: Calculate the positively signed carry c2j+2 (+)
and the negatively signed sum n
2j+1 (-) of a Half
Adder* (HA*) with inputs b2j+1 (+) and c2j+1 (+) (Fig.
0018-9340 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation
information: DOI 10.1109/TC.2015.2428691, IEEE Transactions on Computers
TSOUMANIS ET AL., PRE-ENCODED MULTIPLIERS BASED ON NON-REDUNDANT RADIX-4 SIGNED-DIGIT ENCODING
TABLE 3
NR4SD+ Encoding
TABLE 2
NR4SD Encoding
2s complement
b2j+1 b2j
NR4SD form
c2j
c2j+2
n
2j+1
Digit
NR4SD Encoding
n+
2j
R
bN
j
one+
j
one
j
two
j
2s complement
b2j+1 b2j
NR4SD+ form
c2j
c2j+2
n+
2j+1
Digit
NR4SD+ Encoding
n
2j
R+
bN
j
+
one+
j onej twoj
+1
+1
+1
+1
-2
+2
-2
+2
-1
-1
-1
-1
2c2j+2 n
2j+1 = b2j+1 + c2j+1 .
R
Step 4: Calculate the value of the bN
digit.
j
R
+
bN
= 2n
j
2j+1 + n2j .
(5)
R+
bN
= 2n+
j
2j+1 n2j .
(7)
nals one+
j , onej and twoj of Table 2 are generated.
+
one+
j = n2j+1 n2j ,
one+
j = n2j+1 n2j ,
+
one
j = n2j+1 n2j ,
two
j
n
2j+1
(6)
n+
2j .
one
j = n2j+1 n2j ,
two+
j
n+
2j+1
(8)
n
2j .
0018-9340 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation
information: DOI 10.1109/TC.2015.2428691, IEEE Transactions on Computers
TABLE 4
Numerical Examples of the Encoding Techniques
3 bits, MB
3 bits, MB
-128
Modified Booth
2000
2000
000
2
NR4SD
NR4SD+
-102
221
2
+89
1
122
+127
2001
1
2
1
2
2122
2
2
21
200
1
2001
1121
3 bits, MB
ROMCoefficients
Pre-Encoded in MB
Form
CEn
Addr
Clock
(1010...1011)2
n-bit
PP0 Generator
PP1 Generator
3 bits
0 b0
MB
Encoding
b1
b2
MB
Encoding
b3
A 22k-2
3 bits
PPk-1 Generator
b2k-2
MB
Encoding
b2k-1
CSA Tree
3 bits
A 22
ROM - Coefficients in
CEn
Addr
Clock
Coefficients ROM
and MB encoding
P = A B = COR +
P=AB
k1
X
P Pj 22j ,
(10)
j=0
j=0
Conventional MB Multiplier
n1
X
i=0
pj,i 2i .
k1
X
(9)
cin,j 22j + 2n (1 +
k1
X
22j+1 ),
(11)
j=0
In the pre-encoded MB multiplier scheme, the coefficient B is encoded off-line according to the conventional MB form (Table 1). The resulting encoding
signals of B are stored in a ROM. The circled part of
Fig. 3, which contains the ROM with coefficients in
2s complement form and the MB encoding circuit,
is now totally replaced by the ROM of Fig. 5. The
MB encoding blocks of Fig. 3 are omitted. The new
ROM of Fig. 5 is used to store the encoding signals of
B and feed them into the partial product generators
(P Pj Generators - PPG) on each clock cycle.
Targeting to decrease switching activity, the value
1 of sj in the last entry of Table 1 is replaced by 0.
The sign sj is now given by the relation:
sj = b2j+1 (b2j+1 b2j b2j1 ).
(12)
0018-9340 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation
information: DOI 10.1109/TC.2015.2428691, IEEE Transactions on Computers
TSOUMANIS ET AL., PRE-ENCODED MULTIPLIERS BASED ON NON-REDUNDANT RADIX-4 SIGNED-DIGIT ENCODING
ai
ai
ai
sj
one j ai1
one j ai 1
two j
ai one j
ai 1 two j
two j
ai one j
ai 1 two j
ai one j
ai one j ai 1 two j
ai one j ai 1 two j
ai one j
ai one j
ai one j
sj
p j ,i
p j ,i
p j ,i
(a)
(b)
p j ,i
(c)
p j ,i
(d)
p j ,i
(e)
(f)
Fig. 4. Generation of the ith Bit pj,i of P Pj for a) Conventional, b) Pre-Encoded MB Multipliers, c) NR4SD , d)
NR4SD+ Pre-Encoded Multipliers, and e) NR4SD , f) NR4SD+ Pre-Encoded Multipliers after reconstruction.
3 bits
PP0 Generator
(1010...1011)2n
n-bit
3 bits
A 22k-2
PP1 Generator
NR4SD
Encoding
2 bits
NR4SD
Encoding
2 bits
3 bits, MB
PPk-1 Generator
Pre-Encoded in
NR4SD Form
A
A 22
ROM Coefficients
CEn
Addr
Clock
CSA Tree
n-bit
4.3
stored in ROM: n
2j+1 , n2j (Table 2) for the NR4SD
+
cin,j = two
j onej and cin,j = onej for the NR4SD
+
and NR4SD pre-encoded multipliers, respectively,
based on Tables 2 and 3. The carry-save output of the
CSA tree is finally summed using a fast CLA adder.
I MPLEMENTATION R ESULTS
one j
one j
(a)
n2 j
n2 j 1
two j
one j
n2 j
one j
two j
(b)
0018-9340 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation
information: DOI 10.1109/TC.2015.2428691, IEEE Transactions on Computers
TSOUMANIS ET AL., PRE-ENCODED MULTIPLIERS BASED ON NON-REDUNDANT RADIX-4 SIGNED-DIGIT ENCODING
TABLE 6
Performance at Lowest Clock Periods
TABLE 5
Multiplier Designs
MB
MB encoding
n-bit
MB
NR4SD
NR4SD+
(n+1)-bit
(n+1)-bit
Da
Ab
AD
Pc
PD
38079
37205
35796
36045
11.20
11.00
11.00
11.10
22.51
21.45
21.45
21.65
73121
65948
62832
64470
18.80
16.80
16.80
17.40
42.49
36.62
36.62
37.93
117465
108277
99816
106068
29.50
27.30
26.40
27.50
69.03
62.24
60.19
62.98
16 bits
PreEncoded
NR4SD+
Technique
Design
Conventional MB
MB
NR4SD
NR4SD+
2.01
1.95
1.95
1.95
PreEncoded
MB
NR4SD
Type
ROM
width
Conventional MB
MB
NR4SD
NR4SD+
2.26
2.18
2.18
2.18
Conventional MB
MB
NR4SD
NR4SD+
2.34
2.28
2.28
2.29
18945
19079
18357
18485
24 bits
32354
30251
28822
29573
32 bits
PreEncoded
PreEncoded
Conventional
MB
Input B Encoding
Input A
2s complement
n-bit
Design
50199
47490
43779
46318
0018-9340 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation
information: DOI 10.1109/TC.2015.2428691, IEEE Transactions on Computers
TSOUMANIS ET AL., PRE-ENCODED MULTIPLIERS BASED ON NON-REDUNDANT RADIX-4 SIGNED-DIGIT ENCODING
15%
0%
2.00
2.50
3.00
3.50
4.00
-5%
35%
40%
30%
5%
10%
40%
25%
20%
15%
10%
-10%
2.50
3.00
PreEncodedMB
PreEncodedNR4SD+
25%
3.50
4.00
2.00
PreEncodedMB
PreEncodedNR4SD+
50%
5%
0%
2.50
3.00
3.50
4.00
-10%
40%
30%
20%
10%
2.00
2.50
3.00
Delay (ns)
PreEncodedMB
PreEncodedNR4SD-
4.00
PreEncodedNR4SD+
50%
40%
30%
20%
10%
3.50
2.00
4.00
2.50
PreEncodedMB
3.00
3.50
4.00
Delay (ns)
Delay (ns)
PreEncodedNR4SD+
3.50
0%
0%
-15%
3.00
PreEncodedNR4SD-
60%
10%
15%
2.00
2.50
Delay (ns)
PreEncodedNR4SD-
20%
-5%
10%
Delay (ns)
Delay (ns)
PreEncodedNR4SD-
20%
0%
2.00
-15%
30%
5%
0%
PreEncodedMB
PreEncodedNR4SD-
PreEncodedNR4SD+
PreEncodedMB
PreEncodedNR4SD-
PreEncodedNR4SD+
Fig. 8. Area / Power Gains of the Pre-Encoded Designs Over the Conventional MB Scheme at 16 Bits.
System Area Gain at 24 Bits
25%
60%
5%
0%
2.00
2.50
3.00
3.50
4.00
35%
50%
10%
15%
-5%
20%
30%
25%
20%
15%
10%
0%
-15%
PreEncodedNR4SD-
2.50
3.00
3.50
4.00
2.00
PreEncodedNR4SD+
PreEncodedMB
PreEncodedNR4SD-
60%
40%
50%
0%
2.50
3.00
3.50
4.00
-10%
-20%
10%
20%
30%
20%
10%
2.50
3.00
3.50
4.00
40%
30%
20%
0%
2.00
-10%
PreEncodedNR4SD-
PreEncodedMB
PreEncodedNR4SD-
2.50
3.00
3.50
4.00
Delay (ns)
Delay (ns)
PreEncodedNR4SD+
4.00
PreEncodedNR4SD+
10%
Delay (ns)
PreEncodedMB
3.50
PreEncodedNR4SD-
0%
2.00
-30%
3.00
Delay (ns)
PreEncodedMB
PreEncodedNR4SD+
30%
2.00
2.50
Delay (ns)
40%
20%
0%
2.00
Delay (ns)
PreEncodedMB
30%
10%
5%
-10%
40%
PreEncodedNR4SD+
PreEncodedMB
PreEncodedNR4SD-
PreEncodedNR4SD+
Fig. 9. Area / Power Gains of the Pre-Encoded Designs Over the Conventional MB Scheme at 24 Bits.
0018-9340 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation
information: DOI 10.1109/TC.2015.2428691, IEEE Transactions on Computers
TSOUMANIS ET AL., PRE-ENCODED MULTIPLIERS BASED ON NON-REDUNDANT RADIX-4 SIGNED-DIGIT ENCODING
30%
15%
10%
5%
0%
-5%
2.00
2.50
3.00
3.50
4.00
-10%
25%
20%
15%
10%
2.50
3.00
3.50
2.00
4.00
PreEncodedNR4SD-
PreEncodedMB
PreEncodedNR4SD+
10%
0%
2.50
3.00
3.50
4.00
-10%
-20%
20%
PreEncodedNR4SD-
PreEncodedNR4SD+
10%
0%
2.00
2.50
3.00
3.50
4.00
30%
20%
10%
0%
2.00
PreEncodedMB
2.50
3.00
3.50
4.00
-10%
-20%
Delay (ns)
PreEncodedNR4SD+
4.00
40%
30%
-10%
PreEncodedNR4SD-
3.50
50%
Delay (ns)
PreEncodedMB
3.00
20%
2.00
2.50
Delay (ns)
40%
PreEncodedMB
PreEncodedNR4SD+
30%
20%
Delay (ns)
Delay (ns)
PreEncodedNR4SD-
30%
0%
2.00
PreEncodedMB
40%
10%
5%
0%
-15%
50%
35%
20%
PreEncodedNR4SD-
Delay (ns)
PreEncodedNR4SD+
PreEncodedMB
PreEncodedNR4SD-
PreEncodedNR4SD+
Fig. 10. Area / Power Gains of the Pre-Encoded Designs Over the Conventional MB Scheme at 32 Bits.
C ONCLUSION
In this paper, new designs of pre-encoded multipliers are explored by off-line encoding the standard coefficients and storing them in system memory. We propose encoding these coefficients in the
Non-Redundant radix-4 Signed-Digit (NR4SD) form.
The proposed pre-encoded NR4SD multiplier designs
are more area and power efficient compared to the
conventional and pre-encoded MB designs. Extensive
experimental analysis verifies the gains of the proposed pre-encoded NR4SD multipliers in terms of
area complexity and power consumption compared
to the conventional MB multiplier.
R EFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21]
0018-9340 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.