Considering The Alternatives in Low Power Design

Employing Alternative Number Systems to Reduce Power Dissipation in Portable Devices and High-Performance Systems

T. Stouraitis and V. Paliouras

CIRCUITS & DEVICES, JULY 2001. 8755-3996/01/$10.00 © 2001 IEEE
© 1999 ARTVILLE, LLC.

Power dissipation has evolved into an instrumental design optimization objective due to the growing demand for portable electronics equipment as well as due to excessive heat generation in high-performance systems. In the former case, low-power techniques are employed to prolong battery life, while in the latter case, low-power techniques are required to mitigate the reliability problems that may arise. The dominant component of power dissipation for well-designed CMOS circuits is dynamic power dissipation, given as [1]
$$P = a\,C_L\,f\,V_{dd}^{2}, \tag{1}$$
where $a$ is the activity factor, $C_L$ is the switching capacitance, $f$ is the clock frequency, and $V_{dd}$ is the supply voltage. A variety of design techniques are commonly employed to reduce the factors of Eq. (1), without degrading system performance. As slower circuits tend to dissipate less power, the low-power design problem can be seen as an attempt to achieve a specified system performance by employing slow components.
The reduction of the various factors that determine power dissipation is sought at all levels of the design abstraction. In particular, techniques for power dissipation reduction at higher design abstraction levels aim to reduce the computational load and the number of memory accesses required to perform a certain task as well as to introduce parallelism and pipelining in the system [2]. At the circuit and process levels, minimal feature size circuits are preferred, capable of operating at minimal supply voltages, while leakage currents and device threshold voltages are minimized.
The choice of the number system, i.e., the way numbers are represented in a digital system, can reduce power dissipation, since the number system has an effect on several levels of the design abstraction. In particular, the appropriate selection of the number system can reduce power dissipation, because it can reduce:
1. the number of the operations;
2. the strength of the operators; and
3. the activity of the data.
A particular choice of number system can reduce the number of the actual operations required to accomplish certain computational tasks; therefore, it can reduce the computational load of an application. Furthermore, both data activity and the strength of the operators are influenced by the choice of the number system. Finally, power dissipation can be reduced by using low-power arithmetic circuit architectures. Again, the possible architectures are determined by the number system.
Several authors address the issue of low-power arithmetic; the bulk of the work in this field is on the definition of new low-power circuit-level architectures [3] or the identification among existing architectures of those that dissipate minimal power for the basic operations, such as addition and multiplication [4]. Also, comparisons of number representations, such as sign-magnitude and two's-complement systems [5], in terms of underlying bit activity have been reported [6].
In this article we focus on two alternative number systems that are quite different from the conventional linear number representations, namely the logarithmic number system (LNS) and the residue number system (RNS). Both have recently attracted the interest of researchers for their low-power properties. We address aspects of the conventional arithmetic representations and the impact of logarithmic arithmetic on power dissipation, and discuss the low-power aspects of residue arithmetic.
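The effect of each factor of Eq. (1) can be checked numerically. The following is a minimal sketch; the function name and the operating point (activity, capacitance, clock frequency, supply voltage) are hypothetical values chosen only for illustration and are not taken from the article.

```python
def dynamic_power(activity, c_load, freq, vdd):
    """Dynamic power of Eq. (1): P = a * C_L * f * Vdd^2, in watts."""
    return activity * c_load * freq * vdd ** 2

# Hypothetical operating point, for illustration only.
base = dynamic_power(activity=0.5, c_load=1e-9, freq=50e6, vdd=3.3)

# Halving the activity factor halves the power (linear dependence).
half_activity = dynamic_power(activity=0.25, c_load=1e-9, freq=50e6, vdd=3.3)

# Halving the supply voltage quarters the power (quadratic dependence).
half_vdd = dynamic_power(activity=0.5, c_load=1e-9, freq=50e6, vdd=1.65)

print(f"baseline      : {base * 1e3:.2f} mW")
print(f"half activity : {half_activity * 1e3:.2f} mW")
print(f"half Vdd      : {half_vdd * 1e3:.2f} mW")
```

The number system acts mainly on the first two factors: data encoding changes the activity, and operator strength changes the switched capacitance.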
Conventional Arithmetic Representations

Parhami [5] offers an overview of low-power techniques for
arithmetic circuits. Common techniques for low-power logic design can be applied to arithmetic circuits as well [5]. Such techniques are based on the following guidelines:
1. avoid wasted power: glitching minimization, not clocking
idle modules;
2. barely meet performance requirements, since slower circuits dissipate less power; and
3. minimize signal activity by properly encoding data.
In some cases, wasted power can be cut severalfold by minimizing the computational load of a particular task. As shown below, an appropriate selection of the number system reduces the computational load in certain tasks.
Callaway and Swartzlander [7] have focused on low-power
arithmetic at the gate level; they have characterized several adder
and multiplier architectures in terms of power dissipation. They
offer area, time, and power dissipation measures for various architectures and word lengths. In terms of minimal power dissipation
for 16-bit adders, the constant-width carry-skip adder emerges as
the optimal choice. However, minimal absolute power dissipation
may not be the optimization objective in a design. In most cases, a
more complex criterion, the power-delay product, is more applicable, because it describes the combined effect of reducing power
dissipation at the cost of increasing circuit delay. Returning to the
16-bit adder example, the utilization of the power-delay product
criterion points out a different topology as an optimal solution,
namely the variable-width carry-skip adder [7].
This example demonstrates that there is not an optimal
choice of architecture applicable to every design situation. Instead, the design specifications (expressed as area, time, and
power dissipation constraints) should be met, while minimizing
an appropriate cost function. A similar discussion for multipliers
of word sizes between 8 and 32 bits reveals that Wallace and
Dadda architectures outperform array multipliers for low-power
operation [7].
[Figure 1. The organization of an (n + 1)-bit LNS digital word; the logarithm field holds $x = \log_b |X|$.]

Bit activity is another factor that affects power dissipation and depends on the number system selection. It has been shown that the probabilistic distribution of the input signals largely affects the performance of the number representation in terms of bit activity. Landman and Rabaey demonstrate this effect by introducing the dual-bit type (DBT) method for modeling the bit activity in a data word [6], assuming two's-complement and sign-magnitude representations. While the sign-magnitude representation is found to exhibit less bit activity than two's-complement coding, a general conclusion on the power dissipation behavior cannot be drawn, since the complexity of the corresponding processing circuitry is different. Since sign-magnitude arithmetic requires more complicated adders and subtractors than two's-complement arithmetic, the increased activity of the latter can be compensated from a power dissipation viewpoint.
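The difference in bit activity between the two encodings can be observed by counting low-to-high transitions over a data sequence. The sketch below is not the DBT model of [6]; it is a direct count under stated assumptions: the helper names, the 8-bit word length, the scaling, and the first-order Gaussian data model are all hypothetical choices made for illustration.

```python
import random

def twos_complement(value, bits):
    """Return the bits-wide two's-complement code of a signed integer."""
    return value & ((1 << bits) - 1)

def sign_magnitude(value, bits):
    """Return the bits-wide sign-magnitude code (the MSB is the sign)."""
    sign = 1 << (bits - 1) if value < 0 else 0
    return sign | abs(value)

def low_to_high(codes):
    """Count 0 -> 1 bit transitions between consecutive code words."""
    return sum(bin(~prev & cur).count("1")
               for prev, cur in zip(codes, codes[1:]))

def gaussian_sequence(n, rho, sigma, bits, seed=0):
    """Quantized zero-mean Gaussian samples with lag-one correlation rho."""
    rng = random.Random(seed)
    limit = (1 << (bits - 1)) - 1
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + (1.0 - rho * rho) ** 0.5 * rng.gauss(0.0, 1.0)
        out.append(max(-limit, min(limit, round(sigma * x))))
    return out

# A small value that changes sign toggles every upper bit in two's
# complement, but only the sign bit in sign-magnitude.
tiny = [1, -1, 1]
tc_tiny = low_to_high([twos_complement(v, 4) for v in tiny])
sm_tiny = low_to_high([sign_magnitude(v, 4) for v in tiny])
print("4-bit example, TC vs. SM:", tc_tiny, sm_tiny)

# Hypothetical experiment: 8-bit words, uncorrelated data.
data = gaussian_sequence(n=1000, rho=0.0, sigma=20.0, bits=8)
print("TC:", low_to_high([twos_complement(v, 8) for v in data]))
print("SM:", low_to_high([sign_magnitude(v, 8) for v in data]))
```

The counts depend on the seed, the variance, and the correlation, which is why the text stresses the role of the input signal statistics.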
The Logarithmic Number System

The LNS [8] has been employed in the design of low-power DSP
devices, such as a digital hearing aid by Morley et al. [9]. More recently, Sacha and Irwin report that LNS can reduce power dissipation in adaptive filtering processors [10].
LNS Basics
The LNS maps a linear number X to a triplet as follows:
$$X \xrightarrow{\text{LNS}} (z, s, x = \log_b |X|), \tag{2}$$
where z is a single-bit flag which, when asserted, denotes that X is zero, s is the sign of X, and b is the base of the logarithmic representation. The organization of an LNS word is shown in Fig. 1. Mapping (2) is of practical interest because it can simplify certain arithmetic operations; i.e., reduce the strength of the operators. For example, due to the properties of the logarithm function, the multiplication of two linear numbers $X = b^x$ and $Y = b^y$ is reduced to the addition of their logarithmic images, x and y. The basic arithmetic operations and their LNS counterparts are summarized in Table 1.
In order to utilize the benefits of LNS, a conversion overhead is required in most cases. Conversion circuitry is required to perform the forward LNS mapping [Eq. (2)] and the inverse mapping of the logarithmic results to linear numbers, defined as
$$X = (1 - z)(-1)^{s}\, b^{x}. \tag{3}$$
It is noted that mappings (2) and (3) are required in the case that an LNS processor receives as input and transmits as output linear data in digital format. Since all arithmetic operations can be performed in the logarithmic domain, only an initial conversion is imposed; therefore, as the amount of processing implemented in LNS grows, the conversion overhead contribution to power dissipation becomes negligible since it remains constant.
In stand-alone DSP systems a different approach is possible. The LNS forward and inverse mapping overhead can be mitigated by employing logarithmic A/D and D/A converters, instead of linear converters followed by the corresponding digital conversion circuitry. Such an approach has been adopted by Morley et al. in the design of a digital hearing-aid processor [9].
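The forward mapping of Eq. (2), the inverse mapping of Eq. (3), and the multiplication and addition rows of Table 1 can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the logarithm x is kept as a floating-point number rather than a fixed-point field, and the terms $\log_b(1 \pm b^{y-x})$ are evaluated directly, whereas a hardware implementation would need some dedicated approximation circuitry.

```python
import math

BASE = 2.0  # base b of the logarithm; a free design parameter

def to_lns(value, base=BASE):
    """Forward mapping of Eq. (2): X -> (z, s, x = log_b |X|)."""
    if value == 0:
        return (1, 0, 0.0)            # z asserted; s and x are don't-cares
    return (0, 1 if value < 0 else 0, math.log(abs(value), base))

def from_lns(triplet, base=BASE):
    """Inverse mapping of Eq. (3): X = (1 - z) * (-1)^s * b^x."""
    z, s, x = triplet
    return (1 - z) * (-1) ** s * base ** x

def lns_mul(p, q):
    """Multiplication becomes an addition of logarithms (Table 1)."""
    (zp, sp, xp), (zq, sq, xq) = p, q
    if zp or zq:
        return (1, 0, 0.0)
    return (0, sp ^ sq, xp + xq)

def lns_add(p, q, base=BASE):
    """Signed addition: z = x + log_b(1 +/- b^(y - x)) (Table 1)."""
    (zp, sp, xp), (zq, sq, xq) = p, q
    if zp:
        return q
    if zq:
        return p
    if xq > xp:                       # keep the larger magnitude in p
        (sp, xp), (sq, xq) = (sq, xq), (sp, xp)
    d = base ** (xq - xp)             # b^(y - x), lies in (0, 1]
    if sp == sq:
        return (0, sp, xp + math.log(1.0 + d, base))
    if d == 1.0:
        return (1, 0, 0.0)            # equal magnitudes cancel to zero
    return (0, sp, xp + math.log(1.0 - d, base))

product = from_lns(lns_mul(to_lns(6.0), to_lns(-7.0)))
total = from_lns(lns_add(to_lns(6.0), to_lns(-7.0)))
print(product, total)   # close to -42 and -1, up to rounding error
```

The sketch makes the trade-off visible: multiplication costs one addition, while addition and subtraction require the evaluation of a nonlinear function of the difference y - x.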
LNS and Power Dissipation

LNS is applicable for low-power design because it reduces
1. the strength of certain arithmetic operators; and
2. the bit activity.
The operator strength reduction by LNS reduces the switching capacitance; i.e., it reduces the $C_L$ factor of Eq. (1). Sacha and Irwin have studied the impact of the number system choice on the QRD-RLS algorithm [10]. They have compared the amount of switched capacitance per algorithm iteration for several implementations of QRD-RLS, each using a particular arithmetic, namely CORDIC, floating-point, fixed-point, and LNS. A performance comparison of the various implementations reveals that LNS offers accuracy comparable to floating-point, but only at a fraction of switched capacitance per iteration of the algorithm.
The reduction of average switched capacitance due to LNS stems from the simplification of basic arithmetic operations, shown in Table 1. However, LNS can affect power dissipation in an additional way: the bit activity; i.e., the $a$ factor of Eq. (1). A design parameter that is often neglected, despite playing a key role in the performance of an LNS-based processor, is the base of the logarithm b [11, 12], as demonstrated in Fig. 2. The choice of base has a substantial impact on the average bit activity. Figure 2 shows activity per bit position; i.e., the probability of a transition from low to high in a particular bit position, for a two's-complement word and several LNS words, each of a different base b. It can be seen that departing from the traditional choice b = 2 can substantially reduce the signal activity in comparison to the two's-complement representation. The input data are sampled from a zero-mean Gaussian process with a correlation factor ρ = 0.99, similar to the derivation of the DBT model.
Since multiplication-additions are important in DSP applications, the power requirements of an LNS and a linear fixed-point adder-multiplier have been compared. Paliouras and Stouraitis report that approximately a two-times reduction in power dissipation is possible for operations with word size of 8 to 14 bits. Given a sufficient number of multiplication-additions, the LNS implementation becomes more efficient from the low-power dissipation viewpoint, even when a constant conversion overhead is taken into consideration.
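The per-bit activity measure described above, the probability of a low-to-high transition in each bit position, can be estimated from a data sequence as sketched below. The sketch does not reproduce Fig. 2: the LNS word layout (zero flag, sign, integer-rounded logarithm), the scaling, the sample count, and all function names are assumptions made here for illustration only.

```python
import math
import random

def ar1_gaussian(n, rho, seed=1):
    """Zero-mean, unit-variance Gaussian sequence with lag-one correlation rho."""
    rng = random.Random(seed)
    x, out = rng.gauss(0.0, 1.0), []
    for _ in range(n):
        out.append(x)
        x = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return out

def p01_per_bit(codes, bits):
    """Estimated probability of a 0 -> 1 transition per bit position (LSB first)."""
    counts = [0] * bits
    for prev, cur in zip(codes, codes[1:]):
        rising = ~prev & cur
        for k in range(bits):
            counts[k] += (rising >> k) & 1
    return [c / (len(codes) - 1) for c in counts]

def quantize(x, bits, scale):
    """Round to a signed integer and saturate to the bits-wide range."""
    limit = (1 << (bits - 1)) - 1
    return max(-limit, min(limit, round(x * scale)))

def twos_complement_code(x, bits, scale):
    return quantize(x, bits, scale) & ((1 << bits) - 1)

def lns_code(x, bits, scale, base):
    """Hypothetical LNS word: [z | s | round(log_b |X|)] with a (bits - 2)-bit log field."""
    v = quantize(x, bits, scale)
    if v == 0:
        return 1 << (bits - 1)        # only the zero flag z is set
    s = 1 if v < 0 else 0
    field = min(round(math.log(abs(v), base)), (1 << (bits - 2)) - 1)
    return (s << (bits - 2)) | field

samples = ar1_gaussian(n=2000, rho=0.99)
tc_activity = p01_per_bit([twos_complement_code(x, 16, 4096) for x in samples], 16)
lns_activity = {b: p01_per_bit([lns_code(x, 16, 4096, b) for x in samples], 16)
                for b in (2.0, 1.5, 1.1)}
print("TC total activity :", round(sum(tc_activity), 3))
for b, act in lns_activity.items():
    print(f"LNS b = {b}: total activity", round(sum(act), 3))
```

Because the quantization of the logarithm differs from one design to another, the absolute numbers are not meaningful; the sketch only shows how the base b enters the encoding and how the activity is measured.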
The Residue Number System

The RNS [13] has recently been shown to offer significant power-dissipation savings in the design of signal processing architectures for FIR filters [14] and frequency synthesizers [15]. It is shown that RNS can even reduce the computation load in complex-number processing, thus providing savings at the algorithmic level.
RNS Basics
The RNS maps an integer X to an M-tuple of residues $x_i$,
$$X \xrightarrow{\text{RNS}} \{x_1, x_2, \ldots, x_M\}, \tag{4}$$
where
$$x_i = \langle X \rangle_{m_i}, \tag{5}$$
$\langle \cdot \rangle_{m_i}$ denotes the mod $m_i$ operation, and $m_i$ is a member of the set of co-prime integers $\{m_1, m_2, \ldots, m_M\}$, called moduli. Co-prime integers have the property that $\gcd(m_i, m_j) = 1$, $i \neq j$. The modulo operation $\langle X \rangle_m$ returns the integer remainder of the integer division X div m; i.e., a number k such that $X = m \cdot l + k$, where l is an integer and $0 \le k < m$.
Table 1. Basic Linear Arithmetic Operations and their LNS Counterparts

Linear Operation | Logarithmic Operation
-----------------|----------------------
$Z = XY = b^{x} b^{y} = b^{x+y}$ | $z = \log_b Z = x + y$
$Z = X/Y = b^{x} / b^{y} = b^{x-y}$ | $z = x - y$
$Z = \sqrt[m]{X} = \sqrt[m]{b^{x}} = b^{x/m}$ | $z = x/m$, m integer
$Z = X^{m} = (b^{x})^{m}$ | $z = mx$, m integer
$Z = X + Y = b^{x} + b^{y} = b^{x}(1 + b^{y-x})$ | $z = x + \log_b(1 + b^{y-x})$
$Z = X - Y = b^{x} - b^{y} = b^{x}(1 - b^{y-x})$ | $z = x + \log_b(1 - b^{y-x})$
[Figure 2. Probability $p_{0 \to 1}$ per bit position for two's-complement and LNS encoding (b = 1.5, 1.3, and 1.1) for ρ = 0.99. Horizontal axis: bit position, 15 down to 0; vertical axis: $p_{0 \to 1}$.]
RNS is of interest because basic arithmetic operations can be performed in a digit-parallel carry-free manner; i.e.,
$$z_i = \langle x_i \circ y_i \rangle_{m_i}, \tag{6}$$
where $i = 1, 2, \ldots, M$, and the symbol $\circ$ stands for addition, subtraction, or multiplication. Every integer in the range $0 \le X < \prod_{i=1}^{M} m_i$ has a unique RNS representation. Inverse conversion is accomplished by means of the Chinese Remainder Theorem (CRT) or mixed-radix conversion [16].
The basic architecture of an RNS processor in comparison to a binary counterpart is depicted in Fig. 3. Figure 3 shows that the word length n of the binary counterpart is partitioned into M subwords, the residues, which can be processed independently and are of word length significantly smaller than n. The ith residue channel performs arithmetic modulo $m_i$. Conceptually, RNS introduces a subword-level parallelism into an algorithm; therefore, its hardware implementation can enjoy the low-power benefits of parallel architectures [2].
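The forward mapping of Eqs. (4) and (5), the channel-wise arithmetic of Eq. (6), and CRT-based inverse conversion can be sketched as follows. The moduli set {3, 5, 7} and the function names are hypothetical choices for illustration; the sketch assumes Python 3.8 or later for `math.prod` and the modular inverse `pow(x, -1, m)`.

```python
import operator
from math import prod

MODULI = (3, 5, 7)  # pairwise co-prime; dynamic range is 3 * 5 * 7 = 105

def to_rns(value, moduli=MODULI):
    """Forward conversion, Eqs. (4)-(5): one residue per modulus."""
    return tuple(value % m for m in moduli)

def rns_op(x, y, op, moduli=MODULI):
    """Channel-wise operation of Eq. (6); no carries cross between channels."""
    return tuple(op(a, b) % m for a, b, m in zip(x, y, moduli))

def from_rns(residues, moduli=MODULI):
    """Inverse conversion by means of the Chinese Remainder Theorem."""
    total = prod(moduli)
    acc = 0
    for r, m in zip(residues, moduli):
        partial = total // m                 # product of the other moduli
        acc += r * partial * pow(partial, -1, m)
    return acc % total

x, y = to_rns(17), to_rns(4)
rns_sum = from_rns(rns_op(x, y, operator.add))
rns_diff = from_rns(rns_op(x, y, operator.sub))
rns_prod = from_rns(rns_op(x, y, operator.mul))
print(x, y, rns_sum, rns_diff, rns_prod)
```

Each channel works on a short word and never exchanges carries with its neighbors, which is the subword-level parallelism the text refers to. Results are only meaningful while they stay inside the dynamic range, here 0 to 104.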
RNS and Power Dissipation

Freking and Parhi have studied the power dissipation of FIR filter architectures that employ RNS. They report that RNS can reduce power dissipation since it reduces [14]:
1. the hardware cost;
2. the switching activity; and
3. the supply voltage.
By employing the binary-like RNS filter structures by Ibrahim [17], Freking and Parhi report that RNS reduces the bit activity up to 38% in (4 × 4)-bit multipliers. As the critical path in an RNS architecture increases logarithmically with the equivalent binary word length, RNS can tolerate a larger reduction in the supply voltage than the corresponding binary architecture while achieving a particular delay specification. To demonstrate the overall impact of the RNS on the power budget of an FIR filter, Freking and Parhi report that a filter unit with 16-bit coefficients and 32-bit dynamic range, operating at 50 MHz, dissipates 26.2 mW on average for a two's-complement implementation, while the RNS equivalent architecture dissipates 3.8 mW. Hence, power dissipation reduction becomes more significant as the number of filter taps increases, and a three-times reduction is possible for filters with more than 100 taps.

A different approach to low-power RNS is proposed by Chren. Chren suggests one-hot encoding the residues in an RNS-based architecture, thus defining one-hot RNS (OHR) [15]. Instead of encoding a residue value $x_i$ in a conventional positional notation, an (m − 1)-bit word is employed. In this word, the assertion of the ith bit denotes the residue value $x_i$. The one-hot approach allows for further reducing bit activity and power-delay products using residue arithmetic. OHR is found to require simple circuits for processing. The power reduction is rendered possible since all basic operations (i.e., addition/subtraction and multiplication) and the RNS-specific operations of scaling (i.e., division by constant), modulus conversion, and index computation are performed using transposition of bit lines and barrel shifters. The performance of the obtained residue architectures is demonstrated through the design of a direct digital frequency synthesizer, which exhibits a power-delay product reduction of 85% over the conventional approach [15].
RNS Signal Activity for Gaussian Input

In the following, the bit activity in an RNS architecture with positionally encoded residues is experimentally studied for the encoding of 8-bit data using the base {2, 151}, which provides a linear fixed-point dynamic range of approximately 8.24 bits. Assuming data sampled from a Gaussian process, the bit assertion activities of the particular RNS, an 8-bit sign-magnitude, and an 8-bit two's-complement system are measured and compared. The results are depicted in Figs. 4-6 for 100 Monte Carlo runs. It is observed that RNS performs better than two's-complement representation for anti-correlated data and slightly worse than sign-magnitude and two's-complement representations for uncorrelated and correlated sequences.
The Quadratic RNS
Residue arithmetic can be exploited to reduce the number of real operations required to perform complex-number multiplication. This is achieved by employing an extension of RNS, the quadratic RNS (QRNS) [16]. The direct complex-number multiplication can be performed as
$$p = (a + jb)(c + jd) \tag{7}$$
$$\phantom{p} = (ac - bd) + j(bc + da), \tag{8}$$
where j is the imaginary unit (i.e., $\sqrt{-1}$), and a, b, c, and d are real numbers. Parhami [5] shows a different technique to reduce the number of multiplications to three by performing five additions or subtractions with an extra computational step. According to this technique, the complex product is computed as
$$p = [c(a + b) - b(c + d)] + j[c(a + b) - a(c - d)], \tag{9}$$
where the common term c(a + b) is initially computed.

[Figure 3. Structure of a binary architecture (a) and the corresponding RNS processor (b): a forward converter, M residue channels (mod $m_1$, mod $m_2$, ..., mod $m_M$) of n/M bits each, and an inverse converter.]
[Figure 4. Number of low-to-high transitions, assuming strongly anti-correlated (ρ = −0.99) Gaussian data, for two's-complement (TC), RNS, and sign-magnitude (SM) number systems for 100 Monte Carlo runs.]
[Figure 5. Number of low-to-high transitions, assuming uncorrelated (ρ = 0) Gaussian data. Curves: TC, RNS, SM.]
[Figure 6. Number of low-to-high transitions, assuming strongly correlated (ρ = 0.99) Gaussian data. Curves: TC, RNS, SM.]

In case the moduli are primes of the form $m_i = 4k + 1$, a QRNS mapping can be established, such that the residue pair of the real and imaginary part modulo $m_i$ can be mapped to a quadratic residue pair as
$$(a_i, b_i) \xrightarrow{\text{QRNS}} (q_i, q_i^{*}), \tag{10}$$
where $q_i$ and $q_i^{*}$ are the quadratic images of $a_i$ and $b_i$, respectively. The quadratic images are obtained as
$$q_i = \langle a_i + j b_i \rangle_{m_i}, \tag{11}$$
$$q_i^{*} = \langle a_i - j b_i \rangle_{m_i}, \tag{12}$$
where j is the solution of
$$\langle j^{2} + 1 \rangle_{m_i} = 0, \tag{13}$$
while the mapping is inverted as
$$a_i = \langle 2^{-1} (q_i + q_i^{*}) \rangle_{m_i}, \tag{14}$$
$$b_i = \langle 2^{-1} j^{-1} (q_i - q_i^{*}) \rangle_{m_i}. \tag{15}$$
The quadratic mapping is of practical importance because it removes the dependency of the real and imaginary parts of a complex product on both the real and imaginary parts of both operands, as shown by Eq. (8). In other words, it eliminates the cross-product terms. Therefore, by exploiting the QRNS, the complex product $\{(q_{pi}, q_{pi}^{*}) \mid i = 1, 2, \ldots, M\}$ of two QRNS-encoded complex numbers, $\{(q_{ai}, q_{ai}^{*}) \mid i = 1, 2, \ldots, M\}$ and $\{(q_{bi}, q_{bi}^{*}) \mid i = 1, 2, \ldots, M\}$, can be evaluated as the direct product of the corresponding quadratic images; i.e.,
$$q_{pi} = \langle q_{ai}\, q_{bi} \rangle_{m_i}, \tag{16}$$
$$q_{pi}^{*} = \langle q_{ai}^{*}\, q_{bi}^{*} \rangle_{m_i}. \tag{17}$$
Equations (16) and (17) show that a complex multiplication requires only two residue multiplications instead of four multiplications, an addition, and a subtraction. Therefore, by paying an initial cost for conversion, a significant computational complexity reduction can be achieved by the QRNS mapping, which is directly translated to power savings.
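The three ways of forming a complex product discussed in this section, the direct form of Eq. (8), the three-multiplication form of Eq. (9), and the QRNS form of Eqs. (10)-(17), can be compared in a short sketch. For brevity it uses a single hypothetical modulus m = 13 (a prime of the form 4k + 1) with j = 5, whereas a practical QRNS would use several moduli to cover the required dynamic range; the function names are also assumptions. Python 3.8 or later is assumed for `pow(x, -1, m)`.

```python
M = 13   # prime of the form 4k + 1 (hypothetical choice)
J = 5    # solves <j^2 + 1>_m = 0, since 5 * 5 + 1 = 26 = 2 * 13

def direct(a, b, c, d):
    """Eq. (8): four multiplications, one addition, one subtraction."""
    return (a * c - b * d, b * c + d * a)

def three_mult(a, b, c, d):
    """Eq. (9): three multiplications and five additions/subtractions."""
    common = c * (a + b)
    return (common - b * (c + d), common - a * (c - d))

def to_qrns(a, b, m=M, j=J):
    """Eqs. (11)-(12): quadratic images of the residue pair (a, b)."""
    return ((a + j * b) % m, (a - j * b) % m)

def from_qrns(q, q_star, m=M, j=J):
    """Eqs. (14)-(15): inverse mapping back to the residue pair (a, b)."""
    inv2, invj = pow(2, -1, m), pow(j, -1, m)
    return (inv2 * (q + q_star) % m, inv2 * invj * (q - q_star) % m)

def qrns_mul(u, v, m=M):
    """Eqs. (16)-(17): two independent residue multiplications."""
    return (u[0] * v[0] % m, u[1] * v[1] % m)

a, b, c, d = 3, 4, 1, 2                      # (3 + 4j)(1 + 2j) = -5 + 10j
real, imag = direct(a, b, c, d)
qrns_result = from_qrns(*qrns_mul(to_qrns(a, b), to_qrns(c, d)))
print(direct(a, b, c, d), three_mult(a, b, c, d), qrns_result)
```

The QRNS result equals the direct result reduced modulo m, here (8, 10), because −5 is congruent to 8 modulo 13. No cross terms between the two channels are ever formed.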
Consider the Monte Carlo runs of the following experiment. Assuming that the real and imaginary parts of the factors of a complex product are taken from two Gaussian random processes, the total bit activity in the intermediate results is measured for the complex product evaluation. Specifically, 10-bit sign-magnitude and 10-bit two's-complement operations are compared to QRNS operations that cover a dynamic range in excess of 20 bits. Ten Monte Carlo runs, each of 1000 samples, compose the experiment, which is repeated for uncorrelated (ρ = 0), correlated (ρ = 0.99), and anti-correlated (ρ = −0.99) Gaussian data; results are shown in Figs. 7-9, respectively. Even in the case that QRNS provides significantly larger dynamic range, it can be seen that the bit activity is reduced approximately two times.

[Figure 7. Number of low-to-high transitions for complex-number multiplication, assuming uncorrelated (ρ = 0) Gaussian operands. Curves: TC, SM, QRNS.]
D'Amora et al. have compared the implementation of a direct-form complex FIR filter with its QRNS counterpart [18]. They report that, for a particular throughput rate, the QRNS-based implementation requires half the area and a third of the power dissipation of the conventional implementation. The conventional implementation is assumed to utilize the four-multiplication scheme for complex-number multiplication, while the QRNS implementation exploits the index transform.

[Figure 8. Number of low-to-high transitions for complex-number multiplication, assuming strongly correlated (ρ = 0.99) Gaussian operands. Curves: TC, SM, QRNS.]

The index transform reduces a modulo-m multiplication to a modulo-(m − 1) addition, for m prime, resembling the reduction of multiplication to addition by LNS. An integer root g can be determined, such that the residues $r \in [1, m)$ can be written as
$$r = \langle g^{n} \rangle_{m}, \tag{18}$$
and the multiplication of the residues can be reduced to addition modulo (m − 1) of the indices, which correspond to the residues to be multiplied. Therefore, the modulo product p of two residues, $r_1$ and $r_2$, is
$$p = \langle r_1 r_2 \rangle_{m} = \langle g^{n_1} g^{n_2} \rangle_{m} = \langle g^{n_1 + n_2} \rangle_{m} = \langle g^{n} \rangle_{m}, \tag{19}$$
where
$$n = \langle n_1 + n_2 \rangle_{m-1}. \tag{20}$$
Hence, modulo multiplication can be performed as residue addition, preceded and followed by a mapping of the operands to their indices and of the result to the residue. These mappings are commonly implemented as table look-ups [16].
The QRNS can exploit the index transform because the utilized moduli need to be prime. Hence, in the case of DSP architectures such as FIR filters, the coefficients can be directly stored in index-residue form; thus the strength of each multiplication can be further reduced, since the determination of the corresponding indices is not repeated for every residue multiplication. The significant power dissipation savings reported by D'Amora et al. assume the utilization of the index transform for residue multiplication [18].
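The index transform can be sketched with two look-up tables, one from residues to indices and one back. The modulus m = 13, the root 2, the name `g` for the root, and the function names are hypothetical choices for illustration; a zero operand has no index and is handled as a special case here, which is one possible convention.

```python
def build_index_tables(m, g):
    """Tables for a prime m and a primitive root g: index[r] = n with <g^n>_m = r."""
    index, antilog = {}, []
    r = 1
    for n in range(m - 1):
        index[r] = n
        antilog.append(r)
        r = r * g % m
    if len(index) != m - 1:
        raise ValueError("g does not generate all nonzero residues modulo m")
    return index, antilog

def index_multiply(r1, r2, index, antilog):
    """Modulo-m product computed as a modulo-(m - 1) addition of indices."""
    if r1 == 0 or r2 == 0:
        return 0                      # zero has no index; handled separately
    n = (index[r1] + index[r2]) % len(antilog)
    return antilog[n]

m, g = 13, 2                          # 2 generates all nonzero residues mod 13
index, antilog = build_index_tables(m, g)
print(index_multiply(7, 9, index, antilog), 7 * 9 % m)
```

When one operand is a fixed coefficient, as in an FIR filter, its index can be stored once, so only the data operand needs the residue-to-index look-up at run time.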
[Figure 9. Number of low-to-high transitions for complex-number multiplication, assuming strongly anti-correlated (ρ = −0.99) Gaussian operands. Curves: TC, SM, QRNS.]

Conclusions

Recent advances in computer arithmetic offer interesting alternative solutions for low-power design. Depending on an assortment of factors that need to be considered, such as signal statistics, computational load, type of arithmetic operations, accuracy and dynamic range, it is worth evaluating the LNS or the RNS for hardware implementations of computationally intensive tasks.
The choice of arithmetic can lead to substantial power savings. It affects several levels of the design abstraction since it can reduce the number of operations, the signal activity, and the strength of the operators. The impact of the arithmetic in a digital system is not limited to the definition of the architecture of arithmetic circuits.
Thanos Stouraitis received a B.S. in physics and an M.S. in electronic automation from the University of Athens, Greece, in 1979 and 1981, respectively; an M.S. in electrical engineering from the University of Cincinnati in 1983; and the Ph.D. degree from the University of Florida in 1986. He was awarded the Outstanding Ph.D. Dissertation award of the University of Florida and a Certificate of Appreciation by the IEEE Circuits and Systems Society in 1997. He is a professor of electrical and computer engineering at the University of Patras, Greece. He has served on the faculty of the University of Florida and the Ohio State University. He has published two books, several book chapters, and approximately 30 journal and 70 conference papers in the areas of computer architecture, computer arithmetic, VLSI signal and image processing, and low-power processing. He serves on the IEEE Circuits and Systems Society's technical committee on VLSI Systems and Applications and the digital signal processing and the multimedia systems committees (e-mail: thanos@ee.upatras.gr).
Vassilis Paliouras received the Diploma in electrical engineering in 1992 and the Ph.D. degree in electrical engineering in 1999, from the Electrical and Computer Engineering Department, University of Patras, Greece. He works as a researcher at the VLSI Design Laboratory, ECE Dept., while teaching microprocessor-based system design at the Computer Engineering and Informatics Department, both at the University of Patras. His research interests include computer arithmetic algorithms and circuits, microprocessor architecture, and VLSI signal processing, areas where he has published more than 30 conference and journal articles. Dr. Paliouras received the MEDCHIP VLSI Design Award in 1997. He is also the recipient of the 2000 IEEE Circuits and Systems Society Guillemin-Cauer Award. He is a Member of ACM, SIAM, and the Technical Chamber of Greece.

References

[1] A.P. Chandrakasan, S. Sheng, and R. Brodersen, "Low-power CMOS digital design," IEEE J. Solid-State Circuits, vol. 27, pp. 473-484, Apr. 1992.
[2] J.M. Rabaey and M. Pedram, Low Power Design Methodologies. Boston, MA: Kluwer, 1996.
[3] K.K. Parhi, "Low-energy CSMT carry generators and binary adders," IEEE Trans. VLSI Syst., vol. 7, pp. 450-462, Dec. 1999.
[4] T.K. Callaway and E.E. Swartzlander, Jr., "Power-delay characteristics of CMOS multipliers," in Proc. 13th Symp. Computer Arithmetic (ARITH13), Asilomar, USA, July 1997, pp. 26-32.
[5] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs. New York: Oxford Univ. Press, 2000.
[6] P.E. Landman and J.M. Rabaey, "Architectural power analysis: The dual bit type method," IEEE Trans. VLSI Syst., vol. 3, pp. 173-187, June 1995.
[7] T.K. Callaway and E.E. Swartzlander, "Low power arithmetic components," in Low Power Design Methodologies, J.M. Rabaey and M. Pedram, Eds. Boston, MA: Kluwer, 1996.
[8] E. Swartzlander and A. Alexopoulos, "The sign/logarithm number system," IEEE Trans. Computers, vol. 24, pp. 1238-1242, Dec. 1975.
[9] R.E. Morley, Jr., G.L. Engel, T.J. Sullivan, and S.M. Natarajan, "VLSI based design of a battery-operated digital hearing aid," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, pp. 2512-2515, 1988.
[10] J.R. Sacha and M.J. Irwin, "The logarithmic number system for strength reduction in adaptive filtering," in Proc. Int. Symp. Low-Power Electronics and Design (ISLPED'98), Monterey, CA, 1998, pp. 256-261.
[11] V. Paliouras and T. Stouraitis, "Signal activity and power consumption reduction using the Logarithmic Number System," in Proc. 2001 IEEE Int. Symp. Circuits and Systems (ISCAS), vol. 2, pp. II-653-II-656, 2001.
[12] V. Paliouras and T. Stouraitis, "Low-power properties of the Logarithmic Number System," in Proc. 15th Symp. Computer Arithmetic (ARITH15), 2001.
[13] N. Szabó and R. Tanaka, Residue Arithmetic and its Applications to Computer Technology. New York: McGraw-Hill, 1967.
[14] W.L. Freking and K.K. Parhi, "Low-power FIR digital filters using residue arithmetic," in Proc. 31st Asilomar Conference on Signals, Systems, and Computers, vol. 1, pp. 739-743, 1997.
[15] W.A. Chren, Jr., "One-hot residue coding for low delay-power product CMOS design," IEEE Trans. Circuits Syst. II, vol. 45, pp. 303-313, March 1998.
[16] M.A. Soderstrand, W.K. Jenkins, G.A. Jullien, and F.J. Taylor, Residue Number Arithmetic: Modern Applications in Digital Signal Processing. Piscataway, NJ: IEEE Press, 1986.
[17] M.K. Ibrahim, "Novel digital filter implementations using hybrid RNS-binary arithmetic," Signal Processing, vol. 40, no. 2-3, pp. 287-294, 1994.
[18] A. D'Amora, A. Nannarelli, M. Re, and G.C. Cardarilli, "Reducing power dissipation in complex digital filters by using the Quadratic Residue Number System," in Proc. 34th Asilomar Conference on Signals, Systems, and Computers, 2000.
