
CDA 3101: Introduction to Computer Hardware and

Organization
Supplementary Notes
Charles N. Winton
Department of Computer and Information Sciences
University of North Florida
Jacksonville, FL 32224-2645
Levels of organization of a computer system:
a) Electronic circuit level
b) Logic level - combinational logic*, sequential logic*,
register-transfer logic*
c) Programming level - microcode programming*,
machine/assembly language programming, high-level
language programming
d) Computer systems level - systems hardware* (basic
hardware architecture and organization - memory, CPU
(ALU, control unit), I/O, bus structures), systems
software, application systems
* topics discussed in these notes
Objectives:
Understand computer organization and component logic
Boolean algebra and truth table logic
Integer arithmetic and implementation algorithms
IEEE floating point standard and floating point algorithms
Register construction
Memory construction and organization
Register transfer logic
CPU organization
Machine language instruction implementation
Develop a foundation for
Computer architecture
Microprocessor interfacing
System software
Sections:
combinational logic
sequential logic
computer architecture

2005

Contents
Section I - Logic Level: Combinational Logic.................................................... 1
Table of binary operations .................................................................. 3
Graphical symbols for logic gates ........................................................... 4
Representing data ........................................................................... 6
2s complement representation .......................................................... 10
Gray code .............................................................................. 15
Boolean algebra ............................................................................ 16
Canonical forms ............................................................................ 22
Σ and Π notations ...................................................... 23
NAND-NOR conversions ....................................................................... 23
Circuit analysis ........................................................................... 25
Circuit simplification: K-maps ............................................................. 25
Circuit design ............................................................................. 33
Gray to binary decoder ..................................................................... 35
BCD to 7-segment display decoder ........................................................... 36
Arithmetic circuits ........................................................................ 39
AOI gates .................................................................................. 42
Decoders/demultiplexers .................................................................... 43
Multiplexers ............................................................................... 44
Comparators ................................................................................ 46
Quine-McCluskey procedure .................................................................. 48
Section II - Logic Level: Sequential Logic..................................................... 50
Set-Reset (SR) latches ..................................................................... 51
Edge-triggered flip-flops .................................................................. 54
An aside about electricity ................................................................. 56
(Ohm's Law, resistor values, batteries, AC)
D-latches and D flip-flops ................................................................. 58
T flip-flops and JK flip-flops ............................................................. 60
Excitation controls ........................................................................ 61
Registers .................................................................................. 64
Counters ................................................................................... 65
Sequential circuit design - finite state automata .......................................... 66
Counter design ............................................................................. 70
Moore and Mealy circuits ................................................................... 72
Circuit analysis ........................................................................... 72
Additional counters ........................................................................ 74
Barrel shifter ............................................................................. 77
Glitches and hazards ....................................................................... 78
Constructing memory ........................................................................ 83
International Unit Prefixes (base 10) ...................................................... 88
Circuit implementation using ROMs .......................................................... 89
Hamming Code ............................................................................... 93
Section III - Computer Systems Level........................................................... 96
Representing numeric fractions ............................................................. 96
IEEE 754 Floating Point Standard ........................................................... 98
Register transfer logic ................................................................... 101
Register transfer language (RTL) .......................................................... 102
UNF RTL .................................................................................. 106
Signed multiply architecture and algorithm ................................................ 112
Booth's method ............................................................ 114
Restoring and non-restoring division ...................................................... 117
Implementing floating point using UNFRTL .................................................. 125
Computer organization ..................................................................... 128
Control unit .............................................................................. 129
Arithmetic and Logic unit ................................................................. 129
CPU registers ............................................................................. 130
Single bus CPU organization ............................................................... 131
Microcode signals ......................................................................... 132
Microprograms ............................................................................. 134
Branching ................................................................................. 136
Microcode programming ..................................................................... 137
Other machine language instructions ....................................................... 137
Index register ............................................................................ 140
Simplified Instructional Computer (SIC) ................................................... 143
Architectural enhancements ................................................................ 144
CPU-memory synchronization ................................................................ 146
Inverting microcode ....................................................................... 148
Vertical microcode ........................................................................ 149
Managing the CPU and peripheral devices ................................................... 149
The Z80 ................................................................................... 152

Page 1
Logic level: Combinational Logic
Combinational logic is characterized by functional specifications
using only binary valued inputs and binary valued outputs

    r input variables        +---------------+        s output variables
         X  ── . . . ──────► | combinational | ────── . . . ──►  Z
                             |     logic     |
                             +---------------+
                       Z = f(X)  (Z is a function of X)
Remark: for given values of r and s, the number of possible functions
is finite since both the domain and the range of the functions are
finite, of size 2^r and 2^s respectively (this is because the r input
variables and the s output variables assume only the binary values 0
and 1). Although finite, it is worth noting that in practice the
number of functions is usually quite large.
For example, for r = 5 input variables and s = 1 output variable, the
domain consists of the 2^5 = 32 possible input combinations of the
two binary input values 0 and 1. To specify a function, each of these
32 possible input combinations must be assigned a value in the range,
which consists of the two binary output values 0 and 1. This yields
2^32 ≈ 4 billion such functions of 5 variables!
In general, with r input variables and s output variables, the domain
consists of the k = 2^r combinations of the binary input values, and
the range consists of the j = 2^s combinations of the binary output
values. To specify a function, each of the k input combinations must
be assigned 1 of the j possible values in the range. Since there are
j^k possible ways to do this, there are j^k functions having r inputs
and s outputs.
Each such function corresponds to a logic circuit having r
(binary-valued) inputs and s (binary-valued) outputs.
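This counting argument can be checked by brute-force enumeration for
small r and s. A minimal Python sketch (ours, for illustration; the
function name is not from the notes):

```python
from itertools import product

def count_functions(r, s):
    """Count functions from r binary inputs to s binary outputs by
    enumerating truth tables: one of j outputs for each of k inputs."""
    k = 2 ** r                      # domain size (input combinations)
    j = 2 ** s                      # range size (output combinations)
    return sum(1 for _ in product(range(j), repeat=k))   # j**k tables

print(count_functions(2, 1))   # the 16 binary operations
```

For r = 5, s = 1 the count is already 2^32, which is why enumeration
is only practical for very small circuits.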
When r = 2 input variables and s = 1 output variable, there are 2^4 = 16
possible functions (circuits), each having the basic appearance

    X ───┐
         ├──── Z = f(X,Y)
    Y ───┘

Recall that functions of 2 variables are called binary operations. For


the usual algebra of numbers these include the familiar operations of
addition, subtraction, multiplication, and division and as many more
as we might care to define.

Page 2
For circuit logic, the input variables are restricted to the values 0
and 1, so there are only 4 possible input combinations of X and Y,
yielding exactly 16 possible binary operations. The corresponding
logic circuits provide fundamental building blocks for more complex
logic circuits. Such fundamental circuits are termed logic gates.
Since there are only 16 of them, they can be listed out - see
overleaf. They are named for ease of reference and to reflect common
terminology.
It should be noted that some of the binary operations are "degenerate."
In particular,
Zero(X,Y) and One(X,Y) depend on neither X nor Y to determine their
output;
X(X,Y) and NOT X(X,Y) have output determined strictly by X;
Y(X,Y) and NOT Y(X,Y) have output determined strictly by Y.
X and NOT X operations (or Y and NOT Y, for that matter) are usually
thought of as unary operations (functions of 1 variable) rather than
degenerate binary operations. As unary operations they are
respectively termed the "identity" and the "complement".

Page 3

TABLE OF BINARY OPERATIONS

Each operation is listed with its outputs for the four input
combinations (X,Y) = (0,0), (0,1), (1,0), (1,1):

    Operation                        00 01 10 11
    Zero                              0  0  0  0
    AND        (X · Y)                0  0  0  1
    Inhibit X on Y=1  (X · Y')        0  0  1  0
    X                                 0  0  1  1
    Inhibit Y on X=1  (X' · Y)        0  1  0  0
    Y                                 0  1  0  1
    XOR        (X ⊕ Y)                0  1  1  0
    OR         (X + Y)                0  1  1  1
    NOR        (X ↓ Y)                1  0  0  0
    COINC      (X ⊙ Y)                1  0  0  1
    NOT Y                             1  0  1  0
    X + Y'  (implication)             1  0  1  1
    NOT X                             1  1  0  0
    X' + Y  (implication)             1  1  0  1
    NAND       (X ↑ Y)                1  1  1  0
    One                               1  1  1  1
Page 4
The complement (or NOT) is designated by an overbar; e.g., X̄ is the
complement of X (also written X' in linear text).
The other most commonly employed binary operations for combinational
logic also have notational designations; e.g.,

    AND          is designated by · ,  e.g., X · Y
    OR           is designated by + ,  e.g., X + Y
    NAND         is designated by ↑ ,  e.g., X ↑ Y
    NOR          is designated by ↓ ,  e.g., X ↓ Y
    XOR          is designated by ⊕ ,  e.g., X ⊕ Y
    COINCIDENCE  is designated by ⊙ ,  e.g., X ⊙ Y

Note that if we form the simple composite function f' (NOT f, or the
complement of f), then

    f'(X) = (f(X))'   and   (f')' = f

Moreover,

    X ↑ Y = (X · Y)'   (NAND = NOT AND) - Sheffer stroke
    X ↓ Y = (X + Y)'   (NOR = NOT OR) - Peirce arrow
    X ⊙ Y = (X ⊕ Y)'   (COINC = complement of XOR)
In particular, NAND and AND, OR and NOR, XOR and COINC are
respectively complementary in the sense that each is respectively the
complement of the other.
Rather than use a general graphical "logic gate" designation

    X ───┐
         ├──── Z = f(X,Y)
    Y ───┘

ANSI (American National Standards Institute) has standardized on the
following graphical symbols for the most commonly used logic gates:

    [Figure: ANSI distinctive-shape symbols for AND (·), OR (+),
     NAND (↑), NOR (↓), XOR (⊕), COINC (⊙), and NOT]

Page 5
Composite functions such as f(g(x)) can be easily represented using
these symbols; e.g., consider the composite

    f(A,B,C,D) = ((A · B') · C) ⊕ ((A ⊕ C) ↓ D')

This is easily represented as a 3-level circuit diagrammed by:

    [Figure: 3-level circuit realizing f(A,B,C,D) from inputs
     A, B, C, D, with a NOT gate on input B]

The level of a circuit is the maximal number of gates an input signal
has to travel through to establish the circuit output. Normally, both
an input signal and its inverse are assumed to be available, so the
NOT gate on B does not count as a 4th level for the circuit.
Note that the behavior of the above circuit can be totally
determined by evaluating its behavior for each possible input
combination (we'll return to determining its values later):
    A B C D | f(A,B,C,D)
    0 0 0 0 |    0
    0 0 0 1 |    1
    0 0 1 0 |    0
    0 0 1 1 |    0
    0 1 0 0 |    0
    0 1 0 1 |    1
    0 1 1 0 |    0
    0 1 1 1 |    0
    1 0 0 0 |    0
    1 0 0 1 |    0
    1 0 1 0 |    1
    1 0 1 1 |    0
    1 1 0 0 |    0
    1 1 0 1 |    0
    1 1 1 0 |    0
    1 1 1 1 |    1

Note that this table provides an exhaustive specification of the logic


circuit more compactly given by the above algebraic expression for f.
Its form corresponds to the "truth" tables used in symbolic logic. For
small circuits, the truth table form of specifying a logic function is
often used.
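A table like this can be produced mechanically by evaluating the
expression at every input combination. As a sketch, here is a Python
function implementing one 3-level expression that reproduces the
table above (a linear rendering; primes denote complements):

```python
def f(A, B, C, D):
    # level 1: A AND (NOT B);  A XOR C
    # level 2: AND with C;     NOR with (NOT D)
    # level 3: XOR of the two halves
    left = (A & (1 - B)) & C
    right = 1 - ((A ^ C) | (1 - D))   # NOR of (A XOR C) and NOT D
    return left ^ right

# Enumerate all 16 input combinations, D varying fastest:
table = [f(A, B, C, D)
         for A in (0, 1) for B in (0, 1) for C in (0, 1) for D in (0, 1)]
print(table)
```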
The inputs to a logic circuit typically represent data values encoded
in a binary format as a sequence of 0's and 1's. The encoding scheme
may be selected to facilitate manipulation of the data. For example,
if the data is numeric, it is normally encoded to facilitate
performing arithmetic operations. If the data is alphabetic

Page 6
characters, it may be encoded to facilitate operations such as
sorting. There are also encoding schemes to specifically facilitate
effective use of the underlying hardware. A single input line is
normally used to provide a single data bit of information to a logic
circuit, representing the binary value 0 or 1. At the hardware
level, 0 and 1 are typically represented by voltage levels; e.g., 0 by
voltage L ("low") and 1 by voltage H ("high"). For the TTL
(Transistor-Transistor Logic) technology, H = +5V and L = 0V (H is
also referenced as Vcc, the collector supply voltage, and L as GND or
"ground").
Representing Data
There are three fundamental types of data that must be considered:

logical data (the discrete truth values - True and False)


numeric data (the integers and real numbers)
character data (the members of a defined finite alphabet)

Logical data representation:


There is no imposed standard for representing logical data in computer
hardware and software systems, but a single data bit is normally used
to represent a logical data item in the context of logic circuits,
with "True" represented by 1 and "False" by 0. This is the
representation implicitly employed in the earlier discussion of
combinational logic circuits, which are typically implementations of
logic functions described via the mechanisms of symbolic logic. If the
roles of 0 and 1 are reversed (0 representing True and 1 representing
False), then the term negative logic is used to emphasize the change
in representation for logical data.
Numeric data:
The two types of numeric data,

integers
real numbers

are represented very differently. The representation in each case must


deal with the fact that a computing environment is inherently finite.
Integers:
When integers are displayed for human consumption we use a "base"
representation. This requires us to establish characters which
represent the base digits. Since we have ten fingers, the natural
human base is ten and the Arabic characters
0, 1, 2, 3, 4, 5, 6, 7, 8, 9
are used to represent the base digits. Since logic circuits deal with
binary inputs (0 or 1), the natural base in this context is two. Rather
than invent new characters, the first two base ten characters (0 and 1)

Page 7
are used to represent the base two digits. Any integer can be
represented in any base, so long as we have a clear understanding of
which base is being used and know what characters represent its digits.
For example, 19₁₀ indicates a base ten representation of nineteen. In
base two it is represented by 1 0 0 1 1₂. When dealing with different
bases, it is important to be able to convert from the representation in
one base to that of the other. Note that it is easy to convert from
base 2 to base 10, since each base 2 digit can be thought of as
indicating the presence or absence of a power of 2:

    1 0 0 1 1₂ = 1·2^4 + 0·2^3 + 0·2^2 + 1·2^1 + 1·2^0
               = 16 + 0 + 0 + 2 + 1 = 19₁₀ = 1·10^1 + 9·10^0

A conversion from base 10 to base 2 is more difficult but still
straightforward. It can be handled "bottom-up" by repeated division by
2 until a quotient of 0 is reached, the remainders determining the
powers of 2 that are present:
    19/2 = 9  R 1    (2^0 is present)
     9/2 = 4  R 1    (2^1 is present)
     4/2 = 2  R 0    (2^2 is not present)
     2/2 = 1  R 0    (2^3 is not present)
     1/2 = 0  R 1    (2^4 is present)

The conversion can also be handled "top-down" by iteratively
subtracting out the highest power of 2 present until a difference of 0
is reached:

    19 - 16 = 3    (1)    (16 = 2^4 is present, so remove 16)
    no 8's         (0)    ( 8 = 2^3 is not present in what's left)
    no 4's         (0)    ( 4 = 2^2 is not present)
     3 -  2 = 1    (1)    ( 2 = 2^1 is present, so remove 2)
     1 -  1 = 0    (1)    ( 1 = 2^0 is present in what's left)
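Both conversion schemes are easy to mechanize. A Python sketch of the
bottom-up method (repeated division by 2; the function name is ours):

```python
def to_base_2(n):
    """Bottom-up conversion: repeated division by 2; the remainders
    are the base 2 digits, least significant first."""
    if n == 0:
        return [0]
    bits = []
    while n > 0:
        n, r = divmod(n, 2)
        bits.append(r)          # remainder marks the next power of 2
    return bits[::-1]           # most significant bit first

print(to_base_2(19))   # [1, 0, 0, 1, 1]
```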

Bases which are powers of 2 are particularly useful for representing


binary data since it is easy to convert to and from among them. The
most commonly used are base 8 (octal) which uses as base digits
0, 1, 2, 3, 4, 5, 6, 7
and base 16 (hexadecimal) which uses as base digits
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
where A, B, C, D, E, F are the base digits for ten, eleven, twelve,
thirteen, fourteen, and fifteen. An n-bit binary item can easily be
viewed in the context of any of base 2, base 8, or base 16 simply by
appropriately grouping the bits; for example, the 28-bit binary item

Page 8
      1    4     6     5     3     4     0     0     2     4      (octal digits)
      1 | 100 | 110 | 101 | 011 | 100 | 000 | 000 | 010 | 100

      1100 | 1101 | 0101 | 1100 | 0000 | 0001 | 0100
       C      D      5      C      0      1      4                (hex digits)

is easily seen to be 1465340024₈ = CD5C014₁₆ when the bits are grouped
as indicated (using a calculator that handles base conversions, you can
determine that the base ten value is 215334932₁₀; note that such
calculators are typically limited to ten base 2 digits, but handle 8
hexadecimal digits, effectively extending the range of the calculator
to 32 bits when the hexadecimal digits are viewed as 4-bit chunks).
Since it is easier to read a string of hexadecimal (hex) digits than a
string of 0's and 1's, and the conversion to and from base 16 is so
straightforward, digital information of many bits is frequently
displayed using hex digits (or sometimes octal, particularly for older
equipment).
Since digital circuits generally are viewed as processing binary data,
a natural way to encode integers for the use by such circuits is to
use fixed blocks of n bits each; in particular, 32-bit integers are
commonly used (i.e., n = 32). In general, an n-bit quantity may be
viewed as naturally representing one of the 2^n integers in the range
[0, 2^n - 1] in its base 2 form. For example, for n = 5, there are
2^5 = 32 such numbers. The 5-bit representations of these numbers in
base 2 form are

    0 0 0 0 0₂ =  0₁₀
    0 0 0 0 1₂ =  1₁₀
    . . .
    1 1 1 1 1₂ = 31₁₀
Note that as listed, the representation does not provide for negative
numbers. One strategy to provide for negative numbers is to mimic the
"sign-magnitude" approach normally used in everyday base 10
representation of integers. For example, -273₁₀ explicitly exhibits
as separate entries the sign and the magnitude of the number. A
sign-magnitude representation strategy could use the first bit to
represent the sign (0 for +, 1 for -). While perhaps satisfactory for
everyday paper and pencil use, this strategy has awkward
characteristics that weigh against it. First of all, the operation of
subtraction is algorithmically vexing even for base 10 paper and
pencil exercises. For example, the subtraction problem

      23₁₀
    - 34₁₀

is typically handled not by subtracting 34₁₀ from 23₁₀, but by first
subtracting 23₁₀ from 34₁₀, exactly the opposite of what the problem is
asking for! Even worse, 0₁₀ is represented twice (e.g., when n = 5, 0₁₀
is represented by both 0 0 0 0 0 and 1 0 0 0 0). Conceptually, the
subtraction problem above can be viewed as the addition problem
23₁₀ + (-34₁₀). However, adding the corresponding sign-magnitude
Page 9
representations as base 2 quantities will yield an incorrect result in
many cases. Since numeric data is typically manipulated
computationally, the representation strategy should facilitate, rather
than complicate, the circuitry designed to handle the data
manipulation. For these reasons, when n bits are used, the resulting
2^n binary combinations are viewed as representing the integers modulo
2^n, which inherently provides for negative integers and well-defined
arithmetic (modulo 2^n).
The last statement needs some explanation. First observe that in
considering the number line

    ... -2^31 ...  -2  -1  0  1  2  ...  2^31-1 ...
truncation of the binary representation for any non-negative integer i
to n bits results in i mod 2^n. Note that an infinite number of
non-negative integers (precisely 2^n apart from each other) truncate to
a given particular value in the range [0, 2^n - 1]; i.e., there are 2^n
such groupings, corresponding to 0, 1, 2, . . . , 2^n - 1. Negative
integers can be included in each grouping simply by taking integers 2^n
apart without regard to sign. These groupings are called the "residue
classes modulo 2^n". Knowing any member of a residue class is
equivalent to knowing all of them (just adjust up or down by multiples
of 2^n to find the others, or for non-negative integers truncate the
base 2 representation at n bits to find the value in the range
[0, 2^n - 1]). In other words, the 2^n residue classes represented by
0, 1, 2, ..., 2^n - 1 provide a (finite) algebraic system that inherits
its algebraic properties from the (infinite) integers, which justifies
the viewpoint that this is a natural way to represent integer data in
the context of a finite environment. Note that negative integers are
implicitly provided for algebraically, since each algebraic entity
(residue class) has an inverse under addition. For example, with n = 5,
adding the mod 2^5 residue classes for 7₁₀ and 25₁₀ yields

    [25₁₀] + [7₁₀] = [32₁₀] = [0₁₀] , so [25₁₀] = [-7₁₀]

Returning to the computing practice point of view of identifying the
residue classes with the 5-bit representations of 0, 1, 2, ..., 2^5 - 1
in base 2 form, the calculation becomes

    1 1 0 0 1₂ + 0 0 1 1 1₂ = 0 0 0 0 0₂ (truncated to 5 bits).

The evident extension of this observation is that n-bit base 2
addition conforms exactly to addition modulo 2^n, a fact that lends
itself to circuit implementation.
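In code, "truncation to n bits" is just a mask, so residue-class
addition is ordinary addition followed by a mask. A Python sketch for
n = 5 (an illustration, not from the notes):

```python
N = 5
MASK = 2 ** N - 1   # keep only the low 5 bits (addition modulo 2**5)

def add_mod(a, b):
    """n-bit base 2 addition: ordinary addition truncated to n bits."""
    return (a + b) & MASK

# [25] + [7] = [32] = [0], so [25] acts as [-7] modulo 2**5
assert add_mod(25, 7) == 0
assert add_mod(0b11001, 0b00111) == 0b00000
print(add_mod(25, 7))
```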
Again referring to the number line

    ... -16 ...  -2  -1  0  1  2  ...  15  16  ...  31  32 ...

consider for n = 5 the following table exhibiting in base 10 the 32
residue classes modulo 2^5. Each residue class is matched to the 5-bit

Page 10
representation corresponding to its base value in the range 0, 1, 2,
..., 31:
    residue class                                 5-bit representation
    { . . . , -32,  0, 32, . . . } = [0]               0 0 0 0 0
    { . . . , -31,  1, 33, . . . } = [1]               0 0 0 0 1
    { . . . , -30,  2, 34, . . . } = [2]               0 0 0 1 0
    . . .
    { . . . , -17, 15, 47, . . . } = [15]              0 1 1 1 1
    { . . . , -16, 16, 48, . . . } = [16] = [-16]      1 0 0 0 0
    . . .
    { . . . ,  -2, 30, 62, . . . } = [30] = [-2]       1 1 1 1 0
    { . . . ,  -1, 31, 63, . . . } = [31] = [-1]       1 1 1 1 1
Evidently, the 5-bit representations with a leading 0, viewed as base 2
integers, best represent the integers 0, 1, ..., 15. The 5-bit
representations with a leading 1 best represent -16, -15, ..., -2, -1.
This representation is called the 5-bit 2's complement representation.
It provides for 0, 15 positive integers, and 16 negative integers.
Since data normally originates in sign-magnitude form, an easy means is
needed to convert to/from the sign-magnitude form.
An examination of the table leads to the conclusion that finding the
magnitude for a negative value in 5-bit 2's complement form can be
accomplished by subtracting from 32 (1 0 0 0 0 0₂) and truncating the
result. In general, this follows from the mod 2^5 residue class
equivalences

    -[i] = [-i] + [0₁₀] = [-i] + [32₁₀] = [-i + 32₁₀] = [32₁₀ - i]

which demonstrate that subtracting from 32 and truncating the result
will always result in the representation for -i; -i is called the 2's
complement of i. One way to subtract from 32 is to subtract from
1 1 1 1 1 (which is 31) and then add 1 (all in base 2). This is
equivalent to inverting each bit and then adding 1 (in base 2) to the
overall result. There is nothing special in this discussion that
requires 5 bits; i.e., the same rationale is equally applicable to an
n-bit environment. Hence, in general, to find the 2's complement of
an integer represented in n-bit 2's complement form, invert its bits
and add 1 (in base 2).
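The invert-and-add-1 rule translates directly into bitwise
operations. A Python sketch for n = 8 (the helper name is ours):

```python
N = 8
MASK = 2 ** N - 1

def twos_complement(x):
    """Invert the n bits of x and add 1 (modulo 2**n)."""
    return ((x ^ MASK) + 1) & MASK

# The 8-bit 2's complement representation of -37:
print(format(twos_complement(37), "08b"))
# Complementing twice recovers the original value:
assert twos_complement(twos_complement(37)) == 37
```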
Example 1: Determine the 8-bit 2's complement representation of -37₁₀.
First, the magnitude of -37₁₀ is given by

    37₁₀ = 1 0 0 1 0 1₂

which is

    0 0 1 0 0 1 0 1 in 8-bit 2's complement form.

The representation for -37₁₀ is then given by the 2's complement of
37₁₀, obtained by inverting the bits of the 8-bit representation of
the magnitude and adding 1; i.e.,

Page 11

      1 1 0 1 1 0 1 0
    + 0 0 0 0 0 0 0 1
    = 1 1 0 1 1 0 1 1

so -37₁₀ = 1 1 0 1 1 0 1 1 in 8-bit 2's complement form.

Example 2: Determine the (base 10) value of the 9-bit 2's complement
integers

    i = 0 0 0 0 1 1 0 1 1
    j = 1 1 1 0 1 1 0 1 0
    s = i + j

For i, since the lead bit is 0, the sign is + and the magnitude of
the number is directly given by its representation as a base 2
integer; i.e., i = 27₁₀.
For j, since the lead bit is 1, the number is negative, so its
magnitude is given by -j. Inverting j's bits and adding 1 gives

      0 0 0 1 0 0 1 0 1
    + 0 0 0 0 0 0 0 0 1
    = 0 0 0 1 0 0 1 1 0 = 38₁₀ = -j (j's magnitude);

i.e., j = -38₁₀.
i + j (which we now know is -11₁₀) can be computed directly using
ordinary base 2 addition modulo 2^9; i.e.,

    i:     0 0 0 0 1 1 0 1 1   =  27₁₀
    j:   + 1 1 1 0 1 1 0 1 0   = -38₁₀
    i+j:   1 1 1 1 1 0 1 0 1   = -11₁₀

Example 2 illustrates that only circuitry for base 2 addition needs to


be developed to perform addition and subtraction on integers
represented in n-bit 2's complement form.
Historically, a variation closely related to n-bit 2's complement,
namely n-bit 1's complement, has also been used for integer
representation in computing devices. The 1's complement of an n-bit
block of 0's and 1's is obtained by inverting each bit. For this
representation, arithmetic still requires only addition, but whenever
there is a carry out of the sign position (and no overflow has
occurred), 1 must be added to the result (a so-called "end-around
carry", something easily achieved at the hardware level). For
example, in 8-bit 1's complement

     38₁₀ =     0 0 1 0 0 1 1 0
    -27₁₀ =   + 1 1 1 0 0 1 0 0
            1   0 0 0 0 1 0 1 0
          +                   1   (end-around carry of the carry-out)
     11₁₀ =     0 0 0 0 1 0 1 1
Note that the end-around carry is only used when working in 1's
complement.
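The end-around carry can be sketched in Python as follows (an
illustration of the arithmetic, not of the hardware):

```python
N = 8
MASK = 2 ** N - 1

def ones_complement_add(a, b):
    """Add two n-bit 1's complement values; a carry out of the sign
    position is added back into the result (end-around carry)."""
    total = a + b
    if total > MASK:                 # carry out of the top bit
        total = (total & MASK) + 1   # end-around carry
    return total & MASK

x = 0b00100110                  # 38 in 8-bit 1's complement
y = 0b11100100                  # -27 (bits of 27 = 00011011 inverted)
print(format(ones_complement_add(x, y), "08b"))
```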
Integers do not have to be represented in n-bit blocks. Another
representation format is Binary Coded Decimal (BCD), where each

Page 12
decimal digit of the base 10 representation of the number is
separately represented using its 4-bit binary (base 2) form.
The 4-bit forms are

    0 = 0 0 0 0
    1 = 0 0 0 1
    2 = 0 0 1 0
    . . .
    9 = 1 0 0 1

so in BCD, 27 is represented in 8 bits by

    0 0 1 0 | 0 1 1 1
      2         7

and 183 is represented in 12 bits by

    0 0 0 1 | 1 0 0 0 | 0 0 1 1
      1         8         3
BCD is obviously a base 10 representation strategy. It has the


advantage of being close to a character representation form (discussed
below). When used in actual implementation, it is employed in
sign-magnitude form (the best known of which is IBM's packed decimal
form, which maintains the sign in conjunction with the last digit to
accommodate the fact that the number of bits varies from number to
number). Since there is no clear choice as to how to represent the
sign, we will not address the sign-magnitude form further in the
context of discussing BCD. It is possible to build BCD arithmetic
circuitry, but it is more complex than that used for 2's complement.
The arithmetic difficulties associated with BCD can easily be seen by
considering what happens when two decimal digits are added whose sum
exceeds 9.
For example, adding 9 and 4 using ordinary base 2 yields

      1 0 0 1  =  9
    + 0 1 0 0  =  4
      1 1 0 1  = 13

which differs from 0 0 0 1 0 0 1 1 (digits 1 | 3), which is 13 in BCD.
Achieving the correct BCD result from the base 2 result requires
adding a correction (+6₁₀ = 0 1 1 0₂); e.g.,

            1 1 0 1
        +   0 1 1 0
    = 0 0 0 1 0 0 1 1 = 13 in BCD (digits 1 | 3).
In general, a correction of 6 is required whenever the sum of the two
digits exceeds 9. Hence, the circuitry has to allow for the fact that

Page 13
sometimes a correction factor is required and sometimes not.
Since BCD representation is normally handled using sign-magnitude,
subtraction is an added problem to cope with.
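The digit-level correction described above can be sketched in Python
(a hypothetical helper, handling a single digit position):

```python
def bcd_add_digit(a, b):
    """Add two BCD digits using binary addition, applying the +6
    correction when the binary sum exceeds 9."""
    s = a + b
    if s > 9:
        s += 6                       # correction factor
    return (s >> 4) & 0xF, s & 0xF   # (carry digit, sum digit)

print(bcd_add_digit(9, 4))   # (1, 3): 9 + 4 = 13 in BCD
```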

Real numbers:
Real numbers are normally represented in a format deriving from the
idea of the decimal expansion, which is used in paper and pencil
calculations to provide rational approximations to real numbers (this
is termed a "floating point" representation, since the base point
separating the integer part from the fractional part may shift as
operations are performed on the number). There is a defined standard
for representing real numbers, the IEEE 754 Floating Point Standard,
whose discussion will be deferred until later due to its complexity.
An alternate representation for real numbers is to fix the number of
allowed places after the base point (a so-called "fixed point"
representation) and use integer arithmetic. Since the number of
places is fixed, the base point does not need to be explicitly
represented (i.e., it is an "implied base point"). The result of
applying arithmetic operations such as multiplication and division
typically requires the use of additional (hidden) positions after the
base point to accurately represent the result since a fixed point
format truncates any additional positions resulting from
multiplication or division. For this reason precision is quickly
lost, further limiting the practicality of using this format.
Character representation:
Character data is defined by a finite set, its alphabet, which
provides the character domain. The characters of the alphabet are
represented as binary combinations of 0's and 1's. If 7 (ordered) bits
are used, then the 7 bits provide 128 different combinations of 0's
and 1's. Thus 7 bits provide encodings for an alphabet of up to 128
characters. If 8 bits are employed, then the alphabet may have as
many as 256 characters. There are two defined standards in use in this
country for representing character data:
ASCII (American Standard Code for Information Interchange)
EBCDIC (Extended Binary Coded Decimal Interchange Code).
ASCII has a 7-bit base definition, and an 8-bit extended version
providing additional graphics characters. (table page 21)
In each case the standard prescribes an alphabet and its
representation. Both standards have representation formats that make
conversion from character form to BCD easy (for each character
representing a decimal digit, the last 4 bits are its BCD
representation). The representation is chosen so that when viewed in
numeric ascending order, the corresponding characters follow the
desired ordering for the defining alphabet, which means a numeric sort
procedure can also be used for character sorting needs. Since
character strings typically encompass many bits, character data is
usually represented using hex digits rather than binary.

Page 14
For example, the text string "CDA 3101" is represented by
C 3 C 4 C 1 4 0 F 3 F 1 F 0 F 1
in EBCDIC
|
|
|
|
|
|
|
|
C
D
A spc 3
1
0
1
|
|
|
|
|
|
|
|
|
4 3 4 4 4 1 2 0 3 3 3 1 3 0 3 1 in ASCII (or ASCII-8).
|

and

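The ASCII half of this example can be checked in Python (a sketch; `ord` and `format` stand in for the code table):

```python
# Hex codes of the ASCII encoding of "CDA 3101".
text = "CDA 3101"
codes = [format(ord(ch), "02X") for ch in text]
print(" ".join(codes))          # 43 44 41 20 33 31 30 31

# For each decimal-digit character, the low nibble is its BCD value.
bcd = [format(ord(ch) & 0x0F, "04b") for ch in text if ch.isdigit()]
print(bcd)                      # ['0011', '0001', '0000', '0001']
```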
Since characters are the most easily understood measure for data
capacity, an 8-bit quantity is termed a byte of storage and data
storage capacities are given in bytes rather than bits or some other
measure. 2^10 = 1024 bytes is called a K-byte, 2^20 = 1,048,576 bytes is
called a megabyte, 2^30 bytes is called a gigabyte, 2^40 bytes is called a
terabyte, and so forth.
Other representation schemes:
BCD is an example of a weighted representation scheme that utilizes
the natural weighting of the binary representation of a number; i.e.,
   w3·d3 + w2·d2 + w1·d1 + w0·d0
where the digits di are just 0 or 1 and the weights are w3=8, w2=4,
w1=2, w0=1. Since only 10 of the possible 16 combinations are used, d3
is 0 for all but 2 cases (8 and 9). A variation uses w3=2 to form what
is known as "2421 BCD": d3=0 for 0,1,2,3,4 and d3=1 for 5,6,7,8,9. A
major advantage over regular BCD is that the code is "self-complementing"
in the sense that flipping the bits produces the 9's complement.
Example: subtraction by using addition
a subtraction such as 654 - 470 is awkward because of the need to
borrow. The computation can be done by using addition if you think
in terms of
654+(999-470)-999 = 654+529-999 = 1183-999 = 183+1 = 184.
999-470 = 529 is called the "9's complement" of 470, so the
algorithm to do a subtraction A-B is
1. form the 9's complement (529) of the subtrahend B (470)
2. add it to the minuend A (654)
3. discard the carry and add 1 (corresponding to the end-around
carry of 1's complement)
Note that no subtraction circuitry is needed, but the technique does
need an easy way to get the 9's complement.
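The three steps can be sketched in Python (hypothetical helper names; works for A >= B with a fixed number of digits):

```python
def nines_complement(n, digits=3):
    # e.g. 999 - 470 = 529
    return (10 ** digits - 1) - n

def subtract(a, b, digits=3):
    # A - B via 9's complement: add, discard the carry, add 1 (end-around).
    total = a + nines_complement(b, digits)      # 654 + 529 = 1183
    carry, rest = divmod(total, 10 ** digits)    # carry 1, rest 183
    return rest + carry

print(subtract(654, 470))   # 184
```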
With 2421 BCD, 470 = 0100 1101 0000 and the 9's complement of 470 is
529 = 1011 0010 1111
Addition is still complicated, as can be seen by adding 6+5, which is
1100 + 1011 = 0111 with carry 1 (i.e., ordinary binary addition fails).
A final BCD code, "excess-3 BCD", is also self-complementing. It is
simply ordinary BCD + 3, so for the above example,
with excess-3, 470 = 0111 1010 0011 and the 9's complement of 470 is
529 = 1000 0101 1100.
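Both self-complementing codes can be checked digit by digit (a sketch; the 2421 table below is the variant described above):

```python
# excess-3: ordinary BCD + 3; 2421: weighted code with weights 2,4,2,1.
excess3 = {d: format(d + 3, "04b") for d in range(10)}
code2421 = {0: "0000", 1: "0001", 2: "0010", 3: "0011", 4: "0100",
            5: "1011", 6: "1100", 7: "1101", 8: "1110", 9: "1111"}

def flip(bits):
    # complement every bit of the 4-bit code
    return "".join("1" if b == "0" else "0" for b in bits)

for table in (excess3, code2421):
    for d in range(10):
        assert flip(table[d]) == table[9 - d]   # 9's complement by bit flip
print("both codes are self-complementing")
```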

Page 15
The lesson to learn is that codes must be formulated to represent data
in a computer, and different representations are employed for
different purposes; e.g.,
2's complement is a number representation that facilitates
arithmetic in base 2
BCD is another number representation that facilitates translation
of numbers to decimal character form but complicates arithmetic
ASCII represents characters in a manner that facilitates upper-case/
lower-case adjustment and easy conversion of decimal characters
Other schemes such as "2421 BCD" and "excess-3 BCD" seek to
improve decimal arithmetic by facilitating use of 9's complement
to avoid subtraction
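The ASCII conveniences can be illustrated with bit operations (a sketch; bit 5 (0x20) is the case bit, and a digit character's low nibble is its value):

```python
print(chr(ord('a') & ~0x20))   # 'A' - clearing bit 5 upper-cases a letter
print(chr(ord('A') | 0x20))    # 'a' - setting bit 5 lower-cases it
print(ord('7') & 0x0F)         # 7   - numeric value from the low nibble
```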
Sometimes representation schemes are designed to facilitate other
tasks, such as representing graphical data elements or for tracking.
For example, Gray Code is commonly used for identifying sectors on a
rotating disk. Gray code is defined recursively by using the rule:
to form the n+1 bit representation from the n-bit representation
preface the n-bit representation by 0
append to this the n-bit representation in reverse order
prefaced by 1
Hence, the 1, 2, and 3-bit representations are
   1-bit:  0, 1
   2-bit:  00, 01, 11, 10
   3-bit:  000, 001, 011, 010, 110, 111, 101, 100
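The reflect-and-prefix rule translates directly into code (a sketch; the function name is made up), together with a check that cyclically adjacent codes differ in exactly one bit:

```python
def gray(n):
    # n-bit Gray code by the recursive rule above
    if n == 1:
        return ["0", "1"]
    prev = gray(n - 1)
    return ["0" + g for g in prev] + ["1" + g for g in reversed(prev)]

print(gray(3))   # ['000', '001', '011', '010', '110', '111', '101', '100']

codes = gray(3)
for a, b in zip(codes, codes[1:] + codes[:1]):
    # exactly one digit position changes between neighboring codes
    assert sum(x != y for x, y in zip(a, b)) == 1
```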

Consider three concentric disks shaded as follows:

[Figure: three concentric bands, each divided into 8 wedge-shaped
segments; the shaded/unshaded pattern across the three bands gives
each wedge a distinct 3-bit Gray code id.]
Page 16
The shading provides a gray code identification for 8 distinct wedge-shaped sections on the disk.
As the disk rotates from one section to the next, no more than one
digit position (represented by shaded and unshaded segments) changes,
simplifying the task of determining the id of the next section when
going from one section to the next. Note that this is a
characteristic of the gray code.
In contrast, note that in regular binary for the transition from 3 to
4, 011 to 100, all 3 digits change, which means hardware tracking the
change if this representation was used would potentially face
arbitrary intermediate patterns in the transition from section 3 to
section 4, complicating the process of determining that 4 is the id
of the next section (e.g., something such as a delay would have to be
added to the control circuitry to allow the transition to stabilize).
For a disk such as above, a row of 3 reflectance sensors, one for each
concentric band, can be used to track the transitions.
Boolean algebra:
Boolean algebra is the algebra of circuits, the algebra of sets, and
the algebra of truth table logic. A Boolean algebra has two
fundamental elements, a "zero" and a "one," whose properties are
described below.
For circuits
"zero" is designated by 0 or L (for low voltage) and "one" by 1 or H
(for high voltage).
For sets,
"zero" is the empty set and "one" is the set universe.
For truth table logic,
"zero" is designated by F (for false) and "one" by T (for true).
Just as the algebraic properties of numbers are described in terms of
fundamental operations (addition and multiplication), the algebraic
properties of a Boolean algebra are described in terms of basic
Boolean operations.
For circuits, the basic Boolean operations are ones we've already
discussed:
   AND (·), OR (+), and complement (overbar)
For sets the corresponding operations are
   intersection (∩), union (∪), and set complement.
For truth table logic they are
   AND (∧), OR (∨), and NOT (~).
Recall that AND and OR are binary operations (an operation requiring
two arguments), while complement is a unary operation (an operation
requiring one argument).

Page 17
For circuits, also recall that
   the multiplication symbol · is used for AND
   the addition symbol + is used for OR
   the symbol for complement is an overbar (written in these
   plain-text notes as a prime); i.e., X' designates the complement of X.
The utilization of · for AND and + for OR is due to the fact that these
Boolean operations have algebraic properties similar to (but
definitely not the same as) those of multiplication and addition for
ordinary numbers. Basic properties for Boolean algebras (using the
circuit operation symbols, rather than those for sets or for symbolic
logic) are as follows:
1. Commutative property: + and · are commutative operations; e.g.,
      X + Y = Y + X   and   X·Y = Y·X
   In contrast to operations such as subtraction and division, a
   commutative operation has a left-right symmetry, permitting us to
   ignore the order of the operation's operands.
2. Associative property: + and · are associative operations; e.g.,
      X + (Y + Z) = (X + Y) + Z   and   X·(Y·Z) = (X·Y)·Z
   Non-associative operations (such as subtraction and division)
   tend to cause difficulty precisely because they are non-associative.
   The property of associativity permits selective omission of
   parentheses, since the order in which the operation is applied has
   no effect on the outcome; i.e., we can just as easily write
   X + Y + Z as X + (Y + Z) or (X + Y) + Z since the result is the
   same whether we first evaluate X + Y or Y + Z.
3. Distributive property: · distributes over + and + distributes
   over ·; e.g.,
      X·(Y + Z) = (X·Y) + (X·Z)   and also
      X + (Y·Z) = (X + Y)·(X + Z)
   With the distributive property we see a strong departure from the
   algebra of ordinary numbers, which definitely does not have the
   property of + distributing over ·.
   The distributive property illustrates a strong element of
   symmetry that occurs in Boolean algebras, a characteristic known
   as duality.
4. Zero and one: there is an element zero (0) and an element one
   (1) such that for every X,
      X + 1 = 1   and   X·0 = 0

Page 18
5. Identity: 0 is an identity for + and 1 is an identity for ·;
   e.g., for every X
      X + 0 = X   and   X·1 = X
6. Complement property: every element X has a complement X' such
   that
      X + X' = 1   and   X·X' = 0
   The complement of 1 is 0 and vice-versa; it can be shown that in
   general complements are unique; i.e., each element has exactly
   one complement.
7. Involution property (rule of double complements): (X')' = X
8. Idempotent property: for each X,
      X + X = X   and   X·X = X
9. Absorption property: for every X and Y,
      X + (X·Y) = X   and   X + (X'·Y) = X + Y
   Anything "AND"ed with X is absorbed into X under "OR" with X.
   Anything "AND"ed with X' is absorbed in its entirety under "OR"
   with X.
10. DeMorgan property: for every X and Y,
       (X·Y)' = X' + Y'   and   (X + Y)' = X'·Y'
    The DeMorgan property describes the relationship between "AND"
    and "OR", which with the rule of double complements, allows
    expressions to be converted from use of "AND"s to use of "OR"s
    and vice-versa; e.g.,
       X + Y = ((X + Y)')' = (X'·Y')'
       X·Y = ((X·Y)')' = (X' + Y')'

Some of these properties can be proven from others (i.e., they do not
constitute a minimal defining set of properties for Boolean algebras);
for example, the idempotent rule
X + X = X can be obtained by the manipulation
X + X = X + (X·1) = X by the absorption property.
The DeMorgan property provides rules for using NANDs and NORs (where
NAND stands for "NOT AND" and NOR stands for "NOT OR"). The operation
NAND (sometimes called the Sheffer stroke) is denoted by

Page 19
   X ↑ Y = (X·Y)'
and the operation NOR (sometimes called the Peirce arrow) is denoted
by
   X ↓ Y = (X + Y)'
Utilizing the rule of double complements and the DeMorgan property,
any expression can be written in terms of the complement operation and
·, or the complement operation and +. Moreover, since the complement
can be written in terms of either ↑ or ↓; i.e.,
   X' = X ↑ X = X ↓ X
any Boolean expression can be written solely in terms of ↑ or
solely in terms of ↓. This observation is particularly significant
for a circuit whose function is represented by a Boolean expression,
since this property of Boolean algebra implies that the circuit
construction can be accomplished using as basic circuit elements only
NAND circuits or only NOR circuits.
Note that properties such as commutative and associative are also a
characteristic of the algebra of numbers, but others, such as the
idempotent and DeMorgan properties are not; i.e., Boolean algebra, the
algebra of circuits, has behaviors quite different from what we are
used to with numbers. Just as successfully working with numbers
requires gaining understanding of their algebraic properties, working
with circuits requires gaining understanding of Boolean algebra.
Just as we often omit writing the times symbol in formulas involving
numbers, we may omit the AND symbol in Boolean formulas.
Examples:
1. There is no cancellation; i.e., XY = XZ does not imply that Y = Z
   (if it did, the idempotent property XX = X = X·1 would imply
   that X = 1!)
2. Complements are unique
   To see this just assume that Y is also a complement for X; i.e.,
   X + Y = 1 and XY = 0.
   AND the 1st equation through with X' to get X'X + YX' = X'.
   Since X'X = 0, this reduces to YX' = X'.
   Similarly, since X + X' = 1, Y = Y·1 = XY + X'Y, and since
   XY = 0 this reduces to X'Y = Y.
   Putting the last two lines together we have X' = Y.
3. The list of properties is not minimal; e.g.,
   Given that the properties other than the idempotent property are
   true, then it can be shown that the idempotent property is also
   true as follows:
      X + X' = 1, so using the distributive property,
      XX + XX' = X·(X + X') = X·1 = X, which in turn leads to

Page 20

      XX = X since XX' = 0
   A similar argument can be used to show that X + X = X.
   Given that the properties other than the absorption property are
   true, then it can be shown that the absorption property is also
   true as follows:
      Since 1 + Y = 1, X + XY = X·(1 + Y) = X, the 1st absorption criterion.
      Starting from X + X' = 1 we get XY + X'Y = Y.
      Adding X to both sides we get X + XY + X'Y = X + Y.
      By the first absorption criterion this reduces to
      X + X'Y = X + Y, which is the 2nd absorption criterion.

The DeMorgan property has great impact on circuit equations, since it


provides the formula for converting from OR to NAND and from AND to
NOR.
The above proofs are by logical deduction. For a 2-element Boolean
algebra, proof can be done exhaustively by examining all cases; e.g.,
we can verify DeMorgan by means of a "truth table":

   X | Y | X' | Y' | X'·Y' | X + Y | (X + Y)'
   --+---+----+----+-------+-------+---------
   0 | 0 | 1  | 1  |   1   |   0   |    1
   0 | 1 | 1  | 0  |   0   |   1   |    0
   1 | 0 | 0  | 1  |   0   |   1   |    0
   1 | 1 | 0  | 0  |   0   |   1   |    0

This is called a "brute force" method for verifying the equation
(X + Y)' = X'·Y' because it exhaustively checks every case using
the definition of the AND, OR and NOT operations.
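The same brute-force check, expressed in Python with 0 and 1 as the truth values:

```python
for X in (0, 1):
    for Y in (0, 1):
        lhs = 1 - (X | Y)          # (X + Y)'
        rhs = (1 - X) & (1 - Y)    # X'·Y'
        assert lhs == rhs
print("DeMorgan holds in every case")
```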
Since AND and OR are associative, we can write X·Y·Z and X + Y + Z
unparenthesized.
It can be shown that (X·Y·Z)' = X' + Y' + Z' and
(X + Y + Z)' = X'·Y'·Z'.
This leads to the "generalized DeMorgan property":
   (X1·X2·...·Xn)' = X1' + X2' + ... + Xn'
   (X1 + X2 + ... + Xn)' = X1'·X2'·...·Xn'
which is often useful for circuits of more than 2 variables.
There are multi-input NAND gates to take advantage of this property.
WARNING: NAND and NOR are not associative.

Page 21
Consider the truth table:

   X | Y | Z | X↑Y | Y↑Z | X↑(Y↑Z) | (X↑Y)↑Z | (X·Y·Z)'
   --+---+---+-----+-----+---------+---------+---------
   0 | 0 | 0 |  1  |  1  |    1    |    1    |    1
   0 | 0 | 1 |  1  |  1  |    1    |    0    |    1
   0 | 1 | 0 |  1  |  1  |    1    |    1    |    1
   0 | 1 | 1 |  1  |  0  |    1    |    0    |    1
   1 | 0 | 0 |  1  |  1  |    0    |    1    |    1
   1 | 0 | 1 |  1  |  1  |    0    |    0    |    1
   1 | 1 | 0 |  0  |  1  |    0    |    1    |    1
   1 | 1 | 1 |  0  |  0  |    1    |    1    |    0

It is evident that X↑(Y↑Z) ≠ (X↑Y)↑Z ≠ (X·Y·Z)'.
Similarly X↓(Y↓Z) ≠ (X↓Y)↓Z ≠ (X + Y + Z)'.
This means that care must be taken in grouping the NAND (↑) and NOR
(↓) operators in algebraic expressions!

The other two common binary operations, XOR (⊕) and COINC (⊙), are
both associative:

   X | Y | Z | X⊕Y | Y⊕Z | (X⊕Y)⊕Z | X⊕(Y⊕Z) | X⊙Y | Y⊙Z | (X⊙Y)⊙Z | X⊙(Y⊙Z)
   --+---+---+-----+-----+---------+---------+-----+-----+---------+--------
   0 | 0 | 0 |  0  |  0  |    0    |    0    |  1  |  1  |    0    |    0
   0 | 0 | 1 |  0  |  1  |    1    |    1    |  1  |  0  |    1    |    1
   0 | 1 | 0 |  1  |  1  |    1    |    1    |  0  |  0  |    1    |    1
   0 | 1 | 1 |  1  |  0  |    0    |    0    |  0  |  1  |    0    |    0
   1 | 0 | 0 |  1  |  0  |    1    |    1    |  0  |  1  |    1    |    1
   1 | 0 | 1 |  1  |  1  |    0    |    0    |  0  |  0  |    0    |    0
   1 | 1 | 0 |  0  |  1  |    0    |    0    |  1  |  0  |    0    |    0
   1 | 1 | 1 |  0  |  0  |    1    |    1    |  1  |  1  |    1    |    1
Generalized operations (multi-input) serve to reduce the number of
levels in a circuit; e.g., a 3-input AND is a 1-level circuit for XYZ
equivalent to the 2-level circuit (XY)Z:

[Figure: a 2-level circuit of two cascaded 2-input AND gates computing
(XY)Z, alongside a 1-level circuit using a single 3-input AND gate
computing XYZ; inputs X, Y, Z in each case.]

Page 22
Canonical forms:
Any combinational circuit, regardless of the gates used, can be
expressed in terms of combinations of AND, OR, and NOT. The most
general form of this expression is called a canonical form. There are
two types:
the canonical sum of products
the canonical product of sums
Formulating these turns out to be quite easy if the truth table for
the circuit is constructed. For example, consider a circuit f(X,Y,Z)
with specification:
   X | Y | Z | f(X,Y,Z) | X'Y'Z' | XY'Z | XYZ'
   --+---+---+----------+--------+------+-----
   0 | 0 | 0 |    1     |   1    |  0   |  0
   0 | 0 | 1 |    0     |   0    |  0   |  0
   0 | 1 | 0 |    0     |   0    |  0   |  0
   0 | 1 | 1 |    0     |   0    |  0   |  0
   1 | 0 | 0 |    0     |   0    |  0   |  0
   1 | 0 | 1 |    1     |   0    |  1   |  0
   1 | 1 | 0 |    1     |   0    |  0   |  1
   1 | 1 | 1 |    0     |   0    |  0   |  0

Note that
   f(X,Y,Z) = X'Y'Z' + XY'Z + XYZ'
Each of these terms is obtained just by looking at the combinations
for which f(X,Y,Z) is 1. Each of these is called a minterm. There are
8 possible minterms for 3 variables (see below).
Analogously, for the combinations for which f(X,Y,Z) is 0 we get
   f(X,Y,Z) = (X+Y+Z')(X+Y'+Z)(X+Y'+Z')(X'+Y+Z)(X'+Y'+Z')
Each of these terms is obtained just by looking at the combinations
for which f(X,Y,Z) is 0. Each of these is called a maxterm. There are
8 possible maxterms for 3 variables (see below). The minterms and
maxterms are numbered from 0 corresponding to the binary combination
they represent.
      minterms   maxterms
   0.  X'Y'Z'    X+Y+Z
   1.  X'Y'Z     X+Y+Z'
   2.  X'YZ'     X+Y'+Z
   3.  X'YZ      X+Y'+Z'
   4.  XY'Z'     X'+Y+Z
   5.  XY'Z      X'+Y+Z'
   6.  XYZ'      X'+Y'+Z
   7.  XYZ       X'+Y'+Z'

Page 23
Note that the maxterms are just the complements of their
corresponding minterms.
Representing a function by using its minterms is called the canonical
sum of products and by using its maxterms the canonical product of
sums; i.e.,
   f(X,Y,Z) = X'Y'Z' + XY'Z + XYZ' is the canonical sum of products
and
   f(X,Y,Z) = (X+Y+Z')(X+Y'+Z)(X+Y'+Z')(X'+Y+Z)(X'+Y'+Z') is the
canonical product of sums for the function f(X,Y,Z).
The short-hand notation (Σ-notation)
   f(X,Y,Z) = Σ(0,5,6) is used for the canonical sum of products.
Similarly the short-hand notation (Π-notation)
   f(X,Y,Z) = Π(1,2,3,4,7) is used for the canonical product of sums.
Canonical representations are considered to be 2-level
representations, since for most circuits a signal and its complement
are both available as inputs.
A combinational circuit's behavior is specified by one of
truth table listing the outputs for every possible combination of
input values
canonical representation of the outputs using or notation
circuit diagram using logic gates
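As a sketch of how the Σ list determines the canonical sum of products, the following rebuilds the minterm terms for f(X,Y,Z) = Σ(0,5,6) and checks a hand-translated evaluation of the resulting expression (the helper names are made up):

```python
minterms = {0, 5, 6}
names = ["X", "Y", "Z"]

def minterm(n):
    # product term that is 1 exactly on combination n
    bits = [(n >> (2 - i)) & 1 for i in range(3)]
    return "".join(v if b else v + "'" for v, b in zip(names, bits))

sop = " + ".join(minterm(n) for n in sorted(minterms))
print(sop)    # X'Y'Z' + XY'Z + XYZ'

def evaluate(n):
    # the SOP above, transcribed with bitwise operators
    x, y, z = (n >> 2) & 1, (n >> 1) & 1, n & 1
    return ((1-x) & (1-y) & (1-z)) | (x & (1-y) & z) | (x & y & (1-z))

assert all(evaluate(n) == (1 if n in minterms else 0) for n in range(8))
```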
Converting to NANDs or NORs:
For a Boolean algebra, notice that
   the complement X' is given by (X↑X)
   Since XY is given by the complement of (X↑Y) we have
      XY = (X↑Y)↑(X↑Y)
   By DeMorgan X + Y = (X')' + (Y')' = (X'·Y')' = (X↑X)↑(Y↑Y)
Hence, we can rewrite an equation using AND, OR, and complement
solely in terms of NANDs using the above conversions.
Similarly, for NOR we have the conversions
   X' = (X↓X)
   X+Y = (X↓Y)↓(X↓Y)
   XY = (X')'·(Y')' = (X' + Y')' = (X↓X)↓(Y↓Y)   (by DeMorgan)
By DeMorgan,
a NAND gate is equivalent to an OR gate with inverted inputs
   ((X·Y)' = X' + Y')
and a NOR gate is equivalent to an AND gate with inverted inputs
   ((X + Y)' = X'·Y')

Page 24
Using these equivalences, an OR-AND (product of sums) combination can
be converted to NOR-NOR as follows:

[Figure: OR-AND circuit redrawn gate-for-gate as NOR-NOR.]

Other equivalences to OR-AND that follow from this one are NAND-AND
and AND-NOR:

[Figure: the NAND-AND and AND-NOR forms of the same circuit.]

For the sum of products (AND-OR) we have the counterpart equivalences
NAND-NAND, NOR-OR, and OR-NAND:

[Figure: AND-OR circuit redrawn as NAND-NAND, NOR-OR, and OR-NAND.]

Page 25
At this point, if given a truth table, or a representation using Σ or Π
notation, we can generate a 2-level circuit diagram as the canonical
sum of products or product of sums. Similarly, given a circuit
diagram, we can produce its truth table. This process is called
circuit analysis. For example, recall that the circuit equation

   f(A,B,C,D) = ((A·B')·C)' ⊙ ((A⊕C)'·D)

was earlier represented as a 3-level circuit diagrammed by:

[Figure: the 3-level circuit, with intermediate signals A·B',
((A·B')·C)', A⊕C, and (A⊕C)'·D feeding the final gate that
produces f(A,B,C,D).]
From the circuit equation we can obtain the truth table as follows,
conforming to the values given earlier:

   A | B | C | D | f(A,B,C,D) | A·B' | ((A·B')·C)' | A⊕C | (A⊕C)'·D
   --+---+---+---+------------+------+-------------+-----+---------
   0 | 0 | 0 | 0 |     0      |  0   |      1      |  0  |    0
   0 | 0 | 0 | 1 |     1      |  0   |      1      |  0  |    1
   0 | 0 | 1 | 0 |     0      |  0   |      1      |  1  |    0
   0 | 0 | 1 | 1 |     0      |  0   |      1      |  1  |    0
   0 | 1 | 0 | 0 |     0      |  0   |      1      |  0  |    0
   0 | 1 | 0 | 1 |     1      |  0   |      1      |  0  |    1
   0 | 1 | 1 | 0 |     0      |  0   |      1      |  1  |    0
   0 | 1 | 1 | 1 |     0      |  0   |      1      |  1  |    0
   1 | 0 | 0 | 0 |     0      |  1   |      1      |  1  |    0
   1 | 0 | 0 | 1 |     0      |  1   |      1      |  1  |    0
   1 | 0 | 1 | 0 |     1      |  1   |      0      |  0  |    0
   1 | 0 | 1 | 1 |     0      |  1   |      0      |  0  |    1
   1 | 1 | 0 | 0 |     0      |  0   |      1      |  1  |    0
   1 | 1 | 0 | 1 |     0      |  0   |      1      |  1  |    0
   1 | 1 | 1 | 0 |     0      |  0   |      1      |  0  |    0
   1 | 1 | 1 | 1 |     1      |  0   |      1      |  0  |    1

From the truth table f(A,B,C,D) = Σ(1,5,10,15)
                                = Π(0,2,3,4,6,7,8,9,11,12,13,14)
Note that the canonical representations are not as compact as the
original circuit equation.
Circuit simplification:
A circuit represented in a canonical form (usually by or notation)
can usually be simplified. There are 3 techniques commonly employed:
algebraic reduction
Karnaugh maps (K-maps)
Quine-McCluskey method

Page 26
Algebraic reduction is limited by the extent to which one is able to
observe potential combinations in examining the equation; e.g.,
   A'BC'D + A'BCD + ABC'D
      = A'BC'D + A'BCD + A'BC'D + ABC'D   (idempotent)
      = A'BD(C' + C) + (A' + A)BC'D       (distributive)
      = A'BD·1 + BC'D·1                   (complement)
      = A'BD + BC'D                       (identity)
This is a minimal 2-level representation for the circuit. The further
algebraic reduction to (A' + C')BD produces a 2-level circuit dependent
only on 2-input gates.
The Quine-McCluskey method is an extraction from the K-map approach
abstracted for computer implementation. It is not dependent on visual
graphs and is effective no matter the number of inputs. Since it does
not lend itself to hand implementation for more than a few variables,
it will only be discussed later and in sketchy detail.
For circuits with no more than 4 or 5 input variables, K-maps provide
a visual reduction technique for effectively reducing a combinational
circuit to a minimal form.
The idea for K-maps is to arrange minterms whose value is 1 (or
maxterms whose value is 0) on a grid so as to locate patterns which
will combine.
For a 1-variable map, input variable X, the minterm locations are as
follows:
X 0
1
__
X
X
While a 1-variable map is not useful, it is worth including to round
out the discussion of maps using more variables.
For a 2-variable map, input variables X, Y, the minterm locations are

          Y=0    Y=1
   X=0    X'Y'   X'Y
   X=1    XY'    XY

In general we only label the cells according to the binary number they
correspond to in the truth table (the number used by the Σ or Π
notations). The map structure is then:

          Y=0   Y=1
   X=0     0     1
   X=1     2     3

Page 27
For example, if we have f(X,Y) = Σ(1,3), we mark the minterms for 1
and 3 in the 2-variable map as follows:

          Y=0   Y=1
   X=0           1
   X=1           1

Now we can graphically see that a reduction is possible by delineating
the adjacent pair of minterms (corresponding to X'Y + XY), which in
fact reduces to Y. Notice that there are visual clues: the 1 over the
column corresponds to Y and, looking down vertically, the 0 and 1
"cancel".
2-variable K-maps also are not particularly useful, but again are
illustrative.
With 3 variables, the pattern is

          YZ=00  01  11  10
   X=0      0     1   3   2
   X=1      4     5   7   6

The key thing to note is that the order across the top follows the
Gray code pattern so that there is exactly one 0-1 matchup between
each column, including a match between the 1st and 4th columns.
For the function f(X,Y,Z) = Σ(1,3,4,6), the K-map is

          YZ=00  01  11  10
   X=0            1   1
   X=1      1             1

   f(X,Y,Z) = X'Z + XZ'

The 1st term of the reduced form for f(X,Y,Z) is in the X' row (flagged
by 0) and the 2nd is in the X row (flagged by 1). In each case the Y
term cancels since it is the one with 0 matched to 1. Pay particular
attention to the box that wraps around.

Page 28
For a more complex example, consider f(X,Y,Z) = Σ(1,3,4,5)

          YZ=00  01  11  10
   X=0            1   1
   X=1      1     1

Here f(X,Y,Z) can be reduced to either of the following
   f(X,Y,Z) = X'Z + XY'
   f(X,Y,Z) = X'Z + XY' + Y'Z
Note that the term Y'Z is "redundant" since its 1's are covered by the
other two terms. The first expression is called a minimal sum of
products expression for f(X,Y,Z) since it cannot be reduced further.
For combinational circuits, the redundant term can be omitted, but
sometimes in the context of sequential circuits, where intermediate
values matter, it must be left in.
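A brute-force check of this reduction (the bitwise expressions transcribe the sums above):

```python
minterms = {1, 3, 4, 5}
for n in range(8):
    x, y, z = (n >> 2) & 1, (n >> 1) & 1, n & 1
    f1 = ((1-x) & z) | (x & (1-y))        # X'Z + XY'
    f2 = f1 | ((1-y) & z)                 # adding the redundant Y'Z
    assert f1 == f2 == (1 if n in minterms else 0)
print("both expressions equal the function Σ(1,3,4,5)")
```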
With 4 variables, the K-map pattern is

          CD=00  01  11  10
   AB=00    0     1   3   2
   AB=01    4     5   7   6
   AB=11   12    13  15  14
   AB=10    8     9  11  10

Now the Gray code pattern of the rows must also be present for the
columns. More complex situations can also arise; for example,
columns. More complex situations can also arise; for example,
          CD=00  01  11  10
   AB=00    1             1
   AB=01              1   1
   AB=11         1    1
   AB=10    1    1

describes f(A,B,C,D) = Σ(0,2,6,7,8,9,13,15). There are two patterns
present that produce a minimal number of terms:
One pattern boxes the 1's in pairs along the rows (0 with 2, 6 with 7,
13 with 15, 8 with 9); the other boxes them in pairs along the columns
(0 with 8, 2 with 6, 7 with 15, 9 with 13). Hence, either of the
following produces a minimal sum of products expression:
   from the rows:    f(A,B,C,D) = A'B'D' + A'BC + ABD + AB'C'
   from the columns: f(A,B,C,D) = B'C'D' + A'CD' + BCD + AC'D

In either case we know we have the function since all 1's are covered.
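Both coverings can be checked against the Σ list by brute force (a sketch):

```python
minterms = {0, 2, 6, 7, 8, 9, 13, 15}
for n in range(16):
    a, b, c, d = (n >> 3) & 1, (n >> 2) & 1, (n >> 1) & 1, n & 1
    # row covering: A'B'D' + A'BC + ABD + AB'C'
    rows = ((1-a) & (1-b) & (1-d)) | ((1-a) & b & c) | (a & b & d) | (a & (1-b) & (1-c))
    # column covering: B'C'D' + A'CD' + BCD + AC'D
    cols = ((1-b) & (1-c) & (1-d)) | ((1-a) & c & (1-d)) | (b & c & d) | (a & (1-c) & d)
    assert rows == cols == (1 if n in minterms else 0)
print("both coverings produce the same function")
```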
When working with maxterms, the 0's of the function are what is
considered. For the function above,
   f(A,B,C,D) = Π(1,3,4,5,10,11,12,14)
and the K-map (marking the 0's) is

          CD=00  01  11  10
   AB=00          0   0
   AB=01    0     0
   AB=11    0             0
   AB=10              0   0

leading to the following two minimal product of sums expressions:
   f(A,B,C,D) = (A+B+D')(A+B'+C)(A'+B'+D)(A'+B+C')  from the rows
   f(A,B,C,D) = (B'+C+D)(A+C+D')(B+C'+D')(A'+C'+D)  from the columns
Be sure to observe that when working with maxterms, "barred" (primed)
items correspond to 1's and unbarred items correspond to 0's, exactly
the opposite of what is done when working with minterms.
Just as a 4-variable K-map is formed by combining two 3-variable maps,
a 5-variable K-map can be formed by combining two 4-variable maps

Page 30
(conceptually, 1 on top of the other, representing 0 and 1 for the 5th
variable).
In general, blocks of size 2^n are the ones that can be reduced. Here
are blocks of size 4 on a 4-variable K-map:

          CD=00  01  11  10
   AB=00
   AB=01
   AB=11          1   1
   AB=10          1   1
   f(A,B,C,D) = AD

          CD=00  01  11  10
   AB=00
   AB=01
   AB=11    1     1   1   1
   AB=10
   f(A,B,C,D) = AB

          CD=00  01  11  10
   AB=00    1             1
   AB=01
   AB=11
   AB=10    1             1
   f(A,B,C,D) = B'D'

          CD=00  01  11  10
   AB=00          1   1
   AB=01
   AB=11
   AB=10          1   1
   f(A,B,C,D) = B'D

In each case, the horizontal term with 0 against 1 is omitted and the
vertical term with 0 against 1 is omitted. Be sure to pay particular
attention to the pattern with a 1 in each corner, where A is omitted
vertically and C is omitted horizontally.
Note that each block of 4 contains 4 blocks of 2, but these are not
diagrammed since they are absorbed (in contrast, the Quine-McCluskey
method, which we won't look at until later, does keep tabs on all such
blocks!).
In general, an implicant (implicate for 0's) is a term that is a
product of inputs (including complements) for which the function
evaluates to 1 whenever the term evaluates to 1. These are
represented by blocks of size 2^n on K-maps.

Page 31
A prime implicant (implicate for 0's) is one not contained in any
larger blocks of 1's.
An essential prime implicant is a prime implicant containing a 1 not
covered by any other prime implicant.
A distinguished cell is a 1-cell covered by exactly 1 prime implicant.
A don't care cell is one that may be either 0 or 1 for a particular
circuit. The value used in K-map analysis is one which increases the
amount of reduction. Don't care conditions occur because in circuits,
there are often combinations of inputs that cannot occur, so we don't
care whether their values are 0 or 1.
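These definitions can be turned into a brute-force search for prime implicants (a sketch, not the Quine-McCluskey method; the function names are made up), shown here for the earlier example f(X,Y,Z) = Σ(1,3,4,5):

```python
from itertools import product

NVARS = 3
minterms = {1, 3, 4, 5}

def covers(term, n):
    # a term assigns 0, 1, or None (variable absent) to each position
    bits = [(n >> (NVARS - 1 - i)) & 1 for i in range(NVARS)]
    return all(t is None or t == b for t, b in zip(term, bits))

def is_implicant(term):
    # f evaluates to 1 wherever the term evaluates to 1
    cells = [n for n in range(2 ** NVARS) if covers(term, n)]
    return bool(cells) and all(n in minterms for n in cells)

def shrink(term):
    # candidate larger blocks: drop one variable at a time
    return [term[:i] + (None,) + term[i + 1:]
            for i in range(NVARS) if term[i] is not None]

implicants = [t for t in product((0, 1, None), repeat=NVARS) if is_implicant(t)]
primes = [t for t in implicants if not any(is_implicant(s) for s in shrink(t))]
print(primes)   # [(0, None, 1), (1, 0, None), (None, 0, 1)] i.e. X'Z, XY', Y'Z
```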
General Procedure for Circuit Reduction Using K-maps
1. Map the circuit's function into a K-map, marking don't cares by
   using dashes.
2. Treating don't cares as if they were 1's (0's for implicates),
   box in all prime implicants (implicates), omitting any consisting
   solely of dashes.
3. Mark any distinguished cells with * (dashes don't count).
4. Include all essential prime implicants in the sum, change their
   1's to dashes and remove their boxes - exit if there aren't any
   more 1's at this point.
5. Remove any prime implicants whose 1's are contained in a box
   having more 1's (dominated case);
   if there is a case where the number of 1's is the same (co-dominant
   case), discard the smaller box;
   if there is a case where the number of 1's is the same and the
   box sizes are the same, discard either.
6. Go back to step 3 if there are any new distinguished cells.
7. Include the largest of the remaining prime implicants in the sum
   and go back to step 4 (this step is rarely needed) - if there is
   no largest, choose any.
8. If step 7 was used, choose from among the possible sums the one
   with the fewest terms, then the one using the fewest variables.

Remark: if this procedure is employed with the K-map

[Figure: a 4-variable K-map whose 1's form a pattern with no
essential prime implicants]

step 7 will be employed.
Page 32
Worked out example:

[Figure: a 4-variable K-map worked through the procedure. Two
essential prime implicants are identified and put in the sum; their
1's are changed to don't cares and the map is redrawn. The redrawn
map has 2 sets of co-dominant implicants, so one co-dominant box from
each set is deleted and the distinguished cells are marked. Adding
in the new essential prime implicants covers all 1's, yielding a
four-term minimal sum for f(A,B,C,D).]

Page 33
We earlier considered the circuit analysis process, where given a
circuit diagram, it can be converted into a circuit equation based on
the gates employed, and from there converted into a truth table. The
circuit design process proceeds as follows:
1. Formalize the problem statement into inputs and outputs, devising
   representations for inputs and outputs
2. Translate the problem statement to a logic function
3. Determine the outputs corresponding to inputs (some of which may
   be don't cares)
4. Convert to Σ or Π notation (truth table optional), including any
   don't cares
   Example: if f(A,B,C) = 1 for 0,3,4 and 1,5 are don't cares,
   then the circuit is given by either of
      f(A,B,C) = Σ(0,3,4) + d(1,5) or
      f(A,B,C) = Π(2,6,7) + d(1,5)
5. Create a K-map from the Σ or Π notation
6. Use K-map reduction to obtain a minimal circuit equation
7. Produce a circuit diagram from the circuit equation
Employing XOR gates requires manipulation of the circuit equation.
Employing NAND and NOR gates can be accomplished by adjusting the
circuit diagram. [Recall that a NAND gate is equivalent to an OR gate
with inverted inputs and a NOR gate is equivalent to an AND gate with
inverted inputs; using these equivalences there are diagrammatic
techniques for converting sum of products and product of sums
expressions to ones using NAND and NOR.]
Example: (circuit design)
Design a matching circuit for the following:
There are 3 types of ball bearings in a bin (plastic, steel, and
brass). An assembly machine needs ball bearings of each type at
different points in the assembly process. Given the type of ball
bearing it needs at present, it needs to look through the bin for a
ball bearing matching the type; i.e.,

[Diagram: a circuit with two 2-bit inputs, Needed type and Observed
type, and a single output, Accept/Reject.]
Page 34
Step 1: Formalize
   Type      Representation
   Plastic        01
   Steel          10
   Brass          11

   Accept = 1
   Reject = 0

Steps 2,3,4: Translate to logic function

   Needed   Observed
    A   B    C   D    f(A,B,C,D)
    0   0    0   0        d
    0   0    0   1        d
    0   0    1   0        d
    0   0    1   1        d
    0   1    0   0        d
    0   1    0   1        1
    0   1    1   0        0
    0   1    1   1        0
    1   0    0   0        d
    1   0    0   1        0
    1   0    1   0        1
    1   0    1   1        0
    1   1    0   0        d
    1   1    0   1        0
    1   1    1   0        0
    1   1    1   1        1

   f(A,B,C,D) = Σ(5,10,15) + d(0,1,2,3,4,8,12)
              = Π(6,7,9,11,13,14) + d(0,1,2,3,4,8,12)

Step 5: K-map reduction (dashes mark don't cares, * marks
distinguished cells)

          CD=00  01  11  10
   AB=00    -     -   -   -
   AB=01    -    *1   0   0
   AB=11    -     0  *1   0
   AB=10    -     0   0  *1

Step 6: Circuit equation
   f(A,B,C,D) = A'C' + ABCD + B'D'
   or f(A,B,C,D) = (A'+C)(B+D')(A+C')(B'+D)

Step 7: Circuit diagram (there are 2 obvious NORs)
   f(A,B,C,D) = (A + C)' + ABCD + (B + D)'

Page 35
[Figure: A and C feed a NOR gate; B and D feed a NOR gate; A, B, C, D
feed a 4-input AND gate; the three gate outputs feed a 3-input OR gate
producing f(A,B,C,D).]
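The reduced equation can be checked against the original problem statement by brute force: on every defined input (type codes 01, 10, 11), it accepts exactly when the needed and observed types match. The bitwise expression below transcribes the sum of products from Step 6.

```python
types = [(0, 1), (1, 0), (1, 1)]    # plastic, steel, brass
for (a, b) in types:                # needed type on A,B
    for (c, d) in types:            # observed type on C,D
        # A'C' + ABCD + B'D'
        f = ((1-a) & (1-c)) | (a & b & c & d) | ((1-b) & (1-d))
        assert f == (1 if (a, b) == (c, d) else 0)
print("accepts exactly the matching type")
```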

Example: (circuit design)
Design a combinational circuit to convert 3-bit Gray code to 3-bit
binary (this is called a Gray to binary decoder).

   Gray in     Binary out
   X  Y  Z     A  B  C
   0  0  0     0  0  0
   0  0  1     0  0  1
   0  1  0     0  1  1
   0  1  1     0  1  0
   1  0  0     1  1  1
   1  0  1     1  1  0
   1  1  0     1  0  0
   1  1  1     1  0  1

   A = Σ(4,5,6,7),  B = Σ(2,3,4,5),  C = Σ(1,2,4,7)

[K-maps for A, B, and C]

   A = X
   B = X'Y + XY' = X⊕Y
   C = X'Y'Z + X'YZ' + XY'Z' + XYZ
     = (X'Z + XZ')Y' + (XZ + X'Z')Y
     = (X⊕Z)Y' + (X⊕Z)'Y = (X⊕Z)⊕Y = X⊕Y⊕Z
Pay particular attention to the patterns that produced the XORs!
   [Diagram: Gray to Binary Decoder - Gray inputs X,Y,Z; binary
   outputs A = X, B = X ⊕ Y, and C = B ⊕ Z (two chained XOR gates).]


A Gray to binary decoder is an example of a circuit that could be
packaged as a specialized circuit.
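The decoder's behavior can be summarized in a few lines of Python (a sketch, not from the notes): since A = X, B = X ⊕ Y, and C = X ⊕ Y ⊕ Z, decoding a standard Gray code recovers the binary count.

```python
# Sketch: the Gray to binary decoder equations derived above.

def gray_to_binary(x, y, z):
    return x, x ^ y, x ^ y ^ z     # A = X, B = X^Y, C = X^Y^Z

for n in range(8):
    g = n ^ (n >> 1)               # the standard 3-bit Gray code for n
    x, y, z = (g >> 2) & 1, (g >> 1) & 1, g & 1
    a, b, c = gray_to_binary(x, y, z)
    assert (a << 2) | (b << 1) | c == n   # decoder recovers n
print("Gray to binary decoder verified")
```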
As an example of a more complex decoder, consider the 7-segment display

   [Figure: 7-segment display layout - segment a across the top, f and
   b down the upper left and right sides, g across the middle, e and c
   down the lower left and right sides, and d across the bottom.]
These are used to produce representations of the decimal digits and
(to a lesser extent) the hex characters A-F as follows:
   [Figure: the 7-segment renderings of the digits 0-9 and the hex
   characters A-F.]

Pay particular attention to the difference between the representations
for 6 and B (a common mistake is to interpret the B pattern as 6).
Note that a logic circuit to convert 4-bit (hexa)decimal data to
7-segment display format will require 7 outputs, one for each of
segments a,b,c,d,e,f,g. If only a BCD conversion is needed, then the
circuit is simplified (somewhat) because for the inputs for
A,B,C,D,E,F, the outputs are don't cares. The construction of such a
circuit can be achieved by the means already covered, albeit with some
tedium due to the number of outputs.
The SN7447 chip is a BCD to 7-segment display decoder/driver (LED
segments have to be protected from excess current, a capability built
into this chip so that it can directly drive LED segments without use
of pull-up resistors). A worked out circuit diagram for this chip
follows:

SN 7447: BCD to 7-segment Display Decoder/Driver

   [Circuit diagram: the internal gate network of the SN7447,
   producing the (active LOW) segment outputs a'-g' from inputs
   A,B,C,D and their complements. BI = Blanking Input; RBO = Ripple
   Blanking Output (a wired-AND with BI); LT = Lamp Test; RBI = Ripple
   Blanking Input. The marked points are taken HIGH by taking the
   blanking input line LOW (this forces all outputs HIGH).]

BI (Blanking Input), RBI (Ripple Blanking Input), and LT (Lamp Test) have
no effect if they are not connected or if their lines are held HIGH.

If the blanking input is taken LOW, a 1 is forced at each marked
point, in effect blanking all LEDs by taking their lines HIGH.

Taking the lamp test input LOW forces the internal lines
representing A,B,C to go LOW, which internally produces the same
effect as an input of numeric 0 or 8, thus enabling LED lines
a,b,c,d,e, and f. LED line g requires an additional enable via the
internal lamp test line.

Taking the ripple blanking input line LOW enables the six-input NAND
gates in the circuit to respond to the internal lines representing
A', B', C', D', which will then cause the blanking of the LEDs if
the numeric value of the input is 0. To suppress leading 0s in
a sequence of digits, the blanking input line for each digit is
used as an output (Ripple Blanking Output) connected to the
ripple blanking input line of the digit of next lower order (note
that as soon as a non-zero digit occurs in the sequence, it
produces a HIGH signal on RBO which will then cause ripple
blanking to be disabled for all subsequent lower order digits).
Careful examination of the circuit shows that segment a is not lit for
the number 6!

BCD to 7-segment display function table:

   D C B A     a' b' c' d' e' f' g'
   0 0 0 0     0  0  0  0  0  0  1
   0 0 0 1     1  0  0  1  1  1  1
   0 0 1 0     0  0  1  0  0  1  0
   0 0 1 1     0  0  0  0  1  1  0
   0 1 0 0     1  0  0  1  1  0  0
   0 1 0 1     0  1  0  0  1  0  0
   0 1 1 0     1  1  0  0  0  0  0
   0 1 1 1     0  0  0  1  1  1  1
   1 0 0 0     0  0  0  0  0  0  0
   1 0 0 1     0  0  0  1  1  0  0
   - - - -     (non-BCD input combinations are all don't cares)

   (The outputs are active LOW - writing a' for the complement of a,
   a segment is lit when its output line is 0.)

   Remark: the SN7447 display pattern for 6 is given by

      |_
      |_|

Standard K-map analysis results in the following equations:

   a' = AB'C'D' + A'C + BD   [BD added in from don't cares for
                              blanking output purposes]
   b' = AB'C + A'BC + BD     [BD added in from don't cares for
                              blanking output purposes]
   c' = A'BC' + CD           [CD added in from don't cares for
                              blanking output purposes]
   d' = ABC + A'B'C + AB'C'
   e' = A + B'C
   f' = AB + BC' + AC'D'
   g' = ABC + B'C'D'
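The equations can be checked mechanically against the function table. The Python sketch below is not part of the notes; it writes the complement of input a as `na`, and on the BCD digits 0-9 the BD and CD don't-care terms contribute nothing (B and D, or C and D, are never simultaneously 1 for these digits).

```python
# Sketch: verify the K-map segment equations against the SN7447
# function table (active-LOW outputs, inputs DCBA).

TABLE = {  # digit: (a',b',c',d',e',f',g')
    0: (0,0,0,0,0,0,1), 1: (1,0,0,1,1,1,1), 2: (0,0,1,0,0,1,0),
    3: (0,0,0,0,1,1,0), 4: (1,0,0,1,1,0,0), 5: (0,1,0,0,1,0,0),
    6: (1,1,0,0,0,0,0), 7: (0,0,0,1,1,1,1), 8: (0,0,0,0,0,0,0),
    9: (0,0,0,1,1,0,0),
}

def segments(digit):
    d, c, b, a = (digit >> 3) & 1, (digit >> 2) & 1, (digit >> 1) & 1, digit & 1
    na, nb, nc, nd = 1 - a, 1 - b, 1 - c, 1 - d
    return (
        a & nb & nc & nd | na & c | b & d,       # a'
        a & nb & c | na & b & c | b & d,         # b'
        na & b & nc | c & d,                     # c'
        a & b & c | na & nb & c | a & nb & nc,   # d'
        a | nb & c,                              # e'
        a & b | b & nc | a & nc & nd,            # f'
        a & b & c | nb & nc & nd,                # g'
    )

for digit, outputs in TABLE.items():
    assert segments(digit) == outputs
print("segment equations match the function table")
```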

Arithmetic circuits:
Half adder: 2-bit addition is accomplished by XOR. A circuit for
2-bit addition that outputs both the sum (S) and carry (Cout) is
called a half adder (a full adder also accounts for an input carry
from a prior addition (Cin)).

   X  Y    S  Cout
   0  0    0   0
   0  1    1   0
   1  0    1   0
   1  1    0   1

   S = X ⊕ Y,  Cout = XY

   [Diagram: half adder (HA) block with inputs X,Y and outputs S
   and Cout.]

Full adder: To accommodate an input carry we have

   X  Y  Cin    S  Cout
   0  0   0     0   0
   0  0   1     1   0
   0  1   0     1   0
   0  1   1     0   1
   1  0   0     1   0
   1  0   1     0   1
   1  1   0     0   1
   1  1   1     1   1

S = X ⊕ Y ⊕ Cin by the same analysis used for the C output variable of
the Gray to binary decoder discussed earlier.

Cout = X'YCin + XY'Cin + XYCin' + XYCin, which reduces nicely to
       (X'Y + XY')Cin + XY(Cin + Cin') = (X ⊕ Y)Cin + XY

Both (X ⊕ Y)Cin and XY are produced by two half adders arranged as
follows:

   [Diagram: X,Y feed the first HA, whose carry is XY; its sum X ⊕ Y
   and Cin feed the second HA, whose carry is (X ⊕ Y)Cin.]

Hence to get a full adder (FA) we simply use two half-adders with an
OR gate applied to the two carries:

   [Diagram: full adder (FA) from two HAs - the second HA's sum is S,
   and the OR of the two carries is Cout.]
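The two-half-adder construction can be sketched directly in Python (an illustration, not from the notes):

```python
# Sketch: a full adder built from two half adders and an OR gate.

def half_adder(x, y):
    return x ^ y, x & y          # (sum, carry)

def full_adder(x, y, cin):
    s1, c1 = half_adder(x, y)    # c1 = XY
    s, c2 = half_adder(s1, cin)  # c2 = (X xor Y)Cin
    return s, c1 | c2            # Cout = XY + (X xor Y)Cin

# Check against the full adder truth table.
for x in (0, 1):
    for y in (0, 1):
        for cin in (0, 1):
            s, cout = full_adder(x, y, cin)
            assert s + 2 * cout == x + y + cin
print("full adder verified")
```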

4-bit parallel adder: Input is two 4-bit quantities (X3,X2,X1,X0) and
(Y3,Y2,Y1,Y0). Input corresponding digits to each full adder circuit
and propagate each carry out to the carry in of the next higher full
adder.

   [Diagram: four FAs in cascade; FA i adds Xi and Yi to the carry
   from the next lower FA, producing Si; the low-order FA takes Cin
   and the high-order FA produces Cout.]
It is evident that this technique can be extended for multiple bits.


The major drawback to this circuit construction is the fact that the
carry propagation must go through many circuit levels to reach the
high order bit. For this reason, adders may employ carry
anticipation; for example, for a 2-bit adder, the Cout value can be
determined combinationally by examining its specification or simply
employing logic; i.e., Cout is given by
   (X1 AND Y1) OR                   [carry out via X1 and Y1 alone]
   ((X1 OR Y1) AND X0 AND Y0) OR    [carry out via carry in from 1st FA]
   ((X1 OR Y1) AND Cin AND (X0 OR Y0))
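The anticipated Cout expression can be checked against the rippled carry for all 32 input combinations; the sketch below (not from the notes) treats `&`, `|`, `^` as AND, OR, XOR on single bits:

```python
# Sketch: the lookahead Cout of a 2-bit adder agrees with the carry
# obtained by rippling through two full adders.
from itertools import product

def ripple_cout(x1, x0, y1, y0, cin):
    c1 = (x0 & y0) | ((x0 ^ y0) & cin)   # carry out of the low FA
    return (x1 & y1) | ((x1 ^ y1) & c1)  # carry out of the high FA

def lookahead_cout(x1, x0, y1, y0, cin):
    return (x1 & y1) | ((x1 | y1) & x0 & y0) \
        | ((x1 | y1) & cin & (x0 | y0))

for bits in product((0, 1), repeat=5):
    assert ripple_cout(*bits) == lookahead_cout(*bits)
print("carry anticipation agrees with rippled carry")
```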
Multiplier: Input is two 3-bit quantities (X2,X1,X0) and (Y2,Y1,Y0).
Think in terms of the construction

                      X2    X1    X0
                  x   Y2    Y1    Y0
                  ------------------
                    X2Y0  X1Y0  X0Y0
              X2Y1  X1Y1  X0Y1
        X2Y2  X1Y2  X0Y2
        ----------------------------
        . . .  +2    +2    +2   X0Y0
where +2 is the binary addition accomplished by a full adder. The
number of gates for this kind of construction is the reason
multiplication circuits may use sequential circuit techniques (to be
covered later).
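The partial-product construction amounts to ANDing each Yj against all of X and summing the shifted rows; a Python sketch follows (not from the notes, using ordinary addition in place of the full-adder columns):

```python
# Sketch: 3-bit multiplication as a sum of shifted partial products.

def multiply3(x, y):
    # x, y are 3-bit values; row j is (Yj AND each Xi), shifted j places
    product = 0
    for j in range(3):
        yj = (y >> j) & 1
        product += (x * yj) << j   # yj is 0 or 1, so x*yj is the AND row
    return product

for x in range(8):
    for y in range(8):
        assert multiply3(x, y) == x * y
print("3-bit multiplier verified")
```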
Subtraction: Full and half-subtractors can be constructed analogously
to full and half-adders.
Half subtractor: 2-bit subtraction is also accomplished by XOR. A
circuit for 2-bit subtraction that outputs both the difference (D) and
borrow (Bout) is called a half subtractor (a full subtractor also
accounts for an input borrow from a prior subtraction (Bin)).

   X  Y    D  Bout
   0  0    0   0
   0  1    1   1
   1  0    1   0
   1  1    0   0

   D = X ⊕ Y,  Bout = X'Y

   [Diagram: half subtractor (HS) block with inputs X,Y and outputs D
   and Bout.]

Full subtractor: To accommodate an input borrow we have

   X  Y  Bin    D  Bout
   0  0   0     0   0
   0  0   1     1   1
   0  1   0     1   1
   0  1   1     0   1
   1  0   0     1   0
   1  0   1     0   0
   1  1   0     0   0
   1  1   1     1   1

D = X ⊕ Y ⊕ Bin by the same analysis used for the C output variable of
the Gray to binary decoder discussed earlier.

Bout = X'Y'Bin + X'YBin' + X'YBin + XYBin, which reduces nicely to
       (X'Y' + XY)Bin + X'Y(Bin + Bin') = (X ⊕ Y)'Bin + X'Y

Both (X ⊕ Y)'Bin and X'Y are produced by two half subtractors arranged
as follows:

   [Diagram: X,Y feed the first HS, whose borrow is X'Y; its
   difference X ⊕ Y and Bin feed the second HS, whose borrow is
   (X ⊕ Y)'Bin.]

Hence to get a full subtractor (FS) we simply use two half-subtractors
with an OR gate applied to the two borrows:

   [Diagram: full subtractor (FS) from two HSs - the second HS's
   difference is D, and the OR of the two borrows is Bout.]

4-bit parallel subtractor: Input is two 4-bit quantities (X3,X2,X1,X0)
and (Y3,Y2,Y1,Y0). Input corresponding digits to each full subtractor
circuit and propagate each borrow out to the borrow in of the next
higher full subtractor.

   [Diagram: four FSs in cascade; FS i subtracts Yi and the borrow
   from the next lower FS from Xi, producing Di; the low-order FS
   takes Bin and the high-order FS produces Bout.]

Just as for the adder circuit, it is evident that this technique can be
extended for multiple bits. Note that the difference between the adder
and subtractor circuits is in how the propagated signal is dealt with
(whether carry or borrow).
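The two-half-subtractor construction can be sketched the same way as the adder (an illustration, not from the notes); a borrow out of 1 means the difference bit is 2 less than the true result:

```python
# Sketch: a full subtractor built from two half subtractors.

def half_subtractor(x, y):
    return x ^ y, (1 - x) & y          # (difference, borrow = X'Y)

def full_subtractor(x, y, bin_):
    d1, b1 = half_subtractor(x, y)     # b1 = X'Y
    d, b2 = half_subtractor(d1, bin_)  # b2 = (X xor Y)' Bin
    return d, b1 | b2                  # Bout = X'Y + (X xor Y)'Bin

for x in (0, 1):
    for y in (0, 1):
        for bin_ in (0, 1):
            d, bout = full_subtractor(x, y, bin_)
            assert x - y - bin_ == d - 2 * bout
print("full subtractor verified")
```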
BCD adder: Recall that BCD addition required adding 6 if the sum
exceeded 9. A BCD adder can then be formed by combining a 4-bit binary
adder with circuitry to make the adjustment when the sum exceeds 9.
Note that the test for a result greater than 9 is R3(R2 + R1), ORed
with the carry out of the binary adder (which also signals a result
greater than 9).

   [Diagram: X3..X0, Y3..Y0, and carry in feed a 4-bit binary adder
   producing R3..R0 and a carry out; the "result > 9" test drives an
   HA,FA,HA chain that adds 011 to R3R2R1, producing the BCD sum
   S3..S0 and the BCD carry out.]

Note that when the "exceeds 9" test is 0, the HA,FA,HA combination
simply adds in 0, which has no effect on the sum; otherwise, 011 is
added to R3R2R1, in effect adding 6.
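A behavioral sketch of the BCD digit adder (not from the notes; `bcd_add_digit` is a made-up name) showing the add-6 adjustment:

```python
# Sketch: add two BCD digits with the binary-add-then-adjust scheme.

def bcd_add_digit(x, y, cin=0):
    r = x + y + cin                    # the 4-bit binary adder
    binary_carry = 1 if r > 15 else 0
    r &= 0xF
    r3, r2, r1 = (r >> 3) & 1, (r >> 2) & 1, (r >> 1) & 1
    adjust = binary_carry | (r3 & (r2 | r1))   # result > 9 test
    if adjust:
        r = (r + 6) & 0xF              # the HA,FA,HA chain adds 0110
    return r, adjust                   # (BCD sum digit, BCD carry out)

for x in range(10):
    for y in range(10):
        for cin in (0, 1):
            s, cout = bcd_add_digit(x, y, cin)
            assert 10 * cout + s == x + y + cin
print("BCD digit adder verified")
```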
Other specialized circuits:

AOI gates: (AND-OR-Invert)
Suppose you have an expression such as (A + B + C)(A' + B). Then
double-inverting and applying the DeMorgan property, this becomes

   (A + B + C)(A' + B) = ( A'B'C' + AB' )'

which is an AND-OR-Invert expression. Hence AOI gates are employed to
implement product of sums expressions. A 2-wide, 3-input AOI gate has
the form:

   [Diagram: two 3-input AND gates feeding an OR gate whose output is
   inverted.]
Decoders/demultiplexers:
Both the Gray to binary decoder and BCD to 7-segment display
decoder/driver constructed earlier are cases of a class of circuits
called decoders and demultiplexers. Basically, a decoder translates
input data to a different output format.
Of particular interest is a decoder that decodes an input address to
activate exactly one of several outputs. In particular, a 1 of 2^n
decoder is one for which exactly one of 2^n output lines goes HIGH in
response to an n-bit input address. If there is a data input line also,
and the selected output matches the data input, then the circuit is
called a demultiplexer.
Example 1: 1 of 8 demultiplexer

   [Diagram: a Data in line and a 3-bit address (lines of weights
   1, 2, 4) select which of the addressed outputs 0-7 receives the
   data.]

In essence a demultiplexer routes the input data to the addressed
output.
Example 2: Constructing a 1 of 16 decoder/demultiplexer from two 1 of
8 decoder/demultiplexers
Decoder/demultiplexers usually include a chip select or enable
input to activate/deactivate the circuit. With an enable input a
larger decoder/demultiplexer can be constructed from smaller ones;
for example, a 1 of 16 decoder/demultiplexer can be constructed from
two 1 of 8 decoder/demultiplexers as follows:
   [Diagram: the address line of weight 8 drives the chip select (CS)
   of the second 1 of 8 demultiplexer directly and, inverted, the CS
   of the first; the lines of weights 1, 2, 4 address both; the first
   demultiplexer provides outputs 0-7 and the second outputs 8-15,
   both fed from Data in.]
This kind of construction is very useful for addressing memory.

A 1 of n decoder can also be used to directly implement a logic
function. For example, the specification
   f(X,Y,Z) = Σ(2,5,6)
can be implemented using a 1 of 8 decoder by

   [Diagram: X,Y,Z drive the address inputs (weights 4, 2, 1) of a
   1 of 8 decoder; decoder outputs 2, 5, and 6 are ORed together to
   produce f(X,Y,Z) = Σ(2,5,6).]

Internally, a decoder simply uses AND gates to produce the desired
outputs; e.g., a 1 of 4 decoder has the construction

   [Diagram: the two address lines and their complements feed four
   2-input AND gates, one per addressed output 0-3.]
So the circuit implementation for f(X,Y,Z) as implemented above is
just a sum of products (in fact, the canonical form since it is just
minterms ORed together).
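A Python sketch (not from the notes) of the decoder-based implementation: the decoder supplies the minterms and an external OR combines the selected ones.

```python
# Sketch: implement f(X,Y,Z) = Σ(2,5,6) with a 1 of 8 decoder.

def decoder_1_of_8(x, y, z):
    address = 4 * x + 2 * y + z
    return [1 if i == address else 0 for i in range(8)]

def f(x, y, z):
    outputs = decoder_1_of_8(x, y, z)
    return outputs[2] | outputs[5] | outputs[6]   # OR of minterms 2,5,6

truth = [f((n >> 2) & 1, (n >> 1) & 1, n & 1) for n in range(8)]
assert truth == [0, 0, 1, 0, 0, 1, 1, 0]
print("decoder implementation of f verified")
```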
Multiplexers:
A multiplexer circuit is the inverse of a demultiplexer and is even
more useful for implementing logic circuits because it does not
require ORing of outputs.
An 8 input multiplexer has the form

   [Diagram: data inputs 0-7, address inputs of weights 1, 2, 4, a
   chip select CS, and complementary outputs Output and Output'.]
For a multiplexer, the address refers to the input lines. The output
value is that of the addressed input. Normally, both a chip select line
and the complement of the output are also provided.

A 4 input multiplexer (MUX) has the construction:

   [Diagram: the two address lines and their complements gate the
   data inputs 0-3 through AND gates into an OR gate producing Output
   and, inverted, Output'.]
The basic addressing strategy is the same as for a decoder, but for a
multiplexer the AND gates are also used to enable (or suppress) input
values. Chip select is not implemented above, but can be accomplished by
increasing the input capacity of each AND gate, attaching the chip select
line to each AND. The OR gate that had to be supplied externally when
using a decoder to implement a logic function is now incorporated into
the construction.
Implementing a logic function using a multiplexer is best illustrated by
an example. Suppose that the specification
f(A,B,C,D) = Σ(0,2,3,11,14)
is what is given. f(A,B,C,D) can be implemented using an 8-input
multiplexer as follows:
   A B C D   f(A,B,C,D)
   0 0 0 0       1
   0 0 0 1       0      pair (0,1):   MUX input 0 = D'
   0 0 1 0       1
   0 0 1 1       1      pair (2,3):   MUX input 1 = 1
   0 1 0 0       0
   0 1 0 1       0      pair (4,5):   MUX input 2 = 0
   0 1 1 0       0
   0 1 1 1       0      pair (6,7):   MUX input 3 = 0
   1 0 0 0       0
   1 0 0 1       0      pair (8,9):   MUX input 4 = 0
   1 0 1 0       0
   1 0 1 1       1      pair (10,11): MUX input 5 = D
   1 1 0 0       0
   1 1 0 1       0      pair (12,13): MUX input 6 = 0
   1 1 1 0       1
   1 1 1 1       0      pair (14,15): MUX input 7 = D'

   [Diagram: an 8-input MUX with A,B,C on the address lines (weights
   4, 2, 1), CS enabled, the data inputs wired as tabulated, and
   outputs f(A,B,C,D) and f'.]

Note that columns A,B,C select 0,1, ..., 7 in pairs, each of which
corresponds to one of 0, D, D', 1 on the output side. This provides a
mapping from the truth table to an 8-input MUX as indicated. The SN74151
chip is an 8-input MUX commonly used for this purpose.
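The MUX wiring can be sketched in Python (an illustration, not from the notes): A,B,C pick the input, and each input carries 0, 1, D, or D'.

```python
# Sketch: implement f(A,B,C,D) = Σ(0,2,3,11,14) with an 8-input MUX.

def mux8(inputs, a, b, c):
    return inputs[4 * a + 2 * b + c]      # output = addressed input

def f(a, b, c, d):
    nd = 1 - d
    inputs = [nd, 1, 0, 0, 0, d, 0, nd]   # the wiring from the table
    return mux8(inputs, a, b, c)

minterms = {0, 2, 3, 11, 14}
for n in range(16):
    a, b, c, d = (n >> 3) & 1, (n >> 2) & 1, (n >> 1) & 1, n & 1
    assert f(a, b, c, d) == (1 if n in minterms else 0)
print("MUX implementation of f verified")
```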

Comparators:
A comparator takes two input values and reports them as <, =, or >.
Starting from the most significant bit, the comparator cascades
comparisons until corresponding bits are found that are different (the
limiting case is all bits are equal). The first occurrence of
corresponding bits that are different determines whether the output
should be > or <. The circuit diagram for a 4-bit binary comparator to
compare (X3,X2,X1,X0) to (Y3,Y2,Y1,Y0) is below:
   [Circuit diagram: 4-bit comparator with INPUTS X3,Y3 ... X0,Y0
   plus cascade inputs <, =, > and OUTPUTS <, =, >. For the < output:
   the top line is a < test on X3,Y3; each remaining line is a < test
   on a lower order pair together with = tests for all higher order
   bit pairs (1 is output if the < test is 1 and all higher order =
   tests are 1); a 1 from a prior comparator (< is set in testing
   higher order pairs) forces the < output to be 1. The > output is
   formed symmetrically.]
The circuit allows for cascading of comparators, where input from a
comparator testing higher order bits may have already determined the
outcome. Tracing the circuit strategy as indicated in the annotation
shows that it implements the approach sketched out above.


More specifically, the comparator as given is based on standard
comparison logic; i.e.,
case:
1. the "<" input line is 1 (the outcome is already "<"
based on higher order bits)
then
the "<" output line will be 1,
the "=" output line will be 0, and
the ">" output line will be 0
2.  the ">" input line is 1 (the outcome is already ">"
    based on higher order bits)
then
the "<" output line will be 0,
the "=" output line will be 0, and
the ">" output line will be 1

3.  the "=" input line is 1 (the higher order bits are all
    "=", so the comparison depends on lower order digits)
then
if A3 < B3 OR
A3 = B3 AND A2 < B2 OR
A3 = B3 AND A2 = B2 AND A1 < B1 OR
A3 = B3 AND A2 = B2 AND A1 = B1 AND A0 < B0 then
the "<" output line will be 1,
the "=" output line will be 0, and
the ">" output line will be 0
else if
A3 > B3 OR
A3 = B3 AND A2 > B2 OR
A3 = B3 AND A2 = B2 AND A1 > B1 OR
A3 = B3 AND A2 = B2 AND A1 = B1 AND A0 > B0 then
the "<" output line will be 0,
the "=" output line will be 0, and
the ">" output line will be 1
else (the result must be "=")
the "<" output line will be 0,
the "=" output line will be 1, and
the ">" output line will be 0

Particular attention should be given to how the logic has been
implemented in the circuit diagram. Contrast this to an approach
that seeks to work from a truth table specification to a minimal
sum of products or product of sums solution.
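The cascaded comparison logic can be sketched as a pass over the bit pairs, highest order first (a Python illustration, not from the notes; `compare_bit` and `compare4` are made-up names):

```python
# Sketch: cascadable comparator logic - each stage refines the
# (<, =, >) outcome from higher order bits with its own bit pair.

def compare_bit(state, x, y):
    lt, eq, gt = state
    if not eq:                 # outcome already decided by higher bits
        return state
    if x < y:
        return (1, 0, 0)
    if x > y:
        return (0, 0, 1)
    return (0, 1, 0)           # still equal; defer to lower bits

def compare4(xs, ys, state=(0, 1, 0)):
    # xs = (X3,X2,X1,X0), ys = (Y3,Y2,Y1,Y0); state is the cascade input
    for x, y in zip(xs, ys):
        state = compare_bit(state, x, y)
    return state               # one-hot (<, =, >)

def to_bits(n):
    return tuple((n >> i) & 1 for i in (3, 2, 1, 0))

for m in range(16):
    for n in range(16):
        assert compare4(to_bits(m), to_bits(n)) == \
            (int(m < n), int(m == n), int(m > n))
print("4-bit comparator verified")
```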

Quine-McCluskey procedure: (optional non-graphical approach to reduction)
As the number of variables increases, the K-map graphical reduction
technique becomes increasingly problematic. The Quine-McCluskey
procedure is an algorithmic alternative best employed for computer
implementation and is covered for completeness.
Step 1:
Lay out the minterms in groups having the same number of 1s, groups
ordered by increasing numbers of 1s. This is a listing of all
blocks of 1.
Step 2:
Compare each group to the one immediately below it to form all
blocks of 2. Flag each block of 1 when it is used in forming a
block of 2.
Repeat this process on the blocks of 2 to form all possible blocks
of 4, then blocks of 8, and so on. Flag each block when it is used
to form a larger block.
Any blocks not used in forming larger blocks are carried forward to
step 3. Do not list any blocks formed redundantly (e.g., a block of
4 contains 4 blocks of 2 and so can be formed 2 different ways).
Illustration:  f(A,B,C,D) = Σ(0,2,3,4,5,8,10,11,12,13,14,15)

   A B C D   f
   0 0 0 0   1
   0 0 0 1   0
   0 0 1 0   1
   0 0 1 1   1
   0 1 0 0   1
   0 1 0 1   1
   0 1 1 0   0
   0 1 1 1   0
   1 0 0 0   1
   1 0 0 1   0
   1 0 1 0   1
   1 0 1 1   1
   1 1 0 0   1
   1 1 0 1   1
   1 1 1 0   1
   1 1 1 1   1

   blocks of 1       blocks of 2        blocks of 4

    1) 0000 *         1) 00-0 *          1) -0-0
                      2) 0-00 *          2) --00
    2) 0010 *         3) -000 *          3) -01-
    3) 0100 *                            4) -10-
    4) 1000 *         4) 001- *          5) 1--0
                      5) -010 *          6) 1-1-
    5) 0011 *         6) 010- *          7) 11--
    6) 0101 *         7) -100 *
    7) 1010 *         8) 10-0 *
    8) 1100 *         9) 1-00 *

    9) 1011 *        10) -011 *
   10) 1101 *        11) -101 *
   11) 1110 *        12) 101- *
                     13) 1-10 *
   12) 1111 *        14) 110- *
                     15) 11-0 *

                     16) 1-11 *
                     17) 11-1 *
                     18) 111- *

Step 3:
Form the table of minterms and blocks from the first 2 steps. Mark
each minterm participating in a block in the corresponding row-column
as illustrated below. Any column with a single entry is essential.
Continuing with the example we have (writing X' for the complement
of X):

             0000 0010 0011 0100 0101 1000 1010 1011 1100 1101 1110 1111
B'D'  -0-0     *    *                   *    *
C'D'  --00     *              *         *              *
B'C   -01-          *    *                   *    *
BC'   -10-                    *    *                        *    *
AD'   1--0                              *    *         *         *
AC    1-1-                                   *    *              *    *
AB    11--                                             *    *    *    *

Columns 0011 and 0101 each have a single entry, so B'C (-01-) and
BC' (-10-) are essential.

Step 4:
Remove the rows associated with essential entries along with any
columns intersected by one or more of these rows. Put the terms
representing the rows into the final sum. If 2 rows are identical,
first eliminate based on dominance (number of 1s), next
arbitrarily. Repeat Steps 3 and 4 until all rows are used. In the
example, all rows get removed the 2nd time step 4 is used.
             0000 1000 1110 1111
C'D'  --00     *    *
B'D'  -0-0     *    *               } identical rows (remove 1 arbitrarily)
AD'   1--0          *    *
AC    1-1-               *    *
AB    11--               *    *     } identical rows (remove 1 arbitrarily)

Choosing --00 over -0-0 and 11-- over 1-1- (arbitrarily), the repeat
of Steps 3 and 4 makes --00 and 11-- essential; these cover all
remaining columns (so 1--0 is never used) and

   f(A,B,C,D) = AB + B'C + BC' + C'D'
Step 5:
When an identical row is removed arbitrarily in Step 4 (no
dominance), repeat the process for the alternate case - all
combinations of duplicate row elimination should be explored and the
minimal expression for each case generated. The user can then
select from among these (which may provide additional possibilities
for combinations; in the above example, B'C + BC' = B ⊕ C is present
in the result given. Alternatively, B'D' can replace C'D' and AC can
replace AB).
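Since the procedure is best employed for computer implementation, here is a compact Python sketch of Steps 1-2 (not from the notes): it generates the prime implicants for the example by repeated merging, leaving the covering Steps 3-5 to inspection.

```python
# Sketch: generate prime implicants by the Quine-McCluskey merging
# steps, cubes written as strings like '--00'.

def merge(a, b):
    # merge two cubes differing in exactly one fixed position
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    if len(diff) == 1 and '-' not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + '-' + a[i + 1:]
    return None

def prime_implicants(minterms, nbits=4):
    cubes = {format(m, '0%db' % nbits) for m in minterms}
    primes = set()
    while cubes:
        merged, used = set(), set()
        for a in cubes:
            for b in cubes:
                c = merge(a, b)
                if c:
                    merged.add(c)          # sets drop redundant forms
                    used.update((a, b))    # flag blocks used in merges
        primes |= cubes - used   # blocks not used in a larger block
        cubes = merged
    return primes

pis = prime_implicants({0, 2, 3, 4, 5, 8, 10, 11, 12, 13, 14, 15})
assert pis == {'-0-0', '--00', '-01-', '-10-', '1--0', '1-1-', '11--'}
print(sorted(pis))
```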


Logic level: Sequential Logic

Sequential logic addresses circuits that have current-state,
next-state behavior; i.e., are of the form:

   [Diagram: Sequential Circuit - Inputs and the Current State feed a
   Combinational Circuit; its outputs are the external Outputs and
   the Next State, which the Storage Elements capture to close the
   Feedback Loop.]
The storage elements provide current state inputs, which together with
external inputs are the inputs for a combinational circuit whose
outputs provide the external outputs for the sequential circuit and the
next state (to be captured in the storage elements to form a feedback
loop).
The circuit is clocked in the sense that the circuit only changes state
when a clock signal is received; i.e., the next state output is
captured in the storage elements (to become the current state) only on
a clock pulse, typically on a clock transition from 0 to 1.
A state diagram is used to specify the current-state, next-state
behavior of a circuit. If there are 2 inputs, then for each state,
there are up to 4 possible next states that must be specified.
The fundamental circuit having current-state, next-state behavior is
called a flip-flop. A flip-flop has 2 stable states (0 and 1); i.e., it
is bistable. It stores a single bit of information and maintains its
state as long as power is supplied to the circuit. State change occurs
only in response to a change in input values.
Types of flip-flops differ as to the number of inputs and how the
inputs affect the state of the device.
The most basic type of flip-flop is called a latch. Latches can be
used to store information, but are subject to race conditions (the
latch has a setup time, during which there may be an output value
that is wrong, which may race to another part of the circuit and
cause a transition that should not occur; this is not an issue for
combinational circuits so long as they are not being used in a
sequential context).

Set-Reset latches:
The SR-latch formed from NOR gates is one of the fundamental latches
that can be formed from basic logic gates. It has the construction:

   [Diagram: two cross-coupled NOR gates; R and the feedback from Q'
   feed the gate producing Q, and S and the feedback from Q feed the
   gate producing Q'.]

Each NOR gate's output is fed back to the other's input. SR stands for
Set-Reset. The behavior (characteristic table) can be tabulated by

   S R Q    Qnext  Q'next
   0 0 0      0      1     } no change
   0 0 1      1      0     }
   0 1 0      0      1     } reset to 0 (active transition when Q = 1)
   0 1 1      0      1     }
   1 0 0      1      0     } set to 1 (active transition when Q = 0)
   1 0 1      1      0     }
   1 1 0      0      0     } unstable
   1 1 1      0      0     }

The state diagram for an SR flip-flop is given by

   [State diagram: state 0 loops to itself on inputs SR = 00 or 01
   and goes to state 1 on SR = 10; state 1 loops to itself on SR = 00
   or 10 and goes to state 0 on SR = 01. Valid inputs are 00, 01, 10.]
If NAND gates are used instead of NOR, the result is called an
S'R'-latch.

   [Diagram: two cross-coupled NAND gates with inputs S' and R' and
   outputs Q and Q'.]

The reason for this becomes clear if the behavior is tabulated against
S and R inputs rather than S' and R'. Note that the behavior duplicates
that of the SR latch, except for the invalid case; i.e., the
characteristic table is:

   S R Q    Qnext  Q'next
   0 0 0      0      1     } no change
   0 0 1      1      0     }
   0 1 0      0      1     } reset to 0 (active transition when Q = 1)
   0 1 1      0      1     }
   1 0 0      1      0     } set to 1 (active transition when Q = 0)
   1 0 1      1      0     }
   1 1 0      1      1     } unstable
   1 1 1      1      1     }

It is instructive to examine timing considerations for the two cases
where there are transitions as the latch sets up the new output
values. Assume that it takes a discrete time interval t before the
output of a gate registers, so our viewpoint for the latch is

   [Diagram: the cross-coupled NOR latch, each gate with delay t.]

Taking snapshots of Q and Q' at time intervals 0, t, 2t we get
                 S  R  Q  Q'   elapsed time
   Active        0  1  1  0        0
   Reset to 0    0  1  0  0        t    -- response to R = 1 sets Q to 0
                 0  1  0  1        2t   -- response to Q = 0 sets Q' to 1

   Active        1  0  0  1        0
   Set to 1      1  0  0  0        t    -- response to S = 1 sets Q' to 0
                 1  0  1  0        2t   -- response to Q' = 0 sets Q to 1

In both instances it takes 2t for the circuit to stabilize. Flip-flops
are usually handled synchronously with inputs held at the no change
state until a clock pulse occurs. A gate can be used for this purpose,
for example, with an AND gate:

   [Diagram: input and clock feed an AND gate producing output.]

   If clock = 0 then output = 0.  If clock = 1, then output = input.
The basic SR-latch has no provision for clock input and so is
configured for asynchronous usage. Note that since Qnext is a function
of S,R,Q we can derive a next state equation as follows:

   Qnext = f(S,R,Q) = Σ(1,4,5) + d(6,7)   from the earlier tabulation

and the K-map is

        RQ=00  RQ=01  RQ=11  RQ=10
   S=0    0      1      0      0
   S=1    1      1      d      d

so Qnext = S + R'Q
To add a control (or enable), the S'R'-latch turns out to be the most
natural underlying latch because it responds to inverted inputs:

   [Diagram: S is NANDed with control C to form (CS)', and R is
   NANDed with C to form (CR)'; these feed the S'R'-latch producing Q
   and Q'.]

Separate preset and preclear lines can be added to allow flip-flop
initialization without using the controlled inputs.

   [Diagram: the controlled latch with a Preset line on the Q-side
   NAND and a Preclear line on the Q'-side NAND.]
Clock signals typically are produced in the form of square waves or
regularly spaced pulses.

   [Figure: a square wave and a train of regularly spaced pulses.]
Edge-triggered flip-flops:
There is a voltage setup time when the signal changes from 0 to 1.
The edge of the pulse for the 0 to 1 transition is called the leading
edge, and for the 1 to 0 transition the trailing edge.

   [Figure: one clock pulse, marking the leading edge, the trailing
   edge, and the voltage setup interval.]

An edge-triggered flip-flop changes state when the edge is reached.
The value of the flip-flop remains constant until the next edge is
reached. There are leading edge triggered and trailing edge triggered
flip-flops. Normally all flip-flops in a circuit should trigger on
the same edge. If types are mixed, the leading edge can be converted
to trailing edge by inverting the control input (and vice-versa).
Flip-flops are designated by the symbols

   [Figure: two flip-flop block symbols, the first for leading edge
   triggered and the second (with a bubble on the control input) for
   trailing edge triggered. The > marks the control input. Note that
   the output for Q' is marked as well.]

For example, the flip-flops on the SN7473 are trailing edge triggered
and those on the SN7474 are leading edge triggered.
Master-Slave flip-flops:
A master-slave flip-flop combines two flip-flops (with controls) where
the master flip-flop triggers on the leading edge. The slave flip-flop
then triggers on the trailing edge in response to the values of the
master flip-flop.

   [Diagram: master SR flip-flop (leading edge triggered) whose Q and
   Q' outputs feed the S and R inputs of the slave flip-flop
   (trailing edge triggered); both are driven from the same clock.]
There are two virtues to this construction:
1. the overall output does not change while the control input is
   high, since the overall output comes from the slave flip-flop,
   which sets up only when the control input goes low
2. the slave flip-flop is isolated from the rest of the circuit,
   responding only to the master flip-flop's value.
(Without this kind of protection in a circuit with multiple
interconnected flip-flops, a race condition may occur, where an
intermediate value gets latched rather than a final value.)
From the external view point, the master-slave flip-flop triggers on
the trailing edge.
A note on latches:
Although basic latches should be avoided when a circuit requires
multiple flip-flops, basic latches still have uses.
Example: debouncing a switch
The mechanical nature of a physical switch precludes a smooth
transition between 0 and 1 when the switch is opened or closed.
This phenomenon is called bounce, because the switch value may
haphazardly alternate between open and closed as the switch contacts
separate on opening or connect on closing. It is a simple
application to debounce a single pole, double throw switch using a
basic latch; e.g.,
   [Diagram: a SPDT switch connects either the S' or the R' input of
   an S'R' (NAND) latch to GND; each input also has a pull-up
   resistor to Vcc.]
The two resistors are needed to prevent a short circuit between Vcc
and GND for the input connected through the switch (they are called
pull-up resistors because when connected between Vcc and GND, they
pull the voltage on the Vcc side of the resistor up to logic 1).
When the switch as shown above is thrown to its opposite position,
the flip-flop will set to 1 the first time 0 is detected on S', and
will hold that value because if a bounce takes S' back to 1, the
effect is applying 1,1 on R',S', which is the no-change state of the
flip-flop (i.e., the flip-flop can't revert to its prior value).
Generally, the term latch is only used in reference to flip-flops
whose outputs are not protected from intermediate values while setting
up. Unless qualified by the term latch, the use of the term flip-flop
normally refers to a leading or trailing edge triggered flip-flop
that is protected. The master-slave construction is one approach used
for producing flip-flops. The SN7473 and SN7476 are in this category.
An aside about electricity:
The example of debouncing a switch may arouse curiosity regarding
the use and selection of resistors with TTL integrated circuits such
as the SN7400 (quad 2-input NAND chip). Selection requires the
application of a small amount of knowledge about voltage,
resistance, and electric current.
Ohm's Law:
Ohm's Law is the relationship between electromotive force E
(measured in voltage, symbolized by V), current I (measured in
Amperes, symbolized by A), and resistance or impedance R (measured
in Ohms, symbolized by Ω); namely,
   E = IR
This is closely related to Joule's Law of power (measured in Watts,
symbolized by W); namely,
   P = EI
Current is the rate of flow of electric charge in a circuit and is
measured in electron charge. By international standard, 1 Ampere of
current is defined as the flow of 6.24 x 10^18 electron charges
(called a Coulomb) per second. Its rather bizarre value is derived
from the number of atoms in a gram of Carbon.
Note that the relationship between current and resistance is I = E/R,
so current is inversely proportional to resistance at constant
voltage. When plotted, the curve is

   [Figure: I = E/R plotted against R over the interval 1Ω to 10Ω.]

The area under the curve is given by multiplying current by
resistance; i.e., it represents voltage. It is also given by the
natural logarithm, as discussed in calculus classes.
Standard resistor values:
If the curve is scaled by the inverse of the natural logarithm of 10
(1/ln(10) = .4343), the area is given by the base 10 logarithm and
consequently the area between 1 and 10 is 1V. Manufacturers have
chosen to use impedance values that equally divide this area into 6,
12, or 24 equal subareas (the E6, E12, and E24 series). 1/6 =
.166667 = p and the impedance values are then 10^0 = 1, 10^p = 1.468,
10^2p = 2.154, 10^3p = 3.162, 10^4p = 4.642, 10^5p = 6.813. The
values adopted for the E6 resistor series are 1.0, 1.5, 2.2, 3.3,
4.7, 6.8, which approximate the above calculations. Resistors are
chosen whose E-series value is a close match for the value needed.
For example, if a 50000Ω resistor is needed, then 47KΩ is used from
the E6 series or 51KΩ from the E24 series.
If a 47K resistor is used in the debouncing circuit above, and Vcc
is at +5V, the current flow is I = 5/47000, which is .000106 Amps or
0.106 mA, where mA designates milliamps. TTL draws no more than
.04mA for 1 to be detected at an input; ie., 47K resistors are
commonly used as pull-up resistors when working with TTL chips.
Using a higher resistor value reduces the current draw (and thus,
power consumption) but the circuit may fail to work if the power at
the input is inadequate.
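Both calculations above are easy to reproduce (a sketch, not from the notes):

```python
# Sketch: the E6 series spacing and the 47K pull-up current.

p = 1 / 6
e6 = [round(10 ** (k * p), 3) for k in range(6)]
# these computed values are rounded by manufacturers to the E6
# series 1.0, 1.5, 2.2, 3.3, 4.7, 6.8
assert e6 == [1.0, 1.468, 2.154, 3.162, 4.642, 6.813]

# Ohm's Law: current through a 47K pull-up resistor at Vcc = +5V
i = 5 / 47000                          # I = E/R, in Amps
assert round(i * 1000, 3) == 0.106     # 0.106 mA
print("E6 spacing and pull-up current check out")
```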
Batteries:
Batteries have an internal impedance which varies according to
battery size and type. As a battery is used, its impedance grows,
reducing power output.
Alkaline batteries: A fully charged 1.5V alkaline cell will have
an impedance of about 0.32Ω, which means that the limiting
current between terminals is 4.7A.
NiCad batteries: NiCad batteries in contrast have about half the
capacity (stored energy) of alkalines, but hold their voltage
relatively constant during discharge (alkalines lose voltage
linearly). The basic NiCad cell is 1.2V and when fully charged
has an impedance of about 0.12Ω, yielding a maximum current of
about 10A; i.e., NiCad batteries can supply power at about twice
the rate of alkalines and so are used in more power hungry
applications.
Following Joule's Law, Amps x Volts x time = Watt-hours is used as a
measure of power consumption.
It follows that an alkaline cell can provide up to 7 Watts and a
NiCad cell up to 12 Watts of power. Battery capacity is usually
measured in Amp-hours rather than Watt-hours.
Batteries in series:
Putting batteries in series increases electrical potential
additively; i.e., two alkaline cells in series produces a 3V
battery. Impedance also doubles, so there is no change in
maximum discharge characteristics.
Batteries in parallel:
If batteries are placed in parallel, then the voltage is
unaffected and the impedance is changed according to R^2/2R = R/2.
For 2 alkaline cells this is 0.16Ω, increasing the discharge
maximum to 9.4A, doubling the current capacity. This assumes
that the batteries are matched. Note that in parallel, a weak
battery will tend to discharge its companions, since Mother
Nature seeks balance.
Alternating current:
Batteries produce direct current (DC) with current flow in one
direction (source to ground). A current for which the current flow
reverses direction cyclically is called alternating current (AC) and
is produced by rotating a wire coil through a magnetic field.
Magnets have poles (+ and -), so if the coil is first oriented + -,
after a 180° rotation it will be oriented - + and the induced
voltage will reverse. If the rotation is constant then the voltage
will follow a sinusoidal pattern. In the US, the AC standard for
house wiring is 60 cycles per second alternating between -120V and
+120V. AC is used because it is relatively efficient to transform
it to high voltage for transmission (which requires less current
flow to move the same amount of power). Of course it has to be
transformed back to safer levels for use in the home. Devices
called rectifiers are used to convert AC power to DC. House current
can be converted by using both a transformer and a rectifier to
produce a DC output that can be used in place of a battery (just be
sure that the voltage is correct for the use intended).
A 60 Watt 120V light bulb requires 60/120 = 0.5A. Circuit capacity
is limited by the amount of current the transmission wire can handle
before its natural resistance causes overheating (and failure).
Increasing wire diameter, or braiding together multiple wires,
reduces resistance and increases capacity. To protect the
transmission wire, a fuse is used to keep from overloading the
circuit. A 20 Amp 120V circuit can handle a load of 2400 Watts
(i.e., two 1500 Watt hair dryers will blow the fuse).
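The arithmetic above can be checked directly; a minimal Python sketch (the wattages and the circuit rating are the ones used in the text, and the function names are ours, chosen for illustration):

```python
# Power P = V * I, so current draw I = P / V, and a circuit's capacity
# in Watts is the fuse rating times the line voltage.
def current_draw(watts, volts):
    return watts / volts

def circuit_capacity(amps, volts):
    return amps * volts

bulb_amps = current_draw(60, 120)     # 60W bulb on a 120V line
capacity = circuit_capacity(20, 120)  # 20 Amp fuse on a 120V circuit
two_dryers = 2 * 1500                 # combined hair dryer load in Watts

print(bulb_amps)               # 0.5
print(capacity)                # 2400
print(two_dryers > capacity)   # True: the fuse blows
```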
D-latches and D flip-flops:
A D-latch (D for delay) has the form:
[Figure: D-latch with input D, clock input C, and outputs Q and Q']
It is the (clocked) SR-latch with D driving the S input and D'
driving the R input. Hence, it
triggers on the leading edge. Obviously, the same minor modification
applied to the master-slave SR flip-flop covered earlier will produce a
master-slave D flip-flop. The value of a D flip-flop is just the
input, but one cycle behind (hence the term delay). It should be noted
that a D flip-flop has only one input.
An alternative construction of a D-latch:
A tri-state buffer is a gate whose output can be in one of three
states: 1, 0, or null (same as no contact). It has the form
[Figure: tri-state buffer with an Input line, an Output line, and a Ctrl line]
When Ctrl = 1 then Output = Input; when Ctrl = 0, Output = null.

Page 59
Tri-state buffers can be used to construct a D-latch as follows:
[Figure: D-latch built from tri-state buffers, one enabled by the
Clock and one by its complement, holding outputs Q and Q']
When the clock value goes high, output Q = input D; i.e., the latch is
leading edge triggered. Either this construction or the NAND
construction produces a viable D-latch.
Two D flip-flop constructions based on D-latches are as follows:
[Figure: Master-Slave D flip-flop: a master D-latch clocked by ck
feeding a slave D-latch clocked by the complement of ck; overall
input D and output Q]
The master-slave construction works with either version of the D-latch
since both trigger on the leading edge. The overall construction is a
trailing edge triggered flip-flop.
The next construction uses 3 S'R' latches cleverly to produce a leading
edge triggered D flip-flop. By inverting the clock input, the
master-slave version can be converted to leading edge triggered, but it
requires more logic gates.

Page 60

[Figure: leading edge triggered D flip-flop built from three S'R'
latches; internal signals D0, x, and y feed a right-most output latch]

D0 = D       when ck = 0
x, y = 1     when ck = 0
x = D0'      when ck = 1
y = D0       when ck = 1

Leading Edge Triggered D Flip-flop


When ck = 0, both x and y are held at 1, the no change state for the
right-most latch. At the same time the upper latch outputs D' and
feeds it to the lower latch to produce D internally. When the clock
rises to 1, D is latched at y and D' at x, to be latched by the
rightmost latch as Q and Q'. If D is changed while ck = 1 and x
has latched 0, there is no effect. If x = 1, then y = 0 blocks any
change in D from affecting x (the purpose of the feedback from the
lower to the upper latch) and also prevents the feedback from the upper
latch from affecting the lower latch. Hence, the flip-flop latches the
value on the leading edge.
In effect, the flip-flops in the circuit set up based on the values
from the prior clock cycle, and so all inputs are stable each time the
triggering edge is reached.
Other flip-flops:
While the SR-latch has uses in practice, the SR flip-flop does not
because it does not make use of a 1,1 input. As we have seen, a D
flip-flop uses a single input (other than ck). A T flip-flop also uses
a single input and simply toggles the state when the input is 1. A
JK flip-flop combines the SR flip-flop and toggles when inputs are 1,1.
T flip-flop:
When T=0, the flip-flop values are unchanged. When T=1, the next
state is the opposite of the current state. Hence, the
characteristic table for the flip-flop is given by:

Page 61

T   Q   Qnext
0   0   0
0   1   1
1   0   1
1   1   0       toggle when T = 1

Qnext = T'Q + TQ' = T ⊕ Q
JK flip-flop:
This flip-flop just combines the functions of the SR and T flip-flops
and so is widely used. Its characteristic table is given by

J   K   Q   Qnext
0   0   0   0
0   0   1   1       no change
0   1   0   0
0   1   1   0       reset to 0
1   0   0   1
1   0   1   1       set to 1
1   1   0   1
1   1   1   0       toggle

K-map analysis shows that Qnext = JQ' + K'Q.
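The characteristic equation can be checked against the table by evaluating it for all eight input combinations; a minimal Python sketch:

```python
def jk_next(j, k, q):
    # Qnext = J*Q' + K'*Q, the characteristic equation above
    return (j & (1 - q)) | ((1 - k) & q)

# verify each row of the characteristic table: (J, K, Q) -> Qnext
assert jk_next(0, 0, 0) == 0 and jk_next(0, 0, 1) == 1   # no change
assert jk_next(0, 1, 0) == 0 and jk_next(0, 1, 1) == 0   # reset to 0
assert jk_next(1, 0, 0) == 1 and jk_next(1, 0, 1) == 1   # set to 1
assert jk_next(1, 1, 0) == 1 and jk_next(1, 1, 1) == 0   # toggle
print("characteristic equation matches the table")
```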
Excitation controls:
In a circuit, flip-flop inputs have to be set to produce desired
next-state behavior. This is trivial for the D flip-flop. For the JK
flip-flop the excitations are given by

Qpresent  Qnext   Excitation: J,K values as a function of Q and Qnext
0         0       J = 0, K = -
0         1       J = 1, K = -
1         0       J = -, K = 1
1         1       J = -, K = 0

(- denotes a don't care.) Any flip-flop has present-state, next-state
capabilities, so any flip-flop type can be produced from any other
flip-flop type.
Example: A T flip-flop from a JK flip-flop
[Figure: T flip-flop from a JK flip-flop: T is fed to both the J and
K inputs (J = K = T), with clock ck and outputs Q, Q']
Page 62
Example: A JK flip-flop from a D flip-flop
The key to the construction is to set it up as follows:

[Figure: a combinational circuit takes the external inputs J, K (the
type of flip-flop being created) together with the current state
values Q, Q' and determines the D control (the type of flip-flop
being used) that produces the spec'd next state; ck clocks the D
flip-flop]
This guides the table to construct as follows:

J   K   Q   Qnext   D (control producing spec'd next state)
0   0   0   0       0
0   0   1   1       1
0   1   0   0       0
0   1   1   0       0
1   0   0   1       1
1   0   1   1       1
1   1   0   1       1
1   1   1   0       0

K-map analysis (D against J, K, Q) yields

D = JQ' + K'Q

so our diagram becomes
[Figure: the JK spec realized with a D flip-flop whose input is
D = JQ' + K'Q; clock ck, outputs Q and Q']
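The construction can be verified exhaustively: feeding D = JQ' + K'Q into a D flip-flop (which simply latches its input) must reproduce the JK behavior. A minimal sketch:

```python
def jk_reference(j, k, q):
    # behavioral JK flip-flop: 00 no change, 01 reset, 10 set, 11 toggle
    if (j, k) == (0, 0):
        return q
    if (j, k) == (0, 1):
        return 0
    if (j, k) == (1, 0):
        return 1
    return 1 - q

def d_control(j, k, q):
    # the combinational circuit in front of the D flip-flop: D = JQ' + K'Q
    return (j & (1 - q)) | ((1 - k) & q)

# a D flip-flop latches D, so Qnext = d_control(...); check all 8 cases
for j in (0, 1):
    for k in (0, 1):
        for q in (0, 1):
            assert d_control(j, k, q) == jk_reference(j, k, q)
print("D-based construction matches the JK behavior")
```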

Page 63
Example: Make up your own flip-flop and construct it from JK flip-flops
Specify the characteristic table and the JK excitations that will
produce the same next state behavior.
U   N   F   Q   Qnext
0   0   0   0   0
0   0   0   1   1
0   0   1   0   0
0   0   1   1   0
0   1   0   0   0
0   1   0   1   0
0   1   1   0   0
0   1   1   1   0
1   0   0   0   0
1   0   0   1   0
1   0   1   0   0
1   0   1   1   0
1   1   0   0   0
1   1   0   1   0
1   1   1   0   1
1   1   1   1   0

K-map analysis of the JK excitations gives

J = UNF
K = U + N + F

[Figure: the flip-flop built from a JK flip-flop with J = UNF and
K = U + N + F; inputs U, N, F, clock ck, outputs Q, Q']

characteristic equation of the flip-flop:

Qnext = U'N'F'Q + UNFQ'

Page 64
Example: Race Condition
[Figure: a latch enabled by control line C whose S and R outputs feed
two D flip-flops sharing the clock line]
Assume that leading edge-triggered D flip-flops are being used (say
of the type described earlier). Then for a 0 to 1 transition on the
latch enabled by the control line C, any of (0,1), (1,1), (1,0) may
be latched depending on when the clock rises. Note that even if the
control line is controlled by the clock, it could rise Δt ahead of
the clock signal at the D flip-flops, the point at which the latch
outputs are (1,1) when an active transition is in progress.
Registers:
A row of associated flip-flops in series or in parallel is called a
register. The combinations are:
serial in, serial out (slow devices)
serial in, parallel out (slow in, fast out)
parallel in, serial out (fast in, slow out)
parallel in, parallel out (fast in, fast out)
A shift register uses serial in, serial out.
[Figure: shift register: D flip-flops in series sharing the clock;
serial input on the left, serial output on the right]
Every clock pulse the flip-flop values shift one to the right. The
left-most flip-flop obtains its new value from the input line and the
value of the right-most flip-flop is the output at each clock pulse.
It should be noted that this requires all leading edge or all trailing
edge flip-flops to work properly.
If the output is fed back to the input, the shift is called a circular
shift.
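The shift behavior can be sketched in a few lines of Python (a 4-bit serial in, serial out register, with the circular feedback shown last; the function name is ours, chosen for illustration):

```python
def shift_right(reg, serial_in):
    # one clock pulse: the new value enters on the left and the
    # right-most flip-flop's value is the output
    out = reg[-1]
    return [serial_in] + reg[:-1], out

reg = [0, 0, 0, 0]
for bit in [1, 0, 1, 1]:          # shift in 4 bits serially
    reg, out = shift_right(reg, bit)
print(reg)                        # [1, 1, 0, 1]

# circular shift: feed the output back to the input
reg, out = shift_right(reg, reg[-1])
print(reg)                        # [1, 1, 1, 0]
```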
Three-state logic is needed to construct a shift register that can
shift in either direction.

Page 65
[Figure: bidirectional shift register: each flip-flop's input is
selected via tri-state buffers from its left or right neighbor under
shift-right and shift-left control lines, with a right input/output
pair and a left input/output pair at the ends]
In contrast, parallel input has the appearance
[Figure: parallel-load register: inputs i0, i1, i2, i3 feed the D
inputs of the flip-flops, all sharing ck]
Counters:
Counters are often needed to control tasks such as count by 8 to shift
in 8 bits (1 byte) serially.
T flip-flops provide a natural means for constructing a mod 2^n ripple
counter (counts cyclically 0 to 2^n - 1). It can be initialized to 0 via
the clear input provided on most flip-flops.
[Figure: 3-stage ripple counter: JK flip-flops with J = K = 1 (acting
as T flip-flops); an enable line gates the clock into the Q0 stage,
Q0 clocks the Q1 stage, and Q1 clocks the Q2 stage]
If trailing edge flip-flops are used, then when enabled, the counter
operates according to Q0 changing with the clock falling, Q1 with Q0
falling, and Q2 with Q1 falling as given by:
count   Q2  Q1  Q0
0       0   0   0
1       0   0   1
2       0   1   0
3       0   1   1
4       1   0   0
5       1   0   1
6       1   1   0
7       1   1   1

(Q0 toggles each time the clock falls; Q1 toggles when Q0 falls; Q2
toggles when Q1 falls.)
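The count sequence can be reproduced by modeling each stage as a toggle that fires when the stage before it falls; a minimal sketch:

```python
def ripple_tick(q):
    # q = [Q0, Q1, Q2]; Q0 toggles on the clock fall, and each later
    # stage toggles when the stage before it falls (1 -> 0)
    q = q[:]
    for i in range(len(q)):
        q[i] ^= 1
        if q[i] == 1:    # this stage rose, so it does not clock the next
            break
    return q

q = [0, 0, 0]
counts = []
for _ in range(8):
    q = ripple_tick(q)
    counts.append(q[2] * 4 + q[1] * 2 + q[0])
print(counts)   # [1, 2, 3, 4, 5, 6, 7, 0] -- counts mod 8
```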

Page 66
Sequential circuit design:
Sequential circuits make transitions from state to state in response
to inputs. Sequential circuits are physical realizations of a kind of
theoretical machine called a finite state automaton (FSA). An FSA can
be described by use of a graphical representation called a state
diagram.
An FSA is given by specifying:
1. An input alphabet I
2. An output alphabet O (possibly NULL)
3. A finite set of states S
4. A start state, s0 ∈ S
5. A transition function f: S × I → S (this is the next state
   function, where f(current-state, current-input) = next-state)
6. Moore circuit: output is on the state (may be NULL);
   an output function g: S → O is given
7. Mealy circuit: output is on the transition (may be NULL);
   an output function h: S × I → O is given
Examples:
1. Serial parity checker: input is data (having a parity bit) and
output is the current parity bit (odd parity)
Input alphabet is {0,1}
Output alphabet is {0,1}
States are {S0, S1}
S0 is the start state
The transition function is given by the state diagram
(Moore circuit)

[State diagram: states S0/1 and S1/0; input 0 loops on the current
state, input 1 moves between S0 and S1]

S    I   next state   output of next state
S0   0   S0           1
S0   1   S1           0
S1   0   S1           0
S1   1   S0           1
The parity bit is an added data bit used to check for occurrence
of an error in data. It is commonly employed with memory
circuits, where any error indicates a serious problem (usually a
failed memory chip). The parity bit is usually appended to the
data bits. For odd parity, the added parity bit is selected so
that the total number of 1s is odd. For even parity, it is
selected so that the total number of 1s is even. For example, if
the data is 0 1 0 1 0 0 1 0 and odd parity is being used, then the
parity bit is 0 and the data including the parity bit is
0 1 0 1 0 0 1 0 0
For the parity-checking FSA, data is input serially and the
current state outputs the bit needed for odd parity. Note the
boundary condition: when no data has been input (empty input),
the parity bit is 1. If the 9-bit example above is sent through
the parity checker and the output of the final state does not
agree with the parity bit, a parity error has occurred.
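The FSA can be run directly; a minimal sketch of the Moore machine (the output is attached to the state, and the dictionary encoding is ours, chosen for illustration):

```python
# states: S0 = even number of 1s so far (outputs 1), S1 = odd (outputs 0)
TRANS = {('S0', 0): 'S0', ('S0', 1): 'S1',
         ('S1', 0): 'S1', ('S1', 1): 'S0'}
OUT = {'S0': 1, 'S1': 0}

def parity_bit(bits):
    state = 'S0'
    for b in bits:
        state = TRANS[(state, b)]
    return OUT[state]   # the bit needed to make the total number of 1s odd

print(parity_bit([]))                         # 1: the empty-input boundary condition
print(parity_bit([0, 1, 0, 1, 0, 0, 1, 0]))   # 0: three 1s already, odd parity holds
```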

Page 67
2. Sequential binary adder: input is pairs of binary digits and
output is their sum; carry-in, carry-out information is tracked
by the current state.
Input alphabet {00,01,10,11}
Output alphabet {0,1}
States are {S0, S1, S2, S3} as follows:
S0 outputs 0, no carry
S1 outputs 0, carry
S2 outputs 1, no carry
S3 outputs 1, carry
Transitions are given by the state diagram
[State diagram:
S0/0 (no carry):  00 → S0;  01,10 → S2;  11 → S1
S2/1 (no carry):  00 → S0;  01,10 → S2;  11 → S1
S1/0 (carry):     00 → S2;  01,10 → S1;  11 → S3
S3/1 (carry):     00 → S2;  01,10 → S1;  11 → S3]
Trace:   0 1 0 0 1
       + 0 1 0 1 1
The input pairs are (1,1),(0,1),(0,0),(1,1),(0,0).
For [current-state, current input] the transitions are
[S0, 11] → S1   output 0   to carry state
[S1, 01] → S1   output 0   remain in carry state
[S1, 00] → S2   output 1   to no carry state
[S2, 11] → S1   output 0   to carry state
[S1, 00] → S2   output 1   to no carry state (final)
so the result is 1 0 1 0 0 as expected.
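The trace can be mechanized; a minimal sketch in which the carry plays the role of the state (carry 0 covers S0/S2, carry 1 covers S1/S3) and bits are consumed least significant first, as the FSA does:

```python
def adder_step(carry, pair):
    # one FSA transition: returns (output bit, next carry)
    s = pair[0] + pair[1] + carry
    return s % 2, s // 2

def serial_add(a_bits, b_bits):
    # bits are given least significant first
    carry, out = 0, []
    for a, b in zip(a_bits, b_bits):
        bit, carry = adder_step(carry, (a, b))
        out.append(bit)
    return out

# 01001 + 01011 from the trace, least significant bit first
result = serial_add([1, 0, 0, 1, 0], [1, 1, 0, 1, 0])
print(result)   # [0, 0, 1, 0, 1] -> 1 0 1 0 0 read most significant first
```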
Generally, the structure of the FSA can be determined from the state
diagram, so usually only the state diagram is specified in the design
process. The next step is to detail how the FSA is converted to a
circuit.

Page 68
The sequential circuit design process is conducted as follows:
1. Problem statement
2. State diagram
3. Elimination of inaccessible states (if any): these are states
   that cannot be reached from the Start State
4. Assignment of states to flip-flop combinations:
   # of states      # of ffs needed
   1 or 2           1
   3 or 4           2
   5, 6, 7 or 8     3
   . . . and so forth
5. Transition/output table: control values producing the needed next
   state behavior are determined from flip-flop excitation tables
   (columns: current states, inputs, next states, controls, outputs)
6. K-map analysis to produce control equations and output equations
7. Circuit diagram
Example: Parity checker using JK flip-flops.
Steps 1 and 2 were done earlier.
There are no inaccessible states.
Step 4: Assignment of states to flip-flop combinations.
Since there are only 2 states, 1 flip-flop (Q0) can represent
both.
State   Q0
S0      0
S1      1
[State diagram relabeled with flip-flop values: input 0 loops on the
current state, input 1 moves between S0/1 and S1/0]
Step 5: Transition table

Q0   I   Q0next   J   K   Z
0    0   0        0   -   1
0    1   1        1   -   1
1    0   1        -   0   0
1    1   0        -   1   0

Recall: JK flip-flop excitation table
Q   Qnext   J   K
0   0       0   -
0   1       1   -
1   0       -   1
1   1       -   0

Step 6: K-map analysis for J, K and Z yields

J = I,   K = I,   Z = Q0'

Page 69
Step 7: Circuit for parity checker
[Figure: a JK flip-flop with J = K = I, driven by the clock; the
output Z is taken from Q0']

Example: Binary adder using JK flip-flops


Steps 1 and 2 were done earlier.
Step 3: There are no inaccessible states.
Step 4: Assignment of states to flip-flop combinations.
Since there are 4 states, 2 flip-flops (Q0,Q1) will be needed.
State   Q0   Q1
S0      0    0
S1      0    1
S2      1    0
S3      1    1

Step 5: Transition/output table

All 16 combinations of (Q0, Q1, I0, I1) are enumerated; next states
are read off the state diagram, and the J,K controls follow from the
JK excitation table:

Q0  Q1  I0  I1   Q0n  Q1n  next   J0  K0  J1  K1
0   0   0   0    0    0    S0     0   -   0   -
0   0   0   1    1    0    S2     1   -   0   -
0   0   1   0    1    0    S2     1   -   0   -
0   0   1   1    0    1    S1     0   -   1   -
0   1   0   0    1    0    S2     1   -   -   1
0   1   0   1    0    1    S1     0   -   -   0
0   1   1   0    0    1    S1     0   -   -   0
0   1   1   1    1    1    S3     1   -   -   0
1   0   0   0    0    0    S0     -   1   0   -
1   0   0   1    1    0    S2     -   0   0   -
1   0   1   0    1    0    S2     -   0   0   -
1   0   1   1    0    1    S1     -   1   1   -
1   1   0   0    1    0    S2     -   0   -   1
1   1   0   1    0    1    S1     -   1   -   0
1   1   1   0    0    1    S1     -   1   -   0
1   1   1   1    1    1    S3     -   0   -   0

Recall: JK flip-flop excitation table
Q   Qnext   J   K
0   0       0   -
0   1       1   -
1   0       -   1
1   1       -   0

Page 70
J0, K0, J1, K1 can be resolved via K-maps. Note that J0 and K0
observe an XOR pattern.

J0 = Q1'(I0 ⊕ I1) + Q1(I0 ⊕ I1)' = Q1 ⊕ I0 ⊕ I1
K0 = Q1'(I0 ⊕ I1)' + Q1(I0 ⊕ I1) = Q1' ⊕ I0 ⊕ I1

J1 = I0 I1
K1 = I0'I1' = (I0 + I1)'

By observation, Z = Q0
The circuit construction is then given by:
[Figure: two JK flip-flops (Q0, Q1) sharing the clock; gate logic
forms J0, K0, J1, K1 from I0, I1, Q1, and Q1' per the equations
above, and Z is taken from Q0]
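The derived control equations can be checked by simulating the two flip-flops bit by bit; a minimal sketch reproducing the earlier trace (01001 + 01011):

```python
def jk_next(j, k, q):
    # JK characteristic: Qnext = J*Q' + K'*Q
    return (j & (1 - q)) | ((1 - k) & q)

def step(q0, q1, i0, i1):
    # control equations from the K-map analysis
    j0 = q1 ^ i0 ^ i1
    k0 = (1 - q1) ^ i0 ^ i1
    j1 = i0 & i1
    k1 = 1 - (i0 | i1)
    return jk_next(j0, k0, q0), jk_next(j1, k1, q1)

q0 = q1 = 0
outputs = []
for i0, i1 in [(1, 1), (0, 1), (0, 0), (1, 1), (0, 0)]:
    q0, q1 = step(q0, q1, i0, i1)
    outputs.append(q0)            # Z = Q0
print(outputs)   # [0, 0, 1, 0, 1], i.e. 1 0 1 0 0 as in the trace
```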

Counter design:
Counters can have particularly simple designs. For example, a BCD
counter has the state diagram:

0000 → 0001 → 0010 → 0011 → 0100 → 0101 → 0110 → 0111 → 1000 → 1001
(and from 1001 back to 0000)
Page 71
Transitions are made with the clock. External inputs are not
required. States are named using flip-flop values. The
transition/output table is then

Q3 Q2 Q1 Q0    Q3n Q2n Q1n Q0n
0  0  0  0     0   0   0   1
0  0  0  1     0   0   1   0
0  0  1  0     0   0   1   1
0  0  1  1     0   1   0   0
0  1  0  0     0   1   0   1
0  1  0  1     0   1   1   0
0  1  1  0     0   1   1   1
0  1  1  1     1   0   0   0
1  0  0  0     1   0   0   1
1  0  0  1     0   0   0   0
(the rest are don't cares)

The J,K control columns follow from the JK excitation table, and
K-map analysis (exploiting the don't cares) yields

J3 = Q2Q1Q0     K3 = Q0
J2 = Q1Q0       K2 = Q1Q0
J1 = Q3'Q0      K1 = Q0
J0 = 1          K0 = 1

The counter operates synchronously with the clock. Note that Q0 is
common to each of J3,K3,J2,K2,J1,K1. Hence if we assign CK3=Q0, J3=Q2Q1,
Page 72
and K3=1, we have the same effect as the original assignment when the
clock is high. Likewise assign CK2=Q0, J2=Q1, K2=Q1 and CK1=Q0, J1=Q3',
K1=1. The counter now operates asynchronously with the clock attached
to CK0. Observe that the Q0 flip-flop is operating as a T flip-flop
(not a surprise, since the 1s position of the counter toggles with
each increment).
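The synchronous equations can be exercised directly to confirm the decade count; a minimal sketch:

```python
def jk_next(j, k, q):
    # JK characteristic: Qnext = J*Q' + K'*Q
    return (j & (1 - q)) | ((1 - k) & q)

def bcd_tick(q3, q2, q1, q0):
    n3 = jk_next(q2 & q1 & q0, q0, q3)      # J3 = Q2Q1Q0, K3 = Q0
    n2 = jk_next(q1 & q0, q1 & q0, q2)      # J2 = K2 = Q1Q0
    n1 = jk_next((1 - q3) & q0, q0, q1)     # J1 = Q3'Q0, K1 = Q0
    n0 = jk_next(1, 1, q0)                  # J0 = K0 = 1 (toggle)
    return n3, n2, n1, n0

state = (0, 0, 0, 0)
counts = []
for _ in range(10):
    state = bcd_tick(*state)
    counts.append(state[0] * 8 + state[1] * 4 + state[2] * 2 + state[3])
print(counts)   # [1, 2, 3, 4, 5, 6, 7, 8, 9, 0] -- a decade counter
```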
Moore and Mealy circuits:
For a Moore circuit, the outputs are strictly a function of the
states. For a Mealy circuit, the outputs are a function of the inputs
as well as the states. For example,
[Block diagrams: in a Moore circuit, the input and clock drive the
state flip-flops and the output is computed from the state alone; in
a Mealy circuit, the output is computed from both the state and the
current input]

Circuit Analysis: reverse the design process
1. Produce control and output equations from the circuit
2. Generate the transition/output table from the equations
3. Determine the next state columns in the transition/output table
4. Draw the state diagram

Example: starting from the following circuit diagram, assume that the
start state is (Q0,Q1) = (0,0)
[Figure: two JK flip-flops (Q0, Q1) with inputs I0, I1, a shared
clock, and output Z]
Circuit equations:
J0 = I0,  K0 = I0I1
J1 = I0 + Q0 + I1,  K1 = I0Q0' + I1'
Z = I1Q0Q1'

Page 73

Transition/output table: all 16 combinations of (Q0, Q1, I0, I1) are
enumerated; Q0n, Q1n and the J,K excitation columns follow from the
circuit equations, giving next states

Q0  Q1  I0  I1   next
0   0   0   0    S1
0   0   0   1    S0
0   0   1   0    S2
0   0   1   1    S2
0   1   0   0    S0
0   1   0   1    S0
0   1   1   0    S1
0   1   1   1    S1
1   0   0   0    S2
1   0   0   1    S1
1   0   1   0    S0
1   0   1   1    S3
1   1   0   0    S2
1   1   0   1    S1
1   1   1   0    S0
1   1   1   1    S3

with output Z = I1Q0Q1' = 1 only when Q0 = 1, Q1 = 0, and I1 = 1.

State diagram: (Mealy circuit)
[State diagram: transitions among S0, S1, S2, S3 labeled
input/output per the table; the only transitions with output 1 are
those taken with I1 = 1 out of the state where Q0 = 1 and Q1 = 0
(labeled 01/1 and 11/1); all other transitions have output 0]
Remark: the semantics for the circuit can only be inferred from the
state diagram; also, don't care conditions used in the original
design are unknown since they are accounted for in the circuit.
Example: Given the control and output equations
J0 = X ⊕ Y        K0 = X'Q1 + Q0
J1 = Y'Q0 + Q1    K1 = Q0'
Z = Q1
the transition/output table is given by

Page 74
Transition/output table:

Q0  Q1  X  Y   Q0n  Q1n  next   J0  K0  J1  K1   Z
0   0   0  0   0    0    S0     0   0   0   1    0
0   0   0  1   1    0    S2     1   0   0   1    0
0   0   1  0   1    0    S2     1   0   0   1    0
0   0   1  1   0    0    S0     0   0   0   1    0
0   1   0  0   0    0    S0     0   1   1   1    1
0   1   0  1   1    0    S2     1   1   1   1    1
0   1   1  0   1    0    S2     1   0   1   1    1
0   1   1  1   0    0    S0     0   0   1   1    1
1   0   0  0   0    1    S1     0   1   1   0    0
1   0   0  1   0    0    S0     1   1   0   0    0
1   0   1  0   0    1    S1     1   1   1   0    0
1   0   1  1   0    0    S0     0   1   0   0    0
1   1   0  0   0    1    S1     0   1   1   0    1
1   1   0  1   0    1    S1     1   1   1   0    1
1   1   1  0   0    1    S1     1   1   1   0    1
1   1   1  1   0    1    S1     0   1   1   0    1

State diagram: assume (0,0) is the start state (Moore circuit)
[State diagram:
S0/0: 00,11 → S0;  01,10 → S2
S1/1: 00,11 → S0;  01,10 → S2
S2/0: 01,11 → S0;  00,10 → S1
S3/1: 00,01,10,11 → S1]
The isolated state S3 (no transition enters it) is an artifact of
the circuit implementation.
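The analysis can be confirmed by iterating the circuit equations from the start state; a minimal sketch showing that (1,1) = S3 is never entered:

```python
def jk_next(j, k, q):
    # JK characteristic: Qnext = J*Q' + K'*Q
    return (j & (1 - q)) | ((1 - k) & q)

def step(q0, q1, x, y):
    j0, k0 = x ^ y, ((1 - x) & q1) | q0       # J0 = X xor Y, K0 = X'Q1 + Q0
    j1, k1 = ((1 - y) & q0) | q1, 1 - q0      # J1 = Y'Q0 + Q1, K1 = Q0'
    return jk_next(j0, k0, q0), jk_next(j1, k1, q1)

reachable, frontier = set(), {(0, 0)}         # start state S0 = (0,0)
while frontier:
    s = frontier.pop()
    reachable.add(s)
    for x in (0, 1):
        for y in (0, 1):
            n = step(s[0], s[1], x, y)
            if n not in reachable:
                frontier.add(n)
print(sorted(reachable))   # [(0, 0), (0, 1), (1, 0)] -- (1,1) is isolated
```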
Other Counters:
The first counter considered was a mod 2^n ripple counter, a natural
counter formed by hooking T flip-flops up in series. It required no
additional gate logic and was easily devised without resorting to
sequential design techniques.
In contrast, the BCD counter exemplifies designing a counter by
working from a state diagram. In the BCD counter as given, no
attention was paid to the 6 states present in the circuit but not used
in the counting process. In particular, if the circuit initiated in
one of these 6 states, its behavior would be unspecified. Hence, the
user must initialize the flip-flops to 0 to assure that the counter
gets to the BCD counting sequence.
A self-starting counter is one which transitions to its counting
sequence regardless of the state in which the circuit is initiated.

Page 75
A counter that employs n flip-flops is called an n-stage counter.
Using a state diagram in designing a counter automatically minimizes
the number of stages, but there are useful counters that employ more
than the minimum.
A shift-register counter counts by using a circular shift to move a
bit pattern through the register. For example, to count 4, the
pattern might be 1000, 0100, 0010, 0001. The register layout is
[Figure: 4-stage ring counter: D flip-flops in series sharing the
clock, with the final output Q3 fed back to D0 and an initialize
line to load the starting pattern]
There are reasons to use this kind of counter (e.g., to produce a
sequence of polling signals, where each flip-flop enables the device
being polled). There are 12 other bit patterns:
0111, 1011, 1101, 1110 and 0011, 1001, 1100, 0110
0101, 1010
0000 and 1111
These are grouped according to how they would count (the first group
has two patterns that count 4, the second group has a pattern that
counts 2, and the last group has two patterns that count 1). It's
obvious that initialization is important if this kind of counter is to
be employed.
The counter can be constructed to force it to move to the desired
counting sequence by adjusting the D0 input (currently Q3) for those
cases that are not in the right sequence.

Q0  Q1  Q2  Q3   D0
0   0   0   0    1    (force change from Q3)
0   0   0   1    1    (OK)
0   0   1   0    0    (OK)
0   0   1   1    0    (force change from Q3)
0   1   0   0    0    (OK)
0   1   0   1    0    (force change from Q3)
0   1   1   0    0    (self-correct on later cycle)
0   1   1   1    0    (force change from Q3)
1   0   0   0    0    (OK)
1   0   0   1    0    (force change from Q3)
1   0   1   0    0    (self-correct on later cycle)
1   0   1   1    0    (force change from Q3)
1   1   0   0    0    (self-correct on later cycle)
1   1   0   1    0    (force change from Q3)
1   1   1   0    0    (self-correct on later cycle)
1   1   1   1    0    (force change from Q3)

Page 76
From this it can be seen that D0 = Q0'Q1'Q2' rather than D0 = Q3 will
cause the counter to fall into the 1000, 0100, 0010, 0001 pattern
within 3 clock cycles. The counter thus becomes self-starting. The
initialization can be retained, but the above minor change enables the
counter to return to its expected behavior in the event an anomalous
event knocks the counter out of sequence at some point after
initialization.
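The self-starting fix can be checked from every one of the 16 initial states; a minimal sketch:

```python
def tick(q):
    # shift right with D0 = Q0'Q1'Q2' replacing the plain feedback D0 = Q3
    q0, q1, q2, q3 = q
    d0 = int(q0 == 0 and q1 == 0 and q2 == 0)
    return (d0, q0, q1, q2)

good = {(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)}
for start in [(a, b, c, d) for a in (0, 1) for b in (0, 1)
              for c in (0, 1) for d in (0, 1)]:
    q = start
    for _ in range(4):        # at most a few cycles to converge
        if q in good:
            break
        q = tick(q)
    assert q in good          # every initial state reaches the counting sequence
print("all 16 initial states converge to the counting sequence")
```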
To make a counter self-starting, any unused states simply need to be
accounted for in the design. For example, for a counter counting 6
(which requires a minimum of 3 flip-flops), the state diagram
[State diagram: the cycle 0 → 1 → 2 → 3 → 4 → 5 → 0 with the two
extra states Ex1 and Ex2 each transitioning into the cycle]
accounts for the 2 extra states that will occur when a circuit is
accounts for the 2 extra states that will occur when a circuit is
implemented using 3 flip-flops and ensures that the counter will be in
its counting sequence within 1 clock cycle.
Johnson counter: an n-stage count-by-2n counter based on
shift-register counting.
Johnson counters cycle the complement of the final flip-flop in the
sequence back to the input to double the counting period. For
example, a count by 8 Johnson counter has the form:
[Figure: 4-stage shift register with Q3' fed back to D0; initialize
line and clock]
With a counting sequence 1000, 1100, 1110, 1111, 0111, 0011, 0001, 0000
Note that 8 states are unused, so either the counter has to be forced
to its counting sequence, or it has to be initialized.
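The complement feedback is a one-liner to model; a minimal sketch of the count-by-8 sequence:

```python
def johnson_tick(q):
    # shift right, feeding back the complement of the final flip-flop
    return (1 - q[-1],) + q[:-1]

q = (1, 0, 0, 0)
seq = [q]
for _ in range(7):
    q = johnson_tick(q)
    seq.append(q)
print(seq)
# (1,0,0,0) (1,1,0,0) (1,1,1,0) (1,1,1,1) (0,1,1,1) (0,0,1,1) (0,0,0,1) (0,0,0,0)
```

Note that 4 stages yield a period of 8, twice what a plain ring counter gives.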

Page 77
Barrel Shifter:
Recall that a shift register shifts 1 bit at a time. If the shift
amount is more than 1, then the process has to be repeated until the
specified amount of shifting has been accomplished. A barrel shifter
uses multiplexers to determine the shift so that it can be
accomplished in one cycle. For a 4-bit register, a barrel shifter
accomplishing a circular shift right of 0, 1, 2, or 3 (specified via
(s0,s1)) is structured as follows:
[Figure: four 4-to-1 MUXes, one in front of each flip-flop; each
MUX's data inputs 0-3 are wired to the flip-flop outputs so that
select value k routes the bit k positions away, and all MUXes share
the select lines s0, s1]
Note that each flip-flop is controlled by a multiplexer, which is used
to select the input sent to the flip-flop. The multiplexer's function
is to route the value selected according to its address lines to the
flip-flop's input. To set up the circuit as a shift register, the 4
multiplexer input data lines are simply hooked up to the flip-flop
outputs so that each address matches a shift value, with address 0
matching a shift of 0, address 1 matching a shift of 1, and so forth.
Thus, the amount of the shift is entered via the address lines
(s0,s1). The circuit can be reconfigured for different shift patterns
by simply hooking up the multiplexer input data lines to the flip-flop
outputs (or other data values) in different ways.
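The MUX wiring can be modeled directly: output position i takes its value from the position s places away (circularly), which is exactly what each 4-to-1 MUX selects, so the whole rotate happens in one cycle. A minimal sketch:

```python
def barrel_rotate_right(reg, s):
    # each output position i takes its value from position (i - s) mod n,
    # the bit its multiplexer's address lines (s0, s1) select
    n = len(reg)
    return [reg[(i - s) % n] for i in range(n)]

reg = [1, 0, 1, 1]
print(barrel_rotate_right(reg, 0))   # [1, 0, 1, 1]
print(barrel_rotate_right(reg, 1))   # [1, 1, 0, 1]
print(barrel_rotate_right(reg, 2))   # [1, 1, 1, 0]
```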

Page 78
Glitches and hazards
Physically, there is a time lag in a combinational circuit from the
point in time that input signals are applied until their effect
propagates through the various components of the circuit and the
outputs react to the inputs. This is called the propagational delay.
A manufacturer may include the expected propagational delay as a part
of circuit specifications. Propagational delay is a physical reality
with consequences that may affect circuit behavior, particularly that
of a sequential circuit.
To illustrate this point, consider the circuit given by
f = AB' + A'C
Assume that f is implemented by
[Figure: an AND gate forming AB' (delay Δt1), an AND gate forming
A'C (delay Δt2), and an OR gate combining them (delay Δt3)]
where Δt1, Δt2, Δt3 give the propagational delays associated with the
delineated components. Assume also that
Δt1 > Δt2

For purposes of illustration, suppose that the inputs A, A', B, B',
C, C' are changing (synchronously) according to the timing pattern
[Timing diagram: square waves for A, B, and C switching between
Logic 0 and Logic 1]

Page 79
If we extend the timing diagram to track the circuit components as
they react to the inputs using a similar timing diagram, we obtain the
following:

[Timing diagram: A, B, C; AB' delayed by Δt1; A'C delayed by Δt2;
f = AB' + A'C after the OR delay Δt3, showing a brief glitch where
AB' has fallen before A'C has risen]

The delays Δt1 and Δt2 coupled with the changing values of A, B, C
produce a signal variance in the expected value of f that would not
happen in the absence of propagational delay. This variance, called a
circuit glitch, appears in the form of a brief pulse, which could
possibly trigger a state change elsewhere in the circuit. The component
organization which causes it is called a hazard. An examination of
the K-map for f is instructive in determining the source of the
hazard.
        BC=00   01   11   10
A=0:            1    1
A=1:    1       1

As can easily be determined, the formulation for f we started with is
in fact a minimal sum-of-products expression, using two of the three
prime implicants of f. These two prime implicants (AB' and A'C) cover
the third prime implicant, B'C, which is usually considered
unnecessary, since logically
f = AB' + A'C = AB' + A'C + B'C

Page 80
Assuming some appropriate propagational delay Δt4 for B'C, consider
what happens to the timing diagram when using the formulation
f = AB' + A'C + B'C
[Timing diagram: A, B, C; AB'; A'C; AB' + A'C (showing the glitch);
B'C; and AB' + A'C + B'C, which is glitch-free since B'C holds f at 1
across the transition]
It is evident that adding the logically redundant term back into the
expression has eliminated the glitch!
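The glitch can be reproduced with a simple discrete-time model. The sketch below is an illustration under simplifying assumptions: only the inverter delay on A' is kept (2 time steps), B' and C are held at 1, and A falls at t = 5:

```python
def net(sig, delay, t):
    # value of a delayed signal at time t (signals are functions of time)
    return sig(t - delay)

A = lambda t: 1 if t < 5 else 0     # A falls at t = 5; B' = C = 1 throughout
NOT_DELAY = 2                       # inverter producing A'
notA = lambda t: 1 - net(A, NOT_DELAY, t)

f2 = lambda t: net(A, 0, t) | notA(t)   # AB' + A'C with B' = C = 1
f3 = lambda t: f2(t) | 1                # adding B'C, which equals 1 here

print([f2(t) for t in range(3, 10)])    # [1, 1, 0, 0, 1, 1, 1] -- the glitch
print([f3(t) for t in range(3, 10)])    # all 1s: B'C holds f high
```

The width of the spurious 0-pulse equals the inverter delay, mirroring the timing diagram above.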
There are some subtle points to consider. The assumption that inputs
A, A', B, B', C, C' change synchronously is critical. For example,
consider the following (admittedly nonsensical) construction using
separate NOT and AND gates (with propagational delays as indicated):
[Figure: A drives a NOT gate (delay Δt1) producing A'; A and A'
drive an AND gate (delay Δt2) producing A·A']
Page 81
The timing diagram for this circuit is as follows:
[Timing diagram: A; A' delayed by Δt1; and A·A' showing a glitch
pulse of width Δt1 after the AND delay Δt2]
Even for a simple construction such as this (or a similarly
constructed prime implicant) a glitch is experienced. In our first
example, such problems were avoided by synchronizing all inputs
(including complements) to the prime implicants. In practice,
glitches are not a great concern for combinational circuits
(especially since the outputs are typically used to drive devices slow
to react, such as lights). Although it is possible that a glitch's
duration may be too short for a component such as a flip-flop to
react, their presence is an obvious cause for concern in sequential
circuits, which may change state unexpectedly (and hence perform
incorrectly) on a misplaced signal pulse.
In general, when inputs to prime implicants (or implicates) are
synchronized in a combinational circuit, circuit hazards occur where
two prime implicants (or implicates) that are non-overlapping have
adjacent cells. Adding back the logically redundant (non-essential)
prime implicants (or implicates) serves to eliminate the hazards
causing such glitches.
It should be noted that under this scenario, there may be a dramatic
difference in the choice between using the sum-of-products form or the
product-of-sums form. For example, consider the K-map
        BC=00   01   11   10
A=0:            1    1    1
A=1:    1       1         1

There are 6 prime implicants and 2 prime implicates, yielding the
following two hazard-free formulations:
A'B + B'C + AC' + AB' + BC' + A'C
(A + B + C)(A' + B' + C')
occurs because the removal of circuit hazards from the sum-of-products

Page 82
form requires adding in prime implicants that are logically nonessential or redundant.
There are other alternatives. If a particular input combination
triggering a glitch does not occur in actual implementation, then the
associated hazard does not need to be addressed. Another strategy is
to employ a flip-flop (perhaps a D flip-flop on the trailing edge of
the input synchronization) to latch the value of f at a point after
all glitches have occurred. Under this scenario, the circuit
performance is slowed until the flip-flop outputs are set. A third
alternative is to use an added synchronizing signal to hold the output
at a (known) fixed value until the danger of glitches is past. In
general this strategy takes the form:

I
n
p

u
t
s

.
.
.

Glitch-prone
combinational circuit

.
.
.

S
y
n
c
h
r
o
n
o
u
s

O
u
t
p
u
t
s

synch signal
In this case, there is the added complication of having to provide
careful timing for the added "synch" signal. By setting the synch
signal to 0 at the beginning of each cycle of input synchronization,
all outputs of the glitch prone circuit can be held at 0 through the
setup period when glitches are likely to occur, regardless of the
presence of hazards. When the chance for glitches is past, the
synchronizing signal is then changed to allow each output to cleanly
switch to its logical value for the current set of inputs. This
strategy does not have the longer implicit delay that is present in
our second alternative, but does require close coordination with the
system signal that is being used to synchronize the circuit inputs.
At this point it should be noted that every strategy for dealing with
glitches (even the one of removing hazards) has an element of
synchronization associated with it. This is the primary reason that
asynchronous sequential circuits have limited utility.

Page 83
Constructing memory:
Generally a memory block is organized to have
address lines to determine which bits in the block to access
bidirectional data lines to send data to an addressed location in
memory (write operation) or retrieve data from the addressed
location (read operation)
an R/W line to specify a read operation or a write operation
an enable line to activate the memory block for read or write
access
A single bit is a 1×1 block of memory and can be represented by a
flip-flop (no address line is needed).
[Figure: 1×1 memory cell: a flip-flop with a bi-directional data
line, an enable (CS) line, and an R/W line (R=1, W=0)]

A 2×1 block of memory can now be constructed from two 1×1 cells. A 1
of 2 decoder is needed to address the 1×1 cell wanted:
[Figure: the address line addr drives a 1 of 2 decoder whose outputs
drive the CS lines of two 1×1 cells; the cells share the data line
d0 and the R/W line, and the block has its own CS enable]

A 4×1 block can be constructed from two 2×1 blocks using a 1 of 2
decoder, or from four 1×1 blocks using a 1 of 4 decoder. These two
equivalent constructions appear as:

Page 84
[Figure: left, address lines a0, a1 with a1 driving a 1 of 2 decoder
that selects one of two 2×1 blocks while a0 addresses within the
selected block; right, a 1 of 4 decoder on a1, a0 selecting one of
four 1×1 blocks; in both, the blocks share d0, R/W, and the overall
CS enable]

Note that for the construction using two 2×1 cells, a1 selects a 2×1
cell and a0 selects a bit from within the cell. In effect, when larger
memory blocks are constructed from smaller memory blocks, the higher
order bits of the address are used to select one of the smaller blocks
and the lower order bits are used to select the data item from within
the selected smaller block.
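The high/low split of the address is just integer division and remainder; a minimal sketch (the function name is ours, chosen for illustration):

```python
def decompose(addr, small_size):
    # higher order bits select the sub-block,
    # lower order bits select within it
    return addr // small_size, addr % small_size

# the 4x1 block built from two 2x1 cells: a1 picks the cell, a0 the bit
for addr in range(4):
    cell, bit = decompose(addr, 2)
    print(addr, "-> cell", cell, "bit", bit)
# 0 -> cell 0 bit 0
# 1 -> cell 0 bit 1
# 2 -> cell 1 bit 0
# 3 -> cell 1 bit 1
```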
The memory modules in the 4×1 block can be arranged to construct a
2×2 block with 2 data lines, instead of 1:
[Figure: two 2×1 modules sharing address line a0, CS, and R/W; one
module supplies data line d0 and the other d1]
A memory chip has a fixed capacity in bits, which can be organized
either in favor of addressability (4×1 requires more address lines
than 2×2) or in favor of data groups (2×2 provides 2 bits per data
group vs. 1 bit for 4×1).

Page 85
Note that in general, accessing a location in memory requires a large
decoder. In practice, a 1 of 16 decoder requires a 24 pin package (4
address lines, 16 data lines, Vcc, GND, and CS), which indicates that
building a large decoder as a single chip is impractical. However, it
is easy to build larger decoders from smaller ones; for example, a 1 of
64 decoder can be constructed from five 1 of 16 decoders as follows:
[Figure: address lines a5, a4 drive one 1 of 16 decoder (12 of its
outputs unused); 4 of its outputs drive the CS lines of four more
1 of 16 decoders, each decoding a3, a2, a1, a0, together yielding
the 64 output lines]

This structure can obviously be extended to provide a decoder for any
address requirement (albeit by using a lot of chips; for this reason,
address decoding is normally a built-in feature of a memory chip).
Hence, arbitrarily large memory blocks can be constructed. Memory is
generally classified as
RAM memory - Random Access Memory (so-called since any randomly
generated address can be accessed directly, which contrasts to a
serial memory such as a magnetic tape). RAM memory can also be
both read from and written to.
ROM memory - Read Only Memory (non-volatile memory with a fixed
content that can be read from, but not written to). There are
multiple varieties of ROM, some of which can be rewritten and some
not. For example, EPROM (erasable programmable ROM) is ROM
which can be erased (by ultra-violet exposure) and is written by
special circuitry operating at a higher voltage; PLAs
(programmable logic arrays) start as a rectangular array of
fusible links which, when selectively blown, create (permanent)
bit patterns that then form a ROM; FPGAs (field programmable gate
arrays) are another variation, and can be rewritten with special
circuitry; CD-ROMs are yet another and may be either rewritable
or not.
Memory is organized as a 2^i × 2^j array of bits with i address lines and 2^j
data lines. The number of data lines is called the word size of the
memory. If j=3, then the word size is 8. Since 8 bits is a byte, the
memory would then have a capacity of 2^i bytes. If j=5, then the word size is
32, or 4 bytes. A 256×8 memory has 8 address lines (since 2^8 = 256)
and 8 data lines. For RAM memory, data lines are bi-directional and
the memory includes both R/W and enable control lines. The overall
memory configuration has the appearance:
[Figure: memory block with address lines in, bi-directional data lines, and R/W and enable control inputs]
As already noted, larger memory units can be constructed from smaller
ones by arranging the blocks in a grid, tying all R/W lines together,
and using a decoder to select rows.
Example: Construction of a 256 byte memory with word size of 4 bytes
using 16 byte memory modules.
The specification calls for 256/4 = 64 words.
Each word has 4 bytes, so there are 32 data lines.
To get 64 words using 16 byte modules, there needs to be 64/16 = 4
rows, each having 4 modules.
256 bytes organized as 64 words requires 6 address lines.
Hence, the memory should appear as a 4×4 grid with 6 address lines
and 32 bi-directional data lines.

Page 87
[Figure: the 256 byte memory — address bits a5,a4 drive a 1 of 4 decoder whose outputs feed the CS inputs of a 4×4 grid of 16 byte modules; a3,a2,a1,a0 address the modules across each row; the four modules in a row supply data lines d0–d7, d8–d15, d16–d23, and d24–d31]
Note that the high order bits of the address are tied to the 1 of 4
decoder and the 4 lower order bits address the memory modules across
each row. The decoder activates a row and the lower order bits
select a word within that row.
The R/W lines are omitted because they are all tied together.
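The high-order/low-order address split used by the grid can be sketched directly (illustrative Python, not part of the original notes):

```python
# For the 4x4 grid above: a 6-bit word address splits so that the high
# 2 bits pick the row (via the 1 of 4 decoder) and the low 4 bits pick
# the word within that row.

def split_address(addr):          # addr is a 6-bit word address, 0..63
    row = addr >> 4               # a5,a4 -> 1 of 4 decoder
    word_in_row = addr & 0xF      # a3..a0 -> addressing within the row
    return row, word_in_row

print(split_address(0b110101))    # (3, 5): row 3, word 5 within the row
```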
The addressing requirement can be reduced by using memory blocks that
require 2 select (enable) inputs (S1,S2):
[Figure: memory block with select inputs S1 and S2, data lines, and R/W]
Arranging these in a rectangular grid effectively halves the decoder
requirement; e.g., a 256 = 2^8 byte memory module requires a 1 of 256
decoder. If blocks using 2 select inputs are employed, and the memory
is arranged in a 16×16 grid, with a 1 of 16 decoder accessing the S1 lines
and another 1 of 16 decoder accessing the S2 lines, then all blocks are
accessed and only 32 decoder lines have been used instead of 256!
Building decoding into a memory module obviously reduces the need for
large external decoding circuits.
Memory sizes are given by employing standardized prefixes as follows:
International Unit (base 10) Prefixes, 1993
  10^24 ..... yotta
  10^21 ..... zetta
  10^18 ..... exa
  10^15 ..... peta
  10^12 ..... tera
  10^9  ..... giga
  10^6  ..... mega
  10^3  ..... kilo
  10^2  ..... hecto
  10^1  ..... deca
  10^-1 ..... deci
  10^-2 ..... centi
  10^-3 ..... milli
  10^-6 ..... micro
  10^-9 ..... nano
  10^-12 .... pico
  10^-15 .... femto
  10^-18 .... atto
  10^-21 .... zepto
  10^-24 .... yocto
These are used directly with base 10 measures; e.g.,
picosecond (1 trillionth of a second = 10^-12)
millimeter (1 thousandth of a meter = 10^-3)
megaflop (1 million floating point operations per second = 10^6)
They are also used with measures based on 1K = 2^10 = 1024 ≈ 1000 = 10^3;
e.g.,
gigabyte (≈ 1 billion bytes)
Sequential circuit clock speed is measured in Hertz where
1 Hertz = 1 Hz = 1 cycle per second.
This is a measure CPU manufacturers often cite with respect to
processor speed (e.g., a 2.5GHz processor has speed measured in
GigaHertz).
Example:
100 MegaHertz = 100 MHz = 100×10^6 Hz = 10^8 cycles per second.
10^8 cycles per second is 10/10^9 seconds per cycle or 10 nanoseconds
per cycle. Memory generally operates at slower speeds than the
processor, which means it is accessed asynchronously (on a different
clock timing). A delay of 3 nanoseconds is 3/10^9 seconds. If
signals need to occur at no more than 1/3rd this rate, then clock
pulses are limited to one per 9/10^9 seconds, implying a clock speed of
no more than 10^9/9 Hz, i.e., no faster than about 111 MHz.
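The arithmetic above can be checked directly (an illustrative Python sketch; `period_ns` is an invented helper name):

```python
# Converting a clock frequency to its period, and the 9 ns pulse-spacing
# limit from the example back to a maximum frequency.

def period_ns(freq_hz):
    """Clock period in nanoseconds for a given frequency in Hz."""
    return 1e9 / freq_hz

print(period_ns(100e6))   # prints 10.0  (100 MHz -> 10 ns per cycle)
print(1e9 / 9 / 1e6)      # maximum clock rate in MHz for 9 ns pulses
```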

Page 89
Implementing Circuits Using ROMs:
We have already observed that combinational circuits can be
implemented by discrete logic gates or by using higher order circuits
such as decoders and multiplexers.
They can also be implemented by using ROMs.
Combinational circuits:
The information in the truth table specification for a combinational
circuit can be viewed as specifying the contents for a ROM
implementation of the circuit; e.g., the circuit specification for the
function f below can be implemented by an 8×1 ROM whose contents are
given by the specification:

Specification for f:

Address  Contents
  000:      1
  001:      0
  010:      0
  011:      0
  100:      1
  101:      1
  110:      0
  111:      1

Inputs = Address (X = A2, Y = A1, Z = A0); Data = Output f

8×1 ROM
For contrast, recall the alternative approaches for the same
specification as illustrated below:
K-map analysis and logic gate implementation:
[Figure: K-map over X (rows) and YZ (columns) with 1s at XYZ = 000, 100, 101, 111]
f = Y'Z' + XZ = (Y + Z)' + XZ

Page 90
Multiplexer implementation:

X Y Z | f
0 0 0 | 1
0 0 1 | 0
0 1 0 | 0
0 1 1 | 0
1 0 0 | 1
1 0 1 | 1
1 1 0 | 0
1 1 1 | 1

[Figure: 4×1 MUX with selects X (weight 2) and Y (weight 1); data inputs 0 = Z', 1 = 0, 2 = 1, 3 = Z; output f]
Decoder implementation:

X Y Z | f
0 0 0 | 1
0 0 1 | 0
0 1 0 | 0
0 1 1 | 0
1 0 0 | 1
1 0 1 | 1
1 1 0 | 0
1 1 1 | 1

[Figure: X,Y,Z (weights 4,2,1) feed a 1 of 8 decoder; outputs 0, 4, 5, and 7 are ORed to produce f]

The K-map logic gate approach requires the most analysis but uses
the simplest components. The contrasting ROM implementation
requires the least analysis, but this advantage is offset by having
to burn the desired contents into each ROM memory cell.
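All of these implementations compute the same f; an illustrative Python sketch (not from the notes) checks the ROM contents against the K-map formula:

```python
# A ROM is just a lookup table indexed by the input bits.  Here the 8x1
# ROM contents for f are checked against the K-map result f = Y'Z' + XZ.

ROM = [1, 0, 0, 0, 1, 1, 0, 1]               # contents at addresses 000..111

def f_formula(x, y, z):
    return ((1 - y) & (1 - z)) | (x & z)      # Y'Z' + XZ

for addr in range(8):
    x, y, z = (addr >> 2) & 1, (addr >> 1) & 1, addr & 1
    assert ROM[addr] == f_formula(x, y, z)
print("ROM matches formula")
```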

Sequential Circuits:
In a similar fashion, in the transition/output table for a sequential
circuit, the current state and input columns can be viewed as
providing ROM addresses that point to memory locations where the next
state information is stored. Circuit outputs can likewise be stored
at ROM addresses pointed to by the current state and input columns.
To illustrate this, consider the following state diagram for a
sequential circuit:

Page 91
[Figure: state diagram — states S/0000, T1/0001, T2/0011, T3/0111, T4/1111, and H/0000; transitions are labeled with inputs A,B,C,D and the transition output (e.g., A/1, C/0, A,B,C,D/0)]

Suppose that the states and inputs are encoded as follows and that
state outputs are given by variables O1, O2, O3, O4 as indicated:
States (encoded as Q2 Q1 Q0):
  S  = 000
  T1 = 001
  T2 = 010
  T3 = 011
  T4 = 100
  H  = 111

Inputs (encoded as X Y):
  A = 00
  B = 01
  C = 10
  D = 11

State outputs (O1 O2 O3 O4):
  S  : 0000
  T1 : 0001
  T2 : 0011
  T3 : 0111
  T4 : 1111
  H  : 0000

The transition/output table corresponding to the state diagram and
based on this encoding is as follows:

Page 92
[Figure: ROM implementation — the current state is held in D flip-flops Q2,Q1,Q0; a 32×4 ROM is addressed by Q2=A4, Q1=A3, Q0=A2, X=A1, Y=A0, and its data lines M2,M1,M0 give the next state (fed back to the flip-flop inputs D2,D1,D0) with Z the transition output; a second 8×4 ROM addressed by Q2=B2, Q1=B1, Q0=B0 holds the state outputs O1–O4, with contents 0000, 0001, 0011, 0111, 1111 at addresses 000–100 and 0000 at address 111]
Z is the output on transitions.
Note that the current state information is maintained in the 3 D
flip-flops given by Q2, Q1, Q0. The next state is given by the
memory data lines labeled M2, M1, M0. The M2, M1, M0 output values
are applied to the inputs of the D flip-flops, ready to be latched
on the transition to the next state. Since the output associated
with each state does not rely on the transition inputs X, Y, a
smaller memory unit is sufficient for representing this requirement
of the circuit specification.
The implementation is almost a direct transliteration of the truth
table specification for the circuit, which requires considerably
less analysis than implementing the circuit using gate logic. The
downside again is the need to program ROMs for the circuit
specification. If FPGAs or similar ROMs are available along with
the means to program them, then this approach becomes a good choice
for implementation, especially in light of the fact that it
requires relatively few connections.
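The ROM-as-next-state-table idea can be sketched generically in Python (a hypothetical 2-bit counter with an enable input, invented for illustration; this is not the state machine above):

```python
# The ROM address is (current state, input) and the contents are the
# next state; clocking the circuit "latches" the ROM output as the
# new current state.

def make_rom():
    rom = {}
    for state in range(4):
        for enable in (0, 1):
            rom[(state, enable)] = (state + enable) % 4   # next state
    return rom

rom = make_rom()
state = 0
for enable in [1, 1, 0, 1, 1]:
    state = rom[(state, enable)]      # latch the ROM output as new state
print(state)    # prints 0 (four increments mod 4)
```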
Hamming code:
Adding a parity bit to a sequence of data bits provides an encoding
of the data that enables detection of the presence or absence of
error in the data, so long as at most 1 bit is at fault. If there
is an erroneous bit, the approach does not identify it, however.
The idea of parity bits can be easily extended to provide means for
not only detecting the presence of an erroneous bit, but also the
means for locating and correcting it. This kind of encoding is
called an error correcting code.
There are error correcting code techniques that will detect and
correct multiple bit errors. Hamming code provides an introduction
to the idea behind these coding techniques. We will only consider
Hamming's single error detection/correction code.
First view data as occurring at positions 1,2,3, ...
To show the concept, we first limit ourselves to 15 data positions.
Consider the position numbers listed in binary and observe the
column patterns of 1s:
     dcba
 1   0001        a: 1s at 1,3,5,7,9,11,13,15
 2   0010        b: 1s at 2,3,6,7,10,11,14,15
 3   0011        c: 1s at 4,5,6,7,12,13,14,15
 4   0100        d: 1s at 8,9,10,11,12,13,14,15
 5   0101
 6   0110
 7   0111
 8   1000
 9   1001
10   1010
11   1011
12   1100
13   1101
14   1110
15   1111
Any position is identified by the columns it has 1s in (ie., 13
occurs only in columns a,c,d and 2 only occurs in column b). To
elaborate, if there is a bit error in position 13, then
a parity check of the positions given by d will identify the
problem bit as being in one of 8,9,...,15.
A parity check of the positions given by c reduces this list
to one of 12,13,14,15.
A parity check of b doesn't include 13 and so doesn't find
any error, which eliminates 14,15 and reduces the list to
one of 12,13.

Page 94
A parity check of the positions given by a identifies 13 as
the culprit.
To summarize, if parity checks are conducted on the bit positions
identified by 1s in each of columns a,b,c,d then an error in a bit
position will result in a parity error for 1 or more of these
checks. The combination of the parity errors precisely locates the
bit position causing the error.
There are 2^4-1 = 15 data positions, and we need 4 parity bits, so that
leaves up to 11 bits available for user data. With 5 parity
checks, there would be 31-5 = 26 bits available for user data.
If we assume the data is in bytes (ie., we have 8 user bits), then
adding on 4 bits for parity checking results in a 12 bit encoding
of the data. If the parity bits are simply appended to the user
bits, then some difficulty will occur in setting them. This can be
avoided if the parity bits are placed at the positions which occur
in only 1 column (those with a single 1, position 1,2,4,8).
If the user data is 0 1 1 0 1 0 1 1, then it is encoded as
position:  1 2 3 4 5 6 7 8 9 10 11 12
           _ _ 0 _ 1 1 0 _ 1  0  1  1
where positions 1,2,4,8 receive the corresponding parity check.


For even parity, this determination is as follows:
parity at position 1 (checks positions 3,5,7,9,11) is 1
parity at position 2 (checks positions 3,6,7,10,11) is 0
parity at position 4 (checks positions 5,6,7,12) is 1
parity at position 8 (checks positions 9,10,11,12) is 1
The encoded user data is then
position:  1 2 3 4 5 6 7 8 9 10 11 12
           1 0 0 1 1 1 0 1 1 0  1  1

If the data is transmitted and the received data is
position:  1 2 3 4 5 6 7 8 9 10 11 12
           1 0 0 1 1 0 0 1 1 0  1  1
then the parity checks result in
a) positions 1,3,5,7,9,11  - OK
b) positions 2,3,6,7,10,11 - error
c) positions 4,5,6,7,12    - error
d) positions 8,9,10,11,12  - OK
which identifies position 0110₂ = 6 as the one in error.

Page 95
Note that setting the parity bit can be accomplished simply by
using XOR; e.g.,
if the code word is notated by C[1 2 3 4 5 6 7 8 9 10 11 12],
then the parity bits are obtained by
C[1] = C[3] ⊕ C[5] ⊕ C[7] ⊕ C[9] ⊕ C[11]
C[2] = C[3] ⊕ C[6] ⊕ C[7] ⊕ C[10] ⊕ C[11]
C[4] = C[5] ⊕ C[6] ⊕ C[7] ⊕ C[12]
C[8] = C[9] ⊕ C[10] ⊕ C[11] ⊕ C[12]
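The parity placement and decoding can be verified with a short Python sketch (added for illustration; the function names are invented). Position i is checked by parity bit p exactly when i has a 1 in p's column, i.e., when `i & p` is nonzero:

```python
# 12-bit Hamming code, positions 1..12, parity bits at 1,2,4,8, even parity.

def hamming_encode(code):
    """code: dict position -> bit with data already placed at the
    non-parity positions; fills positions 1,2,4,8 by XOR."""
    for p in (1, 2, 4, 8):
        code[p] = 0
        for i in range(1, 13):
            if i != p and (i & p):       # position i is covered by p
                code[p] ^= code[i]
    return code

def hamming_syndrome(code):
    """Return the position of a single-bit error (0 if none)."""
    syndrome = 0
    for p in (1, 2, 4, 8):
        parity = 0
        for i in range(1, 13):
            if i & p:
                parity ^= code[i]
        if parity:                       # even parity violated
            syndrome += p
    return syndrome

# Worked example from the text: user data 0 1 1 0 1 0 1 1
data = dict(zip([3, 5, 6, 7, 9, 10, 11, 12], [0, 1, 1, 0, 1, 0, 1, 1]))
word = hamming_encode(data)
print([word[i] for i in range(1, 13)])   # [1,0,0,1,1,1,0,1,1,0,1,1]

word[6] ^= 1                             # corrupt position 6 in transit
print(hamming_syndrome(word))            # 6
```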
If an overall parity check is included at position 0, then the
Hamming code word extended by this bit becomes a single error
correcting, double error detecting code. The following 4 cases
cover all possibilities for 2 or fewer errors:
1. no parity error, no Hamming error → no error detected
2. no parity error, Hamming error    → double error detected
3. parity error, no Hamming error    → parity bit in error
4. parity error, Hamming error       → correctable error detected
This is easy to see:
If there are no errors, there are no parity errors for any of the
checks and no error correction is needed. This is the no parity
error, no Hamming error case.
If 2 bits are in error in the overall code word, then the overall
parity will be unaffected; ie., the overall parity check will
find no error. On the other hand, since at least one of the
errant bits is in the Hamming code word, the Hamming parity
checks will flag an error. This is the no parity error, Hamming
error case, and flags occurrence of a double error. In this
case error correction no longer applies, since there is no way to
determine which 2 bits are in error, even if one of them happens
to be the parity bit, but the double error has been detected.
If a single bit is in error then an overall parity error will be
flagged. If the bit is the parity bit, then the Hamming code
word generates no errors. This is the parity error, no Hamming
error case, and the parity error can be corrected by changing
the parity bit (so single error correction remains in effect).
If a single bit is in error and it is in the Hamming code word,
then the Hamming parity checks locate the position of the bit.
This is the parity error, Hamming error case, and the error can
be corrected using the Hamming decoding technique.
This covers all possibilities of 0, 1, or 2 errors being present. If
more than two errors are present, one of these cases will occur, but
the result will be erroneous.

Page 96
Computer Systems Level
Representing numeric fractions:
Earlier we examined data representation formats for integers,
Boolean values, and characters. A full processing environment also
needs to include a representation format for fractions. The
systems circuitry that implements these kinds of data manipulations
is called the arithmetic and logic unit (ALU).
One of the things that has to be considered in designing a system
is whether a feature should be implemented in hardware or software.
For example, floating point numbers can be implemented either in
circuitry or by software. If implemented in software, the
specification for the representation format can be easily changed.
If implemented in hardware, then it is advantageous to use a
representation standard, since changes at the hardware level carry
more severe penalties than changes at the software level.
The term floating point numbers is used because the representation
employed is based on scientific notation where the value is
approximated by floating the decimal point until only one digit
is to the left of the decimal point, marking the magnitude by
keeping track of the power of 10 necessary to restore the decimal
point's location. Hence, the basic format has the form:
<±> <d.ddd...d> × 10^<exponent>
for example, -3.456 × 10^-23 or 5.12345 × 10^15

An arithmetic operation for numbers in this format utilizes the
arithmetic operations for integers, but requires special handling
for exponents and normalization. Normalization is the process of
manipulating a result by adjusting the exponent, floating the
decimal point until there is only one digit to its left.
Normalization example:
-123.456 × 10^-11 normalizes to -1.23456 × 10^-9
[normalize by adding 2 to the exponent]
(In this case the exponent has been increased by 2 to float the
decimal point two positions to the left).
0.0000012345 × 10^15 normalizes to 1.2345 × 10^9
[normalize by subtracting 6 from the exponent]
(In this case the exponent has been decreased by 6 to float the
decimal point six positions to the right).
Multiplication and division are straightforward.
Multiplication and division examples:
(2.01 × 10^-11) × (-9.3 × 10^16) = -18.693 × 10^(-11+16) = -1.8693 × 10^6
[set the sign, multiply the mantissas, add the exponents,
normalize and round]
(2.01 × 10^-11) ÷ (-9.3 × 10^16) = -.216129... × 10^(-11-16) = -2.1613 × 10^-28
[set the sign, divide the mantissas, subtract the exponents,
normalize and round]
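The multiplication example can be mechanized as follows (an illustrative Python sketch; `normalize` and `fp_mul` are invented helper names):

```python
# Set the sign, multiply mantissas, add exponents, then normalize to
# one digit before the decimal point (base 10 scientific notation).

def normalize(mantissa, exponent):
    while abs(mantissa) >= 10:
        mantissa /= 10
        exponent += 1
    while 0 < abs(mantissa) < 1:
        mantissa *= 10
        exponent -= 1
    return mantissa, exponent

def fp_mul(m1, e1, m2, e2):
    return normalize(m1 * m2, e1 + e2)

m, e = fp_mul(2.01, -11, -9.3, 16)
print(round(m, 4), e)     # -1.8693 6
```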

Page 97
Addition and subtraction require exponent manipulation since the digits
have to be lined up according to position.
Addition/subtraction example:
(2.345 × 10^9) + (9.31 × 10^14) = (.00002345 × 10^14) + (9.31 × 10^14)
= 9.31002345 × 10^14
[adjust the number with the smaller magnitude to match the
exponent of the one with larger magnitude, then add/subtract the
mantissas, normalize and round]
Another way to look at this is that addition and subtraction
require moving the decimal point for the smaller number until the
magnitudes of the two numbers match.
In the computer context, base 10 is not the natural base to employ.
In particular, on IBM mainframes (360 series), floating point
numbers are hexadecimal based, using a 64-bit format developed by
IBM for their systems. On these systems, IBM also employed its own
character encoding format (EBCDIC).
For obvious reasons, it is not advisable for a single manufacturer
to dictate standard formats, so neutral groups, in which
representatives from many manufacturers participate, develop and
promulgate standards for general adoption by industry.
Industry recognizes that lack of standard representation formats
complicates the portability of data among systems. Systems that do
not conform to standards eventually lose market appeal as more and
more competing companies adopt recognized standards. As discussed
earlier, the ASCII character encoding format has been widely
adopted and integers are now almost always represented in 2s
complement, rather than the 1s complement format.
The most widely adopted floating point standard is the IEEE 754
standard. It employs the biased exponent concept used in IBM's
format, but in contrast employs a base 2 format rather than
hexadecimal. Note that it is the binary point that floats, rather
than the decimal point. In contrast to 2s complement, there is no
natural underlying finite algebra for floating point numbers.
Hence, a sign-magnitude representation, with its implicit
complications for managing arithmetic, is employed.
For this reason, in early computational machines, floating point
computations were almost always handled via software to hold down
the size of the computational circuits. Floating point circuitry
is now integrated into most processors and for almost all of them
is compliant with the IEEE standard.

Page 98
IEEE 754 Floating Point Standard
The IEEE 754 floating point standard provides a standard way
of representing fractional quantities based on standard
scientific notation (in base 2). The basic components for
representing a number x are organized:
(-1)^<sgn> × 2^(<exponent> - <bias>) × 1.<mantissa>

sgn | exponent | mantissa
 1  |    8     |    23        32-bit single precision
base 2 exponent biased by 127 (range -126 to 127)
[true exponent is (<exponent> - 01111111)]

sgn | exponent | mantissa
 1  |    11    |    52        64-bit double precision
base 2 exponent biased by 1023 (range -1022 to 1023)
[true exponent is (<exponent> - 01111111111)]

sgn | exponent | mantissa
 1  |    15    |    64        80-bit extended precision
base 2 exponent biased by 16383 (range -16382 to 16383)
[true exponent is (<exponent> - 011111111111111)]

An exponent of all 1's is used to show an exception: with a
mantissa of 0 it represents ±∞, depending on the sign; otherwise
the mantissa provides the designation for an illegal operation.
For an exponent not all 0's (and not all 1's), the number is in
normalized form, meaning the exponent and mantissa have been
adjusted to produce a mantissa of the form 1.xxx ... xxx. In
the representation, the leading 1 is an implied leading 1
(providing an extra bit of precision). This is the usual way
numbers are represented in floating point.
For an exponent all 0's, the number is too small to be normalized
and so is represented unnormalized.
0 is given by a mantissa of 0 and the minimum exponent (all 0's).
Example:
212.5625₁₀ = 11010100.1001₂ or 1.10101001001₂ × 2^7
The biased exponent is 127+7 = 134₁₀ = 10000110₂
In IEEE 32 bit format:

0 10000110 10101001001000000000000
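The example can be checked with Python's struct module (an illustrative addition, not part of the notes):

```python
# Pack 212.5625 as a 32-bit IEEE single and examine the bit fields.
import struct

bits = struct.unpack(">I", struct.pack(">f", 212.5625))[0]
sign     = bits >> 31
exponent = (bits >> 23) & 0xFF
mantissa = bits & 0x7FFFFF

print(sign, format(exponent, "08b"), format(mantissa, "023b"))
# 0 10000110 10101001001000000000000
print(exponent - 127)     # true exponent: 7
```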

Page 99
Guard bits, rounding:
Guard bits are extra bits maintained during intermediate steps to
minimize loss of precision due to use of routine arithmetic
operations and rounding. The implied 1 under the IEEE format
limits precision loss under multiplication, since the result of
multiplying mantissas will always be greater than or equal to 1.
However, the simple multiplication of binary floating point values,
1.1₂ × 1.1₂ = 10.01₂, illustrates that a right shift may be needed
to normalize the result (in this case to 1.001₂) and that the
number of significant bits may double. A right shift may result in
loss of precision since a significant bit may get shifted off of
the end. By carrying an extra bit during intermediate steps, this
effect can be countered. If an extra bit is carried for rounding,
then an additional guard bit is needed to prevent an intermediate
right shift from shifting away the rounding bit.
Rounding strategies:
The rounding technique is important, because you don't want loss of
precision to cascade to a significant error when multiple
calculations are being performed; hence, to be viable, the rounding
strategy must balance, rounding up half the time and the other half
rounding down.
1. truncation
As a strategy, truncation is not viable since it always
rounds down (up if the number is negative)
2. von Neumann rounding
The strategy is to always set the least significant bit to 1;
e.g., internally the IEEE mantissa (implied leading 1) is
carried with two extra guard bits (rounding and shift protect)
beyond the least significant bit, and has the form
1.ddd ... d ee
1.ddd ... d0 00 and 1.ddd ... d0 01 round up to 1.ddd ... d1
1.ddd ... d1 01 and 1.ddd ... d1 11 round down to 1.ddd ... d1
ie., half the time rounding is up and the other half it is
down.
3. true rounding, in contrast to von Neumann rounding:
1.ddd ... d ee rounds to 1.ddd ... 1 if ee = 11 or 10 (≥ 1/2)
1.ddd ... d ee rounds to 1.ddd ... 0 if ee = 00 or 01 (< 1/2)
Note that this simply requires assigning the first guard bit at
end of computation to be the least significant bit.
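Both viable strategies can be modeled on an integer mantissa carrying two guard bits (an illustrative Python sketch of the notes' description; the function names are invented):

```python
# The mantissa is carried as an integer with two guard bits appended,
# so its value is mantissa_with_guards / 4.

def von_neumann_round(m_with_guards):
    return (m_with_guards >> 2) | 1          # drop guards, force LSB to 1

def true_round(m_with_guards):
    guards = m_with_guards & 0b11
    m = m_with_guards >> 2
    # per the notes: the LSB becomes the first guard bit (>= 1/2 rounds up)
    return (m & ~1) | (guards >> 1)

print(bin(von_neumann_round(0b101000)))   # 0b1010, guards 00 -> 0b1011
print(bin(true_round(0b101001)))          # guards 01 (< 1/2) -> 0b1010
```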
Other considerations:
Addition/subtraction can add to precision loss; for example,
1.111₂ - 1.110₂ = 0.001₂ = 1.000₂ × 2^-3 has operands with 4 significant
figures and a result that has only 1. If significant figures
disappeared via earlier computations in obtaining the operands, then
data which could be present in the final result has been lost. This
suggests that the results of extended calculations in floating point
should be carried in the highest precision format available, a
function of programming rather than hardware.

Page 100

Rules for processing floating point numbers:

Multiplication:
The format for floating point numbers
(-1)^<sgn> × 2^(<exponent> - <bias>) × 1.<mantissa>
is implicitly multiplicative, so determining the result requires:
XOR the sign bits
Add the exponents: since
(<exponent1>-<bias>) + (<exponent2>-<bias>) =
(<exponent1>+<exponent2>) - 2<bias>
the hardware approach is to add the biased exponents obtained
from the IEEE representation and subtract the bias
Multiply the mantissas, including the implied 1, and round
the result; if the first bit of each mantissa in the IEEE
format is 1, increment the exponent by 1 (corresponds to
floating the binary point left by 1)
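The multiplication steps can be sketched at the bit-field level (an illustrative Python sketch, restricted to values whose product needs no rounding so each step stays visible; `fields` and `fp_mul` are invented names):

```python
import struct

def fields(x):
    b = struct.unpack(">I", struct.pack(">f", x))[0]
    return b >> 31, (b >> 23) & 0xFF, b & 0x7FFFFF

def fp_mul(x, y):
    s1, e1, m1 = fields(x)
    s2, e2, m2 = fields(y)
    sign = s1 ^ s2                                  # XOR the sign bits
    exp  = e1 + e2 - 127                            # add, subtract the bias
    prod = ((1 << 23) | m1) * ((1 << 23) | m2)      # include the implied 1s
    prod >>= 23                                     # realign the binary point
    if prod >> 24:                                  # product >= 2: normalize
        prod >>= 1
        exp += 1
    bits = (sign << 31) | (exp << 23) | (prod & 0x7FFFFF)
    return struct.unpack(">f", struct.pack(">I", bits))[0]

print(fp_mul(1.5, -2.5))    # -3.75
```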
Division:
[(-1)^<sgn1> × 2^(<exponent1> - <bias>) × 1.<mantissa1>] ÷
[(-1)^<sgn2> × 2^(<exponent2> - <bias>) × 1.<mantissa2>]
= (-1)^(<sgn1>⊕<sgn2>) × 2^(<exponent1> - <exponent2>) × (1.<mantissa1>)/(1.<mantissa2>)
so the procedure is
XOR the sign bits
Subtract the biased exponents obtained from the IEEE
representation and add the bias
Divide the mantissas, including the implied 1, and round the
result; if the dividend is less than the divisor, decrement
the exponent by 1 (corresponds to floating the binary point
right by 1). Note: if the dividend is less than the
divisor, the result is less than 1 and a normalization step
is needed; however, the worst case scenario is a dividend
of 1.0₂ and a divisor of 1.11 ... 1₂, which gives a quotient greater than
(1/2)₁₀, and so normalization still only moves by 1 position.
Addition/Subtraction: these are easily implemented for integers, but
require a good bit more attention for floating point.
Addition/Subtraction:
Increment the smaller exponent to match the larger one and
shift its mantissa (including the implied 1) right by
the increment amount. Note that this is the opposite of
normalizing.
Process addition/subtraction according to the signs of the two
values, round, and normalize the result.
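The alignment step can be sketched with integer mantissas (illustrative Python, not from the notes; mantissas are 24-bit integers with the implied 1 included):

```python
def fp_add(m1, e1, m2, e2):
    """m1, m2: 24-bit mantissas (implied 1 included); e1, e2: exponents."""
    if e1 < e2:                       # make operand 1 the larger one
        m1, e1, m2, e2 = m2, e2, m1, e1
    m2 >>= (e1 - e2)                  # shift the smaller mantissa right
    total, exp = m1 + m2, e1
    while total >> 24:                # renormalize if the sum overflowed
        total >>= 1
        exp += 1
    return total, exp

# 1.5 * 2^3 (= 12) plus 1.0 * 2^1 (= 2): expect 1.75 * 2^3 (= 14)
m, e = fp_add(0b110000000000000000000000, 3, 0b100000000000000000000000, 1)
print(m / (1 << 23), e)    # 1.75 3
```

Note the right shift can discard low bits, which is exactly the precision loss the guard bits are there to counter.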

Page 101
Register transfer logic:
A computing device generally transforms data from one form to
another over a series of steps. This is a characteristic of finite
state automata, so in its concept a computing device is a (large)
finite state machine. It is impractical to concoct a monolithic
finite state automaton to describe a computer, so its architecture
is instead described in terms of components and their interfaces.
We have now seen how to construct sequential circuits that are
large memory modules. We also have seen how specialized memory
elements called registers can be used to provide the operands for
data manipulation techniques such as arithmetic operations,
comparison operations, shift operations, and the like. The results
of such an operation, if not done directly on the register (such as
happens with shift), can be captured in a target register.
Conceptually, it appears wise to view memory and manipulation of
data in different contexts, one for the storage and retrieval of
data, and the other for performing operations on data.
Registers are used to hold data retrieved from memory (or data
ready to be stored in memory), where it can be accessed for data
manipulation needs. Data can be easily moved from register to
register; for example, to load two registers providing the operands
for an adder circuit, or moved from a target register to a register
designed to hold data ready to be stored in memory (ie., a register
whose outputs are connected to memory data lines).
Register transfer logic organizes registers in a manner which
provides means of moving data among registers for purposes of
applying various data manipulations to the data contained within
them.
A register transfer architecture provides an abstracting
realization of register transfer logic, conceptualizing data
transfer and control in the following manner:
[Figure: register transfer architecture — data registers connected along a bus to Memory and I/O; a control module (which has its own internal working registers) receives the clock, control input, and status signals, drives the registers via control signals, and produces control output, with feedback from the data elements]

Page 102
There may be more than one control module deployed. Control signals
may need to be generated from outside the control module or passed on
to other modules. The clock synchronizes control and data elements
and may be suspended by a control signal (e.g., to allow asynchronous
transfer of data to or from memory).
A register transfer language (RTL) provides a means for instantiating
control modules and register elements for accomplishing a task.
Transfer/control statements are executed sequentially with the clock.
There is no standard register transfer language, but basic elements can
be represented using the following notation:
1. data manipulation
transfer (assignment operator): A ← B
copy (transfer non-destructively) the contents
of register B to register A
access: A[i]
access bit i of register A
operators: +, ·, ⊕, =, ...
apply bit-wise across either selected bits, or the whole register
2. control
conditional execution: (<condition>) <statement>
example: (C) A ← B
if condition C=1, the transfer occurs, if C=0 it does not
branch: [<cond-1>→<step-1> + ... + <cond-n>→<step-n>]
the next step is changed to the first one in the branch
statement having a true condition; if no conditions are true,
don't branch (proceed to next step)
Each step can have both data manipulation and control parts.
Transfers expressed on one line are assumed to be parallel.

Example: assuming register bits are numbered left to right starting
from 0, then
A[0]←0, A[1]←A[0], A[2]←A[1], A[3]←A[2]
is a right shift by 1 of the bits referenced in register A (ie.,
each bit is copied to its neighbor before it is reset).
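The parallel-transfer semantics (all sources read before any destination is written) can be sketched in Python (illustrative, not part of the notes):

```python
def rtl_step(A, transfers):
    """transfers: list of (dest_index, function of the old register)."""
    old = list(A)                       # sample every source first
    for dest, src in transfers:
        A[dest] = src(old)
    return A

A = [1, 1, 0, 1]
rtl_step(A, [(0, lambda r: 0),          # A[0] <- 0
             (1, lambda r: r[0]),       # A[1] <- A[0]
             (2, lambda r: r[1]),       # A[2] <- A[1]
             (3, lambda r: r[2])])      # A[3] <- A[2]
print(A)    # [0, 1, 1, 0]
```

A naive sequential update (A[1]=A[0] before A[2]=A[1], without sampling) would smear A[0] across the register instead of shifting.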
Example: a fragment of a sequence of RTL steps
Step_1: A ← B          (transfer B to A)
        [A[0]→Step_3]  (if A[0]=1, branch to Step_3)
Step_2: A ← A'         (complement A)
Step_3: C ← A          (transfer A to C)
C receives either A or A' depending on the value of A[0].

An RTL program simply describes next state behavior, and so is a more
abstract way to describe a circuit than can be accomplished using state
diagrams.
Operations (such as floating point arithmetic) which can be done in
circuitry using sequential logic are one example of the kinds of
circuitry that may be best described in RTL.

Page 103
Example: Consider the RTL sequence
C0: A ← B
C1: A[0]←A[1], A[1]←A[2], A[2]←A[3], A[3]←A[0]   (left circular shift)
    [SEQ→C0]
C2: A[0]←A[0]+A[1], A[2]←A[2]+A[3]
C3: A[1]←A[0]A[1], A[3]←A[2]A[3]
    [→C0]
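The register behavior of one pass through the sequence can be checked with a short simulation (a sketch assuming "+" denotes OR and juxtaposition denotes AND, per the notes' Boolean conventions, with SEQ = 0 so no branch is taken):

```python
def run_sequence(B):
    A = list(B)                                   # C0: A <- B
    A = [A[1], A[2], A[3], A[0]]                  # C1: left circular shift
    A[0], A[2] = A[0] | A[1], A[2] | A[3]         # C2 (parallel OR transfers)
    A[1], A[3] = A[0] & A[1], A[2] & A[3]         # C3 (parallel AND transfers)
    return A

print(run_sequence([0, 1, 1, 0]))   # [1, 1, 0, 0]
```

The tuple assignments read both right-hand sides before writing, matching the parallel semantics of transfers within a single RTL step.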

The steps in the control sequence correspond to states, and so can
be represented by using 2 flip-flops. The overall circuit then has
the appearance:
[Figure: control combinational logic (TBD), with inputs SEQ and the control input, feeds D flip-flops D1,D0 whose states Q1,Q0 drive a 1 of 4 decoder producing control signals C0,C1,C2,C3; these gate the transfer combinational logic (TBD) between the input data (B) and the 4-bit register A[0]–A[3], whose outputs O0–O3 form the output data, all synchronized by the clock with feedback from A]
Since 6 flip-flops are needed to describe the control logic and
provide the 4-bit data register A, if a state diagram approach was
employed, the circuit would require 2^6 = 64 states!
The remaining work is to fill in the two combinational circuits
noted as TBD. The program sequence occurs as follows:
current control state | next state (SEQ=0) | next state (SEQ=1)
        C0            |        C1          |        C1
        C1            |        C2          |        C0
        C2            |        C3          |        C3
        C3            |        C0          |        C0

Page 104

Control combinational logic:
From the control state transitions we can determine the control
combinational logic using sequential circuit design, starting from a
state diagram as follows:
[Figure: state diagram over control states C0,C1,C2,C3 — C0→C1; C1→C0 on SEQ=1 and C1→C2 on SEQ=0; C2→C3; C3→C0]
With the states encoded as Q1Q0 (C0=00, C1=01, C2=10, C3=11), K-maps
over Q1, Q0, and SEQ give the D flip-flop inputs:
D1 = Q1·Q0' + Q1'·Q0·SEQ'
D0 = Q0'
[Figure: the resulting control circuit built from the two D flip-flops and this logic]
Page 105
Transfer combinational logic:
There are 4 transfer statements, each of which requires its own
combinational logic and each of which must be activated when its
control signal (C0,C1,C2,C3) is raised. This is handled by using an
AND gate with each control signal to activate/deactivate the
appropriate transfer.
[Figure: transfer combinational circuit — for each register bit A[0]–A[3], AND gates gated by C0 select the input data (B), by C1 select the circular-shift neighbor, by C2 select A0+A1 (into A[0]) and A2+A3 (into A[2]), and by C3 select A0A1 (into A[1]) and A2A3 (into A[3]); the selected values are clocked into A, whose outputs O0–O3 form the output data and feed back]

Register Transfers Required
C0: A ← B
C1: A[0]←A[1], A[1]←A[2], A[2]←A[3], A[3]←A[0]   (left circular shift)
C2: A[0]←A[0]+A[1], A[2]←A[2]+A[3]
C3: A[1]←A[0]A[1], A[3]←A[2]A[3]

Page 106
UNF RTL: A Register-Transfer Language Simulator
High-level programming languages are usually portable across multiple
environments, because they are designed to be used at a level of
abstraction above physical implementation. They also tend to have a
large user base. In contrast, RTL implementations (even more so than
machine and assembly languages) tend to be tailored to a specific
manufacturer's needs; i.e., there is no standard RTL. RTL circuit
modules defined elsewhere can also be employed (in the manner of
subprograms) if there is a language context in which they are described.
UNF RTL is an implementation of an RTL for a simulated machine
environment. It has its own syntax and semantics, and can be used to
verify register-transfer functionality for microcode-level algorithms.
It does not incorporate any timing capabilities, which would normally
be desirable in an implementation to be used for actual computer
circuit construction. We will illustrate its functionality via a
series of programs describing sequential circuits (including ones for
specialized arithmetic).
I. UNF RTL: Basic Structure
An RTL program consists of the following three sections:
1. DEFREG - define registers.
2. DEFBUS - define buses.
3. Control section bracketed by BEGIN and END.
For example,
DEFREG:
REG1(16)        ** REG1 is a 16 bit register
REGISTER2(8)    ** REGISTER2 is an 8 bit register
ACC(32)         ** ACC is a 32 bit register
DEFBUS:
MAINBUS(32)     ** MAINBUS is a 32 bit bus
LASTBUS(8)      ** LASTBUS is an 8 bit bus
BEGIN:
...
** Register transfer and manipulation statements.
...
END:
It is assumed that a transfer from one register to another does not
require explicit representation of a bus structure. Defined buses
are assumed to have a bus sense register to maintain any value
transferred onto the bus. The purpose of having buses is to support
communication among separately defined modules by explicitly
representing the data path.
II. UNF RTL: Naming the Registers and Buses
DEFREG, DEFBUS, BEGIN, END, and the operator names (see next
section) are reserved words. Register and bus names must start with
an upper-case letter and may have up to twenty upper-case alphabetic

Page 107
and numeric characters. The number enclosed in parentheses
indicates the number of bits in the register being declared or the
path width of the bus being defined (number of bits).
For example, REG123XYZ(32) defines a register with the name
REG123XYZ having 32 bits (bits 0,1,2,...,31). Bits in a register
or bus are indexed from left to right beginning with 0.
III. UNF RTL: Labels, Conditional Execution, Conditional Branch, Merge
Statements in the control section (between the BEGIN and END
brackets) may optionally start with a label and/or a condition.
<Label>: (<condition>) <...Register transfer statement...>

Examples:
L1: (X[15 16] LEQ 1 0) X[0 TO 7] SETREG X[0 TO 7] SUB Y
M23: REG1 SETREG REG2
A[3 4] SETREG B[2 2]

In the first example, L1 is the label, (X[15 16] LEQ 1 0) is the
condition, and the remainder is the RTL statement.
Labels follow the same formation rules as those used for naming
registers and buses. A label is terminated with a colon. A
condition is an expression involving current contents of registers
and buses and should evaluate to either 1 or 0. The statement
following the condition is executed if the condition evaluates to
1, otherwise it is ignored. Statements without a "pre-condition"
are executed when encountered.
In addition to the conditional execution discussed above, there is a
conditional branching capability. The syntax is as follows:
<Label>:(<c0>) BRANCH (<c1>;<L1>)(<c2>;<L2>)(<c3>;<L3>) ... (<cN>;<LN>)
The execution of the BRANCH statement is conditioned on <c0> if
present. If the BRANCH statement is executed, the
(<condition>;<Label>) pairs are considered from left to right and
the first condition to evaluate to 1 causes a BRANCH to the
corresponding label. If none of the conditions evaluate to 1, then
execution proceeds to the next sequential line.
An unconditional branch is provided to simulate a merging of control
signals. The syntax is:
<Label>:(<c0>) MERGEAT <Lbl>
Examples:
BRANCH (SC ANEQ 0; L1) (X[0];L2)
MERGEAT TOP
IV. UNF RTL: Assignment Statements, Register Transfer, Expressions
Assignment statements simulate transfer of bit strings between
registers and buses.
<Busname> SETBUS <expression>
<Regname> SETREG <expression>
The expression on the right of the SETREG or SETBUS command
indicates processing of current contents of registers and/or buses,

Page 108
the result of which is transferred to the register or bus named on
the left hand side of the SETREG or SETBUS command. For example,
LASTBUS SETBUS REG34
indicates that the contents of REG34 are to be sent to LASTBUS;
REG8 SETREG LASTBUS
means that the current set of signals on the LASTBUS is to be copied
to REG8;
REG9[7 8] SETREG R12[4 9] OR BUS10[2 30]
specifies that the sub-register REG9[7 8] (bits 7 and 8 of REG9) is
to receive the result of bit-wise OR'ing the contents of the
sub-register R12[4 9] and sub-bus BUS10[2 30].
An expression may be formed by applying the following rules:
1. A binary vector is a term; e.g.,
1 0 1 1 1 1 0 0
2. A register name or a bus name is a term.
3. A sub-register or a sub-bus is a term; e.g.,
R1[0 4 6 7] or
BUS9[28 29 30 31]
4. Concatenation of terms is a term (binary vectors must be
enclosed in parentheses when involved in concatenation);
concatenation is indicated by using a comma "," between terms;
e.g.,
R1[4 5 6],(1 0 0 1 1),BUS27[16 17]
5. A term (as defined in 1 through 4) is an expression.
6. An expression enclosed in parentheses is a term.
7. <term> <binary-operator> <expression> is an expression.
8. <unary-operator> <expression> is an expression.
Two reserved bus names (INBUS, OUTBUS) are used for simulated
I/O. Expressions using these bus names provide simulated input (with a
prompt, from the keyboard) and output (to the screen, with optional
MESG text if desired); their syntax is:
9. INBUS '<input-prompt-message>'
Either of the statements
REG1 SETREG INBUS 'enter an 8 bit integer'
REG5[0 TO 7] SETREG INBUS 'enter 8 bits'
first sends the prompt message to the display, then accepts
user input from the keyboard.
10. OUTBUS SETBUS <register> MESG '<message>'
where MESG is a reserved word, optionally included along with
its '<message>' to specify the addition of the <message> to
the <register> display; e.g.,
OUTBUS SETBUS REG3 MESG 'this is reg3'
appends the message text to the display of the contents of
REG3.
NOTE: INBUS is read only.

OUTBUS is write only.

Page 109
V. UNF RTL: Operators
A list of dyadic (requiring two operands) and monadic (requiring one
operand) operators follows:
Dyadic Operators:
Standard Boolean Logic Operations
OR, AND, NAND, NOR, XOR, COINC
For example, 1 0 1 1 NOR 0 0 1 0 results in 0 1 0 0
Logical and Arithmetic Shifts (Left and Right), Rotate (Circular
Shift, Left and Right)
LLSHIFT, RLSHIFT, LASHIFT, RASHIFT, LROTATE, RROTATE
For example,
3 RLSHIFT 1 0 1 1 1 0 0 1 0 results in 0 0 0 1 0 1 1 1 0
3 RASHIFT 1 0 1 1 1 0 0 1 0 results in 1 1 1 1 0 1 1 1 0
3 RROTATE 1 0 1 1 1 0 0 1 0 results in 0 1 0 1 0 1 1 1 0
Two's Complement Arithmetic
ADD, SUB, MUL, DIV
For example,
0 0 1 1 1 MUL 0 0 1 1 0 results in 0 0 0 0 1 0 1 0 1 0
Logical (Unsigned) Compare and Arithmetic (Signed) Compare
LGT, LLT, LGE, LLE, LEQ, LNEQ
AGT, ALT, AGE, ALE, AEQ, ANEQ
For example,
1 1 0 0 1 0 LLE 0 0 0 1 0 1 results in 0
1 1 0 0 1 0 ALE 0 0 0 1 0 1 results in 1
String Manipulation
FIRST, LAST
For example,
4 FIRST 1 0 1 1 0 0 0 1 results in 1 0 1 1
4 LAST 1 0 1 1 0 0 0 1 results in 0 0 0 1
Reformat of User Input under INBUS
decTOtwo, hexTOtwo
For example,
8 decTOtwo -5 results in 1 1 1 1 1 0 1 1
8 hexTOtwo A9 results in 1 0 1 0 1 0 0 1
Monadic Operators:
Standard Boolean Logic Operations
NOT
For example, NOT 1 0 1 0 1 1 results in 0 1 0 1 0 0
Increment by 1, Decrement by 1
INCREMENT, DECREMENT
For example, INCREMENT 0 1 0 0 1 results in 0 1 0 1 0
DECODE, ENCODE, twosCMPL, ZERO, twoTOdec, twoTOhex
DECODE performs the function of a 1-of-2^n decoder, so
DECODE 1 1 0
gives 0 0 0 0 0 0 1 0 (activating bit number 6 of the 8 bits)
ENCODE is the inverse of decode so
ENCODE 0 0 0 0 0 1 0 0 results in 1 0 1

Page 110
twosCMPL simply forms the 2's complement of its argument so
twosCMPL 1 1 1 0 1 results in 0 0 0 1 1
ZERO returns a string of 0's of the given length so
ZERO 5 results in 0 0 0 0 0
twoTOdec converts 2's complement to a decimal value for an address
or output; e.g., twoTOdec 1 1 1 0 1 returns -3
twoTOhex converts a binary string into hexadecimal notation; e.g.,
twoTOhex 1 1 1 0 1 returns 1D
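For experimenting outside the simulator, a few of the operators above are easy to mimic on bit lists. The Python sketch below uses hypothetical helper names (it is not part of UNF RTL) and mirrors twosCMPL, twoTOdec, and RASHIFT on lists of 0/1 values.

```python
# Hypothetical Python counterparts of three UNF RTL operators,
# operating on bit lists indexed left to right as the simulator does.

def twos_cmpl(bits):
    """2's complement of a bit list (negate modulo 2**n)."""
    n = len(bits)
    value = (-int("".join(map(str, bits)), 2)) & ((1 << n) - 1)
    return [(value >> i) & 1 for i in range(n - 1, -1, -1)]

def two_to_dec(bits):
    """Interpret a bit list as a 2's complement integer (twoTOdec)."""
    value = int("".join(map(str, bits)), 2)
    return value - (1 << len(bits)) if bits[0] else value

def rashift(k, bits):
    """Arithmetic right shift by k: replicate the sign bit (RASHIFT)."""
    return [bits[0]] * k + bits[:len(bits) - k]

print(twos_cmpl([1, 1, 1, 0, 1]))               # [0, 0, 0, 1, 1]
print(two_to_dec([1, 1, 1, 0, 1]))              # -3
print(rashift(3, [1, 0, 1, 1, 1, 0, 0, 1, 0]))  # [1, 1, 1, 1, 0, 1, 1, 1, 0]
```

The printed results match the operator examples given in Section V.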
VI. UNF RTL: Evaluation of Conditions
A condition is either an expression (as defined in section IV) or
two expressions connected by one of the comparison operators. A
condition may appear as a "pre-condition" (in front of any
statement) or as the first component in a (<condition>;<label>)
pair. If a condition takes the form of an expression without a
comparison operator, it should evaluate to a 1 or 0. If a logical
comparison operator is used, the resulting bit strings on both sides
of the comparison operator are treated as unsigned integers in
making the comparison. Arithmetic comparisons treat the operands
under the assumption they are in 2's complement representation.

Page 111
UNFRTL Examples
Generic RTL example of a simple register transfer sequence
C0: A ← B
C1: A[0]←A[1], A[1]←A[2], A[2]←A[3], A[3]←A[0] (left circular shift)
    [SEQ→C0]
C2: A[0]←A[0]+A[1], A[2]←A[2]+A[3]
C3: A[1]←A[0]⊕A[1], A[3]←A[2]⊕A[3]
    [C0]
UNFRTL program providing an implementation of the sequence
[0] RtlSIMPLX
[1] DEFREG:
[2] SEQ(1)
[3] A(4)
[4] B(4)
[5] DEFBUS:
[6] BEGIN:
[7] C0:B SETREG INBUS 'Enter 4 bit B input'
[8] SEQ SETREG INBUS 'Enter SEQ value'
[9] A SETREG B
[10] C1:A SETREG 1 LROTATE A
[11] OUTBUS SETBUS A MESG 'Register A left rotated by 1 - '
[12] BRANCH(SEQ;C0)
[13] C2:A[0 2] SETREG ((A[0] ADD A[1]), (A[2] ADD A[3]))
[14] OUTBUS SETBUS A MESG 'Register A with A[0 2] added - '
[15] C3:A[1 3] SETREG ((A[0] XOR A[1]), (A[2] XOR A[3]))
[16] OUTBUS SETBUS A MESG 'Register A with A[1 3] XORed - '
[17] MERGEAT C0
[18] END:
Lines 9, 10, 13, 15 are the statements providing the actual register
transfers specified by C0, C1, C2, C3.
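The same C0-C3 sequence can be rendered in ordinary software. The Python sketch below is an assumed-equivalent rendering (not simulator output); the [SEQ→C0] and [C0] restarts are reduced to a single pass so the function simply returns the register A.

```python
# Hypothetical plain-Python rendering of the C0-C3 register transfers;
# A is a list of 4 bits, B the 4-bit input data.

def run_sequence(b, seq):
    a = list(b)                          # C0: A <- B
    a = a[1:] + a[:1]                    # C1: left circular shift
    if seq:                              # [SEQ -> C0]: restart, so stop here
        return a
    a[0], a[2] = (a[0] + a[1]) % 2, (a[2] + a[3]) % 2  # C2: 1-bit adds (carry lost)
    a[1], a[3] = a[0] ^ a[1], a[2] ^ a[3]              # C3: XOR pairs
    return a                             # [C0]: next pass would reload A

print(run_sequence([1, 0, 1, 1], seq=0))   # [1, 0, 0, 1]
```

Note that the C2 transfers happen simultaneously (both use the old A values), while C3 sees the updated A[0] and A[2], exactly as in the control sequence.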

Page 112
Signed multiply:
Architecture:
3 n-bit registers X, A, Y
1-bit register SGN
(A,Y) can be treated as a single 2n bit register for shifting

[Figure: the X register (multiplicand) feeds an adder (ADD) into the
A register (accumulator); the A and Y registers form the shifting pair
(A,Y), with Y0 the low-order bit of the Y register (multiplier); a
shift counter counts the n passes and the 1-bit register SGN holds
the product's sign]

The sign of the product is first determined by sgn(X) ⊕ sgn(Y)
and stored in SGN. X and Y are changed to their absolute values so
that the arithmetic only has to deal with positive integers.
Basic procedure for multiplying (positive) integers X and Y:
Clear A
X ← <multiplicand>
Y ← <multiplier>
REPEAT
    IF Y0 = 1
        A ← A + X
    ENDIF
    Shift (A,Y) right by 1 bit
UNTIL there have been n shifts
When done, the product will be in (A,Y). The 2's complement form is
then produced according to the sign value found in SGN.
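The shift-add procedure above can be sketched in Python (a hypothetical aid, not the UNFRTL program below): plain integers stand in for the registers, and n is the register width.

```python
# Hypothetical sketch of the shift-add signed multiply procedure;
# integers stand in for the X, A, Y registers (n-bit, n=8 as in the trace).

def signed_multiply(x, y, n=8):
    sgn = (x < 0) ^ (y < 0)              # SGN <- sgn(X) xor sgn(Y)
    x, y = abs(x), abs(y)                # arithmetic on magnitudes only
    a = 0                                # Clear A
    for _ in range(n):                   # n add/shift passes
        if y & 1:                        # IF Y0 = 1
            a += x                       #     A <- A + X
        ay = ((a << n) | y) >> 1         # Shift (A,Y) right by 1 bit
        a, y = ay >> n, ay & ((1 << n) - 1)
    product = (a << n) | y               # the product sits in (A,Y)
    return -product if sgn else product

print(signed_multiply(-10, 11))          # -110, agreeing with the execution trace
```

The intermediate (A,Y) values this produces for -10 × 11 match the REG AY / shf AY lines in the execution trace that follows.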

Page 113
UNFRTL program for implementing the procedure for multiplication with
extensions for accommodating the sign; X and Y are assumed to be
2's complement sign + 7 integers.
[0] RtlSMULTIPLY
[1] DEFREG:
[2] AY(16)
[3] X(8)
[4] SC(8)
[5] SGN(1)
[6] DEFBUS:
[7] BEGIN:
[8] AY[0 TO 7] SETREG ZERO 8
** Clear accumulator A
[9] SC SETREG 0 0 0 0 1 0 0 0
** Shift counter initially 8
[10] X SETREG 8 INBUS 'Enter multiplicand (8 bit 2''s complement)'
[11] AY[8 TO 15] SETREG 8 INBUS 'Multiplier (8 bit 2''s)'
[12] SGN SETREG X[0] XOR AY[8]
** Set sign bit for the product
[13] BRANCH(NOT X[0]; CKY)
[14] X SETREG twosCMPL X
** change sign of X if X < 0
[15] CKY:BRANCH(NOT AY[8]; L)
**
and do likewise for Y
[16] AY[8 TO 15]SETREG twosCMPL AY[8 TO 15]
[17]** Accumulate in A if rightmost bit of Y=1 (recall: AY[15] = Y0)
[18] L:(AY[15]) AY[0 TO 7] SETREG AY[0 TO 7] ADD X
[19] OUTBUS SETBUS AY MESG 'REG AY '
[20] AY SETREG 1 RLSHIFT AY
** Shift AY right
[21] OUTBUS SETBUS AY MESG 'shf AY '
[22] SC SETREG DECREMENT SC
** Decrement seq counter
[23] BRANCH(SC ANEQ 0; L)
** Repeat if shift counter ≠ 0
[24] AY SETREG 1 RLSHIFT AY
** Shift to clear the sign bit
[25] BRANCH(NOT SGN;D)
[26] AY[0 TO 15] SETREG twosCMPL AY[0 TO 15]
[27] D:OUTBUS SETBUS AY[0 TO 15] MESG 'PRODUCT '
[28] OUTBUS SETBUS (twoTOdec AY[0 TO 15]) MESG '(base 10)'
[29] END:
Execution trace: RtlSMULTIPLY (input data is -10 and 11)
Enter multiplicand (8 bit 2's complement): 1 1 1 1 0 1 1 0
Multiplier (8 bit 2's): 0 0 0 0 1 0 1 1
REG AY  ( 0 0 0 0 1 0 1 0 0 0 0 0 1 0 1 1 )
shf AY  ( 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 1 )
REG AY  ( 0 0 0 0 1 1 1 1 0 0 0 0 0 1 0 1 )
shf AY  ( 0 0 0 0 0 1 1 1 1 0 0 0 0 0 1 0 )
REG AY  ( 0 0 0 0 0 1 1 1 1 0 0 0 0 0 1 0 )
shf AY  ( 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 1 )
REG AY  ( 0 0 0 0 1 1 0 1 1 1 0 0 0 0 0 1 )
shf AY  ( 0 0 0 0 0 1 1 0 1 1 1 0 0 0 0 0 )
REG AY  ( 0 0 0 0 0 1 1 0 1 1 1 0 0 0 0 0 )
shf AY  ( 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 0 )
REG AY  ( 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 0 )
shf AY  ( 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 )
REG AY  ( 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 0 )
shf AY  ( 0 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 )
REG AY  ( 0 0 0 0 0 0 0 0 1 1 0 1 1 1 0 0 )
shf AY  ( 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 0 )
PRODUCT ( 1 1 1 1 1 1 1 1 1 0 0 1 0 0 1 0 )
(base 10) ( -110 )

Page 114
Booth's method for multiplying 2's complement integers:
Architecture:
3 n-bit registers X, A, Y
a 1-bit register P
(A,Y,P) can be treated as a single 2n+1 bit register for shifting

[Figure: the X register (multiplicand) feeds an ADD/SUB unit into the
A register (accumulator); (A,Y,P) shift right as a unit, where Y0 is
the low-order bit of the Y register (multiplier) and P0 is the 1-bit
register P holding the prior bit from the multiplier; a shift counter
counts the n passes]

In contrast to the "Signed Multiply" procedure, Booth's method requires
no independent consideration of the sign of the multiplicand and
multiplier.
The basic procedure for multiplying 2's complement integers X and Y is as
follows:
Clear A
X ← <multiplicand>
Y ← <multiplier>
P ← 0
REPEAT
    CASE
        (Y0,P0) = 1 0:  A ← A - X
        (Y0,P0) = 0 1:  A ← A + X
    ENDCASE
    Shift (A,Y,P) right arithmetically by 1 bit
UNTIL there have been n shifts
When done, the product will be in (A,Y).
Remark: The first time a 1 appears in position Y0, X will be subtracted.
If the next value to appear in Y0 is 0, X will then be added. Because of
the shift, the effect is equivalent to having added 2X at the preceding
step, which then has the combined effect over the two steps of adding
2X - X = X. If the next value to appear in Y0 had been 1, and then 0
following that, then two shifts would take place before adding X,
yielding a combined effect of 4X - X = 3X over the three steps (note that
multiplying by 1 1 is the same as multiplying by 3, so adding 3X is
exactly what is desired). Thus, the procedure produces the desired
outcome for patterns in the multiplier of 0 1 0, 0 1 1 0, 0 1 1 1 0, ...
allowing us to conclude that it will work in general. Note that the
procedure works regardless of sign. If the multiplier is negative, its
lead bits are 1's, and so the procedure simply winds out with a series of

Page 115
shifts once it gets into the leading 1's of the multiplier. Similarly,
if the multiplier is positive, its lead bits are 0's and the procedure
likewise winds out with a series of shifts once it gets into the leading
0's of the multiplier.
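Booth's procedure can likewise be sketched in Python (a hypothetical aid, not the UNFRTL program below), exploiting the fact that Python's `>>` on signed integers is already an arithmetic shift.

```python
# Hypothetical sketch of Booth's method; n-bit 2's complement operands,
# n=8 to match the trace below.

def booth_multiply(x, y, n=8):
    mask = (1 << n) - 1
    a, p = 0, 0                          # Clear A; P <- 0
    y &= mask                            # Y as an n-bit 2's complement pattern
    for _ in range(n):
        y0 = y & 1
        if (y0, p) == (1, 0):
            a -= x                       # A <- A - X
        elif (y0, p) == (0, 1):
            a += x                       # A <- A + X
        p = y0                           # P receives the bit shifted out of Y
        y = (y >> 1) | ((a & 1) << (n - 1))  # (A,Y,P) shift right, arithmetically
        a >>= 1                          # Python's >> replicates the sign bit
    return (a << n) | y                  # the product sits in (A,Y)

print(booth_multiply(-11, 19))           # -209
```

Because A stays a signed Python integer, the final `(a << n) | y` yields the 2's complement product directly, with no separate sign fix-up, just as the remark above argues.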
Trace of Booth's method: 8 bit registers, (-11)10 x (19)10 = (-209)10

X = 1 1 1 1 0 1 0 1    (-X: 0 0 0 0 1 0 1 1)
(: separates product bits shifted into Y from remaining multiplier bits)

  A                 Y                P    action on (Y0,P0)
  0 0 0 0 0 0 0 0   0 0 0 1 0 0 1 1  0    1 0: subtract (A ← A - X), then shift right 1 (arithmetic)
  0 0 0 0 1 0 1 1   0 0 0 1 0 0 1 1  0
  0 0 0 0 0 1 0 1   1:0 0 0 1 0 0 1  1    1 1: shift right 1 (arithmetic)
  0 0 0 0 0 0 1 0   1 1:0 0 0 1 0 0  1    0 1: add (A ← A + X), then shift right 1 (arithmetic)
  1 1 1 1 0 1 1 1   1 1:0 0 0 1 0 0  1
  1 1 1 1 1 0 1 1   1 1 1:0 0 0 1 0  0    0 0: shift right 1 (arithmetic)
  1 1 1 1 1 1 0 1   1 1 1 1:0 0 0 1  0    1 0: subtract (A ← A - X), then shift right 1 (arithmetic)
  0 0 0 0 1 0 0 0   1 1 1 1:0 0 0 1  0
  0 0 0 0 0 1 0 0   0 1 1 1 1:0 0 0  1    0 1: add (A ← A + X), then shift right 1 (arithmetic)
  1 1 1 1 1 0 0 1   0 1 1 1 1:0 0 0  1
  1 1 1 1 1 1 0 0   1 0 1 1 1 1:0 0  0    0 0: shift right 1 (arithmetic)
  1 1 1 1 1 1 1 0   0 1 0 1 1 1 1:0  0    0 0: shift right 1 (arithmetic)
  1 1 1 1 1 1 1 1   0 0 1 0 1 1 1 1  0

Product (A,Y) = 1 1 1 1 1 1 1 1 0 0 1 0 1 1 1 1 (base 2) = -209 (base 10)
Page 116
UNFRTL program for implementing Booth's procedure for multiplication;
X and Y are assumed to be 2's complement sign + 7 integers.
[0] RtlBOOTHMULT
[1] DEFREG:
[2] AYP(17)
[3] X(8)
[4] SC(8)
[5] DEFBUS:
[6] BEGIN:
[7] AYP[0 TO 7]SETREG ZERO 8
** Clear accumulator A
[8] SC SETREG 0 0 0 0 1 0 0 0
** Shift counter initially 8
[9] X SETREG 8 INBUS 'Enter Multiplicand (8 bit 2''s complement)'
[10] AYP[8 TO 15]SETREG 8 INBUS 'Multiplier (8 bit 2''s)'
[11] AYP[16]SETREG 0
** initialize P to 0
[12] OUTBUS SETBUS(twoTOdec X)MESG '(base 10 Multiplicand)'
[13] OUTBUS SETBUS(twoTOdec AYP[8 TO 15])MESG '(base 10 Multiplier)'
[14]** Cases: (recall: AYP[15] = Y0 and AYP[16] = P0)
[15] L:(AYP[15 16] LEQ 1 0)AYP[0 TO 7]SETREG AYP[0 TO 7]SUB X
[16] (AYP[15 16] LEQ 0 1)AYP[0 TO 7]SETREG AYP[0 TO 7]ADD X
[17] OUTBUS SETBUS AYP MESG 'REG AYP '
[18] AYP SETREG 1 RASHIFT AYP
** right arithmetic shift
[19] OUTBUS SETBUS AYP MESG 'shf AYP '
[20] SC SETREG DECREMENT SC
** Decrement shift counter
[21] BRANCH(SC ANEQ 0;L)
** Repeat if shift counter ≠ 0
[22] D:OUTBUS SETBUS AYP[0 TO 15]MESG 'PRODUCT '
[23] OUTBUS SETBUS(twoTOdec AYP[0 TO 15])MESG '(base 10)'
[24] END:
Execution trace: RtlBOOTHMULT (input data is -11 and 19)
Enter 8 bit Multiplicand: 1 1 1 1 0 1 0 1
Enter 8 bit Multiplier : 0 0 0 1 0 0 1 1
(base 10 Multiplicand) ( -11 )
(base 10 Multiplier) ( 19 )
REG AYP ( 0 0 0 0 1 0 1 1 0 0 0 1 0 0 1 1 0 )
shf AYP ( 0 0 0 0 0 1 0 1 1 0 0 0 1 0 0 1 1 )
REG AYP ( 0 0 0 0 0 1 0 1 1 0 0 0 1 0 0 1 1 )
shf AYP ( 0 0 0 0 0 0 1 0 1 1 0 0 0 1 0 0 1 )
REG AYP ( 1 1 1 1 0 1 1 1 1 1 0 0 0 1 0 0 1 )
shf AYP ( 1 1 1 1 1 0 1 1 1 1 1 0 0 0 1 0 0 )
REG AYP ( 1 1 1 1 1 0 1 1 1 1 1 0 0 0 1 0 0 )
shf AYP ( 1 1 1 1 1 1 0 1 1 1 1 1 0 0 0 1 0 )
REG AYP ( 0 0 0 0 1 0 0 0 1 1 1 1 0 0 0 1 0 )
shf AYP ( 0 0 0 0 0 1 0 0 0 1 1 1 1 0 0 0 1 )
REG AYP ( 1 1 1 1 1 0 0 1 0 1 1 1 1 0 0 0 1 )
shf AYP ( 1 1 1 1 1 1 0 0 1 0 1 1 1 1 0 0 0 )
REG AYP ( 1 1 1 1 1 1 0 0 1 0 1 1 1 1 0 0 0 )
shf AYP ( 1 1 1 1 1 1 1 0 0 1 0 1 1 1 1 0 0 )
REG AYP ( 1 1 1 1 1 1 1 0 0 1 0 1 1 1 1 0 0 )
shf AYP ( 1 1 1 1 1 1 1 1 0 0 1 0 1 1 1 1 0 )
PRODUCT ( 1 1 1 1 1 1 1 1 0 0 1 0 1 1 1 1 )
(base 10) ( -209 )

Page 117
Restoring and non-restoring division:
Architecture:
3 n-bit registers: A, X, Y
1-bit sign registers SGNQ and SGNR
(A,X) can be treated as a single 2n-bit register for shifting.

[Figure: the Y register (divisor) feeds an ADD/SUB unit into the
A register; (A,X) shift left as a unit, with X0 the low-order bit of
the X register (which holds the dividend and accumulates the quotient);
a shift counter and the sign registers SGNQ and SGNR complete the
datapath]

sign rules:
<dividend> = <quotient>*<divisor> + <remainder>
sign(<quotient>) = sign(<dividend>) ⊕ sign(<divisor>)
sign(<remainder>) = sign(<quotient>) ⊕ sign(<divisor>)
(for instance, 6/-5 = -1 r 1; -6/-5 = 1 r -1)

Using these rules, the sign of the quotient is stored in SGNQ and that of
the remainder in SGNR.
The basic procedure for restoring division of positive integers X and Y
is as follows:
Clear A
X ← <dividend>
Y ← <divisor>
REPEAT
    Shift (A,X) left by 1 bit
    A ← A - Y
    IF A < 0
        A ← A + Y    /* "restore" A */
        X0 ← 0       /* set least significant bit of X */
    ELSE
        X0 ← 1
    ENDIF
UNTIL the register has been shifted n times
When the algorithm terminates,
register A has <remainder>
register X has <quotient>
(register Y, the <divisor>, is unchanged)
At this point, the values in SGNQ and SGNR are used to establish the
correct 2's complement form for the quotient and the remainder.
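The restoring procedure can be sketched in Python (a hypothetical aid; the SGNQ/SGNR sign handling of the UNFRTL program below is omitted for brevity, so the sketch covers positive operands only).

```python
# Hypothetical sketch of restoring division for positive n-bit integers;
# an integer pair stands in for the (A,X) register, n=8 as in the trace.

def restoring_divide(dividend, divisor, n=8):
    a, x, y = 0, dividend, divisor       # Clear A; X <- dividend; Y <- divisor
    for _ in range(n):
        ax = ((a << n) | x) << 1         # Shift (A,X) left by 1 bit
        a, x = ax >> n, ax & ((1 << n) - 1)
        a -= y                           # A <- A - Y
        if a < 0:
            a += y                       # "restore" A; X0 stays 0
        else:
            x |= 1                       # X0 <- 1
    return x, a                          # quotient in X, remainder in A

print(restoring_divide(74, 25))          # (2, 24), matching the trace
```

The intermediate A values this produces for 74/25 match the shf/SUB/restore lines in the trace that follows.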

Page 118
Trace of restoring division: 8 bit registers, (74)10/(25)10 = 2 r 24
Y = 0 0 0 1 1 0 0 1,  -Y = 1 1 1 0 0 1 1 1
(: marks the quotient so far; ? marks the vacated bit)

ShiftL  A 0 0 0 0 0 0 0 0   X 1 0 0 1 0 1 0 ?
Sub Y   A 1 1 1 0 0 1 1 1                       A < 0; set vacated bit to 0 (and restore A)
        A 0 0 0 0 0 0 0 0   X 1 0 0 1 0 1 0:0
ShiftL  A 0 0 0 0 0 0 0 1   X 0 0 1 0 1 0:0 ?
Sub Y   A 1 1 1 0 1 0 0 0                       A < 0; set vacated bit to 0 (and restore A)
        A 0 0 0 0 0 0 0 1   X 0 0 1 0 1 0:0 0
ShiftL  A 0 0 0 0 0 0 1 0   X 0 1 0 1 0:0 0 ?
Sub Y   A 1 1 1 0 1 0 0 1                       A < 0; set vacated bit to 0 (and restore A)
        A 0 0 0 0 0 0 1 0   X 0 1 0 1 0:0 0 0
ShiftL  A 0 0 0 0 0 1 0 0   X 1 0 1 0:0 0 0 ?
Sub Y   A 1 1 1 0 1 0 1 1                       A < 0; set vacated bit to 0 (and restore A)
        A 0 0 0 0 0 1 0 0   X 1 0 1 0:0 0 0 0
ShiftL  A 0 0 0 0 1 0 0 1   X 0 1 0:0 0 0 0 ?
Sub Y   A 1 1 1 1 0 0 0 0                       A < 0; set vacated bit to 0 (and restore A)
        A 0 0 0 0 1 0 0 1   X 0 1 0:0 0 0 0 0
ShiftL  A 0 0 0 1 0 0 1 0   X 1 0:0 0 0 0 0 ?
Sub Y   A 1 1 1 1 1 0 0 1                       A < 0; set vacated bit to 0 (and restore A)
        A 0 0 0 1 0 0 1 0   X 1 0:0 0 0 0 0 0
ShiftL  A 0 0 1 0 0 1 0 1   X 0:0 0 0 0 0 0 ?
Sub Y   A 0 0 0 0 1 1 0 0                       A > 0; set vacated bit to 1
        A 0 0 0 0 1 1 0 0   X 0:0 0 0 0 0 0 1
ShiftL  A 0 0 0 1 1 0 0 0   X :0 0 0 0 0 0 1 ?
Sub Y   A 1 1 1 1 1 1 1 1                       A < 0; set vacated bit to 0 (and restore A)
        A 0 0 0 1 1 0 0 0   X :0 0 0 0 0 0 1 0
        <remainder>           <quotient>

Page 119

UNFRTL program for implementing the procedure for restoring division
with extensions for accommodating the sign; X and Y are assumed to be
2's complement sign + 7 integers.
[0] RtlRESTORING
[1] DEFREG:
[2] AX(16)
[3] Y(8)
[4] SC(8)
[5] SGNQ(1)
[6] SGNR(1)
[7] DEFBUS:
[8] BEGIN:
[9]** Initialize first half of AX register to zeroes
[10] AX[0 TO 7] SETREG ZERO 8
[11]** Initialize shift counter to 8
[12] SC SETREG 0 0 0 0 1 0 0 0
[13] AX[8 TO 15] SETREG 8 INBUS 'Enter 8 bit dividend (2''s comp)'
[14] Y SETREG 8 INBUS 'Enter 8 bit divisor (2''s comp)'
[15] SGNQ SETREG AX[8] XOR Y[0]
[16] SGNR SETREG SGNQ XOR Y[0]
[17] BRANCH(NOT AX[8]; CKY)
[18] AX[8 TO 15] SETREG twosCMPL AX[8 TO 15]
[19] CKY:BRANCH(NOT Y[0]; L)
[20] Y SETREG twosCMPL Y
[21] L:AX SETREG 1 LLSHIFT AX
[22] SC SETREG SC SUB 0 0 0 0 0 0 0 1
[23] OUTBUS SETBUS AX MESG 'shf AX '
[24] AX[0 TO 7] SETREG AX[0 TO 7] SUB Y
[25] OUTBUS SETBUS AX MESG 'SUB     '
[26] BRANCH(AX[0 TO 7] ALT 0; RESTORE)
[27] AX[15] SETREG 1
[28] OUTBUS SETBUS AX MESG 'set 1   '
[29] MERGEAT TST
[30] RESTORE:AX[0 TO 7] SETREG AX[0 TO 7] ADD Y
[31] OUTBUS SETBUS AX MESG 'restore '
[32] TST:BRANCH(SC AGT ZERO 8; L)
[33] BRANCH(NOT SGNR; CKQ)
[34] AX[0 TO 7] SETREG twosCMPL AX[0 TO 7]
[35] CKQ:BRANCH(NOT SGNQ; D)
[36] AX[8 TO 15] SETREG twosCMPL AX[8 TO 15]
[37] D:OUTBUS SETBUS AX[8 TO 15] MESG 'QUOTIENT '
[38] OUTBUS SETBUS (twoTOdec AX[8 TO 15]) MESG '(base 10)'
[39] OUTBUS SETBUS AX[0 TO 7] MESG 'REMAINDER'
[40] OUTBUS SETBUS (twoTOdec AX[0 TO 7]) MESG '(base 10)'
[41] END:

Page 120

Execution trace: RtlRESTORING (input data is 74 and 25)

Enter 8 bit dividend (2's comp): 0 1 0 0 1 0 1 0
Enter 8 bit divisor (2's comp): 0 0 0 1 1 0 0 1
shf AX  ( 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 )
SUB     ( 1 1 1 0 0 1 1 1 1 0 0 1 0 1 0 0 )
restore ( 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 )
shf AX  ( 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 )
SUB     ( 1 1 1 0 1 0 0 0 0 0 1 0 1 0 0 0 )
restore ( 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 )
shf AX  ( 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 )
SUB     ( 1 1 1 0 1 0 0 1 0 1 0 1 0 0 0 0 )
restore ( 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 )
shf AX  ( 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 )
SUB     ( 1 1 1 0 1 0 1 1 1 0 1 0 0 0 0 0 )
restore ( 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 )
shf AX  ( 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 )
SUB     ( 1 1 1 1 0 0 0 0 0 1 0 0 0 0 0 0 )
restore ( 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 )
shf AX  ( 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 0 )
SUB     ( 1 1 1 1 1 0 0 1 1 0 0 0 0 0 0 0 )
restore ( 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 0 )
shf AX  ( 0 0 1 0 0 1 0 1 0 0 0 0 0 0 0 0 )
SUB     ( 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 )
set 1   ( 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 )
shf AX  ( 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 )
SUB     ( 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 )
restore ( 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 )
QUOTIENT ( 0 0 0 0 0 0 1 0 )
(base 10) ( 2 )
REMAINDER ( 0 0 0 1 1 0 0 0 )
(base 10) ( 24 )

Page 121
The basic procedure for non-restoring division of positive integers X
and Y is as follows:
Clear A
X ← <dividend>
Y ← <divisor>
Shift (A,X) left by 1 bit
A ← A - Y
REPEAT
    IF A < 0
        X0 ← 0       /* set least significant bit of X */
        Shift (A,X) left by 1 bit
        A ← A + Y    /* see Remark below */
    ELSE
        X0 ← 1
        Shift (A,X) left by 1 bit
        A ← A - Y
    ENDIF
UNTIL the register has been shifted n times
(including the initial shift)
IF A < 0
    X0 ← 0
    A ← A + Y    /* the only time A is "restored" */
ELSE
    X0 ← 1
ENDIF
When the algorithm terminates,
register A has <remainder>
register X has <quotient>
(register Y, the <divisor>, is unchanged)
Remark:
Shifting and then adding Y as done above is equivalent to adding
Y (to restore A), then shifting, and then subtracting Y as done
in the restoring algorithm. This is true because a left shift
has the effect of multiplying by 2; i.e.,
1. in the non-restoring algorithm when A < 0:
Y was subtracted initially;
a shift has the effect that 2Y is now subtracted;
adding Y leaves the effect of a single subtraction of Y for
the next pass (without having to restore!).
2. in the restoring algorithm when A < 0:
Y was subtracted initially;
Y is added back to restore;
a shift now has no effect on the Y arithmetic;
Y must now be explicitly subtracted for the next pass.
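The non-restoring procedure can be sketched in Python the same way (a hypothetical aid, positive operands only, sign handling omitted); comparing it with the restoring sketch makes the shift-then-add trick of the Remark concrete.

```python
# Hypothetical sketch of non-restoring division for positive n-bit
# integers; an integer pair stands in for (A,X), n=8 as in the trace.

def nonrestoring_divide(dividend, divisor, n=8):
    a, x, y = 0, dividend, divisor       # Clear A; X <- dividend; Y <- divisor

    def shift_left():                    # Shift (A,X) left by 1 bit
        nonlocal a, x
        ax = ((a << n) | x) << 1
        a, x = ax >> n, ax & ((1 << n) - 1)

    shift_left()
    a -= y                               # initial A <- A - Y
    for _ in range(n - 1):               # n shifts total, counting the first
        if a < 0:
            shift_left()                 # X0 stays 0 (the vacated bit)
            a += y                       # shift first, then add Y back
        else:
            x |= 1                       # X0 <- 1
            shift_left()
            a -= y
    if a < 0:                            # final correction:
        a += y                           # the only time A is "restored"
    else:
        x |= 1
    return x, a                          # quotient in X, remainder in A

print(nonrestoring_divide(74, 25))       # (2, 24), matching the trace
```

Note the loop body performs exactly one add or subtract per shift, never both, which is the whole advantage over the restoring version.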

Page 122
Trace of non-restoring division: 8 bit registers, (74)10 / (25)10
Y = 0 0 0 1 1 0 0 1,  -Y = 1 1 1 0 0 1 1 1
(: marks the quotient so far; ? marks the vacated bit)

ShiftL  A 0 0 0 0 0 0 0 0   X 1 0 0 1 0 1 0 ?
Sub Y   A 1 1 1 0 0 1 1 1   X 1 0 0 1 0 1 0:0   A < 0; set vacated bit to 0
ShiftL  A 1 1 0 0 1 1 1 1   X 0 0 1 0 1 0:0 ?
Add Y   A 1 1 1 0 1 0 0 0   X 0 0 1 0 1 0:0 0   A < 0; set vacated bit to 0
ShiftL  A 1 1 0 1 0 0 0 0   X 0 1 0 1 0:0 0 ?
Add Y   A 1 1 1 0 1 0 0 1   X 0 1 0 1 0:0 0 0   A < 0; set vacated bit to 0
ShiftL  A 1 1 0 1 0 0 1 0   X 1 0 1 0:0 0 0 ?
Add Y   A 1 1 1 0 1 0 1 1   X 1 0 1 0:0 0 0 0   A < 0; set vacated bit to 0
ShiftL  A 1 1 0 1 0 1 1 1   X 0 1 0:0 0 0 0 ?
Add Y   A 1 1 1 1 0 0 0 0   X 0 1 0:0 0 0 0 0   A < 0; set vacated bit to 0
ShiftL  A 1 1 1 0 0 0 0 0   X 1 0:0 0 0 0 0 ?
Add Y   A 1 1 1 1 1 0 0 1   X 1 0:0 0 0 0 0 0   A < 0; set vacated bit to 0
ShiftL  A 1 1 1 1 0 0 1 1   X 0:0 0 0 0 0 0 ?
Add Y   A 0 0 0 0 1 1 0 0   X 0:0 0 0 0 0 0 1   A > 0; set vacated bit to 1
ShiftL  A 0 0 0 1 1 0 0 0   X :0 0 0 0 0 0 1 ?
Sub Y   A 1 1 1 1 1 1 1 1   X :0 0 0 0 0 0 1 0  A < 0; set vacated bit to 0 and restore A
        A 0 0 0 1 1 0 0 0
        <remainder>           <quotient> = 0 0 0 0 0 0 1 0

Page 123
UNFRTL program for implementing the procedure for non-restoring
division with extensions for accommodating the sign; X and Y are
assumed to be 2's complement sign + 7 integers.
[0] RtlNRESTORE
[1] DEFREG:
[2] AX(16)
[3] Y(8)
[4] SC(8)
[5] SGNQ(1)
[6] SGNR(1)
[7] DEFBUS:
[8] BEGIN:
[9]** Initialize first half of AX register to zeroes
[10] AX[0 TO 7] SETREG ZERO 8
[11]** Initialize shift counter to 8
[12] SC SETREG 0 0 0 0 1 0 0 0
[13] AX[8 TO 15] SETREG 8 INBUS 'Enter 8 bit Dividend'
[14] Y SETREG 8 INBUS 'Enter 8 bit Divisor '
[15] SGNQ SETREG AX[8] XOR Y[0]
[16] SGNR SETREG SGNQ XOR Y[0]
[17] (AX[8]) AX[8 TO 15] SETREG twosCMPL AX[8 TO 15]
[18] (Y[0]) Y SETREG twosCMPL Y
[19] AX SETREG 1 LLSHIFT AX
[20] OUTBUS SETBUS AX MESG 'shf AX '
[21] AX[0 TO 7] SETREG AX[0 TO 7] SUB Y
[22] OUTBUS SETBUS AX MESG 'SUB     '
[23] L:SC SETREG SC SUB 0 0 0 0 0 0 0 1
[24] BRANCH(SC ALE ZERO 8; CHK)
[25] (AX[0]) MERGEAT ADDY
[26] AX[15] SETREG 1
[27] OUTBUS SETBUS AX MESG 'set 1   '
[28] AX SETREG 1 LLSHIFT AX
[29] OUTBUS SETBUS AX MESG 'shf AX '
[30] AX[0 TO 7] SETREG AX[0 TO 7] SUB Y
[31] OUTBUS SETBUS AX MESG 'SUB     '
[32] MERGEAT L
[33] ADDY: AX SETREG 1 LLSHIFT AX
[34] OUTBUS SETBUS AX MESG 'shf AX '
[35] AX[0 TO 7] SETREG AX[0 TO 7] ADD Y
[36] OUTBUS SETBUS AX MESG 'ADD     '
[37] MERGEAT L
[38] CHK:(AX[0]) MERGEAT REST
[39] AX[15] SETREG 1
[40] OUTBUS SETBUS AX MESG 'set 1   '
[41] MERGEAT CKQ
[42] REST:AX[0 TO 7] SETREG AX[0 TO 7] ADD Y
[43] OUTBUS SETBUS AX MESG 'ADD     '
[44] CKQ:(SGNR) AX[0 TO 7] SETREG twosCMPL AX[0 TO 7]
[45] (SGNQ) AX[8 TO 15] SETREG twosCMPL AX[8 TO 15]
[46] D:OUTBUS SETBUS AX[8 TO 15] MESG 'QUOTIENT '
[47] OUTBUS SETBUS (twoTOdec AX[8 TO 15]) MESG '(base 10)'
[48] OUTBUS SETBUS AX[0 TO 7] MESG 'REMAINDER'
[49] OUTBUS SETBUS (twoTOdec AX[0 TO 7]) MESG '(base 10)'
[50] END:

Page 124
Execution trace: RtlNRESTORE (input data is 74 and 25)
Enter 8 bit Dividend: 0 1 0 0 1 0 1 0
Enter 8 bit Divisor : 0 0 0 1 1 0 0 1
shf AX  ( 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 )
SUB     ( 1 1 1 0 0 1 1 1 1 0 0 1 0 1 0 0 )
shf AX  ( 1 1 0 0 1 1 1 1 0 0 1 0 1 0 0 0 )
ADD     ( 1 1 1 0 1 0 0 0 0 0 1 0 1 0 0 0 )
shf AX  ( 1 1 0 1 0 0 0 0 0 1 0 1 0 0 0 0 )
ADD     ( 1 1 1 0 1 0 0 1 0 1 0 1 0 0 0 0 )
shf AX  ( 1 1 0 1 0 0 1 0 1 0 1 0 0 0 0 0 )
ADD     ( 1 1 1 0 1 0 1 1 1 0 1 0 0 0 0 0 )
shf AX  ( 1 1 0 1 0 1 1 1 0 1 0 0 0 0 0 0 )
ADD     ( 1 1 1 1 0 0 0 0 0 1 0 0 0 0 0 0 )
shf AX  ( 1 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 )
ADD     ( 1 1 1 1 1 0 0 1 1 0 0 0 0 0 0 0 )
shf AX  ( 1 1 1 1 0 0 1 1 0 0 0 0 0 0 0 0 )
ADD     ( 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 )
set 1   ( 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 )
shf AX  ( 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 )
SUB     ( 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 )
ADD     ( 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 )
QUOTIENT ( 0 0 0 0 0 0 1 0 )
(base 10) ( 2 )
REMAINDER ( 0 0 0 1 1 0 0 0 )
(base 10) ( 24 )

Page 125
Floating point operations: Floating point operations can likewise be
implemented in RTL. We will only illustrate this for floating point
add (IEEE 32 bit format). Assume that both numbers are positive (if
one is positive and one is negative, then the add operation becomes
subtract; if both are negative, then the operation is add with the
sign set to negative).
Architecture:
two 32-bit floating point registers F1,F2
two 2-bit registers GB1,GB2 for guard bits
two 2-bit registers IB1,IB2 for manipulating the implied 1
Each of (IB1,F1[9-31],GB1) and (IB2,F2[9-31],GB2) can be treated
as a single 27-bit register for addition and shifting.
The basic procedure for floating point addition/subtraction of numbers
X and Y is as follows:
Clear GB1, Clear GB2, Clear IB1, Clear IB2
IB1[1] ← 1, IB2[1] ← 1     /* Set the implied 1s */
F1 ← <addend1>
F2 ← <addend2>
/* Make F1 the larger of the two numbers in magnitude */
IF F1[1-8] < F2[1-8]
    Swap(F1,F2)
ENDIF
IF F1[1-8] = F2[1-8]
    IF F1[9-31] < F2[9-31]
        Swap(F1,F2)
    ENDIF
ENDIF
/* shift F2's mantissa to line up the two exponents */
Shift (IB2,F2[9-31],GB2) right by (F1[1-8] - F2[1-8]) bits
IF F1[0] = F2[0]    /* add mantissas including the extra bits */
    (IB1,F1[9-31],GB1) ← (IB1,F1[9-31],GB1) + (IB2,F2[9-31],GB2)
ELSE                /* subtract */
    (IB1,F1[9-31],GB1) ← (IB1,F1[9-31],GB1) - (IB2,F2[9-31],GB2)
    IF (IB1,F1[9-31],GB1) is all zeroes
        Clear F1    /* special case when result is 0 */
        Exit
    ENDIF
ENDIF
/* normalize by shifting until IB1 has the implied 1 */
IF IB1[0] = 1
    Shift (IB1,F1[9-31],GB1) right by 1
    F1[1-8] ← F1[1-8] + 1    /* increment the exponent */
ENDIF
WHILE IB1[1] = 0    /* normalization */
    Shift (IB1,F1[9-31],GB1) left by 1
    F1[1-8] ← F1[1-8] - 1    /* decrement the exponent */
ENDWHILE
F1[31] ← GB1[0]    /* round the result */
Remark: special cases (e.g., exponent all 0s) are not considered.
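The addition path of the procedure can be sketched in Python and checked against the machine's own float arithmetic via the struct module. The sketch below is hypothetical (not the UNFRTL program): it handles only positive, normal operands, so the left-shift normalization loop never fires and only the carry case needs handling; rounding follows the notes' crude rule of copying the first guard bit into the low mantissa bit.

```python
# Hypothetical sketch of IEEE-754 single-precision addition for two
# positive, normal operands, using 2 guard bits as in the procedure.

import struct

def f2b(v):
    """Bit pattern of a 32-bit float."""
    return struct.unpack('<I', struct.pack('<f', v))[0]

def ieee_add(a_bits, b_bits):
    ea, ma = (a_bits >> 23) & 0xFF, a_bits & 0x7FFFFF
    eb, mb = (b_bits >> 23) & 0xFF, b_bits & 0x7FFFFF
    ma = ((1 << 23) | ma) << 2          # attach implied 1 and 2 guard bits
    mb = ((1 << 23) | mb) << 2
    if (ea, ma) < (eb, mb):             # make operand 1 the larger one
        ea, ma, eb, mb = eb, mb, ea, ma
    mb >>= ea - eb                      # line up the exponents
    ma += mb                            # add mantissas (both positive)
    if ma >> 26:                        # carry past the implied-1 position:
        ma >>= 1                        # shift right and
        ea += 1                         # increment the exponent
    ma = (ma & ~0b100) | (((ma >> 1) & 1) << 2)  # low bit <- first guard bit
    ma >>= 2                            # drop the guard bits
    return (ea << 23) | (ma & 0x7FFFFF)

print(ieee_add(f2b(2.5), f2b(1.25)) == f2b(3.75))   # True
```

When the sum is exactly representable, the result agrees bit-for-bit with the hardware float addition.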

Page 126
UNFRTL program for implementing the procedure for implementing 32-bit
IEEE add/subtract (decided by the signs of the numbers)
[0]
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21]
[22]
[23]
[24]
[25]
[26]
[27]
[28]
[29]
[30]
[31]
[32]
[33]
[34]
[35]
[36]
[37]
[38]
[39]
[40]
[41]
[42]
[43]
[44]
[45]
[46]
[47]
[48]
[49]

RtlADDIEEE
* Addition or subtraction determined by signs
DEFREG:
* Mantissa has 2 for overflow, 1 for implied 1, 2 as guard bits
MANT1(28)
MANT2(28)
* Exponent has 2 for overflow; special cases are not handled
EXP1(10)
EXP2(10)
SGN1(1)
SGN2(1)
DEFBUS:
BEGIN:
* Clear extra bits and set implied ones (assumes normal case)
MANT1[0 1 2 26 27]SETREG 0 0 1 0 0
MANT2[0 1 2 26 27]SETREG 0 0 1 0 0
EXP1[0 1]SETREG 0 0
EXP2[0 1]SETREG 0 0
SGN1 SETREG 1 INBUS '1st number - enter 1 bit sign 1'
EXP1[2 TO 9]SETREG 8 INBUS 'Enter 8 bit exponent 1'
MANT1[3 TO 25]SETREG 23 INBUS 'Enter 23 bit mantissa 1'
SGN2 SETREG 1 INBUS '2nd number - enter 1 bit sign 2'
EXP2[2 TO 9]SETREG 8 INBUS 'Enter 8 bit exponent 2'
MANT2[3 TO 25]SETREG 23 INBUS 'Enter 23 bit mantissa 2'
OUTBUS SETBUS 'Adding'
OUTBUS SETBUS SGN1,'-',EXP1[2 TO 9],'-',MANT1[3 TO 25]
OUTBUS SETBUS SGN2,'-',EXP2[2 TO 9],'-',MANT2[3 TO 25]
* Swap operands if necessary
BRANCH(EXP1[2 TO 9]LGT EXP2[2 TO 9];OK)
BRANCH(EXP1[2 TO 9]LLT EXP2[2 TO 9];DS)
BRANCH(MANT1[3 TO 25]LGE MANT2[3 TO 25];OK)
DS:EXP1 SETREG EXP1 XOR EXP2
EXP2 SETREG EXP1 XOR EXP2
EXP1 SETREG EXP1 XOR EXP2
MANT1 SETREG MANT1 XOR MANT2
MANT2 SETREG MANT1 XOR MANT2
MANT1 SETREG MANT1 XOR MANT2
* Sign of F1 determines sign of result
SGN1 SETREG SGN1 XOR SGN2
SGN2 SETREG SGN1 XOR SGN2
SGN1 SETREG SGN1 XOR SGN2
* Line up exponents and add or subtract mantissas
OK:MANT2 SETREG(twoTOdec EXP1 SUB EXP2)RLSHIFT MANT2
(SGN1 LEQ SGN2)MANT1 SETREG MANT1 ADD MANT2
(SGN1 LNEQ SGN2)MANT1 SETREG MANT1 SUB MANT2
BRANCH(MANT1 LNEQ ZERO 28;NORM)
SGN1 SETREG 0
EXP1 SETREG ZERO 10
MERGEAT D
* If necessary shift implied 1 into position


NORM:(NOT MANT1[1])MERGEAT L
MANT1 SETREG 1 RLSHIFT MANT1
EXP1 SETREG INCREMENT EXP1
* If necessary normalize to get 1 into implied position
L:(MANT1[2])MERGEAT RND
MANT1 SETREG 1 LLSHIFT MANT1
EXP1 SETREG DECREMENT EXP1
MERGEAT L
* Round the result
RND:MANT1[25]SETREG MANT1[26]
D:OUTBUS SETBUS EXP1[2 TO 9]MESG 'Exponent - '
OUTBUS SETBUS MANT1[3 TO 25]MESG 'Mantissa - '
OUTBUS SETBUS SGN1,'-',EXP1[2 TO 9],'-',MANT1[3 TO 25]
END:

Example Usage:
Operand 1 = 232.125 = 11101000.001 (binary)
normalized is 1.1101000001 x 2^7
sign = 0
biased exponent = 7 + 127 = 134 = 10000110 (binary)
mantissa = 11010000010000000000000 (implied leading 1)
Operand 2 = -1.03125 = -1.00001 (binary)
normalized is -1.00001 x 2^0
sign = 1
biased exponent = 0 + 127 = 127 = 01111111 (binary)
mantissa = 00001000000000000000000 (implied leading 1)
RTL simulator results:
Name of Machine: RtlADDIEEE
Processing RtlADDIEEE specifications and statements.
1st number - enter 1 bit sign 1: 0
Enter 8 bit exponent 1: 1 0 0 0 0 1 1 0
Enter 23 bit mantissa 1: 1 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
2nd number - enter 1 bit sign 2: 1
Enter 8 bit exponent 2: 0 1 1 1 1 1 1 1
Enter 23 bit mantissa 2: 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Adding
0 - 1 0 0 0 0 1 1 0 - 1 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
1 - 0 1 1 1 1 1 1 1 - 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Exponent - ( 1 0 0 0 0 1 1 0 )
Mantissa - ( 1 1 0 0 1 1 1 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 )
0 - 1 0 0 0 0 1 1 0 - 1 1 0 0 1 1 1 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0
Normal Termination
Verification:
1.1101000001 x 2^7 = 1.110100000100 x 2^7 = 232.12500
-1.00001 x 2^0 = -0.000000100001 x 2^7 = -1.03125
1.110011100011 x 2^7 = 231.09375
1.110011100011 x 2^7 = 11100111.00011 (binary) = 128+64+32+4+2+1+1/16+1/32 = 231.09375

Computer organization:
Computer hardware is generally organized in three component areas:
1. Memory
2. Central Processing Unit (CPU)
3. Peripherals
Memory has already been described, and RTL provides the means for describing
the CPU. The CPU components are given by the following:

[Block diagram: General Hardware Organization Example (separate
I/O:memory bus and CPU:memory bus). MEMORY (shared data and
instructions) connects over the CPU-memory bus to the Central
Processing Unit, which contains the IC (Instruction Counter), SR
(Status Register), IR (Instruction Register) feeding the CONTROL
instruction decode, MAR (Memory Address Reg), MDR (Memory Data Reg),
working registers (Accumulator, Index), and the ALU (Arithmetic and
Logic Unit including support registers Y and Z). INPUT and OUTPUT
devices attach to memory over a separate data path via a
communications link.]

The memory and CPU can operate asynchronously (in effect, each has its
own clock). For peripheral (I/O) devices, the CPU sends a signal to
the I/O device and then a device controller independent of the CPU
takes care of data transfer, which is to/from a designated memory
location called an I/O buffer. When the transfer is complete, the I/O
device sends a signal to the CPU to notify it that the buffer is now
ready for access. If the CPU is performing an operation that requires
the transfer to complete, then the CPU will need to pause operation
(essentially by holding its clock at zero) until the completion signal
is received. The user sees this as the system hanging.
Hence, the CPU needs to be able to signal resource controllers
(including memory). The signal can be as simple as taking a bit to 1,
which when dropped back to 0 (by the external resource controller)
causes the CPU clock to resume.

The elements within the CPU consist of


A control unit
An Arithmetic and Logic Unit (ALU)
Registers for user program data (working registers)
Registers for managing user programs

The control unit:


The control unit has circuitry for signaling data transfers to and
from memory. Two registers are employed for controlling the transfer.
1. The memory address register (MAR), which has the memory address
for the transfer
2. The memory data register (MDR), which has the data to be
transferred to memory, or which receives the data transferred from
memory.
The memory data register is attached to the CPU-memory bus as a bus
sense register accessible by both memory and CPU. The memory control
unit also must be able to access the MAR to determine the memory
address to use.
The Von Neumann architecture stipulates that programs and data reside
in the same memory area. The process of transferring a machine
language instruction into the CPU is called an instruction fetch. The
control unit has an internal working register, the instruction
register (IR), where it stores the instruction fetched. The IR is
attached to a circuit that decodes the instruction to
extract the instruction operation
determine the memory address the instruction is to act on
The instruction address can then be transferred to the MAR to initiate
the transfer of the needed data to the MDR.
Arithmetic and Logic Unit:
The arithmetic and logic unit contains circuits such as those
described using RTL and combinational logic for useful computational
work, such as arithmetic operations and logical comparison. The ALU
is signaled as to which operation's output is to be captured in its
output register.

Registers for user program data (working registers):
Conceptually, user program data must be placed in a work area where it
can be retained to permit cascading operations that characterize
complex arithmetic expressions. Each of these registers is usually
attached to the CPU bus, which effectively limits their number (the
IBM 360 architecture provides 16, for example). Some of these
registers may have special purposes; for example, an accumulator is a
register in which the ongoing outcome of a computation is accumulated;
an index register is one whose value is added to the instruction
address to allow stepping through a sequence of memory locations
(usually
representing a data table). The idea is to do as much work in the CPU
as possible to avoid transfers to and from memory; for example, a swap
sequence through a temporary memory location T

    Read m1, Write T     (transfer m1 to T)
    Read m2, Write m1    (transfer m2 to m1)
    Read T, Write m2     (transfer T to m2)

requires 6 Read/Write operations, whereas using working registers R1
and R2

    Read m1 to R1
    Read m2 to R2
    R1 ← R1 XOR R2   \
    R2 ← R1 XOR R2    } CPU time for these is negligible compared
    R1 ← R1 XOR R2   /   to Read/Write
    Write R1 to m1
    Write R2 to m2

requires only 4 (plus no temporary location is needed).
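The three register-to-register steps in the middle are the classic XOR swap. A quick Python check (illustrative sketch only):

```python
def xor_swap(r1, r2):
    # Three register-to-register XOR operations exchange two values
    # without a temporary location (and without touching memory).
    r1 = r1 ^ r2
    r2 = r1 ^ r2   # r2 now holds the original r1
    r1 = r1 ^ r2   # r1 now holds the original r2
    return r1, r2
```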
Registers for managing user programs:
The Instruction Counter (IC) has the address of the machine language
instruction to fetch after the instruction currently in the IR is
finished. When an instruction is fetched, the IC is updated to point
to the address of the next instruction. A branch instruction is
simply one that can modify the value in the IC.
The Status Register (SR) is set by the control unit to flag results of
comparisons, overflow conditions, and the like.
To facilitate register transfers, CPU elements are connected along one
or more bus structures, with access to a bus controlled by 3-state
logic blocks that allow a register's value onto the bus and select the
registers to which it is transferred from the bus. A single bus
organization has a structure such as the following:


[Diagram: Single Bus CPU Organization. The registers SR, IC, IR, MDR,
MAR, R0, R1, ..., Rn and the ALU (with its Y and Z registers) all
attach to the CPU bus; the instruction decoder & operand address logic
is driven from the IR; the MAR drives the address lines and the MDR
the data lines of the CPU-memory bus; control lines select ALU
operations such as ADD, SUB, etc.]

This organization provides means for moving values in and out of
selected registers. No more than 1 register can be gated onto the bus
at any one time, or the signals will conflict. Any number of
registers can simultaneously be loaded from the bus, however. Binary
operations take one operand from register Y and the other from the
bus. Registers such as the IR do not have a transfer to the bus
because there is no reason to be transferring their contents back out
of the register. The IC is not in this category, because its contents
must eventually be transferred to the bus and into the MAR as part of
fetching the next instruction to execute.
The Register-Bus gating is as follows:

[Diagram: Register-Bus Gating. Each register R0 has an "in" (enable)
signal, which loads it from the bus, and an "out" (enable) signal,
which gates its contents onto the bus. Data transfer example: Y ← R0
is accomplished by signaling R0 out, Yin.]

Here R0 in enables the R0 Write/Enable on the clock signal. R0 out
activates a 3-state logic connection from R0 to the bus.

From the diagram we can determine the gating signals. Memory I/O
control signals for Read from memory and Write to memory are also
needed, along with ALU commands. A microcounter is used to select the
current line of microcode from a table and control signals are needed
to selectively reset the counter.
Gating signals:
ICout, ICin, Addrout, IRin, MARin, MDRout, MDRin,
R0 out, R0 in, ..., Ri out, Ri in, ..., Yin, Zout, Zin
Memory I/O control signals:
Read, Write, WaitM (hold CPU clock at 0 until memory read is done)
ALU commands:
Add, Sub, Set carry (to 1), ShiftR Y, ShiftL Y, Clear Y,
Compare, GT, LT, EQ, NE
Micro counter control signals:
End
A line of microcode consists of a sequence of bits which give the
values for each of the gating signals, the Memory I/O control signals,
the ALU commands, and the micro counter control signals (1 means the
signal is active, 0 means it is inactive). Microcode organized in this
fashion is called horizontal microcode.
Several lines of microcode are needed to specify a machine language
instruction. A machine language instruction is divided into two parts:
1. the op code (specifies what the instruction is to do)
2. the operand (identifies the location of the data to be acted on)
In a basic machine, a machine language instruction occupies a single
word of memory. If the word length is 32 and the op code takes 8 bits,
then 256 different machine language instructions can be provided. The
operand takes the remaining 24 bits. Since operands represent memory
addresses, 2^24 = 16,777,216 different memory locations can be directly
addressed. Larger memory address spaces can be accommodated by using
operands that represent relative addresses rather than absolute
addresses.
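With an 8-bit op code and a 24-bit operand in a 32-bit word, packing and unpacking an instruction is simple shifting and masking. A sketch (the op-code values used in the test are arbitrary stand-ins, not actual encodings from these notes):

```python
def encode(op, addr):
    # 8-bit op code in the high byte, 24-bit operand in the low 3 bytes
    return ((op & 0xFF) << 24) | (addr & 0xFFFFFF)

def decode(word):
    # Recover (op code, operand) from a 32-bit instruction word
    return word >> 24, word & 0xFFFFFF
```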
Once the instruction is brought in from memory and transferred into the
IR, the instruction interpreter can decode the op code part to point the
microcounter to the right microprogram; the operand is gated to the bus
when Addrout is signaled.

Example: Suppose that the machine language instruction
ADD0 <addr>
means increment the value in R0 by the value pointed to by <addr>. A
register used in this fashion is sometimes called an accumulator. A
microcode sequence for the ADD0 instruction is as follows:

    Instruction | ICout, MARin, Read, Clear Y, Set carry, Add, Zin
    fetch       | Zout, ICin, WaitM
                | MDRout, IRin
    Operand     | Addrout, MARin, Read
    fetch       | R0 out, Yin, WaitM
    Accumulate  | MDRout, Add, Zin
                | Zout, R0 in, End

Each line of microcode represents the signals which are on. All
others are presumed to be off.

The 1st line of microcode initiates instruction fetch and does the
following (simultaneously):

the IC is gated onto the bus and into the MAR (ICout, MARin)
memory is signaled to READ the value addressed by MAR into
the MDR
the ALU's Y register is cleared to 0
the carry-in for Add is set to 1
the ALU Add circuit is selected
(calculating bus + Y + 1 = address of the next instruction)
the ALU result is gated into Z.

This all takes place in 1 CPU cycle. The idea is to do as much in
each step as possible to minimize the number of CPU cycles required.
The 2nd line of the instruction fetch does housekeeping, gating
the address of the next instruction (as calculated by the 1st line)
out of Z and into the IC, after which nothing else can be done until
memory releases the Wait signal (Zout, ICin, WaitM).
The simplifying assumption here is that each instruction occupies a
single word and the machine is word addressable (as opposed to
byte addressable). This speeds up instruction fetch since adding
1 to the IC points the IC to the next instruction. If variable
length instructions are to be employed, then the IC increment must
wait until the IR is fetched. Moreover, the instruction interpreter
must provide the instruction length via a new transfer link (ILout)
for the address calculation.
To complete the instruction fetch, the 3rd line of microcode gates
the retrieved instruction from the MDR onto the bus and into the IR
(MDRout, IRin).
The same instruction fetch sequence starts the microcode for every
machine language instruction!

The 4th line begins the operand fetch, where
the operand address determined by the instruction decoder is
gated onto the bus and into the MAR (Addrout, MARin)
memory is signaled to READ into the MDR the value addressed
by MAR.
To set up for the accumulate, on the 5th microcode line
the value in R0 is gated onto the bus and into Y (R0 out, Yin)
the CPU clock is suspended by issuing a Wait signal.
For accumulate, the 6th line of microcode becomes active when the
Wait signal is released, at which point
the MDR is gated onto the bus, ALU Add is triggered, and the
result is captured in Z (MDRout, ADD, Zin).
The 7th and final line of microcode finishes the accumulate, where
Z is gated onto the bus and into R0 (Zout, R0 in)
End triggers instruction fetch on the next CPU cycle.
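The walkthrough above can be mimicked with a tiny simulator. This Python sketch is our own simplification: memory is treated as synchronous (WaitM becomes a no-op), signal names follow the text, and the op code/operand split assumes the 8-bit/24-bit word layout described earlier.

```python
def step(r, mem, signals):
    # Drive the single bus (at most one "out" signal per line)
    bus = 0
    if "ICout" in signals:   bus = r["IC"]
    if "Addrout" in signals: bus = r["IR"] & 0xFFFFFF  # operand field
    if "MDRout" in signals:  bus = r["MDR"]
    if "R0out" in signals:   bus = r["R0"]
    if "Zout" in signals:    bus = r["Z"]
    if "ClearY" in signals:  r["Y"] = 0
    alu = bus + r["Y"] + (1 if "SetCarry" in signals else 0)
    # Latch the "in" targets (any number may load from the bus)
    if "MARin" in signals:   r["MAR"] = bus
    if "Read" in signals:    r["MDR"] = mem[r["MAR"]]
    if "IRin" in signals:    r["IR"] = bus
    if "ICin" in signals:    r["IC"] = bus
    if "Yin" in signals:     r["Y"] = bus
    if "R0in" in signals:    r["R0"] = bus
    if "Zin" in signals and "Add" in signals:
        r["Z"] = alu

ADD0_MICROCODE = [
    {"ICout", "MARin", "Read", "ClearY", "SetCarry", "Add", "Zin"},
    {"Zout", "ICin"},            # WaitM omitted: memory is synchronous here
    {"MDRout", "IRin"},
    {"Addrout", "MARin", "Read"},
    {"R0out", "Yin"},
    {"MDRout", "Add", "Zin"},
    {"Zout", "R0in"},            # End would restart instruction fetch
]

def run_add0(mem, r0):
    r = {"IC": 0, "IR": 0, "MAR": 0, "MDR": 0, "R0": r0, "Y": 0, "Z": 0}
    for line in ADD0_MICROCODE:
        step(r, mem, line)
    return r
```

With an ADD0 instruction at address 0 whose operand points at a word containing 7, and R0 initially 10, the run ends with R0 = 17 and the IC advanced to 1.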
Since each line of microcode requires a CPU cycle, the accumulate
instruction in this example requires 7 CPU cycles, 3 cycles for
instruction fetch and 4 for the accumulate procedure. Designers spend
a great deal of effort to make architectural adjustments which serve to
reduce the number of CPU cycles required by machine language instructions,
since each machine language instruction will be executed countless
times in the operation of a computer. In particular, since instruction
fetch is used for every instruction, it is advantageous that the
machine architecture be structured for an instruction fetch requiring
as few CPU cycles as possible (hence, they devise strategies such as
pipelines, which can be filled while the current instruction is being
processed, taking advantage of the predictable nature of instruction
fetch).
A rule of thumb in writing microcode is that multiple "in" signals are
permitted on a line of code, but only one "out" signal.
Microprograms:
The microcode for a machine language instruction such as ADD0 is called
a microprogram. It can be stored in a table accessed by a counter.
The instruction fetch is a microprogram in its own right, and for this
architecture would occupy the 1st three lines of the table. When the
instruction fetch triggers IRin, the instruction interpreter decodes
the op code now in the IR to set the counter to point to the
microprogram for the machine language instruction. When the last line
of the microprogram is accessed, the End signal causes the counter to
reset to 0 so the process will repeat, resulting in fetching and
processing the next instruction.
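The sequencing just described — advance by default, dispatch when IRin fires, reset on End — can be captured in a few lines. The dispatch table below is a made-up example, not the notes' actual control store layout:

```python
def next_micro_pc(micro_pc, signals, opcode, dispatch):
    # End resets to 0, restarting instruction fetch; IRin dispatches to
    # the microprogram for the op code just decoded; otherwise the
    # counter simply advances to the next line of microcode.
    if "End" in signals:
        return 0
    if "IRin" in signals:
        return dispatch[opcode]
    return micro_pc + 1
```

With instruction fetch occupying lines 0-2 and (say) ADD0's microprogram starting at line 3, the IRin on fetch line 2 jumps the counter to line 3.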
Instruction fetch assumes that the IC contains the address of the
instruction to transfer in from memory, so the question can be asked:
how does it get an address in the first place? The answer to this
is that the machine has a start/reset button, which forces a hard-wired

address value into the IC when it is pressed. This address points to a
bootstrap program that has been stored in memory (usually as ROM, so it
can't be altered accidentally), which gets the program flow going for a
user. The reset address has to be specified by the CPU designer and
the bootstrap program has to be provided by the computer manufacturer
(on a PC as part of the ROM BIOS).
Translation from a programming language:
Programming languages such as C compile user programs into machine
language instructions such as ADD0. For example, the C statement
x = x + 3;
could compile to the 3 machine language statements:
LOAD0 <addr of x>
ADD0 <addr of the constant value 3>
STORE0 <addr of x>
The address values are those assigned by the C compiler (which is just
another program). A compiler translates program statements to machine
code, and among other things assigns an address location to each
constant and variable, initializing memory for each constant as part
of the process. Before the compiled program can be run, it has to be
located in memory so that addresses match those assigned by the C
compiler. A piece of system software called a loader handles this.
If LOAD0 <addr> means transfer the value at <addr> to R0 and STORE0
<addr> means transfer the value in R0 to memory at <addr>, the above
three statements
1. transfer x to R0
2. add 3 to R0
3. transfer R0 to x
and x has been incremented by 3.
Microcode for LOAD0 and STORE0 is as follows:
LOAD0 <addr>:
Addrout, MARin, Read, WaitM
MDRout, R0 in, End
STORE0 <addr>:
R0 out, MDRin
Addrout, MARin, Write, WaitM, End

Branching:
The Status Register (SR) has condition code (CC) bits which are set by
the ALU Compare instruction. The CC value in conjunction with the ALU
instructions EQ, LT, GT, and NE provide the means for conditional
branches.
When one of the following combinations are in effect:
CC=1 0 and LT
CC=1 1 and GT
CC=0 - and EQ
CC=1 - and NE
ALU input from line "b" is routed to ALU output "c". Otherwise, for
LT, GT, EQ, and NE ALU input from line "a" is routed to ALU output
"c".
Hence, LT, GT, EQ, and NE cause either the bus (input a) or register Y
(input b) to be routed to the ALU output (output c) depending on the
current CC value in the status register (SR).
LT, GT, EQ, and NE allow the construction of conditional branch
instructions. The routing patterns are summarized by the following
diagram:
ALU routing for EQ, NE, LT, GT:

    CC (Compare result)    EQ        NE        LT        GT
    0 0 (bus = Y)          Y → Z     bus → Z   bus → Z   bus → Z
    0 1 (bus = Y)          Y → Z     bus → Z   bus → Z   bus → Z
    1 0 (bus < Y)          bus → Z   Y → Z     Y → Z     bus → Z
    1 1 (bus > Y)          bus → Z   Y → Z     bus → Z   Y → Z
Compare sets the first CC bit to 0 if the value in Y and the value on
the bus are equal. If they are unequal, COMP sets the first CC bit to
1 and sets the 2nd bit to specify either bus < Y or bus > Y.
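The routing table can be checked mechanically. A small Python model of it (our own encoding; `cc` is the pair of condition code bits):

```python
def alu_route(cc, op, bus, y):
    # Input "b" (register Y) is routed to output Z exactly when the
    # branch condition matches the CC bits; otherwise input "a" (the
    # bus) passes through to Z.
    condition_holds = {
        "EQ": cc[0] == 0,
        "NE": cc[0] == 1,
        "LT": cc == (1, 0),
        "GT": cc == (1, 1),
    }[op]
    return y if condition_holds else bus
```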
Example: Suppose that the machine language instruction
BGT <addr>
means branch on greater than to the instruction whose location in
memory is given by <addr>. Here the assumption is that the CC bits
in SR were set by Compare in an earlier instruction and if they are 1
1, <addr> is gated into the IC to replace the address of the next
instruction computed during instruction fetch. Microcode is as
follows:
    <instruction fetch>
    Addrout, Yin      \ set branch options
    ICout, GT, Zin    /
    Zout, ICin, End     set branch

If the value of the CC is 1 1 then GT causes Y to be routed into Z and
the branch is taken. For any other CC value, the bus (which has the
current IC value) is routed into Z and the branch is not taken.

Microcode programming:
A microcode programmer establishes the microcode for the machine
language instructions that comprise the machine language for a given
computing device. The table holding the microcode is sometimes called
the control store and resides in the CPU for rapid access via the
microcode counter. The End signal provides a 0-cycle microbranch to
the instruction fetch. Additional microbranches may be provided to
permit reuse of microcode sequences in addition to the one for
instruction fetch. This is typically the case for microprogrammable
machines, which provide means for making (limited) changes to the
machine's control store. Note that the instruction interpreter is
already making microbranches that aren't reflected in the microcode.
Machine language instructions are represented by mnemonics such as BGT,
BLE, COMP, SUB, MOV12, ADD0, RSHIFT0, J and so forth. Instruction
fetch is the same for all instructions, and its CPU cycle consumption
adds to the CPU cycles consumed by the instruction's microprogram. To
this point machine language instructions ADD0, LOAD0, STORE0 and BGT
have been described. A sampling of others follows.
Other machine language instructions:
BLE calls for a branch if the condition is not BGT. This could be
done by testing for LT and then testing for EQ, but it is better
handled by reversing the branch options for BGT.
BLE <addr>:
    ICout, Yin          (reverses what BGT put on the bus and in Y)
    Addrout, GT, Zin    (set branch options)
    Zout, ICin, End     (set branch)

If CC is 1 1 then GT causes the current IC, which is in Y, to be routed
into Z, meaning the branch is not taken. Hence, the branch is taken
for "not GT". Since LE is the same as "not GT" this microcode
implements BLE.
Compares can always be structured as a comparison with 0 (e.g., A > B is
the same as A-B > 0). Hence, for a COMP machine language instruction,
the strategy can be comparison of the operand with 0. The microcode is
then
COMP <addr>:
Addrout, MARin, Read, WaitM
MDRout, Clear Y, Compare, End
Here the comparator in the ALU is comparing the bus to Y=0, setting the
CC accordingly.
Since COMP only compares to 0, it becomes the machine language
programmer's responsibility to convert A > B to A-B > 0. Machine
language instructions to accomplish this are as follows
LOAD0 A
SUB0 B
STORE0 TEMP
COMP TEMP

Subtract is not quite the same as add, because the order of operands
matters.
The normal assumption is that SUB0 <addr> means subtract the value at
<addr> from R0. Assume also that the ALU Sub signal causes the value
on the bus to be subtracted from Y. The microcode is
SUB0 <addr>:
Addrout, MARin, Read
R0 out, Yin, WaitM
MDRout, Sub, Zin
Zout, R0 in, End
If there is a COMPR0 instruction for comparing R0 to 0, then the
machine language program can be improved; e.g., it can be shortened to
LOAD0 A
SUB0 B
COMPR0
where now the TEMP memory location has been eliminated.
Microcode for COMPR0 is particularly simple:

COMPR0:
    R0 out, Clear Y, Compare, End

Only 1 CPU cycle is required (other than the CPU cycles for
instruction fetch). This is characteristic of register to register
machine language instructions. Register to register instructions are
advantageous because they require no memory access (other than
instruction fetch). If MOV12 means copy R1 to R2, then its microcode
is
MOV12:
    R1 out, R2 in, End

It should be noted that as with COMPR0, only 1 CPU cycle is needed.


In the architecture as described, registers have to be explicitly
identified by the mnemonic for machine language instructions, which is
why neither COMPR0 nor MOV12 required an operand. To identify
registers dynamically, for example, in an instruction such as
MOV <reg-a>,<reg-b>
a mechanism has to be added to the architecture for dynamically
matching <reg-a> and <reg-b> to working registers.
If <reg-a> is specified by 4 bits (providing 16 possible register
ids), an 8 bit operand is sufficient to specify the pair
(<reg-a>,<reg-b>). The architectural addition needed is an 8-bit
register R attached to the bus to serve the purpose of identifying
<reg-a> and <reg-b>. The microcode for a machine language instruction
using register operands then has to load R with the ids of the
registers to be used. To see how this might work, suppose Ra in is a
signal that triggers a decoder which accesses the first 4 bits of the
register identifier and Rb in does the same for the second 4 bits. In

other words, if 0001 in the 1st 4 bits of R identifies R1, then R1 in is
triggered by the 4-to-16 decoder for an Ra in signal. To identify the
register pair <reg-a> and <reg-b>, only the 1st 8 bits of the MOV
instruction's operand field have to be set; e.g., for MOV 1,2 the
operand bits are 00010010.
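Splitting the 8-bit operand field into the two 4-bit register ids is a matter of shifting and masking (sketch):

```python
def decode_pair(operand):
    # High 4 bits name <reg-a>, low 4 bits name <reg-b>
    return (operand >> 4) & 0xF, operand & 0xF
```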
If MOV <reg-a>,<reg-b> means copy the contents of <reg-a> to <reg-b>,
then microcode for MOV <reg-a>,<reg-b> is:
    Addrout, Rin          (move the immediate value to R)
    Ra out, Rb in, End

Addrout in this case is putting register ids on the bus rather than an
address (the operand is the immediate value). This is called immediate
addressing (meaning the address for the operand is the immediate
location on the instruction itself).
The machine language code for copying R1 to R2 is then
MOV 1,2
or for copying R2 to R0 is
MOV 2,0
This kind of enhancement characterizes the release of an extended
version of an existing computer architecture, where upward
compatibility is being sought.
If RSHIFT0 means to shift R0 right by the value of the operand, then
assuming the ShiftR Y command shifts Y by the value given by the bus,
the microcode sequence is
RSHIFT0 <amt>:
R0 out, Yin
Addrout, ShiftR, Zin
Zout, R0 in, End
Note that in this case the address portion of the instruction is
treated as a number. This is another example of immediate addressing,
where it is the immediate value, rather than a value in memory, that is
of interest.
Immediate addressing provides an easy means for establishing values in
registers. For example, if the instruction LOAD0I <addr> means load
the immediate value (namely <addr>) into R0 then the specific value
encoded on the instruction is loaded into R0. More explicitly,
LOAD0I 31
provides means to initialize R0 (in this case to the integer 31).
If the address given by the instruction is the address of the data item
in memory, then the addressing is called direct addressing.
In some circumstances, it is desirable that the address part of the
instruction point to a memory location that holds the address of the
desired data item. This is called indirect addressing. For example,
BGTN could specify a branch on greater than, not to the address, but to

the address stored at the address. This is useful for providing a
table of jump addresses that point to different routines to be
invoked depending on machine state. In contrast to BGT, a memory
access is required to get the address to jump to:
BGTN <addr>:
    Addrout, MARin, Read, WaitM   \ get the indirect address
    MDRout, Yin                   /
    ICout, GT, Zin                  set branch options
    Zout, ICin, End                 set branch
A jump instruction is an unconditional branch and is very simple to
construct:

J <addr>:
    Addrout, ICin, End

Index register:
To process a table, it is useful to have an index register whose value
is automatically added onto the <addr> operand before it is transferred
to the MAR. For example, suppose that the instruction ADD0X means
accumulate in R0 indexed by R2; i.e., R2 is designated to be the index
register. <addr> for ADD0X provides a base address for a table. A
specific entry in the table is obtained by adding R2 to the base
address. The microcode for ADD0X is:
ADD0X <addr>:
    Addrout, Yin            \ adjust address by index
    R2 out, Add, Zin        /
    Zout, MARin, Read       \ operand fetch
    R0 out, Yin, WaitM      /
    MDRout, Add, Zin        \ accumulate
    Zout, R0 in, End        /

If indirect addressing is combined with indexing, a jump table can be
easily processed. A jump table typically holds the addresses of the
programs that a process must select from among dynamically (e.g., an
operating system service routine to process an interrupt flag raised
by a device controller). If JNX <addr> means jump to the machine
language instruction located at the address specified by <addr>, then
the microcode is
JNX <addr>:
    Addrout, Yin                  \ adjust address by index
    R2 out, Add, Zin              /
    Zout, MARin, Read, WaitM        get the indirect address
    MDRout, ICin, End               and move it to the IC

The instruction operand provides the base address for the table.
Incrementing the operand by the index (line 2) changes the address to a
location further along in the table. This is the address of the value
to be retrieved and it is sent to the MAR to retrieve the table entry
(line 3). The retrieved value (line 4) is then transferred to the IC
so that the instruction executed next will be the one whose memory
location is stored in the table.

If the table entries are the addresses of programs, then the effect of
the jump is to start up the program whose address was retrieved from
the table.
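The indexed indirect jump amounts to one add and one memory read. A sketch of the effective-address computation JNX performs (the table contents below are invented for illustration):

```python
def jnx_target(memory, base, x):
    # Effective address = base address + index register; the word
    # stored there is the address of the instruction to execute next.
    return memory[base + x]

# A jump table at address 50 holding three handler addresses
memory = {50: 400, 51: 480, 52: 560}
```

With the index register holding 2, the jump lands at the third entry's target.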
Logically, we have the following hierarchy:

    op code | operand
                |
                +-- immediate value (immediate addressing)
                +-- direct address (direct addressing)
                +-- indirect address (indirect addressing)
These examples demonstrate why it is desirable to have machine language
instructions that utilize immediate or indirect addressing. Suppose
that designated bits within the opcode specify if addressing is to be
immediate, direct, or indirect. Additional micro counter control
signals can be added which respond to these. Let Endi reset the micro
counter to 0 if the designated bits specify immediate addressing. Let
Endd reset the micro counter to 0 if the designated bits specify direct
addressing. With these additional micro branches, a single
microprogram can serve for all 3 addressing modes. A typical way to
specify the addressing mode is to append a qualifier to the instruction
mnemonic; e.g., LOAD0* for immediate, LOAD0 for direct, and LOAD0@ for
indirect. Microcode for LOAD0 that uses this capability is as follows:
LOAD0<qual> <addr>:
Addrout, MARin, R0 in, Read, WaitM, Endi
MDRout, MARin, R0 in, Read, WaitM, Endd
MDRout, R0 in, End
On line 1, the program ends with the immediate value (<addr>) in R0
if addressing is immediate. On line 2, the program ends with the
value directly fetched from memory transferred into R0. Otherwise the
retrieved indirect value is transferred into R0 in line 3.
The Read in line 1 is anticipatory in case addressing is direct. If
addressing is direct, the line 1 transfer into R0 is overridden by line
2. Likewise, the Read in line 2 is anticipatory in case addressing is
indirect, and if so the line 2 transfer into R0 is overridden by line 3.
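The three addressing modes differ only in how many memory references stand between the operand field and the data. A compact Python restatement (memory contents invented for illustration):

```python
def fetch_operand(memory, operand, mode):
    if mode == "immediate":
        return operand                    # the operand field is the value
    if mode == "direct":
        return memory[operand]            # one memory reference
    if mode == "indirect":
        return memory[memory[operand]]    # two memory references
    raise ValueError(mode)

memory = {100: 31, 200: 100}
```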

Simplified Instructional Computer (SIC):
A widely used architecture for instruction in systems architecture and
programming is the SIC machine described by Beck (Systems Software: An
Introduction to Systems Programming, Addison-Wesley). This machine
incorporates the kind of CPU elements that have been discussed and its
machine language can be easily represented using the microprogramming
techniques just covered. First of all, the SIC hardware organization
can be represented by almost the same block diagram exhibited earlier.

[Block diagram: SIC Hardware Organization (separate I/O:memory bus and
CPU:memory bus). Identical in structure to the earlier general
organization: MEMORY (shared data and instructions), CONTROL with
instruction decode, IR (Instruction Register), MAR (Memory Address
Reg), MDR (Memory Data Reg), and the ALU (Arithmetic and Logic Unit
including support registers Y and Z); the working registers are A
(Accumulator), X (Index), and L (Link); the IC is the Program Counter
and the status register is the Status Word (SW). INPUT and OUTPUT
devices attach via a communications link.]

Note that only minor modifications are needed in the structure of this
diagram. In essence, the working register set has been specified to
consist of an accumulator A, an index register X, and a link register
L. The Instruction Counter and Status Register are renamed and no
other changes are necessary.
For the CPU organization, the diagram becomes

[Diagram: Single Bus CPU Organization for Implementing SIC. The same
single-bus structure as before: SW, PC, IR (with instruction decoder &
operand address logic), MDR, MAR, the working registers A, X, L, and
the ALU attach to the CPU bus; the MAR drives the address lines and
the MDR the data lines of the CPU-memory bus; control lines select ALU
operations such as ADD, SUB, etc.]

The SIC ADD instruction accumulates in A instead of R0. COMP is
exactly as described already, BGT is named JGT, and so forth. Load
instructions are dubbed LDA, LDX, and LDL, respectively. Shift is not
provided in the basic SIC machine, but is available under the extended
version (SIC/XE), which requires means for dynamically identifying
registers as related earlier. Arithmetic is available only for
register A for the basic SIC machine, but is available for all
registers under the extended architecture. Immediate and indirect
addressing are available only for the SIC/XE version of the machine.
We've already seen the reason for having a register designated to
provide indexing. The link register is one whose use includes
automatic provision of the return address when jumping to a subroutine.
In the basic SIC machine, this is called JSUB, which simply jumps to
the address given by its operand after setting the link register. The
microcode is as follows:
JSUB <addr>:
ICout, Lin
Addrout, ICin, End
Instruction fetch has already produced the address of the instruction
immediately following the JSUB (the so-called return address). It is a
simple matter to transfer it to register L before changing the IC to
cause the jump to the subroutine.

Page 144
The counterpart to JSUB is RSUB, which jumps to the value given by
register L. Its microcode is:
RSUB:
    Lout, ICin, End
Note that RSUB requires no operand. If the subroutine also invokes
JSUB, then it must first save register L and restore it before
executing RSUB, or the subroutine will return to itself! High-level
languages provide a stack structure so that the programmer doesn't have
to worry about this detail (the current value of L is pushed onto the
stack as part of the subroutine call, and is popped off of the stack as
part of the subroutine return).
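The save/restore discipline can be illustrated with a small sketch (a
toy interpreter, not SIC microcode; PUSH_L and POP_L are hypothetical
operations standing in for the stack handling a high-level language
would generate around a nested call):

```python
# Toy model: JSUB saves the return address in link register L and jumps;
# RSUB jumps to L. A subroutine that itself calls JSUB must save and
# restore L, here done with an explicit stack.

def run(program, start=0):
    """Execute a dict of numbered 'addresses' and return the trace of ICs."""
    ic, L, stack, trace = start, None, [], []
    while True:
        op = program[ic]
        trace.append(ic)
        if op[0] == "JSUB":          # ICout, Lin ; Addrout, ICin
            L, ic = ic + 1, op[1]
        elif op[0] == "PUSH_L":      # save L before a nested JSUB
            stack.append(L); ic += 1
        elif op[0] == "POP_L":       # restore L before RSUB
            L = stack.pop(); ic += 1
        elif op[0] == "RSUB":        # Lout, ICin
            ic = L
        elif op[0] == "HALT":
            return trace
        else:
            ic += 1

program = {
    0:  ("JSUB", 10),   # call outer subroutine, return address is 1
    1:  ("HALT",),
    10: ("PUSH_L",),    # outer subroutine: save L, call inner, restore L
    11: ("JSUB", 20),
    12: ("POP_L",),
    13: ("RSUB",),
    20: ("RSUB",),      # inner subroutine returns immediately
}
```

Removing the PUSH_L/POP_L pair makes the inner JSUB overwrite L with 12,
so the outer RSUB jumps back into the subroutine itself, exactly the
failure described above.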
Architecture enhancements:
By separating the bus into an input bus and an output bus (which can be
selectively tied together), CPU cycles can be saved. Consider

Dual Bus CPU Organization
(block diagram: the registers SR, IC, IR, MAR, MDR, R0-Rn, and the
instruction decoder & operand address logic place values on the OUTPUT
BUS and receive values from the INPUT BUS; the ALU and its condition
code checks (CCC) take operands from the output bus and drive the
input bus; a bus tie (Bt) selectively connects the two buses; the MAR
and MDR attach to the CPU-memory bus with its address lines, data
lines, and control lines for ADD, SUB, etc.)
Register Z has been eliminated in favor of the input bus, and a bus
tie (Bt) allows the two buses to be selectively connected.
Recall that for the single bus architecture we had
ADD0 <addr>:
  Instruction    ICout, MARin, Read, Clear Y, Set carry, Add, Zin
  fetch          Zout, ICin, WaitM
                 MDRout, IRin
  Operand        Addrout, MARin, Read
  fetch          R0out, Yin, WaitM
  Accumulate     MDRout, Add, Zin
                 Zout, R0in, End

Page 145
For the dual bus modification, ADD0 becomes
  Instruction    Btenable, ICout, MARin, Read
  fetch          ICout, Clear Y, Set carry, Add, ALUout, ICin, WaitM
                 Btenable, MDRout, IRin
  Operand        Btenable, Addrout, MARin, Read
  fetch          Btenable, R0out, Yin, WaitM
  Accumulate     MDRout, Add, ALUout, R0in, End

Note that altering the architecture in this manner reduces the CPU cycles
for ADD0 by 1. Basically, splitting the bus eliminates the need for
register Z. Two out signals are now allowed (as is the case for both
line 2 and line 6), but only if they are on different buses and the buses
are not tied.
Another technique that can be used is to utilize both halves of the clock
cycle (half the time it is high, the other half low). By dividing the
circuitry into components that activate on logic high (positive logic)
and components that activate on logic low (negative logic), speed may be
almost doubled. For example, the two lines of microcode from operand
fetch
    Addrout, Btenable, MARin, Read
    R0out, Btenable, Yin, WaitM
do not have any register transfer signal conflicts (an in signal for
the same register on each line), so the first could be accomplished
while the clock is high and the second while the clock is low.
This can be done by setting up the microcode table as two tables, the
first of which provides microcode signals on clock high and the second
on clock low. Each half of the table is addressed via the
microcounter. Hence, table entries that have the same address
represent consecutive lines of microcode. A microprogram will now need
to have an even number of lines, with End appearing on the last line,
even if it is the only control signal on the line. Under this
strategy, the same register cannot be set on consecutive lines of
microcode. Also, since a register sets up when its flip-flop CK lines
go low, an out for a register should not be on the line immediately
following an in. For these reasons, either the first entry or the
second entry of the pair may need to be left <null> (all signals off).
To illustrate, if this approach is used, single bus ADD0 becomes:
  Instruction    ICout, MARin, Read, Clear Y, Set carry, Add, Zin
  fetch          <null>
                 Zout, ICin, WaitM
                 MDRout, IRin
  Operand        Addrout, MARin, Read
  fetch          R0out, Yin, WaitM
  Accumulate     MDRout, Add, Zin
                 <null>
                 Zout, R0in
                 End

CPU time to execute the microprogram is reduced from 7 CPU cycles to 5.
For the dual bus scenario, it can be shown that the time can be reduced
to 3 CPU cycles with only minor code rearrangement.
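The cycle counts for the first three versions of ADD0 can be tallied by
treating each microprogram as a list of control-signal lines, one CPU
cycle per line, or one cycle per high/low pair under the clock-halves
scheme (a bookkeeping sketch, not a simulation of the datapath):

```python
# Each microcode line normally costs one CPU cycle; under the
# clock-halves scheme two consecutive lines (clock-high, clock-low)
# share a single cycle.

single_bus = [
    "ICout, MARin, Read, Clear Y, Set carry, Add, Zin",
    "Zout, ICin, WaitM",
    "MDRout, IRin",
    "Addrout, MARin, Read",
    "R0out, Yin, WaitM",
    "MDRout, Add, Zin",
    "Zout, R0in, End",
]

dual_bus = [
    "Btenable, ICout, MARin, Read",
    "ICout, Clear Y, Set carry, Add, ALUout, ICin, WaitM",
    "Btenable, MDRout, IRin",
    "Btenable, Addrout, MARin, Read",
    "Btenable, R0out, Yin, WaitM",
    "MDRout, Add, ALUout, R0in, End",
]

half_cycle = [  # (clock-high line, clock-low line) pairs
    ("ICout, MARin, Read, Clear Y, Set carry, Add, Zin", "<null>"),
    ("Zout, ICin, WaitM", "MDRout, IRin"),
    ("Addrout, MARin, Read", "R0out, Yin, WaitM"),
    ("MDRout, Add, Zin", "<null>"),
    ("Zout, R0in", "End"),
]

print(len(single_bus), len(dual_bus), len(half_cycle))  # cycles: 7 6 5
```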

Page 146
CPU-memory synchronization:
At the microcode level, the CPU can trigger memory Read, Write, and
WaitM signals. Circuitry for how the signals are used to synchronize
the data transfers is as follows:
(circuit and timing diagram for CPU-memory synchronization via the
Read, Write, and WaitM signals)
On the CPU side, a flip-flop (Mhold) is set whenever Read or Write
goes to 1 while memory is busy (i.e., while Enable is 1). If Read or
Write is active and memory is busy, taking WaitM to 1 disables the CPU
clock, effectively putting the CPU to sleep until Mhold is cleared by
Enable going to 0. On the memory side, a flip-flop Mx follows Mhold on
the trailing edge of the asynchronous memory clock, Clock 2; Enable
falls to 0 when Mx = 1 and Clock 2 rises, clearing Mhold. Memory setup
occurs while Enable is 1; the memory action occurs when Enable goes to
0. Synchronization between the CPU and memory thus occurs when Enable
clears Mhold by going to 0.
Page 147
The timing considerations are given by the timing diagram, which shows
two typical cases on the CPU clock line:

1. A Read issued on one CPU cycle followed by WaitM issued on the next
   CPU cycle; e.g.,
       Zout, MARin, Read
       R0out, Yin, WaitM
2. Both Read and WaitM issued on the same CPU cycle; e.g.,
       Addrout, MARin, Read, WaitM

If Mx = 0, Enable holds at 1. If Mx = 1, then when the asynchronous
clock signal (Clock 2) rises to 1, Enable falls to 0 and the Mhold ff
is cleared, with Mx falling to 0 when the clock falls to 0. Hence,
when neither the Read nor the Write signal is active, Enable = 1 and
Mhold = 0. This is how the timing diagram starts.
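Ignoring the edge-triggered latching, the gating rule can be sketched
combinationally (a simplification of the circuit above, with signals
modeled as 0/1 values; the real circuit latches Mhold in a flip-flop
and samples it only when Clock 1 falls):

```python
def mhold_next(mhold, read, write, enable):
    # Mhold is set when Read or Write goes to 1 while memory is busy
    # (Enable = 1); it is cleared when Enable goes to 0.
    if enable == 0:
        return 0
    return 1 if (read or write) else mhold

def cpu_clock(clock1, mhold, waitm):
    # WaitM holds the CPU clock low only while Mhold is still set.
    return clock1 and not (mhold and waitm)
```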
In the first case, when Read goes to 1 in the CPU, the Mhold ff is set
to 1 and Mx goes to 1 when Clock 2 falls. This in turn takes Enable
to 1 and memory sets up MDR based on the value in the MAR. When Clock
2 rises again, Enable falls to 0 (note that memory has had full set-up
time as represented by Clock 2) and Mhold is cleared. WaitM has been
issued, but has no effect since the CPU clock only responds to Mhold
when Clock 1 falls. It should be noted that the CPU clock operates in
phase with Clock 1 except when held low by the synchronizing ff. In
the case illustrated, no CPU cycles are lost and the MDR is available
on the next clock cycle.
In the second case, when Mhold rises to 1 (suspending the CPU clock
because WaitM = 1), Mx doesn't rise as quickly to 1, because
asynchronous match up of Clock 1 and Clock 2 is in a worst case
scenario and Mx only rises when Clock 2 falls again. When Mx does
rise to 1, memory has had a full setup period (Clock 2 low) for when
the CPU clock resumes in synch with Clock 1. Note that 2 CPU cycles
have been lost.
Computer architects seek to define means to eliminate these kinds of
wait states between memory and CPU. This may involve using cache
memory (a fast intermediate memory between main memory and the CPU) to
reduce clock differences. Of course, when the data is not in cache,
the cache has to be reloaded, which may cost some CPU cycles. Another
technique is to pipeline data into CPU registers, so that in many
cases the next item is in the pipeline. The cache can be loaded while
the pipeline is being processed, and if more than one pipeline is
employed, one pipeline can be loaded while another is being processed.
If successive memory retrievals cross wide stretches of memory, then
neither caching nor pipelining will help (and may actually hinder,
because loading them requires time). This is normally not an issue,
since typical programs operate within a compact area of memory.

Page 148
Inverting microcode:
Microcode can be inverted to form a large logic circuit by examining
which microcode signals are on at each time step T1, T2, ..., Tn.
The microprogram sequences are examined at each of T1, T2, ..., Tn for
those signals each microprogram turns on. For example, Zout is on at T2
for every instruction (since it is in line 2 of instruction fetch). It
is also on at T6 for BGT, BLE, RSHIFT0, ADD0X, and JNX, at T7 for ADD0,
SUB0, and BGTN, and at T9 for ADD0X.
The Zout signal is then set in the large logic circuit via the
combinational equation
Zout = T2 + T6 (BGT+BLE+RSHIFT0+ADD0X+JNX) +
T7 (ADD0+SUB0+BGTN) + T9 ADD0X + ...
Similarly, ICout is set in instruction fetch and in branch instructions,
leading to the combinational equation
ICout = T1 + T4 (ADD0+JSUB) + T5 (BGT+BGTN) + ...
Specialized signals such as WaitM and End are also represented in
combinational equations:
WaitM = T2 + T4 (LOAD0+COMP+BGTN) + T5 (ADD0+SUB0+STORE0) +
T6 JNX + T7 ADD0X + ...
End = T4 (COMPR0+MOV12+J) + T5 (LOAD0+STORE0+COMP) +
T6 (BGT+BLE+RSHIFT0) + T7 (ADD0+SUB0+BGTN+JNX) +
T9 ADD0X + ...
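The inverted equations behave like ordinary combinational logic and can
be sanity-checked in a short sketch (using only the instruction and
signal names from the examples above; the elided "..." terms are
omitted):

```python
# Each equation becomes a boolean function of the active time step T
# and the decoded instruction name.

def zout(T, instr):
    return (T == 2
            or (T == 6 and instr in {"BGT", "BLE", "RSHIFT0", "ADD0X", "JNX"})
            or (T == 7 and instr in {"ADD0", "SUB0", "BGTN"})
            or (T == 9 and instr == "ADD0X"))

def icout(T, instr):
    return (T == 1
            or (T == 4 and instr in {"ADD0", "JSUB"})
            or (T == 5 and instr in {"BGT", "BGTN"}))
```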
In this manner the combinational logic for setting signals at each
point of the instruction counter is described. A block diagram for
the CPU as a circuit is then given by:

(block diagram of the CPU as a circuit)
The clock drives a counter (with reset and inhibit inputs) whose value
is decoded into the time steps T1, T2, ..., Tn. The IR feeds the
instruction decoder, which raises one instruction line (ADD0, SUB0,
BGT, ...). These lines, together with status flags (e.g., Mhold) and
the condition codes, feed the logic circuit that sets the control
signals (..., WaitM (& Mhold), ..., End).

Page 149
Either approach will control the register transfer requirements
specified by the microcode. In contrast to using a generic component
which applies microcode from a table to set control signals, the large
circuit is cast in concrete as a combinational circuit. The gain is
in efficiency. The loss is that making changes to the system
microcode requires major circuit modification.
Modern microprocessors employ microcode tables embedded in firmware,
so that a need to make changes to microcode only involves modifying
the embedded table rather than the other circuitry. As a case in
point, some years ago when the Intel Pentium was found to have a
computational bug in its floating point routines, Intel was able to
very quickly issue replacement processors which corrected the problem
because the floating point operations were defined by microcode.
Vertical vs. horizontal microcode:
The microcode as examined to this point has been viewed horizontally
as a sequence of bits. Manufacturers often group logically related
signals in much the manner used earlier to dynamically identify
registers. For example, a 1 of 16 decoder can be used to select a
signal using just 4 bits, a reduction of 12 microcode bits. This is
OK so long as no more than 1 of the 16 signals needs to be selected at
a time. In particular, since only 1 out signal can be selected at a
time, all out signals could be selected in this fashion. Microcode
employing this technique is called vertical microcode.
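A sketch of the idea (the signal list here is an illustrative subset,
not the actual microword layout; any of up to 16 mutually exclusive
signals is encoded in a 4-bit field and recovered by a 1-of-16
decoder):

```python
# Vertical encoding of the mutually exclusive "out" signals: a 4-bit
# field plus a 1-of-16 decoder replaces 16 horizontal microcode bits.

OUT_SIGNALS = ["<none>", "ICout", "Zout", "MDRout", "Addrout",
               "R0out", "Yout", "Lout", "SWout"]  # illustrative subset

def encode(signal):
    """4-bit field value for the selected signal."""
    return OUT_SIGNALS.index(signal)

def decode(field):
    """1-of-16 decoder: exactly one output line is raised."""
    lines = [0] * 16
    lines[field] = 1
    return lines
```

Since only one register may drive the bus at a time, encoding all the
out signals this way loses nothing, while shrinking each microword.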
Managing the CPU and peripheral devices:
The CPU is the central resource for a computer, and its failure
precludes any utilization of the system otherwise. Moreover, whenever
the CPU clock has been inhibited, the system is effectively shut down,
so steps that reduce the probability of this occurring are advisable.
For example, if the CPU sends a signal to a printer and the CPU clock
is inhibited until the printer responds, no matter the state of the
rest of the system, the computer is effectively shut down until a
signal is received from the printer (perhaps the printer has not been
turned on, or there is a cable problem). This tactic was commonly
employed by earlier computers.
Memory-CPU synchronization is always necessary because of the tight
coupling between the memory and CPU for instruction fetch, which means
a possibility always exists for the synchronization circuitry to
inhibit the CPU clock. Tactics such as instruction pipelines and
memory caches are used to minimize this possibility.
Peripheral devices are not tightly coupled to the CPU, so peripheral-CPU synchronization does not have to be directly achieved. The tactic
employed is called direct memory access. Direct memory access takes

Page 150
advantage of the fact that memory is not driven by a counter (in
contrast to the CPU). For this reason data transfers between a
peripheral device and memory can take place without suspending the
memory clock to wait for the device to respond. Since peripheral
devices operate at considerably slower speeds than either CPU or
memory, a number of clock cycles may go by before a device response
takes place, during which time there can be continued memory-CPU
activity.
When using direct memory access, peripheral-CPU synchronization is
taken care of indirectly by memory-CPU synchronization, and the CPU
clock does not need to be inhibited while waiting for peripheral
device response.
When a program initiates a peripheral data transfer, the program
usually must pause until the transfer has been accomplished. The CPU
provides the signals that control a peripheral device's behavior and
there may be a driver program that causes the peripheral to step
through its physical requirement. A peripheral device usually has its
own controller, which responds to the signals received from the
driver. Regardless of strategy, a program handling a peripheral data
transfer will reach a point where it can go no further without a
response from the peripheral device.
To meet the objective of keeping the valuable resource, the CPU, from
being held up by slow peripheral response times, it is evident that
means are needed to switch from a waiting program to one ready to run.
This is normally accomplished by maintaining multiple programs in
memory, devising means both for keeping track of these programs and
for switching off to one of them when the currently executing program
must pause. This is a primary task of the modern operating system.
At the core of the operating system there is a supervisor program
whose job is simply to manage other programs that are in memory. When
a program wants to access a peripheral device it does so by executing
a supervisor call (SVC) machine language instruction. The
supervisor does housekeeping (saving the state of the program that
executed the SVC), initiates the peripheral data transfer, and turns
the CPU over to a new program (via a machine language instruction such
as JNX, after restoring the state of the new program). In this way
the CPU no longer gets suspended by programs that initiate peripheral
data transfers.
When a program is suspended, the current machine state (register
values, including SR and the IC) must be saved. Note that the
microcode for the SVC must save the program's current IC (in the
manner of an RSUB) since starting the supervisor program changes the
IC. Also, means must be provided to capture the SR. In making the

Page 151
context switch to the new program, the supervisor must restore the
machine state of the program being resumed. This information is
maintained in state tables that are under control of the supervisor.
It is important that the supervisor program periodically resumes
execution so that every program in memory gets a turn with the CPU.
Since SVC commands for peripheral devices access may occur
erratically, a timer is needed so that in the absence of any program
executing an SVC, program control returns to the supervisor after a
defined period of time has elapsed. This implies that a timer
interrupt is needed to force a null SVC if a peripheral access has
not occurred in the meantime. Both the timer and an interrupt
capability represent an added hardware need.
At the hardware level, an interrupt is just a signal which when
present redirects the End microbranch to branch to a microprogram that
captures the IC and starts the supervisor (via a microbranch to the
SVC microprogram).
The interrupt capability can also be used as the means for a
peripheral device to signal that it's done. When the supervisor
program is run in response to an interrupt from a peripheral device,
it conducts a context switch and resumes the program that executed the
SVC which originated the peripheral device access.
To determine the source of an interrupt, the supervisor needs to
maintain information to match peripheral devices and programs that
have a pending SVC action. For a timer interrupt, the supervisor
simply needs to make a context switch to another program that is ready
to run.
Since the supervisor program should not be interrupted, means are also
needed to mask interrupts while the supervisor program is executing.
A mask is just a (bit) signal which when present keeps an interrupt
from manifesting itself; e.g., the interrupt line can be ANDed with
the complement of the mask bit.
Masking bits are set by the microcode of the SVC instruction, to be
relinquished when the supervisor completes the context switch. The
supervisor also must be able to deactivate an interrupt signal it has
serviced so that the interrupt won't immediately manifest itself again
on release of the mask.
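The mask and the deactivation step can be sketched with interrupt
requests and masks held as bit vectors (a model of the gating, not
actual hardware):

```python
def pending(requests, mask):
    """AND each interrupt line with the complement of its mask bit."""
    return requests & ~mask

# e.g., devices 0 and 2 requesting, with device 2 masked during the SVC:
requests, mask = 0b101, 0b100
serviceable = pending(requests, mask)   # only device 0 manifests
requests &= ~serviceable                # supervisor deactivates it
```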
These kinds of considerations are covered in the context of an
operating systems course. In addition to providing this capability,
the hardware also needs to support a capability of having privileged

Page 152
instructions (instructions that can only be used if the privilege
signal has been activated; the SVC turns on this signal, in
particular, so that the supervisor program can run privileged
instructions). Privileged instructions (e.g., direct I/O
instructions) are ones reserved for use of operating system software.
They typically are instructions whose use in ordinary programs could
compromise the operating system's ability to manage the CPU (e.g.,
using a privileged I/O instruction leads to an interrupt when the I/O
operation completes; the supervisor only has the means for handling
the interrupt if it is the one issuing the I/O instruction).
A +5V commercial microprocessor, the Z80:
The Zilog Z80 microprocessor is an 8-bit processor that was first
issued in 1976. Running a superset of the Intel 8080 instruction set,
the chip was in wide use by 1980, perhaps most notably in the Radio
Shack TRS-80, which was the first personal computer made available via
a mass distributor, foretelling the future direction computing was to
take with desktop machines.
The Z80's advantages (low cost, +5V compatibility) have made it a
favorite to this day, although it is now used primarily for embedded
applications where processing power is not an issue (e.g., device
controllers).
The features of the Z80 are as follows:
    8-bit CPU in a 40-pin package
    16 address lines
    8 data lines
    13 control lines
    power, ground, clock
    158 instructions forming a superset of the Intel 8080
    64KB address space
Another interesting feature of the Z80 is that it has a duplicate set
of registers to support making a context swap on an interrupt; i.e.,
the registers of the interrupted program do not necessarily have to be
saved (however, if a 2nd interrupt can occur, a register save will be
needed).

Page 153
The chip pin-outs are as follows:
    Addressing:            A0-A15 (pins 30-40, 1-5)
    Data:                  D0-D7 (pins 14, 15, 12, 7-10, 13)
    Memory & I/O control:  MREQ (19), IORQ (20), RD (21), WR (22),
                           WAIT (24)
    Bus control:           BUSRQ (25), BUSAK (23)
    Interrupt control:     INT (16), NMI (17)
    Miscellaneous:         Reset (26), M1 (27), RFSH (28), HALT (18)
    Clock and power:       CK (6), +5V (11), GND (29)
Bus control enables the CPU to share the data bus with another device.
To access the bus, the device signals its request via BUSRQ. When the
CPU finishes its current operation it floats the address lines, data
lines, I/O control and memory control lines, and signals back via
BUSAK. The device is responsible for sending an interrupt signal to
reactivate the CPU when it is finished with the bus.
The MREQ line signals that the MAR is ready for a Read or Write
operation.
The IORQ line signals that the first 8 bits of the address bus have a
valid I/O address for an I/O Read or Write operation. This signal is
not used if memory-mapped I/O is being employed.
The RD and WR lines apply to both memory and I/O operations.
The WAIT line is used by memory or an I/O device to signal the CPU to
enter a wait state until the signal is released (memory refresh
continues via NOP operations; see below).
The INT line is for maskable interrupts (the command set provides the
software controls).

Page 154

The NMI line is for non-maskable interrupts.
The Reset line resets the internal CPU status and resets the
instruction counter to 0.
The M1 line is a signal that is output at the start of instruction
read (more than one memory fetch is necessary to get the whole
instruction). The Z80 allocates extra time to the op code read to
provide time for refreshing dynamic memory. During the 2nd half of the
opcode read, a counter value is placed on the first 7 address lines
for the memory bank in need of refresh and the RFSH signal is raised.
The HALT signal stops CPU activity (except that NOPs continue to be
executed to maintain memory refresh); an interrupt is needed for the
CPU to resume.
The Z80 can be clocked cycle by cycle via the clock input.
Many designs of simple Z80 implementations have been devised. The
following 6-chip design is from Tanenbaum.

(block diagram of the 6-chip design)
The Z80 shares its 8-bit data bus with a 2K x 8 RAM, a 2K x 8 EPROM,
and a PIO. Address lines A0-A10 go to the RAM and EPROM, while A0 and
A1 go to the PIO; unused address lines float. A14 and A15 drive the
chip select (CS) logic, and the MREQ, RD, and WR lines drive the CS,
OE, and R/W inputs of the memory and PIO chips. A push button provides
the Reset signal.
Addressing (chip selects are active on signal LOW):
    A15  A14
     0    x    = EPROM
     1    0    = RAM
     1    1    = PIO (memory mapped)
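This decode can be expressed as a small sketch (returning the selected
device name rather than modeling the active-LOW select lines
themselves):

```python
def select(addr):
    """Map a 16-bit address to the device chosen by A15 and A14."""
    a15 = (addr >> 15) & 1
    a14 = (addr >> 14) & 1
    if a15 == 0:
        return "EPROM"                      # 0x0000-0x7FFF
    return "RAM" if a14 == 0 else "PIO"     # 0x8000-0xBFFF / 0xC000-0xFFFF
```

On Reset the Z80 starts at address 0, which this decode maps to the
EPROM holding the control program.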

Page 155
The PIO is a (+5V compatible) chip providing parallel I/O ports.
EPROM is erasable programmable read-only memory, which can be erased
using a strong ultraviolet light source and programmed using an EPROM
programmer. SRAM is static random access memory, a designation for
memory that does not need to be refreshed to maintain its values
(i.e., it is composed of flip-flops). The counterpart, dynamic memory,
requires periodic refreshing, and uses a different technology than
gate logic. Dynamic memory provides greater capacity for less cost,
but at the expense of speed.
The design is complete except that a control program is needed for the
EPROM (systems software for I/O, including a display device, in
particular), a CPU clock is needed, and a power supply is needed (3
D-cell batteries will do). Note that address 0 maps to the EPROM,
which is where the Z80 initiates program load on Reset.
Representative pricing for the configuration is as follows:
    Z80                 $1.39
    PIO (MK 3881)       $1.49
    7400 chip           $0.65
    7410 chip           $0.29
    2016 2K SRAM        $1.39
    2716 2K EPROM       $2.25
    total               $7.46

Page 156
INDEX
-notation ........................... 23
-notation ........................... 23
2421 BCD representation .............. 14
2's complement ....................... 10
4-bit parallel adder ................. 40
4-bit parallel subtractor ............ 41
9's complement ....................... 14
Absorption property .................. 18
Accumulator .................... 130, 133
Adder
4-bit parallel adder ............... 40
BCD adder .......................... 42
carry anticipation ................. 40
full adder ......................... 39
half adder ......................... 39
Adders
Sequential binary adder ............ 67
Addressing modes
direct addressing ............ 139, 141
immediate addressing ......... 139, 141
indirect addressing .......... 139, 141
Alkaline battery ..................... 57
Alternating current .................. 57
ALU ............................. 96, 129
Amperes .............................. 56
AOI gates ............................ 42
Arithmetic and logic unit ....... 96, 129
ASCII ................................ 13
Associative property ................. 17
Barrel Shifter ....................... 77
Base address ........................ 140
Batteries ............................ 57
in series .......................... 57
BCD adder ............................ 42
BCD to 7-segment display decoder/driver
................................... 36
Binary operations ..................... 1
AND ................................. 4
COINCIDENCE ......................... 4
NAND ................................ 4
NOR ................................. 4
One ................................. 2
OR .................................. 4
table of binary operations .......... 3
XOR ................................. 4
Zero ................................ 2
Boolean algebra ...................... 16
absorption property ................ 18
associative property ............... 17
commutative property ............... 17
complement property ................ 18
DeMorgan property .................. 18
Distributive property .............. 17
duality ............................ 17
generalized DeMorgan property ...... 20
idempotent property ................ 18
identity property .................. 18
involution property ................ 18
one ................................ 16
zero ............................... 16
zero and one property .............. 17
Boolean operations ................... 16
for circuits ....................... 16

for sets........................... 16
for truth table logic.............. 16
Booth's method...................... 114
UNF RTL program................... 116
Bootstrap program................... 135
Branch instruction.................. 130
Branching........................... 136
Bus tie............................. 144
Byte............................. 14, 86
gigabyte........................... 14
K-byte............................. 14
megabyte........................... 14
terabyte........................... 14
Canonical forms...................... 22
Canonical product of sums............ 23
Canonical sum of products............ 23
Carry anticipation................... 40
CC.................................... 6
Central Processing Unit............. 128
Character representation............. 13
ASCII.............................. 13
EBCDIC............................. 13
Characteristic table................. 51
Chip select.......................... 43
Circuit design
combinational circuits............. 33
Circuit simplification............... 25
circular shift....................... 64
Clear signal......................... 65
Clock
asynchronous....................... 72
speed.............................. 88
Combinational circuit analysis....... 25
Combinational circuits design process 33
Common cathode........................ 6
Commutative property................. 17
Comparators.......................... 46
Complement property.................. 18
Computer organization............... 128
Context switch...................... 151
Control store....................... 137
Control unit........................ 129
Coulomb.............................. 56
Counter design....................... 70
Counters............................. 65
Johnson counter.................... 76
mod 2n ripple counter .............. 65
n-stage counter.................... 75
self-starting...................... 74
sequential design.................. 70
shift-register..................... 75
switch-tail counter................ 76
CPU................................. 128
ALU commands...................... 132
arithmetic and logic unit......... 129
arithmetic and working registers.. 129
control unit...................... 129
gating signals.................... 132
index register.................... 140
instruction counter............... 130
instruction register.............. 129
machine language instruction...... 132
managing peripherals.............. 149

Page 157
memory address register ........... 129
memory data register .............. 129
memory-I/O control signals ........ 132
micro counter control signals ..... 132
register-bus gating ............... 131
status register ................... 130
timer interrupt ................... 151
working registers ................. 130
CPU organization
dual bus .......................... 144
SIC ............................... 143
CPU-memory synchronization .......... 146
D flip-flop .......................... 58
Data bit .............................. 6
Debouncing a switch .................. 55
Decoder
1 of 2n decoder..................... 43
BCD to 7-segment display ........... 36
Gray to binary ..................... 36
Decoders/demultiplexers .............. 43
DeMorgan property .................... 18
Demultiplexer ........................ 43
Device interrupt .................... 151
Direct addressing .............. 139, 141
Direct memory access ................ 149
Distinguished cell ................... 31
distributive property ................ 17
D-latch .............................. 58
DMA ................................. 149
Don't care cell ...................... 31
Double precision floating point ...... 98
EBCDIC ............................... 13
Enable ............................... 43
End-around carry ..................... 11
EPROM ............................... 155
EPROM memory ......................... 85
Error correcting code ................ 95
Essential prime implicant ............ 31
Even parity .......................... 66
Excess-3 BCD ......................... 14
Excitation controls .................. 61
Extended precision floating point .... 98
Field programmable gate arrays ....... 86
Finite state automaton ............... 66
Flip-flop ............................ 56
Flip-flops ........................... 50
D flip-flop ........................ 58
edge-triggered ..................... 54
excitation controls ................ 61
JK flip-flop ....................... 61
Master-Slave ....................... 54
T flip-flop ........................ 60
Floating point numbers ............... 96
addition/subtraction .......... 97, 100
algorithm for addition/subtraction 125
division .......................... 100
guard bits ......................... 99
multiplication .................... 100
multiplication/division ............ 96
normalization ...................... 96
normalized form .................... 98
rounding strategies ................ 99
UNF RTL for addition/subtraction .. 126
Full adder ........................... 39
Full subtractor ...................... 41
Generalized DeMorgan property ........ 20

Gigabyte............................. 14
Glitch............................... 79
Glitches and hazards................. 78
GND................................... 6
Gray Code............................ 15
Gray to binary decoder............... 35
Ground................................ 6
Guard bits........................... 99
Half adder........................... 39
Half subtractor...................... 40
Hamming code......................... 93
Hazard............................... 79
Hertz................................ 88
Hexadecimal........................... 7
Horizontal microcode................ 149
I/O buffer.......................... 129
IC.................................. 130
Idempotent property.................. 18
IEEE 754 floating point representation
................................... 97
IEEE 754 Floating Point Standard..... 98
IEEE floating point standard
biased exponent.................... 98
exponent all 0's................... 98
implied leading 1.................. 98
Immediate addressing........... 139, 141
Immediate value..................... 139
Implicant............................ 30
Implicate............................ 30
Implied leading 1.................... 98
Index register................. 130, 140
Indirect addressing............ 139, 141
Instruction Counter................. 130
Instruction fetch.............. 129, 133
Instruction register................ 129
Integer arithmetic
Booth's method.................... 114
non-restoring division............ 121
restoring division................ 117
UNF RTL for Booth's method........ 116
UNF RTL for non-restoring division 123
UNF RTL for restoring division.... 119
UNF RTL for signed multiply....... 113
Integer multiplication
Booth's method.................... 114
signed multiply................... 112
Integers.............................. 6
1's complement representation...... 11
2421 BCD representation............ 14
2's complement representation...... 10
9's complement representation...... 14
base representation................. 6
excess-3 BCD....................... 14
hexadecimal......................... 7
octal............................... 7
self-complementing representation.. 14
sign-magnitude representation....... 8
Interrupt
device............................ 151
mask.............................. 151
timer............................. 151
Inverting microcode................. 148
Involution property.................. 18
IR.................................. 129
JK flip-flop......................... 61
Johnson counter...................... 76
Joule's Law ......................... 57
Jump instruction .................... 140
Jump table .......................... 140
Karnaugh maps ........................ 25
K-byte ............................... 14
K-maps ............................... 25
distinguished cell ................. 31
don't care cell .................... 31
essential prime implicant .......... 31
general procedure .................. 31
implicant .......................... 30
implicate .......................... 30
prime implicant .................... 31
Latch ................................ 55
Latches .............................. 50
D-latch ............................ 58
SR-latch ........................... 51
Leading edge ......................... 54
Logic functions ....................... 1
composite functions ................. 5
truth table representation .......... 5
Logic gates ........................ 2, 4
ANSI symbols ........................ 4
Logic signals ......................... 6
False ............................... 6
high ................................ 6
low ................................. 6
True ................................ 6
Machine language .................... 137
Machine language instruction ........ 132
Machine language instructions ....... 137
MAR ................................. 129
Mask ................................ 151
Master-Slave flip-flop ............... 54
Maxterm .............................. 22
MDR ................................. 129
Mealy circuit ........................ 66
Megabyte ............................. 14
Megaflop ............................. 88
Memory .......................... 83, 128
CD-ROM ............................. 86
dynamic RAM ....................... 155
FPGA ............................... 86
PLA ................................ 85
RAM ................................ 85
ROM ................................ 85
static RAM ........................ 155
word size .......................... 86
Memory address register ............. 129
Memory address space ................ 132
Memory data register ................ 129
Microbranch ......................... 137
Microcode ........................... 132
branching ......................... 136
control store ..................... 137
End signal ........................ 134
horizontal ........................ 149
horizontal microcode .............. 132
instruction fetch ................. 133
inverting to obtain a circuit ..... 148
microbranch ....................... 137
vertical .......................... 149
Microcode programming ............... 137
Microprogrammable machine ........... 137
Microprograms ....................... 134
Milliamp ............................. 57
Minterm.............................. 22
Moore and Mealy circuits............. 72
Moore circuit........................ 66
Multiplexers......................... 44
used to implement a logic function. 45
Multiplier........................... 40
NAND conversions..................... 23
Negative logic........................ 6
Next state equation.................. 53
Next state function.................. 66
NiCad battery........................ 57
Non-restoring division.............. 121
UNF RTL program................... 123
NOR conversions...................... 23
Normalization........................ 96
Normalized form...................... 98
n-stage counter...................... 75
Numeric data.......................... 6
integers............................ 6
real numbers........................ 6
Octal................................. 7
Odd parity........................... 66
Ohm's Law........................... 56
Ohms................................. 56
Operating system.................... 150
OR-AND conversions
to NAND-AND........................ 24
to NAND-NAND....................... 24
to NOR-NOR......................... 24
Parity bit........................... 66
even parity........................ 66
odd parity......................... 66
Peripheral devices.................. 149
Peripherals......................... 128
Picosecond........................... 88
PIO................................. 155
Prime implicant...................... 31
Programmable logic arrays............ 86
Propagational delay.................. 78
Pull-up resistor..................... 55
Quine-McCluskey procedure............ 48
RAM memory........................... 85
Real numbers......................... 13
addition and subtraction........... 97
fixed point representation......... 13
floating point numbers............. 96
guard bits......................... 99
IEEE 754 Floating Point Standard... 98
multiplication and division........ 96
normalization...................... 96
normalized......................... 98
rounding strategies................ 99
Register transfer architecture...... 101
Register transfer language.......... 102
Register transfer logic............. 101
Register-Bus gating................. 131
Registers............................ 64
Residue classes....................... 9
Restoring division.................. 117
UNF RTL program................... 119
ROM memory........................... 85
Rounding............................. 99
RTL................................. 102
implementing control logic........ 104
implementing transfer logic....... 105
UNF RTL........................... 106
Self-complementing representation .... 14
Self-starting counter ................ 74
Sequential binary adder .............. 67
Sequential circuit design ............ 66
Sequential circuit design process .... 68
Sequential circuits
analysis ........................... 72
Set-Reset latch ...................... 51
Setup time ........................... 50
Shift-register counter ............... 75
SIC machine ......................... 142
single bus CPU organization ....... 143
Signed multiply
algorithm ......................... 112
architecture ...................... 112
UNF RTL program ................... 113
Single pole, double throw switch ..... 55
Single precision floating point ...... 98
SN7447
BCD to 7-segment display decoder/driver 36
SR status register ................ 130
SRAM ................................ 155
SR-latch ............................. 51
Standard resistor values ............. 56
State diagram .................... 50, 66
Status register ..................... 130
Subtractor ........................... 40
4-bit parallel subtractor .......... 41
full subtractor .................... 41
half subtractor .................... 40
supervisor call ..................... 150
supervisor program .................. 150
SVC ................................. 150
Switch-tail counter .................. 76
T flip-flop .......................... 60
Terabyte ............................. 14
Timer interrupt ..................... 151
Trailing edge ........................ 54
Unary operations ...................... 2
complement ....................... 2, 4
identity ............................ 2
UNF RTL ............................. 106
arithmetic compare
AEQ, ANEQ ....................... 109
AGT, AGE ........................ 109
ALT, ALE ........................ 109
assignment statement .............. 107
basic structure ................... 106
Boolean logic operations
AND ............................. 109
COINC ........................... 109
NAND ............................ 109
NOR ............................. 109
NOT ............................. 109
OR .............................. 109
XOR ............................. 109
conditional branch................ 107
conditional execution............. 107
DECODE, ENCODE.................... 109
decrement by 1
DECREMENT ....................... 109
description....................... 106
dyadic operators.................. 109
expressions....................... 107
increment by 1
INCREMENT ....................... 109
labels............................ 107
logical and arithmetic shifts
LASHIFT ......................... 109
LLSHIFT ......................... 109
LROTATE ......................... 109
RASHIFT ......................... 109
RLSHIFT ......................... 109
RROTATE ......................... 109
logical compare
LEQ, LNEQ ....................... 109
LGT, LGTE ....................... 109
LLT, LLTE ....................... 109
merge............................. 107
monadic operators................. 109
naming registers and buses........ 106
reformat of user input
decTOtwo, hexTOtwo .............. 109
register transfer................. 107
string manipulation
FIRST, LAST ..................... 109
two's complement arithmetic
ADD ............................. 109
DIV ............................. 109
MUL ............................. 109
SUB ............................. 109
twosCMPL.......................... 109
twoTOdec, twoTOhex................ 109
ZERO.............................. 109
Vertical microcode.................. 149
Voltage.............................. 56
Von Neumann architecture............ 129
Watt hours........................... 57
Word size............................ 86
Z80................................. 152