CHAPTER 1
Introduction to digital design
The digital design revolution has largely taken over from analogue design and now dominates much of the electronic design industry. Among its other features, design automation is a well-established reason for this dominance. This chapter introduces the world of digital design and travels through several fundamental topics: state-of-the-art tools, hardware development, recent advancements and complex digital design algorithms. The first section covers the basic terminology and concepts of digital design.
Figure 1-1: Algorithm mapping inputs to outputs (This is what digital design is about,
designing algorithms and using automated techniques to map them on silicon)
A digital designer has to search for an optimal transfer function, one that takes less area and provides high performance. The goals of optimality are bounded by four criteria:
▪ Area: the design should consume as little area as possible.
▪ Performance: the design should run fast and deliver high performance.
▪ Power: the design should consume an acceptable amount of power, so that it can run longer on a battery and dissipate less heat.
▪ Testability: the design should be testable after it is developed, so that faults can easily be traced.
In an ideal world, a design would use minimum area, provide high speed and performance, consume little power and be fully testable after manufacture.
1-1-1 Example of digital design algorithm:
Let’s start with a very simple example to understand what design mapping means. Consider the 2-bit comparator (two 1-bit inputs) in figure 1-2. The design compares its two inputs and asserts ‘Cmp_EQ’ if both inputs are equal, ‘Cmp_LT’ if the first input is less than the second, and ‘Cmp_GT’ if the first input is greater than the second.
[Block diagram: Comparator2 with inputs Cmp_In1 and Cmp_In2 and outputs Cmp_EQ, Cmp_LT and Cmp_GT]
Figure 1-2: Example of algorithm to be mapped on silicon, inputs and outputs are indicated
To work out the mapping between inputs and outputs, conventional techniques such as truth tables and Boolean equations can be used. A Boolean equation is written for each output and minimized using K-maps or Boolean reduction. Figure 1-3 shows the truth table and possible Boolean equations.
Cmp_In1  Cmp_In2  Cmp_EQ  Cmp_LT  Cmp_GT
   0        0        1       0       0
   0        1        0       1       0
   1        0        0       0       1
   1        1        1       0       0

$$\mathit{Cmp\_EQ} = \overline{\mathit{Cmp\_In1} \oplus \mathit{Cmp\_In2}}$$
$$\mathit{Cmp\_LT} = \overline{\mathit{Cmp\_In1}} \;\&\; \mathit{Cmp\_In2}$$
$$\mathit{Cmp\_GT} = \mathit{Cmp\_In1} \;\&\; \overline{\mathit{Cmp\_In2}}$$
Based on the algorithm worked out in figure 1-3, the black box of figure 1-2 is filled in. The result is a digital design that transfers inputs to outputs. The circuit defines exactly how each output is formed from the inputs; in other words, a combination of wires and logic elements defines a path from input to output. The circuit elements are dictated by the functionality of the digital design. In certain cases the design may be restructured to satisfy the optimality goals.
[Circuit diagram: logic connecting inputs Cmp_In1 and Cmp_In2 to outputs Cmp_EQ, Cmp_LT and Cmp_GT]
Figure 1-4: The algorithm defines a mapping between input and output
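The Boolean equations of figure 1-3 can be checked against the truth table exhaustively. The following is a minimal Python sketch, not part of the original design flow; the function and variable names (`comparator`, `TRUTH_TABLE`, `matches_truth_table`) are illustrative assumptions, and Cmp_EQ is modelled as the complement of an XOR.

```python
from itertools import product


def comparator(cmp_in1, cmp_in2):
    """Return (Cmp_EQ, Cmp_LT, Cmp_GT) for two 1-bit inputs."""
    cmp_eq = 1 - (cmp_in1 ^ cmp_in2)   # complement of XOR: 1 when the inputs match
    cmp_lt = (1 - cmp_in1) & cmp_in2   # NOT(Cmp_In1) AND Cmp_In2
    cmp_gt = cmp_in1 & (1 - cmp_in2)   # Cmp_In1 AND NOT(Cmp_In2)
    return cmp_eq, cmp_lt, cmp_gt


# Truth table of figure 1-3: (Cmp_In1, Cmp_In2) -> (Cmp_EQ, Cmp_LT, Cmp_GT)
TRUTH_TABLE = {
    (0, 0): (1, 0, 0),
    (0, 1): (0, 1, 0),
    (1, 0): (0, 0, 1),
    (1, 1): (1, 0, 0),
}


def matches_truth_table():
    """Check the equations against every row of the truth table."""
    return all(
        comparator(a, b) == TRUTH_TABLE[(a, b)]
        for a, b in product((0, 1), repeat=2)
    )
```

Calling `matches_truth_table()` exercises all four input combinations, which is feasible here only because the design has two 1-bit inputs.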
Figure 1-5 illustrates all the steps of mapping an algorithm onto silicon and cross-validating its functionality.
[Figure 1-5 flow chart stages: Design Specification; Mapping algorithm on silicon; Function Verification; Verified]
previous stage, therefore the full adders of the next stage must wait until all previous stages have generated their carries and forwarded them, as shown in figure 1-6. This means that in a 16-bit adder the last stage has to wait for 15 stages before it receives the rippled carry, which reduces the performance of the design.
[Adder chain diagram with ports Carry_out and Carry_in]
Figure 1-6: Ripple carry adder, low performance with less area and power
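The carry dependency can be made concrete with a small behavioural model. This is a Python sketch under stated assumptions, not a hardware description: the names `full_adder` and `ripple_carry_add` are hypothetical, and the loop order stands in for the stage-by-stage wait, since each iteration needs the carry produced by the one before it.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum, carry_out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out


def ripple_carry_add(x, y, width=16):
    """Add two unsigned integers bit by bit, rippling the carry.

    Stage i cannot start until stage i-1 has produced its carry,
    which is why delay grows with the width of the adder.
    """
    carry = 0
    result = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s, carry = full_adder(a, b, carry)  # depends on the previous stage
        result |= s << i
    return result, carry  # (sum truncated to `width` bits, final carry out)
```

For example, `ripple_carry_add(0xFFFF, 1)` forces the carry to travel through all 16 stages before the final carry out is known.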
The solution to this problem is carry look-ahead, which generates the carries in advance using a propagate and generate mechanism, as shown in figure 1-7. This speeds up the circuit at the cost of extra area; to get more performance, more area has to be invested.
[Carry look-ahead logic diagram: generate signals G[1], G[2] and propagate signals P[1], P[2]]
Figure 1-7: Better performance at cost of extra area and power consumption
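The propagate and generate mechanism can be sketched in the same behavioural style. The Python below is an illustration under assumptions (the function name `carry_lookahead_add` and the 4-bit default width are not from the text). Each carry is written as a flat expression of the G and P signals and the input carry, so no carry waits on a previously computed carry; the wider AND/OR terms are where the extra area goes.

```python
def carry_lookahead_add(x, y, width=4, carry_in=0):
    """Add two unsigned integers using generate/propagate signals."""
    # G[i] = A[i] AND B[i]  (stage i creates a carry on its own)
    # P[i] = A[i] XOR B[i]  (stage i passes an incoming carry along)
    g = [((x >> i) & 1) & ((y >> i) & 1) for i in range(width)]
    p = [((x >> i) & 1) ^ ((y >> i) & 1) for i in range(width)]

    carries = [carry_in]
    for i in range(width):
        # C[i+1] = G[i] | P[i]G[i-1] | P[i]P[i-1]G[i-2] | ... | P[i]...P[0]C[0]
        # Only G, P and the input carry appear on the right-hand side.
        carry = g[i]
        product_of_p = p[i]
        for j in range(i - 1, -1, -1):
            carry |= product_of_p & g[j]
            product_of_p &= p[j]
        carry |= product_of_p & carry_in
        carries.append(carry)

    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i  # Sum[i] = P[i] XOR C[i]
    return total, carries[width]
```

In software the inner loop still runs sequentially; in hardware each carry expression would be a separate two-level block of gates evaluated in parallel, which is the point of the technique.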
Let’s try to reduce the area of the ripple carry adder and make it slim while keeping the same functionality. Figure 1-8 shows a single full adder whose output ‘COut’ is stored in a D flip-flop. This carry out is fed back to the carry-in of the adder. The two inputs are stored in two n-bit shift registers, which shift out a single bit at every clock cycle, and the generated sum is also stored in a shift register. This reduces the hardware for any operand width, but it is easy to see that it requires n clock cycles to add two n-bit numbers. In this case, power and area are reduced at the cost of performance.
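[Bit-serial adder diagram: full adder (inputs A, B, Cin; outputs Sum, COut), D flip-flop (D, Q) holding COut, input shift registers X_in1 and X_in2, clock Clk]
Figure 1-8: One full adder with shift registers and D-FF can produce N-bit adder, reduced area with low performance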
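The serial scheme of figure 1-8 can be modelled cycle by cycle. The Python sketch below is an assumption-laden illustration (the name `serial_add` and the returned cycle count are not from the text): one loop iteration stands for one clock cycle, the integer variables stand for the shift registers, and `carry_ff` stands for the D flip-flop.

```python
def serial_add(x, y, width):
    """Bit-serial addition with one full adder, returning (sum, carry, cycles)."""
    reg_a, reg_b = x, y   # input shift registers, least significant bit first
    sum_reg = 0           # output shift register
    carry_ff = 0          # D flip-flop holding the carry between cycles
    cycles = 0

    for _ in range(width):
        a = reg_a & 1
        b = reg_b & 1
        # Combinational full adder
        s = a ^ b ^ carry_ff
        carry_next = (a & b) | (carry_ff & (a ^ b))
        # Clock edge: shift the registers and latch the new carry
        reg_a >>= 1
        reg_b >>= 1
        sum_reg = (sum_reg >> 1) | (s << (width - 1))
        carry_ff = carry_next
        cycles += 1

    return sum_reg, carry_ff, cycles
```

The returned cycle count always equals `width`, which is the performance cost paid for using a single full adder.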
Similarly, to make a design testable, extra area must be spent to accommodate testability logic inside the circuit. The main goals of testability are to provide digital hardware in which fault diagnosis is easy, the exact location of a fault can be spotted and, if possible, a repair or run-time alternative is provided. The minimum is fault detection from outside the IC or digital design; obviously this cannot be achieved without spending extra hardware.
The digital designer must find an optimal trade-off while mapping an algorithm onto silicon, considering the target application and the requirements of the design. Most current designs are power and performance critical. High transistor densities give digital designers extra liberty to use larger chip areas for enhanced performance, so the area of a design is less critical than its performance. However, using a larger chip area results in higher power consumption, which should be restricted: higher power consumption means shorter run time for battery-powered devices and more heat dissipation. In addition, testability has become almost mandatory in nearly all types of digital design. This makes digital design more challenging and demanding, and research on meeting these constraints is wide open.
1-1-4 Level of digital design
An algorithm can be designed and reasoned about at different levels of abstraction. Ultimately the design is translated into silicon, that is, into transistors. The transistors are manufactured using metal interconnects and semiconductor materials; however, it is daunting to think about mammoth designs at the physical layout level. Therefore other levels of abstraction are introduced to make the digital designer's life easier, and tools are provided to link the different levels of abstraction. This section covers the following levels: RTL (register transfer level), gate level, schematic/switch level and layout/physical level.
designers. Mixed-signal designs combine analogue and digital designs on a single chip and are mostly used in smart phones. This effectively means that advanced digital design products reach the market through electronic design automation (EDA).
Figure 1-9: Moore’s Law, transistor count doubles roughly every two years
Innovations were made in electronic design automation (EDA). The birth of tool-based development allowed IC designers to handle complex algorithms and digital designs using state-of-the-art tools. One of these innovations was programming-based hardware development, which effectively meant inventing a language that can model hardware, known as an HDL (hardware description language). Initially, the purpose of Verilog HDL was to support simulation of hardware; later, the ability to generate hardware from the code (synthesis) was added. This changes the digital design philosophy and extends its definition to the mapping of algorithms onto silicon using electronic design automation. Figure 1-5 is redrawn to extend the idea: mapping the algorithm onto silicon is replaced by HDL coding of the design. Functional verification and testing of the design are automated through software, and the verified design can be synthesized into functional hardware. Digital design using automated techniques is shown in figure 1-10.
[Figure 1-10 flow chart stages: Design specifications; HDL code for simulation; Function verification using test bench; Synthesis of digital design; Gate netlist; Post-synthesis simulation]
The first step remains the same and starts with the design specifications. Once the design specifications are frozen, Verilog HDL coding is done with simulation of the design in mind. The third step loops between the Verilog HDL code and the test bench until the code is functionally correct and matches the design specifications.
Once the design is functionally and logically correct, it is transformed into a gate-level netlist or transistor-level schematic. Most synthesis tools use library-cell-driven synthesis, so post-synthesis simulation is performed to check that the synthesized circuit is functionally correct. EDA industry has
A B C F Mux inputs
0 0 0 0 C
0 0 1 1
0 1 0 1 C’
0 1 1 0
1 0 0 1 1
1 0 1 1
1 1 0 0 0
1 1 1 0
Because it is a 4x1 multiplexer, it has four data inputs. We therefore group the eight rows of the truth table into four pairs and use inputs A and B as the selection lines; within each pair the output F is expressed in terms of C, giving C, C’, 1 or 0.
Step-3-Implement the design by applying C, C’, 1 and 0 to the four data inputs of the multiplexer, and apply A and B to the selection lines to control which input is passed to the output.
[Diagram: 4x1 MUX with data inputs I0 = C, I1 = C’, I2 = 1, I3 = 0, selection lines S0 and S1, and output Y]
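The multiplexer implementation can be checked against the truth table of F with a short model. This is a Python sketch with hypothetical names (`mux4`, `f_via_mux`, `TRUTH_TABLE_F`); it assumes that A drives the more significant selection line and B the less significant one, which the diagram residue does not confirm.

```python
def mux4(i0, i1, i2, i3, s1, s0):
    """4x1 multiplexer: selection lines (s1, s0) choose one data input."""
    return (i0, i1, i2, i3)[(s1 << 1) | s0]


def f_via_mux(a, b, c):
    """F(A, B, C) built from one 4x1 mux with data inputs C, C', 1 and 0."""
    not_c = 1 - c
    # Assumption: A is the high selection bit, B the low selection bit.
    return mux4(c, not_c, 1, 0, s1=a, s0=b)


# Column F of the truth table, indexed by the 3-bit value ABC (000 to 111)
TRUTH_TABLE_F = [0, 1, 1, 0, 1, 1, 0, 0]
```

Looping over all eight input combinations and comparing `f_via_mux(a, b, c)` with `TRUTH_TABLE_F[(a << 2) | (b << 1) | c]` confirms that the pairing reproduces every row.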
If the multiplexer in Example 1-1 were 8x1, no pairing would be needed: C would be applied directly to the multiplexer as a third selection line, and the outputs of the truth table would be applied to the data inputs. In Example 1-2, an XOR logic gate is implemented. The two inputs A and B are used as selection lines and the outputs of the truth table are applied to the data inputs. It is just like storing the outputs and then selecting among them using the inputs.
A B F
0 0 0 F0=0
A
B
This means that if the data inputs are held in a buffer connected to the multiplexer, the multiplexer becomes reconfigurable: copying different data into the buffer and applying it to the multiplexer inputs changes the function that is implemented.
Figure 1-11 shows a multiplexer, a look-up table (LUT) and a D flip-flop. As described above, a multiplexer can implement any function if its data inputs are changed. The new values are stored in the LUT, which is directly connected to the multiplexer. The output of the multiplexer is stored in the DFF, so the value is retained. This block of LUT, MUX and DFF is called a configurable logic block (CLB). CLBs are the basic building blocks of FPGAs and give the device its flexible reconfigurability.
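[Figure 1-11 diagram: LUT driving data inputs I0 to I3 of a 4x1 MUX with selection lines Sel0 and Sel1; the MUX output is stored in a DFF clocked by CLK]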
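The LUT, MUX and DFF arrangement can be modelled in a few lines. The Python class below is a simplified sketch, not a model of any vendor's logic block; the class name `LogicBlock` and its methods are hypothetical. The stored bits play the role of the LUT, indexing them with the selection lines plays the role of the multiplexer, and the attribute `q` plays the role of the DFF output.

```python
class LogicBlock:
    """Simplified LUT + 4x1 MUX + DFF block with two selection inputs."""

    def __init__(self, lut_bits):
        self.configure(lut_bits)
        self.q = 0  # DFF output, retained between clock edges

    def configure(self, lut_bits):
        """Load four stored bits (I0..I3); this sets the logic function."""
        bits = list(lut_bits)
        if len(bits) != 4:
            raise ValueError("a 4x1 multiplexer needs exactly four stored bits")
        self.lut = bits

    def combinational(self, sel1, sel0):
        """MUX output: the stored bit picked by the selection lines."""
        return self.lut[(sel1 << 1) | sel0]

    def clock(self, sel1, sel0):
        """Clock edge: capture the MUX output in the DFF."""
        self.q = self.combinational(sel1, sel0)
        return self.q
```

Loading `[0, 1, 1, 0]` makes the block behave as an XOR of its two selection inputs, and calling `configure([0, 0, 0, 1])` on the same object turns it into an AND, which is the reconfigurability described above.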
FPGAs contain seas of such CLBs, which allows reprogrammability and reconfigurability. Blocks of CLBs are connected through reprogrammable interconnects known as routing channels, as shown in figure 1-12. Each logic block consists of CLBs that can host any logic, while the interconnection between them is controlled by the routing channels. Finally, I/O pads and blocks connect the external world to the logic inside the device.
FPGAs are reprogrammable and offer very high speed at reasonable cost. They do not incur the high manufacturing costs of an ASIC and yet provide a high level of flexibility. Unlike microcontrollers, they have no fixed instruction set, limited I/O operations or restricted set of registers. However, they generally have higher upfront costs than microcontrollers. Most modern FPGAs support soft processor cores, which can be reconfigured on the fly. Nevertheless, with or without soft processor cores, they suit a wide range of performance-critical applications where a microcontroller cannot keep up.
The first platform that comes to mind when thinking about any embedded system is the microcontroller. There are tons of them, offering different capabilities, suiting a variety of applications and supported by a large community of embedded developers. However, microcontrollers are ASICs that process code sequentially, loading it from ROM or Flash. Although some powerful microcontrollers are multicore these days and process logic in parallel, they are essentially sequential in nature. Microcontrollers may suit several application models, but where high speed is the basic requirement things may turn in favor of FPGAs, which are based on a parallel execution model and can improve performance many times over.
As said before, FPGAs are a giant pile of logic that can be electronically reprogrammed to host any circuit at high speed, while microcontrollers are low-cost devices with a predefined, fixed architecture. Most of the time, embedded systems deploy both of them in the same device to serve different layers of functionality in the system. Another popular device for digital signal processing is the DSP processor, which has high-end multiply-accumulate units (MACs). DSP processors suit tasks with high performance requirements and real-time constraints. Such tasks mostly require multiplications and additions at a very high rate in parallel, which is a very common use case.
Modern systems on chip (SOCs) use DSP processors, microcontrollers, FPGAs and some ASICs all on one chip, interconnected via buses and glue logic. It is very common to buy intellectual property (IP) cores from different vendors and put them together in one package. For example, ARM sells IP cores for microcontrollers, QUALCOMM offers communication IP cores, and so on. Most of these IP cores comply with standard IP core principles, and connecting them together requires only minor glue logic. These systems may communicate through a custom bus architecture or a commercial bus architecture such as the AMBA bus.
However, since processor scaling has more or less halted, multicore processors have become the norm in computing. With several processing cores on a chip, shared memories and an abundance of core buses have become something of a performance bottleneck in modern on-chip communication. Network on chip (NOC) borrows concepts from conventional networks and provides a high-speed network inside the chip. It contains routers, routing protocols, topologies and other conventional networking elements inside the chip to provide high-speed communication to SOCs containing many dozens of IP cores.
The rest of the book takes you on a digital design journey focused on the core idea of thinking in terms of efficient architectures. Chapter 2 covers Verilog HDL, the powerful language used to model and synthesize hardware. Chapters 3 and 4 focus on the algorithmic state machine (ASMD) and stored program machine (SPM) approaches to conquering larger designs. Once through the basic design principles, the next chapter addresses synthesis of digital designs and considers the actual piece of hardware. Chapter 6 lays the foundation of DSP components, which are further discussed in Chapter 11 as digital signal processing components. Chapters 7 to 10 look into optimized and high-performance designs by taking pipelining, high-level synthesis, network on chip (NOC) and design for testability into consideration, respectively.