Professional Documents
Culture Documents
CMOS VLSI
Design
Outline
Clock Distribution
Clock Skew
Skew-Tolerant Static Circuits
Traditional Domino Circuits
Skew-Tolerant Domino Circuits
Design forSlide
Skew2
Clocking
Synchronous systems use a clock to keep
operations in sequence
Distinguish this from previous or next
Determine speed at which machine operates
Clock must be distributed to all the sequencing
elements
Flip-flops and latches
Also distribute clock to other elements
Domino circuits and memories
Design forSlide
Skew3
Clock Distribution
On a small chip, the clock distribution network is just
a wire
And possibly an inverter for clkb
On practical chips, the RC delay of the wire
resistance and gate load is very long
Variations in this delay cause clock to get to
different elements at different times
This is called clock skew
Most chips use repeaters to buffer the clock and
equalize the delay
Reduces but doesnt eliminate skew
CMOS VLSI Design
Design forSlide
Skew4
Example
Skew comes from differences in gate and wire delay
With right buffer sizing, clk1 and clk2 could ideally
arrive at the same time.
But power supply noise changes buffer delays
clk2 and clk3 will always see RC skew
gclk
3 mm
3.1 mm
clk1
clk2
1.3 pF
0.4 pF
0.5 mm
clk3
0.4 pF
Design forSlide
Skew5
sequencing overhead
F1
Q1
Combinational Logic
D2
F2
clk
Tc
clk
tpcq
Q1
tskew
tpdq
tsetup
D2
clk
Q1
CL
clk
D2
F2
clk
F1
tskew
clk
thold
Q1 tccq
D2
tcd
Design forSlide
Skew6
100
M H z
S p e c In t9 5
10
100
80386
80486
P e n tiu m
P e n tiu m II / III
0 .1
80386
80486
P e n tiu m
P e n tiu m II / III
10
1985
0 .0 1
1985
1988
1991
1994
1997
2000
1988
1991
1994
1997
2000
500
F O 4 in v e r te r d e la y s / c y c le
F a n o u t - o f - 4 ( F O 4 ) I n v e r t e r D e la y ( p s )
100
V D D = 3 .3
VD D = 5
V D D = 2 .5
200
100
50
2 .0
1 .2
0 .8
0 .6
0 .3 5
0 .2 5
50
80386
80486
P e n tiu m
P e n tiu m II / II I
20
10
1985
1988
1991
1994
1997
2000
P ro c e s s
Design forSlide
Skew7
Solutions
Reduce clock skew
Careful clock distribution network design
Plenty of metal wiring resources
Analyze clock skew
Only budget actual, not worst case skews
Local vs. global skew budgets
Tolerate clock skew
Choose circuit structures insensitive to skew
Design forSlide
Skew8
Ad hoc
Grids
H-tree
Hybrid
Design forSlide
Skew9
Clock Grids
Design for
Slide
Skew
10
Alpha 21164
Alpha 21264
PLL
gclk grid
Alpha 21064
gclk grid
Alpha 21164
Alpha 21264
Design for
Slide
Skew
11
H-Trees
Fractal structure
Gets clock arbitrarily close to any point
Matched delay along all paths
Delay variations cause skew
A
A and B might see big skew
Design for
Slide
Skew
12
Itanium 2 H-Tree
Four levels of buffering:
Primary driver
Repeater
Second-level
clock buffer
Gater
Route around
obstructions
Repeaters
Typical SLCB
Locations
Primary Buffer
Design for
Slide
Skew
13
Hybrid Networks
Use H-tree to distribute clock to many points
Tie these points together with a grid
Ex: IBM Power4, PowerPC
H-tree drives 16-64 sector buffers
Buffers drive total of 1024 points
All points shorted together with grid
Design for
Slide
Skew
14
Skew Tolerance
Flip-flops are sensitive to skew because of hard edges
Data launches at latest rising edge of clock
Must setup before earliest next rising edge of clock
Overhead would shrink if we can soften edge
Latches tolerate moderate amounts of skew
Data can arrive anytime latch is transparent
Design for
Slide
Skew
15
Skew: Latches
Q1
Combinational
Logic 1
D2
1
Q2
Combinational
Logic 2
D3
Q3
pdq
sequencing overhead
L3
122t 3
D1
L1
t pd Tc
L2
2-Phase Latches
Tc
tsetup tnonoverlap tskew
2
Pulsed Latches
Design for
Slide
Skew
16
C
D
A
Y
C
Y
B
Design for
Slide
Skew
17
Domino Circuits
Dynamic inputs must monotonically rise during
evaluation
Place inverting stage between each dynamic gate
Dynamic / static pair called domino gate
Domino gates can be safely cascaded
domino AND
A
B
dynamic static
NAND inverter
Design for
Slide
Skew
18
Domino Timing
Domino gates are 1.5 2x faster than static CMOS
Lower logical effort because of reduced Cin
Challenge is to keep precharge off critical path
Look at clocking schemes for precharge and eval
Traditional schemes have severe overhead
Skew-tolerant domino hides this overhead
Design for
Slide
Skew
19
Latch
Dynamic
clk clk
Static
Dynamic
clk
Static
Dynamic
clk
Static
tpdq
Dynamic
Latch
Dynamic
Dynamic
clk
Static
Dynamic
clk
Static
clk
Dynamic
t pd Tc 2t pdq
tpdq
Design for
Slide
Skew
20
Clock Skew
Skew increases sequencing overhead
Traditional domino has hard edges
Evaluate at latest rising edge
Setup at latch by earliest falling edge
clk
Latch
Dynamic
clk clk
Static
Dynamic
Dynamic
clk
Static
clk
Latch
clk clk
Dynamic
Static
Dynamic
clk
Static
clk
Dynamic
t pd Tc 2tsetup 2tskew
clk
tsetup tskew
Design for
Slide
Skew
21
Time Borrowing
Logic may not exactly fit half-cycle
No flexibility to borrow time to balance logic
between half cycles
Traditional domino sequencing overhead is about
25% of cycle time in fast systems!
clk
Latch
clk
Static
clk
Dynamic
clk
Static
clk
Dynamic
Static
Dynamic
clk
Static
Dynamic
clk
Latch
clk
tsetup tskew
Design for
Slide
Skew
22
Design for
Slide
Skew
23
Skew-Tolerant Domino
Use overlapping clocks to eliminate latches at phase
boundaries.
Second phase evaluates using results of first
No latch at
phase boundary
Static
Dynamic
2
Static
Dynamic
Design for
Slide
Skew
24
Full Keeper
After second phase evaluates, first phase precharges
Input to second phase falls
Violates monotonicity?
But we no longer need the value
Now the second gate has a floating output
Need full keeper to hold it either high or low
H
X
f
weak full
keeper
transistors
Design for
Slide
Skew
25
Time Borrowing
Overlap can be used to
Tolerate clock skew
Permit time borrowing
No sequencing overhead
toverlap
tborrow tskew
Phase 1
Static
Dynamic
2
Static
Dynamic
2
Static
Dynamic
2
Static
Dynamic
1
Static
Dynamic
1
Static
Dynamic
1
Static
Dynamic
1
Static
1
Dynamic
t pd Tc
Phase 2
Design for
Slide
Skew
26
Multiple Phases
With more clock phases, each phase overlaps more
Permits more skew tolerance and time borrowing
1
2
3
4
Phase 1
Phase 2
Phase 3
Static
Dynamic
4
Static
Dynamic
4
Static
Dynamic
3
Static
Dynamic
3
Static
Dynamic
2
Static
Dynamic
2
Static
Dynamic
1
Static
Dynamic
Phase 4
Design for
Slide
Skew
27
Clock Generation
en clk
1
2
3
4
CMOS VLSI Design
Design for
Slide
Skew
28
Summary
Clock skew effectively increases setup and hold
times in systems with hard edges
Managing skew
Reduce: good clock distribution network
Analyze: local vs. global skew
Tolerate: use systems with soft edges
Flip-flops and traditional domino are costly
Latches and skew-tolerant domino perform at full
speed even with moderate clock skews.
Design for
Slide
Skew
29