
BiCMOS Technology

Combines Bipolar and CMOS transistors in a single integrated circuit.
By retaining the benefits of bipolar and CMOS, BiCMOS is able to achieve VLSI circuits with
speed-power-density performance previously unattainable with either technology individually.
BiCMOS technology enables high-performance integrated circuits (ICs) but increases process
complexity.

Characteristics of CMOS Technology

Advantages of CMOS over bipolar
Lower static power dissipation
Higher noise margins
Higher packing density, giving lower manufacturing cost per device
High yield with large integrated complex functions

Other CMOS Advantages
High input impedance (low drive current)
Scalable threshold voltage
High delay sensitivity to load (fan-out limitations)
Low output drive current (issue when driving large capacitive loads)
Low transconductance, where transconductance gm ∝ Vin
Bi-directional capability (drain & source are interchangeable)
A near ideal switching device

Characteristics of Bipolar Technology

Advantages of Bipolar over CMOS
Higher switching speed
Higher current drive per unit area, higher gain
Generally better noise performance and better high-frequency characteristics
Better analogue capability
Improved I/O speed (particularly significant with the growing importance of package limitations in high-speed systems)

Other Bipolar Advantages
High power dissipation
Lower input impedance (high drive current)
Low voltage swing logic
Low packing density
Low delay sensitivity to load
High gm (gm ∝ e^Vin)
High unity-gain bandwidth (ft) at low currents
Essentially unidirectional

BJT & CMOS Advantages


CMOS over BJT
Power dissipation
Noise margin
Packing density
Ability to integrate large and complex circuits and functions with high yield
Good switch

BJT over CMOS
Switching speed
Current drive per unit area
Noise performance (low 1/f noise)
Analog capability
I/O speed
High transconductance

BiCMOS Circuit Advantages


Improved speed over CMOS
Lower power dissipation than BJT (simplifies packaging and board requirements)
Flexible I/O (ECL, CMOS, or TTL)
High performance analog
Latch-up immunity
High impedance input (FET)
High gain (BJT)
Low 1/f noise, >1 GHz toggle frequency
Zero offset analog switches
Gain-bandwidth product extended
Good voltage reference
Low input offset voltage for differential pair

The simplified BiCMOS Inverter


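Two bipolar transistors (T3 and T4), one nMOS transistor (T1) and one pMOS transistor (T2); both MOS
transistors are enhancement-type devices, OFF at Vgs = 0 V.
The MOS switches perform the logic function & the bipolar transistors drive the output loads.

[Figure: simplified BiCMOS inverter. Vin drives the MOS transistors T1 and T2; the bipolar transistors T3 and T4 sit between Vdd and ground and drive Vout and the load CL.]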

Vin = 0 :
T1 is off. Therefore T3 is non-conducting.
T2 ON - supplies current to base of T4.
T4 base voltage set to Vdd.
T4 conducts & acts as current source to charge load CL
towards Vdd.
Vout rises to Vdd - Vbe (of T4)
Note : Vbe (of T4) is base-emitter voltage of T4.
(pullup bipolar transistor turns off as the output
approaches 5V - Vbe (of T4))
Vin = Vdd :
T2 is off. Therefore T4 is non-conducting.
T1 is on and supplies current to the base of T3
T3 conducts & acts as a current sink to discharge load CL
towards 0V.
Vout falls to 0V+ VCEsat (of T3)
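
As a quick numerical check of the two cases above, the sketch below computes the output levels of the simplified inverter. The function name and the default values Vbe = 0.7 V and VCEsat = 0.2 V are illustrative assumptions, not data from a particular process.

```python
def bicmos_output_levels(vdd, vbe=0.7, vce_sat=0.2):
    """Return (V_OH, V_OL) for the simplified BiCMOS inverter.

    vbe and vce_sat are assumed typical values, not process data.
    """
    v_oh = vdd - vbe   # T4 turns off once Vout reaches Vdd - Vbe
    v_ol = vce_sat     # T3 saturates, leaving VCEsat across it
    return v_oh, v_ol


v_oh, v_ol = bicmos_output_levels(5.0)
print(f"V_OH = {v_oh:.1f} V, V_OL = {v_ol:.1f} V, swing = {v_oh - v_ol:.1f} V")
```

With a 5 V supply the high level stops one Vbe below the rail, which is why the pull-up bipolar transistor is described as turning off near 5V - Vbe.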

T3 & T4 present low impedances when turned on into saturation, so the load CL will be charged or discharged rapidly.
Output logic levels will be good & will be close to the rail voltages since VCEsat is quite small & VBE ≈ 0.7 V.
Therefore, the inverter has high noise margins.
The inverter has high input impedance, i.e., a MOS gate input.
The inverter has low output impedance.
The inverter has high drive capability but occupies a relatively small area.
However, this is not a good arrangement to implement since no discharge path exists for current from the base of
either bipolar transistor when it is being turned off, i.e.,
when Vin = Vdd, T2 is off and no conducting path to the base of T4 exists;
when Vin = 0, T1 is off and no conducting path to the base of T3 exists.
This will slow down the action of the circuit.

Further advantages of BiCMOS Technology

High impedance CMOS transistors may be used for the
input circuitry while the remaining stages and output
drivers are realized using bipolar transistors.
BiCMOS devices offer many advantages where high load
current sinking and sourcing is required.
The high current gain of the NPN transistor greatly
improves the output drive capability of a conventional
CMOS device.
MOS speed depends on device parameters such as
saturation current and capacitance.
These in turn depend on oxide thickness, substrate
doping and channel length.
BiCMOS can take advantage of any advances in CMOS
and/or bipolar technology, greatly accelerating the
learning curve normally associated with new technologies.

Are there disadvantages with BiCMOS technology?

Main disadvantage: greater process complexity when compared to CMOS.
This results in a 1.25 to 1.4 times increase in die costs over conventional CMOS.
Taking into account packaging costs, the total manufacturing cost of supplying a BiCMOS chip ranges
from 1.1 to 1.3 times that of CMOS.
Since BiCMOS does not have to be scaled as aggressively as CMOS, existing fabs can be utilized,
resulting in lower capital costs.
The extra costs incurred in developing a BiCMOS technology are more than offset by the fact that the
enhanced chip performance obtained extends the usefulness of manufacturing equipment & clean rooms
by at least one technology generation.

Signal Strengths
Signals such as 1 and 0 have strengths, which measure their ability to sink or source current.
The VDD and GND rails are the strongest 1 and 0.
Under the switch abstraction, G has complete control and S and D have no effect.
In reality, the gate can turn the switch on only if a potential difference of at least Vt exists
between G and S.
We will look at Vt in detail later on in the course.
Thus signal strengths are related to Vt, and therefore p and n transistors produce signals with
different strengths.

Tristate Inverter
C = 1, C' = 0 => inverter function Z = A'
C = 0, C' = 1 => high impedance output Z
(C' and A' denote the complements of C and A.)

Building CMOS Gates (n-side)

Building CMOS Gates (p-side)

Building CMOS Gates (Final CMOS gate)

Pseudo-NMOS Logic
In CMOS, use a PMOS that is always ON to make pseudo-NMOS gates.
The PMOS transistor is always on and thus fights the pull-down NMOS network.
The ratio of NMOS to PMOS strength is a very important design parameter.
Uses a p-type transistor as a resistive pullup and an n-type network for pulldowns.
Consumes static power.
Has a much smaller pullup network than a static gate.
Pulldown time is longer because the pullup is fighting.
For a logic 0 output, the pullup and pulldown form a voltage divider.
Must choose n, p transistor sizes to create effective resistances of the required ratio.
Effective resistance of the pulldown network must be computed in the worst case; series n-types mean
larger transistors.
Make the PMOS about 1/4 the effective strength of the pulldown network.
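
The voltage-divider view of the logic 0 output can be sketched numerically. This is a first-order resistive model only; the supply voltage, the pulldown resistance and the function names are assumptions chosen for illustration.

```python
def pseudo_nmos_vol(vdd, r_pullup, r_pulldown):
    """Logic-0 output level of the pullup/pulldown voltage divider."""
    return vdd * r_pulldown / (r_pullup + r_pulldown)


def static_current(vdd, r_pullup, r_pulldown):
    """DC current drawn while the pulldown network conducts."""
    return vdd / (r_pullup + r_pulldown)


VDD = 5.0           # assumed supply voltage, volts
R_PULLDOWN = 10e3   # assumed worst-case pulldown resistance, ohms

for strength_ratio in (1, 2, 4, 8):
    # A PMOS with 1/k of the pulldown strength has k times its resistance.
    r_pullup = strength_ratio * R_PULLDOWN
    vol = pseudo_nmos_vol(VDD, r_pullup, R_PULLDOWN)
    i_dc = static_current(VDD, r_pullup, R_PULLDOWN)
    print(f"PMOS at 1/{strength_ratio} strength: VOL = {vol:.2f} V, I = {i_dc * 1e6:.0f} uA")
```

In this model a PMOS at 1/4 of the pulldown strength gives VOL = VDD/5. A weaker PMOS lowers VOL and the static current further, at the cost of a slower pullup.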

Pseudo-NMOS Example

Pseudo-NMOS is a ratioed circuit where dc current flows when the N pull-down tree is conducting.
Must design the ratio of the N devices' W/L to the P load device's W/L so that
when the N pull-down leg with maximum resistance is conducting, the output
is at a sufficiently low VOL.

Dynamic CMOS Logic


General Dynamic CMOS Logic:
The IDD path is turned off when the clock is disabled, and/or the output is evaluated when the
clock is enabled.

In general:
Basic Dynamic CMOS n-Logic.
Basic Dynamic CMOS p-Logic.
Clocked CMOS Logic.
Domino Logic.

Basic Dynamic CMOS Logic with n-Logic

Example

Erroneous Evaluation in Cascaded Dynamic n-Logic Stages:
They are evaluated at the same clock edges.
Even slow evaluation may be impossible if the output has been discharged by pre-charged inputs.

Cascading Problem in Dynamic CMOS Logic

If several stages of the previous circuit are cascaded together using the same clock, a problem in
evaluation involving a built-in race condition will exist.
During pre-charge, both Vout1 and Vout2 are pre-charged to Vdd.
When the clock goes high to begin evaluation, all inputs at stage 1 require some finite time to
resolve, but during this time charge may erroneously be discharged from Vout2.
e.g. assume that eventually the 1st stage NMOS logic tree conducts and fully discharges Vout1, but
since the inputs to the N-tree are not all immediately resolved, it takes some time for the N-tree
to finally discharge Vout1 to GND.
If, during this time delay, the 2nd stage has the input condition shown, with the bottom NMOS
transistor gate at a logic 1, then Vout2 will start to fall and discharge its load capacitance
until Vout1 finally evaluates and turns off the top series NMOS transistor in stage 2.

The result is an error in the output of the 2nd stage, Vout2.
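
The race can be illustrated with a toy time-stepped model. This is a behavioural sketch, not a circuit simulation: each node is assumed to discharge exponentially with a fixed time constant, stage 2 is assumed to conduct whenever Vout1 is above an assumed threshold, and every parameter value and name is hypothetical.

```python
def cascaded_evaluate(vdd=5.0, vt=1.0, tau1=1.0, tau2=1.0,
                      t_resolve=0.5, dt=0.001, t_end=10.0):
    """Return (Vout1, Vout2) at the end of the evaluate phase.

    Both nodes start precharged to vdd. Stage 1 starts discharging only
    after t_resolve; stage 2 discharges while Vout1 is still above vt.
    """
    vout1 = vdd
    vout2 = vdd
    t = 0.0
    while t < t_end:
        if t >= t_resolve:
            vout1 -= vout1 / tau1 * dt   # stage 1 N-tree finally conducts
        if vout1 > vt:
            vout2 -= vout2 / tau2 * dt   # stage 2 leaks charge while its input is still high
        t += dt
    return vout1, vout2


vout1, vout2 = cascaded_evaluate()
print(f"Vout1 = {vout1:.2f} V (correctly low)")
print(f"Vout2 = {vout2:.2f} V (should have stayed at Vdd)")
```

In this model Vout2 ends far below Vdd even though its correct value is high, because a dynamic node that has lost its charge cannot recover until the next precharge.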

Cascaded Dynamic CMOS Logic Gates: Evaluate Problem

With simple cascading of dynamic CMOS logic stages, a problem arises in the evaluate cycle:
The pre-charged high voltage on node N2 in stage 2 may be inadvertently (partially) discharged by
logic inputs to stage 2 which have not yet reached their final correct (low) values from the
stage 1 evaluation operation.
Dynamic CMOS logic gates cannot simply be cascaded without preventing unwanted bleeding of charge
from pre-charged nodes.

Possible Solutions:

Two-phase clocks
Use of inverters to create Domino Logic
NP Domino Logic
Zipper/NORA logic

Single Phase Dynamic CMOS Gates


When the gates are precharged, N1 and N2 are charged to Vdd.
During evaluate, n-logic-1 will conditionally discharge N1 with some delay due to t.
N1 must be fully evaluated before N2 is evaluated; otherwise the output N2 will be in error.
This cannot be assured with single-phase clocks.
These problems can be overcome by using multi-phase clocks and a sample-and-hold circuit to isolate
cascaded stages and control evaluate timing.
Example:

During precharge, N1 is automatically high and S is automatically ON.
Suppose that n-logic-1 is ON and n-logic-2 is ON.
Immediately after precharge ends, N2 will tend to be pulled down until N1 is pulled down
by n-logic-1, thereby turning S OFF.

To make cascaded stages work with a single-phase clock, S should be OFF until it is conditionally
turned ON (by N1).
This can be done by inserting an inverter between N1 and S => domino logic.

CMOS Domino Logic

The problem of faulty discharge of precharged nodes in CMOS dynamic logic circuits can be solved by
placing an inverter in series with the output of each gate:
All inputs to N logic blocks (which are derived from inverted outputs of previous stages) will
therefore be at zero volts during precharge and will remain at zero until the evaluation stage has
logic inputs to discharge the precharged node PZ.
However, all circuits only provide non-inverted outputs.

In (b) a weak P device compensates for charge loss due to charge sharing and leakage at
low-frequency clock operation.
In (c) the weak P device can be used to latch the output high.
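
A small behavioural model shows why the output inverter removes the erroneous discharge. This is a logic-level sketch under simplifying assumptions (no timing, no charge sharing); the class and function names are hypothetical and chosen for illustration.

```python
from itertools import product


class DominoGate:
    """Dynamic node plus static output inverter (logic-level model)."""

    def __init__(self, pulldown):
        self.pulldown = pulldown   # returns True when the N-tree conducts
        self.node = True           # dynamic node, precharged high

    def precharge(self):
        self.node = True

    def evaluate(self, *inputs):
        # The node can only be discharged during evaluate, never recharged.
        if self.pulldown(*inputs):
            self.node = False

    @property
    def out(self):
        return not self.node       # inverter output: low after precharge


def two_stage_and(a, b, c):
    g1 = DominoGate(lambda x, y: x and y)
    g2 = DominoGate(lambda x, y: x and y)
    g1.precharge()
    g2.precharge()
    g2.evaluate(g1.out, c)   # stage 2 sees g1.out = 0 before stage 1 resolves
    g1.evaluate(a, b)        # stage 1 resolves
    g2.evaluate(g1.out, c)   # stage 2 follows, like a falling domino
    return g2.out


results = {abc: two_stage_and(*abc) for abc in product([False, True], repeat=3)}
print(all(out == (a and b and c) for (a, b, c), out in results.items()))
```

Because every stage input is low until the previous stage has actually evaluated, evaluating stage 2 "too early" cannot discharge its node. The model also reflects the stated limitation: each gate output is a non-inverted function of its inputs.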

NP Domino Logic (NORA Logic)


An elegant solution to the dynamic CMOS logic erroneous
evaluation problem is to use NP Domino Logic (also called
NORA logic) as shown below.

Alternate stages of N logic with stages of P logic:
N logic stages use the true clock, normal precharge and evaluation phases, with the N logic tree
in the pull-down leg.
P logic stages use the complement clock, with the P logic tree tied above the output node.
During precharge clk is low (-clk is high) and the P-logic output precharges to ground while
N-logic outputs precharge to Vdd.
During evaluate clk is high (-clk is low) and both types of stage go through evaluation; the
N-logic tree logically evaluates to ground while the P-logic tree logically evaluates to Vdd.

Inverter outputs can be used to feed other N-blocks from N-blocks, or to feed other P-blocks
from P-blocks.

NORA CMOS Logic Circuit Example


An example of NP or NORA (No Race) logic is shown
below:

During clk low (-clk high), each stage pre-charges:
N logic stages pre-charge to Vdd; P logic stages pre-charge to GND.

When clk goes high (-clk low), each stage enters the evaluation phase:
N logic stages evaluate to GND; P logic stages evaluate to Vdd.
All NMOS and PMOS stages evaluate one after another in succession, as in Domino logic.

Logic is:
Stage 1 is X = (A B)
Stage 2 is G = X + Y
Stage 3 is Z = (F G + H)

Zipper CMOS Dynamic Logic


Zipper CMOS logic is a scheme for reducing charge leakage and charge sharing problems.
Pre-charge transistors receive a slightly modified clock where the clock pulse (during pre-charge
off time) holds the pre-charge transistor at weak conduction in order to provide a trickle
pre-charge current during the evaluation phase.
PMOS pre-charge transistor gates are held at Vdd - |Vtp|.
NMOS pre-charge transistor gates are held at Vtn above GND.

Clocked CMOS Logic (C2MOS)

Clock Distribution
On a small chip, the clock distribution network is just a wire,
and possibly an inverter for clkb.
On practical chips, the RC delay of the wire resistance and gate load is very long.
Variations in this delay cause the clock to get to different elements at different times.
This is called clock skew.
Most chips use repeaters to buffer the clock and equalize the delay.
This reduces but doesn't eliminate skew.

Solutions
Reduce clock skew
  Careful clock distribution network design
  Plenty of metal wiring resources
Analyze clock skew
  Only budget actual, not worst-case skews
  Local vs. global skew budgets
Tolerate clock skew
  Choose circuit structures insensitive to skew


Clock Skew Sources are:

systematic,
random,
drift,
jitter

Clock System Architecture

Synchronous chip interface with PLL

Phase-locked loop block diagram

Global Clock Distribution Networks

Grids
H-trees
Spines
Ad-hoc
Hybrid

Clock Grids
A clock grid is a mesh of horizontal and vertical wires
driven from the middle or edges.
Use grid on two or more levels to carry clock.
Make wires wide to reduce RC delay.
Ensures low skew between nearby points.
But possibly large skew across die.
Grids compensate for random skew.
Grids consume a large amount of metal resources and hence have a high switching capacitance
and power consumption.

Alpha Clock Grids

[Figure: clock grids of the Alpha 21064, Alpha 21164 and Alpha 21264; labels: PLL, gclk grid]

H-Trees
Fractal structure
Gets clock arbitrarily close to any point
Matched delay along all paths
Delay variations cause skew
Buffers are added to serve as repeaters
A and B might see big skew
High random skew, drift and jitter between two nearby points

Ad-hoc
The clock is routed haphazardly with some
attempt to equalize wire lengths or add buffers to
equalize delay.
Have low systematic skews because the buffer
sizes can be adjusted until nominal delays are
nearly equal.
Subject to random skew.

Clock Spine
If loads are uniform, the spine avoids the systematic skew
of the grid by matching the length of the clock wires.
Save power by not switching certain wires.
System with many clocked elements may require a large
number of serpentine routes, leading to high area and
capacitance for the clock network.
Clock spines have large skews between nearby elements
driven by different serpentines.

Hybrid Networks
Use H-tree to distribute clock to many points.
Tie these points together with a grid.
Hybrid combination of H-tree and grid offers
lower skew.
Hybrid approach has lower systematic skew,
less susceptible to skew from non-uniform load
distribution.
Hybrid approach is regular, making layout of
well-controlled transmission line structures
easier.

Clock Tree Distribution

To prevent clock skew problems on a chip, clock distribution networks are designed very carefully.
Example shown: linear (E-W) clock tree distribution network
Clock is buffered several times before driving FO=3
Each FO3 buffer drives another high FO (FO=4 shown) buffer
Finally another single buffer is used for each linear clock line to drive across the chip or a
functional island on the chip

An H-tree distribution network is often used on chips with area pads (solder bumps)
Master clock is brought on board the chip near its central part and driven outward with a large
H interconnection arrangement

Clock Generation
[Figure: clock generation circuit; labels: en, clk, 1, 2, 3, 4]

Scaling of MOS Circuits


VLSI technology is constantly evolving towards smaller line widths.
Reduced feature size generally leads to:
Better / faster performance
More gates per chip
A more accurate description of modern technology is ULSI (ultra large scale integration).

Scaling Factors
In our discussions we will consider 2 scaling factors, α and β.
1/β is the scaling factor for VDD and oxide thickness D.
1/α is the scaling factor for all other linear dimensions.
Scaling models are:
Constant electric field
Constant voltage
Combined E & V

Scaling Factors for Device Parameters


Simple derivations showing the effects of scaling are given in Pucknell and Eshraghian,
pages 125 - 129.
It is important that you understand how the following parameters are affected by scaling:

Gate Area
Gate Capacitance per unit area
Gate Capacitance
Charge in Channel
Channel Resistance
Transistor Delay
Maximum Operating Frequency
Transistor Current
Switching Energy
Power Dissipation Per Gate (Static and Dynamic)
Power Dissipation Per Unit Area
Power - Speed Product
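
How several of the listed parameters respond to scaling can be sketched from first-order relations. The sketch assumes lateral dimensions scale by 1/alpha while VDD and oxide thickness scale by 1/beta, with constant-field scaling taken as beta = alpha and constant-voltage scaling as beta = 1. The function name and the simple device relations in the comments are assumptions used for illustration, not the textbook derivation itself.

```python
def scaling_multipliers(alpha, beta):
    """First-order multipliers for device parameters after scaling.

    Lateral dimensions scale by 1/alpha; VDD and oxide thickness by 1/beta.
    """
    gate_area = 1 / alpha**2                       # L * W
    cap_per_area = beta                            # eps / D
    gate_cap = cap_per_area * gate_area            # C0 * Ag
    channel_charge = cap_per_area / beta           # charge per unit area, C0 * Vgs
    channel_resistance = 1 / channel_charge        # (L / W) / (Qon * mobility)
    delay = channel_resistance * gate_cap          # Ron * Cg
    current = cap_per_area / beta**2               # C0 * (W / L) * Vgs**2
    switching_energy = gate_cap / beta**2          # Cg * VDD**2
    power_per_gate = switching_energy / delay      # Eg * f0
    return {
        "gate_area": gate_area,
        "gate_cap_per_area": cap_per_area,
        "gate_cap": gate_cap,
        "channel_charge": channel_charge,
        "channel_resistance": channel_resistance,
        "delay": delay,
        "max_frequency": 1 / delay,
        "current": current,
        "switching_energy": switching_energy,
        "power_per_gate": power_per_gate,
        "power_per_area": power_per_gate / gate_area,
        "power_speed_product": power_per_gate * delay,
    }


constant_field = scaling_multipliers(alpha=2, beta=2)
constant_voltage = scaling_multipliers(alpha=2, beta=1)
for name in constant_field:
    print(f"{name:22s} E-field: {constant_field[name]:<8g} V: {constant_voltage[name]:g}")
```

Under these assumptions, constant-field scaling keeps power per unit area unchanged, while constant-voltage scaling gives a larger speed gain but raises power density by alpha squared.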

Introduction
Wiring-up of chip devices takes place through various conductors produced during processing.
Today, interconnects constitute the main source of delay in MOS circuits.
We will examine:
Sheet Resistance (resistance per square)
Area Capacitance
Delay Units
CMOS Inverter Delay
Rise and Fall Time Estimation

Sheet Resistance

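Resistance of a square slab of material
$R_{AB} = \rho L / A$, where $A = tW$ is the cross-sectional area (thickness $t$, width $W$)
$\Rightarrow R_{AB} = \rho L / (tW)$
Let $L = W$ (square slab)
$\Rightarrow R_{AB} = \rho / t = R_s$ ohm / square

In general, $R_{AB} = Z R_s$, where $Z = L/W$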

Typical sheet resistance values for materials are very well characterized

Typical Sheet Resistances for 5 µm Technology

Layer                   Rs (Ohm / Sq)
Aluminium               0.03
N Diffusion             10 - 50
Silicide                2 - 4
Polysilicon             15 - 100
N-transistor Channel    10^4
P-transistor Channel    2.5 x 10^4

N-type Minimum Feature Device

Minimum-size device (polysilicon gate over N-diffusion, L = 2λ, W = 2λ):
R = 1 sq x Rs = Rs = 10^4 ohm

Longer device (polysilicon gate over N-diffusion, L = 8λ, W = 2λ):
R = Z Rs
R = (L/W) * Rs
R = 4 x 10^4 ohm
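
The R = Z Rs rule can be checked with a few lines of code. This is a minimal sketch: the function name is hypothetical, and the dimensions are expressed in units of λ on the assumption that the second device has L/W = 4, which is what its stated resistance of 4 x 10^4 ohm implies.

```python
def channel_resistance(rs_ohm_per_square, length, width):
    """Resistance of a rectangular region: number of squares (L/W) times Rs."""
    squares = length / width
    return squares * rs_ohm_per_square


RS_N_CHANNEL = 1e4   # ohm per square, n-transistor channel

# Dimensions are in units of lambda; only the ratio L/W matters.
print(channel_resistance(RS_N_CHANNEL, length=2, width=2))   # minimum-size device
print(channel_resistance(RS_N_CHANNEL, length=8, width=2))   # device with L/W = 4
```

Only the aspect ratio matters, so the result is independent of the absolute feature size.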

Area Capacitance of Layers


Conducting layers are separated from each other by insulators (typically SiO2).
This may constitute a parallel plate capacitor, $C = \varepsilon_0 \varepsilon_{ox} A / D$ (farads),
where D = thickness of oxide, A = area, and $\varepsilon_{ox} \approx 4$ is the relative permittivity of SiO2.
Area capacitance is given in pF/µm².
The standard unit for a technology node is the gate-to-channel capacitance of the minimum-sized
transistor (2λ x 2λ), given as □Cg.
This is a technology-specific value.

Delay Unit
For a feature-size square gate, τ = Rs x □Cg
i.e. for 5 µm technology, τ = 10^4 ohm/sq x 0.01 pF = 0.1 ns
Because of the effects of parasitics, which we have not considered in our model, the delay is
typically of the order of 0.2 - 0.3 ns.
Note that τ is very similar to the channel transit time τsd.

CMOS Inverter Delay


Pull-down delay = Rpd x 2□Cg
Pull-up delay = Rpu x 2□Cg
Asymmetry in rise and fall times is due to the resistance difference between pull-up and
pull-down (a factor of 2.5, due to the mobilities of the carriers).
Delay through a pair of inverters is 2τ (fall time) + 5τ (rise time).
Delay through a pair of CMOS inverters is therefore 7τ.
Asymmetry can be improved by reducing the resistance of the pull-up.
Reduce the resistance of the pull-up by increasing its channel width (typically by a factor of 2.5).
Note that increasing the channel width also increases the capacitance.
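
The 7τ figure follows directly from the delay unit τ = Rs x □Cg. The sketch below redoes the arithmetic with the 5 µm values quoted earlier (Rs = 10^4 ohm/sq, □Cg = 0.01 pF); the variable names are illustrative, and each minimum-size inverter is assumed to drive a load of 2□Cg.

```python
RS_N = 1e4              # ohm per square, n-channel pull-down (minimum size, 1 square)
CG = 0.01e-12           # farads, standard gate capacitance for 5 um technology
MOBILITY_RATIO = 2.5    # pull-up resistance relative to pull-down

tau = RS_N * CG                                    # delay unit
pull_down_delay = RS_N * 2 * CG                    # Rpd x 2 Cg
pull_up_delay = MOBILITY_RATIO * RS_N * 2 * CG     # Rpu x 2 Cg
pair_delay = pull_down_delay + pull_up_delay

print(f"tau        = {tau * 1e9:.2f} ns")
print(f"pull-down  = {pull_down_delay / tau:.1f} tau")
print(f"pull-up    = {pull_up_delay / tau:.1f} tau")
print(f"pair delay = {pair_delay / tau:.1f} tau")
```

Widening the pull-up by 2.5 would equalize the two resistances, but the wider gate also presents more capacitance to the previous stage, so the gain is smaller than the resistance ratio alone suggests.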

CMOS Inverter Rise and Fall Time Estimation

$\tau_f \approx 3 C_L / (\beta_n V_{DD})$
$\tau_r \approx 3 C_L / (\beta_p V_{DD})$
(Derivations for the above are in Pucknell and Eshraghian, pages 105 - 107)
So, $\tau_r / \tau_f = \beta_n / \beta_p$
Given that (due to mobilities) $\beta_n = 2.5 \beta_p$, the rise time is slower by a factor of 2.5
when using minimum dimensions of n and p transistors.

Large Capacitive Loads


When trying to drive off-chip loads, large capacitances are often encountered.
Off-chip capacitances can be of the order of 10^4 □Cg.
Inverters intended to drive large capacitive loads must therefore present low pull-up and
pull-down resistances.
The technique employed is to cascade a series of inverters, each one of which is larger than the
preceding one by a width factor f.

Cascaded Inverters
If N is the number of stages required to drive the load, f is the width scaling factor, CL is the
load capacitance, and Cmin is the minimum capacitance, then
y = CL/Cmin = f^N
The total delay is minimized if f = e, the base of natural logarithms.
Details of this derivation are given in Pucknell and Eshraghian, pages 107 - 109.
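
The optimum at f = e can be checked numerically. The sketch assumes a first-order model in which each stage's delay is proportional to f (its load relative to its own input capacitance), so total delay is proportional to N x f with N = ln(y)/ln(f). It ignores self-loading and the fact that N must be an integer; the function names and the load ratio y = 10^4 are illustrative.

```python
import math


def stages_needed(y, f):
    """Number of stages N satisfying y = f**N (not rounded)."""
    return math.log(y) / math.log(f)


def relative_delay(y, f):
    """Total delay in units of one minimum-inverter delay: N stages of delay f each."""
    return f * stages_needed(y, f)


y = 1e4   # load capacitance relative to the minimum gate capacitance
candidates = [1.5, 2.0, math.e, 3.0, 4.0, 8.0]

for f in candidates:
    print(f"f = {f:5.3f}: N = {stages_needed(y, f):5.2f}, delay = {relative_delay(y, f):6.2f}")

best = min(candidates, key=lambda f: relative_delay(y, f))
print(f"best width factor among candidates: {best:.3f}")
```

The minimum is shallow: in this model f = 3 or f = 4 costs only a few percent more delay than f = e while needing fewer stages and less area.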

Power and Energy


Power is drawn from a voltage source attached to the VDD pin(s) of a chip.
Instantaneous Power: $P(t) = i_{DD}(t) V_{DD}$
Energy: $E = \int_0^T P(t)\,dt = \int_0^T i_{DD}(t) V_{DD}\,dt$
Average Power: $P_{avg} = \frac{E}{T} = \frac{1}{T} \int_0^T i_{DD}(t) V_{DD}\,dt$

CMOS Power Dissipation


Power dissipation in CMOS circuits comes from two components, i.e. Ptotal = Pstatic + Pdynamic
Static dissipation due to:
subthreshold conduction through OFF transistors
tunneling current through gate oxide
leakage through reverse-biased diodes
contention current in ratioed circuits
Dynamic dissipation due to:
charging and discharging of load capacitances
short circuit current while both PMOS and NMOS networks are partially ON
Ratioed circuits (e.g. pseudo-NMOS) have more static dissipation.

Dynamic Power Dissipation


Dynamic power is required to charge and discharge load capacitances when transistors switch.
One cycle involves a rising and falling output.
On rising output, charge Q = C VDD is required.
On falling output, charge is dumped to GND.
This repeats T fsw times over an interval of T.

[Figure: load capacitance C charged from VDD by iDD(t), switching at frequency fsw]

Dynamic Power Cont.

$$P_{dynamic} = \frac{1}{T}\int_0^T i_{DD}(t) V_{DD}\,dt = \frac{V_{DD}}{T}\int_0^T i_{DD}(t)\,dt = \frac{V_{DD}}{T}\left[T f_{sw} C V_{DD}\right] = C V_{DD}^2 f_{sw}$$

Activity Factor
Suppose the system clock frequency = f
Let fsw = αf, where α = activity factor
If the signal is a clock, α = 1
If the signal switches once per cycle, α = 1/2
Dynamic gates:
Switch either 0 or 2 times per cycle, α = 1/2
Static gates:
Depends on design, but typically α = 0.1
Dynamic power:
$P_{dynamic} = \alpha C V_{DD}^2 f$
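
A short calculation shows how each term of the dynamic power expression contributes. The function name and all numerical values (capacitance, supply, frequency) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def dynamic_power(activity, capacitance, vdd, frequency):
    """Dynamic power: activity factor x switched capacitance x VDD^2 x clock frequency."""
    return activity * capacitance * vdd**2 * frequency


C_TOTAL = 100e-12   # assumed total switched capacitance, farads
VDD = 1.2           # assumed supply voltage, volts
F_CLK = 1e9         # assumed clock frequency, hertz

p_static_gates = dynamic_power(0.1, C_TOTAL, VDD, F_CLK)
p_clock_net = dynamic_power(1.0, C_TOTAL, VDD, F_CLK)
p_half_vdd = dynamic_power(0.1, C_TOTAL, VDD / 2, F_CLK)

print(f"static gates (activity 0.1): {p_static_gates * 1e3:.1f} mW")
print(f"clock network (activity 1):  {p_clock_net * 1e3:.1f} mW")
print(f"static gates at VDD/2:       {p_half_vdd * 1e3:.1f} mW")
```

Because VDD enters squared, halving the supply cuts dynamic power to a quarter, whereas halving the activity factor, capacitance or frequency only halves it.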

Short Circuit Current


When transistors switch, both nMOS and pMOS networks may be momentarily ON at once.
This leads to a blip of short circuit current.
It is < 10% of dynamic power if rise/fall times are comparable for input and output.

Static Power
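Static power is consumed even when the chip is quiescent:
Ratioed circuits burn power in the fight between ON transistors.
Leakage draws power from nominally OFF devices:

$$I_{ds} = I_{ds0}\, e^{\frac{V_{gs} - V_t}{n v_T}} \left(1 - e^{-\frac{V_{ds}}{v_T}}\right)$$

$$V_t = V_{t0} - \eta V_{ds} + \gamma\left(\sqrt{\phi_s + V_{sb}} - \sqrt{\phi_s}\right)$$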
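
The exponential dependence of leakage on threshold voltage can be made concrete with a small sketch of the standard subthreshold model, Ids = Ids0 exp((Vgs - Vt)/(n vT)) (1 - exp(-Vds/vT)). The function name and every parameter value below (Ids0, n = 1.4, vT = 26 mV, the two threshold voltages) are assumptions for illustration, not data for a real process.

```python
import math


def subthreshold_current(vgs, vds, vt, ids0=1e-7, n=1.4, v_thermal=0.026):
    """Subthreshold drain current in amperes (all default parameters are assumed values)."""
    gate_term = math.exp((vgs - vt) / (n * v_thermal))
    drain_term = 1.0 - math.exp(-vds / v_thermal)
    return ids0 * gate_term * drain_term


# An OFF transistor: Vgs = 0 with the full supply across drain and source.
i_high_vt = subthreshold_current(vgs=0.0, vds=1.0, vt=0.4)
i_low_vt = subthreshold_current(vgs=0.0, vds=1.0, vt=0.3)

print(f"leakage at Vt = 0.4 V: {i_high_vt:.3e} A")
print(f"leakage at Vt = 0.3 V: {i_low_vt:.3e} A")
print(f"ratio: {i_low_vt / i_high_vt:.1f}x")
```

With these assumed parameters a 100 mV reduction in Vt raises the off-current by more than an order of magnitude, which is why low-Vt devices are best used selectively.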

Low Power Design


Reduce dynamic power
Activity factor α: clock gating, sleep mode
C: small transistors (esp. on clock), short wires
VDD: lowest suitable voltage
f: lowest suitable frequency

Reduce static power
Selectively use ratioed circuits
Selectively use low Vt devices
Leakage reduction: stacked devices, body bias, low temperature

THANK YOU
