Design and Optimization Techniques of High-Speed VLSI Circuits

Marco Delaurenti
Politecnico di Torino
PhD Dissertation
December 1999
Advisor: Prof. Maurizio Zamboni
Coordinator: Prof. Ivo Montrosset
Copyright © 1999 Marco Delaurenti
Writing comes more easily
if you have something
to say.
(Sholem Asch)
“When I use a word,”

Humpty Dumpty said in
rather a scornful tone, “it
means just what I choose
it to mean—neither more
nor less.”
(Lewis Carroll)
Acknowledgments
First of all I would like to thank my advisor, Prof. M. Zamboni, together
with Prof. G. Piccinini and Prof. G. Masera, for their invaluable help, and
Prof. P. Civera for being a bridge toward the real world. Many thanks also
to the VLSI-LAB members at Politecnico di Torino, Italy: Mario for his input
about the critical paths (no, I do not thank you for the jazz songs that you
play all day long), Luca for the long discussions about books and movies
(no, I haven't seen Kubrick's last movie), Andrea for his very good
cocktails (especially the "Negroni") and Danilo, because I forgot him every
time we went to lunch. Thanks also to Max (for giving me the root password),
and to "Yuan & Svensson" for the invention of the TSPC.
Special thanks, finally, to Mg, for her support and for having tolerated
me until now.
CONTENTS
Preface
Part I CMOS Logic
1. Introduction to CMOS logic
   1.1 Introduction
   1.2 CMOS logic families
       1.2.1 Static logic families
       1.2.2 Dynamic logic families
   1.3 Conclusion
Part II Circuit Modeling
2. A simple model
   2.1 The Elmore’s model
   2.2 Conclusions
3. A complex model
   3.1 The FAST model
       3.1.1 MOS equations
       3.1.2 Internal nodes approximation
       3.1.3 Body effect
   3.2 Delay estimation
       3.2.1 Equation solving
   3.3 Power estimation
       3.3.1 Switching energy
       3.3.2 Short–circuit energy
       3.3.3 Sub–threshold energy
   3.4 Results
   3.5 Conclusions
Part III Optimization
4. Mathematic Optimization
   4.1 Optimization theory
       4.1.1 Mono-objective optimization
             4.1.1.1 Unconstrained problem
             4.1.1.2 Constrained problem
                     Lagrange multiplier and Penalty functions
       4.1.2 Multi-objective optimization
             4.1.2.1 Unconstrained
             4.1.2.2 Constrained
                     Compromise solution
   4.2 Optimization Algorithms
       4.2.1 One-dimensional search techniques
             4.2.1.1 The section search
                     Dicotomic search
                     Fibonacci Search
                     The golden section search
                     Convergence considerations
             4.2.1.2 Parabolic interpolation
                     The Brent’s rule
       4.2.2 Multi-dimensional search
             4.2.2.1 The gradient direction: steepest (maximum) descent
             4.2.2.2 The optimal gradient
                     Convergence considerations
       4.2.3 The conjugate direction method
             4.2.3.1 The Fletcher–Reeves conjugate gradient algorithm
             4.2.3.2 The Powell conjugate gradient algorithm
       4.2.4 The “SLOP” algorithm
       4.2.5 The simulated-annealing algorithm
   4.3 Conclusions
5. Circuit Optimization
   5.1 Optimization targets
       5.1.1 Circuit delay
                     Critical Paths
             5.1.1.1 Delay formula obtained by the Elmore model
             5.1.1.2 Delay measurement obtained by the FAST model and by HSPICE
       5.1.2 Power consumption
       5.1.3 Area
   5.2 Optimization examples
       5.2.1 Algorithm choice
       5.2.2 Mono-objective optimizations
             5.2.2.1 Area
             5.2.2.2 Power
             5.2.2.3 Delay
       5.2.3 Multi-objective optimizations
   5.3 Conclusion
6. A CAD tool for optimization
   6.1 Logical description
       6.1.1 The optimization algorithm module (OAM)
       6.1.2 The function evaluation module (FEM)
       6.1.3 Core engine
   6.2 Code implementation
       6.2.1 The classes CircuitNetlist and Circuit
       6.2.2 The class EvaluationAlgorithm
       6.2.3 The class OptimizationAlgorithm
       6.2.4 The critical path retrieving
       6.2.5 The derived classes
   6.3 Program flows
   6.4 Conclusions
7. Results and conclusions
   7.1 Optimization
       7.1.1 Mono-objective vs. Multiobjective
   7.2 Conclusions
   7.3 Future works
Appendix
A. Class graph
B. Source code
   B.1 Main functions
   B.2 Optimization algorithms
   B.3 Simulators
LIST OF FIGURES
1.1 Static “and”
1.2 Pass-transistor logic xor
1.3 Domino typical gate
1.4 CVSL typical gate
1.5 C²MOS typical gate
1.6 TSPC Latches
2.1 RC MOS equivalence
2.2 RC chain
2.3 RC single cell
2.4 Elmore impulse response
3.1 Inverter voltages waveform
3.2 Mos chain
3.3 Node voltages
3.4 Voltages wave form in the n–MOS chain
3.5 Voltages wave forms in the p–MOS chain
3.6 VDS and VGS
3.7 MOSFET chain with static voltages
3.8 Threshold variation
3.9 Delay comparison
3.10 Energy comparison
4.1 Section search
4.2 Minimization by Powell algorithm
4.3 Minimization by Powell algorithm
4.4 Minimization by SLOP algorithm
4.5 Minimization by Simulated-annealing algorithm
4.6 Minimization by Simulated-annealing algorithm
5.1 Design flow
5.2 Delay definition
5.3 Critical paths
5.4 Critical path tree
5.5 Elmore delay
5.6 Elmore delay
5.7 HSPICE delay
5.8 FAST delay
5.9 HSPICE Energy
5.10 CMOS Inverter
5.11 TSPC Latches
5.12 TSPC And gates
5.13 TSPC Or gates
5.14 Static and-or gate
5.15 Static parity gate
5.16 Static full-adder
5.17 TSPC full-adder (one-stage)
6.1 Tool block diagram
7.1 Comparison of 0.7 µm and 0.25 µm gates @ minimum technology width
7.2 Delay optimization of 0.7 µm gates
7.3 Delay optimization of 0.25 µm gates
7.4 Technology comparison of delay optimization
7.5 Several delay–power optimization policies of 0.7 µm gates
7.6 Energy-dissipation variation (zoom of figure 7.5(b))
7.7 Several delay–power optimization policies of 0.25 µm gates
7.8 Energy-dissipation variation (zoom of figure 7.7(b))
7.9 Delay–power optimization (50%–50%) comparison of 0.7 µm and 0.25 µm gates
7.10 Delay and power trajectory during 4 different multi-objective optimizations for the and–or gate
7.11 Delay and power trajectory during 4 different multi-objective optimizations for the parity gate
7.12 Delay and power trajectory during 4 different multi-objective optimizations for the static full-adder
7.13 Delay and power trajectory during 4 different multi-objective optimizations for the dynamic full-adder
LIST OF TABLES
3.1 Mean Error
3.2 Execution time
4.1 Optimization algorithms
5.1 Basic gates: complexity
5.2 Basic gates: pre-optimization delay, power consumption and area
5.3 Full-adder: delay optimization
5.4 Agreements of targets
5.5 Full-adder: delay and power optimization
5.6 Full-adder: optimizations comparison
7.1 Library gates list
7.2 Delay and energy dissipation @ minimum width (HSPICE)
7.3 Delay decreasing and energy increasing (both relative) in a delay optimization
7.4 Elapsed time and total number of function evaluations for a full-delay optimization with HSPICE (on a ULTRA-sparc 5)
7.5 Constrained delay optimization of a few 0.25 µm gates
7.6 Delay worsening and energy improvement between a full delay optimization and delay-power optimization
Preface
The design of high-speed integrated circuits is a long and complex
operation; nonetheless, the total time-to-market required to go from the
idea to the silicon masks keeps shrinking.
To help the designer along this long and winding road, several CAD tools
are available. In the first step the only thing that exists is the
description of the circuit behaviour (the “idea”); in the central step of
the design flow the designer knows only the logic function of each block
composing the circuit, but not the technology implementation of these
blocks; in the last steps, finally, the designer knows exactly the
technology implementation of every single gate of the circuit, and can
“compose” the final layout from these gates. It goes without saying that
CAD tools are nowadays of vital importance in the design flow, and moreover
that the quality of such tools strongly influences the quality of the final
design.
Among all the possible instruments, the optimization tools have a primary
role in all the phases of a project, starting from the optimization at the
higher levels and descending to the optimization made at the electrical
level.
This thesis focuses on developing new strategies and new techniques for
the optimization made at the transistor dimension level, that is, the one
done by the cell library engineer, and also on developing a CAD instrument
to make this work as painless as possible.
Part I
CMOS LOGIC
Chapter 1
INTRODUCTION TO CMOS LOGIC
The optimization of VLSI circuits involves the optimization of single CMOS
cells. This chapter briefly reviews the basic CMOS logic families, with
their pros and cons. The goal is simply to pick, among the static and
dynamic logic families, the ones that are most appealing for use in VLSI
circuits and, in some measure, the most widely used, and then to apply to
them the optimization techniques shown in the next chapters.
1.1 Introduction
We might ask: why optimize a single cell in a VLSI circuit, when design
nowadays is shifting toward higher and higher levels?
Some answers could be:
• Need for re-usable library cells. This makes it easier to reuse the same
library for different projects. It is a must nowadays, in order to reduce
the total time to target/market.
• An optimized library makes the design at higher levels easier:
floorplanning and routing can have “relaxed” constraints, since the gates
have a better “behaviour”. It is possible to reduce the time spent
repeating critical steps like floorplanning or routing until all the
specifications are met: these specifications are met earlier, since the
cells globally have a better behaviour.
• Need for several equivalent libraries with different kinds of
optimization. It is possible to have different libraries that have
different specifications but are functionally equivalent, so that different
versions of a project can be created simply by substituting the basic
library. It would be possible, for example, to have, for the same project,
a version that runs at full speed and a version optimized for low power
dissipation.
This swapping of libraries does not involve the higher levels of design,
for it is totally transparent to the designer during floorplanning or
routing. Just before the layout production, during the cell mapping,
it is possible to choose the library onto which the project is mapped.
These answers have led us to consider producing a tool able to perform the
optimization of a cell library in a way that suits the designer. The goal
is to produce results showing that this optimization is worthwhile during a
design cycle, and also to make the insertion of the tool into a design
cycle as smooth as possible.
In order to attain results that are related to a real production cycle, we
have to choose cells that are commonly found in a real library.
For this purpose we give a very brief description of the most used CMOS
logic families, and among them we choose the cells used to develop and
test the optimization framework.
1.2 CMOS logic families
The first basic distinction among the CMOS logic families is between
static logics and dynamic logics ([1]).
Static logic: A static logic is a logic in which the functioning of the
circuit is not synchronized by a global signal, namely the clock of the
circuit. The output is solely a function of the inputs of the circuit, and
it is asynchronous with respect to them. The timing of the circuit is
defined exclusively by its internal delay.
Dynamic logic: A dynamic logic is a logic in which the output is
synchronized by a global signal, viz. the clock. The output is, then, a
function both of the inputs of the circuit and of the clock signal, and the
timing of the circuit is defined both by its internal delay and by the
timing of the clock.
Both the static and the dynamic logics comprise several logic families.
1.2.1 Static logic families
The principal static families are:
Conventional static logic It is the logic normally referred to when
speaking of static logic. A static circuit has the same number of NMOS and
PMOS transistors, and the n and p branches are the dual of each other. As
an example see figure 1.1, which represents a static “and” gate. It has two
NMOS transistors connected in series and two PMOS transistors connected in
parallel.
Fig. 1.1: Static and (OUT = A and B)
The static logic is quite fast, does not dissipate power in steady state
and has a very good noise margin.
Pseudo-NMOS It is an evolution of the now obsolete NMOS logic. It is
obtained by substituting the whole PMOS branch of a static logic with a
single PMOS transistor with its gate connected to ground. This PMOS is
therefore always conducting and pulls the output node to the high state.
When the NMOS branch also conducts, the output discharges, provided that
the ratio between the NMOS and PMOS transistors is well designed.
This logic is cited here only for historical reasons, since it is not very
fast, it dissipates static power in steady state (when the output is in the
low state) and it is sensitive to noise.
Pass-logic The pass-logic is a relatively new logic and, for many digital
designs, implementation in pass-transistor logic (PTL) has been shown to be
superior to static CMOS in terms of area, timing, and power
characteristics.
As an example see figure 1.2.
Fig. 1.2: Pass-transistor logic xor (OUT = A xor B)
1.2.2 Dynamic logic families
The principal dynamic families have one characteristic in common: every
dynamic logic needs a pre-charge (or pre-discharge) transistor to bring
some pre-charged nodes to a known state. This is done during the working
phase known as the pre-charge phase or memory phase; during the other
working phase, the evaluation phase, the output has a stable value¹.
¹ This brief introduction is limited to systems that have a single global
clock, or one phase, where the word phase is intended as a synonym of
clock, and not, as above, as a synonym of working period. There are systems
that have two, or even four, phases, but they are not introduced here. The
basic functioning, however, remains the same.
The principal dynamic logics are further divided into two sub-families,
pipelined and non-pipelined. The first two of the following are
non-pipelined, while the others are pipelined:
Domino logic and N–P Domino logic The typical domino gate is depicted
in figure 1.3.
Fig. 1.3: Domino typical gate (signals: CLOCK, INPUTs, OUT; NMOS block)
During the pre-charge phase the clock is in its low state, so that the
pre-charged node before the static inverter is high and the output is
low. During the evaluation phase the clock is high, so that the inputs
of the n-block (which can perform any logical function) can discharge
the pre-charged node and bring the output to the high state.
We can cascade several of these gates, given that each gate has its
own output inverter, and we can drive every gate with the same clock
signal, provided that the evaluation phase lasts long enough for all
the gates to finish evaluating their inputs. This last fact explains why
this is a non-pipelined logic: the output of every cell is available as
soon as the cell has finished its evaluation phase.
Moreover this logic has a limited area occupancy, since it has a low
number of PMOS transistors. On the other hand it is not possible to
implement inverting structures and, like all the other dynamic logics,
this logic is subject to the charge-sharing problem².
² The charge-sharing problem, or charge redistribution, is a problem that
affects the dynamic logics. Basically the charge stored in a pre-charged
node during the memory phase does not remain fully stored in it. Consider a
domino gate during the pre-charge phase, when the clock is low. If one
input of the n-block is high, then its corresponding transistor is
conducting. The n-branch is still not conducting, since the clocked NMOS
transistor is off, but some charge from the pre-charged node can flow to
other nodes via the conducting transistors in the n-block. This
redistribution of charge is simply a capacitive charge partition, and it
leads to a voltage on the pre-charged node lower than the high state.
This problem can produce logic errors, and surely diminishes the noise
margins.
A natural evolution of the domino logic is the N-P domino logic, or
zipper logic. It consists of two typical cells: the one depicted in
figure 1.3, and its dual, obtained simply by swapping the n-block with a
p-block and the PMOS pre-charge transistor with an NMOS pre-discharge
transistor driven by the negated clock.
This logic has a lower area occupancy, since there is no need for a static
inverter, but it also has a lower speed, due to the presence of PMOS
transistors.
Cascode voltage switch logic (CVSL) The CVSL is part of the large family
of differential logics. It needs both the inputs and the negated inputs,
and two complementary n-blocks that perform the logic function, as can be
seen in figure 1.4.
Fig. 1.4: CVSL typical gate
It has the advantage of being quite fast, since the positive feedback of
the two PMOS transistors accelerates the switching of the gate, and it also
has very good noise margins. Moreover it produces both the outputs and the
negated outputs without needing an inverter. As a drawback, it has a large
area occupancy.
C²MOS logic The typical C²MOS gate is shown in figure 1.5. It is basically
a three-state gate, since when the clock is in the low state the output
floats in the high-impedance state.
Fig. 1.5: C²MOS typical gate (PMOS block and NMOS block, each with its
INPUTs, separated by the clocked transistors; output OUT)
It is principally used as a dynamic latch, as an interface between static
logics and dynamic pipelined logics.
NO RAce logic (NORA) The NORA logic (the name is an acronym of no race) is
an evolution of the N-P domino logic. The static inverter of the domino
logic is substituted with a C²MOS inverter. This is the first of the
pipelined logics, since the output of every gate is available only when
the clock switches its state, and not before.
Since the output stage of every cell is also dynamic (a C²MOS inverter),
this logic is more subject to the charge-sharing problem than the domino
logic is.
True Single Phase Clock logic (TSPC) The final evolution of the NORA is
the TSPC logic, or true single phase clock logic ([2]).
The TSPC logic is an n-p logic, since each gate exists in an n-version
and a p-version. For example the n-latch and the p-latch are shown
in figure 1.6.
Fig. 1.6: TSPC Latches: (a) Type n, (b) Type p
The ultimate advantage of the TSPC logic is the presence of a single
clock, since its internal structure does not require the negated clock.
The TSPC logic is among the fastest dynamic families, and it is certainly
very appealing for the very low number of transistors it employs.
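The charge-sharing problem mentioned for the domino and NORA families can be illustrated with a first-order charge-conservation estimate. The sketch below is only an illustration under stated assumptions: the function name, the capacitance values and the 3.3 V supply are hypothetical and do not come from the library studied in this thesis; the internal node of the n-block is assumed to be fully discharged before the sharing takes place.

```python
def charge_sharing_voltage(c_pre, v_pre, c_int, v_int=0.0):
    """Voltage left on a pre-charged node after sharing its charge.

    c_pre, v_pre: capacitance (F) and initial voltage (V) of the pre-charged node
    c_int, v_int: capacitance (F) and initial voltage (V) of the internal node
                  reached through a conducting transistor of the n-block
    The total charge is conserved and spreads over the two capacitances.
    """
    total_charge = c_pre * v_pre + c_int * v_int
    return total_charge / (c_pre + c_int)


if __name__ == "__main__":
    # Hypothetical values: 30 fF pre-charged node, 10 fF internal node, 3.3 V supply.
    v_after = charge_sharing_voltage(30e-15, 3.3, 10e-15)
    print(f"pre-charged node drops from 3.3 V to {v_after:.3f} V")
```

With these assumed values the pre-charged node loses a quarter of its voltage, which shows how the noise margin shrinks as the internal capacitance grows with respect to the pre-charged one.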
1.3 Conclusion
After this very brief introduction to several CMOS families, we chose two
different logics to which to apply the optimization techniques that are
the object of this thesis. The criteria that drove us in choosing these
families were both their diffusion in VLSI circuits and the presence of
very good qualities, perhaps not yet fully exploited in the real production
of circuits.
For these reasons we have chosen to include in our library a few static
gates (an “and” gate, an “or” gate, and a few more) and a few dynamic
gates, in particular gates from the TSPC family. This family has shown
good characteristics in terms of speed, area occupancy and power
dissipation; it also has the very important feature of needing only a
single clock.
The complete list of the gates in the library can be found in table 7.1
(page 122), together with the schematic diagrams of their CMOS
implementation.
Part II
CIRCUIT MODELING
Chapter 2
A SIMPLE MODEL
The first model applied to the calculation of the delay in MOS circuits is
Elmore’s model ([3]). It is a simple RC delay model, and it is the basis of
a switch-level MOS model (figure 2.1): the generic MOS is represented,
during the ON state, by its dynamic resistance between the drain pin and
the source pin, and by the parasitic capacitances and resistances at the
drain and source pins.
Fig. 2.1: RC MOS equivalence
If this simple MOS model is valid, then Elmore’s delay formula can be used
in every structure containing MOS transistors. Elmore’s formula is
appealing for its simplicity and its ease of use; however, the accuracy of
the formula can worsen in the deep sub-micron domain, since modeling a MOS
through its resistance is no longer valid.
Since the use of Elmore’s model is mostly limited to comparisons with
other models, or to introductions to delay modelling, section 2.1 presents
only the very basics of Elmore’s model and section 2.2 draws the
conclusions about the use of this model for VLSI circuits.
2.1 The Elmore’s model
Elmore’s model, or Elmore’s delay formula, can predict the delay of an RC
chain such as the one shown in figure 2.2.
Fig. 2.2: RC chain
In order to obtain the formula, let’s start with a single RC cell, as shown
in figure 2.3. We can express the voltage $V_1(t)$ by means of a
differential equation such as:
\[ C_0 \frac{dV_1}{dt} = \frac{V_0(t) - V_1(t)}{R_0} \tag{2.1} \]
Integrating equation (2.1) for a step input, we can write
\[ V_1 = V_0(t) \left[ 1 - e^{-\frac{t}{R_0 C_0}} \right]. \]
Fig. 2.3: RC single cell
The time constant is $\tau = R_0 C_0$, and with $t = \tau$ we obtain:
\[ V_1 = 0.63\, V_0(t). \]
So the time $t_D = \tau$ represents the 63% delay from $V_0(t)$ to
$V_1(t)$. Extending the formula of the time constant to the chain of
figure 2.2, we obtain:
\[ t_D = \sum_{i=0}^{N} \sum_{j=0}^{i} R_j C_i . \]
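As a quick numerical check of the chain formula, the double sum can be evaluated by accumulating the upstream resistance seen by each capacitance. The following is a minimal sketch; the function names and the component values are illustrative assumptions, not part of the tool described later in this thesis.

```python
import math
from itertools import accumulate


def elmore_delay(resistances, capacitances):
    """Input-output Elmore delay of an RC chain.

    resistances[i] is R_i and capacitances[i] is C_i, ordered from the
    input toward the output. Each C_i is charged through R_0 + ... + R_i.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("one capacitance is expected per resistance")
    upstream = accumulate(resistances)  # R_0, R_0 + R_1, ...
    return sum(r_up * c for r_up, c in zip(upstream, capacitances))


def step_response_fraction(t, tau):
    """Fraction of the input step reached by a single RC cell at time t."""
    return 1.0 - math.exp(-t / tau)


if __name__ == "__main__":
    # Hypothetical two-cell chain: 1 kOhm / 10 fF followed by 2 kOhm / 20 fF.
    t_d = elmore_delay([1e3, 2e3], [10e-15, 20e-15])
    print(f"Elmore delay: {t_d * 1e12:.1f} ps")
    # For a single cell the delay is tau, where the output is at about 63%.
    print(f"V1/V0 at t = tau: {step_response_fraction(1.0, 1.0):.2f}")
```

For a single cell the function returns $R_0 C_0$, consistent with the 63% delay derived above.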
This delay is the input-output delay. When the delay between the input and
one of the inner nodes is needed, a more complex (semi-empirical) formula
can be used; for example, with N = 2:
\[ t_1 = R_0 C_0 + q R_1 C_1 \quad \text{(delay from the input node to the first node)} \]
\[ t_2 = R_0 C_0 + (R_0 + R_1) C_1 \quad \text{(delay from the input node to the output node)} \]
where q is:
\[ q = \begin{cases}
\sqrt{\dfrac{R_0}{R_0 + R_1}} & \text{if } R_1 \le 2R_0, \\[2ex]
\dfrac{R_0 C_0}{R_0 C_0 + R_1 C_1} & \text{if } R_1 > 2R_0.
\end{cases} \]
The first case (with $R_1 \le 2R_0$) is named strong coupling, while the
second one is named weak coupling.
Given the unit impulse response h(t) (figure 2.4) of the output node of
the RC tree, Elmore proposed to approximate the delay τ by the mean of
h(t), considering h(t) as a distribution.
Fig. 2.4: Elmore impulse response (h(t) versus t, with mean m)
The 50% delay is given by:
\[ \int_0^{\tau} h(t)\,dt = 0.5 \]
while the original work of Elmore proposed:
\[ t_D = m = \int_0^{\infty} t \cdot h(t)\,dt \]
with
\[ \int_0^{\infty} h(t)\,dt = 1. \]
This approximation is valid only when h(t) is a symmetrical distribution,
as in figure 2.4, while in real cases the h(t) distribution is
asymmetrical; however, [4] proves that the Elmore approximation is an
upper bound for the 50% delay even when the impulse response is not
symmetrical and, furthermore, that the real delay asymptotically
approaches the Elmore bound as the input signal rise (or fall) time
increases.
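The difference between the mean of h(t) and the 50% delay can be seen on the simplest asymmetrical case, the single RC cell, whose impulse response is $h(t) = e^{-t/\tau}/\tau$. The sketch below integrates h(t) numerically with the midpoint rule; the function names, the step count and the integration span are assumptions made only for this illustration.

```python
import math


def impulse_response(t, tau):
    """Unit impulse response of a single RC cell with time constant tau."""
    return math.exp(-t / tau) / tau


def elmore_mean_and_median(tau, steps=200_000, span=40.0):
    """Return (mean of h, 50% delay) computed by midpoint integration.

    The integration runs from 0 to span * tau, where h(t) is negligible.
    """
    dt = span * tau / steps
    area = 0.0          # running integral of h(t)
    first_moment = 0.0  # running integral of t * h(t)
    t50 = None
    for k in range(steps):
        t_mid = (k + 0.5) * dt
        h = impulse_response(t_mid, tau)
        area += h * dt
        first_moment += t_mid * h * dt
        if t50 is None and area >= 0.5:
            t50 = (k + 1) * dt  # first time at which half of the area is covered
    return first_moment, t50


if __name__ == "__main__":
    mean, t50 = elmore_mean_and_median(1.0)
    print(f"Elmore delay (mean of h): {mean:.4f} tau")
    print(f"50% delay:                {t50:.4f} tau")
```

For this cell the mean equals τ while the 50% delay equals τ ln 2, so the Elmore value lies above the 50% delay, in agreement with the upper-bound property recalled above.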
2.2 Conclusions
The model shown in this chapter is quite appealing for the calculation of
the delay in CMOS structures, but it becomes inaccurate as we move into the
submicron domain, so its use should be limited to a first validation of an
optimization algorithm, and not extended to real production.
In this respect, it is important to note that the delay functions obtained
by Elmore’s formula satisfy some properties that are useful in the
optimization realm (for example equation (4.1), page 50): the Elmore model
is therefore very useful for testing optimization algorithms.
Chapter 3
A COMPLEX MODEL
The target of the model developed here is to offer limited estimation
errors with respect to physical SPICE simulations and to improve the
computation speed by more than one order of magnitude. This could be
useful in optimization algorithms.
Thus the aim of the model is to evaluate the delay and power dissipation
of CMOS structures.
Several approaches have been used to evaluate the delays of CMOS
structures: some models are derived from SPICE simulations by means of
look-up tables [5]; some are analytical [6], while others approximate the
evaluation of the delay with step or ramp inputs [7, 8, 9, 10, 11].
Regarding the power consumption, the main contributions are: switching
power, short-circuit current and sub-threshold conduction. The first one
occurs during the charge and discharge of internal capacitances; the
short-circuit current originates from the simultaneous conduction of the p
and n networks and is dominated by the slope of the node voltages;
sub-threshold currents are due to the weak-inversion conduction of MOSFETs
and become relevant when the power supply is scaled in sub-micron
technologies.
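For orientation only, the switching contribution can be estimated with the usual first-order textbook expression: a capacitance C charged to the supply voltage and then discharged draws an energy of C·Vdd² from the supply. The sketch below uses that expression; it is not the estimation procedure of the FAST model, and the function names and numerical values are hypothetical.

```python
def switching_energy(c_load, vdd):
    """Energy (J) drawn from the supply for one charge/discharge cycle of c_load."""
    return c_load * vdd ** 2


def average_switching_power(c_load, vdd, frequency, activity=1.0):
    """Average switching power (W).

    activity is the assumed fraction of clock cycles in which the node
    completes a full charge/discharge cycle.
    """
    return activity * switching_energy(c_load, vdd) * frequency


if __name__ == "__main__":
    # Hypothetical node: 50 fF switched at 100 MHz with a 2.5 V supply.
    energy = switching_energy(50e-15, 2.5)
    power = average_switching_power(50e-15, 2.5, 100e6, activity=0.5)
    print(f"energy per cycle: {energy * 1e15:.1f} fJ")
    print(f"average power:    {power * 1e6:.2f} uW")
```

The quadratic dependence on the supply voltage explains why supply scaling is so effective on this contribution, while it makes the sub-threshold one comparatively more relevant.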
Most of the proposed power models use estimation algorithms that are not
compatible with the delay analysis. The purpose of the FAST model is to
combine delay and power evaluations in the same estimation procedure,
allowing the simultaneous optimization of delay and power.
Section 3.1 reports the theory behind the FAST model; in particular,
§3.1.1 shows the MOS equations used in the model, §3.1.2 shows the
approximation of the internal node voltages made by the model and §3.1.3
explains how the threshold voltage variations are taken into account in
the model. Section 3.2 shows how the FAST model estimates the delay, and in
particular §3.2.1 shows how the equations are solved, while section 3.3
reports the method used for the calculation of the power consumption:
§3.3.1 accounts for the switching power, §3.3.2 for the short-circuit
power, and §3.3.3 for the sub-threshold power.
Finally, section 3.4 presents some results from the comparison of the
model with HSPICE and section 3.5 draws some conclusions.
3.1 The FAST model
The low complexity and the accuracy that can be obtained by taking
care of the phenomenon of carriers velocity saturation, which is domin-
ant in sub–micron technologies, suggested the use of the classical charge–
control analysis and the gradual–channel approximation (Hodges model),
described in §3.1.1.
Estimation accuracy and low computational effort can be achieved by

operating both on the waveforms of internal signals and on the topology
considerations: in particular all the waveforms in the circuit are approxim-
ated with linear ramps.
By approximating the input waveform with a ramp, a strong simplification
of the I(V) equations is obtained. Figure 3.1 shows the output voltage
of an inverter driven by a ramp input. It can be noticed that a ramp
properly approximates the output voltage variation, especially in the
central phase of the commutation. The increasing error on the tail of the
switching does not significantly affect the delay and power estimation.
The voltage ramp approximation is described in §3.1.2.

Fig. 3.1: Inverter voltages waveform (curves: Vout, Vin and Model; horizontal axis: Time (ns); vertical axis: voltage (V))
3.1.1 MOS equations
The well known equations for the MOS transistors are (for the n–type
and p–type transistors) [1]:
below saturation
\[ I_{DS_{n,p}} = \beta_{n,p} \left[ (V_{GS} - V_{T_{n,p}}) V_{DS} - \frac{V_{DS}^2}{2} \right] \tag{3.1} \]
above saturation
\[ I_{DS_{n,p}} = \frac{\beta_{n,p}}{2} \left[ V_{DSsat_{n,p}} \right]^2 \tag{3.2} \]
where $\beta_{n,p} = \mu_{n,p} C_{ox} W / L$, with $\mu_{n,p}$ modified by the
carrier velocity saturation effect:
\[ \mu_n = \frac{\mu_{n0}}{1 + \frac{V_{DS}}{L E_c}} \qquad \mu_p = \frac{\mu_{p0}}{1 - \frac{V_{DS}}{L E_c}} \]
The saturation voltage (drain–source), not including the carrier velocity
saturation effect, is given by the well known formula:
\[ V_{DS_{n,p}} = V_{GS_{n,p}} - V_{T_{n,p}} \]
while considering the above–mentioned effect:
\[ V_{DS_{n,p}} = \pm V_c \left[ \sqrt{1 \pm \frac{2 (V_{GS_{n,p}} - V_{T_{n,p}})}{V_c}} - 1 \right] \tag{3.3} \]
where the plus signs are for n–MOSFETs and the minus signs are for
p–MOSFETs, and $V_c = |E_c L|$.
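As an illustration of equation (3.3), the following minimal Python sketch compares the saturation voltage of an n–MOSFET with and without the carrier velocity saturation effect. The function names and the numerical values are assumptions chosen only for this example; they are not taken from the text or from a real process.

```python
import math

def vdsat_long_channel(vgs, vt):
    """Saturation voltage without velocity saturation: V_GS - V_T."""
    return vgs - vt

def vdsat_velocity_sat(vgs, vt, vc):
    """n-MOS saturation voltage from eq. (3.3), plus signs, with Vc = |Ec L|."""
    return vc * (math.sqrt(1.0 + 2.0 * (vgs - vt) / vc) - 1.0)

def ids_saturation(beta, vdsat):
    """Drain current above saturation, eq. (3.2), for a given beta."""
    return 0.5 * beta * vdsat ** 2

# Hypothetical bias point, chosen only for illustration
VGS, VT = 5.0, 1.0
for vc in (2.0, 20.0, 2000.0):
    # a larger Vc (longer channel) brings eq. (3.3) back towards V_GS - V_T
    print(vc, vdsat_velocity_sat(VGS, VT, vc), vdsat_long_channel(VGS, VT))
```

For small values of Vc the saturation voltage is well below VGS − VT, which is the reason why the effect cannot be ignored in sub–micron technologies.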
3.1.2 Internal nodes approximation
Fig. 3.2: Mos chain with proper numbering
Let N be the number of n–MOSFETs in the n–chain and P the number of
p–MOSFETs in the p–chain, and let us label the transistors in each chain
from 1 to N or from 1 to P (figure 3.2). Let us assume that label 1
belongs to the driving transistor (i.e. the n–MOSFET with source connected
to VSS, or the p–MOSFET with source connected to VDD), as in figure 3.2.
This hypothesis is made only to simplify the discussion; in our model any
(but only one) transistor can be the driving transistor, that is, a
transistor with a changing gate voltage.
Notation 3.1. In the following equations the superscript index refers to the
node number (with the variable i always for the n–MOSFETs and j always
for the p–MOSFETs), and the small–letter subscript indexes n and p refer,
respectively, to n–MOSFETs and p–MOSFETs, both for the voltage variables
and for the time variables; for the voltage variables the capital subscript
indexes G and D refer to the gate node and the drain node, respectively,
while the small–letter index d refers to the initial conditions of the
drain nodes.
So, for example, $V^i_{G_n}(t)$ is the gate voltage at the node i for the
n–MOSFETs (function of time), and $V^j_{d_p}$ is the initial condition of the
drain voltage at node j for the p–MOSFETs.
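The waveforms of the voltages are shown in figure 3.4 and figure 3.5,
with the hypothesis $t^1_{0_n} = t^2_{0_n} = \dots = t^N_{0_n}$ and
$t^1_{0_p} = t^2_{0_p} = \dots = t^P_{0_p}$; that is because we suppose that
all the MOSFETs in a chain start conducting simultaneously¹.
We can write, referring to figures 3.4, 3.5:
\[ V^1_{G_n}(t) = \begin{cases} 0 & t < 0 \\ \dfrac{V_{DD}}{\tau^1_{i_n}}\, t & 0 \le t < \tau^1_{i_n} \\ V_{DD} & \tau^1_{i_n} \le t \end{cases} \tag{3.4a} \]
\[ V^1_{G_p}(t) = \begin{cases} V_{DD} & t < 0 \\ V_{DD} - \dfrac{V_{DD}}{\tau^1_{i_p}}\, t & 0 \le t < \tau^1_{i_p} \\ 0 & \tau^1_{i_p} \le t \end{cases} \tag{3.4b} \]
\[ V^i_{G_n}(t) = V_{DD} \quad \forall t, \qquad i = 2, 3, \dots, N \tag{3.4c} \]
\[ V^j_{G_p}(t) = V_{SS} \quad \forall t, \qquad j = 2, 3, \dots, P \tag{3.4d} \]
\[ V^i_{D_n}(t) = \begin{cases} V^i_{d_n} & t < t^i_{0_n} \\ V^i_{d_n} - V^i_{d_n} \dfrac{t - t^i_{0_n}}{\tau^i_{o_n} - t^i_{0_n}} & t^i_{0_n} \le t < \tau^i_{o_n} \\ V_{SS} & \tau^i_{o_n} \le t \end{cases} \qquad i = 1, 2, \dots, N \tag{3.4e} \]
\[ V^j_{D_p}(t) = \begin{cases} V^j_{d_p} & t < t^j_{0_p} \\ \dfrac{V_{DD} - V^j_{d_p}}{\tau^j_{o_p} - t^j_{0_p}}\, t + \dfrac{\tau^j_{o_p} V^j_{d_p} - t^j_{0_p} V_{DD}}{\tau^j_{o_p} - t^j_{0_p}} & t^j_{0_p} \le t < \tau^j_{o_p} \\ V_{DD} & \tau^j_{o_p} \le t \end{cases} \qquad j = 1, 2, \dots, P \tag{3.4f} \]
¹ This hypothesis is well supported by simulations.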
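The piecewise–linear waveforms of equations (3.4a) and (3.4e) can be sketched in Python as follows. This is only an illustrative transcription for the n–chain, assuming VSS = 0; the function names are hypothetical and do not come from the FAST implementation.

```python
def gate_ramp_n(t, tau_i, vdd):
    """Gate voltage of the driving n-MOSFET, eq. (3.4a)."""
    if t < 0.0:
        return 0.0
    if t < tau_i:
        return vdd * t / tau_i
    return vdd

def drain_ramp_n(t, t0, tau_o, vd_init, vss=0.0):
    """Drain voltage of the i-th n-MOSFET, eq. (3.4e), assuming VSS = 0."""
    if t < t0:
        return vd_init            # node still at its static level
    if t < tau_o:
        return vd_init - vd_init * (t - t0) / (tau_o - t0)
    return vss                    # node fully discharged

# Example: a 200 ps input ramp sampled at half of its rise time
print(gate_ramp_n(100e-12, 200e-12, 5.0))
```

Each waveform is described by at most three numbers (initial level, start time and end time), which is what keeps the later charge integrals in closed form.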
Fig. 3.3: The i–th and i + 1–th MOSFETs with node voltages
It is also possible to define $\tau^i_{i_{n,p}} = \tau^{i-1}_{o_{n,p}}$ and the
source voltage $V^i_s = V^{i+1}_d$, as shown in figure 3.3 for the i–th n–MOS.
The same is valid for the p–MOSFETs.
The starting levels $V_{d_{n,p}}$ are determined with a static analysis,
described in §3.1.3.
3.1.3 Body effect: threshold variation and its approximation
It is known that a MOS transistor with the source–body voltage differ-

ent from zero has the threshold voltage modified by the body effect, that
Fig. 3.5: Voltage waveforms in the p–MOS chain
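The source potential of the top transistor is
\[ V_s = V_{DD} - V^*_{Tn}, \]
and, if $V_{Tn0}$ is the threshold voltage with $V_{sb} = 0$, then
$V^*_{Tn} = V_{Tn0} + \Delta V_{Tn}$ and we can solve for $V_{sb}$:
\[ V_{sb} = \pm \frac{\gamma}{2} \sqrt{4\gamma\sqrt{2|\Phi_p|} + 8|\Phi_p| + 4V_{DD} - 4V_{Tn0} + \gamma^2} + \gamma\sqrt{2|\Phi_p|} + V_{DD} - V_{Tn0} + \frac{\gamma^2}{2} \quad (> 0) \]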
We can find an analogous equation for p–MOSFETs: knowing that, for
the p–MOS chain depicted in figure 3.7(b), the drain potential of transistor
P is $V^P_{d_p} = 0$, while $V^P_{s_p} = -V_{DD} - V^*_{Tp}$; for the middle
transistors $V^j_{d_p} = V^j_{s_p} = -V_{DD} - V^*_{Tp}$; and for the first
(topmost) transistor $V^1_{d_p} = -V_{DD} - V^*_{Tp}$ and $V^1_{s_p} = V_{DD}$.
Fig. 3.6: Drain–source (VDS) and gate–source (VGS) voltages of the i–th n–MOS
The threshold voltage variation as a function of $V_{sb}$ again is:
\[ \Delta V_{Tp} = -\gamma \left( \sqrt{2|\Phi_p| + V_{sb}} - \sqrt{2|\Phi_p|} \right) \]
(for p–MOS transistors the threshold voltage is negative).
Again, solving:
\[ V_{sb} = -V_{DD} - V^*_{Tp} = -V_{DD} - V_{Tp0} + \gamma \left( \sqrt{2|\Phi_p| + V_{sb}} - \sqrt{2|\Phi_p|} \right) \]
where $V_{Tp0}$ is the threshold voltage with $V_{sb} = V_{DD}$; thus we find:
\[ V_{sb} = \pm \frac{\gamma}{2} \sqrt{4\gamma\sqrt{2|\Phi_p|} + 8|\Phi_p| + 4V_{DD} + 4V_{Tp0} + \gamma^2} - \gamma\sqrt{2|\Phi_p|} - V_{DD} - V_{Tp0} - \frac{\gamma^2}{2} \quad (< 0) \]
Fig. 3.7: MOSFET chain with static voltages: (a) n–MOSFET chain, (b) p–MOSFET chain (internal nodes at VDD − VTN in the n–chain and at VSS − VTP in the p–chain)
The threshold variation is approximated in the model by a linear
approximation given by:
\[ V_{Tn} = \alpha_n V_{sb} + \beta_n \]
\[ V_{Tp} = \alpha_p V_{sb} + \beta_p \]
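with $\alpha_{n,p}$ and $\beta_{n,p}$ constants:
\[ \alpha_n = \frac{V^*_{Tn} - V_{Tn0}}{V_{DD} - V^*_{Tn}} \qquad \beta_n = V_{Tn0} \]
\[ \alpha_p = \frac{V^*_{Tp} - V_{Tp0}}{V_{DD} + V^*_{Tp}} \qquad \beta_p = \frac{V^*_{Tp} V_{DD} + V_{Tp0}}{V_{DD} + V^*_{Tp}} \]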
Fig. 3.8: Threshold variation with Vsb (solid line) and its linear approximation (dashed line): (a) n–MOSFET, (b) p–MOSFET
In figures 3.8(a) and 3.8(b) the actual threshold variation (of an n–MOS
transistor and a p–MOS transistor) when a Vsb voltage is applied is
compared with the linear approximation used in our model, for a 0.7 µm
technology.
The maximum error due to the linear approximation is limited to 7%.
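The construction of the linear approximation for the n–MOSFET can be sketched numerically as below. This is a minimal illustration, assuming the usual body–effect law VTn = VTn0 + γ(√(2|Φp| + Vsb) − √(2|Φp|)); the static value of Vsb is obtained by solving the same quadratic that leads to the closed form given above, and all the parameter values are hypothetical, not those of the 0.7 µm technology of the text.

```python
import math

def vt_body(vsb, vt0, gamma, phi2):
    """n-MOS threshold with body effect; phi2 stands for 2|Phi_p|."""
    return vt0 + gamma * (math.sqrt(phi2 + vsb) - math.sqrt(phi2))

def static_vsb(vdd, vt0, gamma, phi2):
    """Solve Vsb = VDD - VT(Vsb) through the substitution s = sqrt(phi2 + Vsb)."""
    c = phi2 + vdd - vt0 + gamma * math.sqrt(phi2)
    s = 0.5 * (-gamma + math.sqrt(gamma ** 2 + 4.0 * c))  # positive root
    return s ** 2 - phi2

def linear_coefficients(vdd, vt0, gamma, phi2):
    """alpha_n and beta_n of the line VTn = alpha_n * Vsb + beta_n."""
    vt_star = vdd - static_vsb(vdd, vt0, gamma, phi2)
    alpha = (vt_star - vt0) / (vdd - vt_star)
    return alpha, vt0

# Hypothetical parameters, chosen only for illustration
VDD, VT0, GAMMA, PHI2 = 5.0, 0.8, 0.6, 0.7
alpha_n, beta_n = linear_coefficients(VDD, VT0, GAMMA, PHI2)
vsb_max = static_vsb(VDD, VT0, GAMMA, PHI2)
# worst relative deviation of the line from the exact curve on [0, vsb_max]
worst = max(
    abs(alpha_n * v + beta_n - vt_body(v, VT0, GAMMA, PHI2))
    / vt_body(v, VT0, GAMMA, PHI2)
    for v in (vsb_max * k / 200 for k in range(201))
)
print(f"alpha_n = {alpha_n:.4f}, max relative error = {100 * worst:.2f}%")
```

By construction the line matches the exact threshold at Vsb = 0 and at the static value of Vsb, so the error is concentrated in the middle of the interval.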
3.2 Delay estimation
The delay estimation of the structures reported in figure 3.2 implies the
evaluation of $\tau^i_{o_{n,p}}$ and $t^i_{0_{n,p}}$ for each transistor in the chains.
The currents in each transistor can be obtained from equations (3.1),
(3.2) (page 23), with the voltages as functions of time defined in equations
(3.4a)–(3.4f) (page 25). So we can calculate the quantity of charge at each
node and thus apply the charge conservation law, i.e. at each node the total
charge variation must be equal to zero:
\[ Q^i_n = 0 \qquad Q^j_p = 0 \qquad i = 1, 2, \dots, N \text{ and } j = 1, 2, \dots, P \tag{3.5} \]
The generic term $Q^i_n$ is the sum of three elements,
$Q^i_n = Q^{i+1}_I - Q^i_I - Q^i_C$, defined below:
• $Q^{i+1}_I$ is the charge due to the (i + 1)–th MOSFET placed above the i–th
node:
\[ Q^{i+1}_I = \int_{t^{i+1}_{0_n}}^{t^{i+1}_{s_n}} I^{i+1}_{sat}(t)\,dt + \int_{t^{i+1}_{s_n}}^{\tau^{i+1}_{o_n}} I^{i+1}_{lin}(t)\,dt \tag{3.6a} \]
which includes the contributions due to the currents above and below
saturation; $t_s$ is the time at which the MOSFET switches from the
saturation to the linear region;
• $Q^i_I$ is the charge due to the i–th MOSFET below the i–th node:
\[ Q^i_I = \int_{t^i_{0_n}}^{t^i_{s_n}} I^i_{sat}(t)\,dt + \int_{t^i_{s_n}}^{\tau^i_{o_n}} I^i_{lin}(t)\,dt \tag{3.6b} \]
• $Q^i_C$ is the charge due to the discharging of the capacitor at the i–th
node, $C^i$:
\[ Q^i_C = C^i V^i_{d_n}. \tag{3.6c} \]
Similar equations apply for p–MOSFETs.
For each circuit node, a charge conservation equation can be written.
3.2.1 Equation solving
Referring to the n–MOS chain in figure 3.3, we can write at the output
node N:
\[ Q^N_n = -Q^N_C = -C^N V^N_{d_n} \tag{3.7} \]
because, neglecting the contribution of the p–MOS chain above (if it exists),
$Q^N_I = 0$.
At the node N − 1 we can write:
\[ Q^{N-1}_n = Q^N_I - Q^{N-1}_I - Q^{N-1}_C, \]
and combining with eq. (3.7) (page 32)
\[ Q^{N-1}_n = -C^N V^N_{d_n} - Q^{N-1}_I - Q^{N-1}_C, \]
and so on:
\[ Q^{N-2}_n = -C^N V^N_{d_n} - C^{N-1} V^{N-1}_{d_n} - Q^{N-2}_I - Q^{N-2}_C. \]
More generally:
\[ Q^i_n = -\sum_{k=i+1}^{N} C^k V^k_{d_n} - Q^i_I - Q^i_C = -\sum_{k=i}^{N} C^k V^k_{d_n} - Q^i_I = 0 \]
Proceeding till the first transistor, we obtain:
\[ Q^1_n = -\sum_{k=1}^{N} C^k V^k_{d_n} - Q^1_I = 0, \tag{3.8} \]
the same applies for p–MOSFETs.
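Equation (3.8) states that the charge carried by the first transistor balances the charge initially stored on all the node capacitances of the chain. A minimal Python sketch of this bookkeeping, with hypothetical capacitances and initial voltages and assuming voltage–independent capacitances, is:

```python
def charge_through_transistor(caps, vd_init, i=1):
    """Q_I^i = -sum_{k=i}^{N} C^k * V_d^k, nodes numbered from 1 as in eq. (3.8)."""
    return -sum(c * v for c, v in zip(caps[i - 1:], vd_init[i - 1:]))

# Hypothetical 3-transistor n-chain: node capacitances (F) and initial voltages (V)
caps = [10e-15, 8e-15, 20e-15]
vd_init = [3.8, 3.8, 5.0]

q1 = charge_through_transistor(caps, vd_init, i=1)  # whole chain, eq. (3.8)
q3 = charge_through_transistor(caps, vd_init, i=3)  # output node only
print(q1, q3)
```

In the model the left–hand side is not known in advance: it is the integral of the saturation and linear currents, and the equality is what determines the unknown switching times.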
In order to solve the non–linear equation (3.8) one must substitute the
definition of the current to calculate the charge Q, as in equations (3.6a),
(3.6b) (page 32); moreover, one must substitute both the current calculated
in the saturation region and the one calculated in the linear region,
extending the integrals of the aforementioned equations to the proper
extremes.
Finally we must distinguish among several different cases, depending
on the instant of time at which the transistor switches from the saturation
region to the linear region. For example, the first transistor can switch
between the two regions when the rise of the input has already finished
or, on the contrary, can switch when the input is still rising.
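All the possible cases are:
\[
\begin{aligned}
t^1_0 \le t^1_s \le \tau^1_i \le \tau^1_o \qquad & t^1_0 \le \tau^1_i \le t^1_s \le \tau^1_o \\
t^1_s \le t^1_0 \le \tau^1_i \le \tau^1_o \qquad & \tau^1_i \le t^1_0 \le t^1_s \le \tau^1_o \\
t^1_s \le \tau^1_i \le t^1_0 \le \tau^1_o \qquad & t^1_0 \le t^1_s \le \tau^1_o \le \tau^1_i \\
t^1_s \le t^1_0 \le \tau^1_o \le \tau^1_i &
\end{aligned}
\tag{3.9}
\]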
Evaluating all the possible cases, the equation (3.8) becomes a non–linear
equation of the variables $t^1_s$, $t^1_0$, $\tau^1_o$, $\tau^1_i$, with $t^1_s$,
$t^1_0$, $\tau^1_o$ as unknowns.
A further step is needed to eliminate all the variables but one. The real
unknown is the time $\tau^1_o$, while all the other unknowns can be expressed
as functions of $\tau^1_o$: in particular, the times $t^1_s$ and $t^1_0$ can be
calculated together, with the equation $V_{DS} = V_{GS} - V_T$ † and with
the equation that states the charge conservation at node 1 between the time
0 and the time $t^1_0$, similar to the equation (3.5) (page 31), including the
bootstrap effect due to capacitive coupling between the gate and the drain of
the first transistor.
Both these equations are functions of $t^1_s$, $t^1_0$, $\tau^1_o$, $\tau^1_i$. In
this way one has three equations with three unknowns, and by means of some
approximated methods² it is possible to evaluate the three unknowns.
This solution scheme has to be repeated for all the seven cases shown
in equation (3.9). Each case gives as a solution a triple
$(t^1_s, t^1_0, \tau^1_o)$ that is compatible with one and only one of the
conditions expressed by these cases. Thus, only one working condition is
really selected, as can be expected.
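The compatibility check between a computed triple and the seven orderings of equation (3.9) can be sketched as follows. The numbering of the cases from 1 to 7 (row by row, left to right) and the function name are assumptions made for this example only.

```python
# Orderings of eq. (3.9); each tuple lists the times in non-decreasing order.
CASES = {
    1: ("t0", "ts", "ti", "to"),
    2: ("t0", "ti", "ts", "to"),
    3: ("ts", "t0", "ti", "to"),
    4: ("ti", "t0", "ts", "to"),
    5: ("ts", "ti", "t0", "to"),
    6: ("t0", "ts", "to", "ti"),
    7: ("ts", "t0", "to", "ti"),
}

def matching_cases(t0, ts, ti, to):
    """Return the indexes of the orderings satisfied by the given times."""
    values = {"t0": t0, "ts": ts, "ti": ti, "to": to}
    hits = []
    for idx, order in CASES.items():
        seq = [values[name] for name in order]
        if all(a <= b for a, b in zip(seq, seq[1:])):
            hits.append(idx)
    return hits

# Hypothetical solution (in ns): saturation ends while the input is still rising
print(matching_cases(t0=0.1, ts=0.3, ti=0.5, to=0.9))
```

A triple obtained under the hypothesis of case k is accepted only if the check returns k; with distinct time values at most one ordering can hold.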
Indeed, the whole previous solving scheme holds only if the equation (3.6c)
(page 32) applies, i.e. only if the capacitance at the node i is not a
function of the voltage at the same node. But the capacitance actually is a
function of the voltage, in this manner:
\[ C^i = C^i_j \left( 1 + \frac{V^i}{\Phi_b} \right)^{-m_j} + C^i_p \left( 1 + \frac{V^i}{\Phi_b} \right)^{-m_p} \tag{3.10} \]
where $C_j$ and $C_p$ are, respectively, a function of the area and a function
of the perimeter of a junction, because the capacitance at the node i is due
to the parasitic capacitances of the transistors connected to this node.
† Or, taking into account the carrier velocity saturation effect, the equation (3.3) (page 24).
² The problem is always strictly non–linear.
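Equation (3.10) can be evaluated directly; the sketch below is a plain transcription in Python, where the parameter names (cj_area, cj_perim, phi_b, mj, mp) and the numerical values are hypothetical and only illustrate how the node capacitance decreases as the node voltage grows.

```python
def node_capacitance(v, cj_area, cj_perim, phi_b, mj, mp):
    """Voltage-dependent junction capacitance at a node, eq. (3.10)."""
    return (cj_area * (1.0 + v / phi_b) ** (-mj)
            + cj_perim * (1.0 + v / phi_b) ** (-mp))

# Hypothetical junction parameters (F, F, V, -, -)
params = dict(cj_area=12e-15, cj_perim=6e-15, phi_b=0.8, mj=0.5, mp=0.33)
for v in (0.0, 2.5, 5.0):
    print(v, node_capacitance(v, **params))
```

Since the capacitance changes along the transition, the charge term of equation (3.6c) is no longer a simple product, which is why one equation per node becomes necessary.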
If the capacitance at each node is a function of the voltage at the node
itself, then one equation is no longer sufficient: one must write equations
like the equation (3.8) (page 33), one for each node, and then solve them
with a standard solving algorithm for non–linear equations. The only
difference between the equations applied at the nodes above the first and
the first node equation is that not all of the cases of equation (3.9) are
possible: in particular these conditions apply only when the transistor can
pass from the saturation region to the linear region and, moreover, only
when the input rise time $\tau^1_i$ can assume any value. The passage from
saturation to linearity can be made only by the first and the last
transistors of the chain, as they are the only ones that can saturate³. But
in the last transistor, the time $\tau^N_i$ is governed by
$\tau^N_i = \tau^{N-1}_o$, giving thus only two possible cases:
\[ t^N_0 \le t^N_s \le \tau^N_i \le \tau^N_o \qquad t_0 \le \tau^N_i \le t^N_s \le \tau^N_o \]
In order to make the algorithm convergent, two other fictitious cases
must be included:
\[ t^N_0 \le t^N_s, \quad \tau^N_o \le \tau^N_i \]
\[ t_0 \le t^N_s, \quad \tau^N_o \le \tau^N_i \]
These conditions can never occur in a real circuit, since they imply that
the voltages at the source node and at the drain node of the last transistor
cross, making the transistor current flow in the inverse direction (see
figure 3.6 for a visual explanation of the terms τi and τo and of why their
relative voltage waveforms cannot cross). Their inclusion helps in finding
the real circuit conditions when solving the equation (3.8) for each of
these four cases: the solution of one of the fictitious cases gives only
unknowns compatible with one of the real cases.
All the other transistors, which cannot saturate during the switching from
off to on, have only one possible working condition, again that the voltages
at source and drain nodes do not cross:
\[ \tau^j_i \le \tau^j_o \qquad j = 2, \dots, N - 1 \]
Solving all the equations, one for each node, the unknowns $\tau^j_o$ can be
evaluated, giving thus an estimate of the voltage waveform at each node
of the chain. The rise/fall time of the last node of the chain gives also
the delay of the chain itself.
³ This is because they are the only ones that have a full voltage swing at some node, e.g. the gate node for the first, and the drain node for the last. All the transistors in the middle of the chain are prevented from saturating by the body effect, which makes the saturation condition $V_{DS} = V_{GS} - V_T$ (or, better, the equation (3.3), page 24) impossible.
3.3 Power consumption estimation
3.3.1 Switching energy
The contribution to the power dissipation due to the charge and discharge
of internal nodes for each MOSFET can be defined as the integral of
the voltage across the MOSFET times the current flowing through it.
Theorem 3.2. The switching energy in generic n–networks and p–networks can
be written as:
\[ E_{sw_n} = \frac{1}{2} \sum_{i=1}^{N} C^i \left( {V_i'}^2 - {V_i''}^2 \right) \tag{3.11} \]
\[ E_{sw_p} = \frac{1}{2} \sum_{j=1}^{P} C^j \left[ \left( V_{DD} - V_j' \right)^2 - \left( V_{DD} - V_j'' \right)^2 \right] \tag{3.12} \]
where $C^i$ is the generic total capacitance of the i–th node and $V_i'$,
$V_i''$ are, respectively, the initial and final values of the voltage swing
at the same node.
Corollary 3.2.1. If the voltage swing of each node of the network is the full swing
$\Delta V = V_{DD} - 0$, then equations (3.11), (3.12) can be written as:
\[ E_{sw_n} = \frac{1}{2} \sum_{i=1}^{N} C^i \Delta V^2 \tag{3.13} \]
\[ E_{sw_p} = \frac{1}{2} \sum_{i=1}^{P} C^i \Delta V^2 \tag{3.14} \]
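Equations (3.11) and (3.12) translate directly into a few lines of Python. The sketch below assumes voltage–independent node capacitances; the function names and the numerical values are hypothetical and serve only to show that the full–swing case reduces to the corollary.

```python
def switching_energy_n(caps, v_init, v_final):
    """Switching energy of an n-network, eq. (3.11)."""
    return 0.5 * sum(c * (vi ** 2 - vf ** 2)
                     for c, vi, vf in zip(caps, v_init, v_final))

def switching_energy_p(caps, v_init, v_final, vdd):
    """Switching energy of a p-network, eq. (3.12)."""
    return 0.5 * sum(c * ((vdd - vi) ** 2 - (vdd - vf) ** 2)
                     for c, vi, vf in zip(caps, v_init, v_final))

# Hypothetical two-node networks with full swing (corollary 3.2.1)
VDD = 5.0
caps = [10e-15, 20e-15]
e_n = switching_energy_n(caps, [VDD, VDD], [0.0, 0.0])        # nodes discharge
e_p = switching_energy_p(caps, [0.0, 0.0], [VDD, VDD], VDD)   # nodes charge
print(e_n, e_p)
```

With partial swings, for instance an internal n–chain node that starts from VDD − VTn instead of VDD, only the initial values change and the same functions apply.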
Proof of theorem 3.2. Since the internal voltages and currents are known from
the delay analysis, the energy for the n–MOS network can be written by
summing all the contributions of internal nodes (see figure 3.3)
\[ E_{sw_n} = \sum_{i=1}^{N} \int \left[ V^{i+1}_{D_n}(t) - V^i_{D_n}(t) \right] I^i_{D_n}(t)\,dt \]
where the notation of figure 3.3 is adopted.
This equation can be written in this way:
\[ E_{sw_n} = \int \left[ V^N_{D_n}(t)\, I^N_{D_n}(t) + \sum_{i=1}^{N-1} V^i_{D_n}(t) \left( I^i_{D_n}(t) - I^{i+1}_{D_n}(t) \right) \right] dt \tag{3.15} \]
It is possible to rewrite the previous equations by noting that in general:
\[ I^{i+1}_{D_n} - I^i_{D_n} = C^i \frac{dV^i_{D_n}}{dt} \]
and, in particular, if we neglect the current of the p–MOS chain above the
node N,
\[ -I^N_{D_n} = C^N \frac{dV^N_{D_n}}{dt}. \]
Thus, for the n network it is possible to define the $E_{sw_n}$ energy in the
following way:
\[
\begin{aligned}
E_{sw_n} &= -\sum_{i=1}^{N} C^i \int_{t_0'}^{t_0''} V^i_{D_n} \frac{dV^i_{D_n}}{dt}\,dt \\
&= -\sum_{i=1}^{N} C^i \int_{V_i'}^{V_i''} V^i_{D_n}\,dV^i_{D_n} \\
&= \frac{1}{2} \sum_{i=1}^{N} C^i \left( {V_i'}^2 - {V_i''}^2 \right)
\end{aligned}
\]
If we integrate the equation (3.11) (page 36) only where the arguments of
the integrals are non–zero, then the first integral in this equation goes
from $t_0' = t^i_{0_n}$ to $t_0'' = \tau^i_{o_n}$, so that the second integral
goes from $V_i' = V^i_{D_n}(t^i_{0_n})$ to $V_i'' = V^i_{D_n}(\tau^i_{o_n})$. Since
$V^i_{D_n}(\tau^i_{o_n}) = 0$, we have
$E_{sw_n} = \frac{1}{2} \sum_{i=1}^{N} C^i {V_i'}^2$, where $V_i'$ is the actual
voltage swing at the node i.
The energy dissipated in the p network ($E_{sw_p}$) can be calculated with
similar considerations, leading to
\[
\begin{aligned}
E_{sw_p} &= \sum_{j=1}^{P} C^j \int_{t_0'}^{t_0''} \left( V_{DD} - V^j_{D_p} \right) \frac{dV^j_{D_p}}{dt}\,dt \\
&= \sum_{j=1}^{P} C^j \int_{V_j'}^{V_j''} \left( V_{DD} - V^j_{D_p} \right) dV^j_{D_p} \\
&= \frac{1}{2} \sum_{j=1}^{P} C^j \left[ \left( V_{DD} - V_j' \right)^2 - \left( V_{DD} - V_j'' \right)^2 \right]
\end{aligned}
\]
Again, $V_j' = V^j_{D_p}(t^j_{0_p})$ and $V_j'' = V^j_{D_p}(\tau^j_{o_p})$, and in
the same way $V_j'' = V_{DD}$, so that
$E_{sw_p} = \frac{1}{2} \sum_{j=1}^{P} C^j (V_{DD} - V_j')^2$, where
$(V_{DD} - V_j')$ is the voltage swing at the node j.
In the equations (3.11) and (3.12) (page 36) the voltage variation of the
capacitance must be included, obtaining expressions for $E_{sw_{n,p}}$ that are
slightly more complicated, but still in closed form.
3.3.2 Short–circuit energy
The short–circuit contribution (for an output falling transition) is given
by:
\[ E_{sc} = \int_{t_0}^{\tau_o} V_D I_D\,dt \]
where $I_D$ is the current flowing through the p–MOSFET that has a changing
gate voltage during the output fall; of course all the p–MOSFETs between
this one and the output node must be on for this contribution to the power
dissipation to exist. So, if we neglect the small discharge of the source
voltage of this MOSFET, we can easily calculate the short–circuit energy
by calculating the current flowing.
A similar equation can be written for the n–MOS network.
Since voltage swings, internal currents and capacitances are known from
the delay analysis, the power supply dissipation does not require
additional computations.
3.3.3 Sub–threshold energy
The sub–threshold current in a MOSFET is given by ([12]):
\[ I_{DS_{sub-th}} = \mu_0 \frac{W}{L} \frac{kT}{q}\, Q(V_S) \left( 1 - e^{-\frac{q V_{DS}}{\xi kT}} \right) \]
where
\[ Q(V_S) \approx -\frac{kT}{q} \sqrt{\frac{q\, \varepsilon_s N_a}{|\Phi_p|}}\; e^{\frac{q (V_G - V_T)}{\xi kT}} \]
and
\[ \xi = 1 + \frac{1}{2 C_{ox}} \sqrt{\frac{\varepsilon_s N_a}{|\Phi_p|}}. \]
This current is proportional to the MOSFET width W but is usually
negligible. However, with the scaling down of the dimensions and hence of
the threshold voltage this current may no longer be negligible, and with
low VG and higher VD the current becomes independent of VG.
Moreover, while the short–circuit current is limited by the switching times
of the circuit, the sub–threshold current is not limited in time, so its
dissipation can be comparable to the short–circuit dissipation.
3.4 Results
The circuit in figure 3.2 with 2 n–MOS and 2 p–MOS transistors (in a
0.7 µm technology) has been simulated using HSPICE (level 6) and the
proposed model, for each combination of MOSFET widths from 1 µm to 100 µm.
Figure 3.9 shows the comparison between the delay (defined as the delay at
50% between an input rising ramp of 200 ps and an output falling ramp)
calculated by the model and the delay simulated by HSPICE for each
combination of widths between 5 µm and 30 µm; similarly, figure 3.10 shows
the comparison between the energy dissipated by the circuit (during the
output discharge) calculated by the model and by HSPICE.

Tab. 3.1: Mean Error
                     Mean error    Max error    Min error
Delay                6.115%        12.985%      0.905%
Energy dissipated    2.1%          6.3%         0.11%

Tab. 3.2: Execution time
HSPICE execution time    FAST execution time
6384.3 sec.              188.91 sec.

The errors between the proposed model and the HSPICE simulation are
reported in table 3.1, while table 3.2 shows the corresponding execution
times. These results are taken from the analysis of the circuit varying the
dimensions of the MOSFETs continuously from 1 µm to 100 µm.
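As a quick check of the execution times of table 3.2 against the target stated at the beginning of the chapter, the speed–up of the FAST model over HSPICE on this experiment can be computed as:

\[ \text{speed–up} = \frac{6384.3\ \text{s}}{188.91\ \text{s}} \approx 33.8 \]

that is, somewhat more than one order of magnitude for this particular circuit and width sweep.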
3.5 Conclusions
The model of this chapter is suitable for the optimization application of
chapter 5. It is able to compute the delay and the power consumption of
CMOS structures with good accuracy and a considerable speed–up with respect
to the HSPICE simulation taken as a reference.
In a real production design cycle, this model might be used for a first
pre–optimization of some basic cells; then, in the last steps of the design
flow, an optimization using a more accurate model for the delay (or power)
evaluation must be used.
Fig. 3.9: Delay [ps] of the circuit of figure 3.2 for several combinations of W1 and W2 [micron]: (a) FAST model, (b) HSPICE
Fig. 3.10: Energy [fJ] dissipated by the circuit of figure 3.2 for several combinations of W1 and W2 [micron]: (a) FAST model, (b) HSPICE
Part III
OPTIMIZATION
Chapter 4
MATHEMATIC OPTIMIZATION
The very basic theory of optimization is introduced here, in order to
develop some optimization schemes, useful later for the optimization
of real circuits.
The theory of mono-objective optimization involves some properties and
theorems regarding finding the minimum of a function, hence the vanishing
of the function's first derivatives. These results can be extended (with
some restrictions) to the case of multivariable functions, but when more
than one function has to be optimized simultaneously, a new theory must be
introduced.
The goal of this introduction to mathematical optimization is both the
development of reliable algorithms and the justification of some
assumptions made in chapter 5 (page 77), especially for the multi-objective
case.
In section 4.1 some mathematical optimization foundations are reported:
§4.1.1 presents the theory of mono-objective optimization (unconstrained,
§4.1.1.1, and constrained, §4.1.1.2), while §4.1.2 presents the theory of
multi-objective optimization (unconstrained, §4.1.2.1, and constrained,
§4.1.2.2).
Section 4.2 reports the basic and most useful numerical algorithms for
optimization purposes: in §4.2.1 some one-dimensional search techniques,
in §4.2.2 some multi-dimensional search techniques, and in §4.2.4, §4.2.5
some “special” algorithms.
Some conclusions and summarized characteristics are reported in section 4.3.
4.1 Optimization theory
Notation 4.1. In the following section, the function f is defined as:
$f : X \subseteq \mathbb{R}^p \to Y \subseteq \mathbb{R}$. X is called the decisions
space, and Y is called the criteria space.
Problem 4.2 (Unconstrained optimization). Given the function f that
depends on one or more variables x ∈ X, the problem of optimizing f, in this
context, is equivalent to finding:
\[ \min_{x \in X} f(x) \]
This is also known as unconstrained optimization, since there are no
constraints on the values the function f may assume.
The unconstrained optimization is seldom applied in the field of digital
circuits, so the constrained optimization is defined as:
Problem 4.3 (Constrained optimization). Find
\[ \min_{x \in X} f(x) \quad \text{subject to} \quad g_j(x) \le h_j, \quad j = 1, 2, \dots, m \]
where the m inequalities $g_j(x) \le h_j$ constitute the set of constraints of
the optimization.
The function f is also called the objective of the optimization, or the cost
function of the problem.
The above problems are classical optimization problems, or mono-objective
problems. The multi-objective unconstrained optimization is defined as
the problem of optimizing a vector function, so that the objective function
is a vector of objective functions.
Notation 4.4. In the following (multi-objective optimization), the function f
is defined as:
$f : X \subseteq \mathbb{R}^p \to Y \subseteq \mathbb{R}^n$, or $f = (f_1, f_2, \dots, f_n)$
with $f_i : X \subseteq \mathbb{R}^p \to Y \subseteq \mathbb{R}$.
Problem 4.5 (Unconstrained multi-objective optimization). Find
\[ \min_{x \in X} f_i(x), \quad i = 1, 2, \dots, n \]
where there are n objective functions.
Finally, the multi-objective constrained optimization is defined as:
Problem 4.6 (Constrained multi-objective optimization). Find
\[ \min_{x \in X} f_i(x), \quad i = 1, 2, \dots, n \quad \text{subject to} \quad g_j(x) \le h_j, \quad j = 1, 2, \dots, m \]
where there are n objective functions and m constraints.
The multi-objective optimization is a very complex problem, since the
problem of finding the minimum of two or more functions is only apparently
trivial: the set of independent variables $x_{min}$ that minimizes, let us
say, the function $f_1$ is not guaranteed to minimize (and generally does
not minimize) the other functions. So there should be a way to combine the
information about the minima of all the functions. The intuitive way of a
linear combination is somewhat problematic:
\[ f_{tot}(x) = \sum_{i=1}^{n} \alpha_i f_i(x), \quad \alpha_i \in \mathbb{R} \]
because the functions $f_i$ may not be commensurable with one another. For
example, if there is one function $f_j$ such that $f_j \gg f_i$, $\forall i \ne j$,
then this function dominates the total objective, giving false results for
the optimization problem. This problem is illustrated in §4.1.2.
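The domination effect can be seen on a toy example. In the Python sketch below the two objectives f1 and f2, their scales and the grid search are assumptions introduced only for illustration: f2 is numerically about a thousand times larger than f1, so the plain sum is minimized practically at the minimum of f2 alone, while rescaling f2 gives a compromise between the two minima.

```python
def f1(x):
    return (x - 1.0) ** 2            # minimum at x = 1

def f2(x):
    return 1000.0 * (x - 3.0) ** 2   # minimum at x = 3, much larger scale

def argmin_on_grid(fun, lo, hi, steps=4000):
    """Brute-force minimizer on a uniform grid (enough for a 1-D toy case)."""
    best_x, best_val = lo, fun(lo)
    for k in range(1, steps + 1):
        x = lo + (hi - lo) * k / steps
        val = fun(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x

# Unit weights: the sum is dominated by f2
x_plain = argmin_on_grid(lambda x: f1(x) + f2(x), 0.0, 4.0)
# Weights that make the two objectives commensurable
x_scaled = argmin_on_grid(lambda x: f1(x) + f2(x) / 1000.0, 0.0, 4.0)
print(x_plain, x_scaled)
```

The result depends entirely on the chosen weights, which is the reason why a preference relation among outcomes is needed instead of a fixed linear combination.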
4.1.1 Mono-objective optimization
The mono-objective optimization is the standard optimization problem,
and is widely treated in literature (see [13] for an introduction). With
this preliminary statement, some results useful to find a solution for the
problems 4.2, 4.3 are reported here.
The existence of the minimum (at least one) is granted by the Weierstrass
Theorem¹, but these minima can be local or global:
Definition 4.7 (Local Minimum). The point $x^\star \in X$ is a local (or relative)
minimum of the function f iff
\[ \exists \varepsilon > 0 : f(x) \ge f(x^\star) \quad \forall x \in X : |x - x^\star| < \varepsilon. \]
Definition 4.8 (Global Minimum). The point $x^\star \in X$ is a global (or
absolute) minimum of the function f iff $f(x) \ge f(x^\star)\ \forall x \in X$.
¹ If X is a compact set, as is the case in this context.
Definition 4.9 (Feasible direction). $d \in \mathbb{R}^n$ is a feasible direction
if $\exists \alpha^\star > 0 : x + \alpha d \in X,\ \forall \alpha : 0 \le \alpha \le \alpha^\star$.
Intuitively, the concept of feasible direction is useful to solve the
minimization problem: we search for all the directions in which the
function f is decreasing.
Lemma 4.10 (First order necessary condition). If $x^\star \in X$ is a minimum of
$f \in C^1$ then $\forall d \in \mathbb{R}^n$, where d is a feasible direction,
$d^T \cdot \nabla f(x^\star) \ge 0$, where (·) is the usual scalar product in the
space $\mathbb{R}^n$.
Corollary 4.10.1. If $x^\star \in X$ is an internal point of X, then
$d^T \cdot \nabla f(x^\star) = 0$.
Lemma 4.11 (Second order necessary condition). If $x^\star \in X$ is a minimum of
$f \in C^2$ then $\forall d \in \mathbb{R}^n$, where d is a feasible direction,
i) $d^T \cdot \nabla f(x^\star) \ge 0$;
ii) if $d^T \cdot \nabla f(x^\star) = 0$ then $d^T \cdot \nabla^2 f(x^\star) \cdot d \ge 0$.
Corollary 4.11.1. If $x^\star \in X$ is an internal point of X, then
i) $d^T \cdot \nabla f(x^\star) = 0$
ii) $d^T \cdot \nabla^2 f(x^\star) \cdot d \ge 0$
The conditions of corollary 4.11.1 are necessary conditions for the
existence of a (local) minimum; they become sufficient when the inequality
in ii) is strict for every $d \ne 0$. In order to have some information about
the existence of a global minimum, the theory of convex functions must be
very briefly reported.
Definition 4.12 (Convex function). The function f : X → Y, where X is a
convex set², is convex if $\forall x_1, x_2 \in X$ and $\forall \alpha : 0 \le \alpha \le 1$
\[ f(\alpha x_1 + (1 - \alpha) x_2) \le \alpha f(x_1) + (1 - \alpha) f(x_2) \tag{4.1} \]
² A set $X \subset \mathbb{R}^n$ is convex if $\forall x, y \in X$ the segment [x, y] is totally contained in X.
If in the equation (4.1) the sign < applies, then the function is said to be
strictly convex.
Another way to write the equation (4.1) is:
Lemma 4.13. The function $f \in C^1 : X \to Y$ is convex over a convex set X if
\[ f(y) \ge f(x) + \nabla f(x) (y - x), \quad \forall y, x \in X \]
or, if f is twice differentiable,
Lemma 4.14. The function $f \in C^2 : X \to Y$ is convex over a convex set X if
\[ \nabla^2 f(x) \ge 0, \quad \forall x \in X \]
Convex functions are a very useful mathematical tool in the class of
optimization problems, mainly because of the next two results:
Theorem 4.15. If f : X → Y is convex over a convex set X, the set A of the
minima of the function is convex, and every local minimum is also a global
minimum.
Theorem 4.16. If $f \in C^1 : X \to Y$ is convex over a convex set X, and if
$\exists x^\star \in X : \forall x \in X,\ \nabla f(x^\star)(x - x^\star) \ge 0$, then
$x^\star$ is a global minimum of f over X.
The theorem 4.16 also implies that the conditions of the lemma 4.10 and
corollary 4.10.1 (first order conditions) are both necessary and sufficient
conditions for the existence of a global minimum.
4.1.1.1 Unconstrained problem
All the previous results are, at least in theory, sufficient to solve the
problem 4.2. The theory of convex functions ensures the existence of
a global minimum, while lemma 4.10, corollary 4.10.1, and theorem 4.16
suggest a method to find this minimum. We will see in §5.1 how these
methods apply to real circuits, in which, for example, the function
derivatives are not available.
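One simple way in which the first order conditions suggest a search method is gradient descent: move against the gradient until it vanishes. The sketch below is only an illustration on a hypothetical convex quadratic; since analytical derivatives may not be available, the gradient is estimated with central finite differences. The step size, the tolerance and the function names are assumptions of this example, not the algorithms of section 4.2.

```python
def f(x):
    """Hypothetical convex objective with minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 2.0) ** 2

def numerical_gradient(fun, x, h=1e-6):
    """Central finite-difference estimate of the gradient of fun at x."""
    grad = []
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        grad.append((fun(xp) - fun(xm)) / (2.0 * h))
    return grad

def gradient_descent(fun, x0, step=0.1, tol=1e-6, max_iter=10000):
    """Fixed-step descent, stopped when the first order condition holds."""
    x = list(x0)
    for _ in range(max_iter):
        g = numerical_gradient(fun, x)
        if max(abs(gk) for gk in g) < tol:
            break
        x = [xk - step * gk for xk, gk in zip(x, g)]
    return x

x_min = gradient_descent(f, [0.0, 0.0])
print(x_min)
```

Because the function is convex, the stationary point reached by the iteration is the global minimum, in agreement with theorem 4.16.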
4.1.1.2 Constrained problem
The solution of problem 4.3 is slightly more complicated. The presence
of constraints reduces the feasible set of independent variables that
are solutions of the problem. So the solutions (i.e. the values of the
independent variables that minimize the objective function) must be
searched in the set C ⊂ X of the points x that satisfy all the constraints.
The most important method to solve the minimization problem while
satisfying some constraints (and, incidentally, the method most useful for
our real problem) is the method of the Lagrange multipliers (and the method
derived from it, the method of the penalty function).
Lagrange multiplier and Penalty functions The first method defines a
Lagrangian function:
\[ L(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x) \tag{4.2} \]
If we define $x^\star$ as the solution such that:
\[ x^\star = \arg\min_{x \in X} f(x), \qquad g_i(x^\star) \le 0, \quad i = 1, 2, \dots, m \]
then we can write the necessary Kuhn–Tucker conditions for the existence
of the minimum:
\[ \nabla_x L(x^\star, \lambda^\star) = 0 \tag{4.3} \]
\[ \nabla_\lambda L(x^\star, \lambda^\star) \le 0 \tag{4.4} \]
\[ (\lambda^\star)^T g(x^\star) = 0 \tag{4.5} \]
\[ \lambda^\star \ge 0 \tag{4.6} \]
In order to find sufficient conditions, we define the saddle-point
conditions:
Theorem 4.17. A point $(x^\star, \lambda^\star)$ with $\lambda^\star \ge 0$ is a saddle-point
of the Lagrangian $L(x, \lambda)$ iff
i) $x^\star$ minimizes $L(x, \lambda^\star)$ over the whole X
ii) $g_i(x^\star) \le 0$, i = 1, 2, . . . , m
iii) $\lambda_i^\star g_i(x^\star) = 0$, i = 1, 2, . . . , m
It can be proved that, even if the functions f, g are not differentiable,
the saddle-point conditions are necessary and sufficient conditions
provided that f, g are convex. Although these conditions must hold at the
minimum, they are not very useful in determining the optimum point. The
determination of the optimum by direct solution of these equations is
rarely practicable.
A more practical way is to convert the constrained problem into an
unconstrained one, by defining the new objective function:
\[ P(x, K) = f(x) + \sum_{i=1}^{m} K_i [g_i(x)]^2 \tag{4.7} \]
The sum added to the objective function is called penalty function, since it
penalizes the objective function by adding a positive quantity (recall that
we want to minimize the cost function). The constants
$K = [K_1, K_2, \dots, K_m]^T$ are (positive) weighting factors that define how
strongly the i–th constraint must be satisfied, and can also make it
commensurable with the objective function.
Wherever x is inside the feasible region, we can ignore the constraints,
so a new objective function can be defined as:
\[ P(x, K) = f(x) + \sum_{i=1}^{m} K_i [g_i(x)]^2 u_i(g_i) \tag{4.8} \]
where $u_i(g_i)$ is the usual step function:
\[ u_i(g_i) = \begin{cases} 0 & \text{if } g_i(x) \le 0 \\ 1 & \text{if } g_i(x) > 0 \end{cases} \]
The introduction of the step function makes it possible to relate the penalty function defined in (4.8) to the Lagrangian function of (4.2) (page 52):
$$P(x, K) = L(x, \lambda)$$
if we let $\lambda_i = K_i\, g_i(x)\, u_i(g_i)$, so that all the previous results valid for the Lagrangian function are also valid for the penalty function.
Note that the solution found by optimizing the penalty function $P(x, K)$ converges to the point $(x^\star, \lambda^\star)$ defined by the Kuhn–Tucker conditions only in the limit $K \to \infty$.
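The behaviour of the penalty function (4.8) for growing K can be illustrated with a minimal sketch. The toy problem (minimize $x^2$ subject to $1 - x \le 0$), the helper names and the crude ternary search are all assumptions made for illustration; they are not taken from the thesis.

```python
def penalty(f, constraints, weights):
    """Build P(x, K) = f(x) + sum_i K_i * g_i(x)^2 * u_i(g_i), as in (4.8)."""
    def P(x):
        total = f(x)
        for g, k in zip(constraints, weights):
            violation = g(x)
            if violation > 0:              # step function u_i: active only when g_i(x) > 0
                total += k * violation ** 2
        return total
    return P


def minimize_1d(func, lo, hi, iterations=200):
    """Crude ternary search; assumes func is unimodal on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


# Hypothetical toy problem: minimize f(x) = x^2 subject to g(x) = 1 - x <= 0.
f = lambda x: x * x
g = lambda x: 1.0 - x


def solve(k):
    """Unconstrained minimizer of the penalized objective for weight k."""
    return minimize_1d(penalty(f, [g], [k]), -5.0, 5.0)
```

For this toy problem the penalized minimizer is $K/(1+K)$, so it violates the constraint slightly for every finite K and approaches the constrained optimum $x^\star = 1$ only as K grows.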
4.1.2 Multi-objective optimization
Multi-objective optimization is not a standard problem in engineering, but it is quite common in economics ([14]). While in the mono-dimensional problem the concept of optimum as a minimum is clear and well defined (the idea of greater or lesser is intuitive for real numbers), with multiple objectives (also called multi-criteria) the concept of minimum is less intuitive. So we must define an order relation among the points of a multi-dimensional space.
Notation 4.18. Given $x, y \in \mathbb{R}^n$, define
$x = y$ iff $x_k = y_k \ \forall k = 1, 2, \dots, n$
$x \leqq y$ iff $x_k \le y_k \ \forall k = 1, 2, \dots, n$
$x \le y$ iff $x \leqq y$ and $x \ne y$ (so $\exists k : x_k < y_k$)
$x < y$ iff $x_k < y_k \ \forall k = 1, 2, \dots, n$
Notation 4.19. In the following section, the function f is defined as $f : X \to Y$, $X \subseteq \mathbb{R}^p$, $Y \subseteq \mathbb{R}^n$. X is called the decision space, while Y is called the criteria space.
Given two outcomes $y^1, y^2$ of the cost functions, $y^1 = f(x^1)$ and $y^2 = f(x^2)$, we must define which one is better. We indicate that $y^1$ is better than $y^2$ with $y^1 \prec y^2$, that $y^1$ is worse than $y^2$ with $y^1 \succ y^2$, and, finally, that $y^1$ is indifferent with respect to $y^2$ with $y^1 \sim y^2$.
In optimization theory the definition of Pareto point, or Pareto preference, is of great importance:
Definition 4.20 (Pareto preference). Given $y^1, y^2 \in Y$, the Pareto preference is defined by
$$y^1 \prec y^2 \quad \text{iff} \quad y^1 \le y^2.$$
A Pareto preference is intuitively guided by the rule that less is better.
Definition 4.21 (Non-dominated and dominated set). If $y^1 \prec y^2$ is a binary preference defined on Y, the non-dominated and the dominated sets with respect to $\{\prec\}$ are defined as:
$$N(\{\prec\}, Y) = \{y^0 \in Y \mid \nexists\, y \in Y : y \prec y^0\}$$
$$D(\{\prec\}, Y) = \{y^0 \in Y \mid \exists\, y \in Y : y \prec y^0\}$$
If $y^0 \in N(\{\prec\}, Y)$, $y^0$ is an N–point. Similarly, if $y^0 \in D(\{\prec\}, Y)$, $y^0$ is a D–point.
Definition 4.22 (Pareto optimum). $y \in Y$ is a Pareto optimum iff it is an N–point with respect to the Pareto preference.
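For a finite set of outcomes, definitions 4.20 to 4.22 can be checked by brute force. The sketch below is only an illustration: the function names and the sample outcomes are hypothetical.

```python
def dominates(y1, y2):
    """Pareto preference (definition 4.20): y1 <= y2 component-wise and y1 != y2."""
    no_worse = all(a <= b for a, b in zip(y1, y2))
    strictly_better = any(a < b for a, b in zip(y1, y2))
    return no_worse and strictly_better


def split_n_d(Y):
    """Return the N-points and the D-points of a finite outcome set Y (definition 4.21)."""
    n_points = [y for y in Y if not any(dominates(z, y) for z in Y)]
    d_points = [y for y in Y if any(dominates(z, y) for z in Y)]
    return n_points, d_points


# Hypothetical outcomes of two cost functions (both to be minimized).
outcomes = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (4.0, 4.0)]
N, D = split_n_d(outcomes)
```

Here (3, 4) and (4, 4) are dominated by (2, 3), while the remaining three outcomes are Pareto optima.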
We now give two theorems that are fundamental for the solution of the multi-objective optimization problem; first we introduce the notation for the convex cones in $\mathbb{R}^n$:
Notation 4.23 (convex cone).
$$\Lambda^{>} = \{d \in \mathbb{R}^n \mid d > 0\}$$
$$\Lambda^{\ge} = \{d \in \mathbb{R}^n \mid d \ge 0\}$$
$$\Lambda^{\geqq} = \{d \in \mathbb{R}^n \mid d \geqq 0\}$$
Theorem 4.24. i) if $y^0 \in Y$ minimizes $\lambda \cdot y$ over Y for some $\lambda \in \Lambda^{>}$, then $y^0$ is an N–point;
ii) if $y^0 \in Y$ uniquely minimizes $\lambda \cdot y$ over Y for some $\lambda \in \Lambda^{\ge}$, then $y^0$ is an N–point.
Corollary 4.24.1. If Y is $\Lambda^{\geqq}$–convex, i.e. $Y + \Lambda^{\geqq}$ is a convex set, then a necessary condition for $y^0 \in Y$ to be an N–point is that it minimizes $\lambda \cdot y$ over Y for some $\lambda \in \Lambda^{>}$.
This very important theorem (and its corollary) states that if $y^0$ minimizes a linear weighted function $\lambda \cdot y$ (for some λ), then $y^0$ is a Pareto optimum. This reduces the problem from a multi-objective one to a mono-objective one, i.e. it is sufficient to minimize a linear weighted function of the cost functions.
Note that, along a level set of the weighted function $\lambda \cdot y$:
$$\frac{\partial y_i}{\partial y_j} = -\frac{\lambda_j}{\lambda_i}$$
so the ratio $\lambda_j / \lambda_i$ is the trade-off between a unit gain in the variable $y_j$ and a unit gain in the variable $y_i$. Finally, note that the theorem is valid for any shape of Y.
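Part i) of theorem 4.24 can be tried on a finite outcome set: each strictly positive weight vector selects one Pareto optimum. This is a minimal sketch; the outcome set and the weights are hypothetical values chosen so that no ties occur.

```python
def weighted_minimum(Y, lam):
    """Return the outcome of Y that minimizes the weighted sum lam . y."""
    return min(Y, key=lambda y: sum(l * v for l, v in zip(lam, y)))


# Hypothetical outcomes; (3, 4) and (4, 4) are dominated by (2, 3).
outcomes = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (4.0, 4.0)]

# Changing the weights moves the selected point along the non-dominated set.
favour_first = weighted_minimum(outcomes, (3.0, 1.0))
balanced = weighted_minimum(outcomes, (1.5, 1.0))
favour_second = weighted_minimum(outcomes, (1.0, 2.0))
```

A larger weight on one criterion pushes the solution towards the outcomes that are best for that criterion, which is the trade-off expressed by the ratio $\lambda_j / \lambda_i$.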
Theorem 4.25. A necessary and sufficient condition for $y^0 \in Y$ to be an N–point is that $\forall i = 1, 2, \dots, n$ there are $n - 1$ constants $\vartheta(i) = \{h_j \mid j \ne i,\ j = 1, 2, \dots, n\}$ such that $y^0$ uniquely minimizes $y_i$ over $Y(\vartheta(i)) = \{y \in Y \mid y_j \le h_j,\ j \ne i,\ j = 1, 2, \dots, n\}$.
Each constant $h_j$ can be seen as a constraint: so this theorem claims that a necessary and sufficient condition to be a Pareto optimum is to minimize one criterion (the i–th objective function), while satisfying the constraints for the remaining criteria. This is equivalent to saying that the multiple-criteria problem can be reduced to a single-criterion problem (minimize the function $y_i$) with multiple constraints (ensure that $y_j \le h_j$, $j \ne i$).
4.1.2.1 Unconstrained
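Given all the previous results, the unconstrained problem is solved with the tools already introduced: we reduce the multi-objective problem to a mono-objective one, either by minimizing a linear weighted function (theorem 4.24) or by minimizing one criterion while bounding the others (theorem 4.25). We will see in §5.1 how to apply these methods and which one is preferred.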
4.1.2.2 Constrained
Again, the solution is to reduce the problem from a multi-objective one to a mono-objective one. It is possible to combine the two previous methods, that is, to minimize a linear weighted function plus a sum of penalty functions; the only critical point is to ensure that all the terms of the sum have the same order of magnitude, so that no single term dominates the others. A third way to solve an unconstrained problem (or a constrained one, with some care) is the method of the compromise solution:
Compromise solution Given the problem 4.3, it is possible to define $y^\star$ as the ideal outcome of the cost function $f(x)$ without any constraints, so that $y^\star = \inf_{x \in X} f(x)$; the compromise solution is defined as the minimum of the regret:
$$r(y) = \|y - y^\star\|;$$
typically, the $L_p$–norm (the distance between the actual solution and the ideal point) is used:
$$r(y) = r(y; p) = \left[\sum_{i=1}^{n} |y_i - y_i^\star|^p\right]^{1/p}.$$
Again, a weight can be associated with each term of the sum:
$$r(y; p, w) = \left[\sum_{i=1}^{n} w_i^p\, |y_i - y_i^\star|^p\right]^{1/p}.$$
Definition 4.26 (Compromise solution). The compromise solution with respect to the $L_p$–norm is the point $y^p \in Y$ that minimizes $r(y; p, w)$ over Y.
The compromise solution enjoys several properties; the most important is:
Property 4.27 (Pareto optimality). The compromise solution $y^p \in Y$ is an N–point, for $1 \le p < \infty$, with respect to the Pareto preference (definition 4.20). If $y^\infty$ is unique, then it is also an N–point.
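A minimal sketch of the compromise solution on a finite outcome set follows. It assumes that the ideal point is the component-wise minimum of the available outcomes; the function names and the data are hypothetical.

```python
def regret(y, ideal, p, w=None):
    """Weighted L_p regret r(y; p, w); p = float('inf') gives the max-norm."""
    w = w or [1.0] * len(y)
    terms = [wi * abs(a - b) for wi, a, b in zip(w, y, ideal)]
    if p == float("inf"):
        return max(terms)
    return sum(t ** p for t in terms) ** (1.0 / p)


def compromise(Y, p, w=None):
    """Outcome of Y closest (in the weighted L_p sense) to the ideal point."""
    ideal = tuple(min(column) for column in zip(*Y))   # assumed ideal outcome y*
    return min(Y, key=lambda y: regret(y, ideal, p, w))


# Hypothetical outcomes; the ideal point is (1, 1), which is not attainable.
outcomes = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (4.0, 4.0)]
y2 = compromise(outcomes, 2)
y_inf = compromise(outcomes, float("inf"))
y2_weighted = compromise(outcomes, 2, w=[1.0, 3.0])
```

With equal weights the balanced outcome (2, 3) is selected; a larger weight on the second criterion moves the compromise to (4, 1). Both are N–points, as property 4.27 states.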
When the ideal point is not known, one can use an approximation or even a constraint; in the latter case the more appropriate term is satisfying level. To point out the differences between constraints and satisfying levels, one must observe:
♦ The constraints are, typically, inequality constraints: the solution must stay as far below the specified constraints as possible. In terms of an $L_p$–norm the solution must be as far as possible from the constraints, that is, the $L_p$–norm must not be minimized. So the method of the penalty function is the only one suitable for this kind of problem.
♦ The satisfying levels are, typically, equality constraints: the solution must be as close as possible to the indicated levels, that is, the $L_p$–norm must be minimized. So the method of the compromise solution can be used.
4.2 Optimization Algorithms
This is a very concise review of some of the algorithms used to optimize real circuits in the following chapters.
First some one-dimensional algorithms (with respect to the decision space) are presented, then the multi-dimensional algorithms, some of which are based on the former. Finally some “non”-standard algorithms are presented, since they can be suitable for application to digital circuits.
In the following we focus on the algorithms that do not require the evaluation of the gradient of the objective functions, or that approximate this gradient (see footnote 3), since (see §5.1) the functions available in real circuits are not known in a closed form and almost
3 Essentially with $\dfrac{\partial f}{\partial x_i}(x) \approx \dfrac{f(x + \Delta x) - f(x)}{\Delta x}$
4.2.1 One-dimensional search techniques
In order to find the minimum of a function $f : \mathbb{R} \to \mathbb{R}$, we need to bracket it:
Definition 4.28 (Bracketing). To bracket a minimum means to find a triple $a, b, c \in \mathbb{R}$, $a < b < c$, such that $f(b) < f(a)$ and $f(b) < f(c)$. This means that (for a continuous f) a minimum lies in the interval $(a, c)$.
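A bracketing triple can be found by walking downhill with growing steps. The routine below is a minimal sketch under stated assumptions: the function name, the growth factor and the test function are hypothetical, and ties between function values are not handled.

```python
def bracket_minimum(f, x0, step=1.0, grow=2.0, max_steps=60):
    """Return (a, b, c) with a < b < c, f(b) < f(a) and f(b) < f(c)."""
    a, b = x0, x0 + step
    if f(b) > f(a):                 # uphill: reverse the walking direction
        a, b = b, a
        step = -step
    c = b + step
    for _ in range(max_steps):
        if f(c) > f(b):             # the function rises again: bracket found
            return (a, b, c) if a < c else (c, b, a)
        step *= grow                # keep walking downhill with a larger step
        a, b, c = b, c, c + step
    raise ValueError("no bracket found within max_steps")


# Hypothetical test function with its minimum at x = 3.3.
parabola = lambda x: (x - 3.3) ** 2
left_start = bracket_minimum(parabola, 0.0)
right_start = bracket_minimum(parabola, 10.0)
```

The returned triple can be passed directly to the sectioning or interpolation methods described next.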
We show some of the most efficient algorithms in this field. First we introduce the family of sectioning algorithms, of which the golden section search is probably the most suitable for our purposes. Then we introduce Brent’s rule, a quadratic interpolation algorithm.
4.2.1.1 The section search
The sectioning algorithms always apply the same policy: divide and conquer. At each iteration the initial interval $[a, c]$ is reduced to a smaller interval that still brackets the minimum $x^\star$. We thus obtain a sequence of nested intervals (see figure 4.1)
$$x^\star \in [a_n, c_n] \subset [a_{n-1}, c_{n-1}] \subset \dots \subset [a, c].$$
Dichotomic search The simplest form of sectioning is the dichotomic search: at the first iteration the interval $[a, c]$ is divided into two equal parts, $[a, b]$ and $[b, c]$, so that $b = \frac{a + c}{2}$; then, choosing $\varepsilon > 0$, we check whether $f(b - \varepsilon) < f(b + \varepsilon)$. In that case the function is increasing around b, so we repeat the whole process with the new interval $[a, b]$; otherwise we repeat with $[b, c]$. It can be proved ([13]) that this method requires 2k evaluations of the function f, where k is the number of iterations. Also, the final interval length $I_k = (c_k - a_k)$ satisfies
$$\lim_{k \to \infty} I_k = \varepsilon I_0,$$
where $I_0 = (c - a)$.
So the relative uncertainty on the minimum $x^\star$ is $\varepsilon$.
Fig. 4.1: Section search algorithm (nested intervals $I_0 \supset I_1 \supset I_2$, with $b_0 = a_1$, $b_1 = c_2$, $c_0 = c_1$)
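The dichotomic search can be sketched as follows. As an assumption of this sketch (not of the thesis), the two probe points $b \pm \varepsilon$ are kept inside the new interval, so that the minimum of a unimodal function is never lost; the interval length therefore tends to $2\varepsilon$ instead of halving exactly.

```python
def dichotomic_search(f, a, c, eps=1e-6, iterations=40):
    """Shrink [a, c] around the minimum of a unimodal f; returns (a, c, evaluations)."""
    evaluations = 0
    for _ in range(iterations):
        b = 0.5 * (a + c)
        left, right = f(b - eps), f(b + eps)
        evaluations += 2                 # two function evaluations per iteration
        if left < right:                 # f rises around b: the minimum is on the left
            c = b + eps
        else:                            # f falls around b: the minimum is on the right
            a = b - eps
    return a, c, evaluations


# Hypothetical test function with its minimum at x = 2.
lo, hi, n_eval = dichotomic_search(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
```

After k iterations the function has been evaluated 2k times, which matches the count given in the text.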
Fibonacci search A more sophisticated algorithm is the Fibonacci search, where at each iteration the length of the interval is chosen according to the Fibonacci rule: $I_{k-3} = I_{k-2} + I_{k-1}$. This method has the advantage that the uncertainty after n iterations is known a priori: defining the initial interval $I_0 = I_1 = (c - a)$, then
$$I_k = \frac{I_1 + f_{k-2}\,\varepsilon}{f_k}$$
where $f_i$ is the i–th number of the Fibonacci sequence.
The number of function evaluations is again 2k, and the disadvantage of this method is that $\varepsilon$ and n must be chosen a priori.
The golden section search Given a triplet $(a, b, c)$ that brackets the minimum, we choose a new point x that defines a new bracketing triplet, $(a, b, x)$ or $(b, x, c)$, according to the rule:
$$\frac{x - b}{c - a} = 1 - 2\,\frac{b - a}{c - a}$$
This implies that $|b - a| = |x - c|$, and that at each iteration the interval is scaled by the same ratio λ.
Then we repeat the process with the new triplet. So the interval $(a, c)$ is divided into two parts, a smaller and a larger one, and the ratio between the whole interval and the larger part is the same as the ratio between the larger and the smaller part, or in other words:
$$\frac{1}{\lambda} = \frac{\lambda}{1 - \lambda},$$
giving for λ the positive solution
$$\lambda = \frac{\sqrt{5} - 1}{2}.$$
This fraction is known as the golden mean or golden section, whose aesthetic properties have been known since the ancient Pythagoreans.
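A common way to implement the golden section search keeps two interior points and reuses one function value per iteration. The sketch below follows that formulation, which is equivalent to the triplet rule above but is not necessarily the implementation used in the thesis; names and tolerances are assumptions.

```python
import math

GOLD = (math.sqrt(5.0) - 1.0) / 2.0      # lambda, about 0.618


def golden_section(f, a, c, tol=1e-8):
    """Minimize a unimodal f on [a, c]; the interval shrinks by GOLD at each iteration."""
    x1 = c - GOLD * (c - a)
    x2 = a + GOLD * (c - a)
    f1, f2 = f(x1), f(x2)
    while (c - a) > tol:
        if f1 < f2:                      # minimum in [a, x2]: x1 becomes the new x2
            c, x2, f2 = x2, x1, f1
            x1 = c - GOLD * (c - a)
            f1 = f(x1)
        else:                            # minimum in [x1, c]: x2 becomes the new x1
            a, x1, f1 = x1, x2, f2
            x2 = a + GOLD * (c - a)
            f2 = f(x2)
    return 0.5 * (a + c)


# Hypothetical test function with its minimum at x = 2.
x_min = golden_section(lambda x: (x - 2.0) ** 2 + 1.0, 0.0, 5.0)
```

Only one new function evaluation is needed per iteration, because the golden ratio makes one of the old interior points fall exactly on a new interior point.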
Convergence considerations All three previous methods have a linear convergence, since at each iteration the ratio between the new, smaller interval and the previous interval containing $x^\star$ satisfies:
$$0 \le \frac{I_{k+1}}{I_k} \le 1.$$
The asymptotic convergence rate is defined as
$$\lim_{k \to \infty} \frac{I_{k+1}}{I_k}.$$
For the dichotomic search, since $2 I_{k+1} = I_k + \varepsilon$, taking $\varepsilon = 0$ we have
$$\lim_{k \to \infty} \frac{I_{k+1}}{I_k} = \frac{1}{2}.$$
For the Fibonacci search, first we must write the generic number of the Fibonacci sequence in closed form:
$$f_k = \frac{1}{\sqrt{5}} \left[ \left(\frac{1 + \sqrt{5}}{2}\right)^{k+1} - \left(\frac{1 - \sqrt{5}}{2}\right)^{k+1} \right];$$
then it can be proved that, taking $\varepsilon = 0$:
$$\lim_{k \to \infty} \frac{I_{k+1}}{I_k} = \lim_{k \to \infty} \frac{f_k}{f_{k+1}} = \frac{\sqrt{5} - 1}{2}.$$
For the golden section search, as previously said, $I_{k+1} / I_k = \lambda$, so
$$\lim_{k \to \infty} \frac{I_{k+1}}{I_k} = \lambda = \frac{\sqrt{5} - 1}{2}.$$
Thus the convergence rates of the Fibonacci and the golden-section searches are identical.
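The closed form and the limit can be checked numerically. The sketch assumes the indexing $f_0 = f_1 = 1$, which is the one consistent with the exponent $k + 1$ in the closed form; the function names are hypothetical.

```python
import math


def fib(k):
    """k-th Fibonacci number with f_0 = f_1 = 1 (assumed indexing)."""
    a, b = 1, 1
    for _ in range(k):
        a, b = b, a + b
    return a


def fib_closed_form(k):
    """Closed form of f_k with exponent k + 1, as in the text."""
    s5 = math.sqrt(5.0)
    return (((1.0 + s5) / 2.0) ** (k + 1) - ((1.0 - s5) / 2.0) ** (k + 1)) / s5


golden = (math.sqrt(5.0) - 1.0) / 2.0
ratio_30 = fib(30) / fib(31)        # interval ratio f_k / f_{k+1} for k = 30
```

Already for k = 30 the ratio of consecutive Fibonacci numbers agrees with λ to better than ten decimal places.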
4.2.1.2 Parabolic interpolation
Given a triplet $(a, b, c)$ that brackets a minimum, we approximate the objective function in the interval $(a, c)$ with the parabola fitting the triplet. Then we find the minimum of this parabola with the formula (since we want the abscissa, the method is indeed an inverse parabolic interpolation):
$$x^\star = b - \frac{1}{2}\, \frac{(b - a)^2 [f(b) - f(c)] - (b - c)^2 [f(b) - f(a)]}{(b - a)[f(b) - f(c)] - (b - c)[f(b) - f(a)]}$$
This method is useful only when the function is quite smooth in the interval, but it has the advantage that the convergence is almost quadratic, and it is perfectly quadratic when the function to be optimized is a quadratic form.
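One step of the interpolation formula can be written directly. This is a minimal sketch with hypothetical names; the sample points are chosen on a parabola so that the result can be verified by hand.

```python
def parabolic_step(a, b, c, fa, fb, fc):
    """Abscissa of the minimum of the parabola through (a, fa), (b, fb), (c, fc)."""
    num = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    if den == 0.0:
        raise ZeroDivisionError("the three points are collinear")
    return b - 0.5 * num / den


# Hypothetical example: f(x) = (x - 2)^2 + 1 sampled at x = 0, 1, 4.
f = lambda x: (x - 2.0) ** 2 + 1.0
x_new = parabolic_step(0.0, 1.0, 4.0, f(0.0), f(1.0), f(4.0))
```

For a function that is itself a parabola a single step lands on the minimum; for other smooth functions the step is repeated with the updated triplet.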
Brent’s rule Brent’s rule is a mix of the last two techniques: it uses the golden section when the function is not regular and switches to a parabolic interpolation when the function is sufficiently regular. In particular, it always tries a parabolic step first; when the parabolic step is not useful, the method falls back on the golden section search.
4.2.2 Multi-dimensional search
These algorithms search for the solution of the optimization problem in a multi-dimensional space. Again, first an algorithm with a convergence order of 1 is presented, then an algorithm with a quadratic order of convergence is shown.
All the algorithms presented here contain, as a sub-algorithm, a one-dimensional search.
4.2.2.1 The gradient direction: steepest (maximum) descent
The method of steepest descent chooses at each iteration a new point $x + dx$ in the decision space, starting from the old point x, obviously such that:
$$f(x + dx) < f(x)$$
This new point must also be chosen such that the variation of the function f is as large as possible. In other words, if dl is the length of the step:
$$dl = \sqrt{\sum_{i=1}^{n} (dx_i)^2},$$
the steepest descent maximizes (in absolute value) the rate of change $df/dl$.
The problem of minimizing f thus becomes the problem:
Problem 4.29 (Steepest descent).
$$\max_{dx_i/dl} \frac{df}{dl} = \max_{dx_i/dl} \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \frac{dx_i}{dl},$$
such that
$$dl = \sqrt{\sum_{i=1}^{n} (dx_i)^2}.$$
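This problem can be solved with Lagrange multipliers; from equations (4.3) and (4.4) (page 52) we can write:
$$\frac{dx_i}{dl} = \frac{1}{2\lambda} \frac{\partial f}{\partial x_i},$$
with
$$\lambda = -\frac{1}{2} \left[ \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 \right]^{1/2}.$$
This means:
$$\frac{dx_i}{dl}(x) = -\frac{\dfrac{\partial f}{\partial x_i}(x)}{\left[ \displaystyle\sum_{j=1}^{n} \left( \frac{\partial f}{\partial x_j}(x) \right)^2 \right]^{1/2}} \tag{4.9}$$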
The steepest descent algorithm chooses at each iteration a new point $x^{k+1}$ from the old point $x^k$ according to equation (4.9) (page 64)
$$x^{k+1} = x^k - dl\, \nabla f(x^k), \qquad dl > 0$$
with dl chosen according to the desired convergence rate: if dl is small the algorithm approximates the minimum closely, but converges slowly, while if dl is large the convergence is fast but the algorithm can oscillate near the minimum. Thus some method is necessary to reduce (or enlarge) the step dl at each iteration: large steps if we are far away from the minimum, small steps if we are close to it. The scheme used to choose the step can greatly affect the convergence of the algorithm. The best choice is the method of the optimal gradient.
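The fixed-step iteration can be sketched with the forward-difference approximation of the gradient mentioned at the beginning of this section. All names, the step values and the test function are assumptions made for illustration.

```python
def numerical_gradient(f, x, h=1e-6):
    """Forward-difference approximation of the gradient of f at x."""
    fx = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += h
        grad.append((f(shifted) - fx) / h)
    return grad


def steepest_descent(f, x0, dl=0.1, eps=1e-6, max_iter=10000):
    """Iterate x <- x - dl * grad f(x) until max_i |df/dx_i| <= eps."""
    x = list(x0)
    for k in range(max_iter):
        g = numerical_gradient(f, x)
        if max(abs(gi) for gi in g) <= eps:     # halting criterion on the gradient
            return x, k
        x = [xi - dl * gi for xi, gi in zip(x, g)]
    return x, max_iter


# Hypothetical test function with its minimum at (1, -2).
bowl = lambda x: (x[0] - 1.0) ** 2 + 4.0 * (x[1] + 2.0) ** 2
x_min, iterations = steepest_descent(bowl, [0.0, 0.0])
```

With dl = 0.1 the iteration converges for this function; a much larger fixed step would make the second coordinate oscillate with growing amplitude, which is the behaviour that motivates an adaptive choice of dl.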
4.2.2.2 The optimal gradient
This algorithm simply calculates the step dl according to:
$$\min_{dl \in \mathbb{R}^+} f(x^k - dl\, \nabla f(x^k))$$
This is a one-dimensional optimization, and it is usually performed with one of the methods shown previously. Strictly speaking, the optimization of f is always a multidimensional one, since we descend along the gradient path, but inside this process there are many sub-optimization steps that find the optimal length of each descent step.
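If $f \in C^2$, that is, f is twice differentiable with continuous derivatives, then a closed form for the optimum step dl can be determined; we expand f in a Taylor series up to the second order:
$$f(x^k + \Delta x) = f(x^k) + \left[\nabla f(x^k)\right]^T \Delta x + \frac{1}{2}\, \Delta x^T H(x^k)\, \Delta x,$$
where $H(x)$ is the Hessian matrix of f (see footnote 4).
Along the gradient direction:
$$\Delta x = dl^k\, \nabla f(x^k).$$
Thus:
$$f(x^k + dl^k \nabla f(x^k)) = f(x^k) + dl^k \left[\nabla f(x^k)\right]^T \nabla f(x^k) + \frac{1}{2} (dl^k)^2 \left[\nabla f(x^k)\right]^T H(x^k)\, \nabla f(x^k)$$
Setting the derivative with respect to $dl^k$ to zero gives
$$\frac{df}{d\,dl^k} = \left[\nabla f(x^k)\right]^T \nabla f(x^k) + dl^k \left[\nabla f(x^k)\right]^T H(x^k)\, \nabla f(x^k) = 0 \tag{4.10}$$
and
$$dl^k = -\frac{\left[\nabla f(x^k)\right]^T \nabla f(x^k)}{\left[\nabla f(x^k)\right]^T H(x^k)\, \nabla f(x^k)}. \tag{4.11}$$
Note that here the sign is carried by $dl^k$ itself, which is negative when $H(x^k)$ is positive definite.
From $\frac{df}{d\,dl^k}(x^{k+1}) = 0$, we can see that:
$$\left\langle \nabla f(x^k + dl^k \nabla f(x^k)),\ \nabla f(x^k) \right\rangle = 0,$$
that is, $\nabla f(x^k)$ and $\nabla f(x^{k+1})$ are orthogonal or, equivalently, the displacements $x^{k+1} - x^k$ and $x^{k+2} - x^{k+1}$ are orthogonal. This means that successive steps of the optimal gradient algorithm are orthogonal.
4 The Hessian matrix of a function $f(x_1, x_2, \dots, x_n)$ is defined as:
$$H(f) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}$$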
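For a quadratic function $f(x) = \frac{1}{2} x^T A x - b^T x$ the gradient is $Ax - b$ and the Hessian is A, so the step (4.11) is exact. The sketch below uses this special case to show the orthogonality of successive gradients; the matrix, the vector and all names are hypothetical.

```python
def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def optimal_gradient(A, b, x0, eps=1e-10, max_iter=1000):
    """Minimize 0.5 x^T A x - b^T x with the step of equation (4.11)."""
    x = list(x0)
    gradients = []
    for _ in range(max_iter):
        g = [ax - bi for ax, bi in zip(matvec(A, x), b)]   # gradient A x - b
        gradients.append(g)
        if dot(g, g) <= eps:
            break
        dl = -dot(g, g) / dot(g, matvec(A, g))             # (4.11): negative for A > 0
        x = [xi + dl * gi for xi, gi in zip(x, g)]         # x^{k+1} = x^k + dl^k grad f
    return x, gradients


# Hypothetical positive definite example with its minimum at (1, -2).
A = [[2.0, 0.0], [0.0, 8.0]]
b = [2.0, -16.0]
x_min, grads = optimal_gradient(A, b, [0.0, 0.0])
```

The zig-zag path produced by the orthogonal steps is the reason why the method needs many iterations on elongated level sets, even though every single line search is optimal.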
Convergence considerations A general descent algorithm converges if:
$$\lim_{k \to \infty} \nabla f(x^k) = 0.$$
Property 4.30. The function f decreases monotonically along the (negative) gradient path.
Proof. From equation (4.9)
$$\frac{df}{dl} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \frac{dx_i}{dl} = -\frac{\displaystyle\sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2}{\left[ \displaystyle\sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 \right]^{1/2}} = -\left[ \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 \right]^{1/2} \tag{4.12}$$
Thus $\frac{df}{dl} \le 0$, i.e. the function f decreases along the path dl.
Lemma 4.31. The convergence of a descent method along the gradient path cannot be obtained in a finite number of steps.
Proof. From equation (4.12) (page 66)
$$\frac{df}{dl} = -\left[ \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 \right]^{1/2}$$
but when x approaches the optimum $x^\star$, then
$$\lim_{x \to x^\star} \frac{\partial f}{\partial x_i}(x) = 0$$
so that
$$\lim_{x \to x^\star} \frac{df}{dl}(x) = 0$$
meaning that the optimum is approached at a rate that decreases to zero.
For the optimal gradient method the convergence is only linear (see footnote 5) in $f(x^k)$, and a halting criterion for the algorithm could be:
$$f(x^k) - f(x^{k+1}) \le \varepsilon;$$
alternatively, from the necessary condition $\nabla f(x) = 0$,
$$\max_i \left| \frac{\partial f}{\partial x_i}(x^k) \right| \le \varepsilon \quad \text{or} \quad \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i}(x^k) \right)^2 \le \varepsilon.$$
Finally, note that these methods use only local gradient information, so they find only a local minimum, and that the gradient algorithms are rather inefficient in the proximity of the optimum, due to the small step size.
4.2.3 The conjugate direction method
Let $u, v \in X \subseteq \mathbb{R}^n$. They are said to be mutually orthogonal if $u^T v = 0$. Similarly, they are said to be mutually conjugate with respect to a matrix A if $u^T A v = 0$.
5 This means that $\lim_{k \to \infty} \dfrac{f(x^{k+1})}{f(x^k)} = a$, with $0 \le a \le 1$
Property 4.32. A set of n mutually conjugate (nonzero) vectors in $X \subseteq \mathbb{R}^n$ constitutes a basis for X.
The importance of a set of mutually conjugate vectors is stated by the following theorem:
Theorem 4.33. Every descent method of optimization using mutually conjugate directions is quadratically convergent.
The concept of conjugate directions is important since, intuitively, a minimization attained along one of these directions does not perturb the minimization along the other directions.
4.2.3.1 The Fletcher–Reeves conjugate gradient algorithm
This algorithm calculates the search directions, mutually conjugate with respect to the Hessian matrix of f, directly from function and gradient evaluations, without evaluating the Hessian of f itself.
Algorithm 4.34. Fletcher–Reeves conjugate gradient algorithm
Require: $x^0$ = starting point
1: repeat
2: Compute $\nabla f(x^0)$ and $h^0 = -\nabla f(x^0)$
3: for i = 1, . . . , n do
4: Replace $x^i = x^{i-1} + \lambda_{i-1} h^{i-1}$, where $\lambda_{i-1}$ minimizes $f(x^{i-1} + \lambda_{i-1} h^{i-1})$
5: Compute $\nabla f(x^i)$
6: if i < n then
7: $h^i = -\nabla f(x^i) + \dfrac{\|\nabla f(x^i)\|^2}{\|\nabla f(x^{i-1})\|^2}\, h^{i-1}$
8: end if
9: end for
10: $x^0 = x^n$
11: until halting criterion
The quantity $\dfrac{\|\nabla f(x^i)\|^2}{\|\nabla f(x^{i-1})\|^2}\, h^{i-1}$ is added to the negative gradient at each iteration, and when f is a (positive definite) quadratic form, this results in a set of mutually conjugate vectors.
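The listing above can be sketched in a few lines. As an assumption of this sketch, the line minimization is done with a single parabolic fit through λ = 0, 1, 2, which is exact only when f is quadratic along the search direction; a general implementation would use one of the one-dimensional searches of section 4.2.1. The names and the test function are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def line_minimum(phi):
    """Minimizer of phi via the parabola through lambda = 0, 1, 2 (exact for quadratics)."""
    fa, fb, fc = phi(0.0), phi(1.0), phi(2.0)
    num = (fb - fc) - (fb - fa)
    den = (fb - fc) + (fb - fa)
    return 1.0 - 0.5 * num / den if den != 0.0 else 0.0


def fletcher_reeves(f, grad, x0, eps=1e-12, max_restarts=50):
    n = len(x0)
    x = list(x0)
    for _ in range(max_restarts):
        g = grad(x)
        h = [-gi for gi in g]                      # first direction: negative gradient
        for _ in range(n):
            if dot(g, g) <= eps:                   # halting criterion on the gradient
                return x
            lam = line_minimum(lambda t: f([xi + t * hi for xi, hi in zip(x, h)]))
            x = [xi + lam * hi for xi, hi in zip(x, h)]
            g_new = grad(x)
            beta = dot(g_new, g_new) / dot(g, g)   # Fletcher-Reeves ratio
            h = [-gn + beta * hi for gn, hi in zip(g_new, h)]
            g = g_new
    return x


# Hypothetical quadratic test function; its minimum is at (32/15, -34/15).
f = lambda x: (x[0] - 1.0) ** 2 + 4.0 * (x[1] + 2.0) ** 2 + x[0] * x[1]
grad = lambda x: [2.0 * (x[0] - 1.0) + x[1], 8.0 * (x[1] + 2.0) + x[0]]
x_min = fletcher_reeves(f, grad, [0.0, 0.0])
```

On this two-dimensional quadratic the two conjugate directions of the first pass are enough to reach the minimum, up to rounding.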
4.2.3.2 The Powell conjugate gradient algorithm
Since the generation of the conjugate directions in the Fletcher–Reeves algorithm requires the computation of $\nabla f(x)$ at each iteration, and this computation is not always feasible, Powell ([15]) has developed a method to generate the conjugate directions using only a one-dimensional search at each iteration: if $x^1, x^2$ are two points reached by one-dimensional searches along the same direction v, but from different starting points, then $x^1 - x^2$ is mutually conjugate to v.
Algorithm 4.35. Powell conjugate gradient algorithm
Require: $\{h^i,\ i = 1, \dots, n\} \in X \subseteq \mathbb{R}^n$ = a set of linearly independent vectors in X, and $x^0$ = starting point
1: repeat
2: for i = 1, . . . , n do
3: Replace $x^i = x^{i-1} + \lambda_i h^i$, where $\lambda_i$ minimizes $f(x^{i-1} + \lambda_i h^i)$
4: end for
5: for i = 1, . . . , n − 1 do
6: $h^i = h^{i+1}$
7: end for
8: $h^n = x^n - x^0$
9: Find $\lambda_n$ that minimizes $f(x^n + \lambda_n (x^n - x^0))$
10: $x^0 = x^n + \lambda_n (x^n - x^0)$
11: until halting criterion
The Powell algorithm is equivalent to a one-dimensional search made sequentially along mutually conjugate directions. The only critical point of the Powell algorithm is line 8 of algorithm 4.35: replacing the n–th direction $h^n$ with the vector $x^n - x^0$ tends to produce, iteration after iteration, a set of directions that are more and more linearly dependent. The solution is to reinitialize the set of directions h every n iterations; these directions can be the columns of any orthogonal matrix, and there is also a heuristic scheme due to Powell.
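The basic Powell procedure needs only function evaluations. In the sketch below the one-dimensional search is again a single parabolic fit, which is exact only for quadratic functions; this choice, the names, the convergence threshold and the test function are assumptions, and no reinitialization of the directions is implemented.

```python
def line_search(f, x, d):
    """Move from x to the minimum of f along d (parabola through lambda = 0, 1, 2)."""
    phi = lambda t: f([xi + t * di for xi, di in zip(x, d)])
    fa, fb, fc = phi(0.0), phi(1.0), phi(2.0)
    num = (fb - fc) - (fb - fa)
    den = (fb - fc) + (fb - fa)
    lam = 1.0 - 0.5 * num / den if den != 0.0 else 0.0
    return [xi + lam * di for xi, di in zip(x, d)]


def powell(f, x0, directions, sweeps=20):
    x_start = list(x0)
    h = [list(d) for d in directions]
    for _ in range(sweeps):
        x = list(x_start)
        for d in h:                                   # search along each direction in turn
            x = line_search(f, x, d)
        new_dir = [xn - xs for xn, xs in zip(x, x_start)]
        if max(abs(v) for v in new_dir) < 1e-12:      # no progress: assume convergence
            return x
        h = h[1:] + [new_dir]                         # drop the oldest direction, add x^n - x^0
        x_start = line_search(f, x, new_dir)          # new starting point
    return x_start


# Hypothetical quadratic test function; its minimum is at (32/15, -34/15).
f = lambda x: (x[0] - 1.0) ** 2 + 4.0 * (x[1] + 2.0) ** 2 + x[0] * x[1]
x_min = powell(f, [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

No gradient is evaluated anywhere: the conjugate directions are built from the differences between points reached by line searches along the same direction.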
Figure 4.2 shows 20 iterations of the Powell algorithm searching for the minimum (located at x = 30) of a mono-dimensional function $\sim x^4$. As can be seen, the algorithm finds the minimum at x = 31.7 and it is not fooled by the presence of a local minimum at x = 10. Figure 4.3 shows 24 iterations searching for the minimum (located at x = 15) of a more complicated function $\sim x^6$: again the algorithm finds the global minimum, at x = 13.7, in the presence of local minima.
In both cases a better precision on the location of the minimum could be obtained by increasing the number of iterations.
Fig. 4.2: Minimization by Powell algorithm of a function $\sim x^4$: 20 steps. (Plot of f(x) versus x with the Powell iterates and the solution point.)
4.2.4 The “SLOP” algorithm
The SLOP algorithm ([16]) is a simple algorithm, suitable for the minimization of a particular function of a digital circuit, the delay. It is feasible for smaller circuits, since it uses no heuristics in reaching the minimum, and it stops at the first minimum it finds.
The idea behind the algorithm is simple: start from a given point $x^0 < x^\star$, then at each iteration increment a single component of $x^0$ by a defined step. For each increment track the decrease of the objective function, then keep only the increment that gives the largest decrease. Finally, use the incremented point as the new starting point.
Clearly, this algorithm works only if the starting point is $x^0 < x^\star$ (see notation 4.18), so that an increment in one component moves the function f closer to the minimum. Also, the algorithm stops at the first minimum encountered.
Fig. 4.3: Minimization by Powell algorithm of a function $\sim x^6$: 24 steps. (Plot of h(x) versus x with the Powell iterates and the solution point.)
The same function of figure 4.2 is shown in figure 4.4, minimized by the SLOP algorithm. It is possible to see that the SLOP algorithm stops as soon as it encounters the first minimum (a local minimum) at x = 10.
Algorithm 4.36. SLOP algorithm
Require: $x^0 < x^\star$ starting point
1: repeat
2: for i = 1, . . . , n do
3: Compute $f_{old} = f(x^0)$
4: Replace $x_i = x_i + \Delta x$
5: Compute $f(x^0)$ and $h_i = f_{old} - f(x^0)$
6: Replace $x_i = x_i - \Delta x$
7: end for
8: Search the index $i_{max}$ that corresponds to the maximum of $h_i$, provided that $\max_i h_i > 0$
9: Replace $x_{i_{max}} = x_{i_{max}} + \Delta x$
10: until no increment decreases f ($\max_i h_i \le 0$)
Fig. 4.4: Minimization by SLOP algorithm of a function $\sim x^4$: 180 steps. (Plot of f(x) versus x with the SLOP iterates and the solution point.)
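Algorithm 4.36 translates almost line by line into code. The following is a minimal sketch; the function name, the step value and the test function are hypothetical.

```python
def slop(f, x0, step, max_iter=10000):
    """Increment, one at a time, the component that gives the largest decrease of f."""
    x = list(x0)
    steps = 0
    for _ in range(max_iter):
        f_old = f(x)
        gains = []
        for i in range(len(x)):
            trial = list(x)
            trial[i] += step                 # tentative increment of component i
            gains.append(f_old - f(trial))   # h_i = f_old - f
        best = max(range(len(x)), key=lambda i: gains[i])
        if gains[best] <= 0:                 # no increment decreases f: stop
            break
        x[best] += step                      # keep only the best increment
        steps += 1
    return x, steps


# Hypothetical test function with its minimum at (3, 2); the start is below it.
bowl = lambda x: (x[0] - 3.0) ** 2 + (x[1] - 2.0) ** 2
x_min, n_steps = slop(bowl, [0.0, 0.0], 0.5)
```

The number of iterations grows with the distance from the minimum divided by the step, which explains the large step count reported in figure 4.4; the routine also has no way to leave a local minimum once it is reached.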
4.2.5 The simulated-annealing algorithm
The name of this algorithm comes from an analogy with thermodynamics: it is known that if a liquid is cooled slowly, it freezes naturally into a state of minimum energy. This process is called annealing.
The numerical algorithm applies this analogy to the minimization of a function: first go downhill towards a minimum as far as possible, then go slightly uphill, since the minimum just found could be a local minimum, then go downhill again, and so on. In thermodynamics, the probability of going from a state with energy $E_1$ to a state with energy $E_2$ is given by:
$$p = e^{-\frac{E_2 - E_1}{kT}},$$
where k is the Boltzmann constant and T is the temperature of the system.
In order to apply this scheme to function minimization, it is necessary to define the energy of the system (i.e. the objective function), the temperature of the system, and an annealing schedule (i.e. the scheduled number of annealing iterations): at each iteration the temperature defines a random fluctuation in the minimum found, to simulate the thermal fluctuations of the atoms. Also, at each iteration the temperature is decreased, to reduce the thermal fluctuations and thus converge to the global minimum.
The rate at which the temperature decreases influences the rate of convergence (the faster the cooling, the faster the convergence), but it also influences the quality of the minimum (the slower the cooling, the higher the probability of converging to a global minimum).
As an example, a possible annealing schedule (probably the simplest) would be: after k steps, reduce the temperature T by $T = (1 - \varepsilon) T$, where $\varepsilon$ is determined by experiment.
Fig. 4.5: Minimization by Simulated-annealing algorithm of a function $\sim x^4$: 130 steps. (Plot of f(x) versus x with the annealing iterates and the solution point.)
The same functions of figures 4.2 and 4.3 are shown in figures 4.5 and 4.6, minimized by the simulated annealing algorithm. As in the case of the Powell algorithm, the simulated annealing is not fooled by the presence of local minima, but the number of iterations is greater for both functions: 130 in the first case, 200 in the second one.
Fig. 4.6: Minimization by Simulated-annealing algorithm of a function $\sim x^6$: 200 steps. (Plot of h(x) versus x with the annealing iterates and the solution point.)
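A minimal one-dimensional sketch of the scheme described above follows. The acceptance rule uses the Boltzmann factor with the objective function as energy (the constant k is absorbed into the temperature) and the schedule $T = (1 - \varepsilon)T$. All parameter values, the random move and the test function are assumptions; the result depends on them and on the random seed, so no specific outcome is claimed.

```python
import math
import random


def simulated_annealing(f, x0, t0=5.0, cooling=0.05, steps_per_t=20,
                        n_temps=200, move=0.5, seed=0):
    """Return the best point found and its value; f plays the role of the energy."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    t = t0
    for _ in range(n_temps):
        for _ in range(steps_per_t):
            cand = x + rng.uniform(-move, move)          # random fluctuation
            fc = f(cand)
            # Downhill moves are always accepted, uphill moves with probability
            # exp(-(E2 - E1) / T).
            if fc <= fx or rng.random() < math.exp(-(fc - fx) / t):
                x, fx = cand, fc
                if fx < best_f:
                    best_x, best_f = x, fx
        t *= (1.0 - cooling)                             # schedule T = (1 - eps) T
    return best_x, best_f


# Hypothetical double-well function: local minimum near x = 1.3, deeper one near x = -1.5.
double_well = lambda x: x ** 4 - 4.0 * x ** 2 + x
x_best, f_best = simulated_annealing(double_well, 2.0)
```

Keeping track of the best point visited is a common safeguard: the current point may move uphill at high temperature, while the returned value can never be worse than the starting one.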
4.3 Conclusions
After all this mathematical theory, a few words must be spent on the choice of which algorithm is feasible to use.
The characteristics of each algorithm are summarized in table 4.1: this table indicates several characteristics that can be useful for the real implementation of a circuit optimizer.
In the same manner, the previous sections illustrate all the basic theory, useful to justify some choices made in the implementation of the optimizer.
Tab. 4.1: Optimization algorithms
Algorithm | Pro | Con
Mono-dimensional
Section search | Simple implementation. | Converges to local minima.
Parabolic interpolation | Fast convergence. | Has some pre-requirements.
SLOP | The simplest implementation. | Converges to local minima. Very slow.
Multi-dimensional
Conjugate directions | Good convergence. | Requires gradient knowledge.
Powell scheme | Fast convergence. Does not require gradient knowledge. Is not trapped by local minima. | Difficult implementation.
Simulated annealing | Simple implementation. Does not require gradient knowledge. Is not trapped by local minima. | Very slow. Fragile with respect to some critical parameters.
Chapter 5
CIRCUIT OPTIMIZATION
The goal of the optimization step during a design flow is to obtain an “optimized” design from a given design. Figure 5.1 shows the various levels of possible optimization.
The optimization level we are concerned with is the innermost level, indicated here as dimension optimization. The optimization levels, that is, the levels at which the designer can apply suitable techniques, are, briefly:

Fig. 5.1: Design flow
System optimization This is the highest level of optimization: it concerns the optimization made in user space or kernel space on the applications running in the system subject to the optimization process.
Behavioural optimization At this level the optimization is carried out by choosing the best algorithm to implement the functions.
Logic optimization This is the optimization made by mapping the given functions or algorithms (from a behavioural optimization) into boolean functions. It is equivalent to choosing the logic gates that implement these functions.
Dimension optimization This is the lowest level of optimization: it is made by choosing the proper transistor dimensions in each gate that implements a logic function. This is the optimization on which the efforts of this thesis focus.
Section 5.1 presents the three kinds of target to be optimized in a real circuit: delay (§5.1.1), power consumption (§5.1.2) and area occupancy (§5.1.3). In particular §5.1.1.1 shows the delay obtained from Elmore’s formula (chapter 2, page 15), while §5.1.1.2 shows the delay as it is obtained by HSPICE and FAST (chapter 3, page 21).
Section 5.2 contains some applications of the mathematical results of chapter 4 (page 47): in particular §5.2.2 shows the results of a mono-objective optimization, while §5.2.3 shows the results of a multi-objective optimization. Some conclusions are drawn in section 5.3.
5.1 Optimization targets
There are, mainly, three target policies in optimizing real circuits: minimize the delay, minimize the power consumption and minimize the area occupancy. In some cases these policies conflict with one another (for example, minimizing the delay typically increases the circuit area), while in other cases they go together (for example, minimizing the power consumption may lead to a reduction of the area occupancy).
There is another policy that can be considered, especially in the field of sub-micron digital circuit design: noise reduction; however, this requires a good noise model of the circuit, and at present there are few good ones.
Now we are going to analyze the three principal optimization policies, with particular regard to their compatibility with the optimization algorithms of chapter 4.
5.1.1 Circuit delay
Until now the generic word “delay” has been used, but now it is necessary to define better the meaning of delay in a real circuit.
Generally the delay of a CMOS gate, or of a CMOS circuit, is defined as the difference between the time when the output is at 50% of its peak value (indicated with $t_o$ in figure 5.2) and the time when the input is at 50% of its peak value (indicated with $t_i$ in the same figure).
Fig. 5.2: Delay definition (input and output waveforms versus time; Delay = $t_o - t_i$, measured between the 50% points of $V_{IN}$ and $V_{OUT}$)
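The 50% definition can be applied to sampled waveforms by interpolating linearly between the samples. The sketch below is only an illustration: the function names, the sample values and the units are hypothetical, and it is not the way HSPICE or FAST measure the delay.

```python
def crossing_time(times, values, level):
    """First time at which the piecewise-linear waveform crosses `level`."""
    points = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v0 != v1 and min(v0, v1) <= level <= max(v0, v1):
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)   # linear interpolation
    raise ValueError("the waveform never crosses the requested level")


def delay_50(times, v_in, v_out, v_in_peak, v_out_peak):
    """Delay = t_o - t_i, both measured at 50% of the respective peak value."""
    t_i = crossing_time(times, v_in, 0.5 * v_in_peak)
    t_o = crossing_time(times, v_out, 0.5 * v_out_peak)
    return t_o - t_i


# Hypothetical inverter-like waveforms (arbitrary time and voltage units).
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
v_in = [0.0, 0.0, 2.0, 2.0, 2.0, 2.0]     # rising input
v_out = [2.0, 2.0, 2.0, 1.0, 0.0, 0.0]    # falling output
delay = delay_50(t, v_in, v_out, 2.0, 2.0)
```

In this example the input crosses 50% at t = 1.5 and the output at t = 3.0, so the delay is 1.5 time units.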
This definition is good only for theoretical discussion since:
♦ generally a circuit has more than one input and more than one output;
♦ there is not always a direct path from the input to the output (think of dynamic logic), i.e. a change in an input does not always directly cause a change in the output.
So the definition of “delay” of a CMOS circuit must be investigated, to produce real numbers useful for optimization. In order to define it, the concept of critical paths has been introduced in [16].
In the following I introduce a new mathematical formulation of the definition of “critical path”; this formulation will be useful for automatically solving the problem of finding all the critical paths in a circuit in §6.2.4 (page 115).
Critical Paths The idea of critical paths in a CMOS circuit can be derived, intuitively, from the idea of a path between the output and the input: a critical path is a conducting path between a node (the “output” node, i.e. the final node of the path) and the ground, or between this node and the power supply, such that a change in the state of the gate input of a MOSFET included in the path directly causes a change in that node. Naturally each MOSFET included in the path must be on, or must switch to conduction, in order to create a conducting path.
This concept must be extended, however, since a change of the so-called output node can itself cause a change in another critical path (i.e. the output node is itself connected to a gate of another critical path), so that a change in a gate node at the very beginning of the circuit may propagate through many conducting paths.
Definition 5.1 (Critical path). A critical path is a set of conducting paths such that:
i) each conducting path is between a generic node and a ground node, or between a generic node and a power supply node, and is composed of MOSFETs; and
ii) each final node of a conducting path is either connected to a gate of a MOSFET included in another critical path, or is an output of the circuit; and
iii) a change in the state of any MOSFET gate in the first conducting path propagates up to the last conducting path, causing a change in the critical path output node.
Definition 5.2 (Critical path delay). The delay of a critical path is the delay between the output node of the critical path and the gate node causing the state change of the output node.
From definition 5.1 it is clear that even a simple circuit has more than one critical path in it (see footnote 1).
In order to develop a rigorous definition of critical paths, let us introduce
the following sets, characterizing a typical CMOS circuit:

\[
\begin{aligned}
G &= \{g_1, g_2, \dots, g_j, \dots\} && \text{set of all the MOSFET gate nodes in the circuit} \\
N &= \{n_1, n_2, \dots, n_j, \dots\} && \text{set of all the nodes in the circuit} \\
O &= \{o_1, o_2, \dots, o_j, \dots\} && \text{set of all the output nodes of the circuit} \\
I &= \{i_1, i_2, \dots, i_j, \dots\} && \text{set of all the input nodes of the circuit} \\
M &= \{m_1, m_2, \dots, m_j, \dots\} && \text{set of all the MOSFETs in the circuit} \\
V &= \{\mathrm{gnd}, \mathrm{vdd}\} && \text{ground node and power supply node}
\end{aligned}
\]

Let us also define the set $N_{m_j}$ as the set of all the nodes pertaining to the
MOSFET $m_j$, and denote the gate of the $j$-th MOSFET by $g_{m_j}$.
These sets satisfy the relations $I \subseteq G \subset N$, $V \subset N$, $O \subseteq N \setminus G$.
The generic $n$-th critical path of a circuit, denoted by $C_n$ (equation (5.1a),
page 82), is the collection of conducting paths, denoted by $\gamma_{ni}$, such that
each $\gamma_{ni}$ (equation (5.1b)) is defined as the union of two ordered node sets:
the set $G_{\gamma_{ni}}$ (equation (5.1c)) of all the gates of the $k$ MOSFETs pertaining to the
conducting path, and the set $D_{\gamma_{ni}}$ (equation (5.1d)) of all the drain and source
nodes ($k + 1$ in number, see footnote 2) of the same $k$ MOSFETs. Condition (5.1e)
requires the two sets to refer to the same MOSFETs.
The nodes in the set $D_{\gamma_{ni}}$ have a peculiar property: the first and the last one
may or may not (footnote 3) be shared by two or more MOSFETs, while all the
other ones must be shared by two or more MOSFETs.
In other words, the set $D_{\gamma_{ni}}$ is an ordered collection of nodes such that
among these nodes there are $k$ MOSFETs, constituting a continuous (and
conducting) path from the output node to a power supply (or ground)
node.
1. The simplest circuit, the inverter, has two critical paths, since a change in the input
from low to high involves the path comprising only the n-MOSFET, while a change from
high to low involves the path comprising only the p-MOSFET.
2. Note that MOSFETs in a conducting path share a common drain or source node two by
two, so a conducting path made of one MOSFET has two nodes (one drain and one
source), a path made of two MOSFETs has three nodes (the MOSFETs share one drain
node), and so on.
3. This is the reason why in equation (5.1d) the index j ends at k and not at k + 1.
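Finally, collecting all the definitions, respectively, of critical path, conducting
path, conducting path gate nodes set and conducting path drain nodes set:

\[ C_n = \bigcup_i \gamma_{ni} \tag{5.1a} \]

\[ \gamma_{ni} = G_{\gamma_{ni}} \cup D_{\gamma_{ni}} \tag{5.1b} \]

\[ G_{\gamma_{ni}} = \{\, g_j \mid g_j \in G,\ \forall j = 1, \dots, k \,\} \tag{5.1c} \]

\[
D_{\gamma_{ni}} = \{\, n_j \mid n_1 \in V \wedge n_j \in N \setminus G \wedge (n_j, n_{j+1}) \in N_{m_j} \setminus g_{m_j}
\wedge \left[ (n_{k+1} \in G_{\gamma_{n,i+1}}) \vee (n_{k+1} \in O) \right],\ \forall j = 1, \dots, k \,\} \tag{5.1d}
\]

with $G_{\gamma_{ni}}$, $D_{\gamma_{ni}}$ such that, given

\[
\begin{aligned}
M_G &= \{\, m_j \mid m_j \in M \wedge g_{m_j} \in G_{\gamma_{ni}} \,\}, \\
M_D &= \{\, m_j \mid m_j \in M \wedge N_{m_j} \setminus g_{m_j} \subset D_{\gamma_{ni}} \,\},
\end{aligned}
\tag{5.1e}
\]

then $M_G = M_D$.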
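As an illustration of the two ordered sets of a conducting path, the following minimal Python sketch builds the gate set and the drain/source node set of a series chain and checks that k MOSFETs give k + 1 channel nodes. The `Mosfet` record, the `path_sets` helper and the two-transistor pull-down chain are assumptions introduced only for this example; they are not part of the original formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mosfet:
    """Hypothetical record: one MOSFET with its three relevant nodes."""
    name: str
    gate: str
    drain: str
    source: str


def path_sets(chain, start_node):
    """Return the ordered gate set G and drain/source node set D of a chain.

    chain      -- MOSFETs ordered from the supply (or ground) side
    start_node -- the node n_1, expected to belong to V = {gnd, vdd}
    """
    gates = [m.gate for m in chain]
    nodes = [start_node]
    for m in chain:
        here = nodes[-1]
        # Walk through the channel: leave the MOSFET from its other terminal.
        if here == m.source:
            nodes.append(m.drain)
        elif here == m.drain:
            nodes.append(m.source)
        else:
            raise ValueError(f"{m.name} is not connected to node {here}")
    return gates, nodes


# Assumed example: pull-down chain gnd - m1 - x - m2 - out
m1 = Mosfet("m1", gate="a", drain="x", source="gnd")
m2 = Mosfet("m2", gate="b", drain="out", source="x")
G, D = path_sets([m1, m2], "gnd")
print(G, D)
```

Because the node list is built by walking the same MOSFETs whose gates are collected, the two sets refer to the same transistors by construction, which is the requirement expressed by condition (5.1e).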
[Figure: TSPC full adder (carry part), clocked by CLK, with inputs A, B, C and numbered MOSFETs; the critical paths are labeled by MOSFET lists such as 1-2-4-5-11, 1-3-4-5-11, 1-7-8-5-11 and 9-10]
Fig. 5.3: Example of critical paths
Figure 5.3 shows an example of critical paths in a dynamic circuit
(actually the carry part of a full-adder in TSPC logic). The figure
represents the six critical paths, each one with the list of its MOSFET numbers.
For example, the first critical path (C1) is composed of the conducting path
γ11, made up of n-MOSFETs 1, 2, 4, 5 and the p-MOSFET 11: that means that
the set G11 is composed of the gate nodes of transistors 1, 2, 4, 5 and 11,
while D11 is made up of the drain and source nodes of the same transistors; if
the gate of one of the n-MOSFETs 1, 2, 4, 5 switches from the low state to the
high state (and the others are all at the high state), then the gate of p-MOSFET 11
is discharged, and this p-MOSFET conducts, charging the output node. Another
critical path, for example, is the one composed only of the p-MOSFET 6: if
its gate switches from high to low, then the gate of n-MOSFET 9 switches from
low to high, but this cannot produce the discharging of the output node,
since the gate of n-MOSFET 6 is driven by the same signal as the original
p-MOSFET.
Note. The definition of critical path can be viewed as a tree rooted at the
transistor that drives the change in the critical path. One leaf of the tree
is the transistor whose drain (or source) is the critical path output node. So
it is possible to traverse the tree between the root (the input) and a leaf (the
output): if one is able to model all the lateral subtrees encountered while
traversing the tree as static loads, then the tree becomes a transistor
chain (figure 5.4). This is the basis for the use of several delay models that
are able to evaluate a chain delay.
[Figure: a transistor tree between INPUT and OUTPUT (TREE) reduced to a transistor chain between INPUT and OUTPUT (CHAIN)]
Fig. 5.4: Critical path tree that becomes a chain.
After the definition of critical paths, the problem of associating a delay
(one and only one) to a circuit is still unresolved, since there is surely more
than one critical path in a circuit: the solution is to find the maximum of all the
critical path delays, and regard this delay as the delay of the whole circuit.
In this manner, we are sure that a change in the state of a node caused
directly by an input can never occur after the maximum delay so fixed. This
definition is also consistent with the optimization purposes, since the
optimization objective is always (usually!) the minimization of the delay. So the
strategy to be applied is a min–max scheme of optimization (minimization
of the maximum).
Definition 5.3 (Circuit delay, see footnote 4). The delay of a circuit $t_d$ is:

\[ t_d = \max_n \{ d(C_n) \} \]

where $d(C_n)$ is the delay of the $n$-th critical path of the circuit.
So, finally, in order to know the delay of a circuit, one must search for
all the critical paths in the circuit, calculate (or measure) the delay of each
critical path, and take the maximum of these delays.
The delay of each critical path can be calculated by means of some
model (possibly after the transformation of figure 5.4), or measured by means
of simulations.
This delay, however obtained, must be analyzed in order to check
its coherence with the mathematical results of chapter 4 (page 47), and the
validity of these results.
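The min–max rule of definition 5.3 can be sketched in a few lines of Python. The function name `circuit_delay` and the two delay values below are assumptions chosen only for illustration; they are not measurements from this work.

```python
def circuit_delay(path_delays):
    """Return (worst path name, delay) from a mapping {critical path: delay}."""
    if not path_delays:
        raise ValueError("at least one critical path is required")
    worst = max(path_delays, key=path_delays.get)
    return worst, path_delays[worst]


# Hypothetical delays in picoseconds for the two critical paths of an inverter.
delays_ps = {
    "C1 (n-MOSFET, input rising)": 410.0,
    "C2 (p-MOSFET, input falling)": 520.0,
}
worst_path, t_d = circuit_delay(delays_ps)
print(worst_path, t_d)
```

An optimizer that minimizes `t_d` has to re-evaluate every critical path at each step, since the path that gives the maximum can change as the widths change.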
5.1.1.1 Delay formula obtained by the Elmore model
The delay function obtainable from Elmore's model (§2.1, page 16) is a
continuous function. Referring to figure 2.1 (page 15), the delay of a single
MOS is:

\[ t_{d_i} = R_0 C_{S_i} + (R_0 + R_{d_i}) C_{D_i} + (R_0 + R_{d_i} + R_L) C_L \]
The drain and source capacitances and the dynamic resistance of a MOS
are functions of the MOS width $W$:

\[ C_{D_i} = C_j W_i, \qquad C_{S_i} = C_j W_i, \qquad R_{d_i} = \frac{R_j}{W_i} \]

where $C_j$ and $R_j$ are, respectively, the capacitance per unit width and the
resistance for a unit width. The delay as a function of the MOS width becomes:

\[ t_{d_i} = R_0 C_j W_i + \left( R_0 + \frac{R_j}{W_i} \right) C_j W_i + \left( R_0 + \frac{R_j}{W_i} + R_L \right) C_L . \]

Separating the terms containing the width $W_i$ from the terms that are
independent of $W_i$ we obtain:

\[ t_{d_i} = 2 R_0 C_j W_i + \frac{R_j}{W_i} C_L + R_j C_j + (R_0 + R_L) C_L . \]

4. The reason why we want to define a single value for the optimization of the delay and, for
example, do not apply the multi-objective methods of the following sections, is that all
the critical path delays are commensurable and have the same global behaviour (cf.
§5.2.3, page 102).
Summing the delays of all the MOS in a conducting path we obtain the
total delay of this path:

\[ t_d = \sum_i t_{d_i} = \sum_i \left( A W_i + \frac{B}{W_i} + C \right) \]

where $A$, $B$, $C$ are all independent of $W_i$.
The delay of a critical path is the sum (footnote 5) of the delays of all its
conducting paths.
As long as A and B are not zero, the delay td is a convex function
(definition 4.12, page 50), as in figure 5.5. If the term A is zero, instead, the
delay is a monotonically decreasing function (figure 5.6).
Note that the term A is zero, in practice, only if the resistance R0 is
zero, that is, if the MOSFET chain is driven by an ideal voltage source.
5. This definition introduces further errors in the delay model, since the conduction of the
conducting paths following the first one does not start when the output of the first one is
at its 50%, but long before.
[Figure: delay td versus width Wj, with the minimum tmin reached at Wmin]
Fig. 5.5: Elmore delay: convex function
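The two behaviours can be checked numerically with a small Python sketch of the single-MOS expression derived above, where A = 2 R0 Cj, B = Rj CL and C = Rj Cj + (R0 + RL) CL. The component values are arbitrary assumptions (in arbitrary units), and the closed-form minimum W = sqrt(B/A) is simply the stationary point of A W + B/W; it is not a result quoted from the text.

```python
import math


def elmore_terms(R0, RL, CL, Rj, Cj):
    """Coefficients of t_di = A*W + B/W + C for a single MOS."""
    A = 2.0 * R0 * Cj
    B = Rj * CL
    C = Rj * Cj + (R0 + RL) * CL
    return A, B, C


def stage_delay(W, A, B, C):
    """Elmore delay of one MOS of width W."""
    return A * W + B / W + C


def optimal_width(A, B):
    """Width minimizing A*W + B/W, or None in the monotonic case A == 0."""
    if A == 0:
        return None
    return math.sqrt(B / A)


# Assumed values in arbitrary units: R0 = 1, RL = 0.5, CL = 8, Rj = 4, Cj = 1.
A, B, C = elmore_terms(R0=1.0, RL=0.5, CL=8.0, Rj=4.0, Cj=1.0)
w_opt = optimal_width(A, B)
t_min = stage_delay(w_opt, A, B, C)
print(A, B, C, w_opt, t_min)

# Ideal voltage source (R0 = 0): A vanishes and the delay only decreases with W.
A0, B0, C0 = elmore_terms(R0=0.0, RL=0.5, CL=8.0, Rj=4.0, Cj=1.0)
print(optimal_width(A0, B0), stage_delay(1.0, A0, B0, C0), stage_delay(10.0, A0, B0, C0))
```

With a non-zero R0 the delay has an interior minimum, while with R0 = 0 any increase of the width reduces the delay, so only a bound on the width or a stopping rule can end the search.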
5.1.1.2 Delay measurement obtained by the FAST model and by HSPICE
The delay obtained by the FAST model and by HSPICE simulations is a
measure and not a formula. It is a one-to-one correspondence between
the MOSFET widths and the resulting delay, and it is not possible to express
this delay by means of a closed-form formula (footnote 6).
Figures 5.7 and 5.8 represent the delay of a CMOS inverter when the
dimensions of both the n-MOSFET and the p-MOSFET are increased uniformly.
The first figure shows the delay of the inverter driven by another inverter
(with fixed dimensions) simulated by HSPICE; the second figure shows the
delay of the same inverter driven, instead, by an ideal voltage source and
simulated by FAST.
These are an experimental proof of the statement given in the previous
section: if the voltage source is not ideal, that is, it depends on the MOSFET
widths of the circuit, the delay curve is strictly convex (figure 5.7), while if
the voltage source is ideal, i.e. independent of the MOSFET widths, then
the delay curve is monotonically decreasing (but still convex).
Taking into account the interconnection delays, which can no longer be
neglected in deep sub-micron technologies, does not modify the delay function,
since a width-independent (footnote 7) delay is added to the total delay function.
So, in conclusion, the delay curve is a convex function of all the MOSFET
widths (footnote 8), strictly or not depending on the operating condition of the
circuit.
6. It is possible, however, after measuring a set of delays varying with the widths, to fit the
results with an approximate formula, now in closed form.
[Figure: delay td versus width Wj]
Fig. 5.6: Elmore delay: monotonic function
5.1.2 Power consumption
The calculation of the power consumption of a circuit is quite different
from the calculation of the delay: while the delay is a local property of a
single critical path (§5.1.1), the power consumption is a global property
of the circuit. That is, the power consumption of a circuit is not the sum
of the power consumption of each critical path (footnote 9). Even if the
definition of “power consumption” is global, it is not univocal: the power
dissipation of a circuit surely depends on the input conditions. Changing the
input states changes the overall power dissipation, making some MOSFETs
conduct and others not. Again, one must choose a definition of “power
dissipation” giving a single number, for the purpose of the optimization.
7. The interconnection delay can be seen, in second approximation, as proportional to the
MOS widths, since greater widths mean greater circuits, and in a layout this means that the
average length of the interconnections increases as well. This proportionality (empirically
found to be linear to quadratic) does not modify the delay function, since it adds a term that
is both an increasing and a convex function.
8. The two-dimensional representation of figures 5.5, 5.6, 5.7 and 5.8 is only for the sake of
simplicity of the drawing. The convexity is still valid in multi-dimensional representations.
[Figure: delay td [ps] versus width Wj [µm]]
Fig. 5.7: HSPICE delay: convex function
Considering that the objective of the power optimization (hereinafter
power consumption optimization is abbreviated as power optimization)
is the minimization of the total power dissipation, a min–max strategy
is the most appropriate, as in the case of the delay minimization. Instead of
evaluating the power consumption for all the input combinations, we take
advantage of the definition of critical path:
Definition 5.4 (Circuit power consumption, see footnote 10). The power dissipation of
a circuit $P_d$ is:

\[ P_d = \max_n \{ p(C_n) \} \]

where $p(C_n)$ is the power consumption of the entire circuit when the
input conditions of the $n$-th critical path are applied.
9. In first approximation the power consumption could be the sum of the power
dissipated by each critical path in a fully static CMOS circuit.
10. The same reasoning of note 4 (page 84) applies here.
[Figure: delay td [ps] versus width Wj [µm]]
Fig. 5.8: FAST delay: monotonic function

In this manner it is possible to apply a min–max scheme of optimization
and, at the same time, to evaluate the power consumption during the same
evaluation run used for the critical path delay, allowing a
substantial reduction of the time necessary for the complete evaluation.
In the following, the terms power consumption and energy dissipated
will be used interchangeably, since the simple relation between them is:

\[ \bar{E} = \int P(t)\, dt; \]

this means that the mean energy dissipated by a circuit is computed as
the integral of the power: it depends on the simulation time
(or the window of time that we are considering), but it does not depend
on the frequency of the signals at which the circuit itself operates.
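As a minimal sketch of how the energy can be obtained from a simulated power waveform, the following Python function integrates sampled values of P(t) over the observation window with the trapezoidal rule. The function name `energy`, the choice of the trapezoidal rule and the triangular pulse are assumptions made for this example only.

```python
def energy(times, powers):
    """Trapezoidal integral of sampled power P(t) over the time window."""
    if len(times) != len(powers) or len(times) < 2:
        raise ValueError("need at least two matching (time, power) samples")
    total = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        total += 0.5 * (powers[k] + powers[k - 1]) * dt
    return total


# Hypothetical triangular power pulse: 0 -> 2 mW -> 0 over 2 ns.
t = [0.0, 1e-9, 2e-9]
p = [0.0, 2e-3, 0.0]
E = energy(t, p)
print(E)  # about 2e-12 J, i.e. 2 pJ
```

The result depends only on the samples inside the chosen window, which is why the window has to be the same for all the critical paths being compared.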
The power consumption of a CMOS circuit is the sum of three terms (§3.3,
page 36):

\[ P_{\mathrm{TOT}} = P_{\mathrm{switch}} + P_{\mathrm{short}} + P_{\mathrm{sub\text{-}th}} \tag{5.2} \]

the switching power $P_{\mathrm{switch}}$, due to the charging and discharging of the
internal parasitic capacitances; the short-circuit power $P_{\mathrm{short}}$, due to the
simultaneous conduction of n-MOSFET and p-MOSFET, which gives a direct
conducting path from the power supply to the ground for a short time; and
the sub-threshold power $P_{\mathrm{sub\text{-}th}}$, due to the sub-threshold conduction of the MOS.
In a first approximation the first term $P_{\mathrm{switch}}$ is proportional to the MOSFET
widths in the circuit (greater width means greater capacitance), the second
term $P_{\mathrm{short}}$ is proportional to the switching time and thus it is inversely related
to the MOSFET widths (greater capacitance means slower switching time),
while the last term $P_{\mathrm{sub\text{-}th}}$ is proportional to the MOSFET widths.
[Figure: energy [pJ] versus width Wj [µm]]
Fig. 5.9: HSPICE Energy
As an example, the total power consumption of a single gate is sketched
in figure 5.9: as can be expected, the energy increases with the widths, but
it is not convex.
The three terms of equation (5.2) do not weigh equally in the sum giving
the energy consumption: in order of influence the first term (§3.3.1,
page 36) is the greatest, then comes the second term (§3.3.2, page 39), and
finally the third term (§3.3.3, page 39). For a sub-micron technology the
second term (the short-circuit dissipation) is about 10% of the first, with
the third term (sub-threshold conduction dissipation) about 1% of the first.
It could be expected that with the scaling of the technology (in the deep
sub-micron field) the first and the second term become comparable, with
the third term still a fraction of the other two, giving a power figure not
increasing (or even decreasing) with the MOS widths; but it could also be
expected that with the scaling down the interconnect capacitances become
predominant, making the first term (the power dissipation due to
capacitance charging and discharging) still the greatest.
In summary, the power consumption figure of a CMOS circuit is an
increasing function of the MOSFET widths, but no assumptions can be made
about the convexity of this function.
5.1.3 Area
The area occupation of a circuit can be expressed in a closed form:

\[ A = \sum_j \alpha_j W_j + \beta . \tag{5.3} \]
The area occupation is composed of two terms: a term directly
proportional to the MOSFET widths (i.e. to the area occupied by the single MOSFETs)
and a term independent of the MOSFET widths (comprising, for example,
the interconnect area). Both terms are, of course, positive, so the curve
of the area occupied versus the MOSFET widths is a monotonically increasing
curve (footnote 11) and, being linear, a convex function.
Taking into account the interconnection area does not modify the
property of the “area” function, since the only modification of equation (5.3)
(page 91) is in the term β, which is independent of the MOS widths (footnote 12).
5.2 Optimization examples
In order to illustrate some of the issues introduced in the previous sections,
some CMOS gates are analyzed in the following. These gates are summarized
in table 5.1, with the second column showing the total number of critical
paths in a gate, and the third column showing the total number of MOSFETs
in a gate. The last two gates are dynamic full-adders: the former is composed
of complex gates in order to perform the computation in one stage, while
the latter is composed only of basic gates (and, or and inverter); this
explains why the last full-adder has many more transistors than the first one.
11. It is a straight line in two dimensions, a plane in three dimensions and a hyperplane
in four or more dimensions, but it is always a convex function.
12. See note 7, page 87.
Tab. 5.1: Basic gates: complexity

Gate                                                     # of critical paths   # of transistors
Inverter (fig. 5.10)                                              2                   2
TSPC type n latch (fig. 5.11(a))                                  4                   6
TSPC type p latch (fig. 5.11(b))                                  4                   6
TSPC type n and (fig. 5.12(a))                                    4                   7
TSPC type p and (fig. 5.12(b))                                    5                   7
TSPC type n or (fig. 5.13(a))                                     5                   7
TSPC type p or (fig. 5.13(b))                                     4                   7
Static and-or (fig. 5.14)                                        12                  14
Static and (footnote 14)                                          3                   4
Static or (footnote 14)                                           3                   4
Static parity gate (fig. 5.15)                                   24                  48
Static full-adder (figs. 5.16(a), 5.16(b))                       34                  40
TSPC full-adder (one-stage) (figs. 5.17(a), 5.17(b))             26                  13
TSPC full-adder (basic cells)                                    82                 126
Table 5.2 shows the delays and the energy consumption of the gates
of table 5.1: for each gate it reports the maximum delay, the average delay,
the maximum energy and the average energy over all the critical paths. All the
simulations are made at the minimum width for each technology (viz. 1 µm
for the 0.7 µm technology and 0.5 µm for the 0.25 µm technology).
14. For a schematic of the static “and” and the static “or” see figure 5.14: the “and” is the
first gate of the schematic (on the left side of the picture), while the “or” is the last but one
gate, before the final inverter (on the right side); for a static and see also figure 5.12,
page 96.

Tab. 5.2: Basic gates: pre-optimization delay, power consumption and area

                              |             0.7 µm technology              |             0.25 µm technology
                              | Delay [ps]        Energy [pJ]      Area    | Delay [ps]        Energy [pJ]      Area
Gate                          | max      avg      max      avg     [µm²]   | max      avg      max      avg     [µm²]
Inverter                      | 717.5    572.7    0.6887   0.6864  2.4     | 259.6    189.1    0.0858   0.0853  1.2
TSPC type n latch             | 921.8    630.8    3.491    1.299   7.2     | 293.3    200.4    0.6586   0.2161  3.6
TSPC type p latch             | 1413.0   718.9    2.087    0.965   7.2     | 482.5    209.1    0.2894   0.1221  3.6
TSPC type n and               | 1028.0   664.1    2.756    1.13    8.4     | 315.8    208.9    0.5126   0.1816  4.2
TSPC type p and               | 1413.0   754.9    2.51     1.058   8.4     | 482.5    212.9    0.3488   0.1314  4.2
TSPC type n or                | 904.3    689.6    4.47     1.42    8.4     | 299.7    224.8    0.8257   0.2288  4.2
TSPC type p or                | 1413.0   787.3    1.654    0.879   8.4     | 482.5    225.1    0.2243   0.1077  4.2
Static and-or                 | 1180     894.2    3.639    2.816   16.8    | 334.1    253.3    1.998    1.51    8.4
Static and                    | 760.9    727.7    0.7224   0.7160  4.8     | 277.2    240.1    0.0907   0.0891  2.4
Static or                     | 1430     776.3    0.75     0.7114  4.8     | 233.7    89.3     0.0713   0.0434  2.4
Static parity gate            | 2650.0   1839.55  0.7442   0.676   57.6    | 922.2    582.5    0.0944   0.0863  28.8
Static full-adder             | 1781     1080.6   6.475    1.219   48.0    | 571.3    311.3    3.155    0.324   24
TSPC full-adder (one-stage)   | 930.6    681.9    2.168    0.9425  15.6    | 276.7    204.2    0.641    0.188   7.8
TSPC full-adder (basic cells) | 2691.0   556.4    8.893    3.82    151.2   | 482.3    79.9     5.27     1.999   75.6

[Figure: inverter schematic with input A and output OUT]
Fig. 5.10: CMOS Inverter
5.2.1 Algorithm choice
Given the results of section 4.2 (page 58), and the results of the above
sections regarding the properties of the delay, power and area functions in real
circuits, the most suitable algorithm to be applied is Powell's scheme.
Briefly, it is fast, reliable even in the presence of multiple minima, and
(perhaps most importantly) it does not require the knowledge of the first
derivative of the function to be minimized.
While some other algorithms could give the same accuracy in
finding the minimum (in practice, the simulated annealing algorithm is
the only one), Powell's algorithm outperforms all the others in terms
of the number of iterations, and hence of the execution time, needed to reach
the best solution.
Powell's algorithm is the first choice in all the optimization
examples found in this chapter. As an example, performing the same
optimization of table 5.3 with simulated annealing requires an execution
time by the optimizer (footnote 15) of about ten times that required by Powell's
algorithm.
15. For a complete description of the optimizer CAD tool see chapter 6, page 107.
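To give a concrete flavour of a derivative-free search, the Python sketch below performs a cyclic coordinate search with golden-section line minimizations. This is only the first ingredient of a Powell-type direction-set method: the update of the search directions that characterizes Powell's full algorithm is deliberately omitted, and the code is not the implementation used by the optimizer. The two-width delay-like function `delay_like` and all the numerical values are assumptions chosen for illustration.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # reciprocal of the golden ratio


def golden_section(f, lo, hi, tol=1e-6):
    """Minimize a unimodal 1-D function on [lo, hi] using only its values."""
    a, b = lo, hi
    while (b - a) > tol:
        c = b - INV_PHI * (b - a)
        d = a + INV_PHI * (b - a)
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return 0.5 * (a + b)


def coordinate_search(f, x0, bounds, tol=1e-6, max_sweeps=100):
    """Minimize f by line searches along each coordinate axis in turn."""
    x = list(x0)
    best = f(x)
    for _ in range(max_sweeps):
        for i, (lo, hi) in enumerate(bounds):
            def along_axis(w, i=i):
                return f(x[:i] + [w] + x[i + 1:])
            x[i] = golden_section(along_axis, lo, hi, tol)
        value = f(x)
        converged = abs(best - value) < tol
        best = value
        if converged:
            break
    return x, best


def delay_like(w):
    """Assumed two-stage delay: stage 1 is loaded by the width of stage 2."""
    w1, w2 = w
    return 2.0 * w1 + 8.0 * w2 / w1 + 32.0 / w2


# Start from an assumed minimum width of 0.5 for both transistors.
x_opt, f_opt = coordinate_search(delay_like, [0.5, 0.5], [(0.5, 50.0), (0.5, 50.0)])
print(x_opt, f_opt)
```

For this function the stationary point can be found by hand (both widths equal to 4, value 24), which gives a simple check of the search. No derivative of the objective is ever evaluated, which is the property that matters when the delay comes from a simulator rather than from a formula.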
[Figure: schematics of the two latches, with clock CLK, input A and output OUT]
(a) Type n    (b) Type p
Fig. 5.11: TSPC Latches
5.2.2 Mono-objective optimizations
The mono-objective optimization of the circuits of table 5.1 means the
optimization of one and only one of the targets of §5.1, namely either the
delay, or the power consumption, or the area occupation.
5.2.2.1 Area
This target has a trivial optimization, since minimizing the area
occupation of a circuit obviously means having all the transistors in the circuit
as small as possible, i.e. at the minimum width allowed by the technology.
[Figure: schematics of the two gates, with clock CLK, inputs A and B, and output OUT = A · B]
(a) Type n    (b) Type p
Fig. 5.12: TSPC And gates
5.2.2.2 Power
The power optimization deserves a few more words: all the attempts to
optimize exclusively the power of the gates of table 5.1, regardless of the delay,
have led to the same result for both technologies: all the transistors in the
circuit had the minimum width after the optimization. This outcome
arises whatever the starting point of the optimization session, i.e.
the initial transistor widths of the circuit.
This is an experimental proof that, out of the three terms of equation (5.2)
(page 89), the switching power term Pswitch, due to the charging and
discharging of the capacitances in the circuit, is always the dominant one.
Although some authors have argued that this term might not be the
largest, especially for deep sub-micron circuits, there is no experimental
proof of that, at least for small and medium circuits.
[Figure: schematics of the two gates, with clock CLK, inputs A and B, and output OUT = A + B]
(a) Type n    (b) Type p
Fig. 5.13: TSPC Or gates
5.2.2.3 Delay
Given the results of the power optimization (and the simple results of the
area optimization), the only feasible “mono”-optimization is the delay
optimization.
That is, the maximum delay of the critical paths is minimized, disregarding
the power consumption and the area occupation, which both increase as
the delay diminishes.
[Figure: static and-or gate with inputs A, B, C, output OUT and MOSFETs numbered 0 to 13]
Fig. 5.14: Static and-or gate (note a).
a. This gate performs the function A · B + C, but there are two inverters between the and and the or. These leave the logic function intact, but introduce
some complexity in the critical paths formulation: it is only for this purpose that these inverters have been introduced.
[Figure: static parity gate with inputs A, B, C, D]
Fig. 5.15: Static parity gate
Tab. 5.3: Full-adder: delay optimization

                                    Delay [ps]             Energy [pJ]            Area [µm²]
Full-adder         Technology       Pre-opt.   Post-opt.   Pre-opt.   Post-opt.   Pre-opt.   Post-opt.
Static             0.7 µm           1781       1080        6.475      40.0        34         195.6
Static             0.25 µm          571.3      415.2       3.155      111.2       17         692.6
TSPC (one-stage)   0.7 µm           930.6      400.2       2.168      13.390      26         151.7
TSPC (one-stage)   0.25 µm          276.7      158.3       0.641      3.622       13         80.4
As an example, the delays of the static and dynamic full-adders before
the optimization (i.e. with all the transistors at minimum width) and after
the optimization are presented in table 5.3; the same table reports
the power consumption of the circuit before and after the optimization of
the delay: it is possible to see how the power increases after the delay is
minimized.
[Figure: schematics of the sum part (output SUM) and of the carry part (output CARRY), with inputs A, B, C]
(a) Sum part    (b) Carry part
Fig. 5.16: Static full-adder
[Figure: schematics of the sum part (output SUM) and of the carry part (output CARRY), with clock CLK and inputs A, B, C]
(a) Sum part    (b) Carry part
Fig. 5.17: TSPC full-adder (one-stage)
The criterion that decides when the optimization is over is based on two
considerations (see chapter 6, page 107, for more details on the implementation
of the algorithms, and chapter 4, page 47, for the mathematical foundations):
i) if there is a minimum (either the delay figure is strictly convex
   or, more generally, it has an absolute minimum), then the optimization
   algorithm finds it with an arbitrary accuracy, chosen a priori; or
ii) if the delay figure is not strictly convex (i.e. it is monotonically decreasing),
   then the optimization algorithm goes on minimizing until the rate of
   decrease of the delay is below the accuracy.
The former case is more stable from the point of view of the accuracy:
given an accuracy, the same optimum solution is found independently
of the starting point (i.e. the initial transistor widths); the starting
point influences only the time it takes to reach the solution, which is
unique.
The latter case is somewhat more problematic, since the solution depends
on the starting point: the rate of decrease of the delay depends
on the starting point in the multi-dimensional space of delay vs. widths.
This means that several optimization sessions can give different results,
depending on the initial transistor widths in each optimization.
In order to eliminate this ambiguity it is safe to choose a common starting
point for all the optimization sessions: the natural choice is to start with
all the transistors at the minimum width allowed by the technology. This
choice guarantees that, from one optimization run to another, the
solution found is always the same, and it is also a convenient
way of writing the netlist to be optimized, either by hand or with
a schematic editor.
5.2.3 Multi-objective optimizations
Multi-objective optimization means optimizing different targets at the
same time, that is, for example, minimizing simultaneously the delay and
the power, or the power and the area, and so on. From §5.2.2.1, §5.2.2.2
and §5.2.2.3 we have seen that some of these goals clash. These clashes are
briefly summarized in table 5.4.
Tab. 5.4: Agreements of targets (♥: the two targets agree; ♠: the two targets clash)

         Area   Delay   Power
Area      -      ♠       ♥
Delay     ♠      -       ♠
Power     ♥      ♠       -
So, for example, optimizing delay and power together, i.e. minimizing
both, is not possible: the power is minimized when all the transistors
are at minimum width, while minimizing the delay requires some
transistors (maybe all) to have a width greater than the minimum.
This disagreement among some optimization targets leads to new possible
definitions of “multi-objective” optimization:
i) there is a primary target to be optimized, and one or more secondary
   targets to be taken into account: then we may define a threshold on
   the latter. The algorithm goes on optimizing the primary target, taking
   care to keep all the secondary targets below the threshold;
   or
ii) there are only primary targets, and each target enters the total
   objective function with a relative weight, which indicates how much
   the final solution should depend on the corresponding target; or
iii) both the previous definitions.
The most suitable policy is the second, because it gives each target
the same priority with a different importance. The first alternative leads
to a sub-optimal optimization because, first, the designer must know the
order of magnitude of the targets, in order to impose a limit on
them; and second, the whole space of solutions may not be explored with such
constraints.
In the case of primary targets with relative weights, we have chosen
to normalize the weights of the objective function, that is, the sum of the
relative weights must be equal to one.
Given the results of §4.1.2 (page 54), the total objective function to be
minimized is a linear combination of the delay ($D$), power ($P$) and area ($A$):

\[ O = \alpha D + \beta P + \gamma A , \tag{5.4} \]

where $O$ is the total objective function and where

\[ \alpha \ge 0, \quad \beta \ge 0, \quad \gamma \ge 0, \quad \alpha + \beta + \gamma = 1 . \]
From the point of view of the user of the optimizer, specifying this kind
of weights makes it possible to read them as a measure
of how much the corresponding target matters in the final solution: for
example, specifying α = 0.5, β = 0.5 and γ = 0 means that we want to
optimize the delay at 50% and the power at 50%.
The subtle point in eq. (5.4) is that the quantities D, P and A are
not commensurable, that is, the orders of magnitude of the quantities may not
be the same. Let us think only of the units of measure: if, for example, the delay
is measured in picoseconds (e.g. 1000 ps), the power (expressed as energy) is
measured in joules (e.g. 10^-13 J). When one quantity is much greater than the
others, all the changes in the latter quantities disappear in the total sum.
In order to overcome the problem of the non-commensurable quantities
in eq. (5.4), all the terms of the sum should be normalized. The
mathematical theory of optimization states that each term should be
normalized by dividing it by the optimum found when optimizing only that
particular term. This implies an a priori knowledge of the optimum of each
term, and so of the total weighted sum. At every moment of the optimization
run it is then possible to know the distance between the current solution and
the optimum.
This is not practically feasible for a circuit optimization, since it would
involve running a mono-objective optimization for each term of the
sum, and then running the final multi-objective optimization. This would
lead to an unacceptable total optimization session, both for the time it
would take and for the resources it would occupy.
Thus the normalization applied here is the division of each quantity by
its corresponding maximum: the maximum of the delay occurs when all the
transistors are at minimum width, while the maximum of the power and of
the area is measured when all the transistors are at the maximum width
allowed in the optimization session (taking care that too large a
maximum allowed width results in power and area terms that are too small).
The total normalized optimization objective function then becomes:

\[ O = \alpha \frac{D}{D|_{\text{min widths}}} + \beta \frac{P}{P|_{\text{max widths}}} + \gamma \frac{A}{A|_{\text{max widths}}} . \tag{5.5} \]
By varying the combination of the parameters α, β and γ it is possible
to obtain an optimized circuit in which the delay, the power consumption
and the area occupancy weigh more or less.
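The normalized weighted sum of eq. (5.5) can be sketched as a small Python function. The function name `normalized_objective`, the reference values and the candidate solution below are hypothetical: in a real session the references would be the delay measured with all the transistors at minimum width and the power and area measured at the maximum allowed width.

```python
def normalized_objective(delay, power, area, refs, alpha, beta, gamma):
    """Weighted sum of delay, power and area, each divided by its reference.

    refs -- (delay at minimum widths, power at maximum widths, area at maximum widths)
    """
    weights = (alpha, beta, gamma)
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to one")
    d_ref, p_ref, a_ref = refs
    return alpha * delay / d_ref + beta * power / p_ref + gamma * area / a_ref


# Hypothetical references: 1000 ps, 20 pJ, 200 um^2.
refs = (1000.0, 20.0, 200.0)
# Hypothetical candidate circuit: 500 ps, 5 pJ, 50 um^2, weighted 50% delay and 50% power.
value = normalized_objective(500.0, 5.0, 50.0, refs, alpha=0.5, beta=0.5, gamma=0.0)
print(value)
```

After the division every term is a pure number of order one, so a change of a few percent in the power is as visible to the optimizer as a change of a few percent in the delay, whatever the units of the raw quantities.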
Tab. 5.5: Full-adder: delay and power optimization

                                    Delay [ps]             Energy [pJ]            Area [µm²]
Full-adder         Technology       Pre-opt.   Post-opt.   Pre-opt.   Post-opt.   Pre-opt.   Post-opt.
Static             0.7 µm           1781       1156        6.475      22.34       43         110.5
Static             0.25 µm          571.3      429.5       3.155      13.63       17         83.12
TSPC (one-stage)   0.7 µm           930.6      744.1       2.168      3.921       26         62.8
TSPC (one-stage)   0.25 µm          276.7      187.1       0.641      1.879       13         41.6
Tab. 5.6: Full-adder: comparison between the two kinds of optimization and
the minimum-width results. The number in parentheses shows the
worsening (if positive) or the improvement (if negative) of the
power–delay optimization with respect to the full-delay optimization.

                                    ∆Delay                     ∆Energy                     ∆Area
Full-adder         Technology       α=1      α=β=0.5           α=1      α=β=0.5            α=1      α=β=0.5
Static             0.7 µm           ÷1.65    ÷1.54 (+7%)       ×6.18    ×3.45 (-44.1%)     ×5.75    ×2.57 (-43.5%)
Static             0.25 µm          ÷1.38    ÷1.33 (+3.44%)    ×35.25   ×4.32 (-87.7%)     ×40.74   ×4.89 (-88%)
TSPC (one-stage)   0.7 µm           ÷2.33    ÷1.25 (+85.9%)    ×6.18    ×1.81 (-70.7%)     ×5.83    ×2.42 (-58.6%)
TSPC (one-stage)   0.25 µm          ÷1.75    ÷1.48 (+18.2%)    ×5.65    ×2.93 (-48.1%)     ×6.18    ×3.20 (-48.3%)
If, for example, the same full-adders of table 5.3 (page 99) are optimized
both for delay and for power in the same measure, i.e. with
α = 0.5, β = 0.5 and γ = 0 in equation (5.5), we obtain the results of table 5.5.
The comparison between the full delay optimization (mono-objective) and the
delay–power optimization (multi-objective) is sketched in table 5.6: as we can
see, between the full-delay optimization and the power–delay optimization
(50%–50%) there is a slight worsening in the delay of the final circuit
(from 5.2% to 46.9%); at the same time there is an effective improvement
in the power consumption: the power dissipation decreases by 5.8% to
76.5%.
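The entries of table 5.6 follow from simple ratios of the values in tables 5.3 and 5.5. As a minimal sketch, the Python function below (its name `compare` and its return convention are choices made for this example) recomputes the static full-adder row in the 0.7 µm technology; for the delay the table quotes the inverse of the first two factors, since the delay decreases.

```python
def compare(pre, post_delay_only, post_weighted):
    """Return (factor after delay-only optimization,
               factor after weighted optimization,
               percentage change of the weighted result vs. the delay-only one)."""
    factor_delay_only = post_delay_only / pre
    factor_weighted = post_weighted / pre
    change_percent = (post_weighted / post_delay_only - 1.0) * 100.0
    return factor_delay_only, factor_weighted, change_percent


# Static full-adder, 0.7 um technology (values from tables 5.3 and 5.5).
d1, d2, d_change = compare(pre=1781.0, post_delay_only=1080.0, post_weighted=1156.0)
e1, e2, e_change = compare(pre=6.475, post_delay_only=40.0, post_weighted=22.34)

print(f"delay:  /{1 / d1:.2f}  /{1 / d2:.2f}  ({d_change:+.1f}%)")
print(f"energy: x{e1:.2f}  x{e2:.2f}  ({e_change:+.1f}%)")
```

The same function applied to the area columns, or to the other three rows, reproduces the remaining entries of the table.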
A more complete survey of the optimization results of the circuits
presented in this chapter can be found in chapter 7 (page 121).
5.3 Conclusion
This chapter first defines the targets of the optimization, and then
applies the mathematical theory of chapter 4 (page 47) to the
optimization of real circuits.
It has been shown that the only mono-objective optimization feasible by
means of transistor sizing is the delay minimization, since
the minimization of both area and power consumption leads to the quasi-obvious
solution of all the transistors at the minimum width allowed by the technology
or by the designer.
Regarding the multi-objective optimization, a method that makes it possible to
tackle several optimization policies has been presented. This method
takes into account all the variables, even if they are
incommensurable among themselves; by means of a normalization all the targets
to be optimized can be combined in a single objective function, each with a
relative “agreement” level.
Moreover, with this way of combining the several targets into one objective
function, the introduction of constraints is as simple as it is in a
mono-objective optimization.
Chapter 6
A CAD TOOL FOR OPTIMIZATION
The optimization goals of the previous chapter require a modular and
complete framework, in order to perform the real optimization of a
circuit. This chapter describes the implementation of such a framework in
about 10000 lines of C++ code. Section 6.1 reports the logical description
of the tool and its modules, and section 6.2 reports the code
implementation of the most important classes of the program. Finally,
section 6.3 reports the logical flow of the program during execution.
For every other detail refer to appendices A and B (pages 145 and 149).
6.1 Logical description
The block diagram of the CAD tool is depicted in figure 6.1.
The core of the tool, the optimization engine, receives the input from
two modules: the optimization algorithm module (OAM), where different
optimization strategies can be selected, and the function evaluation module
(FEM), including the models for delay, power, and area estimation.
6.1.1 The optimization algorithm module (OAM)
The OAM supports the choice of different optimization algorithms in a
predefined set; three kinds of algorithm are currently included:
• a SLOP–like algorithm (§4.2.4, page 70), which increases at each step
  the size of a single gate, chosen according to the best possible
  reduction of the delay along the critical path;
• the Powell algorithm (§4.2.3.2, page 69), which is a particular form of
  the conjugate directions algorithm family ([17]): it does not require
  the computation of any gradient function and it converges quadratically
  to the minimum of the cost function;
• the simulated annealing algorithm (§4.2.5, page 72), which chooses the
  transistor dimensions according to an “annealing” scheme, thus
  converging to a global minimum and getting rid of local minima. It is
  surely much slower than the previous ones, and requires a fine tuning
  of the annealing parameters.

Fig. 6.1: Tool block diagram (blocks: optimization algorithm module (OAM)
with gradient descent, SLOP and Powell; optimization engine with
constraints and results feedback; function evaluation module (FEM) with
delay, power and area; parser from the human-readable to the
computer-readable circuit description; optimization constraints)
For all the chosen methods, the analytical knowledge of the objective
functions and of their derivatives is not required: only numerical
approximations are exploited.
However, methods requiring the gradient evaluation (e.g. the Fletcher–
Reeves–Polak–Ribière version of the conjugate directions algorithm [17])
can also be supported.
6.1.2 The function evaluation module (FEM)
The FEM performs the analysis of the circuit to be optimized; in
particular, it evaluates all the objective functions needed by the OAM:
the delays, the power consumptions and the area occupancy.
To perform this evaluation it invokes the timing analyzer or simulator
chosen at run-time. At the time of writing two analyzers are supported:
HSPICE and FAST (chapter 3, page 21).
Hereinafter the word “simulator” will be used, although some modules
included in the FEM (such as FAST) are not real simulators but, more
appropriately, “delay–power analyzers”, since they do not perform a real
simulation of the circuit.
6.1.3 Core engine
The core engine is the main module of the program. It handles the
communications among the other modules and makes the optimization
feasible. First of all, the engine parses the netlist of the circuit to
be optimized, written in a SPICE-like format. It then invokes the module
that automatically searches all the critical paths in the circuit, and
finally it invokes the optimization algorithm.
6.2 Code implementation
The whole tool has been written in C++. All the classes of the program
are shown in appendix A and all the code details can be found in
appendix B.
The most important classes of the program are:
• CircuitNetlist
• OptimizationAlgorithm
• EvaluationAlgorithm
The first class, CircuitNetlist, and its derived class Circuit contain
the graph of the circuit, in which every node is a transistor and every
edge is a connection between two transistors.
The class OptimizationAlgorithm is an abstract base class from which
every new optimization algorithm should be derived. It provides the
interface between the concrete class that implements the algorithm and
the core engine. Every derived class should provide the method Run() that
performs the optimization.
The class EvaluationAlgorithm is again an abstract base class, from which
every new simulator should be derived; again, every derived class should
provide the method Run(...) that performs the evaluation of all the
objectives of the circuit, such as delay, power consumption and so on.
6.2.1 The classes CircuitNetlist and Circuit
The public and protected members of class CircuitNetlist are:

    class CircuitNetList
    {
    private:
        ...

    protected:
        char *FileNetOut;
        TransistorList TranList;    // all the transistors of the netlist
        CapacitorList CapList;      // all the capacitors of the netlist
        char *FileIn;
        double Val;
        unsigned int ValNode;       // power supply node

    public:
        CircuitNetList( const char* FileNetList,
                        const Options& options );
        virtual ~CircuitNetList();

        // number of transistors and of capacitors in the netlist
        unsigned int GetNTran() const { return TranList.GetNTran(); }
        unsigned int GetNCap() const { return CapList.NumCap; }

        // power supply value and power supply node
        double Valim() const { return Val; }
        unsigned int ValimNode() const { return ValNode; }

        // access to a transistor by position or by name
        const TransistorNode& operator[]( unsigned int index ) const;
        const TransistorNode& operator[]( const char* name ) const;
        int TranPos( const char* name ) const;
    };
This class provides operator[] to return the i–th transistor, called
either with the position of the transistor or with its name. The class
also provides the methods to return the effective power supply node (the
ground node is assumed to be always node 0). Internally the class
contains the list of all the transistors and of all the capacitors
present in the original netlist.
The public and protected methods of class Circuit are:
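    class Circuit : public CircuitNetList
    {
    private:
        ...
    public:
        // The second constructor parameter is missing in the listing; it is
        // assumed here to match the base-class constructor.
        Circuit( const char *FileNetList,
                 const Options& options );
        ~Circuit();
        void PrintResult( unsigned long int Step, unsigned int NT,
                          unsigned int NP, const double* NewWidth,
                          const double* CPDelay, const double* CPPower,
                          const double *CPNoise, double Area,
                          double maxT, double maxP, double maxN,
                          double f, double fLast ) const;

        // invoke the simulator with the new transistor widths
        int Simulate( const double *NewWidth ) const;

        // Sum of the widths of the transistors connected to a node. The
        // last parameter of GateNWidth, JunctionPWidth and GatePWidth is
        // missing in the listing; it is assumed here to be the same as in
        // JunctionNWidth.
        double JunctionNWidth( unsigned int node,
                               int& number,
                               const double* NewWidth = 0 ) const;
        double GateNWidth( unsigned int node,
                           int& number,
                           const double* NewWidth = 0 ) const;
        double JunctionPWidth( unsigned int node,
                               int& number,
                               const double* NewWidth = 0 ) const;
        double GatePWidth( unsigned int node,
                           int& number,
                           const double* NewWidth = 0 ) const;

        // sum of the capacitances between a node and ground or power supply
        double CapStaticGnd( unsigned int node, int& number ) const;
        double CapStaticVdd( unsigned int node, int& number ) const;
        int TransistorListNode( unsigned int node, TransistorList& TList,
                                unsigned int& n, unsigned int& p ) const;
    };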
The class provides the method Simulate(const double *NewWidth), which
invokes the simulator of the circuit with the new transistor widths
NewWidth. It also provides some methods ...Width(...), which return the
sum of the widths of all the transistors connected to a node, and a few
methods CapStatic...(...), which return the sum of all the capacitances
connected between a node and the power supply node or between a node and
the ground node. These methods are useful for the FAST model.
6.2.2 The class EvaluationAlgorithm
The public and protected interface of this class is:

    class EvaluationAlgorithm
    {
    private:
    protected:
        const CritPathList& pathlist;  // all the critical paths of the circuit
        const Options& options;        // all the options of the tool
        unsigned int NumPath;          // number of critical paths
        unsigned long int Calls;       // number of calls to the simulator
        double *CPDelay;               // delay of each critical path
        double *CPPower;               // power of each critical path
        double *CPNoise;               // noise of each critical path
        double Area;
    public:
        // The second constructor parameter is missing in the listing; it is
        // assumed here to be the Options reference stored in the class.
        EvaluationAlgorithm( const CritPathList& pathlist,
                             const Options& options );
        virtual ~EvaluationAlgorithm();

        // evaluate the circuit with the new widths (pure virtual)
        virtual int Run( const Circuit& circuit,
                         const double *NewWidth,
                         const unsigned *ValidPath ) = 0;

        unsigned long int GetCalls() const { return Calls; }
        double GetDelay( unsigned int index ) const
            { return CPDelay[ index ]; }
        double GetPower( unsigned int index ) const
            { return CPPower[ index ]; }
        double GetNoise( unsigned int index ) const
            { return CPNoise[ index ]; }
        double GetArea() const
            { return Area; }
        unsigned int GetNPath() { return NumPath; }
    };
The main method is
    Run( const Circuit& circuit, const double *NewWidth, ...)
which performs the real simulation of circuit with the new dimensions
NewWidth. The other methods return the delay, the power and the area of
the circuit with the new dimensions, the total number of calls to the
simulator, and the number of critical paths in the circuit. The class
contains the vectors of the delays and powers of all the critical paths,
a reference to an instance of the class Options, which contains all the
options of the tool, and a reference to an instance of the class
CritPathList, which contains all the critical paths of the circuit.
6.2.3 The class OptimizationAlgorithm
The public and protected interface of this class is:

    class OptimizationAlgorithm
    {
    private:
        ...
    protected:
        unsigned int InternalSteps;
        const Circuit& circuit;
        const Options& options;
        unsigned int Steps;            // optimization steps performed
        unsigned int NumTran;
        unsigned int NumPath;
        double *Width;                 // transistor widths
        double *CPDelay;
        double *CPPower;
        double *CPNoise;
        double Area;
        unsigned int *ValidPath;
        double MaxDelayInitMin;
        double MaxPowerInitMin;
        double MaxNoiseInitMin;
        double AreaInitMin;
        double MaxDelayInitMax;
        double MaxPowerInitMax;
        double MaxNoiseInitMax;
        double AreaInitMax;
        EvaluationAlgorithm& Simulation;   // performs the function evaluations

        // combine all the evaluated objectives into one value
        double NormSim( const double* NewWidth, int& RetCode );
    public:
        OptimizationAlgorithm( const Circuit& circuit,
                               const Options& options,
                               EvaluationAlgorithm& simulation );
        virtual ~OptimizationAlgorithm();

        // the optimization algorithm itself (pure virtual)
        virtual int Run() = 0;

        unsigned long int GetSteps() const { return Steps; }
        int SimulateCircuit( const double *NewWidth );
        int SimulateFirstCircuit();
        double OptWidth( unsigned int index ) const
            { return Width[ index ]; }
    };
This class provides the method Run(), which invokes the real algorithm,
and the method SimulateCircuit(...), which performs the function
evaluations by means of the member EvaluationAlgorithm& Simulation:
every time the algorithm needs to perform a function evaluation with new
dimensions, it simply invokes the public method Simulation.Run(...),
passing the new dimensions to it. The class also provides the methods
that return the number of optimization steps and the final optimized
widths.
The combination of all the values returned by Simulation.Run(...) (all
the critical path delays, all the power consumptions; see §5.2.3,
page 102) is performed by the method NormSim(...).
6.2.4 The critical path retrieving
The module that performs the retrieving of all the critical paths in the
circuit (see §5.1.1, page 80, for the mathematical definitions) is
subdivided into three parts:
• the first part identifies all the inputs of the circuit (gate nodes
  connected to nothing), and all the internal gate nodes (connected to a
  source or a drain of another transistor);
• the second part searches, for every node in the circuit, all the
  charging paths between the node and the power supply and all the
  discharging paths between the node and the ground;
• the third part combines all the previous charging and discharging
  paths to obtain a true critical path. The combination is performed
  checking that the inputs permit the real activation of the path; at
  the same time the module sets all the inputs to the value necessary to
  obtain the excitation of the path, i.e. such that a change in the input
  causes a change in the output.
The main function of the critical path retriever is
    int Critic(const Circuit& circuit, CritPathList& pathList, ...)
which performs the search of the critical paths in circuit: it simply
calls the recursive function int CriticRecurse(...) to search all the
charging or discharging paths, and then it combines some of these paths
by means of the recursive function int SearchCPRecurse(...). For every
charging/discharging path to be added, the function
int SearchOKCond(...) is invoked: this very complex function checks that
all the input conditions are coherent with the conduction of the path.
To ensure a good flexibility of the tool, the designer can always specify
by hand the critical paths to be used in the optimization. The standard
format for them is a text file that lists, for each critical path, the
input node, the output node, and the transition on both the input and the
output node (rise or fall). In this way it is possible to list only a
part of the critical paths present in a circuit and to take into account
only those paths during the optimization.
Moreover, this makes it possible to use the optimizer for topologies that
could normally confuse the critical path search algorithm, such as
pass-transistor logic circuits.
6.2.5 The derived classes
Every time a new optimization algorithm or a new simulator must be
introduced, a new class should be derived from the corresponding base
class.
As an example, the class for the HSPICE simulator is derived as:

    class Hspice : public EvaluationAlgorithm
    {
    private:
        ...
    public:
        // The second constructor parameter is missing in the listing; it is
        // assumed here to be the Options reference of the base class.
        Hspice( const CritPathList& pathlist,
                const Options& options,
                const char* NE );
        ~Hspice();
        int Run( const Circuit& circuit,
                 const double *NewWidth,
                 const unsigned* ValidPath );
    };

and the class for the Powell optimization algorithm (§4.2.3.2, page 69)
is derived as:

    class Powell : public OptimizationAlgorithm
    {
    private:
        ...
    public:
        // The second constructor parameter is missing in the listing; it is
        // assumed here to match the base-class constructor.
        Powell( const Circuit& circuit,
                const Options& options,
                EvaluationAlgorithm& simulation );
        ~Powell();
        int Run();
    };

Basically, both classes need to provide only the method Run(...) (with
different parameters, of course), which performs the real simulation or
the real optimization algorithm.
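The derivation pattern can be tried in isolation with simplified
stand-ins. The sketch below is self-contained and does not use the tool's
classes: OptimizerBase, GreedyWiden and CostFunction are hypothetical
names, and the greedy widening strategy is only a toy algorithm chosen to
keep the example short.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Simplified stand-ins for the tool's classes; all names are hypothetical.
using CostFunction = std::function<double(const std::vector<double>&)>;

class OptimizerBase {
public:
    OptimizerBase(std::vector<double> widths, CostFunction cost)
        : width_(std::move(widths)), cost_(std::move(cost)) {}
    virtual ~OptimizerBase() = default;
    virtual int Run() = 0;                 // implemented by each algorithm
    double OptWidth(std::size_t i) const { return width_[i]; }
protected:
    std::vector<double> width_;
    CostFunction cost_;
};

// Toy derived algorithm: widen one transistor at a time by a fixed step and
// keep the change only if the cost decreases.
class GreedyWiden : public OptimizerBase {
public:
    GreedyWiden(std::vector<double> widths, CostFunction cost, double step)
        : OptimizerBase(std::move(widths), std::move(cost)), step_(step) {}
    int Run() override {
        int accepted = 0;
        bool improved = true;
        double best = cost_(width_);
        while (improved) {
            improved = false;
            for (std::size_t i = 0; i < width_.size(); ++i) {
                width_[i] += step_;
                const double trial = cost_(width_);
                if (trial < best) { best = trial; improved = true; ++accepted; }
                else              { width_[i] -= step_; }  // undo the move
            }
        }
        return accepted;                   // number of accepted moves
    }
private:
    double step_;
};
```

The caller only needs a reference to the base class, so a new algorithm
can be added without touching the code that invokes Run().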
6.3 Program flows
The logical flow of the main function of the program is:
Algorithm 6.1. Logical flow of main
Require: Circuit netlist in SPICE-like format
1: Preprocess the input netlist.
2: Process the options configuration file.
3: Build the graph of the circuit.
4: Search the critical paths in the circuit.
5: Invoke the function optimizator.Run().
6: Write the results.
The logical flow of the function that retrieves all the critical paths is
divided into a few functions:
Algorithm 6.2. Logical flow of the function Critic(...)
Require: A graph of the circuit in which each node is a transistor.
1: Invoke CriticRecurse(...) passing to it the ground node.
2: Invoke CriticRecurse(...) passing to it the power supply node.
3: Invoke SearchCriticalPath(...) passing to it the list of all the
   discharging paths starting from the ground node.
4: Invoke SearchCriticalPath(...) passing to it the list of all the
   charging paths starting from the power supply node.
5: Return a list of all the critical paths in the circuit.
Algorithm 6.3. Logical flow of the function CriticRecurse(...)
Require: Node.
 1: for all the transistors that have the source or the drain connected
    to Node do
 2:   if Source = Node then
 3:     Node = Drain.
 4:   else
 5:     Node = Source.
 6:   end if
 7:   Memorize the current transistor in the current list.
 8:   Copy the current list in a new list, in order to create a new list
      every time more than one transistor is connected to the same node.
 9:   if both n–type and p–type transistors are connected to Node OR Node
      is already visited then
10:     Return.
11:   else
12:     Invoke myself with Node.
13:   end if
14: end for
15: return all the lists of nodes starting from Node
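The recursion of Algorithm 6.3 can be sketched as a depth-first walk over
the source/drain connections. The code below is a simplified illustration
and not the tool's CriticRecurse(...): the Transistor record, the Path
type and the functions IsMixedNode, Walk and PathsFrom are hypothetical,
and a path is assumed to end at the first node touched by both n-type and
p-type transistors.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical, simplified transistor record (not the tool's TransistorNode).
struct Transistor {
    char type;         // 'n' or 'p'
    unsigned source;   // node numbers
    unsigned drain;
};

using Path = std::vector<std::size_t>;   // indices into the transistor list

// True if both an n-type and a p-type transistor touch `node`.
bool IsMixedNode(const std::vector<Transistor>& t, unsigned node) {
    bool hasN = false, hasP = false;
    for (const Transistor& x : t) {
        if (x.source != node && x.drain != node) continue;
        if (x.type == 'n') hasN = true; else hasP = true;
    }
    return hasN && hasP;
}

// Depth-first walk across source/drain connections starting at `node`.
void Walk(const std::vector<Transistor>& t, unsigned node,
          std::set<unsigned>& visited, Path& current, std::vector<Path>& out) {
    visited.insert(node);
    for (std::size_t i = 0; i < t.size(); ++i) {
        unsigned next;
        if (t[i].source == node)     next = t[i].drain;
        else if (t[i].drain == node) next = t[i].source;
        else continue;                         // transistor not on this node
        if (visited.count(next)) continue;     // already visited: avoid loops
        current.push_back(i);
        if (IsMixedNode(t, next)) out.push_back(current);  // path complete
        else Walk(t, next, visited, current, out);
        current.pop_back();
    }
    visited.erase(node);
}

// All the paths that start from `start` (e.g. the ground or the supply node).
std::vector<Path> PathsFrom(const std::vector<Transistor>& t, unsigned start) {
    std::set<unsigned> visited;
    Path current;
    std::vector<Path> out;
    Walk(t, start, visited, current, out);
    return out;
}
```

For a two-input static NAND, for example, the walk from the ground node
yields the single series pull-down path, and the walk from the supply node
yields one path for each of the two parallel p-type transistors.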
Algorithm 6.4. Logical flow of the function SearchCriticalPathRecurse(...)
Require: A list of all the charging and discharging paths and a path as a
   starting point.
 1: for all the charging paths do
 2:   Choose a discharging path that has as an input node the output node
      of the first path.
 3:   Check if the input conditions are correct and, if necessary, set
      them.
 4:   Invoke myself with the new path as the first path.
 5: end for
 6: for all the discharging paths do
 7:   Choose a charging path that has as an input node the output node of
      the first path.
 8:   Check if the input conditions are correct and, if necessary, set
      them.
 9:   Invoke myself with the new path as the first path.
10: end for
6.4 Conclusions
This chapter describes the implementation of the tool that is behind all
the optimizations throughout this thesis. It has been written in a very
modular way, in order to permit the efficient insertion of new algorithms
and new simulators. It consists of about ten thousand lines of C++, and
it makes extensive use of object-oriented features, in order to hide the
implementation details from new developers.
Chapter 7
RESULTS AND CONCLUSIONS
This chapter shows a survey of the optimization of the circuits shown in
chapter 5 (page 77): the goal here is to show how a cell library can be
optimized, in order to be used in VLSI circuits, either full-custom or
standard-cell. The path that this chapter will walk goes from
mono-objective optimizations to multi-objective ones, passing through
constrained optimization. Along this path some conclusions (and
opinions!) are drawn, giving cell-library designers some guidelines and
tools to facilitate their work and obtain the desired results.
7.1 Optimization
The cell library to be optimized is composed principally of basic TSPC¹
CMOS dynamic logic but, with the purpose of extending the validity of the
results, some static gates are included in the library. The full list of
the gates subjected to optimization is shown in table 7.1. For a
description of the complexity of the gates (both the number of
transistors in each cell and the number of critical paths in the same
cell) see table 5.1 (page 92).
The library thus comprises the inverter gate, the TSPC “and” gates (both
the n and the p versions), the “or” and “latch” gates (again in the n and
the p versions), and a full-adder (the version included here is an n–p
construction, faster than the almost equivalent p–n construction). As
said above, the following are included for comparison: a fully static
full-adder, a fully static “and–or” gate², a fully static “and”, a fully
static “or”, a fully static “parity” gate (which performs the parity
calculation among three inputs), and, finally, a TSPC full-adder,
composed only of the basic TSPC gates mentioned above.

¹ For a description of the TSPC see chapter 1 (page 3), and [1].
² See note a (page 98).

Tab. 7.1: Library gates list
Gate                            Reference
Inverter                        fig. 5.10, page 94
TSPC type n latch               fig. 5.11(a), page 95
TSPC type p latch               fig. 5.11(b), page 95
TSPC type n and                 fig. 5.12(a), page 96
TSPC type p and                 fig. 5.12(b), page 96
TSPC type n or                  fig. 5.13(a), page 97
TSPC type p or                  fig. 5.13(b), page 97
Static and-or                   fig. 5.14, page 98
Static and                      fig. 5.14, page 98 (see note 14, page 92)
Static or                       fig. 5.14, page 98 (see note 14, page 92)
Static parity gate              fig. 5.15, page 99
Static full-adder               figs. 5.16(a), 5.16(b), page 100
TSPC full-adder (one-stage)     figs. 5.17(a), 5.17(b), page 101
TSPC full-adder (basic cells)
The very first result reported here is the comparison of the delay and of
the power consumption between the 0.7 µm and the 0.25 µm technology, at
minimum width: this comparison is reported in table 7.2 and depicted
graphically in figure 7.1(a) for the delay and in figure 7.1(b) for the
power consumption.
From that table it is possible to see that, passing from the 0.7 µm to
the 0.25 µm technology, the average improvement (diminution) is 69.3% for
the delay and 76.2% for the power.
Thus, scaling the dimensions to about 1/3, the average delay and power
consumption are also scaled down by about the same factor.
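As a quick check of this statement, using only the figures quoted above
(the symbols s, r_D and r_P are introduced here for illustration), the
linear scaling factor and the residual fractions of delay and power are:

```latex
\begin{align*}
s   &= \frac{0.25\,\mu\mathrm{m}}{0.7\,\mu\mathrm{m}} \approx 0.36, \\
r_D &= 1 - 0.693 = 0.307, \\
r_P &= 1 - 0.762 = 0.238 .
\end{align*}
```

All three values are reasonably close to 1/3.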
7.1.1 Mono-objective vs. Multiobjective
Mono-objective optimization (§4.1.1, page 49) means optimizing (in our
case always decreasing) a single objective, i.e. a well-defined target,
to the detriment of all the other possible targets.
Tab. 7.2: Delay and energy dissipation @ minimum width (HSPICE)
                                 0.7 µm                                0.25 µm
Gate                             Delay [ps]  Energy [pJ]  Area [µm²]   Delay [ps]       Energy [pJ]      Area [µm²]
Inverter                         717.5       0.6887       2.4          259.6 (-63.8%)   0.086 (-87.5%)   1.2
TSPC type n latch                921.8       3.491        7.2          293.3 (-68.2%)   0.659 (-81.1%)   3.6
TSPC type p latch                1413.0      2.807        7.2          482.5 (-65.9%)   0.289 (-89.7%)   3.6
TSPC type n and                  1028.0      2.756        8.4          315.8 (-69.3%)   0.513 (-81.4%)   4.8
TSPC type p and                  1413.0      2.51         8.4          482.5 (-65.9%)   0.349 (-86.1%)   4.8
TSPC type n or                   904.3       4.47         8.4          299.7 (-66.9%)   0.826 (-81.5%)   4.8
TSPC type p or                   1413.0      1.654        8.4          482.5 (-65.9%)   0.224 (-86.5%)   4.8
Static and-or                    1180.0      3.639        16.8         334.0 (-71.7%)   1.998 (-45.1%)   8.4
Static and                       760.9       0.722        4.8          277.2 (-63.6%)   0.0907 (-87.4%)  2.4
Static or                        1430        0.75         4.8          233.7 (-83.7%)   0.0713 (-90.5%)  2.4
Static parity gate               2650.0      0.744        57.6         922.2 (-65.2%)   0.0945 (-87.3%)  28.8
Static full-adder                1781        6.475        48           571.3 (-67.9%)   3.155 (-51.3%)   24
TSPC full-adder (one-stage)      930.6       2.168        15.6         276.7 (-70.3%)   0.641 (-70.4%)   7.8
TSPC full-adder (basic cells)    2691.0      8.893        151.2        482.3 (-82.1%)   5.27 (-40.7%)    75.6
Average improvement                                                    -69.3%           -76.2%           -50%

Fig. 7.1: Comparison of 0.7 µm and 0.25 µm gates @ minimum technology
width. (a) Delay: delay [ps] versus gate type; (b) Energy-dissipation:
energy [pJ] versus gate type.

The very first optimization policy applied to CMOS circuits was the delay
optimization. Figures 7.2 and 7.3 sketch the delay optimization of the
gates of table 7.1, in the 0.7 µm and in the 0.25 µm technology
implementation respectively, with arrows representing the delay and the
energy variation. The arrows start from the initial values (i.e. either
the delay or the energy measured at the minimum technology width) and end
at the values after the optimization.

Fig. 7.2: Delay optimization of 0.7 µm gates. (a) Delay amelioration:
delay [ps] versus gate type; (b) Energy-dissipation deterioration:
energy [pJ] versus gate type.

Fig. 7.3: Delay optimization of 0.25 µm gates. (a) Delay amelioration:
delay [ps] versus gate type; (b) Energy-dissipation deterioration:
energy [pJ] versus gate type.
As can be expected, the delay shows a considerable improvement
(diminution, figures 7.2(a), 7.3(a)) while the energy dissipation shows a
very large increase (figures 7.2(b), 7.3(b)): to decrease the delay the
optimizer increases the transistor widths, thus increasing the overall
power dissipation. Table 7.3 and figure 7.4 report the relative variation
of delay and power (as minimum, maximum and mean value) for both
technologies: for the 0.7 µm technology the delay is decreased, on
average, by 4.80 times, while for the 0.25 µm technology it is decreased
by 3.16 times (figure 7.4(a)). On the contrary, the energy dissipation is
increased by 20.42 times in 0.7 µm and by 13.41 times in 0.25 µm
(figure 7.4(b)).

Fig. 7.4: Technology comparison of delay optimization (0.7 µm and
0.25 µm). (a) Delay variation: delay [ps] versus gate type;
(b) Energy-dissipation variation: energy [pJ] versus gate type.
Table 7.4 shows the total time taken by the optimization of each gate,
together with the total number of function evaluations, that is the
number of times the circuit simulator (in this case HSPICE) has been
invoked. These numbers are quite reasonable per se; moreover, the
optimization of a cell library needs to be performed only once, before
the library is reused. Furthermore, in the case of very large circuits,
the modular architecture of the optimizer makes it possible to switch
from one simulator to another on the fly; thus we can use a very fast
simulator (such as FAST) in the earlier steps of the optimization, and
switch to a more precise but slower simulator (such as HSPICE) in the
later stages of the optimization process.
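One possible way to obtain such a switch is to wrap two evaluators behind
a common interface. The following sketch only illustrates the idea and is
not the tool's implementation: the classes Evaluator, FixedEvaluator and
TwoPhaseEvaluator are hypothetical, and the switching criterion (a fixed
number of calls) is an assumption.

```cpp
#include <vector>

// Hypothetical evaluator interface (a stand-in for EvaluationAlgorithm).
class Evaluator {
public:
    virtual ~Evaluator() = default;
    virtual double Delay(const std::vector<double>& widths) = 0;
};

// Trivial stand-in used only to exercise the sketch: returns a fixed delay.
class FixedEvaluator : public Evaluator {
public:
    explicit FixedEvaluator(double value) : value_(value) {}
    double Delay(const std::vector<double>&) override { return value_; }
private:
    double value_;
};

// Uses a fast, approximate evaluator for the first `switchAfter` calls and a
// slower, more accurate one afterwards.
class TwoPhaseEvaluator : public Evaluator {
public:
    TwoPhaseEvaluator(Evaluator& fast, Evaluator& precise,
                      unsigned long switchAfter)
        : fast_(fast), precise_(precise), switchAfter_(switchAfter) {}
    double Delay(const std::vector<double>& widths) override {
        Evaluator& active = (calls_ < switchAfter_) ? fast_ : precise_;
        ++calls_;
        return active.Delay(widths);
    }
    unsigned long GetCalls() const { return calls_; }
private:
    Evaluator& fast_;
    Evaluator& precise_;
    unsigned long switchAfter_;
    unsigned long calls_ = 0;
};
```

Since the optimizer sees only the common interface, the switch is
transparent to the optimization algorithm.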
Tab. 7.3: Delay decreasing and energy increasing (both relative) in a
delay optimization.
               Delay decreasing          Energy increasing
Technology     Max.    Min.    Mean      Max.     Min.    Mean
0.7 µm         8.43    3.43    4.80      43.61    8.89    20.42
0.25 µm        7.78    2.75    3.16      35.25    6.17    13.41
These results are largely predictable, since a hard delay optimization
leads to a very large increase in the transistor dimensions, and thus to
a large area occupancy and energy dissipation.
Moreover, another issue arises when optimizing an entire cell library:
is it necessary to push every single cell to its limits? In a generic
static circuit the total delay is, generally, the sum of the delays of
the cells composing the circuit, since this delay is bounded by the delay
of the worst critical path and, moreover, it is possible to have a single
critical path³ from a primary input to a primary output of the circuit;
so it makes some sense to optimize every single cell to its best.
In a generic dynamic circuit, the global delay is still bounded by the
delay of the worst critical path in the circuit, which thus determines
the minimum clock period. Since, for a single-phase dynamic logic (where
n-gates and p-gates alternate, working with different clock phases), this
critical path is contained in a single cell, the delay of the entire
circuit is bounded by the delay of the worst library cell in the circuit.
It makes no sense, thus, to optimize the basic library cells (which are
present in every circuit) to their limits, when the delay of a generic
circuit is bounded by the worst of them. It is, instead, more useful to
optimize the worst cell in the library, and to reduce the delay of the
other cells only to the value obtained by that optimization. In this way
a reduction of the dimensions of these cells is achieved, and thus a
reduction of the overall energy dissipation.

³ §5.1.1, page 80.

Tab. 7.4: Elapsed time and total number of function evaluations for a
full-delay optimization with HSPICE, on an ULTRA-sparc 5
Gate          El. time [s]   Fun. eval.   El. time [s]   Fun. eval.
inv           332.6          12           212.4          13
and-n         1338.3         34           1675.6         36
and-p         1426.6         34           2449.5         41
or-n          1408.3         32           1950.0         34
or-p          1259.5         31           1355.5         27
latch-n       1286.5         32           1466.7         32
latch-p       1307.1         33           1574.7         31
and–or        5830.9         73           9280.3         91
and–static    786.5          25           729.6          31
or–static     651.6          21           626.1          24
parity        64098.2        159          35274.3        178
static–fa     27034.8        239          23794.1        180
tspc–fa1      2413.3         69           2881.2         70
tspc–fa2      16459.1        66           63485.2        121
The consequent idea is thus to optimize an entire (dynamic) cell library
using a constrained optimization⁴; the strategy for this purpose is:
i) evaluate the delay of every cell at minimum width;
ii) choose the worst cell (with regard to delay) among them;
iii) optimize the delay of this cell as far as possible;
iv) optimize all the other cells to have a delay not greater than the
value obtained in the previous point.

⁴ §4.1.1.2, page 52
As an example, the constrained optimization of the dynamic 0.25 µm gates
is reported in table 7.5: this optimization has been performed with the
constraint, on every gate, of a delay not greater than 125 ps. This value
has been obtained from an unconstrained optimization of the worst cell
(with respect to delay), the TSPC type-p “or” gate (cf. table 7.2). After
this optimization the delay of this gate was 121.2 ps, so the value
chosen for the optimization of all the other gates was 125 ps.
Tab. 7.5: Constrained delay optimization of a few 0.25 µm gates.
Gate                  Delay pre–opt. [ps]   Delay post–opt. [ps]
and-n                 315.800               100.500
and-p                 482.500               111.900
or-n                  299.700               114.900
or-p                  482.500               121.200
latch-n               293.300               88.080
latch-p               482.500               118.600
Average delay         392.72                109.20
Standard deviation    36.65                 3.83
From table 7.5 it is possible to see that the delays after the
optimization have a standard deviation⁵ (3.83) far smaller than the
standard deviation before the optimization (36.65). This means that all
the cells have nearly the same delay after the optimization, and that
this value is an “optimal” one, since it minimizes the delay of a block
made of these cells and, at the same time, reduces the power dissipation
and the area occupancy with respect to a solution with all the cells
optimized independently.
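A spread measure of this kind can be computed with a few lines of code.
The sketch below uses the definition with division by N given in the
footnote; the function names Mean and StdDev are illustrative and the
code is not taken from the tool.

```cpp
#include <cmath>
#include <vector>

// Arithmetic mean of the samples (assumes a non-empty vector).
double Mean(const std::vector<double>& x) {
    double sum = 0.0;
    for (double v : x) sum += v;
    return sum / static_cast<double>(x.size());
}

// Standard deviation with division by N (assumes a non-empty vector).
double StdDev(const std::vector<double>& x) {
    const double m = Mean(x);
    double sumSq = 0.0;
    for (double v : x) sumSq += (v - m) * (v - m);
    return std::sqrt(sumSq / static_cast<double>(x.size()));
}
```

A smaller value means that the samples, here the cell delays, are more
tightly grouped around their mean.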
The procedure of constrained optimization is useful only when we want to
constrain a single target to a precise value. It is not useful when we
want to constrain more than one target at the same time, for example
delay and power together: such optimizations are not feasible since,
first, they would require a preliminary evaluation of the quantities to
be constrained (in order to know whether the constraints are reasonable)
and, second, it might not be possible for the optimizer to satisfy all
the constraints.
⁵ The standard deviation σ of a number N of samples x_i is defined by
$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - m)^2$, where
$m = \frac{1}{N}\sum_{i=1}^{N} x_i$ is the arithmetic mean of the
samples. It is a measure of the spreading of the samples around the mean.

A much more useful policy to take specifically into account more than one
target is to perform a multi-objective optimization.
The figures 7.5 and 7.7 show four different multi-objective optimization,
respectively, for the 0.7 µm and 0.25 µm technology (with figures 7.6, 7.8
that are, respectively, a zoom of the figures 7.5(b), 7.7(b). The four different
optimizations performed are:
i) full delay optimization, indicated with “Delay=100% Power=0%”;
ii) a delay optimization taking the power consumption slightly into account, indicated with “Delay=80% Power=20%”;
iii) a delay–power optimization taking the power dissipation into account in equal measure, indicated with “Delay=50% Power=50%”;
iv) a delay optimization taking the power consumption strongly into account, indicated with “Delay=20% Power=80%”.
The percentages (footnote 6) reported after delay and power are also the
coefficients α and β of equation 5.5 (page 105), used as the cost function
in the optimization algorithm.
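As a minimal sketch of how such weights could enter a cost function: equation 5.5 is not repeated in this chapter, so the normalized weighted sum below, as well as the names `Metrics` and `weightedCost`, are assumptions made for illustration only and are not taken from the optimizer sources.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the figures of merit of one candidate sizing.
struct Metrics
{
    double delay;   // worst critical-path delay, e.g. in ps
    double power;   // dissipated energy or power, e.g. in pJ
};

// Assumed cost: weighted sum of delay and power, each normalized to its
// value at the starting point so that the two terms are dimensionless.
// alpha and beta play the role of the "Delay=..% Power=..%" weights
// (for example alpha = 0.8, beta = 0.2).
double weightedCost(const Metrics& m, const Metrics& start,
                    double alpha, double beta)
{
    return alpha * (m.delay / start.delay) + beta * (m.power / start.power);
}
```

With alpha = 1 and beta = 0 this sketch reduces to a pure delay cost, which corresponds to the “Delay=100% Power=0%” policy; increasing beta penalizes sizings that buy delay with a large increase of the dissipation.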
From these figures we see that the delay decreases more and more as its
relative weight increases, while the growth of the power dissipation is
limited by increasing the relative weight of the power.
Of all the optimization policies, the one that gives the most useful
results is the optimization of delay and power with the same weights, that
is, the one indicated with “Delay=50% Power=50%” in the previous figures.
These results are also reported in figure 7.9, as a particular case.
This is probably the most useful optimization, since it still reduces the
delay considerably while holding the increase in power dissipation to a more
acceptable value.
Figures 7.10, 7.11, 7.12 and 7.13 show the same four optimizations
by means of the trajectory in the delay–power space during the optimization
process. In these figures each marked point is a step in the optimization
process. It is thus possible to see how increasing the relative weight of the
delay in the cost function (and thus reducing the relative weight of the
energy) leads the optimizer to go further along the trajectory, reducing the
delay and increasing the energy dissipation.

6 The case “Delay=0% Power=100%” has not been included, since this kind of
optimization leads to the trivial result of all the transistors at the
minimum width (cf. §5.2.2.2, page 96).

Fig. 7.5: Several delay–power optimization policies of 0.7 µm gates. (a) Delay variation: delay [ps] versus gate type. (b) Energy variation: energy [pJ] versus gate type.

Fig. 7.6: Energy-dissipation variation (zoom of figure 7.5(b)): energy [pJ] versus gate type.
Tab. 7.6: Delay worsening and energy-dissipation improvement between a full delay optimization and a delay–power optimization.

                        0.7 µm                        0.25 µm
Gate          ∆ Delay  ∆ Energy  ∆ Area     ∆ Delay  ∆ Energy  ∆ Area
inv            39.3%    -20.1%   -40.9%      15.7%    -10.4%   -21.1%
and–n          27.8%    -92.2%   -87.4%       6.3%    -36.3%   -42.1%
and–p          48.4%    -81.0%   -80.4%       1.1%    -39.4%   -49.2%
or–n           33.1%    -67.2%   -76.2%      46.9%    -77.5%   -66.9%
or–p           33.1%    -77.8%   -69.9%      11.8%    -35.5%   -21.6%
latch–n        31.5%    -71.1%   -84.7%      41.3%    -22.0%   -46.4%
latch–p        28.3%    -75.2%   -76.1%      14.6%    -69.5%   -72.1%
and–or         29.5%    -91.2%   -89.2%       6.7%    -81.2%   -79.1%
and–static     28.7%    -67.3%   -79.3%      14.4%    -42.1%   -53.3%
or–static      21.4%    -30.4%   -28.8%      -3.4%     18.4%   -12.8%
parity          7.7%    -78.1%   -81.2%       2.5%    -50.3%   -51.0%
static–fa      33.3%    -87.2%   -86.7%       5.9%    -81.9%   -82.4%
tspc–fa1       11.0%    -29.3%   -27.4%      15.4%    -48.1%   -48.3%
tspc–fa2       12.3%    -72.5%   -71.2%       8.6%    -41.9%   -44.1%
average       +27.5%    -67.2%   -69.9%     +13.4%    -44.1%   -49.3%
From these figures it can again be clearly seen that the multi-objective
optimization “Delay=50% Power=50%” gives the best results, both in
optimizing the delay and, at the same time, in keeping the energy dissipation
within reasonable values. These results are summarized in table 7.6, which
shows the percent variation of delay and energy dissipation between the
values obtained after a full delay optimization and the values obtained
after a delay–power optimization. The average worsening in the delay (i.e.
the difference between the delay value after a full delay optimization and
the same value after a delay–power optimization) is +27.5% for the 0.7 µm
technology and just +13.4% for the 0.25 µm technology. Despite this low rate
of worsening, the average energy-dissipation reduction is −67.2% for the
0.7 µm technology and −44.1% for the 0.25 µm technology, while the area
occupancy reductions are, respectively, −69.9% and −49.3%. This means that
accepting a slight degradation in the delay leads to a great reduction of
the overall energy dissipation and area occupancy.

Fig. 7.7: Several delay–power optimization policies of 0.25 µm gates. (a) Delay variation: delay [ps] versus gate type. (b) Energy variation: energy [pJ] versus gate type.

Fig. 7.8: Energy-dissipation variation (zoom of figure 7.7(b)): energy [pJ] versus gate type.
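The averages quoted from table 7.6 are plain arithmetic means over the 14 gates of the table. The sketch below recomputes the two average delay variations from the ∆ Delay columns; the function and array names are illustrative only, and the data are copied from the table.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Arithmetic mean of n values (n is assumed to be greater than zero).
double columnMean(const double* x, std::size_t n)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += x[i];
    return sum / static_cast<double>(n);
}

// Delay variation columns of table 7.6, in percent, in table order
// (inv, and-n, and-p, or-n, or-p, latch-n, latch-p, and-or, and-static,
//  or-static, parity, static-fa, tspc-fa1, tspc-fa2).
static const double kDelay07um[14] = {
    39.3, 27.8, 48.4, 33.1, 33.1, 31.5, 28.3,
    29.5, 28.7, 21.4, 7.7, 33.3, 11.0, 12.3
};
static const double kDelay025um[14] = {
    15.7, 6.3, 1.1, 46.9, 11.8, 41.3, 14.6,
    6.7, 14.4, -3.4, 2.5, 5.9, 15.4, 8.6
};
```

Calling `columnMean(kDelay07um, 14)` and `columnMean(kDelay025um, 14)` gives the values reported, rounded to one decimal, in the "average" row of the table.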
7.2 Conclusions
The goal of the optimization framework presented in this chapter is to
show a new way to optimize the performance of CMOS cells employed in
VLSI circuits.
This new methodology, the multi-objective optimization, has led to a
prominent result: the delay of a circuit can be reduced while taking into
account the power consumption and the area occupancy. The results of table 7.6
are the most telling: with a small compromise on the delay performance with
respect to a full delay optimization, the power consumption is strongly
decreased. This means that the default optimization performed until now, the
full delay optimization, can be safely replaced by a multi-objective
optimization. A circuit that has less power consumption while maintaining
almost the same delay is safer from the operating point of view: it develops
less heat, hence it is more reliable.

Fig. 7.9: Delay–power optimization (50%–50%) comparison of 0.7 µm and 0.25 µm gates. (a) Delay variation: delay [ps] versus gate type. (b) Energy–dissipation variation: energy [pJ] versus gate type.

Fig. 7.10: Delay and power trajectory during 4 different multi-objective optimizations for the and–or gate of figure 5.14 (page 98): delay [ps] versus energy [pJ]. (a) 0.25 µm. (b) 0.7 µm.

Fig. 7.11: Delay and power trajectory during 4 different multi-objective optimizations for the parity gate of figure 5.15 (page 99): delay [ps] versus energy [pJ]. (a) 0.25 µm. (b) 0.7 µm.

The ease with which several optimization policies can be applied greatly
helps the work of the cell-library designer: with very little effort, the
designer can produce, from the same version of a library, several libraries
optimized in different ways. Each cell in a library then has different
performance with respect to the same cell in the other libraries, but it is
still fully equivalent from the point of view of the function performed.
Think, for example, of an “and” gate that always performs the same
function, but with different delays or different power dissipations.
Simply by swapping one library version (for example one optimized only for
the delay) with another (for example one optimized taking into account
the power consumption), the designer can develop several versions of the
same project with different performance.

Fig. 7.12: Delay and power trajectory during 4 different multi-objective optimizations for the static full-adder of figure 5.16 (page 100): delay [ps] versus energy [pJ]. (a) 0.25 µm. (b) 0.25 µm.

Fig. 7.13: Delay and power trajectory during 4 different multi-objective optimizations for the dynamic full-adder of figure 5.17 (page 101): delay [ps] versus energy [pJ]. (a) 0.25 µm. (b) 0.7 µm.
7.3 Future works
Possible future developments include:
Noise problems: This means using another target in the optimization
policies: the “noise” ([18]) of a circuit.
This is a complex field, and a good starting point could be the development
of a noise model of a CMOS circuit.
Interconnections: A simpler task could be to take into account the
influence of interconnections in the optimization.
This means both including a model of the interconnections in the
cell and optimizing the performance of the whole structure.
Topology extensions: The optimizer can be extended to optimize structures
other than the standard cells (both static and dynamic), for example
memory cells or pass-logic gates.
This mainly means modifying the algorithm that automatically searches for
all the critical paths in a circuit, to adapt it to different topologies.
The optimizer already offers, in any case, the possibility of listing the
critical paths by hand and performing the optimization with these paths.
CAD integration: The optimizer could be integrated in a standard CAD
tool that assists the designer in developing an ASIC from high-level
specifications to layout level. One step of this flow could be the
optimization of the library employed in the project.
APPENDIX
Appendix A
CLASS GRAPH
Class Graph
CircuitNetList
> Circuit
OptimizationAlgorithm
> TestEval
> Slop2
> Slop
> Powell
> Anneal
EvaluationAlgorithm
> TestOpt
> Hspice
> Fast
Options
CPNode
CritPathList
TransistorNode
TransistorList
CapacitorList
Node
NodeList
Appendix B
SOURCE CODE
B.1 Main functions

CPNode.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"

///
CPNode::CPNode() :
        VALID( 0 ), NodeIn( 0 ), NodeOut( 0 ), NumTranList( 0 ),
        ActiveInputs( 0 ), NoActiveInputs( 0 ), InitialConditions( 0 ),
        ActiveInputsIter( 0 ), NoActiveInputsIter( 0 ), InitialConditionsIter( 0 ),
        next( 0 )
{
    for ( unsigned int i = 0; i < MAXCHAIN; i++ )
    {
        TransistorNameList[ i ] = 0;
        TransistorNameListIter[ i ] = 0;
        NumTranN[ i ] = 0;
        NumTranP[ i ] = 0;
    }
}

///
CPNode::~CPNode()
{
    NodeValueList* tmp;
    while ( ActiveInputs )
    {
        tmp = ActiveInputs->next;
        delete ActiveInputs;
        ActiveInputs = tmp;
    }
    while ( NoActiveInputs )
    {
        tmp = NoActiveInputs->next;
        delete NoActiveInputs;
        NoActiveInputs = tmp;
    }
    while ( InitialConditions )
    {
        tmp = InitialConditions->next;
        delete InitialConditions;
        InitialConditions = tmp;
    }
    TrList* tmp2;
    for ( unsigned int i = 0; i < MAXCHAIN; i++ )
        while ( TransistorNameList[ i ] )
        {
            tmp2 = ( TransistorNameList[ i ] ) ->next;
            delete TransistorNameList[ i ];
            TransistorNameList[ i ] = tmp2;
        }
}

///
int CPNode::InsNodeIn( unsigned int Node, TransitionType T, double Time )
{
    if ( NodeIn )
        return NOT_FOUND; /// ERROR, yet inserted
    NodeIn = Node;
    TransitionIn = T;
    InTime = Time;
    return OK;
}

///
int CPNode::InsNodeOut( unsigned int Node, TransitionType T )
{
    if ( NodeOut )
        return NOT_FOUND; /// ERROR, yet inserted
    NodeOut = Node;
    TransitionOut = T;
    return OK;
}

///
int CPNode::InsActIn( unsigned int Node, double Val )
{
    NodeValueList * tmp;
    if ( !ActiveInputs )
    {
        ActiveInputs = new NodeValueList;
        if ( !ActiveInputs )
            return NO_MEM;
        ActiveInputs->next = 0;
    }
    else
    {
        tmp = new NodeValueList;
        if ( !tmp )
            return NO_MEM;
        tmp->next = ActiveInputs;
        ActiveInputs = tmp;
    }
    ActiveInputs->node = Node;
    ActiveInputs->value = Val;
    ActiveInputsIter = ActiveInputs;
    return OK;
}

///
int CPNode::InsNoActIn( unsigned int Node, double Val )
{
    NodeValueList * tmp;
    if ( !NoActiveInputs )
    {
        NoActiveInputs = new NodeValueList;
        if ( !NoActiveInputs )
            return NO_MEM;
        NoActiveInputs->next = 0;
    }
    else
    {
        tmp = new NodeValueList;
        if ( !tmp )
            return NO_MEM;
        tmp->next = NoActiveInputs;
        NoActiveInputs = tmp;
    }
    NoActiveInputs->node = Node;
    NoActiveInputs->value = Val;
    NoActiveInputsIter = NoActiveInputs;
    return OK;
}

///
int CPNode::InsIniCond( unsigned int Node, double Val )
{
    NodeValueList * tmp;
    if ( !InitialConditions )
    {
        InitialConditions = new NodeValueList;
        if ( !InitialConditions )
            return NO_MEM;
        InitialConditions->next = 0;
    }
    else
    {
        tmp = new NodeValueList;
        if ( !tmp )
            return NO_MEM;
        tmp->next = InitialConditions;
        InitialConditions = tmp;
    }
    InitialConditions->node = Node;
    InitialConditions->value = Val;
    InitialConditionsIter = InitialConditions;
    return OK;
}

///
int CPNode::InsTran( const char* name, TransistorType TR, unsigned int index )
{
    TrList * tmp;
    TrList* tail;
    if ( !TransistorNameList[ index ] )
    {
        TransistorNameList[ index ] = new TrList;
        if ( !TransistorNameList[ index ] )
            return NO_MEM;
        ( TransistorNameList[ index ] ) ->next = 0;
        TransistorNameListIter[ index ] = TransistorNameList[ index ];
        tail = TransistorNameList[ index ];
    }
    else
    {
        tmp = new TrList;
        tail = TransistorNameList[ index ];
        if ( !tmp )
            return NO_MEM;
        tmp->next = 0;
        while ( tail->next )
            tail = tail->next;
        tail->next = tmp;
        tail = tmp;
    }
    tail->name = new char[ strlen( name ) + 1 ];
    if ( !( tail->name ) )
        return NO_MEM;
    strcpy( tail->name, name );
    if ( TR == NMOS )
        NumTranN[ index ] ++;
    else if ( TR == PMOS )
        NumTranP[ index ] ++;
    else
        return NOT_FOUND;
    return OK;
}

///
int CPNode::TraverseActiveInputs( unsigned int& Node, double& value ) const
{
    // the iterator is advanced even though the method is const
    CPNode * const localThis = const_cast<CPNode *>( this );
    if ( ActiveInputsIter )
    {
        Node = ActiveInputsIter->node;
        value = ActiveInputsIter->value;
        localThis->ActiveInputsIter = ActiveInputsIter->next;
        return 1;
    }
    else
        localThis->ActiveInputsIter = ActiveInputs;
    return 0;
}

///
int CPNode::TraverseNoActiveInputs( unsigned int& Node, double& value ) const
{
    CPNode * const localThis = const_cast<CPNode *>( this );
    if ( NoActiveInputsIter )
    {
        Node = NoActiveInputsIter->node;
        value = NoActiveInputsIter->value;
        localThis->NoActiveInputsIter = NoActiveInputsIter->next;
        return 1;
    }
    else
        localThis->NoActiveInputsIter = NoActiveInputs;
    return 0;
}

///
int CPNode::TraverseInitialConditions( unsigned int& Node, double& value ) const
{
    CPNode * const localThis = const_cast<CPNode *>( this );
    if ( InitialConditionsIter )
    {
        Node = InitialConditionsIter->node;
        value = InitialConditionsIter->value;
        localThis->InitialConditionsIter = InitialConditionsIter->next;
        return 1;
    }
    else
        localThis->InitialConditionsIter = InitialConditions;
    return 0;
}

///
const char* CPNode::TraverseTransistorNameList( unsigned int index = 0 ) const
{
    CPNode * const localThis = const_cast<CPNode *>( this );
    if ( TransistorNameListIter[ index ] )
    {
        // a copy of the name is returned: the storage is allocated here
        char * name = new char[ strlen( ( TransistorNameListIter[ index ] ) ->name ) + 1 ];
        if ( !name )
            return 0;
        strcpy( name, ( TransistorNameListIter[ index ] ) ->name );
        localThis->TransistorNameListIter[ index ] = ( TransistorNameListIter[ index ] ) ->next;
        return name;
    }
    else
        localThis->TransistorNameListIter[ index ] = TransistorNameList[ index ];
    return 0;
}

///
const char* CPNode::TransistorName( unsigned int pathIndex, unsigned int index = 0 ) const
{
    TrList * tmp = TransistorNameList[ index ];
    for ( unsigned int i = pathIndex; i > 0; i-- )
        if ( tmp )
            tmp = tmp->next;
        else
            return 0;
    // tmp is null when pathIndex equals the length of the list
    return tmp ? tmp->name : 0;
}
CapInsert.cc

///
int CapacitorList::Insert( unsigned int node1, unsigned int node2, double val )
{
    Capacitance * tmp;
    if ( !head )
    {
        head = new Capacitance;
        if ( !head )
            return NO_MEM;
        head->next = 0;
    }
    else
    {
        tmp = new Capacitance;
        if ( !tmp )
            return NO_MEM;
        tmp->next = head;
        head = tmp;
    }
    head->node1 = node1;
    head->node2 = node2;
    head->val = val;
    NumCap++;
    return OK;
}
CapacitorList.cc

///
CapacitorList::CapacitorList() : NumCap( 0 ), head( 0 )
{}

///

CapacitorList::~CapacitorList()
{
    Capacitance* tmp;
    while ( head )
    {
        tmp = head->next;
        delete head;
        head = tmp;
    }
}

///
const Capacitance& CapacitorList::operator[]( unsigned int index ) const
{
    // valid indexes are 0 .. NumCap - 1
    if ( index >= NumCap )
        error( NOT_FOUND, 0, "Index out of bound in [CapacitorList]..." );
    unsigned int i = index;
    Capacitance* tmp = head;
    while ( i-- )
        tmp = tmp->next;
    return *tmp;
}

///
Capacitance& CapacitorList::operator[]( unsigned int index )
{
    // valid indexes are 0 .. NumCap - 1
    if ( index >= NumCap )
        error( NOT_FOUND, 0, "Index out of bound in [CapacitorList]..." );
    unsigned int i = index;
    Capacitance* tmp = head;
    while ( i-- )
        tmp = tmp->next;
    return *tmp;
}
Circuit.cc

///
Circuit::Circuit( const char* FileNetList, const Options& options ) :
        CircuitNetList( FileNetList, options )
{
    print_log( "Creating circuit graph..." );
}

///
Circuit::~Circuit()
{}

CircuitNetList.cc

///
CircuitNetList::CircuitNetList( const char *FileNetList, const Options& options ) :
        Val( 0.0 ), ValNode( 0 )
{
    print_log( "Creating transistors list..." );
    char *FileIn = new char[ strlen( FileNetList ) + 1 ];
    if ( !FileIn )
    {
        print_log( "FATAL ERROR:" );
        print_log( ReturnMessage[ NO_MEM ] );
        error( NO_MEM, errno, "HEY! " );
    }
    strcpy( FileIn, FileNetList );
    FileNetOut = new char[ strlen( FileNetList ) + strlen( NetListSuffix ) + 1 ];
    if ( !FileNetOut )
    {
        print_log( "FATAL ERROR:" );
        print_log( ReturnMessage[ NO_MEM ] );
        error( NO_MEM, errno, "HEY! " );
    }
    strcpy( FileNetOut, FileNetList );
    strcat( FileNetOut, NetListSuffix );
    if ( int RetCode = PreProcess( FileNetList, options.NamemosN(), options.NamemosP() ) )
    {
        print_log( "FATAL ERROR:" );
        print_log( ReturnMessage[ RetCode ] );
        error( RetCode, errno, "HEY! " );
    }
    delete[] FileIn;
}

///
CircuitNetList::~CircuitNetList()
{
    delete[] FileNetOut;
}

///
const TransistorNode& CircuitNetList::operator[]( unsigned int index ) const
{
    // valid indexes are 0 .. GetNTran() - 1
    if ( index >= GetNTran() )
        error( NOT_FOUND, 0, "Index out of bound in [Circuit]..." );
    return TranList[ index ];
}

///
const TransistorNode& CircuitNetList::operator[]( const char* name ) const
{
    return TranList[ name ];
}

///
int CircuitNetList::TranPos( const char* name ) const
{
    unsigned int index = 0;
    unsigned int NT = GetNTran();
    while ( index < NT )
    {
        if ( !strcasecmp( name, TranList[ index ].DevName() ) )
            return TranList[ index ].Index();
        index++;
    }
    return -1;
}
CircuitNetlistParse.cc

///
int CircuitNetList::ParseMosLine( char *line, char *line2, const char* mosn, const char* mosp )
{
    char tmpstr[ 128 ];
    char parsestr[ 128 ];
    char endpar[ 128 ];
    char mos[ 8 ];
    char type[ 16 ];
    char par[ 16 ];
    char lstr[ 16 ];
    unsigned int n1, n2, n3, n4;
    TransistorType Type;
    double W, L;

    strcpy( parsestr, "%s %u %u %u %u %s" );
    strcpy( endpar, " " );
    if ( sscanf( line, parsestr, mos, &n1, &n2, &n3, &n4, type ) == 6 )
    {
        unsigned int nw = 0;
        unsigned int nl = 0;
        sprintf( line2, "%s %u %u %u %u %s ", mos, n1, n2, n3, n4, type );
        if ( !strcasecmp( mosn, type ) )
            Type = NMOS;
        else if ( !strcasecmp( mosp, type ) )
            Type = PMOS;
        else
            return PARSE_ERROR;
        strcpy( parsestr, "%*s %*u %*u %*u %*u %*s" );
        strcpy( tmpstr, parsestr );
        strcat( parsestr, " %s" );
        while ( ( sscanf( line, parsestr, par ) == 1 ) && ( nl * nw == 0 ) )
        {
            unsigned int npar = 0;
            if ( ( par[ 0 ] == 'w' ) || ( par[ 0 ] == 'W' ) )
                nw = 1;
            else if ( ( par[ 0 ] == 'l' ) || ( par[ 0 ] == 'L' ) )
                nl = 1;
            else
            {
                npar++;
                if ( npar == 1 )
                    strcat( endpar, " \n+" );
                strcat( endpar, " " );
                strcat( endpar, par );
            }
            if ( nw == 1 )
            {
                unsigned int count = 0;
                while ( !isdigit( par[ count++ ] ) );
                sscanf( &par[ --count ], "%lf%*c", &W );
                strcat( line2, par );
                nw++;
            }
            if ( nl == 1 )
            {
                strcpy( lstr, par );
                unsigned int count = 0;
                while ( !isdigit( par[ count++ ] ) );
                sscanf( &par[ --count ], "%lf%*c", &L );
                nl++;
            }
            if ( nw * nl )
            {
                strcat( line2, " " );
                strcat( line2, lstr );
            }
            // skip one more field at the next sscanf
            strcat( tmpstr, " %*s" );
            strcpy( parsestr, tmpstr );
            strcat( parsestr, " %s" );
        }
        while ( sscanf( line, parsestr, par ) == 1 )
        {
            // [four lines missing in the source listing]
            strcat( line2, par );
        }
        // [one line missing in the source listing]
        strcat( line2, endpar );
        if ( TranList.Insert( mos, W, L, Type, n1, n2, n3 ) )
            return NO_MEM;
        return OK;
    }
    return PARSE_ERROR;
}
CircuitNetlistPreprocess.cc

///
int CircuitNetList::PreProcess( const char* FileNetList, const char* NameMosN, const char* NameMosP )
{
    char line[ 1024 ];
    char line2[ 1024 ];
    char command[ 32 ];
    ifstream i_file( FileNetList );
    ofstream o_file( FileNetOut );
    int ToBeCopied;

    if ( !i_file )
        ; // [statement missing in the source listing]
    if ( !o_file )
        ; // [statement missing in the source listing]
    while ( i_file.getline( line, 1023 ) )
    {
        int c = 0;
        ToBeCopied = 1;
        // skip the leading white space
        while ( isspace( line[ c++ ] ) );
        switch ( line[ --c ] )
        {
        case '.':
            // analysis commands are commented out in the output netlist
            sscanf( &line[ c + 1 ], "%s", command );
            ToBeCopied = strcasecmp( command, "tran" ) &&
                         strcasecmp( command, "dc" ) &&
                         strcasecmp( command, "ac" );
            if ( !ToBeCopied )
            {
                strcpy( line2, "***** " );
                strcat( line2, &line[ c ] );
            }
            break;
        case 'v':
        case 'V':
            // only the supply source (vdd, vcc, val) is kept
            sscanf( &line[ c + 1 ], "%s", command );
            ToBeCopied = !( strcasecmp( command, "dd" ) &&
                            strcasecmp( command, "cc" ) &&
                            strcasecmp( command, "al" ) );
            if ( ToBeCopied )
            {
                int node2;
                ToBeCopied = 0;
                sscanf( &line[ c ], "%*s %d %d %*s %lf", &ValNode, &node2, &Val );
                sprintf( line2, "vdd %d %d dc %g ", ValNode, node2, Val );
            }
            else
            {
                strcpy( line2, "* " );
                strcat( line2, &line[ c ] );
            }
            break;
        case 'm':
        case 'M':
        case 'x':
        case 'X':
            ToBeCopied = ParseMosLine( &line[ c ], line2, NameMosN, NameMosP );
            break;
        case 'c':
        case 'C':
            {
                unsigned int node1, node2;
                double val;
                sscanf( &line[ c ], "%*s %u %u %lg", &node1, &node2, &val );
                if ( CapList.Insert( node1, node2, val ) != OK )
                {
                    i_file.close();
                    o_file.close();
                    return NO_MEM;
                }
            }
            break;
        default:
            break;
        }
        if ( ToBeCopied == 0 )
            o_file << line2 << endl;
        else
            o_file << &line[ c ] << endl;
    }
    o_file.close();
    i_file.close();
    if ( Val <= 0.0 )
    {
        print_log( "Error: no|wrong VDD defined" );
        // [statement missing in the source listing]
    }
    return OK;
}
CircuitPrint.cc


///
void Circuit::PrintResult( unsigned long int Step,
                           unsigned int NT,
                           unsigned int NP,
                           const double* NewWidth,
                           const double* CPDelay,
                           const double* CPPower,
                           const double *CPNoise,
                           double Area,
                           double maxT,
                           double maxP,
                           double maxN,
                           double f,
                           double fLast ) const
{
    char log[ 1024 ], tmp[ 1024 ];
    if ( Step == 1 )
    {
        // first step: create the files and write the header lines
        ofstream o_file( "RESULT.log" );
        ofstream o_fileW( "RESULT_W.log" );
        ofstream o_fileT( "RESULT_T.log" );
        ofstream o_fileP( "RESULT_P.log" );
        if ( !o_file )
        {
            print_log( "Warning: can't create file RESULT.log" );
            return ;
        }
        sprintf( log, "# Step " );
        strcat( log, "Norm_N(W[]) " );
        strcat( log, "OptFunc " );
        strcat( log, "Error " );
        strcat( log, "Max(T[]) " );
        strcat( log, "Max(P[]) " );
        strcat( log, "Max(N[]) " );
        strcat( log, " A " );
        o_file << log << endl;
        sprintf( log, "# Step " );
        for ( unsigned int i = 0; i < NT; i++ )
        {
            sprintf( tmp, "W[%u] ", i );
            strcat( log, tmp );
        }
        o_fileW << log << endl;
        sprintf( log, "# Step " );
        for ( unsigned int i = 0; i < NP; i++ )
        {
            sprintf( tmp, "T[%u] ", i );
            strcat( log, tmp );
        }
        o_fileT << log << endl;
        sprintf( log, "# Step " );
        for ( unsigned int i = 0; i < NP; i++ )
        {
            sprintf( tmp, "P[%u] ", i );
            strcat( log, tmp );
        }
        o_fileP << log << endl;
        for ( unsigned int i = 0; i < NP; i++ )
        {
            sprintf( tmp, "N[%u] ", i );
            strcat( log, tmp );
        }

        o_file.close();
        o_fileW.close();
        o_fileP.close();
        o_fileT.close();
    }
    // every step: append one line of results to each file
    ofstream o_file( "RESULT.log", ios::app );
    ofstream o_fileW( "RESULT_W.log", ios::app );
    ofstream o_fileT( "RESULT_T.log", ios::app );
    ofstream o_fileP( "RESULT_P.log", ios::app );
    if ( !o_file )
    {
        print_log( "Warning: can't create file RESULT.log" );
        return ;
    }
    sprintf( log, "%7lu ", Step );
    sprintf( tmp, "%4.3f ", NORM_N( NewWidth, NT ) );
    strcat( log, tmp );
    sprintf( tmp, "%4.3g ", f );
    strcat( log, tmp );
    sprintf( tmp, "%4.3g ", (f - fLast) / fLast * 100);
    strcat( log, tmp );
    sprintf( tmp, "%4.3f ", maxT );
    strcat( log, tmp );
    sprintf( tmp, "%4.3f ", maxP );
    strcat( log, tmp );
    sprintf( tmp, "%4.3f ", maxN );
    strcat( log, tmp );
    sprintf( tmp, "%4.3f ", Area );
    strcat( log, tmp );
    o_file << log << endl;
    sprintf( log, "%7lu ", Step );
    for ( unsigned int i = 0; i < NT; i++ )
    {
        sprintf( tmp, "%4.3f ", NewWidth[ i ] );
        strcat( log, tmp );
    }
    o_fileW << log << endl;
    sprintf( log, "%7lu ", Step );
    for ( unsigned int i = 0; i < NP; i++ )
    {
        sprintf( tmp, "%4.3f ", CPDelay[ i ] );
        strcat( log, tmp );
    }
    o_fileT << log << endl;
    sprintf( log, "%7lu ", Step );
    for ( unsigned int i = 0; i < NP; i++ )
    {
        sprintf( tmp, "%4.3f ", CPPower[ i ] );
        strcat( log, tmp );
    }
    o_fileP << log << endl;
    sprintf( log, "%7lu ", Step );
    for ( unsigned int i = 0; i < NP; i++ )
    {
        sprintf( tmp, "%4.3f ", CPNoise[ i ] );
        strcat( log, tmp );
    }


    o_file.close();
    o_fileW.close();
    o_fileP.close();
    o_fileT.close();
}

///
double NORM_N( const double* V, unsigned int l )
{
    // l-norm of a vector of l elements: ( sum_i V[i]^l )^(1/l)
    double norm = 0.0;
    for ( unsigned int i = 0; i < l; i++ )
        norm += pow( V[ i ], double( l ) );
    norm = pow( norm, double( 1.0 / l ) );
    // if(V[i] > norm)
    // norm = V[i];
    return norm;
}
CircuitTranListNode.cc

///
int Circuit::TransistorListNode(unsigned int node, TransistorList& TList, unsigned int& n, unsigned int& p) const
{
    // find all the transistors with source or drain connected to node
    // and return a list, plus the number of n and p transistors connected
    unsigned int NT = GetNTran();
    n = p = 0;
    for ( unsigned int i = 0; i < NT; i++ )
    {
        if ( ( TranList[ i ].Source() == node ) ||
             ( TranList[ i ].Drain() == node ) )
        {
            TList.Insert((TranList[i]).DevName(), (TranList[i]).Width(),
                         (TranList[i]).Length(), (TranList[i]).TrType(),
                         (TranList[i]).Source(), (TranList[i]).Gate(),
                         (TranList[i]).Drain());
            if ((TranList[i]).TrType() == NMOS)
                n++;
            else if ((TranList[i]).TrType() == PMOS)
                p++;
        }
    }
    return OK;
}
CircuitWidth.cc

///
double Circuit::JunctionNWidth( unsigned int node, int& number, const double* NewWidth = 0 ) const
{
    // find all the nmos transistors with source or drain
    // connected to node and return the sum of widths
    unsigned int NT = GetNTran();
    number = 0;
    double W = 0.0;
    if ( node == 0 )
        return 0.0;
    for ( unsigned int i = 0; i < NT; i++ )
    {
        if ( TranList[ i ].TrType() == NMOS )
            if ( ( TranList[ i ].Source() == node ) ||
                 ( TranList[ i ].Drain() == node ) )
            {
                if ( !NewWidth )
                    W += TranList[ i ].Width();
                else
                    W += NewWidth[ TranPos( TranList[ i ].DevName() ) ];
                number++;
            }
    }
    return W;
}

///
double Circuit::GateNWidth( unsigned int node, int& number, const double* NewWidth = 0 ) const
{
    // find all the nmos transistors with gate
    // connected to node and return the sum of widths
    unsigned int NT = GetNTran();
    double W = 0.0;
    number = 0;
    if ( node == 0 )
        return 0.0;
    for ( unsigned int i = 0; i < NT; i++ )
    {
        if ( TranList[ i ].TrType() == NMOS )
            if ( TranList[ i ].Gate() == node )
            {
                if ( !NewWidth )
                    W += TranList[ i ].Width();
                else
                    W += NewWidth[ TranPos( TranList[ i ].DevName() ) ];
                number++;
            }
    }
    return W;
}

///
double Circuit::JunctionPWidth( unsigned int node, int& number, const double* NewWidth = 0 ) const
{
    // find all the pmos transistors with source or drain
    // connected to node and return the sum of widths
    unsigned int NT = GetNTran();
    double W = 0.0;
    number = 0;
    if ( node == 0 )
        return 0.0;
    for ( unsigned int i = 0; i < NT; i++ )
    {
        if ( TranList[ i ].TrType() == PMOS )
            if ( ( TranList[ i ].Source() == node ) ||
                 ( TranList[ i ].Drain() == node ) )
            {
                if ( !NewWidth )
                    W += TranList[ i ].Width();
                else
                    W += NewWidth[ TranPos( TranList[ i ].DevName() ) ];
                number++;
            }
    }
    return W;
}

///
double Circuit::GatePWidth( unsigned int node, int& number, const double* NewWidth = 0 ) const
{
    // find all the pmos transistors with gate
    // connected to node and return the sum of widths
    unsigned int NT = GetNTran();
    double W = 0.0;
    number = 0;
    if ( node == 0 )
        return 0.0;
    for ( unsigned int i = 0; i < NT; i++ )
    {
        if ( TranList[ i ].TrType() == PMOS )
            if ( TranList[ i ].Gate() == node )
            {
                if ( !NewWidth )
                    W += TranList[ i ].Width();
                else
                    W += NewWidth[ TranPos( TranList[ i ].DevName() ) ];
                number++;
            }
    }
    return W;
}

///
double Circuit::CapStaticGnd( unsigned int node, int& number ) const
{
    // find all the fixed capacitances with a ground terminal
    // and connected to node and return the sum of them
    unsigned int NC = GetNCap();
    double C = 0.0;
    number = 0;
    if ( node == 0 )
        return 0.0;
    for ( unsigned int i = 0; i < NC; i++ )
    {
        if ( ( CapList[ i ].node1 == 0 ) &&
             ( CapList[ i ].node2 == node ) )
        {
            C += CapList[ i ].val;
            number++;
        }
        else if ( ( CapList[ i ].node2 == 0 ) &&
                  ( CapList[ i ].node1 == node ) )
        {
            C += CapList[ i ].val;
            number++;
        }
    }
    return C;
}


///
double Circuit::CapStaticVdd( unsigned int node, int& number ) const
{
    // find all the fixed capacitances with a vdd terminal
    // and connected to node and return the sum of them
    unsigned int NC = GetNCap();
    double C = 0.0;
    number = 0;
    if ( node == 0 )
        return 0.0;
    for ( unsigned int i = 0; i < NC; i++ )
    {
        if ( ( CapList[ i ].node1 == ValNode ) &&
             ( CapList[ i ].node2 == node ) )
        {
            C += CapList[ i ].val;
            number++;
        }
        else if ( ( CapList[ i ].node2 == ValNode ) &&
                  ( CapList[ i ].node1 == node ) )
        {
            C += CapList[ i ].val;
            number++;
        }
    }
    return C;
}
Critic.cc
11 #include "main.h"
12
13
14 ///
15 int Critic(const Circuit& circuit,

16 CritPathList& pathList,
17 const Options& options)
18 {
19 Node nodeInputList;
20 /// search primary input
21 print_log("Searching primary input...");
22 unsigned int Nt = circuit.GetNTran();
23 for (unsigned int i = 0; i < Nt; i++)
24 {
25 unsigned int gate = (circuit[i]).Gate();
26 unsigned int Pin = 1;
27 for (unsigned int j = 0; j < Nt; j++)
28 {
29 if (i != j)
30 {
31 for (unsigned int k = 0; k < nodeInputList.GetNumNode(); k++)
32 if ( (nodeInputList[k]).node == gate)
33 Pin = 0;
34 if ( ((circuit[j]).Drain() == gate) ||
35 ((circuit[j]).Source() == gate))
36 Pin = 0;
37 }
38 }
39 if (Pin)
40 nodeInputList.Insert(gate);
41 }
42 #ifdef DEBUG
43 cerr << endl << " Primary input List: " << (nodeInputList[0]).node << " ";
44 #endif
45 char log[1024];
46 char log2[16];
47 sprintf(log, "Primary input list: %u", (nodeInputList[0]).node);
48 for (unsigned int i = 1; i < nodeInputList.GetNumNode(); i++)
49 {
50 sprintf(log2, " -- %u", (nodeInputList[i]).node);
51 #ifdef DEBUG
52 cerr << (nodeInputList[i]).node << " ";
53 #endif
54 }
55 #ifdef DEBUG
56 cerr << endl;
57 #endif
58 print_log(log);
59 NodeList nodeListGnd;
60 NodeList nodeListVdd;
61 int RetCode1 = nodeListGnd.Create();
62 int RetCode2 = nodeListVdd.Create();
63 if ( (RetCode1 != OK) || (RetCode2 != OK) )
64 return (RetCode1 != OK ? RetCode1 : RetCode2);
65 #ifdef DEBUG
66 cerr << endl << "Creating critical Path with gnd..." << endl;
67 #endif
68 RetCode1 = CriticRecurse(circuit, 0, nodeListGnd);
69 if ((RetCode1 != OK) && (RetCode1 != CONT))
70 return RetCode1;
71 #ifdef DEBUG
72 cerr << endl << "Creating critical Path with vdd..." << endl;
73 #endif
74 unsigned int val = circuit.ValimNode();
75 RetCode2 = CriticRecurse(circuit, val, nodeListVdd);
76 if ( (RetCode2 != OK) && (RetCode2 != CONT) )
77 return RetCode2;
78 Node gateInternalList;
79 for (unsigned int i = 0; i < nodeListGnd.GetNumList(); i++)
80 {
81 unsigned int Gin = 1;
82 unsigned int nn = (nodeListGnd[i]).GetNumNode();
83 unsigned int new_node = ((nodeListGnd[i])[nn - 1]).node;
84 for (unsigned int k = 0; k < gateInternalList.GetNumNode(); k++)

85 if ( (gateInternalList[k]).node == new_node)
86 Gin = 0;
87 if (Gin)
88 gateInternalList.Insert(new_node, -1);
89 }
90 for (unsigned int i = 0; i < nodeListVdd.GetNumList(); i++)
91 {
92 unsigned int Gin = 1;
93 unsigned int nn = (nodeListVdd[i]).GetNumNode();
94 unsigned int new_node = ((nodeListVdd[i])[nn - 1]).node;
95 for (unsigned int k = 0; k < gateInternalList.GetNumNode(); k++)
96 if ( (gateInternalList[k]).node == new_node)
97 Gin = 0;
98 if (Gin)
99 gateInternalList.Insert(new_node, -1);
100 }
101 #ifdef DEBUG
102 cerr << endl << " Gate Internal List: " << (gateInternalList[0]).node << " ";
103 #endif
104 sprintf(log, "Internal gate list: %u", (gateInternalList[0]).node);
105 for (unsigned int i = 1; i < gateInternalList.GetNumNode(); i++)
106 {
107 sprintf(log2, " -- %u", (gateInternalList[i]).node); strcat(log, log2); // append each node to the log line
108 #ifdef DEBUG
109 cerr << (gateInternalList[i]).node << " ";
110 #endif
111 }
112 #ifdef DEBUG
113 cerr << endl;
114 #endif
115 print_log(log);
116 int RetCode = SearchCriticalPath(circuit, pathList, nodeListGnd, nodeListVdd, nodeInputList, gateInternalList, options);
117 if (RetCode != OK)
118 return RetCode;
119 #ifdef DEBUG
120 cerr << endl << pathList.GetNumPath() << " CRITICAL PATHS ";
121 #endif
122 for (unsigned int i = 0; i < pathList.GetNumPath(); i++)
123 {
124 sprintf(log, "#%u) Node_in: %u (%s=%g ps) Node_Out: %u (%s)", i,
125 (pathList[i]).GetNodeIn(), TransitionString[(pathList[i]).GetTransitionIn()],
126 (pathList[i]).GetInTime(), (pathList[i]).GetNodeOut(),
127 TransitionString[(pathList[i]).GetTransitionOut()]);
128 print_log(log);
129 #ifdef DEBUG
130 cerr << endl << log;
131 #endif
132 sprintf(log, "\t\t Tran_List ");
133 for (unsigned int j = 0; j < (pathList[i]).GetNumListTran(); j++)
134 {
135 const char* name;
136 while ((name = (pathList[i]).TraverseTransistorNameList(j)) != 0)
137 {
138 strcat(log, name);
139 strcat(log, " ");
140 }
141 strcat(log, " / ");
142 }
143 print_log(log);
144 #ifdef DEBUG
145 cerr << endl << log << endl;
146 #endif
147 unsigned int node;
148 double val;
149 sprintf(log, "\t\t Active_Inputs: ");
150 char tmpstr[1024];
151 while ((pathList[i]).TraverseActiveInputs(node, val))
152 {
153 sprintf(tmpstr, " v(%u)= %g V -- ", node, val);

154 strcat(log, tmpstr);
155 }
156 print_log(log);
157 #ifdef DEBUG
158 cerr << endl << log << endl;
159 #endif
160 sprintf(log, "\t\t No_Active_Inputs: ");
161 while ((pathList[i]).TraverseNoActiveInputs(node, val))
162 {
163 sprintf(tmpstr, " v(%u)= %g V -- ", node, val);
164 strcat(log, tmpstr);
165 }
166 print_log(log);
167 #ifdef DEBUG
168 cerr << endl << log << endl;
169 #endif
170 sprintf(log, "\t\t Initial Condition: ");
171 while ((pathList[i]).TraverseInitialConditions(node, val))
172 {
173 sprintf(tmpstr, " v(%u)= %g V -- ", node, val);
174 strcat(log, tmpstr);
175 }
176 print_log(log);
177 #ifdef DEBUG
178 cerr << endl << log << endl;
179 #endif
180 }
181 return OK;
182 }
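The primary-input search at the top of the routine above treats a gate node as a primary input when no transistor has that node as its drain or source. The following is a minimal sketch of the same idea, assuming a simplified `Mos` record and standard containers in place of the `Circuit` and `Node` classes of the listing; `Mos` and `primaryInputs` are hypothetical names introduced only for illustration. Unlike the listing, the sketch does not skip a transistor's own terminals, so a diode-connected device is not reported as an input.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical, simplified stand-in for a transistor record.
struct Mos {
    unsigned gate;
    unsigned drain;
    unsigned source;
};

// Returns the gate nodes that are not the drain or source of any transistor,
// in order of first appearance and without duplicates.
std::vector<unsigned> primaryInputs(const std::vector<Mos>& circuit)
{
    std::set<unsigned> channelNodes;            // every drain/source node
    for (std::size_t i = 0; i < circuit.size(); ++i) {
        channelNodes.insert(circuit[i].drain);
        channelNodes.insert(circuit[i].source);
    }

    std::set<unsigned> seen;                    // gates already reported
    std::vector<unsigned> inputs;
    for (std::size_t i = 0; i < circuit.size(); ++i) {
        const unsigned g = circuit[i].gate;
        if (channelNodes.count(g) == 0 && seen.insert(g).second)
            inputs.push_back(g);
    }
    return inputs;
}
```

For a chain of two inverters (node 2 driving node 3, node 3 driving node 4), only node 2 is returned, since node 3 is also a drain node.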
CriticRecurse.cc
12
13
14
15 ///
16 int CriticRecurse(const Circuit& circuit,
17 unsigned int node,
18 NodeList& node_list)
19 {
20 TransistorList TList;
21 static int level = 0;
22 unsigned int n = 0;
23 unsigned int p = 0;
24 int RetCode;
25 if ((RetCode = circuit.TransistorListNode(node, TList, n , p)) != OK)
26 return RetCode;
27 unsigned int Nt = TList.GetNTran();
28 if ((RetCode = node_list.InsertNode(node)) != OK)
29 return RetCode;
30 if ( (n > 0) && (p > 0))
31 {
32 return OK;
33 }
34 level++;
35 #ifdef DEBUG
36 cerr << "node " << node << ": ";
37 for (unsigned int i = 0; i < Nt; i++)
38 cerr << " - " << TList[i].DevName();
39 for (unsigned int i = 0; i < (node_list[node_list.GetNumList() - 1]).GetNumNode(); i++)
40 cerr << " - " << ((node_list[node_list.GetNumList() - 1])[i]).node;
41 cerr << endl;
42 #endif
43 unsigned int RecurseYes;
44 RetCode = 0;
45 for (unsigned int i = 0; i < Nt; i++)
46 {
47 unsigned int new_node;
48 if ((TList[i]).Source() == node)
49 new_node = TList[i].Drain();
50 else if ((TList[i]).Drain() == node)
51 new_node = TList[i].Source();
52 RecurseYes = 1;
53 unsigned int last_node;
54 #ifdef DEBUG
55 for (int j = 0; j < level; j++)
56 cerr << " ";
57 cerr << " LEVEL: " << level << " trying " << TList[i].DevName() << endl;
58 #endif
59 unsigned int nn = (node_list[node_list.GetNumList() - 1]).GetNumNode();
60 for (unsigned int j = 0; (j < nn) && RecurseYes; j++)
61 {
62 last_node = ((node_list[node_list.GetNumList() - 1])[j]).node;
63 if (new_node == last_node)
64 RecurseYes = 0;
65 }
66 if (RecurseYes)
67 {
68 if ((RetCode = node_list.InsertNode(TList[i].Gate())) != OK)
69 return RetCode;
70 int RecurseCode = CriticRecurse(circuit, new_node, node_list);
71 if (RecurseCode == OK)
72 {
73
74 // at least gnd (or vdd), one gate and one node
75 if ((RetCode = node_list.Create()) != OK)
76 return RetCode;
77 #ifdef DEBUG
78 cerr << endl << "NODE LIST : ";
79 #endif
80 nn = (node_list[node_list.GetNumList() - 2]).GetNumNode();
81 for (unsigned int j = 0; j < nn - 2; j++)
82 {
83 new_node = ((node_list[node_list.GetNumList() - 2])[j]).node;
84 #ifdef DEBUG
85 cerr << new_node << " ";
86 #endif
87 if ((RetCode = node_list.InsertNode(new_node)) != OK)
88 return RetCode;
89 }
90 #ifdef DEBUG
91 cerr << ((node_list[node_list.GetNumList() - 2])[nn - 2]).node << " "
92 << ((node_list[node_list.GetNumList() - 2])[nn - 1]).node << endl;
93 #endif
94 }
95 else if (RecurseCode == CONT)
96 {
97 if ( (RetCode = (node_list[node_list.GetNumList() - 1]).DeleteLevelNode(2 * level - 2)) != OK )
98 return RetCode;
99 }
100 else
101 return RecurseCode;
102 }
103 }
104 level--;
105 if (level == 0)
106 {
107 unsigned int nl = node_list.GetNumList();

108 for (unsigned int i = 0; i < nl; i++)
109 {
110 unsigned int nn = (node_list[i]).GetNumNode();
111 if (nn == 1)
112 {
113 RetCode = node_list.DeleteList(i);
114 if (RetCode != OK)
115 return RetCode;
116 }
117 }
118 }
119 return CONT;
120 }
CriticalPath.cc
11
12
13
14 ///
15 CritPathList::CritPathList() : NumPath( 0 ), head( 0 ), tail( 0 )
16 {
17 print_log( "Creating critical path list..." );
18 }
19
20
21 ///
22 CritPathList::~CritPathList()
23 {
24 CPNode* tmp;
25 ///
26 while ( head != 0 )
27 {
28 tmp = head->next;
29 delete head;
30 head = tmp;
31 }
32 }
33
34
35 ///
36 const CPNode& CritPathList::operator[]( unsigned int index ) const
37 {
38 CPNode * tmp = head;
39 unsigned int i = index;
40 if ( index >= NumPath )
41 error( NOT_FOUND, 0, "Index out of bound in [CritPathList]..." );
42 while ( i-- )
43 tmp = tmp->next;
44 return *tmp;
45 }
CriticalPathCreate.cc
11 #include "class_simulator.h"
12
13 ///
14 int CritPathList::Create()
15 {
16 if ( !head )
17 {
18 head = new CPNode;
19 if ( !head )
20 return NO_MEM;
21 tail = head;
22 }
23 else
24 {
25 tail->next = new CPNode;
26 if ( !( tail->next ) )
27 return NO_MEM;
28 tail = tail->next;
29 }
30 tail->next = 0;
31 return OK;
32 }
33
34 ///
35 int CritPathList::Stamp( unsigned int NumTranList )
36 {
37 tail->VALID = 1;
38 tail->NumTranList = NumTranList;
39 NumPath++;
40 return OK;
41 }
CriticalPathInsert.cc
11
12
13 ///
14 int CritPathList::InsertNodeIn( unsigned int NIn, TransitionType Type, double Time )
15 {
16 if ( tail )
17 return tail->InsNodeIn( NIn, Type, Time );
18 else
20 }
21
22 ///
23 int CritPathList::InsertNodeOut( unsigned int NOut, TransitionType Type )
24 {
25 if ( tail )
26 return tail->InsNodeOut( NOut, Type );
27 else
29 }
30
31 ///
32 int CritPathList::InsertActiveInputs( unsigned int Node, double Val )
33 {
34 if ( tail )
35 return tail->InsActIn( Node, Val );
36 else
38 }
39
40 ///
41 int CritPathList::InsertNoActiveInputs( unsigned int Node, double Val )
42 {
43 if ( tail )
44 return tail->InsNoActIn( Node, Val );
45 else
47 }
48
49 ///
50 int CritPathList::InsertInitialCondition( unsigned int Node, double Val )
51 {
52 if ( tail )
53 return tail->InsIniCond( Node, Val );
54 else
56 }
57
58 ///
59 int CritPathList::InsertPathTransistor( const char* name, TransistorType TR, unsigned int index )
60 {
61 if ( tail )
62 return tail->InsTran( name, TR, index );
63 else
65 }
CriticalPathParse.cc
11
12
13 ///
14 int CritPathList::ParseLineCPNode( const char* str, CPCOMMANDOPT NodeType )
15 {
16 char tmpType[ 5 ];
17 unsigned int node;
18 double time;
19 sscanf( str, "%*s%u%s", &node, tmpType );
20 switch ( NodeType )
21 {
22 case NODEIN:
23 sscanf( str, "%*s%*u%*s%lg", &time );
24 for ( unsigned int i = 0; ( TransitionString[ i ] != 0 ); i++ )
25 {
26 if ( !strcasecmp( tmpType, TransitionString[ i ] ) )
27 return InsertNodeIn( node, ( TransitionType ) i, time );
28 }
30 break;
31 case NODEOUT:
32 for ( unsigned int i = 0; ( TransitionString[ i ] != 0 ); i++ )
33 {
34 if ( !strcasecmp( tmpType, TransitionString[ i ] ) )
35 return InsertNodeOut( node, ( TransitionType ) i );

36 }
38 break;
39 default:
41 break;
42 }
43 }
44
45 ///
46 int CritPathList::ParseLineCPInputs( const char* str, CPCOMMANDOPT InputType )
47 {
48
49 unsigned int node;
50 unsigned int NumRead;
51 double val;
52 char parsestr[ 1024 ];
53 char laststr[ 1024 ];
54 const char *tmpstr = " %*u %*lg";
55 strcpy( parsestr, "%*s" );
56 NumRead = sscanf( str, "%*s %u %lg", &node, &val );
57 while ( NumRead == 2 )
58 {
59 switch ( InputType )
60 {
61 case ACTIVEI:
62 if ( InsertActiveInputs( node, val ) )
64 break;
65 case NOACTIVEI:
66 if ( InsertNoActiveInputs( node, val ) )
68 break;
69 case IC:
70 if ( InsertInitialCondition( node, val ) )
72 break;
73 case CPATH:
74 case NODEIN:
75 case NODEOUT:
76 case TRANLIST:
77 case ENDCPATH:
78 case NONECP:
79 default:
80 break;
81 }
82 strcat( parsestr, tmpstr );
83 strcpy( laststr, parsestr );
84 strcat( laststr, " %u %lg" );
85 NumRead = sscanf( str, laststr, &node, &val );
86 }
87 if ( NumRead == 1 )
89 return OK;
90 }
91
92 ///
93 int CritPathList::ParseLineCPTran( const char* str, const Circuit& circuit, unsigned int index )
94 {
95
96 char parsestr[ 1024 ];
97 char laststr[ 1024 ];
98 const char *tmpstr = " %*s";
99 char name[ 16 ];
100 strcpy( parsestr, "%*s" );
101 unsigned int NumRead = sscanf( str, "%*s %s", name );
102 if ( NumRead != 1 )
104 while ( NumRead == 1 )

105 {
106 if (circuit.TranPos(name) == -1)
108 if ( InsertPathTransistor( name, circuit[ name ].TrType(), index ) )
110 strcat( parsestr, tmpstr );
111 strcpy( laststr, parsestr );
112 strcat( laststr, " %s" );
113 NumRead = sscanf( str, laststr, name );
114 }
115 return OK;
116 }
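The parsing routines above re-scan the whole line for every item, growing a format string of suppressed conversions (`%*u %*lg`) each time. As a hedged alternative, the sketch below walks the line once by using the `%n` conversion of `sscanf`, which reports how many characters have been consumed. The function name `parsePairs`, the `NodeValue` typedef and the `complete` flag are assumptions made for this illustration and are not part of the thesis code.

```cpp
#include <cstdio>
#include <utility>
#include <vector>

typedef std::pair<unsigned int, double> NodeValue;

// Parses "<keyword> <node> <value> <node> <value> ..." into (node, value) pairs.
// complete is set to false when a trailing node has no value after it.
std::vector<NodeValue> parsePairs(const char* str, bool& complete)
{
    std::vector<NodeValue> pairs;
    complete = true;

    int used = -1;
    std::sscanf(str, "%*s%n", &used);   // skip the keyword; %n = chars consumed
    if (used < 0)
        return pairs;                   // blank line: no keyword, no pairs

    const char* p = str + used;
    for (;;) {
        unsigned int node = 0;
        double val = 0.0;
        used = -1;
        const int got = std::sscanf(p, "%u %lg%n", &node, &val, &used);
        if (got != 2) {
            if (got == 1)
                complete = false;       // node without a value
            break;
        }
        pairs.push_back(NodeValue(node, val));
        p += used;                      // continue after the pair just read
    }
    return pairs;
}
```

The `complete` flag plays the role of the `NumRead == 1` check in `ParseLineCPInputs`: a node number that is not followed by a value is reported rather than silently dropped.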
CriticalPathRead.cc
6 #include "global.h"
12
13
14 ///
15 int CritPathList::Read( const char* FileOptions, const Circuit& circuit )
16 {
17 ifstream i_file( FileOptions );
18 char line[ 1024 ];
19 char command[ 256 ];
20 if ( !i_file )
22 unsigned int LineNum = 0;
23 unsigned int NumTranList = 0;
24 while ( i_file.getline( line, 1024 ) )
25 {
26 LineNum++;
27 if ( sscanf( line, "%s ", command ) == 1 )
28 if ( command[ 0 ] != '#' )
29 {
30 int RetCode;
31 switch ( CPCOMMANDOPT Cc = WhichCommand( command ) )
32 {
33 case CPATH:
34 RetCode = Create();
35 NumTranList = 0;
36 break;
37 case NODEIN:
38 case NODEOUT:
39 RetCode = ParseLineCPNode( line, Cc );
40 break;
41 case ACTIVEI:
42 case NOACTIVEI:
43 case IC:
44 RetCode = ParseLineCPInputs( line, Cc );
45 break;
46 case TRANLIST:
47 RetCode = ParseLineCPTran( line, circuit, NumTranList );
48 NumTranList++;
49 if ( NumTranList >= MAXCHAIN )
50 RetCode = NO_MEM;
51 break;
52 case ENDCPATH:
53 RetCode = Stamp( NumTranList );
54 break;
55 case NONECP:
56 RetCode = OK;
57 default:
58 break;
59 }
60 if ( RetCode != OK )
61 {
62 sprintf( line, "ERROR reading file %s line %d ", FileOptions, LineNum );
63 print_log( line );
64 i_file.close();
65 return RetCode;
66 }
67 }
68 }
69 i_file.close();
70 return OK;
71 }
EvaluationAlgorithm.cc
12
13
14 ///
15 EvaluationAlgorithm::EvaluationAlgorithm( const CritPathList& pathlist, const Options& options )
16 :
17 pathlist( pathlist ), options( options ),
18 NumPath( 0 ), Calls( 0 ),
19 CPDelay( 0 ), CPPower( 0 ),
20 CPNoise( 0 ), Area( 0.0 )
21 {
22 print_log( "Creating simulation algorithm..." );
23 NumPath = pathlist.GetNumPath();
24 CPDelay = new double[ NumPath ];
25 CPPower = new double[ NumPath ];
26 CPNoise = new double[ NumPath ];
27 if ( !CPDelay || !CPPower || !CPNoise )
28 {
32 }
33
34 }
35
36 ///
37 EvaluationAlgorithm::~EvaluationAlgorithm()
38 {
39 delete[] CPDelay;
40 delete[] CPPower;
41 delete[] CPNoise;
42 }
Global.cc
7
8 ///
9 GLOBCOMMANDOPT WhichGBOption( const char* option )
10 {
11 for ( unsigned int i = 0; (GlobCommandOptions[ i ] != 0 ); i++ )
12 {
13 if ( !strcasecmp( option, GlobCommandOptions[ i ] ) )
14 return ( ( GLOBCOMMANDOPT ) i );
15 }
16 return NONEGLOB;
17 }
18
19 ///
20 SIMCOMMANDOPT WhichSimOption( const char* option )
21 {
22 for ( unsigned int i = 0; (SimCommandOptions[ i ] != 0 ); i++ )
23 {
24 if ( !strcasecmp( option, SimCommandOptions[ i ] ) )
25 return ( ( SIMCOMMANDOPT ) i );
26 }
27 return NONESIM;
28 }
29
30
31 ///
32 OPTCOMMANDOPT WhichOptOption( const char* option )
33 {
34 for ( unsigned int i = 0; (OptCommandOptions[ i ] != 0 ); i++ )
35 {
36 if ( !strcasecmp( option, OptCommandOptions[ i ] ) )
37 return ( ( OPTCOMMANDOPT ) i );
38 }
39 return NONEOPT;
40 }
41
42 ///
43 CPCOMMANDOPT WhichCommand( const char* option )
44 {
45 for ( unsigned int i = 0; (CPCommandOptions[ i ] != 0 ); i++ )
46 {
47 if ( !strcasecmp( option, CPCommandOptions[ i ] ) )
48 return ( ( CPCOMMANDOPT ) i );
49 }
50 return NONECP;
51 }
52
53 #ifndef LINUX
54
55 ///
56 void error( int exitCode, int ErrorType, const char* message )
57 {
58 cerr << message << "Error " << ErrorType << endl;
59 if ( exitCode != 0 )
60 exit( exitCode );
61 }
62
63 #endif
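The four lookup functions above share one pattern: a null-terminated table of keyword strings is scanned case-insensitively, and the matching index is cast to an enumeration whose last value marks "not found". A minimal sketch of that pattern follows. The table contents and the names `Command`, `CommandNames` and `whichCommand` are hypothetical, since the actual keyword tables are not shown in this listing; `strcasecmp` is the POSIX function declared in `<strings.h>`.

```cpp
#include <strings.h>   // strcasecmp (POSIX)

// Hypothetical command set; CMD_NONE must sit at the index of the sentinel.
enum Command { CMD_CPATH, CMD_NODEIN, CMD_NODEOUT, CMD_NONE };

// Keyword table, terminated by a null pointer.
static const char* const CommandNames[] = { "cpath", "nodein", "nodeout", 0 };

// Returns the command whose keyword matches word (ignoring case),
// or CMD_NONE when there is no match.
Command whichCommand(const char* word)
{
    for (unsigned int i = 0; CommandNames[i] != 0; ++i) {
        if (strcasecmp(word, CommandNames[i]) == 0)
            return static_cast<Command>(i);
    }
    return CMD_NONE;
}
```

The enumeration and the table must be kept in the same order, which is the main maintenance cost of this scheme.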
IsIn.cc
12
13
14
15
16 ///
17 int IsIn(unsigned int node, Node& NList, unsigned int& pos)
18 {
19 pos = 0;
20 for (unsigned int i = 0; i < NList.GetNumNode(); i++)
21 if ((NList[i]).node == node)
22 {
23 pos = i;
24 return OK;
25 }
27 }
28
Node.cc
9
10 ///
11 Node::Node() : NumNode(0), next( 0 ), Head(0), Tail(0)
12 {}
13
14 ///
15
16 Node::~Node()
17 {
18 _NodeList* tmp;
19 while ( Head )
20 {
21 tmp = Head->next;
22 delete Head;
23 Head = tmp;
24 }
25 }
26
27 ///
28 int Node::Insert( unsigned int node, int flag = -1)
29 {
30 _NodeList * tmp;
31 if ( !Head )
32 {
33 Head = new _NodeList;
34 if ( !Head )
35 return NO_MEM;
36 Head->next = 0;
37 Tail = Head;
38 }
39 else
40 {
41 tmp = new _NodeList;
42 if ( !tmp )
43 return NO_MEM;
44 tmp->next = 0;
45 Tail->next = tmp;
46 Tail = tmp;
47 }
48 Tail->node = node;
49 Tail->flag = flag;
50 NumNode++;
51 return OK;
52 }
53
54 ///
55 int Node::DeleteLevelNode(unsigned int level)
56 {
57 // NodeList* tmp = Head;
58 //unsigned int i = level;
59 if (level >= NumNode)
61 //while(i--)
62 // tmp = tmp->next;
63 //tmp->next = 0;
64 //Tail = tmp;
65 Tail = &(operator[](level));
66 (operator[](level)).next = 0;
67 NumNode = level + 1;
68 return OK;
69 }
70
71 ///
72 const _NodeList& Node::operator[]( unsigned int index ) const
73 {
74 _NodeList* tmp = Head;
75 unsigned int i = index;
76 if ( index >= NumNode )
77 error( NOT_FOUND, 0, "Index out of bound in [_NodeList]..." );
78 while ( i-- )
79 tmp = tmp->next;
80 return *tmp;
81 }
82
83 ///
84 _NodeList& Node::operator[]( unsigned int index )
85 {
86 _NodeList* tmp = Head;
87 unsigned int i = index;
88 if ( index >= NumNode )
89 error( NOT_FOUND, 0, "Index out of bound in [_NodeList]..." );
90 while ( i-- )
91 tmp = tmp->next;
92 return *tmp;
93 }
NodeCreate.cc
7
8 ///
9 int NodeList::Create()
10 {
11 if ( !head )
12 {
13 head = new Node;
14 if ( !head )
15 return NO_MEM;
16 tail = head;
17 }
18 else
19 {
20 tail->next = new Node;
21 if ( !( tail->next ) )
22 return NO_MEM;
23 tail = tail->next;
24 }
25 tail->next = 0;
26 NumList++;
27 return OK;
28 }
NodeList.cc
7
8
9 ///
10 NodeList::NodeList() : NumList( 0 ), head( 0 ), tail( 0 )
11 {
12 print_log( "Creating node list..." );
13 }
14
15
16 ///
17 NodeList::~NodeList()
18 {
19 Node* tmp;
20 while ( head != 0 )
21 {
22 tmp = head->next;
23 delete head;
24 head = tmp;
25 }
26 }
27
28
29 ///
30 const Node& NodeList::operator[]( unsigned int index ) const
31 {
32 Node *tmp = head;
33 unsigned int i = index;
34 if ( index >= NumList )
35 error( NOT_FOUND, 0, "Index out of bound in [NodeList]..." );
36 while ( i-- )
37 tmp = tmp->next;
38 return *tmp;
39 }
40
41 ///
42 Node& NodeList::operator[]( unsigned int index )
43 {
44 Node *tmp = head;
45 unsigned int i = index;
46 if ( index >= NumList )
47 error( NOT_FOUND, 0, "Index out of bound in [NodeList]..." );
48 while ( i-- )
49 tmp = tmp->next;
50 return *tmp;
51 }
NodeListDelete.cc
7
8 ///
9 int NodeList::DeleteList( unsigned int list)
10 {
11 if (list >= NumList)
13 //unsigned int i = list - 1;
14 //Node* tmp = head;
15 //while(i--)
16 // tmp = tmp->next;
17 //tmp->next = tmp->next->next;
18 if ((operator[](list)).next)
19 {
20 (operator[](list - 1)).next = (operator[](list)).next;
21 Node* tmp = &(operator[](list - 1));
22 while (tmp->next)
23 tmp = tmp->next;
24 tail = tmp;
25 }
26 else
27 {
28 (operator[](list - 1)).next = 0;
29 tail = &(operator[](list - 1));
30 }
31 NumList--;
32 return OK;
33 }
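`DeleteList` above reaches the predecessor of the element to remove through `operator[](list - 1)`, which has no valid meaning when `list` is 0, and it does not free the unlinked element. One common way to treat the head and the interior cases uniformly is to walk a pointer to the link rather than a pointer to the node. The sketch below shows that technique on a hypothetical `List`/`Item` pair; it is an illustration under these assumptions, not the thesis implementation.

```cpp
#include <cstddef>

// Hypothetical singly linked list node.
struct Item {
    int value;
    Item* next;
};

// Minimal list with head, tail and element count (not copyable as written).
struct List {
    Item* head;
    Item* tail;
    unsigned int count;

    List() : head(0), tail(0), count(0) {}
    ~List() { while (head) { Item* n = head->next; delete head; head = n; } }

    // Appends a value at the tail.
    void push(int v)
    {
        Item* it = new Item;
        it->value = v;
        it->next = 0;
        if (tail) tail->next = it; else head = it;
        tail = it;
        ++count;
    }

    // Removes the element at index; returns false if index is out of range.
    bool erase(unsigned int index)
    {
        if (index >= count)
            return false;
        Item** link = &head;            // the pointer that refers to the victim
        Item* prev = 0;                 // node before the victim, if any
        for (unsigned int i = 0; i < index; ++i) {
            prev = *link;
            link = &prev->next;
        }
        Item* victim = *link;
        *link = victim->next;           // unlink; works for the head as well
        if (victim == tail)
            tail = prev;                // removing the last element moves tail
        delete victim;
        --count;
        return true;
    }
};
```

Because `link` starts at `&head`, removing element 0 needs no special branch, and `tail` is updated only when the last element is the one removed.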
NodeListInsert.cc
7
8 ///
9 int NodeList::InsertNode( unsigned int node)
10 {
11 if ( tail )
12 return tail->Insert( node );
13 else
15 }
OptSimulate.cc
12 #include "class_optimizator.h"
13
14 ///
15 int OptimizationAlgorithm::SimulateCircuit( const double *NewWidth )
16 {
17 int RetCode = Simulation.Run( circuit, NewWidth, ValidPath );
18 if ( RetCode != OK )
19 return RetCode;
20 for ( unsigned int i = 0; i < NumPath; i++ )
21 {
22 CPDelay[ i ] = Simulation.GetDelay( i );
23 CPPower[ i ] = Simulation.GetPower( i );
24 CPNoise[ i ] = Simulation.GetNoise( i );
25 }
26 Area = Simulation.GetArea();
27 return OK;
28 }
OptimizationAlFirstSim.cc
13
14 ///
15 int OptimizationAlgorithm::SimulateFirstCircuit()
16 {
17 double* MinimumWidth = new double[ NumTran ];
18 double* MaximumWidth = new double[ NumTran ];
19 if (!MinimumWidth || !MaximumWidth)
20 return NO_MEM;
21 for ( unsigned int i = 0; i < NumTran; i++ )
22 {
23 MinimumWidth[ i ] = options.GetOptOption( WMIN );
24 if (options.GetOptOption( WMAX ) <= 0)
25 MaximumWidth[ i ] = options.GetOptOption( WMIN ) * 100;
26 else
27 MaximumWidth[ i ] = options.GetOptOption( WMAX );
28 }
29 int RetCode = Simulation.Run( circuit, MaximumWidth, ValidPath );
30 if ( RetCode != OK )
31 return RetCode;
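32 for ( unsigned int i = 0; i < NumPath; i++ )
33 {
34 CPDelay[ i ] = Simulation.GetDelay( i );
35 CPPower[ i ] = Simulation.GetPower( i );
36 CPNoise[ i ] = Simulation.GetNoise( i );
37 }
38 Area = Simulation.GetArea();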
39 MaxDelayInitMax = 0.0;
40 for ( unsigned int i = 0; i < NumPath; i++ )
41 {
42 if ( Simulation.GetDelay( i ) > 0.0 )
43 {
44 if ( Simulation.GetDelay( i ) > MaxDelayInitMax )
45 MaxDelayInitMax = CPDelay[i];
46 if ( Simulation.GetPower( i ) > MaxPowerInitMax )
47 MaxPowerInitMax = CPPower[i];
48 if ( Simulation.GetNoise( i ) > MaxNoiseInitMax )
49 MaxNoiseInitMax = CPNoise[i];
50 }
51 }
52 AreaInitMax = Area;
53 MaxNoiseInitMax = 1.0; /// FIX ME !!!!!!!!!
54 RetCode = Simulation.Run( circuit, MinimumWidth, ValidPath );
55 if ( RetCode != OK )
56 return RetCode;
57 unsigned int tmpP = 0;
58 for ( unsigned int i = 0; i < NumPath; i++ )
59 {
61 if (CPDelay[ i ] > 0.0)
62 {
63 ValidPath[i] = 1;
64 tmpP++;
65 }
66 else
70 }
72 MaxDelayInitMin = 0.0;
73 for ( unsigned int i = 0; i < NumPath; i++ )
74 {
75 if (ValidPath[i])
76 {
77 if ( Simulation.GetDelay( i ) > MaxDelayInitMin )
78 MaxDelayInitMin = CPDelay[i];
79 if ( Simulation.GetPower( i ) > MaxPowerInitMin )
80 MaxPowerInitMin = CPPower[i];
81 if ( Simulation.GetNoise( i ) > MaxNoiseInitMin )
82 MaxNoiseInitMin = CPNoise[i];
83 }
84 }
85 AreaInitMin = Area;
86 MaxNoiseInitMin = 1.0; /// FIX ME !!!!!!!!!
87 if (MaxDelayInitMin < MaxDelayInitMax)
88 MaxDelayInitMax = MaxDelayInitMin;
89 if (MaxPowerInitMin > MaxPowerInitMax)
90 MaxPowerInitMax = MaxPowerInitMin;
91 if (MaxNoiseInitMin > MaxNoiseInitMax)
92 MaxNoiseInitMax = MaxNoiseInitMin;
93 if (AreaInitMin > AreaInitMax)
94 AreaInitMax = AreaInitMin;
95 char log[512];
96 sprintf( log, "Found %u valid critical paths of %u", tmpP, NumPath );
97 print_log( log );
98 return OK;
99 }
OptimizationAlNormSim.cc
12
13 ///
14 double OptimizationAlgorithm::NormSim( const double* x, int& RetCode)
15 {
16 double f;
17 double* X = new double[ NumTran ];
18 double DelW = options.GetOptOption( DELTA );
19 unsigned int count_min = 0;
20 unsigned int count_max = 0;
21 static unsigned int elapsed = 0;
22 static double fLast;
23 static unsigned int count_conv = 0;
24 for ( unsigned int i = 0; i < NumTran; i++ )
25 {
26 if ( x[ i ] <= options.GetOptOption( WMIN ) )
27 {
28 X[ i ] = options.GetOptOption( WMIN );
29 count_min++;
30 }
31 else if ( (x[ i ] > options.GetOptOption( WMAX )) &&
32 (options.GetOptOption( WMAX ) > 0) )
33 {
34 X[ i ] = options.GetOptOption( WMAX );
35 count_max++;
36 }
37 else
38 X[ i ] = double( rint( x[ i ] / DelW ) * DelW );
39 }
40 if ((count_min != NumTran) && (count_max != NumTran))
41 RetCode = SimulateCircuit( X );
42 else
43 {
44 RetCode = OK;
45 }
47 {
48 delete[] X;
49 return 0.0;
50 }
51 double maxT = 0.0;
52 double maxP = 0.0;
53 double maxN = 0.0;
54 double RatioT = 1.0; // MaxDelayInit / MaxDelayInit
55 //double RatioP = MaxDelayInitMin / MaxPowerInitMax;
56 double RatioP = 1.0;
57 //double RatioN = MaxDelayInitMin / MaxNoiseInitMax;
58 double RatioN = 0.0; // FIX ME !!!!!!!!!!!
59 //double RatioA = MaxDelayInitMin / AreaInitMax;
60 double RatioA = 1.0;
61 f = 0.0;
62 double fMin = COST_FACTOR;
63 if ( options.GetOptOption( WEIGHTS ) )
64 {
65 RatioT *= options.GetOptOption(WDELAY);
66 RatioP *= options.GetOptOption(WPOWER);
67 RatioN *= options.GetOptOption(WNOISE);
68 RatioA *= options.GetOptOption(WAREA);
69 }
70 double fMin_norm;
71 double fMax;
72 unsigned int Constraints = 0;
73 double MAXDelay = options.GetOptOption( MAXDELAY );
74 double MAXPower = options.GetOptOption( MAXPOWER );
75 double MAXNoise = options.GetOptOption( MAXNOISE );
76 double MAXArea = options.GetOptOption( MAXAREA );
77 if ( (MAXDelay > 0) || (MAXPower > 0) || (MAXNoise > 0) || (MAXArea > 0))
78 Constraints = 1;
79 fMin_norm = ( RatioT * MaxDelayInitMin / MaxDelayInitMin + \
80 RatioP * MaxPowerInitMin / MaxPowerInitMax + \
81 RatioN * MaxNoiseInitMin / MaxNoiseInitMax + \
82 RatioA * AreaInitMin / AreaInitMax);
83 fMax = (RatioT * MaxDelayInitMax / MaxDelayInitMin + \
84 RatioP * MaxPowerInitMax / MaxPowerInitMax + \
85 RatioN * MaxNoiseInitMax / MaxNoiseInitMax + \
86 RatioA * AreaInitMax / AreaInitMax) * \
87 COST_FACTOR / fMin_norm;
88 if (elapsed == 0)
89 fLast = fMin;
90 if ((count_min != NumTran) && (count_max != NumTran))
91 {
92 for ( unsigned int i = 0; i < NumPath; i++ )
93 {
94 if ( CPDelay[ i ] > maxT )
95 maxT = CPDelay[ i ];
96 if (CPDelay[ i ] > 0)
97 {
98 if ( CPPower[ i ] > maxP )
99 maxP = CPPower[ i ];
100 if ( CPNoise[ i ] > maxN )
101 maxN = CPNoise[ i ];
102 }
103 }
104 f = (RatioT * maxT / MaxDelayInitMin + \
105 RatioP * maxP / MaxPowerInitMax + \
106 RatioN * maxN / MaxNoiseInitMax + \
107 RatioA * Area / AreaInitMax) * \
108 COST_FACTOR / fMin_norm;
109 if ( Constraints )
110 {
111 if (MAXDelay > 0)
112 {
113 if (maxT > MAXDelay)
114 {
115 f += (maxT - MAXDelay ) / MaxDelayInitMin *\
116 (maxT - MAXDelay ) / MaxDelayInitMin *\
118 RetCode = CONT;
119 }
120 else
121 RetCode = END_ACC;
122 }
123 if (MAXPower > 0)
124 {
125 if (maxP > MAXPower)
126 {
127 f += (maxP - MAXPower ) / MaxPowerInitMax *\
128 (maxP - MAXPower ) / MaxPowerInitMax *\
130 RetCode = CONT;
131 }
132 else
134 }
135 if (MAXNoise > 0)
136 {
137 if (maxN > MAXNoise)
138 {
139 f += (maxN - MAXNoise ) / MaxNoiseInitMax *\
140 (maxN - MAXNoise ) / MaxNoiseInitMax *\
142 RetCode = CONT;
143 }
144 else
146 }
147 if (MAXArea > 0)
148 {
149 if (Area > MAXArea)
150 {
151 f += (Area - MAXArea ) / AreaInitMax *\
152 (Area - MAXArea ) / AreaInitMax *\
154 RetCode = CONT;
155 }
156 else
158 }
159 }
160 }
161 else if (count_min == NumTran)
162 {
163 f = fMin;
164 maxT = MaxDelayInitMin;
165 maxP = MaxPowerInitMin;
166 maxN = MaxNoiseInitMin;
167 }
168 else if (count_max == NumTran)
169 {
170 f = fMax * COST_FACTOR / fMin_norm;
171 maxT = MaxDelayInitMax;
172 maxP = MaxPowerInitMax;

173 maxN = MaxNoiseInitMax;
174 }
175 InternalSteps++;
176 if ( InternalSteps >= options.GetOptOption( MAXSTEPS ) )
177 RetCode = MAX_STEPS;
178 char log[ 1024 ];
179 if ((options.Verbose()) || (InternalSteps == 1))
180 {
181 circuit.PrintResult( InternalSteps, NumTran, NumPath, X, CPDelay, CPPower, CPNoise, Area, maxT, maxP, maxN, f, fLast);
182 sprintf( log, "...step: %d, objective: %g", InternalSteps, f );
184 }
185 else if ((f < fLast) || ((InternalSteps % 100) == 0))
186 {
187 circuit.PrintResult( InternalSteps, NumTran, NumPath, X, CPDelay, CPPower, CPNoise, Area, maxT, maxP, maxN, f, fLast);
188 sprintf( log, "...step: %d, objective: %g", InternalSteps, f );
190 if ( ((fLast - f) / fLast) < options.GetOptOption( ACC ) && (RetCode != CONT))
191 {
192 count_conv++;
193 if (count_conv >= 2)
194 {
196 }
197 }
198 fLast = f;
199 elapsed++;
200 }
201 //else
202 // count_conv = 0;
203 delete[] X;
204 return f;
205 }
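`NormSim` above combines three steps: each requested width is clamped to the allowed range and rounded to the width grid, the worst-case delay, power, noise and area are divided by reference values and summed with weights, and a quadratic term is added for every violated constraint. The sketch below isolates these steps as small free functions. All names (`snapWidth`, `Metrics`, `weightedCost`, `penalty`) are hypothetical, and the `factor` argument stands in for the multiplier of the penalty term, which is not visible in the listing.

```cpp
#include <cmath>

// Clamps w to [wmin, wmax] (wmax <= 0 means no upper bound) and rounds
// widths inside the range to the nearest multiple of delta.
double snapWidth(double w, double wmin, double wmax, double delta)
{
    if (w <= wmin)
        return wmin;
    if (wmax > 0.0 && w > wmax)
        return wmax;
    return std::rint(w / delta) * delta;
}

// One value per optimization target.
struct Metrics {
    double delay;
    double power;
    double noise;
    double area;
};

// Weighted sum of the metrics, each normalized by its reference value.
double weightedCost(const Metrics& m, const Metrics& ref, const Metrics& weight)
{
    return weight.delay * m.delay / ref.delay
         + weight.power * m.power / ref.power
         + weight.noise * m.noise / ref.noise
         + weight.area  * m.area  / ref.area;
}

// Quadratic penalty for exceeding limit; limit <= 0 disables the constraint.
double penalty(double value, double limit, double ref, double factor)
{
    if (limit <= 0.0 || value <= limit)
        return 0.0;
    const double excess = (value - limit) / ref;   // normalized violation
    return excess * excess * factor;
}
```

A caller would add `penalty(...)` for each constrained metric to `weightedCost(...)`; setting a weight to zero, as the listing does for noise, removes that metric from the objective without changing the other terms.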
OptimizationAlgorithm.cc
13
14
15 ///
16 OptimizationAlgorithm::OptimizationAlgorithm( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation )
17 :
18 InternalSteps( 0 ), circuit( circuit ), options( options ),
19 Steps( 0 ), NumTran( 0 ), NumPath( 0 ),
20 Width( 0 ), CPDelay( 0 ), CPPower( 0 ),
21 CPNoise( 0 ), Area( 0.0 ), ValidPath(0),
22 MaxDelayInitMin( 0.0 ), MaxPowerInitMin( 0.0 ),
23 MaxNoiseInitMin( 0.0 ), AreaInitMin( 0.0 ),
24 MaxDelayInitMax( 0.0 ), MaxPowerInitMax( 0.0 ),
25 MaxNoiseInitMax( 0.0 ), AreaInitMax( 0.0 ),
26 Simulation( simulation )
27 {
28 // default initialization
29 print_log( "Creating optimization algorithm..." );
30 NumTran = circuit.GetNTran();
31 Width = new double[ NumTran ];
32 NumPath = simulation.GetNPath();
33 CPDelay = new double[ NumPath ];
34 CPPower = new double[ NumPath ];

35 CPNoise = new double[ NumPath ];
36 ValidPath = new unsigned[NumPath];
37 if ( !Width || !CPDelay || !CPPower || !CPNoise )
38 {
42 }
43 for ( unsigned int i = 0; i < NumTran; i++ )
44 Width[ i ] = circuit[ i ].Width();
47 }
48
49 ///
50 OptimizationAlgorithm::~OptimizationAlgorithm()
51 {
52 delete[] Width;
53 delete[] CPDelay;
54 delete[] CPPower;
55 delete[] CPNoise; delete[] ValidPath; // ValidPath is allocated in the constructor as well
56 }
Optimize.cc
13 #include "slop.h"
14 #include "slop2.h"
15 #include "powell.h"
16 #include "anneal.h"
17 #include "test.h"
18
19 ///
20 int Optimize( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation, double* LastWidth )
21 {
22 struct timeb start_t, stop_t;
23 ftime( &start_t );
24 char log[ 1024 ];
25 OptimizationAlgorithm* Opt = 0; // stays null if no algorithm matches, so the check below is meaningful
26 switch ( options.WhichOptAlgorithm() )
27 {
28 case SLOP:
29 Opt = new Slop( circuit, options, simulation );
30 break;
31 case SLOP2:
32 Opt = new Slop2( circuit, options, simulation );
33 break;
34 case POWELL:
35 Opt = new Powell( circuit, options, simulation );
36 break;
37 case ANNEAL:
38 Opt = new Anneal( circuit, options, simulation );
39 break;
40 case TESTEVAL:
41 Opt = new TestEval( circuit, options, simulation );
42 default:
43 break;
44 }
45 unsigned int n = circuit.GetNTran();

46 unsigned int np = simulation.GetNPath();
47 if ( !Opt )
48 return NO_MEM;
49 int RetCode;
50 if ( ( RetCode = Opt->SimulateFirstCircuit() ) != OK )
51 return RetCode;
52 print_log( "Initial critical paths: " );
53 for ( unsigned int i = 0; i < np; i++ )
54 {
55 sprintf( log, "%u) Delay=%g ps, Energy=%g pJ, Noise=%g", i,
56 simulation.GetDelay( i ),
57 simulation.GetPower( i ),
58 simulation.GetNoise( i ) );
59 print_log( log );
60 }
61 sprintf( log, "Area=%g ", simulation.GetArea() );
62 print_log( "" );
63 print_log( "Starting optimization process..." );
64 RetCode = Opt->Run();
65 if ( ( RetCode != OK ) && ( RetCode != MAX_STEPS ) && ( RetCode != END_ACC ) && (RetCode != CONT))
66 return RetCode;
67 ftime( &stop_t );
68 if ( RetCode == MAX_STEPS )
69 {
70 print_log( "...WARNING: exceeded max steps..." );
71 }
72 if (( RetCode == END_ACC ) || ( RetCode == OK) )
73 {
74 print_log( "...Solution found. That's all folks." );
75 }
76 for ( unsigned int i = 0; i < n; i++ )
77 {
78 int pos = circuit.TranPos( circuit[ i ].DevName() );
79 LastWidth[ i ] = Opt->OptWidth( pos );
80 }
81 RetCode = Opt->SimulateCircuit(LastWidth);
82 if ( RetCode != OK )
83 return RetCode;
84 long sec = stop_t.time - start_t.time;
85 short msec = abs( start_t.millitm - stop_t.millitm );
86 sprintf( log, "End Optimization: %ld steps, %ld function evaluations ", Opt->GetSteps(), simulation.GetCalls() );
87 print_log( log );
88 sprintf( log, " : total time: %ld.%d secs ", sec, msec );
89 print_log( log );
90 delete Opt;
91 return OK;
92 }
Options.cc
7
8
9 ///
10 Options::Options() :
11 SimOptions( 0 ), OptOptions( 0 ),
12 SimulationChosed( HSPICE ), OptimizationChosed( SLOP ),
13 verbose( 0 ), manual(0), NameMosN( 0 ), NameMosP( 0 ), WorkPath( 0 )
14 {
15 print_log( "Parsing Options..." );
16 }
17
18 ///
19 Options::~Options()
20 {
21 delete[] SimOptions;
22 delete[] OptOptions;
23 }
OptionsRead.cc
9
10 ///
11 int Options::Read( const char* FileOptionsName )
12 {
13 char line[ 1024 ];
14 char opt[ 256 ];
15 char log[1024];
16 int RetCode = OK;
17 ifstream i_file( FileOptionsName );
18 if ( !i_file )
20 unsigned int NumSimOptions = 0;
21 unsigned int NumOptOptions = 0;
22 while ( OptCommandOptions[ NumOptOptions++ ] );
23 while ( SimCommandOptions[ NumSimOptions++ ] );
24 OptOptions = new double[ NumOptOptions ];
25 SimOptions = new double[ NumSimOptions ];
26 for ( unsigned int i = 0; i < NumOptOptions; i++ )
27 OptOptions[ i ] = 0;
28 for ( unsigned int i = 0; i < NumSimOptions; i++ )
29 SimOptions[ i ] = 0;
30 unsigned int Line = 0;
31 while ( i_file.getline( line, 1024 ) )
32 {
33 Line++;
34 if ( sscanf( line, "%s ", opt ) == 1 )
35 if ( opt[ 0 ] != '#' )
36 {
37 GLOBCOMMANDOPT GlobalOption = WhichGBOption( opt );
38 SIMCOMMANDOPT SimOption = WhichSimOption( opt );
39 OPTCOMMANDOPT OptOption = WhichOptOption( opt );
40 switch ( GlobalOption )
41 {
42 case VERBOSE:
43 verbose = 1;
44 print_log("Well, let's go verbose...");
45 break;
46 case MANUAL:
47 manual = 1;
48 print_log("So you think you're better than me,");
49 print_log("in calculating critical paths?...");
50 break;
51 case SIMALG:
52 RetCode = NOT_FOUND;
53 sscanf( line, "%*s %s", opt );
54 for ( unsigned int S = 0 ; ( SimAlgorithms[ S ] != 0 ); S++ )
55 if ( !strcasecmp( opt, SimAlgorithms[ S ] ) )
56 {
57 SimulationChosed = ( SimMethod ) S;
58 RetCode = OK;
59 sprintf(log, "Simulator......%s", SimAlgorithms[ S ]);
60 print_log(log);
61 }
62 break;
63 case OPTALG:
64 RetCode = NOT_FOUND;
65 sscanf( line, "%*s %s", opt );
66 for ( unsigned int O = 0; ( OptAlgorithms[ O ] != 0 ); O++ )
67 if ( !strcasecmp( opt, OptAlgorithms[ O ] ) )
68 {
69 OptimizationChosed = ( OptMethod ) O;
70 RetCode = OK;
71 sprintf(log, "Optimizer......%s", OptAlgorithms[ O ]);
72 print_log(log);
73 }
74 break;
75 case NAMEMOSN:
76 sscanf( line, "%*s %s", opt );
77 NameMosN = new char[ strlen( opt ) + 1 ];
78 if ( !NameMosN )
80 else
81 strcpy( NameMosN, opt );
82 break;
83 case NAMEMOSP:
84 sscanf( line, "%*s %s", opt );
85 NameMosP = new char[ strlen( opt ) + 1 ];
86 if ( !NameMosP )
88 else
89 strcpy( NameMosP, opt );
90 break;
91 case WORKPATH:
92 sscanf( line, "%*s %s", opt );
93 WorkPath = new char[ strlen( opt ) + 1 ];
94 if ( !WorkPath )
96 else
97 strcpy( WorkPath, opt );
98 break;
99 case NONEGLOB:
100 default:
101 break;
102 }
103 if ( SimOption != NONESIM )
104 {
105 sscanf( line, "%*s %lg", &SimOptions[ SimOption ] );
106 }
107
108 switch ( OptOption )
109 {
110 case CONSTRAINS:
111 if ( ( OptOptions[ CONSTRAINS ] == 1.0 ) || ( OptOptions[ ENDCONSTRAINS ] == 1 ) )
113 else
114 {
115 OptOptions[ CONSTRAINS ] = 1.0;
116 print_log("Hey, you mean some constraints...");
117 }
118 break;
119 case ENDCONSTRAINS:
120 if ( OptOptions[ CONSTRAINS ] == 0 )
122 else
123 OptOptions[ ENDCONSTRAINS ] = 1.0;
124 break;
125 case WEIGHTS:
126 if ( ( OptOptions[ WEIGHTS ] == 1.0 ) || ( OptOptions[ ENDWEIGHTS ] == 1 ) )
128 else
129 {
130 OptOptions[ WEIGHTS ] = 1.0;
131 print_log("Hey, you mean some weights...");
132 }
133 break;
134 case ENDWEIGHTS:

135 if ( OptOptions[ WEIGHTS ] == 0 )
137 else
138 OptOptions[ ENDWEIGHTS ] = 1.0;
139 break;
140 case WDELAY:
141 case WPOWER:
142 case WAREA:
143 case WNOISE:
144 case MAXSTEPS:
145 case ACC:
146 case WMAX:
147 case WMIN:
148 case DELTA:
149 case RISETIME:
150 case FALLTIME:
151 case MAXAREA:
152 case MAXDELAY:
153 case MAXPOWER:
154 case MAXNOISE:
155 sscanf( line, "%*s %lg", &OptOptions[ OptOption ] );
156 sprintf(log, "...%s=%g", OptCommandOptions[OptOption], OptOptions[ OptOption ]);
157 print_log(log);
158 break;
159 case NONEOPT:
160 default:
161 break;
162
163 }
164 }
165 if ( RetCode != OK )
166 {
167 sprintf( line, "ERROR reading file %s line %d ", FileOptionsName, Line );
168 print_log( line );
169 i_file.close();
170 return RetCode;
171 }
172 }
173 if ( !NameMosN )
174 {
175 NameMosN = new char[ strlen( TransistorString[ NMOS ] ) + 1 ];
176 if ( !NameMosN )
178 else
179 strcpy( NameMosN, TransistorString[ NMOS ] );
180 }
181 if ( !NameMosP )
182 {
183 NameMosP = new char[ strlen( TransistorString[ PMOS ] ) + 1 ];
184 if ( !NameMosP )
186 else
187 strcpy( NameMosP, TransistorString[ PMOS ] );
188 }
189 if ( !WorkPath )
190 {
191 WorkPath = new char[ strlen( WORKPath ) + 1 ];
192 if ( !WorkPath )
194 else
195 strcpy( WorkPath, WORKPath );
196 }
197 if ( SimulationChosed == NONESM )
198 SimulationChosed = HSPICE;
199 if ( OptimizationChosed == NONEOM )
200 OptimizationChosed = SLOP;
201 i_file.close();
202 return OK;
203 }
ReadTEch.cc
#include "tech.h"
#include "readt.h"

///
struct _TECH_STR TECH;

int ReadTech()
{
    // nmos
    TECH.Lmin = 0.25;
    TECH.u0_n = 37.2;        /** micron^2 / (Volt * ns) */
    TECH.Kp_n = 256.7916;    /** uA / V^2 */
    TECH.vmax_n = 130.7952;  /** micron / ns */
    TECH.Vtn0 = 0.5885;      /** Volt */
    TECH.epss = 0.10359;     /** fF / micron */
    TECH.q = 1.602E-4;       /** fF * Volt */
    TECH.Na = 2.679E11;      /** micron^-3 */
    TECH.gamma_n = 0.3356;
    TECH.phi_n = 0.79424;
    TECH.Cox_n = 6.903;      /** fF / micron^2 */
    TECH.C_nj = 689E-3;      /** fF / micron^2 */
    TECH.C_np = 138E-3;      /** fF / micron */
    TECH.Ec_n = 3.516;       /** Volt / micron = vmax/u0 */
    TECH.VT = 25.98E-3;      /** Volt */
    TECH.ni = 1.45E-2;       /** micron^-3 */
    TECH.Df = 0.625;         /** micron */
    TECH.Cgd0_n = 0.32;      /** fF / micron */
    TECH.Cgs0_n = 0.32;
    TECH.PB_n = 0.79424;     /** Volt */
    TECH.mj_n = 0.45495;
    TECH.mjsw_n = 0.1;
    TECH.XW_n = -0.79698;    /** micron */
    TECH.XL_n = 0;           /** micron */
    TECH.WD_n = 0.039849;    /** micron */
    TECH.LD_n = 0.0332;      /** micron */
    TECH.theta_n = 0.4314;   /** V^-1 */

    // pmos
    TECH.u0_p = 6.341;       /** micron^2 / (Volt * ns) */
    TECH.Kp_p = 30.16;       /** uA / V^2 */
    TECH.Cox_p = 6.903;      /** fF / micron^2 */
    TECH.gamma_p = 0.69468;
    TECH.phi_p = 0.79547;
    TECH.Vtp0 = -0.434;
    TECH.vmax_p = 57.6714;   /** micron / ns */
    TECH.C_pj = 596E-3;      /** fF / micron^2 */
    TECH.C_pp = 122.1E-3;    /** fF / micron */
    TECH.Ec_p = 9.095;       /** Volt / micron */
    TECH.Cgd0_p = 0.5;       /** fF / micron */
    TECH.Cgs0_p = 0.5;
    TECH.Nd = 2.8E11;        /** micron^-3 */
    TECH.PB_p = 0.79547;     /** Volt */
    TECH.mj_p = 0.36085;
    TECH.mjsw_p = 0.1;
    TECH.XW_p = -0.89852;    /** micron */
    TECH.XL_p = 0;           /** micron */
    TECH.WD_p = 0;           /** micron */
    TECH.LD_p = 0.054697;    /** micron */
    TECH.theta_p = 0.4071;   /** V^-1 */
    return OK;
}
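Some of the constants in ReadTech are tied together: the comment on Ec_n states Ec = vmax/u0, and the listed Kp_n equals u0_n times Cox_n. The following is a minimal consistency sketch under those two relations; the struct TechSketch and the helper names are hypothetical and are not part of the optimizer.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the subset of TECH fields used below.
struct TechSketch {
    double u0;    // micron^2 / (Volt * ns)
    double vmax;  // micron / ns
    double Cox;   // fF / micron^2
};

// Critical field Ec = vmax / u0, in Volt / micron.
inline double critical_field(const TechSketch& t) {
    assert(t.u0 > 0.0);
    return t.vmax / t.u0;
}

// Product u0 * Cox; fF / (Volt * ns) is the same unit as uA / V^2.
inline double transconductance(const TechSketch& t) {
    return t.u0 * t.Cox;
}

// True when a and b agree to within a relative tolerance.
inline bool close_to(double a, double b, double rel = 1e-3) {
    return std::fabs(a - b) <= rel * std::fabs(b);
}
```

With the listed values, critical_field reproduces Ec_n = 3.516 and Ec_p = 9.095 (Volt / micron), and transconductance reproduces Kp_n = 256.7916. The listed Kp_p (30.16) is not equal to u0_p * Cox_p, so the product relation is only meaningful for the nmos entry here.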
SearchCritic.cc
12
13
14 ///
15 int SearchCriticalPath(const Circuit& circuit,
16 CritPathList& pathList,
17 const NodeList& nodeListGnd,
18 const NodeList& nodeListVdd,
19 Node& nodeInputList,
20 Node& gateInternalList,
21 const Options& options)
22 {
23
24 ListNodeList* gndCPath;
25 ListNodeList* vddCPath;
26 int RetCode;
27 char log[1024];
28 unsigned int nlg = nodeListGnd.GetNumList();
29 unsigned int nlv = nodeListVdd.GetNumList();
30 print_log("Searching Critical Path...");
31 #ifdef DEBUG
32 cerr << "Searching Critical Path..." << endl;
33 #endif
34 gndCPath = 0;
35 vddCPath = 0;
36 for (unsigned int i = 0; i < nlg; i++)
37 {
38 ListNodeList* tmp = new ListNodeList;
39 if (!tmp)
40 return NO_MEM;
41 tmp->next = gndCPath;
42 gndCPath = tmp;
43 for (unsigned int j = 0; j < nodeInputList.GetNumNode(); j++)
44 nodeInputList[j].flag = -1;
45 for (unsigned int j = 0; j < gateInternalList.GetNumNode(); j++)
46 gateInternalList[j].flag = -1;
47 RetCode = SearchCPRecurse(gndCPath, circuit.ValimNode(), i, nodeListGnd, nodeListVdd, nodeInputList, gateInternalList, 0);
48 if ((RetCode != OK) && (RetCode != CONT))
49 return RetCode;
50 }
51 sprintf(log, "found first critical paths (gnd)...");
52 print_log(log);
53 for (unsigned int i = 0; i < nlv; i++)
54 {
56 if (!tmp)
57 return NO_MEM;
58 tmp->next = vddCPath;
59 vddCPath = tmp;
60 for (unsigned int j = 0; j < nodeInputList.GetNumNode(); j++)
61 nodeInputList[j].flag = -1;
62 for (unsigned int j = 0; j < gateInternalList.GetNumNode(); j++)
63 gateInternalList[j].flag = -1;
64 RetCode = SearchCPRecurse(vddCPath, circuit.ValimNode(), i, nodeListVdd, nodeListGnd, nodeInputList, gateInternalList, 0);
65 if ((RetCode != OK) && (RetCode != CONT))
66 return RetCode;
67 }
68 sprintf(log, "found the other critical paths (vdd)...");
69 print_log(log);
70 ListNodeList* tmp = gndCPath;
72 for (unsigned int gcount = 0; gcount < 2; gcount ++)
73 {
74 while (tmp)
75 {
76 #ifdef DEBUG
77 cerr << endl << "-------------> CP " << count << endl;
78 #endif
79 count++;
80 unsigned int nl = (tmp->NL).GetNumList();
81 if (nl)
82 {
83 if ((RetCode = pathList.Create()) != OK)
84 return RetCode;
85 unsigned int first_node = (((tmp->NL)[0])[0]).node;
86 unsigned int nn = ((tmp->NL)[nl - 1]).GetNumNode();
87 unsigned int output = (((tmp->NL)[nl - 1])[nn - 1]).node;
88 TransitionType Tr_in;
89 TransitionType Tr_out;
90 if (first_node == 0)
91 Tr_in = RISE;
92 else
93 Tr_in = FALL;
94 if (nl % 2)
95 Tr_out = (Tr_in == RISE ? FALL : RISE);
96 else
97 Tr_out = (Tr_in == RISE ? RISE : FALL);
98 OPTCOMMANDOPT TRin = (Tr_in == RISE ? RISETIME : FALLTIME);
99 RetCode = pathList.InsertNodeOut(output, Tr_out);
101 return RetCode;
102 unsigned int pos;
103 unsigned int first_input = 0;
104 double val;
105 double val_n;
107 {
108 (nodeInputList[i]).flag = -1;
109 }
111 {
112 first_node = (((tmp->NL)[i])[0]).node;
114 {
115 val = circuit.Valim();
116 val_n = 0;
117 }
118 else
119 {
120 val = 0;
121 val_n = circuit.Valim();
122 }
123 nn = ((tmp->NL)[i]).GetNumNode();
124 // set initial condition
125 unsigned int last_l_node = (((tmp->NL)[i])[nn - 1]).node;
126 if (i < nl - 1)
127 {
128 RetCode = pathList.InsertInitialCondition(last_l_node, val);
130 return RetCode;
131 }
132 for (unsigned int j = 1; j < nn; j = j + 2)
133 {
134 unsigned int input = (((tmp->NL)[i])[j]).node;
135 #ifdef DEBUG
136 cerr << endl << "input " << input;
137 #endif
138 if (IsIn(input, nodeInputList, pos) == OK)
139 {
140 #ifdef DEBUG
141 cerr << " primary input (" << pos << ")";
142 #endif
143 if (first_input == 0)
144 {
145 first_input = input;
146 RetCode = pathList.InsertNodeIn(first_input, Tr_in, options.GetOptOption(TRin));
148 return RetCode;
149 #ifdef DEBUG
150 cerr << " INPUT";
151 #endif
152 }
153 else
154 {
155 if ((nodeInputList[pos]).flag == -1)
156 {
157 RetCode = pathList.InsertActiveInputs(input, val);
159 return RetCode;
160 #ifdef DEBUG
161 cerr << " ACTIVE IN " << val;
162 #endif
163 }
164 else
165 {
167 {
168 if ((nodeInputList[pos]).flag != int(circuit.ValimNode()))
170 }
171 else
172 {
173 if ((nodeInputList[pos]).flag != 0)
175 }
176 }
177 }
178 (nodeInputList[pos]).flag = (first_node == 0 ? circuit.ValimNode() : 0);
179 }
180 if (i > 0)
181 {
182 unsigned int nn2 = ((tmp->NL)[i - 1]).GetNumNode();
183 last_l_node = (((tmp->NL)[i - 1])[nn2 - 1]).node;
184 }
185 else
186 last_l_node = 0;
187 if (IsIn(input, gateInternalList, pos) == OK)
188 {
189 #ifdef DEBUG
190 cerr << " internal gate (" << pos << " last " << last_l_node << ")";
191 #endif
192 if (last_l_node != input)
193 {
194 if (first_input == 0)
195 {
196 first_input = input;
197 RetCode = pathList.InsertNodeIn(first_input, Tr_in, options.GetOptOption(TRin));
199 return RetCode;
200 #ifdef DEBUG
201 cerr << " INPUT INTERNAL";
202 #endif
203 }
204 else
205 {
206 RetCode = pathList.InsertActiveInputs(input, val);
208 return RetCode;
209 #ifdef DEBUG
210 cerr << " ACTIVE IN INTERNAL " << val;
211 #endif
212
213 }
214 }
215 }
216 unsigned int drain = (((tmp->NL)[i])[j - 1]).node;
217 unsigned int source = (((tmp->NL)[i])[j + 1]).node;
218 unsigned int nt = circuit.GetNTran();
219 for (unsigned int k = 0; k < nt; k++)
220 {
221 if ( (circuit[k]).Gate() == input)
222 {
223 if ( (((circuit[k]).Drain() == drain) &&
224 ((circuit[k]).Source() == source)) ||
225 (((circuit[k]).Drain() == source) &&
226 ((circuit[k]).Source() == drain)))
227 {
228 RetCode = pathList.InsertPathTransistor((circuit[k]).DevName(), (circuit[k]).TrType(), i);
230 return RetCode;
231 }
232 }
233 }
234 }
235 }
237 {
238 if ((nodeInputList[i]).flag == -1)
239 {
240 unsigned int noActiveNode = (nodeInputList[i]).node;
241 double noActiveSupply;
242 if (gcount == 0) // CP starting with GND
243 noActiveSupply = 0;
244 else // CP starting with VDD
245 noActiveSupply = circuit.Valim();
246 #ifdef DEBUG
247 cerr << endl << "no active input "
248 << noActiveNode << " = " << noActiveSupply ;
249 #endif
250 RetCode = pathList.InsertNoActiveInputs(noActiveNode, noActiveSupply);
252 return RetCode;
253 }
254 }
255 if ((RetCode = pathList.Stamp(nl)) != OK)
256 return RetCode;
257 }
258 tmp = tmp->next;
259 }
260 tmp = vddCPath;
261 }
262 sprintf(log, "found total %u critical paths...", pathList.GetNumPath());
263 print_log(log);
264 return OK;
265 }
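In SearchCriticalPath the output transition is derived from the input transition and the number of node lists nl on the path: an odd nl flips the edge and an even nl preserves it. A minimal sketch of that parity rule, assuming each list corresponds to one inverting stage; the enum and function names are illustrative only.

```cpp
#include <cassert>

enum class Edge { Rise, Fall };  // illustrative counterpart of TransitionType

// Each inverting stage flips the edge, so only the parity of the
// stage count matters.
inline Edge output_edge(Edge in, unsigned int stages) {
    const Edge flipped = (in == Edge::Rise) ? Edge::Fall : Edge::Rise;
    return (stages % 2 != 0) ? flipped : in;
}
```

For example, a rising input through three stages gives a falling output, while the same input through two stages stays rising.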
SearchCriticRecurse.cc
12
13
14 ///
15 int SearchCPRecurse(ListNodeList* CPath,
16 unsigned int valnode,
17 unsigned int index,
18 const NodeList& nodeListFirst,
19 const NodeList& nodeListSecond,
21 Node& gateInternalList,
22 unsigned int ilevel)
23 {
24
25 int RetCode;
26 unsigned int level = ilevel;
27 unsigned int i, j;
28 unsigned int nn = (nodeListFirst[index]).GetNumNode();
29 level++;
30 #ifdef DEBUG
31 cerr << endl;
32 for (i = 0; i < level; i++)
33 cerr << " ";
34 cerr << "(" << level << ") " << index << " - ";
35 #endif
36 unsigned int first_node = ((nodeListFirst[index])[0]).node;

37 // first node = gnd or vdd
38 unsigned int last_node = ((nodeListFirst[index])[nn - 1]).node;
39 int neg_node;
41 neg_node = valnode;
42 else
43 neg_node = 0;
44 for (i = 1; i < nn; i = i + 2)
45 {
46 unsigned int pos = 0;
47 unsigned int tmp_node = ((nodeListFirst[index])[i]).node;
48 if ( IsIn(tmp_node, nodeInputList, pos) == OK)
49 {
50 int flag = (nodeInputList[pos]).flag;
51 if ( flag == -1)
52 {
53 (nodeInputList[pos]).flag = neg_node;
54 }
55 else
56 {
57 if (flag != neg_node)
58 return OK;
59 }
60 }
61 else if ( IsIn(tmp_node, gateInternalList, pos) == OK)
62 {
63 int flag = (gateInternalList[pos]).flag;
64 if (flag == -1)
65 {
66 if (level > 1)
67 {
68 // very bastard inside
69 if ( SearchOKCond(tmp_node, valnode, nodeListFirst, nodeListSecond, nodeInputList, gateInternalList) == NOT_FOUND)
70 {
71
72 if ( SearchOKCond(tmp_node, valnode, nodeListSecond, nodeListFirst, nodeInputList, gateInternalList) == NOT_FOUN
73 (gateInternalList[pos]).flag = neg_node;
74 else
75 return OK;
76 }
77 else
79 }
80 else
82 }
83 else
84 {
86 return OK;
87 }
88 }
89 }
90 if ( (RetCode = (CPath->NL).Create()) != OK)
91 return RetCode;
92 for (unsigned int ii = 0; ii < nn; ii++)
93 {
94 if ((RetCode = (CPath->NL).InsertNode(((nodeListFirst[index])[ii]).node)) != OK)
95 return RetCode;
96 }
97 unsigned int nl = nodeListSecond.GetNumList();
98 for (i = 0; i < nl; i++)
99 {
100 nn = (nodeListSecond[i]).GetNumNode();
101 j = 1;
102 unsigned int found = 0;
103 while ((j < nn) && (!found))
104 {
105 unsigned int try_node = ((nodeListSecond[i])[j]).node;

106 if (try_node == last_node)
107 found = j;
108 j = j + 2;
109 }
110 if (found)
111 {
112 int RecurseCode = SearchCPRecurse(CPath, valnode, i, nodeListSecond, nodeListFirst, nodeInputList, gateInternalList, level);
113 if (RecurseCode == OK)
114 {
116 if (!tmp)
117 return NO_MEM;
118 tmp->next = CPath;
119 CPath = tmp;
120 unsigned int n_l = ((CPath->next)->NL).GetNumList();
121 for (unsigned int jj = 0; jj < n_l - 1; jj++)
122 {
123 if ((RetCode = (CPath->NL).Create()) != OK)
124 return RetCode;
125 unsigned int n_n = (((CPath->next->NL))[jj]).GetNumNode();
126 for (unsigned int k = 0; k < n_n; k++)
127 (CPath->NL).InsertNode((((CPath->next->NL)[jj])[k]).node );
128 }
129 }
130 else if (RecurseCode == CONT)
131 {
132 //if((RetCode = (CPath->NL).DeleteList(level)) != OK)
133 // return RetCode;
134 }
135
136 else
137 return RecurseCode;
138 }
139 }
140 level--;
141 if (level == 0)
142 {
143 #ifdef DEBUG
144 unsigned int pp = (CPath->NL).GetNumList();
145 cerr << endl << " NUMLIST " << pp << " --- ";
146 for (unsigned int i = 0; i < pp; i++)
147 {
148 unsigned int pn = ((CPath->NL)[i]).GetNumNode();
149 for (unsigned int j = 0; j < pn; j++)
150 cerr << " " << (((CPath->NL)[i])[j]).node;
151 cerr << " / ";
152 }
153 #endif
154 }
155 return CONT;
156 }
SearchOkCond.cc
12
13 ///
14 int SearchOKCond(unsigned int node,
15 unsigned int valnode,
16 const NodeList& nodeListFirst,

17 const NodeList& nodeListSecond,
19 Node& gateInternalList)
20 {
21 unsigned int nl = nodeListSecond.GetNumList();
22 unsigned int pos = 0;
23 unsigned int first_node = ((nodeListSecond[0])[0]).node;
24 // first node = gnd or vdd
25 int neg_node;
27 neg_node = valnode;
28 else
29 neg_node = 0;
31 {
32 unsigned int nn = (nodeListSecond[i]).GetNumNode();
33 unsigned int last_node = ((nodeListSecond[i])[nn - 1]).node;
34 if (node == last_node)
35 {
36 for (unsigned int j = 1; j < nn; j = j + 2)
37 {
38 unsigned int tmp_node = ((nodeListSecond[i])[j]).node;
39 if ( IsIn(tmp_node, nodeInputList, pos) == OK)
40 {
41 int flag = (nodeInputList[pos]).flag;
42 if ( flag == -1)
43 {
44 (nodeInputList[pos]).flag = neg_node;
45 }
46 else
47 {
50 }
51 }
52 else if ( IsIn(tmp_node, gateInternalList, pos) == OK)
53 {
54 int flag = (gateInternalList[pos]).flag;
55 if (flag == -1)
56 {
57 int RecurseCode = SearchOKCond(tmp_node, valnode, nodeListSecond, nodeListFirst, nodeInputList, gateInternalList
58 if (RecurseCode == NOT_FOUND)
60 }
61 else
62 {
65 }
66 }
67 }
68 }
69 }
70 return OK;
71 }
TransistorList.cc

///
TransistorList::TransistorList() : NumTran( 0 ), head( 0 ), tail( 0 )
{}

///
TransistorList::~TransistorList()
{
    TransistorNode* tmp;
    while ( head )
    {
        tmp = head->next;
        delete head;
        head = tmp;
    }
}

///
const TransistorNode& TransistorList::operator[]( unsigned int index ) const
{
    // valid indices are 0 .. NumTran - 1
    if ( index >= NumTran )
        error( NOT_FOUND, 0, "Index out of bound in [TransistorList]..." );
    TransistorNode* tmp = head;
    while ( tmp )
    {
        if ( tmp->Index() == index )
            return *tmp;
        tmp = tmp->next;
    }
    // tmp is null here: report instead of silently dereferencing it
    error( NOT_FOUND, 0, "Index not found in [TransistorList]..." );
    return *tmp; // not reached if error() exits on a nonzero status
}

///
const TransistorNode& TransistorList::operator[]( const char* name ) const
{
    TransistorNode* tmp = head;
    while ( tmp )
    {
        if ( !strcasecmp( name, tmp->Name ) )
            return *tmp;
        tmp = tmp->next;
    }
    // tmp is null here: report instead of silently dereferencing it
    error( NOT_FOUND, 0, "Name not found in [TransistorList]..." );
    return *tmp; // not reached if error() exits on a nonzero status
}

///
TransistorNode& TransistorList::operator[]( unsigned int index )
{
    // valid indices are 0 .. NumTran - 1
    if ( index >= NumTran )
        error( NOT_FOUND, 0, "Index out of bound in [TransistorList]..." );
    TransistorNode* tmp = head;
    while ( tmp )
    {
        if ( tmp->Index() == index )
            return *tmp;
        tmp = tmp->next;
    }
    // tmp is null here: report instead of silently dereferencing it
    error( NOT_FOUND, 0, "Index not found in [TransistorList]..." );
    return *tmp; // not reached if error() exits on a nonzero status
}

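The lookup operators of TransistorList walk the list until a node with the requested index is found, which leaves open what to return when nothing matches. A minimal sketch of the same head/tail list with a checked lookup follows; it swaps the error() call for a C++ exception, and the names NodeSketch, ListSketch and at() are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>

// Illustrative node: only the fields needed for the lookup.
struct NodeSketch {
    unsigned int index;
    NodeSketch* next;
};

// Minimal singly linked list with O(1) append at the tail.
class ListSketch {
public:
    ListSketch() : count_(0), head_(nullptr), tail_(nullptr) {}
    ~ListSketch() {
        while (head_) {
            NodeSketch* next = head_->next;
            delete head_;
            head_ = next;
        }
    }
    ListSketch(const ListSketch&) = delete;
    ListSketch& operator=(const ListSketch&) = delete;

    // Append a node whose index is its insertion position.
    void append() {
        NodeSketch* n = new NodeSketch{count_, nullptr};
        if (tail_) tail_->next = n; else head_ = n;
        tail_ = n;
        ++count_;
    }

    unsigned int size() const { return count_; }

    // Checked lookup: valid indices are 0 .. size() - 1.
    const NodeSketch& at(unsigned int index) const {
        if (index >= count_) throw std::out_of_range("ListSketch::at");
        for (const NodeSketch* p = head_; p; p = p->next)
            if (p->index == index) return *p;
        throw std::out_of_range("ListSketch::at: index not present");
    }

private:
    unsigned int count_;
    NodeSketch* head_;
    NodeSketch* tail_;
};
```

Throwing on both failure paths means no code path ever dereferences a null pointer, whatever the caller passes in.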
TransistorListInsert.cc

///
int TransistorList::Insert( const char *name, double w, double l, TransistorType t, unsigned int s, unsigned int g, unsigned int d )
{
    // the TransistorNode constructor takes source, drain, gate in that order
    TransistorNode* tmp = new TransistorNode( name, w, l, t, s, d, g, NumTran );
    if ( !tmp )
        return NO_MEM;
    tmp->next = 0;
    if ( !head )
        head = tmp;
    else
        tail->next = tmp;
    tail = tmp;
    NumTran++;
    return OK;
}
TransistorNode.cc
7
8
9 ///
10 TransistorNode::TransistorNode( const char *name, double w, double l, TransistorType t, unsigned int s, unsigned int d, unsigned int g, unsigned int index ) :
11 Name( 0 ), width( w ), length( l ), type( t ),
12 source( s ), drain( d ), gate( g ), hashindex( index ), next( 0 )
13 {
14 Name = new char[ strlen( name ) + 1 ];
15 if ( !Name )
16 {
17 print_log( "FATAL ERROR" );
19 error( NO_MEM, errno, "PANIC! " );
20 }
21 strcpy( Name, name );
22 }
23
24 ///
25 TransistorNode::~TransistorNode()
26 {
27 delete[] Name;
28 }
main.cc
5 #include <signal.h>
15 #include "hspice.h"
16 #include "fast.h"
17 #include "test.h"
19 #include "readt.h"
20
21
22 ///
23 extern char *optarg;
24 extern int optind;
25
26 ///
27 int main ( int argc, char **argv )
28 {
29 int c;
30 char *FileIn;
31 char *FileOptions;
32
33 time_t tm = time( 0 );
34 signal(SIGTERM, catch_stop);
35 signal(SIGINT, catch_stop);
36 char log[ 256 ];
37 print_log( "\n*************************" );
38 sprintf( log, "%s Version: %s Copyright MFD 1998 ", argv[ 0 ], VERSION );
40 print_log( "*************************" );
41 print_log( ctime( &tm ) );
42 // some default initialization
43 FileOptions = 0;
44 while ( ( c = getopt( argc, argv, "hf:t:" ) ) != -1 )
45 {
46 switch ( c )
47 {
48 case 'h': //HELP
49 return print_help( argv[ 0 ] );
50 break;
51 case 'f':
52 FileOptions = new char[ strlen( optarg ) + 1 ];
53 if ( !FileOptions )
54 {
58 }
59 strcpy( FileOptions, optarg );
60 break;
61 case '?':
62 default:
64 break;
65 }
66 }
67
68 if ( ( argc - optind ) != 1 )
70 else
71 {
72 FileIn = new char[ strlen( argv[ optind ] ) + 1 ];
73 if ( !FileIn )
74 {
78 }
79 strcpy( FileIn, argv[ optind ] );
80 if ( !FileOptions )
81 {
82 FileOptions = new char[ strlen( "options.conf" ) + 1 ];
83 strcpy( FileOptions, "options.conf" );
84 }
85 }
86 Options options;
87 int RetCode;
88 if ( ( RetCode = options.Read( FileOptions ) ) )

89 {
90 print_log( "Error reading options file:" );
93 }
94 Circuit circuit( FileIn, options );
95 CritPathList pathList;
96 if (options.Manual() == 0)
97 {
98 RetCode = Critic(circuit, pathList, options);
100 {
101 print_log( "Error searching critical paths:" );
104 }
105 }
106 else
107 {
108 if ( ( RetCode = pathList.Read( FileOptions, circuit ) ) )
109 {
113 }
114 }
115 if ( ( RetCode = ReadTech() ) )
116 {
120 }
121
122 EvaluationAlgorithm* simulation;
123 switch ( options.WhichSimAlgorithm() )
124 {
125 case HSPICE:
126 simulation = new Hspice( pathList, options, FileIn );
127 if ( mkdir( options.Workpath(), 0770 ) )
128 {
129 if ( errno != EEXIST )
130 {
131 error( NOT_FOUND, errno, "HEY! " );
132 }
133 }
134 break;
135 case FAST:
136 simulation = new Fast( pathList, options );
137 break;
138 case TESTOPT:
139 simulation = new TestOpt( pathList, options);
140 break;
141 case NONESM:
142 default:
144 break;
145 }
146 double* LastWidth = new double[ circuit.GetNTran() ];
147 if ( !simulation || !LastWidth )
148 {
149 print_log( "FATAL ERROR" );
152 }
153 print_init( circuit, options, pathList.GetNumPath() );
154 if ( ( RetCode = Optimize( circuit, options, *simulation, LastWidth ) ) )
155 {
156 print_log( "Error in optimizing..." );

159 }
160 print_log("Writing optimized netlist...");
161 print_final(FileIn, circuit, pathList.GetNumPath(), *simulation, LastWidth );
162 print_log( "Time to die..." );
163 delete[] FileIn;
164 delete[] FileOptions;
165 delete[] LastWidth;
166 delete simulation;
167 return OK;
168 }
169
170 ///
171 int print_help( const char *name )
172 {
173 cerr << "Usage: " << name << " [-f FILEOPTIONS] Netlist_file" << endl;
174 cerr << " Where -f FILEOPTIONS = file containing general option (default = options.conf) " << endl;
175 return OK;
176 }
nrutil.cc
#include "nrutil.h"

///
const int NR_END = 1;
#define FREE_ARG char*

///
/* allocate a double vector with subscript range v[nl..nh] */
double *dvector ( long nl, long nh )
{
    double * v;
    v = new double[ nh - nl + 1 + NR_END ];
    if ( !v )
    {
        //error(NO_MEM, errno, "HEY! ");
        return 0;
    }
    return v - nl + NR_END;
}

///
/* allocate a double matrix with subscript range m[nrl..nrh][ncl..nch] */
double **dmatrix ( long nrl, long nrh, long ncl, long nch )
{
    long i, nrow = nrh - nrl + 1, ncol = nch - ncl + 1;
    double **m;
    m = new double * [ nrow + NR_END ];

    if ( !m )
    {
        return 0;
    }
    m += NR_END;
    m -= nrl;
    m[ nrl ] = new double[ nrow * ncol + NR_END ];

    if ( !m[ nrl ] )
    {
        return 0;
    }
    m[ nrl ] += NR_END;
    m[ nrl ] -= ncl;
    for ( i = nrl + 1; i <= nrh; i++ )
        m[ i ] = m[ i - 1 ] + ncol;
    return m;
}

///
/* valid only for vectors allocated with nl == NR_END (1-based), because only
   then does the pointer returned by dvector equal the allocated one */
void free_dvector ( double *v )
{
    delete[] v;
}

///
/* valid only for matrices allocated with nrl == ncl == NR_END (1-based) */
void free_dmatrix ( double **m )
{
    delete[] m[ 1 ];
    delete[] m;
}
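dvector hands back a pointer shifted so that v[nl] is the first element, while free_dvector receives only that shifted pointer and cannot undo the shift. The sketch below shows an alternative that keeps the lower bound next to the storage, so any nl is handled and the memory is released automatically. It is a swapped-in technique based on std::vector, not the Numerical Recipes scheme, and the class name OffsetVector is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Vector addressable as v[nl..nh]; the index shift is applied on access.
class OffsetVector {
public:
    OffsetVector(long nl, long nh) : nl_(nl), data_(length(nl, nh), 0.0) {}

    // Element access with the subscript range checked in debug builds.
    double& operator[](long i) {
        assert(i >= nl_ && i <= high());
        return data_[static_cast<std::size_t>(i - nl_)];
    }

    long low() const { return nl_; }
    long high() const { return nl_ + static_cast<long>(data_.size()) - 1; }

private:
    // Number of elements in the closed range [nl, nh].
    static std::size_t length(long nl, long nh) {
        assert(nh >= nl);
        return static_cast<std::size_t>(nh - nl + 1);
    }

    long nl_;
    std::vector<double> data_;
};
```

A caller that needs the range 3..7 writes `OffsetVector v(3, 7); v[3] = 1.5;` and never has to call a matching free function.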
print_final.cc
12
13 ///
14 void print_final(const char* FileNetList, const Circuit& circuit, unsigned int NP, EvaluationAlgorithm& simulation, double* LastWidth )
15 {
17 char log[ 1024 ];
18 print_log( "Final Dimensions: " );
20 {
21 int pos = circuit.TranPos( circuit[ i ].DevName() );
22 sprintf( log, "W[%s] = %3.3gu", circuit[ i ].DevName(), LastWidth[ pos ] );
24 }
25 print_log( "Final critical paths: " );
27 {
28 sprintf( log, "%u) Delay=%g ps, Energy=%g pJ, Noise=%g", i,
29 simulation.GetDelay( i ),
30 simulation.GetPower( i ),
31 simulation.GetNoise( i ) );
33 }
34 sprintf( log, "Area=%g ", simulation.GetArea() );
35 print_log(log);
36 char line[ 1024 ];
37 char line2[ 1024 ];
38 char* FileNetOut;
39 FileNetOut = new char[ strlen( FileNetList ) + strlen( NetListSuffix ) + 1 ];
40 if ( !FileNetOut )
41 {

45 }
46 strcpy( FileNetOut, FileNetList );
47 strcat( FileNetOut, NetListSuffix );
48 ifstream i_file( FileNetList );
49 ofstream o_file( FileNetOut );
50 if ( !i_file || !o_file)
51 {
53 print_log( ReturnMessage[ NOT_FOUND ] );
55 }
57 {
58 int c = 0;
59 while ( isspace( line[ c++ ] ) );
60 switch ( line[ --c ] )
61 {
62 case 'm':
63 case 'M':
64 case 'x':
65 case 'X':
66 char tmpstr[ 128 ];
68 char par[ 16 ];
69 char type[ 16 ];
70 char endpar[ 16 ];
71 char mos[ 8 ];
72 unsigned int n1, n2, n3, n4;
73 strcpy( parsestr, "%s %u %u %u %u %s" );
74 if ( sscanf( line, parsestr, mos, &n1, &n2, &n3, &n4, type ) == 6 )
75 {
76 sprintf( line2, "%s %u %u %u %u %s ", mos, n1, n2, n3, n4, type );
77 strcpy( parsestr, "%*s %*u %*u %*u %*u %*s" );
78 strcpy( tmpstr, parsestr );
80 while ( sscanf( line, parsestr, par ) == 1 )
81 {
83 while ( isspace( par[ count++ ] ) );
84 count--;
85 if ((par[count] == 'w') || (par[count] == 'W'))
86 {
87 double W = LastWidth[circuit[mos].Index()];
88 sprintf(endpar, " w=%gu ", W);
89 strcat(line2, endpar);
90 }
91 else
92 {
93 strcat(line2, " ");
94 strcat(line2, par);
95 }
99 }
100 }
101 else
102 {
104 print_log( ReturnMessage[ NOT_FOUND ] );
106 }
107 break;
108 default:
109 strcpy(line2, line);
110 break;
111 }
112 o_file << line2 << endl;
113 }
114 i_file.close();
115 o_file.close();
116 }
print_init.cc
11
12 ///
13 void print_init( const Circuit& circuit, const Options& options, unsigned int NP )
14 {
16 char log[ 1024 ];
17 sprintf( log, "Circuit: %u Transistor || %u Critical paths", n, NP );
19 print_log( "Initial Dimensions: " );
21 {
22 sprintf( log, "W[%s] = %3.3gu", circuit[ i ].DevName(), circuit[ i ].Width() );
24 }
25 }
print_log.cc

///
void print_log( const char *OutString )
{
    // append one line to the log file; a failed open is silently ignored
    ofstream o_file( "OPT.log", ios::app );
    if ( o_file )
    {
        o_file << OutString << endl;
        o_file.close();
    }
}
signal.cc
#include <signal.h>

///
void catch_stop(int n)
{
    if (n == SIGTERM)
        cerr << endl << "TERM" << endl;
    if (n == SIGINT)
    {
        cerr << endl << "TERM2" << endl;
        exit(0);
    }
}
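catch_stop writes to cerr and calls exit() from inside the handler; neither operation is async-signal-safe. A common alternative is sketched here, under the assumption that the optimization loop can poll a flag between steps. The names stop_requested, record_stop and install_stop_handlers are hypothetical.

```cpp
#include <cassert>
#include <csignal>

// Set by the handler, polled by the main loop.
static volatile std::sig_atomic_t stop_requested = 0;

// Handler that only records the signal number: no I/O, no exit().
extern "C" void record_stop(int signum) {
    stop_requested = signum;
}

// Install the handler for SIGTERM and SIGINT.
inline void install_stop_handlers() {
    std::signal(SIGTERM, record_stop);
    std::signal(SIGINT, record_stop);
}
```

The loop that drives the optimizer would then test `stop_requested != 0` after each step and leave through the normal cleanup path, so the log message and the final netlist can still be written.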
B.2 Optimization algorithms

Slop.cc
6 #include "slop.h"
7
8 ///
9 Slop::Slop( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation )
10 :
11 OptimizationAlgorithm( circuit, options, simulation ),
12 TMat(0), TMatOld(0), GMat(0), T(0), SaveW(0)
13 {
14 print_log( "Creating Slop instance..." );
15 TMat = new double * [ NumPath ];
16 TMatOld = new double * [ NumPath ];
17 GMat = new double * [ NumPath ];
18 T = new double [ NumPath ];
19 SaveW = new double[ NumTran ];
20 if ( ( !TMat ) || ( !TMatOld ) || ( !GMat ) || ( !T ) || ( !SaveW ) )
21 {
25 }
27 {
28 TMat[ i ] = new double [ NumTran ];
29 TMatOld[ i ] = new double [ NumTran ];
30 GMat[ i ] = new double [ NumTran ];
31 if ( ( !TMat[ i ] ) || ( !TMatOld[ i ] ) || ( !GMat[ i ] ) )
32 {
36 }
37 }
38 }
39
40 ///
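Slop::~Slop()
{
    for ( unsigned int i = 0; i < NumPath; i++ )
    {
        delete[] TMat[ i ];
        delete[] TMatOld[ i ]; // rows of TMatOld are allocated in the constructor too
        delete[] GMat[ i ];
    }
    delete[] TMat;
    delete[] TMatOld;
    delete[] GMat;
    delete[] T;
    delete[] SaveW;
}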
53
54 ///
55 int Slop::Run()
56 {
57 int Wbig;
58 int RetCode;
60 SaveW[ i ] = Width[ i ];
61 double max = 0;
62 double TMax = 0;
63 unsigned int jmax;
64 double dummy;
65 unsigned int end_acc = 0;
66 for ( Steps = 1; ( Steps < options.GetOptOption( MAXSTEPS ) ) \
67 && (max >= 0.0) \
68 && ( InternalSteps < options.GetOptOption( MAXSTEPS ) ) \

69 && (end_acc == 0); Steps++ )
70 {
71
73 Width[ i ] = SaveW[ i ];
74 dummy = SlopNormSim( Width, RetCode);
75 if ((RetCode != OK) && (RetCode != CONT) && (RetCode != MAX_STEPS) && (RetCode != END_ACC))
76 return RetCode;
77 else if ((RetCode == MAX_STEPS) || (RetCode == END_ACC))
78 end_acc = 1;
79 TMax = 0.0;
80 jmax = 0;
82 {
83 T[ i ] = CPDelay[ i ];
84 for ( unsigned int j = 0; j < NumTran; j++ )
85 TMatOld[ i ][ j ] = T[ i ];
86 if ( T[ i ] > TMax )
87 {
88 TMax = T[ i ];
89 jmax = i;
90 }
91 }
93 {
94 Width[ i ] += options.GetOptOption( DELTA );
95 if (options.GetOptOption( WMAX ) > 0)
96 if ( Width[ i ] >= options.GetOptOption( WMAX ) )
97 Width[ i ] = options.GetOptOption( WMAX );
100 return RetCode;
102 end_acc = 1;
104 if ( options.GetOptOption( CONSTRAINS ) )
105 {
106 if ( ( CPDelay[ i ] <= options.GetOptOption( MAXDELAY ) ) &&
107 ( CPPower[ i ] <= options.GetOptOption( MAXPOWER ) ) &&
108 ( CPNoise[ i ] <= options.GetOptOption( MAXNOISE ) ) &&
109 ( Area <= options.GetOptOption( MAXAREA ) ) )
110 for ( unsigned int j = 0; j < NumPath; j++ )
111 {
112 T[ j ] = CPDelay[ j ];
113 TMat[ j ][ i ] = T[ j ];
114 }
115 }
116 else
118 {
119 T[ j ] = CPDelay[ j ];
120 TMat[ j ][ i ] = T[ j ];
121 }
122 }
123 Wbig = -1;
124 max = 0.0;
126 {
128 {
129 GMat[ i ][ j ] = TMatOld[ i ][ j ] - TMat[ i ][ j ];
130 if ( ( GMat[ i ][ j ] > max ) && ( i == jmax ) )
131 {
132 max = GMat[ i ][ j ];
133 Wbig = j;
134 }
135 TMatOld[ i ][ j ] = TMat[ i ][ j ];
136 }
137 }
138 if ( Wbig != -1 )
139 SaveW[ Wbig ] += options.GetOptOption( DELTA );
140 else
141 max = -1.0; // so max < 0
142 }
143
145 {
147 }
149 return RetCode;
150 }
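Slop::Run appears to follow a greedy sensitivity scheme: starting from the saved widths, each transistor is widened in turn by DELTA, the delays are re-evaluated, and the width whose increment most reduces the delay of the slowest path is kept. The sketch below illustrates that idea on a single path with a toy delay model supplied by the caller. The function names, the std::function interface and the model itself are assumptions, and no simulator is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using DelayFn = std::function<double(const std::vector<double>&)>;

// Greedy sizing: per step, keep the single +delta that reduces the delay most.
// Returns the number of accepted steps; stops early when no increment helps.
inline unsigned int greedy_size(std::vector<double>& w, double delta,
                                unsigned int max_steps, const DelayFn& delay) {
    assert(delta > 0.0);
    unsigned int accepted = 0;
    for (unsigned int step = 0; step < max_steps; ++step) {
        const double base = delay(w);
        double best_gain = 0.0;
        std::size_t best = w.size();           // w.size() means "none found"
        for (std::size_t j = 0; j < w.size(); ++j) {
            const double saved = w[j];
            w[j] = saved + delta;              // trial increment
            const double gain = base - delay(w);
            w[j] = saved;                      // undo the trial
            if (gain > best_gain) { best_gain = gain; best = j; }
        }
        if (best == w.size()) break;           // no width improves the delay
        w[best] += delta;
        ++accepted;
    }
    return accepted;
}

// Toy two-stage delay: each stage is load / drive, and the second gate
// loads the first one.
inline double toy_delay(const std::vector<double>& w) {
    const double c_load = 10.0;
    return (1.0 + w[1]) / w[0] + c_load / w[1];
}
```

Each step costs one evaluation per transistor, which is why the real optimizer benefits from a fast evaluation model when the simulator is expensive.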
SlopNorm.cc
6 #include "slop.h"
7
8
9 ///
10 double Slop::SlopNormSim( const double* NewWidth, int& RetCode)
11 {
12 return NormSim( NewWidth, RetCode);
13 }
Slop2.cc
7
8 ///
9 Slop2::Slop2( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation )
10 :
11 OptimizationAlgorithm( circuit, options, simulation ),
12 TMat(0), TMatOld(0), GMat(0), SaveW(0)
13 {
14 print_log( "Creating Slop2 instance..." );
15 TMat = new double * [ NumPath ];
16 TMatOld = new double * [ NumPath ];
17 GMat = new double * [ NumPath ];
18 SaveW = new double[ NumTran ];
19 if ( ( !TMat ) || ( !TMatOld ) || ( !GMat ) || ( !SaveW ) )
20 {
24 }
26 {
27 TMat[ i ] = new double [ NumTran ];
28 TMatOld[ i ] = new double [ NumTran ];
29 GMat[ i ] = new double [ NumTran ];
30 if ( ( !TMat[ i ] ) || ( !TMatOld[ i ] ) || ( !GMat[ i ] ) )
31 {
35 }
36 }
37 }
38
39 ///
Slop2::~Slop2()
{
    for ( unsigned int i = 0; i < NumPath; i++ )
    {
        delete[] TMat[ i ];
        delete[] TMatOld[ i ]; // rows of TMatOld are allocated in the constructor too
        delete[] GMat[ i ];
    }
    delete[] TMat;
    delete[] TMatOld;
    delete[] GMat;
    delete[] SaveW;
}
52
53 ///
54 int Slop2::Run()
55 {
56 int Wbig;
57 int RetCode;
59 SaveW[ i ] = Width[ i ];
60 double max = 0;
61 double dummy;
64 dummy = Slop2NormSim( Width, RetCode);
65 if ( (RetCode != OK ) && (RetCode != CONT))
66 return RetCode;
68 {
70 TMatOld[ i ][ j ] = dummy;
71 }
72 unsigned int end_acc = 0;
73 for ( Steps = 1; ( Steps < options.GetOptOption( MAXSTEPS ) ) \
74 && ( max >= 0.0 ) \
75 && ( InternalSteps < options.GetOptOption( MAXSTEPS ) \
76 && (end_acc == 0)); Steps++ )
77 {
78
82 {
83 Width[ i ] += options.GetOptOption( DELTA );
84 if (options.GetOptOption( WMAX ) > 0)
85 if ( Width[ i ] >= options.GetOptOption( WMAX ) )
86 Width[ i ] = options.GetOptOption( WMAX );
89 return RetCode;
91 end_acc = 1;
94 {
95 TMat[ j ][ i ] = dummy;
96 }
97 }
98 Wbig = -1;
99 max = 0.0;
102 {
103 GMat[ i ][ j ] = TMatOld[ i ][ j ] - TMat[ i ][ j ];
104 if ( GMat[ i ][ j ] > max )
105 {
106 max = GMat[ i ][ j ];
107 Wbig = j;
108 }
109 TMatOld[ i ][ j ] = TMat[ i ][ j ];
110 }
111 if ( Wbig != -1 )
112 SaveW[ Wbig ] += options.GetOptOption( DELTA );
113 else
114 max = -1.0; // so max < 0
115 }
116
120 return RetCode;
121 }
Slop2Norm.cc
7
8
9 ///
10 double Slop2::Slop2NormSim( const double* NewWidth, int& RetCode)
11 {
13 }
TestEv.cc
6 #include "test.h"
7
8 ///
9 TestEval::TestEval( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation )
10 :
11 OptimizationAlgorithm( circuit, options, simulation ) ,
12 TryW(0)
13 {
14 print_log( "Creating TestEval instance..." );
15 TryW = new double[ NumTran ];
16 if ( !TryW )
17 {
21 }
22 }
23
24 ///
25 TestEval::~TestEval()
26 {
27 delete[] TryW;
28 }
29
30 ///
31 int TestEval::Run()
32 {
33 int RetCode;
34 double dummy;
36 TryW[ i ] = options.GetOptOption( WMIN );
37 for ( Steps = 1; Steps < options.GetOptOption( MAXSTEPS ) ; Steps++ )
38 {
39
40 dummy = TestEvalNormSim( TryW, RetCode);
41 if ( (RetCode != OK ) && (RetCode != CONT))
42 return RetCode;
43 // if (Steps <= NumTran)
44 // TryW[(Steps - 1)] += options.GetOptOption( DELTA );
45 // else
46 // TryW[(Steps - 1) % NumTran] += options.GetOptOption(DELTA);
47 for (int i = 0; i < NumTran; i++)
48 TryW[i] += options.GetOptOption( DELTA );
49 }
50 return OK;
51 }
TestNorm.cc
6 #include "test.h"
7
8
9 ///
10 double TestEval::TestEvalNormSim( const double* NewWidth, int& RetCode)
11 {
13 }
B.3 Simulators
Basicnet.cc
7
8 ///
9 int Hspice::BasicNetlist( const double* NewWidth, unsigned int Np, const Circuit& circuit )
10 {
11 char * FileHspice;
12 char log[ 1024 ];
13 char suffix[ 8 ];
14 sprintf( suffix, "_%u", Np );
15 FileHspice = new char[ strlen( WorkPath ) + strlen( SimFile ) + strlen( suffix ) + 1 ];
16 strcpy( FileHspice, WorkPath );
17 strcat( FileHspice, SimFile );
18 strcat( FileHspice, suffix );
19 ofstream o_file( FileHspice );
20 ifstream i_file( NetlistFile );
21 if ( !o_file )
22 {
23 sprintf( log, " ERROR opening file %s ", FileHspice );
26 }
27 if ( !i_file )
28 {
29 sprintf( log, " ERROR opening file %s ", NetlistFile );
32 }
33 char line[ 1024 ];
34 i_file.getline( line, 1023 );
35 o_file << endl << endl << endl << "****** INPUTS ******" << endl;
36 o_file << ".include inputs." << Np << endl;
37 o_file << "********************" << endl;
39 {
40 unsigned int i = 0;
41 while ( isspace( line[ i++ ] ) );
42 char s = line[ --i ];
43 char st1[ 16 ], st2[ 16 ], st3[ 16 ];
44 int n1, n2, n3, n4;
45 if ( ( s == 'M' ) || ( s == 'm' ) || ( s == 'X' ) || ( s == 'x' ) )
46 {
47 sscanf( line, "%s %d %d %d %d %s %*s %s", st1, &n1, &n2, &n3, &n4, st2, st3 );
48 int position = circuit.TranPos( st1 );
49 if ( position == -1 )
51 o_file << st1 << " " << n1 << " " << n2 << " " << n3 << " " << n4 << \
52 " " << st2 << " " << st3 << " w=" << setprecision( 4 ) << NewWidth[ position ] << "u" << endl;
53 }
54 else
55 o_file << line << endl;
56 }
57 o_file.close();
58 i_file.close();
59 return OK;
60 }
Delayread.cc
7
8 ///
9 int Hspice::DelayRead( double& del, double& energy, unsigned int Np )
10 {
11 char * FileMeas;
13 sprintf( suffix, "_%u", Np );
14 FileMeas = new char[ strlen( WorkPath ) + strlen( SimFile ) + strlen( suffix ) + strlen( SuffixFileMeasure ) + 1 ];
15 if ( !FileMeas )
16 return NO_MEM;
17 strcpy( FileMeas, WorkPath );
18 strcat( FileMeas, SimFile );
19 strcat( FileMeas, suffix );
20 strcat( FileMeas, SuffixFileMeasure );
21 ifstream i_file( FileMeas );
22 if ( !i_file )
23 {
24 print_log( "ERROR opening hspice measure file " );
26 }
27 char line[ 1023 ];
28 for ( unsigned int i = 0; i <= 3; i++ )
29 if ( !i_file.getline( line, 1023 ) )
30 {
31 print_log( "ERROR parsing hspice measure file " );
33 }
34 sscanf( line, "%lg %lg", &del, &energy);
35 i_file.close();
36 del *= 1E12; // picosec.
37 energy *= ( 1E12); // pJ
38 delete[] FileMeas;
39 return OK;
40 }
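DelayRead skips to the fourth line of the measure file, reads two numbers from it, and scales seconds to picoseconds and joules to picojoules, without checking how many fields sscanf converted. A sketch of the same parsing on an arbitrary input stream, with both failure cases reported, is given below. The function name and the stream-based interface are assumptions, and nothing here is a statement about the HSPICE file layout.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Read line number `line_no` (1-based) and parse two doubles from it.
// On success, stores the delay in ps and the energy in pJ and returns true.
inline bool read_measure(std::istream& in, unsigned int line_no,
                         double& delay_ps, double& energy_pj) {
    assert(line_no >= 1);
    std::string line;
    for (unsigned int i = 0; i < line_no; ++i)
        if (!std::getline(in, line)) return false;       // input too short
    std::istringstream fields(line);
    double delay_s = 0.0, energy_j = 0.0;
    if (!(fields >> delay_s >> energy_j)) return false;  // malformed line
    delay_ps = delay_s * 1e12;
    energy_pj = energy_j * 1e12;
    return true;
}
```

Passing an std::ifstream opened on the measure file and line_no = 4 mirrors what DelayRead does; using std::string for the line also removes the fixed 1023-character limit.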
Hspice.cc
7
8 //////////////////////////////////////////////////////////////////////////////
9 // //
10 // DELAY MODULE – HSPICE //
11 // //
12 // 1998 October 9 – Politecnico di Torino – VLSI LAB //
13 // //
14 // Mariagrazia Graziano – Ph.D. Student //
15 // //
16 ////////////////////////////////////////////////////////////////////////////////
17
18 Hspice::Hspice( const CritPathList& pathlist, const Options& options, const char* NE )
19 :
20 EvaluationAlgorithm( pathlist, options ), SimTime( 0.0 ),
21 WorkPath( 0 ), SimFile( 0 ),
22 InputFile( 0 ), NetlistFile( 0 ), SuffixFileMeasure( 0 )
23 {
24 print_log( "Creating Hspice instance..." );
25 WorkPath = new char[ strlen( options.Workpath() ) + 1 ];
26 SimFile = new char[ strlen( "net2use" ) + 1 ];
27 InputFile = new char[ strlen( "inputs" ) + 1 ];
28 NetlistFile = new char[ strlen( NE ) + strlen( NetListSuffix ) + 1 ];
29 if ( !WorkPath || !SimFile || !InputFile || !NetlistFile )
30 {
34 }
35 strcpy( WorkPath, options.Workpath() );
36 strcpy( SimFile, "net2use" );
37 strcpy( InputFile, "inputs" );
38 strcpy( NetlistFile, NE );
39 strcat( NetlistFile, NetListSuffix );
40 SuffixFileMeasure = new char[ 5 ];
41 if ( !SuffixFileMeasure )
42 {
46 }
47 strcpy( SuffixFileMeasure, ".mt0" );
48 }
49
50 ///
51 Hspice::~Hspice()
52 {
53 delete[] WorkPath;
54 delete[] SimFile;
55 delete[] InputFile;
56 delete[] NetlistFile;
57 delete[] SuffixFileMeasure;
58 }
59
60 ///
61 int Hspice::Run( const Circuit& circuit, const double *NewWidth, const unsigned *ValidPath )
62 {
63 SimTime = options.GetSimOption( SIMTIME );
64 Calls++;
65 for ( unsigned int NP = 0; NP < NumPath; NP++ )
66 {
67 if (ValidPath[NP])
68 {
69 int RetCode;
70 RetCode = BasicNetlist( NewWidth, NP, circuit );
71 if ( RetCode != OK )
72 return RetCode;
73 RetCode = SetInput( NP, circuit.Valim() );
74 if ( RetCode != OK )
75 return RetCode;
76 RetCode = SimCall( NP );
77 if ( RetCode != OK )
78 return RetCode;
79 double OneDelay, OneEnergy;
80 RetCode = DelayRead( OneDelay, OneEnergy, NP );
81 if ( RetCode != OK )
82 return RetCode;
83 CPDelay[ NP ] = OneDelay;
84 CPPower[ NP ] = OneEnergy;
85 CPNoise[ NP ] = 0.0;
86 }
87 }
88 Area = CalcArea( NewWidth, circuit.GetNTran() );
89 return OK;
90 }
HspiceArea.cc
7
8 ///
9 double Hspice::CalcArea( const double *NewWidth, unsigned int NT )
10 {
11 double A = 0.0;
12 for ( unsigned int i = 0; i < NT; i++ )
13 {
14 A += NewWidth[ i ];
15 }
16 return ( A );
17 }
Setinput.cc
7
8 ///
9 int Hspice::SetInput( unsigned int Np, double Val )
10 {
12 sprintf( suffix, ".%u", Np );
13 char *Inputs = new char[ strlen( WorkPath ) + strlen( InputFile ) + strlen( suffix ) + 1 ];
14 if ( !Inputs )
15 return NO_MEM;
16 strcpy( Inputs, WorkPath );
17 strcat( Inputs, InputFile );
18 strcat( Inputs, suffix );
19 ofstream input_file( Inputs );
20 if ( !input_file )
22 unsigned int node_in = pathlist[ Np ].GetNodeIn();
23 unsigned int node_out = pathlist[ Np ].GetNodeOut();
24 TransitionType TIn = pathlist[ Np ].GetTransitionIn();
25 if ( TIn == RISE )
26 input_file << endl << "v_node_in " << node_in << \
27 " 0 " << " pwl(0 0 " << ( SimTime / 2.0 ) << \
28 "p 0 " << ( SimTime / 2.0 ) + pathlist[ Np ].GetInTime() << "p " << Val << ")";
29 else if ( TIn == FALL )
30 input_file << endl << "v_node_in " << node_in << \
31 " 0 " << " pwl(0 " << Val << " " << ( SimTime / 2.0 ) << \
32 "p " << Val << " " << ( SimTime / 2.0 ) + \
33 pathlist[ Np ].GetInTime() << "p 0)" ;
35 double nodeVal;
36 while ( pathlist[ Np ].TraverseActiveInputs( node, nodeVal ) )
37 {
38 input_file << endl << "v_ACTIVE_" << node << " " << \
39 node << " 0 dc " << nodeVal;
40 }
41 while ( pathlist[ Np ].TraverseNoActiveInputs( node, nodeVal ) )
42 {
43 input_file << endl << "v_NO_ACTIVE_" << node << " " << \
44 node << " 0 dc " << nodeVal;
45 }
46 while ( pathlist[ Np ].TraverseInitialConditions( node, nodeVal ) )
47 {
48 input_file << endl << endl << ".ic v(" << node << ")=" << \
49 nodeVal;
50 }
51 TransitionType TOut = pathlist[ Np ].GetTransitionOut();
52 if ( TOut == RISE )
53 input_file << endl << endl << ".ic v(" << node_out << ")=0";
54 else if ( TOut == FALL )
55 input_file << endl << endl << ".ic v(" << node_out << ")=" << Val;
56
57 // Delay meas.
58 input_file << endl << endl << ".measure tran path_n0_" << Np << "delay " << \
59 " trig v(" << node_in << ")" << " val=" << Val*0.5 << " " << TransitionString[ TIn ] << "=1" << \
60 " targ v(" << node_out << ")" << " val=" << Val * 0.5 << " " << TransitionString[ TOut ] << "=1";
61 // Power meas.
62 input_file << endl << endl << ".measure tran path_n0_" << Np << "power " << \
63 " integ " << "POWER" << " from=0ps" << " to=" << SimTime << "ps ";
64 input_file << endl << endl << ".tran 10p " << SimTime << "p" << endl;
65 input_file.close();
66 return OK;
67 }
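The input stimulus written by SetInput is a piecewise-linear source that stays flat for the first half of the simulation window and then ramps to the opposite rail. A minimal sketch of that string construction is given below; the function name pwlSource and its parameters are hypothetical, and the exact spacing of the generated line differs slightly from the listing.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: builds one voltage-source line whose PWL waveform
// holds its initial level until simTimePs / 2 and then ramps to the other
// rail in rampPs picoseconds (times are emitted with the "p" suffix).
std::string pwlSource( unsigned node, bool rising,
                       double simTimePs, double rampPs, double vdd )
{
    assert( simTimePs > 0.0 && rampPs > 0.0 );
    const double tStart = simTimePs / 2.0;   // edge begins at mid-window
    const double tEnd = tStart + rampPs;     // edge ends after the ramp
    const double v0 = rising ? 0.0 : vdd;    // level before the edge
    const double v1 = rising ? vdd : 0.0;    // level after the edge
    std::ostringstream out;
    out << "v_node_in " << node << " 0 pwl(0 " << v0 << " "
        << tStart << "p " << v0 << " "
        << tEnd << "p " << v1 << ")";
    return out.str();
}
```

Computing the two levels once removes the duplicated RISE and FALL branches, so both edges are guaranteed to share the same timing.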
Simcall.cc
7
8 ///
9 int Hspice::SimCall( unsigned int Np )
10 {
11 char system_string[ 512 ];
12 snprintf( system_string, sizeof( system_string ), "cd %s && hspice %s_%u 1>./hspice.log.%u 2>&1", WorkPath, SimFile, Np, Np );
13 if ( system( system_string ) == -1 )
14 {
15 print_log( "ERROR invoking hspice simulator " );
16 return NO_MEM;
17 }
18 return OK;
19 }
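The command line assembled in SimCall can also be built without a fixed-size buffer. The following is a minimal sketch under that assumption; hspiceCommand is a hypothetical name, and the command text simply mirrors the format string of the listing.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns the shell command that runs the simulator on
// "<simFile>_<np>" inside workPath and redirects both output streams to
// "hspice.log.<np>". The string grows as needed, so long paths cannot
// overflow a buffer.
std::string hspiceCommand( const std::string& workPath,
                           const std::string& simFile, unsigned np )
{
    assert( !workPath.empty() && !simFile.empty() );
    std::ostringstream cmd;
    cmd << "cd " << workPath << " && hspice " << simFile << "_" << np
        << " 1>./hspice.log." << np << " 2>&1";
    return cmd.str();
}
```

The result would be passed to system() through c_str(). Note that system() returns -1 only when the command could not be started at all; a simulator that starts and then fails is reported through a nonzero exit status, which a stricter caller may also want to check.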
Brackets.cc
6 #include "fast.h"
7
8 const double FACTOR = 1.6;
9 const int NTRY = 50;
10 const double ZEPS = 1e-2;
11
12 ///
13 int Fast::Brackets( const Circuit& circuit, unsigned int NP, unsigned int NC, double& start, double& end, TransistorType type, unsigned
14 {
15 int jj;
16 double f1, f2, x1, x2;
17
18 if ( start == end )
19 {
20 x1 = x2 = 0;
22 return 0;
23 }
24 if ( type == NMOS )
25 {
26 f1 = EqN( circuit, NP, NC, start, RetCode, j, n, p, NewWidth );
27 f2 = EqN( circuit, NP, NC, end, RetCode, j, n, p, NewWidth );
28 }
29 else if ( type == PMOS )
30 {
31 f1 = EqP( circuit, NP, NC, start, RetCode, j, n, p, NewWidth );
32 f2 = EqP( circuit, NP, NC, end, RetCode, j, n, p, NewWidth );
33 }
34 for ( jj = 1; jj <= NTRY; jj++ )
35 {
36 if ( f1 * f2 < 0.0 )
37 {
38 RetCode = OK;
39 return 1;
40 }
41 if ( fabs ( f1 ) < fabs ( f2 ) )
42 {
43 start += FACTOR * ( start - end );
44 if ( start <= 0.0 )
45 start = ZEPS;
46 if ( type == NMOS )
47 f1 = EqN( circuit, NP, NC, start, RetCode, j, n, p, NewWidth );
48 else if ( type == PMOS )
49 f1 = EqP( circuit, NP, NC, start, RetCode, j, n, p, NewWidth );
50 }
51 else
52 {
53 end += FACTOR * ( end - start );
54 if ( end <= 0.0 )
55 end = ZEPS;
56 if ( type == NMOS )
57 f2 = EqN( circuit, NP, NC, end, RetCode, j, n, p, NewWidth );
58 else if ( type == PMOS )
59 f2 = EqP( circuit, NP, NC, end, RetCode, j , n, p, NewWidth );
60 }
61 }
63 return 0;
64 }
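The search performed by Brackets widens an interval geometrically until the function changes sign across it. A self-contained sketch of this idea on a generic function is shown below; bracketRoot and its parameter names are hypothetical, the default values copy the constants FACTOR, NTRY and ZEPS of the listing, and the circuit-specific evaluation through EqN and EqP is replaced by an arbitrary callable.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Hypothetical helper: expands [x1, x2] until f(x1) and f(x2) have opposite
// signs. The endpoint with the smaller |f| is pushed outwards by `factor`
// times the current interval length; non-positive endpoints are clamped to
// minX because the unknown (a transistor width) must stay positive.
// Returns true when a sign change is bracketed within maxTries expansions.
bool bracketRoot( const std::function<double( double )>& f,
                  double& x1, double& x2,
                  double factor = 1.6, int maxTries = 50, double minX = 1e-2 )
{
    assert( factor > 0.0 && maxTries > 0 );
    if ( x1 == x2 )
        return false;                      // empty interval, nothing to expand
    double f1 = f( x1 ), f2 = f( x2 );
    for ( int k = 0; k < maxTries; ++k )
    {
        if ( f1 * f2 < 0.0 )
            return true;                   // sign change found
        if ( std::fabs( f1 ) < std::fabs( f2 ) )
        {
            x1 += factor * ( x1 - x2 );
            if ( x1 <= 0.0 ) x1 = minX;
            f1 = f( x1 );
        }
        else
        {
            x2 += factor * ( x2 - x1 );
            if ( x2 <= 0.0 ) x2 = minX;
            f2 = f( x2 );
        }
    }
    return false;                          // no bracket within maxTries
}
```

Once the interval is bracketed, any bracketing root finder (bisection, for instance) can refine the solution inside it.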
CalcpowN.cc
6 #include "fast.h"
7
8
9 ///
10 int Fast::CalcPowerN( const Circuit& circuit, unsigned int NP, unsigned int NC, double& Ecc, double& Esc, unsigned int n, unsigned int p, const double*
11 {
12
13 double H_1, I_1, Vc, t0_bs, C_n;
14 double J_1, K_1, M_1, N_1;
15 double t0 = t0_n[ n ];
16 double tauo = tauo_n[ n ];
17 double tin = taui_n[ 1 ];
18 t0_bs = tin * ( VDD + TECH.Vtp0 ) / VDD;
19 double tc = (VDD * tauo_n[n] * (t0_n[n] - taui_n[n]) + \
20 Vs_n[n] * taui_n[n] * (tauo_n[n] - t0_n[n])) / \
21 (VDD * (t0_n[n] - taui_n[n]) + Vs_n[n] * (tauo_n[n] - t0_n[n]));
22 Esc = 0;
23 Ecc = 0;
24 if (p > 0)
25 {
26 Vc = TECH.Ec_p * L_p[ p ];
27 H_1 = Vc * beta_p[ p ] * ( VDD * ( t0 - tauo ) * \
28 ( 2 * Vc * ( t0 - tauo ) - Vd_n[ n ] * tin ) - \
29 Vd_n[ n ] * tin * ( Vc * ( t0 - tauo ) - Vd_n[ n ] * tauo + 2 * TECH.Vtp0 * ( t0 - tauo ) ) ) / \
30 ( 2 * Vd_n[ n ] * tin * ( t0 - tauo ) );
31 I_1 = Vc * beta_p[ p ] * ( 2 * VDD * ( t0 - tauo ) - Vd_n[ n ] * tin ) / \
32 ( 2 * tin * ( t0 - tauo ) );
33 J_1 = ( Vc * Vc ) * beta_p[ p ] * ( t0 - tauo ) * ( 2 * ( VDD * VDD ) * ( t0 - tauo ) + \
34 2 * VDD * ( Vc * ( t0 - tauo ) + Vd_n[ n ] * ( tauo - tin ) ) - \
35 Vd_n[ n ] * tin * ( Vc + 2 * TECH.Vtp0 ) );
36 K_1 = 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + \
37 Vc * ( t0 - tauo ) + Vd_n[ n ] * tauo );
38 M_1 = VDD * Vc * beta_p[ p ] * ( VDD + 2 * TECH.Vtp0 ) / ( 2 * ( VDD + Vc ) );
39 N_1 = VDD * VDD * Vc * beta_p[ p ] / ( tin * ( VDD + Vc ) );
40 if ( t0_bs < tauo )
41 {
42 Esc = ( J_1 * ( K_1 - 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + Vd_n[ n ] * tauo ) ) * \
43 0.25 * LOG ( 2 * Vd_n[ n ] * Vd_n[ n ] * t0 * tin - K_1 ) / ( Vd_n[ n ] * Vd_n[ n ] * Vd_n[ n ] * \
44 ( tin * tin ) * ( tauo - t0 ) ) + \
45 J_1 * ( K_1 - 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + Vd_n[ n ] * tauo ) ) * \
46 0.25 * LOG ( 2 * Vd_n[ n ] * Vd_n[ n ] * t0_bs * tin - K_1 ) / \
47 ( Vd_n[ n ] * Vd_n[ n ] * Vd_n[ n ] * ( tin * tin ) * ( t0 - tauo ) ) + \
48 ( ( t0 - t0_bs ) * \
49 ( 3 * H_1 * Vd_n[ n ] * tin * ( 2 * VDD * ( t0 - tauo ) - Vd_n[ n ] * ( t0 + t0_bs - 2 * tauo ) ) + \
50 I_1 * Vd_n[ n ] * tin * ( 3 * VDD * ( t0 + t0_bs ) * ( t0 - tauo ) - Vd_n[ n ] * ( 2 * ( t0 * t0 ) + \
51 t0 * ( 2 * t0_bs - 3 * tauo ) + t0_bs * ( 2 * t0_bs - 3 * tauo ) ) ) - 3 * J_1 ) ) / \
52 ( 6 * Vd_n[ n ] * tin * ( t0 - tauo ) ) );
53 }
54 else
55 {
56 Esc = ( J_1 * ( K_1 - 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + Vd_n[ n ] * tauo ) ) * \
57 0.25 * LOG ( 2 * Vd_n[ n ] * Vd_n[ n ] * t0 * tin - K_1 ) / ( Vd_n[ n ] * Vd_n[ n ] * Vd_n[ n ] * \
58 ( tin * tin ) * ( tauo - t0 ) ) + \
59 J_1 * ( K_1 - 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + Vd_n[ n ] * tauo ) ) * \
60 0.25 * LOG ( 2 * Vd_n[ n ] * Vd_n[ n ] * tauo * tin - K_1 ) / \
61 ( Vd_n[ n ] * Vd_n[ n ] * Vd_n[ n ] * ( tin * tin ) * ( t0 - tauo ) ) + \
62 ( 3 * H_1 * Vd_n[ n ] * tin * ( t0 - tauo ) * ( 2 * VDD - Vd_n[ n ] ) + \
63 I_1 * Vd_n[ n ] * tin * ( t0 - tauo ) * \
64 ( 3 * VDD * ( t0 + tauo ) - Vd_n[ n ] * ( 2 * t0 + tauo ) ) - 3 * J_1 ) / \
65 ( 6 * Vd_n[ n ] * tin ) );
66 }
67 Esc = fabs( Esc );
68 }
70 unsigned int node = 0;
71 double Wjn, Wgn, Wjp, Wgp;

72 int njn, ngn, njp, ngp;
73 for ( unsigned int i = 1; i <= n; i++ )
74 {
75 C_n = 0.0;
76 name = pathlist[ NP ].TransistorName( i - 1, NC );
77 if ( circuit[ name ].Source() == node )
78 {
79 node = circuit[ name ].Drain();
80 Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
81 Wgn = circuit.GateNWidth( node, ngn, NewWidth );
82 Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
83 Wgp = circuit.GatePWidth( node, ngp, NewWidth );
84 }
85 else if ( circuit[ name ].Drain() == node )
86 {
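87 node = circuit[ name ].Source();
88 Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
89 Wgn = circuit.GateNWidth( node, ngn, NewWidth );
90 Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
91 Wgp = circuit.GatePWidth( node, ngp, NewWidth );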
92 }
93 int nc;
94 // Cj N
95 C_n += TECH.C_nj * Wjn * TECH.Df * \
96 ( Vd_n[ i ] * Vd_n[ i ] * ( TECH.mj_n - 1 ) * ( TECH.mj_n - 1 ) + \
97 Vd_n[ i ] * TECH.mj_n * TECH.PB_n + TECH.PB_n * TECH.PB_n ) * \
98 pow ( ( 1 + Vd_n[ i ] / TECH.PB_n ), -TECH.mj_n ) / \
99 ( ( TECH.mj_n - 2 ) * ( TECH.mj_n - 1 ) );
100 C_n += TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * \
101 ( Vd_n[ i ] * Vd_n[ i ] * ( TECH.mjsw_n - 1 ) * ( TECH.mjsw_n - 1 ) + \
102 Vd_n[ i ] * TECH.mjsw_n * TECH.PB_n + TECH.PB_n * TECH.PB_n ) * \
103 pow ( 1 + Vd_n[ i ] / TECH.PB_n, -TECH.mjsw_n ) / \
104 ( ( TECH.mjsw_n - 2 ) * ( TECH.mjsw_n - 1 ) ) + \
105 TECH.PB_n * TECH.PB_n * ( TECH.C_nj * Wjn * TECH.Df * \
106 ( TECH.mjsw_n - 2 ) * ( TECH.mjsw_n - 1 ) + \
107 TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * \
108 ( TECH.mj_n - 2 ) * ( TECH.mj_n - 1 ) ) / \
109 ( ( 2 - TECH.mjsw_n ) * ( TECH.mj_n - 2 ) * ( TECH.mj_n - 1 ) * ( TECH.mjsw_n - 1 ) );
110 // Cj P
111 if (p > 0)
112 {
113 double x = TECH.mj_p - 1;
114 double y = TECH.mjsw_p - 1;
115 C_n += ( TECH.C_pj * Wjp * TECH.Df * \
116 ( VDD * VDD * x + VDD * TECH.mj_p * \
117 (TECH.PB_p - Vd_n[ i ] * x) + Vd_n[ i ] * Vd_n[ i ] * x * x - \
118 Vd_n[ i ] * TECH.mj_p * TECH.PB_p + TECH.PB_p * TECH.PB_p ) * \
119 pow( ( VDD - Vd_n[ i ] + TECH.PB_p ) / TECH.PB_p, -TECH.mj_p ) ) / \
120 ( ( x - 1 ) * x );
121 C_n += ( TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
122 ( VDD * VDD * y + VDD * TECH.mjsw_p * \
123 (TECH.PB_p - Vd_n[ i ] * y) + Vd_n[ i ] * Vd_n[ i ] * y * y - \
124 Vd_n[ i ] * TECH.mjsw_p * TECH.PB_p + TECH.PB_p * TECH.PB_p ) * \
125 pow( ( VDD - Vd_n[ i ] + TECH.PB_p ) / TECH.PB_p, -TECH.mjsw_p ) ) / \
126 ( ( y - 1 ) * y );
127 C_n += ( TECH.C_pj * Wjp * TECH.Df * \
128 ( VDD * VDD * x + VDD * TECH.mj_p * TECH.PB_p + \
129 TECH.PB_p * TECH.PB_p ) * \
130 pow( ( VDD + TECH.PB_p ) / TECH.PB_p, -TECH.mj_p ) ) / \
131 ( ( 1 - x ) * x );
132 C_n += ( TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
133 ( VDD * VDD * y + VDD * TECH.mjsw_p * TECH.PB_p + \
134 TECH.PB_p * TECH.PB_p ) * \
135 pow( ( VDD + TECH.PB_p ) / TECH.PB_p, -TECH.mjsw_p ) ) / \
136 ( ( 1 - y ) * y );
137 }
138 C_n += circuit.CapStaticGnd( node, nc ) * Vd_n[i] * Vd_n[i] * 0.5;
139 C_n += circuit.CapStaticVdd( node, nc ) * Vd_n[i] * Vd_n[i] * 0.5;
140 C_n += Wgn * TECH.Lmin * TECH.Cox_n * Vd_n[i] * Vd_n[i] * 0.5;
141 C_n += Wgp * TECH.Lmin * TECH.Cox_p * Vd_n[i] * Vd_n[i] * 0.5;
142 if ( i < n )
143 C_n += TECH.Cgs0_n * ( ( Wjn - W_n[ i ] - W_n[ i + 1 ] ) + \
144 ( njn - 2 ) * TECH.XW_n ) * 0.5 * Vd_n[i] * Vd_n[i];
145 else
146 C_n += TECH.Cgs0_n * ( ( Wjn - W_n[ i ] ) + \
147 ( njn - 1 ) * TECH.XW_n ) * 0.5 * Vd_n[i] * Vd_n[i];
148 C_n += TECH.Cgs0_p * ( Wjp + njp * TECH.XW_p ) * Vd_n[i] * Vd_n[i] * 0.5;
149 // Cgd
150 if ( (( i == 1 ) && ( i < n - 1 )) || ((i == 1) && (n == 1)) )
151 {
152 double Cov = TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n );
153 double Cg = Cov + 0.5 * TECH.Cox_n * W_n[ i ] * L_n[ i ];
154 int Op, SOp;
155 Op = Calct0ts1N( circuit, NP, SOp, NewWidth );
156 if ( Op == _E_ )
157 {
158 Op = SOp;
159 }
160 switch ( Op )
161 {
162 case _A_:
163 C_n += Cg * Vd_n[i] * (Vd_n[i] * taui_n[i] * \
164 (ts_n[i] - tauo_n[1]) * (ts_n[i] - tauo_n[1]) - \
165 VDD * (t0_n[i] - tauo_n[1]) * \
166 ((ts_n[i] * ts_n[i]) - 2 * ts_n[i] * tauo_n[1]-taui_n[i] * (taui_n[i] - 2 * tauo_n[1]))) / \
167 (2 * taui_n[i] * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
168 C_n += Cov * Vd_n[i] * ((t0_n[i] - ts_n[i]) * (t0_n[i] + ts_n[i] - 2 * tauo_n[1])) * \
169 (Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
171 break;
172 case _AA_:
173 C_n += Cg * Vd_n[i] * (ts_n[i] - tauo_n[1]) * (ts_n[i] - tauo_n[1])*\
176 C_n += Cov * Vd_n[i] * (Vd_n[i] * taui_n[i]*\
177 ((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1])) - \
178 VDD * (t0_n[i] - tauo_n[1])*\
179 ((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-taui_n[i] * (taui_n[i] - 2 * tauo_n[1]))) / \
181 break;
182 case _B_:
183 C_n += Cg * Vd_n[i] * (VDD * ((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-\
184 taui_n[i] * (taui_n[i] - 2 * tauo_n[1])) + \
185 Vd_n[i] * taui_n[i] * (tauo_n[1]-t0_n[i])) / \
186 (2 * taui_n[i] * (tauo_n[1]-t0_n[i]));
187 break;
188 case _C_:
189 C_n += Cg * Vd_n[i] * Vd_n[i] * (ts_n[i] - tauo_n[1]) * \
190 (ts_n[i] - tauo_n[1]) / \
191 (2 * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
192 C_n += Cov * Vd_n[i] * Vd_n[i]*\
193 ((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1])) / \
195 break;
196 case _D_:
197 C_n += Cg * Vd_n[i] * Vd_n[i] / 2;
198 break;
199 case _F_:
203 C_n += Cov * Vd_n[i]*\
204 ((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1]))*\
207 break;
208 case _G_:
209 C_n += Cg * Vd_n[i] * (Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \

210 (2 * taui_n[i]);
211 break;
212 case _E_:
213 default:
214 C_n += 0.0;
215 break;
216 }
217 }
218 else if ( ( i < n - 1 ) && ( i > 1 ) )
219 {
220 C_n += ( TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ i ] * TECH.Lmin ) * \
221 Vd_n[ i ] * Vd_n[ i ] * 0.5;
222 C_n += ( TECH.Cgs0_n * ( W_n[ i + 1 ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ i + 1 ] * TECH.Lmin ) * \
223 Vd_n[ i ] * Vd_n[ i ] * 0.5;
224 }
225 else if ( (i == 1) && (i == n - 1) )
226 {
227 double Cov = TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n );
228 double Cg = Cov + 0.5 * TECH.Cox_n * W_n[ i ] * L_n[ i ];
229 int Op, SOp;
230 Op = Calct0ts1N( circuit, NP, SOp, NewWidth );
231 if ( Op == _E_ )
232 {
233 Op = SOp;
234 }
235 switch ( Op )
236 {
237 case _A_:
238 C_n += Cg * Vd_n[i] * (Vd_n[i] * taui_n[i] * \
239 (ts_n[i] - tauo_n[1]) * (ts_n[i] - tauo_n[1]) - \
240 VDD * (t0_n[i] - tauo_n[1]) * \
241 ((ts_n[i] * ts_n[i]) - 2 * ts_n[i] * tauo_n[1]-taui_n[i] * (taui_n[i] - 2 * tauo_n[1]))) / \
243 C_n += Cov * Vd_n[i] * ((t0_n[i] - ts_n[i]) * (t0_n[i] + ts_n[i] - 2 * tauo_n[1])) * \
246 break;
247 case _AA_:
251 C_n += Cov * Vd_n[i] * (Vd_n[i] * taui_n[i]*\
252 ((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1])) - \
253 VDD * (t0_n[i] - tauo_n[1])*\
254 ((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-taui_n[i] * (taui_n[i] - 2 * tauo_n[1]))) / \
256 break;
257 case _B_:
258 C_n += Cg * Vd_n[i] * (VDD * ((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-\
259 taui_n[i] * (taui_n[i] - 2 * tauo_n[1])) + \
260 Vd_n[i] * taui_n[i] * (tauo_n[1]-t0_n[i])) / \
261 (2 * taui_n[i] * (tauo_n[1]-t0_n[i]));
262 break;
263 case _C_:
264 C_n += Cg * Vd_n[i] * Vd_n[i] * (ts_n[i] - tauo_n[1]) * \
265 (ts_n[i] - tauo_n[1]) / \
267 C_n += Cov * Vd_n[i] * Vd_n[i]*\
268 ((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1])) / \
270 break;
271 case _D_:
272 C_n += Cg * Vd_n[i] * Vd_n[i] / 2;
273 break;
274 case _F_:
278 C_n += Cov * Vd_n[i]*\
279 ((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1]))*\
282 break;
283 case _G_:
284 C_n += Cg * Vd_n[i] * (Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
285 (2 * taui_n[i]);
286 break;
287 case _E_:
288 default:
289 C_n += 0.0;
290 break;
291 }
292 Cov = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
293 Cg = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + 0.666666 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
294 Op = Calct0tsnN( n, SOp);
295 if ( Op == _E_ )
296 {
297 Op = SOp;
298 }
299 switch ( Op )
300 {
301 case _A_:
302 C_n += Cov * Vs_n[n] * Vs_n[n] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * taui_n[n] - \
303 ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
304 (2 * (t0_n[i] - taui_n[n]) * (t0_n[i] - taui_n[n]));
305 C_n += Cg * Vs_n[n] * Vs_n[n] * (ts_n[n] - taui_n[n]) * (ts_n[n] - taui_n[n]) * \
307 break;
308 case _B_:
309 C_n += Cov * Vs_n[n] * Vs_n[n] * 0.5;
310 break;
311 case _C_:
312 C_n += -Cg * Vs_n[n] * Vs_n[n] * (tc * tc - 2 * tc * taui_n[n] - \
313 ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
315 C_n += Cov * Vs_n[n] * Vs_n[n] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * taui_n[i] - \
316 ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
317 (2 * (t0_n[i] - taui_n[n]) * (t0 - taui_n[n]));
318 break;
319 case _D_:
321 tc * (tc - 2 * taui_n[n])) / \
323 break;
324 case _E_:
325 default:
326 break;
327 }
328 }
329 else if ( (i == n - 1) && (n > 2))
330 {
331 C_n += ( TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ i ] * TECH.Lmin ) * \
332 Vd_n[ i ] * ( 2 * VDD - Vd_n[ i ] ) * 0.5;
333 int Op, SOp;
334 double Cov = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
335 double Cg = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + 0.666666 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
336 Op = Calct0tsnN( n, SOp);
337 if ( Op == _E_ )
338 {
339 Op = SOp;
340 }
341 switch ( Op )
342 {
343 case _A_:
344 C_n += Cov * Vs_n[n] * Vs_n[n] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * taui_n[n] - \
345 ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
347 C_n += Cg * Vs_n[n] * Vs_n[n] * (ts_n[n] - taui_n[n]) * (ts_n[n] - taui_n[n]) * \

349 break;
350 case _B_:
351 C_n += Cov * Vs_n[n] * Vs_n[n] * 0.5;
352 break;
353 case _C_:
354 C_n += -Cg * Vs_n[n] * Vs_n[n] * (tc * tc - 2 * tc * taui_n[n] - \
355 ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
358 ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
359 (2 * (t0_n[i] - taui_n[n]) * (t0 - taui_n[n]));
360 break;
361 case _D_:
363 tc * (tc - 2 * taui_n[n])) / \
365 break;
366 case _E_:
367 default:
368 break;
369 }
370
371 }
372 else if ( i == n )
373 {
374 double Cov = TECH.Cgd0_n * ( W_n[ n ] + TECH.XW_n );
375 double Cg = Cov + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ];
376 int Op, SOp;
377 Op = Calct0tsnN(n, SOp);
378 if ( Op == _E_ )
379 {
380 Op = SOp;
381 }
382 switch ( Op )
383 {
384 case _A_:
385 case _B_:
386 C_n += Cov * \
387 Vd_n[i] * Vd_n[i] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * tauo_n[i] - \
388 ts_n[i] * (ts_n[i] - 2 * tauo_n[i])) / \
389 (2 * (t0_n[i] - tauo_n[i]) * (t0_n[i] - tauo_n[i]));
390 C_n += Cg * \
391 Vd_n[i] * Vd_n[i] * (ts_n[i] - tauo_n[i]) * (ts_n[i] - tauo_n[i]) / \
393 break;
394 case _C_:
395 C_n += Cov * \
396 Vd_n[i] * Vd_n[i] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * tauo_n[i] - \
397 ts_n[i] * (ts_n[i] - 2 * tauo_n[i])) / \
399 C_n += -Cg * \
400 Vd_n[i] * Vd_n[i] * (tc * tc - 2 * tc * tauo_n[i] - \
401 ts_n[i] * (ts_n[i] - 2 * tauo_n[i])) / \
403 break;
404 case _D_:
405 C_n += Cov * \
406 Vd_n[i] * Vd_n[i] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * tauo_n[i] - \
407 tc * (tc - 2 * tauo_n[i])) / \
409 break;
410 case _E_:
411 default:
412 break;
413
414 }
415 }
416 if ( C_n < 0.0 )
417 C_n *= -1;
418 Ecc += C_n;
419 }
420 return OK;
421 }
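Several contributions accumulated in CalcPowerN (static wiring capacitances, gate capacitances, overlap capacitances) have the common form C·V²/2, where V is the voltage swing of the node. A minimal sketch of that part alone is given below; linearCapEnergy is a hypothetical name, and the voltage-dependent junction terms and the time-dependent gate-drain terms of the listing are deliberately left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: energy stored on a set of linear (voltage-independent)
// capacitances tied to the same node, E = 0.5 * (sum of C) * V^2, for a
// node that swings by `swing` volts.
double linearCapEnergy( const std::vector<double>& caps, double swing )
{
    assert( swing >= 0.0 );
    double total = 0.0;
    for ( double c : caps )
        total += c;                       // parallel capacitances add up
    return 0.5 * total * swing * swing;
}
```

With capacitances in farads and the swing in volts the result is in joules.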
CalcpowP.cc
6 #include "fast.h"
7
8
9 ///
10 int Fast::CalcPowerP( const Circuit& circuit, unsigned int NP, unsigned int NC, double& Ecc, double& Esc, unsigned int n, unsigned int p, const double*
11 {
12 double H_1, I_1, Vc, t0_bs, C_p;
13 double J_1, K_1, M_1, N_1, O_1, P_1;
14
15 double t0 = t0_p[ p ];
16 double tauo = tauo_p[ p ];
17 double tin = taui_p[ 1 ];
18 t0_bs = tin * ( VDD - TECH.Vtn0 ) / VDD;
19 double tc = (VDD * tauo_p[p] * (t0_p[p] - taui_p[p]) + \
20 Vs_p[p] * taui_p[p] * (tauo_p[p] - t0_p[p])) / \
21 (VDD * (t0_p[p] - taui_p[p]) + Vs_p[p] * (tauo_p[p] - t0_p[p]));
22 Esc = 0;
23 Ecc = 0;
24 if (n > 0)
25 {
26 Vc = TECH.Ec_n * L_n[ n ];
27
28 H_1 = Vc * beta_n[ n ] * ( ( VDD * VDD ) * tin * ( t0 - 2 * tauo ) - \
29 VDD * ( Vc * ( t0 - tauo ) * ( 2 * t0 - tin - 2 * tauo ) + \
30 tin * ( Vd_p[ p ] * ( t0 - 3 * tauo ) + 2 * TECH.Vtn0 * ( t0 - tauo ) ) ) - \
31 Vd_p[ p ] * tin * ( Vc * ( t0 - tauo ) + Vd_p[ p ] * tauo + 2 * TECH.Vtn0 * ( tauo - t0 ) ) ) / \
32 ( 2 * tin * ( VDD - Vd_p[ p ] ) * ( t0 - tauo ) );
33 I_1 = Vc * beta_n[ n ] * ( VDD * ( 2 * t0 - tin - 2 * tauo ) + \
34 Vd_p[ p ] * tin ) / ( 2 * tin * ( tauo - t0 ) );
35 J_1 = ( Vc * Vc ) * beta_n[ n ] * ( tauo - t0 ) * \
36 ( 2 * ( VDD * VDD ) * ( t0 - tin ) + VDD * ( Vc * ( 2 * t0 - tin - 2 * tauo ) + \
37 2 * ( Vd_p[ p ] * ( tin - tauo ) + TECH.Vtn0 * tin ) ) + Vd_p[ p ] * tin * ( Vc - 2 * TECH.Vtn0 ) );
38 K_1 = 2 * tin * ( VDD - Vd_p[ p ] ) * ( VDD * t0 + Vc * ( t0 - tauo ) - \
39 Vd_p[ p ] * tauo );
40 M_1 = Vc * beta_n[ n ] * ( VDD * t0 + Vc * ( tauo - t0 ) - Vd_p[ p ] * tauo + 2 * TECH.Vtn0 * ( t0 - tauo ) ) / ( 2 * ( tauo - t0 ) );
41 N_1 = Vc * beta_n[ n ] * ( VDD - Vd_p[ p ] ) / ( 2 * ( t0 - tauo ) );
42 O_1 = ( Vc * Vc ) * beta_n[ n ] * ( Vc - 2 * TECH.Vtn0 ) * ( t0 - tauo );
43 P_1 = 2 * ( VDD * t0 + Vc * ( t0 - tauo ) - Vd_p[ p ] * tauo );
44
45 if ( t0_bs < tauo )
46 {
47 Esc = ( VDD * ( 3 * J_1 * ( K_1 - 2 * tin * t0 * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \
48 LOG ( 2 * t0 * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) - K_1 ) + \
49 3 * J_1 * ( 2 * tin * t0 * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) - K_1 ) * \
50 LOG ( 2 * t0_bs * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) - K_1 ) + \
51 2 * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \
52 ( t0_bs - t0 ) * ( 3 * H_1 * tin * ( VDD - Vd_p[ p ] ) * \
53 ( VDD - Vd_p[ p ] ) * ( t0 - t0_bs ) + \
54 I_1 * tin * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) * \
55 ( t0 * t0 + t0 * t0_bs - 2 * t0_bs * t0_bs ) - 3 * J_1 ) ) / \
56 ( 12 * ( tin * tin ) * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \
57 ( t0 - tauo ) ) );
58 }
59 else
60 {
61 Esc = ( ( 3 * J_1 * ( K_1 + 2 * tin * ( VDD - Vd_p[ p ] ) * ( Vd_p[ p ] * tauo - VDD * t0 ) ) * \
62 LOG ( 2 * t0 * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) - K_1 ) - \

63 3 * J_1 * ( K_1 + 2 * tin * ( VDD - Vd_p[ p ] ) * \
64 ( Vd_p[ p ] * tauo - VDD * t0 ) ) * \
65 LOG ( 2 * tauo * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) - K_1 ) + \
66 2 * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \
67 ( tauo - t0 ) * ( 3 * H_1 * tin * ( VDD - Vd_p[ p ] ) * \
68 ( VDD + Vd_p[ p ] ) * ( t0 - tauo ) + \
69 I_1 * tin * ( VDD - Vd_p[ p ] ) * ( t0 - tauo ) * \
70 ( VDD * ( t0 + 2 * tauo ) + Vd_p[ p ] * ( 2 * t0 + tauo ) ) - 3 * J_1 ) ) / \
71 ( 12 * ( tin * tin ) * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \
72 ( t0 - tauo ) ) );
73 }
74 Esc = fabs( Esc );
75 }
77 unsigned int node = circuit.ValimNode();
80 for ( unsigned int i = 1; i <= p; i++ )
81 {
82 C_p = 0.0;
83 name = pathlist[ NP ].TransistorName( n + p - i, NC ); // first there are nmos
84 // then there are the pmos, in REVERSE order
86 {
92 }
94 {
100 }
101 int nc;
102 // Cj P
103 C_p += TECH.C_pj * Wjp * TECH.Df * \
104 ( ( VDD * VDD + Vd_p[ i ] * Vd_p[ i ] ) * ( TECH.mj_p - 1 ) * ( TECH.mj_p - 1 ) + VDD * \
105 ( TECH.mj_p * TECH.PB_p - 2 * Vd_p[ i ] * ( TECH.mj_p - 1 ) * ( TECH.mj_p - 1 ) ) - \
106 Vd_p[ i ] * TECH.mj_p * TECH.PB_p + TECH.PB_p * TECH.PB_p ) * \
107 pow ( ( VDD - Vd_p[ i ] + TECH.PB_p ) / TECH.PB_p, -TECH.mj_p ) / \
108 ( ( TECH.mj_p - 2 ) * ( TECH.mj_p - 1 ) );
109 C_p += TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
110 ( ( VDD * VDD + Vd_p[ i ] * Vd_p[ i ] ) * ( TECH.mjsw_p - 1 ) * ( TECH.mjsw_p - 1 ) + VDD * \
111 ( TECH.mjsw_p * TECH.PB_p - 2 * Vd_p[ i ] * ( TECH.mjsw_p - 1 ) * ( TECH.mjsw_p - 1 ) ) - \
112 Vd_p[ i ] * TECH.mjsw_p * TECH.PB_p + TECH.PB_p * TECH.PB_p ) * \
113 pow ( ( VDD - Vd_p[ i ] + TECH.PB_p ) / TECH.PB_p, -TECH.mjsw_p ) / \
114 ( ( TECH.mjsw_p - 2 ) * ( TECH.mjsw_p - 1 ) );
115 C_p += TECH.PB_p * TECH.PB_p * ( TECH.C_pj * Wjp * TECH.Df * \
116 ( TECH.mjsw_p - 2 ) * ( TECH.mjsw_p - 1 ) + \
117 TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
118 ( TECH.mj_p - 2 ) * ( TECH.mj_p - 1 ) ) / \
119 ( ( TECH.mj_p - 2 ) * ( TECH.mj_p - 1 ) * ( TECH.mjsw_p - 1 ) * ( 2 - TECH.mjsw_p ) );
120 // Cj N
121 if (n > 0)
122 {
123 double x = TECH.mj_n - 1;
124 double y = TECH.mjsw_n - 1;
125 C_p += TECH.C_nj * Wjn * TECH.Df *\
126 ( VDD * VDD * x + VDD * TECH.mj_n * TECH.PB_n + TECH.PB_n * TECH.PB_n ) * \
127 pow( ( VDD + TECH.PB_n ) / TECH.PB_n, -TECH.mj_n ) / \
128 ( ( 1 - x ) * x );
129 C_p += TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) *\
130 ( VDD * VDD * y + VDD * TECH.mjsw_n * TECH.PB_n + TECH.PB_n * TECH.PB_n ) *\
131 pow( ( VDD + TECH.PB_n ) / TECH.PB_n, -TECH.mjsw_n ) / \
132 ( ( 1 - y ) * y );
133 C_p += TECH.C_nj * Wjn * TECH.Df *\
134 ( VDD * Vd_p[ i ] * (x - 1) * x - \
135 Vd_p[ i ] * Vd_p[i] * x * x - TECH.PB_n * (Vd_p[i] * TECH.mj_n + TECH.PB_n)) * \
136 pow( ( VDD + TECH.PB_n ) / TECH.PB_n, -TECH.mj_n ) / \
137 ( ( 1 - x ) * x );
138 C_p += TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) *\
139 ( VDD * Vd_p[ i ] * (y - 1) * y - \
140 Vd_p[ i ] * Vd_p[i] * y * y - TECH.PB_n * (Vd_p[i] * TECH.mjsw_n + TECH.PB_n) ) *\
141 pow( ( VDD + TECH.PB_n ) / TECH.PB_n, -TECH.mjsw_n ) / \
142 ( ( 1 - y ) * y );
143 }
144 C_p += circuit.CapStaticGnd( node, nc ) * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
145 C_p += circuit.CapStaticVdd( node, nc ) * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
146 C_p += Wgn * TECH.Lmin * TECH.Cox_n * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
147 C_p += Wgp * TECH.Lmin * TECH.Cox_p * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
148 C_p += TECH.Cgs0_n * ( Wjn + njn * TECH.XW_n ) * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
149 if ( i < p )
150 C_p += TECH.Cgs0_p * ( ( Wjp - W_p[ i ] - W_p[ i + 1 ] ) + ( njp - 2 ) * TECH.XW_p ) * \
151 ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
152 else
153 C_p += TECH.Cgs0_p * ( ( Wjp - W_p[ i ] ) + ( njp - 1 ) * TECH.XW_p ) * \
154 ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
155 // Cgs
156 if ( (( i == 1 ) && ( i < p - 1 )) || ((i == 1) && (p == 1)) )
157 {
158 double Cov = TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p );
159 double Cg = Cov + 0.5 * TECH.Cox_p * W_p[ i ] * L_p[ i ];
160 int Op, SOp;
161 Op = Calct0ts1P( circuit, NP, SOp, NewWidth );
162 if ( Op == _E_ )
163 {
164 Op = SOp;
165 }
166 switch ( Op )
167 {
168 case _A_:
169 C_p += Cov * (Vd_p[i] - VDD) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i]*\
170 (ts_p[i] - 2 * tauo_p[i])) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + \
171 Vd_p[i] * taui_p[i]) / \
172 (2 * taui_p[i] * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
173 C_p += -Cg * ((VDD * VDD) * ((t0_p[i] * t0_p[i]) * ((ts_p[i] * ts_p[i]) - (taui_p[i] * taui_p[i])) + \
174 t0_p[i] * ((taui_p[i] * taui_p[i]) - (ts_p[i] * ts_p[i])) * (taui_p[i] + 2 * tauo_p[i]) + \
175 (ts_p[i] * ts_p[i]) * tauo_p[i] * (taui_p[i] + tauo_p[i]) - (taui_p[i] * taui_p[i]) * ((taui_p[i] * taui_p
176 taui_p[i] * tauo_p[i] + 2 * (tauo_p[i] * tauo_p[i]))) + \
177 VDD * Vd_p[i] * taui_p[i] * (t0_p[i] * ((ts_p[i] * ts_p[i]) - (taui_p[i] * taui_p[i])) - (ts_p[i] * ts_p[i]) * tauo_p[i]
178 taui_p[i] * (2 * (taui_p[i] * taui_p[i]) - 3 * taui_p[i] * tauo_p[i] + 2 * (tauo_p[i] * tauo
179 (Vd_p[i] * Vd_p[i]) * (taui_p[i] * taui_p[i]) * ((taui_p[i] * taui_p[i]) - 2 * taui_p[i] * tauo_p[i] + (tauo_p[i] * tauo_
180 (2 * (taui_p[i] * taui_p[i]) * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
181 break;
182 case _AA_:
183 C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) / \
184 (2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
185 C_p += Cov * (Vd_p[i] - VDD) * (VDD * ((t0_p[i] * t0_p[i] * t0_p[i]) - (t0_p[i] * t0_p[i]) * (taui_p[i] + \
186 3 * tauo_p[i]) - t0_p[i] * ((taui_p[i] * taui_p[i]) - 4 * taui_p[i] * tauo_p[i] - 2 * (tauo_p[i]
187 taui_p[i] * ((ts_p[i] * ts_p[i]) - 2 * ts_p[i] * tauo_p[i] + tauo_p[i] * (taui_p[i] - 2 * tauo_p
188 Vd_p[i] * taui_p[i] * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[
190 break;
191 case _B_:
192 C_p += Cg * (VDD - Vd_p[i]) * (VDD * ((t0_p[i] * t0_p[i]) - t0_p[i] * (taui_p[i] + 2 * tauo_p[i]) - \
193 (taui_p[i] * taui_p[i]) + 3 * taui_p[i] * tauo_p[i]) + Vd_p[i] * taui_p[i] * (t0_p[i] - tauo_p[i]
194 (2 * taui_p[i] * (tauo_p[i] - t0_p[i]));
195 break;
196 case _C_:
197 C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) / (2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] -
198 C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[i]))
200 break;
201 case _D_:
202 C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (taui_p[i] - tauo_p[i]) * (taui_p[i] - tauo_p[i]) / \
204 break;
205 case _F_:
206 C_p += Cg * (Vd_p[i] - VDD) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) +
207 Vd_p[i] * taui_p[i]) / (2 * taui_p[i] * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
209 (ts_p[i] - 2 * tauo_p[i])) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + Vd_p[i] * taui_p
211 break;
212 case _G_:
213 C_p += Cg * (Vd_p[i] - VDD) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + Vd_p[i] * taui_p[i]) / (2 * taui_p[i]);
214 break;
215 case _E_:
216 default:
217 C_p += 0.0;
218 break;
219 }
220 }
221 else if ( ( i < p - 1 ) && ( i > 1 ) )
222 {
223 C_p += ( TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ i ] * TECH.Lmin ) * \
224 ( VDD + Vd_p[ i ] ) * ( Vd_p[ i ] - VDD ) * 0.5;
225 C_p += ( TECH.Cgs0_p * ( W_p[ i + 1 ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ i + 1 ] * njp * TECH.Lmin ) * \
226 ( VDD + Vd_p[ i ] ) * ( Vd_p[ i ] - VDD ) * 0.5;
227 }
228 else if ( (i == 1) && (i == p - 1) )
229 {
230 double Cov = TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p );
231 double Cg = Cov + 0.5 * TECH.Cox_p * W_p[ i ] * L_p[ i ];
232 int Op, SOp;
233 Op = Calct0ts1P( circuit, NP, SOp, NewWidth );
234 if ( Op == _E_ )
235 {
236 Op = SOp;
237 }
238 switch ( Op )
239 {
240 case _A_:
242 (ts_p[i] - 2 * tauo_p[i])) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + \
243 Vd_p[i] * taui_p[i]) / \
245 C_p += -Cg * ((VDD * VDD) * ((t0_p[i] * t0_p[i]) * ((ts_p[i] * ts_p[i]) - (taui_p[i] * taui_p[i])) + \
246 t0_p[i] * ((taui_p[i] * taui_p[i]) - (ts_p[i] * ts_p[i])) * (taui_p[i] + 2 * tauo_p[i]) + \
247 (ts_p[i] * ts_p[i]) * tauo_p[i] * (taui_p[i] + tauo_p[i]) - (taui_p[i] * taui_p[i]) * ((tau
248 taui_p[i] * tauo_p[i] + 2 * (tauo_p[i] * tauo_p[i]))) + \
249 VDD * Vd_p[i] * taui_p[i] * (t0_p[i] * ((ts_p[i] * ts_p[i]) - (taui_p[i] * taui_p[i])) - (ts_p[i] * ts_p[i
250 taui_p[i] * (2 * (taui_p[i] * taui_p[i]) - 3 * taui_p[i] * tauo_p[i] + 2 * (t
251 (Vd_p[i] * Vd_p[i]) * (taui_p[i] * taui_p[i]) * ((taui_p[i] * taui_p[i]) - 2 * taui_p[i] * tauo_p[i] + (ta
252 (2 * (taui_p[i] * taui_p[i]) * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
253 break;
254 case _AA_:
257 C_p += Cov * (Vd_p[i] - VDD) * (VDD * ((t0_p[i] * t0_p[i] * t0_p[i]) - (t0_p[i] * t0_p[i]) * (taui_p[i] + \
258 3 * tauo_p[i]) - t0_p[i] * ((taui_p[i] * taui_p[i]) - 4 * taui_p[i] * tauo_p[i] -
259 taui_p[i] * ((ts_p[i] * ts_p[i]) - 2 * ts_p[i] * tauo_p[i] + tauo_p[i] * (taui_p[
260 Vd_p[i] * taui_p[i] * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i
262 break;
263 case _B_:
264 C_p += Cg * (VDD - Vd_p[i]) * (VDD * ((t0_p[i] * t0_p[i]) - t0_p[i] * (taui_p[i] + 2 * tauo_p[i]) - \
265 (taui_p[i] * taui_p[i]) + 3 * taui_p[i] * tauo_p[i]) + Vd_p[i] * taui_p[i] * (t0_p
266 (2 * taui_p[i] * (tauo_p[i] - t0_p[i]));
267 break;
268 case _C_:
269 C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) / (2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] -
270 C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[i]))
272 break;
273 case _D_:
274 C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (taui_p[i] - tauo_p[i]) * (taui_p[i] - tauo_p[i]) / \
276 break;
277 case _F_:
278 C_p += Cg * (Vd_p[i] - VDD) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + \
279 Vd_p[i] * taui_p[i]) / (2 * taui_p[i] * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
281 (ts_p[i] - 2 * tauo_p[i])) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + Vd_p[i] * taui_p[i]) / \
283 break;
284 case _G_:
285 C_p += Cg * (Vd_p[i] - VDD) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + Vd_p[i] * taui_p[i]) / (2 * taui_p[i]);
286 break;
287 case _E_:
288 default:
289 C_p += 0.0;
290 break;
291 }
292 Op = Calct0tsnP( p, SOp);
293 if ( Op == _E_ )
294 {
295 Op = SOp;
296 }
297 Cov = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * TECH.Lmin );
298 Cg = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.66666 * TECH.Cox_p * W_p[ p ] * TECH.Lmin );
299 switch ( Op )
300 {
301 case _A_:
302 C_p += Cg * (VDD - Vs_p[p]) * (VDD - Vs_p[p]) * (ts_p[p] - taui_p[p]) * (ts_p[p] - taui_p[p]) / \
303 (2 * (t0_p[i] - taui_p[p]) * (t0_p[i] - taui_p[p]));
304 C_p += Cov * (VDD - Vs_p[p]) * (VDD - Vs_p[p])*\
305 ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * taui_p[p] - ts_p[p] * (ts_p[p] - 2 * taui_p[p])) / \
307 break;
308 case _B_:
309 C_p += Cov * (VDD - Vs_p[p]) * (VDD - Vs_p[p]) / 2;
310 break;
311 case _C_:
315 C_p += -Cg * (VDD - Vs_p[p]) * (VDD - Vs_p[p])*\
316 ((tc * tc) - 2 * tc * taui_p[p] - ts_p[p] * (ts_p[p] - 2 * taui_p[p])) / \
318 break;
319 case _D_:
321 ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * taui_p[p] - tc * (tc - 2 * taui_p[p])) / \
323 break;
324 case _E_:
325 default:
326 break;
327 }
328 }
329 else if ((i == p - 1) && (p > 2))
330 {
331 C_p += ( TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ i ] * TECH.Lmin ) * \
332 ( VDD + Vd_p[ i ] ) * ( Vd_p[ i ] - VDD ) * 0.5;
333 int Op, SOp;
334 Op = Calct0tsnP( p, SOp);
335 if ( Op == _E_ )
336 {
337 Op = SOp;
338 }
339 double Cov = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * TECH.Lmin );
340 double Cg = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.66666 * TECH.Cox_p * W_p[ p ] * TECH.Lmin );
341 switch ( Op )
342 {
343 case _A_:
344 C_p += Cg * (VDD - Vs_p[p]) * (VDD - Vs_p[p]) * (ts_p[p] - taui_p[p]) * (ts_p[p] - taui_p[p]) / \
349 break;
350 case _B_:
351 C_p += Cov * (VDD - Vs_p[p]) * (VDD - Vs_p[p]) / 2;
352 break;
353 case _C_:
357 C_p += -Cg * (VDD - Vs_p[p]) * (VDD - Vs_p[p])*\
358 ((tc * tc) - 2 * tc * taui_p[p] - ts_p[p] * (ts_p[p] - 2 * taui_p[p])) / \
360 break;
361 case _D_:
363 ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * taui_p[p] - tc * (tc - 2 * taui_p[p])) / \
365 break;
366 case _E_:
367 default:
368 break;
369 }
370 }
371 else if ( i == p )
372 {
373 double Cov = TECH.Cgd0_p * ( W_p[ p ] + TECH.XW_p );
374 double Cg = Cov + 0.5 * TECH.Cox_p * W_p[ p ] * L_p[ p ];
375 int Op, SOp;
376 Op = Calct0tsnP( p, SOp );
377 if ( Op == _E_ )
378 {
379 Op = SOp;
380 }
381 switch ( Op )
382 {
383 case _A_:
384 case _B_:
387 C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i])*\
388 ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[i])) / \
390 break;
391 case _C_:
392 C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i]*\
393 (ts_p[i] - 2 * tauo_p[i])) / (2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
394 C_p += -Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i])*\
395 ((tc * tc) - 2 * tc * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[i])) / \
397 break;
398 case _D_:
399 C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i])*\
400 ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - tc * (tc - 2 * tauo_p[i])) / \
402 break;
403 case _E_:
404 default:
405 break;
406
407 }
408 }
409 if ( C_p < 0.0 )
410 C_p *= -1;
411 Ecc += C_p;
412 }
413 return OK;
414 }
Calcstart.cc
6 #include "fast.h"
7
8 #define SIGN(a,b) ((b) >= 0.0 ? fabs(a) : -fabs(a))
9
10 ///
11 double Fast::CalcStartTime( const Circuit& circuit, unsigned int NP, unsigned int NC, double start, double end, TransistorType type, const double* NewW
12 {
13 int iter;
14 double a = start, b = end, c = end, d, e, min1, min2;
15 double fa, fb, fc, pp, q, r, s, tol1, xm, last;
16 double tol = TOL;
17
19 fb = t0N( circuit, NP, NC, b, NewWidth, RetCode );
21 fb = t0P( circuit, NP, NC, b, NewWidth, RetCode );
22 last = fb;
24 fa = t0N( circuit, NP, NC, a, NewWidth, RetCode );
26 fa = t0P( circuit, NP, NC, a, NewWidth, RetCode );
27 if ( ( fa > 0.0 && fb > 0.0 ) || ( fa < 0.0 && fb < 0.0 ) )
28 {
30 return 0.0;
31 }
32 fc = fb;
33 for ( iter = 1; iter <= ITERMAX; iter++ )
34 {
35 if ( ( fb > 0.0 && fc > 0.0 ) || ( fb < 0.0 && fc < 0.0 ) )
36 {
37 c = a;
38 fc = fa;
39 e = d = b - a;
40 }
41 if ( fabs ( fc ) < fabs ( fb ) )
42 {
43 a = b;
44 b = c;
45 c = a;
46 fa = fb;
47 fb = fc;
48 fc = fa;
49 }
50 tol1 = 2.0 * EPS * fabs ( b ) + 0.5 * tol;
51 xm = 0.5 * ( c - b );
52 if ( fabs ( xm ) <= tol1 || fb == 0.0 )
53 return b;
54 if ( fabs ( e ) >= tol1 && fabs ( fa ) > fabs ( fb ) )
55 {
56 s = fb / fa;
57 if ( a == c )
58 {
59 pp = 2.0 * xm * s;
60 q = 1.0 - s;
61 }
62 else
63 {
64 q = fa / fc;
65 r = fb / fc;
66 pp = s * ( 2.0 * xm * q * ( q - r ) - ( b - a ) * ( r - 1.0 ) );
67 q = ( q - 1.0 ) * ( r - 1.0 ) * ( s - 1.0 );
68 }
69 if ( pp > 0.0 )
70 q = -q;
71 pp = fabs ( pp );
72 min1 = 3.0 * xm * q - fabs ( tol1 * q );
73 min2 = fabs ( e * q );
74 if ( 2.0 * pp < ( min1 < min2 ? min1 : min2 ) )
75 {
76 e = d;
77 d = pp / q;
78 }
79 else
80 {
81 d = xm;
82 e = d;
83 }
84 }
85 else
86 {
87 d = xm;
88 e = d;
89 }
90 a = b;
91 fa = fb;
92 if ( fabs ( d ) > tol1 )
93 b += d;
94 else
95 b += SIGN ( tol1, xm );
97 fb = t0N( circuit, NP, NC, b, NewWidth, RetCode );
99 fb = t0P( circuit, NP, NC, b, NewWidth, RetCode );
100 }
102 return 0.0;
103 }
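The bracketing search in CalcStartTime has the structure of Brent's method: it keeps a bracket with a sign change, tries an interpolation step (secant when only two points are distinct, inverse quadratic otherwise), and falls back to bisection when the interpolated step is not acceptable. The following is a minimal, self-contained sketch of that scheme, not the original source. The name brentRoot, the std::function argument standing in for t0N/t0P, and the bool return convention are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>
#include <limits>

// Hypothetical helper (not in the original sources): Brent-style root search
// for f on [lo, hi]. Returns true and stores the root in `root` on success;
// returns false if f(lo) and f(hi) have the same sign or maxIter is exhausted.
bool brentRoot(const std::function<double(double)>& f, double lo, double hi,
               double tol, int maxIter, double& root)
{
    assert(tol > 0.0 && maxIter > 0);
    const double eps = std::numeric_limits<double>::epsilon();
    double a = lo, b = hi, c = hi;
    double fa = f(a), fb = f(b);
    if ((fa > 0.0 && fb > 0.0) || (fa < 0.0 && fb < 0.0))
        return false;                       // no sign change in the bracket
    double fc = fb, d = 0.0, e = 0.0;
    for (int iter = 0; iter < maxIter; ++iter) {
        if ((fb > 0.0 && fc > 0.0) || (fb < 0.0 && fc < 0.0)) {
            c = a; fc = fa; e = d = b - a;  // re-establish the bracket [b, c]
        }
        if (std::fabs(fc) < std::fabs(fb)) {
            a = b; b = c; c = a;            // keep b as the best estimate
            fa = fb; fb = fc; fc = fa;
        }
        const double tol1 = 2.0 * eps * std::fabs(b) + 0.5 * tol;
        const double xm = 0.5 * (c - b);    // bisection step
        if (std::fabs(xm) <= tol1 || fb == 0.0) { root = b; return true; }
        if (std::fabs(e) >= tol1 && std::fabs(fa) > std::fabs(fb)) {
            const double s = fb / fa;
            double p, q;
            if (a == c) {                   // secant step
                p = 2.0 * xm * s;
                q = 1.0 - s;
            } else {                        // inverse quadratic interpolation
                const double qa = fa / fc, r = fb / fc;
                p = s * (2.0 * xm * qa * (qa - r) - (b - a) * (r - 1.0));
                q = (qa - 1.0) * (r - 1.0) * (s - 1.0);
            }
            if (p > 0.0) q = -q;
            p = std::fabs(p);
            const double min1 = 3.0 * xm * q - std::fabs(tol1 * q);
            const double min2 = std::fabs(e * q);
            if (2.0 * p < std::min(min1, min2)) { e = d; d = p / q; }
            else { d = xm; e = d; }         // interpolation rejected: bisect
        } else {
            d = xm; e = d;                  // bounds shrinking too slowly: bisect
        }
        a = b; fa = fb;
        b += (std::fabs(d) > tol1) ? d : std::copysign(tol1, xm);
        fb = f(b);
    }
    return false;                           // iteration limit reached
}
```

In this sketch a failure is reported through the return value; the listing above instead returns 0.0 in the corresponding places.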
Calctst0N.cc
6 #include "fast.h"
/* t0: A=AA=F B=G C D
 * ts: A=F B=D=G AA=C */
14 int Fast::Calct0ts1N( const Circuit& circuit, unsigned int NP, int& SaveOpCondition, const double* NewWidth )
15 {
16 int OpCondition = _A_, LastOpCondition = 0;
17 double t0_bs, Vc, A_2_n, B_2_n, C_2_n, D_2_n, J_2_n, K_2_n, I_2_n, M_2_n;
18 double a, b, c, Cm1, Cm2, Cov, Cj, X, Y, alpha, beta, gamma, theta;
19
20 t0_bs = TECH.Vtn0 * taui_n[ 1 ] / VDD;
21 if ( taui_n[ 1 ] <= tauo_n[ 1 ] )
22 {
23 ts_n[ 1 ] = ( taui_n[ 1 ] - t0_n[ 1 ] ) * 0.5 + t0_n[ 1 ]; /* A */
24 }
25 else
26 {
27 ts_n[ 1 ] = ( tauo_n[ 1 ] - t0_n[ 1 ] ) * 0.5 + t0_n[ 1 ]; /* F */
28 }
29
30 Vc = TECH.Ec_n * L_n[ 1 ];
31 Cov = TECH.Cgd0_n * ( W_n[ 1 ] + TECH.XW_n );
32 Cm1 = Cov;
33 Cm2 = Cm1 + 0.5 * TECH.Cox_n * W_n[ 1 ] * L_n[ 1 ];
34 unsigned int pp = pathlist[ NP ].GetNumTranP();
35 const char* name = pathlist[ NP ].TransistorName( 0 );
36 int node;
38 int njn, ngn, njp, ngp, nc;
39 if ( circuit[ name ].Source() == 0 )
40 {
46 }
47 else if ( circuit[ name ].Drain() == 0 )
48 {
54 }
55 // Nmos
56 Cj = TECH.C_nj * Wjn * TECH.Df * pow ( 1 + Vd_n[ 1 ] / TECH.PB_n, -TECH.mj_n ) + \
57 TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * pow ( 1 + Vd_n[ 1 ] / TECH.PB_n, -TECH.mjsw_n );
58 Cj += TECH.Cgs0_n * ( Wjn + njn * TECH.XW_n );
59 // Pmos
60 if (pp > 0)
61 {
62 Cj += TECH.C_pj * Wjp * TECH.Df * pow ( 1 + ( VDD - Vd_n[ 1 ] ) / TECH.PB_p, -TECH.mj_p ) + \
63 TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * pow ( 1 + ( VDD - Vd_n[ 1 ] ) / TECH.PB_p, -TECH.mjsw_p );
64 Cj += TECH.Cgd0_p * ( Wjp + njp * TECH.XW_p );
65 }
66 // static
67 Cj += circuit.CapStaticGnd( node, nc );
68 Cj += circuit.CapStaticVdd( node, nc );
69 Cj += Wgn * TECH.Lmin * TECH.Cox_n;
70 Cj += Wgp * TECH.Lmin * TECH.Cox_p;
71 A_2_n = Vc * beta_n[ 1 ] * ( Vc - TECH.Vtn0 );
72 B_2_n = VDD * Vc * beta_n[ 1 ] / taui_n[ 1 ];
73 C_2_n = 2 * VDD / ( Vc * taui_n[ 1 ] );
74 D_2_n = ( Vc - 2 * TECH.Vtn0 ) / Vc;
75 J_2_n = Vc * Vd_n[ 1 ] * beta_n[ 1 ] * ( Vd_n[ 1 ] + 2 * TECH.Vtn0 ) / ( 2 * ( Vc + Vd_n[ 1 ] ) );
76 K_2_n = VDD * Vc * Vd_n[ 1 ] * beta_n[ 1 ] / ( taui_n[ 1 ] * ( Vc + Vd_n[ 1 ] ) );
77 I_2_n = ( Vc * Vc ) * beta_n[ 1 ] * ( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) * \
78 ( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) / 2;
79 M_2_n = K_2_n * taui_n[ 1 ] - J_2_n;
80 X = Cj + Cm2;
81 Y = Cj + Cov;
82 while ( OpCondition != LastOpCondition )
83 {
84 if ( LastOpCondition != 0 )
85 LastOpCondition = OpCondition;
86 else
87 LastOpCondition = _E_;
88 if ( ( t0_n[ 1 ] <= ts_n[ 1 ] ) &&
89 ( ts_n[ 1 ] <= taui_n[ 1 ] ) &&
90 ( taui_n[ 1 ] <= tauo_n[ 1 ] ) )
91 {
92 OpCondition = SaveOpCondition = _A_;
93 if ( OpCondition != LastOpCondition )
94 {
95 #ifdef SAT
96 b = ( 2 * ( Vd_n[ 1 ] * taui_n[ 1 ] * \
97 ( Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) - Vd_n[ 1 ] * tauo_n[ 1 ] ) - \
98 VDD * Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * ( t0_n[ 1 ] - tauo_n[ 1 ] ) ) ) / \
99 ( Vd_n[ 1 ] * Vd_n[ 1 ] * taui_n[ 1 ] );
100 c = ( 2 * Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * ( Vd_n[ 1 ] * tauo_n[ 1 ] + \
101 TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) - Vd_n[ 1 ] * Vd_n[ 1 ] * tauo_n[ 1 ] * tauo_n[ 1 ] ) / \
102 ( Vd_n[ 1 ] * Vd_n[ 1 ] );
103 if ( ( b * b + 4 * c ) >= 0 )
104 {
105 ts_n[ 1 ] = -( sqrt ( b * b + 4 * c ) + b ) * 0.5;
106 if ( ts_n[ 1 ] < 0 )
107 ts_n[ 1 ] = ( sqrt ( b * b + 4 * c ) - b ) * 0.5;
108 }
109 else
110 ts_n[ 1 ] = taui_n[ 1 ] * ( Vd_n[ 1 ] * tauo_n[ 1 ] + TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / \
111 ( Vd_n[ 1 ] * taui_n[ 1 ] - VDD * ( t0_n[ 1 ] - tauo_n[ 1 ] ) );
112 #else
115
116 #endif
117 }
118 }
119 else if ( ( t0_n[ 1 ] <= taui_n[ 1 ] ) &&
120 ( taui_n[ 1 ] <= ts_n[ 1 ] ) &&
121 ( ts_n[ 1 ] <= tauo_n[ 1 ] ) )
122 {
123 OpCondition = SaveOpCondition = _AA_;
125 {
126 #ifdef SAT
127 ts_n[ 1 ] = ( sqrt ( Vc ) * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * \
128 sqrt ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) + Vc * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / \
129 Vd_n[ 1 ] + tauo_n[ 1 ];
130 #else
131 ts_n[ 1 ] = ( VDD * ( t0_n[ 1 ] - tauo_n[ 1 ] ) + Vd_n[ 1 ] * tauo_n[ 1 ] + \
132 TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / Vd_n[ 1 ];
133 #endif
134 }
135 }
136 else if ( ( ts_n[ 1 ] <= t0_n[ 1 ] ) &&
137 ( t0_n[ 1 ] <= taui_n[ 1 ] ) &&
138 ( taui_n[ 1 ] <= tauo_n[ 1 ] ) )
139 {
140 OpCondition = SaveOpCondition = _B_;
142 {
143 #ifdef SAT
144 ts_n[ 1 ] = taui_n[ 1 ] * \
145 ( 2 * Vc * ( Vd_n[ 1 ] + TECH.Vtn0 ) + ( Vd_n[ 1 ] * Vd_n[ 1 ] ) ) / \
146 ( 2 * VDD * Vc );
147 #else
148 ts_n[ 1 ] = taui_n[ 1 ] * ( Vd_n[ 1 ] + TECH.Vtn0 ) / VDD;
149 #endif
150 a = 2 * ( Cm2 * VDD + J_2_n * taui_n[ 1 ] ) / ( K_2_n * taui_n[ 1 ]);
151 b = ( 6 * A_2_n * C_2_n * taui_n[1] * ( t0_bs - ts_n[ 1 ] ) + \
152 3 * B_2_n * C_2_n * taui_n[1] * ( t0_bs * t0_bs - ts_n[ 1 ] * ts_n[ 1 ] ) - \
153 4 * Vc * Vc * beta_n[ 1 ] * taui_n[1] * pow( C_2_n * t0_bs + D_2_n, 1.5 ) + \
154 4 * Vc * Vc * beta_n[ 1 ] * pow( C_2_n * ts_n[ 1 ] + D_2_n, 1.5 ) - \
155 3 * C_2_n * ts_n[ 1 ] * ( 2 * Cm2 * VDD - 2 * Cov * VDD + \
156 taui_n[1] * (2 * J_2_n - K_2_n * ts_n[1] ) ) ) / \
157 ( 3 * C_2_n * K_2_n * taui_n[ 1 ]);
158 if ( ( 4 * b + ( a * a ) ) >= 0 )
159 {
160 t0_n[ 1 ] = ( a - sqrt ( 4 * b + ( a * a ) ) ) / 2;
161 if ( t0_n[ 1 ] < 0 )
162 t0_n[ 1 ] = ( sqrt ( 4 * b + ( a * a ) ) + a ) / 2;
163 }
164 else
165 t0_n[ 1 ] = t0_bs;
166 }
167 }
168 else if ( ( taui_n[ 1 ] <= t0_n[ 1 ] ) &&
169 ( t0_n[ 1 ] <= ts_n[ 1 ] ) &&
170 ( ts_n[ 1 ] <= tauo_n[ 1 ] ) )
171 {
172 OpCondition = SaveOpCondition = _C_;
174 {
175 alpha = C_2_n * t0_bs + D_2_n;
176 beta = C_2_n * ts_n[ 1 ] + D_2_n;
177 theta = pow( alpha, 1.5 ) - pow( beta, 1.5 );
178 t0_n[ 1 ] = ( 6 * A_2_n * C_2_n * ( t0_bs - taui_n[ 1 ] ) + \
179 3 * B_2_n * C_2_n * ( t0_bs * t0_bs - taui_n[ 1 ] * taui_n[ 1 ] ) - \
180 2 * ( 2 * Vc * Vc * beta_n[ 1 ] * theta - \
181 3 * C_2_n * (Cov * VDD + I_2_n * taui_n[1]))) /
182 ( 6 * C_2_n * I_2_n );
183
184 #ifdef SAT
185 ts_n[ 1 ] = ( sqrt ( Vc ) * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * \
186 sqrt ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) + Vc * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / \
187 Vd_n[ 1 ] + tauo_n[ 1 ];
188 #else
189 ts_n[ 1 ] = ( VDD * ( t0_n[ 1 ] - tauo_n[ 1 ] ) + Vd_n[ 1 ] * tauo_n[ 1 ] + \
190 TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / Vd_n[ 1 ];
191 #endif
192 }
193 }
194 else if ( ( ts_n[ 1 ] <= taui_n[ 1 ] ) &&
195 ( taui_n[ 1 ] <= t0_n[ 1 ] ) &&
196 ( t0_n[ 1 ] <= tauo_n[ 1 ] ) )
197 {
198 OpCondition = SaveOpCondition = _D_;
200 {
201 #ifdef SAT
202 ts_n[ 1 ] = taui_n[ 1 ] * \
204 ( 2 * VDD * Vc );
205 #else
207 #endif
208 alpha = C_2_n * t0_bs + D_2_n;
209 beta = C_2_n * taui_n[ 1 ] + D_2_n;
210 gamma = C_2_n * ts_n[ 1 ] + D_2_n;
211 theta = -pow( alpha, 1.5 ) + pow( gamma, 1.5 );
212 t0_n[ 1 ] = ( 6 * A_2_n * C_2_n * taui_n[1] * ( t0_bs - ts_n[ 1 ] ) + \
213 3 * B_2_n * C_2_n * taui_n[1] * ( ( t0_bs * t0_bs ) - ( ts_n[ 1 ] * ts_n[ 1 ] ) ) + \
214 4 * Vc * Vc * beta_n[ 1 ] * taui_n[1] * theta - \
215 3 * C_2_n * ( 2 * Cm2 * VDD * (ts_n[1] - taui_n[1]) - \
216 2 * Cov * VDD * ts_n[1] + taui_n[1] * \
217 (2 * J_2_n * (ts_n[1]-taui_n[1]) + K_2_n * \
218 (taui_n[1] * taui_n[1] - ts_n[1] * ts_n[1]) - \
219 2 * M_2_n * taui_n[1]))) / \
220 ( 6 * C_2_n * M_2_n * taui_n[ 1 ] );
221 }
222 }
223 else if ( ( t0_n[ 1 ] <= ts_n[ 1 ] ) &&
224 ( ts_n[ 1 ] <= tauo_n[ 1 ] ) &&
225 ( tauo_n[ 1 ] <= taui_n[ 1 ] ) )
226 {
227 OpCondition = SaveOpCondition = _F_;
229 {
230 #ifdef SAT
231 b = ( 2 * ( Vd_n[ 1 ] * taui_n[ 1 ] * \
232 ( Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) - Vd_n[ 1 ] * tauo_n[ 1 ] ) - \

233 VDD * Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * ( t0_n[ 1 ] - tauo_n[ 1 ] ) ) ) / \
234 ( Vd_n[ 1 ] * Vd_n[ 1 ] * taui_n[ 1 ] );
235 c = ( 2 * Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * ( Vd_n[ 1 ] * tauo_n[ 1 ] + \
236 TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) - Vd_n[ 1 ] * Vd_n[ 1 ] * tauo_n[ 1 ] * tauo_n[ 1 ] ) / \
237 ( Vd_n[ 1 ] * Vd_n[ 1 ] );
238 if ( ( b * b + 4 * c ) >= 0 )
239 {
240 ts_n[ 1 ] = -( sqrt ( b * b + 4 * c ) + b ) * 0.5;
241 if ( ts_n[ 1 ] < 0 )
242 ts_n[ 1 ] = ( sqrt ( b * b + 4 * c ) - b ) * 0.5;
243 }
244 else
247
248 #else
251
252 #endif
253 }
254 }
255 else if ( ( ts_n[ 1 ] <= t0_n[ 1 ] ) &&
256 ( t0_n[ 1 ] <= tauo_n[ 1 ] ) &&
257 ( tauo_n[ 1 ] <= taui_n[ 1 ] ) )
258 {
259 OpCondition = SaveOpCondition = _G_;
261 {
262 #ifdef SAT
263 ts_n[ 1 ] = taui_n[ 1 ] * \
265 ( 2 * VDD * Vc );
266 #else
268 #endif
269 a = 2 * ( Cm2 * VDD + J_2_n * taui_n[ 1 ] ) / ( K_2_n * taui_n[ 1 ]);
270 b = ( 6 * A_2_n * C_2_n * taui_n[1] * ( t0_bs - ts_n[ 1 ] ) + \
271 3 * B_2_n * C_2_n * taui_n[1] * ( t0_bs * t0_bs - ts_n[ 1 ] * ts_n[ 1 ] ) - \
272 4 * Vc * Vc * beta_n[ 1 ] * taui_n[1] * pow( C_2_n * t0_bs + D_2_n, 1.5 ) + \
273 4 * Vc * Vc * beta_n[ 1 ] * pow( C_2_n * ts_n[ 1 ] + D_2_n, 1.5 ) - \
274 3 * C_2_n * ts_n[ 1 ] * ( 2 * Cm2 * VDD - 2 * Cov * VDD + \
275 taui_n[1] * (2 * J_2_n - K_2_n * ts_n[1] ) ) ) / \
276 ( 3 * C_2_n * K_2_n * taui_n[ 1 ]);
277 if ( ( 4 * b + ( a * a ) ) >= 0 )
278 {
279 t0_n[ 1 ] = ( a - sqrt ( 4 * b + ( a * a ) ) ) / 2;
280 if ( t0_n[ 1 ] < 0 )
281 t0_n[ 1 ] = ( sqrt ( 4 * b + ( a * a ) ) + a ) / 2;
282 }
283 else
284 t0_n[ 1 ] = t0_bs;
285 }
286 }
287 else
288 {
289 OpCondition = _E_;
290 }
291 }
292 return OpCondition;
293 }
294
295 ///
296 int Fast::Calct0tsnN( unsigned int n, int& SaveOpCondition )
297 {
299 double Vc, tc, X, Y, b, c;
300

302 ts_n[ n ] = ( A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
303 ( t0_n[ n ] - taui_n[ n ] ) * ( VDD * t0_n[ n ] + TECH.Vtn0 * ( tauo_n[ n ] - t0_n[ n ] ) ) ) / \
304 ( A_1_n[ n ] * Vs_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
305 VDD * ( t0_n[ n ] - taui_n[ n ] ) );
306 if ( taui_n[ n ] < tauo_n[ n ] )
307 {
308 SaveOpCondition = _A_;
309 }
310 else
311 {
312 tc = ( VDD * tauo_n[ n ] * ( t0_n[ n ] - taui_n[ n ] ) + \
313 Vs_n[ n ] * taui_n[ n ] * ( tauo_n[ n ] - t0_n[ n ] ) ) / \
314 ( VDD * ( t0_n[ n ] - taui_n[ n ] ) + Vs_n[ n ] * ( tauo_n[ n ] - t0_n[ n ] ) );
315 SaveOpCondition = _C_;
316 }
318 {
321 else
323 if ( ( taui_n[ n ] <= tauo_n[ n ] ) &&
324 ( ts_n[ n ] <= taui_n[ n ] ) &&
325 ( t0_n[ n ] <= ts_n[ n ] ) )
326 {
329 {
330 #ifdef SAT
331 X = t0_n[ n ] - taui_n[ n ];
332 Y = tauo_n[ n ] - t0_n[ n ];
333 b = -( 2 * ( VDD * X * X * ( Vc * Y + VDD * tauo_n[ n ] ) + \
334 Vs_n[ n ] * X * Y * ( VDD * ( taui_n[ n ] + tauo_n[ n ] ) - A_1_n[ n ] * Vc * Y ) + \
335 Vs_n[ n ] * Vs_n[ n ] * taui_n[ n ] * Y * Y ) ) / \
336 ( ( VDD * X + Vs_n[ n ] * Y ) * ( VDD * X + Vs_n[ n ] * Y ) );
337 c = -( X * X * ( 2 * Y * Y * Vc * TECH.Vtn0 + 2 * Y * VDD * Vc * t0_n[ n ] + VDD * VDD * tauo_n[ n ] * tauo_n[ n ] ) + \
338 2 * Vs_n[ n ] * taui_n[ n ] * X * Y * ( VDD * tauo_n[ n ] - A_1_n[ n ] * Vc * Y ) + \
339 Vs_n[ n ] * Vs_n[ n ] * taui_n[ n ] * taui_n[ n ] * Y * Y ) / \
341 if ( ( b * b + 4 * c ) >= 0 )
342 {
343 ts_n[ n ] = -( sqrt ( b * b + 4 * c ) + b ) * 0.5;
344 if ( ( ts_n[ n ] < 0 ) || ( ts_n[ n ] < t0_n[ n ] ) || ( ts_n[ n ] > tauo_n[ n ] ) )
345 ts_n[ n ] = ( sqrt ( b * b + 4 * c ) - b ) * 0.5;
350 VDD * ( t0_n[ n ] - taui_n[ n ] ) );
351 }
352 else
356 VDD * ( t0_n[ n ] - taui_n[ n ] ) );
357 #else
361 VDD * ( t0_n[ n ] - taui_n[ n ] ) );
362 #endif
363 }
364 }
365 else if ( ( ts_n[ n ] <= tauo_n[ n ] ) &&
366 ( taui_n[ n ] <= ts_n[ n ] ) &&
367 ( t0_n[ n ] <= taui_n[ n ] ) )
368 {

371 {
372 #ifdef SAT
373 ts_n[ n ] = tauo_n[ n ] - ( tauo_n[ n ] - t0_n[ n ] ) * \
374 ( sqrt ( Vc * ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) ) - Vc ) / VDD;
375 #else
376 ts_n[ n ] = TECH.Vtn0 * ( tauo_n[ n ] - t0_n[ n ] ) / VDD + t0_n[ n ];
377 #endif
378 }
379 }
380 else if ( ( tauo_n[ n ] <= taui_n[ n ] ) &&
381 ( ts_n[ n ] < tc ) &&
382 ( t0_n[ n ] <= ts_n[ n ] ) )
383 {
/* (ts_n[n] < tc), not '<=' !!! */
387 {
388 #ifdef SAT
389 X = t0_n[ n ] - taui_n[ n ];
390 Y = tauo_n[ n ] - t0_n[ n ];
391 b = -( 2 * ( VDD * X * X * ( Vc * Y + VDD * tauo_n[ n ] ) + \
392 Vs_n[ n ] * X * Y * ( VDD * ( taui_n[ n ] + tauo_n[ n ] ) - A_1_n[ n ] * Vc * Y ) + \
393 Vs_n[ n ] * Vs_n[ n ] * taui_n[ n ] * Y * Y ) ) / \
395 c = -( X * X * ( 2 * Y * Y * Vc * TECH.Vtn0 + 2 * Y * VDD * Vc * t0_n[ n ] + VDD * VDD * tauo_n[ n ] * tauo_n[ n ] ) + \
396 2 * Vs_n[ n ] * taui_n[ n ] * X * Y * ( VDD * tauo_n[ n ] - A_1_n[ n ] * Vc * Y ) + \
397 Vs_n[ n ] * Vs_n[ n ] * taui_n[ n ] * taui_n[ n ] * Y * Y ) / \
399 if ( ( b * b + 4 * c ) >= 0 )
400 {
401 ts_n[ n ] = -( sqrt ( b * b + 4 * c ) + b ) * 0.5;
403 ts_n[ n ] = ( sqrt ( b * b + 4 * c ) - b ) * 0.5;
408 VDD * ( t0_n[ n ] - taui_n[ n ] ) );
409 }
410 else
411 {
415 VDD * ( t0_n[ n ] - taui_n[ n ] ) );
416 if ( ts_n[ n ] > tauo_n[ n ] )
417 ts_n[ n ] = tc;
418 }
419 #else
423 VDD * ( t0_n[ n ] - taui_n[ n ] ) );
424 if ( ts_n[ n ] > tauo_n[ n ] )
425 ts_n[ n ] = tc;
426 #endif
427 }
428 }
429 else if ( ( tauo_n[ n ] <= taui_n[ n ] ) &&
430 ( tc <= ts_n[ n ] ) &&
431 ( t0_n[ n ] <= ts_n[ n ] ) )
432 {
435 {
436 ts_n[ n ] = tc;
437 }
438 }
439 else
440 {
442 }
443 }
445 }
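Several SAT branches in Calct0ts1N and Calct0tsnN update ts_n from a quadratic of the form t^2 + b*t - c = 0. The root -(sqrt(b*b + 4*c) + b)/2 is tried first, the other root is used if the first one is negative, and a different closed-form estimate is used when the discriminant is negative. A minimal sketch of the root selection alone follows. It assumes a hypothetical helper pickRoot that is not part of the original sources, and it leaves out the extra range checks against t0_n and tauo_n used in Calct0tsnN.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not in the original sources): select a root of
// t^2 + b*t - c = 0 the way the ts_n updates do. Returns false when the
// discriminant is negative, so the caller can fall back to another estimate.
bool pickRoot(double b, double c, double& t)
{
    assert(std::isfinite(b) && std::isfinite(c));
    const double disc = b * b + 4.0 * c;
    if (disc < 0.0)
        return false;               // no real root
    const double s = std::sqrt(disc);
    t = -(s + b) * 0.5;             // smaller root first
    if (t < 0.0)
        t = (s - b) * 0.5;          // negative: take the larger root instead
    return true;
}
```

For example, b = 1 and c = 2 give the roots -2 and 1, so the sketch selects 1.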
Calctst0P.cc
6 #include "fast.h"
/* t0: A=AA=F B=G C D
 * ts: A=F B=D=G AA=C */
15 int Fast::Calct0ts1P( const Circuit& circuit, unsigned int NP, int& SaveOpCondition, const double* NewWidth )
16 {
18 double t0_bs, Vc, A_2_p, B_2_p, C_2_p, D_2_p, G_2_p, J_2_p, K_2_p, M_2_p;
19 double b, c, Cm1, Cm2, Cov, Cj, X, XX, Y, alpha, gamma, theta;
20
21 t0_bs = -TECH.Vtp0 * taui_p[ 1 ] / VDD;
22 if ( taui_p[ 1 ] <= tauo_p[ 1 ] )
23 {
24 ts_p[ 1 ] = ( taui_p[ 1 ] - t0_p[ 1 ] ) * 0.5 + t0_p[ 1 ]; /* A */
25 }
26 else
27 {
28 ts_p[ 1 ] = ( tauo_p[ 1 ] - t0_p[ 1 ] ) * 0.5 + t0_p[ 1 ]; /* F */
29 }
30
31 Vc = TECH.Ec_p * L_p[ 1 ];
32 Cov = TECH.Cgd0_p * ( W_p[ 1 ] + TECH.XW_p );
33 Cm1 = Cov;
34 Cm2 = Cm1 + 0.5 * TECH.Cox_p * W_p[ 1 ] * L_p[ 1 ];
35 unsigned int nn = pathlist[ NP ].GetNumTranN();
37 int node;
38 int njn, ngn, njp, ngp, nc;
39 if (nn > 0)
40 {
41 const char* name = pathlist[ NP ].TransistorName( nn - 1 );
43 {
49 }
51 {
57 }
58 // Nmos
59 Cj = TECH.C_nj * Wjn * TECH.Df * pow ( 1 + Vd_p[ 1 ] / TECH.PB_n, -TECH.mj_n ) + \

60 TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * pow ( 1 + Vd_p[ 1 ] / TECH.PB_n, -TECH.mjsw_n );
62 }
63 // Pmos
64 Cj += TECH.C_pj * Wjp * TECH.Df * pow ( 1 + ( VDD - Vd_p[ 1 ] ) / TECH.PB_p, -TECH.mj_p ) + \
65 TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * pow ( 1 + ( VDD - Vd_p[ 1 ] ) / TECH.PB_p, -TECH.mjsw_p );
67 // static
70 Cj += Wgn * TECH.Lmin * TECH.Cox_p;
72 A_2_p = Vc * beta_p[ 1 ] * ( Vc + TECH.Vtp0 );
73 B_2_p = VDD * Vc * beta_p[ 1 ] / taui_p[ 1 ];
74 C_2_p = ( Vc + 2 * TECH.Vtp0 ) / Vc;
75 D_2_p = 2 * VDD / ( Vc * taui_p[ 1 ] );
76 G_2_p = Vc * Vc * beta_p[ 1 ] * SQRT ( ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) / Vc ) - \
77 VDD * Vc * beta_p[ 1 ] - Vc * Vc * beta_p[ 1 ] - Vc * TECH.Vtp0 * beta_p[ 1 ];
78 K_2_p = VDD * Vc * beta_p[ 1 ] * ( Vd_p[ 1 ] - VDD ) / ( taui_p[ 1 ] * ( VDD + Vc - Vd_p[ 1 ] ) );
79 J_2_p = Vc * beta_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] - 2 * TECH.Vtp0 ) / ( 2 * ( VDD + Vc - Vd_p[ 1 ] ) );
80 M_2_p = Vc * beta_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * ( VDD + Vd_p[ 1 ] + 2 * TECH.Vtp0 ) / ( 2 * ( Vd_p[ 1 ] - Vc ) );
81 X = Cj + Cm2;
82 Y = Cj + Cov;
83 alpha = pow( ( D_2_p * t0_bs + C_2_p ), 1.5 );
84 gamma = pow( ( D_2_p * taui_p[ 1 ] + C_2_p ), 1.5 );
86 {
89 else
91 if ( ( t0_p[ 1 ] <= ts_p[ 1 ] ) &&
92 ( ts_p[ 1 ] <= taui_p[ 1 ] ) &&
93 ( taui_p[ 1 ] <= tauo_p[ 1 ] ) )
94 {
97 {
98 XX = t0_p[ 1 ] - tauo_p[ 1 ];
99 #ifdef SAT
100
101 b = ( 2 * ( VDD * VDD * taui_p[ 1 ] * tauo_p[ 1 ] + VDD * ( Vc * ( XX - taui_p[ 1 ] ) * XX - \
102 2 * Vd_p[ 1 ] * taui_p[ 1 ] * tauo_p[ 1 ] ) + \
103 Vd_p[ 1 ] * taui_p[ 1 ] * ( Vc * XX + Vd_p[ 1 ] * tauo_p[ 1 ] ) ) ) / \
104 ( taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
105 c = ( VDD * VDD * tauo_p[ 1 ] * tauo_p[ 1 ] - 2 * VDD * tauo_p[ 1 ] * ( Vc * XX + Vd_p[ 1 ] * tauo_p[ 1 ] ) + \
106 2 * Vc * XX * ( Vd_p[ 1 ] * tauo_p[ 1 ] - TECH.Vtp0 * XX ) + Vd_p[ 1 ] * tauo_p[ 1 ] * Vd_p[ 1 ] * tauo_p[ 1 ] ) /
107 ( ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
108 if ( ( b * b - 4 * c ) >= 0 )
109 {
110 ts_p[ 1 ] = ( b - sqrt ( b * b - 4 * c ) ) * 0.5;
111 if ( ts_p[ 1 ] < 0 )
112 ts_p[ 1 ] = ( b + sqrt ( b * b - 4 * c ) ) * 0.5;
113 }
114 else
115 ts_p[ 1 ] = -taui_p[ 1 ] * ( ( VDD - Vd_p[ 1 ] ) * tauo_p[ 1 ] + \
116 TECH.Vtp0 * XX ) / \
117 ( VDD * ( XX - taui_p[ 1 ] ) + \
118 Vd_p[ 1 ] * taui_p[ 1 ] );
119 #else
121 TECH.Vtp0 * XX ) / \
122 ( VDD * ( XX - taui_p[ 1 ] ) + \
123 Vd_p[ 1 ] * taui_p[ 1 ] );
124 #endif
125 }
126 }
127 else if ( ( t0_p[ 1 ] <= taui_p[ 1 ] ) &&
128 ( taui_p[ 1 ] <= ts_p[ 1 ] ) &&

129 ( ts_p[ 1 ] <= tauo_p[ 1 ] ) )
130 {
131 OpCondition = SaveOpCondition = _AA_;
133 {
134 #ifdef SAT
135 ts_p[ 1 ] = tauo_p[ 1 ] - ( ( tauo_p[ 1 ] - t0_p[ 1 ] ) * \
136 ( sqrt ( Vc * ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) ) - Vc ) ) / \
137 ( VDD - Vd_p[ 1 ] );
138 #else
139 ts_p[ 1 ] = ( VDD * t0_p[ 1 ] - Vd_p[ 1 ] * tauo_p[ 1 ] + TECH.Vtp0 * ( t0_p[ 1 ] - tauo_p[ 1 ] ) ) / \
140 ( VDD - Vd_p[ 1 ] );
141 #endif
142 }
143 }
144 else if ( ( ts_p[ 1 ] <= t0_p[ 1 ] ) &&
145 ( t0_p[ 1 ] <= taui_p[ 1 ] ) &&
146 ( taui_p[ 1 ] <= tauo_p[ 1 ] ) )
147 {
150 {
151 #ifdef SAT
152 ts_p[ 1 ] = taui_p[ 1 ] * ( VDD * VDD + 2 * VDD * ( Vc - Vd_p[ 1 ] ) - 2 * Vc * ( Vd_p[ 1 ] + TECH.Vtp0 ) + \
153 Vd_p[ 1 ] * Vd_p[ 1 ] ) / ( 2 * VDD * Vc );
154 #else
155 ts_p[ 1 ] = taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] - TECH.Vtp0 ) / VDD;
156 #endif
157 theta = pow( ( D_2_p * ts_p[ 1 ] + C_2_p ), 1.5 );
158 b = 2 * ( Cm2 * VDD + J_2_p * taui_p[ 1 ] ) / \
159 ( K_2_p * taui_p[ 1 ] );
160 c = ( 6 * A_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs - ts_p[ 1 ] ) + \
161 3 * B_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs * t0_bs - ts_p[ 1 ] * ts_p[ 1 ] ) - \
162 4 * Vc * Vc * beta_p[ 1 ] * taui_p[ 1 ] * alpha + \
163 4 * Vc * Vc * beta_p[ 1 ] * taui_p[ 1 ] * theta - \
164 3 * D_2_p * ts_p[ 1 ] * \
165 ( 2 * Cm2 * VDD - 2 * Cov * VDD + taui_p[ 1 ] * \
166 ( 2 * J_2_p + K_2_p * ts_p[ 1 ] ) ) ) / \
167 ( 3 * D_2_p * K_2_p * taui_p[ 1 ] );
168 if ( ( b * b - 4 * c ) >= 0 )
169 {
170 t0_p[ 1 ] = ( sqrt ( ( b * b ) - 4 * c ) - b ) * 0.5;
171 if ( t0_p[ 1 ] < 0 )
172 t0_p[ 1 ] = -( sqrt ( ( b * b ) - 4 * c ) + b ) * 0.5;
173 }
174 else
175 t0_p[ 1 ] = t0_bs;
176 }
177 }
178 else if ( ( taui_p[ 1 ] <= t0_p[ 1 ] ) &&
179 ( t0_p[ 1 ] <= ts_p[ 1 ] ) &&
180 ( ts_p[ 1 ] <= tauo_p[ 1 ] ) )
181 {
184 {
185
186 t0_p[ 1 ] = -( 6 * A_2_p * D_2_p * ( t0_bs - taui_p[ 1 ] ) + \
187 3 * B_2_p * D_2_p * ( t0_bs * t0_bs - taui_p[ 1 ] * taui_p[ 1 ] ) - \
188 2 * ( 2 * Vc * Vc * beta_p[ 1 ] * ( alpha - gamma ) - \
189 3 * D_2_p * (Cov * VDD - G_2_p * taui_p[ 1 ]))) / \
190 ( 6 * D_2_p * G_2_p );
191 #ifdef SAT
192 ts_p[ 1 ] = tauo_p[ 1 ] - ( ( tauo_p[ 1 ] - t0_p[ 1 ] ) * \
193 ( sqrt ( Vc * ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) ) - Vc ) ) / \
194 ( VDD - Vd_p[ 1 ] );
195 #else
196 ts_p[ 1 ] = ( VDD * t0_p[ 1 ] - Vd_p[ 1 ] * tauo_p[ 1 ] + TECH.Vtp0 * ( t0_p[ 1 ] - tauo_p[ 1 ] ) ) / \
197 ( VDD - Vd_p[ 1 ] );

198 #endif
199 }
200 }
201 else if ( ( ts_p[ 1 ] <= taui_p[ 1 ] ) &&
202 ( taui_p[ 1 ] <= t0_p[ 1 ] ) &&
203 ( t0_p[ 1 ] <= tauo_p[ 1 ] ) )
204 {
207 {
208 #ifdef SAT
210 ( Vd_p[ 1 ] * Vd_p[ 1 ] ) ) / ( 2 * VDD * Vc );
211 #else
212 ts_p[ 1 ] = taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] + TECH.Vtp0 ) / VDD;
213 #endif
215 t0_p[ 1 ] = -( 6 * A_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs - ts_p[ 1 ] ) + \
216 3 * B_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs * t0_bs - ts_p[ 1 ] * ts_p[1]) - \
217 4 * Vc * Vc * beta_p[ 1 ] * ( alpha - theta ) - \
218 3 * D_2_p * (2 * Cm2 * VDD * (ts_p[1] - taui_p[1]) - \
219 2 * Cov * VDD * ts_p[1] + \
220 taui_p[1] * (2 * J_2_p * (ts_p[1] - taui_p[1]) + \
221 K_2_p * (ts_p[1] * ts_p[1] - taui_p[1] * taui_p[1]) + \
222 2 * M_2_p * taui_p[1] ))) / \
223 ( 6 * D_2_p * M_2_p * taui_p[ 1 ] );
224 }
225 }
226 else if ( ( t0_p[ 1 ] <= ts_p[ 1 ] ) &&
227 ( ts_p[ 1 ] <= tauo_p[ 1 ] ) &&
228 ( tauo_p[ 1 ] <= taui_p[ 1 ] ) )
229 {
230 OpCondition = SaveOpCondition = _F_;
232 {
233 X = t0_p[ 1 ] - tauo_p[ 1 ];
234 #ifdef SAT
235
236 b = ( 2 * ( VDD * VDD * taui_p[ 1 ] * tauo_p[ 1 ] + VDD * ( Vc * ( X - taui_p[ 1 ] ) * X - \
237 2 * Vd_p[ 1 ] * taui_p[ 1 ] * tauo_p[ 1 ] ) + \
238 Vd_p[ 1 ] * taui_p[ 1 ] * ( Vc * X + Vd_p[ 1 ] * tauo_p[ 1 ] ) ) ) / \
239 ( taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
240 c = ( VDD * VDD * tauo_p[ 1 ] * tauo_p[ 1 ] - 2 * VDD * tauo_p[ 1 ] * ( Vc * X + Vd_p[ 1 ] * tauo_p[ 1 ] ) + \
241 2 * Vc * X * ( Vd_p[ 1 ] * tauo_p[ 1 ] - TECH.Vtp0 * X ) + Vd_p[ 1 ] * tauo_p[ 1 ] * Vd_p[ 1 ] * tauo_p[ 1 ] ) / \
242 ( ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
243 if ( ( b * b - 4 * c ) >= 0 )
244 {
245 ts_p[ 1 ] = ( b - sqrt ( b * b - 4 * c ) ) * 0.5;
246 if ( ts_p[ 1 ] < 0 )
247 ts_p[ 1 ] = ( b + sqrt ( b * b - 4 * c ) ) * 0.5;
248 }
249 else
251 TECH.Vtp0 * X ) / \
252 ( VDD * ( X - taui_p[ 1 ] ) + \
253 Vd_p[ 1 ] * taui_p[ 1 ] );
254 #else
256 TECH.Vtp0 * X ) / \
257 ( VDD * ( X - taui_p[ 1 ] ) + \
258 Vd_p[ 1 ] * taui_p[ 1 ] );
259 #endif
260
261 }
262 }
263 else if ( ( ts_p[ 1 ] <= t0_p[ 1 ] ) &&
264 ( t0_p[ 1 ] <= tauo_p[ 1 ] ) &&
265 ( tauo_p[ 1 ] <= taui_p[ 1 ] ) )
266 {
267 OpCondition = SaveOpCondition = _G_;
269 {
270 #ifdef SAT
272 Vd_p[ 1 ] * Vd_p[ 1 ] ) / ( 2 * VDD * Vc );
273 #else
274 ts_p[ 1 ] = taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] - TECH.Vtp0 ) / VDD;
275 #endif
277 b = 2 * ( Cm2 * VDD + J_2_p * taui_p[ 1 ] ) / \
278 ( K_2_p * taui_p[ 1 ] );
279 c = ( 6 * A_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs - ts_p[ 1 ] ) + \
280 3 * B_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs * t0_bs - ts_p[ 1 ] * ts_p[ 1 ] ) - \
281 4 * Vc * Vc * beta_p[ 1 ] * taui_p[ 1 ] * alpha + \
282 4 * Vc * Vc * beta_p[ 1 ] * taui_p[ 1 ] * theta - \
283 3 * D_2_p * ts_p[ 1 ] * \
284 ( 2 * Cm2 * VDD - 2 * Cov * VDD + taui_p[ 1 ] * \
285 ( 2 * J_2_p + K_2_p * ts_p[ 1 ] ) ) ) / \
286 ( 3 * D_2_p * K_2_p * taui_p[ 1 ] );
287 if ( ( b * b - 4 * c ) >= 0 )
288 {
289 t0_p[ 1 ] = ( sqrt ( ( b * b ) - 4 * c ) - b ) * 0.5;
290 if ( t0_p[ 1 ] < 0 )
291 t0_p[ 1 ] = -( sqrt ( ( b * b ) - 4 * c ) + b ) * 0.5;
292 }
293 else
294 t0_p[ 1 ] = t0_bs;
295 }
296 else
297 {
299 }
300 }
301 }
303 }
304
305 ///
306 int Fast::Calct0tsnP( unsigned int p, int& SaveOpCondition )
307 {
309 double Vc, tc, X, Y, H, K, det, alpha, beta;
310
312 ts_p[ p ] = ( A_1_p[ p ] * ( t0_p[ p ] - tauo_p[ p ] ) * ( VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ] ) + \
313 ( t0_p[ p ] - taui_p[ p ] ) * ( B_1_p[ p ] * ( t0_p[ p ] - tauo_p[ p ] ) + \
314 VDD * t0_p[ p ] - Vd_p[ p ] * tauo_p[ p ] ) ) / \
315 ( A_1_p[ p ] * ( VDD - Vs_p[ p ] ) * ( t0_p[ p ] - tauo_p[ p ] ) + \
316 ( VDD - Vd_p[ p ] ) * ( t0_p[ p ] - taui_p[ p ] ) );
317 if ( taui_p[ p ] < tauo_p[ p ] )
318 {
319 SaveOpCondition = _A_;
320 }
321 else
322 {
323 tc = ( VDD * tauo_p[ p ] * ( t0_p[ p ] - taui_p[ p ] ) + \
324 Vs_p[ p ] * taui_p[ p ] * ( tauo_p[ p ] - t0_p[ p ] ) ) / \
325 ( VDD * ( t0_p[ p ] - taui_p[ p ] ) + Vs_p[ p ] * ( tauo_p[ p ] - t0_p[ p ] ) );
326 SaveOpCondition = _C_;
327 }
328 X = t0_p[ p ] - taui_p[ p ];
329 Y = tauo_p[ p ] - t0_p[ p ];
330 alpha = VDD - Vs_p[ p ];
331 beta = VDD - Vd_p[ p ];
333 {

336 else
338 if ( ( taui_p[ p ] <= tauo_p[ p ] ) &&
339 ( ts_p[ p ] <= taui_p[ p ] ) &&
340 ( t0_p[ p ] <= ts_p[ p ] ) )
341 {
344 {
345 #ifdef SAT
346 double AAA, BBB;
347 AAA = ( X * beta + Y * alpha ) / ( X * Y );
348 BBB = -( VDD * t0_p[ p ] * ( X + Y ) - Vd_p[ p ] * X * tauo_p[ p ] - Vs_p[ p ] * Y * taui_p[ p ] ) / ( X * Y );
349 AAA /= Vc;
350 BBB /= Vc;
351 BBB -= 1.0;
352 H = 2 * ( A_1_p[ p ] + 1 ) * ( Vs_p[ p ] - VDD ) / ( Vc * X );
353 K = ( 2 * A_1_p[ p ] * ( VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ] ) + \
354 2 * B_1_p[ p ] * X + 2 * VDD * t0_p[ p ] + Vc * X - 2 * Vs_p[ p ] * taui_p[ p ] ) / \
355 ( Vc * X );
356 det = 4 * K * AAA * AAA - 4 * H * AAA * BBB + H * H;
357 if ( det >= 0 )
358 {
359 ts_p[ p ] = ( SQRT ( det ) - 2 * AAA * BBB + H ) / ( 2 * AAA * AAA );
360 if ( ( ts_p[ p ] < 0 ) || ( ts_p[ p ] < t0_p[ p ] ) || ( ts_p[ p ] > tauo_p[ p ] ) )
361 ts_p[ p ] = -( SQRT ( det ) + 2 * AAA * BBB - H ) / ( 2 * AAA * AAA );
362 }
363 else
369 #else
375 #endif
376 }
377 }
378 else if ( ( ts_p[ p ] <= tauo_p[ p ] ) &&
379 ( taui_p[ p ] <= ts_p[ p ] ) &&
380 ( t0_p[ p ] <= taui_p[ p ] ) )
381 {
384 {
385 #ifdef SAT
386
387 ts_p[ p ] = tauo_p[ p ] + ( ( t0_p[ p ] - tauo_p[ p ] ) * \
388 ( sqrt ( Vc * ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) ) - Vc ) ) / ( VDD - Vd_p[ p ] );
389 #else
390 ts_p[ p ] = ( VDD * t0_p[ p ] - Vd_p[ p ] * tauo_p[ p ] + \
391 TECH.Vtp0 * ( t0_p[ p ] - tauo_p[ p ] ) ) / \
392 ( VDD - Vd_p[ p ] );
393 #endif
394 }
395 }
396 else if ( ( tauo_p[ p ] <= taui_p[ p ] ) &&
397 ( ts_p[ p ] < tc ) &&
398 ( t0_p[ p ] <= ts_p[ p ] ) )
399 {
/* (ts_p[p] < tc), not '<=' !!! */
403 {
404 #ifdef SAT
405 double AAA, BBB;
406 AAA = ( X * beta + Y * alpha ) / ( X * Y );
407 BBB = -( VDD * t0_p[ p ] * ( X + Y ) - Vd_p[ p ] * X * tauo_p[ p ] - Vs_p[ p ] * Y * taui_p[ p ] ) / ( X * Y );
408 AAA /= Vc;
409 BBB /= Vc;
410 BBB -= 1.0;
411 H = 2 * ( A_1_p[ p ] + 1 ) * ( Vs_p[ p ] - VDD ) / ( Vc * X );
412 K = ( 2 * A_1_p[ p ] * ( VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ] ) + \
413 2 * B_1_p[ p ] * X + 2 * VDD * t0_p[ p ] + Vc * X - 2 * Vs_p[ p ] * taui_p[ p ] ) / \
414 ( Vc * X );
415 det = 4 * K * AAA * AAA - 4 * H * AAA * BBB + H * H;
416 if ( det >= 0 )
417 {
418 ts_p[ p ] = ( SQRT ( det ) - 2 * AAA * BBB + H ) / ( 2 * AAA * AAA );
419 if ( ( ts_p[ p ] < 0 ) || ( ts_p[ p ] < t0_p[ p ] ) || ( ts_p[ p ] > tauo_p[ p ] ) )
420 ts_p[ p ] = -( SQRT ( det ) + 2 * AAA * BBB - H ) / ( 2 * AAA * AAA );
421 }
422 else
428 #else
429
435 #endif
436 }
437 }
438 else if ( ( tauo_p[ p ] <= taui_p[ p ] ) &&
439 ( tc <= ts_p[ p ] ) &&
440 ( t0_p[ p ] <= ts_p[ p ] ) )
441 {
444 {
445 ts_p[ p ] = tc;
446 }
447 }
448 else
449 {
451 }
452 }
454 }
Delay.cc
6 #include "fast.h"
7
8
9 ///
10 double Fast::CalcDelay( const Circuit& circuit,
11 unsigned int NP,
12 unsigned int NC,
13 unsigned int n,
14 unsigned int p,
16 double tin,
17 TransitionType TOut,
18 int& RetCode )
19 {
20
21
22 double t0_bs;
24 {
25 tauo_n[ i ] = tin;
26 }
28 {
29 tauo_p[ i ] = tin;
30 }
31 taui_n[ 1 ] = tin;
32 taui_p[ 1 ] = tin;
33 switch ( TOut )
34 {
35 case FALL: // n chain
36 t0_bs = TECH.Vtn0 * tin / VDD;
37 t0_n[ 1 ] = CalcStartTime( circuit, NP, NC, t0_bs, tauo_n[ 1 ], NMOS, NewWidth, RetCode );
39 t0_n[ 1 ] = t0_bs;
40 //return 0.0;
41 RetCode = IterSol( circuit, NMOS, NP, NC, n, p, NewWidth );
43 return 0.0;
44 else
45 return ( tauo_n[ n ] + t0_n[ n ] - tin );
46 break;
47 case RISE: // p chain
48 t0_bs = -TECH.Vtp0 * tin / VDD;
49 t0_p[ 1 ] = CalcStartTime( circuit, NP, NC, t0_bs, tauo_p[ 1 ], PMOS, NewWidth, RetCode );
51 t0_p[ 1 ] = t0_bs;
52 //return 0.0;
53 RetCode = IterSol( circuit, PMOS, NP, NC, n, p, NewWidth );
55 return 0.0;
56 else
57 return ( tauo_p[ p ] + t0_p[ p ] - tin );
58 break;
59 case NOTRANSITION:
60 default:
61 break;
62 }
63 return 0.0; // never get here
64 }
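CalcDelay seeds the start time of the first transistor with t0_bs = Vt * tin / VDD. Assuming the input is modelled as a linear ramp from 0 to VDD lasting tin, this is the instant at which the ramp crosses the threshold voltage. A minimal sketch under that assumption follows; the name RampCrossTime is hypothetical and does not appear in the listing.

```cpp
// Hypothetical helper (not part of the listing).
// Time at which a linear ramp rising from 0 to vdd in tin reaches vth.
// For the p chain the listing passes -TECH.Vtp0, so that vth is positive.
double RampCrossTime( double vth, double vdd, double tin )
{
    return vth * tin / vdd;
}
```

With vth = 1, vdd = 4 and tin = 2 the ramp reaches the threshold after a quarter of its duration, that is at t = 0.5.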
EqN.cc
6 #include "fast.h"
7
8 ///
9 double Fast::EqN( const Circuit& circuit, unsigned int NP, unsigned int NC, double x, int& RetCode, unsigned int j, unsigned int n, unsi
10 {
11 int SaveOpCondition = _E_, OpCondition_1, OpCondition_i, OpCondition_n;
12 double y, C_n, Cov, Cgd1;
13
14
15 RetCode = OK;
16 double* FY = new double[n + 1];

17 tauo_n[ j ] = x;
18 double t0_bs = TECH.Vtn0 / VDD * taui_n[ 1 ];
19 double tc = (VDD * tauo_n[n] * (t0_n[n] - taui_n[n]) + \
20 Vs_n[n] * taui_n[n] * (tauo_n[n] - t0_n[n])) / \
21 (VDD * (t0_n[n] - taui_n[n]) + Vs_n[n] * (tauo_n[n] - t0_n[n]));
22 // tc cross-time : Vd = Vs
23 if ( j == 1 )
24 {
25 OpCondition_1 = Calct0ts1N( circuit, NP, SaveOpCondition, NewWidth );
26 if ( ( OpCondition_1 == _E_ ) || ( t0_n[ 1 ] < t0_bs ) )
27 {
28 RetCode = PARSE_ERROR;
29 if ( SaveOpCondition == _E_ )
30 OpCondition_1 = _A_;
31 else
32 OpCondition_1 = SaveOpCondition;
33 }
34 FY[ 1 ] = FirstEqN( OpCondition_1, tauo_n[ j ] );
35 }
37 {
38 t0_n[ i ] = t0_n[ i - 1 ];
39 taui_n[ i ] = tauo_n[ i - 1 ];
40 }
41 // middle equations
42 if ( ( j > 1 ) && ( j < n ) )
43 {
44 if ( taui_n[ j ] <= tauo_n[ j ] )
45 OpCondition_i = _A_;
46 else
47 OpCondition_i = _E_;
48 if ( OpCondition_i == _E_ )
49 {
52 }
53 FY[ j ] = MiddleEqN( OpCondition_i, j, tauo_n[ j ] );
54 }
55 // last equation
56 if ( n > 1 )
57 {
58 OpCondition_n = Calct0tsnN( n, SaveOpCondition );
59 if ( ( OpCondition_n == _E_ ) || ( OpCondition_n == _C_ ) || ( OpCondition_n == _D_ ) )
60 {
63 OpCondition_n = _A_;
64 else
65 OpCondition_n = SaveOpCondition;
66 }
67 if ( j == n )
68 FY[ n ] = LastEqN( OpCondition_n, n, tauo_n[ n ] );
69 }
70 y = FY[ j ];
71 // evaluate capacitance at each node
72 unsigned int node; // it’s the common node used to traverse the path
73 unsigned int LastNode;
74 const char* name = pathlist[ NP ].TransistorName( 0, NC ); // first mos in path
77 node = 0;
78 for ( unsigned int i = j, k = 1; i > 1; i--, k++ )
79 {
84 name = pathlist[ NP ].TransistorName( k , NC);
85 }
86 for ( unsigned int i = j; i <= n; i++ )
87 {
88 C_n = 0.0;
89 name = pathlist[ NP ].TransistorName( i - 1 , NC);
91 {
97 }
99 {
105 }
106 if (i == n)
107 LastNode = node;
108 // Common capacitance
109 int nc; // dummy
110 // Cj
111 C_n += -TECH.C_nj * Wjn * TECH.Df * Vd_n[ i ] * pow ( ( 1 + Vd_n[ i ] / TECH.PB_n ), -TECH.mj_n ) - \
112 TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * Vd_n[ i ] * pow ( 1 + Vd_n[ i ] / TECH.PB_n, -TECH.mjsw_n );
113 // static capacitances
114 C_n += -circuit.CapStaticGnd( node, nc ) * Vd_n[i];
115 C_n += -circuit.CapStaticVdd( node, nc ) * Vd_n[i];
116 // gate capacitances
117 C_n += -Wgn * TECH.Lmin * TECH.Cox_n * Vd_n[i];
118 C_n += -Wgp * TECH.Lmin * TECH.Cox_p * Vd_n[i];
119 if ( ( i == 1 ) && ( i < n - 1 ) )
120 {
121 // Cgd & Cgs minus first mos
122 C_n += -( TECH.Cgs0_n * ( ( Wjn - W_n[ 1 ] ) + ( njn - 1 ) * TECH.XW_n ) + \
123 0.5 * TECH.Cox_n * ( Wjn - W_n[ 1 ] ) * ( njn - 1 ) * TECH.Lmin ) * Vd_n[ i ];
124 }
125 else if ( ( i < n - 1 ) && ( i > 1 ) )
126 {
127 // all Cgd & Cgs
128 C_n += -( TECH.Cgs0_n * ( Wjn + njn * TECH.XW_n ) + \
129 0.5 * TECH.Cox_n * Wjn * njn * TECH.Lmin ) * Vd_n[ i ];
130 }
131 else if ( (( i == n ) || ( i == n - 1 )) && (n > 1) )
132 {
133 // Cgd & Cgs minus last mos
134 C_n += -( TECH.Cgs0_n * ( ( Wjn - W_n[ n ] ) + ( njn - 1 ) * TECH.XW_n ) + \
135 0.5 * TECH.Cox_n * ( Wjn - W_n[ n ] ) * ( njn - 1 ) * TECH.Lmin ) * Vd_n[ i ];
136 }
137 // PMOS
138 C_n += ( TECH.C_pj * Wjp * TECH.Df * ( VDD - Vd_n[ i ] ) * \
139 pow ( ( 1 + ( VDD - Vd_n[ i ] ) / TECH.PB_p ), -TECH.mj_p ) );
140 C_n += ( TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * ( VDD - Vd_n[ i ] ) * \
141 pow ( ( 1 + ( VDD - Vd_n[ i ] ) / TECH.PB_p ), -TECH.mjsw_p ) );
142 C_n += -VDD * njp * TECH.C_pj * Wjp * TECH.Df * \
143 pow ( ( 1 + VDD / TECH.PB_p ), -TECH.mj_p );
144 C_n += -VDD * TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
145 pow ( ( 1 + VDD / TECH.PB_p ), -TECH.mjsw_p );
146 // Cgs & Cgd PMOS
147 C_n += -( TECH.Cgd0_p * ( Wjp + njp * TECH.XW_p ) * Vd_n[ i ] );
148 // capacitance with voltages
149 if ( (( i == 1 ) && ( i < n - 1 )) || ((i == 1) && (n == 1)) )
150 {
151 // Cgd
152 Cov = TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n );
153 Cgd1 = Cov + 0.5 * TECH.Cox_n * W_n[ i ] * L_n[ i ];
154 switch ( OpCondition_1 )

155 {
156 case _A_:
157 C_n += Cov * ( t0_n[ i ] - ts_n[ i ] ) * \
158 ( VDD * ( t0_n[ i ] - tauo_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
159 ( taui_n[ i ] * ( t0_n[i] - tauo_n[ i ]) );
160 C_n += Cgd1 * ( VDD * ( t0_n[ i ] - tauo_n[ i ] ) * ( ts_n[ i ] - taui_n[ i ] ) + \
161 Vd_n[ i ] * taui_n[ i ] * ( tauo_n[ i ] - ts_n[ i ] ) ) / \
163 break;
164 case _AA_:
165 C_n += Cov * ( VDD * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauo_n[ i ] ) + \
166 Vd_n[ i ] * taui_n[ i ] * ( ts_n[ i ] - t0_n[ i ] ) ) / \
168 C_n += Cgd1 * Vd_n[i] * ( tauo_n[ i ] - ts_n[i] ) / \
169 ( t0_n[i] - tauo_n[ i ]);
170 break;
171 case _B_:
172 C_n += Cgd1 * ( VDD * ( t0_n[ i ] - taui_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
173 taui_n[ i ];
174 break;
175 case _C_:
176 C_n += Cov * Vd_n[ i ] * ( t0_n[ i ] - ts_n[ i ] ) / \
177 ( tauo_n[ i ] - t0_n[i]);
178 C_n += Cgd1 * Vd_n[ i ] * ( tauo_n[ i ] - ts_n[i]) / \
179 ( t0_n[ i ] - tauo_n[ i ] );
180 break;
181 case _D_:
182 C_n += -Cgd1 * Vd_n[ i ];
183 break;
184 case _F_:
185 C_n += Cov * ( t0_n[ i ] - ts_n[ i ] ) * \
188 C_n += Cgd1 * ( ts_n[i] - tauo_n[i]) * \
190 ( taui_n[ i ] * ( t0_n[ i ] - tauo_n[ i ] ) );
191 break;
192 case _G_:
193 C_n += Cgd1 * ( VDD * ( t0_n[ i ] - tauo_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
194 taui_n[ i ];
195 break;
196 case _E_:
197 default:
198 break;
199 }
200 }
201 else if ( ( i == 1 ) && ( i == n - 1 ) )
202 {
203 // Cgd
204 Cov = TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n );
205 Cgd1 = Cov + 0.5 * TECH.Cox_n * W_n[ i ] * L_n[ i ];
207 {
208 case _A_:
209 C_n += Cov * ( t0_n[ i ] - ts_n[ i ] ) * \
212 C_n += Cgd1 * ( VDD * ( t0_n[ i ] - tauo_n[ i ] ) * ( ts_n[ i ] - taui_n[ i ] ) + \
213 Vd_n[ i ] * taui_n[ i ] * ( tauo_n[ i ] - ts_n[ i ] ) ) / \
215 break;
216 case _AA_:
217 C_n += Cov * ( VDD * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauo_n[ i ] ) + \
218 Vd_n[ i ] * taui_n[ i ] * ( ts_n[ i ] - t0_n[ i ] ) ) / \
220 C_n += Cgd1 * Vd_n[i] * ( tauo_n[ i ] - ts_n[i] ) / \
221 ( t0_n[i] - tauo_n[ i ]);
222 break;
223 case _B_:

224 C_n += Cgd1 * ( VDD * ( t0_n[ i ] - taui_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
225 taui_n[ i ];
226 break;
227 case _C_:
228 C_n += Cov * Vd_n[ i ] * ( t0_n[ i ] - ts_n[ i ] ) / \
229 ( tauo_n[ i ] - t0_n[i]);
230 C_n += Cgd1 * Vd_n[ i ] * ( tauo_n[ i ] - ts_n[i]) / \
231 ( t0_n[ i ] - tauo_n[ i ] );
232 break;
233 case _D_:
234 C_n += -Cgd1 * Vd_n[ i ];
235 break;
236 case _F_:
237 C_n += Cov * ( t0_n[ i ] - ts_n[ i ] ) * \
240 C_n += Cgd1 * ( ts_n[i] - tauo_n[i]) * \
242 ( taui_n[ i ] * ( t0_n[ i ] - tauo_n[ i ] ) );
243 break;
244 case _G_:
245 C_n += Cgd1 * ( VDD * ( t0_n[ i ] - tauo_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
246 taui_n[ i ];
247 break;
248 case _E_:
249 default:
250 break;
251 }
252 // Cgs
253 Cov = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
254 Cgd1 = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + double(2.0 / 3.0) * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
255 switch ( OpCondition_n )
256 {
257 case _A_:
258 C_n += Cov * \
259 Vs_n[ n ] * ( t0_n[ i ] - ts_n[ n ] ) / ( taui_n[ n ] - t0_n[ i ] );
260 C_n += Cgd1 * \
261 Vs_n[ n ] * ( taui_n[ n ] - ts_n[ n ] ) / ( t0_n[ i ] - taui_n[ n ] );
262 break;
263 case _B_:
264 C_n += -Cov * Vs_n[n];
265 break;
266 case _C_:
267 C_n += Cov * \
269 C_n += Cgd1 * \
270 Vs_n[ n ] * ( tc - ts_n[ n ] ) / ( t0_n[ i ] - taui_n[ n ] );
271 break;
272 case _D_:
273 C_n += Cov * \
274 Vs_n[ n ] * ( t0_n[ i ] - tc ) / ( taui_n[ n ] - t0_n[ i ] );
275 break;
276 case _E_:
277 default:
278 break;
279 }
280 }
281 else if ( ( i == n - 1 ) && ( i > 1 ) )
282 {
283 // Cgs, the last mos
284 Cov = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
285 Cgd1 = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + double(2.0 / 3.0) * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
287 {
288 case _A_:
289 C_n += Cov * \
291 C_n += Cgd1 * \
292 Vs_n[ n ] * ( taui_n[ n ] - ts_n[ n ] ) / ( t0_n[ i ] - taui_n[ n ] );

293 break;
294 case _B_:
295 C_n += -Cov * Vs_n[n];
296 break;
297 case _C_:
298 C_n += Cov * \
300 C_n += Cgd1 * \
301 Vs_n[ n ] * ( tc - ts_n[ n ] ) / ( t0_n[ i ] - taui_n[ n ] );
302 break;
303 case _D_:
304 C_n += Cov * \
305 Vs_n[ n ] * ( t0_n[ i ] - tc ) / ( taui_n[ n ] - t0_n[ i ] );
306 break;
307 case _E_:
308 default:
309 break;
310 }
311 }
312 else if ( i == n )
313 {
314 // Cgd
315 Cov = TECH.Cgd0_n * ( W_n[ n ] + TECH.XW_n );
316 Cgd1 = Cov + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ];
318 {
319 case _A_:
320 case _B_:
321 C_n += Cov * \
322 Vd_n[ i ] * ( t0_n[ i ] - ts_n[ i ] ) / ( tauo_n[ i ] - t0_n[ i ] );
323 C_n += Cgd1 * \
324 Vd_n[ i ] * ( tauo_n[ i ] - ts_n[ i ] ) / ( t0_n[ i ] - tauo_n[ i ] );
325 break;
326 case _C_:
327 C_n += Cov * \
328 Vd_n[ i ] * ( t0_n[ i ] - ts_n[ i ] ) / ( tauo_n[ i ] - t0_n[ i ] );
329 C_n += Cgd1 * \
330 Vd_n[ i ] * ( tc - ts_n[ i ] ) / ( t0_n[ i ] - tauo_n[ i ] );
331 break;
332 case _D_:
333 C_n += Cov * \
334 Vd_n[ i ] * ( t0_n[ i ] - tc ) / ( tauo_n[ i ] - t0_n[ i ] );
335 break;
336 case _E_:
337 default:
338 break;
339 }
340 }
341 y += C_n;
342 }
343 // PMOS in chain
344 node = LastNode;
345 C_n = 0;
346 for (unsigned int i = 0; (i < p - 1) && (p > 0); i++)
347 {
348 name = pathlist[ NP ].TransistorName( i + n, NC);
349
351 {
357 }
359 {

365 }
366 C_n += -TECH.C_nj * Wjn * TECH.Df * Vd_n[ n ] * pow ( ( 1 + Vd_n[ n ] / TECH.PB_n ), -TECH.mj_n ) - \
367 TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * Vd_n[ n ] * pow ( 1 + Vd_n[ n ] / TECH.PB_n, -TECH.mjsw_n );
369 int nc;
370 C_n += -circuit.CapStaticGnd( node, nc ) * Vd_n[n];
371 C_n += -circuit.CapStaticVdd( node, nc ) * Vd_n[n];
373 C_n += -Wgn * TECH.Lmin * TECH.Cox_n * Vd_n[n];
374 C_n += -Wgp * TECH.Lmin * TECH.Cox_p * Vd_n[n];
375 C_n += ( TECH.C_pj * Wjp * TECH.Df * ( VDD - Vd_n[ n ] ) * \
376 pow ( ( 1 + ( VDD - Vd_n[ n ] ) / TECH.PB_p ), -TECH.mj_p ) );
377 C_n += ( TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * ( VDD - Vd_n[ n ] ) * \
378 pow ( ( 1 + ( VDD - Vd_n[ n ] ) / TECH.PB_p ), -TECH.mjsw_p ) );
379 C_n += -VDD * njp * TECH.C_pj * Wjp * TECH.Df * \
380 pow ( ( 1 + VDD / TECH.PB_p ), -TECH.mj_p );
381 C_n += -VDD * TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
382 pow ( ( 1 + VDD / TECH.PB_p ), -TECH.mjsw_p );
383 // Cgs & Cgd
384 C_n += -( TECH.Cgd0_p * ( Wjp + njp * TECH.XW_p ) * Vd_n[ n ] );
385 y += C_n;
386 }
387 y -= QpN(n, p);
388 delete[] FY;
389 return y;
390 }
EqP.cc
6 #include "fast.h"
7
8 ///
9 double Fast::EqP( const Circuit& circuit, unsigned int NP, unsigned int NC, double x, int& RetCode, unsigned int j, unsigned int n, unsi
10 {
11 int SaveOpCondition = _E_, OpCondition_1, OpCondition_i, OpCondition_p;
12 double y, C_p, Cov, Cgd1;
13
14
15 RetCode = OK;
16 double* FY = new double[p + 1];
17 tauo_p[ j ] = x;
18 double t0_bs = -TECH.Vtp0 / VDD * taui_p[ 1 ];
19 double tc = (VDD * tauo_p[p] * (t0_p[p] - taui_p[p]) + \
20 Vs_p[p] * taui_p[p] * (tauo_p[p] - t0_p[p])) / \
21 (VDD * (t0_p[p] - taui_p[p]) + Vs_p[p] * (tauo_p[p] - t0_p[p]));
22 // tc cross-time : Vd = Vs
23 if ( j == 1 )
24 {
25 OpCondition_1 = Calct0ts1P( circuit, NP, SaveOpCondition, NewWidth );
26 if ( ( OpCondition_1 == _E_ ) || ( t0_p[ 1 ] < t0_bs ) )
27 {
30 OpCondition_1 = _A_;
31 else
32 OpCondition_1 = SaveOpCondition;
33 }
34 FY[ 1 ] = FirstEqP( OpCondition_1, tauo_p[ j ] );
35 }
37 {
38 t0_p[ i ] = t0_p[ i - 1 ];
39 taui_p[ i ] = tauo_p[ i - 1 ];
40 }
41 // middle equations
42 if ( ( j > 1 ) && ( j < p ) )
43 {
44 if ( taui_p[ j ] <= tauo_p[ j ] )
46 else
47 OpCondition_i = _E_;
48 if ( OpCondition_i == _E_ )
49 {
52 }
53 FY[ j ] = MiddleEqP( OpCondition_i, j, tauo_p[ j ] );
54 }
55 // last equation
56 if ( p > 1 )
57 {
58 OpCondition_p = Calct0tsnP( n, SaveOpCondition );
59 if ( ( OpCondition_p == _E_ ) || ( OpCondition_p == _C_ ) || ( OpCondition_p == _D_ ) )
60 {
63 OpCondition_p = _A_;
64 else
65 OpCondition_p = SaveOpCondition;
66 }
67 if ( j == p )
68 FY[ p ] = LastEqP( OpCondition_p, p, tauo_p[ p ] );
69 }
70 y = FY[ j ];
71 // evaluate capacitance at each node
72 unsigned int node; // it’s the common node used to traverse the path
73 unsigned int LastNode;
74 const char* name = pathlist[ NP ].TransistorName( n + p - 1, NC ); // first pmos in path
77 node = circuit.ValimNode();
78 for ( unsigned int i = j, k = 1; i > 1; i--, k++ )
79 {
84 name = pathlist[ NP ].TransistorName( n + p - 1 - k, NC );
85 }
86 for ( unsigned int i = j; i <= p; i++ )
87 {
88 C_p = 0.0;
89 name = pathlist[ NP ].TransistorName( n + p - i, NC ); // first there are n nmos
90 // then there are the pmos, in REVERSE order
92 {
98 }
100 {
106 }
107 if (i == p)
108 LastNode = node;
109 // Common capacitance
110 int nc; // dummy
111 // Cj
112 C_p += TECH.C_pj * Wjp * TECH.Df * ( VDD - Vd_p[ i ]) * pow ( ( 1 + ( VDD - Vd_p[ i ] ) / TECH.PB_p ), -TECH.mj_p );
113 C_p += TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * ( VDD - Vd_p[ i ]) * pow ( 1 + ( VDD - Vd_p[ i ] ) / TECH.PB_p, -TECH.mjsw_p );
115 C_p += circuit.CapStaticGnd( node, nc ) * ( VDD - Vd_p[ i ]);
116 C_p += circuit.CapStaticVdd( node, nc ) * ( VDD - Vd_p[ i ]);
118 C_p += Wgn * TECH.Lmin * TECH.Cox_n * ( VDD - Vd_p[ i ]);
119 C_p += Wgp * TECH.Lmin * TECH.Cox_p * ( VDD - Vd_p[ i ]);
120 if ( ( i == 1 ) && ( i < p - 1 ) )
121 {
122 // Cgd & Cgs minus first mos
123 C_p += ( TECH.Cgs0_p * ( ( Wjp - W_p[ 1 ] ) + ( njp - 1 ) * TECH.XW_p ) + \
124 0.5 * TECH.Cox_p * ( Wjp - W_p[ 1 ] ) * ( njp - 1 ) * TECH.Lmin ) * ( VDD - Vd_p[ i ] );
125 }
126 else if ( ( i < p - 1 ) && ( i > 1 ) )
127 {
128 // all Cgd & Cgs
129 C_p += ( TECH.Cgs0_p * ( Wjp + njp * TECH.XW_p ) + \
130 0.5 * TECH.Cox_p * Wjp * njp * TECH.Lmin ) * ( VDD - Vd_p[ i ] );
131 }
132 else if ( (( i == p ) || ( i == p - 1 )) && (p > 1) )
133 {
134 // Cgd & Cgs minus last mos
135 C_p += ( TECH.Cgs0_p * ( ( Wjp - W_p[ p ] ) + ( njp - 1 ) * TECH.XW_p ) + \
136 0.5 * TECH.Cox_p * ( Wjp - W_p[ p ] ) * ( njp - 1 ) * TECH.Lmin ) * ( VDD - Vd_p[ i ] );
137 }
138 // NMOS
139 C_p += -( TECH.C_nj * Wjn * TECH.Df * Vd_p[ i ] ) * \
140 pow ( ( 1 + Vd_p[ i ] / TECH.PB_n ), -TECH.mj_n );
141 C_p += -( TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * Vd_p[ i ] ) * \
142 pow ( ( 1 + Vd_p[ i ] / TECH.PB_n ), -TECH.mjsw_n );
143 C_p += ( TECH.C_nj * Wjn * TECH.Df * VDD ) * \
144 pow ( ( 1 + VDD / TECH.PB_n ), -TECH.mj_n );
145 C_p += ( TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * VDD ) * \
146 pow ( ( 1 + VDD / TECH.PB_n ), -TECH.mjsw_n );
147 // Cgs & Cgd
148 C_p += TECH.Cgd0_n * ( Wjn + njn * TECH.XW_n ) * ( VDD - Vd_p[ i ]);
149 // capacitance with voltages
150 if ( (( i == 1 ) && ( i < p - 1 )) || ((i == 1) && (p == 1)) )
151 {
152 // Cgd
153 Cov = TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p );
154 Cgd1 = Cov + 0.5 * TECH.Cox_p * W_p[ i ] * L_p[ i ];
156 {
157 case _A_:
158 C_p += Cov * ( t0_p[ i ] - ts_p[ i ] ) * \
159 ( VDD * ( t0_p[ i ] - tauo_p[ i ] - taui_p[i]) + Vd_p[ i ] * taui_p[ i ] ) / \
160 ( taui_p[ i ] * ( tauo_p[ i ] - t0_p[i]) );
161 C_p += Cgd1 * ( VDD * ( t0_p[ i ] * ( ts_p[ i ] - taui_p[ i ] ) - \
162 ts_p[ i ] * ( taui_p[ i ] + tauo_p[ i ] ) + 2 * taui_p[ i ] * tauo_p[ i ] ) + \
163 Vd_p[ i ] * taui_p[ i ] * ( ts_p[ i ] - tauo_p[ i ] ) ) / \
165 break;
166 case _AA_:
167 C_p += Cov * ( VDD * ( t0_p[ i ] * t0_p[ i ] - t0_p[ i ] * ( 2 * taui_p[ i ] + tauo_p[ i ] ) + \
168 taui_p[ i ] * ( ts_p[ i ] + tauo_p[ i ] ) ) + \
169 Vd_p[ i ] * taui_p[ i ] * ( t0_p[ i ] - ts_p[ i ] ) ) / \
171 C_p += Cgd1 * ( VDD - Vd_p[ i ] ) * ( ts_p[ i ] - tauo_p[ i ] ) / \
172 ( t0_p[i] - tauo_p[ i ]);
173 break;
174 case _B_:
175 C_p += -Cgd1 * ( VDD * ( t0_p[ i ] - 2 * taui_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
176 taui_p[ i ];
177 break;
178 case _C_:
179 C_p += Cov * ( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - ts_p[ i ] ) / \
180 ( t0_p[i] - tauo_p[ i ]);
182 ( t0_p[i] - tauo_p[ i ]);
183 break;
184 case _D_:
185 C_p += Cgd1 * ( VDD - Vd_p[ i ] );
186 break;
187 case _F_:
188 C_p += Cov * ( t0_p[ i ] - ts_p[ i ] ) * \
189 ( VDD * ( t0_p[ i ] - taui_p[ i ] - tauo_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
191 C_p += Cgd1 * ( tauo_p[ i ] - ts_p[ i ] ) * \
193 ( taui_p[ i ] * ( t0_p[i] - tauo_p[ i ] ) );
194 break;
195 case _G_:
196 C_p += -Cgd1 * ( VDD * ( t0_p[ i ] - taui_p[ i ] - tauo_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
197 taui_p[ i ];
198 break;
199 case _E_:
200 default:
201 break;
202 }
203 }
204 else if ( ( i == 1 ) && ( i == p - 1 ) )
205 {
206 // Cgd
207 Cov = TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p );
208 Cgd1 = Cov + 0.5 * TECH.Cox_p * W_p[ i ] * L_p[ i ];
210 {
211 case _A_:
212 C_p += Cov * ( t0_p[ i ] - ts_p[ i ] ) * \
213 ( VDD * ( t0_p[ i ] - tauo_p[ i ] - taui_p[i]) + Vd_p[ i ] * taui_p[ i ] ) / \
215 C_p += Cgd1 * ( VDD * ( t0_p[ i ] * ( ts_p[ i ] - taui_p[ i ] ) - \
216 ts_p[ i ] * ( taui_p[ i ] + tauo_p[ i ] ) + 2 * taui_p[ i ] * tauo_p[ i ] ) + \
217 Vd_p[ i ] * taui_p[ i ] * ( ts_p[ i ] - tauo_p[ i ] ) ) / \
219 break;
220 case _AA_:
221 C_p += Cov * ( VDD * ( t0_p[ i ] * t0_p[ i ] - t0_p[ i ] * ( 2 * taui_p[ i ] + tauo_p[ i ] ) + \
222 taui_p[ i ] * ( ts_p[ i ] + tauo_p[ i ] ) ) + \
223 Vd_p[ i ] * taui_p[ i ] * ( t0_p[ i ] - ts_p[ i ] ) ) / \
226 ( t0_p[i] - tauo_p[ i ]);
227 break;
228 case _B_:
229 C_p += -Cgd1 * ( VDD * ( t0_p[ i ] - 2 * taui_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
230 taui_p[ i ];
231 break;
232 case _C_:
233 C_p += Cov * ( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - ts_p[ i ] ) / \
234 ( t0_p[i] - tauo_p[ i ]);
236 ( t0_p[i] - tauo_p[ i ]);
237 break;
238 case _D_:
239 C_p += Cgd1 * ( VDD - Vd_p[ i ] );
240 break;
241 case _F_:
242 C_p += Cov * ( t0_p[ i ] - ts_p[ i ] ) * \
245 C_p += Cgd1 * ( tauo_p[ i ] - ts_p[ i ] ) * \

247 ( taui_p[ i ] * ( t0_p[i] - tauo_p[ i ] ) );
248 break;
249 case _G_:
250 C_p += -Cgd1 * ( VDD * ( t0_p[ i ] - taui_p[ i ] - tauo_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
251 taui_p[ i ];
252 break;
253 case _E_:
254 default:
255 break;
256 }
257 // Cgs
258 Cgd1 = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + double(2.0 / 3.0) * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
259 Cov = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
260 switch ( OpCondition_p )
261 {
262 case _A_:
263 C_p += Cov * \
264 ( VDD - Vs_p[ p ] ) * ( t0_p[ i ] - ts_p[ p ] ) / ( t0_p[i] - taui_p[ p ]);
265 C_p += Cgd1 * \
266 ( VDD - Vs_p[ p ] ) * ( ts_p[p] - taui_p[ p ]) / ( t0_p[ i ] - taui_p[ p ] );
267 break;
268 case _B_:
269 C_p += Cov * (VDD - Vs_p[p]);
270 break;
271 case _C_:
272 C_p += Cov * \
274 C_p += Cgd1 * \
275 ( VDD - Vs_p[ p ] ) * ( ts_p[ p ] - tc) / ( t0_p[ i ] - taui_p[ p ] );
break;
276 case _D_:
277 C_p += Cov * \
278 ( VDD - Vs_p[ i ] ) * ( t0_p[ i ] - tc ) / ( t0_p[i] - taui_p[ p ]);
279 break;
280 case _E_:
281 default:
282 break;
283 }
284 }
285 else if ( ( i == p - 1 ) && ( i > 1 ) )
286 {
287 // Cgs, the last mos
288 Cgd1 = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + double(2.0 / 3.0) * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
289 Cov = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
291 {
292 case _A_:
293 C_p += Cov * \
295 C_p += Cgd1 * \
296 ( VDD - Vs_p[ p ] ) * ( ts_p[p] - taui_p[ p ]) / ( t0_p[ i ] - taui_p[ p ] );
297 break;
298 case _B_:
299 C_p += Cov * (VDD - Vs_p[p]);
300 break;
301 case _C_:
302 C_p += Cov * \
304 C_p += Cgd1 * \
305 ( VDD - Vs_p[ p ] ) * ( ts_p[ p ] - tc) / ( t0_p[ i ] - taui_p[ p ] );
break;
306 case _D_:
307 C_p += Cov * \
308 ( VDD - Vs_p[ i ] ) * ( t0_p[ i ] - tc ) / ( t0_p[i] - taui_p[ p ]);
309 break;
310 case _E_:
311 default:
312 break;
313 }
314 }
315 else if ( i == p )
316 {
317 // Cgd
318 Cov = ( TECH.Cgd0_p * ( W_p[ p ] + TECH.XW_p ) );
319 Cgd1 = ( TECH.Cgd0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
321 {
322 case _A_:
323 case _B_:
324 C_p += Cov * \
325 ( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - ts_p[ i ] ) / ( t0_p[i] - tauo_p[ i ]);
326 C_p += Cgd1 * \
327 ( VDD - Vd_p[ i ] ) * ( ts_p[i] - tauo_p[ i ]) / ( t0_p[ i ] - tauo_p[ i ] );
328 break;
329 case _C_:
330 C_p += Cov * \
331 ( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - ts_p[ i ] ) / ( t0_p[i] - tauo_p[ i ]);
332 C_p += Cgd1 * \
333 ( VDD - Vd_p[ i ] ) * ( ts_p[ i ] - tc) / ( t0_p[ i ] - tauo_p[ i ] );
334 break;
335 case _D_:
336 C_p += Cov * \
337 ( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - tc ) / ( t0_p[i] - tauo_p[ i ]);
338 break;
339 case _E_:
340 default:
341 break;
342 }
343 }
344 y += C_p;
345 }
346 node = LastNode;
347 C_p = 0;
348 for (unsigned int i = 0; (i < n - 1) && (n > 0); i++)
349 {
350 name = pathlist[ NP ].TransistorName( n - 1 - i, NC );
352 {
358 }
360 {
366 }
367 // Cj
368 C_p += TECH.C_pj * Wjp * TECH.Df * ( VDD - Vd_p[ p ]) * pow ( ( 1 + ( VDD - Vd_p[ p ] ) / TECH.PB_p ), -TECH.mj_p );
369 C_p += TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * ( VDD - Vd_p[ p ]) * pow ( 1 + ( VDD - Vd_p[ p ] ) / TECH.PB_p, -TECH.mjsw_p );
371 int nc;
372 C_p += circuit.CapStaticGnd( node, nc ) * ( VDD - Vd_p[ p ]);
373 C_p += circuit.CapStaticVdd( node, nc ) * ( VDD - Vd_p[ p ]);
375 C_p += Wgn * TECH.Lmin * TECH.Cox_n * ( VDD - Vd_p[ p ] );
376 C_p += Wgp * TECH.Lmin * TECH.Cox_p * ( VDD - Vd_p[ p ] );
377 // NMOS in chain
378 C_p += -( TECH.C_nj * Wjn * TECH.Df * Vd_p[ p ] ) * \
379 pow ( ( 1 + Vd_p[ p ] / TECH.PB_n ), -TECH.mj_n );
380 C_p += -( TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * Vd_p[ p ] ) * \
381 pow ( ( 1 + Vd_p[ p ] / TECH.PB_n ), -TECH.mjsw_n );
382 C_p += ( TECH.C_nj * Wjn * TECH.Df * VDD ) * \
383 pow ( ( 1 + VDD / TECH.PB_n ), -TECH.mj_n );

384 C_p += ( TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * VDD ) * \
385 pow ( ( 1 + VDD / TECH.PB_n ), -TECH.mjsw_n );
386 // Cgs & Cgd
387 C_p += TECH.Cgd0_n * ( Wjn + njn * TECH.XW_n ) * ( VDD - Vd_p[ p ] );
388 y += C_p;
389 }
390 y += QnP(n, p);
391 delete[] FY;
392 return y;
393 }
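EqN.cc and EqP.cc address the transistors of a chain through pathlist[ NP ].TransistorName( index, NC ): the n NMOS devices come first, followed by the p PMOS devices in reverse order. The mapping from the 1-based chain position i used by the arrays (W_n[ i ], W_p[ i ], ...) to the 0-based path index can be sketched as follows. The function names are hypothetical; only the index expressions i - 1 and n + p - i are taken from the listing.

```cpp
// Hypothetical helpers (not part of the listing).

// 0-based path index of the i-th NMOS of the chain, 1 <= i <= n.
unsigned int NmosPathIndex( unsigned int i )
{
    return i - 1;
}

// 0-based path index of the i-th PMOS of the chain, 1 <= i <= p.
// The PMOS devices are stored after the n NMOS devices, in reverse order.
unsigned int PmosPathIndex( unsigned int n, unsigned int p, unsigned int i )
{
    return n + p - i;
}
```

For a chain with n = 2 and p = 3 the first PMOS is found at path index 4 and the last one at index 2, immediately after the two NMOS devices at indices 0 and 1.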
Fast.cc
6 #include "fast.h"
7
8
9 ///
10 Fast::Fast( const CritPathList& pathlist, const Options& options )
11 :
12 EvaluationAlgorithm( pathlist, options ),
13 A_1_n( 0 ), ts_n( 0 ), tauo_n( 0 ), taui_n( 0 ), t0_n( 0 ),
14 beta_n( 0 ), Vd_n( 0 ), Vs_n( 0 ), W_n( 0 ), L_n( 0 ),
15 A_1_p( 0 ), B_1_p( 0 ), ts_p( 0 ), tauo_p( 0 ), taui_p( 0 ), t0_p( 0 ),
16 beta_p( 0 ), Vd_p( 0 ), Vs_p( 0 ), W_p( 0 ), L_p( 0 ), VDD( 0 )
17 {
18 print_log( "Creating Fast instance..." );
19 }
20
21 ///
22 Fast::~Fast()
23 {}
24
25 ///
26
27 int Fast::Run( const Circuit& circuit, const double *NewWidth, const unsigned* ValidPath )
28 {
29 VDD = circuit.Valim();
30 Calls++;
31 int RetCode;
32 double tin;
33 unsigned int n;
34 unsigned int p;
35 double TotalDelay;
36 double TotalPower;
37 double TotalNoise;
39 {
41 {
42 unsigned int NumChain = pathlist[ NP ].GetNumListTran();
43 TransitionType TIn;
44 TransitionType TOut;
45 TotalDelay = 0.0;
46 TotalPower = 0.0;
47 TotalNoise = 0.0;
48 for ( unsigned int NC = 0; NC < NumChain; NC++ )
49 {
50 if ( NC == 0 )
51 {
52 TIn = pathlist[ NP ].GetTransitionIn();
53 tin = pathlist[ NP ].GetInTime() / 1000;
54 }
55 else
56 {
57 if ( TIn == FALL )
58 tin = tauo_n[ n ];
59 else if ( TIn == RISE )
60 tin = tauo_p[ p ];
61 }
62 if ( TIn == FALL )
63 {
64 TOut = RISE;
65 }
66 else if ( TIn == RISE )
67 {
68 TOut = FALL;
69 }
70 n = pathlist[ NP ].GetNumTranN( NC );
71 p = pathlist[ NP ].GetNumTranP( NC );
72 if ( ( RetCode = InitCircuitVar( n, p ) ) != OK )
73 return RetCode;
74 unsigned int in = 1;
75 unsigned int ip = 1;
76 while ( const char * tn = pathlist[ NP ].TraverseTransistorNameList( NC ) )
77 {
78 int position = circuit.TranPos( tn );
79 if ( position == -1 )
81 if ( in <= n )
82 {
83 W_n[ in ] = NewWidth[ position ];
84 L_n[ in++ ] = circuit[ position ].Length();
85 }
86 else if ( ip <= p )
87 {
88 W_p[ p - ip + 1 ] = NewWidth[ position ];
89 L_p[ p - ip + 1 ] = circuit[ position ].Length();
90 ip++;
91 }
92 else
94 }
95 CalcParamCircuit( n, p );
96
97
98 TotalDelay += CalcDelay( circuit, NP, NC, n, p, NewWidth, tin, TOut, RetCode );
100 return RetCode;
101 TotalPower += CalcPower( circuit, NP, NC, n, p, NewWidth, TOut, RetCode );
103 return RetCode;
104 TotalNoise = 0.0;
105 FreeCircuitPar( n, p );
106 TIn = TOut;
107 }
108 if (NumChain > 0)
109 for (unsigned int i = 0; i < NumChain; i++)
110 TotalDelay *= 1.85;
111 else if (NumChain > 1)
112 for (unsigned int i = 0; i < NumChain; i++)
113 TotalDelay *= 3.1; // 1.07 tech 07
114 CPDelay[ NP ] = TotalDelay * 1000; // ps
115 CPPower[ NP ] = TotalPower / 1000.0; // pJ
116 CPNoise[ NP ] = TotalNoise;
117 }
118 }
119 Area = CalcArea( NewWidth, circuit.GetNTran() );
121 return RetCode;
122 return OK;
123 }
FastArea.cc
6 #include "fast.h"
7
8 ///
9 double Fast::CalcArea( const double *NewWidth, unsigned int NT )
10 {
11 double A = 0.0;
13 {
14 A += NewWidth[ i ];
15 }
16 return ( A );
17 }
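CalcArea accumulates the entries of NewWidth, so the area figure used by the evaluator is the sum of the transistor widths. The same accumulation can be sketched with the standard library, assuming the widths are held in a std::vector rather than in the raw array of the listing; the name SumWidths is hypothetical.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper (not part of the listing).
// Area estimate as the plain sum of the transistor widths.
double SumWidths( const std::vector<double>& widths )
{
    return std::accumulate( widths.begin(), widths.end(), 0.0 );
}
```

The initial value 0.0 keeps the accumulation in double precision; an integer 0 would make std::accumulate truncate every partial sum.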
FirstEqN.cc
6 #include "fast.h"
7
8 ///
9 double Fast::FirstEqN( int OpCondition, double tauon )
10 {
11 double Vc, t0_bs, temp;
12 double A_2_n, B_2_n, C_2_n, D_2_n, I_2_n;
13 double N_2_n, O_2_n, P_2_n;
14 double Q_2_n, R_2_n, S_2_n, T_2_n, U_2_n;
15
17 Vc = TECH.Ec_n * L_n[ 1 ];
18 if ( OpCondition == _E_ )
19 if ( taui_n[ 1 ] <= tauon )
20 OpCondition = _A_;
21 else
22 OpCondition = _F_;
25 C_2_n = 2 * VDD / ( Vc * taui_n[ 1 ] );
26 D_2_n = ( Vc - 2 * TECH.Vtn0 ) / Vc;
27 I_2_n = ( Vc * Vc ) * beta_n[ 1 ] * ( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) * \
28 ( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) * 0.5;
29 N_2_n = Vc * beta_n[ 1 ] * ( 2 * VDD * Vc * ( ( t0_n[ 1 ] - tauon ) * ( t0_n[ 1 ] - tauon ) ) - \
30 Vd_n[ 1 ] * taui_n[ 1 ] * ( Vc * ( t0_n[ 1 ] - tauon ) + \
31 Vd_n[ 1 ] * tauon + 2 * TECH.Vtn0 * ( tauon - t0_n[ 1 ] ) ) ) / \
32 ( 2 * Vd_n[ 1 ] * taui_n[ 1 ] * ( tauon - t0_n[ 1 ] ) );
33 O_2_n = Vc * beta_n[ 1 ] * ( 2 * VDD * ( t0_n[ 1 ] - tauon ) - Vd_n[ 1 ] * taui_n[ 1 ] ) / \
34 ( 2 * taui_n[ 1 ] * ( t0_n[ 1 ] - tauon ) );
35 P_2_n = ( Vc * Vc ) * beta_n[ 1 ] * ( t0_n[ 1 ] - tauon ) * \
36 ( 2 * VDD * ( Vc * ( t0_n[ 1 ] - tauon ) - Vd_n[ 1 ] * tauon ) - Vd_n[ 1 ] * taui_n[ 1 ] * ( Vc - 2 * TECH.Vtn0 ) );
37 Q_2_n = 2 * Vd_n[ 1 ] * taui_n[ 1 ] * ( Vc * ( t0_n[ 1 ] - tauon ) - Vd_n[ 1 ] * tauon );
38 R_2_n = Vc * beta_n[ 1 ] * ( 2 * VDD * ( t0_n[ 1 ] - tauon ) + \
39 Vc * ( t0_n[ 1 ] - tauon ) + \
40 Vd_n[ 1 ] * tauon + 2 * TECH.Vtn0 * ( tauon - t0_n[ 1 ] ) ) / \
41 ( 2 * ( t0_n[ 1 ] - tauon ) );
42 S_2_n = Vc * Vd_n[ 1 ] * beta_n[ 1 ] / ( 2 * ( tauon - t0_n[ 1 ] ) );
43 T_2_n = ( Vc * Vc ) * beta_n[ 1 ] * ( tauon - t0_n[ 1 ] ) * ( 2 * VDD + Vc - 2 * TECH.Vtn0 );
44 U_2_n = 2 * ( Vc * ( t0_n[ 1 ] - tauon ) - Vd_n[ 1 ] * tauon );
45 switch ( OpCondition )
46 {
47 case _A_:
48 temp = -P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * ts_n[ 1 ] * taui_n[ 1 ] ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[
49 P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * ( taui_n[ 1 ] * taui_n[ 1 ] ) ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * ta
50 T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * taui_n[ 1 ] ) / Vd_n[ 1 ] + \
51 T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * tauon ) / Vd_n[ 1 ] - \
52 ( 6 * A_2_n * C_2_n * ( t0_n[ 1 ] - ts_n[ 1 ] ) + \

53 3 * B_2_n * C_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( ts_n[ 1 ] * ts_n[ 1 ] ) ) - \
54 4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * t0_n[ 1 ] + D_2_n ), 1.5 ) + \
55 4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * ts_n[ 1 ] + D_2_n ), 1.5 ) + \
56 3 * C_2_n * ( 2 * N_2_n * ( ts_n[ 1 ] - taui_n[ 1 ] ) + \
57 O_2_n * ( ( ts_n[ 1 ] * ts_n[ 1 ] ) - ( taui_n[ 1 ] * taui_n[ 1 ] ) ) + \
58 2 * R_2_n * ( taui_n[ 1 ] - tauon ) + \
59 S_2_n * ( ( taui_n[ 1 ] * taui_n[ 1 ] ) - ( tauon * tauon ) ) ) ) / ( 6 * C_2_n );
60 break;
61 case _AA_:
62 temp = -T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * ts_n[ 1 ] ) / Vd_n[ 1 ] + \
64 ( 6 * A_2_n * C_2_n * ( t0_n[ 1 ] - taui_n[ 1 ] ) + \
65 3 * B_2_n * C_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( taui_n[ 1 ] * taui_n[ 1 ] ) ) - \
67 4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * taui_n[ 1 ] + D_2_n ), 1.5 ) - \
68 3 * C_2_n * ( 2 * I_2_n * ( ts_n[ 1 ] - taui_n[ 1 ] ) + \
69 2 * R_2_n * ( tauon - ts_n[ 1 ] ) + \
70 S_2_n * ( ( tauon * tauon ) - ( ts_n[ 1 ] * ts_n[ 1 ] ) ) ) ) / ( 6 * C_2_n );
71 break;
72 case _B_:
73 temp = -P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * t0_n[ 1 ] * taui_n[ 1 ] ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) + \
74 P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * ( taui_n[ 1 ] * taui_n[ 1 ] ) ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) - \
75 T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * taui_n[ 1 ] ) / Vd_n[ 1 ] + \
77 ( 2 * N_2_n * ( t0_n[ 1 ] - taui_n[ 1 ] ) + \
78 O_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( taui_n[ 1 ] * taui_n[ 1 ] ) ) + \
79 2 * R_2_n * ( taui_n[ 1 ] - tauon ) + \
80 S_2_n * ( ( taui_n[ 1 ] * taui_n[ 1 ] ) - ( tauon * tauon ) ) ) / 2;
81 break;
82 case _C_:
83 temp = -T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * ts_n[ 1 ] ) / Vd_n[ 1 ] + \
84 T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * tauon ) / Vd_n[ 1 ] + \
85 I_2_n * ( ts_n[ 1 ] - t0_n[ 1 ] ) - \
86 ( 2 * R_2_n * ( ts_n[ 1 ] - tauon ) + S_2_n * ( ( ts_n[ 1 ] * ts_n[ 1 ] ) - ( tauon * tauon ) ) ) * 0.5;
87 break;
88 case _D_:
89 temp = -T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * t0_n[ 1 ] ) / Vd_n[ 1 ] + \
91 ( 2 * R_2_n * ( t0_n[ 1 ] - tauon ) + S_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( tauon * tauon ) ) ) * 0.5;
92 break;
93 case _F_:
94 temp = -P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * ts_n[ 1 ] * taui_n[ 1 ] ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) + \
95 P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] * tauon ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) - \
96 ( 6 * A_2_n * C_2_n * ( t0_n[ 1 ] - ts_n[ 1 ] ) + \
97 3 * B_2_n * C_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( ts_n[ 1 ] * ts_n[ 1 ] ) ) - \
99 4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * ts_n[ 1 ] + D_2_n ), 1.5 ) + \
100 3 * C_2_n * ( 2 * N_2_n * ( ts_n[ 1 ] - tauon ) + \
101 O_2_n * ( ( ts_n[ 1 ] * ts_n[ 1 ] ) - ( tauon * tauon ) ) ) ) / ( 6 * C_2_n );
102 break;
103 case _G_:
104 temp = -P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * t0_n[ 1 ] * taui_n[ 1 ] ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) + \
105 P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] * tauon ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) - \
106 ( 2 * N_2_n * ( t0_n[ 1 ] - tauon ) + O_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( tauon * tauon ) ) ) / 2;
107 break;
108 case _E_:
109 default:
110 temp = 0;
111 break;
112 }
113 return temp;
114 }
FirstEqP.cc
6 #include "fast.h"
7
8 ///
9 double Fast::FirstEqP( int OpCondition, double tauop )
10 {
11 double Vc, t0_bs, temp, x, y;
12 double A_2_p, B_2_p, C_2_p, D_2_p, G_2_p, J_2_p;
13 double K_2_p, M_2_p, N_2_p, O_2_p, P_2_p;
14 double Q_2_p, R_2_p, S_2_p, T_2_p;
15
17 Vc = TECH.Ec_p * L_p[ 1 ];
18 if ( OpCondition == _E_ )
19 if ( taui_p[ 1 ] <= tauop )
20 OpCondition = _A_;
21 else
22 OpCondition = _F_;
23 x = t0_p[ 1 ] - taui_p[ 1 ];
24 y = t0_p[ 1 ] - tauop;
27 C_2_p = ( Vc + 2 * TECH.Vtp0 ) / Vc;
28 D_2_p = 2 * VDD / ( Vc * taui_p[ 1 ] );
29 G_2_p = Vc * Vc * beta_p[ 1 ] * SQRT ( ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) / Vc ) - \
30 VDD * Vc * beta_p[ 1 ] - Vc * Vc * beta_p[ 1 ] - Vc * TECH.Vtp0 * beta_p[ 1 ];
31 J_2_p = Vc * beta_p[ 1 ] * ( ( VDD * VDD ) * taui_p[ 1 ] * tauop - \
32 VDD * ( Vc * y * ( x + y - tauop ) + \
33 2 * taui_p[ 1 ] * ( Vd_p[ 1 ] * tauop - TECH.Vtp0 * y ) ) - \
34 Vd_p[ 1 ] * taui_p[ 1 ] * ( Vc * y - \
35 Vd_p[ 1 ] * tauop + 2 * TECH.Vtp0 * y ) ) / \
36 ( 2 * taui_p[ 1 ] * ( Vd_p[ 1 ] - VDD ) * y );
37 K_2_p = -Vc * beta_p[ 1 ] * ( VDD * ( x + y - tauop ) + \
38 Vd_p[ 1 ] * taui_p[ 1 ] ) / \
39 ( 2 * taui_p[ 1 ] * y );
40 M_2_p = ( Vc * Vc ) * beta_p[ 1 ] * y * ( 2 * ( VDD * VDD ) * tauop - \
41 VDD * ( Vc * ( x + y - tauop ) + \
42 2 * ( Vd_p[ 1 ] * tauop - TECH.Vtp0 * taui_p[ 1 ] ) ) - \
43 Vd_p[ 1 ] * taui_p[ 1 ] * ( Vc + 2 * TECH.Vtp0 ) );
44 N_2_p = 2 * taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * \
45 ( VDD * tauop - Vc * y - Vd_p[ 1 ] * tauop );
46 O_2_p = 2 * taui_p[ 1 ] * ( ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
47 P_2_p = -Vc * beta_p[ 1 ] * ( VDD * ( t0_p[ 1 ] + y ) + \
48 Vc * y - Vd_p[ 1 ] * tauop + 2 * TECH.Vtp0 * y ) / \
49 ( 2 * y );
50 Q_2_p = Vc * beta_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) / ( 2 * y );
51 R_2_p = ( Vc * Vc ) * beta_p[ 1 ] * y * ( 2 * VDD + Vc + 2 * TECH.Vtp0 );
52 S_2_p = 2 * ( VDD * tauop - Vc * y - Vd_p[ 1 ] * tauop );
53
54 T_2_p = 2 * ( VDD - Vd_p[ 1 ] );
56 {
57 case _A_:
58 temp = -M_2_p * LOG ( O_2_p * ts_p[ 1 ] - N_2_p ) / O_2_p + \
59 M_2_p * LOG ( O_2_p * taui_p[ 1 ] - N_2_p ) / O_2_p - \
60 R_2_p * LOG ( T_2_p * taui_p[ 1 ] - S_2_p ) / T_2_p + \
61 R_2_p * LOG ( T_2_p * tauop - S_2_p ) / T_2_p + \
62 ( 6 * A_2_p * D_2_p * ( t0_p[ 1 ] - ts_p[ 1 ] ) + \
63 3 * B_2_p * D_2_p * ( t0_p[ 1 ] * t0_p[ 1 ] - ts_p[ 1 ] * ts_p[ 1 ] ) - \
64 4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * t0_p[ 1 ] ), 1.5 ) + \
65 4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * ts_p[ 1 ] ), 1.5 ) - \
66 3 * D_2_p * ( 2 * J_2_p * ( ts_p[ 1 ] - taui_p[ 1 ] ) + \
67 K_2_p * ( ts_p[ 1 ] * ts_p[ 1 ] - taui_p[ 1 ] * taui_p[ 1 ] ) + \
68 2 * P_2_p * ( taui_p[ 1 ] - tauop ) + \
69 Q_2_p * ( taui_p[ 1 ] * taui_p[ 1 ] - tauop * tauop ) ) ) / \
70 ( 6 * D_2_p );
71 break;
72 case _AA_:
73 temp = -R_2_p * LOG ( T_2_p * ts_p[ 1 ] - S_2_p ) / T_2_p + \
75 ( 6 * A_2_p * D_2_p * ( t0_p[ 1 ] - taui_p[ 1 ] ) + \
76 3 * B_2_p * D_2_p * ( t0_p[ 1 ] * t0_p[ 1 ] - taui_p[ 1 ] * taui_p[ 1 ] ) - \
78 4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * taui_p[ 1 ] ), 1.5 ) + \
79 3 * D_2_p * ( 2 * G_2_p * ( ts_p[ 1 ] - taui_p[ 1 ] ) + \
80 2 * P_2_p * ( tauop - ts_p[ 1 ] ) + \
81 Q_2_p * ( tauop * tauop - ts_p[ 1 ] * ts_p[ 1 ] ) ) ) / \
82 ( 6 * D_2_p );
83 break;
84 case _B_:
85 temp = -M_2_p * LOG ( O_2_p * t0_p[ 1 ] - N_2_p ) / O_2_p + \
86 M_2_p * LOG ( O_2_p * taui_p[ 1 ] - N_2_p ) / O_2_p - \
87 R_2_p * LOG ( T_2_p * taui_p[ 1 ] - S_2_p ) / T_2_p + \
88 R_2_p * LOG ( T_2_p * tauop - S_2_p ) / T_2_p - \
89 ( 2 * J_2_p * ( t0_p[ 1 ] - taui_p[ 1 ] ) + \
90 K_2_p * ( ( t0_p[ 1 ] * t0_p[ 1 ] ) - ( taui_p[ 1 ] * taui_p[ 1 ] ) ) + \
91 2 * P_2_p * ( taui_p[ 1 ] - tauop ) + \
92 Q_2_p * ( ( taui_p[ 1 ] * taui_p[ 1 ] ) - ( tauop * tauop ) ) ) * 0.5;
93
94 break;
95 case _C_:
96 temp = -R_2_p * LOG ( T_2_p * ts_p[ 1 ] - S_2_p ) / T_2_p + \
98 G_2_p * ( ts_p[ 1 ] - t0_p[ 1 ] ) - \
99 ( 2 * P_2_p * ( ts_p[ 1 ] - tauop ) + \
100 Q_2_p * ( ( ts_p[ 1 ] * ts_p[ 1 ] ) - ( tauop * tauop ) ) ) * 0.5;
101 break;
102 case _D_:
103 temp = -R_2_p * LOG ( T_2_p * t0_p[ 1 ] - S_2_p ) / T_2_p + \
104 R_2_p * LOG ( T_2_p * tauop - S_2_p ) / T_2_p - \
105 ( 2 * P_2_p * ( t0_p[ 1 ] - tauop ) + \
106 Q_2_p * ( ( t0_p[ 1 ] * t0_p[ 1 ] ) - ( tauop * tauop ) ) ) * 0.5;
107 break;
108 case _F_:
109 temp = -M_2_p * LOG ( O_2_p * ts_p[ 1 ] - N_2_p ) / O_2_p + \
110 M_2_p * LOG ( O_2_p * tauop - N_2_p ) / O_2_p + \
111 ( 6 * A_2_p * D_2_p * ( t0_p[ 1 ] - ts_p[ 1 ] ) + \
112 3 * B_2_p * D_2_p * ( t0_p[ 1 ] * t0_p[ 1 ] - ts_p[ 1 ] * ts_p[ 1 ] ) - \
114 4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * ts_p[ 1 ] ), 1.5 ) - \
115 3 * D_2_p * ( 2 * J_2_p * ( ts_p[ 1 ] - tauop ) + \
116 K_2_p * ( ts_p[ 1 ] * ts_p[ 1 ] - tauop * tauop ) ) ) / \
117 ( 6 * D_2_p );
118 break;
119 case _G_:
120 temp = -M_2_p * LOG ( O_2_p * t0_p[ 1 ] - N_2_p ) / O_2_p + \
121 M_2_p * LOG ( O_2_p * tauop - N_2_p ) / O_2_p - \
122 ( 2 * J_2_p * ( t0_p[ 1 ] - tauop ) + \
123 K_2_p * ( ( t0_p[ 1 ] * t0_p[ 1 ] ) - ( tauop * tauop ) ) ) * 0.5;
124 break;
125 case _E_:
126 default:
127 temp = 0;
128 break;
129 }
130 return temp;
131 }
Init.cc
6 #include "fast.h"
7
8 ///
9 int Fast::InitCircuitVar( unsigned int n, unsigned int p )
10 {
11 // NMOS
12 if (n == 0)
13 n = 1;
14 A_1_n = dvector ( 1, n );
15 ts_n = dvector ( 1, n );
16 tauo_n = dvector ( 1, n );
17 taui_n = dvector ( 1, n );
18 t0_n = dvector ( 1, n );
19 beta_n = dvector ( 1, n );
20 Vd_n = dvector ( 1, n );
21 Vs_n = dvector ( 1, n );
22 W_n = dvector ( 1, n );
23 L_n = dvector ( 1, n );
24 if ( !A_1_n || !ts_n || !tauo_n || !taui_n || !t0_n || !beta_n || !Vd_n || !Vs_n || !W_n || !L_n )
25 return NO_MEM;
26 TECH.u0_n = TECH.Kp_n / TECH.Cox_n;
27 TECH.Ec_n = TECH.vmax_n / TECH.u0_n * ( 1 + TECH.theta_n * ( VDD - TECH.Vtn0 ) );
28 double phip_n = TECH.phi_n;
29 double gamma_n = TECH.gamma_n;
30 double Vsb1_n = -0.5 * gamma_n * sqrt ( 4 * gamma_n * sqrt ( 2 * phip_n ) + \
31 8 * phip_n + 4 * VDD - 4 * TECH.Vtn0 + gamma_n * gamma_n ) + \
32 gamma_n * sqrt ( 2 * phip_n ) + VDD - TECH.Vtn0 + gamma_n * gamma_n * 0.5;
33 double Vt_n = TECH.Vtn0 + gamma_n * ( sqrt ( 2 * phip_n + Vsb1_n ) - \
34 sqrt ( 2 * phip_n ) );
36 {
37 A_1_n[ i ] = ( Vt_n - TECH.Vtn0 ) / Vsb1_n;
38 ts_n[ i ] = t0_n[ i ] = taui_n[ i ] = tauo_n[ i ] = 0.0;
39 }
41 Vd_n[ i ] = Vsb1_n;
43 Vs_n[ i ] = Vsb1_n;
44 Vd_n[ n ] = VDD;
45 Vs_n[ 1 ] = 0;
46 if (p == 0)
47 p = 1;
48 // PMOS
49 A_1_p = dvector ( 1, p );
50 B_1_p = dvector ( 1, p );
51 ts_p = dvector ( 1, p );
52 tauo_p = dvector ( 1, p );
53 taui_p = dvector ( 1, p );
54 t0_p = dvector ( 1, p );
55 beta_p = dvector ( 1, p );
56 Vd_p = dvector ( 1, p );
57 Vs_p = dvector ( 1, p );
58 W_p = dvector ( 1, p );
59 L_p = dvector ( 1, p );
60 if ( !A_1_p || !B_1_p || !ts_p || !tauo_p || !taui_p || !t0_p || !beta_p || !Vd_p || !Vs_p || !W_p || !L_p )
61 return NO_MEM;
62 TECH.u0_p = TECH.Kp_p / TECH.Cox_p;
63 TECH.Ec_p = TECH.vmax_p / TECH.u0_p * ( 1 - TECH.theta_p * ( -VDD - TECH.Vtp0 ) );
64 //double phip_p = fabs ( TECH.VT * log ( TECH.Nd / TECH.ni ) );
65 double phip_p = TECH.phi_p;
66 //double gamma_p = sqrt ( 2 * TECH.epss * TECH.q * TECH.Nd ) / TECH.Cox_p;
67 double gamma_p = TECH.gamma_p;
68 double Vsb1_p = 0.5 * gamma_p * sqrt ( 4 * gamma_p * sqrt ( 2 * phip_p ) + \
69 8 * phip_p + 4 * VDD + 4 * TECH.Vtp0 + gamma_p * gamma_p ) - \
70 gamma_p * sqrt ( 2 * phip_p ) - VDD - TECH.Vtp0 + gamma_p * gamma_p * 0.5;
71 //
72 double Vt_p = TECH.Vtp0 - gamma_p * ( sqrt ( 2 * phip_p - Vsb1_p ) - \
73 sqrt ( 2 * phip_p ) );
75 {
76 A_1_p[ i ] = ( TECH.Vtp0 - Vt_p ) / ( VDD + Vt_p );
77 B_1_p[ i ] = ( Vt_p * ( VDD + TECH.Vtp0 ) ) / ( VDD + Vt_p );
78 ts_p[ i ] = t0_p[ i ] = taui_p[ i ] = tauo_p[ i ] = 0.0;
79 }
80 for ( unsigned int i = 1; i < p; i++ )
81 Vd_p[ i ] = VDD + Vsb1_p;
83 Vs_p[ i ] = VDD + Vsb1_p;
84 Vd_p[ p ] = 0;
85 Vs_p[ 1 ] = VDD;
86 return OK;
87 }
88
89 ///
90 void Fast::FreeCircuitPar( unsigned int n, unsigned int p )
91 {
92 if (n == 0)
93 n = 1;
94 free_dvector ( A_1_n);
95 free_dvector ( ts_n);
96 free_dvector ( tauo_n);
97 free_dvector ( taui_n);
98 free_dvector ( t0_n);
99 free_dvector ( beta_n);
100 free_dvector ( Vd_n);
101 free_dvector ( Vs_n);
102 free_dvector ( W_n);
103 free_dvector ( L_n);
104 if (p == 0)
105 p = 1;
106 free_dvector ( A_1_p);
107 free_dvector ( B_1_p);
108 free_dvector ( ts_p);
109 free_dvector ( tauo_p);
110 free_dvector ( taui_p);
111 free_dvector ( t0_p);
112 free_dvector ( beta_p);
113 free_dvector ( Vd_p);
114 free_dvector ( Vs_p);
115 free_dvector ( W_p);
116 free_dvector ( L_p);
117 }
118
119 ///
120 void Fast::CalcParamCircuit( unsigned int n, unsigned int p )
121 {
122
124 {
125 L_n[ i ] = L_n[ i ] - 2 * TECH.LD_n + TECH.XL_n;
126 W_n[ i ] = W_n[ i ] - 2 * TECH.WD_n + TECH.XW_n;
127 beta_n[ i ] = ( TECH.u0_n * TECH.Cox_n * W_n[ i ] / L_n[ i ] ) / ( 1 + TECH.theta_n * ( VDD - TECH.Vtn0 ) );
128 }
130 {
131 L_p[ i ] = L_p[ i ] - 2 * TECH.LD_p + TECH.XL_p;
132 W_p[ i ] = W_p[ i ] - 2 * TECH.WD_p + TECH.XW_p;
133 beta_p[ i ] = ( TECH.u0_p * TECH.Cox_p * W_p[ i ] / L_p[ i ] ) / ( 1 - TECH.theta_p * ( -VDD - TECH.Vtp0 ) );
134 }
135 }
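The closed-form expression assigned to Vsb1_n in InitCircuitVar can be checked numerically. The sketch below assumes that Vsb1_n is meant to be the self-consistent solution of V = VDD - Vt(V), with the body-effect threshold Vt(V) = Vtn0 + gamma * (sqrt(2*phi + V) - sqrt(2*phi)) that the listing uses for Vt_n. The function names and the numeric values in the usage note are illustrative and are not part of fast.h.

```cpp
#include <cassert>
#include <cmath>

// Body-effect threshold: Vt(Vsb) = Vt0 + gamma * (sqrt(2*phi + Vsb) - sqrt(2*phi)).
double threshold(double vt0, double gamma, double phi, double vsb)
{
    assert(2.0 * phi + vsb >= 0.0);   // argument of the square root must be valid
    return vt0 + gamma * (std::sqrt(2.0 * phi + vsb) - std::sqrt(2.0 * phi));
}

// Same terms as the Vsb1_n expression in InitCircuitVar (hypothetical helper name).
double vsb1_closed_form(double vdd, double vt0, double gamma, double phi)
{
    const double root2phi = std::sqrt(2.0 * phi);
    return -0.5 * gamma * std::sqrt(4.0 * gamma * root2phi + 8.0 * phi
                                    + 4.0 * vdd - 4.0 * vt0 + gamma * gamma)
           + gamma * root2phi + vdd - vt0 + gamma * gamma * 0.5;
}

// Residual of V = VDD - Vt(V); it vanishes when V is the self-consistent voltage.
double residual(double vdd, double vt0, double gamma, double phi, double v)
{
    return vdd - threshold(vt0, gamma, phi, v) - v;
}
```

With illustrative values such as VDD = 3.3, Vtn0 = 0.7, gamma = 0.5 and phi = 0.35, `residual(...)` evaluated at `vsb1_closed_form(...)` should be zero up to rounding, which is consistent with the assumption stated above.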
Iter.cc
6 #include "fast.h"
7
8
9 ///
10 int Fast::IterSol( const Circuit& circuit, TransistorType type, unsigned int NP, unsigned int NC, unsigned int n, unsigned int p, const double* NewWidt
11 {
12 unsigned int found = 0, n_tol;
13 int num_sol;
14 int iter = 0;
15 double tin;
16 int RetCode;
17 unsigned int k;
19 {
20 k = n;
21 tin = taui_n[ 1 ];
22 }
24 {
25 k = p;
26 tin = taui_p[ 1 ];
27 }
28 double* to = new double[ k + 1 ];
29 double* to_old = new double[ k + 1 ];
30 double* t0 = new double[ k + 1 ];
31 for ( unsigned int i = 1; i <= k; i++ )
32 {
33 to_old[ i ] = to[ i ] = tin;
34 t0[ i ] = 0;
35 }
37 t0[ 1 ] = t0_n[ 1 ];
39 t0[ 1 ] = t0_p[ 1 ];
40 double xb1, xb2;
41 while ( !found )
42 {
43 iter++;
44 n_tol = 0;
45 for ( unsigned int i = 1; i <= k; i++ )
46 {
47 num_sol = 1;
48 xb1 = to_old[ i ] - STEP_SOL;
49 xb2 = to_old[ i ] + STEP_SOL;
50 to[ i ] = SolveEq( circuit, NP, NC, type, ( t0[ i ] + STEP_SOL ), MAX_SOL, RetCode, i, n, p, NewWidth );
51 //if(RetCode != OK)
53
54 if ( to[ i ] == 0.0 )
55 {
56 num_sol = Brackets( circuit, NP, NC, xb1, xb2, type, i, n, p, NewWidth, RetCode );
59 if ( xb1 < 0.0 )
60 xb1 = 0.0;
61 if ( num_sol == 0 )
62 {
63 if ( i == 1 )
64 to[ i ] = tin;
65 else
66 to[ i ] = to[ i - 1 ] + tin;
67 }
68 else
69 {
70 to[ i ] = SolveEq( circuit, NP, NC, type, xb1, xb2, RetCode, i, n, p, NewWidth );
73 if ( to[ i ] == 0.0 )
74 {
75 if ( i == 1 )
76 to[ i ] = tin;
77 else
78 to[ i ] = to[ i - 1 ] + tin;
79 }
80 }
81 }
83 {
84 tauo_n[ i ] = to[ i ];
85 }
87 {
88 tauo_p[ i ] = to[ i ];
89 }
90 int Sop, Op; // Dummy
91 if ( i == 1 )
93 Op = Calct0ts1N ( circuit, NP, Sop, NewWidth );
94 else
95 Op = Calct0ts1P ( circuit, NP, Sop, NewWidth );
96 if ( ( i == k ) && ( k > 1 ) )
98 Op = Calct0tsnN( n, Sop );
99 else
100 Op = Calct0tsnP( n, Sop );
102 {
103 for ( unsigned int j = 1; j <= n; j++ )
104 t0[ j ] = t0_n[ j ];
105 }
107 {
108 for ( unsigned int j = 1; j <= p; j++ )
109 t0[ j ] = t0_p[ j ];
110 }
111
112 if ( ( fabs ( to[ i ] - to_old[ i ] ) <= STEP_SOL ) && ( iter > 1 ) )
113 n_tol++;
114 to_old[ i ] = to[ i ];
115 }
116 if ( ( n_tol == k ) || ( iter > ITERMAX ) )
117 found = 1;
118 }
119 delete[] to;
120 delete[] to_old;
delete[] t0;
121 return OK;
122 }
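The stopping rule of IterSol can be read in isolation from the circuit equations: full sweeps over the chain are repeated until every output time moves by at most STEP_SOL between two consecutive sweeps, or until ITERMAX is exceeded. The following is a minimal sketch of that control flow only. The `update` callback is a hypothetical stand-in for the per-transistor solve (SolveEq plus the bracketing fallback), and the names `sweep_until_stable`, `step` and `iter_max` are introduced here for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Repeats full sweeps over `values` until every entry changes by at most
// `step` between two consecutive sweeps, or until more than `iter_max`
// sweeps have been performed. Returns the number of sweeps.
int sweep_until_stable(std::vector<double>& values,
                       const std::function<double(std::size_t, const std::vector<double>&)>& update,
                       double step, int iter_max)
{
    assert(step >= 0.0);
    std::vector<double> old_values = values;
    int iter = 0;
    bool found = false;
    while (!found) {
        ++iter;
        std::size_t n_tol = 0;
        for (std::size_t i = 0; i < values.size(); ++i) {
            values[i] = update(i, values);
            // As in IterSol, the first sweep never counts as converged.
            if (std::fabs(values[i] - old_values[i]) <= step && iter > 1)
                ++n_tol;
            old_values[i] = values[i];
        }
        if (n_tol == values.size() || iter > iter_max)
            found = true;
    }
    return iter;
}
```

Note that the loop can also end because the sweep limit was hit, so a caller that needs to tell convergence from exhaustion would have to compare the returned count with `iter_max`.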
LastEqN.cc
6 #include "fast.h"
7
8 ///
9 double Fast::LastEqN( int OpCondition, unsigned int n, double tauon )
10 {
11 double Vc, temp, tc;
12 double C, D, F, G, H, I, J, K, M, N, O, P, R, S;
14 double x = t0_n[ n ] - taui_n[ n ];
15 double y = t0_n[ n ] - tauon;
16 tc = ( VDD * tauon * ( t0_n[ n ] - taui_n[ n ] ) + \
17 Vs_n[ n ] * taui_n[ n ] * ( tauon - t0_n[ n ] ) ) / \
18 ( VDD * ( t0_n[ n ] - taui_n[ n ] ) + Vs_n[ n ] * ( tauon - t0_n[ n ] ) );
19 C = -Vc * Vs_n[ n ] * beta_n[ n ] * ( A_1_n[ n ] + 1 ) / x;
20 I = ( 2 * A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] + \
21 2 * VDD * x + Vc * x + \
22 2 * ( Vs_n[ n ] * taui_n[ n ] - TECH.Vtn0 * x ) ) / \
23 ( Vc * x );
24 H = -2 * Vs_n[ n ] * ( A_1_n[ n ] + 1 ) / ( Vc * x );
25 F = Vc * beta_n[ n ] * ( A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] + \

26 VDD * x + Vc * x + \
27 Vs_n[ n ] * taui_n[ n ] - TECH.Vtn0 * x ) / x;
28 D = VDD * x - Vs_n[ n ] * y;
29 G = ( Vc * Vc ) * beta_n[ n ] * ( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) * \
30 ( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) / 2;
31 J = Vc * beta_n[ n ] * ( 2 * A_1_n[ n ] * Vs_n[ n ] * y * \
32 ( D * taui_n[ n ] + Vc * x * y ) + \
33 2 * D * VDD * x * y + \
34 ( VDD * VDD ) * tauon * x * x + \
35 VDD * x * y * ( Vc * x + \
36 Vs_n[ n ] * ( y - x ) - 2 * TECH.Vtn0 * x ) + \
37 Vs_n[ n ] * y * y * ( Vc * x - \
38 Vs_n[ n ] * taui_n[ n ] + 2 * TECH.Vtn0 * x ) ) / \
39 ( 2 * D * x * y );
40 K = -Vc * beta_n[ n ] * ( 2 * A_1_n[ n ] * Vs_n[ n ] * y + \
41 VDD * x + Vs_n[ n ] * y ) / \
42 ( 2 * x * y );
43 M = ( Vc * Vc ) * beta_n[ n ] * x * y * \
44 ( 2 * A_1_n[ n ] * Vs_n[ n ] * ( VDD * ( x - y ) - Vc * y ) - \
45 2 * D * VDD + VDD * ( -Vc * x + \
46 2 * ( Vs_n[ n ] * ( x - y ) + TECH.Vtn0 * x ) ) - \
47 Vs_n[ n ] * y * ( Vc + 2 * TECH.Vtn0 ) );
48 N = 2 * D * ( y * ( Vc * x + Vs_n[ n ] * taui_n[ n ] ) - \
49 VDD * x * tauon );
50 O = Vc * beta_n[ n ] * ( VDD * ( 2 * t0_n[ n ] - tauon ) + ( Vc - 2 * TECH.Vtn0 ) * y ) / \
51 ( 2 * y );
52 P = -Vc * VDD * beta_n[ n ] / ( 2 * y );
53 R = -( Vc * Vc ) * beta_n[ n ] * y * ( 2 * VDD + Vc - 2 * TECH.Vtn0 );
54 S = 2 * ( Vc * y - VDD * tauon );
55
57 {
58 case _A_:
59
60 temp = -M * LOG2 ( 2 * ( D * D ) * ts_n[ n ] + N ) / ( D * D ) + \
61 M * LOG2 ( 2 * ( D * D ) * taui_n[ n ] + N ) / ( D * D ) - \
62 R * LOG2 ( S + 2 * VDD * taui_n[ n ] ) / VDD + \
63 R * LOG2 ( S + 2 * VDD * tauon ) / VDD - \
64 ( 6 * F * H * ( t0_n[ n ] - ts_n[ n ] ) + \
65 3 * C * H * ( ( t0_n[ n ] * t0_n[ n ] ) - ( ts_n[ n ] * ts_n[ n ] ) ) - \
66 4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * t0_n[ n ] + I ), 1.5 ) + \
67 4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * ts_n[ n ] + I ), 1.5 ) + \
68 3 * H * ( 2 * J * ( ts_n[ n ] - taui_n[ n ] ) + \
69 K * ( ( ts_n[ n ] * ts_n[ n ] ) - ( taui_n[ n ] * taui_n[ n ] ) ) + \
70 2 * O * ( taui_n[ n ] - tauon ) + \
71 P * ( ( taui_n[ n ] * taui_n[ n ] ) - ( tauon * tauon ) ) ) ) / \
72 ( 6 * H );
73
74 break;
75 case _B_:
76 temp = -R * LOG2 ( S + 2 * VDD * ts_n[ n ] ) / VDD + \
77 R * LOG2 ( S + 2 * VDD * tauon ) / VDD - \
78 ( 6 * F * H * ( t0_n[ n ] - taui_n[ n ] ) + \
79 3 * C * H * ( ( t0_n[ n ] * t0_n[ n ] ) - ( taui_n[ n ] * taui_n[ n ] ) ) - \
81 4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * taui_n[ n ] + I ), 1.5 ) - \
82 3 * H * ( 2 * G * ( ts_n[ n ] - taui_n[ n ] ) + \
83 2 * O * ( tauon - ts_n[ n ] ) + \
84 P * ( ( tauon * tauon ) - ( ts_n[ n ] * ts_n[ n ] ) ) ) ) / ( 6 * H );
85
86 break;
87 case _C_:
88 temp = M * LOG2 ( 2 * ( D * D ) * tc + N ) / ( D * D ) - \
89 M * LOG2 ( 2 * ( D * D ) * ts_n[ n ] + N ) / ( D * D ) - \
90 ( 6 * F * H * ( t0_n[ n ] - ts_n[ n ] ) + \
91 3 * C * H * ( ( t0_n[ n ] * t0_n[ n ] ) - ( ts_n[ n ] * ts_n[ n ] ) ) - \
93 4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * ts_n[ n ] + I ), 1.5 ) - \
94 3 * H * ( 2 * J * ( tc - ts_n[ n ] ) + \
95 K * ( ( tc * tc ) - ( ts_n[ n ] * ts_n[ n ] ) ) ) ) / \
96 ( 6 * H );
97 break;
98 case _D_:
99 temp = F * ( tc - t0_n[ n ] ) + C * 0.5 * ( tc * tc - t0_n[ n ] * t0_n[ n ] ) + \
100 2 * Vc * Vc * beta_n[ n ] * pow ( ( H * t0_n[ n ] + I ), 1.5 ) / ( 3 * H ) - \
101 2 * Vc * Vc * beta_n[ n ] * pow ( ( H * tc + I ), 1.5 ) / ( 3 * H );
102 break;
103 case _E_:
104 default:
105 temp = 0;
106 break;
107 }
108 return temp;
109 }
LastEqP.cc
6 #include "fast.h"
7
8 ///
9 double Fast::LastEqP( int OpCondition, unsigned int p, double tauop )
10 {
11 double Vc, temp, tc;
12 double C, D, F, G, I, J, K, M, N, O, P, R, S, X, x, y;
14 tc = ( VDD * t0_p[ p ] * ( taui_p[ p ] - tauop ) + Vd_p[ p ] * tauop * ( t0_p[ p ] - taui_p[ p ] ) + \
15 Vs_p[ p ] * taui_p[ p ] * ( tauop - t0_p[ p ] ) ) / ( VDD * ( taui_p[ p ] - tauop ) + Vd_p[ p ] * ( t0_p[ p ] - taui_p[ p ] ) + Vs_p[ p ] *
16 x = t0_p[ p ] - taui_p[ p ];
17 y = t0_p[ p ] - tauop;
18 X = VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ];
19 C = -Vc * beta_p[ p ] * ( A_1_p[ p ] * X + \
20 B_1_p[ p ] * x + \
21 VDD * t0_p[ p ] + Vc * x - Vs_p[ p ] * taui_p[ p ] ) / x;
22 D = Vc * beta_p[ p ] * ( A_1_p[ p ] + 1 ) * ( VDD - Vs_p[ p ] ) / x;
23 F = Vc * ( 2 * A_1_p[ p ] * X + \
24 2 * B_1_p[ p ] * x + \
25 2 * VDD * t0_p[ p ] + Vc * x - \
26 2 * Vs_p[ p ] * taui_p[ p ] ) / x;
27 G = Vc * beta_p[ p ] * SQRT ( Vc * ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) ) - \
28 VDD * Vc * beta_p[ p ] - ( Vc * Vc ) * beta_p[ p ] - Vc * TECH.Vtp0 * beta_p[ p ];
29 J = VDD * ( y - x ) + Vd_p[ p ] * x - Vs_p[ p ] * y;
30 K = Vc * beta_p[ p ] * ( 2 * A_1_p[ p ] * y * \
31 ( ( VDD * VDD ) * t0_p[ p ] * ( x - y ) + \
32 VDD * ( Vc * x * y - Vd_p[ p ] * t0_p[ p ] * x + \
33 Vs_p[ p ] * ( t0_p[ p ] * y - \
34 taui_p[ p ] * ( x - y ) ) ) - \
35 Vs_p[ p ] * ( Vc * x * y - taui_p[ p ] * ( Vd_p[ p ] * x - \
36 Vs_p[ p ] * y ) ) ) - \
37 2 * B_1_p[ p ] * J * x * y + \
38 ( VDD * VDD ) * t0_p[ p ] * ( x - y ) * ( x + y ) + \
39 VDD * ( Vc * x * y * ( x + y ) - \
40 Vd_p[ p ] * x * ( t0_p[ p ] * ( x + y ) + tauop * ( x - y ) ) + \
41 Vs_p[ p ] * y * ( t0_p[ p ] * ( x + y ) - taui_p[ p ] * ( x - y ) ) ) - \
42 Vc * x * y * ( Vd_p[ p ] * x + Vs_p[ p ] * y ) + \
43 ( Vd_p[ p ] * Vd_p[ p ] ) * tauop * x * x + \
44 Vs_p[ p ] * Vd_p[ p ] * x * y * ( y - x ) - \
45 Vs_p[ p ] * Vs_p[ p ] * y * y * taui_p[ p ] ) / \
46 ( 2 * J * x * y );
47 I = Vc * beta_p[ p ] * ( 2 * A_1_p[ p ] * ( VDD - Vs_p[ p ] ) * y + \
48 VDD * ( x + y ) - Vd_p[ p ] * x - Vs_p[ p ] * y ) / \
49 ( 2 * x * y );
50 M = ( Vc * Vc ) * beta_p[ p ] * x * y * \
51 ( 2 * A_1_p[ p ] * ( VDD * ( Vc * y - Vd_p[ p ] * y + Vs_p[ p ] * x ) - \
52 Vs_p[ p ] * ( Vc * y + Vd_p[ p ] * ( x - y ) ) ) - \
53 2 * B_1_p[ p ] * J + VDD * ( Vc * ( x + y ) - \
54 2 * ( Vd_p[ p ] * y - Vs_p[ p ] * x ) ) - \
55 Vc * ( Vd_p[ p ] * x + Vs_p[ p ] * y ) + \
56 2 * Vd_p[ p ] * Vs_p[ p ] * ( y - x ) );
57 N = 2 * J * ( VDD * t0_p[ p ] * ( y - x ) + \
58 Vc * x * y + \
59 Vd_p[ p ] * tauop * x - \
60 Vs_p[ p ] * taui_p[ p ] * y );
61 O = -Vc * beta_p[ p ] * ( VDD * ( t0_p[ p ] + y ) + \
62 Vc * y - Vd_p[ p ] * tauop + 2 * TECH.Vtp0 * y ) / \
63 ( 2 * y );
64 P = Vc * beta_p[ p ] * ( VDD - Vd_p[ p ] ) / ( 2 * y );
65 R = ( Vc * Vc ) * beta_p[ p ] * y * ( 2 * VDD + Vc + 2 * TECH.Vtp0 );
66 S = 2 * ( VDD * tauop - Vc * y - Vd_p[ p ] * tauop );
68 {
69 case _A_:
70 temp = -M * LOG2 ( ( 2 * ( J * J ) * ts_p[ p ] - N ) ) / ( J * J ) + \
71 M * LOG2 ( ( 2 * ( J * J ) * taui_p[ p ] - N ) ) / ( J * J ) + \
72 R * LOG2 ( ( 2 * taui_p[ p ] * ( VDD - Vd_p[ p ] ) - S ) ) / ( Vd_p[ p ] - VDD ) + \
73 R * LOG2 ( ( 2 * tauop * ( VDD - Vd_p[ p ] ) - S ) ) / ( VDD - Vd_p[ p ] ) - \
74 ( 2 * Vc * beta_p[ p ] * ( 2 * D * t0_p[ p ] - F * beta_p[ p ] ) * \
75 SQRT ( ( F * beta_p[ p ] - 2 * D * t0_p[ p ] ) / beta_p[ p ] ) + \
76 2 * Vc * beta_p[ p ] * ( F * beta_p[ p ] - 2 * D * ts_p[ p ] ) * \
77 SQRT ( ( F * beta_p[ p ] - 2 * D * ts_p[ p ] ) / beta_p[ p ] ) + \
78 3 * D * ( 2 * C * ( t0_p[ p ] - ts_p[ p ] ) + D * ( ( t0_p[ p ] * t0_p[ p ] ) - ( ts_p[ p ] * ts_p[ p ] ) ) + \
79 I * ( ( ts_p[ p ] * ts_p[ p ] ) - ( taui_p[ p ] * taui_p[ p ] ) ) + \
80 2 * K * ( ts_p[ p ] - taui_p[ p ] ) + \
81 2 * O * ( taui_p[ p ] - tauop ) + \
82 P * ( ( taui_p[ p ] * taui_p[ p ] ) - ( tauop * tauop ) ) ) ) / ( 6 * D );
83
84 break;
85 case _B_:
86 temp = R * LOG2 ( ( 2 * ts_p[ p ] * ( VDD - Vd_p[ p ] ) - S ) ) / ( Vd_p[ p ] - VDD ) + \
87 R * LOG2 ( ( 2 * tauop * ( VDD - Vd_p[ p ] ) - S ) ) / ( VDD - Vd_p[ p ] ) - \
90 2 * Vc * beta_p[ p ] * ( F * beta_p[ p ] - 2 * D * taui_p[ p ] ) * \
91 SQRT ( ( F * beta_p[ p ] - 2 * D * taui_p[ p ] ) / beta_p[ p ] ) + \
92 3 * D * ( 2 * C * ( t0_p[ p ] - taui_p[ p ] ) + \
93 D * ( ( t0_p[ p ] * t0_p[ p ] ) - ( taui_p[ p ] * taui_p[ p ] ) ) + \
94 2 * G * ( taui_p[ p ] - ts_p[ p ] ) + \
95 2 * O * ( ts_p[ p ] - tauop ) + \
96 P * ( ( ts_p[ p ] * ts_p[ p ] ) - ( tauop * tauop ) ) ) ) / ( 6 * D );
97
98 break;
99 case _C_:
100 temp = M * LOG2 ( ( 2 * ( J * J ) * tc - N ) ) / ( J * J ) - \
101 M * LOG2 ( ( 2 * ( J * J ) * ts_p[ p ] - N ) ) / ( J * J ) - \
104 2 * Vc * beta_p[ p ] * ( F * beta_p[ p ] - 2 * D * ts_p[ p ] ) * \
105 SQRT ( ( F * beta_p[ p ] - 2 * D * ts_p[ p ] ) / beta_p[ p ] ) + \
106 3 * D * ( 2 * C * ( t0_p[ p ] - ts_p[ p ] ) + \
107 D * ( ( t0_p[ p ] * t0_p[ p ] ) - ( ts_p[ p ] * ts_p[ p ] ) ) + \
108 I * ( ( ts_p[ p ] * ts_p[ p ] ) - ( tc * tc ) ) + \
109 2 * K * ( ts_p[ p ] - tc ) ) ) / ( 6 * D );
110
111 break;
112 case _D_:
113 temp = -( 2 * Vc * beta_p[ p ] * ( 2 * D * t0_p[ p ] - F * beta_p[ p ] ) * \
115 2 * Vc * beta_p[ p ] * ( F * beta_p[ p ] - 2 * D * tc ) * \
116 SQRT ( ( F * beta_p[ p ] - 2 * D * tc ) / beta_p[ p ] ) + \
117 3 * D * ( 2 * C * ( t0_p[ p ] - tc ) + D * ( t0_p[ p ] * t0_p[ p ] - tc * tc ) ) ) / \
118 ( 6 * D );
119 break;
120 case _E_:
121 default:
122 temp = 0;
123 break;
124 }
125 return temp;
126 }
MiddleEqN.cc
6 #include "fast.h"
7
8 ///
9 double Fast::MiddleEqN( int OpCondition, unsigned int i, double tauon )
10 {
11 double Vc, temp;
12 double D, F, G, H, I, J, K, M, N;
13 Vc = TECH.Ec_n * L_n[ i ];
15 {
16 case _A_:
17 D = Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) + \
18 Vs_n[ i ] * ( tauon - t0_n[ i ] );
19 F = Vc * beta_n[ i ] * \
20 ( 2 * A_1_n[ i ] * Vs_n[ i ] * ( t0_n[ i ] - tauon ) * \
21 ( D * taui_n[ i ] + Vc * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) ) + \
22 2 * D * VDD * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) + \
23 Vc * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) * \
24 ( Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) + Vs_n[ i ] * ( t0_n[ i ] - tauon ) ) + \
25 ( Vd_n[ i ] * Vd_n[ i ] ) * tauon * ( ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - taui_n[ i ] ) ) + \
26 Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) * \
27 ( Vs_n[ i ] * ( taui_n[ i ] - tauon ) + 2 * TECH.Vtn0 * ( taui_n[ i ] - t0_n[ i ] ) ) - \
28 Vs_n[ i ] * ( ( t0_n[ i ] - tauon ) * ( t0_n[ i ] - tauon ) ) * \
29 ( Vs_n[ i ] * taui_n[ i ] + 2 * TECH.Vtn0 * ( taui_n[ i ] - t0_n[ i ] ) ) ) / \
30 ( 2 * D * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) );
31 G = Vc * beta_n[ i ] * \
32 ( 2 * A_1_n[ i ] * Vs_n[ i ] * ( t0_n[ i ] - tauon ) + \
33 Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) + Vs_n[ i ] * ( t0_n[ i ] - tauon ) ) / \
34 ( 2 * ( t0_n[ i ] - taui_n[ i ] ) * ( tauon - t0_n[ i ] ) );
35 H = ( Vc * Vc ) * beta_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) * \
36 ( tauon - t0_n[ i ] ) * \
37 ( 2 * A_1_n[ i ] * Vs_n[ i ] * ( Vc * ( t0_n[ i ] - tauon ) + \
38 Vd_n[ i ] * ( taui_n[ i ] - tauon ) ) + 2 * D * VDD + Vc * \
39 ( Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) + Vs_n[ i ] * ( t0_n[ i ] - tauon ) ) + \
40 2 * ( Vd_n[ i ] * ( Vs_n[ i ] * ( taui_n[ i ] - tauon ) + \
41 TECH.Vtn0 * ( taui_n[ i ] - t0_n[ i ] ) ) + Vs_n[ i ] * TECH.Vtn0 * ( t0_n[ i ] - tauon ) ) );
42 I = 2 * D * ( Vc * ( t0_n[ i ] - taui_n[ i ] ) * \
43 ( t0_n[ i ] - tauon ) + Vd_n[ i ] * tauon * ( taui_n[ i ] - t0_n[ i ] ) + \
44 Vs_n[ i ] * taui_n[ i ] * ( t0_n[ i ] - tauon ) );
45 J = Vc * beta_n[ i ] * ( 2 * VDD * ( t0_n[ i ] - tauon ) + \
46 Vc * ( t0_n[ i ] - tauon ) + Vd_n[ i ] * tauon + \
47 2 * TECH.Vtn0 * ( tauon - t0_n[ i ] ) ) / ( 2 * ( t0_n[ i ] - tauon ) );
48 K = Vc * Vd_n[ i ] * beta_n[ i ] / ( 2 * ( tauon - t0_n[ i ] ) );
49 M = ( Vc * Vc ) * beta_n[ i ] * ( tauon - t0_n[ i ] ) * \
50 ( 2 * VDD + Vc - 2 * TECH.Vtn0 );
51 N = 2 * ( Vc * ( t0_n[ i ] - tauon ) - Vd_n[ i ] * tauon );
52 temp = -H * LOG2 ( 2 * ( D * D ) * t0_n[ i ] + I ) / ( D * D ) + \
53 H * LOG2 ( 2 * ( D * D ) * taui_n[ i ] + I ) / ( D * D ) - \
54 M * LOG2 ( N + 2 * Vd_n[ i ] * taui_n[ i ] ) / Vd_n[ i ] + \
55 M * LOG2 ( N + 2 * Vd_n[ i ] * tauon ) / Vd_n[ i ] - \
56 ( 2 * F * ( t0_n[ i ] - taui_n[ i ] ) + G * ( ( t0_n[ i ] * t0_n[ i ] ) - ( taui_n[ i ] * taui_n[ i ] ) ) + 2 * J * ( taui_n[ i ] - tauo
57 break;
58 case _E_:
59 default:
60 temp = 0;
61 break;
62 }
63 return temp;
64 }
MiddleEqP.cc
6 #include "fast.h"
7
8 ///
9 double Fast::MiddleEqP( int OpCondition, unsigned int i, double tauop )
10 {
11 double Vc, temp;
12 double D, J, K, M, N, O, P, R, S, T;
13 Vc = TECH.Ec_p * L_p[ i ];
15 {
16 case _A_:
17
18 D = VDD * ( taui_p[ i ] - tauop ) + \
19 Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) + \
20 Vs_p[ i ] * ( tauop - t0_p[ i ] );
21 J = Vc * beta_p[ i ] * \
22 ( 2 * A_1_p[ i ] * ( t0_p[ i ] - tauop ) * ( ( VDD * VDD ) * t0_p[ i ] * ( taui_p[ i ] - tauop ) - \
23 VDD * ( Vc * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) + \
24 Vd_p[ i ] * t0_p[ i ] * ( taui_p[ i ] - t0_p[ i ] ) + \
25 Vs_p[ i ] * ( ( t0_p[ i ] * t0_p[ i ] ) - t0_p[ i ] * tauop + taui_p[ i ] * ( taui_p[ i ] - tauop ) ) ) + \
26 Vs_p[ i ] * ( Vc * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) - \
27 taui_p[ i ] * ( Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) + Vs_p[ i ] * ( tauop - t0_p[ i ] ) ) ) ) + \
28 2 * B_1_p[ i ] * D * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) + \
29 ( VDD * VDD ) * t0_p[ i ] * ( taui_p[ i ] - tauop ) * \
30 ( 2 * t0_p[ i ] - taui_p[ i ] - tauop ) - VDD * ( Vc * ( t0_p[ i ] - taui_p[ i ] ) * \
31 ( t0_p[ i ] - tauop ) * ( 2 * t0_p[ i ] - taui_p[ i ] - tauop ) + \
32 Vd_p[ i ] * ( taui_p[ i ] - t0_p[ i ] ) * \
33 ( 2 * ( t0_p[ i ] * t0_p[ i ] ) - t0_p[ i ] * ( taui_p[ i ] + tauop ) - \
34 tauop * ( taui_p[ i ] - tauop ) ) + Vs_p[ i ] * ( t0_p[ i ] - tauop ) * \
35 ( 2 * ( t0_p[ i ] * t0_p[ i ] ) - t0_p[ i ] * ( taui_p[ i ] + tauop ) + taui_p[ i ] * ( taui_p[ i ] - tauop ) ) )
36 Vc * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) * ( Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) + \
37 Vs_p[ i ] * ( t0_p[ i ] - tauop ) ) - ( Vd_p[ i ] * Vd_p[ i ] ) * tauop * ( ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[
38 Vs_p[ i ] * ( tauop - t0_p[ i ] ) * ( Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) * \
39 ( taui_p[ i ] - tauop ) + Vs_p[ i ] * taui_p[ i ] * ( tauop - t0_p[ i ] ) ) ) / \
40 ( 2 * D * ( t0_p[ i ] - taui_p[ i ] ) * ( tauop - t0_p[ i ] ) );
41 K = Vc * beta_p[ i ] * ( 2 * A_1_p[ i ] * ( VDD - Vs_p[ i ] ) * ( t0_p[ i ] - tauop ) + \
42 VDD * ( 2 * t0_p[ i ] - taui_p[ i ] - tauop ) + \
43 Vd_p[ i ] * ( taui_p[ i ] - t0_p[ i ] ) + \
44 Vs_p[ i ] * ( tauop - t0_p[ i ] ) ) / ( 2 * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) );
45 M = ( Vc * Vc ) * beta_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) * \
46 ( 2 * A_1_p[ i ] * ( VDD * ( Vc * ( t0_p[ i ] - tauop ) + \
47 Vd_p[ i ] * ( tauop - t0_p[ i ] ) + Vs_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) ) - \
48 Vs_p[ i ] * ( Vc * ( t0_p[ i ] - tauop ) + Vd_p[ i ] * ( tauop - taui_p[ i ] ) ) ) - \
49 2 * B_1_p[ i ] * D + VDD * ( Vc * ( 2 * t0_p[ i ] - taui_p[ i ] - tauop ) - \
50 2 * ( Vd_p[ i ] * ( t0_p[ i ] - tauop ) + Vs_p[ i ] * ( taui_p[ i ] - t0_p[ i ] ) ) ) - \
51 Vc * ( Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) + \
52 Vs_p[ i ] * ( t0_p[ i ] - tauop ) ) + 2 * Vd_p[ i ] * Vs_p[ i ] * ( taui_p[ i ] - tauop ) );
53 N = 2 * D * ( VDD * t0_p[ i ] * ( taui_p[ i ] - tauop ) + \
54 Vc * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) + \
55 Vd_p[ i ] * tauop * ( t0_p[ i ] - taui_p[ i ] ) + Vs_p[ i ] * taui_p[ i ] * ( tauop - t0_p[ i ] ) );
56 O = Vc * beta_p[ i ] * ( VDD * ( 2 * t0_p[ i ] - tauop ) + \
57 Vc * ( t0_p[ i ] - tauop ) - Vd_p[ i ] * tauop + 2 * TECH.Vtp0 * ( t0_p[ i ] - tauop ) ) / \
58 ( 2 * ( tauop - t0_p[ i ] ) );
59 P = Vc * beta_p[ i ] * ( VDD - Vd_p[ i ] ) / ( 2 * ( t0_p[ i ] - tauop ) );
60 R = ( Vc * Vc ) * beta_p[ i ] * ( t0_p[ i ] - tauop ) * ( 2 * VDD + Vc + 2 * TECH.Vtp0 );
61 S = 2 * ( VDD * tauop + Vc * ( tauop - t0_p[ i ] ) - Vd_p[ i ] * tauop );
62 T = 2 * ( VDD - Vd_p[ i ] );
63 temp = -M * LOG2 ( 2 * ( D * D ) * t0_p[ i ] - N ) / ( D * D ) + \
64 M * LOG2 ( 2 * ( D * D ) * taui_p[ i ] - N ) / ( D * D ) - \
65 R * LOG ( T * taui_p[ i ] - S ) / T + \
66 R * LOG ( T * tauop - S ) / T - \
67 ( 2 * J * ( t0_p[ i ] - taui_p[ i ] ) + \
68 K * ( ( t0_p[ i ] * t0_p[ i ] ) - ( taui_p[ i ] * taui_p[ i ] ) ) + \
69 2 * O * ( taui_p[ i ] - tauop ) + \
70 P * ( ( taui_p[ i ] * taui_p[ i ] ) - ( tauop * tauop ) ) ) / 2;
71
72 break;
73 case _E_:
74 default:
75 temp = 0;
76 break;
77 }
78 return temp;
79 }
Power.cc
6 #include "fast.h"
7
8
9 ///
10 double Fast::CalcPower( const Circuit& circuit,
11 unsigned int NP,
12 unsigned int NC,
13 unsigned int n,
14 unsigned int p,
16 TransitionType TOut,
17 int& RetCode )
18 {
19
20 double Ecc = 0.0, Esc = 0.0;
21 switch ( TOut )
22 {
23 case FALL: // n chain
24 RetCode = CalcPowerN( circuit, NP, NC, Ecc, Esc, n, p, NewWidth );
26 return 0.0;
27 break;
28 case RISE: // p chain
29 RetCode = CalcPowerP( circuit, NP, NC, Ecc, Esc, n, p, NewWidth );
31 return 0.0;
32 break;
33 case NOTRANSITION:
34 default:
35 break;
36 }
37 return ( Ecc + Esc );
38 }
QnP.cc
#include "fast.h"

///
double Fast::QnP(unsigned int n, unsigned int p)
{
    double D_n, E_n, F_n, G_n;
    double to;
    double Vc, y;

    if (n != 0)
    {
        Vc = TECH.Ec_n * L_n[1];
        to = taui_n[1] * (1 - TECH.Vtn0 / VDD);
        if (to > tauo_p[p])
            to = tauo_p[p];
        D_n = Vc * beta_n[1] * (VDD * VDD * taui_p[1] * (t0_p[p] - 2 * tauo_p[p]) - \
              VDD * (Vc * (t0_p[p] - tauo_p[p]) * \
              (2 * t0_p[p] - taui_p[1] - 2 * tauo_p[p]) + \
              taui_p[1] * (Vd_p[p] * (t0_p[p] - 3 * tauo_p[p]) + \
              2 * TECH.Vtn0 * (t0_p[p] - tauo_p[p]))) - \
              Vd_p[p] * taui_p[1] * (Vc * (t0_p[p] - tauo_p[p]) + \
              Vd_p[p] * tauo_p[p] + 2 * TECH.Vtn0 * \
              (tauo_p[p] - t0_p[p]))) / \
              (2 * taui_p[1] * (VDD - Vd_p[p]) * (t0_p[p] - tauo_p[p]));
        E_n = Vc * beta_n[1] * (VDD * (2 * t0_p[p] - taui_p[1] - 2 * tauo_p[p]) + \
              Vd_p[p] * taui_p[1]) / \
              (2 * taui_p[1] * (tauo_p[p] - t0_p[p]));
        F_n = Vc * Vc * beta_n[1] * (tauo_p[p] - t0_p[p]) * \
              (2 * VDD * VDD * (t0_p[p] - taui_p[1]) + \
              VDD * (Vc * (2 * t0_p[p] - taui_p[1] - 2 * tauo_p[p]) + \
              2 * (Vd_p[p] * (taui_p[1] - tauo_p[p]) + TECH.Vtn0 * taui_p[1])) + \
              Vd_p[p] * taui_p[1] * (Vc - 2 * TECH.Vtn0));
        G_n = 2 * taui_p[1] * (VDD - Vd_p[p]) * (VDD * t0_p[p] + \
              Vc * (t0_p[p] - tauo_p[p]) - \
              Vd_p[p] * tauo_p[p]);
        y = -F_n * (LOG2(2 * t0_p[p] * taui_p[1] * (VDD - Vd_p[p]) * \
            (VDD - Vd_p[p]) - G_n)) / \
            (taui_p[1] * (VDD - Vd_p[p]) * \
            (VDD - Vd_p[p])) + \
            F_n * (LOG2(2 * to * taui_p[1] * (VDD - Vd_p[p]) * \
            (VDD - Vd_p[p]) - G_n)) / \
            (taui_p[1] * (VDD - Vd_p[p]) * \
            (VDD - Vd_p[p])) - \
            (2 * D_n * (t0_p[p] - to) + E_n * (t0_p[p] * t0_p[p] - to * to)) * 0.5;
        return y;
    }
    else
        return 0.0;
}
QpN.cc
#include "fast.h"

///
double Fast::QpN(unsigned int n, unsigned int p)
{
    double D_p, E_p, F_p, G_p;
    double to;
    double Vc, y;

    if (p != 0)
    {
        Vc = TECH.Ec_p * L_p[1];
        to = taui_n[1] * (1 + TECH.Vtp0 / VDD);
        if (to > tauo_n[n])
            to = tauo_n[n];
        D_p = Vc * beta_p[1] * (VDD * (t0_n[n] - tauo_n[n]) * \
              (2 * Vc * (t0_n[n] - tauo_n[n]) - Vd_n[n] * taui_n[1]) - \
              Vd_n[n] * taui_n[1] * (Vc * (t0_n[n] - tauo_n[n]) - \
              Vd_n[n] * tauo_n[n] + 2 * TECH.Vtp0 * (t0_n[n] - tauo_n[n]))) / \
              (2 * Vd_n[n] * taui_n[1] * (t0_n[n] - tauo_n[n]));

        E_p = Vc * beta_p[1] * (2 * VDD * (t0_n[n] - tauo_n[n]) - Vd_n[n] * taui_n[1]) / \
              (2 * taui_n[1] * (t0_n[n] - tauo_n[n]));
        F_p = Vc * Vc * beta_p[1] * (t0_n[n] - tauo_n[n]) * \
              (2 * VDD * VDD * (t0_n[n] - tauo_n[n]) + \
              2 * VDD * (Vc * (t0_n[n] - tauo_n[n]) + \
              Vd_n[n] * (tauo_n[n] - taui_n[1])) - Vd_n[n] * taui_n[1] * (Vc + 2 * TECH.Vtp0));
        G_p = 2 * Vd_n[n] * taui_n[1] * (VDD * (t0_n[n] - tauo_n[n]) + \
              Vc * (t0_n[n] - tauo_n[n]) + Vd_n[n] * tauo_n[n]);
        y = -F_p * (LOG2(2 * Vd_n[n] * Vd_n[n] * t0_n[n] * taui_n[1] - G_p)) / \
            (Vd_n[n] * Vd_n[n] * taui_n[1]) + \
            (F_p * LOG2(2 * Vd_n[n] * Vd_n[n] * to * taui_n[1] - G_p)) / \
            (Vd_n[n] * Vd_n[n] * taui_n[1]) - \
            (2 * D_p * (t0_n[n] - to) + E_p * (t0_n[n] * t0_n[n] - to * to)) * 0.5;
        return y;
    }
    else
        return 0.0;
}
Solve.cc
#include "fast.h"

#define SIGN(a,b) ((b) >= 0.0 ? fabs(a) : -fabs(a))

/// Root search on [start, end] by Brent's method (bisection combined with
/// secant and inverse quadratic interpolation steps).
/// The parameter list is cut off in the listing; the body also uses
/// i, n, p and NewWidth.
double Fast::SolveEq( const Circuit& circuit, unsigned int NP, unsigned int NC, TransistorType type, double start, double end, int& RetCode, unsigned i
{
    int iter;
    double a = start, b = end, c = end, d, e, min1, min2;
    double fa, fb, fc, pp, q, r, s, tol1, xm, last;
    double tol = TOL;

    // ... (line missing in the listing; presumably selects EqN or EqP according to type)
        fb = EqN( circuit, NP, NC, b, RetCode, i, n, p, NewWidth );
    // ...
        fb = EqP( circuit, NP, NC, b, RetCode, i, n, p, NewWidth );
    last = fb;
    // ...
        fa = EqN( circuit, NP, NC, a, RetCode, i, n, p, NewWidth );
    // ...
        fa = EqP( circuit, NP, NC, a, RetCode, i, n, p, NewWidth );
    // the root must be bracketed: f(start) and f(end) need opposite signs
    if ( ( fa > 0.0 && fb > 0.0 ) || ( fa < 0.0 && fb < 0.0 ) )
    {
        // ...
        return 0.0;
    }
    fc = fb;
    for ( iter = 1; iter <= ITERMAX; iter++ )
    {
        if ( ( fb > 0.0 && fc > 0.0 ) || ( fb < 0.0 && fc < 0.0 ) )
        {
            // rename a, b, c and adjust the bounding interval d
            c = a;
            fc = fa;
            e = d = b - a;
        }
        if ( fabs ( fc ) < fabs ( fb ) )
        {
            a = b;
            b = c;
            c = a;
            fa = fb;
            fb = fc;
            fc = fa;
        }
        // convergence check
        tol1 = 2.0 * EPS * fabs ( b ) + 0.5 * tol;
        xm = 0.5 * ( c - b );
        if ( fabs ( xm ) <= tol1 || fb == 0.0 )
            return b;
        if ( fabs ( e ) >= tol1 && fabs ( fa ) > fabs ( fb ) )
        {
            // attempt an interpolation step
            s = fb / fa;
            if ( a == c )
            {
                // secant step
                pp = 2.0 * xm * s;
                q = 1.0 - s;
            }
            else
            {
                // inverse quadratic interpolation
                q = fa / fc;
                r = fb / fc;
                pp = s * ( 2.0 * xm * q * ( q - r ) - ( b - a ) * ( r - 1.0 ) );
                q = ( q - 1.0 ) * ( r - 1.0 ) * ( s - 1.0 );
            }
            if ( pp > 0.0 )
                q = -q;
            pp = fabs ( pp );
            min1 = 3.0 * xm * q - fabs ( tol1 * q );
            min2 = fabs ( e * q );
            if ( 2.0 * pp < ( min1 < min2 ? min1 : min2 ) )
            {
                // accept the interpolation
                e = d;
                d = pp / q;
            }
            else
            {
                // interpolation failed, use bisection
                d = xm;
                e = d;
            }
        }
        else
        {
            // bounds decreasing too slowly, use bisection
            d = xm;
            e = d;
        }
        // move the last best guess to a and evaluate the new trial root
        a = b;
        fa = fb;
        if ( fabs ( d ) > tol1 )
            b += d;
        else
            b += SIGN ( tol1, xm );
        // ...
            fb = EqN( circuit, NP, NC, b, RetCode, i, n, p, NewWidth );
        // ...
            fb = EqP( circuit, NP, NC, b, RetCode, i, n, p, NewWidth );
    }
    // ... (iteration limit reached)
    return 0.0;
}
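The root search in SolveEq can be tried outside the Fast class with a stand-alone version. The following is only a sketch under stated assumptions: the name brent_root, the std::function interface, the default tolerance and the use of exceptions in place of RetCode are all illustrative choices and do not come from the tool. The update rules themselves follow the listing above.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>
#include <limits>
#include <stdexcept>

// Stand-alone Brent root search (hypothetical helper, not part of the tool).
// f(a) and f(b) must have opposite signs.
double brent_root(const std::function<double(double)>& f,
                  double a, double b,
                  double tol = 1e-12, int max_iter = 100)
{
    const double eps = std::numeric_limits<double>::epsilon();
    double fa = f(a), fb = f(b);
    if ((fa > 0.0 && fb > 0.0) || (fa < 0.0 && fb < 0.0))
        throw std::invalid_argument("root is not bracketed");

    double c = b, fc = fb, d = 0.0, e = 0.0;
    for (int iter = 0; iter < max_iter; ++iter) {
        if ((fb > 0.0 && fc > 0.0) || (fb < 0.0 && fc < 0.0)) {
            c = a; fc = fa; e = d = b - a;      // re-bracket the root
        }
        if (std::fabs(fc) < std::fabs(fb)) {    // keep b as the best guess
            a = b; b = c; c = a;
            fa = fb; fb = fc; fc = fa;
        }
        const double tol1 = 2.0 * eps * std::fabs(b) + 0.5 * tol;
        const double xm = 0.5 * (c - b);
        if (std::fabs(xm) <= tol1 || fb == 0.0)
            return b;                           // converged
        if (std::fabs(e) >= tol1 && std::fabs(fa) > std::fabs(fb)) {
            const double s = fb / fa;
            double p, q;
            if (a == c) {                       // secant step
                p = 2.0 * xm * s;
                q = 1.0 - s;
            } else {                            // inverse quadratic step
                const double qa = fa / fc, r = fb / fc;
                p = s * (2.0 * xm * qa * (qa - r) - (b - a) * (r - 1.0));
                q = (qa - 1.0) * (r - 1.0) * (s - 1.0);
            }
            if (p > 0.0) q = -q;
            p = std::fabs(p);
            const double min1 = 3.0 * xm * q - std::fabs(tol1 * q);
            const double min2 = std::fabs(e * q);
            if (2.0 * p < std::min(min1, min2)) { e = d; d = p / q; }
            else                                { d = xm; e = d; }  // bisection
        } else {
            d = xm; e = d;                      // bisection
        }
        a = b; fa = fb;
        b += (std::fabs(d) > tol1) ? d : std::copysign(tol1, xm);
        fb = f(b);
    }
    throw std::runtime_error("iteration limit reached");
}
```

For instance, brent_root([](double x) { return x * x - 2.0; }, 0.0, 2.0) returns an approximation of the square root of 2. Throwing on a non-bracketed interval is a design choice of this sketch; the listing instead returns 0.0 and reports the condition to the caller.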
t0N.cc
#include "fast.h"

///
double Fast::t0N( const Circuit& circuit, unsigned int NP, unsigned int NC, double t, const double* NewWidth, int& RetCode )
{
    // compute the time at which the first n-mos starts conducting, using
    // bootstrap
    double A_2_n, B_2_n, C_2_n, D_2_n, Vc, y, Cm1, Cov, Cj, t0_bs;

    Vc = TECH.Ec_n * L_n[ 1 ];
    Cov = TECH.Cgd0_n * ( W_n[ 1 ] + TECH.XW_n );
    Cm1 = Cov;
    Cj = 0.0;
    // ... (lines missing in the listing)
    const char* name = pathlist[ NP ].TransistorName( 0, NC );
    // ...
    {
        // ...
    }
    // ...
    {
        // ...
    }
    Cj += TECH.C_nj * Wjn * TECH.Df * pow ( 1 + Vd_n[ 1 ] / TECH.PB_n, -TECH.mj_n ) + \
          TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * pow ( 1 + Vd_n[ 1 ] / TECH.PB_n, -TECH.mjsw_n );
    // ...
    // evaluate Cgs Cgd @ V node 1 for pmos
    Cj += TECH.C_pj * Wjp * TECH.Df * pow ( 1 + ( VDD - Vd_n[ 1 ] ) / TECH.PB_p, -TECH.mj_p ) + \
          TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * pow ( 1 + ( VDD - Vd_n[ 1 ] ) / TECH.PB_p, -TECH.mjsw_p );
    // ...
    // evaluate other capacitances
    int nc;
    // ...
    // evaluate gate capacitances
    // ...
    C_2_n = 2 * VDD / ( Vc * taui_n[ 1 ] );
    D_2_n = ( Vc - 2 * TECH.Vtn0 ) / Vc;
    y = -2 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( C_2_n * t + D_2_n, 1.5 ) / ( 3 * C_2_n ) + \
        B_2_n * ( t * t ) * 0.5 + t * ( A_2_n * taui_n[ 1 ] - Cov * VDD ) / ( taui_n[ 1 ] ) - \
        ( 6 * A_2_n * C_2_n * t0_bs + 3 * B_2_n * C_2_n * ( t0_bs * t0_bs ) - \
        4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( C_2_n * t0_bs + D_2_n, 1.5 ) ) / ( 6 * C_2_n );
    RetCode = OK;
    return y;
}
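The two Cj updates in t0N.cc share one pattern: a bottom-area term and a sidewall (perimeter) term, each scaled by a power of the reverse bias. A minimal sketch of that pattern as a helper follows. The struct JunctionParams, the function junction_cap and the units in the comments are assumptions made for illustration; only the formula is taken from the listing, where the count n appears as njn or njp and its meaning is not stated.

```cpp
#include <cmath>

// Hypothetical parameter bundle; the fields mirror the TECH members used in
// the listing (C_nj/C_pj, C_np/C_pp, PB_*, mj_*, mjsw_*, Df).
struct JunctionParams {
    double Cj;    // zero-bias bottom capacitance per unit area (assumed F/m^2)
    double Cjsw;  // zero-bias sidewall capacitance per unit length (assumed F/m)
    double PB;    // junction built-in potential [V]
    double mj;    // bottom grading coefficient
    double mjsw;  // sidewall grading coefficient
    double Df;    // diffusion length (assumed m)
};

// Junction capacitance at reverse bias Vr for a diffusion of total width W.
// n plays the role of njn/njp in the listing.
double junction_cap(const JunctionParams& t, double W, unsigned n, double Vr)
{
    const double area = W * t.Df;
    const double perimeter = 2.0 * (W + n * t.Df);
    return t.Cj * area * std::pow(1.0 + Vr / t.PB, -t.mj)
         + t.Cjsw * perimeter * std::pow(1.0 + Vr / t.PB, -t.mjsw);
}
```

With such a helper the n-MOS contribution would be evaluated at Vr = Vd_n[1] and the p-MOS contribution at Vr = VDD - Vd_n[1], as in the listing. At zero bias both power factors equal 1 and the result reduces to the zero-bias area and perimeter terms.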
t0P.cc
#include "fast.h"

///
double Fast::t0P( const Circuit& circuit, unsigned int NP, unsigned int NC, double t, const double* NewWidth, int& RetCode )
{
    double A_2_p, B_2_p, C_2_p, D_2_p, Vc, y, Cm1, Cov, t0_bs, Cj;
    double alpha, theta, Y;

    Vc = TECH.Ec_p * L_p[ 1 ];
    Cov = TECH.Cgd0_p * ( W_p[ 1 ] + TECH.XW_p );
    Cm1 = Cov;
    Cj = 0.0;
    // ... (lines missing in the listing)
    unsigned int nn = pathlist[ NP ].GetNumTranN( NC );
    unsigned int pp = pathlist[ NP ].GetNumTranP( NC );
    // ...
    const char* name = pathlist[ NP ].TransistorName( nn + pp - 1, NC ); // first pmos
    unsigned int VDDNode = circuit.ValimNode();
    if ( circuit[ name ].Source() == VDDNode )
    {
        // ...
    }
    else if ( circuit[ name ].Drain() == VDDNode )
    {
        // ...
    }
    else
    {
        // ...
        return 0.0;
    }
    // evaluate Cgs Cgd @ V node 1 for pmos
    Cj += TECH.C_pj * Wjp * TECH.Df * pow ( 1 + ( VDD - Vd_p[ 1 ] ) / TECH.PB_p, -TECH.mj_p ) + \
          TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * pow ( 1 + ( VDD - Vd_p[ 1 ] ) / TECH.PB_p, -TECH.mjsw_p );
    Cj += TECH.Cgs0_p * ( Wjp + njp * TECH.XW_p );
    // evaluate Cgs Cgd @ V node 1 for nmos
    Cj += TECH.C_nj * Wjn * TECH.Df * pow ( 1 + Vd_p[ 1 ] / TECH.PB_n, -TECH.mj_n ) + \
          TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * pow ( 1 + Vd_p[ 1 ] / TECH.PB_n, -TECH.mjsw_n );
    Cj += TECH.Cgd0_n * ( Wjn + njn * TECH.XW_p );
    // evaluate other capacitances
    int nc;
    // ...
    // evaluate gate capacitances
    // ...
    C_2_p = ( Vc + 2 * TECH.Vtp0 ) / Vc;
    D_2_p = 2 * VDD / ( Vc * taui_p[ 1 ] );
    Y = Cj + Cov;
    alpha = pow( ( D_2_p * t0_bs + C_2_p ), 1.5 );
    theta = pow( ( D_2_p * t + C_2_p ), 1.5 );
    y = ( 2 * Vc * Vc * beta_p[ 1 ] * theta ) / \
        ( 3 * D_2_p ) - \
        B_2_p * t * t * 0.5 +
        t * ( -A_2_p * taui_p[ 1 ] + Cov * VDD ) / \
        ( taui_p[ 1 ] ) + \
        ( 6 * A_2_p * D_2_p * t0_bs + 3 * B_2_p * D_2_p * t0_bs * t0_bs - \
        4 * Vc * Vc * beta_p[ 1 ] * alpha ) / ( 6 * D_2_p );
    RetCode = OK;
    return y;
}
TestOpt.cc
#include "test.h"

///
TestOpt::TestOpt( const CritPathList& pathlist, const Options& options)
    :
    EvaluationAlgorithm( pathlist, options )
{
    print_log( "Creating TestOpt instance..." );
}

///
TestOpt::~TestOpt()
{}

///

int TestOpt::Run( const Circuit& circuit, const double *NewWidth, const unsigned* ValidPath )
{
    Calls++;
    // ... (line missing in the listing)
    {
        // ...
        {
            int RetCode;
            CPDelay[ NP ] = 0.0;
            CPPower[ NP ] = 0.0;
            CPNoise[ NP ] = 0.0;
            Area = 0.0;
            for (unsigned int i = 0; i < circuit.GetNTran(); i++)
            {
                // analytic test functions of the width x, used in place of a
                // circuit simulation
                double x = NewWidth[i];
                double f, g, h, l;
                // f
                f = x * x * x * x * 3.0 / 8000.0;
                f += -x * x * x * 11.0 / 400.0;
                f += x * x * 27.0 / 40.0;
                f += -x * 27.0 / 4.0;
                f += 165.0 / 4.0;
                // l
                l = x * x * x * x * 3.0 / 7700.0;
                l += -x * x * x * 11.0 / 402.0;
                l += x * x * 27.0 / 39.5;
                l += -x * 27.0 / 4.0;
                l += 150.0 / 4.0;
                // g
                g = x * x * x * x * 3.0 / 8000.0;
                g += -x * x * x * 13.0 / 400.0;
                g += x * x * 39.0 / 40.0;
                g += -x * 45.0 / 4.0;
                g += 205.0 / 4.0;
                // h
                h = x * x * x * x * x * x * 5.01264E-8;
                h += -x * x * x * x * x * 1.60540E-5;
                h += x * x * x * x * 0.001948124;
                h += -x * x * x * 0.111669;
                h += x * x * 3.05849;
                h += -x * 34.6888;
                h += 164.782;
                //CPDelay[ NP ] = f;
                if (f > i)
                    CPDelay[ NP ] = f;
                else
                    CPDelay[ NP ] = l;
                CPPower[ NP ] = NewWidth[i] * NewWidth[i] * NewWidth[i];
                Area += NewWidth[i];
            }
            CPNoise[ NP ] = 0.0;
        }
    }
    return OK;
}
