Design and Optimization Techniques of High Speed VLSI Circuit

Design and optimization techniques of high-speed VLSI circuits
Marco Delaurenti
Politecnico di Torino
PhD Dissertation
December 1999
Politecnico di Torino
Advisor: Prof. Maurizio Zamboni
Coordinator: Prof. Ivo Montrosset
Copyright © 1999 Marco Delaurenti
Writing comes more easily if you have something to say.
(Sholem Asch)
"When I use a word," Humpty Dumpty said in rather a scornful tone,
"it means just what I choose it to mean - neither more nor less."
(Lewis Carroll)
Acknowledgments
First of all I would like to thank my advisor, Prof. M. Zamboni, Prof. G.
Piccinini and Prof. G. Masera for their invaluable help, and Prof. P. Civera for
being a bridge toward the real world. Many thanks also to the members of the
VLSI Lab at Politecnico di Torino, Italy: Mario for his input about the
critical paths (no, I do not thank you for the jazz songs that you play all
day long), Luca for the long discussions about books and movies (no, I
haven't seen Kubrick's last movie), Andrea for his very good cocktails
(especially the Negroni) and Danilo, because I forgot him every time
we went to lunch. Thanks also to Max (for giving me the root password),
and to Yuan & Svensson for the invention of the TSPC.
Special thanks, finally, to Mg, for her support and for having tolerated
me until now.
CONTENTS
Preface
Part I   CMOS Logic
1. Introduction to CMOS logic
   1.1 Introduction
   1.2 CMOS logic families
       1.2.1 Static logic families
       1.2.2 Dynamic logic families
   1.3 Conclusion
Part II   Circuit Modeling
2. A simple model
   2.1 The Elmore model
   2.2 Conclusions
3. A complex model
   3.1 The FAST model
       3.1.1 MOS equations
       3.1.2 Internal nodes approximation
       3.1.3 Body effect
   3.2 Delay estimation
       3.2.1 Equation solving
   3.3 Power estimation
       3.3.1 Switching energy
       3.3.2 Short-circuit energy
       3.3.3 Subthreshold energy
   3.4 Results
   3.5 Conclusions
Part III   Optimization
4. Mathematic Optimization
   4.1 Optimization theory
       4.1.1 Mono-objective optimization
             4.1.1.1 Unconstrained problem
             4.1.1.2 Constrained problem
                     Lagrange multiplier and Penalty functions
       4.1.2 Multi-objective optimization
             4.1.2.1 Unconstrained
             4.1.2.2 Constrained
                     Compromise solution
   4.2 Optimization Algorithms
       4.2.1 One-dimensional search techniques
             4.2.1.1 The section search
                     Dicotomic search
                     Fibonacci Search
                     The golden section search
                     Convergence considerations
             4.2.1.2 Parabolic interpolation
                     Brent's rule
       4.2.2 Multi-dimensional search
             4.2.2.1 The gradient direction: steepest (maximum) descent
             4.2.2.2 The optimal gradient
                     Convergence considerations
       4.2.3 The conjugate direction method
             4.2.3.1 The Fletcher-Reeves conjugate gradient algorithm
             4.2.3.2 The Powell conjugate gradient algorithm
       4.2.4 The SLOP algorithm
       4.2.5 The simulated-annealing algorithm
   4.3 Conclusions
5. Circuit Optimization
   5.1 Optimization targets
       5.1.1 Circuit delay
                     Critical Paths
             5.1.1.1 Delay formula obtained by the Elmore model
             5.1.1.2 Delay measurement obtained by the FAST model and by HSPICE
       5.1.2 Power consumption
       5.1.3 Area
   5.2 Optimization examples
       5.2.1 Algorithm choice
       5.2.2 Mono-objective optimizations
             5.2.2.1 Area
             5.2.2.2 Power
             5.2.2.3 Delay
       5.2.3 Multi-objective optimizations
   5.3 Conclusion
6. A CAD tool for optimization
   6.1 Logical description
       6.1.1 The optimization algorithm module (OAM)
       6.1.2 The function evaluation module (FEM)
       6.1.3 Core engine
   6.2 Code implementation
       6.2.1 The classes CircuitNetlist and Circuit
       6.2.2 The class EvaluationAlgorithm
       6.2.3 The class OptimizationAlgorithm
       6.2.4 The critical path retrieving
       6.2.5 The derived classes
   6.3 Program flows
   6.4 Conclusions
7. Results and conclusions
   7.1 Optimization
       7.1.1 Mono-objective vs. Multi-objective
   7.2 Conclusions
   7.3 Future works
Appendix
A. Class graph
B. Source code
   B.1 Main functions
   B.2 Optimization algorithms
   B.3 Simulators
LIST OF FIGURES
1.1   Static and
1.2   Pass-transistor logic xor
1.3   Domino typical gate
1.4   CVSL typical gate
1.5   C²MOS typical gate
1.6   TSPC Latches
2.1   RC MOS equivalence
2.2   RC chain
2.3   RC single cell
2.4   Elmore impulse response
3.1   Inverter voltages waveform
3.2   Mos chain
3.3   Node voltages
3.4   Voltages wave form in the nMOS chain
3.5   Voltages wave forms in the pMOS chain
3.6   VDS and VGS
3.7   MOSFET chain with static voltages
3.8   Threshold variation
3.9   Delay comparison
3.10  Energy comparison
4.1   Section search
4.2   Minimization by Powell algorithm
4.3   Minimization by Powell algorithm
4.4   Minimization by SLOP algorithm
4.5   Minimization by Simulated-annealing algorithm
4.6   Minimization by Simulated-annealing algorithm
5.1   Design flow
5.2   Delay definition
5.3   Critical paths
5.4   Critical path tree
5.5   Elmore delay
5.6   Elmore delay
5.7   HSPICE delay
5.8   FAST delay
5.9   HSPICE Energy
5.10  CMOS Inverter
5.11  TSPC Latches
5.12  TSPC And gates
5.13  TSPC Or gates
5.14  Static and-or gate
5.15  Static parity gate
5.16  Static full-adder
5.17  TSPC full-adder (one-stage)
6.1   Tool block diagram
7.1   Comparison of 0.7 µm and 0.25 µm gates @ minimum technology width
7.2   Delay optimization of 0.7 µm gates
7.3   Delay optimization of 0.25 µm gates
7.4   Technology comparison of delay optimization
7.5   Several delay-power optimization policies of 0.7 µm gates
7.6   Energy-dissipation variation (zoom of figure 7.5(b))
7.7   Several delay-power optimization policies of 0.25 µm gates
7.8   Energy-dissipation variation (zoom of figure 7.7(b))
7.9   Delay-power optimization (50%-50%) comparison of 0.7 µm and 0.25 µm gates
7.10  Delay and power trajectory during 4 different multi-objective optimizations for the and-or gate
7.11  Delay and power trajectory during 4 different multi-objective optimizations for the parity gate
7.12  Delay and power trajectory during 4 different multi-objective optimizations for the static full-adder
7.13  Delay and power trajectory during 4 different multi-objective optimizations for the dynamic full-adder
LIST OF TABLES
3.1   Mean Error
3.2   Execution time
4.1   Optimization algorithms
5.1   Basic gates: complexity
5.2   Basic gates: pre-optimization delay, power consumption and area
5.3   Full-adder: delay optimization
5.4   Agreements of targets
5.5   Full-adder: delay and power optimization
5.6   Full-adder: optimizations comparison
7.1   Library gates list
7.2   Delay and energy dissipation @ minimum width (HSPICE)
7.3   Delay decreasing and energy increasing (both relative) in a delay optimization
7.4   Elapsed time and total number of function evaluations for a full-delay optimization with HSPICE on a ULTRA-sparc 5
7.5   Constrained delay optimization of a few 0.25 µm gates
7.6   Delay worsening and energy improvement between a full delay optimization and delay-power optimization
Preface
The design of high-speed integrated circuits is a long and complex operation;
nonetheless, the total time-to-market required to go from the idea to the
silicon masks keeps shrinking.
To help the designer along this long and winding road, several CAD tools
are available. In the first step, the only thing that exists is the description
of the circuit behaviour (the idea); in the central step of the design flow, the
designer knows only the logic function of each block composing the circuit,
but not the technological realization of these blocks; in the last steps,
finally, the designer knows exactly the technological implementation of every
single gate of the circuit, and can compose the final layout from these gates.
Ça va sans dire, CAD tools are nowadays of vital importance in the design
flow, and the quality of such tools strongly influences the quality of the
final design.
Among all the possible instruments, optimization tools have a primary role
in all the phases of a project, starting from optimization at the higher
levels and descending to optimization at the electrical level.
This thesis focuses on developing new strategies and new techniques for
optimization at the transistor-dimension level, that is, the optimization done
by the cell library engineer, and on developing a CAD instrument that makes
this work as painless as possible.
Part I
CMOS LOGIC
Chapter 1
INTRODUCTION TO CMOS LOGIC

The optimization of VLSI circuits involves the optimization of the single
CMOS cell. This chapter briefly reviews the basic CMOS logic families, with
their pros and cons. The goal is simply to pick, among the static and dynamic
logic families, those most appealing for use in VLSI circuits and, in some
measure, those most widely used in practice, and then to apply to them the
optimization techniques shown in the next chapters.
1.1 Introduction
We might ask: why optimize a single cell in a VLSI circuit, when design
nowadays is shifting toward higher and higher levels?
Some answers could be:
- Need for re-usable library cells. This makes it easier to reuse the same
  library for different projects. It is a must nowadays, in order to reduce
  the total time to target/market.
- An optimized library makes design at the higher levels easier: floorplanning
  and routing can have relaxed constraints, since the gates behave better. It
  is possible to reduce the time spent repeating critical steps such as
  floorplanning or routing until all the specifications are met: the
  specifications are met earlier, since the cells globally behave better.
- Need for equivalent libraries with different kinds of optimization. It is
  possible to have libraries that have different specifications but are
  functionally equivalent, so that different versions of a project can be
  created simply by substituting the basic library. It would be possible, for
  example, to have, for the same project, a version that runs at full speed
  and a version optimized for low power dissipation.
  This swapping of libraries does not involve the higher levels of design: it
  is totally transparent to the designer during floorplanning or routing. Just
  before layout production, during cell mapping, it is possible to choose the
  library onto which the project is mapped.
These answers led us to consider producing a tool able to perform the
optimization of a cell library in a way convenient for the designer. The goal
is to produce results showing that this optimization is worthwhile within a
design cycle, and also to make the insertion of the tool into a design cycle
as smooth as possible.
In order to attain results that are relevant to a real production cycle, we
have to choose cells that are likely to be present in a real library.
For this purpose we introduce a very brief description of the most used
CMOS logic families, and among them we choose the cells used to develop and
test the optimization framework.
1.2 CMOS logic families
The first basic distinction among the CMOS logic families is between static
logic and dynamic logic ([1]).
Static logic: A static logic is one in which the functioning of the circuit is
not synchronized by a global signal, namely the clock of the circuit. The
output is solely a function of the inputs of the circuit, and it is
asynchronous with respect to them. The timing of the circuit is defined
exclusively by its internal delay.
Dynamic logic: A dynamic logic is one in which the output is synchronized by a
global signal, viz. the clock. The output is then a function both of the
inputs of the circuit and of the clock signal, and the timing of the circuit
is defined both by its internal delay and by the timing of the clock.
Both static and dynamic logic comprise several logic families.
1.2.1 Static logic families
The principal static families are:

Conventional static logic: It is the logic normally referred to when speaking
of static logic. A static circuit has the same number of NMOS and PMOS
transistors, and the n and p branches are the dual of each other. As an
example see figure 1.1, which represents a static and gate. It has two NMOS
transistors connected in series and two PMOS transistors connected in
parallel.
[Fig. 1.1: Static and (inputs A, B; OUT = A and B)]
The static logic is quite fast, does not dissipate power in steady state
and has a very good noise margin.
Pseudo-NMOS: It is an evolution of the now obsolete NMOS logic. It is obtained
by substituting the whole PMOS branch of a static logic gate with a single
PMOS transistor whose gate is connected to ground. This PMOS is therefore
always conducting and pulls the output node to the high state. When the NMOS
branch also conducts, the output discharges, provided that the ratio between
the NMOS and PMOS transistors is well designed.
This logic is cited here only for historical reasons, since it is not very
fast, it dissipates static power in steady state (when the output is in the
low state) and it is sensitive to noise.
Pass-logic: Pass-logic is a relatively new logic and, for many digital
designs, implementation in pass-transistor logic (PTL) has been shown to be
superior to static CMOS in terms of area, timing, and power characteristics.
As an example see figure 1.2.
[Fig. 1.2: Pass-transistor logic xor (OUT = A xor B)]
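The duality of the n and p branches in a conventional static gate can be
checked with a small switch-level sketch. The Python below is only an
illustration under stated assumptions: the function names are hypothetical,
transistors are treated as ideal switches, and an output inverter is assumed
after the series-NMOS / parallel-PMOS stage, because that stage alone pulls
its node low when both inputs are high. For every input combination exactly
one of the two networks conducts, which is why such a gate draws no static
current in steady state.

```python
from itertools import product


def pull_down_conducts(a: bool, b: bool) -> bool:
    """Two NMOS in series connect the node to ground only if both gates are high."""
    return a and b


def pull_up_conducts(a: bool, b: bool) -> bool:
    """Two PMOS in parallel connect the node to VDD if at least one gate is low."""
    return (not a) or (not b)


def internal_node(a: bool, b: bool) -> bool:
    """Logic level of the node driven by the two dual networks."""
    up = pull_up_conducts(a, b)
    down = pull_down_conducts(a, b)
    if up == down:
        # Both on would be a short circuit, both off a floating node.
        raise RuntimeError("pull-up and pull-down networks are not dual")
    return up


def static_and(a: bool, b: bool) -> bool:
    """Assumed (hypothetical) output inverter after the dual networks."""
    return not internal_node(a, b)


if __name__ == "__main__":
    for a, b in product([False, True], repeat=2):
        print(a, b, static_and(a, b))
```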
1.2.2 Dynamic logic families

The principal dynamic families have one characteristic in common: every
dynamic logic needs a pre-charge (or pre-discharge) transistor to bring some
pre-charged nodes to a known state. This is done during the working phase
known as the pre-charge phase or memory phase; during the other working
phase, the evaluation phase, the output has a stable value¹.
¹ This brief introduction is limited to systems that have a single global
clock, or one phase, where the word phase is used as a synonym of clock, and
not, as above, as a synonym of working period. There are systems that have
two, or even four, phases, but they are not introduced here. The basic
functioning, however, remains the same.
The principal dynamic logics are further divided into two sub-families,
pipelined and non-pipelined. The first two of the following are non-pipelined,
while the others are pipelined:
Domino logic and N-P Domino logic: The typical domino gate is depicted
in figure 1.3.
[Fig. 1.3: Domino typical gate (INPUTs, CLOCK, NMOS block, OUT)]
During the pre-charge phase the clock is in its low state, so that the
pre-charged node before the static inverter is high and the output is
low. During the evaluation phase the clock is high, so that the inputs
of the n-block (which can perform any logic function) can discharge
the pre-charged node and bring the output to the high state.
We can cascade several of these gates, given that each gate has its
own output inverter, and we can drive every gate with the same clock
signal, provided that the evaluation phase lasts long enough for all
the gates to finish evaluating their inputs. This last fact explains why
this is a non-pipelined logic: the output of every cell is available as
soon as the cell has finished its evaluation phase.
Moreover this logic has a limited area occupancy, since it has a low
number of PMOS transistors. On the other hand, it is not possible to
implement inverting structures and, like all the other dynamic logics,
this logic is subject to the charge-sharing problem².
² The charge-sharing problem, or charge redistribution, is a problem that
affects the dynamic logics. Basically, the charge stored in a precharged node
during the memory phase does not remain fully stored in it. Consider a domino
gate during the pre-charge phase, when the clock is low. If one input of the
n-block is high, then its corresponding transistor is conducting. The n-branch
is still not conducting, since the clocked NMOS transistor is off, but some
charge can flow from the precharged node to other nodes via the conducting
transistors in the n-block. This redistribution of charge is simply a
capacitive charge partition, and it leaves the precharged node at a level
lower than the high state.
This problem can produce logic errors, and certainly diminishes the noise
margins of the cell.
A natural evolution of the domino logic is the N-P domino logic, or
zipper logic. It consists of two typical cells: the one depicted in
figure 1.3, and the dual one, obtained from it simply by swapping the
n-block with a p-block and the PMOS pre-charge transistor with an NMOS
pre-discharge transistor, driven by the negated clock.
This logic has a lower area occupancy, since there is no need for a static
inverter, but also a lower speed, due to the presence of PMOS
transistors.
Cascode voltage switch logic (CVSL): The CVSL is part of the large family
of differential logics. It needs both the inputs and the negated inputs,
and two complementary n-blocks that perform the logic function, as can
be seen in figure 1.4.
[Fig. 1.4: CVSL typical gate (INPUTs and negated INPUTs; OUT and negated OUT)]
It has the advantage of being quite fast, since the positive feedback of
the two PMOS transistors accelerates the switching of the gate, and it also
has very good noise margins. Moreover it produces both the outputs and the
negated outputs without needing an inverter. As a drawback, it has
a large area occupancy.
C²MOS logic: The typical C²MOS gate is shown in figure 1.5. It is basically
a three-state gate, since when the clock is in the low state the output
is floating in the high-impedance state.
[Fig. 1.5: C²MOS typical gate (INPUTs, PMOS block, CLOCK, NMOS block, OUT)]
It is principally used as a dynamic latch, as an interface between static
logics and dynamic pipelined logics.
NO RAce logic (NORA): The NORA logic, an acronym of NO RAce, is an evolution
of the N-P domino logic. The static inverter of the domino logic is replaced
by a C²MOS inverter. This is the first of the pipelined logics, since the
output of every gate is available only when the clock switches its state, and
not before.
Since the output stage of every cell is also dynamic (a C²MOS inverter), this
logic is more subject to the charge-sharing problem than the domino logic is.
True Single Phase Clock logic (TSPC): The final evolution of the NORA is
the TSPC logic, or true single phase clock logic ([2]).
The TSPC logic is an n-p logic, since each gate exists in an n-version
and a p-version. For example, the n-latch and the p-latch are shown
in figure 1.6.
[Fig. 1.6: TSPC Latches: (a) type n, (b) type p (signals CLK, A, OUT)]
The ultimate advantage of the TSPC logic is that it needs a single clock:
because of its internal structure, the negated clock is not necessary.
The TSPC logic is among the fastest dynamic families, and it is certainly
very appealing for the very low number of transistors it employs.
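The pre-charge / evaluation behaviour of the domino gate and the charge
redistribution described in footnote 2 can be sketched as follows. This is a
behavioural illustration, not a circuit simulation: the class and function
names are hypothetical, the n-block is passed in as a Boolean function, and
the charge-sharing estimate assumes an ideal capacitive partition with the
internal nodes initially discharged.

```python
from typing import Callable, Sequence


class DominoGate:
    """Behavioural model of a domino gate: dynamic node plus static inverter."""

    def __init__(self, n_block: Callable[..., bool]) -> None:
        self.n_block = n_block   # True when the n-block conducts
        self.node = True         # pre-charged node, high after pre-charge

    def step(self, clock: bool, inputs: Sequence[bool]) -> bool:
        if not clock:
            # Pre-charge phase: the node is pulled high, the output goes low.
            self.node = True
        elif self.n_block(*inputs):
            # Evaluation phase: a conducting n-block discharges the node.
            self.node = False
        # Otherwise the node keeps its stored charge (dynamic behaviour).
        return not self.node     # output of the static inverter


def charge_sharing_voltage(vdd: float, c_node: float, c_internal: float) -> float:
    """Voltage left on the pre-charged node after an ideal charge partition.

    Assumption: the internal capacitance starts at 0 V and no charge is
    supplied or lost while the charge redistributes.
    """
    return vdd * c_node / (c_node + c_internal)


if __name__ == "__main__":
    gate = DominoGate(lambda a, b: a and b)
    print(gate.step(False, (True, True)))   # pre-charge
    print(gate.step(True, (True, True)))    # evaluation
    print(charge_sharing_voltage(2.5, 10e-15, 5e-15))
```

In this sketch the output can only rise during evaluation and stays high until
the next pre-charge, which is the property that lets several domino gates
share the same clock.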
1.3 Conclusion
After this very brief introduction to several CMOS families, we chose
two different logics to which to apply the optimization techniques that are
the object of this thesis. The criteria that drove us in choosing these
families were both their diffusion in VLSI circuits and the presence of very
good qualities, perhaps not yet fully exploited in the real production of
circuits.
For these reasons we have chosen to include in our library a few static
gates (an and gate, an or gate, and a few more) and a few dynamic
gates, in particular gates from the TSPC family. This family has shown
good characteristics in terms of speed, area occupancy and power dissipation;
it also has the very important feature of needing only a single clock.
The complete list of the gates comprising the library can be found in
table 7.1 (page 122), together with the schematic diagrams of their CMOS
implementation.
Part II
CIRCUIT MODELING
Chapter 2
A SIMPLE MODEL
The first model applied to the calculation of the delay in MOS circuits is
Elmore's model ([3]). It is a simple RC delay model, and it is the basis of a
switch-level MOS model (figure 2.1): the generic MOS transistor is
represented, during the ON state, by its dynamic resistance between the drain
and source pins, and by the parasitic capacitances and resistances at the
drain and source pins.
[Fig. 2.1: RC MOS equivalence (terminals D, S; elements R0, Rg, Rd, RL, CG, CD, CS, CL)]
If this simple MOS model is valid, then Elmore's delay formula can be used in
every structure containing MOS transistors. Elmore's formula is appealing for
its simplicity and ease of use; however, its accuracy can worsen in the deep
submicron domain, since modeling a MOS transistor through its resistance is
no longer valid.
Since the use of Elmore's model is mostly limited to comparisons with other
models, or to introductions to delay modelling, section 2.1 presents only the
very basics of Elmore's model and section 2.2 draws the conclusions about the
use of this model for VLSI circuits.
2.1 The Elmore model

Elmore's model, or Elmore's delay formula, can predict the delay of an RC
chain such as the one shown in figure 2.2.
[Fig. 2.2: RC chain (resistances R_i, capacitances C_i, node voltages V_i)]
In order to obtain the formula, let us start with a single RC cell, as shown
in figure 2.3.
[Fig. 2.3: RC single cell (input V0, resistance R0, output V1, capacitance C0)]
We can express the voltage $V_1(t)$ by means of a differential equation such
as:
\[ C_0 \frac{dV_1}{dt} = -\frac{V_1(t) - V_0(t)}{R_0} \qquad (2.1) \]
Integrating equation (2.1), we can write
\[ V_1 = V_0(t) \left( 1 - e^{-t/(R_0 C_0)} \right) \]
The time constant is $\tau = R_0 C_0$, and with $t = \tau$ we obtain:
\[ V_1 = 0.63\, V_0(t). \]
So the time $t_D = \tau$ represents the 63% delay from $V_0(t)$ to $V_1(t)$.
Extending the formula of the time constant to the chain of figure 2.2, with
$N$ cells numbered from 0, we obtain:
\[ t_D = \sum_{i=0}^{N-1} \left( \sum_{j=0}^{i} R_j \right) C_i . \]
This delay is the input-output delay. When the delay between the input and one
of the inner nodes is needed, a more complex (semi-empirical) formula can be
used; for example, with $N = 2$:
\[ t_1 = R_0 C_0 + q R_1 C_1 \qquad \text{(delay from the input node to the first node)} \]
\[ t_2 = R_0 C_0 + (R_0 + R_1) C_1 \qquad \text{(delay from the input node to the output node)} \]
where $q$ is:
\[ q = \begin{cases} \dfrac{R_0}{R_0 + R_1} & \text{if } R_1 \le 2R_0, \\[1ex] \dfrac{R_0 C_0}{R_0 C_0 + R_1 C_1} & \text{if } R_1 > 2R_0. \end{cases} \]
The first case (with $R_1 \le 2R_0$) is named strong coupling, while the second
one is named weak coupling.
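As a minimal sketch, the chain formula and the two-cell semi-empirical formula
above can be written in Python. The function names are hypothetical,
resistances and capacitances are assumed to be listed from the input towards
the output (cell 0 first), and the strong-coupling condition is taken as
R1 <= 2 R0, the complement of the weak-coupling condition R1 > 2 R0.

```python
from typing import Sequence, Tuple


def elmore_delay(r: Sequence[float], c: Sequence[float]) -> float:
    """Input-output Elmore delay of an RC chain: sum_i C_i * (R_0 + ... + R_i)."""
    if len(r) != len(c):
        raise ValueError("r and c must have the same length")
    delay = 0.0
    upstream_resistance = 0.0
    for r_i, c_i in zip(r, c):
        upstream_resistance += r_i      # R_0 + ... + R_i
        delay += upstream_resistance * c_i
    return delay


def coupling_factor(r0: float, c0: float, r1: float, c1: float) -> float:
    """Semi-empirical factor q for a two-cell chain."""
    if r1 <= 2.0 * r0:                  # strong coupling (assumed to include equality)
        return r0 / (r0 + r1)
    return r0 * c0 / (r0 * c0 + r1 * c1)  # weak coupling


def two_cell_delays(r0: float, c0: float, r1: float, c1: float) -> Tuple[float, float]:
    """Return (t1, t2): delay to the first node and to the output node."""
    q = coupling_factor(r0, c0, r1, c1)
    t1 = r0 * c0 + q * r1 * c1
    t2 = r0 * c0 + (r0 + r1) * c1
    return t1, t2


if __name__ == "__main__":
    print(elmore_delay([1e3, 2e3], [1e-12, 1e-12]))
    print(two_cell_delays(1e3, 1e-12, 2e3, 1e-12))
```

For a single cell `elmore_delay` returns the time constant R0 C0, and for two
cells it returns the same value as t2.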
Given the unit impulse response $h(t)$ (figure 2.4) of the output node of
the RC tree, Elmore proposed to approximate the delay by the mean of
$h(t)$, considering $h(t)$ as a distribution.
[Fig. 2.4: Elmore impulse response (h(t) versus t, with the mean m marked)]
The 50% delay $t_{50\%}$ is given by:
\[ \int_0^{t_{50\%}} h(t)\,dt = 0.5 \]
while the original work of Elmore proposed:
\[ t_D = m = \int_0^{\infty} t\, h(t)\,dt \qquad \text{with} \qquad \int_0^{\infty} h(t)\,dt = 1. \]
This approximation is exact only when h(t) is a symmetrical distribution, as
in figure 2.4, while in real cases the h(t) distribution is asymmetrical;
however, it is proved in [4] that the Elmore approximation is an upper bound
for the 50% delay even when the impulse response is not symmetrical and,
furthermore, that the real delay asymptotically approaches the Elmore bound as
the input signal rise (or fall) time increases.
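The bound can be illustrated on the simplest case, a single RC cell, whose
impulse response is h(t) = exp(-t/τ)/τ with τ = R0 C0 (a standard result,
not derived in the text). The sketch below, with hypothetical function names,
integrates t h(t) numerically to obtain the Elmore delay and compares it with
the closed-form 50% point τ ln 2; the asymmetry of h(t) makes the mean larger
than the median.

```python
import math


def impulse_response(t: float, tau: float) -> float:
    """Unit impulse response of a single RC cell with time constant tau."""
    return math.exp(-t / tau) / tau


def elmore_mean(tau: float, span: float = 40.0, steps: int = 20_000) -> float:
    """Mean of h(t), i.e. the Elmore delay, by the trapezoidal rule on [0, span*tau]."""
    dt = span * tau / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * dt
        weight = 0.5 if k in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * t * impulse_response(t, tau) * dt
    return total


def median_delay(tau: float) -> float:
    """50% delay: solution of 1 - exp(-t/tau) = 0.5."""
    return tau * math.log(2.0)


if __name__ == "__main__":
    tau = 1e-9
    print("Elmore delay (mean):", elmore_mean(tau))
    print("50% delay (median): ", median_delay(tau))
```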
2.2 Conclusions
The model shown in this chapter is quite appealing for the calculation of
the delay in CMOS structures, but it becomes inaccurate as we move into the
submicron domain, so its use should be limited to a first validation of an
optimization algorithm, and not extended to real production.
In this regard, it is important to note that the delay functions obtained from
Elmore's formula satisfy some properties that are useful in the optimization
realm (for example equation (4.1), page 50): the Elmore model is therefore
very useful for testing optimization algorithms.
Chapter 3
A COMPLEX MODEL
The target of the model developed here is to offer limited estimation
errors with respect to physical SPICE simulations and to improve the
computation speed by more than one order of magnitude. This could be
useful in optimization algorithms.
Thus the aim of the model is to evaluate the delay and power dissipation
of CMOS structures.
Several approaches have been used to evaluate the delays of CMOS
structures: some models are derived from SPICE simulations by means of
look-up tables [5]; some are analytical [6], while others approximate the
evaluation of the delay with step or ramp inputs [7, 8, 9, 10, 11].
Regarding power consumption, the main contributions are: switching power,
short-circuit current and subthreshold conduction. The first occurs during
the charge and discharge of internal capacitances; short-circuit current
originates from the simultaneous conduction of the p and n networks and is
dominated by the slope of the node voltages; subthreshold currents are due to
the weak-inversion conduction of MOSFETs and become relevant when the power
supply is scaled in sub-micron technologies.
Most of the proposed power models use estimation algorithms that are not
compatible with delay analysis. The purpose of the FAST model is to combine
delay and power evaluations in the same estimation procedure, allowing the
simultaneous optimization of delay and power.
Section 3.1 reports the theory behind the FAST model; in particular, 3.1.1
shows the MOS equations used in the model, 3.1.2 shows the approximation of
the internal node voltages made by the model and 3.1.3 explains how the
threshold voltage variations are taken into account. Section 3.2 shows how
the FAST model estimates the delay, and in particular 3.2.1 shows how the
equations are solved; section 3.3 reports the method used for the calculation
of the power consumption, and in particular 3.3.1 accounts for the switching
power, 3.3.2 for the short-circuit power, and 3.3.3 for the subthreshold
power.
Finally, section 3.4 presents some results from the comparison of the model
with HSPICE and section 3.5 draws some conclusions.
3.1 The FAST model

The classical charge control analysis and the gradual-channel approximation (Hodges model), described in section 3.1.1, were chosen for their low complexity and for the accuracy that can be obtained by taking into account carrier velocity saturation, a phenomenon that is dominant in submicron technologies.
Estimation accuracy and low computational effort can be achieved by operating both on the waveforms of the internal signals and on topology considerations: in particular, all the waveforms in the circuit are approximated with linear ramps.
Approximating the input waveform with a ramp strongly simplifies the I(V) equations. Figure 3.1 shows the output voltage of an inverter driven by a ramp input. A ramp can properly approximate the output voltage variation, especially in the central phase of the commutation. The increasing error on the tail of the switching does not significantly affect the delay and power estimation.
The voltage ramp approximation is described in section 3.1.2.
Fig. 3.1: Inverter voltage waveforms (Vin, Vout and the ramp model versus time, in ns)
3.1.1 MOS equations
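The well-known equations for the MOS transistors are (for the n-type and p-type transistors) [1]:
below saturation
\[ I_{DS_{n,p}} = \beta_{n,p}\left[(V_{GS} - V_{T_{n,p}})\,V_{DS} - \frac{V_{DS}^2}{2}\right] \tag{3.1} \]
above saturation
\[ I_{DS_{n,p}} = \frac{\beta_{n,p}}{2}\,V_{DSsat_{n,p}}^2 \tag{3.2} \]
where $\beta_{n,p} = \mu_{n,p} C_{ox} W / L$, with $\mu_{n,p}$ modified by the carrier velocity saturation effect:
\[ \mu_n = \frac{\mu_{n0}}{1 + \frac{V_{DS}}{L E_c}}, \qquad \mu_p = \frac{\mu_{p0}}{1 - \frac{V_{DS}}{L E_c}} \]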
The drain-source saturation voltage, not including the carrier velocity saturation effect, is given by the well-known formula:
\[ V_{DSsat_{n,p}} = V_{GS_{n,p}} - V_{T_{n,p}} \]
while, taking that effect into account:
\[ V_{DSsat_{n,p}} = \pm V_c\left[\sqrt{1 \pm \frac{2\,(V_{GS_{n,p}} - V_{T_{n,p}})}{V_c}} - 1\right] \tag{3.3} \]
where the plus signs are for the nMOSFETs and the minus signs are for the pMOSFETs, and $V_c = |E_c L|$.
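As an illustration of equation (3.3), the following minimal sketch compares the nMOS saturation voltage with and without velocity saturation. The function names and the numeric values (3.3 V gate drive, 0.8 V threshold, Vc = 2 V) are hypothetical and are not taken from the technology used in this work.

```python
import math

def vdsat_long_channel(vgs, vt):
    """Saturation voltage without velocity saturation: Vgs - Vt."""
    return vgs - vt

def vdsat_velocity_sat(vgs, vt, vc):
    """nMOS form of eq. (3.3): Vc * (sqrt(1 + 2*(Vgs - Vt)/Vc) - 1)."""
    return vc * (math.sqrt(1.0 + 2.0 * (vgs - vt) / vc) - 1.0)

# Hypothetical operating point, for illustration only.
VGS, VT, VC = 3.3, 0.8, 2.0
v_simple = vdsat_long_channel(VGS, VT)
v_velsat = vdsat_velocity_sat(VGS, VT, VC)
# For a very large Vc the two expressions should coincide.
v_limit = vdsat_velocity_sat(VGS, VT, 1.0e6)
```

With these assumed values the velocity-saturated device leaves the linear region at a lower drain-source voltage than the simple formula predicts, and the two expressions agree in the limit of large Vc.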
3.1.2 Internal nodes approximation
Fig. 3.2: MOS chain with proper numbering

Let N be the number of nMOSFETs in the n-chain and P the number of pMOSFETs in the p-chain, and let us label the transistors in each chain from 1 to N or from 1 to P (figure 3.2). Let us assume that label 1 belongs to the driving transistor (i.e. the nMOSFET with its source connected to VSS, or the pMOSFET with its source connected to VDD), as in figure 3.2. This hypothesis is made only to develop the discussion; in our model any (but only one) transistor can be the driving transistor, that is, a transistor with a changing gate voltage.
Notation 3.1. In the following equations the superscript index refers to the node number (with the variable i always used for the nMOSFETs and j always for the pMOSFETs), and the small-letter subscript indexes n and p refer, respectively, to nMOSFETs and pMOSFETs, both for the voltage variables and for the time variables; for the voltage variables the capital subscript indexes G and D refer to the gate node and the drain node, while the small-letter index d refers to the initial conditions of the drain nodes.
So, for example, $V_{Gn}^{i}(t)$ is the gate voltage at node i for the nMOSFETs (a function of time), and $V_{dp}^{j}$ is the initial condition of the drain voltage at node j for the pMOSFETs.
The voltage waveforms are shown in figure 3.4 and figure 3.5, under the hypothesis $t_{0n}^{1} = t_{0n}^{2} = \dots = t_{0n}^{N}$ and $t_{0p}^{1} = t_{0p}^{2} = \dots = t_{0p}^{P}$; that is, we suppose that all the MOSFETs in a chain start conducting at the same time (see footnote 1).
We can write, referring to figures 3.4 and 3.5:
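\[ V_{Gn}^{1}(t) = \begin{cases} 0 & t < 0 \\ \dfrac{V_{DD}}{\tau_{in}^{1}}\,t & 0 \le t < \tau_{in}^{1} \\ V_{DD} & \tau_{in}^{1} \le t \end{cases} \tag{3.4a} \]
\[ V_{Gn}^{i}(t) = V_{DD}, \qquad i = 2, 3, \dots, N \tag{3.4b} \]
\[ V_{Gp}^{1}(t) = \begin{cases} V_{DD} & t < 0 \\ V_{DD}\left(1 - \dfrac{t}{\tau_{ip}^{1}}\right) & 0 \le t < \tau_{ip}^{1} \\ 0 & \tau_{ip}^{1} \le t \end{cases} \tag{3.4c} \]
\[ V_{Gp}^{j}(t) = V_{SS}, \qquad j = 2, 3, \dots, P \tag{3.4d} \]
\[ V_{Dn}^{i}(t) = \begin{cases} V_{dn}^{i} & t < t_{0n}^{i} \\ V_{dn}^{i} - \left(V_{dn}^{i} - V_{SS}\right)\dfrac{t - t_{0n}^{i}}{\tau_{on}^{i} - t_{0n}^{i}} & t_{0n}^{i} \le t < \tau_{on}^{i} \\ V_{SS} & \tau_{on}^{i} \le t \end{cases} \qquad i = 1, 2, \dots, N \tag{3.4e} \]
\[ V_{Dp}^{j}(t) = \begin{cases} V_{dp}^{j} & t < t_{0p}^{j} \\ \dfrac{V_{DD} - V_{dp}^{j}}{\tau_{op}^{j} - t_{0p}^{j}}\,t + \dfrac{\tau_{op}^{j} V_{dp}^{j} - t_{0p}^{j} V_{DD}}{\tau_{op}^{j} - t_{0p}^{j}} & t_{0p}^{j} \le t < \tau_{op}^{j} \\ V_{DD} & \tau_{op}^{j} \le t \end{cases} \qquad j = 1, 2, \dots, P \tag{3.4f} \]
Footnote 1: This hypothesis is well supported by simulations.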
Fig. 3.3: The i-th and (i+1)-th MOSFETs with node voltages
It is also possible to define $\tau_{i\,n,p}^{i} = \tau_{o\,n,p}^{i-1}$ and the source voltage $V_{d}^{i} = V_{s}^{i+1}$, as shown in figure 3.3 for the i-th nMOS. The same holds for the pMOSFETs.
The starting levels $V_{d\,n,p}$ are determined with a static analysis, described in section 3.1.3.
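The ramp approximation of the node voltages can be sketched in a few lines of code. The two functions below are only an illustration of the piecewise-linear waveforms of equations (3.4a) and (3.4e); the function names are hypothetical, VSS is taken as 0 V, and the input is assumed to sit at 0 V before t = 0.

```python
def gate_ramp(t, tau_i, vdd):
    """Rising gate ramp of the driving nMOSFET (eq. 3.4a, assumed 0 V for t < 0)."""
    if t < 0.0:
        return 0.0
    if t < tau_i:
        return vdd * t / tau_i
    return vdd

def drain_ramp(t, v_d, t0, tau_o):
    """Falling drain ramp of an nMOS node (eq. 3.4e) with V_SS = 0 V.

    v_d   : initial (static) drain voltage
    t0    : time at which the node starts to discharge
    tau_o : time at which the node reaches V_SS
    """
    if t < t0:
        return v_d
    if t < tau_o:
        return v_d * (tau_o - t) / (tau_o - t0)
    return 0.0
```

For example, `drain_ramp(0.3, 2.5, 0.1, 0.5)` evaluates the ramp halfway between `t0` and `tau_o` and therefore returns half of the initial voltage.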
3.1.3 Body effect: threshold variation and its approximation

It is known that a MOS transistor whose source-body voltage is different from zero has its threshold voltage modified by the body effect; that is, if $V_{sb} \neq 0$, with $V_{sb}$ the source-body voltage (recall that for an nMOSFET $V_b = V_{SS}$ and for a pMOSFET $V_b = V_{DD}$), then $|V_{Th}|_{V_{sb} \neq 0} > |V_{Th}|_{V_{sb} = 0}$. The initial conditions of the chain nodes are set by the initial condition on the output. So if the output node is discharging, then one (and only one) nMOSFET is switching from off to on. This means that all the other MOSFETs are already on and, while the starting voltage of the output node is $V_{DD}$, all the internal nodes have $V_{DD} - V_{Tn}$ as their starting voltage.
With the notation of the previous paragraphs, the N-th (topmost) nMOS transistor has $V_{sn}^{N} = V_{DD} - V_{Tn}$, with $V_s$ the source potential and $V_{Tn}$ the threshold voltage modified by the body effect. All the internal transistors have $V_{dn}^{i} = V_{sn}^{i} = V_{DD} - V_{Tn}$, while the first one has $V_{dn}^{1} = V_{DD} - V_{Tn}$ and $V_{sn}^{1} = 0$.
Fig. 3.4: Voltage waveforms in the nMOS chain
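The threshold voltage variation as a function of $V_{sb}$ is given by:
\[ \Delta V_{Tn} = \gamma\left(\sqrt{2|\phi_p| + V_{sb}} - \sqrt{2|\phi_p|}\right), \]
with $\gamma = \dfrac{\sqrt{2\varepsilon_s q N_a}}{C_{ox}}$ and $\phi_p = \dfrac{KT}{q}\ln\left(\dfrac{N_a}{n_i}\right)$.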
Fig. 3.5: Voltages wave forms in the pmos chain
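The source potential of the top transistor is
\[ V_s = V_{DD} - V_{Tn}, \]
and, if $V_{Tn0}$ is the threshold voltage with $V_{sb} = 0$, then $V_{Tn} = V_{Tn0} + \Delta V_{Tn}$ and we can solve for $V_{sb}$:
\[ V_{sb} = -\frac{\gamma}{2}\sqrt{4\gamma\sqrt{2|\phi_p|} + 8|\phi_p| + 4V_{DD} - 4V_{Tn0} + \gamma^2} + \gamma\sqrt{2|\phi_p|} + V_{DD} - V_{Tn0} + \frac{\gamma^2}{2} \quad (> 0) \]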
An analogous equation can be found for the pMOSFETs. For the pMOS chain depicted in figure 3.7(b), the drain potential of transistor P is $V_{dp}^{P} = 0$, while $V_{sp}^{P} = V_{SS} - V_{Tp}$; for the middle transistors $V_{sp}^{j} = V_{dp}^{j} = V_{SS} - V_{Tp}$; and for the first (topmost) transistor $V_{dp}^{1} = V_{SS} - V_{Tp}$ and $V_{sp}^{1} = V_{DD}$.
The threshold voltage variation as a function of $V_{sb}$ is again:
\[ \Delta V_{Tp} = -\gamma\left(\sqrt{2|\phi_p| + |V_{sb}|} - \sqrt{2|\phi_p|}\right) \]
(for pMOS transistors the threshold voltage is negative).
Again, solving:
\[ V_{sb} = -V_{DD} - V_{Tp} = -V_{DD} - V_{Tp0} + \gamma\left(\sqrt{2|\phi_p| + |V_{sb}|} - \sqrt{2|\phi_p|}\right) \]
where $V_{Tp0}$ is the threshold voltage with the source at $V_{DD}$; thus we find:
\[ V_{sb} = \frac{\gamma}{2}\sqrt{4\gamma\sqrt{2|\phi_p|} + 8|\phi_p| + 4V_{DD} + 4V_{Tp0} + \gamma^2} - \gamma\sqrt{2|\phi_p|} - V_{DD} - V_{Tp0} - \frac{\gamma^2}{2} \quad (< 0) \]
Fig. 3.6: Drain-source ($V_{DS}^{i}$) and gate-source ($V_{GS}^{i}$) voltages of the i-th nMOS
The threshold variation is approximated in the model by a linear function:
\[ V_{Tn} = \alpha_n V_{sb} + \beta_n, \qquad V_{Tp} = \alpha_p V_{sb} + \beta_p \]
with $\alpha_{n,p}$ and $\beta_{n,p}$ constants:
\[ \alpha_n = \frac{V_{Tn} - V_{Tn0}}{V_{DD} - V_{Tn}}, \qquad \beta_n = V_{Tn0} \]
\[ \alpha_p = -\frac{V_{Tp} - V_{Tp0}}{V_{DD} + V_{Tp}}, \qquad \beta_p = \frac{V_{Tp}\,(V_{DD} + V_{Tp0})}{V_{DD} + V_{Tp}} \]
Fig. 3.7: MOSFET chain with static voltages: (a) nMOSFET chain, (b) pMOSFET chain
Fig. 3.8: Threshold variation with Vsb (solid line) and its linear approximation (dashed line): (a) nMOSFET, (b) pMOSFET
In figures 3.8(a) and 3.8(b) the actual threshold variation (of an nMOS transistor and of a pMOS transistor) when a voltage $V_{sb}$ is applied is compared with the linear approximation used in our model, for a 0.7 µm technology.
The maximum error due to the linear approximation is limited to 7%.
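The body-effect analysis for the nMOS chain can be checked numerically. The sketch below evaluates the closed-form source-body voltage of the topmost nMOS transistor, verifies that it satisfies $V_{sb} = V_{DD} - V_{Tn0} - \Delta V_{Tn}(V_{sb})$, and builds the linear approximation through the two end points. All parameter values (VDD, VT0, GAMMA, TWO_PHI) are hypothetical, so the resulting error is not expected to reproduce the 7% figure quoted for the 0.7 µm technology.

```python
import math

def delta_vt(vsb, gamma, two_phi):
    """Body-effect threshold shift: gamma*(sqrt(2|phi_p| + Vsb) - sqrt(2|phi_p|))."""
    return gamma * (math.sqrt(two_phi + vsb) - math.sqrt(two_phi))

def vsb_top_nmos(vdd, vt0, gamma, two_phi):
    """Closed-form root of Vsb = Vdd - Vt0 - delta_vt(Vsb) (8|phi_p| = 4*two_phi)."""
    root = math.sqrt(4.0 * gamma * math.sqrt(two_phi) + 4.0 * two_phi
                     + 4.0 * vdd - 4.0 * vt0 + gamma ** 2)
    return (-0.5 * gamma * root + gamma * math.sqrt(two_phi)
            + vdd - vt0 + 0.5 * gamma ** 2)

# Hypothetical parameters, for illustration only.
VDD, VT0, GAMMA, TWO_PHI = 3.3, 0.8, 0.5, 0.7

vsb_max = vsb_top_nmos(VDD, VT0, GAMMA, TWO_PHI)
vt_max = VT0 + delta_vt(vsb_max, GAMMA, TWO_PHI)

# Linear approximation through (0, VT0) and (vsb_max, vt_max).
alpha = (vt_max - VT0) / (VDD - vt_max)
beta = VT0

def vt_exact(vsb):
    return VT0 + delta_vt(vsb, GAMMA, TWO_PHI)

def vt_linear(vsb):
    return alpha * vsb + beta

grid = [vsb_max * k / 200.0 for k in range(201)]
max_rel_err = max(abs(vt_linear(v) - vt_exact(v)) / vt_exact(v) for v in grid)
```

The linear model matches the exact curve at both ends of the interval by construction, so the largest deviation occurs at an intermediate bias.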
3.2 Delay estimation
The delay estimation of the structures reported in figure 3.2 implies the evaluation of $\tau_{o\,n,p}^{i}$ and $t_{0\,n,p}^{i}$ for each transistor in the chains.
The currents in each transistor can be obtained from equations (3.1) and (3.2) (page 23), with the voltages as functions of time defined in equations (3.4a) to (3.4f) (page 25). So we can calculate the quantity of charge at each node and thus apply the charge conservation law, i.e. at each node the total charge variation must be equal to zero:
\[ \Delta Q_{n}^{i} = 0, \quad \Delta Q_{p}^{j} = 0, \qquad i = 1, 2, \dots, N \ \text{and} \ j = 1, 2, \dots, P \tag{3.5} \]
The generic term $\Delta Q_{n}^{i}$ is the sum of three elements, $\Delta Q_{n}^{i} = \Delta Q_{I}^{i+1} - \Delta Q_{I}^{i} + \Delta Q_{C}^{i}$, defined below:
$\Delta Q_{I}^{i+1}$ is the charge due to the (i+1)-th MOSFET placed above the i-th node:
\[ \Delta Q_{I}^{i+1} = \int_{t_{0n}^{i+1}}^{t_{sn}^{i+1}} I_{sat}^{i+1}(t)\,dt + \int_{t_{sn}^{i+1}}^{\tau_{on}^{i+1}} I_{lin}^{i+1}(t)\,dt \tag{3.6a} \]
which includes the contributions due to the currents above and below saturation; $t_s$ is the time at which the MOSFET switches from the saturation to the linear region;
$\Delta Q_{I}^{i}$ is the charge due to the i-th MOSFET below the i-th node:
\[ \Delta Q_{I}^{i} = \int_{t_{0n}^{i}}^{t_{sn}^{i}} I_{sat}^{i}(t)\,dt + \int_{t_{sn}^{i}}^{\tau_{on}^{i}} I_{lin}^{i}(t)\,dt \tag{3.6b} \]
$\Delta Q_{C}^{i}$ is the charge due to the discharging of the capacitor at the i-th node, $C^{i}$:
\[ \Delta Q_{C}^{i} = C^{i}\,V_{dn}^{i}. \tag{3.6c} \]
Similar equations apply to the pMOSFETs.

For each circuit node, a charge conservation equation can be written.
3.2.1 Equation solving
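Referring to the nMOS chain in figure 3.3, we can write at the output node N:
\[ \Delta Q_{n}^{N} = 0 \;\Rightarrow\; \Delta Q_{I}^{N} = \Delta Q_{C}^{N} = C^{N} V_{dn}^{N} \tag{3.7} \]
because, neglecting the contribution of the pMOS chain above (if it exists), $\Delta Q_{I}^{N+1} = 0$.
At the node N − 1 we can write:
\[ \Delta Q_{n}^{N-1} = \Delta Q_{I}^{N} - \Delta Q_{I}^{N-1} + \Delta Q_{C}^{N-1}, \]
and combining with eq. (3.7) (page 32)
\[ \Delta Q_{n}^{N-1} = C^{N} V_{dn}^{N} - \Delta Q_{I}^{N-1} + \Delta Q_{C}^{N-1}, \]
and so on:
\[ \Delta Q_{n}^{N-2} = C^{N} V_{dn}^{N} + C^{N-1} V_{dn}^{N-1} - \Delta Q_{I}^{N-2} + \Delta Q_{C}^{N-2}. \]
More generally:
\[ \Delta Q_{n}^{i} = \sum_{k=i+1}^{N} C^{k} V_{dn}^{k} - \Delta Q_{I}^{i} + \Delta Q_{C}^{i} = \sum_{k=i}^{N} C^{k} V_{dn}^{k} - \Delta Q_{I}^{i} = 0 \]
Proceeding down to the first transistor, we obtain:
\[ \Delta Q_{n}^{1} = \sum_{k=1}^{N} C^{k} V_{dn}^{k} - \Delta Q_{I}^{1} = 0, \tag{3.8} \]
and the same applies to the pMOSFETs.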

In order to solve the nonlinear equation (3.8) one must substitute the definition of the current to calculate the charge, as in equations (3.6a) and (3.6b) (page 32); moreover, one must substitute both the current calculated in the saturation region and the one calculated in the linear region, extending the integrals of the aforementioned equations to the proper limits.
Finally, we must distinguish among several different cases, depending on the instant at which the transistor switches from the saturation region to the linear region. For example, the first transistor can switch between the two regions when the rise of the input has already finished or, on the contrary, when the input is still rising.
All the possible cases are:
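\[ \begin{array}{l}
t_{0}^{1} \le t_{s}^{1} \le \tau_{i}^{1} \le \tau_{o}^{1} \\
t_{0}^{1} \le \tau_{i}^{1} \le t_{s}^{1} \le \tau_{o}^{1} \\
t_{s}^{1} \le t_{0}^{1} \le \tau_{i}^{1} \le \tau_{o}^{1} \\
\tau_{i}^{1} \le t_{0}^{1} \le t_{s}^{1} \le \tau_{o}^{1} \\
t_{s}^{1} \le \tau_{i}^{1} \le t_{0}^{1} \le \tau_{o}^{1} \\
t_{0}^{1} \le t_{s}^{1} \le \tau_{o}^{1} \le \tau_{i}^{1} \\
\tau_{o}^{1} \le \tau_{i}^{1} \le t_{s}^{1} \le t_{0}^{1}
\end{array} \tag{3.9} \]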
Evaluating all the possible cases, equation (3.8) becomes a nonlinear equation in the variables $t_{s}^{1}$, $t_{0}^{1}$, $\tau_{o}^{1}$, $\tau_{i}^{1}$, with $t_{s}^{1}$, $t_{0}^{1}$, $\tau_{o}^{1}$ as unknowns.
A further step is needed to eliminate all the variables but one. The real unknown is the time $\tau_{o}^{1}$, while all the other unknowns can be expressed as functions of $\tau_{o}^{1}$: in particular, the times $t_{s}^{1}$ and $t_{0}^{1}$ can be calculated together, using the equation $V_{DS} = V_{GS} - V_{T}$ and the equation that states the charge conservation at node 1 between time 0 and time $t_{0}^{1}$, similar to equation (3.5) (page 31), including the bootstrap effect due to the capacitive coupling between the gate and the drain of the first transistor.
Both these equations are functions of $t_{s}^{1}$, $t_{0}^{1}$, $\tau_{o}^{1}$, $\tau_{i}^{1}$. In this way one has three equations in three unknowns, and by means of some approximate methods (see footnote 2) it is possible to evaluate the three unknowns.
This solution scheme must be repeated for all the seven cases shown in equation (3.9). Each case gives as a solution a triple $t_{s}^{1}$, $t_{0}^{1}$, $\tau_{o}^{1}$ that is compatible with one and only one of the conditions expressed by these cases. Thus, only one working condition is actually selected, as expected.
Footnote 2: Or, taking into account the carrier velocity saturation effect, equation (3.3) (page 24). The problem is always strictly nonlinear.
Indeed, the whole solving scheme above holds only if equation (3.6c) (page 32) applies, i.e. only if the capacitance at node i is not a function of the voltage at the same node. But the capacitance actually depends on the voltage in this manner:
\[ C^{i} = \frac{C_{j}^{i}}{\left(1 + \dfrac{V^{i}}{\phi_b}\right)^{m_j}} + \frac{C_{p}^{i}}{\left(1 + \dfrac{V^{i}}{\phi_b}\right)^{m_p}} \tag{3.10} \]
where $C_j$ and $C_p$ are, respectively, functions of the area and of the perimeter of a junction, because the capacitance at node i is due to the parasitic capacitances of the transistors connected to this node.
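The voltage dependence of equation (3.10) can be illustrated with a short function. This is only a sketch: the parameter names and the numeric values in the usage lines are hypothetical and do not come from the technology files used in this work.

```python
def node_capacitance(v, c_area, c_perim, phi_b, m_j, m_p):
    """Voltage-dependent junction capacitance in the form of eq. (3.10).

    v       : node (reverse-bias) voltage
    c_area  : zero-bias area component
    c_perim : zero-bias perimeter component
    phi_b   : junction built-in potential
    m_j,m_p : grading coefficients of the area and perimeter terms
    """
    area_term = c_area / (1.0 + v / phi_b) ** m_j
    perim_term = c_perim / (1.0 + v / phi_b) ** m_p
    return area_term + perim_term

# Hypothetical values: 20 fF area term, 10 fF perimeter term.
c_zero_bias = node_capacitance(0.0, 20e-15, 10e-15, 0.8, 0.5, 0.33)
c_at_vdd = node_capacitance(3.3, 20e-15, 10e-15, 0.8, 0.5, 0.33)
```

With these assumed values the capacitance at 3.3 V is noticeably smaller than the zero-bias value, which is why a constant-capacitance charge balance is only an approximation.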
If the capacitances at each node are functions of the voltage at the node itself, then one equation is no longer sufficient: one must write equations like equation (3.8) (page 33), one for each node, and then solve them with a standard solving algorithm for nonlinear equations. The only difference between the equations applied at the nodes above the first and the first-node equation is that not all of the cases of equation (3.9) are possible: in particular, these conditions apply only when the transistor can pass from the saturation region to the linear region and, moreover, only when the input rise time $\tau_{i}^{1}$ can assume any value. The passage from saturation to the linear region can be made only by the first and the last transistors of the chain, as they are the only ones that can saturate (see footnote 3). But in the last transistor the time $\tau_{i}^{N}$ is governed by $\tau_{i}^{N} = \tau_{o}^{N-1}$, thus giving only two possible cases:
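\[ t_{0}^{N} \le t_{s}^{N} \le \tau_{i}^{N} \le \tau_{o}^{N} \qquad\qquad t_{0}^{N} \le \tau_{i}^{N} \le t_{s}^{N} \le \tau_{o}^{N} \]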
In order to make the algorithm convergent, two other fictitious cases

must be included:
t0N
tsN , oN
iN
t0
tsN , oN
iN
These conditions can never occur in a real circuit, since they imply that the voltages at the source node and at the drain node of the last transistor cross, making the transistor current flow in the reverse direction (see figure 3.6 for a visual explanation of the terms $\tau_i$ and $\tau_o$ and of why the corresponding voltage waveforms cannot cross). Their inclusion helps to find the real circuit conditions when solving equation (3.8) for each of these four cases: the solution of one of the fictitious cases only gives unknowns that are compatible with one of the real cases.
Footnote 3: This is because they are the only ones that have a full voltage swing at some node, i.e. the gate node for the first and the drain node for the last. All the transistors in the middle of the chain are prevented from saturating by the body effect, which makes the saturation condition $V_{DS} = V_{GS} - V_{T}$ (or, better, equation (3.3), page 24) impossible.
All the other transistors, which cannot saturate during the switching from off to on, have only one possible working condition, again that the voltages at the source and drain nodes do not cross, for j = 2, ..., N − 1.
Solving all the equations, one for each node, the unknowns $\tau_{o}^{j}$ can be evaluated, thus giving an estimate of the voltage waveform at each node of the chain. The rise/fall time of the last node of the chain also gives the delay of the chain itself.
3.3 Power consumption estimation

3.3.1 Switching energy
The contribution to the power dissipation due to the charge and discharge of internal nodes for each MOSFET can be defined as the integral of
the voltage across the MOSFET times the current flowing through.
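Theorem 3.2. The switching energy in generic n-networks and p-networks can be written as:
\[ E_{sw_n} = \frac{1}{2}\sum_{i=1}^{N} C^{i}\left[\left(V_{i}^{init}\right)^2 - \left(V_{i}^{fin}\right)^2\right] \tag{3.11} \]
\[ E_{sw_p} = \frac{1}{2}\sum_{j=1}^{P} C^{j}\left[\left(V_{DD} - V_{j}^{init}\right)^2 - \left(V_{DD} - V_{j}^{fin}\right)^2\right] \tag{3.12} \]
where $C^{i}$ is the total capacitance of the i-th node and $V_{i}^{init}$, $V_{i}^{fin}$ are, respectively, the initial and final values of the voltage swing at the same node.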
Corollary 3.2.1. If the voltage swing of each node of the network is the full swing $\Delta V = V_{DD} - 0$, then equations (3.11) and (3.12) can be written as:
\[ E_{sw_n} = \frac{1}{2}\sum_{i=1}^{N} C^{i}\,\Delta V^2 \tag{3.13} \]
\[ E_{sw_p} = \frac{1}{2}\sum_{i=1}^{P} C^{i}\,\Delta V^2 \tag{3.14} \]
Proof of theorem 3.2. Since the internal voltages and currents are known from the delay analysis, the energy for the nMOS network can be written by summing all the contributions of the internal nodes (see figure 3.3):
\[ E_{sw_n} = \sum_{i=1}^{N} \int \left[V_{Dn}^{i}(t) - V_{Dn}^{i-1}(t)\right] I_{Dn}^{i}(t)\,dt \]
where the notation of figure 3.3 is adopted and $V_{Dn}^{0} = V_{SS} = 0$.

This equation can be written in this way:
\[ E_{sw_n} = \int \left\{ V_{Dn}^{N}(t)\,I_{Dn}^{N}(t) + \sum_{i=1}^{N-1} V_{Dn}^{i}(t)\left[I_{Dn}^{i}(t) - I_{Dn}^{i+1}(t)\right] \right\} dt \tag{3.15} \]
It is possible to rewrite the previous equation by noting that in general:
\[ I_{Dn}^{i+1} - I_{Dn}^{i} = C^{i}\,\frac{dV_{Dn}^{i}}{dt} \]
and, in particular, if we neglect the current of the pMOS chain above the node N,
\[ I_{Dn}^{N} = -C^{N}\,\frac{dV_{Dn}^{N}}{dt} \]
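Thus, for the n network it is possible to write the energy $E_{sw_n}$ in the following way:
\[ E_{sw_n} = -\sum_{i=1}^{N} C^{i}\int_{t_0}^{t} V_{Dn}^{i}\,\frac{dV_{Dn}^{i}}{dt}\,dt = -\sum_{i=1}^{N} C^{i}\int_{V_{i}^{init}}^{V_{i}^{fin}} V_{Dn}^{i}\,dV_{Dn}^{i} = \frac{1}{2}\sum_{i=1}^{N} C^{i}\left[\left(V_{i}^{init}\right)^2 - \left(V_{i}^{fin}\right)^2\right] \]
If we integrate the equation (3.11) (page 36) only where the arguments of the integrals are nonzero, then the first integral in this equation goes from $t_0 = t_{0n}^{i}$ to $t = \tau_{on}^{i}$, so that the second integral goes from $V_{i}^{init} = V_{Dn}^{i}(t_{0n}^{i})$ to $V_{i}^{fin} = V_{Dn}^{i}(\tau_{on}^{i})$. Since $V_{Dn}^{i}(\tau_{on}^{i}) = 0$, we have $E_{sw_n} = \frac{1}{2}\sum_{i=1}^{N} C^{i}\left(V_{i}^{init}\right)^2$, where $V_{i}^{init}$ is the actual voltage swing at the node i.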
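The energy dissipated in the p network ($E_{sw_p}$) can be calculated with similar considerations, leading to
\[ E_{sw_p} = \sum_{j=1}^{P} C^{j}\int_{t_0}^{t} \left(V_{DD} - V_{Dp}^{j}\right)\frac{dV_{Dp}^{j}}{dt}\,dt = \sum_{j=1}^{P} C^{j}\int_{V_{j}^{init}}^{V_{j}^{fin}} \left(V_{DD} - V_{Dp}^{j}\right) dV_{Dp}^{j} = \frac{1}{2}\sum_{j=1}^{P} C^{j}\left[\left(V_{DD} - V_{j}^{init}\right)^2 - \left(V_{DD} - V_{j}^{fin}\right)^2\right] \]
Again, $V_{j}^{init} = V_{Dp}^{j}(t_{0p}^{j})$ and $V_{j}^{fin} = V_{Dp}^{j}(\tau_{op}^{j})$; in the same way, since $V_{j}^{fin} = V_{DD}$, we obtain $E_{sw_p} = \frac{1}{2}\sum_{j=1}^{P} C^{j}\left(V_{DD} - V_{j}^{init}\right)^2$, where $(V_{DD} - V_{j}^{init})$ is the voltage swing at the node j.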
In equations (3.11) and (3.12) (page 36) the voltage dependence of the capacitance must be included, which yields expressions for $E_{sw_{n,p}}$ that are slightly more complicated, but still in closed form.
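The closed-form result of theorem 3.2 translates directly into code. The sketch below assumes constant node capacitances (the voltage dependence mentioned above is not included); the function names and the example node values are hypothetical.

```python
def switching_energy_n(caps, v_init, v_fin):
    """Switching energy of an n-network: 0.5 * sum(C * (Vinit^2 - Vfin^2))."""
    return 0.5 * sum(c * (vi ** 2 - vf ** 2)
                     for c, vi, vf in zip(caps, v_init, v_fin))

def switching_energy_p(caps, v_init, v_fin, vdd):
    """Switching energy of a p-network:
    0.5 * sum(C * ((Vdd - Vinit)^2 - (Vdd - Vfin)^2))."""
    return 0.5 * sum(c * ((vdd - vi) ** 2 - (vdd - vf) ** 2)
                     for c, vi, vf in zip(caps, v_init, v_fin))

# Hypothetical two-node nMOS chain: an internal node precharged to 2.5 V
# and an output node at 3.3 V, both discharged to 0 V.
e_n = switching_energy_n([10e-15, 20e-15], [2.5, 3.3], [0.0, 0.0])

# Hypothetical two-node pMOS chain charging from 0.8 V and 0 V to 3.3 V.
e_p = switching_energy_p([10e-15, 20e-15], [0.8, 0.0], [3.3, 3.3], 3.3)
```

Because the internal nodes start from a reduced voltage, their contribution is smaller than the full-swing value of corollary 3.2.1.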
3.3.2 Short-circuit energy

The short-circuit contribution (for an output falling transition) is given by:
\[ E_{sc} = \int_{t_0}^{\tau_o} V_{D}\,I_{D}\,dt \]
where $I_D$ is the current flowing through the pMOSFET that has a changing gate voltage during the fall of the output; of course, all the pMOSFETs between this one and the output node must be on for this contribution to the power dissipation to exist. So, if we neglect the small discharge of the source voltage of this MOSFET, we can easily calculate the short-circuit energy by calculating the current that flows.
A similar equation can be written for the nMOS network.
Since voltage swings, internal currents and capacitances are known from the delay analysis, the evaluation of the power dissipation does not require additional computations.
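3.3.3 Subthreshold energy

The subthreshold current in a MOSFET is given by [12]:
\[ I_{DS_{subth}} = \mu_0\,\frac{W}{L}\,\frac{kT}{q}\,Q(V_S)\left(1 - e^{-\frac{qV_{DS}}{kT}}\right) \]
where
\[ Q(V_S) = \frac{kT}{q}\sqrt{\frac{q\,\varepsilon_s N_a}{|\phi_p|}}\;e^{\frac{q(V_G - V_T)}{\eta kT}} \]
and
\[ \eta = 1 + \frac{1}{2C_{ox}}\sqrt{\frac{q\,\varepsilon_s N_a}{|\phi_p|}} \]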
This current is proportional to the MOSFET width W but is usually negligible. However, with the scaling down of the dimensions, and hence of the threshold voltage, this current may no longer be negligible, and with low $V_G$ and higher $V_D$ the current becomes independent of $V_G$.
Moreover, while the short-circuit current is limited by the switching times of the circuit, the subthreshold current is not limited in time, so its dissipation can be comparable to the short-circuit dissipation.
3.4 Results
The circuit in figure 3.2 with 2 nMOS and 2 pMOS transistors (in a 0.7 µm technology) has been simulated using HSPICE (level 6) and the proposed model, for each combination of MOSFET widths from 1 µm to 100 µm.
Figure 3.9 shows the comparison between the delay (defined as the delay at 50% between an input rising ramp of 200 ps and the output falling ramp) calculated by the model and the delay simulated by HSPICE for each combination of widths between 5 µm and 30 µm; similarly, figure 3.10 shows the comparison between the energy dissipated by the circuit (during the output discharge) as calculated by the model and by HSPICE.
Tab. 3.1: Mean Error

              Delay       Energy dissipated
Mean error    6.115%      2.1%
Max error     12.985%     6.3%
Min error     0.905%      0.11%

Tab. 3.2: Execution time

              Execution time
HSPICE        6384.3 sec.
FAST          188.91 sec.
The errors between the proposed model and the HSPICE simulation are reported in table 3.1, while table 3.2 shows the corresponding execution times.
These results are taken from the analysis of the circuit, varying the dimensions of the MOSFETs continuously from 1 µm to 100 µm.
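As a quick check of the speed target stated at the beginning of the chapter, the ratio of the two execution times of table 3.2 can be computed directly; the variable names below are only illustrative.

```python
# Execution times reported in table 3.2, in seconds.
hspice_time_s = 6384.3
fast_time_s = 188.91

# Speed-up of the FAST model with respect to HSPICE.
speedup = hspice_time_s / fast_time_s
print(f"speed-up: {speedup:.1f}x")
```

The ratio is a little under 34, i.e. more than one order of magnitude.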
3.5 Conclusions
The model of this chapter is suitable for the optimization application of chapter 5. It is able to compute the delay and the power consumption of CMOS structures with good accuracy and a consistent speedup with respect to the HSPICE simulation taken as a reference.

In a real production design cycle, this model might be used for a first pre-optimization of some basic cells; then, in the last steps of the design flow, an optimization using a more accurate model for the delay (or power) evaluation must be used.
Fig. 3.9: Delay [ps] of the circuit of figure 3.2 for several combinations of W1 and W2 [micron]: (a) FAST model, (b) HSPICE simulation
Fig. 3.10: Energy [fJ] dissipated by the circuit of figure 3.2 for several combinations of W1 and W2 [micron]: (a) FAST model, (b) HSPICE simulation
Part III
OPTIMIZATION
Chapter 4
MATHEMATIC OPTIMIZATION
The very basic theory of optimization is introduced here, in order to develop some optimization schemes that will be useful later for the optimization of real circuits.
The theory of mono-objective optimization involves some properties and theorems about finding the minimum of a function, hence the vanishing of its first derivatives. These results can be extended (with some restrictions) to the case of multivariable functions, but when more than one function must be optimized simultaneously, a new theory must be introduced.
The goal of this introduction to mathematical optimization is both the development of reliable algorithms and the justification of some assumptions made in chapter 5 (page 77), especially for the multi-objective case.
Section 4.1 reports some foundations of mathematical optimization: section 4.1.1 shows the theory of mono-objective optimization (unconstrained, section 4.1.1.1, and constrained, section 4.1.1.2), while section 4.1.2 shows the theory of multi-objective optimization (unconstrained, section 4.1.2.1, and constrained, section 4.1.2.2).
Section 4.2 reports the basic and most useful numerical algorithms for optimization purposes: some one-dimensional search techniques in section 4.2.1, some multi-dimensional search techniques in section 4.2.2, and some special algorithms in sections 4.2.4 and 4.2.5.
Some conclusions and a summary of the characteristics are reported in section 4.3.
4.1 Optimization theory

Notation 4.1. In the following section, the function f is defined as $f : X \subseteq \mathbb{R}^p \to Y \subseteq \mathbb{R}$. X is called the decision space, and Y is called the criteria space.
Problem 4.2 (Unconstrained optimization). Given the function f that depends on one or more variables $x \in X$, the problem of optimizing f, in this context, is equivalent to finding:
\[ \min_{x \in X} f(x) \]
This is also known as an unconstrained optimization, since there are no constraints on the values that the function f may assume.
Unconstrained optimization is seldom applied in the field of digital circuits, so constrained optimization is defined as:
Problem 4.3 (Constrained optimization). Find
\[ \min_{x \in X} f(x) \]
subject to
\[ g_j(x) \le h_j, \qquad j = 1, 2, \dots, m \]
where the m inequalities $g_j(x) \le h_j$ constitute the set of constraints of the optimization.
The function f is also called the objective of the optimization, or the cost function of the problem.
The above problems are classical optimization problems, or mono-objective problems. Multi-objective unconstrained optimization is defined as the problem of optimizing a vector function, so that the objective function is a vector of objective functions.
Notation 4.4. In the following (multi-objective optimization), the function f is defined as $f : X \subseteq \mathbb{R}^p \to Y \subseteq \mathbb{R}^n$, or $f = (f_1, f_2, \dots, f_n)$ with $f_i : X \subseteq \mathbb{R}^p \to Y \subseteq \mathbb{R}$.
Problem 4.5 (Unconstrained multi-objective optimization). Find
\[ \min_{x \in X} f_i(x), \qquad i = 1, 2, \dots, n \]
where there are n objective functions.

Finally, multi-objective constrained optimization is defined as:
Problem 4.6 (Constrained multi-objective optimization). Find
\[ \min_{x \in X} f_i(x), \qquad i = 1, 2, \dots, n \]
subject to
\[ g_i(x) \le h_i, \qquad i = 1, 2, \dots, m \]
where there are n objective functions and m constraints.

Multi-objective optimization is a very complex problem, since the problem of finding the minimum of two or more functions is only apparently trivial: the set of independent variables $x_{min}$ that minimizes, say, the function $f_1$ is not guaranteed to minimize (and generally does not minimize) the other functions. So there should be a way to combine the information about the minimum of all the functions. The intuitive way of a linear combination is somewhat problematic:
\[ f_{tot}(x) = \sum_{i=1}^{n} \lambda_i f_i(x), \qquad \lambda_i \in \mathbb{R} \]
because the functions $f_i$ may not be commensurable with each other. For example, if there is one function $f_j$ such that $f_j \gg f_i$, $\forall i \neq j$, then this function dominates the total objective, giving misleading results for the optimization problem. This problem is illustrated in section 4.1.2.
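The commensurability problem can be made concrete with a small numerical sketch. The two objectives below are hypothetical: `f1` has its minimum at x = 1 and `f2` at x = −1, but `f2` is about a thousand times larger in magnitude. A plain grid search is used only to keep the example self-contained; it is not one of the algorithms discussed in this chapter.

```python
def f1(x):
    """First objective, minimum at x = 1 (hypothetical)."""
    return (x - 1.0) ** 2

def f2(x):
    """Second objective, minimum at x = -1, about 1000 times larger in scale."""
    return 1000.0 * (x + 1.0) ** 2

def argmin_on_grid(func, grid):
    """Return the grid point with the smallest function value."""
    return min(grid, key=func)

grid = [i / 1000.0 for i in range(-2000, 2001)]

# Equal weights: the large-scale objective dominates the sum.
x_naive = argmin_on_grid(lambda x: f1(x) + f2(x), grid)

# Weights chosen to make the two objectives commensurable.
x_scaled = argmin_on_grid(lambda x: f1(x) + f2(x) / 1000.0, grid)
```

With equal weights the combined minimum sits almost exactly on the minimum of `f2`, as if `f1` were absent, while the rescaled combination gives a compromise between the two objectives.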
4.1.1 Mono-objective optimization

Mono-objective optimization is the standard optimization problem, and is widely treated in the literature (see [13] for an introduction). With this preliminary statement, some results are reported here that are useful to find a solution for problems 4.2 and 4.3.
The existence of (at least one) minimum is granted by the Weierstrass theorem (provided that X is a compact set, as it is in this context), but these minima can be local or global:
Definition 4.7 (Local minimum). The point $x^* \in X$ is a local (or relative) minimum of the function f iff $\exists\, \varepsilon > 0 : f(x) \ge f(x^*)\ \forall x \in X$ such that $|x - x^*| < \varepsilon$.
Definition 4.8 (Global minimum). The point $x^* \in X$ is a global (or absolute) minimum of the function f iff $f(x) \ge f(x^*)\ \forall x \in X$.
Definition 4.9 (Feasible direction). $d \in \mathbb{R}^n$ is a feasible direction if $\exists\, \bar{\alpha} > 0 : x + \alpha d \in X,\ \forall \alpha : 0 \le \alpha \le \bar{\alpha}$.
Intuitively, the concept of feasible direction is useful to solve the minimization problem: we search all the directions in which the function f is decreasing.
Lemma 4.10 (First order necessary condition). If $x^* \in X$ is a minimum of $f \in C^1$ then, $\forall d \in \mathbb{R}^n$ where d is a feasible direction, $d^T \cdot \nabla f(x^*) \ge 0$, where $(\cdot)$ has the usual definition of scalar product in the space $\mathbb{R}^n$.
Corollary 4.10.1. If $x^* \in X$ is an internal point of X, then $d^T \cdot \nabla f(x^*) = 0$.

Lemma 4.11 (Second order necessary condition). If $x^* \in X$ is a minimum of $f \in C^2$ then, $\forall d \in \mathbb{R}^n$ where d is a feasible direction,
i) $d^T \cdot \nabla f(x^*) \ge 0$;
ii) if $d^T \cdot \nabla f(x^*) = 0$ then $d^T \cdot \nabla^2 f(x^*) \cdot d \ge 0$.
Corollary 4.11.1. If $x^* \in X$ is an internal point of X, then
i) $d^T \cdot \nabla f(x^*) = 0$;
ii) $d^T \cdot \nabla^2 f(x^*) \cdot d \ge 0$.
The conditions of corollary 4.11.1 are necessary conditions for the existence of a (local) minimum; they become sufficient when the inequality in ii) is strict for every $d \neq 0$. In order to have some information about the existence of a global minimum, the theory of convex functions must be very briefly reported.
Definition 4.12 (Convex function). The function $f : X \to Y$, where X is a convex set (a set $X \subseteq \mathbb{R}^n$ is convex if, $\forall x, y \in X$, the segment [x, y] is totally contained in X), is convex if $\forall x_1, x_2 \in X$, $\forall \alpha : 0 \le \alpha \le 1$
\[ f(\alpha x_1 + (1 - \alpha) x_2) \le \alpha f(x_1) + (1 - \alpha) f(x_2) \tag{4.1} \]
If in equation (4.1) the sign < applies, then the function is said to be strictly convex.
Another way to write equation (4.1) is:
Lemma 4.13. The function $f \in C^1 : X \to Y$ is convex over a convex set X if
\[ f(y) \ge f(x) + \nabla f(x) \cdot (y - x), \qquad \forall y, x \in X \]
or, if f is twice differentiable,
Lemma 4.14. The function $f \in C^2 : X \to Y$ is convex over a convex set X if
\[ \nabla^2 f(x) \ge 0, \qquad \forall x \in X \]
Convex functions are a very useful mathematical tool in the class of optimization problems, mainly because of the next two results:
Theorem 4.15. If $f : X \to Y$ is convex over a convex set X, the set A of the minima of the function is convex, and every local minimum is also a global minimum.
Theorem 4.16. If $f \in C^1 : X \to Y$ is convex over a convex set X, and if $\exists\, x^* \in X : \forall x \in X\ \nabla f(x^*) \cdot (x - x^*) \ge 0$, then $x^*$ is a global minimum of f over X.
The theorem 4.16 also implies that the conditions of the lemma 4.10 and
corollary 4.10.1 (first order conditions) are both necessary and sufficient
conditions for the existence of a global minimum.
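The role of convexity can be checked numerically on a simple example. The sketch below uses a hypothetical convex function of two variables, tests the first-order inequality of lemma 4.13 on random pairs of points, and verifies that the stationary point is not beaten by any sampled point, as theorem 4.16 predicts. It is an illustration on sampled points, not a proof.

```python
import random

def f(x):
    """Hypothetical convex objective: x0^2 + 2*x1^2."""
    return x[0] ** 2 + 2.0 * x[1] ** 2

def grad_f(x):
    """Analytical gradient of f."""
    return (2.0 * x[0], 4.0 * x[1])

def first_order_convexity_holds(x, y, tol=1e-9):
    """Check f(y) >= f(x) + grad_f(x) . (y - x), up to rounding."""
    g = grad_f(x)
    linear = f(x) + g[0] * (y[0] - x[0]) + g[1] * (y[1] - x[1])
    return f(y) >= linear - tol

rng = random.Random(0)
samples = [(rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)) for _ in range(200)]

# Lemma 4.13 on a subset of base points against all samples.
convex_ok = all(first_order_convexity_holds(x, y)
                for x in samples[:20] for y in samples)

# The gradient vanishes at the origin, so it should be a global minimum.
x_star = (0.0, 0.0)
is_global_min = all(f(x_star) <= f(x) for x in samples)
```

For a non-convex function the same stationarity test would only identify a candidate local minimum, which is why the convexity hypothesis matters.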
4.1.1.1 Unconstrained problem

All the previous results are, at least in theory, sufficient to solve problem 4.2. The theory of convex functions ensures the existence of a global minimum, while lemma 4.10, corollary 4.10.1 and theorem 4.16 suggest a method to find this minimum. We will see in section 5.1 how these methods apply to real circuits, in which, for example, the derivatives of the functions are not available.
4.1.1.2 Constrained problem

The solution of problem 4.3 is slightly more complicated. The presence of constraints reduces the feasible set of independent variables that are solutions of the problem. So the solutions (i.e. the values of the independent variables that minimize the objective function) must be searched in the set $C \subseteq X$ of the points x that satisfy all the constraints.
The most important method to solve the minimization problem while satisfying some constraints (and, incidentally, the most useful method for our real problem) is the method of the Lagrange multipliers (and its derivative, the penalty function method).
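Lagrange multiplier and Penalty functions. The first method defines a Lagrangian function:
\[ L(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x) \tag{4.2} \]
If we define $x^*$ as the solution such that:
\[ x^* = \arg\min_{x \in X} f(x), \qquad g_i(x^*) \le 0, \quad i = 1, 2, \dots, m \]
then we can write the necessary Kuhn-Tucker conditions for the existence of the minimum:
\[ \nabla_x L(x^*, \lambda^*) = 0 \tag{4.3} \]
\[ \nabla_\lambda L(x^*, \lambda^*) \le 0 \tag{4.4} \]
\[ (\lambda^*)^T g(x^*) = 0 \tag{4.5} \]
\[ \lambda^* \ge 0 \tag{4.6} \]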
In order to find sufficient conditions, we define the saddle-point conditions:

Theorem 4.17. A point $(x^*, \lambda^*)$ with $\lambda^* \ge 0$ is a saddle point of the Lagrangian $L(x, \lambda)$ iff
i) $x^*$ minimizes $L(x, \lambda^*)$ over the whole X;
ii) $g_i(x^*) \le 0$, $i = 1, 2, \dots, m$;
iii) $\lambda_i^*\, g_i(x^*) = 0$, $i = 1, 2, \dots, m$.
It can be proved that if the functions f and g are convex, even if they are not differentiable, then the saddle-point conditions are necessary and sufficient conditions. Although these conditions must hold at the minimum, they are not very useful in determining the optimum point. The determination of the optimum by direct solution of these equations is rarely practicable.
A more practical way is to convert the constrained problem into an unconstrained one, by defining the new objective function:
\[ P(x, K) = f(x) + \sum_{i=1}^{m} K_i\,[g_i(x)]^2 \tag{4.7} \]
The sum added to the objective function is called penalty function, since it penalizes the objective function by adding a positive quantity (recall that we want to minimize the cost function). The constants $K = [K_1, K_2, \dots, K_m]^T$ are (positive) weighting factors that define how strongly the i-th constraint must be satisfied, and can also make the constraints commensurable.
Wherever x is inside the feasible region, we can ignore the constraints, so a new objective function can be defined as:
\[ P(x, K) = f(x) + \sum_{i=1}^{m} K_i\,[g_i(x)]^2\,u_i(g_i) \tag{4.8} \]
where $u_i(g_i)$ is the usual step function:
\[ u_i(g_i) = \begin{cases} 0 & \text{if } g_i(x) \le 0 \\ 1 & \text{if } g_i(x) > 0 \end{cases} \]
The introduction of the step function makes it possible to relate the penalty function defined in (4.8) to the Lagrangian function of (4.2) (page 52):
\[ P(x, K) = L(x, \lambda) \]
if we let $\lambda_i = K_i\,g_i(x)\,u_i(g_i)$, so that all the previous results valid for the Lagrangian function are valid for the penalty function.
Note that the solution x found by optimizing the penalty function P(x, K) converges to $(x^*, \lambda^*)$, defined by the Kuhn-Tucker conditions, only in the limit $K \to \infty$.
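The convergence of the penalty method as K grows can be observed on a one-dimensional toy problem. The sketch below is only an illustration under stated assumptions: the objective f(x) = (x − 2)², the single constraint g(x) = x − 1 ≤ 0 and the ternary-search minimizer are all hypothetical choices made to keep the example self-contained. The constrained optimum is x = 1, and for this problem the minimizer of the penalized function is (2 + K)/(1 + K).

```python
def penalized(x, k):
    """Penalty function of eq. (4.8) for f(x) = (x - 2)^2 and g(x) = x - 1 <= 0."""
    g = x - 1.0
    step = 1.0 if g > 0.0 else 0.0   # u(g): the penalty is active only outside the feasible region
    return (x - 2.0) ** 2 + k * g * g * step

def minimize_1d(func, lo, hi, iters=200):
    """Ternary search; valid here because the penalized function is convex."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Minimizers of the penalized problem for increasing penalty weights.
solutions = {k: minimize_1d(lambda x, k=k: penalized(x, k), -5.0, 5.0)
             for k in (1.0, 10.0, 1000.0)}
```

For K = 1 the minimizer is 1.5, well inside the infeasible region; it moves towards the constrained optimum x = 1 as K increases, but reaches it only in the limit.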
4.1.2 Multi-objective optimization

Multi-objective optimization is not a standard problem in engineering, but it is quite common in economics ([14]). While in the mono-dimensional problem the concept of optimum as a minimum is clear and well defined (the idea of greater or lesser is intuitive for real numbers), with multi-objective (also called multi-criteria) problems the concept of minimum is less intuitive. So we must define some order relation among the points in a multi-dimensional space.
Notation 4.18. Given $x, y \in \mathbb{R}^n$, define
$x = y$ iff $x_k = y_k$, $\forall k = 1, 2, \dots, n$
$x \leqq y$ iff $x_k \le y_k$, $\forall k = 1, 2, \dots, n$
$x \le y$ iff $x \leqq y$ and $x \ne y$ (so $\exists k : x_k < y_k$)
$x < y$ iff $x_k < y_k$, $\forall k = 1, 2, \dots, n$
Notation 4.19. In the following section, the function $f$ is defined as $f : X \to Y$, $X \subseteq \mathbb{R}^p$, $Y \subseteq \mathbb{R}^n$. $X$ is called the decision space, while $Y$ is called the criteria space.
Given two outcomes $y^1, y^2$ of the cost functions, $y^1 = f(x^1)$ and $y^2 = f(x^2)$, we must define which is better: we indicate that $y^1$ is better than $y^2$ with $y^1 \succ y^2$, that $y^1$ is worse than $y^2$ with $y^1 \prec y^2$, and, finally, that $y^1$ is indifferent with respect to $y^2$ with $y^1 \sim y^2$.
In optimization theory the definition of Pareto point, or Pareto preference, is of great importance:

Definition 4.20 (Pareto preference). Given $y^1, y^2 \in Y$, the Pareto preference is defined by
$y^1 \succ y^2$ iff $y^1 \le y^2$.
A Pareto preference is intuitively guided by the relation "lesser is better".

Definition 4.21 (Non-Dominated and Dominated set). If $y^1 \succ y^2$ is a binary preference defined on $Y$, the dominated and the non-dominated sets with respect to $\{\succ\}$ are defined as:

$N(\{\succ\}, Y) = \{y^0 \in Y \mid \nexists\, y \in Y : y \succ y^0\}$
$D(\{\succ\}, Y) = \{y^0 \in Y \mid \exists\, y \in Y : y \succ y^0\}$
If $y^0 \in N(\{\succ\}, Y)$, $y^0$ is an N-point. Similarly, if $y^0 \in D(\{\succ\}, Y)$, $y^0$ is a D-point.
Definition 4.22 (Pareto optimum). $y^* \in Y$ is a Pareto optimum iff it is an N-point with respect to the Pareto preference.
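For a finite set of outcomes the definitions above can be checked directly. The following is a minimal sketch; the function names and the sample (delay, power) pairs are hypothetical and only serve to illustrate definitions 4.20 to 4.22.

```python
def leq_all(y1, y2):
    """True when every component of y1 is <= the matching component of y2."""
    return all(a <= b for a, b in zip(y1, y2))


def pareto_better(y1, y2):
    """Pareto preference (definition 4.20): y1 <= y2 componentwise and y1 != y2."""
    return leq_all(y1, y2) and tuple(y1) != tuple(y2)


def non_dominated(Y):
    """Return the N-points of a finite outcome set Y (definition 4.21)."""
    return [y0 for y0 in Y if not any(pareto_better(y, y0) for y in Y)]


# Hypothetical (delay, power) outcomes of five candidate designs.
Y = [(1.0, 9.0), (2.0, 4.0), (3.0, 5.0), (4.0, 1.0), (4.0, 4.0)]
print(non_dominated(Y))
```

Here (3, 5) and (4, 4) are D-points, since both are dominated by (2, 4); the three remaining outcomes are Pareto optima.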
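We now give two theorems that are fundamental for the solution of the multi-objective optimization problem; first we introduce the definition of convex cone in $\mathbb{R}^n$:
Notation 4.23 (convex cone).
$\Lambda^{>} = \{d \in \mathbb{R}^n \mid d > 0\}$
$\Lambda^{\ge} = \{d \in \mathbb{R}^n \mid d \ge 0\}$
$\Lambda^{\geqq} = \{d \in \mathbb{R}^n \mid d \geqq 0\}$
Theorem 4.24.
i) if $y^0 \in Y$ minimizes $\lambda \cdot y$ over $Y$ for some $\lambda \in \Lambda^{>}$, then $y^0$ is an N-point;
ii) if $y^0 \in Y$ uniquely minimizes $\lambda \cdot y$ over $Y$ for some $\lambda \in \Lambda^{\ge}$, then $y^0$ is an N-point.
Corollary 4.24.1. If $Y$ is $\Lambda^{\geqq}$-convex, i.e. $Y + \Lambda^{\geqq}$ is a convex set, then a necessary condition for $y^0 \in Y$ to be an N-point is to minimize $\lambda \cdot y$ over $Y$ for some $\lambda \in \Lambda^{>}$.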
This very important theorem (and its corollary) states that if $y^0$ minimizes a linear weighted function $\lambda \cdot y$ (for some suitable $\lambda$), then $y^0$ is a Pareto optimum. This reduces the problem from a multi-objective one to a mono-objective one, i.e. it is sufficient to minimize a linear weighted function of the cost functions.
Note that, along a line of constant $\lambda \cdot y$:
$$\frac{\lambda_j}{\lambda_i} = -\frac{\partial y_i}{\partial y_j}$$
so the ratio $\lambda_j / \lambda_i$ is the trade-off exchanging a unit gain in the variable $y_j$ with a unit gain for the variable $y_i$. Finally, note that the theorem is valid for any shape of $Y$.
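The reduction to a single weighted objective can be sketched on a finite outcome set. The sample outcomes and the function name below are assumptions chosen for illustration; with strictly positive weights each minimizer is an N-point, and different weight ratios select different trade-offs.

```python
def weighted_sum_minimizer(Y, weights):
    """Return the outcome in Y minimizing the linear weighted function sum_i w_i * y_i."""
    return min(Y, key=lambda y: sum(w * yi for w, yi in zip(weights, y)))


# Hypothetical (delay, power) outcomes of five candidate designs.
Y = [(1.0, 9.0), (2.0, 4.0), (3.0, 5.0), (4.0, 1.0), (4.0, 4.0)]
for weights in [(1.0, 1.0), (4.0, 1.0), (10.0, 1.0)]:
    print(weights, weighted_sum_minimizer(Y, weights))
```

Increasing the weight of the first criterion moves the selected point from (4, 1) to (2, 4) and then to (1, 9); the dominated outcomes (3, 5) and (4, 4) are never selected.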
Theorem 4.25. A necessary and sufficient condition for $y^0 \in Y$ to be an N-point is that $\forall i = 1, 2, \dots, n$ there are $n - 1$ constants $\eta(i) = \{h_j \mid j \ne i, j = 1, 2, \dots, n\}$ so that $y^0$ uniquely minimizes $y_i$ over $Y(\eta(i)) = \{y \in Y \mid y_j \le h_j, j \ne i, j = 1, 2, \dots, n\}$.
Each constant $h_j$ can be seen as a constraint: so this theorem claims that a necessary and sufficient condition to be a Pareto optimum is to minimize one criterion (the $i$th objective function), while satisfying the constraints for the remaining criteria. This is equivalent to saying that the multiple criteria problem can be reduced to a single criterion problem: minimize the function $y_i$ with multiple constraints (ensure that $y_j \le h_j$, $j \ne i$).
4.1.2.1 Unconstrained
Given all previous results, the solution of the unconstrained problem is given by the previous tools: we reduce the multi-objective problem to a mono-objective one. We will see in section 5.1 how to apply these methods and which is preferred.
4.1.2.2 Constrained
Again, the solution is to reduce the complexity of the problem from a multi-objective to a mono-objective one. It is possible to combine the two previous methods, that is to minimize a linear weighted function plus a sum of penalty functions; the only critical point is to ensure that every term of the sum has the same order of magnitude, so that no single term dominates the sum. A third way to solve an unconstrained problem (or a constrained one, with some care) is to use the method of the compromise solution:
Compromise solution. Given the problem 4.3, it is possible to define $y^*$ as the ideal outcome of the cost function $f(x)$ without any constraints, so that $y^* = \inf_{x \in X} f(x)$; the compromise solution is defined as the minimum of the regret:
$$r(y) = \| y - y^* \|;$$
typically, the $L_p$ norm (the distance between the actual solution and the ideal point) is used:
$$r(y) = r(y; p) = \left[ \sum_{i=1}^{n} | y_i - y_i^* |^p \right]^{1/p}$$
Again, a weight can be associated with each term of the sum:
$$r(y; p, w) = \left[ \sum_{i=1}^{n} w_i \, | y_i - y_i^* |^p \right]^{1/p}$$
Definition 4.26 (Compromise solution). The compromise solution with respect to the $L_p$ norm is the $y^p \in Y$ that minimizes $r(y; p, w)$ over $Y$.
The compromise solution enjoys several properties, the most important of which is:
Property 4.27 (Pareto optimality). The compromise solution $y^p \in Y$ is an N-point, for $1 \le p < \infty$, with respect to the Pareto preference (definition 4.20). If $y^\infty$ is unique, then it is also an N-point.
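As a rough illustration of the compromise solution on a finite outcome set, the sketch below takes the ideal point as the component-wise minimum of the outcomes and assumes that each weight multiplies its term $|y_i - y_i^*|^p$ inside the sum. The function names and the data are hypothetical.

```python
def regret(y, y_ideal, p=2.0, w=None):
    """Weighted L_p regret r(y; p, w) = (sum_i w_i * |y_i - y*_i|^p)^(1/p)."""
    if w is None:
        w = [1.0] * len(y)
    return sum(wi * abs(yi - yi_star) ** p
               for wi, yi, yi_star in zip(w, y, y_ideal)) ** (1.0 / p)


def compromise_solution(Y, p=2.0, w=None):
    """Return the outcome of Y closest to the ideal point, and the ideal point itself."""
    # Ideal point: best value of every criterion taken separately.
    y_ideal = [min(column) for column in zip(*Y)]
    best = min(Y, key=lambda y: regret(y, y_ideal, p, w))
    return best, y_ideal


# Hypothetical (delay, power) outcomes of five candidate designs.
Y = [(1.0, 9.0), (2.0, 4.0), (3.0, 5.0), (4.0, 1.0), (4.0, 4.0)]
best, ideal = compromise_solution(Y, p=2.0)
print(best, ideal)
```

With these data the ideal point is (1, 1), which is not attainable, and the outcome at minimum $L_2$ distance from it is (4, 1), one of the N-points of the set.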
When the ideal point is not known, one can use an approximation, or even a constraint; in the latter case the more appropriate term is satisfying level. To point out the differences between constraints and satisfying levels, one must observe:
The constraints are, typically, inequality constraints: the solution must stay as far below the specified constraints as possible. In terms of an $L_p$ norm the solution must be as far as possible from the constraints, that is, the $L_p$ norm must not be minimized. So the penalty function method is the only one suitable for this kind of problem.
The satisfying levels are, typically, equality constraints: the solution must be as close as possible to the indicated levels, that is, the $L_p$ norm must be minimized. So the method of the compromise solution can be used.
4.2 Optimization Algorithms
This is a very concise report of some algorithms used for the optimization of real circuits in the following chapters.
First some one-dimensional (with respect to the decision space) algorithms are reported, then the multi-dimensional algorithms, some of which are based on the previous ones. Finally some non-standard algorithms are reported, since they can be suitable for application to digital circuits.
In the following report we focus on the algorithms that do not require
the evaluation of the gradient of the objective functions, or that approximate
this gradient3 , since (see 5.1) the functions available in real circuits are not
known in a closed form and almost

3 Essentially with $\frac{\partial f}{\partial x_i}(x) \approx \frac{f(x + \Delta x) - f(x)}{\Delta x}$
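A forward-difference approximation of the gradient, of the kind mentioned in the footnote, can be sketched as follows. The function name, the step `dx` and the test function are assumptions for illustration; in a real circuit each call to `f` would be a full delay or power evaluation.

```python
def forward_gradient(f, x, dx=1e-6):
    """Approximate the gradient of f at x with one extra evaluation per component."""
    f0 = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += dx          # perturb only the i-th decision variable
        grad.append((f(shifted) - f0) / dx)
    return grad


# Hypothetical smooth test function; its exact gradient at (1, 2) is (8, 3).
f = lambda x: x[0] ** 2 + 3.0 * x[0] * x[1]
print(forward_gradient(f, [1.0, 2.0]))
```

The approximation costs $n + 1$ function evaluations per gradient, which is the reason why gradient-free methods are attractive when each evaluation is expensive.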
4.2.1 One-dimensional search techniques

In order to find the minimum of a function $f : \mathbb{R} \to \mathbb{R}$, we need to bracket it:
Definition 4.28 (Bracketing). To bracket a minimum means to find a triple $a, b, c \in \mathbb{R}$, $a < b < c$, such that $f(b) < f(a)$ and $f(b) < f(c)$. This means that the minimum is in the interval $(a, c)$.
We show some algorithms that are the most efficient in this field. First we introduce the family of sectioning algorithms, among which the golden section search is probably the most suitable for our uses. Then we introduce Brent's rule, a quadratic interpolation algorithm.
4.2.1.1 The section search
The sectioning algorithms always apply the same policy: divide and conquer. The initial interval $[a, c]$ is reduced at each iteration to a smaller interval that still brackets the minimum $x^*$. We thus have a series of nested intervals (see figure 4.1)
$$x^* \in [a_n, c_n] \subset [a_{n-1}, c_{n-1}] \subset \dots \subset [a, c].$$
Dichotomic search. The simplest form of sectioning is the dichotomic search: at the first iteration the interval $[a, c]$ is divided in two equal parts, $[a, b]$ and $[b, c]$, so that $b = \frac{a + c}{2}$; then, choosing $\varepsilon > 0$, we check if $f(b - \varepsilon) > f(b + \varepsilon)$. In that case $f$ is decreasing across $b$, so (up to the uncertainty $\varepsilon$) the minimum lies to the right of $b$ and we repeat the whole process with the new interval $[b, c]$; otherwise we repeat with $[a, b]$. It can be proved ([13]) that this method requires $2k$ evaluations of the function $f$, where $k$ is the number of iterations. Also the final interval length $I_k = (c_k - a_k)$ is
$$\lim_{k \to \infty} I_k = \varepsilon I_0, \quad \text{where } I_0 = (c - a).$$
So the relative uncertainty on the minimum $x^*$ is $\varepsilon$.
Fig. 4.1: Section search algorithm (nested intervals $I_0 \supset I_1 \supset I_2$, with $b_0 = a_1$, $b_1 = c_2$, $c_0 = c_1$)
Fibonacci search. A more sophisticated algorithm is the Fibonacci search, where at each iteration the length of the interval is chosen according to the Fibonacci rule: $I_{k-3} = I_{k-2} + I_{k-1}$. This method has the advantage that the uncertainty after $n$ iterations is known a priori: defining the initial interval $I_0 = I_1 = (c - a)$, then
$$I_k = \frac{I_1 + \varepsilon f_{k-2}}{f_k}$$
where $f_i$ is the $i$th number of the Fibonacci sequence.

The number of function evaluations is again $2k$, and the disadvantage of this method is that $\varepsilon$ and $n$ must be chosen a priori.
The golden section search. Given a triplet $(a, b, c)$ that brackets the minimum, we choose a new point $x$ that defines a new bracketing triplet, $(a, b, x)$ or $(b, x, c)$, according to the rule:
$$\frac{x - b}{c - a} = 1 - 2\,\frac{b - a}{c - a}$$
This implies that $|b - a| = |x - c|$, and that at each iteration the interval is scaled by the same ratio $\tau$.
Then we repeat the process with the new triplet. So the interval $(a, c)$ is divided in two parts, a smaller and a larger one, and the ratio between the whole interval and the larger part is the same as that between the larger and the smaller part, or in other words:
$$\frac{1}{\tau} = \frac{\tau}{1 - \tau},$$
giving for the positive solution
$$\tau = \frac{\sqrt{5} - 1}{2}.$$
This fraction is known as the golden mean or golden section, whose aesthetic properties come from the ancient Pythagoreans.
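A compact sketch of the golden section search is given below. It uses the common two-interior-point formulation rather than the triplet bookkeeping described above, and it assumes that $f$ is unimodal on the starting interval; the function name and the tolerance are illustrative choices.

```python
import math

TAU = (math.sqrt(5.0) - 1.0) / 2.0  # golden section, about 0.618


def golden_section(f, a, c, tol=1e-8):
    """Shrink [a, c] by the factor TAU per iteration; assumes f is unimodal on [a, c]."""
    x1 = c - TAU * (c - a)
    x2 = a + TAU * (c - a)
    f1, f2 = f(x1), f(x2)
    while (c - a) > tol:
        if f1 < f2:           # the minimum lies in [a, x2]
            c, x2, f2 = x2, x1, f1
            x1 = c - TAU * (c - a)
            f1 = f(x1)
        else:                 # the minimum lies in [x1, c]
            a, x1, f1 = x1, x2, f2
            x2 = a + TAU * (c - a)
            f2 = f(x2)
    return 0.5 * (a + c)


print(golden_section(lambda x: (x - 2.0) ** 2 + 1.0, 0.0, 5.0))
```

Because one of the two interior points is reused at every iteration, only one new function evaluation is needed per step, and the interval length is multiplied by `TAU` each time.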
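Convergence considerations. All three previous methods have a linear convergence, since at each iteration the ratio between the new smaller interval and the previous interval containing $x^*$ is:
$$0 \le \frac{I_{k+1}}{I_k} \le 1.$$
The asymptotic convergence rate is defined as
$$\lim_{k \to \infty} \frac{I_{k+1}}{I_k}.$$
For the dichotomic search, since $2 I_{k+1} = I_k + \varepsilon$, taking $\varepsilon = 0$ we have
$$\lim_{k \to \infty} \frac{I_{k+1}}{I_k} = \frac{1}{2}.$$
For the Fibonacci search, first we must write the generic number of the Fibonacci sequence in a closed form:
$$f_k = \frac{1}{\sqrt{5}} \left[ \left( \frac{1 + \sqrt{5}}{2} \right)^{k+1} - \left( \frac{1 - \sqrt{5}}{2} \right)^{k+1} \right]$$
then it can be proved that, taking $\varepsilon = 0$ (so that $I_k = I_1 / f_k$):
$$\lim_{k \to \infty} \frac{I_{k+1}}{I_k} = \lim_{k \to \infty} \frac{f_k}{f_{k+1}} = \frac{\sqrt{5} - 1}{2}$$
For the golden section search, as previously said, $\frac{I_{k+1}}{I_k} = \tau$ (the golden section ratio), so
$$\lim_{k \to \infty} \frac{I_{k+1}}{I_k} = \tau = \frac{\sqrt{5} - 1}{2}.$$
Thus the convergence rates of the Fibonacci and the golden section search are identical.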
4.2.1.2 Parabolic interpolation
Given a triplet $(a, b, c)$ that brackets a minimum, we approximate the objective function in the interval $(a, c)$ with the parabola fitting the triplet. Then we find the minimum of this parabola with the formula (since we want the abscissa, the method is in fact an inverse parabolic interpolation):
$$x = b - \frac{1}{2}\, \frac{(b - a)^2 [f(b) - f(c)] - (b - c)^2 [f(b) - f(a)]}{(b - a) [f(b) - f(c)] - (b - c) [f(b) - f(a)]}$$
This method is useful only when the function is quite smooth in the interval, but it has the advantage that the convergence is almost quadratic, and it is perfectly quadratic when the function to be optimized is a quadratic form.
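The interpolation formula translates directly into code. The sketch below performs a single parabolic step; the function name and the quadratic test function are assumptions for illustration.

```python
def parabolic_step(a, b, c, fa, fb, fc):
    """Abscissa of the minimum of the parabola through (a, fa), (b, fb), (c, fc)."""
    num = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    if den == 0.0:
        raise ZeroDivisionError("the three points are collinear")
    return b - 0.5 * num / den


# Hypothetical quadratic with its minimum at x = 1.5.
f = lambda x: 3.0 * (x - 1.5) ** 2 + 2.0
a, b, c = 0.0, 1.0, 4.0
x_new = parabolic_step(a, b, c, f(a), f(b), f(c))
print(x_new)
```

For a quadratic function the fitted parabola coincides with the function itself, so one step lands on the minimum; for a general function the step is repeated with an updated triplet.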
Brent's rule. Brent's rule is a mix of the last two techniques: it uses the golden section when the function is not regular and switches to parabolic interpolation when the function is sufficiently regular. In particular, it always tries a parabolic step first; when the parabolic step is useless, the method uses the golden section search.
4.2.2 Multi-dimensional search
These algorithms search for the solution of the optimization problem in a multi-dimensional space. Again, first an algorithm with a convergence order of 1 is presented, then an algorithm with a quadratic order of convergence is shown.
All the algorithms presented here contain a sub-algorithm that is a one-dimensional search.
4.2.2.1 The gradient direction: steepest (maximum) descent

The method of the steepest descent chooses at each iteration a new point in the decision space $x + dx$ from the old point $x$, obviously such that:
$$f(x + dx) < f(x)$$
This new point must also be chosen such that the variation of the function $f$ is as large as possible. In other words, if $dl$ is the length of the step:
$$dl = \sqrt{\sum_{i=1}^{n} (dx_i)^2},$$
the steepest descent maximizes the rate of change $df/dl$.

The problem of minimizing $f$ thus becomes the problem:
Problem 4.29 (Steepest descent).
$$\max_{dx_i/dl} \frac{df}{dl} = \max_{dx_i/dl} \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \frac{dx_i}{dl}, \quad \text{such that} \quad dl = \sqrt{\sum_{i=1}^{n} (dx_i)^2}.$$
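This problem can be solved with the Lagrangian multipliers; from equations (4.3) and (4.4) (page 52) we can write:
$$\frac{dx_i}{dl} = \frac{1}{2\lambda} \frac{\partial f}{\partial x_i}, \quad \text{with} \quad \frac{1}{2\lambda} = \pm \left[ \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 \right]^{-\frac{1}{2}}$$
Taking the negative sign (the descent direction), this means:
$$\frac{dx_i}{dl}(x) = - \frac{\dfrac{\partial f}{\partial x_i}(x)}{\left[ \sum_{i=1}^{n} \left( \dfrac{\partial f}{\partial x_i}(x) \right)^2 \right]^{\frac{1}{2}}} \qquad (4.9)$$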
The steepest descent algorithm chooses at each iteration a new point $x^{k+1}$ from the old point $x^k$ according to equation (4.9) (page 64):
$$x^{k+1} = x^k - dl \, \nabla f(x^k), \quad dl > 0$$
with $dl$ chosen according to the desired convergence rate: if $dl$ is small the algorithm will closely approximate the minimum, but with slow convergence, while if $dl$ is large the convergence is fast but the algorithm can oscillate near the minimum. Thus some method is necessary to reduce (or enlarge) the step $dl$ at each iteration: large steps if we are far away from the minimum, small steps if we are close to it. The scheme used to choose the step can greatly affect the convergence of the algorithm. The best choice is the method of the optimal gradient.
4.2.2.2 The optimal gradient

This algorithm simply calculates the step $dl$ according to:
$$\min_{dl \in \mathbb{R}^+} f(x^k - dl \, \nabla f(x^k))$$
This is a one-dimensional optimization and it is usually performed with one of the methods shown previously. Strictly speaking, the optimization of $f$ is always a multi-dimensional one, since we descend along the gradient path, but inside this process there are many sub-optimization steps that find the optimal length of each descent.
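If $f \in C^2$, that is $f$ is twice differentiable and its derivatives are continuous, then a closed form for the optimum step $dl$ can be determined; we expand $f$ in Taylor series:
$$f(x^k + \Delta x) = f(x^k) + \nabla f(x^k)^T \Delta x + \frac{1}{2} \Delta x^T H(x^k) \Delta x,$$
where $H(x)$ is the Hessian matrix (see footnote 4) of $f$.

Along the gradient direction:
$$\Delta x = -dl^k \, \nabla f(x^k).$$
Thus:
$$f(x^k - dl^k \nabla f(x^k)) = f(x^k) - dl^k \, \nabla f(x^k)^T \nabla f(x^k) + \frac{1}{2} (dl^k)^2 \, \nabla f(x^k)^T H(x^k) \nabla f(x^k)$$
$$\frac{df}{d\,dl^k} = -\nabla f(x^k)^T \nabla f(x^k) + dl^k \, \nabla f(x^k)^T H(x^k) \nabla f(x^k) = 0$$
and
$$dl^k = \frac{\nabla f(x^k)^T \nabla f(x^k)}{\nabla f(x^k)^T H(x^k) \nabla f(x^k)} \qquad (4.11)$$
From $\frac{df}{d\,dl^k}(x^{k+1}) = 0$, we can see that:
$$\langle \nabla f(x^k - dl^k \nabla f(x^k)), \nabla f(x^k) \rangle = 0,$$
that is $\nabla f(x^k)$ and $\nabla f(x^{k+1})$ are orthogonal, or, equivalently, $\Delta x^k$ and $\Delta x^{k+1}$ are orthogonal. This means that successive steps of the optimal gradient algorithm are orthogonal.

4 The Hessian matrix of a function $f(x_1, x_2, \dots, x_n)$ is defined as:
$$H(f) = \begin{bmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{bmatrix} \qquad (4.10)$$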
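Convergence considerations. A general descent algorithm converges if:
$$\lim_{k \to \infty} \nabla f(x^k) = 0.$$
Property 4.30. The function $f$ monotonically decreases along the (negative) gradient path.
Proof. From equation (4.9)
$$\frac{df}{dl} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \frac{dx_i}{dl} = - \frac{\sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_i}}{\left[ \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 \right]^{\frac{1}{2}}} = - \left[ \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 \right]^{\frac{1}{2}} \qquad (4.12)$$
Thus $\frac{df}{dl} \le 0$, or the function $f$ decreases along the path $dl$.
Lemma 4.31. The convergence of a descent method along the gradient path cannot be obtained in a finite number of steps.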
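Proof. From equation (4.12) (page 66)
$$\frac{df}{dl} = - \left[ \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 \right]^{\frac{1}{2}}$$
but when $x$ approaches the optimum $x^*$, then
$$\lim_{x \to x^*} \frac{\partial f}{\partial x_i}(x) = 0$$
so that
$$\lim_{x \to x^*} \frac{df}{dl}(x) = 0$$
meaning that the optimum is approached at a rate that keeps decreasing.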
For the optimal gradient method the convergence is only linear (see footnote 5) in $f(x^k)$ and a halting criterion for the algorithm could be:
$$f(x^k) - f(x^{k+1}) \le \varepsilon;$$
alternatively, from the necessary condition $\nabla f(x) = 0$:
$$\max_i \left| \frac{\partial f}{\partial x_i}(x^k) \right| \le \varepsilon \quad \text{or} \quad \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i}(x^k) \right)^2 \le \varepsilon$$
Finally note that these methods, since they use local gradient information, find only a local minimum, and that the gradient algorithms are rather inefficient in the proximity of the optimum, due to the small step size.
4.2.3 The conjugate direction method
Let $u, v \in X \subseteq \mathbb{R}^n$. They are said to be mutually orthogonal if $u^T v = 0$. Similarly they are said to be mutually conjugate with respect to a matrix $A$ if $u^T A v = 0$.

5 This means that $\lim_{k \to \infty} \frac{f(x^{k+1})}{f(x^k)} = a$, with $0 \le a \le 1$

Property 4.32. A set of $n$ mutually conjugate vectors in $X \subseteq \mathbb{R}^n$ constitutes a basis for $X$.
The importance of a set of mutually conjugate vectors is stated by the following theorem:
Theorem 4.33. Every descent method of optimization using mutually conjugate directions is quadratically convergent.
The concept of conjugate directions is important since, intuitively, a minimization attained along one of these directions does not perturb the minimization along the other directions.
4.2.3.1 The Fletcher-Reeves conjugate gradient algorithm
This algorithm calculates the mutually conjugate search directions with respect to the Hessian matrix of $f$ directly from the function evaluations and the gradient evaluations, without the direct evaluation of the Hessian of the function $f$.
Algorithm 4.34. Fletcher-Reeves conjugate gradient algorithm
Require: $x^0$ = starting point
1: repeat
2:   Compute $\nabla f(x^0)$ and $h^0 = -\nabla f(x^0)$
3:   for $i = 1, \dots, n$ do
4:     Replace $x^i = x^{i-1} + \lambda^{i-1} h^{i-1}$, where $\lambda^{i-1}$ minimizes $f(x^{i-1} + \lambda^{i-1} h^{i-1})$
5:     Compute $\nabla f(x^i)$
6:     if $i < n$ then
7:       $h^i = -\nabla f(x^i) + \frac{\| \nabla f(x^i) \|^2}{\| \nabla f(x^{i-1}) \|^2} h^{i-1}$
8:     end if
9:   end for
10:  $x^0 = x^n$
11: until halting criterion
The quantity $\frac{\| \nabla f(x^i) \|^2}{\| \nabla f(x^{i-1}) \|^2} h^{i-1}$ is added to the (negative) gradient at each iteration, and when $f$ is a quadratic form (positive definite), this results in a set of mutually conjugate vectors.
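One sweep of the Fletcher-Reeves scheme can be sketched for a quadratic function, where the line minimization has a closed form and the method should reach the minimum in $n$ steps. The quadratic $f(x) = \frac{1}{2} x^T A x - b^T x$, the matrix $A$ and the helper names are assumptions for illustration; for a general $f$ the line minimization would be a one-dimensional search.

```python
def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fletcher_reeves_quadratic(A, b, x):
    """One sweep of n conjugate directions for f(x) = x^T A x / 2 - b^T x."""
    n = len(x)
    g = [ax - bi for ax, bi in zip(matvec(A, x), b)]    # gradient at x^0
    h = [-gi for gi in g]                                # h^0 = -grad f(x^0)
    directions = []
    for i in range(1, n + 1):
        if dot(g, g) == 0.0:
            break
        # Exact line minimization along h (closed form, valid for a quadratic only).
        lam = -dot(g, h) / dot(h, matvec(A, h))
        x = [xi + lam * hi for xi, hi in zip(x, h)]
        directions.append(h)
        g_new = [ax - bi for ax, bi in zip(matvec(A, x), b)]
        if i < n:
            beta = dot(g_new, g_new) / dot(g, g)         # ratio of squared gradient norms
            h = [-gn + beta * hi for gn, hi in zip(g_new, h)]
        g = g_new
    return x, directions


A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]  # hypothetical positive definite matrix
b = [1.0, 2.0, 3.0]
x_min, dirs = fletcher_reeves_quadratic(A, b, [0.0, 0.0, 0.0])
print(x_min)                               # close to (2/9, 1/9, 13/9)
print(dot(dirs[0], matvec(A, dirs[1])))    # conjugacy: numerically zero
```

Note that the Hessian $A$ is used here only to evaluate the gradient and the exact step of the toy quadratic; the search directions themselves are built from gradients alone.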
4.2.3.2 The Powell conjugate gradient algorithm

Since the generation of the conjugate directions in the Fletcher-Reeves algorithm requires the computation of $\nabla f(x)$ at each iteration, and this computation is not always feasible, Powell ([15]) developed a method to generate the conjugate directions using only a one-dimensional search at each iteration: if $x^1, x^2$ are two points reached by one-dimensional searches along the same direction $v$, but starting from different points, then $x^1 - x^2$ is mutually conjugate to $v$.
Algorithm 4.35. Powell conjugate gradient algorithm

Require: $\{h^i, i = 1, \dots, n\} \subset X \subseteq \mathbb{R}^n$ = a set of linearly independent vectors in $X$, and $x^0$ = starting point
1: repeat
2:   for $i = 1, \dots, n$ do
3:     Replace $x^i = x^{i-1} + \lambda^i h^i$, where $\lambda^i$ minimizes $f(x^{i-1} + \lambda^i h^i)$
4:   end for
5:   for $i = 1, \dots, n - 1$ do
6:     $h^i = h^{i+1}$
7:   end for
8:   $h^n = x^n - x^0$
9:   Find $\lambda^n$ that minimizes $f(x^n + \lambda^n (x^n - x^0))$
10:  $x^0 = x^n + \lambda^n (x^n - x^0)$
11: until halting criterion

The Powell algorithm is equivalent to a one-dimensional search made sequentially along mutually conjugate directions. The only critical point of the Powell algorithm is line 8 of algorithm 4.35: replacing the $n$th direction $h^n$ with the vector $x^n - x^0$ tends to produce, iteration after iteration, a set of directions that are more and more linearly dependent. The solution is to reinitialize the set of directions $h$ every $n$ iterations; these directions can be the columns of any orthogonal matrix, and there is also a heuristic scheme due to Powell.
Figure 4.2 shows 20 iterations of the Powell algorithm to find the minimum (located at $x = 30$) of a mono-dimensional fourth-degree function ($\sim x^4$). As can be seen, the algorithm finds the minimum at $x = 31.7$ and it is not fooled by the presence of a local minimum at $x = 10$. Figure 4.3 shows 24 iterations to find the minimum (located at $x = 15$) of a more complicated sixth-degree function ($\sim x^6$): again the algorithm finds the global minimum at $x = 13.7$ in the presence of local minima.
In both cases a better precision on the location of the minimum could be obtained by increasing the number of iterations.
Fig. 4.2: Minimization by Powell algorithm of a function $\sim x^4$: 20 steps.
4.2.4 The SLOP algorithm
The SLOP algorithm ([16]) is a simple algorithm, suitable for the minimization of a particular function of a digital circuit, the delay. It is feasible for small circuits, since it uses no heuristics in reaching the minimum, and it stops at the first minimum it finds.
The idea behind the algorithm is simple: start from a given point $x^0 < x^*$, then at each iteration increment a single component of $x^0$ by a defined step. For each trial increment track the decrease of the objective function, and keep only the increment that gives the largest decrease. Finally, use the incremented point as a new starting point.
Clearly, this algorithm works only if the starting point is $x^0 < x^*$ (see notation 4.18), so that an increment in one component moves the function $f$ nearer to the minimum. Also, at the first minimum encountered the algorithm stops.
Fig. 4.3: Minimization by Powell algorithm of a function $\sim x^6$: 24 steps.
The same function of figure 4.2 is shown in figure 4.4, minimized by the SLOP algorithm. It is possible to see that the SLOP algorithm stops as soon as it encounters the first minimum (a local minimum) at $x = 10$.
Algorithm 4.36. SLOP algorithm
Require: $x^0 < x^*$ starting point
1: repeat
2:   for $i = 1, \dots, n$ do
3:     Compute $f_{old} = f(x^0)$
4:     Replace $x_i = x_i + \Delta x$
5:     Compute $f(x^0)$ and $h_i = f_{old} - f$.
6:     Replace $x_i = x_i - \Delta x$
7:   end for
8:   Search the index $i_{max}$ that corresponds to the maximum of $h_i$: $i_{max} = \{ i \in \{1, \dots, n\} \mid \max_i h_i > 0 \}$
9:   Replace $x_{i_{max}} = x_{i_{max}} + \Delta x$
10: until halting criterion
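The SLOP steps can be sketched in a few lines. In this sketch the halting criterion is assumed to be "no trial increment decreases $f$", which matches the description that the algorithm stops at the first minimum it finds; the function name, the step and the test objective are hypothetical.

```python
def slop(f, x, dx, max_iter=10000):
    """Raise one component of x by dx per iteration: the one giving the largest decrease of f."""
    x = list(x)
    for _ in range(max_iter):
        f_old = f(x)
        gains = []
        for i in range(len(x)):
            x[i] += dx                 # trial increment of component i
            gains.append(f_old - f(x))
            x[i] -= dx                 # undo the trial
        i_max = max(range(len(x)), key=lambda i: gains[i])
        if gains[i_max] <= 0.0:        # no increment improves f: first minimum reached
            break
        x[i_max] += dx
    return x


# Hypothetical objective with its minimum at (3, 2), started from below (x0 < x*).
f = lambda x: (x[0] - 3.0) ** 2 + (x[1] - 2.0) ** 2
result = slop(f, [0.0, 0.0], 0.25)
print(result)
```

Each iteration costs $n + 1$ function evaluations and moves a single component by one step, which explains both the simplicity of the method and its slowness.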

Fig. 4.4: Minimization by SLOP algorithm of a function $\sim x^4$: 180 steps.
4.2.5 The simulated-annealing algorithm

The name of this algorithm comes from an analogy with thermodynamics: it is known that if a slow cooling is applied to a liquid, this liquid freezes naturally to a state of minimum energy. This process is called annealing.
The numerical algorithm applies this analogy to the minimization of a function: first go downhill to a minimum as far as possible, then go slightly uphill, since the minimum just found could be a local minimum, then go downhill again, and so on. In thermodynamics, the probability to go from a state with energy $E_1$ to a state of energy $E_2$ is given by:
$$p = e^{\frac{(E_1 - E_2)}{kT}}$$
where $k$ is the Boltzmann constant and $T$ is the temperature of the system.

In order to apply this scheme to a function minimization, it is necessary to define the energy of the system (i.e. the objective function), the temperature of the system, and an annealing schedule (i.e. the scheduled number of annealing iterations): at each iteration the temperature defines a random fluctuation in the minimum found, to simulate the thermal fluctuations of the atoms. Also, at each iteration the temperature is decreased, to reduce the thermal fluctuations and thus converge to the global minimum.
The rate at which the temperature decreases influences the rate of convergence (the faster the temperature decreases, the faster the convergence), but also the quality of the minimum (the slower the temperature decreases, the higher the probability to converge to a global minimum).
Fig. 4.5: Minimization by Simulated-annealing algorithm of a function $\sim x^4$: 130 steps.
As an example, a possible annealing schedule (probably the simplest) would be: after $k$ steps, reduce the temperature $T$ by $T = (1 - \varepsilon) T$, where $\varepsilon$ is determined by experiment.
The same functions of figures 4.2 and 4.3 are shown in figures 4.5 and 4.6, minimized by the simulated annealing algorithm. As in the case of the Powell algorithm, the simulated annealing is not fooled by the presence of local minima, but the number of iterations is greater for both functions: 130 in the first case, 200 in the second one.
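A minimal one-dimensional sketch of simulated annealing with the simple schedule $T \leftarrow (1 - \varepsilon) T$ is shown below. The Boltzmann constant is absorbed into the temperature, and the move generator, the parameter values and the test function are all assumptions chosen for illustration; they are not the settings used for figures 4.5 and 4.6.

```python
import math
import random


def accept_probability(e1, e2, temperature):
    """Probability of moving from energy e1 to e2: 1 if downhill, exp((e1 - e2) / T) if uphill."""
    if e2 <= e1:
        return 1.0
    return math.exp((e1 - e2) / temperature)


def simulated_annealing(f, x, step, t0=10.0, eps=0.05, sweeps=200, moves=20, seed=0):
    """Return the best point visited and its function value."""
    rng = random.Random(seed)          # fixed seed: repeatable runs
    fx = f(x)
    best_x, best_f = x, fx
    temperature = t0
    for _ in range(sweeps):
        for _ in range(moves):
            cand = x + rng.uniform(-step, step)   # random fluctuation around x
            fc = f(cand)
            if rng.random() < accept_probability(fx, fc, temperature):
                x, fx = cand, fc
                if fx < best_f:
                    best_x, best_f = x, fx
        temperature *= (1.0 - eps)     # annealing schedule T <- (1 - eps) T
    return best_x, best_f


# Hypothetical test function: local minimum near x = 1.35, global minimum near x = -1.47.
f = lambda x: x ** 4 - 4.0 * x ** 2 + x
print(simulated_annealing(f, 1.35, step=0.5))
```

Uphill moves are accepted often while the temperature is high, which lets the search leave the basin of the starting local minimum; as the temperature drops the search behaves more and more like a pure descent.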
Fig. 4.6: Minimization by Simulated-annealing algorithm of a function $\sim x^6$: 200 steps.
4.3 Conclusions
After all this mathematical theory, some words must be spent about the choice of which algorithm is feasible to use.
The characteristics of each algorithm are summarized in table 4.1: this table indicates several characteristics that can be useful for the real implementation of a circuit optimizer.
In the same manner, the previous sections illustrate all the basic theory, useful to justify some choices made in the implementation of the optimizer.
Tab. 4.1: Optimization algorithms

Algorithm | Pro | Con
----------|-----|----
Mono-dimensional | |
Section search | Simple implementation. | Converges to local minima.
Parabolic interpolation | Fast convergence. | Has some pre-requirements.
SLOP | The simplest implementation. | Converges to local minima. Very slow.
Multi-dimensional | |
Conjugate directions | Good convergence. | Requires gradient knowledge.
Powell scheme | Fast convergence. Does not require gradient knowledge. Is not trapped by local minima. | Difficult implementation.
Simulated annealing | Simple implementation. Does not require gradient knowledge. Is not trapped by local minima. | Very slow. Fragile with respect to some critical parameters.
Chapter 5
CIRCUIT OPTIMIZATION
The goal of the optimization step during a design flow is to obtain an optimized design from a given design. Figure 5.1 shows the various levels of possible optimization.
Fig. 5.1: Design flow
The optimization level we are concerned with is the innermost level, indicated here as dimension optimization. The optimization levels, that is the levels at which the designer can apply suitable techniques, are, briefly:
System optimization. This is the highest level of optimization: it concerns the optimization made on the user space or kernel space of the applications running in the system subject to the optimization process.
Behavioural optimization. At this level the optimization is made by choosing the best algorithm to implement the functions.
Logic optimization. This is the optimization made by mapping the given functions or algorithms (from a behavioural optimization) into boolean functions. It is equivalent to choosing the logic gates that implement these functions.
Dimension optimization. This is the lowest level of optimization: it is made by choosing the proper transistor dimensions in each gate that implements a logic function. This is the optimization on which the efforts of this thesis focus.
Section 5.1 presents the three kinds of target to be optimized in a real circuit: delay (5.1.1), power consumption (5.1.2) and area occupancy (5.1.3). In particular 5.1.1.1 shows the delay obtained from Elmore's formula (chapter 2, page 15), while 5.1.1.2 shows the delay as obtained by HSPICE and FAST (chapter 3, page 21).
Section 5.2 contains some applications of the mathematical results of chapter 4 (page 47): in particular 5.2.2 shows the results of a mono-objective optimization, while 5.2.3 shows the results of a multi-objective optimization. Some conclusions are drawn in section 5.3.
5.1 Optimization targets
There are, mainly, three target policies in optimizing real circuits: minimize the delay, minimize the power consumption and minimize the area occupancy. In some cases these policies conflict with each other (for example, minimizing the delay surely increases the circuit area), while in other cases they go together (for example, minimizing the power consumption may lead to a reduction of the area occupancy).
There is another policy that can be considered, especially in the field of sub-micron digital circuit design: noise reduction; however this requires a good noise model of the circuit, and at present only a few good ones exist.
Now we are going to analyze the three principal optimization policies, with particular regard to their compatibility with the optimization algorithms of chapter 4.
5.1.1 Circuit delay

Till now the generic word delay has been used, but now it is necessary to define better the meaning of delay in a real circuit.
Generally the delay of a CMOS gate, or a CMOS circuit, is defined as the delay between the time when the output is at 50% of its peak value (indicated with $t_o$ in figure 5.2) and the time when the input is at 50% of its peak value (indicated with $t_i$ in the same figure).
Fig. 5.2: Delay definition (input and output waveforms versus time; Delay = $t_o - t_i$, measured at 50% of $V_{IN}$ and at 50% of $V_{OUT}$)
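The 50% delay definition can be applied to sampled waveforms, for instance those produced by a circuit simulator, by interpolating the crossing times. The sketch below is only illustrative: the function names, the linear interpolation between samples and the waveform values are assumptions, not the measurement procedure of HSPICE or FAST.

```python
def crossing_time(times, values, level):
    """First time at which a sampled waveform crosses `level` (linear interpolation)."""
    samples = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if (v0 - level) * (v1 - level) <= 0.0 and v0 != v1:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def delay_50(times, v_in, v_out, v_peak):
    """Delay = t_o - t_i, both measured at 50% of the peak value."""
    half = 0.5 * v_peak
    return crossing_time(times, v_out, half) - crossing_time(times, v_in, half)


# Hypothetical inverter-like waveforms (time in ns, voltage in V).
t = [0.0, 1.0, 2.0, 3.0, 4.0]
v_in = [0.0, 2.0, 2.0, 2.0, 2.0]     # rising input
v_out = [2.0, 2.0, 2.0, 0.0, 0.0]    # falling output
print(delay_50(t, v_in, v_out, 2.0))
```

With these samples the input crosses 1 V at 0.5 ns and the output at 2.5 ns, so the measured delay is 2 ns.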

This definition is good only for theoretical discussion since:
generally a circuit has more than one input and more than one output;
there is not always a direct path from the input to the output (think of dynamic logic), i.e. a change in an input does not always directly cause a change in the output.
So the definition of delay of a CMOS circuit must be investigated further, to produce real numbers useful for optimization. In order to define it, the concept of critical paths has been introduced in [16].

In the following I introduce a new mathematical formulation of the definition of critical path; this formulation will be useful for automatically finding all the critical paths in a circuit in section 6.2.4 (page 115).
Critical paths. The idea of critical paths in a CMOS circuit can be derived, intuitively, from the idea of path between the output and the input: a critical path is a conducting path between a node (the output node, i.e. the final node of the path) and the ground, or between this node and the power supply, such that a change in the state of the gate of a MOSFET included in the path directly causes a change in that node. Naturally each MOSFET included in the path must be in, or switch to, conduction, in order to create a conducting path.

This concept must be extended, however, since a change of the so-called output node can itself cause a change in another critical path (i.e. the output node is itself connected to a gate of another critical path), so that a change in a gate node at the very beginning of the circuit may propagate through many conducting paths.
Definition 5.1 (Critical path). A critical path is a set of conducting paths such that:
i) each conducting path is between a generic node and a ground node, or between a generic node and a power supply node, and is composed of MOSFETs; and
ii) each final node of a conducting path is either connected to a gate of a MOSFET belonging to another critical path, or is an output of the circuit; and
iii) a change in the state of any MOSFET gate in the first conducting path propagates up to the last conducting path, causing a change in the critical path output node.
Definition 5.2 (Critical path delay). The delay of a critical path is the delay between the output node of the critical path and the gate node causing the state change of the output node.
From definition 5.1 it is clear that even a simple circuit has more than one critical path in it (see footnote 1).
In order to develop a rigorous definition of critical paths, let us introduce the following sets, characterizing a typical CMOS circuit:
$G$ = {set of all the MOSFET gate nodes in the circuit} = $\{g_1, g_2, \dots, g_j, \dots\}$
$N$ = {set of all the nodes in the circuit} = $\{n_1, n_2, \dots, n_j, \dots\}$
$O$ = {set of all the output nodes of the circuit} = $\{o_1, o_2, \dots, o_j, \dots\}$
$I$ = {set of all the input nodes of the circuit} = $\{i_1, i_2, \dots, i_j, \dots\}$
$M$ = {set of all MOSFETs in the circuit} = $\{m_1, m_2, \dots, m_j, \dots\}$
$V$ = {gnd = ground node, vdd = power supply node};
let us also define the set $N_{m_j}$ as the set of all the nodes pertaining to the MOSFET $m_j$, and denote the gate of the $j$th MOSFET with $g_{m_j}$.
These sets are related as follows: $I \subseteq G \subseteq N$, $V \subseteq N$, $O \subseteq N \setminus G$.
The generic $n$th critical path of a circuit, denoted by $C_n$ in equation (5.1a) (page 82), is the collection of conducting paths $\pi_{ni}$ such that each $\pi_{ni}$, equation (5.1b), is the union of two ordered node sets: the set $G_{ni}$, equation (5.1c), of the gates of all the $k$ MOSFETs pertaining to the conducting path, and the set $D_{ni}$, equation (5.1d), of all the drain and source nodes ($k + 1$ in number²) of the same $k$ MOSFETs, subject to the condition (5.1e).
The nodes in the set $D_{ni}$ have a peculiar property: the first and the last one may or may not³ be shared by two or more MOSFETs, while all the other ones must be shared by two or more MOSFETs.
In other words, the set $D_{ni}$ is an ordered collection of nodes such that among these nodes there are $k$ MOSFETs, constituting a continuous (and conducting) path from the output node to a power supply (or ground) node.
¹ The simplest circuit, the inverter, has two critical paths, since a change in the input from low to high involves the path comprising only the n-MOSFET, while a change from high to low involves the path comprising only the p-MOSFET.
² Note that the MOSFETs in a conducting path share a common drain or source node two by two, so a conducting path made of one MOSFET has two nodes (one drain and one source), a path made of two MOSFETs has three nodes (the MOSFETs share one drain node), and so on.
³ This is the reason why in equation (5.1d) the index j ends at k and not at k + 1.
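Finally, collecting the definitions of, respectively, critical path, conducting path, conducting-path gate node set and conducting-path drain node set:
\[ C_n = \bigcup_i \pi_{ni} \tag{5.1a} \]
\[ \pi_{ni} = G_{ni} \cup D_{ni} \tag{5.1b} \]
\[ G_{ni} = \{\, g_j \mid g_j \in \mathcal{G},\ j = 1, \dots, k \,\} \tag{5.1c} \]
\[ D_{ni} = \{\, n_j \mid n_1 \in \mathcal{V} \wedge n_j \in \mathcal{N} \setminus \mathcal{G} \wedge (n_j, n_{j+1}) \in \mathcal{N}_{m_j} \setminus g_{m_j} \wedge [(n_{k+1} \in G_{n,i+1}) \vee (n_{k+1} \in \mathcal{O})],\ j = 1, \dots, k \,\} \tag{5.1d} \]
\[ G_{ni}, D_{ni} \text{ such that, given } M_G = \{\, m_j \mid m_j \in \mathcal{M} \wedge g_{m_j} \in G_{ni} \,\},\quad M_D = \{\, m_j \mid m_j \in \mathcal{M} \wedge \mathcal{N}_{m_j} \setminus g_{m_j} \subset D_{ni} \,\}, \text{ then } M_G = M_D. \tag{5.1e} \]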
[Figure: TSPC full adder (carry part), with inputs A, C and clock CLK; the critical paths are labelled with their MOSFET numbers, among them 1-2-4-5-11, 1-3-4-5-11, 1-7-8-5-11 and 9-10]
Fig. 5.3: Example of critical paths

Figure 5.3 shows an example of critical paths in a dynamic circuit (the carry part of a full-adder in TSPC logic). The figure represents the six critical paths, each one with the list of its MOSFET numbers. For example, the first critical path ($C_1$) is composed of the conducting path $\pi_{11}$, made up of n-MOSFETs 1, 2, 4, 5 and p-MOSFET 11: that is, the set $G_{11}$ is composed of the gate nodes of transistors 1, 2, 4, 5 and 11, while $D_{11}$ is made up of the drain and source nodes of the same transistors. If the gate of one of the n-MOSFETs 1, 2, 4, 5 switches from the low state to the high state (while the others are all in the high state), then the gate of p-MOSFET 11 is discharged and this p-MOSFET conducts, charging the output node. Another critical path is, for example, the one composed only of p-MOSFET 6: if its gate switches from high to low, then the gate of n-MOSFET 9 switches from low to high, but this cannot discharge the output node, since the gate of n-MOSFET 6 is driven by the same signal as the original p-MOSFET.
Note. The definition of critical path can be viewed as a tree rooted at the transistor that drives the change in the critical path. One leaf of the tree is the transistor whose drain (or source) is the critical path output node. It is therefore possible to traverse the tree from the root (the input) to a leaf (the output): if all the lateral subtrees encountered during the traversal can be modeled as static loads, then the tree becomes a transistor chain (figure 5.4). This is the basis for the use of the several delay models that are able to evaluate the delay of a chain.
[Figure: a tree and the equivalent chain, each drawn from its INPUT to its OUTPUT]
Fig. 5.4: Critical path tree that becomes a chain.

After the definition of critical paths, the problem of associating one (and only one) delay to a circuit is still unresolved, since there is surely more than one critical path in a circuit. The solution is to take the maximum of all the critical path delays and to regard it as the delay of the whole circuit. In this manner we are sure that a change in the state of a node caused directly by an input can never occur after this maximum delay. This definition is also consistent with the purposes of optimization, since the optimization objective is always (or at least usually) the minimization of the delay. The strategy to be applied is therefore a min-max optimization scheme (minimization of the maximum).
Definition 5.3 (Circuit delay⁴). The delay of a circuit $t_d$ is:
\[ t_d = \max_n \{ d(C_n) \} \]
where $d(C_n)$ is the delay of the $n$th critical path contained in the circuit.
So, finally, in order to know the delay of a circuit, one must search for all the critical paths in the circuit, calculate (or measure) the delay of each critical path, and take the maximum of these delays.
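As a minimal sketch of Definition 5.3 (an illustration, not the code of the tool described in chapter 6), the circuit delay can be computed as the maximum of the critical path delays once these are available, whether from a model or from simulation. The function name `circuit_delay`, the path labels and the delay values below are hypothetical.

```python
def circuit_delay(path_delays):
    """Return (path label, delay) of the slowest critical path.

    path_delays maps a critical path label to its delay d(C_n).
    """
    if not path_delays:
        raise ValueError("at least one critical path delay is required")
    worst = max(path_delays, key=path_delays.get)
    return worst, path_delays[worst]


# Hypothetical delays in ps, one entry per critical path.
example_delays = {"C1": 412.0, "C2": 388.5, "C3": 455.2}
worst_path, t_d = circuit_delay(example_delays)
print(worst_path, t_d)
```

Returning the label together with the value is a design choice of this sketch: a min-max optimizer usually needs to know which path currently limits the circuit.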
The delay of each critical path can be calculated by means of some
model (maybe after the transformation of figure 5.4), or measured by means
of simulations.
This delay, obtained in some way, must be analyzed in order to know
its coherence with the mathematical results of chapter 4 (page 47), and the
validity of these results.
5.1.1.1 Delay formula obtained by the Elmore model
The delay function obtainable from Elmore's model (§2.1, page 16) is a continuous function. Referring to figure 2.1 (page 15), the delay of a single MOS is:
\[ t_{d_i} = R_0 C_{S_i} + (R_0 + R_{d_i}) C_{D_i} + (R_0 + R_{d_i} + R_L) C_L \]
The drain and source capacitances and the dynamic resistance of a MOS are functions of the MOS width $W$:
⁴ The reason why we want to define a single value for the optimization of the delay and do not apply, for example, the multi-objective methods of the following sections, is that all the critical path delays are commensurable and have the same global behaviour (cf. §5.2.3, page 102).
\[ C_{D_i} = C_j W_i, \qquad C_{S_i} = C_j W_i, \qquad R_{d_i} = \frac{R_j}{W_i} \]
where $C_j$ and $R_j$ are, respectively, the capacitance per unit width and the resistance for unit width. As a function of the MOS width, the delay becomes:
\[ t_{d_i} = R_0 C_j W_i + \left(R_0 + \frac{R_j}{W_i}\right) C_j W_i + \left(R_0 + \frac{R_j}{W_i} + R_L\right) C_L . \]
Separating the terms containing the width $W_i$ from the terms that are independent of $W_i$ we obtain:
\[ t_{d_i} = 2 R_0 C_j W_i + \frac{R_j}{W_i} C_L + R_j C_j + (R_0 + R_L) C_L . \]
Summing the delays of all the MOS in a conducting path we obtain the total delay of this path:
\[ t_d = \sum_i t_{d_i} = \sum_i \left( A W_i + \frac{B}{W_i} + C \right) \]
where $A$, $B$, $C$ are all independent of $W_i$.
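The derivation above can be checked numerically. The sketch below evaluates the single-MOS Elmore delay term by term, compares it with the collected form $A W_i + B/W_i + C$ (with $A = 2R_0C_j$, $B = R_jC_L$, $C = R_jC_j + (R_0 + R_L)C_L$ read off the last equation), and computes the width at which the derivative $A - B/W_i^2$ vanishes. The function names and all numerical values are hypothetical and only serve as an illustration.

```python
import math


def stage_delay(w, r0, rl, cl, rj, cj):
    """Elmore delay of one MOS of width w, written term by term."""
    c_s = cj * w          # source capacitance C_Si
    c_d = cj * w          # drain capacitance C_Di
    r_d = rj / w          # dynamic resistance R_di
    return r0 * c_s + (r0 + r_d) * c_d + (r0 + r_d + rl) * cl


def abc_coefficients(r0, rl, cl, rj, cj):
    """Coefficients of the collected form A*w + B/w + C."""
    return 2.0 * r0 * cj, rj * cl, rj * cj + (r0 + rl) * cl


def stationary_width(a, b):
    """Width where d/dw (A*w + B/w + C) = A - B/w**2 is zero (needs A, B > 0)."""
    return math.sqrt(b / a)


# Hypothetical values: resistances in ohm, capacitances in farad,
# widths in micrometres (so rj is in ohm*um and cj in F/um).
params = dict(r0=1.0e3, rl=0.0, cl=50.0e-15, rj=10.0e3, cj=2.0e-15)
a, b, c = abc_coefficients(**params)
w_star = stationary_width(a, b)
print(w_star, stage_delay(w_star, **params))
```

With $R_0 = 0$ the coefficient $A$ vanishes and `stationary_width` is undefined, which corresponds to the monotonically decreasing case discussed next.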

The delay of a critical path is the sum⁵ of the delays of all its conducting paths.
As long as $A$ and $B$ are not zero, the delay $t_d$ is a convex function (definition 4.12, page 50), as in figure 5.5. If the term $A$ is zero, instead, the delay is a monotonically decreasing function (figure 5.6).
Note that the term $A$ is zero, in practice, only if the resistance $R_0$ is zero, that is, if the MOSFET chain is driven by an ideal voltage source.
⁵ This definition introduces further errors in the delay model, since the conduction of each conducting path after the first one does not start when the output of the previous one is at its 50%, but long before.
[Figure: delay td versus width Wj, with the minimum t_min reached at W_min]
Fig. 5.5: Elmore delay: convex function
5.1.1.2 Delay measurement obtained by the FAST model and by HSPICE

The delay obtained by the FAST model and by HSPICE simulations is a measurement and not a formula. It is a one-to-one correspondence between the MOSFET widths and the resulting delay, and it is not possible to express this delay by means of a closed-form formula⁶.
Figures 5.7 and 5.8 represent the delay of a CMOS inverter when the dimensions of both the n-MOSFET and the p-MOSFET are increased uniformly. The first figure shows the delay of the inverter driven by another inverter (with fixed dimensions), simulated by HSPICE; the second figure shows the delay of the same inverter driven, instead, by an ideal voltage source and simulated by FAST.
These are an experimental proof of the statement given in the previous section: if the voltage source is not ideal, that is, if it depends on the MOSFET widths of the circuit, the delay curve is strictly convex (figure 5.7), while if the voltage source is ideal, i.e. independent of the MOSFET widths, then the delay curve is monotonically decreasing (but still convex).
Taking into account the interconnection delays, which are no longer negligible in deep sub-micron technologies, does not modify the delay function, since a width-independent⁷ delay is added to the total delay function.
So, in conclusion, the delay curve is a convex function (strictly or not, depending on the operating conditions of the circuit) of all the MOSFET widths⁸.
⁶ It is possible, however, after measuring a set of delays for varying widths, to fit the results with an approximate formula, which is then in closed form.
[Figure: delay td versus width Wj, monotonically decreasing]
Fig. 5.6: Elmore delay: monotonic function
⁷ The interconnection delay can be seen, in second approximation, as proportional to the widths, since greater widths mean greater circuits, and in a layout this means that the average length of the interconnections increases as well. This proportionality (empirically found to be linear to quadratic) does not modify the delay function, since it adds a term that is both an increasing and a convex function.
⁸ The two-dimensional representation of figures 5.5, 5.6, 5.7 and 5.8 is only for the sake of simplicity of the drawing. The convexity is still valid in multi-dimensional representations.
[Figure: delay td [ps], from about 100 to 260, versus width Wj [µm], from 0 to 60]
Fig. 5.7: HSPICE delay: convex function

5.1.2 Power consumption

The calculation of the power consumption of a circuit is quite different from the calculation of the delay: while the delay is a local property of a single critical path (§5.1.1), the power consumption is a global property of the circuit. That is, the power consumption of a circuit is not the sum of the power consumption of each critical path⁹. Even if the definition of power consumption is global, it is not univocal: the power dissipation of a circuit surely depends on the input conditions. Changing the input states changes the overall power dissipation, making some MOSFETs conduct and others not. Again, one must choose a definition of power dissipation that gives a single number, for the purpose of the optimization.
Considering that the objective of the power optimization (hereinafter power consumption optimization is abbreviated to power optimization) is the minimization of the total power dissipation, a min-max strategy is the most appropriate, as in the case of delay minimization. Instead of evaluating the power consumption for all the input combinations, we take advantage of the definition of critical path:
Definition 5.4 (Circuit power consumption¹⁰). The power dissipation of a circuit $P_d$ is:
\[ P_d = \max_n \{ p(C_n) \} \]
where $p(C_n)$ is the power consumption of the entire circuit when the input conditions of the $n$th critical path are applied.
⁹ In first approximation the power consumption could be the sum of the power dissipated by each critical path in a fully static CMOS circuit.
¹⁰ The same reasoning of note 4 (page 84) applies here.
[Figure: delay td [ps], from about 45 to 80, versus width Wj [µm], from 0 to 60]
Fig. 5.8: FAST delay: monotonic function
In this manner it is possible to apply a min-max optimization scheme and, at the same time, to evaluate the power consumption within the same test bench used to evaluate the critical path delay, allowing a substantial reduction of the time necessary for the complete evaluation.
In the following, the terms power consumption and energy dissipated will be used interchangeably, since the simple relation between them is:
\[ E = \int P(t)\,dt; \]
this means that the mean energy dissipated by a circuit is computed as the integral average of the power: it depends on the simulation time (or on the window of time that we are considering), but it does not depend on the frequency of the signals at which the circuit itself operates.
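The relation between energy and power can be illustrated with a short numerical sketch, assuming that the simulator returns the instantaneous power as a list of samples over the simulation window. The function names, the trapezoidal rule and the sample values are assumptions of this example, not a description of how HSPICE or FAST report their results.

```python
def energy_from_power(times, powers):
    """Integrate sampled power over time with the trapezoidal rule (E = integral of P dt)."""
    if len(times) != len(powers) or len(times) < 2:
        raise ValueError("need two or more matching time/power samples")
    energy = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        energy += 0.5 * (powers[k] + powers[k - 1]) * dt
    return energy


def mean_power(times, powers):
    """Integral average of the power over the simulation window."""
    return energy_from_power(times, powers) / (times[-1] - times[0])


# Hypothetical triangular power pulse: 2 mW peak over a 2 ns window.
t = [0.0, 1.0e-9, 2.0e-9]
p = [0.0, 2.0e-3, 0.0]
print(energy_from_power(t, p), mean_power(t, p))
```

The mean power changes if the window is widened while the pulse stays the same, whereas the energy of the pulse does not, which is the dependence on the simulation window noted above.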
The power consumption of a CMOS circuit is the sum of three terms (§3.3, page 36):
\[ P_{TOT} = P_{switch} + P_{short} + P_{sub\text{-}th} \tag{5.2} \]
the switching power $P_{switch}$, due to the charging and discharging of internal parasitic capacitances; the short-circuit power $P_{short}$, due to the simultaneous conduction of n-MOSFET and p-MOSFET, which gives a direct conducting path from the power supply to ground for a short time; and the sub-threshold power $P_{sub\text{-}th}$, due to the sub-threshold conduction of the MOS.
In a first approximation the first term $P_{switch}$ is proportional to the MOSFET widths in the circuit (greater width means greater capacitance), the second term $P_{short}$ is proportional to the switching time and thus it is inversely related to the MOSFET widths (greater capacitance means slower switching time), while the last term $P_{sub\text{-}th}$ is proportional to the MOSFET widths.
[Figure: energy [pJ], from 0 to 70, versus width Wj [µm], from 0 to 70]
Fig. 5.9: HSPICE Energy

As an example, the total power consumption of a single gate is sketched in figure 5.9: as can be expected, the energy increases with the widths, but it is not convex.
The three terms of equation (5.2) do not weigh equally in the sum giving the energy consumption: in order of influence, the first term (§3.3.1, page 36) is the greatest, then comes the second term (§3.3.2, page 39), and finally the third term (§3.3.3, page 39). For a sub-micron technology the second term (the short-circuit dissipation) is about 10% of the first, and the third term (the sub-threshold conduction dissipation) about 1% of the first.
It could be expected that with the scaling of the technology (in the deep sub-micron field) the first and the second term become comparable, with the third term still a fraction of the other two, giving a power figure that does not increase (or that even decreases) with the MOS widths; but it could also be expected that with scaling down the interconnect capacitances become predominant, so that the first term (the power dissipation due to capacitance charging and discharging) remains the greatest.
In summary, the power consumption figure of a CMOS circuit is an increasing function of the MOSFET widths, but no assumptions can be made about the convexity of this function.
5.1.3 Area
The area occupation of a circuit can be expressed in a closed form:
\[ A = a \sum_j W_j + b . \tag{5.3} \]
The area occupation is composed of two terms: a term $a \sum_j W_j$ directly proportional to the MOSFET widths (i.e. to the area occupied by the single MOSFETs) and a term $b$ independent of the MOSFET widths (comprising, for example, the interconnect area). Both terms are, of course, positive, so the curve of the area occupied versus the MOSFET widths is a monotonically increasing curve¹¹ and, being linear, a convex function.
Taking into account the area of the interconnections does not modify the properties of the area function, since the only modification of equation (5.3) (page 91) is in the term independent of the MOS widths¹².
5.2 Optimization examples

In order to show some of the issues introduced in the previous sections, some CMOS gates will be analyzed in the following. These gates are summarized in table 5.1, with the second column showing the total number of critical paths in a gate and the third column showing the total number of MOSFETs in a gate. The last two gates are dynamic full-adders: the former is composed of complex gates in order to perform the computation in one stage, while the latter is composed only of basic gates (and, or and inverter); this explains why the last full-adder has many more transistors than the first one.
¹¹ It is a straight line in two dimensions, a plane in three dimensions and a hyperplane in four or more dimensions, but it is always a convex function.
¹² See note 7, page 87.
Tab. 5.1: Basic gates: complexity
Gate                                                # of critical paths   # of transistors
Inverter (fig. 5.10)                                2                     2
TSPC type n latch (fig. 5.11(a))
TSPC type p latch (fig. 5.11(b))
TSPC type n and (fig. 5.12(a))
TSPC type p and (fig. 5.12(b))
TSPC type n or (fig. 5.13(a))
TSPC type p or (fig. 5.13(b))
Static and-or (fig. 5.14)                           12                    14
Static and¹⁴
Static or¹⁴
Static parity gate (fig. 5.15)                      24                    48
Static full-adder (figs. 5.16(a), 5.16(b))          34                    40
TSPC full-adder (one-stage) (figs. 5.17(a), 5.17(b)) 26                   13
TSPC full-adder (basic cells)                       82                    126
Table 5.2 shows the delays and the energy consumption of the gates of table 5.1: for each gate it shows the maximum delay, the average delay, the maximum energy and the average energy over all the critical paths. All the simulations are made at the minimum width for each technology (viz. 1 µm for the 0.7 µm technology and 0.5 µm for the 0.25 µm technology).

Tab. 5.2: Basic gates: pre-optimization delay, power consumption and area

0.7 µm technology
Gate                            Delay [ps]            Energy [pJ]           Area [µm²]
                                max       avg         max       avg
Inverter                        717.5     572.7       0.6887    0.6864      2.4
TSPC type n latch               921.8     630.8       3.491     1.299       7.2
TSPC type p latch               1413.0    718.9       2.087     0.965       7.2
TSPC type n and                 1028.0    664.1       2.756     1.13        8.4
TSPC type p and                 1413.0    754.9       2.51      1.058       8.4
TSPC type n or                  904.3     689.6       4.47      1.42        8.4
TSPC type p or                  1413.0    787.3       1.654     0.879       8.4
Static and-or                   1180      894.2       3.639     2.816       16.8
Static and                      760.9     727.7       0.75      0.7114      4.8
Static or                       1430      776.3       0.7224    0.7160      4.8
Static parity gate              2650.0    1839.55     0.7442    0.676       57.6
Static full-adder               1781      1080.6      6.475     1.219       48.0
TSPC full-adder (one-stage)     930.6     681.9       2.168     0.9425      15.6
TSPC full-adder (basic cells)   2691.0    556.4       8.893     3.82        151.2

0.25 µm technology
Gate                            Delay [ps]            Energy [pJ]           Area [µm²]
                                max       avg         max       avg
Inverter                        259.6     189.1       0.0858    0.0853      1.2
TSPC type n latch               293.3     200.4       0.6586    0.2161      3.6
TSPC type p latch               482.5     209.1       0.2894    0.1221      3.6
TSPC type n and                 315.8     208.9       0.5126    0.1816      4.2
TSPC type p and                 482.5     212.9       0.3488    0.1314      4.2
TSPC type n or                  299.7     224.8       0.8257    0.2288      4.2
TSPC type p or                  482.5     225.1       0.2243    0.1077      4.2
Static and-or                   334.1     253.3       1.998     1.51        8.4
Static and                      233.7     89.3        0.0713    0.0434      2.4
Static or                       277.2     240.1       0.0907    0.0891      2.4
Static parity gate              922.2     582.5       0.0944    0.0863      28.8
Static full-adder               571.3     311.3       3.155     0.324       24
TSPC full-adder (one-stage)     276.7     204.2       0.641     0.188       7.8
TSPC full-adder (basic cells)   482.3     79.9        5.27      1.999       75.6

¹⁴ For a schematic of the static and and of the static or see figure 5.14: the and is the first gate of the schematic (on the left side of the picture), while the or is the last but one gate, before the final inverter (on the right side); for a static and see also figure 5.12, page 96.

[Figure: schematic of the CMOS inverter, with input A and output OUT]
Fig. 5.10: CMOS Inverter
5.2.1 Algorithm choice
Given the results of §4.2 (page 58), and the results of the sections above regarding the properties of the delay, power and area functions in real circuits, the most suitable algorithm to apply is Powell's scheme.
Briefly, it is fast, it is reliable even in the presence of multiple minima, and (perhaps most importantly) it does not require the knowledge of the first derivative of the function to be minimized.
While some other algorithms could give the same accuracy in finding the minimum (in practice, only the simulated annealing algorithm does), Powell's algorithm outperforms all the others in terms of the number of iterations, and hence of the execution time, needed to reach the best solution.
Powell's algorithm is the first choice in all the optimization examples found in this chapter. As an example, performing the same optimization of table 5.3 with simulated annealing requires an execution time of the optimizer¹⁵ about ten times that required by Powell's algorithm.
¹⁵ For a complete description of the optimizer CAD tool see chapter 6, page 107.
[Figure: (a) type n and (b) type p latches, with input A, clock CLK and output OUT]
Fig. 5.11: TSPC Latches
5.2.2 Mono-objective optimizations

The mono-objective optimization of the circuits of table 5.1 means the optimization of one and only one of the targets of §5.1, namely either the delay, or the power consumption, or the area occupation.
5.2.2.1 Area
This target has a trivial optimization: minimizing the area occupation of a circuit obviously means making all the transistors in the circuit as small as possible, i.e. setting them to the minimum width allowed by the technology.
[Figure: (a) type n and (b) type p gates, with inputs A and B, clock CLK and output OUT]
Fig. 5.12: TSPC And gates

5.2.2.2 Power
The power optimization deserves a few more words: all the attempts to optimize exclusively the power of the gates of table 5.1, regardless of the delay, have led to the same result for both technologies: all the transistors in the circuit had the minimum width after the optimization. This outcome arises whatever the starting point of the optimization session, i.e. whatever the initial transistor widths of the circuit.
This is an experimental proof that, out of the three terms of equation (5.2) (page 89), the switching power $P_{switch}$, due to the charging and discharging of the capacitances in the circuit, is always the dominant one. Although some authors have argued in the past that this term might not be the largest, especially for deep sub-micron circuits, there is no experimental proof of that, at least for small and medium circuits.
[Figure: (a) type n and (b) type p gates, with clock CLK and output OUT = A + B]
Fig. 5.13: TSPC Or gates
5.2.2.3 Delay
Given the results of the power optimization (and the simple results of the area optimization), the only feasible mono-objective optimization is the delay optimization. That is, the maximum delay of the critical paths is minimized, disregarding the power consumption and the area occupation, which both increase as the delay diminishes.
[Figure: schematic of the static and-or gate, with output OUT and numbered transistors]
Fig. 5.14: Static and-or gateᵃ.
ᵃ This gate performs the function A·B + C, but there are two inverters between the and and the or. They leave the logic function intact, but introduce some complexity in the formulation of the critical paths: the inverters have been introduced only for this purpose.
[Figure: schematic of the static parity gate]
Fig. 5.15: Static parity gate
Tab. 5.3: Full-adder: delay optimization
Full-adder          Technology   Delay [ps]             Energy [pJ]            Area [µm²]
                                 Pre-opt.   Post-opt.   Pre-opt.   Post-opt.   Pre-opt.   Post-opt.
Static              0.7 µm       1781       1080        6.475      40.0        34         195.6
Static              0.25 µm      571.3      415.2       3.155      111.2       17         692.6
TSPC (one-stage)    0.7 µm       930.6      400.2       2.168      13.390      26         151.7
TSPC (one-stage)    0.25 µm      276.7      158.3       0.641      3.622       13         80.4
As an example, the delays of the static and dynamic full-adders before the optimization (i.e. with all the transistors at minimum width) and after the optimization are presented in table 5.3; the same table reports the power consumption of the circuit before and after the optimization of the delay: it is possible to see how the power increases after the delay is minimized.
[Figure: (a) sum part, with output SUM; (b) carry part, with output CARRY]
Fig. 5.16: Static full-adder
[Figure: (a) sum part, with clock CLK and output SUM; (b) carry part, with clock CLK and output CARRY]
Fig. 5.17: TSPC full-adder (one-stage)
The criterion that decides when the optimization is over is based on two considerations (see chapter 6, page 107, for more details on the implementation of the algorithms, and chapter 4, page 47, for the mathematical foundations):
i) if there is a minimum (either the delay figure is strictly convex or, more generally, it has an absolute minimum), then the optimization algorithm finds it with an arbitrary accuracy, chosen a priori; or
ii) if the delay figure is not strictly convex (i.e. it is monotonically decreasing), then the optimization algorithm goes on minimizing until the rate of decrease of the delay falls below the accuracy.
The former case is more stable from the point of view of the accuracy: given an accuracy, the same optimum solution is found independently of the starting point (i.e. the initial transistor widths); the starting point influences only the time it takes to reach the solution, which is unique.
The latter case is somewhat more problematic, since the solution depends on the starting point: the rate of decrease of the delay depends on the starting point in the multi-dimensional space of delay versus widths. This means that several optimization sessions can give different results, depending on the initial transistor widths of each session.
In order to eliminate this ambiguity it is safe to choose a common starting point for all the optimization sessions: the natural choice is to start with all the transistors at the minimum width allowed by the technology. This choice guarantees that the solution found is always the same from one optimization run to another, and it is also a convenient way of writing the netlist to be optimized, either by hand or with a schematic editor.
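The stopping rule and the common starting point can be sketched with a deliberately simple derivative-free search. This is a stand-in chosen for brevity, not Powell's algorithm and not the implementation of chapter 6: at each step it tries to change one width by a fixed amount, keeps the best move, and stops when no move helps or when the relative decrease of the delay falls below the accuracy. The function name, the step size and the toy delay function are all hypothetical.

```python
def coordinate_search(delay, n, w_min, w_max, step=1.0, accuracy=1e-3, max_iter=10000):
    """Minimize delay(widths) starting from all n widths at w_min."""
    widths = [w_min] * n
    best = delay(widths)
    for _ in range(max_iter):
        improved, candidate = best, None
        for k in range(n):
            for s in (step, -step):
                trial = list(widths)
                trial[k] = min(w_max, max(w_min, trial[k] + s))
                d = delay(trial)
                if d < improved:
                    improved, candidate = d, trial
        # Stop at a minimum (no better move) or when the decrease is too small.
        if candidate is None or (best - improved) / best < accuracy:
            break
        widths, best = candidate, improved
    return widths, best


def toy_delay(widths, a=1.0, b=100.0):
    """Hypothetical convex delay of the form sum(A*w + B/w)."""
    return sum(a * w + b / w for w in widths)


print(coordinate_search(toy_delay, n=2, w_min=1.0, w_max=60.0))
```

Because the search always starts from the minimum widths, two runs with the same accuracy return the same widths, which is the property the common starting point is meant to guarantee.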
5.2.3 Multi-objective optimizations

Multi-objective optimization means optimizing different targets at the same time, for example minimizing simultaneously the delay and the power, or the power and the area, and so on. From §5.2.2.1, §5.2.2.2 and §5.2.2.3 we have seen that some of these goals clash. These clashes are briefly summarized in table 5.4.
Tab. 5.4: Agreements of targets
          Area      Delay     Power
Area      -         clash     agree
Delay     clash     -         clash
Power     agree     clash     -
So, for example, optimizing delay and power together, i.e. minimizing both, is not possible: the power is minimized when all the transistors are at minimum width, while minimizing the delay requires some transistors (maybe all) to have a width greater than the minimum.
This disagreement among some optimization targets leads to new possible definitions of multi-objective optimization:
i) there is a primary target to be optimized, and one or more secondary targets to be taken into account, on which we may define a threshold. The algorithm goes on optimizing the primary target, taking care to keep all the secondary targets below their thresholds; or
ii) there are only primary targets, and each target enters the total objective function with a relative weight, which indicates how much the final solution should depend on the corresponding target; or
iii) both the previous definitions.
The most suitable policy is the second, because it gives each target the same priority with a different importance. The first alternative leads to a sub-optimal optimization since, first, the designer must know the order of magnitude of the targets in order to impose a limit on them and, second, the whole space of solutions may not be explored under such constraints.
In the case of primary targets with relative weights, we have chosen the sum of the relative weights to represent the entire normalized objective function, that is, the sum of the relative weights must be equal to one.
Given the results of §4.1.2 (page 54), the total objective function to be minimized is a linear combination of the delay ($D$), power ($P$) and area ($A$):
\[ O = \alpha D + \beta P + \gamma A , \tag{5.4} \]
where $O$ is the total objective function and where
\[ \alpha \ge 0, \quad \beta \ge 0, \quad \gamma \ge 0, \qquad \alpha + \beta + \gamma = 1 . \]
From the point of view of the user of the optimizer, specifying this kind of weights makes it possible to see them as a measure of how much the corresponding target matters in the final solution: for example, specifying $\alpha = 0.5$, $\beta = 0.5$ and $\gamma = 0$ means that we want to optimize the delay at 50% and the power at 50%.
The subtle point in eq. (5.4) is that the quantities $D$, $P$ and $A$ are not commensurable, that is, their orders of magnitude may not be the same. Consider just the units of measure: the delay is measured, for example, in picoseconds (e.g. 1000 ps), while the power, expressed as an energy, is measured in joules (e.g. $10^{-13}$ J). When one quantity is much greater than the others, all the changes in the latter quantities disappear in the total sum.
In order to overcome the problem of the non-commensurable quantities in eq. (5.4), all the terms of the sum should be normalized. The mathematical theory of optimization states that each term should be normalized by dividing it by the optimum found when optimizing only that particular term. This implies an a priori knowledge of the optimum of each term, and so of the total weighted sum; at every moment of the optimization run it is then possible to know the distance between the current solution and the optimum.
This is not practically feasible for a circuit optimization, since it would involve running the mono-objective optimizations, one for each term of the sum, and then the final multi-objective optimization. This would lead to an unacceptable total optimization session, both for the time it would take and for the resources it would occupy.
Thus the normalization applied here is the division of each quantity by its corresponding maximum: the maximum of the delay occurs when all the transistors are at minimum width, while the maximum of the power and of the area is measured when all the transistors are at the maximum width allowed in the optimization session (bearing in mind that choosing too large a maximum allowed width makes the power and area terms too small).
The total normalized optimization objective function then becomes:
\[ O = \alpha \frac{D}{D|_{\text{min widths}}} + \beta \frac{P}{P|_{\text{max widths}}} + \gamma \frac{A}{A|_{\text{max widths}}} \tag{5.5} \]
By choosing different combinations of the parameters $\alpha$, $\beta$ and $\gamma$ it is possible to obtain an optimized circuit in which the delay, the power consumption and the area occupation count more or less.
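The normalized weighted sum of equation (5.5) can be sketched in a few lines. The weights are written here as `alpha`, `beta` and `gamma` for delay, power and area; the function name, the argument names and the numerical values are hypothetical, and the reference values stand for the delay measured at minimum widths and the power and area measured at the maximum allowed widths.

```python
def normalized_objective(delay, power, area,
                         ref_delay, ref_power, ref_area,
                         alpha, beta, gamma):
    """Weighted sum of delay, power and area, each divided by its reference maximum."""
    weights = (alpha, beta, gamma)
    if any(w < 0.0 for w in weights):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    return (alpha * delay / ref_delay
            + beta * power / ref_power
            + gamma * area / ref_area)


# Hypothetical figures: delay in ps, energy in pJ, area in um^2.
value = normalized_objective(delay=500.0, power=2.0, area=50.0,
                             ref_delay=1000.0, ref_power=10.0, ref_area=200.0,
                             alpha=0.5, beta=0.5, gamma=0.0)
print(value)
```

Since every normalized term lies between 0 and 1 when the references are true maxima, the objective itself stays between 0 and 1, which keeps the three targets comparable whatever their units.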
Tab. 5.5: Full-adder: delay and power optimization
Full-adder          Technology   Delay [ps]             Energy [pJ]            Area [µm²]
                                 Pre-opt.   Post-opt.   Pre-opt.   Post-opt.   Pre-opt.   Post-opt.
Static              0.7 µm       1781       1156        6.475      22.34       43         110.5
Static              0.25 µm      571.3      429.5       3.155      13.63       17         83.12
TSPC (one-stage)    0.7 µm       930.6      744.1       2.168      3.921       26         62.8
TSPC (one-stage)    0.25 µm      276.7      187.1       0.641      1.879       13         41.6
Tab. 5.6: Full-adder: comparison of two kinds of optimization with the minimum-width results. The number in parentheses shows the worsening (if positive) or the improvement (if negative) of the power-delay optimization with respect to the full-delay optimization.
Full-adder          Technology   Delay                     Energy                    Area
                                 α = 1    α = β = 0.5      α = 1    α = β = 0.5      α = 1    α = β = 0.5
Static              0.7 µm       1.65     1.54 (+7%)       6.18     3.45 (-44.1%)    5.75     2.57 (-43.5%)
Static              0.25 µm      1.38     1.33 (+3.44%)    35.25    4.32 (-87.7%)    40.74    4.89 (-88%)
TSPC (one-stage)    0.7 µm       2.33     1.25 (+85.9%)    6.18     1.81 (-70.7%)    5.83     2.42 (-58.6%)
TSPC (one-stage)    0.25 µm      1.75     1.48 (+18.2%)    5.65     2.93 (-48.1%)    6.18     3.20 (-48.3%)
If, for example, the same full-adders of table 5.3 (page 99) are optimized both for delay and for power in the same measure, i.e. with $\alpha = 0.5$, $\beta = 0.5$ and $\gamma = 0$ in equation (5.5), we obtain the results of table 5.5.

The comparison between the full delay optimization (mono-objective) and the delay-power optimization (multi-objective) is sketched in table 5.6: as we can see, between the full-delay optimization and the power-delay optimization (50%-50%) there is a slight worsening in the delay of the final circuit (from 5.2% to 46.9%); at the same time there is an effective improvement in the power consumption: the power dissipation decreases (from 5.8% to 76.5%).
A more complete survey of the optimization results of the circuits presented in this chapter can be found in chapter 7 (page 121).
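The percentages in parentheses in table 5.6 can be recomputed from the post-optimization columns of tables 5.3 and 5.5, as the short sketch below does for the delay and the energy. The helper name `percent_change` and the dictionary layout are choices of this example; the numbers are copied from the two tables.

```python
def percent_change(reference, value):
    """Relative change of value with respect to reference, in percent."""
    return 100.0 * (value - reference) / reference


# (delay-only result from table 5.3, delay-and-power result from table 5.5)
post_delay_ps = {
    ("Static", "0.7 um"): (1080.0, 1156.0),
    ("Static", "0.25 um"): (415.2, 429.5),
    ("TSPC one-stage", "0.7 um"): (400.2, 744.1),
    ("TSPC one-stage", "0.25 um"): (158.3, 187.1),
}
post_energy_pj = {
    ("Static", "0.7 um"): (40.0, 22.34),
    ("Static", "0.25 um"): (111.2, 13.63),
    ("TSPC one-stage", "0.7 um"): (13.390, 3.921),
    ("TSPC one-stage", "0.25 um"): (3.622, 1.879),
}

for name, table in (("delay", post_delay_ps), ("energy", post_energy_pj)):
    for key, (delay_only, delay_power) in table.items():
        print(name, key, round(percent_change(delay_only, delay_power), 2))
```

A positive value is a worsening and a negative value an improvement of the delay-power optimization with respect to the delay-only one, following the sign convention of the table caption.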
5.3 Conclusion
This chapter first defines the targets of the optimization, and then applies the mathematical theory of chapter 4 (page 47) to the optimization of real circuits.
It has been shown that the only mono-objective optimization feasible by means of transistor dimension trimming is the delay minimization, since the minimization of both the area and the power consumption leads to the quasi-obvious solution of all the transistors at the minimum width allowed by the technology or by the designer.
Regarding the multi-objective optimization, a method that makes it possible to tackle several optimization policies has been presented. This method takes into account all the variables, even if they are incommensurable with one another: by means of a normalization, all the targets to be optimized can be combined in a single objective function, each with a relative weight.
Moreover, with this way of combining the several targets into one objective function, the introduction of constraints is as simple as it is in a mono-objective optimization.
Chapter 6
A CAD TOOL FOR OPTIMIZATION

The optimization goals of the previous chapter require a modular and complete framework in order to perform the real optimization of a circuit. This chapter describes the implementation of such a framework by means of about 10000 lines of C++ code. Section 6.1 reports the logical description of the tool and its modules, and section 6.2 reports the code implementation of the most important classes of the program. Finally, section 6.3 reports the logical flow of the program during execution.
For every other detail refer to appendices A and B (pages 145 and 149).
6.1 Logical description

The block diagram of the CAD tool is depicted in figure 6.1.
The core of the tool, the optimization engine, receives its input from two modules: the optimization algorithm module (OAM), where different optimization strategies can be selected, and the function evaluation module (FEM), which includes the models for delay, power and area estimation.
Fig. 6.1: Tool block diagram (blocks shown: optimization engine with results feedback; optimization algorithm module (OAM): SLOP, Powell, gradient descent; function evaluation module (FEM): delay, power, area; parser; circuit description, human readable and computer readable; optimization constraints)
6.1.1 The optimization algorithm module (OAM)
The OAM supports the choice of different optimization algorithms from a predefined set; three kinds of algorithm are currently included:
a SLOP-like algorithm (4.2.4, page 70), which works by increasing at each step the size of a single gate, chosen according to the best possible reduction of the delay along the critical path.
The Powell algorithm (4.2.3.2, page 69), which is a particular form of the conjugate directions algorithm family ([17]): it does not require the computation of any gradient function and it converges quadratically to the minimum of the cost function.
The simulated annealing algorithm (4.2.5, page 72): it chooses the transistor dimensions according to an annealing scheme, thus converging to a global minimum and escaping local minima. It is surely much slower than the previous ones, and requires a fine tuning of the annealing parameters.
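As an illustration of the first strategy in the list above, a single greedy sizing step can be sketched as follows. This is only a minimal sketch and not the code of the tool: the names `DelayModel` and `GreedySizingStep`, the fixed increment `step`, and the use of a caller-supplied delay function are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical delay model: maps the vector of gate sizes to the
// critical-path delay (any unit).
using DelayModel = std::function<double(const std::vector<double>&)>;

// One greedy step: tentatively enlarge each gate by `step` and keep only the
// single change that reduces the delay the most.
// Returns false (and leaves `size` untouched) if no change reduces the delay.
bool GreedySizingStep(std::vector<double>& size, double step, const DelayModel& delay)
{
    assert(step > 0.0);
    double bestDelay = delay(size);
    std::size_t bestGate = size.size();          // "no gate selected" marker
    for (std::size_t g = 0; g < size.size(); ++g) {
        const double old = size[g];
        size[g] = old + step;                    // tentative upsizing
        const double d = delay(size);
        size[g] = old;                           // undo
        if (d < bestDelay) { bestDelay = d; bestGate = g; }
    }
    if (bestGate == size.size()) return false;
    size[bestGate] += step;
    return true;
}
```

Calling the function repeatedly until it returns false gives a complete, if naive, sizing loop; a stopping rule on area or power would normally be added.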
For all the chosen methods, the analytical knowledge of the objective functions and their derivatives is not required; only numerical approximations are exploited.
However, methods requiring the gradient evaluation (e.g. the Fletcher-Reeves-Polak-Ribière version of the conjugate directions algorithm [17]) can also be supported.
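When a gradient is needed but only function values are available, a numerical approximation can be used. The sketch below shows a central-difference gradient; it is given as a generic example, the names `Objective` and `NumericalGradient` and the perturbation `h` are assumptions, and the thesis does not state which approximation scheme the tool adopts.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical objective: maps the transistor widths to a scalar cost.
using Objective = std::function<double(const std::vector<double>&)>;

// Central-difference approximation of the gradient of f at x.
// Each variable is perturbed in turn by +h and -h (two evaluations per variable).
std::vector<double> NumericalGradient(const Objective& f, std::vector<double> x, double h)
{
    assert(h > 0.0);
    std::vector<double> grad(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double xi = x[i];
        x[i] = xi + h;
        const double fPlus = f(x);
        x[i] = xi - h;
        const double fMinus = f(x);
        x[i] = xi;                               // restore the variable
        grad[i] = (fPlus - fMinus) / (2.0 * h);
    }
    return grad;
}
```

Each gradient costs two function evaluations per transistor width, which is significant when every evaluation is a circuit simulation.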
6.1.2 The function evaluation module (FEM)

The FEM performs the analysis of the circuit to be optimized, and in particular it evaluates all the objective functions needed by the OAM: the delays, the power consumptions and the area occupancy.
In order to perform this evaluation it invokes the timing analyzer or simulator chosen at run-time. At the time of writing two analyzers are supported: HSPICE and FAST (chapter 3, page 21).
Hereinafter the word simulator will be used, although some modules included in the FEM (such as FAST) are not real simulators but, more appropriately, delay-power analyzers, since they do not perform a real simulation of the circuit.
6.1.3 Core engine

The core engine is the main module of the program. It handles the communications among the other modules and makes the optimization feasible.
First of all, the engine parses the netlist of the circuit to be optimized, written in a SPICE-like format. It then invokes the module that automatically searches all the critical paths in the circuit, and finally it invokes the optimization algorithm.
6.2 Code implementation

The whole tool has been written in C++. All the classes of the program are shown in appendix A and all the code details can be found in appendix B.
The most important classes of the program are:
CircuitNetList
OptimizationAlgorithm
EvaluationAlgorithm
The first class, CircuitNetList, and its derived class Circuit, contain the graph of the circuit, in which every node is a transistor and every edge is a connection between two transistors.
The class OptimizationAlgorithm is an abstract base class from which every new optimization algorithm should be derived. It provides the interface between the real class that implements the algorithm and the core engine. Every derived class should provide the method Run() that performs the optimization.
The class EvaluationAlgorithm is again an abstract base class from which every new simulator should be derived; and again every derived class should provide the method Run(...) that performs the evaluation of all the objectives of the circuit, such as delay, power consumption and so on.
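The graph representation mentioned above (transistors as nodes, connections as edges) can be illustrated with a small adjacency-list sketch. The record `Tran`, the functions `Connected` and `BuildGraph`, and the choice of ignoring the supply rails when deciding whether two transistors are connected are assumptions of this example; the internal data structures of the tool are not shown in this chapter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, reduced transistor record: terminal node numbers only.
struct Tran { unsigned drain, gate, source; };

// True if the two transistors share an electrical node other than the rails
// (node 0 = ground, vdd = supply). Excluding the rails is a choice of this
// sketch: otherwise almost every pair would be connected.
bool Connected(const Tran& a, const Tran& b, unsigned vdd)
{
    const unsigned ta[3] = {a.drain, a.gate, a.source};
    const unsigned tb[3] = {b.drain, b.gate, b.source};
    for (unsigned x : ta)
        for (unsigned y : tb)
            if (x == y && x != 0 && x != vdd) return true;
    return false;
}

// Adjacency list: adj[i] holds the indices of the transistors connected to i.
std::vector<std::vector<std::size_t>> BuildGraph(const std::vector<Tran>& t, unsigned vdd)
{
    std::vector<std::vector<std::size_t>> adj(t.size());
    for (std::size_t i = 0; i < t.size(); ++i)
        for (std::size_t j = i + 1; j < t.size(); ++j)
            if (Connected(t[i], t[j], vdd)) {
                adj[i].push_back(j);
                adj[j].push_back(i);
            }
    return adj;
}
```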
6.2.1 The classes CircuitNetList and Circuit
The public and protected methods of class CircuitNetList are:

    class CircuitNetList
    {
    private:
        ...
    protected:
        char *FileNetOut;
        TransistorList TranList;
        CapacitorList CapList;
        char *FileIn;
        double Val;
        unsigned int ValNode;
    public:
        CircuitNetList( const char* FileNetList,
                        const Options& options );
        virtual ~CircuitNetList();
        unsigned int GetNTran() const
            { return TranList.GetNTran(); }
        unsigned int GetNCap() const
            { return CapList.NumCap; }
        double Valim() const
            { return Val; }
        unsigned int ValimNode() const { return ValNode; }
        const TransistorNode& operator[]( unsigned int index ) const;
        const TransistorNode& operator[]( const char* name ) const;
        int TranPos( const char* name ) const;
    };
This class provides some methods to return the i-th transistor by means of operator[], called either with the position of the transistor or with its name. The class also provides the methods to return the actual power supply node (the ground node is always assumed to be node 0).
Internally the class contains the list of all the transistors and all the capacitors present in the original netlist.
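The double access by position or by name can be sketched with two overloads of operator[]. The class `NetList` below is a reduced stand-in written for this example, not the real CircuitNetList: its storage, the `Add` method and the reduced `TransistorNode` record are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, reduced transistor record.
struct TransistorNode { std::string name; double width; };

class NetList {
    std::vector<TransistorNode> tran_;
public:
    void Add(const std::string& name, double width) { tran_.push_back({name, width}); }

    // Position of the transistor called `name`, or -1 if it is absent.
    int TranPos(const char* name) const {
        for (std::size_t i = 0; i < tran_.size(); ++i)
            if (tran_[i].name == name) return static_cast<int>(i);
        return -1;
    }
    // Access by position.
    const TransistorNode& operator[](unsigned int index) const {
        assert(index < tran_.size());
        return tran_[index];
    }
    // Access by name, built on top of TranPos.
    const TransistorNode& operator[](const char* name) const {
        const int pos = TranPos(name);
        assert(pos >= 0);
        return tran_[static_cast<std::size_t>(pos)];
    }
};
```

With this pair of overloads a call written with the plain literal 0 is ambiguous in C++, because 0 converts equally well to unsigned int and to a null pointer; writing 0u, or passing an unsigned variable, selects the positional overload.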
The public and protected methods of class Circuit are:

    class Circuit : public CircuitNetList
    {
    private:
        ...
    public:
        Circuit( const char *FileNetList,
                 const Options& options );
        ~Circuit();
        void PrintResult( unsigned long int Step, unsigned int NT,
                          unsigned int NP, const double* NewWidth,
                          const double* CPDelay, const double* CPPower,
                          const double *CPNoise, double Area,
                          double maxT, double maxP, double maxN,
                          double f, double fLast ) const;
        int Simulate( const double *NewWidth ) const;
        double JunctionNWidth( unsigned int node,
                               int& number,
                               const double* NewWidth = 0 ) const;
        double GateNWidth( unsigned int node,
                           int& number,
                           const double* NewWidth = 0 ) const;
        double JunctionPWidth( unsigned int node,
                               int& number,
                               const double* NewWidth = 0 ) const;
        double GatePWidth( unsigned int node,
                           int& number,
                           const double* NewWidth = 0 ) const;
        double CapStaticGnd( unsigned int node, int& number ) const;
        double CapStaticVdd( unsigned int node, int& number ) const;
        int TransistorListNode( unsigned int node, TransistorList& TList,
                                unsigned int& n, unsigned int& p ) const;
    };
The class provides the method Simulate(const double *NewWidth) that invokes the simulator of the circuit with the new transistor widths NewWidth. It also provides some methods ...Width(...) that return the sum of the widths of all the transistors connected to a node, and a few methods CapStatic...(...) that return the sum of all the capacitances connected between a node and the power supply node or between a node and the ground node. These methods are useful for the FAST model.
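What a ...Width(...) method computes can be sketched as follows for the n-type junction case. The record `Tran`, the function name `JunctionNWidthSketch`, and in particular the meaning given here to the `number` argument (the count of transistors found) are assumptions of the example; the real implementation is in appendix B.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Type { N, P };

// Hypothetical, reduced transistor record.
struct Tran { Type type; unsigned drain, gate, source; double width; };

// Sum of the widths of the n-type transistors whose drain or source touches
// `node`. `number` receives how many such transistors were found (assumed
// meaning). If newWidth is non-null it overrides the stored widths, with one
// entry per transistor in the same order as `t`.
double JunctionNWidthSketch(const std::vector<Tran>& t, unsigned node,
                            int& number, const double* newWidth = nullptr)
{
    double sum = 0.0;
    number = 0;
    for (std::size_t i = 0; i < t.size(); ++i) {
        if (t[i].type != Type::N) continue;
        if (t[i].drain != node && t[i].source != node) continue;
        sum += newWidth ? newWidth[i] : t[i].width;
        ++number;
    }
    return sum;
}
```

The optional width vector lets an analytical model re-evaluate the load of a node for every trial sizing without rebuilding the netlist.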
6.2.2 The class EvaluationAlgorithm
The public and protected interface of this class is:

    class EvaluationAlgorithm
    {
    private:
    protected:
        const CritPathList& pathlist;
        const Options& options;
        unsigned int NumPath;
        unsigned long int Calls;
        double *CPDelay; // delay
        double *CPPower; // power
        double *CPNoise; // noise
        double Area;
    public:
        EvaluationAlgorithm( const CritPathList& pathlist,
                             const Options& options );
        virtual ~EvaluationAlgorithm();
        virtual int Run( const Circuit& circuit,
                         const double *NewWidth,
                         const unsigned *ValidPath ) = 0;
        unsigned long int GetCalls() const { return Calls; }
        double GetDelay( unsigned int index ) const
            { return CPDelay[ index ]; }
        double GetPower( unsigned int index ) const
            { return CPPower[ index ]; }
        double GetNoise( unsigned int index ) const
            { return CPNoise[ index ]; }
        double GetArea() const
            { return Area; }
        unsigned int GetNPath() { return NumPath; }
    };
The main method is
Run( const Circuit& circuit, const double *NewWidth, ...)
that performs the real simulation of circuit with the new dimensions NewWidth. The other methods return the delay, power and area of the circuit with the new dimensions, the total number of calls to the simulator, and the number of critical paths in the circuit. The class contains the vectors of the delays and powers of all the critical paths, a reference to an instance of the class Options that contains all the options of the tool, and a reference to an instance of the class CritPathList that contains all the critical paths of the circuit.
6.2.3 The class OptimizationAlgorithm
The public and protected interface of this class is:

    class OptimizationAlgorithm
    {
    private:
        ...
    protected:
        unsigned int InternalSteps;
        const Circuit& circuit;
        const Options& options;
        unsigned int Steps;
        unsigned int NumTran;
        unsigned int NumPath;
        double *Width;
        double *CPDelay;
        double *CPPower;
        double *CPNoise;
        double Area;
        unsigned int *ValidPath;
        double MaxDelayInitMin;
        double MaxPowerInitMin;
        double MaxNoiseInitMin;
        double AreaInitMin;
        double MaxDelayInitMax;
        double MaxPowerInitMax;
        double MaxNoiseInitMax;
        double AreaInitMax;
        EvaluationAlgorithm& Simulation;
        double NormSim( const double* NewWidth, int& RetCode );
    public:
        OptimizationAlgorithm( const Circuit& circuit,
                               const Options& options,
                               EvaluationAlgorithm& simulation );
        virtual ~OptimizationAlgorithm();
        virtual int Run() = 0;
        unsigned long int GetSteps() const { return Steps; }
        int SimulateCircuit( const double *NewWidth );
        int SimulateFirstCircuit();
        double OptWidth( unsigned int index ) const
            { return Width[ index ]; }
    };
This class provides the method Run() that invokes the real algorithm, and the method SimulateCircuit(...) that performs the function evaluations by means of the instance EvaluationAlgorithm& Simulation: every time the algorithm needs to perform a function evaluation with new dimensions, it simply invokes the public method Simulation.Run(...), passing to it the new dimensions. The class also provides the methods to return the number of optimization steps and the final optimized widths.
The combination of all the functions returned by Simulation.Run(...) (all the critical path delays, all the power consumptions, 5.2.3, page 102) is performed by the method NormSim(...).
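The kind of combination performed by NormSim(...) can be illustrated by a normalized weighted sum. The sketch below is an assumption about the general shape of such a function, not a transcription of equation 5.5 or of the tool code: the names `Normalize` and `CombinedCost`, the linear normalization between two reference values, and the restriction to delay and power are all choices made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Map v from the interval [lo, hi] to [0, 1]. In this sketch lo and hi play
// the role of reference values measured before the optimization.
double Normalize(double v, double lo, double hi)
{
    assert(hi > lo);
    return (v - lo) / (hi - lo);
}

// Weighted sum of the normalized worst-case delay and worst-case power over
// all critical paths. wDelay and wPower are the relative weights.
double CombinedCost(const std::vector<double>& pathDelay,
                    const std::vector<double>& pathPower,
                    double wDelay, double wPower,
                    double dLo, double dHi, double pLo, double pHi)
{
    assert(!pathDelay.empty() && !pathPower.empty());
    const double maxD = *std::max_element(pathDelay.begin(), pathDelay.end());
    const double maxP = *std::max_element(pathPower.begin(), pathPower.end());
    return wDelay * Normalize(maxD, dLo, dHi) + wPower * Normalize(maxP, pLo, pHi);
}
```

The normalization is what allows quantities with different units, such as picoseconds and picojoules, to be added in a single scalar cost.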
6.2.4 The critical path retrieving
The module that retrieves all the critical paths in the circuit (see 5.1.1, page 80, for the mathematical definitions) is subdivided into three parts:
the first part identifies all the inputs of the circuit (gate nodes connected to nothing else) and all the internal gate nodes (connected to a source or a drain of another transistor);
the second part searches, for every node in the circuit, all the charging paths between the node and the power supply and all the discharging paths between the node and the ground;
the third part combines the previous charging and discharging paths to obtain true critical paths. The combination is performed checking that the inputs permit the real activation of the path; at the same time the module sets all the inputs at the value necessary to obtain the excitation of the path, i.e. such that a change in the input causes a change in the output.
The main function of the critical paths retriever is:
int Critic(const Circuit& circuit,CritPathList& pathList,...)
that performs the search of the critical paths in circuit: it simply calls the recursive function int CriticRecurse(...) to search all the charging or discharging paths, and then it combines some of these paths by means of the recursive function int SearchCPRecurse(...). For every charging/discharging path to be added, the function int SearchOKCond(...) is invoked: this very complex function checks that all the input conditions are coherent with the conduction of the path.
In order to ensure a good flexibility of the tool, the designer can always specify by hand the critical paths to be used in the optimization. The standard format for them is a text file that, for each critical path, lists the input node, the output node, and the transition on both the input and the output node (rise or fall). It is possible in this way to list only a part of all the critical paths present in a circuit and to take into account only those paths during the optimization.
Moreover, it is possible to use the optimizer for topologies that could normally confuse the critical path search algorithm, such as pass-transistor logic circuits.
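A reader for a hand-written critical path file could look like the sketch below. The exact syntax of the file is not given in this chapter, so the line format used here (input node, output node, input transition, output transition, separated by blanks, with the keywords rise and fall) is purely hypothetical, as are the names `PathSpec` and `ParsePaths`.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// One user-specified critical path (hypothetical record).
struct PathSpec { std::string in, out; bool inRise, outRise; };

// Parse lines of the assumed form "<input> <output> <rise|fall> <rise|fall>".
// Lines that do not match this form are skipped.
std::vector<PathSpec> ParsePaths(std::istream& src)
{
    std::vector<PathSpec> paths;
    std::string line;
    while (std::getline(src, line)) {
        std::istringstream fields(line);
        std::string in, out, tIn, tOut;
        if (!(fields >> in >> out >> tIn >> tOut)) continue;
        const bool okIn  = (tIn  == "rise" || tIn  == "fall");
        const bool okOut = (tOut == "rise" || tOut == "fall");
        if (!okIn || !okOut) continue;
        paths.push_back({in, out, tIn == "rise", tOut == "rise"});
    }
    return paths;
}
```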
6.2.5 The derived classes

Every time a new optimization algorithm or a new simulator must be introduced, a new class should be derived from the main classes.
As an example, the class for the HSPICE simulator is derived as:

    class Hspice: public EvaluationAlgorithm
    {
    private:
        ...
    public:
        Hspice( const CritPathList& pathlist,
                const Options& options,
                const char* NE );
        ~Hspice();
        int Run( const Circuit& circuit,
                 const double *NewWidth,
                 const unsigned* ValidPath );
    };
and the class for the Powell optimization algorithm (4.2.3.2, page 69) is derived as:

    class Powell: public OptimizationAlgorithm
    {
    private:
        ...
    public:
        Powell( const Circuit& circuit,
                const Options& options,
                EvaluationAlgorithm& simulation );
        ~Powell();
        int Run();
    };

Basically, both classes need only provide the method Run(...) (with different parameters, of course), which performs the real simulation or the real optimization algorithm.
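The mechanism can be shown on a reduced, self-contained example. The classes `Evaluator` and `ToyAnalyzer` and the function `EvaluateDelay` below are stand-ins invented for the illustration, with a much smaller interface than EvaluationAlgorithm, and the toy delay formula is not a real model.

```cpp
#include <cassert>
#include <vector>

// Reduced stand-in for the evaluation interface: one pure virtual Run().
class Evaluator {
public:
    virtual ~Evaluator() = default;
    virtual int Run(const std::vector<double>& width) = 0;   // 0 on success
    double Delay() const { return delay_; }
    unsigned long Calls() const { return calls_; }
protected:
    double delay_ = 0.0;
    unsigned long calls_ = 0;
};

// Toy analyzer: the delay is the sum of the inverses of the widths.
class ToyAnalyzer : public Evaluator {
public:
    int Run(const std::vector<double>& width) override {
        ++calls_;
        delay_ = 0.0;
        for (double w : width) {
            if (w <= 0.0) return 1;              // reject non-physical widths
            delay_ += 1.0 / w;
        }
        return 0;
    }
};

// The caller only sees the base class, so any analyzer can be plugged in.
double EvaluateDelay(Evaluator& sim, const std::vector<double>& width)
{
    const int rc = sim.Run(width);
    assert(rc == 0);
    return sim.Delay();
}
```

Since the optimizer holds only a reference to the base class, adding a new analyzer does not require any change in the optimization algorithms.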
6.3 Program flows

The logical flow of the main function of the program is:
Algorithm 6.1. Logical flow of main
Require: Circuit netlist in SPICE-like format
1: Preprocess the input netlist.
2: Process the options configuration file.
3: Build the graph of the circuit.
4: Search the critical paths in the circuit.
5: Invoke the function optimizator.Run().
6: Write results.
The logical flow of the function that retrieves all the critical paths is divided into a few functions:

Algorithm 6.2. Logical flow of the function Critic(...)
Require: A graph of the circuit in which each node is a transistor.
1: Invoke CriticRecurse(...) passing to it the ground node.
2: Invoke CriticRecurse(...) passing to it the power supply node.
3: Invoke SearchCriticalPath(...) passing to it the list of all the discharging paths starting from the ground node.
4: Invoke SearchCriticalPath(...) passing to it the list of all the charging paths starting from the power supply node.
5: Return a list of all the critical paths in the circuit.
Algorithm 6.3. Logical flow of the function CriticRecurse(...)
Require: Node.
1: for all the transistors that have the source or drain connected to Node do
2:   if Source = Node then
3:     Node = Drain.
4:   else
5:     Node = Source.
6:   end if
7:   Memorize the current transistor in the current list.
8:   Copy the current list in a new list, in order to create a new list every time there is more than one transistor connected at the same node.
9:   if both n-type and p-type transistors are connected at Node OR Node is already visited then
10:    Return.
11:  else
12:    Invoke myself with Node.
13:  end if
14: end for
15: return all the lists of nodes starting from Node
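A recursive enumeration in the spirit of Algorithm 6.3 can be sketched as follows. It is a simplified variation and not the CriticRecurse(...) of the tool: it marks the transistors already used in the current chain instead of the visited nodes, it ignores the gate terminals, and the names `Tran`, `IsOutputNode` and `WalkPaths` are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Type { N, P };

// Hypothetical, reduced transistor record (gate terminal omitted).
struct Tran { Type type; unsigned drain, source; };

// A node is a stopping point if both n-type and p-type channels touch it.
bool IsOutputNode(const std::vector<Tran>& t, unsigned node)
{
    bool n = false, p = false;
    for (const Tran& x : t) {
        if (x.drain != node && x.source != node) continue;
        if (x.type == Type::N) n = true; else p = true;
    }
    return n && p;
}

// Depth-first walk across transistor channels starting at `node`.
// Every chain of transistor indices that ends on an output node is
// appended to `paths`.
void WalkPaths(const std::vector<Tran>& t, unsigned node,
               std::vector<bool>& used, std::vector<std::size_t>& current,
               std::vector<std::vector<std::size_t>>& paths)
{
    for (std::size_t i = 0; i < t.size(); ++i) {
        if (used[i] || (t[i].drain != node && t[i].source != node)) continue;
        const unsigned next = (t[i].source == node) ? t[i].drain : t[i].source;
        used[i] = true;
        current.push_back(i);
        if (IsOutputNode(t, next)) paths.push_back(current);   // copy the chain
        else WalkPaths(t, next, used, current, paths);
        current.pop_back();
        used[i] = false;
    }
}
```

Started from the ground node the walk yields the discharging chains, and started from the supply node it yields the charging chains; for a two-input NAND it finds one series pull-down chain and two single-transistor pull-up chains.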
Algorithm 6.4. Logical flow of the function SearchCriticalPathRecurse(...)
Require: A list of all the charging and discharging paths and a path as a starting point.
1: for all the charging paths do
2:   Choose a discharging path that has as an input node the output node of the first path.
3:   Check if the input conditions are correct and, if necessary, set them.
4:   Invoke myself with the new path as a first path.
5: end for
6: for all the discharging paths do
7:   Choose a charging path that has as an input node the output node of the first path.
8:   Check if the input conditions are correct and, if necessary, set them.
9:   Invoke myself with the new path as a first path.
10: end for
6.4 Conclusions
This chapter describes the implementation of the tool that is behind all the optimizations throughout this thesis. It has been written in a very modular way, in order to permit the efficient insertion of new algorithms and new simulators. It consists of about ten thousand lines of C++, and it deeply exploits object-oriented features in order to hide the implementation details from new developers.
Chapter 7
RESULTS AND CONCLUSIONS
THIS chapter shows a survey of the optimization of the circuits shown in chapter 5 (page 77): the goal here is to show how a cell library can be optimized in order to be used in VLSI circuits, either full-custom or standard-cell. The path that this chapter will walk goes from mono-objective optimizations to multi-objective ones, passing through constrained optimization. Along this path some conclusions (and opinions!) are drawn, giving cell-library designers some guidelines and tools to facilitate their work and obtain the wanted results.
7.1 Optimization
The cell library to be optimized is composed principally of basic TSPC CMOS dynamic logic (for a description of the TSPC see chapter 1 (page 3), and [1]), but, with the purpose of extending the validity of the results, some static gates are included in the library. The full list of the gates subjected to optimization is shown in table 7.1. For a description of the complexity of the gates (both the number of transistors in each cell and the number of critical paths in the same cell) see table 5.1 (page 92).
The library thus comprises the inverter gate, the TSPC and gates (both the n and the p versions), the TSPC or and latch gates (again with the n and the p versions), and a full-adder (the version included here is an n-p construction, faster than the almost equivalent p-n construction). As said above, the following are included for comparison: a complete static full-adder, a full static and-or gate (see note a, page 98), a full static and, a full static or, a full static parity gate (which performs the parity calculation among three inputs), and, finally, a TSPC full-adder composed only of the TSPC basic gates mentioned above.
Tab. 7.1: Library gates list

Gate
Inverter (fig. 5.10, page 94)
TSPC type n latch (fig. 5.11(a), page 95)
TSPC type p latch (fig. 5.11(b), page 95)
TSPC type n and (fig. 5.12(a), page 96)
TSPC type p and (fig. 5.12(b), page 96)
TSPC type n or (fig. 5.13(a), page 97)
TSPC type p or (fig. 5.13(b), page 97)
Static and-or (fig. 5.14, page 98)
Static and (fig. 5.14, page 98) (See note 14, page 92.)
Static or (fig. 5.14, page 98) (See note 14, page 92.)
Static parity gate (fig. 5.15, page 99)
Static full-adder (figs. 5.16(a), 5.16(b), page 100)
TSPC full-adder (one-stage) (figs. 5.17(a), 5.17(b), page 101)
TSPC full-adder (basic cells)
The very first result reported here is the comparison of the delay and power consumption between the 0.7 µm and the 0.25 µm technology, at minimum width: this comparison is reported in table 7.2 and plotted in figure 7.1(a) for the delay and in figure 7.1(b) for the power consumption.
From that table it is possible to see that, passing from the 0.7 µm to the 0.25 µm technology, the average improvement (diminution) of the delay is 69.3% and that of the power is 76.2%.
Thus, with a scaling of the dimensions to about 1/3, the average delay and power consumption are also scaled down by about the same factor.
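The percentages quoted for each gate are simple relative variations, which can be reproduced with a one-line helper; the function name `PercentChange` is an assumption of this sketch. With the inverter delays of table 7.2 (717.5 ps at 0.7 µm, 259.6 ps at 0.25 µm) it gives about -63.8%.

```cpp
#include <cassert>

// Relative variation, in percent, of a quantity that goes from `before`
// to `after`. Negative values mean a reduction.
double PercentChange(double before, double after)
{
    assert(before != 0.0);
    return 100.0 * (after - before) / before;
}
```

In the same way, the ratio of the two feature sizes, 0.25/0.7, is about 0.36, which is the scaling of roughly 1/3 mentioned above.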
7.1.1 Mono-objective vs. Multi-objective
Mono-objective optimization (4.1.1, page 49) means optimizing (in our case always decreasing) a single objective, i.e. a well defined target, to the detriment of all the other possible targets.
The very first optimization policy applied to CMOS circuits was the delay optimization. Figures 7.2 and 7.3 sketch the delay optimization of the gates of table 7.1, in the 0.7 µm and 0.25 µm technology implementations respectively, with arrows representing the delay and energy variation. The arrows start from the initial values (i.e. either the delay or the energy measured at the minimum technology width) and end at the values after the optimization.
Tab. 7.2: Delay and energy dissipation @ minimum width (HSPICE)

                                             0.7 µm                                       0.25 µm
Gate                            Delay [ps]  Energy [pJ]  Area [µm²]    Delay [ps]        Energy [pJ]        Area [µm²]
Inverter                           717.5      0.6887         2.4       259.6 (-63.8%)    0.086 (-87.5%)        1.2
TSPC type n latch                  921.8      3.491          7.2       293.3 (-68.2%)    0.659 (-81.1%)        3.6
TSPC type p latch                 1413.0      2.807          7.2       482.5 (-65.9%)    0.289 (-89.7%)        3.6
TSPC type n and                   1028.0      2.756          8.4       315.8 (-69.3%)    0.513 (-81.4%)        4.8
TSPC type p and                   1413.0      2.51           8.4       482.5 (-65.9%)    0.349 (-86.1%)        4.8
TSPC type n or                     904.3      4.47           8.4       299.7 (-66.9%)    0.826 (-81.5%)        4.8
TSPC type p or                    1413.0      1.654          8.4       482.5 (-65.9%)    0.224 (-86.5%)        4.8
Static and-or                     1180.0      3.639         16.8       334.0 (-71.7%)    1.998 (-45.1%)        8.4
Static and                         760.9      0.722          4.8       277.2 (-63.6%)    0.0907 (-87.4%)       2.4
Static or                         1430        0.75           4.8       233.7 (-83.7%)    0.0713 (-90.5%)       2.4
Static parity gate                2650.0      0.744         57.6       922.2 (-65.2%)    0.0945 (-87.3%)      28.8
Static full-adder                 1781        6.475         48         571.3 (-67.9%)    3.155 (-51.3%)       24
TSPC full-adder (one-stage)        930.6      2.168         15.6       276.7 (-70.3%)    0.641 (-70.4%)        7.8
TSPC full-adder (basic cells)     2691.0      8.893        151.2       482.3 (-82.1%)    5.27 (-40.7%)        75.6
Average improvement                                                    -69.3%            -76.2%               -50%

Fig. 7.1: Comparison of 0.7 µm and 0.25 µm gates @ minimum technology width: (a) delay [ps], (b) energy dissipation [pJ], per gate type.
Fig. 7.2: Delay optimization of 0.7 µm gates: (a) delay amelioration, (b) energy-dissipation deterioration.
Fig. 7.3: Delay optimization of 0.25 µm gates: (a) delay amelioration, (b) energy-dissipation deterioration.
As can be expected, the delay shows a considerable improvement (diminution, figures 7.2(a), 7.3(a)) while the energy dissipation shows a very large increase (figures 7.2(b), 7.3(b)): to decrease the delay the optimizer augments the transistor widths, thus augmenting the overall power dissipation. Table 7.3 and figure 7.4 report the relative variation of delay and power (as minimum, maximum and mean value) for both technologies: for the 0.7 µm technology the delay is, on average, decreased by 4.80 times, while for the 0.25 µm technology it is decreased by 3.16 times (figure 7.4(a)). On the contrary, the energy dissipation is increased by 20.42 times in 0.7 µm and by 13.41 times in 0.25 µm (figure 7.4(b)).
Fig. 7.4: Technology comparison of delay optimization: (a) delay variation, (b) energy-dissipation variation.
Table 7.4 shows the total time taken by the optimization of each gate, together with the total number of function evaluations, that is the number of times the simulator of the circuit (in this case HSPICE) has been invoked. These numbers are quite reasonable per se, and moreover the optimization of a cell library ought to be performed only once, before its reuse. Furthermore, in the case of very large circuits, the modular architecture of the optimizer makes it possible to switch from one simulator to another on the fly; thus we can use a very fast simulator (such as FAST) in the earlier steps of the optimization, and switch to a more precise but slower simulator (such as HSPICE) in the later stages of the optimization process.
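The idea of switching simulator during the optimization can be sketched with a small wrapper that forwards the first evaluations to a fast analyzer and the following ones to a precise one. The names `Analyzer` and `MakeTwoPhaseAnalyzer` and the switching rule based on the number of calls are assumptions of this example; the thesis does not specify how the switch is triggered.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical analyzer: maps the transistor widths to a delay estimate.
using Analyzer = std::function<double(const std::vector<double>&)>;

// Returns an analyzer that uses `fast` for the first `switchAt` calls and
// `precise` afterwards. The call counter is owned by the returned object.
Analyzer MakeTwoPhaseAnalyzer(Analyzer fast, Analyzer precise, unsigned long switchAt)
{
    assert(fast && precise);
    unsigned long calls = 0;
    return [=](const std::vector<double>& width) mutable {
        const bool early = (calls < switchAt);
        ++calls;
        return early ? fast(width) : precise(width);
    };
}
```

Since the counter lives inside the returned object, copying that object also copies the counter; an optimizer should therefore keep a single instance of it.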
Tab. 7.3: Delay decreasing and energy increasing (both relative) in a delay optimization.

              Delay decreasing           Energy increasing
Technology    Max.    Min.    Mean       Max.     Min.    Mean
0.7 µm        8.43    3.43    4.80       43.61    8.89    20.42
0.25 µm       7.78    2.75    3.16       35.25    6.17    13.41
These results are largely predictable, since a hard delay optimization leads to a very large increase in transistor dimensions, thus leading to a great area occupancy and energy dissipation.
Moreover, another issue arises when optimizing an entire cell library: is it necessary to push every single cell to its limits? In a generic static circuit the total delay is, generally, the sum of the delays of the cells comprising the circuit, since this delay is bounded by the delay of the worst critical path and, moreover, it is possible to have a single critical path (5.1.1, page 80) from a primary input to a primary output of the circuit; so it makes some sense to optimize every single cell to its best.
In a generic dynamic circuit, the global delay is still bounded by the delay of the worst critical path in the circuit, which thus determines the minimum clock period. Since this critical path is contained in a single cell for a single-phase dynamic logic (where n-gates and p-gates alternate, working with different clock phases), the delay of the entire circuit is bounded by the delay of the worst library cell in the circuit. It makes no sense, thus, to optimize the basic library cells (that are present in every circuit) to their limits, when the delay of a generic circuit is bounded by the worst of them. It is, instead, more useful to optimize the worst cell in the library, while reducing the delay of the other cells only to the value obtained by that optimization. In this way a reduction of the dimensions of these cells is achieved, thus obtaining a reduction of the overall energy dissipation.
Tab. 7.4: Elapsed time and total number of function evaluations for a full-delay optimization with HSPICE on an ULTRA-sparc 5

                      0.7 µm                       0.25 µm
Gate          El. time [s]  Fun. eval.     El. time [s]  Fun. eval.
inv                332.6        12              212.4        13
and-n             1338.3        34             1675.6        36
and-p             1426.6        34             2449.5        41
or-n              1408.3        32             1950.0        34
or-p              1259.5        31             1355.5        27
latch-n           1286.5        32             1466.7        32
latch-p           1307.1        33             1574.7        31
and-or            5830.9        73             9280.3        91
and-static         786.5        25              729.6        31
or-static          651.6        21              626.1        24
parity           64098.2       159            35274.3       178
static-fa        27034.8       239            23794.1       180
tspc-fa1          2413.3        69             2881.2        70
tspc-fa2         16459.1        66            63485.2       121
So the consequent idea is to optimize an entire (dynamic) cell library using a constrained optimization (4.1.1.2, page 52); the strategy for this purpose is:
i) evaluate the delay of every cell at minimum width;
ii) choose the worst cell (with regard to delay) among them;
iii) optimize the delay of this cell as far as possible;
iv) optimize all the other cells to have a delay not greater than the value obtained in the previous point.
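The four steps of this strategy can be sketched as a small driver function. It is only an outline under stated assumptions: the record `Cell`, the function `OptimizeLibrary`, and the two callbacks `minimize` and `meetTarget`, which stand in for an unconstrained and a constrained run of the optimizer, are all hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical cell record: only the quantity used by the strategy.
struct Cell { double delay; };

// minimize(c) pushes the delay of cell c as low as possible;
// meetTarget(c, d) sizes cell c so that its delay does not exceed d.
// Returns the delay target imposed on the library.
double OptimizeLibrary(std::vector<Cell>& lib,
                       const std::function<void(Cell&)>& minimize,
                       const std::function<void(Cell&, double)>& meetTarget)
{
    assert(!lib.empty());
    // i) and ii): the delays at minimum width are already stored in lib;
    // find the worst cell.
    auto worst = std::max_element(lib.begin(), lib.end(),
        [](const Cell& a, const Cell& b) { return a.delay < b.delay; });
    // iii): unconstrained delay optimization of the worst cell.
    minimize(*worst);
    const double target = worst->delay;
    // iv): every other cell only has to reach the same delay.
    for (auto it = lib.begin(); it != lib.end(); ++it)
        if (it != worst) meetTarget(*it, target);
    return target;
}
```

In practice a small margin can be added to the target before step iv), as done in the example that follows, where a result of 121.2 ps is rounded up to a constraint of 125 ps.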
As an example, the constrained optimization of dynamic 0.25 µm gates is reported in table 7.5: this optimization has been performed with a constraint on every gate not to have a delay greater than 125 ps. This value has been obtained by an unconstrained optimization of the worst (with respect to delay) cell, the TSPC type-p or gate (cf. table 7.2). After this optimization the delay of this gate was 121.2 ps, so the value chosen for the optimization of all the other gates was 125 ps.
Tab. 7.5: Constrained delay optimization of a few 0.25 µm gates.

Gate                  Delay pre-opt. [ps]    Delay post-opt. [ps]
and-n                      315.800                100.500
and-p                      482.500                111.900
or-n                       299.700                114.900
or-p                       482.500                121.200
latch-n                    293.300                 88.080
latch-p                    482.500                118.600
Average delay              392.72                 109.20
Standard deviation          36.65                   3.83
It is possible to see from table 7.5 that the delays after the optimization have a standard deviation⁵ (3.83) far smaller than the standard deviation before the optimization (36.65). This means that all the cells have nearly the same delay after the optimization, and that this value is an optimal one, since it minimizes the delay of a block constituted by these cells and, at the same time, reduces the power dissipation and area occupancy with respect to a solution with all the cells optimized independently.
⁵ The standard deviation $\sigma$ of a number $N$ of samples $x_i$ is defined by $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - m)^2}{N}$, where $m = \frac{\sum_{i=1}^{N} x_i}{N}$ is the arithmetic mean of the samples. It is a measure of the spreading of the samples around the mean.
The procedure of a constrained optimization is useful only when we want to constrain a single target to a precise value. It is not useful when we want to constrain more than one target at the same time, for example delay and power together: such optimizations are not feasible because, first, they would require an evaluation of the quantities to be constrained (in order to know if the constraints are reasonable) and, second, it might not be possible for the optimizer to satisfy all the constraints.
A much more useful policy to take into account specifically more than one target is to perform a multi-objective optimization.
Figures 7.5 and 7.7 show four different multi-objective optimizations for the 0.7 µm and 0.25 µm technology, respectively (figures 7.6 and 7.8 are, respectively, a zoom of figures 7.5(b) and 7.7(b)). The four different optimizations performed are:
i) a full delay optimization, indicated with Delay=100% Power=0%;
ii) a delay optimization, taking slightly into account the power consumption, indicated with Delay=80% Power=20%;
iii) a delay-power optimization, taking into account the power dissipation in an equal measure, indicated with Delay=50% Power=50%;
iv) a delay optimization, taking strongly into account the power consumption, indicated with Delay=20% Power=80%.
The percent numbers⁶ reported after delay and power are also the weighting coefficients of equation 5.5 (page 105), used as a cost function in the optimization algorithm.
⁶ The case Delay=0% Power=100% has not been included, since this kind of optimization leads to the trivial result of all the transistors at the minimum width (cf. 5.2.2.2, page 96).
From these figures we see that the delay reduces more and more as its relative weight increases, while the increase of the power dissipation is progressively limited as the relative weight of the power increases.
Of all the optimization policies, the one that gives the most useful results is the optimization of delay and power with the same weights, that is the one indicated with Delay=50% Power=50% in the previous figures. These results are also reported in figure 7.9, as a particular case.
This is, probably, the most useful optimization since it still reduces the delay considerably, but it contains the increase of the power dissipation to a more acceptable value.
Figures 7.10, 7.11, 7.12 and 7.13 show the same four optimizations by means of the trajectory in the delay-power space during the optimization process. In these figures each marked point is a step in the optimization process. It is thus possible to see how augmenting the relative weight of the delay in the cost function (and thus reducing the relative weight of the energy) leads the optimizer to go further along the trajectory, reducing the delay and augmenting the energy dissipation.
Fig. 7.5: Several delay-power optimization policies of 0.7 µm gates: (a) delay variation, (b) energy variation.

[Figure: bar chart of Energy [pJ] versus gate type.]
Fig. 7.6: Energy-dissipation variation (zoom of figure 7.5(b))

Tab. 7.6: Delay worsening and energy-dissipation improvement between a full delay optimization and delay-power optimization

Technology             0.7 µm                         0.25 µm
Gate           Delay    Energy    Area        Delay    Energy    Area
inv            39.3%    -20.1%    -40.9%      15.7%    -10.4%    -21.1%
andn           27.8%    -92.2%    -87.4%       6.3%    -36.3%    -42.1%
andp           48.4%    -81.0%    -80.4%       1.1%    -39.4%    -49.2%
orn            33.1%    -67.2%    -76.2%      46.9%    -77.5%    -66.9%
orp            33.1%    -77.8%    -69.9%      11.8%    -35.5%    -21.6%
latchn         31.5%    -71.1%    -84.7%      41.3%    -22.0%    -46.4%
latchp         28.3%    -75.2%    -76.1%      14.6%    -69.5%    -72.1%
and-or         29.5%    -91.2%    -89.2%       6.7%    -81.2%    -79.1%
and-static     28.7%    -67.3%    -79.3%      14.4%    -42.1%    -53.3%
or-static      21.4%    -30.4%    -28.8%      -3.4%     18.4%    -12.8%
parity          7.7%    -78.1%    -81.2%       2.5%    -50.3%    -51.0%
static-fa      33.3%    -87.2%    -86.7%       5.9%    -81.9%    -82.4%
tspc-fa1       11.0%    -29.3%    -27.4%      15.4%    -48.1%    -48.3%
tspc-fa2       12.3%    -72.5%    -71.2%       8.6%    -41.9%    -44.1%
average       +27.5%    -67.2%    -69.9%     +13.4%    -44.1%    -49.3%
From these figures it can again be clearly seen that the multi-objective optimization Delay=50% Power=50% gives the best results: it stays close to the full delay optimization while keeping the energy dissipation within reasonable values. These results are summarized in table 7.6, which shows the percent variation of delay and energy dissipation between the values obtained after a full delay optimization and the values obtained after a delay-power optimization. The average worsening in the delay (i.e. the difference between the delay value after a full delay optimization and the same value after a delay-power optimization) is +27.5% for the 0.7 µm technology and just +13.4% for the 0.25 µm technology. Despite these low rates of worsening, the average energy-dissipation reduction is 67.2% for the 0.7 µm technology and 44.1% for the 0.25 µm technology, while the area occupancy reductions are, respectively, 69.9% and 49.3%. This means that accepting a slight degradation in the delay figure leads to a great reduction of the overall energy dissipation and area occupancy.

[Figure: two bar charts. (a) Delay variation: Delay [ps] versus gate type. (b) Energy [pJ] versus gate type. Gate types: andn, andp, orn, orp, latchn, latchp, inv, and-or, and-static, or-static, parity, static-fa, tspc-fa1.]
Fig. 7.7: Several delay-power optimization policies of 0.25 µm gates.

[Figure: bar chart of Energy [pJ] versus gate type.]
Fig. 7.8: Energy-dissipation variation (zoom of figure 7.7(b))
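The averages quoted from table 7.6 are plain arithmetic means over the fourteen gates. The short C++ sketch below recomputes the two delay averages from the per-gate values of the table; the function name mean and the array names are illustrative only.

```cpp
#include <cassert>
#include <cstddef>

// Arithmetic mean of n values.
double mean(const double* v, std::size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += v[i];
    return sum / static_cast<double>(n);
}

// Delay worsening per gate, in percent, copied from table 7.6
// (inv, andn, andp, orn, orp, latchn, latchp, and-or, and-static,
//  or-static, parity, static-fa, tspc-fa1, tspc-fa2).
const double kDelay07[14]  = {39.3, 27.8, 48.4, 33.1, 33.1, 31.5, 28.3,
                              29.5, 28.7, 21.4,  7.7, 33.3, 11.0, 12.3};
const double kDelay025[14] = {15.7,  6.3,  1.1, 46.9, 11.8, 41.3, 14.6,
                               6.7, 14.4, -3.4,  2.5,  5.9, 15.4,  8.6};
```

Calling mean(kDelay07, 14) and mean(kDelay025, 14) gives values that round to 27.5 and 13.4, the entries of the table's average row.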
7.2 Conclusions
The goal of the optimization framework presented in this chapter is to show a new way to optimize the performance of CMOS cells employed in VLSI circuits.
This new methodology, the multi-objective optimization, has led to a prominent result: the delay of a circuit can be reduced while taking into account the power consumption and the area occupancy. The results of table 7.6 are the most telling: at the cost of a small compromise in delay performance with respect to a full delay optimization, the power consumption is strongly decreased. This means that the default optimization used so far, the full delay optimization, can be safely replaced by a multi-objective optimization. A circuit that consumes less power while maintaining almost the same delay is safer from the operating point of view: it develops less heat, hence it is more reliable.

[Figure: two bar charts comparing the 0.7 µm and 0.25 µm technologies. (a) Delay variation: Delay [ps] versus gate type. (b) Energy-dissipation variation: Energy [pJ] versus gate type. Gate types: andn, andp, orn, orp, latchn, latchp, inv, and-or, and-static, or-static, parity, static-fa, tspc-fa1, tspc-fa2.]
Fig. 7.9: Delay-power optimization (50%-50%) comparison of 0.7 µm and 0.25 µm gates.

[Figure: "Delay-power trajectory during optimizations", Delay [ps] versus Energy [pJ], panels (a) 0.25 µm and (b) 0.7 µm. Curves: Delay=100% Power=0%, Delay=80% Power=20%, Delay=50% Power=50%, Delay=20% Power=80%, and the starting point.]
Fig. 7.10: Delay and power trajectory during 4 different multi-objective optimizations for the and-or gate of figure 5.14 (page 98)
[Figure: Delay [ps] versus Energy [pJ], panels (a) 0.25 µm and (b) 0.7 µm. Curves: Delay=100% Power=0%, Delay=80% Power=20%, Delay=50% Power=50%, Delay=20% Power=80%, and the starting point.]
Fig. 7.11: Delay and power trajectory during 4 different multi-objective optimizations for the parity gate of figure 5.15 (page 99)
The ease with which several optimization policies can be applied to a circuit greatly helps the work of the cell-library designer: starting from the same version of a library, the designer can, with very little effort, produce several libraries optimized in different ways. Each cell in a library then has different performance with respect to the same cell in the other libraries, but it is still fully equivalent from the point of view of the function performed.

[Figure: Delay [ps] versus Energy [pJ], panels (a) 0.25 µm and (b) 0.25 µm as labelled in the source. Curves: Delay=100% Power=0%, Delay=80% Power=20%, Delay=50% Power=50%, Delay=20% Power=80%, and the starting point.]
Fig. 7.12: Delay and power trajectory during 4 different multi-objective optimizations for the static full-adder of figure 5.16 (page 100)
[Figure: Delay [ps] versus Energy [pJ], panels (a) 0.25 µm and (b) 0.7 µm. Curves: Delay=100% Power=0%, Delay=80% Power=20%, Delay=50% Power=50%, Delay=20% Power=80%, and the starting point.]
Fig. 7.13: Delay and power trajectory during 4 different multi-objective optimizations for the dynamic full-adder of figure 5.17 (page 101)
Consider, for example, an and gate that always performs the same function, but with different delays or different power dissipations. Simply by swapping one library version (for example one optimized only for the delay) with another (for example one optimized taking into account the power consumption), the designer can develop several versions of the same project with different performance.
7.3 Future works

Possible future developments of this work are:
Noise problems  Adding another target to the optimization policies: the noise ([18]) of a circuit. This is a complex field, and a good starting point could be the development of a noise model of a CMOS circuit.
Interconnections  A simpler task could be to take into account the influence of interconnections in the optimization. This means both including a model of the interconnections in the cell and optimizing the performance of the whole structure.
Topology extensions  The optimizer can be extended to optimize structures other than the standard cells (both static and dynamic), for example memory cells or pass-logic gates. This mainly means modifying the algorithm that automatically searches for all the critical paths in a circuit, to adapt it to different topologies. In any case, the optimizer already offers the possibility of listing the critical paths by hand and performing the optimization on these paths.
CAD integration  The optimizer could be integrated in a standard CAD tool that assists the designer in developing an ASIC from high-level specifications down to layout level. One step of this flow could be the optimization of the library employed in the project.
APPENDIX
Appendix A
CLASS GRAPH
Class Graph

CircuitNetList
  > Circuit
OptimizationAlgorithm
  > TestEval
  > Slop2
  > Slop
  > Powell
  > Anneal
EvaluationAlgorithm
  > TestOpt
  > Hspice
  > Fast
Options
CPNode
CritPathList
TransistorNode
TransistorList
CapacitorList
Node
NodeList
Appendix B
SOURCE CODE
B.1 Main functions
CPNode.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
///
CPNode::CPNode() :
VALID( 0 ), NodeIn( 0 ), NodeOut( 0 ), NumTranList( 0 ),
ActiveInputs( 0 ), NoActiveInputs( 0 ), InitialConditions( 0 ),
ActiveInputsIter( 0 ), NoActiveInputsIter( 0 ), InitialConditionsIter( 0 ),
next( 0 )
{
for ( unsigned int i = 0; i < MAXCHAIN; i++ )
{
TransistorNameList[ i ] = 0;
TransistorNameListIter[ i ] = 0;
NumTranN[ i ] = 0;
NumTranP[ i ] = 0;
}
}
///
CPNode::~CPNode()
{
NodeValueList* tmp;
while ( ActiveInputs )
{
tmp = ActiveInputs->next;
delete ActiveInputs;
ActiveInputs = tmp;
}
while ( NoActiveInputs )
{
tmp = NoActiveInputs->next;
delete NoActiveInputs;
NoActiveInputs = tmp;
}
while ( InitialConditions )
{
tmp = InitialConditions->next;
delete InitialConditions;
InitialConditions = tmp;
}
TrList* tmp2;
for ( unsigned int i = 0; i < MAXCHAIN; i++ )
while ( TransistorNameList[ i ] )
{
tmp2 = ( TransistorNameList[ i ] ) ->next;
delete TransistorNameList[ i ];
TransistorNameList[ i ] = tmp2;
}
}
///
int CPNode::InsNodeIn( unsigned int Node, TransitionType T, double Time )
{
if ( NodeIn )
/// ERROR, yet inserted
return NOT_FOUND;
NodeIn = Node;
TransitionIn = T;
InTime = Time;
return OK;
}
///
int CPNode::InsNodeOut( unsigned int Node, TransitionType T )
{
if ( NodeOut )
/// ERROR, yet inserted
return NOT_FOUND;
NodeOut = Node;
TransitionOut = T;
return OK;
}
///
int CPNode::InsActIn( unsigned int Node, double Val )
{
NodeValueList * tmp;
if ( !ActiveInputs )
{
ActiveInputs = new NodeValueList;
if ( !ActiveInputs )
return NO_MEM;
ActiveInputs->next = 0;
}
else
{
tmp = new NodeValueList;
if ( !tmp )
return NO_MEM;
tmp->next = ActiveInputs;
ActiveInputs = tmp;
}
ActiveInputs->node = Node;
ActiveInputs->value = Val;
ActiveInputsIter = ActiveInputs;
return OK;
}
///
int CPNode::InsNoActIn( unsigned int Node, double Val )
{
    // push the new (node, value) pair at the head of the list
    NodeValueList* tmp;
    if ( !NoActiveInputs )
    {
        NoActiveInputs = new NodeValueList;
        if ( !NoActiveInputs )
            return NO_MEM;
        NoActiveInputs->next = 0;
    }
    else
    {
        tmp = new NodeValueList;
        if ( !tmp )
            return NO_MEM;
        tmp->next = NoActiveInputs;
        NoActiveInputs = tmp;
    }
    NoActiveInputs->node = Node;
    NoActiveInputs->value = Val;
    NoActiveInputsIter = NoActiveInputs;
    return OK;
}
///
int CPNode::InsIniCond( unsigned int Node, double Val )
{
    // push the new (node, value) pair at the head of the list
    NodeValueList* tmp;
    if ( !InitialConditions )
    {
        InitialConditions = new NodeValueList;
        if ( !InitialConditions )
            return NO_MEM;
        InitialConditions->next = 0;
    }
    else
    {
        tmp = new NodeValueList;
        if ( !tmp )
            return NO_MEM;
        tmp->next = InitialConditions;
        InitialConditions = tmp;
    }
    InitialConditions->node = Node;
    InitialConditions->value = Val;
    InitialConditionsIter = InitialConditions;
    return OK;
}
///
int CPNode::InsTran( const char* name, TransistorType TR, unsigned int index )
{
TrList * tmp;
TrList* tail;
if ( !TransistorNameList[ index ] )
{
TransistorNameList[ index ] = new TrList;
if ( !TransistorNameList[ index ] )
return NO_MEM;
( TransistorNameList[ index ] ) ->next = 0;
TransistorNameListIter[ index ] = TransistorNameList[ index ];
tail = TransistorNameList[ index ];
}
else
{
tmp = new TrList;
tail = TransistorNameList[ index ];
if ( !tmp )
return NO_MEM;
tmp->next = 0;
while ( tail->next )
tail = tail->next;
tail->next = tmp;
tail = tmp;
}
tail->name = new char[ strlen( name ) + 1 ];
if ( !( tail->name ) )
return NO_MEM;
strcpy( tail->name, name );
if ( TR == NMOS )
NumTranN[ index ] ++;
else if ( TR == PMOS )
NumTranP[ index ] ++;
else
return NOT_FOUND;
return OK;
}
///
int CPNode::TraverseActiveInputs( unsigned int& Node, double& value ) const
{
CPNode * const localThis = ( CPNode * const ) this;
if ( ActiveInputsIter )
{
Node = ActiveInputsIter->node;
value = ActiveInputsIter->value;
localThis->ActiveInputsIter = ActiveInputsIter->next;
return 1;
}
else
localThis->ActiveInputsIter = ActiveInputs;
return 0;
}
///
int CPNode::TraverseNoActiveInputs( unsigned int& Node, double& value ) const
{
    // the iterator is updated through a non-const alias of this
    CPNode * const localThis = ( CPNode * const ) this;
    if ( NoActiveInputsIter )
    {
        Node = NoActiveInputsIter->node;
        value = NoActiveInputsIter->value;
        localThis->NoActiveInputsIter = NoActiveInputsIter->next;
        return 1;
    }
    else
        localThis->NoActiveInputsIter = NoActiveInputs;
    return 0;
}
///
int CPNode::TraverseInitialConditions( unsigned int& Node, double& value ) const
{
    // the iterator is updated through a non-const alias of this
    CPNode * const localThis = ( CPNode * const ) this;
    if ( InitialConditionsIter )
    {
        Node = InitialConditionsIter->node;
        value = InitialConditionsIter->value;
        localThis->InitialConditionsIter = InitialConditionsIter->next;
        return 1;
    }
    else
        localThis->InitialConditionsIter = InitialConditions;
    return 0;
}
///
const char* CPNode::TraverseTransistorNameList( unsigned int index = 0 ) const
{
    // the iterator is updated through a non-const alias of this
    CPNode * const localThis = ( CPNode * const ) this;
    if ( TransistorNameListIter[ index ] )
    {
        // the caller owns the returned copy of the name
        char * name = new char[ strlen( ( TransistorNameListIter[ index ] ) ->name ) + 1 ];
        if ( !name )
            return 0;
        strcpy( name, ( TransistorNameListIter[ index ] ) ->name );
        localThis->TransistorNameListIter[ index ] = ( TransistorNameListIter[ index ] ) ->next;
        return name;
    }
    else
        localThis->TransistorNameListIter[ index ] = TransistorNameList[ index ];
    return 0;
}
///
const char* CPNode::TransistorName( unsigned int pathIndex, unsigned int index = 0 ) const
{
    TrList * tmp = TransistorNameList[ index ];
    // walk pathIndex links from the head of the list
    for ( unsigned int i = pathIndex; i > 0; i-- )
        if ( tmp )
            tmp = tmp->next;
        else
            return 0;
    // tmp is null when pathIndex equals the list length
    return tmp ? tmp->name : 0;
}
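The Traverse... member functions of CPNode behave as self-resetting cursors: each call returns 1 and yields the next element, and the call that reaches the end returns 0 and rewinds the cursor so that a later pass starts again from the head. The stand-alone C++ sketch below illustrates that calling convention only; ValueCursor, traverse and sumValues are hypothetical names built on std::vector and are not part of the optimizer's classes (in particular, the sketch keeps insertion order, whereas the lists above insert at the head).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one of CPNode's lists plus its iterator.
class ValueCursor {
public:
    void insert(unsigned int node, double value)
    {
        nodes_.push_back(node);
        values_.push_back(value);
    }

    // Returns 1 and fills node/value while elements remain; returns 0 and
    // rewinds at the end, so that the next call starts over.
    int traverse(unsigned int& node, double& value)
    {
        if (pos_ < nodes_.size()) {
            node = nodes_[pos_];
            value = values_[pos_];
            ++pos_;
            return 1;
        }
        pos_ = 0;
        return 0;
    }

private:
    std::vector<unsigned int> nodes_;
    std::vector<double> values_;
    std::size_t pos_ = 0;
};

// Typical caller: visit every element in one pass.
double sumValues(ValueCursor& c)
{
    unsigned int node = 0;
    double value = 0.0;
    double sum = 0.0;
    while (c.traverse(node, value))
        sum += value;
    return sum;
}
```

Because the cursor rewinds itself, sumValues can be called twice in a row on the same object and returns the same result both times; a loop that exits early, however, leaves the cursor in the middle of the list.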
CapInsert.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_devices.h"
///
int CapacitorList::Insert( unsigned int node1, unsigned int node2, double val )
{
Capacitance * tmp;
if ( !head )
{
head = new Capacitance;
if ( !head )
return NO_MEM;
head->next = 0;
}
else
{
tmp = new Capacitance;
if ( !tmp )
return NO_MEM;
tmp->next = head;
head = tmp;
}
head->node1 = node1;
head->node2 = node2;
head->val = val;
NumCap++;
return OK;
}
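Because Insert always links the new element in front of head, the list ends up in reverse insertion order, so index 0 refers to the most recently inserted capacitance. The sketch below reproduces that behaviour with std::forward_list; the Cap struct and the helpers insertCap and capAt are hypothetical stand-ins, not the classes of the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <iterator>

// Hypothetical stand-in for the Capacitance record.
struct Cap {
    unsigned int node1;
    unsigned int node2;
    double val;
};

// Insert at the head of the list, as the Insert member function does.
void insertCap(std::forward_list<Cap>& list,
               unsigned int node1, unsigned int node2, double val)
{
    list.push_front(Cap{node1, node2, val});
}

// Walk `index` links from the head; index must be smaller than the length.
const Cap& capAt(const std::forward_list<Cap>& list, unsigned int index)
{
    assert(static_cast<std::ptrdiff_t>(index) <
           std::distance(list.begin(), list.end()));
    std::forward_list<Cap>::const_iterator it = list.begin();
    std::advance(it, index);
    return *it;
}
```

After inserting a capacitance on node 1 and then one on node 2, capAt(list, 0) returns the node 2 entry. Code that relies on netlist order therefore has to index from the end.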
CapacitorList.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_devices.h"
///
CapacitorList::CapacitorList() : NumCap( 0 ), head( 0 )
{}
///
CapacitorList::~CapacitorList()
{
Capacitance* tmp;
while ( head )
{
tmp = head->next;
delete head;
head = tmp;
}
}
///
const Capacitance& CapacitorList::operator[]( unsigned int index ) const
{
    // valid indices are 0 .. NumCap - 1
    if ( index >= NumCap )
        error( NOT_FOUND, 0, "Index out of bound in [CapacitorList]..." );
    unsigned int i = index;
    Capacitance* tmp = head;
    while ( i-- )
        tmp = tmp->next;
    return *tmp;
}
///
Capacitance& CapacitorList::operator[]( unsigned int index )
{
    // valid indices are 0 .. NumCap - 1
    if ( index >= NumCap )
        error( NOT_FOUND, 0, "Index out of bound in [CapacitorList]..." );
    unsigned int i = index;
    Capacitance* tmp = head;
    while ( i-- )
        tmp = tmp->next;
    return *tmp;
}
Circuit.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
///
Circuit::Circuit( const char* FileNetList, const Options& options ) :
CircuitNetList( FileNetList, options )
{
print_log( "Creating circuit graph..." );
}
///
Circuit::~Circuit()
{}
CircuitNetList.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
///
CircuitNetList::CircuitNetList( const char *FileNetList, const Options& options ) :
Val( 0.0 ), ValNode( 0 )
{
print_log( "Creating transistors list..." );
char *FileIn = new char[ strlen( FileNetList ) + 1 ];
if ( !FileIn )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );
error( NO_MEM, errno, "HEY! " );
}
strcpy( FileIn, FileNetList );
FileNetOut = new char[ strlen( FileNetList ) + strlen( NetListSuffix ) + 1 ];
if ( !FileNetOut )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );
error( NO_MEM, errno, "HEY! " );
}
strcpy( FileNetOut, FileNetList );
strcat( FileNetOut, NetListSuffix );
if ( int RetCode = PreProcess( FileNetList, options.NamemosN(), options.NamemosP() ) )
{
print_log( ReturnMessage[ RetCode ] );
error( RetCode, errno, "HEY! " );
}
delete[] FileIn;
}
///
CircuitNetList::~CircuitNetList()
{
delete[] FileNetOut;
}
///
const TransistorNode& CircuitNetList::operator[]( unsigned int index ) const
{
if ( index >= GetNTran() )
error( NOT_FOUND, 0, "Index out of bound in [Circuit]..." );
return TranList[ index ];
}
///
const TransistorNode& CircuitNetList::operator[]( const char* name ) const
{
return TranList[ name ];
}
///
int CircuitNetList::TranPos( const char* name ) const
{
unsigned int index = 0;
unsigned int NT = GetNTran();
while ( index < NT )
{
if ( !strcasecmp( name, TranList[ index ].DevName() ) )
return TranList[ index ].Index();
index++;
}
return -1;
}
CircuitNetlistParse.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
///
int CircuitNetList::ParseMosLine( char *line, char *line2, const char* mosn, const char* mosp )
{
    char tmpstr[ 128 ];
    char parsestr[ 128 ];
    char endpar[ 128 ];
    char mos[ 8 ];
    char type[ 16 ];
    char par[ 16 ];
    char lstr[ 16 ];
    unsigned int n1, n2, n3, n4;
    TransistorType Type;
    double W, L;
    strcpy( parsestr, "%s %u %u %u %u %s" );
    strcpy( endpar, " " );
    // a MOS card is: name, four node numbers, model name, then parameters
    if ( sscanf( line, parsestr, mos, &n1, &n2, &n3, &n4, type ) == 6 )
    {
        unsigned int nw = 0;
        unsigned int nl = 0;
        sprintf( line2, "%s %u %u %u %u %s ", mos, n1, n2, n3, n4, type );
        if ( !strcasecmp( mosn, type ) )
            Type = NMOS;
        else if ( !strcasecmp( mosp, type ) )
            Type = PMOS;
        else
            return PARSE_ERROR;
        // parsestr skips the fields already read and captures the next one
        strcpy( parsestr, "%*s %*u %*u %*u %*u %*s" );
        strcpy( tmpstr, parsestr );
        strcat( parsestr, " %s" );
        // scan the parameters one at a time until both W and L are found
        while ( ( sscanf( line, parsestr, par ) == 1 ) && ( nl * nw == 0 ) )
        {
            unsigned int npar = 0;
            if ( ( par[ 0 ] == 'w' ) || ( par[ 0 ] == 'W' ) )
                nw = 1;
            else if ( ( par[ 0 ] == 'l' ) || ( par[ 0 ] == 'L' ) )
                nl = 1;
            else
            {
                // any other parameter is moved to a continuation line
                npar++;
                if ( npar == 1 )
                    strcat( endpar, " \n+" );
                strcat( endpar, " " );
                strcat( endpar, par );
            }
            if ( nw == 1 )
            {
                unsigned int count = 0;
                while ( !isdigit( par[ count++ ] ) );
                sscanf( &par[ --count ], "%lf%*c", &W );
                strcat( line2, par );
                nw++;
            }
            if ( nl == 1 )
            {
                unsigned int count = 0;
                strcpy( lstr, par );
                while ( !isdigit( par[ count++ ] ) );
                sscanf( &par[ --count ], "%lf%*c", &L );
                nl++;
            }
            if ( nw * nl )
            {
                // W is always written before L in the output line
                strcat( line2, " " );
                strcat( line2, lstr );
            }
            // advance the format so that the next sscanf reads the following parameter
            strcat( tmpstr, " %*s" );
            strcpy( parsestr, tmpstr );
            strcat( parsestr, " %s" );
        }
        // copy the remaining parameters unchanged
        while ( sscanf( line, parsestr, par ) == 1 )
        {
            strcat( line2, " " );
            strcat( line2, par );
            strcat( tmpstr, " %*s" );
            strcpy( parsestr, tmpstr );
            strcat( parsestr, " %s" );
        }
        strcat( line2, " " );
        strcat( line2, endpar );
        if ( TranList.Insert( mos, W, L, Type, n1, n2, n3 ) )
            return NO_MEM;
        return OK;
    }
    return PARSE_ERROR;
}
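As a simplified illustration of the sscanf-based approach used by ParseMosLine, the sketch below extracts the device name, the four node numbers, the model name and the W and L values from a single SPICE MOS card. It assumes that the parameters are written as w=<number> and l=<number>, and it ignores any scale suffix after the number (much as the %lf%*c conversion in the listing does). MosCard and parseMosCard are hypothetical names, and the sketch does not rewrite the line as the original function does.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <cstring>

// Hypothetical record for one parsed MOS card.
struct MosCard {
    char name[8];
    unsigned int n1, n2, n3, n4;   // node numbers in netlist order
    char model[16];
    double w, l;                   // numeric part only, suffix ignored
};

// Returns true when name, nodes, model, W and L were all found.
bool parseMosCard(const char* line, MosCard& out)
{
    assert(line != 0);
    int consumed = 0;
    if (std::sscanf(line, "%7s %u %u %u %u %15s%n", out.name, &out.n1,
                    &out.n2, &out.n3, &out.n4, out.model, &consumed) != 6)
        return false;

    bool haveW = false;
    bool haveL = false;
    const char* p = line + consumed;
    char par[32];
    int used = 0;
    // read the remaining whitespace-separated parameters one by one
    while (std::sscanf(p, " %31s%n", par, &used) == 1) {
        p += used;
        const int key = std::tolower(static_cast<unsigned char>(par[0]));
        const char* eq = std::strchr(par, '=');
        if (!eq)
            continue;
        if (key == 'w' && std::sscanf(eq + 1, "%lf", &out.w) == 1)
            haveW = true;
        else if (key == 'l' && std::sscanf(eq + 1, "%lf", &out.l) == 1)
            haveL = true;
    }
    return haveW && haveL;
}
```

The %n conversion records how many characters were consumed, which avoids rebuilding a longer and longer format string for every parameter; this is a design choice of the sketch, not of the original code.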
CircuitNetlistPreprocess.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
///
int CircuitNetList::PreProcess( const char* FileNetList, const char* NameMosN, const char* NameMosP )
{
    char line[ 1024 ];
    char line2[ 1024 ];
    char command[ 32 ];
    ifstream i_file( FileNetList );
    ofstream o_file( FileNetOut );
    int ToBeCopied;
    if ( !i_file )
        return NOT_FOUND;
    if ( !o_file )
        return NOT_FOUND;
    while ( i_file.getline( line, 1023 ) )
    {
        int c = 0;
        ToBeCopied = 1;
        // skip leading blanks and dispatch on the first character of the card
        while ( isspace( line[ c++ ] ) );
        switch ( line[ --c ] )
        {
        case '.':
            // analysis cards (.tran, .dc, .ac) are commented out
            sscanf( &line[ c + 1 ], "%s", command );
            ToBeCopied = strcasecmp( command, "tran" ) && \
                         strcasecmp( command, "dc" ) && \
                         strcasecmp( command, "ac" );
            if ( !ToBeCopied )
            {
                strcpy( line2, "***** " );
                strcat( line2, &line[ c ] );
            }
            break;
        case 'v':
        case 'V':
            // only the supply source (vdd, vcc, val) is kept, rewritten as vdd
            sscanf( &line[ c + 1 ], "%s", command );
            ToBeCopied = !( strcasecmp( command, "dd" ) && \
                            strcasecmp( command, "cc" ) && \
                            strcasecmp( command, "al" ) );
            if ( ToBeCopied )
            {
                int node2;
                ToBeCopied = 0;
                sscanf( &line[ c ], "%*s %d %d %*s %lf", &ValNode, &node2, &Val );
                sprintf( line2, "vdd %d %d dc %g ", ValNode, node2, Val );
            }
            else
            {
                strcpy( line2, "* " );
                strcat( line2, &line[ c ] );
            }
            break;
        case 'm':
        case 'M':
        case 'x':
        case 'X':
            ToBeCopied = ParseMosLine( &line[ c ], line2, NameMosN, NameMosP );
            break;
        case 'c':
        case 'C':
            {
                // fixed capacitances are stored and the card is copied as is
                unsigned int node1, node2;
                double val;
                sscanf( &line[ c ], "%*s %u %u %lg", &node1, &node2, &val );
                if ( CapList.Insert( node1, node2, val ) != OK )
                {
                    i_file.close();
                    o_file.close();
                    return NO_MEM;
                }
            }
            break;
        default:
            break;
        }
        if ( ToBeCopied == 0 )
            o_file << line2 << endl;
        else
            o_file << &line[ c ] << endl;
    }
    o_file.close();
    i_file.close();
    if ( Val <= 0.0 )
    {
        print_log( "Error: no|wrong VDD defined" );
        return NOT_FOUND;
    }
    return OK;
}
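PreProcess comments out the analysis cards (.tran, .dc, .ac) found in the input netlist. The test on the command word can be sketched in isolation as below; isAnalysisCard is a hypothetical helper, and the case-insensitive comparison is done by lowercasing the word instead of calling strcasecmp, so that the sketch needs only the standard library.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// True when `line`, after leading blanks, is a .tran, .dc or .ac card.
bool isAnalysisCard(const std::string& line)
{
    std::size_t c = 0;
    while (c < line.size() &&
           std::isspace(static_cast<unsigned char>(line[c])))
        ++c;
    if (c == line.size() || line[c] != '.')
        return false;

    // collect the command word that follows the dot, in lowercase
    std::string cmd;
    for (++c; c < line.size() &&
              !std::isspace(static_cast<unsigned char>(line[c])); ++c)
        cmd += static_cast<char>(
            std::tolower(static_cast<unsigned char>(line[c])));

    return cmd == "tran" || cmd == "dc" || cmd == "ac";
}
```

Other dot cards, such as a .model line, are left untouched by this test, which matches the behaviour of the switch in PreProcess.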
CircuitPrint.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
///
void Circuit::PrintResult( unsigned long int Step,
                           unsigned int NT,
                           unsigned int NP,
                           const double* NewWidth,
                           const double* CPDelay,
                           const double* CPPower,
                           const double *CPNoise,
                           double Area,
                           double maxT,
                           double maxP,
                           double maxN,
                           double f,
                           double fLast ) const
{
    char log[ 1024 ], tmp[ 1024 ];
    if ( Step == 1 )
    {
        // first step: create the log files and write the column headers
        ofstream o_file( "RESULT.log" );
        ofstream o_fileW( "RESULT_W.log" );
        ofstream o_fileT( "RESULT_T.log" );
        ofstream o_fileP( "RESULT_P.log" );
        if ( !o_file )
        {
            print_log( "Warning: can't create file RESULT.log" );
            return ;
        }
        sprintf( log, "# Step " );
        strcat( log, "Norm_N(W[]) " );
        strcat( log, "OptFunc " );
        strcat( log, "Error " );
        strcat( log, "Max(T[]) " );
        strcat( log, "Max(P[]) " );
        strcat( log, "Max(N[]) " );
        strcat( log, " A " );
        o_file << log << endl;
        for ( unsigned int i = 0; i < NT; i++ )
        {
            sprintf( tmp, "W[%u] ", i );
            strcat( log, tmp );
        }
        o_fileW << log << endl;
        for ( unsigned int i = 0; i < NP; i++ )
        {
            sprintf( tmp, "T[%u] ", i );
            strcat( log, tmp );
        }
        o_fileT << log << endl;
        for ( unsigned int i = 0; i < NP; i++ )
        {
            sprintf( tmp, "P[%u] ", i );
            strcat( log, tmp );
        }
        o_fileP << log << endl;
        for ( unsigned int i = 0; i < NP; i++ )
        {
            sprintf( tmp, "N[%u] ", i );
            strcat( log, tmp );
        }
        o_file.close();
        o_fileW.close();
        o_fileP.close();
        o_fileT.close();
    }
    // every step: append one line of results to each log file
    ofstream o_file( "RESULT.log", ios::app );
    ofstream o_fileW( "RESULT_W.log", ios::app );
    ofstream o_fileT( "RESULT_T.log", ios::app );
    ofstream o_fileP( "RESULT_P.log", ios::app );
    if ( !o_file )
    {
        print_log( "Warning: can't create file RESULT.log" );
        return ;
    }
    sprintf( log, "%7ld ", Step );
    sprintf( tmp, "%4.3f ", NORM_N( NewWidth, NT ) );
    strcat( log, tmp );
    sprintf( tmp, "%4.3g ", f );
    strcat( log, tmp );
    // relative change of the cost function with respect to the last step, in percent
    sprintf( tmp, "%4.3g ", (f - fLast) / fLast * 100);
    strcat( log, tmp );
    sprintf( tmp, "%4.3f ", maxT );
    strcat( log, tmp );
    sprintf( tmp, "%4.3f ", maxP );
    strcat( log, tmp );
    sprintf( tmp, "%4.3f ", maxN );
    strcat( log, tmp );
    sprintf( tmp, "%4.3f ", Area );
    strcat( log, tmp );
    o_file << log << endl;
    for ( unsigned int i = 0; i < NT; i++ )
    {
        sprintf( tmp, "%4.3f ", NewWidth[ i ] );
        strcat( log, tmp );
    }
    o_fileW << log << endl;
    for ( unsigned int i = 0; i < NP; i++ )
    {
        sprintf( tmp, "%4.3f ", CPDelay[ i ] );
        strcat( log, tmp );
    }
    o_fileT << log << endl;
    for ( unsigned int i = 0; i < NP; i++ )
    {
        sprintf( tmp, "%4.3f ", CPPower[ i ] );
        strcat( log, tmp );
    }
    o_fileP << log << endl;
    for ( unsigned int i = 0; i < NP; i++ )
    {
        sprintf( tmp, "%4.3f ", CPNoise[ i ] );
        strcat( log, tmp );
    }
    o_file.close();
    o_fileW.close();
    o_fileP.close();
    o_fileT.close();
}
///
double NORM_N( const double* V, unsigned int l )
{
double norm = 0.0;
for ( unsigned int i = 0; i < l; i++ )
norm += pow( V[ i ], double( l ) );
norm = pow( norm, double( 1.0 / l ) );
// if(V[i] norm)
// norm = V[i];
return norm;
}
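NORM_N computes the p-norm of the vector with p equal to the number of elements l, that is (V[0]^l + ... + V[l-1]^l)^(1/l). A stand-alone restatement with a worked value follows; normN is a hypothetical name chosen only to avoid clashing with the listing.

```cpp
#include <cassert>
#include <cmath>

// p-norm with p = l (the vector length), as computed by NORM_N.
double normN(const double* v, unsigned int l)
{
    assert(l > 0);
    double norm = 0.0;
    for (unsigned int i = 0; i < l; ++i)
        norm += std::pow(v[i], static_cast<double>(l));
    return std::pow(norm, 1.0 / l);
}
```

For the two widths 3 and 4 the result is the Euclidean length, sqrt(9 + 16) = 5. For vectors of positive values with many elements, the result lies close to the largest element.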
CircuitTranListNode.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
///
int Circuit::TransistorListNode(unsigned int node, TransistorList& TList, unsigned int& n, unsigned int& p) const
{
    // find all the transistors with source or drain connected to node
    // and return a list, plus the number of n and p connected
    n = p = 0;
    unsigned int NT = GetNTran();
    for ( unsigned int i = 0; i < NT; i++ )
    {
        if ( ( TranList[ i ].Source() == node ) ||
             ( TranList[ i ].Drain() == node ) )
        {
            TList.Insert((TranList[i]).DevName(), (TranList[i]).Width(),
                         (TranList[i]).Length(), (TranList[i]).TrType(),
                         (TranList[i]).Source(), (TranList[i]).Gate(),
                         (TranList[i]).Drain());
            if ((TranList[i]).TrType() == NMOS)
                n++;
            else if ((TranList[i]).TrType() == PMOS)
                p++;
        }
    }
    return OK;
}
CircuitWidth.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
///
double Circuit::JunctionNWidth( unsigned int node, int& number, const double* NewWidth = 0 ) const
{
    // find all the nmos transistors with source or drain
    // connected to node and return the sum of widths
    unsigned int NT = GetNTran();
    number = 0;
    double W = 0.0;
    if ( node == 0 )
        return 0.0;
    for ( unsigned int i = 0; i < NT; i++ )
    {
        if ( ( TranList[ i ].Source() == node ) ||
             ( TranList[ i ].Drain() == node ) )
            if ( TranList[ i ].TrType() == NMOS )
            {
                // use the stored width unless a new width vector is given
                if ( !NewWidth )
                    W += TranList[ i ].Width();
                else
                    W += NewWidth[ TranPos( TranList[ i ].DevName() ) ];
                number++;
            }
    }
    return W;
}
///
double Circuit::GateNWidth( unsigned int node, int& number, const double* NewWidth = 0 ) const
{
    // find all the nmos transistors with gate
    // connected to node and return the sum of widths
    unsigned int NT = GetNTran();
    double W = 0.0;
    number = 0;
    if ( node == 0 )
        return 0.0;
    for ( unsigned int i = 0; i < NT; i++ )
    {
        if ( TranList[ i ].TrType() == NMOS )
            if ( TranList[ i ].Gate() == node )
            {
                if ( !NewWidth )
                    W += TranList[ i ].Width();
                else
                    W += NewWidth[ TranPos( TranList[ i ].DevName() ) ];
                number++;
            }
    }
    return W;
}
///
double Circuit::JunctionPWidth( unsigned int node, int& number, const double* NewWidth = 0 ) const
{
    // find all the pmos transistors with source or drain
    // connected to node and return the sum of widths
    unsigned int NT = GetNTran();
    double W = 0.0;
    number = 0;
    if ( node == 0 )
        return 0.0;
    for ( unsigned int i = 0; i < NT; i++ )
    {
        if ( ( TranList[ i ].Source() == node ) ||
             ( TranList[ i ].Drain() == node ) )
            if ( TranList[ i ].TrType() == PMOS )
            {
                if ( !NewWidth )
                    W += TranList[ i ].Width();
                else
                    W += NewWidth[ TranPos( TranList[ i ].DevName() ) ];
                number++;
            }
    }
    return W;
}
///
double Circuit::GatePWidth( unsigned int node, int& number, const double* NewWidth = 0 ) const
{
    // find all the pmos transistors with gate
    // connected to node and return the sum of widths
    unsigned int NT = GetNTran();
    double W = 0.0;
    number = 0;
    if ( node == 0 )
        return 0.0;
    for ( unsigned int i = 0; i < NT; i++ )
    {
        if ( TranList[ i ].TrType() == PMOS )
            if ( TranList[ i ].Gate() == node )
            {
                if ( !NewWidth )
                    W += TranList[ i ].Width();
                else
                    W += NewWidth[ TranPos( TranList[ i ].DevName() ) ];
                number++;
            }
    }
    return W;
}
///
double Circuit::CapStaticGnd( unsigned int node, int& number ) const
{
    // find all the fixed capacitances with a ground terminal
    // and connected to node and return the sum of them
    unsigned int NC = GetNCap();
    double C = 0.0;
    number = 0;
    if ( node == 0 )
        return 0.0;
    for ( unsigned int i = 0; i < NC; i++ )
    {
        if ( ( CapList[ i ].node1 == 0 ) &&
             ( CapList[ i ].node2 == node ) )
        {
            C += CapList[ i ].val;
            number++;
        }
        else if ( ( CapList[ i ].node2 == 0 ) &&
                  ( CapList[ i ].node1 == node ) )
        {
            C += CapList[ i ].val;
            number++;
        }
    }
    return C;
}
///
double Circuit::CapStaticVdd( unsigned int node, int& number ) const
{
    // find all the fixed capacitances with a vdd terminal
    // and connected to node and return the sum of them
    unsigned int NC = GetNCap();
    double C = 0.0;
    number = 0;
    if ( node == 0 )
        return 0.0;
    for ( unsigned int i = 0; i < NC; i++ )
    {
        if ( ( CapList[ i ].node1 == ValNode ) &&
             ( CapList[ i ].node2 == node ) )
        {
            C += CapList[ i ].val;
            number++;
        }
        else if ( ( CapList[ i ].node2 == ValNode ) &&
                  ( CapList[ i ].node1 == node ) )
        {
            C += CapList[ i ].val;
            number++;
        }
    }
    return C;
}
Critic.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "main.h"
///
int Critic(const Circuit& circuit,

CritPathList& pathList,
const Options& options)
{
Node nodeInputList;
/// search primary input
print_log("Searching primary input...");
unsigned int Nt = circuit.GetNTran();
for (unsigned int i = 0; i < Nt; i++)
{
unsigned int gate = (circuit[i]).Gate();
unsigned int Pin = 1;
for (unsigned int j = 0; j < Nt; j++)
{
if (i != j)
{
for (unsigned int k = 0; k < nodeInputList.GetNumNode(); k++)
if ( (nodeInputList[k]).node == gate)
Pin = 0;
if ( ((circuit[j]).Drain() == gate) ||
((circuit[j]).Source() == gate))
Pin = 0;
}
}
if (Pin)
nodeInputList.Insert(gate);
}
#ifdef DEBUG
cerr << endl << " Primary input List: " << (nodeInputList[0]).node << " ";
#endif
char log[1024];
char log2[16];
sprintf(log, "Primary input list: %u", (nodeInputList[0]).node);
for (unsigned int i = 1; i < nodeInputList.GetNumNode(); i++)
{
sprintf(log2, " -- %u", (nodeInputList[i]).node);
strcat(log, log2);
#ifdef DEBUG
cerr << (nodeInputList[i]).node << " ";
#endif
}
#ifdef DEBUG
cerr << endl;
#endif
print_log(log);
NodeList nodeListGnd;
NodeList nodeListVdd;
int RetCode1 = nodeListGnd.Create();
int RetCode2 = nodeListVdd.Create();
if ( (RetCode1 != OK) || (RetCode2 != OK) )
return (RetCode1 != OK ? RetCode1 : RetCode2);
#ifdef DEBUG
cerr << endl << "Creating critical Path with gnd..." << endl;
#endif
RetCode1 = CriticRecurse(circuit, 0, nodeListGnd);
if ((RetCode1 != OK) && (RetCode1 != CONT))
return RetCode1;
#ifdef DEBUG
cerr << endl << "Creating critical Path with vdd..." << endl;
#endif
unsigned int val = circuit.ValimNode();
RetCode2 = CriticRecurse(circuit, val, nodeListVdd);
if ( (RetCode2 != OK) && (RetCode2 != CONT) )
return RetCode2;
Node gateInternalList;
for (unsigned int i = 0; i < nodeListGnd.GetNumList(); i++)
{
unsigned int Gin = 1;
unsigned int nn = (nodeListGnd[i]).GetNumNode();
unsigned int new_node = ((nodeListGnd[i])[nn - 1]).node;
for (unsigned int k = 0; k < gateInternalList.GetNumNode(); k++)

if ( (gateInternalList[k]).node == new_node)
Gin = 0;
if (Gin)
gateInternalList.Insert(new_node, -1);
}
for (unsigned int i = 0; i < nodeListVdd.GetNumList(); i++)
{
unsigned int Gin = 1;
unsigned int nn = (nodeListVdd[i]).GetNumNode();
unsigned int new_node = ((nodeListVdd[i])[nn - 1]).node;
for (unsigned int k = 0; k < gateInternalList.GetNumNode(); k++)
if ( (gateInternalList[k]).node == new_node)
Gin = 0;
if (Gin)
gateInternalList.Insert(new_node, -1);
}
#ifdef DEBUG
cerr << endl << " Gate Internal List: " << (gateInternalList[0]).node << " ";
#endif
sprintf(log, "Internal gate list: %u", (gateInternalList[0]).node);
for (unsigned int i = 1; i < gateInternalList.GetNumNode(); i++)
{
sprintf(log2, " -- %u", (gateInternalList[i]).node);
strcat(log, log2);
#ifdef DEBUG
cerr << (gateInternalList[i]).node << " ";
#endif
}
#ifdef DEBUG
cerr << endl;
#endif
print_log(log);
int RetCode = SearchCriticalPath(circuit, pathList, nodeListGnd, nodeListVdd, nodeInputList, gateInternalList, options);
if (RetCode != OK)
return RetCode;
#ifdef DEBUG
cerr << endl << pathList.GetNumPath() << " CRITICAL PATHS ";
#endif
for (unsigned int i = 0; i < pathList.GetNumPath(); i++)
{
sprintf(log, "#%u) Node_in: %u (%s=%g ps) Node_Out: %u (%s)", i,
(pathList[i]).GetNodeIn(), TransitionString[(pathList[i]).GetTransitionIn()],
(pathList[i]).GetInTime(), (pathList[i]).GetNodeOut(),
TransitionString[(pathList[i]).GetTransitionOut()]);
print_log(log);
#ifdef DEBUG
cerr << endl << log;
#endif
sprintf(log, "\t\t Tran_List ");
for (unsigned int j = 0; j < (pathList[i]).GetNumListTran(); j++)
{
const char* name;
while ((name = (pathList[i]).TraverseTransistorNameList(j)) != 0)
{
strcat(log, name);
strcat(log, " ");
}
strcat(log, " / ");
}
print_log(log);
#ifdef DEBUG
cerr << endl << log << endl;
#endif
unsigned int node;
double val;
sprintf(log, "\t\t Active_Inputs: ");
char tmpstr[1024];
while ((pathList[i]).TraverseActiveInputs(node, val))
{
sprintf(tmpstr, " v(%u)= %g V -- ", node, val);

strcat(log, tmpstr);
}
print_log(log);
#ifdef DEBUG
cerr << endl << log << endl;
#endif
sprintf(log, "\t\t No_Active_Inputs: ");
while ((pathList[i]).TraverseNoActiveInputs(node, val))
{
sprintf(tmpstr, " v(%u)= %g V -- ", node, val);
strcat(log, tmpstr);
}
print_log(log);
#ifdef DEBUG
cerr << endl << log << endl;
#endif
sprintf(log, "\t\t Initial Condition: ");
while ((pathList[i]).TraverseInitialConditions(node, val))
{
sprintf(tmpstr, " v(%u)= %g V -- ", node, val);
strcat(log, tmpstr);
}
print_log(log);
#ifdef DEBUG
cerr << endl << log << endl;
#endif
}
return OK;
}
CriticRecurse.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "main.h"
///
int CriticRecurse(const Circuit& circuit,
unsigned int node,
NodeList& node_list)
{
TransistorList TList;
static int level = 0;
unsigned int n = 0;
unsigned int p = 0;
int RetCode;
if ((RetCode = circuit.TransistorListNode(node, TList, n , p)) != OK)
return RetCode;
unsigned int Nt = TList.GetNTran();
if ((RetCode = node_list.InsertNode(node)) != OK)
return RetCode;
if ( (n > 0) && (p > 0))
{
return OK;
}
level++;
#ifdef DEBUG
cerr << "node " << node << ": ";
for (unsigned int i = 0; i < Nt; i++)
cerr << " - " << TList[i].DevName();
for (unsigned int i = 0; i < (node_list[node_list.GetNumList() - 1]).GetNumNode(); i++)
cerr << " - " << ((node_list[node_list.GetNumList() - 1])[i]).node;
cerr << endl;
#endif
unsigned int RecurseYes;
RetCode = 0;
for (unsigned int i = 0; i < Nt; i++)
{
unsigned int new_node;
if ((TList[i]).Source() == node)
new_node = TList[i].Drain();
else if ((TList[i]).Drain() == node)
new_node = TList[i].Source();
RecurseYes = 1;
unsigned int last_node;
#ifdef DEBUG
for (int j = 0; j < level; j++)
cerr << "  ";
cerr << " LEVEL: " << level << " trying " << TList[i].DevName() << endl;
#endif
unsigned int nn = (node_list[node_list.GetNumList() - 1]).GetNumNode();
for (unsigned int j = 0; (j < nn) && RecurseYes; j++)
{
last_node = ((node_list[node_list.GetNumList() - 1])[j]).node;
if (new_node == last_node)
RecurseYes = 0;
}
if (RecurseYes)
{
if ((RetCode = node_list.InsertNode(TList[i].Gate())) != OK)
return RetCode;
int RecurseCode = CriticRecurse(circuit, new_node, node_list);
if (RecurseCode == OK)
{
// at least gnd (or vdd), one gate and one node
if ((RetCode = node_list.Create()) != OK)
return RetCode;
#ifdef DEBUG
cerr << endl << "NODE LIST : ";
#endif
nn = (node_list[node_list.GetNumList() - 2]).GetNumNode();
for (unsigned int j = 0; j < nn - 2; j++)
{
new_node = ((node_list[node_list.GetNumList() - 2])[j]).node;
#ifdef DEBUG
cerr << new_node << " ";
#endif
if ((RetCode = node_list.InsertNode(new_node)) != OK)
return RetCode;
}
#ifdef DEBUG
cerr << ((node_list[node_list.GetNumList() - 2])[nn - 2]).node << " "
<< ((node_list[node_list.GetNumList() - 2])[nn - 1]).node << endl;
#endif
}
else if (RecurseCode == CONT)
{
if ( (RetCode = (node_list[node_list.GetNumList() - 1]).DeleteLevelNode(2 * level - 2)) != OK )
return RetCode;
}
else
return RecurseCode;
}
}
level--;
if (level == 0)
{
unsigned int nl = node_list.GetNumList();
for (unsigned int i = 0; i < nl; i++)
{
unsigned int nn = (node_list[i]).GetNumNode();
if (nn == 1)
{
RetCode = node_list.DeleteList(i);
if (RetCode != OK)
return RetCode;
}
}
}
return CONT;
}
CriticalPath.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
///
CritPathList::CritPathList() : NumPath( 0 ), head( 0 ), tail( 0 )
{
print_log( "Creating critical path list..." );
}
///
CritPathList::~CritPathList()
{
CPNode* tmp;
///
while ( head != 0 )
{
tmp = head->next;
delete head;
head = tmp;
}
}
///
const CPNode& CritPathList::operator[]( unsigned int index ) const
{
CPNode * tmp = head;
if ( index > NumPath )
error( NOT_FOUND, 0, "Index out of bound in [CritPathList]..." );
unsigned int i = index;
while ( i-- )
tmp = tmp->next;
return *tmp;
}
CriticalPathCreate.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"
///
int CritPathList::Create()
{
if ( !head )
{
head = new CPNode;
if ( !head )
return NO_MEM;
tail = head;
}
else
{
tail->next = new CPNode;
if ( !( tail->next ) )
return NO_MEM;
tail = tail->next;
}
tail->next = 0;
return OK;
}
///
int CritPathList::Stamp( unsigned int NumTranList )
{
tail->VALID = 1;
tail->NumTranList = NumTranList;
NumPath++;
return OK;
}
CriticalPathInsert.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
///
int CritPathList::InsertNodeIn( unsigned int NIn, TransitionType Type, double Time )
{
if ( tail )
return tail->InsNodeIn( NIn, Type, Time );
else
return NOT_FOUND;
}
///
int CritPathList::InsertNodeOut( unsigned int NOut, TransitionType Type )
{
if ( tail )
return tail->InsNodeOut( NOut, Type );
else
return NOT_FOUND;
}
///
int CritPathList::InsertActiveInputs( unsigned int Node, double Val )
{
if ( tail )
return tail->InsActIn( Node, Val );
else
return NOT_FOUND;
}
///
int CritPathList::InsertNoActiveInputs( unsigned int Node, double Val )
{
if ( tail )
return tail->InsNoActIn( Node, Val );
else
return NOT_FOUND;
}
///
int CritPathList::InsertInitialCondition( unsigned int Node, double Val )
{
if ( tail )
return tail->InsIniCond( Node, Val );
else
return NOT_FOUND;
}
///
int CritPathList::InsertPathTransistor( const char* name, TransistorType TR, unsigned int index )
{
if ( tail )
return tail->InsTran( name, TR, index );
else
return NOT_FOUND;
}
CriticalPathParse.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
///
int CritPathList::ParseLineCPNode( const char* str, CPCOMMANDOPT NodeType )
{
char tmpType[ 5 ];
unsigned int node;
double time;
sscanf( str, "%*s%u%s", &node, tmpType );
switch ( NodeType )
{
case NODEIN:
sscanf( str, "%*s%*u%*s%lg", &time );
for ( unsigned int i = 0; ( TransitionString[ i ] != 0 ); i++ )
{
if ( !strcasecmp( tmpType, TransitionString[ i ] ) )
return InsertNodeIn( node, ( TransitionType ) i, time );
}
return NOT_FOUND;
break;
case NODEOUT:
for ( unsigned int i = 0; ( TransitionString[ i ] != 0 ); i++ )
{
if ( !strcasecmp( tmpType, TransitionString[ i ] ) )
return InsertNodeOut( node, ( TransitionType ) i );

}
return NOT_FOUND;
break;
default:
return NOT_FOUND;
break;
}
}
///
int CritPathList::ParseLineCPInputs( const char* str, CPCOMMANDOPT InputType )
{
unsigned int node;
unsigned int NumRead;
double val;
char laststr[ 1024 ];
char parsestr[ 1024 ];
const char *tmpstr = " %*u %*lg";
strcpy( parsestr, "%*s" );
NumRead = sscanf( str, "%*s %u %lg", &node, &val );
while ( NumRead == 2 )
{
switch ( InputType )
{
case ACTIVEI:
if ( InsertActiveInputs( node, val ) )
return NOT_FOUND;
break;
case NOACTIVEI:
if ( InsertNoActiveInputs( node, val ) )
return NOT_FOUND;
break;
case IC:
if ( InsertInitialCondition( node, val ) )
return NOT_FOUND;
break;
case CPATH:
case NODEIN:
case NODEOUT:
case TRANLIST:
case ENDCPATH:
case NONECP:
default:
break;
}
strcat( parsestr, tmpstr );
strcpy( laststr, parsestr );
strcat( laststr, " %u %lg" );
NumRead = sscanf( str, laststr, &node, &val );
}
if ( NumRead == 1 )
return PARSE_ERROR;
return OK;
}
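ParseLineCPInputs rescans the line from its beginning on every iteration, each time with a longer format string of skipped fields. A possible alternative, sketched here under the assumption that the line has the form `<keyword> <node> <value> <node> <value> ...`, uses the standard `%n` conversion of `sscanf` to remember how far the previous scan got. The function name `ParsePairs` and its return convention are hypothetical and not part of the original sources.

```cpp
#include <cstdio>
#include <utility>
#include <vector>

// Collect the (node, value) pairs that follow the leading keyword.
// Returns false only when a node number is not followed by a value,
// which is the case the listing above reports as PARSE_ERROR.
bool ParsePairs(const char* str,
                std::vector<std::pair<unsigned int, double> >& out)
{
    int used = 0;
    // Skip the keyword; %n stores the number of characters consumed so far.
    std::sscanf(str, "%*s%n", &used);
    if (used == 0)
        return true;  // empty line: nothing to parse
    const char* p = str + used;
    for (;;) {
        unsigned int node = 0;
        double val = 0.0;
        used = 0;
        const int got = std::sscanf(p, "%u %lg%n", &node, &val, &used);
        if (got != 2)
            return got != 1;  // got == 1 means a dangling node number
        out.push_back(std::make_pair(node, val));
        p += used;  // continue right after the pair just read
    }
}
```

Each field is scanned once, so the cost grows linearly with the length of the line, and no format string has to be assembled at run time.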
///
int CritPathList::ParseLineCPTran( const char* str, const Circuit& circuit, unsigned int index )
{
char laststr[ 1024 ];
char parsestr[ 1024 ];
const char *tmpstr = " %*s";
char name[ 16 ];
strcpy( parsestr, "%*s" );
unsigned int NumRead = sscanf( str, "%*s %s", name );
if ( NumRead != 1 )
return PARSE_ERROR;
while ( NumRead == 1 )
{
if (circuit.TranPos(name) == -1)
return NOT_FOUND;
if ( InsertPathTransistor( name, circuit[ name ].TrType(), index ) )
return NOT_FOUND;
strcat( parsestr, tmpstr );
strcpy( laststr, parsestr );
strcat( laststr, " %s" );
NumRead = sscanf( str, laststr, name );
}
return OK;
}
CriticalPathRead.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "global.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
///
int CritPathList::Read( const char* FileOptions, const Circuit& circuit )
{
ifstream i_file( FileOptions );
char line[ 1024 ];
char command[ 256 ];
if ( !i_file )
return NOT_FOUND;
unsigned int LineNum = 0;
unsigned int NumTranList = 0;
// read the file one line at a time
while ( i_file.getline( line, sizeof( line ) ) )
{
LineNum++;
if ( sscanf( line, "%s ", command ) == 1 )
if ( command[ 0 ] != '#' )
{
int RetCode;
switch ( CPCOMMANDOPT Cc = WhichCommand( command ) )
{
case CPATH:
RetCode = Create();
NumTranList = 0;
break;
case NODEIN:
case NODEOUT:
RetCode = ParseLineCPNode( line, Cc );
break;
case ACTIVEI:
case NOACTIVEI:
case IC:
RetCode = ParseLineCPInputs( line, Cc );
break;
case TRANLIST:
RetCode = ParseLineCPTran( line, circuit, NumTranList );
NumTranList++;
if ( NumTranList >= MAXCHAIN )
RetCode = NO_MEM;
break;
case ENDCPATH:
RetCode = Stamp( NumTranList );
break;
case NONECP:
RetCode = OK;
default:
break;
}
if ( RetCode != OK )
{
sprintf( line, "ERROR reading file %s line %d ", FileOptions, LineNum );
print_log( line );
i_file.close();
return RetCode;
}
}
}
i_file.close();
return OK;
}
EvaluationAlgorithm.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"
///
EvaluationAlgorithm::EvaluationAlgorithm( const CritPathList& pathlist, const Options& options )
:
pathlist( pathlist ), options( options ),
NumPath( 0 ), Calls( 0 ),
CPDelay( 0 ), CPPower( 0 ),
CPNoise( 0 ), Area( 0.0 )
{
print_log( "Creating simulation algorithm..." );
NumPath = pathlist.GetNumPath();
CPDelay = new double[ NumPath ];
CPPower = new double[ NumPath ];
CPNoise = new double[ NumPath ];
if ( !CPDelay || !CPPower || !CPNoise )
{
print_log( ReturnMessage[ NO_MEM ] );
}
}
///
EvaluationAlgorithm::~EvaluationAlgorithm()
{
delete[] CPDelay;
delete[] CPPower;
delete[] CPNoise;
}
Global.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "global.h"
///
GLOBCOMMANDOPT WhichGBOption( const char* option )
{
for ( unsigned int i = 0; (GlobCommandOptions[ i ] != 0 ); i++ )
{
if ( !strcasecmp( option, GlobCommandOptions[ i ] ) )
return ( ( GLOBCOMMANDOPT ) i );
}
return NONEGLOB;
}
///
SIMCOMMANDOPT WhichSimOption( const char* option )
{
for ( unsigned int i = 0; (SimCommandOptions[ i ] != 0 ); i++ )
{
if ( !strcasecmp( option, SimCommandOptions[ i ] ) )
return ( ( SIMCOMMANDOPT ) i );
}
return NONESIM;
}
///
OPTCOMMANDOPT WhichOptOption( const char* option )
{
for ( unsigned int i = 0; (OptCommandOptions[ i ] != 0 ); i++ )
{
if ( !strcasecmp( option, OptCommandOptions[ i ] ) )
return ( ( OPTCOMMANDOPT ) i );
}
return NONEOPT;
}
///
CPCOMMANDOPT WhichCommand( const char* option )
{
for ( unsigned int i = 0; (CPCommandOptions[ i ] != 0 ); i++ )
{
if ( !strcasecmp( option, CPCommandOptions[ i ] ) )
return ( ( CPCOMMANDOPT ) i );
}
return NONECP;
}
#ifndef LINUX
///
void error( int exitCode, int ErrorType, const char* message )
{
cerr << message << "Error " << ErrorType << endl;
if ( exitCode != 0 )
exit( exitCode );
}
#endif
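The four lookup functions above share one pattern: a null-terminated table of keyword strings is scanned case-insensitively, and the index of the match is cast to the matching enumeration, whose last value serves as the "not found" result. A minimal self-contained sketch of that pattern follows. The enumeration `Command`, the table `CommandNames` and their keywords are hypothetical examples, not the tables used by the program; `strcasecmp` is the POSIX function from `<strings.h>`.

```cpp
#include <strings.h>  // POSIX strcasecmp

// Hypothetical command set; the enumerators must appear in the same
// order as the strings in the table, with the sentinel last.
enum Command { CMD_PATH, CMD_NODEIN, CMD_NODEOUT, CMD_NONE };

static const char* const CommandNames[] = { "path", "nodein", "nodeout", 0 };

// Return the enumerator whose keyword matches `word`, ignoring case,
// or CMD_NONE when the keyword is not in the table.
Command WhichCommandSketch(const char* word)
{
    for (unsigned int i = 0; CommandNames[i] != 0; i++) {
        if (strcasecmp(word, CommandNames[i]) == 0)
            return static_cast<Command>(i);
    }
    return CMD_NONE;
}
```

The design relies on the table and the enumeration staying in step: adding a keyword means adding a string and an enumerator at the same position.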
IsIn.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "main.h"
///
int IsIn(unsigned int node, Node& NList, unsigned int& pos)
{
pos = 0;
for (unsigned int i = 0; i < NList.GetNumNode(); i++)
if ((NList[i]).node == node)
{
pos = i;
return OK;
}
return NOT_FOUND;
}
Node.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
///
Node::Node() : NumNode(0), next( 0 ), Head(0), Tail(0)
{}
///
Node::~Node()
{
_NodeList* tmp;
while ( Head )
{
tmp = Head->next;
delete Head;
Head = tmp;
}
}
///
int Node::Insert( unsigned int node, int flag = -1)
{
_NodeList * tmp;
if ( !Head )
{
Head = new _NodeList;
if ( !Head )
return NO_MEM;
Head->next = 0;
Tail = Head;
}
else
{
tmp = new _NodeList;
if ( !tmp )
return NO_MEM;
tmp->next = 0;
Tail->next = tmp;
Tail = tmp;
}
Tail->node = node;
Tail->flag = flag;
NumNode++;
return OK;
}
///
int Node::DeleteLevelNode(unsigned int level)
{
// NodeList* tmp = Head;
//unsigned int i = level;
if (level >= NumNode)
return NOT_FOUND;
//while(i)
// tmp = tmp->next;
//tmp->next = 0;
//Tail = tmp;
Tail = &(operator[](level));
(operator[](level)).next = 0;
NumNode = level + 1;
return OK;
}
///
const _NodeList& Node::operator[]( unsigned int index ) const
{
_NodeList* tmp = Head;
if ( index > NumNode )
error( NOT_FOUND, 0, "Index out of bound in [_NodeList]..." );
unsigned int i = index;
while ( i-- )
tmp = tmp->next;
return *tmp;
}
///
_NodeList& Node::operator[]( unsigned int index )
{
_NodeList* tmp = Head;
if ( index > NumNode )
error( NOT_FOUND, 0, "Index out of bound in [_NodeList]..." );
unsigned int i = index;
while ( i-- )
tmp = tmp->next;
return *tmp;
}
NodeCreate.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_nodes.h"
///
int NodeList::Create()
{
if ( !head )
{
head = new Node;
if ( !head )
return NO_MEM;
tail = head;
}
else
{
tail->next = new Node;
if ( !( tail->next ) )
return NO_MEM;
tail = tail->next;
}
tail->next = 0;
NumList++;
return OK;
}
NodeList.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_nodes.h"
///
NodeList::NodeList() : NumList( 0 ), head( 0 ), tail( 0 )
{
print_log( "Creating node list..." );
}
///
NodeList::~NodeList()
{
Node* tmp;
while ( head != 0 )
{
tmp = head->next;
delete head;
head = tmp;
}
}
///
const Node& NodeList::operator[]( unsigned int index ) const
{
Node *tmp = head;
if ( index > NumList )
error( NOT_FOUND, 0, "Index out of bound in [NodeList]..." );
unsigned int i = index;
while ( i-- )
tmp = tmp->next;
return *tmp;
}
///
Node& NodeList::operator[]( unsigned int index )
{
Node *tmp = head;
if ( index > NumList )
error( NOT_FOUND, 0, "Index out of bound in [NodeList]..." );
unsigned int i = index;
while ( i-- )
tmp = tmp->next;
return *tmp;
}
NodeListDelete.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_nodes.h"
///
int NodeList::DeleteList( unsigned int list)
{
if (list >= NumList)
return NOT_FOUND;
//unsigned int i = list - 1;
//Node* tmp = head;
//while(i)
// tmp = tmp->next;
//tmp->next = tmp->next->next;
if ((operator[](list)).next)
{
(operator[](list - 1)).next = (operator[](list)).next;
Node* tmp = &(operator[](list - 1));
while (tmp->next)
tmp = tmp->next;
tail = tmp;
}
else
{
(operator[](list - 1)).next = 0;
tail = &(operator[](list - 1));
}
NumList--;
return OK;
}
NodeListInsert.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_nodes.h"
///
int NodeList::InsertNode( unsigned int node)
{
if ( tail )
return tail->Insert( node );
else
return NOT_FOUND;
}
OptSimulate.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"
#include "class_optimizator.h"
///
int OptimizationAlgorithm::SimulateCircuit( const double *NewWidth )
{
int RetCode = Simulation.Run( circuit, NewWidth, ValidPath );
if ( RetCode != OK )
return RetCode;
for ( unsigned int i = 0; i < NumPath; i++ )
{
CPDelay[ i ] = Simulation.GetDelay( i );
CPPower[ i ] = Simulation.GetPower( i );
CPNoise[ i ] = Simulation.GetNoise( i );
}
Area = Simulation.GetArea();
return OK;
}
OptimizationAlFirstSim.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"
// one further #include in the listing has no recoverable file name
///
int OptimizationAlgorithm::SimulateFirstCircuit()
{
double* MinimumWidth = new double[ NumTran ];
double* MaximumWidth = new double[ NumTran ];
if (!MinimumWidth || !MaximumWidth)
return NO_MEM;
for ( unsigned int i = 0; i < NumTran; i++ )
{
MinimumWidth[ i ] = options.GetOptOption( WMIN );
if (options.GetOptOption( WMAX ) <= 0)
MaximumWidth[ i ] = options.GetOptOption( WMIN ) * 100;
else
MaximumWidth[ i ] = options.GetOptOption( WMAX );
}
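int RetCode = Simulation.Run( circuit, MaximumWidth, ValidPath );
if ( RetCode != OK )
return RetCode;
// copy the results of the run, as SimulateCircuit() does
for ( unsigned int i = 0; i < NumPath; i++ )
{
CPDelay[ i ] = Simulation.GetDelay( i );
CPPower[ i ] = Simulation.GetPower( i );
CPNoise[ i ] = Simulation.GetNoise( i );
}
Area = Simulation.GetArea();
MaxDelayInitMax = 0.0;
for ( unsigned int i = 0; i < NumPath; i++ )
{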
if ( Simulation.GetDelay( i ) > 0.0 )
{
if ( Simulation.GetDelay( i ) > MaxDelayInitMax )
MaxDelayInitMax = CPDelay[i];
if ( Simulation.GetPower( i ) > MaxPowerInitMax )
MaxPowerInitMax = CPPower[i];
if ( Simulation.GetNoise( i ) > MaxNoiseInitMax )
MaxNoiseInitMax = CPNoise[i];
}
}
AreaInitMax = Area;
/// FIX ME !!!!!!!!!
MaxNoiseInitMax = 1.0;
RetCode = Simulation.Run( circuit, MinimumWidth, ValidPath );
if ( RetCode != OK )
return RetCode;
unsigned int tmpP = 0;
for ( unsigned int i = 0; i < NumPath; i++ )
{
if (CPDelay[ i ] > 0.0)
{
ValidPath[i] = 1;
tmpP++;
}
else
ValidPath[i] = 0;
}
MaxDelayInitMin = 0.0;
for ( unsigned int i = 0; i < NumPath; i++ )
{
if (ValidPath[i])
{
if ( Simulation.GetDelay( i ) > MaxDelayInitMin )
MaxDelayInitMin = CPDelay[i];
if ( Simulation.GetPower( i ) > MaxPowerInitMin )
MaxPowerInitMin = CPPower[i];
if ( Simulation.GetNoise( i ) > MaxNoiseInitMin )
MaxNoiseInitMin = CPNoise[i];
}
}
AreaInitMin = Area;
MaxNoiseInitMin = 1.0;
/// FIX ME !!!!!!!!!
if (MaxDelayInitMin < MaxDelayInitMax)
MaxDelayInitMax = MaxDelayInitMin;
if (MaxPowerInitMin > MaxPowerInitMax)
MaxPowerInitMax = MaxPowerInitMin;
if (MaxNoiseInitMin > MaxNoiseInitMax)
MaxNoiseInitMax = MaxNoiseInitMin;
if (AreaInitMin > AreaInitMax)
AreaInitMax = AreaInitMin;
char log[512];
sprintf( log, "Found %u valid critical paths of %u", tmpP, NumPath );
print_log( log );
return OK;
}
OptimizationAlNormSim.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
// one further #include in the listing has no recoverable file name
///
double OptimizationAlgorithm::NormSim( const double* x, int& RetCode)
{
double f;
double* X = new double[ NumTran ];
double DelW = options.GetOptOption( DELTA );
unsigned int count_min = 0;
unsigned int count_max = 0;
static unsigned int elapsed = 0;
static double fLast;
static unsigned int count_conv = 0;
for ( unsigned int i = 0; i < NumTran; i++ )
{
if ( x[ i ] <= options.GetOptOption( WMIN ) )
{
X[ i ] = options.GetOptOption( WMIN );
count_min++;
}
else if ( (x[ i ] > options.GetOptOption( WMAX )) &&
(options.GetOptOption( WMAX ) > 0) )
{
X[ i ] = options.GetOptOption( WMAX );
count_max++;
}
else
X[ i ] = double( rint( x[ i ] / DelW ) * DelW );
}
if ((count_min != NumTran) && (count_max != NumTran))
RetCode = SimulateCircuit( X );
else
{
RetCode = OK;
}
if ( RetCode != OK )
{
delete[] X;
return 0.0;
}
double maxT = 0.0;
double maxP = 0.0;
double maxN = 0.0;
double RatioT = 1.0;
// MaxDelayInit / MaxDelayInit
//double RatioP = MaxDelayInitMin / MaxPowerInitMax;
double RatioP = 1.0;
//double RatioN = MaxDelayInitMin / MaxNoiseInitMax;
// FIX ME !!!!!!!!!!!
double RatioN = 0.0;
//double RatioA = MaxDelayInitMin / AreaInitMax;
double RatioA = 1.0;
f = 0.0;
double fMin = COST_FACTOR;
if ( options.GetOptOption( WEIGHTS ) )
{
RatioT *= options.GetOptOption(WDELAY);
RatioP *= options.GetOptOption(WPOWER);
RatioN *= options.GetOptOption(WNOISE);
RatioA *= options.GetOptOption(WAREA);
}
double fMin_norm;
double fMax;
unsigned int Constraints = 0;
double MAXDelay = options.GetOptOption( MAXDELAY );
double MAXPower = options.GetOptOption( MAXPOWER );
double MAXNoise = options.GetOptOption( MAXNOISE );
double MAXArea = options.GetOptOption( MAXAREA );
if ( (MAXDelay > 0) || (MAXPower > 0) || (MAXNoise > 0) ||
(MAXArea > 0))
Constraints = 1;
fMin_norm = ( RatioT * MaxDelayInitMin / MaxDelayInitMin + \
RatioP * MaxPowerInitMin / MaxPowerInitMax + \
RatioN * MaxNoiseInitMin / MaxNoiseInitMax + \
RatioA * AreaInitMin / AreaInitMax);
fMax = (RatioT * MaxDelayInitMax / MaxDelayInitMin + \
RatioP * MaxPowerInitMax / MaxPowerInitMax + \
RatioN * MaxNoiseInitMax / MaxNoiseInitMax + \
RatioA * AreaInitMax / AreaInitMax) * \
COST_FACTOR / fMin_norm;
if (elapsed == 0)
fLast = fMin;
if ((count_min != NumTran) && (count_max != NumTran))
{
for ( unsigned int i = 0; i < NumPath; i++ )
{
if ( CPDelay[ i ] > maxT )
maxT = CPDelay[ i ];
if (CPDelay[ i ] > 0)
{
if ( CPPower[ i ] > maxP )
maxP = CPPower[ i ];
if ( CPNoise[ i ] > maxN )
maxN = CPNoise[ i ];
}
}
f = (RatioT * maxT / MaxDelayInitMin + \
RatioP * maxP / MaxPowerInitMax + \
RatioN * maxN / MaxNoiseInitMax + \
RatioA * Area / AreaInitMax) * \
COST_FACTOR / fMin_norm;
if ( Constraints )
{
// Each violated constraint adds a quadratic penalty. In the source
// listing every product below is followed by one more factor that
// is not legible; that factor is left out here.
if (MAXDelay > 0)
{
if (maxT > MAXDelay)
{
f += (maxT - MAXDelay ) / MaxDelayInitMin * \
(maxT - MAXDelay ) / MaxDelayInitMin;
RetCode = CONT;
}
else
RetCode = END_ACC;
}
if (MAXPower > 0)
{
if (maxP > MAXPower)
{
f += (maxP - MAXPower ) / MaxPowerInitMax * \
(maxP - MAXPower ) / MaxPowerInitMax;
RetCode = CONT;
}
else
RetCode = END_ACC;
}
if (MAXNoise > 0)
{
if (maxN > MAXNoise)
{
f += (maxN - MAXNoise ) / MaxNoiseInitMax * \
(maxN - MAXNoise ) / MaxNoiseInitMax;
RetCode = CONT;
}
else
RetCode = END_ACC;
}
if (MAXArea > 0)
{
if (Area > MAXArea)
{
f += (Area - MAXArea ) / AreaInitMax * \
(Area - MAXArea ) / AreaInitMax;
RetCode = CONT;
}
else
RetCode = END_ACC;
}
}
}
else if (count_min == NumTran)
{
f = fMin;
maxT = MaxDelayInitMin;
maxP = MaxPowerInitMin;
maxN = MaxNoiseInitMin;
}
else if (count_max == NumTran)
{
f = fMax * COST_FACTOR / fMin_norm;
maxT = MaxDelayInitMax;
maxP = MaxPowerInitMax;
maxN = MaxNoiseInitMax;
}
InternalSteps++;
if ( InternalSteps >= options.GetOptOption( MAXSTEPS ) )
RetCode = MAX_STEPS;
char log[ 1024 ];
if ((options.Verbose()) || (InternalSteps == 1))
{
circuit.PrintResult( InternalSteps, NumTran, NumPath, X, CPDelay, CPPower, CPNoise, Area, maxT, maxP, maxN, f, fLast);
sprintf( log, "...step: %d, objective: %g", InternalSteps, f );
print_log( log );
}
else if ((f < fLast) || ((InternalSteps % 100) == 0))
{
circuit.PrintResult( InternalSteps, NumTran, NumPath, X, CPDelay, CPPower, CPNoise, Area, maxT, maxP, maxN, f, fLast);
sprintf( log, "...step: %d, objective: %g", InternalSteps, f );
print_log( log );
if ( ((fLast - f) / fLast) < options.GetOptOption( ACC ) && (RetCode != CONT))
{
count_conv++;
if (count_conv >= 2)
{
RetCode = END_ACC;
}
}
fLast = f;
elapsed++;
}
//else
// count_conv = 0;
delete[] X;
return f;
}
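The objective computed by NormSim combines two ingredients: a weighted sum of delay, power, noise and area, each divided by a reference value, and a quadratic penalty for every quantity that exceeds its user-given limit. The sketch below isolates these two pieces in a form that can be compiled and tried on its own. The names `Metrics`, `WeightedCost` and `Penalty` are hypothetical, and the overall scaling by COST_FACTOR as well as any extra penalty factor used by the program are deliberately left out.

```cpp
// Hypothetical container for the four figures of one evaluation.
struct Metrics {
    double delay;
    double power;
    double noise;
    double area;
};

// Weighted sum of the metrics in m, each normalized by its reference
// value in ref and multiplied by its weight in w.
double WeightedCost(const Metrics& m, const Metrics& ref, const Metrics& w)
{
    return w.delay * m.delay / ref.delay +
           w.power * m.power / ref.power +
           w.noise * m.noise / ref.noise +
           w.area * m.area / ref.area;
}

// Quadratic penalty on the normalized excess of value over limit.
// A limit that is zero or negative means "no constraint", as in NormSim.
double Penalty(double value, double limit, double ref)
{
    if (limit <= 0.0 || value <= limit)
        return 0.0;
    const double excess = (value - limit) / ref;
    return excess * excess;
}
```

With a noise weight of zero the noise term drops out of the sum, which mirrors the `RatioN = 0.0` setting in the listing.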
OptimizationAlgorithm.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"
// one further #include in the listing has no recoverable file name
///
OptimizationAlgorithm::OptimizationAlgorithm( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation )
:
InternalSteps( 0 ), circuit( circuit ), options( options ),
Steps( 0 ), NumTran( 0 ), NumPath( 0 ),
Width( 0 ), CPDelay( 0 ), CPPower( 0 ),
CPNoise( 0 ), Area( 0.0 ), ValidPath(0),
MaxDelayInitMin( 0.0 ), MaxPowerInitMin( 0.0 ),
MaxNoiseInitMin( 0.0 ), AreaInitMin( 0.0 ),
MaxDelayInitMax( 0.0 ), MaxPowerInitMax( 0.0 ),
MaxNoiseInitMax( 0.0 ), AreaInitMax( 0.0 ),
Simulation( simulation )
{
// default inizialization
print_log( "Creating optimization algorithm..." );
NumTran = circuit.GetNTran();
Width = new double[ NumTran ];
NumPath = simulation.GetNPath();
CPDelay = new double[ NumPath ];
CPPower = new double[ NumPath ];
CPNoise = new double[ NumPath ];
ValidPath = new unsigned[NumPath];
if ( !Width || !CPDelay || !CPPower || !CPNoise )
{
print_log( ReturnMessage[ NO_MEM ] );
}
for ( unsigned int i = 0; i < NumTran; i++ )
Width[ i ] = circuit[ i ].Width();
for ( unsigned int i = 0; i < NumPath; i++ )
ValidPath[i] = 1;
}
///
OptimizationAlgorithm::~OptimizationAlgorithm()
{
delete[] Width;
delete[] CPDelay;
delete[] CPPower;
delete[] CPNoise;
}
Optimize.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"
#include "slop.h"
#include "slop2.h"
#include "powell.h"
#include "anneal.h"
#include "test.h"
// one further #include in the listing has no recoverable file name
///
int Optimize( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation, double* LastWidth )
{
struct timeb start_t, stop_t;
ftime( &start_t );
char log[ 1024 ];
// start from a null pointer so that an unknown algorithm is caught below
OptimizationAlgorithm* Opt = 0;
switch ( options.WhichOptAlgorithm() )
{
case SLOP:
Opt = new Slop( circuit, options, simulation );
break;
case SLOP2:
Opt = new Slop2( circuit, options, simulation );
break;
case POWELL:
Opt = new Powell( circuit, options, simulation );
break;
case ANNEAL:
Opt = new Anneal( circuit, options, simulation );
break;
case TESTEVAL:
Opt = new TestEval( circuit, options, simulation );
break;
default:
break;
}
unsigned int n = circuit.GetNTran();
unsigned int np = simulation.GetNPath();
if ( !Opt )
return NO_MEM;
int RetCode;
if ( ( RetCode = Opt->SimulateFirstCircuit() ) != OK )
return RetCode;
print_log( "Initial critical paths: " );
for ( unsigned int i = 0; i < np; i++ )
{
sprintf( log, "%u) Delay=%g ps, Energy=%g pJ, Noise=%g", i,
simulation.GetDelay( i ),
simulation.GetPower( i ),
simulation.GetNoise( i ) );
print_log( log );
}
sprintf( log, "Area=%g ", simulation.GetArea() );
print_log( log );
print_log( "" );
print_log( "Starting optimization process..." );
RetCode = Opt->Run();
if ( ( RetCode != OK ) && ( RetCode != MAX_STEPS ) && ( RetCode != END_ACC ) && (RetCode != CONT))
return RetCode;
ftime( &stop_t );
if ( RetCode == MAX_STEPS )
{
print_log( "...WARNING: exceeded max steps..." );
}
if (( RetCode == END_ACC ) || ( RetCode == OK) )
{
print_log( "...Solution found. That's all folks." );
}
// copy the optimized widths back in circuit order
for ( unsigned int i = 0; i < n; i++ )
{
int pos = circuit.TranPos( circuit[ i ].DevName() );
LastWidth[ i ] = Opt->OptWidth( pos );
}
RetCode = Opt->SimulateCircuit(LastWidth);
// report the run time and release the optimizer before returning the final simulation status
long total_ms = ( long )( stop_t.time - start_t.time ) * 1000L + ( stop_t.millitm - start_t.millitm );
long sec = total_ms / 1000;
long msec = total_ms % 1000;
sprintf( log, "End Optimization: %ld steps, %ld function evaluations ", Opt->GetSteps(), simulation.GetCalls() );
print_log( log );
sprintf( log, "                : total time: %ld.%03ld secs ", sec, msec );
print_log( log );
delete Opt;
return RetCode;
}
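Elapsed time taken from two (seconds, milliseconds) stamps, such as the `time` and `millitm` fields of `struct timeb` used in Optimize(), has to combine both fields before subtracting; differencing them separately misreports any interval that crosses a second boundary. The following is a minimal sketch of that arithmetic; the helper name `elapsed_ms` is hypothetical and not part of the thesis code.

```cpp
#include <cassert>

// Elapsed milliseconds between two (seconds, milliseconds) time stamps.
// Assumes the stop stamp is not earlier than the start stamp.
long elapsed_ms(long start_sec, int start_ms, long stop_sec, int stop_ms) {
    long total = (stop_sec - start_sec) * 1000L + (stop_ms - start_ms);
    assert(total >= 0);
    return total;
}
```

For instance, a run from 10.900 s to 12.100 s lasts 1200 ms, whereas subtracting the fields independently would suggest 2 s and 800 ms.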
Options.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
///
Options::Options() :
SimOptions( 0 ), OptOptions( 0 ),
SimulationChosed( HSPICE ), OptimizationChosed( SLOP ),
verbose( 0 ), manual(0), NameMosN( 0 ), NameMosP( 0 ), WorkPath( 0 )
{
print_log( "Parsing Options..." );
}
///
Options::~Options()
{
delete[] SimOptions;
delete[] OptOptions;
}
OptionsRead.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "global.h"
///
int Options::Read( const char* FileOptionsName )
{
char line[ 1024 ];
char opt[ 256 ];
char log[1024];
int RetCode = OK;
ifstream i_file( FileOptionsName );
if ( !i_file )
return NOT_FOUND;
unsigned int NumSimOptions = 0;
unsigned int NumOptOptions = 0;
while ( OptCommandOptions[ NumOptOptions++ ] );
while ( SimCommandOptions[ NumSimOptions++ ] );
OptOptions = new double[ NumOptOptions ];
SimOptions = new double[ NumSimOptions ];
for ( unsigned int i = 0; i < NumOptOptions; i++ )
OptOptions[ i ] = 0;
for ( unsigned int i = 0; i < NumSimOptions; i++ )
SimOptions[ i ] = 0;
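unsigned int Line = 0;
while ( i_file.getline( line, sizeof( line ) ) ) // assumed loop header (missing from the listing)
{
Line++;
if ( sscanf( line, "%s ", opt ) == 1 )
if ( opt[ 0 ] != '#' )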
{
GLOBCOMMANDOPT GlobalOption = WhichGBOption( opt );
SIMCOMMANDOPT SimOption = WhichSimOption( opt );
OPTCOMMANDOPT OptOption = WhichOptOption( opt );
switch ( GlobalOption )
{
case VERBOSE:
verbose = 1;
print_log("Well, let's go verbose...");
break;
case MANUAL:
manual = 1;
print_log("So you think you're better than me,");
print_log("in calculating critical paths?...");
break;
case SIMALG:
RetCode = NOT_FOUND;
sscanf( line, "%*s %s", opt );
for ( unsigned int S = 0 ; ( SimAlgorithms[ S ] != 0 ); S++ )
if ( !strcasecmp( opt, SimAlgorithms[ S ] ) )
{
SimulationChosed = ( SimMethod ) S;
RetCode = OK;
sprintf(log, "Simulator......%s", SimAlgorithms[ S ]);
print_log(log);
}
break;
case OPTALG:
RetCode = NOT_FOUND; // assumed, by analogy with the SIMALG case (missing from the listing)
sscanf( line, "%*s %s", opt ); // assumed, by analogy with the SIMALG case
for ( unsigned int O = 0; ( OptAlgorithms[ O ] != 0 ); O++ )
if ( !strcasecmp( opt, OptAlgorithms[ O ] ) )
{
OptimizationChosed = ( OptMethod ) O;
RetCode = OK;
sprintf(log, "Optimizer......%s", OptAlgorithms[ O ]);
print_log(log);
}
break;
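case NAMEMOSN:
sscanf( line, "%*s %s", opt ); // assumed: reads the model name that follows the keyword (missing from the listing)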
NameMosN = new char[ strlen( opt ) + 1 ];
if ( !NameMosN )
RetCode = NO_MEM;
else
strcpy( NameMosN, opt );
break;
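case NAMEMOSP:
sscanf( line, "%*s %s", opt ); // assumed: reads the model name that follows the keyword (missing from the listing)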
NameMosP = new char[ strlen( opt ) + 1 ];
if ( !NameMosP )
RetCode = NO_MEM;
else
strcpy( NameMosP, opt );
break;
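case WORKPATH:
sscanf( line, "%*s %s", opt ); // assumed: reads the path that follows the keyword (missing from the listing)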
WorkPath = new char[ strlen( opt ) + 1 ];
if ( !WorkPath )
RetCode = NO_MEM;
else
strcpy( WorkPath, opt );
break;
case NONEGLOB:
default:
break;
}
if ( SimOption != NONESIM )
{
sscanf( line, "%*s %lg", &SimOptions[ SimOption ] );
}
switch ( OptOption )
{
case CONSTRAINS:
if ( ( OptOptions[ CONSTRAINS ] == 1.0 ) || ( OptOptions[ ENDCONSTRAINS ] == 1 ) )
; // the statement for this branch is missing from the listing
else
{
OptOptions[ CONSTRAINS ] = 1.0;
print_log("Hey, you mean some constraints...");
}
break;
case ENDCONSTRAINS:
if ( OptOptions[ CONSTRAINS ] == 0 )
; // the statement for this branch is missing from the listing
else
OptOptions[ ENDCONSTRAINS ] = 1.0;
break;
case WEIGHTS:
if ( ( OptOptions[ WEIGHTS ] == 1.0 ) || ( OptOptions[ ENDWEIGHTS ] == 1 ) )
; // the statement for this branch is missing from the listing
else
{
OptOptions[ WEIGHTS ] = 1.0;
print_log("Hey, you mean some weights...");
}
break;
case ENDWEIGHTS:
if ( OptOptions[ WEIGHTS ] == 0 )
; // the statement for this branch is missing from the listing
else
OptOptions[ ENDWEIGHTS ] = 1.0;
break;
case WDELAY:
case WPOWER:
case WAREA:
case WNOISE:
case MAXSTEPS:
case ACC:
case WMAX:
case WMIN:
case DELTA:
case RISETIME:
case FALLTIME:
case MAXAREA:
case MAXDELAY:
case MAXPOWER:
case MAXNOISE:
sscanf( line, "%*s %lg", &OptOptions[ OptOption ] );
sprintf(log, "...%s=%g", OptCommandOptions[OptOption], OptOptions[ OptOption ]);
print_log(log);
break;
case NONEOPT:
default:
break;
}
}
if ( RetCode != OK ) // assumed condition (missing from the listing)
{
sprintf( line, "ERROR reading file %s line %u ", FileOptionsName, Line );
print_log( line );
i_file.close();
return RetCode;
}
}
if ( !NameMosN )
{
NameMosN = new char[ strlen( TransistorString[ NMOS ] ) + 1 ];
if ( !NameMosN )
RetCode = NO_MEM;
else
strcpy( NameMosN, TransistorString[ NMOS ] );
}
if ( !NameMosP )
{
NameMosP = new char[ strlen( TransistorString[ PMOS ] ) + 1 ];
if ( !NameMosP )
RetCode = NO_MEM;
else
strcpy( NameMosP, TransistorString[ PMOS ] );
}
if ( !WorkPath )
{
WorkPath = new char[ strlen( WORKPath ) + 1 ];
if ( !WorkPath )
RetCode = NO_MEM;
else
strcpy( WorkPath, WORKPath );
}
if ( SimulationChosed == NONESM )
SimulationChosed = HSPICE;
if ( OptimizationChosed == NONEOM )
OptimizationChosed = SLOP;
i_file.close();
return OK;
}
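Options::Read() relies on two `sscanf` idioms: `"%s"` to pick out the leading keyword, and `"%*s %lg"` to skip that keyword and read the number after it. A minimal sketch of the pattern for a single line follows; the function name `parse_option_line` and its return convention are assumptions made for illustration, and flag-style keywords without a value (such as VERBOSE) would need separate handling.

```cpp
#include <cassert>
#include <cstdio>

// Parses one "KEYWORD value" line. Returns 1 and fills key/value on success,
// 0 for blank lines and '#' comment lines, -1 when no number follows the keyword.
// key must have room for at least 256 characters.
int parse_option_line(const char* line, char* key, double* value) {
    assert(line && key && value);
    if (std::sscanf(line, "%255s", key) != 1)
        return 0;                      // blank line: nothing to read
    if (key[0] == '#')
        return 0;                      // comment line
    if (std::sscanf(line, "%*s %lg", value) != 1)
        return -1;                     // keyword present, numeric value missing
    return 1;
}
```

The `%*s` conversion consumes the keyword without storing it, which is why the same line buffer can be scanned twice.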
ReadTEch.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "tech.h"
#include "readt.h"
///
struct _TECH_STR TECH;
int ReadTech()
{
/* nmos */
TECH.Lmin = 0.25;
TECH.u0_n = 37.2; /** micron^2 / (Volt * ns) */
TECH.Kp_n = 256.7916; /** uA / V^2 */
TECH.vmax_n = 130.7952; /** micron / ns */
TECH.Vtn0 = 0.5885; /** Volt */
TECH.epss = 0.10359; /** fF / micron */
TECH.q = 1.602E-4; /** fF * Volt */
TECH.Na = 2.679E11; /** micron^-3 */
TECH.gamma_n = 0.3356;
TECH.phi_n = 0.79424;
TECH.Cox_n = 6.903; /** fF / micron^2 */
TECH.C_nj = 689E-3; /** fF / micron^2 */
TECH.C_np = 138E-3; /** fF / micron */
TECH.Ec_n = 3.516; /** Volt / micron = vmax/u0 */
TECH.VT = 25.98E-3; /** Volt */
TECH.ni = 1.45E-2; /** micron^-3 */
TECH.Df = 0.625; /** micron */
TECH.Cgd0_n = 0.32; /** fF / micron */
TECH.Cgs0_n = 0.32;
TECH.PB_n = 0.79424; /** Volt */
TECH.mj_n = 0.45495;
TECH.mjsw_n = 0.1;
TECH.XW_n = -0.79698; /** micron */
TECH.XL_n = 0; /** micron */
TECH.WD_n = 0.039849; /** micron */
TECH.LD_n = 0.0332; /** micron */
TECH.theta_n = 0.4314; /** V^-1 */
/* pmos */
TECH.u0_p = 6.341; /** micron^2 / (Volt * ns) */
TECH.Kp_p = 30.16; /** uA / V^2 */
TECH.Cox_p = 6.903; /** fF / micron^2 */
TECH.gamma_p = 0.69468;
TECH.phi_p = 0.79547;
TECH.Vtp0 = -0.434;
TECH.vmax_p = 57.6714; /** micron / ns */
TECH.C_pj = 596E-3; /** fF / micron^2 */
TECH.C_pp = 122.1E-3; /** fF / micron */
TECH.Ec_p = 9.095; /** Volt / micron */
TECH.Cgd0_p = 0.5; /** fF / micron */
TECH.Cgs0_p = 0.5;
TECH.Nd = 2.8E11; /** micron^-3 */
TECH.PB_p = 0.79547; /** Volt */
TECH.mj_p = 0.36085;
TECH.mjsw_p = 0.1;
TECH.XW_p = -0.89852; /** micron */
TECH.XL_p = 0; /** micron */
TECH.WD_p = 0; /** micron */
TECH.LD_p = 0.054697; /** micron */
TECH.theta_p = 0.4071; /** V^-1 */
return OK;
}
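The unit comments in ReadTech() tie some of the constants together: the listing notes that the critical field is Ec = vmax/u0, and the product u0 · Cox carries units of micron^2/(V·ns) × fF/micron^2 = uA/V^2, the units given for Kp. A minimal sketch that checks these relations against the tabulated values follows; the helper names are hypothetical and do not appear in the thesis code.

```cpp
#include <cassert>
#include <cmath>

// Critical field in V/micron from saturation velocity (micron/ns)
// and mobility (micron^2/(V*ns)).
double critical_field(double vmax, double u0) { return vmax / u0; }

// Transconductance parameter in uA/V^2 from mobility (micron^2/(V*ns))
// and oxide capacitance (fF/micron^2).
double transconductance(double u0, double cox) { return u0 * cox; }

// True when a and b agree to within the given relative tolerance.
bool close_to(double a, double b, double rel_tol = 1e-3) {
    assert(rel_tol > 0.0);
    return std::fabs(a - b) <= rel_tol * std::fabs(b);
}
```

With the n-channel values, `critical_field(130.7952, 37.2)` reproduces Ec_n = 3.516 and `transconductance(37.2, 6.903)` reproduces Kp_n = 256.7916; the p-channel critical field follows the same ratio.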
SearchCritic.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "main.h"
///
int SearchCriticalPath(const Circuit& circuit,
CritPathList& pathList,
const NodeList& nodeListGnd,
const NodeList& nodeListVdd,
Node& nodeInputList,
Node& gateInternalList,
const Options& options)
{
ListNodeList* gndCPath;
ListNodeList* vddCPath;
int RetCode;
char log[1024];
unsigned int nlg = nodeListGnd.GetNumList();
unsigned int nlv = nodeListVdd.GetNumList();
print_log("Searching Critical Path...");
#ifdef DEBUG
cerr << "Searching Critical Path..." << endl;
#endif
gndCPath = 0;
vddCPath = 0;
for (unsigned int i = 0; i < nlg; i++)
{
ListNodeList* tmp = new ListNodeList;
if (!tmp)
return NO_MEM;
tmp->next = gndCPath;
gndCPath = tmp;
for (unsigned int j = 0; j < nodeInputList.GetNumNode(); j++)
nodeInputList[j].flag = -1;
for (unsigned int j = 0; j < gateInternalList.GetNumNode(); j++)
gateInternalList[j].flag = -1;
RetCode = SearchCPRecurse(gndCPath, circuit.ValimNode(), i, nodeListGnd, nodeListVdd, nodeInputList, gateInternalList, 0);
if ((RetCode != OK) && (RetCode != CONT))
return RetCode;
}
sprintf(log, "found first critical paths (gnd)...");
print_log(log);
for (unsigned int i = 0; i < nlv; i++)
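{
ListNodeList* tmp = new ListNodeList; // assumed, by analogy with the gnd loop above (missing from the listing)
if (!tmp)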
return NO_MEM;
tmp->next = vddCPath;
vddCPath = tmp;
for (unsigned int j = 0; j < nodeInputList.GetNumNode(); j++)
nodeInputList[j].flag = -1;
for (unsigned int j = 0; j < gateInternalList.GetNumNode(); j++)
gateInternalList[j].flag = -1;
RetCode = SearchCPRecurse(vddCPath, circuit.ValimNode(), i, nodeListVdd, nodeListGnd, nodeInputList, gateInternalList, 0);
if ((RetCode != OK) && (RetCode != CONT))
return RetCode;
}
sprintf(log, "found the other critical paths (vdd)...");
print_log(log);
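ListNodeList* tmp = gndCPath;
unsigned int count = 0; // assumed declaration (missing from the listing); counts the paths visited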
for (unsigned int gcount = 0; gcount < 2; gcount ++)
{
while (tmp)
{
#ifdef DEBUG
cerr << endl << "-------------> CP " << count << endl;
#endif
count++;
unsigned int nl = (tmp->NL).GetNumList();
if (nl)
{
if ((RetCode = pathList.Create()) != OK)
return RetCode;
unsigned int first_node = (((tmp->NL)[0])[0]).node;
unsigned int nn = ((tmp->NL)[nl - 1]).GetNumNode();
unsigned int output = (((tmp->NL)[nl - 1])[nn - 1]).node;
TransitionType Tr_in;
TransitionType Tr_out;
if (first_node == 0)
Tr_in = RISE;
else
Tr_in = FALL;
if (nl % 2)
Tr_out = (Tr_in == RISE ? FALL : RISE);
else
Tr_out = (Tr_in == RISE ? RISE : FALL);
OPTCOMMANDOPT TRin = (Tr_in == RISE ? RISETIME : FALLTIME);
RetCode = pathList.InsertNodeOut(output, Tr_out);
if (RetCode != OK)
return RetCode;
unsigned int pos;
unsigned int first_input = 0;
double val;
double val_n;
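for (unsigned int i = 0; i < nodeInputList.GetNumNode(); i++) // assumed loop header (missing from the listing)
{
(nodeInputList[i]).flag = -1;
}
for (unsigned int i = 0; i < nl; i++) // assumed loop header (missing from the listing)
{
first_node = (((tmp->NL)[i])[0]).node;
if (first_node == 0) // assumed condition (missing from the listing)
{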
val = circuit.Valim();
val_n = 0;
}
else
{
val = 0;
val_n = circuit.Valim();
}
nn = ((tmp->NL)[i]).GetNumNode();
// set initial condition
unsigned int last_l_node = (((tmp->NL)[i])[nn - 1]).node;
if (i < nl - 1)
{
RetCode = pathList.InsertInitialCondition(last_l_node, val);
if (RetCode != OK)
return RetCode;
}
for (unsigned int j = 1; j < nn; j = j + 2)
{
unsigned int input = (((tmp->NL)[i])[j]).node;
#ifdef DEBUG
cerr << endl << "input " << input;
#endif
if (IsIn(input, nodeInputList, pos) == OK)
{
#ifdef DEBUG
cerr << " primary input (" << pos << ")";
#endif
if (first_input == 0)
{
first_input = input;
RetCode = pathList.InsertNodeIn(first_input, Tr_in, options.GetOptOption(TRin));
if (RetCode != OK)
return RetCode;
#ifdef DEBUG
cerr << " INPUT";
#endif
}
else
{
if ((nodeInputList[pos]).flag == -1)
{
RetCode = pathList.InsertActiveInputs(input, val);
if (RetCode != OK)
return RetCode;
#ifdef DEBUG
cerr << " ACTIVE IN " << val;
#endif
}
else
{
if (first_node == 0) // assumed condition (missing from the listing)
{
if ((nodeInputList[pos]).flag != int(circuit.ValimNode()))
return NOT_FOUND;
}
else
{
if ((nodeInputList[pos]).flag != 0)
return NOT_FOUND;
}
}
}
(nodeInputList[pos]).flag = (first_node == 0 ? circuit.ValimNode() : 0);
}
if (i > 0)
{
unsigned int nn2 = ((tmp->NL)[i - 1]).GetNumNode();
last_l_node = (((tmp->NL)[i - 1])[nn2 - 1]).node;
}
else
last_l_node = 0;
if (IsIn(input, gateInternalList, pos) == OK)
{
#ifdef DEBUG
cerr << " internal gate (" << pos << " last " << last_l_node << ")";
#endif
if (last_l_node != input)
{
if (first_input == 0)
{
first_input = input;
RetCode = pathList.InsertNodeIn(first_input, Tr_in, options.GetOptOption(TRin));
if (RetCode != OK)
return RetCode;
#ifdef DEBUG
cerr << " INPUT INTERNAL";
#endif
}
else
{
RetCode = pathList.InsertActiveInputs(input, val);
if (RetCode != OK)
return RetCode;
#ifdef DEBUG
cerr << " ACTIVE IN INTERNAL " << val;
#endif
}
}
}
unsigned int drain = (((tmp->NL)[i])[j - 1]).node;
unsigned int source = (((tmp->NL)[i])[j + 1]).node;
unsigned int nt = circuit.GetNTran();
for (unsigned int k = 0; k < nt; k++)
{
if ( (circuit[k]).Gate() == input)
{
if ( (((circuit[k]).Drain() == drain) &&
((circuit[k]).Source() == source)) ||
(((circuit[k]).Drain() == source) &&
((circuit[k]).Source() == drain)))
{
RetCode = pathList.InsertPathTransistor((circuit[k]).DevName(), (circuit[k]).TrType(), i);
if (RetCode != OK)
return RetCode;
}
}
}
}
}
for (unsigned int i = 0; i < nodeInputList.GetNumNode(); i++) // assumed loop header (missing from the listing)
{
if ((nodeInputList[i]).flag == -1)
{
unsigned int noActiveNode = (nodeInputList[i]).node;
double noActiveSupply;
if (gcount == 0) // CP starting with GND
noActiveSupply = 0;
else // CP starting with VDD
noActiveSupply = circuit.Valim();
#ifdef DEBUG
cerr << endl << "no active input "
<< noActiveNode << " = " << noActiveSupply ;
#endif
RetCode = pathList.InsertNoActiveInputs(noActiveNode, noActiveSupply);
if (RetCode != OK)
return RetCode;
}
}
if ((RetCode = pathList.Stamp(nl)) != OK)
return RetCode;
}
tmp = tmp->next;
}
tmp = vddCPath;
}
sprintf(log, "found total %u critical paths...", pathList.GetNumPath());
print_log(log);
return OK;
}
SearchCriticRecurse.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "main.h"
///
int SearchCPRecurse(ListNodeList* CPath,
unsigned int valnode,
unsigned int index,
const NodeList& nodeListFirst,
const NodeList& nodeListSecond,
Node& nodeInputList,
Node& gateInternalList,
unsigned int ilevel)
{
int RetCode;
unsigned int level = ilevel;
unsigned int i, j;
unsigned int nn = (nodeListFirst[index]).GetNumNode();
level++;
#ifdef DEBUG
cerr << endl;
for (i = 0; i < level; i++)
cerr << "  ";
cerr << "(" << level << ") " << index << " - ";
#endif
unsigned int first_node = ((nodeListFirst[index])[0]).node;

// first node = gnd or vdd
unsigned int last_node = ((nodeListFirst[index])[nn - 1]).node;
int neg_node;
if (first_node == 0) // assumed condition (missing from the listing)
neg_node = valnode;
else
neg_node = 0;
for (i = 1; i < nn; i = i + 2)
{
unsigned int pos = 0;
unsigned int tmp_node = ((nodeListFirst[index])[i]).node;
if ( IsIn(tmp_node, nodeInputList, pos) == OK)
{
int flag = (nodeInputList[pos]).flag;
if ( flag == -1)
{
(nodeInputList[pos]).flag = neg_node;
}
else
{
if (flag != neg_node)
return OK;
}
}
else if ( IsIn(tmp_node, gateInternalList, pos) == OK)
{
int flag = (gateInternalList[pos]).flag;
if (flag == -1)
{
if (level > 1)
{
// very bastard inside
if ( SearchOKCond(tmp_node, valnode, nodeListFirst, nodeListSecond, nodeInputList, gateInternalList) == NOT_FOUND)
{
if ( SearchOKCond(tmp_node, valnode, nodeListSecond, nodeListFirst, nodeInputList, gateInternalList) == NOT_FOUND)
(gateInternalList[pos]).flag = neg_node;
else
return OK;
}
else
{
// the body of this branch is missing from the listing
}
}
else
{
// the body of this branch is missing from the listing
}
}
else
{
return OK;
}
}
}
if ( (RetCode = (CPath->NL).Create()) != OK)
return RetCode;
for (unsigned int ii = 0; ii < nn; ii++)
{
if ((RetCode = (CPath->NL).InsertNode(((nodeListFirst[index])[ii]).node)) != OK)
return RetCode;
}
unsigned int nl = nodeListSecond.GetNumList();
for (i = 0; i < nl; i++)
{
nn = (nodeListSecond[i]).GetNumNode();
j = 1;
unsigned int found = 0;
while ((j < nn) && (!found))
{
unsigned int try_node = ((nodeListSecond[i])[j]).node;
if (try_node == last_node)
found = j;
j = j + 2;
}
if (found)
{
int RecurseCode = SearchCPRecurse(CPath, valnode, i, nodeListSecond, nodeListFirst, nodeInputList, gateInternalList, level);
if (RecurseCode == OK)
{
ListNodeList* tmp = new ListNodeList; // assumed allocation (missing from the listing)
if (!tmp)
return NO_MEM;
tmp->next = CPath;
CPath = tmp;
unsigned int n_l = ((CPath->next)->NL).GetNumList();
for (unsigned int jj = 0; jj < n_l - 1; jj++)
{
if ((RetCode = (CPath->NL).Create()) != OK)
return RetCode;
unsigned int n_n = (((CPath->next->NL))[jj]).GetNumNode();
for (unsigned int k = 0; k < n_n; k++)
(CPath->NL).InsertNode((((CPath->next->NL)[jj])[k]).node );
}
}
else if (RecurseCode == CONT)
{
//if((RetCode = (CPath->NL).DeleteList(level)) != OK)
// return RetCode;
}
else
return RecurseCode;
}
}
level--;
if (level == 0)
{
#ifdef DEBUG
unsigned int pp = (CPath->NL).GetNumList();
cerr << endl << " NUMLIST " << pp << " --- ";
for (unsigned int i = 0; i < pp; i++)
{
unsigned int pn = ((CPath->NL)[i]).GetNumNode();
for (unsigned int j = 0; j < pn; j++)
cerr << " " << (((CPath->NL)[i])[j]).node;
cerr << " / ";
}
#endif
}
return CONT;
}
SearchOkCond.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "main.h"
///
int SearchOKCond(unsigned int node,
unsigned int valnode,
const NodeList& nodeListFirst,
const NodeList& nodeListSecond,
Node& nodeInputList,
Node& gateInternalList)
{
unsigned int nl = nodeListSecond.GetNumList();
unsigned int pos = 0;
unsigned int first_node = ((nodeListSecond[0])[0]).node;
// first node = gnd or vdd
int neg_node;
if (first_node == 0) // assumed condition (missing from the listing)
neg_node = valnode;
else
neg_node = 0;
for (unsigned int i = 0; i < nl; i++) // assumed loop header (missing from the listing)
{
unsigned int nn = (nodeListSecond[i]).GetNumNode();
unsigned int last_node = ((nodeListSecond[i])[nn - 1]).node;
if (node == last_node)
{
for (unsigned int j = 1; j < nn; j = j + 2)
{
unsigned int tmp_node = ((nodeListSecond[i])[j]).node;
if ( IsIn(tmp_node, nodeInputList, pos) == OK)
{
int flag = (nodeInputList[pos]).flag;
if ( flag == -1)
{
(nodeInputList[pos]).flag = neg_node;
}
else
{
return NOT_FOUND;
}
}
else if ( IsIn(tmp_node, gateInternalList, pos) == OK)
{
int flag = (gateInternalList[pos]).flag;
if (flag == -1)
{
int RecurseCode = SearchOKCond(tmp_node, valnode, nodeListSecond, nodeListFirst, nodeInputList, gateInternalList);
if (RecurseCode == NOT_FOUND)
return NOT_FOUND;
}
else
{
return NOT_FOUND;
}
}
}
}
}
return OK;
}
TransistorList.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_devices.h"
///
TransistorList::TransistorList() : NumTran( 0 ), head( 0 ), tail( 0 )
{}
///
TransistorList::~TransistorList()
{
TransistorNode* tmp;
while ( head )
{
tmp = head->next;
delete head;
head = tmp;
}
}
///
const TransistorNode& TransistorList::operator[]( unsigned int index ) const
{
if ( index >= NumTran ) // valid indices run from 0 to NumTran - 1
error( NOT_FOUND, 0, "Index out of bound in [TransistorList]..." );
TransistorNode* tmp = head;
while ( tmp )
{
if ( tmp->Index() == index )
return * tmp;
tmp = tmp->next;
}
return *tmp;
}
///
const TransistorNode& TransistorList::operator[]( const char* name ) const
{
TransistorNode * tmp = head;
while ( tmp )
{
if ( !strcasecmp( name, tmp->Name ) )
return * tmp;
tmp = tmp->next;
}
return *tmp;
}
///
TransistorNode& TransistorList::operator[]( unsigned int index )
{
if ( index >= NumTran ) // valid indices run from 0 to NumTran - 1
error( NOT_FOUND, 0, "Index out of bound in [TransistorList]..." );
TransistorNode* tmp = head;
while ( tmp )
{
if ( tmp->Index() == index )
return * tmp;
tmp = tmp->next;
}
return *tmp;
}
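The indexed operator[] overloads walk the list until a node carries the requested index, so an index that is out of range or absent leaves nothing valid to return. The sketch below shows one way to make that case explicit by returning a pointer instead of a reference; `ListNode` and `find_by_index` are hypothetical stand-ins, not the TransistorNode and TransistorList classes of the thesis.

```cpp
#include <cassert>

// Hypothetical stand-in for TransistorNode: only the fields the lookup needs.
struct ListNode {
    unsigned int index;
    ListNode* next;
};

// Returns the node whose stored index equals `index`, or a null pointer when
// the index is out of range or no node carries it. `count` plays the role of
// NumTran: valid indices are 0 .. count - 1.
const ListNode* find_by_index(const ListNode* head, unsigned int count, unsigned int index) {
    if (index >= count)
        return nullptr;
    for (const ListNode* p = head; p != nullptr; p = p->next)
        if (p->index == index)
            return p;
    return nullptr;
}
```

A caller can then test the result before dereferencing it, rather than relying on the error handler to stop the program.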
TransistorListInsert.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_devices.h"
///
int TransistorList::Insert( const char *name, double w, double l, TransistorType t, unsigned int s, unsigned int g, unsigned int d )
{
TransistorNode * tmp;
if ( !head )
{
head = new TransistorNode( name, w, l, t, s, d, g, NumTran );
if ( !head )
return NO_MEM;
head->next = 0;
tail = head;
}
else
{
tmp = new TransistorNode( name, w, l, t, s, d, g, NumTran );
if ( !tmp )
return NO_MEM;
tmp->next = 0;
tail->next = tmp;
tail = tmp;
}
NumTran++;
return OK;
}
TransistorNode.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_devices.h"
///
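TransistorNode::TransistorNode( const char *name, double w, double l, TransistorType t, unsigned int s, unsigned int d, unsigned int g,
unsigned int index ) : // last parameter assumed from hashindex( index ) and the call in TransistorList::Insert (truncated in the listing)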
Name( 0 ), width( w ), length( l ), type( t ),
source( s ), drain( d ), gate( g ), hashindex( index ), next( 0 )
{
Name = new char[ strlen( name ) + 1 ];
if ( !Name )
{
print_log( "FATAL ERROR" );
error( NO_MEM, errno, "PANIC! " );
}
strcpy( Name, name );
}
///
TransistorNode::~TransistorNode()
{
delete[] Name;
}
main.cc
#include "mystdinclude.h"
#include <signal.h>
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"
#include "hspice.h"
#include "fast.h"
#include "test.h"
#include "main.h"
// (the listing shows one further #include whose header name is missing)
#include "readt.h"
///
extern char *optarg;
extern int optind;
///
int main ( int argc, char **argv )
{
int c;
char *FileIn;
char *FileOptions;
time_t tm = time( 0 );
signal(SIGTERM, catch_stop); // signal 15
signal(SIGINT, catch_stop); // signal 2
char log[ 256 ];
print_log( "\n*************************" );
sprintf( log, "%s Version: %s Copyright MFD 1998 ", argv[ 0 ], VERSION );
print_log( log );
print_log( "*************************" );
print_log( ctime( &tm ) );
// some default initialization
FileOptions = 0;
while ( ( c = getopt( argc, argv, "hf:t:" ) ) != -1 )
{
switch ( c )
{
case 'h':
//HELP
return print_help( argv[ 0 ] );
break;
case 'f':
FileOptions = new char[ strlen( optarg ) + 1 ];
if ( !FileOptions )
{
}
strcpy( FileOptions, optarg );
break;
case '?':
default:
break;
}
}
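if ( ( argc - optind ) != 1 )
return print_help( argv[ 0 ] ); // assumed: the statement for this branch is missing from the listing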
else
{
FileIn = new char[ strlen( argv[ optind ] ) + 1 ];
if ( !FileIn )
{
}
strcpy( FileIn, argv[ optind ] );
if ( !FileOptions )
{
FileOptions = new char[ strlen( "options.conf" ) + 1 ];
strcpy( FileOptions, "options.conf" );
}
}
Options options;
int RetCode;
if ( ( RetCode = options.Read( FileOptions ) ) )

{
print_log( "Error reading options file:" );
}
Circuit circuit( FileIn, options );
CritPathList pathList;
if (options.Manual() == 0)
{
RetCode = Critic(circuit, pathList, options);
if (RetCode != OK)
{
print_log( "Error searching critical paths:" );
}
}
else
{
if ( ( RetCode = pathList.Read( FileOptions, circuit ) ) )
{
}
}
if ( ( RetCode = ReadTech() ) )
{
}
EvaluationAlgorithm* simulation = 0; // stays null when no simulator matches
switch ( options.WhichSimAlgorithm() )
{
case HSPICE:
simulation = new Hspice( pathList, options, FileIn );
if ( mkdir( options.Workpath(), 0770 ) )
{
if ( errno != EEXIST )
{
error( NOT_FOUND, errno, "HEY! " );
}
}
break;
case FAST:
simulation = new Fast( pathList, options );
break;
case TESTOPT:
simulation = new TestOpt( pathList, options);
break;
case NONESM:
default:
break;
}
double* LastWidth = new double[ circuit.GetNTran() ];
if ( !simulation || !LastWidth )
{
print_log( "FATAL ERROR" );
}
print_init( circuit, options, pathList.GetNumPath() );
if ( ( RetCode = Optimize( circuit, options, *simulation, LastWidth ) ) )
{
print_log( "Error in optimizing..." );
}
print_log("Writing optimized netlist...");
print_final(FileIn, circuit, pathList.GetNumPath(), *simulation, LastWidth );
print_log( "Time to die..." );
delete[] FileIn;
delete[] FileOptions;
delete[] LastWidth;
delete simulation;
return OK;
}
///
int print_help( const char *name )
{
cerr << "Usage: " << name << " [-f FILEOPTIONS] Netlist_file" << endl;
cerr << "       Where -f FILEOPTIONS = file containing general options (default = options.conf) " << endl;
return OK;
}
nrutil.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "nrutil.h"
#include "print.h"
///
const int NR_END = 1;
#define FREE_ARG char*
///
double *dvector ( long nl, long nh )
/* allocate a double vector with subscript range v[nl..nh] */
{
double * v;
v = new double[ nh - nl + 1 + NR_END ];
if ( !v )
{
//error( NO_MEM, errno, "HEY! " );
return 0;
}
return v - nl + NR_END;
}
///
double **dmatrix ( long nrl, long nrh, long ncl, long nch )
/* allocate a double matrix with subscript range m[nrl..nrh][ncl..nch] */
{
long i, nrow = nrh - nrl + 1, ncol = nch - ncl + 1;
double **m;
m = new double * [ nrow + NR_END ];
if ( !m )
{
return 0;
}
m += NR_END;
m -= nrl;
m[ nrl ] = new double[ nrow * ncol + NR_END ];
if ( !m[ nrl ] )
{
return 0;
}
m[ nrl ] += NR_END;
m[ nrl ] -= ncl;
for ( i = nrl + 1; i <= nrh; i++ )
m[ i ] = m[ i - 1 ] + ncol;
return m;
}
///
void free_dvector ( double *v )
{
// valid only for vectors allocated with nl == NR_END, where v is the pointer returned by new[]
delete[] v;
}
///
void free_dmatrix ( double **m )
{
// valid only for matrices allocated with nrl == ncl == NR_END
delete[] m[ 1 ];
delete[] m;
}
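The allocators above hand back a pointer shifted so that subscripts run from nl to nh, which means the matching release must undo the same shift. A minimal sketch of the same idea that keeps the lower bound next to the storage follows, so that no shifted pointer ever has to be freed; the class name `OffsetVector` is hypothetical and it swaps in `std::vector` for the raw `new[]` of the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A vector of doubles addressed with subscripts lo..hi (inclusive).
class OffsetVector {
public:
    OffsetVector(long lo, long hi) : lo_(lo), data_(span(lo, hi), 0.0) {}

    // Element access with the caller's subscript range.
    double& operator[](long i) {
        assert(i >= lo_ && static_cast<std::size_t>(i - lo_) < data_.size());
        return data_[static_cast<std::size_t>(i - lo_)];
    }

    std::size_t size() const { return data_.size(); }

private:
    // Number of elements in lo..hi; requires hi >= lo.
    static std::size_t span(long lo, long hi) {
        assert(hi >= lo);
        return static_cast<std::size_t>(hi - lo + 1);
    }

    long lo_;
    std::vector<double> data_;
};
```

Because the storage owns itself, any subscript range works and no explicit free function is needed.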
print_final.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"
///
void print_final(const char* FileNetList, const Circuit& circuit, unsigned int NP, EvaluationAlgorithm& simulation, double* LastWidth )
{
char log[ 1024 ];
print_log( "Final Dimensions: " );
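for ( unsigned int i = 0; i < circuit.GetNTran(); i++ ) // assumed loop header (missing from the listing)
{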
int pos = circuit.TranPos( circuit[ i ].DevName() );
sprintf( log, "W[%s] = %3.3gu", circuit[ i ].DevName(), LastWidth[ pos ] );
print_log( log );
}
print_log( "Final critical paths: " );
for ( unsigned int i = 0; i < NP; i++ ) // assumed loop header (missing from the listing)
{
sprintf( log, "%u) Delay=%g ps, Energy=%g pJ, Noise=%g", i,
simulation.GetDelay( i ),
simulation.GetPower( i ),
simulation.GetNoise( i ) );
print_log( log );
}
sprintf( log, "Area=%g ", simulation.GetArea() );
print_log(log);
char line[ 1024 ];
char line2[ 1024 ];
char* FileNetOut;
FileNetOut = new char[ strlen( FileNetList ) + strlen( NetListSuffix ) + 1 ];
if ( !FileNetOut )
{
}
strcpy( FileNetOut, FileNetList );
strcat( FileNetOut, NetListSuffix );
ifstream i_file( FileNetList );
ofstream o_file( FileNetOut );
if ( !i_file || !o_file)
{
print_log( ReturnMessage[ NOT_FOUND ] );
}
while ( i_file.getline( line, sizeof( line ) ) ) // assumed loop header (missing from the listing)
{
int c = 0;
while ( isspace( line[ c++ ] ) );
switch ( line[ --c ] )
{
case 'm':
case 'M':
case 'x':
case 'X':
char tmpstr[ 128 ];
char par[ 16 ];
char type[ 16 ];
char endpar[ 16 ];
char mos[ 8 ];
unsigned int n1, n2, n3, n4;
strcpy( parsestr, "%s %u %u %u %u %s" );
if ( sscanf( line, parsestr, mos, &n1, &n2, &n3, &n4, type ) == 6 )
{
sprintf( line2, "%s %u %u %u %u %s ", mos, n1, n2, n3, n4, type );
strcpy( parsestr, "%*s %*u %*u %*u %*u %*s" );
strcpy( tmpstr, parsestr );
while ( sscanf( line, parsestr, par ) == 1 )
{
while ( isspace( par[ count++ ] ) );
count--;
if ((par[count] == 'w') || (par[count] == 'W'))
{
double W = LastWidth[circuit[mos].Index()];
sprintf(endpar, " w=%gu ", W);
strcat(line2, endpar);
}
else
{
strcat(line2, " ");
strcat(line2, par);
}
}
}
else
{
print_log( ReturnMessage[ NOT_FOUND ] );
}
break;
default:
strcpy(line2, line);
break;
}
o_file << line2 << endl;
}
i_file.close();
o_file.close();
}
print_init.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
///
void print_init( const Circuit& circuit, const Options& options, unsigned int NP )
{
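char log[ 1024 ];
unsigned int n = circuit.GetNTran(); // assumed declaration (missing from the listing)
sprintf( log, "Circuit: %u Transistor || %u Critical paths", n, NP );
print_log( log );
print_log( "Initial Dimensions: " );
for ( unsigned int i = 0; i < n; i++ ) // assumed loop header (missing from the listing)
{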
sprintf( log, "W[%s] = %3.3gu", circuit[ i ].DevName(), circuit[ i ].Width() );
print_log( log );
}
}
print_log.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
///
void print_log( const char *OutString )
{
ofstream o_file( "OPT.log", ios::app );
if ( o_file )
{
o_file << OutString << endl;
o_file.close();
}
}
signal.cc
#include "mystdinclude.h"
#include <signal.h>
///
void catch_stop(int n)
{
if (n == SIGTERM) // signal 15
cerr << endl << "TERM" << endl;
if (n == SIGINT) // signal 2
{
cerr << endl << "TERM2" << endl;
exit(0);
}
}
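A handler such as catch_stop only takes effect once it has been registered. The following is a minimal sketch of that step, assuming registration through the standard `std::signal` call. The flag `stop_requested` and the functions `request_stop` and `install_stop_handlers` are hypothetical names introduced for illustration; they do not appear in the listing.

```cpp
#include <cassert>
#include <csignal>

// Hypothetical flag, written by the handler and polled by the main loop.
static volatile std::sig_atomic_t stop_requested = 0;

// Hypothetical handler: it only records which signal arrived.
extern "C" void request_stop( int sig )
{
    stop_requested = sig;
}

// Registers the handler for SIGTERM and SIGINT.
// Returns true when both registrations succeed.
bool install_stop_handlers()
{
    return std::signal( SIGTERM, request_stop ) != SIG_ERR
        && std::signal( SIGINT, request_stop ) != SIG_ERR;
}
```

Recording the signal in a flag, rather than printing from inside the handler, keeps the handler restricted to operations that are safe in that context; the optimization loop can then test the flag between simulator calls.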
B.2 Optimization algorithms
Slop.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "slop.h"
///
Slop::Slop( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation )
:
OptimizationAlgorithm( circuit, options, simulation ),
TMat(0), TMatOld(0), GMat(0), T(0), SaveW(0)
{
print_log( "Creating Slop instance..." );
TMat = new double * [ NumPath ];
TMatOld = new double * [ NumPath ];
GMat = new double * [ NumPath ];
T = new double [ NumPath ];
SaveW = new double[ NumTran ];
if ( ( !TMat ) || ( !TMatOld ) || ( !GMat ) || ( !T ) || ( !SaveW ) )
{
}
for ( unsigned int i = 0; i < NumPath; i++ )
{
TMat[ i ] = new double [ NumTran ];
TMatOld[ i ] = new double [ NumTran ];
GMat[ i ] = new double [ NumTran ];
if ( ( !TMat[ i ] ) || ( !TMatOld[ i ] ) || ( !GMat[ i ] ) )
{
}
}
}
///
Slop::~Slop()
{
for ( unsigned int i = 0; i < NumPath; i++ )
{
delete[] TMat[ i ];
delete[] TMatOld[ i ];
delete[] GMat[ i ];
}
delete[] TMat;
delete[] TMatOld;
delete[] GMat;
delete[] T;
delete[] SaveW;
}
///
int Slop::Run()
{
int Wbig;
int RetCode;
for ( unsigned int i = 0; i < NumTran; i++ )
SaveW[ i ] = Width[ i ];
double max = 0;
double TMax = 0;
unsigned int jmax;
double dummy;
unsigned int end_acc = 0;
for ( Steps = 1; ( Steps < options.GetOptOption( MAXSTEPS ) )
&& ( max >= 0.0 )
&& ( InternalSteps < options.GetOptOption( MAXSTEPS ) )
&& ( end_acc == 0 ); Steps++ )
{
for ( unsigned int i = 0; i < NumTran; i++ )
Width[ i ] = SaveW[ i ];
dummy = SlopNormSim( Width, RetCode);
if ((RetCode != OK) && (RetCode != CONT) && (RetCode != MAX_STEPS) && (RetCode != END_ACC))
return RetCode;
else if ((RetCode == MAX_STEPS) || (RetCode == END_ACC))
end_acc = 1;
TMax = 0.0;
jmax = 0;
for ( unsigned int i = 0; i < NumPath; i++ )
{
T[ i ] = CPDelay[ i ];
for ( unsigned int j = 0; j < NumTran; j++ )
TMatOld[ i ][ j ] = T[ i ];
if ( T[ i ] > TMax )
{
TMax = T[ i ];
jmax = i;
}
}
for ( unsigned int i = 0; i < NumTran; i++ )
{
Width[ i ] += options.GetOptOption( DELTA );
if (options.GetOptOption( WMAX ) > 0)
if ( Width[ i ] >= options.GetOptOption( WMAX ) )
Width[ i ] = options.GetOptOption( WMAX );
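dummy = SlopNormSim( Width, RetCode);
if ((RetCode != OK) && (RetCode != CONT) && (RetCode != MAX_STEPS) && (RetCode != END_ACC))
return RetCode;
else if ((RetCode == MAX_STEPS) || (RetCode == END_ACC))
end_acc = 1;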
if ( options.GetOptOption( CONSTRAINS ) )
{
if ( ( CPDelay[ i ] <= options.GetOptOption( MAXDELAY ) ) &&
( CPPower[ i ] <= options.GetOptOption( MAXPOWER ) ) &&
( CPNoise[ i ] <= options.GetOptOption( MAXNOISE ) ) &&
( Area <= options.GetOptOption( MAXAREA ) ) )
for ( unsigned int j = 0; j < NumPath; j++ )
{
T[ j ] = CPDelay[ j ];
TMat[ j ][ i ] = T[ j ];
}
}
else
for ( unsigned int j = 0; j < NumPath; j++ )
{
T[ j ] = CPDelay[ j ];
TMat[ j ][ i ] = T[ j ];
}
}
Wbig = -1;
max = 0.0;
for ( unsigned int i = 0; i < NumPath; i++ )
{
for ( unsigned int j = 0; j < NumTran; j++ )
{
GMat[ i ][ j ] = TMatOld[ i ][ j ] - TMat[ i ][ j ];
if ( ( GMat[ i ][ j ] > max ) && ( i == jmax ) )
{
max = GMat[ i ][ j ];
Wbig = j;
}
TMatOld[ i ][ j ] = TMat[ i ][ j ];
}
}
if ( Wbig != -1 )
SaveW[ Wbig ] += options.GetOptOption( DELTA );
else
// no improving transistor was found: force max < 0 to end the loop
max = -1.0;
}
{
}
return RetCode;
}
SlopNorm.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "slop.h"
///
double Slop::SlopNormSim( const double* NewWidth, int& RetCode)
{
return NormSim( NewWidth, RetCode);
}
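The idea behind Slop::Run can be summarised as a greedy, sensitivity-driven loop: each transistor is widened in turn by DELTA, the resulting change in delay is recorded, and only the transistor giving the largest reduction keeps its new width. The sketch below illustrates that idea in isolation and is not the routine of the listing. It assumes a single delay figure instead of one delay per critical path, and it replaces the simulator call with a hypothetical analytic model, `toy_delay`. The names `greedy_step` and `greedy_size` are likewise introduced here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical delay model standing in for the simulator call:
// each transistor contributes load / width plus a self-loading term.
double toy_delay( const std::vector<double>& w )
{
    static const double load[] = { 4.0, 1.0, 2.0 };
    double d = 0.0;
    for ( std::size_t i = 0; i < w.size(); i++ )
        d += load[ i % 3 ] / w[ i ] + 0.1 * w[ i ];
    return d;
}

// One greedy step: try widening each transistor by delta and keep the
// single change that reduces the delay the most.
// Returns the index of the widened transistor, or -1 when no change helps.
int greedy_step( std::vector<double>& w, double delta, double wmax )
{
    const double base = toy_delay( w );
    double best_gain = 0.0;
    int best = -1;
    for ( std::size_t i = 0; i < w.size(); i++ )
    {
        if ( w[ i ] + delta > wmax )
            continue;                        // respect the upper bound
        w[ i ] += delta;                     // perturb one width
        const double gain = base - toy_delay( w );
        w[ i ] -= delta;                     // restore it
        if ( gain > best_gain )
        {
            best_gain = gain;
            best = static_cast<int>( i );
        }
    }
    if ( best != -1 )
        w[ best ] += delta;
    return best;
}

// Repeats the greedy step until nothing improves or max_steps is reached.
// Returns the number of widening steps actually applied.
unsigned int greedy_size( std::vector<double>& w, double delta,
                          double wmax, unsigned int max_steps )
{
    unsigned int steps = 0;
    while ( steps < max_steps && greedy_step( w, delta, wmax ) != -1 )
        steps++;
    return steps;
}
```

In the listing the gain is taken on the path with the largest delay (the test `i == jmax`), so a multi-path version of this sketch would compute `base` and `gain` on the worst path only.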
Slop2.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "slop2.h"
///
Slop2::Slop2( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation )
:
OptimizationAlgorithm( circuit, options, simulation ),
TMat(0), TMatOld(0), GMat(0), SaveW(0)
{
print_log( "Creating Slop2 instance..." );
TMat = new double * [ NumPath ];
TMatOld = new double * [ NumPath ];
GMat = new double * [ NumPath ];
SaveW = new double[ NumTran ];
if ( ( !TMat ) || ( !TMatOld ) || ( !GMat ) || ( !SaveW ) )
{
}
for ( unsigned int i = 0; i < NumPath; i++ )
{
TMat[ i ] = new double [ NumTran ];
TMatOld[ i ] = new double [ NumTran ];
GMat[ i ] = new double [ NumTran ];
if ( ( !TMat[ i ] ) || ( !TMatOld[ i ] ) || ( !GMat[ i ] ) )
{
}
}
}
///
Slop2::~Slop2()
{
for ( unsigned int i = 0; i < NumPath; i++ )
{
delete[] TMat[ i ];
delete[] TMatOld[ i ];
delete[] GMat[ i ];
}
delete[] TMat;
delete[] TMatOld;
delete[] GMat;
delete[] SaveW;
}
///
int Slop2::Run()
{
int Wbig;
int RetCode;
for ( unsigned int i = 0; i < NumTran; i++ )
SaveW[ i ] = Width[ i ];
double max = 0;
double dummy;
dummy = Slop2NormSim( Width, RetCode);
if ( (RetCode != OK ) && (RetCode != CONT))
return RetCode;
for ( unsigned int i = 0; i < NumPath; i++ )
for ( unsigned int j = 0; j < NumTran; j++ )
{
TMatOld[ i ][ j ] = dummy;
}
unsigned int end_acc = 0;
for ( Steps = 1; ( Steps < options.GetOptOption( MAXSTEPS ) )
&& ( max >= 0.0 )
&& ( InternalSteps < options.GetOptOption( MAXSTEPS ) )
&& ( end_acc == 0 ); Steps++ )
{
for ( unsigned int i = 0; i < NumTran; i++ )
{
Width[ i ] += options.GetOptOption( DELTA );
if (options.GetOptOption( WMAX ) > 0)
if ( Width[ i ] >= options.GetOptOption( WMAX ) )
Width[ i ] = options.GetOptOption( WMAX );
dummy = Slop2NormSim( Width, RetCode);
if ((RetCode != OK) && (RetCode != CONT) && (RetCode != MAX_STEPS) && (RetCode != END_ACC))
return RetCode;
else if ((RetCode == MAX_STEPS) || (RetCode == END_ACC))
end_acc = 1;
for ( unsigned int j = 0; j < NumPath; j++ )
{
TMat[ j ][ i ] = dummy;
}
}
Wbig = -1;
max = 0.0;
for ( unsigned int i = 0; i < NumPath; i++ )
for ( unsigned int j = 0; j < NumTran; j++ )
{
GMat[ i ][ j ] = TMatOld[ i ][ j ] - TMat[ i ][ j ];
if ( GMat[ i ][ j ] > max )
{
max = GMat[ i ][ j ];
Wbig = j;
}
TMatOld[ i ][ j ] = TMat[ i ][ j ];
}
if ( Wbig != -1 )
SaveW[ Wbig ] += options.GetOptOption( DELTA );
else
// no improving transistor was found: force max < 0 to end the loop
max = -1.0;
}
return RetCode;
}
Slop2Norm.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "slop2.h"
///
double Slop2::Slop2NormSim( const double* NewWidth, int& RetCode)
{
return NormSim( NewWidth, RetCode);
}
TestEv.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "test.h"
///
TestEval::TestEval( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation )
:
OptimizationAlgorithm( circuit, options, simulation ) ,
TryW(0)
{
print_log( "Creating TestEval instance..." );
TryW = new double[ NumTran ];
if ( !TryW )
{
}
}
///
TestEval::~TestEval()
{
delete[] TryW;
}
///
int TestEval::Run()
{
int RetCode;
double dummy;
for ( unsigned int i = 0; i < NumTran; i++ )
TryW[ i ] = options.GetOptOption( WMIN );
for ( Steps = 1; Steps < options.GetOptOption( MAXSTEPS ) ; Steps++ )
{
dummy = TestEvalNormSim( TryW, RetCode);
if ( (RetCode != OK ) && (RetCode != CONT))
return RetCode;
// if (Steps <= NumTran)
// TryW[(Steps - 1)] += options.GetOptOption( DELTA );
// else
// TryW[(Steps - 1) % NumTran] += options.GetOptOption(DELTA);
for (int i = 0; i < NumTran; i++)
TryW[i] += options.GetOptOption( DELTA );
}
return OK;
}
TestNorm.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "test.h"
///
double TestEval::TestEvalNormSim( const double* NewWidth, int& RetCode)
{
return NormSim( NewWidth, RetCode);
}
B.3 Simulators
Basicnet.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "hspice.h"
///
int Hspice::BasicNetlist( const double* NewWidth, unsigned int Np, const Circuit& circuit )
{
char * FileHspice;
char log[ 1024 ];
char suffix[ 8 ];
sprintf( suffix, "_%u", Np );
FileHspice = new char[ strlen( WorkPath ) + strlen( SimFile ) + strlen( suffix ) + 1 ];
strcpy( FileHspice, WorkPath );
strcat( FileHspice, SimFile );
strcat( FileHspice, suffix );
ofstream o_file( FileHspice );
ifstream i_file( NetlistFile );
if ( !o_file )
{
sprintf( log, " ERROR opening file %s ", FileHspice );
print_log( log );
return NOT_FOUND;
}
if ( !i_file )
{
sprintf( log, " ERROR opening file %s ", NetlistFile );
print_log( log );
return NOT_FOUND;
}
char line[ 1024 ];
i_file.getline( line, 1023 );
o_file << endl << endl << endl << "****** INPUTS ******" << endl;
o_file << ".include inputs." << Np << endl;
o_file << "********************" << endl;
while ( i_file.getline( line, 1023 ) )
{
unsigned int i = 0;
while ( isspace( line[ i++ ] ) );
char s = line[ --i ];
char st1[ 16 ], st2[ 16 ], st3[ 16 ];
int n1, n2, n3, n4;
if ( ( s == 'M' ) || ( s == 'm' ) || ( s == 'X' ) || ( s == 'x' ) )
{
sscanf( line, "%s %d %d %d %d %s %*s %s", st1, &n1, &n2, &n3, &n4, st2, st3 );
int position = circuit.TranPos( st1 );
if ( position == -1 )
return NOT_FOUND;
o_file << st1 << " " << n1 << " " << n2 << " " << n3 << " " << n4 << \
" " << st2 << " " << st3 << " w=" << setprecision( 4 ) << NewWidth[ position ] << "u" << endl;
}
else
o_file << line << endl;
}
o_file.close();
i_file.close();
return OK;
}
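The core of BasicNetlist is the rewriting of the width parameter on each transistor card while the rest of the netlist is copied unchanged. A minimal sketch of that single step is given below, using `std::string` streams instead of `sscanf`. The function name `set_width` is hypothetical, and the sketch makes two simplifying assumptions that the listing does not: only cards starting with `m` or `M` are rewritten (subcircuit cards starting with `x` are left alone), and the width appears as a single `w=` token without embedded spaces.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Rewrites the w= parameter of one MOS card, keeping every other token.
// Cards that do not start with 'm' or 'M' are returned unchanged.
std::string set_width( const std::string& card, double width_um )
{
    std::istringstream in( card );
    std::string token;
    if ( !( in >> token ) || ( token[ 0 ] != 'm' && token[ 0 ] != 'M' ) )
        return card;
    std::ostringstream out;
    out << token;                              // device name
    while ( in >> token )
    {
        if ( token.compare( 0, 2, "w=" ) == 0 || token.compare( 0, 2, "W=" ) == 0 )
            out << " w=" << width_um << "u";   // new width, in microns
        else
            out << " " << token;               // copy the token as it is
    }
    return out.str();
}
```

Working token by token avoids the fixed-size buffers of the `sscanf` version, at the cost of normalising the spacing between tokens.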
Delayread.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "hspice.h"
///
int Hspice::DelayRead( double& del, double& energy, unsigned int Np )
{
char * FileMeas;
char suffix[ 8 ];
sprintf( suffix, "_%u", Np );
FileMeas = new char[ strlen( WorkPath ) + strlen( SimFile ) + strlen( suffix ) + strlen( SuffixFileMeasure ) + 1 ];
if ( !FileMeas )
return NO_MEM;
strcpy( FileMeas, WorkPath );
strcat( FileMeas, SimFile );
strcat( FileMeas, suffix );
strcat( FileMeas, SuffixFileMeasure );
ifstream i_file( FileMeas );
delete[] FileMeas; // the path is not needed once the stream is open
if ( !i_file )
{
print_log( "ERROR opening hspice measure file " );
return NOT_FOUND;
}
char line[ 1024 ];
// the measured values are on the fourth line of the measure file
for ( unsigned int i = 0; i <= 3; i++ )
if ( !i_file.getline( line, 1023 ) )
{
print_log( "ERROR parsing hspice measure file " );
return PARSE_ERROR;
}
if ( sscanf( line, "%lg %lg", &del, &energy ) != 2 )
{
print_log( "ERROR parsing hspice measure file " );
return PARSE_ERROR;
}
i_file.close();
del *= 1E12; // s -> ps
energy *= 1E12; // J -> pJ
return OK;
}
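The parsing step of DelayRead can be exercised without running the simulator if it is written against a generic input stream. The following sketch does so under stated assumptions: the function name `read_measures` is hypothetical, the two measures are taken from the fourth line as in the listing, and the sample file contents used to try it out are invented for illustration rather than copied from an actual hspice run.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Reads the fourth line of a measure file and extracts two numbers,
// converting seconds to picoseconds and joules to picojoules.
// Returns false when the input is too short or the line does not parse.
bool read_measures( std::istream& in, double& delay_ps, double& energy_pj )
{
    std::string line;
    for ( int i = 0; i < 4; i++ )
        if ( !std::getline( in, line ) )
            return false;
    std::istringstream fields( line );
    double d = 0.0, e = 0.0;
    if ( !( fields >> d >> e ) )
        return false;
    delay_ps = d * 1e12;
    energy_pj = e * 1e12;
    return true;
}
```

Passing a `std::ifstream` opened on the measure file gives the behaviour of the listing, while a `std::istringstream` makes the routine easy to check in isolation.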
Hspice.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "hspice.h"
//////////////////////////////////////////////////////////////////////////////
// //
// DELAY MODULE HSPICE //
// //
// 1998 October 9 Politecnico di Torino VLSI LAB //
// //
// Mariagrazia Graziano Ph.D. Student //
// //
////////////////////////////////////////////////////////////////////////////////
Hspice::Hspice( const CritPathList& pathlist, const Options& options, const char* NE )
:
EvaluationAlgorithm( pathlist, options ), SimTime( 0.0 ),
WorkPath( 0 ), SimFile( 0 ),
InputFile( 0 ), NetlistFile( 0 ), SuffixFileMeasure( 0 )
{
print_log( "Creating Hspice instance..." );
WorkPath = new char[ strlen( options.Workpath() ) + 1 ];
SimFile = new char[ strlen( "net2use" ) + 1 ];
InputFile = new char[ strlen( "inputs" ) + 1 ];
NetlistFile = new char[ strlen( NE ) + strlen( NetListSuffix ) + 1 ];
if ( !SimFile || ! InputFile || !NetlistFile )
{
}
strcpy( WorkPath, options.Workpath() );
strcpy( SimFile, "net2use" );
strcpy( InputFile, "inputs" );
strcpy( NetlistFile, NE );
strcat( NetlistFile, NetListSuffix );
SuffixFileMeasure = new char[ 5 ];
if ( !SuffixFileMeasure )
{
}
strcpy( SuffixFileMeasure, ".mt0" );
}
///
Hspice::~Hspice()
{
delete[] WorkPath;
delete[] SimFile;
delete[] InputFile;
delete[] NetlistFile;
delete[] SuffixFileMeasure;
}
///
int Hspice::Run( const Circuit& circuit, const double *NewWidth, const unsigned *ValidPath )
{
SimTime = options.GetSimOption( SIMTIME );
Calls++;
for ( unsigned int NP = 0; NP < NumPath; NP++ )
{
if (ValidPath[NP])
{
int RetCode;
RetCode = BasicNetlist( NewWidth, NP, circuit );
if ( RetCode != OK )
return RetCode;
RetCode = SetInput( NP, circuit.Valim() );
if ( RetCode != OK )
return RetCode;
RetCode = SimCall( NP );
if ( RetCode != OK )
return RetCode;
double OneDelay, OneEnergy;
RetCode = DelayRead( OneDelay, OneEnergy, NP );
if ( RetCode != OK )
return RetCode;
CPDelay[ NP ] = OneDelay;
CPPower[ NP ] = OneEnergy;
CPNoise[ NP ] = 0.0;
}
}
Area = CalcArea( NewWidth, circuit.GetNTran() );
return OK;
}
HspiceArea.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "hspice.h"
///
double Hspice::CalcArea( const double *NewWidth, unsigned int NT )
{
double A = 0.0;
for ( unsigned int i = 0; i < NT; i++ )
{
A += NewWidth[ i ];
}
return ( A );
}
Setinput.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "hspice.h"
///
int Hspice::SetInput( unsigned int Np, double Val )
{
char suffix[ 8 ];
sprintf( suffix, ".%u", Np );
char *Inputs = new char[ strlen( WorkPath ) + strlen( InputFile ) + strlen( suffix ) + 1 ];
if ( !Inputs )
return NO_MEM;
strcpy( Inputs, WorkPath );
strcat( Inputs, InputFile );
strcat( Inputs, suffix );
ofstream input_file( Inputs );
if ( !input_file )
return NOT_FOUND;
unsigned int node_in = pathlist[ Np ].GetNodeIn();
unsigned int node_out = pathlist[ Np ].GetNodeOut();
TransitionType TIn = pathlist[ Np ].GetTransitionIn();
if ( TIn == RISE )
input_file << endl << "v_node_in " << node_in << \
" 0 " << " pwl(0 0 " << ( SimTime / 2.0 ) << \
"p 0 " << ( SimTime / 2.0 ) + pathlist[ Np ].GetInTime() << "p " << Val << ")";
else if ( TIn == FALL )
input_file << endl << "v_node_in " << node_in << \
" 0 " << " pwl(0 " << Val << " " << ( SimTime / 2.0 ) << \
"p " << Val << " " << ( SimTime / 2.0 ) + \
pathlist[ Np ].GetInTime() << "p 0)" ;
unsigned int node;
double nodeVal;
while ( pathlist[ Np ].TraverseActiveInputs( node, nodeVal ) )
{
input_file << endl << "v_ACTIVE_" << node << " " << \
node << " 0 dc " << nodeVal;
}
while ( pathlist[ Np ].TraverseNoActiveInputs( node, nodeVal ) )
{
input_file << endl << "v_NO_ACTIVE_" << node << " " << \
node << " 0 dc " << nodeVal;
}
while ( pathlist[ Np ].TraverseInitialConditions( node, nodeVal ) )
{
input_file << endl << endl << ".ic v(" << node << ")=" << \
nodeVal;
}
TransitionType TOut = pathlist[ Np ].GetTransitionOut();
if ( TOut == RISE )
input_file << endl << endl << ".ic v(" << node_out << ")=0";
else if ( TOut == FALL )
input_file << endl << endl << ".ic v(" << node_out << ")=" << Val;
// Delay meas.
input_file << endl << endl << ".measure tran path_n0_" << Np << "delay " << \
" trig v(" << node_in << ")" << " val=" << Val*0.5 << " " << TransitionString[ TIn ] << "=1" << \
" targ v(" << node_out << ")" << " val=" << Val * 0.5 << " " << TransitionString[ TOut ] << "=1";
// Power meas.
input_file << endl << endl << ".measure tran path_n0_" << Np << "power " << \
" integ " << "POWER" << " from=0ps" << " to=" << SimTime << "ps";
input_file << endl << endl << ".tran 10p " << SimTime << "p" << endl;
input_file.close();
return OK;
}
Simcall.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "hspice.h"
///
int Hspice::SimCall( unsigned int Np )
{
char system_string[ 512 ];
sprintf( system_string, "cd %s && hspice %s_%u 1>./hspice.log.%u 2>&1", WorkPath, SimFile, Np, Np );
if ( system( system_string ) == -1 )
{
print_log( "ERROR invoking hspice simulator " );
return NO_MEM;
}
return OK;
}
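SimCall formats the shell command into a fixed 512-character buffer, which a long working path could overflow. As a hedged alternative, the command can be assembled in a `std::string`, as sketched below. The function name `hspice_command` is hypothetical; the command text itself follows the format string of the listing, and nothing is executed by the sketch.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Builds the shell command for critical path number np.
// The result can be passed to system() through c_str().
std::string hspice_command( const std::string& work_path,
                            const std::string& sim_file, unsigned int np )
{
    std::ostringstream cmd;
    cmd << "cd " << work_path << " && hspice " << sim_file << "_" << np
        << " 1>./hspice.log." << np << " 2>&1";
    return cmd.str();
}
```

Separating the construction of the command from its execution also makes the string easy to log before the simulator is invoked.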
Brackets.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
const double FACTOR = 1.6;

const int NTRY = 50;
const double ZEPS = 1e-2;
///
int Fast::Brackets( const Circuit& circuit, unsigned int NP, unsigned int NC, double& start, double& end, TransistorType type, unsigned
{
int jj;
double f1, f2, x1, x2;
if ( start == end )
{
x1 = x2 = 0;
return 0;
}
if ( type == NMOS )
{
f1 = EqN( circuit, NP, NC, start, RetCode, j, n, p, NewWidth );
f2 = EqN( circuit, NP, NC, end, RetCode, j, n, p, NewWidth );
}
else if ( type == PMOS )
{
f1 = EqP( circuit, NP, NC, start, RetCode, j, n, p, NewWidth );
f2 = EqP( circuit, NP, NC, end, RetCode, j, n, p, NewWidth );
}
for ( jj = 1; jj <= NTRY; jj++ )
{
if ( f1 * f2 < 0.0 )
{
RetCode = OK;
return 1;
}
if ( fabs ( f1 ) < fabs ( f2 ) )
{
start += FACTOR * ( start - end );
if ( start <= 0.0 )
start = ZEPS;
if ( type == NMOS )
f1 = EqN( circuit, NP, NC, start, RetCode, j, n, p, NewWidth );
else if ( type == PMOS )
f1 = EqP( circuit, NP, NC, start, RetCode, j, n, p, NewWidth );
}
else
{
end += FACTOR * ( end - start );
if ( end <= 0.0 )
end = ZEPS;
if ( type == NMOS )
f2 = EqN( circuit, NP, NC, end, RetCode, j, n, p, NewWidth );
else if ( type == PMOS )
f2 = EqP( circuit, NP, NC, end, RetCode, j , n, p, NewWidth );
}
}
return 0;
}
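Fast::Brackets searches for an interval on which the circuit equation changes sign, by repeatedly pushing outward the end point at which the function is smaller in magnitude. The sketch below shows the same expansion for an arbitrary scalar function, as an illustration only. The name `expand_bracket` is hypothetical, the function under test is passed as a callable instead of the EqN and EqP members, and the clamping of non-positive end points to ZEPS performed in the listing is omitted.

```cpp
#include <cassert>
#include <cmath>

// Expands the interval [a, b] geometrically until f changes sign across it.
// Returns true and leaves the bracket in a and b on success.
template <typename F>
bool expand_bracket( F f, double& a, double& b,
                     double factor = 1.6, int max_try = 50 )
{
    if ( a == b )
        return false;                       // a degenerate interval cannot grow
    double fa = f( a ), fb = f( b );
    for ( int k = 0; k < max_try; k++ )
    {
        if ( fa * fb < 0.0 )
            return true;                    // sign change: root is bracketed
        if ( std::fabs( fa ) < std::fabs( fb ) )
        {
            a += factor * ( a - b );        // move a away from b
            fa = f( a );
        }
        else
        {
            b += factor * ( b - a );        // move b away from a
            fb = f( b );
        }
    }
    return false;
}
```

Once a bracket has been found, any bracketing root finder, for instance bisection, can be started from `a` and `b`.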
CalcpowN.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
int Fast::CalcPowerN( const Circuit& circuit, unsigned int NP, unsigned int NC, double& Ecc, double& Esc, unsigned int n, unsigned int p, const double*
{
double H_1, I_1, Vc, t0_bs, C_n;
double J_1, K_1, M_1, N_1;
double t0 = t0_n[ n ];
double tauo = tauo_n[ n ];
double tin = taui_n[ 1 ];
t0_bs = tin * ( VDD + TECH.Vtp0 ) / VDD;
double tc = (VDD * tauo_n[n] * (t0_n[n] - taui_n[n]) + \
Vs_n[n] * taui_n[n] * (tauo_n[n] - t0_n[n])) / \
(VDD * (t0_n[n] - taui_n[n]) + Vs_n[n] * (tauo_n[n] - t0_n[n]));
Esc = 0;
Ecc = 0;
if (p > 0)
{
Vc = TECH.Ec_p * L_p[ p ];
H_1 = Vc * beta_p[ p ] * ( VDD * ( t0 - tauo ) * \
( 2 * Vc * ( t0 - tauo ) - Vd_n[ n ] * tin ) - \
Vd_n[ n ] * tin * ( Vc * ( t0 - tauo ) - Vd_n[ n ] * tauo + 2 * TECH.Vtp0 * ( t0 - tauo ) ) ) / \
( 2 * Vd_n[ n ] * tin * ( t0 - tauo ) );
I_1 = Vc * beta_p[ p ] * ( 2 * VDD * ( t0 - tauo ) - Vd_n[ n ] * tin ) / \
( 2 * tin * ( t0 - tauo ) );
J_1 = ( Vc * Vc ) * beta_p[ p ] * ( t0 - tauo ) * ( 2 * ( VDD * VDD ) * ( t0 - tauo ) + \
2 * VDD * ( Vc * ( t0 - tauo ) + Vd_n[ n ] * ( tauo - tin ) ) - \
Vd_n[ n ] * tin * ( Vc + 2 * TECH.Vtp0 ) );
K_1 = 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + \
Vc * ( t0 - tauo ) + Vd_n[ n ] * tauo );
M_1 = VDD * Vc * beta_p[ p ] * ( VDD + 2 * TECH.Vtp0 ) / ( 2 * ( VDD + Vc ) );
N_1 = VDD * VDD * Vc * beta_p[ p ] / ( tin * ( VDD + Vc ) );
if ( t0_bs < tauo )
{
Esc = ( J_1 * ( K_1 - 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + Vd_n[ n ] * tauo ) ) * \
0.25 * LOG ( 2 * Vd_n[ n ] * Vd_n[ n ] * t0 * tin - K_1 ) / ( Vd_n[ n ] * Vd_n[ n ] * Vd_n[ n ] * \
( tin * tin ) * ( tauo - t0 ) ) + \
J_1 * ( K_1 - 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + Vd_n[ n ] * tauo ) ) * \
0.25 * LOG ( 2 * Vd_n[ n ] * Vd_n[ n ] * t0_bs * tin - K_1 ) / \
( Vd_n[ n ] * Vd_n[ n ] * Vd_n[ n ] * ( tin * tin ) * ( t0 - tauo ) ) + \
( ( t0 - t0_bs ) * \
( 3 * H_1 * Vd_n[ n ] * tin * ( 2 * VDD * ( t0 - tauo ) - Vd_n[ n ] * ( t0 + t0_bs - 2 * tauo ) ) + \
I_1 * Vd_n[ n ] * tin * ( 3 * VDD * ( t0 + t0_bs ) * ( t0 - tauo ) - Vd_n[ n ] * ( 2 * ( t0 * t0 ) + \
t0 * ( 2 * t0_bs - 3 * tauo ) + t0_bs * ( 2 * t0_bs - 3 * tauo ) ) ) - 3 * J_1 ) ) / \
( 6 * Vd_n[ n ] * tin * ( t0 - tauo ) ) );
}
else
{
Esc = ( J_1 * ( K_1 - 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + Vd_n[ n ] * tauo ) ) * \
0.25 * LOG ( 2 * Vd_n[ n ] * Vd_n[ n ] * t0 * tin - K_1 ) / ( Vd_n[ n ] * Vd_n[ n ] * Vd_n[ n ] * \
( tin * tin ) * ( tauo - t0 ) ) + \
J_1 * ( K_1 - 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + Vd_n[ n ] * tauo ) ) * \
0.25 * LOG ( 2 * Vd_n[ n ] * Vd_n[ n ] * tauo * tin - K_1 ) / \
( Vd_n[ n ] * Vd_n[ n ] * Vd_n[ n ] * ( tin * tin ) * ( t0 - tauo ) ) + \
( 3 * H_1 * Vd_n[ n ] * tin * ( t0 - tauo ) * ( 2 * VDD - Vd_n[ n ] ) + \
I_1 * Vd_n[ n ] * tin * ( t0 - tauo ) * \
( 3 * VDD * ( t0 + tauo ) - Vd_n[ n ] * ( 2 * t0 + tauo ) ) - 3 * J_1 ) / \
( 6 * Vd_n[ n ] * tin ) );
}
Esc = fabs( Esc );
}
const char* name;
unsigned int node = 0;
double Wjn, Wgn, Wjp, Wgp;
int njn, ngn, njp, ngp;
for ( unsigned int i = 1; i <= n; i++ )
{
C_n = 0.0;
name = pathlist[ NP ].TransistorName( i - 1, NC );
if ( circuit[ name ].Source() == node )
{
node = circuit[ name ].Drain();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
else if ( circuit[ name ].Drain() == node )
{
node = circuit[ name ].Source();
}
int nc;
// Cj N
C_n += TECH.C_nj * Wjn * TECH.Df * \
( Vd_n[ i ] * Vd_n[ i ] * ( TECH.mj_n - 1 ) * ( TECH.mj_n - 1 ) + \
Vd_n[ i ] * TECH.mj_n * TECH.PB_n + TECH.PB_n * TECH.PB_n ) * \
pow ( ( 1 + Vd_n[ i ] / TECH.PB_n ), -TECH.mj_n ) / \
( ( TECH.mj_n - 2 ) * ( TECH.mj_n - 1 ) );
C_n += TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * \
( Vd_n[ i ] * Vd_n[ i ] * ( TECH.mjsw_n - 1 ) * ( TECH.mjsw_n - 1 ) + \
Vd_n[ i ] * TECH.mjsw_n * TECH.PB_n + TECH.PB_n * TECH.PB_n ) * \
pow ( 1 + Vd_n[ i ] / TECH.PB_n, -TECH.mjsw_n ) / \
( ( TECH.mjsw_n - 2 ) * ( TECH.mjsw_n - 1 ) ) + \
TECH.PB_n * TECH.PB_n * ( TECH.C_nj * Wjn * TECH.Df * \
( TECH.mjsw_n - 2 ) * ( TECH.mjsw_n - 1 ) + \
TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * \
( TECH.mj_n - 2 ) * ( TECH.mj_n - 1 ) ) / \
( ( 2 - TECH.mjsw_n ) * ( TECH.mj_n - 2 ) * ( TECH.mj_n - 1 ) * ( TECH.mjsw_n - 1 ) );
// Cj P
if (p > 0)
{
double x = TECH.mj_p - 1;
double y = TECH.mjsw_p - 1;
C_n += ( TECH.C_pj * Wjp * TECH.Df * \
( VDD * VDD * x + VDD * TECH.mj_p * \
(TECH.PB_p - Vd_n[ i ] * x) + Vd_n[ i ] * Vd_n[ i ] * x * x - \
Vd_n[ i ] * TECH.mj_p * TECH.PB_p + TECH.PB_p * TECH.PB_p ) * \
pow( ( VDD - Vd_n[ i ] + TECH.PB_p ) / TECH.PB_p, -TECH.mj_p ) ) / \
( ( x - 1 ) * x );
C_n += ( TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
( VDD * VDD * y + VDD * TECH.mjsw_p * \
(TECH.PB_p - Vd_n[ i ] * y) + Vd_n[ i ] * Vd_n[ i ] * y * y - \
Vd_n[ i ] * TECH.mjsw_p * TECH.PB_p + TECH.PB_p * TECH.PB_p ) * \
pow( ( VDD - Vd_n[ i ] + TECH.PB_p ) / TECH.PB_p, -TECH.mjsw_p ) ) / \
( ( y - 1 ) * y );
C_n += ( TECH.C_pj * Wjp * TECH.Df * \
( VDD * VDD * x + VDD * TECH.mj_p * TECH.PB_p + \
TECH.PB_p * TECH.PB_p ) * \
pow( ( VDD + TECH.PB_p ) / TECH.PB_p, -TECH.mj_p ) ) / \
( ( 1 - x ) * x );
C_n += ( TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
( VDD * VDD * y + VDD * TECH.mjsw_p * TECH.PB_p + \
TECH.PB_p * TECH.PB_p ) * \
pow( ( VDD + TECH.PB_p ) / TECH.PB_p, -TECH.mjsw_p ) ) / \
( ( 1 - y ) * y );
}
C_n += circuit.CapStaticGnd( node, nc ) * Vd_n[i] * Vd_n[i] * 0.5;
C_n += circuit.CapStaticVdd( node, nc ) * Vd_n[i] * Vd_n[i] * 0.5;
C_n += Wgn * TECH.Lmin * TECH.Cox_n * Vd_n[i] * Vd_n[i] * 0.5;
C_n += Wgp * TECH.Lmin * TECH.Cox_p * Vd_n[i] * Vd_n[i] * 0.5;
if ( i < n )
C_n += TECH.Cgs0_n * ( ( Wjn - W_n[ i ] - W_n[ i + 1 ] ) + \
( njn - 2 ) * TECH.XW_n ) * 0.5 * Vd_n[i] * Vd_n[i];
else
C_n += TECH.Cgs0_n * ( ( Wjn - W_n[ i ] ) + \
( njn - 1 ) * TECH.XW_n ) * 0.5 * Vd_n[i] * Vd_n[i];
C_n += TECH.Cgs0_p * ( Wjp + njp * TECH.XW_p ) * Vd_n[i] * Vd_n[i] * 0.5;
// Cgd
if ( (( i == 1 ) && ( i < n - 1 )) || ((i == 1) && (n == 1)) )
{
double Cov = TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n );
double Cg = Cov + 0.5 * TECH.Cox_n * W_n[ i ] * L_n[ i ];
int Op, SOp;
Op = Calct0ts1N( circuit, NP, SOp, NewWidth );
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
C_n += Cg * Vd_n[i] * (Vd_n[i] * taui_n[i] * \
(ts_n[i] - tauo_n[1]) * (ts_n[i] - tauo_n[1]) - \
VDD * (t0_n[i] - tauo_n[1]) * \
((ts_n[i] * ts_n[i]) - 2 * ts_n[i] * tauo_n[1]-taui_n[i] * (taui_n[i] - 2 * tauo_n[1]))) / \
(2 * taui_n[i] * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
C_n += Cov * Vd_n[i] * ((t0_n[i] - ts_n[i]) * (t0_n[i] + ts_n[i] - 2 * tauo_n[1])) * \
(Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
break;
case _AA_:
C_n += Cg * Vd_n[i] * (ts_n[i] - tauo_n[1]) * (ts_n[i] - tauo_n[1])*\
C_n += Cov * Vd_n[i] * (Vd_n[i] * taui_n[i]*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1])) - \
VDD * (t0_n[i] - tauo_n[1])*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-taui_n[i] * (taui_n[i] - 2 * tauo_n[1]))) / \
break;
case _B_:
C_n += Cg * Vd_n[i] * (VDD * ((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-\
taui_n[i] * (taui_n[i] - 2 * tauo_n[1])) + \
Vd_n[i] * taui_n[i] * (tauo_n[1]-t0_n[i])) / \
(2 * taui_n[i] * (tauo_n[1]-t0_n[i]));
break;
case _C_:
C_n += Cg * Vd_n[i] * Vd_n[i] * (ts_n[i] - tauo_n[1]) * \
(ts_n[i] - tauo_n[1]) / \
(2 * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
C_n += Cov * Vd_n[i] * Vd_n[i]*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1])) / \
break;
case _D_:
C_n += Cg * Vd_n[i] * Vd_n[i] / 2;
break;
case _F_:
C_n += Cov * Vd_n[i]*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1]))*\
break;
case _G_:
C_n += Cg * Vd_n[i] * (Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
(2 * taui_n[i]);
break;
case _E_:
default:
C_n += 0.0;
break;
}
}
else if ( ( i < n - 1 ) && ( i > 1 ) )
{
C_n += ( TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ i ] * TECH.Lmin ) * \
Vd_n[ i ] * Vd_n[ i ] * 0.5;
C_n += ( TECH.Cgs0_n * ( W_n[ i + 1 ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ i + 1 ] * TECH.Lmin ) * \
Vd_n[ i ] * Vd_n[ i ] * 0.5;
}
else if ( (i == 1) && (i == n - 1) )
{
double Cov = TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n );
double Cg = Cov + 0.5 * TECH.Cox_n * W_n[ i ] * L_n[ i ];
int Op, SOp;
Op = Calct0ts1N( circuit, NP, SOp, NewWidth );
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
C_n += Cg * Vd_n[i] * (Vd_n[i] * taui_n[i] * \
(ts_n[i] - tauo_n[1]) * (ts_n[i] - tauo_n[1]) - \
VDD * (t0_n[i] - tauo_n[1]) * \
((ts_n[i] * ts_n[i]) - 2 * ts_n[i] * tauo_n[1]-taui_n[i] * (taui_n[i] - 2 * tauo_n[1]))) / \
C_n += Cov * Vd_n[i] * ((t0_n[i] - ts_n[i]) * (t0_n[i] + ts_n[i] - 2 * tauo_n[1])) * \
break;
case _AA_:
C_n += Cov * Vd_n[i] * (Vd_n[i] * taui_n[i]*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1])) - \
VDD * (t0_n[i] - tauo_n[1])*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-taui_n[i] * (taui_n[i] - 2 * tauo_n[1]))) / \
break;
case _B_:
C_n += Cg * Vd_n[i] * (VDD * ((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-\
taui_n[i] * (taui_n[i] - 2 * tauo_n[1])) + \
Vd_n[i] * taui_n[i] * (tauo_n[1]-t0_n[i])) / \
(2 * taui_n[i] * (tauo_n[1]-t0_n[i]));
break;
case _C_:
C_n += Cg * Vd_n[i] * Vd_n[i] * (ts_n[i] - tauo_n[1]) * \
(ts_n[i] - tauo_n[1]) / \
C_n += Cov * Vd_n[i] * Vd_n[i]*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1])) / \
break;
case _D_:
C_n += Cg * Vd_n[i] * Vd_n[i] / 2;
break;
case _F_:
C_n += Cov * Vd_n[i]*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1]))*\
break;
case _G_:
C_n += Cg * Vd_n[i] * (Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
(2 * taui_n[i]);
break;
case _E_:
default:
C_n += 0.0;
break;
}
Cov = ( ( TECH.Cgs0_n * W_n[ n ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
Cg = ( ( TECH.Cgs0_n * W_n[ n ] + TECH.XW_n ) + 0.666666 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
Op = Calct0tsnN( n, SOp);
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
C_n += Cov * Vs_n[n] * Vs_n[n] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * taui_n[n] - \
ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
(2 * (t0_n[i] - taui_n[n]) * (t0_n[i] - taui_n[n]));
C_n += Cg * Vs_n[n] * Vs_n[n] * (ts_n[n] - taui_n[n]) * (ts_n[n] - taui_n[n]) * \
break;
case _B_:
C_n += Cov * Vs_n[n] * Vs_n[n] * 0.5;
break;
case _C_:
C_n += -Cg * Vs_n[n] * Vs_n[n] * (tc * tc - 2 * tc * taui_n[n] - \
ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
C_n += Cov * Vs_n[n] * Vs_n[n] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * taui_n[i] - \
ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
(2 * (t0_n[i] - taui_n[n]) * (t0 - taui_n[n]));
break;
case _D_:
tc * (tc - 2 * taui_n[n])) / \
break;
case _E_:
default:
break;
}
}
else if ( (i == n - 1) && (n > 2))
{
C_n += ( TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ i ] * TECH.Lmin ) * \
Vd_n[ i ] * ( 2 * VDD - Vd_n[ i ] ) * 0.5;
int Op, SOp;
double Cov = ( ( TECH.Cgs0_n * W_n[ n ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
double Cg = ( ( TECH.Cgs0_n * W_n[ n ] + TECH.XW_n ) + 0.666666 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
Op = Calct0tsnN( n, SOp);
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
C_n += Cov * Vs_n[n] * Vs_n[n] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * taui_n[n] - \
ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
C_n += Cg * Vs_n[n] * Vs_n[n] * (ts_n[n] - taui_n[n]) * (ts_n[n] - taui_n[n]) * \
break;
case _B_:
C_n += Cov * Vs_n[n] * Vs_n[n] * 0.5;
break;
case _C_:
C_n += -Cg * Vs_n[n] * Vs_n[n] * (tc * tc - 2 * tc * taui_n[n] - \
ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
(2 * (t0_n[i] - taui_n[n]) * (t0 - taui_n[n]));
break;
case _D_:
tc * (tc - 2 * taui_n[n])) / \
break;
case _E_:
default:
break;
}
}
else if ( i == n )
{
double Cov = TECH.Cgd0_n * ( W_n[ n ] + TECH.XW_n );
double Cg = Cov + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ];
int Op, SOp;
Op = Calct0tsnN(n, SOp);
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
case _B_:
C_n += Cov * \
Vd_n[i] * Vd_n[i] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * tauo_n[i] - \
ts_n[i] * (ts_n[i] - 2 * tauo_n[i])) / \
(2 * (t0_n[i] - tauo_n[i]) * (t0_n[i] - tauo_n[i]));
C_n += Cg * \
Vd_n[i] * Vd_n[i] * (ts_n[i] - tauo_n[i]) * (ts_n[i] - tauo_n[i]) / \
break;
case _C_:
C_n += Cov * \
Vd_n[i] * Vd_n[i] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * tauo_n[i] - \
ts_n[i] * (ts_n[i] - 2 * tauo_n[i])) / \
C_n += -Cg * \
Vd_n[i] * Vd_n[i] * (tc * tc - 2 * tc * tauo_n[i] - \
ts_n[i] * (ts_n[i] - 2 * tauo_n[i])) / \
break;
case _D_:
C_n += Cov * \
Vd_n[i] * Vd_n[i] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] - tauo_n[i] - \
tc * (tc - 2 * tauo_n[i])) / \
break;
case _E_:
default:
break;
}
}
if ( C_n < 0.0 )
C_n *= -1;
Ecc += C_n;
}
return OK;
}
CalcpowP.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
int Fast::CalcPowerP( const Circuit& circuit, unsigned int NP, unsigned int NC, double& Ecc, double& Esc, unsigned int n, unsigned int p, const double*
{
double H_1, I_1, Vc, t0_bs, C_p;
double J_1, K_1, M_1, N_1, O_1, P_1;
double t0 = t0_p[ p ];
double tauo = tauo_p[ p ];
double tin = taui_p[ 1 ];
t0_bs = tin * ( VDD - TECH.Vtn0 ) / VDD;
double tc = (VDD * tauo_p[p] * (t0_p[p] - taui_p[p]) + \
Vs_p[p] * taui_p[p] * (tauo_p[p] - t0_p[p])) / \
(VDD * (t0_p[p] - taui_p[p]) + Vs_p[p] * (tauo_p[p] - t0_p[p]));
Esc = 0;
Ecc = 0;
if (n > 0)
{
Vc = TECH.Ec_n * L_n[ n ];
H_1 = Vc * beta_n[ n ] * ( ( VDD * VDD ) * tin * ( t0 - 2 * tauo ) - \
VDD * ( Vc * ( t0 - tauo ) * ( 2 * t0 - tin - 2 * tauo ) + \
tin * ( Vd_p[ p ] * ( t0 - 3 * tauo ) + 2 * TECH.Vtn0 * ( t0 - tauo ) ) ) - \
Vd_p[ p ] * tin * ( Vc * ( t0 - tauo ) + Vd_p[ p ] * tauo + 2 * TECH.Vtn0 * ( tauo - t0 ) ) ) / \
( 2 * tin * ( VDD - Vd_p[ p ] ) * ( t0 - tauo ) );
I_1 = Vc * beta_n[ n ] * ( VDD * ( 2 * t0 - tin - 2 * tauo ) + \
Vd_p[ p ] * tin ) / ( 2 * tin * ( tauo - t0 ) );
J_1 = ( Vc * Vc ) * beta_n[ n ] * ( tauo - t0 ) * \
( 2 * ( VDD * VDD ) * ( t0 - tin ) + VDD * ( Vc * ( 2 * t0 - tin - 2 * tauo ) + \
2 * ( Vd_p[ p ] * ( tin - tauo ) + TECH.Vtn0 * tin ) ) + Vd_p[ p ] * tin * ( Vc - 2 * TECH.Vtn0 ) );
K_1 = 2 * tin * ( VDD - Vd_p[ p ] ) * ( VDD * t0 + Vc * ( t0 - tauo ) - \
Vd_p[ p ] * tauo );
M_1 = Vc * beta_n[ n ] * ( VDD * t0 + Vc * ( tauo - t0 ) - Vd_p[ p ] * tauo + 2 * TECH.Vtn0 * ( t0 - tauo ) ) / ( 2 * ( tauo - t0 ) );
N_1 = Vc * beta_n[ n ] * ( VDD - Vd_p[ p ] ) / ( 2 * ( t0 - tauo ) );
O_1 = ( Vc * Vc ) * beta_n[ n ] * ( Vc - 2 * TECH.Vtn0 ) * ( t0 - tauo );
P_1 = 2 * ( VDD * t0 + Vc * ( t0 - tauo ) - Vd_p[ p ] * tauo );
if ( t0_bs < tauo )
{
Esc = ( VDD * ( 3 * J_1
LOG ( 2
3 * J_1
LOG ( 2
2 * tin
( t0_bs
*
*
*
*
*
-
( K_1 - 2 * tin * t0 * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \

t0 * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) - K_1 ) + \
( 2 * tin * t0 * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) - K_1 ) * \
t0_bs * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) - K_1 ) + \
( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \
t0 ) * ( 3 * H_1 * tin * ( VDD - Vd_p[ p ] ) * \
( VDD - Vd_p[ p ] ) * ( t0 - t0_bs ) + \
I_1 * tin * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) * \
( t0 * t0 + t0 * t0_bs - 2 * t0_bs * t0_bs ) - 3 * J_1 ) ) / \
( 12 * ( tin * tin ) * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \
( t0 - tauo ) ) );
}
else
{
Esc = ( ( 3 * J_1 * ( K_1 + 2 * tin * ( VDD - Vd_p[ p ] ) * ( Vd_p[ p ] * tauo - VDD * t0 ) ) * \
LOG ( 2 * t0 * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) - K_1 ) - \

3 * J_1 * ( K_1 + 2 * tin * ( VDD - Vd_p[ p ] ) * \
( Vd_p[ p ] * tauo - VDD * t0 ) ) * \
LOG ( 2 * tauo * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) - K_1 ) + \
2 * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \
( tauo - t0 ) * ( 3 * H_1 * tin * ( VDD - Vd_p[ p ] ) * \
( VDD + Vd_p[ p ] ) * ( t0 - tauo ) + \
I_1 * tin * ( VDD - Vd_p[ p ] ) * ( t0 - tauo ) * \
( VDD * ( t0 + 2 * tauo ) + Vd_p[ p ] * ( 2 * t0 + tauo ) ) - 3 * J_1 ) ) / \
( 12 * ( tin * tin ) * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \
( t0 - tauo ) ) );
}
Esc = fabs( Esc );
}
const char* name;
unsigned int node = circuit.ValimNode();
for ( unsigned int i = 1; i <= p; i++ )
{
C_p = 0.0;
// first there are nmos
name = pathlist[ NP ].TransistorName( n + p - i, NC );
// then there are the pmos, in REVERSE order
{
}
{
}
int nc;
// Cj P
C_p += TECH.C_pj * Wjp * TECH.Df * \
( ( VDD * VDD + Vd_p[ i ] * Vd_p[ i ] ) * ( TECH.mj_p - 1 ) * ( TECH.mj_p - 1 ) + VDD * \
( TECH.mj_p * TECH.PB_p - 2 * Vd_p[ i ] * ( TECH.mj_p - 1 ) * ( TECH.mj_p - 1 ) ) - \
Vd_p[ i ] * TECH.mj_p * TECH.PB_p + TECH.PB_p * TECH.PB_p ) * \
pow ( ( VDD - Vd_p[ i ] + TECH.PB_p ) / TECH.PB_p, -TECH.mj_p ) / \
( ( TECH.mj_p - 2 ) * ( TECH.mj_p - 1 ) );
C_p += TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
( ( VDD * VDD + Vd_p[ i ] * Vd_p[ i ] ) * ( TECH.mjsw_p - 1 ) * ( TECH.mjsw_p - 1 ) + VDD * \
( TECH.mjsw_p * TECH.PB_p - 2 * Vd_p[ i ] * ( TECH.mjsw_p - 1 ) * ( TECH.mjsw_p - 1 ) ) - \
Vd_p[ i ] * TECH.mjsw_p * TECH.PB_p + TECH.PB_p * TECH.PB_p ) * \
pow ( ( VDD - Vd_p[ i ] + TECH.PB_p ) / TECH.PB_p, -TECH.mjsw_p ) / \
( ( TECH.mjsw_p - 2 ) * ( TECH.mjsw_p - 1 ) );
C_p += TECH.PB_p * TECH.PB_p * ( TECH.C_pj * Wjp * TECH.Df * \
( TECH.mjsw_p - 2 ) * ( TECH.mjsw_p - 1 ) + \
TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
( TECH.mj_p - 2 ) * ( TECH.mj_p - 1 ) ) / \
( ( TECH.mj_p - 2 ) * ( TECH.mj_p - 1 ) * ( TECH.mjsw_p - 1 ) * ( 2 - TECH.mjsw_p ) );
// Cj N
if (n > 0)
{
double x = TECH.mj_n - 1;
double y = TECH.mjsw_n - 1;
C_p += TECH.C_nj * Wjn * TECH.Df *\
( VDD * VDD * x + VDD * TECH.mj_n * TECH.PB_n + TECH.PB_n * TECH.PB_n ) * \
pow( ( VDD + TECH.PB_n ) / TECH.PB_n, -TECH.mj_n ) / \
( ( 1 - x ) * x );
C_p += TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) *\
( VDD * VDD * y + VDD * TECH.mjsw_n * TECH.PB_n + TECH.PB_n * TECH.PB_n ) *\
pow( ( VDD + TECH.PB_n ) / TECH.PB_n, -TECH.mjsw_n ) / \
( ( 1 - y ) * y );
C_p += TECH.C_nj * Wjn * TECH.Df *\
( VDD * Vd_p[ i ] * (x - 1) * x - \
Vd_p[ i ] * Vd_p[i] * x * x - TECH.PB_n * (Vd_p[i] * TECH.mj_n + TECH.PB_n)) * \
pow( ( VDD + TECH.PB_n ) / TECH.PB_n, -TECH.mj_n ) / \
( ( 1 - x ) * x );
C_p += TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) *\
( VDD * Vd_p[ i ] * (y - 1) * y - \
Vd_p[ i ] * Vd_p[i] * y * y - TECH.PB_n * (Vd_p[i] * TECH.mjsw_n + TECH.PB_n) ) *\
pow( ( VDD + TECH.PB_n ) / TECH.PB_n, -TECH.mjsw_n ) / \
( ( 1 - y ) * y );
}
C_p += circuit.CapStaticGnd( node, nc ) * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
C_p += circuit.CapStaticVdd( node, nc ) * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
C_p += Wgn * TECH.Lmin * TECH.Cox_n * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
C_p += Wgp * TECH.Lmin * TECH.Cox_p * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
C_p += TECH.Cgs0_n * ( Wjn + njn * TECH.XW_n ) * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
if ( i < n )
C_p += TECH.Cgs0_p * ( ( Wjp - W_p[ i ] - W_p[ i + 1 ] ) + ( njp - 2 ) * TECH.XW_p ) * \
( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
else
C_p += TECH.Cgs0_p * ( ( Wjp - W_p[ i ] ) + ( njp - 1 ) * TECH.XW_p ) * \
( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
// Cgs
if ( (( i == 1 ) && ( i < p - 1 )) || ((i == 1) && (p == 1)) )
{
double Cov = TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p );
double Cg = Cov + 0.5 * TECH.Cox_p * W_p[ i ] * L_p[ i ];
int Op, SOp;
Op = Calct0ts1P( circuit, NP, SOp, NewWidth );
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
C_p += Cov * (Vd_p[i] - VDD) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i]*\
(ts_p[i] - 2 * tauo_p[i])) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + \
Vd_p[i] * taui_p[i]) / \
(2 * taui_p[i] * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
C_p += -Cg * ((VDD * VDD) * ((t0_p[i] * t0_p[i]) * ((ts_p[i] * ts_p[i]) - (taui_p[i] * taui_p[i])) + \
t0_p[i] * ((taui_p[i] * taui_p[i]) - (ts_p[i] * ts_p[i])) * (taui_p[i] + 2 * tauo_p[i]) + \
(ts_p[i] * ts_p[i]) * tauo_p[i] * (taui_p[i] + tauo_p[i]) - (taui_p[i] * taui_p[i]) * ((taui_p[i] * taui_p
taui_p[i] * tauo_p[i] + 2 * (tauo_p[i] * tauo_p[i]))) + \
VDD * Vd_p[i] * taui_p[i] * (t0_p[i] * ((ts_p[i] * ts_p[i]) - (taui_p[i] * taui_p[i])) - (ts_p[i] * ts_p[i]) * tauo_p[i]
taui_p[i] * (2 * (taui_p[i] * taui_p[i]) - 3 * taui_p[i] * tauo_p[i] + 2 * (tauo_p[i] * tauo
(Vd_p[i] * Vd_p[i]) * (taui_p[i] * taui_p[i]) * ((taui_p[i] * taui_p[i]) - 2 * taui_p[i] * tauo_p[i] + (tauo_p[i] * tauo_
(2 * (taui_p[i] * taui_p[i]) * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
break;
case _AA_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) / \
(2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
C_p += Cov * (Vd_p[i] - VDD) * (VDD * ((t0_p[i] * t0_p[i] * t0_p[i]) - (t0_p[i] * t0_p[i]) * (taui_p[i] + \
3 * tauo_p[i]) - t0_p[i] * ((taui_p[i] * taui_p[i]) - 4 * taui_p[i] * tauo_p[i] - 2 * (tauo_p[i]
taui_p[i] * ((ts_p[i] * ts_p[i]) - 2 * ts_p[i] * tauo_p[i] + tauo_p[i] * (taui_p[i] - 2 * tauo_p
Vd_p[i] * taui_p[i] * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[
break;
case _B_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD * ((t0_p[i] * t0_p[i]) - t0_p[i] * (taui_p[i] + 2 * tauo_p[i]) - \
(taui_p[i] * taui_p[i]) + 3 * taui_p[i] * tauo_p[i]) + Vd_p[i] * taui_p[i] * (t0_p[i] - tauo_p[i]
(2 * taui_p[i] * (tauo_p[i] - t0_p[i]));
break;
case _C_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) / (2 * (t0_p[i] - tauo_p[i]) * (t0_p[i]
C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[i]))
break;
case _D_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (taui_p[i] - tauo_p[i]) * (taui_p[i] - tauo_p[i]) / \
break;
case _F_:
C_p += Cg * (Vd_p[i] - VDD) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) +
Vd_p[i] * taui_p[i]) / (2 * taui_p[i] * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
(ts_p[i] - 2 * tauo_p[i])) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + Vd_p[i] * taui_p
break;
case _G_:
C_p += Cg * (Vd_p[i] - VDD) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + Vd_p[i] * taui_p[i]) / (2 * taui_p[i]);
break;
case _E_:
default:
C_p += 0.0;
break;
}
}
else if ( ( i < p - 1 ) && ( i > 1 ) )
{
C_p += ( TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ i ] * TECH.Lmin ) * \
( VDD + Vd_p[ i ] ) * ( Vd_p[ i ] - VDD ) * 0.5;
C_p += ( TECH.Cgs0_p * ( W_p[ i + 1 ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ i + 1 ] * njp * TECH.Lmin ) * \
( VDD + Vd_p[ i ] ) * ( Vd_p[ i ] - VDD ) * 0.5;
}
else if ( (i == 1) && (i == p - 1) )
{
double Cov = TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p );
double Cg = Cov + 0.5 * TECH.Cox_p * W_p[ i ] * L_p[ i ];
int Op, SOp;
Op = Calct0ts1P( circuit, NP, SOp, NewWidth );
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
(ts_p[i] - 2 * tauo_p[i])) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + \
Vd_p[i] * taui_p[i]) / \
C_p += -Cg * ((VDD * VDD) * ((t0_p[i] * t0_p[i]) * ((ts_p[i] * ts_p[i]) - (taui_p[i] * taui_p[i])) + \
t0_p[i] * ((taui_p[i] * taui_p[i]) - (ts_p[i] * ts_p[i])) * (taui_p[i] + 2 * tauo_p[i]) + \
(ts_p[i] * ts_p[i]) * tauo_p[i] * (taui_p[i] + tauo_p[i]) - (taui_p[i] * taui_p[i]) * ((tau
taui_p[i] * tauo_p[i] + 2 * (tauo_p[i] * tauo_p[i]))) + \
VDD * Vd_p[i] * taui_p[i] * (t0_p[i] * ((ts_p[i] * ts_p[i]) - (taui_p[i] * taui_p[i])) - (ts_p[i] * ts_p[i
taui_p[i] * (2 * (taui_p[i] * taui_p[i]) - 3 * taui_p[i] * tauo_p[i] + 2 * (t
(Vd_p[i] * Vd_p[i]) * (taui_p[i] * taui_p[i]) * ((taui_p[i] * taui_p[i]) - 2 * taui_p[i] * tauo_p[i] + (ta
(2 * (taui_p[i] * taui_p[i]) * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
break;
case _AA_:
C_p += Cov * (Vd_p[i] - VDD) * (VDD * ((t0_p[i] * t0_p[i] * t0_p[i]) - (t0_p[i] * t0_p[i]) * (taui_p[i] + \
3 * tauo_p[i]) - t0_p[i] * ((taui_p[i] * taui_p[i]) - 4 * taui_p[i] * tauo_p[i] taui_p[i] * ((ts_p[i] * ts_p[i]) - 2 * ts_p[i] * tauo_p[i] + tauo_p[i] * (taui_p[
Vd_p[i] * taui_p[i] * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i
break;
case _B_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD * ((t0_p[i] * t0_p[i]) - t0_p[i] * (taui_p[i] + 2 * tauo_p[i]) - \
(taui_p[i] * taui_p[i]) + 3 * taui_p[i] * tauo_p[i]) + Vd_p[i] * taui_p[i] * (t0_p
(2 * taui_p[i] * (tauo_p[i] - t0_p[i]));
break;
case _C_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) / (2 * (t0_p[i] - tauo_p[i]) * (t0_p[i]
C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[i]))
break;
case _D_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (taui_p[i] - tauo_p[i]) * (taui_p[i] - tauo_p[i]) / \
break;
case _F_:
C_p += Cg * (Vd_p[i] - VDD) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + \
Vd_p[i] * taui_p[i]) / (2 * taui_p[i] * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
(ts_p[i] - 2 * tauo_p[i])) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + Vd_p[i] * taui_p[i]) / \
break;
case _G_:
C_p += Cg * (Vd_p[i] - VDD) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + Vd_p[i] * taui_p[i]) / (2 * taui_p[i]);
break;
case _E_:
default:
C_p += 0.0;
break;
}
Op = Calct0tsnP( p, SOp);
if ( Op == _E_ )
{
Op = SOp;
}
Cov = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * TECH.Lmin );
Cg = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.66666 * TECH.Cox_p * W_p[ p ] * TECH.Lmin );
switch ( Op )
{
case _A_:
C_p += Cg * (VDD - Vs_p[p]) * (VDD - Vs_p[p]) * (ts_p[p] - taui_p[p]) * (ts_p[p] - taui_p[p]) / \
(2 * (t0_p[i] - taui_p[p]) * (t0_p[i] - taui_p[p]));
C_p += Cov * (VDD - Vs_p[p]) * (VDD - Vs_p[p])*\
((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * taui_p[p] - ts_p[p] * (ts_p[p] - 2 * taui_p[p])) / \
break;
case _B_:
C_p += Cov * (VDD - Vs_p[p]) * (VDD - Vs_p[p]) / 2;
break;
case _C_:
C_p += -Cg * (VDD - Vs_p[p]) * (VDD - Vs_p[p])*\
((tc * tc) - 2 * tc * taui_p[p] - ts_p[p] * (ts_p[p] - 2 * taui_p[p])) / \
break;
case _D_:
((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * taui_p[p] - tc * (tc - 2 * taui_p[p])) / \
break;
case _E_:
default:
break;
}
}
else if ((i == p - 1) && (p > 2))
{
C_p += ( TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ i ] * TECH.Lmin ) * \
( VDD + Vd_p[ i ] ) * ( Vd_p[ i ] - VDD ) * 0.5;
int Op, SOp;
Op = Calct0tsnP( p, SOp);
if ( Op == _E_ )
{
Op = SOp;
}
double Cov = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * TECH.Lmin );
double Cg = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.66666 * TECH.Cox_p * W_p[ p ] * TECH.Lmin );
switch ( Op )
{
case _A_:
C_p += Cg * (VDD - Vs_p[p]) * (VDD - Vs_p[p]) * (ts_p[p] - taui_p[p]) * (ts_p[p] - taui_p[p]) / \
break;
case _B_:
C_p += Cov * (VDD - Vs_p[p]) * (VDD - Vs_p[p]) / 2;
break;
case _C_:
C_p += -Cg * (VDD - Vs_p[p]) * (VDD - Vs_p[p])*\
((tc * tc) - 2 * tc * taui_p[p] - ts_p[p] * (ts_p[p] - 2 * taui_p[p])) / \
break;
case _D_:
((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * taui_p[p] - tc * (tc - 2 * taui_p[p])) / \
break;
case _E_:
default:
break;
}
}
else if ( i == p )
{
double Cov = TECH.Cgd0_p * ( W_p[ p ] + TECH.XW_p );
double Cg = Cov + 0.5 * TECH.Cox_p * W_p[ p ] * L_p[ p ];
int Op, SOp;
Op = Calct0tsnP( p, SOp );
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
case _B_:
C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i])*\
((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[i])) / \
break;
case _C_:
C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i]*\
(ts_p[i] - 2 * tauo_p[i])) / (2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
C_p += -Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i])*\
((tc * tc) - 2 * tc * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[i])) / \
break;
case _D_:
C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i])*\
((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - tc * (tc - 2 * tauo_p[i])) / \
break;
case _E_:
default:
break;
}
}
if ( C_p < 0.0 )
C_p *= -1;
Ecc += C_p;
}
return OK;
}
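The static-capacitance terms in the listing above have the form C * (VDD - Vd)^2 * 0.5, and each node's total is added to Ecc by magnitude. A minimal sketch of that bookkeeping follows; the function names `nodeEnergy` and `accumulateEnergy` are hypothetical and do not appear in the original sources, and the sketch assumes a linear (voltage-independent) capacitance only.

```cpp
#include <cmath>
#include <vector>

// Energy associated with a linear capacitance C swinging from Vd to Vdd:
// 0.5 * C * (Vdd - Vd)^2. Hypothetical helper, not part of the listing.
double nodeEnergy(double C, double Vdd, double Vd) {
    const double dV = Vdd - Vd;
    return 0.5 * C * dV * dV;
}

// Sum per-node contributions by magnitude, mirroring the pattern
// "if ( C_p < 0.0 ) C_p *= -1; Ecc += C_p;" used in the listing.
double accumulateEnergy(const std::vector<double>& contributions) {
    double total = 0.0;
    for (double e : contributions) {
        total += std::fabs(e);
    }
    return total;
}
```

For example, `nodeEnergy(2e-15, 3.3, 0.3)` evaluates 0.5 * 2 fF * (3 V)^2, that is, 9 fJ.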
Calcstart.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
#define SIGN(a,b) ((b) >= 0.0 ? fabs(a) : -fabs(a))
///
double Fast::CalcStartTime( const Circuit& circuit, unsigned int NP, unsigned int NC, double start, double end, TransistorType type, const double* NewW
{
int iter;
double a = start, b = end, c = end, d, e, min1, min2;
double fa, fb, fc, pp, q, r, s, tol1, xm, last;
double tol = TOL;
if ( type == NMOS )
fb = t0N( circuit, NP, NC, b, NewWidth, RetCode );
else
fb = t0P( circuit, NP, NC, b, NewWidth, RetCode );
last = fb;
if ( type == NMOS )
fa = t0N( circuit, NP, NC, a, NewWidth, RetCode );
else
fa = t0P( circuit, NP, NC, a, NewWidth, RetCode );
if ( ( fa > 0.0 && fb > 0.0 ) || ( fa < 0.0 && fb < 0.0 ) )
{
return 0.0;
}
fc = fb;
for ( iter = 1; iter <= ITERMAX; iter++ )
{
if ( ( fb > 0.0 && fc > 0.0 ) || ( fb < 0.0 && fc < 0.0 ) )
{
c = a;
fc = fa;
e = d = b - a;
}
if ( fabs ( fc ) < fabs ( fb ) )
{
a = b;
b = c;
c = a;
fa = fb;
fb = fc;
fc = fa;
}
tol1 = 2.0 * EPS * fabs ( b ) + 0.5 * tol;
xm = 0.5 * ( c - b );
if ( fabs ( xm ) <= tol1 || fb == 0.0 )
return b;
if ( fabs ( e ) >= tol1 && fabs ( fa ) > fabs ( fb ) )
{
s = fb / fa;
if ( a == c )
{
pp = 2.0 * xm * s;
q = 1.0 - s;
}
else
{
q = fa / fc;
r = fb / fc;
pp = s * ( 2.0 * xm * q * ( q - r ) - ( b - a ) * ( r - 1.0 ) );
q = ( q - 1.0 ) * ( r - 1.0 ) * ( s - 1.0 );
}
if ( pp > 0.0 )
q = -q;
pp = fabs ( pp );
min1 = 3.0 * xm * q - fabs ( tol1 * q );
min2 = fabs ( e * q );
if ( 2.0 * pp < ( min1 < min2 ? min1 : min2 ) )
{
e = d;
d = pp / q;
}
else
{
d = xm;
e = d;
}
}
else
{
d = xm;
e = d;
}
a = b;
fa = fb;
if ( fabs ( d ) > tol1 )
b += d;
else
b += SIGN ( tol1, xm );
if ( type == NMOS )
fb = t0N( circuit, NP, NC, b, NewWidth, RetCode );
else
fb = t0P( circuit, NP, NC, b, NewWidth, RetCode );
}
return 0.0;
}
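The structure of CalcStartTime (bracket check, inverse quadratic or secant step, bisection fallback) follows the classic Brent root-finding scheme. The sketch below is a self-contained version of that scheme for an arbitrary scalar function, offered as an illustration rather than as the dissertation's code. The name `brentRoot` and its parameters are assumptions, and unlike the listing, which returns 0.0 on failure, this sketch returns NaN when the root is not bracketed or the iteration limit is reached.

```cpp
#include <cmath>
#include <functional>
#include <limits>

// Brent's method on [a, b]; f(a) and f(b) must have opposite signs.
// Hypothetical helper: returns NaN on failure (the listing returns 0.0).
double brentRoot(const std::function<double(double)>& f,
                 double a, double b, double tol = 1e-12, int maxIter = 100) {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const double eps = std::numeric_limits<double>::epsilon();
    double fa = f(a), fb = f(b);
    if ((fa > 0.0 && fb > 0.0) || (fa < 0.0 && fb < 0.0)) return nan;
    double c = b, fc = fb, d = b - a, e = d;
    for (int iter = 0; iter < maxIter; ++iter) {
        if ((fb > 0.0 && fc > 0.0) || (fb < 0.0 && fc < 0.0)) {
            c = a; fc = fa; e = d = b - a;          // re-bracket the root
        }
        if (std::fabs(fc) < std::fabs(fb)) {        // keep b as best estimate
            a = b; b = c; c = a;
            fa = fb; fb = fc; fc = fa;
        }
        const double tol1 = 2.0 * eps * std::fabs(b) + 0.5 * tol;
        const double xm = 0.5 * (c - b);
        if (std::fabs(xm) <= tol1 || fb == 0.0) return b;
        if (std::fabs(e) >= tol1 && std::fabs(fa) > std::fabs(fb)) {
            double p, q;
            const double s = fb / fa;
            if (a == c) {                           // secant step
                p = 2.0 * xm * s;
                q = 1.0 - s;
            } else {                                // inverse quadratic step
                const double qq = fa / fc, r = fb / fc;
                p = s * (2.0 * xm * qq * (qq - r) - (b - a) * (r - 1.0));
                q = (qq - 1.0) * (r - 1.0) * (s - 1.0);
            }
            if (p > 0.0) q = -q;
            p = std::fabs(p);
            const double min1 = 3.0 * xm * q - std::fabs(tol1 * q);
            const double min2 = std::fabs(e * q);
            if (2.0 * p < (min1 < min2 ? min1 : min2)) {
                e = d; d = p / q;                   // accept interpolation
            } else {
                d = xm; e = d;                      // fall back to bisection
            }
        } else {
            d = xm; e = d;                          // bisection
        }
        a = b; fa = fb;
        b += (std::fabs(d) > tol1) ? d : (xm >= 0.0 ? tol1 : -tol1);
        fb = f(b);
    }
    return nan;
}
```

A call such as `brentRoot([](double x) { return x * x - 2.0; }, 0.0, 2.0)` should converge to the square root of 2 within the requested tolerance.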
Calctst0N.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
/*
 * t0: A=AA=F B=G C D
 * ts: A=F B=D=G AA=C
 */
int Fast::Calct0ts1N( const Circuit& circuit, unsigned int NP, int& SaveOpCondition, const double* NewWidth )
{
int OpCondition = _A_, LastOpCondition = 0;
double t0_bs, Vc, A_2_n, B_2_n, C_2_n, D_2_n, J_2_n, K_2_n, I_2_n, M_2_n;
double a, b, c, Cm1, Cm2, Cov, Cj, X, Y, alpha, beta, gamma, theta;
t0_bs = TECH.Vtn0 * taui_n[ 1 ] / VDD;
if ( taui_n[ 1 ] <= tauo_n[ 1 ] )
{
ts_n[ 1 ] = ( taui_n[ 1 ] - t0_n[ 1 ] ) * 0.5 + t0_n[ 1 ]; /* A */
}
else
{
ts_n[ 1 ] = ( tauo_n[ 1 ] - t0_n[ 1 ] ) * 0.5 + t0_n[ 1 ];
/* F */
}
Vc = TECH.Ec_n * L_n[ 1 ];
Cov = TECH.Cgd0_n * ( W_n[ 1 ] + TECH.XW_n );
Cm1 = Cov;
Cm2 = Cm1 + 0.5 * TECH.Cox_n * W_n[ 1 ] * L_n[ 1 ];
unsigned int pp = pathlist[ NP ].GetNumTranP();
const char* name = pathlist[ NP ].TransistorName( 0 );
int node;
int njn, ngn, njp, ngp, nc;
if ( circuit[ name ].Source() == 0 )
{
}
else if ( circuit[ name ].Drain() == 0 )
{
}
// Nmos
Cj = TECH.C_nj * Wjn * TECH.Df * pow ( 1 + Vd_n[ 1 ] / TECH.PB_n, -TECH.mj_n ) + \
TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * pow ( 1 + Vd_n[ 1 ] / TECH.PB_n, -TECH.mjsw_n );
Cj += TECH.Cgs0_n * ( Wjn + njn * TECH.XW_n );
// Pmos
if (pp > 0)
{
Cj += TECH.C_pj * Wjp * TECH.Df * pow ( 1 + ( VDD - Vd_n[ 1 ] ) / TECH.PB_p, -TECH.mj_p ) + \
TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * pow ( 1 + ( VDD - Vd_n[ 1 ] ) / TECH.PB_p, -TECH.mjsw_p );
Cj += TECH.Cgd0_p * ( Wjp + njp * TECH.XW_p );
}
// static
Cj += circuit.CapStaticGnd( node, nc );
Cj += circuit.CapStaticVdd( node, nc );
Cj += Wgn * TECH.Lmin * TECH.Cox_n;
Cj += Wgp * TECH.Lmin * TECH.Cox_p;
A_2_n = Vc * beta_n[ 1 ] * ( Vc - TECH.Vtn0 );
B_2_n = VDD * Vc * beta_n[ 1 ] / taui_n[ 1 ];
C_2_n = 2 * VDD / ( Vc * taui_n[ 1 ] );
D_2_n = ( Vc - 2 * TECH.Vtn0 ) / Vc;
J_2_n = Vc * Vd_n[ 1 ] * beta_n[ 1 ] * ( Vd_n[ 1 ] + 2 * TECH.Vtn0 ) / ( 2 * ( Vc + Vd_n[ 1 ] ) );
K_2_n = VDD * Vc * Vd_n[ 1 ] * beta_n[ 1 ] / ( taui_n[ 1 ] * ( Vc + Vd_n[ 1 ] ) );
I_2_n = ( Vc * Vc ) * beta_n[ 1 ] * ( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) * \
( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) / 2;
M_2_n = K_2_n * taui_n[ 1 ] - J_2_n;
X = Cj + Cm2;
Y = Cj + Cov;
while ( OpCondition != LastOpCondition )
{
if ( LastOpCondition != 0 )
LastOpCondition = OpCondition;
else
LastOpCondition = _E_;
if ( ( t0_n[ 1 ] <= ts_n[ 1 ] ) &&
( ts_n[ 1 ] <= taui_n[ 1 ] ) &&
( taui_n[ 1 ] <= tauo_n[ 1 ] ) )
{
OpCondition = SaveOpCondition = _A_;
if ( OpCondition != LastOpCondition )
{
#ifdef SAT
b = ( 2 * ( Vd_n[ 1 ] * taui_n[ 1 ] * \
( Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) - Vd_n[ 1 ] * tauo_n[ 1 ] ) - \
VDD * Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * ( t0_n[ 1 ] - tauo_n[ 1 ] ) ) ) / \
( Vd_n[ 1 ] * Vd_n[ 1 ] * taui_n[ 1 ] );
c = ( 2 * Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * ( Vd_n[ 1 ] * tauo_n[ 1 ] + \
TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) - Vd_n[ 1 ] * Vd_n[ 1 ] * tauo_n[ 1 ] * tauo_n[ 1 ] ) / \
( Vd_n[ 1 ] * Vd_n[ 1 ] );
if ( ( b * b + 4 * c ) >= 0 )
{
ts_n[ 1 ] = -( sqrt ( b * b + 4 * c ) + b ) * 0.5;
if ( ts_n[ 1 ] < 0 )
ts_n[ 1 ] = ( sqrt ( b * b + 4 * c ) - b ) * 0.5;
}
else
ts_n[ 1 ] = taui_n[ 1 ] * ( Vd_n[ 1 ] * tauo_n[ 1 ] + TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / \
( Vd_n[ 1 ] * taui_n[ 1 ] - VDD * ( t0_n[ 1 ] - tauo_n[ 1 ] ) );
#else
#endif
}
}
else if ( ( t0_n[ 1 ] <= taui_n[ 1 ] ) &&
( taui_n[ 1 ] <= ts_n[ 1 ] ) &&
( ts_n[ 1 ] <= tauo_n[ 1 ] ) )
{
OpCondition = SaveOpCondition = _AA_;
{
#ifdef SAT
ts_n[ 1 ] = ( sqrt ( Vc ) * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * \
sqrt ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) + Vc * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / \
Vd_n[ 1 ] + tauo_n[ 1 ];
#else
ts_n[ 1 ] = ( VDD * ( t0_n[ 1 ] - tauo_n[ 1 ] ) + Vd_n[ 1 ] * tauo_n[ 1 ] + \
TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / Vd_n[ 1 ];
#endif
}
}
else if ( ( ts_n[ 1 ] <= t0_n[ 1 ] ) &&
( t0_n[ 1 ] <= taui_n[ 1 ] ) &&
( taui_n[ 1 ] <= tauo_n[ 1 ] ) )
{
OpCondition = SaveOpCondition = _B_;
{
#ifdef SAT
ts_n[ 1 ] = taui_n[ 1 ] * \
( 2 * Vc * ( Vd_n[ 1 ] + TECH.Vtn0 ) + ( Vd_n[ 1 ] * Vd_n[ 1 ] ) ) / \
( 2 * VDD * Vc );
#else
ts_n[ 1 ] = taui_n[ 1 ] * ( Vd_n[ 1 ] + TECH.Vtn0 ) / VDD;
#endif
a = 2 * ( Cm2 * VDD + J_2_n * taui_n[ 1 ] ) / ( K_2_n * taui_n[ 1 ]);
b = ( 6 * A_2_n * C_2_n * taui_n[1] * ( t0_bs - ts_n[ 1 ] ) + \
3 * B_2_n * C_2_n * taui_n[1] * ( t0_bs * t0_bs - ts_n[ 1 ] * ts_n[ 1 ] ) - \
4 * Vc * Vc * beta_n[ 1 ] * taui_n[1] * pow( C_2_n * t0_bs + D_2_n, 1.5 ) + \
4 * Vc * Vc * beta_n[ 1 ] * pow( C_2_n * ts_n[ 1 ] + D_2_n, 1.5 ) - \
3 * C_2_n * ts_n[ 1 ] * ( 2 * Cm2 * VDD - 2 * Cov * VDD + \
taui_n[1] * (2 * J_2_n - K_2_n * ts_n[1] ) ) ) / \
( 3 * C_2_n * K_2_n * taui_n[ 1 ]);
if ( ( 4 * b + ( a * a ) ) >= 0 )
{
t0_n[ 1 ] = ( a - sqrt ( 4 * b + ( a * a ) ) ) / 2;
if ( t0_n[ 1 ] < 0 )
t0_n[ 1 ] = ( sqrt ( 4 * b + ( a * a ) ) + a ) / 2;
}
else
t0_n[ 1 ] = t0_bs;
}
}
else if ( ( taui_n[ 1 ] <= t0_n[ 1 ] ) &&
( t0_n[ 1 ] <= ts_n[ 1 ] ) &&
( ts_n[ 1 ] <= tauo_n[ 1 ] ) )
{
OpCondition = SaveOpCondition = _C_;
{
alpha = C_2_n * t0_bs + D_2_n;
beta = C_2_n * ts_n[ 1 ] + D_2_n;
theta = pow( alpha, 1.5 ) - pow( beta, 1.5 );
t0_n[ 1 ] = ( 6 * A_2_n * C_2_n * ( t0_bs - taui_n[ 1 ] ) + \
3 * B_2_n * C_2_n * ( t0_bs * t0_bs - taui_n[ 1 ] * taui_n[ 1 ] ) - \
2 * ( 2 * Vc * Vc * beta_n[ 1 ] * theta - \
3 * C_2_n * (Cov * VDD + I_2_n * taui_n[1]))) /
( 6 * C_2_n * I_2_n );
#ifdef SAT
ts_n[ 1 ] = ( sqrt ( Vc ) * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * \
sqrt ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) + Vc * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / \
Vd_n[ 1 ] + tauo_n[ 1 ];
#else
ts_n[ 1 ] = ( VDD * ( t0_n[ 1 ] - tauo_n[ 1 ] ) + Vd_n[ 1 ] * tauo_n[ 1 ] + \
TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / Vd_n[ 1 ];
#endif
}
}
else if ( ( ts_n[ 1 ] <= taui_n[ 1 ] ) &&
( taui_n[ 1 ] <= t0_n[ 1 ] ) &&
( t0_n[ 1 ] <= tauo_n[ 1 ] ) )
{
OpCondition = SaveOpCondition = _D_;
{
#ifdef SAT
ts_n[ 1 ] = taui_n[ 1 ] * \
( 2 * VDD * Vc );
#else
#endif
alpha = C_2_n * t0_bs + D_2_n;
beta = C_2_n * taui_n[ 1 ] + D_2_n;
gamma = C_2_n * ts_n[ 1 ] + D_2_n;
theta = -pow( alpha, 1.5 ) + pow( gamma, 1.5 );
t0_n[ 1 ] = ( 6 * A_2_n * C_2_n * taui_n[1] * ( t0_bs - ts_n[ 1 ] ) + \
3 * B_2_n * C_2_n * taui_n[1] * ( ( t0_bs * t0_bs ) - ( ts_n[ 1 ] * ts_n[ 1 ] ) ) + \
4 * Vc * Vc * beta_n[ 1 ] * taui_n[1] * theta - \
3 * C_2_n * ( 2 * Cm2 * VDD * (ts_n[1] - taui_n[1]) - \
2 * Cov * VDD * ts_n[1] + taui_n[1] * \
(2 * J_2_n * (ts_n[1]-taui_n[1]) + K_2_n * \
(taui_n[1] * taui_n[1] - ts_n[1] * ts_n[1]) - \
2 * M_2_n * taui_n[1]))) / \
( 6 * C_2_n * M_2_n * taui_n[ 1 ] );
}
}
else if ( ( t0_n[ 1 ] <= ts_n[ 1 ] ) &&
( ts_n[ 1 ] <= tauo_n[ 1 ] ) &&
( tauo_n[ 1 ] <= taui_n[ 1 ] ) )
{
OpCondition = SaveOpCondition = _F_;
{
#ifdef SAT
b = ( 2 * ( Vd_n[ 1 ] * taui_n[ 1 ] * \
( Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) - Vd_n[ 1 ] * tauo_n[ 1 ] ) - \

VDD * Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * ( t0_n[ 1 ] - tauo_n[ 1 ] ) ) ) / \
( Vd_n[ 1 ] * Vd_n[ 1 ] * taui_n[ 1 ] );
c = ( 2 * Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * ( Vd_n[ 1 ] * tauo_n[ 1 ] + \
TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) - Vd_n[ 1 ] * Vd_n[ 1 ] * tauo_n[ 1 ] * tauo_n[ 1 ] ) / \
( Vd_n[ 1 ] * Vd_n[ 1 ] );
if ( ( b * b + 4 * c ) >= 0 )
{
ts_n[ 1 ] = -( sqrt ( b * b + 4 * c ) + b ) * 0.5;
if ( ts_n[ 1 ] < 0 )
ts_n[ 1 ] = ( sqrt ( b * b + 4 * c ) - b ) * 0.5;
}
else
#else
#endif
}
}
else if ( ( ts_n[ 1 ] <= t0_n[ 1 ] ) &&
( t0_n[ 1 ] <= tauo_n[ 1 ] ) &&
( tauo_n[ 1 ] <= taui_n[ 1 ] ) )
{
OpCondition = SaveOpCondition = _G_;
{
#ifdef SAT
ts_n[ 1 ] = taui_n[ 1 ] * \
( 2 * VDD * Vc );
#else
#endif
a = 2 * ( Cm2 * VDD + J_2_n * taui_n[ 1 ] ) / ( K_2_n * taui_n[ 1 ]);
b = ( 6 * A_2_n * C_2_n * taui_n[1] * ( t0_bs - ts_n[ 1 ] ) + \
3 * B_2_n * C_2_n * taui_n[1] * ( t0_bs * t0_bs - ts_n[ 1 ] * ts_n[ 1 ] ) - \
4 * Vc * Vc * beta_n[ 1 ] * taui_n[1] * pow( C_2_n * t0_bs + D_2_n, 1.5 ) + \
4 * Vc * Vc * beta_n[ 1 ] * pow( C_2_n * ts_n[ 1 ] + D_2_n, 1.5 ) - \
3 * C_2_n * ts_n[ 1 ] * ( 2 * Cm2 * VDD - 2 * Cov * VDD + \
taui_n[1] * (2 * J_2_n - K_2_n * ts_n[1] ) ) ) / \
( 3 * C_2_n * K_2_n * taui_n[ 1 ]);
if ( ( 4 * b + ( a * a ) ) >= 0 )
{
t0_n[ 1 ] = ( a - sqrt ( 4 * b + ( a * a ) ) ) / 2;
if ( t0_n[ 1 ] < 0 )
t0_n[ 1 ] = ( sqrt ( 4 * b + ( a * a ) ) + a ) / 2;
}
else
t0_n[ 1 ] = t0_bs;
}
}
else
{
OpCondition = _E_;
}
}
return OpCondition;
}
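The if / else-if chain in Calct0ts1N selects an operating condition purely from the ordering of the four times t0, ts, taui and tauo. The sketch below isolates that classification as a pure function, using the same comparisons in the same order as the listing. The enum `Region` and the function `classifyRegion` are hypothetical names introduced for illustration; the original code uses the constants _A_, _AA_, _B_, _C_, _D_, _F_, _G_ and _E_ and updates ts or t0 inside each branch.

```cpp
// Operating conditions distinguished by the listing (hypothetical enum).
enum class Region { A, AA, B, C, D, F, G, E };

// Classify by the ordering of t0, ts, taui, tauo; tested in listing order.
Region classifyRegion(double t0, double ts, double taui, double tauo) {
    if (t0 <= ts && ts <= taui && taui <= tauo) return Region::A;
    if (t0 <= taui && taui <= ts && ts <= tauo) return Region::AA;
    if (ts <= t0 && t0 <= taui && taui <= tauo) return Region::B;
    if (taui <= t0 && t0 <= ts && ts <= tauo) return Region::C;
    if (ts <= taui && taui <= t0 && t0 <= tauo) return Region::D;
    if (t0 <= ts && ts <= tauo && tauo <= taui) return Region::F;
    if (ts <= t0 && t0 <= tauo && tauo <= taui) return Region::G;
    return Region::E;  // no consistent ordering found
}
```

Because each branch of the original routine recomputes ts or t0, the classification can change from one pass to the next, which is presumably why the listing repeats it in a while loop until OpCondition stops changing.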
///
int Fast::Calct0tsnN( unsigned int n, int& SaveOpCondition )
{
double Vc, tc, X, Y, b, c;
ts_n[ n ] = ( A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
( t0_n[ n ] - taui_n[ n ] ) * ( VDD * t0_n[ n ] + TECH.Vtn0 * ( tauo_n[ n ] - t0_n[ n ] ) ) ) / \
( A_1_n[ n ] * Vs_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
VDD * ( t0_n[ n ] - taui_n[ n ] ) );
if ( taui_n[ n ] < tauo_n[ n ] )
{
SaveOpCondition = _A_;
}
else
{
tc = ( VDD * tauo_n[ n ] * ( t0_n[ n ] - taui_n[ n ] ) + \
Vs_n[ n ] * taui_n[ n ] * ( tauo_n[ n ] - t0_n[ n ] ) ) / \
( VDD * ( t0_n[ n ] - taui_n[ n ] ) + Vs_n[ n ] * ( tauo_n[ n ] - t0_n[ n ] ) );
SaveOpCondition = _C_;
}
{
else
if ( ( taui_n[ n ] <= tauo_n[ n ] ) &&
( ts_n[ n ] <= taui_n[ n ] ) &&
( t0_n[ n ] <= ts_n[ n ] ) )
{
{
#ifdef SAT
X = t0_n[ n ] - taui_n[ n ];
Y = tauo_n[ n ] - t0_n[ n ];
b = -( 2 * ( VDD * X * X * ( Vc * Y + VDD * tauo_n[ n ] ) + \
Vs_n[ n ] * X * Y * ( VDD * ( taui_n[ n ] + tauo_n[ n ] ) - A_1_n[ n ] * Vc * Y ) + \
Vs_n[ n ] * Vs_n[ n ] * taui_n[ n ] * Y * Y ) ) / \
( ( VDD * X + Vs_n[ n ] * Y ) * ( VDD * X + Vs_n[ n ] * Y ) );
c = -( X * X * ( 2 * Y * Y * Vc * TECH.Vtn0 + 2 * Y * VDD * Vc * t0_n[ n ] + VDD * VDD * tauo_n[ n ] * tauo_n[ n ] ) + \
2 * Vs_n[ n ] * taui_n[ n ] * X * Y * ( VDD * tauo_n[ n ] - A_1_n[ n ] * Vc * Y ) + \
Vs_n[ n ] * Vs_n[ n ] * taui_n[ n ] * taui_n[ n ] * Y * Y ) / \
if ( ( b * b + 4 * c ) >= 0 )
{
ts_n[ n ] = -( sqrt ( b * b + 4 * c ) + b ) * 0.5;
if ( ( ts_n[ n ] < 0 ) || ( ts_n[ n ] < t0_n[ n ] ) || ( ts_n[ n ] > tauo_n[ n ] ) )
ts_n[ n ] = ( sqrt ( b * b + 4 * c ) - b ) * 0.5;
VDD * ( t0_n[ n ] - taui_n[ n ] ) );
}
else
VDD * ( t0_n[ n ] - taui_n[ n ] ) );
#else
VDD * ( t0_n[ n ] - taui_n[ n ] ) );
#endif
}
}
else if ( ( ts_n[ n ] <= tauo_n[ n ] ) &&
( taui_n[ n ] <= ts_n[ n ] ) &&
( t0_n[ n ] <= taui_n[ n ] ) )
{
{
#ifdef SAT
ts_n[ n ] = tauo_n[ n ] - ( tauo_n[ n ] - t0_n[ n ] ) * \
( sqrt ( Vc * ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) ) - Vc ) / VDD;
#else
ts_n[ n ] = TECH.Vtn0 * ( tauo_n[ n ] - t0_n[ n ] ) / VDD + t0_n[ n ];
#endif
}
}
else if ( ( tauo_n[ n ] <= taui_n[ n ] ) &&
( ts_n[ n ] < tc ) &&
( t0_n[ n ] <= ts_n[ n ] ) )
{
/* ( ts_n[n] < tc ), not <= !!! */
{
#ifdef SAT
X = t0_n[ n ] - taui_n[ n ];
Y = tauo_n[ n ] - t0_n[ n ];
b = -( 2 * ( VDD * X * X * ( Vc * Y + VDD * tauo_n[ n ] ) + \
Vs_n[ n ] * X * Y * ( VDD * ( taui_n[ n ] + tauo_n[ n ] ) - A_1_n[ n ] * Vc * Y ) + \
Vs_n[ n ] * Vs_n[ n ] * taui_n[ n ] * Y * Y ) ) / \
c = -( X * X * ( 2 * Y * Y * Vc * TECH.Vtn0 + 2 * Y * VDD * Vc * t0_n[ n ] + VDD * VDD * tauo_n[ n ] * tauo_n[ n ] ) + \
2 * Vs_n[ n ] * taui_n[ n ] * X * Y * ( VDD * tauo_n[ n ] - A_1_n[ n ] * Vc * Y ) + \
Vs_n[ n ] * Vs_n[ n ] * taui_n[ n ] * taui_n[ n ] * Y * Y ) / \
if ( ( b * b + 4 * c ) >= 0 )
{
ts_n[ n ] = -( sqrt ( b * b + 4 * c ) + b ) * 0.5;
ts_n[ n ] = ( sqrt ( b * b + 4 * c ) - b ) * 0.5;
VDD * ( t0_n[ n ] - taui_n[ n ] ) );
}
else
{
VDD * ( t0_n[ n ] - taui_n[ n ] ) );
if ( ts_n[ n ] > tauo_n[ n ] )
ts_n[ n ] = tc;
}
#else
VDD * ( t0_n[ n ] - taui_n[ n ] ) );
if ( ts_n[ n ] > tauo_n[ n ] )
ts_n[ n ] = tc;
#endif
}
}
else if ( ( tauo_n[ n ] <= taui_n[ n ] ) &&
( tc <= ts_n[ n ] ) &&
( t0_n[ n ] <= ts_n[ n ] ) )
{
{
ts_n[ n ] = tc;
}
}
else
{
OpCondition = _E_;
}
}
return OpCondition;
}
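Several SAT branches in the listing above compute ts from coefficients b and c using -( sqrt( b*b + 4*c ) + b ) * 0.5, switch to ( sqrt( b*b + 4*c ) - b ) * 0.5 when the first value is negative, and take another expression when b*b + 4*c is negative. These are the two roots of t^2 + b*t - c = 0. The sketch below shows only that root selection; the name `pickRoot` and its bool/out-parameter interface are assumptions, and the caller is expected to supply its own fallback when no real root exists.

```cpp
#include <cmath>

// Solve t^2 + b*t - c = 0. Prefer the root (-b - sqrt(D)) / 2 and fall
// back to (-b + sqrt(D)) / 2 if the first is negative, as in the listing.
// Returns false (t untouched) when the discriminant D = b^2 + 4c is negative.
bool pickRoot(double b, double c, double& t) {
    const double disc = b * b + 4.0 * c;
    if (disc < 0.0) {
        return false;
    }
    const double s = std::sqrt(disc);
    double root = -(s + b) * 0.5;
    if (root < 0.0) {
        root = (s - b) * 0.5;
    }
    t = root;
    return true;
}
```

With b = 1 and c = 2 the equation is t^2 + t - 2 = 0; the first candidate is negative, so the routine settles on the second root.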
Calctst0P.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
* t0: A=AA=F B=G C D * ts: A=F B=D=G AA=C *
Calctst0P.cc
int Fast::Calct0ts1P( const Circuit& circuit, unsigned int NP, int& SaveOpCondition, const double* NewWidth )
{
double t0_bs, Vc, A_2_p, B_2_p, C_2_p, D_2_p, G_2_p, J_2_p, K_2_p, M_2_p;
double b, c, Cm1, Cm2, Cov, Cj, X, XX, Y, alpha, gamma, theta;
t0_bs = -TECH.Vtp0 * taui_p[ 1 ] / VDD;
if ( taui_p[ 1 ] <= tauo_p[ 1 ] )
{
ts_p[ 1 ] = ( taui_p[ 1 ] - t0_p[ 1 ] ) * 0.5 + t0_p[ 1 ]; /* A */
}
else
{
ts_p[ 1 ] = ( tauo_p[ 1 ] - t0_p[ 1 ] ) * 0.5 + t0_p[ 1 ]; /* F */
}
Vc = TECH.Ec_p * L_p[ 1 ];
Cov = TECH.Cgd0_p * ( W_p[ 1 ] + TECH.XW_p );
Cm1 = Cov;
Cm2 = Cm1 + 0.5 * TECH.Cox_p * W_p[ 1 ] * L_p[ 1 ];
unsigned int nn = pathlist[ NP ].GetNumTranN();
int node;
int njn, ngn, njp, ngp, nc;
if (nn > 0)
{
const char* name = pathlist[ NP ].TransistorName( nn - 1 );
{
}
{
}
// Nmos
Cj = TECH.C_nj * Wjn * TECH.Df * pow ( 1 + Vd_p[ 1 ] / TECH.PB_n, -TECH.mj_n ) + \
TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * pow ( 1 + Vd_p[ 1 ] / TECH.PB_n, -TECH.mjsw_n );
}
// Pmos
Cj += TECH.C_pj * Wjp * TECH.Df * pow ( 1 + ( VDD - Vd_p[ 1 ] ) / TECH.PB_p, -TECH.mj_p ) + \
TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * pow ( 1 + ( VDD - Vd_p[ 1 ] ) / TECH.PB_p, -TECH.mjsw_p );
// static
Cj += Wgn * TECH.Lmin * TECH.Cox_p;
A_2_p = Vc * beta_p[ 1 ] * ( Vc + TECH.Vtp0 );
B_2_p = VDD * Vc * beta_p[ 1 ] / taui_p[ 1 ];
C_2_p = ( Vc + 2 * TECH.Vtp0 ) / Vc;
D_2_p = 2 * VDD / ( Vc * taui_p[ 1 ] );
G_2_p = Vc * Vc * beta_p[ 1 ] * SQRT ( ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) / Vc ) - \
VDD * Vc * beta_p[ 1 ] - Vc * Vc * beta_p[ 1 ] - Vc * TECH.Vtp0 * beta_p[ 1 ];
K_2_p = VDD * Vc * beta_p[ 1 ] * ( Vd_p[ 1 ] - VDD ) / ( taui_p[ 1 ] * ( VDD + Vc - Vd_p[ 1 ] ) );
J_2_p = Vc * beta_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] - 2 * TECH.Vtp0 ) / ( 2 * ( VDD + Vc - Vd_p[ 1 ] ) );
M_2_p = Vc * beta_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * ( VDD + Vd_p[ 1 ] + 2 * TECH.Vtp0 ) / ( 2 * ( Vd_p[ 1 ] - Vc ) );
X = Cj + Cm2;
Y = Cj + Cov;
alpha = pow( ( D_2_p * t0_bs + C_2_p ), 1.5 );
gamma = pow( ( D_2_p * taui_p[ 1 ] + C_2_p ), 1.5 );
{
else
if ( ( t0_p[ 1 ] <= ts_p[ 1 ] ) &&
( ts_p[ 1 ] <= taui_p[ 1 ] ) &&
( taui_p[ 1 ] <= tauo_p[ 1 ] ) )
{
{
XX = t0_p[ 1 ] - tauo_p[ 1 ];
#ifdef SAT
b = ( 2 * ( VDD * VDD * taui_p[ 1 ] * tauo_p[ 1 ] + VDD * ( Vc * ( XX - taui_p[ 1 ] ) * XX - \
2 * Vd_p[ 1 ] * taui_p[ 1 ] * tauo_p[ 1 ] ) + \
Vd_p[ 1 ] * taui_p[ 1 ] * ( Vc * XX + Vd_p[ 1 ] * tauo_p[ 1 ] ) ) ) / \
( taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
c = ( VDD * VDD * tauo_p[ 1 ] * tauo_p[ 1 ] - 2 * VDD * tauo_p[ 1 ] * ( Vc * XX + Vd_p[ 1 ] * tauo_p[ 1 ] ) + \
2 * Vc * XX * ( Vd_p[ 1 ] * tauo_p[ 1 ] - TECH.Vtp0 * XX ) + Vd_p[ 1 ] * tauo_p[ 1 ] * Vd_p[ 1 ] * tauo_p[ 1 ] ) /
( ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
if ( ( b * b - 4 * c ) >= 0 )
{
ts_p[ 1 ] = ( b - sqrt ( b * b - 4 * c ) ) * 0.5;
if ( ts_p[ 1 ] < 0 )
ts_p[ 1 ] = ( b + sqrt ( b * b - 4 * c ) ) * 0.5;
}
else
ts_p[ 1 ] = -taui_p[ 1 ] * ( ( VDD - Vd_p[ 1 ] ) * tauo_p[ 1 ] + \
TECH.Vtp0 * XX ) / \
( VDD * ( XX - taui_p[ 1 ] ) + \
Vd_p[ 1 ] * taui_p[ 1 ] );
#else
ts_p[ 1 ] = -taui_p[ 1 ] * ( ( VDD - Vd_p[ 1 ] ) * tauo_p[ 1 ] + \
TECH.Vtp0 * XX ) / \
( VDD * ( XX - taui_p[ 1 ] ) + \
Vd_p[ 1 ] * taui_p[ 1 ] );
#endif
}
}
else if ( ( t0_p[ 1 ] <= taui_p[ 1 ] ) &&
( taui_p[ 1 ] <= ts_p[ 1 ] ) &&

( ts_p[ 1 ] <= tauo_p[ 1 ] ) )
{
OpCondition = SaveOpCondition = _AA_;
{
#ifdef SAT
ts_p[ 1 ] = tauo_p[ 1 ] - ( ( tauo_p[ 1 ] - t0_p[ 1 ] ) * \
( sqrt ( Vc * ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) ) - Vc ) ) / \
( VDD - Vd_p[ 1 ] );
#else
ts_p[ 1 ] = ( VDD * t0_p[ 1 ] - Vd_p[ 1 ] * tauo_p[ 1 ] + TECH.Vtp0 * ( t0_p[ 1 ] - tauo_p[ 1 ] ) ) / \
( VDD - Vd_p[ 1 ] );
#endif
}
}
else if ( ( ts_p[ 1 ] <= t0_p[ 1 ] ) &&
( t0_p[ 1 ] <= taui_p[ 1 ] ) &&
( taui_p[ 1 ] <= tauo_p[ 1 ] ) )
{
{
#ifdef SAT
ts_p[ 1 ] = taui_p[ 1 ] * ( VDD * VDD + 2 * VDD * ( Vc - Vd_p[ 1 ] ) - 2 * Vc * ( Vd_p[ 1 ] + TECH.Vtp0 ) + \
Vd_p[ 1 ] * Vd_p[ 1 ] ) / ( 2 * VDD * Vc );
#else
ts_p[ 1 ] = taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] - TECH.Vtp0 ) / VDD;
#endif
theta = pow( ( D_2_p * ts_p[ 1 ] + C_2_p ), 1.5 );
b = 2 * ( Cm2 * VDD + J_2_p * taui_p[ 1 ] ) / \
( K_2_p * taui_p[ 1 ] );
c = ( 6 * A_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs - ts_p[ 1 ] ) + \
3 * B_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs * t0_bs - ts_p[ 1 ] * ts_p[ 1 ] ) - \
4 * Vc * Vc * beta_p[ 1 ] * taui_p[ 1 ] * alpha + \
4 * Vc * Vc * beta_p[ 1 ] * taui_p[ 1 ] * theta - \
3 * D_2_p * ts_p[ 1 ] * \
( 2 * Cm2 * VDD - 2 * Cov * VDD + taui_p[ 1 ] * \
( 2 * J_2_p + K_2_p * ts_p[ 1 ] ) ) ) / \
( 3 * D_2_p * K_2_p * taui_p[ 1 ] );
if ( ( b * b - 4 * c ) >= 0 )
{
t0_p[ 1 ] = ( sqrt ( ( b * b ) - 4 * c ) - b ) * 0.5;
if ( t0_p[ 1 ] < 0 )
t0_p[ 1 ] = -( sqrt ( ( b * b ) - 4 * c ) + b ) * 0.5;
}
else
t0_p[ 1 ] = t0_bs;
}
}
else if ( ( taui_p[ 1 ] <= t0_p[ 1 ] ) &&
( t0_p[ 1 ] <= ts_p[ 1 ] ) &&
( ts_p[ 1 ] <= tauo_p[ 1 ] ) )
{
{
t0_p[ 1 ] = -( 6 * A_2_p * D_2_p * ( t0_bs - taui_p[ 1 ] ) + \
3 * B_2_p * D_2_p * ( t0_bs * t0_bs - taui_p[ 1 ] * taui_p[ 1 ] ) - \
2 * ( 2 * Vc * Vc * beta_p[ 1 ] * ( alpha - gamma ) - \
3 * D_2_p * (Cov * VDD - G_2_p * taui_p[ 1 ]))) / \
( 6 * D_2_p * G_2_p );
#ifdef SAT
ts_p[ 1 ] = tauo_p[ 1 ] - ( ( tauo_p[ 1 ] - t0_p[ 1 ] ) * \
( sqrt ( Vc * ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) ) - Vc ) ) / \
( VDD - Vd_p[ 1 ] );
#else
ts_p[ 1 ] = ( VDD * t0_p[ 1 ] - Vd_p[ 1 ] * tauo_p[ 1 ] + TECH.Vtp0 * ( t0_p[ 1 ] - tauo_p[ 1 ] ) ) / \
( VDD - Vd_p[ 1 ] );
#endif
}
}
else if ( ( ts_p[ 1 ] <= taui_p[ 1 ] ) &&
( taui_p[ 1 ] <= t0_p[ 1 ] ) &&
( t0_p[ 1 ] <= tauo_p[ 1 ] ) )
{
{
#ifdef SAT
( Vd_p[ 1 ] * Vd_p[ 1 ] ) ) / ( 2 * VDD * Vc );
#else
ts_p[ 1 ] = taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] + TECH.Vtp0 ) / VDD;
#endif
theta = pow( ( D_2_p * ts_p[ 1 ] + C_2_p ), 1.5 );
t0_p[ 1 ] = -( 6 * A_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs - ts_p[ 1 ] ) + \
3 * B_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs * t0_bs - ts_p[ 1 ] * ts_p[1]) - \
4 * Vc * Vc * beta_p[ 1 ] * ( alpha - theta ) - \
3 * D_2_p * (2 * Cm2 * VDD * (ts_p[1] - taui_p[1]) - \
2 * Cov * VDD * ts_p[1] + \
taui_p[1] * (2 * J_2_p * (ts_p[1] - taui_p[1]) + \
K_2_p * (ts_p[1] * ts_p[1] - taui_p[1] * taui_p[1]) + \
2 * M_2_p * taui_p[1] ))) / \
( 6 * D_2_p * M_2_p * taui_p[ 1 ] );
}
}
else if ( ( t0_p[ 1 ] <= ts_p[ 1 ] ) &&
( ts_p[ 1 ] <= tauo_p[ 1 ] ) &&
( tauo_p[ 1 ] <= taui_p[ 1 ] ) )
{
OpCondition = SaveOpCondition = _F_;
{
X = t0_p[ 1 ] - tauo_p[ 1 ];
#ifdef SAT
b = ( 2 * ( VDD * VDD * taui_p[ 1 ] * tauo_p[ 1 ] + VDD * ( Vc * ( X - taui_p[ 1 ] ) * X - \
2 * Vd_p[ 1 ] * taui_p[ 1 ] * tauo_p[ 1 ] ) + \
Vd_p[ 1 ] * taui_p[ 1 ] * ( Vc * X + Vd_p[ 1 ] * tauo_p[ 1 ] ) ) ) / \
( taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
c = ( VDD * VDD * tauo_p[ 1 ] * tauo_p[ 1 ] - 2 * VDD * tauo_p[ 1 ] * ( Vc * X + Vd_p[ 1 ] * tauo_p[ 1 ] ) + \
2 * Vc * X * ( Vd_p[ 1 ] * tauo_p[ 1 ] - TECH.Vtp0 * X ) + Vd_p[ 1 ] * tauo_p[ 1 ] * Vd_p[ 1 ] * tauo_p[ 1 ] ) / \
( ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
if ( ( b * b - 4 * c ) >= 0 )
{
ts_p[ 1 ] = ( b - sqrt ( b * b - 4 * c ) ) * 0.5;
if ( ts_p[ 1 ] < 0 )
ts_p[ 1 ] = ( b + sqrt ( b * b - 4 * c ) ) * 0.5;
}
else
TECH.Vtp0 * X ) / \
( VDD * ( X - taui_p[ 1 ] ) + \
Vd_p[ 1 ] * taui_p[ 1 ] );
#else
TECH.Vtp0 * X ) / \
( VDD * ( X - taui_p[ 1 ] ) + \
Vd_p[ 1 ] * taui_p[ 1 ] );
#endif
}
}
else if ( ( ts_p[ 1 ] <= t0_p[ 1 ] ) &&
( t0_p[ 1 ] <= tauo_p[ 1 ] ) &&
( tauo_p[ 1 ] <= taui_p[ 1 ] ) )
{
OpCondition = SaveOpCondition = _G_;
{
#ifdef SAT
Vd_p[ 1 ] * Vd_p[ 1 ] ) / ( 2 * VDD * Vc );
#else
ts_p[ 1 ] = taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] - TECH.Vtp0 ) / VDD;
#endif
theta = pow( ( D_2_p * ts_p[ 1 ] + C_2_p ), 1.5 );
b = 2 * ( Cm2 * VDD + J_2_p * taui_p[ 1 ] ) / \
( K_2_p * taui_p[ 1 ] );
c = ( 6 * A_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs - ts_p[ 1 ] ) + \
3 * B_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs * t0_bs - ts_p[ 1 ] * ts_p[ 1 ] ) - \
4 * Vc * Vc * beta_p[ 1 ] * taui_p[ 1 ] * alpha + \
4 * Vc * Vc * beta_p[ 1 ] * taui_p[ 1 ] * theta - \
3 * D_2_p * ts_p[ 1 ] * \
( 2 * Cm2 * VDD - 2 * Cov * VDD + taui_p[ 1 ] * \
( 2 * J_2_p + K_2_p * ts_p[ 1 ] ) ) ) / \
( 3 * D_2_p * K_2_p * taui_p[ 1 ] );
if ( ( b * b - 4 * c ) >= 0 )
{
t0_p[ 1 ] = ( sqrt ( ( b * b ) - 4 * c ) - b ) * 0.5;
if ( t0_p[ 1 ] < 0 )
t0_p[ 1 ] = -( sqrt ( ( b * b ) - 4 * c ) + b ) * 0.5;
}
else
t0_p[ 1 ] = t0_bs;
}
else
{
OpCondition = _E_;
}
}
}
return OpCondition;
}
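Several branches of Calct0ts1P above obtain ts from two coefficients b and c: the root ( b - sqrt( b*b - 4*c ) ) / 2 is tried first, the other root is used when the first is negative, and a closed-form estimate replaces both when the discriminant is negative. The sketch below isolates that selection pattern. The helper name `pickRoot` and its `fallback` argument are hypothetical and are not part of the original sources.

```cpp
#include <cmath>

// Hypothetical helper, not part of the original listing.
// Solves t*t - b*t + c = 0. Returns the smaller root unless it is
// negative, in which case the larger root is returned. When the
// discriminant is negative the caller-supplied fallback is returned.
double pickRoot( double b, double c, double fallback )
{
    const double disc = b * b - 4.0 * c;
    if ( disc < 0.0 )
        return fallback;              // no real root: use the linear estimate
    const double r = std::sqrt( disc );
    double t = ( b - r ) * 0.5;       // smaller root first
    if ( t < 0.0 )
        t = ( b + r ) * 0.5;          // negative time is rejected
    return t;
}
```

In the listing the fallback corresponds to the non-SAT expression for ts, so a call such as `pickRoot( b, c, linearEstimate )` would reproduce the three-way choice in one place.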
///
int Fast::Calct0tsnP( unsigned int p, int& SaveOpCondition )
{
double Vc, tc, X, Y, H, K, det, alpha, beta;
ts_p[ p ] = ( A_1_p[ p ] * ( t0_p[ p ] - tauo_p[ p ] ) * ( VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ] ) + \
( t0_p[ p ] - taui_p[ p ] ) * ( B_1_p[ p ] * ( t0_p[ p ] - tauo_p[ p ] ) + \
VDD * t0_p[ p ] - Vd_p[ p ] * tauo_p[ p ] ) ) / \
( A_1_p[ p ] * ( VDD - Vs_p[ p ] ) * ( t0_p[ p ] - tauo_p[ p ] ) + \
( VDD - Vd_p[ p ] ) * ( t0_p[ p ] - taui_p[ p ] ) );
if ( taui_p[ p ] < tauo_p[ p ] )
{
SaveOpCondition = _A_;
}
else
{
tc = ( VDD * tauo_p[ p ] * ( t0_p[ p ] - taui_p[ p ] ) + \
Vs_p[ p ] * taui_p[ p ] * ( tauo_p[ p ] - t0_p[ p ] ) ) / \
( VDD * ( t0_p[ p ] - taui_p[ p ] ) + Vs_p[ p ] * ( tauo_p[ p ] - t0_p[ p ] ) );
SaveOpCondition = _C_;
}
X = t0_p[ p ] - taui_p[ p ];
Y = tauo_p[ p ] - t0_p[ p ];
alpha = VDD - Vs_p[ p ];
beta = VDD - Vd_p[ p ];
{
else
if ( ( taui_p[ p ] <= tauo_p[ p ] ) &&
( ts_p[ p ] <= taui_p[ p ] ) &&
( t0_p[ p ] <= ts_p[ p ] ) )
{
{
#ifdef SAT
double AAA, BBB;
AAA = ( X * beta + Y * alpha ) / ( X * Y );
BBB = -( VDD * t0_p[ p ] * ( X + Y ) - Vd_p[ p ] * X * tauo_p[ p ] - Vs_p[ p ] * Y * taui_p[ p ] ) / ( X * Y );
AAA /= Vc;
BBB /= Vc;
BBB -= 1.0;
H = 2 * ( A_1_p[ p ] + 1 ) * ( Vs_p[ p ] - VDD ) / ( Vc * X );
K = ( 2 * A_1_p[ p ] * ( VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ] ) + \
2 * B_1_p[ p ] * X + 2 * VDD * t0_p[ p ] + Vc * X - 2 * Vs_p[ p ] * taui_p[ p ] ) / \
( Vc * X );
det = 4 * K * AAA * AAA - 4 * H * AAA * BBB + H * H;
if ( det >= 0 )
{
ts_p[ p ] = ( SQRT ( det ) - 2 * AAA * BBB + H ) / ( 2 * AAA * AAA );
if ( ( ts_p[ p ] < 0 ) || ( ts_p[ p ] < t0_p[ p ] ) || ( ts_p[ p ] > tauo_p[ p ] ) )
ts_p[ p ] = -( SQRT ( det ) + 2 * AAA * BBB - H ) / ( 2 * AAA * AAA );
}
else
#else
#endif
}
}
else if ( ( ts_p[ p ] <= tauo_p[ p ] ) &&
( taui_p[ p ] <= ts_p[ p ] ) &&
( t0_p[ p ] <= taui_p[ p ] ) )
{
{
#ifdef SAT
ts_p[ p ] = tauo_p[ p ] + ( ( t0_p[ p ] - tauo_p[ p ] ) * \
( sqrt ( Vc * ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) ) - Vc ) ) / ( VDD - Vd_p[ p ] );
#else
ts_p[ p ] = ( VDD * t0_p[ p ] - Vd_p[ p ] * tauo_p[ p ] + \
TECH.Vtp0 * ( t0_p[ p ] - tauo_p[ p ] ) ) / \
( VDD - Vd_p[ p ] );
#endif
}
}
else if ( ( tauo_p[ p ] <= taui_p[ p ] ) &&
( ts_p[ p ] < tc ) &&
( t0_p[ p ] <= ts_p[ p ] ) )
{
// (ts_p[ p ] < tc), not <= !!!
{
#ifdef SAT
double AAA, BBB;
AAA = ( X * beta + Y * alpha ) / ( X * Y );
BBB = -( VDD * t0_p[ p ] * ( X + Y ) - Vd_p[ p ] * X * tauo_p[ p ] - Vs_p[ p ] * Y * taui_p[ p ] ) / ( X * Y );
AAA /= Vc;
BBB /= Vc;
BBB -= 1.0;
H = 2 * ( A_1_p[ p ] + 1 ) * ( Vs_p[ p ] - VDD ) / ( Vc * X );
K = ( 2 * A_1_p[ p ] * ( VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ] ) + \
2 * B_1_p[ p ] * X + 2 * VDD * t0_p[ p ] + Vc * X - 2 * Vs_p[ p ] * taui_p[ p ] ) / \
( Vc * X );
det = 4 * K * AAA * AAA - 4 * H * AAA * BBB + H * H;
if ( det >= 0 )
{
ts_p[ p ] = ( SQRT ( det ) - 2 * AAA * BBB + H ) / ( 2 * AAA * AAA );
if ( ( ts_p[ p ] < 0 ) || ( ts_p[ p ] < t0_p[ p ] ) || ( ts_p[ p ] > tauo_p[ p ] ) )
ts_p[ p ] = -( SQRT ( det ) + 2 * AAA * BBB - H ) / ( 2 * AAA * AAA );
}
else
#else
#endif
}
}
else if ( ( tauo_p[ p ] <= taui_p[ p ] ) &&
( tc <= ts_p[ p ] ) &&
( t0_p[ p ] <= ts_p[ p ] ) )
{
{
ts_p[ p ] = tc;
}
}
else
{
OpCondition = _E_;
}
}
return OpCondition;
}
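The quantity tc computed in Calct0tsnP above is, algebraically, the time at which two straight lines meet: one through (t0, VDD) and (tauo, 0), the other through (t0, Vs) and (taui, 0). Reading these lines as piecewise-linear drain and source voltages is an assumption suggested by the comment "tc cross-time : Vd = Vs" found elsewhere in these listings. The sketch below checks the identity numerically; the function names `ramp` and `crossTime` are hypothetical.

```cpp
#include <cmath>

// Value at time t of the straight line through (t0, v0) and (tz, 0).
// Hypothetical helper used only to verify the crossing time.
double ramp( double t, double t0, double v0, double tz )
{
    return v0 * ( t - tz ) / ( t0 - tz );
}

// Same expression as tc in the listing, with the array elements
// replaced by plain arguments.
double crossTime( double VDD, double Vs, double t0, double taui, double tauo )
{
    return ( VDD * tauo * ( t0 - taui ) + Vs * taui * ( tauo - t0 ) ) /
           ( VDD * ( t0 - taui ) + Vs * ( tauo - t0 ) );
}
```

Evaluating both ramps at `crossTime( ... )` gives the same value, which is the property the branch conditions on tc rely on. The denominator vanishes when the two lines are parallel, a case this sketch does not guard against.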
Delay.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::CalcDelay( const Circuit& circuit,
unsigned int NP,
unsigned int NC,
unsigned int n,
unsigned int p,
double tin,
TransitionType TOut,
int& RetCode )
{
double t0_bs;
{
tauo_n[ i ] = tin;
}
{
tauo_p[ i ] = tin;
}
taui_n[ 1 ] = tin;
taui_p[ 1 ] = tin;
switch ( TOut )
{
case FALL:
// n chain
t0_bs = TECH.Vtn0 * tin / VDD;
t0_n[ 1 ] = CalcStartTime( circuit, NP, NC, t0_bs, tauo_n[ 1 ], NMOS, NewWidth, RetCode );
t0_n[ 1 ] = t0_bs;
//return 0.0;
RetCode = IterSol( circuit, NMOS, NP, NC, n, p, NewWidth );
return 0.0;
else
return ( tauo_n[ n ] + t0_n[ n ] - tin );
break;
case RISE:
// p chain
t0_bs = -TECH.Vtp0 * tin / VDD;
t0_p[ 1 ] = CalcStartTime( circuit, NP, NC, t0_bs, tauo_p[ 1 ], PMOS, NewWidth, RetCode );
t0_p[ 1 ] = t0_bs;
//return 0.0;
RetCode = IterSol( circuit, PMOS, NP, NC, n, p, NewWidth );
return 0.0;
else
return ( tauo_p[ p ] + t0_p[ p ] - tin );
break;
case NOTRANSITION:
default:
break;
}
return 0.0; // never get here
}
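The starting value t0_bs used in CalcDelay is the threshold voltage scaled by tin / VDD. One reading, stated here as an assumption, is that the input is modelled as a linear ramp from 0 to VDD lasting tin, so that t0_bs is the instant at which the ramp reaches the threshold. The sketch below expresses that reading; `rampThresholdTime` is a hypothetical name, and for the p chain the listing passes -TECH.Vtp0 so that the argument is positive.

```cpp
// Hypothetical helper: time at which a linear ramp that rises from 0 to
// VDD in tin reaches the voltage Vt, i.e. the solution of
// VDD * t / tin == Vt.
double rampThresholdTime( double Vt, double tin, double VDD )
{
    return Vt * tin / VDD;
}
```

With Vt = 1, tin = 10 and VDD = 5 the ramp VDD * t / tin equals Vt at t = 2, which is the value the function returns.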
EqN.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::EqN( const Circuit& circuit, unsigned int NP, unsigned int NC, double x, int& RetCode, unsigned int j, unsigned int n, unsi
{
int SaveOpCondition = _E_, OpCondition_1, OpCondition_i, OpCondition_n;
double y, C_n, Cov, Cgd1;
RetCode = OK;
double* FY = new double[n + 1];

tauo_n[ j ] = x;
double t0_bs = TECH.Vtn0 / VDD * taui_n[ 1 ];
double tc = (VDD * tauo_n[n] * (t0_n[n] - taui_n[n]) + \
Vs_n[n] * taui_n[n] * (tauo_n[n] - t0_n[n])) / \
(VDD * (t0_n[n] - taui_n[n]) + Vs_n[n] * (tauo_n[n] - t0_n[n]));
// tc cross-time : Vd = Vs
if ( j == 1 )
{
OpCondition_1 = Calct0ts1N( circuit, NP, SaveOpCondition, NewWidth );
if ( ( OpCondition_1 == _E_ ) || ( t0_n[ 1 ] < t0_bs ) )
{
RetCode = PARSE_ERROR;
if ( SaveOpCondition == _E_ )
OpCondition_1 = _A_;
else
OpCondition_1 = SaveOpCondition;
}
FY[ 1 ] = FirstEqN( OpCondition_1, tauo_n[ j ] );
}
{
t0_n[ i ] = t0_n[ i - 1 ];
taui_n[ i ] = tauo_n[ i - 1 ];
}
// middle equations
if ( ( j > 1 ) && ( j < n ) )
{
if ( taui_n[ j ] <= tauo_n[ j ] )
OpCondition_i = _A_;
else
OpCondition_i = _E_;
if ( OpCondition_i == _E_ )
{
}
FY[ j ] = MiddleEqN( OpCondition_i, j, tauo_n[ j ] );
}
// last equation
if ( n > 1 )
{
OpCondition_n = Calct0tsnN( n, SaveOpCondition );
if ( ( OpCondition_n == _E_ ) || ( OpCondition_n == _C_ ) || ( OpCondition_n == _D_ ) )
{
OpCondition_n = _A_;
else
OpCondition_n = SaveOpCondition;
}
if ( j == n )
FY[ n ] = LastEqN( OpCondition_n, n, tauo_n[ n ] );
}
y = FY[ j ];
// evaluate capacitance at each node
unsigned int node;
// it's the common node used to traverse the path
unsigned int LastNode;
const char* name = pathlist[ NP ].TransistorName( 0, NC );
// first mos in path
node = 0;
for ( unsigned int i = j, k = 1; i > 1; i--, k++ )
{
name = pathlist[ NP ].TransistorName( k , NC);
}
for ( unsigned int i = j; i <= n; i++ )
{
C_n = 0.0;
name = pathlist[ NP ].TransistorName( i - 1 , NC);
{
}
{
}
if (i == n)
LastNode = node;
// Common capacitance
int nc;
// dummy
// Cj
C_n += -TECH.C_nj * Wjn * TECH.Df * Vd_n[ i ] * pow ( ( 1 + Vd_n[ i ] / TECH.PB_n ), -TECH.mj_n ) - \
TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * Vd_n[ i ] * pow ( 1 + Vd_n[ i ] / TECH.PB_n, -TECH.mjsw_n );
// static capacitances
C_n += -circuit.CapStaticGnd( node, nc ) * Vd_n[i];
C_n += -circuit.CapStaticVdd( node, nc ) * Vd_n[i];
// gate capacitances
C_n += -Wgn * TECH.Lmin * TECH.Cox_n * Vd_n[i];
C_n += -Wgp * TECH.Lmin * TECH.Cox_p * Vd_n[i];
if ( ( i == 1 ) && ( i < n - 1 ) )
{
// Cgd & Cgs minus first mos
C_n += -( TECH.Cgs0_n * ( ( Wjn - W_n[ 1 ] ) + ( njn - 1 ) * TECH.XW_n ) + \
0.5 * TECH.Cox_n * ( Wjn - W_n[ 1 ] ) * ( njn - 1 ) * TECH.Lmin ) * Vd_n[ i ];
}
else if ( ( i < n - 1 ) && ( i > 1 ) )
{
// all Cgd & Cgs
C_n += -( TECH.Cgs0_n * ( Wjn + njn * TECH.XW_n ) + \
0.5 * TECH.Cox_n * Wjn * njn * TECH.Lmin ) * Vd_n[ i ];
}
else if ( (( i == n ) || ( i == n - 1 )) && (n > 1) )
{
// Cgd & Cgs minus last mos
C_n += -( TECH.Cgs0_n * ( ( Wjn - W_n[ n ] ) + ( njn - 1 ) * TECH.XW_n ) + \
0.5 * TECH.Cox_n * ( Wjn - W_n[ n ] ) * ( njn - 1 ) * TECH.Lmin ) * Vd_n[ i ];
}
// PMOS
C_n += ( TECH.C_pj * Wjp * TECH.Df * ( VDD - Vd_n[ i ] ) * \
pow ( ( 1 + ( VDD - Vd_n[ i ] ) / TECH.PB_p ), -TECH.mj_p ) );
C_n += ( TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * ( VDD - Vd_n[ i ] ) * \
pow ( ( 1 + ( VDD - Vd_n[ i ] ) / TECH.PB_p ), TECH.mjsw_p ) );
C_n += -VDD * njp * TECH.C_pj * Wjp * TECH.Df * \
pow ( ( 1 + VDD / TECH.PB_p ), -TECH.mj_p );
C_n += -VDD * TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
pow ( ( 1 + VDD / TECH.PB_p ), -TECH.mjsw_p );
// Cgs & Cgd PMOS
C_n += -( TECH.Cgd0_p * ( Wjp + njp * TECH.XW_p ) * Vd_n[ i ] );
// capacitance with voltages
if ( (( i == 1 ) && ( i < n - 1 )) || ((i == 1) && (n == 1)) )
{
// Cgd
Cov = TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n );
Cgd1 = Cov + 0.5 * TECH.Cox_n * W_n[ i ] * L_n[ i ];
switch ( OpCondition_1 )
{
case _A_:
C_n += Cov * ( t0_n[ i ] - ts_n[ i ] ) * \
( VDD * ( t0_n[ i ] - tauo_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
( taui_n[ i ] * ( t0_n[i] - tauo_n[ i ]) );
C_n += Cgd1 * ( VDD * ( t0_n[ i ] - tauo_n[ i ] ) * ( ts_n[ i ] - taui_n[ i ] ) + \
Vd_n[ i ] * taui_n[ i ] * ( tauo_n[ i ] - ts_n[ i ] ) ) / \
break;
case _AA_:
C_n += Cov * ( VDD * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauo_n[ i ] ) + \
Vd_n[ i ] * taui_n[ i ] * ( ts_n[ i ] - t0_n[ i ] ) ) / \
C_n += Cgd1 * Vd_n[i] * ( tauo_n[ i ] - ts_n[i] ) / \
( t0_n[i] - tauo_n[ i ]);
break;
case _B_:
C_n += Cgd1 * ( VDD * ( t0_n[ i ] - taui_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
taui_n[ i ];
break;
case _C_:
C_n += Cov * Vd_n[ i ] * ( t0_n[ i ] - ts_n[ i ] ) / \
( tauo_n[ i ] - t0_n[i]);
C_n += Cgd1 * Vd_n[ i ] * ( tauo_n[ i ] - ts_n[i]) / \
( t0_n[ i ] - tauo_n[ i ] );
break;
case _D_:
C_n += -Cgd1 * Vd_n[ i ];
break;
case _F_:
C_n += Cov * ( t0_n[ i ] - ts_n[ i ] ) * \
C_n += Cgd1 * ( ts_n[i] - tauo_n[i]) * \
( taui_n[ i ] * ( t0_n[ i ] - tauo_n[ i ] ) );
break;
case _G_:
C_n += Cgd1 * ( VDD * ( t0_n[ i ] - tauo_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
taui_n[ i ];
break;
case _E_:
default:
break;
}
}
else if ( ( i == 1 ) && ( i == n - 1 ) )
{
// Cgd
Cov = TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n );
Cgd1 = Cov + 0.5 * TECH.Cox_n * W_n[ i ] * L_n[ i ];
{
case _A_:
C_n += Cov * ( t0_n[ i ] - ts_n[ i ] ) * \
C_n += Cgd1 * ( VDD * ( t0_n[ i ] - tauo_n[ i ] ) * ( ts_n[ i ] - taui_n[ i ] ) + \
Vd_n[ i ] * taui_n[ i ] * ( tauo_n[ i ] - ts_n[ i ] ) ) / \
break;
case _AA_:
C_n += Cov * ( VDD * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauo_n[ i ] ) + \
Vd_n[ i ] * taui_n[ i ] * ( ts_n[ i ] - t0_n[ i ] ) ) / \
C_n += Cgd1 * Vd_n[i] * ( tauo_n[ i ] - ts_n[i] ) / \
( t0_n[i] - tauo_n[ i ]);
break;
case _B_:
C_n += Cgd1 * ( VDD * ( t0_n[ i ] - taui_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) /
taui_n[ i ];
break;
case _C_:
C_n += Cov * Vd_n[ i ] * ( t0_n[ i ] - ts_n[ i ] ) / \
( tauo_n[ i ] - t0_n[i]);
C_n += Cgd1 * Vd_n[ i ] * ( tauo_n[ i ] - ts_n[i]) / \
( t0_n[ i ] - tauo_n[ i ] );
break;
case _D_:
C_n += -Cgd1 * Vd_n[ i ];
break;
case _F_:
C_n += Cov * ( t0_n[ i ] - ts_n[ i ] ) * \
C_n += Cgd1 * ( ts_n[i] - tauo_n[i]) * \
( taui_n[ i ] * ( t0_n[ i ] - tauo_n[ i ] ) );
break;
case _G_:
C_n += Cgd1 * ( VDD * ( t0_n[ i ] - tauo_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) /
taui_n[ i ];
break;
case _E_:
default:
break;
}
// Cgs
Cov = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
// 2.0 / 3.0 rather than 2 / 3: the integer division would evaluate to 0
Cgd1 = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + double(2.0 / 3.0) * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
switch ( OpCondition_n )
{
case _A_:
C_n += Cov * \
Vs_n[ n ] * ( t0_n[ i ] - ts_n[ n ] ) / ( taui_n[ n ] - t0_n[ i ] );
C_n += Cgd1 * \
Vs_n[ n ] * ( taui_n[ n ] - ts_n[ n ] ) / ( t0_n[ i ] - taui_n[ n ] );
break;
case _B_:
C_n += -Cov * Vs_n[n];
break;
case _C_:
C_n += Cov * \
C_n += Cgd1 * \
Vs_n[ n ] * ( tc - ts_n[ n ] ) / ( t0_n[ i ] - taui_n[ n ] );
break;
case _D_:
C_n += Cov * \
Vs_n[ n ] * ( t0_n[ i ] - tc ) / ( taui_n[ n ] - t0_n[ i ] );
break;
case _E_:
default:
break;
}
}
else if ( ( i == n - 1 ) && ( i > 1 ) )
{
// Cgs, the last mos
Cov = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
Cgd1 = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + double(2.0 / 3.0) * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
{
case _A_:
C_n += Cov * \
C_n += Cgd1 * \
Vs_n[ n ] * ( taui_n[ n ] - ts_n[ n ] ) / ( t0_n[ i ] - taui_n[ n ] );

break;
case _B_:
C_n += -Cov * Vs_n[n];
break;
case _C_:
C_n += Cov * \
C_n += Cgd1 * \
Vs_n[ n ] * ( tc - ts_n[ n ] ) / ( t0_n[ i ] - taui_n[ n ] );
break;
case _D_:
C_n += Cov * \
Vs_n[ n ] * ( t0_n[ i ] - tc ) / ( taui_n[ n ] - t0_n[ i ] );
break;
case _E_:
default:
break;
}
}
else if ( i == n )
{
// Cgd
Cov = TECH.Cgd0_n * ( W_n[ n ] + TECH.XW_n );
Cgd1 = Cov + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ];
{
case _A_:
case _B_:
C_n += Cov * \
Vd_n[ i ] * ( t0_n[ i ] - ts_n[ i ] ) / ( tauo_n[ i ] - t0_n[ i ] );
C_n += Cgd1 * \
Vd_n[ i ] * ( tauo_n[ i ] - ts_n[ i ] ) / ( t0_n[ i ] - tauo_n[ i ] );
break;
case _C_:
C_n += Cov * \
Vd_n[ i ] * ( t0_n[ i ] - ts_n[ i ] ) / ( tauo_n[ i ] - t0_n[ i ] );
C_n += Cgd1 * \
Vd_n[ i ] * ( tc - ts_n[ i ] ) / ( t0_n[ i ] - tauo_n[ i ] );
break;
case _D_:
C_n += Cov * \
Vd_n[ i ] * ( t0_n[ i ] - tc ) / ( tauo_n[ i ] - t0_n[ i ] );
break;
case _E_:
default:
break;
}
}
y += C_n;
}
// PMOS in chain
node = LastNode;
C_n = 0;
for (unsigned int i = 0; (i < p - 1) && (p > 0); i++)
{
name = pathlist[ NP ].TransistorName( i + n, NC);
{
}
{
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
C_n += -TECH.C_nj * Wjn * TECH.Df * Vd_n[ n ] * pow ( ( 1 + Vd_n[ n ] / TECH.PB_n ), -TECH.mj_n ) - \
TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * Vd_n[ n ] * pow ( 1 + Vd_n[ n ] / TECH.PB_n, -TECH.mjsw_n );
int nc;
C_n += -circuit.CapStaticGnd( node, nc ) * Vd_n[n];
C_n += -circuit.CapStaticVdd( node, nc ) * Vd_n[n];
C_n += -Wgn * TECH.Lmin * TECH.Cox_n * Vd_n[n];
C_n += -Wgp * TECH.Lmin * TECH.Cox_p * Vd_n[n];
C_n += ( TECH.C_pj * Wjp * TECH.Df * ( VDD - Vd_n[ n ] ) * \
pow ( ( 1 + ( VDD - Vd_n[ n ] ) / TECH.PB_p ), -TECH.mj_p ) );
C_n += ( TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * ( VDD - Vd_n[ n ] ) * \
pow ( ( 1 + ( VDD - Vd_n[ n ] ) / TECH.PB_p ), TECH.mjsw_p ) );
C_n += -VDD * njp * TECH.C_pj * Wjp * TECH.Df * \
pow ( ( 1 + VDD / TECH.PB_p ), -TECH.mj_p );
C_n += -VDD * TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
pow ( ( 1 + VDD / TECH.PB_p ), -TECH.mjsw_p );
// Cgs & Cgd
C_n += -( TECH.Cgd0_p * ( Wjp + njp * TECH.XW_p ) * Vd_n[ n ] );
y += C_n;
}
y -= QpN(n, p);
delete[] FY;
return y;
}
EqP.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::EqP( const Circuit& circuit, unsigned int NP, unsigned int NC, double x, int& RetCode, unsigned int j, unsigned int n, unsi
{
int SaveOpCondition = _E_, OpCondition_1, OpCondition_i, OpCondition_p;
double y, C_p, Cov, Cgd1;
RetCode = OK;
double* FY = new double[p + 1];
tauo_p[ j ] = x;
double t0_bs = -TECH.Vtp0 / VDD * taui_p[ 1 ];
double tc = (VDD * tauo_p[p] * (t0_p[p] - taui_p[p]) + \
Vs_p[p] * taui_p[p] * (tauo_p[p] - t0_p[p])) / \
(VDD * (t0_p[p] - taui_p[p]) + Vs_p[p] * (tauo_p[p] - t0_p[p]));
// tc cross-time : Vd = Vs
if ( j == 1 )
{
OpCondition_1 = Calct0ts1P( circuit, NP, SaveOpCondition, NewWidth );
if ( ( OpCondition_1 == _E_ ) || ( t0_p[ 1 ] < t0_bs ) )
{
OpCondition_1 = _A_;
else
OpCondition_1 = SaveOpCondition;
}
FY[ 1 ] = FirstEqP( OpCondition_1, tauo_p[ j ] );
}
{
t0_p[ i ] = t0_p[ i - 1 ];
taui_p[ i ] = tauo_p[ i - 1 ];
}
// middle equations
if ( ( j > 1 ) && ( j < p ) )
{
if ( taui_p[ j ] <= tauo_p[ j ] )
OpCondition_i = _A_;
else
OpCondition_i = _E_;
if ( OpCondition_i == _E_ )
{
}
FY[ j ] = MiddleEqP( OpCondition_i, j, tauo_p[ j ] );
}
// last equation
if ( p > 1 )
{
OpCondition_p = Calct0tsnP( p, SaveOpCondition );
if ( ( OpCondition_p == _E_ ) || ( OpCondition_p == _C_ ) || ( OpCondition_p == _D_ ) )
{
OpCondition_p = _A_;
else
OpCondition_p = SaveOpCondition;
}
if ( j == p )
FY[ p ] = LastEqP( OpCondition_p, p, tauo_p[ p ] );
}
y = FY[ j ];
// evaluate capacitance at each node
unsigned int node;
// it's the common node used to traverse the path
unsigned int LastNode;
const char* name = pathlist[ NP ].TransistorName( n + p - 1, NC );
// first pmos in path
node = circuit.ValimNode();
for ( unsigned int i = j, k = 1; i > 1; i--, k++ )
{
name = pathlist[ NP ].TransistorName( n + p - 1 - k, NC );
}
for ( unsigned int i = j; i <= p; i++ )
{
C_p = 0.0;
name = pathlist[ NP ].TransistorName( n + p - i, NC );
// first there are n nmos
// then there are the pmos, in REVERSE order
{
}
{
}
if (i == p)
LastNode = node;
// Common capacitance
// dummy
int nc;
// Cj
C_p += TECH.C_pj * Wjp * TECH.Df * ( VDD - Vd_p[ i ]) * pow ( ( 1 + ( VDD - Vd_p[ i ] ) / TECH.PB_p ), -TECH.mj_p );
C_p += TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * ( VDD - Vd_p[ i ]) * pow ( 1 + ( VDD - Vd_p[ i ] ) / TECH.PB_p, -TECH.mjsw_p );
C_p += circuit.CapStaticGnd( node, nc ) * ( VDD - Vd_p[ i ]);
C_p += circuit.CapStaticVdd( node, nc ) * ( VDD - Vd_p[ i ]);
C_p += Wgn * TECH.Lmin * TECH.Cox_n * ( VDD - Vd_p[ i ]);
C_p += Wgp * TECH.Lmin * TECH.Cox_p * ( VDD - Vd_p[ i ]);
if ( ( i == 1 ) && ( i < p - 1 ) )
{
// Cgd & Cgs minus first mos
C_p += ( TECH.Cgs0_p * ( ( Wjp - W_p[ 1 ] ) + ( njp - 1 ) * TECH.XW_p ) + \
0.5 * TECH.Cox_p * ( Wjp - W_p[ 1 ] ) * ( njp - 1 ) * TECH.Lmin ) * ( VDD - Vd_p[ i ] );
}
else if ( ( i < p - 1 ) && ( i > 1 ) )
{
// all Cgd & Cgs
C_p += ( TECH.Cgs0_p * ( Wjp + njp * TECH.XW_p ) + \
0.5 * TECH.Cox_p * Wjp * njp * TECH.Lmin ) * ( VDD - Vd_p[ i ] );
}
else if ( (( i == p ) || ( i == p - 1 )) && (p > 1) )
{
// Cgd & Cgs minus last mos
C_p += ( TECH.Cgs0_p * ( ( Wjp - W_p[ p ] ) + ( njp - 1 ) * TECH.XW_p ) + \
0.5 * TECH.Cox_p * ( Wjp - W_p[ p ] ) * ( njp - 1 ) * TECH.Lmin ) * ( VDD - Vd_p[ i ] );
}
// NMOS
C_p += -( TECH.C_nj * Wjn * TECH.Df * Vd_p[ i ] ) * \
pow ( ( 1 + Vd_p[ i ] / TECH.PB_n ), -TECH.mj_n );
C_p += -( TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * Vd_p[ i ] ) * \
pow ( ( 1 + Vd_p[ i ] / TECH.PB_n ), -TECH.mjsw_n );
C_p += ( TECH.C_nj * Wjn * TECH.Df * VDD ) * \
pow ( ( 1 + VDD / TECH.PB_n ), -TECH.mj_n );
C_p += ( TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * VDD ) * \
pow ( ( 1 + VDD / TECH.PB_n ), -TECH.mjsw_n );
// Cgs & Cgd
C_p += TECH.Cgd0_n * ( Wjn + njn * TECH.XW_n ) * ( VDD - Vd_p[ i ]);
// capacitance with voltages
if ( (( i == 1 ) && ( i < p - 1 )) || ((i == 1) && (p == 1)) )
{
// Cgd
Cov = TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p );
Cgd1 = Cov + 0.5 * TECH.Cox_p * W_p[ i ] * L_p[ i ];
{
case _A_:
C_p += Cov * ( t0_p[ i ] - ts_p[ i ] ) * \
( VDD * ( t0_p[ i ] - tauo_p[ i ] - taui_p[i]) + Vd_p[ i ] * taui_p[ i ] ) / \
( taui_p[ i ] * ( tauo_p[ i ] - t0_p[i]) );
C_p += Cgd1 * ( VDD * ( t0_p[ i ] * ( ts_p[ i ] - taui_p[ i ] ) - \
ts_p[ i ] * ( taui_p[ i ] + tauo_p[ i ] ) + 2 * taui_p[ i ] * tauo_p[ i ] ) + \
Vd_p[ i ] * taui_p[ i ] * ( ts_p[ i ] - tauo_p[ i ] ) ) / \
break;
case _AA_:
C_p += Cov * ( VDD * ( t0_p[ i ] * t0_p[ i ] - t0_p[ i ] * ( 2 * taui_p[ i ] + tauo_p[ i ] ) + \
taui_p[ i ] * ( ts_p[ i ] + tauo_p[ i ] ) ) + \
Vd_p[ i ] * taui_p[ i ] * ( t0_p[ i ] - ts_p[ i ] ) ) / \
C_p += Cgd1 * ( VDD - Vd_p[ i ] ) * ( ts_p[ i ] - tauo_p[ i ] ) / \
( t0_p[i] - tauo_p[ i ]);
break;
case _B_:
C_p += -Cgd1 * ( VDD * ( t0_p[ i ] - 2 * taui_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
taui_p[ i ];
break;
case _C_:
C_p += Cov * ( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - ts_p[ i ] ) / \
( t0_p[i] - tauo_p[ i ]);
( t0_p[i] - tauo_p[ i ]);
break;
case _D_:
C_p += Cgd1 * ( VDD - Vd_p[ i ] );
break;
case _F_:
C_p += Cov * ( t0_p[ i ] - ts_p[ i ] ) * \
( VDD * ( t0_p[ i ] - taui_p[ i ] - tauo_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
C_p += Cgd1 * ( tauo_p[ i ] - ts_p[ i ] ) * \
( taui_p[ i ] * ( t0_p[i] - tauo_p[ i ] ) );
break;
case _G_:
C_p += -Cgd1 * ( VDD * ( t0_p[ i ] - taui_p[ i ] - tauo_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
taui_p[ i ];
break;
case _E_:
default:
break;
}
}
else if ( ( i == 1 ) && ( i == p - 1 ) )
{
// Cgd
Cov = TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p );
Cgd1 = Cov + 0.5 * TECH.Cox_p * W_p[ i ] * L_p[ i ];
{
case _A_:
C_p += Cov * ( t0_p[ i ] - ts_p[ i ] ) * \
( VDD * ( t0_p[ i ] - tauo_p[ i ] - taui_p[i]) + Vd_p[ i ] * taui_p[ i ] ) / \
C_p += Cgd1 * ( VDD * ( t0_p[ i ] * ( ts_p[ i ] - taui_p[ i ] ) - \
ts_p[ i ] * ( taui_p[ i ] + tauo_p[ i ] ) + 2 * taui_p[ i ] * tauo_p[ i ] ) + \
Vd_p[ i ] * taui_p[ i ] * ( ts_p[ i ] - tauo_p[ i ] ) ) / \
break;
case _AA_:
C_p += Cov * ( VDD * ( t0_p[ i ] * t0_p[ i ] - t0_p[ i ] * ( 2 * taui_p[ i ] + tauo_p[ i ] ) + \
taui_p[ i ] * ( ts_p[ i ] + tauo_p[ i ] ) ) + \
Vd_p[ i ] * taui_p[ i ] * ( t0_p[ i ] - ts_p[ i ] ) ) / \
( t0_p[i] - tauo_p[ i ]);
break;
case _B_:
C_p += -Cgd1 * ( VDD * ( t0_p[ i ] - 2 * taui_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
taui_p[ i ];
break;
case _C_:
C_p += Cov * ( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - ts_p[ i ] ) / \
( t0_p[i] - tauo_p[ i ]);
( t0_p[i] - tauo_p[ i ]);
break;
case _D_:
C_p += Cgd1 * ( VDD - Vd_p[ i ] );
break;
case _F_:
C_p += Cov * ( t0_p[ i ] - ts_p[ i ] ) * \
C_p += Cgd1 * ( tauo_p[ i ] - ts_p[ i ] ) * \

( taui_p[ i ] * ( t0_p[i] - tauo_p[ i ] ) );
break;
case _G_:
C_p += -Cgd1 * ( VDD * ( t0_p[ i ] - taui_p[ i ] - tauo_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
taui_p[ i ];
break;
case _E_:
default:
break;
}
// Cgs
Cgd1 = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + double(2.0 / 3.0) * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
Cov = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
switch ( OpCondition_p )
{
case _A_:
C_p += Cov * \
( VDD - Vs_p[ p ] ) * ( t0_p[ i ] - ts_p[ p ] ) / ( t0_p[i] - taui_p[ p ]);
C_p += Cgd1 * \
( VDD - Vs_p[ p ] ) * ( ts_p[p] - taui_p[ p ]) / ( t0_p[ i ] - taui_p[ p ] );
break;
case _B_:
C_p += Cov * (VDD - Vs_p[p]);
break;
case _C_:
C_p += Cov * \
C_p += Cgd1 * \
( VDD - Vs_p[ p ] ) * ( ts_p[ p ] - tc) / ( t0_p[ i ] - taui_p[ p ] );
case _D_:
C_p += Cov * \
( VDD - Vs_p[ i ] ) * ( t0_p[ i ] - tc ) / ( t0_p[i] - taui_p[ p ]);
break;
case _E_:
default:
break;
}
}
else if ( ( i == p - 1 ) && ( i > 1 ) )
{
// Cgs, the last mos
Cgd1 = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + double(2.0 / 3.0) * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
Cov = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
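switch ( OpCondition_p )
{
case _A_:
C_p += Cov * \
( VDD - Vs_p[ p ] ) * ( t0_p[ i ] - ts_p[ p ] ) / ( t0_p[i] - taui_p[ p ]);
C_p += Cgd1 * \
( VDD - Vs_p[ p ] ) * ( ts_p[p] - taui_p[ p ]) / ( t0_p[ i ] - taui_p[ p ] );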
break;
case _B_:
C_p += Cov * (VDD - Vs_p[p]);
break;
case _C_:
C_p += Cov * \
C_p += Cgd1 * \
( VDD - Vs_p[ p ] ) * ( ts_p[ p ] - tc) / ( t0_p[ i ] - taui_p[ p ] );
case _D_:
C_p += Cov * \
( VDD - Vs_p[ i ] ) * ( t0_p[ i ] - tc ) / ( t0_p[i] - taui_p[ p ]);
break;
case _E_:
default:
break;
}
}
else if ( i == p )
{
// Cgd
Cov = ( TECH.Cgd0_p * ( W_p[ p ] + TECH.XW_p ) );
Cgd1 = ( TECH.Cgd0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
{
case _A_:
case _B_:
C_p += Cov * \
( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - ts_p[ i ] ) / ( t0_p[i] - tauo_p[ i ]);
C_p += Cgd1 * \
( VDD - Vd_p[ i ] ) * ( ts_p[i] - tauo_p[ i ]) / ( t0_p[ i ] - tauo_p[ i ] );
break;
case _C_:
C_p += Cov * \
( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - ts_p[ i ] ) / ( t0_p[i] - tauo_p[ i ]);
C_p += Cgd1 * \
( VDD - Vd_p[ i ] ) * ( ts_p[ i ] - tc) / ( t0_p[ i ] - tauo_p[ i ] );
break;
case _D_:
C_p += Cov * \
( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - tc ) / ( t0_p[i] - tauo_p[ i ]);
break;
case _E_:
default:
break;
}
}
y += C_p;
}
node = LastNode;
C_p = 0;
for (unsigned int i = 0; (i < n - 1) && (n > 0); i++)
{
name = pathlist[ NP ].TransistorName( n - 1 - i, NC );
{
}
{
}
// Cj
C_p += TECH.C_pj * Wjp * TECH.Df * ( VDD - Vd_p[ p ]) * pow ( ( 1 + ( VDD - Vd_p[ p ] ) / TECH.PB_p ), -TECH.mj_p );
C_p += TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * ( VDD - Vd_p[ p ]) * pow ( 1 + ( VDD - Vd_p[ p ] ) / TECH.PB_p, -TECH.mjsw_p );
int nc;
C_p += circuit.CapStaticGnd( node, nc ) * ( VDD - Vd_p[ p ]);
C_p += circuit.CapStaticVdd( node, nc ) * ( VDD - Vd_p[ p ]);
C_p += Wgn * TECH.Lmin * TECH.Cox_n * ( VDD - Vd_p[ p ] );
C_p += Wgp * TECH.Lmin * TECH.Cox_p * ( VDD - Vd_p[ p ] );
// NMOS in chain
C_p += -( TECH.C_nj * Wjn * TECH.Df * Vd_p[ p ] ) * \
pow ( ( 1 + Vd_p[ p ] / TECH.PB_n ), -TECH.mj_n );
C_p += -( TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * Vd_p[ p ] ) * \
pow ( ( 1 + Vd_p[ p ] / TECH.PB_n ), -TECH.mjsw_n );
C_p += ( TECH.C_nj * Wjn * TECH.Df * VDD ) * \
pow ( ( 1 + VDD / TECH.PB_n ), -TECH.mj_n );
C_p += ( TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * VDD ) * \
pow ( ( 1 + VDD / TECH.PB_n ), -TECH.mjsw_n );
// Cgs & Cgd
C_p += TECH.Cgd0_n * ( Wjn + njn * TECH.XW_n ) * ( VDD - Vd_p[ p ] );
y += C_p;
}
y += QnP(n, p);
delete[] FY;
return y;
}
Fast.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
Fast::Fast( const CritPathList& pathlist, const Options& options )
:
EvaluationAlgorithm( pathlist, options ),
A_1_n( 0 ), ts_n( 0 ), tauo_n( 0 ), taui_n( 0 ), t0_n( 0 ),
beta_n( 0 ), Vd_n( 0 ), Vs_n( 0 ), W_n( 0 ), L_n( 0 ),
A_1_p( 0 ), B_1_p( 0 ), ts_p( 0 ), tauo_p( 0 ), taui_p( 0 ), t0_p( 0 ),
beta_p( 0 ), Vd_p( 0 ), Vs_p( 0 ), W_p( 0 ), L_p( 0 ), VDD( 0 )
{
print_log( "Creating Fast instance..." );
}
///
Fast::~Fast()
{}
///
int Fast::Run( const Circuit& circuit, const double *NewWidth, const unsigned* ValidPath )
{
VDD = circuit.Valim();
Calls++;
int RetCode;
double tin;
unsigned int n;
unsigned int p;
double TotalDelay;
double TotalPower;
double TotalNoise;
{
if (ValidPath[NP])
{
unsigned int NumChain = pathlist[ NP ].GetNumListTran();
TransitionType TIn;
TransitionType TOut;
TotalDelay = 0.0;
TotalPower = 0.0;
TotalNoise = 0.0;
for ( unsigned int NC = 0; NC < NumChain; NC++ )
{
if ( NC == 0 )
{
TIn = pathlist[ NP ].GetTransitionIn();
tin = pathlist[ NP ].GetInTime() / 1000;
}
else
{
if ( TIn == FALL )
tin = tauo_n[ n ];
else if ( TIn == RISE )
tin = tauo_p[ p ];
}
if ( TIn == FALL )
{
TOut = RISE;
}
else if ( TIn == RISE )
{
TOut = FALL;
}
n = pathlist[ NP ].GetNumTranN( NC );
p = pathlist[ NP ].GetNumTranP( NC );
if ( ( RetCode = InitCircuitVar( n, p ) ) != OK )
return RetCode;
unsigned int in = 1;
unsigned int ip = 1;
while ( const char * tn = pathlist[ NP ].TraverseTransistorNameList( NC ) )
{
int position = circuit.TranPos( tn );
if ( position == -1 )
return NOT_FOUND;
if ( in <= n )
{
W_n[ in ] = NewWidth[ position ];
L_n[ in++ ] = circuit[ position ].Length();
}
else if ( ip <= p )
{
W_p[ p - ip + 1 ] = NewWidth[ position ];
L_p[ p - ip + 1 ] = circuit[ position ].Length();
ip++;
}
else
return NOT_FOUND;
}
CalcParamCircuit( n, p );
TotalDelay += CalcDelay( circuit, NP, NC, n, p, NewWidth, tin, TOut, RetCode );
if ( RetCode != OK )
return RetCode;
TotalPower += CalcPower( circuit, NP, NC, n, p, NewWidth, TOut, RetCode );
if ( RetCode != OK )
return RetCode;
TotalNoise = 0.0;
FreeCircuitPar( n, p );
TIn = TOut;
}
if (NumChain > 0)
for (unsigned int i = 0; i < NumChain; i++)
TotalDelay *= 1.85;
else if (NumChain > 1)
for (unsigned int i = 0; i < NumChain; i++)
TotalDelay *= 3.1; // 1.07 tech 07
CPDelay[ NP ] = TotalDelay * 1000; // ps
CPPower[ NP ] = TotalPower / 1000.0; // pJ
CPNoise[ NP ] = TotalNoise;
}
}
Area = CalcArea( NewWidth, circuit.GetNTran() );
return RetCode;
return OK;
}
FastArea.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::CalcArea( const double *NewWidth, unsigned int NT )
{
double A = 0.0;
for ( unsigned int i = 0; i < NT; i++ )
{
A += NewWidth[ i ];
}
return ( A );
}
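As a stand-alone illustration of the area figure computed by Fast::CalcArea, the following minimal sketch sums a list of transistor widths. The name total_width and the use of std::vector are assumptions made for this example only; they are not part of the tool.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical stand-alone counterpart of Fast::CalcArea: the area
// figure is taken to be the plain sum of all transistor widths.
double total_width(const std::vector<double>& widths)
{
    for (double w : widths)
        assert(w >= 0.0);  // a negative width is not physical
    return std::accumulate(widths.begin(), widths.end(), 0.0);
}
```

With this measure, widening any transistor increases the area by exactly the added width, so the optimizer sees a linear area cost.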
FirstEqN.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::FirstEqN( int OpCondition, double tauon )
{
double Vc, t0_bs, temp;
double A_2_n, B_2_n, C_2_n, D_2_n, I_2_n;
double N_2_n, O_2_n, P_2_n;
double Q_2_n, R_2_n, S_2_n, T_2_n, U_2_n;

if ( OpCondition == _E_ )
if ( taui_n[ 1 ] <= tauon )
OpCondition = _A_;
else
OpCondition = _F_;
C_2_n = 2 * VDD / ( Vc * taui_n[ 1 ] );
D_2_n = ( Vc - 2 * TECH.Vtn0 ) / Vc;
I_2_n = ( Vc * Vc ) * beta_n[ 1 ] * ( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) * \
( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) * 0.5;
N_2_n = Vc * beta_n[ 1 ] * ( 2 * VDD * Vc * ( ( t0_n[ 1 ] - tauon ) * ( t0_n[ 1 ] - tauon ) ) - \
Vd_n[ 1 ] * taui_n[ 1 ] * ( Vc * ( t0_n[ 1 ] - tauon ) + \
Vd_n[ 1 ] * tauon + 2 * TECH.Vtn0 * ( tauon - t0_n[ 1 ] ) ) ) / \
( 2 * Vd_n[ 1 ] * taui_n[ 1 ] * ( tauon - t0_n[ 1 ] ) );
O_2_n = Vc * beta_n[ 1 ] * ( 2 * VDD * ( t0_n[ 1 ] - tauon ) - Vd_n[ 1 ] * taui_n[ 1 ] ) / \
( 2 * taui_n[ 1 ] * ( t0_n[ 1 ] - tauon ) );
P_2_n = ( Vc * Vc ) * beta_n[ 1 ] * ( t0_n[ 1 ] - tauon ) * \
( 2 * VDD * ( Vc * ( t0_n[ 1 ] - tauon ) - Vd_n[ 1 ] * tauon ) - Vd_n[ 1 ] * taui_n[ 1 ] * ( Vc - 2 * TECH.Vtn0 ) );
Q_2_n = 2 * Vd_n[ 1 ] * taui_n[ 1 ] * ( Vc * ( t0_n[ 1 ] - tauon ) - Vd_n[ 1 ] * tauon );
R_2_n = Vc * beta_n[ 1 ] * ( 2 * VDD * ( t0_n[ 1 ] - tauon ) + \
Vc * ( t0_n[ 1 ] - tauon ) + \
Vd_n[ 1 ] * tauon + 2 * TECH.Vtn0 * ( tauon - t0_n[ 1 ] ) ) / \
( 2 * ( t0_n[ 1 ] - tauon ) );
S_2_n = Vc * Vd_n[ 1 ] * beta_n[ 1 ] / ( 2 * ( tauon - t0_n[ 1 ] ) );
T_2_n = ( Vc * Vc ) * beta_n[ 1 ] * ( tauon - t0_n[ 1 ] ) * ( 2 * VDD + Vc - 2 * TECH.Vtn0 );
U_2_n = 2 * ( Vc * ( t0_n[ 1 ] - tauon ) - Vd_n[ 1 ] * tauon );
switch ( OpCondition )
{
case _A_:
temp = -P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * ts_n[ 1 ] * taui_n[ 1 ] ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) + \
P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * ( taui_n[ 1 ] * taui_n[ 1 ] ) ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) - \
T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * taui_n[ 1 ] ) / Vd_n[ 1 ] + \
T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * tauon ) / Vd_n[ 1 ] - \
( 6 * A_2_n * C_2_n * ( t0_n[ 1 ] - ts_n[ 1 ] ) + \
3 * B_2_n * C_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( ts_n[ 1 ] * ts_n[ 1 ] ) ) - \
4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * t0_n[ 1 ] + D_2_n ), 1.5 ) + \
4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * ts_n[ 1 ] + D_2_n ), 1.5 ) + \
3 * C_2_n * ( 2 * N_2_n * ( ts_n[ 1 ] - taui_n[ 1 ] ) + \
O_2_n * ( ( ts_n[ 1 ] * ts_n[ 1 ] ) - ( taui_n[ 1 ] * taui_n[ 1 ] ) ) + \
2 * R_2_n * ( taui_n[ 1 ] - tauon ) + \
S_2_n * ( ( taui_n[ 1 ] * taui_n[ 1 ] ) - ( tauon * tauon ) ) ) ) / ( 6 * C_2_n );
break;
case _AA_:
temp = -T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * ts_n[ 1 ] ) / Vd_n[ 1 ] + \
( 6 * A_2_n * C_2_n * ( t0_n[ 1 ] - taui_n[ 1 ] ) + \
3 * B_2_n * C_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( taui_n[ 1 ] * taui_n[ 1 ] ) ) - \
4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * t0_n[ 1 ] + D_2_n ), 1.5 ) + \
4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * taui_n[ 1 ] + D_2_n ), 1.5 ) - \
3 * C_2_n * ( 2 * I_2_n * ( ts_n[ 1 ] - taui_n[ 1 ] ) + \
2 * R_2_n * ( tauon - ts_n[ 1 ] ) + \
S_2_n * ( ( tauon * tauon ) - ( ts_n[ 1 ] * ts_n[ 1 ] ) ) ) ) / ( 6 * C_2_n );
break;
case _B_:
temp = -P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * t0_n[ 1 ] * taui_n[ 1 ] ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) + \
P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * ( taui_n[ 1 ] * taui_n[ 1 ] ) ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) - \
T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * taui_n[ 1 ] ) / Vd_n[ 1 ] + \
( 2 * N_2_n * ( t0_n[ 1 ] - taui_n[ 1 ] ) + \
O_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( taui_n[ 1 ] * taui_n[ 1 ] ) ) + \
2 * R_2_n * ( taui_n[ 1 ] - tauon ) + \
S_2_n * ( ( taui_n[ 1 ] * taui_n[ 1 ] ) - ( tauon * tauon ) ) ) / 2;
break;
case _C_:
temp = -T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * ts_n[ 1 ] ) / Vd_n[ 1 ] + \
T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * tauon ) / Vd_n[ 1 ] + \
I_2_n * ( ts_n[ 1 ] - t0_n[ 1 ] ) - \
( 2 * R_2_n * ( ts_n[ 1 ] - tauon ) + S_2_n * ( ( ts_n[ 1 ] * ts_n[ 1 ] ) - ( tauon * tauon ) ) ) * 0.5;
break;
case _D_:
temp = -T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * t0_n[ 1 ] ) / Vd_n[ 1 ] + \
( 2 * R_2_n * ( t0_n[ 1 ] - tauon ) + S_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( tauon * tauon ) ) ) * 0.5;
break;
case _F_:
temp = -P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * ts_n[ 1 ] * taui_n[ 1 ] ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) + \
P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] * tauon ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) - \
( 6 * A_2_n * C_2_n * ( t0_n[ 1 ] - ts_n[ 1 ] ) + \
3 * B_2_n * C_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( ts_n[ 1 ] * ts_n[ 1 ] ) ) - \
4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * t0_n[ 1 ] + D_2_n ), 1.5 ) + \
4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * ts_n[ 1 ] + D_2_n ), 1.5 ) + \
3 * C_2_n * ( 2 * N_2_n * ( ts_n[ 1 ] - tauon ) + \
O_2_n * ( ( ts_n[ 1 ] * ts_n[ 1 ] ) - ( tauon * tauon ) ) ) ) / ( 6 * C_2_n );
break;
case _G_:
temp = -P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * t0_n[ 1 ] * taui_n[ 1 ] ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) + \
P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] * tauon ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) - \
( 2 * N_2_n * ( t0_n[ 1 ] - tauon ) + O_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( tauon * tauon ) ) ) / 2;
break;
case _E_:
default:
temp = 0;
break;
}
return temp;
}
FirstEqP.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::FirstEqP( int OpCondition, double tauop )
{
double Vc, t0_bs, temp, x, y;
double A_2_p, B_2_p, C_2_p, D_2_p, G_2_p, J_2_p;
double K_2_p, M_2_p, N_2_p, O_2_p, P_2_p;
double Q_2_p, R_2_p, S_2_p, T_2_p;
if ( OpCondition == _E_ )
if ( taui_p[ 1 ] <= tauop )
OpCondition = _A_;
else
OpCondition = _F_;
x = t0_p[ 1 ] - taui_p[ 1 ];
y = t0_p[ 1 ] - tauop;
C_2_p = ( Vc + 2 * TECH.Vtp0 ) / Vc;
D_2_p = 2 * VDD / ( Vc * taui_p[ 1 ] );
G_2_p = Vc * Vc * beta_p[ 1 ] * SQRT ( ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) / Vc ) - \
VDD * Vc * beta_p[ 1 ] - Vc * Vc * beta_p[ 1 ] - Vc * TECH.Vtp0 * beta_p[ 1 ];
J_2_p = Vc * beta_p[ 1 ] * ( ( VDD * VDD ) * taui_p[ 1 ] * tauop - \
VDD * ( Vc * y * ( x + y - tauop ) + \
2 * taui_p[ 1 ] * ( Vd_p[ 1 ] * tauop - TECH.Vtp0 * y ) ) - \
Vd_p[ 1 ] * taui_p[ 1 ] * ( Vc * y - \
Vd_p[ 1 ] * tauop + 2 * TECH.Vtp0 * y ) ) / \
( 2 * taui_p[ 1 ] * ( Vd_p[ 1 ] - VDD ) * y );
K_2_p = -Vc * beta_p[ 1 ] * ( VDD * ( x + y - tauop ) + \
Vd_p[ 1 ] * taui_p[ 1 ] ) / \
( 2 * taui_p[ 1 ] * y );
M_2_p = ( Vc * Vc ) * beta_p[ 1 ] * y * ( 2 * ( VDD * VDD ) * tauop - \
VDD * ( Vc * ( x + y - tauop ) + \
2 * ( Vd_p[ 1 ] * tauop - TECH.Vtp0 * taui_p[ 1 ] ) ) - \
Vd_p[ 1 ] * taui_p[ 1 ] * ( Vc + 2 * TECH.Vtp0 ) );
N_2_p = 2 * taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * \
( VDD * tauop - Vc * y - Vd_p[ 1 ] * tauop );
O_2_p = 2 * taui_p[ 1 ] * ( ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
P_2_p = -Vc * beta_p[ 1 ] * ( VDD * ( t0_p[ 1 ] + y ) + \
Vc * y - Vd_p[ 1 ] * tauop + 2 * TECH.Vtp0 * y ) / \
( 2 * y );
Q_2_p = Vc * beta_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) / ( 2 * y );
R_2_p = ( Vc * Vc ) * beta_p[ 1 ] * y * ( 2 * VDD + Vc + 2 * TECH.Vtp0 );
S_2_p = 2 * ( VDD * tauop - Vc * y - Vd_p[ 1 ] * tauop );
T_2_p = 2 * ( VDD - Vd_p[ 1 ] );
switch ( OpCondition )
{
case _A_:
temp = -M_2_p * LOG ( O_2_p * ts_p[ 1 ] - N_2_p ) / O_2_p + \
M_2_p * LOG ( O_2_p * taui_p[ 1 ] - N_2_p ) / O_2_p - \
R_2_p * LOG ( T_2_p * taui_p[ 1 ] - S_2_p ) / T_2_p + \
R_2_p * LOG ( T_2_p * tauop - S_2_p ) / T_2_p + \
( 6 * A_2_p * D_2_p * ( t0_p[ 1 ] - ts_p[ 1 ] ) + \
3 * B_2_p * D_2_p * ( t0_p[ 1 ] * t0_p[ 1 ] - ts_p[ 1 ] * ts_p[ 1 ] ) - \
4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * t0_p[ 1 ] ), 1.5 ) + \
4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * ts_p[ 1 ] ), 1.5 ) - \
3 * D_2_p * ( 2 * J_2_p * ( ts_p[ 1 ] - taui_p[ 1 ] ) + \
K_2_p * ( ts_p[ 1 ] * ts_p[ 1 ] - taui_p[ 1 ] * taui_p[ 1 ] ) + \
2 * P_2_p * ( taui_p[ 1 ] - tauop ) + \
Q_2_p * ( taui_p[ 1 ] * taui_p[ 1 ] - tauop * tauop ) ) ) / \
( 6 * D_2_p );
break;
case _AA_:
temp = -R_2_p * LOG ( T_2_p * ts_p[ 1 ] - S_2_p ) / T_2_p + \
R_2_p * LOG ( T_2_p * tauop - S_2_p ) / T_2_p + \
( 6 * A_2_p * D_2_p * ( t0_p[ 1 ] - taui_p[ 1 ] ) + \
3 * B_2_p * D_2_p * ( t0_p[ 1 ] * t0_p[ 1 ] - taui_p[ 1 ] * taui_p[ 1 ] ) - \
4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * t0_p[ 1 ] ), 1.5 ) + \
4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * taui_p[ 1 ] ), 1.5 ) + \
3 * D_2_p * ( 2 * G_2_p * ( ts_p[ 1 ] - taui_p[ 1 ] ) + \
2 * P_2_p * ( tauop - ts_p[ 1 ] ) + \
Q_2_p * ( tauop * tauop - ts_p[ 1 ] * ts_p[ 1 ] ) ) ) / \
( 6 * D_2_p );
break;
case _B_:
temp = -M_2_p * LOG ( O_2_p * t0_p[ 1 ] - N_2_p ) / O_2_p + \
M_2_p * LOG ( O_2_p * taui_p[ 1 ] - N_2_p ) / O_2_p - \
R_2_p * LOG ( T_2_p * taui_p[ 1 ] - S_2_p ) / T_2_p + \
R_2_p * LOG ( T_2_p * tauop - S_2_p ) / T_2_p - \
( 2 * J_2_p * ( t0_p[ 1 ] - taui_p[ 1 ] ) + \
K_2_p * ( ( t0_p[ 1 ] * t0_p[ 1 ] ) - ( taui_p[ 1 ] * taui_p[ 1 ] ) ) + \
2 * P_2_p * ( taui_p[ 1 ] - tauop ) + \
Q_2_p * ( ( taui_p[ 1 ] * taui_p[ 1 ] ) - ( tauop * tauop ) ) ) * 0.5;
break;
case _C_:
temp = -R_2_p * LOG ( T_2_p * ts_p[ 1 ] - S_2_p ) / T_2_p + \
R_2_p * LOG ( T_2_p * tauop - S_2_p ) / T_2_p + \
G_2_p * ( ts_p[ 1 ] - t0_p[ 1 ] ) - \
( 2 * P_2_p * ( ts_p[ 1 ] - tauop ) + \
Q_2_p * ( ( ts_p[ 1 ] * ts_p[ 1 ] ) - ( tauop * tauop ) ) ) * 0.5;
break;
case _D_:
temp = -R_2_p * LOG ( T_2_p * t0_p[ 1 ] - S_2_p ) / T_2_p + \
R_2_p * LOG ( T_2_p * tauop - S_2_p ) / T_2_p - \
( 2 * P_2_p * ( t0_p[ 1 ] - tauop ) + \
Q_2_p * ( ( t0_p[ 1 ] * t0_p[ 1 ] ) - ( tauop * tauop ) ) ) * 0.5;
break;
case _F_:
temp = -M_2_p * LOG ( O_2_p * ts_p[ 1 ] - N_2_p ) / O_2_p + \
M_2_p * LOG ( O_2_p * tauop - N_2_p ) / O_2_p + \
( 6 * A_2_p * D_2_p * ( t0_p[ 1 ] - ts_p[ 1 ] ) + \
3 * B_2_p * D_2_p * ( t0_p[ 1 ] * t0_p[ 1 ] - ts_p[ 1 ] * ts_p[ 1 ] ) - \
4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * t0_p[ 1 ] ), 1.5 ) + \
4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * ts_p[ 1 ] ), 1.5 ) - \
3 * D_2_p * ( 2 * J_2_p * ( ts_p[ 1 ] - tauop ) + \
K_2_p * ( ts_p[ 1 ] * ts_p[ 1 ] - tauop * tauop ) ) ) / \
( 6 * D_2_p );
break;
case _G_:
temp = -M_2_p * LOG ( O_2_p * t0_p[ 1 ] - N_2_p ) / O_2_p + \
M_2_p * LOG ( O_2_p * tauop - N_2_p ) / O_2_p - \
( 2 * J_2_p * ( t0_p[ 1 ] - tauop ) + \
K_2_p * ( ( t0_p[ 1 ] * t0_p[ 1 ] ) - ( tauop * tauop ) ) ) * 0.5;
break;
case _E_:
default:
temp = 0;
break;
}
return temp;
}
Init.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
int Fast::InitCircuitVar( unsigned int n, unsigned int p )
{
// NMOS
if (n == 0)
n = 1;
A_1_n = dvector ( 1, n );
ts_n = dvector ( 1, n );
tauo_n = dvector ( 1, n );
taui_n = dvector ( 1, n );
t0_n = dvector ( 1, n );
beta_n = dvector ( 1, n );
Vd_n = dvector ( 1, n );
Vs_n = dvector ( 1, n );
W_n = dvector ( 1, n );
L_n = dvector ( 1, n );
if ( !A_1_n || !ts_n || !tauo_n || !taui_n || !t0_n || !beta_n || !Vd_n || !Vs_n || !W_n || !L_n )
return NO_MEM;
TECH.u0_n = TECH.Kp_n / TECH.Cox_n;
TECH.Ec_n = TECH.vmax_n / TECH.u0_n * ( 1 + TECH.theta_n * ( VDD - TECH.Vtn0 ) );
double phip_n = TECH.phi_n;
double gamma_n = TECH.gamma_n;
double Vsb1_n = -0.5 * gamma_n * sqrt ( 4 * gamma_n * sqrt ( 2 * phip_n ) + \
8 * phip_n + 4 * VDD - 4 * TECH.Vtn0 + gamma_n * gamma_n ) + \
gamma_n * sqrt ( 2 * phip_n ) + VDD - TECH.Vtn0 + gamma_n * gamma_n * 0.5;
double Vt_n = TECH.Vtn0 + gamma_n * ( sqrt ( 2 * phip_n + Vsb1_n ) - \
sqrt ( 2 * phip_n ) );
{
A_1_n[ i ] = ( Vt_n - TECH.Vtn0 ) / Vsb1_n;
ts_n[ i ] = t0_n[ i ] = taui_n[ i ] = tauo_n[ i ] = 0.0;
}
Vd_n[ i ] = Vsb1_n;
Vs_n[ i ] = Vsb1_n;
Vd_n[ n ] = VDD;
Vs_n[ 1 ] = 0;
if (p == 0)
p = 1;
// PMOS
A_1_p = dvector ( 1, p );
B_1_p = dvector ( 1, p );
ts_p = dvector ( 1, p );
tauo_p = dvector ( 1, p );
taui_p = dvector ( 1, p );
t0_p = dvector ( 1, p );
beta_p = dvector ( 1, p );
Vd_p = dvector ( 1, p );
Vs_p = dvector ( 1, p );
W_p = dvector ( 1, p );
L_p = dvector ( 1, p );
if ( !A_1_p || !ts_p || !tauo_p || !taui_p || !t0_p || !beta_p || !Vd_p || !Vs_p || !W_p || !L_p )
return NO_MEM;
TECH.u0_p = TECH.Kp_p / TECH.Cox_p;
TECH.Ec_p = TECH.vmax_p / TECH.u0_p * ( 1 - TECH.theta_p * ( -VDD - TECH.Vtp0 ) );
//double phip_p = fabs ( TECH.VT * log ( TECH.Nd / TECH.ni ) );
double phip_p = TECH.phi_p;
//double gamma_p = sqrt ( 2 * TECH.epss * TECH.q * TECH.Nd ) / TECH.Cox_p;
double gamma_p = TECH.gamma_p;
double Vsb1_p = 0.5 * gamma_p * sqrt ( 4 * gamma_p * sqrt ( 2 * phip_p ) + \
8 * phip_p + 4 * VDD + 4 * TECH.Vtp0 + gamma_p * gamma_p ) - \
gamma_p * sqrt ( 2 * phip_p ) - VDD - TECH.Vtp0 + gamma_p * gamma_p * 0.5;
//
double Vt_p = TECH.Vtp0 - gamma_p * ( sqrt ( 2 * phip_p - Vsb1_p ) - \
sqrt ( 2 * phip_p ) );
{
A_1_p[ i ] = ( TECH.Vtp0 - Vt_p ) / ( VDD + Vt_p );
B_1_p[ i ] = ( Vt_p * ( VDD + TECH.Vtp0 ) ) / ( VDD + Vt_p );
ts_p[ i ] = t0_p[ i ] = taui_p[ i ] = tauo_p[ i ] = 0.0;
}
for ( unsigned int i = 1; i < p; i++ )
Vd_p[ i ] = VDD + Vsb1_p;
Vs_p[ i ] = VDD + Vsb1_p;
Vd_p[ p ] = 0;
Vs_p[ 1 ] = VDD;
return OK;
}
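The NMOS threshold computed in InitCircuitVar follows the usual body-effect expression, and A_1_n is the slope of its linearization between a source-bulk voltage of 0 and Vsb1_n. The following stand-alone sketch reproduces just these two quantities. The function names and the free-standing parameter list are assumptions made for illustration; the tool itself reads these values from TECH.

```cpp
#include <cassert>
#include <cmath>

// Body-effect threshold of an NMOS, in the same form as Vt_n above:
// Vt = Vt0 + gamma * ( sqrt( 2*phi + Vsb ) - sqrt( 2*phi ) ).
double threshold_nmos(double vt0, double gamma, double phi, double vsb)
{
    assert(phi >= 0.0 && 2.0 * phi + vsb >= 0.0);  // keep both square roots real
    return vt0 + gamma * (std::sqrt(2.0 * phi + vsb) - std::sqrt(2.0 * phi));
}

// Slope of the chord through (0, Vt0) and (vsb1, Vt(vsb1)); this is the
// role played by A_1_n in the listing.
double body_effect_slope(double vt0, double gamma, double phi, double vsb1)
{
    assert(vsb1 != 0.0);
    return (threshold_nmos(vt0, gamma, phi, vsb1) - vt0) / vsb1;
}
```

For example, with vt0 = 0.5, gamma = 0.5, phi = 0.5 and vsb = 3 the two square roots evaluate to 2 and 1, so the threshold rises by 0.5 V.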
///
void Fast::FreeCircuitPar( unsigned int n, unsigned int p )
{
if (n == 0)
n = 1;
free_dvector ( A_1_n);
free_dvector ( ts_n);
free_dvector ( tauo_n);
free_dvector ( taui_n);
free_dvector ( t0_n);
free_dvector ( beta_n);
free_dvector ( Vd_n);
free_dvector ( Vs_n);
free_dvector ( W_n);
free_dvector ( L_n);
if (p == 0)
p = 1;
free_dvector ( A_1_p);
free_dvector ( B_1_p);
free_dvector ( ts_p);
free_dvector ( tauo_p);
free_dvector ( taui_p);
free_dvector ( t0_p);
free_dvector ( beta_p);
free_dvector ( Vd_p);
free_dvector ( Vs_p);
free_dvector ( W_p);
free_dvector ( L_p);
}
///
void Fast::CalcParamCircuit( unsigned int n, unsigned int p )
{
for ( unsigned int i = 1; i <= n; i++ )
{
L_n[ i ] = L_n[ i ] - 2 * TECH.LD_n + TECH.XL_n;
W_n[ i ] = W_n[ i ] - 2 * TECH.WD_n + TECH.XW_n;
beta_n[ i ] = ( TECH.u0_n * TECH.Cox_n * W_n[ i ] / L_n[ i ] ) / ( 1 + TECH.theta_n * ( VDD - TECH.Vtn0 ) );
}
for ( unsigned int i = 1; i <= p; i++ )
{
L_p[ i ] = L_p[ i ] - 2 * TECH.LD_p + TECH.XL_p;
W_p[ i ] = W_p[ i ] - 2 * TECH.WD_p + TECH.XW_p;
beta_p[ i ] = ( TECH.u0_p * TECH.Cox_p * W_p[ i ] / L_p[ i ] ) / ( 1 - TECH.theta_p * ( -VDD - TECH.Vtp0 ) );
}
}
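The per-transistor computation in CalcParamCircuit can be tried in isolation with the sketch below, written for the NMOS case. The structs TechN and Device and the function effective_device are hypothetical names introduced for this example; only the three formulas are taken from the listing.

```cpp
#include <cassert>

// Hypothetical subset of the NMOS technology parameters used above.
struct TechN {
    double u0, Cox, theta, Vt0;  // mobility, oxide capacitance, mobility degradation, threshold
    double LD, WD, XL, XW;       // lateral diffusion and mask corrections
};

struct Device { double W, L, beta; };

// Effective dimensions and gain factor of one NMOS, as in CalcParamCircuit:
// L = L - 2*LD + XL, W = W - 2*WD + XW,
// beta = ( u0 * Cox * W / L ) / ( 1 + theta * ( VDD - Vt0 ) ).
Device effective_device(const TechN& t, double W_drawn, double L_drawn, double VDD)
{
    Device d;
    d.L = L_drawn - 2.0 * t.LD + t.XL;
    d.W = W_drawn - 2.0 * t.WD + t.XW;
    assert(d.L > 0.0 && d.W > 0.0);  // corrections must not consume the whole device
    d.beta = (t.u0 * t.Cox * d.W / d.L) / (1.0 + t.theta * (VDD - t.Vt0));
    return d;
}
```

The PMOS branch of the listing differs only in the sign convention of the denominator, 1 - theta_p * ( -VDD - Vtp0 ).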
Iter.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
int Fast::IterSol( const Circuit& circuit, TransistorType type, unsigned int NP, unsigned int NC, unsigned int n, unsigned int p, const double* NewWidt
{
unsigned int found = 0, n_tol;

int num_sol;
int iter = 0;
double tin;
int RetCode;
unsigned int k;
if ( type == NMOS )
{
k = n;
tin = taui_n[ 1 ];
}
else
{
k = p;
tin = taui_p[ 1 ];
}
double* to = new double[ k + 1 ];
double* to_old = new double[ k + 1 ];
double* t0 = new double[ k + 1 ];
for ( unsigned int i = 1; i <= k; i++ )
{
to_old[ i ] = to[ i ] = tin;
t0[ i ] = 0;
}
if ( type == NMOS )
t0[ 1 ] = t0_n[ 1 ];
else
t0[ 1 ] = t0_p[ 1 ];
double xb1, xb2;
while ( !found )
{
iter++;
n_tol = 0;
for ( unsigned int i = 1; i <= k; i++ )
{
num_sol = 1;
xb1 = to_old[ i ] - STEP_SOL;
xb2 = to_old[ i ] + STEP_SOL;
to[ i ] = SolveEq( circuit, NP, NC, type, ( t0[ i ] + STEP_SOL ), MAX_SOL, RetCode, i, n, p, NewWidth );
//if(RetCode != OK)
// return RetCode;
if ( to[ i ] == 0.0 )
{
num_sol = Brackets( circuit, NP, NC, xb1, xb2, type, i, n, p, NewWidth, RetCode );
//if(RetCode != OK)
// return RetCode;
if ( xb1 < 0.0 )
xb1 = 0.0;
if ( num_sol == 0 )
{
if ( i == 1 )
to[ i ] = tin;
else
to[ i ] = to[ i - 1 ] + tin;
}
else
{
to[ i ] = SolveEq( circuit, NP, NC, type, xb1, xb2, RetCode, i, n, p, NewWidth );
//if(RetCode != OK)
// return RetCode;
if ( to[ i ] == 0.0 )
{
if ( i == 1 )
to[ i ] = tin;
else
to[ i ] = to[ i - 1 ] + tin;
}
}
}
if ( type == NMOS )
{
tauo_n[ i ] = to[ i ];
}
else
{
tauo_p[ i ] = to[ i ];
}
// Dummy
int Sop, Op;
if ( i == 1 )
if ( type == NMOS )
Op = Calct0ts1N ( circuit, NP, Sop, NewWidth );
else
Op = Calct0ts1P ( circuit, NP, Sop, NewWidth );
if ( ( i == k ) && ( k > 1 ) )
if ( type == NMOS )
Op = Calct0tsnN( n, Sop );
else
Op = Calct0tsnP( n, Sop );
if ( type == NMOS )
{
for ( unsigned int j = 1; j <= n; j++ )
t0[ j ] = t0_n[ j ];
}
else
{
for ( unsigned int j = 1; j <= p; j++ )
t0[ j ] = t0_p[ j ];
}
if ( ( fabs ( to[ i ] - to_old[ i ] ) <= STEP_SOL ) && ( iter > 1 ) )
n_tol++;
to_old[ i ] = to[ i ];
}
if ( ( n_tol == k ) || ( iter > ITERMAX ) )
found = 1;
}
delete[] to;
delete[] to_old;
delete[] t0;
return OK;
}
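The stopping rule of IterSol (every unknown moves by at most STEP_SOL between two sweeps, after more than one sweep, or the sweep count exceeds ITERMAX) can be isolated from the circuit equations. The sketch below is a simplified illustration under that reading: the name iterate_until_stable and the callback update are assumptions that stand in for SolveEq and Brackets, and are not part of the tool.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Sweeps over x, replacing each entry by update(x, i), until every entry
// changes by at most tol between two sweeps (and more than one sweep was
// done) or more than max_iter sweeps were made. Returns the sweep count.
int iterate_until_stable(std::vector<double>& x,
                         const std::function<double(const std::vector<double>&, std::size_t)>& update,
                         double tol, int max_iter)
{
    std::vector<double> old_x = x;
    int iter = 0;
    bool found = false;
    while (!found) {
        ++iter;
        std::size_t n_tol = 0;  // entries that stayed within tol in this sweep
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] = update(x, i);
            if (std::fabs(x[i] - old_x[i]) <= tol && iter > 1)
                ++n_tol;
            old_x[i] = x[i];
        }
        if (n_tol == x.size() || iter > max_iter)
            found = true;
    }
    return iter;
}
```

As in the listing, hitting the iteration cap ends the loop without signalling an error, so a caller that needs to tell convergence from exhaustion has to compare the returned sweep count with max_iter.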
LastEqN.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::LastEqN( int OpCondition, unsigned int n, double tauon )
{
double Vc, temp, tc;
double C, D, F, G, H, I, J, K, M, N, O, P, R, S;
double x = t0_n[ n ] - taui_n[ n ];
double y = t0_n[ n ] - tauon;
tc = ( VDD * tauon * ( t0_n[ n ] - taui_n[ n ] ) + \
Vs_n[ n ] * taui_n[ n ] * ( tauon - t0_n[ n ] ) ) / \
( VDD * ( t0_n[ n ] - taui_n[ n ] ) + Vs_n[ n ] * ( tauon - t0_n[ n ] ) );
C = -Vc * Vs_n[ n ] * beta_n[ n ] * ( A_1_n[ n ] + 1 ) / x;
I = ( 2 * A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] + \
2 * VDD * x + Vc * x + \
2 * ( Vs_n[ n ] * taui_n[ n ] - TECH.Vtn0 * x ) ) / \
( Vc * x );
H = -2 * Vs_n[ n ] * ( A_1_n[ n ] + 1 ) / ( Vc * x );
F = Vc * beta_n[ n ] * ( A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] + \

VDD * x + Vc * x + \
Vs_n[ n ] * taui_n[ n ] - TECH.Vtn0 * x ) / x;
D = VDD * x - Vs_n[ n ] * y;
G = ( Vc * Vc ) * beta_n[ n ] * ( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) * \
( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) / 2;
J = Vc * beta_n[ n ] * ( 2 * A_1_n[ n ] * Vs_n[ n ] * y * \
( D * taui_n[ n ] + Vc * x * y ) + \
2 * D * VDD * x * y + \
( VDD * VDD ) * tauon * x * x + \
VDD * x * y * ( Vc * x + \
Vs_n[ n ] * ( y - x ) - 2 * TECH.Vtn0 * x ) + \
Vs_n[ n ] * y * y * ( Vc * x - \
Vs_n[ n ] * taui_n[ n ] + 2 * TECH.Vtn0 * x ) ) / \
( 2 * D * x * y );
K = -Vc * beta_n[ n ] * ( 2 * A_1_n[ n ] * Vs_n[ n ] * y + \
VDD * x + Vs_n[ n ] * y ) / \
( 2 * x * y );
M = ( Vc * Vc ) * beta_n[ n ] * x * y * \
( 2 * A_1_n[ n ] * Vs_n[ n ] * ( VDD * ( x - y ) - Vc * y ) - \
2 * D * VDD + VDD * ( -Vc * x + \
2 * ( Vs_n[ n ] * ( x - y ) + TECH.Vtn0 * x ) ) - \
Vs_n[ n ] * y * ( Vc + 2 * TECH.Vtn0 ) );
N = 2 * D * ( y * ( Vc * x + Vs_n[ n ] * taui_n[ n ] ) - \
VDD * x * tauon );
O = Vc * beta_n[ n ] * ( VDD * ( 2 * t0_n[ n ] - tauon ) + ( Vc - 2 * TECH.Vtn0 ) * y ) / \
( 2 * y );
P = -Vc * VDD * beta_n[ n ] / ( 2 * y );
R = -( Vc * Vc ) * beta_n[ n ] * y * ( 2 * VDD + Vc - 2 * TECH.Vtn0 );
S = 2 * ( Vc * y - VDD * tauon );
switch ( OpCondition )
{
case _A_:
temp = -M * LOG2 ( 2 * ( D * D ) * ts_n[ n ] + N ) / ( D * D ) + \
M * LOG2 ( 2 * ( D * D ) * taui_n[ n ] + N ) / ( D * D ) - \
R * LOG2 ( S + 2 * VDD * taui_n[ n ] ) / VDD + \
R * LOG2 ( S + 2 * VDD * tauon ) / VDD - \
( 6 * F * H * ( t0_n[ n ] - ts_n[ n ] ) + \
3 * C * H * ( ( t0_n[ n ] * t0_n[ n ] ) - ( ts_n[ n ] * ts_n[ n ] ) ) - \
4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * t0_n[ n ] + I ), 1.5 ) + \
4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * ts_n[ n ] + I ), 1.5 ) + \
3 * H * ( 2 * J * ( ts_n[ n ] - taui_n[ n ] ) + \
K * ( ( ts_n[ n ] * ts_n[ n ] ) - ( taui_n[ n ] * taui_n[ n ] ) ) + \
2 * O * ( taui_n[ n ] - tauon ) + \
P * ( ( taui_n[ n ] * taui_n[ n ] ) - ( tauon * tauon ) ) ) ) / \
( 6 * H );
break;
case _B_:
temp = -R * LOG2 ( S + 2 * VDD * ts_n[ n ] ) / VDD + \
R * LOG2 ( S + 2 * VDD * tauon ) / VDD - \
( 6 * F * H * ( t0_n[ n ] - taui_n[ n ] ) + \
3 * C * H * ( ( t0_n[ n ] * t0_n[ n ] ) - ( taui_n[ n ] * taui_n[ n ] ) ) - \
4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * t0_n[ n ] + I ), 1.5 ) + \
4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * taui_n[ n ] + I ), 1.5 ) - \
3 * H * ( 2 * G * ( ts_n[ n ] - taui_n[ n ] ) + \
2 * O * ( tauon - ts_n[ n ] ) + \
P * ( ( tauon * tauon ) - ( ts_n[ n ] * ts_n[ n ] ) ) ) ) / ( 6 * H );
break;
case _C_:
temp = M * LOG2 ( 2 * ( D * D ) * tc + N ) / ( D * D ) - \
M * LOG2 ( 2 * ( D * D ) * ts_n[ n ] + N ) / ( D * D ) - \
( 6 * F * H * ( t0_n[ n ] - ts_n[ n ] ) + \
3 * C * H * ( ( t0_n[ n ] * t0_n[ n ] ) - ( ts_n[ n ] * ts_n[ n ] ) ) - \
4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * t0_n[ n ] + I ), 1.5 ) + \
4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * ts_n[ n ] + I ), 1.5 ) - \
3 * H * ( 2 * J * ( tc - ts_n[ n ] ) + \
K * ( ( tc * tc ) - ( ts_n[ n ] * ts_n[ n ] ) ) ) ) / \
( 6 * H );
break;
case _D_:
temp = F * ( tc - t0_n[ n ] ) + C * 0.5 * ( tc * tc - t0_n[ n ] * t0_n[ n ] ) + \
2 * Vc * Vc * beta_n[ n ] * pow ( ( H * t0_n[ n ] + I ), 1.5 ) / ( 3 * H ) - \
2 * Vc * Vc * pow ( ( H * tc + I ), 1.5 ) / ( 3 * H );
break;
case _E_:
default:
temp = 0;
break;
}
return temp;
}
LastEqP.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::LastEqP( int OpCondition, unsigned int p, double tauop )
{
double Vc, temp, tc;
double C, D, F, G, I, J, K, M, N, O, P, R, S, X, x, y;
tc = ( VDD * t0_p[ p ] * ( taui_p[ p ] - tauop ) + Vd_p[ p ] * tauop * ( t0_p[ p ] - taui_p[ p ] ) + \
Vs_p[ p ] * taui_p[ p ] * ( tauop - t0_p[ p ] ) ) / ( VDD * ( taui_p[ p ] - tauop ) + Vd_p[ p ] * ( t0_p[ p ] - taui_p[ p ] ) + Vs_p[ p ] * ( tauop - t0_p[ p ] ) );
x = t0_p[ p ] - taui_p[ p ];
y = t0_p[ p ] - tauop;
X = VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ];
C = -Vc * beta_p[ p ] * ( A_1_p[ p ] * X + \
B_1_p[ p ] * x + \
VDD * t0_p[ p ] + Vc * x - Vs_p[ p ] * taui_p[ p ] ) / x;
D = Vc * beta_p[ p ] * ( A_1_p[ p ] + 1 ) * ( VDD - Vs_p[ p ] ) / x;
F = Vc * ( 2 * A_1_p[ p ] * X + \
2 * B_1_p[ p ] * x + \
2 * VDD * t0_p[ p ] + Vc * x - \
2 * Vs_p[ p ] * taui_p[ p ] ) / x;
G = Vc * beta_p[ p ] * SQRT ( Vc * ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) ) - \
VDD * Vc * beta_p[ p ] - ( Vc * Vc ) * beta_p[ p ] - Vc * TECH.Vtp0 * beta_p[ p ];
J = VDD * ( y - x ) + Vd_p[ p ] * x - Vs_p[ p ] * y;
K = Vc * beta_p[ p ] * ( 2 * A_1_p[ p ] * y * \
( ( VDD * VDD ) * t0_p[ p ] * ( x - y ) + \
VDD * ( Vc * x * y - Vd_p[ p ] * t0_p[ p ] * x + \
Vs_p[ p ] * ( t0_p[ p ] * y - \
taui_p[ p ] * ( x - y ) ) ) - \
Vs_p[ p ] * ( Vc * x * y - taui_p[ p ] * ( Vd_p[ p ] * x - \
Vs_p[ p ] * y ) ) ) - \
2 * B_1_p[ p ] * J * x * y + \
( VDD * VDD ) * t0_p[ p ] * ( x - y ) * ( x + y ) + \
VDD * ( Vc * x * y * ( x + y ) - \
Vd_p[ p ] * x * ( t0_p[ p ] * ( x + y ) + tauop * ( x - y )
Vs_p[ p ] * y * ( t0_p[ p ] * ( x + y ) - taui_p[ p ] * ( x
Vc * x * y * ( Vd_p[ p ] * x + Vs_p[ p ] * y ) + \
( Vd_p[ p ] * Vd_p[ p ] ) * tauop * x * x + \
Vs_p[ p ] * Vd_p[ p ] * x * y * ( y - x ) - \
Vs_p[ p ] * Vs_p[ p ] * y * y * taui_p[ p ] ) / \
( 2 * J * x * y );
I = Vc * beta_p[ p ] * ( 2 * A_1_p[ p ] * ( VDD - Vs_p[ p ] ) * y + \
VDD * ( x + y ) - Vd_p[ p ] * x - Vs_p[ p ] * y ) / \
( 2 * x * y );
M = ( Vc * Vc ) * beta_p[ p ] * x * y * \
( 2 * A_1_p[ p ] * ( VDD * ( Vc * y - Vd_p[ p ] * y + Vs_p[ p ] * x ) - \
p ] ) + \
Vd_p[ p ] * ( t0_p[ p ] - taui_p[ p ] ) + Vs_p[ p ] *
) + \
- y ) ) ) - \
274
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
Vs_p[ p ] * ( Vc * y + Vd_p[ p ] * ( x - y ) ) ) - \
2 * B_1_p[ p ] * J + VDD * ( Vc * ( x + y ) - \
2 * ( Vd_p[ p ] * y - Vs_p[ p ] * x ) ) - \
Vc * ( Vd_p[ p ] * x + Vs_p[ p ] * y ) + \
2 * Vd_p[ p ] * Vs_p[ p ] * ( y - x ) );
N = 2 * J * ( VDD * t0_p[ p ] * ( y - x ) + \
Vc * x * y + \
Vd_p[ p ] * tauop * x - \
Vs_p[ p ] * taui_p[ p ] * y );
O = -Vc * beta_p[ p ] * ( VDD * ( t0_p[ p ] + y ) + \
Vc * y - Vd_p[ p ] * tauop + 2 * TECH.Vtp0 * y ) / \
( 2 * y );
P = Vc * beta_p[ p ] * ( VDD - Vd_p[ p ] ) / ( 2 * y );
R = ( Vc * Vc ) * beta_p[ p ] * y * ( 2 * VDD + Vc + 2 * TECH.Vtp0 );
S = 2 * ( VDD * tauop - Vc * y - Vd_p[ p ] * tauop );
switch ( OpCondition )
{
case _A_:
temp = -M * LOG2 ( ( 2 * ( J * J ) * ts_p[ p ] - N ) ) / ( J * J ) + \
M * LOG2 ( ( 2 * ( J * J ) * taui_p[ p ] - N ) ) / ( J * J ) + \
R * LOG2 ( ( 2 * taui_p[ p ] * ( VDD - Vd_p[ p ] ) - S ) ) / ( Vd_p[ p ] - VDD ) + \
R * LOG2 ( ( 2 * tauop * ( VDD - Vd_p[ p ] ) - S ) ) / ( VDD - Vd_p[ p ] ) - \
( 2 * Vc * beta_p[ p ] * ( 2 * D * t0_p[ p ] - F * beta_p[ p ] ) * \
SQRT ( ( F * beta_p[ p ] - 2 * D * t0_p[ p ] ) / beta_p[ p ] ) + \
2 * Vc * beta_p[ p ] * ( F * beta_p[ p ] - 2 * D * ts_p[ p ] ) * \
SQRT ( ( F * beta_p[ p ] - 2 * D * ts_p[ p ] ) / beta_p[ p ] ) + \
3 * D * ( 2 * C * ( t0_p[ p ] - ts_p[ p ] ) + D * ( ( t0_p[ p ] * t0_p[ p ] ) - ( ts_p[ p ] * ts_p[ p ] ) ) + \
I * ( ( ts_p[ p ] * ts_p[ p ] ) - ( taui_p[ p ] * taui_p[ p ] ) ) + \
2 * K * ( ts_p[ p ] - taui_p[ p ] ) + \
2 * O * ( taui_p[ p ] - tauop ) + \
P * ( ( taui_p[ p ] * taui_p[ p ] ) - ( tauop * tauop ) ) ) ) / ( 6 * D );
break;
case _B_:
temp = R * LOG2 ( ( 2 * ts_p[ p ] * ( VDD - Vd_p[ p ] ) - S ) ) / ( Vd_p[ p ] - VDD ) + \
R * LOG2 ( ( 2 * tauop * ( VDD - Vd_p[ p ] ) - S ) ) / ( VDD - Vd_p[ p ] ) - \
2 * Vc * beta_p[ p ] * ( F * beta_p[ p ] - 2 * D * taui_p[ p ] ) * \
SQRT ( ( F * beta_p[ p ] - 2 * D * taui_p[ p ] ) / beta_p[ p ] ) + \
3 * D * ( 2 * C * ( t0_p[ p ] - taui_p[ p ] ) + \
D * ( ( t0_p[ p ] * t0_p[ p ] ) - ( taui_p[ p ] * taui_p[ p ] ) ) + \
2 * G * ( taui_p[ p ] - ts_p[ p ] ) + \
2 * O * ( ts_p[ p ] - tauop ) + \
P * ( ( ts_p[ p ] * ts_p[ p ] ) - ( tauop * tauop ) ) ) ) / ( 6 * D );
break;
case _C_:
temp = M * LOG2 ( ( 2 * ( J * J ) * tc - N ) ) / ( J * J ) - \
M * LOG2 ( ( 2 * ( J * J ) * ts_p[ p ] - N ) ) / ( J * J ) - \
2 * Vc * beta_p[ p ] * ( F * beta_p[ p ] - 2 * D * ts_p[ p ] ) * \
SQRT ( ( F * beta_p[ p ] - 2 * D * ts_p[ p ] ) / beta_p[ p ] ) + \
3 * D * ( 2 * C * ( t0_p[ p ] - ts_p[ p ] ) + \
D * ( ( t0_p[ p ] * t0_p[ p ] ) - ( ts_p[ p ] * ts_p[ p ] ) ) + \
I * ( ( ts_p[ p ] * ts_p[ p ] ) - ( tc * tc ) ) + \
2 * K * ( ts_p[ p ] - tc ) ) ) / ( 6 * D );
break;
case _D_:
temp = -( 2 * Vc * beta_p[ p ] * ( 2 * D - F * beta_p[ p ] ) * \
2 * Vc * beta_p[ p ] * ( F * beta_p[ p ] - 2 * D * tc ) * \
SQRT ( ( F * beta_p[ p ] - 2 * D * tc ) / beta_p[ p ] ) + \
3 * D * ( 2 * C * ( t0_p[ p ] - tc ) + D * ( t0_p[ p ] * t0_p[ p ] - tc * tc ) ) ) / \
( 6 * D );
break;
case _E_:
default:
temp = 0;
break;
}
return temp;
}
MiddleEqN.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::MiddleEqN( int OpCondition, unsigned int i, double tauon )
{
double Vc, temp;
double D, F, G, H, I, J, K, M, N;
Vc = TECH.Ec_n * L_n[ i ];
switch ( OpCondition )
{
case _A_:
D = Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) + \
Vs_n[ i ] * ( tauon - t0_n[ i ] );
F = Vc * beta_n[ i ] * \
( 2 * A_1_n[ i ] * Vs_n[ i ] * ( t0_n[ i ] - tauon ) * \
( D * taui_n[ i ] + Vc * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) ) + \
2 * D * VDD * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) + \
Vc * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) * \
( Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) + Vs_n[ i ] * ( t0_n[ i ] - tauon ) ) + \
( Vd_n[ i ] * Vd_n[ i ] ) * tauon * ( ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - taui_n[ i ] ) ) + \
Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) * \
( Vs_n[ i ] * ( taui_n[ i ] - tauon ) + 2 * TECH.Vtn0 * ( taui_n[ i ] - t0_n[ i ] ) ) - \
Vs_n[ i ] * ( ( t0_n[ i ] - tauon ) * ( t0_n[ i ] - tauon ) ) * \
( Vs_n[ i ] * taui_n[ i ] + 2 * TECH.Vtn0 * ( taui_n[ i ] - t0_n[ i ] ) ) ) / \
( 2 * D * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) );
G = Vc * beta_n[ i ] * \
( 2 * A_1_n[ i ] * Vs_n[ i ] * ( t0_n[ i ] - tauon ) + \
Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) + Vs_n[ i ] * ( t0_n[ i ] - tauon ) ) / \
( 2 * ( t0_n[ i ] - taui_n[ i ] ) * ( tauon - t0_n[ i ] ) );
H = ( Vc * Vc ) * beta_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) * \
( tauon - t0_n[ i ] ) * \
( 2 * A_1_n[ i ] * Vs_n[ i ] * ( Vc * ( t0_n[ i ] - tauon ) + \
Vd_n[ i ] * ( taui_n[ i ] - tauon ) ) + 2 * D * VDD + Vc * \
( Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) + Vs_n[ i ] * ( t0_n[ i ] - tauon ) ) + \
2 * ( Vd_n[ i ] * ( Vs_n[ i ] * ( taui_n[ i ] - tauon ) + \
TECH.Vtn0 * ( taui_n[ i ] - t0_n[ i ] ) ) + Vs_n[ i ] * TECH.Vtn0 * ( t0_n[ i ] - tauon ) ) );
I = 2 * D * ( Vc * ( t0_n[ i ] - taui_n[ i ] ) * \
( t0_n[ i ] - tauon ) + Vd_n[ i ] * tauon * ( taui_n[ i ] - t0_n[ i ] ) + \
Vs_n[ i ] * taui_n[ i ] * ( t0_n[ i ] - tauon ) );
J = Vc * beta_n[ i ] * ( 2 * VDD * ( t0_n[ i ] - tauon ) + \
Vc * ( t0_n[ i ] - tauon ) + Vd_n[ i ] * tauon + \
2 * TECH.Vtn0 * ( tauon - t0_n[ i ] ) ) / ( 2 * ( t0_n[ i ] - tauon ) );
K = Vc * Vd_n[ i ] * beta_n[ i ] / ( 2 * ( tauon - t0_n[ i ] ) );
M = ( Vc * Vc ) * beta_n[ i ] * ( tauon - t0_n[ i ] ) * \
( 2 * VDD + Vc - 2 * TECH.Vtn0 );
N = 2 * ( Vc * ( t0_n[ i ] - tauon ) - Vd_n[ i ] * tauon );
temp = -H * LOG2 ( 2 * ( D * D ) * t0_n[ i ] + I ) / ( D * D ) + \
H * LOG2 ( 2 * ( D * D ) * taui_n[ i ] + I ) / ( D * D ) - \
M * LOG2 ( N + 2 * Vd_n[ i ] * taui_n[ i ] ) / Vd_n[ i ] + \
M * LOG2 ( N + 2 * Vd_n[ i ] * tauon ) / Vd_n[ i ] - \
( 2 * F * ( t0_n[ i ] - taui_n[ i ] ) + G * ( ( t0_n[ i ] * t0_n[ i ] ) - ( taui_n[ i ] * taui_n[ i ] ) ) + 2 * J * ( taui_n[ i ] - tauo
break;
case _E_:
default:
temp = 0;
break;
}
return temp;
}
MiddleEqP.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::MiddleEqP( int OpCondition, unsigned int i, double tauop )
{
double Vc, temp;
double D, J, K, M, N, O, P, R, S, T;
Vc = TECH.Ec_p * L_p[ i ];
switch ( OpCondition )
{
case _A_:
D = VDD * ( taui_p[ i ] - tauop ) + \
Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) + \
Vs_p[ i ] * ( tauop - t0_p[ i ] );
J = Vc * beta_p[ i ] * \
( 2 * A_1_p[ i ] * ( t0_p[ i ] - tauop ) * ( ( VDD * VDD ) * t0_p[ i ] * ( taui_p[ i ] - tauop ) - \
VDD * ( Vc * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) + \
Vd_p[ i ] * t0_p[ i ] * ( taui_p[ i ] - t0_p[ i ] ) + \
Vs_p[ i ] * ( ( t0_p[ i ] * t0_p[ i ] ) - t0_p[ i ] * tauop + taui_p[ i ] * ( taui_p[ i ] - tauop ) ) ) + \
Vs_p[ i ] * ( Vc * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) - \
taui_p[ i ] * ( Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) + Vs_p[ i ] * ( tauop - t0_p[ i ] ) ) ) ) + \
2 * B_1_p[ i ] * D * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) + \
( VDD * VDD ) * t0_p[ i ] * ( taui_p[ i ] - tauop ) * \
( 2 * t0_p[ i ] - taui_p[ i ] - tauop ) - VDD * ( Vc * ( t0_p[ i ] - taui_p[ i ] ) * \
( t0_p[ i ] - tauop ) * ( 2 * t0_p[ i ] - taui_p[ i ] - tauop ) + \
Vd_p[ i ] * ( taui_p[ i ] - t0_p[ i ] ) * \
( 2 * ( t0_p[ i ] * t0_p[ i ] ) - t0_p[ i ] * ( taui_p[ i ] + tauop ) - \
tauop * ( taui_p[ i ] - tauop ) ) + Vs_p[ i ] * ( t0_p[ i ] - tauop ) * \
( 2 * ( t0_p[ i ] * t0_p[ i ] ) - t0_p[ i ] * ( taui_p[ i ] + tauop ) + taui_p[ i ] * ( taui_p[ i ] - tauop ) ) )
Vc * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) * ( Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) + \
Vs_p[ i ] * ( t0_p[ i ] - tauop ) ) - ( Vd_p[ i ] * Vd_p[ i ] ) * tauop * ( ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[
Vs_p[ i ] * ( tauop - t0_p[ i ] ) * ( Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) * \
( taui_p[ i ] - tauop ) + Vs_p[ i ] * taui_p[ i ] * ( tauop - t0_p[ i ] ) ) ) / \
( 2 * D * ( t0_p[ i ] - taui_p[ i ] ) * ( tauop - t0_p[ i ] ) );
K = Vc * beta_p[ i ] * ( 2 * A_1_p[ i ] * ( VDD - Vs_p[ i ] ) * ( t0_p[ i ] - tauop ) + \
VDD * ( 2 * t0_p[ i ] - taui_p[ i ] - tauop ) + \
Vd_p[ i ] * ( taui_p[ i ] - t0_p[ i ] ) + \
Vs_p[ i ] * ( tauop - t0_p[ i ] ) ) / ( 2 * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) );
M = ( Vc * Vc ) * beta_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) * \
( 2 * A_1_p[ i ] * ( VDD * ( Vc * ( t0_p[ i ] - tauop ) + \
Vd_p[ i ] * ( tauop - t0_p[ i ] ) + Vs_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) ) - \
Vs_p[ i ] * ( Vc * ( t0_p[ i ] - tauop ) + Vd_p[ i ] * ( tauop - taui_p[ i ] ) ) ) - \
2 * B_1_p[ i ] * D + VDD * ( Vc * ( 2 * t0_p[ i ] - taui_p[ i ] - tauop ) - \
2 * ( Vd_p[ i ] * ( t0_p[ i ] - tauop ) + Vs_p[ i ] * ( taui_p[ i ] - t0_p[ i ] ) ) ) - \
Vc * ( Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) + \
Vs_p[ i ] * ( t0_p[ i ] - tauop ) ) + 2 * Vd_p[ i ] * Vs_p[ i ] * ( taui_p[ i ] - tauop ) );
N = 2 * D * ( VDD * t0_p[ i ] * ( taui_p[ i ] - tauop ) + \
Vc * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) + \
Vd_p[ i ] * tauop * ( t0_p[ i ] - taui_p[ i ] ) + Vs_p[ i ] * taui_p[ i ] * ( tauop - t0_p[ i ] ) );
O = Vc * beta_p[ i ] * ( VDD * ( 2 * t0_p[ i ] - tauop ) + \
Vc * ( t0_p[ i ] - tauop ) - Vd_p[ i ] * tauop + 2 * TECH.Vtp0 * ( t0_p[ i ] - tauop ) ) / \
( 2 * ( tauop - t0_p[ i ] ) );
P = Vc * beta_p[ i ] * ( VDD - Vd_p[ i ] ) / ( 2 * ( t0_p[ i ] - tauop ) );
R = ( Vc * Vc ) * beta_p[ i ] * ( t0_p[ i ] - tauop ) * ( 2 * VDD + Vc + 2 * TECH.Vtp0 );
S = 2 * ( VDD * tauop + Vc * ( tauop - t0_p[ i ] ) - Vd_p[ i ] * tauop );
T = 2 * ( VDD - Vd_p[ i ] );
temp = -M * LOG2 ( 2 * ( D * D ) * t0_p[ i ] - N ) / ( D * D ) + \
M * LOG2 ( 2 * ( D * D ) * taui_p[ i ] - N ) / ( D * D ) - \
R * LOG ( T * taui_p[ i ] - S ) / T + \
R * LOG ( T * tauop - S ) / T - \
( 2 * J * ( t0_p[ i ] - taui_p[ i ] ) + \
K * ( ( t0_p[ i ] * t0_p[ i ] ) - ( taui_p[ i ] * taui_p[ i ] ) ) + \
2 * O * ( taui_p[ i ] - tauop ) + \
P * ( ( taui_p[ i ] * taui_p[ i ] ) - ( tauop * tauop ) ) ) / 2;
break;
case _E_:
default:
temp = 0;
break;
}
return temp;
}
Power.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::CalcPower( const Circuit& circuit,
unsigned int NP,
unsigned int NC,
unsigned int n,
unsigned int p,
TransitionType TOut,
int& RetCode )
{
double Ecc, Esc;
switch ( TOut )
{
case FALL:
// n chain
RetCode = CalcPowerN( circuit, NP, NC, Ecc, Esc, n, p, NewWidth );
return 0.0;
break;
case RISE:
// p chain
RetCode = CalcPowerP( circuit, NP, NC, Ecc, Esc, n, p, NewWidth );
return 0.0;
break;
case NOTRANSITION:
default:
break;
}
return ( Ecc + Esc );
}
QnP.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::QnP(unsigned int n, unsigned int p)
{
double D_n, E_n, F_n, G_n;
double to;
double Vc, y;
if (n != 0)
{
Vc = TECH.Ec_n * L_n[1];
to = taui_n[1] * (1 - TECH.Vtn0 / VDD);
if (to > tauo_p[p])
to = tauo_p[p];
D_n = Vc * beta_n[1] * (VDD * VDD * taui_p[1] * (t0_p[p] - 2 * tauo_p[p]) - \
VDD * (Vc * (t0_p[p] - tauo_p[p]) * \
(2 * t0_p[p] - taui_p[1] - 2 * tauo_p[p]) + \
taui_p[1] * (Vd_p[p] * (t0_p[p] - 3 * tauo_p[p]) + \
2 * TECH.Vtn0 * (t0_p[p] - tauo_p[p]))) - \
Vd_p[p] * taui_p[1] * (Vc * (t0_p[p] - tauo_p[p]) + \
Vd_p[p] * tauo_p[p] + 2 * TECH.Vtn0 * \
(tauo_p[p] - t0_p[p]))) / \
(2 * taui_p[1] * (VDD - Vd_p[p]) * (t0_p[p] - tauo_p[p]));
E_n = Vc * beta_n[1] * (VDD * (2 * t0_p[p] - taui_p[1] - 2 * tauo_p[p]) + \
Vd_p[p] * taui_p[1]) / \
(2 * taui_p[1] * (tauo_p[p] - t0_p[p]));
F_n = Vc * Vc * beta_n[1] * (tauo_p[p] - t0_p[p]) * \
(2 * VDD * VDD * (t0_p[p] - taui_p[1]) + \
VDD * (Vc * (2 * t0_p[p] - taui_p[1] - 2 * tauo_p[p]) + \
2 * (Vd_p[p] * (taui_p[1] - tauo_p[p]) + TECH.Vtn0 * taui_p[1])) + \
Vd_p[p] * taui_p[1] * (Vc - 2 * TECH.Vtn0));
G_n = 2 * taui_p[1] * (VDD - Vd_p[p]) * (VDD * t0_p[p] + \
Vc * (t0_p[p] - tauo_p[p]) - \
Vd_p[p] * tauo_p[p]);
y = -F_n * (LOG2(2 * t0_p[p] * taui_p[1] * (VDD - Vd_p[p]) * \
(VDD - Vd_p[p]) - G_n)) / \
(taui_p[1] * (VDD - Vd_p[p]) * \
(VDD - Vd_p[p])) + \
F_n * (LOG2(2 * to * taui_p[1] * (VDD - Vd_p[p]) * \
(VDD - Vd_p[p]) - G_n)) / \
(taui_p[1] * (VDD - Vd_p[p]) * \
(VDD - Vd_p[p])) - \
(2 * D_n * (t0_p[p] - to) + E_n * (t0_p[p] * t0_p[p] - to * to)) * 0.5;
return y;
}
else
return 0.0;
}
QpN.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::QpN(unsigned int n, unsigned int p)
{
double D_p, E_p, F_p, G_p;
double to;
double Vc, y;
if (p != 0)
{
Vc = TECH.Ec_p * L_p[1];
to = taui_n[1] * (1 + TECH.Vtp0 / VDD);
if (to > tauo_n[n])
to = tauo_n[n];
D_p = Vc * beta_p[1] * (VDD * (t0_n[n] - tauo_n[n]) * \
(2 * Vc * (t0_n[n] - tauo_n[n]) - Vd_n[n] * taui_n[1]) - \
Vd_n[n] * taui_n[1] * (Vc * (t0_n[n] - tauo_n[n]) - \
Vd_n[n] * tauo_n[n] + 2 * TECH.Vtp0 * (t0_n[n] - tauo_n[n]))) / \
(2 * Vd_n[n] * taui_n[1] * (t0_n[n] - tauo_n[n]));
E_p = Vc * beta_p[1] * (2 * VDD * (t0_n[n] - tauo_n[n]) - Vd_n[n] * taui_n[1]) / \
(2 * taui_n[1] * (t0_n[n] - tauo_n[n]));
F_p = Vc * Vc * beta_p[1] * (t0_n[n] - tauo_n[n]) * \
(2 * VDD * VDD * (t0_n[n] - tauo_n[n]) + \
2 * VDD * (Vc * (t0_n[n] - tauo_n[n]) + \
Vd_n[n] * (tauo_n[n] - taui_n[1])) - Vd_n[n] * taui_n[1] * (Vc + 2 * TECH.Vtp0));
G_p = 2 * Vd_n[n] * taui_n[1] * (VDD * (t0_n[n] - tauo_n[n]) + \
Vc * (t0_n[n] - tauo_n[n]) + Vd_n[n] * tauo_n[n]);
y = -F_p * (LOG2(2 * Vd_n[n] * Vd_n[n] * t0_n[n] * taui_n[1] - G_p)) / \
(Vd_n[n] * Vd_n[n] * taui_n[1]) + \
(F_p * LOG2(2 * Vd_n[n] * Vd_n[n] * to * taui_n[1] - G_p)) / \
(Vd_n[n] * Vd_n[n] * taui_n[1]) - \
(2 * D_p * (t0_n[n] - to) + E_p * (t0_n[n] * t0_n[n] - to * to)) * 0.5;
return y;
}
else
return 0.0;
}
Solve.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
#define SIGN(a,b) ((b) >= 0.0 ? fabs(a) : -fabs(a))
///
double Fast::SolveEq( const Circuit& circuit, unsigned int NP, unsigned int NC, TransistorType type, double start, double end, int& RetCode, unsigned i
{
int iter;
double a = start, b = end, c = end, d, e, min1, min2;
double fa, fb, fc, pp, q, r, s, tol1, xm, last;
double tol = TOL;
if ( type == NMOS )
fb = EqN( circuit, NP, NC, b, RetCode, i, n, p, NewWidth );
else
fb = EqP( circuit, NP, NC, b, RetCode, i, n, p, NewWidth );
last = fb;
if ( type == NMOS )
fa = EqN( circuit, NP, NC, a, RetCode, i, n, p, NewWidth );
else
fa = EqP( circuit, NP, NC, a, RetCode, i, n, p, NewWidth );
if ( ( fa > 0.0 && fb > 0.0 ) || ( fa < 0.0 && fb < 0.0 ) )
{
return 0.0;
}
fc = fb;
for ( iter = 1; iter <= ITERMAX; iter++ )
{
if ( ( fb > 0.0 && fc > 0.0 ) || ( fb < 0.0 && fc < 0.0 ) )
{
c = a;
fc = fa;
e = d = b - a;
}
if ( fabs ( fc ) < fabs ( fb ) )
{
a = b;
b = c;
c = a;
fa = fb;
fb = fc;
fc = fa;
}
tol1 = 2.0 * EPS * fabs ( b ) + 0.5 * tol;
xm = 0.5 * ( c - b );
if ( fabs ( xm ) <= tol1 || fb == 0.0 )
return b;
if ( fabs ( e ) >= tol1 && fabs ( fa ) > fabs ( fb ) )
{
s = fb / fa;
if ( a == c )
{
pp = 2.0 * xm * s;
q = 1.0 - s;
}
else
{
q = fa / fc;
r = fb / fc;
pp = s * ( 2.0 * xm * q * ( q - r ) - ( b - a ) * ( r - 1.0 ) );
q = ( q - 1.0 ) * ( r - 1.0 ) * ( s - 1.0 );
}
if ( pp > 0.0 )
q = -q;
pp = fabs ( pp );
min1 = 3.0 * xm * q - fabs ( tol1 * q );
min2 = fabs ( e * q );
if ( 2.0 * pp < ( min1 < min2 ? min1 : min2 ) )
{
e = d;
d = pp / q;
}
else
{
d = xm;
e = d;
}
}
else
{
d = xm;
e = d;
}
a = b;
fa = fb;
if ( fabs ( d ) > tol1 )
b += d;
else
b += SIGN ( tol1, xm );
if ( type == NMOS )
fb = EqN( circuit, NP, NC, b, RetCode, i, n, p, NewWidth );
else
fb = EqP( circuit, NP, NC, b, RetCode, i, n, p, NewWidth );
}
return 0.0;
}
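The iteration in SolveEq (sign check on the bracket, an interpolation step when it is safe, a bisection step otherwise) has the shape of the classical Brent root finder. The following is a minimal stand-alone sketch of that scheme, assuming a generic scalar function in place of EqN and EqP. The name BrentRoot, its parameter list and its boolean return value are hypothetical and introduced only for illustration; they are not part of the FAST sources.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>
#include <limits>

// Hypothetical helper: finds a root of f inside [a, b] with a Brent-style
// iteration. Returns false if f(a) and f(b) have the same sign or if the
// iteration limit is reached; otherwise stores the root and returns true.
bool BrentRoot(const std::function<double(double)>& f, double a, double b,
               double tol, int iterMax, double& root)
{
    const double eps = std::numeric_limits<double>::epsilon();
    double fa = f(a), fb = f(b);
    if ((fa > 0.0 && fb > 0.0) || (fa < 0.0 && fb < 0.0))
        return false;                       // interval does not bracket a root
    double c = b, fc = fb, d = 0.0, e = 0.0;
    for (int iter = 0; iter < iterMax; ++iter) {
        if ((fb > 0.0 && fc > 0.0) || (fb < 0.0 && fc < 0.0)) {
            c = a; fc = fa; e = d = b - a;  // keep the root between b and c
        }
        if (std::fabs(fc) < std::fabs(fb)) {
            a = b; b = c; c = a;            // make b the best estimate so far
            fa = fb; fb = fc; fc = fa;
        }
        const double tol1 = 2.0 * eps * std::fabs(b) + 0.5 * tol;
        const double xm = 0.5 * (c - b);
        if (std::fabs(xm) <= tol1 || fb == 0.0) {
            root = b;
            return true;
        }
        if (std::fabs(e) >= tol1 && std::fabs(fa) > std::fabs(fb)) {
            double p, q;
            const double s = fb / fa;
            if (a == c) {                   // secant step
                p = 2.0 * xm * s;
                q = 1.0 - s;
            } else {                        // inverse quadratic interpolation
                const double qq = fa / fc, r = fb / fc;
                p = s * (2.0 * xm * qq * (qq - r) - (b - a) * (r - 1.0));
                q = (qq - 1.0) * (r - 1.0) * (s - 1.0);
            }
            if (p > 0.0) q = -q;
            p = std::fabs(p);
            const double min1 = 3.0 * xm * q - std::fabs(tol1 * q);
            const double min2 = std::fabs(e * q);
            if (2.0 * p < std::min(min1, min2)) { e = d; d = p / q; }
            else { d = xm; e = d; }         // interpolation rejected: bisect
        } else {
            d = xm; e = d;                  // bounds shrinking too slowly: bisect
        }
        a = b; fa = fb;
        b += (std::fabs(d) > tol1) ? d : (xm >= 0.0 ? tol1 : -tol1);
        fb = f(b);
    }
    return false;                           // iteration limit reached
}
```

Unlike SolveEq, which returns 0.0 both for an invalid bracket and for a failed iteration, this sketch reports failure through its return value so that a caller can tell a failure apart from a root located at zero.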
t0N.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::t0N( const Circuit& circuit, unsigned int NP, unsigned int NC, double t, const double* NewWidth, int& RetCode )
{
// compute the time at which the first n-mos start conducting, using
// bootstrap
double A_2_n, B_2_n, C_2_n, D_2_n, Vc, y, Cm1, Cov, Cj, t0_bs;
Cov = TECH.Cgd0_n * ( W_n[ 1 ] + TECH.XW_n );
Cm1 = Cov;
Cj = 0.0;
unsigned int node;
const char* name = pathlist[ NP ].TransistorName( 0, NC );
{
}
{
}
Cj += TECH.C_nj * Wjn * TECH.Df * pow ( 1 + Vd_n[ 1 ] / TECH.PB_n, -TECH.mj_n ) + \
TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * pow ( 1 + Vd_n[ 1 ] / TECH.PB_n, -TECH.mjsw_n );
// evaluate Cgs Cgd @ V node 1 for pmos
Cj += TECH.C_pj * Wjp * TECH.Df * pow ( 1 + ( VDD - Vd_n[ 1 ] ) / TECH.PB_p, -TECH.mj_p ) + \
TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * pow ( 1 + ( VDD - Vd_n[ 1 ] ) / TECH.PB_p, -TECH.mjsw_p );
// evaluate other capacitances
int nc;
// evaluate gate capacitances
C_2_n = 2 * VDD / ( Vc * taui_n[ 1 ] );
D_2_n = ( Vc - 2 * TECH.Vtn0 ) / Vc;
y = -2 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( C_2_n * t + D_2_n, 1.5 ) / ( 3 * C_2_n ) + \
B_2_n * ( t * t ) * 0.5 + t * ( A_2_n * taui_n[ 1 ] - Cov * VDD ) / ( taui_n[ 1 ] ) - \
( 6 * A_2_n * C_2_n * t0_bs + 3 * B_2_n * C_2_n * ( t0_bs * t0_bs ) - \
4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( C_2_n * t0_bs + D_2_n, 1.5 ) ) / ( 6 * C_2_n );
RetCode = OK;
return y;
}
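The junction terms accumulated into Cj in t0N all share one form: an area contribution and a sidewall contribution, each scaled by a voltage-dependent grading factor. A minimal sketch of that expression as a stand-alone helper is given below. The struct JunctionParams and the function JunctionCap are hypothetical names chosen for illustration; their fields only mirror the roles played by TECH.C_nj, TECH.C_np, TECH.PB_n, TECH.mj_n and TECH.mjsw_n in the listing, and the sketch assumes that Df is the diffusion extension and that the second width argument counts the diffusion strips.

```cpp
#include <cmath>

// Hypothetical container for the junction parameters of one device type.
struct JunctionParams {
    double Cj;    // zero-bias area capacitance per unit area
    double Cjsw;  // zero-bias sidewall capacitance per unit length
    double PB;    // junction built-in potential
    double mj;    // area grading coefficient
    double mjsw;  // sidewall grading coefficient
};

// Reverse-biased junction capacitance with the same structure as the Cj
// terms of t0N: area term plus perimeter term, each with its own grading.
double JunctionCap(const JunctionParams& t, double width, double strips,
                   double Df, double Vrev)
{
    const double area = width * Df;
    const double perimeter = 2.0 * (width + strips * Df);
    return t.Cj * area * std::pow(1.0 + Vrev / t.PB, -t.mj) +
           t.Cjsw * perimeter * std::pow(1.0 + Vrev / t.PB, -t.mjsw);
}
```

For the p-type junctions the listing passes VDD minus the node voltage as the reverse bias, which in this sketch corresponds to calling JunctionCap with Vrev set to that difference.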
t0P.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::t0P( const Circuit& circuit, unsigned int NP, unsigned int NC, double t, const double* NewWidth, int& RetCode )
{
double A_2_p, B_2_p, C_2_p, D_2_p, Vc, y, Cm1, Cov, t0_bs, Cj;
double alpha, theta, Y;
Cov = TECH.Cgd0_p * ( W_p[ 1 ] + TECH.XW_p );
Cm1 = Cov;
Cj = 0.0;
unsigned int nn = pathlist[ NP ].GetNumTranN( NC );
unsigned int pp = pathlist[ NP ].GetNumTranP( NC );
unsigned int node;
const char* name = pathlist[ NP ].TransistorName( nn + pp - 1, NC );
// first pmos
unsigned int VDDNode = circuit.ValimNode();
if ( circuit[ name ].Source() == VDDNode )
{
}
else if ( circuit[ name ].Drain() == VDDNode )
{
}
else
{
return 0.0;
}
// evaluate Cgs Cgd @ V node 1 for pmos
Cj += TECH.C_pj * Wjp * TECH.Df * pow ( 1 + ( VDD - Vd_p[ 1 ] ) / TECH.PB_p, -TECH.mj_p ) + \
TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * pow ( 1 + ( VDD - Vd_p[ 1 ] ) / TECH.PB_p, -TECH.mjsw_p );
Cj += TECH.Cgs0_p * ( Wjp + njp * TECH.XW_p );
// evaluate Cgs Cgd @ V node 1 for nmos
Cj += TECH.C_nj * Wjn * TECH.Df * pow ( 1 + Vd_p[ 1 ] / TECH.PB_n, -TECH.mj_n ) + \
TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * pow ( 1 + Vd_p[ 1 ] / TECH.PB_n, -TECH.mjsw_n );
Cj += TECH.Cgd0_n * ( Wjn + njn * TECH.XW_p );
// evaluate other capacitances
int nc;
// evaluate gate capacitances
C_2_p = ( Vc + 2 * TECH.Vtp0 ) / Vc;
D_2_p = 2 * VDD / ( Vc * taui_p[ 1 ] );
Y = Cj + Cov;
alpha = pow( ( D_2_p * t0_bs + C_2_p ), 1.5 );
theta = pow( ( D_2_p * t + C_2_p ), 1.5 );
y = ( 2 * Vc * Vc * beta_p[ 1 ] * theta ) / \
( 3 * D_2_p ) - \
B_2_p * t * t * 0.5 +
t * ( -A_2_p * taui_p[ 1 ] + Cov * VDD ) / \
( taui_p[ 1 ] ) + \
( 6 * A_2_p * D_2_p * t0_bs + 3 * B_2_p * D_2_p * t0_bs * t0_bs - \
4 * Vc * Vc * beta_p[ 1 ] * alpha ) / ( 6 * D_2_p );
RetCode = OK;
return y;
}
TestOpt.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "test.h"
///
TestOpt::TestOpt( const CritPathList& pathlist, const Options& options)
:
EvaluationAlgorithm( pathlist, options )
{
print_log( "Creating TestOpt instance..." );
}
///
TestOpt::~TestOpt()
{}
///
int TestOpt::Run( const Circuit& circuit, const double *NewWidth, const unsigned* ValidPath )
{
Calls++;
{
if (ValidPath[NP])
{
int RetCode;
CPDelay[ NP ] = 0.0;
CPPower[ NP ] = 0.0;
Area = 0.0;
for (unsigned int i = 0; i < circuit.GetNTran(); i++)
{
double x = NewWidth[i];
double f, g, h , l;
// f
f = x * x * x * x * 3.0 / 8000.0;
f += -x * x * x * 11.0 / 400.0;
f += x * x * 27.0 / 40.0;
f += -x * 27.0 / 4.0;
f += 165.0 / 4.0;
// l
l = x * x * x * x * 3.0 / 7700.0;
l += -x * x * x * 11.0 / 402.0;
l += x * x * 27.0 / 39.5;
l += -x * 27.0 / 4.0;
l += 150.0 / 4.0;
// g
g = x * x * x * x * 3.0 / 8000.0;
g += -x * x * x * 13.0 / 400.0;
g += x * x * 39.0 / 40.0;
g += -x * 45.0 / 4.0;
g += 205.0 / 4.0;
// h
h = x * x * x * x * x * x * 5.01264E-8;
h += -x * x * x * x * x * 1.60540E-5;
h += x * x * x * x * 0.001948124;
h += -x * x * x * 0.111669;
h += x * x * 3.05849;
h += -x * 34.6888;
h += 164.782;
//CPDelay[ NP ] = f;
if (f > i)
CPDelay[ NP ] = f;
else
CPDelay[ NP ] = l;
CPPower[ NP ] = NewWidth[i] * NewWidth[i] * NewWidth[i];
Area += NewWidth[i];
}
}
}
return OK;
}
BIBLIOGRAPHY
[1] N. H. E. Weste and K. Eshraghian, Principles of CMOS VLSI Design.
Addison-Wesley, 1993.
[2] J. Yuan, High speed circuit techniques for pipelining and for one
clockcycle decision. Eurochip advanced course, high speed silicon
design, Apr. 1994.
[3] W. C. Elmore, The transient response of damped linear network with
particular regard to wideband amplifiers, Journal of Applied Physics,
vol. 19, pp. 5563, Jan. 1948.
[4] R. Gupta, B. Tutuianu, and L. T. Pileggi, The elmore delay as a
bound for rc trees with generalized inputt signals, IEEE Transaction
on ComputerAided Design, vol. 16, pp. 95104, Jan. 1997.
[5] L. Brocco, S. Mccormic, and J. allen, Macromodelling cmos circuits
for timing simulation, IEEE Transaction on ComputerAided Design,
vol. 7, pp. 12371249, Dec. 1988.
[6] N. Hedenstierna and K. O. Jeppson, Cmos circuit speed and buffer
optimization, IEEE Transaction on ComputerAided Design, vol. CAD
6, pp. 270281, Mar. 1987.
[7] P. Cocchini, G. Piccinini, and M. Zamboni, A comprehensive submicron most delay model and its application to cmos buffers, IEEE
Journal of Solid State Circuits, vol. 32, Aug. 1997.
[8] T. Sakurai and A. R. Newton, Alphapower law mosfet model and
its application in inverter delay and other formulas, IEEE Journal of
Solid State Circuits, vol. 25, pp. 584594, Apr. 1990.
Bibliography
286
[9] T. Sakurai and A. R. Newton, A simple mosfet model for circuit analysis, IEEE Transactions on Electron Devices, vol. 38, pp. 887894, Apr.
1991.
[10] S. Dutta, S. S. M. Shetti, and S. L. Lusky, Comprehensive delay model
for cmos inverters, IEEE Journal of Solid State Circuits, vol. 30, pp. 864
871, Aug. 1995.
[11] L. Bisdounis, S. Nikolaidis, and O. Koufopavlou, Propagation delay
and shortcircuit power dissipation modeling of the cmos inverter,
IEEE Transaction on Circuits and Systems, vol. 45, pp. 259270, Mar.
1998.
[12] R. S. Muller and T. I. Kamins, Device electronics for integrated circuit,
second edition. John wiley & sons, 1986.
[13] D. A. Wismer and R. Chattergy, Introduction to nonlinear optimization.
System science and engineering, NorthHolland, 1978.
[14] P. L. Yu, Multiplecriteria decision making. Mathematical concepts and
methods in science and engineering, Plenum Press New York and
London, 1985.
[15] M. J. D. Powell, An eeficient method for finding the minimum of a
function of several variables without calculating derivatives, Computer Journal, no. 7, pp. 152162, 1964.
[16] J. Yuan and C. Svensson, Cmos circuit speed optimization based
on switch level simulation, in Proceedings of IEEE International Symposium on Circuits and Systems, pp. 21092112, 1988.
[17] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, 1992.
[18] M. Graziano, M. Delaurenti, G. Masera, G. Piccinini, and M. Zamboni,
Noise safety design methodologies, in Proceedings of IEEE International Simposium on Quality of Electronics Design (ISQED2000), IEEE,
Mar. 2000.
Bibliography
287
[19] J. T. Kong, S. Z. Hussain, and D. Overhauser, Performance estimation of complex mos gates, IEEE Transaction on Circuits and Systems,
vol. 44, pp. 785795, Sept. 1997.
[20] S. Devadas and S. Malik, Survey of optimization techniques targeting
low power vlsi circuit, in Proceedings of Conference on Design Automation (DAC), 1995.
[21] B. Davari, R. H. Dennard, and G. G. Shahidi, Cmos scaling for high
performance and low powerthe next ten years, Proceedings of the
IEEE, vol. 83, Apr. 1995.
[22] S. S. Sapatnekar, V. B. Rao, P. M. Vaidya, and S. M. Kang, An exact
solution to the transistor sizing problem for cmos circuits using convex
approximation, IEEE Transaction on ComputerAided Design, vol. 12,
pp. 16211634, Nov. 1993.
[23] O. Coudert, Gate sizing for constrained delay/power/area optimization, IEEE Transaction on Very Large Scale Integration Systems, vol. 5,
pp. 465472, Dec. 1997.
[24] C. Chen, C. C. N. Chu, and D. F. Wong, Fast and exact simultaneous
gate and wire sizing by lagrangian relaxation, in Proceedings of Conference on Design Automation (DAC), pp. 617624, 1998.
[25] A. R. Conn, R. A. Haring, and C. Visweswariath, Noise considerations in circuit optimization, in Proceedings of IEEE/ACM International
conference on Computer Aided Design, pp. 220227, 1998.
[26] M. Delaurenti, G. Masera, G. Piccinini, M. R. Roch, and M. Zamboni,
Cmos power-delay model for cad optimization tools, in Proceedings
of IEEE A.Volta Workshop on Low-Power Design (VOLTA99), IEEE, Mar.
1999.
[27] G. Masera, G. Piccinini, M. R. Roch, and M. Zamboni, Isis: a cad tool
for high speed vlsi design, in Proceedings of CSA, (Irbid, Jordan), Mar.
1998.
[28] M. Graziano, G. Masera, G. Piccinini, M. R. Roch, and M. Zamboni,
Noise-tolerance analysis for high speed cmos circuits, in Proceedings
of ICM, (Monastir, Tunisia), Dec. 1998.
Bibliography
288
[29] M. Graziano, G. Masera, G. Piccinini, M. R. Roch, and M. Zamboni,

A statistical noise-tolerance analysis and test structure for logic families, in Proceedings of ICMTS, (Goteborg, Sweden), Mar. 1999.
[30] S. Eliantonio, Studio di algoritmi di ottimizzazione velocit`aarea per
strutture cmos, tesi di laurea, Politecnico di Torino, Dipartimento di
Elettronica, Mar. 1999.
[31] D. Zhou and X. Y. Liu, On the optimal drivers of highspeed low
power ics, International journal of High Speed Electronics and Systems,
vol. 7, no. 2, pp. 287303, 1996.
[32] V. Tiwari, D. Singh, S. Rajgopal, G. Mehta, R. Patel, and F. Baez, Reducing power in highperformance microprocessors, in Proceedings
of Conference on Design Automation (DAC), pp. 732737, 1998.
[33] H. Liao and W. W. Dai, A new cmos driver model for transient analysis and power dissipation, International journal of High Speed Electronics and Systems, vol. 7, no. 2, pp. 269285, 1996.
[34] G. Yeap and A. Wild, Introduction to lowpower vlsi design, International journal of High Speed Electronics and Systems, vol. 7, no. 2,
pp. 223248, 1996.
[35] A. Hirata, H. Onodera, and K. Tamaru, Proposal of a timing model
for cmos logic gates driving a crc load, in Proceedings of IEEE/ACM
International conference on Computer Aided Design, pp. 537544, 1998.
[36] A. R. Conn, P. K. Coulman, R. A. Haring, G. L. Morril, and
C. Visweswariath, Optimization of custom mos circuits by transistor
sizing, in Proceedings of IEEE/ACM International conference on Computer Aided Design, 1996.
[37] P. Larsson-Edefors, Technology mapping onto veryhighspeed
standard cmos hardware, IEEE Transaction on ComputerAided Design,
vol. 15, pp. 11371144, Sept. 1996.
[38] A. Wolfe, Oppurtunities and obstacles in lowpower systemlevel
cad, in Proceedings of Conference on Design Automation (DAC), 1996.
Bibliography
289
[39] C. S. D. Liu, Power consumption estimation in cmos vlsi chips, IEEE

Journal of Solid State Circuits, vol. 29, pp. 663670, June 1994.
[40] J. Cong and L. He, An efficient approach to simultaneous transistor
and interconnect sizing, in Proceedings of IEEE/ACM International conference on Computer Aided Design, 1996.
[41] D. Liu and C. Svensson, Impact of supply voltage on power consumption, speed and reliability of cmos circuits, in Proceedings of
internationa workshop on Power and Timing Modeling, Optimization and
Simulation (PATMOS), 1994.
[42] L. T. Wurtz, An efficient scaling procedure for domino cmos logic,
IEEE Journal of Solid State Circuits, vol. 28, pp. 979982, Sept. 1993.
[43] J. Yuan, Ultimate cmos speeds and device sizing. Eurochip advanced course, high speed silicon design, Apr. 1994.
[44] D. Chen and M. Sarrafzadeh, An exact algorithm for low power
libraryspecific gate resizing, in Proceedings of Conference on Design
Automation (DAC), 1996.
[45] M. Borah, R. M. Owens, and M. J. Irwin, Transistor sizing for low
power cmos circuits, IEEE Transaction on ComputerAided Design,
1996.
[46] B. Basaran and R. A. Rutenbar, An o(n) algorithm for transistor
stacking with performance constraints, in Proceedings of Conference on
Design Automation (DAC), 1996.
[47] M. R. C. M. Berkelaar, P. H. W. Buurman, and J. A. G. Jess, Computing
the entire active area/power consumption versus delay tradeoff curve
for gate sizing with piecewise linear simulator, IEEE Transaction on
ComputerAided Design, vol. 15, pp. 14241434, Nov. 1996.
[48] M. R. C. M. Berkelaar, P. H. W. Buurman, and J. A. G. Jess, Computing
the entire active area/power consumption versus delay tradeoff curve
for gate sizing with piecewise linear simulator, IEEE Transaction on
ComputerAided Design, vol. 15, pp. 14241434, Nov. 1996.
Bibliography
290
[49] S. Mehrotra, P. Franzon, and W. Liu, Global optimization approach to

transistor sizing for high performance cmos vlsi circuits, Tech. Rep.
NCSUVLSI 9310, North Carolina State University, Department of
Electrical and Computer Engineering, Nov. 1993.
[50] S. S.-S. Chung, A charge-based capacitance model of short-channel
MOSFETs, IEEE Transactions on Computer-Aided Design, vol. 8, pp. 1-7,
Jan. 1989.
[51] O. Coudert, R. Haddad, and S. Manne, New algorithms for gate sizing: a comparative study, in Proceedings of Conference on Design Automation (DAC), 1996.
[52] A. Bogliolo, L. Benini, and B. Riccò, Power estimation of cell-based
CMOS circuits, in Proceedings of Conference on Design Automation (DAC),
1996.
[53] D. Sylvester and K. Keutzer, Getting to the bottom of deep submicron, in Proceedings of IEEE/ACM International conference on Computer Aided Design, 1998.
