
Design and optimization techniques of high-speed VLSI circuits

Marco Delaurenti

Politecnico di Torino

PhD Dissertation

December 1999

Politecnico di Torino
Advisor: Prof. Maurizio Zamboni
Coordinator: Prof. Ivo Montrosset

Copyright © 1999 Marco Delaurenti

Writing comes more easily if you have something to say.
(Sholem Asch)

"When I use a word," Humpty Dumpty said in rather a scornful tone, "it means just what I choose it to mean -- neither more nor less."
(Lewis Carroll)

Acknowledgments
First of all I would like to thank my advisor, Prof. M. Zamboni, as well as Prof. G. Piccinini and Prof. G. Masera, for their invaluable help, and Prof. P. Civera for being a bridge toward the real world. Many thanks also to the VLSI LAB members at Politecnico of Turin, Italy: Mario for his input about the critical paths (no, I do not thank you for the jazz songs that you play all day long), Luca for the long discussions about books and movies (no, I haven't seen Kubrick's last movie), Andrea for his very good cocktails (especially the Negroni) and Danilo, because I forgot him every time we went to lunch. Thanks also to Max (for he gave me the root password), and to Yuan & Svensson for the invention of the TSPC.
Special thanks, finally, to Mg, for her support and for having tolerated me until now.

CONTENTS

Preface

Part I  CMOS Logic

1. Introduction to CMOS logic
   1.1 Introduction
   1.2 CMOS logic families
       1.2.1 Static logic families
       1.2.2 Dynamic logic families
   1.3 Conclusion

Part II  Circuit Modeling

2. A simple model
   2.1 The Elmore model
   2.2 Conclusions

3. A complex model
   3.1 The FAST model
       3.1.1 MOS equations
       3.1.2 Internal nodes approximation
       3.1.3 Body effect
   3.2 Delay estimation
       3.2.1 Equation solving
   3.3 Power estimation
       3.3.1 Switching energy
       3.3.2 Short-circuit energy
       3.3.3 Subthreshold energy
   3.4 Results
   3.5 Conclusions

Part III  Optimization

4. Mathematic Optimization
   4.1 Optimization theory
       4.1.1 Mono-objective optimization
             4.1.1.1 Unconstrained problem
             4.1.1.2 Constrained problem
                     Lagrange multiplier and Penalty functions
       4.1.2 Multi-objective optimization
             4.1.2.1 Unconstrained
             4.1.2.2 Constrained
                     Compromise solution
   4.2 Optimization Algorithms
       4.2.1 One-dimensional search techniques
             4.2.1.1 The section search
                     Dichotomic search
                     Fibonacci search
                     The golden section search
                     Convergence considerations
             4.2.1.2 Parabolic interpolation
                     The Brent rule
       4.2.2 Multi-dimensional search
             4.2.2.1 The gradient direction: steepest (maximum) descent
             4.2.2.2 The optimal gradient
                     Convergence considerations
       4.2.3 The conjugate direction method
             4.2.3.1 The Fletcher-Reeves conjugate gradient algorithm
             4.2.3.2 The Powell conjugate gradient algorithm
       4.2.4 The SLOP algorithm
       4.2.5 The simulated-annealing algorithm
   4.3 Conclusions

5. Circuit Optimization
   5.1 Optimization targets
       5.1.1 Circuit delay
                     Critical Paths
             5.1.1.1 Delay formula obtained by the Elmore model
             5.1.1.2 Delay measurement obtained by the FAST model and by HSPICE
       5.1.2 Power consumption
       5.1.3 Area
   5.2 Optimization examples
       5.2.1 Algorithm choice
       5.2.2 Mono-objective optimizations
             5.2.2.1 Area
             5.2.2.2 Power
             5.2.2.3 Delay
       5.2.3 Multi-objective optimizations
   5.3 Conclusion

6. A CAD tool for optimization
   6.1 Logical description
       6.1.1 The optimization algorithm module (OAM)
       6.1.2 The function evaluation module (FEM)
       6.1.3 Core engine
   6.2 Code implementation
       6.2.1 The classes CircuitNetlist and Circuit
       6.2.2 The class EvaluationAlgorithm
       6.2.3 The class OptimizationAlgorithm
       6.2.4 The critical path retrieving
       6.2.5 The derived classes
   6.3 Program flows
   6.4 Conclusions

7. Results and conclusions
   7.1 Optimization
       7.1.1 Mono-objective vs. Multi-objective
   7.2 Conclusions
   7.3 Future works

Appendix

A. Class graph

B. Source code
   B.1 Main functions
   B.2 Optimization algorithms
   B.3 Simulators

LIST OF FIGURES

1.1  Static and
1.2  Pass-transistor logic xor
1.3  Domino typical gate
1.4  CVSL typical gate
1.5  C²MOS typical gate
1.6  TSPC Latches
2.1  RC MOS equivalence
2.2  RC chain
2.3  RC single cell
2.4  Elmore impulse response
3.1  Inverter voltages waveform
3.2  MOS chain
3.3  Node voltages
3.4  Voltage waveforms in the nMOS chain
3.5  Voltage waveforms in the pMOS chain
3.6  VDS and VGS
3.7  MOSFET chain with static voltages
3.8  Threshold variation
3.9  Delay comparison
3.10 Energy comparison
4.1  Section search
4.2  Minimization by Powell algorithm
4.3  Minimization by Powell algorithm
4.4  Minimization by SLOP algorithm
4.5  Minimization by Simulated-annealing algorithm
4.6  Minimization by Simulated-annealing algorithm
5.1  Design flow
5.2  Delay definition
5.3  Critical paths
5.4  Critical path tree
5.5  Elmore delay
5.6  Elmore delay
5.7  HSPICE delay
5.8  FAST delay
5.9  HSPICE Energy
5.10 CMOS Inverter
5.11 TSPC Latches
5.12 TSPC And gates
5.13 TSPC Or gates
5.14 Static and-or gate
5.15 Static parity gate
5.16 Static full-adder
5.17 TSPC full-adder (one-stage)
6.1  Tool block diagram
7.1  Comparison of 0.7 µm and 0.25 µm gates @ minimum technology width
7.2  Delay optimization of 0.7 µm gates
7.3  Delay optimization of 0.25 µm gates
7.4  Technology comparison of delay optimization
7.5  Several delay-power optimization policies of 0.7 µm gates
7.6  Energy-dissipation variation (zoom of figure 7.5(b))
7.7  Several delay-power optimization policies of 0.25 µm gates
7.8  Energy-dissipation variation (zoom of figure 7.7(b))
7.9  Delay-power optimization (50%-50%) comparison of 0.7 µm and 0.25 µm gates
7.10 Delay and power trajectory during 4 different multi-objective optimizations for the and-or gate
7.11 Delay and power trajectory during 4 different multi-objective optimizations for the parity gate
7.12 Delay and power trajectory during 4 different multi-objective optimizations for the static full-adder
7.13 Delay and power trajectory during 4 different multi-objective optimizations for the dynamic full-adder

LIST OF TABLES

3.1  Mean Error
3.2  Execution time
4.1  Optimization algorithms
5.1  Basic gates: complexity
5.2  Basic gates: pre-optimization delay, power consumption and area
5.3  Full-adder: delay optimization
5.4  Agreements of targets
5.5  Full-adder: delay and power optimization
5.6  Full-adder: optimizations comparison
7.1  Library gates list
7.2  Delay and energy dissipation @ minimum width (HSPICE)
7.3  Delay decreasing and energy increasing (both relative) in a delay optimization
7.4  Elapsed time and total number of function evaluations for a full-delay optimization with HSPICE on a ULTRA-sparc 5
7.5  Constrained delay optimization of a few 0.25 µm gates
7.6  Delay worsening and energy improvement between a full delay optimization and delay-power optimization

Preface
The design of high-speed integrated circuits is a long and complex operation; nonetheless, the total time-to-market required to go from the idea to the silicon masks keeps shrinking.
Several CAD tools are available to help the designer along this long and winding road. In the first step the only thing that exists is the description of the circuit behaviour (the idea); in the central steps of the design flow the designer knows only the logic function of each block composing the circuit, but ignores how these blocks will be realized in a given technology; in the last steps, finally, the designer knows exactly the technology implementation of every single gate of the circuit, and can compose the final layout from these gates. Ça va sans dire that CAD tools are nowadays of vital importance in the design flow; moreover, the quality of such tools strongly influences the quality of the final design.
Among all the possible instruments, the optimization tools have a primary role in all the phases of a project, from the optimization at the higher levels down to the optimization at the electrical level.
This thesis focuses on developing new strategies and new techniques for optimization at the transistor-dimension level, that is, the optimization done by the cell-library engineer, and on developing a CAD instrument that makes this work as painless as possible.


Part I
CMOS LOGIC

Chapter 1

INTRODUCTION TO CMOS LOGIC


The optimization of VLSI circuits involves the optimization of the single CMOS cell. This chapter briefly reports the basic CMOS logic families, with their pros and cons. The goal is simply to pick, among the static and dynamic logic families, the ones most appealing for use in VLSI circuits and, to some extent, the ones most used in practice, and then to apply to them the optimization techniques shown in the next chapters.

1.1 Introduction

We might ask: why optimize a single cell in a VLSI circuit, when design nowadays is shifting toward higher and higher levels?
Some answers could be:

Need for re-usable library cells. An optimized library is easier to reuse across different projects. This is a must nowadays, in order to reduce the total time to target/market.

An optimized library makes the design at higher levels easier: floorplanning and routing can have relaxed constraints, since the gates behave better. It also reduces the time spent repeating critical steps such as floorplanning or routing until all the specifications are met: the specifications are met earlier, since the cells globally behave better.

Need for equivalent libraries with different kinds of optimization. It is possible to have different libraries that have different specifications but are functionally equivalent, so that different versions of a project can be created simply by substituting the basic library. It would be possible, for example, to have, for the same project, a version that runs at full speed and a version optimized for low power dissipation.
This swapping of libraries does not involve the higher levels of design, for it is totally transparent to the designer during floorplanning or routing. Just before the layout production, during the cell mapping, it is possible to choose the library onto which the project will be mapped.
These answers led us to consider producing a tool able to optimize a cell library in a way that suits the designer. The goal is to produce results showing that this optimization is worthwhile during a design cycle, and also to make the insertion of the tool into a design cycle as smooth as possible.
In order to obtain results that relate to a real production cycle, we have to choose cells that are almost always present in a real library.
For this purpose we introduce a very brief description of the most used CMOS logic families, and among them we choose the cells used to develop and test the optimization framework.

1.2 CMOS logic families

The first basic distinction inside the CMOS logic families is between static logic and dynamic logic ([1]).
Static logic: Static logic is a logic in which the functioning of the circuit is not synchronized by a global signal, namely the clock of the circuit. The output is solely a function of the inputs of the circuit, and it is asynchronous with respect to them. The timing of the circuit is defined exclusively by its internal delay.
Dynamic logic: Dynamic logic is a logic in which the output is synchronized by a global signal, viz. the clock. The output is, then, a function both of the inputs of the circuit and of the clock signal, and the timing of the circuit is defined both by its internal delay and by the timing of the clock.
Both static and dynamic logic comprise several logic families.

1.2.1 Static logic families

The principal static families are:

Conventional static logic: It is the logic normally meant when speaking of static logic. A static circuit has the same number of NMOS and PMOS transistors, and the n and p branches are each the dual of the other. As an example see figure 1.1, which represents a static and gate. It has two NMOS transistors connected in series and two PMOS transistors connected in parallel.

[Fig. 1.1: Static and (inputs A and B, OUT = A and B)]

The static logic is quite fast, does not dissipate power in steady state and has a very good noise margin.
Pseudo-NMOS: It is an evolution of the now superseded NMOS logic. It is obtained by substituting the whole PMOS branch of a static gate with a single PMOS transistor whose gate is connected to ground. This PMOS is therefore always conducting and pulls the output node to the high state. When the NMOS branch also conducts, the output discharges, provided that the ratio between the NMOS and PMOS transistors is well designed.
This logic is cited here only for historical reasons, since it is not very fast, it dissipates static power in steady state (when the output is in the low state) and it is sensitive to noise.
Pass-logic: Pass-logic is a relatively new logic and, for many digital designs, implementation in pass-transistor logic (PTL) has been shown to be superior to static CMOS in terms of area, timing, and power characteristics. As an example see figure 1.2.

[Fig. 1.2: Pass-transistor logic xor (OUT = A xor B)]
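The duality of the n and p branches described for conventional static logic can be checked with a small truth-table sketch. It is only an illustration under stated assumptions: the function names are hypothetical, transistors are treated as ideal switches, and, since a series-NMOS/parallel-PMOS stage is inverting by itself, the non-inverted and output is assumed here to come from an additional static inverter.

```python
from itertools import product


def pull_down_conducts(a: bool, b: bool) -> bool:
    """Two NMOS in series: the path to ground conducts only if both inputs are high."""
    return a and b


def pull_up_conducts(a: bool, b: bool) -> bool:
    """Two PMOS in parallel: the path to the supply conducts if at least one input is low."""
    return (not a) or (not b)


def static_stage_output(a: bool, b: bool) -> bool:
    """Logic value of the node shared by the two branches (ideal switches assumed)."""
    up = pull_up_conducts(a, b)
    down = pull_down_conducts(a, b)
    if up == down:
        # Dual branches never conduct together and never float together.
        raise RuntimeError("branches are not dual")
    return up


def static_and(a: bool, b: bool) -> bool:
    """Assumption: an output inverter turns the inverting stage into an and gate."""
    return not static_stage_output(a, b)


if __name__ == "__main__":
    for a, b in product((False, True), repeat=2):
        print(a, b, static_and(a, b))
```

Because exactly one branch conducts for every input combination, there is never a direct path between supply and ground in steady state, which is the reason no static power is dissipated.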

1.2.2 Dynamic logic families


The principal dynamic families have a characteristic in common: every dynamic logic needs a pre-charge (or pre-discharge) transistor to bring some pre-charged nodes to a known state. This is done during the working phase known as the pre-charge phase or memory phase; during the other working phase, the evaluation phase, the output has a stable value (see footnote 1).

Footnote 1: This brief introduction is limited to systems that have a single global clock, or one phase, where the word phase is used as a synonym of clock, and not, as above, as a synonym of working period. There are systems that have two, or even four phases, but they are not introduced here. The basic functioning, however, remains the same.

The principal dynamic logics are further divided into two sub-families, pipelined and non-pipelined. The first two of those listed here are non-pipelined, while the others are pipelined:

Domino logic and N-P Domino logic: The typical domino gate is depicted in figure 1.3.

[Fig. 1.3: Domino typical gate (INPUTs driving an NMOS block, CLOCK, OUT)]

During the pre-charge phase the clock is in its low state, so that the pre-charged node in front of the static inverter is high and the output is low. During the evaluation phase the clock is high, so that the inputs of the n-block (which can perform any logic function) can discharge the pre-charged node and bring the output to the high state.
We can cascade several of these gates, given that each gate has its own output inverter, and we can drive every gate with the same clock signal, provided that the evaluation phase lasts long enough for all the gates to finish evaluating their inputs. This last fact explains why this is a non-pipelined logic: the output of every cell is available as soon as the cell has finished its evaluation phase.
Moreover this logic has a limited area occupancy, since it has a low number of PMOS transistors. On the other hand it is not possible to implement inverting structures and, like all the other dynamic logics, this logic is subject to the charge-sharing problem (see footnote 2).

Footnote 2: The charge-sharing problem, or charge redistribution, is a problem that affects the dynamic logics. Basically, the charge stored in a pre-charged node during the memory phase does not remain fully stored in it. Consider a domino gate during the pre-charge phase, when the clock is low. If one input of the n-block is high, then its corresponding transistor is conducting. The n-branch as a whole is still not conducting, since the clocked NMOS transistor is off, but some charge can flow from the pre-charged node to other nodes through the conducting transistors of the n-block. This redistribution of charge is simply a capacitive charge partition, and it leaves the pre-charged node at a voltage lower than the high state. This problem can produce logic errors, and it certainly reduces the noise margins of the cell.

A natural evolution of the domino logic is the N-P domino logic, or zipper logic. It consists of two typical cells: the one depicted in figure 1.3, and its dual, obtained by swapping the n-block with a p-block and the PMOS pre-charge transistor with an NMOS pre-discharge transistor driven by the negated clock.
This logic has a lower area occupancy, since there is no need for a static inverter, but it also has a lower speed, because of the PMOS transistors.

Cascode voltage switch logic (CVSL): The CVSL is part of the large family of differential logics. It needs both the inputs and the negated inputs, and two complementary n-blocks that perform the logic function, as can be seen in figure 1.4.

[Fig. 1.4: CVSL typical gate (INPUTs and negated INPUTs, OUT and negated OUT)]

It has the advantage of being quite fast, since the positive feedback of the two PMOS transistors accelerates the switching of the gate, and it also has very good noise margins. Moreover it produces both the outputs and the negated outputs without needing an inverter. As a drawback, it has a large area occupancy.
C²MOS logic: The typical C²MOS gate is shown in figure 1.5. It is basically a three-state gate, since when the clock is in the low state the output floats in the high-impedance state.

[Fig. 1.5: C²MOS typical gate (INPUTs driving a PMOS block and an NMOS block, CLOCK, OUT)]

It is principally used as a dynamic latch, as an interface between static logics and dynamic pipelined logics.
NO RAce logic (NORA): The NORA logic, an acronym of no race, is an evolution of the N-P domino logic. The static inverter of the domino logic is substituted with a C²MOS inverter. This is the first of the pipelined logics, since the output of every gate is available only when the clock switches its state, and not before.
Since the output stage of every cell is also dynamic (a C²MOS inverter), this logic is more subject to the charge-sharing problem than the domino logic is.

True Single Phase Clock logic (TSPC): The final evolution of the NORA is the TSPC logic, or true single phase clock logic ([2]).
The TSPC logic is an n-p logic, since an n-version and a p-version exist for each gate. For example the n-latch and the p-latch are shown in figure 1.6.

[Fig. 1.6: TSPC Latches; (a) Type n, (b) Type p]

The ultimate advantage of the TSPC logic is the presence of a single clock, since its internal structure does not need the negated clock.
The TSPC logic is among the fastest dynamic families, and it is certainly very appealing for the very low number of transistors it employs.
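The charge-sharing problem mentioned for the dynamic families (a capacitive charge partition between the pre-charged node and an internal node of the n-block) can be quantified with a minimal sketch. It assumes an ideal switch between the two nodes and an internal node that starts discharged; the function name and the capacitance and supply values are hypothetical and chosen only for illustration.

```python
def voltage_after_charge_sharing(c_precharged, c_internal, v_dd, v_internal=0.0):
    """Final voltage of two capacitors once they are connected (charge conservation).

    c_precharged: capacitance of the pre-charged node, initially at v_dd
    c_internal:   capacitance of the internal node, initially at v_internal
    Assumption: the conducting transistor behaves as an ideal switch.
    """
    if c_precharged <= 0 or c_internal < 0:
        raise ValueError("capacitances must be positive")
    total_charge = c_precharged * v_dd + c_internal * v_internal
    return total_charge / (c_precharged + c_internal)


if __name__ == "__main__":
    # Hypothetical values: 20 fF pre-charged node, 5 fF internal node, 2.5 V supply.
    v_final = voltage_after_charge_sharing(20e-15, 5e-15, 2.5)
    print(f"pre-charged node drops to {v_final:.2f} V")
```

The larger the internal capacitance with respect to the pre-charged one, the larger the voltage drop, and the smaller the noise margin left on the pre-charged node.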

1.3 Conclusion
After this very brief introduction to several CMOS families, we chose two different logics to which to apply the optimization techniques that are the object of this thesis. The criteria that drove the choice of these families were both their diffusion in VLSI circuits and the presence of very good qualities, perhaps not yet fully exploited in the real production of circuits.
For these reasons we have chosen to include in our library a few static gates (an and gate, an or gate, and a few more) and a few dynamic gates, in particular gates from the TSPC family. This family has shown good characteristics in terms of speed, area occupancy and power dissipation; it also has the very important feature of needing only a single clock.
The complete list of the gates comprising the library can be found in table 7.1 (page 122), with the schematic diagram of their CMOS implementation.

Part II
CIRCUIT MODELING

Chapter 2

A SIMPLE MODEL

The first model applied to the calculation of the delay in MOS circuits is the Elmore model ([3]). It is a simple RC delay model, and it is the basis of a switch-level MOS model (figure 2.1): the generic MOS is represented, during the ON state, by its dynamic resistance between the drain pin and the source pin, and by the parasitic capacitances and resistances at the drain and source pins.

[Fig. 2.1: RC MOS equivalence (terminals D, G, S of the MOS in the ON state; resistances Rg, Rd, R0, RL; capacitances CG, CD, CS, CL)]

If this simple MOS model is valid, then the Elmore delay formula can be used in every structure containing MOS transistors. The Elmore formula is appealing for its simplicity and its ease of use; however, the accuracy of the formula can worsen in the deep-submicron domain, since modeling a MOS through its resistance is no longer valid.
Since the use of the Elmore model is mostly limited to comparisons with other models, or to introductions to delay modeling, section 2.1 presents only the very basics of the Elmore model and section 2.2 draws the conclusions about the use of this model for VLSI circuits.

2.1 The Elmore model


The Elmore model, or Elmore delay formula, can predict the delay of an RC chain such as the one shown in figure 2.2.

[Fig. 2.2: RC chain (input V0; series resistances Ri-1, Ri, Ri+1; node voltages Vi-1, Vi, Vi+1; node capacitances Ci-1, Ci, Ci+1)]

In order to obtain the formula, let us start with a single RC cell, as shown in figure 2.3. We can express the voltage $V_1(t)$ by means of a differential equation such as:

$$C_0 \frac{dV_1}{dt} = -\frac{V_1(t) - V_0(t)}{R_0} \tag{2.1}$$

Integrating equation (2.1), we can write

$$V_1 = V_0(t)\left(1 - e^{-\frac{t}{R_0 C_0}}\right)$$

The time constant is $\tau = R_0 C_0$, and with $t = \tau$ we obtain:

[Fig. 2.3: RC single cell (R0 between V0 and V1, C0 at node V1)]

$$V_1 = 0.63\,V_0(t).$$

So the time $t_D = \tau$ represents the 63% delay from $V_0(t)$ to $V_1(t)$. Extending the formula of the time constant to the chain of figure 2.2, with $N$ cells, we obtain:

$$t_D = \sum_{i=0}^{N-1} \left( \sum_{j=0}^{i} R_j \right) C_i .$$
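The chain formula can be sketched in a few lines of code: every capacitance is multiplied by the total resistance between the input and its node. This is only an illustrative sketch; the function names and the component values are hypothetical, and the single-cell helper simply evaluates the step response written above to recover the 63% figure.

```python
import math


def elmore_delay(resistances, capacitances):
    """Input-output Elmore delay of an RC chain: sum_i (sum_{j<=i} R_j) * C_i."""
    if len(resistances) != len(capacitances):
        raise ValueError("one capacitance is expected for every resistance")
    delay = 0.0
    upstream_resistance = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_resistance += r          # resistance between the input and node i
        delay += upstream_resistance * c  # contribution of the capacitance of node i
    return delay


def single_cell_step_response(t, r0, c0, v0=1.0):
    """V1(t) of the single RC cell driven by a step of amplitude v0."""
    return v0 * (1.0 - math.exp(-t / (r0 * c0)))


if __name__ == "__main__":
    # Hypothetical two-cell chain: 1 kOhm and 2 kOhm, 1 pF per node.
    print(elmore_delay([1e3, 2e3], [1e-12, 1e-12]))
    # Single cell evaluated at t = tau: about 63% of the input step.
    print(single_cell_step_response(1e-9, 1e3, 1e-12))
```

For a single cell the function returns the time constant R0 C0, so the chain formula reduces to the single-cell case as expected.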

This delay is the input-output delay. When there is the need to know the delay between the input and one of the inner nodes, a more complex (semi-empirical) formula can be used; for example, with $N = 2$:

$$t_1 = R_0 C_0 + q R_1 C_1 \qquad \text{(delay from the input node to the first node)}$$
$$t_2 = R_0 C_0 + (R_0 + R_1) C_1 \qquad \text{(delay from the input node to the output node)}$$

where $q$ is:

$$q = \begin{cases} \dfrac{R_0}{R_0 + R_1} & \text{if } R_1 \le 2R_0, \\[1ex] \dfrac{R_0 C_0}{R_0 C_0 + R_1 C_1} & \text{if } R_1 > 2R_0. \end{cases}$$

The first case (with $R_1 \le 2R_0$) is named strong coupling, while the second one is named weak coupling.
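The two-cell case can be written down directly from the expressions above. The sketch below assumes that the boundary between the two regimes is R1 = 2 R0, with the equality assigned to strong coupling; function names and numerical values are hypothetical.

```python
def coupling_factor(r0, c0, r1, c1):
    """Semi-empirical factor q of the two-cell chain."""
    if r1 <= 2.0 * r0:
        # Strong coupling: q depends only on the resistance ratio.
        return r0 / (r0 + r1)
    # Weak coupling: q depends on the ratio of the two time constants.
    return (r0 * c0) / (r0 * c0 + r1 * c1)


def two_cell_delays(r0, c0, r1, c1):
    """Return (t1, t2): delay to the first (inner) node and to the output node."""
    q = coupling_factor(r0, c0, r1, c1)
    t1 = r0 * c0 + q * r1 * c1
    t2 = r0 * c0 + (r0 + r1) * c1
    return t1, t2


if __name__ == "__main__":
    # Hypothetical values: strong coupling (R1 = R0) and weak coupling (R1 = 4 R0).
    print(two_cell_delays(1e3, 1e-12, 1e3, 1e-12))
    print(two_cell_delays(1e3, 1e-12, 4e3, 1e-12))
```

In both regimes q is smaller than one, so the inner-node delay t1 stays below the input-output delay t2.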

Given the unit impulse response h(t) (figure 2.4) of the output node of the RC tree, Elmore proposed to approximate the delay by the mean of h(t), considering h(t) as a distribution. The 50% delay $t_{50\%}$ is the time that satisfies:

[Fig. 2.4: Elmore impulse response (h(t) versus t, with the mean m marked on the time axis)]

$$\int_0^{t_{50\%}} h(t)\,dt = 0.5$$

while the original work of Elmore proposed:

$$t_D = m = \int_0^{\infty} t\,h(t)\,dt \qquad \text{with} \qquad \int_0^{\infty} h(t)\,dt = 1.$$


This approximation is valid only when h(t) is a symmetrical distribution, as in figure 2.4, while in real cases the h(t) distribution is asymmetrical; however, it is proved in [4] that the Elmore approximation is an upper bound for the 50% delay even when the impulse response is not symmetrical and, furthermore, that the real delay asymptotically approaches the Elmore bound as the input signal rise (or fall) time increases.
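The difference between the mean and the 50% point of an asymmetrical impulse response can be seen on the single RC cell, whose impulse response is the derivative of the step response 1 - exp(-t/tau). The sketch below is only an illustration: it integrates t h(t) numerically with the trapezoidal rule, and the function names, the truncation of the integral at 40 tau and the step count are all assumptions made here.

```python
import math


def impulse_response(t, tau):
    """Unit impulse response of a single RC cell with time constant tau."""
    return math.exp(-t / tau) / tau


def elmore_mean(tau, t_end_factor=40.0, steps=100_000):
    """Trapezoidal estimate of the mean of h(t), i.e. the Elmore delay."""
    dt = t_end_factor * tau / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * dt
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoidal end-point weights
        total += weight * t * impulse_response(t, tau) * dt
    return total


def fifty_percent_delay(tau):
    """Closed form for the single cell: 1 - exp(-t / tau) = 0.5."""
    return tau * math.log(2.0)


if __name__ == "__main__":
    tau = 1e-9  # hypothetical time constant
    print(elmore_mean(tau), fifty_percent_delay(tau))
```

For this cell the mean equals tau while the 50% delay is tau ln 2, about 0.69 tau, so the Elmore value lies above the 50% delay, in agreement with the upper-bound property.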

2.2 Conclusions

The model shown in this chapter is quite appealing for the calculation of the delay in CMOS structures, but it becomes inaccurate as we move into the submicron domain, so its use should be limited to a first validation of an optimization algorithm, and not extended to real production.
In this regard, it is important to note that the delay functions obtained by the Elmore formula satisfy some properties that are useful in the optimization realm (for example equation (4.1), page 50): the Elmore model is therefore very useful for testing optimization algorithms.

Chapter 3

A COMPLEX MODEL

The target of the model developed here is to offer limited estimation errors with respect to physical SPICE simulations while improving the computation speed by more than one order of magnitude. This could be useful in optimization algorithms.
Thus the aim of the model is to evaluate the delay and power dissipation of CMOS structures.

Several approaches have been used to evaluate the delays of CMOS structures: some models are derived from SPICE simulations by means of look-up tables [5]; some are analytical [6], while others approximate the evaluation of the delay with step or ramp inputs [7, 8, 9, 10, 11].

Regarding the power consumption, the main contributions are: switching power, short-circuit current and subthreshold conduction. The first one occurs during the charge and discharge of internal capacitances; short-circuit current originates from the simultaneous conduction of the p and n networks and is dominated by the slope of the node voltages; subthreshold currents are due to the weak-inversion conduction of MOSFETs and become relevant when the power supply is scaled in sub-micron technologies.

Most of the proposed power models use estimation algorithms not compatible with the delay analysis. The purpose of the FAST model is to combine delay and power evaluations in the same estimation procedure, allowing the simultaneous optimization of delay and power.


Section 3.1 reports the theory behind the FAST model; in particular, 3.1.1 shows the MOS equations used in the model, 3.1.2 shows the approximation of the internal node voltages made by the model, and 3.1.3 explains how the threshold voltage variations are taken into account. Section 3.2 shows how the FAST model estimates the delay, and in particular 3.2.1 shows how the equations are solved, while section 3.3 reports the method used for the calculation of the power consumption: 3.3.1 accounts for the switching power, 3.3.2 for the short-circuit power, and 3.3.3 for the subthreshold power.

Finally, section 3.4 presents some results from the comparison of the model with HSPICE, and section 3.5 draws some conclusions.

3.1 The FAST model


Taking into account the carrier velocity saturation, which is dominant in submicron technologies, gives good accuracy at low complexity; this suggested the use of the classical charge control analysis and of the gradual-channel approximation (Hodges model), described in 3.1.1.
Estimation accuracy and low computational effort can be achieved by operating both on the waveforms of the internal signals and on topological considerations: in particular, all the waveforms in the circuit are approximated with linear ramps.
Approximating the input waveform with a ramp strongly simplifies the I(V) equations. Figure 3.1 shows the output voltage of an inverter driven by a ramp input. A ramp properly approximates the output voltage variation, especially in the central phase of the commutation. The increasing error on the tail of the switching does not significantly affect the delay and power estimation.
The voltage ramp approximations are described in 3.1.2.


Fig. 3.1: Inverter voltage waveforms (Vin, Vout and the model approximation versus time, in ns)

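3.1.1 MOS equations

The well known equations for the MOS transistors are (for the n-type and p-type transistors) [1]:

below saturation

$$I_{DS_{n,p}} = \beta_{n,p}\left[(V_{GS} - V_{T_{n,p}})\,V_{DS} - \frac{V_{DS}^2}{2}\right] \tag{3.1}$$

above saturation

$$I_{DS_{n,p}} = \frac{\beta_{n,p}}{2}\,V_{DSsat_{n,p}}^2 \tag{3.2}$$

where $\beta_{n,p} = \dfrac{\mu_{n,p} C_{ox} W}{L}$, with $\mu_{n,p}$ modified by the carrier velocity saturation effect:

$$\mu_n = \frac{\mu_{n0}}{1 + \dfrac{V_{DS}}{L E_c}} \qquad \mu_p = \frac{\mu_{p0}}{1 - \dfrac{V_{DS}}{L E_c}}$$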


The saturation voltage (drain-source), not including the carrier velocity saturation effect, is given by the well known formula:

$$V_{DSsat_{n,p}} = V_{GS_{n,p}} - V_{T_{n,p}}$$

while considering the above-mentioned effect:

$$V_{DSsat_{n,p}} = V_c\left(\mp 1 \pm \sqrt{1 \pm \frac{2\,(V_{GS_{n,p}} - V_{T_{n,p}})}{V_c}}\right) \tag{3.3}$$

where the upper signs are for the nMOSFETs and the lower signs are for the pMOSFETs, and $V_c = |E_c L|$.
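To make the effect of the velocity saturation on the saturation voltage more concrete, the nMOS case can be sketched as follows. This is only an illustrative sketch: the nMOS form of the saturation voltage used here is the one obtained by maximizing the below-saturation current (3.1) with the degraded mobility given above, and the function names and all numerical values (supply, threshold, beta0, Vc) are assumptions, not data from the original work.

```python
import math


def vdsat_long_channel(vgs, vt):
    """Saturation voltage without velocity saturation: VGS - VT."""
    return vgs - vt


def vdsat_velocity_saturated(vgs, vt, vc):
    """nMOS saturation voltage with velocity saturation, Vc = |Ec * L|."""
    return vc * (math.sqrt(1.0 + 2.0 * (vgs - vt) / vc) - 1.0)


def ids_nmos(vgs, vds, vt, beta0, vc):
    """nMOS drain current with mobility degraded by 1 / (1 + VDS / Vc).

    beta0 is the transconductance factor computed with the low-field mobility.
    """
    if vgs <= vt:
        return 0.0
    vdsat = vdsat_velocity_saturated(vgs, vt, vc)
    v = min(vds, vdsat)  # above saturation the current stays at its VDSsat value
    beta = beta0 / (1.0 + v / vc)
    return beta * ((vgs - vt) * v - 0.5 * v * v)


# Hypothetical operating point.
VGS, VT, BETA0, VC = 3.3, 0.8, 1e-3, 2.0
vd_sat = vdsat_velocity_saturated(VGS, VT, VC)
i_sat = ids_nmos(VGS, VGS, VT, BETA0, VC)
print(f"VDSsat = {vd_sat:.3f} V (long channel: {vdsat_long_channel(VGS, VT):.3f} V)")
print(f"Isat   = {i_sat * 1e3:.3f} mA")
```

With these expressions the saturation current reduces to beta0/2 times the square of the velocity-saturated VDSsat, which has the same form as equation (3.2), and the long-channel value VGS - VT is recovered when Vc becomes very large.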

3.1.2 Internal nodes approximation

Fig. 3.2: Mos chain with proper numbering


Let N be the number of nMOSFETs in the n-chain and P the number of pMOSFETs in the p-chain, and let us label the transistors in each chain from 1 to N or from 1 to P (figure 3.2). Let us assume that label 1 goes to the driving transistor (i.e. the nMOSFET with source connected to VSS or the pMOSFET with source connected to VDD), as in figure 3.2. This hypothesis only serves the development of the discussion; in our model any (but only one) transistor can be the driving transistor, that is, the transistor with a changing gate voltage.
Notation 3.1. In the following equations the superscript index refers to the node number (with the variable i always for the nMOSFETs and j always for the pMOSFETs), and the small-letter subscript indexes n and p refer, respectively, to nMOSFETs and pMOSFETs, both for the voltage variables and for the time variables; for the voltage variables the capital subscript indexes G and D refer to the gate node and the drain node, respectively, while the small-letter index d refers to the initial conditions of the drain nodes.
So, for example, $V_{G_n}^i(t)$ is the gate voltage at node i for the nMOSFETs (function of time), and $V_{d_p}^j$ is the initial condition of the drain voltage at node j for the pMOSFETs.

The voltage waveforms are shown in figure 3.4 and figure 3.5, under the hypothesis $t_{0_n}^1 = t_{0_n}^2 = \dots = t_{0_n}^N$ and $t_{0_p}^1 = t_{0_p}^2 = \dots = t_{0_p}^P$; that is, we suppose that all the MOSFETs in a chain start conducting simultaneously¹.
We can write, referring to figures 3.4 and 3.5:

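$$V_{G_n}^1(t) = \begin{cases} V_{SS} & t < 0 \\ V_{DD}\,\dfrac{t}{\tau_{i_n}^1} & 0 \le t < \tau_{i_n}^1 \\ V_{DD} & \tau_{i_n}^1 \le t \end{cases} \tag{3.4a}$$

$$V_{G_n}^i(t) = V_{DD} \quad \forall t, \qquad i = 2, 3, \dots, N \tag{3.4b}$$

$$V_{G_p}^1(t) = \begin{cases} V_{DD} & t < 0 \\ V_{DD}\left(1 - \dfrac{t}{\tau_{i_p}^1}\right) & 0 \le t < \tau_{i_p}^1 \\ V_{SS} & \tau_{i_p}^1 \le t \end{cases} \tag{3.4c}$$

$$V_{G_p}^j(t) = V_{SS} \quad \forall t, \qquad j = 2, 3, \dots, P \tag{3.4d}$$

$$V_{D_n}^i(t) = \begin{cases} V_{d_n}^i & t < t_{0_n}^i \\ V_{d_n}^i - V_{d_n}^i\,\dfrac{t - t_{0_n}^i}{\tau_{o_n}^i - t_{0_n}^i} & t_{0_n}^i \le t < \tau_{o_n}^i \\ V_{SS} & \tau_{o_n}^i \le t \end{cases} \qquad i = 1, 2, \dots, N \tag{3.4e}$$

$$V_{D_p}^j(t) = \begin{cases} V_{d_p}^j & t < t_{0_p}^j \\ \dfrac{V_{DD} - V_{d_p}^j}{\tau_{o_p}^j - t_{0_p}^j}\,t + \dfrac{\tau_{o_p}^j V_{d_p}^j - t_{0_p}^j V_{DD}}{\tau_{o_p}^j - t_{0_p}^j} & t_{0_p}^j \le t < \tau_{o_p}^j \\ V_{DD} & \tau_{o_p}^j \le t \end{cases} \qquad j = 1, 2, \dots, P \tag{3.4f}$$

¹ This hypothesis is well supported by simulations.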

Fig. 3.3: The ith and (i+1)th MOSFETs with node voltages

It is also possible to define $\tau_{i_{n,p}}^i = \tau_{o_{n,p}}^{i-1}$ and the source voltage $V_s^{i+1} = V_d^i$, as shown in figure 3.3 for the ith nMOS. The same is valid for the pMOSFETs.

The starting levels $V_{d_{n,p}}$ are determined with a static analysis, described in 3.1.3.

3.1.3 Body effect: threshold variation and its approximation


It is known that a MOS transistor with a source-body voltage different from zero has its threshold voltage modified by the body effect, that is, if $V_{sb} \neq 0$, with $V_{sb}$ the source-body voltage (let us remember that for an nMOSFET $V_b = V_{SS}$ and for a pMOSFET $V_b = V_{DD}$), then $|V_{Th}|_{V_{sb} \neq 0} > |V_{Th}|_{V_{sb} = 0}$. The initial conditions of the chain nodes are set by the initial condition on the output. So if the output node is discharging, then one (and only one) nMOSFET is switching from off to on. It means that all the other MOSFETs are already on, and while the starting voltage of the output node is $V_{DD}$, all the internal nodes have $V_{DD} - V'_{Tn}$ as a starting voltage.

Fig. 3.4: Voltage waveforms in the nMOS chain

With the notation of the previous paragraphs, the Nth (topmost) nMOS transistor has $V_{s_n}^N = V_{DD} - V'_{Tn}$, with $V_s$ the source potential and $V'_{Tn}$ the threshold voltage modified by the body effect. All the internal transistors have $V_{d_n}^i = V_{s_n}^i = V_{DD} - V'_{Tn}$, while the first one has $V_{d_n}^1 = V_{DD} - V'_{Tn}$ and $V_{s_n}^1 = 0$.

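The threshold voltage variation as a function of $V_{sb}$ is given by:

$$\Delta V_{Tn} = \gamma\left(\sqrt{2|\phi_p| + V_{sb}} - \sqrt{2|\phi_p|}\right),$$

with $\gamma = \dfrac{\sqrt{2\varepsilon_s q N_a}}{C_{ox}}$ and $\phi_p = \dfrac{kT}{q}\ln\left(\dfrac{N_a}{n_i}\right)$.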


Fig. 3.5: Voltages wave forms in the pmos chain

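The source potential of the top transistor is

$$V_s = V_{DD} - V'_{Tn},$$

and, if $V_{Tn0}$ is the threshold voltage with $V_{sb} = 0$, then $V'_{Tn} = V_{Tn0} + \Delta V_{Tn}$ and we can solve for $V_{sb}$:

$$V_{sb} = -\frac{\gamma}{2}\sqrt{4\gamma\sqrt{2|\phi_p|} + 8|\phi_p| + 4V_{DD} - 4V_{Tn0} + \gamma^2} + \gamma\sqrt{2|\phi_p|} + V_{DD} - V_{Tn0} + \frac{\gamma^2}{2} \quad (> 0)$$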
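The closed-form solution for the nMOS source-body voltage can be cross-checked against a direct numerical solution of the implicit equation Vsb = VDD - VTn0 - ΔVTn(Vsb). The sketch below assumes the usual square-root body-effect law; the helper names and the parameter values (VDD, VTn0, gamma, |phi_p|) are hypothetical and chosen only for illustration.

```python
import math


def delta_vt(vsb, gamma, phi):
    """Body-effect threshold shift: gamma * (sqrt(2|phi| + Vsb) - sqrt(2|phi|))."""
    return gamma * (math.sqrt(2.0 * phi + vsb) - math.sqrt(2.0 * phi))


def vsb_closed_form(vdd, vt0, gamma, phi):
    """Closed-form root of Vsb = VDD - VT0 - delta_vt(Vsb)."""
    a = math.sqrt(2.0 * phi)
    disc = 4.0 * gamma * a + 8.0 * phi + 4.0 * vdd - 4.0 * vt0 + gamma ** 2
    return -0.5 * gamma * math.sqrt(disc) + gamma * a + vdd - vt0 + 0.5 * gamma ** 2


def vsb_fixed_point(vdd, vt0, gamma, phi, iterations=200):
    """Same root obtained by simple fixed-point iteration, used only as a check."""
    vsb = vdd - vt0
    for _ in range(iterations):
        vsb = vdd - vt0 - delta_vt(vsb, gamma, phi)
    return vsb


# Hypothetical process values: VDD = 3.3 V, VTn0 = 0.8 V, gamma = 0.5 V^0.5, |phi_p| = 0.35 V.
VDD, VT0, GAMMA, PHI = 3.3, 0.8, 0.5, 0.35
vsb_exact = vsb_closed_form(VDD, VT0, GAMMA, PHI)
vsb_iter = vsb_fixed_point(VDD, VT0, GAMMA, PHI)
vt_body = VT0 + delta_vt(vsb_exact, GAMMA, PHI)
print(f"Vsb = {vsb_exact:.4f} V, threshold with body effect = {vt_body:.4f} V")
```

Under these assumed values the two results agree, and the internal nodes start about one body-affected threshold below VDD, as stated in the text.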

We can find an analogue equation for pMOSFETs: knowing that, for


the pMOS chain depicted in figure 3.7(b), the drain potential of transistor
j

; for the middle transistors V = V =


is VdPp = 0, while VsPp = VDD VTp
sp
dp

; and for the first (top MOS t) transistor V 1 = V

VDD VTp
DD VTp and
dp
Vs1p = VDD .

The threshold voltage variation function of Vsb again is:


Fig. 3.6: Drain-source ($V_{DS}^i$) and gate-source ($V_{GS}^i$) voltages of the ith nMOS

VTp = ( 2| p | + Vsb

2| p |)

(for pMOS transistors threshold voltage is negative).


Again, solving:

= VDD VTp0 + (
Vsb = VDD VTp

2| p | + Vsb

2| p |)

where VTp0 is the threshold voltage with Vsb = VDD ; thus we find:

Vsb =

2| p | + 8| p | + 4VDD + 4VTp0 + 2
2

2| p | VDD VTp0

2
2

(< 0)

The threshold variation is approximated in the model by a linear approximation given by:

Fig. 3.7: MOSFET chains with static voltages: (a) nMOSFET chain, with the internal nodes at VDD - VTN; (b) pMOSFET chain, with the internal nodes at VSS - VTP

VTn = n Vsb + n
VTp = p Vsb + p
with n, p and n, p constants:

n =
p =

V
VTn
Tn0

VDD VTn

VTp VTp 0

VDD + VTp

n = VTn0
p =

+V
VTp VDD
Tp0

VDD + VTp

Fig. 3.8: Threshold variation with Vsb (solid line) and its linear approximation (dashed line): (a) nMOSFET, (b) pMOSFET
In figures 3.8(a) and 3.8(b) the actual threshold variation (of an nMOS transistor and of a pMOS transistor) when a $V_{sb}$ voltage is applied is compared with the linear approximation used in our model, for a 0.7 µm technology.
The maximum error due to the linear approximation is limited to 7%.

3.2 Delay estimation

The delay estimation of the structures reported in figure 3.2 implies the evaluation of $\tau_{o_{n,p}}^i$ and $t_{0_{n,p}}^i$ for each transistor in the chains.
The currents in each transistor can be obtained from equations (3.1), (3.2) (page 23), with the voltages as functions of time defined in equations (3.4a)-(3.4f) (page 25). So we can calculate the quantity of charge at each node and thus apply the charge conservation law, i.e. at each node the total charge variation must be equal to zero:

$$\Delta Q_n^i = 0 \qquad \Delta Q_p^j = 0 \qquad i = 1, 2, \dots, N \ \text{ and } \ j = 1, 2, \dots, P \tag{3.5}$$

The generic term $\Delta Q_n^i$ is the sum of three elements, $\Delta Q_n^i = Q_I^{i+1} - Q_I^i + Q_C^i$, defined below:


$Q_I^{i+1}$ is the charge due to the (i+1)th MOSFET placed above the ith node:

$$Q_I^{i+1} = \int_{t_{0_n}^{i+1}}^{t_{s_n}^{i+1}} I_{sat}^{i+1}(t)\,dt + \int_{t_{s_n}^{i+1}}^{\tau_{o_n}^{i+1}} I_{lin}^{i+1}(t)\,dt \tag{3.6a}$$

which includes the contributions due to the currents above and below saturation; $t_s$ is the time at which the MOSFET switches from the saturation to the linear region;

$Q_I^i$ is the charge due to the ith MOSFET below the ith node:

$$Q_I^{i} = \int_{t_{0_n}^{i}}^{t_{s_n}^{i}} I_{sat}^{i}(t)\,dt + \int_{t_{s_n}^{i}}^{\tau_{o_n}^{i}} I_{lin}^{i}(t)\,dt \tag{3.6b}$$

$Q_C^i$ is the charge due to the discharging of the capacitor at the ith node, $C^i$:

$$Q_C^i = C^i V_{d_n}^i. \tag{3.6c}$$

Similar equations apply for the pMOSFETs.

For each circuit node, a charge conservation equation can be written.

3.2.1 Equation solving

Referring to the nMOS chain in figure 3.3, we can write at the output node N:

$$Q_I^N = Q_C^N = C^N V_{d_n}^N \tag{3.7}$$

because, neglecting the contribution of the pMOS chain above (if it exists), $Q_I^{N+1} = 0$.
At the node N - 1 we can write:

$$\Delta Q_n^{N-1} = Q_I^N - Q_I^{N-1} + Q_C^{N-1},$$

and combining with eq. (3.7) (page 32)

$$\Delta Q_n^{N-1} = C^N V_{d_n}^N - Q_I^{N-1} + Q_C^{N-1},$$

and so on:

$$\Delta Q_n^{N-2} = C^N V_{d_n}^N + C^{N-1} V_{d_n}^{N-1} - Q_I^{N-2} + Q_C^{N-2}.$$

More generally:

$$\Delta Q_n^i = \sum_{k=i+1}^{N} C^k V_{d_n}^k - Q_I^i + Q_C^i = \sum_{k=i}^{N} C^k V_{d_n}^k - Q_I^i = 0$$

Proceeding till the first transistor, we obtain:

$$\Delta Q_n^1 = \sum_{k=1}^{N} C^k V_{d_n}^k - Q_I^1 = 0, \tag{3.8}$$

the same applies for pMOSFETs.
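The charge balance can be illustrated on the smallest possible chain. The following is a deliberately simplified sketch and not the FAST solution scheme: it assumes a single nMOSFET (N = 1) with its gate already at VDD (step input), no velocity saturation, a constant load capacitance, and an output voltage that falls as a linear ramp from VDD to 0 in a time tau_o. Under these assumptions the charge delivered by the transistor depends on time only through t/tau_o, so the balance C·VDD = ∫ I dt can be solved for tau_o directly. All names and values are illustrative assumptions.

```python
def ids_nmos(vgs, vds, vt, beta):
    """Equations (3.1)/(3.2) for an nMOS, without velocity saturation (assumption)."""
    vov = vgs - vt
    if vov <= 0.0 or vds <= 0.0:
        return 0.0
    if vds >= vov:  # above saturation
        return 0.5 * beta * vov * vov
    return beta * (vov * vds - 0.5 * vds * vds)  # below saturation


def output_fall_time(c_load, vdd, vt, beta, steps=10_000):
    """Solve C*VDD = integral over [0, tau_o] of I(t) dt for a linear output ramp.

    With VD(t) = VDD * (1 - t/tau_o) the integral equals tau_o times the mean
    current over the normalised time u = t/tau_o, evaluated by the midpoint rule.
    """
    mean_current = 0.0
    for k in range(steps):
        u = (k + 0.5) / steps
        mean_current += ids_nmos(vdd, vdd * (1.0 - u), vt, beta) / steps
    return c_load * vdd / mean_current


# Hypothetical values: 50 fF load, VDD = 3.3 V, VT = 0.8 V, beta = 1 mA/V^2.
C_LOAD, VDD, VT, BETA = 50e-15, 3.3, 0.8, 1e-3
tau_o = output_fall_time(C_LOAD, VDD, VT, BETA)
print(f"tau_o = {tau_o * 1e12:.1f} ps")
```

In the full model the same balance has to be written for every node, with ramp inputs, body effect, and the case analysis on the saturation-to-linear switching time described next, which is why a nonlinear solver is needed there.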


In order to solve the nonlinear equation (3.8) one must substitute the definition of the current to calculate the charge Q, as in equations (3.6a), (3.6b) (page 32); moreover, one must substitute both the current calculated in the saturation region and the one calculated in the linear region, extending the integrals of the aforementioned equations to the proper extremes.
Finally, we must distinguish among several different cases, depending on the instant of time at which the transistor switches from the saturation region to the linear region. For example, the first transistor can switch between the two regions when the rising of the input has already finished or, on the contrary, when the input is still rising.
All the possible cases are:

t01

t1s

i1

o1

t01

i1

t1s

o1

t1s

t10

i1

o1

i1

t01

t1s

o1

t1s

i1

t10

o1

t01

t1s

o1

i1

o1

i1

t1s

t01

(3.9)

Evaluating all the possible cases, equation (3.8) becomes a nonlinear equation in the variables $t_s^1$, $t_0^1$, $\tau_o^1$, $\tau_i^1$, with $t_s^1$, $t_0^1$, $\tau_o^1$ as unknowns.
A further step is needed to eliminate all the variables but one. The real unknown is the time $\tau_o^1$, while all the other unknowns can be expressed as functions of $\tau_o^1$: in particular, the times $t_s^1$ and $t_0^1$ can be calculated together, using the equation $V_{DS} = V_{GS} - V_T$ (or, taking into account the carrier velocity saturation effect, equation (3.3), page 24) and the equation that states the charge conservation at node 1 between the time 0 and the time $t_0^1$, similar to equation (3.5) (page 31), including the bootstrap effect due to the capacitive coupling between the gate and the drain of the first transistor.
Both these equations are functions of $t_s^1$, $t_0^1$, $\tau_o^1$, $\tau_i^1$. In this way one has three equations with three unknowns, and by means of some approximate methods (the problem is always strictly nonlinear) it is possible to evaluate the three unknowns.
This solution scheme ought to be repeated for all the seven cases shown in equation (3.9). Each case gives as a solution a triple $t_s^1$, $t_0^1$, $\tau_o^1$ that is compatible with one and only one of the conditions expressed by these cases. Thus only one working condition is really selected, as can be expected.
Indeed, the whole previous solving scheme holds only if equation (3.6c) (page 32) applies, i.e. only if the capacitance at node i is not a function of the voltage at the same node. But the capacitance actually depends on the voltage in this manner:

$$C^i = \frac{C_j^i}{\left(1 + \dfrac{V^i}{\phi_b}\right)^{m_j}} + \frac{C_p^i}{\left(1 + \dfrac{V^i}{\phi_b}\right)^{m_p}} \tag{3.10}$$

where $C_j$ and $C_p$ are, respectively, functions of the area and of the perimeter of a junction, because the capacitance at node i is due to the parasitic capacitances of the transistors connected to this node.
If the capacitance at each node is a function of the voltage at the node itself, then one equation is no longer sufficient: one must write equations like equation (3.8) (page 33), one for each node, and then solve them with a standard solving algorithm for nonlinear equations. The only difference between the equations applied at the nodes above the first and the first node equation is that not all of the cases of equation (3.9) are possible: in particular, these conditions apply only when the transistor can pass from the saturation region to the linear region and, moreover, only when the input rise time $\tau_i^1$ can assume any value. The passage from saturation to the linear region can be made only by the first and the last transistors of the chain, as they are the only ones that can saturate³. But in the last transistor the time $\tau_i^N$ is governed by $\tau_i^N = \tau_o^{N-1}$, thus giving only two possible cases:

t0N

tsN

iN

oN

t0

iN

tsN

oN

In order to make the algorithm convergent, two other fictitious cases


must be included:

t0N

tsN , oN

iN

t0

tsN , oN

iN

These conditions can never occur in a real circuit, since they imply that the voltages at the source node and at the drain node of the last transistor cross, making the transistor current flow in the inverse direction (see figure 3.6 for a visual explanation of the terms $\tau_i$ and $\tau_o$ and of why the relative voltage waveforms cannot cross). Their inclusion helps in finding the real circuit conditions when solving equation (3.8) for each of these four cases: the solution of one of the fictitious cases gives only unknowns compatible with one of the real cases.

³ This is because they are the only ones that have a full voltage swing at some node, e.g. the gate node for the first and the drain node for the last. All the transistors in the middle of the chain are prevented from saturating by the body effect, which makes the saturation condition $V_{DS} = V_{GS} - V_T$ (or, better, equation (3.3), page 24) impossible.

All the other transistors, which cannot saturate during the switching from off to on, have only one possible working condition, again such that the voltages at the source and drain nodes do not cross:

j = 2, . . . N 1
j

Solving all the equations, one for each node, the unknowns $\tau_o^j$ can be evaluated, thus giving an estimate of the voltage waveform at each node of the chain. The rise/fall time of the last node of the chain also gives the delay of the chain itself.

3.3 Power consumption estimation


3.3.1 Switching energy
The contribution to the power dissipation due to the charge and discharge of internal nodes for each MOSFET can be defined as the integral of the voltage across the MOSFET times the current flowing through it.
Theorem 3.2. The switching energy in generic n-networks and p-networks can be written as:

$$E_{sw_n} = \frac{1}{2}\sum_{i=1}^{N} C^i\left(V_i'^{\,2} - V_i''^{\,2}\right) \tag{3.11}$$

$$E_{sw_p} = \frac{1}{2}\sum_{j=1}^{P} C^j\left[\left(V_{DD} - V_j'\right)^2 - \left(V_{DD} - V_j''\right)^2\right] \tag{3.12}$$

where $C^i$ is the total capacitance of the ith node and $V_i'$, $V_i''$ are, respectively, the initial and final values of the voltage swing at the same node.
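As a small numerical illustration of the theorem, the switching energy can be evaluated from per-node capacitances and from the initial and final node voltages. The sketch below is only an example: the function names, the two-node chain, the capacitances, and the internal-node starting voltage (VDD minus an assumed 1.0 V body-affected threshold) are hypothetical values, not data from the original work.

```python
import math


def switching_energy_n(caps, v_initial, v_final):
    """n-network form: 0.5 * sum of C * (V_initial^2 - V_final^2) over the nodes."""
    return 0.5 * sum(c * (vi ** 2 - vf ** 2)
                     for c, vi, vf in zip(caps, v_initial, v_final))


def switching_energy_p(caps, v_initial, v_final, vdd):
    """p-network form: 0.5 * sum of C * ((VDD - V_initial)^2 - (VDD - V_final)^2)."""
    return 0.5 * sum(c * ((vdd - vi) ** 2 - (vdd - vf) ** 2)
                     for c, vi, vf in zip(caps, v_initial, v_final))


VDD = 3.3
caps = [20e-15, 50e-15]  # internal node and output node, in farads (assumed)

# nMOS chain discharging: internal node starts at VDD - 1.0 V, output at VDD.
e_n = switching_energy_n(caps, [VDD - 1.0, VDD], [0.0, 0.0])

# pMOS chain charging: internal node starts at 1.0 V, output at 0 V, both end at VDD.
e_p = switching_energy_p(caps, [1.0, 0.0], [VDD, VDD], VDD)

print(f"E_sw,n = {e_n * 1e15:.2f} fJ, E_sw,p = {e_p * 1e15:.2f} fJ")
```

When every node swings over the full supply range, both expressions collapse to the familiar one-half C V squared per node.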


Corollary 3.2.1. If the voltage swing of each node of the network is the full swing $\Delta V = V_{DD} - 0$, then equations (3.11), (3.12) can be written as:

$$E_{sw_n} = \frac{1}{2}\sum_{i=1}^{N} C^i\,\Delta V^2 \tag{3.13}$$

$$E_{sw_p} = \frac{1}{2}\sum_{i=1}^{P} C^i\,\Delta V^2 \tag{3.14}$$

Proof of theorem 3.2. Since the internal voltages and currents are known from the delay analysis, the energy for the nMOS network can be written by summing all the contributions of the internal nodes (see figure 3.3):

$$E_{sw_n} = \sum_{i=1}^{N}\int \left(V_{D_n}^{i}(t) - V_{D_n}^{i-1}(t)\right) I_{D_n}^{i}(t)\,dt$$

where the notation of figure 3.3 is adopted, with $V_{D_n}^{0} = V_{SS} = 0$.

This equation can be written in this way:

$$E_{sw_n} = \int \left[V_{D_n}^{N}(t)\,I_{D_n}^{N}(t) + \sum_{i=1}^{N-1} V_{D_n}^{i}(t)\left(I_{D_n}^{i}(t) - I_{D_n}^{i+1}(t)\right)\right] dt \tag{3.15}$$

It is possible to rewrite the previous equations by noting that in general:

$$I_{D_n}^{i+1} - I_{D_n}^{i} = C^i\,\frac{dV_{D_n}^{i}}{dt}$$

and, in particular, if we neglect the current of the pMOS chain above the node N,

$$I_{D_n}^{N} = -C^N\,\frac{dV_{D_n}^{N}}{dt}$$

Thus, for the n network it is possible to define the $E_{sw_n}$ energy in the following way:

$$E_{sw_n} = -\sum_{i=1}^{N} C^i \int_{t'}^{t''} V_{D_n}^{i}\,\frac{dV_{D_n}^{i}}{dt}\,dt = -\sum_{i=1}^{N} C^i \int_{V_i'}^{V_i''} V_{D_n}^{i}\,dV_{D_n}^{i} = \frac{1}{2}\sum_{i=1}^{N} C^i\left(V_i'^{\,2} - V_i''^{\,2}\right)$$

If we integrate equation (3.11) (page 36) only where the arguments of the integrals are nonzero, then the first integral in this equation goes from $t' = t_{0_n}^i$ to $t'' = \tau_{o_n}^i$, so that the second integral goes from $V_i' = V_{D_n}^i(t_{0_n}^i)$ to $V_i'' = V_{D_n}^i(\tau_{o_n}^i)$. Since $V_{D_n}^i(\tau_{o_n}^i) = 0$, we have $E_{sw_n} = \frac{1}{2}\sum_{i=1}^{N} C^i V_i'^{\,2}$, where $V_i'$ is the actual voltage swing at the node i.


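The energy dissipated in the p network ($E_{sw_p}$) can be calculated with similar considerations, leading to

$$E_{sw_p} = \sum_{j=1}^{P} C^j \int_{t'}^{t''} \left(V_{DD} - V_{D_p}^{j}\right)\frac{dV_{D_p}^{j}}{dt}\,dt = \sum_{j=1}^{P} C^j \int_{V_j'}^{V_j''} \left(V_{DD} - V_{D_p}^{j}\right) dV_{D_p}^{j} = \frac{1}{2}\sum_{j=1}^{P} C^j\left[\left(V_{DD} - V_j'\right)^2 - \left(V_{DD} - V_j''\right)^2\right]$$

Again, $V_j' = V_{D_p}^j(t_{0_p}^j)$ and $V_j'' = V_{D_p}^j(\tau_{o_p}^j)$, and in the same way $V_j'' = V_{DD}$, so that $E_{sw_p} = \frac{1}{2}\sum_{j=1}^{P} C^j\left(V_{DD} - V_j'\right)^2$, where $(V_{DD} - V_j')$ is the voltage swing at the node j.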

In equations (3.11) and (3.12) (page 36) the voltage dependence of the capacitances must be included, obtaining expressions for $E_{sw_{n,p}}$ that are slightly more complicated, but still in closed form.


3.3.2 Short-circuit energy

The short-circuit contribution (for an output falling transition) is given by:

$$E_{sc} = \int_{t_0}^{\tau_o} V_D\,I_D\,dt$$

where $I_D$ is the current flowing through the pMOSFET that has a changing gate voltage during the output fall; of course all the pMOSFETs between this one and the output node must be on for this contribution to the power dissipation to exist. So, if we neglect the small discharge of the source voltage of this MOSFET, we can easily calculate the short-circuit energy from the current flowing.
A similar equation can be written for the nMOS network.
Since voltage swings, internal currents and capacitances are known from the delay analysis, the power supply dissipation does not require additional computations.

3.3.3 Subthreshold energy


The subthreshold current in a MOSFET is given by ([12]):

IDSsubth = 0

qVDS
W kT
Q(VS ) 1 e kT
L q

where
Q(VS )

kT
q

VT )
q s Na q(VG kT
e
| p |

and

=1+

1
2Cox

s Na

| p |

This current is proportional to the MOSFET width W but is usually negligible. However, with the scaling down of the dimensions and hence of the threshold voltage, this current may no longer be negligible, and with low $V_G$ and higher $V_D$ the current becomes independent of $V_G$.
Moreover, while the short-circuit current is limited by the switching times of the circuit, the subthreshold current is not limited in time, so its dissipation can be comparable to the short-circuit dissipation.

3.4 Results

The circuit in figure 3.2 with 2 nMOS and 2 pMOS transistors (in a 0.7 µm technology) has been simulated using HSPICE (level 6) and the proposed model, for each combination of MOSFET widths from 1 µm to 100 µm. Figure 3.9 shows the comparison between the delay (defined as the delay at 50% between an input rising ramp of 200 ps and an output falling ramp) calculated by the model and the delay simulated by HSPICE for each combination of widths between 5 µm and 30 µm; similarly, figure 3.10 shows the comparison between the energy dissipated by the circuit (during the output discharging) as calculated by the model and by HSPICE.
Tab. 3.1: Mean Error

                     Mean error   Max error   Min error
Delay                6.115%       12.985%     0.905%
Energy dissipated    2.1%         6.3%        0.11%

Tab. 3.2: Execution time

           Execution time
HSPICE     6384.3 sec.
FAST       188.91 sec.

The errors between the proposed model and the HSPICE simulation are reported in table 3.1, while table 3.2 shows the corresponding execution times. These results are taken from the analysis of the circuit varying the dimensions of the MOSFETs continuously from 1 µm to 100 µm.
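The speed-up implied by the execution times of table 3.2 can be worked out directly; the short sketch below only performs this arithmetic on the two reported values (the variable names are illustrative).

```python
# Execution times reported in table 3.2, in seconds.
hspice_seconds = 6384.3
fast_seconds = 188.91

# Ratio of the two run times: how many times faster the model is than HSPICE.
speedup = hspice_seconds / fast_seconds
print(f"speed-up of the model over HSPICE: about {speedup:.1f}x")
```

The ratio is a little under 34, which is consistent with the stated goal of improving the computation speed by more than one order of magnitude.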

3.5 Conclusions

The model of this chapter is suitable for the optimization application of chapter 5. It is able to compute the delay and the power consumption of CMOS structures with good accuracy and a consistent speed-up with respect to the HSPICE simulation taken as a reference.
In a real production design cycle, this model might be used for a first pre-optimization of some basic cells; then, in the last steps of the design flow, an optimization using a more accurate model for the delay (or power) evaluation must be used.

Fig. 3.9: Delay [ps] of the circuit of figure 3.2 for several combinations of W1 and W2 [micron]: (a) FAST model, (b) HSPICE simulation.

Fig. 3.10: Energy [fJ] dissipated by the circuit of figure 3.2 for several combinations of W1 and W2 [micron]: (a) FAST model, (b) HSPICE simulation.

Part III
OPTIMIZATION

Chapter 4

MATHEMATIC OPTIMIZATION

THE very basic theory of optimization is introduced here, in order to develop some optimization schemes that will be useful later for the optimization of real circuits.
The theory of mono-objective optimization involves some properties and theorems about finding the minimum of a function, hence about the vanishing of its first derivatives. These results can be extended (with some restrictions) to the case of multivariable functions, but when more than one function must be optimized simultaneously a new theory must be introduced.
The goal of this introduction to mathematical optimization is both the development of reliable algorithms and the justification of some assumptions made in chapter 5 (page 77), especially for the multi-objective case.
In section 4.1 some mathematical optimization foundations are reported: 4.1.1 shows the theory of mono-objective optimization (unconstrained, 4.1.1.1, and constrained, 4.1.1.2), while 4.1.2 shows the theory of multi-objective optimization (unconstrained, 4.1.2.1, and constrained, 4.1.2.2).

Section 4.2 reports the basic and most useful numerical algorithms for optimization purposes: in 4.2.1 some one-dimensional search techniques, in 4.2.2 some multi-dimensional search techniques, and in 4.2.4, 4.2.5 some special algorithms.

Some conclusions and summarized characteristics are reported in section 4.3.


4.1 Optimization theory


Notation 4.1. In the following section, the function f is defined as $f : X \subseteq \mathbb{R}^p \to Y \subseteq \mathbb{R}$. X is called the decision space, and Y is called the criteria space.

Problem 4.2 (Unconstrained optimization). Given the function f that depends on one or more variables $x \in X$, the problem of optimizing f, in this context, amounts to finding:

$$\min_{x \in X} f(x)$$

This is also known as unconstrained optimization, since there are no constraints on the values the function f may assume.
Unconstrained optimization is seldom applied in the field of digital circuits, so the constrained optimization is defined as:
Problem 4.3 (Constrained optimization). Find

$$\min_{x \in X} f(x) \qquad \text{subject to} \qquad g_j(x) \le h_j, \quad j = 1, 2, \dots, m$$

where the m inequalities $g_j(x) \le h_j$ constitute the set of constraints of the optimization.

The function f is also called the objective of the optimization, or the cost function of the problem.
The above problems are classical optimization problems, or mono-objective problems. Multi-objective unconstrained optimization is defined as the problem of optimizing a vector function, so that the objective function is a vector of objective functions.
Notation 4.4. In the following (multi-objective optimization), the function f is defined as $f : X \subseteq \mathbb{R}^p \to Y \subseteq \mathbb{R}^n$, or $f = (f_1, f_2, \dots, f_n)$ with $f_i : X \subseteq \mathbb{R}^p \to Y \subseteq \mathbb{R}$.

Problem 4.5 (Unconstrained multi-objective optimization). Find

$$\min_{x \in X} f_i(x), \quad i = 1, 2, \dots, n$$


where there are n objective functions.

Finally, the multi-objective constrained optimization is defined as:
Problem 4.6 (Constrained multi-objective optimization). Find

$$\min_{x \in X} f_i(x), \quad i = 1, 2, \dots, n \qquad \text{subject to} \qquad g_j(x) \le h_j, \quad j = 1, 2, \dots, m$$

where there are n objective functions and m constraints.


Multi-objective optimization is a very complex problem, since finding the minimum of two or more functions is only apparently trivial: the set of independent variables $x_{min}$ that minimizes, let us say, the function $f_1$ is not supposed to minimize (and generally does not minimize) the other functions. So there should be a way to combine the information about the minimum among all the functions. The intuitive way, a linear combination, is somewhat problematic:

$$f_{tot}(x) = \sum_{i=1}^{n} \alpha_i f_i(x), \quad \alpha_i \in \mathbb{R}$$

because the functions $f_i$ may not be commensurable with one another. For example, if there is one function $f_j$ such that $f_j \gg f_i$, $\forall i \neq j$, then this function dominates the total objective, giving false results for the optimization problem. This problem is illustrated in 4.1.2.
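The dominance problem of the linear combination can be seen on a toy example. The sketch below assumes two quadratic objectives of very different magnitude, k_i (x - c_i)^2, for which the minimizer of the weighted sum has a simple closed form; the functions, the scale factors, and the weights are hypothetical and serve only to illustrate the point made in the text.

```python
def weighted_sum_minimizer(weights, scales, centers):
    """Minimizer of sum_i w_i * k_i * (x - c_i)**2, obtained in closed form.

    Setting the derivative to zero gives x = sum(w*k*c) / sum(w*k).
    """
    numerator = sum(w * k * c for w, k, c in zip(weights, scales, centers))
    denominator = sum(w * k for w, k in zip(weights, scales))
    return numerator / denominator


# Two objectives: f1 = (x - 1)^2 and f2 = 1000 * (x + 1)^2 (f2 >> f1).
scales = [1.0, 1000.0]
centers = [1.0, -1.0]

# Equal weights: the large objective dominates and x lands almost on its minimum.
x_equal = weighted_sum_minimizer([1.0, 1.0], scales, centers)

# Weights chosen to make the two objectives commensurable: a balanced compromise.
x_rescaled = weighted_sum_minimizer([1.0, 0.001], scales, centers)

print(f"equal weights:    x = {x_equal:.4f}")
print(f"rescaled weights: x = {x_rescaled:.4f}")
```

With equal weights the result is essentially the minimizer of the larger objective alone, while weights that compensate for the difference in scale give a compromise between the two minima; choosing such weights is exactly the difficulty mentioned above.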

4.1.1 Mono-objective optimization


Mono-objective optimization is the standard optimization problem, and is widely treated in the literature (see [13] for an introduction). Here some results are reported that are useful for finding a solution to problems 4.2 and 4.3.
The existence of (at least one) minimum is granted by the Weierstrass theorem¹, but these minima can be local or global:
Definition 4.7 (Local Minimum). The point $x^* \in X$ is a local (or relative) minimum of the function f iff

$$\exists\, \varepsilon > 0 : f(x) \ge f(x^*) \quad \forall x \in X,\ |x - x^*| < \varepsilon.$$

¹ if X is a compact set (and f is continuous), as is the case in this context


Definition 4.8 (Global Minimum). The point $x^* \in X$ is a global (or absolute) minimum of the function f iff $f(x) \ge f(x^*)\ \forall x \in X$.

Definition 4.9 (Feasible direction). $d \in \mathbb{R}^n$ is a feasible direction at $x \in X$ if $\exists\, \bar{\alpha} > 0 : x + \alpha d \in X\ \ \forall \alpha : 0 \le \alpha \le \bar{\alpha}$.

Intuitively, the concept of feasible direction is useful to solve the minimization problem: we search all the directions in which the function f is decreasing.
Lemma 4.10 (First order necessary condition). If $x^* \in X$ is a minimum of $f \in C^1$, then for every feasible direction $d \in \mathbb{R}^n$, $d^T \cdot \nabla f(x^*) \ge 0$, where $(\cdot)$ has the usual definition of scalar product in the space $\mathbb{R}^n$.

Corollary 4.10.1. If $x^* \in X$ is an internal point of X, then $d^T \cdot \nabla f(x^*) = 0$.

Lemma 4.11 (Second order necessary condition). If $x^* \in X$ is a minimum of $f \in C^2$, then for every feasible direction $d \in \mathbb{R}^n$:
i) $d^T \cdot \nabla f(x^*) \ge 0$;
ii) if $d^T \cdot \nabla f(x^*) = 0$ then $d^T\, \nabla^2 f(x^*)\, d \ge 0$.
Corollary 4.11.1. If $x^* \in X$ is an internal point of X, then
i) $d^T \cdot \nabla f(x^*) = 0$
ii) $d^T\, \nabla^2 f(x^*)\, d \ge 0$
The conditions of corollary 4.11.1 are necessary conditions for the existence of a (local) minimum; they become sufficient when the inequality in ii) is strict for every $d \neq 0$. In order to have some information about the existence of a global minimum, the theory of convex functions must be very briefly reported.
Definition 4.12 (Convex function). The function $f : X \to Y$, where X is a convex set², is convex if $\forall x_1, x_2 \in X$ and $\forall \alpha : 0 \le \alpha \le 1$

$$f(\alpha x_1 + (1 - \alpha) x_2) \le \alpha f(x_1) + (1 - \alpha) f(x_2) \tag{4.1}$$

² A set $X \subseteq \mathbb{R}^n$ is convex if $\forall x, y \in X$ the segment [x, y] is totally contained in X


If in equation (4.1) the sign < applies, then the function is said to be strictly convex.
Another way to write equation (4.1) is:
Lemma 4.13. The function $f \in C^1 : X \to Y$ is convex over a convex set X if

$$f(y) \ge f(x) + \nabla f(x) \cdot (y - x), \quad \forall y, x \in X$$

or, if f is twice differentiable,
Lemma 4.14. The function $f \in C^2 : X \to Y$ is convex over a convex set X if

$$\nabla^2 f(x) \ge 0, \quad \forall x \in X$$

Convex functions are a very useful mathematical tool in the class of optimization problems, mainly because of the next two results:
Theorem 4.15. If $f : X \to Y$ is convex over a convex set X, the set A of the minima of the function is convex, and every local minimum is also a global minimum.
Theorem 4.16. If $f \in C^1 : X \to Y$ is convex over a convex set X, and if $\exists\, x^* \in X : \forall x \in X\ \ \nabla f(x^*) \cdot (x - x^*) \ge 0$, then $x^*$ is a global minimum of f over X.

Theorem 4.16 also implies that, for convex functions, the conditions of lemma 4.10 and corollary 4.10.1 (first order conditions) are both necessary and sufficient conditions for the existence of a global minimum.

4.1.1.1 Unconstrained problem


All the previous results are, at least in theory, sufficient to solve problem 4.2. The theory of convex functions ensures the existence of a global minimum, while lemma 4.10, corollary 4.10.1 and theorem 4.16 suggest a method to find this minimum. We will see in 5.1 how these methods apply to real circuits, in which, for example, the derivatives of the functions are not available.


4.1.1.2 Constrained problem


The solution of problem 4.3 is slightly more complicated. The presence of constraints reduces the feasible set of independent variables that are solutions of the problem. So the solutions (i.e. the values of the independent variables that minimize the objective function) must be searched in the set $C \subseteq X$ of the points x that satisfy all the constraints.

The most important method to solve the minimization problem while satisfying some constraints (and, incidentally, the method most useful for our real problem) is the method of Lagrange multipliers (and its derivative, the method of the penalty function).

Lagrange multiplier and Penalty functions. The first method defines a Lagrangian function:

$$L(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x) \tag{4.2}$$

If we define $x^*$ as the solution such that:

$$x^* = \arg\min_{x \in X} f(x), \qquad g_i(x^*) \le 0, \quad i = 1, 2, \dots, m$$

then we can write the necessary Kuhn-Tucker conditions for the existence of the minimum:

$$\nabla_x L(x^*, \lambda^*) = 0 \tag{4.3}$$

$$\nabla_\lambda L(x^*, \lambda^*) \le 0 \tag{4.4}$$

$$(\lambda^*)^T g(x^*) = 0 \tag{4.5}$$

$$\lambda^* \ge 0 \tag{4.6}$$

In order to find out sufficient conditions, we define the saddle-point conditions:


Theorem 4.17. A point $(x^*, \lambda^*)$ with $\lambda^* \ge 0$ is a saddle-point of the Lagrangian $L(x, \lambda)$ iff

i) $x^*$ minimizes $L(x, \lambda^*)$ over the whole $X$
ii) $g_i(x^*) \le 0$, $i = 1, 2, \dots, m$
iii) $\lambda_i^*\, g_i(x^*) = 0$, $i = 1, 2, \dots, m$
It can be proved that if the functions $f$ and $g_i$ are convex, even if they are not differentiable, then the saddle-point conditions are necessary and sufficient conditions. Although these conditions must hold at the minimum, they are not very useful in determining the optimum point: the determination of the optimum by direct solution of these equations is rarely practicable.
A more feasible way is to convert the constrained problem into an unconstrained one, by defining the new objective function:

\[ P(x, K) = f(x) + \sum_{i=1}^{m} K_i\, [g_i(x)]^2 \tag{4.7} \]

The sum added to the objective function is called penalty function, since it penalizes the objective function by adding a positive quantity (recall that we want to minimize the cost function). The constants $K = [K_1, K_2, \dots, K_m]^T$ are (positive) weighting factors that define how strongly the $i$th constraint must be satisfied, and can also make the constraints commensurable.
Wherever $x$ is inside the feasible region, we can ignore the constraints, so a new objective function can be defined as:

\[ P(x, K) = f(x) + \sum_{i=1}^{m} K_i\, [g_i(x)]^2\, u_i(g_i) \tag{4.8} \]

where $u_i(g_i)$ is the usual step function:

\[ u_i(g_i) = \begin{cases} 0 & \text{if } g_i(x) \le 0 \\ 1 & \text{if } g_i(x) > 0 \end{cases} \]

The introduction of the step function makes it possible to relate the penalty function defined in (4.8) to the Lagrangian function of (4.2) (page 52):

\[ P(x, K) = L(x, \lambda) \]

if we let $\lambda_i = K_i\, g_i(x)\, u_i(g_i)$, so that all the previous results valid for the Lagrangian function are also valid for the penalty function.
Note that the solution $x$ found by optimizing the penalty function $P(x, K)$ converges to $(x^*, \lambda^*)$, defined by the Kuhn-Tucker conditions, only in the limit $K \to \infty$.
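
The behaviour of the penalty function for growing weights can be illustrated with a minimal sketch. The test problem, the helper `minimize_1d` (a plain ternary search) and all names below are assumptions made for illustration only; they are not taken from the thesis.

```python
def penalty(f, constraints, weights):
    """Build P(x, K) = f(x) + sum_i K_i * g_i(x)^2 * u_i(g_i), as in equation (4.8)."""
    def P(x):
        total = f(x)
        for g, K in zip(constraints, weights):
            violation = g(x)
            if violation > 0:          # step function u_i: only violated constraints are penalized
                total += K * violation ** 2
        return total
    return P


def minimize_1d(func, lo, hi, iterations=200):
    """Ternary search for the minimum of a unimodal function on [lo, hi] (illustrative helper)."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


# Hypothetical test problem: minimize (x - 3)^2 subject to g(x) = x - 1 <= 0.
f = lambda x: (x - 3.0) ** 2
g = lambda x: x - 1.0

for K in (1.0, 10.0, 100.0, 1000.0):
    x_K = minimize_1d(penalty(f, [g], [K]), -5.0, 5.0)
    # For this problem x_K = (3 + K) / (1 + K), which tends to the constrained optimum x* = 1.
    print(K, x_K)
```

For any finite K the minimizer of the penalty function slightly violates the constraint, which is consistent with the remark that the Kuhn-Tucker point is reached only in the limit of infinite weights.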

4.1.2 Multi-objective optimization


Multi-objective optimization is not a standard problem in engineering, but is quite common in economics ([14]). While in the mono-dimensional problem the concept of optimum as a minimum is quite clear and well defined (the idea of greater or lesser is intuitive for real numbers), with multi-objective (also called multi-criteria) optimization the concept of minimum is less intuitive. So we must define some order relations among the points of a multi-dimensional space.
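Notation 4.18. Given $x, y \in \mathbb{R}^n$, define

\[
\begin{aligned}
x = y \quad &\text{iff} \quad x_k = y_k \quad \forall k = 1, 2, \dots, n \\
x \leqq y \quad &\text{iff} \quad x_k \le y_k \quad \forall k = 1, 2, \dots, n \\
x \le y \quad &\text{iff} \quad x \leqq y \text{ and } x \ne y \ (\text{so } \exists k : x_k < y_k) \\
x < y \quad &\text{iff} \quad x_k < y_k \quad \forall k = 1, 2, \dots, n
\end{aligned}
\]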

Notation 4.19. In the following section, the function $f$ is defined as $f : X \to Y$, $X \subseteq \mathbb{R}^p$, $Y \subseteq \mathbb{R}^n$. $X$ is called the decision space, while $Y$ is called the criteria space.

Given two outcomes $y^1, y^2$ of the cost functions, $y^1 = f(x^1)$ and $y^2 = f(x^2)$, we must define which is better: we indicate that $y^1$ is better than $y^2$ with $y^1 \succ y^2$, that $y^1$ is worse than $y^2$ with $y^1 \prec y^2$, and, finally, that $y^1$ is indifferent with respect to $y^2$ with $y^1 \sim y^2$.

In optimization theory the definition of Pareto point, or Pareto preference, is of great importance:

Definition 4.20 (Pareto preference). Given $y^1, y^2 \in Y$, the Pareto preference is defined by

\[ y^1 \succ y^2 \quad \text{iff} \quad y^1 \le y^2 . \]

A Pareto preference is intuitively guided by the relation "less is better".


Definition 4.21 (Non-Dominated and Dominated set). If $y^1 \succ y^2$ is a binary preference defined on $Y$, the non-dominated and the dominated sets with respect to $\{\succ\}$ are defined as:

\[ N(\{\succ\}, Y) = \{ y^0 \in Y \mid \nexists\, y \in Y : y \succ y^0 \} \]
\[ D(\{\succ\}, Y) = \{ y^0 \in Y \mid \exists\, y \in Y : y \succ y^0 \} \]

If $y^0 \in N(\{\succ\}, Y)$, $y^0$ is an N-point. Similarly, if $y^0 \in D(\{\succ\}, Y)$, $y^0$ is a D-point.

Definition 4.22 (Pareto optimum). $y^* \in Y$ is a Pareto optimum iff it is an N-point with respect to Pareto preference.
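
For a finite set of outcomes, the definitions above translate almost literally into code. The following is a minimal sketch; the five (delay, power) pairs are invented for illustration and do not come from any circuit in this work.

```python
def dominates(y1, y2):
    """Pareto preference of definition 4.20: y1 <= y2 component-wise, with y1 != y2."""
    return (all(a <= b for a, b in zip(y1, y2))
            and any(a < b for a, b in zip(y1, y2)))


def non_dominated(outcomes):
    """Return the N-points of a finite outcome set Y (definition 4.21)."""
    return [y0 for y0 in outcomes
            if not any(dominates(y, y0) for y in outcomes)]


# Hypothetical (delay, power) outcomes of five candidate designs.
Y = [(1.0, 9.0), (2.0, 5.0), (3.0, 6.0), (4.0, 2.0), (5.0, 3.0)]
print(non_dominated(Y))   # [(1.0, 9.0), (2.0, 5.0), (4.0, 2.0)]
```

The outcomes (3, 6) and (5, 3) are D-points, since (2, 5) and (4, 2) respectively are better in both criteria.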

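We now give two theorems that are fundamental for the solution of the multi-objective optimization problem; first we introduce the definition of convex cone in $\mathbb{R}^n$:
Notation 4.23 (convex cone).

\[
\begin{aligned}
\Lambda^{>} &= \{ d \in \mathbb{R}^n \mid d > 0 \} \\
\Lambda^{\ge} &= \{ d \in \mathbb{R}^n \mid d \ge 0 \} \\
\Lambda^{\geqq} &= \{ d \in \mathbb{R}^n \mid d \geqq 0 \}
\end{aligned}
\]

Theorem 4.24.
i) if $y^0 \in Y$ minimizes $\lambda \cdot y$ over $Y$ for some $\lambda \in \Lambda^{>}$, then $y^0$ is an N-point;
ii) if $y^0 \in Y$ uniquely minimizes $\lambda \cdot y$ over $Y$ for some $\lambda \in \Lambda^{\ge}$, then $y^0$ is an N-point.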

Corollary 4.24.1. If $Y$ is $\Lambda^{\geqq}$-convex, i.e. $Y + \Lambda^{\geqq}$ is a convex set, then a necessary condition for $y^0 \in Y$ to be an N-point is to minimize $\lambda \cdot y$ over $Y$ for some $\lambda \in \Lambda^{\ge}$.
This very important theorem (and its corollary) states that if $y^0$ minimizes a linear weighted function $\lambda \cdot y$ (for some $\lambda$), then $y^0$ is a Pareto optimum.

This reduces the problem from a multi-objective one to a mono-objective one, i.e. it is sufficient to minimize a linear weighted function of the cost functions.
Note that, along a line of constant $\lambda \cdot y$:

\[ \frac{\partial y_i}{\partial y_j} = - \frac{\lambda_j}{\lambda_i} \]

so the ratio $\lambda_j / \lambda_i$ is the trade-off that exchanges a unit gain in the variable $y_j$ with a unit gain in the variable $y_i$. Finally, note that the theorem is valid for any shape of $Y$.
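
The reduction to a mono-objective problem can be tried on a finite outcome set. This is only a sketch under assumed data: the outcome list and the weight vectors are hypothetical, and the function name is not from the thesis.

```python
def weighted_sum_minimizer(outcomes, weights):
    """Return the outcome that minimizes the linear weighted function lambda . y."""
    return min(outcomes,
               key=lambda y: sum(l * yi for l, yi in zip(weights, y)))


# Hypothetical (delay, power) outcomes; all the weights below are strictly positive.
Y = [(1.0, 9.0), (2.0, 5.0), (3.0, 6.0), (4.0, 2.0), (5.0, 3.0)]
for lam in [(1.0, 0.1), (1.0, 0.5), (0.1, 1.0)]:
    print(lam, weighted_sum_minimizer(Y, lam))
```

Each choice of weights selects a different non-dominated outcome: a large weight on the first criterion favours (1, 9), a large weight on the second favours (4, 2), and an intermediate choice gives (2, 5). The dominated outcomes are never selected.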
Theorem 4.25. A necessary and sufficient condition for $y^0 \in Y$ to be an N-point is that $\forall i = 1, 2, \dots, n$ there are $n - 1$ constants $\eta(i) = \{ h_j \mid j \ne i,\ j = 1, 2, \dots, n \}$ so that $y^0$ uniquely minimizes $y_i$ over $Y(\eta(i)) = \{ y \in Y \mid y_j \le h_j,\ j \ne i,\ j = 1, 2, \dots, n \}$.

Each constant $h_j$ can be seen as a constraint: so this theorem claims that a necessary and sufficient condition to be a Pareto optimum is to minimize one criterion (the $i$th objective function), while satisfying the constraints for the remaining criteria. This is equivalent to saying that the multiple-criteria problem can be reduced to a single-criterion problem (minimize the function $y_i$) with multiple constraints (ensure that $y_j \le h_j$, $j \ne i$).

4.1.2.1 Unconstrained
Given all the previous results, the solution of the unconstrained problem follows from the tools already introduced: we reduce the multi-objective problem to a mono-objective one. We will see in section 5.1 how to apply these methods and which one is preferred.

4.1.2.2 Constrained

Again, the solution is to reduce the complexity of the problem from a multi-objective to a mono-objective one. It is possible to combine the two previous methods, that is to minimize a linear weighted function plus a sum of penalty functions; the only critical point is to ensure that all the terms of the sum have the same order of magnitude, so that no single term dominates the sum. The third way to solve an unconstrained problem (or a constrained one, but with some care) is to use the method of the compromise solution:

Compromise solution  Given problem 4.3, it is possible to define $y^*$ as the ideal outcome of the cost function $f(x)$ without any constraints, so that $y^* = \inf_{x \in X} f(x)$; the compromise solution is defined as the minimum of the regret:

\[ r(y) = \| y - y^* \| ; \]

typically, the $L_p$ norm (the distance between the actual solution and the ideal point) is used:

\[ r(y) = r(y; p) = \left[ \sum_{i=1}^{n} | y_i - y_i^* |^p \right]^{\frac{1}{p}} \]

Again, a weight can be associated with each term of the sum:

\[ r(y; p, w) = \left[ \sum_{i=1}^{n} w_i^p\, | y_i - y_i^* |^p \right]^{\frac{1}{p}} \]

Definition 4.26 (Compromise solution). The compromise solution with respect to the $L_p$ norm is the $y^p \in Y$ that minimizes $r(y; p, w)$ over $Y$.
The compromise solution enjoys several properties, the most important of which is:
Property 4.27 (Pareto optimality). The compromise solution $y^p \in Y$ is an N-point, for $1 \le p < \infty$, with respect to Pareto preference (definition 4.20). If $y^\infty$ is unique, then it is also an N-point.
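
A small sketch of the compromise solution on a finite outcome set follows. It assumes that the ideal point is the component-wise minimum of the outcomes and that the weights enter the regret as $w_i^p$; the data and the function names are hypothetical.

```python
def regret(y, y_ideal, p, w=None):
    """Weighted L_p regret r(y; p, w) between an outcome y and the ideal point."""
    w = w or [1.0] * len(y)
    return sum((wi * abs(yi - yi_star)) ** p
               for wi, yi, yi_star in zip(w, y, y_ideal)) ** (1.0 / p)


def compromise_solution(outcomes, p, w=None):
    """Outcome of a finite set with the smallest regret from the ideal point."""
    # Assumed ideal point: the best value of each criterion taken separately.
    y_ideal = [min(y[i] for y in outcomes) for i in range(len(outcomes[0]))]
    return min(outcomes, key=lambda y: regret(y, y_ideal, p, w))


# Hypothetical (delay, power) outcomes; the ideal point is (1, 2).
Y = [(1.0, 9.0), (2.0, 5.0), (3.0, 6.0), (4.0, 2.0), (5.0, 3.0)]
print(compromise_solution(Y, p=1))                  # unweighted regret
print(compromise_solution(Y, p=1, w=[3.0, 1.0]))    # the first criterion weighs three times more
```

With equal weights the outcome (4, 2) has the smallest regret; weighting the first criterion more heavily moves the compromise to (2, 5). Both are N-points, in agreement with property 4.27.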


When the ideal point is not known, one can use an approximation or even a constraint; in the latter case the more appropriate term is satisfying level. To point out the differences between constraints and satisfying levels, one must observe:

- The constraints are, typically, inequality constraints: the solution must stay below the specified bounds, the further the better. In terms of an $L_p$ norm the solution must be as far as possible from the constraints, that is the $L_p$ norm must not be minimized. So the method of the penalty function is the only one suitable for this kind of problem.
- The satisfying levels are, typically, equality constraints: the solution must be as close as possible to the indicated levels, that is the $L_p$ norm must be minimized. So the method of the compromise solution can be used.

4.2 Optimization Algorithms

This is a very concise report of some algorithms used in the optimization of real circuits in the following chapters.
First some one-dimensional (with respect to the decision space) algorithms are reported, and then the multi-dimensional algorithms, some of which are based on the previous ones. Finally some non-standard algorithms are reported, since they can be suitable for application to digital circuits.
In the following report we focus on the algorithms that do not require the evaluation of the gradient of the objective functions, or that approximate this gradient (footnote 3), since (see section 5.1) the functions available in real circuits are not

known in a closed form and almost


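3 Essentially with

\[ \frac{\partial f}{\partial x_i}(x) \approx \frac{f(x + \Delta x) - f(x)}{\Delta x} \]

where the increment $\Delta x$ is applied to the component $x_i$ only.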
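
The approximation of footnote 3 can be sketched as follows. The test function and the increment `dx` are assumptions chosen only to show the idea; in a real circuit each call to `f` would be a simulation.

```python
def forward_difference_gradient(f, x, dx=1e-6):
    """Approximate the gradient of f at x with forward differences (footnote 3)."""
    fx = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += dx               # perturb only the i-th component
        grad.append((f(shifted) - fx) / dx)
    return grad


# Hypothetical test function with known gradient (2 * x0, 6 * x1).
f = lambda x: x[0] ** 2 + 3.0 * x[1] ** 2
print(forward_difference_gradient(f, [1.0, 2.0]))   # close to [2.0, 12.0]
```

The cost is n + 1 function evaluations per gradient, which is the reason why gradient-free methods are attractive when each evaluation is expensive.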


4.2.1 One-dimensional search techniques


In order to find the minimum of a function $f : \mathbb{R} \to \mathbb{R}$, we need to bracket it:

Definition 4.28 (Bracketing). To bracket a minimum means to find a triple $a, b, c \in \mathbb{R}$, $a < b < c$, such that $f(b) < f(a)$ and $f(b) < f(c)$. This means that, for a continuous $f$, a minimum lies in the interval $(a, c)$.
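
One simple way to obtain such a triple is to walk downhill with growing steps until the function rises again. The sketch below is an assumed procedure, not the one used in this work; the growth factor and the test function are arbitrary, and in the case of exact ties the inequality $f(b) < f(a)$ may hold only as $f(b) \le f(a)$.

```python
def bracket_minimum(f, a, b, grow=2.0, max_steps=60):
    """Search for a triple (a, b, c), a < b < c, with f(b) below f(a) and f(c)."""
    fa, fb = f(a), f(b)
    if fb > fa:                       # swap so that the walk from a to b goes downhill
        a, b, fa, fb = b, a, fb, fa
    step = b - a
    for _ in range(max_steps):
        step *= grow                  # enlarge the step at every trial
        c = b + step
        fc = f(c)
        if fc > fb:                   # f rises again: the triple brackets a minimum
            return (a, b, c) if a < c else (c, b, a)
        a, b, fa, fb = b, c, fb, fc   # keep walking downhill
    raise RuntimeError("no bracketing triple found")


f = lambda x: (x - 5.3) ** 2          # hypothetical function with its minimum at x = 5.3
print(bracket_minimum(f, 0.0, 1.0))   # (3.0, 7.0, 15.0)
```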

We show some algorithms that are among the most efficient in this field. First we introduce the family of sectioning algorithms, among which the golden section search is probably the most suitable for our uses. Then we introduce Brent's rule, which is based on a quadratic interpolation algorithm.

4.2.1.1 The section search

The sectioning algorithms always apply the same policy: divide and conquer. The initial interval $[a, c]$ is reduced at each iteration to a smaller interval that still brackets the minimum $x^*$. We thus have a series of nested intervals (see figure 4.1)

\[ x^* \in [a_n, c_n] \subset [a_{n-1}, c_{n-1}] \subset \dots \subset [a, c]. \]

Dichotomic search  The simplest form of sectioning is the dichotomic search: at the first iteration the interval $[a, c]$ is divided in two equal parts, $[a, b]$ and $[b, c]$, so that $b = \frac{a + c}{2}$; then, choosing $\varepsilon > 0$, we check if $f(b - \varepsilon) > f(b + \varepsilon)$. In that case $f$ is decreasing across $b$, the minimum lies in the right half, and we repeat the whole process with the new interval $[b, c]$; otherwise we repeat with $[a, b]$. It can be proved ([13]) that this method requires $2k$ evaluations of the function $f$, where $k$ is the number of iterations. Also the final interval length $I_k = (c_k - a_k)$ is

\[ \lim_{k \to \infty} I_k = \varepsilon I_0 , \qquad \text{where } I_0 = (c - a). \]

So the relative uncertainty on the minimum $x^*$ is $\varepsilon$.
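
A minimal sketch of the dichotomic search follows. As an assumption of this sketch, `eps` is an absolute distance and the two probes are placed `eps / 2` on either side of the midpoint, with the kept interval extended up to the far probe so that the minimum is never discarded; the interval lengths then obey $2 I_{k+1} = I_k + \varepsilon$ and tend to `eps`.

```python
def dichotomic_search(f, a, c, eps=1e-4, iterations=40):
    """Halve [a, c] at each iteration by probing f on both sides of the midpoint."""
    evaluations = 0
    for _ in range(iterations):
        b = 0.5 * (a + c)
        left = f(b - 0.5 * eps)
        right = f(b + 0.5 * eps)
        evaluations += 2              # two function evaluations per iteration
        if left > right:
            a = b - 0.5 * eps         # f decreases across b: keep the right part
        else:
            c = b + 0.5 * eps         # otherwise keep the left part
    return a, c, evaluations


f = lambda x: (x - 2.0) ** 2          # hypothetical unimodal function, minimum at x = 2
a, c, n_eval = dichotomic_search(f, 0.0, 10.0)
print(a, c, n_eval)                   # an interval of length close to eps around x = 2
```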

Fig. 4.1: Section search algorithm (nested intervals $I_0$, $I_1$, $I_2$, with $b_0 = a_1$, $b_1 = c_2$ and $c_0 = c_1$)


Fibonacci Search  A more sophisticated algorithm is the Fibonacci search, where at each iteration the length of the interval is chosen according to the Fibonacci rule: $I_{k-3} = I_{k-2} + I_{k-1}$. This method has the advantage that the uncertainty after $n$ iterations is known a priori: defining the initial interval $I_0 = I_1 = (c - a)$, then

\[ I_k = \frac{I_1 + \varepsilon f_{k-2}}{f_k} \]

where $f_i$ is the $i$th number of the Fibonacci sequence.
The number of function evaluations is again $2k$, and the disadvantages of this method are that $\varepsilon$ and $n$ must be chosen a priori.

The golden section search  Given a triplet $(a, b, c)$ that brackets the minimum, we choose a new point $x$ that defines a new bracketing triplet $(a, x, b)$ or $(b, x, c)$ according to the rule:

\[ \frac{x - b}{c - a} = 1 - 2\, \frac{b - a}{c - a} \]

This implies that $|b - a| = |x - c|$, and that at each iteration the interval is scaled by the same ratio $\tau$.
Then we repeat the process with the new triplet. So the interval $(a, c)$ is divided in two parts, a smaller and a larger one, and the ratio between the whole interval and the larger part is the same as that between the larger and the smaller part, or in other words:

\[ \frac{1}{\tau} = \frac{\tau}{1 - \tau} , \]

giving for the positive solution

\[ \tau = \frac{\sqrt{5} - 1}{2} . \]

This fraction is known as the golden mean or golden section, whose aesthetic properties were already known to the ancient Pythagoreans.
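
The golden section search can be sketched in a few lines. This version works on an interval $[a, c]$ with two interior points instead of an explicit triplet, which is an equivalent bookkeeping chosen for brevity; the tolerance and the test function are assumptions.

```python
import math

TAU = (math.sqrt(5.0) - 1.0) / 2.0      # golden section, about 0.618


def golden_section_search(f, a, c, tol=1e-8):
    """Shrink [a, c] by the factor TAU per iteration, reusing one interior point each time."""
    x1 = c - TAU * (c - a)
    x2 = a + TAU * (c - a)
    f1, f2 = f(x1), f(x2)
    while c - a > tol:
        if f1 < f2:                     # the minimum lies in [a, x2]
            c, x2, f2 = x2, x1, f1
            x1 = c - TAU * (c - a)
            f1 = f(x1)
        else:                           # the minimum lies in [x1, c]
            a, x1, f1 = x1, x2, f2
            x2 = a + TAU * (c - a)
            f2 = f(x2)
    return 0.5 * (a + c)


f = lambda x: (x - 2.0) ** 2            # hypothetical unimodal function
print(golden_section_search(f, 0.0, 10.0))   # close to 2.0
```

Only one new function evaluation is needed per iteration, because one of the two interior points of the old interval is also an interior point of the new one; this is exactly the property expressed by the equation $\tau^2 + \tau - 1 = 0$.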

Convergence considerations  All three previous methods have a linear convergence, since at each iteration the ratio between the new smaller interval and the previous interval containing $x^*$ satisfies:

\[ 0 < \frac{I_{k+1}}{I_k} < 1 . \]

The asymptotic convergence rate is defined as

\[ \lim_{k \to \infty} \frac{I_{k+1}}{I_k} . \]

For the dichotomic search, since $2 I_{k+1} = I_k + \varepsilon$, taking $\varepsilon = 0$ we have

\[ \lim_{k \to \infty} \frac{I_{k+1}}{I_k} = \frac{1}{2} . \]

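For the Fibonacci search, first we must write the generic number of the Fibonacci sequence in a closed form:

\[ f_k = \frac{1}{\sqrt{5}} \left[ \left( \frac{1 + \sqrt{5}}{2} \right)^{k+1} - \left( \frac{1 - \sqrt{5}}{2} \right)^{k+1} \right] \]

then it can be proved that, taking $\varepsilon = 0$ (so that $I_k = I_1 / f_k$):

\[ \lim_{k \to \infty} \frac{I_{k+1}}{I_k} = \lim_{k \to \infty} \frac{f_k}{f_{k+1}} = \frac{\sqrt{5} - 1}{2} \]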

For the golden section search, as previously said, $\frac{I_{k+1}}{I_k} = \tau$, so

\[ \lim_{k \to \infty} \frac{I_{k+1}}{I_k} = \tau = \frac{\sqrt{5} - 1}{2} . \]

Thus the convergence rates of the Fibonacci and the golden-section search are identical.
4.2.1.2 Parabolic interpolation

Given a triplet $(a, b, c)$ that brackets a minimum, we approximate the objective function in the interval $(a, c)$ with the parabola fitting the triplet. Then we find the minimum of this parabola with the formula (since we want the abscissa, the method is indeed an inverse parabolic interpolation):

\[ x = b - \frac{1}{2}\, \frac{ (b - a)^2 [f(b) - f(c)] - (b - c)^2 [f(b) - f(a)] }{ (b - a) [f(b) - f(c)] - (b - c) [f(b) - f(a)] } \]

This method is useful only when the function is quite smooth in the interval, but it has the advantage that the convergence is faster than linear, and the minimum is found in a single step when the function to be optimized is a quadratic form.
Brent's rule  Brent's rule is a mix of the last two techniques: it uses the golden section when the function is not regular and switches to a parabolic interpolation when the function is sufficiently regular. In particular, it always tries a parabolic step first. When the parabolic step is not useful, the method falls back on the golden section search.
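
The parabolic step of section 4.2.1.2 is a single formula, sketched below. The guard against a vanishing denominator and the test parabola are assumptions added for illustration; a complete implementation of Brent's rule needs further acceptance tests that are not shown here.

```python
def parabolic_step(a, b, c, fa, fb, fc):
    """Abscissa of the vertex of the parabola through (a, fa), (b, fb), (c, fc)."""
    num = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    if den == 0.0:
        raise ZeroDivisionError("the three points are collinear")
    return b - 0.5 * num / den


f = lambda x: 2.0 * (x - 1.5) ** 2 + 4.0     # hypothetical quadratic, minimum at x = 1.5
print(parabolic_step(0.0, 1.0, 3.0, f(0.0), f(1.0), f(3.0)))   # 1.5
```

For a quadratic function the vertex is found exactly in one step, whatever the bracketing triplet.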

4.2.2 Multi-dimensional search

These algorithms search for the solution of the optimization problem in a multi-dimensional space. Again, first an algorithm with a convergence order of 1 is presented, then an algorithm with a quadratic order of convergence is shown.
All the algorithms presented here contain a sub-algorithm that is a one-dimensional search.

4.2.2.1 The gradient direction: steepest (maximum) descent

The method of the steepest descent chooses at each iteration a new point $x + dx$ in the decision space from the old point $x$, obviously such that:

\[ f(x + dx) < f(x) \]

This new point must also be chosen such that the variation of the function $f$ is as large as possible. In other words, if $dl$ is the length of the step:

\[ dl = \sqrt{ \sum_{i=1}^{n} (dx_i)^2 } , \]

the steepest descent maximizes the magnitude of the rate of change $df/dl$ in the direction along which $f$ decreases.


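The problem of minimizing $f$ thus becomes the problem:
Problem 4.29 (Steepest descent).

\[ \max_{\frac{dx_i}{dl},\ i = 1, \dots, n} \left| \frac{df}{dl} \right| = \max_{\frac{dx_i}{dl},\ i = 1, \dots, n} \left| \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \frac{dx_i}{dl} \right| , \]

such that

\[ dl = \sqrt{ \sum_{i=1}^{n} (dx_i)^2 } . \]

This problem can be solved with the Lagrange multipliers; from equations (4.3) and (4.4) (page 52) we can write:

\[ \frac{dx_i}{dl} = - \frac{1}{2 \lambda} \frac{\partial f}{\partial x_i} , \]

with

\[ \lambda = \frac{1}{2} \left[ \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 \right]^{\frac{1}{2}} , \]

where the positive root is the one along which $f$ decreases. This means:

\[ \frac{dx_i}{dl}(x) = - \frac{ \dfrac{\partial f}{\partial x_i}(x) }{ \left[ \displaystyle\sum_{i=1}^{n} \left( \dfrac{\partial f}{\partial x_i}(x) \right)^2 \right]^{\frac{1}{2}} } \tag{4.9} \]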

The steepest descent algorithm chooses at each iteration a new point $x^{k+1}$ from the old point $x^k$ using equation (4.9) (page 64):

\[ x^{k+1} = x^k - dl\, \nabla f(x^k), \qquad dl > 0 \]

with $dl$ chosen according to the desired convergence rate: if $dl$ is small the algorithm will closely approximate the minimum, with slow convergence, while if $dl$ is large the convergence is fast but the algorithm can oscillate near the minimum. Thus some method is necessary to reduce (or enlarge) the step $dl$ at each iteration: large steps if we are far away from the minimum, small steps if we are close to it. The scheme used to choose the step can greatly affect the convergence of the algorithm. The best choice is the method of the optimal gradient.
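
The influence of the step $dl$ can be observed with a constant-step sketch. The quadratic test function and the three step values are assumptions chosen to show the slow, the oscillating and the diverging behaviour.

```python
def steepest_descent(grad, x0, dl=0.1, iterations=100):
    """Iterate x^{k+1} = x^k - dl * grad f(x^k) with a constant step dl > 0."""
    x = list(x0)
    for _ in range(iterations):
        g = grad(x)
        x = [xi - dl * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical quadratic f(x) = x0^2 + 10 * x1^2, with its minimum at the origin.
grad = lambda x: [2.0 * x[0], 20.0 * x[1]]

print(steepest_descent(grad, [1.0, 1.0], dl=0.01))   # small step: x0 is still far from 0
print(steepest_descent(grad, [1.0, 1.0], dl=0.09))   # larger step: x1 changes sign at every step
print(steepest_descent(grad, [1.0, 1.0], dl=0.11))   # step too large: x1 diverges
```

With a constant step each component is multiplied by a fixed factor at every iteration (here $1 - 2\,dl$ and $1 - 20\,dl$), so a single value of $dl$ cannot suit directions with very different curvatures.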


4.2.2.2 The optimal gradient

This algorithm simply calculates the step $dl$ according to:

\[ \min_{dl \in \mathbb{R}^+} f(x^k - dl\, \nabla f(x^k)) \]

This is a one-dimensional optimization and it is usually performed with one of the methods shown previously. Strictly speaking, the optimization of $f$ is still a multi-dimensional one, since we descend along the gradient path, but inside this process there are many sub-optimization steps that find the optimal length of each descent.
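If $f \in C^2$, that is $f$ is twice differentiable and its derivatives are continuous, then a closed form for the optimum step $dl$ can be determined; we expand $f$ in Taylor series:

\[ f(x^k + \Delta x) \approx f(x^k) + \nabla f(x^k)^T \Delta x + \frac{1}{2}\, \Delta x^T H(x^k)\, \Delta x , \]

where $H(x)$ is the Hessian matrix of $f$ (footnote 4).
Along the gradient direction:

\[ \Delta x = - dl^k\, \nabla f(x^k) . \]

Thus:

\[ f(x^k - dl^k \nabla f(x^k)) \approx f(x^k) - dl^k\, \nabla f(x^k)^T \nabla f(x^k) + \frac{1}{2} (dl^k)^2\, \nabla f(x^k)^T H(x^k)\, \nabla f(x^k) \]

\[ \frac{df}{d\,dl^k} = - \nabla f(x^k)^T \nabla f(x^k) + dl^k\, \nabla f(x^k)^T H(x^k)\, \nabla f(x^k) = 0 \]

and

\[ dl^k = \frac{ \nabla f(x^k)^T \nabla f(x^k) }{ \nabla f(x^k)^T H(x^k)\, \nabla f(x^k) } \tag{4.11} \]

From $\frac{df}{d\,dl^k}(x^{k+1}) = 0$, we can see that:

\[ \langle \nabla f(x^k - dl^k \nabla f(x^k)),\ \nabla f(x^k) \rangle = 0 , \]

that is $\nabla f(x^k)$ and $\nabla f(x^{k+1})$ are orthogonal, or, equivalently, $\Delta x^k$ and $\Delta x^{k+1}$ are orthogonal. This means that successive steps of the optimal gradient algorithm are orthogonal.

4 The Hessian matrix of a function $f(x_1, x_2, \dots, x_n)$ is defined as:

\[ H(f) = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{pmatrix} \tag{4.10} \]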
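
For a quadratic form the closed-form step and the orthogonality of successive gradients can be checked numerically. The sketch below assumes $f(x) = \frac{1}{2} x^T A x - b^T x$, whose gradient is $A x - b$ and whose Hessian is $A$; the matrix, the vector and the helper names are hypothetical.

```python
def mat_vec(A, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def optimal_gradient(A, b, x0, iterations=200):
    """Minimize f(x) = x.A.x / 2 - b.x with the step dl = (g.g) / (g.A.g)."""
    x = list(x0)
    gradients = []
    for _ in range(iterations):
        g = [ax - bi for ax, bi in zip(mat_vec(A, x), b)]   # gradient A x - b
        gg = dot(g, g)
        if gg < 1e-24:                 # halting criterion on the squared gradient norm
            break
        gradients.append(g)
        dl = gg / dot(g, mat_vec(A, g))                    # the Hessian of a quadratic form is A
        x = [xi - dl * gi for xi, gi in zip(x, g)]
    return x, gradients


A = [[2.0, 0.0], [0.0, 20.0]]          # hypothetical positive-definite Hessian
b = [2.0, 20.0]                        # the minimum is at (1, 1)
x, grads = optimal_gradient(A, b, [0.0, 0.0])
print(x)
print(dot(grads[0], grads[1]))         # successive gradients are numerically orthogonal
```

The zig-zag path produced by the orthogonal steps explains why many iterations are needed even on a simple two-dimensional quadratic, although each step is optimal along its own direction.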

Convergence considerations  A general descent algorithm converges if:

\[ \lim_{k \to \infty} \nabla f(x^k) = 0 . \]

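Property 4.30. The function $f$ monotonically decreases along the (negative) gradient path.

Proof. From equation (4.9)

\[ \frac{df}{dl} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \frac{dx_i}{dl} = - \frac{ \displaystyle\sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_i} }{ \left[ \displaystyle\sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 \right]^{\frac{1}{2}} } = - \left[ \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 \right]^{\frac{1}{2}} \tag{4.12} \]

Thus $\frac{df}{dl} \le 0$, or the function $f$ decreases along the path $dl$.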

Lemma 4.31. The convergence of a descent method along the gradient path cannot be obtained in a finite number of steps.

Proof. From equation (4.12) (page 66)

\[ \frac{df}{dl} = - \left[ \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 \right]^{\frac{1}{2}} \]

but when $x$ approaches the optimum $x^*$, then

\[ \lim_{x \to x^*} \frac{\partial f}{\partial x_i}(x) = 0 \]

so that

\[ \lim_{x \to x^*} \frac{df}{dl}(x) = 0 \]

meaning that the optimum is reached with a rate of convergence that decreases.

For the optimal gradient method the convergence is only linear (footnote 5) in $f(x^k)$ and a halting criterion for the algorithm could be:

\[ f(x^k) - f(x^{k+1}) \le \varepsilon ; \]

alternatively, from the necessary condition $\nabla f(x) = 0$:

\[ \max_i \left| \frac{\partial f}{\partial x_i}(x^k) \right| \le \varepsilon \qquad \text{or} \qquad \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i}(x^k) \right)^2 \le \varepsilon \]

Finally note that these methods, since they use only local gradient information, find only a local minimum, and that the gradient algorithms are rather inefficient in the proximity of the optimum, due to the small step size.

4.2.3 The conjugate direction method

Let $u, v \in X \subseteq \mathbb{R}^n$. They are said to be mutually orthogonal if $u^T v = 0$. Similarly they are said to be mutually conjugate with respect to a matrix $A$ if $u^T A v = 0$.


5 This means that $\lim_{k \to \infty} \frac{f(x^{k+1})}{f(x^k)} = a$, with $0 < a < 1$

Property 4.32. A set of $n$ mutually conjugate vectors in $X \subseteq \mathbb{R}^n$ constitutes a basis for $X$.

The importance of a set of mutually conjugate vectors is stated by the following theorem:
Theorem 4.33. Every descent method of optimization using mutually conjugate directions is quadratically convergent.
The concept of conjugate directions is important since, intuitively, a minimization attained along one of these directions does not perturb the minimization along the other directions.
4.2.3.1 The Fletcher-Reeves conjugate gradient algorithm
This algorithm calculates the mutually conjugate directions of search with respect to the Hessian matrix of $f$ directly from the function evaluation and the gradient evaluation, but without the direct evaluation of the Hessian of the function $f$.
Algorithm 4.34. Fletcher-Reeves conjugate gradient algorithm
Require: $x^0$ = starting point
repeat
    Compute $\nabla f(x^0)$ and $h^0 = -\nabla f(x^0)$
    for $i = 1, \dots, n$ do
        Replace $x^i = x^{i-1} + \alpha^{i-1} h^{i-1}$, where $\alpha^{i-1}$ minimizes $f(x^{i-1} + \alpha^{i-1} h^{i-1})$
        Compute $\nabla f(x^i)$
        if $i < n$ then
            $h^i = -\nabla f(x^i) + \dfrac{\| \nabla f(x^i) \|^2}{\| \nabla f(x^{i-1}) \|^2}\, h^{i-1}$
        end if
    end for
    $x^0 = x^n$
until halting criterion

The quantity $\dfrac{\| \nabla f(x^i) \|^2}{\| \nabla f(x^{i-1}) \|^2}\, h^{i-1}$ is added to the (negative) gradient at each iteration, and when $f$ is a positive-definite quadratic form, this results in a set of mutually conjugate vectors.
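
One sweep of the algorithm can be sketched for a quadratic form, where the line minimization has a closed form. As assumptions of this sketch, $f(x) = \frac{1}{2} x^T A x - b^T x$, the exact step is $\alpha = -(g \cdot h)/(h \cdot A h)$, and the matrix and vector are invented; for a general function the step would come from a one-dimensional search.

```python
def mat_vec(A, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def fletcher_reeves_sweep(A, b, x0):
    """One sweep (n line searches) of algorithm 4.34 on f(x) = x.A.x / 2 - b.x."""
    n = len(x0)
    x = list(x0)
    g = [ax - bi for ax, bi in zip(mat_vec(A, x), b)]       # gradient A x - b
    h = [-gi for gi in g]                                   # first direction: steepest descent
    directions = []
    for i in range(1, n + 1):
        if dot(g, g) == 0.0:                                # already at the minimum
            break
        directions.append(h)
        alpha = -dot(g, h) / dot(h, mat_vec(A, h))          # exact line minimum for a quadratic
        x = [xi + alpha * hi for xi, hi in zip(x, h)]
        g_new = [ax - bi for ax, bi in zip(mat_vec(A, x), b)]
        if i < n:
            beta = dot(g_new, g_new) / dot(g, g)            # Fletcher-Reeves ratio
            h = [-gn + beta * hi for gn, hi in zip(g_new, h)]
        g = g_new
    return x, directions


A = [[4.0, 1.0], [1.0, 3.0]]           # hypothetical positive-definite matrix
b = [1.0, 2.0]                         # the minimum is at (1/11, 7/11)
x, dirs = fletcher_reeves_sweep(A, b, [2.0, 1.0])
print(x)
print(dot(dirs[0], mat_vec(A, dirs[1])))   # the two directions are conjugate with respect to A
```

On this two-dimensional quadratic the minimum is reached after two line searches, in contrast with the many orthogonal steps needed by the optimal gradient method.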


4.2.3.2 The Powell conjugate gradient algorithm

Since the generation of the conjugate directions in the Fletcher-Reeves algorithm requires the computation of $\nabla f(x)$ at each iteration, and this computation is not always feasible, Powell ([15]) has developed a method to generate the conjugate directions using only one-dimensional searches at each iteration: if $x^1, x^2$ are two points found by one-dimensional searches in the same direction $v$, but from different starting points, then $x^1 - x^2$ is mutually conjugate to $v$.

Algorithm 4.35. Powell conjugate gradient algorithm
Require: $\{ h_i,\ i = 1, \dots, n \} \subset X \subseteq \mathbb{R}^n$ = a set of linearly independent vectors in $X$, and $x^0$ = starting point
 1: repeat
 2:   for $i = 1, \dots, n$ do
 3:     Replace $x^i = x^{i-1} + \alpha_i h_i$, where $\alpha_i$ minimizes $f(x^{i-1} + \alpha_i h_i)$
 4:   end for
 5:   for $i = 1, \dots, n - 1$ do $h_i = h_{i+1}$
 6:   end for
 7:   $h_n = x^n - x^0$
 8:   Find $\alpha_n$ that minimizes $f(x^n + \alpha_n (x^n - x^0))$
 9:   Replace $x^0 = x^n + \alpha_n (x^n - x^0)$
10: until halting criterion


The Powell algorithm is equivalent to a sequence of one-dimensional searches along mutually conjugate directions. The only critical point of the Powell algorithm is line 7 of algorithm 4.35: replacing the $n$th direction $h_n$ with the vector $x^n - x^0$ tends to produce at each iteration a set of directions that are more and more linearly dependent. The solution is to reinitialize the set of directions $h$ every $n$ iterations; these directions can be the columns of any orthogonal matrix, and there is also a heuristic scheme due to Powell.
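
The scheme, including the periodic reinitialization of the directions, can be sketched as follows. Several choices here are assumptions of the sketch and not of the thesis: the line minimization is a golden-section search restricted to a fixed range of the step, the directions are reset to the coordinate axes, and the halting criterion is simply a fixed number of sweeps.

```python
import math


def line_minimum(f, x, h, span=10.0, tol=1e-10):
    """Golden-section search for the step in [-span, span] that minimizes f(x + step * h)."""
    tau = (math.sqrt(5.0) - 1.0) / 2.0
    phi = lambda step: f([xi + step * di for xi, di in zip(x, h)])
    left, right = -span, span
    s1, s2 = right - tau * (right - left), left + tau * (right - left)
    f1, f2 = phi(s1), phi(s2)
    while right - left > tol:
        if f1 < f2:
            right, s2, f2 = s2, s1, f1
            s1 = right - tau * (right - left)
            f1 = phi(s1)
        else:
            left, s1, f1 = s1, s2, f2
            s2 = left + tau * (right - left)
            f2 = phi(s2)
    return 0.5 * (left + right)


def powell(f, x0, sweeps=5):
    """Powell scheme: n iterations per sweep, directions reset at the start of every sweep."""
    n = len(x0)
    x_start = list(x0)
    for _ in range(sweeps):
        directions = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
        for _ in range(n):
            x = list(x_start)
            for h in directions:                      # one-dimensional search along each direction
                step = line_minimum(f, x, h)
                x = [xi + step * hi for xi, hi in zip(x, h)]
            new_dir = [xn - xs for xn, xs in zip(x, x_start)]
            if all(d == 0.0 for d in new_dir):        # no movement at all: nothing left to do
                return x
            directions = directions[1:] + [new_dir]   # drop the oldest direction, append x^n - x^0
            step = line_minimum(f, x, new_dir)
            x_start = [xi + step * di for xi, di in zip(x, new_dir)]
    return x_start


# Hypothetical coupled quadratic with its minimum at (1, 2).
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2 + 0.5 * (x[0] - 1.0) * (x[1] - 2.0)
print(powell(f, [0.0, 0.0]))
```

No gradient is ever evaluated: the only information used is the value of the function along lines, which is what makes the scheme attractive when the function is the output of a circuit simulation.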
Figure 4.2 shows 20 iterations of the Powell algorithm to find the minimum (located at x = 30) of a one-dimensional function of order $x^4$. As can be seen, the algorithm finds the minimum at x = 31.7 and is not fooled by the presence of a local minimum at x = 10. Figure 4.3 shows 24 iterations to find the minimum (located at x = 15) of a more complicated function of order $x^6$: again the algorithm finds the global minimum at x = 13.7 in the presence of local minima.

In both cases a better precision on the location of the minimum could be obtained by increasing the number of iterations.

Fig. 4.2: Minimization by Powell algorithm of a function $\sim x^4$: 20 steps.

Fig. 4.3: Minimization by Powell algorithm of a function $\sim x^6$: 24 steps.

4.2.4 The SLOP algorithm

The SLOP algorithm ([16]) is a simple algorithm, suitable for the minimization of a particular function of a digital circuit, the delay. It is feasible for small circuits, since it has no heuristics in reaching the minimum, and it also stops at the first minimum it finds.
The idea behind the algorithm is simple: start from a given point $x^0 < x^*$, then at each iteration increment a single component of $x^0$ by a defined step. For each increment track the decrease of the objective function, then keep only the increment that gives the largest decrease. Finally, use the incremented point as the new starting point.
Clearly, this algorithm works only if the starting point is $x^0 < x^*$ (see notation 4.18), so that an increment in one component moves the function $f$ towards the minimum. Also, the algorithm stops at the first minimum encountered.
The same function of figure 4.2 is shown in figure 4.4, minimized by the SLOP algorithm. It is possible to see that the SLOP algorithm stops as soon as it encounters the first minimum (a local minimum) at x = 10.
Algorithm 4.36. SLOP algorithm
Require: $x^0 < x^*$ starting point
repeat
    for $i = 1, \dots, n$ do
        Compute $f_{old} = f(x^0)$
        Replace $x_i = x_i + \Delta x$
        Compute $f(x^0)$ and $h_i = f_{old} - f$
        Replace $x_i = x_i - \Delta x$
    end for
    Search the index $i_{max}$ that corresponds to the maximum of $h_i$: $i_{max} = \{ i \in \{1, \dots, n\} \mid h_i = \max_i h_i > 0 \}$
    Replace $x_{i_{max}} = x_{i_{max}} + \Delta x$
until halting criterion
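
Algorithm 4.36 can be sketched directly. The halting criterion used here (stop when no increment lowers the function, or after a maximum number of iterations) and the test function are assumptions of the sketch.

```python
def slop(f, x0, dx=0.1, max_iterations=1000):
    """Greedy search in the spirit of algorithm 4.36: one component is incremented per iteration."""
    x = list(x0)
    for _ in range(max_iterations):
        f_old = f(x)
        gains = []
        for i in range(len(x)):
            trial = list(x)
            trial[i] += dx                      # trial increment of the i-th component
            gains.append(f_old - f(trial))      # h_i: decrease obtained with this increment
        best = max(range(len(x)), key=lambda i: gains[i])
        if gains[best] <= 0.0:                  # no increment lowers f: first minimum reached
            break
        x[best] += dx                           # keep only the most useful increment
    return x


# Hypothetical objective with its minimum at (1, 2); the start is below it in every component.
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2
print(slop(f, [0.0, 0.0]))                      # within one step dx of (1, 2)
```

Each iteration costs n + 1 function evaluations and moves by a single fixed step, which explains both the simplicity and the slowness of the method.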

Fig. 4.4: Minimization by SLOP algorithm of a function $\sim x^4$: 180 steps.

4.2.5 The simulated-annealing algorithm

The name of this algorithm comes from an analogy with thermodynamics: it is known that if a slow cooling is applied to a liquid, this liquid freezes naturally to a state of minimum energy. This process is called annealing.
The numerical algorithm applies this analogy to the minimization of a function: first go downhill towards a minimum as far as possible, then go slightly uphill, since the minimum just found could be a local minimum, then go downhill again, and so on. In thermodynamics, the probability to go from a state with energy $E_1$ to a state of energy $E_2$ is given by:

\[ p = e^{\frac{(E_1 - E_2)}{kT}} \]

where $k$ is the Boltzmann constant and $T$ is the temperature of the system.

In order to apply this scheme to a function minimization, it is necessary to define the energy of the system (i.e. the objective function), the temperature of the system, and an annealing schedule (i.e. the scheduled number of annealing iterations): at each iteration the temperature defines a random fluctuation in the minimum found, to simulate the thermal fluctuations of the atoms. Also, at each iteration the temperature is decreased, to reduce the thermal fluctuations and thus converge to the global minimum.
The rate at which the temperature is decreased influences the rate of convergence (the faster the cooling, the faster the convergence), but also influences the quality of the minimum (the slower the cooling, the higher the probability of converging to a global minimum).
As an example, a possible annealing schedule (probably the simplest) would be: after $k$ steps, reduce the temperature $T$ to $T = (1 - \varepsilon) T$, where $\varepsilon$ is determined by experiment.

Fig. 4.5: Minimization by Simulated-annealing algorithm of a function $\sim x^4$: 130 steps.

The same functions of figures 4.2 and 4.3 are shown in figures 4.5 and 4.6, minimized by the simulated annealing algorithm. As in the case of the Powell algorithm, the simulated annealing is not fooled by the presence of local minima, but the number of iterations is greater for both functions: 130 in the first case, 200 in the second one.
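
A one-dimensional sketch of the procedure is given below. All the numerical parameters (initial temperature, cooling factor, number of moves per temperature, size of the random move) and the double-well test function are assumptions; the Boltzmann constant is absorbed in the temperature.

```python
import math
import random


def simulated_annealing(f, x0, temperature=10.0, eps=0.05, steps_per_temperature=50,
                        temperature_min=1e-3, step=1.0, seed=0):
    """Random moves accepted with the Boltzmann probability, with schedule T <- (1 - eps) T."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    while temperature > temperature_min:
        for _ in range(steps_per_temperature):
            candidate = x + rng.uniform(-step, step)       # random fluctuation
            fc = f(candidate)
            # Downhill moves are always accepted, uphill moves with probability exp((fx - fc) / T).
            if fc <= fx or rng.random() < math.exp((fx - fc) / temperature):
                x, fx = candidate, fc
                if fx < best_f:
                    best_x, best_f = x, fx
        temperature *= (1.0 - eps)                          # annealing schedule
    return best_x, best_f


# Hypothetical double-well function: local minimum near x = 2.7, global minimum near x = -2.9.
f = lambda x: x ** 4 - 16.0 * x ** 2 + 5.0 * x
print(simulated_annealing(f, 3.0))
```

The result depends on the random sequence and on the schedule: a faster cooling (larger `eps`) saves function evaluations but makes it more likely that the search remains in the well where it started.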

Fig. 4.6: Minimization by Simulated-annealing algorithm of a function $\sim x^6$: 200 steps.

4.3 Conclusions
After all this mathematical theory, some words must be spent about the choice of the algorithm to use.
The characteristics of each algorithm are summarized in table 4.1: this table indicates several characteristics that can be useful for the real implementation of a circuit optimizer.
In the same manner the previous sections illustrate all the basic theory, useful to justify some choices made in the implementation of the optimizer.

Tab. 4.1: Optimization algorithms

Algorithm                 Pro                                   Con
------------------------  ------------------------------------  ------------------------------------
Mono-dimensional
Section search            Simple implementation.                Converges to local minima.
Parabolic interpolation   Fast convergence.                     Has some pre-requirements.
SLOP                      The simplest implementation.          Converges to local minima.
                                                                Very slow.
Multi-dimensional
Conjugate directions      Good convergence.                     Requires gradient knowledge.
Powell scheme             Fast convergence. Does not require    Difficult implementation.
                          gradient knowledge. Is not trapped
                          by local minima.
Simulated annealing       Simple implementation. Does not       Very slow. Fragile with respect
                          require gradient knowledge. Is not    to some critical parameters.
                          trapped by local minima.

Chapter 5

CIRCUIT OPTIMIZATION

THE goal of the optimization step during a design flow is to obtain an optimized design from a given design. Figure 5.1 shows the various levels of possible optimization.

The optimization level we are concerned with is the innermost level, indicated here as dimension optimization. The optimization levels, that is the levels at which the designer can apply suitable techniques, are, briefly:

Fig. 5.1: Design flow

System optimization  This is the highest level of optimization: it concerns the optimization made on the user space or kernel space of the applications running in the system subject to the optimization process.

Behavioural optimization  At this level the optimization is made by choosing the best algorithm to implement the functions.
Logic optimization  This is the optimization made by mapping the given functions or algorithms (from a behavioural optimization) into boolean functions. It amounts to choosing the logic gates that implement these functions.
Dimension optimization  This is the lowest level of optimization: it is made by choosing the proper transistor dimensions in each gate that implements a logic function. This is the optimization on which the efforts of this thesis focus.
Section 5.1 shows the three kinds of target to be optimized in a real circuit: delay (section 5.1.1), power consumption (section 5.1.2) and area occupancy (section 5.1.3). In particular section 5.1.1.1 shows the delay obtained from Elmore's formula (chapter 2, page 15), while section 5.1.1.2 shows the delay as it is obtained by HSPICE and FAST (chapter 3, page 21).

Section 5.2 contains some applications of the mathematical results of chapter 4 (page 47): in particular section 5.2.2 shows the results of a mono-objective optimization, while section 5.2.3 shows the results of a multi-objective optimization. Some conclusions are drawn in section 5.3.

5.1 Optimization targets

There are, mainly, three target policies in optimizing real circuits: minimize the delay, minimize the power consumption and minimize the area occupancy. In some cases these policies can conflict with each other, as, for example, minimizing the delay generally increases the circuit area, while in some cases these policies can go together, as, for example, minimizing the power consumption may lead to a reduction of the area occupancy.
There is another policy that can be considered, especially in the field of sub-micron digital circuit design: noise reduction; however this requires a good noise model of the circuit, and at present there are few good ones.

Now we are going to analyze the three principal optimization policies, especially with regard to their compatibility with the optimization algorithms of chapter 4.

5.1.1 Circuit delay


Till now the generic word delay has been used, but now it is mandatory to better define the meaning of delay in a real circuit.
Generally the delay of a CMOS gate, or a CMOS circuit, is defined as the interval between the time when the input is at 50% of its peak value (indicated with $t_i$ in figure 5.2) and the time when the output is at 50% of its peak value (indicated with $t_o$ in the same figure).
Fig. 5.2: Delay definition (the input waveform IN crosses 50% of $V_{IN}$ at time $t_i$, the output waveform OUT crosses 50% of $V_{OUT}$ at time $t_o$; Delay = $t_o - t_i$)
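
The 50% definition can be applied to sampled waveforms with a few lines of code. The sketch below assumes piecewise-linear waveforms that start from or end at zero, so that 50% of the peak value is half of the maximum; the sample values (an inverter-like gate with times in nanoseconds and voltages in volts) are invented for illustration.

```python
def crossing_time(times, values, level):
    """First time at which a piecewise-linear waveform crosses `level`."""
    for k in range(len(times) - 1):
        v0, v1 = values[k], values[k + 1]
        if v0 != v1 and min(v0, v1) <= level <= max(v0, v1):
            t0, t1 = times[k], times[k + 1]
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)   # linear interpolation
    raise ValueError("the waveform never crosses the requested level")


def delay_50(times, v_in, v_out):
    """Delay t_o - t_i between the 50% crossings of the input and of the output."""
    t_i = crossing_time(times, v_in, 0.5 * max(v_in))
    t_o = crossing_time(times, v_out, 0.5 * max(v_out))
    return t_o - t_i


# Hypothetical waveforms: the input rises between 1 and 2 ns, the output falls between 2 and 3 ns.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
v_in = [0.0, 0.0, 3.3, 3.3, 3.3]
v_out = [3.3, 3.3, 3.3, 0.0, 0.0]
print(delay_50(times, v_in, v_out))   # 1.0
```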


This definition is good only for theoretical discussion since:
- generally a circuit has more than one input and more than one output;
- there is not always a direct path from the input to the output (think of dynamic logic), i.e. a change in an input does not always directly cause a change in the output.
So the definition of delay of a CMOS circuit must be investigated further, to produce a real number useful for optimization. In order to define it, the concept of critical path has been introduced in [16].

In the following I introduce a new mathematical formulation of the definition of critical path; this formulation will be useful for the automatic solution of the problem of finding all the critical paths in a circuit in section 6.2.4 (page 115).

Critical Paths

The idea of critical paths in a CMOS circuit can be derived, intuitively, from the idea of a path between the output and the input: a critical path is a conducting path between a node (the output node, i.e. the final node of the path) and the ground, or between this node and the power supply, such that a change in the state of the gate of a MOSFET belonging to the path directly causes a change in that node. Naturally each MOSFET included in the path must be conducting, or must switch to conduction, in order to create a conducting path.


This concept must be extended, however, since a change of the so called
output node can cause itself a change of another critical path (i.e. the output
node is itself connected to a gate of another critical path), so that a change
in a gate node in the very beginning of the circuit may propagate through
a lot of conducting paths.
Definition 5.1 (Critical path). A critical path is a set of conducting paths such that:
i) each conducting path is between a generic node and a ground node, or between a generic node and a power supply node, and is composed of MOSFETs; and
ii) each final node of a conducting path is either connected to a gate of a MOSFET belonging to another critical path, or is an output of the circuit; and
iii) a change in the state of any MOSFET gate in the first conducting path propagates to the last conducting path, causing a change in the critical path output node.
Definition 5.2 (Critical path delay). The delay of a critical path is the delay
between the output node of the critical path and the gate node causing the
state change of the output node.


From definition 5.1 it is clear that even a simple circuit contains more than one critical path¹.
In order to develop a rigorous definition of critical paths, let us introduce the following sets, characterizing a typical CMOS circuit:

\begin{align*}
G &= \{\text{set of all the MOSFET gate nodes in the circuit}\} = \{g_1, g_2, \dots, g_j, \dots\} \\
N &= \{\text{set of all the nodes in the circuit}\} = \{n_1, n_2, \dots, n_j, \dots\} \\
O &= \{\text{set of all the output nodes of the circuit}\} = \{o_1, o_2, \dots, o_j, \dots\} \\
I &= \{\text{set of all the input nodes of the circuit}\} = \{i_1, i_2, \dots, i_j, \dots\} \\
M &= \{\text{set of all MOSFETs in the circuit}\} = \{m_1, m_2, \dots, m_j, \dots\} \\
V &= \{\text{gnd} = \text{ground node},\ \text{vdd} = \text{power supply node}\};
\end{align*}

let us also define the set $N_{m_j}$ as the set of all the nodes pertaining to the MOSFET $m_j$, and denote the gate of the $j$th MOSFET by $g_{m_j}$.
These sets satisfy the relations $I \subseteq G \subseteq N$, $V \subseteq N$, $O \subseteq N \setminus G$.
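As a small illustration of these sets, the sketch below writes them out for a CMOS inverter and checks the relations among them. The node names (`in`, `out`, `vdd`, `gnd`), the transistor names (`mn`, `mp`) and the helper functions are hypothetical, chosen here only for the example; the bulk terminals are ignored.

```python
# Sets of the text, written out for a CMOS inverter (hypothetical names).
N = {"in", "out", "vdd", "gnd"}   # all the nodes
G = {"in"}                        # MOSFET gate nodes
I = {"in"}                        # input nodes
O = {"out"}                       # output nodes
V = {"gnd", "vdd"}                # ground and power supply nodes

# M maps each MOSFET to its (gate, drain, source) nodes.
M = {
    "mn": ("in", "out", "gnd"),   # n-MOSFET
    "mp": ("in", "out", "vdd"),   # p-MOSFET
}


def nodes_of(mosfet):
    """N_m: the set of all the nodes pertaining to a MOSFET."""
    return set(M[mosfet])


def gate_of(mosfet):
    """g_m: the gate node of a MOSFET."""
    return M[mosfet][0]


def relations_hold():
    """Check the subset relations among I, G, N, V and O."""
    return I <= G <= N and V <= N and O <= (N - G)


print(relations_hold())                       # True
print(nodes_of("mn") - {gate_of("mn")})       # drain and source of the n-MOSFET
```

Removing the gate from `nodes_of(m)` leaves exactly the drain and source nodes, which is the quantity that appears in the definition of the drain and source node set of a conducting path.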


The generic $n$th critical path of a circuit, denoted by $C_n$, equation (5.1a) (page 82), is the collection of conducting paths, denoted by $\pi_{ni}$, such that each $\pi_{ni}$, equation (5.1b), is defined as the union of two ordered node sets: the set $G_{ni}$, equation (5.1c), of all the gates of the $k$ MOSFETs belonging to the conducting path, and the set $D_{ni}$, equation (5.1d), of all the drain and source nodes ($k + 1$ in number²) of the same $k$ MOSFETs; the two sets must refer to the same MOSFETs, equation (5.1e).
The nodes in the set $D_{ni}$ have a peculiar property: the first and the last one may or may not³ be in common among two or more MOSFETs, while the other ones must be in common among two or more MOSFETs.
In other words, the set $D_{ni}$ is an ordered collection of nodes such that among these nodes there are $k$ MOSFETs, constituting a continuous (and conducting) path from the output node to a power supply (or ground) node.
¹ The simplest circuit, the inverter, has two critical paths, since a change in the input from low to high involves the path comprising only the n-MOSFET, while a change from high to low involves the path comprising only the p-MOSFET.
² Note that the MOSFETs in a conducting path share a common drain or source node two by two, so a conducting path made of one MOSFET has two nodes (one drain and one source), a path made of two MOSFETs has three nodes (the MOSFETs share one drain node), and so on.
³ This is the reason why in equation (5.1d) the index $j$ ends at $k$ and not at $k + 1$.

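Finally, collecting all the definitions, respectively, of critical path, conducting path, conducting path gate nodes set and conducting path drain nodes set:

\begin{align}
C_n &= \bigcup_i \pi_{ni} \tag{5.1a} \\
\pi_{ni} &= G_{ni} \cup D_{ni} \tag{5.1b} \\
G_{ni} &= \{\, g_j \mid g_j \in G,\ j = 1, \dots, k \,\} \tag{5.1c} \\
D_{ni} &= \{\, n_j \mid n_1 \in V \wedge n_j \in N \setminus G \wedge (n_j, n_{j+1}) \in N_{m_j} \setminus g_{m_j} \wedge [(n_{k+1} \in G_{i+1}) \vee (n_{k+1} \in O)],\ j = 1, \dots, k \,\} \tag{5.1d}
\end{align}

with $G_{ni}$, $D_{ni}$ such that, given

\begin{equation}
M_G = \{\, m_j \mid m_j \in M \wedge g_{m_j} \in G_{ni} \,\}, \qquad
M_D = \{\, m_j \mid m_j \in M \wedge N_{m_j} \setminus g_{m_j} \subseteq D_{ni} \,\} \tag{5.1e}
\end{equation}

then $M_G = M_D$.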

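[Figure: schematic of the carry part of a TSPC full adder, with numbered transistors, inputs A and C and clock CLK; the critical paths are labeled with their MOSFET numbers, among them 1-2-4-5-11, 1-3-4-5-11, 1-7-8-5-11 and 9-10]

Fig. 5.3: Example of critical paths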


Figure 5.3 shows an example of critical paths in a dynamic circuit (actually the carry part of a full-adder in TSPC logic). The figure represents the six critical paths, each one with its list of MOSFET numbers. For example, the first critical path ($C_1$) is composed of the conducting path $\pi_{11}$, made up of the n-MOSFETs 1, 2, 4, 5 and the p-MOSFET 11: that means that the set $G_{11}$ is composed of the gate nodes of transistors 1, 2, 4, 5 and 11, while $D_{11}$ is made up of the drain and source nodes of the same transistors; if one gate of the n-MOSFETs 1, 2, 4, 5 switches from the low state to the high state (and the others are all at the high state), then the gate of p-MOSFET 11 is discharged, and this p-MOSFET conducts, charging the output node. Another critical path, for example, is the one composed only of the p-MOSFET 6: if its gate switches from high to low, then the gate of n-MOSFET 9 switches from low to high, but this cannot produce the discharging of the output node, since the gate of n-MOSFET 6 is driven by the same signal as the original p-MOSFET.
Note. The definition of critical path can be viewed as a tree rooted at the transistor that drives the change in the critical path. One leaf of the tree is the transistor whose drain (or source) is the critical path output node. So it is possible to traverse the tree between the root (the input) and a leaf (the output): if one is able to model all the lateral subtrees encountered while traversing the tree as static loads, then the tree becomes a transistor chain (figure 5.4). This is the basis for the use of several delay models that are able to evaluate the delay of a chain.
[Figure: a transistor TREE and the equivalent transistor CHAIN, each drawn between its INPUT and its OUTPUT]

Fig. 5.4: Critical path tree that becomes a chain.


After the definition of critical paths, the problem of associating a delay (one and only one) to a circuit is still unresolved, since there is surely more than one critical path in a circuit: the solution is to find the maximum of all the critical path delays, and to regard this delay as the delay of the whole circuit. In this manner, we are sure that a change in the state of a node caused directly by an input can never occur later than this maximum delay. This definition is also consistent with the optimization purposes, since the optimization objective is always (usually!) the minimization of the delay. So the strategy to be applied is a min-max optimization scheme (minimization of the maximum).
Definition 5.3 (Circuit delay). The delay of a circuit $t_d$ is:

\[ t_d = \max_n \{ d(C_n) \} \]

where $d(C_n)$ is the delay of the $n$th critical path contained in the circuit.
So, finally, in order to know the delay of a circuit, one must search for all the critical paths in the circuit, calculate (or measure) the delay of each critical path, and take the maximum of these delays.
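A minimal sketch of definition 5.3 is given below. The function name `circuit_delay` and the delay values are assumptions made for illustration; the path labels are borrowed from figure 5.3, but the numbers attached to them are invented and are not measurements.

```python
def circuit_delay(path_delays):
    """Return (worst_path, t_d), with t_d = max over n of d(C_n).

    `path_delays` maps a critical path label to its delay, however that
    delay was obtained (a model or a simulation).
    """
    if not path_delays:
        raise ValueError("at least one critical path is required")
    worst = max(path_delays, key=path_delays.get)
    return worst, path_delays[worst]


# Invented delays in ps, one per critical path label.
delays_ps = {
    "1-2-4-5-11": 410.0,
    "1-3-4-5-11": 395.5,
    "1-7-8-5-11": 402.3,
    "9-10": 120.8,
}

worst_path, t_d = circuit_delay(delays_ps)
print(worst_path, t_d)  # 1-2-4-5-11 410.0
```

In a min-max optimization the optimizer would call such a function at every step and try to reduce the returned maximum, so the path that determines $t_d$ may change from one step to the next.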
The delay of each critical path can be calculated by means of some
model (maybe after the transformation of figure 5.4), or measured by means
of simulations.
This delay, obtained in some way, must be analyzed in order to know
its coherence with the mathematical results of chapter 4 (page 47), and the
validity of these results.
5.1.1.1 Delay formula obtained by the Elmore model
The delay function obtainable from the Elmore model (section 2.1, page 16) is a continuous function. Referring to figure 2.1 (page 15), the delay of a single MOS is:

\[ t_{d_i} = R_0 C_{S_i} + (R_0 + R_{d_i}) C_{D_i} + (R_0 + R_{d_i} + R_L) C_L \]

The drain and source capacitances and the dynamic resistance of a MOS are functions of the MOS width $W$:
⁴ The reason why we want to define a single value for the optimization of the delay and, for example, do not apply the multi-objective methods of the following sections, is that all the critical path delays are commensurable and have the same global behaviour (cf. section 5.2.3, page 102).

\[ C_{D_i} = C_j W_i, \qquad C_{S_i} = C_j W_i, \qquad R_{d_i} = \frac{R_j}{W_i} \]

where $C_j$ and $R_j$ are, respectively, the capacitance per unit width and the resistance of a unit-width MOS. The delay as a function of the MOS width becomes:

\[ t_{d_i} = R_0 C_j W_i + \left( R_0 + \frac{R_j}{W_i} \right) C_j W_i + \left( R_0 + \frac{R_j}{W_i} + R_L \right) C_L . \]

Separating the terms containing the width $W_i$ from the terms that are independent of $W_i$ we obtain:

\[ t_{d_i} = 2 R_0 C_j W_i + \frac{R_j}{W_i} C_L + R_j C_j + (R_0 + R_L) C_L . \]

Summing the delays of all the MOS in a conducting path we obtain the total delay of this path:

\[ t_d = \sum_i t_{d_i} = \sum_i \left( A W_i + \frac{B}{W_i} + C \right) \]

where $A$, $B$, $C$ are all independent of $W_i$.


The delay of a critical path is the sum⁵ of the delays of all its conducting paths.
As long as $A$ and $B$ are not zero, the delay $t_d$ is a convex function (definition 4.12, page 50), as in figure 5.5. If the term $A$ is zero, instead, the delay is a monotonically decreasing function (figure 5.6).
Note that the term $A$ is zero, in practice, only if the resistance $R_0$ is zero, that is, if the MOSFET chain is driven by an ideal voltage source.

[Figure: plot of $t_d$ versus $W_j$, with the minimum delay $t_{min}$ reached at the width $W_{min}$]

Fig. 5.5: Elmore delay: convex function

⁵ This definition introduces further errors in the delay model, since the conduction of the conducting paths following the first one does not start when the output of the first one is at its 50%, but long before.
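The behaviour of the single-MOS Elmore delay can be checked numerically with the sketch below. The coefficients follow from the separated form of the delay, $A = 2 R_0 C_j$, $B = R_j C_L$ and $C = R_j C_j + (R_0 + R_L) C_L$; the stationary point $W = \sqrt{B/A}$ is obtained by setting the derivative $A - B/W^2$ to zero. Function names and numerical values are assumptions made here for illustration, in arbitrary units.

```python
import math


def elmore_coefficients(r0, rl, cl, cj, rj):
    """Coefficients of t_d(W) = A*W + B/W + C for a single MOS."""
    a = 2.0 * r0 * cj
    b = rj * cl
    c = rj * cj + (r0 + rl) * cl
    return a, b, c


def elmore_delay(w, r0, rl, cl, cj, rj):
    """Elmore delay of a single MOS of width w."""
    a, b, c = elmore_coefficients(r0, rl, cl, cj, rj)
    return a * w + b / w + c


def optimal_width(r0, rl, cl, cj, rj):
    """Width minimizing the delay: sqrt(B/A), or infinity when A = 0."""
    a, b, _ = elmore_coefficients(r0, rl, cl, cj, rj)
    if a == 0.0:
        # Ideal voltage source (R0 = 0): the delay only decreases with W.
        return math.inf
    return math.sqrt(b / a)


# Arbitrary illustrative values.
params = dict(r0=1.0, rl=0.5, cl=8.0, cj=1.0, rj=4.0)
print(optimal_width(**params))            # 4.0
print(elmore_delay(4.0, **params))        # 32.0
print(optimal_width(r0=0.0, rl=0.5, cl=8.0, cj=1.0, rj=4.0))  # inf
```

With $R_0 = 0$ the term $A$ vanishes and no finite minimum exists, which is the monotonic case of figure 5.6.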

5.1.1.2 Delay measurement obtained by the FAST model and by HSPICE


The delay obtained by the FAST model and by HSPICE simulations is a measurement and not a formula. It is a one-to-one correspondence between the MOSFET widths and the resulting delay, and it is not possible to express this delay by means of a closed-form formula⁶.
Figures 5.7 and 5.8 represent the delay of a CMOS inverter when the dimensions of both the n-MOSFET and the p-MOSFET are increased uniformly. The first figure shows the delay of the inverter driven by another inverter (with fixed dimensions), simulated by HSPICE; the second figure shows the delay of the same inverter driven, instead, by an ideal voltage source and simulated by FAST.
These are an experimental proof of the statement given in the previous section: if the voltage source is not ideal, that is, it depends on the MOSFET widths of the circuit, the delay curve is strictly convex (figure 5.7), while if the voltage source is ideal, i.e. independent of the MOSFET widths, then the delay curve is monotonically decreasing (but still convex).

[Figure: plot of $t_d$ versus $W_j$, monotonically decreasing]

Fig. 5.6: Elmore delay: monotonic function

Taking into account the interconnection delays, which can no longer be neglected in deep sub-micron technologies, does not modify the shape of the delay function, since a width-independent⁷ delay is added to the total delay function.
So, in conclusion, the delay curve is a convex function of all the MOSFET widths⁸, strictly convex or not depending on the operating conditions of the circuit.

⁶ It is possible, however, after measuring a set of delays varying with the widths, to fit the results with an approximate formula, this time in closed form.

5.1.2 Power consumption


The calculation of the power consumption of a circuit is quite different from the calculation of the delay: while the delay is a local property of a single critical path (section 5.1.1), the power consumption is a global property of the circuit. That is, the power consumption of a circuit is not the sum of the power consumption of each critical path⁹. Even if the definition of power consumption is global, it is not unique: the power dissipation of a circuit surely depends on the input conditions. Changing the input states changes the overall power dissipation, making some MOSFETs conduct and others not. Again, one must choose a definition of power dissipation that gives a single number, for the purpose of the optimization.

[Figure: plot of the delay $t_d$ [ps] (from about 100 to 260 ps) versus the width $W_j$ [µm] (from 0 to 60 µm)]

Fig. 5.7: HSPICE delay: convex function

⁷ The interconnection delay can be seen, to a second approximation, as proportional to the MOS widths, since greater widths mean larger circuits, and in a layout this means that the average length of the interconnections also increases. This proportionality (empirically found to be linear to quadratic) does not modify the delay function, since it adds a term that is both an increasing and a convex function.
⁸ The two-dimensional representation of figures 5.5, 5.6, 5.7 and 5.8 is only for the sake of simplicity of the drawing. The convexity is still valid in multi-dimensional representations.

Considering that the objective of the power optimization (hereinafter "power consumption optimization" is abbreviated to "power optimization") is the minimization of the total power dissipation, a min-max strategy is the most appropriate, as in the case of delay minimization. Instead of evaluating the power consumption for all the input combinations, we take advantage of the definition of critical path:
Definition 5.4 (Circuit power consumption¹⁰). The power dissipation of a circuit $P_d$ is:

\[ P_d = \max_n \{ p(C_n) \} \]

where $p(C_n)$ is the power consumption of the entire circuit when the input conditions of the $n$th critical path are applied.

[Figure: plot of the delay $t_d$ [ps] (from about 45 to 80 ps) versus the width $W_j$ [µm] (from 0 to 60 µm)]

Fig. 5.8: FAST delay: monotonic function

⁹ To a first approximation, the power consumption could be taken as the sum of the power dissipated by each critical path in a fully static CMOS circuit.
¹⁰ The same reasoning of note 4 (page 84) applies here.
In this manner it is possible to apply a min-max optimization scheme and, at the same time, to evaluate the power consumption during the same evaluation run used for the critical path delays, allowing a substantial reduction of the time necessary for the complete evaluation.
In the following, the terms "power consumption" and "energy dissipated" will be used interchangeably, since the simple relation between them is:

\[ E = \int P(t)\, dt; \]

this means that the mean energy dissipated by a circuit is obtained from the integral average of the power: it depends on the simulation time (or on the window of time that we are considering), but it does not depend on the frequency of the signals at which the circuit itself operates.
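The relation between power and energy can be illustrated on a sampled power waveform. In the sketch below the integral is approximated with the trapezoidal rule, which is an assumption of this example and not necessarily what a simulator does; the function names and the samples are invented.

```python
def energy(times, power):
    """E = integral of P(t) dt, approximated with the trapezoidal rule."""
    total = 0.0
    for k in range(1, len(times)):
        total += 0.5 * (power[k] + power[k - 1]) * (times[k] - times[k - 1])
    return total


def mean_power(times, power):
    """Integral average of P(t) over the whole observation window."""
    return energy(times, power) / (times[-1] - times[0])


# A single triangular power pulse (arbitrary units).
t_short = [0.0, 1.0, 2.0, 3.0]
p_short = [0.0, 2.0, 0.0, 0.0]

# The same pulse observed over a window twice as long.
t_long = [0.0, 1.0, 2.0, 3.0, 6.0]
p_long = [0.0, 2.0, 0.0, 0.0, 0.0]

print(energy(t_short, p_short), energy(t_long, p_long))          # 2.0 2.0
print(mean_power(t_short, p_short), mean_power(t_long, p_long))  # average halves
```

The energy of the transition is the same in both cases, while the average power depends on the length of the window over which it is computed.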
The power consumption of a CMOS circuit is the sum of three terms (section 3.3, page 36):

\[ P_{TOT} = P_{switch} + P_{short} + P_{sub\text{-}th} \tag{5.2} \]

the switching power $P_{switch}$, due to the charging and discharging of the internal parasitic capacitances; the short-circuit power $P_{short}$, due to the simultaneous conduction of n-MOSFET and p-MOSFET, which gives a direct conducting path from the power supply to the ground for a short time; and the sub-threshold power $P_{sub\text{-}th}$, due to the sub-threshold conduction of the MOS.
To a first approximation the first term $P_{switch}$ is proportional to the MOSFET widths in the circuit (greater width means greater capacitance), the second term $P_{short}$ is proportional to the switching time and thus it is inversely related to the MOSFET widths (greater capacitance means slower switching time), while the last term $P_{sub\text{-}th}$ is proportional to the MOSFET widths.
[Figure: plot of the energy [pJ] (from 0 to 70 pJ) versus the width $W_j$ [µm] (from 0 to 70 µm)]

Fig. 5.9: HSPICE Energy


As an example, the total power consumption of a single gate is sketched in figure 5.9: as can be expected, the energy increases with the widths, but it is not convex.
The three terms of equation (5.2) do not weigh equally in the sum giving the energy consumption: in order of influence the first term (section 3.3.1, page 36) is the greatest, then comes the second term (section 3.3.2, page 39), and finally the third term (section 3.3.3, page 39). For a sub-micron technology the second term (the short-circuit dissipation) is about 10% of the first, and the third term (sub-threshold conduction dissipation) is about 1% of the first.
It could be expected that with the scaling of the technology (in the deep sub-micron field) the first and the second term become comparable, with the third term still a fraction of the other two, giving a power figure that does not increase (or even decreases) with the MOS widths; but it could also be expected that with the scaling down the interconnect capacitances become predominant, making the first term (the power dissipation due to the charging and discharging of capacitances) still the greatest.
In summary, the power consumption figure of a CMOS circuit is an increasing function of the MOSFET widths, but no assumptions can be made about the convexity of this function.

5.1.3 Area
The area occupation of a circuit can be expressed in closed form:

\[ A = \sum_j \kappa_j W_j + \kappa_0 . \tag{5.3} \]

The area occupation is composed of two terms: a term directly proportional to the MOSFET widths (i.e. to the area occupied by the single MOSFETs) and a term independent of the MOSFET widths (comprising, for example, the interconnect area). Both terms are, of course, positive, so the curve of the area occupied versus the MOSFET widths is a monotonically increasing curve¹¹, that is, a convex function.
Taking into account the area of the interconnections does not modify the properties of the area function, since the only modification of equation (5.3) (page 91) is in the term independent of the MOS widths¹².

5.2 Optimization examples


In order to illustrate some of the issues introduced in the previous sections, some CMOS gates are analyzed in the following. These gates are summarized in table 5.1, whose second column shows the total number of critical paths in a gate, and whose third column shows the total number of MOSFETs in a gate. The last two gates are dynamic full-adders: the former is composed of complex gates in order to perform the computation in one stage, while the latter is composed only of basic gates (and, or and inverter); this explains why the last full-adder has many more transistors than the first one.

¹¹ It is a straight line in two dimensions, a plane in three dimensions and a hyperplane in four or more dimensions, but it is always a convex function.
¹² See note 7, page 87.

Tab. 5.1: Basic gates: complexity
Gate
Inverter (fig. 5.10)
TSPC type n latch
(fig. 5.11(a))
TSPC type p latch
(fig. 5.11(b))
TSPC type n and
(fig. 5.12(a))
TSPC type p and
(fig. 5.12(b))
TSPC type n or (fig. 5.13(a))

# of critical paths
2

# of transistors
2

12

14

Static and14

Static or14

Static parity gate (fig. 5.15)

24

48

34

40

26

13

82

126

TSPC type p or
(fig. 5.13(b))
Static and-or (fig. 5.14)

Static full-adder
(figs. 5.16(a), 5.16(b))
TSPC full-adder
(one-stage) (figs. 5.17(a),
5.17(b))
TSPC full-adder
(basic
cells)

Table 5.2 shows the delays and the energy consumption of the gates of table 5.1: for each gate it shows the maximum delay, the average delay, the maximum energy and the average energy over all the critical paths. All the simulations are run at the minimum width for each technology (viz. 1 µm

¹⁴ For a schematic of the static and and of the static or see figure 5.14: the and is the first gate of the schematic (on the left side of the picture), while the or is the last but one gate, before the final inverter (on the right side); for a static and see also figure 5.12, page 96.

Gate
Inverter
TSPC type n
latch
TSPC type p
latch
TSPC type n
and
TSPC type p
and
TSPC type n
or
TSPC type p
or
Static andor
Static and
Static or
Static parity
gate
Static fulladder
TSPC
fulladder
(one-stage)
TSPC
fulladder
(basic cells)

Technology

630.8
718.9
664.1
754.9
689.6
787.3
894.2
727.7
776.3
1839.55
1080.6
681.9

556.4

921.8
1413.0
1028.0
1413.0
904.3
1413.0
1180
760.9
1430
2650.0
1781
930.6

2691 .0

Delay [ps]
max
avg
717.5
572.7

8.893

2.168

6.475

0.7442

0.7224
0.75

3.639

1.654

4.47

2.51

2.756

2.087

3.491

3.82

0.9425

1.219

0.676

0.7160
0.7114

2.816

0.879

1.42

1.058

1.13

0.965

1.299

0.7 m
Energy [pJ]
max
avg
0.6887 0.6864

151.2

15.6

48.0

57.6

4. 8
4.8

16.8

8.4

8.4

8.4

8.4

7.2

7.2

Area
[ m2 ]
2.4

482.3

276.7

571.3

922.2

277.2
233.7

334.1

482.5

299.7

482.5

315.8

482.5

293.3

79.9

204.2

311.3

582.5

240.1
89.3

253.3

225.1

224.8

212.9

208.9

209.1

200.4

Delay [ps]
max
avg
259.6 189.1

5.27

0.641

3.155

0.0944

0.0907
0.0713

1.998

0.2243

0.8257

0.3488

0.5126

0.2894

0.6586

1.999

0.188

0.324

0.0863

0.0891
0.0434

1.51

0.1077

0.2288

0.1314

0.1816

0.1221

0.2161

0.25 m
Energy [pJ]
max
avg
0.0858 0.0853

Tab. 5.2: Basic gates: pre-optimization delay, power consumption and area

75.6

7.8

24

28.8

2.4
2.4

8.4

4.2

4.2

4.2

4.2

3.6

3.6

Area
[ m2 ]
1.2


[Figure: schematic of the CMOS inverter, with input A and output OUT = NOT A]

Fig. 5.10: CMOS Inverter

for the 0.7 µm technology and 0.5 µm for the 0.25 µm technology).

5.2.1 Algorithm choice

Given the results of section 4.2 (page 58), and the results of the above sections regarding the properties of the delay, power and area functions in real circuits, the most suitable algorithm to be applied is Powell's scheme. Briefly, it is fast, reliable even in the presence of multiple minima, and (perhaps most important of all) it does not require the knowledge of the first derivative of the function to be minimized.
While some other algorithms could give the same accuracy in finding the minimum (in practice only the simulated annealing algorithm does), Powell's algorithm outperforms all the others in terms of the number of iterations, and hence of the execution time, needed to reach the best solution.
Powell's algorithm is the first choice in all the optimization examples found in this chapter. As an example, performing the same optimization of table 5.3 with simulated annealing requires an execution time of the optimizer¹⁵ about ten times that required by Powell's algorithm.
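To give a feeling for derivative-free minimization, the sketch below minimizes a delay-like function with successive one-dimensional line minimizations along the coordinate axes. It is deliberately a simplified stand-in: it uses golden-section line searches and never updates the search directions, whereas Powell's method also replaces directions during the run, so it should not be read as the algorithm of chapter 4 or of the optimizer. All names, bounds and coefficients are assumptions of this example.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # inverse of the golden ratio


def golden_section(f, lo, hi, tol=1e-6):
    """Minimize a unimodal one-dimensional function on [lo, hi]."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:          # the minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:                # the minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def coordinate_search(f, x0, w_min, w_max, tol=1e-6, max_sweeps=100):
    """Derivative-free minimization by line searches along each axis."""
    x = list(x0)
    best = f(x)
    for _ in range(max_sweeps):
        previous = best
        for k in range(len(x)):
            def along_axis(w, k=k):
                return f(x[:k] + [w] + x[k + 1:])
            x[k] = golden_section(along_axis, w_min, w_max, tol)
            best = f(x)
        if previous - best < tol:   # no significant improvement in a sweep
            break
    return x, best


def toy_delay(w):
    """Sum of two terms of the form A*W + B/W (arbitrary coefficients)."""
    return 2.0 * w[0] + 32.0 / w[0] + 1.0 * w[1] + 9.0 / w[1]


# Start with all the widths at the minimum allowed value.
widths, delay = coordinate_search(toy_delay, [1.0, 1.0], w_min=1.0, w_max=60.0)
print(widths, delay)  # close to [4.0, 3.0] and 22.0
```

The search only evaluates the function, never its derivative, which is the property that makes this family of methods usable when the delay comes from a simulator rather than from a formula.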
¹⁵ For a complete description of the optimizer CAD tool see chapter 6, page 107.

[Figure: schematics of the TSPC latches, (a) type n and (b) type p, with nodes A, CLK and OUT]

Fig. 5.11: TSPC Latches

5.2.2 Mono-objective optimizations


The mono-objective optimization of the circuits of table 5.1 means the optimization of one and only one of the targets of section 5.1, namely either the delay, or the power consumption, or the area occupation.

5.2.2.1 Area

This target has a trivial optimization, since minimizing the area occupied by a circuit obviously means making all the transistors in the circuit as small as possible, i.e. giving them the minimum width allowed by the technology.

[Figure: schematics of the TSPC and gates, (a) type n and (b) type p, each with clock input CLK and output OUT = A·B]

Fig. 5.12: TSPC And gates


5.2.2.2 Power

The power optimization deserves a few more words: all the attempts to optimize exclusively the power of the gates of table 5.1, regardless of the delay, have led to the same result for both technologies: all the transistors in the circuit had the minimum width after the optimization. This outcome arises whatever the starting point of the optimization session, i.e. the initial transistor widths of the circuit.
This is an experimental proof that, out of the three terms of equation (5.2) (page 89), the switching power $P_{switch}$, due to the charging and discharging of the capacitances in the circuit, is always the dominant one. Although some authors have argued in the past that this term might not be the largest, especially for deep sub-micron circuits, there is no experimental proof of that, at least for small and medium circuits.

[Figure: schematics of the TSPC or gates, (a) type n and (b) type p, each with clock input CLK and output OUT = A + B]

Fig. 5.13: TSPC Or gates

5.2.2.3 Delay

Given the results of the power optimization (and the simple results of the area optimization), the only feasible mono-objective optimization is the delay optimization. That is, the maximum delay of the critical paths is minimized, disregarding the power consumption and the area occupation, which both increase as the delay diminishes.

[Figure: schematic of the static and-or gate, with output OUT]

Fig. 5.14: Static and-or gateᵃ.

ᵃ This gate performs the function A·B + C, but there are two inverters between the and and the or. These leave the logic function intact, but introduce some complexity in the formulation of the critical paths: it is only for this purpose that these inverters have been introduced.

[Figure: schematic of the static parity gate]

Fig. 5.15: Static parity gate
Tab. 5.3: Full-adder: delay optimization

Full-adder         Technology   Delay [ps]             Energy [pJ]            Area [µm²]
                                Pre-opt.   Post-opt.   Pre-opt.   Post-opt.   Pre-opt.   Post-opt.
Static             0.7 µm       1781       1080        6.475      40.0        34         195.6
Static             0.25 µm      571.3      415.2       3.155      111.2       17         692.6
TSPC (one-stage)   0.7 µm       930.6      400.2       2.168      13.390      26         151.7
TSPC (one-stage)   0.25 µm      276.7      158.3       0.641      3.622       13         80.4

As an example, the delays of the static and dynamic full-adders before the optimization (i.e. with all the transistors at minimum width) and after the optimization are presented in table 5.3; the same table reports the power consumption of the circuit before and after the optimization of the delay: it is possible to see how the power increases after the delay is minimized.

[Figure: schematics of the static full-adder, (a) sum part with output SUM and (b) carry part with output CARRY]

Fig. 5.16: Static full-adder

[Figure: schematics of the one-stage TSPC full-adder, (a) sum part with output SUM and (b) carry part with output CARRY, both clocked by CLK]

Fig. 5.17: TSPC full-adder (one-stage)
The criterion that decides when the optimization is over is based on two cases (see chapter 6, page 107, for more details on the implementation of the algorithms, and chapter 4, page 47, for the mathematical foundations):
i) if there is a minimum (either the delay figure is strictly convex or, more generally, it has an absolute minimum), then the optimization algorithm finds it with an arbitrary accuracy, chosen a priori; or
ii) if the delay figure is not strictly convex (i.e. it is monotonically decreasing), then the optimization algorithm goes on minimizing until the rate of decrease of the delay falls below the accuracy.
The former case is more stable from the point of view of the accuracy: given an accuracy, the same optimum solution is found independently of the starting point (i.e. the initial transistor widths); the starting point influences only the time it takes to reach the solution, which is unique.
The latter case is somewhat more problematic, since the solution depends on the starting point: the rate of decrease of the delay depends on the starting point in the multi-dimensional space of delay versus widths. This means that several optimization sessions can give different results, depending on the initial transistor widths of each optimization.
In order to eliminate this ambiguity it is safe to choose a common starting point for all the optimization sessions: the natural choice is to start with all the transistors at the minimum width allowed by the technology. This choice guarantees that the solution found is always the same from one optimization run to another, and it is also a convenient way of writing the netlist to be optimized, either by hand or with a schematic editor.

5.2.3 Multi-objective optimizations


Multi-objective optimization means optimizing different targets at the same time, that is, for example, simultaneously minimizing the delay and the power, or the power and the area, and so on. From sections 5.2.2.1, 5.2.2.2 and 5.2.2.3 we have seen that some of these goals clash. These clashes are briefly summarized in table 5.4.

Tab. 5.4: Agreements of targets

         Area     Delay    Power
Area     -        clash    agree
Delay    clash    -        clash
Power    agree    clash    -

So, for example, optimizing delay and power together, i.e. minimizing both, is not possible: the power is minimized when all the transistors are at minimum width, while minimizing the delay requires some transistors (maybe all) to have a width greater than the minimum.
This disagreement among some optimization targets leads to new possible definitions of multi-objective optimization:
i) there is a primary target to be optimized, and one or more secondary targets to be taken into account: then we may define a threshold on the latter. The algorithm goes on optimizing the primary target, taking care to keep all the secondary targets below the threshold; or
ii) there are only primary targets, and each target enters the total objective function with a relative weight, which indicates how much the final solution should depend on the corresponding target; or
iii) both the previous definitions.
The most suitable policy is the second, because it gives each target the same priority with a different importance. The first alternative leads to a sub-optimal optimization since: first, the designer must know the order of magnitude of the targets, in order to impose a limit on them; second, the whole space of solutions may not be explored under such constraints.
In the case of primary targets with relative weights, we have chosen the sum of the relative weights to represent the entire normalized objective function, that is, the sum of the relative weights must be equal to one.


Given the results of section 4.1.2 (page 54), the total objective function to be minimized is a linear combination of the delay ($D$), power ($P$) and area ($A$):

\[ O = \alpha D + \beta P + \gamma A, \tag{5.4} \]

where $O$ is the total objective function and where

\[ \alpha \ge 0, \quad \beta \ge 0, \quad \gamma \ge 0, \qquad \alpha + \beta + \gamma = 1. \]

From the point of view of the user of the optimizer, specifying weights of this kind makes it possible to see them as a measure of how much the corresponding target matters in the final solution: for example, specifying $\alpha = 0.5$, $\beta = 0.5$ and $\gamma = 0$ means that we want to optimize the delay at 50% and the power at 50%.
The subtle point in eq. (5.4) is that the quantities $D$, $P$ and $A$ are not commensurable, that is, their orders of magnitude may not be the same. Think only of the units of measure: if, for example, the delay is measured in picoseconds (e.g. 1000 ps), the power is measured in joules (e.g. $10^{-13}$ J). When one quantity is much greater than the others, all the changes in the latter quantities disappear in the total sum.
In order to overcome the problem of the non-commensurable quantities in eq. (5.4), all the terms of the sum should be normalized. The mathematical theory of optimization states that each term should be normalized by dividing it by the optimum found when optimizing only that particular term. This implies an a priori knowledge of the optimum of each term, and so of the total weighted sum. In this way, at every moment of the optimization run it is possible to know the distance between the current solution and the optimum.
This is not practically feasible for a circuit optimization, since it would involve running the mono-objective optimizations, one for each term of the sum, and then running the final multi-objective optimization. This would lead to an unacceptable total optimization session, both for the time it would take and for the resources it would occupy.
Thus the normalization applied here is the division of each quantity for

5.2. Optimization examples

105

its corresponding maximum: a maximum of the delay occurs when all the
transistors are at minimum width, while the maximum of the power and of
the area is measured when all the transistors are at the maximum allowed
width in the optimization session (being careful that choosing a too large
maximum allowed width will result in a power and area term too little).
The total normalized optimization objective function then becomes:

$$O = \alpha \, \frac{D}{D|_{\text{min widths}}} + \beta \, \frac{P}{P|_{\text{max widths}}} + \gamma \, \frac{A}{A|_{\text{max widths}}} \qquad (5.5)$$

By varying the combination of the parameters $\alpha$, $\beta$ and $\gamma$ it is possible to obtain an optimized circuit in which the delay, the power consumption and the area occupancy carry more or less weight.
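The normalized weighted sum of eq. (5.5) can be sketched in a few lines of C++. This is only an illustration: the names `Metrics`, `Weights` and `normalizedObjective` are hypothetical and do not come from the tool, and the reference values are assumed to be the delay at minimum widths and the power and area at maximum widths, as described above.

```cpp
#include <cassert>
#include <cmath>

// Raw figures of merit of one circuit evaluation (hypothetical struct).
struct Metrics {
    double delay;  // worst critical-path delay
    double power;  // power (or energy) consumption
    double area;   // area occupancy
};

// The three weights of the weighted sum (hypothetical struct).
struct Weights {
    double delay;
    double power;
    double area;
};

// Each term is divided by its reference maximum before weighting, so that
// the three terms become dimensionless and comparable in magnitude.
double normalizedObjective(const Metrics& m, const Metrics& ref, const Weights& w)
{
    assert(ref.delay > 0.0 && ref.power > 0.0 && ref.area > 0.0);
    return w.delay * m.delay / ref.delay
         + w.power * m.power / ref.power
         + w.area  * m.area  / ref.area;
}
```

With weights 0.5, 0.5 and 0, a circuit whose delay and power are both half of their reference values yields an objective of 0.5, regardless of the units in which delay and power are expressed.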
Tab. 5.5: Full-adder: delay and power optimization

Full-adder         Technology   Delay [ps]            Energy [pJ]           Area [µm²]
                                Pre-opt.  Post-opt.   Pre-opt.  Post-opt.   Pre-opt.  Post-opt.
Static             0.7 µm       1781      1156        6.475     22.34       43        110.5
Static             0.25 µm      571.3     429.5       3.155     13.63       17        83.12
TSPC (one-stage)   0.7 µm       930.6     744.1       2.168     3.921       26        62.8
TSPC (one-stage)   0.25 µm      276.7     187.1       0.641     1.879       13        41.6

Tab. 5.6: Full-adder: comparison between two kinds of optimization and the minimum-width results. The number in parentheses shows the worsening (if positive) or the improvement (if negative) of the power-delay optimization with respect to the full-delay optimization.

Full-adder         Technology   Delay                                   Energy                                  Area
                                $\alpha = 1$   $\alpha = \beta = 0.5$   $\alpha = 1$   $\alpha = \beta = 0.5$   $\alpha = 1$   $\alpha = \beta = 0.5$
Static             0.7 µm       1.65           1.54 (+7%)               6.18           3.45 (-44.1%)            5.75           2.57 (-43.5%)
Static             0.25 µm      1.38           1.33 (+3.44%)            35.25          4.32 (-87.7%)            40.74          4.89 (-88%)
TSPC (one-stage)   0.7 µm       2.33           1.25 (+85.9%)            6.18           1.81 (-70.7%)            5.83           2.42 (-58.6%)
TSPC (one-stage)   0.25 µm      1.75           1.48 (+18.2%)            5.65           2.93 (-48.1%)            6.18           3.20 (-48.3%)

If, for example, the same full-adders of table 5.3 (page 99) are optimized both for delay and for power in the same measure, i.e. with $\alpha = 0.5$, $\beta = 0.5$ and $\gamma = 0$ in equation (5.5), we obtain the results of table 5.5.
The comparison between the full-delay optimization (mono-objective) and the power-delay optimization (multi-objective) is sketched in table 5.6: as we can see, between the full-delay optimization and the power-delay optimization (50%-50%) there is a slight worsening in the delay of the final circuit (from 5.2% to 46.9%); at the same time there is an effective improvement in the power consumption: the power dissipation decreases by 5.8% to 76.5%.
A more complete survey of the optimization results of the circuits presented in this chapter can be found in chapter 7 (page 121).

5.3 Conclusion
This chapter first defines the targets of optimization, and then applies the mathematical theory of chapter 4 (page 47) to the optimization of real circuits.
It has been shown that the only mono-objective optimization feasible by means of transistor dimension trimming is the delay minimization, since the minimization of both area and power consumption leads to the quasi-obvious solution of all transistors at the minimum width allowed by the technology or by the designer.
Regarding the multi-objective optimization, a method that permits tackling several optimization policies has been presented. This method takes into account all the variables, even if they are incommensurable among themselves; by means of a normalization all the targets to be optimized can be combined in a single objective function, each with a relative weight.
Moreover, with this way of combining the several targets into one objective function, the introduction of constraints is as simple as it is in a mono-objective optimization.

Chapter 6

A CAD TOOL FOR OPTIMIZATION


The optimization goals of the previous chapter require a modular and complete framework in order to perform the real optimization of a circuit. This chapter describes the implementation of such a framework by means of about 10000 lines of C++ code. Section 6.1 reports the logical description of the tool and its modules, and section 6.2 reports the code implementation of the most important classes of the program. Finally, section 6.3 reports the logical flow of the program during the execution.
For every other detail refer to appendices A and B (pages 145 and 149).

6.1 Logical description


The block diagram of the CAD tool is depicted in figure 6.1.
The core of the tool, the optimization engine, receives the input from
two modules: the optimization algorithm module (OAM), where different
optimization strategies can be selected, and the function evaluation module
(FEM), including the models for delay, power, and area estimation.

6.1.1 The optimization algorithm module (OAM)

The OAM supports the choice among a predefined set of optimization algorithms; three kinds of algorithm are currently included:

- a SLOP-like algorithm (section 4.2.4, page 70), which at each step increases the size of a single gate, chosen according to the best possible reduction of the delay along the critical path;
- the Powell algorithm (section 4.2.3.2, page 69), which is a particular form of the conjugate directions algorithm family [17]: it does not require the computation of any gradient function and it converges quadratically to the minimum of the cost function;
- the simulated annealing algorithm (section 4.2.5, page 72): it chooses the transistor dimensions according to an annealing scheme, thus converging to a global minimum and escaping local minima. It is surely much slower than the previous ones, and requires a fine tuning of the annealing parameters.

[Figure 6.1 shows the optimization engine, the optimization algorithm module (SLOP, Powell, gradient descent), the function evaluation module (delay, power, area), the parser of the human-readable circuit description, the optimization constraints and the results feedback.]

Fig. 6.1: Tool block diagram
For all the chosen methods the analytical knowledge of the objective functions and of their derivatives is not required: only numerical approximations are exploited.
However, methods requiring the gradient evaluation (e.g. the Fletcher-Reeves-Polak-Ribière version of the conjugate directions algorithm [17]) can also be supported.
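As an illustration of how a gradient-based method could work without analytical derivatives, the following sketch approximates the gradient of an objective function by central differences. It is not taken from the tool: the function name `numericalGradient` and the relative step are assumptions, and in a real optimization each call to `f` would correspond to one invocation of the circuit simulator (so one gradient costs two evaluations per transistor width).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Central-difference approximation of the gradient of f at x.
// relStep is a hypothetical tuning parameter (relative perturbation).
std::vector<double> numericalGradient(
    const std::function<double(const std::vector<double>&)>& f,
    std::vector<double> x,            // taken by value: perturbed locally
    double relStep = 1e-4)
{
    assert(relStep > 0.0);
    std::vector<double> grad(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double x0 = x[i];
        const double h = relStep * std::max(std::fabs(x0), 1.0);
        x[i] = x0 + h;
        const double fPlus = f(x);    // one function evaluation
        x[i] = x0 - h;
        const double fMinus = f(x);   // a second function evaluation
        x[i] = x0;                    // restore the original width
        grad[i] = (fPlus - fMinus) / (2.0 * h);
    }
    return grad;
}
```

The cost in function evaluations is the reason why derivative-free methods such as Powell's are attractive when every evaluation is a full circuit simulation.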

6.1.2 The function evaluation module (FEM)


The FEM performs the analysis of the circuit to be optimized, and in particular it evaluates all the objective functions needed by the OAM: the delays, the power consumptions and the area occupancy.
In order to perform this evaluation it invokes the timing analyzer or simulator chosen at run-time. At the time of writing two analyzers are supported: HSPICE and FAST (chapter 3, page 21).
Hereinafter the word simulator will be used, although some modules included in the FEM (such as FAST) are not real simulators but, more appropriately, delay-power analyzers, since they do not perform a real simulation of the circuit.

6.1.3 Core engine


The core engine is the main module of the program. It handles the communications among the other modules and makes the optimization feasible.
First of all, the engine parses the netlist of the circuit to be optimized, written in a SPICE-like format. It then invokes the module that automatically searches all the critical paths in the circuit, and finally it invokes the optimization algorithm.


6.2 Code implementation


The whole tool has been written in C++. All the classes of the program are shown in appendix A and all the code details can be found in appendix B.
The most important classes of the program are:

- CircuitNetList
- OptimizationAlgorithm
- EvaluationAlgorithm

The first class, CircuitNetList, and its derived class Circuit contain the graph of the circuit, in which every node is a transistor and every edge is a connection between two transistors.
The class OptimizationAlgorithm is an abstract base class from which every new optimization algorithm should be derived. It provides the interface between the real class that implements the algorithm and the core engine. Every derived class should provide the method Run() that performs the optimization.
The class EvaluationAlgorithm is again an abstract base class, from which every new simulator should be derived; again, every derived class should provide the method Run(...) that performs the evaluation of all the objectives of the circuit, such as delay, power consumption and so on.

6.2.1 The classes CircuitNetList and Circuit

The public and protected members of class CircuitNetList are:

    class CircuitNetList
    {
    private:
        ...
    protected:
        char *FileNetOut;
        TransistorList TranList;   // all the transistors of the netlist
        CapacitorList CapList;     // all the capacitors of the netlist
        char *FileIn;
        double Val;
        unsigned int ValNode;      // power supply node
    public:
        CircuitNetList( const char* FileNetList,
                        const Options& options );
        virtual ~CircuitNetList();
        unsigned int GetNTran() const  { return TranList.GetNTran(); }
        unsigned int GetNCap() const   { return CapList.NumCap; }
        double Valim() const           { return Val; }
        unsigned int ValimNode() const { return ValNode; }
        // access to a transistor by index or by name
        const TransistorNode& operator[]( unsigned int index ) const;
        const TransistorNode& operator[]( const char* name ) const;
        int TranPos( const char* name ) const;
    };

This class provides methods to return the i-th transistor by means of operator[], called either with the relative number of the transistor or with its name. The class also provides the methods to return the effective power supply node (the ground node is assumed to be always node 0). Internally the class contains the list of all the transistors and all the capacitors present in the original netlist.
The public and protected methods of class Circuit are:

    class Circuit : public CircuitNetList
    {
    private:
        ...
    public:
        Circuit( const char *FileNetList,
                 const Options& options );
        ~Circuit();
        void PrintResult( unsigned long int Step, unsigned int NT,
                          unsigned int NP, const double* NewWidth,
                          const double* CPDelay, const double* CPPower,
                          const double *CPNoise, double Area,
                          double maxT, double maxP, double maxN,
                          double f, double fLast ) const;
        int Simulate( const double *NewWidth ) const;
        double JunctionNWidth( unsigned int node,
                               int& number,
                               const double* NewWidth = 0 ) const;
        double GateNWidth( unsigned int node,
                           int& number,
                           const double* NewWidth = 0 ) const;
        double JunctionPWidth( unsigned int node,
                               int& number,
                               const double* NewWidth = 0 ) const;
        double GatePWidth( unsigned int node,
                           int& number,
                           const double* NewWidth = 0 ) const;
        double CapStaticGnd( unsigned int node, int& number ) const;
        double CapStaticVdd( unsigned int node, int& number ) const;
        int TransistorListNode( unsigned int node, TransistorList& TList,
                                unsigned int& n, unsigned int& p ) const;
    };

The class provides the method Simulate(const double *NewWidth), which invokes the simulator of the circuit with the new transistor widths NewWidth. It also provides some methods ...Width(...) that return the sum of the widths of all the transistors connected to a node, and a few methods CapStatic...(...) that return the sum of all the capacitances connected between a node and the power supply node or between a node and the ground node. These methods are useful for the FAST model.
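The kind of bookkeeping performed by the ...Width(...) methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: the `Transistor` struct and the free functions `junctionWidth` and `gateWidth` are hypothetical, and the internal data structures of the real classes are not shown in this chapter.

```cpp
#include <cassert>
#include <vector>

enum class Type { N, P };

// Hypothetical, simplified transistor record.
struct Transistor {
    Type type;
    unsigned gate, source, drain;  // node indices; node 0 is ground
    double width;
};

// Sum of the widths of the transistors of the given type whose source or
// drain is connected to node; number receives how many were found.
double junctionWidth(const std::vector<Transistor>& list,
                     unsigned node, Type type, int& number)
{
    double sum = 0.0;
    number = 0;
    for (const Transistor& t : list) {
        if (t.type == type && (t.source == node || t.drain == node)) {
            sum += t.width;
            ++number;
        }
    }
    return sum;
}

// Same sum, for the transistors whose gate is connected to node.
double gateWidth(const std::vector<Transistor>& list,
                 unsigned node, Type type, int& number)
{
    double sum = 0.0;
    number = 0;
    for (const Transistor& t : list) {
        if (t.type == type && t.gate == node) {
            sum += t.width;
            ++number;
        }
    }
    return sum;
}
```

Sums of this kind are what an analytical model needs in order to estimate the junction and gate capacitance loading a node from the transistor widths.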

6.2.2 The class EvaluationAlgorithm

The public and protected interface of this class is:

    class EvaluationAlgorithm
    {
    private:
    protected:
        const CritPathList& pathlist;
        const Options& options;
        unsigned int NumPath;
        unsigned long int Calls;
        double *CPDelay;   // delay
        double *CPPower;   // power
        double *CPNoise;   // noise
        double Area;
    public:
        EvaluationAlgorithm( const CritPathList& pathlist,
                             const Options& options );
        virtual ~EvaluationAlgorithm();
        virtual int Run( const Circuit& circuit,
                         const double *NewWidth,
                         const unsigned *ValidPath ) = 0;
        unsigned long int GetCalls() const { return Calls; }
        double GetDelay( unsigned int index ) const
            { return CPDelay[ index ]; }
        double GetPower( unsigned int index ) const
            { return CPPower[ index ]; }
        double GetNoise( unsigned int index ) const
            { return CPNoise[ index ]; }
        double GetArea() const
            { return Area; }
        unsigned int GetNPath() { return NumPath; }
    };

The main method is

    Run( const Circuit& circuit, const double *NewWidth, ...)

which performs the real simulation of circuit with the new dimensions NewWidth. The other methods return the delay, power, noise and area of the circuit with the new dimensions, the total number of calls to the simulator, and the number of critical paths in the circuit. The class contains the vectors of the delays and powers of all the critical paths, a reference to an instance of the class Options, which contains all the options of the tool, and a reference to an instance of the class CritPathList, which contains all the critical paths of the circuit.

6.2.3 The class OptimizationAlgorithm

The public and protected interface of this class is:

    class OptimizationAlgorithm
    {
    private:
        ...
    protected:
        unsigned int InternalSteps;
        const Circuit& circuit;
        const Options& options;
        unsigned int Steps;
        unsigned int NumTran;
        unsigned int NumPath;
        double *Width;
        double *CPDelay;
        double *CPPower;
        double *CPNoise;
        double Area;
        unsigned int *ValidPath;
        double MaxDelayInitMin;
        double MaxPowerInitMin;
        double MaxNoiseInitMin;
        double AreaInitMin;
        double MaxDelayInitMax;
        double MaxPowerInitMax;
        double MaxNoiseInitMax;
        double AreaInitMax;
        EvaluationAlgorithm& Simulation;
        double NormSim( const double* NewWidth, int& RetCode );
    public:
        OptimizationAlgorithm( const Circuit& circuit,
                               const Options& options,
                               EvaluationAlgorithm& simulation );
        virtual ~OptimizationAlgorithm();
        virtual int Run() = 0;
        unsigned long int GetSteps() const { return Steps; }
        int SimulateCircuit( const double *NewWidth );
        int SimulateFirstCircuit();
        double OptWidth( unsigned int index ) const
            { return Width[ index ]; }
    };

This class provides the method Run(), which invokes the real algorithm, and the method SimulateCircuit(...), which performs the function evaluations by means of the instance EvaluationAlgorithm& Simulation: every time the algorithm needs to perform a function evaluation with new dimensions, it simply invokes the public method Simulation.Run(...), passing the new dimensions to it. The class also provides the methods to return the number of optimization steps and the final optimized widths.
The combination of all the functions returned by Simulation.Run(...) (all the critical path delays and all the power consumptions, see section 5.2.3, page 102) is performed by the method NormSim(...).

6.2.4 The critical path retrieving

The module that retrieves all the critical paths in the circuit (see section 5.1.1, page 80, for the mathematical definitions) is subdivided into three parts:

- the first part identifies all the inputs of the circuit (gate nodes connected to nothing) and all the internal gate nodes (connected to a source or a drain of another transistor);
- the second part searches, for every node in the circuit, all the charging paths between the node and the power supply and all the discharging paths between the node and the ground;
- the third part combines all the previous charging and discharging paths to obtain a true critical path. The combination is performed by checking that the inputs permit the real activation of the path; at the same time the module sets all the inputs to the values necessary to obtain the excitation of the path, i.e. such that a change in the input causes a change in the output.
The main function of the critical paths retriever is:

    int Critic(const Circuit& circuit,CritPathList& pathList,...)

which performs the search of the critical paths in circuit: it simply calls the recursive function int CriticRecurse(...) to search all the charging or discharging paths, and then it combines some of these paths by means of the recursive function int SearchCPRecurse(..). For every charging/discharging path to be added, the function int SearchOKCond(...) is invoked: this very complex function checks that all the input conditions are coherent with the conduction of the path.
In order to ensure a good flexibility of the tool, the designer always has the possibility to specify by hand the critical paths to be used in the optimization. The standard format for them is a text file that, for each critical path, lists the input node, the output node, and the transition on both the input and the output node (rise or fall). In this way it is possible to list only a part of all the critical paths present in a circuit and to take into account only those paths during the optimization.
Moreover, it is possible to use the optimizer for topologies that normally could confuse the critical paths search algorithm, such as pass-transistor logic circuits.

6.2.5 The derived classes


Every time a new optimization algorithm or a new simulator must be introduced, a new class should be derived from the main classes.
As an example, the class for the HSPICE simulator is derived as:

    class Hspice: public EvaluationAlgorithm
    {
    private:
        ...
    public:
        Hspice( const CritPathList& pathlist,
                const Options& options,
                const char* NE );
        ~Hspice();
        int Run( const Circuit& circuit,
                 const double *NewWidth,
                 const unsigned* ValidPath );
    };

and the class for the Powell optimization algorithm (section 4.2.3.2, page 69) is derived as:

    class Powell: public OptimizationAlgorithm
    {
    private:
        ...
    public:
        Powell( const Circuit& circuit,
                const Options& options,
                EvaluationAlgorithm& simulation );
        ~Powell();
        int Run();
    };

Basically, both classes should provide only the method Run(...) (with different parameters, of course), which performs the real simulation or the real optimization algorithm.
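The derivation scheme can be illustrated with a self-contained sketch. The classes `Evaluator` and `Optimizer` below are deliberately simplified stand-ins for EvaluationAlgorithm and OptimizationAlgorithm (their real interfaces are the ones listed above); `ToyModel` and `CoordinateSearch` are hypothetical names, and the fixed-step coordinate search is chosen only because it is short, not because the tool uses it.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Simplified stand-in for the evaluation base class.
class Evaluator {
public:
    virtual ~Evaluator() = default;
    virtual double Run(const std::vector<double>& widths) = 0;
    unsigned long GetCalls() const { return calls_; }
protected:
    unsigned long calls_ = 0;
};

// Simplified stand-in for the optimization base class.
class Optimizer {
public:
    Optimizer(Evaluator& sim, std::vector<double> start)
        : sim_(sim), width_(std::move(start)) {}
    virtual ~Optimizer() = default;
    virtual int Run() = 0;   // every algorithm implements only this
    double OptWidth(std::size_t i) const { return width_[i]; }
protected:
    Evaluator& sim_;
    std::vector<double> width_;
};

// Toy cost function with its minimum at width 3 for every transistor.
class ToyModel : public Evaluator {
public:
    double Run(const std::vector<double>& w) override {
        ++calls_;
        double cost = 0.0;
        for (double x : w) cost += (x - 3.0) * (x - 3.0);
        return cost;
    }
};

// Hypothetical new algorithm: fixed-step coordinate search.
class CoordinateSearch : public Optimizer {
public:
    using Optimizer::Optimizer;
    int Run() override {
        const double step = 0.5;
        double best = sim_.Run(width_);
        bool improved = true;
        while (improved) {
            improved = false;
            for (std::size_t i = 0; i < width_.size(); ++i) {
                for (double d : {step, -step}) {
                    width_[i] += d;                  // try a move
                    const double c = sim_.Run(width_);
                    if (c < best) { best = c; improved = true; }
                    else          { width_[i] -= d; } // undo it
                }
            }
        }
        return 0;
    }
};
```

The optimizer only sees the evaluator through its base-class reference, so the same algorithm runs unchanged on any simulator derived from it.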

6.3 Program flows


The logical flow of the main function of the program is:

Algorithm 6.1. Logical flow of main
Require: Circuit netlist in SPICE-like format
1: Preprocess the input netlist.
2: Process the options configuration file.
3: Build the graph of the circuit.
4: Search the critical paths in the circuit.
5: Invoke the function optimizator.Run().
6: Write results.

The logical flow of the function that retrieves all the critical paths is divided into a few functions:

Algorithm 6.2. Logical flow of the function Critic(...)
Require: A graph of the circuit in which each node is a transistor.
1: Invoke CriticRecurse(...) passing to it the ground node.
2: Invoke CriticRecurse(...) passing to it the power supply node.
3: Invoke SearchCriticalPath(...) passing to it the list of all the discharging paths starting from the ground node.
4: Invoke SearchCriticalPath(...) passing to it the list of all the charging paths starting from the power supply node.
5: Return a list of all the critical paths in the circuit.

Algorithm 6.3. Logical flow of the function CriticRecurse(...)
Require: Node.
 1: for all the transistors that have the source or drain connected to Node do
 2:   if Source = Node then
 3:     Node = Drain.
 4:   else
 5:     Node = Source.
 6:   end if
 7:   Memorize the current transistor in the current list.
 8:   Copy the current list in a new list, in order to create a new list every time there is more than one transistor connected to the same node.
 9:   if both n-type and p-type transistors are connected to Node OR Node is already visited then
10:     Return.
11:   else
12:     Invoke myself with Node.
13:   end if
14: end for
15: return all the lists of nodes starting from Node
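A recursive search in the spirit of Algorithm 6.3 can be sketched as below. It is a simplified reading of the pseudocode, not the tool's CriticRecurse(...): the data structures and the names `Transistor`, `Path`, `isOutputNode`, `walk` and `pathsFrom` are hypothetical, a node touched by both n-type and p-type channels is assumed to terminate a path, and paths are stored as transistor indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Type { N, P };

// Simplified transistor: only the channel terminals matter here.
struct Transistor {
    Type type;
    unsigned source;
    unsigned drain;
};

using Path = std::vector<std::size_t>;  // indices into the transistor list

// A node touched by both n-type and p-type channels ends a path.
bool isOutputNode(const std::vector<Transistor>& ts, unsigned node)
{
    bool hasN = false, hasP = false;
    for (const Transistor& t : ts) {
        if (t.source != node && t.drain != node) continue;
        if (t.type == Type::N) hasN = true; else hasP = true;
    }
    return hasN && hasP;
}

// Depth-first walk through the transistor channels starting from node.
void walk(const std::vector<Transistor>& ts, unsigned node,
          std::vector<bool>& visited, Path& current, std::vector<Path>& found)
{
    visited[node] = true;
    for (std::size_t i = 0; i < ts.size(); ++i) {
        unsigned next;
        if (ts[i].source == node)     next = ts[i].drain;
        else if (ts[i].drain == node) next = ts[i].source;
        else continue;                     // transistor not on this node
        if (visited[next]) continue;       // never walk back
        current.push_back(i);              // memorize the transistor
        if (isOutputNode(ts, next))
            found.push_back(current);      // copy: one list per branch
        else
            walk(ts, next, visited, current, found);
        current.pop_back();
    }
    visited[node] = false;
}

// All the channel paths from a rail (ground or supply) to an output node.
std::vector<Path> pathsFrom(const std::vector<Transistor>& ts,
                            unsigned rail, unsigned numNodes)
{
    assert(rail < numNodes);
    std::vector<bool> visited(numNodes, false);
    Path current;
    std::vector<Path> found;
    walk(ts, rail, visited, current, found);
    return found;
}
```

For a two-input static NAND gate, starting from the ground node yields a single discharging path through the two series n-type transistors, while starting from the supply node yields two charging paths, one for each parallel p-type transistor.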

Algorithm 6.4. Logical flow of the function SearchCriticalPathRecurse(...)
Require: A list of all the charging and discharging paths and a path as a starting point.
 1: for all the charging paths do
 2:   Choose a discharging path that has as input node the output node of the first path.
 3:   Check whether the input conditions are correct and, if necessary, set them.
 4:   Invoke myself with the new path as the first path.
 5: end for
 6: for all the discharging paths do
 7:   Choose a charging path that has as input node the output node of the first path.
 8:   Check whether the input conditions are correct and, if necessary, set them.
 9:   Invoke myself with the new path as the first path.
10: end for

6.4 Conclusions

This chapter describes the implementation of the tool that is behind all the optimizations throughout this thesis. It has been written in a very modular way, in order to permit the efficient insertion of new algorithms and new simulators. It consists of about ten thousand lines of C++, and it deeply exploits object-oriented features in order to hide the implementation details from new developers.

Chapter 7

RESULTS AND CONCLUSIONS

This chapter shows a survey of the optimization of the circuits shown in chapter 5 (page 77): the goal here is to show how a cell library can be optimized in order to be used in VLSI circuits, either full-custom or standard-cell. The path that this chapter will walk goes from mono-objective optimizations to multi-objective ones, passing through constrained optimization. Along this path some conclusions (and opinions!) are drawn, giving cell-library designers some guidelines and tools to facilitate their work and obtain the wanted results.

7.1 Optimization
The cell library to be optimized is composed mainly of basic TSPC CMOS dynamic logic (for a description of the TSPC see chapter 1, page 3, and [1]) but, in order to extend the validity of the results, some static gates are included in the library. The full list of the gates subjected to optimization is shown in table 7.1. For a description of their complexity (both the number of transistors in each cell and the number of critical paths in the same cell) see table 5.1 (page 92).
The library thus comprises the inverter gate, the TSPC AND gates (both the n and the p versions), the TSPC OR and latch gates (again in the n and the p versions), and a full-adder (the version included here is an n-p construction, faster than the almost equivalent p-n construction). As said above, the following are included for comparison: a complete static full-adder, a fully static AND-OR gate (see note a, page 98), a fully static AND, a fully static OR, a fully static parity gate (which performs the parity calculation among three inputs) and, finally, a TSPC full-adder composed only of the TSPC basic gates mentioned above.

Tab. 7.1: Library gates list

Gate
Inverter (fig. 5.10, page 94)
TSPC type n latch (fig. 5.11(a), page 95)
TSPC type p latch (fig. 5.11(b), page 95)
TSPC type n and (fig. 5.12(a), page 96)
TSPC type p and (fig. 5.12(b), page 96)
TSPC type n or (fig. 5.13(a), page 97)
TSPC type p or (fig. 5.13(b), page 97)
Static and-or (fig. 5.14, page 98)
Static and (fig. 5.14, page 98) (See note 14, page 92.)
Static or (fig. 5.14, page 98) (See note 14, page 92.)
Static parity gate (fig. 5.15, page 99)
Static full-adder (figs. 5.16(a), 5.16(b), page 100)
TSPC full-adder (one-stage) (figs. 5.17(a), 5.17(b), page 101)
TSPC full-adder (basic cells)
The very first result reported here is the comparison of the delay and power consumption between the 0.7 µm and the 0.25 µm technology, at minimum width: this comparison is reported in table 7.2 and depicted in figure 7.1(a) for the delay and in figure 7.1(b) for the power consumption.
From that table it is possible to see that, passing from the 0.7 µm to the 0.25 µm technology, the average improvement (reduction) is 69.3% for the delay and 76.2% for the power.
Thus, when the dimensions are scaled down to about 1/3, the average delay and power consumption are also scaled down by about the same factor.

7.1.1 Mono-objective vs. multi-objective

Mono-objective optimization (section 4.1.1, page 49) means optimizing (in our case always decreasing) a single objective, i.e. a well-defined target, to the detriment of all the other possible targets.
The very first optimization policy applied to CMOS circuits was the delay optimization. Figures 7.2 and 7.3 sketch the delay optimization of the gates of table 7.1, in the 0.7 µm and 0.25 µm technology implementations respectively, with arrows representing the delay and energy variation. The arrows start from the initial values (i.e. either the delay or the energy measured at the minimum technology width) and end at the values after the optimization.

Tab. 7.2: Delay and energy dissipation @ minimum width (HSPICE)

Gate                            0.7 µm technology                      0.25 µm technology
                                Delay [ps]  Energy [pJ]  Area [µm²]    Delay [ps]       Energy [pJ]       Area [µm²]
Inverter                        717.5       0.6887       2.4           259.6 (-63.8%)   0.086 (-87.5%)    1.2
TSPC type n latch               921.8       3.491        7.2           293.3 (-68.2%)   0.659 (-81.1%)    3.6
TSPC type p latch               1413.0      2.807        7.2           482.5 (-65.9%)   0.289 (-89.7%)    3.6
TSPC type n and                 1028.0      2.756        8.4           315.8 (-69.3%)   0.513 (-81.4%)    4.8
TSPC type p and                 1413.0      2.51         8.4           482.5 (-65.9%)   0.349 (-86.1%)    4.8
TSPC type n or                  904.3       4.47         8.4           299.7 (-66.9%)   0.826 (-81.5%)    4.8
TSPC type p or                  1413.0      1.654        8.4           482.5 (-65.9%)   0.224 (-86.5%)    4.8
Static and-or                   1180.0      3.639        16.8          334.0 (-71.7%)   1.998 (-45.1%)    8.4
Static and                      760.9       0.722        4.8           277.2 (-63.6%)   0.0907 (-87.4%)   2.4
Static or                       1430        0.75         4.8           233.7 (-83.7%)   0.0713 (-90.5%)   2.4
Static parity gate              2650.0      0.744        57.6          922.2 (-65.2%)   0.0945 (-87.3%)   28.8
Static full-adder               1781        6.475        48            571.3 (-67.9%)   3.155 (-51.3%)    24
TSPC full-adder (one-stage)     930.6       2.168        15.6          276.7 (-70.3%)   0.641 (-70.4%)    7.8
TSPC full-adder (basic cells)   2691.0      8.893        151.2         482.3 (-82.1%)   5.27 (-40.7%)     75.6
Average improvement                                                    -69.3%           -76.2%            -50%

Fig. 7.1: Comparison of 0.7 µm and 0.25 µm gates @ minimum technology width: (a) delay [ps] and (b) energy dissipation [pJ], per gate type.

Fig. 7.2: Delay optimization of 0.7 µm gates: (a) delay amelioration [ps] and (b) energy-dissipation deterioration [pJ], per gate type.

Fig. 7.3: Delay optimization of 0.25 µm gates: (a) delay amelioration [ps] and (b) energy-dissipation deterioration [pJ], per gate type.

As can be expected, the delay improves appreciably (figures 7.2(a), 7.3(a)) while the energy dissipation increases very largely (figures 7.2(b), 7.3(b)): to decrease the delay the optimizer increases the transistor widths, thus increasing the overall power dissipation. Table 7.3 and figure 7.4 report the relative variation of delay and power (as minimum, maximum and mean value) for both technologies: taking the mean values of table 7.3, for the 0.7 µm technology the delay is decreased, on average, by 4.80 times, while for the 0.25 µm technology it is decreased by 3.16 times (figure 7.4(a)). On the contrary, the energy dissipation is increased by 20.42 times in 0.7 µm and by 13.41 times in 0.25 µm (figure 7.4(b)).

Fig. 7.4: Technology comparison of delay optimization: (a) delay variation [ps] and (b) energy-dissipation variation [pJ], per gate type, for the 0.7 µm and 0.25 µm technologies.
Table 7.4 shows the total time taken by the optimization of each gate, together with the total number of function evaluations, that is, the number of times the circuit simulator (in this case HSPICE) has been invoked. These numbers are quite reasonable per se; moreover, the optimization of a cell library needs to be performed only once, before the library is reused. Furthermore, in the case of very large circuits, the modular architecture of the optimizer makes it possible to switch from one simulator to another on the fly; thus we can use a very fast simulator (such as FAST) in the earlier steps of the optimization, and switch to a more precise but slower simulator (such as HSPICE) in the later stages of the optimization process.
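One possible way to realize such a switch is a wrapper evaluator that forwards the first evaluations to a coarse model and the remaining ones to a precise model. The sketch below is only an assumption about how this could be done: the classes are simplified stand-ins for the tool's evaluation classes, `FastModel` and `PreciseModel` are toy cost functions (not FAST or HSPICE), and switching after a fixed number of calls is a hypothetical criterion.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for the evaluation interface.
class Evaluator {
public:
    virtual ~Evaluator() = default;
    virtual double Run(const std::vector<double>& widths) = 0;
};

// Toy analytical model standing in for a fast delay analyzer.
class FastModel : public Evaluator {
public:
    unsigned long calls = 0;
    double Run(const std::vector<double>& w) override {
        ++calls;
        double d = 0.0;
        for (double x : w) d += 1.0 / x;             // crude estimate
        return d;
    }
};

// Toy model standing in for an accurate but slow simulator.
class PreciseModel : public Evaluator {
public:
    unsigned long calls = 0;
    double Run(const std::vector<double>& w) override {
        ++calls;
        double d = 0.0;
        for (double x : w) d += 1.0 / x + 0.01 * x;  // adds a loading term
        return d;
    }
};

// Forwards the first switchAfter evaluations to the coarse model and all
// the following ones to the precise model.
class SwitchingEvaluator : public Evaluator {
public:
    SwitchingEvaluator(Evaluator& coarse, Evaluator& fine,
                       unsigned long switchAfter)
        : coarse_(coarse), fine_(fine), switchAfter_(switchAfter) {}
    double Run(const std::vector<double>& w) override {
        ++calls_;
        return (calls_ <= switchAfter_) ? coarse_.Run(w) : fine_.Run(w);
    }
private:
    Evaluator& coarse_;
    Evaluator& fine_;
    unsigned long switchAfter_;
    unsigned long calls_ = 0;
};
```

Because the optimizer only depends on the base-class interface, the switch is invisible to the optimization algorithm; the objective value may jump at the switching point, which an actual implementation would have to tolerate.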
Tab. 7.3: Delay decreasing and energy increasing (both relative) in a delay optimization.

Technology   Delay decreasing        Energy increasing
             Max.   Min.   Mean      Max.    Min.   Mean
0.7 µm       8.43   3.43   4.80      43.61   8.89   20.42
0.25 µm      7.78   2.75   3.16      35.25   6.17   13.41

These results are largely previsionable, since a hard delay optimization


leads to a very large increase in transistor dimensions, thus leading to a
great area occupancy and energy dissipation.
Moreover, another issue arises when optimizing an entire cell library:
is it necessary to push at their limits every single cell? In a generic static
circuit the total delay is, generally, the sum of the delay of each cell comprising the circuit, since this delay is bounded by the delay of the worst
critical path and, moreover, it is possible to have a single critical path3 from
a primary input to a primary output of the circuit; so it has some sense to
optimize every single cell to its best.
In a generic dynamic circuit, the global delay is still bounded by the delay
of the worst critical path in the circuit, determining thus the minimum clock
period. Since this critical path is contained in a single cell for a single-phase
3

5.1.1, page 80.

7.1. Optimization

129

Tab. 7.4: Elapsed time and total number of function evaluations for a full delay optimization with HSPICE on an Ultra-SPARC 5.

              0.7 µm                       0.25 µm
Gate          El. time [s]   Fun. eval.    El. time [s]   Fun. eval.
inv                332.6         12             212.4         13
and-n             1338.3         34            1675.6         36
and-p             1426.6         34            2449.5         41
or-n              1408.3         32            1950.0         34
or-p              1259.5         31            1355.5         27
latch-n           1286.5         32            1466.7         32
latch-p           1307.1         33            1574.7         31
and-or            5830.9         73            9280.3         91
and-static         786.5         25             729.6         31
or-static          651.6         21             626.1         24
parity           64098.2        159           35274.3        178
static-fa        27034.8        239           23794.1        180
tspc-fa1          2413.3         69            2881.2         70
tspc-fa2         16459.1         66           63485.2        121

dynamic logic (where n-type and p-type gates alternate, working on
different clock phases), the delay of the entire circuit is bounded by the
delay of the worst library cell in the circuit. It therefore makes no sense to push
the basic library cells (which are present in every circuit) to their limits when
the delay of a generic circuit is bounded by the worst of them. It is
more useful to optimize the worst cell in the library, and then to
reduce the delay of the other cells only down to the value obtained in that
optimization. In this way the other cells are kept smaller than after a full delay optimization,
and the overall energy dissipation is reduced.
The resulting idea is to optimize an entire (dynamic) cell
library using a constrained optimization⁴; the strategy for this purpose is:
i) evaluate the delay of every cell at minimum width;
ii) choose the worst cell (with regard to delay) among them;
iii) optimize the delay of this cell as far as possible;
iv) optimize all the other cells so that their delay does not exceed the value
obtained in the previous step.
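The four steps above can be sketched as follows. This is only an illustration under stated assumptions: `Cell`, `LibraryResult` and `optimizeLibrary` are hypothetical names, and the two callbacks stand in for the unconstrained and the constrained optimization runs of the real tool. The `margin` parameter is also an assumption; it allows the bound to be set slightly above the delay reached by the worst cell.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical cell record: name and current delay [ps].
struct Cell {
    std::string name;
    double delay;
};

struct LibraryResult {
    std::string worstCell;    // cell chosen in step ii)
    double delayBound;        // bound applied in step iv)
    std::vector<Cell> cells;  // delays after the whole procedure
};

// minimizeDelay: unconstrained delay optimization, returns the delay reached.
// sizeForBound:  constrained optimization, returns a delay not above the bound.
LibraryResult optimizeLibrary(
    std::vector<Cell> cells,  // step i): delays evaluated at minimum width
    const std::function<double(const Cell&)>& minimizeDelay,
    const std::function<double(const Cell&, double)>& sizeForBound,
    double margin = 1.0)
{
    assert(!cells.empty() && margin >= 1.0);
    // ii) choose the worst cell with regard to delay
    auto worst = std::max_element(
        cells.begin(), cells.end(),
        [](const Cell& a, const Cell& b) { return a.delay < b.delay; });
    // iii) optimize the delay of this cell as far as possible
    worst->delay = minimizeDelay(*worst);
    const double bound = worst->delay * margin;
    // iv) constrain every other cell to the bound
    for (auto it = cells.begin(); it != cells.end(); ++it)
        if (it != worst)
            it->delay = sizeForBound(*it, bound);
    return {worst->name, bound, cells};
}
```

Because only the worst cell is pushed to its limit, the remaining cells can stop growing as soon as they meet the bound.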
⁴ § 4.1.1.2, page 52.

As an example, the constrained optimization of dynamic 0.25 µm gates
is reported in table 7.5. This optimization has been performed with a constraint on every gate: its delay must not exceed 125 ps. This value
has been obtained from an unconstrained optimization of the worst cell (with respect to delay), the TSPC type-p or gate (cf. table 7.2). After this
optimization the delay of this gate was 121.2 ps, so the value chosen for the
optimization of all the other gates was 125 ps.
Tab. 7.5: Constrained delay optimization of a few 0.25 µm gates.

Gate                 Delay pre-opt. [ps]   Delay post-opt. [ps]
and-n                     315.800               100.500
and-p                     482.500               111.900
or-n                      299.700               114.900
or-p                      482.500               121.200
latch-n                   293.300                88.080
latch-p                   482.500               118.600
Average delay             392.72                109.20
Standard deviation         36.65                  3.83

Table 7.5 shows that the delays after the optimization
have a standard deviation⁵ (3.83) far smaller than the standard deviation
before the optimization (36.65). This means that all the cells have nearly the
same delay after the optimization, and that this value is an optimal one,
since it minimizes the delay of a block built from these cells and, at the
same time, reduces the power dissipation and area occupancy with respect
to a solution in which all the cells are optimized independently.
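The spread measure used in this comparison can be computed with a short sketch. It follows the footnote definition (the squared deviations from the arithmetic mean are divided by the number of samples N); the function names `mean` and `standardDeviation` are illustrative choices.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Arithmetic mean m of the samples.
double mean(const std::vector<double>& x)
{
    assert(!x.empty());
    double sum = 0.0;
    for (double v : x)
        sum += v;
    return sum / static_cast<double>(x.size());
}

// Standard deviation with the sum of squared deviations divided by N.
double standardDeviation(const std::vector<double>& x)
{
    const double m = mean(x);
    double sq = 0.0;
    for (double v : x)
        sq += (v - m) * (v - m);
    return std::sqrt(sq / static_cast<double>(x.size()));
}
```

A set of cells with nearly equal delays gives a value close to zero, while a library with a few very slow cells gives a large one.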
Constrained optimization is useful only when a single target
must be constrained to a precise value. It is not useful when
more than one target must be constrained at the same time, for example
delay and power together: such optimizations are not feasible because, first, they
would require a preliminary evaluation of the quantities to be constrained (in order to
know whether the constraints are reasonable) and, second, the optimizer might not be able
to satisfy all the constraints.
A much more useful policy for taking more than one target into account
is to perform a multi-objective optimization.

⁵ The standard deviation $\sigma$ of $N$ samples $x_i$ is defined by $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - m)^2}{N}$, where $m = \frac{\sum_{i=1}^{N} x_i}{N}$ is the arithmetic mean of the samples.
It is a measure of the spreading of the samples around the mean.


Figures 7.5 and 7.7 show four different multi-objective optimizations
for the 0.7 µm and the 0.25 µm technology respectively (figures 7.6 and 7.8
are zooms of figures 7.5(b) and 7.7(b)). The four
optimizations performed are:
i) a full delay optimization, indicated with Delay=100% Power=0%;
ii) a delay optimization taking the power consumption slightly into account, indicated with Delay=80% Power=20%;
iii) a delay-power optimization, taking delay and power dissipation
into account in equal measure, indicated with Delay=50% Power=50%;
iv) a delay optimization taking the power consumption strongly into account, indicated with Delay=20% Power=80%.
The percentages⁶ reported after delay and power are also the
two weighting coefficients of equation 5.5 (page 105), used as the cost function
in the optimization algorithm.
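How two such weights steer a single cost value can be illustrated with a generic weighted sum. Equation 5.5 is not reproduced in this section, so the form below (each quantity normalized to a reference value and multiplied by its weight) is an assumption for illustration, and the name `weightedCost` and all its parameters are hypothetical.

```cpp
#include <cassert>

// Generic two-objective cost (an assumed form, not necessarily equation 5.5):
// delay and power are normalized to reference values, e.g. those of the
// minimum-width cell, so that the two terms are comparable.
double weightedCost(double delay, double power,
                    double delayRef, double powerRef,
                    double wDelay, double wPower)
{
    assert(delayRef > 0.0 && powerRef > 0.0);
    assert(wDelay >= 0.0 && wPower >= 0.0);
    return wDelay * (delay / delayRef) + wPower * (power / powerRef);
}
```

With wDelay = 1 and wPower = 0 the cost reduces to the normalized delay, which corresponds to the Delay=100% Power=0% policy; with equal weights, a delay gain is accepted only if it outweighs the normalized power increase.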
These figures show that the delay is reduced more and more as
its relative weight increases, while the increase of the power dissipation is limited by the increase of its own relative weight.
Of all the optimization policies, the one that gives the most useful
results is the optimization of delay and power with the same weights, that
is the one indicated with Delay=50% Power=50% in the previous figures.
These results are also reported separately in figure 7.9.
This is probably the most useful optimization, since it still reduces
the delay considerably while keeping the increase of the power dissipation at a more
acceptable value.
Figures 7.10, 7.11, 7.12 and 7.13 show the same four optimizations
as trajectories in the delay-power space during the optimization process. In these figures each marked point is a step of the optimization
process. They show how increasing the relative weight of the
delay in the cost function (and thus reducing the relative weight of the energy)
leads the optimizer to go further along the trajectory, reducing the delay and
increasing the energy dissipation.

⁶ The case Delay=0% Power=100% has not been included, since this kind of optimization leads to the trivial result of all the transistors at minimum width (cf. § 5.2.2.2,
page 96).

[Figure: (a) Delay variation, delay [ps] per gate type; (b) Energy-dissipation variation, energy [pJ] per gate type. Curves: Delay=100% Power=0%, Delay=80% Power=20%, Delay=50% Power=50%, Delay=20% Power=80%.]

Fig. 7.5: Several delay-power optimization policies of 0.7 µm gates.

[Figure: energy [pJ] per gate type, same four curves as figure 7.5(b), vertical axis limited to 50 pJ.]

Fig. 7.6: Energy-dissipation variation (zoom of figure 7.5(b))


Tab. 7.6: Delay worsening and energy-dissipation improvement between a
full delay optimization and a delay-power optimization.

              0.7 µm                        0.25 µm
Gate          Delay    Energy   Area        Delay    Energy   Area
inv           39.3%    -20.1%   -40.9%      15.7%    -10.4%   -21.1%
andn          27.8%    -92.2%   -87.4%       6.3%    -36.3%   -42.1%
andp          48.4%    -81.0%   -80.4%       1.1%    -39.4%   -49.2%
orn           33.1%    -67.2%   -76.2%      46.9%    -77.5%   -66.9%
orp           33.1%    -77.8%   -69.9%      11.8%    -35.5%   -21.6%
latchn        31.5%    -71.1%   -84.7%      41.3%    -22.0%   -46.4%
latchp        28.3%    -75.2%   -76.1%      14.6%    -69.5%   -72.1%
and-or        29.5%    -91.2%   -89.2%       6.7%    -81.2%   -79.1%
and-static    28.7%    -67.3%   -79.3%      14.4%    -42.1%   -53.3%
or-static     21.4%    -30.4%   -28.8%      -3.4%     18.4%   -12.8%
parity         7.7%    -78.1%   -81.2%       2.5%    -50.3%   -51.0%
static-fa     33.3%    -87.2%   -86.7%       5.9%    -81.9%   -82.4%
tspc-fa1      11.0%    -29.3%   -27.4%      15.4%    -48.1%   -48.3%
tspc-fa2      12.3%    -72.5%   -71.2%       8.6%    -41.9%   -44.1%
average      +27.5%    -67.2%   -69.9%     +13.4%    -44.1%   -49.3%

From these figures it can again be clearly seen that the multi-objective
optimization Delay=50% Power=50% gives the best compromise between
reducing the delay and keeping the energy dissipation within reasonable values. These results are summarized in table 7.6,
which shows the percent variation of delay, energy dissipation and area
between the values obtained after a full delay optimization and the values
obtained after a delay-power optimization. The average worsening of the
delay (i.e. the difference between the delay value after a full delay optimization and the same value after a delay-power optimization) is +27.5% for
the 0.7 µm technology and just +13.4% for the 0.25 µm technology. Despite
this small worsening, the average energy-dissipation reduction is
67.2% for the 0.7 µm technology and 44.1% for the 0.25 µm technology,
while the area occupancy reductions are, respectively, 69.9% and 49.3%.
This means that accepting a slight degradation of the delay leads to
a great reduction of the overall energy dissipation and area occupancy.

[Figure: (a) Delay variation, delay [ps] per gate type; (b) Energy-dissipation variation, energy [pJ] per gate type. Curves: Delay=100% Power=0%, Delay=80% Power=20%, Delay=50% Power=50%, Delay=20% Power=80%.]

Fig. 7.7: Several delay-power optimization policies of 0.25 µm gates.

[Figure: energy [pJ] per gate type, same four curves as figure 7.7(b), vertical axis limited to 5 pJ.]

Fig. 7.8: Energy-dissipation variation (zoom of figure 7.7(b))
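The averages quoted from table 7.6 can be recomputed from the per-gate values. The sketch below does so for the 0.25 µm delay column; the helper names `percentVariation` and `average` and the constant `kDelayVariation025` are illustrative, while the numbers are copied from the table.

```cpp
#include <cassert>
#include <vector>

// Percent variation of `after` with respect to `before`
// (positive means the value has grown).
double percentVariation(double before, double after)
{
    assert(before != 0.0);
    return 100.0 * (after - before) / before;
}

// Arithmetic mean of a list of values.
double average(const std::vector<double>& values)
{
    assert(!values.empty());
    double sum = 0.0;
    for (double v : values)
        sum += v;
    return sum / static_cast<double>(values.size());
}

// Per-gate delay variation [%] for the 0.25 um technology (Tab. 7.6),
// from inv down to tspc-fa2.
const std::vector<double> kDelayVariation025 = {
    15.7, 6.3, 1.1, 46.9, 11.8, 41.3, 14.6,
    6.7, 14.4, -3.4, 2.5, 5.9, 15.4, 8.6};
```

The fourteen values sum to 187.8, so `average(kDelayVariation025)` is about 13.4, in agreement with the last row of the table.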

7.2 Conclusions

The goal of the optimization framework presented in this chapter is to
show a new way to optimize the performance of CMOS cells employed in
VLSI circuits.

This new methodology, the multi-objective optimization, has led to a prominent result: the delay of a circuit can be reduced while taking into account the
power consumption and the area occupancy. The results of table 7.6 are the
most telling: at the price of a small compromise on delay
with respect to a full delay optimization, the power consumption is strongly decreased. This means that the default optimization used until now, the
full delay optimization, can safely be replaced by a multi-objective optimization. A circuit that consumes less power while maintaining
almost the same delay is also safer from the operating point of view: it develops
less heat, hence it is more reliable.

[Figure: (a) Delay variation, delay [ps] per gate type; (b) Energy-dissipation variation, energy [pJ] per gate type. Curves: 0.7 µm, 0.25 µm.]

Fig. 7.9: Delay-power optimization (50%-50%) comparison of 0.7 µm and
0.25 µm gates.

[Figure: delay-power trajectory during optimizations, delay [ps] versus energy [pJ], starting point marked; panels (a) 0.25 µm and (b) 0.7 µm. Curves: Delay=100% Power=0%, Delay=80% Power=20%, Delay=50% Power=50%, Delay=20% Power=80%.]

Fig. 7.10: Delay and power trajectory during 4 different multi-objective optimizations for the and-or gate of figure 5.14 (page 98)

[Figure: delay-power trajectory during optimizations, delay [ps] versus energy [pJ], starting point marked; panels (a) 0.25 µm and (b) 0.7 µm. Curves: Delay=100% Power=0%, Delay=80% Power=20%, Delay=50% Power=50%, Delay=20% Power=80%.]

Fig. 7.11: Delay and power trajectory during 4 different multi-objective optimizations for the parity gate of figure 5.15 (page 99)
The ease with which several optimization policies can be applied
greatly helps the work of the cell-library designer: with very little effort, the designer
can produce from the same version of a library
several libraries optimized in different ways. Each cell in a library then has
different performance with respect to the same cell in the other libraries,
but it is still fully equivalent from the point of view of the function performed.

[Figure: delay-power trajectory during optimizations, delay [ps] versus energy [pJ], starting point marked; panels (a) 0.25 µm and (b) 0.25 µm. Curves: Delay=100% Power=0%, Delay=80% Power=20%, Delay=50% Power=50%, Delay=20% Power=80%.]

Fig. 7.12: Delay and power trajectory during 4 different multi-objective optimizations for the static full-adder of figure 5.16 (page 100)

[Figure: delay-power trajectory during optimizations, delay [ps] versus energy [pJ], starting point marked; panels (a) 0.25 µm and (b) 0.7 µm. Curves: Delay=100% Power=0%, Delay=80% Power=20%, Delay=50% Power=50%, Delay=20% Power=80%.]

Fig. 7.13: Delay and power trajectory during 4 different multi-objective optimizations for the dynamic full-adder of figure 5.17 (page 101)
Consider, for example, an and gate that always performs the same
function, but with different delays or different power dissipations.
By simply swapping one library version (for example one optimized only for
the delay) with another (for example one optimized taking into account
the power consumption), the designer can develop several versions of the
same project with different performance.

7.3 Future works


Possible directions for future work are:
Noise problems. This means using another target in the optimization
policies: the noise ([18]) of a circuit.
This is a complex field, and a good starting point could be the development
of a noise model of a CMOS circuit.
Interconnections. A simpler task could be to take into account the influence of interconnections in the optimization.
This means both including a model of the interconnections in the
cell and optimizing the performance of the whole structure.
Topology extensions. The optimizer can be extended to optimize structures other than the standard cells (both static
and dynamic): for example memory cells, or pass-logic gates.
This mainly means modifying the algorithm that performs the
automatic search of all the critical paths in a circuit, to adapt it to
different topologies. The optimizer already offers, in any case, the possibility to list the critical paths by hand and to perform the optimization
on these paths.
CAD integration. The optimizer could be integrated into a standard CAD
tool that assists the designer in developing an ASIC from high-level
specifications down to layout level. One step of this flow could be the optimization of the library employed in the project.


APPENDIX

Appendix A

CLASS GRAPH

Class Graph

CircuitNetList > Circuit
OptimizationAlgorithm > TestEval, Slop2, Slop, Powell, Anneal
EvaluationAlgorithm > TestOpt, Hspice, Fast
Options
CPNode
CritPathList
TransistorNode
TransistorList
CapacitorList
Node
NodeList

Appendix B

SOURCE CODE
B.1 Main functions

CPNode.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"

///
CPNode::CPNode() :
VALID( 0 ), NodeIn( 0 ), NodeOut( 0 ), NumTranList( 0 ),
ActiveInputs( 0 ), NoActiveInputs( 0 ), InitialConditions( 0 ),
ActiveInputsIter( 0 ), NoActiveInputsIter( 0 ), InitialConditionsIter( 0 ),
next( 0 )
{
for ( unsigned int i = 0; i < MAXCHAIN; i++ )
{
TransistorNameList[ i ] = 0;
TransistorNameListIter[ i ] = 0;
NumTranN[ i ] = 0;
NumTranP[ i ] = 0;
}
}
///
CPNode::~CPNode()
{
NodeValueList* tmp;
while ( ActiveInputs )
{
tmp = ActiveInputs->next;
delete ActiveInputs;
ActiveInputs = tmp;
}
while ( NoActiveInputs )
{
tmp = NoActiveInputs->next;
delete NoActiveInputs;
NoActiveInputs = tmp;
}
while ( InitialConditions )
{
tmp = InitialConditions->next;
delete InitialConditions;
InitialConditions = tmp;
}
TrList* tmp2;
for ( unsigned int i = 0; i < MAXCHAIN; i++ )
while ( TransistorNameList[ i ] )
{
tmp2 = ( TransistorNameList[ i ] ) ->next;
delete TransistorNameList[ i ];
TransistorNameList[ i ] = tmp2;
}
}
///
int CPNode::InsNodeIn( unsigned int Node, TransitionType T, double Time )
{
if ( NodeIn )
/// ERROR, yet inserted
return NOT_FOUND;


NodeIn = Node;
TransitionIn = T;
InTime = Time;
return OK;
}
///
int CPNode::InsNodeOut( unsigned int Node, TransitionType T )
{
if ( NodeOut )
/// ERROR, yet inserted
return NOT_FOUND;
NodeOut = Node;
TransitionOut = T;
return OK;
}
///
int CPNode::InsActIn( unsigned int Node, double Val )
{
NodeValueList * tmp;
if ( !ActiveInputs )
{
ActiveInputs = new NodeValueList;
if ( !ActiveInputs )
return NO_MEM;
ActiveInputs->next = 0;
}
else
{
tmp = new NodeValueList;
if ( !tmp )
return NO_MEM;
tmp->next = ActiveInputs;
ActiveInputs = tmp;
}
ActiveInputs->node = Node;
ActiveInputs->value = Val;
ActiveInputsIter = ActiveInputs;
return OK;
}
///
int CPNode::InsNoActIn( unsigned int Node, double Val )
{
NodeValueList * tmp;
if ( !NoActiveInputs )
{
NoActiveInputs = new NodeValueList;
if ( !NoActiveInputs )
return NO_MEM;
NoActiveInputs->next = 0;
}
else
{
tmp = new NodeValueList;
if ( !tmp )
return NO_MEM;
tmp->next = NoActiveInputs;
NoActiveInputs = tmp;
}
NoActiveInputs->node = Node;
NoActiveInputs->value = Val;
NoActiveInputsIter = NoActiveInputs;
return OK;
}
///
int CPNode::InsIniCond( unsigned int Node, double Val )
{


NodeValueList * tmp;
if ( !InitialConditions )
{
InitialConditions = new NodeValueList;
if ( !InitialConditions )
return NO_MEM;
InitialConditions->next = 0;
}
else
{
tmp = new NodeValueList;
if ( !tmp )
return NO_MEM;
tmp->next = InitialConditions;
InitialConditions = tmp;
}
InitialConditions->node = Node;
InitialConditions->value = Val;
InitialConditionsIter = InitialConditions;
return OK;
}
///
int CPNode::InsTran( const char* name, TransistorType TR, unsigned int index )
{
TrList * tmp;
TrList* tail;
if ( !TransistorNameList[ index ] )
{
TransistorNameList[ index ] = new TrList;
if ( !TransistorNameList[ index ] )
return NO_MEM;
( TransistorNameList[ index ] ) ->next = 0;
TransistorNameListIter[ index ] = TransistorNameList[ index ];
tail = TransistorNameList[ index ];
}
else
{
tmp = new TrList;
tail = TransistorNameList[ index ];
if ( !tmp )
return NO_MEM;
tmp->next = 0;
while ( tail->next )
tail = tail->next;
tail->next = tmp;
tail = tmp;
}
tail->name = new char[ strlen( name ) + 1 ];
if ( !( tail->name ) )
return NO_MEM;
strcpy( tail->name, name );
if ( TR == NMOS )
NumTranN[ index ] ++;
else if ( TR == PMOS )
NumTranP[ index ] ++;
else
return NOT_FOUND;
return OK;
}
///
int CPNode::TraverseActiveInputs( unsigned int& Node, double& value ) const
{
CPNode * const localThis = ( CPNode * const ) this;
if ( ActiveInputsIter )
{
Node = ActiveInputsIter->node;
value = ActiveInputsIter->value;


localThis->ActiveInputsIter = ActiveInputsIter->next;
return 1;
}
else
localThis->ActiveInputsIter = ActiveInputs;
return 0;
}
///
int CPNode::TraverseNoActiveInputs( unsigned int& Node, double& value ) const
{
CPNode * const localThis = ( CPNode * const ) this;
if ( NoActiveInputsIter )
{
Node = NoActiveInputsIter->node;
value = NoActiveInputsIter->value;
localThis->NoActiveInputsIter = NoActiveInputsIter->next;
return 1;
}
else
localThis->NoActiveInputsIter = NoActiveInputs;
return 0;
}
///
int CPNode::TraverseInitialConditions( unsigned int& Node, double& value ) const
{
CPNode * const localThis = ( CPNode * const ) this;
if ( InitialConditionsIter )
{
Node = InitialConditionsIter->node;
value = InitialConditionsIter->value;
localThis->InitialConditionsIter = InitialConditionsIter->next;
return 1;
}
else
localThis->InitialConditionsIter = InitialConditions;
return 0;
}
///
const char* CPNode::TraverseTransistorNameList( unsigned int index = 0 ) const
{
CPNode * const localThis = ( CPNode * const ) this;
if ( TransistorNameListIter[ index ] )
{
char * name = new char[ strlen( ( TransistorNameListIter[ index ] ) ->name ) + 1 ];
if ( !name )
return 0;
strcpy( name, ( TransistorNameListIter[ index ] ) ->name );
localThis->TransistorNameListIter[ index ] = ( TransistorNameListIter[ index ] ) ->next;
return name;
}
else
localThis->TransistorNameListIter[ index ] = TransistorNameList[ index ];
return 0;
}
///
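const char* CPNode::TransistorName( unsigned int pathIndex, unsigned int index = 0 ) const
{
TrList * tmp = TransistorNameList[ index ];
// Walk pathIndex links down the list, stopping early if the list is shorter.
for ( unsigned int i = pathIndex; i > 0 && tmp; i-- )
tmp = tmp->next;
// tmp is null when pathIndex is past the end of the list.
return tmp ? tmp->name : 0;
}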


CapInsert.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_devices.h"
///
int CapacitorList::Insert( unsigned int node1, unsigned int node2, double val )
{
Capacitance * tmp;
if ( !head )
{
head = new Capacitance;
if ( !head )
return NO_MEM;
head->next = 0;
}
else
{
tmp = new Capacitance;
if ( !tmp )
return NO_MEM;
tmp->next = head;
head = tmp;
}
head->node1 = node1;
head->node2 = node2;
head->val = val;
NumCap++;
return OK;
}

CapacitorList.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_devices.h"
///
CapacitorList::CapacitorList() : NumCap( 0 ), head( 0 )
{}
///
CapacitorList::~CapacitorList()
{
Capacitance* tmp;
while ( head )
{
tmp = head->next;
delete head;
head = tmp;
}
}
///
const Capacitance& CapacitorList::operator[]( unsigned int index ) const
{
if ( index >= NumCap ) // valid indices are 0 .. NumCap - 1
error( NOT_FOUND, 0, "Index out of bound in [CapacitorList]..." );
unsigned int i = index;
Capacitance* tmp = head;
while ( i-- )
tmp = tmp->next;


return *tmp;
}
///
Capacitance& CapacitorList::operator[]( unsigned int index )
{
if ( index >= NumCap ) // valid indices are 0 .. NumCap - 1
error( NOT_FOUND, 0, "Index out of bound in [CapacitorList]..." );
unsigned int i = index;
Capacitance* tmp = head;
while ( i-- )
tmp = tmp->next;
return *tmp;
}

Circuit.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"

///
Circuit::Circuit( const char* FileNetList, const Options& options ) :
CircuitNetList( FileNetList, options )
{
print_log( "Creating circuit graph..." );
}
///
Circuit::~Circuit()
{}

CircuitNetList.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"

///
CircuitNetList::CircuitNetList( const char *FileNetList, const Options& options ) :
Val( 0.0 ), ValNode( 0 )
{
print_log( "Creating transistors list..." );
char *FileIn = new char[ strlen( FileNetList ) + 1 ];
if ( !FileIn )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );
error( NO_MEM, errno, "HEY! " );
}
strcpy( FileIn, FileNetList );
FileNetOut = new char[ strlen( FileNetList ) + strlen( NetListSuffix ) + 1 ];
if ( !FileNetOut )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );


error( NO_MEM, errno, "HEY! " );


}
strcpy( FileNetOut, FileNetList );
strcat( FileNetOut, NetListSuffix );
if ( int RetCode = PreProcess( FileNetList, options.NamemosN(), options.NamemosP() ) )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ RetCode ] );
error( RetCode, errno, "HEY! " );
}
delete[] FileIn;
}
///
CircuitNetList::~CircuitNetList()
{
delete[] FileNetOut;
}
///
const TransistorNode& CircuitNetList::operator[]( unsigned int index ) const
{
if ( index >= GetNTran() ) // valid indices are 0 .. GetNTran() - 1
error( NOT_FOUND, 0, "Index out of bound in [Circuit]..." );
return TranList[ index ];
}
///
const TransistorNode& CircuitNetList::operator[]( const char* name ) const
{
return TranList[ name ];
}
///
int CircuitNetList::TranPos( const char* name ) const
{
unsigned int index = 0;
unsigned int NT = GetNTran();
while ( index < NT )
{
if ( !strcasecmp( name, TranList[ index ].DevName() ) )
return TranList[ index ].Index();
index++;
}
return -1;
}

CircuitNetlistParse.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"

///
int CircuitNetList::ParseMosLine( char *line, char *line2, const char* mosn, const char* mosp )
{
char tmpstr[ 128 ];
char parsestr[ 128 ];
char endpar[ 128 ];
char mos[ 8 ];
char type[ 16 ];
char par[ 16 ];
char lstr[ 16 ];


unsigned int n1, n2, n3, n4;


TransistorType Type;
double W, L;
strcpy( parsestr, "%s %u %u %u %u %s" );
strcpy( endpar, " " );
if ( sscanf( line, parsestr, mos, &n1, &n2, &n3, &n4, type ) == 6 )
{
unsigned int nw = 0;
unsigned int nl = 0;
sprintf( line2, "%s %u %u %u %u %s ", mos, n1, n2, n3, n4, type );
if ( !strcasecmp( mosn, type ) )
Type = NMOS;
else if ( !strcasecmp( mosp, type ) )
Type = PMOS;
else
return PARSE_ERROR;
strcpy( parsestr, "%*s %*u %*u %*u %*u %*s" );
strcpy( tmpstr, parsestr );
strcat( parsestr, " %s" );
while ( ( sscanf( line, parsestr, par ) == 1 ) && ( nl * nw == 0 ) )
{
unsigned int npar = 0;
if ( ( par[ 0 ] == 'w' ) || ( par[ 0 ] == 'W' ) )
nw = 1;
else if ( ( par[ 0 ] == 'l' ) || ( par[ 0 ] == 'L' ) )
nl = 1;
else
{
npar++;
if ( npar == 1 )
strcat( endpar, " \n+" );
strcat( endpar, " " );
strcat( endpar, par );
}
if ( nw == 1 )
{
unsigned int count = 0;
while ( !isdigit( par[ count++ ] ) );
sscanf( &par[ --count ], "%lf%*c", &W );
strcat( line2, par );
nw++;
}
if ( nl == 1 )
{
strcpy( lstr, par );
unsigned int count = 0;
while ( !isdigit( par[ count++ ] ) );
sscanf( &par[ --count ], "%lf%*c", &L );
nl++;
}
if ( nw * nl )
{
strcat( line2, " " );
strcat( line2, lstr );
}
strcat( tmpstr, " %*s" );
strcpy( parsestr, tmpstr );
strcat( parsestr, " %s" );
}
while ( sscanf( line, parsestr, par ) == 1 )
{
strcat( tmpstr, " %*s" );
strcpy( parsestr, tmpstr );
strcat( parsestr, " %s" );
strcat( line2, " " );
strcat( line2, par );
}
strcat( line2, " " );

strcat( line2, endpar );
if ( TranList.Insert( mos, W, L, Type, n1, n2, n3 ) )
return NO_MEM;
return OK;
}
return PARSE_ERROR;
}

CircuitNetlistPreprocess.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"

///
int CircuitNetList::PreProcess( const char* FileNetList, const char* NameMosN, const char* NameMosP )
{
char line[ 1024 ];
char line2[ 1024 ];
char command[ 32 ];
ifstream i_file( FileNetList );
ofstream o_file( FileNetOut );
int ToBeCopied;
if ( !i_file )
return NOT_FOUND;
if ( !o_file )
return NOT_FOUND;
while ( i_file.getline( line, 1023 ) )
{
int c = 0;
ToBeCopied = 1;
while ( isspace( line[ c++ ] ) );
switch ( line[ --c ] )
{
case '.':
sscanf( &line[ c + 1 ], "%s", command );
ToBeCopied = strcasecmp( command, "tran" ) && \
strcasecmp( command, "dc" ) && \
strcasecmp( command, "ac" );
if ( !ToBeCopied )
{
strcpy( line2, "***** " );
strcat( line2, &line[ c ] );
}
break;
case 'v':
case 'V':
sscanf( &line[ c + 1 ], "%s", command );
ToBeCopied = !( strcasecmp( command, "dd" ) && \
strcasecmp( command, "cc" ) && \
strcasecmp( command, "al" ) );
if ( ToBeCopied )
{
int node2;
ToBeCopied = 0;
sscanf( &line[ c ], "%*s %d %d %*s %lf", &ValNode, &node2, &Val );
sprintf( line2, "vdd %d %d dc %g ", ValNode, node2, Val );
}
else
{
strcpy( line2, "* " );
strcat( line2, &line[ c ] );

}
break;
case 'm':
case 'M':
case 'x':
case 'X':
ToBeCopied = ParseMosLine( &line[ c ], line2, NameMosN, NameMosP );
break;
case 'c':
case 'C':
{
// Capacitor line: record it, then copy the line unchanged.
unsigned int node1, node2;
double val;
sscanf( &line[ c ], "%*s %u %u %lg", &node1, &node2, &val );
if ( CapList.Insert( node1, node2, val ) != OK )
{
i_file.close();
o_file.close();
return NO_MEM;
}
}
break;
default:
break;
}
if ( ToBeCopied == 0 )
o_file << line2 << endl;
else
o_file << &line[ c ] << endl;
}
o_file.close();
i_file.close();
if ( Val <= 0.0 )
{
print_log( "Error: no|wrong VDD defined" );
return NOT_FOUND;
}
return OK;
}

CircuitPrint.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"

///
void Circuit::PrintResult( unsigned long int Step,
unsigned int NT,
unsigned int NP,
const double* NewWidth,
const double* CPDelay,
const double* CPPower,
const double *CPNoise,
double Area,
double maxT,
double maxP,
double maxN,
double f,
double fLast ) const
{
char log[ 1024 ], tmp[ 1024 ];
if ( Step == 1 )
{


ofstream o_file( "RESULT.log" );


ofstream o_fileW( "RESULT_W.log" );
ofstream o_fileT( "RESULT_T.log" );
ofstream o_fileP( "RESULT_P.log" );
if ( !o_file )
{
print_log( "Warning: can't create file RESULT.log" );
return ;
}
sprintf( log, "# Step " );
strcat( log, "Norm_N(W[]) " );
strcat( log, "OptFunc " );
strcat( log, "Error " );
strcat( log, "Max(T[]) " );
strcat( log, "Max(P[]) " );
strcat( log, "Max(N[]) " );
strcat( log, " A " );
o_file << log << endl;
sprintf( log, "# Step " );
for ( unsigned int i = 0; i < NT; i++ )
{
sprintf( tmp, "W[%u] ", i );
strcat( log, tmp );
}
o_fileW << log << endl;
sprintf( log, "# Step " );
for ( unsigned int i = 0; i < NP; i++ )
{
sprintf( tmp, "T[%u] ", i );
strcat( log, tmp );
}
o_fileT << log << endl;
sprintf( log, "# Step " );
for ( unsigned int i = 0; i < NP; i++ )
{
sprintf( tmp, "P[%u] ", i );
strcat( log, tmp );
}
o_fileP << log << endl;
// the noise header is built here but not written to any file in this listing
for ( unsigned int i = 0; i < NP; i++ )
{
sprintf( tmp, "N[%u] ", i );
strcat( log, tmp );
}

o_file.close();
o_fileW.close();
o_fileP.close();
o_fileT.close();
}
ofstream o_file( "RESULT.log", ios::app );
ofstream o_fileW( "RESULT_W.log", ios::app );
ofstream o_fileT( "RESULT_T.log", ios::app );
ofstream o_fileP( "RESULT_P.log", ios::app );
if ( !o_file )
{
print_log( "Warning: can't create file RESULT.log" );
return ;
}
sprintf( log, "%7ld ", Step );
sprintf( tmp, "%4.3f ", NORM_N( NewWidth, NT ) );
strcat( log, tmp );
sprintf( tmp, "%4.3g ", f );
strcat( log, tmp );
sprintf( tmp, "%4.3g ", (f - fLast) / fLast * 100);
strcat( log, tmp );
sprintf( tmp, "%4.3f ", maxT );
strcat( log, tmp );
sprintf( tmp, "%4.3f ", maxP );


strcat( log, tmp );


sprintf( tmp, "%4.3f ", maxN );
strcat( log, tmp );
sprintf( tmp, "%4.3f ", Area );
strcat( log, tmp );
o_file << log << endl;
sprintf( log, "%7ld ", Step );
for ( unsigned int i = 0; i < NT; i++ )
{
sprintf( tmp, "%4.3f ", NewWidth[ i ] );
strcat( log, tmp );
}
o_fileW << log << endl;
sprintf( log, "%7ld ", Step );
for ( unsigned int i = 0; i < NP; i++ )
{
sprintf( tmp, "%4.3f ", CPDelay[ i ] );
strcat( log, tmp );
}
o_fileT << log << endl;
sprintf( log, "%7ld ", Step );
for ( unsigned int i = 0; i < NP; i++ )
{
sprintf( tmp, "%4.3f ", CPPower[ i ] );
strcat( log, tmp );
}
o_fileP << log << endl;
sprintf( log, "%7ld ", Step );
for ( unsigned int i = 0; i < NP; i++ )
{
sprintf( tmp, "%4.3f ", CPNoise[ i ] );
strcat( log, tmp );
}

o_file.close();
o_fileW.close();
o_fileP.close();
o_fileT.close();
}
///
double NORM_N( const double* V, unsigned int l )
{
double norm = 0.0;
for ( unsigned int i = 0; i < l; i++ )
norm += pow( V[ i ], double( l ) );
norm = pow( norm, double( 1.0 / l ) );
// if(V[i] > norm)
// norm = V[i];
return norm;
}
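
NORM_N raises every entry to the power l (the vector length) and takes the l-th root, so it evaluates the p-norm with p = l. For non-negative entries the result lies between the largest entry and that entry times l^(1/l); the commented-out lines in NORM_N appear to sketch a plain maximum instead. The block below is a standalone restatement under those assumptions, not the dissertation's code; the names `PNorm` and `MaxEntry` are introduced here only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// p-norm with p equal to the vector length, mirroring NORM_N.
// Assumes non-negative entries (e.g. transistor widths).
double PNorm(const std::vector<double>& v)
{
    assert(!v.empty());
    const double p = static_cast<double>(v.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)
        sum += std::pow(v[i], p);
    return std::pow(sum, 1.0 / p);
}

// Largest entry, i.e. the limit of the p-norm as p grows.
double MaxEntry(const std::vector<double>& v)
{
    assert(!v.empty());
    double m = v[0];
    for (std::size_t i = 1; i < v.size(); ++i)
        if (v[i] > m)
            m = v[i];
    return m;
}
```

For short vectors the two values can differ noticeably, so the p-norm is best read as a smooth stand-in for the maximum rather than as the maximum itself.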

CircuitTranListNode.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"

///
int Circuit::TransistorListNode(unsigned int node, TransistorList& TList, unsigned int& n, unsigned int& p) const
{
// find all the transistors with source or drain
// connected to node and return a list, plus the number of n and p connected
unsigned int NT = GetNTran();
n = p = 0;
for ( unsigned int i = 0; i < NT; i++ )
{
if ( ( TranList[ i ].Source() == node ) ||
( TranList[ i ].Drain() == node ) )
{
TList.Insert((TranList[i]).DevName(), (TranList[i]).Width(),
(TranList[i]).Length(), (TranList[i]).TrType(),
(TranList[i]).Source(), (TranList[i]).Gate(),
(TranList[i]).Drain());
if ((TranList[i]).TrType() == NMOS)
n++;
else if ((TranList[i]).TrType() == PMOS)
p++;
}
}
return OK;
}

CircuitWidth.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"

///
double Circuit::JunctionNWidth( unsigned int node, int& number, const double* NewWidth = 0 ) const
{
// find all the nmos transistors with source or drain
// connected to node and return the sum of widths
unsigned int NT = GetNTran();
number = 0;
double W = 0.0;
if ( node == 0 )
return 0.0;
for ( unsigned int i = 0; i < NT; i++ )
{
if ( TranList[ i ].TrType() == NMOS )
if ( ( TranList[ i ].Source() == node ) ||
( TranList[ i ].Drain() == node ) )
{
if ( !NewWidth )
W += TranList[ i ].Width();
else
W += NewWidth[ TranPos( TranList[ i ].DevName() ) ];
number++;
}
}
return W;
}
///
double Circuit::GateNWidth( unsigned int node, int& number, const double* NewWidth = 0 ) const
{
// find all the nmos transistors with gate
// connected to node and return the sum of widths
unsigned int NT = GetNTran();
double W = 0.0;
number = 0;
if ( node == 0 )
return 0.0;

for ( unsigned int i = 0; i < NT; i++ )
{
if ( TranList[ i ].TrType() == NMOS )
if ( TranList[ i ].Gate() == node )
{
if ( !NewWidth )
W += TranList[ i ].Width();
else
W += NewWidth[ TranPos( TranList[ i ].DevName() ) ];
number++;
}
}
return W;
}
///
double Circuit::JunctionPWidth( unsigned int node, int& number, const double* NewWidth = 0 ) const
{
// find all the pmos transistors with source or drain
// connected to node and return the sum of widths
unsigned int NT = GetNTran();
double W = 0.0;
number = 0;
if ( node == 0 )
return 0.0;
for ( unsigned int i = 0; i < NT; i++ )
{
if ( TranList[ i ].TrType() == PMOS )
if ( ( TranList[ i ].Source() == node ) ||
( TranList[ i ].Drain() == node ) )
{
if ( !NewWidth )
W += TranList[ i ].Width();
else
W += NewWidth[ TranPos( TranList[ i ].DevName() ) ];
number++;
}
}
return W;
}
///
double Circuit::GatePWidth( unsigned int node, int& number, const double* NewWidth = 0 ) const
{
// find all the pmos transistors with gate
// connected to node and return the sum of widths
unsigned int NT = GetNTran();
double W = 0.0;
number = 0;
if ( node == 0 )
return 0.0;
for ( unsigned int i = 0; i < NT; i++ )
{
if ( TranList[ i ].TrType() == PMOS )
if ( TranList[ i ].Gate() == node )
{
if ( !NewWidth )
W += TranList[ i ].Width();
else
W += NewWidth[ TranPos( TranList[ i ].DevName() ) ];
number++;
}
}
return W;
}
///
double Circuit::CapStaticGnd( unsigned int node, int& number ) const
{

// find all the fixed capacitances with a ground terminal
// and connected to node and return the sum of them
unsigned int NC = GetNCap();
double C = 0.0;
number = 0;
if ( node == 0 )
return 0.0;
for ( unsigned int i = 0; i < NC; i++ )
{
if ( ( CapList[ i ].node1 == 0 ) &&
( CapList[ i ].node2 == node ) )
{
C += CapList[ i ].val;
number++;
}
else if ( ( CapList[ i ].node2 == 0 ) &&
( CapList[ i ].node1 == node ) )
{
C += CapList[ i ].val;
number++;
}
}
return C;
}

///
double Circuit::CapStaticVdd( unsigned int node, int& number ) const
{
// find all the fixed capacitances with a vdd terminal
// and connected to node and return the sum of them
unsigned int NC = GetNCap();
double C = 0.0;
number = 0;
if ( node == 0 )
return 0.0;
for ( unsigned int i = 0; i < NC; i++ )
{
if ( ( CapList[ i ].node1 == ValNode ) &&
( CapList[ i ].node2 == node ) )
{
C += CapList[ i ].val;
number++;
}
else if ( ( CapList[ i ].node2 == ValNode ) &&
( CapList[ i ].node1 == node ) )
{
C += CapList[ i ].val;
number++;
}
}
return C;
}
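
The four width functions in this file differ only in the transistor type and in whether the gate or the source/drain terminals are compared with the node. A minimal sketch of that shared pattern follows. It uses a hypothetical `Tr` record and a hypothetical `SumWidths` helper rather than the dissertation's `TransistorList` class; node 0 is skipped, as in the listings.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum TrKind { NMOS_T, PMOS_T };    // hypothetical stand-ins for NMOS / PMOS

struct Tr                          // hypothetical transistor record
{
    TrKind kind;
    unsigned source, gate, drain;
    double width;
};

// Sum the widths of all transistors of the given kind attached to `node`.
// If `atGate` is true the gate terminal is matched, otherwise source or drain.
// `count` receives the number of matching devices.
double SumWidths(const std::vector<Tr>& trs, unsigned node,
                 TrKind kind, bool atGate, int& count)
{
    count = 0;
    double w = 0.0;
    if (node == 0)                 // node 0: nothing to sum
        return 0.0;
    for (std::size_t i = 0; i < trs.size(); ++i)
    {
        const Tr& t = trs[i];
        if (t.kind != kind)
            continue;
        const bool hit = atGate ? (t.gate == node)
                                : (t.source == node || t.drain == node);
        if (hit)
        {
            w += t.width;
            ++count;
        }
    }
    return w;
}
```

Folding the four variants into one routine is only a design option; the original keeps them separate, which makes each call site self-describing.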

Critic.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "main.h"

///
int Critic(const Circuit& circuit,
CritPathList& pathList,
const Options& options)
{
Node nodeInputList;
/// search primary input
print_log("Searching primary input...");
unsigned int Nt = circuit.GetNTran();
for (unsigned int i = 0; i < Nt; i++)
{
unsigned int gate = (circuit[i]).Gate();
unsigned int Pin = 1;
for (unsigned int j = 0; j < Nt; j++)
{
if (i != j)
{
for (unsigned int k = 0; k < nodeInputList.GetNumNode(); k++)
if ( (nodeInputList[k]).node == gate)
Pin = 0;
if ( ((circuit[j]).Drain() == gate) ||
((circuit[j]).Source() == gate))
Pin = 0;
}
}
if (Pin)
nodeInputList.Insert(gate);
}
#ifdef DEBUG
cerr << endl << " Primary input List: " << (nodeInputList[0]).node << " ";
#endif
char log[1024];
char log2[16];
sprintf(log, "Primary input list: %u", (nodeInputList[0]).node);
for (unsigned int i = 1; i < nodeInputList.GetNumNode(); i++)
{
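sprintf(log2, " -- %u", (nodeInputList[i]).node);
// append the entry to the log line, guarding the fixed-size buffer
if (strlen(log) + strlen(log2) < sizeof(log))
strcat(log, log2);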
#ifdef DEBUG
cerr << (nodeInputList[i]).node << " ";
#endif
}
#ifdef DEBUG
cerr << endl;
#endif
print_log(log);
NodeList nodeListGnd;
NodeList nodeListVdd;
int RetCode1 = nodeListGnd.Create();
int RetCode2 = nodeListVdd.Create();
if ( (RetCode1 != OK) || (RetCode2 != OK) )
return (RetCode1 != OK ? RetCode1 : RetCode2);
#ifdef DEBUG
cerr << endl << "Creating critical Path with gnd..." << endl;
#endif
RetCode1 = CriticRecurse(circuit, 0, nodeListGnd);
if ((RetCode1 != OK) && (RetCode1 != CONT))
return RetCode1;
#ifdef DEBUG
cerr << endl << "Creating critical Path with vdd..." << endl;
#endif
unsigned int val = circuit.ValimNode();
RetCode2 = CriticRecurse(circuit, val, nodeListVdd);
if ( (RetCode2 != OK) && (RetCode2 != CONT) )
return RetCode2;
Node gateInternalList;
for (unsigned int i = 0; i < nodeListGnd.GetNumList(); i++)
{
unsigned int Gin = 1;
unsigned int nn = (nodeListGnd[i]).GetNumNode();
unsigned int new_node = ((nodeListGnd[i])[nn - 1]).node;

for (unsigned int k = 0; k < gateInternalList.GetNumNode(); k++)
if ( (gateInternalList[k]).node == new_node)
Gin = 0;
if (Gin)
gateInternalList.Insert(new_node, -1);
}
for (unsigned int i = 0; i < nodeListVdd.GetNumList(); i++)
{
unsigned int Gin = 1;
unsigned int nn = (nodeListVdd[i]).GetNumNode();
unsigned int new_node = ((nodeListVdd[i])[nn - 1]).node;
for (unsigned int k = 0; k < gateInternalList.GetNumNode(); k++)
if ( (gateInternalList[k]).node == new_node)
Gin = 0;
if (Gin)
gateInternalList.Insert(new_node, -1);
}
#ifdef DEBUG
cerr << endl << " Gate Internal List: " << (gateInternalList[0]).node << " ";
#endif
sprintf(log, "Internal gate list: %u", (gateInternalList[0]).node);
for (unsigned int i = 1; i < gateInternalList.GetNumNode(); i++)
{
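sprintf(log2, " -- %u", (gateInternalList[i]).node);
// append the entry to the log line, guarding the fixed-size buffer
if (strlen(log) + strlen(log2) < sizeof(log))
strcat(log, log2);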
#ifdef DEBUG
cerr << (gateInternalList[i]).node << " ";
#endif
}
#ifdef DEBUG
cerr << endl;
#endif
print_log(log);
int RetCode = SearchCriticalPath(circuit, pathList, nodeListGnd, nodeListVdd, nodeInputList, gateInternalList, options);
if (RetCode != OK)
return RetCode;
#ifdef DEBUG
cerr << endl << pathList.GetNumPath() << " CRITICAL PATHS ";
#endif
for (unsigned int i = 0; i < pathList.GetNumPath(); i++)
{
sprintf(log, "#%u) Node_in: %u (%s=%g ps) Node_Out: %u (%s)", i,
(pathList[i]).GetNodeIn(), TransitionString[(pathList[i]).GetTransitionIn()],
(pathList[i]).GetInTime(), (pathList[i]).GetNodeOut(),
TransitionString[(pathList[i]).GetTransitionOut()]);
print_log(log);
#ifdef DEBUG
cerr << endl << log;
#endif
sprintf(log, "\t\t Tran_List ");
for (unsigned int j = 0; j < (pathList[i]).GetNumListTran(); j++)
{
const char* name;
while ((name = (pathList[i]).TraverseTransistorNameList(j)) != 0)
{
strcat(log, name);
strcat(log, " ");
}
strcat(log, " / ");
}
print_log(log);
#ifdef DEBUG
cerr << endl << log;
#endif
unsigned int node;
double val;
sprintf(log, "\t\t Active_Inputs: ");
char tmpstr[1024];
while ((pathList[i]).TraverseActiveInputs(node, val))
{

sprintf(tmpstr, " v(%u)= %g V -- ", node, val);
strcat(log, tmpstr);
}
print_log(log);
#ifdef DEBUG
cerr << endl << log;
#endif
sprintf(log, "\t\t No_Active_Inputs: ");
while ((pathList[i]).TraverseNoActiveInputs(node, val))
{
sprintf(tmpstr, " v(%u)= %g V -- ", node, val);
strcat(log, tmpstr);
}
print_log(log);
#ifdef DEBUG
cerr << endl << log << endl;
#endif
sprintf(log, "\t\t Initial Condition: ");
while ((pathList[i]).TraverseInitialConditions(node, val))
{
sprintf(tmpstr, " v(%u)= %g V -- ", node, val);
strcat(log, tmpstr);
}
print_log(log);
#ifdef DEBUG
cerr << endl << log << endl;
#endif
}
return OK;
}
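
The first loop of Critic treats a gate node as a primary input when no other transistor has its source or drain tied to that node, and records each such node once. A compact sketch of the same rule is given below, assuming a hypothetical `Mos` record and standard containers in place of the `Circuit` and `Node` classes.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

struct Mos { unsigned source, gate, drain; };   // hypothetical device record

// Return, in order of first appearance, every gate node that is not
// connected to the source or drain of any other transistor.
std::vector<unsigned> PrimaryInputs(const std::vector<Mos>& devs)
{
    std::vector<unsigned> inputs;
    std::set<unsigned> seen;                    // gates already reported
    for (std::size_t i = 0; i < devs.size(); ++i)
    {
        const unsigned gate = devs[i].gate;
        if (seen.count(gate))
            continue;
        bool driven = false;
        for (std::size_t j = 0; j < devs.size() && !driven; ++j)
            if (j != i && (devs[j].source == gate || devs[j].drain == gate))
                driven = true;
        if (!driven)
        {
            inputs.push_back(gate);
            seen.insert(gate);
        }
    }
    return inputs;
}
```

For two cascaded inverters, only the gate node of the first stage is reported, because the second stage's gate is the drain of the first.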

CriticRecurse.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "main.h"

///
int CriticRecurse(const Circuit& circuit,
unsigned int node,
NodeList& node_list)
{
TransistorList TList;
static int level = 0;
unsigned int n = 0;
unsigned int p = 0;
int RetCode;
if ((RetCode = circuit.TransistorListNode(node, TList, n , p)) != OK)
return RetCode;
unsigned int Nt = TList.GetNTran();
if ((RetCode = node_list.InsertNode(node)) != OK)
return RetCode;
if ( (n > 0) && (p > 0))
{
return OK;
}
level++;
#ifdef DEBUG
cerr << "node " << node << ": ";
for (unsigned int i = 0; i < Nt; i++)

cerr << " - " << TList[i].DevName();
for (unsigned int i = 0; i < (node_list[node_list.GetNumList() - 1]).GetNumNode(); i++)
cerr << " - " << ((node_list[node_list.GetNumList() - 1])[i]).node;
cerr << endl;
#endif
unsigned int RecurseYes;
RetCode = 0;
for (unsigned int i = 0; i < Nt; i++)
{
unsigned int new_node;
if ((TList[i]).Source() == node)
new_node = TList[i].Drain();
else if ((TList[i]).Drain() == node)
new_node = TList[i].Source();
RecurseYes = 1;
unsigned int last_node;
#ifdef DEBUG
for (int j = 0; j < level; j++)
cerr << "  ";
cerr << " LEVEL: " << level << " trying " << TList[i].DevName() << endl;
#endif
unsigned int nn = (node_list[node_list.GetNumList() - 1]).GetNumNode();
for (unsigned int j = 0; (j < nn) && RecurseYes; j++)
{
last_node = ((node_list[node_list.GetNumList() - 1])[j]).node;
if (new_node == last_node)
RecurseYes = 0;
}
if (RecurseYes)
{
if ((RetCode = node_list.InsertNode(TList[i].Gate())) != OK)
return RetCode;
int RecurseCode = CriticRecurse(circuit, new_node, node_list);
if (RecurseCode == OK)
{
// at least gnd (or vdd), one gate and one node
if ((RetCode = node_list.Create()) != OK)
return RetCode;
#ifdef DEBUG
cerr << endl << "NODE LIST : ";
#endif
nn = (node_list[node_list.GetNumList() - 2]).GetNumNode();
for (unsigned int j = 0; j < nn - 2; j++)
{
new_node = ((node_list[node_list.GetNumList() - 2])[j]).node;
#ifdef DEBUG
cerr << new_node << " ";
#endif
if ((RetCode = node_list.InsertNode(new_node)) != OK)
return RetCode;
}
#ifdef DEBUG
cerr << ((node_list[node_list.GetNumList() - 2])[nn - 2]).node << " "
<< ((node_list[node_list.GetNumList() - 2])[nn - 1]).node << endl;
#endif
}
else if (RecurseCode == CONT)
{
if ( (RetCode = (node_list[node_list.GetNumList() - 1]).DeleteLevelNode(2 * level - 2)) != OK )
return RetCode;
}
else
return RecurseCode;
}
}
level--;
if (level == 0)
{

unsigned int nl = node_list.GetNumList();
for (unsigned int i = 0; i < nl; i++)
{
unsigned int nn = (node_list[i]).GetNumNode();
if (nn == 1)
{
RetCode = node_list.DeleteList(i);
if (RetCode != OK)
return RetCode;
}
}
}
return CONT;
}

CriticalPath.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"

///
CritPathList::CritPathList() : NumPath( 0 ), head( 0 ), tail( 0 )
{
print_log( "Creating critical path list..." );
}

///
CritPathList::~CritPathList()
{
CPNode* tmp;
///
while ( head != 0 )
{
tmp = head->next;
delete head;
head = tmp;
}
}

///
const CPNode& CritPathList::operator[]( unsigned int index ) const
{
CPNode * tmp = head;
unsigned int i = index;
if ( index > NumPath )
error( NOT_FOUND, 0, "Index out of bound in [CritPathList]..." );
while ( i-- )
tmp = tmp->next;
return *tmp;
}

CriticalPathCreate.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"

///
int CritPathList::Create()
{
if ( !head )
{
head = new CPNode;
if ( !head )
return NO_MEM;
tail = head;
}
else
{
tail->next = new CPNode;
if ( !( tail->next ) )
return NO_MEM;
tail = tail->next;
}
tail->next = 0;
return OK;
}
///
int CritPathList::Stamp( unsigned int NumTranList )
{
tail->VALID = 1;
tail->NumTranList = NumTranList;
NumPath++;
return OK;
}

CriticalPathInsert.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"

///
int CritPathList::InsertNodeIn( unsigned int NIn, TransitionType Type, double Time )
{
if ( tail )
return tail->InsNodeIn( NIn, Type, Time );
else
return NOT_FOUND;
}
///
int CritPathList::InsertNodeOut( unsigned int NOut, TransitionType Type )
{
if ( tail )
return tail->InsNodeOut( NOut, Type );
else
return NOT_FOUND;
}
///
int CritPathList::InsertActiveInputs( unsigned int Node, double Val )


{
if ( tail )
return tail->InsActIn( Node, Val );
else
return NOT_FOUND;
}
///
int CritPathList::InsertNoActiveInputs( unsigned int Node, double Val )
{
if ( tail )
return tail->InsNoActIn( Node, Val );
else
return NOT_FOUND;
}
///
int CritPathList::InsertInitialCondition( unsigned int Node, double Val )
{
if ( tail )
return tail->InsIniCond( Node, Val );
else
return NOT_FOUND;
}
///
int CritPathList::InsertPathTransistor( const char* name, TransistorType TR, unsigned int index )
{
if ( tail )
return tail->InsTran( name, TR, index );
else
return NOT_FOUND;
}

CriticalPathParse.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"

///
int CritPathList::ParseLineCPNode( const char* str, CPCOMMANDOPT NodeType )
{
char tmpType[ 5 ];
unsigned int node;
double time;
sscanf( str, "%*s%u%s", &node, tmpType );
switch ( NodeType )
{
case NODEIN:
sscanf( str, "%*s%*u%*s%lg", &time );
for ( unsigned int i = 0; ( TransitionString[ i ] != 0 ); i++ )
{
if ( !strcasecmp( tmpType, TransitionString[ i ] ) )
return InsertNodeIn( node, ( TransitionType ) i, time );
}
return NOT_FOUND;
break;
case NODEOUT:
for ( unsigned int i = 0; ( TransitionString[ i ] != 0 ); i++ )
{
if ( !strcasecmp( tmpType, TransitionString[ i ] ) )

return InsertNodeOut( node, ( TransitionType ) i );
}
return NOT_FOUND;
break;
default:
return NOT_FOUND;
break;
}
}
///
int CritPathList::ParseLineCPInputs( const char* str, CPCOMMANDOPT InputType )
{
unsigned int node;
unsigned int NumRead;
double val;
char parsestr[ 1024 ];
char laststr[ 1024 ];
const char *tmpstr = " %*u %*lg";
strcpy( parsestr, "%*s" );
NumRead = sscanf( str, "%*s %u %lg", &node, &val );
while ( NumRead == 2 )
{
switch ( InputType )
{
case ACTIVEI:
if ( InsertActiveInputs( node, val ) )
return NOT_FOUND;
break;
case NOACTIVEI:
if ( InsertNoActiveInputs( node, val ) )
return NOT_FOUND;
break;
case IC:
if ( InsertInitialCondition( node, val ) )
return NOT_FOUND;
break;
case CPATH:
case NODEIN:
case NODEOUT:
case TRANLIST:
case ENDCPATH:
case NONECP:
default:
break;
}
strcat( parsestr, tmpstr );
strcpy( laststr, parsestr );
strcat( laststr, " %u %lg" );
NumRead = sscanf( str, laststr, &node, &val );
}
if ( NumRead == 1 )
return PARSE_ERROR;
return OK;
}
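
ParseLineCPInputs reads an arbitrary number of `node value` pairs by lengthening a scanf format with suppressed conversions on every pass, so the line is rescanned from its start each time. An alternative sketch with `std::istringstream`, which consumes the line once, is shown below. `ParsePairs` and its return convention are assumptions made for this illustration and are not part of the original sources; unlike the original, it also rejects trailing tokens that are not numbers.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Parse "<keyword> node value node value ..." into (node, value) pairs.
// Returns false when a node number is not followed by a value, or when
// a token cannot be read as a node number.
bool ParsePairs(const std::string& line,
                std::vector<std::pair<unsigned, double> >& out)
{
    std::istringstream in(line);
    std::string keyword;
    if (!(in >> keyword))          // empty line: no keyword at all
        return false;
    unsigned node;
    while (in >> node)
    {
        double val;
        if (!(in >> val))          // dangling node without a value
            return false;
        out.push_back(std::make_pair(node, val));
    }
    return in.eof();               // the loop must have stopped at end of input
}
```

The caller would then dispatch each pair to the insertion routine that matches the keyword, in the way the `switch` in ParseLineCPInputs does.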
///
int CritPathList::ParseLineCPTran( const char* str, const Circuit& circuit, unsigned int index )
{
char parsestr[ 1024 ];
char laststr[ 1024 ];
const char *tmpstr = " %*s";
char name[ 16 ];
strcpy( parsestr, "%*s" );
unsigned int NumRead = sscanf( str, "%*s %s", name );
if ( NumRead != 1 )
return PARSE_ERROR;

while ( NumRead == 1 )
{
if (circuit.TranPos(name) == -1)
return NOT_FOUND;
if ( InsertPathTransistor( name, circuit[ name ].TrType(), index ) )
return NOT_FOUND;
strcat( parsestr, tmpstr );
strcpy( laststr, parsestr );
strcat( laststr, " %s" );
NumRead = sscanf( str, laststr, name );
}
return OK;
}

CriticalPathRead.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "global.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"

///
int CritPathList::Read( const char* FileOptions, const Circuit& circuit )
{
ifstream i_file( FileOptions );
char line[ 1024 ];
char command[ 256 ];
if ( !i_file )
return NOT_FOUND;
unsigned int LineNum = 0;
unsigned int NumTranList = 0;
while ( i_file.getline( line, 1023 ) )
{
LineNum++;
if ( sscanf( line, "%s ", command ) == 1 )
if ( command[ 0 ] != '#' )
{
int RetCode;
switch ( CPCOMMANDOPT Cc = WhichCommand( command ) )
{
case CPATH:
RetCode = Create();
NumTranList = 0;
break;
case NODEIN:
case NODEOUT:
RetCode = ParseLineCPNode( line, Cc );
break;
case ACTIVEI:
case NOACTIVEI:
case IC:
RetCode = ParseLineCPInputs( line, Cc );
break;
case TRANLIST:
RetCode = ParseLineCPTran( line, circuit, NumTranList );
NumTranList++;
if ( NumTranList >= MAXCHAIN )
RetCode = NO_MEM;
break;
case ENDCPATH:
RetCode = Stamp( NumTranList );
break;

case NONECP:
RetCode = OK;
default:
break;
}
if ( RetCode != OK )
{
sprintf( line, "ERROR reading file %s line %u ", FileOptions, LineNum );
print_log( line );
i_file.close();
return RetCode;
}
}
}
i_file.close();
return OK;
}

EvaluationAlgorithm.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"

///
EvaluationAlgorithm::EvaluationAlgorithm( const CritPathList& pathlist, const Options& options )
:
pathlist( pathlist ), options( options ),
NumPath( 0 ), Calls( 0 ),
CPDelay( 0 ), CPPower( 0 ),
CPNoise( 0 ), Area( 0.0 )
{
print_log( "Creating simulation algorithm..." );
NumPath = pathlist.GetNumPath();
CPDelay = new double[ NumPath ];
CPPower = new double[ NumPath ];
CPNoise = new double[ NumPath ];
if ( !CPDelay || !CPPower || !CPNoise )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );
error( NO_MEM, errno, "HEY! " );
}
}
///
EvaluationAlgorithm::~EvaluationAlgorithm()
{
delete[] CPDelay;
delete[] CPPower;
delete[] CPNoise;
}

Global.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "global.h"

///
GLOBCOMMANDOPT WhichGBOption( const char* option )
{
for ( unsigned int i = 0; (GlobCommandOptions[ i ] != 0 ); i++ )
{
if ( !strcasecmp( option, GlobCommandOptions[ i ] ) )
return ( ( GLOBCOMMANDOPT ) i );
}
return NONEGLOB;
}
///
SIMCOMMANDOPT WhichSimOption( const char* option )
{
for ( unsigned int i = 0; (SimCommandOptions[ i ] != 0 ); i++ )
{
if ( !strcasecmp( option, SimCommandOptions[ i ] ) )
return ( ( SIMCOMMANDOPT ) i );
}
return NONESIM;
}

///
OPTCOMMANDOPT WhichOptOption( const char* option )
{
for ( unsigned int i = 0; (OptCommandOptions[ i ] != 0 ); i++ )
{
if ( !strcasecmp( option, OptCommandOptions[ i ] ) )
return ( ( OPTCOMMANDOPT ) i );
}
return NONEOPT;
}
///
CPCOMMANDOPT WhichCommand( const char* option )
{
for ( unsigned int i = 0; (CPCommandOptions[ i ] != 0 ); i++ )
{
if ( !strcasecmp( option, CPCommandOptions[ i ] ) )
return ( ( CPCOMMANDOPT ) i );
}
return NONECP;
}
#ifndef LINUX
///
void error( int exitCode, int ErrorType, const char* message )
{
cerr << message << "Error " << ErrorType << endl;
if ( exitCode != 0 )
exit( exitCode );
}
#endif
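
The four lookup functions in this file share one idea: scan a null-terminated table of keyword strings, compare case-insensitively, and cast the matching index to the enum, falling back to a sentinel value when nothing matches. A generic sketch of that pattern follows. The table `kAnalyses` and the enum `Analysis` are hypothetical and serve only the demonstration; the comparison is written out by hand so that the block does not depend on the POSIX `strcasecmp`.

```cpp
#include <cassert>
#include <cctype>

enum Analysis { TRAN_A, DC_A, AC_A, NONE_A };            // hypothetical enum
static const char* const kAnalyses[] = { "tran", "dc", "ac", 0 };

// Case-insensitive equality of two C strings.
static bool EqualNoCase(const char* a, const char* b)
{
    while (*a && *b)
    {
        if (std::tolower(static_cast<unsigned char>(*a)) !=
            std::tolower(static_cast<unsigned char>(*b)))
            return false;
        ++a;
        ++b;
    }
    return *a == *b;               // both strings must end together
}

// Index of `word` in a null-terminated table, or -1 when absent.
int LookupKeyword(const char* const* table, const char* word)
{
    for (int i = 0; table[i] != 0; ++i)
        if (EqualNoCase(word, table[i]))
            return i;
    return -1;
}

// The enum order must match the table order for the cast to be valid.
Analysis WhichAnalysis(const char* word)
{
    const int i = LookupKeyword(kAnalyses, word);
    return i < 0 ? NONE_A : static_cast<Analysis>(i);
}
```

A single helper of this kind could serve all four tables, at the cost of one cast per caller.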

IsIn.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "main.h"

///
int IsIn(unsigned int node, Node& NList, unsigned int& pos)
{
pos = 0;
for (unsigned int i = 0; i < NList.GetNumNode(); i++)
if ((NList[i]).node == node)
{
pos = i;
return OK;
}
return NOT_FOUND;
}

Node.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"

///
Node::Node() : NumNode(0), next( 0 ), Head(0), Tail(0)
{}
///
Node::~Node()
{
_NodeList* tmp;
while ( Head )
{
tmp = Head->next;
delete Head;
Head = tmp;
}
}
///
int Node::Insert( unsigned int node, int flag = -1)
{
_NodeList * tmp;
if ( !Head )
{
Head = new _NodeList;
if ( !Head )
return NO_MEM;
Head->next = 0;
Tail = Head;
}
else
{
tmp = new _NodeList;
if ( !tmp )
return NO_MEM;
tmp->next = 0;
Tail->next = tmp;
Tail = tmp;
}
Tail->node = node;
Tail->flag = flag;


NumNode++;
return OK;
}
///
int Node::DeleteLevelNode(unsigned int level)
{
// NodeList* tmp = Head;
//unsigned int i = level;
if (level >= NumNode)
return NOT_FOUND;
//while(i)
// tmp = tmp->next;
//tmp->next = 0;
//Tail = tmp;
Tail = &(operator[](level));
(operator[](level)).next = 0;
NumNode = level + 1;
return OK;
}
///
const _NodeList& Node::operator[]( unsigned int index ) const
{
_NodeList* tmp = Head;
unsigned int i = index;
if ( index >= NumNode )
error( NOT_FOUND, 0, "Index out of bound in [_NodeList]..." );
while ( i-- )
tmp = tmp->next;
return *tmp;
}
///
_NodeList& Node::operator[]( unsigned int index )
{
_NodeList* tmp = Head;
unsigned int i = index;
if ( index >= NumNode )
error( NOT_FOUND, 0, "Index out of bound in [_NodeList]..." );
while ( i-- )
tmp = tmp->next;
return *tmp;
}

NodeCreate.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_nodes.h"

///
int NodeList::Create()
{
if ( !head )
{
head = new Node;
if ( !head )
return NO_MEM;
tail = head;
}
else
{
tail->next = new Node;
if ( !( tail->next ) )
return NO_MEM;
tail = tail->next;

}
tail->next = 0;
NumList++;
return OK;
}

NodeList.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_nodes.h"

///
NodeList::NodeList() : NumList( 0 ), head( 0 ), tail( 0 )
{
print_log( "Creating node list..." );
}

///
NodeList::~NodeList()
{
Node* tmp;
while ( head != 0 )
{
tmp = head->next;
delete head;
head = tmp;
}
}

///
const Node& NodeList::operator[]( unsigned int index ) const
{
Node *tmp = head;
unsigned int i = index;
if ( index >= NumList )
error( NOT_FOUND, 0, "Index out of bound in [NodeList]..." );
while ( i-- )
tmp = tmp->next;
return *tmp;
}
///
Node& NodeList::operator[]( unsigned int index )
{
Node *tmp = head;
unsigned int i = index;
if ( index >= NumList )
error( NOT_FOUND, 0, "Index out of bound in [NodeList]..." );
while ( i-- )
tmp = tmp->next;
return *tmp;
}

NodeListDelete.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_nodes.h"

///
int NodeList::DeleteList( unsigned int list)

{
if (list >= NumList)
return NOT_FOUND;
//unsigned int i = list - 1;
//Node* tmp = head;
//while(i)
// tmp = tmp->next;
//tmp->next = tmp->next->next;
// note: list == 0 is not handled here (list - 1 wraps around),
// and the unlinked node is not freed
if ((operator[](list)).next)
{
(operator[](list - 1)).next = (operator[](list)).next;
Node* tmp = &(operator[](list - 1));
while (tmp->next)
tmp = tmp->next;
tail = tmp;
}
else
{
(operator[](list - 1)).next = 0;
tail = &(operator[](list - 1));
}
NumList--;
return OK;
}

NodeListInsert.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_nodes.h"

///
int NodeList::InsertNode( unsigned int node)
{
if ( tail )
return tail->Insert( node );
else
return NOT_FOUND;
}
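
NodeList::DeleteList, listed earlier, unlinks an entry by looking up its predecessor with `operator[]( list - 1 )`, which leaves the first entry (index 0) unhandled and does not free the unlinked node. A minimal sketch of position-based removal that also covers the head, keeps the tail pointer valid, and releases the node is given below. `Item`, `List` and `RemoveAt` are hypothetical names, and the original classes may have reasons for the behaviour shown in the listing.

```cpp
#include <cassert>

struct Item { int value; Item* next; };   // hypothetical list node

struct List                               // hypothetical singly linked list
{
    Item* head;
    Item* tail;
    unsigned count;

    List() : head(0), tail(0), count(0) {}
    ~List() { while (head) { Item* n = head->next; delete head; head = n; } }

    // Append a value at the tail.
    void Append(int v)
    {
        Item* it = new Item;
        it->value = v;
        it->next = 0;
        if (tail) tail->next = it; else head = it;
        tail = it;
        ++count;
    }

    // Remove the entry at `index`; returns false when out of range.
    bool RemoveAt(unsigned index)
    {
        if (index >= count)
            return false;
        Item* prev = 0;
        Item* cur = head;
        for (unsigned i = 0; i < index; ++i)
        {
            prev = cur;
            cur = cur->next;
        }
        if (prev) prev->next = cur->next; // unlink an inner or last entry
        else      head = cur->next;       // unlink the first entry
        if (cur == tail)
            tail = prev;                  // becomes null when the list empties
        delete cur;
        --count;
        return true;
    }
};
```

The sketch walks the list once per removal, whereas repeated `operator[]` calls walk it several times; copying a `List` is deliberately left out to keep the example short.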

OptSimulate.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"
#include "class_optimizator.h"

///
int OptimizationAlgorithm::SimulateCircuit( const double *NewWidth )
{
int RetCode = Simulation.Run( circuit, NewWidth, ValidPath );
if ( RetCode != OK )
return RetCode;
for ( unsigned int i = 0; i < NumPath; i++ )
{
CPDelay[ i ] = Simulation.GetDelay( i );
CPPower[ i ] = Simulation.GetPower( i );
CPNoise[ i ] = Simulation.GetNoise( i );
}
Area = Simulation.GetArea();

179

Appendix B. Source code

180

27
28 }

return OK;

OptimizationAlFirstSim.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"
#include "class_optimizator.h"

///
int OptimizationAlgorithm::SimulateFirstCircuit()
{
  double* MinimumWidth = new double[ NumTran ];
  double* MaximumWidth = new double[ NumTran ];
  if (!MinimumWidth || !MaximumWidth)
    return NO_MEM;
  for ( unsigned int i = 0; i < NumTran; i++ )
  {
    MinimumWidth[ i ] = options.GetOptOption( WMIN );
    if (options.GetOptOption( WMAX ) <= 0)
      MaximumWidth[ i ] = options.GetOptOption( WMIN ) * 100;
    else
      MaximumWidth[ i ] = options.GetOptOption( WMAX );
  }
  int RetCode = Simulation.Run( circuit, MaximumWidth, ValidPath );
  if ( RetCode != OK )
    return RetCode;
  for ( unsigned int i = 0; i < NumPath; i++ )
  {
    CPDelay[ i ] = Simulation.GetDelay( i );
    CPPower[ i ] = Simulation.GetPower( i );
    CPNoise[ i ] = Simulation.GetNoise( i );
  }
  Area = Simulation.GetArea();
  MaxDelayInitMax = 0.0;
  for ( unsigned int i = 0; i < NumPath; i++ )
  {
    if ( Simulation.GetDelay( i ) > 0.0 )
    {
      if ( Simulation.GetDelay( i ) > MaxDelayInitMax )
        MaxDelayInitMax = CPDelay[i];
      if ( Simulation.GetPower( i ) > MaxPowerInitMax )
        MaxPowerInitMax = CPPower[i];
      if ( Simulation.GetNoise( i ) > MaxNoiseInitMax )
        MaxNoiseInitMax = CPNoise[i];
    }
  }
  AreaInitMax = Area;
  /// FIX ME !!!!!!!!!
  MaxNoiseInitMax = 1.0;
  RetCode = Simulation.Run( circuit, MinimumWidth, ValidPath );
  if ( RetCode != OK )
    return RetCode;
  unsigned int tmpP = 0;
  for ( unsigned int i = 0; i < NumPath; i++ )
  {
    CPDelay[ i ] = Simulation.GetDelay( i );
    if (CPDelay[ i ] > 0.0)
    {
      ValidPath[i] = 1;
      tmpP++;
    }

    else
      ValidPath[i] = 0;
    CPPower[ i ] = Simulation.GetPower( i );
    CPNoise[ i ] = Simulation.GetNoise( i );
  }
  Area = Simulation.GetArea();
  MaxDelayInitMin = 0.0;
  for ( unsigned int i = 0; i < NumPath; i++ )
  {
    if (ValidPath[i])
    {
      if ( Simulation.GetDelay( i ) > MaxDelayInitMin )
        MaxDelayInitMin = CPDelay[i];
      if ( Simulation.GetPower( i ) > MaxPowerInitMin )
        MaxPowerInitMin = CPPower[i];
      if ( Simulation.GetNoise( i ) > MaxNoiseInitMin )
        MaxNoiseInitMin = CPNoise[i];
    }
  }
  AreaInitMin = Area;
  MaxNoiseInitMin = 1.0;
  /// FIX ME !!!!!!!!!
  if (MaxDelayInitMin < MaxDelayInitMax)
    MaxDelayInitMax = MaxDelayInitMin;
  if (MaxPowerInitMin > MaxPowerInitMax)
    MaxPowerInitMax = MaxPowerInitMin;
  if (MaxNoiseInitMin > MaxNoiseInitMax)
    MaxNoiseInitMax = MaxNoiseInitMin;
  if (AreaInitMin > AreaInitMax)
    AreaInitMax = AreaInitMin;
  char log[512];
  sprintf( log, "Found %u valid critical paths of %u", tmpP, NumPath );
  print_log( log );
  return OK;
}

OptimizationAlNormSim.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_optimizator.h"

///
double OptimizationAlgorithm::NormSim( const double* x, int& RetCode)
{
  double f;
  double* X = new double[ NumTran ];
  double DelW = options.GetOptOption( DELTA );
  unsigned int count_min = 0;
  unsigned int count_max = 0;
  static unsigned int elapsed = 0;
  static double fLast;
  static unsigned int count_conv = 0;
  for ( unsigned int i = 0; i < NumTran; i++ )
  {
    if ( x[ i ] <= options.GetOptOption( WMIN ) )
    {
      X[ i ] = options.GetOptOption( WMIN );
      count_min++;
    }
    else if ( (x[ i ] > options.GetOptOption( WMAX )) &&
              (options.GetOptOption( WMAX ) > 0) )
    {

      X[ i ] = options.GetOptOption( WMAX );
      count_max++;
    }
    else
      X[ i ] = double( rint( x[ i ] / DelW ) * DelW );
  }
  if ((count_min != NumTran) && (count_max != NumTran))
    RetCode = SimulateCircuit( X );
  else
  {
    RetCode = OK;
  }
  if ( RetCode != OK )
  {
    delete[] X;
    return 0.0;
  }
  double maxT = 0.0;
  double maxP = 0.0;
  double maxN = 0.0;
  double RatioT = 1.0;
  // MaxDelayInit / MaxDelayInit
  //double RatioP = MaxDelayInitMin / MaxPowerInitMax;
  double RatioP = 1.0;
  //double RatioN = MaxDelayInitMin / MaxNoiseInitMax;
  // FIX ME !!!!!!!!!!!
  double RatioN = 0.0;
  //double RatioA = MaxDelayInitMin / AreaInitMax;
  double RatioA = 1.0;
  f = 0.0;
  double fMin = COST_FACTOR;
  if ( options.GetOptOption( WEIGHTS ) )
  {
    RatioT *= options.GetOptOption(WDELAY);
    RatioP *= options.GetOptOption(WPOWER);
    RatioN *= options.GetOptOption(WNOISE);
    RatioA *= options.GetOptOption(WAREA);
  }
  double fMin_norm;
  double fMax;
  unsigned int Constraints = 0;
  double MAXDelay = options.GetOptOption( MAXDELAY );
  double MAXPower = options.GetOptOption( MAXPOWER );
  double MAXNoise = options.GetOptOption( MAXNOISE );
  double MAXArea = options.GetOptOption( MAXAREA );
  if ( (MAXDelay > 0) || (MAXPower > 0) || (MAXNoise > 0) || (MAXArea > 0))
    Constraints = 1;
  fMin_norm = ( RatioT * MaxDelayInitMin / MaxDelayInitMin +
                RatioP * MaxPowerInitMin / MaxPowerInitMax +
                RatioN * MaxNoiseInitMin / MaxNoiseInitMax +
                RatioA * AreaInitMin / AreaInitMax);
  fMax = (RatioT * MaxDelayInitMax / MaxDelayInitMin +
          RatioP * MaxPowerInitMax / MaxPowerInitMax +
          RatioN * MaxNoiseInitMax / MaxNoiseInitMax +
          RatioA * AreaInitMax / AreaInitMax) *
         COST_FACTOR / fMin_norm;
  if (elapsed == 0)
    fLast = fMin;
  if ((count_min != NumTran) && (count_max != NumTran))
  {
    for ( unsigned int i = 0; i < NumPath; i++ )
    {
      if ( CPDelay[ i ] > maxT )
        maxT = CPDelay[ i ];
      if (CPDelay[ i ] > 0)
      {
        if ( CPPower[ i ] > maxP )
          maxP = CPPower[ i ];
        if ( CPNoise[ i ] > maxN )
          maxN = CPNoise[ i ];
      }

    }
    f = (RatioT * maxT / MaxDelayInitMin +
         RatioP * maxP / MaxPowerInitMax +
         RatioN * maxN / MaxNoiseInitMax +
         RatioA * Area / AreaInitMax) *
        COST_FACTOR / fMin_norm;
    if ( Constraints )
    {
      if (MAXDelay > 0)
      {
        if (maxT > MAXDelay)
        {
          f += (maxT - MAXDelay ) / MaxDelayInitMin *
               (maxT - MAXDelay ) / MaxDelayInitMin *
               COST_FACTOR / fMin_norm;
          RetCode = CONT;
        }
        else
          RetCode = END_ACC;
      }
      if (MAXPower > 0)
      {
        if (maxP > MAXPower)
        {
          f += (maxP - MAXPower ) / MaxPowerInitMax *
               (maxP - MAXPower ) / MaxPowerInitMax *
               COST_FACTOR / fMin_norm;
          RetCode = CONT;
        }
        else
          RetCode = END_ACC;
      }
      if (MAXNoise > 0)
      {
        if (maxN > MAXNoise)
        {
          f += (maxN - MAXNoise ) / MaxNoiseInitMax *
               (maxN - MAXNoise ) / MaxNoiseInitMax *
               COST_FACTOR / fMin_norm;
          RetCode = CONT;
        }
        else
          RetCode = END_ACC;
      }
      if (MAXArea > 0)
      {
        if (Area > MAXArea)
        {
          f += (Area - MAXArea ) / AreaInitMax *
               (Area - MAXArea ) / AreaInitMax *
               COST_FACTOR / fMin_norm;
          RetCode = CONT;
        }
        else
          RetCode = END_ACC;
      }
    }
  }
  else if (count_min == NumTran)
  {
    f = fMin;
    maxT = MaxDelayInitMin;
    maxP = MaxPowerInitMin;
    maxN = MaxNoiseInitMin;
  }
  else if (count_max == NumTran)
  {
    f = fMax * COST_FACTOR / fMin_norm;
    maxT = MaxDelayInitMax;

    maxP = MaxPowerInitMax;
    maxN = MaxNoiseInitMax;
  }
  InternalSteps++;
  if ( InternalSteps >= options.GetOptOption( MAXSTEPS ) )
    RetCode = MAX_STEPS;
  char log[ 1024 ];
  if ((options.Verbose()) || (InternalSteps == 1))
  {
    circuit.PrintResult( InternalSteps, NumTran, NumPath, X, CPDelay, CPPower, CPNoise, Area, maxT, maxP, maxN, f, fLast);
    sprintf( log, "...step: %d, objective: %g", InternalSteps, f );
    print_log( log );
  }
  else if ((f < fLast) || ((InternalSteps % 100) == 0))
  {
    circuit.PrintResult( InternalSteps, NumTran, NumPath, X, CPDelay, CPPower, CPNoise, Area, maxT, maxP, maxN, f, fLast);
    sprintf( log, "...step: %d, objective: %g", InternalSteps, f );
    print_log( log );
    if ( ((fLast - f) / fLast) < options.GetOptOption( ACC ) && (RetCode != CONT))
    {
      count_conv++;
      if (count_conv >= 2)
      {
        RetCode = END_ACC;
      }
    }
    fLast = f;
    elapsed++;
  }
  //else
  // count_conv = 0;
  delete[] X;
  return f;
}
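
The objective computed by NormSim combines three ingredients: widths snapped to a grid of step DELTA, a weighted sum of metrics each divided by a reference value, and a quadratic penalty for every positive limit that is exceeded. The sketch below isolates these three pieces in simplified form. The Metrics struct and the function names weighted_cost, penalty and snap_to_grid are hypothetical helpers introduced only for illustration, and the common scaling by COST_FACTOR / fMin_norm used in the listing is deliberately left out.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for one set of worst-case values (or weights).
struct Metrics { double delay, power, noise, area; };

// Weighted sum of the metrics, each divided by its reference value.
double weighted_cost(const Metrics& m, const Metrics& ref, const Metrics& w)
{
  return w.delay * m.delay / ref.delay + w.power * m.power / ref.power
       + w.noise * m.noise / ref.noise + w.area * m.area / ref.area;
}

// Quadratic penalty, active only when a positive limit is exceeded.
double penalty(double value, double limit, double ref)
{
  if (limit <= 0.0 || value <= limit)
    return 0.0;
  const double excess = (value - limit) / ref;
  return excess * excess;
}

// Snap a width to the nearest multiple of delta, as done with rint() above.
double snap_to_grid(double width, double delta)
{
  assert(delta > 0.0);
  return std::rint(width / delta) * delta;
}
```

A limit of zero or less switches the corresponding penalty off, which mirrors the `MAXDelay > 0` style guards in the listing.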

OptimizationAlgorithm.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"
#include "class_optimizator.h"

///
OptimizationAlgorithm::OptimizationAlgorithm( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation )
  :
  InternalSteps( 0 ), circuit( circuit ), options( options ),
  Steps( 0 ), NumTran( 0 ), NumPath( 0 ),
  Width( 0 ), CPDelay( 0 ), CPPower( 0 ),
  CPNoise( 0 ), Area( 0.0 ), ValidPath(0),
  MaxDelayInitMin( 0.0 ), MaxPowerInitMin( 0.0 ),
  MaxNoiseInitMin( 0.0 ), AreaInitMin( 0.0 ),
  MaxDelayInitMax( 0.0 ), MaxPowerInitMax( 0.0 ),
  MaxNoiseInitMax( 0.0 ), AreaInitMax( 0.0 ),
  Simulation( simulation )
{
  // default initialization
  print_log( "Creating optimization algorithm..." );
  NumTran = circuit.GetNTran();
  Width = new double[ NumTran ];
  NumPath = simulation.GetNPath();
  CPDelay = new double[ NumPath ];

  CPPower = new double[ NumPath ];
  CPNoise = new double[ NumPath ];
  ValidPath = new unsigned[NumPath];
  if ( !Width || !CPDelay || !CPPower || !CPNoise )
  {
    print_log( "FATAL ERROR:" );
    print_log( ReturnMessage[ NO_MEM ] );
    error( NO_MEM, errno, "HEY! " );
  }
  for ( unsigned int i = 0; i < NumTran; i++ )
    Width[ i ] = circuit[ i ].Width();
  for ( unsigned int i = 0; i < NumPath; i++ )
    ValidPath[i] = 1;
}

///
OptimizationAlgorithm::~OptimizationAlgorithm()
{
  delete[] Width;
  delete[] CPDelay;
  delete[] CPPower;
  delete[] CPNoise;
}

Optimize.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_optimizator.h"
#include "class_simulator.h"
#include "slop.h"
#include "slop2.h"
#include "powell.h"
#include "anneal.h"
#include "test.h"

///
int Optimize( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation, double* LastWidth )
{
  struct timeb start_t, stop_t;
  ftime( &start_t );
  char log[ 1024 ];
  // start from a null pointer so that the check after the switch is meaningful
  OptimizationAlgorithm* Opt = 0;
  switch ( options.WhichOptAlgorithm() )
  {
    case SLOP:
      Opt = new Slop( circuit, options, simulation );
      break;
    case SLOP2:
      Opt = new Slop2( circuit, options, simulation );
      break;
    case POWELL:
      Opt = new Powell( circuit, options, simulation );
      break;
    case ANNEAL:
      Opt = new Anneal( circuit, options, simulation );
      break;
    case TESTEVAL:
      Opt = new TestEval( circuit, options, simulation );
      break;
    default:
      break;
  }

  unsigned int n = circuit.GetNTran();
  unsigned int np = simulation.GetNPath();
  if ( !Opt )
    return NO_MEM;
  int RetCode;
  if ( ( RetCode = Opt->SimulateFirstCircuit() ) != OK )
    return RetCode;
  print_log( "Initial critical paths: " );
  for ( unsigned int i = 0; i < np; i++ )
  {
    sprintf( log, "%u) Delay=%g ps, Energy=%g pJ, Noise=%g", i,
             simulation.GetDelay( i ),
             simulation.GetPower( i ),
             simulation.GetNoise( i ) );
    print_log( log );
  }
  sprintf( log, "Area=%g ", simulation.GetArea() );
  print_log( "" );
  print_log( "Starting optimization process..." );
  RetCode = Opt->Run();
  if ( ( RetCode != OK ) && ( RetCode != MAX_STEPS ) && ( RetCode != END_ACC ) && (RetCode != CONT))
    return RetCode;
  ftime( &stop_t );
  if ( RetCode == MAX_STEPS )
  {
    print_log( "...WARNING: exceeded max steps..." );
  }
  if (( RetCode == END_ACC ) || ( RetCode == OK) )
  {
    print_log( "...Solution found. That's all folk." );
  }
  for ( unsigned int i = 0; i < n; i++ )
  {
    int pos = circuit.TranPos( circuit[ i ].DevName() );
    LastWidth[ i ] = Opt->OptWidth( pos );
  }
  RetCode = Opt->SimulateCircuit(LastWidth);
  if ( RetCode != OK )
    return RetCode;
  long sec = stop_t.time - start_t.time;
  short msec = abs( start_t.millitm - stop_t.millitm );
  sprintf( log, "End Optimization: %ld steps, %ld function evaluations ", Opt->GetSteps(), simulation.GetCalls() );
  print_log( log );
  sprintf( log, "                : total time: %ld.%d secs ", sec, msec );
  print_log( log );
  delete Opt;
  return OK;
}

Options.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"

///
Options::Options() :
  SimOptions( 0 ), OptOptions( 0 ),
  SimulationChosed( HSPICE ), OptimizationChosed( SLOP ),
  verbose( 0 ), manual(0), NameMosN( 0 ), NameMosP( 0 ), WorkPath( 0 )
{
  print_log( "Parsing Options..." );
}

///
Options::~Options()
{
  delete[] SimOptions;
  delete[] OptOptions;
}

OptionsRead.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "global.h"

///
int Options::Read( const char* FileOptionsName )
{
  char line[ 1024 ];
  char opt[ 256 ];
  char log[1024];
  int RetCode = OK;
  ifstream i_file( FileOptionsName );
  if ( !i_file )
    return NOT_FOUND;
  unsigned int NumSimOptions = 0;
  unsigned int NumOptOptions = 0;
  while ( OptCommandOptions[ NumOptOptions++ ] );
  while ( SimCommandOptions[ NumSimOptions++ ] );
  OptOptions = new double[ NumOptOptions ];
  SimOptions = new double[ NumSimOptions ];
  for ( unsigned int i = 0; i < NumOptOptions; i++ )
    OptOptions[ i ] = 0;
  for ( unsigned int i = 0; i < NumSimOptions; i++ )
    SimOptions[ i ] = 0;
  unsigned int Line = 0;
  while ( i_file.getline( line, 1023 ) )
  {
    Line++;
    if ( sscanf( line, "%s ", opt ) == 1 )
      if ( opt[ 0 ] != '#' )
      {
        GLOBCOMMANDOPT GlobalOption = WhichGBOption( opt );
        SIMCOMMANDOPT SimOption = WhichSimOption( opt );
        OPTCOMMANDOPT OptOption = WhichOptOption( opt );
        switch ( GlobalOption )
        {
          case VERBOSE:
            verbose = 1;
            print_log("Well, let's go verbose...");
            break;
          case MANUAL:
            manual = 1;
            print_log("So you think you're better than me,");
            print_log("in calculating critical paths?...");
            break;
          case SIMALG:
            RetCode = NOT_FOUND;
            sscanf( line, "%*s %s", opt );
            for ( unsigned int S = 0 ; ( SimAlgorithms[ S ] != 0 ); S++ )
              if ( !strcasecmp( opt, SimAlgorithms[ S ] ) )
              {
                SimulationChosed = ( SimMethod ) S;
                RetCode = OK;
                sprintf(log, "Simulator......%s", SimAlgorithms[ S ]);
                print_log(log);
              }
            break;
          case OPTALG:
            RetCode = NOT_FOUND;

            sscanf( line, "%*s %s", opt );
            for ( unsigned int O = 0; ( OptAlgorithms[ O ] != 0 ); O++ )
              if ( !strcasecmp( opt, OptAlgorithms[ O ] ) )
              {
                OptimizationChosed = ( OptMethod ) O;
                RetCode = OK;
                sprintf(log, "Optimizer......%s", OptAlgorithms[ O ]);
                print_log(log);
              }
            break;
          case NAMEMOSN:
            sscanf( line, "%*s %s", opt );
            NameMosN = new char[ strlen( opt ) + 1 ];
            if ( !NameMosN )
              RetCode = NO_MEM;
            else
              strcpy( NameMosN, opt );
            break;
          case NAMEMOSP:
            sscanf( line, "%*s %s", opt );
            NameMosP = new char[ strlen( opt ) + 1 ];
            if ( !NameMosP )
              RetCode = NO_MEM;
            else
              strcpy( NameMosP, opt );
            break;
          case WORKPATH:
            sscanf( line, "%*s %s", opt );
            WorkPath = new char[ strlen( opt ) + 1 ];
            if ( !WorkPath )
              RetCode = NO_MEM;
            else
              strcpy( WorkPath, opt );
            break;
          case NONEGLOB:
          default:
            break;
        }
        if ( SimOption != NONESIM )
        {
          sscanf( line, "%*s %lg", &SimOptions[ SimOption ] );
        }
        switch ( OptOption )
        {
          case CONSTRAINS:
            if ( ( OptOptions[ CONSTRAINS ] == 1.0 ) || ( OptOptions[ ENDCONSTRAINS ] == 1 ) )
              RetCode = NOT_FOUND;
            else
            {
              OptOptions[ CONSTRAINS ] = 1.0;
              print_log("Hey, you mean some constraints...");
            }
            break;
          case ENDCONSTRAINS:
            if ( OptOptions[ CONSTRAINS ] == 0 )
              RetCode = NOT_FOUND;
            else
              OptOptions[ ENDCONSTRAINS ] = 1.0;
            break;
          case WEIGHTS:
            if ( ( OptOptions[ WEIGHTS ] == 1.0 ) || ( OptOptions[ ENDWEIGHTS ] == 1 ) )
              RetCode = NOT_FOUND;
            else
            {
              OptOptions[ WEIGHTS ] = 1.0;
              print_log("Hey, you mean some weights...");
            }
            break;

          case ENDWEIGHTS:
            if ( OptOptions[ WEIGHTS ] == 0 )
              RetCode = NOT_FOUND;
            else
              OptOptions[ ENDWEIGHTS ] = 1.0;
            break;
          case WDELAY:
          case WPOWER:
          case WAREA:
          case WNOISE:
          case MAXSTEPS:
          case ACC:
          case WMAX:
          case WMIN:
          case DELTA:
          case RISETIME:
          case FALLTIME:
          case MAXAREA:
          case MAXDELAY:
          case MAXPOWER:
          case MAXNOISE:
            sscanf( line, "%*s %lg", &OptOptions[ OptOption ] );
            sprintf(log, "...%s=%g", OptCommandOptions[OptOption], OptOptions[ OptOption ]);
            print_log(log);
            break;
          case NONEOPT:
          default:
            break;
        }
      }
    if ( RetCode != OK )
    {
      sprintf( line, "ERROR reading file %s line %d ", FileOptionsName, Line );
      print_log( line );
      i_file.close();
      return RetCode;
    }
  }
  if ( !NameMosN )
  {
    NameMosN = new char[ strlen( TransistorString[ NMOS ] ) + 1 ];
    if ( !NameMosN )
      RetCode = NO_MEM;
    else
      strcpy( NameMosN, TransistorString[ NMOS ] );
  }
  if ( !NameMosP )
  {
    NameMosP = new char[ strlen( TransistorString[ PMOS ] ) + 1 ];
    if ( !NameMosP )
      RetCode = NO_MEM;
    else
      strcpy( NameMosP, TransistorString[ PMOS ] );
  }
  if ( !WorkPath )
  {
    WorkPath = new char[ strlen( WORKPath ) + 1 ];
    if ( !WorkPath )
      RetCode = NO_MEM;
    else
      strcpy( WorkPath, WORKPath );
  }
  if ( SimulationChosed == NONESM )
    SimulationChosed = HSPICE;
  if ( OptimizationChosed == NONEOM )
    OptimizationChosed = SLOP;
  i_file.close();
  return OK;
}
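
Options::Read scans the option file line by line, takes the first token as a keyword, skips lines whose first token begins with '#', and reads a numeric value for the keywords that carry one. The following is a simplified sketch of that scanning loop written with standard streams. The function name parse_options and the use of std::map are assumptions made for illustration; the thesis code dispatches on enumerated keywords instead and also handles keyword-only and string-valued options, which this sketch ignores.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Parse "NAME value" lines; a first token starting with '#' marks a comment.
std::map<std::string, double> parse_options(std::istream& in)
{
  std::map<std::string, double> values;
  std::string line;
  while (std::getline(in, line))
  {
    std::istringstream fields(line);
    std::string name;
    double value;
    if (!(fields >> name) || name[0] == '#')
      continue;              // blank line or comment
    if (fields >> value)
      values[name] = value;  // keyword-only lines are ignored in this sketch
  }
  return values;
}
```

Feeding it a stream that contains a comment, a `WMIN 0.5` line, an empty line, a bare `VERBOSE` keyword and a `WMAX 20` line yields a map with exactly the two numeric entries.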

ReadTEch.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "tech.h"
#include "readt.h"

///
struct _TECH_STR TECH;
int ReadTech()
{
  // nmos
  TECH.Lmin = 0.25;
  TECH.u0_n = 37.2;         /** micron^2 / (Volt * ns) */
  TECH.Kp_n = 256.7916;     /** uA / V^2 */
  TECH.vmax_n = 130.7952;   /** micron / ns */
  TECH.Vtn0 = 0.5885;       /** Volt */
  TECH.epss = 0.10359;      /** fF / micron */
  TECH.q = 1.602E-4;        /** fF * Volt */
  TECH.Na = 2.679E11;       /** micron^-3 */
  TECH.gamma_n = 0.3356;
  TECH.phi_n = 0.79424;
  TECH.Cox_n = 6.903;       /** fF / micron^2 */
  TECH.C_nj = 689E-3;       /** fF / micron^2 */
  TECH.C_np = 138E-3;       /** fF / micron */
  TECH.Ec_n = 3.516;        /** Volt / micron = vmax/uo */
  TECH.VT = 25.98E-3;       /** Volt */
  TECH.ni = 1.45E-2;        /** micron^-3 */
  TECH.Df = 0.625;          /** micron */
  TECH.Cgd0_n = 0.32;       /** fF / micron */
  TECH.Cgs0_n = 0.32;
  TECH.PB_n = 0.79424;      /** Volt */
  TECH.mj_n = 0.45495;
  TECH.mjsw_n = 0.1;
  TECH.XW_n = -0.79698;     /** micron */
  TECH.XL_n = 0;            /** micron */
  TECH.WD_n = 0.039849;     /** micron */
  TECH.LD_n = 0.0332;       /** micron */
  TECH.theta_n = 0.4314;    /* V^-1 */

  // pmos
  TECH.u0_p = 6.341;        /** micron^2 / (Volt * ns) */
  TECH.Kp_p = 30.16;        /** uA / V^2 */
  TECH.Cox_p = 6.903;       /** fF / micron^2 */
  TECH.gamma_p = 0.69468;
  TECH.phi_p = 0.79547;
  TECH.Vtp0 = -0.434;
  TECH.vmax_p = 57.6714;    /** micron / ns */
  TECH.C_pj = 596E-3;       /** fF / micron^2 */
  TECH.C_pp = 122.1E-3;     /** fF / micron */
  TECH.Ec_p = 9.095;        /** Volt / micron */
  TECH.Cgd0_p = 0.5;        /** fF / micron */
  TECH.Cgs0_p = 0.5;
  TECH.Nd = 2.8E11;         /** micron^-3 */
  TECH.PB_p = 0.79547;      /** Volt */
  TECH.mj_p = 0.36085;
  TECH.mjsw_p = 0.1;
  TECH.XW_p = -0.89852;     /** micron */
  TECH.XL_p = 0;            /** micron */
  TECH.WD_p = 0;            /** micron */
  TECH.LD_p = 0.054697;     /** micron */
  TECH.theta_p = 0.4071;    /** V^-1 */
  return OK;
}

SearchCritic.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "main.h"

///
int SearchCriticalPath(const Circuit& circuit,
                       CritPathList& pathList,
                       const NodeList& nodeListGnd,
                       const NodeList& nodeListVdd,
                       Node& nodeInputList,
                       Node& gateInternalList,
                       const Options& options)
{
  ListNodeList* gndCPath;
  ListNodeList* vddCPath;
  int RetCode;

  char log[1024];
  unsigned int nlg = nodeListGnd.GetNumList();
  unsigned int nlv = nodeListVdd.GetNumList();
  print_log("Searching Critical Path...");
#ifdef DEBUG
  cerr << "Searching Critical Path..." << endl;
#endif
  gndCPath = 0;
  vddCPath = 0;
  for (unsigned int i = 0; i < nlg; i++)
  {
    ListNodeList* tmp = new ListNodeList;
    if (!tmp)
      return NO_MEM;
    tmp->next = gndCPath;
    gndCPath = tmp;
    for (unsigned int j = 0; j < nodeInputList.GetNumNode(); j++)
      nodeInputList[j].flag = -1;
    for (unsigned int j = 0; j < gateInternalList.GetNumNode(); j++)
      gateInternalList[j].flag = -1;
    RetCode = SearchCPRecurse(gndCPath, circuit.ValimNode(), i, nodeListGnd, nodeListVdd, nodeInputList, gateInternalList, 0);
    if ((RetCode != OK) && (RetCode != CONT))
      return RetCode;
  }
  sprintf(log, "found first critical paths (gnd)...");
  print_log(log);
  for (unsigned int i = 0; i < nlv; i++)
  {
    ListNodeList* tmp = new ListNodeList;
    if (!tmp)
      return NO_MEM;
    tmp->next = vddCPath;
    vddCPath = tmp;
    for (unsigned int j = 0; j < nodeInputList.GetNumNode(); j++)
      nodeInputList[j].flag = -1;
    for (unsigned int j = 0; j < gateInternalList.GetNumNode(); j++)
      gateInternalList[j].flag = -1;
    RetCode = SearchCPRecurse(vddCPath, circuit.ValimNode(), i, nodeListVdd, nodeListGnd, nodeInputList, gateInternalList, 0);
    if ((RetCode != OK) && (RetCode != CONT))
      return RetCode;
  }
  sprintf(log, "found the other critical paths (vdd)...");
  print_log(log);
  ListNodeList* tmp = gndCPath;
  unsigned int count = 0;
  for (unsigned int gcount = 0; gcount < 2; gcount ++)
  {
    while (tmp)
    {
#ifdef DEBUG
      cerr << endl << "-------------> CP " << count << endl;
#endif
      count++;
      unsigned int nl = (tmp->NL).GetNumList();
      if (nl)
      {
        if ((RetCode = pathList.Create()) != OK)
          return RetCode;
        unsigned int first_node = (((tmp->NL)[0])[0]).node;
        unsigned int nn = ((tmp->NL)[nl - 1]).GetNumNode();
        unsigned int output = (((tmp->NL)[nl - 1])[nn - 1]).node;
        TransitionType Tr_in;
        TransitionType Tr_out;
        if (first_node == 0)
          Tr_in = RISE;
        else
          Tr_in = FALL;
        if (nl % 2)
          Tr_out = (Tr_in == RISE ? FALL : RISE);

        else
          Tr_out = (Tr_in == RISE ? RISE : FALL);
        OPTCOMMANDOPT TRin = (Tr_in == RISE ? RISETIME : FALLTIME);
        RetCode = pathList.InsertNodeOut(output, Tr_out);
        if (RetCode != OK)
          return RetCode;
        unsigned int pos;
        unsigned int first_input = 0;
        double val;
        double val_n;
        for (unsigned int i = 0; i < nodeInputList.GetNumNode(); i++)
        {
          (nodeInputList[i]).flag = -1;
        }
        for (unsigned int i = 0; i < nl; i++)
        {
          first_node = (((tmp->NL)[i])[0]).node;
          if (first_node == 0)
          {
            val = circuit.Valim();
            val_n = 0;
          }
          else
          {
            val = 0;
            val_n = circuit.Valim();
          }
          nn = ((tmp->NL)[i]).GetNumNode();
          // set initial condition
          unsigned int last_l_node = (((tmp->NL)[i])[nn - 1]).node;
          if (i < nl - 1)
          {
            RetCode = pathList.InsertInitialCondition(last_l_node, val);
            if (RetCode != OK)
              return RetCode;
          }
          for (unsigned int j = 1; j < nn; j = j + 2)
          {
            unsigned int input = (((tmp->NL)[i])[j]).node;
#ifdef DEBUG
            cerr << endl << "input " << input;
#endif
            if (IsIn(input, nodeInputList, pos) == OK)
            {
#ifdef DEBUG
              cerr << " primary input (" << pos << ")";
#endif
              if (first_input == 0)
              {
                first_input = input;
                RetCode = pathList.InsertNodeIn(first_input, Tr_in, options.GetOptOption(TRin));
                if (RetCode != OK)
                  return RetCode;
#ifdef DEBUG
                cerr << " INPUT";
#endif
              }
              else
              {
                if ((nodeInputList[pos]).flag == -1)
                {
                  RetCode = pathList.InsertActiveInputs(input, val);
                  if (RetCode != OK)
                    return RetCode;
#ifdef DEBUG
                  cerr << " ACTIVE IN " << val;
#endif
                }
                else

                {
                  if (first_node == 0)
                  {
                    if ((nodeInputList[pos]).flag != int(circuit.ValimNode()))
                      return NOT_FOUND;
                  }
                  else
                  {
                    if ((nodeInputList[pos]).flag != 0)
                      return NOT_FOUND;
                  }
                }
              }
              (nodeInputList[pos]).flag = (first_node == 0 ? circuit.ValimNode() : 0);
            }
            if (i > 0)
            {
              unsigned int nn2 = ((tmp->NL)[i - 1]).GetNumNode();
              last_l_node = (((tmp->NL)[i - 1])[nn2 - 1]).node;
            }
            else
              last_l_node = 0;
            if (IsIn(input, gateInternalList, pos) == OK)
            {
#ifdef DEBUG
              cerr << " internal gate (" << pos << " last " << last_l_node << ")";
#endif
              if (last_l_node != input)
              {
                if (first_input == 0)
                {
                  first_input = input;
                  RetCode = pathList.InsertNodeIn(first_input, Tr_in, options.GetOptOption(TRin));
                  if (RetCode != OK)
                    return RetCode;
#ifdef DEBUG
                  cerr << " INPUT INTERNAL";
#endif
                }
                else
                {
                  RetCode = pathList.InsertActiveInputs(input, val);
                  if (RetCode != OK)
                    return RetCode;
#ifdef DEBUG
                  cerr << " ACTIVE IN INTERNAL " << val;
#endif
                }
              }
            }
            unsigned int drain = (((tmp->NL)[i])[j - 1]).node;
            unsigned int source = (((tmp->NL)[i])[j + 1]).node;
            unsigned int nt = circuit.GetNTran();
            for (unsigned int k = 0; k < nt; k++)
            {
              if ( (circuit[k]).Gate() == input)
              {
                if ( (((circuit[k]).Drain() == drain) &&
                      ((circuit[k]).Source() == source)) ||
                     (((circuit[k]).Drain() == source) &&
                      ((circuit[k]).Source() == drain)))
                {
                  RetCode = pathList.InsertPathTransistor((circuit[k]).DevName(), (circuit[k]).TrType(), i);
                  if (RetCode != OK)
                    return RetCode;
                }
              }
            }

          }
        }
        for (unsigned int i = 0; i < nodeInputList.GetNumNode(); i++)
        {
          if ((nodeInputList[i]).flag == -1)
          {
            unsigned int noActiveNode = (nodeInputList[i]).node;
            double noActiveSupply;
            if (gcount == 0) // CP starting with GND
              noActiveSupply = 0;
            else // CP starting with VDD
              noActiveSupply = circuit.Valim();
#ifdef DEBUG
            cerr << endl << "no active input "
                 << noActiveNode << " = " << noActiveSupply ;
#endif
            RetCode = pathList.InsertNoActiveInputs(noActiveNode, noActiveSupply);
            if (RetCode != OK)
              return RetCode;
          }
        }
        if ((RetCode = pathList.Stamp(nl)) != OK)
          return RetCode;
      }
      tmp = tmp->next;
    }
    tmp = vddCPath;
  }
  sprintf(log, "found total %u critical paths...", pathList.GetNumPath());
  print_log(log);
  return OK;
}

SearchCriticRecurse.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "main.h"

///
int SearchCPRecurse(ListNodeList* CPath,
                    unsigned int valnode,
                    unsigned int index,
                    const NodeList& nodeListFirst,
                    const NodeList& nodeListSecond,
                    Node& nodeInputList,
                    Node& gateInternalList,
                    unsigned int ilevel)
{
  int RetCode;
  unsigned int level = ilevel;
  unsigned int i, j;
  unsigned int nn = (nodeListFirst[index]).GetNumNode();
  level++;
#ifdef DEBUG
  cerr << endl;
  for (i = 0; i < level; i++)
    cerr << "  ";
  cerr << "(" << level << ") " << index << " - ";
#endif

  unsigned int first_node = ((nodeListFirst[index])[0]).node;
  // first node = gnd or vdd
  unsigned int last_node = ((nodeListFirst[index])[nn - 1]).node;
  int neg_node;
  if (first_node == 0)
    neg_node = valnode;
  else
    neg_node = 0;
  for (i = 1; i < nn; i = i + 2)
  {
    unsigned int pos = 0;
    unsigned int tmp_node = ((nodeListFirst[index])[i]).node;
    if ( IsIn(tmp_node, nodeInputList, pos) == OK)
    {
      int flag = (nodeInputList[pos]).flag;
      if ( flag == -1)
      {
        (nodeInputList[pos]).flag = neg_node;
      }
      else
      {
        if (flag != neg_node)
          return OK;
      }
    }
    else if ( IsIn(tmp_node, gateInternalList, pos) == OK)
    {
      int flag = (gateInternalList[pos]).flag;
      if (flag == -1)
      {
        if (level > 1)
        {
          // very bastard inside
          if ( SearchOKCond(tmp_node, valnode, nodeListFirst, nodeListSecond, nodeInputList, gateInternalList) == NOT_FOUND)
          {
            if ( SearchOKCond(tmp_node, valnode, nodeListSecond, nodeListFirst, nodeInputList, gateInternalList) == NOT_FOUND)
              (gateInternalList[pos]).flag = neg_node;
            else
              return OK;
          }
          else
            (gateInternalList[pos]).flag = neg_node;
        }
        else
          (gateInternalList[pos]).flag = neg_node;
      }
      else
      {
        if (flag != neg_node)
          return OK;
      }
    }
  }
  if ( (RetCode = (CPath->NL).Create()) != OK)
    return RetCode;
  for (unsigned int ii = 0; ii < nn; ii++)
  {
    if ((RetCode = (CPath->NL).InsertNode(((nodeListFirst[index])[ii]).node)) != OK)
      return RetCode;
  }
  unsigned int nl = nodeListSecond.GetNumList();
  for (i = 0; i < nl; i++)
  {
    nn = (nodeListSecond[i]).GetNumNode();
    j = 1;
    unsigned int found = 0;
    while ((j < nn) && (!found))
    {

      unsigned int try_node = ((nodeListSecond[i])[j]).node;
      if (try_node == last_node)
        found = j;
      j = j + 2;
    }
    if (found)
    {
      int RecurseCode = SearchCPRecurse(CPath, valnode, i, nodeListSecond, nodeListFirst, nodeInputList, gateInternalList, level);
      if (RecurseCode == OK)
      {
        ListNodeList* tmp = new ListNodeList;
        if (!tmp)
          return NO_MEM;
        tmp->next = CPath;
        CPath = tmp;
        unsigned int n_l = ((CPath->next)->NL).GetNumList();
        for (unsigned int jj = 0; jj < n_l - 1; jj++)
        {
          if ((RetCode = (CPath->NL).Create()) != OK)
            return RetCode;
          unsigned int n_n = (((CPath->next->NL))[jj]).GetNumNode();
          for (unsigned int k = 0; k < n_n; k++)
            (CPath->NL).InsertNode((((CPath->next->NL)[jj])[k]).node );
        }
      }
      else if (RecurseCode == CONT)
      {
        //if((RetCode = (CPath->NL).DeleteList(level)) != OK)
        // return RetCode;
      }
      else
        return RecurseCode;
    }
  }
  level--;
  if (level == 0)
  {
#ifdef DEBUG
    unsigned int pp = (CPath->NL).GetNumList();
    cerr << endl << " NUMLIST " << pp << " --- ";
    for (unsigned int i = 0; i < pp; i++)
    {
      unsigned int pn = ((CPath->NL)[i]).GetNumNode();
      for (unsigned int j = 0; j < pn; j++)
        cerr << " " << (((CPath->NL)[i])[j]).node;
      cerr << " / ";
    }
#endif
  }
  return CONT;
}

SearchOkCond.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "main.h"

///
int SearchOKCond(unsigned int node,
                 unsigned int valnode,
                 const NodeList& nodeListFirst,
                 const NodeList& nodeListSecond,
                 Node& nodeInputList,
                 Node& gateInternalList)
{
    unsigned int nl = nodeListSecond.GetNumList();
    unsigned int pos = 0;
    unsigned int first_node = ((nodeListSecond[0])[0]).node;
    // first node = gnd or vdd
    int neg_node;
    if (first_node == 0)
        neg_node = valnode;
    else
        neg_node = 0;
    for (unsigned int i = 0; i < nl; i++)
    {
        unsigned int nn = (nodeListSecond[i]).GetNumNode();
        unsigned int last_node = ((nodeListSecond[i])[nn - 1]).node;
        if (node == last_node)
        {
            for (unsigned int j = 1; j < nn; j = j + 2)
            {
                unsigned int tmp_node = ((nodeListSecond[i])[j]).node;
                if ( IsIn(tmp_node, nodeInputList, pos) == OK)
                {
                    int flag = (nodeInputList[pos]).flag;
                    if ( flag == -1)
                    {
                        (nodeInputList[pos]).flag = neg_node;
                    }
                    else
                    {
                        if (flag != neg_node)
                            return NOT_FOUND;
                    }
                }
                else if ( IsIn(tmp_node, gateInternalList, pos) == OK)
                {
                    int flag = (gateInternalList[pos]).flag;
                    if (flag == -1)
                    {
                        int RecurseCode = SearchOKCond(tmp_node, valnode, nodeListSecond, nodeListFirst, nodeInputList, gateInternalList);
                        if (RecurseCode == NOT_FOUND)
                            return NOT_FOUND;
                    }
                    else
                    {
                        if (flag != neg_node)
                            return NOT_FOUND;
                    }
                }
            }
        }
    }
    return OK;
}

TransistorList.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_devices.h"

///
TransistorList::TransistorList() : NumTran( 0 ), head( 0 ), tail( 0 )
{}

///
TransistorList::~TransistorList()
{
TransistorNode* tmp;
while ( head )
{
tmp = head->next;
delete head;
head = tmp;
}
}
///
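const TransistorNode& TransistorList::operator[]( unsigned int index ) const
{
    // Valid indices run from 0 to NumTran - 1.
    if ( index >= NumTran )
        error( NOT_FOUND, 0, "Index out of bound in [TransistorList]..." );
    TransistorNode* tmp = head;
    while ( tmp )
    {
        if ( tmp->Index() == index )
            return *tmp;
        tmp = tmp->next;
    }
    // Not found: stop here rather than dereference a null pointer
    // (assuming NOT_FOUND is a nonzero status, error() does not return).
    error( NOT_FOUND, 0, "Index not found in [TransistorList]..." );
    return *tmp;
}
///
const TransistorNode& TransistorList::operator[]( const char* name ) const
{
    TransistorNode * tmp = head;
    while ( tmp )
    {
        if ( !strcasecmp( name, tmp->Name ) )
            return *tmp;
        tmp = tmp->next;
    }
    // Not found: stop here rather than dereference a null pointer
    // (assuming NOT_FOUND is a nonzero status, error() does not return).
    error( NOT_FOUND, 0, "Name not found in [TransistorList]..." );
    return *tmp;
}
///
TransistorNode& TransistorList::operator[]( unsigned int index )
{
    // Valid indices run from 0 to NumTran - 1.
    if ( index >= NumTran )
        error( NOT_FOUND, 0, "Index out of bound in [TransistorList]..." );
    TransistorNode* tmp = head;
    while ( tmp )
    {
        if ( tmp->Index() == index )
            return *tmp;
        tmp = tmp->next;
    }
    // Not found: stop here rather than dereference a null pointer
    // (assuming NOT_FOUND is a nonzero status, error() does not return).
    error( NOT_FOUND, 0, "Index not found in [TransistorList]..." );
    return *tmp;
}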
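
The lookups above walk a linked list and compare names with strcasecmp. A minimal sketch of the same case-insensitive lookup over a std::vector, returning a null pointer on a miss instead of a reference, might look as follows. The names Device, same_name and find_device are hypothetical and introduced only for illustration; they are not part of the optimizer sources.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for TransistorNode: only the fields the sketch needs.
struct Device {
    std::string name;
    double width;
};

// Case-insensitive comparison, playing the role of strcasecmp() == 0.
bool same_name(const std::string& a, const std::string& b)
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}

// Returns a pointer to the matching device, or a null pointer when absent.
const Device* find_device(const std::vector<Device>& devices, const std::string& name)
{
    for (std::size_t i = 0; i < devices.size(); ++i) {
        if (same_name(devices[i].name, name))
            return &devices[i];
    }
    return 0;
}
```

Returning a pointer lets the caller test for a missing device explicitly, which a reference-returning operator[] cannot express.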

TransistorListInsert.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_devices.h"

///
int TransistorList::Insert( const char *name, double w, double l, TransistorType t, unsigned int s, unsigned int g, unsigned int d )
{
    TransistorNode * tmp;
    if ( !head )
    {
        head = new TransistorNode( name, w, l, t, s, d, g, NumTran );
        if ( !head )
            return NO_MEM;
        head->next = 0;
        tail = head;
    }
    else
    {
        tmp = new TransistorNode( name, w, l, t, s, d, g, NumTran );
        if ( !tmp )
            return NO_MEM;
        tmp->next = 0;
        tail->next = tmp;
        tail = tmp;
    }
    NumTran++;
    return OK;
}

TransistorNode.cc
#include "mystdinclude.h"
#include "print.h"
#include "myenum.h"
#include "class_devices.h"

///
TransistorNode::TransistorNode( const char *name, double w, double l, TransistorType t, unsigned int s, unsigned int d, unsigned int g, unsigned int index ) :
Name( 0 ), width( w ), length( l ), type( t ),
source( s ), drain( d ), gate( g ), hashindex( index ), next( 0 )
{
Name = new char[ strlen( name ) + 1 ];
if ( !Name )
{
print_log( "FATAL ERROR" );
print_log( ReturnMessage[ NO_MEM ] );
error( NO_MEM, errno, "PANIC! " );
}
strcpy( Name, name );
}
///
TransistorNode::~TransistorNode()
{
delete[] Name;
}

main.cc
#include "mystdinclude.h"
#include <signal.h>
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"
#include "class_optimizator.h"
#include "hspice.h"
#include "fast.h"
#include "test.h"
#include "main.h"
#include "readt.h"

///
extern char *optarg;
extern int optind;
///
int main ( int argc, char **argv )
{
int c;
char *FileIn;
char *FileOptions;
time_t tm = time( 0 );
signal(15, catch_stop);
signal(2, catch_stop);
char log[ 256 ];
print_log( "\n*************************" );
sprintf( log, "%s Version: %s Copyright MFD 1998 ", argv[ 0 ], VERSION );
print_log( log );
print_log( "*************************" );
print_log( ctime( &tm ) );
// some default initialization
FileOptions = 0;
while ( ( c = getopt( argc, argv, "hf:t:" ) ) != -1 )
{
switch ( c )
{
case 'h':
//HELP
return print_help( argv[ 0 ] );
break;
case 'f':
FileOptions = new char[ strlen( optarg ) + 1 ];
if ( !FileOptions )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );
error( NO_MEM, errno, "HEY! " );
}
strcpy( FileOptions, optarg );
break;
case '?':
default:
return print_help( argv[ 0 ] );
break;
}
}
if ( ( argc - optind ) != 1 )
return print_help( argv[ 0 ] );
else
{
FileIn = new char[ strlen( argv[ optind ] ) + 1 ];
if ( !FileIn )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );
error( NO_MEM, errno, "HEY! " );
}
strcpy( FileIn, argv[ optind ] );
if ( !FileOptions )
{
FileOptions = new char[ strlen( "options.conf" ) + 1 ];
strcpy( FileOptions, "options.conf" );
}
}
Options options;
int RetCode;

if ( ( RetCode = options.Read( FileOptions ) ) )
{
print_log( "Error reading options file:" );
print_log( ReturnMessage[ RetCode ] );
error( RetCode, errno, "HEY! " );
}
Circuit circuit( FileIn, options );
CritPathList pathList;
if (options.Manual() == 0)
{
RetCode = Critic(circuit, pathList, options);
if (RetCode != OK)
{
print_log( "Error searching critical paths:" );
print_log( ReturnMessage[ RetCode ] );
error( RetCode, errno, "HEY! " );
}
}
else
{
if ( ( RetCode = pathList.Read( FileOptions, circuit ) ) )
{
print_log( "Error reading options file:" );
print_log( ReturnMessage[ RetCode ] );
error( RetCode, errno, "HEY! " );
}
}
if ( ( RetCode = ReadTech() ) )
{
print_log( "Error reading options file:" );
print_log( ReturnMessage[ RetCode ] );
error( RetCode, errno, "HEY! " );
}
EvaluationAlgorithm* simulation = 0;
switch ( options.WhichSimAlgorithm() )
{
case HSPICE:
simulation = new Hspice( pathList, options, FileIn );
if ( mkdir( options.Workpath(), 0770 ) )
{
if ( errno != EEXIST )
{
error( NOT_FOUND, errno, "HEY! " );
}
}
break;
case FAST:
simulation = new Fast( pathList, options );
break;
case TESTOPT:
simulation = new TestOpt( pathList, options);
break;
case NONESM:
default:
error( NOT_FOUND, errno, "HEY! " );
break;
}
double* LastWidth = new double[ circuit.GetNTran() ];
if ( !simulation || !LastWidth )
{
print_log( "FATAL ERROR" );
print_log( ReturnMessage[ NO_MEM ] );
error( NO_MEM, errno, "HEY! " );
}
print_init( circuit, options, pathList.GetNumPath() );
if ( ( RetCode = Optimize( circuit, options, *simulation, LastWidth ) ) )
{
print_log( "Error in optimizing..." );

print_log( ReturnMessage[ RetCode ] );
error( RetCode, errno, "HEY! " );
}
print_log("Writing optimized netlist...");
print_final(FileIn, circuit, pathList.GetNumPath(), *simulation, LastWidth );
print_log( "Time to die..." );
delete[] FileIn;
delete[] FileOptions;
delete[] LastWidth;
delete simulation;
return OK;
}
///
int print_help( const char *name )
{
cerr << "Usage: " << name << " [-f FILEOPTIONS] Netlist_file" << endl;
cerr << "       Where -f FILEOPTIONS = file containing general option (default = options.conf) " << endl;
return OK;
}

nrutil.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "nrutil.h"
#include "print.h"

///
const int NR_END = 1;
#define FREE_ARG char*
///
double *dvector ( long nl, long nh )
/* allocate a double vector with subscript range v[nl..nh] */
{
    double * v;
    v = new double[ nh - nl + 1 + NR_END ];
    if ( !v )
    {
        print_log( "FATAL ERROR:" );
        print_log( ReturnMessage[ NO_MEM ] );
        //error(NO_MEM, errno, "HEY! ");
        return 0;
    }
    return v - nl + NR_END;
}
///
double **dmatrix ( long nrl, long nrh, long ncl, long nch )
/* allocate a double matrix with subscript range m[nrl..nrh][ncl..nch] */
{
    long i, nrow = nrh - nrl + 1, ncol = nch - ncl + 1;
    double **m;
    m = new double * [ nrow + NR_END ];
    if ( !m )
    {
        print_log( "FATAL ERROR:" );
        print_log( ReturnMessage[ NO_MEM ] );
        //error(NO_MEM, errno, "HEY! ");
        return 0;
    }
    m += NR_END;
    m -= nrl;
    m[ nrl ] = new double[ nrow * ncol + NR_END ];
    if ( !m[ nrl ] )
    {
        print_log( "FATAL ERROR:" );
        print_log( ReturnMessage[ NO_MEM ] );
        //error(NO_MEM, errno, "HEY! ");
        return 0;
    }
    m[ nrl ] += NR_END;
    m[ nrl ] -= ncl;
    for ( i = nrl + 1; i <= nrh; i++ )
        m[ i ] = m[ i - 1 ] + ncol;
    return m;
}
///
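void free_dvector ( double *v )
{
    // Correct only for vectors allocated with nl == NR_END (1-based),
    // where v is the pointer originally returned by new[].
    delete[] v;
}
///
void free_dmatrix ( double **m )
{
    // Correct only for matrices allocated with nrl == ncl == 1,
    // where m and m[ 1 ] are the pointers originally returned by new[].
    delete[] m[ 1 ];
    delete[] m;
}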
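
The allocation routines above return a shifted pointer so that callers can index v[nl..nh], and the matching free routines only work when that shift is zero. As a hedged alternative, a small wrapper over std::vector can carry the lower bound with it and release its storage automatically. The class name RangeVector and its interface are assumptions made for this sketch; they do not appear in nrutil.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical wrapper: a double vector addressed as v[lo..hi], like dvector(nl, nh).
class RangeVector {
public:
    RangeVector(long lo, long hi)
        : lo_(lo), data_(static_cast<std::size_t>(hi - lo + 1), 0.0) {}

    // Bounds-checked access: at() throws std::out_of_range outside [lo, hi].
    double& operator[](long i) { return data_.at(static_cast<std::size_t>(i - lo_)); }
    double operator[](long i) const { return data_.at(static_cast<std::size_t>(i - lo_)); }

    std::size_t size() const { return data_.size(); }

private:
    long lo_;                  // first valid subscript
    std::vector<double> data_; // storage, released automatically by the destructor
};
```

Because the subscript shift is applied on every access rather than baked into the pointer, no separate free routine is needed and any lower bound is handled the same way.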

print_final.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"
#include "class_simulator.h"

///
void print_final(const char* FileNetList, const Circuit& circuit, unsigned int NP, EvaluationAlgorithm& simulation, double* LastWidth )
{
unsigned int n = circuit.GetNTran();
char log[ 1024 ];
print_log( "Final Dimensions: " );
for ( unsigned int i = 0; i < n; i++ )
{
int pos = circuit.TranPos( circuit[ i ].DevName() );
sprintf( log, "W[%s] = %3.3gu", circuit[ i ].DevName(), LastWidth[ pos ] );
print_log( log );
}
print_log( "Final critical paths: " );
for ( unsigned int i = 0; i < NP; i++ )
{
sprintf( log, "%u) Delay=%g ps, Energy=%g pJ, Noise=%g", i,
simulation.GetDelay( i ),
simulation.GetPower( i ),
simulation.GetNoise( i ) );
print_log( log );
}
sprintf( log, "Area=%g ", simulation.GetArea() );
print_log(log);
char line[ 1024 ];
char line2[ 1024 ];
char* FileNetOut;
FileNetOut = new char[ strlen( FileNetList ) + strlen( NetListSuffix ) + 1 ];
if ( !FileNetOut )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );

error( NO_MEM, errno, "HEY! " );
}
strcpy( FileNetOut, FileNetList );
strcat( FileNetOut, NetListSuffix );
ifstream i_file( FileNetList );
ofstream o_file( FileNetOut );
if ( !i_file || !o_file)
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NOT_FOUND ] );
error( NOT_FOUND, errno, "HEY! " );
}
while ( i_file.getline( line, 1023 ) )
{
int c = 0;
while ( isspace( line[ c++ ] ) );
switch ( line[ --c ] )
{
case 'm':
case 'M':
case 'x':
case 'X':
char tmpstr[ 128 ];
char parsestr[ 128 ];
char par[ 16 ];
char type[ 16 ];
char endpar[ 16 ];
char mos[ 8 ];
unsigned int n1, n2, n3, n4;
strcpy( parsestr, "%s %u %u %u %u %s" );
if ( sscanf( line, parsestr, mos, &n1, &n2, &n3, &n4, type ) == 6 )
{
sprintf( line2, "%s %u %u %u %u %s ", mos, n1, n2, n3, n4, type );
strcpy( parsestr, "%*s %*u %*u %*u %*u %*s" );
strcpy( tmpstr, parsestr );
strcat( parsestr, " %s" );
while ( sscanf( line, parsestr, par ) == 1 )
{
unsigned int count = 0;
while ( isspace( par[ count++ ] ) );
count--;
if ((par[count] == 'w') || (par[count] == 'W'))
{
double W = LastWidth[circuit[mos].Index()];
sprintf(endpar, " w=%gu ", W);
strcat(line2, endpar);
}
else
{
strcat(line2, " ");
strcat(line2, par);
}
strcat( tmpstr, " %*s" );
strcpy( parsestr, tmpstr );
strcat( parsestr, " %s" );
}
}
else
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NOT_FOUND ] );
error( NOT_FOUND, errno, "HEY! " );
}
break;
default:
strcpy(line2, line);
break;
}
o_file << line2 << endl;

}
i_file.close();
o_file.close();
delete[] FileNetOut;
}

print_init.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "class_options.h"
#include "class_devices.h"
#include "class_nodes.h"
#include "class_critical.h"
#include "class_circuit.h"

///
void print_init( const Circuit& circuit, const Options& options, unsigned int NP )
{
unsigned int n = circuit.GetNTran();
char log[ 1024 ];
sprintf( log, "Circuit: %u Transistor || %u Critical paths", n, NP );
print_log( log );
print_log( "Initial Dimensions: " );
for ( unsigned int i = 0; i < n; i++ )
{
sprintf( log, "W[%s] = %3.3gu", circuit[ i ].DevName(), circuit[ i ].Width() );
print_log( log );
}
}

print_log.cc

#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
///
void print_log( const char *OutString )
{
ofstream o_file( "OPT.log", ios::app );
if ( o_file )
{
o_file << OutString << endl;
o_file.close();
}
}

signal.cc

#include "mystdinclude.h"
#include <signal.h>

///
void catch_stop(int n)
{
if (n == 15)
cerr << endl << "TERM" << endl;
if (n == 2)
{
cerr << endl << "TERM2" << endl;
exit(0);
}
}

B.2 Optimization algorithms

Slop.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "slop.h"

///
Slop::Slop( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation )
:
OptimizationAlgorithm( circuit, options, simulation ),
TMat(0), TMatOld(0), GMat(0), T(0), SaveW(0)
{
print_log( "Creating Slop instance..." );
TMat = new double * [ NumPath ];
TMatOld = new double * [ NumPath ];
GMat = new double * [ NumPath ];
T = new double [ NumPath ];
SaveW = new double[ NumTran ];
if ( ( !TMat ) || ( !TMatOld ) || ( !GMat ) || ( !T ) || ( !SaveW ) )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );
error( NO_MEM, errno, "HEY! " );
}
for ( unsigned int i = 0; i < NumPath; i++ )
{
TMat[ i ] = new double [ NumTran ];
TMatOld[ i ] = new double [ NumTran ];
GMat[ i ] = new double [ NumTran ];
if ( ( !TMat[ i ] ) || ( !TMatOld[ i ] ) || ( !GMat[ i ] ) )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );
error( NO_MEM, errno, "HEY! " );
}
}
}
///
Slop::~Slop()
{
    for ( unsigned int i = 0; i < NumPath; i++ )
    {
        delete[] TMat[ i ];
        delete[] TMatOld[ i ];
        delete[] GMat[ i ];
    }
    delete[] TMat;
    delete[] TMatOld;
    delete[] GMat;
    delete[] T;
    delete[] SaveW;
}
///
int Slop::Run()
{
int Wbig;
int RetCode;
for ( unsigned int i = 0; i < NumTran; i++ )
SaveW[ i ] = Width[ i ];
double max = 0;
double TMax = 0;
unsigned int jmax;
double dummy;
unsigned int end_acc = 0;
for ( Steps = 1; ( Steps < options.GetOptOption( MAXSTEPS ) ) \
&& (max >= 0.0) \
&& ( InternalSteps < options.GetOptOption( MAXSTEPS ) ) \
&& (end_acc == 0); Steps++ )
{
for ( unsigned int i = 0; i < NumTran; i++ )
Width[ i ] = SaveW[ i ];
dummy = SlopNormSim( Width, RetCode);
if ((RetCode != OK) && (RetCode != CONT) && (RetCode != MAX_STEPS) && (RetCode != END_ACC))
return RetCode;
else if ((RetCode == MAX_STEPS) || (RetCode == END_ACC))
end_acc = 1;
TMax = 0.0;
jmax = 0;
for ( unsigned int i = 0; i < NumPath; i++ )
{
T[ i ] = CPDelay[ i ];
for ( unsigned int j = 0; j < NumTran; j++ )
TMatOld[ i ][ j ] = T[ i ];
if ( T[ i ] > TMax )
{
TMax = T[ i ];
jmax = i;
}
}
for ( unsigned int i = 0; i < NumTran; i++ )
{
Width[ i ] += options.GetOptOption( DELTA );
if (options.GetOptOption( WMAX ) > 0)
if ( Width[ i ] >= options.GetOptOption( WMAX ) )
Width[ i ] = options.GetOptOption( WMAX );
dummy = SlopNormSim( Width, RetCode);
if ((RetCode != OK) && (RetCode != CONT) && (RetCode != MAX_STEPS) && (RetCode != END_ACC))
return RetCode;
else if ((RetCode == MAX_STEPS) || (RetCode == END_ACC))
end_acc = 1;
Width[ i ] = SaveW[ i ];
if ( options.GetOptOption( CONSTRAINS ) )
{
if ( ( CPDelay[ i ] <= options.GetOptOption( MAXDELAY ) ) &&
( CPPower[ i ] <= options.GetOptOption( MAXPOWER ) ) &&
( CPNoise[ i ] <= options.GetOptOption( MAXNOISE ) ) &&
( Area <= options.GetOptOption( MAXAREA ) ) )
for ( unsigned int j = 0; j < NumPath; j++ )
{
T[ j ] = CPDelay[ j ];
TMat[ j ][ i ] = T[ j ];
}
}
else
for ( unsigned int j = 0; j < NumPath; j++ )
{
T[ j ] = CPDelay[ j ];
TMat[ j ][ i ] = T[ j ];
}
}
Wbig = -1;
max = 0.0;
for ( unsigned int i = 0; i < NumPath; i++ )
{
for ( unsigned int j = 0; j < NumTran; j++ )
{
GMat[ i ][ j ] = TMatOld[ i ][ j ] - TMat[ i ][ j ];
if ( ( GMat[ i ][ j ] > max ) && ( i == jmax ) )
{
max = GMat[ i ][ j ];
Wbig = j;
}
TMatOld[ i ][ j ] = TMat[ i ][ j ];
}

        }
        if ( Wbig != -1 )
            SaveW[ Wbig ] += options.GetOptOption( DELTA );
        else
            // so max < 0
            max = -1.0;
    }
    for ( unsigned int i = 0; i < NumTran; i++ )
    {
        Width[ i ] = SaveW[ i ];
    }
    dummy = SlopNormSim( Width, RetCode);
    return RetCode;
}
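
The loop in Slop::Run widens each transistor in turn by DELTA, records how much the slowest path speeds up, and then keeps only the single widening with the largest gain. A stripped-down sketch of that greedy step is given below. It assumes a single scalar delay function standing in for the simulated delay of the slowest critical path, and it leaves out the WMAX clamp and the constraint checks; DelayModel, best_transistor and greedy_size are hypothetical names for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Assumed stand-in for a simulator call: maps transistor widths to a path delay.
typedef std::function<double(const std::vector<double>&)> DelayModel;

// Index of the transistor whose widening by delta cuts the delay most,
// or -1 when no single widening reduces the delay.
int best_transistor(std::vector<double> w, double delta, const DelayModel& delay)
{
    const double base = delay(w);
    int best = -1;
    double best_gain = 0.0;
    for (std::size_t i = 0; i < w.size(); ++i) {
        const double saved = w[i];
        w[i] = saved + delta;                 // trial widening of transistor i
        const double gain = base - delay(w);  // positive when the delay drops
        w[i] = saved;                         // undo the trial
        if (gain > best_gain) {
            best_gain = gain;
            best = static_cast<int>(i);
        }
    }
    return best;
}

// Repeats the greedy step until no widening helps or max_steps is reached.
std::vector<double> greedy_size(std::vector<double> w, double delta,
                                unsigned max_steps, const DelayModel& delay)
{
    for (unsigned step = 0; step < max_steps; ++step) {
        const int i = best_transistor(w, delta, delay);
        if (i < 0)
            break;
        w[static_cast<std::size_t>(i)] += delta;
    }
    return w;
}
```

Each step costs one delay evaluation per transistor, which is why the listing tracks both the step count and the number of simulator calls.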

SlopNorm.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "slop.h"

///
double Slop::SlopNormSim( const double* NewWidth, int& RetCode)
{
return NormSim( NewWidth, RetCode);
}


Slop2.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "slop2.h"

///
Slop2::Slop2( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation )
:
OptimizationAlgorithm( circuit, options, simulation ),
TMat(0), TMatOld(0), GMat(0), SaveW(0)
{
print_log( "Creating Slop2 instance..." );
TMat = new double * [ NumPath ];
TMatOld = new double * [ NumPath ];
GMat = new double * [ NumPath ];
SaveW = new double[ NumTran ];
if ( ( !TMat ) || ( !TMatOld ) || ( !GMat ) || ( !SaveW ) )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );
error( NO_MEM, errno, "HEY! " );
}
for ( unsigned int i = 0; i < NumPath; i++ )
{
TMat[ i ] = new double [ NumTran ];
TMatOld[ i ] = new double [ NumTran ];
GMat[ i ] = new double [ NumTran ];
if ( ( !TMat[ i ] ) || ( !TMatOld[ i ] ) || ( !GMat[ i ] ) )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );
error( NO_MEM, errno, "HEY! " );
}
}
}
///
Slop2::~Slop2()
{
    for ( unsigned int i = 0; i < NumPath; i++ )
    {
        delete[] TMat[ i ];
        delete[] TMatOld[ i ];
        delete[] GMat[ i ];
    }
    delete[] TMat;
    delete[] TMatOld;
    delete[] GMat;
    delete[] SaveW;
}
///
int Slop2::Run()
{
int Wbig;
int RetCode;
for ( unsigned int i = 0; i < NumTran; i++ )
SaveW[ i ] = Width[ i ];
double max = 0;
double dummy;
for ( unsigned int i = 0; i < NumTran; i++ )
Width[ i ] = SaveW[ i ];
dummy = Slop2NormSim( Width, RetCode);
if ( (RetCode != OK ) && (RetCode != CONT))
return RetCode;
for ( unsigned int i = 0; i < NumPath; i++ )

{
for ( unsigned int j = 0; j < NumTran; j++ )
TMatOld[ i ][ j ] = dummy;
}
unsigned int end_acc = 0;
for ( Steps = 1; ( Steps < options.GetOptOption( MAXSTEPS ) ) \
&& ( max >= 0.0 ) \
&& ( InternalSteps < options.GetOptOption( MAXSTEPS ) \
&& (end_acc == 0)); Steps++ )
{
for ( unsigned int i = 0; i < NumTran; i++ )
Width[ i ] = SaveW[ i ];
for ( unsigned int i = 0; i < NumTran; i++ )
{
Width[ i ] += options.GetOptOption( DELTA );
if (options.GetOptOption( WMAX ) > 0)
if ( Width[ i ] >= options.GetOptOption( WMAX ) )
Width[ i ] = options.GetOptOption( WMAX );
dummy = Slop2NormSim( Width, RetCode);
if ((RetCode != OK) && (RetCode != CONT) && (RetCode != MAX_STEPS) && (RetCode != END_ACC))
return RetCode;
else if ((RetCode == MAX_STEPS) || (RetCode == END_ACC))
end_acc = 1;
Width[ i ] = SaveW[ i ];
for ( unsigned int j = 0; j < NumPath; j++ )
{
TMat[ j ][ i ] = dummy;
}
}
Wbig = -1;
max = 0.0;
for ( unsigned int i = 0; i < NumPath; i++ )
for ( unsigned int j = 0; j < NumTran; j++ )
{
GMat[ i ][ j ] = TMatOld[ i ][ j ] - TMat[ i ][ j ];
if ( GMat[ i ][ j ] > max )
{
max = GMat[ i ][ j ];
Wbig = j;
}
TMatOld[ i ][ j ] = TMat[ i ][ j ];
}
if ( Wbig != -1 )
SaveW[ Wbig ] += options.GetOptOption( DELTA );
else
max = -1.0;
// so max < 0
}
for ( unsigned int i = 0; i < NumTran; i++ )
Width[ i ] = SaveW[ i ];
dummy = Slop2NormSim( Width, RetCode);
return RetCode;
}

Slop2Norm.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "slop2.h"

///
double Slop2::Slop2NormSim( const double* NewWidth, int& RetCode)
{
return NormSim( NewWidth, RetCode);
}


TestEv.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "test.h"

///
TestEval::TestEval( const Circuit& circuit, const Options& options, EvaluationAlgorithm& simulation )
:
OptimizationAlgorithm( circuit, options, simulation ) ,
TryW(0)
{
print_log( "Creating TestEval instance..." );
TryW = new double[ NumTran ];
if ( !TryW )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );
error( NO_MEM, errno, "HEY! " );
}
}
///
TestEval::~TestEval()
{
delete[] TryW;
}
///
int TestEval::Run()
{
int RetCode;
double dummy;
for ( unsigned int i = 0; i < NumTran; i++ )
TryW[ i ] = options.GetOptOption( WMIN );
for ( Steps = 1; Steps < options.GetOptOption( MAXSTEPS ) ; Steps++ )
{
dummy = TestEvalNormSim( TryW, RetCode);
if ( (RetCode != OK ) && (RetCode != CONT))
return RetCode;
// if (Steps <= NumTran)
// TryW[(Steps - 1)] += options.GetOptOption( DELTA );
// else
// TryW[(Steps - 1) % NumTran] += options.GetOptOption(DELTA);
for ( unsigned int i = 0; i < NumTran; i++ )
TryW[i] += options.GetOptOption( DELTA );
}
return OK;
}

TestNorm.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "test.h"

///
double TestEval::TestEvalNormSim( const double* NewWidth, int& RetCode)
{
return NormSim( NewWidth, RetCode);
}

B.3 Simulators

Basicnet.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "hspice.h"

///
int Hspice::BasicNetlist( const double* NewWidth, unsigned int Np, const Circuit& circuit )
{
char * FileHspice;
char log[ 1024 ];
char suffix[ 8 ];
sprintf( suffix, "_%u", Np );
FileHspice = new char[ strlen( WorkPath ) + strlen( SimFile ) + strlen( suffix ) + 1 ];
strcpy( FileHspice, WorkPath );
strcat( FileHspice, SimFile );
strcat( FileHspice, suffix );
ofstream o_file( FileHspice );
ifstream i_file( NetlistFile );
if ( !o_file )
{
sprintf( log, " ERROR opening file %s ", FileHspice );
print_log( log );
return NOT_FOUND;
}
if ( !i_file )
{
sprintf( log, " ERROR opening file %s ", NetlistFile );
print_log( log );
return NOT_FOUND;
}
char line[ 1024 ];
i_file.getline( line, 1023 );
o_file << endl << endl << endl << "****** INPUTS ******" << endl;
o_file << ".include inputs." << Np << endl;
o_file << "********************" << endl;
while ( i_file.getline( line, 1023 ) )
{
unsigned int i = 0;
while ( isspace( line[ i++ ] ) );
char s = line[ --i ];
char st1[ 16 ], st2[ 16 ], st3[ 16 ];
int n1, n2, n3, n4;
if ( ( s == 'M' ) || ( s == 'm' ) || ( s == 'X' ) || ( s == 'x' ) )
{
sscanf( line, "%s %d %d %d %d %s %*s %s", st1, &n1, &n2, &n3, &n4, st2, st3 );
int position = circuit.TranPos( st1 );
if ( position == -1 )
return NOT_FOUND;
o_file << st1 << " " << n1 << " " << n2 << " " << n3 << " " << n4 << \
" " << st2 << " " << st3 << " w=" << setprecision( 4 ) << NewWidth[ position ] << "u" << endl;
}
else
o_file << line << endl;
}
o_file.close();
i_file.close();
return OK;
}

Delayread.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "hspice.h"
///
int Hspice::DelayRead( double& del, double& energy, unsigned int Np )
{
char * FileMeas;
char suffix[ 8 ];
sprintf( suffix, "_%d", Np );
FileMeas = new char[ strlen( WorkPath ) + strlen( SimFile ) + strlen( suffix ) + strlen( SuffixFileMeasure ) + 1 ];
if ( !FileMeas )
return NO_MEM;
strcpy( FileMeas, WorkPath );
strcat( FileMeas, SimFile );
strcat( FileMeas, suffix );
strcat( FileMeas, SuffixFileMeasure );
ifstream i_file( FileMeas );
if ( !i_file )
{
print_log( "ERROR opening hspice measure file " );
delete[] FileMeas;
return NOT_FOUND;
}
char line[ 1023 ];
for ( unsigned int i = 0; i <= 3; i++ )
if ( !i_file.getline( line, 1023 ) )
{
print_log( "ERROR parsing hspice measure file " );
delete[] FileMeas;
return PARSE_ERROR;
}
sscanf( line, "%lg %lg", &del, &energy);
i_file.close();
del *= 1E12; // picosec.
energy *= ( 1E12); // pJ
delete[] FileMeas;
return OK;
}
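
DelayRead reads four lines from the measure file and parses two numbers from the last one, then rescales them to picoseconds and picojoules. The same steps can be sketched against any input stream, which makes the parsing easy to exercise without running a simulator. The function name read_measure is hypothetical, and the three-header-lines layout is taken from the loop above rather than from any file format documentation.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Reads the fourth line of a measure stream and parses "<delay> <energy>"
// (seconds, joules), converting to picoseconds and picojoules.
// Returns false when the stream is too short or the line does not parse.
bool read_measure(std::istream& in, double& delay_ps, double& energy_pj)
{
    std::string line;
    for (int i = 0; i < 4; ++i) {
        if (!std::getline(in, line))
            return false;          // fewer than four lines
    }
    std::istringstream fields(line);
    double delay_s = 0.0;
    double energy_j = 0.0;
    if (!(fields >> delay_s >> energy_j))
        return false;              // fourth line is not two numbers
    delay_ps = delay_s * 1e12;
    energy_pj = energy_j * 1e12;
    return true;
}
```

Using std::string with std::getline avoids the fixed-size line buffer, and returning a bool keeps the two failure cases (short file, malformed line) explicit for the caller.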

Hspice.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "hspice.h"

//////////////////////////////////////////////////////////////////////////////
// //
// DELAY MODULE HSPICE //
// //
// 1998 October 9 Politecnico di Torino VLSI LAB //
// //
// Mariagrazia Graziano Ph.D. Student //
// //
////////////////////////////////////////////////////////////////////////////////
Hspice::Hspice( const CritPathList& pathlist, const Options& options, const char* NE )
:
EvaluationAlgorithm( pathlist, options ), SimTime( 0.0 ),
WorkPath( 0 ), SimFile( 0 ),
InputFile( 0 ), NetlistFile( 0 ), SuffixFileMeasure( 0 )
{
print_log( "Creating Hspice instance..." );
WorkPath = new char[ strlen( options.Workpath() ) + 1 ];
SimFile = new char[ strlen( "net2use" ) + 1 ];
InputFile = new char[ strlen( "inputs" ) + 1 ];
NetlistFile = new char[ strlen( NE ) + strlen( NetListSuffix ) + 1 ];
if ( !WorkPath || !SimFile || !InputFile || !NetlistFile )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );

error( NO_MEM, errno, "HEY! " );
}
strcpy( WorkPath, options.Workpath() );
strcpy( SimFile, "net2use" );
strcpy( InputFile, "inputs" );
strcpy( NetlistFile, NE );
strcat( NetlistFile, NetListSuffix );
SuffixFileMeasure = new char[ 5 ];
if ( !SuffixFileMeasure )
{
print_log( "FATAL ERROR:" );
print_log( ReturnMessage[ NO_MEM ] );
error( NO_MEM, errno, "HEY! " );
}
strcpy( SuffixFileMeasure, ".mt0" );
}
///
Hspice::~Hspice()
{
delete[] WorkPath;
delete[] SimFile;
delete[] InputFile;
delete[] NetlistFile;
delete[] SuffixFileMeasure;
}
///
int Hspice::Run( const Circuit& circuit, const double *NewWidth, const unsigned *ValidPath )
{
SimTime = options.GetSimOption( SIMTIME );
Calls++;
for ( unsigned int NP = 0; NP < NumPath; NP++ )
{
if (ValidPath[NP])
{
int RetCode;
RetCode = BasicNetlist( NewWidth, NP, circuit );
if ( RetCode != OK )
return RetCode;
RetCode = SetInput( NP, circuit.Valim() );
if ( RetCode != OK )
return RetCode;
RetCode = SimCall( NP );
if ( RetCode != OK )
return RetCode;
double OneDelay, OneEnergy;
RetCode = DelayRead( OneDelay, OneEnergy, NP );
if ( RetCode != OK )
return RetCode;
CPDelay[ NP ] = OneDelay;
CPPower[ NP ] = OneEnergy;
CPNoise[ NP ] = 0.0;
}
}
Area = CalcArea( NewWidth, circuit.GetNTran() );
return OK;
}

HspiceArea.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "hspice.h"

///
double Hspice::CalcArea( const double *NewWidth, unsigned int NT )
{
    double A = 0.0;
    for ( unsigned int i = 0; i < NT; i++ )
    {
        A += NewWidth[ i ];
    }
    return ( A );
}

Setinput.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "hspice.h"

///
int Hspice::SetInput( unsigned int Np, double Val )
{
char suffix[ 8 ];
sprintf( suffix, ".%u", Np );
char *Inputs = new char[ strlen( WorkPath ) + strlen( InputFile ) + strlen( suffix ) + 1 ];
if ( !Inputs )
return NO_MEM;
strcpy( Inputs, WorkPath );
strcat( Inputs, InputFile );
strcat( Inputs, suffix );
ofstream input_file( Inputs );
if ( !input_file )
return NOT_FOUND;
unsigned int node_in = pathlist[ Np ].GetNodeIn();
unsigned int node_out = pathlist[ Np ].GetNodeOut();
TransitionType TIn = pathlist[ Np ].GetTransitionIn();
if ( TIn == RISE )
input_file << endl << "v_node_in " << node_in << \
" 0 " << " pwl(0 0 " << ( SimTime / 2.0 ) << \
"p 0 " << ( SimTime / 2.0 ) + pathlist[ Np ].GetInTime() << "p " << Val << ")";
else if ( TIn == FALL )
input_file << endl << "v_node_in " << node_in << \
" 0 " << " pwl(0 " << Val << " " << ( SimTime / 2.0 ) << \
"p " << Val << " " << ( SimTime / 2.0 ) + \
pathlist[ Np ].GetInTime() << "p 0)" ;
unsigned int node;
double nodeVal;
while ( pathlist[ Np ].TraverseActiveInputs( node, nodeVal ) )
{
input_file << endl << "v_ACTIVE_" << node << " " << \
node << " 0 dc " << nodeVal;
}
while ( pathlist[ Np ].TraverseNoActiveInputs( node, nodeVal ) )
{
input_file << endl << "v_NO_ACTIVE_" << node << " " << \
node << " 0 dc " << nodeVal;
}
while ( pathlist[ Np ].TraverseInitialConditions( node, nodeVal ) )
{
input_file << endl << endl << ".ic v(" << node << ")=" << \
nodeVal;
}
TransitionType TOut = pathlist[ Np ].GetTransitionOut();
if ( TOut == RISE )
input_file << endl << endl << ".ic v(" << node_out << ")=0";
else if ( TOut == FALL )
input_file << endl << endl << ".ic v(" << node_out << ")=" << Val;
// Delay meas.
input_file << endl << endl << ".measure tran path_n0_" << Np << "delay " << \
" trig v(" << node_in << ")" << " val=" << Val*0.5 << " " << TransitionString[ TIn ] << "=1" << \
" targ v(" << node_out << ")" << " val=" << Val * 0.5 << " " << TransitionString[ TOut ] << "=1";
// Power meas.
input_file << endl << endl << ".measure tran path_n0_" << Np << "power " << \
" integ " << "POWER" << " from=0ps" << " to=" << SimTime << "ps";
input_file << endl << endl << ".tran 10p " << SimTime << "p" << endl;
input_file.close();
return OK;
}
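
The stimulus written by SetInput is a piecewise-linear source that holds the input at its initial level for half of the simulation window and then ramps to the final level. The following is a minimal stand-alone sketch of that string construction, not the author's code: the helper name make_pwl_source and its parameters are hypothetical, and times are assumed to be expressed in picoseconds, as the "p" suffix in the listing suggests.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of the original sources): builds the
// piecewise-linear voltage source line for a rising or falling input.
// Assumption: all times are given in picoseconds.
std::string make_pwl_source(unsigned int node, bool rising, double vdd,
                            double sim_time_ps, double ramp_ps)
{
    const double t_start = sim_time_ps / 2.0;   // input is held until mid-window
    const double t_end = t_start + ramp_ps;     // end of the input ramp
    const double v_before = rising ? 0.0 : vdd; // level before the transition
    const double v_after = rising ? vdd : 0.0;  // level after the transition

    std::ostringstream out;
    out << "v_node_in " << node << " 0 pwl(0 " << v_before << " "
        << t_start << "p " << v_before << " "
        << t_end << "p " << v_after << ")";
    return out.str();
}
```

Under these assumptions, a rising input on node 3 with a 1000 ps window, a 100 ps ramp and a 3.3 V swing yields a source that stays at 0 V until 500 ps and reaches 3.3 V at 600 ps.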

Simcall.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "hspice.h"

///
int Hspice::SimCall( unsigned int Np )
{
char system_string[ 512 ];
sprintf( system_string, "cd %s && hspice %s_%u 1>./hspice.log.%u 2>&1", WorkPath, SimFile, Np, Np );
if ( system( system_string ) == -1 )
{
print_log( "ERROR invoking hspice simulator " );
return NO_MEM;
}
return OK;
}
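
SimCall assembles a shell command in a fixed 512-byte buffer, which can overflow when the working path is long. The following sketch shows one way to build the same command without a fixed-size buffer. The helper name build_sim_command and its parameters are hypothetical and are not part of the original sources. Only the command layout is taken from the listing above.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (an assumption, not the author's code): returns the
// shell command that changes to the working directory and runs the simulator
// on the deck of path number path_index, redirecting all output to a log file.
std::string build_sim_command(const std::string& work_path,
                              const std::string& sim_file,
                              unsigned int path_index)
{
    std::ostringstream cmd;
    cmd << "cd " << work_path
        << " && hspice " << sim_file << "_" << path_index
        << " 1>./hspice.log." << path_index << " 2>&1";
    return cmd.str();
}
```

The returned string can be passed to system() through its c_str() member. Because the stream grows as needed, no length check on the path is required.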

Brackets.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

const double FACTOR = 1.6;
const int NTRY = 50;
const double ZEPS = 1e-2;
///
int Fast::Brackets( const Circuit& circuit, unsigned int NP, unsigned int NC, double& start, double& end, TransistorType type, unsigned
{
int jj;
double f1, f2, x1, x2;
if ( start == end )
{
x1 = x2 = 0;
RetCode = NOT_FOUND;
return 0;
}
if ( type == NMOS )
{
f1 = EqN( circuit, NP, NC, start, RetCode, j, n, p, NewWidth );
f2 = EqN( circuit, NP, NC, end, RetCode, j, n, p, NewWidth );
}
else if ( type == PMOS )
{
f1 = EqP( circuit, NP, NC, start, RetCode, j, n, p, NewWidth );
f2 = EqP( circuit, NP, NC, end, RetCode, j, n, p, NewWidth );
}
for ( jj = 1; jj <= NTRY; jj++ )
{
if ( f1 * f2 < 0.0 )
{
RetCode = OK;
return 1;
}
if ( fabs ( f1 ) < fabs ( f2 ) )
{
start += FACTOR * ( start - end );
if ( start <= 0.0 )
start = ZEPS;
if ( type == NMOS )
f1 = EqN( circuit, NP, NC, start, RetCode, j, n, p, NewWidth );
else if ( type == PMOS )
f1 = EqP( circuit, NP, NC, start, RetCode, j, n, p, NewWidth );
}
else
{
end += FACTOR * ( end - start );
if ( end <= 0.0 )
end = ZEPS;
if ( type == NMOS )
f2 = EqN( circuit, NP, NC, end, RetCode, j, n, p, NewWidth );
else if ( type == PMOS )
f2 = EqP( circuit, NP, NC, end, RetCode, j , n, p, NewWidth );
}
}
RetCode = NOT_FOUND;
return 0;
}
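
The interval-expansion loop in Brackets can be isolated from the circuit equations. The following is a minimal sketch of the same idea for a generic scalar function. The name expand_bracket and its signature are hypothetical, and the default values simply mirror FACTOR, NTRY and ZEPS above. The endpoint whose function value is smaller in magnitude is pushed outward until the two values differ in sign or the attempts run out.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Hypothetical stand-alone version of the bracketing loop (an illustration,
// not the original code). Returns true when f(lo) and f(hi) have opposite
// signs, so that [lo, hi] brackets a root of f.
bool expand_bracket(const std::function<double(double)>& f,
                    double& lo, double& hi,
                    double factor = 1.6, int max_tries = 50,
                    double floor_value = 1e-2)
{
    assert(factor > 0.0);
    if (lo == hi)
        return false;                 // a degenerate interval cannot be expanded

    double f_lo = f(lo);
    double f_hi = f(hi);
    for (int k = 0; k < max_tries; ++k) {
        if (f_lo * f_hi < 0.0)
            return true;              // sign change found
        if (std::fabs(f_lo) < std::fabs(f_hi)) {
            lo += factor * (lo - hi); // move the lower end away from hi
            if (lo <= 0.0)
                lo = floor_value;     // widths must stay strictly positive
            f_lo = f(lo);
        } else {
            hi += factor * (hi - lo); // move the upper end away from lo
            if (hi <= 0.0)
                hi = floor_value;
            f_hi = f(hi);
        }
    }
    return false;                     // no sign change within max_tries
}
```

The clamp to a small positive value reflects the fact that the unknown is a transistor width, so the search never leaves the positive axis.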

CalcpowN.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
int Fast::CalcPowerN( const Circuit& circuit, unsigned int NP, unsigned int NC, double& Ecc, double& Esc, unsigned int n, unsigned int p, const double*
{
double H_1, I_1, Vc, t0_bs, C_n;
double J_1, K_1, M_1, N_1;
double t0 = t0_n[ n ];
double tauo = tauo_n[ n ];
double tin = taui_n[ 1 ];
t0_bs = tin * ( VDD + TECH.Vtp0 ) / VDD;
double tc = (VDD * tauo_n[n] * (t0_n[n] - taui_n[n]) + \
Vs_n[n] * taui_n[n] * (tauo_n[n] - t0_n[n])) / \
(VDD * (t0_n[n] - taui_n[n]) + Vs_n[n] * (tauo_n[n] - t0_n[n]));
Esc = 0;
Ecc = 0;
if (p > 0)
{
Vc = TECH.Ec_p * L_p[ p ];
H_1 = Vc * beta_p[ p ] * ( VDD * ( t0 - tauo ) * \
( 2 * Vc * ( t0 - tauo ) - Vd_n[ n ] * tin ) - \
Vd_n[ n ] * tin * ( Vc * ( t0 - tauo ) - Vd_n[ n ] * tauo + 2 * TECH.Vtp0 * ( t0 - tauo ) ) ) / \
( 2 * Vd_n[ n ] * tin * ( t0 - tauo ) );
I_1 = Vc * beta_p[ p ] * ( 2 * VDD * ( t0 - tauo ) - Vd_n[ n ] * tin ) / \
( 2 * tin * ( t0 - tauo ) );
J_1 = ( Vc * Vc ) * beta_p[ p ] * ( t0 - tauo ) * ( 2 * ( VDD * VDD ) * ( t0 - tauo ) + \
2 * VDD * ( Vc * ( t0 - tauo ) + Vd_n[ n ] * ( tauo - tin ) ) - \
Vd_n[ n ] * tin * ( Vc + 2 * TECH.Vtp0 ) );
K_1 = 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + \
Vc * ( t0 - tauo ) + Vd_n[ n ] * tauo );
M_1 = VDD * Vc * beta_p[ p ] * ( VDD + 2 * TECH.Vtp0 ) / ( 2 * ( VDD + Vc ) );
N_1 = VDD * VDD * Vc * beta_p[ p ] / ( tin * ( VDD + Vc ) );
if ( t0_bs < tauo )
{
Esc = ( J_1 * ( K_1 - 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + Vd_n[ n ] * tauo ) ) * \
0.25 * LOG ( 2 * Vd_n[ n ] * Vd_n[ n ] * t0 * tin - K_1 ) / ( Vd_n[ n ] * Vd_n[ n ] * Vd_n[ n ] * \
( tin * tin ) * ( tauo - t0 ) ) + \
J_1 * ( K_1 - 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + Vd_n[ n ] * tauo ) ) * \
0.25 * LOG ( 2 * Vd_n[ n ] * Vd_n[ n ] * t0_bs * tin - K_1 ) / \
( Vd_n[ n ] * Vd_n[ n ] * Vd_n[ n ] * ( tin * tin ) * ( t0 - tauo ) ) + \
( ( t0 - t0_bs ) * \
( 3 * H_1 * Vd_n[ n ] * tin * ( 2 * VDD * ( t0 - tauo ) - Vd_n[ n ] * ( t0 + t0_bs - 2 * tauo ) ) + \
I_1 * Vd_n[ n ] * tin * ( 3 * VDD * ( t0 + t0_bs ) * ( t0 - tauo ) - Vd_n[ n ] * ( 2 * ( t0 * t0 ) + \
t0 * ( 2 * t0_bs - 3 * tauo ) + t0_bs * ( 2 * t0_bs - 3 * tauo ) ) ) - 3 * J_1 ) ) / \
( 6 * Vd_n[ n ] * tin * ( t0 - tauo ) ) );
}
else
{
Esc = ( J_1 * ( K_1 - 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + Vd_n[ n ] * tauo ) ) * \
0.25 * LOG ( 2 * Vd_n[ n ] * Vd_n[ n ] * t0 * tin - K_1 ) / ( Vd_n[ n ] * Vd_n[ n ] * Vd_n[ n ] * \
( tin * tin ) * ( tauo - t0 ) ) + \
J_1 * ( K_1 - 2 * Vd_n[ n ] * tin * ( VDD * ( t0 - tauo ) + Vd_n[ n ] * tauo ) ) * \
0.25 * LOG ( 2 * Vd_n[ n ] * Vd_n[ n ] * tauo * tin - K_1 ) / \
( Vd_n[ n ] * Vd_n[ n ] * Vd_n[ n ] * ( tin * tin ) * ( t0 - tauo ) ) + \
( 3 * H_1 * Vd_n[ n ] * tin * ( t0 - tauo ) * ( 2 * VDD - Vd_n[ n ] ) + \
I_1 * Vd_n[ n ] * tin * ( t0 - tauo ) * \
( 3 * VDD * ( t0 + tauo ) - Vd_n[ n ] * ( 2 * t0 + tauo ) ) - 3 * J_1 ) / \
( 6 * Vd_n[ n ] * tin ) );
}
Esc = fabs( Esc );
}
const char* name;
unsigned int node = 0;

double Wjn, Wgn, Wjp, Wgp;
int njn, ngn, njp, ngp;
for ( unsigned int i = 1; i <= n; i++ )
{
C_n = 0.0;
name = pathlist[ NP ].TransistorName( i - 1, NC );
if ( circuit[ name ].Source() == node )
{
node = circuit[ name ].Drain();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
else if ( circuit[ name ].Drain() == node )
{
node = circuit[ name ].Source();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
int nc;
// Cj N
C_n += TECH.C_nj * Wjn * TECH.Df * \
( Vd_n[ i ] * Vd_n[ i ] * ( TECH.mj_n - 1 ) * ( TECH.mj_n - 1 ) + \
Vd_n[ i ] * TECH.mj_n * TECH.PB_n + TECH.PB_n * TECH.PB_n ) * \
pow ( ( 1 + Vd_n[ i ] / TECH.PB_n ), -TECH.mj_n ) / \
( ( TECH.mj_n - 2 ) * ( TECH.mj_n - 1 ) );
C_n += TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * \
( Vd_n[ i ] * Vd_n[ i ] * ( TECH.mjsw_n - 1 ) * ( TECH.mjsw_n - 1 ) + \
Vd_n[ i ] * TECH.mjsw_n * TECH.PB_n + TECH.PB_n * TECH.PB_n ) * \
pow ( 1 + Vd_n[ i ] / TECH.PB_n, -TECH.mjsw_n ) / \
( ( TECH.mjsw_n - 2 ) * ( TECH.mjsw_n - 1 ) ) + \
TECH.PB_n * TECH.PB_n * ( TECH.C_nj * Wjn * TECH.Df * \
( TECH.mjsw_n - 2 ) * ( TECH.mjsw_n - 1 ) + \
TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * \
( TECH.mj_n - 2 ) * ( TECH.mj_n - 1 ) ) / \
( ( 2 - TECH.mjsw_n ) * ( TECH.mj_n - 2 ) * ( TECH.mj_n - 1 ) * ( TECH.mjsw_n - 1 ) );
// Cj P
if (p > 0)
{
double x = TECH.mj_p - 1;
double y = TECH.mjsw_p - 1;
C_n += ( TECH.C_pj * Wjp * TECH.Df * \
( VDD * VDD * x + VDD * TECH.mj_p * \
(TECH.PB_p - Vd_n[ i ] * x) + Vd_n[ i ] * Vd_n[ i ] * x * x - \
Vd_n[ i ] * TECH.mj_p * TECH.PB_p + TECH.PB_p * TECH.PB_p ) * \
pow( ( VDD - Vd_n[ i ] + TECH.PB_p ) / TECH.PB_p, -TECH.mj_p ) ) / \
( ( x - 1 ) * x );
C_n += ( TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
( VDD * VDD * y + VDD * TECH.mjsw_p * \
(TECH.PB_p - Vd_n[ i ] * y) + Vd_n[ i ] * Vd_n[ i ] * y * y - \
Vd_n[ i ] * TECH.mjsw_p * TECH.PB_p + TECH.PB_p * TECH.PB_p ) * \
pow( ( VDD - Vd_n[ i ] + TECH.PB_p ) / TECH.PB_p, -TECH.mjsw_p ) ) / \
( ( y - 1 ) * y );
C_n += ( TECH.C_pj * Wjp * TECH.Df * \
( VDD * VDD * x + VDD * TECH.mj_p * TECH.PB_p + \
TECH.PB_p * TECH.PB_p ) * \
pow( ( VDD + TECH.PB_p ) / TECH.PB_p, -TECH.mj_p ) ) / \
( ( 1 - x ) * x );
C_n += ( TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
( VDD * VDD * y + VDD * TECH.mjsw_p * TECH.PB_p + \
TECH.PB_p * TECH.PB_p ) * \
pow( ( VDD + TECH.PB_p ) / TECH.PB_p, -TECH.mjsw_p ) ) / \
( ( 1 - y ) * y );
}
C_n += circuit.CapStaticGnd( node, nc ) * Vd_n[i] * Vd_n[i] * 0.5;
C_n += circuit.CapStaticVdd( node, nc ) * Vd_n[i] * Vd_n[i] * 0.5;

C_n += Wgn * TECH.Lmin * TECH.Cox_n * Vd_n[i] * Vd_n[i] * 0.5;
C_n += Wgp * TECH.Lmin * TECH.Cox_p * Vd_n[i] * Vd_n[i] * 0.5;
if ( i < n )
C_n += TECH.Cgs0_n * ( ( Wjn - W_n[ i ] - W_n[ i + 1 ] ) + \
( njn - 2 ) * TECH.XW_n ) * 0.5 * Vd_n[i] * Vd_n[i];
else
C_n += TECH.Cgs0_n * ( ( Wjn - W_n[ i ] ) + \
( njn - 1 ) * TECH.XW_n ) * 0.5 * Vd_n[i] * Vd_n[i];
C_n += TECH.Cgs0_p * ( Wjp + njp * TECH.XW_p ) * Vd_n[i] * Vd_n[i] * 0.5;
// Cgd
if ( (( i == 1 ) && ( i < n - 1 )) || ((i == 1) && (n == 1)) )
{
double Cov = TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n );
double Cg = Cov + 0.5 * TECH.Cox_n * W_n[ i ] * L_n[ i ];
int Op, SOp;
Op = Calct0ts1N( circuit, NP, SOp, NewWidth );
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
C_n += Cg * Vd_n[i] * (Vd_n[i] * taui_n[i] * \
(ts_n[i] - tauo_n[1]) * (ts_n[i] - tauo_n[1]) - \
VDD * (t0_n[i] - tauo_n[1]) * \
((ts_n[i] * ts_n[i]) - 2 * ts_n[i] * tauo_n[1]-taui_n[i] * (taui_n[i] - 2 * tauo_n[1]))) / \
(2 * taui_n[i] * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
C_n += Cov * Vd_n[i] * ((t0_n[i] - ts_n[i]) * (t0_n[i] + ts_n[i] - 2 * tauo_n[1])) * \
(Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
(2 * taui_n[i] * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
break;
case _AA_:
C_n += Cg * Vd_n[i] * (ts_n[i] - tauo_n[1]) * (ts_n[i] - tauo_n[1])*\
(Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
(2 * taui_n[i] * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
C_n += Cov * Vd_n[i] * (Vd_n[i] * taui_n[i]*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1])) - \
VDD * (t0_n[i] - tauo_n[1])*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-taui_n[i] * (taui_n[i] - 2 * tauo_n[1]))) / \
(2 * taui_n[i] * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
break;
case _B_:
C_n += Cg * Vd_n[i] * (VDD * ((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-\
taui_n[i] * (taui_n[i] - 2 * tauo_n[1])) + \
Vd_n[i] * taui_n[i] * (tauo_n[1]-t0_n[i])) / \
(2 * taui_n[i] * (tauo_n[1]-t0_n[i]));
break;
case _C_:
C_n += Cg * Vd_n[i] * Vd_n[i] * (ts_n[i] - tauo_n[1]) * \
(ts_n[i] - tauo_n[1]) / \
(2 * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
C_n += Cov * Vd_n[i] * Vd_n[i]*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1])) / \
(2 * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
break;
case _D_:
C_n += Cg * Vd_n[i] * Vd_n[i] / 2;
break;
case _F_:
C_n += Cg * Vd_n[i] * (ts_n[i] - tauo_n[1]) * (ts_n[i] - tauo_n[1])*\
(Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
(2 * taui_n[i] * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
C_n += Cov * Vd_n[i]*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1]))*\
(Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
(2 * taui_n[i] * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
break;
case _G_:

C_n += Cg * Vd_n[i] * (Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
(2 * taui_n[i]);
break;
case _E_:
default:
C_n += 0.0;
break;
}
}
else if ( ( i < n - 1 ) && ( i > 1 ) )
{
C_n += ( TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ i ] * TECH.Lmin ) * \
Vd_n[ i ] * Vd_n[ i ] * 0.5;
C_n += ( TECH.Cgs0_n * ( W_n[ i + 1 ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ i + 1 ] * TECH.Lmin ) * \
Vd_n[ i ] * Vd_n[ i ] * 0.5;
}
else if ( (i == 1) && (i == n - 1) )
{
double Cov = TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n );
double Cg = Cov + 0.5 * TECH.Cox_n * W_n[ i ] * L_n[ i ];
int Op, SOp;
Op = Calct0ts1N( circuit, NP, SOp, NewWidth );
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
C_n += Cg * Vd_n[i] * (Vd_n[i] * taui_n[i] * \
(ts_n[i] - tauo_n[1]) * (ts_n[i] - tauo_n[1]) - \
VDD * (t0_n[i] - tauo_n[1]) * \
((ts_n[i] * ts_n[i]) - 2 * ts_n[i] * tauo_n[1]-taui_n[i] * (taui_n[i] - 2 * tauo_n[1]))) / \
(2 * taui_n[i] * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
C_n += Cov * Vd_n[i] * ((t0_n[i] - ts_n[i]) * (t0_n[i] + ts_n[i] - 2 * tauo_n[1])) * \
(Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
(2 * taui_n[i] * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
break;
case _AA_:
C_n += Cg * Vd_n[i] * (ts_n[i] - tauo_n[1]) * (ts_n[i] - tauo_n[1])*\
(Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
(2 * taui_n[i] * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
C_n += Cov * Vd_n[i] * (Vd_n[i] * taui_n[i]*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1])) - \
VDD * (t0_n[i] - tauo_n[1])*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-taui_n[i] * (taui_n[i] - 2 * tauo_n[1]))) / \
(2 * taui_n[i] * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
break;
case _B_:
C_n += Cg * Vd_n[i] * (VDD * ((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-\
taui_n[i] * (taui_n[i] - 2 * tauo_n[1])) + \
Vd_n[i] * taui_n[i] * (tauo_n[1]-t0_n[i])) / \
(2 * taui_n[i] * (tauo_n[1]-t0_n[i]));
break;
case _C_:
C_n += Cg * Vd_n[i] * Vd_n[i] * (ts_n[i] - tauo_n[1]) * \
(ts_n[i] - tauo_n[1]) / \
(2 * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
C_n += Cov * Vd_n[i] * Vd_n[i]*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1])) / \
(2 * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
break;
case _D_:
C_n += Cg * Vd_n[i] * Vd_n[i] / 2;
break;
case _F_:
C_n += Cg * Vd_n[i] * (ts_n[i] - tauo_n[1]) * (ts_n[i] - tauo_n[1])*\
(Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
(2 * taui_n[i] * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));

C_n += Cov * Vd_n[i]*\
((t0_n[i] * t0_n[i]) - 2 * t0_n[i] * tauo_n[1]-ts_n[i] * (ts_n[i] - 2 * tauo_n[1]))*\
(Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
(2 * taui_n[i] * (t0_n[i] - tauo_n[1]) * (t0_n[i] - tauo_n[1]));
break;
case _G_:
C_n += Cg * Vd_n[i] * (Vd_n[i] * taui_n[i] - VDD * (t0_n[i] - tauo_n[1])) / \
(2 * taui_n[i]);
break;
case _E_:
default:
C_n += 0.0;
break;
}
Cov = ( ( TECH.Cgs0_n * W_n[ n ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
Cg = ( ( TECH.Cgs0_n * W_n[ n ] + TECH.XW_n ) + 0.666666 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
Op = Calct0tsnN( n, SOp);
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
C_n += Cov * Vs_n[n] * Vs_n[n] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * taui_n[n] - \
ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
(2 * (t0_n[i] - taui_n[n]) * (t0_n[i] - taui_n[n]));
C_n += Cg * Vs_n[n] * Vs_n[n] * (ts_n[n] - taui_n[n]) * (ts_n[n] - taui_n[n]) * \
(2 * (t0_n[i] - taui_n[n]) * (t0_n[i] - taui_n[n]));
break;
case _B_:
C_n += Cov * Vs_n[n] * Vs_n[n] * 0.5;
break;
case _C_:
C_n += -Cg * Vs_n[n] * Vs_n[n] * (tc * tc - 2 * tc * taui_n[n] - \
ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
(2 * (t0_n[i] - taui_n[n]) * (t0_n[i] - taui_n[n]));
C_n += Cov * Vs_n[n] * Vs_n[n] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * taui_n[i] - \
ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
(2 * (t0_n[i] - taui_n[n]) * (t0 - taui_n[n]));
break;
case _D_:
C_n += Cov * Vs_n[n] * Vs_n[n] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * taui_n[i] - \
tc * (tc - 2 * taui_n[n])) / \
(2 * (t0_n[i] - taui_n[n]) * (t0_n[i] - taui_n[n]));
break;
case _E_:
default:
break;
}
}
else if ( (i == n - 1) && (n > 2))
{
C_n += ( TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ i ] * TECH.Lmin ) * \
Vd_n[ i ] * ( 2 * VDD - Vd_n[ i ] ) * 0.5;
int Op, SOp;
double Cov = ( ( TECH.Cgs0_n * W_n[ n ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
double Cg = ( ( TECH.Cgs0_n * W_n[ n ] + TECH.XW_n ) + 0.666666 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
Op = Calct0tsnN( n, SOp);
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
C_n += Cov * Vs_n[n] * Vs_n[n] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * taui_n[n] - \
ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
(2 * (t0_n[i] - taui_n[n]) * (t0_n[i] - taui_n[n]));

C_n += Cg * Vs_n[n] * Vs_n[n] * (ts_n[n] - taui_n[n]) * (ts_n[n] - taui_n[n]) * \
(2 * (t0_n[i] - taui_n[n]) * (t0_n[i] - taui_n[n]));
break;
case _B_:
C_n += Cov * Vs_n[n] * Vs_n[n] * 0.5;
break;
case _C_:
C_n += -Cg * Vs_n[n] * Vs_n[n] * (tc * tc - 2 * tc * taui_n[n] - \
ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
(2 * (t0_n[i] - taui_n[n]) * (t0_n[i] - taui_n[n]));
C_n += Cov * Vs_n[n] * Vs_n[n] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * taui_n[i] - \
ts_n[n] * (ts_n[n] - 2 * taui_n[n])) / \
(2 * (t0_n[i] - taui_n[n]) * (t0 - taui_n[n]));
break;
case _D_:
C_n += Cov * Vs_n[n] * Vs_n[n] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * taui_n[i] - \
tc * (tc - 2 * taui_n[n])) / \
(2 * (t0_n[i] - taui_n[n]) * (t0_n[i] - taui_n[n]));
break;
case _E_:
default:
break;
}
}
else if ( i == n )
{
double Cov = TECH.Cgd0_n * ( W_n[ n ] + TECH.XW_n );
double Cg = Cov + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ];
int Op, SOp;
Op = Calct0tsnN(n, SOp);
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
case _B_:
C_n += Cov * \
Vd_n[i] * Vd_n[i] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * tauo_n[i] - \
ts_n[i] * (ts_n[i] - 2 * tauo_n[i])) / \
(2 * (t0_n[i] - tauo_n[i]) * (t0_n[i] - tauo_n[i]));
C_n += Cg * \
Vd_n[i] * Vd_n[i] * (ts_n[i] - tauo_n[i]) * (ts_n[i] - tauo_n[i]) / \
(2 * (t0_n[i] - tauo_n[i]) * (t0_n[i] - tauo_n[i]));
break;
case _C_:
C_n += Cov * \
Vd_n[i] * Vd_n[i] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * tauo_n[i] - \
ts_n[i] * (ts_n[i] - 2 * tauo_n[i])) / \
(2 * (t0_n[i] - tauo_n[i]) * (t0_n[i] - tauo_n[i]));
C_n += -Cg * \
Vd_n[i] * Vd_n[i] * (tc * tc - 2 * tc * tauo_n[i] - \
ts_n[i] * (ts_n[i] - 2 * tauo_n[i])) / \
(2 * (t0_n[i] - tauo_n[i]) * (t0_n[i] - tauo_n[i]));
break;
case _D_:
C_n += Cov * \
Vd_n[i] * Vd_n[i] * (t0_n[i] * t0_n[i] - 2 * t0_n[i] * tauo_n[i] - \
tc * (tc - 2 * tauo_n[i])) / \
(2 * (t0_n[i] - tauo_n[i]) * (t0_n[i] - tauo_n[i]));
break;
case _E_:
default:
break;
}
}

if ( C_n < 0.0 )
C_n *= -1;
Ecc += C_n;
}
return OK;
}

CalcpowP.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
int Fast::CalcPowerP( const Circuit& circuit, unsigned int NP, unsigned int NC, double& Ecc, double& Esc, unsigned int n, unsigned int p, const double*
{
double H_1, I_1, Vc, t0_bs, C_p;
double J_1, K_1, M_1, N_1, O_1, P_1;
double t0 = t0_p[ p ];
double tauo = tauo_p[ p ];
double tin = taui_p[ 1 ];
t0_bs = tin * ( VDD - TECH.Vtn0 ) / VDD;
double tc = (VDD * tauo_p[p] * (t0_p[p] - taui_p[p]) + \
Vs_p[p] * taui_p[p] * (tauo_p[p] - t0_p[p])) / \
(VDD * (t0_p[p] - taui_p[p]) + Vs_p[p] * (tauo_p[p] - t0_p[p]));
Esc = 0;
Ecc = 0;
if (n > 0)
{
Vc = TECH.Ec_n * L_n[ n ];
H_1 = Vc * beta_n[ n ] * ( ( VDD * VDD ) * tin * ( t0 - 2 * tauo ) - \
VDD * ( Vc * ( t0 - tauo ) * ( 2 * t0 - tin - 2 * tauo ) + \
tin * ( Vd_p[ p ] * ( t0 - 3 * tauo ) + 2 * TECH.Vtn0 * ( t0 - tauo ) ) ) - \
Vd_p[ p ] * tin * ( Vc * ( t0 - tauo ) + Vd_p[ p ] * tauo + 2 * TECH.Vtn0 * ( tauo - t0 ) ) ) / \
( 2 * tin * ( VDD - Vd_p[ p ] ) * ( t0 - tauo ) );
I_1 = Vc * beta_n[ n ] * ( VDD * ( 2 * t0 - tin - 2 * tauo ) + \
Vd_p[ p ] * tin ) / ( 2 * tin * ( tauo - t0 ) );
J_1 = ( Vc * Vc ) * beta_n[ n ] * ( tauo - t0 ) * \
( 2 * ( VDD * VDD ) * ( t0 - tin ) + VDD * ( Vc * ( 2 * t0 - tin - 2 * tauo ) + \
2 * ( Vd_p[ p ] * ( tin - tauo ) + TECH.Vtn0 * tin ) ) + Vd_p[ p ] * tin * ( Vc - 2 * TECH.Vtn0 ) );
K_1 = 2 * tin * ( VDD - Vd_p[ p ] ) * ( VDD * t0 + Vc * ( t0 - tauo ) - \
Vd_p[ p ] * tauo );
M_1 = Vc * beta_n[ n ] * ( VDD * t0 + Vc * ( tauo - t0 ) - Vd_p[ p ] * tauo + 2 * TECH.Vtn0 * ( t0 - tauo ) ) / ( 2 * ( tauo - t0 ) );
N_1 = Vc * beta_n[ n ] * ( VDD - Vd_p[ p ] ) / ( 2 * ( t0 - tauo ) );
O_1 = ( Vc * Vc ) * beta_n[ n ] * ( Vc - 2 * TECH.Vtn0 ) * ( t0 - tauo );
P_1 = 2 * ( VDD * t0 + Vc * ( t0 - tauo ) - Vd_p[ p ] * tauo );
if ( t0_bs < tauo )
{
Esc = ( VDD * ( 3 * J_1 * ( K_1 - 2 * tin * t0 * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \
LOG ( 2 * t0 * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) - K_1 ) + \
3 * J_1 * ( 2 * tin * t0 * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) - K_1 ) * \
LOG ( 2 * t0_bs * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) - K_1 ) + \
2 * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \
( t0_bs - t0 ) * ( 3 * H_1 * tin * ( VDD - Vd_p[ p ] ) * \
( VDD - Vd_p[ p ] ) * ( t0 - t0_bs ) + \
I_1 * tin * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) * \
( t0 * t0 + t0 * t0_bs - 2 * t0_bs * t0_bs ) - 3 * J_1 ) ) / \
( 12 * ( tin * tin ) * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \
( t0 - tauo ) ) );

}
else
{
Esc = ( ( 3 * J_1 * ( K_1 + 2 * tin * ( VDD - Vd_p[ p ] ) * ( Vd_p[ p ] * tauo - VDD * t0 ) ) * \

LOG ( 2 * t0 * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) - K_1 ) - \
3 * J_1 * ( K_1 + 2 * tin * ( VDD - Vd_p[ p ] ) * \
( Vd_p[ p ] * tauo - VDD * t0 ) ) * \
LOG ( 2 * tauo * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) - K_1 ) + \
2 * tin * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \
( tauo - t0 ) * ( 3 * H_1 * tin * ( VDD - Vd_p[ p ] ) * \
( VDD + Vd_p[ p ] ) * ( t0 - tauo ) + \
I_1 * tin * ( VDD - Vd_p[ p ] ) * ( t0 - tauo ) * \
( VDD * ( t0 + 2 * tauo ) + Vd_p[ p ] * ( 2 * t0 + tauo ) ) - 3 * J_1 ) ) / \
( 12 * ( tin * tin ) * ( ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) * ( VDD - Vd_p[ p ] ) ) * \
( t0 - tauo ) ) );
}
Esc = fabs( Esc );
}
const char* name;
unsigned int node = circuit.ValimNode();
int njn, ngn, njp, ngp;
double Wjn, Wgn, Wjp, Wgp;
for ( unsigned int i = 1; i <= p; i++ )
{
C_p = 0.0;
// first there are nmos
name = pathlist[ NP ].TransistorName( n + p - i, NC );
// then there are the pmos, in REVERSE order
if ( circuit[ name ].Source() == node )
{
node = circuit[ name ].Drain();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
else if ( circuit[ name ].Drain() == node )
{
node = circuit[ name ].Source();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
int nc;
// Cj P
C_p += TECH.C_pj * Wjp * TECH.Df * \
( ( VDD * VDD + Vd_p[ i ] * Vd_p[ i ] ) * ( TECH.mj_p - 1 ) * ( TECH.mj_p - 1 ) + VDD * \
( TECH.mj_p * TECH.PB_p - 2 * Vd_p[ i ] * ( TECH.mj_p - 1 ) * ( TECH.mj_p - 1 ) ) - \
Vd_p[ i ] * TECH.mj_p * TECH.PB_p + TECH.PB_p * TECH.PB_p ) * \
pow ( ( VDD - Vd_p[ i ] + TECH.PB_p ) / TECH.PB_p, -TECH.mj_p ) / \
( ( TECH.mj_p - 2 ) * ( TECH.mj_p - 1 ) );
C_p += TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
( ( VDD * VDD + Vd_p[ i ] * Vd_p[ i ] ) * ( TECH.mjsw_p - 1 ) * ( TECH.mjsw_p - 1 ) + VDD * \
( TECH.mjsw_p * TECH.PB_p - 2 * Vd_p[ i ] * ( TECH.mjsw_p - 1 ) * ( TECH.mjsw_p - 1 ) ) - \
Vd_p[ i ] * TECH.mjsw_p * TECH.PB_p + TECH.PB_p * TECH.PB_p ) * \
pow ( ( VDD - Vd_p[ i ] + TECH.PB_p ) / TECH.PB_p, -TECH.mjsw_p ) / \
( ( TECH.mjsw_p - 2 ) * ( TECH.mjsw_p - 1 ) );
C_p += TECH.PB_p * TECH.PB_p * ( TECH.C_pj * Wjp * TECH.Df * \
( TECH.mjsw_p - 2 ) * ( TECH.mjsw_p - 1 ) + \
TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
( TECH.mj_p - 2 ) * ( TECH.mj_p - 1 ) ) / \
( ( TECH.mj_p - 2 ) * ( TECH.mj_p - 1 ) * ( TECH.mjsw_p - 1 ) * ( 2 - TECH.mjsw_p ) );
// Cj N
if (n > 0)
{
double x = TECH.mj_n - 1;
double y = TECH.mjsw_n - 1;
C_p += TECH.C_nj * Wjn * TECH.Df *\
( VDD * VDD * x + VDD * TECH.mj_n * TECH.PB_n + TECH.PB_n * TECH.PB_n ) * \
pow( ( VDD + TECH.PB_n ) / TECH.PB_n, -TECH.mj_n ) / \
( ( 1 - x ) * x );
C_p += TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) *\
( VDD * VDD * y + VDD * TECH.mjsw_n * TECH.PB_n + TECH.PB_n * TECH.PB_n ) *\
pow( ( VDD + TECH.PB_n ) / TECH.PB_n, -TECH.mjsw_n ) / \
( ( 1 - y ) * y );
C_p += TECH.C_nj * Wjn * TECH.Df *\
( VDD * Vd_p[ i ] * (x - 1) * x - \
Vd_p[ i ] * Vd_p[i] * x * x - TECH.PB_n * (Vd_p[i] * TECH.mj_n + TECH.PB_n)) * \
pow( ( VDD + TECH.PB_n ) / TECH.PB_n, -TECH.mj_n ) / \
( ( 1 - x ) * x );
C_p += TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) *\
( VDD * Vd_p[ i ] * (y - 1) * y - \
Vd_p[ i ] * Vd_p[i] * y * y - TECH.PB_n * (Vd_p[i] * TECH.mjsw_n + TECH.PB_n) ) *\
pow( ( VDD + TECH.PB_n ) / TECH.PB_n, -TECH.mjsw_n ) / \
( ( 1 - y ) * y );
}
C_p += circuit.CapStaticGnd( node, nc ) * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
C_p += circuit.CapStaticVdd( node, nc ) * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
C_p += Wgn * TECH.Lmin * TECH.Cox_n * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
C_p += Wgp * TECH.Lmin * TECH.Cox_p * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
C_p += TECH.Cgs0_n * ( Wjn + njn * TECH.XW_n ) * ( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
if ( i < n )
C_p += TECH.Cgs0_p * ( ( Wjp - W_p[ i ] - W_p[ i + 1 ] ) + ( njp - 2 ) * TECH.XW_p ) * \
( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
else
C_p += TECH.Cgs0_p * ( ( Wjp - W_p[ i ] ) + ( njp - 1 ) * TECH.XW_p ) * \
( VDD - Vd_p[ i ] ) * ( VDD - Vd_p[ i ] ) * 0.5;
// Cgs
if ( (( i == 1 ) && ( i < p - 1 )) || ((i == 1) && (p == 1)) )
{
double Cov = TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p );
double Cg = Cov + 0.5 * TECH.Cox_p * W_p[ i ] * L_p[ i ];
int Op, SOp;
Op = Calct0ts1P( circuit, NP, SOp, NewWidth );
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
C_p += Cov * (Vd_p[i] - VDD) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i]*\
(ts_p[i] - 2 * tauo_p[i])) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + \
Vd_p[i] * taui_p[i]) / \
(2 * taui_p[i] * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
C_p += -Cg * ((VDD * VDD) * ((t0_p[i] * t0_p[i]) * ((ts_p[i] * ts_p[i]) - (taui_p[i] * taui_p[i])) + \
t0_p[i] * ((taui_p[i] * taui_p[i]) - (ts_p[i] * ts_p[i])) * (taui_p[i] + 2 * tauo_p[i]) + \
(ts_p[i] * ts_p[i]) * tauo_p[i] * (taui_p[i] + tauo_p[i]) - (taui_p[i] * taui_p[i]) * ((taui_p[i] * taui_p
taui_p[i] * tauo_p[i] + 2 * (tauo_p[i] * tauo_p[i]))) + \
VDD * Vd_p[i] * taui_p[i] * (t0_p[i] * ((ts_p[i] * ts_p[i]) - (taui_p[i] * taui_p[i])) - (ts_p[i] * ts_p[i]) * tauo_p[i]
taui_p[i] * (2 * (taui_p[i] * taui_p[i]) - 3 * taui_p[i] * tauo_p[i] + 2 * (tauo_p[i] * tauo
(Vd_p[i] * Vd_p[i]) * (taui_p[i] * taui_p[i]) * ((taui_p[i] * taui_p[i]) - 2 * taui_p[i] * tauo_p[i] + (tauo_p[i] * tauo_
(2 * (taui_p[i] * taui_p[i]) * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
break;
case _AA_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) / \
(2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
C_p += Cov * (Vd_p[i] - VDD) * (VDD * ((t0_p[i] * t0_p[i] * t0_p[i]) - (t0_p[i] * t0_p[i]) * (taui_p[i] + \
3 * tauo_p[i]) - t0_p[i] * ((taui_p[i] * taui_p[i]) - 4 * taui_p[i] * tauo_p[i] - 2 * (tauo_p[i]
taui_p[i] * ((ts_p[i] * ts_p[i]) - 2 * ts_p[i] * tauo_p[i] + tauo_p[i] * (taui_p[i] - 2 * tauo_p
Vd_p[i] * taui_p[i] * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[
(2 * taui_p[i] * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
break;
case _B_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD * ((t0_p[i] * t0_p[i]) - t0_p[i] * (taui_p[i] + 2 * tauo_p[i]) - \
(taui_p[i] * taui_p[i]) + 3 * taui_p[i] * tauo_p[i]) + Vd_p[i] * taui_p[i] * (t0_p[i] - tauo_p[i]
(2 * taui_p[i] * (tauo_p[i] - t0_p[i]));
break;
case _C_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) / (2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[i]))
(2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));

break;
case _D_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (taui_p[i] - tauo_p[i]) * (taui_p[i] - tauo_p[i]) / \
(2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
break;
case _F_:
C_p += Cg * (Vd_p[i] - VDD) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) +
Vd_p[i] * taui_p[i]) / (2 * taui_p[i] * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
C_p += Cov * (Vd_p[i] - VDD) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i]*\
(ts_p[i] - 2 * tauo_p[i])) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + Vd_p[i] * taui_p
(2 * taui_p[i] * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
break;
case _G_:
C_p += Cg * (Vd_p[i] - VDD) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + Vd_p[i] * taui_p[i]) / (2 * taui_p[i]);
break;
case _E_:
default:
C_p += 0.0;
break;
}

}
else if ( ( i < p - 1 ) && ( i > 1 ) )
{
C_p += ( TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ i ] * TECH.Lmin ) * \
( VDD + Vd_p[ i ] ) * ( Vd_p[ i ] - VDD ) * 0.5;
C_p += ( TECH.Cgs0_p * ( W_p[ i + 1 ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ i + 1 ] * njp * TECH.Lmin ) * \
( VDD + Vd_p[ i ] ) * ( Vd_p[ i ] - VDD ) * 0.5;
}
else if ( (i == 1) && (i == p - 1) )
{
double Cov = TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p );
double Cg = Cov + 0.5 * TECH.Cox_p * W_p[ i ] * L_p[ i ];
int Op, SOp;
Op = Calct0ts1P( circuit, NP, SOp, NewWidth );
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
C_p += Cov * (Vd_p[i] - VDD) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i]*\
(ts_p[i] - 2 * tauo_p[i])) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + \
Vd_p[i] * taui_p[i]) / \
(2 * taui_p[i] * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
C_p += -Cg * ((VDD * VDD) * ((t0_p[i] * t0_p[i]) * ((ts_p[i] * ts_p[i]) - (taui_p[i] * taui_p[i])) + \
t0_p[i] * ((taui_p[i] * taui_p[i]) - (ts_p[i] * ts_p[i])) * (taui_p[i] + 2 * tauo_p[i]) + \
(ts_p[i] * ts_p[i]) * tauo_p[i] * (taui_p[i] + tauo_p[i]) - (taui_p[i] * taui_p[i]) * ((tau
taui_p[i] * tauo_p[i] + 2 * (tauo_p[i] * tauo_p[i]))) + \
VDD * Vd_p[i] * taui_p[i] * (t0_p[i] * ((ts_p[i] * ts_p[i]) - (taui_p[i] * taui_p[i])) - (ts_p[i] * ts_p[i
taui_p[i] * (2 * (taui_p[i] * taui_p[i]) - 3 * taui_p[i] * tauo_p[i] + 2 * (t
(Vd_p[i] * Vd_p[i]) * (taui_p[i] * taui_p[i]) * ((taui_p[i] * taui_p[i]) - 2 * taui_p[i] * tauo_p[i] + (ta
(2 * (taui_p[i] * taui_p[i]) * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
break;
case _AA_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) / \
(2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
C_p += Cov * (Vd_p[i] - VDD) * (VDD * ((t0_p[i] * t0_p[i] * t0_p[i]) - (t0_p[i] * t0_p[i]) * (taui_p[i] + \
3 * tauo_p[i]) - t0_p[i] * ((taui_p[i] * taui_p[i]) - 4 * taui_p[i] * tauo_p[i] taui_p[i] * ((ts_p[i] * ts_p[i]) - 2 * ts_p[i] * tauo_p[i] + tauo_p[i] * (taui_p[
Vd_p[i] * taui_p[i] * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i
(2 * taui_p[i] * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
break;
case _B_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD * ((t0_p[i] * t0_p[i]) - t0_p[i] * (taui_p[i] + 2 * tauo_p[i]) - \
(taui_p[i] * taui_p[i]) + 3 * taui_p[i] * tauo_p[i]) + Vd_p[i] * taui_p[i] * (t0_p
(2 * taui_p[i] * (tauo_p[i] - t0_p[i]));
break;
case _C_:

C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) / \
(2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[i])) / \
(2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
break;
case _D_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (taui_p[i] - tauo_p[i]) * (taui_p[i] - tauo_p[i]) / \
(2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
break;
case _F_:
C_p += Cg * (Vd_p[i] - VDD) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + \
Vd_p[i] * taui_p[i]) / (2 * taui_p[i] * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
C_p += Cov * (Vd_p[i] - VDD) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i]*\
(ts_p[i] - 2 * tauo_p[i])) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + Vd_p[i] * taui_p[i]) / \
(2 * taui_p[i] * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
break;
case _G_:
C_p += Cg * (Vd_p[i] - VDD) * (VDD * (t0_p[i] - taui_p[i] - tauo_p[i]) + Vd_p[i] * taui_p[i]) / (2 * taui_p[i]);
break;
case _E_:
default:
C_p += 0.0;
break;
}
Op = Calct0tsnP( p, SOp);
if ( Op == _E_ )
{
Op = SOp;
}
Cov = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * TECH.Lmin );
Cg = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.66666 * TECH.Cox_p * W_p[ p ] * TECH.Lmin );
switch ( Op )
{
case _A_:
C_p += Cg * (VDD - Vs_p[p]) * (VDD - Vs_p[p]) * (ts_p[p] - taui_p[p]) * (ts_p[p] - taui_p[p]) / \
(2 * (t0_p[i] - taui_p[p]) * (t0_p[i] - taui_p[p]));
C_p += Cov * (VDD - Vs_p[p]) * (VDD - Vs_p[p])*\
((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * taui_p[p] - ts_p[p] * (ts_p[p] - 2 * taui_p[p])) / \
(2 * (t0_p[i] - taui_p[p]) * (t0_p[i] - taui_p[p]));
break;
case _B_:
C_p += Cov * (VDD - Vs_p[p]) * (VDD - Vs_p[p]) / 2;
break;
case _C_:
C_p += Cov * (VDD - Vs_p[p]) * (VDD - Vs_p[p])*\
((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * taui_p[p] - ts_p[p] * (ts_p[p] - 2 * taui_p[p])) / \
(2 * (t0_p[i] - taui_p[p]) * (t0_p[i] - taui_p[p]));
C_p += -Cg * (VDD - Vs_p[p]) * (VDD - Vs_p[p])*\
((tc * tc) - 2 * tc * taui_p[p] - ts_p[p] * (ts_p[p] - 2 * taui_p[p])) / \
(2 * (t0_p[i] - taui_p[p]) * (t0_p[i] - taui_p[p]));
break;
case _D_:
C_p += Cov * (VDD - Vs_p[p]) * (VDD - Vs_p[p])*\
((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * taui_p[p] - tc * (tc - 2 * taui_p[p])) / \
(2 * (t0_p[i] - taui_p[p]) * (t0_p[i] - taui_p[p]));
break;
case _E_:
default:
break;
}
}
else if ((i == p - 1) && (p > 2))
{
C_p += ( TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ i ] * TECH.Lmin ) * \
( VDD + Vd_p[ i ] ) * ( Vd_p[ i ] - VDD ) * 0.5;
int Op, SOp;
Op = Calct0tsnP( p, SOp);
if ( Op == _E_ )
{
Op = SOp;

}
double Cov = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * TECH.Lmin );
double Cg = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.66666 * TECH.Cox_p * W_p[ p ] * TECH.Lmin );
switch ( Op )
{
case _A_:
C_p += Cg * (VDD - Vs_p[p]) * (VDD - Vs_p[p]) * (ts_p[p] - taui_p[p]) * (ts_p[p] - taui_p[p]) / \
(2 * (t0_p[i] - taui_p[p]) * (t0_p[i] - taui_p[p]));
C_p += Cov * (VDD - Vs_p[p]) * (VDD - Vs_p[p])*\
((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * taui_p[p] - ts_p[p] * (ts_p[p] - 2 * taui_p[p])) / \
(2 * (t0_p[i] - taui_p[p]) * (t0_p[i] - taui_p[p]));
break;
case _B_:
C_p += Cov * (VDD - Vs_p[p]) * (VDD - Vs_p[p]) / 2;
break;
case _C_:
C_p += Cov * (VDD - Vs_p[p]) * (VDD - Vs_p[p])*\
((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * taui_p[p] - ts_p[p] * (ts_p[p] - 2 * taui_p[p])) / \
(2 * (t0_p[i] - taui_p[p]) * (t0_p[i] - taui_p[p]));
C_p += -Cg * (VDD - Vs_p[p]) * (VDD - Vs_p[p])*\
((tc * tc) - 2 * tc * taui_p[p] - ts_p[p] * (ts_p[p] - 2 * taui_p[p])) / \
(2 * (t0_p[i] - taui_p[p]) * (t0_p[i] - taui_p[p]));
break;
case _D_:
C_p += Cov * (VDD - Vs_p[p]) * (VDD - Vs_p[p])*\
((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * taui_p[p] - tc * (tc - 2 * taui_p[p])) / \
(2 * (t0_p[i] - taui_p[p]) * (t0_p[i] - taui_p[p]));
break;
case _E_:
default:
break;
}
}
else if ( i == p )
{
double Cov = TECH.Cgd0_p * ( W_p[ p ] + TECH.XW_p );
double Cg = Cov + 0.5 * TECH.Cox_p * W_p[ p ] * L_p[ p ];
int Op, SOp;
Op = Calct0tsnP( p, SOp );
if ( Op == _E_ )
{
Op = SOp;
}
switch ( Op )
{
case _A_:
case _B_:
C_p += Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * (ts_p[i] - tauo_p[i]) * (ts_p[i] - tauo_p[i]) / \
(2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i])*\
((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[i])) / \
(2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
break;
case _C_:
C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i]) * ((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - ts_p[i]*\
(ts_p[i] - 2 * tauo_p[i])) / (2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
C_p += -Cg * (VDD - Vd_p[i]) * (VDD - Vd_p[i])*\
((tc * tc) - 2 * tc * tauo_p[i] - ts_p[i] * (ts_p[i] - 2 * tauo_p[i])) / \
(2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
break;
case _D_:
C_p += Cov * (VDD - Vd_p[i]) * (VDD - Vd_p[i])*\
((t0_p[i] * t0_p[i]) - 2 * t0_p[i] * tauo_p[i] - tc * (tc - 2 * tauo_p[i])) / \
(2 * (t0_p[i] - tauo_p[i]) * (t0_p[i] - tauo_p[i]));
break;
case _E_:
default:
break;

}
}
if ( C_p < 0.0 )
C_p *= -1;
Ecc += C_p;
}
return OK;
}
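
The gate capacitance terms that recur in the branches above can be read in isolation. The following is only a minimal sketch and not part of the original sources: the struct `PmosTech` and the functions `drainSideCaps` and `accumulateMagnitude` are hypothetical names. It assumes, as the listing suggests, that the drain-side overlap capacitance is Cgd0 (W + XW), that half of the channel term Cox W L is added to it to obtain Cg, and that each contribution is accumulated by magnitude.

```cpp
#include <cmath>

// Hypothetical container mirroring the TECH fields Cgd0_p, Cox_p and XW_p.
struct PmosTech {
    double Cgd0; // gate-drain overlap capacitance per unit width
    double Cox;  // gate oxide capacitance per unit area
    double XW;   // additive width term used in the listing
};

// Pair of capacitances used in the drain-side branches of the listing.
struct DrainCaps {
    double Cov; // overlap capacitance only
    double Cg;  // overlap plus half of the channel capacitance
};

// Same expressions as "Cov = TECH.Cgd0_p * ( W + TECH.XW_p )" and
// "Cg = Cov + 0.5 * TECH.Cox_p * W * L".
DrainCaps drainSideCaps(const PmosTech& tech, double W, double L)
{
    DrainCaps caps;
    caps.Cov = tech.Cgd0 * (W + tech.XW);
    caps.Cg = caps.Cov + 0.5 * tech.Cox * W * L;
    return caps;
}

// Mirrors "if ( C_p < 0.0 ) C_p *= -1; Ecc += C_p;": a contribution is
// added by its magnitude, whatever sign the closed-form term produced.
double accumulateMagnitude(double Ecc, double C_p)
{
    return Ecc + std::fabs(C_p);
}
```

With arbitrary illustrative values, `drainSideCaps(tech, W, L)` would be evaluated once per transistor and the resulting `Cov` and `Cg` passed to whichever closed-form term the operating condition selects.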

Calcstart.cc

#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

#define SIGN(a,b) ((b) >= 0.0 ? fabs(a) : -fabs(a))

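/// Searches [start, end] for a zero of t0N (NMOS) or t0P (PMOS) with a Brent-type bracketing iteration.
/// On failure RetCode is set to NOT_FOUND and 0.0 is returned.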
double Fast::CalcStartTime( const Circuit& circuit, unsigned int NP, unsigned int NC, double start, double end, TransistorType type, const double* NewW
{
int iter;
double a = start, b = end, c = end, d, e, min1, min2;
double fa, fb, fc, pp, q, r, s, tol1, xm, last;
double tol = TOL;
if ( type == NMOS )
fb = t0N( circuit, NP, NC, b, NewWidth, RetCode );
else if ( type == PMOS )
fb = t0P( circuit, NP, NC, b, NewWidth, RetCode );
last = fb;
if ( type == NMOS )
fa = t0N( circuit, NP, NC, a, NewWidth, RetCode );
else if ( type == PMOS )
fa = t0P( circuit, NP, NC, a, NewWidth, RetCode );
if ( ( fa > 0.0 && fb > 0.0 ) || ( fa < 0.0 && fb < 0.0 ) )
{
RetCode = NOT_FOUND;
return 0.0;
}
fc = fb;
for ( iter = 1; iter <= ITERMAX; iter++ )
{
if ( ( fb > 0.0 && fc > 0.0 ) || ( fb < 0.0 && fc < 0.0 ) )
{
c = a;
fc = fa;
e = d = b - a;
}
if ( fabs ( fc ) < fabs ( fb ) )
{
a = b;
b = c;
c = a;
fa = fb;
fb = fc;
fc = fa;
}
tol1 = 2.0 * EPS * fabs ( b ) + 0.5 * tol;
xm = 0.5 * ( c - b );
if ( fabs ( xm ) <= tol1 || fb == 0.0 )
return b;
if ( fabs ( e ) >= tol1 && fabs ( fa ) > fabs ( fb ) )
{
s = fb / fa;
if ( a == c )
{
pp = 2.0 * xm * s;

q = 1.0 - s;
}
else
{
q = fa / fc;
r = fb / fc;
pp = s * ( 2.0 * xm * q * ( q - r ) - ( b - a ) * ( r - 1.0 ) );
q = ( q - 1.0 ) * ( r - 1.0 ) * ( s - 1.0 );
}
if ( pp > 0.0 )
q = -q;
pp = fabs ( pp );
min1 = 3.0 * xm * q - fabs ( tol1 * q );
min2 = fabs ( e * q );
if ( 2.0 * pp < ( min1 < min2 ? min1 : min2 ) )
{
e = d;
d = pp / q;
}
else
{
d = xm;
e = d;
}
}
else
{
d = xm;
e = d;
}
a = b;
fa = fb;
if ( fabs ( d ) > tol1 )
b += d;
else
b += SIGN ( tol1, xm );
if ( type == NMOS )
fb = t0N( circuit, NP, NC, b, NewWidth, RetCode );
else if ( type == PMOS )
fb = t0P( circuit, NP, NC, b, NewWidth, RetCode );
}
RetCode = NOT_FOUND;
return 0.0;
}
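
The iteration in CalcStartTime appears to follow the classical Brent scheme: keep a bracketing interval, try a secant or inverse quadratic interpolation step, and fall back to bisection when the interpolated step is not acceptable. The sketch below restates that scheme for an arbitrary function so that it can be tried outside the simulator. It is an illustration under assumptions, not the original routine: the name `brentRoot`, its template parameter and its default tolerances are hypothetical, and the machine epsilon stands in for the EPS constant of the sources.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical stand-alone Brent-type root finder.
// Returns true and stores the root if f changes sign on [x1, x2].
template <typename F>
bool brentRoot(F f, double x1, double x2, double& root,
               double tol = 1e-12, int maxIter = 100)
{
    const double eps = std::numeric_limits<double>::epsilon();
    double a = x1, b = x2, c = x2, d = 0.0, e = 0.0;
    double fa = f(a), fb = f(b);
    if ((fa > 0.0 && fb > 0.0) || (fa < 0.0 && fb < 0.0))
        return false;                       // root is not bracketed
    double fc = fb;
    for (int iter = 0; iter < maxIter; ++iter) {
        if ((fb > 0.0 && fc > 0.0) || (fb < 0.0 && fc < 0.0)) {
            c = a; fc = fa; e = d = b - a;  // re-establish the bracket
        }
        if (std::fabs(fc) < std::fabs(fb)) {
            a = b; b = c; c = a;            // keep b as the best estimate
            fa = fb; fb = fc; fc = fa;
        }
        const double tol1 = 2.0 * eps * std::fabs(b) + 0.5 * tol;
        const double xm = 0.5 * (c - b);
        if (std::fabs(xm) <= tol1 || fb == 0.0) {
            root = b;
            return true;
        }
        if (std::fabs(e) >= tol1 && std::fabs(fa) > std::fabs(fb)) {
            double p, q;
            const double s = fb / fa;
            if (a == c) {                   // secant step
                p = 2.0 * xm * s;
                q = 1.0 - s;
            } else {                        // inverse quadratic step
                const double t = fa / fc;
                const double r = fb / fc;
                p = s * (2.0 * xm * t * (t - r) - (b - a) * (r - 1.0));
                q = (t - 1.0) * (r - 1.0) * (s - 1.0);
            }
            if (p > 0.0) q = -q;
            p = std::fabs(p);
            const double min1 = 3.0 * xm * q - std::fabs(tol1 * q);
            const double min2 = std::fabs(e * q);
            if (2.0 * p < std::min(min1, min2)) {
                e = d; d = p / q;           // accept interpolation
            } else {
                d = xm; e = d;              // fall back to bisection
            }
        } else {
            d = xm; e = d;                  // bounds shrink too slowly
        }
        a = b; fa = fb;
        b += (std::fabs(d) > tol1) ? d : (xm >= 0.0 ? tol1 : -tol1);
        fb = f(b);
    }
    return false;                           // iteration limit reached
}
```

A call such as `brentRoot([](double x) { return x * x - 2.0; }, 0.0, 2.0, root)` should converge to the square root of two. The Boolean return value plays the role that RetCode plays in the listing.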

Calctst0N.cc

#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

/*
 * t0: A=AA=F B=G C D
 * ts: A=F B=D=G AA=C
 */
int Fast::Calct0ts1N( const Circuit& circuit, unsigned int NP, int& SaveOpCondition, const double* NewWidth )
{
int OpCondition = _A_, LastOpCondition = 0;
double t0_bs, Vc, A_2_n, B_2_n, C_2_n, D_2_n, J_2_n, K_2_n, I_2_n, M_2_n;
double a, b, c, Cm1, Cm2, Cov, Cj, X, Y, alpha, beta, gamma, theta;

t0_bs = TECH.Vtn0 * taui_n[ 1 ] / VDD;
if ( taui_n[ 1 ] <= tauo_n[ 1 ] )
{
ts_n[ 1 ] = ( taui_n[ 1 ] - t0_n[ 1 ] ) * 0.5 + t0_n[ 1 ]; /* A */
}
else
{
ts_n[ 1 ] = ( tauo_n[ 1 ] - t0_n[ 1 ] ) * 0.5 + t0_n[ 1 ]; /* F */
}
Vc = TECH.Ec_n * L_n[ 1 ];
Cov = TECH.Cgd0_n * ( W_n[ 1 ] + TECH.XW_n );
Cm1 = Cov;
Cm2 = Cm1 + 0.5 * TECH.Cox_n * W_n[ 1 ] * L_n[ 1 ];
unsigned int pp = pathlist[ NP ].GetNumTranP();
const char* name = pathlist[ NP ].TransistorName( 0 );
int node;
double Wjn, Wgn, Wjp, Wgp;
int njn, ngn, njp, ngp, nc;
if ( circuit[ name ].Source() == 0 )
{
node = circuit[ name ].Drain();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
else if ( circuit[ name ].Drain() == 0 )
{
node = circuit[ name ].Source();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
// Nmos
Cj = TECH.C_nj * Wjn * TECH.Df * pow ( 1 + Vd_n[ 1 ] / TECH.PB_n, -TECH.mj_n ) + \
TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * pow ( 1 + Vd_n[ 1 ] / TECH.PB_n, -TECH.mjsw_n );
Cj += TECH.Cgs0_n * ( Wjn + njn * TECH.XW_n );
// Pmos
if (pp > 0)
{
Cj += TECH.C_pj * Wjp * TECH.Df * pow ( 1 + ( VDD - Vd_n[ 1 ] ) / TECH.PB_p, -TECH.mj_p ) + \
TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * pow ( 1 + ( VDD - Vd_n[ 1 ] ) / TECH.PB_p, -TECH.mjsw_p );
Cj += TECH.Cgd0_p * ( Wjp + njp * TECH.XW_p );
}
// static
Cj += circuit.CapStaticGnd( node, nc );
Cj += circuit.CapStaticVdd( node, nc );
Cj += Wgn * TECH.Lmin * TECH.Cox_n;
Cj += Wgp * TECH.Lmin * TECH.Cox_p;
A_2_n = Vc * beta_n[ 1 ] * ( Vc - TECH.Vtn0 );
B_2_n = VDD * Vc * beta_n[ 1 ] / taui_n[ 1 ];
C_2_n = 2 * VDD / ( Vc * taui_n[ 1 ] );
D_2_n = ( Vc - 2 * TECH.Vtn0 ) / Vc;
J_2_n = Vc * Vd_n[ 1 ] * beta_n[ 1 ] * ( Vd_n[ 1 ] + 2 * TECH.Vtn0 ) / ( 2 * ( Vc + Vd_n[ 1 ] ) );
K_2_n = VDD * Vc * Vd_n[ 1 ] * beta_n[ 1 ] / ( taui_n[ 1 ] * ( Vc + Vd_n[ 1 ] ) );
I_2_n = ( Vc * Vc ) * beta_n[ 1 ] * ( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) * \
( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) / 2;
M_2_n = K_2_n * taui_n[ 1 ] - J_2_n;
X = Cj + Cm2;
Y = Cj + Cov;
while ( OpCondition != LastOpCondition )
{
if ( LastOpCondition != 0 )
LastOpCondition = OpCondition;
else
LastOpCondition = _E_;
if ( ( t0_n[ 1 ] <= ts_n[ 1 ] ) &&
( ts_n[ 1 ] <= taui_n[ 1 ] ) &&
( taui_n[ 1 ] <= tauo_n[ 1 ] ) )
{
OpCondition = SaveOpCondition = _A_;
if ( OpCondition != LastOpCondition )

{
#ifdef SAT
b = ( 2 * ( Vd_n[ 1 ] * taui_n[ 1 ] * \
( Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) - Vd_n[ 1 ] * tauo_n[ 1 ] ) - \
VDD * Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * ( t0_n[ 1 ] - tauo_n[ 1 ] ) ) ) / \
( Vd_n[ 1 ] * Vd_n[ 1 ] * taui_n[ 1 ] );
c = ( 2 * Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * ( Vd_n[ 1 ] * tauo_n[ 1 ] + \
TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) - Vd_n[ 1 ] * Vd_n[ 1 ] * tauo_n[ 1 ] * tauo_n[ 1 ] ) / \
( Vd_n[ 1 ] * Vd_n[ 1 ] );
if ( ( b * b + 4 * c ) >= 0 )
{
ts_n[ 1 ] = -( sqrt ( b * b + 4 * c ) + b ) * 0.5;
if ( ts_n[ 1 ] < 0 )
ts_n[ 1 ] = ( sqrt ( b * b + 4 * c ) - b ) * 0.5;
}
else
ts_n[ 1 ] = taui_n[ 1 ] * ( Vd_n[ 1 ] * tauo_n[ 1 ] + TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / \
( Vd_n[ 1 ] * taui_n[ 1 ] - VDD * ( t0_n[ 1 ] - tauo_n[ 1 ] ) );
#else
ts_n[ 1 ] = taui_n[ 1 ] * ( Vd_n[ 1 ] * tauo_n[ 1 ] + TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / \
( Vd_n[ 1 ] * taui_n[ 1 ] - VDD * ( t0_n[ 1 ] - tauo_n[ 1 ] ) );
#endif
}
}
else if ( ( t0_n[ 1 ] <= taui_n[ 1 ] ) &&
( taui_n[ 1 ] <= ts_n[ 1 ] ) &&
( ts_n[ 1 ] <= tauo_n[ 1 ] ) )
{
OpCondition = SaveOpCondition = _AA_;
if ( OpCondition != LastOpCondition )
{
#ifdef SAT
ts_n[ 1 ] = ( sqrt ( Vc ) * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * \
sqrt ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) + Vc * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / \
Vd_n[ 1 ] + tauo_n[ 1 ];
#else
ts_n[ 1 ] = ( VDD * ( t0_n[ 1 ] - tauo_n[ 1 ] ) + Vd_n[ 1 ] * tauo_n[ 1 ] + \
TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / Vd_n[ 1 ];
#endif
}
}
else if ( ( ts_n[ 1 ] <= t0_n[ 1 ] ) &&
( t0_n[ 1 ] <= taui_n[ 1 ] ) &&
( taui_n[ 1 ] <= tauo_n[ 1 ] ) )
{
OpCondition = SaveOpCondition = _B_;
if ( OpCondition != LastOpCondition )
{
#ifdef SAT
ts_n[ 1 ] = taui_n[ 1 ] * \
( 2 * Vc * ( Vd_n[ 1 ] + TECH.Vtn0 ) + ( Vd_n[ 1 ] * Vd_n[ 1 ] ) ) / \
( 2 * VDD * Vc );
#else
ts_n[ 1 ] = taui_n[ 1 ] * ( Vd_n[ 1 ] + TECH.Vtn0 ) / VDD;
#endif
a = 2 * ( Cm2 * VDD + J_2_n * taui_n[ 1 ] ) / ( K_2_n * taui_n[ 1 ]);
b = ( 6 * A_2_n * C_2_n * taui_n[1] * ( t0_bs - ts_n[ 1 ] ) + \
3 * B_2_n * C_2_n * taui_n[1] * ( t0_bs * t0_bs - ts_n[ 1 ] * ts_n[ 1 ] ) - \
4 * Vc * Vc * beta_n[ 1 ] * taui_n[1] * pow( C_2_n * t0_bs + D_2_n, 1.5 ) + \
4 * Vc * Vc * beta_n[ 1 ] * pow( C_2_n * ts_n[ 1 ] + D_2_n, 1.5 ) - \
3 * C_2_n * ts_n[ 1 ] * ( 2 * Cm2 * VDD - 2 * Cov * VDD + \
taui_n[1] * (2 * J_2_n - K_2_n * ts_n[1] ) ) ) / \
( 3 * C_2_n * K_2_n * taui_n[ 1 ]);
if ( ( 4 * b + ( a * a ) ) >= 0 )
{
t0_n[ 1 ] = ( a - sqrt ( 4 * b + ( a * a ) ) ) / 2;
if ( t0_n[ 1 ] < 0 )
t0_n[ 1 ] = ( sqrt ( 4 * b + ( a * a ) ) + a ) / 2;

}
else
t0_n[ 1 ] = t0_bs;
}
}
else if ( ( taui_n[ 1 ] <= t0_n[ 1 ] ) &&
( t0_n[ 1 ] <= ts_n[ 1 ] ) &&
( ts_n[ 1 ] <= tauo_n[ 1 ] ) )
{
OpCondition = SaveOpCondition = _C_;
if ( OpCondition != LastOpCondition )
{
alpha = C_2_n * t0_bs + D_2_n;
beta = C_2_n * ts_n[ 1 ] + D_2_n;
theta = pow( alpha, 1.5 ) - pow( beta, 1.5 );
t0_n[ 1 ] = ( 6 * A_2_n * C_2_n * ( t0_bs - taui_n[ 1 ] ) + \
3 * B_2_n * C_2_n * ( t0_bs * t0_bs - taui_n[ 1 ] * taui_n[ 1 ] ) - \
2 * ( 2 * Vc * Vc * beta_n[ 1 ] * theta - \
3 * C_2_n * (Cov * VDD + I_2_n * taui_n[1]))) /
( 6 * C_2_n * I_2_n );
#ifdef SAT
ts_n[ 1 ] = ( sqrt ( Vc ) * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * \
sqrt ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) + Vc * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / \
Vd_n[ 1 ] + tauo_n[ 1 ];
#else
ts_n[ 1 ] = ( VDD * ( t0_n[ 1 ] - tauo_n[ 1 ] ) + Vd_n[ 1 ] * tauo_n[ 1 ] + \
TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / Vd_n[ 1 ];
#endif
}
}
else if ( ( ts_n[ 1 ] <= taui_n[ 1 ] ) &&
( taui_n[ 1 ] <= t0_n[ 1 ] ) &&
( t0_n[ 1 ] <= tauo_n[ 1 ] ) )
{
OpCondition = SaveOpCondition = _D_;
if ( OpCondition != LastOpCondition )
{
#ifdef SAT
ts_n[ 1 ] = taui_n[ 1 ] * \
( 2 * Vc * ( Vd_n[ 1 ] + TECH.Vtn0 ) + ( Vd_n[ 1 ] * Vd_n[ 1 ] ) ) / \
( 2 * VDD * Vc );
#else
ts_n[ 1 ] = taui_n[ 1 ] * ( Vd_n[ 1 ] + TECH.Vtn0 ) / VDD;
#endif
alpha = C_2_n * t0_bs + D_2_n;
beta = C_2_n * taui_n[ 1 ] + D_2_n;
gamma = C_2_n * ts_n[ 1 ] + D_2_n;
theta = -pow( alpha, 1.5 ) + pow( gamma, 1.5 );
t0_n[ 1 ] = ( 6 * A_2_n * C_2_n * taui_n[1] * ( t0_bs - ts_n[ 1 ] ) + \
3 * B_2_n * C_2_n * taui_n[1] * ( ( t0_bs * t0_bs ) - ( ts_n[ 1 ] * ts_n[ 1 ] ) ) + \
4 * Vc * Vc * beta_n[ 1 ] * taui_n[1] * theta - \
3 * C_2_n * ( 2 * Cm2 * VDD * (ts_n[1] - taui_n[1]) - \
2 * Cov * VDD * ts_n[1] + taui_n[1] * \
(2 * J_2_n * (ts_n[1]-taui_n[1]) + K_2_n * \
(taui_n[1] * taui_n[1] - ts_n[1] * ts_n[1]) - \
2 * M_2_n * taui_n[1]))) / \
( 6 * C_2_n * M_2_n * taui_n[ 1 ] );
}
}
else if ( ( t0_n[ 1 ] <= ts_n[ 1 ] ) &&
( ts_n[ 1 ] <= tauo_n[ 1 ] ) &&
( tauo_n[ 1 ] <= taui_n[ 1 ] ) )
{
OpCondition = SaveOpCondition = _F_;
if ( OpCondition != LastOpCondition )
{
#ifdef SAT
b = ( 2 * ( Vd_n[ 1 ] * taui_n[ 1 ] * \
( Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) - Vd_n[ 1 ] * tauo_n[ 1 ] ) - \
VDD * Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * ( t0_n[ 1 ] - tauo_n[ 1 ] ) ) ) / \
( Vd_n[ 1 ] * Vd_n[ 1 ] * taui_n[ 1 ] );
c = ( 2 * Vc * ( t0_n[ 1 ] - tauo_n[ 1 ] ) * ( Vd_n[ 1 ] * tauo_n[ 1 ] + \
TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) - Vd_n[ 1 ] * Vd_n[ 1 ] * tauo_n[ 1 ] * tauo_n[ 1 ] ) / \
( Vd_n[ 1 ] * Vd_n[ 1 ] );
if ( ( b * b + 4 * c ) >= 0 )
{
ts_n[ 1 ] = -( sqrt ( b * b + 4 * c ) + b ) * 0.5;
if ( ts_n[ 1 ] < 0 )
ts_n[ 1 ] = ( sqrt ( b * b + 4 * c ) - b ) * 0.5;
}
else
ts_n[ 1 ] = taui_n[ 1 ] * ( Vd_n[ 1 ] * tauo_n[ 1 ] + TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / \
( Vd_n[ 1 ] * taui_n[ 1 ] - VDD * ( t0_n[ 1 ] - tauo_n[ 1 ] ) );
#else
ts_n[ 1 ] = taui_n[ 1 ] * ( Vd_n[ 1 ] * tauo_n[ 1 ] + TECH.Vtn0 * ( tauo_n[ 1 ] - t0_n[ 1 ] ) ) / \
( Vd_n[ 1 ] * taui_n[ 1 ] - VDD * ( t0_n[ 1 ] - tauo_n[ 1 ] ) );
#endif
}
}
else if ( ( ts_n[ 1 ] <= t0_n[ 1 ] ) &&
( t0_n[ 1 ] <= tauo_n[ 1 ] ) &&
( tauo_n[ 1 ] <= taui_n[ 1 ] ) )
{
OpCondition = SaveOpCondition = _G_;
if ( OpCondition != LastOpCondition )
{
#ifdef SAT
ts_n[ 1 ] = taui_n[ 1 ] * \
( 2 * Vc * ( Vd_n[ 1 ] + TECH.Vtn0 ) + ( Vd_n[ 1 ] * Vd_n[ 1 ] ) ) / \
( 2 * VDD * Vc );
#else
ts_n[ 1 ] = taui_n[ 1 ] * ( Vd_n[ 1 ] + TECH.Vtn0 ) / VDD;
#endif
a = 2 * ( Cm2 * VDD + J_2_n * taui_n[ 1 ] ) / ( K_2_n * taui_n[ 1 ]);
b = ( 6 * A_2_n * C_2_n * taui_n[1] * ( t0_bs - ts_n[ 1 ] ) + \
3 * B_2_n * C_2_n * taui_n[1] * ( t0_bs * t0_bs - ts_n[ 1 ] * ts_n[ 1 ] ) - \
4 * Vc * Vc * beta_n[ 1 ] * taui_n[1] * pow( C_2_n * t0_bs + D_2_n, 1.5 ) + \
4 * Vc * Vc * beta_n[ 1 ] * pow( C_2_n * ts_n[ 1 ] + D_2_n, 1.5 ) - \
3 * C_2_n * ts_n[ 1 ] * ( 2 * Cm2 * VDD - 2 * Cov * VDD + \
taui_n[1] * (2 * J_2_n - K_2_n * ts_n[1] ) ) ) / \
( 3 * C_2_n * K_2_n * taui_n[ 1 ]);
if ( ( 4 * b + ( a * a ) ) >= 0 )
{
t0_n[ 1 ] = ( a - sqrt ( 4 * b + ( a * a ) ) ) / 2;
if ( t0_n[ 1 ] < 0 )
t0_n[ 1 ] = ( sqrt ( 4 * b + ( a * a ) ) + a ) / 2;
}
else
t0_n[ 1 ] = t0_bs;
}
}
else
{
OpCondition = _E_;
}
}
return OpCondition;
}
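
The chain of conditions in Calct0ts1N selects an operating condition purely from the relative order of the four times t0, ts, taui and tauo. The sketch below isolates that selection so that the seven orderings can be checked side by side. It is an illustration only: the `Region` enumeration is a hypothetical stand-in for the `_A_` to `_G_` constants of myenum.h, and `classifyFirstTransistor` is not a function of the original sources.

```cpp
// Hypothetical stand-ins for the _A_, _AA_, ..., _G_ constants.
enum class Region { A, AA, B, C, D, E, F, G };

// Returns the operating condition implied by the ordering of the four
// times, testing the cases in the same order as the listing.
Region classifyFirstTransistor(double t0, double ts, double taui, double tauo)
{
    if (t0 <= ts && ts <= taui && taui <= tauo) return Region::A;
    if (t0 <= taui && taui <= ts && ts <= tauo) return Region::AA;
    if (ts <= t0 && t0 <= taui && taui <= tauo) return Region::B;
    if (taui <= t0 && t0 <= ts && ts <= tauo) return Region::C;
    if (ts <= taui && taui <= t0 && t0 <= tauo) return Region::D;
    if (t0 <= ts && ts <= tauo && tauo <= taui) return Region::F;
    if (ts <= t0 && t0 <= tauo && tauo <= taui) return Region::G;
    return Region::E; // no consistent ordering was found
}
```

In the listing the classification is repeated inside a loop, because each branch recomputes t0 or ts and the new values may fall into a different ordering. The loop stops once two consecutive passes give the same condition.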
///
int Fast::Calct0tsnN( unsigned int n, int& SaveOpCondition )
{
int OpCondition = _A_, LastOpCondition = 0;
double Vc, tc, X, Y, b, c;

Vc = TECH.Ec_n * L_n[ n ];
ts_n[ n ] = ( A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
( t0_n[ n ] - taui_n[ n ] ) * ( VDD * t0_n[ n ] + TECH.Vtn0 * ( tauo_n[ n ] - t0_n[ n ] ) ) ) / \
( A_1_n[ n ] * Vs_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
VDD * ( t0_n[ n ] - taui_n[ n ] ) );
if ( taui_n[ n ] < tauo_n[ n ] )
{
SaveOpCondition = _A_;
}
else
{
tc = ( VDD * tauo_n[ n ] * ( t0_n[ n ] - taui_n[ n ] ) + \
Vs_n[ n ] * taui_n[ n ] * ( tauo_n[ n ] - t0_n[ n ] ) ) / \
( VDD * ( t0_n[ n ] - taui_n[ n ] ) + Vs_n[ n ] * ( tauo_n[ n ] - t0_n[ n ] ) );
SaveOpCondition = _C_;
}
while ( OpCondition != LastOpCondition )
{
if ( LastOpCondition != 0 )
LastOpCondition = OpCondition;
else
LastOpCondition = _E_;
if ( ( taui_n[ n ] <= tauo_n[ n ] ) &&
( ts_n[ n ] <= taui_n[ n ] ) &&
( t0_n[ n ] <= ts_n[ n ] ) )
{
OpCondition = SaveOpCondition = _A_;
if ( OpCondition != LastOpCondition )
{
#ifdef SAT
X = t0_n[ n ] - taui_n[ n ];
Y = tauo_n[ n ] - t0_n[ n ];
b = -( 2 * ( VDD * X * X * ( Vc * Y + VDD * tauo_n[ n ] ) + \
Vs_n[ n ] * X * Y * ( VDD * ( taui_n[ n ] + tauo_n[ n ] ) - A_1_n[ n ] * Vc * Y ) + \
Vs_n[ n ] * Vs_n[ n ] * taui_n[ n ] * Y * Y ) ) / \
( ( VDD * X + Vs_n[ n ] * Y ) * ( VDD * X + Vs_n[ n ] * Y ) );
c = -( X * X * ( 2 * Y * Y * Vc * TECH.Vtn0 + 2 * Y * VDD * Vc * t0_n[ n ] + VDD * VDD * tauo_n[ n ] * tauo_n[ n ] ) + \
2 * Vs_n[ n ] * taui_n[ n ] * X * Y * ( VDD * tauo_n[ n ] - A_1_n[ n ] * Vc * Y ) + \
Vs_n[ n ] * Vs_n[ n ] * taui_n[ n ] * taui_n[ n ] * Y * Y ) / \
( ( VDD * X + Vs_n[ n ] * Y ) * ( VDD * X + Vs_n[ n ] * Y ) );
if ( ( b * b + 4 * c ) >= 0 )
{
ts_n[ n ] = -( sqrt ( b * b + 4 * c ) + b ) * 0.5;
if ( ( ts_n[ n ] < 0 ) || ( ts_n[ n ] < t0_n[ n ] ) || ( ts_n[ n ] > tauo_n[ n ] ) )
ts_n[ n ] = ( sqrt ( b * b + 4 * c ) - b ) * 0.5;
if ( ( ts_n[ n ] < 0 ) || ( ts_n[ n ] < t0_n[ n ] ) || ( ts_n[ n ] > tauo_n[ n ] ) )
ts_n[ n ] = ( A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
( t0_n[ n ] - taui_n[ n ] ) * ( VDD * t0_n[ n ] + TECH.Vtn0 * ( tauo_n[ n ] - t0_n[ n ] ) ) ) / \
( A_1_n[ n ] * Vs_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
VDD * ( t0_n[ n ] - taui_n[ n ] ) );
}
else
ts_n[ n ] = ( A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
( t0_n[ n ] - taui_n[ n ] ) * ( VDD * t0_n[ n ] + TECH.Vtn0 * ( tauo_n[ n ] - t0_n[ n ] ) ) ) / \
( A_1_n[ n ] * Vs_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
VDD * ( t0_n[ n ] - taui_n[ n ] ) );
#else
ts_n[ n ] = ( A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
( t0_n[ n ] - taui_n[ n ] ) * ( VDD * t0_n[ n ] + TECH.Vtn0 * ( tauo_n[ n ] - t0_n[ n ] ) ) ) / \
( A_1_n[ n ] * Vs_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
VDD * ( t0_n[ n ] - taui_n[ n ] ) );
#endif
}
}
else if ( ( ts_n[ n ] <= tauo_n[ n ] ) &&
( taui_n[ n ] <= ts_n[ n ] ) &&
( t0_n[ n ] <= taui_n[ n ] ) )
{
OpCondition = SaveOpCondition = _B_;

if ( OpCondition != LastOpCondition )
{
#ifdef SAT
ts_n[ n ] = tauo_n[ n ] - ( tauo_n[ n ] - t0_n[ n ] ) * \
( sqrt ( Vc * ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) ) - Vc ) / VDD;
#else
ts_n[ n ] = TECH.Vtn0 * ( tauo_n[ n ] - t0_n[ n ] ) / VDD + t0_n[ n ];
#endif
}
}
else if ( ( tauo_n[ n ] <= taui_n[ n ] ) &&
( ts_n[ n ] < tc ) &&
( t0_n[ n ] <= ts_n[ n ] ) )
{

/* ( ts_n[ n ] < tc ), not <= !!! */
OpCondition = SaveOpCondition = _C_;
if ( OpCondition != LastOpCondition )
{
#ifdef SAT
X = t0_n[ n ] - taui_n[ n ];
Y = tauo_n[ n ] - t0_n[ n ];
b = -( 2 * ( VDD * X * X * ( Vc * Y + VDD * tauo_n[ n ] ) + \
Vs_n[ n ] * X * Y * ( VDD * ( taui_n[ n ] + tauo_n[ n ] ) - A_1_n[ n ] * Vc * Y ) + \
Vs_n[ n ] * Vs_n[ n ] * taui_n[ n ] * Y * Y ) ) / \
( ( VDD * X + Vs_n[ n ] * Y ) * ( VDD * X + Vs_n[ n ] * Y ) );
c = -( X * X * ( 2 * Y * Y * Vc * TECH.Vtn0 + 2 * Y * VDD * Vc * t0_n[ n ] + VDD * VDD * tauo_n[ n ] * tauo_n[ n ] ) + \
2 * Vs_n[ n ] * taui_n[ n ] * X * Y * ( VDD * tauo_n[ n ] - A_1_n[ n ] * Vc * Y ) + \
Vs_n[ n ] * Vs_n[ n ] * taui_n[ n ] * taui_n[ n ] * Y * Y ) / \
( ( VDD * X + Vs_n[ n ] * Y ) * ( VDD * X + Vs_n[ n ] * Y ) );
if ( ( b * b + 4 * c ) >= 0 )
{
ts_n[ n ] = -( sqrt ( b * b + 4 * c ) + b ) * 0.5;
if ( ( ts_n[ n ] < 0 ) || ( ts_n[ n ] < t0_n[ n ] ) || ( ts_n[ n ] > tauo_n[ n ] ) )
ts_n[ n ] = ( sqrt ( b * b + 4 * c ) - b ) * 0.5;
if ( ( ts_n[ n ] < 0 ) || ( ts_n[ n ] < t0_n[ n ] ) || ( ts_n[ n ] > tauo_n[ n ] ) )
ts_n[ n ] = ( A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
( t0_n[ n ] - taui_n[ n ] ) * ( VDD * t0_n[ n ] + TECH.Vtn0 * ( tauo_n[ n ] - t0_n[ n ] ) ) ) / \
( A_1_n[ n ] * Vs_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
VDD * ( t0_n[ n ] - taui_n[ n ] ) );
}
else
{
ts_n[ n ] = ( A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
( t0_n[ n ] - taui_n[ n ] ) * ( VDD * t0_n[ n ] + TECH.Vtn0 * ( tauo_n[ n ] - t0_n[ n ] ) ) ) / \
( A_1_n[ n ] * Vs_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
VDD * ( t0_n[ n ] - taui_n[ n ] ) );
if ( ts_n[ n ] > tauo_n[ n ] )
ts_n[ n ] = tc;
}
#else
ts_n[ n ] = ( A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
( t0_n[ n ] - taui_n[ n ] ) * ( VDD * t0_n[ n ] + TECH.Vtn0 * ( tauo_n[ n ] - t0_n[ n ] ) ) ) / \
( A_1_n[ n ] * Vs_n[ n ] * ( t0_n[ n ] - tauo_n[ n ] ) + \
VDD * ( t0_n[ n ] - taui_n[ n ] ) );
if ( ts_n[ n ] > tauo_n[ n ] )
ts_n[ n ] = tc;
#endif
}
}
else if ( ( tauo_n[ n ] <= taui_n[ n ] ) &&
( tc <= ts_n[ n ] ) &&
( t0_n[ n ] <= ts_n[ n ] ) )
{
OpCondition = SaveOpCondition = _D_;
if ( OpCondition != LastOpCondition )

{
ts_n[ n ] = tc;
}
}
else
{
OpCondition = _E_;
}
}
return OpCondition;
}
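
In the SAT branches of Calct0tsnN the saturation time is obtained from a quadratic of the form x^2 + b x - c = 0. The code tries the smaller root first, then the larger one, and finally reverts to the linear expression when neither root is admissible or the discriminant is negative. A minimal sketch of that selection rule follows. The function name `pickRoot` and its parameters are hypothetical; `lo` and `hi` play the role of t0_n[ n ] and tauo_n[ n ], and `fallback` stands for the value of the linear formula.

```cpp
#include <cmath>

// Hypothetical helper: chooses a root of x*x + b*x - c = 0 lying in
// [lo, hi] and non-negative, or returns the fallback value.
double pickRoot(double b, double c, double lo, double hi, double fallback)
{
    const double disc = b * b + 4.0 * c;
    if (disc < 0.0)
        return fallback;                 // no real root
    const double root = std::sqrt(disc);
    double x = -(root + b) * 0.5;        // smaller root first
    if (x < 0.0 || x < lo || x > hi)
        x = (root - b) * 0.5;            // then the larger root
    if (x < 0.0 || x < lo || x > hi)
        x = fallback;                    // neither root is admissible
    return x;
}
```

For example, with b = -5 and c = -6 the roots are 2 and 3, so an admissible interval of [2.5, 4] makes the helper skip the first root and return the second.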

Calctst0P.cc

#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

/*
 * t0: A=AA=F B=G C D
 * ts: A=F B=D=G AA=C
 */
int Fast::Calct0ts1P( const Circuit& circuit, unsigned int NP, int& SaveOpCondition, const double* NewWidth )
{
int OpCondition = _A_, LastOpCondition = 0;
double t0_bs, Vc, A_2_p, B_2_p, C_2_p, D_2_p, G_2_p, J_2_p, K_2_p, M_2_p;
double b, c, Cm1, Cm2, Cov, Cj, X, XX, Y, alpha, gamma, theta;

t0_bs = -TECH.Vtp0 * taui_p[ 1 ] / VDD;
if ( taui_p[ 1 ] <= tauo_p[ 1 ] )
{
ts_p[ 1 ] = ( taui_p[ 1 ] - t0_p[ 1 ] ) * 0.5 + t0_p[ 1 ]; /* A */
}
else
{
ts_p[ 1 ] = ( tauo_p[ 1 ] - t0_p[ 1 ] ) * 0.5 + t0_p[ 1 ]; /* F */
}
Vc = TECH.Ec_p * L_p[ 1 ];
Cov = TECH.Cgd0_p * ( W_p[ 1 ] + TECH.XW_p );
Cm1 = Cov;
Cm2 = Cm1 + 0.5 * TECH.Cox_p * W_p[ 1 ] * L_p[ 1 ];
unsigned int nn = pathlist[ NP ].GetNumTranN();
double Wjn, Wgn, Wjp, Wgp;
int node;
int njn, ngn, njp, ngp, nc;
if (nn > 0)
{
const char* name = pathlist[ NP ].TransistorName( nn - 1 );
if ( circuit[ name ].Source() == 0 )
{
node = circuit[ name ].Drain();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
else if ( circuit[ name ].Drain() == 0 )
{
node = circuit[ name ].Source();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
// Nmos
Cj = TECH.C_nj * Wjn * TECH.Df * pow ( 1 + Vd_p[ 1 ] / TECH.PB_n, -TECH.mj_n ) + \
TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * pow ( 1 + Vd_p[ 1 ] / TECH.PB_n, -TECH.mjsw_n );
Cj += TECH.Cgs0_n * ( Wjn + njn * TECH.XW_n );
}
// Pmos
Cj += TECH.C_pj * Wjp * TECH.Df * pow ( 1 + ( VDD - Vd_p[ 1 ] ) / TECH.PB_p, -TECH.mj_p ) + \
TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * pow ( 1 + ( VDD - Vd_p[ 1 ] ) / TECH.PB_p, -TECH.mjsw_p );
Cj += TECH.Cgd0_p * ( Wjp + njp * TECH.XW_p );
// static
Cj += circuit.CapStaticGnd( node, nc );
Cj += circuit.CapStaticVdd( node, nc );
Cj += Wgn * TECH.Lmin * TECH.Cox_n;
Cj += Wgp * TECH.Lmin * TECH.Cox_p;
A_2_p = Vc * beta_p[ 1 ] * ( Vc + TECH.Vtp0 );
B_2_p = VDD * Vc * beta_p[ 1 ] / taui_p[ 1 ];
C_2_p = ( Vc + 2 * TECH.Vtp0 ) / Vc;
D_2_p = 2 * VDD / ( Vc * taui_p[ 1 ] );
G_2_p = Vc * Vc * beta_p[ 1 ] * SQRT ( ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) / Vc ) - \
VDD * Vc * beta_p[ 1 ] - Vc * Vc * beta_p[ 1 ] - Vc * TECH.Vtp0 * beta_p[ 1 ];
K_2_p = VDD * Vc * beta_p[ 1 ] * ( Vd_p[ 1 ] - VDD ) / ( taui_p[ 1 ] * ( VDD + Vc - Vd_p[ 1 ] ) );
J_2_p = Vc * beta_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] - 2 * TECH.Vtp0 ) / ( 2 * ( VDD + Vc - Vd_p[ 1 ] ) );
M_2_p = Vc * beta_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * ( VDD + Vd_p[ 1 ] + 2 * TECH.Vtp0 ) / ( 2 * ( Vd_p[ 1 ] - Vc ) );
X = Cj + Cm2;
Y = Cj + Cov;
alpha = pow( ( D_2_p * t0_bs + C_2_p ), 1.5 );
gamma = pow( ( D_2_p * taui_p[ 1 ] + C_2_p ), 1.5 );
while ( OpCondition != LastOpCondition )
{
if ( LastOpCondition != 0 )
LastOpCondition = OpCondition;
else
LastOpCondition = _E_;
if ( ( t0_p[ 1 ] <= ts_p[ 1 ] ) &&
( ts_p[ 1 ] <= taui_p[ 1 ] ) &&
( taui_p[ 1 ] <= tauo_p[ 1 ] ) )
{
OpCondition = SaveOpCondition = _A_;
if ( OpCondition != LastOpCondition )
{
XX = t0_p[ 1 ] - tauo_p[ 1 ];
#ifdef SAT

b = ( 2 * ( VDD * VDD * taui_p[ 1 ] * tauo_p[ 1 ] + VDD * ( Vc * ( XX - taui_p[ 1 ] ) * XX - \
2 * Vd_p[ 1 ] * taui_p[ 1 ] * tauo_p[ 1 ] ) + \
Vd_p[ 1 ] * taui_p[ 1 ] * ( Vc * XX + Vd_p[ 1 ] * tauo_p[ 1 ] ) ) ) / \
( taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
c = ( VDD * VDD * tauo_p[ 1 ] * tauo_p[ 1 ] - 2 * VDD * tauo_p[ 1 ] * ( Vc * XX + Vd_p[ 1 ] * tauo_p[ 1 ] ) + \
2 * Vc * XX * ( Vd_p[ 1 ] * tauo_p[ 1 ] - TECH.Vtp0 * XX ) + Vd_p[ 1 ] * tauo_p[ 1 ] * Vd_p[ 1 ] * tauo_p[ 1 ] ) /
( ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
if ( ( b * b - 4 * c ) >= 0 )
{
ts_p[ 1 ] = ( b - sqrt ( b * b - 4 * c ) ) * 0.5;
if ( ts_p[ 1 ] < 0 )
ts_p[ 1 ] = ( b + sqrt ( b * b - 4 * c ) ) * 0.5;
}
else
ts_p[ 1 ] = -taui_p[ 1 ] * ( ( VDD - Vd_p[ 1 ] ) * tauo_p[ 1 ] + \
TECH.Vtp0 * XX ) / \
( VDD * ( XX - taui_p[ 1 ] ) + \
Vd_p[ 1 ] * taui_p[ 1 ] );
#else
ts_p[ 1 ] = -taui_p[ 1 ] * ( ( VDD - Vd_p[ 1 ] ) * tauo_p[ 1 ] + \
TECH.Vtp0 * XX ) / \
( VDD * ( XX - taui_p[ 1 ] ) + \
Vd_p[ 1 ] * taui_p[ 1 ] );
#endif
}
}
else if ( ( t0_p[ 1 ] <= taui_p[ 1 ] ) &&


( taui_p[ 1 ] <= ts_p[ 1 ] ) &&


( ts_p[ 1 ] <= tauo_p[ 1 ] ) )
{
OpCondition = SaveOpCondition = _AA_;
if ( OpCondition != LastOpCondition )
{
#ifdef SAT
ts_p[ 1 ] = tauo_p[ 1 ] - ( ( tauo_p[ 1 ] - t0_p[ 1 ] ) * \
( sqrt ( Vc * ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) ) - Vc ) ) / \
( VDD - Vd_p[ 1 ] );
#else
ts_p[ 1 ] = ( VDD * t0_p[ 1 ] - Vd_p[ 1 ] * tauo_p[ 1 ] + TECH.Vtp0 * ( t0_p[ 1 ] - tauo_p[ 1 ] ) ) / \
( VDD - Vd_p[ 1 ] );
#endif
}
}
else if ( ( ts_p[ 1 ] <= t0_p[ 1 ] ) &&
( t0_p[ 1 ] <= taui_p[ 1 ] ) &&
( taui_p[ 1 ] <= tauo_p[ 1 ] ) )
{
OpCondition = SaveOpCondition = _B_;
if ( OpCondition != LastOpCondition )
{
#ifdef SAT
ts_p[ 1 ] = taui_p[ 1 ] * ( VDD * VDD + 2 * VDD * ( Vc - Vd_p[ 1 ] ) - 2 * Vc * ( Vd_p[ 1 ] + TECH.Vtp0 ) + \
Vd_p[ 1 ] * Vd_p[ 1 ] ) / ( 2 * VDD * Vc );
#else
ts_p[ 1 ] = taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] - TECH.Vtp0 ) / VDD;
#endif
theta = pow( ( D_2_p * ts_p[ 1 ] + C_2_p ), 1.5 );
b = 2 * ( Cm2 * VDD + J_2_p * taui_p[ 1 ] ) / \
( K_2_p * taui_p[ 1 ] );
c = ( 6 * A_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs - ts_p[ 1 ] ) + \
3 * B_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs * t0_bs - ts_p[ 1 ] * ts_p[ 1 ] ) - \
4 * Vc * Vc * beta_p[ 1 ] * taui_p[ 1 ] * alpha + \
4 * Vc * Vc * beta_p[ 1 ] * taui_p[ 1 ] * theta - \
3 * D_2_p * ts_p[ 1 ] * \
( 2 * Cm2 * VDD - 2 * Cov * VDD + taui_p[ 1 ] * \
( 2 * J_2_p + K_2_p * ts_p[ 1 ] ) ) ) / \
( 3 * D_2_p * K_2_p * taui_p[ 1 ] );
if ( ( b * b - 4 * c ) >= 0 )
{
t0_p[ 1 ] = ( sqrt ( ( b * b ) - 4 * c ) - b ) * 0.5;
if ( t0_p[ 1 ] < 0 )
t0_p[ 1 ] = -( sqrt ( ( b * b ) - 4 * c ) + b ) * 0.5;
}
else
t0_p[ 1 ] = t0_bs;
}
}
else if ( ( taui_p[ 1 ] <= t0_p[ 1 ] ) &&
( t0_p[ 1 ] <= ts_p[ 1 ] ) &&
( ts_p[ 1 ] <= tauo_p[ 1 ] ) )
{
OpCondition = SaveOpCondition = _C_;
if ( OpCondition != LastOpCondition )
{
t0_p[ 1 ] = -( 6 * A_2_p * D_2_p * ( t0_bs - taui_p[ 1 ] ) + \
3 * B_2_p * D_2_p * ( t0_bs * t0_bs - taui_p[ 1 ] * taui_p[ 1 ] ) - \
2 * ( 2 * Vc * Vc * beta_p[ 1 ] * ( alpha - gamma ) - \
3 * D_2_p * (Cov * VDD - G_2_p * taui_p[ 1 ]))) / \
( 6 * D_2_p * G_2_p );
#ifdef SAT
ts_p[ 1 ] = tauo_p[ 1 ] - ( ( tauo_p[ 1 ] - t0_p[ 1 ] ) * \
( sqrt ( Vc * ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) ) - Vc ) ) / \
( VDD - Vd_p[ 1 ] );
#else
ts_p[ 1 ] = ( VDD * t0_p[ 1 ] - Vd_p[ 1 ] * tauo_p[ 1 ] + TECH.Vtp0 * ( t0_p[ 1 ] - tauo_p[ 1 ] ) ) / \


( VDD - Vd_p[ 1 ] );
#endif
}
}
else if ( ( ts_p[ 1 ] <= taui_p[ 1 ] ) &&
( taui_p[ 1 ] <= t0_p[ 1 ] ) &&
( t0_p[ 1 ] <= tauo_p[ 1 ] ) )
{
OpCondition = SaveOpCondition = _D_;
if ( OpCondition != LastOpCondition )
{
#ifdef SAT
ts_p[ 1 ] = taui_p[ 1 ] * ( VDD * VDD + 2 * VDD * ( Vc - Vd_p[ 1 ] ) - 2 * Vc * ( Vd_p[ 1 ] + TECH.Vtp0 ) + \
( Vd_p[ 1 ] * Vd_p[ 1 ] ) ) / ( 2 * VDD * Vc );
#else
ts_p[ 1 ] = taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] + TECH.Vtp0 ) / VDD;
#endif
theta = pow( ( D_2_p * ts_p[ 1 ] + C_2_p ), 1.5 );
t0_p[ 1 ] = -( 6 * A_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs - ts_p[ 1 ] ) + \
3 * B_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs * t0_bs - ts_p[ 1 ] * ts_p[1]) - \
4 * Vc * Vc * beta_p[ 1 ] * ( alpha - theta ) - \
3 * D_2_p * (2 * Cm2 * VDD * (ts_p[1] - taui_p[1]) - \
2 * Cov * VDD * ts_p[1] + \
taui_p[1] * (2 * J_2_p * (ts_p[1] - taui_p[1]) + \
K_2_p * (ts_p[1] * ts_p[1] - taui_p[1] * taui_p[1]) + \
2 * M_2_p * taui_p[1] ))) / \
( 6 * D_2_p * M_2_p * taui_p[ 1 ] );
}
}
else if ( ( t0_p[ 1 ] <= ts_p[ 1 ] ) &&
( ts_p[ 1 ] <= tauo_p[ 1 ] ) &&
( tauo_p[ 1 ] <= taui_p[ 1 ] ) )
{
OpCondition = SaveOpCondition = _F_;
if ( OpCondition != LastOpCondition )
{
X = t0_p[ 1 ] - tauo_p[ 1 ];
#ifdef SAT

b = ( 2 * ( VDD * VDD * taui_p[ 1 ] * tauo_p[ 1 ] + VDD * ( Vc * ( X - taui_p[ 1 ] ) * X - \


2 * Vd_p[ 1 ] * taui_p[ 1 ] * tauo_p[ 1 ] ) + \
Vd_p[ 1 ] * taui_p[ 1 ] * ( Vc * X + Vd_p[ 1 ] * tauo_p[ 1 ] ) ) ) / \
( taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
c = ( VDD * VDD * tauo_p[ 1 ] * tauo_p[ 1 ] - 2 * VDD * tauo_p[ 1 ] * ( Vc * X + Vd_p[ 1 ] * tauo_p[ 1 ] ) + \
2 * Vc * X * ( Vd_p[ 1 ] * tauo_p[ 1 ] - TECH.Vtp0 * X ) + Vd_p[ 1 ] * tauo_p[ 1 ] * Vd_p[ 1 ] * tauo_p[ 1 ] ) / \
( ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
if ( ( b * b - 4 * c ) >= 0 )
{
ts_p[ 1 ] = ( b - sqrt ( b * b - 4 * c ) ) * 0.5;
if ( ts_p[ 1 ] < 0 )
ts_p[ 1 ] = ( b + sqrt ( b * b - 4 * c ) ) * 0.5;
}
else
ts_p[ 1 ] = -taui_p[ 1 ] * ( ( VDD - Vd_p[ 1 ] ) * tauo_p[ 1 ] + \
TECH.Vtp0 * X ) / \
( VDD * ( X - taui_p[ 1 ] ) + \
Vd_p[ 1 ] * taui_p[ 1 ] );
#else
ts_p[ 1 ] = -taui_p[ 1 ] * ( ( VDD - Vd_p[ 1 ] ) * tauo_p[ 1 ] + \
TECH.Vtp0 * X ) / \
( VDD * ( X - taui_p[ 1 ] ) + \
Vd_p[ 1 ] * taui_p[ 1 ] );
#endif
}
}
else if ( ( ts_p[ 1 ] <= t0_p[ 1 ] ) &&
( t0_p[ 1 ] <= tauo_p[ 1 ] ) &&
( tauo_p[ 1 ] <= taui_p[ 1 ] ) )


{
OpCondition = SaveOpCondition = _G_;
if ( OpCondition != LastOpCondition )
{
#ifdef SAT
ts_p[ 1 ] = taui_p[ 1 ] * ( VDD * VDD + 2 * VDD * ( Vc - Vd_p[ 1 ] ) - 2 * Vc * ( Vd_p[ 1 ] + TECH.Vtp0 ) + \
Vd_p[ 1 ] * Vd_p[ 1 ] ) / ( 2 * VDD * Vc );
#else
ts_p[ 1 ] = taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] - TECH.Vtp0 ) / VDD;
#endif
theta = pow( ( D_2_p * ts_p[ 1 ] + C_2_p ), 1.5 );
b = 2 * ( Cm2 * VDD + J_2_p * taui_p[ 1 ] ) / \
( K_2_p * taui_p[ 1 ] );
c = ( 6 * A_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs - ts_p[ 1 ] ) + \
3 * B_2_p * D_2_p * taui_p[ 1 ] * ( t0_bs * t0_bs - ts_p[ 1 ] * ts_p[ 1 ] ) - \
4 * Vc * Vc * beta_p[ 1 ] * taui_p[ 1 ] * alpha + \
4 * Vc * Vc * beta_p[ 1 ] * taui_p[ 1 ] * theta - \
3 * D_2_p * ts_p[ 1 ] * \
( 2 * Cm2 * VDD - 2 * Cov * VDD + taui_p[ 1 ] * \
( 2 * J_2_p + K_2_p * ts_p[ 1 ] ) ) ) / \
( 3 * D_2_p * K_2_p * taui_p[ 1 ] );
if ( ( b * b - 4 * c ) >= 0 )
{
t0_p[ 1 ] = ( sqrt ( ( b * b ) - 4 * c ) - b ) * 0.5;
if ( t0_p[ 1 ] < 0 )
t0_p[ 1 ] = -( sqrt ( ( b * b ) - 4 * c ) + b ) * 0.5;
}
else
t0_p[ 1 ] = t0_bs;

}
else
{
OpCondition = _E_;
}
}
}
return OpCondition;
}
///
int Fast::Calct0tsnP( unsigned int p, int& SaveOpCondition )
{
int OpCondition = _A_, LastOpCondition = 0;
double Vc, tc, X, Y, H, K, det, alpha, beta;
Vc = TECH.Ec_p * L_p[ p ];
ts_p[ p ] = ( A_1_p[ p ] * ( t0_p[ p ] - tauo_p[ p ] ) * ( VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ] ) + \
( t0_p[ p ] - taui_p[ p ] ) * ( B_1_p[ p ] * ( t0_p[ p ] - tauo_p[ p ] ) + \
VDD * t0_p[ p ] - Vd_p[ p ] * tauo_p[ p ] ) ) / \
( A_1_p[ p ] * ( VDD - Vs_p[ p ] ) * ( t0_p[ p ] - tauo_p[ p ] ) + \
( VDD - Vd_p[ p ] ) * ( t0_p[ p ] - taui_p[ p ] ) );
if ( taui_p[ p ] < tauo_p[ p ] )
{
SaveOpCondition = _A_;
}
else
{
tc = ( VDD * tauo_p[ p ] * ( t0_p[ p ] - taui_p[ p ] ) + \
Vs_p[ p ] * taui_p[ p ] * ( tauo_p[ p ] - t0_p[ p ] ) ) / \
( VDD * ( t0_p[ p ] - taui_p[ p ] ) + Vs_p[ p ] * ( tauo_p[ p ] - t0_p[ p ] ) );
SaveOpCondition = _C_;
}
X = t0_p[ p ] - taui_p[ p ];
Y = tauo_p[ p ] - t0_p[ p ];
alpha = VDD - Vs_p[ p ];
beta = VDD - Vd_p[ p ];
while ( OpCondition != LastOpCondition )
{
if ( LastOpCondition != 0 )


LastOpCondition = OpCondition;
else
LastOpCondition = _E_;
if ( ( taui_p[ p ] <= tauo_p[ p ] ) &&
( ts_p[ p ] <= taui_p[ p ] ) &&
( t0_p[ p ] <= ts_p[ p ] ) )
{
OpCondition = SaveOpCondition = _A_;
if ( OpCondition != LastOpCondition )
{
#ifdef SAT
double AAA, BBB;
AAA = ( X * beta + Y * alpha ) / ( X * Y );
BBB = -( VDD * t0_p[ p ] * ( X + Y ) - Vd_p[ p ] * X * tauo_p[ p ] - Vs_p[ p ] * Y * taui_p[ p ] ) / ( X * Y );
AAA /= Vc;
BBB /= Vc;
BBB -= 1.0;
H = 2 * ( A_1_p[ p ] + 1 ) * ( Vs_p[ p ] - VDD ) / ( Vc * X );
K = ( 2 * A_1_p[ p ] * ( VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ] ) + \
2 * B_1_p[ p ] * X + 2 * VDD * t0_p[ p ] + Vc * X - 2 * Vs_p[ p ] * taui_p[ p ] ) / \
( Vc * X );
det = 4 * K * AAA * AAA - 4 * H * AAA * BBB + H * H;
if ( det >= 0 )
{
ts_p[ p ] = ( SQRT ( det ) - 2 * AAA * BBB + H ) / ( 2 * AAA * AAA );
if ( ( ts_p[ p ] < 0 ) || ( ts_p[ p ] < t0_p[ p ] ) || ( ts_p[ p ] > tauo_p[ p ] ) )
ts_p[ p ] = -( SQRT ( det ) + 2 * AAA * BBB - H ) / ( 2 * AAA * AAA );
}
else
ts_p[ p ] = ( A_1_p[ p ] * ( t0_p[ p ] - tauo_p[ p ] ) * ( VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ] ) + \
( t0_p[ p ] - taui_p[ p ] ) * ( B_1_p[ p ] * ( t0_p[ p ] - tauo_p[ p ] ) + \
VDD * t0_p[ p ] - Vd_p[ p ] * tauo_p[ p ] ) ) / \
( A_1_p[ p ] * ( VDD - Vs_p[ p ] ) * ( t0_p[ p ] - tauo_p[ p ] ) + \
( VDD - Vd_p[ p ] ) * ( t0_p[ p ] - taui_p[ p ] ) );
#else
ts_p[ p ] = ( A_1_p[ p ] * ( t0_p[ p ] - tauo_p[ p ] ) * ( VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ] ) + \
( t0_p[ p ] - taui_p[ p ] ) * ( B_1_p[ p ] * ( t0_p[ p ] - tauo_p[ p ] ) + \
VDD * t0_p[ p ] - Vd_p[ p ] * tauo_p[ p ] ) ) / \
( A_1_p[ p ] * ( VDD - Vs_p[ p ] ) * ( t0_p[ p ] - tauo_p[ p ] ) + \
( VDD - Vd_p[ p ] ) * ( t0_p[ p ] - taui_p[ p ] ) );
#endif
}
}
else if ( ( ts_p[ p ] <= tauo_p[ p ] ) &&
( taui_p[ p ] <= ts_p[ p ] ) &&
( t0_p[ p ] <= taui_p[ p ] ) )
{
OpCondition = SaveOpCondition = _B_;
if ( OpCondition != LastOpCondition )
{
#ifdef SAT
ts_p[ p ] = tauo_p[ p ] + ( ( t0_p[ p ] - tauo_p[ p ] ) * \
( sqrt ( Vc * ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) ) - Vc ) ) / ( VDD - Vd_p[ p ] );
#else
ts_p[ p ] = ( VDD * t0_p[ p ] - Vd_p[ p ] * tauo_p[ p ] + \
TECH.Vtp0 * ( t0_p[ p ] - tauo_p[ p ] ) ) / \
( VDD - Vd_p[ p ] );
#endif
}
}
else if ( ( tauo_p[ p ] <= taui_p[ p ] ) &&
( ts_p[ p ] < tc ) &&
( t0_p[ p ] <= ts_p[ p ] ) )
{

// ( ts_p[ p ] < tc ), not <= !!!


Calctst0P.cc

OpCondition = SaveOpCondition = _C_;


if ( OpCondition != LastOpCondition )
{
#ifdef SAT
double AAA, BBB;
AAA = ( X * beta + Y * alpha ) / ( X * Y );
BBB = -( VDD * t0_p[ p ] * ( X + Y ) - Vd_p[ p ] * X * tauo_p[ p ] - Vs_p[ p ] * Y * taui_p[ p ] ) / ( X * Y );
AAA /= Vc;
BBB /= Vc;
BBB -= 1.0;
H = 2 * ( A_1_p[ p ] + 1 ) * ( Vs_p[ p ] - VDD ) / ( Vc * X );
K = ( 2 * A_1_p[ p ] * ( VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ] ) + \
2 * B_1_p[ p ] * X + 2 * VDD * t0_p[ p ] + Vc * X - 2 * Vs_p[ p ] * taui_p[ p ] ) / \
( Vc * X );
det = 4 * K * AAA * AAA - 4 * H * AAA * BBB + H * H;
if ( det >= 0 )
{
ts_p[ p ] = ( SQRT ( det ) - 2 * AAA * BBB + H ) / ( 2 * AAA * AAA );
if ( ( ts_p[ p ] < 0 ) || ( ts_p[ p ] < t0_p[ p ] ) || ( ts_p[ p ] > tauo_p[ p ] ) )
ts_p[ p ] = -( SQRT ( det ) + 2 * AAA * BBB - H ) / ( 2 * AAA * AAA );
}
else
ts_p[ p ] = ( A_1_p[ p ] * ( t0_p[ p ] - tauo_p[ p ] ) * ( VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ] ) + \
( t0_p[ p ] - taui_p[ p ] ) * ( B_1_p[ p ] * ( t0_p[ p ] - tauo_p[ p ] ) + \
VDD * t0_p[ p ] - Vd_p[ p ] * tauo_p[ p ] ) ) / \
( A_1_p[ p ] * ( VDD - Vs_p[ p ] ) * ( t0_p[ p ] - tauo_p[ p ] ) + \
( VDD - Vd_p[ p ] ) * ( t0_p[ p ] - taui_p[ p ] ) );
#else
ts_p[ p ] = ( A_1_p[ p ] * ( t0_p[ p ] - tauo_p[ p ] ) * ( VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ] ) + \
( t0_p[ p ] - taui_p[ p ] ) * ( B_1_p[ p ] * ( t0_p[ p ] - tauo_p[ p ] ) + \
VDD * t0_p[ p ] - Vd_p[ p ] * tauo_p[ p ] ) ) / \
( A_1_p[ p ] * ( VDD - Vs_p[ p ] ) * ( t0_p[ p ] - tauo_p[ p ] ) + \
( VDD - Vd_p[ p ] ) * ( t0_p[ p ] - taui_p[ p ] ) );
#endif
}
}
else if ( ( tauo_p[ p ] <= taui_p[ p ] ) &&
( tc <= ts_p[ p ] ) &&
( t0_p[ p ] <= ts_p[ p ] ) )
{
OpCondition = SaveOpCondition = _D_;
if ( OpCondition != LastOpCondition )
{
ts_p[ p ] = tc;
}
}
else
{
OpCondition = _E_;
}
}
return OpCondition;
}

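Several branches of Calct0ts1P obtain ts_p[ 1 ] as a root of a monic quadratic x*x - b*x + c = 0: the smaller root is tried first, the larger one is used if the smaller is negative, and a linear estimate is used when the discriminant is negative. The sketch below isolates that selection rule. The name PickRoot and its fallback parameter are hypothetical and do not appear in the original sources; this is an illustration of the pattern, not the author's implementation.

```cpp
#include <cmath>

// Hypothetical helper, not part of the original sources.
// Solves x*x - b*x + c = 0. Returns the smaller real root, or the larger
// one when the smaller is negative. When no real root exists, returns the
// caller-supplied fallback (the listings use a linear estimate there).
double PickRoot( double b, double c, double fallback )
{
    const double det = b * b - 4.0 * c;
    if ( det < 0.0 )
        return fallback;
    const double root = std::sqrt( det );
    double x = ( b - root ) * 0.5;
    if ( x < 0.0 )
        x = ( b + root ) * 0.5;
    return x;
}
```

With such a helper, each branch would only need to compute b, c and its own linear estimate before calling it once.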
Delay.cc

#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
double Fast::CalcDelay( const Circuit& circuit,
unsigned int NP,
unsigned int NC,
unsigned int n,
unsigned int p,
const double* NewWidth,
double tin,
TransitionType TOut,
int& RetCode )
{
double t0_bs;
for ( unsigned int i = 1; i <= n; i++ )
{
tauo_n[ i ] = tin;
}
for ( unsigned int i = 1; i <= p; i++ )
{
tauo_p[ i ] = tin;
}
taui_n[ 1 ] = tin;
taui_p[ 1 ] = tin;
switch ( TOut )
{
case FALL:
// n chain
t0_bs = TECH.Vtn0 * tin / VDD;
t0_n[ 1 ] = CalcStartTime( circuit, NP, NC, t0_bs, tauo_n[ 1 ], NMOS, NewWidth, RetCode );
if ( RetCode != OK )
t0_n[ 1 ] = t0_bs;
//return 0.0;
RetCode = IterSol( circuit, NMOS, NP, NC, n, p, NewWidth );
if ( RetCode != OK )
return 0.0;
else
return ( tauo_n[ n ] + t0_n[ n ] - tin );
break;
case RISE:
// p chain
t0_bs = -TECH.Vtp0 * tin / VDD;
t0_p[ 1 ] = CalcStartTime( circuit, NP, NC, t0_bs, tauo_p[ 1 ], PMOS, NewWidth, RetCode );
if ( RetCode != OK )
t0_p[ 1 ] = t0_bs;
//return 0.0;
RetCode = IterSol( circuit, PMOS, NP, NC, n, p, NewWidth );
if ( RetCode != OK )
return 0.0;
else
return ( tauo_p[ p ] + t0_p[ p ] - tin );
break;
case NOTRANSITION:
default:
break;
}
return 0.0;
// never get here
}

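CalcDelay seeds the start time with t0_bs = Vt * tin / VDD (with the sign flipped for the PMOS threshold). One way to read this, assuming the input is a linear ramp from 0 to VDD that lasts tin, is as the instant at which the ramp reaches the threshold magnitude. The sketch below expresses that reading; the name RampThresholdTime is hypothetical and the ramp interpretation is an assumption, not something this listing states.

```cpp
// Hypothetical helper, not part of the original sources.
// Assumes a linear input ramp from 0 to vdd lasting tin and returns the
// time at which it reaches |vt|. Passing a negative vt (PMOS convention)
// gives the same result as the explicit sign flip in the listing.
double RampThresholdTime( double vt, double tin, double vdd )
{
    const double mag = ( vt < 0.0 ) ? -vt : vt;
    return mag * tin / vdd;
}
```

For example, a threshold of 0.5 V on a 2 V supply with a ramp lasting 4 time units gives a start time of 1 time unit.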
EqN.cc

#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
double Fast::EqN( const Circuit& circuit, unsigned int NP, unsigned int NC, double x, int& RetCode, unsigned int j, unsigned int n, unsi
{
int SaveOpCondition = _E_, OpCondition_1, OpCondition_i, OpCondition_n;
double y, C_n, Cov, Cgd1;

RetCode = OK;


double* FY = new double[n + 1];


tauo_n[ j ] = x;
double t0_bs = TECH.Vtn0 / VDD * taui_n[ 1 ];
double tc = (VDD * tauo_n[n] * (t0_n[n] - taui_n[n]) + \
Vs_n[n] * taui_n[n] * (tauo_n[n] - t0_n[n])) / \
(VDD * (t0_n[n] - taui_n[n]) + Vs_n[n] * (tauo_n[n] - t0_n[n]));
// tc cross-time : Vd = Vs
if ( j == 1 )
{
OpCondition_1 = Calct0ts1N( circuit, NP, SaveOpCondition, NewWidth );
if ( ( OpCondition_1 == _E_ ) || ( t0_n[ 1 ] < t0_bs ) )
{
RetCode = PARSE_ERROR;
if ( SaveOpCondition == _E_ )
OpCondition_1 = _A_;
else
OpCondition_1 = SaveOpCondition;
}
FY[ 1 ] = FirstEqN( OpCondition_1, tauo_n[ j ] );
}
for ( unsigned int i = 2; i <= n; i++ )
{
t0_n[ i ] = t0_n[ i - 1 ];
taui_n[ i ] = tauo_n[ i - 1 ];
}
// middle equations
if ( ( j > 1 ) && ( j < n ) )
{
if ( taui_n[ j ] <= tauo_n[ j ] )
OpCondition_i = _A_;
else
OpCondition_i = _E_;
if ( OpCondition_i == _E_ )
{
RetCode = PARSE_ERROR;
OpCondition_i = _A_;
}
FY[ j ] = MiddleEqN( OpCondition_i, j, tauo_n[ j ] );
}
// last equation
if ( n > 1 )
{
OpCondition_n = Calct0tsnN( n, SaveOpCondition );
if ( ( OpCondition_n == _E_ ) || ( OpCondition_n == _C_ ) || ( OpCondition_n == _D_ ) )
{
RetCode = PARSE_ERROR;
if ( SaveOpCondition == _E_ )
OpCondition_n = _A_;
else
OpCondition_n = SaveOpCondition;
}
if ( j == n )
FY[ n ] = LastEqN( OpCondition_n, n, tauo_n[ n ] );
}
y = FY[ j ];
// evaluate capacitance at each node
unsigned int node;
// it's the common node used to traverse the path
unsigned int LastNode;
const char* name = pathlist[ NP ].TransistorName( 0, NC );
// first mos in path
int njn, ngn, njp, ngp;
double Wjn, Wgn, Wjp, Wgp;
node = 0;
for ( unsigned int i = j, k = 1; i > 1; i--, k++ )
{
if ( circuit[ name ].Source() == node )
node = circuit[ name ].Drain();
else if ( circuit[ name ].Drain() == node )
node = circuit[ name ].Source();
name = pathlist[ NP ].TransistorName( k , NC);


}
for ( unsigned int i = j; i <= n; i++ )
{
C_n = 0.0;
name = pathlist[ NP ].TransistorName( i - 1 , NC);
if ( circuit[ name ].Source() == node )
{
node = circuit[ name ].Drain();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
else if ( circuit[ name ].Drain() == node )
{
node = circuit[ name ].Source();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
if (i == n)
LastNode = node;
// Common capacitance
int nc;
// dummy
// Cj
C_n += -TECH.C_nj * Wjn * TECH.Df * Vd_n[ i ] * pow ( ( 1 + Vd_n[ i ] / TECH.PB_n ), -TECH.mj_n ) - \
TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * Vd_n[ i ] * pow ( 1 + Vd_n[ i ] / TECH.PB_n, -TECH.mjsw_n );
// static capacitances
C_n += -circuit.CapStaticGnd( node, nc ) * Vd_n[i];
C_n += -circuit.CapStaticVdd( node, nc ) * Vd_n[i];
// gate capacitances
C_n += -Wgn * TECH.Lmin * TECH.Cox_n * Vd_n[i];
C_n += -Wgp * TECH.Lmin * TECH.Cox_p * Vd_n[i];
if ( ( i == 1 ) && ( i < n - 1 ) )
{
// Cgd & Cgs minus first mos
C_n += -( TECH.Cgs0_n * ( ( Wjn - W_n[ 1 ] ) + ( njn - 1 ) * TECH.XW_n ) + \
0.5 * TECH.Cox_n * ( Wjn - W_n[ 1 ] ) * ( njn - 1 ) * TECH.Lmin ) * Vd_n[ i ];
}
else if ( ( i < n - 1 ) && ( i > 1 ) )
{
// all Cgd & Cgs
C_n += -( TECH.Cgs0_n * ( Wjn + njn * TECH.XW_n ) + \
0.5 * TECH.Cox_n * Wjn * njn * TECH.Lmin ) * Vd_n[ i ];
}
else if ( (( i == n ) || ( i == n - 1 )) && (n > 1) )
{
// Cgd & Cgs minus last mos
C_n += -( TECH.Cgs0_n * ( ( Wjn - W_n[ n ] ) + ( njn - 1 ) * TECH.XW_n ) + \
0.5 * TECH.Cox_n * ( Wjn - W_n[ n ] ) * ( njn - 1 ) * TECH.Lmin ) * Vd_n[ i ];
}
// PMOS
C_n += ( TECH.C_pj * Wjp * TECH.Df * ( VDD - Vd_n[ i ] ) * \
pow ( ( 1 + ( VDD - Vd_n[ i ] ) / TECH.PB_p ), -TECH.mj_p ) );
C_n += ( TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * ( VDD - Vd_n[ i ] ) * \
pow ( ( 1 + ( VDD - Vd_n[ i ] ) / TECH.PB_p ), -TECH.mjsw_p ) );
C_n += -VDD * njp * TECH.C_pj * Wjp * TECH.Df * \
pow ( ( 1 + VDD / TECH.PB_p ), -TECH.mj_p );
C_n += -VDD * TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
pow ( ( 1 + VDD / TECH.PB_p ), -TECH.mjsw_p );
// Cgs & Cgd PMOS
C_n += -( TECH.Cgd0_p * ( Wjp + njp * TECH.XW_p ) * Vd_n[ i ] );
// capacitance with voltages
if ( (( i == 1 ) && ( i < n - 1 )) || ((i == 1) && (n == 1)) )
{
// Cgd
Cov = TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n );
Cgd1 = Cov + 0.5 * TECH.Cox_n * W_n[ i ] * L_n[ i ];


switch ( OpCondition_1 )
{
case _A_:
C_n += Cov * ( t0_n[ i ] - ts_n[ i ] ) * \
( VDD * ( t0_n[ i ] - tauo_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
( taui_n[ i ] * ( t0_n[i] - tauo_n[ i ]) );
C_n += Cgd1 * ( VDD * ( t0_n[ i ] - tauo_n[ i ] ) * ( ts_n[ i ] - taui_n[ i ] ) + \
Vd_n[ i ] * taui_n[ i ] * ( tauo_n[ i ] - ts_n[ i ] ) ) / \
( taui_n[ i ] * ( t0_n[i] - tauo_n[ i ]) );
break;
case _AA_:
C_n += Cov * ( VDD * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauo_n[ i ] ) + \
Vd_n[ i ] * taui_n[ i ] * ( ts_n[ i ] - t0_n[ i ] ) ) / \
( taui_n[ i ] * ( t0_n[i] - tauo_n[ i ]) );
C_n += Cgd1 * Vd_n[i] * ( tauo_n[ i ] - ts_n[i] ) / \
( t0_n[i] - tauo_n[ i ]);
break;
case _B_:
C_n += Cgd1 * ( VDD * ( t0_n[ i ] - taui_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
taui_n[ i ];
break;
case _C_:
C_n += Cov * Vd_n[ i ] * ( t0_n[ i ] - ts_n[ i ] ) / \
( tauo_n[ i ] - t0_n[i]);
C_n += Cgd1 * Vd_n[ i ] * ( tauo_n[ i ] - ts_n[i]) / \
( t0_n[ i ] - tauo_n[ i ] );
break;
case _D_:
C_n += -Cgd1 * Vd_n[ i ];
break;
case _F_:
C_n += Cov * ( t0_n[ i ] - ts_n[ i ] ) * \
( VDD * ( t0_n[ i ] - tauo_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
( taui_n[ i ] * ( t0_n[i] - tauo_n[ i ]) );
C_n += Cgd1 * ( ts_n[i] - tauo_n[i]) * \
( VDD * ( t0_n[ i ] - tauo_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
( taui_n[ i ] * ( t0_n[ i ] - tauo_n[ i ] ) );
break;
case _G_:
C_n += Cgd1 * ( VDD * ( t0_n[ i ] - tauo_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
taui_n[ i ];
break;
case _E_:
default:
break;
}
}
else if ( ( i == 1 ) && ( i == n - 1 ) )
{
// Cgd
Cov = TECH.Cgd0_n * ( W_n[ i ] + TECH.XW_n );
Cgd1 = Cov + 0.5 * TECH.Cox_n * W_n[ i ] * L_n[ i ];
switch ( OpCondition_1 )
{
case _A_:
C_n += Cov * ( t0_n[ i ] - ts_n[ i ] ) * \
( VDD * ( t0_n[ i ] - tauo_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
( taui_n[ i ] * ( t0_n[i] - tauo_n[ i ]) );
C_n += Cgd1 * ( VDD * ( t0_n[ i ] - tauo_n[ i ] ) * ( ts_n[ i ] - taui_n[ i ] ) + \
Vd_n[ i ] * taui_n[ i ] * ( tauo_n[ i ] - ts_n[ i ] ) ) / \
( taui_n[ i ] * ( t0_n[i] - tauo_n[ i ]) );
break;
case _AA_:
C_n += Cov * ( VDD * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauo_n[ i ] ) + \
Vd_n[ i ] * taui_n[ i ] * ( ts_n[ i ] - t0_n[ i ] ) ) / \
( taui_n[ i ] * ( t0_n[i] - tauo_n[ i ]) );
C_n += Cgd1 * Vd_n[i] * ( tauo_n[ i ] - ts_n[i] ) / \
( t0_n[i] - tauo_n[ i ]);
break;


case _B_:
C_n += Cgd1 * ( VDD * ( t0_n[ i ] - taui_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) /
taui_n[ i ];
break;
case _C_:
C_n += Cov * Vd_n[ i ] * ( t0_n[ i ] - ts_n[ i ] ) / \
( tauo_n[ i ] - t0_n[i]);
C_n += Cgd1 * Vd_n[ i ] * ( tauo_n[ i ] - ts_n[i]) / \
( t0_n[ i ] - tauo_n[ i ] );
break;
case _D_:
C_n += -Cgd1 * Vd_n[ i ];
break;
case _F_:
C_n += Cov * ( t0_n[ i ] - ts_n[ i ] ) * \
( VDD * ( t0_n[ i ] - tauo_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
( taui_n[ i ] * ( t0_n[i] - tauo_n[ i ]) );
C_n += Cgd1 * ( ts_n[i] - tauo_n[i]) * \
( VDD * ( t0_n[ i ] - tauo_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) / \
( taui_n[ i ] * ( t0_n[ i ] - tauo_n[ i ] ) );
break;
case _G_:
C_n += Cgd1 * ( VDD * ( t0_n[ i ] - tauo_n[ i ] ) - Vd_n[ i ] * taui_n[ i ] ) /
taui_n[ i ];
break;
case _E_:
default:
break;
}
// Cgs
Cov = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
Cgd1 = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + ( 2.0 / 3.0 ) * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
switch ( OpCondition_n )
{
case _A_:
C_n += Cov * \
Vs_n[ n ] * ( t0_n[ i ] - ts_n[ n ] ) / ( taui_n[ n ] - t0_n[ i ] );
C_n += Cgd1 * \
Vs_n[ n ] * ( taui_n[ n ] - ts_n[ n ] ) / ( t0_n[ i ] - taui_n[ n ] );
break;
case _B_:
C_n += -Cov * Vs_n[n];
break;
case _C_:
C_n += Cov * \
Vs_n[ n ] * ( t0_n[ i ] - ts_n[ n ] ) / ( taui_n[ n ] - t0_n[ i ] );
C_n += Cgd1 * \
Vs_n[ n ] * ( tc - ts_n[ n ] ) / ( t0_n[ i ] - taui_n[ n ] );
break;
case _D_:
C_n += Cov * \
Vs_n[ n ] * ( t0_n[ i ] - tc ) / ( taui_n[ n ] - t0_n[ i ] );
break;
case _E_:
default:
break;
}
}
else if ( ( i == n - 1 ) && ( i > 1 ) )
{
// Cgs, the last mos
Cov = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
Cgd1 = ( TECH.Cgs0_n * ( W_n[ n ] + TECH.XW_n ) + ( 2.0 / 3.0 ) * TECH.Cox_n * W_n[ n ] * L_n[ n ] );
switch ( OpCondition_n )
{
case _A_:
C_n += Cov * \
Vs_n[ n ] * ( t0_n[ i ] - ts_n[ n ] ) / ( taui_n[ n ] - t0_n[ i ] );
C_n += Cgd1 * \


Vs_n[ n ] * ( taui_n[ n ] - ts_n[ n ] ) / ( t0_n[ i ] - taui_n[ n ] );


break;
case _B_:
C_n += -Cov * Vs_n[n];
break;
case _C_:
C_n += Cov * \
Vs_n[ n ] * ( t0_n[ i ] - ts_n[ n ] ) / ( taui_n[ n ] - t0_n[ i ] );
C_n += Cgd1 * \
Vs_n[ n ] * ( tc - ts_n[ n ] ) / ( t0_n[ i ] - taui_n[ n ] );
break;
case _D_:
C_n += Cov * \
Vs_n[ n ] * ( t0_n[ i ] - tc ) / ( taui_n[ n ] - t0_n[ i ] );
break;
case _E_:
default:
break;
}
}
else if ( i == n )
{
// Cgd
Cov = TECH.Cgd0_n * ( W_n[ n ] + TECH.XW_n );
Cgd1 = Cov + 0.5 * TECH.Cox_n * W_n[ n ] * L_n[ n ];
switch ( OpCondition_n )
{
case _A_:
case _B_:
C_n += Cov * \
Vd_n[ i ] * ( t0_n[ i ] - ts_n[ i ] ) / ( tauo_n[ i ] - t0_n[ i ] );
C_n += Cgd1 * \
Vd_n[ i ] * ( tauo_n[ i ] - ts_n[ i ] ) / ( t0_n[ i ] - tauo_n[ i ] );
break;
case _C_:
C_n += Cov * \
Vd_n[ i ] * ( t0_n[ i ] - ts_n[ i ] ) / ( tauo_n[ i ] - t0_n[ i ] );
C_n += Cgd1 * \
Vd_n[ i ] * ( tc - ts_n[ i ] ) / ( t0_n[ i ] - tauo_n[ i ] );
break;
case _D_:
C_n += Cov * \
Vd_n[ i ] * ( t0_n[ i ] - tc ) / ( tauo_n[ i ] - t0_n[ i ] );
break;
case _E_:
default:
break;
}
}
y += C_n;
}
// PMOS in chain
node = LastNode;
C_n = 0;
for (unsigned int i = 0; (i < p - 1) && (p > 0); i++)
{
name = pathlist[ NP ].TransistorName( i + n, NC);
if ( circuit[ name ].Source() == node )
{
node = circuit[ name ].Drain();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
else if ( circuit[ name ].Drain() == node )
{
node = circuit[ name ].Source();

Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
C_n += -TECH.C_nj * Wjn * TECH.Df * Vd_n[ n ] * pow ( ( 1 + Vd_n[ n ] / TECH.PB_n ), -TECH.mj_n ) - \
TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * Vd_n[ n ] * pow ( 1 + Vd_n[ n ] / TECH.PB_n, -TECH.mjsw_n );
// static capacitances
int nc;
C_n += -circuit.CapStaticGnd( node, nc ) * Vd_n[n];
C_n += -circuit.CapStaticVdd( node, nc ) * Vd_n[n];
// gate capacitances
C_n += -Wgn * TECH.Lmin * TECH.Cox_n * Vd_n[n];
C_n += -Wgp * TECH.Lmin * TECH.Cox_p * Vd_n[n];
C_n += ( TECH.C_pj * Wjp * TECH.Df * ( VDD - Vd_n[ n ] ) * \
pow ( ( 1 + ( VDD - Vd_n[ n ] ) / TECH.PB_p ), -TECH.mj_p ) );
C_n += ( TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * ( VDD - Vd_n[ n ] ) * \
pow ( ( 1 + ( VDD - Vd_n[ n ] ) / TECH.PB_p ), -TECH.mjsw_p ) );
C_n += -VDD * njp * TECH.C_pj * Wjp * TECH.Df * \
pow ( ( 1 + VDD / TECH.PB_p ), -TECH.mj_p );
C_n += -VDD * TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * \
pow ( ( 1 + VDD / TECH.PB_p ), -TECH.mjsw_p );
// Cgs & Cgd
C_n += -( TECH.Cgd0_p * ( Wjp + njp * TECH.XW_p ) * Vd_n[ n ] );
y += C_n;
}
y -= QpN(n, p);
delete[] FY;
return y;
}

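The gate-source terms in EqN weight the channel capacitance Cox * W * L by 0.5 or by 2/3, depending on the operating condition. In C++ such a fraction has to be written with floating-point operands, because a quotient of two integer literals is evaluated as integer division before any conversion to double takes place. The short sketch below shows the difference; both function names are hypothetical and serve only as an illustration.

```cpp
// Hypothetical helpers, not part of the original sources.

// Integer division happens first: 2 / 3 is 0, so the result is 0.0.
double TwoThirdsFromInts()
{
    return double( 2 / 3 );
}

// Floating-point operands give the intended weight.
double TwoThirdsFromDoubles()
{
    return 2.0 / 3.0;
}
```

A weight computed the first way silently removes the Cox * W * L contribution from the term it multiplies, leaving only the overlap part.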
EqP.cc

#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
double Fast::EqP( const Circuit& circuit, unsigned int NP, unsigned int NC, double x, int& RetCode, unsigned int j, unsigned int n, unsi
{
int SaveOpCondition = _E_, OpCondition_1, OpCondition_i, OpCondition_p;
double y, C_p, Cov, Cgd1;

RetCode = OK;
double* FY = new double[p + 1];
tauo_p[ j ] = x;
double t0_bs = -TECH.Vtp0 / VDD * taui_p[ 1 ];
double tc = (VDD * tauo_p[p] * (t0_p[p] - taui_p[p]) + \
Vs_p[p] * taui_p[p] * (tauo_p[p] - t0_p[p])) / \
(VDD * (t0_p[p] - taui_p[p]) + Vs_p[p] * (tauo_p[p] - t0_p[p]));
// tc cross-time : Vd = Vs
if ( j == 1 )
{
OpCondition_1 = Calct0ts1P( circuit, NP, SaveOpCondition, NewWidth );
if ( ( OpCondition_1 == _E_ ) || ( t0_p[ 1 ] < t0_bs ) )
{
RetCode = PARSE_ERROR;
if ( SaveOpCondition == _E_ )
OpCondition_1 = _A_;
else
OpCondition_1 = SaveOpCondition;
}
FY[ 1 ] = FirstEqP( OpCondition_1, tauo_p[ j ] );
}
for ( unsigned int i = 2; i <= p; i++ )
{


t0_p[ i ] = t0_p[ i - 1 ];
taui_p[ i ] = tauo_p[ i - 1 ];
}
// middle equations
if ( ( j > 1 ) && ( j < p ) )
{
if ( taui_p[ j ] <= tauo_p[ j ] )
OpCondition_i = _A_;
else
OpCondition_i = _E_;
if ( OpCondition_i == _E_ )
{
RetCode = PARSE_ERROR;
OpCondition_i = _A_;
}
FY[ j ] = MiddleEqP( OpCondition_i, j, tauo_p[ j ] );
}
// last equation
if ( p > 1 )
{
OpCondition_p = Calct0tsnP( p, SaveOpCondition );
if ( ( OpCondition_p == _E_ ) || ( OpCondition_p == _C_ ) || ( OpCondition_p == _D_ ) )
{
RetCode = PARSE_ERROR;
if ( SaveOpCondition == _E_ )
OpCondition_p = _A_;
else
OpCondition_p = SaveOpCondition;
}
if ( j == p )
FY[ p ] = LastEqP( OpCondition_p, p, tauo_p[ p ] );
}
y = FY[ j ];
// evaluate capacitance at each node
unsigned int node;
// it's the common node used to traverse the path
unsigned int LastNode;
const char* name = pathlist[ NP ].TransistorName( n + p - 1, NC );
// first pmos in path
int njn, ngn, njp, ngp;
double Wjn, Wgn, Wjp, Wgp;
node = circuit.ValimNode();
for ( unsigned int i = j, k = 1; i > 1; i--, k++ )
{
if ( circuit[ name ].Source() == node )
node = circuit[ name ].Drain();
else if ( circuit[ name ].Drain() == node )
node = circuit[ name ].Source();
name = pathlist[ NP ].TransistorName( n + p - 1 - k, NC );
}
for ( unsigned int i = j; i <= p; i++ )
{
C_p = 0.0;
name = pathlist[ NP ].TransistorName( n + p - i, NC );
// first there are the n nmos,
// then there are the pmos, in REVERSE order
if ( circuit[ name ].Source() == node )
{
node = circuit[ name ].Drain();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
else if ( circuit[ name ].Drain() == node )
{
node = circuit[ name ].Source();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}


if (i == p)
LastNode = node;
// Common capacitance
// dummy
int nc;
// Cj
C_p += TECH.C_pj * Wjp * TECH.Df * ( VDD - Vd_p[ i ]) * pow ( ( 1 + ( VDD - Vd_p[ i ] ) / TECH.PB_p ), -TECH.mj_p );
C_p += TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * ( VDD - Vd_p[ i ]) * pow ( 1 + ( VDD - Vd_p[ i ] ) / TECH.PB_p, -TECH.mjsw_p );
// static capacitances
C_p += circuit.CapStaticGnd( node, nc ) * ( VDD - Vd_p[ i ]);
C_p += circuit.CapStaticVdd( node, nc ) * ( VDD - Vd_p[ i ]);
// gate capacitances
C_p += Wgn * TECH.Lmin * TECH.Cox_n * ( VDD - Vd_p[ i ]);
C_p += Wgp * TECH.Lmin * TECH.Cox_p * ( VDD - Vd_p[ i ]);
if ( ( i == 1 ) && ( i < p - 1 ) )
{
// Cgd & Cgs minus first mos
C_p += ( TECH.Cgs0_p * ( ( Wjp - W_p[ 1 ] ) + ( njp - 1 ) * TECH.XW_p ) + \
0.5 * TECH.Cox_p * ( Wjp - W_p[ 1 ] ) * ( njp - 1 ) * TECH.Lmin ) * ( VDD - Vd_p[ i ] );
}
else if ( ( i < p - 1 ) && ( i > 1 ) )
{
// all Cgd & Cgs
C_p += ( TECH.Cgs0_p * ( Wjp + njp * TECH.XW_p ) + \
0.5 * TECH.Cox_p * Wjp * njp * TECH.Lmin ) * ( VDD - Vd_p[ i ] );
}
else if ( (( i == p ) || ( i == p - 1 )) && (p > 1) )
{
// Cgd & Cgs minus last mos
C_p += ( TECH.Cgs0_p * ( ( Wjp - W_p[ p ] ) + ( njp - 1 ) * TECH.XW_p ) + \
0.5 * TECH.Cox_p * ( Wjp - W_p[ p ] ) * ( njp - 1 ) * TECH.Lmin ) * ( VDD - Vd_p[ i ] );
}
// NMOS
C_p += -( TECH.C_nj * Wjn * TECH.Df * Vd_p[ i ] ) * \
pow ( ( 1 + Vd_p[ i ] / TECH.PB_n ), -TECH.mj_n );
C_p += -( TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * Vd_p[ i ] ) * \
pow ( ( 1 + Vd_p[ i ] / TECH.PB_n ), -TECH.mjsw_n );
C_p += ( TECH.C_nj * Wjn * TECH.Df * VDD ) * \
pow ( ( 1 + VDD / TECH.PB_n ), -TECH.mj_n );
C_p += ( TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * VDD ) * \
pow ( ( 1 + VDD / TECH.PB_n ), -TECH.mjsw_n );
// Cgs & Cgd
C_p += TECH.Cgd0_n * ( Wjn + njn * TECH.XW_n ) * ( VDD - Vd_p[ i ]);
// capacitance with voltages
if ( (( i == 1 ) && ( i < p - 1 )) || ((i == 1) && (p == 1)) )
{
// Cgd
Cov = TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p );
Cgd1 = Cov + 0.5 * TECH.Cox_p * W_p[ i ] * L_p[ i ];
switch ( OpCondition_1 )
{
case _A_:
C_p += Cov * ( t0_p[ i ] - ts_p[ i ] ) * \
( VDD * ( t0_p[ i ] - tauo_p[ i ] - taui_p[i]) + Vd_p[ i ] * taui_p[ i ] ) / \
( taui_p[ i ] * ( tauo_p[ i ] - t0_p[i]) );
C_p += Cgd1 * ( VDD * ( t0_p[ i ] * ( ts_p[ i ] - taui_p[ i ] ) - \
ts_p[ i ] * ( taui_p[ i ] + tauo_p[ i ] ) + 2 * taui_p[ i ] * tauo_p[ i ] ) + \
Vd_p[ i ] * taui_p[ i ] * ( ts_p[ i ] - tauo_p[ i ] ) ) / \
( taui_p[ i ] * ( tauo_p[ i ] - t0_p[i]) );
break;
case _AA_:
C_p += Cov * ( VDD * ( t0_p[ i ] * t0_p[ i ] - t0_p[ i ] * ( 2 * taui_p[ i ] + tauo_p[ i ] ) + \
taui_p[ i ] * ( ts_p[ i ] + tauo_p[ i ] ) ) + \
Vd_p[ i ] * taui_p[ i ] * ( t0_p[ i ] - ts_p[ i ] ) ) / \
( taui_p[ i ] * ( tauo_p[ i ] - t0_p[i]) );
C_p += Cgd1 * ( VDD - Vd_p[ i ] ) * ( ts_p[ i ] - tauo_p[ i ] ) / \
( t0_p[i] - tauo_p[ i ]);
break;
case _B_:
C_p += -Cgd1 * ( VDD * ( t0_p[ i ] - 2 * taui_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \

taui_p[ i ];
break;
case _C_:
C_p += Cov * ( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - ts_p[ i ] ) / \
( t0_p[i] - tauo_p[ i ]);
C_p += Cgd1 * ( VDD - Vd_p[ i ] ) * ( ts_p[ i ] - tauo_p[ i ] ) / \
( t0_p[i] - tauo_p[ i ]);
break;
case _D_:
C_p += Cgd1 * ( VDD - Vd_p[ i ] );
break;
case _F_:
C_p += Cov * ( t0_p[ i ] - ts_p[ i ] ) * \
( VDD * ( t0_p[ i ] - taui_p[ i ] - tauo_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
( taui_p[ i ] * ( tauo_p[ i ] - t0_p[i]) );
C_p += Cgd1 * ( tauo_p[ i ] - ts_p[ i ] ) * \
( VDD * ( t0_p[ i ] - taui_p[ i ] - tauo_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
( taui_p[ i ] * ( t0_p[i] - tauo_p[ i ] ) );
break;
case _G_:
C_p += -Cgd1 * ( VDD * ( t0_p[ i ] - taui_p[ i ] - tauo_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
taui_p[ i ];
break;
case _E_:
default:
break;
}
}
else if ( ( i == 1 ) && ( i == p - 1 ) )
{
// Cgd
Cov = TECH.Cgd0_p * ( W_p[ i ] + TECH.XW_p );
Cgd1 = Cov + 0.5 * TECH.Cox_p * W_p[ i ] * L_p[ i ];
switch ( OpCondition_1 )
{
case _A_:
C_p += Cov * ( t0_p[ i ] - ts_p[ i ] ) * \
( VDD * ( t0_p[ i ] - tauo_p[ i ] - taui_p[i]) + Vd_p[ i ] * taui_p[ i ] ) / \
( taui_p[ i ] * ( tauo_p[ i ] - t0_p[i]) );
C_p += Cgd1 * ( VDD * ( t0_p[ i ] * ( ts_p[ i ] - taui_p[ i ] ) - \
ts_p[ i ] * ( taui_p[ i ] + tauo_p[ i ] ) + 2 * taui_p[ i ] * tauo_p[ i ] ) + \
Vd_p[ i ] * taui_p[ i ] * ( ts_p[ i ] - tauo_p[ i ] ) ) / \
( taui_p[ i ] * ( tauo_p[ i ] - t0_p[i]) );
break;
case _AA_:
C_p += Cov * ( VDD * ( t0_p[ i ] * t0_p[ i ] - t0_p[ i ] * ( 2 * taui_p[ i ] + tauo_p[ i ] ) + \
taui_p[ i ] * ( ts_p[ i ] + tauo_p[ i ] ) ) + \
Vd_p[ i ] * taui_p[ i ] * ( t0_p[ i ] - ts_p[ i ] ) ) / \
( taui_p[ i ] * ( tauo_p[ i ] - t0_p[i]) );
C_p += Cgd1 * ( VDD - Vd_p[ i ] ) * ( ts_p[ i ] - tauo_p[ i ] ) / \
( t0_p[i] - tauo_p[ i ]);
break;
case _B_:
C_p += -Cgd1 * ( VDD * ( t0_p[ i ] - 2 * taui_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
taui_p[ i ];
break;
case _C_:
C_p += Cov * ( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - ts_p[ i ] ) / \
( t0_p[i] - tauo_p[ i ]);
C_p += Cgd1 * ( VDD - Vd_p[ i ] ) * ( ts_p[ i ] - tauo_p[ i ] ) / \
( t0_p[i] - tauo_p[ i ]);
break;
case _D_:
C_p += Cgd1 * ( VDD - Vd_p[ i ] );
break;
case _F_:
C_p += Cov * ( t0_p[ i ] - ts_p[ i ] ) * \
( VDD * ( t0_p[ i ] - taui_p[ i ] - tauo_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
( taui_p[ i ] * ( tauo_p[ i ] - t0_p[i]) );

C_p += Cgd1 * ( tauo_p[ i ] - ts_p[ i ] ) * \
( VDD * ( t0_p[ i ] - taui_p[ i ] - tauo_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
( taui_p[ i ] * ( t0_p[i] - tauo_p[ i ] ) );
break;
case _G_:
C_p += -Cgd1 * ( VDD * ( t0_p[ i ] - taui_p[ i ] - tauo_p[ i ] ) + Vd_p[ i ] * taui_p[ i ] ) / \
taui_p[ i ];
break;
case _E_:
default:
break;
}
// Cgs
Cgd1 = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + double(2.0 / 3.0) * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
Cov = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
switch ( OpCondition_p )
{
case _A_:
C_p += Cov * \
( VDD - Vs_p[ p ] ) * ( t0_p[ i ] - ts_p[ p ] ) / ( t0_p[i] - taui_p[ p ]);
C_p += Cgd1 * \
( VDD - Vs_p[ p ] ) * ( ts_p[p] - taui_p[ p ]) / ( t0_p[ i ] - taui_p[ p ] );
break;
case _B_:
C_p += Cov * (VDD - Vs_p[p]);
break;
case _C_:
C_p += Cov * \
( VDD - Vs_p[ p ] ) * ( t0_p[ i ] - ts_p[ p ] ) / ( t0_p[i] - taui_p[ p ]);
C_p += Cgd1 * \
( VDD - Vs_p[ p ] ) * ( ts_p[ p ] - tc) / ( t0_p[ i ] - taui_p[ p ] );
break;
case _D_:
C_p += Cov * \
( VDD - Vs_p[ i ] ) * ( t0_p[ i ] - tc ) / ( t0_p[i] - taui_p[ p ]);
break;
case _E_:
default:
break;
}
}
else if ( ( i == p - 1 ) && ( i > 1 ) )
{
// Cgs, the last mos
Cgd1 = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + double(2.0 / 3.0) * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
Cov = ( TECH.Cgs0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
switch ( OpCondition_p )
{
case _A_:
C_p += Cov * \
( VDD - Vs_p[ p ] ) * ( t0_p[ i ] - ts_p[ p ] ) / ( t0_p[i] - taui_p[ p ]);
C_p += Cgd1 * \
( VDD - Vs_p[ p ] ) * ( ts_p[p] - taui_p[ p ]) / ( t0_p[ i ] - taui_p[ p ] );
break;
case _B_:
C_p += Cov * (VDD - Vs_p[p]);
break;
case _C_:
C_p += Cov * \
( VDD - Vs_p[ p ] ) * ( t0_p[ i ] - ts_p[ p ] ) / ( t0_p[i] - taui_p[ p ]);
C_p += Cgd1 * \
( VDD - Vs_p[ p ] ) * ( ts_p[ p ] - tc) / ( t0_p[ i ] - taui_p[ p ] );
break;
case _D_:
C_p += Cov * \
( VDD - Vs_p[ i ] ) * ( t0_p[ i ] - tc ) / ( t0_p[i] - taui_p[ p ]);
break;
case _E_:
default:
break;
}

}
else if ( i == p )
{
// Cgd
Cov = ( TECH.Cgd0_p * ( W_p[ p ] + TECH.XW_p ) );
Cgd1 = ( TECH.Cgd0_p * ( W_p[ p ] + TECH.XW_p ) + 0.5 * TECH.Cox_p * W_p[ p ] * L_p[ p ] );
switch ( OpCondition_p )
{
case _A_:
case _B_:
C_p += Cov * \
( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - ts_p[ i ] ) / ( t0_p[i] - tauo_p[ i ]);
C_p += Cgd1 * \
( VDD - Vd_p[ i ] ) * ( ts_p[i] - tauo_p[ i ]) / ( t0_p[ i ] - tauo_p[ i ] );
break;
case _C_:
C_p += Cov * \
( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - ts_p[ i ] ) / ( t0_p[i] - tauo_p[ i ]);
C_p += Cgd1 * \
( VDD - Vd_p[ i ] ) * ( ts_p[ i ] - tc) / ( t0_p[ i ] - tauo_p[ i ] );
break;
case _D_:
C_p += Cov * \
( VDD - Vd_p[ i ] ) * ( t0_p[ i ] - tc ) / ( t0_p[i] - tauo_p[ i ]);
break;
case _E_:
default:
break;
}
}
y += C_p;
}
node = LastNode;
C_p = 0;
for (unsigned int i = 0; (i < n - 1) && (n > 0); i++)
{
name = pathlist[ NP ].TransistorName( n - 1 - i, NC );
if ( circuit[ name ].Source() == node )
{
node = circuit[ name ].Drain();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
else if ( circuit[ name ].Drain() == node )
{
node = circuit[ name ].Source();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
// Cj
C_p += TECH.C_pj * Wjp * TECH.Df * ( VDD - Vd_p[ p ]) * pow ( ( 1 + ( VDD - Vd_p[ p ] ) / TECH.PB_p ), -TECH.mj_p );
C_p += TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * ( VDD - Vd_p[ p ]) * pow ( 1 + ( VDD - Vd_p[ p ] ) / TECH.PB_p, -TECH.mjsw_p );
// static capacitances
int nc;
C_p += circuit.CapStaticGnd( node, nc ) * ( VDD - Vd_p[ p ]);
C_p += circuit.CapStaticVdd( node, nc ) * ( VDD - Vd_p[ p ]);
// gate capacitances
C_p += Wgn * TECH.Lmin * TECH.Cox_n * ( VDD - Vd_p[ p ] );
C_p += Wgp * TECH.Lmin * TECH.Cox_p * ( VDD - Vd_p[ p ] );
// NMOS in chain
C_p += -( TECH.C_nj * Wjn * TECH.Df * Vd_p[ p ] ) * \
pow ( ( 1 + Vd_p[ p ] / TECH.PB_n ), -TECH.mj_n );
C_p += -( TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * Vd_p[ p ] ) * \
pow ( ( 1 + Vd_p[ p ] / TECH.PB_n ), -TECH.mjsw_n );
C_p += ( TECH.C_nj * Wjn * TECH.Df * VDD ) * \

pow ( ( 1 + VDD / TECH.PB_n ), -TECH.mj_n );
C_p += ( TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * VDD ) * \
pow ( ( 1 + VDD / TECH.PB_n ), -TECH.mjsw_n );
// Cgs & Cgd
C_p += TECH.Cgd0_n * ( Wjn + njn * TECH.XW_n ) * ( VDD - Vd_p[ p ] );
y += C_p;
}
y += QnP(n, p);
delete[] FY;
return y;
}

Fast.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
Fast::Fast( const CritPathList& pathlist, const Options& options )
:
EvaluationAlgorithm( pathlist, options ),
A_1_n( 0 ), ts_n( 0 ), tauo_n( 0 ), taui_n( 0 ), t0_n( 0 ),
beta_n( 0 ), Vd_n( 0 ), Vs_n( 0 ), W_n( 0 ), L_n( 0 ),
A_1_p( 0 ), B_1_p( 0 ), ts_p( 0 ), tauo_p( 0 ), taui_p( 0 ), t0_p( 0 ),
beta_p( 0 ), Vd_p( 0 ), Vs_p( 0 ), W_p( 0 ), L_p( 0 ), VDD( 0 )
{
print_log( "Creating Fast instance..." );
}
///
Fast::~Fast()
{}
///
int Fast::Run( const Circuit& circuit, const double *NewWidth, const unsigned* ValidPath )
{
VDD = circuit.Valim();
Calls++;
int RetCode;
double tin;
unsigned int n;
unsigned int p;
double TotalDelay;
double TotalPower;
double TotalNoise;
for ( unsigned int NP = 0; NP < NumPath; NP++ )
{
if (ValidPath[NP])
{
unsigned int NumChain = pathlist[ NP ].GetNumListTran();
TransitionType TIn;
TransitionType TOut;
TotalDelay = 0.0;
TotalPower = 0.0;
TotalNoise = 0.0;
for ( unsigned int NC = 0; NC < NumChain; NC++ )
{
if ( NC == 0 )
{
TIn = pathlist[ NP ].GetTransitionIn();
tin = pathlist[ NP ].GetInTime() / 1000;
}
else
{


if ( TIn == FALL )
tin = tauo_n[ n ];
else if ( TIn == RISE )
tin = tauo_p[ p ];
}
if ( TIn == FALL )
{
TOut = RISE;
}
else if ( TIn == RISE )
{
TOut = FALL;
}
n = pathlist[ NP ].GetNumTranN( NC );
p = pathlist[ NP ].GetNumTranP( NC );
if ( ( RetCode = InitCircuitVar( n, p ) ) != OK )
return RetCode;
unsigned int in = 1;
unsigned int ip = 1;
while ( const char * tn = pathlist[ NP ].TraverseTransistorNameList( NC ) )
{
int position = circuit.TranPos( tn );
if ( position == -1 )
return NOT_FOUND;
if ( in <= n )
{
W_n[ in ] = NewWidth[ position ];
L_n[ in++ ] = circuit[ position ].Length();
}
else if ( ip <= p )
{
W_p[ p - ip + 1 ] = NewWidth[ position ];
L_p[ p - ip + 1 ] = circuit[ position ].Length();
ip++;
}
else
return NOT_FOUND;
}
CalcParamCircuit( n, p );

TotalDelay += CalcDelay( circuit, NP, NC, n, p, NewWidth, tin, TOut, RetCode );


if ( RetCode != OK )
return RetCode;
TotalPower += CalcPower( circuit, NP, NC, n, p, NewWidth, TOut, RetCode );
if ( RetCode != OK )
return RetCode;
TotalNoise = 0.0;
FreeCircuitPar( n, p );
TIn = TOut;
}
if (NumChain > 0)
for (unsigned int i = 0; i < NumChain; i++)
TotalDelay *= 1.85;
else if (NumChain > 1)
for (unsigned int i = 0; i < NumChain; i++)
TotalDelay *= 3.1; // 1.07 tech 07
CPDelay[ NP ] = TotalDelay * 1000; // ps
CPPower[ NP ] = TotalPower / 1000.0; // pJ
CPNoise[ NP ] = TotalNoise;
}
}
Area = CalcArea( NewWidth, circuit.GetNTran() );
if ( RetCode != OK )
return RetCode;
return OK;
}

FastArea.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
double Fast::CalcArea( const double *NewWidth, unsigned int NT )
{
double A = 0.0;
for ( unsigned int i = 0; i < NT; i++ )
{
A += NewWidth[ i ];
}
return ( A );
}
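
As a standalone illustration of the area estimate computed by Fast::CalcArea, the following sketch sums an array of transistor widths outside the Fast class. The free function name total_width and the sample values in the usage note are assumptions made for this example only; they do not appear in the original sources.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical free-function version of the width sum that CalcArea
// uses as its area estimate: the result is simply the sum of all widths.
double total_width( const double* widths, std::size_t count )
{
    // A null pointer is only acceptable when there is nothing to sum.
    assert( widths != nullptr || count == 0 );
    double area = 0.0;
    for ( std::size_t i = 0; i < count; i++ )
    {
        area += widths[ i ];
    }
    return area;
}
```

For example, calling total_width on the three widths 1.5, 2.0 and 0.5 gives 4.0, in whatever unit the widths are expressed.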

FirstEqN.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
double Fast::FirstEqN( int OpCondition, double tauon )
{
double Vc, t0_bs, temp;
double A_2_n, B_2_n, C_2_n, D_2_n, I_2_n;
double N_2_n, O_2_n, P_2_n;
double Q_2_n, R_2_n, S_2_n, T_2_n, U_2_n;

t0_bs = TECH.Vtn0 * taui_n[ 1 ] / VDD;


Vc = TECH.Ec_n * L_n[ 1 ];
if ( OpCondition == _E_ )
if ( taui_n[ 1 ] <= tauon )
OpCondition = _A_;
else
OpCondition = _F_;
A_2_n = Vc * beta_n[ 1 ] * ( Vc - TECH.Vtn0 );
B_2_n = VDD * Vc * beta_n[ 1 ] / taui_n[ 1 ];
C_2_n = 2 * VDD / ( Vc * taui_n[ 1 ] );
D_2_n = ( Vc - 2 * TECH.Vtn0 ) / Vc;
I_2_n = ( Vc * Vc ) * beta_n[ 1 ] * ( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) * \
( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) * 0.5;
N_2_n = Vc * beta_n[ 1 ] * ( 2 * VDD * Vc * ( ( t0_n[ 1 ] - tauon ) * ( t0_n[ 1 ] - tauon ) ) - \
Vd_n[ 1 ] * taui_n[ 1 ] * ( Vc * ( t0_n[ 1 ] - tauon ) + \
Vd_n[ 1 ] * tauon + 2 * TECH.Vtn0 * ( tauon - t0_n[ 1 ] ) ) ) / \
( 2 * Vd_n[ 1 ] * taui_n[ 1 ] * ( tauon - t0_n[ 1 ] ) );
O_2_n = Vc * beta_n[ 1 ] * ( 2 * VDD * ( t0_n[ 1 ] - tauon ) - Vd_n[ 1 ] * taui_n[ 1 ] ) / \
( 2 * taui_n[ 1 ] * ( t0_n[ 1 ] - tauon ) );
P_2_n = ( Vc * Vc ) * beta_n[ 1 ] * ( t0_n[ 1 ] - tauon ) * \
( 2 * VDD * ( Vc * ( t0_n[ 1 ] - tauon ) - Vd_n[ 1 ] * tauon ) - Vd_n[ 1 ] * taui_n[ 1 ] * ( Vc - 2 * TECH.Vtn0 ) );
Q_2_n = 2 * Vd_n[ 1 ] * taui_n[ 1 ] * ( Vc * ( t0_n[ 1 ] - tauon ) - Vd_n[ 1 ] * tauon );
R_2_n = Vc * beta_n[ 1 ] * ( 2 * VDD * ( t0_n[ 1 ] - tauon ) + \
Vc * ( t0_n[ 1 ] - tauon ) + \
Vd_n[ 1 ] * tauon + 2 * TECH.Vtn0 * ( tauon - t0_n[ 1 ] ) ) / \
( 2 * ( t0_n[ 1 ] - tauon ) );
S_2_n = Vc * Vd_n[ 1 ] * beta_n[ 1 ] / ( 2 * ( tauon - t0_n[ 1 ] ) );
T_2_n = ( Vc * Vc ) * beta_n[ 1 ] * ( tauon - t0_n[ 1 ] ) * ( 2 * VDD + Vc - 2 * TECH.Vtn0 );
U_2_n = 2 * ( Vc * ( t0_n[ 1 ] - tauon ) - Vd_n[ 1 ] * tauon );
switch ( OpCondition )
{
case _A_:
temp = -P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * ts_n[ 1 ] * taui_n[ 1 ] ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[
P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * ( taui_n[ 1 ] * taui_n[ 1 ] ) ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * ta
T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * taui_n[ 1 ] ) / Vd_n[ 1 ] + \
T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * tauon ) / Vd_n[ 1 ] - \

( 6 * A_2_n * C_2_n * ( t0_n[ 1 ] - ts_n[ 1 ] ) + \
3 * B_2_n * C_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( ts_n[ 1 ] * ts_n[ 1 ] ) ) - \
4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * t0_n[ 1 ] + D_2_n ), 1.5 ) + \
4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * ts_n[ 1 ] + D_2_n ), 1.5 ) + \
3 * C_2_n * ( 2 * N_2_n * ( ts_n[ 1 ] - taui_n[ 1 ] ) + \
O_2_n * ( ( ts_n[ 1 ] * ts_n[ 1 ] ) - ( taui_n[ 1 ] * taui_n[ 1 ] ) ) + \
2 * R_2_n * ( taui_n[ 1 ] - tauon ) + \
S_2_n * ( ( taui_n[ 1 ] * taui_n[ 1 ] ) - ( tauon * tauon ) ) ) ) / ( 6 * C_2_n );
break;
case _AA_:
temp = -T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * ts_n[ 1 ] ) / Vd_n[ 1 ] + \
T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * tauon ) / Vd_n[ 1 ] - \
( 6 * A_2_n * C_2_n * ( t0_n[ 1 ] - taui_n[ 1 ] ) + \
3 * B_2_n * C_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( taui_n[ 1 ] * taui_n[ 1 ] ) ) - \
4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * t0_n[ 1 ] + D_2_n ), 1.5 ) + \
4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * taui_n[ 1 ] + D_2_n ), 1.5 ) - \
3 * C_2_n * ( 2 * I_2_n * ( ts_n[ 1 ] - taui_n[ 1 ] ) + \
2 * R_2_n * ( tauon - ts_n[ 1 ] ) + \
S_2_n * ( ( tauon * tauon ) - ( ts_n[ 1 ] * ts_n[ 1 ] ) ) ) ) / ( 6 * C_2_n );
break;
case _B_:
temp = -P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * t0_n[ 1 ] * taui_n[ 1 ] ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) + \
P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * ( taui_n[ 1 ] * taui_n[ 1 ] ) ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) - \
T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * taui_n[ 1 ] ) / Vd_n[ 1 ] + \
T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * tauon ) / Vd_n[ 1 ] - \
( 2 * N_2_n * ( t0_n[ 1 ] - taui_n[ 1 ] ) + \
O_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( taui_n[ 1 ] * taui_n[ 1 ] ) ) + \
2 * R_2_n * ( taui_n[ 1 ] - tauon ) + \
S_2_n * ( ( taui_n[ 1 ] * taui_n[ 1 ] ) - ( tauon * tauon ) ) ) / 2;
break;
case _C_:
temp = -T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * ts_n[ 1 ] ) / Vd_n[ 1 ] + \
T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * tauon ) / Vd_n[ 1 ] + \
I_2_n * ( ts_n[ 1 ] - t0_n[ 1 ] ) - \
( 2 * R_2_n * ( ts_n[ 1 ] - tauon ) + S_2_n * ( ( ts_n[ 1 ] * ts_n[ 1 ] ) - ( tauon * tauon ) ) ) * 0.5;
break;
case _D_:
temp = -T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * t0_n[ 1 ] ) / Vd_n[ 1 ] + \
T_2_n * LOG2 ( U_2_n + 2 * Vd_n[ 1 ] * tauon ) / Vd_n[ 1 ] - \
( 2 * R_2_n * ( t0_n[ 1 ] - tauon ) + S_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( tauon * tauon ) ) ) * 0.5;
break;
case _F_:
temp = -P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * ts_n[ 1 ] * taui_n[ 1 ] ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) + \
P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] * tauon ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) - \
( 6 * A_2_n * C_2_n * ( t0_n[ 1 ] - ts_n[ 1 ] ) + \
3 * B_2_n * C_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( ts_n[ 1 ] * ts_n[ 1 ] ) ) - \
4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * t0_n[ 1 ] + D_2_n ), 1.5 ) + \
4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( fabs ( C_2_n * ts_n[ 1 ] + D_2_n ), 1.5 ) + \
3 * C_2_n * ( 2 * N_2_n * ( ts_n[ 1 ] - tauon ) + \
O_2_n * ( ( ts_n[ 1 ] * ts_n[ 1 ] ) - ( tauon * tauon ) ) ) ) / ( 6 * C_2_n );
break;
case _G_:
temp = -P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * t0_n[ 1 ] * taui_n[ 1 ] ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) + \
P_2_n * LOG2 ( Q_2_n + 2 * ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] * tauon ) / ( ( Vd_n[ 1 ] * Vd_n[ 1 ] ) * taui_n[ 1 ] ) - \
( 2 * N_2_n * ( t0_n[ 1 ] - tauon ) + O_2_n * ( ( t0_n[ 1 ] * t0_n[ 1 ] ) - ( tauon * tauon ) ) ) / 2;
break;
case _E_:
default:
temp = 0;
break;
}
return temp;
}

FirstEqP.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"
///
double Fast::FirstEqP( int OpCondition, double tauop )
{
double Vc, t0_bs, temp, x, y;
double A_2_p, B_2_p, C_2_p, D_2_p, G_2_p, J_2_p;
double K_2_p, M_2_p, N_2_p, O_2_p, P_2_p;
double Q_2_p, R_2_p, S_2_p, T_2_p;
t0_bs = -TECH.Vtp0 * taui_p[ 1 ] / VDD;
Vc = TECH.Ec_p * L_p[ 1 ];
if ( OpCondition == _E_ )
if ( taui_p[ 1 ] <= tauop )
OpCondition = _A_;
else
OpCondition = _F_;
x = t0_p[ 1 ] - taui_p[ 1 ];
y = t0_p[ 1 ] - tauop;
A_2_p = Vc * beta_p[ 1 ] * ( Vc + TECH.Vtp0 );
B_2_p = VDD * Vc * beta_p[ 1 ] / taui_p[ 1 ];
C_2_p = ( Vc + 2 * TECH.Vtp0 ) / Vc;
D_2_p = 2 * VDD / ( Vc * taui_p[ 1 ] );
G_2_p = Vc * Vc * beta_p[ 1 ] * SQRT ( ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) / Vc ) - \
VDD * Vc * beta_p[ 1 ] - Vc * Vc * beta_p[ 1 ] - Vc * TECH.Vtp0 * beta_p[ 1 ];
J_2_p = Vc * beta_p[ 1 ] * ( ( VDD * VDD ) * taui_p[ 1 ] * tauop - \
VDD * ( Vc * y * ( x + y - tauop ) + \
2 * taui_p[ 1 ] * ( Vd_p[ 1 ] * tauop - TECH.Vtp0 * y ) ) - \
Vd_p[ 1 ] * taui_p[ 1 ] * ( Vc * y - \
Vd_p[ 1 ] * tauop + 2 * TECH.Vtp0 * y ) ) / \
( 2 * taui_p[ 1 ] * ( Vd_p[ 1 ] - VDD ) * y );
K_2_p = -Vc * beta_p[ 1 ] * ( VDD * ( x + y - tauop ) + \
Vd_p[ 1 ] * taui_p[ 1 ] ) / \
( 2 * taui_p[ 1 ] * y );
M_2_p = ( Vc * Vc ) * beta_p[ 1 ] * y * ( 2 * ( VDD * VDD ) * tauop - \
VDD * ( Vc * ( x + y - tauop ) + \
2 * ( Vd_p[ 1 ] * tauop - TECH.Vtp0 * taui_p[ 1 ] ) ) - \
Vd_p[ 1 ] * taui_p[ 1 ] * ( Vc + 2 * TECH.Vtp0 ) );
N_2_p = 2 * taui_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) * \
( VDD * tauop - Vc * y - Vd_p[ 1 ] * tauop );
O_2_p = 2 * taui_p[ 1 ] * ( ( VDD - Vd_p[ 1 ] ) * ( VDD - Vd_p[ 1 ] ) );
P_2_p = -Vc * beta_p[ 1 ] * ( VDD * ( t0_p[ 1 ] + y ) + \
Vc * y - Vd_p[ 1 ] * tauop + 2 * TECH.Vtp0 * y ) / \
( 2 * y );
Q_2_p = Vc * beta_p[ 1 ] * ( VDD - Vd_p[ 1 ] ) / ( 2 * y );
R_2_p = ( Vc * Vc ) * beta_p[ 1 ] * y * ( 2 * VDD + Vc + 2 * TECH.Vtp0 );
S_2_p = 2 * ( VDD * tauop - Vc * y - Vd_p[ 1 ] * tauop );
T_2_p = 2 * ( VDD - Vd_p[ 1 ] );
switch ( OpCondition )
{
case _A_:
temp = -M_2_p * LOG ( O_2_p * ts_p[ 1 ] - N_2_p ) / O_2_p + \
M_2_p * LOG ( O_2_p * taui_p[ 1 ] - N_2_p ) / O_2_p - \
R_2_p * LOG ( T_2_p * taui_p[ 1 ] - S_2_p ) / T_2_p + \
R_2_p * LOG ( T_2_p * tauop - S_2_p ) / T_2_p + \
( 6 * A_2_p * D_2_p * ( t0_p[ 1 ] - ts_p[ 1 ] ) + \
3 * B_2_p * D_2_p * ( t0_p[ 1 ] * t0_p[ 1 ] - ts_p[ 1 ] * ts_p[ 1 ] ) - \
4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * t0_p[ 1 ] ), 1.5 ) + \
4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * ts_p[ 1 ] ), 1.5 ) - \
3 * D_2_p * ( 2 * J_2_p * ( ts_p[ 1 ] - taui_p[ 1 ] ) + \
K_2_p * ( ts_p[ 1 ] * ts_p[ 1 ] - taui_p[ 1 ] * taui_p[ 1 ] ) + \
2 * P_2_p * ( taui_p[ 1 ] - tauop ) + \
Q_2_p * ( taui_p[ 1 ] * taui_p[ 1 ] - tauop * tauop ) ) ) / \
( 6 * D_2_p );
break;
case _AA_:
temp = -R_2_p * LOG ( T_2_p * ts_p[ 1 ] - S_2_p ) / T_2_p + \
R_2_p * LOG ( T_2_p * tauop - S_2_p ) / T_2_p + \
( 6 * A_2_p * D_2_p * ( t0_p[ 1 ] - taui_p[ 1 ] ) + \
3 * B_2_p * D_2_p * ( t0_p[ 1 ] * t0_p[ 1 ] - taui_p[ 1 ] * taui_p[ 1 ] ) - \
4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * t0_p[ 1 ] ), 1.5 ) + \
4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * taui_p[ 1 ] ), 1.5 ) + \
3 * D_2_p * ( 2 * G_2_p * ( ts_p[ 1 ] - taui_p[ 1 ] ) + \
2 * P_2_p * ( tauop - ts_p[ 1 ] ) + \
Q_2_p * ( tauop * tauop - ts_p[ 1 ] * ts_p[ 1 ] ) ) ) / \
( 6 * D_2_p );
break;
case _B_:
temp = -M_2_p * LOG ( O_2_p * t0_p[ 1 ] - N_2_p ) / O_2_p + \
M_2_p * LOG ( O_2_p * taui_p[ 1 ] - N_2_p ) / O_2_p - \
R_2_p * LOG ( T_2_p * taui_p[ 1 ] - S_2_p ) / T_2_p + \
R_2_p * LOG ( T_2_p * tauop - S_2_p ) / T_2_p - \
( 2 * J_2_p * ( t0_p[ 1 ] - taui_p[ 1 ] ) + \
K_2_p * ( ( t0_p[ 1 ] * t0_p[ 1 ] ) - ( taui_p[ 1 ] * taui_p[ 1 ] ) ) + \
2 * P_2_p * ( taui_p[ 1 ] - tauop ) + \
Q_2_p * ( ( taui_p[ 1 ] * taui_p[ 1 ] ) - ( tauop * tauop ) ) ) * 0.5;
break;
case _C_:
temp = -R_2_p * LOG ( T_2_p * ts_p[ 1 ] - S_2_p ) / T_2_p + \
R_2_p * LOG ( T_2_p * tauop - S_2_p ) / T_2_p + \
G_2_p * ( ts_p[ 1 ] - t0_p[ 1 ] ) - \
( 2 * P_2_p * ( ts_p[ 1 ] - tauop ) + \
Q_2_p * ( ( ts_p[ 1 ] * ts_p[ 1 ] ) - ( tauop * tauop ) ) ) * 0.5;
break;
case _D_:
temp = -R_2_p * LOG ( T_2_p * t0_p[ 1 ] - S_2_p ) / T_2_p + \
R_2_p * LOG ( T_2_p * tauop - S_2_p ) / T_2_p - \
( 2 * P_2_p * ( t0_p[ 1 ] - tauop ) + \
Q_2_p * ( ( t0_p[ 1 ] * t0_p[ 1 ] ) - ( tauop * tauop ) ) ) * 0.5;
break;
case _F_:
temp = -M_2_p * LOG ( O_2_p * ts_p[ 1 ] - N_2_p ) / O_2_p + \
M_2_p * LOG ( O_2_p * tauop - N_2_p ) / O_2_p + \
( 6 * A_2_p * D_2_p * ( t0_p[ 1 ] - ts_p[ 1 ] ) + \
3 * B_2_p * D_2_p * ( t0_p[ 1 ] * t0_p[ 1 ] - ts_p[ 1 ] * ts_p[ 1 ] ) - \
4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * t0_p[ 1 ] ), 1.5 ) + \
4 * Vc * Vc * beta_p[ 1 ] * pow( ( C_2_p + D_2_p * ts_p[ 1 ] ), 1.5 ) - \
3 * D_2_p * ( 2 * J_2_p * ( ts_p[ 1 ] - tauop ) + \
K_2_p * ( ts_p[ 1 ] * ts_p[ 1 ] - tauop * tauop ) ) ) / \
( 6 * D_2_p );
break;
case _G_:
temp = -M_2_p * LOG ( O_2_p * t0_p[ 1 ] - N_2_p ) / O_2_p + \
M_2_p * LOG ( O_2_p * tauop - N_2_p ) / O_2_p - \
( 2 * J_2_p * ( t0_p[ 1 ] - tauop ) + \
K_2_p * ( ( t0_p[ 1 ] * t0_p[ 1 ] ) - ( tauop * tauop ) ) ) * 0.5;
break;
case _E_:
default:
temp = 0;
break;
}
return temp;
}

Init.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
int Fast::InitCircuitVar( unsigned int n, unsigned int p )
{
// NMOS
if (n == 0)
n = 1;
A_1_n = dvector ( 1, n );
ts_n = dvector ( 1, n );
tauo_n = dvector ( 1, n );
taui_n = dvector ( 1, n );
t0_n = dvector ( 1, n );
beta_n = dvector ( 1, n );
Vd_n = dvector ( 1, n );
Vs_n = dvector ( 1, n );
W_n = dvector ( 1, n );
L_n = dvector ( 1, n );
if ( !A_1_n || !ts_n || !tauo_n || !taui_n || !t0_n || !beta_n || !Vd_n || !Vs_n || !W_n || !L_n )
return NO_MEM;
TECH.u0_n = TECH.Kp_n / TECH.Cox_n;
TECH.Ec_n = TECH.vmax_n / TECH.u0_n * ( 1 + TECH.theta_n * ( VDD - TECH.Vtn0 ) );
double phip_n = TECH.phi_n;
double gamma_n = TECH.gamma_n;
double Vsb1_n = -0.5 * gamma_n * sqrt ( 4 * gamma_n * sqrt ( 2 * phip_n ) + \
8 * phip_n + 4 * VDD - 4 * TECH.Vtn0 + gamma_n * gamma_n ) + \
gamma_n * sqrt ( 2 * phip_n ) + VDD - TECH.Vtn0 + gamma_n * gamma_n * 0.5;
double Vt_n = TECH.Vtn0 + gamma_n * ( sqrt ( 2 * phip_n + Vsb1_n ) - \
sqrt ( 2 * phip_n ) );
for ( unsigned int i = 1; i <= n; i++ )
{
A_1_n[ i ] = ( Vt_n - TECH.Vtn0 ) / Vsb1_n;
ts_n[ i ] = t0_n[ i ] = taui_n[ i ] = tauo_n[ i ] = 0.0;
}
for ( unsigned int i = 1; i < n; i++ )
Vd_n[ i ] = Vsb1_n;
for ( unsigned int i = 2; i <= n; i++ )
Vs_n[ i ] = Vsb1_n;
Vd_n[ n ] = VDD;
Vs_n[ 1 ] = 0;
if (p == 0)
p = 1;
// PMOS
A_1_p = dvector ( 1, p );
B_1_p = dvector ( 1, p );
ts_p = dvector ( 1, p );
tauo_p = dvector ( 1, p );
taui_p = dvector ( 1, p );
t0_p = dvector ( 1, p );
beta_p = dvector ( 1, p );
Vd_p = dvector ( 1, p );
Vs_p = dvector ( 1, p );
W_p = dvector ( 1, p );
L_p = dvector ( 1, p );
if ( !A_1_p || !ts_p || !tauo_p || !taui_p || !t0_p || !beta_p || !Vd_p || !Vs_p || !W_p || !L_p )
return NO_MEM;
TECH.u0_p = TECH.Kp_p / TECH.Cox_p;
TECH.Ec_p = TECH.vmax_p / TECH.u0_p * ( 1 - TECH.theta_p * ( -VDD - TECH.Vtp0 ) );
//double phip p = fabs ( TECH.VT * log ( TECH.Nd / TECH.ni ) );
double phip_p = TECH.phi_p;
//double gamma p = sqrt ( 2 * TECH.epss * TECH.q * TECH.Nd ) / TECH.Cox p;
double gamma_p = TECH.gamma_p;
double Vsb1_p = 0.5 * gamma_p * sqrt ( 4 * gamma_p * sqrt ( 2 * phip_p ) + \
8 * phip_p + 4 * VDD + 4 * TECH.Vtp0 + gamma_p * gamma_p ) - \
gamma_p * sqrt ( 2 * phip_p ) - VDD - TECH.Vtp0 + gamma_p * gamma_p * 0.5;
//
double Vt_p = TECH.Vtp0 - gamma_p * ( sqrt ( 2 * phip_p - Vsb1_p ) - \
sqrt ( 2 * phip_p ) );
for ( unsigned int i = 1; i <= p; i++ )
{
A_1_p[ i ] = ( TECH.Vtp0 - Vt_p ) / ( VDD + Vt_p );
B_1_p[ i ] = ( Vt_p * ( VDD + TECH.Vtp0 ) ) / ( VDD + Vt_p );
ts_p[ i ] = t0_p[ i ] = taui_p[ i ] = tauo_p[ i ] = 0.0;

}
for ( unsigned int i = 1; i < p; i++ )
Vd_p[ i ] = VDD + Vsb1_p;
for ( unsigned int i = 2; i <= p; i++ )
Vs_p[ i ] = VDD + Vsb1_p;
Vd_p[ p ] = 0;
Vs_p[ 1 ] = VDD;
return OK;
}
///
void Fast::FreeCircuitPar( unsigned int n, unsigned int p )
{
if (n == 0)
n = 1;
free_dvector ( A_1_n);
free_dvector ( ts_n);
free_dvector ( tauo_n);
free_dvector ( taui_n);
free_dvector ( t0_n);
free_dvector ( beta_n);
free_dvector ( Vd_n);
free_dvector ( Vs_n);
free_dvector ( W_n);
free_dvector ( L_n);
if (p == 0)
p = 1;
free_dvector ( A_1_p);
free_dvector ( B_1_p);
free_dvector ( ts_p);
free_dvector ( tauo_p);
free_dvector ( taui_p);
free_dvector ( t0_p);
free_dvector ( beta_p);
free_dvector ( Vd_p);
free_dvector ( Vs_p);
free_dvector ( W_p);
free_dvector ( L_p);
}
///
void Fast::CalcParamCircuit( unsigned int n, unsigned int p )
{
for ( unsigned int i = 1; i <= n; i++ )
{
L_n[ i ] = L_n[ i ] - 2 * TECH.LD_n + TECH.XL_n;
W_n[ i ] = W_n[ i ] - 2 * TECH.WD_n + TECH.XW_n;
beta_n[ i ] = ( TECH.u0_n * TECH.Cox_n * W_n[ i ] / L_n[ i ] ) / ( 1 + TECH.theta_n * ( VDD - TECH.Vtn0 ) );
}
for ( unsigned int i = 1; i <= p; i++ )
{
L_p[ i ] = L_p[ i ] - 2 * TECH.LD_p + TECH.XL_p;
W_p[ i ] = W_p[ i ] - 2 * TECH.WD_p + TECH.XW_p;
beta_p[ i ] = ( TECH.u0_p * TECH.Cox_p * W_p[ i ] / L_p[ i ] ) / ( 1 - TECH.theta_p * ( -VDD - TECH.Vtp0 ) );
}
}
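
The arithmetic in Fast::CalcParamCircuit can be tried in isolation with the sketch below, which reproduces the NMOS branch: the drawn length and width are first corrected, and the gain factor is then divided by the mobility-degradation term. The struct TechParams, the function names and the field comments are assumptions introduced for this example; the original code reads the same quantities from the global TECH object, and the meaning given to each field follows the usual SPICE naming rather than anything stated in the listing.

```cpp
#include <cassert>

// Hypothetical container for the technology parameters used below.
// Field meanings are assumed from the usual SPICE naming conventions.
struct TechParams
{
    double u0;     // low-field mobility
    double Cox;    // gate oxide capacitance per unit area
    double theta;  // mobility degradation coefficient
    double Vt0;    // zero-bias threshold voltage
    double LD;     // length reduction per side
    double WD;     // width reduction per side
    double XL;     // length correction
    double XW;     // width correction
};

// Effective channel length, as in L_n[ i ] - 2 * TECH.LD_n + TECH.XL_n.
double effective_length( const TechParams& t, double L )
{
    return L - 2 * t.LD + t.XL;
}

// Effective channel width, as in W_n[ i ] - 2 * TECH.WD_n + TECH.XW_n.
double effective_width( const TechParams& t, double W )
{
    return W - 2 * t.WD + t.XW;
}

// NMOS gain factor computed from the effective dimensions,
// mirroring the beta_n[ i ] expression of CalcParamCircuit.
double nmos_beta( const TechParams& t, double W, double L, double VDD )
{
    const double Leff = effective_length( t, L );
    const double Weff = effective_width( t, W );
    return ( t.u0 * t.Cox * Weff / Leff ) / ( 1 + t.theta * ( VDD - t.Vt0 ) );
}
```

The PMOS branch differs only in the sign convention of the degradation term, 1 - theta_p * ( -VDD - Vtp0 ), since Vtp0 is negative.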

Iter.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
int Fast::IterSol( const Circuit& circuit, TransistorType type, unsigned int NP, unsigned int NC, unsigned int n, unsigned int p, const double* NewWidt

{
unsigned int found = 0, n_tol;


int num_sol;
int iter = 0;
double tin;
int RetCode;
unsigned int k;
if ( type == NMOS )
{
k = n;
tin = taui_n[ 1 ];
}
else if ( type == PMOS )
{
k = p;
tin = taui_p[ 1 ];
}
double* to = new double[ k + 1 ];
double* to_old = new double[ k + 1 ];
double* t0 = new double[ k + 1 ];
for ( unsigned int i = 1; i <= k; i++ )
{
to_old[ i ] = to[ i ] = tin;
t0[ i ] = 0;
}
if ( type == NMOS )
t0[ 1 ] = t0_n[ 1 ];
else if ( type == PMOS )
t0[ 1 ] = t0_p[ 1 ];
double xb1, xb2;
while ( !found )
{
iter++;
n_tol = 0;
for ( unsigned int i = 1; i <= k; i++ )
{
num_sol = 1;
xb1 = to_old[ i ] - STEP_SOL;
xb2 = to_old[ i ] + STEP_SOL;
to[ i ] = SolveEq( circuit, NP, NC, type, ( t0[ i ] + STEP_SOL ), MAX_SOL, RetCode, i, n, p, NewWidth );
//if(RetCode != OK)
// return RetCode;
if ( to[ i ] == 0.0 )
{
num_sol = Brackets( circuit, NP, NC, xb1, xb2, type, i, n, p, NewWidth, RetCode );
//if(RetCode != OK)
// return RetCode;
if ( xb1 < 0.0 )
xb1 = 0.0;
if ( num_sol == 0 )
{
if ( i == 1 )
to[ i ] = tin;
else
to[ i ] = to[ i - 1 ] + tin;
}
else
{
to[ i ] = SolveEq( circuit, NP, NC, type, xb1, xb2, RetCode, i, n, p, NewWidth );
//if(RetCode != OK)
// return RetCode;
if ( to[ i ] == 0.0 )
{
if ( i == 1 )
to[ i ] = tin;
else
to[ i ] = to[ i - 1 ] + tin;
}


}
}
if ( type == NMOS )
{
tauo_n[ i ] = to[ i ];
}
else if ( type == PMOS )
{
tauo_p[ i ] = to[ i ];
}
// Dummy
int Sop, Op;
if ( i == 1 )
if ( type == NMOS )
Op = Calct0ts1N ( circuit, NP, Sop, NewWidth );
else
Op = Calct0ts1P ( circuit, NP, Sop, NewWidth );
if ( ( i == k ) && ( k > 1 ) )
if ( type == NMOS )
Op = Calct0tsnN( n, Sop );
else
Op = Calct0tsnP( n, Sop );
if ( type == NMOS )
{
for ( unsigned int j = 1; j <= n; j++ )
t0[ j ] = t0_n[ j ];
}
else if ( type == PMOS )
{
for ( unsigned int j = 1; j <= p; j++ )
t0[ j ] = t0_p[ j ];
}

if ( ( fabs ( to[ i ] - to_old[ i ] ) <= STEP_SOL ) && ( iter > 1 ) )


n_tol++;
to_old[ i ] = to[ i ];
}
if ( ( n_tol == k ) || ( iter > ITERMAX ) )
found = 1;
}
delete[] to;
delete[] to_old;
return OK;
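
The loop above repeats a sweep over the k chain nodes until every solved time changes by no more than STEP_SOL between two consecutive sweeps, or until ITERMAX sweeps have been exceeded. The stopping rule on its own can be sketched as follows. This is only an illustration under stated assumptions: `sweepUntilStable` and its `update` callback are hypothetical names that do not exist in the simulator, and the callback stands in for the SolveEq/Brackets calls.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not part of the original sources): repeats sweeps over
// all unknowns until each one moved by at most `tol` during a sweep, or until
// more than `maxIter` sweeps were done. Returns the number of sweeps.
int sweepUntilStable(
    std::vector<double>& x,
    const std::function<double(std::size_t, const std::vector<double>&)>& update,
    double tol, int maxIter)
{
    int iter = 0;
    bool found = false;
    while (!found) {
        ++iter;
        std::size_t nTol = 0;  // unknowns already within tolerance in this sweep
        for (std::size_t i = 0; i < x.size(); ++i) {
            const double old = x[i];
            x[i] = update(i, x);
            // As in the listing, the first sweep never counts as converged.
            if (std::fabs(x[i] - old) <= tol && iter > 1)
                ++nTol;
        }
        if (nTol == x.size() || iter > maxIter)
            found = true;
    }
    return iter;
}
```

Because each unknown is updated in place, later unknowns in a sweep already see the new values of earlier ones, which mirrors the way `to[ i - 1 ]` is reused in the listing.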

LastEqN.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
double Fast::LastEqN( int OpCondition, unsigned int n, double tauon )
{
double Vc, temp, tc;
double C, D, F, G, H, I, J, K, M, N, O, P, R, S;
Vc = TECH.Ec_n * L_n[ n ];
double x = t0_n[ n ] - taui_n[ n ];
double y = t0_n[ n ] - tauon;
tc = ( VDD * tauon * ( t0_n[ n ] - taui_n[ n ] ) + \
Vs_n[ n ] * taui_n[ n ] * ( tauon - t0_n[ n ] ) ) / \
( VDD * ( t0_n[ n ] - taui_n[ n ] ) + Vs_n[ n ] * ( tauon - t0_n[ n ] ) );
C = -Vc * Vs_n[ n ] * beta_n[ n ] * ( A_1_n[ n ] + 1 ) / x;
I = ( 2 * A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] + \
2 * VDD * x + Vc * x + \
2 * ( Vs_n[ n ] * taui_n[ n ] - TECH.Vtn0 * x ) ) / \
( Vc * x );
H = -2 * Vs_n[ n ] * ( A_1_n[ n ] + 1 ) / ( Vc * x );

F = Vc * beta_n[ n ] * ( A_1_n[ n ] * Vs_n[ n ] * taui_n[ n ] + \
VDD * x + Vc * x + \
Vs_n[ n ] * taui_n[ n ] - TECH.Vtn0 * x ) / x;
D = VDD * x - Vs_n[ n ] * y;
G = ( Vc * Vc ) * beta_n[ n ] * ( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) * \
( sqrt ( ( 2 * VDD + Vc - 2 * TECH.Vtn0 ) / Vc ) - 1 ) / 2;
J = Vc * beta_n[ n ] * ( 2 * A_1_n[ n ] * Vs_n[ n ] * y * \
( D * taui_n[ n ] + Vc * x * y ) + \
2 * D * VDD * x * y + \
( VDD * VDD ) * tauon * x * x + \
VDD * x * y * ( Vc * x + \
Vs_n[ n ] * ( y - x ) - 2 * TECH.Vtn0 * x ) + \
Vs_n[ n ] * y * y * ( Vc * x - \
Vs_n[ n ] * taui_n[ n ] + 2 * TECH.Vtn0 * x ) ) / \
( 2 * D * x * y );
K = -Vc * beta_n[ n ] * ( 2 * A_1_n[ n ] * Vs_n[ n ] * y + \
VDD * x + Vs_n[ n ] * y ) / \
( 2 * x * y );
M = ( Vc * Vc ) * beta_n[ n ] * x * y * \
( 2 * A_1_n[ n ] * Vs_n[ n ] * ( VDD * ( x - y ) - Vc * y ) - \
2 * D * VDD + VDD * ( -Vc * x + \
2 * ( Vs_n[ n ] * ( x - y ) + TECH.Vtn0 * x ) ) - \
Vs_n[ n ] * y * ( Vc + 2 * TECH.Vtn0 ) );
N = 2 * D * ( y * ( Vc * x + Vs_n[ n ] * taui_n[ n ] ) - \
VDD * x * tauon );
O = Vc * beta_n[ n ] * ( VDD * ( 2 * t0_n[ n ] - tauon ) + ( Vc - 2 * TECH.Vtn0 ) * y ) / \
( 2 * y );
P = -Vc * VDD * beta_n[ n ] / ( 2 * y );
R = -( Vc * Vc ) * beta_n[ n ] * y * ( 2 * VDD + Vc - 2 * TECH.Vtn0 );
S = 2 * ( Vc * y - VDD * tauon );
switch ( OpCondition )
{
case _A_:
temp = -M * LOG2 ( 2 * ( D * D ) * ts_n[ n ] + N ) / ( D * D ) + \
M * LOG2 ( 2 * ( D * D ) * taui_n[ n ] + N ) / ( D * D ) - \
R * LOG2 ( S + 2 * VDD * taui_n[ n ] ) / VDD + \
R * LOG2 ( S + 2 * VDD * tauon ) / VDD - \
( 6 * F * H * ( t0_n[ n ] - ts_n[ n ] ) + \
3 * C * H * ( ( t0_n[ n ] * t0_n[ n ] ) - ( ts_n[ n ] * ts_n[ n ] ) ) - \
4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * t0_n[ n ] + I ), 1.5 ) + \
4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * ts_n[ n ] + I ), 1.5 ) + \
3 * H * ( 2 * J * ( ts_n[ n ] - taui_n[ n ] ) + \
K * ( ( ts_n[ n ] * ts_n[ n ] ) - ( taui_n[ n ] * taui_n[ n ] ) ) + \
2 * O * ( taui_n[ n ] - tauon ) + \
P * ( ( taui_n[ n ] * taui_n[ n ] ) - ( tauon * tauon ) ) ) ) / \
( 6 * H );
break;
case _B_:
temp = -R * LOG2 ( S + 2 * VDD * ts_n[ n ] ) / VDD + \
R * LOG2 ( S + 2 * VDD * tauon ) / VDD - \
( 6 * F * H * ( t0_n[ n ] - taui_n[ n ] ) + \
3 * C * H * ( ( t0_n[ n ] * t0_n[ n ] ) - ( taui_n[ n ] * taui_n[ n ] ) ) - \
4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * t0_n[ n ] + I ), 1.5 ) + \
4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * taui_n[ n ] + I ), 1.5 ) - \
3 * H * ( 2 * G * ( ts_n[ n ] - taui_n[ n ] ) + \
2 * O * ( tauon - ts_n[ n ] ) + \
P * ( ( tauon * tauon ) - ( ts_n[ n ] * ts_n[ n ] ) ) ) ) / ( 6 * H );

break;
case _C_:
temp = M * LOG2 ( 2 * ( D * D ) * tc + N ) / ( D * D ) - \
M * LOG2 ( 2 * ( D * D ) * ts_n[ n ] + N ) / ( D * D ) - \
( 6 * F * H * ( t0_n[ n ] - ts_n[ n ] ) + \
3 * C * H * ( ( t0_n[ n ] * t0_n[ n ] ) - ( ts_n[ n ] * ts_n[ n ] ) ) - \
4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * t0_n[ n ] + I ), 1.5 ) + \
4 * ( Vc * Vc ) * beta_n[ n ] * pow ( fabs ( H * ts_n[ n ] + I ), 1.5 ) - \
3 * H * ( 2 * J * ( tc - ts_n[ n ] ) + \
K * ( ( tc * tc ) - ( ts_n[ n ] * ts_n[ n ] ) ) ) ) / \
( 6 * H );
break;
case _D_:
temp = F * ( tc - t0_n[ n ] ) + C * 0.5 * ( tc * tc - t0_n[ n ] * t0_n[ n ] ) + \
2 * Vc * Vc * beta_n[ n ] * pow ( ( H * t0_n[ n ] + I ), 1.5 ) / ( 3 * H ) - \
2 * Vc * Vc * pow ( ( H * tc + I ), 1.5 ) / ( 3 * H );
break;
case _E_:
default:
temp = 0;
break;
}
return temp;
}

LastEqP.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
double Fast::LastEqP( int OpCondition, unsigned int p, double tauop )
{
double Vc, temp, tc;
double C, D, F, G, I, J, K, M, N, O, P, R, S, X, x, y;
Vc = TECH.Ec_p * L_p[ p ];
tc = ( VDD * t0_p[ p ] * ( taui_p[ p ] - tauop ) + Vd_p[ p ] * tauop * ( t0_p[ p ] - taui_p[ p ] ) + \
Vs_p[ p ] * taui_p[ p ] * ( tauop - t0_p[ p ] ) ) / ( VDD * ( taui_p[ p ] - tauop ) + Vd_p[ p ] * ( t0_p[ p ] - taui_p[ p ] ) + Vs_p[ p ] *
x = t0_p[ p ] - taui_p[ p ];
y = t0_p[ p ] - tauop;
X = VDD * t0_p[ p ] - Vs_p[ p ] * taui_p[ p ];
C = -Vc * beta_p[ p ] * ( A_1_p[ p ] * X + \
B_1_p[ p ] * x + \
VDD * t0_p[ p ] + Vc * x - Vs_p[ p ] * taui_p[ p ] ) / x;
D = Vc * beta_p[ p ] * ( A_1_p[ p ] + 1 ) * ( VDD - Vs_p[ p ] ) / x;
F = Vc * ( 2 * A_1_p[ p ] * X + \
2 * B_1_p[ p ] * x + \
2 * VDD * t0_p[ p ] + Vc * x - \
2 * Vs_p[ p ] * taui_p[ p ] ) / x;
G = Vc * beta_p[ p ] * SQRT ( Vc * ( 2 * VDD + Vc + 2 * TECH.Vtp0 ) ) - \
VDD * Vc * beta_p[ p ] - ( Vc * Vc ) * beta_p[ p ] - Vc * TECH.Vtp0 * beta_p[ p ];
J = VDD * ( y - x ) + Vd_p[ p ] * x - Vs_p[ p ] * y;
K = Vc * beta_p[ p ] * ( 2 * A_1_p[ p ] * y * \
( ( VDD * VDD ) * t0_p[ p ] * ( x - y ) + \
VDD * ( Vc * x * y - Vd_p[ p ] * t0_p[ p ] * x + \
Vs_p[ p ] * ( t0_p[ p ] * y - \
taui_p[ p ] * ( x - y ) ) ) - \
Vs_p[ p ] * ( Vc * x * y - taui_p[ p ] * ( Vd_p[ p ] * x - \
Vs_p[ p ] * y ) ) ) - \
2 * B_1_p[ p ] * J * x * y + \
( VDD * VDD ) * t0_p[ p ] * ( x - y ) * ( x + y ) + \
VDD * ( Vc * x * y * ( x + y ) - \
Vd_p[ p ] * x * ( t0_p[ p ] * ( x + y ) + tauop * ( x - y ) ) + \
Vs_p[ p ] * y * ( t0_p[ p ] * ( x + y ) - taui_p[ p ] * ( x - y ) ) ) - \
Vc * x * y * ( Vd_p[ p ] * x + Vs_p[ p ] * y ) + \
( Vd_p[ p ] * Vd_p[ p ] ) * tauop * x * x + \
Vs_p[ p ] * Vd_p[ p ] * x * y * ( y - x ) - \
Vs_p[ p ] * Vs_p[ p ] * y * y * taui_p[ p ] ) / \
( 2 * J * x * y );
I = Vc * beta_p[ p ] * ( 2 * A_1_p[ p ] * ( VDD - Vs_p[ p ] ) * y + \
VDD * ( x + y ) - Vd_p[ p ] * x - Vs_p[ p ] * y ) / \
( 2 * x * y );
M = ( Vc * Vc ) * beta_p[ p ] * x * y * \
( 2 * A_1_p[ p ] * ( VDD * ( Vc * y - Vd_p[ p ] * y + Vs_p[ p ] * x ) - \
Vs_p[ p ] * ( Vc * y + Vd_p[ p ] * ( x - y ) ) ) - \
2 * B_1_p[ p ] * J + VDD * ( Vc * ( x + y ) - \
2 * ( Vd_p[ p ] * y - Vs_p[ p ] * x ) ) - \
Vc * ( Vd_p[ p ] * x + Vs_p[ p ] * y ) + \
2 * Vd_p[ p ] * Vs_p[ p ] * ( y - x ) );
N = 2 * J * ( VDD * t0_p[ p ] * ( y - x ) + \
Vc * x * y + \
Vd_p[ p ] * tauop * x - \
Vs_p[ p ] * taui_p[ p ] * y );
O = -Vc * beta_p[ p ] * ( VDD * ( t0_p[ p ] + y ) + \
Vc * y - Vd_p[ p ] * tauop + 2 * TECH.Vtp0 * y ) / \
( 2 * y );
P = Vc * beta_p[ p ] * ( VDD - Vd_p[ p ] ) / ( 2 * y );
R = ( Vc * Vc ) * beta_p[ p ] * y * ( 2 * VDD + Vc + 2 * TECH.Vtp0 );
S = 2 * ( VDD * tauop - Vc * y - Vd_p[ p ] * tauop );
switch ( OpCondition )
{
case _A_:
temp = -M * LOG2 ( ( 2 * ( J * J ) * ts_p[ p ] - N ) ) / ( J * J ) + \
M * LOG2 ( ( 2 * ( J * J ) * taui_p[ p ] - N ) ) / ( J * J ) + \
R * LOG2 ( ( 2 * taui_p[ p ] * ( VDD - Vd_p[ p ] ) - S ) ) / ( Vd_p[ p ] - VDD ) + \
R * LOG2 ( ( 2 * tauop * ( VDD - Vd_p[ p ] ) - S ) ) / ( VDD - Vd_p[ p ] ) - \
( 2 * Vc * beta_p[ p ] * ( 2 * D * t0_p[ p ] - F * beta_p[ p ] ) * \
SQRT ( ( F * beta_p[ p ] - 2 * D * t0_p[ p ] ) / beta_p[ p ] ) + \
2 * Vc * beta_p[ p ] * ( F * beta_p[ p ] - 2 * D * ts_p[ p ] ) * \
SQRT ( ( F * beta_p[ p ] - 2 * D * ts_p[ p ] ) / beta_p[ p ] ) + \
3 * D * ( 2 * C * ( t0_p[ p ] - ts_p[ p ] ) + D * ( ( t0_p[ p ] * t0_p[ p ] ) - ( ts_p[ p ] * ts_p[ p ] ) ) + \
I * ( ( ts_p[ p ] * ts_p[ p ] ) - ( taui_p[ p ] * taui_p[ p ] ) ) + \
2 * K * ( ts_p[ p ] - taui_p[ p ] ) + \
2 * O * ( taui_p[ p ] - tauop ) + \
P * ( ( taui_p[ p ] * taui_p[ p ] ) - ( tauop * tauop ) ) ) ) / ( 6 * D );
break;
case _B_:
temp = R * LOG2 ( ( 2 * ts_p[ p ] * ( VDD - Vd_p[ p ] ) - S ) ) / ( Vd_p[ p ] - VDD ) + \
R * LOG2 ( ( 2 * tauop * ( VDD - Vd_p[ p ] ) - S ) ) / ( VDD - Vd_p[ p ] ) - \
( 2 * Vc * beta_p[ p ] * ( 2 * D * t0_p[ p ] - F * beta_p[ p ] ) * \
SQRT ( ( F * beta_p[ p ] - 2 * D * t0_p[ p ] ) / beta_p[ p ] ) + \
2 * Vc * beta_p[ p ] * ( F * beta_p[ p ] - 2 * D * taui_p[ p ] ) * \
SQRT ( ( F * beta_p[ p ] - 2 * D * taui_p[ p ] ) / beta_p[ p ] ) + \
3 * D * ( 2 * C * ( t0_p[ p ] - taui_p[ p ] ) + \
D * ( ( t0_p[ p ] * t0_p[ p ] ) - ( taui_p[ p ] * taui_p[ p ] ) ) + \
2 * G * ( taui_p[ p ] - ts_p[ p ] ) + \
2 * O * ( ts_p[ p ] - tauop ) + \
P * ( ( ts_p[ p ] * ts_p[ p ] ) - ( tauop * tauop ) ) ) ) / ( 6 * D );
break;
case _C_:
temp = M * LOG2 ( ( 2 * ( J * J ) * tc - N ) ) / ( J * J ) - \
M * LOG2 ( ( 2 * ( J * J ) * ts_p[ p ] - N ) ) / ( J * J ) - \
( 2 * Vc * beta_p[ p ] * ( 2 * D * t0_p[ p ] - F * beta_p[ p ] ) * \
SQRT ( ( F * beta_p[ p ] - 2 * D * t0_p[ p ] ) / beta_p[ p ] ) + \
2 * Vc * beta_p[ p ] * ( F * beta_p[ p ] - 2 * D * ts_p[ p ] ) * \
SQRT ( ( F * beta_p[ p ] - 2 * D * ts_p[ p ] ) / beta_p[ p ] ) + \
3 * D * ( 2 * C * ( t0_p[ p ] - ts_p[ p ] ) + \
D * ( ( t0_p[ p ] * t0_p[ p ] ) - ( ts_p[ p ] * ts_p[ p ] ) ) + \
I * ( ( ts_p[ p ] * ts_p[ p ] ) - ( tc * tc ) ) + \
2 * K * ( ts_p[ p ] - tc ) ) ) / ( 6 * D );
break;
case _D_:
temp = -( 2 * Vc * beta_p[ p ] * ( 2 * D * t0_p[ p ] - F * beta_p[ p ] ) * \
SQRT ( ( F * beta_p[ p ] - 2 * D * t0_p[ p ] ) / beta_p[ p ] ) + \
2 * Vc * beta_p[ p ] * ( F * beta_p[ p ] - 2 * D * tc ) * \
SQRT ( ( F * beta_p[ p ] - 2 * D * tc ) / beta_p[ p ] ) + \
3 * D * ( 2 * C * ( t0_p[ p ] - tc ) + D * ( t0_p[ p ] * t0_p[ p ] - tc * tc ) ) ) / \
( 6 * D );
break;
case _E_:

default:
temp = 0;
break;
}
return temp;
}

MiddleEqN.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
double Fast::MiddleEqN( int OpCondition, unsigned int i, double tauon )
{
double Vc, temp;
double D, F, G, H, I, J, K, M, N;
Vc = TECH.Ec_n * L_n[ i ];
switch ( OpCondition )
{
case _A_:
D = Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) + \
Vs_n[ i ] * ( tauon - t0_n[ i ] );
F = Vc * beta_n[ i ] * \
( 2 * A_1_n[ i ] * Vs_n[ i ] * ( t0_n[ i ] - tauon ) * \
( D * taui_n[ i ] + Vc * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) ) + \
2 * D * VDD * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) + \
Vc * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) * \
( Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) + Vs_n[ i ] * ( t0_n[ i ] - tauon ) ) + \
( Vd_n[ i ] * Vd_n[ i ] ) * tauon * ( ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - taui_n[ i ] ) ) + \
Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) * \
( Vs_n[ i ] * ( taui_n[ i ] - tauon ) + 2 * TECH.Vtn0 * ( taui_n[ i ] - t0_n[ i ] ) ) - \
Vs_n[ i ] * ( ( t0_n[ i ] - tauon ) * ( t0_n[ i ] - tauon ) ) * \
( Vs_n[ i ] * taui_n[ i ] + 2 * TECH.Vtn0 * ( taui_n[ i ] - t0_n[ i ] ) ) ) / \
( 2 * D * ( t0_n[ i ] - taui_n[ i ] ) * ( t0_n[ i ] - tauon ) );
G = Vc * beta_n[ i ] * \
( 2 * A_1_n[ i ] * Vs_n[ i ] * ( t0_n[ i ] - tauon ) + \
Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) + Vs_n[ i ] * ( t0_n[ i ] - tauon ) ) / \
( 2 * ( t0_n[ i ] - taui_n[ i ] ) * ( tauon - t0_n[ i ] ) );
H = ( Vc * Vc ) * beta_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) * \
( tauon - t0_n[ i ] ) * \
( 2 * A_1_n[ i ] * Vs_n[ i ] * ( Vc * ( t0_n[ i ] - tauon ) + \
Vd_n[ i ] * ( taui_n[ i ] - tauon ) ) + 2 * D * VDD + Vc * \
( Vd_n[ i ] * ( t0_n[ i ] - taui_n[ i ] ) + Vs_n[ i ] * ( t0_n[ i ] - tauon ) ) + \
2 * ( Vd_n[ i ] * ( Vs_n[ i ] * ( taui_n[ i ] - tauon ) + \
TECH.Vtn0 * ( taui_n[ i ] - t0_n[ i ] ) ) + Vs_n[ i ] * TECH.Vtn0 * ( t0_n[ i ] - tauon ) ) );
I = 2 * D * ( Vc * ( t0_n[ i ] - taui_n[ i ] ) * \
( t0_n[ i ] - tauon ) + Vd_n[ i ] * tauon * ( taui_n[ i ] - t0_n[ i ] ) + \
Vs_n[ i ] * taui_n[ i ] * ( t0_n[ i ] - tauon ) );
J = Vc * beta_n[ i ] * ( 2 * VDD * ( t0_n[ i ] - tauon ) + \
Vc * ( t0_n[ i ] - tauon ) + Vd_n[ i ] * tauon + \
2 * TECH.Vtn0 * ( tauon - t0_n[ i ] ) ) / ( 2 * ( t0_n[ i ] - tauon ) );
K = Vc * Vd_n[ i ] * beta_n[ i ] / ( 2 * ( tauon - t0_n[ i ] ) );
M = ( Vc * Vc ) * beta_n[ i ] * ( tauon - t0_n[ i ] ) * \
( 2 * VDD + Vc - 2 * TECH.Vtn0 );
N = 2 * ( Vc * ( t0_n[ i ] - tauon ) - Vd_n[ i ] * tauon );
temp = -H * LOG2 ( 2 * ( D * D ) * t0_n[ i ] + I ) / ( D * D ) + \
H * LOG2 ( 2 * ( D * D ) * taui_n[ i ] + I ) / ( D * D ) - \
M * LOG2 ( N + 2 * Vd_n[ i ] * taui_n[ i ] ) / Vd_n[ i ] + \
M * LOG2 ( N + 2 * Vd_n[ i ] * tauon ) / Vd_n[ i ] - \
( 2 * F * ( t0_n[ i ] - taui_n[ i ] ) + G * ( ( t0_n[ i ] * t0_n[ i ] ) - ( taui_n[ i ] * taui_n[ i ] ) ) + 2 * J * ( taui_n[ i ] - tauo
break;
case _E_:
default:
temp = 0;
break;

}
return temp;
}

MiddleEqP.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
double Fast::MiddleEqP( int OpCondition, unsigned int i, double tauop )
{
double Vc, temp;
double D, J, K, M, N, O, P, R, S, T;
Vc = TECH.Ec_p * L_p[ i ];
switch ( OpCondition )
{
case _A_:
D = VDD * ( taui_p[ i ] - tauop ) + \
Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) + \
Vs_p[ i ] * ( tauop - t0_p[ i ] );
J = Vc * beta_p[ i ] * \
( 2 * A_1_p[ i ] * ( t0_p[ i ] - tauop ) * ( ( VDD * VDD ) * t0_p[ i ] * ( taui_p[ i ] - tauop ) - \
VDD * ( Vc * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) + \
Vd_p[ i ] * t0_p[ i ] * ( taui_p[ i ] - t0_p[ i ] ) + \
Vs_p[ i ] * ( ( t0_p[ i ] * t0_p[ i ] ) - t0_p[ i ] * tauop + taui_p[ i ] * ( taui_p[ i ] - tauop ) ) ) + \
Vs_p[ i ] * ( Vc * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) - \
taui_p[ i ] * ( Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) + Vs_p[ i ] * ( tauop - t0_p[ i ] ) ) ) ) + \
2 * B_1_p[ i ] * D * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) + \
( VDD * VDD ) * t0_p[ i ] * ( taui_p[ i ] - tauop ) * \
( 2 * t0_p[ i ] - taui_p[ i ] - tauop ) - VDD * ( Vc * ( t0_p[ i ] - taui_p[ i ] ) * \
( t0_p[ i ] - tauop ) * ( 2 * t0_p[ i ] - taui_p[ i ] - tauop ) + \
Vd_p[ i ] * ( taui_p[ i ] - t0_p[ i ] ) * \
( 2 * ( t0_p[ i ] * t0_p[ i ] ) - t0_p[ i ] * ( taui_p[ i ] + tauop ) - \
tauop * ( taui_p[ i ] - tauop ) ) + Vs_p[ i ] * ( t0_p[ i ] - tauop ) * \
( 2 * ( t0_p[ i ] * t0_p[ i ] ) - t0_p[ i ] * ( taui_p[ i ] + tauop ) + taui_p[ i ] * ( taui_p[ i ] - tauop ) ) )
Vc * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) * ( Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) + \
Vs_p[ i ] * ( t0_p[ i ] - tauop ) ) - ( Vd_p[ i ] * Vd_p[ i ] ) * tauop * ( ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[
Vs_p[ i ] * ( tauop - t0_p[ i ] ) * ( Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) * \
( taui_p[ i ] - tauop ) + Vs_p[ i ] * taui_p[ i ] * ( tauop - t0_p[ i ] ) ) ) / \
( 2 * D * ( t0_p[ i ] - taui_p[ i ] ) * ( tauop - t0_p[ i ] ) );
K = Vc * beta_p[ i ] * ( 2 * A_1_p[ i ] * ( VDD - Vs_p[ i ] ) * ( t0_p[ i ] - tauop ) + \
VDD * ( 2 * t0_p[ i ] - taui_p[ i ] - tauop ) + \
Vd_p[ i ] * ( taui_p[ i ] - t0_p[ i ] ) + \
Vs_p[ i ] * ( tauop - t0_p[ i ] ) ) / ( 2 * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) );
M = ( Vc * Vc ) * beta_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) * \
( 2 * A_1_p[ i ] * ( VDD * ( Vc * ( t0_p[ i ] - tauop ) + \
Vd_p[ i ] * ( tauop - t0_p[ i ] ) + Vs_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) ) - \
Vs_p[ i ] * ( Vc * ( t0_p[ i ] - tauop ) + Vd_p[ i ] * ( tauop - taui_p[ i ] ) ) ) - \
2 * B_1_p[ i ] * D + VDD * ( Vc * ( 2 * t0_p[ i ] - taui_p[ i ] - tauop ) - \
2 * ( Vd_p[ i ] * ( t0_p[ i ] - tauop ) + Vs_p[ i ] * ( taui_p[ i ] - t0_p[ i ] ) ) ) - \
Vc * ( Vd_p[ i ] * ( t0_p[ i ] - taui_p[ i ] ) + \
Vs_p[ i ] * ( t0_p[ i ] - tauop ) ) + 2 * Vd_p[ i ] * Vs_p[ i ] * ( taui_p[ i ] - tauop ) );
N = 2 * D * ( VDD * t0_p[ i ] * ( taui_p[ i ] - tauop ) + \
Vc * ( t0_p[ i ] - taui_p[ i ] ) * ( t0_p[ i ] - tauop ) + \
Vd_p[ i ] * tauop * ( t0_p[ i ] - taui_p[ i ] ) + Vs_p[ i ] * taui_p[ i ] * ( tauop - t0_p[ i ] ) );
O = Vc * beta_p[ i ] * ( VDD * ( 2 * t0_p[ i ] - tauop ) + \
Vc * ( t0_p[ i ] - tauop ) - Vd_p[ i ] * tauop + 2 * TECH.Vtp0 * ( t0_p[ i ] - tauop ) ) / \
( 2 * ( tauop - t0_p[ i ] ) );
P = Vc * beta_p[ i ] * ( VDD - Vd_p[ i ] ) / ( 2 * ( t0_p[ i ] - tauop ) );
R = ( Vc * Vc ) * beta_p[ i ] * ( t0_p[ i ] - tauop ) * ( 2 * VDD + Vc + 2 * TECH.Vtp0 );
S = 2 * ( VDD * tauop + Vc * ( tauop - t0_p[ i ] ) - Vd_p[ i ] * tauop );
T = 2 * ( VDD - Vd_p[ i ] );
temp = -M * LOG2 ( 2 * ( D * D ) * t0_p[ i ] - N ) / ( D * D ) + \
M * LOG2 ( 2 * ( D * D ) * taui_p[ i ] - N ) / ( D * D ) - \
R * LOG ( T * taui_p[ i ] - S ) / T + \
R * LOG ( T * tauop - S ) / T - \
( 2 * J * ( t0_p[ i ] - taui_p[ i ] ) + \
K * ( ( t0_p[ i ] * t0_p[ i ] ) - ( taui_p[ i ] * taui_p[ i ] ) ) + \
2 * O * ( taui_p[ i ] - tauop ) + \
P * ( ( taui_p[ i ] * taui_p[ i ] ) - ( tauop * tauop ) ) ) / 2;
break;
case _E_:
default:
temp = 0;
break;
}
return temp;
}

Power.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
double Fast::CalcPower( const Circuit& circuit,
unsigned int NP,
unsigned int NC,
unsigned int n,
unsigned int p,
const double* NewWidth,
TransitionType TOut,
int& RetCode )
{
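// Zero-initialized so that the NOTRANSITION/default branch returns 0
// instead of an indeterminate value.
double Ecc = 0.0, Esc = 0.0;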
switch ( TOut )
{
case FALL:
// n chain
RetCode = CalcPowerN( circuit, NP, NC, Ecc, Esc, n, p, NewWidth );
if ( RetCode != OK )
return 0.0;
break;
case RISE:
// p chain
RetCode = CalcPowerP( circuit, NP, NC, Ecc, Esc, n, p, NewWidth );
if ( RetCode != OK )
return 0.0;
break;
case NOTRANSITION:
default:
break;
}
return ( Ecc + Esc );
}

QnP.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
double Fast::QnP(unsigned int n, unsigned int p)
{
double D_n, E_n, F_n, G_n;
double to;


double Vc, y;
if (n != 0)
{
Vc = TECH.Ec_n * L_n[1];
to = taui_n[1] * (1 - TECH.Vtn0 / VDD);
if (to > tauo_p[p])
to = tauo_p[p];
D_n = Vc * beta_n[1] * (VDD * VDD * taui_p[1] * (t0_p[p] - 2 * tauo_p[p]) - \
VDD * (Vc * (t0_p[p] - tauo_p[p]) * \
(2 * t0_p[p] - taui_p[1] - 2 * tauo_p[p]) + \
taui_p[1] * (Vd_p[p] * (t0_p[p] - 3 * tauo_p[p]) + \
2 * TECH.Vtn0 * (t0_p[p] - tauo_p[p]))) - \
Vd_p[p] * taui_p[1] * (Vc * (t0_p[p] - tauo_p[p]) + \
Vd_p[p] * tauo_p[p] + 2 * TECH.Vtn0 * \
(tauo_p[p] - t0_p[p]))) / \
(2 * taui_p[1] * (VDD - Vd_p[p]) * (t0_p[p] - tauo_p[p]));
E_n = Vc * beta_n[1] * (VDD * (2 * t0_p[p] - taui_p[1] - 2 * tauo_p[p]) + \
Vd_p[p] * taui_p[1]) / \
(2 * taui_p[1] * (tauo_p[p] - t0_p[p]));
F_n = Vc * Vc * beta_n[1] * (tauo_p[p] - t0_p[p]) * \
(2 * VDD * VDD * (t0_p[p] - taui_p[1]) + \
VDD * (Vc * (2 * t0_p[p] - taui_p[1] - 2 * tauo_p[p]) + \
2 * (Vd_p[p] * (taui_p[1] - tauo_p[p]) + TECH.Vtn0 * taui_p[1])) + \
Vd_p[p] * taui_p[1] * (Vc - 2 * TECH.Vtn0));
G_n = 2 * taui_p[1] * (VDD - Vd_p[p]) * (VDD * t0_p[p] + \
Vc * (t0_p[p] - tauo_p[p]) - \
Vd_p[p] * tauo_p[p]);
y = -F_n * (LOG2(2 * t0_p[p] * taui_p[1] * (VDD - Vd_p[p]) * \
(VDD - Vd_p[p]) - G_n)) / \
(taui_p[1] * (VDD - Vd_p[p]) * \
(VDD - Vd_p[p])) + \
F_n * (LOG2(2 * to * taui_p[1] * (VDD - Vd_p[p]) * \
(VDD - Vd_p[p]) - G_n)) / \
(taui_p[1] * (VDD - Vd_p[p]) * \
(VDD - Vd_p[p])) - \
(2 * D_n * (t0_p[p] - to) + E_n * (t0_p[p] * t0_p[p] - to * to)) * 0.5;
return y;
}
else
return 0.0;
}

QpN.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
double Fast::QpN(unsigned int n, unsigned int p)
{
double D_p, E_p, F_p, G_p;
double to;
double Vc, y;
if (p != 0)
{
Vc = TECH.Ec_p * L_p[1];
to = taui_n[1] * (1 + TECH.Vtp0 / VDD);
if (to > tauo_n[n])
to = tauo_n[n];
D_p = Vc * beta_p[1] * (VDD * (t0_n[n] - tauo_n[n]) * \
(2 * Vc * (t0_n[n] - tauo_n[n]) - Vd_n[n] * taui_n[1]) - \
Vd_n[n] * taui_n[1] * (Vc * (t0_n[n] - tauo_n[n]) - \
Vd_n[n] * tauo_n[n] + 2 * TECH.Vtp0 * (t0_n[n] - tauo_n[n]))) / \
(2 * Vd_n[n] * taui_n[1] * (t0_n[n] - tauo_n[n]));

E_p = Vc * beta_p[1] * (2 * VDD * (t0_n[n] - tauo_n[n]) - Vd_n[n] * taui_n[1]) / \
(2 * taui_n[1] * (t0_n[n] - tauo_n[n]));
F_p = Vc * Vc * beta_p[1] * (t0_n[n] - tauo_n[n]) * \
(2 * VDD * VDD * (t0_n[n] - tauo_n[n]) + \
2 * VDD * (Vc * (t0_n[n] - tauo_n[n]) + \
Vd_n[n] * (tauo_n[n] - taui_n[1])) - Vd_n[n] * taui_n[1] * (Vc + 2 * TECH.Vtp0));
G_p = 2 * Vd_n[n] * taui_n[1] * (VDD * (t0_n[n] - tauo_n[n]) + \
Vc * (t0_n[n] - tauo_n[n]) + Vd_n[n] * tauo_n[n]);
y = -F_p * (LOG2(2 * Vd_n[n] * Vd_n[n] * t0_n[n] * taui_n[1] - G_p)) / \
(Vd_n[n] * Vd_n[n] * taui_n[1]) + \
(F_p * LOG2(2 * Vd_n[n] * Vd_n[n] * to * taui_n[1] - G_p)) / \
(Vd_n[n] * Vd_n[n] * taui_n[1]) - \
(2 * D_p * (t0_n[n] - to) + E_p * (t0_n[n] * t0_n[n] - to * to)) * 0.5;
return y;
}
else
return 0.0;
}

Solve.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

#define SIGN(a,b) ((b) >= 0.0 ? fabs(a) : -fabs(a))

///
double Fast::SolveEq( const Circuit& circuit, unsigned int NP, unsigned int NC, TransistorType type, double start, double end, int& RetCode, unsigned i
{
int iter;
double a = start, b = end, c = end, d, e, min1, min2;
double fa, fb, fc, pp, q, r, s, tol1, xm, last;
double tol = TOL;
if ( type == NMOS )
fb = EqN( circuit, NP, NC, b, RetCode, i, n, p, NewWidth );
else if ( type == PMOS )
fb = EqP( circuit, NP, NC, b, RetCode, i, n, p, NewWidth );
last = fb;
if ( type == NMOS )
fa = EqN( circuit, NP, NC, a, RetCode, i, n, p, NewWidth );
else if ( type == PMOS )
fa = EqP( circuit, NP, NC, a, RetCode, i, n, p, NewWidth );
if ( ( fa > 0.0 && fb > 0.0 ) || ( fa < 0.0 && fb < 0.0 ) )
{
RetCode = NOT_FOUND;
return 0.0;
}
fc = fb;
for ( iter = 1; iter <= ITERMAX; iter++ )
{
if ( ( fb > 0.0 && fc > 0.0 ) || ( fb < 0.0 && fc < 0.0 ) )
{
c = a;
fc = fa;
e = d = b - a;
}
if ( fabs ( fc ) < fabs ( fb ) )
{
a = b;
b = c;
c = a;
fa = fb;
fb = fc;
fc = fa;

}
tol1 = 2.0 * EPS * fabs ( b ) + 0.5 * tol;
xm = 0.5 * ( c - b );
if ( fabs ( xm ) <= tol1 || fb == 0.0 )
return b;
if ( fabs ( e ) >= tol1 && fabs ( fa ) > fabs ( fb ) )
{
s = fb / fa;
if ( a == c )
{
pp = 2.0 * xm * s;
q = 1.0 - s;
}
else
{
q = fa / fc;
r = fb / fc;
pp = s * ( 2.0 * xm * q * ( q - r ) - ( b - a ) * ( r - 1.0 ) );
q = ( q - 1.0 ) * ( r - 1.0 ) * ( s - 1.0 );
}
if ( pp > 0.0 )
q = -q;
pp = fabs ( pp );
min1 = 3.0 * xm * q - fabs ( tol1 * q );
min2 = fabs ( e * q );
if ( 2.0 * pp < ( min1 < min2 ? min1 : min2 ) )
{
e = d;
d = pp / q;
}
else
{
d = xm;
e = d;
}
}
else
{
d = xm;
e = d;
}
a = b;
fa = fb;
if ( fabs ( d ) > tol1 )
b += d;
else
b += SIGN ( tol1, xm );
if ( type == NMOS )
fb = EqN( circuit, NP, NC, b, RetCode, i, n, p, NewWidth );
else if ( type == PMOS )
fb = EqP( circuit, NP, NC, b, RetCode, i, n, p, NewWidth );
}
RetCode = NOT_FOUND;
return 0.0;
}
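
SolveEq appears to follow the structure of Brent's bracketing root finder: a secant or inverse quadratic interpolation step is tried first, and the routine falls back to bisection when that step is not acceptable. A minimal standalone sketch of the same scheme is given below. It is an illustration only: `brentRoot` is a hypothetical name, and a generic `std::function` stands in for the EqN/EqP members, whose definitions are not part of this listing.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <limits>

// Hypothetical standalone version: returns true and stores the root in `root`
// when f changes sign on [a, b] and the iteration converges, false otherwise.
bool brentRoot(const std::function<double(double)>& f,
               double a, double b, double tol, int maxIter, double& root)
{
    const double eps = std::numeric_limits<double>::epsilon();
    double fa = f(a), fb = f(b);
    if ((fa > 0.0 && fb > 0.0) || (fa < 0.0 && fb < 0.0))
        return false;                            // root is not bracketed
    double c = b, fc = fb, d = b - a, e = d;
    for (int iter = 0; iter < maxIter; ++iter) {
        if ((fb > 0.0 && fc > 0.0) || (fb < 0.0 && fc < 0.0)) {
            c = a; fc = fa; d = b - a; e = d;    // keep the root between b and c
        }
        if (std::fabs(fc) < std::fabs(fb)) {     // make b the best estimate
            a = b; b = c; c = a;
            fa = fb; fb = fc; fc = fa;
        }
        const double tol1 = 2.0 * eps * std::fabs(b) + 0.5 * tol;
        const double xm = 0.5 * (c - b);
        if (std::fabs(xm) <= tol1 || fb == 0.0) {
            root = b;
            return true;
        }
        if (std::fabs(e) >= tol1 && std::fabs(fa) > std::fabs(fb)) {
            double p, q;
            const double s = fb / fa;
            if (a == c) {                        // secant step
                p = 2.0 * xm * s;
                q = 1.0 - s;
            } else {                             // inverse quadratic interpolation
                const double qq = fa / fc, r = fb / fc;
                p = s * (2.0 * xm * qq * (qq - r) - (b - a) * (r - 1.0));
                q = (qq - 1.0) * (r - 1.0) * (s - 1.0);
            }
            if (p > 0.0) q = -q;
            p = std::fabs(p);
            const double min1 = 3.0 * xm * q - std::fabs(tol1 * q);
            const double min2 = std::fabs(e * q);
            if (2.0 * p < (min1 < min2 ? min1 : min2)) {
                e = d; d = p / q;                // accept the interpolated step
            } else {
                d = xm; e = d;                   // fall back to bisection
            }
        } else {
            d = xm; e = d;                       // bounds shrink too slowly: bisect
        }
        a = b; fa = fb;
        b += (std::fabs(d) > tol1) ? d : (xm >= 0.0 ? tol1 : -tol1);
        fb = f(b);
    }
    return false;                                // iteration limit reached
}
```

Returning a flag instead of the value 0.0 avoids the ambiguity in the listing, where a failed search and a genuine root at zero look the same to the caller unless RetCode is inspected.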

t0N.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
double Fast::t0N( const Circuit& circuit, unsigned int NP, unsigned int NC, double t, const double* NewWidth, int& RetCode )
{
// compute the time at which the first n-mos start conducting, using
// bootstrap

double A_2_n, B_2_n, C_2_n, D_2_n, Vc, y, Cm1, Cov, Cj, t0_bs;
Vc = TECH.Ec_n * L_n[ 1 ];
Cov = TECH.Cgd0_n * ( W_n[ 1 ] + TECH.XW_n );
Cm1 = Cov;
Cj = 0.0;
int njn, ngn, njp, ngp;
double Wjn, Wgn, Wjp, Wgp;
unsigned int node;
const char* name = pathlist[ NP ].TransistorName( 0, NC );
if ( circuit[ name ].Source() == 0 )
{
node = circuit[ name ].Drain();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
else if ( circuit[ name ].Drain() == 0 )
{
node = circuit[ name ].Source();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
Cj += TECH.C_nj * Wjn * TECH.Df * pow ( 1 + Vd_n[ 1 ] / TECH.PB_n, -TECH.mj_n ) + \
TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * pow ( 1 + Vd_n[ 1 ] / TECH.PB_n, -TECH.mjsw_n );
Cj += TECH.Cgs0_n * ( Wjn + njn * TECH.XW_n );
// evaluate Cgs Cgd @ V node 1 for pmos
Cj += TECH.C_pj * Wjp * TECH.Df * pow ( 1 + ( VDD - Vd_n[ 1 ] ) / TECH.PB_p, -TECH.mj_p ) + \
TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * pow ( 1 + ( VDD - Vd_n[ 1 ] ) / TECH.PB_p, -TECH.mjsw_p );
Cj += TECH.Cgd0_p * ( Wjp + njp * TECH.XW_p );
// evaluate other capacitances
int nc;
Cj += circuit.CapStaticGnd( node, nc );
Cj += circuit.CapStaticVdd( node, nc );
// evaluate gate capacitances
Cj += Wgn * TECH.Lmin * TECH.Cox_n;
Cj += Wgp * TECH.Lmin * TECH.Cox_p;
t0_bs = TECH.Vtn0 * taui_n[ 1 ] / VDD;
A_2_n = Vc * beta_n[ 1 ] * ( Vc - TECH.Vtn0 );
B_2_n = VDD * Vc * beta_n[ 1 ] / taui_n[ 1 ];
C_2_n = 2 * VDD / ( Vc * taui_n[ 1 ] );
D_2_n = ( Vc - 2 * TECH.Vtn0 ) / Vc;
y = -2 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( C_2_n * t + D_2_n, 1.5 ) / ( 3 * C_2_n ) + \
B_2_n * ( t * t ) * 0.5 + t * ( A_2_n * taui_n[ 1 ] - Cov * VDD ) / ( taui_n[ 1 ] ) - \
( 6 * A_2_n * C_2_n * t0_bs + 3 * B_2_n * C_2_n * ( t0_bs * t0_bs ) - \
4 * ( Vc * Vc ) * beta_n[ 1 ] * pow ( C_2_n * t0_bs + D_2_n, 1.5 ) ) / ( 6 * C_2_n );
RetCode = OK;
return y;
}
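
The two Cj terms built with pow() in t0N have the usual form of a bias-dependent junction capacitance: a bottom-plate term proportional to the diffusion area plus a sidewall term proportional to its perimeter, each scaled by (1 + V/PB) raised to a negative grading exponent. The sketch below isolates that expression. The function name and parameter names are hypothetical, and reading W * Df as the area and 2 * (W + n * Df) as the perimeter is an interpretation of the listing, not something it states.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper mirroring the Cj expression in t0N:
//   cjArea  * W * Df          * (1 + V/PB)^(-mj)
// + cjPerim * 2 * (W + n*Df)  * (1 + V/PB)^(-mjsw)
double junctionCapacitance(double cjArea, double cjPerim,
                           double width, double diffLength, int nJunctions,
                           double vReverse, double pb, double mj, double mjsw)
{
    const double area = width * diffLength;
    const double perimeter = 2.0 * (width + nJunctions * diffLength);
    const double x = 1.0 + vReverse / pb;   // normalized reverse bias
    return cjArea * area * std::pow(x, -mj) +
           cjPerim * perimeter * std::pow(x, -mjsw);
}
```

At zero bias both pow() factors equal one, so the result reduces to the plain area and perimeter sum; increasing the reverse bias lowers the capacitance.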

t0P.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "fast.h"

///
double Fast::t0P( const Circuit& circuit, unsigned int NP, unsigned int NC, double t, const double* NewWidth, int& RetCode )
{
double A_2_p, B_2_p, C_2_p, D_2_p, Vc, y, Cm1, Cov, t0_bs, Cj;
double alpha, theta, Y;
Vc = TECH.Ec_p * L_p[ 1 ];
Cov = TECH.Cgd0_p * ( W_p[ 1 ] + TECH.XW_p );

Cm1 = Cov;
Cj = 0.0;
int njn, ngn, njp, ngp;
double Wjn, Wgn, Wjp, Wgp;
unsigned int nn = pathlist[ NP ].GetNumTranN( NC );
unsigned int pp = pathlist[ NP ].GetNumTranP( NC );
unsigned int node;
const char* name = pathlist[ NP ].TransistorName( nn + pp - 1, NC );
// first pmos
unsigned int VDDNode = circuit.ValimNode();
if ( circuit[ name ].Source() == VDDNode )
{
node = circuit[ name ].Drain();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
else if ( circuit[ name ].Drain() == VDDNode )
{
node = circuit[ name ].Source();
Wjn = circuit.JunctionNWidth( node, njn, NewWidth );
Wgn = circuit.GateNWidth( node, ngn, NewWidth );
Wjp = circuit.JunctionPWidth( node, njp, NewWidth );
Wgp = circuit.GatePWidth( node, ngp, NewWidth );
}
else
{
RetCode = NOT_FOUND;
return 0.0;
}
// evaluate Cgs Cgd @ V node 1 for pmos
Cj += TECH.C_pj * Wjp * TECH.Df * pow ( 1 + ( VDD - Vd_p[ 1 ] ) / TECH.PB_p, -TECH.mj_p ) + \
TECH.C_pp * 2 * ( Wjp + njp * TECH.Df ) * pow ( 1 + ( VDD - Vd_p[ 1 ] ) / TECH.PB_p, -TECH.mjsw_p );
Cj += TECH.Cgs0_p * ( Wjp + njp * TECH.XW_p );
// evaluate Cgs Cgd @ V node 1 for nmos
Cj += TECH.C_nj * Wjn * TECH.Df * pow ( 1 + Vd_p[ 1 ] / TECH.PB_n, -TECH.mj_n ) + \
TECH.C_np * 2 * ( Wjn + njn * TECH.Df ) * pow ( 1 + Vd_p[ 1 ] / TECH.PB_n, -TECH.mjsw_n );
Cj += TECH.Cgd0_n * ( Wjn + njn * TECH.XW_n );
// evaluate other capacitances
int nc;
Cj += circuit.CapStaticGnd( node, nc );
Cj += circuit.CapStaticVdd( node, nc );
// evaluate gate capacitances
Cj += Wgn * TECH.Lmin * TECH.Cox_n;
Cj += Wgp * TECH.Lmin * TECH.Cox_p;
t0_bs = -TECH.Vtp0 * taui_p[ 1 ] / VDD;
A_2_p = Vc * beta_p[ 1 ] * ( Vc + TECH.Vtp0 );
B_2_p = VDD * Vc * beta_p[ 1 ] / taui_p[ 1 ];
C_2_p = ( Vc + 2 * TECH.Vtp0 ) / Vc;
D_2_p = 2 * VDD / ( Vc * taui_p[ 1 ] );
Y = Cj + Cov;
alpha = pow( ( D_2_p * t0_bs + C_2_p ), 1.5 );
theta = pow( ( D_2_p * t + C_2_p ), 1.5 );
y = ( 2 * Vc * Vc * beta_p[ 1 ] * theta ) / \
( 3 * D_2_p ) - \
B_2_p * t * t * 0.5 + \
t * ( -A_2_p * taui_p[ 1 ] + Cov * VDD ) / \
( taui_p[ 1 ] ) + \
( 6 * A_2_p * D_2_p * t0_bs + 3 * B_2_p * D_2_p * t0_bs * t0_bs - \
4 * Vc * Vc * beta_p[ 1 ] * alpha ) / ( 6 * D_2_p );
RetCode = OK;
return y;
}


TestOpt.cc
#include "mystdinclude.h"
#include "myenum.h"
#include "print.h"
#include "test.h"

///
TestOpt::TestOpt( const CritPathList& pathlist, const Options& options)
:
EvaluationAlgorithm( pathlist, options )
{
print_log( "Creating TestOpt instance..." );
}
///
TestOpt::~TestOpt()
{}
///
int TestOpt::Run( const Circuit& circuit, const double *NewWidth, const unsigned* ValidPath )
{
Calls++;
for ( unsigned int NP = 0; NP < NumPath; NP++ )
{
if (ValidPath[NP])
{
int RetCode;
CPDelay[ NP ] = 0.0;
CPPower[ NP ] = 0.0;
CPNoise[ NP ] = 0.0;
Area = 0.0;
for (unsigned int i = 0; i < circuit.GetNTran(); i++)
{
double x = NewWidth[i];
double f, g, h , l;
// f
f = x * x * x * x * 3.0 / 8000.0;
f += -x * x * x * 11.0 / 400.0;
f += x * x * 27.0 / 40.0;
f += -x * 27.0 / 4.0;
f += 165.0 / 4.0;
// l
l = x * x * x * x * 3.0 / 7700.0;
l += -x * x * x * 11.0 / 402.0;
l += x * x * 27.0 / 39.5;
l += -x * 27.0 / 4.0;
l += 150.0 / 4.0;
// g
g = x * x * x * x * 3.0 / 8000.0;
g += -x * x * x * 13.0 / 400.0;
g += x * x * 39.0 / 40.0;
g += -x * 45.0 / 4.0;
g += 205.0 / 4.0;
// h
h = x * x * x * x * x * x * 5.01264E-8;
h += -x * x * x * x * x * 1.60540E-5;
h += x * x * x * x * 0.001948124;
h += -x * x * x * 0.111669;
h += x * x * 3.05849;
h += -x * 34.6888;
h += 164.782;
//CPDelay[ NP ] = f;
if (f > i)
CPDelay[ NP ] = f;
else

CPDelay[ NP ] = l;
CPPower[ NP ] = NewWidth[i] * NewWidth[i] * NewWidth[i];
Area += NewWidth[i];
}
CPNoise[ NP ] = 0.0;
}
}
return OK;
}

BIBLIOGRAPHY
[1] N. H. E. Weste and K. Eshraghian, Principles of CMOS VLSI Design.
Addison-Wesley, 1993.
[2] J. Yuan, High speed circuit techniques for pipelining and for one
clockcycle decision. Eurochip advanced course, high speed silicon
design, Apr. 1994.
[3] W. C. Elmore, The transient response of damped linear network with
particular regard to wideband amplifiers, Journal of Applied Physics,
vol. 19, pp. 5563, Jan. 1948.
[4] R. Gupta, B. Tutuianu, and L. T. Pileggi, The elmore delay as a
bound for rc trees with generalized inputt signals, IEEE Transaction
on ComputerAided Design, vol. 16, pp. 95104, Jan. 1997.
[5] L. Brocco, S. Mccormic, and J. allen, Macromodelling cmos circuits
for timing simulation, IEEE Transaction on ComputerAided Design,
vol. 7, pp. 12371249, Dec. 1988.
[6] N. Hedenstierna and K. O. Jeppson, Cmos circuit speed and buffer
optimization, IEEE Transaction on ComputerAided Design, vol. CAD
6, pp. 270281, Mar. 1987.
[7] P. Cocchini, G. Piccinini, and M. Zamboni, A comprehensive submicron most delay model and its application to cmos buffers, IEEE
Journal of Solid State Circuits, vol. 32, Aug. 1997.
[8] T. Sakurai and A. R. Newton, Alphapower law mosfet model and
its application in inverter delay and other formulas, IEEE Journal of
Solid State Circuits, vol. 25, pp. 584594, Apr. 1990.

Bibliography

286

[9] T. Sakurai and A. R. Newton, A simple mosfet model for circuit analysis, IEEE Transactions on Electron Devices, vol. 38, pp. 887894, Apr.
1991.
[10] S. Dutta, S. S. M. Shetti, and S. L. Lusky, Comprehensive delay model
for cmos inverters, IEEE Journal of Solid State Circuits, vol. 30, pp. 864
871, Aug. 1995.
[11] L. Bisdounis, S. Nikolaidis, and O. Koufopavlou, Propagation delay
and shortcircuit power dissipation modeling of the cmos inverter,
IEEE Transaction on Circuits and Systems, vol. 45, pp. 259270, Mar.
1998.
[12] R. S. Muller and T. I. Kamins, Device electronics for integrated circuit,
second edition. John wiley & sons, 1986.
[13] D. A. Wismer and R. Chattergy, Introduction to nonlinear optimization.
System science and engineering, NorthHolland, 1978.
[14] P. L. Yu, Multiplecriteria decision making. Mathematical concepts and
methods in science and engineering, Plenum Press New York and
London, 1985.
[15] M. J. D. Powell, An eeficient method for finding the minimum of a
function of several variables without calculating derivatives, Computer Journal, no. 7, pp. 152162, 1964.
[16] J. Yuan and C. Svensson, Cmos circuit speed optimization based
on switch level simulation, in Proceedings of IEEE International Symposium on Circuits and Systems, pp. 21092112, 1988.
[17] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, 1992.
[18] M. Graziano, M. Delaurenti, G. Masera, G. Piccinini, and M. Zamboni,
Noise safety design methodologies, in Proceedings of IEEE International Simposium on Quality of Electronics Design (ISQED2000), IEEE,
Mar. 2000.

Bibliography

287

[19] J. T. Kong, S. Z. Hussain, and D. Overhauser, Performance estimation of complex mos gates, IEEE Transaction on Circuits and Systems,
vol. 44, pp. 785795, Sept. 1997.
[20] S. Devadas and S. Malik, Survey of optimization techniques targeting
low power vlsi circuit, in Proceedings of Conference on Design Automation (DAC), 1995.
[21] B. Davari, R. H. Dennard, and G. G. Shahidi, Cmos scaling for high
performance and low powerthe next ten years, Proceedings of the
IEEE, vol. 83, Apr. 1995.
[22] S. S. Sapatnekar, V. B. Rao, P. M. Vaidya, and S. M. Kang, An exact
solution to the transistor sizing problem for cmos circuits using convex
approximation, IEEE Transaction on ComputerAided Design, vol. 12,
pp. 16211634, Nov. 1993.
[23] O. Coudert, Gate sizing for constrained delay/power/area optimization, IEEE Transaction on Very Large Scale Integration Systems, vol. 5,
pp. 465472, Dec. 1997.
[24] C. Chen, C. C. N. Chu, and D. F. Wong, Fast and exact simultaneous
gate and wire sizing by lagrangian relaxation, in Proceedings of Conference on Design Automation (DAC), pp. 617624, 1998.
[25] A. R. Conn, R. A. Haring, and C. Visweswariath, Noise considerations in circuit optimization, in Proceedings of IEEE/ACM International
conference on Computer Aided Design, pp. 220227, 1998.
[26] M. Delaurenti, G. Masera, G. Piccinini, M. R. Roch, and M. Zamboni,
Cmos power-delay model for cad optimization tools, in Proceedings
of IEEE A.Volta Workshop on Low-Power Design (VOLTA99), IEEE, Mar.
1999.
[27] G. Masera, G. Piccinini, M. R. Roch, and M. Zamboni, Isis: a cad tool
for high speed vlsi design, in Proceedings of CSA, (Irbid, Jordan), Mar.
1998.
[28] M. Graziano, G. Masera, G. Piccinini, M. R. Roch, and M. Zamboni,
Noise-tolerance analysis for high speed cmos circuits, in Proceedings
of ICM, (Monastir, Tunisia), Dec. 1998.

Bibliography

288

[29] M. Graziano, G. Masera, G. Piccinini, M. R. Roch, and M. Zamboni,


A statistical noise-tolerance analysis and test structure for logic families, in Proceedings of ICMTS, (Goteborg, Sweden), Mar. 1999.
[30] S. Eliantonio, Studio di algoritmi di ottimizzazione velocit`aarea per
strutture cmos, tesi di laurea, Politecnico di Torino, Dipartimento di
Elettronica, Mar. 1999.
[31] D. Zhou and X. Y. Liu, On the optimal drivers of highspeed low
power ics, International journal of High Speed Electronics and Systems,
vol. 7, no. 2, pp. 287303, 1996.
[32] V. Tiwari, D. Singh, S. Rajgopal, G. Mehta, R. Patel, and F. Baez, Reducing power in highperformance microprocessors, in Proceedings
of Conference on Design Automation (DAC), pp. 732737, 1998.
[33] H. Liao and W. W. Dai, A new cmos driver model for transient analysis and power dissipation, International journal of High Speed Electronics and Systems, vol. 7, no. 2, pp. 269285, 1996.
[34] G. Yeap and A. Wild, Introduction to lowpower vlsi design, International journal of High Speed Electronics and Systems, vol. 7, no. 2,
pp. 223248, 1996.
[35] A. Hirata, H. Onodera, and K. Tamaru, Proposal of a timing model
for cmos logic gates driving a crc load, in Proceedings of IEEE/ACM
International conference on Computer Aided Design, pp. 537544, 1998.
[36] A. R. Conn, P. K. Coulman, R. A. Haring, G. L. Morril, and
C. Visweswariath, Optimization of custom mos circuits by transistor
sizing, in Proceedings of IEEE/ACM International conference on Computer Aided Design, 1996.
[37] P. Larsson-Edefors, Technology mapping onto veryhighspeed
standard cmos hardware, IEEE Transaction on ComputerAided Design,
vol. 15, pp. 11371144, Sept. 1996.
[38] A. Wolfe, Oppurtunities and obstacles in lowpower systemlevel
cad, in Proceedings of Conference on Design Automation (DAC), 1996.

Bibliography

289

[39] C. S. D. Liu, Power consumption estimation in cmos vlsi chips, IEEE


Journal of Solid State Circuits, vol. 29, pp. 663670, June 1994.
[40] J. Cong and L. He, An efficient approach to simultaneous transistor
and interconnect sizing, in Proceedings of IEEE/ACM International conference on Computer Aided Design, 1996.
[41] D. Liu and C. Svensson, Impact of supply voltage on power consumption, speed and reliability of cmos circuits, in Proceedings of
internationa workshop on Power and Timing Modeling, Optimization and
Simulation (PATMOS), 1994.
[42] L. T. Wurtz, An efficient scaling procedure for domino cmos logic,
IEEE Journal of Solid State Circuits, vol. 28, pp. 979982, Sept. 1993.
[43] J. Yuan, Ultimate cmos speeds and device sizing. Eurochip advanced course, high speed silicon design, Apr. 1994.
[44] D. Chen and M. Sarrafzadeh, An exact algorithm for low power
libraryspecific gate resizing, in Proceedings of Conference on Design
Automation (DAC), 1996.
[45] M. Borah, R. M. Owens, and M. J. Irwin, Transistor sizing for low
power cmos circuits, IEEE Transaction on ComputerAided Design,
1996.
[46] B. Basaran and R. A. Rutenbar, An o(n) algorithm for transistor
stacking with performance constraints, in Proceedings of Conference on
Design Automation (DAC), 1996.
[47] M. R. C. M. Berkelaar, P. H. W. Buurman, and J. A. G. Jess, Computing
the entire active area/power consumption versus delay tradeoff curve
for gate sizing with piecewise linear simulator, IEEE Transaction on
ComputerAided Design, vol. 15, pp. 14241434, Nov. 1996.
[48] M. R. C. M. Berkelaar, P. H. W. Buurman, and J. A. G. Jess, Computing
the entire active area/power consumption versus delay tradeoff curve
for gate sizing with piecewise linear simulator, IEEE Transaction on
ComputerAided Design, vol. 15, pp. 14241434, Nov. 1996.

Bibliography

290

[49] S. Mehrotra, P. Franzon, and W. Liu, Global optimization approach to


transistor sizing for high performance cmos vlsi circuits, Tech. Rep.
NCSUVLSI 9310, North Carolina State University, Department of
Electrical and Computer Engineering, Nov. 1993.
[50] S. S.-S. Chung, A chargebased capacitance model of shortchannel
mosfets, IEEE Transaction on ComputerAided Design, vol. 8, pp. 17,
Jan. 1989.
[51] O. Coudert, R. Haddad, and S. Manne, New algorithms for gate sizing: a comparative study, in Proceedings of Conference on Design Automation (DAC), 1996.
` Power estimation of cell-based
[52] A. Bogliolo, L. Benini, and B. Ricco,
cmos circuits, in Proceedings of Conference on Design Automation (DAC),
1996.
[53] D. Syslvester and K. Keutzer, Getting to the bottom of deep submicron, in Proceedings of IEEE/ACM International conference on Computer Aided Design, 1998.
