EC E62- VLSI DESIGN

UNIT I
Introduction: Introduction to VLSI and VLSI fabrication - Introduction to power
reduction techniques - Dynamic Power Reduction - Static Power Reduction - CMOS
inverter - propagation delays - power dissipation - Stick Diagram - MOS layers -
design rules and layout - choice of layers.
UNIT II
VLSI Logic Circuits, Design Process and Layout: Pass transistor and transmission
gates - inverter, NAND gates and NOR gates for nMOS, CMOS and BiCMOS -
parity generator - multiplexers - code converters - PLA - Clocked sequential
circuits - Memories and Registers.
UNIT III
Arithmetic Circuits: One bit adder - multibit adder - Ripple carry - Carry Skip Adder -
Carry Look Ahead Adder - design of signed parallel adder - comparison of different
schemes in terms of delay - multipliers - Design of serial, parallel and pipelined
multipliers - different schemes and their comparison - 2's complement array
multiplication - Booth encoding - Wallace Tree multiplier.
UNIT IV
Programmable ASICs and FPGAs: Actel, Altera and Xilinx FPGA devices.
UNIT V
Introduction to Verilog: Basics of Verilog, operators, Data Types, Continuous
assignments, Sequential and parallel statement groups. Timing control (level and
edge sensitive) and delays, tasks and functions, control statements, Blocking &
nonblocking assignments, If-else and case statements, For-while-repeat and forever
loops, Rise, fall, min, max delays, Behavioral and synthesizable coding styles for
modeling combinational logic, Behavioral & synthesizable coding styles for
modeling sequential logic, Parameters and Defines for design reuse. Verilog and
logic synthesis.
TEXT BOOKS:
1. Neil H. E. Weste and K. Eshraghian, Principles of CMOS VLSI Design, Addison
Wesley Publishing Company, 1985.
2. Neil H. E. Weste, David Harris and Ayan Banerjee, Principles of CMOS VLSI
Design: A Circuits and Systems Perspective, Dorling Kindersley (India) Pvt Ltd,
2006.
3. Sebastian Smith, Application Specific Integrated Circuits, Pearson
Education, 2001.
4. J. Bhasker, A Verilog HDL Primer, Star Galaxy Press, 1997.
5. Wayne Wolf, Modern VLSI Design: System on Chip Design, Prentice Hall of
India, 2005.
REFERENCE BOOKS:
1. E. D. Fabricius, Introduction to VLSI Design, McGraw-Hill, 1990.
2. D. E. Thomas and Philip R. Moorby, The Verilog Hardware Description Language,
2nd ed., Kluwer Academic Publishers, 2002.
3. Jan M. Rabaey, Anantha Chandrakasan and Borivoje Nikolic, Digital Integrated
Circuits: A Design Perspective, Prentice Hall India, 2007.
UNIT I
Introduction
1.1. Introduction to VLSI

1.2. Introduction to VLSI fabrication



1.3. Introduction to power reduction techniques

1.4. Dynamic Power Reduction



1.5. Static Power Reduction



1.6. CMOS inverter


CMOS inverters (complementary MOSFET inverters) are among the most widely
used and adaptable MOSFET inverters in chip design. They operate with very
little power loss and at relatively high speed. Furthermore, the CMOS inverter has
good logic buffer characteristics, in that its noise margins in both the low and high
states are large.

This short description of CMOS inverters gives a basic understanding of how a
CMOS inverter works. It covers input/output characteristics, MOSFET states at
different input voltages, and power losses due to electrical current.

A CMOS inverter contains a PMOS and an NMOS transistor connected at the drain
and gate terminals, a supply voltage VDD at the PMOS source terminal, and a
ground connection at the NMOS source terminal. VIN is connected to the gate
terminals and VOUT is connected to the drain terminals (see diagram). It is
important to notice that the CMOS inverter does not contain any resistors, which
makes it more power efficient than a regular resistor-MOSFET inverter. As the
voltage at the input of the CMOS device varies between 0 and 5 volts, the states
of the NMOS and PMOS transistors vary accordingly. If we model each transistor as
a simple switch activated by VIN, the inverter's operation can be seen very easily:

Transistor "switch model"

The switch model of the MOSFET transistor is defined as follows (the PMOS
threshold is taken as a magnitude here):

MOSFET   Condition on MOSFET   State of MOSFET
NMOS     Vgs < Vtn             OFF
NMOS     Vgs > Vtn             ON
PMOS     Vsg < |Vtp|           OFF
PMOS     Vsg > |Vtp|           ON

When VIN is low, the NMOS is "off" while the PMOS is "on", charging
VOUT to logic high. When VIN is high, the NMOS is "on" and the PMOS is "off",
discharging VOUT to logic low.
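
The switch model described above can be sketched at the Verilog switch level. This is only an illustrative sketch: the module and instance names are assumptions, while pmos, nmos, supply1 and supply0 are standard Verilog primitives and net types. The terminal order of the switch primitives is (drain/output, source/input, gate/control).

```verilog
// Switch-level sketch of the CMOS inverter (names are illustrative).
module cmos_inverter_sw (input vin, output vout);
  supply1 vdd;   // net tied to logic 1 (VDD)
  supply0 gnd;   // net tied to logic 0 (ground)

  // PMOS conducts when its gate is 0: pulls vout up to vdd.
  pmos p1 (vout, vdd, vin);
  // NMOS conducts when its gate is 1: pulls vout down to gnd.
  nmos n1 (vout, gnd, vin);
endmodule
```

With vin = 0 only the PMOS switch conducts and vout is driven to 1; with vin = 1 only the NMOS switch conducts and vout is driven to 0, matching the description above.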

This model of the CMOS inverter helps to describe the inverter conceptually, but
it does not accurately describe the voltage transfer characteristic. A fuller
description requires more calculations and more device states.

Multiple state transistor model

The multiple state transistor model is a more accurate way to model the CMOS
inverter. It reduces the states of the MOSFET to three modes of operation:
cut-off, linear, and saturated, each of which has a different dependence on Vgs
and Vds. The formulas that govern the mode and the current in that mode are
given in the following tables:

NMOS Characteristics

Drain current                                        Condition on VGS    Condition on VDS             Mode of Operation
$I_D = 0$                                            $V_{GS} < V_{TN}$   All                          Cut-off
$I_D = k_N [2(V_{GS} - V_{TN}) V_{DS} - V_{DS}^2]$   $V_{GS} > V_{TN}$   $V_{DS} < V_{GS} - V_{TN}$   Linear
$I_D = k_N (V_{GS} - V_{TN})^2$                      $V_{GS} > V_{TN}$   $V_{DS} > V_{GS} - V_{TN}$   Saturated

PMOS Characteristics

Drain current                                        Condition on VSG     Condition on VSD             Mode of Operation
$I_D = 0$                                            $V_{SG} < -V_{TP}$   All                          Cut-off
$I_D = k_P [2(V_{SG} + V_{TP}) V_{SD} - V_{SD}^2]$   $V_{SG} > -V_{TP}$   $V_{SD} < V_{SG} + V_{TP}$   Linear
$I_D = k_P (V_{SG} + V_{TP})^2$                      $V_{SG} > -V_{TP}$   $V_{SD} > V_{SG} + V_{TP}$   Saturated

1.7. Propagation delays



RC Delay models

1.8. Power dissipation



1.9. Stick Diagram



1.10. MOS layers



1.11. Design rules and layout



1.12. Choice of layers.



UNIT II
VLSI Logic Circuits, Design Process and Layout

2.1. Pass transistor and transmission gates


2.2. Inverter for n MOS, CMOS and Bi CMOS


2.3. NAND gates for n MOS, CMOS and Bi CMOS


2.4. NOR gates for n MOS, CMOS and Bi CMOS

2.3. Parity generator


An even parity bit generator generates an output of 0 if the number of 1s in
the input sequence is even and 1 if the number of 1s in the input sequence is
odd.


The checker circuit gives an output of 0 if there is no error in the parity bit
generated. Thus it basically checks to see if the parity bit generator is error
free or not.
The design procedure is made simple by writing the truth table for the circuit.
Truth table:

Message     Even parity bit   Checker bit
X  Y  Z     P                 C
0  0  0     0                 0
0  0  1     1                 0
0  1  0     1                 0
0  1  1     0                 0
1  0  0     1                 0
1  0  1     0                 0
1  1  0     0                 0
1  1  1     1                 0

The circuit can now be derived by drawing the K-map for the output.

From this the minimal output equation is

$$P = \bar{X}\bar{Y}Z + \bar{X}Y\bar{Z} + X\bar{Y}\bar{Z} + XYZ = X \oplus Y \oplus Z$$

This function can be implemented using exclusive-or gates. The schematic of
the parity generator circuit is shown in Figure

Figure : Parity bit generator


Similarly the checker circuit can be designed using XOR gates, where
$C = X \oplus Y \oplus Z \oplus P$, and the circuit is shown in Figure.


Figure : Checker circuit

Now the parity bit generator and the checker circuit can be combined into one
circuit for simplicity. The final schematic of the circuit is shown in Figure.

Figure: Combined schematic of both parity bit generator and checker circuit
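
The XOR-based generator and checker described in this section can be sketched in Verilog as follows. This is a minimal sketch under the section's own equations (P = X xor Y xor Z and C = X xor Y xor Z xor P); the module names, port names and the small test module are assumptions made for illustration.

```verilog
// Even parity generator: P = X ^ Y ^ Z (names are illustrative).
module parity_gen (input x, y, z, output p);
  assign p = x ^ y ^ z;
endmodule

// Checker: C = X ^ Y ^ Z ^ P; C = 0 means no parity error.
module parity_chk (input x, y, z, p, output c);
  assign c = x ^ y ^ z ^ p;
endmodule

// Simulation-only module that walks through all eight messages.
module parity_tb;
  reg  [2:0] msg;   // {X, Y, Z}
  wire       p, c;
  integer    i;

  parity_gen g (.x(msg[2]), .y(msg[1]), .z(msg[0]), .p(p));
  parity_chk k (.x(msg[2]), .y(msg[1]), .z(msg[0]), .p(p), .c(c));

  initial begin
    for (i = 0; i < 8; i = i + 1) begin
      msg = i;
      #1 $display("XYZ=%b P=%b C=%b", msg, p, c);
    end
  end
endmodule
```

Run in a simulator, the printed rows should reproduce the truth table above, with the checker bit staying at 0 because the generated parity bit is fed straight into the checker.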
2.4. Multiplexers


2.5. Code converters


2.6. PLA


2.7. Clocked sequential circuits


2.8. Memories and Registers.


Memories are usually constructed as two-dimensional arrays of bits. Thus a
memory containing $2^w$ words each of $2^b$ bits will be configured as $2^w$ rows
by $2^b$ columns. The w address bits are decoded to select the row, and either the
whole word is output or a multiplexer is used to select a single bit using a
further b address bits.
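
The row/column organisation described above can be sketched as a behavioural Verilog model. This is an assumption-laden illustration, not a circuit from the notes: the module name, port names, default sizes and the fill pattern are all hypothetical, and only the read path is modelled.

```verilog
// Behavioural sketch of a 2^W-row by 2^B-column memory read path.
// W and B mirror the w and b of the text; all names are illustrative.
module mem_array #(parameter W = 4, B = 3) (
  input  [W-1:0]      row_addr,   // w address bits: select the row (word)
  input  [B-1:0]      col_addr,   // further b address bits: select one bit
  output [(1<<B)-1:0] word_out,   // whole word of 2^B bits
  output              bit_out     // single bit chosen by the column multiplexer
);
  reg [(1<<B)-1:0] cells [0:(1<<W)-1];   // 2^W rows of 2^B bits each
  integer r;

  // Simulation-only fill so that reads return known values (assumed pattern).
  initial
    for (r = 0; r < (1<<W); r = r + 1)
      cells[r] = r;

  assign word_out = cells[row_addr];     // row decode
  assign bit_out  = word_out[col_addr];  // column multiplexer
endmodule
```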


Read-only memory
A read-only memory (ROM) is like a PLA with all the possible minterms
being calculated. The individual memory cells can be very compact; here is a
4 x 4 fragment of the memory array:

Diffusion tabs are run under the polysilicon word lines wherever a 0 is to be
stored, other bit positions read as 1. The 4 words stored here will read as 4, 6,
3 and 7.
Programmable read-only memories (PROMs) allow the diffusion tabs to be
switched in electrically. Erasable PROMs allow this switching to be reversed,
either by exposure to ultraviolet light (EPROMs) or under digital control
(electrically erasable PROMs or EEPROMs).
Static read/write memory
The simplest form of writeable memory (RAM) is static memory. A bit is
stored in a pair of cross-coupled inverters, with separate circuits to control
the reading and writing of the data.


The memory has two independent ports for reading; both selection lines are
opened for writing. Six transistors are required to store each bit, plus some
overhead for the control circuitry.
Dynamic RAM
Fewer transistors are needed if the bit is stored as charge on the gate of a
FET.

The three-transistor memory cell operates as follows:


- Write by putting data on Data and strobing Write
- Read by pre-charging Data and strobing Read; the value obtained has to be
inverted.
- Refresh by reading and re-writing at least every millisecond or so.
Less circuitry is required for each individual bit at the expense of more
sophisticated control circuits.
This is taken to an extreme with a one-transistor memory cell:

The bit is stored as charge under the grounded gate of a second transistor.
Again, refreshing is required and reading requires the use of subtle analogue
sense amplifiers. The tessellated layout is, however, very compact:

Really dense memory circuits use specialised processes not available for
normal digital logic.


UNIT III
Arithmetic Circuits
3.1. One bit adder
The most basic arithmetic operation is the addition of two binary digits,
i.e. bits.
A combinational circuit that adds two bits, according to the scheme
outlined below, is called a half adder.
A full adder is one that adds three bits, the third produced from a
previous addition operation.
One way of implementing a full adder is to use two half adders.
The full adder is the basic unit of addition employed in all the adders
studied here.
Half Adder
A half adder is used to add two binary digits together, A and B. It
produces S, the sum of A and B, and the corresponding carry out Co.
Although a half adder is not very useful by itself, it can be used
as a building block for larger adding circuits such as the full adder (FA).

Full Adder
A full adder is a combinational circuit that performs the arithmetic
sum of three bits: A, B and a carry in, C, from a previous addition.
Also, as in the case of the half adder, the full adder produces the
corresponding sum, S, and a carry out Co.
As mentioned previously, a full adder may be built from two half
adders in series as shown below in Figure.
The sum of A and B is fed to a second half adder, which then adds it
to the carry in C (from a previous addition operation) to generate the
final sum S.
The carry out, Co, is the result of an OR operation taken from the carry
outs of both half adders.
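
The construction just described, two half adders plus an OR gate, can be sketched in Verilog. The module, port and wire names are assumptions chosen to follow the A, B, C, S and Co of the text.

```verilog
// Half adder: S = A xor B, Co = A and B.
module half_adder (input a, b, output s, co);
  assign s  = a ^ b;
  assign co = a & b;
endmodule

// Full adder built from two half adders in series and an OR gate.
module full_adder (input a, b, c, output s, co);
  wire s1;   // sum of A and B from the first half adder
  wire c1;   // carry out of the first half adder
  wire c2;   // carry out of the second half adder

  half_adder ha1 (.a(a),  .b(b), .s(s1), .co(c1));
  half_adder ha2 (.a(s1), .b(c), .s(s),  .co(c2));

  assign co = c1 | c2;   // OR of the two half adder carry outs
endmodule
```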


3.2. Multi bit adder


Multi bit adders are digital circuits that compute the addition of
variable binary strings of equivalent or different size in parallel.
The schematic diagram of a Multi bit adder is shown below in Fig.


3.3. Ripple carry adder


The ripple carry adder is constructed by cascading full adder (FA)
blocks in series.
One full adder is responsible for the addition of two binary digits at
any stage of the ripple carry.
The carryout of one stage is fed directly to the carry-in of the next
stage.
A number of full adders may be added to the ripple carry adder or
ripple carry adders of different sizes may be cascaded in order to
accommodate binary vector strings of larger sizes.
An n-bit parallel adder requires n computational elements (FA).
Figure shows an example of a parallel adder: a 4-bit ripple-carry adder.
It is composed of four full adders.
The augend bits of x are added to the addend bits of y according to
their binary position.
Each bit addition creates a sum and a carry out.
The carry out is then transmitted to the carry in of the next higher-
order bit. The final result creates a sum of four bits plus a carry out
(c4).

Even though this is a simple adder that can be used to add numbers of
unrestricted bit length, it is not very efficient when numbers with many
bits are used.
One of the most serious drawbacks of this adder is that the delay
increases linearly with the bit length.
As mentioned before, each full adder has to wait for the carry out of the
previous stage before it can output a steady-state result.
Therefore, even if the adder has a value at its output terminal, it has to
wait for the propagation of the carry before the output reaches a
correct value, as shown in Fig.


Taking again the example in the figure, the addition of x4 and y4 cannot
reach steady state until c4 becomes available. In turn, c4 has to wait
for c3, and so on down to c1.
If one full adder takes Tfa seconds to complete its operation, the final
result will reach its steady-state value only after 4Tfa seconds. Its area
is n Afa.
A (very) small improvement in area can be achieved if it is known in
advance that the first carry in (c0) will always be zero. (If so, the
first full adder can be replaced by a half adder.)
In general, assuming all gates have the same delay and area as a two-input
NAND (NAND-2), this circuit has a delay of 3n Tgate and an area of 5n Agate.
(One must be aware that in static CMOS this assumption is not true.) Gate
delays depend on intrinsic delay + fan-in delay + fan-out delay.

Generally speaking, the worst-case delay of the RCA occurs when a carry
signal transition ripples through all stages of the adder chain from the
least significant bit to the most significant bit, which is approximated
by:

$$t = (n - 1)\,t_c + t_s$$

where tc is the delay through the carry stage of a full adder, and ts is
the delay to compute the sum of the last stage.
The delay of ripple carry adder is linearly proportional to n, the
number of bits, therefore the performance of the RCA is limited when n
grows bigger.
The advantages of the RCA are lower power consumption as well as a
compact layout giving smaller chip area.
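
The cascade of full adders described in this section can be sketched as a parameterised Verilog module. The names (full_adder_cell, ripple_carry_adder, N, c0, cn) are assumptions; the generate loop simply wires the carry out of each stage to the carry in of the next.

```verilog
// One-bit full adder cell used by the ripple carry adder (illustrative name).
module full_adder_cell (input a, b, cin, output s, cout);
  assign s    = a ^ b ^ cin;
  assign cout = (a & b) | (cin & (a ^ b));
endmodule

// N-bit ripple carry adder: N full adders with the carry chained serially.
module ripple_carry_adder #(parameter N = 4) (
  input  [N-1:0] x, y,   // augend and addend
  input          c0,     // first carry in
  output [N-1:0] sum,
  output         cn      // final carry out
);
  wire [N:0] c;          // c[i] is the carry into stage i
  assign c[0] = c0;

  genvar i;
  generate
    for (i = 0; i < N; i = i + 1) begin : stage
      full_adder_cell fa (.a(x[i]), .b(y[i]), .cin(c[i]),
                          .s(sum[i]), .cout(c[i+1]));
    end
  endgenerate

  assign cn = c[N];
endmodule
```

Because c[i+1] depends on c[i], the critical path runs through every stage, which is the linear delay growth discussed above.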
3.4. Carry Skip Adder
A carry-skip adder consists of a simple ripple carry-adder with a
special speed up carry chain called a skip chain.
This chain defines the distribution of ripple carry blocks, which
compose the skip adder.


Carry Skip Mechanics


The addition of two binary digits at stage i (i ≠ 0) of the ripple
carry adder depends on the carry in, Ci, which in reality is the carry
out of the previous stage, i-1.
Therefore, in order to calculate the sum and the carry out, Ci+1, of
stage i, it is imperative that the carry in, Ci, be known in advance.
It is interesting to note that in some cases Ci+1 can be calculated
without knowledge of Ci.
Boolean Equations of a Full Adder:

$$P_i = A_i \oplus B_i \qquad (1)$$
$$S_i = P_i \oplus C_i \qquad (2)$$
$$C_{i+1} = A_i B_i + P_i C_i \qquad (3)$$

Supposing that Ai = Bi, then Pi in equation 1 would become zero
(equation 4). This would make Ci+1 depend only on the inputs Ai
and Bi, without needing to know the value of Ci.

$$A_i = B_i \;\Rightarrow\; P_i = 0 \qquad (4)$$

Therefore, if equation 4 is true then the carry out, Ci+1, will be one if
Ai = Bi = 1 or zero if Ai = Bi = 0.
Hence we can compute the carry out at any stage of the addition
provided equation 4 holds.
These findings would enable us to build an adder whose average time
of computation would be proportional to the longest chains of zeros and
of different digits of A and B.
Alternatively, given two binary strings of numbers, such as the
example below, it is very likely that we may encounter large chains of
consecutive bits (block 2) where Ai ≠ Bi. In order to deal with this
scenario we must reanalyze equation 3 carefully.

In the case of comparing two bits of opposite value, the carry out at
that particular stage, will simply be equivalent to the carry in.
Hence we can simply propagate the carry to the next stage without
having to wait for the sum to be calculated.


Two Random Bit Strings:

In order to take advantage of the last property, we can design an adder
that is divided into blocks, as shown in Fig, where a special purpose
circuit can compare the two binary strings inside each block and
determine, bit position by bit position, whether they are equal or not.
If every bit pair in the block is unequal, the carry entering the block
will simply be propagated to the next block, and all the carry inputs
to the bit positions in that block are then either all 0's or all 1's,
depending on the carry into the block.
Should even one pair of bits (Ai and Bi) inside a block be equal, the
carry skip mechanism would be unable to skip the block. In the
extreme case, which is still quite likely, that one such pair with
Ai = Bi exists in each block, no block is skipped and a carry is
generated in each block instead.
Carry Skip Chain
In summary the carry skip chain mechanism works as follows:


Two strings of binary numbers to be added are divided into blocks of
equal length.
In each cell within a block both bits are compared for inequality.
This is done by exclusive-ORing the two bits of each individual cell (a
parallel operation that is already present in the full adder), producing a
comparison string.
Next, the bits of the comparison string are ANDed together in a domino
fashion. This ensures that the comparison in each and every cell was
indeed unequal, so that we can proceed to propagate the carry to
the next block.
A MUX is responsible for selecting a generated carry or a propagated
(previous) carry, with its selection line being the output of the
comparison circuit just described.
If Ai ≠ Bi for each cell in the block, we say that a carry can skip
over the block; otherwise, if Ai = Bi for some cell, we say that the
carry must be generated in the block.
When studying carry skip adders the main purpose is to find a
configuration of blocks that minimizes the longest life of a carry, i.e.
from the time of its generation to the time of the generation of the next
carry.
Many models have been suggested: the first with blocks of equal size
and the second with blocks of different sizes according to some
heuristic.

The delay of an n-bit adder built from m-bit Carry Bypass Adder (CBA)
blocks rippled together can be given by:

where n is the adder length and m is the length of the blocks. Compared to
the RCA, the CBA has slightly improved speed for wider-bit adders (still
linear in n), but with higher active capacitance and area overhead because
of the extra bypass circuit.
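
One block of the carry skip chain summarised above can be sketched in Verilog as follows. This is a functional illustration only; the module name, the parameter M and the signal names are assumptions. Each cell XORs its two bits, the comparison bits are ANDed together, and a MUX chooses between the incoming carry and the carry rippled through the block.

```verilog
// One M-bit block of a carry skip (carry bypass) adder; names are illustrative.
module carry_skip_block #(parameter M = 4) (
  input  [M-1:0] a, b,
  input          cin,
  output [M-1:0] sum,
  output         cout
);
  wire [M-1:0] p = a ^ b;   // comparison string: p[i] = 1 when a[i] != b[i]
  wire [M:0]   c;           // ripple carries inside the block
  assign c[0] = cin;

  genvar i;
  generate
    for (i = 0; i < M; i = i + 1) begin : bit_cell
      assign sum[i] = p[i] ^ c[i];
      assign c[i+1] = (a[i] & b[i]) | (p[i] & c[i]);
    end
  endgenerate

  wire skip = &p;                    // all cells unequal: the carry can skip
  assign cout = skip ? cin : c[M];   // MUX: propagated carry or block carry
endmodule
```

Logically cout equals c[M] in both cases; the point of the MUX in a real circuit is that the skip path is shorter than the ripple path when every bit pair differs.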


3.5. Carry Look Ahead Adder


3.6. Design of signed parallel adder


3.7. Comparison of different schemes in terms of delay

3.8. Multipliers


3.9. Design of serial multipliers


Where area and power are of utmost importance and delay can be tolerated, the
serial multiplier is used. This circuit uses one adder to add the m * n partial
products. The circuit is shown in the fig. below for m=n=4. The multiplicand and
multiplier inputs have to be arranged in a special manner synchronized with
the circuit behavior, as shown in the figure. The inputs could be presented at
different rates depending on the length of the multiplicand and the multiplier.
Two clocks are used, one to clock the data and one for the reset. A first-order
approximation of the delay is O(mn). With this circuit arrangement the
delay is given as D = [ (m+1)n + 1 ] tfa.


As shown, each partial product (PP) is formed individually. The PPs are
added as follows: the intermediate sums are stored in the D flip-flops (DFF),
circulated, and added to the newly formed PP. This approach
is not suitable for large values of M or N.
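
As a quick check of the serial multiplier delay expression D = [(m+1)n + 1] tfa, the m = n = 4 case used in the figure can be substituted directly. This is only a worked substitution, not an additional result from the notes.

```latex
D = \bigl[(m+1)\,n + 1\bigr]\,t_{fa}
  = \bigl[(4+1)\cdot 4 + 1\bigr]\,t_{fa}
  = 21\,t_{fa}
```

The delay grows with the product of m and n, which is why the approach is described as unsuitable for large operands.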
3.10. Design of parallel multipliers
Each partial product bit of the multiplication can be computed in
parallel
Then, perform n n-bit adds to sum the partial products

In the design shown, the longest path is 2n-1 cells


Issues:
1. Don't need a full adder at each of the cells
2. Parallelogram shape doesn't fit well with other parts of the chip


An n x n multiplier requires n(n - 2) full adders (FA), n half adders and
$n^2$ AND gates
High bit of output is the last one to evaluate, so worst case delay for
4x4 multiplier is

Both the carry and sum delay for adders appear on the critical path, so
a balanced design is wanted
Because the inputs to different columns in a row arrive at different
times, fast carry chains don't work well
Use carry-save adders and only sum carries at the last stage
Can pipeline the array on diagonals to improve throughput
3.11. Design of pipelined multipliers
The general architecture of the serial/parallel multiplier is shown in the
figure below. One operand is fed to the circuit in parallel while the other is
fed serially. N partial products are formed each cycle. On successive cycles,
each cycle does the addition of one column of the multiplication table of M*N
PPs. The final results are stored in the output register after N+M cycles,
while the area required is N-1 for M=N.


3.12. Different schemes and their comparison


3.13. 2s complement array multiplication

3.14. Booth encoding


Multiplication of two's-complement numbers is more complicated, because
performing a straightforward unsigned multiplication of the two's-complement
representations of the inputs does not give the correct result.
Multipliers could be designed to convert both of their inputs to positive quantities and
use the sign bits of the original inputs to determine the sign of the result, but this
increases the time required to perform a multiplication.
A technique called Booth encoding is used to quickly convert two's-complement
numbers into a format that is easily multiplied.
The encoding is applied to the multiplier bits before the bits are used for getting partial products:
1. If the ith bit bi is 0 and the (i - 1)th bit bi-1 is 1, then take bi as +1
2. If the ith bit bi is 1 and the (i - 1)th bit bi-1 is 0, then take bi as -1
3. If the ith bit bi is 0 and the (i - 1)th bit bi-1 is 0, then take bi as 0
4. If the ith bit bi is 1 and the (i - 1)th bit bi-1 is 1, then take bi as 0
When the LSB b0 = 1, assume that it had b(-1) as 0, thus take b0 = -1

Booth's algorithm permits skipping over 1s, and when there are blocks of
1s it improves performance significantly.

Observe that the addition of 00000000 00010100 or its two's complement is
done only thrice, in contrast to the addition of 00000000 00010100 done 15
times in the earlier described procedures without using Booth's algorithm.
The adder circuit takes longer to operate than finding the -1, +1
and 0 digits for the multiplier.
The worst case for an implementation using Booth's algorithm is when pairs
of 01s or 10s occur very frequently in the multiplier.
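
The four recoding rules listed above can be sketched as a behavioural Verilog multiplier. This is an illustrative sketch rather than the circuit from the notes: the module name, the parameter N and the signal names are assumptions, and the loop adds or subtracts the shifted multiplicand according to the recoded digit instead of modelling any particular hardware structure.

```verilog
// Behavioural sketch of radix-2 Booth multiplication (names are illustrative).
module booth_mult #(parameter N = 8) (
  input  signed [N-1:0]       a,   // multiplicand
  input  signed [N-1:0]       b,   // multiplier (recoded bit by bit)
  output reg signed [2*N-1:0] p    // product
);
  integer i;
  reg prev;                        // holds b[i-1]; b[-1] is assumed to be 0
  reg signed [2*N-1:0] a_ext;      // sign-extended multiplicand

  always @* begin
    p     = 0;
    prev  = 1'b0;
    a_ext = a;                     // signed assignment sign-extends a
    for (i = 0; i < N; i = i + 1) begin
      case ({b[i], prev})
        2'b01:   p = p + (a_ext <<< i);   // bi = 0, bi-1 = 1: digit +1
        2'b10:   p = p - (a_ext <<< i);   // bi = 1, bi-1 = 0: digit -1
        default: ;                        // 00 or 11: digit 0, nothing to add
      endcase
      prev = b[i];
    end
  end
endmodule
```

For example, with b = 5 (binary 00000101) the recoded digits at bit positions 0 to 3 are -1, +1, -1, +1, so the loop computes a(-1 + 2 - 4 + 8) = 5a. Runs of identical bits produce digit 0 and need no addition, which is the skipping behaviour described above.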
3.15. Wallace Tree multiplier


UNIT IV
Programmable ASICs and FPGAs


UNIT V
Introduction to Verilog
5.1. Basics of Verilog

Verilog can be used to describe designs at four levels of abstraction:


(i) Algorithmic level (much like c code with if, case and loop statements).
(ii) Register transfer level (RTL uses registers connected by Boolean
equations).
(iii) Gate level (interconnected AND, NOR etc.).
(iv) Switch level (the switches are MOS transistors inside gates).
The language also defines constructs that can be used to control the input
and output of simulation.
There are two types of code in most HDLs:
Structural, which is a verbal wiring diagram without storage.
    assign a = b & c | d;   /* | is an OR */
    assign d = e & (~c);
Here the order of the statements does not matter. Changing e will change a.
Procedural, which is used for circuits with storage, or as a convenient way to
write conditional logic.
    always @(posedge clk)   // Execute the next statement on every rising clock edge.
        count <= count + 1;


5.2. Data Types
1. Value Set
Verilog consists of only four basic values. Almost all Verilog data types store
all these values:
0 (logic zero, or false condition)
1 (logic one, or true condition)
x (unknown logic value)
z (high impedance state)
2. Wire
A wire represents a physical wire in a circuit and is used to connect gates or
modules. The value of a wire can be read, but not assigned to, in a function or
block.
A wire does not store its value but must be driven by a continuous
assignment statement or by connecting it to the output of a gate or module.
Other specific types of wires include:
wand (wired-AND): the value of a wand depends on the logical AND of all the
drivers connected to it.
wor (wired-OR): the value of a wor depends on the logical OR of all the drivers
connected to it.
tri (three-state): all drivers connected to a tri must be z, except one (which
determines the value of the tri).

3. Reg
A reg (register) is a data object that holds its value from one procedural
assignment to the next. They are used only in functions and procedural
blocks. A reg is a Verilog variable type and does not necessarily imply a
physical register.


4. Input, Output, Inout


These keywords declare input, output and bidirectional ports of a module or
task. Input and inout ports are of type wire. An output port can be configured
to be of type wire, reg, wand, wor or tri. The default is wire.

5. Integer
Integers are general-purpose variables. For synthesis they are used mainly
as loop indices, parameters, and constants. They are implicitly of type reg.
However, they store data as signed numbers, whereas explicitly declared reg
types store them as unsigned.

6. Supply0, Supply1
Supply0 and supply1 define wires tied to logic 0 (ground) and logic 1 (power),
respectively.

7. Time
Time is a 64-bit quantity that can be used in conjunction with the $time
system function to hold simulation time. Time is not supported for synthesis
and hence is used only for simulation purposes.


8. Parameter
A parameter defines a constant that can be set when you instantiate a
module. This allows customization of a module during instantiation.
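
A short sketch of how a parameter customises a module at instantiation is given below. The counter and top modules and the WIDTH parameter are hypothetical names chosen for illustration.

```verilog
// Counter whose width is set by a parameter (illustrative names).
module counter #(parameter WIDTH = 4) (
  input                  clk, rst,
  output reg [WIDTH-1:0] count
);
  always @(posedge clk)
    if (rst) count <= {WIDTH{1'b0}};   // synchronous reset to zero
    else     count <= count + 1'b1;
endmodule

// The same module instantiated twice with different widths.
module top (input clk, rst, output [7:0] c8, output [3:0] c4);
  counter #(.WIDTH(8)) u8 (.clk(clk), .rst(rst), .count(c8));  // WIDTH overridden to 8
  counter              u4 (.clk(clk), .rst(rst), .count(c4));  // default WIDTH of 4
endmodule
```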

5.3. Operators


1. Arithmetic Operators
These perform arithmetic operations. The + and - can be used as either unary
(-z) or binary (x-y) operators.

2. Relational Operators
Relational operators compare two operands and return a single bit, 1 or 0.
These operators synthesize into comparators. Wire and reg variables are
treated as positive (unsigned). Thus (-3'b001) == 3'b111 and
(-3'd001) > 3'd110. However, for integers, -1 < 6.


3. Bit-wise Operators
Bit-wise operators do a bit-by-bit comparison between two operands.

4. Logical Operators
Logical operators return a single bit 1 or 0. Logical operators are typically
used in conditional (if ... else) statements since they work with expressions.

5. Reduction Operators
Reduction operators operate on all the bits of an operand vector and return a
single-bit value. These are the unary (one argument) form of the bit-wise
operators above.

6. Shift Operators
Shift operators shift the first operand by the number of bits specified by the
second operand. Vacated positions are filled with zeros for both left and right
shifts (There is no sign extension).


7. Concatenation Operator
The concatenation operator combines two or more operands to form a larger
vector.

8. Replication Operator
The replication operator makes multiple copies of an item.

9. Conditional Operator: ?:
The conditional operator is like the one in C/C++. It evaluates one of two
expressions based on a condition. It will synthesize to a multiplexer (MUX).

10. Operator Precedence


Table shows the precedence of operators from highest to lowest. Operators on
the same level evaluate from left to right. It is strongly recommended to use
parentheses to define order of precedence and improve the readability of our
code.
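
Several of the operator classes described in this section can be tried out with a small simulation-only module. The module name and the operand values are assumptions chosen for illustration; the comments give the value each expression evaluates to for these operands.

```verilog
// Simulation-only sketch exercising a few operator classes (illustrative names).
module operators_demo;
  reg [3:0] a, b;

  initial begin
    a = 4'b1010;
    b = 4'b0011;
    $display("bit-wise and  : %b", a & b);             // 0010
    $display("logical and   : %b", a && b);            // 1 (both operands non-zero)
    $display("reduction xor : %b", ^a);                // 0 (even number of 1s)
    $display("shift right   : %b", a >> 1);            // 0101 (zero filled)
    $display("concatenation : %b", {a, b});            // 10100011
    $display("replication   : %b", {2{b}});            // 00110011
    $display("conditional   : %b", (a > b) ? a : b);   // 1010
  end
endmodule
```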


5.4. Continuous assignments


5.5. Sequential and parallel statement groups


5.6. Timing control (level and edge sensitive) and delays


5.7. Tasks and functions


5.8. Control statements


5.9. Blocking & Non blocking assignments


Blocking (the = operator)
With blocking assignments each statement in the same time frame is
executed in sequential order within their blocks. If there is a time delay in
one line then the next statement will not be executed until this delay is over.
    integer a, b, c, d;

    initial begin           // example 1
        a = 4; b = 3;       // both executed, in order, at time 0
        #10 c = 18;         // executed at time 10
        #5  d = 7;          // executed at time 15
    end
Above, at time=0 both a and b will have 4 and 3 assigned to them respectively
and at time=10, c will equal 18 and at time=15, d will equal 7.
Non-Blocking (the <= operator)
Non-blocking assignments tackle the procedure of assigning values to
variables in a totally different way. Instead of executing each statement as
it is found, the right-hand sides of all non-blocking statements
are read and stored in temporary memory locations. When they have all been
read, the left-hand side variables are updated. They are non-blocking
because they allow the execution of other events to occur in the block even if
there are time delays set.
    integer a, b, c, d;
    initial begin           // example 2
        a = 67;
        #10;
        a <= 4;
        c <= #15 a;
        d <= #10 9;
        b <= 3;
    end
This example sets a=67 then waits for a count of 10. Then the right-hand
variables are read and stored in temporary memory locations. Here this is
a=67. Then the left-hand variables are set. At time=10 a and b will be set to 4
and 3. Then at time=20 d=9. Finally at time=25, c=a which was 67, therefore
c=67.
Note that d is set before c. This is because the four statements for setting a-d
are performed at the same time. Variable d is not waiting for variable c to
complete its task. This is similar to a Parallel Block.
This example has used both blocking and non-blocking statements. The
blocking statement could be non-blocking, but this method saves on simulator
memory and will not have as large a performance drain.
Application of Non-Blocking Assignments
We have already seen that non-blocking assignments can be used to enable
variables to be set anywhere in time without worrying what the previous
statements are going to do.
Another important use of the non-blocking assignment is to prevent race
conditions. If the programmer wishes two variables to swap their values and
blocking operators are used, the output is not what is expected:
initial begin
x = 5;
y = 3;
end
// example 3
always @(negedge clock) begin
x = y;
y = x;
end

This gives both x and y the same value. If the circuit were built as
described, it would contain a race condition, which is unstable. The
simulator gives a stable output, but it is not the expected one: it assigns x
the value 3 and then assigns y the value of x. Because x is now 3, y does not
change its value. If the non-blocking operator is used instead:

always @(negedge clock) begin
x <= y; // example 4
y <= x;
end
both the values of x and y are stored first. Then x is assigned the old value of
y (3) and y is then assigned the old value of x (5).
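The swap can be tried out with a short test bench. The sketch below is an assumption-laden illustration (the module name swap_tb and the single hand-driven falling edge are not from the notes); it applies one negative clock edge and prints the result.

```verilog
// Hypothetical test bench: swapping two variables with non-blocking
// assignments on a single falling clock edge.
module swap_tb;
  reg clock;
  integer x, y;

  initial begin
    clock = 1;
    x = 5;
    y = 3;
    #5 clock = 0;                        // the only falling edge
    #1 $display("x=%0d y=%0d", x, y);    // prints x=3 y=5
  end

  always @(negedge clock) begin
    x <= y;   // both right-hand sides are read first...
    y <= x;   // ...then both left-hand sides are updated
  end
endmodule
```

Replacing the two <= operators with = in this test bench reproduces the unwanted result described earlier, where both variables end up equal to 3.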
Another case where the non-blocking operator should be used is when a
variable in a clocked block is set to a new value which involves its old value.
i <= i+1; // example 5
or
register[5:0] <= {register[4:0] , new_bit}; // example 6
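As a rough sketch of how these two update patterns look inside a complete clocked module, consider the following. All names here (shift_count, reset, count, shift_reg) are assumptions made for illustration, and a synchronous reset is added so that the registers start from a known value.

```verilog
// Hypothetical module: a counter and a 6-bit shift register, both updated
// from their own old values with non-blocking assignments.
module shift_count (clk, reset, new_bit, count, shift_reg);
  input        clk, reset, new_bit;
  output [3:0] count;
  output [5:0] shift_reg;
  reg    [3:0] count;
  reg    [5:0] shift_reg;

  always @(posedge clk) begin
    if (reset) begin
      count     <= 0;
      shift_reg <= 0;
    end else begin
      count     <= count + 1;                  // old count + 1
      shift_reg <= {shift_reg[4:0], new_bit};  // shift left, insert new_bit
    end
  end
endmodule
```

Because the right-hand sides are sampled before any update, every register sees the values from before the clock edge, which matches the behaviour of real flip-flops.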
5.10. If-else and case statements


5.11. For-while-repeat and forever loops


5.12. Rise, fall, min, max delays


5.13. Behavioral and synthesizable coding styles for modeling combinational logic
Combinational circuits can be modeled in Verilog using assign statements and
always blocks. Writing simple combinational circuits with assign statements
is very straightforward, as in the examples below.
Tri-state buffer

module tri_buf (a,b,enable);


input a;
output b;
input enable;
wire a,enable;
wire b;
assign b = (enable) ? a : 1'bz;
endmodule
Mux

module mux_21 (a,b,sel,y);


input a, b;
output y;
input sel;
wire y;
assign y = (sel) ? b : a;
endmodule
Simple Concatenation

module bus_con (a,b,y);
input [3:0] a, b;
output [7:0] y;
wire [7:0] y;
assign y = {a,b};
endmodule
1 bit adder with carry
module addbit (
a , // first input
b , // Second input
ci , // Carry input
sum , // sum output
co // carry output
);
//Input declaration
input a;
input b;
input ci;
//Output declaration
output sum;
output co;
//Port Data types
wire a;
wire b;
wire ci;
wire sum;
wire co;
//Code starts here
assign {co,sum} = a + b + ci;
endmodule // End of Module addbit
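The adder works because the left-hand side {co,sum} is two bits wide, so the sum a + b + ci is evaluated with enough width to keep the carry. The sketch below, with a hypothetical test bench name and loop variable, applies all eight input combinations to the same assign statement and prints the result.

```verilog
// Hypothetical test bench: exhaustively exercises the 1-bit full adder
// expression used in addbit.
module addbit_tb;
  reg  a, b, ci;
  wire sum, co;
  integer i;

  // Same continuous assignment as in addbit: the 2-bit target keeps the carry.
  assign {co, sum} = a + b + ci;

  initial begin
    for (i = 0; i < 8; i = i + 1) begin
      {a, b, ci} = i;   // low three bits of i drive the inputs
      #1 $display("a=%b b=%b ci=%b -> co=%b sum=%b", a, b, ci, co, sum);
    end
  end
endmodule
```

For example, with a=1, b=1, ci=1 the printed line shows co=1 and sum=1, since 1 + 1 + 1 = 3 = 2'b11.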
Multiply by 2
module multiply (a,product);
input [3:0] a;
output [4:0] product;
wire [4:0] product;
assign product = a << 1;
endmodule
3-to-8 decoder
module decoder (in,out);
input [2:0] in;
output [7:0] out;
wire [7:0] out;
assign out = (in == 3'b000 ) ? 8'b0000_0001 :
             (in == 3'b001 ) ? 8'b0000_0010 :
             (in == 3'b010 ) ? 8'b0000_0100 :
             (in == 3'b011 ) ? 8'b0000_1000 :
             (in == 3'b100 ) ? 8'b0001_0000 :
             (in == 3'b101 ) ? 8'b0010_0000 :
             (in == 3'b110 ) ? 8'b0100_0000 :
             (in == 3'b111 ) ? 8'b1000_0000 : 8'h00;
endmodule
Combinational Circuit Modeling using always
When modeling with always statements, there is a chance of getting a latch
after synthesis if care is not taken. (Latches are generally unpopular in
designs, even though they are faster and take fewer transistors, because
timing analysis tools have problems with latches; a glitch at the enable pin
of a latch is another problem.)
One simple way to avoid the latch is to drive 0 onto the LHS variable at the
beginning of the always block, as shown in the code below.
3-to-8 decoder using always
module decoder_always (in,out);
input [2:0] in;
output [7:0] out;
reg [7:0] out;
always @ (in)
begin
out = 0; // default value: out is driven on every path, so no latch
case (in)
3'b000 : out = 8'b0000_0001;
3'b001 : out = 8'b0000_0010;
3'b010 : out = 8'b0000_0100;
3'b011 : out = 8'b0000_1000;
3'b100 : out = 8'b0001_0000;
3'b101 : out = 8'b0010_0000;
3'b110 : out = 8'b0100_0000;
3'b111 : out = 8'b1000_0000;
endcase
end
endmodule
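The latch problem is easiest to see side by side. The two modules below are a sketch with hypothetical names: the first leaves its output unassigned on one path, which a synthesis tool would typically turn into a latch, and the second uses the default-assignment technique described above.

```verilog
// Incomplete assignment: when sel is 0, y is not assigned and must keep
// its old value, so a latch is normally inferred.
module latch_inferred (a, sel, y);
  input  a, sel;
  output y;
  reg    y;
  always @ (a or sel)
  begin
    if (sel) y = a;
  end
endmodule

// Default assignment first: y is driven on every path through the block,
// so the result is a purely combinational 2-to-1 mux.
module mux_no_latch (a, b, sel, y);
  input  a, b, sel;
  output y;
  reg    y;
  always @ (a or b or sel)
  begin
    y = a;            // default value
    if (sel) y = b;   // override when selected
  end
endmodule
```

The same effect can be obtained with a complete if-else or a case statement with a default branch; the default assignment is simply the least error-prone habit.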

5.14. Behavioral & synthesizable coding styles for modeling sequential logic
Sequential logic circuits are modeled using edge-sensitive events in the
sensitivity list of always blocks. Sequential logic can be modeled only using
always blocks. Normally we use nonblocking assignments for sequential
circuits.


Simple Flip-Flop
module flip_flop (clk,reset, q, d);
input clk, reset, d;
output q;
reg q;
always @ (posedge clk )
begin
if (reset == 1) begin
q <= 0;
end else begin
q <= d;
end
end
endmodule
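A flip-flop like this can be exercised with a small clocked test bench. The sketch below is self-contained, so it repeats an equivalent flip-flop under the assumed name dff_sync_reset; the test bench name, clock period and stimulus times are likewise assumptions chosen for illustration.

```verilog
// D flip-flop with synchronous reset (same behaviour as the example above).
module dff_sync_reset (clk, reset, d, q);
  input  clk, reset, d;
  output q;
  reg    q;
  always @ (posedge clk)
    if (reset) q <= 0;
    else       q <= d;
endmodule

// Hypothetical test bench: reset is held through the first rising edge,
// then d is captured on the following edges.
module dff_tb;
  reg  clk, reset, d;
  wire q;

  dff_sync_reset dut (.clk(clk), .reset(reset), .d(d), .q(q));

  always #5 clk = ~clk;   // rising edges at t = 5, 15, 25, ...

  initial begin
    clk = 0; reset = 1; d = 1;
    #12 reset = 0;        // released between edges: q clears at t=5
    #10 d = 0;            // q captures 1 at t=15, then 0 at t=25
    #10 $finish;
  end

  initial $monitor("t=%0t clk=%b reset=%b d=%b q=%b", $time, clk, reset, d, q);
endmodule
```

Because the reset is synchronous, q stays unknown (x) until the first rising clock edge at t=5, even though reset is already high at time 0.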
5.15. Parameters and Defines for design reuse


5.16. Verilog and logic synthesis.


Logic synthesis is the process of converting a high-level description of a
design into an optimized gate-level representation. Logic synthesis uses a
standard cell library, which contains simple cells, such as basic logic gates
like and, or, and nor, and macro cells, such as adders, muxes, memory, and
flip-flops. The standard cells put together are called the technology library.
Normally the technology library is known by the transistor size (0.18u, 90nm).
A circuit description is written in a Hardware Description Language (HDL) such
as Verilog. The designer should first understand the architectural description
and then consider design constraints such as timing, area, testability,
and power.


Life before HDL (Logic synthesis)


As we may have experienced in college, everything (all the digital circuits)
was designed manually: draw K-maps, optimize the logic, draw the schematic.
This is how engineers used to design digital logic circuits in the early days.
This works fine as long as the design is a few hundred gates.
Impact of HDL and Logic synthesis.
High-level design is less prone to human error because designs are described at
a higher level of abstraction. High-level design is done without significant
concern about design constraints. Conversion from high-level design to gates is
done by synthesis tools, using various algorithms to optimize the design as a
whole. This removes the problem with varied designer styles for the different
blocks in the design and suboptimal designs. Logic synthesis tools allow
technology independent design. Design reuse is possible for technology-
independent descriptions.
Verilog Logic Synthesis
When it comes to Verilog, the synthesis flow is the same as for other HDLs.
This section describes how particular code gets translated to gates. For
example, there is no way to synthesize delays, although delay can be added to
particular signals by adding buffers; but then the design becomes too
dependent on the synthesis target technology.
First we will look at the constructs that are not supported by synthesis
tools; they are listed in the table below.


Constructs Not Supported in Synthesis


Construct Type        Notes
initial               Used only in test benches.
events                Events make more sense for syncing test bench components.
real                  Real data type not supported.
time                  Time data type not supported.
force and release     Force and release of data types not supported.
assign and deassign   assign and deassign of reg data types is not supported,
                      but assign on wire data type is supported.
fork join             Use nonblocking assignments to get the same effect.
primitives            Only gate-level primitives are supported.
table                 UDP and tables are not supported.
Example of Non-Synthesizable Verilog Constructs
Any code that contains the above constructs is not synthesizable, but even
within synthesizable constructs, bad coding can cause synthesis issues.
Another common case is code in which one reg variable is driven from more
than one always block; this will almost certainly cause a synthesis error.
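The multiple-driver case can be sketched as follows. The module and signal names are assumptions made for illustration; the first module shows the problematic style, and the second shows one possible way to restructure it so that a single always block owns the register.

```verilog
// Problematic: q is assigned in two different always blocks, so it has
// two drivers. Most synthesis tools are expected to reject this.
module multi_driver_bad (clk, set, d, q);
  input  clk, set, d;
  output q;
  reg    q;
  always @ (posedge clk) q <= d;
  always @ (posedge set) q <= 1;   // second driver of q
endmodule

// One possible fix: a single always block with both events in its
// sensitivity list (a flip-flop with asynchronous set).
module single_driver (clk, set, d, q);
  input  clk, set, d;
  output q;
  reg    q;
  always @ (posedge clk or posedge set)
    if (set) q <= 1;
    else     q <= d;
endmodule
```

The rule of thumb is that each reg should be assigned in exactly one always block.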
Example - Initial Statement
module synthesis_initial( clk,q,d);
input clk,d;
output q;
reg q;
initial begin
q <= 0;
end
always @ (posedge clk)
begin
q <= d;
end
endmodule
Delays
A synthesis tool normally ignores delay constructs. For a statement such as
a = #10 b; it simply assumes that the #10 is not there and treats the code
as a = b;
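In simulation, by contrast, the delay is honoured. The following simulation-only sketch (the module name delay_demo_tb and the signal values are assumptions) shows an intra-assignment delay of 10 time units taking effect.

```verilog
// Hypothetical test bench: the delay is real in simulation, but a
// synthesis tool would typically treat the statement as a = b;
module delay_demo_tb;
  reg a, b;

  initial begin
    b = 1;
    a = #10 b;                          // b is read now, a is written at t=10
    $display("t=%0t a=%b", $time, a);   // prints t=10 a=1
  end
endmodule
```

This mismatch is one reason why delays should not be relied on for the functional behaviour of a design that will be synthesized.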
Comparison to X and Z are always ignored
module synthesis_compare_xz (a,b);
output a;
input b;
reg a;
always @ (b)
begin
if ((b == 1'bz) || (b == 1'bx)) begin
a = 1;
end else begin
a = 0;
end
end
endmodule
There seems to be a common problem with all the design engineers new to
hardware. They normally tend to compare variables with X and Z. In practice
it is the worst thing to do, so please avoid comparing with X and Z. Limit your
design to two states, 0 and 1. Use tri-state only at chip IO pads level. We will
see this as an example in the next few pages.
Constructs Supported in Synthesis
Verilog is a simple language; you can easily write code that is easy to
understand and easy to map to gates. Code that uses if and case statements is
simple and causes few headaches with synthesis tools. If you like fancier
coding, don't be scared: you can use high-level constructs after you get some
experience with Verilog. It's great fun to use high-level constructs, and it
saves time.
The most common way to model any logic is to use either assign statements or
always blocks. An assign statement can be used for modeling only
combinational logic and always can be used for modeling both combinational
and sequential logic.
Construct Type          Keyword or Description       Notes
ports                   input, inout, output         Use inout only at IO level.
parameters              parameter                    This makes design more generic.
module definition       module
signals and variables   wire, reg, tri               Vectors are allowed.
instantiation           module instances /           E.g. nand (out,a,b); bad idea
                        primitive gate instances     to code RTL this way.
function and tasks      function, task               Timing constructs ignored.
procedural              always, if, else, case,      initial is not supported.
                        casex, casez
procedural blocks       begin, end, named blocks,    Disabling of named blocks
                        disable                      allowed.
data flow               assign                       Delay information is ignored.
named blocks            disable                      Disabling of named block
                                                     supported.
loops                   for, while, forever          While and forever loops must
                                                     contain @(posedge clk) or
                                                     @(negedge clk).
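As an illustration of the loops entry, a for loop with fixed bounds can describe combinational logic. The sketch below computes the parity of a byte; the module name parity8 and its ports are assumptions, and how well loops are supported may vary between synthesis tools.

```verilog
// Hypothetical module: even-parity bit of an 8-bit input, written with a
// for loop whose bounds are constant.
module parity8 (data, parity);
  input  [7:0] data;
  output       parity;
  reg          parity;
  integer      i;

  always @ (data)
  begin
    parity = 0;                      // default value, avoids a latch
    for (i = 0; i < 8; i = i + 1)
      parity = parity ^ data[i];     // XOR in one bit per iteration
  end
endmodule
```

The loop is unrolled into a chain of XOR gates; the same function can be written more compactly with the reduction operator as assign parity = ^data;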


Operators and their Effect.


One common problem that seems to occur is getting confused with logical and
reduction operators. So watch out.
Operator Type    Operator Symbol    Operation Performed
Arithmetic       *                  Multiply
                 /                  Division
                 +                  Add
                 -                  Subtract
                 %                  Modulus
                 +                  Unary plus
                 -                  Unary minus
Logical          !                  Logical negation
                 &&                 Logical AND
                 ||                 Logical OR
Relational       >                  Greater than
                 <                  Less than
                 >=                 Greater than or equal
                 <=                 Less than or equal
Equality         ==                 Equality
                 !=                 Inequality
Reduction        &                  Reduction AND
                 ~&                 Reduction NAND
                 |                  Reduction OR
                 ~|                 Reduction NOR
                 ^                  Reduction XOR
                 ^~ ~^              Reduction XNOR
Shift            >>                 Right shift
                 <<                 Left shift
Concatenation    {}                 Concatenation
Conditional      ?:                 Conditional
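The difference between logical, reduction and bitwise use of similar symbols is easiest to see with concrete values. The sketch below uses a hypothetical test bench name and two arbitrary 4-bit values; the binary forms of & and the ~ operator are bitwise operators, which share their symbols with the reduction operators in the table.

```verilog
// Hypothetical test bench: the same operands give different results with
// logical, reduction and bitwise operators.
module operator_demo_tb;
  reg [3:0] a, b;

  initial begin
    a = 4'b1010;
    b = 4'b0101;
    $display("a & b  = %b", a & b);    // bitwise AND, 4-bit result: 0000
    $display("a && b = %b", a && b);   // logical AND, both non-zero: 1
    $display("&a     = %b", &a);       // reduction AND of a's bits: 0
    $display("|a     = %b", |a);       // reduction OR of a's bits: 1
    $display("^a     = %b", ^a);       // reduction XOR, two 1 bits: 0
    $display("!a     = %b", !a);       // logical negation, a non-zero: 0
    $display("~a     = %b", ~a);       // bitwise negation: 0101
  end
endmodule
```

Note in particular that a & b is all zeros while a && b is 1, which is exactly the kind of confusion the text warns about.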
