
Dual Threshold Voltage Domino Logic

James Kao
Massachusetts Institute of Technology
jkao@mtl.mit.edu

Abstract

Dual threshold voltage (DVT) domino logic utilizes dual Vts to provide the performance equivalent of a purely low Vt design with the standby leakage characteristic of a purely high Vt implementation. DVT domino logic is an attractive circuit style compared to other dual Vt techniques because there are no performance penalties, no difficult transistor sizing issues, and all gates (not just non-critical ones) can be compensated. In fact, any low Vt domino design can easily be converted to a dual Vt implementation with superior characteristics.

1. Background

As circuit technology scales, maintaining performance while reducing power dissipation becomes a primary concern. It is well known that the best way to reduce power dissipation is through a reduction in supply voltage, but to maintain performance at these operating points, threshold voltages must scale as well. As a result, leakage currents will grow exponentially.

One promising circuit technique to address this problem is to use dual Vt devices, where high Vt devices are used to help reduce leakage currents, while low Vt devices are used to maintain performance. In the current literature, two main dual Vt circuit styles are common. MTCMOS, or multi-threshold CMOS, involves using high Vt sleep transistors to gate the power supplies for a low Vt block[1]. Leakage currents will thus be reduced during sleep modes, but the circuit will require large areas for the sleep transistors, and active performance will be affected. Furthermore, optimal sizing of the sleep transistors is complex for larger circuits, and will be affected by the discharge and data patterns encountered[2,3].

The second family of dual Vt circuits are those where individual gates are partitioned to be either high Vt or low Vt depending on their timing requirements. For example, gates in the critical path would be chosen to have low Vt, while non critical gates would have high Vts, with correspondingly lower leakage currents[4]. This technique is only effective up to a certain point (diminishes with more critical paths in the circuit), and determining which paths can be made high Vt is a complex CAD problem[5].

2. Dual Vt Domino Logic

In this paper, a dual Vt circuit style based on domino logic will be introduced. In this style, individual gates will utilize both high Vt and low Vt transistors, and the overall circuit will exhibit extremely low leakage in the sleep mode, yet suffer no reduction in performance. The conversion from an existing low Vt domino design to a dual Vt one is extremely straightforward, requiring simply a modification of some devices to be high Vt. No resizing is necessary, and timing relations remain unchanged.

Since domino gates are skewed such that critical transitions only occur in one direction, the DVT domino methodology selectively assigns a priori low threshold voltages to every device in the critical charge/discharge paths. In effect, devices that can switch during the evaluate mode should be low Vt devices, while those devices that switch during precharge modes should be high Vt devices. By placing the circuit in the proper state during standby mode, it becomes easy to ensure that all high Vt transistors are turned off, thereby reducing standby leakage currents. The following figure shows a typical dual Vt domino stage, consisting of a pull down network, inverter (I1), leaker device (P1), and clock drivers (I2, I3), with the low Vt devices shaded.

[Figure 1: schematic of a domino gate with clock drivers I2 and I3, leaker device P1, inverter I1, clock nodes Clkn and Clkn+1, and data nodes Dn and Dn+1]
Figure 1. Dual Vt Domino Logic Gate

Figure 2 below shows how this domino gate can be used in a typical pipe stage in a two phase domino clocking
methodology. The pipe stage shows a logic depth of 8 gate delays, consisting of 4 domino gates and 4 inverters.

[Figure 2: (A) a pipeline stage (Ph 1) with gates D1 to D4, clocks C1 to C4, and output Out; (B) waveforms of C1 to C4 and the Phase 2 clock, marking where evaluate and precharge occur and labelling the precharge edges as slow]
Figure 2A. Ph1 Pipeline Stage with 4 levels
Figure 2B. Clocking methodology

2.1. Evaluate Mode

Before the domino gate enters the evaluate stage, the internal node is precharged high, while Dn, Dn+1, Clkn, and Clkn+1 are all low. When Clkn goes from low to high and data arrives on Dn, the domino gate will quickly evaluate through the low Vt NMOS devices in the logic network and the low Vt PMOS of I1. Likewise, the rising Clkn signal will also pass through I2 (fast pull down) and I3 (fast pull up) to supply the clocking signal to the next level of domino logic. The delays through I2 and I3 are matched to the delay through the logic and inverting stages such that the next data arrival is timed with the next evaluate clock. Finally, to maintain a high internal node voltage during evaluation, the P1 transistor needs to supply enough current to satisfy the leakage from the low Vt NMOS block. The main benefit of this dual Vt domino approach, however, is that during the evaluate phase, all transitions in the domino gate pass through low Vt devices.

2.2. Precharge Mode

During precharge, the behavior of the circuit is the exact opposite, where the active charging and static discharging paths must pass through high Vt devices. By balancing the clock drivers I2, I3 with the precharge time and I1 delay, the data zeroing and clock precharge signal for the next stage will be closely aligned to avoid contention. This is a well known method to avoid using an NMOS evaluate switch in the internal logic blocks (the first logic gate requires an evaluate switch, however).

Because high Vt devices perform the precharge functions, the precharge time is longer than for the case where all low Vt devices are used, as seen in Figure 2B. However, since the precharge time is not in the critical path, more time is available for the transition. Specifically, consider the case of a two phase, 50% duty cycle clock with 4 logic gates per phase. If Tg is the delay per gate (domino + inverter), then the available precharge delay per gate is 7/4 * Tg. In the case where the evaluate clocks are not 50% duty cycle, even more time can be allotted for precharge. In general, if the total evaluation clock is N*Tg and the precharge clock is M*Tg (so the total clock period is (N+M)*Tg), then the time available per gate for precharge is (M+N-1)*Tg/N.

In the traditional domino style where all domino gates are clocked with the same clock (and NMOS evaluate switches are used to prevent contention), the entire precharge clock can be utilized for each gate[6].

2.3. Standby Mode

When a dual Vt domino logic stage is placed in standby mode, the domino clock needs to be high (evaluate) in order to shut off the high Vt devices. For example, the precharge PMOS device, the P1 leaker device, the I2 PMOS, the I3 NMOS, and the I1 NMOS all need to be turned off strongly in order to reduce leakage currents during the idle state. Furthermore, to keep the internal nodes at a hard 0, the initial inputs into the domino pipeline stage must be set high. For example, in Figure 2, if the D1 inputs are all at a logic 1, then all subsequent gates in the pipeline will also evaluate. Since the low Vt devices are on and the high Vt devices are off, the net effect is that overall leakage power is significantly reduced.

To effectively reduce leakage, both phases of the domino pipeline need to be put in sleep mode, so both clock phases should be gated to 1 during standby operation. As long as inputs at the beginning of the pipeline (i.e. from a control register) are fixed at 1, the entire pipeline will be evaluated and remain in a low leakage state. In order to ensure that data is not lost, the control circuitry must finish computing any instruction in the pipeline before placing the pipeline in standby mode.

3. Simulation Results

To verify the functionality and benefit of dual Vt domino logic, simulations were performed on a representative pipeline stage modelled as an inverter chain with 4 domino NOR gates and 4 accompanying inverters (see Figures 1 and 2). The NOR gate has 8 inputs, each driving a fanout of 3 load. These wide gates are a good representative of domino circuits, because domino technology is most effective for gates with wide, rather than deep, pull down networks. The experimental circuit has the exact same structure as shown in Figures 1
and 2A. Simulations were performed on three circuit variants with the exact same transistor sizings: an all low Vt design (LVT), an all high Vt design (HVT), and a dual Vt design (DVT). As predicted, the LVT delay is significantly shorter than the HVT one. However, the DVT has a fast evaluate time on par with the all LVT design, but has a slow precharge delay on par with the all HVT design.

The performance benefit of low Vt domino is most apparent at lower voltages, where Vt is on par with VCC. Figure 3 shows a comparison of LVT, HVT, and DVT delays as a function of the operating voltage, shown in the graph as a percent deviation from the nominal VCC operating point. Clearly, the trend shows how LVT and DVT benefits are most effective at low voltages. For example, at -40% deviation (low VCC), the reduction in delay over a HVT implementation is 44.5%, while it is only 24.1% at +20% deviation (high VCC). Interestingly, the DVT circuit delay is actually faster than the all LVT device in all cases, and this can be attributed to the fact that during switching, the pull down network has less leakage contention from the off PMOS device in the DVT case.

[Figure 3: plot titled "Delay (D rise to Out rise)"; normalized delay (0 to 5) versus VCC % deviation from nominal (-60% to 40%) for HVT, LVT, and DVT]
Figure 3. Eval delay through pipeline stage

Figure 4, on the other hand, shows a plot of precharge delay as a function of operating voltage. Precharge delay was measured as the delay between the falling Clk line at the input of the block to the falling edge (precharged state) of the final block output.

[Figure 4: plot titled "Precharge (C1 Fall -> Out Fall)"; normalized delay (0 to 5) versus VCC % deviation from nominal (-60% to 40%) for HVT, LVT, and DVT]
Figure 4. Precharge delay for pipeline stage

As can be seen in the figure, the LVT implementation has a fast precharge delay, while the HVT and DVT circuits have virtually identical but much larger delay times. Again, since the precharge delay is not in the critical path, this will not affect the overall circuit speed.

Simulations were also performed to verify the leakage benefits in the DVT design. Two scenarios are explored: one where the circuit is stalled in the precharge mode with the data input zeroed, and a second where the circuit is stalled in the evaluate mode with all data inputs activated. As described earlier, the proper standby mode is the latter case.

[Figure 5: plot titled "Leakage During Precharge Hold"; normalized leakage current (log scale, 1 to 1000) versus VCC % deviation from nominal (-60% to 40%) for HVT, LVT, and DVT]
Figure 5. Leakage Current for CLK = 1

[Figure 6: plot titled "Leakage During Evaluate Hold"; normalized leakage current (log scale, 1 to 1000) versus VCC % deviation from nominal (-60% to 40%) for HVT, LVT, and DVT]
Figure 6. Leakage Current for CLK = 0

Figures 5 and 6 illustrate two different components of leakage reduction seen in the DVT standby mode case. First of all, by holding the circuit in the evaluate mode rather than the precharge mode, the leakage will be reduced because the leakage path in each gate is through a single off PMOS, rather than 8 off NMOS transistors in parallel. Thus leakage currents are reduced slightly in all three cases. For the DVT case, the greatest benefit of holding the circuit in the evaluate mode is that the leakage path will be through a high Vt PMOS device. As can be seen in Figure 6, the DVT implementation leakage is comparable to the leakage of the HVT implementation, both of which are an order of magnitude less than the LVT case. Another interesting phenomenon shown in the figures is that low Vt device leakage increases more rapidly with supply voltage than high Vt device leakage. This scaling trend is due to
worse short channel effects on the lower Vt devices, making their leakage more susceptible to supply voltage scaling.

4. Dual Vt Issues

Dynamic logic's superior performance over static CMOS can be directly attributed to its lower noise margin. The trade off between higher performance and lower noise margin is fundamental in VLSI circuits. In domino circuits, the noise margin is directly related to the threshold voltage of the NMOS pull down tree, so there is definitely a limit to how low Vts can scale. Furthermore, active leakage in large fan in gates, if large enough, can affect functionality when a domino gate tries to hold an internal node high. A large keeper device helps, but this will directly affect performance, and active leakage power dissipation still remains a problem.

However, research has shown that domino gates can be made to function at low voltages and low Vts. With careful attention to noise, the use of keeper devices, and improved device characteristics, domino logic is still likely to be used in future technologies[7]. As long as low Vt and low VCC dynamic logic can be made to work, it will be beneficial to use the dual Vt domino methodology described in this paper. Although it has little effect on active leakage power, dual Vt domino significantly reduces standby leakage, which can play an important role in many applications where waiting times are long. Furthermore, switching to standby mode using this methodology has low overhead because one only needs to gate the clocks and then assert the initial inputs into the pipeline. As a result, this power down mode can also be effective at fine grain control, such as for inactive modules within a chip like a multiplier or divider.

5. Conclusions

Dual Vt circuit technology is becoming a popular solution for reducing leakage currents while maintaining high performance in leading edge VLSI circuits. The use of high Vt sleep transistors (MTCMOS) to gate power supplies, and the partitioning of circuits into high Vt non-critical components and low Vt critical paths, are actively being researched in academia and industry. This paper presents an alternative application of dual Vt technology to domino logic. This methodology avoids the sizing difficulties and inherent performance penalty of MTCMOS, and can also be applied to critical paths not correctable in the partitioning technique. Dual Vt domino allows one to maintain the performance of an all LVT design, yet still maintain the low subthreshold leakage current of an all HVT design during the standby mode. Although noise margin considerations ultimately limit dynamic logic applicability in future technologies, there exists a window of opportunity where low Vt domino circuits can benefit from the use of dual Vt.

6. References

[1] S. Mutoh, T. Douseki, Y. Matsuya, T. Aoki, S. Shigematsu, J. Yamada, "1-V Power Supply High-Speed Digital Circuit Technology with Multithreshold-Voltage CMOS," IEEE JSSC, vol. 30, no. 8, pp. 847-854, August 1995.
[2] J. Kao, A. Chandrakasan, D. Antoniadis, "Transistor Sizing Issues and Tool For Multi-Threshold CMOS Technology," 34th Design Automation Conference, pp. 409-414, June 1997.
[3] J. Kao, S. Narendra, A. Chandrakasan, "MTCMOS Hierarchical Sizing Based on Mutual Exclusive Discharge Patterns," 35th Design Automation Conference, pp. 495-500, June 1998.
[4] W. Lee, et al., "A 1V DSP for Wireless Communications," ISSCC, pp. 92-93, Feb. 1997.
[5] L. Wei, Z. Chen, M. Johnson, K. Roy, "Design and Optimization of Low Voltage High Performance Dual Threshold CMOS Circuits," 35th Design Automation Conference, pp. 489-494, June 1998.
[6] N. Weste, K. Eshraghian, "Principles of CMOS VLSI Design," Addison-Wesley, Reading, MA, pp. 301-302, 1993.
[7] Kerry Bernstein, IBM Microelectronics, Essex Junction, VT, personal communication.

7. Acknowledgements

The author would like to acknowledge the feedback and help from Anantha Chandrakasan and Siva Narendra from the Microsystems Technology Lab at MIT. This work was partially funded under SRC Task #633.001.
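
The precharge budget quoted in the Precharge Mode discussion, (M+N-1)*Tg/N per gate, can be checked with a few lines of Python. This is only a sketch of that one formula: the function name and argument names are hypothetical, and the result is expressed in units of Tg rather than in seconds.

```python
from fractions import Fraction


def precharge_time_per_gate(n_eval: int, m_pre: int) -> Fraction:
    """Precharge time available per gate, in units of the gate delay Tg.

    n_eval: length of the evaluation clock in gate delays (N in the text)
    m_pre:  length of the precharge clock in gate delays (M in the text)
    """
    if n_eval < 1 or m_pre < 0:
        raise ValueError("need n_eval >= 1 and m_pre >= 0")
    # Formula from the text: (M + N - 1) * Tg / N
    return Fraction(m_pre + n_eval - 1, n_eval)


if __name__ == "__main__":
    # Two phase, 50% duty cycle, 4 gates per phase: N = M = 4
    print(precharge_time_per_gate(4, 4), "Tg")
```

With N = M = 4 the function returns 7/4, which agrees with the 7/4 * Tg figure given for the 50% duty cycle case; a longer precharge clock (larger M) increases the budget, as the text states.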
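
The standby argument (hold the clock high and the inputs high so that every off device is a high Vt device) can also be illustrated with a small logic-level model of one stage. This is a sketch under stated assumptions, not the simulation setup used in the paper: it models a single data input, assumes no evaluate foot switch, and assumes that the gate of P1 is driven by the I1 output, which is a conventional keeper connection that the surviving text does not confirm. All class, function, and node names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    kind: str  # "nmos" or "pmos"
    vt: str    # "low" or "high"
    gate: str  # name of the node driving this device's gate


# Vt assignment from the text: devices that switch during evaluate are low Vt,
# devices that switch during precharge are high Vt.
# ASSUMPTION: P1's gate is tied to the I1 output node "out".
STAGE = [
    Device("pull-down NMOS", "nmos", "low", "d"),
    Device("precharge PMOS", "pmos", "high", "clk"),
    Device("P1 keeper", "pmos", "high", "out"),
    Device("I1 PMOS", "pmos", "low", "dyn"),
    Device("I1 NMOS", "nmos", "high", "dyn"),
    Device("I2 PMOS", "pmos", "high", "clk"),
    Device("I2 NMOS", "nmos", "low", "clk"),
    Device("I3 PMOS", "pmos", "low", "clk_b"),
    Device("I3 NMOS", "nmos", "high", "clk_b"),
]


def node_values(clk, d):
    """Steady-state logic levels when the clock and data input are held."""
    if clk == 0 and d == 1:
        raise ValueError("data must be low during precharge to avoid contention")
    dyn = 0 if (clk == 1 and d == 1) else 1  # internal dynamic node
    clk_b = 1 - clk                          # output of clock driver I2
    return {"clk": clk, "d": d, "dyn": dyn, "out": 1 - dyn, "clk_b": clk_b}


def is_on(device, nodes):
    """An NMOS conducts with its gate high, a PMOS with its gate low."""
    level = nodes[device.gate]
    return level == 1 if device.kind == "nmos" else level == 0


def off_devices(clk, d):
    """Names of the devices that are off, grouped by threshold type."""
    nodes = node_values(clk, d)
    result = {"low": [], "high": []}
    for dev in STAGE:
        if not is_on(dev, nodes):
            result[dev.vt].append(dev.name)
    return result


if __name__ == "__main__":
    print("evaluate hold :", off_devices(clk=1, d=1))
    print("precharge hold:", off_devices(clk=0, d=0))
```

In this model the evaluate hold state turns off exactly the five high Vt devices listed in the Standby Mode discussion and leaves no low Vt device off, while the precharge hold state turns off only low Vt devices, which is the leakier condition.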
