SEMINAR REPORT
ON
ASYNCHRONOUS CHIP
Submitted to:
SHEKHAR SHARMA
SESSION: 2012
CERTIFICATE
This is to certify that Mr. Shekhar Kumar Sharma of the Electronics & Communication Branch of
Siddhi Vinayak College Of Science & Hr. Education, ALWAR has completed his seminar
work entitled ASYNCHRONOUS CHIP under my supervision in partial fulfillment of his
Degree of Bachelor of Technology of Rajasthan Technical University, Kota. I am fully
satisfied with the work carried out by him, which has been reported in this seminar, and all
the work done is the bona fide work of the above-named student. I strongly recommend him for the award of the
Degree.
SEMINAR GUIDE :
ACKNOWLEDGEMENT
First and foremost, I would like to thank my respected parents, who always encouraged me and taught
me to think and work innovatively in whatever field of my life.
My sincere thanks go to Mr. Vikas Tiwari (H.O.D., ECE Dept.) for his prodigious guidance,
persuasion, and reformative and prudent suggestions throughout my seminar work. It is his guidance
because of which I was able to know every aspect of the seminar presented here.
Finally, it is indeed a great pleasure and privilege to express my thanks to my colleagues, friends and
family members for their help and suggestions of all kinds.
Date :- -----------
ABSTRACT
Breaking the bounds of the clock on a processor may seem a daunting task to those
brought up through a typical engineering program. Without the clock, how do you organize the
chip and know when you have the correct data or instruction? We may have to take this task on
very soon.
Clock speeds are now in the gigahertz range and there is not much room for speedup
before physical realities start to complicate things. With a gigahertz clock powering a chip, signals
barely have enough time to make it across the chip before the next clock tick. At this point,
speeding up the clock frequency could become disastrous. This is when a chip that is not
constricted by clock speed could become very valuable.
Interestingly, the idea of designing a computer processor without a central controlling clock is
not a new one. In fact, this idea was suggested as early as 1946, but engineers felt that such an
asynchronous design would be too difficult to realize with their current, and by today's
standards clumsy, technology.
Today, we have the advanced manufacturing devices to make chips extremely accurate.
Because of this, it is possible to create prototype processors without a clock. But will these
chips catch on? A major hindrance to the development of clockless chips is the
competitiveness of the computer industry. Presently, it is nearly impossible for companies to
develop and manufacture a clockless chip while keeping the cost reasonable. Until this is
possible, clockless chips will not be a major player in the market.
CONTENTS
Chapter no.  Particulars
1.   INTRODUCTION
     1.1  DEFINITION
     1.2  CLOCK CONCEPT
2.   CLOCKLESS APPROACH
     2.1  CLOCK LIMITATIONS
     2.2  ASYNCHRONOUS VIEW
3.   PROBLEMS WITH SYNCHRONOUS CIRCUITS
     3.1  LOW PERFORMANCE
     3.2  LOW SPEED
4.   ASYNCHRONOUS CIRCUITS
     4.3  STANDARDISATION OF COMPONENTS
5.   COMPUTERS WITHOUT CLOCKS
     5.4  LOCAL OPERATION
     5.5  RENDEZVOUS CIRCUITS
     5.6  ARBITER CIRCUIT
6.   SIMPLICITY IN DESIGN
7.   ADVANTAGES OF THE ASYNCHRONOUS CHIP
8.   APPLICATIONS OF THE ASYNCHRONOUS CHIP
     8.4  HETEROGENEOUS TIMING
9.   A CHALLENGING TIME
10.  FUTURE SCOPE
CONCLUSION
REFERENCES
Chapter-1
INTRODUCTION
1.1 DEFINITION
Every action of the computer takes place in tiny steps, each a billionth of a second long.
A simple transfer of data may take only one step; complex calculations may take many steps. All
operations, however, must begin and end according to the clock's timing signals. The use of a
central clock also creates problems. As speeds have increased, distributing the timing signals has
become more and more difficult. Present-day transistors can process data so quickly that they can
accomplish several steps in the time that it takes a wire to carry a signal from one side of the chip
to the other. Keeping the rhythm identical in all parts of a large chip requires careful design and a
great deal of electrical power. Wouldn't it be nice to have an alternative? The clockless approach,
which uses a technique known as asynchronous logic, differs from conventional computer circuit
design in that the switching on and off of digital circuits is controlled individually by specific
pieces of data rather than by a tyrannical clock that forces all of the millions of circuits on a
chip to march in unison. It overcomes the disadvantages of a clocked circuit such as slow
speed, high power consumption and high electromagnetic noise. For these reasons clockless
technology is considered the technology that will drive the majority of electronic chips in
the coming years.
[Diagram: a global clock distributing timing signals to every component in the system.]
The diagram shows the global clock governing all components in the system that need
timing signals; within a cycle the components may finish in any manner of sequence, and order
does not matter. All components operate exactly once per clock tick and their outputs must be
ready by the next clock tick.
Chapter-2
CLOCKLESS APPROACH
[Diagram: a very fast clock (frequency) driving a set of logic circuits.]
One can create a clock so fast that the logic circuits governed by its timing signals cannot keep
up. These logic circuits are supposed to respond to every tick of the clock, and when they
cannot match its speed they no longer operate correctly with respect to the clock, so inputs and
outputs can go wrong. This creates a hardware problem, since one has to assemble chips to
achieve the speed of the clock, and a much more complicated situation arises.
Computer chips of today are synchronous. They contain a main clock, which controls the
timing of the entire chip. There are problems, however, involved with these clocked designs that
are common today.
One problem is speed. A chip can only work as fast as its slowest component. Therefore,
if one part of the chip is especially slow, the other parts of the chip are forced to sit idle. This
wasted compute time is obviously detrimental to the speed of the chip.
New problems with speeding up a clocked chip are just around the corner. Clock
frequencies are getting so fast that signals can barely cross the chip in one clock cycle. When we
get to the point where the clock cannot drive the entire chip, we'll be forced to come up with a
solution. One possible solution is a second clock, but this will incur overhead and power
consumption, so this is a poor solution. It is also important to note that doubling the frequency of
the clock does not double the chip speed, therefore blindly trying to increase chip speed by
increasing frequency without considering other options is foolish.
The other major problem with a clocked design is power consumption. The clock
consumes more power than any other component of the chip. The most disturbing thing about this
is that the clock serves no direct computational use. A clock does not perform operations on
information; it simply orchestrates the computational parts of the computer.
New problems with power consumption are arising. As the number of transistors on a chip
increases, so does the power used by the clock. Therefore, as we design more complicated chips,
power consumption becomes an even more crucial topic. Mobile electronics are the target for
many chips.
These chips need to be even more conservative with power consumption in order to have
a reasonable battery lifetime. The natural solution to the above problems, as you may have
guessed, is to eliminate the source of these headaches: the clock.
Chapter-3
PROBLEMS WITH
SYNCHRONOUS CIRCUITS
Synchronous circuits are digital circuits in which parts are synchronized by clock signals. In an
ideal synchronous circuit, every change in the logical levels of its storage components is
simultaneous. These transitions follow the level change of a special signal called the clock
signal. Ideally, the input to each storage element has reached its final value before the next clock
occurs, so the behavior of the whole circuit can be predicted exactly. Practically, some delay is
required for each logical operation, resulting in a maximum speed at which each synchronous
system can run. However, there are several problems associated with synchronous
circuits.
One is that the safe clock rate is hard to pin down: a chip may happen to run correctly faster than
rated because of the presence of a higher voltage or bus speed setting, or a lower ambient temperature, than
'normal' or expected.
Since the clock itself is driven by a crystal oscillator, it is associated with electromagnetic
waves. These oscillations produce electromagnetic noise, accompanied by an emission
spectrum. The higher the speed of the clock, the higher the number of oscillations per
second, and the more electromagnetic noise and spectral emission leak out. This is not a good
sign for the design of mobile devices either. Apart from the problems above, the clock in a
synchronous circuit is globally distributed over components which in reality run at different
speeds, so data can be received and transmitted regardless of the sequential order in which they
arrive at the first stage of execution. The design of the clock frequency must be very careful,
since the frequency of the clock is fixed, and a poor match of design can cause problems in the
reusability of resources and in interfacing with mixed-timing devices.
Chapter-4
ASYNCHRONOUS CIRCUITS
Asynchronous circuits are electronic digital circuits whose timing is not governed by a
central clock; instead, their components are standardized and use handshake signals to
communicate with each other. The circuits are not tied together and forced to follow global
clock timing signals; each component is loosely coupled and runs at its average speed.
Asynchrony can be achieved by implementing three vital techniques.
Two of the newest asynchronous startups, Asynchronous Digital Devices and Self-Timed Solutions, are
appearing now, and clockless-chip research has been going on the longest. For a chip to be
successful, all three elements (design tools, manufacturing efficiency and experienced designers)
need to come together. The asynchronous cadre has very promising ideas. There is no way one
can obtain purely asynchronous circuits for the complete design of a system, and this is
one of the major barriers to clockless implementation, but the circuits have been successfully standardized
so that they do not have to operate in synchronous mode. Handshakes were the solution
for doing without synchronization.
A component that needs to communicate with another uses handshake signals
to establish a connection and then to agree on the time at which it is going to
send data; on the other side, the receiving component uses the same kind of handshake to
confirm the connection and waits for that time to receive the data.
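The exchange described above can be sketched in code. The following is a minimal, illustrative model (not taken from the report) of a four-phase, return-to-zero handshake; the function name and the event log are inventions for the example.

```python
# Illustrative sketch of a four-phase (return-to-zero) req/ack handshake,
# modeled as an explicit sequence of signal events on a channel.

def four_phase_transfer(data, channel_log):
    """Transfer one datum using a four-phase handshake; log each signal event."""
    channel_log.append(("data_valid", data))  # sender places data on the bus
    channel_log.append(("req", 1))            # sender raises request
    channel_log.append(("ack", 1))            # receiver latches data, raises acknowledge
    channel_log.append(("req", 0))            # sender sees ack, lowers request
    channel_log.append(("ack", 0))            # receiver lowers ack: channel idle again
    return data

log = []
received = [four_phase_transfer(v, log) for v in (7, 42)]
print(received)   # [7, 42]
print(len(log))   # 10: five signal events per transfer
```

Each transfer is self-timed: the next datum is sent only after the channel has returned to its idle state, with no clock involved.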
In circuits implemented with clockless logic, data do not have to move on every tick
whether needed or not, as in synchronous circuits, where movement is not tied to demand. In
asynchronous circuits data are treated as the central concern: they move only when
required to move, such as during transmission between components. This technique offers
low power consumption and low electromagnetic noise, and of course smooth data streaming.
Chapter-5
COMPUTERS WITHOUT CLOCKS
Asynchronous chips improve computer performance by letting each circuit run as fast as
it can.
The road to asynchrony, however, has been rough terrain. Some of the pioneers of the computer
age, such as the mathematician Alan M. Turing, tried using asynchronous designs to build
machines in the early 1950s. Engineers soon abandoned this approach in favour of synchronous
computers because common timing made the design process so much easier.
Now asynchronous computing is experiencing a renaissance. Researchers at the
University of Manchester in England, the University of Tokyo and the California Institute of
Technology have demonstrated asynchronous microprocessors. Some asynchronous chips are
already in commercial mass production. In the late 1990s Sharp, the Japanese electronics
company, used asynchronous design to build a data-driven media processor, a chip for editing
graphics, video and audio, and Philips Electronics produced an asynchronous microcontroller
for two of its pagers.
Asynchronous parts of otherwise synchronous systems are also beginning to appear; the
Ultra SPARC IIIi processor recently introduced by SUN includes some asynchronous circuits
developed by our group. We believe that asynchronous systems will become ever more popular
as researchers learn how to exploit their benefits and develop methods for simplifying their
design. Asynchronous chipmakers have achieved a good measure of technical success, but
commercial success is still to come. We remain a long way from fulfilling the full promise of
asynchrony.
Complex operations can take more time than average, and simple ones can take less.
Actions can start as soon as the prerequisite actions are done, without waiting for the next tick of
the clock. Thus the system's speed depends on the average action time rather than the slowest
action time.
Coordinating asynchronous actions, however, also takes time and chip area. If the efforts required for
local coordination are small, an asynchronous system may, on average, be faster than a clocked
system. Asynchrony offers the most help to irregular chip designs in which slow actions occur
infrequently.
Asynchronous design may also reduce a chip's power consumption. In the current
generation of large, fast synchronous chips, the circuits that deliver the timing signals take up a
good chunk of the chip's area. In addition, as much as 30% of the electrical power used by the
chip must be devoted to the clock and its distribution system. Moreover, because the clock is
always running, it generates heat whether or not the chip is doing anything useful.
In asynchronous systems, idle parts of the chip consume negligible power. This feature is
particularly valuable for battery-powered equipment, but it can also cut the cost of larger systems
by reducing the need for cooling fans and air-conditioning to prevent them from overheating.
The amount of power saved depends on the machine's pattern of activity. Systems with parts that
act only occasionally benefit more than systems that act continuously. Most computers have
components, such as the floating-point arithmetic unit, that often remain idle for long periods.
Furthermore, asynchronous systems produce less radio interference than synchronous machines do.
Because a clocked system uses a fixed rhythm, it broadcasts a strong radio signal at its
operating frequency and at the harmonics of that frequency. Such signals can interfere with
cellular phones, televisions and aircraft navigation systems that operate at the same frequencies.
Asynchronous systems lack a fixed rhythm, so they spread their radiated energy broadly across
the radio spectrum, emitting less at any one frequency.
Most modern computers are synchronous: all their operations are coordinated by the
timing signals of tiny crystal oscillators within the machines. Now researchers are
designing asynchronous systems that can process data without the need for a governing
clock.
The potential benefits of asynchronous systems include faster speeds, lower power
consumption and less radio interference. As integrated circuits become more complex,
chip designers will need to learn asynchronous techniques.
Yet another benefit of asynchronous design is that it can be used to build bridges between
clocked computers running at different speeds. Many computing clusters, for instance,
link fast PCs with slower machines. These clusters can tackle complex problems by
dividing the computational tasks among the PCs. Such a system is inherently
asynchronous: different parts march to different beats. Moving data controlled by one
clock to the control of another clock requires asynchronous bridges, because data may be
out of sync with the receiving clock.
To describe how asynchronous systems work, we often use the metaphor of the bucket
brigade. A clocked system is like a bucket brigade in which each person must pass and receive
buckets according to the tick tock rhythm of the clock. When the clock ticks, each person pushes
a bucket forward to the next person down the line. When the clock tocks, each person grasps the
bucket pushed forward by the preceding person. The rhythm of this brigade cannot go faster than
the time it takes the slowest person to move the heaviest bucket. Even if most of the buckets are
light, everyone in the line must wait for the clock to tick before passing the next bucket.
Local cooperation rather than the common clock governs an asynchronous bucket
brigade. Each person who holds a bucket can pass it to the next person down the line as soon as
the next person's hands are free. Before each action, one person may have to wait until the other
is ready. When most of the buckets are light, however, they can move down the line very quickly.
Moreover, when there's no water to move, everyone can rest between buckets. A slow person
will still hinder the performance of the entire brigade, but replacing the slowpoke will return the
system to its best speed.
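The timing claim behind the metaphor, that a clocked line runs at the slowest action's pace while an asynchronous line runs at the average pace, can be checked with a small sketch (the numbers are invented for illustration):

```python
# Compare total time for a sequence of actions under clocked and
# asynchronous coordination, as in the bucket-brigade metaphor.

def clocked_time(durations):
    # every action occupies one full clock period, sized for the slowest action
    period = max(durations)
    return period * len(durations)

def asynchronous_time(durations):
    # each action starts as soon as its predecessor finishes
    return sum(durations)

# 95 light buckets and 5 heavy ones: slow actions are rare
durations = [1] * 95 + [10] * 5
print(clocked_time(durations))       # 1000: everyone waits for the slowest
print(asynchronous_time(durations))  # 145: the average action time governs
```

When slow actions are infrequent, the asynchronous total approaches the sum of average times, exactly the situation where the text says asynchrony helps most.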
Bucket brigade
Bucket brigades in computers are called pipelines. A common pipeline executes the
computer's instructions. Such a pipeline has half a dozen or so stages, each of which acts as a
person in a bucket brigade.
For example, a processor executing the instruction ADD A B C must fetch the
instruction from memory, decode the instruction, get the numbers from addresses A and B in
memory, do the addition and store the sum in memory address C.
[Pipeline diagram: processing logic between registers, with request (Req) and acknowledge (Ack) handshake wires and a matched delay on the request path.]
Here a bundled data self-timing scheme is used, where conventional data processing
logic is used along with a separate request (Req) line to indicate data validity. Requests may be
delayed by at least the logic delay to ensure that they still indicate data validity at the receiving
register. An acknowledge signal (ack) provides flow control, so the receiving register can tell
the transmitting register when to begin sending the next data.
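The bundled-data timing rule above, that the request must be delayed by at least the logic delay, can be stated as a simple check (a sketch with invented names and example delays):

```python
# Bundled-data timing rule: the matched delay on the Req wire must be at
# least the processing-logic delay, so that when Req arrives at the
# receiving register the data wires already carry valid values.

def bundled_data_ok(logic_delay_ns, matched_delay_ns):
    """Data are valid at the receiving register iff Req arrives no earlier than data."""
    data_arrival = logic_delay_ns
    req_arrival = matched_delay_ns
    return req_arrival >= data_arrival

print(bundled_data_ok(3.0, 3.5))  # True: the matched delay covers the logic delay
print(bundled_data_ok(3.0, 2.0))  # False: Req would signal validity too early
```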
A clocked pipeline executes these actions in a rhythm independent of the operations
performed or the size of the numbers. In an asynchronous pipeline, though, the duration of
each action may depend on the operation performed, the size of the numbers and the location of
the data in memory (just as in bucket brigade the amount of water in a bucket may determine
how long it takes to pass it on).
Without a clock to govern its actions, an asynchronous system must rely on local
coordination circuits instead. These circuits exchange completion signals to ensure that the
actions at each stage begin only when the circuits have the data they need. The two most
important coordination circuits are called the Rendezvous and the Arbiter circuits.
A Rendezvous element indicates when the last of two or more signals has arrived at a
particular stage. Asynchronous systems use these elements to wait until all the concurrent
actions finish before starting the next action.
One form of Rendezvous circuit is called the Muller C-element, named after David
Muller, now retired from a professorship at the University of Illinois. A Muller C-element is a
logic circuit with two inputs and one output. When both inputs of a Muller C-element are TRUE,
its output becomes TRUE.
When both inputs are FALSE, its output becomes FALSE. Otherwise the output remains
unchanged. For the Muller C-element to act as a Rendezvous circuit, therefore, its inputs must not
change again until its output responds. A chain of Muller C-elements can control the flow of
data down an electronic bucket brigade.
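The behavior just described can be captured in a few lines. This is a behavioral model of the Muller C-element (the class name is an invention for the example): the output follows the inputs when they agree and holds its previous value when they differ.

```python
# Behavioral model of a Muller C-element: two inputs, one output.
# Both TRUE -> output TRUE; both FALSE -> output FALSE; else hold.

class MullerC:
    def __init__(self, initial=False):
        self.out = initial

    def step(self, a, b):
        if a == b:       # inputs agree: output follows them
            self.out = a
        return self.out  # inputs differ: output holds its last value

c = MullerC()
print(c.step(True, False))   # False: inputs differ, holds initial value
print(c.step(True, True))    # True:  both inputs TRUE
print(c.step(True, False))   # True:  holds
print(c.step(False, False))  # False: both inputs FALSE
```

The hold behavior is what lets a chain of C-elements remember whether each pipeline stage is full or empty.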
Rendezvous circuit
It can coordinate the actions of an asynchronous system, allowing data to flow in an
orderly fashion without the need for a central clock. Shown here is an electronic pipeline
controlled by a chain of Muller C-elements, each of which allows data to pass down the line only
when the preceding stage is full, indicating that data are ready to move, and the following
stage is empty.
Each Muller C-element has two input wires and one output wire. The output changes to
FALSE when both inputs are FALSE and back to TRUE when both inputs are TRUE (in the
diagram, TRUE signals are shown in blue and FALSE signals are in red.). The inverter makes
the initial inputs to the Muller C-element differ, setting all stages empty at the start. Let's
assume that the left input is initially TRUE and the right input FALSE (1). A change in signal at
the left input from TRUE to FALSE (2) indicates that the stage to the left is full, that is, some
data have arrived. Because the inputs to the Muller C-element are now the same, its output
changes to FALSE. This change in signal does three things: it moves data down the pipeline by
briefly making the data latch transparent, it sends a FALSE signal back to the preceding C-element
to make the left stage empty, and it sends a FALSE signal ahead to the next Muller C-element
to make the right stage full (3).
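The net effect of this control chain is that a data item advances one stage whenever the stage ahead of it is empty. A highly simplified occupancy-level sketch of that rule (not a circuit-accurate model; `None` marks an empty stage) looks like this:

```python
# Simplified model of a C-element-controlled pipeline: an item moves
# forward only when its stage is full and the next stage is empty,
# so items advance independently while preserving their order.

def settle_step(stages):
    """Advance the pipeline by one settling pass; return True if anything moved."""
    moved = False
    for i in range(len(stages) - 1, 0, -1):   # scan from the output end
        if stages[i] is None and stages[i - 1] is not None:
            stages[i], stages[i - 1] = stages[i - 1], None
            moved = True
    return moved

pipe = ["A", "B", None, None]   # two full stages followed by two empty ones
while settle_step(pipe):
    pass
print(pipe)   # ['A' and 'B' have drifted to the output end, still in order]
```

Without any global clock, the local full/empty rule alone moves both items to the output end while keeping them in first-in, first-out order.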
Research groups recently introduced a new kind of Rendezvous circuit called GasP.
GasP evolved from an earlier family of circuits designed by Charles E. Molnar, at SUN
Microsystems. Molnar dubbed his creation asP*, which stands for asynchronous symmetric
pulse protocol (the asterisk indicates the double P). The G was added to the name because GasP is
what you are supposed to do when you see how fast our new circuits go. It is found that GasP
modules are as fast as and as energy-efficient as Muller C-elements, fit better with ordinary data
latches and offer much greater versatility in complex designs.
An Arbiter circuit performs another kind of local coordination: like a traffic officer at an
intersection, it decides which of two competing requests is served first. For example, when two
processors request access to a shared memory at approximately
the same time, the Arbiter puts the requests into a sequence, granting access to only one
processor at a time. The Arbiter guarantees that there are never two actions under way at once,
just as the traffic officer prevents accidents by ensuring that there are never two cars passing
through the intersection on a collision course.
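The mutual-exclusion behavior of the Arbiter (though not its analog metastability, which cannot be captured at this level) can be sketched as follows; the class and client names are invented for the example:

```python
# Sketch of an Arbiter's externally visible behavior: of two possibly
# simultaneous requests, grant exactly one at a time, queueing the other.

from collections import deque

class Arbiter:
    def __init__(self):
        self.waiting = deque()

    def request(self, client):
        self.waiting.append(client)   # requests are serialized into a sequence

    def grant(self):
        """Grant the shared resource to the next waiting client, one at a time."""
        if self.waiting:
            return self.waiting.popleft()
        return None

arb = Arbiter()
arb.request("CPU-0")
arb.request("CPU-1")   # a near-simultaneous second request
print(arb.grant())     # CPU-0: only one grant is ever outstanding
print(arb.grant())     # CPU-1: the other request is served next
print(arb.grant())     # None: no outstanding requests
```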
Although Arbiter circuits never grant more than one request at a time, there is no way to
build an Arbiter that will always reach a decision within a fixed time limit. Present-day Arbiters
reach decisions very quickly on average, usually within a few hundred picoseconds.
When faced with close calls, however, the circuits may occasionally take twice as long, and in
very rare cases the time needed to make a decision may be 10 times as long as normal.
The fundamental difficulty in making these decisions causes minor dilemmas, which are
familiar in everyday life. For example, two people approaching a doorway at the same time may
pause before deciding who will go through first. They can go through in either order. All that is
needed is a way to break the tie.
An Arbiter breaks ties. Like a flip-flop circuit, an Arbiter has two stable states
corresponding to the two choices. One can think of these states as the Pacific Ocean and The
Gulf of Mexico. Each request to an Arbiter pushes the circuit toward one stable state or the
other, just as a hailstone that falls in the Rocky Mountains can roll downhill toward The Pacific
or the Gulf. Between the two stable states, however, there must be a meta-stable line, which is
equivalent to the Continental Divide. If a hailstone falls precisely on the Divide, it may balance
momentarily on that sharp mountain ridge before tipping toward The Pacific or the Gulf.
Similarly, if two requests arrive at an Arbiter within a few picoseconds of each other, the circuit
may pause in its meta-stable state before reaching one of its stable states to break the tie.
This project proved very useful as a research target; we learned a great deal about
coordination and arbitration and built test chips to prove the reliability of our Arbiter circuits.
Chapter-6
SIMPLICITY IN DESIGN
There is no great complexity in a simple clockless design. The one fundamental
step is to throw the central clock away; standardization of components can then be used
intensively. An integrated pipeline mode plays an important role in the total system design.
There are about four factors regarding the pipeline, and these are:
1. Domino logic
2. Delay-insensitive design
3. Bundled data
4. Dual rail
Domino logic is a CMOS-based evolution of the dynamic logic techniques which were
based on either PMOS or NMOS transistors. It allows a rail-to-rail logic swing. It was developed
to speed up circuits. In a cascade structure consisting of several stages, the evaluation of each
stage ripples the next stage evaluation, similar to a domino falling one after the other. The
structure is hence called Domino CMOS Logic. Important features include:
* They have smaller areas than conventional CMOS logic.
* Parasitic capacitances are smaller, so that higher operating speeds are possible.
* Operation is free of glitches, as each gate can make only one transition.
* Only non-inverting structures are possible because of the presence of the inverting buffer.
* Charge distribution may be a problem until the result of the first computation is completed.
The main advantage of such circuits is their ability to optimize processing of activities
that can take arbitrary periods of time depending on the data or requested function. An example
of a process with a variable time for completion would be mathematical division or recovery of
data where such data might be in a cache. The Delay-Insensitive (DI) class is the most robust of
all asynchronous circuit delay models. It makes no assumptions on the delay of wires or gates. In
this model all transitions on gates or wires must be acknowledged before transitioning again.
This condition stops unseen transitions from occurring. In DI circuits any transition on an input
to a gate must be seen on the output of the gate before a subsequent transition on that input is
allowed to happen. This forces some input states or sequences to become illegal. For example
OR gates must never go into the state where both inputs are one, as the entry and exit from this
state will not be seen on the output of the gate. Although this model is very robust, no practical
circuits are possible due to the heavy restrictions. Instead, the Quasi-Delay-Insensitive model is
the smallest compromise model yet capable of generating useful computing circuits. For this
reason circuits are often incorrectly referred to as Delay-Insensitive when they are Quasi-Delay-Insensitive.
Data-dependent delays
* All carry bits need to be computed.
The figure shows a first circuit that is not asynchronous, and then a second, dual-rail circuit with
every bit taken into the computation.
Chapter-7
ADVANTAGES OF THE ASYNCHRONOUS CHIP
A clocked chip can run no faster than its most slothful piece of logic; the answer isn't
guaranteed until every part completes its work. By contrast, the transistors on an asynchronous
chip can swap information independently, without needing to wait for everything else. The
result? Instead of the entire chip running at the speed of its slowest components, it can run at the
average speed of all components. At both Intel and Sun, this approach has led to prototype chips
that run two to three times faster than comparable products using conventional circuitry.
Clockless chips draw power only when there is useful work to do, enabling a huge
savings in battery-driven devices; an asynchronous-chip-based pager marketed by Philips
Electronics, for example, runs almost twice as long as competitors' products, which use
conventional clocked chips.
Asynchronous chips use 10 percent to 50 percent less energy than synchronous chips, in
which the clocks are constantly drawing power. That makes them ideal for mobile
communications applications - which usually need low power sources - and the chips' quiet
nature also makes them more secure, as typical hacking techniques involve listening to clock
ticks.
Another advantage of clockless chips is that they give off very low levels of
electromagnetic noise. The faster the clock, the more difficult it is to prevent a device from
interfering with other devices; dispensing with the clock all but eliminates this problem. The
combination of low noise and low power consumption makes asynchronous chips a natural
choice for mobile devices. "The low-hanging fruit for clockless chips will be in communications
devices," starting with cell phones.
Asynchronous logic would offer better security than conventional chips. "The clock is
like a big signal that says, 'Okay, look now,'" says Fant. "It's like looking for someone in a
marching band. Asynchronous is more like a milling crowd. There's no clear signal to watch.
Potential hackers don't know where to begin."
Analyzing the power consumption for each clock tick can crack the encryption on
existing smart cards. This allows details of the chip's inner workings to be deduced. Such an
attack would be far more difficult on a smartcard based on asynchronous logic.
They can perform encryption in a way that is harder to identify and to crack. Improved
encryption makes asynchronous circuits an obvious choice for smart cards, the chip-endowed
plastic cards beginning to be used for such security-sensitive applications as storage of medical
records, electronic funds exchange and personal identification.
Ivan Sutherland of Sun Microsystems, who is regarded as the guru of the field, believes
that such chips will have twice the power of conventional designs, which will make them ideal
for use in high-performance computers. But Dr Furber suggests that the most promising
application for asynchronous chips may be in mobile wireless devices and smart cards.
Different styles
There are several styles of asynchronous design. Conventional chips represent the zeroes
and ones of binary digits (bits) using low and high voltages on a particular wire.
One clockless approach, called dual rail, uses two wires for each bit. Sudden voltage
changes on one of the wires represent a zero, and on the other wire a one.
"Dual-rail" circuits use two wires giving the chip communications pathways, not only to
send bits, but also to send "handshake" signals to indicate when work has been completed.
Fant replaces the conventional system of digital logic with what he calls "null convention logic," a
scheme that identifies not only "yes" and "no" but also "no answer yet", a convenient way for
clockless chips to recognize when an operation has not yet been completed.
Another approach is called bundled data. Low and high voltages on 32 wires are used
to represent 32 bits, and a change in voltage on a 33rd wire indicates when the values on the
other 32 wires are to be used.
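The two encodings can be contrasted in a short sketch (function names invented for the example). In dual rail, the all-zero pair can act as the "no answer yet" spacer, so completion is visible on the data wires themselves; bundled data instead spends one extra wire on the request signal.

```python
# Dual rail: two wires per bit; (0, 0) is the "no answer yet" spacer state.
# Bundled data: one wire per bit plus a single extra request wire.

def dual_rail_encode(bits):
    # bit 0 -> (1, 0) on (zero_wire, one_wire); bit 1 -> (0, 1)
    return [(0, 1) if b else (1, 0) for b in bits]

def dual_rail_complete(wires):
    # the word is complete when every pair has left the (0, 0) spacer state
    return all(pair != (0, 0) for pair in wires)

word = [1, 0, 1, 1]
encoded = dual_rail_encode(word)
print(dual_rail_complete(encoded))                 # True: every bit has arrived
print(dual_rail_complete([(0, 0)] + encoded[1:]))  # False: one bit not ready yet

# Bundled data for the same word: 4 data wires plus 1 request wire,
# versus 8 wires for the dual-rail version.
bundle = {"data": word, "req": 1}
print(len(bundle["data"]) + 1)                     # 5
```

The design trade-off is visible in the wire counts: dual rail doubles the wiring but carries its own completion information, while bundled data is cheaper in wires but relies on the matched-delay timing assumption.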
Chapter-8
APPLICATIONS OF THE ASYNCHRONOUS CHIP
1. High performance.
2. Low power dissipation.
3. Low noise and low electro-magnetic emission.
4. A good match with heterogeneous system timing.
Data-dependent delays
The delay of the combinational logic circuit shown in Figure 1 depends on the current
state and the value of the primary inputs. The worst-case delay, plus some margin for flip-flop
delays and clock skew, is then a lower bound for the clock period of a synchronous circuit. Thus,
the actual delay is always less (and sometimes much less) than the clock period.
A simple example is an N-bit ripple-carry adder (Figure 2). The worst-case delay occurs
when 1 is added to 2^N - 1: the carry then ripples all the way from FA1 to FAN. In the best case
there is no carry ripple at all, for example when adding 1 to 0. Assuming random inputs, the
average length of the longest carry-propagation chain is bounded by log2(N). For a 32-bit
ripple-carry adder the average length is therefore about 5, yet the clock period must be more
than 6 times longer! On the other hand, the average length determines the average-case delay
of an asynchronous ripple-carry adder, which we consider next. In an asynchronous circuit this
variation in delays can be exploited by detecting the actual completion of the addition. Most
practical solutions use dual-rail encoding of the carry signal (Figure 2(b)): the addition has
completed when all internal carry signals have been computed, that is, when each pair
(cf_i, ct_i) has made a monotonic transition from (0, 0) to (0, 1) (carry = false) or to (1, 0)
(carry = true). Dual-rail encoding of the carry signal has also been applied to a carry-bypass
adder. When the inputs and outputs are dual-rail encoded as well, completion can be observed
from the outputs of the adder.
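The data-dependent delay of the ripple-carry adder can be checked with a short simulation. The sketch below is our own illustration under the usual model that a carry ripples only through consecutive "propagate" positions; it is not from the report.

```python
import random

def ripple_add(a_bits, b_bits):
    """N-bit ripple-carry addition (LSB first). Returns the sum bits and
    the length of the longest carry-propagation chain, which models the
    data-dependent delay an asynchronous adder could exploit."""
    carry = 0
    sums = []
    longest = run = 0
    for a, b in zip(a_bits, b_bits):
        p = a ^ b                      # carry 'propagate' condition
        sums.append(p ^ carry)         # full-adder sum bit
        carry = (a & b) | (p & carry)  # full-adder carry out
        run = run + 1 if p else 0      # a carry can ripple only while p = 1
        longest = max(longest, run)
    return sums, longest

# Average the longest chain over random 32-bit inputs: it stays near
# log2(32) = 5, far below the worst case of 32 that a synchronous clock
# period must allow for.
random.seed(0)
N, trials = 32, 2000
avg = sum(ripple_add([random.randint(0, 1) for _ in range(N)],
                     [random.randint(0, 1) for _ in range(N)])[1]
          for _ in range(trials)) / trials
print(avg)
```

A completion detector built from the dual-rail carries would let the adder signal "done" after exactly this data-dependent delay rather than after the worst case.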
Consider a cascade of N asynchronous divide-by-two elements. The second element runs at
only half the rate of the first and hence dissipates only half the power; the third dissipates
only a quarter, and so on. Over a given period of time, the entire asynchronous cascade
therefore consumes slightly less than twice the power of its head element, independent of N.
That is, fixed power dissipation is obtained.
In contrast, a similar synchronous divider dissipates power in proportion to N. A cascade
of 15 such divide-by-two elements is used in watches to convert a 32 kHz crystal clock down to
a 1 Hz clock. The potential of asynchronous design for low power depends on the application.
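The power arithmetic above is a geometric series, which the following sketch works through. It is our own illustration; the simple model (stage i toggles at 1/2^i of the input rate, a synchronous divider clocks every stage at full rate) follows the description in the text.

```python
# Relative power of a divide-by-two cascade (illustrative sketch).

def async_cascade_power(n):
    """Stage i toggles at 1/2**i of the input rate, so its relative
    power is 1/2**i; the geometric series stays below 2 for any n."""
    return sum(1 / 2 ** i for i in range(n))

def sync_cascade_power(n):
    """Every stage clocked at full rate: power grows linearly with n."""
    return float(n)

for n in (4, 15):   # 15 stages take a 32 kHz crystal down to 1 Hz
    print(n, async_cascade_power(n), sync_cascade_power(n))
```

For the 15-stage watch divider the asynchronous cascade draws less than 2 units of relative power while the synchronous version draws 15, illustrating the fixed-versus-linear contrast.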
For example, in a digital filter where the clock rate equals the data rate, all flip-flops and
all combinational circuits are active during each clock cycle, so little or nothing can be gained
by implementing the filter as an asynchronous circuit. However, in many digital-signal-processing
functions the clock rate exceeds the data (signal) rate by a large factor, sometimes by
several orders of magnitude; the clock frequency is chosen this high to accommodate sequential
algorithms that share resources over successive computation steps. In such circuits only a small
fraction of the registers change state during a clock cycle, and this fraction may be highly data
dependent. Here asynchronous operation offers vastly improved electrical efficiency, which
leads directly to prolonged battery life.
One application for which asynchronous circuits can save power is Reed-Solomon error
correctors operating at audio rates, as demonstrated at Philips Research Laboratories. Two
different asynchronous realizations of this decoder (single-rail and dual-rail) are compared with a
synchronous (product) version.
The single-rail version was clearly superior, consuming one-fifth the power of the
synchronous version.
A second example is the infrared communications receiver IC designed at
Hewlett-Packard/Stanford. The receiver IC draws only leakage current while waiting for
incoming data, yet starts up as soon as a signal arrives, so it loses no data. Moreover, most
modules operate well below the maximum frequency of operation.
The filter bank for a digital hearing aid was the subject of another successful
demonstration, this time by the Technical University of Denmark in cooperation with Oticon Inc.
They re-implemented an existing filter bank as a fully asynchronous circuit; the result is a
factor-of-five reduction in power consumption.
A fourth application is a pager in which several power-hungry sub circuits were
redesigned as asynchronous circuits, as shown later in this issue.
Due to the absence of a clock, asynchronous circuits may have better noise and EMC
(Electro-Magnetic Compatibility) properties than synchronous circuits.
This advantage can be appreciated by analyzing the supply current of a clocked circuit in
both the time and frequency domains.
Circuit activity of a clocked circuit is usually maximal shortly after the productive clock
edge. It gradually fades away and the circuit must become totally quiescent before the next
productive clock edge. Viewed differently, the clock signal modulates the supply current as
depicted schematically in Figure 5(a). Due to parasitic resistance and inductance in the on-chip
and off-chip supply wiring this causes noise on the on-chip power and ground lines.
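The frequency-domain view of this clock-modulated supply current can be made concrete with a small sketch. This is our own illustration, not from the report: the supply current of a clocked chip is modelled crudely as a burst of activity right after every clock edge, and a direct DFT shows the energy concentrating at the clock's harmonics, which is what radiates as electromagnetic interference.

```python
import cmath

def dft_magnitudes(x):
    """Magnitudes of the discrete Fourier transform of a real sequence."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))) for k in range(n)]

# Crude model: a spike of supply current at every clock edge.
n, period = 64, 8
clocked_current = [1.0 if t % period == 0 else 0.0 for t in range(n)]
spectrum = dft_magnitudes(clocked_current)

# The energy sits entirely at multiples of the clock frequency
# (every n/period-th bin): narrow peaks at the clock harmonics.
peaks = [k for k, mag in enumerate(spectrum) if mag > 1e-6]
print(peaks)   # [0, 8, 16, 24, 32, 40, 48, 56]
```

An asynchronous circuit, with no common edge synchronizing its activity, spreads the same energy across the spectrum instead of piling it into these discrete peaks.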
Chapter-9
A CHALLENGING TIME
Although the architectural freedom of asynchronous systems is a great benefit, it also poses a
difficult challenge. Because each part sets its own pace, that pace may vary from time to time
in any one system and may vary from system to system. If several actions are concurrent, they
may finish in a large number of possible sequences. Enumerating all the possible sequences of
actions in a complex asynchronous chip is as difficult as predicting the sequences of actions in
a school yard full of children. This dilemma is called the state explosion problem.
Can chip designers create order out of the potential chaos of concurrent actions?
Fortunately, researchers are developing theories for tackling this problem. Designers need not
worry about all the possible sequences of actions if they can set certain limitations on the
communication behavior of each circuit. To continue the schoolyard metaphor, a teacher can
promote safe play by teaching each child how to avoid danger.
Another difficulty is that we lack mature design tools, accepted testing methods and
widespread education in asynchronous design. A growing research community is making good
progress, but the present total investment in clock-free computing pales in comparison
with the investment in clocked design. Nevertheless, we are confident that the relentless
advances in the speed and complexity of integrated circuits will force designers to learn
asynchronous techniques. We do not know yet whether asynchronous systems will flourish
first within large computer and electronics companies or within start-up companies eager to
develop new ideas. The technological trend, however, is inevitable: in the coming decades,
asynchronous design will become prevalent.
Chapter-10
FUTURE SCOPE
The first place we'll see, and have already seen, clockless designs is in the lab. Many
prototypes will be necessary to create reliable designs. Manufacturing techniques must also be
improved so the chips can be mass-produced.
The second place we'll see these chips is in mobile electronics, an ideal setting for a
clockless chip because of its minimal power consumption. Also, low levels of electromagnetic
noise create less interference, which is critical in designs with many components packed very
tightly together, as is the case with mobile electronics.
The third place is in personal computers (PCs). Clockless designs will arrive here last
because of the competitive PC market.
CONCLUSION
Clocks have served the electronics design industry very well for a long time, but there
are significant difficulties looming for clocked design in the future. These difficulties are most
obvious in complex SOC development, where electrical noise, power and design costs threaten
to render the potential of future process technologies inaccessible to clocked design.
Self-timed design offers an alternative paradigm that addresses these problem areas, but
until now VLSI designers have largely ignored it. Things are beginning to change, however:
self-timed design is poised to emerge as a viable alternative to clocked design. The drawbacks,
namely the lack of design tools and of designers capable of handling self-timed design, are
beginning to be addressed, and a few companies (including a couple of start-ups, Theseus
Logic Inc., and Cogency Technology, Inc.) have made significant commitments to the
technology.
Although full-scale commercial demonstrations of the value of self-timed design are still
few in number, the available examples show that there are no show-stoppers threatening
the ultimate viability of this strategy. Self-timed technology is poised to make an
impact, and there are significant rewards on offer to those brave enough to take the lead in its
exploitation.
REFERENCES