
A

SEMINAR REPORT
ON

ASYNCHRONOUS CHIP

SUBMITTED IN PARTIAL FULFILLMENT


FOR THE AWARD OF THE DEGREE
OF
BACHELOR OF TECHNOLOGY
TO
RAJASTHAN TECHNICAL UNIVERSITY, KOTA
Submitted By: -

Submitted to:-

SHEKHAR SHARMA

Mr. VIKAS TIWARI


Asst. Prof. (E.C.E. Dept.)

ELECTRONICS & COMMUNICATION ENGINEERING


SIDDHI VINAYAK COLLEGE OF SCIENCE & HIGHER EDUCATION, ALWAR

SESSION: 2012

An ISO: 9001:2000 certified College

Siddhi Vinayak College of Science & HR. Education


Approved by A.I.C.T.E., Govt. of Rajasthan & Affiliated to University of Rajasthan
College campus: E-1, B-1, M.I.A., (Ext.) Alwar 301030 Fax: 0144-2881777
Ph: 0144-2881888, 288144, 2882888, 3295489, 9351010431
Visit us at: www.gmcolleges.com

E-mail: siddhivinayak@gmcolleges.com

CERTIFICATE
This is to certify that Mr. Shekhar Kumar Sharma of the Electronics & Communication branch of
Siddhi Vinayak College of Science & Hr. Education, Alwar, has completed his seminar
work entitled ASYNCHRONOUS CHIP under my supervision as a partial fulfillment of his
Degree of Bachelor of Technology of Rajasthan Technical University, Kota. I am fully
satisfied with the work carried out by him, which has been reported in this seminar, and all
the work done is bona fide to the above-named student. I strongly recommend him for the award of the
Degree.

SEMINAR GUIDE :

Mr. Vikas Tiwari


Asst. Prof. (E.C.E. Dept.)

ACKNOWLEDGEMENT

First and foremost, I would like to thank my respected parents, who always encouraged me and taught
me to think and work innovatively, whatever the field of my life.
My sincere thanks go to Mr. Vikas Tiwari (H.O.D., ECE Dept.) for his prodigious guidance and his
persuasive, reformative and prudent suggestions throughout my seminar work. It is his guidance
because of which I was able to understand every aspect of the seminar presented here.
Finally, it is indeed a great pleasure and privilege to express my thanks to my colleagues, my friends and my
family members for their help and suggestions of all kinds.

Date :- -----------

Shekhar Kumar Sharma

ABSTRACT

Breaking the bounds of the clock on a processor may seem a daunting task to those
brought up through a typical engineering program. Without the clock, how do you organize the
chip and know when you have the correct data or instruction? We may have to take this task on
very soon.

Clock speeds are now in the gigahertz range and there is not much room for speedup
before physical realities start to complicate things. With a gigahertz clock powering a chip, signals
barely have enough time to make it across the chip before the next clock tick. At this point,
speeding up the clock frequency could become disastrous. This is when a chip that is not
constricted by clock speed could become very valuable.

Interestingly, the idea of designing a computer processor without a central controlling clock is
not a new one. In fact, this idea was suggested as early as 1946, but engineers felt that such an
asynchronous design would be too difficult to realize with their current, and by today's
standards clumsy, technology.
Today, we have the advanced manufacturing devices to make chips extremely accurate.
Because of this, it is possible to create prototype processors without a clock. But will these
chips catch on? A major hindrance to the development of clockless chips is the
competitiveness of the computer industry. Presently, it is nearly impossible for companies to
develop and manufacture a clockless chip while keeping the cost reasonable. Until this is
possible, clockless chips will not be a major player in the market.

CONTENTS

Chapter no.   Particulars

1.    INTRODUCTION
      1.1   DEFINITION
      1.2   CLOCK CONCEPT
2.    CLOCKLESS APPROACH
      2.1   CLOCK LIMITATIONS
      2.2   ASYNCHRONOUS VIEW
3.    PROBLEMS WITH SYNCHRONOUS CIRCUITS
      3.1   LOW PERFORMANCE
      3.2   LOW SPEED
      3.3   HIGH POWER DISSIPATION
      3.4   HIGH ELECTROMAGNETIC NOISE
4.    ASYNCHRONOUS CIRCUITS
      4.1   CLOCKLESS CHIPS IMPLEMENTATION
      4.2   THROWING AWAY THE GLOBAL CLOCK
      4.3   STANDARDISATION OF COMPONENTS
      4.4   HOW CLOCKLESS CHIPS WORK
5.    COMPUTERS WITHOUT CLOCKS
      5.1   HOW FAST IS YOUR COMPUTER?
      5.2   BEAT THE CLOCK
      5.3   OVERVIEW / CLOCKLESS SYSTEM
      5.4   LOCAL OPERATION
      5.5   RENDEZVOUS CIRCUITS
      5.6   ARBITER CIRCUIT
      5.7   THE NEED FOR SPEED
6.    SIMPLICITY IN DESIGN
      6.1   ASYNCHRONOUS FOR HIGHER PERFORMANCE
      6.2   ASYNCHRONOUS FOR LOW POWER
      6.3   ASYNCHRONOUS FOR LOW NOISE
7.    ADVANTAGES OF ASYNCHRONOUS CHIPS
8.    APPLICATIONS OF ASYNCHRONOUS CHIPS
      8.1   ASYNCHRONOUS FOR HIGH PERFORMANCE
      8.2   ASYNCHRONOUS FOR LOW POWER
      8.3   ASYNCHRONOUS FOR LOW NOISE AND LOW EMISSION
      8.4   HETEROGENEOUS TIMING
9.    A CHALLENGING TIME
10.   FUTURE SCOPE
      CONCLUSION
      REFERENCES

Chapter-1
INTRODUCTION

1.1 DEFINITION
Every action of the computer takes place in tiny steps, each a billionth of a second long.
A simple transfer of data may take only one step; complex calculations may take many steps. All
operations, however, must begin and end according to the clock's timing signals. The use of a
central clock also creates problems. As speeds have increased, distributing the timing signals has
become more and more difficult. Present-day transistors can process data so quickly that they can
accomplish several steps in the time that it takes a wire to carry a signal from one side of the chip
to the other. Keeping the rhythm identical in all parts of a large chip requires careful design and a
great deal of electrical power. Wouldn't it be nice to have an alternative? The clockless approach,
which uses a technique known as asynchronous logic, differs from conventional computer circuit
design in that the switching on and off of digital circuits is controlled individually by specific
pieces of data rather than by a tyrannical clock that forces all of the millions of circuits on a
chip to march in unison. It overcomes the disadvantages of a clocked circuit, such as slow
speed, high power consumption and high electromagnetic noise. For these reasons, clockless
technology is considered the technology that will drive the majority of electronic chips in
the coming years.

1.2 CLOCK CONCEPT


The clock is a tiny crystal oscillator that resides in the heart of every microprocessor chip.
The clock is what sets the basic rhythm used throughout the machine. The clock
orchestrates the synchronous dance of electrons that course through the hundreds of millions of
wires and transistors of a modern computer. Such crystals, which tick up to 2 billion times each
second in the fastest of today's desktop personal computers, dictate the timing of every circuit in
every one of the chips that add, subtract, divide, multiply and move the ones and zeros that are
the basic stuff of the information age. Conventional (synchronous) chips operate under the
control of a central clock, which samples data in the registers at precisely timed intervals.
Computer chips of today are synchronous: they contain a main clock which controls the timing
of the entire chip. One advantage of a clock is that it signals to the devices on the chip
when to input or output. A circuit that uses a global clock can allow data to flow in any
manner of sequence, and the order does not matter.

The diagram above shows the global clock governing all components in the system that need
timing signals. All components operate exactly once per clock tick, and their outputs need to be
ready by the next clock tick.
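The once-per-tick discipline described above can be sketched in a few lines of Python. This is an illustrative model only (the component names and delays are hypothetical, not taken from any real chip): every component fires exactly once per tick, and the clock period must be long enough for the slowest of them.

```python
# Minimal sketch of synchronous operation: on every clock tick, each
# component consumes its input and produces an output exactly once.
# Stage names and delays (in nanoseconds) are hypothetical.
component_delays_ns = {"fetch": 0.8, "decode": 0.5, "alu": 1.0, "writeback": 0.4}

# The clock period is dictated by the slowest component (timing margin ignored).
clock_period_ns = max(component_delays_ns.values())

def run_synchronous(n_ticks):
    """Advance the whole system tick by tick; every stage fires once per tick."""
    elapsed = 0.0
    for _ in range(n_ticks):
        # All stages work in parallel within the tick, but none may start
        # its next step until the tick ends -- even the fast ones wait.
        elapsed += clock_period_ns
    return elapsed

print(run_synchronous(4))  # 4 ticks * 1.0 ns = 4.0
```

Note that even the 0.4 ns stage is charged a full 1.0 ns per step; that gap is exactly what the clockless approach tries to recover.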

Fig 1.1: Power Dissipation in Synchronous (Left) & Asynchronous (Right)

Chapter-2
CLOCKLESS APPROACH

2.1 CLOCK LIMITATIONS


There are problems that go along with the clock, however. Clock speeds are now in the
gigahertz range and there is not much room for speedup before physical realities start to
complicate things. With a gigahertz clock powering a chip, signals barely have enough time to
make it across the chip before the next clock tick. At this point, speeding up the clock frequency
could become disastrous. This is when a chip that is not constricted by clock speeds could
become very valuable.

One can create a clock that is so fast that the logic circuits governed by its timing signals
cannot keep up. These logic circuits are supposed to respond to every tick of the clock, yet when
they cannot match its speed they will not be optimal for the speed of the clock, and the inputs
and outputs can go wrong. This results in a hardware problem, since one has to assemble chips
to achieve the speed of the clock, and a much more complicated situation arises.

2.2 ASYNCHRONOUS VIEW


By throwing out the clock, chip makers will be able to escape from huge power
dissipation. Clockless chips draw power only when there is useful work to do, enabling a huge
savings in battery-driven devices. Like a team of horses that can only run as fast as its slowest
member, a clocked chip can run no faster than its most slothful piece of logic; the answer isn't
guaranteed until every part completes its work. By contrast, the transistors on an asynchronous
chip can swap information independently, without needing to wait for everything else. The
result? Instead of the entire chip running at the speed of its slowest components, it can run at the
average speed of all components. At both Intel and Sun, this approach has led to prototype chips
that run two to three times faster than comparable products using conventional circuitry.

Another advantage of clockless chips is that they give off very low levels of
electromagnetic noise. The faster the clock, the more difficult it is to prevent a device from

interfering with other devices; dispensing with the clock all but eliminates this problem. The
combination of low noise and low power consumption makes asynchronous chips a natural
choice for mobile devices.
Computer chips of today are synchronous. They contain a main clock, which controls the
timing of the entire chips. There are problems, however, involved with these clocked designs that
are common today.

One problem is speed. A chip can only work as fast as its slowest component. Therefore,
if one part of the chip is especially slow, the other parts of the chip are forced to sit idle. This
wasted computed time is obviously detrimental to the speed of the chip.

New problems with speeding up a clocked chip are just around the corner. Clock
frequencies are getting so fast that signals can barely cross the chip in one clock cycle. When we
get to the point where the clock cannot drive the entire chip, we'll be forced to come up with a
solution. One possible solution is a second clock, but this would incur overhead and power
consumption, so it is a poor solution. It is also important to note that doubling the frequency of
the clock does not double the chip speed; therefore, blindly trying to increase chip speed by
increasing frequency without considering other options is foolish.

The other major problem with a clocked design is power consumption. The clock
consumes more power than any other component of the chip. The most disturbing thing about this
is that the clock serves no direct computational use. A clock does not perform operations on
information; it simply orchestrates the computational parts of the computer.

New problems with power consumption are arising. As the number of transistors on a chip
increases, so does the power used by the clock. Therefore, as we design more complicated chips,
power consumption becomes an even more crucial topic. Mobile electronics are the target for
many chips.

These chips need to be even more conservative with power consumption in order to have
a reasonable battery lifetime. The natural solution to the above problems, as you may have
guessed, is to eliminate the source of these headaches: the clock.

Fig 2.1: ASYNCHRONOUS VIEW

Chapter-3
PROBLEMS WITH SYNCHRONOUS CIRCUITS

Synchronous circuits are digital circuits in which parts are synchronized by clock signals. In an
ideal synchronous circuit, every change in the logical levels of its storage components is
simultaneous. These transitions follow the level change of a special signal called the clock
signal. Ideally, the input to each storage element has reached its final value before the next clock
occurs, so the behavior of the whole circuit can be predicted exactly. Practically, some delay is
required for each logical operation, resulting in a maximum speed at which each synchronous
system can run. However there are several problems that are associated with synchronous
circuits:

3.1 LOW PERFORMANCE


In a synchronous system, all the components are tied together and the system works at
its worst-case execution speed. Execution will be no faster than the slowest circuit in the
system, and this determines the final working performance of the system. Although there are
faster circuits with sophisticated performance, since they depend on other, slower components
for the input and output of data, they can no longer run faster than the slowest components.
Hence the performance of a synchronous system is limited to its worst-case performance.
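The worst-case limitation can be made concrete with a little arithmetic, sketched here in Python with hypothetical stage delays: a clocked system steps at the pace of its slowest stage, while an idealized asynchronous system approaches the average stage delay.

```python
# Stage delays in nanoseconds for a hypothetical four-stage pipeline.
delays = [0.4, 0.5, 0.6, 1.2]

# Synchronous: every step takes as long as the slowest stage allows.
sync_step = max(delays)                  # 1.2 ns per step

# Asynchronous (idealised): each stage takes only as long as it needs,
# so long-run throughput tracks the average stage delay.
async_step = sum(delays) / len(delays)   # 0.675 ns per step

print(sync_step, async_step)
```

With these numbers the asynchronous estimate is 0.675 ns per step against 1.2 ns for the clocked version: nearly a two-fold difference caused by a single slow stage.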

3.2 LOW SPEED


A traditional CPU cannot "go faster" than the expected worst-case performance of the
slowest stage/instruction/component. When an asynchronous CPU completes an operation more
quickly than anticipated, the next stage can immediately begin processing the results, rather than
waiting for synchronization with a central clock. An operation might finish faster than normal
because of attributes of the data being processed (e.g., multiplication can be very fast when
multiplying by 0 or 1, even when running code produced by a brain-dead compiler), or because

of the presence of a higher voltage or bus speed setting, or a lower ambient temperature, than
'normal' or expected.

3.3 HIGH POWER DISSIPATION


As we know, the clock is a tiny crystal oscillator that keeps vibrating the whole time the
system is powered on. This leads to high power dissipation by synchronous circuits, since they
use a central clock for their timing. The clock itself consumes about 30 percent of the total
power supplied to the circuit, and this can sometimes reach values as high as 70 percent.
Even if the synchronous system is not active at a given moment, its clock will still be oscillating
and consuming power that is dissipated as heat. This makes a synchronous system a heavy power
consumer and hence unsuitable for use in the design of mobile and battery-driven devices.

3.4 HIGH ELECTROMAGNETIC NOISE

Since the clock itself is a crystal oscillator, it is associated with electromagnetic waves.
These waves produce electromagnetic noise due to the oscillations, and the noise is also
accompanied by emission spectra. The higher the speed of the clock, the higher the number of
oscillations per second, and this leaks a high level of electromagnetic noise and spectral
emission. This is not a good sign for the design of mobile devices either. Apart from the
problems above, the clock in a synchronous circuit is globally distributed over components
which obviously run at different speeds, and the order of arrival of the timing signal is not
important: data can be received and transmitted in any order, regardless of the sequence in
which they arrive at the first stage of execution. The design of the clock frequency must be
very careful, since the frequency of the clock is fixed, and a poor match in the design can
cause problems in the reusability of resources and in interfacing with mixed-timing devices.

Chapter-4
ASYNCHRONOUS CIRCUITS

Asynchronous circuits are electronic digital circuits whose timing is not governed by a central
clock; instead they are standardized in their installation and use handshake signals to
communicate with one another. In this case the circuits are not tied together and forced to
follow global clock timing signals; each component is loosely coupled, and they run at an
average speed.
Asynchrony can be achieved by implementing three vital techniques:

4.1 CLOCKLESS CHIPS IMPLEMENTATION


In order to achieve asynchrony as the final goal, one must implement the electronic circuits
without a central clock and hence free the system from components tied to the clock. One
technique is to use clockless chips in the circuit design: since these chips do not work from a
central clock, they guarantee that the different components are freed from being tied up
together. Components can then run at their own performance and speed, and asynchrony is
established.

4.2 THROWING AWAY THE GLOBAL CLOCK


There is no way to implement asynchrony in circuits if a global clock is managing the
timing signals of the whole system. Since the clock is installed only to enable the
synchronization of components, by throwing away the global clock it becomes possible for
components to be completely unsynchronized, communicating only through a handshaking
mechanism.
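The handshaking mechanism mentioned above can be illustrated with a toy four-phase (return-to-zero) request/acknowledge exchange. This Python sketch is purely illustrative, with hypothetical class and signal names; real asynchronous hardware implements the same sequence with voltage transitions on dedicated wires.

```python
# Sketch of a four-phase (return-to-zero) handshake between two components.
# Sender raises req with data valid; receiver latches the data and raises ack;
# sender drops req; receiver drops ack -- only then may the next transfer begin.

class Receiver:
    def __init__(self):
        self.latched = []   # data words accepted so far
        self.ack = False    # state of the acknowledge wire

    def on_req(self, req, data):
        if req and not self.ack:      # new request: latch data, raise ack
            self.latched.append(data)
            self.ack = True
        elif not req and self.ack:    # request withdrawn: drop ack
            self.ack = False
        return self.ack

def send(receiver, data):
    ack = receiver.on_req(True, data)   # phases 1-2: req up, ack comes up
    assert ack
    ack = receiver.on_req(False, None)  # phases 3-4: req down, ack goes down
    assert not ack

rx = Receiver()
for word in [7, 42]:
    send(rx, word)
print(rx.latched)  # [7, 42]
```

Because each transfer completes only when both sides have seen both edges, no global clock is needed to keep sender and receiver in step.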

4.3 STANDARDISATION OF COMPONENTS


In a synchronous system all the components are bound together so as to be managed by a
central clock. Synchrony can be broken if these components are not bound together, and hence
standardizing the components is one of the alternatives. Here all the components are made
standard within a given range of working performance and speed. There is an average speed at
which the design of the system is dedicated to operate, and worst-case execution is
avoided.

4.4 HOW CLOCKLESS CHIPS WORK


Beyond a new generation of design-and-testing equipment, successful development of
clockless chips requires the understanding of asynchronous design. Such talent is scarce, as
asynchronous principles fly in the face of the way almost every university teaches its engineering
students. Conventional chips can have values arrive at a register incorrectly and out of sequence;
but in a clockless chip, the values that arrive in registers must be correct the first time. One way
to achieve this goal is to pay close attention to such details as the lengths of the wires and the
number of logic gates connected to a given register, thereby assuring that signals travel to the
register in the proper logical sequence. But that means being far more meticulous about the
physical design than synchronous designers have been trained to be.

An alternative is to open up a separate communication channel on the chip. Clocked
chips represent ones and zeroes using low and high voltages on a single wire. "Dual-rail"
circuits, on the other hand, use two wires, giving the chip communication pathways not only to
send bits but also to send "handshake" signals to indicate when work has been completed. Fant
additionally proposes replacing the conventional system of digital logic with what is known as
"null convention logic," a scheme that represents not only "yes" and "no" but also "no answer
yet", a convenient way for clockless chips to recognize when an operation has not yet been
completed. All of these ideas and approaches are different enough that executing them could
confound the mind of an engineer trained to design to the beat of a clock. It is no surprise that
the two newest asynchronous startups, Asynchronous Digital Devices and Self-Timed Solutions,
are only now staffing up, even though clockless-chip research has been going on the longest. For
a chip to be successful, all three elements (design tools, manufacturing efficiency and
experienced designers) need to come together. The asynchronous cadre has very promising
ideas. There is no way one can obtain purely asynchronous circuits for the complete design of a
system, and this is one of the major barriers to clockless implementation; but the circuits were
successfully standardized, and hence they do not have to be in synchronous mode. Handshakes
were thus the solution to overcoming synchronization.

A component which needs to communicate with another uses handshake signals to
establish the connection and then sets up the time at which it is going to send data; on the
other side, the other component uses the same kind of handshake to confirm the connection
and waits for that time to receive the data.

In circuits implemented with clockless chips, data do not have to move at random and out
of order, as in synchronous circuits where the movement of data is not so essential. In
asynchronous circuits, data are treated as the most important aspect and hence do not move at
arbitrary times; they move only when required to, such as during transmission between
components. This technique offers low power consumption and low electromagnetic noise, and
of course the data stream is smooth.

Chapter-5
COMPUTERS WITHOUT CLOCKS

Asynchronous chips improve computer performance by letting each circuit run as fast as
it can.

5.1 How fast is your personal computer?


When people ask this question, they are typically referring to the frequency of a
minuscule clock inside the computer, a crystal oscillator that sets the basic rhythm used
throughout the machine. In a computer with a speed of one gigahertz, for example, the crystal
ticks a billion times a second. Every action of the computer takes place in tiny steps; complex
calculations may take many steps. All operations, however, must begin and end according to the
clock's timing signals.
Since most modern computers use a single rhythm, we call them synchronous. Inside the
computer's microprocessor chip, a clock distribution system delivers the timing signals from the
crystal oscillator to the various circuits, just as sound in air delivers the beat of a drum to soldiers
to set their marching pace. Because all parts of the chip share the same rhythm, the output of
any circuit from one step can serve as the input to any other circuit for the next step. The
synchronization provided by the clock helps chip designers plan sequences of actions for the
computer.
The use of a central clock also creates problems. As speeds have increased, distributing
the timing signals has become more and more difficult. Present day transistors can process data
so quickly that they can accomplish several steps in the time that it takes a wire to carry a signal
from one side of the chip to the other.
Keeping the rhythm identical in all parts of a large chip requires careful design and a
great deal of electrical power. Each part of an asynchronous system may extend or shorten the
timing of its steps when necessary, much as a hiker takes long or short steps when walking across
rough terrain. Some of the pioneers of the computer age, such as the mathematician Alan M.
Turing, tried using asynchronous designs to build machines in the early 1950s. Engineers soon
abandoned this approach in favour of synchronous computers because common timing made the
design process so much easier.
Now asynchronous computing is experiencing a renaissance. Researchers at the
University of Manchester in England, the University of Tokyo and the California Institute of
Technology have demonstrated asynchronous microprocessors. Some asynchronous chips are
already in commercial mass production. In the late 1990s Sharp, the Japanese electronics
company, used asynchronous design to build a data-driven media processor, a chip for editing
graphics, video and audio, and Philips Electronics produced an asynchronous microcontroller
for two of its pagers.
Asynchronous parts of otherwise synchronous systems are also beginning to appear; the
UltraSPARC IIIi processor recently introduced by Sun includes some asynchronous circuits
developed by our group. We believe that asynchronous systems will become ever more popular
as researchers learn how to exploit their benefits and develop methods for simplifying their
design. Asynchronous chipmakers have achieved a good measure of technical success, but
commercial success is still to come. We remain a long way from fulfilling the full promise of
asynchrony.

5.2 BEAT THE CLOCK


What are the potential benefits of asynchronous systems?
First, asynchrony may speed up computers. In a synchronous chip, the clock's rhythm
must be slow enough to accommodate the slowest action in the chip's circuits. If it takes a
billionth of a second for one circuit to complete its operation, the chip cannot run faster than one
gigahertz. Even though many other circuits on that chip may be able to complete their operations
in less time, these circuits must wait until the clock ticks again before proceeding to the next
logical step. In contrast, each part of an asynchronous system takes as much or as little time for
each action as it needs.

Complex operations can take more time than average, and simple ones can take less.
Actions can start as soon as the prerequisite actions are done, without waiting for the next tick of
the clock. Thus the system's speed depends on the average action time rather than the slowest
action time.
Coordinating asynchronous actions, however, also takes time and chip area. If the efforts
required for local coordination are small, an asynchronous system may, on average, be faster
than a clocked system. Asynchrony offers the most help to irregular chip designs in which slow
actions occur infrequently.
Asynchronous design may also reduce a chip's power consumption. In the current
generation of large, fast synchronous chips, the circuits that deliver the timing signals take up a
good chunk of the chip's area. In addition, as much as 30% of the electrical power used by the
chip must be devoted to the clock and its distribution system. Moreover, because the clock is
always running, it generates heat whether or not the chip is doing anything useful.
In asynchronous systems, idle parts of the chip consume negligible power. This feature is
particularly valuable for battery-powered equipment, but it can also cut the cost of larger systems
by reducing the need for cooling fans and air-conditioning to prevent them from overheating.
The amount of power saved depends on the machine's pattern of activity. Systems with parts that
act only occasionally benefit more than systems that act continuously. Most computers have
components, such as the floating-point arithmetic unit, that often remain idle for long periods.
Furthermore, asynchronous systems produce less radio interference than synchronous machines
do. Because a clocked system uses a fixed rhythm, it broadcasts a strong radio signal at its
operating frequency and at the harmonics of that frequency. Such signals can interfere with
cellular phones, televisions and aircraft navigation systems that operate at the same frequencies.
Asynchronous systems lack a fixed rhythm, so they spread their radiated energy broadly across
the radio spectrum, emitting less at any one frequency.

5.3 OVERVIEW/ CLOCKLESS SYSTEM

Most modern computers are synchronous: all their operations are coordinated by the
timing signals of tiny crystal oscillators within the machines. Now researchers are
designing asynchronous systems that can process data without the need for a governing
clock.

Asynchronous systems rely on local coordination circuits to ensure an orderly flow of


data. The two most important coordination circuits are called the Rendezvous and the
Arbiter.

The potential benefits of asynchronous systems include faster speeds, lower power
consumption and less radio interference. As integrated circuits become more complex,
chip designers will need to learn asynchronous techniques.

Yet another benefit of asynchronous design is that it can be used to build bridges between
clocked computers running at different speeds. Many computing clusters, for instance,
link fast PCs with slower machines. These clusters can tackle complex problems by
dividing the computational tasks among the PCs. Such a system is inherently
asynchronous: different parts march to different beats. Moving data controlled by one
clock to the control of another clock requires asynchronous bridges, because data may be
out of sync with the receiving clock.

Finally, although asynchronous design can be challenging, it can also be wonderfully
flexible. Because the circuits of an asynchronous system need not share a common rhythm,
designers have more freedom in choosing the system's parts and determining how they interact.
Moreover, replacing any part with a faster version will improve the speed of the entire system.

5.4 LOCAL OPERATION

To describe how asynchronous systems work, we often use the metaphor of the bucket
brigade. A clocked system is like a bucket brigade in which each person must pass and receive
buckets according to the tick-tock rhythm of the clock. When the clock ticks, each person pushes
a bucket forward to the next person down the line. When the clock tocks, each person grasps the
bucket pushed forward by the preceding person. The rhythm of this brigade cannot go faster than
the time it takes the slowest person to move the heaviest bucket. Even if most of the buckets are
light, everyone in the line must wait for the clock to tick before passing the next bucket.
Local cooperation rather than a common clock governs an asynchronous bucket
brigade. Each person who holds a bucket can pass it to the next person down the line as soon as
the next person's hands are free. Before each action, one person may have to wait until the other
is ready. When most of the buckets are light, however, they can move down the line very quickly.
Moreover, when there's no water to move, everyone can rest between buckets. A slow person
will still hinder the performance of the entire brigade, but replacing the slowpoke will return the
system to its best speed.

Bucket brigade
Bucket brigades in computers are called pipelines. A common pipeline executes the
computers instructions. Such a pipeline has half a dozen or so stages, each of which acts as a
person in a bucket brigade.

For example, a processor executing the instruction ADD A B C must fetch the
instruction from memory, decode the instruction, get the numbers from addresses A and B in
memory, do the addition and store the sum in memory address C.

Pipeline diagram: registers separated by processing logic, with request (Req), acknowledge
(Ack) and matched delay lines.

Here a bundled-data self-timing scheme is used, where conventional data-processing
logic is used along with a separate request (Req) line to indicate data validity. Requests may be
delayed by at least the logic delay to ensure that they still indicate data validity at the receiving
register. An acknowledge signal (Ack) provides flow control, so the receiving register can tell
the transmitting register when to begin sending the next data.
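The bundled-data constraint, that the request must arrive no earlier than the moment the data become valid, can be checked with a small timing model. The Python below is an illustrative sketch with made-up delay values, not a description of any real chip.

```python
# Sketch of the bundled-data timing constraint: the request line must be
# delayed by at least the logic delay, so data are valid when Req arrives.
# Delay values are hypothetical, in arbitrary time units.

LOGIC_DELAY = 2.0      # time for the processing logic to settle
MATCHED_DELAY = 2.5    # delay inserted on the Req line (must be >= LOGIC_DELAY)

def transfer(data_ready_at, req_sent_at):
    """Return the time at which the receiving register may safely latch."""
    data_valid_at = data_ready_at + LOGIC_DELAY
    req_arrives_at = req_sent_at + MATCHED_DELAY
    # The bundling constraint: Req must not arrive before the data are valid.
    assert req_arrives_at >= data_valid_at, "bundling constraint violated"
    return req_arrives_at

print(transfer(0.0, 0.0))  # 2.5
```

If the matched delay were shorter than the logic delay, the receiver would latch garbage; choosing the delay element is therefore the critical design step in this scheme.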
A clocked pipeline executes these actions in a rhythm independent of the operations
performed or the size of the numbers. In an asynchronous pipeline, though, the duration of
each action may depend on the operation performed, the size of the numbers and the location of
the data in memory (just as in a bucket brigade the amount of water in a bucket may determine
how long it takes to pass it on).

Without a clock to govern its actions, an asynchronous system must rely on local
coordination circuits instead. These circuits exchange completion signals to ensure that the
actions at each stage begin only when the circuits have the data they need. The two most
important coordination circuits are called the Rendezvous and the Arbiter circuits.

A Rendezvous element indicates when the last of two or more signals has arrived at a
particular stage. Asynchronous systems use these elements to wait until all the concurrent
actions finish before starting the next action.

One form of Rendezvous circuit is called the Muller C-element, named after David
Muller, now retired from a professorship at the University of Illinois. A Muller C-element is a
logic circuit with two inputs and one output. When both inputs of a Muller C-element are TRUE,
its output becomes TRUE.

When both inputs are FALSE, its output becomes FALSE. Otherwise the output remains
unchanged. Therefore, for a Muller C-element to act as a Rendezvous circuit, its inputs must not
change again until its output responds. A chain of Muller C-elements can control the flow of
data down an electronic bucket brigade.
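The behavior just described can be captured in a few lines of code. The following Python sketch (an illustrative behavioral model, not a circuit description) shows the state-holding property of a Muller C-element:

```python
class MullerC:
    """Behavioral model of a two-input Muller C-element: the output
    follows the inputs only when both agree, and holds its previous
    value otherwise."""

    def __init__(self, initial=False):
        self.out = initial

    def update(self, a, b):
        if a == b:           # both TRUE or both FALSE: output follows
            self.out = a
        return self.out      # inputs disagree: output is held

c = MullerC()
assert c.update(True, False) is False   # inputs differ: output holds FALSE
assert c.update(True, True) is True     # rendezvous: both arrived, go TRUE
assert c.update(False, True) is True    # inputs differ again: hold TRUE
assert c.update(False, False) is False  # both back to FALSE: output follows
```

The assertions trace one full cycle: the output moves only after the last of the two signals arrives, which is exactly the Rendezvous behavior the text describes.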

5.5 RENDEZVOUS CIRCUITS

A Rendezvous circuit can coordinate the actions of an asynchronous system, allowing
data to flow in an orderly fashion without the need for a central clock. Shown here is an
electronic pipeline controlled by a chain of Muller C-elements, each of which allows data to
pass down the line only when the preceding stage is full (indicating that data are ready to
move) and the following stage is empty.

Each Muller C-element has two input wires and one output wire. The output changes to
FALSE when both inputs are FALSE and back to TRUE when both inputs are TRUE (in the
diagram, TRUE signals are shown in blue and FALSE signals in red). The inverter makes
the initial inputs to the Muller C-element differ, setting all stages empty at the start. Let's
assume that the left input is initially TRUE and the right input FALSE (1). A change in the
signal at the left input from TRUE to FALSE (2) indicates that the stage to the left is full, that
is, some data have arrived. Because the inputs to the Muller C-element are now the same, its
output changes to FALSE. This change in signals does three things: it moves data down the
pipeline by briefly making the data latch transparent, it sends a FALSE signal back to the
preceding C-element to make the left stage empty, and it sends a FALSE signal ahead to the
next Muller C-element to make the right stage full (3).

The research group at Sun Microsystems recently introduced a new kind of Rendezvous
circuit called GasP. GasP evolved from an earlier family of circuits designed by Charles E.
Molnar at Sun Microsystems. Molnar dubbed his creation asP*, which stands for asynchronous
symmetric pulse protocol (the asterisk indicates the double P). The G is added to the name
because GasP is what you are supposed to do when you see how fast the new circuits go. It is
found that GasP modules are as fast and as energy-efficient as Muller C-elements, fit better
with ordinary data latches and offer much greater versatility in complex designs.

5.6 ARBITER CIRCUIT


An arbiter circuit performs another task essential for asynchronous computers. An
arbiter is like a traffic officer at an intersection who decides which car may pass through next.
Given only one request, an Arbiter promptly permits the corresponding action, delaying any
later request until the first action is completed. When an Arbiter gets two requests at once, it
must decide which request to grant first.

For example, when two processors request access to a shared memory at approximately
the same time, the Arbiter puts the requests into a sequence, granting access to only one
processor at a time. The Arbiter guarantees that there are never two actions under way at once,
just as the traffic officer prevents accidents by ensuring that there are never two cars passing
through the intersection on a collision course.

Although Arbiter circuits never grant more than one request at a time, there is no way to
build an Arbiter that will always reach a decision within a fixed time limit. Present-day Arbiters
reach decisions very quickly on average, usually within a few hundred picoseconds.
When faced with close calls, however, the circuits may occasionally take twice as long, and in
very rare cases the time needed to make a decision may be 10 times as long as normal.

The fundamental difficulty in making these decisions causes minor dilemmas, which are
familiar in everyday life. For example, two people approaching a doorway at the same time may
pause before deciding who will go through first. They can go through in either order. All that
is needed is a way to break the tie.

An Arbiter breaks ties. Like a flip-flop circuit, an Arbiter has two stable states
corresponding to the two choices. One can think of these states as the Pacific Ocean and The
Gulf of Mexico. Each request to an Arbiter pushes the circuit toward one stable state or the
other, just as a hailstone that falls in the Rocky Mountains can roll downhill toward The Pacific
or the Gulf. Between the two stable states, however, there must be a meta-stable line, which is
equivalent to the Continental Divide. If a hailstone falls precisely on the Divide, it may balance
momentarily on that sharp mountain ridge before tipping toward The Pacific or the Gulf.
Similarly, if two requests arrive at an Arbiter within a few picoseconds of each other, the circuit
may pause in its meta-stable state before reaching one of its stable states to break the tie.
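The Arbiter's mutual-exclusion guarantee can be illustrated with a toy software model. This Python sketch uses a lock to stand in for the circuit; it captures only the one-grant-at-a-time property, not the analog meta-stability discussed above:

```python
import threading

class Arbiter:
    """Toy arbiter: serializes requests for a shared resource so that
    at most one grant is ever outstanding."""

    def __init__(self):
        self._mutex = threading.Lock()
        self.grants = []

    def request(self, client, action):
        # Like the traffic officer: whichever request wins the lock is
        # granted first; the other waits until the action completes.
        with self._mutex:
            self.grants.append(client)
            action()

arb = Arbiter()
results = []
threads = [threading.Thread(target=arb.request,
                            args=(i, lambda i=i: results.append(i)))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert sorted(arb.grants) == [0, 1]   # both requests granted, one at a time
```

Note that which client goes first is unspecified, just as with a hardware Arbiter; only the serialization is guaranteed.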

5.7 THE NEED FOR SPEED

The research group at Sun Microsystems concentrates on designing fast asynchronous
systems. We have found that speed often comes from simplicity. Our initial goal was to build a
counterflow pipeline with two opposing data flows, like two parallel bucket brigades moving
in opposite directions. We wanted the data from both flows to interact at each of these stages;
the hard challenge was to ensure that every northbound data element would interact with
every southbound data element. Arbitration turned out to be essential. At each joint between
successive stages, an Arbiter circuit permitted only one element at a time to pass.

This project proved very useful as a research target; we learned a great deal about
coordination and arbitration and built test chips to prove the reliability of our Arbiter circuits.

The experiments at Manchester, Caltech and Philips demonstrate that asynchronous
microprocessors can be compatible with their clocked counterparts. The asynchronous
processors can connect to peripheral machines without special programs or interface circuitry.

Chapter-6
SIMPLICITY IN DESIGN

There is little complexity in a simple design for clockless chips. The one fundamental
change is to throw the central clock away, after which standardized components can be used
intensively. The integrated pipeline model plays an important role in total system design.
There are four factors regarding the pipeline:
1. Domino logic
2. Delay insensitive
3. Bundled data
4. Dual rail

Domino logic is a CMOS-based evolution of the dynamic logic techniques which were
based on either PMOS or NMOS transistors. It allows a rail-to-rail logic swing. It was developed
to speed up circuits. In a cascade structure consisting of several stages, the evaluation of each
stage triggers the next stage's evaluation, similar to dominoes falling one after another. The
structure is hence called Domino CMOS Logic. Important features include:
* They have smaller areas than conventional CMOS logic.
* Parasitic capacitances are smaller, so that higher operating speeds are possible.
* Operation is free of glitches as each gate can make only one transition.
* Only non-inverting structures are possible because of the presence of the inverting buffer.
* Charge distribution may be a problem.

A delay-insensitive circuit is a type of asynchronous circuit which performs a logic
operation, often within a computing processor chip. Instead of using clock signals or other
global control signals, the sequencing of computation in a delay-insensitive circuit is determined by the
data flow. Typically handshake signals are used to indicate the readiness of such a circuit to
accept new data (the previous computation is complete) and the delivery of such data by the
requesting function. Similarly there may be output handshake signals indicating the readiness of
the result and the safe delivery of the result to the next stage in a computational chain or pipeline.
In a delay insensitive circuit, there is therefore no need to provide a clock signal to determine a
starting time for a computation. Instead, the arrival of data to the input of a sub-circuit triggers
the computation to start. Consequently, the next computation can be initiated immediately when
the result of the first computation is completed.
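The request/acknowledge sequencing described above can be sketched as a minimal four-phase handshake. This Python model is purely illustrative; the dictionary fields stand in for the req and ack wires and the data bus:

```python
# A software sketch of one four-phase handshake between two stages.
def send(channel, receiver, data):
    channel["data"] = data
    channel["req"] = True           # phase 1: sender raises request
    receiver(channel)               # phase 2: receiver latches, raises ack
    assert channel["ack"]
    channel["req"] = False          # phase 3: sender lowers request
    channel["ack"] = False          # phase 4: receiver lowers ack (simulated)

received = []
def receiver(channel):
    assert channel["req"]           # data are valid only while req is high
    received.append(channel["data"])
    channel["ack"] = True

channel = {"req": False, "ack": False, "data": None}
for word in (7, 42):
    send(channel, receiver, word)
assert received == [7, 42]
```

Each transfer returns both wires to zero before the next one begins, so the arrival of data, not a clock edge, triggers each computation step.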

The main advantage of such circuits is their ability to optimize processing of activities
that can take arbitrary periods of time depending on the data or requested function. An example
of a process with a variable time for completion would be mathematical division or recovery of
data where such data might be in a cache. The Delay-Insensitive (DI) class is the most robust of
all asynchronous circuit delay models. It makes no assumptions on the delay of wires or gates. In
this model all transitions on gates or wires must be acknowledged before transitioning again.
This condition stops unseen transitions from occurring. In DI circuits any transition on an input
to a gate must be seen on the output of the gate before a subsequent transition on that input is
allowed to happen. This forces some input states or sequences to become illegal. For example
OR gates must never go into the state where both inputs are one, as the entry and exit from this
state will not be seen on the output of the gate. Although this model is very robust, no practical
circuits are possible due to the heavy restrictions. Instead the Quasi-Delay-Insensitive model is
the smallest compromise model yet capable of generating useful computing circuits. For this

reason circuits are often incorrectly referred to as Delay-Insensitive when they are Quasi-Delay-Insensitive.

6.1 ASYNCHRONOUS FOR HIGHER PERFORMANCE


In order to increase the performance of the circuit, the following basics are to be implemented:
* Data-dependent delays.
* All carry bits need to be computed.

The first figure shows a circuit that is not asynchronous; the second shows a dual-rail circuit
with every bit taken into the computation.

6.2 ASYNCHRONOUS FOR LOW POWER


Power consumption is a very important aspect in designing any mobile device, both to
stretch battery capacity and to extend battery life in battery-driven devices. Hence asynchronous
operation is practically inevitable for achieving a low level of dissipated power. The circuit
should consume power only when and where it is active; the rest of the time the circuit returns
to a non-dissipating state until the next activation. The first figure shows how less power is
consumed by dividing the given frequency by two; the next shows that as more circuits are
cascaded, the frequency is divided further. This provides a crucial reduction in power
consumption.

6.3 ASYNCHRONOUS FOR LOW NOISE


Any system with a clock will have oscillations in it and will create electromagnetic
noise; this is the source of the actual noise one hears from conventional computers. For every
clock cycle a spike is emitted, and the emission of random spectra is accompanied by noise.
This problem is reduced to a considerably smaller range by discarding the central clock, as
explained above, and the radiated spectra are much smoother in asynchronous circuits.

Chapter-7
ADVANTAGES OF ASYNCHRONOUS CHIPS

A clocked chip can run no faster than its most slothful piece of logic; the answer isn't
guaranteed until every part completes its work. By contrast, the transistors on an asynchronous
chip can swap information independently, without needing to wait for everything else. The
result? Instead of the entire chip running at the speed of its slowest components, it can run at the
average speed of all components. At both Intel and Sun, this approach has led to prototype chips
that run two to three times faster than comparable products using conventional circuitry.

Clockless chips draw power only when there is useful work to do, enabling a huge
savings in battery-driven devices; an asynchronous-chip-based pager marketed by Philips
Electronics, for example, runs almost twice as long as competitors' products, which use
conventional clocked chips.
Asynchronous chips use 10 percent to 50 percent less energy than synchronous chips, in
which the clocks are constantly drawing power. That makes them ideal for mobile
communications applications, which usually need low-power sources, and the chips' quiet
nature also makes them more secure, as typical hacking techniques involve listening to clock
ticks.

Another advantage of clockless chips is that they give off very low levels of
electromagnetic noise. The faster the clock, the more difficult it is to prevent a device from
interfering with other devices; dispensing with the clock all but eliminates this problem. The
combination of low noise and low power consumption makes asynchronous chips a natural
choice for mobile devices. "The low-hanging fruit for clockless chips will be in communications
devices," starting with cell phones.
Asynchronous logic would offer better security than conventional chips: "The clock is
like a big signal that says, 'Okay, look now,'" says Fant. "It's like looking for someone in a
marching band. Asynchronous is more like a milling crowd. There's no clear signal to watch.
Potential hackers don't know where to begin."

Analyzing the power consumption for each clock tick can crack the encryption on
existing smart cards. This allows details of the chip's inner workings to be deduced. Such an
attack would be far more difficult on a smartcard based on asynchronous logic.

They can perform encryption in a way that is harder to identify and to crack. Improved
encryption makes asynchronous circuits an obvious choice for smart cards: the chip-endowed
plastic cards beginning to be used for such security-sensitive applications as storage of medical
records, electronic funds exchange and personal identification.

Ivan Sutherland of Sun Microsystems, who is regarded as the guru of the field, believes
that such chips will have twice the power of conventional designs, which will make them ideal

for use in high-performance computers. But Dr Furber suggests that the most promising
application for asynchronous chips may be in mobile wireless devices and smart cards.

Different styles
There are several styles of asynchronous design. Conventional chips represent the zeroes
and ones of binary digits (bits) using low and high voltages on a particular wire.
One clockless approach, called dual rail, uses two wires for each bit. Sudden voltage
changes on one of the wires represent a zero, and on the other wire a one.

"Dual-rail" circuits use two wires giving the chip communications pathways, not only to
send bits, but also to send "handshake" signals to indicate when work has been completed.
Fant replaces the conventional system of digital logic with what he calls "null convention logic,"
a scheme that identifies not only "yes" and "no," but also "no answer yet," a convenient way for
clockless chips to recognize when an operation has not yet been completed.

Another approach is called bundled data. Low and high voltages on 32 wires are used
to represent 32 bits, and a change in voltage on a 33rd wire indicates when the values on the
other 32 wires are to be used.
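The two encodings can be contrasted with a small sketch. This Python fragment is illustrative only; the function names are invented for the example:

```python
# Dual rail: each bit travels on two wires; exactly one wire signals
# per value, and (0, 0) is the "no answer yet" spacer state.
def dual_rail_encode(bit):
    return (0, 1) if bit else (1, 0)     # (false_wire, true_wire)

def dual_rail_decode(pair):
    if pair == (0, 0):
        return None                      # spacer: no data on the wires yet
    return pair[1] == 1

# Bundled data: 32 ordinary value wires plus one request wire whose
# transition announces that the 32 values are now valid.
def bundle(bits32):
    assert len(bits32) == 32
    return {"data": list(bits32), "req": 1}

assert dual_rail_decode(dual_rail_encode(1)) is True
assert dual_rail_decode((0, 0)) is None
assert bundle([0] * 32)["req"] == 1
```

Dual rail pays two wires per bit but signals completion on every bit; bundled data keeps ordinary single-rail wires and concentrates the timing information on the one extra request wire.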

Chapter-8
APPLICATIONS OF ASYNCHRONOUS CHIPS
1. High performance.
2. Low power dissipation.
3. Low noise and low electro-magnetic emission.
4. A good match with heterogeneous system timing.

8.1 ASYNCHRONOUS FOR HIGH PERFORMANCE


In an asynchronous circuit the next computation step can start immediately after the
previous step has completed: there is no need to wait for a transition of the clock signal. This
leads, potentially, to a fundamental performance advantage for asynchronous circuits, an
advantage that increases with the variability in delays associated with these computation steps.
However, part of this advantage is canceled by the overhead required to detect the completion of
a step. Furthermore, it may be difficult to translate local timing variability into a global system
performance advantage.

Data-dependent delays
The delay of the combinational logic circuit shown in Figure 1 depends on the current
state and the value of the primary inputs. The worst-case delay, plus some margin for flip-flop
delays and clock skew, is then a lower bound for the clock period of a synchronous circuit. Thus,
the actual delay is always less (and sometimes much less) than the clock period.

A simple example is an N-bit ripple-carry adder (Figure 2). The worst-case delay occurs
when 1 is added to 2^N - 1. Then the carry ripples from FA1 to FAN. In the best case there is no
carry ripple at all, as, for example, when adding 1 to 0. Assuming random inputs, the average
length of the longest carry-propagation chain is bounded by log2 N. For a 32-bit-wide ripple-carry
adder the average length is therefore 5, but the clock period must be 6 times longer! On the other
hand, the average length determines the average-case delay of an asynchronous ripple-carry
adder, which we consider next. In an asynchronous circuit this variation in delays can be
exploited by detecting the actual completion of the addition. Most practical solutions use
dual-rail encoding of the carry signal (Figure 2(b)); the addition has completed when all internal
carry signals have been computed, that is, when each pair (cf_i, ct_i) has made a monotonic
transition from (0, 0) to (0, 1) (carry = false) or to (1, 0) (carry = true). Dual-rail encoding of the
carry signal has also been applied to a carry-bypass adder. When inputs and outputs are dual-rail
encoded as well, the completion can be observed from the outputs of the adder.
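The claim that the average longest carry-propagation chain grows like log2 N can be checked empirically. The following Python sketch (an illustrative experiment, not taken from the sources) simulates the carry chains of random additions:

```python
import random

def longest_carry_chain(a, b, n):
    """Length of the longest carry-propagation chain when two n-bit
    numbers are added in a ripple-carry adder."""
    longest = chain = 0
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        if ai & bi:                      # generate: a fresh carry starts here
            chain = 1
        elif ai ^ bi:                    # propagate: an existing carry ripples on
            chain = chain + 1 if chain else 0
        else:                            # kill: both bits zero, the chain dies
            chain = 0
        longest = max(longest, chain)
    return longest

random.seed(0)
n, trials = 32, 2000
avg = sum(longest_carry_chain(random.getrandbits(n), random.getrandbits(n), n)
          for _ in range(trials)) / trials
# Worst case (1 + (2**32 - 1)) ripples through all 32 stages...
assert longest_carry_chain(1, 2**32 - 1, 32) == 32
# ...but the average over random inputs stays near log2(32) = 5.
assert 3 < avg < 8
```

This is exactly the gap an asynchronous adder with completion detection exploits: it finishes in the average case, while a clocked adder must always budget for the 32-stage worst case.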

The controller communicates exclusively with the controllers of the immediately
preceding and succeeding stages by means of handshake signaling, and controls the state of the
data latches (transparent or opaque). Between the request and the next acknowledge phase the
corresponding data wires must be kept stable.

8.2 ASYNCHRONOUS FOR LOW POWER


Dissipating power only when and where active: the classic example of a low-power
asynchronous circuit is a frequency divider. A D-flip-flop with its inverted output fed back to
its input divides an incoming (clock) frequency by two (Figure 4(a)). A cascade of N such
divide-by-two elements (Figure 4(b)) divides the incoming frequency by 2^N.

The second element runs at only half the rate of the first one and hence dissipates only
half the power; the third one dissipates only a quarter, and so on. Hence, the entire asynchronous
cascade consumes, over a given period of time, slightly less than twice the power of its head
element, independent of N. That is, fixed power dissipation is obtained.
In contrast, a similar synchronous divider would dissipate in proportion to N. A cascade
of 15 such divide-by-two elements is used in watches to convert a 32 kHz crystal clock down to
a 1 Hz clock. The potential of asynchronous for low power depends on the application.
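The fixed-power claim is easy to verify numerically. In this hypothetical Python sketch, power is taken as proportional to toggle frequency, in units of the first stage's power:

```python
# Each divide-by-two stage toggles at half the rate of its predecessor,
# so its dynamic power (proportional to frequency) is halved as well.
def cascade_power(n_stages):
    """Total power of an asynchronous divider cascade, in units of the
    power dissipated by the first stage."""
    return sum(1 / 2**k for k in range(n_stages))

def synchronous_power(n_stages):
    """A clocked divider toggles every flip-flop at the full clock rate."""
    return float(n_stages)

# A 15-stage cascade (a 32 kHz watch crystal divided down to 1 Hz)
# dissipates just under twice the power of its head element,
# independent of N; the synchronous equivalent grows linearly with N.
assert cascade_power(15) < 2.0
assert synchronous_power(15) == 15.0
```

The geometric series 1 + 1/2 + 1/4 + ... never reaches 2, which is why the asynchronous cascade's dissipation is bounded regardless of how many stages are added.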
For example, in a digital filter where the clock rate equals the data rate, all flip-flops and
all combinational circuits are active during each clock cycle. Then little or nothing can be gained
by implementing the filter as an asynchronous circuit. However, in many digital-signal
processing functions the clock rate exceeds the data (signal) rate by a large factor, sometimes by
several orders of magnitude. In such circuits, only a small fraction of the registers change state
during a clock cycle. Furthermore, this fraction may be highly data dependent. The clock
frequency is chosen so high to accommodate sequential algorithms that share resources over
subsequent computation steps. One benefit is vastly improved electrical efficiency, which leads
directly to prolonged battery life.
One application for which asynchronous circuits can save power is Reed-Solomon error
correctors operating at audio rates, as demonstrated at Philips Research Laboratories. Two
different asynchronous realizations of this decoder (single-rail and dual-rail) are compared with a
synchronous (product) version.
The single rail was clearly superior and consumed five times less power than the
synchronous version.
A second example is the infrared communications receiver IC designed at Hewlett-Packard/Stanford. The receiver IC draws only leakage current while waiting for incoming data,
but can start up as soon as a signal arrives so that it loses no data. Also, most modules operate
well below the maximum frequency of operation.
The filter bank for a digital hearing aid was the subject of another successful
demonstration, this time by the Technical University of Denmark in cooperation with Oticon Inc.
They re-implemented an existing filter bank as a fully asynchronous circuit. The result is a
factor of five less power consumption.
A fourth application is a pager in which several power-hungry sub circuits were
redesigned as asynchronous circuits, as shown later in this issue.

8.3 ASYNCHRONOUS FOR LOW NOISE AND LOW EMISSION


Sub circuits of a system may interact in unintended and often subtle ways. For example, a
digital sub circuit generates voltage noise on the power-supply lines or induces currents in the
silicon substrate. This noise may affect the performance of an analog-to-digital converter
connected so as to draw power from the same source or that is integrated on the same substrate.
Another example is that of a digital sub circuit that emits electromagnetic radiation at its clock
frequency (and the higher harmonic frequencies), and a radio receiver sub-circuit that mistakes
this radiation for a radio signal.

Due to the absence of a clock, asynchronous circuits may have better noise and EMC
(Electro-Magnetic Compatibility) properties than synchronous circuits.
This advantage can be appreciated by analyzing the supply current of a clocked circuit in
both the time and frequency domains.
Circuit activity of a clocked circuit is usually maximal shortly after the productive clock
edge. It gradually fades away and the circuit must become totally quiescent before the next
productive clock edge. Viewed differently, the clock signal modulates the supply current as
depicted schematically in Figure 5(a). Due to parasitic resistance and inductance in the on-chip
and off-chip supply wiring this causes noise on the on-chip power and ground lines.

8.4 HETEROGENEOUS TIMING


There are two on-going trends that affect the timing of a system-on-a-chip: the relative
increase of interconnect delays versus gate delays and the rapid growth of design reuse. Their
combined effect results in an increasingly heterogeneous organization of system-on-a-chip
timing. According to Figure 7, gate delays rapidly decrease with each technology generation. By
contrast, the delay of a piece of interconnect of fixed modest length increases, soon leading to a
dominance of interconnect delay over gate delay. The introduction of additional interconnect
layers and new materials (copper and low dielectric constant insulators) may slow down this
trend somewhat. Nevertheless, new circuits and architectures are required to circumvent these
parasitic limitations. For example, across-chip communication may no longer fit within a single
clock period of a processor core.
Heterogeneous system timing will pose a considerable design challenge for system-level
interconnect, including buses, FIFOs, switch matrices, routers, and multi-port memories.
Asynchrony makes it easier to deal with interconnecting a variety of different clock frequencies,
without worrying about synchronization problems, differences in clock phases and frequencies,
and clock skew. Hence, new opportunities will arise for asynchronous interconnect structures and
protocols. Once asynchronous on-chip interconnect structures are accepted, the threshold to
introduce asynchronous clients to these interconnects is lowered as well. Also, mixed
synchronous-asynchronous circuits hold promise.

Chapter-9
A CHALLENGING TIME

Although the architectural freedom of asynchronous systems is a great benefit, it also poses a
difficult challenge. Because each part sets its own pace, that pace may vary from time to time
in any one system and may vary from system to system. If several actions are concurrent, they
may finish in a large number of possible sequences. Enumerating all the possible sequences of
actions in a complex asynchronous chip is as difficult as predicting the sequences of actions in
a school yard full of children. This dilemma is called the state explosion problem.

Can chip designers create order out of the potential chaos of concurrent actions?

Fortunately, researchers are developing theories for tackling this problem. Designers need not
worry about all the possible sequences of actions if they can set certain limitations on the
communication behavior of each circuit. To continue the schoolyard metaphor, a teacher can
promote safe play by teaching each child how to avoid danger.

Another difficulty is that we lack mature design tools, accepted testing methods and
widespread education in asynchronous design. A growing research community is making good
progress, but the present total investment in clock-free computing pales in comparison
with the investment in clocked design. Nevertheless, we are confident that the relentless
advances in the speed and complexity of integrated circuits will force designers to learn
asynchronous techniques. We do not know yet whether asynchronous systems will flourish
first within large computer and electronics companies or within start-up companies eager to
develop new ideas. The technological trend, however, is inevitable: in the coming decades,
asynchronous design will become prevalent.

Chapter-10
FUTURE SCOPE

The first place we'll see, and have already seen, clockless designs is in the lab. Many
prototypes will be necessary to create reliable designs. Manufacturing techniques must also be
improved so the chips can be mass-produced.

The second place we'll see these chips is in mobile electronics. This is an ideal place to
implement a clockless chip because of the minimal power consumption. Also, low levels of
electromagnetic noise create less interference; less interference is critical in designs with many
components packed very tightly, as is the case with mobile electronics.

The third place is in personal computers (PCs). Clockless designs will occur here last
because of the competitive PC market.

It is essential in that market to create an efficient design that is reasonably priced. A
manufacturing cost increase of a couple of cents per chip can cause an entire line of computers
to fail because of the large cost increase passed on to the customer. Therefore, the manufacturing
process must be improved to create a reasonably priced chip.

CONCLUSION

Clocks have served the electronics design industry very well for a long time, but there
are significant difficulties looming for clocked design in the future. These difficulties are most
obvious in complex SOC development, where electrical noise, power and design costs threaten
to render the potential of future process technologies inaccessible to clocked design.

Self-timed design offers an alternative paradigm that addresses these problem areas, but
until now VLSI designers have largely ignored it. Things are beginning to change, however:
self-timed design is poised to emerge as a viable alternative to clocked design. The drawbacks,
which are the lack of design tools and designers capable of handling self-timed design, are
beginning to be addressed, and a few companies (including a couple of start-ups, Theseus
Logic Inc., and Cogency Technology, Inc.) have made significant commitments to the
technology.

Although full-scale commercial demonstrations of the value of self-timed design are still
few in number, the examples available demonstrate that there are no show-stoppers to
threaten the ultimate viability of this strategy. Self-timed technology is poised to make an
impact, and there are significant rewards on offer to those brave enough to take the lead in its
exploitation.

REFERENCES

1. C. H. (Kees) van Berkel, Mark B. Josephs, and Steven M. Nowick, "Scanning the
Technology: Applications of Asynchronous Circuits," Proceedings of the IEEE, December 1998.
2. Ivan E. Sutherland and Jo Ebergen, "Computers without Clocks," Scientific American,
August 2002.
3. David Geer, "Is It Time for Clockless Chips?," IEEE Computer Society, March 2005.
4. Soha Hassoun, Yong-Bin Kim and Fabrizio Lombardi, "Guest Editors' Introduction:
Clockless VLSI Systems," copublished by IEEE CS and IEEE, November-December 2005.
5. Claire Tristram, "It's Time for Clockless Chips," MIT Technology Review, October 2001.
6. "Old Tricks for New Chips," The Economist, April 19th, 2001.
