
Computer System Organisation

Assignment

Prepared By:
Arindam Bandyopadhyay
225/CO/03
COE-1, Sem-5.

Netaji Subhas Institute of Technology, Sector-3, Dwarka, New Delhi-110045.

MIPS architecture

A MIPS R4400 microprocessor made by Toshiba

Contents
1) Introduction
2) History
3) MIPS CPU Family & Specifications
4) MIPS Cores
5) MIPS Programming & Emulation
6) Instruction Set
7) Summary of R3000 Instruction Set
8) RISC Pipelines
9) Pipelining Developments
10) Basic Processor Architecture
11) Applications
12) Other models and Future of MIPS
13) Further Reading & External Links

Introduction:
MIPS, for Microprocessor without Interlocked Pipeline Stages, is a RISC
microprocessor architecture developed by MIPS Computer Systems Inc.
MIPS designs are used in SGI's computer product line, and have found broad application
in embedded systems, Windows CE devices, and Cisco routers. The Nintendo 64 console,
Sony PlayStation console, Sony PlayStation 2 console, and Sony PSP handheld system
use MIPS processors. By the late 1990s it was estimated that one in three RISC chips
produced were MIPS-based designs.
The early MIPS architectures were 32-bit implementations (generally 32-bit wide
registers and data paths), while later versions were 64-bit implementations. Five
backward-compatible revisions of the MIPS instruction set exist, named MIPS I, MIPS II,
MIPS III, MIPS IV, and MIPS 32/64, alongside the newer MIPS MT (multithreading)
extensions.

The latest of these, MIPS 32/64 Release 2, defines a control register set as well as the
instruction set. Several add-on extensions are also available, including MIPS-3D which
is a simple set of floating-point SIMD instructions dedicated to common 3D tasks, MDMX
which is a more extensive integer SIMD instruction set using the 64-bit floating-point
registers, MIPS16 which adds compression to the instruction stream to make programs
take up less room (allegedly a response to the Thumb encoding in the ARM architecture),
and the recent addition of MIPS MT, new multithreading additions to the system similar to
HyperThreading in the latest Intel lineup.
Because the designers created such a clean instruction set (see Instructions), computer
architecture courses in universities and technical schools often study the MIPS
architecture. The design of the MIPS CPU family, together with SPARC, another early
RISC architecture, greatly influenced later RISC designs like DEC Alpha.

History:
In 1981, a team led by John L. Hennessy at Stanford University started work on what
would become the first MIPS processor. The basic concept was to dramatically increase
performance through the use of deep instruction pipelines, a technique that was well
known, but difficult to implement. Generally a pipeline spreads out the task of running an
instruction into several steps, starting work on step one of an instruction before the
preceding instruction is complete. In contrast, traditional designs of the era waited to
complete an entire instruction before moving on, thereby leaving large areas of the CPU
idle as the process continued.
One major barrier to pipelining was that it required interlocks to be set up to ensure that
instructions that took multiple clock cycles to complete would stop the pipeline from
loading more data, basically making it pause while they completed. These interlocks can
take a long time to set up, and were thought to be a major barrier to future speed
improvements. A major design aspect of the MIPS design was to demand that all instructions
take only one cycle to complete, thereby removing any need for interlocking.
Although this design eliminated a number of useful instructions, notably things like
multiply and divide which take multiple steps, it was felt that the overall performance of
the system would be dramatically improved by running the chips at much higher clock
rates. This ramping of the speed would be difficult with interlocking involved, as the time
needed to set up locks is as much a function of die size as clock rate: adding the hardware
needed might actually slow down the overall speed.
The elimination of these instructions became a contentious point, which many observers
used to claim the design (and RISC in general) would never live up to its hype. If one
simply replaces the complex multiply instruction with many simpler additions, where is
the speed increase? This overly-simple analysis ignored the fact that the speed of the
design was in the pipelines, not the instructions.
In 1984 Hennessy was convinced of the future commercial potential of the design, and
left Stanford to form MIPS Computer Systems. They released their first design, the
R2000, in 1985, improving the design as the R3000 in 1988. These 32-bit CPUs formed
the basis of their company through the 1980s, used primarily in SGI's series of
workstations. These commercial designs deviated from the Stanford academic research
by implementing most of the interlocks in hardware, supplying full multiply and divide
instructions (among others).
In 1991 MIPS released the first 64-bit microprocessor, the R4000. The design was so
important to SGI, at the time one of their only major customers, that SGI bought the
company outright in 1992 in order to guarantee the design would not be lost given the
financial difficulties MIPS had while bringing it to market. As a subsidiary of SGI, the
company became known as MIPS Technologies.
In the early 1990s MIPS started licensing their designs to third-party vendors. This
proved fairly successful due to the simplicity of the core, which allowed it to be used in a
number of applications that would have formerly used much less capable CISC designs
of similar gate count and price; the two are strongly related, since the price of a CPU is
generally a function of the number of gates plus the number of external pins. Sun Microsystems
attempted to follow their success by licensing their SPARC core, but have never been
anywhere near as successful. By the late 1990s MIPS was a powerhouse in the embedded
processor field, and in 1997 the 48-millionth MIPS-based CPU shipped, making it the
first RISC CPU to outship the famous Motorola 68000 family. They were so successful
that SGI spun off MIPS Technologies in 1998. Fully half of MIPS's income today comes
from licensing their designs, while much of the rest comes from contract design work on
cores that will then be produced by third parties.
In 1999 MIPS formalized their licensing system around two basic designs, the 32-bit
MIPS32 and 64-bit MIPS64. NEC, Toshiba and SiByte (later acquired by Broadcom)
each obtained licenses for the MIPS64 as soon as it was announced. Philips, LSI Logic
and IDT have since joined them. Success followed success, and today the MIPS cores are
one of the most-used heavyweight cores in the marketplace for computer-like devices
(hand-held computers, set-top boxes, etc.), with other designers fighting it out for other
niches. Some indication of their success is the fact that Motorola/Freescale uses MIPS
cores in their set-top box designs, instead of their own PowerPC-based cores.
Since the MIPS architecture is licensable, it has attracted several processor start-up
companies over the years. One of the first start-ups to design MIPS processors was
Quantum Effect Devices (see next section). The MIPS design team that designed the
R4300 started the company SandCraft, which designed the R5432 for NEC and later
produced the SR7100, one of the first out-of-order execution processors for the
embedded market. The original DEC StrongARM team eventually split into two MIPS-based
start-ups: SiByte, which produced the SB-1250, one of the first high-performance
MIPS-based systems-on-a-chip (SOC), and Alchemy Semiconductor, which produced the Au1000
SOC for low-power applications. SiByte was acquired by Broadcom while Alchemy
was acquired by AMD. Lexra used a MIPS-like architecture and added DSP extensions
for the audio chip market and multithreading support for the networking market. Due to
Lexra not licensing the architecture, two lawsuits were started between the two
companies. The first was quickly resolved when Lexra promised not to advertise their
processors as MIPS-compatible. The second was protracted, hurt both companies'
business, and culminated in MIPS Technologies giving Lexra a free license and a large
cash payment.

MIPS CPU family :


The first commercial MIPS CPU model, the R2000, was announced in 1985. It added
multiple-cycle multiply and divide instructions in a somewhat independent on-chip unit.
New instructions were added to retrieve the results from this unit back to the execution
core. Ironically, the result-retrieving instructions were interlocked, which improved
compiled code density but made the MIPS name meaningless.
The R2000 also had support for up to four co-processors, one of which was built into the
main CPU and handled exceptions and traps, while the other three were left for other
uses. One of these could be filled by the optional R2010 FPU, which had thirty-two 32-bit
registers that could be used as sixteen 64-bit registers for double-precision.
The R3000 succeeded the R2000 in 1988, adding 32kB (soon increased to 64kB) caches
for instructions and data, along with cache coherency support for multi-processor use.
While there were flaws in the R3000's multiprocessor support, it still managed to be a
part of several successful multiprocessor designs. The R3000 also included a built-in
MMU, a common feature on CPUs of the era. The R3000 was the first successful MIPS
design in the marketplace, and eventually over 1 million were made. The R3000A was a
speed bumped version running at 40MHz that delivered a performance of 32 VUPS. Like
the R2000, the R3000 was paired with the R3010 FPU. Pacemips produced an R3400
which was an R3000 with the R3010 FPU on a single chip.
The R4000 series, released in 1991, extended the MIPS instruction set to a full 64-bit
architecture, moved the FPU onto the main die to create a single-chip system, and
operated at a radically high internal clock speed (it was introduced at 100MHz).
However, in order to achieve the clock speed the caches were reduced to 8kB each and
took three cycles to access. The high operating frequencies were achieved through the
technique of deep pipelining (called super-pipelining at the time). With the introduction
of the R4000 a number of improved versions soon followed, including the R4400 of 1993
which included 16kB caches, largely bug-free 64-bit operation, and a controller for
another 1MB external (level 2) cache.
MIPS, now a division of SGI called MTI, designed the lower-cost R4200, and later the
even lower cost R4300, which was the R4200 with a 32-bit external bus. The R4300 was
used in the Nintendo 64.
Quantum Effect Devices (QED), a separate company started by refugees from MIPS,
designed the R4600, the R4700, the R4650 and the R5000. Where the R4000 had pushed
clock frequency and sacrificed cache capacity, the QED designs emphasized large caches
which could be accessed in just two cycles and efficient use of silicon area. The R4600
and R4700 were used in low-cost versions of the SGI Indy workstation as well as the first
MIPS based Cisco routers. The R4650 was used in the original WebTV set-top boxes
(now Microsoft TV). The R5000 FPU had more flexible single precision floating-point
scheduling than the R4000, and as a result, R5000-based SGI Indys had much better
graphics performance than similarly clocked R4400 Indys with the same graphics
hardware. SGI gave the old graphics board a new name when it was combined with
R5000 in order to emphasize the improvement. QED later designed the RM7000 and
RM9000 family of devices for embedded markets like networking and laser printers.

QED was acquired by the semiconductor manufacturer PMC-Sierra in August 2000, the
latter company continuing to invest in the MIPS architecture.
The R8000 (1994) was the first superscalar MIPS design, able to execute two ALU and
two memory operations per cycle. The design was spread over six chips: an integer unit
(with 16KB instruction and 16KB L1 data caches), a floating-point unit, three full-custom
secondary cache tag RAMs (two for secondary cache accesses, one for bus snooping),
and a cache controller ASIC. The design had two fully pipelined double precision
multiply-add units, which could stream data from the 4MB off-chip secondary cache. The
R8000 powered SGI's Power Challenge computer servers in the mid 1990s and later
became available in the Indigo2 Impact workstation. Its limited integer performance and
high cost dampened appeal for most users, although its FPU performance fit scientific
users quite well, and the R8000 was in the marketplace for only a year and remains fairly
rare.
In 1995, the R10000 was released. This processor was a single-chip design, ran at a faster
clock speed than the R8000, and had larger 32KB primary instruction and data caches. It
was also superscalar, but its major innovation was out-of-order execution. Even with a
single memory pipeline and simpler FPU, the vastly improved integer performance,
lower price, and higher density made the R10000 preferable for most customers.
Recent designs have all been based upon the R10000 core. The R12000 used improved
manufacturing to shrink the chip and operate at higher clock rates. The revised R14000
allowed higher clock rates with additional support for DDR SRAM in the off-chip cache,
and a faster front side bus clocked to 200MHz for better throughput. Later iterations are
named the R16000 and the R16000A and feature increased clock speed, additional L1
cache, and smaller die manufacturing compared with before.
MIPS Processor Specifications

Model    Frequency  Year  Process    Transistors  Die size  I/O   Power  Voltage  Dcache  Icache  Scache
         [MHz]            [µm]       [millions]   [mm²]     pins  [W]    [V]      [kB]    [kB]    [kB]
R2000    16.7       1985  2.0        0.11         -         -     -      -        32      64      none
R3000    25         1988  1.2        0.11         66.12     145   4      -        64      64      none
R4000    100        1991  0.8        1.35         213       179   15     5        8       8       1024
R4400    150        1992  0.6        2.3          186       179   15     5        16      16      1024
R4600    133        1994  0.64       2.2          77        179   4.6    5        16      16      512
R5000    180        1996  0.35       3.7          84        223   10     3.3      32      32      1024
R8000    90         1994  0.5        2.6          299       591   30     3.3      16      16      1024
R10000   200        1995  0.35       6.8          299       599   30     3.3      32      32      512
R12000   300        1998  0.18-0.25  6.9          204       600   20     4        32      32      1024
R14000   600        2001  0.13       7.2          204       527   17     -        32      32      2048
R16000   700        2002  0.11       -            -         -     20     -        64      64      4096
R16000A  800        2004  0.11       -            -         -     -      -        64      64      4096

Note: These specifications are only common processor configurations. Variations exist,
especially in clock speed and Level 2 cache. A dash marks a value not given.

MIPS Cores :
In recent years most of the technology used in the various MIPS generations has been offered
as building-blocks for embedded processor designs. Both 32-bit and 64-bit basic cores
are offered, known as the 4K and 5K respectively, and the design itself can be licensed as
MIPS32 and MIPS64. These cores can be mixed with add-in units such as FPUs, SIMD
systems, various input/output devices, etc.
MIPS cores have been very successful; they form the basis of many newer Cisco routers,
cable modems and ADSL modems, smartcards, laser printer engines, set-top boxes,
handheld computers, and the Sony PlayStation 2.

MIPS Programming and Emulation :


There is a freely available MIPS R2000/R3000 Simulator called SPIM for several
operating systems (specifically UNIX or GNU/Linux; Mac OS X; MS Windows 95, 98,
NT, 2000, XP; and DOS) which is good for learning MIPS assembly language
programming and the general concepts of RISC-assembly language programming:
http://www.cs.wisc.edu/~larus/spim.html
A more feature-rich MIPS emulator is available from the GXemul project (formerly
known as the mips64emul project), which emulates not only the various MIPS III and
higher microprocessors (from the R4000 through the R10000), but also emulates entire
computer systems which use the microprocessors. For example, GXemul can emulate
both a DECstation with a MIPS R4400 CPU (and boot to Ultrix), and an SGI O2 with a
MIPS R10000 CPU (although the ability to boot Irix is limited), among others, as well as
the various framebuffers, SCSI controllers, and the like which comprise those systems.

Instruction Set :
The MIPS instruction set consists of about 111 total instructions, each represented in
32 bits. An example of a MIPS instruction is below:
add $r12, $r7, $r8        000000 00111 01000 01100 00000 100000

Above is the assembly (left) and binary (right) representation of a MIPS addition
instruction. The instruction tells the processor to compute the sum of the values in
registers 7 and 8 and store the result in register 12. The dollar signs are used to
indicate an operation on a register. The binary representation on the right (colored in
the original figure) illustrates the 6 fields of a MIPS instruction. The processor
identifies the type of instruction by the binary digits in the first and last fields. In
this case, the processor recognizes that this instruction is an addition from the zero in
its first field and the 20 (hexadecimal) in its last field.
The operands are represented in the second and third (blue and yellow) fields, and the
desired result location is presented in the fourth (purple) field. The fifth (orange)
field represents the shift amount, something that is not used in an addition operation.
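As a minimal sketch of the field layout described above, the following Python packs the six fields of an R-format instruction into a 32-bit word. The helper name `encode_r_type` and the constant `ADD_FUNCT` are illustrative names introduced here, not part of any MIPS toolchain; the field widths (6, 5, 5, 5, 5, 6 bits) follow the standard R-format layout.

```python
# R-format layout: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)
ADD_FUNCT = 0x20  # function code for add; the opcode field is 0 (illustrative constant name)


def encode_r_type(rs, rt, rd, shamt=0, funct=ADD_FUNCT, opcode=0):
    """Pack the six R-format fields into one 32-bit instruction word."""
    fields = (("opcode", opcode, 6), ("rs", rs, 5), ("rt", rt, 5),
              ("rd", rd, 5), ("shamt", shamt, 5), ("funct", funct, 6))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    return ((opcode << 26) | (rs << 21) | (rt << 16)
            | (rd << 11) | (shamt << 6) | funct)


# add $r12, $r7, $r8: sources are registers 7 and 8, destination is register 12
word = encode_r_type(rs=7, rt=8, rd=12)
print(f"{word:032b}")  # the six fields, concatenated
print(f"0x{word:08x}")
```

The range check is there only to catch mistakes in the sketch; a real assembler would also map register names to numbers.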
The instruction set consists of a variety of basic instructions, including:
21 arithmetic instructions (+, -, *, /, %)
8 logic instructions (&, |, ~)
8 bit manipulation instructions
12 comparison instructions (>, <, =, >=, <=)
25 branch/jump instructions
15 load instructions
10 store instructions
8 move instructions
4 miscellaneous instructions

Summary of R3000 instruction set:


Instructions are divided into three formats: R, I and J. Every format begins with a
six-bit opcode. The R format follows it with three register fields, a shift amount and a
function (funct) field; the I format with two register fields and a 16-bit immediate
value; and the J format with a 26-bit immediate value. [1] [2] [3]
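The three formats can be told apart from the opcode alone. The sketch below is a simplification under stated assumptions: it treats opcode 0 as R format, opcodes 2 and 3 (j and jal) as J format, and everything else as I format, which ignores coprocessor encodings. The function name `decode` and the dictionary keys are hypothetical.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into fields according to its format."""
    opcode = (word >> 26) & 0x3F
    if opcode == 0:  # R format: three registers, shift amount, funct
        return {"format": "R", "opcode": opcode,
                "rs": (word >> 21) & 0x1F, "rt": (word >> 16) & 0x1F,
                "rd": (word >> 11) & 0x1F, "shamt": (word >> 6) & 0x1F,
                "funct": word & 0x3F}
    if opcode in (2, 3):  # J format: j and jal carry a 26-bit target
        return {"format": "J", "opcode": opcode, "target": word & 0x03FFFFFF}
    # I format (simplified): two registers and a 16-bit immediate
    return {"format": "I", "opcode": opcode,
            "rs": (word >> 21) & 0x1F, "rt": (word >> 16) & 0x1F,
            "imm": word & 0xFFFF}


print(decode(0x00E86020))  # the add instruction with rs=7, rt=8, rd=12
```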
Common arithmetic instructions follow:
add $1,$2,$3 ; $1 = $2 + $3 (signed)
addu $1,$2,$3 ; $1 = $2 + $3 (unsigned)
sub $1,$2,$3 ; $1 = $2 - $3 (signed)
subu $1,$2,$3 ; $1 = $2 - $3 (unsigned)
addi $1,$2,100 ; $1 = $2 + 100 (immediate)
The signed forms (add, sub, addi) raise an exception on overflow, whereas the unsigned
forms (addu, subu) do not. Subtracting an immediate can be done by adding the negation
of that value as the immediate.
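The difference between the trapping and non-trapping additions can be modelled in a few lines. This is only a sketch: register values are represented as Python integers masked to 32 bits, and the hardware overflow exception is stood in for by Python's `OverflowError`. The helper `to_signed` is an illustrative name.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def addu(a, b):
    """addu: wrap around silently on overflow."""
    return (a + b) & MASK32


def add(a, b):
    """add: same bit result as addu, but trap if the signed sum does not fit."""
    result = to_signed(a) + to_signed(b)
    if not -(1 << 31) <= result < (1 << 31):
        raise OverflowError("integer overflow exception")
    return result & MASK32


print(hex(addu(0x7FFFFFFF, 1)))  # wraps to the most negative pattern
print(add(0xFFFFFFFF, 1))        # -1 + 1, no overflow
```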
Since MIPS is a load-store architecture, only load and store instructions transfer data
between registers and memory. The word-sized forms are:
lw $1,100($2) ; load the word at memory address $2 + 100 into register $1.
sw $1,100($2) ; store the word in register $1 to memory address $2 + 100.
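The base-plus-offset addressing used by lw and sw can be illustrated with a toy machine model. The class name `Machine`, its dictionary-backed memory and the alignment check are assumptions made for this sketch, not a description of any real simulator.

```python
class Machine:
    """Toy model: 32 registers and a sparse, word-addressed memory."""

    def __init__(self):
        self.regs = [0] * 32
        self.mem = {}  # byte address (multiple of 4) -> 32-bit word

    def _address(self, base, offset):
        # Effective address = contents of the base register + constant offset
        addr = (self.regs[base] + offset) & 0xFFFFFFFF
        if addr % 4:
            raise ValueError("unaligned word access")
        return addr

    def lw(self, rt, offset, base):
        """lw rt, offset(base): copy a word from memory into register rt."""
        value = self.mem.get(self._address(base, offset), 0)
        if rt != 0:  # register 0 always reads as zero
            self.regs[rt] = value

    def sw(self, rt, offset, base):
        """sw rt, offset(base): copy register rt into memory."""
        self.mem[self._address(base, offset)] = self.regs[rt] & 0xFFFFFFFF


m = Machine()
m.regs[2] = 1000
m.regs[1] = 42
m.sw(1, 100, 2)   # sw $1,100($2): memory[1100] = 42
m.regs[1] = 0
m.lw(1, 100, 2)   # lw $1,100($2): $1 = memory[1100]
print(m.regs[1])
```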
Branching and jump instructions are:
beq $1,$2,100 ; if ($1 == $2) go to PC+4+100
slt $1,$2,$3 ; if ($2 < $3) $1 = 1; else $1 = 0
j 10000 ; goto 10000
jal 10000 ; $31 = PC + 4 and go to 10000
Some other important instructions are:
lui $1,100 ; load the 16-bit immediate into the upper 16 bits of $1 and clear the lower 16 bits.
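A common use of lui is building a full 32-bit constant in two steps, since an I-format instruction carries only 16 immediate bits. The sketch below pairs lui with ori (OR immediate), an instruction not listed above; that pairing and the helper name `load_constant` are assumptions of this example.

```python
def lui(imm16):
    """lui: place the 16-bit immediate in the upper half, zero the lower half."""
    return (imm16 & 0xFFFF) << 16


def ori(value, imm16):
    """ori: OR a zero-extended 16-bit immediate into a register value."""
    return (value | (imm16 & 0xFFFF)) & 0xFFFFFFFF


def load_constant(value32):
    """Build a 32-bit constant as lui (upper half) followed by ori (lower half)."""
    upper = (value32 >> 16) & 0xFFFF
    lower = value32 & 0xFFFF
    return ori(lui(upper), lower)


print(lui(100))                         # 100 shifted left by 16 bits
print(hex(load_constant(0x12345678)))
```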

RISC Pipelines:
A RISC processor pipeline operates in much the same way as a laundry pipeline (washing,
drying, folding), although the stages in the
pipeline are different. While different processors have different numbers of steps, they
are basically variations of these five, used in the MIPS R3000 processor:
1. fetch instructions from memory
2. read registers and decode the instruction
3. execute the instruction or calculate an address
4. access an operand in data memory
5. write the result into a register
In the laundry pipeline analogy, although the washer finishes in half an hour, the dryer
takes an extra ten minutes, and thus the wet clothes must wait ten minutes for the dryer
to free up. Thus, the rate at which the pipeline advances is set by the length of its
longest step. Because RISC instructions are simpler than
those used in pre-RISC processors (now called CISC, or Complex Instruction Set
Computer), they are more conducive to pipelining. While CISC instructions varied in
length, RISC instructions are all the same length and can be fetched in a single operation.
Ideally, each of the stages in a RISC processor pipeline should take 1 clock cycle so that
the processor finishes an instruction each clock cycle and averages one cycle per
instruction (CPI).
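The two points above (the slowest stage sets the pace, and an ideal pipeline approaches one cycle per instruction) can be checked with a short calculation. This sketch assumes an ideal pipeline with no stalls or hazards; the function names are illustrative.

```python
def clock_period(stage_delays):
    """The clock must be slow enough for the slowest stage."""
    return max(stage_delays)


def total_cycles(n_instructions, n_stages):
    """Cycles to run n instructions: fill the pipeline once, then one per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def average_cpi(n_instructions, n_stages):
    """Average cycles per instruction over the whole run."""
    return total_cycles(n_instructions, n_stages) / n_instructions


# Laundry analogy: a 30-minute washer feeding a 40-minute dryer advances every 40 minutes
print(clock_period([30, 40]))
# Five-stage pipeline: CPI approaches 1 as the instruction count grows
print(average_cpi(1000, 5))
```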

Pipelining Developments:
In order to make processors even faster, various methods of optimizing pipelines have
been devised.
Superpipelining refers to dividing the pipeline into more steps. The more pipe stages
there are, the faster the pipeline is because each stage is then shorter. Ideally, a pipeline
with five stages should be five times faster than a non-pipelined processor (or rather, a
pipeline with one stage). The instructions are executed at the speed at which each stage is
completed, and each stage takes one fifth of the amount of time that the non-pipelined
instruction takes. Thus, a processor with an 8-step pipeline (the MIPS R4000) will be
even faster than its 5-step counterpart. The MIPS R4000 chops its pipeline into more
pieces by dividing some steps into two. Instruction fetching, for example, is now done in
two stages rather than one. The stages are as shown:
1. Instruction Fetch (First Half)
2. Instruction Fetch (Second Half)
3. Register Fetch
4. Instruction Execute
5. Data Cache Access (First Half)
6. Data Cache Access (Second Half)
7. Tag Check
8. Write Back
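The ideal-speedup argument for deeper pipelines can be written out as a small calculation. It is a sketch under strong assumptions: the work of one instruction divides evenly into the stages, there is no per-stage overhead, and no stalls occur, so real gains from an 8-stage design are smaller than this suggests. The function name is hypothetical.

```python
def ideal_speedup(n_instructions, n_stages):
    """Speedup of an ideal n_stages-deep pipeline over a non-pipelined processor.

    Time is measured in stage-times: a non-pipelined instruction takes n_stages
    of them, while the pipeline needs n_stages to fill and then completes one
    instruction per stage-time.
    """
    unpipelined_time = n_instructions * n_stages
    pipelined_time = n_stages + n_instructions - 1
    return unpipelined_time / pipelined_time


for depth in (1, 5, 8):
    print(depth, round(ideal_speedup(1000, depth), 3))
```

For long instruction streams the speedup approaches the number of stages, which is the sense in which a five-stage pipeline is "five times faster".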
Superscalar pipelining involves multiple pipelines in parallel. Internal components of the
processor are replicated so it can launch multiple instructions in some or all of its pipeline
stages. The RISC System/6000 has a forked pipeline with different paths for floating-point
and integer instructions. If there is a mixture of both types in a program, the
processor can keep both forks running simultaneously. Both types of instructions share
two initial stages (Instruction Fetch and Instruction Dispatch) before they fork. Often,
however, superscalar pipelining refers to multiple copies of all pipeline stages (In terms
of laundry, this would mean four washers, four dryers, and four people who fold clothes).
Many of today's machines attempt to find two to six instructions that they can execute in
every pipeline stage. If some of the instructions are dependent, however, only the first
instruction or instructions are issued.
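The issue rule in the last sentence can be sketched as follows: scan the next few instructions in program order and stop at the first one that depends on an instruction already chosen in the same cycle. The representation of an instruction as a `(destination, sources)` tuple and the function name `issue_group` are assumptions of this example, and structural limits (such as the number of memory ports) are ignored.

```python
def issue_group(window, width):
    """Return how many instructions from the front of `window` can issue together.

    Each instruction is a (dest, sources) tuple; dest may be None for
    instructions that write no register.
    """
    written = set()
    issued = 0
    for dest, sources in window[:width]:
        # Stop at the first instruction that reads or rewrites a register
        # produced by an earlier instruction in this same group.
        if written & set(sources) or (dest is not None and dest in written):
            break
        if dest is not None:
            written.add(dest)
        issued += 1
    return issued


program = [
    ("r1", ("r2", "r3")),
    ("r4", ("r5", "r6")),
    ("r7", ("r1", "r4")),  # depends on the two instructions before it
    ("r8", ("r9",)),
]
print(issue_group(program, 4))      # only the independent prefix issues
print(issue_group(program[2:], 4))  # the rest issues in the next cycle
```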
Dynamic pipelines have the capability to schedule around stalls. A dynamic pipeline is
divided into three units: the instruction fetch and decode unit, five to ten execute or
functional units, and a commit unit. Each execute unit has reservation stations, which act
as buffers and hold the operands and operations.
While the functional units have the freedom to execute out of order, the instruction
fetch/decode and commit units must operate in-order to maintain simple pipeline
behavior. When the instruction is executed and the result is calculated, the commit unit
decides when it is safe to store the result. If a stall occurs, the processor can schedule
other instructions to be executed until the stall is resolved. This, coupled with the
efficiency of multiple units executing instructions simultaneously, makes a dynamic
pipeline an attractive alternative.
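The in-order commit rule can be illustrated with a very small model. It assumes that each instruction's finish cycle is already known, that the commit unit can retire any number of instructions per cycle, and that an instruction may commit in the cycle it finishes; all three are simplifications, and the function name is hypothetical.

```python
def commit_cycles(finish_cycles):
    """Given finish cycles in program order, return the cycle each one commits.

    Execution may finish out of order, but an instruction cannot commit
    before every earlier instruction has committed.
    """
    commits = []
    last_commit = 0
    for finish in finish_cycles:
        last_commit = max(last_commit, finish)
        commits.append(last_commit)
    return commits


# The second instruction stalls until cycle 10; the later ones execute
# during the stall but wait for it before committing.
print(commit_cycles([3, 10, 4, 5]))
```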


Basic Processor Architecture :


The execution of an instruction in a processor can be split up into a number of stages.
How many stages there are, and the purpose of each stage is different for each processor
design. Examples include 2 stages (Instruction Fetch / Instruction Execute) and 3 stages
(Instruction Fetch, Instruction Decode, Instruction Execute). The MIPS processor has 5
stages:
IF (Instruction Fetch)
ID (Instruction Decode)
EX (Execute)
MA (Memory Access)
WB (Write Back)

The Instruction Fetch stage fetches the next instruction from memory using the address in the
PC (Program Counter) register and stores this instruction in the IR (Instruction Register).
The Instruction Decode stage decodes the instruction in the IR, calculates the next PC, and
reads any operands required from the register file.
The Execute stage executes the instruction. In fact, all ALU operations are done in this
stage. (The ALU is the Arithmetic and Logic Unit and performs operations such as addition,
subtraction, shifts left and right, etc.)
The Memory Access stage performs any memory access required by the current instruction.
For loads, it would load an operand from memory. For stores, it would store an operand
into memory. For all other instructions, it would do nothing.
For instructions that have a result (a destination register), the Write Back stage writes this
result back to the register file. Note that this includes nearly all instructions, except nops
(a nop, no-op or no-operation instruction simply does nothing) and stores.
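The stage descriptions above can be summarized as a lookup of which stages do useful work for each kind of instruction. This is a sketch of the text's description only: every instruction still passes through all five stages in the hardware, and the four category names ("alu", "load", "store", "nop") are labels chosen for this example.

```python
STAGES = ("IF", "ID", "EX", "MA", "WB")


def active_stages(kind):
    """Return the stages in which an instruction of the given kind does useful work."""
    if kind not in ("alu", "load", "store", "nop"):
        raise ValueError(f"unknown instruction kind: {kind}")
    active = ["IF", "ID"]             # every instruction is fetched and decoded
    if kind != "nop":
        active.append("EX")           # ALU operation or address calculation
    if kind in ("load", "store"):
        active.append("MA")           # only loads and stores touch data memory
    if kind in ("alu", "load"):
        active.append("WB")           # only instructions with a destination register
    return active


for kind in ("alu", "load", "store", "nop"):
    print(kind, active_stages(kind))
```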


Applications :
Among the manufacturers which made computer workstation systems using MIPS
processors are SGI, MIPS Computer Systems, Inc., Olivetti, Siemens-Nixdorf, Acer,
Digital Equipment Corporation, NEC, and DeskStation. Various operating systems have
been ported to the architecture, such as SGI's IRIX, Microsoft's Windows NT (although
support for MIPS ended with the release of Windows NT 4.0) and Windows CE, Linux,
BSD, UNIX System V, MIPS Computer Systems' own RISC/os, and others.
However, use of MIPS as the main processor of computer workstations has declined, and
SGI has announced its plans to cease developing high-performance iterations of the MIPS
architecture in favor of using Intel IA64-based processors (see Other models and future
plans section below).
On the other hand, use of MIPS microprocessors in embedded roles is likely to remain
common, because of the low power-consumption and heat characteristics of embedded
MIPS implementations, the wide availability of embedded development tools for MIPS,
as well as experts knowledgeable about the architecture.

Other models and future plans :


Other members of the MIPS family include the R6000, an ECL implementation of the
MIPS architecture which was produced by Bipolar Integrated Technology. The R6000
microprocessor introduced the MIPS II instruction set. Its TLB and cache architecture
are different from all other members of the MIPS family. The R6000 did not deliver the
promised performance benefits, and although it saw some use in Control Data machines,
it quickly disappeared from the mainstream market. The RM7000 was a version of the
R5000 with a built-in 256kB level 2 cache and a controller for optional level three cache.
It was primarily targeted at embedded designs, including SGI's graphics processors and
various networking solutions, primarily by Cisco. The R9000 name was never used.
At one time SGI had intended to move off the MIPS platform to the Intel Itanium, and
development was to have ended with the R10000. The ever-longer delays in introducing
the Itanium meant that the installed base of MIPS-based machines continued to increase.
By 1999 it was clear that development had ended too soon, and the R14000 and R16000
were created as a result. SGI has hinted at a more complex R8000 style FPU for later
R-series, and a dual core processor is probable. Low power consumption / heat dissipation
will continue to be a focus.


Further reading:

Patterson and Hennessy: Computer Organization and Design. The Hardware/Software
Interface. Morgan Kaufmann Publishers. ISBN 1-55860-604-1
This book about computer design in general, and RISC in particular, takes its examples
directly from the MIPS architecture.

Dominic Sweetman: See MIPS Run. Morgan Kaufmann Publishers. ISBN 1-55860-410-3
The definitive book on the MIPS architecture. Survey of the hardware architecture and
good details on the hardware/software compact with the compiler and operating systems.

External links and References :

MIPS processor images and descriptions at cpu-collection.de


http://en.wikipedia.org/wiki/MIPS_architecture
