
ECE 567 Spring 2003

University of Illinois at Chicago

16 bit Reduced Instruction Set Computer (RISC) Processor Design A Project Report

Amanjyot Singh Johar Department of Electrical and Computer Engineering University of Illinois at Chicago

ajohar1@uic.edu


Abstract


This project presents the structural design and the functional characteristics of a general-purpose RISC processor. The design is based on the study and evaluation of a variety of assembly instruction sets. The processor is pipelined, with 5 pipeline stages. A variety of instructions have been implemented for the proposed design, and care has been taken to provide a control mechanism for data, structural and timing hazards. The processor's architecture features 16-bit instruction words, 8 internal general-purpose registers, each of which can hold a 16-bit data word, and 16 external address lines to external memory. The entire processor was modeled using a bottom-up design methodology. The design has been done in VHDL and synthesized using the hardware tool VHDL-Simili 2.1, available from Symphony EDA. The features of this processor include a 16-bit architecture and a multi-cycle implementation of the data path. A number of instructions from the MIPS instruction set have been implemented. The design has been done at the behavioral and structural levels of VHDL. The individual components were designed, analyzed, synthesized and tested at each level of implementation, and were finally integrated in a top-level simulation by appropriate port mapping.

Introduction

In the mid-1970's advances in semiconductor technology began to reduce the difference in speed between main memory and processor chips. As memory speed increased, and high-level languages displaced assembly language, the major reasons for CISC began to disappear, and computer designers began to look at ways computer performance could be optimized beyond just making faster hardware. One of their key realizations was that a sequence of simple instructions produces the same results as a sequence of complex instructions, but can be implemented with a simpler (and faster) hardware design. The advances in memory design to achieve the desired speed up have been critical in the development of RISC architectures. The basic characteristics of RISC processors are:

Simple instruction set. In a RISC machine, the instruction set contains simple, basic instructions, from which more complex instructions can be composed.

Same length instructions. Each instruction is the same length, so that it may be fetched in a single operation.

1 machine-cycle instructions. Most instructions complete in one machine cycle, which allows the processor to handle several instructions at the same time. This pipelining is a key technique used to speed up RISC machines.

RISC designers are concerned primarily with creating the fastest chip possible, and so they use a number of techniques, including pipelining. Pipelining is a design technique where the computer's hardware processes more than one instruction at a time, and doesn't wait for one instruction to complete before starting the next.


The basic execution stages for a RISC machine are the instruction fetch, instruction decode, instruction execute, memory and input output addressing and write-back to memory. These stages are executed in parallel which implies that as soon as one stage completes, it passes on the result to the next stage and then begins working on another instruction. The techniques are illustrated in detail in the next sections. The figure below shows an abstract view of a RISC processor and the interconnectivity of the building blocks.


Figure 1: An abstract view of the RISC processor

In a typical pipelined RISC design, each instruction takes 1 clock cycle for each stage, so the processor can accept 1 new instruction per clock. Pipelining doesn't improve the latency of instructions (each instruction still requires the same amount of time to complete), but it does improve the overall throughput. As with CISC computers, the ideal is not always achieved. Sometimes pipelined instructions take more than one clock to complete a stage. When that happens, the processor has to stall and not accept new instructions until the slow instruction has moved on to the next stage. Since the processor is sitting idle when stalled, both the designers and programmers of RISC systems make a conscious effort to avoid stalls. To do this, designers employ several techniques, which basically enhance the performance of the processor.

Performance Issues in Pipelined Systems

A pipelined processor can stall for a variety of reasons, including delays in reading information from memory, a poor instruction set design, or dependencies between instructions. These problems can be addressed in a number of ways:

Memory speed resolution

Memory speed issues are commonly solved using caches. A cache is a section of fast memory placed between the processor and slower memory. When accessing a memory location, the processor also copies it into the cache. Subsequently, any references to that location come from the cache, which returns the value much more quickly than main memory. However, caches present one major problem to system designers and programmers, and that is the problem of coherency. When the processor writes a value to memory, the result goes into the cache instead of going directly to main memory. Therefore, special hardware (usually implemented as part of the processor) needs to write the information out to main memory before something else tries to read that location or before re-using that part of the cache for some different information.

Instruction Latency

A poorly designed instruction set can cause a pipelined processor to stall frequently. Some of the more common problem areas are:

1. Highly encoded instructions such as those used on CISC machines, that require a dedicated setup to decode

2. Variable-length instructions which require multiple references to memory to fetch in the entire instruction.

3. Instructions which access main memory instead of the registers, since the main memory can be slow

4. Complex instructions which require multiple clocks for execution, for example many floating-point operations.

5. Instructions which need to read and write the same register. For example, "ADD 5 to register 3" has to read register 3, add 5 to that value, then write the sum back to the same register. In such a situation, the processor may still be "busy" from the earlier read operation, causing the processor to stall until the register becomes available to write.

6. Dependence on single-point resources. For example: a condition code register. If one instruction sets the conditions in the condition code register and the following instruction tries to read those bits, the second instruction may have to stall until the first instruction's write completes.

Dependencies

One problem that RISC programmers face is that the processor can be slowed down by a poor choice of instructions. Since each instruction takes some amount of time to store its result, and several instructions are being handled at the same time, later instructions may have to wait for the results of earlier instructions to be stored. However, a simple rearrangement of the instructions in a program called Instruction Scheduling can remove these performance limitations from RISC programs.

Implementation Details

Design Rules and Assumptions

1. Design is to be generated for a true 16 bit processor. That is both the data-path and the instruction word are 16 bits long.

2. Processor design is based on the von-Neumann architecture of a single memory rather than separate memories for data and instructions


Design of the Instruction Set


The instruction set has been designed with respect to the following four instruction types. Different formats for certain instructions have been specified. This is because different instructions use different operands and hence different formats needed to be constructed for them.

The general format of the Instruction word is as shown below:

| OPC | OPC | OPC | OPC | Reg | Reg | Reg | R/I | R/I | R/I | R/I | R/I | R/I | I/F | I/F | I/F |

Where OPC stands for the op-code, Reg specifies a register, R/I specifies either a register or an immediate value and I/F specifies a function. The classification of all the instruction types is given as below:

Register Format (R-type)

The most common style of instruction is the R-type. The R-type instruction has two read registers and one write register. All of the instructions have a 4-bit opcode, which is used to determine what type of instruction is to be executed. Each of the register specifications in all of the instructions is 3 bits wide, which means that the register file has eight registers in it. In an R-type instruction the first 3-bit register specification is the write register, and the following two 3-bit register specifications are the two read registers that the instruction will use. The last 3 bits in the instruction are used as function bits. These function bits specify which instruction will actually be performed. This means that for a single R-type opcode, up to eight different instructions can be implemented using the function bits.

Register Immediate Format (RI-type)

The RI-type is similar to the R-type except the second read register and the 3 function bits are replaced by a 6-bit immediate value. Each RI-type opcode can have only one instruction because it has no function bits like the R-type.

Immediate Format (I-type)

The I-type format is used by the two move immediate instructions and the two PC relative branch instructions. The I-type format is comprised of a 4-bit opcode followed by a 3-bit register value, an 8-bit immediate field and a 1-bit function bit. The function bit allows two instructions to be assigned to each I-type opcode.

Immediate Shift Format (SI-type)

The SI-type format is used by the shift instructions. It is comprised of a 4-bit opcode, a 3-bit destination register value, a 3-bit source register value, a 5-bit immediate field and a 1-bit function bit. The 5-bit immediate field is a signed two's-complement value, so it can shift the source register by -16 to 15 places in the desired direction.
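The four formats above can be illustrated by a small field-splitting block. This is only a sketch: the entity name `instr_fields` and its port names are hypothetical, and it assumes the fields are packed most-significant-bit first in the order the text lists them (opcode, first register, then the remaining fields). The report does not state the exact bit positions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper: splits a 16-bit instruction word into its fields.
-- ASSUMPTION: fields are packed MSB first (opcode in bits 15..12).
entity instr_fields is
  port(instr  : in  std_logic_vector(15 downto 0);
       opcode : out std_logic_vector(3 downto 0);   -- all formats
       rd     : out std_logic_vector(2 downto 0);   -- write/first register
       rs     : out std_logic_vector(2 downto 0);   -- R, RI, SI formats
       rt     : out std_logic_vector(2 downto 0);   -- R format only
       funct  : out std_logic_vector(2 downto 0);   -- R format (3 function bits)
       imm6   : out std_logic_vector(5 downto 0);   -- RI format
       imm8   : out std_logic_vector(7 downto 0);   -- I format
       imm5   : out std_logic_vector(4 downto 0);   -- SI format
       fbit   : out std_logic);                     -- I and SI function bit
end entity instr_fields;

architecture dataflow of instr_fields is
begin
  opcode <= instr(15 downto 12);
  rd     <= instr(11 downto 9);
  rs     <= instr(8 downto 6);
  rt     <= instr(5 downto 3);
  funct  <= instr(2 downto 0);
  imm6   <= instr(5 downto 0);   -- replaces rt and funct in the RI format
  imm8   <= instr(8 downto 1);   -- 4 + 3 + 8 + 1 = 16 bits in the I format
  imm5   <= instr(5 downto 1);   -- 4 + 3 + 3 + 5 + 1 = 16 bits in the SI format
  fbit   <= instr(0);
end architecture dataflow;
```

All outputs are driven at once; in a layout like this, the opcode would decide which of them are meaningful for a given instruction.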

The table below shows the actual instructions that have been implemented. A complex instruction can be implemented by a combination of two or more instructions in the instruction set:

| Instruction | Format | Description |
|---|---|---|
| ADD R1, R2, R3 | R | Signed addition (R1 = R2 + R3) |
| ADDI R1, R2, data6 | RI | Signed Immediate Addition (R1 = R2 + data6) |
| ADDu R1, R2, R3 | R | Unsigned Addition (R1 = R2 + R3) |
| AND R1, R2, R3 | R | Logic AND |
| BNZ R1, R2 | R | Branch on Not Zero (If R1 not 0 jump to loc R2) |
| BNZI R1, data8 | I | Branch on Not Zero |
| BZ R1, R2 | R | Branch on Zero (If R1=0 jump to loc R2) |
| BZI R1, data8 | I | Branch on Zero PC Relative (If R1=0 jump to loc PC+data8) |
| EI data6 | R | Enable Interrupts (Each bit of data6 represents whether an interrupt is on or off) |
| IN R1 | R | Input Port |
| LW R1, R2, data6 | RI | Load Word |
| MVIL R1, data8 | I | Move Immediate Low Byte (Move data8 into the low byte of R1) |
| NOP | R | No Operation |
| NOR R1, R2, R3 | R | Logic NOR |
| NOT R1, R2 | R | Logic NOT (R1 = NOT R2) |
| OR R1, R2, R3 | R | Logic OR |
| RETI | R | Return from interrupt routine |
| ROR R1, R2, R3 | S | Signed Rotate Right |
| SLL R1, R2, R3 | S | Signed Logic Shift Left (R1 = R2 shifted by R3) |
| SLLI R1, R2, data5 | SI | Immediate Signed Logic Shift Left |
| SRA R1, R2, R3 | S | Signed Arithmetic Shift Right |
| SRAI R1, R2, data5 | SI | Immediate Signed Arithmetic Shift Right |
| SRL R1, R2, R3 | S | Signed Logic Shift Right |
| SRLI R1, R2, data5 | SI | Immediate Signed Logic Shift Right |
| SUB R1, R2, R3 | R | Signed Subtraction (R1 = R2 - R3) |
| SUBI R1, R2, data6 | RI | Signed Immediate Subtraction |
| SUBu R1, R2, R3 | R | Unsigned Subtraction (R1 = R2 - R3) |
| SW R1, R2, data6 | RI | Store Word |
| XOR R1, R2, R3 | R | Logic XOR |

Control and Data-path Design

The design of the instruction set was followed by the analysis of data-paths which are taken by a data set upon instruction execution. These are the paths in which data will flow around the processor. The paths for each instruction are combined to form the overall data-path for the processor. The most common data-paths are as follows:

R-Type Data-path

The R-type data-path is also known as the arithmetic data-path. In the R-type data-path the instruction is fetched from memory and broken up into its various parts. The two read registers from the instruction are fetched from the Register File and the ALU performs the operation given to it by the instruction. The result from the ALU is then written back into the register file.

Figure – R type Data-path


Figure: Data-path for a simple add instruction ADD R1, R2, R3

Register Immediate Type Data-path

Also known as the Register Immediate Data-path or the RI Data-path, it is similar to the R-type except the second read register is replaced with a value that is actually inside the instruction. This immediate value is sign extended from 6-bits to 16-bits and then used as the second input to the ALU. As with the R-type the result from the ALU is then written back into the register file.
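The sign extension from 6 bits to 16 bits can be sketched as follows. This is a minimal illustration, not the report's own code: the entity name `sign_ext6` is hypothetical, and the sketch uses `ieee.numeric_std` rather than the `std_logic_arith` package used in the appendix.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical helper: sign-extends the 6-bit RI-type immediate to 16 bits.
entity sign_ext6 is
  port(imm6  : in  std_logic_vector(5 downto 0);
       imm16 : out std_logic_vector(15 downto 0));
end entity sign_ext6;

architecture dataflow of sign_ext6 is
begin
  -- resize on a SIGNED value replicates the sign bit (bit 5) into bits 15..6.
  imm16 <= std_logic_vector(resize(signed(imm6), 16));
end architecture dataflow;
```

For example, the immediate "111111" (-1) becomes "1111111111111111", while "011111" (31) becomes "0000000000011111".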


Figure – R-I type Data-path

Load Word Data-path

The data-path for a load word is identical to the RI-type data-path with the exception that the result from the ALU is sent to fetch a value from memory instead of being written to the register file. The value that is fetched from memory is then loaded into the register file.


Figure – Data-path for a Load Instruction


Figure: The complete data-path for a load and store word operation

Store Word Data-path

The store word data-path is similar to the load word with the exception that the write register actually specifies which register to write to memory and not the register file.


Figure – Data-path for a Store Instruction

To determine the entire data-path of the processor, all the affiliated data-paths are combined into a comprehensive unit as shown in the figure below.


Figure: All the affiliated data-paths for the different instruction types

Load/Store Architecture

The only memory operations permitted in the RISC processor are loads and stores. All other operations are performed directly on the values in the registers. The load/store architecture simplifies the design of the processor and allows the programmer to hide the delay caused by memory accesses. Loads from memory always have at least a one clock delay, even if the data is in the onboard cache. The delay occurs because the load pipeline, unlike the ALU or shift pipeline, has 5 stages. The figure below shows an instruction sequence for a load instruction and two subsequent instructions. The data from the load operation is available at the end of the load instruction’s cache-read stage. This is too late for the instruction immediately following the load to use the data as a source operand. The instruction slot following a load is the load-delay slot. If the instruction in the load-delay slot attempts to use the data, the processor will stall for one clock cycle (assuming a cache hit) as shown.


Figure: A load delay stalled pipeline


Figure: No load delay stall in the pipeline

The instruction set can be used efficiently to avoid these stalls. The most beneficial option is to rearrange the sequence of instructions so that a useful instruction, which does not depend on the load data, is placed in the load-delay slot. This is shown in the example above, where the load instruction takes only one clock cycle and causes no disruption to the pipeline.
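When the instruction in the load-delay slot does use the loaded value, the hardware has to detect this and insert the one-cycle stall. A minimal sketch of such a check is given here; the entity `load_use_detect` and all of its signal names are hypothetical and are not taken from the hazard detection unit in the appendix.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical load-use detector (illustrative names only).
entity load_use_detect is
  port(idex_is_load : in  std_logic;                     -- instruction in EX is a load
       idex_rd      : in  std_logic_vector(2 downto 0);  -- register the load will write
       ifid_rs      : in  std_logic_vector(2 downto 0);  -- source registers of the
       ifid_rt      : in  std_logic_vector(2 downto 0);  -- instruction being decoded
       stall        : out std_logic);
end entity load_use_detect;

architecture behav of load_use_detect is
begin
  process(idex_is_load, idex_rd, ifid_rs, ifid_rt) is
  begin
    -- Stall for one clock when the load-delay slot reads the load's target.
    if idex_is_load = '1' and (idex_rd = ifid_rs or idex_rd = ifid_rt) then
      stall <= '1';
    else
      stall <= '0';
    end if;
  end process;
end architecture behav;
```

If a scheduler has placed an independent instruction in the load-delay slot, neither comparison matches and `stall` stays low.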

Hazards

As shown above, due to the load-store architecture of the processor, data and structural hazards can occur. These have been separately classified below and solutions have been provided so that the effect on the processor performance is minimized.

Data Hazards

Data Hazards occur when an instruction attempts to use a register whose value depends on the result of previous instructions that have not yet finished.


Figure - Example of a Data Hazard

There are two main ways of dealing with such hazards, stalling and forwarding.

Stalling

Stalling involves halting the flow of instructions until the required result is ready to be used. It is the simplest way to resolve a data hazard. However, as can be seen in Figure 5.2, stalling wastes processor time by doing nothing while waiting for the result.

Figure - Stalling the Pipeline

Forwarding

The forwarding method is best described through the use of an example. The figure below shows two instructions in the pipeline. It can be seen that the SUB instruction needs the result of the ADD instruction in the SUB's EX stage, but the ADD instruction does not write the result until the ADD's WB stage. However, it can also be seen that the result of the ADD instruction is actually computed before the SUB instruction needs it, so the result is forwarded from the EX/MEM stage back to the EX stage of the SUB instruction.


Figure - Forwarding
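The decision of when to forward can be sketched as a pair of register-number comparisons. This is an illustrative block only; `forward_unit` and its signals are hypothetical names, and only the EX/MEM-to-EX path described above is modeled.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical forwarding check for the EX/MEM -> EX path (illustrative names).
entity forward_unit is
  port(exmem_regwrite : in  std_logic;                     -- older instruction writes a register
       exmem_rd       : in  std_logic_vector(2 downto 0);  -- its destination register
       idex_rs        : in  std_logic_vector(2 downto 0);  -- sources of the instruction
       idex_rt        : in  std_logic_vector(2 downto 0);  -- now in EX
       fwd_a          : out std_logic;                     -- '1': take ALU input A from EX/MEM
       fwd_b          : out std_logic);                    -- '1': take ALU input B from EX/MEM
end entity forward_unit;

architecture dataflow of forward_unit is
begin
  fwd_a <= '1' when exmem_regwrite = '1' and exmem_rd = idex_rs else '0';
  fwd_b <= '1' when exmem_regwrite = '1' and exmem_rd = idex_rt else '0';
end architecture dataflow;
```

In the ADD/SUB example, `exmem_rd` would hold the ADD's destination while the SUB is in EX, so the matching `fwd_` output would select the forwarded ALU result instead of the stale register-file value.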


Control Hazards

A control hazard occurs whenever there is a change in the normal execution flow of the program, caused by events such as branches, interrupts, exceptions and returns from interrupts. A hazard occurs because branches, interrupts, etc. are not caught until the instruction is decoded in the second stage. By the time it is decoded, the following instruction has already entered the pipeline and, left unchecked, an unwanted instruction would remain in the pipeline. There is really only one solution to this type of hazard: to implement a hardware stall. The hardware stall simply flushes the offending instruction from the pipeline.
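Flushing can be sketched as an IF/ID pipeline register that replaces the fetched instruction with a no-operation when a flush is requested. The entity `ifid_reg` and its ports are hypothetical. The sketch also assumes that an all-zero word acts as NOP, which relies on the controller treating opcode "0000" with function bits "000" as a no-operation and ignoring the remaining bits.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical IF/ID register with a flush input (illustrative names).
entity ifid_reg is
  port(clk       : in  std_logic;
       flush     : in  std_logic;                       -- '1': discard fetched instruction
       instr_in  : in  std_logic_vector(15 downto 0);   -- word from the IF stage
       instr_out : out std_logic_vector(15 downto 0));  -- word seen by the ID stage
end entity ifid_reg;

architecture behav of ifid_reg is
begin
  process(clk) is
  begin
    if rising_edge(clk) then
      if flush = '1' then
        -- ASSUMPTION: an all-zero word decodes as NOP.
        instr_out <= (others => '0');
      else
        instr_out <= instr_in;
      end if;
    end if;
  end process;
end architecture behav;
```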

Structural Hazards

A structural hazard occurs when the hardware is unable to handle certain combinations of instructions simultaneously. For example, the IF stage under normal conditions will be accessing the memory on every clock cycle. When a load or store word instruction is used, the MEM stage tries to access the memory and, because of the single memory architecture, a conflict occurs. There are a couple of ways of dealing with such conflicts: stalling and pre-fetching.

Stalling

This method works in the same way as stalling for data hazards. Instead of accessing memory in the IF stage the load/store instruction is allowed to use memory and the processor is simply stalled until the load/store instruction is finished. The problem with this method like with data hazards is that it can take a long time if there are multiple load/store instructions in a row.

Pre-fetching

Pre-fetching involves fetching two instructions in the IF stage and storing them in a small buffer. The buffer size that I have used is 4 instructions to save on the hardware required. Because two instructions are fetched in the IF stage, when a load/store instruction is used it is allowed to access memory, the instruction for the IF stage is fetched from the buffer, and no instructions are fetched from memory. This pre-fetching method is what I am using to solve this particular hazard. However, pre-fetching works better than stalling only if the memory used is fast enough to access two words in one clock cycle.

Interrupts

Interrupts refer to events that occur external to the processor. The basic operation when an interrupt occurs is as follows:

1. The requesting device sends an interrupt request to the processor.


2. The interrupt lines are sampled at the end of the current instruction. If there is a request and that particular interrupt line is enabled, the processor moves on to step 3; otherwise normal execution resumes.

3. The processor sends out an interrupt acknowledge and disables further interrupts.

4. Finally, the processor saves the program counter and jumps to the vectored interrupt address and begins executing instructions from the interrupt routine.

When returning from the interrupt routine the processor restores the old program counter values and re-enables interrupts.

Exceptions

Exceptions differ from interrupts because they occur within the processor. The two types of exception that my processor can handle are arithmetic overflow and undefined instructions. The basic operation of exceptions is as follows:

1. An undefined instruction or arithmetic overflow causes an exception line to go high. This line causes the current instruction and any instructions that follow it to be flushed from the pipeline.

2. The address following the exception causing instruction is stored.

3. On the next positive clock edge the processor jumps to the vectored address of the exception and interrupts and exceptions are disabled.

When returning from the exception routine, interrupts and exceptions are re-enabled and the saved program counter is restored.

Priority

Exceptions have priority over interrupts. The hardware is designed so that each interrupt has its own unique priority number; in the current design these numbers range from 0 to 5, where interrupt 0 has the highest priority. This priority is implemented via the use of IF, ELSIF and ELSE statements.
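A priority scheme built from IF, ELSIF and ELSE statements can be sketched as follows. This is an illustration under stated assumptions rather than the report's interrupt controller: the entity `irq_priority` and its ports are hypothetical, and `pending` is assumed to hold only requests that are already enabled.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical priority resolver for six interrupt lines (illustrative names).
entity irq_priority is
  port(pending : in  std_logic_vector(5 downto 0);  -- bit n = interrupt n requested
       valid   : out std_logic;                     -- '1' if any interrupt is pending
       irq_num : out std_logic_vector(2 downto 0)); -- number of the winning interrupt
end entity irq_priority;

architecture behav of irq_priority is
begin
  process(pending) is
  begin
    valid <= '1';
    -- The first matching branch wins, so interrupt 0 has the highest priority.
    if pending(0) = '1' then
      irq_num <= "000";
    elsif pending(1) = '1' then
      irq_num <= "001";
    elsif pending(2) = '1' then
      irq_num <= "010";
    elsif pending(3) = '1' then
      irq_num <= "011";
    elsif pending(4) = '1' then
      irq_num <= "100";
    elsif pending(5) = '1' then
      irq_num <= "101";
    else
      valid   <= '0';
      irq_num <= "000";
    end if;
  end process;
end architecture behav;
```

Because the branches are evaluated in order, a request on line 0 masks simultaneous requests on lines 1 to 5 without any explicit comparison of priority numbers.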

Implementation of all such features has been successfully carried out in VHDL.


Conclusions


The RISC processor was implemented, and its operation was verified at the top level and at the component level using Simili 2.1 from Symphony EDA. Separate modules have been implemented for the five pipeline stages, i.e. instruction fetch, instruction decode, instruction execute, memory and input access, and memory write-back. The code for the main functional units is attached in the appendices along with a number of figures illustrating the working of different entities in the processor. The individual components were successfully tested for their functioning. Individual test-benches were generated and all the signals were monitored for their proper functionality. Some test results have been included in the appendices as well. The design functionality was also verified using Synopsys, and the possibility of implementation on a Xilinx module was explored.

The code compiled efficiently.

A number of future proposals can be suggested based on the project. They include:

Development of a 32-bit RISC processor

Development of separate hardware for memory and implementing memory management sub-routines

Hardware implementation

Development of a full cache memory


Appendix:

VHDL codes for the Processor: Not all the codes have been listed. The main entities that have been listed here are the pipeline stages, the top level CPU model, the ALU model, the shifter, controller, the hazard detection unit and buffers.

library ieee; use ieee.std_logic_1164.all; use ieee.std_logic_arith.all; use ieee.std_logic_signed.all; use ieee.std_logic_unsigned.all;

ENTITY AckReg IS
  PORT(Ack        : IN  STD_LOGIC_VECTOR(5 DOWNTO 0);
       clk        : IN  STD_LOGIC;
       AckSignals : OUT STD_LOGIC_VECTOR(5 DOWNTO 0));
END ENTITY AckReg;

ARCHITECTURE AckReg_behav OF AckReg IS
BEGIN
  name : PROCESS(clk) IS
    --This variable is used to hold the value of the acknowledgements
    --This value is defaulted to all ZERO's.
    VARIABLE regValue : STD_LOGIC_VECTOR(5 DOWNTO 0) := "000000";
  BEGIN
    IF(clk='1') THEN
      --If the clock goes high then set the acknowledgement
      --register to the value of the current acknowledgement and output
      --that result.
      regValue := Ack;
      AckSignals <= regValue;
    END IF;
  END PROCESS name;
END ARCHITECTURE AckReg_behav;

library ieee; use ieee.std_logic_1164.all; use ieee.std_logic_arith.all; use ieee.std_logic_signed.all; use ieee.std_logic_unsigned.all;

ENTITY ALU IS
  PORT(RegA, RegB, A_Immed, S_Immed : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
       mvi_Immed                    : IN  STD_LOGIC_VECTOR(7 DOWNTO 0);
       alu_sel                      : IN  STD_LOGIC;
       operation                    : IN  STD_LOGIC_VECTOR(3 DOWNTO 0);
       res_sel                      : IN  STD_LOGIC_VECTOR(1 DOWNTO 0);
       result                       : OUT STD_LOGIC_VECTOR(15 DOWNTO 0);
       overflow                     : OUT STD_LOGIC);
END ENTITY ALU;

ARCHITECTURE structural_ALU OF ALU IS
  --Declare Signals Needed
  SIGNAL reg_or_immediate  : STD_LOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL reg_or_simmediate : STD_LOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL alu_result        : STD_LOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL su_result         : STD_LOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL mvi_result        : STD_LOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL overflow_result   : STD_LOGIC;
BEGIN
  select1 : entity risc.selector
    PORT MAP(RegB, A_Immed, alu_sel, reg_or_immediate);

  select2 : entity risc.selector
    PORT MAP(RegB, S_Immed, alu_sel, reg_or_simmediate);

  result1 : entity risc.alu_16
    PORT MAP(RegA, reg_or_immediate, operation, overflow_result, alu_result);

  result2 : entity risc.shift_16
    PORT MAP(RegA, reg_or_simmediate, operation(1 downto 0), su_result);

  result3 : entity risc.mvibox
    PORT MAP(RegA(7 downto 0), mvi_Immed, operation(0), mvi_result);

  Final_Results : entity risc.alu_mux
    PORT MAP(alu_result, su_result, mvi_result, overflow_result, res_sel, result, overflow);
END ARCHITECTURE structural_alu;

library ieee; use ieee.std_logic_1164.all; use ieee.std_logic_arith.all; use ieee.std_logic_signed.all; use ieee.std_logic_unsigned.all;

ENTITY alu_16 IS
  PORT(a, b     : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
       func     : IN  STD_LOGIC_VECTOR(3 DOWNTO 0);
       overflow : OUT STD_LOGIC;
       c        : OUT STD_LOGIC_VECTOR(15 DOWNTO 0));
END ENTITY alu_16;

ARCHITECTURE alu_behav OF alu_16 IS
BEGIN
  name : PROCESS(a, b, func) IS
    VARIABLE signedResult   : SIGNED(15 DOWNTO 0);
    VARIABLE unsignedResult : UNSIGNED(16 DOWNTO 0);
    VARIABLE temp           : STD_LOGIC_VECTOR(15 DOWNTO 0);
  BEGIN
    CASE func IS
      --AND_WORD
      WHEN "0000" =>
        c <= a and b;
        overflow <= '0';
      --OR
      WHEN "0001" =>
        c <= a or b;
        overflow <= '0';
      --XOR
      WHEN "0010" =>
        c <= a xor b;
        overflow <= '0';
      --NOR
      WHEN "0011" =>
        c <= a nor b;
        overflow <= '0';
      --NOT
      WHEN "0100" =>
        c <= not a;
        overflow <= '0';
      --ADD
      WHEN "0101" =>
        signedResult := conv_signed(conv_integer(signed(a)) + conv_integer(signed(b)),16);
        temp := conv_std_logic_vector(signed(a) + signed(b), 16);
        c <= conv_std_logic_vector(signedResult,16);
        if(conv_signed(signed(temp),16) >= conv_signed(32768,16) or
           conv_signed(signed(temp),16) < conv_signed(-32768,32)) then
          overflow <= '1';
        else
          overflow <= '0';
        end if;
      --SUB
      WHEN "0110" =>
        signedResult := signed(a) - signed(b);
        c <= conv_std_logic_vector(signedResult,16);
        if(conv_integer(signedResult) >= 32768 or conv_integer(signedResult) < -32768) then
          overflow <= '1';
        else
          overflow <= '0';
        end if;
      --ADDu
      WHEN "0111" =>
        unsignedResult := unsigned(a) + unsigned(b);
        c <= conv_std_logic_vector(unsignedResult,16);
        IF(conv_integer(unsignedResult) >= 65536) then
          overflow <= '1';
        ELSE
          overflow <= '0';
        END IF;
      --SUBu
      WHEN "1000" =>
        unsignedResult := unsigned(a) - unsigned(b);
        c <= conv_std_logic_vector(unsignedResult,16);
        --if(conv_integer(unsignedResult) < 0) then
        --  overflow <= '1';
        --else
        overflow <= '0';
        -- end if;
      --SLTu
      WHEN "1001" =>
        if(conv_integer(unsigned(a)) <= conv_integer(unsigned(b))) then
          c <= "0000000000000000";
        else
          c <= "0000000000000001";
        end if;
        overflow <= '0';
      --SLT
      WHEN "1010" =>
        if(conv_integer(signed(a)) <= conv_integer(signed(b))) then
          c <= "0000000000000000";
        else
          c <= "0000000000000001";
        end if;
        overflow <= '0';
      WHEN others =>
        c <= a;
        overflow <= '0';
    END CASE;
  END PROCESS name;
END ARCHITECTURE alu_behav;

library ieee; use ieee.std_logic_1164.all; use ieee.std_logic_arith.all;

ENTITY alu_mux IS
  PORT(alu,su,mvi : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
       ovf        : IN  STD_LOGIC;
       sel        : IN  STD_LOGIC_VECTOR(1 DOWNTO 0);
       result     : OUT STD_LOGIC_VECTOR(15 DOWNTO 0);
       overflow   : OUT STD_LOGIC);
END ENTITY alu_mux;

ARCHITECTURE alu_mux_behav OF alu_mux IS
BEGIN
  name : PROCESS(alu,su,mvi,sel,ovf) IS
  BEGIN
    CASE sel IS
      WHEN "00" =>
        -- Select ALU
        result <= alu;
        overflow <= ovf;
      WHEN "01" =>
        --Select SU
        result <= su;
        overflow <= '0';
      WHEN others =>
        --Select MVIBOX
        result <= mvi;
        overflow <= '0';
    END CASE;
  END PROCESS name;
END ARCHITECTURE alu_mux_behav;

library ieee; use ieee.std_logic_arith.all; use ieee.std_logic_signed.all; use ieee.std_logic_unsigned.all; use ieee.std_logic_1164.all;

ENTITY bshifter IS
  PORT(amountToShift : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
       operation     : IN  STD_LOGIC_VECTOR(1 DOWNTO 0);
       shiftLines    : OUT STD_LOGIC_VECTOR(3 DOWNTO 0);
       shiftOp       : OUT STD_LOGIC_VECTOR(1 DOWNTO 0));
END ENTITY bshifter;

ARCHITECTURE bshifter_behav OF bshifter IS
BEGIN
  name : PROCESS(amountToShift, operation) IS
  BEGIN
    CASE conv_integer(SIGNED(amountToShift)) IS
      --When the amount to shift is negative do the following
      WHEN -15 TO -1 =>
        CASE operation IS
          --if operation was shift left then shift right by the
          --two's compliment +1
          WHEN "00" =>
            shiftOp <= "01";
            shiftLines <= SIGNED(not amountToShift) +1;
          WHEN "01" | "10" =>
            shiftOp <= "00";
            shiftLines <= SIGNED(not amountToShift) +1;
          --When rotating right simply rotate by the signed amount
          --similarly for all others.
          WHEN OTHERS =>
            shiftOp <= "11";
            shiftLines <= conv_std_logic_vector(SIGNED(amountToShift),4);
        END CASE;
      --When amount to shift is positive, things remain unchanged.
      WHEN 0 TO 15 =>
        shiftOp <= operation;
        shiftLines <= conv_std_logic_vector(UNSIGNED(amountToShift),4);
      --When amount goes beyond the bounds then do change nothing as
      --we only want to perform shifts between -16 and 15 no more.
      WHEN OTHERS =>
        shiftOp <= operation;
        shiftLines <= "0000";
    END CASE;
  END PROCESS name;
END ARCHITECTURE bshifter_behav;

library ieee; use ieee.std_logic_1164.all;

entity control is
  port(opcode       : in  std_logic_vector(3 downto 0);
       funct        : in  std_logic_vector(2 downto 0);
       zero         : in  std_logic;
       outputEnable : out std_logic;
       pcSelect     : out std_logic_vector(1 downto 0);
       buSelect     : out std_logic;
       wbstage      : out std_logic_vector(2 downto 0);
       aluSelect,ie : out std_logic;
       exstageSel   : out std_logic_vector(1 downto 0);
       aluOp        : out std_logic_vector(3 downto 0);
       regSelect    : out std_logic_vector(1 downto 0);
       ifidFlush    : out std_logic;
       memstagewe   : out std_logic_vector(1 downto 0);
       retfi,jal    : out std_logic);
end entity control;

architecture control_behav of control is begin

name : process(opcode, funct, zero) is

ECE 567 Spring 2003

University of Illinois at Chicago

begin

    case(opcode) is
      when "0000" =>
        --Standard Control Values for 0000 opcode
        jal <= '0'; aluSelect <= '0'; outputEnable <= '0';
        exstageSel <= "00"; buSelect <= '0'; pcSelect <= "00";
        regSelect <= "00"; ifidFlush <= '0'; memstagewe <= "00";
        ie <= '0'; retfi <= '0';

        --Special control values for different function bits.
        case(funct) is
          when "000" => --nop
            wbstage <= "000"; aluOp <= "1111";
          when "001" => --signed addition
            wbstage <= "011"; aluOp <= "0101";
          when "010" => --unsigned addition
            wbstage <= "011"; aluOp <= "0111";
          when "011" => --signed subtraction
            wbstage <= "011"; aluOp <= "0110";
          when "100" => --unsigned subtraction
            wbstage <= "011"; aluOp <= "1000";
          when "101" => --signed set less than
            wbstage <= "011"; aluOp <= "1010";
          when "110" => --unsigned set less than
            wbstage <= "011"; aluOp <= "1001";
          when others =>
            wbstage <= "011"; aluOp <= "0100";
        end case;

      when "0001" =>
        --Standard control lines for this opcode.
        outputEnable <= '0'; aluSelect <= '0'; jal <= '0';
        wbstage <= "011"; buSelect <= '0'; pcSelect <= "00";
        regSelect <= "00"; ifidFlush <= '0'; memstagewe <= "00";
        ie <= '0'; retfi <= '0';

        case(funct) is
          when "000" => --Logic AND
            exstageSel <= "00"; aluOp <= "0000";
          when "001" => --Logic OR
            exstageSel <= "00"; aluOp <= "0001";
          when "010" => --Logic XOR
            exstageSel <= "00"; aluOp <= "0010";
          when "011" => --Logic NOR
            exstageSel <= "00"; aluOp <= "0011";
          when "100" => --Logic Shift Left
            exstageSel <= "01"; aluOp <= "0000";
          when "101" => --Logic Shift Right
            exstageSel <= "01"; aluOp <= "0001";
          when "110" => --Arithmetic Shift Right
            exstageSel <= "01"; aluOp <= "0010";
          when others => --Rotate Right
            exstageSel <= "01"; aluOp <= "0011";
        end case;

      when "0010" =>
        --Standard Control Line Values for this opcode.
        aluSelect <= '0'; buSelect <= '0'; pcSelect <= "00";
        regSelect <= "00"; retfi <= '0'; jal <= '0';

        --Control Values Associated with the function bits
        case(funct) is
          when "000" => --Input Port
            wbstage <= "101"; outputEnable <= '0'; exstageSel <= "00";
            aluOp <= "1111"; ifidFlush <= '0'; memstagewe <= "10";
            ie <= '0';
          when "001" => --Output Port
            wbstage <= "000"; outputEnable <= '1'; aluOp <= "1111";
            exstageSel <= "00"; ifidFlush <= '0'; memstagewe <= "00";
            ie <= '0';
          when "010" => --BZ Register (branch taken when zero='1')
            buSelect <= '0'; wbstage <= "000"; outputEnable <= '0';
            aluOp <= "0000"; exstageSel <= "00"; memstagewe <= "00";
            ie <= '0';
            if(zero='1') then
              pcSelect <= "01"; ifidFlush <= '1';
            else
              pcSelect <= "00"; ifidFlush <= '0';
            end if;
          when "011" => --BNZ Register
            buSelect <= '0'; wbstage <= "000"; outputEnable <= '0';
            aluOp <= "0000"; exstageSel <= "00"; memstagewe <= "00";
            ie <= '0';
            if(zero='0') then
              pcSelect <= "01"; ifidFlush <= '1';
            else
              pcSelect <= "00"; ifidFlush <= '0';
            end if;
          when others => --All remaining funct values: ie is set high
            buSelect <= '0'; wbstage <= "000"; outputEnable <= '0';
            aluOp <= "0000"; exstageSel <= "00"; pcSelect <= "00";
            ie <= '1'; ifidFlush <= '0'; memstagewe <= "00";
        end case;

      when "0011" =>
        if(funct="000") then --this is the JAL register instruction
          wbstage <= "011"; outputEnable <= '0'; aluOp <= "0000";
          exstageSel <= "00"; aluSelect <= '0'; buSelect <= '0';
          pcSelect <= "10"; regSelect <= "00"; ifidFlush <= '1';
          memstagewe <= "00"; ie <= '0'; retfi <= '0'; jal <= '1';

        elsif(funct="001") then --this is the RET from Jal instruction
          wbstage <= "000"; outputEnable <= '0'; aluOp <= "0000";
          exstageSel <= "00"; aluSelect <= '0'; buSelect <= '0';
          pcSelect <= "10"; --new pc value = readOne
          regSelect <= "00"; ifidFlush <= '0'; memstagewe <= "00";
          ie <= '0'; retfi <= '0';

        elsif(funct="010") then --this is the RETI instruction
          aluSelect <= '0'; outputEnable <= '0'; exstageSel <= "00";
          buSelect <= '0'; regSelect <= "00"; ifidFlush <= '0';
          wbstage <= "000"; memstagewe <= "00"; ie <= '0'; jal <= '0';
          pcSelect <= "11"; retfi <= '1';
          aluOp <= "0000";

        else
          pcSelect <= "00";
        end if;

      when "0100" =>
        --MVIL and MVIH
        buSelect <= '0'; pcSelect <= "00"; ifidFlush <= '0';
        memstagewe <= "00"; regSelect <= "01"; wbstage <= "011";
        exstageSel <= "10"; outputEnable <= '0'; aluSelect <= '0';
        retfi <= '0'; ie <= '0'; jal <= '0';
        if(funct(0)='0') then --MVIL
          aluOp <= "0000";
        else --MVIH
          aluOp <= "0001";
        end if;

      when "0101" =>
        --Standard Control Lines for this opcode
        buSelect <= '1'; wbstage <= "000"; outputEnable <= '0';
        aluOp <= "0000"; exstageSel <= "00"; regSelect <= "01";
        memstagewe <= "00"; ie <= '0'; retfi <= '0'; jal <= '0';

        if(funct(0)='0') then
          --Branch Zero PC Relative
          if(zero='1') then
            pcSelect <= "01"; ifidFlush <= '1';
          else
            pcSelect <= "00"; ifidFlush <= '0';
          end if;
        else
          --Branch Not Zero PC Relative
          if(zero='0') then
            pcSelect <= "01"; ifidFlush <= '1';
          else
            pcSelect <= "00"; ifidFlush <= '0';
          end if;
        end if;

      when "0111" =>
        --SLLI and SRLI
        wbstage <= "011"; exstageSel <= "01"; outputEnable <= '0';
        aluSelect <= '1'; buSelect <= '0'; pcSelect <= "00";
        regSelect <= "00"; ifidFlush <= '0'; memstagewe <= "00";
        ie <= '0'; jal <= '0'; retfi <= '0';
        if(funct(0) = '0') then
          aluOp <= "0000";
        else
          aluOp <= "0001";
        end if;

      when "1000" =>
        --SRAI and RORI
        wbstage <= "011"; exstageSel <= "01"; outputEnable <= '0';
        aluSelect <= '1'; buSelect <= '0'; pcSelect <= "00";
        regSelect <= "00"; ifidFlush <= '0'; memstagewe <= "00";
        ie <= '0'; retfi <= '0'; jal <= '0';
        if(funct(0) = '0') then
          aluOp <= "0010";
        else
          aluOp <= "0011";
        end if;

      --ADDI
      when "1001" =>
        wbstage <= "011"; exstageSel <= "00"; outputEnable <= '0';
        aluSelect <= '1'; aluOp <= "0101"; buSelect <= '0';
        pcSelect <= "00"; regSelect <= "00"; ifidFlush <= '0';
        memstagewe <= "00"; ie <= '0'; retfi <= '0'; jal <= '0';

      --SUBI
      when "1010" =>
        wbstage <= "011"; exstageSel <= "00"; outputEnable <= '0';
        aluSelect <= '1'; aluOp <= "0110"; buSelect <= '0';
        pcSelect <= "00"; regSelect <= "00"; ifidFlush <= '0';
        memstagewe <= "00"; ie <= '0'; retfi <= '0'; jal <= '0';

      --LW
      when "1011" =>
        wbstage <= "001"; exstageSel <= "00"; outputEnable <= '0';
        aluSelect <= '1'; aluOp <= "0101"; buSelect <= '0';
        pcSelect <= "00"; regSelect <= "00"; ifidFlush <= '0';
        memstagewe <= "10"; ie <= '0'; retfi <= '0'; jal <= '0';

      --SW
      when "1100" =>
        wbstage <= "000"; exstageSel <= "00"; outputEnable <= '0';
        aluSelect <= '1'; aluOp <= "0101"; buSelect <= '0';
        pcSelect <= "00"; regSelect <= "10"; ifidFlush <= '0';
        memstagewe <= "11"; ie <= '0'; retfi <= '0'; jal <= '0';

      --Standard Control Values
      when others =>
        wbstage <= "000"; outputEnable <= '0'; aluOp <= "0000";
        exstageSel <= "00"; aluSelect <= '0'; buSelect <= '0';
        pcSelect <= "00"; regSelect <= "00"; ifidFlush <= '0';
        memstagewe <= "00"; ie <= '0'; retfi <= '0'; jal <= '0';

    end case;
  end process name;
end architecture control_behav;

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_signed.all;
use ieee.std_logic_unsigned.all;

ENTITY cpu is
  PORT(InputPort, MemoryRead            : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
       Interrupts                       : IN  STD_LOGIC_VECTOR(5 DOWNTO 0);
       CLOCK, clk, Undefined            : IN  STD_LOGIC;
       SYSCLOCK                         : OUT STD_LOGIC;
       OutputPort, FETCH1, FETCH2, READ : OUT STD_LOGIC_VECTOR(15 DOWNTO 0);
       Acknowledge                      : OUT STD_LOGIC_VECTOR(5 DOWNTO 0);
       MemStageC                        : OUT STD_LOGIC_VECTOR(1 DOWNTO 0));
END ENTITY cpu;

ARCHITECTURE structural_cpu OF cpu IS
  --Declare Signals Needed

  --******** 16-bit temporary signals required *********
  SIGNAL temp_bupc, temp_readOne, temp_intretu     : STD_LOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL temp_IF_PC, temp_ID_PC, temp_INS          : STD_LOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL temp_wbValue, temp_intoBU, temp_fin1      : STD_LOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL temp_reg2, temp_reg3, temp_exmempc        : STD_LOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL temp_INSTRUCTION, temp_simmed, temp_re    : STD_LOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL temp_readTwo, temp_idexpc, temp_fin2      : STD_LOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL temp_immed16Out, temp_res                 : STD_LOGIC_VECTOR(15 DOWNTO 0);

  --******** 2-bit temporary signals required **********
  SIGNAL temp_regSelect, temp_exSel                : STD_LOGIC_VECTOR(1 DOWNTO 0);
  SIGNAL temp_memSel, temp_pcSelect                : STD_LOGIC_VECTOR(1 DOWNTO 0);
  SIGNAL temp_idexmemwe, temp_memstage             : STD_LOGIC_VECTOR(1 DOWNTO 0);

  --******** 3-bit temporary signals required **********
  SIGNAL temp_regtin, temp_exmemwb                 : STD_LOGIC_VECTOR(2 DOWNTO 0);
  SIGNAL temp_wbc, temp_exmemcrw                   : STD_LOGIC_VECTOR(2 DOWNTO 0);
  SIGNAL temp_wbWReg, temp_wbidex, temp_regin      : STD_LOGIC_VECTOR(2 DOWNTO 0);

  --******** 8-bit temporary signals required **********
  SIGNAL temp_immed8Out                            : STD_LOGIC_VECTOR(7 DOWNTO 0);

  --******** 6-bit temporary signals required **********
  SIGNAL temp_ReadRegOut                           : STD_LOGIC_VECTOR(5 DOWNTO 0);

  --******** 1-bit temporary signals required **********
  SIGNAL temp_pcenableOut, temp_flushboy, temp_reti    : STD_LOGIC;
  SIGNAL temp_ifid_enableO, temp_rfenable, temp_zero   : STD_LOGIC;
  SIGNAL temp_buSelect, temp_overflow, temp_ie         : STD_LOGIC;
  SIGNAL temp_exmem, temp_idexflusher, temp_idexflushO : STD_LOGIC;
  SIGNAL temp_dflush, temp_mflush, temp_pbpcenable     : STD_LOGIC;
  SIGNAL temp_aluSel, temp_jalcontrol, temp_iflush     : STD_LOGIC;
  SIGNAL temp_id_flush, temp_pcenableO, temp_pcanded   : STD_LOGIC;

  --******** 4-bit temporary signals required **********
  SIGNAL temp_OPC, temp_aluFUNC                    : STD_LOGIC_VECTOR(3 DOWNTO 0);

BEGIN

  stage1 : entity risc.stage1
    PORT MAP(temp_bupc, temp_readOne, temp_intretu, temp_INS,
             temp_pcenableOut, temp_flushboy, temp_ifid_enableO, clk,
             temp_OPC, temp_ID_PC, temp_INSTRUCTION, temp_IF_PC);

  stage2 : entity risc.stage2
    PORT MAP(temp_IF_PC, temp_INSTRUCTION, temp_wbValue, temp_intoBU,
             temp_regSelect, temp_wbWReg, temp_rfenable, temp_buSelect, clk,
             temp_bupc, temp_idexpc, temp_fin1, temp_fin2, temp_immed16Out,
             temp_simmed, temp_readOne, temp_readTwo, temp_immed8Out,
             temp_wbidex, temp_regin, temp_regtin, temp_ReadRegOut);

  stage3 : entity risc.stage3
    PORT MAP(temp_reg2, temp_reg3, temp_immed16Out, temp_simmed,
             temp_idexpc, temp_immed8Out, temp_aluSel, temp_jalcontrol, clk,
             temp_aluFUNC, temp_exSel, temp_wbidex, temp_exmempc, temp_re,
             temp_res, temp_exmemwb, temp_overflow);

  stage45 : entity risc.stage45
    PORT MAP(InputPort, MemoryRead, temp_res, temp_exmemwb, temp_memSel,
             clk, temp_exmem, OutputPort, temp_wbValue, temp_wbWReg);

  cpucontrol : entity risc.cpucont
    PORT MAP(temp_INSTRUCTION(15 downto 12), temp_INSTRUCTION(2 downto 0),
             temp_zero, clk, temp_idexflusher, temp_pcSelect, temp_regSelect,
             temp_exSel, temp_idexmemwe, temp_memstage, temp_memSel,
             temp_buSelect, temp_ie, temp_reti, temp_id_flush, temp_aluSel,
             temp_jalcontrol, temp_exmem, temp_rfenable, temp_aluFUNC,
             temp_wbc, temp_exmemcrw);

  hazard : entity risc.hazard
    PORT MAP(temp_INSTRUCTION(15 downto 12), temp_INSTRUCTION(2 downto 0),
             temp_regin, temp_regtin, temp_wbidex, temp_idexmemwe(1),
             temp_wbc(0), temp_exmemwb, temp_memstage(1), temp_pcenableO,
             temp_ifid_enableO, temp_idexflushO);

  bforward : entity risc.branchforw
    PORT MAP(temp_readOne, temp_readTwo, temp_res, temp_wbValue,
             temp_exmemcrw(0), temp_rfenable, temp_exmemwb, temp_wbWReg,
             temp_regin, temp_INSTRUCTION(5 downto 3), temp_zero,
             temp_intoBU);

  eforward : entity risc.forwardingu
    PORT MAP(temp_fin1, temp_fin2, temp_res, temp_wbValue,
             temp_exmemcrw(0), temp_rfenable, temp_exmemwb, temp_wbWReg,
             temp_ReadRegOut(5 downto 3), temp_ReadRegOut(2 downto 0),
             temp_reg2, temp_reg3);

  interrupt : entity risc.intexcu
    PORT MAP(temp_IF_PC, temp_ID_PC, temp_exmempc,
             temp_INSTRUCTION(8 downto 3), Interrupts, temp_pcSelect,
             temp_overflow, Undefined, temp_pcanded, clk, temp_reti, temp_ie,
             temp_iflush, temp_dflush, temp_mflush, temp_pcenableOut,
             temp_OPC, temp_intretu, Acknowledge);

  --**************************************************************************
  --**** Prefetch Buffer
  --**** This PORT MAP shows the connections between the processor and the
  --**** prefetch buffer.
  --**************************************************************************
  prefetcher : entity risc.pbuffer_16
    PORT MAP(temp_IF_PC, MemoryRead, CLOCK, temp_memstage(1), temp_INS,
             FETCH1, temp_pbpcenable, SYSCLOCK);

  pcenableAND : entity risc.andgate
    PORT MAP(temp_pcenableO, temp_pbpcenable, temp_pcanded);

  idexflushOR : entity risc.orgate
    PORT MAP(temp_idexflushO, temp_dflush, temp_idexflusher);

  ifidflushOR : entity risc.orgate
    PORT MAP(temp_id_flush, temp_iflush, temp_flushboy);

  select13 : entity risc.through2 PORT MAP(temp_memstage, MemStageC);

  select15 : entity risc.through PORT MAP(temp_res, FETCH2);

  select16 : entity risc.through PORT MAP(temp_re, READ);

END ARCHITECTURE structural_cpu;

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_signed.all;
use ieee.std_logic_unsigned.all;

ENTITY exceptCU IS
  PORT(Interrupt, enInterrupt              : IN  STD_LOGIC_VECTOR(5 DOWNTO 0);
       pcSel                               : IN  STD_LOGIC_VECTOR(1 DOWNTO 0);
       overflow, undfIns, enabled          : IN  STD_LOGIC;
       pcenableIn                          : IN  STD_LOGIC;
       IntAck                              : OUT STD_LOGIC_VECTOR(5 DOWNTO 0);
       ifid_flush, idex_flush, exmem_flush : OUT STD_LOGIC;
       pcSelect                            : OUT STD_LOGIC_VECTOR(3 DOWNTO 0);
       disableInt                          : OUT STD_LOGIC;
       trapPC                              : OUT STD_LOGIC_VECTOR(1 DOWNTO 0);
       pcenableOut                         : OUT STD_LOGIC);
END ENTITY exceptCU;

ARCHITECTURE except_behav OF exceptCU IS
BEGIN

  name : PROCESS(Interrupt, enInterrupt, enabled, overflow, undfIns, pcSel, pcenableIn) IS
  BEGIN

    IF(enabled='1') THEN
      IF(overflow='1') THEN --this means the overflow line is set high
        pcSelect <= "0100"; IntAck <= "000000";
        ifid_flush <= '1'; exmem_flush <= '1'; idex_flush <= '1';
        disableInt <= '1';
        trapPC <= "01"; -- save the PC address in ex stage so that we
                        -- can jump back to it.
        pcenableOut <= '1';

      ELSIF(undfIns='1') THEN --this means the undefined instruction
                              --line is high
        pcSelect <= "0101"; IntAck <= "000000";
        ifid_flush <= '1'; exmem_flush <= '0'; idex_flush <= '0';
        disableInt <= '1';
        trapPC <= "10"; -- save the PC address in the id stage so we can
                        -- jump back to it
        pcenableOut <= '1';

      -- if interrupt0 is high and it is also enabled do
      ELSIF(Interrupt(0)='1' and enInterrupt(0)='1') THEN
        pcSelect <= "0110"; IntAck <= "000001";
        ifid_flush <= '1'; exmem_flush <= '0'; idex_flush <= '1';
        disableInt <= '1'; trapPC <= "11"; pcenableOut <= '1';

      -- if interrupt1 is high and it is also enabled do
      ELSIF(Interrupt(1)='1' and enInterrupt(1)='1') THEN
        pcSelect <= "0111"; IntAck <= "000010";
        ifid_flush <= '1'; exmem_flush <= '0'; idex_flush <= '1';
        disableInt <= '1'; trapPC <= "11"; pcenableOut <= '1';

      -- if interrupt2 is high and it is also enabled do
      ELSIF(Interrupt(2)='1' and enInterrupt(2)='1') THEN
        pcSelect <= "1000"; IntAck <= "000100";
        ifid_flush <= '1'; exmem_flush <= '0'; idex_flush <= '1';
        disableInt <= '1'; trapPC <= "11"; pcenableOut <= '1';

      -- if interrupt3 is high and it is also enabled do
      ELSIF(Interrupt(3)='1' and enInterrupt(3)='1') THEN
        pcSelect <= "1001"; IntAck <= "001000";
        ifid_flush <= '1'; exmem_flush <= '0'; idex_flush <= '1';
        disableInt <= '1'; trapPC <= "11"; pcenableOut <= '1';

      -- if interrupt4 is high and it is also enabled do
      ELSIF(Interrupt(4)='1' and enInterrupt(4)='1') THEN
        pcSelect <= "1010"; IntAck <= "010000";
        ifid_flush <= '1'; exmem_flush <= '0'; idex_flush <= '1';
        disableInt <= '1'; trapPC <= "11"; pcenableOut <= '1';

      -- if interrupt5 is high and it is also enabled do
      ELSIF(Interrupt(5)='1' and enInterrupt(5)='1') THEN
        pcSelect <= "1011"; IntAck <= "100000";
        ifid_flush <= '1'; exmem_flush <= '0'; idex_flush <= '1';
        disableInt <= '1'; trapPC <= "11"; pcenableOut <= '1';

      -- no interrupts or exceptions have occurred; do nothing.
      ELSE
        pcSelect <= "00"&pcSel; IntAck <= "000000";
        ifid_flush <= '0'; exmem_flush <= '0'; idex_flush <= '0';
        disableInt <= '0'; trapPC <= "00";
        pcenableOut <= pcenableIn;
      END IF;

    -- Interrupts and Exceptions are not enabled so do nothing.
    ELSE
      pcSelect <= "00"&pcSel; IntAck <= "000000";
      ifid_flush <= '0'; exmem_flush <= '0'; idex_flush <= '0';
      disableInt <= '1';
      trapPC <= "00"; -- no need to TRAP the PC
      pcenableOut <= pcenableIn;
    END IF;

  END PROCESS name;
END ARCHITECTURE except_behav;
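The ELSIF chain above gives the interrupt lines a fixed priority: interrupt 0 is served before interrupt 1, and so on, and only a request whose enable bit is set is acknowledged. A minimal, self-checking sketch of that selection rule follows. It is an illustration only, not the report's implementation; the names intack_sketch and first_pending are hypothetical, and the overflow and undefined-instruction exceptions are left out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical simulation-only sketch; not one of the report's entities.
entity intack_sketch is
end entity intack_sketch;

architecture sim of intack_sketch is
  -- One-hot acknowledge for the lowest-numbered request that is also
  -- enabled; all zeros when no enabled request is pending.
  function first_pending(req, en : std_logic_vector(5 downto 0))
    return std_logic_vector is
    variable ack : std_logic_vector(5 downto 0) := (others => '0');
  begin
    for i in 0 to 5 loop
      if req(i) = '1' and en(i) = '1' then
        ack(i) := '1';
        return ack;
      end if;
    end loop;
    return ack;
  end function first_pending;
begin
  check : process is
  begin
    -- Requests 1 and 2 pending, all enabled: interrupt 1 wins.
    assert first_pending("000110", "111111") = "000010"
      report "lowest-numbered request" severity failure;
    -- Interrupt 1 masked: interrupt 2 is acknowledged instead.
    assert first_pending("000110", "111101") = "000100"
      report "masked request" severity failure;
    -- Nothing pending: no acknowledge.
    assert first_pending("000000", "111111") = "000000"
      report "idle" severity failure;
    wait;
  end process check;
end architecture sim;
```

The acknowledge patterns match the IntAck values assigned in the listing ("000010" for interrupt 1, "000100" for interrupt 2).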

library ieee;
use ieee.std_logic_1164.all;

entity hazard is
  port(opcode                            : in  std_logic_vector(3 downto 0);
       func, readone, readtwo, idexwrite : in  std_logic_vector(2 downto 0);
       idexMemWE, idexwe                 : in  std_logic;
       exmemwrite                        : in  std_logic_vector(2 downto 0);
       memstage                          : in  std_logic;
       pcenable, ifidenable, idexflush   : out std_logic);
end entity hazard;

architecture hazard_behav of hazard is
begin

  name : process(opcode, func, readone, readtwo, idexwrite, idexMemWE,
                 idexwe, memstage, exmemwrite) is
  begin

    --IF we are going to use something in the MEM/IO stage look here
    if(idexMemWE='1') then
      if(idexwrite=readone or idexwrite=readtwo) then
        pcenable <= '0'; ifidenable <= '0'; idexflush <= '1';
      else
        pcenable <= '1'; ifidenable <= '1'; idexflush <= '0';
      end if;

    --If we currently have a BRANCH INSTRUCTION and we are writing to the
    --read registers of the branch instruction in either the mem/io stage
    --or the ex stage then we must stall.
    else
      --opcode 0010 with func 010/011 are the register branches and
      --opcode 0101 is the PC-relative branch (see entity control)
      if((opcode="0010" and func="010") or
         (opcode="0010" and func="011") or
         opcode="0101") then

        --are we writing in the EX Stage?
        if(idexwe='1') then
          if(idexwrite = readone or idexwrite = readtwo) then
            pcenable <= '0'; ifidenable <= '0'; idexflush <= '1';
          else
            pcenable <= '1'; ifidenable <= '1'; idexflush <= '0';
          end if;

        --Else are we writing in the MEM stage?
        elsif(memstage='1') then
          if(exmemwrite = readone or exmemwrite = readtwo) then
            pcenable <= '0'; ifidenable <= '0'; idexflush <= '1';
          else
            pcenable <= '1'; ifidenable <= '1'; idexflush <= '0';
          end if;

        --If neither then do not worry.
        else
          pcenable <= '1'; ifidenable <= '1'; idexflush <= '0';
        end if;

      else
        pcenable <= '1'; ifidenable <= '1'; idexflush <= '0';
      end if;

    end if;

  end process name;
end architecture hazard_behav;
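The first test in the hazard unit above is the classic stall on a memory/IO result: when the instruction in the EX stage will only produce its value in the MEM/IO stage and the instruction being decoded reads that register, the PC and the IF/ID register are frozen and ID/EX is flushed for one cycle. A minimal, self-checking sketch of that condition follows. It is an illustration under assumptions, not the report's implementation: the names loaduse_sketch, must_stall, mem_result_in_ex, ex_dest, src_one and src_two are hypothetical stand-ins for idexMemWE, idexwrite, readone and readtwo.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical simulation-only sketch; not one of the report's entities.
entity loaduse_sketch is
end entity loaduse_sketch;

architecture sim of loaduse_sketch is
  -- '1' when the instruction in ID must wait one cycle because the
  -- instruction in EX gets its result in MEM/IO and writes a register
  -- that the ID instruction reads.
  function must_stall(mem_result_in_ex          : std_logic;
                      ex_dest, src_one, src_two : std_logic_vector(2 downto 0))
    return std_logic is
  begin
    if mem_result_in_ex = '1' and
       (ex_dest = src_one or ex_dest = src_two) then
      return '1';
    end if;
    return '0';
  end function must_stall;
begin
  check : process is
  begin
    -- Load into r3 followed by a read of r3: stall.
    assert must_stall('1', "011", "011", "000") = '1'
      report "dependent read" severity failure;
    -- Load into r3, but the next instruction reads r1 and r2: no stall.
    assert must_stall('1', "011", "001", "010") = '0'
      report "independent read" severity failure;
    -- No memory/IO result pending in EX: no stall even if registers match.
    assert must_stall('0', "011", "011", "011") = '0'
      report "no pending load" severity failure;
    wait;
  end process check;
end architecture sim;
```

In the listing the stall is expressed through three outputs (pcenable and ifidenable low, idexflush high); the sketch collapses them into a single flag to keep the condition itself visible.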

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_signed.all;
use ieee.std_logic_unsigned.all;

--******************************************************************************
--** Description: This entity structurally describes the BUFFER.
--******************************************************************************
entity pbuffer is
  port(ReadAddress, FetchAddress, Data : in  std_logic_vector(15 downto 0);
       pbwe, clk                       : in  std_logic;
       lastFetchedAddress, instruction : out std_logic_vector(15 downto 0);
       miss                            : out std_logic);
end entity pbuffer;

architecture structural_pbuffer of pbuffer is
  --Declare Signals Needed
  signal tag      : std_logic_vector(13 downto 0);
  signal validbit : std_logic;

begin

  select1 : entity risc.pref --This selects either the Register or Immediate Option
    port map(FetchAddress, Data, ReadAddress(1 downto 0), pbwe, clk,
             instruction, lastFetchedAddress, tag, validbit);

  select2 : entity risc.pref_c --This selects either the Register or Shift Immediate Option
    port map(ReadAddress(15 downto 2), tag, validbit, miss);

end architecture structural_pbuffer;

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_signed.all;
use ieee.std_logic_unsigned.all;

--******************************************************************************
--** Description: This entity structurally describes the Instruction Fetch stage
--** of the pipeline.
--******************************************************************************
ENTITY stage1 IS
  PORT(BU_PC, ReadOne, Intret, Instruction : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
       PC_enable, IFID_flush, IFID_enable  : IN  STD_LOGIC;
       clk                                 : IN  STD_LOGIC;
       OPC                                 : IN  STD_LOGIC_VECTOR(3 DOWNTO 0);
       PC_INCREMENT, InstructionOut        : OUT STD_LOGIC_VECTOR(15 DOWNTO 0);
       pcValue                             : OUT STD_LOGIC_VECTOR(15 DOWNTO 0)
      );
END ENTITY stage1;

ARCHITECTURE structural_stage1 OF stage1 IS

  --Declare Signals Needed

  --********** 16-bit temporary signals needed **********
  SIGNAL program_counter, inc_pc : STD_LOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL temp_newpc              : STD_LOGIC_VECTOR(15 DOWNTO 0);

  --********** 1-bit temporary signals needed ***********
  SIGNAL pbenable_bit            : STD_LOGIC;
  SIGNAL miss_bit                : STD_LOGIC;

BEGIN

  pcselect : entity risc.pcselector
    PORT MAP(inc_pc, BU_PC, ReadOne, Intret, OPC, temp_newpc);

  programcounter : entity risc.progc
    PORT MAP(temp_newpc, clk, PC_enable, program_counter);

  pluspc : entity risc.incpc
    PORT MAP(program_counter, inc_pc);

  ifidpipe : entity risc.IFID
    PORT MAP(inc_pc, Instruction, clk, IFID_flush, IFID_enable,
             PC_INCREMENT, InstructionOut);

  select5 : entity risc.through
    PORT MAP(program_counter, pcValue);

END ARCHITECTURE structural_stage1;

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_signed.all;
use ieee.std_logic_unsigned.all;

--******************************************************************************
--** Description: This entity structurally describes the Instruction Decode stage
--** of the pipeline.
--******************************************************************************
ENTITY stage2 IS
  PORT(PC, INSTRUCTION, RF_WriteData         : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
       BU_Register                           : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
       RegSelect                             : IN  STD_LOGIC_VECTOR(1 DOWNTO 0);
       RF_WriteReg                           : IN  STD_LOGIC_VECTOR(2 DOWNTO 0);
       RF_Enable, Branch_Select, clk         : IN  STD_LOGIC;
       BU_PC, IDEX_PC, RegOneOut, RegTwoOut  : OUT STD_LOGIC_VECTOR(15 DOWNTO 0);
       R_Immediate, S_Immediate, ReadOne     : OUT STD_LOGIC_VECTOR(15 DOWNTO 0);
       ReadTwo                               : OUT STD_LOGIC_VECTOR(15 DOWNTO 0);
       M_Immediate                           : OUT STD_LOGIC_VECTOR(7 DOWNTO 0);
       WriteBack, readRegOne, readRegTwo     : OUT STD_LOGIC_VECTOR(2 DOWNTO 0);
       ReadRegisters                         : OUT STD_LOGIC_VECTOR(5 DOWNTO 0));
END ENTITY stage2;

ARCHITECTURE structural_stage2 OF stage2 IS
  --Declare Signals Needed

  --******** 3-bit temporary signals needed ********
  signal regOneIn, regTwoIn               : STD_LOGIC_VECTOR(2 DOWNTO 0);

  --******** 16-bit temporary signals needed ********
  signal readOnea, readTwoa, RI_Immediate : STD_LOGIC_VECTOR(15 DOWNTO 0);
  signal I_Immediate, SI_Immediate        : STD_LOGIC_VECTOR(15 DOWNTO 0);

BEGIN

  mux1 : entity risc.bus_mux_3
    PORT MAP(INSTRUCTION(8 downto 6), INSTRUCTION(11 downto 9),
             RegSelect(0), regOneIn);

  mux2 : entity risc.bus_mux_3
    PORT MAP(INSTRUCTION(5 downto 3), INSTRUCTION(11 downto 9),
             RegSelect(1), regTwoIn);

  registerfile : entity risc.regfile
    PORT MAP(regOneIn, regTwoIn, RF_WriteReg, RF_WriteData, RF_Enable, clk,
             readOnea, readTwoa);

  signextender : entity risc.signext
    PORT MAP(INSTRUCTION(5 downto 0), INSTRUCTION(8 downto 1),
             RI_Immediate, I_Immediate, SI_Immediate);

  branchunit : entity risc.branch
    PORT MAP(PC, I_Immediate, BU_Register, Branch_Select, BU_PC);

  idexpipeline : entity risc.idex
    PORT MAP(PC, readOnea, readTwoa, RI_Immediate, SI_Immediate,
             INSTRUCTION(8 downto 1), clk, INSTRUCTION(11 downto 9),
             regOneIn, regTwoIn, IDEX_PC, RegOneOut, RegTwoOut,
             R_Immediate, S_Immediate, M_Immediate, WriteBack,
             ReadRegisters);

  select7 : entity risc.through PORT MAP(readOnea, readOne);

  select8 : entity risc.through PORT MAP(readTwoa, readTwo);

  select9 : entity risc.through3 PORT MAP(regOneIn, readRegOne);

  select10 : entity risc.through3 PORT MAP(regTwoIn, readRegTwo);

END ARCHITECTURE structural_stage2;

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_signed.all;
use ieee.std_logic_unsigned.all;

--******************************************************************************
--** Description: This entity structurally describes the Execution Stage of the
--** pipeline.
--******************************************************************************
ENTITY stage3 IS
  PORT(RegA, RegB, R_immediate             : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
       idex_pc, S_Immediate                : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
       M_Immediate                         : IN  STD_LOGIC_VECTOR(7 DOWNTO 0);
       alu_sel, jal_control, clk           : IN  STD_LOGIC;
       alu_function                        : IN  STD_LOGIC_VECTOR(3 DOWNTO 0);
       ex_select                           : IN  STD_LOGIC_VECTOR(1 DOWNTO 0);
       idexwb                              : IN  STD_LOGIC_VECTOR(2 DOWNTO 0);
       exmem_pc, exmem_read, exmem_result  : OUT STD_LOGIC_VECTOR(15 DOWNTO 0);
       exmem_wb                            : OUT STD_LOGIC_VECTOR(2 DOWNTO 0);
       overflow                            : OUT STD_LOGIC);
END ENTITY stage3;

ARCHITECTURE structural_stage3 OF stage3 IS
  --Declare Signals Needed

  --******** 3-bit temporary signals needed ********
  SIGNAL regOneIn, regTwoIn      : STD_LOGIC_VECTOR(2 DOWNTO 0);

  --******** 16-bit temporary signals needed *******
  SIGNAL alu_result, temp_result : STD_LOGIC_VECTOR(15 DOWNTO 0);

BEGIN

  alu : entity risc.alu
    PORT MAP(regA, regB, R_Immediate, S_Immediate, M_Immediate, alu_sel,
             alu_function, ex_select, alu_result, overflow);

  mux1 : entity risc.bus_mux_16
    PORT MAP(alu_result, idex_pc, jal_control, temp_result);

  exmempipeline : entity risc.exmem
    PORT MAP(idex_pc, temp_result, regB, clk, idexwb, exmem_pc,
             exmem_result, exmem_read, exmem_wb);

END ARCHITECTURE structural_stage3;


Entity: shift1 simulation result: This entity shows a few cycles of the memory fetch stage. The increment in the memory address can be observed.


Figure: The figure above shows the execution of several instructions simultaneously on the processor core.


Figure: The figure above shows the execution of several instructions simultaneously on the processor core.


Figure: The figure above shows the timing signals for the entity cpu. The signals are all inputs or outputs to the system.