A Thesis Presented to
The Faculty of the College of Engineering and Technology
Ohio University
In Partial Fulfillment
of the Requirements for the Degree
Master of Science
by
Vijay K. Reddy Anam,
June, 1990
TABLE OF CONTENTS
CHAPTER I
Introduction
CHAPTER II
Design of Look-Ahead Pipelined
Computer System
2.1
Introduction
2.2
2.3
2.4
Hardware System
CHAPTER III
Design of Dynamic Pipelined
Arithmetic Unit
3.1
Introduction
3.2
Principle of Operation of
the CSA Tree
3.3
Conversion of Unifunction
Pipeline to Multifunction Pipeline
3.4
CHAPTER IV
Instruction Execution in the Pipeline
System
CHAPTER V
Computer Simulation and Experimental
Results
5.1
5.2
5.3
5.4
5.5
Experimental Results
CHAPTER VI
Conclusions and Discussions
REFERENCES
APPENDIX
A.
State Matrices
B.
CHAPTER ONE
INTRODUCTION
following
Fig 1.1 (axes: pipeline cycles versus instructions 1 to 7)
for
CRAY
storage.
This
units as provided in the TI-ASC computer [13] and
reconfiguring the units as needed. The general approach is
to provide a static functional unit for each class of
operations. A static functional unit can execute an instruction
only when the operation defined by the instruction falls
within the class for which the unit was designed. The
Astronautics ZS-1 [14] uses a decoupled architecture
and supports two instruction streams. This machine is
capable of forwarding two instructions to the execution
units within a clock period. Dependent instructions are
held at the issue stage until the dependency is resolved.
The two streams are unequal in length and are supported by
multiple static execution units. Data can be copied between
the two units via a copy unit. Queues are used for memory
operands, providing a flexible way of relating the memory
access functions and the floating point operations. This
allows memory access functions to be allocated dynamically
ahead of the floating point operations. There is no
reordering of instructions within a pipeline.
In this research, a system is developed which executes
instructions dynamically. The hardware is a pipelined system
consisting of two fundamental sub-systems: the pipelined
instruction unit (PIU) and the pipelined execution unit
(PEU). The PIU can be further divided into the fetch unit
(FU), the decode unit (DU), and the issue unit (IU). The PEU
Fig. 1.2 Proposed pipeline system shown with the sub units
. The overall
system configuration is
three
CHAPTER TWO
2.1
INTRODUCTION:
Fig. 2.1 (panels: pipeline cycles # 0, # 1, and # 2; stages: fetch unit, decode unit, and issue unit behind latches 1 to 3; instructions 1 to 3 advance one stage per cycle; horizontal axis: time)
Fig. 2.2 (blocks: fetch unit, decode unit, issue unit, logic unit, fixed point arithmetic unit, floating point arithmetic unit)
Instruction        Instruction Type
Add / Subtract     Arithmetic
Multiplication     Arithmetic
Division           Arithmetic
Store / Load       Logic
And / Or / Not     Logic
Table
Fig.
2.3
Execution
Time
2.1
15
hardware system.
2.2
.....
load
.....
r3, (A);
.
..
r2, (B) ;
I
add
store
load
rl;
.....
this case the resource (X) will be loaded with the result
of the third load instruction before the store instruction
can access r1. In simpler terms, the third load
instruction will reinitialize r1 soon after the add
instruction has initialized it. These events would take
place before the store instruction accesses r1. The hazard is
illustrated in Fig. 2.5. A WAW hazard occurs when the third
load instruction updates r1 before the add instruction. This
is shown in Fig. 2.6. The operational hazard takes place if
more than one instruction attempts to use the facilities of
r1, (X);
r2, (Y);
r3, r1, r2;
(Z), r3;
r3, r1, r2;
(U), r3
r4, (B);
r5, (D);
r3, r4, r5;
(V), r3
r1 <-- (X)
r2 <-- (Y)
r3 <-- r1 + r2
(Z) <-- r3
r3 <-- r1 + r2
(C) <-- r3
r5 <-- (B)
r5 <-- (D)
r3 <-- r4 * r5
(C) <-- r3
The domain D(I) of an
$R(I) \cap D(J) \neq \emptyset$    for RAW    (2.1)
$R(I) \cap R(J) \neq \emptyset$    for WAW    (2.2)
$D(I) \cap R(J) \neq \emptyset$    for WAR    (2.3)
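The three conditions can be checked mechanically with set intersections. The following is a minimal sketch, assuming that the domain of an instruction is the set of resources it reads and the range is the set it writes; the class name `Instr`, its field names, and the function `hazards` are illustrative and not part of the original design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Illustrative instruction record (names are assumptions)."""
    text: str
    reads: frozenset   # domain D: resources the instruction reads
    writes: frozenset  # range R: resources the instruction writes


def hazards(first, second):
    """Return the hazard kinds that arise when `second` follows `first`."""
    found = set()
    if first.writes & second.reads:
        found.add("RAW")   # second reads what first writes
    if first.writes & second.writes:
        found.add("WAW")   # both write the same resource
    if first.reads & second.writes:
        found.add("WAR")   # second overwrites what first still reads
    return found


add = Instr("add r3, r1, r2", frozenset({"r1", "r2"}), frozenset({"r3"}))
store = Instr("store (Z), r3", frozenset({"r3"}), frozenset({"Z"}))
load = Instr("load r1, (X)", frozenset({"X"}), frozenset({"r1"}))

print(hazards(add, store))  # the store reads r3, which the add writes
print(hazards(add, load))   # the load rewrites r1, which the add reads
```

An empty result means the two instructions can be issued without regard to each other as far as data dependencies are concerned.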
ADD instruction is issued for execution while the previous two instructions
are in execution.
Fig. 2.4 Occurrence of RAW hazard.
STORE instruction is issued after the LOAD instruction has completed execution.
Fig. 2.5
Fig. 2.8 (timing diagrams of the F, D, I, and E stages for sequences of load, mult, and store instructions; annotation: RAW hazard between the store and the multiplication instruction)
Fig. 2.10 (timing diagram over pipeline cycles; instructions shown: load r1, (X); mult; store (Z), r3; store (U), r3; load r5, (D); mult; store (V), r3)
Instruction 4 depends on
Fig. 2.11 (legend: issue cycle, start of execution, end of execution)
Fig. 2.12 (timing diagram over pipeline cycles 9 to 21; legend: issue cycle, start of execution, end of execution)
The instructions between the two horizontal lines are dependent on the first
multiplication instruction. These instructions will have to be scheduled depending
on the availability of the result of the multiplication instruction.
Fig. 2.13
4 = 13 pipeline cycles.
IS = { I1, I2, I3, I4, ..., In }, where
the
= { r1, r2, r3, ... }
= { c1, c2, c3, ..., cn }
Fig. 2.14 (counter pointers C1, C2, and C3 shown over pipeline cycles for a sequence including load r1, (X); mult r3, r1, r2; store (Z), r3)
by
2)
$I_k = (OC, r_a, r_b, r_c, c_a, c_b, c_c)$, where OC is the op-code of the
T,
(2.4)
T,
load
load
8.
9.
mult
store
1.
..
. .. .
. . . . . ..
(X)
(Y)
..........
1 , 2
r3 <-- rl + r2
(Z), r3; (Z) <-- r3
Tadditional delay =
(2-5)
0.
R(J)
is not equal to
The TSink-
is( less
~ ~ ~than
)
TteSt.
counter element c , ~ ~ ~
CASE A:
Csink(old)
in
Csink(old)
Csink(old)
and
'test
Let
Tinst-delay
Csink(old)
- (Te +
(2 6 ,
1)
(2-7)
Tsink-delay
~ can
~ be
~ set
( according
~ ~ ~ )to the
following equation :
Csink(new)
= Te
= Ts
(Tinst-delay
Ts
Csink(old)
Csink(old)
Csink(old)
(2.8)
( T e + 1)-1 (2.9)
(2.10)
(2.11)
Tinst-delay
Tsink-delay =
if
Ttest
'
Csink(old)
= 0
Tsink-delay =
if
'test
Tsink-delay =
if
Ttest
Csink(old)
= 1
0
Csink(old)
> 2
- T + Tinst-delay
= T + 2
- T + TinSt-delay
= T + 1
'
Tinst-delay =
.
.
3.
exist simultaneously. The Tsrc-delay is necessary, as the
execution of an instruction will have to be delayed until
the RAW hazards are resolved. The total test time is now
equal to:
case 2:
$T_{\text{src-delay}} = C_{\text{source-reg}} + 1$
case 3:
$T_{\text{src-delay}} = \max(C_{\text{source-reg1}},\, C_{\text{source-reg2}}) + 1$    (2.20)
Tinst-delay
Csink(new)
Csink(old)
if
<
Ttest
=s '
'test
Ttest
Tinst-delay
Csink(old)
'
(2.21)
Csink(old)
(Te + 1)
Csink(old)
(2.22)
(2.25)
Csink(old)
= 0
+ 2
(2.26)
Tsrc-delay
(2.27)
Csink(old)
(2.28)
> 2
'src-delay
~ calculated
~ ( ~ ~ as
~ follows:
)
The new value of c ~ ~ is
%ink(new)
(2.24)
Csink(old)=
Tinst-delay
if
Ts
- Tsrc-delay
(Tinst-delay
Tinst-delay
if
'test!
=e '
If
Csink(old)
(Tinst-delay
'src-delay
(Tinst-delay
'
Tsrc-delay
'
Csink(new)
('inst-delay
(Tsrc-delay
and
=
are as follows:
Csink(nex)
T
=
Csink(new)
The values of c
= O
Csink(new)
are
shown below:
=
'test
(2.32)
'inst-delay
if
if
'
Csink(old)
'inst-delay
Tinst-delay
The
2 - (Te - 1)
(2.33)
'test
%ink(old)
Csink(new)
if
'test
Csink(old)
'
'
('inst-delay
> 0.
csource-regl
Csource- reg2 = 0 .
'inst-delay
('src-delay
- Csink(old)
(2.35)
2
(Te - 1)
(2.36)
if
>
Csink(old)
'test
is
Tinst-delay
(Tsrc-de[ay
two)
if
Ttest
Csink(new)
if
<
Csink(old)'
Tinst-delay
(Tinst-delay
(2.37)
> 0.
Fig. 2.16 (panels for pipeline cycles # 3 to # 6, each listing the instruction, RAW hazard delay, WAW hazard delay, instruction delay, and counter values; instructions shown: mult R3, R1, R2 in cycle # 5 and store (Z), R3 in cycle # 6)
Fig. 2.16 (panels for pipeline cycles # 7 to # 10 with counters C1 to C5; instructions shown: add R3, R1, R2 in cycle # 7; load R4, (B) in cycle # 9; load R5, (D) in cycle # 10)
Fig. 2.16 (panels for pipeline cycles # 11 and # 12, showing initial counter values, RAW hazard delay, WAW hazard delay, and instruction delay; instruction shown: add R3, R4, R5 in cycle # 11)
I
I
Fig. 2.17 (instruction word fields: Code, Exec Time, R-Field with R1 to R5, C-Field with C1 to C5)
Fig. 2.18 (blocks: fetch unit, decode unit, instruction status unit, issue unit, floating point unit, logic unit, fixed point arithmetic unit)
The
Fig. 2.19 (blocks: fetch unit, decode unit, instruction status unit, issue unit, buffer units, logic unit, fixed point arithmetic unit, floating point unit)
Fig. 2.20
Fig. 2.21
Fig. 2.22 (counter pointers assigned over pipeline cycles: pointer c1 for load r1, (X); pointer c2 for load r2, (Y); pointer c3 for mult r3, r1, r2; pointer c3 for add r3, r1, r2; pointer c4 for load r4, (B); pointer c5 for load r5, (D); pointer c3 for mult r3, r4, r5)
(Buffer unit table: rows Unit 1 to Unit 7; columns Pr #, ASR1, DSR1, SD1, ASR2, DSR2, SD2, ID, DR)
1) priority number, 2) address of source register 1 (ASR1),
3) delay of source register 1 (DSR1), 4) source data 1 (SD1),
(Figure: registers 1 to N connected through a splitter and a 5-to-1 MUX to the ID and DR fields)
Fig. 2.25 (blocks: fetch unit, decode unit, instruction status unit, register array, issue unit, buffer units, fixed point arithmetic unit; thick lines represent the common data bus)
In case of a
(Figure: two timing diagrams of instructions 1 to 6 passing through the F, D, I, and E stages; annotation: additional delay introduced by the execution unit)
rl,
r2,
r3,
r4,
r3 ;
r4;
20;
30;
r2,rl;
r2,rl;
Fig. 2.27 (updated counter values for pipeline cycles # 3 to # 8; instructions shown: load R1, 20 in cycle # 3; mult R3, R2, R1 in cycle # 5; mult R4, R2, R1 in cycle # 6; a store of R3 in cycle # 7; a store of R4 in cycle # 8)
Fig. 2.28 (updated counter values for pipeline cycles # 7 onward; instructions shown: a store of R3 in cycle # 7 and a store of R4 in cycle # 8)
Pipeline cycle #
C2
C3
C4
Assuming a delay of 'k' pipeline cycles is needed to resolve the hazard,
the counter values after updating are:
load r1, (X);    r1 <-- (X)
load r2, (Y);    r2 <-- (Y)
Fig. 2.30 The process of capturing the operands and resolving collisions (panels for pipeline cycles # 6 to # 13; each panel shows counter C1 and the buffer entries Pr #, ASR1, DSR1, SD1, ASR2, DSR2, SD2, ID, and DR for Unit 1 and Unit 2, before and after updating; Unit 1 holds sources R1 and R2 with destination R3, Unit 2 holds sources R1 and R2 with destination R4; the source data 20 and 30 are captured as they become available)
3. add      r3, r1, r2;    r3 <-- r1 + r2
4. store    (Z), r3;       (Z) <-- r3
5. branchz  r3, 100;       branch to 100 if r3 = 0
6. load     r4, (A);       r4 <-- (A)
7. load     r5, (B);       r5 <-- (B)
8. mult     r3, r4, r5;    r3 <-- r4 * r5
9. store    (C), r3;       (C) <-- r3
Fig 2.31 (blocks: EAC queue, fetch unit, PIC queue, decode unit, instruction status unit, buffer units, issue unit, register array, floating point unit, logic unit)
Fig. 2.32 (PIC stream: instructions starting from address 10 in memory, addresses 10 to 16, containing Jump (Carry = 0) 23, Jump (Overflow = 0) 33, Jump (Carry = 0) 56, Jump (Overflow = 0) 70, and Jump (Result = 0) 80; EAC stream: instructions starting from address 23 in memory, addresses 23 to 29, containing Jump (Carry = 0) 28, Jump (Overflow = 0) 36, Jump (Carry = 0) 45, and Jump (Result = 0) 60)
79
The jump
= 4
PC = Program counter
The jump instruction n is being evaluated in the logic unit. The issue
unit and the decode unit have suspended operations until the jump is
evaluated.
81
2^n where
82
A flow chart
HARDWARE SYSTEM:
15
14
Jump (Result = 0) 80
Jump (Overflow = 0) 70
Jump (Carry = 0) 56
'1 3
12
11
Jump (Overflow = 0) 33
10
Jump (Carry = 0) 23
EAC stream
PIC stream
14
PC. 15
PC 1 60
The contents of the counter after fetching the last instruction in both the streams
XXX in memory
PIC stream
EAC stream
PC = Program counter
The jump instruction n has been evaluated and the branch is taken. The old
PIC stream is the redundant stream and hence it is flushed.
Fig. 2.36 Sequence of updating the counters during the jump operation
Control Paths
disable individual
streams
instructions to the
EAC queue
Fig. 2.37
instructions to the
PIC queue
4
b
No
t
Fetch instruction
Unconditional
Place in PIC queue
PC(PIC) <-- PC + 1
EAC stream in
Yes
Start The EAC
stream
u
Fig. 2.38 Flow chart for the PIC queue assuming PIC queue is in session
Load program
counter with contents
of counter 1
Yes
b
Fig. 2.38 The flow chart of the PIC queue assuming PIC queue in session (cont'd)
No
Fetch instruction
Unconditional
Place in EAC queue
PC(EAC) <-- PC + 1
Yes
Start The PIC
stream
Fig. 2.38 Flow chart for the EAC queue assuming EAC queue is in session (cont'd)
No
Yes
Load program
counter with contents
of counter 1
Yes
Fig. 2.38 The flow chart of the EAC queue assuming EAC queue in session (cont'd)
rn
Main Memory Module
Fig. 2.39
90
91
92
Fig. 2.40 (blocks: counter set 1, counter set 2, classifier, opcode and EA generator, fetch unit, PIC queue, decode units with decoder units 1 and 2, system status units, instruction status unit, issue units 1 and 2, registers R1 to R4, execution units)
FETCH UNIT:
to
are used
C : Control signals
to disable individual
streams
Fig. 2.41
96
the issue unit are disabled, but the queue is still filled
with instructions. The instructions of both streams are
classified by the classifier, and the conditional
branch instructions are identified. When a branch
instruction is encountered in the PIC stream, the
destination address is calculated and placed in a counter
belonging to the EAC stream. The appropriate counter is
determined by the number of branch instructions that are
present between the current instruction in the fetch unit
and the branch instruction that is currently being evaluated
in the logic unit. Thus counter 1 of the EAC stream is
initialized with the destination address of the first branch
instruction in the PIC queue, counted from the branch instruction
that is in the logic unit. At any instant of time there will
be only a single branch instruction being evaluated in the
logic unit. Counter 2 (EAC stream) is loaded with the
destination address of the second branch instruction in the
PIC stream, and so on. In general, the destination addresses
of the branch instructions are loaded into the counters of
the EAC stream in the same order as their physical presence
in the PIC stream. This allows the EAC stream to store all
the possible destination addresses. In the event of the
current branch instruction not being valid (that is, not taken),
the EAC queue is flushed and counter 0 is loaded with the value in
counter 1, with each of the other addresses moving up one
counter to the left. This is shown in Fig. 2.42. The
s
Instructions from memory
I
T7-T-
Control Paths
C :
Control signals
to disable individual
streams
Fig. 2.42
A : Control signals
for path information
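The counter handling described for the EAC stream can be sketched in a few lines. This is only an illustration under stated assumptions: the class `EacCounters` and its method names are hypothetical, counter 0 is taken to hold the destination of the branch currently in the logic unit, and on a taken branch the remaining counters are simply discarded. The destination addresses are those of the PIC stream jumps in Fig. 2.32.

```python
from collections import deque


class EacCounters:
    """Hypothetical model of the EAC stream counters (not the thesis hardware)."""

    def __init__(self):
        self.targets = deque()  # targets[0] plays the role of counter 0

    def record_branch(self, destination):
        """Store the destination of a branch met in the PIC stream, in order."""
        self.targets.append(destination)

    def resolve(self, taken):
        """Resolve the branch in the logic unit; return the next EAC start address."""
        resolved = self.targets.popleft()   # every address moves one counter left
        if taken:
            self.targets.clear()            # assumed: later targets are flushed
            return resolved                 # the EAC stream becomes current
        # Branch not taken: the EAC queue restarts from the next stored target.
        return self.targets[0] if self.targets else None


counters = EacCounters()
for destination in (23, 33, 56, 70, 80):
    counters.record_branch(destination)

print(counters.resolve(taken=False))  # next EAC start address
print(list(counters.targets))
```

A deque is used because both operations in the text, loading counters in program order and moving every address one place to the left, are constant-time on it.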
98
the EAC stream becomes the current stream and the PIC queue
is flushed along with the contents of the counters 1 to 9
of the EAC stream. The counter
DECODE UNIT:
Fig. 2.43 (instruction word fields: Code, Exec Time, R-field with R1 to R5, C-field with C1 to C5; example entry: add with execution time 6)
Instruction
status
unit.
100
R2
ISSUE UNIT:
flush Queue 2
Instruction from
INSTRUCTION
Decoded instruction
to issue unit # 2
Decoded instruction
to issue unit # 1
Opcode
Fig. 2.45
ASR1 | DSR1 | SD1 | ASR2 | DSR2 | SD2 | DR
103
Let R1 = 5 and R2 = 5.
R1 = R2
OpcOde
ADD
Fig. 2.46
R1 = R2
ASRl
DSRl
SDl
R2
R3
DR
R1
+ R3, where
Opcode
ASRl
DSRl
SO1
ASR2
DSRP SD2
ADD
R2
R3
Fig. 2.47
DR
R1
lnstructions
from
EAC Queue
lnstructions
from
PIC Queue
lnstructions to the
Execution unit
N: Update counter fields in system status units.
M: Input of the counter fields from the system status unit
U: Common Data Bus.
V: Disable issue unit signal from logic unit controller.
unit
106
EXECUTION UNIT:
The execution unit comprises three sub-units, namely the
dynamic fixed point arithmetic unit, the dynamic floating
point unit, and the logic unit.
(Figure label: controller for the fixed point unit.)
108
109
CHAPTER THREE
DESIGN OF DYNAMIC PIPELINE ARITHMETIC UNIT
3.1
INTRODUCTION:
(CPA)
(CSA)
The
CSA
(SV)
n-bit
CSA
element
CSA
CSA
element.
Mathematically, the carry save adder is represented as:
$A + B + D = S + C$    (3.1)
where
vectors are A, B, and D.
and C.
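At the bit level, equation (3.1) is commonly read in the full-adder form below. This is a sketch, assuming bit index $i$ with bit 0 the least significant; the subscripted names $A_i$, $B_i$, $D_i$, $S_i$, and $C_i$ are notation introduced here for illustration.

```latex
\begin{aligned}
S_i       &= A_i \oplus B_i \oplus D_i, \\
C_{i+1}   &= A_i B_i \lor B_i D_i \lor A_i D_i, \qquad C_0 = 0, \\
A + B + D &= \sum_i 2^{i} S_i + \sum_i 2^{i} C_i = S + C .
\end{aligned}
```

No carry propagates between bit positions inside the element, which is why its delay does not depend on the word length.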
The
Fig. 3.1 (block: carry save adder unit with SUM and CARRY outputs)
Fig. 3.2
the n input vectors into two vectors S and C, each 2*m bits
long. The process of merging is carried out in stages. The
number of CSA elements in a stage is equal to the largest
number of three-vector groups that can be formed from the
input vectors to that stage. The ungrouped vectors are
passed on to be processed in the next stage. The final
result is obtained by adding the last sum and carry
vectors. The relative order of the vectors has to be
maintained throughout the pipe so as to obtain the correct
result.
Let the eight vectors shown below be the shifted
multiplicands of two eight-bit binary vectors that are to be
multiplied. These partial
products are to be added to obtain the final product and
hence involve multiple additions. The leading and trailing
zeros are added to show the relative displacement of the
vectors to each other.
W1    0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 1
W2    0 0 0 0 0 0 0 1 0 0 0 1 1 1 1 0
W3    0 0 0 0 0 0 1 1 1 0 0 0 1 0 0 0
W4    0 0 0 0 0 1 1 0 0 1 1 0 0 0 0 0
W5    0 0 0 0 1 1 0 1 0 0 1 1 0 0 0 0
W6    0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0
W7    0 0 1 0 1 1 0 1 0 0 0 0 0 0 0 0
W8    0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0
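The staged merging of these eight vectors can be reproduced in software. The sketch below is an illustration rather than the hardware design: the function names `carry_save_add` and `csa_tree` are assumptions, Python integers stand in for the bit vectors, and the final carry propagate addition is done with ordinary integer addition.

```python
def carry_save_add(a, b, d):
    """One CSA element: three vectors in, a sum vector and a carry vector out."""
    s = a ^ b ^ d                              # bitwise sum, no carry propagation
    c = ((a & b) | (b & d) | (a & d)) << 1     # carries, moved one bit to the left
    return s, c


def csa_tree(vectors):
    """Merge vectors stage by stage until two remain, then add those two."""
    vectors = list(vectors)
    elements_per_stage = []
    while len(vectors) > 2:
        groups = len(vectors) // 3             # three-vector groups in this stage
        merged = []
        for g in range(groups):
            merged.extend(carry_save_add(*vectors[3 * g:3 * g + 3]))
        merged.extend(vectors[3 * groups:])    # ungrouped vectors pass through
        vectors = merged
        elements_per_stage.append(groups)
    return sum(vectors), elements_per_stage    # last step: carry propagate add


W = [int(bits, 2) for bits in (
    "0000000010101001", "0000000100011110", "0000001110001000",
    "0000011001100000", "0000110100110000", "0001000100000000",
    "0010110100000000", "0101000100000000",
)]

total, stages = csa_tree(W)
print(format(total, "016b"))  # 1010011111011111
print(stages)                 # [2, 2, 1, 1]
```

The printed total agrees with the 16-bit SUM vector listed in the text, and the per-stage element counts add up to the six CSA units of the tree.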
$N(v) = N(v-1) + 0.5\,\bigl(N(v-1) - (N(v-1) \bmod 2)\bigr)$    (3.2)
with N(1) = 3.
For example, we need 10 CSA tree levels to add 64 to 94
numbers in one pass through the tree.
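The 64-to-94 example can be checked numerically. The sketch below assumes one reading of the recurrence, in which each level handles half again as many vectors as the previous one, rounded down, starting from N(1) = 3; the function names `max_inputs` and `levels_needed` are illustrative.

```python
def max_inputs(levels):
    """Largest number of vectors a CSA tree with `levels` levels can merge."""
    n = 3                                  # N(1) = 3
    for _ in range(levels - 1):
        n = n + (n - n % 2) // 2           # same as floor(1.5 * n)
    return n


def levels_needed(count):
    """Smallest number of CSA tree levels that can merge `count` vectors."""
    levels = 1
    while max_inputs(levels) < count:
        levels += 1
    return levels


print([max_inputs(v) for v in range(1, 11)])
print(levels_needed(64), levels_needed(94))
```

Under this reading a nine-level tree handles at most 63 vectors and a ten-level tree at most 94, which matches the range quoted in the text.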
Mathematically, for eight of the eight-bit vectors we
need a five-level CSA tree. The leading zeros are omitted
for the calculations. The process of addition is illustrated
below, wherein SV represents the sum vector
and CV
The following is the operation in eight-bit CSA
unit #1:
The following is the operation in eight-bit CSA unit
#2:
At the end of level one the results are tabulated in
their correct order:
These vectors are forwarded to level 2 for further
processing. At level two there are six binary vectors to be
Level 4:
SUM    1 0 1 0 0 1 1 1 1 1 0 1 1 1 1 1
[22]
STRUCTURE:
Fig. 3.3 (CSA tree: CSA units 1 to 6 separated by latches 1 to 6, with sum outputs S1 to S6 and carry outputs C3 to C6 legible, ending in the total sum)
CSA
numbers which are N bits wide. The main aim of the design
is to modify the static structure to support the operations
of addition, subtraction, multiplication and division.
3.3.1
MULTIPLICATION:
The operation of
Fig. 3.4. (blocks: operand #1, multiplexer, operand #2, external data streams, carry look ahead adder, carry in)
Fig. 3.5. (blocks: operand #1, multiplexer, operand #2, external data streams, carry look ahead adder, select line for inversion, carry in = 1)
Fig. 3.6. (product bits p0 to p10 of P and the partial product vectors W1 to W8)
123
MODIFICATIONS DUE TO MULTIPLICATION:
3.3.3
DIVISION:
The division process is different from that of the
. . . .n be
t h e successive
R,
g 2
for i
1,2,. ..,k
where
= 1
D and 0 < δ <= 0.5.
((1-6)x R x R x R x
1
2
3
. . . . .
.X
(i-1)
Expanding Ri in terms of ( 1 + 6
i
= 1, 2 , 3 ,
given below:
as in equation ( 3 . 3 ) for
as
The value of
.x
1.
(l+6
(k-1)
)
*(k)
Iteration # (k)    δ^(2^k)           1 - δ^(2^k)      (δ = 0.5)
1                  0.25              0.75
2                  0.0625            0.9375
3                  0.003906          0.996094
4                  1.526 x 10^-5     0.9999847
5                  2.32 x 10^-10     0.9999999
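The tabulated values can be reproduced with a short sketch of convergence division. It assumes the usual form of the method, in which the divisor is written as D = 1 - δ with 0 < δ <= 0.5 and both numerator and denominator are multiplied by the factors (1 + δ), (1 + δ^2), (1 + δ^4), and so on; the function name `convergence_divide` is illustrative and floating point numbers stand in for the hardware fractions.

```python
def convergence_divide(n, d, iterations=5):
    """Approximate n / d for 0.5 <= d < 1 by driving the denominator toward 1."""
    delta_power = 1.0 - d              # delta, so that d = 1 - delta
    numerator, denominator = n, d
    for _ in range(iterations):
        factor = 1.0 + delta_power     # R = 1 + delta^(2^(i-1))
        numerator *= factor
        denominator *= factor          # becomes 1 - delta^(2^i)
        delta_power *= delta_power     # square delta for the next factor
    return numerator, denominator


for k in range(1, 6):
    _, denominator = convergence_divide(1.0, 0.5, k)
    print(k, denominator)

quotient, _ = convergence_divide(0.3, 0.5)
print(quotient)   # close to 0.6
```

Because the error term is squared on every pass, the number of correct bits roughly doubles per iteration, which is why five iterations suffice even for the worst case δ = 0.5.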
fraction, wherein the fraction is less than 1.
Mathematically we have
$Q = N/D = (D + B)/D = 1 + (B/D)$
where $B = N - D$
L e t Pp
L e t Pg
= -P2x (
P ~ X(
2
+ b )
4
1 + 6 )
1
( 3 8)
(3-9)
127
. Hence
128
argument becomes available. The changes made are shown in
Fig. 3.8. Thus the pipe is converted from a unifunction pipe
to a multifunction pipe capable of dynamic behavior,
as shown in Fig 3.9. The dynamic operation depends on both
the hardware and the control schemes for its successful
operation. The control is based on the hardware, and the
details of how the control was realized are explained in the
next section.
3.4
Fig. 3.9 (multifunction pipeline: multiplexers, latches 2 to 4, CSA units 2 to 6, inverter, carry look ahead adder with carry in)
131
3.4.1
COLLISION VECTORS:
a single
132
00110101011
).
Collision
STAGE 4
STAGE 5
Fig. 3.10
X
X
134
135
$C_{add} = \{\, 1\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0 \,\}$    (3.11)
$C_{sub} = \{\, 1\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 0 \,\}$    (3.12)
1 0 0 0 0 0 0 0 )
(3.13)
This
This process
Fig. 3.11 (reservation table with seven stage rows)
Fig. 3.12 (reservation table with seven stage rows)
137
which immediately follows the initiation of the previous
function. This is carried out to have all the operands
available for the next iteration without any undue delay.
The value of delta is held in stage five for two consecutive
clock cycles because of the necessity of obtaining the new
values of 1
)A
and (
)A
. This is illustrated in
1 1 1 0 0 0 0 0 )
(3.16)
Time
Fig.
3.13
(I+6
(k)
Fig. 3.14
2
Reservation table for delta products. ( 6
).
Fig. 3.15
140
vector between an initiation of type i and a later initiation
of type j. Thus CM_i(j, k) is 0 only if shifting the
reservation table of j by k places to the right, and
overlaying it on a copy of the reservation table of i,
results in no collision. Here k denotes the number of clock
cycles from the initial clock cycle 0 at which an initiation
of the function j is desired.
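The shift-and-overlay test can be written directly. In the sketch below a reservation table is represented, as an assumption made for illustration, by a list with one set per stage holding the clock cycles in which that stage is busy; the function name `cross_collision_vector` and the small example tables are hypothetical and are not the tables of the proposed pipeline.

```python
def cross_collision_vector(table_i, table_j, length):
    """Entry k is 1 if starting function j, k cycles after function i, collides."""
    vector = []
    for k in range(length):
        collides = any(
            (cycle + k) in busy_i              # j's use, shifted right by k
            for busy_i, busy_j in zip(table_i, table_j)
            for cycle in busy_j
        )
        vector.append(1 if collides else 0)
    return vector


# Hypothetical two-stage tables: i reuses stage 0 at cycle 2, j passes straight through.
table_i = [{0, 2}, {1}]
table_j = [{0}, {1}]

print(cross_collision_vector(table_i, table_j, 4))   # [1, 0, 1, 0]
print(cross_collision_vector(table_j, table_j, 4))   # [1, 0, 0, 0]
```

Entry 0 is always 1 when the two functions share their first stage, which is consistent with the leading 1 in the collision vectors listed for addition and subtraction.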
Step 3:
I t is
row 1
row 2
row 3
row i
row n-1
Cross collision vector between operation (n-1) and operation (i)l
Cross collision vector between operation (n) and operation (i)
Fig. 3.16
142
of the initial collision matrix for division is chosen to
illustrate the process. The operations are to be tagged as
i, i+1
collision matrices.
Row 1 of matrix 1 will be the cross collision
vector of the division operation. The elements of row 1 are
3,4,5,6,7
1 1 1 0 0 0 0 0 ).
143
Fig. 3.17.
In the proposed pipeline system, there are three
distinct functions which produce three initial collision
matrices and they are presented in Fig. 3.18. The state
diagrams are generated using the initial collision matrices.
GENERATION OF STATE DIAGRAMS:
3.4.3
occurs. If a
is
for function i is
Fig. 3.17
Fig. 3.18
145
However, this does not guarantee that it will not collide
with any other initiations that may be possible at the same
time. For each initiation there will be a new state. The new
state is determined by ORing the present collision matrix
with the initial collision matrix corresponding to the function
i.
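The state update can be sketched as follows. The sketch assumes the usual convention for collision matrices: the present matrix is first shifted left by the number of cycles that have elapsed, so that the column for the initiation cycle becomes column 0, and is then ORed row by row with the initial collision matrix of the initiated function. The function names and the small matrices are illustrative and are not the matrices of Fig. 3.18.

```python
def shift_left(row, k):
    """Drop the first k columns of a row and pad with zeros on the right."""
    return row[k:] + [0] * min(k, len(row))


def can_initiate(present, function_row, k):
    """A function may start k cycles from now if its row has a 0 in column k."""
    return present[function_row][k] == 0


def next_state(present, initial, k):
    """New collision matrix after an initiation k cycles from now."""
    return [
        [a | b for a, b in zip(shift_left(row, k), initial_row)]
        for row, initial_row in zip(present, initial)
    ]


present = [[1, 0, 1, 0],
           [1, 1, 0, 0]]
initial = [[1, 0, 0, 0],      # hypothetical initial matrix of the function started
           [1, 1, 0, 0]]

print(can_initiate(present, 0, 1))        # True: row 0 has a 0 in column 1
print(next_state(present, initial, 1))    # [[1, 1, 0, 0], [1, 1, 0, 0]]
```

Repeating the update for every permitted initiation enumerates the states of the diagram; the procedure ends when no new matrices appear.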
Step 3:
146
remains in the present state. Column 1 now becomes
column 0, and steps 1 to 5 are followed again.
STEP 6:
I1 = { Addition }.
I2 = { Subtraction }.
I3 = { Multiplication }.
I4 = { Division }.
I5 = { Multiplication, Addition }.
I6 = { Multiplication, Subtraction }.
I7 = { Division, Addition }.
I8 = { Division, Subtraction }.
2.
=> 3.
Fig.
3.1 9
Column
Initiation set
= {
1 )
= {
1 )
= {
1 )
= {
1,2,3,{1,2),{1,3))
0:
1,2,3,{1,2),{1,3))
2 )
= {
2 )
= {
,
,
Fig. 3.21.
2 )
2 ) . Hence
0.
Fig. 3.23
Fig. 3.24
Fig.
3.25
3.26
Combined cross
collision matrix for
dual initiation
Fig.
Combined cross
collision matrix
152
The process of generating the next state for the double
initiation is not
1: Division operation
2: Multiplication operation
3: Addition or subtraction operation
Fig. 3.27
The possible states from the initial collision matrix for division
CHAPTER FOUR
....
store
r1, 20;
r2, 30;
r3, 40;
r4, r1, r2;
k, r4;
r4, r2, r3;
r5, r2, r3;
r5, 60;
Consider the
156
r1, 20
is fetched by the
Fig. 4.1 (system state diagram; blocks: counter 1, counter 2, EAC queue, PIC queue, decoder units 1 and 2, issue units 1 and 2, logic unit, fixed point unit, floating point unit, register set)
Fig. 4.2 (system state diagram; blocks: counter 1, counter 2, fetch unit, EAC queue, PIC queue, decoder units 1 and 2, issue units 1 and 2, logic unit, fixed point unit, floating point unit, register set)
Fig. 4.3
160
Pipeline cycle # 2:
The first instruction is in the issue unit. The
content of the counter c1, which represents the register r1,
is zero. This implies that there is no RAW or WAW hazard.
The Tinst-delay is calculated as follows:
Csink(old)
'test
= O
Fig. 4.4 (system state diagram; blocks: counter 1, counter 2, EAC queue, decoder units 1 and 2, issue units 1 and 2, fixed point unit, floating point unit, register set)
Fig. 4.5
163
4:
Csource- reg2
- 4
Csink(old)
= 0
Fig. 4.6 (system state diagram; blocks: counter 1, counter 2, fetch unit, EAC queue, PIC queue, decoder units 1 and 2, issue unit 1, logic unit, fixed point unit, floating point unit, register set)
Fig. 4.7
Fig. 4.8 (system state diagram; blocks: counter 1, counter 2, fetch unit, EAC queue, decoder units 1 and 2, issue unit 1, logic unit, fixed point unit, floating point unit, register set)
Fig. 4.9
Tsrc-delay = 5,
Ttest = 5 + 3 - 1 = 7,
'
Csink(old)
Tinst-delay
and
Ttest
> 2.
Csink(old)
Tsrc-delay = 5'
- T
+ (Tsr,-d,,a,
- 1)
= 3
4 = 7,
Fig. 4.10 (system state diagram; blocks: counter 1, counter 2, fetch unit, EAC queue, decoder units 1 and 2, issue units 1 and 2, logic unit, fixed point unit, floating point unit, register set)
Fig. 4.11
Fig. 4.12 (system state diagram; blocks: counter 1, counter 2, fetch unit, EAC queue, PIC queue holding add R5, R2, R3, decoder units 1 and 2, issue unit 1, logic unit, fixed point unit, floating point unit, register set)
Fig. 4.13
Unit 4
Unit 5
Unit 6
Unit 7
Unit 4
Unit 5
Unit 6
, Unit 7
174
Csource- reg1
Csource- reg2 = 3
Csink(old)
Tsrc-delay =
'test
'test
' Csink(old)
(Tsrc-delay
and
1)
(Ttest
3+4-1
Csink(old)
= 6,
) = I
(Tsrc-delay
+1)=4+1=5.
Tsrc-delay =
7'
Fig. 4.16 (system state diagram; blocks: counter 1, counter 2, fetch unit holding a branch, EAC queue, decoder units 1 and 2, issue unit 1, logic unit, fixed point unit, floating point unit, register set)
mult
Fig. 4.17
(Delay station table; rows: Unit 4 to Unit 7)
Pr # : Priority number attached to each unit
ASR1 : Address of source register 1.    ASR2 : Address of source register 2.
DSR1 : Delay of source register 1.      DSR2 : Delay of source register 2.
SD1 : Source Data 1.                    SD2 : Source Data 2.
ID : Instruction Delay.                 DR : Destination Resource
Fig. 4.18
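The fields in this legend map naturally onto a small record type. The sketch below is an assumption-laden illustration: the class name `DelayStationEntry`, the `tick` and `ready` methods, and the rule that an entry is ready once all of its delays have counted down to zero are not taken from the text; only the field meanings follow the legend.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DelayStationEntry:
    """One row of a delay station (field meanings follow the legend)."""
    priority: int                 # Pr #
    asr1: str                     # address of source register 1
    dsr1: int                     # delay of source register 1
    asr2: str                     # address of source register 2
    dsr2: int                     # delay of source register 2
    instruction_delay: int        # ID
    destination: str              # DR
    sd1: Optional[int] = None     # source data 1, once captured
    sd2: Optional[int] = None     # source data 2, once captured

    def tick(self):
        """Assumed behaviour: every nonzero delay counts down once per cycle."""
        self.dsr1 = max(0, self.dsr1 - 1)
        self.dsr2 = max(0, self.dsr2 - 1)
        self.instruction_delay = max(0, self.instruction_delay - 1)

    def ready(self):
        """Assumed rule: the entry can proceed when no delay remains."""
        return self.dsr1 == 0 and self.dsr2 == 0 and self.instruction_delay == 0


entry = DelayStationEntry(priority=1, asr1="R1", dsr1=2, asr2="R2", dsr2=0,
                          instruction_delay=1, destination="R3")
entry.tick()
print(entry.ready())   # False: source register 1 still has one cycle of delay
entry.tick()
print(entry.ready())   # True
```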
(Delay station tables; rows: Unit 4 to Unit 7)
ID : Instruction Delay.
DR : Destination Resource
Fig. 4.20 State of the delay station in the LU during pipeline cycle # 7
179
Fig. 4.21 (system state diagram; blocks: counter 1, counter 2, fetch unit, EAC queue, decoder units 1 and 2, issue unit 2, logic unit, fixed point unit, floating point unit, register set)
Fig. 4.22
(Delay station table; rows: Unit 4 to Unit 7)
Pr # : Priority number attached to each unit
ASR1 : Address of source register 1.    ASR2 : Address of source register 2.
DSR1 : Delay of source register 1.      DSR2 : Delay of source register 2.
SD1 : Source Data 1.                    SD2 : Source Data 2.
ID : Instruction Delay.                 DR : Destination Resource
Fig. 4.23
(Delay station table; rows: Unit 4 to Unit 7)
Pr # : Priority number attached to each unit
ASR1 : Address of source register 1.
DSR1 : Delay of source register 1.
SD1 : Source Data 1.
ID : Instruction Delay.
DR : Destination Resource
Fig. 4.25
Fig. 4.26 The state of the system at pipeline cycle # 9 (blocks: PIC queue, EAC queue, decoder units 1 and 2, issue units 1 and 2, fixed point unit, floating point unit, register set)
mult
Fig. 4.27
(Delay station tables; rows: Unit 4 to Unit 7)
Pr # : Priority number attached to each unit
ASR1 : Address of source register 1.    ASR2 : Address of source register 2.
DSR1 : Delay of source register 1.      DSR2 : Delay of source register 2.
SD1 : Source Data 1.                    SD2 : Source Data 2.
ID : Instruction Delay.                 DR : Destination Resource
Fig. 4.29
Unit 4
Unit 5
Unit 6
Unit 7
Pr # : Priority number attached to each unit
ASR1 : Address of source register 1
DSR1 :Delay of source register 1.
SD1 : Source Data 1.
ID : Instruction Delay.
Fig. 4.30
DR : Destination Resource
188
Fig. 4.31
[Block diagram labels: Counter 1, Counter 2, Fetch unit, EAC Queue, Decoder unit 1, Decoder unit 2, Issue unit 2 (Disabled), Issue unit 1, Logic unit, Fixed point unit, Floating point unit, Register set]
Fig. 4.33 [diagram labels: Unit 4 to Unit 7]
Fig. 4.34 [diagram labels: Unit 4 to Unit 7]
Fig. 4.35 State of the delay station in the LU during pipeline cycle # 10
Fig. 4.36
Fig. 4.37 The state of the system at pipeline cycle # 11 [diagram notes: Hold instruction in issue unit; Disabled]
Fig. 4.39 [diagram labels: Unit 4 to Unit 7]
Fig. 4.40 State of the delay station in the LU during pipeline cycle # 11
Fig. 4.41
Fig. 4.42
Fig. 4.43 The state of the system at pipeline cycle # 12
Fig. 4.44 State of the delay station in stage 1 of the AU during pipeline cycle # 12
Each Unit is a Delay Buffer.
Fig. 4.45 State of the delay station in stage 6 of the AU during pipeline cycle # 12
Fig. 4.46 [diagram labels: Unit 4 to Unit 7]
Pipeline Cycles # 13 to # 20:
Fig. 4.47 The state of the system at pipeline cycle # 13
Fig. 4.48 State of the delay station in the LU during pipeline cycle # 13
Fig. 4.49 The state of the system at pipeline cycle # 14
Fig. 4.50 State of the delay station in the LU during pipeline cycle # 14
Fig. 4.51 The state of the system at pipeline cycle # 19
Fig. 4.52 [delay station entry for Unit 1: ASR1 = R5, DSR1 = 0, SD1 = 1200; Units 2 to 7 have no recoverable entries]
Fig. 4.53 The state of the system at pipeline cycle # 20 [diagram note: No delay is assigned]
Fig. 4.54
CHAPTER FIVE
COMPUTER SIMULATION AND EXPERIMENTAL RESULTS
in two sections. The first section simulates the PIU and the
second simulates the PEU. In real time operation, the
various units are synchronised. The units are termed
stages. The total number of stages in the system is ten.
The first three stages are the fetch unit, decode unit and
issue unit respectively. The remaining seven stages
constitute the stages of the pipelined arithmetic unit. The
program is written in the C language. Each stage is simulated
by a single function. This is illustrated in Fig. 5.1.
In actual operation, the stages operate concurrently.
For example, let us assume that the decode unit receives an
instruction I from the fetch unit at the beginning of
cycle J. It processes the instruction and forwards it to the
issue unit at the end of the cycle. The issue unit,
meanwhile, receives the instruction I-1 at the beginning of
cycle J. The instruction I will be received by the issue
unit only at the beginning of cycle J+1. Fig. 5.2
illustrates the data flow. The stages begin processing at
the beginning of each cycle and complete processing at the
end of each cycle. This implies that the simulation program
must begin the execution of functions at the same time. The
Fig. 5.1 [flow of stage functions: Function Fetch-unit, Function Decode-unit, Function Issue-unit, Function Stage-one, Function Stage-two, Function Stage-three, Function Stage-four, Function Stage-five, Function Stage-six, Function Stage-seven]
Fig. 5.2
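One common way to make sequentially executed functions behave as if they all started at the same instant is a two-phase loop: a processing phase in which every stage reads only its own input latch, followed by a data transfer phase that moves each output latch to the next stage's input. The following is a minimal sketch of that idea, not the thesis program itself. The names latch, process_stages, transfer_latches and run_pipeline are hypothetical, only the fetch, decode and issue stages are modelled, and the "instruction" is just the number of the cycle in which it was fetched.

```c
#define NUM_STAGES 3   /* 0 = fetch, 1 = decode, 2 = issue (AU stages omitted) */

/* Hypothetical latch: one instruction number plus a valid bit. */
struct latch {
    int inst;
    int valid;
};

static struct latch in_latch[NUM_STAGES];   /* read by stage s during a cycle    */
static struct latch out_latch[NUM_STAGES];  /* written by stage s during a cycle */

/* Processing mode: every stage works only on its own input latch,
   so the order in which the stages are called does not matter. */
static void process_stages(int next_inst)
{
    int s;
    in_latch[0].inst = next_inst;       /* the fetch stage always has new work */
    in_latch[0].valid = 1;
    for (s = 0; s < NUM_STAGES; s++)
        out_latch[s] = in_latch[s];     /* stand-in for the real stage function */
}

/* Data transfer mode: the output of stage s-1 becomes the input of stage s. */
static void transfer_latches(void)
{
    int s;
    for (s = NUM_STAGES - 1; s > 0; s--)
        in_latch[s] = out_latch[s - 1];
}

/* Run the given number of cycles and return the instruction that the
   chosen stage holds at the beginning of the last cycle (-1 if empty). */
int run_pipeline(int cycles, int stage)
{
    int c, s;
    for (s = 0; s < NUM_STAGES; s++) {
        in_latch[s].valid = 0;
        out_latch[s].valid = 0;
    }
    for (c = 1; c <= cycles; c++) {
        process_stages(c);              /* instruction number = fetch cycle */
        if (c < cycles)
            transfer_latches();
    }
    return in_latch[stage].valid ? in_latch[stage].inst : -1;
}
```

With three cycles simulated, the decode stage holds instruction 2 while the issue stage holds instruction 1, which mirrors the I and I-1 relationship described in the text.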
[Figure labels: Program Flow; Function decode-unit and Function issue-unit, each executing from Start to End and producing a (result of operation)]
The data transfer operation during data transfer mode during iteration I:
int opcode_field;
int source_operand1;
int source_operand2;
int dest_operand;
int valid;
"issue-unit":
"stage-one":
[Partial product array for j = 0 to 7; surviving labels: w2, w6, w8]
a0*bj a1*bj a2*bj a3*bj a4*bj a5*bj a6*bj a7*bj
The elements ai and bj belong to the input
binary vectors A and B.
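The partial products a_i*b_j of two 8-bit binary vectors can be sketched as follows. This is an illustration only: the names to_bits, partial_products and multiply_via_pp are hypothetical, and the sketch assumes that index 0 holds the least significant bit, which may differ from the bit ordering used in the thesis program.

```c
#define WORD_BITS 8

/* Split the low 8 bits of a value into a bit vector (index 0 = LSB, an assumption). */
void to_bits(unsigned value, int bits[WORD_BITS])
{
    int i;
    for (i = 0; i < WORD_BITS; i++)
        bits[i] = (int)((value >> i) & 1u);
}

/* Row j of pp holds the bits a0*bj, a1*bj, ..., a7*bj. */
void partial_products(int a[WORD_BITS], int b[WORD_BITS],
                      int pp[WORD_BITS][WORD_BITS])
{
    int i, j;
    for (j = 0; j < WORD_BITS; j++)
        for (i = 0; i < WORD_BITS; i++)
            pp[j][i] = a[i] & b[j];
}

/* Check helper: weighting bit (j, i) by 2^(i+j) and summing gives the product. */
unsigned multiply_via_pp(unsigned x, unsigned y)
{
    int a[WORD_BITS], b[WORD_BITS], pp[WORD_BITS][WORD_BITS];
    unsigned product = 0;
    int i, j;

    to_bits(x, a);
    to_bits(y, b);
    partial_products(a, b, pp);
    for (j = 0; j < WORD_BITS; j++)
        for (i = 0; i < WORD_BITS; i++)
            product += (unsigned)pp[j][i] << (i + j);
    return product;
}
```

In a hardware multiplier the final summation is not done with ordinary additions; the rows are instead reduced by the carry save adder stages.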
Functions "stage-two" to "stage-five":
The stages two to five consist of the CSA elements.
These functions simulate the operation of the CSA elements.
The carry save adder is represented by using the equations
of the sum and the carry vectors. These functions return
with the result in the output buffers which are represented
as structures.
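The sum and carry equations of a carry save adder can be written directly with bitwise operators. The sketch below uses the standard textbook equations (sum = x XOR y XOR z, carry = majority of x, y, z shifted left by one position) and returns the result in a structure. The names csa_out, csa and add_three are hypothetical, and whole machine words stand in for the bit arrays used in the thesis program.

```c
/* Hypothetical output buffer of one CSA element. */
struct csa_out {
    unsigned sum;    /* bitwise sum vector   */
    unsigned carry;  /* carry vector, already shifted one position left */
};

/* One carry save adder: reduces three operands to a sum and a carry vector. */
struct csa_out csa(unsigned x, unsigned y, unsigned z)
{
    struct csa_out out;
    out.sum = x ^ y ^ z;
    out.carry = ((x & y) | (y & z) | (x & z)) << 1;
    return out;
}

/* Check helper: one carry propagating addition finishes the job. */
unsigned add_three(unsigned x, unsigned y, unsigned z)
{
    struct csa_out out = csa(x, y, z);
    return out.sum + out.carry;
}
```

For the operands 5, 3 and 6 the sum vector is 0 and the carry vector is 14, and their total equals 5 + 3 + 6.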
Function "stage-six":
Function "load-pipeline":
Function "output-check":
Function "shift-trac":
The function shift-trac is used to track the
instructions in the arithmetic unit. Each instruction that
is initiated is assigned a tracking register. The tracking
registers contain seven fields and each field represents a
stage. The tracking register will also contain the
destination register for the result of the instruction. When
an instruction is initiated at stage one, the tracking
register assigned to the instruction is initialized by
placing a token in field one. This function advances the
token to the next field denoting that the instruction has
moved to the next stage. When the token indicates that the
instruction is at the output stage, the function
output-check loads the result into the register specified
by the tracking register.
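The token mechanism described above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the thesis code: the structure track_reg, the array reg_file and the helper run_one_instruction are hypothetical, the output stage is assumed to be stage seven, and every instruction is assumed to pass through all seven stages in order.

```c
#define AU_STAGES 7
#define NUM_REGS 16   /* hypothetical register file size */

/* Hypothetical tracking register: one field per stage plus the destination. */
struct track_reg {
    int field[AU_STAGES + 1];  /* fields 1..7 are used; index 0 is unused */
    int dest;                  /* destination register for the result     */
    int in_use;
};

int reg_file[NUM_REGS];

/* Initiation at stage one: place the token in field one. */
void track_init(struct track_reg *t, int dest)
{
    int s;
    for (s = 0; s <= AU_STAGES; s++)
        t->field[s] = 0;
    t->field[1] = 1;
    t->dest = dest;
    t->in_use = 1;
}

/* Advance the token by one field; returns the stage now holding it. */
int shift_track(struct track_reg *t)
{
    int s;
    if (!t->in_use)
        return 0;
    for (s = 1; s < AU_STAGES; s++) {
        if (t->field[s]) {
            t->field[s] = 0;
            t->field[s + 1] = 1;
            return s + 1;
        }
    }
    return t->field[AU_STAGES] ? AU_STAGES : 0;
}

/* If the token is at the output stage, load the result and free the register. */
int output_check(struct track_reg *t, int result)
{
    if (!t->in_use || !t->field[AU_STAGES])
        return 0;
    reg_file[t->dest] = result;
    t->field[AU_STAGES] = 0;
    t->in_use = 0;
    return 1;
}

/* Follow one instruction through the unit; returns the number of shifts needed. */
int run_one_instruction(int dest, int result)
{
    struct track_reg t;
    int shifts = 0;
    track_init(&t, dest);
    while (!output_check(&t, result)) {
        shift_track(&t);
        shifts++;
    }
    return shifts;
}
```

Starting from field one, six shifts bring the token to field seven, at which point the result is written to the destination register.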
Function "set-log":
Function "time-off":
"index":
Pointer "pres-num":
Function "or-cross":
"*name-it":
in the
Function
EXPERIMENTAL RESULTS:
OPCODE   MNEMONICS   OPERATION
         NOP         NO OPERATION
         ADD         ADDITION
         SUB         SUBTRACTION
         MULT        MULTIPLICATION
         DIVIDE      DIVISION
         STORE
         LOAD
         LOAD1
         INC         INCREMENT
         DEC         DECREMENT
10       AND         AND
11       OR
12       NOT         NOT
13       BRANCH      UNCONDITIONAL BRANCH
14       BRANCHNZ
15       BRANCHNC    BRANCH IF NO CARRY
Fig. 5.4
[Instruction format diagrams, including STORE; field labels: Source register1, Source register2, Destination register, Memory location, Destination address]
Do 10 I = 1,100
A(I) = B(I) + C(I)*D(I)
The micro code for the loop section of the macro code is given
below:
100: load r8, (C);
load r9, (D);
load r10, (B);
load r11, (A);
add r3, r8, r2;
add r4, r9, r2;
add r12, r11, r2;
mult r6, r3, r4;
add r7, r6, r5;
Instruction set # 1:
load r1, (X);
load r2, (Y);
add r3, r2, r1;
store (Z), r3;
Location
(X) = 20
(Y) = 30
Destination   Iteration #   Value
r1            4             20
r2            10            30
r3            13            50
(Z)           19            50
Fig. 5.6
[Timing diagram: pipeline phases F, D, I and E plotted against Iteration # for the instructions below]
load r1, (X);
load r2, (Y);
add r3, r2, r1;
store (Z), r3;
Instruction set # 2:
load r1, (X);
load r2, (Y);
add r3, r2, r1;
store (Z), r3;
load r3, (A);
Location
(X) = 20
(Y) = 30
(A) = 40
Destination   Iteration #   Value
r1                          20
r2            10            30
r3            13            50
(Z)           19            50
r3            18            40
Fig. 5.8
[Timing diagram against Iteration # for the instructions below]
load r1, (X);
load r2, (Y);
add r3, r2, r1;
store (Z), r3;
load r3, (A);
Fig. 5.9
Instruction set # 3:
load r1, (X);
load r2, (Y);
add r3, r2, r1;
add r4, r2, r1;
store (Z), r3;
load r3, (A);
(X) = 20
(Y) = 30
(A) = 40
Fig. 5.12 [timing diagram against Iteration #; the surviving column labels run from 28 to 57. Instructions listed along the rows:]
load r9, (D);
load r10, (B);
load r11, (A);
add r3, r8, r2;
add r4, r9, r2;
add r5, r10, r2;
add r12, r11, r2;
mult r6, r3, r4;
add r7, r6, r5;
store (r12), r7;
dec r2;
branchnz 100, r2
Fig. 5.13 [timing diagram against Iteration #; the surviving column labels run from 16 to 30 and from 38 to 52. Instructions listed along the rows:]
load r9, (D);
load r10, (B);
load r11, (A);
add r3, r8, r2;
add r4, r9, r2;
add r5, r10, r2;
add r12, r11, r2;
mult r6, r3, r4;
add r7, r6, r5;
store (r12), r7;
branchnz 100, r2
CHAPTER SIX
CONCLUSIONS AND DISCUSSION
REFERENCES
Hwang K., and Briggs F. A., "Computer Architecture and Parallel
Processing," McGraw-Hill Book Company, 1984.
APPENDIX
A.
B.
Simulation Program
APPENDIX A.
APPENDIX B.
/*
*/
#include <stdio.h>
#include <math.h>
#define true 1
#define false 0
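struct matrix
{
    int bits_row1[8];
    int bits_row2[8];
    int bits_row3[8];
};
struct direction
{
    int div_latency[8];
    int mult_latency[8];
    int add_latency[8];
    int div_add[8];
    int mult_add[8];
};
struct ident
{
    int name;
};
struct collision_matrix
{
    struct matrix smatrix;
    struct direction sdirection;
    struct ident sident;
};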
/*
i*
/*
/*
/*
*/
*/
*/
*/
*/
1
f
0
);
now = new;
return (now);
*/
/*
/*
/*
/*
/*
*/
*/
*/
*/
matrix-one.smatrix.bits rowl[j]
=(matrix
o.smatrix.bits -rowl[j]
I
matrix-two.sma~rix.bits
-rowl[j]);
matrix-one.smatrix.bits row2[j]
=(matrix
o.smatrix.bits-row2[j]
I
matrix-two.smatrix.bits -row2[j]);
matrix-one.smatrix.bits row3[j]
=(matrix
o.smatrix.bits -row3[j]
I
matrix-two.sma~rix.bits
-row3[j]);
)
/*
/*
/*
/*
/*
*/
*/
*/
*/
*,/
/*
/*
/*
/*
/*
/*
FUNCTION NAME IT
*/
*/
*/
*/
*/
*/
=
=
s t r u c t
c o l l i s i o n
m a t r i x
dname it(present,coming,ineex,pre nutrepeat)
struct collision-matrix present,corning[];
int ineex,repeat,pre-nu;
f
for(j=O;j<8;++j)
I
flag
flag
=
=
flag << 1;
flag I 1;
flag
flag
=
=
flag << 1;
flag I 0;
else
{
if (flag
==
255)
binary-matrix[consider].sdirection.div-latency[repeat]
find sucess
breaE;
i;
1;
1
else
{
flag =O;
1
1
if (find-sucess != 1)
{
binary matrix[consider].sdirection.div
number-+ 1;
index = number + 1;
1
return;
number
+ 1;
-latency[repeat]=
/*
/*
/*
/*
/*
/*
FUNCTION NAME IT
*/
*/
*/
*/
*/
*/
s t r u c t
c o l l i s i o n
m a t r i x
mname it(present,coming,ineex,pre nutrepea%)
struct collision matrix present1c6ming[];
int ineex,repeatTpre-nu;
{
for(j=O;j<8;++j)
{
if (((present.smatrix.bits rowl[j] ==
c o m i n g [ i ] . s m a t r i x . b i t s - rowl-[j])
& &
( p r e s e n t . s m a t r i x . b i t s row2[j]
= =
coming[i].smatrix.bits
row2[j]))
& &
( p r e s e n t . s m a t r i x . b i t s-row3[j]
= =
coming[i].smatrix.bits -row3[j]))
{
1
else
{
1
1
if (flag
==
255)
binary-matrix[consider].sdirection.mult -latency[repeat]=
i;
find sucess = 1;
break;
1
else
{
flag =O;
1
if (find-sucess != 1)
{
number + 1;
return;
*/
/*
/*
/*
/*
/*
/*
FUNCTION NAME IT
*/
*/
*/
*/
*/
s t r u c t
c o l l i s i o n
m a t r i x
aname it(present,coming,ineex,pre nutrepea?)
struct collision matrix present,c~ming[];
int ineex,repeat,pre -nu;
{
for(j=O;j<8;++j)
{
if (((present.smatrix.bits rowl[j] ==
c o m i n g [ i ] . s m a t r i x . b i t s - rowl-[j])
& &
( p r e s e n t . s m a t r i x . b i t s
row2[j]
= =
coming[i].smatrix.bits
row2[j]))
& &
( p r e s e n t . s m a t r i x . b i t s-r o w 3 [ j ]
= =
coming[i].smatrix.bits -row3[j]))
{
flag
flag
=
=
flag << 1;
flag I 1;
flag
flag
=
=
flag << 1;
flag 1 0;
else
{
1
1
if (flag
==
255)
binary-matrix[consider].sdirection.add -latency[repeat]
find sucess = 1;
break;
1
else
{
flag =O;
if (find-sucess != 1)
i;
binary-matrix[number+l] = present;
binary-matrix[number+l].sident.name
binary matrix[consider].sdirection.add
number-+ 1;
index = number + 1;
number
+ 1;
-latency[repeat]=
return;
/*
/*
/*
/*
/*
/*
FUNCTION NAME IT
*/
*/
*/
*/
*/
*/
s t r u c t
c o l l i s i o n
m a t r i x
daname it (present,coming,ineex,pre nu,repeat)
struct-collision matrix present,coming[];
int ineex,repeat:pre
-nu;
{
flag
flag
flag << 1;
flag I 1;
1
else
{
1
if (flag
==
255)
binary-matrix[consider].sdirection.div -add[repeat]
find sucess = 1;
break;
1
else
{
i;
flag =O;
1
if (find-sucess != 1)
{
1;
/*
/*
/*
/*
/*
/*
FUNCTION NAME IT
*/
*/
*/
*/
*/
*/
s t r u c t
c o l l i s i o n
m a t r i x
maname it(present,coming,ineex,pre nutrepeat)
struct-collision-matrix present,coming[];
int ineex,repeat,pre-nu;
{
for(j=O;j<8;++j)
{
if (((present.smatrix.bits rowl[j] ==
c o m i n g [ i ] . s m a t r i x . b i t s - rowl-[j])
& &
(present.smatrix.bits
row2[j]
= =
coming[i].smatrix.bits -row2[j]))
& &
( p r e s e n t . s m a t r i x . b i t s -row3[j]
coming[i].smatrix.bits -row3[j]))
{
= =
else
{
flag
flag
=
=
flag << 1;
flag I 0;
if (flag
==
255)
binary-matrix[consider].sdirection.mult
find sucess = 1;
breaE;
-add[repeat]
i;
else
{
flag =O;
if (find-sucess != 1)
{
number
binary-matrix[consider].sdirection.mult -add[repeat]=
1;
index
number
+ 1;
number
1;
return ;
/*
/*
/*
/*
/*
/*
/*
*/
*/
*/
*/
*/
*/
*/
s t r u c t
c o l l i s i o n - m a t r i x
temp-struct,sec struc,third-struc;
int i,j,k,~,consider;
if (temp-struct.smatrix.bits-row1[j] == 0)
{
struc
sec-struc
-
set
1
sec-struc = temp-struct;
1
1
sec-struc = temp-struct;
struc
sec-struc
-
set
aname-it(sec-struc,put,in~consider,j);
sec-struc
temp-struct;
if((temp struct.smatrix.bits-rowl[j]== 0)
(temp-struct.smatrik.bits-row3 [ j] == 0))
{
&&
sec-struc = temp-struct;
}
for(j=O;j<8;++j)
{
&&
struc
sec-struc
-
set
maname-it(sec-struc,put,i~ex,consider,j);
sec-struc
temp-struct;
return;
1
main ( )
{
check-struc(binary matrix,index,pres-num);
pres-num = pres-nui + 1;
printf(
below \nu);
printf (!I\nw);
for (v=l; v <= index; v++)
{
for (1=0;1<8;++1)
{
ff,binary
-matrix[v].smatrix.bits
printf (
rowl[l]);
-
" % d \b
printf ("\nff)
;
printf (ff\n")
;
for (1=0;1<8;++1)
{
Iffbinary
-matrix[v].smatrix.bits
printf (If\nw)
;
printf (ff\nff)
;
printf (
row2[1]);
-
If%d \b
1
for (1=0;1<8;++1)
{
ff,binary
-matrix[v].smatrix.bits
printf (
row3[1]);
-
" % d \b
printf ("\nff)
;
printf iff\nff)
;
printf (:*\riff) ;
for (1=0;1<8;++1)
{
printf ("\nff)
;
printf (If \nw) ;
for (1=0;1<8;++1)
{
printf (
",binary-matrix[v].sdirection.mult -latency[l]);
printf (If\nff)
;
printf ( fl\nff)
;
If%d \b
for (1=0;1<8;++I)
{
printf (
",binary-matrix[v].sdirection.add -latency[l]);
If%d \b
1
printf ("\nW);
printf ("\nI1);
for (1=0;1<8;++1)
{
printf (
",binary-matrix[v].sdirection.div~add[l]);
I1%d \ b
1
printf (l1\n");
printf ("\nn);
for (1=0;1<8;++1)
{
printf (
lt,binary-matrix[v].sdirection.mult -add[l]);
I1%d \ b
1
printf ("\nu);
printf ("\nV1)
;
printf (
binary matrix[v].sident.name);
printf7l1\n1l)
;
printf ("\nW);
printf ("\nn);
printf (If\n");
printf ("\nut)
;
printf (tt\nN)
;
I1%d \ b
I f I
APPENDIX C.
Simulation Program
/************************************************/
/***** SIMULATION OF DYNAMIC ARITHMETIC     *****/
/*****              PIPELINE                *****/
/*****             VERSION 1.0              *****/
/************************************************/
/* In this program the eighth bit is stored in 0 */
/* and the first bit is stored in 8              */
#include <stdio.h>
#include <math.h>
#define true 1
#define false 0
struct input_inst
{
    int opcode_field;
    int source_operand1;
    int source_operand2;
    int dest_operand;
    int valid;
};
struct instruction_status
{
    int inst_num;
    int pipe_cyl;
    int opcode;
    int exec_time;
    int reg_util[5];
    int count_units[5];
    int decode_ptr;
    int issue_ptr;
};
struct reg_file
{
    int reg_units[5];
};
struct status_reg
{
    int carry;
    int overflow;
    int sign;
    int zero;
};
struct address_counter
{
    int counter[20];
    int free_index;
};
struct issue_latch
{
    int opcode_fld;
    int dest_fld;
    int source1_fld;
    int src1data_fld;
    int src1delay_fld;
    int source2_fld;
    int src2data_fld;
    int src2delay_fld;
    int instdelay_fld;
};
struct dstack_status
{
    int queue_select;
    int full_queue;
    int flush_flag;
    int top_stack;
    int bottom_stack;
};
struct fetch_status
{
    int flush_flag;
    int address_flag;
    int picqueue_full;
    int eacqueue_full;
};
struct matrix
{
    int bits_row1[8];
    int bits_row2[8];
    int bits_row3[8];
};
struct direction
{
    int div_latency[8];
    int mult_latency[8];
    int add_latency[8];
    int div_add[8];
    int mult_add[8];
};
struct ident
{
    int name;
};
struct collision_matrix
{
    struct matrix smatrix;
    struct direction sdirection;
    struct ident sident;
};
struct recode
{
    int bits[15];
};
struct reg_stages
{
    int word[17];
};
struct div_track
{
};
struct add_track
{
};
struct logg_sheet
{
    int logg[10];
    int logg_stat;
};
struct input_process
{
    int location;
    int func;
    int num_one[10];
    int num_two[10];
    int over_flow;
    int weight;
};
struct itr_storage
{
    int address;
    int func;
    int num_one[10];
    int num_two[10];
};
struct output_process
{
    int destination;
    int overflow;
    int result[17];
    int wt_factor;
};
typedef
typedef
typedef
typedef
typedef
typedef
typedef
typedef
typedef
typedef
struct
struct
struct
struct
struct
struct
struct
struct
struct
struct
typedef
typedef
typedef
typedef
typedef
typedef
typedef
typedef
struct
struct
struct
struct
struct
struct
struct
struct
*/
-s
tatus,
int queue_select, current_queue, disable_decode, disable_issue;
struct collision_matrix binary_matrix[150];
struct collision_matrix for_present, upto_next, last_temp;
struct1 argument1[20], argument2[20], multipurpose_reg[20],
        *mpreg_ptr;
struct2 latches[30], par_product[10], transfer[30], delay[20];
struct3 div_follow[10], delta_track[10], *divflow_ptr,
        *deltaflow_ptr;
struct3 *copy_seven, *copy_eight;
struct4 mult_follow[10], *multflow_ptr, *copy_nine;
struct5 add_follow[10], *addflow_ptr, *copy_ten,
        sub_follow[10], *subflow_ptr;
struct6 div_logg, mult_logg, add_logg, process_logg[10],
        *prlogg_ptr;
struct7 input_stack[41], *instack_ptr, *copy_eleven;
struct8 output_stack[41], *outstack_ptr, *copy_twelve;
struct9 priority_stack[70], *prstack_ptr;
struct0 *bin_pointer;
struct1 *arg1_pointer, *arg2_pointer, *copy_one, *copy_two;
struct2 *par_pointer, *lat_pointer, *copy_four,
        *copy_three, *trans_pointer, *copy_five;
struct2 *delay_ptr, *copy_six;
int op_code[20], arg_one[20][9], arg_two[20][9];
int *ptr_op, *ptr_argmnt1[20], *ptr_argmnt2[20];
int index, pres_num, stk_ptr, total, multiplication, division;
int var1, var2, var3, var4, init_key, addition, subtraction,
    delta_flag;
int global_one[20], global_two[20],
    global_three[20], readjust;
/*
/*
/*
/*
*/
*/
*/
*/
*/
structi4 *ptr3,*ptr4;
counters */
structi6 *ptr5, *ptr6 ;
current queue in session
/*
/*
*/
/*
== 1)
/*
*/
(ptr4)->counter[i]
0;
/*
*/
(ptr4)->counter[0] = 0;
for (i=1;i<=9;i++)
{
    (ptr3)->counter[i] = 0;
}
(ptr3)->free_index = 1; /* setting the index
flag of counter1 to 1 to indicate that the
counters are flushed and that the counter to be filled
first is counter[1] */
}
}
/ * reading the memory for instructions */
/* instructions will be fetched if the program counter
*/
program_counter1 = (ptr3)->counter[0];
program_counter2 = (ptr4)->counter[0];
if ((program_counter1 != 0) && ((ptr5)->picqueue_full == 0))
{
    (ptr2+1)->opcode_field = (ptr1+program_counter1)->opcode_field;
    (ptr2+1)->source_operand1 = (ptr1+program_counter1)->source_operand1;
    (ptr2+1)->source_operand2 = (ptr1+program_counter1)->source_operand2;
    (ptr2+1)->dest_operand = (ptr1+program_counter1)->dest_operand;
    transfer_flag1 = 1; /* valid instruction; pass it to the decode unit */
    (ptr3)->counter[0] += 1;
    (ptr2+1)->valid = 1;
}
if ((program_counter2 != 0) && ((ptr5)->eacqueue_full == 0))
{
    (ptr2+2)->opcode_field = (ptr1+program_counter2)->opcode_field;
    (ptr2+2)->source_operand1 = (ptr1+program_counter2)->source_operand1;
    (ptr2+2)->source_operand2 = (ptr1+program_counter2)->source_operand2;
    (ptr2+2)->dest_operand = (ptr1+program_counter2)->dest_operand;
    transfer_flag2 = 1; /* valid instruction; pass it to the decode unit */
    (ptr4)->counter[0] += 1;
    (ptr2+2)->valid = 1;
}
*/
!=
/*
/*
*/
&&
(transfer-flag1
*/
/*
if (((ptr2+2)->opcode_field >= 13) && (transfer_flag2 != 0))
{
    printf("there is a branch instruction detected in the EAC stream\n");
    (ptr3)->counter[(ptr3)->free_index] = (ptr2+2)->dest_operand;
    (ptr3)->free_index += 1;
p r i n t f ( I t t h e i n s t r u c t i o n f e t c h e d from memory f o r P I C
s t r e a m \ntl ) ;
p r i n t f ( " o p c o d e o f p t r 2 + 1 i s
\nl1, ( p t r 2 + 1 )->opcode-f i e l d ) ;
p r i n t f (llsource operand1 of p t r 2 + 1
\nw , ( p t r 2 + l )- > s o u r c e-o p e r c n d l ) ;
p r i n t f (llsource operand2 of p t r 2 + 1
\nw, ( p t r 2 + 1 )->source operand2) ;
p r i n t f (llde-st-operand
o f p t r 2 + 1
% d
%d
%d
% d
% d
%d
%d
% d
printf("the program counters are listed below \n");
for (i=0;i<=9;i++)
{
    printf("the value of counter %d of PIC stream is %d \n", i, (ptr3)->counter[i]);
}
for (i=0;i<=9;i++)
{
    printf("the value of counter %d of EAC stream is %d \n", i, (ptr4)->counter[i]);
}
*/
*/
/*
/*
/*
/*
/*
/*
*/
*/
*/
*/
(ptr3+(ptr3)->decode-ptr)->reg-util [i]= 3 ;
)
switch((ptr1+bottom_stack1)->source_operand1)
{
case 1:
    (ptr3+(ptr3)->decode_ptr)->reg_util[1] = 1;
    break;
case 2:
    (ptr3+(ptr3)->decode_ptr)->reg_util[2] = 1;
    break;
case 3:
    (ptr3+(ptr3)->decode_ptr)->reg_util[3] = 1;
    break;
case 4:
    (ptr3+(ptr3)->decode_ptr)->reg_util[4] = 1;
    break;
case 5:
    (ptr3+(ptr3)->decode_ptr)->reg_util[5] = 1;
    break;
}
switch((ptr1+bottom_stack1)->source_operand2)
{
case 1:
    (ptr3+(ptr3)->decode_ptr)->reg_util[1] = 1;
    break;
case 2:
    (ptr3+(ptr3)->decode_ptr)->reg_util[2] = 1;
    break;
}
}
    (ptr3+(ptr3)->decode_ptr)->reg_util[1] = 1;
    break;
case 2:
    (ptr3+(ptr3)->decode_ptr)->reg_util[2] = 1;
    break;
case 3:
    (ptr3+(ptr3)->decode_ptr)->reg_util[3] = 1;
    break;
case 4:
    (ptr3+(ptr3)->decode_ptr)->reg_util[4] = 1;
    break;
case 5:
    (ptr3+(ptr3)->decode_ptr)->reg_util[5] = 1;
    break;
switch((ptr2+bottom_stack2)->source_operand2)
{
case 1:
    (ptr3+(ptr3)->decode_ptr)->reg_util[1] = 1;
    break;
case 2:
    (ptr3+(ptr3)->decode_ptr)->reg_util[2] = 1;
    break;
case 3:
    (ptr3+(ptr3)->decode_ptr)->reg_util[3] = 1;
    break;
case 4:
    (ptr3+(ptr3)->decode_ptr)->reg_util[4] = 1;
    break;
case 5:
    (ptr3+(ptr3)->decode_ptr)->reg_util[5] = 1;
    break;
}
case 1:
    (ptr3+(ptr3)->decode_ptr)->reg_util[1] = 0;
    break;
case 2:
    (ptr3+(ptr3)->decode_ptr)->reg_util[2] = 0;
    break;
case 3:
    (ptr3+(ptr3)->decode_ptr)->reg_util[3] = 0;
    break;
case 4:
    (ptr3+(ptr3)->decode_ptr)->reg_util[4] = 0;
    break;
case 5:
    (ptr3+(ptr3)->decode_ptr)->reg_util[5] = 0;
    break;
}
/* Decode unit */
structi0
decode_unit(ptr1,ptr2,ptr3,ptr4,ptr5,ptr6,ptr7,ptr8,ptr9)
int i,j,k,l;
int top_stack1,bottom_stack1,top_stack2,bottom_stack2;
top_stack1 = (ptr5)->top_stack;
bottom_stack1 = (ptr5)->bottom_stack;
top_stack2 = (ptr6)->top_stack;
bottom_stack2 = (ptr6)->bottom_stack;
/*
*/
if (((ptr5)->queue_select == 1) && ((ptr5)->full_queue != 1))
{
    /* instruction is valid */
    (ptr2+top_stack1)->opcode_field = (ptr1+1)->opcode_field;
    (ptr2+top_stack1)->source_operand1 = (ptr1+1)->source_operand1;
    (ptr2+top_stack1)->source_operand2 = (ptr1+1)->source_operand2;
    (ptr2+top_stack1)->dest_operand = (ptr1+1)->dest_operand;
    (ptr5)->top_stack += 1;
}
if (((ptr6)->queue_select == 1) && ((ptr6)->full_queue != 1))
{
    /* instruction is valid */
    (ptr3+top_stack2)->opcode_field = (ptr1+2)->opcode_field;
    (ptr3+top_stack2)->source_operand1 = (ptr1+2)->source_operand1;
    (ptr3+top_stack2)->source_operand2 = (ptr1+2)->source_operand2;
    (ptr3+top_stack2)->dest_operand = (ptr1+2)->dest_operand;
    (ptr6)->top_stack += 1;
}
/*
if (current-queue
==
*/
1)
switch((ptr2+bottom-stack1)->opcode-field)
{
case 1:
/*
*/
(ptr4+(ptr4)->decode ptr)->opcode = 1;
(ptr4+(ptr4)->decode-ptr)
->exec-time = 3 ;
load-isunitl(ptr2,ptr3,ptr4,ptr5,ptr6,bottom -stackl);
break;
case 2:
/*
*/
-stackl);
case 3:
/*
*/
=
=
=
=
case 4:
    (ptr4+(ptr4)->decode_ptr)->opcode = 4;
    (ptr4+(ptr4)->decode_ptr)->exec_time = 23;
    load_isunit1(ptr2,ptr3,ptr4,ptr5,ptr6,bottom_stack1);
    break;
case 5:
    (ptr4+(ptr4)->decode_ptr)->opcode = 5;
    (ptr4+(ptr4)->decode_ptr)->exec_time = 6;
    load_isunit1(ptr2,ptr3,ptr4,ptr5,ptr6,bottom_stack1);
    break;
case 6:
    (ptr4+(ptr4)->decode_ptr)->opcode = 6;
    (ptr4+(ptr4)->decode_ptr)->exec_time = 6;
    load_isunit1(ptr2,ptr3,ptr4,ptr5,ptr6,bottom_stack1);
    break;
case 7:
    (ptr4+(ptr4)->decode_ptr)->opcode = 7;
    (ptr4+(ptr4)->decode_ptr)->exec_time = 3;
    load_isunit1(ptr2,ptr3,ptr4,ptr5,ptr6,bottom_stack1);
    break;
case 8:
    (ptr4+(ptr4)->decode_ptr)->opcode = 8;
    (ptr4+(ptr4)->decode_ptr)->exec_time = 3;
    load_isunit1(ptr2,ptr3,ptr4,ptr5,ptr6,bottom_stack1);
    break;
case 9:
    (ptr4+(ptr4)->decode_ptr)->opcode = 9;
    (ptr4+(ptr4)->decode_ptr)->exec_time = 3;
    load_isunit1(ptr2,ptr3,ptr4,ptr5,ptr6,bottom_stack1);
    break;
case 10:
/*
*/
/*
*/
/*
*/
(ptr4+(ptr4)->decode ptr)->opcode = 12 ;
(ptr4+(ptr4)->decode-~tr)
->exec-time = 3 ;
load-isunitl(ptr2,ptr3,ptr4,ptr5,ptr6,bottom -stackl);
break;
case 13:
/*
*/
/*
*/
(ptr4+(ptr4)->decode ptr)->opcode = 14 ;
(ptr4+(ptr4)->decode7ptr)
->exec-time = 3 ;
load-isuniti(ptr2,ptr3,ptr4,ptr5,ptr6,bottom -stackl);
break;
case 15:
/*
*/
(ptr4+(ptr4)->decode-ptr)->exec-time
3;
load-is~nitl(ptr2,ptr3,ptr4~ptr5,ptr6~bottom
-stackl);
break;
/*
*/
if(disab1e decode != 1)
{
( p t r 7 + 3 ) - > o p c o d e - f i e l d
(ptr2+bottom-stack1)->opcode-field;
(ptr7+3)->source-operand1
(ptr2+bottom-stack1)->source-operandl;
(ptr7+3)->source-operand2
(ptr2+bottom stack1)->source operand2;
(ptr7+3)->dest
- o p e r a n d
(ptr2+bottom-stack1)->dest-operand;
/*
=
=
=
=
*/
(ptr2+(i-1))->opcode-field
(ptr2+i)->opcode-field;
(ptr2+(i-1))->source-operand1
(ptr2+i)->source-operandl;
(ptr2+(i-1))->source - operand2
(ptr2+i)->source-operand2 ;
(ptr2+(i-1))->dest -operand
(ptr2+i)->dest-operand;
(ptr5)->top-stack-=l ;
1
}
if(current-queue
{
==
2)
switch((ptr3+bottom -stack2)->opcode-field)
{
case 1:
/*
*/
/*
*/
=
=
=
=
(ptr4+(ptr4)->decode-ptr)->opcode = 2;
(ptr4+(ptr4)->decode-ptr) ->exec-time = 3 ;
load-isunit2(ptr2,ptr3,ptr4,ptr5,ptr6,bottom -stack2);
break;
case 3:
/*
*/
load-isunit2(ptr2,ptr3,ptr4,ptr5,ptr6,bottom -stack2);
break;
case 4:
/*
*/
/*
*/
load-isunit2(ptr2,ptr3,ptr4,ptr5,ptr6,bottom
break;
-stack2);
case 6:
/*
*/
/*
(ptr4+(ptr4)->decode ptr)->opcode = 7;
(ptr4+(ptr4)->decode-ptr)
->exec-time = 3 ;
load-i~unit2(ptr2,ptr3,ptr4~ptr5~ptr6,bottom
stack2);
break;
case 8:
/*
*/
/*
*/
(ptr4+(ptr4)->decode ptr)->opcode = 9 ;
(ptr4+(ptr4)->decodeptr)
->exec-time = 3 ;
load-isunit2(ptr2,ptr3,ptr4,ptr5,ptr6,bottom -stack2);
break;
case 10:
/*
*/
/*
*/
/*
*/
(ptr4+(ptr4)->decode ptr)->opcode = 12 ;
(ptr4+(ptr4)->decode-ptr)
->exec-time = 3 ;
load-i~unit2(ptr2,ptr3,ptr4~ptr5~ptr6~bottom
-stack2);
break;
case 13:
/*
*/
break;
case 14:
/*
*/
(ptr4+(ptr4)->decode-ptr) -Bopcode = 14 ;
(ptr4+(ptr4)->decode-ptr) ->exec-time = 3 ;
load-isunit2(ptr2,ptr3,ptr4,ptr5,ptr6,bottom -stack2);
break;
case 15:
/*
*/
/*
*/
if(disab1e-decode != 1)
{
( p t r 7 + 4 ) - > o p c o d e - f i e l d
(ptr3+bottom-stack1)->opcode-field;
(ptr7+4)->source-operand1
(ptr3+bottom-stack1)->source-operandl;
(ptr7+4)->source_operand2
(ptr3+bottom-stack1)->source operand2;
(ptr7+4)->dest
-operand
(ptr3+bottom-stack1)->dest-operand;
/*
=
=
*/
(ptr3+(i-1))->opcode- field
(ptr3+i)->opcode-field;
(ptr3+(i-1))->source - operand1
(ptr3+i)->source-operandl;
(ptr3+ (i-1)) ->source - operand2
(ptr3+i)->source-operand2;
(ptr3+(i-1))->dest -o p e r a n d
(ptr3+i)->dest
-operand;
(ptr6)->top-stack-=l;
1
1
(ptr4)->decode ptr+=l;
if ((ptr4)->decode-ptr
{
==
20)
=
=
=
=
(ptr4)->decode-ptr
0;
return (*ptr7);
1
/* Issue unit */
structi5 issue_unit(ptr1,ptr2,ptr3,ptr4,ptr5,ptr6,ptr7)
structi0 *ptr1; /* pointers to the latches */
structi1 *ptr2;
structi6 *ptr3,*ptr4; /* decode stack pointers */
structi5 *ptr7;
{
int i,j,k,l;
int issue_pointer, dest_ptr, src1_ptr, src2_ptr;
int temp1,temp2,temp3,temp4,raw_delay,waw_delay,inst_delay;
issue_pointer = (ptr2)->issue_ptr;
if((current_queue == 1) && (disable_issue != 1))
{
temp1 = (ptr2+issue_pointer)->count_units[(ptr1+3)->dest_operand];
temp2 = (ptr2+issue_pointer)->count_units[(ptr1+3)->source_operand1];
temp3 = (ptr2+issue_pointer)->count_units[(ptr1+3)->source_operand2];
temp4 = (ptr2+issue_pointer)->exec_time;
/*
if ( (temp2
==
0)
&&
(temp3
raw-delay
0;
raw-delay
temp2;
raw-delay
temp3;
*/
==
0))
raw-delay = temp2;
1
else
{
/*
raw-delay = temp3;
if (templ
== 0)
waw-delay
1
if((temp1 != 0)
{
1
else
{
waw-delay
raw-delay
&&
1;
templ + 2 ;
waw-delay = templ-temp4+3;
/*
1
else
inst-delay
raw delay + 1;
inst-delay
waw-delay + 1;
{
}
/*
register
*/
(ptr2+issue_pointer)->count_units[(ptr1+3)->dest_operand] = inst_delay + temp4 - 1;
issue_pointer += 1;
(ptr2)->issue_ptr = issue_pointer;
if((ptr2)->issue_ptr == 20)
{
    (ptr2)->issue_ptr = 0;
}
if((current_queue == 2) && (disable_issue != 1))
{
temp1 = (ptr2+issue_pointer)->count_units[(ptr1+4)->dest_operand];
temp2 = (ptr2+issue_pointer)->count_units[(ptr1+4)->source_operand1];
temp3 = (ptr2+issue_pointer)->count_units[(ptr1+4)->source_operand2];
temp4 = (ptr2+issue_pointer)->exec_time;
/*
if ( (temp2
==
0)
*/
(temp3
&&
==
0))
raw-delay
1
if ( (temp2 > 0)
0;
(temp3
&&
==
0))
raw-delay
temp2;
raw-delay
temp3;
1
if ((temp2 > 0)
&&
raw-delay
temp2;
else
{
raw-delay = temp3;
/*
if (templ
{
==
0)
waw-delay
raw-delay + 1;
if((temp1 != 0)
&&
waw-delay
}
else
{
temp1
+ 2;
waw-delay
templ-temp4+3;
/*
inst-delay
raw-delay + 1;
1
else
{
inst-delay = waw-delay + 1;
register
/*
*/
(ptr2+issue_pointer)->count_units[(ptr1+3)->dest_operand] = inst_delay + temp4 - 1;
issue_pointer += 1;
(ptr2)->issue_ptr = issue_pointer;
if((ptr2)->issue_ptr == 20)
1
return (*ptr7);
}
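The delay bookkeeping of the issue unit can be illustrated with a small stand-alone sketch. It is not the thesis code: the helper names `raw_delay_of` and `advance_ptr` and the constant `STACK_DEPTH` are hypothetical, and the rule that the larger of the two source-operand busy counts governs the read-after-write delay is an assumption, since the corresponding conditions are not legible in the listing. The wrap-around at 20 entries follows the listing.

```c
#include <assert.h>

#define STACK_DEPTH 20 /* issue and decode pointers wrap at 20 in the listing */

/* Hypothetical helper: read-after-write delay from the busy counts of the
   two source registers. Assumption: the larger count governs when both
   registers are still busy; a count of 0 means the register is free. */
int raw_delay_of(int src1_busy, int src2_busy)
{
    assert(src1_busy >= 0 && src2_busy >= 0);
    return (src1_busy > src2_busy) ? src1_busy : src2_busy;
}

/* Advance a circular stack pointer, resetting it to 0 at STACK_DEPTH. */
int advance_ptr(int ptr)
{
    ptr += 1;
    if (ptr == STACK_DEPTH)
        ptr = 0;
    return ptr;
}
```

With both sources free the delay is 0, and a pointer at entry 19 wraps back to entry 0, which matches the reset of `issue_ptr` and `decode_ptr` in the listing.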
/* Function Initializations */
void initialize(num1,num2)
struct7 *num1;
struct8 *num2;
{
int i,j,k,l;
for(i=1;i<=40;i++)
{
(num1+i)->location = i;
(num1+i)->func = 0;
(num2+i)->destination = i;
/* Function Re-Initializations */
struct7 *temp;
int i,j,k,l;
temp = num1;
l = 1;
for(i=1;i<=40;i++)
{
if ( (numl+i)->func
{
== 5)
(numl+i)->location
1 = 1+ 1;
1
else if((numl+i)->func
{
1;
==
4)
(numl+i)->location
1;
1
else
1
numl = temp;
return (*numl);
1
/* Function stage 1 */
struct2 stage_one(number_one,number_two,num_three,num_pass1,num_pass2)
struct1 *number_one,*number_two;
struct2 *num_three;
int num_pass1,num_pass2;
{
int i,j,k,l;
struct1 *p1,*p2;
struct2 *p3;
p1 = number_one;
p2 = number_two;
p3 = num_three;
for (i=0;i<8;++i)
{
(num_three += i)->word[i+j] =
((number_one += num_pass1)->bits[j]) * ((number_two += num_pass2)->bits[i]);
number_one = p1;
number_two = p2;
num_three = p3;
}
}
printf(" \n");
printf(" the partial products calculated in function stage1 are as follows \n");
printf(" \n");
printf(" \n");
printf(" \n");
for(i=0;i<8;++i)
{
printf(" %d ",(num_three += i)->word[j]);
num_three = p3;
}
printf("\n");
printf("\n");
}
return (*num_three);
}
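The partial-product step of stage one can be sketched in isolation. The following is a minimal illustration rather than the thesis routine: it uses plain arrays instead of the `struct1`/`struct2` records, and the names `partial_products`, `rows_value` and `NBITS` are hypothetical. It keeps the indexing visible in the listing, where bit `j` of one operand times bit `i` of the other lands in position `i+j` of row `i`.

```c
#include <assert.h>

#define NBITS 8

/* Hypothetical helper: row i of the partial-product array holds the
   multiplicand ANDed with multiplier bit i, shifted left by i places.
   Bits are stored least significant first, one bit per int. */
void partial_products(const int a[NBITS], const int b[NBITS],
                      int rows[NBITS][2 * NBITS])
{
    int i, j;
    for (i = 0; i < NBITS; ++i) {
        assert((b[i] & ~1) == 0);           /* operands must be single bits */
        for (j = 0; j < 2 * NBITS; ++j)
            rows[i][j] = 0;
        for (j = 0; j < NBITS; ++j)
            rows[i][i + j] = a[j] * b[i];   /* same indexing as word[i+j] */
    }
}

/* Add up the weighted rows; used here only to check the array. */
unsigned long rows_value(int rows[NBITS][2 * NBITS])
{
    unsigned long total = 0;
    int i, j;
    for (i = 0; i < NBITS; ++i)
        for (j = 0; j < 2 * NBITS; ++j)
            total += (unsigned long)rows[i][j] << j;
    return total;
}
```

Summing the eight rows with their bit weights gives the ordinary product, which is the quantity the carry-save stages that follow reduce to a sum vector and a carry vector.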
/* Function Stage 2 */
struct2 stage_two(num1,num2,num3,num4,num5,num6,num7,num8,num9,num10)
struct2 *num1,*num2,*num3,*num4,*num5,*num6,*num7,*num8,*num9,*num10;
{
int i,j,k,l,nega,negb,negc;
struct2 *p1,*p2,*p3,*p4,*p5,*p6,*p7,*p8,*p9,*p10;
p1 = num1;
p2 = num2;
p3 = num3;
p4 = num4;
p5 = num5;
p6 = num6;
p7 = num7;
p8 = num8;
p9 = num9;
p10 = num10;
/* realization of csa unit number one */
for(i=0;i<16;++i)
{
nega = 0;
negb = 0;
negc = 0;
if((num1)->word[i] == 0)
{
    nega = 1;
}
if((num2)->word[i] == 0)
{
    negb = 1;
}
if((num3)->word[i] == 0)
{
    negc = 1;
}
/* realization of csa unit number two */
for(i=0;i<16;++i)
{
nega = 0;
negb = 0;
negc = 0;
if((num4)->word[i] == 0)
{
    nega = 1;
}
if((num5)->word[i] == 0)
{
    negb = 1;
}
if((num6)->word[i] == 0)
{
    negc = 1;
}
/* Function Stage 3 */
struct2 stage_three(num1,num2,num3,num4,num5,num6,num7,num8,num9,num10)
struct2 *num1,*num2,*num3,*num4,*num5,*num6,*num7,*num8,*num9,*num10;
{
int i,j,k,l,nega,negb,negc;
struct2 *p1,*p2,*p3,*p4,*p5,*p6,*p7,*p8,*p9,*p10;
p1 = num1;
p2 = num2;
p3 = num3;
p4 = num4;
p5 = num5;
p6 = num6;
p7 = num7;
p8 = num8;
p9 = num9;
p10 = num10;
/* realization of csa unit number three */
for(i=0;i<16;++i)
{
    nega = 0;
    negb = 0;
    negc = 0;
    if((num1)->word[i] == 0)
    {
        nega = 1;
    }
    if((num2)->word[i] == 0)
    {
        negb = 1;
    }
    if((num3)->word[i] == 0)
    {
        negc = 1;
    }
    num7->word[i] = ((((num1->word[i]*negb*negc)
    | (num2->word[i]*nega*negc)) | (num3->word[i]*nega*negb)) |
    (num1->word[i]*num2->word[i]*num3->word[i]));
    num8->word[i+1] = ((num1->word[i]*num2->word[i])
    | (num3->word[i]*num1->word[i])
    | (num2->word[i]*num3->word[i]));
    nega = 0;
    negb = 0;
    negc = 0;
    if((num4)->word[i] == 0)
    {
        nega = 1;
    }
    if((num5)->word[i] == 0)
    {
        negb = 1;
    }
    if((num6)->word[i] == 0)
    {
        negc = 1;
    }
    num9->word[i] = ((((num4->word[i]*negb*negc)
    | (num5->word[i]*nega*negc)) | (num6->word[i]*nega*negb)) |
    (num4->word[i]*num5->word[i]*num6->word[i]));
    num10->word[i+1] = ((num4->word[i]*num5->word[i])
    | (num6->word[i]*num4->word[i])
    | (num5->word[i]*num6->word[i]));
}
return(*num7,*num8,*num9,*num10);
}
/* Function Stage 4 */
int i,j,k,l,nega,negb,negc;
struct2 *p1,*p2,*p3,*p4,*p5;
p1 = num1;
p2 = num2;
p3 = num3;
p4 = num4;
p5 = num5;
/* realization of csa unit number five */
for(i=0;i<16;++i)
{
    nega = 0;
    negb = 0;
    negc = 0;
    if((num1)->word[i] == 0)
    {
        nega = 1;
    }
    if((num2)->word[i] == 0)
    {
        negb = 1;
    }
    if((num3)->word[i] == 0)
    {
        negc = 1;
    }
    num4->word[i] = ((((num1->word[i]*negb*negc)
    | (num2->word[i]*nega*negc)) | (num3->word[i]*nega*negb)) |
    (num1->word[i]*num2->word[i]*num3->word[i]));
    num5->word[i+1] = ((num1->word[i]*num2->word[i])
    | (num3->word[i]*num1->word[i])
    | (num2->word[i]*num3->word[i]));
}
return(*num4,*num5);
}
/* Function Stage 5 */
int i,j,k,l,nega,negb,negc;
struct2 *p1,*p2,*p3,*p4,*p5;
p1 = num1;
p2 = num2;
p3 = num3;
p4 = num4;
p5 = num5;
/* realization of csa unit number six */
for(i=0;i<16;++i)
{
    nega = 0;
    negb = 0;
    negc = 0;
    if((num1)->word[i] == 0)
    {
        nega = 1;
    }
    if((num2)->word[i] == 0)
    {
        negb = 1;
    }
    if((num3)->word[i] == 0)
    {
        negc = 1;
    }
    num4->word[i] = ((((num1->word[i]*negb*negc)
    | (num2->word[i]*nega*negc)) | (num3->word[i]*nega*negb)) |
    (num1->word[i]*num2->word[i]*num3->word[i]));
    num5->word[i+1] = ((num1->word[i]*num2->word[i])
    | (num3->word[i]*num1->word[i])
    | (num2->word[i]*num3->word[i]));
}
return(*num4,*num5);
}
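Each CSA unit in stages two to five performs the same 3-to-2 reduction, which the sketch below isolates. It is a simplified illustration, not the thesis routine: it uses plain arrays, the names `csa`, `vec_value` and `WIDTH` are hypothetical, and the sum bit is written with XOR, which for single-bit inputs is equivalent to the sum-of-products expression built from `nega`, `negb` and `negc` in the listing. Inputs are assumed to occupy bits 0 to 15 only.

```c
#include <assert.h>

#define WIDTH 17 /* bit positions 0..16, matching word[i+1] for i < 16 */

/* One carry-save adder level: three bit vectors in, a sum vector and a
   carry vector out. The carry is stored one position to the left. */
void csa(const int x[WIDTH], const int y[WIDTH], const int z[WIDTH],
         int sum[WIDTH], int carry[WIDTH])
{
    int i;
    carry[0] = 0;
    for (i = 0; i < WIDTH - 1; ++i) {
        assert(((x[i] | y[i] | z[i]) & ~1) == 0); /* single bits only */
        sum[i] = x[i] ^ y[i] ^ z[i];
        carry[i + 1] = (x[i] & y[i]) | (y[i] & z[i]) | (z[i] & x[i]);
    }
    sum[WIDTH - 1] = 0;
}

/* Value of a bit vector stored least significant bit first. */
unsigned long vec_value(const int v[WIDTH])
{
    unsigned long total = 0;
    int i;
    for (i = 0; i < WIDTH; ++i)
        total += (unsigned long)v[i] << i;
    return total;
}
```

The defining property is that the sum and carry vectors together carry the same value as the three inputs, so no carry has to propagate along the word until the final adder stage.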
/* Function Stage 6 */
int i,j,k,l,nega,negb,negc,carry[17];
struct2 *p1,*p2,*p3;
struct1 *p4;
p1 = num1;
p2 = num2;
p3 = num3;
p4 = num4;
carry[0] = 0;
/* here the distinction is being made between add & sub and the rest */
if((addition == 1) || (subtraction == 1))
{
    if(addition == 1)
    {
        (num1)->word[7-i] = (p4+1)->bits[i];
        (num2)->word[7-i] = (p4+2)->bits[i];
        addition = 0;
        for(j=8;j<=16;j++)
        {
        }
    }
    if(subtraction == 1)
    {
        (num1)->word[7-i] = (p4+1)->bits[i];
        /* inverting the operand */
        if((p4+2)->bits[i] == 1)
        {
            (p4+2)->bits[i] = 0;
        }
        else
        {
            (p4+2)->bits[i] = 1;
        }
        carry[0] = 1;
        subtraction = 0;
        for(j=8;j<=16;j++)
        {
        }
    }
}
printf("\n");
printf("\n");
printf(" the following is the entered numbers\n");
printf("\n");
printf("\n");
printf(" the value of num1 loaded is (16 - 0)\n");
for(j=0;j<=16;j++)
{
printf("\n");
printf("\n");
printf(" the value of num2 loaded is (16 - 0)\n");
for(j=0;j<=16;j++)
{
printf("\n");
printf("\n");
nega = 0;
negb = 0;
negc = 0;
if((num1)->word[i] == 0)
{
    nega = 1;
}
if((num2)->word[i] == 0)
{
    negb = 1;
}
if(carry[i] == 0)
{
    negc = 1;
}
num3->word[i] = ((((num1->word[i]*negb*negc)
| (num2->word[i]*nega*negc)) | (carry[i]*nega*negb)) |
(num1->word[i]*num2->word[i]*carry[i]));
carry[i+1] = ((num1->word[i]*num2->word[i])
| (carry[i]*num1->word[i]) | (num2->word[i]*carry[i]));
}
return (*num3);
}
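The final adder stage doubles as a subtractor by inverting the second operand and forcing the initial carry to 1, as the `carry[0] = 1` assignment in the listing indicates. The sketch below shows that technique on plain 8-bit arrays; it is an illustration under stated assumptions, and the name `add_sub` and the constant `NBITS` are hypothetical. Unlike the listing, it inverts a local copy of each bit rather than overwriting the operand register.

```c
#include <assert.h>

#define NBITS 8

/* Ripple-carry addition of two NBITS-bit vectors (least significant bit
   first). With subtract != 0 the second operand is inverted and the
   initial carry is 1, which adds its two's complement. Returns the
   carry out of the top bit. */
int add_sub(const int a[NBITS], const int b[NBITS], int subtract,
            int result[NBITS])
{
    int i;
    int carry = subtract ? 1 : 0;
    for (i = 0; i < NBITS; ++i) {
        int bi = subtract ? !b[i] : b[i];
        assert((a[i] & ~1) == 0 && (b[i] & ~1) == 0); /* single bits only */
        result[i] = a[i] ^ bi ^ carry;
        carry = (a[i] & bi) | (carry & a[i]) | (bi & carry);
    }
    return carry;
}
```

For a subtraction whose result is non-negative the carry out is 1, so a caller can use it to tell whether the minuend was at least as large as the subtrahend.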
/*
/*
/*
/*
/*
*/
*/
*/
*/
*/
int i,j,k;
struct2 *p1,*p2,*p3,*p4;
p1 = num1;
p2 = num2;
p3 = num3;
p4 = num4;
for (i=0;i<17;i++)
{
    num3->word[i] = num1->word[i];
    num4->word[i] = num2->word[i];
}
return (*num3,*num4);
}
/*
/*
/*
/*
/*
*/
*/
*/
*/
int i,j,k;
struct2 *p1,*p2;
p1 = num1;
p2 = num2;
for (i=0;i<17;i++)
{
    num2->word[i] = num1->word[i];
}
return (*num2);
}
/*
/*
*/
void pipeline()
{
struct2 *pone,*ptwo,*pthree,*pfour,*pfive,*psix,*pseven,*peight;
struct2 *pnine,*pten;
int one,two,i,j,k,l,m,v;
printf(" \n");
printf(" \n");
/*printf(" enter the argument one from bit 8 to 0 \n");
scanf("%d %d %d %d %d %d %d %d %d",&argument1[0].bits[8],&argument1[0].bits[7],
&argument1[0].bits[6],&argument1[0].bits[5],
&argument1[0].bits[4],&argument1[0].bits[3],
&argument1[0].bits[2],&argument1[0].bits[1],
&argument1[0].bits[0]);
printf(" \n");
printf(" \n");
printf(" enter the argument two from bit 7 to 0 \n");
scanf("%d %d %d %d %d %d %d %d",&argument2[0].bits[7],&argument2[0].bits[6],
&argument2[0].bits[5],&argument2[0].bits[4],
&argument2[0].bits[3],&argument2[0].bits[2],
&argument2[0].bits[1],&argument2[0].bits[0]);
printf(" \n");*/
printf(" the data is fed from mpreg + 3,4 \n");
if(multiplication == 1)
{
    multiplication = 0;
    for(j=0;j<=7;j++)
    {
    }
}
if((division == 1) || (delta_flag == 1))
{
    division = 0;
    delta_flag = 0;
    for(j=0;j<=7;j++)
    {
        (arg2_pointer+0)->bits[7-j] = (mpreg_ptr + 3)->bits[j];
        (arg1_pointer+0)->bits[8-j] = (mpreg_ptr + 4)->bits[j];
    }
}
if(init_key == 1)
{
    init_key = 0;
    for(j=0;j<=8;j++)
    {
        (arg1_pointer+0)->bits[8-j] = 0;
        (arg2_pointer+0)->bits[8-j] = 0;
0 bits in
printf(I1 %d w,argumentl[O].bits[7-i]);
1
printf ( " \nw);
printf (I1 \nI1);
printf (I1 argument2 is 8 - O\nn);
printf ( " \nu);
for (i=0;i<=8;++i)
{
printf
par-pointer
1
printf (I1\nt1)
;
printf ("\nll);
/*
(I1
%d
" , ( p a r- p o i n t e r + =
copy-three;
*/
ptwo = delay_ptr + 4;
pthree = delay_ptr + 0;
pfour = delay_ptr + 1;
delay_one(pone,ptwo,pthree,pfour);
printf(" \n");
printf(" \n");
printf(" \n");
printf(" THE SUM AND CARRY VECTORS OF STAGE TWO \n");
printf(" \n");
printf(" FIRST AND THIRD ARE SUM VECTORS S1 AND S2 \n");
printf(" \n");
printf(" SECOND AND FOURTH ARE CARY VECTORS C1 AND C2 \n");
printf(" \n");
printf(" \n");
printf(" \n");
for(i=O;i<4;++i)
{
p r i n t f ( " % d 11,( l a t- p o i n t e r + =
i)->word[j]) ;
lat-pointer
1
printf (I1\nw)
;
printf ("\nI1);
}
/*
copy-four;
*/
\nu);
\nI1);
\nit) ;
THE SUM AND CARRY VECTORS O F STAGE THREE \ n n ) ;
\n1I);
F I R S T AND T H I R D ARE SUM VECTORS S3 AND 54 \nI1);
\nw);
SECOND AND FOURTH ARE CARY VECTORS C 3 AND C 4 \nrl)
;
i)->word[j])
printf
lat-pointer
1
printf (If\nn);
printf ("\nW);
}
(I1
% d It,( l a t- p o i n t e r + =
copy-four;
/*
+=
i);
lat pointer = copy-four;
printf ("\nll);
for(j=d;j-<l6;++j)
{
i)->word[j ] )
printf
(I1
% d It,( l a t- p o i n t e r
+=
lat-pointer
copy-four;
printf (n\nll)
;
printf ("\nN);
1
/*
p r i n t f ( " % d !I, ( l a t- p o i n t e r + =
i)->word[j]) ;
lat-pointer
1
printf (l1\nN)
;
printf ("\n1I);
copy-four;
/*
printf
(I1
% d It,( l a t- p o i n t e r + =
12)->word[(15-j) 1);
lat-pointer = copy-four;
( " \ n n );
printf ("\n1I);
/* stage seven */
par_pointer = copy_three;
lat_pointer = copy_four;
trans_pointer = copy_five;
delay_ptr = copy_six;
pone = trans_pointer + 20;
ptwo = delay_ptr + 7;
delay_two(pone,ptwo);
printf(" \n");
printf(" \n");
printf(" \n");
printf(" THE RESULT OF STAGE SEVEN \n");
printf(" \n");
printf(" \n");
for(j=0;j<16;++j)
{
    printf(" %d ",(delay_ptr += 7)->word[(15-j)]);
    delay_ptr = copy_six;
}
return;
}
/*
*/
struct2 time_off()
{
int one,two,i,j,k,l,m,v;
for(v=0;v<17;v++)
{
(trans-pointer
+=
0)->word[v]
(par-pointer += 0)->word[v];
(par-pointer
(par-pointer += 2)->word[v];
(par-pointer += 3)->word[v];
(par-pointer
(par-pointer += 5)->word[v];
(trans-pointer
+=
1)->word[v]
+= 1)->word[v];
(trans-pointer
+=
2)->word[v]
(trans-pointer += 3)->word[v]
trans pointer = copy five;
par-pointer = copy-three;
1
for(v=O;v<l7;v++)
{
(trans-pointer
+=
4)->word[v]
+= 4)->word[v];
(trans-pointer
+=
5)->word[v]
(par-pointer += 6)->word[v];
(trans-pointer
+=
(par-pointer
+= 7)->word[v];
(delay-ptr
+= 0)->word[v] ;
(delay-ptr
+= 1)->word[v];
(lat-pointer
+= 0)->word[v];
(lat-pointer
+= 1)->word[v];
(latjointer += 2)->word[v];
(lat-pointer
10)->word[v]
(trans-pointer
+=
11)->word[v]
(transjointer += 6)->word[v]
trans pointer = copy five;
lat-pointer = copy-four;
1
for(v=O;v<l7;v++)
{
(transjointer += 7)->word[v]
trans pointer = copy-five;
lat-pointer = copy-four;
{
(trans-pointer
+=
8)->word[v]
(transjointer += 9)->word[v]
+= 3)->word[v];
for (v=O;v<l7;v++)
{
(trans-pointer
+=
12)->word[v]
(lat-pointer +=
( l a t- pointer +=
(lat-pointer
( t r a n s pointer += 14)->word[v]
6)->word[v]
;
trans pointer = copy five;
lat-pointer = copy-four;
1
for(v=O;v<l7;v++)
{
(lat-pointer
(trans-pointer += 15)->word[v]
+=
+= 7)->word[v];
(delay-ptr += 2)->word[v];
for (v=O;v<l7;v++)
{
(lat- pointer + =
(lat-pointer
(lat-pointer +=
+=
1
{
(trans-pointer
+=
(lat-p o i n t e r +=
2 1)->word[v] = (delay-ptr
+=
7)->word[v];
trans-pointer = copy-five;
delay-ptr = copy-six;
return;
1
/* Function cal delta */
struct7 cal_delta(b1,b2,b3,b4)
struct7 *b1;
int *b2[20],b3,b4; /* b3 = ref_num2 , b4 = ref_num1 */
{
struct7 *temp1;
int *temp2[8],temp3[8],temp5;
int i,j,k,l,carry;
/*printf(" entered cal delta\n");*/
/* inverting of the passed argument */
for(i=1;i<=8;i++)
{
    if(*(b2[b3]+i) == 1)
    {
        temp3[i] = 0;
    }
    else
    {
        temp3[i] = 1;
    }
}
/*printf (Ittheinverted value in cal-delta \nl');
for (i=l;i<=8;i++)
{
printf (
If
printf ("\nl1);*/
%d ",temp3[i])
/*
temp3[9-i]
carry = 1;
else
0;
carry
i++ ;
*/
0;
(bl
(bl
(bl
+
+
+
(b1 + temp5)->func = 5;
(b1 + b4)->num_two[0] = 1;
return (*b1);
}
/*
*/
struct7 *temp1;
int *temp2,temp3[8],temp4[8];
int i,j,k,l,nega,negb,negc,carray,carry[9];
/*printf("entered subtract load \n");*/
/ * the process below finds out the twos complement of
/*
*/
temp3[i]
else
0;
/*
else
*/
temp3[9-i] = 0;
carray = 1;
1
{
temp3[9-i] = 1;
carray = 0;
1
i++ ;
}
/*
printf (ll\nlt)
;*/
/* the below segment adds N and D's 2's complement */
carry[0] = 0;
carry[1] = 0;
for (i=1;i<=8;++i)
/*
/*
nega
1;
1
printf
(If
t h e v a l u e of temp3 [9-%dl
negb
1
if(carry[i] == 0)
{
    negc = 1;
}
temp4[i] = ((((*(c3[c5]+(9-i))*negb*negc)
| (temp3[9-i]*nega*negc)) | (carry[i]*nega*negb)) |
(*(c3[c5]+(9-i))*temp3[9-i]*carry[i]));
/*
printf(" the partial product of temp 4 with i = %d\n",i);
printf(" %d \n",temp4[i]);
*/
carry[i+1] = (*(c3[c5]+(9-i))*temp3[9-i]) |
(carry[i]**(c3[c5]+(9-i))) | (temp3[9-i]*carry[i]);
/*
printf(
D \nl*)
;
%d ",temp4[i]);
1
printf (l1\nW)
;
*/
/* loading of N - D into num_one */
for(i=1;i<=8;i++)
{
    (c1 + c6)->num_one[9-i] = temp4[i];
}
return (*c1);
}
/*
if ( * (a2[a4]+i) >
{
* (a3[a4]+i) )
flag-one
i = 8;
1;
if ( * (a2[a4]+i) <
{
1
1
if(f1ag-two
* (a3[a4]+i) )
flag-two
i = 8;
==
1;
1)
for (i=l;i<=8;i++)
{
into numl
(al+a5)->num-one[i]
*/
*(a2[a4]+i);/*
argl loaded
/*
return (*al);
/*
*/
case 1:
/ * printf(" case number one \nw); */
(numl+ ref numl)->func = * (num2 + ref-num2) ;
for(i=l;i<=8;i++)
{
(numl+
(numl+
1
(numl+
(numl+
=
=
*(num3[ref num2]+i);
*(num4[ref-num2]+i);
-
printf("
%d
" , (numl +
i)-mum -one[j]);
printf ("\nW);
printf(I1the value of argument two
is as follows
\nn);
for(j=l;j<=8;j++)
1
printf ("\nN);
1*/
ref numl = ref numl + 1;
/*printf ( " reached break at case one \nll);*/
break;
/*
case 2:
printf ( " case number two \nw) ;*/
(numl+ ref num1)->func = *(num2 + ref-num2);
for (i=l;icz8;i++)
=
=
*(num3[ref num2]+i);
* (num4[ref-num2]+i)
;
i)->num-one[j])
printf ( "
% d
",(numl +
;
)
printf ("\nu);
printf("the value of argument two
\nw)
is as follows
for(j=l;j<=B;j++)
{
printf
(I1
%d
It,( n u m l +
i)->num-two [ j ]) ;
1
printf (I1\n1l)
;
1*/
/*
1;
case 3:
printf ( " case number three \nu);*/
(numl+ ref num1)->func = *(num2 + ref-num2) ;
for(i=l;i<EB;i++)
{
printf
%d
(I1
" , (numl +
i)->num-one[j]) ;
printf ("\nH);
printf(I1the value of argument two
is as follows
\n") ;
for(j=l;j<=8;j++)
{
printf ( "
i)->num-two[j])
%d
" ,(numl +
1
printf ("\nV1)
;
1*/
1;
case 4:
/* the division case */
/*printf(" case number four \n");
printf(" entering compare load \n");*/
compare_load(num1,num3,num4,ref_num2,ref_num1,num2);
if((num1 + ref_num1)->over_flow == 1)
{
    subtract_load(num1,num2,num3,num4,ref_num2,ref_num1);
}
)*
*/
printf ( "
%d
",(numl +
i)->num-one[j ] ) ;
printf (n\nu);
printf("the value of argument two
is as follows
\nW);
for(j=l;j<=8;j++)
{
i)->num-two[ j]) ;
printf ( "
%d
11,( n u m l
1 */
printf (I1\n");
2;
1
void print_outstack(dum1)
struct8 *dum1;
{
struct8 *temp1;
int i,j,k,l;
temp1 = dum1;
for(i=0;i<=20;i++)
{
printf ("\nl1);
printf (ll\nll)
;
~ r i n t f (the
~ original 1 . S number \nw);
printf (I1 %d \nw, (templ+i)->destination);
printf (ll\nM)
;
printf ("\nI1);
printf (I1 The result of the instruction (16
printf ("\nW);
printf (ll\nn)
;
printf (ll\nll)
;
{
1
printf ("\nn);
printf ("\nm);
printf (ll\nll)
;
1
1
void print_psstack(dum1)
struct9 *dum1;
{
struct9 *temp1;
int i,j,k,l;
temp1 = dum1;
for(i=0;i<=20;i++)
{
printf ("\nl1);
printf ("\nfl)
;
printf(" the tracking register number \nw);
printf (I1 %d \nvl,(templ+i)->address);
printf ("\nW);
printf (I1\nn)
;
0)\nql)
;
printf (I1\nw)
;
printf (I1\nw)
;
printf (I1 The result of the num-two (0 - 8)\nw )
printf (ll\n");
printf ("\nI1);
for(j=O;j<=8;j++)
{
printf (I1 %d
11,
(templ+i)->num-two[j ] ) ;
printf (ll\n")
;
printf ("\nW);
printf (ll\n")
;
printf (l1\nl1)
;
1
}
.........................
..........................................
..........................
struct8 output_check(num0,num1,num2,num3,num4,num5,num6,num7,num8,num9)
struct2 *num0; /* pointer to trans_pointer */
struct9 *num1; /* pointer to priority stack */
struct8 *num2; /* pointer to output structure */
struct3 *num3; /* pointer to div trac */
struct4 *num4; /* pointer to mult trac */
struct5 *num5; /* pointer to add trac */
struct6 *num6; /* pointer to logg sheet */
struct5 *num7; /* pointer to sub track */
struct3 *num8; /* pointer to delta track */
struct1 *num9; /* pointer to multi-purpose registers */
{
/*
*/
*/
if ( (temp6+1)->logg[i]
== 1)
if ( (temp5+i)->st-track[l]
== 1)
( t e m p 2 + k )- > r e s u l t [ j
(temp0+20)->word [ j ];
1
for(j=O;j<=6;j++)
{
(temp5+i)->st-track[ j ] = 0;
1
(temp6+1)->logg[i] = 0;
1
1
1
printf(" THE OUTPUT STACK AFTER LOADING ADDITION\nvv);
print-outstack(num2);
for(i=l;i<=g;i++)
{
if ( (temp6+2)->logg[i]
==
1)
if ( (temp7+i)->st-track[l]
==
1)
(temp2+k)->result[j]
(temp0+20)->word[j ];
for(j=O;j<=6;j++)
{
(temp7+i)->st-track[ j ] = 0;
(temp6+2)->logg[i] = 0;
1
1
*/
if((temp6+3)->logg[i]
{
==
1)
if((temp4+i)->st -track[6]
{
==
1)
loaded\nw) ;
k = (temp4+i)->address;
for (j=O;j<=16;j++)
{
( t e m p 2 + k )- > r e s u l t
(temp0+20)->word[j ] ;
1
for(j=O;j<=8;j++)
j]
track
remainder = remainder
20)->word[j ]
(temp0
I
==
0))
if ( ( (temp3+i)->itr-track
==
4) 1 I
(remainder
( t e m p 2 + k )- > r e s u l t [ j ]
(temp0+21)->word[ j ];
for(j=O;j<=8;j++)
{
(temp3+i)->st track[ j ] = 0;
(temp8+i)->st-track[j]
= 0;
(temp6+4)->logg[i] = 0;
(temp6+5)->logg[i] = 0;
(temp3+i)->itr track=O;
(temp8+i)->itr-track=0;
get-out = 1;
}
if (get-out
==
0)
/*
P.S\n1I);
/*
*/
( t e m p l + l o c a l- i n d e x ) - > n u m - one[j]
(temp0+21)->word[l5-j ];
(templ+local- index) ->num -two [ j+l]
(temp0+20)->word[15-j ];
(templ+future- index)->num - one[j]
(temp0+20)->word[15-j];
(templ+future- index)->num-two[j+l]
(temp0+20)->word[15-j ];
==
=
=
=
=
(temp9+0)->bits[0])
1;
/*
*/
1
)
print-outstack(num2) ;
return (*num2)
...........................
..........................................
/******* Function Shift Track **********/
/**********************%*****************I
..........................................
*/
*/
struct6 *num4;
{
struct3
struct4
struct5
struct6
struct5
struct3
int i,j,k,l;
temp1 = numl;
temp2 = num2;
temp3 = num3;
temp4 = num4;
temp5 = num5;
temp6 = num6;
/* shifting of add trac */
if ( (temp4+1)->logg-stat-== 1)
{
if((temp4+1)->logg[i]
{
==
1)
for(j=l;j<=8;j++)
{
if ( (temp3+i)->st-track[ j ]
{
==
1)
1
1
if((temp4+2)->logg[i]
{
if ( (temp5+i)->st-track[ j ] == 1)
{
/ * shifting of mult-trac
if((temp4+3)->logg-stat == 1)
{
1)
for(j=l;j<=8;j++)
{
==
*/
*/
if((temp4+5)->logg[i]
{
==
1)
for(j=l;j<=8;j++)
{
if ( (temp6+i)->st-track[ j ] == 1)
{
1
1
1
void status_print1(num1,num2,num3)
struct1 *num1;
struct5 *num2;
struct6 *num3;
{
struct1 *dum1;
struct5 *dum2;
struct6 *dum3;
int i,j,k,l;
dum1 = num1;
dum2 = num2;
dum3 = num3;
printf("printing the pipeline register and flag
register and tracking registers and status logg\ntt);
printf (tt\ntt)
;
printf ("\ntt)
;
printf (I1 %d
",(duml+2)->bits[7-j 1 ) ;
printf ("\nW);
printf (It\nw)
;
printf (It 'the flag register \nI1);
printf ("\nu);
printf (ll\nll)
;
~ r i n t f (the
~ flag register is mpreg + 0 \nw);
for(i=O;i<=lO;i++)
{
printf (I1 %d
(duml+O)->bits[lo-i]) ;
1
printf (lt\nl1)
;
printf (It\nw)
;
printf(" the logging register for add\nw);
for(i=l;i<=g;i++)
{
if ( (dum3+1)->logg[i]
==
1)
printf (I1\nw)
;
for(j=l;j<=8;j++)
{
structl *duml;
struct5 *dum2;
struct6 *dum3;
int iljlkll;
duml = numl;
duma = num2;
dum3 = num3;
printf("printing the pipeline register and flag
register and tracking registers and status logg\nrr);
printf ("\nrr)
;
printf ("\nW);
printf ( " the input registers ( 8 - 0 )\nn) ;
printf ("\nn);
printf ("\nrr)
;
printf(" the input register mpreg + 1 \nw);
for(j=O;j<=7;j++)
(
printf ("\nu);
printf ("\nW);
printf (Ir the flag register \nw);
printf ("\nW);
printf ("\nrl)
;
printf(" the flag register is mpreg
for(i=O;i<=lO;i++)
+ 0 \nrr);
printf(" %d It,(duml+O)->bits[lo-i])
printf ("\nu);
printf ("\nrl)
;
printf(" the logging register for sub\nn);
for(i=l;i<=9;i++)
(
printf (Ir %d
, (dum3+2)->logg[9-i] ) ;
printf ("\nVr)
;
printf ("\nW);
printf (Ir the tracking registers \nw);
for(i=l;i<=g;i++)
{
if ( (dum3+2)->logg[i]
==
1)
printf ( n %d 11,(dum2+i)->st-track[8-j ] )
printf (l1\nt1)
;
printf ("\nW);
1
1
structl *duml;
struct4 *dum2;
struct6 *dum3;
int ilj,kll;
duml = numl;
duma = num2 ;
dum3 = num3;
~rintf(~Iprinting
the pipeline register and flag
register and tracking registers and status logg\nI1);
printf ("\nV1)
;
printf ("\nn);
printf (I1 the input registers ( 8 - 0 )\nI1);
printf (I1\nl1)
;
printf (l1\nl1)
;
printf (I1 the input register mpreg + 3 \ntl)
;
for(j=O;j<=8;j++)
{
printf ("\n1I);
printf (I1\nw)
;
printf(" the flag register \nw);
printf ("\n1I);
printf (lt\nU)
;
printf(It the flag register is mpreg
for(i=O;i<=lO;i++)
+ 0 \nw);
printf(" %d w,(duml+~)->bits[lO-i]);
1
printf ("\nV1)
;
printf (I1\nM)
;
printf (I1 the logging register for add\nIf);
for(i=l;i<=lO;i++)
{
if ( (dum3+3)->logg[i]
==
1)
printf ("\nu);
printf (I1\n1')
;
1
1
1
structl *duml;
struct3 *dum2;
struct6 *dum3;
int i,j,k,l;
duml = numl;
dum2 = num2;
dum3 = nurn3;
printf ("printing the pipeline register and flag
register and tracking registers and status logg\nIt);
printf ("\nI8);
printf ("\n8I);
printf ( " the input registers ( 8 - 0 )\ntv)
;
printf ("\nW);
printf ("\nW);
1
printf (Ig\ngt)
;
printf (I1\n");
printf ( " the input register mpreg + 4 \nV1)
;
for(j=O;j<=8;j++)
{
+ 0 \n") ;
printf(" %d w,(dum3+4)->logg[9-i]);
1
printf ("\nW);
printf ("\nu);
printf (Iw the tracking registers \nw);
for(i=l;i<=g;i++)
{
if ( (dum3+4)->logg[i]
==
1)
printf (n\nw);
for(j=l;j<=8;j++)
{
1
..........................................
..........................................
..........................................
/*******
*********/
..........................................
...................................
/***** 0. P.F indicator.   ********/
/***** 1. Priority Flag.   ********/
/***** 2. Stack Index.     ********/
/***** 3. CCM Pointer.     ********/
/***** 4. ADD Latency.     ********/
/***** 5. SUB Latency.     ********/
/***** 6. MULT Latency.    ********/
/***** 7. DIV Latency.     ********/
/***** 8. Priority Index.  ********/
/***** 9. Local Index.     ********/
...................................
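The slot numbers in the comment block above can be given symbolic names so that accesses such as `bits[2]` and `bits[8]` are easier to follow. The enumeration below is only a sketch: the identifiers and the helper `flag_get` are hypothetical and do not appear in the thesis code, and the register is modelled as a plain `int` array. Only the ten slots listed in the comment block are named.

```c
#include <assert.h>

/* Hypothetical names for the flag-register slots listed in the comment
   block; the numeric values are the ones given there. */
enum flag_slot {
    FLAG_PF_INDICATOR   = 0,
    FLAG_PRIORITY       = 1,
    FLAG_STACK_INDEX    = 2,
    FLAG_CCM_POINTER    = 3,
    FLAG_ADD_LATENCY    = 4,
    FLAG_SUB_LATENCY    = 5,
    FLAG_MULT_LATENCY   = 6,
    FLAG_DIV_LATENCY    = 7,
    FLAG_PRIORITY_INDEX = 8,
    FLAG_LOCAL_INDEX    = 9
};

/* Read one slot of a flag register stored as an array of ints. */
int flag_get(const int bits[], enum flag_slot slot)
{
    assert(slot >= FLAG_PF_INDICATOR && slot <= FLAG_LOCAL_INDEX);
    return bits[slot];
}
```

With these names, the stack index load in `load_pipeline` would read `flag_get(bits, FLAG_STACK_INDEX)` instead of indexing the array with a bare 2.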
struct1 load_pipeline(num0,num1,num2,num3,num4,num5,
num6,num7,num8,num9,num10,num11,num12)
struct1 *num0; /* pointer to input registers */
struct0 *num1; /* pointer to cross collision matrices */
struct7 *num2; /* pointer to input structure */
struct3 *num3; /* pointer to div trac */
struct4 *num4; /* pointer to mult trac */
struct5 *num5; /* pointer to add trac */
struct6 *num6; /* pointer to logg sheet */
struct9 *num10; /* pointer to priority structure */
struct5 *num11; /* pointer to subtract trac */
struct3 *num12; /* pointer to delta trac */
int num7,num8,num9; /* registers */
*/
*/
init_key = 0;
delta_flag = 0;
addition = 0;
subtraction = 0;
multiplication = 0;
division = 0;
temp0 = num0;
temp1 = num1;
temp2 = num2;
temp3 = num3;
temp4 = num4;
temp5 = num5;
temp6 = num6;
temp10 = num10;
temp11 = num11;
temp12 = num12;
priority_flag = (num0+0)->bits[1]; /* loading the priority flag */
stack_index = (num0+0)->bits[2]; /* loading the current instruction location */
matrix_index = (num0+0)->bits[3]; /* loading the current address of CCM */
priority_index = (temp0+0)->bits[8];
look_ahead = stack_index + 1;
/* checking for any priority situation */
if (priority_flag == 1)
{
switch((temp10 + priority_index)->func)
{
case 4:
if((templ+matrix index)->smatrix.bits - r o w 1
[ (temp0+0)->bits [lo]1 == 0)
{
/*
/*
case 1:
if((templ+matrix -index)->smatrix.bits-row3
(tempO+O)->bits[lo]1 == 0)
{
1
else
divisional entry = 1;
printf ( " divisional entry is l\ngl)
;
printf (It addition is also possible\nn);
divisional entry = 0;
printf(I1 though addition is the next
instruction no latency is available \n1I);
printf(" divisional entry is O\n1I);
1
break;
case 2:
if((temp1+matrix_index)->smatrix.bits_row3[(temp0+0)->bits[10]] == 0)
{
    divisional_entry = 2;
    printf(" divisional entry is 2\n");
    printf(" subtraction is also possible\n");
}
else
{
    divisional_entry = 0;
    printf(" though subtraction is the next instruction no latency is available \n");
    printf(" divisional entry is 0\n");
}
break;
default :
divisional_entry = 0;
printf(" only division is possible \n");
printf(" divisional entry is 0\n");
break;
}
else
{
(tempO+O)->bits[10]+=1;
printf(I1the next latency is
%d\nn,(tempO+O)->bits[10] ) ;
printf(" initialising the input registers to
0\n1l);
init key =l;
for(T=o;ie=8;i++)
{
1
1
break;
case 5:
(tempO+l)->bits[i] = 0;
(temp0+2)->bits[i] = 0;
(temp0+3)->bits [i] = 0;
(temp0+4)->bits[i] = 0;
registers
*/
/*
/*
*/
(temp0+0)->bits[8]+=1;
divisional entry = 4;
delta-flag-= 1;
*/
/*
/*
*/
/*
*/
/*
*/
f o r ( j = O ; j < = 7 ;j++)
{
printf
(It
%d ",(temp0+3)- > b i t s [ 7 - j ] ) ;
p r i n t f ( t v \ n w;)
p r i n t f ("\nu) ;
p r i n t f ( " t h e i n p u t r e g i s t e r s temp0
0 ) \nl1) ;
(8
p r i n t f ( " \ n W );
p r i n t f (I1\nw);
for(j=O;j<=8;j++)
{
p r i n t f ( l l \ n l l );
p r i n t f ("\n1I) ;
p r i n t f (I1 t h e f l a g r e g i s t e r 8
p r i n t f ( " \ n W );
p r i n t f ( " \ n W );
for(i=O;i<=8;i++)
{
printf
(I1
%d
I!,
O\nfv);
(temp0+0)- > b i t s [ 8 - i ] ) ;
p r i n t f ( " \ n W );
p r i n t f ("\ntt);
p r i n t f ( " t h e logging r e g i s t e r s 9
p r i n t f ( " \ n W );
p r i n t f ( " \ n W );
f o r ( i = li ;< = 9 ;i + + )
{
printf
(I1
O\nw);
%d ",(temp6+5)- > l o g g [ 9 - i ] ) ;
p r i n t f ( l t \ n l l );
p r i n t f ( t f \ n w;
)
p r i n t f ( " the tracking registers \nw);
p r i n t f ( " \ n W );
p r i n t f ("\nu) ;
for(i=l;i<=g;i++)
{
i f ( (temp6+5)- > l o g g [ i ] == 1)
{
p r i n t f ( " %d
1
1
1
I!,
( t e m p l 2 + i )->st-t r a c k [ 8 - j ] ) ;
printf ("\r~*~) ;
printf ("\nW);
break;
1
4
/*
*/
switch (divisional_entry)
{
case 0:
/* the division is being loaded */
printf(" the latency is available for iteration\n");
/* loading the arguments into the stage div */
for(j=0;j<=7;j++)
{
    (temp0+3)->bits[j] = (temp10+priority_index)->num_one[j];
}
for(j=0;j<=8;j++)
{
    (temp0+4)->bits[j] = (temp10+priority_index)->num_two[j];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.div_latency[(temp0+0)->bits[10]];
/* this part will initiate the tracking registers */
dis = (temp10 + priority_index)->address;
(temp3 + dis)->st_track[i] = 1;
(temp0+0)->bits[10] = 0;
(temp0+0)->bits[8] += 1;
printf(" the division status is printed below in case 0\n iteration \n");
status_print4(temp0,temp3,temp6);
division = 1;
break;
case 1:
the addition is being loaded */
printf ( " the latency is available for
iteration\nl1);
/ * loading the arguments into the stage add */
for(j=O;j<=7;j++)
/*
( t e m p O + l ) - > b i t s [ j ]
(temp2+stack-index)->num-one[j+l];
( t e m p 0 + 2 ) - > b i t s [ j ]
(temp2+stack-index)->num-two[j+l] ;
( t e m p O + O ) - > b i t s [ 3 ]
(templ+matrix index)->sdirection.div -add
[ (temp0+0)->bits[lo71;
(tempO+O)->bits[2]+=1;
addition = 1;
/ * this part will initiate the trackng
registers */
for(i=l;i<=9;i++)
{
if ( (temp6+1)->logg[i]
==
0)
(temp6+1)->logg[i] = 1;
(temp5+i)->st track[l]=l;
( t e m p 5 q i ) - > a d d r e s s
(temp2+look-ahead)->location;
i=lO;
1
1
printf(lV the addition status is printed below in
case 1 in iteration\nw);
status printl(tempOItemp5,temp6);
/ * the-division is being loaded */
printf (I1 the latency is available in iteration
\nW);
/* loading the arguments into the stage div */
for(j=O;j<=7;j++)
( t e m p O + O ) - > b i t s [ 3 ]
( t e m p l + m a t r i x i n d e x ) - > s d i r e c t i o n . d i v- a d d
[ (temp0+0)->bits[lo71;
/ * this part will initiate the trackng
registers */
dis = (temp10 + priority index)->address;
(temp3 + dis)->st-track[i] =l;
(temp0+0)->bits[lo] = 0;
(tempO+O)->bits[8]+=1;
printf(" the division status is printed below in
;
case 1 in iteration\nl@)
status print4(tempOItemp3,temp6);
division = 1;
break;
case 2:
sub
for(j=O;j<=7;j++)
*/
( t e m p O + l ) - > b i t s [ j ]
(temp2+stack-index)->num-one [ j+l] ;
( t e m p 0 + 2 ) - > b i t s [ j ]
(temp2+stack-index)->num-two[j+l];
1
( t e m p O + O ) - > b i t s [ 3 ]
(templ+matrix index)->sdirection.div -add
[ (tempO+O)->bits[lo71;
(tempO+O)->bits[2]+=1;
subtraction = 1;
/ * this part will initiate the trackng
registers */
for(i=l;i<=g;i++)
{
if((temp6+2)->logg[i] == 0)
{
(temp6+2)->logg[i] = 1;
(templl+i)->st track[l]=l;
( t e m p l l q i ) - > a d d r e s s
(temp2+look-ahead)->location;
i=10;
1
1
printf(" the subtraction status is printed below
;
in case 2 in iteration\nff)
status~print2(temp0,templl,temp6);
( t e m p 0 + 3 ) - > b i t s [ j ]
(templO+priority-index)->num-one[j];
1
for(j=O;j<=8;j++)
{
( t e m p 0 + 4 ) - > b i t s [ j ]
(templO+priority-index)->num-two[j];
1
( t e m p O + O ) - > b i t s [ 3 ]
(templ+matrix index)->sdirection.div -add
[ (temp0+0)->bits[lo71;
/ * this part will initiate the trackng
registers */
dis = (temp10 + priority index)-Baddress;
(temp3 + dis)->st-track[i] =l;
(temp0+0)->bits [lo] = 0;
(temp0+0)->bits[8]+=1;
printf(" the division status is printed below in
case 12\nl1);
status print4(tempO,ternp3,temp6);
division = 1;
break;
case 4:
break;
1
else
{
/*
/*
*/
*/
case 1:
/*
/*
case 3 :
if((templ+matrix -index)->smatrix.bits-row2
[ (temp0+0)->bits[4]1 == 0)
additional entry = 2;
printf (I1 additional entry is 2\n1I);
printf(I1 m u l t i p l i c a t i o n i s a l s o
possible\nI1);
1
else
{
additional entry = 1;
printf ( " additional entry is l\nV1)
;
printf(" though multiplication is the
next instruction no latency is available \nI1);
)
break;
case 4:
if((temp1+matrix_index)->smatrix.bits_row1[(temp0+0)->bits[4]] == 0)
{
    additional_entry = 3;
    printf(" additional entry is 3\n");
    printf(" division is also possible\n");
}
else
{
    additional_entry = 1;
    printf(" additional entry is 1\n");
    printf(" though division is the next instruction no latency is available \n");
}
break;
default:
additional_entry = 1;
printf(" only addition is possible \n");
break;
}
else
{
    (temp0+0)->bits[4]+=1;
    (temp0+0)->bits[10]+=1;
    printf(" the next latency to look for is %d \n",(temp0+0)->bits[4]);
    additional_entry = 0;
    if((temp0+0)->bits[4]==8)
    {
        printf(" no latency is available and reinitialising to matrix 3 \n");
        (temp0+0)->bits[4] = 0;
    }
}
break;
case 2:
/*
/*
case 3:
if((temp1+matrix_index)->smatrix.bits_row2[(temp0+0)->bits[5]] == 0)
{
    additional_entry = 5;
    printf(" additional entry is 5\n");
    printf(" multiplication is also possible\n");
}
else
{
    additional_entry = 4;
    printf(" additional entry is 4\n");
    printf(" though multiplication is the next instruction no latency is available \n");
}
break;
case 4:
if((temp1+matrix_index)->smatrix.bits_row1[(temp0+0)->bits[5]] == 0)
{
    additional_entry = 6;
    printf(" additional entry is 6\n");
    printf(" division is also possible\n");
}
else
{
    additional_entry = 4;
    printf(" additional entry is 4\n");
    printf(" though division is the next instruction no latency is available \n");
}
break;
default:
additional_entry = 4;
}
else
{
    printf(" no latency is available for subtraction\n");
    (temp0+0)->bits[5]+=1;
    (temp0+0)->bits[10]+=1;
    printf(" the next latency to look for is %d \n",(temp0+0)->bits[5]);
    additional_entry = 0;
    if((temp0+0)->bits[5]==8)
    {
        printf(" no latency is available and reinitialising to matrix 3 \n");
        (temp0+0)->bits[5] = 0;
        (temp0+0)->bits[3] = 3;
    }
}
break;
case 3:
/*
/*
*/
-row2
case 1:
if((temp1+matrix_index)->smatrix.bits_row3[(temp0+0)->bits[6]] == 0)
{
    additional_entry = 8;
    printf(" additional entry is 8\n");
    printf(" addition is also possible\n");
}
else
{
    additional_entry = 7;
    printf(" though addition is the next instruction no latency is available \n");
    printf(" additional entry is 7\n");
}
break;
case 2:
if((temp1+matrix_index)->smatrix.bits_row3[(temp0+0)->bits[6]] == 0)
{
    additional_entry = 9;
    printf(" additional entry is 9\n");
    printf(" subtraction is also possible\n");
}
else
{
    additional_entry = 7;
    printf(" though subtraction is the next instruction no latency is available \n");
    printf(" additional entry is 7\n");
}
break;
default:
additional_entry = 7;
printf(" only multiplication is possible \n");
printf(" additional entry is 7\n");
break;
}
else
{
    printf(" no latency is available for multiplication\n");
    (temp0+0)->bits[6]+=1;
    (temp0+0)->bits[10]+=1;
    printf(" the next latency to look for is %d \n",(temp0+0)->bits[6]);
    additional_entry = 0;
    if((temp0+0)->bits[6]==8)
    {
        printf(" no latency is available and reinitialising to matrix 2 \n");
        (temp0+0)->bits[6] = 0;
        (temp0+0)->bits[3] = 2;
    }
}
break;
case 4:
/*
/*
{
case 1:
if((temp1+matrix_index)->smatrix.bits_row3[(temp0+0)->bits[7]] == 0)
{
1
else
{
possible\nw);
1
else
{
    additional_entry = 10;
    printf(" though subtraction is the next instruction no latency is available \n");
    printf(" additional entry is 10\n");
}
break;
default:
additional_entry = 10;
printf(" only division is possible \n");
printf(" additional entry is 10\n");
break;
}
else
{
    (temp0+0)->bits[10]+=1;
    printf(" the next latency to look for is %d \n",(temp0+0)->bits[7]);
    additional_entry = 0;
    if((temp0+0)->bits[7]==8)
    {
        printf(" no latency is available and reinitialising to matrix 1 \n");
        (temp0+0)->bits[7] = 0;
        (temp0+0)->bits[3] = 1;
    }
}
break;
case 5:
printf(" the case is 5 and delta is being loaded wherein the priority index is 0\n");
for(j=0;j<=7;j++)
{
    (temp0+4)->bits[j] = (temp2+stack_index)->num_two[j];
}
if (re_adjust == 1)
{
    printf("\n");
    printf(" the re-adjust is recognised as 1 and re-adjust is assigned 0 and stack index is doubly incremented\n");
    printf("\n");
    re_adjust = 0;
}
else
{
    (temp0+0)->bits[2]+=1;
    printf("\n");
    printf(" the re-adjust is recognised as 0 and the instruction stack flag is singly incremented\n");
    printf("\n");
}
delta_flag = 1;
additional_entry = 0;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+5)->logg[i] == 0)
    {
        (temp6+5)->logg[i] = 1;
        (temp12+i)->st_track[1]=1;
        (temp12+i)->address = (temp2+stack_index)->location;
        i=10;
    }
}
/* printing of the results of case 5 */
printf("printing the pipeline register and flag register\n");
printf("\n");
printf("\n");
printf(" the input registers temp0 + 3 (7 0)\n");
printf("\n");
printf("\n");
for(j=0;j<=7;j++)
{
printf("\n");
printf("\n");
for(i=1;i<=9;i++)
{
if ((temp6+5)->logg[i] == 1)
{
case 1:
/* the addition is being loaded */
printf(" the additional entry is 1 and addition only \n");
/* loading the arguments into the stage add */
for(j=0;j<=7;j++)
{
    (temp0+1)->bits[j] = (temp2+stack_index)->num_one[j+1];
    (temp0+2)->bits[j] = (temp2+stack_index)->num_two[j+1];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.add_latency[(temp0+0)->bits[4]];
(temp0+0)->bits[4] = 0;
(temp0+0)->bits[10] = 0;
(temp0+0)->bits[2]+=1;
addition = 1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+1)->logg[i] == 0)
    {
        (temp6+1)->logg[i] = 1;
        (temp5+i)->st_track[1]=1;
        (temp5+i)->address = (temp2+stack_index)->location;
        i=10;
    }
}
add
*/
for(j=0;j<=7;j++)
{
    (temp0+1)->bits[j] = (temp2+stack_index)->num_one[j+1];
    (temp0+2)->bits[j] = (temp2+stack_index)->num_two[j+1];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.mult_add[(temp0+0)->bits[4]];
(temp0+0)->bits[2]+=1;
(temp0+0)->bits[10] = 0;
addition = 1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+1)->logg[i] == 0)
    {
        (temp6+1)->logg[i] = 1;
        (temp5+i)->st_track[1]=1;
        (temp5+i)->address = (temp2+stack_index)->location;
        i=10;
    }
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.mult_add[(temp0+0)->bits[4]];
(temp0+0)->bits[4] = 0;
(temp0+0)->bits[2]+=1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
n;
1
printf(" the multiplication status is printed below in case 2\n");
status_print3(temp0,temp4,temp6);
multiplication = 1;
break;
case 3:
/* the addition is being loaded */
printf(" the latency is available \n");
/* loading the arguments into the stage add */
for(j=0;j<=7;j++)
{
    (temp0+1)->bits[j] = (temp2+stack_index)->num_one[j+1];
    (temp0+2)->bits[j] = (temp2+stack_index)->num_two[j+1];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.div_add[(temp0+0)->bits[4]];
(temp0+0)->bits[2]+=1;
(temp0+0)->bits[10] = 0;
addition = 1;
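/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+1)->logg[i] == 0)
    {
        (temp6+1)->logg[i] = 1;
        (temp5+i)->st_track[1]=1;
        (temp5+i)->address = (temp2+stack_index)->location;
        i=10;
    }
}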
for(j=0;j<=7;j++)
{
    (temp0+3)->bits[j] = (temp2+look_ahead)->num_one[j+1];
}
for(j=0;j<=8;j++)
{
    (temp0+4)->bits[j] = (temp2+look_ahead)->num_two[j];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.div_add[(temp0+0)->bits[4]];
(temp0+0)->bits[4] = 0;
(temp0+0)->bits[2]+=1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+4)->logg[i] == 0)
    {
        (temp6+4)->logg[i] = 1;
        (temp3+i)->st_track[1]=1;
        (temp3+i)->address = (temp2+look_ahead)->location;
        i=10;
    }
}
for(j=0;j<=7;j++)
{
    (temp0+1)->bits[j] = (temp2+stack_index)->num_one[j+1];
    (temp0+2)->bits[j] = (temp2+stack_index)->num_two[j+1];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.add_latency[(temp0+0)->bits[5]];
(temp0+0)->bits[5] = 0;
(temp0+0)->bits[10] = 0;
(temp0+0)->bits[2]+=1;
subtraction = 1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+2)->logg[i] == 0)
    {
        (temp6+2)->logg[i] = 1;
        (temp11+i)->st_track[1]=1;
        (temp11+i)->address = (temp2+stack_index)->location;
        i=10;
    }
}
printf(" the subtraction status is printed below in case 4\n");
status_print2(temp0,temp11,temp6);
break;
case 5:
sub
*/
for(j=0;j<=7;j++)
{
    (temp0+1)->bits[j] = (temp2+stack_index)->num_one[j+1];
    (temp0+2)->bits[j] = (temp2+stack_index)->num_two[j+1];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.mult_add[(temp0+0)->bits[5]];
(temp0+0)->bits[2]+=1;
(temp0+0)->bits[10] = 0;
subtraction = 1;
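/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+2)->logg[i] == 0)
    {
        (temp6+2)->logg[i] = 1;
        (temp11+i)->st_track[1]=1;
        (temp11+i)->address = (temp2+stack_index)->location;
        i=10;
    }
}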
for(j=0;j<=7;j++)
{
    (temp0+3)->bits[j] = (temp2+look_ahead)->num_one[j+1];
    (temp0+4)->bits[j] = (temp2+look_ahead)->num_two[j+1];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.mult_add[(temp0+0)->bits[5]];
(temp0+0)->bits[5] = 0;
(temp0+0)->bits[2]+=1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+3)->logg[i] == 0)
    {
        (temp6+3)->logg[i] = 1;
        (temp4+i)->st_track[1]=1;
        (temp4+i)->address = (temp2+look_ahead)->location;
        i=10;
    }
}
printf(" the multiplication status is printed below in case 5\n");
status_print3(temp0,temp4,temp6);
multiplication = 1;
break;
case 6:
/* the subtraction is being loaded */
printf(" the latency is available \n");
/* loading the arguments into the stage sub */
for(j=0;j<=7;j++)
{
    (temp0+1)->bits[j] = (temp2+stack_index)->num_one[j+1];
    (temp0+2)->bits[j] = (temp2+stack_index)->num_two[j+1];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.div_add[(temp0+0)->bits[5]];
(temp0+0)->bits[2]+=1;
(temp0+0)->bits[10] = 0;
subtraction = 1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+2)->logg[i] == 0)
    {
        (temp6+2)->logg[i] = 1;
        (temp11+i)->st_track[1]=1;
        (temp11+i)->address = (temp2+stack_index)->location;
        i=10;
    }
}
for(j=0;j<=7;j++)
{
    (temp0+3)->bits[j] = (temp2+look_ahead)->num_one[j+1];
}
for(j=0;j<=8;j++)
{
    (temp0+4)->bits[j] = (temp2+look_ahead)->num_two[j];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.div_add[(temp0+0)->bits[5]];
(temp0+0)->bits[5] = 0;
(temp0+0)->bits[2]+=1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+4)->logg[i] == 0)
    {
        (temp6+4)->logg[i] = 1;
        (temp3+i)->st_track[1]=1;
        (temp3+i)->address = (temp2+look_ahead)->location;
        i=10;
    }
}
printf(" the division status is printed below in case 6\n");
status_print4(temp0,temp3,temp6);
division = 1;
break;
case 7:
/* the multiplication is being loaded */
printf(" the latency is available \n");
/* loading the arguments into the stage mult */
for(j=0;j<=7;j++)
{
    (temp0+3)->bits[j] = (temp2+stack_index)->num_one[j+1];
    (temp0+4)->bits[j] = (temp2+stack_index)->num_two[j+1];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.mult_latency[(temp0+0)->bits[6]];
(temp0+0)->bits[6] = 0;
(temp0+0)->bits[10] = 0;
(temp0+0)->bits[2]+=1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+3)->logg[i] == 0)
    {
        (temp6+3)->logg[i] = 1;
        (temp4+i)->st_track[1]=1;
        (temp4+i)->address = (temp2+stack_index)->location;
        i=10;
    }
}
printf(" the multiplication status is printed below in case 7\n");
status_print3(temp0,temp4,temp6);
multiplication = 1;
break;
case 8:
for(j=0;j<=7;j++)
{
    (temp0+1)->bits[j] = (temp2+look_ahead)->num_one[j+1];
    (temp0+2)->bits[j] = (temp2+look_ahead)->num_two[j+1];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.mult_add[(temp0+0)->bits[6]];
(temp0+0)->bits[2]+=1;
(temp0+0)->bits[10] = 0;
addition = 1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+1)->logg[i] == 0)
    {
        (temp6+1)->logg[i] = 1;
        (temp5+i)->st_track[1]=1;
        (temp5+i)->address = (temp2+look_ahead)->location;
        i=10;
    }
}
for(j=0;j<=7;j++)
{
    (temp0+3)->bits[j] = (temp2+stack_index)->num_one[j+1];
    (temp0+4)->bits[j] = (temp2+stack_index)->num_two[j+1];
}
printf("\n");
printf("\n");
printf(" looking at temp0 + 3 in case 8 \n");
printf("\n");
printf("\n");
for(j=0;j<=7;j++)
{
if((temp6+3)->logg[i] == 0)
{
    (temp6+3)->logg[i] = 1;
    (temp4+i)->st_track[1]=1;
    (temp4+i)->address = (temp2+stack_index)->location;
    i=10;
}
}
for(j=0;j<=7;j++)
{
    (temp0+1)->bits[j] = (temp2+look_ahead)->num_one[j+1];
    (temp0+2)->bits[j] = (temp2+look_ahead)->num_two[j+1];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.mult_add[(temp0+0)->bits[6]];
(temp0+0)->bits[2]+=1;
(temp0+0)->bits[10] = 0;
subtraction = 1;
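/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+2)->logg[i] == 0)
    {
        (temp6+2)->logg[i] = 1;
        (temp11+i)->st_track[1]=1;
        (temp11+i)->address = (temp2+look_ahead)->location;
        i=10;
    }
}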
for(j=0;j<=7;j++)
{
    (temp0+3)->bits[j] = (temp2+stack_index)->num_one[j+1];
    (temp0+4)->bits[j] = (temp2+stack_index)->num_two[j+1];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.mult_add[(temp0+0)->bits[6]];
(temp0+0)->bits[6] = 0;
(temp0+0)->bits[2]+=1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+3)->logg[i] == 0)
    {
        /* mark the same multiplication log slot that was tested above */
        (temp6+3)->logg[i] = 1;
        (temp4+i)->st_track[1]=1;
        (temp4+i)->address = (temp2+stack_index)->location;
        i=10;
    }
}
printf(" the multiplication status is printed below in case 9\n");
status_print3(temp0,temp4,temp6);
multiplication = 1;
break;
case 10:
/* the division is being loaded */
printf(" the latency is available \n");
/* loading the arguments into the stage div */
for(j=0;j<=7;j++)
{
    (temp0+3)->bits[j] = (temp2+stack_index)->num_one[j+1];
}
for(j=0;j<=8;j++)
{
    (temp0+4)->bits[j] = (temp2+stack_index)->num_two[j];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.div_latency[(temp0+0)->bits[7]];
(temp0+0)->bits[7] = 0;
(temp0+0)->bits[10] = 0;
(temp0+0)->bits[2]+=1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+4)->logg[i] == 0)
    {
        (temp6+4)->logg[i] = 1;
        (temp3+i)->st_track[1]=1;
        (temp3+i)->address = (temp2+stack_index)->location;
        i=10;
    }
}
printf(" the division status is printed below in case 10\n");
status_print4(temp0,temp3,temp6);
division = 1;
break;
case 11:
for(j=0;j<=7;j++)
{
    (temp0+1)->bits[j] = (temp2+(look_ahead+1))->num_one[j+1];
    (temp0+2)->bits[j] = (temp2+(look_ahead+1))->num_two[j+1];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.div_add[(temp0+0)->bits[7]];
re_adjust = 1;
if(re_adjust == 1)
{
    printf("\n");
    printf(" re-adjust has been assigned one in case 11\n");
    printf("\n");
}
addition = 1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+1)->logg[i] == 0)
    {
        (temp6+1)->logg[i] = 1;
        (temp5+i)->st_track[1]=1;
        (temp5+i)->address = (temp2+(look_ahead+1))->location;
        i=10;
    }
}
printf(" the addition status is printed below in case 11\n");
status_print1(temp0,temp5,temp6);
/* the division is being loaded */
printf(" the latency is available \n");
/* loading the arguments into the stage div */
for(j=0;j<=7;j++)
{
    (temp0+3)->bits[j] = (temp2+stack_index)->num_one[j+1];
}
for(j=0;j<=8;j++)
{
    (temp0+4)->bits[j] = (temp2+stack_index)->num_two[j];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.div_add[(temp0+0)->bits[7]];
(temp0+0)->bits[7] = 0;
(temp0+0)->bits[10] = 0;
(temp0+0)->bits[2]+=1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+4)->logg[i] == 0)
    {
        (temp6+4)->logg[i] = 1;
        (temp3+i)->st_track[1]=1;
        (temp3+i)->address = (temp2+stack_index)->location;
        i=10;
    }
}
printf(" the division status is printed below in case 11\n");
status_print4(temp0,temp3,temp6);
division = 1;
break;
case 12:
sub
*/
for(j=0;j<=7;j++)
{
    (temp0+1)->bits[j] = (temp2+(look_ahead+1))->num_one[j+1];
    (temp0+2)->bits[j] = (temp2+(look_ahead+1))->num_two[j+1];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.div_add[(temp0+0)->bits[7]];
re_adjust = 1;
if(re_adjust == 1)
{
    printf("\n");
    printf(" re-adjust has been assigned one in case 12\n");
    printf("\n");
}
subtraction = 1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+2)->logg[i] == 0)
    {
        (temp6+2)->logg[i] = 1;
        (temp11+i)->st_track[1]=1;
        (temp11+i)->address = (temp2+(look_ahead+1))->location;
        i=10;
    }
}
printf(" the subtraction status is printed below in case 12\n");
status_print2(temp0,temp11,temp6);
/* the division is being loaded */
printf(" the latency is available \n");
/* loading the arguments into the stage div */
for(j=0;j<=7;j++)
{
    (temp0+3)->bits[j] = (temp2+stack_index)->num_one[j+1];
}
for(j=0;j<=8;j++)
{
    (temp0+4)->bits[j] = (temp2+stack_index)->num_two[j];
}
(temp0+0)->bits[3] = (temp1+matrix_index)->sdirection.div_add[(temp0+0)->bits[7]];
(temp0+0)->bits[7] = 0;
(temp0+0)->bits[10] = 0;
(temp0+0)->bits[2]+=1;
/* this part will initiate the tracking registers */
for(i=1;i<=9;i++)
{
    if((temp6+4)->logg[i] == 0)
    {
        (temp6+4)->logg[i] = 1;
        (temp3+i)->st_track[1]=1;
        (temp3+i)->address = (temp2+stack_index)->location;
        i=10;
    }
}
printf(" the division status is printed below in case 12\n");
status_print4(temp0,temp3,temp6);
division = 1;
break;
case 0:
if(delta_flag == 0)
0\n1l);
/*
(temp0+1)->bits[j] = 0;
(temp0+2)->bits[j] = 0;
(temp0+3)->bits[j] = 0;
(temp0+4)->bits[j] = 0;
*/
}
}
break;
}
}
return (*num0);
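/* Editorial sketch, not part of the original listing. The issue logic
   above repeatedly tests one column of a collision-matrix row at the
   current latency counter: a 0 means the operation may be issued and the
   counter is cleared; otherwise the counter advances and wraps after 8
   tries. The fragment below isolates that step under stated assumptions.
   The names probe_latency and LATENCY_SLOTS are hypothetical, and the row
   is modelled as a plain int array rather than the thesis structures. */

```c
#include <assert.h>

#define LATENCY_SLOTS 8   /* assumed width of one collision-matrix row */

/* Hypothetical helper: test the row at the current latency counter.
 * If the column holds 0, return that latency and clear the counter.
 * Otherwise advance the counter (wrapping at LATENCY_SLOTS) and
 * return -1 to signal that no issue is possible in this cycle. */
int probe_latency(const int row[LATENCY_SLOTS], int *latency)
{
    assert(*latency >= 0 && *latency < LATENCY_SLOTS);
    if (row[*latency] == 0) {
        int found = *latency;
        *latency = 0;
        return found;
    }
    *latency += 1;
    if (*latency == LATENCY_SLOTS)
        *latency = 0;   /* corresponds to the "reinitialising" branch */
    return -1;
}
```

/* A caller would invoke probe_latency once per cycle and issue the
   operation only when the return value is not -1. */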
/*******************************/
/****** Function Set Logg ******/
/*******************************/
struct6 set_logg(num1)
struct6 *num1;
{
int i,j,k,l;
struct6 *temp1;
temp1 = num1;
/* this function checks to see */
/* whether any of the process loggs */
/* are empty */
for(i=1;i<=5;i++)
{
    k = 0;
    for(j=0;j<=9;j++)
    {
        k = k | (temp1+i)->logg[j];
    }
    if (k == 0)
    {
        (temp1+i)->logg_stat = 0;
        printf(" logg stat is made 0 \n");
    }
    else
    {
        (temp1+i)->logg_stat = 1;
        printf(" logg stat is made 1 \n");
    }
}
return (*num1);
}
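/* Editorial sketch, not part of the original listing. Two idioms recur
   in this appendix: OR-ing all slots of a process log to decide whether
   it is empty (as set_logg does), and claiming the first free slot among
   indices 1..9 when an operation is issued (the "tracking registers"
   loops). The fragment below shows both in isolation. The type
   process_log and the function names are hypothetical stand-ins, not the
   thesis declarations. */

```c
#include <assert.h>

#define LOG_SLOTS 10   /* assumed: slots 0..9, as in the loops above */

/* Hypothetical stand-in for one process log record. */
struct process_log {
    int logg[LOG_SLOTS];
    int logg_stat;
};

/* OR all slots together; logg_stat becomes 1 if any slot is busy. */
int update_log_status(struct process_log *p)
{
    int j, k = 0;
    assert(p != 0);
    for (j = 0; j < LOG_SLOTS; j++)
        k = k | p->logg[j];
    p->logg_stat = (k != 0);
    return p->logg_stat;
}

/* Claim the first free slot in 1..9; return its index, or 0 if full. */
int claim_free_slot(struct process_log *p)
{
    int i;
    assert(p != 0);
    for (i = 1; i <= 9; i++) {
        if (p->logg[i] == 0) {
            p->logg[i] = 1;
            return i;
        }
    }
    return 0;
}
```

/* Returning from the loop replaces the listing's "i=10;" early exit;
   the effect on the log is the same. */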
main()
{
int one,two,i,j,k,l,m,v;
FILE *inptr;
FILE *read_ptr;
int index,test,x,enough;
index = 0;
test = 0;
x = 0;
enough = 0;
memory_ptr = &memory;
dstack1_ptr = &decode_stack1;
dstack2_ptr = &decode_stack2;
ilatch_ptr = &iunit_latches;
inhold_ptr = &internal_holders;
/ * enter-the instruction
par_pointer = &par_product;
arg1_pointer = &argument1;
*/
=
=
0;
0;
1
ptr_op = &op_code;
instack_ptr = &input_stack;
outstack_ptr = &output_stack;
bin_pointer = &binary_matrix;
/* Initialising all the flags to zero */
printf("initialising the flags \n");
for(i=0;i<=8;i++)
{
    (mpreg_ptr+0)->bits[i] = 0;
}
(mpreg_ptr+0)->bits[3] = 3;
(mpreg_ptr+0)->bits[2] = 1;
(mpreg_ptr+0)->bits[0] = 2;
re_adjust = 0;
/* reading in of the instruction stack and control structures */
/* reading of control.dat */
printf(" enter the number of instructions in the stack \n");
scanf("%d",&stk_ptr);
inptr = fopen("control.dat", "r");
if (inptr == (FILE *)NULL)
{
}
fread(bin_pointer, sizeof(struct collision_matrix), 89, inptr);
fclose(inptr);
/* reading of the instr.dat */
/* reading of the instruction stack */
read_ptr = fopen("instr.dat", "r");
if(read_ptr == (FILE *)NULL)
{
    printf(" error in reading operation for instr.dat\n");
    exit(1);
}
for(i=1;i<=stk_ptr;i++)
{
    fscanf(read_ptr,"\t");
    for(j=1;j<=8;j++)
    {
        fscanf(read_ptr,"%d ",&arg_two[i][j]);
    }
}
fclose(read_ptr);
/* printing of the instruction stack */
printf(" the instruction stack is printed below \n");
for(i=1;i<=stk_ptr;i++)
{
printf("\n");
printf(" %d\t ",op_code[i]);
for(j=1;j<=8;j++)
{
printf("\t");
for(j=l;j<=8;j++)
{
/*
printf(
below \n1I);
printf (I1\nn)
;
for (v=1; v <= 88; v++)
{
    for (l=0;l<8;++l)
    {
        printf("%d \b",binary_matrix[v].smatrix.bits_row1[l]);
    }
    printf("\n");
    printf("\n");
    for (l=0;l<8;++l)
    {
        printf("%d \b",binary_matrix[v].smatrix.bits_row2[l]);
    }
    printf("\n");
    printf("\n");
    for (l=0;l<8;++l)
    {
        printf("%d \b",binary_matrix[v].smatrix.bits_row3[l]);
    }
}
for (1=0;1<8;++1)
{
printf (I1\nM)
;
printf (I1\nl1)
;
printf (I1\nn)
;
for (1=0;1<8;++1)
{
p r i n t f ( It%d \b
,binary-matrix[v].sdirection.div -latency[l]);
It
printf ("\nl1);
printf ("\nW);
for (1=0;1<8;++1)
{
printf
I1%d \b
printf(" %d ",(instack_ptr + i)->num_one[j]);
printf("\n");
printf("the value of argument two is as follows \n");
{
printf(" %d ",(instack_ptr + i)->num_two[j]);
printf("\n");
i=0;
while((mpreg_ptr+0)->bits[2]<=13)
{
    /* the T ON cycle */
    time_off();
    output_check(trans_pointer,prstack_ptr,outstack_ptr,divflow_ptr,
                 multflow_ptr,addflow_ptr,prlogg_ptr,subflow_ptr,
                 deltaflow_ptr,mpreg_ptr);
    shift_track(divflow_ptr,multflow_ptr,addflow_ptr,prlogg_ptr,
                subflow_ptr,deltaflow_ptr);
}
}