CDA5155 Fall 2016 Homework 1 - Dhiraj Borade

CDA 5155 COMPUTER ARCHITECTURE PRINCIPLE (FALL 2016)
HOMEWORK # 1
NAME: DHIRAJ V. BORADE
UFID: 4595-8142
1. In this exercise, assume that we are considering enhancing a machine by adding
vector hardware to it. When a computation is run in vector mode on the vector
hardware, it is 10 times faster than the normal mode of execution. We call the
percentage of time that could be spent using vector mode the percentage of
vectorization. Vectors are discussed in Chapter 4, but you don't need to know
anything about how they work to answer this question!
Solution:
As per Amdahl's Law, let the computation time of the machine before the addition of
vector hardware be denoted by T. It includes the computation time of the part that does
not benefit from vectorization and the execution time of the part that does. The fraction
of the computation time that benefits from vectorization is denoted by p; the fraction
that does not benefit is therefore 1 - p. Then,
\[ T = (1 - p)\,T + p\,T \]
Only the part that benefits from vectorization is sped up, by the factor s. Consequently,
the computation time of the part that does not benefit remains the same, while the part
that benefits becomes
\[ \frac{p}{s}\,T \]
The theoretical execution time T(s) of the whole task after the improvement of the
resources is then
\[ T(s) = (1 - p)\,T + \frac{p}{s}\,T \]
Thus the net speedup (NS) can be derived as
\[ NS = \frac{T}{T(s)} = \frac{1}{(1 - p) + \frac{p}{s}} \]
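The net speedup formula above can be sketched as a small Python function. The name `amdahl_speedup` is only illustrative and is not part of the original solution; p is taken as a fraction between 0 and 1, and s is the speedup of the enhanced mode.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Net speedup when a fraction p of the original run time is sped up by s."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    # Unenhanced part keeps its time (1 - p); enhanced part shrinks to p / s.
    return 1.0 / ((1.0 - p) + p / s)


if __name__ == "__main__":
    # Vector mode in this exercise is 10 times faster than normal mode.
    for p in (0.0, 0.5, 0.7, 1.0):
        print(f"p = {p:.1f}: net speedup = {amdahl_speedup(p, 10):.3f}")
```

With s = 10 the result stays between 1 (nothing vectorized) and 10 (everything vectorized).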
a. What percentage of vectorization is needed to achieve a speedup of 2?
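\[ NS = \frac{1}{(1 - p) + \frac{p}{s}} \]
\[ 2 = \frac{1}{(1 - p) + \frac{p}{10}} \]
\[ 1 - 0.9p = 0.5 \quad\Rightarrow\quad p = \frac{0.5}{0.9} \approx 0.55556 = 55.556\% \]
Thus, a vectorization of 55.556% is necessary to achieve a
speedup of 2.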
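Solving Amdahl's Law for p gives p = (1 - 1/NS) / (1 - 1/s). A minimal sketch of that inversion follows; the function name `required_fraction` is an assumption made for illustration.

```python
def required_fraction(target_speedup: float, s: float) -> float:
    """Fraction of run time that must use the enhanced mode to reach target_speedup."""
    if s <= 1:
        raise ValueError("s must be greater than 1")
    if not 1 <= target_speedup <= s:
        raise ValueError("target speedup must lie between 1 and s")
    # From 1/NS = 1 - p * (1 - 1/s), rearranged for p.
    return (1 - 1 / target_speedup) / (1 - 1 / s)


if __name__ == "__main__":
    p = required_fraction(2, 10)
    print(f"vectorization needed for a speedup of 2: {100 * p:.3f}%")
```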
b. What percentage of the computation run time is spent in vector mode if a
speedup of 2 is achieved?
A speedup of 2 requires a vectorization of 55.556%, i.e., 55.556% of the
original run time is vectorizable and 44.444% is not.
Assume that the original code takes 100 seconds to execute. The
unvectorized portion then takes 44.444 seconds, both before and after
the enhancement.
Since a net speedup of 2 is achieved, the whole code takes 50 seconds
to execute once vector hardware is used.
Hence, the vectorized code takes 50 - 44.444 = 5.556 seconds to
execute, which amounts to
\[ \frac{5.556}{50} \times 100 \approx 11.11\% \]
of the computation run time.
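The same bookkeeping can be written without picking a 100-second run: after the enhancement, the vector part takes p/s of the original time and the scalar part takes 1 - p. The sketch below uses the hypothetical helper name `vector_time_share`.

```python
def vector_time_share(p: float, s: float) -> float:
    """Share of the NEW run time that is spent in the enhanced (vector) mode."""
    vector_time = p / s        # vectorizable part after the s-fold speedup
    scalar_time = 1.0 - p      # part that does not benefit
    return vector_time / (scalar_time + vector_time)


if __name__ == "__main__":
    p = 0.5 / 0.9              # vectorization that yields a net speedup of 2
    print(f"time in vector mode: {100 * vector_time_share(p, 10):.2f}%")
```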
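c. What percentage of vectorization is needed to achieve one-half the maximum
speedup attainable from using vector mode?
The maximum speedup is 10 (reached when p = 1), so one-half of the maximum
speedup is 5.
As per Amdahl's Law,
\[ NS = \frac{1}{(1 - p) + \frac{p}{s}} \]
\[ 5 = \frac{1}{(1 - p) + \frac{p}{10}} \]
\[ 1 - 0.9p = 0.2 \quad\Rightarrow\quad p = \frac{0.8}{0.9} \approx 0.88889 = 88.889\% \]
Thus, a vectorization of 88.889% is necessary to achieve a
speedup of 5.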
d. Suppose you have measured the percentage of vectorization of the program
to be 70%. The hardware design group estimates it can speed up the vector
hardware even more with significant additional investment. You wonder
whether the compiler crew could increase the percentage of vectorization,
instead. What percentage of vectorization would the compiler team need to
achieve in order to equal an additional 2x speedup in the vector unit (beyond
the initial 10x)?
70% vectorization yields a net speedup of
\[ NS = \frac{1}{(1 - p) + \frac{p}{s}} = \frac{1}{(1 - 0.7) + \frac{0.7}{10}} \approx 2.703 \]
Now increase the hardware enhancement factor to \( s = 10 \times 2 = 20 \):
\[ NS = \frac{1}{(1 - 0.7) + \frac{0.7}{20}} \approx 2.985 \]
The percentage of vectorization that the compiler team needs to achieve with
the original vector unit (s = 10) follows from
\[ 2.985 = \frac{1}{(1 - p) + \frac{p}{10}} \]
\[ 1 - 0.9p = 0.335 \quad\Rightarrow\quad p = \frac{0.665}{0.9} \approx 0.7389 = 73.89\% \]
The compiler crew has to achieve about 73.89% vectorization to match the
proposed hardware improvement, and more than that to beat it.
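Two configurations give the same speedup exactly when their Amdahl denominators match, so the required vectorization has a closed form: p' = p (1 - 1/s_new) / (1 - 1/s_old). The following sketch checks this; the function names are illustrative and not taken from the original solution.

```python
def speedup(p: float, s: float) -> float:
    """Amdahl net speedup for vectorized fraction p and vector speedup s."""
    return 1.0 / ((1.0 - p) + p / s)


def equivalent_fraction(p: float, s_old: float, s_new: float) -> float:
    """Fraction needed on the s_old hardware to match fraction p on s_new hardware."""
    # Equal speedups <=> p_eq * (1 - 1/s_old) == p * (1 - 1/s_new).
    return p * (1 - 1 / s_new) / (1 - 1 / s_old)


if __name__ == "__main__":
    p_eq = equivalent_fraction(0.7, s_old=10, s_new=20)
    print(f"compiler target: {100 * p_eq:.2f}% vectorization")
    print(f"speedup with better compiler: {speedup(p_eq, 10):.4f}")
    print(f"speedup with faster hardware: {speedup(0.7, 20):.4f}")
```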
2. In a server farm such as that used by Amazon or eBay, a single failure does not
cause the entire system to crash. Instead, it will reduce the number of requests that
can be satisfied at any one time.
Solution:
a. If a company has 10,000 computers, each with a MTTF of 35 days, and it
experiences catastrophic failure only if 1/3 of the computers fail, what is the
MTTF for the system?
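The number of computers that need to fail for a catastrophic failure of the
system is
\[ \frac{1}{3} \times 10000 = 3333.3333 \]
Failures in Time (FIT), expressed here in failures per day for the whole farm:
\[ FIT = \frac{1}{35} \times 10000 = 285.714 \]
MTTF of the system:
\[ MTTF = \frac{1}{FIT} \times (\text{number of computer failures for complete system failure}) \]
\[ MTTF = \frac{1}{285.714} \times 3333.3333 = 11.6667 \text{ days} \]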
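The calculation above can be sketched in a few lines. It follows the same simplification as the solution, namely that the farm keeps failing at a constant aggregate rate and that failed machines are not repaired; the function name and parameters are assumptions for illustration.

```python
def system_mttf_days(n_machines: int, machine_mttf_days: float,
                     fatal_fraction: float) -> float:
    """Days until fatal_fraction of the machines have failed, at a constant rate."""
    failures_per_day = n_machines / machine_mttf_days   # aggregate failure rate
    failures_needed = n_machines * fatal_fraction       # failures that bring the system down
    return failures_needed / failures_per_day


if __name__ == "__main__":
    mttf = system_mttf_days(10_000, 35, 1 / 3)
    print(f"system MTTF: {mttf:.4f} days")
```

Under this model the machine count cancels out, so the result is simply 35/3 days.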
b. If it costs an extra $1000, per computer, to double the MTTF, would this be a
good business decision? Show your work.
Initial cost of a computer to be used for commercial purposes = $1000
(estimated market price)
Initial setup cost for each computer = $5
Total cost for 10000 computers:
\[ 10000 \times (\$1000 + \$5) = \$10.05 \text{ million} \]
Every computer in the server farm contributes to the business of
the company.
Also, the cost of downtime is very high: e-commerce websites such as
Amazon and eBay have downtime losses in the range of $100,000 to
$2,000,000 (according to Figure 1.3).
With
\[ FIT = \frac{1}{35} \times 10000 = 285.714 \text{ failures per day} \]
the mean time between failures across the farm is
\[ MTTF = \frac{1}{285.714} \text{ days} \approx 0.0035 \text{ days} = 5.04 \text{ minutes} \]
which implies that in the current system one computer fails approximately
every 5 minutes.
This is the time available to isolate the failed computer, swap it, and bring
it back online.
It is important to extend this time, because the cost of downtime is huge.
Thus, investing an extra $1000 in each computer would help the
company, since downtime losses would be reduced.
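To put numbers on the trade-off, the sketch below computes the average gap between failures across the farm before and after doubling the per-machine MTTF, along with the total upgrade cost. It assumes failures are spread evenly over time; the helper name `minutes_between_failures` is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def minutes_between_failures(n_machines: int, machine_mttf_days: float) -> float:
    """Average time between successive machine failures across the whole farm."""
    failures_per_day = n_machines / machine_mttf_days
    return MINUTES_PER_DAY / failures_per_day


if __name__ == "__main__":
    n = 10_000
    before = minutes_between_failures(n, 35)       # current machines
    after = minutes_between_failures(n, 2 * 35)    # MTTF doubled
    upgrade_cost = n * 1000                        # extra $1000 per computer
    print(f"gap between failures now:       {before:.2f} minutes")
    print(f"gap between failures afterward: {after:.2f} minutes")
    print(f"total upgrade cost:             ${upgrade_cost:,}")
```

Doubling the MTTF doubles the gap between failures, at a total upgrade cost of $10,000,000 for the 10,000 machines.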
3. The value represented by the hexadecimal number 434F 4D50 5554 4552 is to be
stored in an aligned 64-bit double word.
Solution:
a. Using the physical arrangement of the first row in Figure A.5, write the value
to be stored using Big Endian byte order. Next, interpret each byte as an
ASCII character and below each byte write the corresponding character,
forming the character string as it would be stored in Big Endian order.
Big Endian byte order puts the byte whose address is x . . . x000 at the
most-significant position in the double word (the big end). The bytes are
numbered:
0  1  2  3  4  5  6  7
Using the physical arrangement of the first row in Figure A.5, the given
hexadecimal number is stored in the 64-bit double word in Big Endian byte
order as follows:
43 4F 4D 50 55 54 45 52
Interpreting each byte as an ASCII character gives the character string as it
would be stored in Big Endian order:
43 4F 4D 50 55 54 45 52
C  O  M  P  U  T  E  R
The string formed by the given hexadecimal number in Big Endian
order is COMPUTER.
b. Using the same physical arrangement as in part (a), write the value to be
stored using Little Endian byte order, and below each byte write the
corresponding ASCII character.
Little Endian byte order puts the byte whose address is x . . . x000 at the
least-significant position in the double word (the little end). The bytes are
numbered:
7  6  5  4  3  2  1  0
Using the physical arrangement of the first row in Figure A.5, the given
hexadecimal number is stored in the 64-bit double word in Little Endian byte
order as follows:
52 45 54 55 50 4D 4F 43
Interpreting each byte as an ASCII character gives the character string as it
would be stored in Little Endian order:
52 45 54 55 50 4D 4F 43
R  E  T  U  P  M  O  C
The string formed by the given hexadecimal number in Little Endian
order is RETUPMOC.
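Both byte orders can be checked with Python's built-in `int.to_bytes`, which lays out an integer in a chosen byte order starting at the lowest address. This is only a quick verification sketch; the helper name `stored_bytes` is an assumption.

```python
VALUE = 0x434F4D5055544552  # the 64-bit value from the problem statement


def stored_bytes(value: int, byteorder: str):
    """Return (hex bytes, ASCII text) in increasing address order."""
    raw = value.to_bytes(8, byteorder=byteorder)
    hex_row = " ".join(f"{b:02X}" for b in raw)
    return hex_row, raw.decode("ascii")


if __name__ == "__main__":
    for order in ("big", "little"):
        hex_row, text = stored_bytes(VALUE, order)
        print(f"{order:>6}: {hex_row}  ->  {text}")
```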
4. For the following we consider instruction encoding for instruction set architectures.
Solution:
a. Consider the case of a processor with an instruction length of 12 bits and with
32 general-purpose registers so the size of the address fields is 5 bits. Is it
possible to have instruction encodings for the following?
i. 3 two-address instructions
ii. 30 one-address instructions
iii. 45 zero-address instructions
First, we must determine whether the encoding is possible.
3 two-address instructions => \( 3 \times 2^5 \times 2^5 = 3072 \)
30 one-address instructions => \( 30 \times 2^5 = 960 \)
45 zero-address instructions => 45
Total instructions => \( 3072 + 960 + 45 = 4077 \)
Total possible instructions with an instruction length of 12 bits => \( 2^{12} = 4096 \)
Since 4077 < 4096, the encoding is possible.

We need to use variable-length opcodes so that all of the instructions
fit together with their respective operands.
3 two-address instructions:
The lower 10 bits hold the two address fields (5 bits x 2 addresses).
The upper 2 bits form the opcode; three of their four values (00, 01, 10) select
the two-address instructions, which leaves one spare value (11).
2-bit opcode + 5 bits x 2 addresses = 12 bits
Therefore, we have,
00 to 10 + 2 x 5-bit address
30 one-address instructions:
We use the fourth value of the upper 2 bits (11) to distinguish these
from the 3 two-address instructions.
The next 5 bits give \( 2^5 = 32 \) opcode values, which leaves 2 spare values after
the 30 one-address instructions.
2 bits + 5-bit opcode + 5-bit address = 12 bits
Therefore, we have,
11 + 00000 + 5 address bits
.
.
.
11 + 11101 + 5 address bits
45 zero-address instructions:
We use the two remaining encodings, 11 + 11110 and 11 + 11111,
together with the remaining bits to represent the zero-address instructions.
The two encodings share the 6-bit prefix 11 + 1111, so the low 6 bits give
\( 2^6 = 64 \) opcode values, enough for the 45 zero-address instructions.
2 bits + 4 bits + 6-bit opcode = 12 bits

Therefore, we have,
11 + 1111 + 6 bits
                               Bits [11:10]   Bits [9:5]        Bits [4:0]
3 two-address instructions     00 to 10       00000 to 11111    00000 to 11111
30 one-address instructions    11             00000 to 11101    00000 to 11111
45 zero-address instructions   11             11110             00000 to 11111
                               11             11111             00000 to 01100
(The last row ends at 01100 because 32 + 13 = 45 zero-address encodings are needed.)
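The encoding scheme can be sanity-checked by classifying every 12-bit word and counting the results. The sketch below assumes the zero-address opcodes are numbered consecutively from 0 to 44 in the low 6 bits, which is one possible assignment rather than something the exercise prescribes; the function name `classify` is hypothetical.

```python
from collections import Counter


def classify(word: int) -> str:
    """Classify a 12-bit instruction word under the variable-length opcode scheme."""
    if not 0 <= word < (1 << 12):
        raise ValueError("word must fit in 12 bits")
    top = word >> 10             # bits [11:10]
    mid = (word >> 5) & 0x1F     # bits [9:5]
    if top != 0b11:
        return "two-address"     # opcodes 00, 01, 10 plus two 5-bit addresses
    if mid <= 0b11101:
        return "one-address"     # 11 + 5-bit opcode 00000..11101 + one address
    # Prefix 11 + 1111: the low 6 bits act as the zero-address opcode (assumed 0..44).
    zero_opcode = word & 0x3F
    return "zero-address" if zero_opcode < 45 else "unused"


if __name__ == "__main__":
    counts = Counter(classify(w) for w in range(1 << 12))
    for kind, n in sorted(counts.items()):
        print(f"{kind:>12}: {n}")
```

Counting over all 4096 words gives 3072 two-address, 960 one-address, and 45 zero-address patterns, with 19 left unused.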
b. Assuming the same instruction length and address field sizes as above,
determine if it is possible to have
i. 3 two-address instructions
ii. 31 one-address instructions
iii. 35 zero-address instructions
Explain your answer.
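First, we must determine whether the encoding is possible.
3 two-address instructions => \( 3 \times 2^5 \times 2^5 = 3072 \)
31 one-address instructions => \( 31 \times 2^5 = 992 \)
35 zero-address instructions => 35
Total instructions => \( 3072 + 992 + 35 = 4099 \)
Total possible instructions with an instruction length of 12 bits => \( 2^{12} = 4096 \)
Since 4099 > 4096, the encoding is not possible.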

c. Assume the same instruction length and address field sizes as above. Further
assume there are already 3 two-address and 24 zero-address instructions.
What is the maximum number of one-address instructions that can be
encoded for this processor?
\( 2^{12} = 4096 \)

Out of the available 4096 encodings, 3072 and 24 are occupied by the two-address
and zero-address instructions respectively.
\( 4096 - 3072 - 24 = 1000 \)
This leaves 1000 encodings for one-address instructions.
For 31 one-address instructions => \( 31 \times 2^5 = 992 \le 1000 \) => possible
For 32 one-address instructions => \( 32 \times 2^5 = 1024 > 1000 \) => not possible
Hence, the maximum number of one-address instructions that can be encoded
is 31.
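The counting argument used in parts (a) to (c) can be captured in a short script. It only checks that enough 12-bit patterns exist, which is the criterion this solution applies; the function names are illustrative assumptions.

```python
WORD_BITS = 12   # instruction length
ADDR_BITS = 5    # one address field (32 registers)


def encodings_needed(two_addr: int, one_addr: int, zero_addr: int) -> int:
    """Number of 12-bit patterns consumed by the given instruction mix."""
    return (two_addr * 2 ** (2 * ADDR_BITS)
            + one_addr * 2 ** ADDR_BITS
            + zero_addr)


def fits(two_addr: int, one_addr: int, zero_addr: int) -> bool:
    """True if the mix does not exceed the 2**12 available patterns."""
    return encodings_needed(two_addr, one_addr, zero_addr) <= 2 ** WORD_BITS


def max_one_address(two_addr: int, zero_addr: int) -> int:
    """Largest number of one-address instructions that still fits."""
    free = 2 ** WORD_BITS - encodings_needed(two_addr, 0, zero_addr)
    return max(free, 0) // 2 ** ADDR_BITS


if __name__ == "__main__":
    print("part a:", encodings_needed(3, 30, 45), fits(3, 30, 45))
    print("part b:", encodings_needed(3, 31, 35), fits(3, 31, 35))
    print("part c:", max_one_address(3, 24))
```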
