Using the IEEE Floating Point Standard for an add/subtract execution unit
Lecture overview
Double Precision
Fields: s (1 bit), e (11 bits), f (52 bits)
If $e = 2047$ and $f \neq 0$, then $v$ is NaN regardless of $s$
If $e = 2047$ and $f = 0$, then $v = (-1)^s \infty$
If $0 < e < 2047$, then $v = (-1)^s \, 2^{e-1023} \, (1.f)$ (normalized number)
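The field layout above can be checked with a short sketch. This is an illustrative Python model, not part of the lecture's VHDL; the function name `decode_double` is hypothetical. It also covers the e = 0 cases (zero and denormalized numbers) that the later slides rely on.

```python
import struct

def decode_double(x: float):
    """Split a double into its s, e, f fields and classify the value."""
    # Reinterpret the 64-bit pattern of the float as an unsigned integer.
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    s = bits >> 63                 # 1 sign bit
    e = (bits >> 52) & 0x7FF       # 11 exponent bits
    f = bits & ((1 << 52) - 1)     # 52 fraction bits
    if e == 2047:
        kind = "nan" if f != 0 else "infinity"
    elif e == 0:
        kind = "zero" if f == 0 else "denormalized"
    else:
        kind = "normalized"
    return s, e, f, kind

print(decode_double(1.0))
print(decode_double(float("-inf")))
```

Calling it on a few constants is a quick way to get bit patterns for test vectors.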
Specification of a FPA
Inputs in IEEE 754 Double Precision
Must perform both addition and subtraction
Must handle the full floating point standard
Specifications continued
Result will be an IEEE 754 Double Precision representation
Unit will correctly handle the invalid operation of adding +∞ and -∞ (result is NaN, per the standard)
Unit latches its inputs into registers from parallel 64-bit data buses
There is a separate signal line that indicates the operation: add or subtract
Copyright 2006 - Joanne DeGroat, ECE, OSU 5
Specifications continued
Outputs
Zero result
Overflow to infinity from normalized numbers as inputs
NaN result
Overshift (result is the larger of the two operands)
Denormalized result
Inexact (result was rounded)
Invalid operation for addition
Data: 64-bit A, B, and C buses
Control signals: Latch, Add/Sub, Asel, Drive
Condition flags output: 7 flag signals
Clocks: Phi1 and Phi2 (a 2-phase clocked architecture)
[Figure: interface block diagram with signals labeled Abus, Bbus, Asel, Drive, Cbus, Flags, C_Main, and A_Out]
Basic design
[Figure: basic design, with the A/S control entering at the input latches. Stages in order:]
INPUT LATCHES
Input Adjust
Add Mantissas
Normalize Result
Round according to selected scheme
RESULT LATCHES
OUTPUT DRIVERS
b1: block ((Phi2 and Latch) = '1')
begin
  A_temp <= guarded A_Main;
  B_temp(63) <= guarded Add_or_sub xor B_Main(63);
  B_temp(62 downto 0) <= guarded B_Main(62 downto 0);
end block;

b2: block (Phi1 = '1')
begin
  signa <= guarded A_temp(63);
  signb <= guarded B_temp(63);
  expa <= guarded A_temp(62 downto 52);
  expb <= guarded B_temp(62 downto 52);
  mana <= guarded A_temp(51 downto 0);
  manb <= guarded B_temp(51 downto 0);
end block;
out_latch1 : block ((Drive and Phi2) = '1')
begin
  Flags <= guarded flag_temp;
end block out_latch1;

out_latch2 : block ((Drive and Phi2 and (not Asel)) = '1')
begin
  C_Main <= guarded signout & exp_round & man_round(51 downto 0);
end block out_latch2;

out_latch3 : block ((Drive and Phi2 and Asel) = '1')
begin
  A_out <= guarded signout & exp_round & man_round(51 downto 0);
end block out_latch3;
In the final design a lot goes in between, but you first want to make sure that the latches are working properly. So just pass one input through to the output and check:
signout <= signa;
exp_round <= expa;
man_round <= '0' & A_temp;
Once this works properly, you can move on with the design.
[Figure: input adjust datapath. Labels: Bexp, Amantissa, Bmantissa, "Aman & 0", control signals EA0, MA0, EA1+EB1, Eq, Cntrl; a Swap stage built from 2-1 muxes (L/R inputs, selR) controlled by E> + (E=·M>); Shift Dist driving a Right Linear Shifter; Sign Out (63) to the output latch.]
For the exponents, must get the larger exponent and the difference between the exponents, which is the shift distance. Several control signals are also generated.
For the mantissas, need to examine the two fractional parts and generate several control signals that are required to prepare the operands. Need relational signals M>, M=, M<.
Need to know whether the stored fractional part is all 0s or not. This is needed for NaN, 0, and infinity determination.
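As a rough behavioral sketch of these control signals, the following Python function mirrors the signal names used in the VHDL later in the deck. The function name `compare_operands` is hypothetical, and computing the shift distance as an absolute difference is an assumption about what the design's `expdist` function does.

```python
EXP_ALL_ONES = 0x7FF  # 11-bit exponent field with every bit set

def compare_operands(expa: int, mana: int, expb: int, manb: int) -> dict:
    """Relational and all-0s/all-1s signals for two (exponent, fraction) pairs."""
    return {
        "expgt": expa > expb,
        "expeq": expa == expb,
        "explt": expa < expb,
        "expa0": expa == 0,
        "expa1": expa == EXP_ALL_ONES,
        "expb0": expb == 0,
        "expb1": expb == EXP_ALL_ONES,
        "mangt": mana > manb,
        "maneq": mana == manb,
        "manlt": mana < manb,
        "mana0": mana == 0,
        "manb0": manb == 0,
        # Assumption: the shift distance is the absolute exponent difference.
        "shiftdist": abs(expa - expb),
        "larger_exp": max(expa, expb),
    }

signals = compare_operands(1023, 0, 1020, 5)
print(signals["shiftdist"], signals["larger_exp"])
```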
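Step 1 is to select between a normalized mantissa and a denormalized mantissa.
For normalized numbers, prepend NOT(Ex0) to the fractional part.
If Ex0 is a 1, then the exponent is all 0s and you have a denormalized number or 0. When Ex0 is a 0, you have a NaN, infinity, or a normalized number.
For denormalized numbers, shift the fractional part left one position (append a 0). This takes the implied exponent from 2^-126 to 2^-127 in single-precision terms (2^-1022 to 2^-1023 in double precision), so the operand can now be treated like a normalized number.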
Select the denormalized (shifted) form WHEN Ex0 * (NOT Mx0)
When Ex0 is a 1 you have a denormalized number or 0.
When Mx0 is a 0 there is at least 1 bit of the fractional part that is a 1, and thus you have a denormalized number.
Select the prepended form when Ex0 is a 0 or Mx0 is a 1
When Mx0 is a 1 you have infinity, 0, or a normalized number.
When Ex0 is a 0 you have a normalized number, infinity, or NaN.
The selection table also points out this relationship. Note that for a 0, NOT(Ex0) is prepended to the fractional part, which gives 0.00000000.
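The selection between the two mantissa forms can be sketched as follows. This is an illustrative Python model of the rule described above (shift left and append 0 for a denormalized number, otherwise prepend NOT(Ex0)); the function name `expand_mantissa` is hypothetical.

```python
def expand_mantissa(e: int, f: int) -> int:
    """Return the 53-bit expanded mantissa for exponent field e and fraction f."""
    ex0 = (e == 0)   # exponent field is all 0s
    mx0 = (f == 0)   # fraction field is all 0s
    if ex0 and not mx0:
        # Denormalized: fraction shifted left one place, 0 appended (f & '0').
        return (f << 1) & ((1 << 53) - 1)
    # Otherwise prepend NOT(Ex0): 1.f for normalized, 0.000... for zero.
    hidden = 0 if ex0 else 1
    return (hidden << 52) | f

print(hex(expand_mantissa(1023, 0)))  # normalized 1.0
print(hex(expand_mantissa(0, 1)))     # smallest denormalized number
```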
The crossbar switch places the larger value on the right path and the smaller onto the left path. The smaller is the operand to shift if any shifting is needed to align the binary point. The equation for exchange on the crossbar is
E> + (E=*M>), that is, route the A input to the right side if the exponent of A is the larger OR the exponents are equal and the fractional part of A is larger.
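A minimal sketch of the exchange rule, written in Python for illustration. The function name `crossbar` and the tuple it returns are hypothetical; the swap condition is the E> + (E=*M>) equation from the slide.

```python
def crossbar(expa: int, mana: int, expb: int, manb: int):
    """Place the larger mantissa on the right path, the smaller on the left."""
    # swap = E> + (E= * M>): A goes to the right path when A is the larger.
    swap = expa > expb or (expa == expb and mana > manb)
    right = mana if swap else manb
    left = manb if swap else mana
    return swap, left, right

print(crossbar(1023, 7, 1020, 9))
```

When the two operands are equal in magnitude, no swap occurs and B stays on the right path.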
Now have the smaller on the left path and the larger on the right path.
On the left path, if either exponent is all 1s, then that operand is NaN or infinity and has been crossbarred to the right path (or is equal to the right path operand). In this case we simply want to pass it through to the output by adding 0 to it, so a 0 is one choice of the left path mux.
On the right path, select the right path value, or mux in a hardwired NaN for an illegal operation.
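The two mux select signals can be modeled in a few lines. This Python sketch follows the select equations in the deck's VHDL (`in_mux_l_zero` and the NaN select); the function name `special_case_selects` is hypothetical, and `signb` is assumed to already include the add/subtract inversion applied at the input latch.

```python
EXP_ALL_ONES = 0x7FF

def special_case_selects(expa, mana, signa, expb, manb, signb):
    """Return (in_mux_l_zero, in_mux_r_nan) for the left and right path muxes."""
    expa1 = expa == EXP_ALL_ONES
    expb1 = expb == EXP_ALL_ONES
    mana0 = mana == 0
    manb0 = manb == 0
    # Left path: force 0 whenever either operand is NaN or infinity.
    in_mux_l_zero = expa1 or expb1
    # Right path: force NaN only for infinity plus infinity of opposite sign.
    in_mux_r_nan = expa1 and mana0 and expb1 and manb0 and (signa != signb)
    return in_mux_l_zero, in_mux_r_nan

print(special_case_selects(0x7FF, 0, 0, 0x7FF, 0, 1))
```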
Linear shifting
The next step is to linear shift the left operand. The exponent unit generates the exponent > signals by subtracting the exponents, ExpA-ExpB and ExpB-ExpA. Then, with the help of these control signals, the exponent difference is known and this value is sent to the shifter.
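A behavioral sketch of the alignment shift, in Python. It assumes 4 guard bits below the 53-bit mantissa (matching the "0000" appended on the right path in the deck's code) and reports an overshift when every bit of a nonzero operand is shifted out; both the function name `align` and that overshift definition are assumptions, and no sticky bit is modeled.

```python
GUARD_BITS = 4  # assumption: same width as the "0000" appended to the right path

def align(man53: int, dist: int):
    """Right-shift the smaller mantissa by dist, keeping GUARD_BITS extra bits."""
    wide = man53 << GUARD_BITS      # make room for the guard bits
    shifted = wide >> dist          # linear right shift by the exponent difference
    overshift = man53 != 0 and shifted == 0
    return shifted, overshift

print(align(1 << 52, 3))
```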
The right path operand, the larger, is simply input to the ADDER. On the left path, the output of the linear shifter is sent to the ADDER for a + operation, OR the ones complement of the value is sent to the ADDER for a - operation. In the latter case the input carry is set so that the ones complement becomes a twos complement.
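The ones complement plus carry-in trick can be checked with a small Python model. The function name `add_mantissas` and the default 58-bit width are illustrative assumptions; the point is only that inverting the smaller operand and adding with a carry-in of 1 yields larger minus smaller.

```python
def add_mantissas(left: int, right: int, subtract: bool, width: int = 58) -> int:
    """Add the aligned smaller (left) and larger (right) mantissas."""
    mask = (1 << width) - 1
    if subtract:
        left = ~left & mask          # ones complement of the smaller operand
    carry_in = 1 if subtract else 0  # carry-in of 1 completes the twos complement
    return (left + right + carry_in) & mask

print(add_mantissas(3, 10, subtract=True))
print(add_mantissas(3, 10, subtract=False))
```

Because the crossbar guarantees that the right operand is at least as large as the left, the subtract result is never negative.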
-- Exponent Processing
expgt <= '1' when (expa > expb) else '0';
expeq <= '1' when (expa = expb) else '0';
explt <= '1' when (expa < expb) else '0';
expa0 <= '1' when (expa = zeroexp) else '0';
expa1 <= '1' when (expa = oneexp) else '0';
expb0 <= '1' when (expb = zeroexp) else '0';
expb1 <= '1' when (expb = oneexp) else '0';
shiftdist <= expdist (expa, expb);
larger_exp <= expa when (expa >= expb) else expb;

-- Mantissa Processing
mangt <= '1' when (mana > manb) else '0';
maneq <= '1' when (mana = manb) else '0';
manlt <= '1' when (mana < manb) else '0';
mana0 <= '1' when (mana = zeroman) else '0';
manb0 <= '1' when (manb = zeroman) else '0';

-- Expanded Normalized Form
adenorm <= expa0 and (not mana0);
bdenorm <= expb0 and (not manb0);
lshfa(52 downto 1) <= mana;
lshfa(0) <= '0';
lshfb(52 downto 1) <= manb;
lshfb(0) <= '0';
lxbarin <= lshfa when (adenorm = '1') else ((not expa0) & mana);
rxbarin <= lshfb when (bdenorm = '1') else ((not expb0) & manb);

-- XBar to place larger on right path
swap <= expgt or (expeq and mangt);
xbar_r <= lxbarin when (swap = '1') else rxbarin;
xbar_l <= rxbarin when (swap = '1') else lxbarin;

-- Hard Code NaN
in_mux_r_nan <= expa1 and mana0 and expb1 and manb0 and (signa xor signb);
in_mux_r <= nan_man when (in_mux_r_nan = '1') else xbar_r;

-- Hard Code Zero
in_mux_l_zero <= expa1 or expb1;
in_mux_l <= zero_man when (in_mux_l_zero = '1') else xbar_l;

-- Shift smaller
shifted_sig <= shift (shiftdist, in_mux_l);

-- A+B or A+(-B)?
twoscomp <= signa xor signb;
ladderin <= '0' & shifted_sig when (twoscomp = '0') else '1' & (not shifted_sig);
radderin <= '0' & in_mux_r & "0000";

-- Binary Add
adder_out <= add (ladderin, radderin, twoscomp);
Code
swap <= expgt OR (expeq AND mangt);
xbar_r <= lxbarin when (swap = '1') else rxbarin;
xbar_l <= rxbarin when (swap = '1') else lxbarin;
The code
-- Control equation for mux
in_mux_r_nan <= expa1 AND mana0 AND expb1 AND manb0 AND (signa XOR signb);
in_mux_r <= nan_man WHEN (in_mux_r_nan = '1') ELSE xbar_r;
Simply add the two mantissas. As the sign of the B input was XORed with the operation, i.e., inverted if it was a subtract operation, the carry-in is the XOR of the two signs. If the signs are different, then a subtract is being performed and a 1 is input to the carry-in of the adder. The adder does twos complement addition. Inputs are of the form x.xxxxxxx or 54 bits. The output is of the form xx.xxxxxxx or 58 bits.
This is perhaps the hardest part: renormalization of the result. We have a result exponent (the exponent of the larger operand) and a mantissa in the form xx.xxxxxxxxxx. The following slide shows the processing needed.
Renormalization Unit
[Figure: renormalization unit. Inputs: Larger Exponent and Adder Output (XX.XXXXXX). Labeled blocks: Result Signal Generation (fract0, zero, UF), Left Linear Shifter, two Right Shift 1 units, inverters, +1 incrementer, a 4 to 1 Mux and a 5 to 1 Mux with controls c1 to c5. Outputs: exp_norm and man_norm.]
May need to shift the mantissa 1 position to the right on a fixed binary point. May be OK as is. May have to shift left, in which case the position of the leading 1 must be known.
In a behavioral model you can simply shift left once, increment a counter, and then check. In hardware you need a leading 1 detector that gives the position of the leading 1 so that the mantissa can be shifted left.
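The three cases can be sketched behaviorally in Python. The function name `renormalize` and the `point` parameter (the bit position where the leading 1 of a 1.xxx value sits) are hypothetical, and this sketch deliberately ignores exponent underflow into the denormalized range and any sticky bit lost on the right shift.

```python
def renormalize(total: int, exp: int, point: int):
    """Return (mantissa, exponent) with the leading 1 moved to bit `point`."""
    if total == 0:
        return 0, 0                    # hardwired zero result
    pos = total.bit_length() - 1       # leading 1 detector
    if pos > point:
        # Form 1x.xxx: shift right one place, increment the exponent.
        return total >> 1, exp + 1
    shift = point - pos                # 0 means the value is OK as is
    return total << shift, exp - shift

print(renormalize(0b110, 10, 1))  # needs a right shift
print(renormalize(0b001, 10, 1))  # needs a left shift
```

`int.bit_length()` plays the role of the leading 1 detector here; the loop version (shift left once, count, check) gives the same result one step at a time.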
Interactions
All shifts of mantissa result in exponent adjustment. There are 4 choices on the exponent
Interactions
As is
Right shifted by 1 (increment exp by 1)
Left shifted for leading 1
Left shifted and then right shifted by 1
Hardwired 0
This part is the same for both addition and multiplication. Easy to do algorithmically.
Rounding Unit
Once done with renormalization will look at the guard bits to determine rounding. Standard specifies several rounding modes. Can also just truncate.
[Figure: rounding unit. Labels: exp_norm into a +1 incrementer and a 2-1 Mux; man_norm split into 53msb and 5lsb; Round (msbin xor msbout); Mantissa Output.]
Rounding
Can result in changes to both the mantissa and the exponent. After rounding final result is output in normalized form.
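As one possible illustration, here is a Python sketch of round to nearest even, one of the modes the standard specifies. The function name `round_nearest_even` and the default of 4 guard bits are assumptions; the deck does not say which mode or how many guard bits the final design uses.

```python
def round_nearest_even(man: int, exp: int, guard: int = 4):
    """Round a mantissa carrying `guard` extra low bits down to 53 bits."""
    keep = man >> guard                 # the 53 most significant bits
    rest = man & ((1 << guard) - 1)     # the guard bits being discarded
    half = 1 << (guard - 1)
    inexact = rest != 0
    # Round up above the halfway point, or at the halfway point if keep is odd.
    if rest > half or (rest == half and keep & 1):
        keep += 1
    if keep >> 53:                      # rounding carried out to 10.000...
        keep >>= 1
        exp += 1                        # so the exponent changes as well
    return keep, exp, inexact

all_ones = (((1 << 53) - 1) << 4) | 0b1000
print(round_nearest_even(all_ones, 1023))
```

The all-ones case shows the interaction noted above: the round-up ripples through the whole mantissa and increments the exponent.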
Any arithmetic unit outputs flags on the status and validity of the result. The flags to be generated are output from various control signals or combinations of various control signals.
zero <= (not dec2) and (not dec1) and fract0;
overflow <= man_sel_0 or (round_incr_exp and exp_norm_p1_all1);
nan <= (expa1 and (not mana0)) or (expb1 and (not manb0)) or invalid;
denorm <= (man_sel_ls_rs and (not fract0)) or ((not dec2) and dec1 and lgrall0);
inexact <= round;
invalid <= in_mux_r_nan;
flag_temp <= zero & overflow & nan & overshift & denorm & inexact & invalid;
Must test for normal operation and boundary conditions. Will check A by B.
Boundary conditions
Denorm + Denorm = Max Denorm
Denorm + Denorm = Min Norm
Norm - Norm = Max Denorm
Rounding using first guard bit
Rounding using 1st and 2nd guard bits
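Expected values for the first three boundary conditions can be generated from a software reference. The Python sketch below uses the host's native double arithmetic as that reference (assuming it follows IEEE 754 without flushing denormalized numbers to zero); the specific operand bit patterns are illustrative choices, not vectors taken from the lecture.

```python
import struct

def from_bits(bits: int) -> float:
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

def to_bits(value: float) -> int:
    return struct.unpack(">Q", struct.pack(">d", value))[0]

MAX_DENORM = 0x000FFFFFFFFFFFFF   # e = 0, f all 1s
MIN_NORM = 0x0010000000000000     # e = 1, f = 0

# (name, A bits, B bits, subtract?, expected result bits)
CASES = [
    ("denorm + denorm = max denorm", 0x0008000000000000, 0x0007FFFFFFFFFFFF, False, MAX_DENORM),
    ("denorm + denorm = min norm", 0x0008000000000000, 0x0008000000000000, False, MIN_NORM),
    ("norm - norm = max denorm", 0x001FFFFFFFFFFFFF, MIN_NORM, True, MAX_DENORM),
]

def reference_result(a_bits: int, b_bits: int, subtract: bool) -> int:
    """Compute A op B with native doubles and return the result bit pattern."""
    a, b = from_bits(a_bits), from_bits(b_bits)
    return to_bits(a - b if subtract else a + b)

def check_cases():
    return [(name, reference_result(a, b, sub) == want) for name, a, b, sub, want in CASES]

for name, ok in check_cases():
    print(name, "ok" if ok else "MISMATCH")
```

The same bit patterns can be driven onto the A and B buses in a testbench and compared against the unit's output.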
Testing
Testing of the design code is not necessarily the same as the testing that would be done on the chip. The testing of the design is called verification and must ensure that all possible input combinations produce the specified output.