
IEEE Floating Point Adder

Using the IEEE Floating Point Standard for an add/subtract execution unit

1/8/2007 - L25 Floating Point Adder

Copyright 2006 - Joanne DeGroat, ECE, OSU

Lecture overview

The interface
Part by part: a floating point adder design


Adder is double precision

Double Precision
s (1 bit) | e (11 bits) | f (52 bits)

Value of bits in word representation is:

If e = 2047 and f /= 0, then v is NaN regardless of s
If e = 2047 and f = 0, then $v = (-1)^s \, \infty$
If 0 < e < 2047, then $v = (-1)^s \, 2^{e-1023} \, (1.f)$ (normalized number)
If e = 0 and f /= 0, then $v = (-1)^s \, 2^{-1022} \, (0.f)$ (denormalized number; denormalized numbers allow for graceful underflow)
If e = 0 and f = 0, then $v = (-1)^s \, 0$ (zero)
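The field rules above can be sketched as a small Python model. This is only an illustration, not part of the lecture's design; the function names `decode_double` and `classify` are hypothetical.

```python
import struct

def decode_double(x: float):
    """Split a float into the (s, e, f) fields of its 64-bit encoding."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    s = bits >> 63                 # 1 sign bit
    e = (bits >> 52) & 0x7FF       # 11 exponent bits
    f = bits & ((1 << 52) - 1)     # 52 fraction bits
    return s, e, f

def classify(s: int, e: int, f: int) -> str:
    """Apply the slide's case analysis to the three fields."""
    if e == 2047:
        return "NaN" if f != 0 else "infinity"
    if e == 0:
        return "denormalized" if f != 0 else "zero"
    return "normalized"
```

For example, `decode_double(1.0)` gives `(0, 1023, 0)`, which `classify` reports as normalized.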



Specification of a FPA

Floating Point Add/Subtract Unit Specification


Inputs in IEEE 754 Double Precision
Must perform both addition and subtraction
Must handle the full floating point standard:

Normalized numbers
Not-a-Number values (NaNs)
+/- Infinity
Denormalized numbers



Specifications continued

Result will be an IEEE 754 Double Precision representation
Unit will correctly handle the invalid operation of adding +infinity and -infinity (= NaN per the standard)
Unit latches its inputs into registers from parallel 64-bit data busses
There is a separate signal line that indicates the operation, add or subtract

Specifications continued

Outputs

The correctly represented result
Flags that are output are:

Zero result
Overflow to infinity from normalized numbers as inputs
NaN result
Overshift (result is the larger of the two operands)
Denormalized result
Inexact (result was rounded)
Invalid operation for addition

High level block diagram

Basic architecture interface


Data: 64-bit A, B, and C busses
Control signals: Latch, Add/Sub, Asel, Drive
Condition flags output: 7 flag signals
Clocks: Phi1 and Phi2 (a 2-phase clocked architecture)

[Figure: Floating Point Adder Unit block. Inputs: Abus, Bbus, Add/Sub, Latch, Phi1, Phi2, Asel, Drive. Outputs: Cbus, Flags.]


Start the VHDL

The entity interface


entity Floating_Point_Adder is
  port (A_Main     : in  BIT_VECTOR;
        B_Main     : in  BIT_VECTOR;
        C_Main     : out BIT_VECTOR;
        A_Out      : out BIT_VECTOR;
        Flags      : out BIT_VECTOR;
        Add_or_sub : in  BIT;
        Latch      : in  BIT;
        Drive      : in  BIT;
        Phi1       : in  BIT;
        Phi2       : in  BIT;
        Asel       : in  BIT);
end Floating_Point_Adder;

[Figure: Floating Point Adder block with ports A_Main, B_Main, Add_or_sub, Latch, Phi1, Phi2, Drive, Asel, Flags, C_Main, A_Out.]


Basic design

Can be divided into functional sub-blocks
First: latch and drive

[Figure: A/S control, INPUT LATCHES, RESULT LATCHES, OUTPUT DRIVERS.]


What goes in the other blocks


From adjusting the inputs to prepare to add
To add
To renormalize
To round

[Figure: A/S control and the block sequence INPUT LATCHES, Input Adjust, Add Mantissas, Normalize Result, Round according to selected scheme, RESULT LATCHES, OUTPUT DRIVERS.]


VHDL coding for the latches


A first cut
The input latches
Note 2 phase

b1: block ((Phi2 and Latch) = '1')
begin
  A_temp <= guarded A_Main;
  B_temp(63) <= guarded Add_or_sub xor B_Main(63);
  B_temp(62 downto 0) <= guarded B_Main(62 downto 0);
end block;

b2: block (Phi1 = '1')
begin
  signa <= guarded A_temp(63);
  signb <= guarded B_temp(63);
  expa <= guarded A_temp(62 downto 52);
  expb <= guarded B_temp(62 downto 52);
  mana <= guarded A_temp(51 downto 0);
  manb <= guarded B_temp(51 downto 0);
end block;
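As a quick check of what the two latch blocks compute, here is a minimal Python sketch of their data movement (it does not model the two-phase clocking). The names `latch_inputs` and `to_bits` are hypothetical, and it assumes Add_or_sub = 1 means subtract, which is what the XOR into B's sign bit implies.

```python
import struct

MASK64 = (1 << 64) - 1

def to_bits(x: float) -> int:
    """64-bit pattern of a float, as it would appear on a data bus."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def _fields(word: int):
    # (sign, exponent, fraction) = bits 63, 62..52, 51..0
    return (word >> 63) & 1, (word >> 52) & 0x7FF, word & ((1 << 52) - 1)

def latch_inputs(a_main: int, b_main: int, add_or_sub: int):
    """Return (signa, expa, mana, signb, expb, manb) as b1/b2 would hold them."""
    a_temp = a_main & MASK64
    # b1: B's sign is XORed with the operation, so subtract becomes add of -B
    b_temp = (b_main & MASK64) ^ ((add_or_sub & 1) << 63)
    return _fields(a_temp) + _fields(b_temp)
```

With this, 2.5 - 1.0 is latched as 2.5 + (-1.0): `latch_inputs(to_bits(2.5), to_bits(1.0), 1)` has `signb` equal to 1.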


And on the output


Drivers
Note use of guarded blocks

out_latch1 : block ((Drive and Phi2) = '1')
begin
  Flags <= guarded flag_temp;
end block out_latch1;

out_latch2 : block ((Drive and Phi2 and (not Asel)) = '1')
begin
  C_Main <= guarded signout & exp_round & man_round(51 downto 0);
end block out_latch2;

out_latch3 : block ((Drive and Phi2 and Asel) = '1')
begin
  A_out <= guarded signout & exp_round & man_round(51 downto 0);
end block out_latch3;


And what goes in between?


In the final design lots goes in between, but
You first want to make sure that the latches are working properly
So just pass one input to the output and check

signout <= signa;
exp_round <= expa;
man_round <= '0' & A_temp;

Once this works properly, you can move on with the design

The first section


Prepare to add
Identify type of inputs and appropriately adjust operands

[Figure: input adjust datapath. Exponent Processing Logic takes Aexp and Bexp (with Asign, Bsign) and produces Shift Dist, the Larger Exp (to Norm Unit), and the control signals E>, E=, E<, EA0, EA1, EB0, EB1. Mantissa Processing Logic takes Amantissa and Bmantissa and produces M>, M=, M<, MA0, MB0. 2-1 muxes select each operand's expanded mantissa under EA0*NOT(MA0) and EB0*NOT(MB0); 2x2 crossbar elements swap on E> + (E=*M>); further 2-1 muxes bring in a hardwired "Zero" (control EA1+EB1) and a hardwired "NaN"; the left path passes through the Right Linear Shifter (Shift Dist) and a final 2-1 mux controlled by SignA xor SignB; both paths feed the ADDER, whose output goes to the normalize unit. Sign Out (63) goes to the output latch.]


The exponent unit portion


Must get the larger exponent
And the difference between the exponents, which is the shift distance
Also several control signals:

Exponent all 0s and all 1s
Exponent A>B, A<B, A=B
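A behavioral sketch of these exponent outputs in Python, for illustration only. The signal names follow the lecture's VHDL; the function name `exponent_signals` is hypothetical, and the shift distance is assumed to be the absolute difference of the two 11-bit exponents.

```python
EXP_ALL_ZEROS = 0x000
EXP_ALL_ONES = 0x7FF

def exponent_signals(expa: int, expb: int) -> dict:
    """Relational flags, all-0s/all-1s flags, shift distance, larger exponent."""
    return {
        "expgt": int(expa > expb),
        "expeq": int(expa == expb),
        "explt": int(expa < expb),
        "expa0": int(expa == EXP_ALL_ZEROS),
        "expa1": int(expa == EXP_ALL_ONES),
        "expb0": int(expb == EXP_ALL_ZEROS),
        "expb1": int(expb == EXP_ALL_ONES),
        "shiftdist": abs(expa - expb),    # assumed meaning of the shift distance
        "larger_exp": max(expa, expb),
    }
```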



Mantissa Processing Logic

Need to examine the two fractional parts and generate several control signals that are required to prepare the operands
Need relational signals M>, M=, M<

Needed to know which operand to shift

Need to know if the stored fractional part is all 0s or not
Needed for NaN, 0, and infinity determination

After generating control signals


Step 1 is to select between a normalized mantissa and a denormalized mantissa
For normalized: prepend NOT(Ex0) to the fractional part

If Ex0 is a 1 then the exponent is all 0s and you have a denormalized number or 0
When Ex0 is a 0 you have a NaN, infinity, or a normalized number

Other selection is the fractional part shifted left by 1 and postpended by a 0

For denormalized numbers
Taking it from $2^{-1022}$ to $2^{-1023}$ in double precision ($2^{-126}$ to $2^{-127}$ in single precision), so that it can now be treated like a normalized number
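Why the left shift works can be checked numerically. The sketch below (hypothetical names `expanded_mantissa` and `value`, double-precision widths assumed, e < 2047) builds the 53-bit expanded mantissa and evaluates every finite operand with one formula, mantissa times $2^{e-1023-52}$.

```python
import math

def expanded_mantissa(e: int, f: int) -> int:
    """53-bit mantissa: NOT(Ex0) & f, or f shifted left by 1 for a denormal."""
    ex0 = int(e == 0)
    mx0 = int(f == 0)
    if ex0 and not mx0:
        return f << 1                  # denormalized: f & '0'
    return ((1 - ex0) << 52) | f       # NOT(Ex0) prepended to the fraction

def value(s: int, e: int, f: int) -> float:
    """Value of a finite operand using the single scale 2**(e - 1023)."""
    return (-1) ** s * math.ldexp(expanded_mantissa(e, f), e - 1023 - 52)
```

Because the denormal's mantissa is doubled, its stored exponent of 0 can be used directly as if the scale were $2^{-1023}$, so no special case is needed downstream.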

Now select between these two

Select the denormalized form

WHEN Ex0 * (NOT Mx0)
When Ex0 is a 1 you have a denormalized number or 0
When Mx0 is a 0 there is at least 1 bit of the fractional part that is a 1, and thus you have a denormalized number

Select the NaN, infinity, 0, normalized number form

Select this case when Ex0 is a 0 or Mx0 is a 1
When Mx0 is a 1 you have infinity, 0, or a normalized number
When Ex0 is a 0 you have a normalized number, infinity, or NaN



Shown in table form

Selection table to also point out this relationship
Note that for a 0, NOT(Ex0) is prepended to the fractional part, giving 0.00000000

Ex0  Mx0  NOT Mx0  Ex0*(NOT Mx0)  Operand
 0    0      1           0        norm or NaN
 0    1      0           0        norm or infinity
 1    0      1           1        denorm
 1    1      0           0        0

The two mux choices are marked Select R and Select L on the slide.


Selections are input to a crossbar

The crossbar switch places the larger value on the right path and the smaller onto the left path
The smaller is the operand to shift, if any shifting to align the binary point is needed
The equation for exchange on the crossbar is

E> + (E=*M>), i.e., switch the A input to the right path if the exponent of A is the larger OR the exponents are equal and the fractional part of A is larger
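The exchange equation can be sketched in Python as follows. Signal names follow the lecture's VHDL (`lxbarin` is A's expanded mantissa, `rxbarin` is B's); the function name `crossbar` is hypothetical.

```python
def crossbar(expa: int, mana: int, expb: int, manb: int,
             lxbarin: int, rxbarin: int):
    """Return (swap, xbar_l, xbar_r): smaller operand left, larger right."""
    expgt = expa > expb
    expeq = expa == expb
    mangt = mana > manb
    swap = expgt or (expeq and mangt)      # E> + (E= * M>)
    xbar_r = lxbarin if swap else rxbarin  # larger magnitude on the right
    xbar_l = rxbarin if swap else lxbarin  # smaller magnitude on the left
    return int(swap), xbar_l, xbar_r
```

When the two operands have equal magnitude, swap is 0 and B stays on the right path, which is harmless because both paths then carry the same mantissa.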

The next multiplexers


Now have the smaller on the left path and the larger on the right path.
If either exponent is all 1s, then that operand is NaN or infinity and has been crossbarred to the right path (or is equal to the right path operand). In this case want to simply pass it through to the output by adding 0 to it. So a 0 is one choice of the left path mux.
On the right path, select the right path value or mux in a hardwired NaN for an illegal operation.
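A minimal Python sketch of these two muxes, following the control equations in the lecture's VHDL. The function name `special_case_muxes` is hypothetical, and so is the value of `NAN_MAN`: the lecture's `nan_man` constant is not shown, so an arbitrary non-zero pattern is assumed here.

```python
EXP_ALL_ONES = 0x7FF
ZERO_MAN = 0
NAN_MAN = (1 << 53) - 1   # assumed placeholder; the real nan_man is not given

def special_case_muxes(signa, expa, mana, signb, expb, manb, xbar_l, xbar_r):
    """Return (in_mux_l, in_mux_r, in_mux_r_nan) for the adder input paths."""
    expa1 = expa == EXP_ALL_ONES
    expb1 = expb == EXP_ALL_ONES
    # Left path: force 0 so a NaN or infinity on the right passes through
    in_mux_l_zero = expa1 or expb1
    # Right path: (+inf) + (-inf) is the invalid operation, force NaN
    in_mux_r_nan = (expa1 and mana == 0 and expb1 and manb == 0
                    and (signa ^ signb) == 1)
    in_mux_l = ZERO_MAN if in_mux_l_zero else xbar_l
    in_mux_r = NAN_MAN if in_mux_r_nan else xbar_r
    return in_mux_l, in_mux_r, bool(in_mux_r_nan)
```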

Linear shifting

Next step is to linear shift the left operand.
The exponent unit generates the exponent relational signals (E>, E=, E<) by subtracting the exponents, ExpA-ExpB and ExpB-ExpA.
Then, with the help of these control signals, the exponent difference is known and this value is sent to the shifter.
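The alignment shift can be sketched in Python as below. This is a guess at the shifter's behavior, not the lecture's implementation: it assumes the 53-bit mantissa is widened by 4 guard bits (matching the "0000" appended to the right path in the lecture's code) and that an overshift is any distance that moves every bit out of the widened word.

```python
MANT_BITS = 53
GUARD_BITS = 4   # assumption, taken from the "0000" padding on the right path

def shift(shiftdist: int, in_mux_l: int):
    """Right-shift the smaller mantissa; return (shifted_sig, overshift)."""
    widened = in_mux_l << GUARD_BITS             # room for bits shifted out
    overshift = shiftdist >= MANT_BITS + GUARD_BITS
    shifted_sig = 0 if overshift else widened >> shiftdist
    return shifted_sig, overshift
```

On an overshift the left path contributes nothing, so the result is simply the larger operand, which is what the Overshift flag reports.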

One last multiplexer

The right path operand, the larger, is simply input to the ADDER.
On the left path the output of the linear shifter is sent to the ADDER for a + operation, OR the ones complement of the value is sent to the ADDER for a - operation. In this case the input carry is handled appropriately.

Code for this section - behavioral

Most of code is generation of various signals and movement of data in muxes

-- Exponent Processing
expgt <= '1' when (expa > expb) else '0';
expeq <= '1' when (expa = expb) else '0';
explt <= '1' when (expa < expb) else '0';
expa0 <= '1' when (expa = zeroexp) else '0';
expa1 <= '1' when (expa = oneexp) else '0';
expb0 <= '1' when (expb = zeroexp) else '0';
expb1 <= '1' when (expb = oneexp) else '0';
shiftdist <= expdist (expa, expb);
larger_exp <= expa when (expa >= expb) else expb;
-- Mantissa Processing
mangt <= '1' when (mana > manb) else '0';
maneq <= '1' when (mana = manb) else '0';
manlt <= '1' when (mana < manb) else '0';
mana0 <= '1' when (mana = zeroman) else '0';
manb0 <= '1' when (manb = zeroman) else '0';
-- Expanded Normalized Form
adenorm <= expa0 and (not mana0);
bdenorm <= expb0 and (not manb0);
lshfa(52 downto 1) <= mana; lshfa(0) <= '0';
lshfb(52 downto 1) <= manb; lshfb(0) <= '0';
lxbarin <= lshfa when (adenorm = '1') else ((not expa0) & mana);
rxbarin <= lshfb when (bdenorm = '1') else ((not expb0) & manb);
-- XBar to place larger number on right path
swap <= expgt or (expeq and mangt);
xbar_r <= lxbarin when (swap = '1') else rxbarin;
xbar_l <= rxbarin when (swap = '1') else lxbarin;
-- Hard Code NaN
in_mux_r_nan <= expa1 and mana0 and expb1 and manb0 and (signa xor signb);
in_mux_r <= nan_man when (in_mux_r_nan = '1') else xbar_r;
-- Hard Code Zero
in_mux_l_zero <= expa1 or expb1;
in_mux_l <= zero_man when (in_mux_l_zero = '1') else xbar_l;
-- Shift smaller
shifted_sig <= shift (shiftdist, in_mux_l);
-- A+B or A+(-B)?
twoscomp <= signa xor signb;
ladderin <= '0' & shifted_sig when (twoscomp = '0') else '1' & (not shifted_sig);
radderin <= '0' & in_mux_r & "0000";
-- Binary Add
adder_out <= add (ladderin, radderin, twoscomp);


Xbar code highlight

Code

swap <= expgt OR (expeq AND mangt);
xbar_r <= lxbarin when (swap = '1') else rxbarin;
xbar_l <= rxbarin when (swap = '1') else lxbarin;


Hard code NaN VHDL code

The code

-- Control equation for mux
in_mux_r_nan <= expa1 AND mana0 AND expb1 AND manb0 AND (signa XOR signb);
in_mux_r <= nan_man WHEN (in_mux_r_nan = '1') ELSE xbar_r;


Now add the mantissas


Simply add the two mantissas.
As the sign of the B input was XORed with the operation, i.e., inverted if it was a subtract operation, the carry in is the XOR of the two signs. If the signs are different then a subtract is being performed and a 1 is being input to the carry in of the adder.
The adder does twos complement addition.
Inputs are of the form x.xxxxxxx or 54 bits. The output is of the form xx.xxxxxxx or 58 bits
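The ones-complement-plus-carry trick can be checked with a short Python sketch. It follows the bit widths of the lecture's VHDL for ladderin and radderin (58 bits, 4 guard bits on the right path); the function name `add_mantissas` is hypothetical.

```python
WIDTH = 58
MASK = (1 << WIDTH) - 1

def add_mantissas(shifted_sig: int, in_mux_r: int, signa: int, signb: int) -> int:
    """Add or subtract the aligned left path to/from the right path."""
    twoscomp = signa ^ signb                  # differing signs: subtract
    radderin = in_mux_r << 4                  # '0' & in_mux_r & "0000"
    if twoscomp:
        ladderin = ~shifted_sig & MASK        # '1' & (not shifted_sig)
    else:
        ladderin = shifted_sig                # '0' & shifted_sig
    # carry in = twoscomp completes the twos complement of the left path
    return (ladderin + radderin + twoscomp) & MASK
```

For 1.0 - 0.5 the right path is `1 << 52` and the left path, after a shift of 1, is `1 << 55`; the sum is `1 << 55`, i.e. 0.5 in the xx.xxxxxxx format. Since the larger magnitude is always on the right path, the difference never goes negative.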

On to the next challenge

This is perhaps the hardest part: renormalization of the result
Have a result exponent (the exponent of the larger) and a mantissa in the form xx.xxxxxxxxxx
The following slide shows the processing needed

Renormalization Unit

Have exponent and mantissa to deal with.


[Figure: renormalization unit. Inputs: Larger Exponent and Adder Output (XX.XXXXXX...). Exponent side: detect all 1's (all1), +1 incrementer, an adder taking "000000 & value" and the inverted 6 lsb of the leading-1 position, detect all 0's (all0), hardwired zero, underflow (UF); a 4-to-1 mux selects exp_norm. Mantissa side: Result Signal Generation (fract0, Ld1pos), Left Linear Shifter, two Right Shift 1 paths, hardwired zero; a 5-to-1 mux selects man_norm. Mux controls are c1 to c5.]


Many choices to deal with


May need to shift the mantissa 1 position to the right on a fixed binary point.
May be OK as is
May have to shift left; then need to know the position of the leading 1.

In a behavioral model can simply shift left once, increment a counter and then check.
In hardware need a leading 1 detector that gives the position of the leading 1 so that the mantissa can be shifted left.
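The two approaches can be compared with a small Python sketch. All names here are hypothetical; `target_bit` is the bit position where the leading 1 must end up, and the input is assumed to be below `2 ** (target_bit + 1)` (the right-shift case is handled separately).

```python
def normalize_by_counting(value: int, target_bit: int):
    """Behavioral style: shift left one place at a time and count."""
    shifts = 0
    while value != 0 and not (value >> target_bit) & 1:
        value <<= 1
        shifts += 1
    return value, shifts

def leading_one_position(value: int) -> int:
    """Hardware style: report the bit index of the leading 1 directly."""
    return value.bit_length() - 1

def normalize_with_detector(value: int, target_bit: int):
    """Use the detector's position to shift once by the full distance."""
    if value == 0:
        return 0, 0
    shifts = target_bit - leading_one_position(value)
    return value << shifts, shifts
```

Both return the same shifted mantissa and the same shift count, and the count is the amount by which the exponent must be adjusted down.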

Interactions

All shifts of mantissa result in exponent adjustment.
There are 4 choices on the exponent:

As is
Incremented by 1
Adjusted down by some amount depending on shift
Zero



Interactions

There are 5 choices on the mantissa


As is
Right shifted by 1 (increment exp by 1)
Left shifted for leading 1
Left shifted and then right shifted by 1
Hardwired 0

This part is the same for both addition and multiplication. Easy to do algorithmically.

Rounding Unit

Once done with renormalization, will look at the guard bits to determine rounding.
Standard specifies several rounding modes.
Can also just truncate.

[Figure: rounding unit. man_norm is split into its 53 msb and 5 lsb; the 5 lsb feed the Round Logic, which produces Round. A +1 incrementer (msbin, msbout) and a 2-1 mux produce the Mantissa Output. A +1 incrementer on exp_norm and a 2-1 mux, controlled by Round(msbin xor msbout), produce the Exponent output.]


Rounding

Rounding can result in changes to both the mantissa and the exponent.
After rounding, the final result is output in normalized form.
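How rounding can ripple into the exponent is easiest to see in a small model. The Python sketch below implements round-to-nearest-even, which is one of the standard's modes and not necessarily the scheme selected in the lecture. It assumes a 58-bit man_norm whose top 53 bits are kept and whose low 5 bits are guard bits; the function name is hypothetical.

```python
GUARD_BITS = 5    # assumed: the 5 lsb of man_norm are discarded
KEPT_BITS = 53    # assumed: the 53 msb form the result mantissa

def round_nearest_even(exp_norm: int, man_norm: int):
    """Return (exponent, 53-bit mantissa, round_up)."""
    kept = man_norm >> GUARD_BITS
    discarded = man_norm & ((1 << GUARD_BITS) - 1)
    half = 1 << (GUARD_BITS - 1)
    # Round up above the halfway point, or on a tie when kept is odd
    round_up = discarded > half or (discarded == half and (kept & 1) == 1)
    if round_up:
        kept += 1
        if kept >> KEPT_BITS:      # mantissa overflowed to 10.000...
            kept >>= 1             # renormalize by one right shift
            exp_norm += 1          # and bump the exponent
    return exp_norm, kept, round_up
```

A mantissa of all 1s with a guard pattern of exactly one half rounds up to 10.000..., so the mantissa becomes 1.000... and the exponent is incremented.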


And don't forget the flags


Any arithmetic unit outputs flags on the status and validity of the result.
The flags to be generated are output from various control signals or combinations of various control signals.

zero <= (not dec2) and (not dec1) and fract0;
overflow <= man_sel_0 or (round_incr_exp and exp_norm_p1_all1);
nan <= (expa1 and (not mana0)) or (expb1 and (not manb0)) or invalid;
denorm <= (man_sel_ls_rs and (not fract0)) or ((not dec2) and dec1 and lgrall0);
inexact <= round;
invalid <= in_mux_r_nan;
flag_temp <= zero & overflow & nan & overshift & denorm & inexact & invalid;
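For a testbench it can help to pack and unpack the 7-bit flag word in the same order as the concatenation above. The Python sketch below is an illustration with hypothetical helper names; it assumes the leftmost element of the concatenation (zero) maps to the most significant bit.

```python
FLAG_NAMES = ("zero", "overflow", "nan", "overshift",
              "denorm", "inexact", "invalid")

def pack_flags(**flags) -> int:
    """Build the 7-bit word; zero is the leftmost (most significant) bit."""
    word = 0
    for name in FLAG_NAMES:
        word = (word << 1) | (1 if flags.get(name) else 0)
    return word

def unpack_flags(word: int) -> dict:
    """Recover the individual flags from a 7-bit word."""
    top = len(FLAG_NAMES) - 1
    return {name: (word >> (top - i)) & 1 for i, name in enumerate(FLAG_NAMES)}
```

For the invalid operation of adding +infinity and -infinity, the expected word has nan and invalid set: `pack_flags(nan=True, invalid=True)` is `0b0010001`.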


To test (verify) the design


Must test for normal operation and boundary conditions
Will check A by B

A: NaN, +/- infinity, +/- 0, Denorm, Norm

B: NaN, +/- infinity, +/- 0, Denorm, Norm

For both direct and all crossed pairings
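One way to enumerate these pairings is sketched below in Python. The representative operand values and the names `OPERAND_CLASSES` and `make_vectors` are hypothetical choices for illustration. The expected results come from the host's own floating point arithmetic, which is assumed to be IEEE 754 double precision.

```python
import itertools
import struct

# One representative value per operand class (illustrative choices)
OPERAND_CLASSES = {
    "NaN": float("nan"),
    "+inf": float("inf"),
    "-inf": float("-inf"),
    "+0": 0.0,
    "-0": -0.0,
    "denorm": 5e-324,
    "norm": 1.5,
}

def to_bits(x: float) -> int:
    """64-bit pattern to drive onto the A or B bus."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def make_vectors():
    """Yield (class_a, class_b, op, a_bits, b_bits, expected) for all pairs."""
    pairs = itertools.product(OPERAND_CLASSES.items(), repeat=2)
    for (name_a, a), (name_b, b) in pairs:
        for op in ("add", "sub"):
            expected = a + b if op == "add" else a - b
            yield name_a, name_b, op, to_bits(a), to_bits(b), expected
```

This gives 7 x 7 operand pairings for each of the two operations, 98 vectors in total, including the invalid (+inf) + (-inf) case.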



Boundary conditions

Wish to check several boundary conditions


Denorm + Denorm = Max Denorm
Denorm + Denorm = Min Norm
Norm - Norm = Max Denorm
Rounding using first guard bit
Rounding using 1st and 2nd guard bits
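Concrete bit patterns for the first three conditions can be worked out as in the Python sketch below. The specific operands are illustrative choices, not taken from the lecture, and the host's floating point arithmetic (assumed IEEE 754 double precision with denormals enabled) serves as the reference.

```python
import struct

def from_bits(bits: int) -> float:
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

def to_bits(x: float) -> int:
    return struct.unpack(">Q", struct.pack(">d", x))[0]

MAX_DENORM = 0x000FFFFFFFFFFFFF   # e = 0, f all 1s
MIN_NORM = 0x0010000000000000     # e = 1, f = 0

# (description, a_bits, b_bits, operation, expected_bits)
BOUNDARY_CASES = [
    ("denorm + denorm = max denorm",
     0x000FFFFFFFFFFFFE, 0x0000000000000001, "add", MAX_DENORM),
    ("denorm + denorm = min norm",
     MAX_DENORM, 0x0000000000000001, "add", MIN_NORM),
    ("norm - norm = max denorm",
     0x0020000000000000, 0x0010000000000001, "sub", MAX_DENORM),
]

def host_result(a_bits: int, b_bits: int, op: str) -> int:
    """Reference result computed with the host's double arithmetic."""
    a, b = from_bits(a_bits), from_bits(b_bits)
    return to_bits(a + b if op == "add" else a - b)
```

All three results are exact, so none of them should raise the inexact flag.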

Testing

Testing of the design code is not necessarily the same as the testing that would be done on the chip.
The testing of the design is called verification and must ensure that all possible input combinations produce the specified output.


Scan of entire architecture


Scan of the chip
