Addition / Subtraction
Parts and Chapters
I. Number Representation: 1. Numbers and Arithmetic; 2. Representing Signed Numbers; 3. Redundant Number Systems; 4. Residue Number Systems
II. Addition / Subtraction: 5. Basic Addition and Counting; 6. Carry-Lookahead Adders; 7. Variations in Fast Adders; 8. Multioperand Addition
III. Multiplication: 9. Basic Multiplication Schemes; 10. High-Radix Multipliers; 11. Tree and Array Multipliers; 12. Variations in Multipliers
IV. Division: 13. Basic Division Schemes; 14. High-Radix Dividers; 15. Variations in Dividers; 16. Division by Convergence
V. Real Arithmetic: 17. Floating-Point Representations; 18. Floating-Point Operations; 19. Errors and Error Control; 20. Precise and Certifiable Arithmetic
Chapters 21-24: Square-Rooting Methods; The CORDIC Algorithms; Variations in Function Evaluation; Arithmetic by Table Lookup
Chapters 25-28: High-Throughput Arithmetic; Low-Power Arithmetic; Fault-Tolerant Arithmetic; Past, Present, and Future / Reconfigurable Arithmetic
(Parts II to IV are grouped as Elementary Operations.)
Revisions: Sep. 2001, Sep. 2003, Oct. 2005, Apr. 2007, Apr. 2009, Apr. 2010, Mar. 2011 (second edition)
II Addition / Subtraction
Review addition schemes and various speedup methods
Addition is a key op (in itself, and as a building block)
Subtraction = negation + addition
Carry propagation speedup: lookahead, skip, select, ...
Two-operand versus multioperand addition
You can't add apples and oranges, son; only the government can do that.
Mar. 2011 Computer Arithmetic, Addition/Subtraction Slide 4
Half-Adder Implementations
Fig. 5.1 [Figure: half-adder circuits with inputs x, y and carry output c.]
Full-Adder Implementations
[Figure panels: (a) Built of half-adders. Mux-based design: inputs x, y, cin; outputs s, cout.]
Fig. 5.2 Possible designs for a full-adder in terms of half-adders, logic gates, and CMOS transmission gates.
Full-Adder Implementations
[Figure panels: (a) FA built of two HAs, with inputs x, y, cin and outputs s, cout. Mux-based design with the same inputs and outputs.]
Fig. 5.2 (alternate version) Possible designs for a full-adder in terms of half-adders, logic gates, and CMOS transmission gates.
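The half-adder construction named in the caption can be sketched in a few lines. This is a minimal behavioral model, not a gate-level netlist; the function names `half_adder` and `full_adder` are illustrative and do not come from the slides.

```python
def half_adder(x, y):
    """Return (carry, sum) for two input bits."""
    return x & y, x ^ y


def full_adder(x, y, c_in):
    """Full adder built of two half-adders plus an OR gate for the carry."""
    c1, s1 = half_adder(x, y)      # first HA adds the two operand bits
    c2, s = half_adder(s1, c_in)   # second HA adds the incoming carry
    return c1 | c2, s              # (c_out, s); c1 and c2 are never both 1


# Exhaustive check against the arithmetic definition x + y + c_in = 2*c_out + s
for x in (0, 1):
    for y in (0, 1):
        for c_in in (0, 1):
            c_out, s = full_adder(x, y, c_in)
            assert 2 * c_out + s == x + y + c_in
```

The OR gate suffices for combining the two half-adder carries because they cannot both be 1 for the same inputs.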
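[Figure: (a) Bit-serial adder: a single FA with inputs xi, yi, ci and outputs si, ci+1, with the carry fed back under control of a clock. Ripple-carry adder: a chain of 32 FAs with inputs x0..x31 and y0..y31, carry-in c0 = cin, sum outputs s0..s31, and carry-out c32 = cout (also serving as s32).]
[Figure labels for the layout below: inputs cin and Clock, outputs s1, s2, s3; dimensions marked 150 and 760.]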
Fig. 5.4 The layout of a 4-bit ripple-carry adder in CMOS implementation [Puck94].
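The behavior of a k-bit ripple-carry adder (4 bits in the layout above) can be modeled by chaining full-adder steps, as in the following sketch. The names `full_adder` and `ripple_carry_add` are illustrative assumptions, and operands are taken to be nonnegative integers below 2^k.

```python
def full_adder(x, y, c):
    """One full-adder cell: returns (c_out, s)."""
    s = x ^ y ^ c
    c_out = (x & y) | (c & (x ^ y))
    return c_out, s


def ripple_carry_add(x, y, k=4, c_in=0):
    """Add two k-bit numbers; the carry ripples from bit 0 up to bit k-1."""
    c = c_in
    total = 0
    for i in range(k):
        c, s = full_adder((x >> i) & 1, (y >> i) & 1, c)
        total |= s << i
    return total, c  # (k-bit sum, carry-out)


print(ripple_carry_add(9, 8))  # 9 + 8 = 17 = 16 + 1, so sum bits 0001 and c_out = 1
```

The loop makes the worst-case latency visible: each iteration depends on the carry of the previous one, so delay grows linearly with k.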
Fig. 5.5 [Figure: k-bit ripple-carry adder as a chain of FAs with sum outputs $s_0, s_1, \ldots, s_{k-2}, s_{k-1}$.]
Set one input to 0: cout = AND of other inputs
Set one input to 1: cout = OR of other inputs
Set one input to 0 and another to 1: s = NOT of third input
[Figure: 4-bit adder wired as follows. Bit 0: inputs x and y with $c_0 = 0$, giving $c_1 = xy$. Bit 1: inputs z and 0, giving $c_2 = xyz$. Bit 2: inputs w and 1, giving $c_3 = w \vee xyz$. Bit 3: inputs 0 and 1, giving $c_4 = w \vee xyz$ and sum output $(w \vee xyz)'$.]
Fig. 5.6 Four-bit binary adder used to realize the logic function f = w + xyz and its complement.
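The wiring of Fig. 5.6 can be checked with ordinary integer addition. The sketch below assumes the input assignment that the figure labels suggest (bit 0: x and y; bit 1: z and 0; bit 2: w and 1; bit 3: 0 and 1; carry-in 0); the function name `f_via_adder` is hypothetical.

```python
def f_via_adder(w, x, y, z):
    """Return (f, f_complement) where f = w OR (x AND y AND z), using one 4-bit add."""
    a = x | (z << 1) | (w << 2) | (0 << 3)   # first operand, bits 0..3
    b = y | (0 << 1) | (1 << 2) | (1 << 3)   # second operand, bits 0..3
    total = a + b                            # carry-in c0 = 0
    c4 = (total >> 4) & 1                    # carry-out realizes f
    s3 = (total >> 3) & 1                    # sum bit 3 realizes the complement of f
    return c4, s3


# Exhaustive check over all 16 input combinations
for n in range(16):
    w, x, y, z = (n >> 3) & 1, (n >> 2) & 1, (n >> 1) & 1, n & 1
    f, f_bar = f_via_adder(w, x, y, z)
    assert f == (w | (x & y & z)) and f_bar == 1 - f
```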
[Figure: k-bit adder with sum outputs $s_{k-1}, s_{k-2}, \ldots, s_1, s_0$.]
Fig. 5.7 Two's-complement adder with provisions for detecting conditions and exceptions.
$\text{overflow}_{2\text{'s-compl}} = x_{k-1}\, y_{k-1}\, \bar{s}_{k-1} \vee \bar{x}_{k-1}\, \bar{y}_{k-1}\, s_{k-1}$
$\text{overflow}_{2\text{'s-compl}} = c_k \oplus c_{k-1} = c_k\, \bar{c}_{k-1} \vee \bar{c}_k\, c_{k-1}$
Saturating Adders
Saturating (saturation) arithmetic: When a result's magnitude is too large, do not wrap around; rather, provide the most positive or the most negative value that is representable in the number format.
Example: In 8-bit 2's-complement format, we have: 120 + 26 → −110 (wraparound); 120 +sat 26 → 127 (saturating)
Saturating arithmetic is desirable in many DSP applications.
Designing saturating adders
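A behavioral sketch of wraparound versus saturating addition in k-bit 2's-complement format follows. The function names `add_wrap` and `add_sat` are illustrative; overflow is detected by the usual sign rule (both operands have the same sign and the wrapped sum has the opposite sign).

```python
def add_wrap(x, y, k=8):
    """k-bit 2's-complement addition with wraparound."""
    mask = (1 << k) - 1
    s = (x + y) & mask
    return s - (1 << k) if s >> (k - 1) else s   # reinterpret the sign bit


def add_sat(x, y, k=8):
    """k-bit 2's-complement addition that clamps on overflow."""
    lo, hi = -(1 << (k - 1)), (1 << (k - 1)) - 1
    s = add_wrap(x, y, k)
    overflow = ((x < 0) == (y < 0)) and ((s < 0) != (x < 0))
    if overflow:
        return hi if x >= 0 else lo
    return s


print(add_wrap(120, 26), add_sat(120, 26))
```

For the operands 120 and 26, `add_wrap` returns a negative value because the true sum exceeds 127, while `add_sat` clamps to 127.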
[Figure: saturating adder built of an adder followed by a 0/1 multiplexer.]
[Figure: example 16-bit addition, bit positions 15 down to 0 with carry-in cin, showing its carry chains and their lengths.]
Probability that a carry generated at position i propagates through position j − 1 and stops at position j (j > i): $2^{-(j-1-i)} \times 1/2 = 2^{-(j-i)}$
Expected length of the carry chain that starts at position i: $2 - 2^{-(k-i-1)}$
Average length of the longest carry chain in k-bit addition is strictly less than $\log_2 k$; it is $\log_2(1.25k)$ per experimental results.
Analogy: Expected number when rolling one die is 3.5; if one rolls many dice, the expected value of the largest number shown grows.
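The carry-chain statistics can be explored with a small simulation. The sketch below assumes one particular convention: a chain starts at a generate position (both bits 1), is extended by each following propagate position (bits differ), and ends at the next annihilate or generate position. Other conventions shift the measured average, so the result should be compared with the $\log_2 k$ bound rather than read as reproducing the quoted experimental figure. The function names are illustrative.

```python
import math
import random


def longest_carry_chain(x, y, k):
    """Length of the longest carry chain when adding k-bit x and y (c_in = 0)."""
    longest = run = 0
    for i in range(k):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        if xi & yi:        # generate: a new chain starts here
            run = 1
        elif xi ^ yi:      # propagate: extends a live chain, if any
            run = run + 1 if run else 0
        else:              # annihilate: any chain stops here
            run = 0
        longest = max(longest, run)
    return longest


def average_longest_chain(k, trials=5000, seed=1):
    """Monte Carlo estimate over uniformly random operands."""
    rng = random.Random(seed)
    total = sum(longest_carry_chain(rng.getrandbits(k), rng.getrandbits(k), k)
                for _ in range(trials))
    return total / trials


k = 32
print(average_longest_chain(k), math.log2(k), math.log2(1.25 * k))
```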
Fig. 5.9 The carry network of an adder with two-rail carries and carry completion detection logic.
[Figure: two-rail carry signals $(b_i, c_i)$ run from $c_0 = c_{in}$ to $c_k = c_{out}$, driven by $x_i$ and $y_i$; per-position signals $d_{i+1}$ are combined into alldone.]
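[Figure: count register x feeding an incrementer (decrementer) that produces x + 1 (x − 1) and cout; a multiplexer selects between the loaded data and the incremented value; control inputs Clear, Enable, Load; output Data out.]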
Fig. 5.10 An up (down) counter built of a register, an incrementer (decrementer), and a multiplexer.
[Figures: incrementer with sum outputs $s_0, s_1, s_2, \ldots, s_{k-2}, s_{k-1}$; counter with flip-flop outputs Q3, Q2, Q1, Q0 and an Increment input.]
Fig. 5.12 [Figure: counter split into stages, each with its own incrementer, with Load, Increment, Control 1, and Control 2 signals.]
Computing the carries $c_i$ is thus our central problem. For this, the actual operand digits are not important. What matters is whether in a given position a carry is
generated: $g_i = x_i y_i$
propagated: $p_i = x_i \oplus y_i$
or annihilated (absorbed): $a_i = \bar{x}_i \bar{y}_i = \overline{x_i \vee y_i}$
(expressions for binary addition)
It is also helpful to define a transfer signal: $t_i = g_i \vee p_i = \bar{a}_i = x_i \vee y_i$
Using these signals, the carry recurrence is written as
$c_{i+1} = g_i \vee c_i p_i = g_i \vee c_i g_i \vee c_i p_i = g_i \vee c_i t_i$
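The signal definitions and the two equivalent forms of the carry recurrence can be checked numerically. In this sketch the helper names `carry_signals` and `carries` are illustrative, and operands are nonnegative integers below 2^k.

```python
def carry_signals(x, y, k):
    """Per-position generate, propagate, annihilate, and transfer signals."""
    g = [(x >> i) & (y >> i) & 1 for i in range(k)]
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(k)]
    t = [((x >> i) | (y >> i)) & 1 for i in range(k)]
    a = [1 - ti for ti in t]            # annihilate is the complement of transfer
    return g, p, a, t


def carries(x, y, k, c_in=0, use_transfer=False):
    """Evaluate c_{i+1} = g_i OR c_i p_i (or the variant with t_i in place of p_i)."""
    g, p, _, t = carry_signals(x, y, k)
    c = [c_in]
    for i in range(k):
        q = t[i] if use_transfer else p[i]
        c.append(g[i] | (c[i] & q))
    return c                             # c[0] .. c[k]


# Both forms of the recurrence give identical carries for all 4-bit operands
for x in range(16):
    for y in range(16):
        assert carries(x, y, 4) == carries(x, y, 4, use_transfer=True)
        assert carries(x, y, 4)[4] == (x + y) >> 4
```

Using $t_i$ instead of $p_i$ is often cheaper in hardware because OR is simpler than XOR, and the check above confirms it does not change any carry.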
[Figure: switch-based carry-chain circuits with signals $g_i$, $p_i$, $a_i$, carries $c_i$ and $c_{i+1}$ (and their complements $c'_i$, $c'_{i+1}$), supply VDD, Clock, Logic 0 and Logic 1; a chain of carries $c_0$ through $c_5$.]
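[Figure: operand bits $x_i$, $y_i$ feed the logic $g_i = x_i y_i$, $p_i = x_i \oplus y_i$ for positions 0 to k − 1; the carry network takes all $(g_i, p_i)$ pairs and $c_0$ and produces $c_1, \ldots, c_{k-1}, c_k$; each sum bit $s_i$ is formed from $p_i$ and $c_i$.]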
Fig. 5.14 Generic structure of a binary adder, highlighting its carry network.
Fig. 5.15 Alternate view of a ripple-carry network in connection with the generic adder structure shown in Fig. 5.14.
6 Carry-Lookahead Adders
Chapter Goals
Understand the carry-lookahead method and its many variations used in the design of fast adders
Chapter Highlights
Single- and multilevel carry lookahead
Various designs for log-time adders
Relating the carry determination problem to parallel prefix computation
Implementing fast adders in VLSI
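The carry recurrence can be unrolled to obtain each carry signal directly from inputs, rather than through propagation.
Note: the addition symbol + is distinct from logical OR, written here as $\vee$.
$c_i = g_{i-1} \vee c_{i-1} p_{i-1}$
$\;\;= g_{i-1} \vee (g_{i-2} \vee c_{i-2} p_{i-2})\, p_{i-1}$
$\;\;= g_{i-1} \vee g_{i-2} p_{i-1} \vee c_{i-2} p_{i-2} p_{i-1}$
$\;\;= g_{i-1} \vee g_{i-2} p_{i-1} \vee g_{i-3} p_{i-2} p_{i-1} \vee c_{i-3} p_{i-3} p_{i-2} p_{i-1}$
$\;\;= g_{i-1} \vee g_{i-2} p_{i-1} \vee g_{i-3} p_{i-2} p_{i-1} \vee g_{i-4} p_{i-3} p_{i-2} p_{i-1} \vee c_{i-4} p_{i-4} p_{i-3} p_{i-2} p_{i-1}$
$\;\;= \ldots$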
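The fully unrolled form can be compared with the step-by-step recurrence in a short sketch. The names `gp_signals`, `carry_unrolled`, and `carry_ripple` are illustrative; the unrolled version evaluates the two-level OR-of-ANDs expression for a single carry without using any other carry.

```python
def gp_signals(x, y, k):
    """Bit generate and propagate signals for k-bit operands."""
    g = [(x >> i) & (y >> i) & 1 for i in range(k)]
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(k)]
    return g, p


def carry_unrolled(g, p, c0, i):
    """c_i = g_{i-1} OR g_{i-2} p_{i-1} OR ... OR c_0 p_0 p_1 ... p_{i-1}."""
    result = 0
    prod = 1                       # running product p_{i-1} p_{i-2} ... p_{j+1}
    for j in range(i - 1, -1, -1):
        result |= g[j] & prod      # term g_j p_{j+1} ... p_{i-1}
        prod &= p[j]
    return result | (c0 & prod)    # final term c_0 p_0 ... p_{i-1}


def carry_ripple(g, p, c0, i):
    """Same carry via the recurrence c_{j+1} = g_j OR c_j p_j."""
    c = c0
    for j in range(i):
        c = g[j] | (c & p[j])
    return c


# The two agree for every carry of every 4-bit addition
for x in range(16):
    for y in range(16):
        g, p = gp_signals(x, y, 4)
        for c0 in (0, 1):
            for i in range(5):
                assert carry_unrolled(g, p, c0, i) == carry_ripple(g, p, c0, i)
```

The number of terms and the width of the widest AND both grow with i, which is why full lookahead is impractical for wide adders.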
[Figure: idealized adder in which each of the sum bits s3, s2, s1, s0 is derived directly from the operand bits and cin.]
Theoretically, it is possible to derive each sum digit directly from the inputs that affect it. Carry-lookahead adder design is simply a way of reducing the complexity of this ideal, but impractical, arrangement by hardware sharing among the various lookahead circuits.
[Figure: carry network with full lookahead, producing $c_1$, $c_2$, $c_3$ from $g_0, p_0, g_1, p_1, g_2, p_2$ and $c_0$.]
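$c_{31} = g_{30} \vee g_{29} p_{30} \vee g_{28} p_{29} p_{30} \vee g_{27} p_{28} p_{29} p_{30} \vee \cdots \vee c_0 p_0 p_1 p_2 p_3 \cdots p_{29} p_{30}$
(the last term needs a 32-input AND; combining all terms needs a 32-input OR)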
Fig. 6.2b [Figure: 4-bit lookahead carry generator with inputs $g_i, p_i, \ldots, g_{i+3}, p_{i+3}$ and $c_i$, producing the intermediate carries up to $c_{i+3}$ and the block signals $g_{[i,i+3]}$, $p_{[i,i+3]}$.]
[Figure: four blocks, each supplying g and p, with block carry-in $c_{i_0}$ and carries $c_{j_0+1}$, $c_{j_1+1}$, $c_{j_2+1}$.]
Block generate and propagate signals can be combined in the same way as bit g and p signals to form g and p signals for wider blocks.
Fig. 6.3 Combining of g and p signals of four (contiguous or overlapping) blocks of arbitrary widths into the g and p signals for the overall block [i0, j3].
[Figure: four 16-bit carry-lookahead adders supply $g_{[0,15]}, p_{[0,15]}$, $g_{[16,31]}, p_{[16,31]}$, $g_{[32,47]}, p_{[32,47]}$, $g_{[48,63]}, p_{[48,63]}$ to a 4-bit lookahead carry generator.]
Fig. 6.4 Building a 64-bit carry-lookahead adder from 16 4-bit adders and 5 lookahead carry generators.
Carry-out: $c_{out} = g_{[0,k-1]} \vee c_0\, p_{[0,k-1]} = x_{k-1} y_{k-1} \vee \bar{s}_{k-1}(x_{k-1} \vee y_{k-1})$
(compare to 32 gate levels for a 16-bit ripple-carry adder)
Each additional lookahead level adds 4 gate levels of latency.
Latency for k-bit CLA adder: Tlookahead-add =
Once $h_i$ is known, however, the sum is obtained by a slightly more complex expression compared with $s_i = p_i \oplus c_i$:
$s_i = p_i \oplus h_i t_{i-1}$
[Figure: block B spans blocks B' and B''; the pairs (g', p') and (g'', p'') are combined into (g, p).]
$g = g'' \vee g' p''$, $\quad p = p' p''$
Fig. 6.5 Combining of g and p signals of two (contiguous or overlapping) blocks B' and B" of arbitrary widths into the g and p signals for block B.
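The combining rule of Fig. 6.5 can be written as a small function on (g, p) pairs. In this sketch the name `carry_op` is illustrative, and the first argument is taken to be the less significant block B' while the second is the more significant block B''.

```python
from itertools import product


def carry_op(left, right):
    """Combine (g', p') of block B' with (g'', p'') of block B'' into (g, p) of B."""
    g1, p1 = left
    g2, p2 = right
    return g2 | (g1 & p2), p1 & p2


pairs = list(product((0, 1), repeat=2))

# Associative: grouping does not matter
for a, b, c in product(pairs, repeat=3):
    assert carry_op(carry_op(a, b), c) == carry_op(a, carry_op(b, c))

# Not commutative: operand order does matter
print(carry_op((1, 0), (0, 0)), carry_op((0, 0), (1, 0)))
```

Associativity is what permits the pairs to be combined in a tree rather than a chain; the lack of commutativity means the blocks must stay in positional order.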
Carry-in can be viewed as an extra (−1) position: $(g_{-1}, p_{-1}) = (c_{in}, 0)$
The desired pairs are found by evaluating all prefixes of
$(g_0, p_0)$ ¢ $(g_1, p_1)$ ¢ . . . ¢ $(g_{k-2}, p_{k-2})$ ¢ $(g_{k-1}, p_{k-1})$
The carry operator ¢ is associative, but not commutative:
$[(g_1, p_1)$ ¢ $(g_2, p_2)]$ ¢ $(g_3, p_3) = (g_1, p_1)$ ¢ $[(g_2, p_2)$ ¢ $(g_3, p_3)]$
Prefix sums analogy:
Given $x_0,\; x_1,\; x_2,\; \ldots,\; x_{k-1}$
Find $x_0,\; x_0 + x_1,\; x_0 + x_1 + x_2,\; \ldots,\; x_0 + x_1 + \cdots + x_{k-1}$
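[Figure: four-input prefix sums network built of adders, alongside the matching carry network that combines $(g_0, p_0)$, $(g_1, p_1)$, ..., $(g_3, p_3)$.]
Fig. 6.6 Four-input parallel prefix sums network and its corresponding carry network.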
Fig. 6.7 Ladner-Fischer parallel prefix sums network built of two k/2-input networks and k/2 adders.
Delay recurrence: $D(k) = D(k/2) + 1$. Cost recurrence: $C(k) = 2C(k/2) + k/2$.
Fig. 6.8 Parallel prefix sums network built of one k/2-input network and k − 1 adders.
Delay recurrence: $D(k) = D(k/2) + 2$. Cost recurrence: $C(k) = C(k/2) + k - 1$.
[Figure: 8-input prefix graph on (g, p) pairs. Single positions such as [0, 0] and [1, 1] are merged into [0, 1], [2, 3], [4, 5], [6, 7], then into [0, 3] and [4, 7], then into [0, 7]; the remaining prefixes [0, 2], [0, 4], [0, 5], [0, 6] are formed from these.]
Type of network    Delay             Cost
Ladner-Fischer     log2 k            (k/2) log2 k
Kogge-Stone        log2 k            k log2 k − k + 1
Brent-Kung         2 log2 k − 2      2k − 2 − log2 k
[Figure: inputs $x_{k-1} \ldots x_{k/2}$ pass through a k/2-input prefix sums network and a row of adders to give $s_{k-1} \ldots s_{k/2}$.]
These outputs can be produced one time unit later without increasing the overall latency.
This strategy saves enough to make the overall cost linear (best possible).
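The logarithmic-depth behavior of a Kogge-Stone-style carry network can be illustrated with a behavioral model in which each level combines every position with the one a fixed distance below it. This is a sketch under stated assumptions: the function name `kogge_stone_add` is hypothetical, the carry-in is folded into position 0, and the model counts levels only, not gates or wires.

```python
def kogge_stone_add(x, y, k, c_in=0):
    """Return (sum, c_out, levels) for k-bit operands using a parallel prefix scan."""
    g = [(x >> i) & (y >> i) & 1 for i in range(k)]
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(k)]
    G, P = g[:], p[:]
    G[0] |= p[0] & c_in              # absorb the carry-in into position 0
    d, levels = 1, 0
    while d < k:
        G_new, P_new = G[:], P[:]
        for i in range(d, k):        # all positions update in parallel
            G_new[i] = G[i] | (G[i - d] & P[i])
            P_new[i] = P[i] & P[i - d]
        G, P = G_new, P_new
        d *= 2
        levels += 1
    carry = [c_in] + G               # carry[i + 1] is the prefix result for [0, i]
    s = sum((p[i] ^ carry[i]) << i for i in range(k))
    return s, carry[k], levels


print(kogge_stone_add(0xBEEF, 0x1234, 16))
```

For k = 16 the loop runs with distances 1, 2, 4, 8, that is, log2 k = 4 levels, matching the delay entry for the Kogge-Stone network.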
[Figure: 16-input prefix graph with inputs $x_{15} \ldots x_0$ and outputs $s_{15} \ldots s_0$.]
Fig. 6.11 A Hybrid Brent-Kung/ Kogge-Stone parallel prefix graph for 16 inputs.
First, 4-bit Manchester carry chains (MCCs) of Fig. 6.12a are used to derive g and p signals for 4-bit blocks.
Next, the g and p signals for 4-bit blocks are combined to form the desired carries, using the MCCs in Fig. 6.12b.
Fig. 6.12 Example 4-bit Manchester carry chain designs in CMOS technology [Lync92].
[Figure: tree of Type-b MCCs, with a Type-b* MCC at the carry-in, producing $c_8, c_{16}, c_{24}, c_{32}, c_{40}, c_{48}, c_{56}$ from $c_0 = c_{in}$.]
Fig. 6.13 Spanning-tree carry-lookahead network [Lync92]. Type-a and Type-b MCCs refer to the circuits of Figs. 6.12a and 6.12b, respectively.
[Figure: 16-bit adder divided into 4-bit blocks (bits 3 to 0 in the first block); the carries $c_4$ and $c_8$ out of each block pass through a 0/1 skip multiplexer controlled by $p_{[0,3]}$, $p_{[4,7]}$, $p_{[8,11]}$; carry-in $c_0$.]
Fig. 7.1 Converting a 16-bit ripple-carry adder into a simple carry-skip adder with 4-bit skip blocks.
[Figure: analogy between the ripple path (one-way street) and the skip path (freeway) across 4-bit blocks with 0/1 skip multiplexers.]
[Figure: one 4-bit block of a carry-skip adder, positions 4j to 4j + 3, with signals $g_{4j}, p_{4j}, \ldots, g_{4j+3}, p_{4j+3}$ and carries $c_{4j}, \ldots, c_{4j+4}$; in the mux-based version a 0/1 multiplexer controlled by $p_{[4j,\,4j+3]}$ produces $c_{4j+4}$.]
The carry-skip adder with OR combining works fine if we begin with a clean slate, where all signals are 0s at the outset; otherwise, it will run into problems, which do not exist in the mux-based version.
Worst-case carry path with fixed block width b: (b − 1) stages in the first block, (k/b − 1) skips, and (b − 1) stages in the last block, so that
$T_{\text{fixed-skip-add}} = 2(b - 1) + k/b - 1$ stages
Example: k = 32, $b^{\text{opt}} = 4$, $T^{\text{opt}} = 13$ stages (contrast with 32 stages for a ripple-carry adder)
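The example values can be reproduced by searching over block widths. The sketch assumes the stage count 2(b − 1) + (k/b − 1) for a fixed block width b (ripple in the first and last blocks plus one stage per skipped block) and restricts b to divisors of k; the function names are illustrative.

```python
def skip_delay(k, b):
    """Worst-case stage count of a k-bit carry-skip adder with fixed block width b."""
    return 2 * (b - 1) + (k // b - 1)


def best_block_width(k):
    """Block width (among divisors of k) that minimizes the stage count."""
    candidates = [b for b in range(1, k + 1) if k % b == 0]
    return min(candidates, key=lambda b: skip_delay(k, b))


k = 32
b = best_block_width(k)
print(b, skip_delay(k, b))   # block width and stage count for k = 32
```

With b = 1 the expression reduces to k − 1 stages, essentially plain rippling, while very wide blocks make the first and last ripple segments dominate; the minimum lies in between.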
Fig. 7.2 Carry-skip adder with variable-size blocks and three sample carry paths.
[Figure: sample carry paths made of ripple and skip segments.]
The total number of bits in the t blocks is k:
$2[b + (b + 1) + \cdots + (b + t/2 - 1)] = t(b + t/4 - 1/2) = k$
$b = k/t - t/4 + 1/2$
$T_{\text{var-skip-add}} = 2(b - 1) + t - 1 = 2k/t + t/2 - 2$
$dT/dt = -2k/t^2 + 1/2 = 0 \;\Rightarrow\; t^{\text{opt}} = 2\sqrt{k}$
Fig. 7.3, Fig. 7.4 [Figures: carry-skip adder schematics ending in the carry output cout.]
Fig. 7.5 Two-level carry-skip adder optimized by removing the short-block skip circuits.
[Figure: blocks b6 down to b0 with skip logic S1 and carry-in cin entering at block b0, annotated with signal arrival times.]
Fig. 7.6 Timing constraints of a single-level carry-skip adder with a delay of 8 units.
Generalization of Example 7.1 for total time T (even or odd):
T even: 1 2 3 . . . T/2 T/2 . . . 4 3 1
T odd: 1 2 3 . . . (T + 1)/2 . . . 4 3 1
Thus, for any T, the total width is $\lfloor (T + 1)^2/4 \rfloor - 2$
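[Figure: (a) two-level carry-skip adder divided into level-2 blocks A through F with level-2 skip logic S2, annotated with timing pairs {3, 8} for block A, {4, 5} for block B, {5, 4} for block C, {6, 3} for block D; the carry enters at cin at t = 0 and leaves at cout at t = 8. Lower panels show the level-1 blocks b0, b1, b2, b3 with skip logic S1 inside individual level-2 blocks.]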
Single-level carry-skip adder with Tproduce = Width of the ith level-1 block in the level-2 block characterized by {, } is bi = min( + i + 1, i); the total block width is then i=0 to 1 bi
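Fig. 7.8 [Figure: multilevel carry-skip structure showing a level-h skip path and carry-in cin.]
[Figure: the low k/2 bits are added by one k/2-bit adder with carry-in cin; the high k/2 bits are added by two k/2-bit adders, one for each possible value of $c_{k/2}$, and a mux controlled by $c_{k/2}$ selects the (k/2 + 1)-bit result that supplies cout and the high k/2 sum bits.]
Fig. 7.9 Carry-select adder for k-bit numbers built from three k/2-bit adders.
$C_{\text{select-add}}(k) = 3C_{\text{add}}(k/2) + k/2 + 1$
$T_{\text{select-add}}(k) = T_{\text{add}}(k/2) + 1$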
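The single-level carry-select scheme of Fig. 7.9 can be modeled behaviorally: the upper half is added twice, once for each possible incoming carry, and the carry out of the lower half picks the right result. The function name `carry_select_add` is illustrative, and the sketch assumes k is even and the operands are nonnegative integers below 2^k.

```python
def carry_select_add(x, y, k, c_in=0):
    """Return (k-bit sum, carry-out) using three k/2-bit additions and a mux."""
    h = k // 2
    mask = (1 << h) - 1
    lo = (x & mask) + (y & mask) + c_in   # low half, real carry-in
    c_mid = lo >> h                       # c_{k/2}
    hi0 = (x >> h) + (y >> h)             # high half assuming carry-in 0
    hi1 = (x >> h) + (y >> h) + 1         # high half assuming carry-in 1
    hi = hi1 if c_mid else hi0            # mux controlled by c_{k/2}
    return ((hi & mask) << h) | (lo & mask), hi >> h


print(carry_select_add(0xA7, 0x5C, 8))
```

In hardware the three half-width additions proceed concurrently, so the delay is that of one k/2-bit addition plus the mux, as the delay recurrence states; the cost is three half-width adders plus k/2 + 1 mux cells.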
Fig. 7.10 [Figure: two-level carry-select adder built of k/4-bit adders and multiplexers; the carries $c_{k/4}$ and $c_{k/2}$ select among (k/4 + 1)-bit and (k/2 + 1)-bit candidate results to form the low k/4 bits, the middle k/4 bits, and the high k/2 bits of the sum.]
[Figure: top-level conditional-sum block producing $c_{i+1}$ and $s_i$ for $c_i = 0$ and for $c_i = 1$.]
[Table: conditional-sum addition of two 16-bit numbers, x = 0010 0110 1110 1010 and y = 0100 1011 0101 1101, over bit positions 15 down to 0. For block widths 1, 2, 4, 8, 16 and block carry-in 0 or 1, the rows list the block sum (s) and block carry-out (c); the final row gives the 16-bit sum and cout for the actual cin.]
[Figure: hybrid adder in which a carry-lookahead tree of Type-b MCCs (with a Type-b* MCC at cin) produces $c_8, c_{16}, c_{24}, c_{32}, c_{40}, c_{48}, c_{56}$, and carry-select multiplexers choose between the two precomputed sums for bit positions 8j to 8j + 7.]
Each of the carries $c_{8j}$ produced by the tree network above is used to select one of the two versions of the sum in positions 8j to 8j + 7.
Fig. 7.13 Example 48-bit adder with hybrid ripple-carry/carry-lookahead design.
Other possibilities: hybrid carry-select/ripple-carry, hybrid ripple-carry/carry-select, . . .
[Plot: arrival time versus bit position, with bit positions from 0 to 60.]
Fig. 7.14 Example arrival times for operand bits in the final fast adder of a tree multiplier [Oklo96].
[Figure labels: Condl-Sum, Ling, Three-Stage Ling; wire tracks = 2t.]
[Figure: fast modular adder. A carry-save adder and two adders compute x + y and x + y − m; the sign bit of x + y − m controls a mux that outputs (x + y) mod m.]
8 Multioperand Addition
Chapter Goals
Learn methods for speeding up the addition of several numbers (needed for multiplication or inner-product)
Chapter Highlights
Running total kept in redundant form
Current total + Next number → New total
Deferred carry assimilation
Wallace/Dadda trees, parallel counters
Modular multioperand addition
[Figure: dot notation for multiplying a by x, with partial products $x_0 a 2^0$, $x_1 a 2^1$, $x_2 a 2^2$, $x_3 a 2^3$ summed to give p, and for a column of operands summed in an inner-product computation.]
Fig. 8.1 Multioperand addition problems for multiplication or inner-product computation in dot notation.
[Figure: serial multioperand adder. Operand $x^{(i)}$ (k bits) is added to the partial sum $\sum_{j=0}^{i-1} x^{(j)}$ held in a register of k + log2 n bits.]
[Figure: pipelined version with delay elements; when $x^{(i)}$ and $x^{(i-1)}$ are ready to compute, the pipeline holds $x^{(i)} + x^{(i-1)}$, $x^{(i-4)} + x^{(i-5)}$, $x^{(i-6)} + x^{(i-7)}$, $x^{(i-8)} + x^{(i-9)} + x^{(i-10)} + x^{(i-11)}$, and $s^{(i-12)}$.]
Fig. 8.3 Serial multioperand addition when each adder is a 4-stage pipeline.
Adding 7 numbers in a binary tree of adders (n − 1 adders for n operands).
Latency with fast adders $= O(\log k + \log(k + 1) + \cdots + \log(k + \lceil \log_2 n \rceil - 1)) = O(\log n \log k + \log n \log\log n)$
[Figure: ripple-carry adders at Level i and Level i + 1, with bit arrival times t and t + 1 and an HA in the least significant position.]
Fig. 8.5 Ripple-carry adders at levels i and i + 1 in the tree of adders used for multi-operand addition.
The absolute best latency that we can hope for is O(log k + log n).
There are kn data bits to process and, using any set of computation elements with constant fan-in, this requires O(log(kn)) time.
We will see shortly that carry-save adders achieve this optimum time.
Fig. 8.6 A ripple-carry adder turns into a carry-save adder if the carries are saved (stored) rather than propagated.
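The carry-save idea can be sketched on Python integers: each bit position acts as an independent full adder whose carry is saved (shifted into the next weight) instead of being propagated. The function names `carry_save_add` and `multioperand_add` are illustrative, and operands are assumed to be nonnegative.

```python
def carry_save_add(x, y, z):
    """Reduce three numbers to two (sum word, carry word) with no carry propagation."""
    s = x ^ y ^ z                                 # per-position sum bits
    c = ((x & y) | (x & z) | (y & z)) << 1        # per-position carries, saved
    return s, c


def multioperand_add(operands):
    """Add many numbers: CSA steps until two remain, then one ordinary addition."""
    ops = list(operands)
    while len(ops) > 2:
        x, y, z = ops.pop(), ops.pop(), ops.pop()
        ops.extend(carry_save_add(x, y, z))
    return sum(ops)                               # final carry-propagate addition


print(carry_save_add(5, 6, 7), multioperand_add(range(1, 8)))
```

Every `carry_save_add` call preserves the total (s + c = x + y + z), so only the last step needs a carry-propagate adder.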
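[Figure: a row of FAs chained from cin to cout (carry-propagate adder) compared with a row of independent FAs (carry-save adder); dot-notation symbols for the carry-propagate adder, the full-adder, and the half-adder.]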
Fig. 8.7 Carry-propagate adder (CPA) and carry-save adder (CSA) functions in dot notation.
Fig. 8.8 Specifying full- and half-adder blocks, with their inputs and outputs, in dot notation.
[Figure: tree of five CSAs followed by a carry-propagate adder; dot-notation reduction levels using 12 FAs, 6 FAs, and 6 FAs.]
Fig. 8.11 Representing a seven-operand addition in tabular form.
A full-adder compacts 3 dots into 2 (compression ratio of 1.5).
A half-adder rearranges 2 dots (no compression, but still useful).
[Figure: five k-bit CSAs followed by a k-bit CPA. The upper CSAs produce outputs in bit ranges [0, k−1] and [1, k]; later CSAs produce [1, k−1], [2, k+1], and [1, k+1]; the k-bit CPA covers [2, k+1] and yields bit k + 2, while result bits 1 and 0 are retired earlier. Output positions: k+2, k+1, k, k−1, ..., 4, 3, 2, 1, 0.]
The index pair [i, j] means that bit positions from i up to j are involved.
Fig. 8.12 Adding seven k-bit numbers and the CSA/CPA widths required.
Due to the gradual retirement (dropping out) of some of the result bits, CSA widths do not vary much as we go down the tree levels.
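[Figure: dot-notation reduction trees for n inputs, with the adder cells used at successive levels: 12 FAs, 6 FAs, 6 FAs, 11 FAs, 6 FAs in one design, and 7 FAs, 6 FAs, 6 FAs, 11 FAs, 11 FAs, 7 FAs, 6 FAs + 1 HA, 3 FAs + 2 HAs in the other panels.]
Dadda tree: Postpone the reduction to the extent possible without causing added delay.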
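[Figure: parallel counter built of FA and HA cells arranged in columns of increasing weight.]
Circuit reducing n bits to their $\lceil \log_2(n + 1) \rceil$-bit sum = $(n; \lceil \log_2(n + 1) \rceil)$-counter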
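A parallel counter can be modeled as columns of bits that are repeatedly reduced by full-adder and half-adder cells. The sketch below uses a simple greedy reduction within each column, which is an assumption made for clarity and not necessarily the cell arrangement or the minimum-delay design of the slide; the function name `parallel_counter` is illustrative.

```python
def parallel_counter(bits):
    """Count the 1s among the input bits using only FA and HA cells.

    Returns the count as a list of bits, least significant first.
    """
    columns = {0: list(bits)}      # weight exponent -> bits of that weight
    out = []
    w = 0
    while w in columns:
        col = columns[w]
        while len(col) > 1:
            if len(col) >= 3:      # full adder: 3 bits in, sum + carry out
                a, b, c = col.pop(), col.pop(), col.pop()
                s, carry = a ^ b ^ c, (a & b) | (a & c) | (b & c)
            else:                  # half adder: 2 bits in, sum + carry out
                a, b = col.pop(), col.pop()
                s, carry = a ^ b, a & b
            col.append(s)
            columns.setdefault(w + 1, []).append(carry)
        out.append(col[0] if col else 0)
        w += 1
    return out


print(parallel_counter([1, 0, 1, 1, 0, 1, 1]))   # five 1s among 7 inputs
```

For n = 7 inputs the result has 3 bits, in line with the $\lceil \log_2(n + 1) \rceil$ output width of an (n; ⌈log2(n + 1)⌉)-counter.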
[Figure: accumulative parallel counter. A count register feeds a parallel incrementer built of FA cells that accepts n increment signals $v_i$; the carry-out $c_q$ can be ignored or used for decision; the output is the q-bit final count y.]
Fig. 8.17 Dot notation for a (5, 5; 4)-counter and the use of such counters for reducing five numbers to two numbers.
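(n; 2)-counters
[Figure: slice i of an (n; 2)-counter receives n input bits together with transfers from slices i − 1, i − 2, i − 3, ..., and sends $\psi_1, \psi_2, \psi_3, \ldots$ transfer bits to slices i + 1, i + 2, i + 3, ....]
Fig. 8.18 Schematic diagram of an (n; 2)-counter built of identical circuit slices.
$n + \psi_1 + \psi_2 + \psi_3 + \cdots \le 3 + 2\psi_1 + 4\psi_2 + 8\psi_3 + \cdots$
$n - 3 \le \psi_1 + 3\psi_2 + 7\psi_3 + \cdots$
Example: Design a bit-slice of an (11; 2)-counter.
Solution: Let's limit transfers to two stages. Then, $8 \le \psi_1 + 3\psi_2$. Possible choices include $\psi_1 = 5, \psi_2 = 1$ or $\psi_1 = \psi_2 = 2$.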
[Figure: (a) Using sign extension: the sign bits $x_{k-1}$, $y_{k-1}$, $z_{k-1}$ are replicated in the extended positions, followed by the magnitude positions $x_{k-2}, x_{k-3}, x_{k-4}, \ldots$, $y_{k-2}, y_{k-3}, y_{k-4}, \ldots$, $z_{k-2}, z_{k-3}, z_{k-4}, \ldots$. (b) Using negatively weighted bits: the sign bits are complemented ($x'_{k-1}$, $y'_{k-1}$, $z'_{k-1}$) and constant 1s are inserted, based on $-b = (1 - b) + 1 - 2$.]
Fig. 8.19 Adding three 2's-complement numbers.
Fig. 8.21 Modulo-21 reduction of 6 numbers taking advantage of the fact that 64 = 1 mod 21 and using 6-bit pseudoresidues.
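The property 64 = 1 mod 21 means that a 6-bit chunk shifted up by 6 positions has the same residue as the chunk itself, so high-order chunks can be folded onto the low 6 bits. The sketch below models this folding on integers; the function names are illustrative, and the final `% 21` stands in for whatever correction step converts the 6-bit pseudoresidue to a true residue.

```python
def pseudoresidue_mod21(x):
    """Fold x into at most 6 bits; the result is congruent to x modulo 21."""
    while x >= 64:
        x = (x & 63) + (x >> 6)    # uses 64 = 1 (mod 21)
    return x


def sum_mod21(operands):
    """Add nonnegative numbers while keeping every intermediate value below 64."""
    total = 0
    for v in operands:
        total = pseudoresidue_mod21(total + pseudoresidue_mod21(v))
    return total % 21              # final step: pseudoresidue to residue


print(sum_mod21([100, 200, 300, 400, 500, 600]))
```

A pseudoresidue may be any value from 0 to 63 that is congruent to the true residue, which is why a single correction at the end is enough.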