Arithmetic Circuits
Prof. Taewhan Kim
Sources:
1. I. Koren, “Computer Arithmetic Algorithms,” Natick, Massachusetts, 2001.
Contents
Number system
Addition/subtraction
Ripple-carry-adder
Carry-lookahead-adder
Carry-select-adder
ALU
Multiplication
Sequential multiplication
Fast multiplication
 Partial products reduction techniques
 Constant multiplication
Division (not covered)
Floating point (not covered)
Number systems
Representation of positive numbers is the same in most systems
Major differences are in how negative numbers are represented
Representation of negative numbers comes in three major schemes
 sign and magnitude
 1s complement
 2s complement
Assumptions
 we'll assume a 4 bit machine word
 16 different values can be represented
 roughly half are positive, half are negative
[Figure: number wheel for 4-bit sign and magnitude. 0000 to 0111 represent +0 to +7; 1000 to 1111 represent –0 to –7. Example: 0 100 = +4, 1 100 = –4]
Sign and magnitude
One bit is dedicated to the sign (positive or negative)
 sign: 0 = positive (or zero), 1 = negative
The remaining bits represent the absolute value (magnitude)
 three low-order bits: 0 (000) through 7 (111)
Range for n bits
 +/– (2^(n–1) – 1) (two representations for 0)
Cumbersome addition/subtraction
 must compare magnitudes to determine sign of result
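As a small illustration of the encoding just described, the following Python sketch encodes and decodes sign-and-magnitude words. The function names `sm_encode` and `sm_decode` and the default 4-bit width are assumptions made for this example.

```python
def sm_encode(value: int, n: int = 4) -> str:
    """Encode value as an n-bit sign-and-magnitude bit string."""
    limit = 2 ** (n - 1) - 1              # largest magnitude: 2^(n-1) - 1
    if abs(value) > limit:
        raise ValueError(f"{value} does not fit in {n} bits")
    sign = "1" if value < 0 else "0"      # 0 = positive (or zero), 1 = negative
    return sign + format(abs(value), f"0{n - 1}b")


def sm_decode(bits: str) -> int:
    """Decode a sign-and-magnitude bit string (sign bit first)."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude


print(sm_encode(4), sm_encode(-4))   # 0100 1100
print(sm_decode("1000"))             # 0 (the second representation of zero)
```

Both 0000 and 1000 decode to zero, which is the "two representations for 0" noted above.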
1s complement
If N is a positive number, then the negative of N (its 1s complement, N') is N' = (2^n – 1) – N
 example: 1s complement of 7 (n = 4)
  2^4 = 10000, so 2^4 – 1 = 10000 – 00001 = 1111
  1111 – 0111 (7) = 1000 = –7 in 1s complement form
 shortcut: simply compute the bit-wise complement (0111 -> 1000)
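The definition and the shortcut can be compared directly. This is a minimal Python sketch, assuming 4-bit words; the function names are chosen for the example only.

```python
def ones_complement(value: int, n: int = 4) -> int:
    """N' = (2^n - 1) - N, computed arithmetically."""
    return (2 ** n - 1) - value


def bitwise_complement(value: int, n: int = 4) -> int:
    """Flip every bit of an n-bit word."""
    return ~value & ((1 << n) - 1)


# The shortcut agrees with the definition for every 4-bit pattern.
assert all(ones_complement(v) == bitwise_complement(v) for v in range(16))
print(format(ones_complement(0b0111), "04b"))   # 1000
```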
[Figure: number wheel for 4-bit 1s complement. 0000 to 0111 represent +0 to +7; 1000 to 1111 represent –7 to –0. Example: 0 100 = +4, 1 011 = –4]
1s complement (cont'd)
Subtraction is implemented by taking the 1s complement of the subtrahend and then adding
Two representations of 0
 this causes some complexities in addition
High-order bit can act as sign bit
[Figure: number wheel for 4-bit 2s complement. 0000 to 0111 represent +0 to +7; 1000 to 1111 represent –8 to –1. Example: 0 100 = +4, 1 100 = –4]
2s complement
1s complement with negative numbers shifted one position clockwise
only one representation for 0
one more negative number than positive numbers
high-order bit can act as sign bit
2s complement (cont’d)
If N is a positive number, then the negative of N (its 2s complement, N*) is N* = 2^n – N
 example: 2s complement of 7 (n = 4)
  2^4 = 10000; subtract 7 = 0111; result 1001 = repr. of –7
 example: 2s complement of –7
  2^4 = 10000; subtract –7 = 1001; result 0111 = repr. of 7
 shortcut: 2s complement = bit-wise complement + 1
 0111 -> 1000 + 1 -> 1001 (representation of -7)
 1001 -> 0110 + 1 -> 0111 (representation of 7)
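Examples (4-bit 2s complement):
 4 + 3 = 7: 0100 + 0011 = 0111
 (–4) + (–3) = –7: 1100 + 1101 = 1 1001 (carry-out 1, result 1001)
 4 – 3 = 1: 0100 + 1101 = 1 0001 (carry-out 1, result 0001)
 (–4) + 3 = –1: 1100 + 0011 = 1111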
2s complement addition and subtraction
Simple addition and subtraction
 simple scheme makes 2s complement the virtually unanimous choice for integer number systems in computers
Why can the carry-out be ignored?
 Can't ignore it completely
 needed to check for overflow (see next two slides)
 When there is no overflow, carry-out may be true but can be ignored
 –M + N when N > M:
  M* + N = (2^n – M) + N = 2^n + (N – M)
  ignoring carry-out is just like subtracting 2^n
 –M + –N where N + M ≤ 2^(n–1):
  (–M) + (–N) = M* + N* = (2^n – M) + (2^n – N) = 2^n – (M + N) + 2^n
  ignoring the carry, it is just the 2s complement representation for –(M + N)
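The argument above amounts to doing the addition modulo 2^n. A minimal Python sketch, assuming 4-bit words and helper names invented for this example, shows the carry-out being dropped without affecting the result when there is no overflow.

```python
def to_pattern(value: int, n: int = 4) -> int:
    """2s complement bit pattern of value, as an unsigned n-bit integer."""
    return value % (1 << n)


def from_pattern(pattern: int, n: int = 4) -> int:
    """Signed value of an n-bit 2s complement pattern."""
    return pattern - (1 << n) if pattern & (1 << (n - 1)) else pattern


def add_ignore_carry(a: int, b: int, n: int = 4):
    """Add two signed values; return (signed result, dropped carry-out)."""
    total = to_pattern(a, n) + to_pattern(b, n)
    carry_out = total >> n                      # equals 1 when 2^n is discarded
    return from_pattern(total & ((1 << n) - 1), n), carry_out


print(add_ignore_carry(4, 3))     # (7, 0)
print(add_ignore_carry(-4, -3))   # (-7, 1)
print(add_ignore_carry(4, -3))    # (1, 1)
print(add_ignore_carry(-4, 3))    # (-1, 0)
```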
[Figure: two 4-bit 2s complement number wheels (0000 = +0 ... 0111 = +7, 1000 = –8 ... 1111 = –1), used to illustrate overflow]
Overflow in 2s complement addition/subtraction
 Overflow conditions
 add two positive numbers to get a negative number
 add two negative numbers to get a positive number
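Examples (carry row shows the carries out of bits 3, 2, 1, 0):
 5 + 3 = –8: carries 0111; 0101 + 0011 = 1000 (overflow)
 –7 + (–2) = 7: carries 1000; 1001 + 1110 = 1 0111 (overflow)
 5 + 2 = 7: carries 0000; 0101 + 0010 = 0111 (no overflow)
 –3 + (–5) = –8: carries 1111; 1101 + 1011 = 1 1000 (no overflow)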
Overflow conditions
 Overflow when carry into sign bit position is not equal to carry-out
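The rule can be tried on the four example sums. This is a minimal sketch, assuming 4-bit operands given as unsigned bit patterns; `add_with_overflow` is a name chosen for the example.

```python
def add_with_overflow(a_bits: int, b_bits: int, n: int = 4):
    """Add two n-bit patterns; return (n-bit sum, overflow flag)."""
    mask = (1 << n) - 1
    low_mask = (1 << (n - 1)) - 1
    # Carry into the sign position: produced by adding the low n-1 bits.
    carry_into_sign = ((a_bits & low_mask) + (b_bits & low_mask)) >> (n - 1)
    total = a_bits + b_bits
    carry_out = total >> n
    return total & mask, carry_into_sign != carry_out


print(add_with_overflow(0b0101, 0b0011))   # (8, True)   5 + 3
print(add_with_overflow(0b1001, 0b1110))   # (7, True)   -7 + -2
print(add_with_overflow(0b0101, 0b0010))   # (7, False)  5 + 2
print(add_with_overflow(0b1101, 0b1011))   # (8, False)  -3 + -5, 1000 = -8
```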
[Figure: full-adder cell with inputs A, B, CI and outputs CO, S]
CO = A B + B CI + CI A
Truth Table:
A B CI | S CO
---|---
0 0 0 | 0 0
0 0 1 | 1 0
0 1 0 | 1 0
0 1 1 | 0 1
1 0 0 | 1 0
1 0 1 | 0 1
1 1 0 | 0 1
1 1 1 | 1 1
Addition: binary addition
This is the primitive of almost all arithmetic computation.
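A one-bit full adder can be written in a few lines. The sketch below uses the carry equation CO = A B + B CI + CI A and the usual sum S = A xor B xor CI; the function name is an assumption for this example.

```python
from itertools import product


def full_adder(a: int, b: int, ci: int):
    """One-bit full adder: returns (S, CO)."""
    s = a ^ b ^ ci
    co = (a & b) | (b & ci) | (ci & a)
    return s, co


# Print the truth table in the order A B CI | S CO.
for a, b, ci in product((0, 1), repeat=3):
    s, co = full_adder(a, b, ci)
    print(a, b, ci, "|", s, co)
```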
[Figure: four full-adder cells (inputs A, B, CI; outputs CO, S) chained so that each CO feeds the next CI. Inputs A[3:0], B[3:0] and Cin; outputs SUM[3:0] and Cout. Block symbol: A( ), B( ), S( ), Ci, Co]
A 4-Bit Ripple-Carry Adder (RCA)
Note: The carry chain ripples from the least to the most
significant bit (LSB to MSB).
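The rippling of the carry can be modeled as a loop over bit positions. This is a behavioral sketch only, assuming unsigned operands and an illustrative function name `ripple_carry_add`.

```python
def full_adder(a: int, b: int, ci: int):
    """One-bit full adder: returns (S, CO)."""
    return a ^ b ^ ci, (a & b) | (b & ci) | (ci & a)


def ripple_carry_add(a: int, b: int, cin: int = 0, n: int = 4):
    """n-bit ripple-carry addition; returns (SUM, Cout)."""
    carry = cin
    total = 0
    for i in range(n):                       # LSB first, carry ripples upward
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry


print(ripple_carry_add(0b0101, 0b0011))   # (8, 0)
print(ripple_carry_add(0b1111, 0b0001))   # (0, 1)
```

Each loop iteration needs the carry from the previous one, which is why the delay of the hardware grows with the bit-width.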
[Figure: 4-bit adder/subtractor. Each full adder (inputs A, B, Cin; outputs Cout, Sum) takes Ai and the output of a 2:1 mux that selects Bi or Bi'. The Add'/Subtract control drives the mux select inputs (Sel) and the carry-in of bit 0. Outputs S3 S2 S1 S0]
Adder/subtractor
 Use an adder to do subtraction thanks to 2s complement representation
 A – B = A + (– B) = A + B' + 1
 control signal selects B or 2s complement of B
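The identity A – B = A + B' + 1 can be sketched behaviorally. The following assumes 4-bit unsigned bit patterns and a control input `subtract` (0 = add, 1 = subtract); both names are chosen for this example.

```python
def add_sub(a: int, b: int, subtract: int, n: int = 4):
    """Return (n-bit result, carry-out) of A + B or A - B."""
    mask = (1 << n) - 1
    b_selected = (~b & mask) if subtract else b   # mux: B or its complement B'
    total = a + b_selected + subtract             # control also acts as carry-in
    return total & mask, total >> n


print(add_sub(0b0100, 0b0011, 0))   # (7, 0)   4 + 3
print(add_sub(0b0111, 0b0011, 1))   # (4, 1)   7 - 3
print(add_sub(0b0011, 0b0100, 1))   # (15, 0)  3 - 4 = 1111 = -1
```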
Carry-lookahead logic
Carry generate: Gi = Ai Bi
 must generate carry when A = B = 1
Carry propagate: Pi = Ai xor Bi
 carry-in will equal carry-out here
Sum and Cout can be re-expressed in terms of generate/propagate:
 Si = Ai xor Bi xor Ci
= Pi xor Ci
 Ci+1 = Ai Bi + Ai Ci + Bi Ci
= Ai Bi + Ci (Ai + Bi)
= Ai Bi + Ci (Ai xor Bi)
= Gi + Ci Pi
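The generate/propagate form can be checked exhaustively against ordinary one-bit addition. A minimal sketch, with function names chosen for illustration:

```python
from itertools import product


def generate_propagate(a: int, b: int):
    """Return (Gi, Pi) for one bit position."""
    return a & b, a ^ b


def sum_and_carry(a: int, b: int, c: int):
    """Si = Pi xor Ci and Ci+1 = Gi + Ci Pi."""
    g, p = generate_propagate(a, b)
    return p ^ c, g | (c & p)


for a, b, c in product((0, 1), repeat=3):
    s, c_next = sum_and_carry(a, b, c)
    assert s + 2 * c_next == a + b + c      # same result as a full adder
```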
Carry-lookahead logic (cont’d)
Re-express the carry logic as follows:
 C1 = G0 + P0 C0
 C2 = G1 + P1 C1 = G1 + P1 G0 + P1 P0 C0
 C3 = G2 + P2 C2 = G2 + P2 G1 + P2 P1 G0 + P2 P1 P0 C0
 C4 = G3 + P3 C3 = G3 + P3 G2 + P3 P2 G1 + P3 P2 P1 G0 + P3 P2 P1 P0 C0
Each of the carry equations can be implemented with two-level logic
 all inputs are now directly derived from data inputs and not from intermediate carries
 this allows computation of all sum outputs to proceed in parallel
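The expanded carry equations can be evaluated directly from the G and P signals, with no carry depending on another carry. The sketch below is a behavioral model of a 4-bit carry-lookahead adder; the function name `cla_4bit` is an assumption for this example.

```python
def cla_4bit(a: int, b: int, c0: int = 0):
    """4-bit carry-lookahead addition; returns (SUM, C4)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]        # Gi = Ai Bi
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]      # Pi = Ai xor Bi
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    carries = [c0, c1, c2, c3]
    total = sum((p[i] ^ carries[i]) << i for i in range(4))  # Si = Pi xor Ci
    return total, c4


print(cla_4bit(0b1011, 0b0110))   # (1, 1)  11 + 6 = 17
```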
[Figure: gate networks for the lookahead carries, built from G0..G3, P0..P3 and C0. Timing labels: Pi @ 1 gate delay, Gi @ 1 gate delay, Si @ 2 gate delays, C1 @ 3, C2 @ 3, C3 @ 3, C4 @ 3]
increasingly complex logic for carries
Carry-lookahead implementation
Adder with propagate and generate outputs
[Figure: 4-bit CLA adder computing A+B. Four adder cells (inputs A, B, CI; outputs CO, S) take A[3:0], B[3:0] and Cin and produce SUM[3:0] and Cout; a CLA Logic block takes A3..A0 and B3..B0 and supplies the carries C1, C2, C3, C4. Block symbol: A( ), B( ), S( ), Ci, Co]
Area = (CLA Logic Area) + ((bit-width) * (area of a one-bit full-adder cell))
 = about (1 + log(bit-width)) * (area of ripple adder)
4-Bit Carry Look Ahead (CLA) Adder
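[Figure: 8-bit carry-select adder. One 4-bit adder adds bits [3:0] with carry-in C0 and produces C4. Two 4-bit adders add bits [7:4] in parallel, one with carry-in 0 and one with carry-in 1, each producing a candidate C8. Five 2:1 muxes select the correct C8 and S7..S4. Outputs: C8 S7 S6 S5 S4 S3 S2 S1 S0]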
Carry-select adder
 Redundant hardware to make carry calculation go faster
 compute two high-order sums in parallel while waiting for carry-in
 one assuming carry-in is 0 and another assuming carry-in is 1
 select correct result once carry-in is finally computed
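A behavioral sketch of the 8-bit case follows. It assumes unsigned 8-bit operands, models each 4-bit adder with ordinary integer addition, and uses function names invented for this example.

```python
def add4(a: int, b: int, cin: int):
    """Behavioral 4-bit adder: returns (4-bit sum, carry-out)."""
    total = a + b + cin
    return total & 0xF, total >> 4


def carry_select_add8(a: int, b: int, c0: int = 0):
    """8-bit carry-select addition; returns (SUM, C8)."""
    low, c4 = add4(a & 0xF, b & 0xF, c0)
    # Both candidates for the high half are computed without waiting for C4.
    high_if_0 = add4(a >> 4, b >> 4, 0)
    high_if_1 = add4(a >> 4, b >> 4, 1)
    high, c8 = high_if_1 if c4 else high_if_0     # the 2:1 muxes
    return (high << 4) | low, c8


print(carry_select_add8(0x3C, 0x47))   # (131, 0)   60 + 71
print(carry_select_add8(0xF0, 0x1F))   # (15, 1)    240 + 31 = 271
```

The cost is the duplicated high-order adder plus the muxes; the gain is that the high half no longer waits for the low half's carry to ripple through it.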
logical and arithmetic operations
M = 0, logical bitwise operations
S1 S0 | Function | Comment
---|---|---
0 0 | Fi = Ai | input Ai transferred to output
0 1 | Fi = not Ai | complement of Ai transferred to output
1 0 | Fi = Ai xor Bi | compute XOR of Ai, Bi
1 1 | Fi = Ai xnor Bi | compute XNOR of Ai, Bi
M = 1, C0 = 0, arithmetic operations
S1 S0 | Function | Comment
---|---|---
0 0 | F = A | input A passed to output
0 1 | F = not A | complement of A passed to output
1 0 | F = A plus B | sum of A and B
1 1 | F = (not A) plus B | sum of B and complement of A
M = 1, C0 = 1, arithmetic operations
S1 S0 | Function | Comment
---|---|---
0 0 | F = A plus 1 | increment A
0 1 | F = (not A) plus 1 | twos complement of A
1 0 | F = A plus B plus 1 | increment sum of A and B
1 1 | F = (not A) plus B plus 1 | B minus A
Arithmetic logic unit (ALU) design specification
[Figure: one-bit ALU slice with inputs Ai, Bi, Ci, S1, S0, M and gates X1, X2, X3, A1, A2, A3, A4, O1]
first-level gates
 use S0 to complement Ai
  S0 = 0 causes gate X1 to pass Ai
  S0 = 1 causes gate X1 to pass Ai'
 use S1 to block Bi
  S1 = 0 causes gate A1 to make Bi go forward as 0 (don't want Bi for operations with just A)
  S1 = 1 causes gate A1 to pass Bi
 use M to block Ci
  M = 0 causes gate A2 to make Ci go forward as 0 (don't want Ci for logical operations)
  M = 1 causes gate A2 to pass Ci
other gates
 for M = 0 (logical operations, Ci is ignored)
  Fi = S1 Bi xor (S0 xor Ai)
   = S1'S0' (Ai) + S1'S0 (Ai') + S1 S0' (Ai Bi' + Ai' Bi) + S1 S0 (Ai' Bi' + Ai Bi)
 for M = 1 (arithmetic operations)
  Fi = S1 Bi xor ((S0 xor Ai) xor Ci)
  Ci+1 = Ci (S0 xor Ai) + S1 Bi ((S0 xor Ai) xor Ci)
  = just a full adder with inputs S0 xor Ai, S1 Bi, and Ci
ALU design
Sample ALU: multi-level implementation
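The bit-slice equations can be exercised against the specification table. The sketch below is a behavioral model, not the gate-level design from the figure: it assumes a 4-bit word, chains the slices through the carry, and gates the incoming carry with M in every slice. The function names are chosen for this example.

```python
def alu_slice(a: int, b: int, c: int, s1: int, s0: int, m: int):
    """One ALU bit: returns (Fi, Ci+1)."""
    x = s0 ^ a          # S0 complements Ai
    y = s1 & b          # S1 blocks Bi
    z = m & c           # M blocks Ci (assumed to gate the carry in each slice)
    f = y ^ (x ^ z)
    carry = (z & x) | (y & (x ^ z))
    return f, carry


def alu(a: int, b: int, s1: int, s0: int, m: int, c0: int = 0, n: int = 4) -> int:
    """n-bit ALU built from chained slices; returns F."""
    carry, result = c0, 0
    for i in range(n):
        f, carry = alu_slice((a >> i) & 1, (b >> i) & 1, carry, s1, s0, m)
        result |= f << i
    return result


print(alu(0b0101, 0b0011, s1=1, s0=0, m=0))         # 6: A xor B
print(alu(0b0101, 0b0011, s1=1, s0=0, m=1))         # 8: A plus B
print(alu(0b0011, 0b0101, s1=1, s0=1, m=1, c0=1))   # 2: B minus A
```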
Multiplication
 multiplicand 1101 (13)
 multiplier * 1011 (11)
 Partial products (each row shifted one position left of the row above):
  1101
  1101
  0000
  1101
 product 10001111 (143)
product of 2 4-bit numbers is an 8-bit number
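The worked example can be reproduced by generating one shifted partial product per multiplier bit. This is a minimal sketch; the function name `partial_products` is an assumption for the example.

```python
def partial_products(multiplicand: int, multiplier: int, n: int = 4):
    """One partial product per multiplier bit, already shifted into place."""
    return [(multiplicand if (multiplier >> i) & 1 else 0) << i
            for i in range(n)]


rows = partial_products(0b1101, 0b1011)
for row in rows:
    print(format(row, "08b"))
print(format(sum(rows), "08b"), sum(rows))   # 10001111 143
```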
Sequential Multiplier
[Figure: sequential multiplier datapath with a 32-bit Multiplicand register, a 32-bit adder, a 64-bit Product register (shift-right and write controls), and a control unit]
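The slide gives only the datapath labels, so the following is a sketch of one common shift-right scheme that fits them, not necessarily the exact control sequence intended: the multiplier starts in the low half of the product register, the multiplicand is conditionally added to the high half, and the register is shifted right once per step. The function name and the use of an unbounded Python integer to hold the adder's carry-out are assumptions.

```python
def sequential_multiply(multiplicand: int, multiplier: int, n: int = 32) -> int:
    """Multiply two unsigned n-bit values with one n-bit add per step."""
    product = multiplier & ((1 << n) - 1)     # low half holds the multiplier
    for _ in range(n):
        if product & 1:                       # current multiplier bit
            product += multiplicand << n      # add into the upper half
        product >>= 1                         # shift-right the product register
    return product


print(sequential_multiply(13, 11, n=4))       # 143
```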
Partial Products
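Partial Product Accumulation
 operands: A3 A2 A1 A0 times B3 B2 B1 B0
 row for B0: A3 B0, A2 B0, A1 B0, A0 B0
 row for B1 (shifted left by 1): A3 B1, A2 B1, A1 B1, A0 B1
 row for B2 (shifted left by 2): A3 B2, A2 B2, A1 B2, A0 B2
 row for B3 (shifted left by 3): A3 B3, A2 B3, A1 B3, A0 B3
 sum: S7 S6 S5 S4 S3 S2 S1 S0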
Reduction Example
Note use of parallel carry-outs to form higher order sums
12 adders; if all are full adders, at 6 gates each, this is 72 gates
16 gates form the partial products
[Figure: adder array that reduces the 16 partial-product bits A0 B0 ... A3 B3 to S0 ... S7, drawn with 4 half adders (HA) and 8 full adders (FA)]
Example: Six partial products to be added
[Figure: original matrix of 36 bits and the reorganized matrix of bits]
[Figure: parallel multiplier block diagram with a compressor followed by a final adder]
Parallel multiplier
Reduction of the six partial products (Wallace)
[Figure: dot diagrams of the Wallace reduction stages over bit positions 11 down to 0. Legend: (2,2)-counter, (3,2)-counter]
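The idea of reducing many rows to two before a final adder can be sketched at word level. This is a simplification, not the bit-level dot diagram: it applies (3,2)-counters across whole rows (a carry-save adder), groups rows in threes at each level, and does not model the (2,2)-counters. All names are chosen for the example.

```python
def counter_3_2(x: int, y: int, z: int):
    """A (3,2)-counter in every column: three rows in, sum and carry rows out."""
    sum_row = x ^ y ^ z
    carry_row = ((x & y) | (y & z) | (x & z)) << 1   # carries move one column left
    return sum_row, carry_row


def reduce_rows(rows):
    """Reduce rows to two; returns (remaining rows, number of levels used)."""
    rows = list(rows)
    levels = 0
    while len(rows) > 2:
        grouped = len(rows) - len(rows) % 3
        next_rows = []
        for i in range(0, grouped, 3):
            next_rows.extend(counter_3_2(*rows[i:i + 3]))
        next_rows.extend(rows[grouped:])             # leftover rows pass through
        rows = next_rows
        levels += 1
    return rows, levels


a, b = 0b101101, 0b110111                            # 45 * 55
rows = [(a if (b >> i) & 1 else 0) << i for i in range(6)]
final_rows, levels = reduce_rows(rows)
print(levels, sum(final_rows))                       # 3 2475
```

Six rows shrink to four, then three, then two, so the final carry-propagate adder is needed only once.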
Reduction of the six partial products (Dadda)
[Figure: dot diagrams of the Dadda reduction stages over bit positions 10 down to 0]
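[Figure: 8-bit input a multiplied by the constant "00100100" to give the 16-bit output y. The multiplier (DW02_mult) is replaced by two shifters (<< for bit 5 and << for bit 2) feeding one adder (DW01_add). Figure labels: DW01_add DW02_mult 13 10]
y[15:0] <= a[7:0] * "00100100"
 == (a[7:0] << 5) + (a[7:0] << 2)
MULT ==> ADD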
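The rewrite from a multiplier to shifts and one adder follows from the positions of the 1 bits in the constant. A minimal Python sketch, with function names chosen for the example and a 16-bit output as in the slide:

```python
def shift_amounts(constant: int):
    """Positions of the 1 bits in the constant, LSB = 0."""
    return [i for i in range(constant.bit_length()) if (constant >> i) & 1]


def multiply_by_constant(a: int, constant: int, out_bits: int = 16) -> int:
    """a * constant using only shifts and adds, truncated to out_bits."""
    total = sum(a << shift for shift in shift_amounts(constant))
    return total & ((1 << out_bits) - 1)


print(shift_amounts(0b00100100))               # [2, 5]
print(multiply_by_constant(200, 0b00100100))   # 7200
```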
Constant Multiplication
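[Figure: 8-bit input a multiplied by the constant "00111110" to give the 16-bit output y. The multiplier (DW02_mult) is replaced by two shifters (<< for bit 6 and << for bit 1) feeding one subtractor (DW01_sub). Figure labels: DW01_sub DW02_mult 14 9]
y[15:0] <= a[7:0] * "00111110"
 == a * ("01000000" - "00000010");
 == a * "01000000" - a * "00000010";
 == (a << 6) - (a << 1);
 == (a << 6) + ~(a << 1) + 1;
MULT ==> ADD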
Canonical encoding transforms a constant [vector] so that it contains fewer ‘1’s than ‘0’s.
Canonical Encoding
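One way to obtain such an encoding is canonical signed-digit recoding (non-adjacent form), in which each digit is –1, 0 or +1 and no two adjacent digits are non-zero. The slides do not name the algorithm, so treating "canonical encoding" as this recoding is an assumption, as are the function names below.

```python
def canonical_signed_digits(constant: int):
    """Signed digits (-1, 0, +1) of a non-negative constant, LSB first."""
    digits = []
    while constant:
        if constant & 1:
            digit = 2 - (constant % 4)    # +1 if constant = 1 mod 4, -1 if 3 mod 4
            constant -= digit
        else:
            digit = 0
        digits.append(digit)
        constant >>= 1
    return digits


def multiply_by_digits(a: int, digits) -> int:
    """a * constant using one shift and one add/subtract per non-zero digit."""
    return sum(digit * (a << i) for i, digit in enumerate(digits))


digits = canonical_signed_digits(0b00111110)
print(digits)                            # [0, -1, 0, 0, 0, 0, 1]
print(multiply_by_digits(200, digits))   # 12400
```

For "00111110" the five 1 bits collapse to two non-zero digits, matching (a << 6) - (a << 1).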