Arithmetic Building Blocks
Digital Integrated Circuits
Arithmetic
© Prentice Hall 1995
A Generic Digital Processor
[Figure: block diagram with four blocks: INPUT-OUTPUT, MEMORY, CONTROL, and DATAPATH.]

Building Blocks for Digital Architectures
Arithmetic unit
- Bit-sliced datapath (adder, multiplier, shifter, comparator, etc.)
Memory
- RAM, ROM, buffers, shift registers
Control
- Finite state machine (PLA, random logic)
- Counters
Interconnect
- Switches
- Arbiters
- Bus

Bit-Sliced Design
[Figure: bit-sliced datapath with slices Bit 3 to Bit 0. The blocks Register, Adder, Shifter, and Multiplexer sit between Data-In and Data-Out, and the Control signals run across the slices.]
Tile identical processing elements.

Full-Adder
[Figure: full-adder block with inputs A, B, Cin and outputs Sum, Cout.]

The Binary Adder
[Figure: full-adder block with inputs A, B, Cin and outputs Sum, Cout.]
$S = A \oplus B \oplus C_i = A\bar{B}\bar{C}_i + \bar{A}B\bar{C}_i + \bar{A}\bar{B}C_i + ABC_i$
$C_o = AB + BC_i + AC_i$

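The sum and carry equations of the full adder can be checked against ordinary integer addition over all eight input combinations. The sketch below is illustrative Python, not part of the original slides; the function name `full_adder` is an assumption.

```python
from itertools import product

def full_adder(a, b, ci):
    """Return (sum, carry_out) for one-bit inputs a, b, ci."""
    s = a ^ b ^ ci                      # S = A xor B xor Ci
    co = (a & b) | (b & ci) | (a & ci)  # Co = AB + BCi + ACi
    return s, co

# Exhaustive check: 2*Co + S must equal the arithmetic sum of the inputs.
for a, b, ci in product((0, 1), repeat=3):
    s, co = full_adder(a, b, ci)
    assert 2 * co + s == a + b + ci
```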
Express Sum and Carry as a function of P, G, D
Define three new variables that depend ONLY on A and B:
Generate: $G = AB$
Propagate: $P = A \oplus B$
Delete: $D = \bar{A}\bar{B}$
In terms of G and P, $C_o = G + PC_i$ and $S = P \oplus C_i$. Expressions for S and Co based on D and P can be derived in the same way.

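As a quick check of the generate, propagate, and delete signals, the sketch below computes them for one bit position and derives the carry two ways. The names `pgd` and `sum_and_carry` are hypothetical, and the D/P form of the carry, $C_o = \bar{D}(\bar{P} + C_i)$, is one possible derivation rather than the one shown on the original slide.

```python
from itertools import product

def pgd(a, b):
    """Generate, propagate, delete signals; they depend only on a and b."""
    g = a & b
    p = a ^ b
    d = (1 - a) & (1 - b)
    return g, p, d

def sum_and_carry(a, b, ci):
    """Sum and carry-out expressed through P, G, D."""
    g, p, d = pgd(a, b)
    s = p ^ ci
    co_from_gp = g | (p & ci)               # Co = G + P*Ci
    co_from_dp = (1 - d) & ((1 - p) | ci)   # assumed form: Co = not(D) * (not(P) + Ci)
    assert co_from_gp == co_from_dp
    return s, co_from_gp

for a, b, ci in product((0, 1), repeat=3):
    g, p, d = pgd(a, b)
    assert g + p + d == 1                   # exactly one of G, P, D is active
    s, co = sum_and_carry(a, b, ci)
    assert 2 * co + s == a + b + ci
```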
The Ripple-Carry Adder
[Figure: four full-adder (FA) cells in a chain with inputs A0..A3, B0..B3 and carry-in Ci,0. The carries Co,0 (= Ci,1), Co,1, Co,2, Co,3 ripple from cell to cell, and the cells produce S0..S3.]
Worst-case delay is linear in the number of bits: $t_d = O(N)$
$t_{adder} \approx (N - 1)\,t_{carry} + t_{sum}$
Goal: make the fastest possible carry path circuit.

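A behavioral sketch of the ripple-carry structure, together with the delay estimate $(N-1)\,t_{carry} + t_{sum}$. Bit lists are LSB first; the helper names (`to_bits`, `from_bits`, `ripple_carry_add`, `ripple_delay`) are assumptions made for this illustration.

```python
def to_bits(value, n):
    """Return the n low-order bits of value, LSB first."""
    return [(value >> i) & 1 for i in range(n)]

def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))

def ripple_carry_add(a_bits, b_bits, ci=0):
    """Add two equal-length bit lists; the carry ripples from bit 0 upward."""
    carry = ci
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (b & carry) | (a & carry)
    return sum_bits, carry

def ripple_delay(n, t_carry, t_sum):
    """Worst-case delay estimate: (N - 1) carry stages plus one sum stage."""
    return (n - 1) * t_carry + t_sum

s, co = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(s), co)  # 11 + 6 = 17: four sum bits hold 1, carry-out is 1
```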
Complementary Static CMOS Full Adder
[Figure: transistor schematic with inputs A, B, Ci, internal node X, and outputs S and Co.]
28 transistors.

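Inversion Property
[Figure: a full adder (FA) with all inputs A, B, Ci inverted is equivalent to a full adder with both outputs S and Co inverted.]
Inverting all inputs of a full adder inverts both of its outputs.
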
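The inversion property can be confirmed by enumeration. The sketch below uses a behavioral `full_adder` (a name assumed here) and checks that complementing A, B, and Ci complements both S and Co.

```python
from itertools import product

def full_adder(a, b, ci):
    """Return (sum, carry_out) for one-bit inputs."""
    return a ^ b ^ ci, (a & b) | (b & ci) | (a & ci)

def inversion_property_holds():
    """True if inverting every input inverts both outputs, for all inputs."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        s_inv, co_inv = full_adder(1 - a, 1 - b, 1 - ci)
        if (s_inv, co_inv) != (1 - s, 1 - co):
            return False
    return True

print(inversion_property_holds())  # True
```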
Minimize Critical Path by Reducing Inverting Stages
[Figure: 4-bit ripple adder built from FA' cells, alternating between an Even Cell and an Odd Cell, with inputs A0..A3, B0..B3, Ci,0, carries Co,0..Co,3, and outputs S0..S3.]
Exploit the inversion property.
Note: two different types of cells are needed.

The better structure: the Mirror Adder
[Figure: transistor schematic with inputs A, B, Ci and outputs Co and S. The carry stage is annotated with the Kill, "0"-Propagate, "1"-Propagate, and Generate paths.]
24 transistors.

The Mirror Adder
• The NMOS and PMOS chains are completely symmetrical. This guarantees identical rising and falling transitions if the NMOS and PMOS devices are properly sized. A maximum of two series transistors can be observed in the carry-generation circuitry.
• When laying out the cell, the most critical issue is the minimization of the capacitance at node Co. The reduction of the diffusion capacitances is particularly important.
• The capacitance at node Co is composed of four diffusion capacitances, two internal gate capacitances, and six gate capacitances in the connecting adder cell.
• The transistors connected to Ci are placed closest to the output.
• Only the transistors in the carry stage have to be optimized for speed. All transistors in the sum stage can be minimum size.

Quasi-Clocked Adder
[Figure: transistor schematic in three sections labeled Signal Setup, Carry Generation, and Sum Generation, with inputs A, B, Ci, propagate signal P, and outputs Co and S.]

NMOS-Only Pass Transistor Logic
[Figure: pass-transistor networks driven by A, B, C and their complements, producing Sum and Cout in true and complemented form.]
Transistor count (CPL): 28

NP-CMOS Adder
[Figure: two-bit clocked (φ) transistor schematic with inputs A0, B0, A1, B1, Ci0, carries Ci1 and Ci2, and outputs S0 and S1. The Carry Path is highlighted.]

NP-CMOS Adder
[Figure: two-bit slice with inputs A0, B0, A1, B1, Ci0 and outputs S0, S1, Co1.]

Manchester Carry Chain
[Figure: clocked (φ) carry chain from Ci,0 through pass transistors controlled by P0..P4, with pull-down transistors controlled by G0..G4.]

Sizing Manchester Carry Chain
[Figure: RC model of the chain with a discharge transistor, nodes 1 to 6, resistances R1..R6, capacitances C1..C6, transistors MC and M0..M4, and output Out.]
$t_p = 0.69 \sum_{i=1}^{N} C_i \left( \sum_{j=1}^{i} R_j \right)$
[Plots: Speed (normalized by 0.69RC) versus k, and Area (in minimum size devices) versus k, for k from 1 to 3.]

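The delay expression for the Manchester chain sums, for each node, its capacitance times the total resistance between that node and the discharge point. A minimal sketch, assuming the lists are ordered from the discharge transistor toward the output and the units are consistent (the function name `manchester_chain_delay` is hypothetical):

```python
def manchester_chain_delay(resistances, capacitances):
    """t_p = 0.69 * sum_i C_i * (R_1 + ... + R_i) for an RC chain."""
    if len(resistances) != len(capacitances):
        raise ValueError("need one resistance and one capacitance per node")
    total = 0.0
    upstream_r = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_r += r          # resistance accumulated up to this node
        total += c * upstream_r
    return 0.69 * total

# Six identical stages with R = C = 1: 0.69 * (1 + 2 + ... + 6) = 0.69 * 21
print(manchester_chain_delay([1.0] * 6, [1.0] * 6))
```

With identical stages the sum grows as N(N+1)/2, which is why long chains need buffering or sizing.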
Carry-Bypass Adder
[Figure: 4-bit chain of FA cells with signals P0, G0 to P3, G3 and carries Ci,0, Co,0..Co,3, shown without and with a bypass Multiplexer at the output. The multiplexer is controlled by BP.]
$BP = P_0 P_1 P_2 P_3$
Idea: if $P_0 P_1 P_2 P_3 = 1$, then $C_{o,3} = C_{i,0}$; otherwise the carry is "killed" or "generated" inside the block.

Manchester-Carry Implementation
[Figure: Manchester carry chain from Ci,0 to Co,3 with pass transistors controlled by P0..P3, pull-downs controlled by G0..G3, and a bypass path controlled by BP.]

Carry-Bypass Adder (cont.)
[Figure: 16-bit adder built from four blocks (Bit 0-3, Bit 4-7, Bit 8-11, Bit 12-15). Each block has Setup, Carry Propagation, and Sum stages; the carry enters at Ci,0.]

Carry Ripple versus Carry Bypass
[Plot: tp versus N for the ripple adder and the bypass adder; the N axis carries the label 4..8.]

Carry-Select Adder
[Figure: one block with Setup (P, G), a "0" Carry Propagation chain and a "1" Carry Propagation chain, a Multiplexer selected by Co,k-1 that outputs Co,k+3 and the Carry Vector, and Sum Generation.]

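A behavioral sketch of the carry-select principle: each block computes its result for both possible carry-in values, and the actual incoming carry only drives a multiplexer. The block size of 4 and all function names are assumptions for illustration.

```python
def to_bits(value, n):
    """Return the n low-order bits of value, LSB first."""
    return [(value >> i) & 1 for i in range(n)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

def ripple(a_bits, b_bits, ci):
    """Plain ripple-carry addition over one block."""
    carry, out = ci, []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return out, carry

def carry_select_add(a_bits, b_bits, ci=0, block=4):
    """Add LSB-first bit lists using per-block "0" and "1" carry chains."""
    sum_bits, carry = [], ci
    for start in range(0, len(a_bits), block):
        a_blk = a_bits[start:start + block]
        b_blk = b_bits[start:start + block]
        s0, c0 = ripple(a_blk, b_blk, 0)   # "0" carry propagation
        s1, c1 = ripple(a_blk, b_blk, 1)   # "1" carry propagation
        # Multiplexer: the incoming carry selects one precomputed result.
        s_sel, carry = (s1, c1) if carry else (s0, c0)
        sum_bits.extend(s_sel)
    return sum_bits, carry

s, co = carry_select_add(to_bits(0x0FFF, 16), to_bits(0x0001, 16))
print(hex(from_bits(s)), co)  # 0x1000 0
```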
Carry Select Adder: Critical Path
[Figure: 16-bit carry-select adder built from four blocks (Bit 0-3, Bit 4-7, Bit 8-11, Bit 12-15). Each block has Setup, "0" Carry and "1" Carry chains, a Multiplexer, and Sum Generation. Starting from Ci,0, the block carries Co,3, Co,7, Co,11, Co,15 pass from multiplexer to multiplexer; the outputs are S0-3, S4-7, S8-11, S12-15.]

Linear Carry Select
[Figure: 16-bit carry-select adder with four equal blocks (Bit 0-3, Bit 4-7, Bit 8-11, Bit 12-15), annotated with arrival times: (1) after Setup, (5) at the end of every "0" and "1" carry chain, (6), (7), (8), (9) after the successive Multiplexers, and (10) at the last Sum Generation output S12-15.]

Square Root Carry Select
[Figure: 20-bit carry-select adder with blocks of increasing size (Bit 0-1, Bit 2-4, Bit 5-8, Bit 9-13, Bit 14-19). Arrival-time annotations run from (1) after Setup to (9) at the last Sum output S14-19.]

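The arrival-time annotations in the linear and square-root figures can be reproduced with a simple unit-delay model. The sketch below assumes one time unit each for setup, per-bit carry propagation, a multiplexer, and sum generation; the function name `carry_select_delay` and these unit values are assumptions, not figures from the slides.

```python
def carry_select_delay(block_sizes, t_setup=1, t_carry=1, t_mux=1, t_sum=1):
    """Worst-case delay of a carry-select adder under a unit-delay model.

    block_sizes lists the number of bits per block, least significant first.
    """
    select_ready = 0
    for m in block_sizes:
        # Both carry chains of a block start right after setup.
        chains_ready = t_setup + m * t_carry
        # The block multiplexer waits for its own chains and the previous mux.
        select_ready = max(select_ready, chains_ready) + t_mux
    return select_ready + t_sum

print(carry_select_delay([4, 4, 4, 4]))     # linear, 16 bits: 10
print(carry_select_delay([2, 3, 4, 5, 6]))  # square root, 20 bits: 9
```

In the square-root arrangement each block grows by one bit, so its carry chains finish just as the previous multiplexer output arrives, and no stage waits idle.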
Adder Delays - Comparison
[Plot: tp (0 to 50) versus N (0 to 60) for the ripple adder, the linear select adder, and the square root select adder.]

LookAhead - Basic Idea
[Figure: bit positions 0 to N-1 with inputs (A0, B0), (A1, B1), ..., (AN-1, BN-1), carry inputs Ci,0, Ci,1, ..., Ci,N-1, and propagate outputs P0, P1, ..., PN-1.]

Look-Ahead: Topology
[Figure: single-gate transistor network producing Co,3 from G0..G3, P0..P3, and Ci,0.]

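The look-ahead topology computes each carry directly from the G and P signals and Ci,0 instead of waiting for the previous carry. The sketch below expands the standard recurrence $C_{o,k} = G_k + P_k C_{o,k-1}$ into its flattened sum-of-products form; the recurrence is not spelled out in the extracted slide text, and the function name `lookahead_carries` is an assumption.

```python
def lookahead_carries(p, g, ci):
    """Return [Co,0, Co,1, ...] from flattened look-ahead expressions.

    Co,k = G_k + P_k G_(k-1) + ... + P_k ... P_1 G_0 + P_k ... P_0 Ci,0
    """
    carries = []
    for k in range(len(p)):
        co = 0
        prefix = 1                    # running product P_k * P_(k-1) * ...
        for j in range(k, -1, -1):
            co |= prefix & g[j]
            prefix &= p[j]
        co |= prefix & ci             # term that propagates Ci,0 all the way
        carries.append(co)
    return carries

# A = 0b0111, B = 0b0001 (LSB first): the carry generated at bit 0 propagates.
a, b = 0b0111, 0b0001
p = [((a ^ b) >> k) & 1 for k in range(4)]
g = [((a & b) >> k) & 1 for k in range(4)]
print(lookahead_carries(p, g, 0))  # [1, 1, 1, 0]
```

The number of product terms and the fan-in grow with k, which is why a single-level look-ahead is practical only for small groups.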
Logarithmic Look-Ahead Adder
[Figure: two ways to compute a function F of A0..A7. A linear chain gives $t_p \sim N$; a tree gives $t_p \sim \log_2(N)$.]

Brent-Kung Adder
[Figure: 8-bit prefix tree combining (G0,P0) through (G7,P7) to produce the carries Co,0 through Co,7.]
$t_{add} \sim \log_2(N)$

The Binary Multiplication
$Z = X \times Y = \sum_{k=0}^{M+N-1} Z_k 2^k = \left( \sum_{i=0}^{M-1} X_i 2^i \right) \left( \sum_{j=0}^{N-1} Y_j 2^j \right) = \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} X_i Y_j 2^{i+j}$
with
$X = \sum_{i=0}^{M-1} X_i 2^i$ and $Y = \sum_{j=0}^{N-1} Y_j 2^j$

The Binary Multiplication
            1 0 1 0 1 0        multiplicand
      ×         1 0 1 1        multiplier
            1 0 1 0 1 0
          1 0 1 0 1 0          partial products (AND operation)
        0 0 0 0 0 0
  +   1 0 1 0 1 0
      1 1 1 0 0 1 1 1 0        result

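The worked example (101010 × 1011) can be reproduced with a shift-and-add sketch: each multiplier bit ANDs with the multiplicand to form one partial product, shifted by the bit position. The function names are assumptions for illustration.

```python
def partial_products(x, y, n_bits_y):
    """One partial product per multiplier bit: (x AND y_j) shifted left by j."""
    return [(x << j) if (y >> j) & 1 else 0 for j in range(n_bits_y)]

def multiply(x, y, n_bits_y):
    """Sum of all partial products."""
    return sum(partial_products(x, y, n_bits_y))

for pp in partial_products(0b101010, 0b1011, 4):
    print(format(pp, "09b"))
print(format(multiply(0b101010, 0b1011, 4), "09b"))  # 111001110 (42 * 11 = 462)
```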
The Array Multiplier
[Figure: 4x4 array multiplier. Rows of HA and FA cells combine the partial products of X0..X3 with Y0..Y3 and produce the outputs Z0..Z7.]

The MxN Array Multiplier: Critical Path
[Figure: array of HA and FA cells with Critical Path 1, Critical Path 2, and the segment shared by Critical Path 1 & 2 marked.]

Carry-Save Multiplier
[Figure: array of HA and FA cells in which carries are passed to the next row rather than along the same row, followed by a Vector Merging Adder.]

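The carry-save idea can be sketched on whole words: a row of full adders reduces three operands to a sum word and a carry word without propagating carries, and only the final vector-merging adder performs a carry-propagating addition. This is a word-level illustration with assumed names (`compress_3_2`, `carry_save_multiply`), not a cell-accurate model of the array.

```python
def compress_3_2(a, b, c):
    """Bitwise full adders: three words in, sum word and carry word out."""
    s = a ^ b ^ c
    carry = ((a & b) | (b & c) | (a & c)) << 1   # carries move one column left
    return s, carry

def carry_save_multiply(x, y, n_bits_y):
    """Accumulate partial products in carry-save form, then merge once."""
    s, c = 0, 0
    for j in range(n_bits_y):
        pp = (x << j) if (y >> j) & 1 else 0
        s, c = compress_3_2(s, c, pp)            # no carry propagation here
    return s + c                                 # vector-merging adder

print(carry_save_multiply(42, 11, 4))  # 462
```

Each compression preserves the invariant that the sum word plus the carry word equals the total accumulated so far.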
Adder Cells in Array Multiplier
[Figure: transistor schematic of an adder cell with inputs A, B, Ci, propagate signal P, and outputs S and Co.]
Identical delays for carry and sum.

Multiplier Floorplan
[Figure: floorplan with X0..X3 entering along one edge and Y0..Y3 along another. The array is tiled from HA Multiplier Cells, FA Multiplier Cells (each with C and S outputs), and Vector Merging Cells, producing Z0..Z7.]
X and Y signals are broadcast through the complete array.

Wallace-Tree Multiplier
[Figure: two arrangements of FA cells that reduce the inputs y0..y5, with carries Ci-1 and Ci passed between neighboring columns, to a final C and S pair.]

Multipliers: Summary
• Optimization goals differ from those of the binary adder
• Once again: identify the critical path
• Other possible techniques
- Logarithmic versus linear (Wallace-tree multiplier)
- Data encoding (Booth)
- Pipelining
FIRST GLIMPSE AT SYSTEM LEVEL OPTIMIZATION

The Binary Shifter
[Figure: bit-slice i of a one-bit shifter with control signals Right, nop, and Left, inputs Ai and Ai-1, and outputs Bi and Bi-1.]

The Barrel Shifter
[Figure: 4-bit barrel shifter with data wires A0..A3 and B0..B3 and control wires Sh0..Sh3.]
Area dominated by wiring.

4x4 barrel shifter
[Figure: layout with inputs A0..A3, control lines Sh0..Sh3, and an output Buffer.]
$\text{Width}_{barrel} \sim 2\,p_m M$

Logarithmic Shifter
[Figure: three cascaded stages controlled by Sh1, Sh2, and Sh4 (each with its complement), between inputs A0..A3 and outputs B0..B3.]

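A logarithmic shifter decomposes the shift amount into its binary digits: stage Sh1 shifts by one position or passes the data through, Sh2 by two, and Sh4 by four. The sketch below models a right shift on an 8-bit word; the direction, the word width, and the name `log_shift_right` are assumptions for illustration.

```python
def log_shift_right(value, shift, width=8):
    """Shift right by 0..7 positions using three conditional stages."""
    out = value & ((1 << width) - 1)
    for stage in (1, 2, 4):          # control signals Sh1, Sh2, Sh4
        if shift & stage:            # stage active: shift by its fixed amount
            out >>= stage
        # stage inactive: data passes straight through
    return out

print(bin(log_shift_right(0b10110000, 5)))  # 0b101
```

Only log2 of the maximum shift width stages are needed, compared with one pass-transistor row per shift amount in the barrel shifter.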
0-7 bit Logarithmic Shifter
[Figure: layout with inputs A0..A3 and outputs Out0..Out3.]

Design as a Trade-Off
[Plots: tp (nsec) versus N and Area (mm2) versus N, for N up to 20, comparing the static, mirror, manchester, bypass, select, and look-ahead adders.]

Layout Strategies for Bit-Sliced Datapaths
[Figure: two cell layouts showing signal wires (M2), control wires (M1), wells, and the VDD and GND rails.]
Approach I: signal and power lines parallel
Approach II: signal and power lines perpendicular

Layout of Bit-sliced Datapaths
[Figure: three layouts of the same datapath.]
(a) Datapath without feedthroughs and without pitch matching (area = 4.2 mm2).
(b) Adding feedthroughs (area = 3.2 mm2).
(c) Equalizing the cell height reduces the area to 2.2 mm2.