The 12th International Conference on Microelectronics

Tehran,

Oct. 31- Nov. 2, 2000

Low-Power, Low-Noise Adder Design with Pass-transistor Adiabatic Logic

Hamid Mahmoodi-Meimand and Ali Afzali-Kusha
IC Design Center, Department of Electrical and Computer Engineering, University of Tehran, Iran
[email protected] [email protected]

generation of the circuit. The logic circuit is an 8-bit carry look-ahead adder (CLA), which has been implemented in all three logic styles. The adiabatic logic is the Pass-transistor Adiabatic Logic (PAL) proposed in [1], which operates fully adiabatically. Although the study is performed for this type of adiabatic logic, the results can be extended to other types of adiabatic logic. The structure of the paper is as follows. Section II gives an overview of PAL. The adder designs are described in Section III, and the results are presented in Section IV. Finally, Section V contains the summary and conclusion of the paper.

Abstract: In this paper, the efficiency of a fully adiabatic logic circuit is compared with that of its combinational and pipelined static CMOS counterparts. The performance of each circuit is studied in terms of the maximum frequency of operation, the minimum voltage of operation, the circuit energy consumption, and the switching noise generated by the circuit. An 8-bit carry look-ahead adder is designed in a 0.6-µm CMOS technology for all three logic styles. Based on the post-layout simulation results, the adiabatic adder exhibits energy savings of 76% to 87% and 87% to 90% compared to its combinational and pipelined static CMOS counterparts, respectively. It also exhibits a considerable reduction in switching noise compared to its static CMOS counterparts.

II. PAL OVERVIEW
PAL is a dual-rail adiabatic logic with a relatively low gate complexity that operates with a two-phase power clock [1].

I. INTRODUCTION
Demands for low-power and low-noise digital circuits have motivated VLSI designers to explore new approaches to the design of VLSI circuits. Energy-recovering (adiabatic) logic is a promising new approach, originally developed for low-power digital circuits [1-3]. Adiabatic circuits achieve low energy dissipation by restricting current to flow across devices with a low voltage drop and by recycling the energy stored on their capacitors [4]. Another major advantage of adiabatic logic families is their low generation of switching noise, which is becoming one of the most important problems in digital and especially in mixed-mode integrated circuits. The traditional solution of employing on-chip decoupling capacitors to combat the supply noise results in an unacceptable area increase [5]. Adiabatic logic circuits reduce the switching noise of digital circuits because switching occurs with the minimum voltage drop across devices and node voltages change slowly. To the best of our knowledge, no report on the switching noise characteristics of adiabatic circuits has been published in the literature. In this paper, we present the results of a comparison between an adiabatic logic circuit and its combinational and pipelined static CMOS counterparts, in terms of the maximum frequency of operation, the minimum voltage of operation, the energy consumption, and the switching noise

A. PAL Gates
A PAL gate consists of true and complementary pass-transistor NMOS functional blocks (f, /f) and a cross-coupled PMOS latch (MP1, MP2), as illustrated by the example of Fig. 1, which shows the implementation of an AND-OR gate: Q = A·B + C. The power is supplied through a sinusoidal power clock (PC). When PC starts rising from low, the input states create a conduction path from PC through one of the functional blocks to the corresponding output node, allowing that node to follow the power clock. The other node is tri-stated and kept close to 0 V by its load capacitance. This in turn causes one of the PMOS transistors to conduct and charge the node that should go to the one state up to the peak of PC. The output state is valid at around the

Fig. 1. Implementation of Q = A·B + C in PAL


clock is ramping down. The E phase of an odd stage coincides with the D phase of an even stage and vice versa. Fig. 3 shows the timing of the signals in a PAL cascade, obtained from HSPICE simulations of this cascade at 10 MHz with a 0.6-µm CMOS technology. The input signal was the periodic sequence 01110111.... More information regarding the operation of PAL can be found in [1].
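The two-phase timing of the inverter cascade can be illustrated with a small behavioral sketch. It is not the HSPICE model used in the paper: it assumes ideal logic values, treats each half clock cycle as one step, and the function name `simulate_cascade` is hypothetical. Odd stages evaluate while PC ramps up and even stages evaluate while /PC ramps up, so each stage adds half a cycle of delay.

```python
def simulate_cascade(bits, n_stages=4):
    """Behavioral sketch of a PAL inverter cascade (assumed ideal logic levels).

    bits: one input bit per clock cycle, applied to the first stage.
    Returns outputs[s][k]: the value stage s (zero-based) produces at the end
    of half-cycle k, or None when the stage is discharging or has no valid input.
    """
    n_half = 2 * len(bits) + n_stages
    outputs = [[None] * n_half for _ in range(n_stages)]
    for k in range(n_half):
        for s in range(n_stages):
            # Stages alternate between PC and /PC: zero-based stage s
            # evaluates only on half-cycles with the same parity as s.
            if (k - s) % 2 != 0:
                continue
            if s == 0:
                cycle = k // 2
                src = bits[cycle] if cycle < len(bits) else None
            else:
                # Input is what the previous stage evaluated one half-cycle earlier.
                src = outputs[s - 1][k - 1] if k >= 1 else None
            outputs[s][k] = None if src is None else 1 - src  # inverter
    return outputs


if __name__ == "__main__":
    pattern = [0, 1, 1, 1, 0, 1, 1, 1]  # input sequence quoted in the text
    out = simulate_cascade(pattern)
    fourth_stage = [out[3][2 * i + 3] for i in range(len(pattern))]
    print(fourth_stage)
```

In this model the fourth stage reproduces the input pattern after four half-cycles (two full clock cycles), since four inversions cancel.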

top of the power clock. The power clock will then ramp down toward zero, recovering the energy stored on the output node capacitance.
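The dual-rail behavior of the AND-OR gate can be checked with a short logic-level sketch. It models only which pass network conducts, not the PMOS latch or the analog charging. The complementary network expression is an assumption derived with De Morgan's law, since the transistor-level schematic of Fig. 1 is not reproduced here, and the function names are hypothetical.

```python
from itertools import product


def f_true(a: bool, b: bool, c: bool) -> bool:
    """True functional block: connects PC to Q when A.B + C holds."""
    return (a and b) or c


def f_comp(a: bool, b: bool, c: bool) -> bool:
    """Complementary block, driven by the complemented inputs (assumed form)."""
    na, nb, nc = not a, not b, not c
    return (na or nb) and nc


def pal_gate(a: bool, b: bool, c: bool):
    """Return (Q, /Q) as they would be at the peak of the power clock."""
    q, nq = f_true(a, b, c), f_comp(a, b, c)
    # Exactly one of the two pass networks may conduct for any input.
    assert q != nq
    return q, nq


# Full truth table over the three inputs.
table = {abc: pal_gate(*abc) for abc in product([False, True], repeat=3)}

if __name__ == "__main__":
    for abc, (q, nq) in table.items():
        print(abc, int(q), int(nq))
```

The assertion inside `pal_gate` verifies that the two networks are never on together, which is the condition for one output to follow PC while the other stays near 0 V.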

B. PAL Cascades
Logic gates are cascaded by connecting their power clock ports alternately to PC and to its 180° phase-shifted signal (/PC). Both PC and /PC can be obtained from an efficient LC oscillator, so generating /PC incurs no extra overhead. A cascade of four PAL inverters is shown in Fig. 2. All odd logic stages are supplied by the sinusoidal voltage PC, while all even logic stages are supplied by /PC. The logic operation has only two phases: evaluate (E), when the power clock is ramping up, and discharge (D), when the power

III. ADDER DESIGNS
For a fair comparison, all CLAs have the same logic architecture. The schematic diagram of the 8-bit CLA is shown in Fig. 4. The full-custom layout of the adiabatic CLA consists of 445 transistors. All devices are minimum size in a 0.6-µm CMOS technology. The adiabatic adder is similar to a 6-stage pipelined adder with two-phase clocking. It generates one output each cycle and has a latency of 3 cycles. Each primary output was connected to a 50 fF load. To compare the performance of this adiabatic circuit, we developed two non-adiabatic designs in static CMOS logic. The first design was a purely combinational CLA, while the second was a pipelined version of the combinational design. To match the architecture of the adiabatic CLA, we implemented the pipelined CLA as a 6-stage architecture with two-phase clocking, which has a latency of 3 cycles, equal to the latency of the adiabatic CLA. The layouts

Fig. 2. A 4-stage cascade of PAL inverters

Fig. 3. Waveforms obtained from HSPICE simulations of a 4-stage pipeline of PAL inverters, plotted against time: (a) power clock (PC and /PC), (b) input of 1st stage, (c) output of 1st stage, (d) output of 2nd stage, (e) output of 3rd stage, and (f) output of 4th stage.

Fig. 4. Schematic diagram of 8-bit CLA (labeled signals include the operand bits a0-a7 and b0-b7 and the sum bits s0-s7)
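For readers unfamiliar with carry look-ahead addition, the following is a generic textbook sketch of an 8-bit CLA at the bit level. It is not a transcription of the schematic in Fig. 4, whose exact gate grouping is not recoverable here, and the function name `cla_add` is hypothetical. Each carry is computed directly from the generate and propagate signals rather than rippling through the previous stage.

```python
def cla_add(a: int, b: int, c0: int = 0, width: int = 8):
    """Add two unsigned integers with carry look-ahead equations.

    Returns (sum, carry_out). Generic textbook formulation, assumed here
    for illustration only.
    """
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]    # generate: a_i AND b_i
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]  # propagate: a_i XOR b_i

    carries = [c0]
    for i in range(width):
        # c[i+1] = g[i] + p[i]g[i-1] + ... + p[i]...p[1]g[0] + p[i]...p[0]c0
        c = 0
        prod = 1
        for j in range(i, -1, -1):
            c |= prod & g[j]
            prod &= p[j]
        c |= prod & c0
        carries.append(c)

    s = 0
    for i in range(width):
        s |= (p[i] ^ carries[i]) << i  # sum bit: p_i XOR c_i
    return s, carries[width]


if __name__ == "__main__":
    print(cla_add(200, 100))  # sum wraps modulo 256, carry-out is set
```

In hardware the inner loop corresponds to a flat AND-OR network per carry, which is the kind of function the AND-OR gate example of Fig. 1 implements.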


Fig. 6. Energy consumption vs. frequency (horizontal axis: frequency in MHz)

supply voltage for each design is shown next to its data point. As expected, the pipelined design has the lowest minimum operating voltage and the adiabatic design has the highest. However, the adiabatic design has the lowest energy consumption among the designs. Compared to the combinational adder, the adiabatic adder exhibits energy savings of 87% at 10 MHz and 76% at 100 MHz. Compared to the pipelined adder, the adiabatic adder exhibits energy savings of 87% at 10 MHz and 90% at 100 MHz. The adiabatic design fails to function above 100 MHz, which is a disadvantage of this style for high-speed systems. Thus, the adiabatic logic is more efficient for applications where speed of operation is not critical. The results show that the adiabatic logic is more suitable for the implementation of pipelined architectures. The graphs in Fig. 7 give the power profiles of the CLAs over three clock periods at 10 MHz while operating at their minimum supply voltage. Negative values indicate power flowing into the circuit, and positive values denote power flowing out of the circuit. The latter occurs only for the adiabatic circuit and indicates the energy recovery property of this logic style. Fig. 8 gives the overall energy profiles of the adders at 10 MHz while operating at their minimum supply voltage. It again shows the energy recycling of the adiabatic logic. The small increase in the energy dissipation of the adiabatic adder in each

Fig. 5. Layout of the test chip

of the two designs were generated using standard cells and the LEDIT placement and routing tool. The standard cells were optimized for low power and high speed. The combinational and pipelined CLAs consist of 704 and 3596 transistors, respectively. We have integrated these three layouts in a test chip, which has been submitted for fabrication. To limit the pin count of the experimental chip to 40 pins, input demultiplexers and output multiplexers have also been integrated in the test chip. To facilitate the net power measurement of the circuits, the power lines of the CLA blocks are separated from those of the other parts of the chip. Fig. 5 shows the layout of the test chip, which has a die area of 5 mm².

IV. SIMULATION RESULTS
In this section, we present the results of the HSPICE simulations of the adders. The circuits were simulated with the netlists extracted from the layouts. The simulations computed the dissipation of the gates and internal clock lines but did not include the energy consumed in the external clock distribution network or the power clock generator. At each frequency, the results were obtained for the minimum supply voltage that ensured correct function of each circuit. We also applied the worst-case input pattern to the adders, which causes the maximum rate of events on the circuit nodes and, hence, the maximum switching noise and power consumption.

A. Energy Consumption Results

Fig. 6 shows the energy consumption per cycle of the adders when operating at 10 MHz, 50 MHz, 100 MHz, 150 MHz, and 200 MHz. The minimum

Fig. 7. Power profiles of CLAs (a) Pipeline (b) Combinational (c) Adiabatic


Fig. 8. Energy profiles

cycle is due to the energy loss in resistive components, although most of the energy is recycled. The energy consumption of the combinational adder occurs at the input transitions, and that of the pipelined adder occurs at the active edges of the clocks.
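The residual resistive loss can be related to the standard first-order result from the adiabatic switching literature (see, e.g., [4]). The symbols below are assumptions introduced for illustration and are not defined in this paper: C is the load capacitance, V the voltage swing, R the effective resistance of the charging path, and T the ramp time of the supply. The second expression holds for a linear ramp with T much larger than RC; a sinusoidal power clock changes only the constant factor.

```latex
% Abrupt charging of C to V through a switch (static CMOS), per transition:
E_{\mathrm{conv}} = \tfrac{1}{2}\, C V^{2}

% Slow ramp charging over a time T \gg RC (adiabatic), per transition:
E_{\mathrm{adiabatic}} \approx \frac{RC}{T}\, C V^{2}
```

Under these assumptions the adiabatic loss shrinks as the ramp time T grows, which is consistent with the adiabatic adder being most attractive at lower operating frequencies.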


12.6 mA. The switching current of the adiabatic CLA is much more regular and sinusoidal, with a maximum amplitude of 0.7 mA. This is a 79% reduction compared to the pipelined CLA and a 94% reduction compared to the combinational CLA. The maximum current slopes for the pipelined, combinational, and adiabatic designs are 47 A/µs, 47 A/µs, and 0.3 A/µs, respectively. This means that the adiabatic design exhibits a two-orders-of-magnitude reduction in switching noise compared to the static CMOS designs, assuming the same effective power supply inductance for all the designs.
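The percentages and the order-of-magnitude claim above follow directly from the quoted amplitudes and slopes, as the short sketch below shows. The peak currents and slopes are the values read from the text (the slope units are taken to be A/µs). The effective supply inductance `L_EFF_NH` is a purely hypothetical value chosen for illustration, and the helper names are not from the paper.

```python
# Peak switching currents (mA) and maximum current slopes (A/us) as quoted in the text.
peak_mA = {"pipelined": 3.4, "combinational": 12.6, "adiabatic": 0.7}
slope_A_per_us = {"pipelined": 47.0, "combinational": 47.0, "adiabatic": 0.3}

L_EFF_NH = 5.0  # hypothetical effective supply inductance in nH (assumption)


def reduction_percent(reference: float, value: float) -> float:
    """Percentage by which `value` is lower than `reference`."""
    return 100.0 * (1.0 - value / reference)


def inductive_noise_mV(slope_a_per_us: float, l_nh: float) -> float:
    """L * dI/dt noise in mV: 1 nH * 1 A/us = 1e-9 H * 1e6 A/s = 1 mV."""
    return l_nh * slope_a_per_us


if __name__ == "__main__":
    for style in ("pipelined", "combinational"):
        r = reduction_percent(peak_mA[style], peak_mA["adiabatic"])
        ratio = slope_A_per_us[style] / slope_A_per_us["adiabatic"]
        print(f"vs {style}: peak current reduced by {r:.0f}%, slope ratio {ratio:.0f}x")
    for style, slope in slope_A_per_us.items():
        print(f"{style}: L dI/dt = {inductive_noise_mV(slope, L_EFF_NH):.1f} mV")
```

Because the same inductance is assumed for every design, the ratio of inductive noise equals the ratio of current slopes regardless of the value chosen for `L_EFF_NH`.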

V. SUMMARY AND CONCLUSION
In this paper, an adiabatic logic style was compared with the combinational and pipelined static CMOS logic styles by designing an 8-bit carry look-ahead adder in all three styles with a 0.6-µm CMOS technology. The designed circuits were simulated post-layout with HSPICE. Based on the simulation results, the adiabatic CLA exhibits energy savings of 76% to 90% and a two-orders-of-magnitude reduction in switching noise compared to its static CMOS counterparts. At each operating frequency, the adiabatic design has the highest minimum supply voltage. The maximum operating frequency of the adiabatic design is about half that of the static CMOS designs. In conclusion, while the adiabatic logic family studied here exhibits considerable improvements in energy savings and switching noise characteristics, it has the disadvantages of a higher supply voltage and a lower speed of operation.

B. Switching Noise Generation Results
Power supply switching noise is composed of resistive (IR) and inductive (L dI/dt) noise [6]. Here I is the supply current, while L and R are the effective supply inductance and resistance, respectively. When a number of devices switch at the same time, the cumulative transient current (I) and the slew rate (dI/dt) can be very large. The best logic style in this respect is the one that causes the minimum current spike (I) and slew rate (dI/dt) on the supply lines. Two specific characteristics of adiabatic circuits give them low switching noise generation. First, in adiabatic circuits, switching occurs with the minimum voltage drop across devices. Second, both the signals and the power supplies change slowly. Thus, steep spikes can be effectively removed from the supply current, resulting in a considerable decrease in switching noise. Fig. 9 shows the switching current waveforms obtained for the CLAs when operating with their minimum supply voltage at 10 MHz. The pipelined CLA has abrupt total switching currents at the active edges of the clocks. The maximum amplitude is 3.4 mA, with many peaks and valleys. The combinational CLA has abrupt switching currents at the input transitions with the maximum amplitude of

ACKNOWLEDGMENT This research was supported in part by the EMAD Semicon Corporation.

REFERENCES
[1] V. G. Oklobdzija, D. Maksimovic, and F. Lin, “Pass-transistor adiabatic logic using single power-clock supply,” IEEE Trans. on Circuits and Systems-II: Analog and Digital Signal Processing, vol. 44, no. 10, pp. 842-846, Oct. 1997.
[2] S. Kim and M. C. Papaefthymiou, “True single-phase energy-recovering logic for low-power, high-speed VLSI,” Proc. of International Symp. on Low-Power Electronics and Design, pp. 167-172, Aug. 1998.
[3] S. Kim and M. C. Papaefthymiou, “Single-phase source-coupled adiabatic logic,” Proc. of International Symp. on Low-Power Electronics and Design, pp. 97-99, Aug. 1999.
[4] W. C. Athas, L. J. Svensson, J. G. Koller, N. Tzartzanis, and Y. Chou, “Low-power digital systems based on adiabatic-switching principles,” IEEE Trans. on VLSI Systems, vol. 2, no. 4, pp. 398-406, Dec. 1994.
[5] R. Downing, P. Gebler, and George Katopis, “Decoupling capacitors effects on switching noise,” IEEE Trans. on Components, Hybrids, and Manufacturing Technology, vol. 16, no. 5, pp. 484-489, Aug. 1993.
[6] S. Zhao and K. Roy, “Estimation of switching noise on power supply lines in deep sub-micron CMOS circuits,” 13th International Conf. on VLSI Design, pp. 168-173, Jan. 2000.

Fig. 9. Switching current waveforms of CLAs (a) Pipeline (b) Combinational (c) Adiabatic
