Distributed Data-Retention Power Gating Techniques for Column and Row Co-Controlled Embedded SRAM

Chung-Hsien Hua, Tung-Shuan Cheng and Wei Hwang
Department of Electronics Engineering & Institute of Electronics and Microelectronics and Information Systems Research Center (MIRC)
National Chiao-Tung University, Hsin-Chu 300, Taiwan
{cshua, hwang}@eic.nctu.edu.tw, [email protected]

Abstract

In this paper, multi-mode data-retention power gating (P.G.) techniques are presented for embedded memories. These data-retention power gating techniques are applied to embedded SRAM with distributed column and row co-controlled capabilities. The SRAM array is divided into blocks. Each block has a dedicated data-retention power gating device. The data-retention power gating devices are controlled by signals from both row and column decoders. Only the selected block is powered on. The multi-mode power gating structures proposed in this paper can provide 2X to 20X memory cell leakage reduction while maintaining good static noise margin. Simulation results show that for a 64-bit wordline, the active power reductions for 32-bit, 16-bit, and 8-bit blocks are 59%, 79%, and 94%, respectively. All the simulations and physical layout are implemented in TSMC CMOS technology.

1. Introduction

Future applications will require more embedded memory on SoCs. Over 90% of future chip area is predicted to be occupied by memory circuits in the coming decade [1]. As technologies advance, the threshold voltages of transistors also decrease to maintain the ratio between supply voltage and threshold voltage. Therefore, leakage currents of all kinds come to dominate the overall power consumption in the era of sub-1V supply voltage nano-scale CMOS technologies. One solution is to use a higher supply voltage for memory cells that are built with higher threshold voltage and thicker gate oxide. This approach reduces the leakage currents attributable to technology scaling. However, extra masks are needed, which means higher NRE costs in nano-scale SoC design. Another approach is to activate the memory cells row by row. The memory cells are made of low threshold voltage devices. The leakage current of non-accessed cells is reduced by means of reverse body biasing (RBB) or control of the supply voltage. By using reverse body biasing, the threshold voltage of transistors can be controlled adaptively. The body effect caused by reverse body biasing suppresses the leakage currents of the memory cells.

In this paper, we propose a power gating structure supporting four different modes of operation, including a data-retention mode and an intermediate mode in addition to the conventional active mode and standby/cut-off mode. The paper is organized as follows. Single mode power gating techniques with data-retention capabilities and the data retention voltage are presented in Section 2. The new multi-mode power gating devices, which include data-retention and intermediate modes, and their effects on the static noise margin of storage elements are presented in Section 3. The column and row co-controlled data-retention power-gated SRAM architecture is presented in Section 4. The conclusions are summarized in Section 5.

2. Single Mode Power Gating Devices

In this section, single mode conventional and new data-retention power gating devices are presented and discussed in detail.

2.1. Conventional Power Gating Devices

The conventional power gating devices can be classified into two main categories: footer and header devices. A footer inserts NMOS sleep transistors between real ground and virtual ground, and a header inserts PMOS sleep transistors between real VDD and virtual VDD, as shown in Fig. 1. The internal circuits can be either combinational or sequential circuits. The power gating devices are controlled by a sleep signal from the power management unit, which decides which system power-saving scenario is currently adopted. This conventional power gating strategy is useful for combinational circuits. However, these gating devices ruin the static noise margin of the storage elements in sequential circuits. Many publications have proposed multiple threshold CMOS (MTCMOS) as a power gating device solution [2 - 3]. However, using higher threshold voltage transistors as the power gating devices requires a larger silicon area for the power gating devices to be capable of sinking the maximum instantaneous current in active mode. Therefore, using single threshold voltage transistors as the power gating devices, or adaptively adjusting the well bias of the power gating devices, is desirable to reduce the silicon area occupied by power gating devices.

Proceedings of the 2005 IEEE International Workshop on Memory Technology, Design, and Testing (MTDT’05) 1087-4852/05 $20.00 © 2005 IEEE

Fig. 1 (a) Conventional NMOS footer array power gating devices (b) Conventional PMOS header array power gating devices

Fig. 2 Standard 6-T SRAM structure and its leakage paths (labels in the figure: storage nodes VL and VR, transistors M1 and M2, leakage currents I0, I1, I2, and I4)

2.2. Data-Retention Power Gating Devices

Robustness of stored data during standby mode is indispensable no matter which low-power techniques are used. An index of memory cell stability is the static noise margin (SNM), which indicates the noise immunity of memory cells in the presence of noise sources. Reducing the cell supply voltage while the cells are not accessed is the most intuitive way to suppress leakage current. However, a critical voltage level called the data retention voltage (DRV) [4] must be defined before lowering the supply voltage. As shown in Fig. 2, the minimum supply voltage to retain stored data must be higher than the DRV. From the circuit shown in Fig. 2, the data retention voltage can be defined as

$$\frac{\partial V_L}{\partial V_R} = \frac{\partial V_R}{\partial V_L}, \quad \text{where } V_{DD} = DRV \qquad (1)$$

The leakage current can be estimated once the DRV is defined, and the cell leakage power can be defined as

$$P_{leak} = DRV \cdot I_{leak} \qquad (2)$$

The DRV is the critical voltage level at which memory cells still retain data. A guard band voltage should be added to the DRV to tolerate PVT variations. The practical supply voltage might vary by 10% of its nominal value. Another 10% of the supply voltage should be added to the DRV to prevent other on-chip noise, such as coupling noise, from destroying the stored data. Therefore, the appropriate DRV should be the DRV defined in (1) plus 10% of the nominal supply voltage. We call this voltage level the practical data retention voltage (DRVP).

Fig. 3 shows the relationship between the effective supply voltage (the voltage between VDD and virtual ground) and the static noise margin. Fig. 4 shows the relationship between the effective supply voltage and the leakage current of memory cells. These two figures indicate that maintaining a high static noise margin results in higher leakage current. Therefore, we can minimize the leakage currents of memory circuits by maintaining only the minimum required static noise margin.

Although reducing the supply voltage reduces the cell leakage currents I0 and I2 shown in Fig. 2, the bitline leakage currents I1 and I4 are not suppressed. One way to suppress bitline leakage and cell leakage at the same time is to lift the ground voltage instead of reducing the supply voltage. However, lifting the ground voltage also requires an on-chip DC-DC converter [5]. The relative voltage levels can be visualized as shown in Fig. 5. GNDV is the voltage between virtual ground and real ground.

Applying conventional power gating devices directly to memory cells will destroy the stored data. Conventional power gating devices work well with combinational circuits, but storage elements need a new set of power gating devices, called data-retention power gating devices, to maintain the static noise margin during standby mode. As shown in Fig. 5, the data-retention power gating devices are inserted in parallel with the regular power gating devices to protect the stored data during standby mode. The data-retention power gating devices can also be turned off to further reduce the cell leakage at the price of losing the stored data. From Table 1, the SNM is about 0 mV when the power gating devices are turned off, which means that the data stored in the SRAM can no longer be guaranteed to be correct. However, a 24X leakage current reduction is achieved through the use of conventional power gating devices during standby mode.

A set of data-retention power gating devices is shown in Fig. 6. The data-retention power gating devices never truly turn off. Fig. 6(a) is a small NMOS transistor with its gate biased to a specific voltage. Because this device keeps a current path between virtual ground and real ground, the static noise margin is as good as without any power gating devices. However, the leakage current reduction achieved by this data-retention power gating device is very small. Fig. 6(b) connects an NMOS transistor's gate and drain together to form a voltage-controlled resistor. In active mode operation, the virtual ground line is nearly at the same potential as real ground.
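As a rough numerical illustration of the guard-band rule for DRVP and of (2), the following Python sketch computes the practical data retention voltage, a corresponding virtual ground level, and the cell leakage power. The DRV and leakage current values are hypothetical placeholders, not results from this paper. The relation GNDV = VDD - DRVP is an assumption that the effective supply voltage is held exactly at DRVP when the ground is lifted, and evaluating (2) at DRVP rather than at DRV is likewise an assumption.

```python
def practical_drv(drv, vdd_nominal, guard_fraction=0.10):
    """DRVP: the DRV from (1) plus a guard band of 10% of nominal VDD."""
    return drv + guard_fraction * vdd_nominal


def virtual_ground_level(vdd_nominal, drvp):
    """Assumed GNDV that leaves an effective supply of exactly DRVP."""
    return vdd_nominal - drvp


def leakage_power(retention_voltage, i_leak):
    """Cell leakage power following (2): P_leak = DRV * I_leak."""
    return retention_voltage * i_leak


# Hypothetical inputs, for illustration only.
VDD = 1.0        # nominal supply voltage (V)
DRV = 0.25       # assumed data retention voltage (V)
I_LEAK = 200e-9  # assumed leakage current at the retention voltage (A)

drvp = practical_drv(DRV, VDD)
gndv = virtual_ground_level(VDD, drvp)
# Assumption: (2) is evaluated at DRVP, the voltage actually applied.
p_leak = leakage_power(drvp, I_LEAK)
print(f"DRVP = {drvp:.2f} V, GNDV = {gndv:.2f} V, P_leak = {p_leak * 1e9:.1f} nW")
```

In an actual design the DRV would come from simulation of (1) across PVT corners, and the leakage current from curves such as those in Fig. 4.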


Fig. 3 Supply voltage vs static noise margin in 100nm CMOS technology (x-axis: supply voltage, 0.0 V to 1.2 V; y-axis: static noise margin in mV; curves at 0C, 25C, 75C, and 125C)

Fig. 4 Supply voltage vs leakage current in 100nm CMOS technology (x-axis: supply voltage, 0.0 V to 1.2 V; y-axis: leakage current in nA, logarithmic scale; curves at 0C, 25C, 75C, and 125C)

Fig. 5 Regular power gating devices with data retention power gating device to maintain static noise margin of storage elements in standby mode

Fig. 6 Three types of data retention power gating devices (a) Type I: NMOS transistor with gate biased to a specific voltage (VDD in this case) (b) Type II: NMOS transistor with its gate connected to drain, forming a voltage-controlled resistor (c) Type III: PMOS transistor with its gate connected to real ground, forming a voltage-controlled resistor

After the regular power gating devices are turned off, the virtual ground line floats and the leakage current begins to charge up the virtual ground line. At first the data-retention power gating device acts as a highly resistive resistor. As the potential of the virtual ground line rises, the effective resistance between virtual ground and real ground becomes smaller, which prevents the potential of the virtual ground line from rising further. Finally, the potential reaches a steady equilibrium. Due to the reduced potential difference between VDD and virtual ground, the static noise margin is degraded but remains within an acceptable range. The leakage current is reduced, but not as much as with conventional power gating devices alone. As shown in Table 1, the Type II NMOS resistor data-retention power gating device can cut the leakage current in half compared to the intrinsic leakage current. The Type III PMOS resistor data-retention power gating device shown in Fig. 6(c) is the PMOS counterpart of Type II. If the device sizes of the Type II NMOS resistor and the Type III PMOS resistor are the same, Type III suppresses leakage current better due to its higher effective resistance.
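To make the trade-off between leakage reduction and static noise margin concrete, the following Python sketch computes the standby power reduction factor and the fraction of SNM retained for each gating style. The numbers are transcribed from Table 1 (100nm CMOS, Vdd = 1 V); the function and dictionary names are illustrative only. Ratios computed from the rounded table entries may differ slightly from the factors quoted in the text.

```python
# Values transcribed from Table 1 (SNM in mV, standby power in nW).
TABLE_1 = {
    "No P.G.":               {"snm_mv": 340, "standby_nw": 561},
    "Conv. P.G.":            {"snm_mv": 0,   "standby_nw": 22},  # SNM listed as ~0
    "Type I (small NMOS)":   {"snm_mv": 308, "standby_nw": 558},
    "Type II (N-resistor)":  {"snm_mv": 195, "standby_nw": 280},
    "Type III (P-resistor)": {"snm_mv": 163, "standby_nw": 220},
}


def reduction_factor(style, baseline="No P.G."):
    """Standby power of the baseline divided by that of the given style."""
    return TABLE_1[baseline]["standby_nw"] / TABLE_1[style]["standby_nw"]


def snm_retained(style, baseline="No P.G."):
    """Fraction of the baseline static noise margin that the style keeps."""
    return TABLE_1[style]["snm_mv"] / TABLE_1[baseline]["snm_mv"]


for style in TABLE_1:
    print(f"{style:22s} {reduction_factor(style):6.2f}X  "
          f"SNM retained: {snm_retained(style):.0%}")
```

The output shows the pattern described above: the styles that keep more of the static noise margin give up more of the leakage reduction.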

Table 1 Static noise margin and standby power consumption comparison among different kinds of power gating devices in 100nm CMOS technology @ Vdd=1V

Gating Style         No P.G.   Conv. P.G.   Type I: Small NMOS   Type II: N-resistor   Type III: P-resistor
SNM (mV)             340       ~0           308                  195                   163
Standby Power (nW)   561       22           558                  280                   220

To decide the size of the data-retention power gating devices, the DRV must be defined first. Once the DRV is chosen, the total cell and bitline leakage can be either simulated or measured. Therefore, the voltage level of the virtual ground shown in Fig. 5 and the current sunk through the data-retention power gating devices in data-retention mode are known. The size of the data-retention power gating devices can be determined from these two parameters. Although Type III (PMOS resistor) power gating devices may introduce body effect, we can compensate for this effect by gate sizing. Therefore, the most important step before sizing the data-retention power gating devices is to define a good data retention voltage. The practical data retention voltage must withstand all PVT variations and guarantee the correctness of the stored data. This method lifts the voltage on the virtual ground line to around one threshold voltage. Therefore, we can stack multiple diode-connected MOSFETs to lift the voltage on the virtual ground to the required level. The effects of the effective supply voltage on static noise margin and leakage current are shown in Fig. 3 and Fig. 4. By using this method, we can reduce the leakage during standby mode without using DC-DC converters.

Fig. 7 Concurrent data-retention and intermediate power gating structure

3. Multi-Mode Power Gating Devices

From the description of data-retention power gating devices in the previous section, we noticed that the ability to suppress leakage current differs by an order of magnitude between regular and data-retention power gating devices. Therefore, once the data-retention power gating devices are inserted between virtual ground and real ground, the leakage current may still be unacceptable on some occasions. We modified the Type III data-retention power gating devices so that their gate terminals are controllable by the power management unit. The detailed connection of this concurrent cut-off and data-retention power gating structure is shown in Fig. 7. Four different modes are available in this structure.

I. Active mode: The Ctrl1 signal in Fig. 7 is high and Ctrl2 is low. Both the regular and data-retention power gating devices are on, which supports full-speed operation of the internal circuits.

II. Standby/cut-off mode: The Ctrl1 signal in Fig. 7 is low and Ctrl2 is high. Since both the regular and data-retention power gating devices are off, 20X leakage reduction is achievable in this configuration. The data stored in the storage elements will be destroyed, but data loss is permitted in this mode of operation.

III. Data-retention mode: The Ctrl1 signal in Fig. 7 is low and Ctrl2 is also low. In this mode of operation, the regular power gating devices are turned off but the data-retention ones are not. Therefore, the data stored in the storage elements survive during standby mode, but the leakage reduction is not as large as in cut-off mode.

IV. Intermediate mode: The Ctrl1 signal in Fig. 7 is high and Ctrl2 is also high. This is an intermediate mode used during transitions among the previous three modes. During mode transitions, ground bounce is inevitable. Therefore, the intermediate mode is required to reduce ground bounce.

4. Column and Row Co-Controlled Data-Retention Power Gated SRAM Architecture

The row-controlled scheme uses the row decoder to control the gating transistors [6]. In this scheme, all the SRAM cells on the same wordline (row) share a common gating transistor and are powered on when the row is selected, while the other, unselected rows are power gated. Notice that no diode transistor limiting the rise of the voltage at the virtual GND node is included. In contrast to the row-controlled scheme, the column-controlled scheme controls the gating devices with the column decoder [7]. All the cells on the same bitline share a common gating device, and only the cells on the selected bitlines are powered on. The virtual GND node is a large capacitive node whose capacitance depends on the number of wordlines. All the cells on the selected bitline are powered on, but only one of them is activated by the wordline in the column-controlled scheme. Consequently, less power reduction is obtained because of the unused cells that are powered on.

In this section, the proposed column and row co-controlled power-gated SRAM architecture is presented. Fig. 8 shows the schematic diagram of the proposed column and row co-controlled SRAM scheme. In contrast to the previous two schemes, this scheme controls the gating devices with signals from both the row and column decoders. The cells on the same wordline are divided into blocks, and the block size depends on the size of the I/O ports. Fig. 8 depicts an example with 16-bit I/O, in which 16 cells form a block and each block has a dedicated gating device. Note that the global wordline signals from the row decoder are not connected directly to the cells; instead, they are fed into AND gates. The AND gates are controlled by signals from the row and column decoders and generate the control signals that serve as local wordlines and as control signals for the gating devices. In this scheme, a block is powered on only when both its wordline and its column selection signal (sel0, sel1, and so on) are pulled high.
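The gating logic described above can be sketched behaviorally in Python. This is a minimal model under stated assumptions, not the authors' circuit: the array dimensions, the function names, and the simple cell counts per scheme are hypothetical and only mirror the prose description (AND of a global wordline and a column select enables one block).

```python
def local_enables(global_wordlines, column_selects):
    """AND each global wordline with each column-select signal.

    Returns enables[row][block]; a True entry drives both the local
    wordline and the gating device of that block.
    """
    return [[bool(wl and sel) for sel in column_selects]
            for wl in global_wordlines]


def powered_cells(scheme, num_rows, wordline_bits, block_bits):
    """Cells powered on during one access under each gating scheme."""
    if scheme == "row":         # the whole selected row is powered on
        return wordline_bits
    if scheme == "column":      # the selected bitlines in every row
        return num_rows * block_bits
    if scheme == "column_row":  # only the selected block
        return block_bits
    raise ValueError(f"unknown scheme: {scheme}")


# Hypothetical array: 4 rows, 64-bit wordline, 16-bit I/O (4 blocks per row).
rows, wordline_bits, block_bits = 4, 64, 16
wordlines = [False, False, True, False]  # row decoder selects row 2
selects = [False, True, False, False]    # column decoder asserts sel1

enables = local_enables(wordlines, selects)
active = [(r, b) for r, row in enumerate(enables)
          for b, on in enumerate(row) if on]
print("powered blocks (row, block):", active)
for scheme in ("row", "column", "column_row"):
    print(scheme, powered_cells(scheme, rows, wordline_bits, block_bits))
```

With both decoders contributing, exactly one block is enabled per access, which is why the number of powered cells equals the block size rather than the wordline length or the column height.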
The reason for this scheme is that for an n-bit I/O SRAM core, only n bits of data are read from or written into the SRAM in each operation. That is why the block size depends on the size of the I/O. In this subsection, the speed, power, and area overhead of the proposed scheme are discussed. The simulations and physical layouts are implemented in TSMC 0.13um CMOS technology. Fig. 9 compares the cell standby power of the above three schemes for 32-bit wordlines. The conventional scheme consumes the most cell standby power since no gating device is adopted. The row-controlled and proposed schemes have almost the same cell standby power consumption. These two schemes are equivalent in standby mode, and about 60% cell standby power reduction is achieved.


Fig. 8 Schematic of the proposed column/row co-controlled SRAM scheme

Fig. 9 Cell standby power. (SRAM cell standby power in nW for the conventional scheme, the row-controlled scheme, and this work; annotated reduction: 59.8%)

Fig. 10 Cell active power. (Normalized cell active power vs wordline length in bits for the conventional scheme, the row-controlled scheme, and this work with 8-bit, 16-bit, and 32-bit blocks; annotation: column/row scheme degenerates to row-controlled scheme)

Fig. 11 Active power saving. (Cell active power saving in % vs wordline length in bits for 8-bit, 16-bit, and 32-bit blocks; annotation: column/row scheme degenerates to row-controlled scheme)

Fig. 12 Power-delay product. (Normalized power-delay product: conventional 1.00, row-controlled 0.98, this work 0.51, 0.25, and 0.07 for 32-bit, 16-bit, and 8-bit blocks, respectively)

Fig. 10 depicts the cell active power versus wordline length for various block sizes. The figure reveals that the cell active power consumption of the conventional and row-controlled schemes is proportional to the wordline length, since both of them power on all the cells on the wordline for each operation.

As for the proposed scheme, however, the active power is almost constant for a fixed block size, regardless of the wordline length. This is because only one block of cells is activated at a time, no matter how long the wordline is. Fig. 11 shows the active power savings of the proposed scheme relative to the row-controlled scheme for various wordline lengths and block sizes. It shows that more active power is saved when the block size is smaller and the wordline is longer. For the situations in which the wordline length is twice the block size, the power savings are expected to be about 50%. From Fig. 11, however, all the power savings are slightly larger than the expected values. This is because the proposed scheme not only powers on fewer cells per operation, but also turns on a smaller gating device than the row-controlled scheme.

Fig. 12 shows the normalized power-delay products for 64-bit wordlines. From the graph, it is observed that the proposed scheme achieves significant reductions in power-delay product. A reduction of 93% is obtained for the 8-bit block scheme, 75% for the 16-bit block scheme, and 49% for the 32-bit block scheme. Although the proposed scheme introduces an extra AND gate delay, the power-delay products demonstrate that the performance degradation is insignificant.

Fig. 13 Layout and device allocation.

Fig. 13 shows the layout and the allocation of gating devices. Two gating devices are inserted between two adjacent blocks, and each gating device is in charge of one block of cells. No significant IR drops occur on the virtual GND lines, since each gating device is closely adjacent to its corresponding block and the virtual GND lines of the blocks are of equal length. The layout area of the proposed scheme with 8-bit blocks is about 20.7% larger than that of the conventional scheme. The area increases for 16-bit and 32-bit blocks are about 12.1% and 8.1%, respectively. Using smaller block sizes achieves more power reduction but induces larger area overhead.
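The expected savings mentioned above follow from a first-order model in which active power scales with the number of cells powered on, so the ideal saving relative to the row-controlled scheme is 1 - (block size / wordline length). The Python sketch below compares that ideal value with the figures reported in this paper for a 64-bit wordline, alongside the reported power-delay product reductions and area overheads. The proportionality model and all names are assumptions made for illustration.

```python
# Reported values for this work, keyed by block size in bits: active power
# saving for a 64-bit wordline (from the abstract), power-delay product
# reduction for 64-bit wordlines, and layout area overhead (from the text).
REPORTED = {
    32: {"saving": 0.59, "pdp_reduction": 0.49, "area_overhead": 0.081},
    16: {"saving": 0.79, "pdp_reduction": 0.75, "area_overhead": 0.121},
    8:  {"saving": 0.94, "pdp_reduction": 0.93, "area_overhead": 0.207},
}


def ideal_saving(wordline_bits, block_bits):
    """First-order saving versus the row-controlled scheme, assuming
    active power is proportional to the number of cells powered on."""
    return 1.0 - block_bits / wordline_bits


WORDLINE_BITS = 64
for block_bits, reported in REPORTED.items():
    ideal = ideal_saving(WORDLINE_BITS, block_bits)
    print(f"{block_bits:2d}-bit block: ideal {ideal:.1%}, "
          f"reported {reported['saving']:.0%}, "
          f"PDP reduction {reported['pdp_reduction']:.0%}, "
          f"area overhead {reported['area_overhead']:.1%}")
```

In each case the reported saving exceeds the ideal cell-count estimate, consistent with the explanation that a smaller gating device is also switched, while the area overhead grows as the block size shrinks.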

5. Conclusions

Three types of single-mode data-retention power gating configurations are introduced in this paper. The relation between static noise margin and leakage current is demonstrated. It was found that the Type III P-resistor is the most area-efficient for data retention and that its operation modes are easy to switch. A concurrent multi-mode data-retention and intermediate power gating structure has been implemented. The advantages of the new power gating structure with four different operation modes (data-retention, intermediate, active, and cut-off) are demonstrated. It shows that 2X to 20X leakage reduction is achieved and that static noise margin is maintained when needed. The area overheads of the data-retention and regular power gating devices are less than 1% and 10%, respectively.

The design issues of power-gated SRAM are discussed and some prior designs are presented in this paper. Moreover, a column and row co-controlled power-gated SRAM architecture that achieves both active and standby power reductions is proposed. The column and row co-controlled scheme divides the SRAM cells on the same wordline into blocks, and one gating device is in charge of one block. The cell active power of this scheme is much smaller than that of the other two schemes compared. Simulation results also show that this scheme achieves significant reductions in power-delay product. This demonstrates the effectiveness of the proposed scheme in reducing active power consumption with insignificant performance degradation. The simulations and physical layout are implemented in TSMC 0.13um CMOS technology. The new scheme has a larger layout area because of the extra AND gates and gating devices. The area increases are about 20.7%, 12.1%, and 8.1% for 8-bit, 16-bit, and 32-bit block sizes, respectively.

6. Acknowledgments

This work is supported by the National Science Council, R.O.C., under projects NSC 92-2220-E-009-011 and NSC 93-2220-E-009-024, and by a TSMC grant. This work is also supported by DOEIT 94-EC-17-A-01-S1-034. The authors would like to thank the SOC Research Center at NCTU for its support of this research.

7. References

[1] T. Sakurai, “Perspectives on Power-Aware Electronics,” in International Solid-State Circuits Conference Dig. Tech. Papers, Feb. 2003, pp. 26-29.
[2] B. H. Calhoun, F. A. Honore, and A. P. Chandrakasan, “A Leakage Reduction Methodology for Distributed MTCMOS,” IEEE Journal of Solid-State Circuits, vol. 39, issue 5, pp. 818-826, May 2004.
[3] M. Anis, S. Areibi, and M. Elmasry, “Design and Optimization of Multithreshold CMOS (MTCMOS) Circuits,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 22, issue 10, pp. 1324-1342, Oct. 2003.
[4] Huifang Qin, Yu Cao, D. Markovic, A. Vladimirescu, and J. Rabaey, “SRAM leakage suppression by minimizing standby supply voltage,” in 5th International Symposium on Quality Electronic Design, March 2004, pp. 55-60.
[5] Kyeong-Sik Min, K. Kanda, and T. Sakurai, “Row-by-Row Dynamic Source-Line Voltage Control (RRDSV) scheme for Two Orders of Magnitude leakage current reduction of sub-1V VDD SRAM’s,” in Proceedings of the 2003 International Symposium on Low Power Electronics and Design, Aug. 2003, pp. 66-71.
[6] A. Agarwal, H. Li, and K. Roy, “A Single-Vt Low-Leakage Gated-Ground Cache for Deep Submicron,” IEEE Journal of Solid-State Circuits, vol. 38, pp. 319-328, Feb. 2003.
[7] K. Nii, Y. Tsukamoto, T. Yoshizawa, S. Imaoka, and H. Makino, “A 90nm Dual-Port SRAM with 2.04um2 8T-Thin Cell Using Dynamically-Controlled Column Bias Scheme,” in International Solid-State Circuits Conference Dig. Tech. Papers, Feb. 2004, pp. 508-509.
