9th International Symposium on Quality Electronic Design
Error-Tolerant SRAM Design for Ultra-Low Power Standby Operation

Huifang Qin, Animesh Kumar, Kannan Ramchandran, Jan Rabaey
EECS, University of California Berkeley, CA 94720, USA
{huifangq, animesh, kannanr, jan}@eecs.berkeley.edu

Prakash Ishwar
ECE, Boston University, Boston, MA 02215, USA
[email protected]

Abstract

We present an error-tolerant SRAM design optimized for ultra-low standby power. Using SRAM cell optimization techniques, the maximum data retention voltage (DRV) of a 90nm 26kb SRAM module is reduced from 550mV to 220mV. A novel error-tolerant architecture further reduces the minimum static-error-free VDD to 155mV. With a 100mV noise margin, a 255mV standby VDD effectively reduces the SRAM leakage power by 98% compared to the typical standby at 1V VDD.
1. Introduction

As technology scales and the size of on-chip memory increases, SRAM leakage becomes a dominant power component, especially for mobile applications. To reduce leakage power while retaining the memory data, a low standby SRAM VDD is often applied during sleep [1]. Previous work has focused on sleep-control circuitry that creates a finely programmable standby supply voltage [2], or that adaptively generates sleep pulses to optimize the tradeoff between leakage power reduction and dynamic power overhead [3]. But choosing a standby VDD that achieves both reliable data retention and optimal leakage saving requires knowledge of the SRAM data retention voltage (DRV) [4]. The DRV is the minimum VDD at which an SRAM cell can still maintain its state; it depends on process variations and varies from die to die. Based on an in-depth understanding of SRAM data-retention behavior, we present a design that produces a predictable standby VDD below 300mV by combining DRV-aware SRAM cell optimization with an error-tolerant architecture using Error Control Coding (ECC). Motivated by ultra-low power applications, our design goal is maximum standby power saving with reliable data retention, without degrading operation speed or read-write stability. The design is validated with a 90nm 26kb SRAM chip, for which a 98% leakage power reduction is measured.

2. DRV review

In a standard SRAM cell (Fig. 1), the voltage transfer curves (VTCs) of the cell inverters degrade as VDD scales down. When VDD equals the DRV, the static noise margin (SNM) of the cell becomes zero, as illustrated in Fig. 2. Using the notation of Fig. 1, this condition is given by:
\left.\frac{\partial V_1}{\partial V_2}\right|_{\text{VTC1}} = \left.\frac{\partial V_1}{\partial V_2}\right|_{\text{VTC2}}, \quad \text{when } V_{DD} = \text{DRV} \qquad (1)

Figure 1. Standard 6T SRAM cell structure (cross-coupled inverters M1-M4 with access transistors M5, M6; storage nodes V1, V2; leakage paths indicated).

Figure 2. SNM degrades to zero at DRV (simulated VTCs of the left and right cell inverters; worst-case mismatch is assumed in this simulated VTC).
If VDD is reduced below the DRV, the SRAM cell loses its capability to retain the stored data. By solving the sub-Vth VTC equations of the cell inverters, the DRV can be determined as a function of both design and process parameters [4]. Theoretical analysis shows that the lower limit of the 6T SRAM cell DRV is 36mV, assuming perfect symmetry in the cell design and an ideal CMOS technology (i.e., a sub-Vth swing of 60mV/dec). The DRV increases with transistor mismatch in an SRAM cell. Previous data from a 90nm test chip showed that the DRV of 4K standard SRAM cells ranges from 90mV to 220mV [5].
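To make condition (1) concrete, here is a minimal numeric sketch that models both cell inverters with simple sub-threshold VTC equations, sweeps VDD downward, and reports the voltage at which the cross-coupled pair stops being bistable, i.e., where the SNM of Fig. 2 collapses to zero. All device parameters below (threshold voltages, swing factor, mismatch) are illustrative assumptions, not the 90nm models of [4]-[5].

```python
# Minimal sketch of the DRV condition in (1): sweep VDD downward and find
# where the cross-coupled inverter pair loses bistability, i.e. where the
# SNM of Fig. 2 collapses to zero. Every parameter is an illustrative
# assumption, not a calibrated 90nm device model.

import numpy as np

VT = 0.026   # thermal voltage kT/q at room temperature [V]

def inverter_vtc(vin, vdd, vtn, vtp, nf=1.5, i0n=1.0, i0p=0.5):
    """Sub-Vth inverter: solve I_NMOS(vin, vout) = I_PMOS(vin, vout) for vout."""
    i_n = lambda vo: i0n * np.exp((vin - vtn) / (nf * VT)) * (1 - np.exp(-vo / VT))
    i_p = lambda vo: i0p * np.exp((vdd - vin - vtp) / (nf * VT)) * (1 - np.exp(-(vdd - vo) / VT))
    lo, hi = 0.0, vdd
    for _ in range(60):            # bisection: i_n - i_p is monotonic in vout
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if i_n(mid) > i_p(mid) else (mid, hi)
    return 0.5 * (lo + hi)

def bistable(vdd, dvt=0.02):
    """True if the loop V1 -> V2 -> V1 has three fixed points (two stable)."""
    v1 = np.linspace(0.0, vdd, 400)
    # dvt models NMOS threshold mismatch between the two inverters
    v2 = np.array([inverter_vtc(v, vdd, 0.30 + dvt, 0.30) for v in v1])
    back = np.array([inverter_vtc(v, vdd, 0.30 - dvt, 0.30) for v in v2])
    return np.sum(np.diff(np.sign(back - v1)) != 0) >= 3

# Sweep VDD downward in 5mV steps; the DRV is where bistability is lost.
for vdd in np.arange(0.300, 0.020, -0.005):
    if not bistable(vdd):
        print(f"estimated DRV ~ {(vdd + 0.005)*1e3:.0f} mV")
        break
```

With the swing factor set to the ideal nf = 1 and no mismatch, this toy model approaches the 36mV theoretical limit quoted above; larger nf and mismatch push the estimate up, mirroring the measured 90-220mV spread.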
Figure 3. DRV-aware SRAM cell optimization improves the SNM (simulated V1-V2 VTCs: standard cell at VDD = DRV, then with reduced pass-transistor leakage, improved standby P/N ratio, and reduced process variation).

3. DRV-aware SRAM cell optimization
Since the DRV is a function of the SRAM circuit parameters [4], design optimization can be used to reduce it for ultra-low-voltage standby operation. Traditionally, a standard SRAM cell is designed for high-speed operation: the large NMOS pull-down devices and small PMOS pull-up devices reduce data access time, but degrade data-retention reliability at low voltage because the strength ratio between the pull-up and pull-down leakage paths is unbalanced. To gain a larger SNM and lower the DRV, the PMOS-to-NMOS (P/N) strength ratio needs to be improved during standby. Next, since the DRV increases with transistor mismatch, reducing process variations by using a larger channel length (L) helps lower the DRV. Another mismatch factor is the leakage through the pass transistor, which connects the node storing a '0' to the bitline held at VDD. Therefore, the following methods can be used to improve the DRV of an SRAM cell design:

1) Use a balanced P/N strength ratio during standby.
2) Reduce process variation with a larger L.
3) Suppress access-transistor leakage during standby.

The impact of these techniques on the SRAM cell VTC is shown in Fig. 3. In a practical memory design, the standby P/N strength ratio and the pass-transistor leakage can be controlled with adjustable body bias during standby, avoiding any impact on read and write operations. Experiments on PMOS and NMOS body-bias (VPB, VNB) control were implemented in a 90nm SRAM test chip. Measurement data showed that reverse-biasing the NMOS devices of a standard SRAM cell reduces the DRV, owing to a more balanced standby P/N ratio and lower pass-transistor leakage. At a fixed W/L ratio, a larger L reduces process variation and the worst-case DRV [5]. To avoid any impact on active performance and power, the W/L ratios of the transistors in an SRAM cell are left unchanged in a DRV-aware optimization. Simulation analysis showed that by using adjustable body-bias control during active operation, a 10% improvement in read speed and up to a 100mV enhancement in read and write noise margins can be achieved [5]. This design methodology therefore improves both active and standby SRAM operation; the only significant tradeoff is the area penalty of the larger L.
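As a rough illustration of the body-bias knob used above, the following sketch applies the textbook body-effect relation to estimate how a reverse source-body bias raises Vt and suppresses sub-threshold leakage. The body factor gamma, Fermi potential phi_f, and sub-Vth swing S are generic assumed values, not the paper's 90nm parameters.

```python
# Back-of-envelope sketch: how a reverse body bias shifts Vt (body effect)
# and scales sub-threshold leakage. gamma, phi_f, and S are assumed
# textbook values, not 90nm measurements.
import math

gamma, phi_f, S = 0.20, 0.35, 0.090   # body factor [V^0.5], Fermi potential [V], swing [V/dec]

def vt_shift(vsb):
    """Threshold-voltage increase from a source-body reverse bias vsb."""
    return gamma * (math.sqrt(2 * phi_f + vsb) - math.sqrt(2 * phi_f))

for vsb in (0.0, 0.2, 0.4):
    dvt = vt_shift(vsb)
    leak_ratio = 10 ** (-dvt / S)     # sub-Vth current scales as 10^(-dVt/S)
    print(f"VSB = {vsb:.1f} V: dVt = {dvt*1e3:.0f} mV, leakage x{leak_ratio:.2f}")
```

In this model, reverse-biasing the NMOS pull-down and access devices weakens them relative to the PMOS pull-ups, which is exactly the balanced-P/N, reduced-pass-leakage effect exploited in techniques 1) and 3).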
4. Minimize leakage power with ECC

The circuit optimization reduces transistor mismatch in an SRAM cell and improves the DRV, but large DRV variation remains among the cells of an optimized design. For example, Fig. 4 plots the histogram of 4K-bit DRV values measured from a 90nm SRAM test chip with DRV-aware cell optimization [5]: the worst-case DRV (DRVmax) is 190mV, yet 90% of the SRAM cells have a DRV below 130mV. And while 4K bits is a small SRAM, the DRV variation grows for larger memories. With a traditional worst-case design method, a standby VDD higher than DRVmax is required in a dual-voltage low-leakage memory design. This approach ensures reliable data retention for every SRAM cell, but yields sub-optimal leakage saving because the standby VDD is significantly higher than the DRV of most SRAM cells. In order to improve the leakage saving, we propose an aggressive voltage reduction scheme in which the SRAM VDD is reduced below DRVmax during standby mode. After the standby period, VDD is raised to 1V for active operation, and the memory retention errors are corrected with ECC and row redundancy to ensure reliable data storage. The lower standby VDD of this approach yields less leakage power, but induces a higher data-retention error rate and adds error-correction power overhead. Therefore, in order to achieve the maximum standby power saving in this
aggressive voltage reduction scheme, a power-per-useful-bit cost function (power per bit) is optimized by choosing an appropriate standby VDD. Assume that the SRAM is composed of many n-length SRAM cell blocks, each storing k information bits (n > k). The SRAM power per bit is modeled as

P(V_{DD}) = \frac{n\,G\,V_{DD}^2}{k} + \frac{E_C}{k\,T_S}, \qquad (2)

where VDD is the standby supply voltage, EC is the total energy consumed in error correction, TS is the memory standby time, and G is a constant extracted from the 90nm test-chip leakage measurement data [5]. In the following analysis, we assume that the DRV values are independently sampled from the empirical distribution of Fig. 4. This assumption is justified by the low spatial correlation observed in the test-chip DRV data [5]. For any ECC, we select the standby VDD such that all data-retention errors can either be decoded correctly by the ECC, or repaired by a fixed number of redundant rows if decoding fails [6]. First we discuss the case TS → ∞, i.e., the EC contribution vanishes. In this case, for large n and negligible decoding-failure probability, it has been shown that [7]

\frac{G\,V_{DD}^2}{1 - h\!\left(p(V_{DD})/2\right)} \;\le\; P(V_{DD}) \;\le\; \frac{G\,V_{DD}^2}{1 - h\!\left(2\,p(V_{DD})\right)}, \qquad (3)

where h(·) is the binary entropy function and p(VDD) is the bit retention-error probability at standby voltage VDD (the fraction of cells whose DRV exceeds VDD).
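To illustrate how (2) and (3) drive the choice of standby VDD, the sketch below draws a synthetic DRV population shaped to resemble Fig. 4 (90% of cells below ~130mV, worst case 190mV), finds the lowest VDD at which a single-error-correcting block code plus a 1% redundant-row budget covers all retention errors, and evaluates the power per bit. The constants G, EC, TS and the (63,57) Hamming block are placeholders, not the paper's measured data or final code choice.

```python
# Minimal numeric sketch of the standby-VDD selection and Eqs. (2)-(3).
# The DRV distribution is synthetic (shaped to echo Fig. 4); G, E_C, T_S,
# the (63,57) Hamming block, and the 1% redundant-row budget are
# placeholder assumptions.

import numpy as np

rng = np.random.default_rng(0)
drv = np.clip(rng.normal(0.105, 0.015, 4096), 0.060, 0.190)  # assumed DRVs [V]

G, E_C, T_S = 1.0, 1e-3, 1.0   # leakage constant, ECC energy, standby time
n, k = 63, 57                  # example Hamming block, corrects t = 1 error
max_fail = 0.01                # block-failure budget covered by redundant rows

def h(p):
    """Binary entropy of p in bits, with h(0) = 0 by continuity."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def block_fail(p):
    """P(more than one error in an n-bit block) at bit-error rate p."""
    return 1 - (1 - p)**n - n * p * (1 - p)**(n - 1)

# Lowest VDD at which ECC decoding plus redundant rows cover all errors:
for v in np.arange(0.10, 0.20, 0.001):
    p = (drv > v).mean()            # retention bit-error probability at v
    if block_fail(p) < max_fail:
        ppb = (n / k) * G * v**2 + E_C / (k * T_S)       # Eq. (2)
        lo  = G * v**2 / (1 - h(p / 2))                  # Eq. (3), lower
        hi  = G * v**2 / (1 - h(2 * p))                  # Eq. (3), upper
        print(f"min feasible standby VDD ~ {v*1e3:.0f} mV")
        print(f"power per bit (Eq. 2): {ppb:.4f}  vs  "
              f"worst-case design at 190 mV: {G * 0.19**2:.4f}")
        print(f"ideal-code range (Eq. 3): [{lo:.4f}, {hi:.4f}]")
        break
```

Because leakage power scales with VDD squared, even a modest drop of the standby voltage below DRVmax yields a large saving, which is why the feasible-VDD search dominates the outcome in this toy setting.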
For the TS → ∞ case above, the upper and lower bounds in (3) predict a 44%-52% reduction in power per bit. These fundamental benchmarks assist in the ECC selection for a practical ultra-low leakage memory design. Next, in a practical design we need to account for EC, TS, and the decoding latency. Most SRAM applications have a stringent decoding-latency requirement of a few clock cycles. Among the codes with low decoding latency, we compare the performance of Hamming and Reed-Muller (RM) codes. With a system specification of 1% redundant rows, the optimum Hamming and RM codes achieve a similar power-per-bit reduction (36%-37%) with similar parity-bit overhead (