A Portless SRAM Cell Using Stunted Wordline Drivers

Michael Wieckowski
Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI
[email protected]

Martin Margala
Electrical and Computer Engineering Department, University of Massachusetts Lowell, Lowell, MA
[email protected]

Abstract—A minimum area portless SRAM cell is presented along with a stunted wordline driver. When compared to an iso-area 6T cell using logic design rules, the portless cell exhibits 22% higher static noise margin and 14% lower leakage with a 51% penalty in the on/off cell current ratio. Measurements from a fabricated test chip in 0.5 µm CMOS demonstrate functionality of the proposed cell and driver in an array, and for the first time, validate the portless concept in an isolated test cell.
I. INTRODUCTION

The recent focus on minimum-energy and low-voltage circuits, coupled with the increased variability inherent to the sub-100 nm technology generations, has led to renewed interest in the exploration of new SRAM bitcell structures. Fundamentally, these new bitcells must compete with the traditional 6T structure at nominal voltage and iso-stability by offering gains in area, leakage, and array efficiency, or at low voltage and iso-area by offering gains in stability and power dissipation. For example, at low voltage, various designs have been published recently using six or more transistors [1-3]. These cells focus on mitigating read/write contention with respect to stability at low voltage, where sensitivity to local mismatch from random dopant fluctuation and other within-die variation factors is most severe.

For nominal voltage applications, we recently proposed a new five-transistor (5T) cell, termed “portless SRAM”, that focused on reduced cell area through elimination of the pass transistor ports found in traditional 6T designs [4, 5]. A fifth transistor was employed to modulate the feed-forward gain path, enabling both read and write operations using a single wordline (or AXS) signal as shown in Figure 1.

In this work, a second version of the portless SRAM cell is presented that is designed to achieve absolute minimum cell area. In addition, a stunted wordline driver is presented to satisfy the signaling requirements of the new cell. The remainder of this paper is structured as follows: Section II presents the proposed minimum area cell, and Section III details various simulation and implementation specifics. Section IV presents the measured test chip results, and Section V offers some conclusions and directions for future work.
Figure 1 - The portless SRAM cell
II. A MINIMUM AREA PORTLESS SRAM CELL
It was demonstrated previously in [5] that the portless cell could be modeled in the small-signal domain as shown in Figure 2. In order to maintain stability when the AXS transistor is conducting, the positive feed-forward component (A5) must be designed such that the overall loop gain remains less than unity. This was accomplished by increasing the gate length of the AXS transistor, effectively weakening its ability to short the cell nodes and ensuring that data is not lost.
Figure 2 - Equivalent small-signal gain model of the portless cell
This work was conducted while the authors were both at the University of Rochester, Department of Electrical and Computer Engineering, Rochester, NY.
978-1-4244-1684-4/08/$25.00 ©2008 IEEE
Authorized licensed use limited to: University of Michigan Library. Downloaded on October 30, 2009 at 18:02 from IEEE Xplore. Restrictions apply.
While effective at ensuring stability, this increase in the gate length of the AXS transistor resulted in a larger overall cell area. In this work, an alternate technique is proposed to maintain cell stability in which the gate-overdrive voltage of the AXS transistor is reduced during the read cycle. This results in a minimum area cell since all of the transistors can be sized at the allowable process minimum. When the AXS transistor is sized at the process minimum, the applied gate-source voltage must be used to control the operating region during a read. As this voltage is increased, the feed-forward gain grows larger and the cell loop gain approaches unity. This allows one to control the state of the cell using two gate-source voltage levels, one for read operations and the other for write operations. Interestingly, a similar technique was presented by Terman in 1971 for driving the wordline transistors in the 6T methodology, but the design was never fully adopted [6]. More recently, the opposite approach has been utilized where the wordline voltage is boosted above Vdd [7]. In both cases, the overhead of the driving circuitry must be offset by the achievable gains in performance. For the proposed portless SRAM, the relationship between cell operation and the AXS gate-source voltage is shown in Figure 3 where the gate-source voltage is swept in simulation from ground to Vdd. Bitline currents, bitline voltages, and cell node voltages are shown as a function of this signal. One can see that there exists an AXS voltage near 1.3 V that generates a large bitline current differential but maintains the cell data, conditions ideal for the read operation. Similarly, at an AXS voltage of 1.6 V, the cell data is completely erased and new data can be readily written by driving the bitlines.
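The selection of read and write levels from a sweep such as the one in Figure 3 can be sketched in a few lines. This is only an illustration under stated assumptions: the `SweepPoint` record, the two helper functions, and the separation thresholds are hypothetical names introduced here, and the sample sweep uses made-up placeholder numbers shaped after the 1.3 V and 1.6 V points mentioned in the text, not values read from Figure 3.

```python
from typing import NamedTuple, Optional, Sequence


class SweepPoint(NamedTuple):
    """One point of an AXS gate-source voltage sweep (hypothetical record)."""
    v_axs: float      # AXS gate-source voltage (V)
    v_node_hi: float  # cell node holding the high value (V)
    v_node_lo: float  # cell node holding the low value (V)
    i_bl_diff: float  # bitline current differential (A)


def pick_read_voltage(sweep: Sequence[SweepPoint],
                      min_node_sep: float) -> Optional[float]:
    """Return the AXS voltage with the largest bitline current differential
    among the points whose cell nodes are still separated by min_node_sep."""
    stable = [p for p in sweep if p.v_node_hi - p.v_node_lo >= min_node_sep]
    if not stable:
        return None
    return max(stable, key=lambda p: p.i_bl_diff).v_axs


def first_write_voltage(sweep: Sequence[SweepPoint],
                        max_node_sep: float) -> Optional[float]:
    """Return the lowest AXS voltage at which the cell nodes have converged
    to within max_node_sep, i.e. the stored data has been erased."""
    for p in sorted(sweep, key=lambda p: p.v_axs):
        if p.v_node_hi - p.v_node_lo <= max_node_sep:
            return p.v_axs
    return None


# Placeholder sweep: illustrative numbers only, NOT measured or simulated data.
TOY_SWEEP = [
    SweepPoint(0.0, 1.8, 0.0, 0.0),
    SweepPoint(0.9, 1.8, 0.0, 5e-6),
    SweepPoint(1.3, 1.5, 0.3, 40e-6),
    SweepPoint(1.6, 0.9, 0.9, 0.0),
    SweepPoint(1.8, 0.9, 0.9, 0.0),
]

if __name__ == "__main__":
    print("read level :", pick_read_voltage(TOY_SWEEP, min_node_sep=1.0))
    print("write level:", first_write_voltage(TOY_SWEEP, max_node_sep=0.1))
```

With real sweep data, the two thresholds would have to be chosen from the required noise margin; the values used above are arbitrary.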
Figure 3 - Read and write AXS voltage ranges for the proposed technique

A cell-level comparison between the proposed technique and a standard 6T design is presented in Table 1. The reference 6T design was designed after [8] and uses longer channel wordline transistors to minimize bitline leakage current. The portless cell exhibits 22% higher static noise margin and 14% lower leakage with a 51% penalty in the on/off cell current ratio. It is important to note that both of the cells were made to follow logic design rules, and that we anticipate significant improvements in portless cell area over 6T area when memory design rules are applied due to the symmetry of the structure and its uniform transistor sizing.

Table 1 - Proposed cell compared to 6T cell in 0.18 µm CMOS

Metric      Area (µm²)   ICell (µA)   SNM (mV)   ILeak (pA)
6T          8.64         97           298        85.6
Portless    8.77         41           364        73.5

III. IMPLEMENTATION AND SIMULATIONS

As a demonstration of the proposed technique, a simulated timing diagram is shown in Figure 4 for a 0.18 µm, 1.8 V technology. A read gate voltage of 1.3 V and a write gate voltage at Vdd are used. As can be seen in the figure, reading generates a substantial bitline differential in both the current and voltage modes while maintaining the cell data. Conversely, the write operation quickly equalizes the cell data nodes after which time the new data can be latched.

Figure 4 - Simulated timing of the proposed portless cell

To generate the two distinct voltage pulses needed for reading and writing in the proposed mode of operation, a straightforward circuit shown in Figure 5 has been developed and termed a “stunted wordline driver.” It is important to note that the simplicity of the circuit allows for a compact layout that fits into the row pitch of the SRAM array.

Figure 5 - Stunted wordline driver
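The percentages quoted for the cell-level comparison in Table 1 can be recomputed from the tabulated values. The sketch below is a cross-check only; it assumes that the on/off cell current ratio is defined as ICell divided by ILeak, which the text does not state explicitly, and the dictionary keys and function names are introduced here for illustration.

```python
# Values transcribed from Table 1 (0.18 µm CMOS, logic design rules).
TABLE_1 = {
    "6T":       {"area_um2": 8.64, "i_cell_uA": 97.0, "snm_mV": 298.0, "i_leak_pA": 85.6},
    "Portless": {"area_um2": 8.77, "i_cell_uA": 41.0, "snm_mV": 364.0, "i_leak_pA": 73.5},
}


def percent_change(new: float, ref: float) -> float:
    """Relative change of new with respect to ref, in percent."""
    return 100.0 * (new - ref) / ref


def on_off_ratio(cell: dict) -> float:
    """Assumed definition: cell read current over cell leakage current,
    with both converted to amperes."""
    return (cell["i_cell_uA"] * 1e-6) / (cell["i_leak_pA"] * 1e-12)


def compare_to_6t(table: dict = TABLE_1) -> dict:
    """Percent change of each portless metric relative to the 6T reference."""
    ref, new = table["6T"], table["Portless"]
    return {
        "snm_pct": percent_change(new["snm_mV"], ref["snm_mV"]),
        "leak_pct": percent_change(new["i_leak_pA"], ref["i_leak_pA"]),
        "on_off_pct": percent_change(on_off_ratio(new), on_off_ratio(ref)),
        "area_pct": percent_change(new["area_um2"], ref["area_um2"]),
    }


if __name__ == "__main__":
    for name, value in compare_to_6t().items():
        print(f"{name}: {value:+.1f}%")
```

Under that assumed ratio definition, the results round to the 22%, 14%, and 51% figures given in the text, and the two cell areas differ by about 1.5%.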
During a read, the bitlines are charged nearly to Vdd. This forces the two output PFETs into the sub-threshold region and prevents the output node from charging fully to Vdd. In fact, the output node can only be charged to Vdd-Vtn since the last stage inverter is composed of two NFETs. The resulting low voltage pulse properly drives the cell during a read without erasing its data. During a write operation, one of the bitlines will be pulled low enough to turn on one of the output PFETs. This will charge the output node to the full scale of the input pulse, effectively erasing the data stored in the cell as shown in Figure 4.

As shown in Figure 6, each stunted driver taps off of the global wordline driver to control one word of memory cells. In addition, it monitors the first pair of bitlines in each word to differentiate between reading and writing. One key aspect of this scheme is that there is no need to route an additional read/write signal from the column circuitry since the bitlines fulfill this task.
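The way the driver derives its output level from the bitlines can be captured in a simple behavioral model. This is a first-order sketch, not the circuit of Figure 5: the function name is hypothetical, the PFET turn-on condition (bitline below Vdd minus |Vtp|) is an assumption about how the output PFETs are connected, and the threshold values of 0.5 V are assumed so that Vdd-Vtn matches the 1.3 V read level used in the 1.8 V simulations.

```python
def stunted_driver_output(wordline_active: bool,
                          v_bl: float,
                          v_blb: float,
                          vdd: float = 1.8,
                          vtn: float = 0.5,
                          vtp: float = 0.5) -> float:
    """Approximate settled AXS level produced by a stunted wordline driver.

    vdd is taken from the 0.18 µm, 1.8 V example in the text; vtn and vtp
    are ASSUMED threshold magnitudes, not values given in the paper.
    """
    if not wordline_active:
        # No global wordline pulse: the AXS line stays at ground.
        return 0.0
    # Assumed condition: an output PFET conducts once its bitline has been
    # pulled more than |Vtp| below the supply.
    pfet_on = min(v_bl, v_blb) < vdd - vtp
    if pfet_on:
        # Write: the output charges to the full scale of the input pulse.
        return vdd
    # Read: only the NFET pull-up conducts, so the output stops at Vdd - Vtn.
    return vdd - vtn


if __name__ == "__main__":
    print("read :", stunted_driver_output(True, 1.8, 1.8))
    print("write:", stunted_driver_output(True, 0.0, 1.8))
    print("idle :", stunted_driver_output(False, 1.8, 1.8))
```

The model ignores transient behavior and sub-threshold conduction; it only reflects the point that the bitline state alone selects between the reduced read level and the full-swing write level.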
Figure 7 - Layout and micrograph of fabricated test chip
A single portless cell isolated from the array was tested by modulating its AXS and bitline node voltages using three 12-bit digital-to-analog converters. As shown in Figure 8, as the AXS voltage increases from ground to Vdd, the cell nodes converge as expected. At approximately 3 to 4 volts, the cell is still stable and the maximal bitline read current differential is generated, nearly 75 µA in this case. As the AXS voltage increases further, the cell becomes unstable and a write is performed as one of the bitline nodes is lowered. When AXS returns to ground, the new value is latched into the cell. This operation is functionally identical to the simulations in Figure 3 and is the first measured demonstration of the portless cell theory.
Figure 6 - Organization of the wordline drivers
The driver organization shown in Figure 6 does introduce an area overhead penalty in two ways. The first is simply the additional driver required for each word of the array. Since the driver uses only six transistors, and its area is amortized over the number of bits per word, this overhead is small, approximately 2% for a 128-bit word. The second penalty comes from the inability to spatially separate the bits of each word for multibit soft error protection. This same issue was recently solved by interleaving the bits of the ECC codes at an additional area cost of 5% [1].

IV. MEASURED RESULTS
A proof-of-concept test chip was fabricated in a 0.5 µm, three-metal CMOS process. A 512-bit SRAM using the proposed cell and stunted wordline driver was included. The array was 64 rows tall and one 8-bit word across, due to the limitation in available metal layers. A dynamic NAND decoder was used to generate the global wordline signal, and a bank of latched sense amplifiers was used in the bidirectional data I/O path. In addition, a single cell was isolated from the array and configured for direct measurement through the pad-ring. The layout and the resulting test chip are shown in Figure 7.
Figure 8 - Isolated portless cell measurements showing read and write operations
To verify the functionality of the array, a 16-bit digital signal processor was used to assert data, address, read/write, and clock signals to the test chip array. A random sequence of data was written to a random sequence of addresses and then read back. A single bit of a single cycle of this test is shown in Figure 9, where the sequence “011101” was written to and read from six different address locations at a 5 MHz clock frequency. It is important to note that all of these operations were performed within the same column, which is the worst-case situation for portless SRAM with regard to stability, demonstrating the proper operation of the SRAM array.
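The write-then-read-back procedure described above can be outlined in software. The sketch below is an assumption-laden stand-in, not the DSP test program used for the measurements: `SramModel` is a hypothetical ideal memory with the 64-row, 8-bit organization of the test array, and `random_write_read_test` is an illustrative name for the check.

```python
import random

ROWS = 64      # rows in the test array (from the text)
WORD_BITS = 8  # one 8-bit word per row (from the text)


class SramModel:
    """Hypothetical ideal memory standing in for the test chip array."""

    def __init__(self, rows: int = ROWS, word_bits: int = WORD_BITS) -> None:
        self.rows = rows
        self.mask = (1 << word_bits) - 1
        self.words = [0] * rows

    def write(self, addr: int, data: int) -> None:
        # Keep only the bits that fit in one word.
        self.words[addr] = data & self.mask

    def read(self, addr: int) -> int:
        return self.words[addr]


def random_write_read_test(mem: SramModel, n_ops: int, seed: int = 0) -> list:
    """Write random data to random addresses, then read every written
    address back. Returns the addresses whose contents did not match."""
    rng = random.Random(seed)
    expected = {}
    for _ in range(n_ops):
        addr = rng.randrange(mem.rows)
        data = rng.randrange(mem.mask + 1)
        mem.write(addr, data)
        expected[addr] = data  # a later write to the same address wins
    return [addr for addr, data in expected.items() if mem.read(addr) != data]


if __name__ == "__main__":
    failures = random_write_read_test(SramModel(), n_ops=200)
    print("failing addresses:", failures)
```

On real hardware the `write` and `read` methods would be replaced by whatever interface drives the address, data, read/write, and clock pins; the comparison logic stays the same.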
Figure 9 - Measured functionality of the test chip SRAM array

V. CONCLUSIONS AND FUTURE WORK

It has been shown in this work that the gate-overdrive voltage of the AXS transistor is an effective lever for controlling cell stability in portless SRAM. By using stunted wordline drivers, this technique can be employed with minimal impact on array efficiency while guaranteeing absolute minimum cell area. Test chip results demonstrate functionality of the proposed method in a proof-of-concept process technology, and also validate the portless SRAM cell theory for the first time through direct measurement of an isolated cell.

It is clear that portless SRAM is a viable alternative to standard 6T designs and exhibits advantages with respect to stability, area, and leakage. Current implementations, however, trade off a substantial percentage of their on/off current ratios in the process, resulting in slower operation or reduced array efficiency due to limited column height. Efforts are currently underway to improve upon this tradeoff through new bitline sensing schemes, and to further illuminate the value in portless SRAM when memory design rules are followed in sub-100 nm technologies.

REFERENCES

[1] L. Chang, Y. Nakamura, R. K. Montoye, J. Sawada, A. K. Martin, K. Kinoshita, F. H. Gebara, K. B. Agarwal, D. J. Acharyya, W. Haensch, K. Hosokawa, and D. Jamsek, "A 5.3GHz 8T-SRAM with Operation Down to 0.41V in 65nm CMOS," in 2007 IEEE Symposium on VLSI Circuits, 2007, pp. 252-253.
[2] R. Joshi, R. Houle, K. Batson, D. Rodko, P. Patel, W. Huott, R. Franch, Y. Chan, D. Plass, S. Wilson, and P. Wang, "6.6+ GHz Low Vmin, read and half select disturb-free 1.2 Mb SRAM," in 2007 IEEE Symposium on VLSI Circuits, 2007, pp. 250-251.
[3] B. Zhai, D. Blaauw, D. Sylvester, and S. Hanson, "A sub-200mV 6T SRAM in 130nm CMOS," in Int. Solid-State Circuits Conf., 2007.
[4] M. Wieckowski and M. Margala, "A novel five-transistor (5T) SRAM cell for high performance cache," in The IEEE International SOC Conference, 2005, pp. 101-102.
[5] M. Wieckowski, S. Patil, and M. Margala, "Portless SRAM - A High-Performance Alternative to the 6T Methodology," IEEE Journal of Solid-State Circuits, vol. 42, November 2007.
[6] L. M. Terman, "MOSFET memory circuits," Proceedings of the IEEE, vol. 59, pp. 1044-1058, 1971.
[7] I. Masaaki, K. Masayuki, N. Masahiro, T. Akira, and I. Takashi, "Ultra Low Voltage Operation with Bootstrap Scheme for Single Power Supply SOI-SRAM," in 20th International Conference on VLSI Design, 2007, pp. 609-614.
[8] R. W. Mann, W. W. Abadeer, M. J. Breitwisch, O. Bula, J. S. Brown, and B. C. Colwill, "Ultralow-power SRAM technology," IBM Journal of Research and Development, vol. 47, pp. 553-566, September/November 2003.