
Microelectronics Journal 45 (2014) 23–34

Microelectronics Journal homepage: www.elsevier.com/locate/mejo

Multi-level wordline driver for robust SRAM design in nano-scale CMOS technology

Farshad Moradi a,*, Georgios Panagopoulos b, Georgios Karakonstantis c, Hooman Farkhani a,d, Dag T. Wisland e, Jens K. Madsen a, Hamid Mahmoodi f, Kaushik Roy b

a Integrated Circuits and Electronics Laboratory, Department of Engineering, Aarhus University, Denmark
b Nanoelectronic Research Laboratory, Purdue University, USA
c Telecommunication Circuits Laboratory, École Polytechnique Fédérale de Lausanne, Switzerland
d Ferdowsi University of Mashhad, Iran
e Nanoelectronics group, University of Oslo, Norway
f Nano-Electronics and Computing Research Laboratory, San Francisco State University, USA

Article info

Article history: Received 20 November 2012; received in revised form 23 September 2013; accepted 30 September 2013; available online 29 October 2013

Abstract

In this paper, a multi-level wordline driver scheme is presented to improve 6T-SRAM read and write stability. The proposed wordline driver generates a shaped pulse during the read mode and a boosted wordline during the write mode. During read, the shaped pulse is held at the nominal voltage for a short period of time, whereas for the remaining access time the wordline voltage is reduced to lower the power consumption of the cell. This shaped wordline pulse results in improved read noise margin without any degradation in access time for small wordline loads. The improvement is explained by examining the dynamic and nonlinear behavior of the SRAM cell. Furthermore, during the hold mode, the wordline voltage becomes negative for a short time (depending on the size of the boosting capacitance) and then charges up to zero, which results in a lower leakage current compared to a conventional SRAM. The proposed technique results in at least a 2× improvement in read noise margin, while it improves write margin by 3× for supply voltages lower than 0.7 V. The leakage power of the proposed SRAM is reduced by 2%, while the total power is improved by 3% in the worst-case scenario for an SRAM array. The main advantage of the proposed wordline driver is the improvement of dynamic noise margin with less than 2.5% penalty in area. TSMC 65 nm technology models are used for simulations. © 2013 Elsevier Ltd. All rights reserved.

Keywords: SRAM; Wordline driver; Low-power; Leakage-power; Digital circuits

1. Introduction

Aggressive transistor scaling has increased process variations, leading to major design challenges in the nanometer regime. Process variability stems from systematic effects such as variations in critical dimensions (transistor width and length) [1] and oxide thickness [2], and from truly random effects such as random dopant fluctuations (RDF) [3]. Current design methodology hardly distinguishes systematic variations from truly random ones. For digital design (logic and memory), the worst-case corners typically capture 3σ variations. To satisfy the worst-case performance requirements, a large penalty is often paid in power and area.

* Corresponding author. Tel.: +45 418 933 44. E-mail addresses: [email protected] (F. Moradi), [email protected] (G. Panagopoulos), georgios.karakonstantis@epfl.ch (G. Karakonstantis), [email protected] (H. Farkhani), dagwis@ifi.uio.no (D.T. Wisland), [email protected] (J.K. Madsen), [email protected] (H. Mahmoodi), [email protected] (K. Roy). 0026-2692/$ - see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.mejo.2013.09.009

In such cases, satisfying the power budget often requires a trade-off in performance. Memory design presents an extreme example of corner-based design. To guarantee the functionality of several tens of millions of SRAM cells, the designer has to capture even 5σ and 7σ deviations of parameter variations. This is becoming increasingly challenging, and may present a problem for continued scaling of memory density. Hence, to achieve the highest possible packing density with high parametric yield in CMOS and SOI technologies, designers use a combination of multi-layered ad-hoc and heuristic techniques. These include device sizing, supply and threshold voltage selection, SRAM column height and sense-amplifier optimization, redundant columns, and error correction techniques. Power reduction has also become one of the most challenging design issues in every application domain. For SRAM designs, power reduction has been obtained by lowering the supply voltage [4] or by using high-Vth (HVT) devices [5,6]. Stability issues in SRAM cells have been aggravated by the use of complex peripheral designs such as supply gating [7,8]. Lower voltages and smaller devices cause a significant degradation in SRAM cell data stability, especially in scaled CMOS


F. Moradi et al. / Microelectronics Journal 45 (2014) 23–34

technologies. Another challenge is that SRAM cells are the main source of leakage, due to the large number of devices in memory banks [9]. To improve the data stability of SRAM cells, various techniques have been proposed in the literature, such as read/write separation [10,11] (e.g. the 8T-SRAM cell), boosted wordline [12,13] (to improve voltage scaling), dynamic wordline driver [14], negative bitline (to improve writability) [15–18], programmable wordline driver [19], compound device and circuit techniques [20], and adaptive dynamic wordline driver [21]. However, read and write stabilities are both important, and improving one of them usually comes at a cost. For instance, improving the read stability degrades the read access time. Hence, the development of a memory technology with higher data stability and lower leakage power, with negligible degradation in access time, is desirable. Furthermore, a memory that improves write and read margins simultaneously can help engineers significantly. In this paper, we propose a technique with the following features: (a) A wordline voltage boosted to VDD + α during write, where α is a design parameter that depends on the size of the transistors and the boosting capacitance of the wordline driver circuit. (b) A wordline voltage fixed at VDD for a short period of time from the start of the read access and then lowered below VDD/2 for the rest of the read mode. This improves the dynamic noise margin and the robustness against process variation, which results in a lower read failure rate. (c) A negative wordline voltage during part of the standby mode to reduce bitline leakage (partially negative wordline). The remainder of the paper is organized as follows: in Section 2 we briefly describe some existing wordline driver techniques. In Section 3 a motivating example is given. In Section 4 we present the proposed multi-level wordline driver technique (MLWD) and explain how read and write stability improvement is achieved based on dynamic analysis. Simulation results are presented in Section 5.
In Section 6, we apply the proposed technique to other SRAM cell topologies. Finally, in Section 7 the conclusions are drawn.
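As a rough illustration of features (a)–(c), the target wordline level in each mode can be written as a piecewise function of time. The sketch below is idealized (ideal steps, no RC transitions). The function name and every numeric default (the boost α, the length of the initial window, the lowered read level, and the negative hold level) are assumptions chosen for illustration, not values extracted from the proposed circuit.

```python
def wordline_level(mode, t, vdd=1.0, alpha=0.11, t_window=150e-12,
                   v_low_frac=0.45, v_neg=-0.1):
    """Idealized target wordline voltage (V) at time t (s) after a mode begins.

    All defaults are illustrative assumptions; real waveforms have finite
    rise/fall times set by the driver and the wordline load.
    """
    if mode == "write":
        # Feature (a): boosted level VDD + alpha.
        return vdd + alpha
    if mode == "read":
        # Feature (b): VDD during a short initial window, then below VDD/2.
        return vdd if t < t_window else v_low_frac * vdd
    if mode == "hold":
        # Feature (c): briefly negative, then back to 0 V.
        return v_neg if t < t_window else 0.0
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    for mode in ("write", "read", "hold"):
        levels = [wordline_level(mode, t) for t in (0.0, 100e-12, 300e-12)]
        print(mode, levels)
```

The only point of the sketch is that a single driver has to produce three distinct level sequences, selected by the operating mode.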

2. Wordline drivers for SRAM arrays

In the literature, different topologies have been proposed to improve read or write static noise margins (SNM) [4–20]. However, most of these techniques consider only one aspect of the existing challenge. In this section, we briefly present the most recently proposed techniques that are relevant to wordline driving. Fig. 1 shows the schematic of the Level-Programmable Wordline Driver (LPWD) proposed by Hirabayashi et al. in Ref. [19]. They proposed a wordline compensation technique combined with a dual power supply scheme. As shown in Fig. 1, there are two global power supplies named VSM and VDD. The authors reported that VSM is 200 mV higher than VDD based on their measurements on a 512 Kb SRAM block. This technique uses an adaptive wordline (WL)

Fig. 1. Read assist circuitry [19].

level-control generated from dual power supplies in the WL driver. In LPWD, the WL pull-up PMOS transistors are split in a binary manner. Even though the cell failure rate is reduced by 1000×, the use of large PMOS transistors leads to a large area penalty. Besides, this technique requires dual power supplies. A replica access transistor (RAT) is proposed in Ref. [15] that self-calibrates the WL voltage suppression under dynamic control voltage and frequency scaling. The schematic of the design is shown in Fig. 2. The technique reduces the cell current by 83% compared to conventional assist circuits [15]. Furthermore, the minimum operating voltage in the worst case was improved by 170 mV, confirming a high immunity against process and temperature variations with less than 10% area overhead. However, since the WL voltage is lowered, there is a concern about degradation of the operating speed. Furthermore, this technique uses resistances as a voltage divider (implemented in N+ poly-Si), which imposes an area penalty. In Refs. [15,16], the write capability is enhanced by negative write biasing without any reduction in the cell current. Furthermore, read capability is enhanced by cell current boosting. This technique uses a negative voltage booster to boost the selected column's VSS to a negative voltage to enhance the access current. However, in this technique the cells are 8T-SRAM cells with separate read and write operations. Therefore, these techniques suffer from large area and power penalties compared to a 6T-SRAM array. In Ref. [17], capacitive coupling is used to generate a transient negative voltage at the low-going bitline during the write operation without using any on-chip or off-chip negative voltage source. In this technique, the bitline voltage is lowered to a negative level using a capacitance. The bitline voltage then returns to ground after a certain time that depends on the design. This technique shows a 1000× improvement in write failures with no impact on the read stability.
In Ref. [22], the authors proposed a Level-Programmable Wordline Driver for Single Supply (LPWD-SS) design, shown in Fig. 3. This design generates a fixed negative voltage using a boosting capacitance, which gives better writability to the SRAM cell. The design shows a better WL rise time compared to the LPWD design. It is implemented in a 32 nm high-k metal-gate CMOS technology. In this paper, we propose a technique that improves read and write SNM, with a read access time comparable to that of a standard SRAM cell, while showing lower power consumption compared to conventional wordline drivers. In addition, it gives lower leakage current due to the negative WL voltage during a timeslot of the standby mode. The main advantage of the proposed wordline driver is the merging of read-assist and write-assist designs into one single technique. Furthermore, our design shows negligible area overhead compared to other techniques such as the

Fig. 2. RAT scheme [15].


RAT and LPWD-SS schemes due to lower capacitance without adding any resistance.

3. Motivation example

Process variations lead to fluctuations in transistor parameters such as threshold voltage, width and length, which can ultimately lead to memory failures. In general, parametric failures [3] can occur during (a) read: the stored data is flipped, (b) write: the data is not stored correctly within the required time, and (c) hold: the cell data is flipped when a lower supply voltage is applied. Fig. 4a and c show examples of read and write failures assuming a constant wordline (CWL) voltage equal to the supply voltage. One method to reduce the effect of variability in these cases is to upsize the SRAM cell, which makes the cell more stable. However, since the area of the SRAM cell is crucial for the density of the memory array, it is important to consider other techniques to improve the stability. A promising technique is to change the shape of the applied wordline voltage. Such a technique


is preferable since the stability of the cell can be improved without altering the topology or size of each bit-cell. To explain the effect of changing the wordline voltage, let us consider some examples in which the shape of the applied wordline voltage is not fixed. Fig. 4a, b and Fig. 4c, d show the SRAM function during read and write modes, respectively. As illustrated in Fig. 4b and d, applying a shaped wordline voltage can improve both read and write. During read, the shaped wordline voltage helps the cell to maintain the stored data, as shown in Fig. 4b, whereas in Fig. 4a the stored data is flipped. During write, it helps the cell to write the data. For instance, let us assume a memory with a conventional wordline (Fig. 4c) and a cell that cannot write data correctly due to access transistors weakened by parametric variations. Applying the shaped wordline voltage (boosted to a higher level) depicted in Fig. 4d to the same cell, which fails under a conventional fixed wordline voltage, strengthens the access transistor and allows data to be stored correctly. The above examples show that appropriate wordline shapes can significantly improve the cell stability, which increases memory yield, a necessity in nanoscale technologies. Motivated by these examples, in this paper we propose a scheme that generates an appropriate wordline shape depending on the mode of operation, and we examine its characteristics and its effects on a memory array. The details of the proposed circuit are discussed in the next section.

4. Proposed wordline driver

Fig. 3. LPWD-SS [22].

In this section we describe the proposed wordline driver in detail and explain how it improves the read and write SNM while maintaining a read access time similar to that of a standard SRAM cell and reducing power consumption compared to conventional wordline drivers.

Fig. 4. The effects of wordline shaping on SRAM cell stability for read ((a) and (b)) and write ((c) and (d)).


Fig. 5. MLWD circuit topology.

The schematic of the proposed wordline driver is shown in Fig. 5. The Multi-Level Wordline Driver circuit (MLWD) consists of three parts: (1) the delay element that determines the duration of a read operation and the time during which WL has a negative voltage level, (2) the pulse generator circuitry that produces a specially shaped pulse and (3) the boosting capacitance that changes the voltage level of the MLWD output depending on the operational mode. To describe how the MLWD scheme works, let us consider its operation in three different modes: (a) hold or standby mode, (b) write mode and (c) read mode.

(a) During hold, the WL of the SRAM is connected to a negative voltage for a short period of time (≈6τinv, where τinv is the delay of a single inverter) and thereafter the WL voltage becomes zero. The main reason for applying a negative WL voltage is to reduce the leakage current through the access transistors. The actual improvement in leakage power consumption depends on various factors such as the transistor size, the boosting capacitance, and the delay of the inverter-chain circuitry, as we will show in Section 5.

(b) During write, the MLWD circuit generates a voltage of VDD + α with α > 0. The higher the WL voltage level (wordline boosting technique), the better the writability, since a higher gate voltage makes the access transistor stronger (as also shown in one of the examples in Section 3). Note that α is a design parameter determined by the boosting capacitance, while the rise time of WL is tuned by the size of the transmission gate in the MLWD circuit. Larger transmission-gate transistors (TGT) lead to a lower rise time of the MLWD output during write or read modes. With this scheme there is no need for dual supply voltages or a negative ground (VSS). A challenging issue with boosted wordlines is the half-select issue, in which a cell in an unselected column is disturbed by the increased voltage that is applied on WL to turn on the pass-gate transistors for writing the required cells. Finding the right trade-off between the write margin and the disturb margin becomes especially challenging at low VDD (e.g. 0.5 V) [23–28].

(c) During read mode, for a short period of time from the beginning of the access of the cell (≈6τinv, 120–150 ps), the MLWD output voltage is raised to VDD ± δ, where δ is a very small value (up to 150 mV) that depends on the transistor dimensions of the MLWD circuit, the boosting capacitance, and the wordline capacitance. For the remaining WL pulse period, the MLWD output is set at a lower voltage, below VDD/2. By applying such a voltage, the read noise margin improves significantly while the read access time is only marginally affected. The negligible degradation in read access time is attributed to the initial value of the wordline voltage, which is VDD ± δ, where δ is a value up to 100 mV depending on the wordline load capacitance. Thus the degradation in access current is negligible for small loads. However, for larger load capacitances (e.g. 50 fF) the initial level of the wordline voltage is lower than VDD, leading to a degraded access time. For instance, for an SRAM array of 4 Kb (256 rows and 32 columns), access time is degraded by 7.5% compared to a conventional wordline driver (CWD).

Let us next explain the operation of the proposed MLWD circuit in the three different modes and highlight the enhancements over the conventional circuit described above.

During write, when WLin is at "1", node A remains at "1" for a time equal to 3τinv (the delay of three inverters), while node B is "0". In this case, transistors P1, P2 and N1 turn on, charging the boosting capacitance to VDD. After this time (≈3τinv), node A becomes zero while node B remains at zero, turning off transistor N1. After 6τinv from the beginning of the operation, node B becomes one. Therefore, there is no path between node MLWD and WLin. At this point, the capacitance Cboost discharges to "0" with a time constant equal to RN2·Cboost.

During read, for a time equal to 6τinv from the beginning of the read cycle, the WL voltage level is high, with the exact level depending on the wordline capacitance. After the 6τinv delay, transistors N3 and NR turn on and try to pull the MLWD node down to ground. At this time, TG1 (i.e. N1 and P1) is off and TG2 (i.e. N2 and P2) is still on (N2 turns on, while the drain and the source of N2 are connected to the same voltage). Under this condition, the final voltage on Cboost in the read cycle is reached after a time constant defined by the on-resistance of the stacked N3 and NR transistors (Rstack) together with RN2. The output voltage is defined as follows:

$$V_{boost} = 1 + \frac{R_{stack}}{R_{N2}} + K_1 e^{-t/\left((R_{stack} \parallel R_{N2})\, C_{boost}\right)} + K_2 e^{-t/\left(R_{N2} C_{boost}\right)} \tag{1}$$

where Rstack and RN2 are the equivalent resistances of the stacked N3–NR devices and of N2, respectively. As can be seen from Eq. (1), since Rstack > RN2, the final value of Vboost is larger than zero. Upsizing the stacked NMOS transistors makes Vboost approach zero. In this case, the size of the boosting capacitance determines the time Cboost takes to discharge to its final value.

During hold mode, when WLin becomes zero, TG1 is off for a delay equal to 3τinv while N2 is still on. During this 3τinv interval, the voltage at node MLWD goes towards a negative value (−1 V) with a time constant of τ = RN2·Cboost. Note that the negative voltage is determined by the boosting capacitance Cboost and RN2. After a further 3τinv (a total delay of 6τinv from the beginning of standby mode), Cboost is charged up to zero with a time constant determined by RP2||RN1. This affects the leakage power of the whole row, but its influence on the total leakage is very small. On the other hand, the lower WL voltage during read reduces the total leakage power significantly, as is also shown in the next sections. The MLWD scheme results in a 2% leakage current reduction for an SRAM array.

In our simulations we assumed that 25% of the wordline pulse width is enough to sense a voltage difference of 50–70 mV between the two bitlines, depending on the sense amplifier design. However, the number of columns and rows defines the sensing time. An increased number of rows leads to a larger bitline capacitance (CBL), which results in a longer sensing delay (access time). After the shortened wordline pulse (which is long enough to sense the bitline difference), the wordline voltage is lowered to reduce the failure rate in the presence of process variation.
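To get a feel for the two time constants that appear in Eq. (1), (Rstack||RN2)·Cboost and RN2·Cboost, the following sketch evaluates them for one set of component values. The resistance values are hypothetical placeholders, chosen only so that Rstack > RN2 as stated above; the 10 fF capacitance is the boosting capacitance quoted for the array simulations in Section 5.1. The helper names are not taken from the paper.

```python
def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel (ohms)."""
    return r1 * r2 / (r1 + r2)


def read_time_constants(r_stack, r_n2, c_boost):
    """Return the two time constants (s) of the exponentials in Eq. (1).

    tau_fast = (R_stack || R_N2) * C_boost
    tau_slow = R_N2 * C_boost
    """
    tau_fast = parallel(r_stack, r_n2) * c_boost
    tau_slow = r_n2 * c_boost
    return tau_fast, tau_slow


if __name__ == "__main__":
    # Hypothetical on-resistances (assumptions, not extracted from the circuit).
    R_STACK = 10e3   # stacked N3-NR path, ohms
    R_N2 = 5e3       # N2 on-resistance, ohms
    C_BOOST = 10e-15  # boosting capacitance, farads

    tau_fast, tau_slow = read_time_constants(R_STACK, R_N2, C_BOOST)
    print(f"tau_fast = {tau_fast * 1e12:.1f} ps")
    print(f"tau_slow = {tau_slow * 1e12:.1f} ps")
```

Because the parallel combination is always smaller than either resistance, the first exponential always settles faster than the second; a larger Cboost stretches both by the same factor.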


It is possible to make the WL pulse wide enough to always enable sensing, at the cost of a lower dynamic noise margin. As an example, at the Fast–Fast corner (fast NMOS and fast PMOS), the BL discharges to the desired level fast enough to be sensed within 6τinv. This is attributed to a 150 μA current through the access transistor, which is enough to discharge the bitline to a level that can be sensed by the sense amplifier. However, at the Slow–Fast corner (slow NMOS and fast PMOS) this time is not enough to discharge the BL to the desired level because of the lower current, 80 μA, through the access transistors. Therefore, the rest of the bitline discharge happens at the lowered WL level with a lower access current. In this case the access time is degraded in exchange for an improved noise margin. If the pulse width is designed to work at all corners (by widening the wordline pulse), the lowered level of the wordline voltage is not necessary. However, considering different process variation effects and reliability issues such as NBTI, PBTI, and HCI and their effects on threshold voltage, the rest of the wordline pulse (the lowered level of wordline voltage) can help to prevent failures. This is the main difference between the proposed wordline driver and a pulsed-wordline driver.
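The corner example can be checked with a back-of-the-envelope estimate: if the access current is treated as constant, the time needed to develop a bitline difference ΔV on a capacitance CBL is t = CBL·ΔV / I. The sketch below applies this to the two corner currents quoted above. The constant-current model, the choice of CBL = 400 fF (a value the paper quotes in another context), ΔV = 50 mV, and the 150 ps window are assumptions made here for illustration.

```python
def bitline_sense_time(c_bl, delta_v, i_access):
    """Time (s) for a constant current i_access (A) to change the voltage
    on a capacitance c_bl (F) by delta_v (V)."""
    return c_bl * delta_v / i_access


if __name__ == "__main__":
    C_BL = 400e-15      # assumed bitline capacitance, farads
    DELTA_V = 50e-3     # assumed required bitline difference, volts
    WINDOW = 150e-12    # assumed full-level WL window (about 6 inverter delays)

    for corner, current in (("Fast-Fast", 150e-6), ("Slow-Fast", 80e-6)):
        t = bitline_sense_time(C_BL, DELTA_V, current)
        verdict = "fits in" if t <= WINDOW else "exceeds"
        print(f"{corner}: {t * 1e12:.0f} ps, {verdict} the {WINDOW * 1e12:.0f} ps window")
```

Under these assumptions the Fast–Fast corner needs about 133 ps and the Slow–Fast corner about 250 ps, so only the former completes within the initial full-level window; the remainder of the slow-corner discharge has to take place at the lowered wordline level.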


Fig. 6. MLWL outputs during read.

5. Simulation results

To show the efficacy of the proposed MLWD technique, we performed HSPICE simulations on a 6T SRAM cell and a 128 Kb memory array using the proposed MLWD wordline driver scheme as well as a conventional wordline driver in TSMC 65 nm process technology. For comparison we used the technique referred to as the conventional wordline driver (CWD) [26]. However, in Ref. [26] a custom layout for the NAND gate is used, while in this paper a standard NAND layout is assumed. The CWD is implemented using a NAND gate that takes the read/write signal and the row decoder's output as inputs. In addition, note that the CWD design requires large buffers to drive the rows in an SRAM array. As mentioned in the previous section, the proposed MLWD circuit generates three different voltage shapes, one for each operation mode. The output of MLWD during write is a wordline voltage boosted to VDD + α. During hold, the wordline voltage becomes negative, reducing the overall leakage current, while in read mode the wordline voltage reaches VDD ± δ. In the following paragraphs we consider the different operational modes of the SRAM cell, and later we discuss power savings as well as the behavior of the proposed MLWD in the presence of process variations.

5.1. Read mode

Voltage waveforms showing the operation of a 6T-SRAM cell using the proposed MLWD are illustrated in Fig. 6 for a small wordline capacitance. As can be seen, the shaped wordline voltage during read enables the SRAM to read the data through the sense amplifiers. In Fig. 7, we plot the bitline results for the conventional wordline driver (CWD) and MLWD. As shown, the bitline discharges faster with MLWD than with CWD. Note that, for small loads, due to the boosted WL provided by MLWD (even if the boost is very small), the BL is discharged faster than in the conventional case and hence shows a better read access time.
The main reason for the faster discharge of the BL voltage is the small bump in the wordline voltage during read in the MLWD design. Furthermore, the MLWD output has a delay and changes more slowly than its CWD counterpart. During this time, the access transistor connected to BLB has no effect on the cell for a longer time, which results in a strong OFF state for the corresponding transistor connected to BLB.

Fig. 7. Simulation results for MLWD and CWD (Cboost = 1 fF, single SRAM cell).

For the CWD design, BLB starts to discharge slightly from the beginning of the read cycle, which results in a slower discharge of BL. For lower supply voltages, due to the delay of the MLWD design that is attributed to the drivability of the transmission-gate transistors, read access time is degraded by a few percent compared to CWD. This is attributed to the lowered level of wordline voltage during the read. The value Cboost = 1 fF is chosen for the aforementioned results. However, for an SRAM array, a larger boosting capacitance is required to keep the WL voltage close to VDD at the initial stage of reading (for a few ps). For a larger WL capacitance the WL voltage level is lowered, which results in a degraded access time. For instance, for a 64-column SRAM array, the wordline capacitance is larger than 100 fF, which requires a Cboost larger than 50 fF to achieve a waveform similar to that in Fig. 7. For our simulations, a 10 fF boosting capacitance is used, which results in a lowered WL voltage level and leads to an improvement in SNM at the cost of degraded access time. The results for the wordline voltage level during read are included in Section 5.4. Lowering the WL voltage level is a way to make the access transistors "weaker" and thus to improve read SNM. However, this technique degrades the access time due to the weaker access transistors. Fig. 8 shows a family of butterfly curves and the corresponding read SNM of a conventional wordline driver for different supply voltages. It is known that SNM degrades significantly when the supply voltage is lowered. Fig. 9 explains how read SNM and access time are affected by lowering the WL voltage. It is apparent that when the WL voltage is lowered to 0.6 V, read access time degrades by 40% while the read SNM improves by 2×. By applying the

Fig. 8. Read-SNM for 6T-SRAM at different operating voltages.

Fig. 9. Lowering the WL voltage effect on SNM and read access time.

Fig. 10. Current saving of MLWD during read.

Fig. 11. Output of MLWD during the write.

Table 1. SNM comparison for MLWD and CWD.

VDD (V)   Read SNM, CWD (mV)   Read SNM, MLWD (mV)   Improvement (×)
1         140.8                305.9                 2.17
0.9       132.6                284.2                 2.14
0.8       119.1                254.9                 2.14
0.7       101.0                219.4                 2.17
0.6       82                   180.5                 2.2
0.5       63.6                 141.3                 2.22
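The improvement column of Table 1 is simply the ratio of the two read-SNM columns. The short check below recomputes it from the tabulated values; the variable names are chosen here for illustration.

```python
# Read SNM values from Table 1 (mV), ordered by VDD = 1.0 ... 0.5 V.
vdd = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]
snm_cwd = [140.8, 132.6, 119.1, 101.0, 82.0, 63.6]
snm_mlwd = [305.9, 284.2, 254.9, 219.4, 180.5, 141.3]

# Improvement factor = MLWD read SNM / CWD read SNM at each supply voltage.
ratios = [m / c for c, m in zip(snm_cwd, snm_mlwd)]

if __name__ == "__main__":
    for v, r in zip(vdd, ratios):
        print(f"VDD = {v:.1f} V: improvement = {r:.2f}x")
    print(f"minimum improvement = {min(ratios):.2f}x")
```

The smallest ratio, about 2.14, occurs at VDD = 0.8 V, and the ratio grows slightly towards the lowest supply voltage.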

MLWD technique, read access time is not degraded while the read SNM improves by more than two times. Table 1 shows the results for read SNM using the MLWD and CWD topologies. The MLWD design improves read SNM by at least 2.14× compared to the CWD scheme. Once the sense amplifier senses a voltage difference (ΔVBL) between the bitlines, the WL voltage is lowered to suppress the access current for the remaining read cycle (≈90% of TWL, where TWL is the WL period). The sense amplifier (SA) used in our SRAM array design can sense a ΔVBL of less than 100 mV [29]. Fig. 10 shows the access current flowing through the pass transistors of the SRAM cell during read for MLWD and CWD. It is evident that our MLWD design can provide more than 50% current savings during read compared to the CWD technique. As a result, MLWD reduces the power consumed by the SRAM cell during read. This is attributed to the decreased current flowing through the access transistor (IAX). However, since the access time is degraded for larger CWL (wordline capacitance), the power saving decreases. The power saving of the proposed MLWD can be considered from two aspects. If read power consumption is measured with one of the bitlines fully discharged, the proposed MLWD results in a high power saving. But when only the difference between the bitline voltages is considered, power is saved only in the remaining cells of the row with WL enabled.

During read, as mentioned, the wordline voltage is lowered to a voltage below VDD/2 instead of ground (as in a pulsed-wordline driver). The main advantage of the proposed design compared to a pulsed-wordline driver is its better robustness against process variations, which results in a much lower read failure rate. To clarify this concept, consider an example. Assume that, due to process variation, the shortened pulse is not adequate to read the data. In this case, for MLWD, the WL level has been lowered to VDD/2 instead of "0", which allows the circuit to continue discharging the bitline at a slower pace. This results in a lower failure rate for the proposed circuit.

5.2. Write mode

Fig. 11 shows the MLWD operation during write and hold modes. As can be seen, during write the wordline voltage is boosted to VDD + α, where α is the boosted value discussed in Section 4. The writability of MLWD improves due to the boosted WL voltage. In Fig. 12, we compare the write delay for different wordline driver techniques. Since the MLWD design shows less rise in voltage at lower supply voltages, write delay is less affected by this technique. At lower supply voltages, the negative BL techniques (e.g. LPWD [19] or LPWD-SS [22]) result in a smaller delay compared to the MLWD technique. Fig. 13 shows the waveforms of the MLWD output during write under different supply voltages. We observe that the write speed degrades due to the delay added by the MLWD circuitry. By upsizing the transistors in the MLWD (the transmission-gate transistors), this problem can be mitigated for lower supply voltages. However, for higher supply voltages, the MLWD exhibits lower write delay

Fig. 12. Write delay for MLWD compared to CWD and LPWD.

Fig. 13. Level of MLWD at different operation voltages.

Fig. 15. Current saving during write and read at VDD = 1 V.

Table 2. Power and area for proposed MLWD (4 Kb SRAM array).

Capacity                            256 × 32
Read power at VDD = 1.0 V           395.33 μW
Write power at VDD = 1.0 V          392 μW
Read power at VDD = 0.9 V           390 μW
Write power at VDD = 1.2 V          395 μW
Leakage power for CWD               55.92 μW
Leakage power for MLWD              50.14 μW
Cell area                           0.9282 μm²
Read power at VDD = 1 V (ΔBL)       375 μW

compared to the negative BL methods (LPWD and LPWD-SS). Unlike the LPWD-SS design, the proposed design also includes read-assist circuitry to improve read margin. Instead of pulling the input wordline signal to ground (during read), we lower the wordline voltage to a level less than VDD, which helps to lower the leakage through the access transistors. The write margin results for the negative bitline design (LPWD) compared to the CWD and MLWD schemes are depicted in Fig. 14. For supply voltages lower than 0.6 V, using CWD for SRAM cells is not possible. However, both LPWD and MLWD enable the SRAM array to work with a reasonable WM down to 0.25 V and 0.3 V, respectively. These simulation results show that the LPWD design provides better write margin compared to the MLWD technique. However, the MLWD achieves a 2.75× improvement in write margin compared to the conventional design. Furthermore, our design shows better write delay at higher supply voltages compared to the LPWD design.

By utilizing the MLWD design, Vmin is reduced from 166 mV to 96 mV at VDD = 1 V (CBL = 400 fF, CWL = 50 fF), while the final Vmin for MLWD is reduced to 31 mV. All in all, Vmin is improved by at least 43% compared to CWD.

Fig. 14. Write margin comparison.

5.3. Power consumption

In Fig. 15, we plot IAX (the access transistor current) during read and write modes for the MLWD and CWD schemes. To calculate power savings, we performed HSPICE transient analysis for read, write, and hold modes. The results show that the MLWD design reduces the total power of the SRAM cell during read and standby modes, while the write power consumption is increased. Table 2 summarizes the power consumption results for a 4 Kb SRAM array. Due to the boosted wordline voltage during write, the MLWD scheme shows a 37% increase in the IAX current compared to CWD. However, as mentioned, the read current is reduced by more than 50% compared to the CWD technique. It is known that the number of read accesses is three times larger than the number of write accesses based on test benchmarks [30]. This suggests that the total savings will improve for common applications, where reads are more prevalent. However, read power consumption can be measured in two ways: when only the required bitline voltage difference is considered, the power saving differs from the case in which the bitline is allowed to discharge to a lower level (e.g. fully discharged). To clarify the power saving of the proposed MLWD technique, we simulate a 4 Kb SRAM array and measure the average power consumption using CWD and MLWD. Simulation results show read power consumption of 390 μW and 395.33 μW at VWL = 0.9 V and VWL = 1.0 V, respectively. Write power consumption is 395 μW at VWL = 1.2 V and 392 μW at VWL = 1.0 V. Leakage power consumption for this array is 55.92 μW and 50.14 μW for CWD and MLWD, respectively. These results do not include the precharge power consumption or the power consumption of the read and write circuitry. When only the ΔVBL required by the sense amplifier is considered, there is no power saving during read. This is because the time


required to discharge the bitline adequately increases for a lowered wordline voltage level. As a result, the total read power consumption is not changed. The total power saving of the proposed design depends on the number of reads and writes in a specific time window. The higher the number of writes, the lower the power saving; conversely, a lower number of writes improves the power consumption of the SRAM array.

5.4. Process and temperature variations

Under process variations (a higher Vth value for the access transistors), an SRAM using the CWD design fails to read the data from the storage nodes during the specified pulse width. However, for the proposed MLWD design, if the specified pulse is not wide enough to read the data, the SRAM still has the chance to read the data at a slower rate while the wordline voltage is at the VDD/2 level. As a result, our design is less prone to process variations and more robust. To study the proposed scheme under variations, we performed Monte-Carlo simulations for both read and write modes; the results are summarized in Figs. 16 and 17, respectively. In read mode, we observe that the effect of process variations is more prominent at the lowest WL voltage. As can be seen, μ and σ are 0.53 V and 17.4 mV, respectively. Furthermore, we observe that the mean of α during write is 0.11 V, equal to the designed value. Therefore, our circuit demonstrates robust behavior in the presence of transistor process variations and mismatches, since α varies only slightly under process variations. Considering the effect of temperature on the WL voltage during read and write cycles is crucially important. The effect of raised temperature on WL voltage for the proposed MLWD technique is

Fig. 16. Process variations effect on WL voltage during the read.

Fig. 18. (a) Read wordline voltage and (b) write wordline voltage versus temperature.

shown in Fig. 18a and b. As shown in Fig. 18a, with increasing temperature the read wordline voltage does not change significantly (0.97 V to 0.96 V), while the lowered VWL value changes from 0.44 V to 0.39 V. However, with rising temperature the wordline voltage during write changes from 1.196 V to 1.15 V. As a result, the change in read noise margin versus temperature is negligible, while the change in write margin is larger due to the lowered VWL. The effect of raised temperature on access time is negligible too. To explore the effect of process variations (a change in VWL) on read noise margin, write margin, and access time, we sweep VWL during read and write. Fig. 19 illustrates the degradation trend in read noise margin at different VWL levels. As can be seen, when the wordline voltage level changes from 1 V to 0.9 V, the read noise margin is degraded by 24%. However, as shown in Fig. 18a, temperature does not change the VWL value significantly. Therefore, the change in read noise margin versus temperature is negligible. Write margin, however, is more affected by temperature, as illustrated in Fig. 19. Lowering the boosted VWL from 1.2 V to 1.1 V (equivalent to the VWL change versus temperature) leads to a write margin degradation of 22%.
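To put the temperature-induced wordline shifts quoted above on a common scale, the short sketch below expresses each one as a relative drop. The voltage pairs are the ones stated in the text for Fig. 18; the dictionary keys and the helper name `relative_drop_percent` are illustrative only and are not taken from the paper.

```python
# Wordline levels (in volts) at the low and high ends of the temperature
# range, as quoted in the text for Fig. 18. Labels are illustrative.
WL_LEVELS = {
    "read, initial level": (0.97, 0.96),
    "read, lowered level": (0.44, 0.39),
    "write, boosted level": (1.196, 1.15),
}


def relative_drop_percent(v_low_temp: float, v_high_temp: float) -> float:
    """Relative drop of a voltage level between the two temperature ends."""
    return 100.0 * (v_low_temp - v_high_temp) / v_low_temp


if __name__ == "__main__":
    for name, (v_lo_t, v_hi_t) in WL_LEVELS.items():
        print(f"{name}: {relative_drop_percent(v_lo_t, v_hi_t):.1f}% drop")
```

The initial read level moves by about one percent, which is consistent with the statement that the read noise margin barely changes with temperature, while the lowered read level and the boosted write level shift by visibly larger fractions.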

5.5. Dynamic noise margin comparison

Fig. 17. Effect of process variations on WL voltage level during write.

When we compute the static noise margin (SNM), we assume that the wordline pulse width is infinite. Even though this assumption does not represent reality, it provides an easy, fast and comprehensive way to measure the stability of an SRAM cell. However, to capture the real behavior of the SRAM cell, dynamic stability analysis based on the dynamic noise margin (DNM) [30,31] is needed. In our case, the use of DNM for SRAM cell analysis is necessary because in our proposed technique the wordline is not


Fig. 19. (a) Read noise margin and (b) write margin versus temperature.

Fig. 20. Simulation results for CWD and MLWD designs.

constant. Thus, our goal is to capture the transient states which are not considered during DC static analysis. Before we present our results, it is convenient to introduce some terms that help explain them. A more detailed discussion can be found in Ref. [32]. By running a transient analysis using a numerical simulator such as SPICE, we obtain the voltage waveforms (VL, VR and WL) with respect to time, as shown in Fig. 20a and c. Their trace in the VL–VR plane (phase plane) is called a trajectory. Representing the solutions in the phase plane is convenient because some of its points and curves have special properties and play a significant role in SRAM analysis and stability. For instance, the cross points of the butterfly curves are the equilibrium points (EPs) of the SRAM cell. The metastable point is also an EP (for the hold mode). The two EPs that correspond to the stored states are stable, whereas the metastable point is unstable. Note that these EPs move to different locations during read and write operations, since the shape of the butterfly curves and their cross points change. Furthermore, the curve that plays a significant role

in our analysis is the separatrix [32]. The separatrix is a curve that splits the VL–VR plane into two sub-planes, one for each stable EP. All the trajectories that start in one of these sub-planes converge to the stable EP of that sub-plane. In addition, the separatrix passes through the metastable point. All these concepts are depicted in Fig. 20. Let us now use DNM to analyze cells that are statically unstable, that is, cells whose butterfly curves cross at only one point during read mode (see Fig. 21). In the example of Fig. 21, if a trajectory starts from the EP (VDD, 0), after some finite time it crosses the separatrix and then reaches the unique read-mode EP. When the access transistor turns off, this trajectory reaches the EP (0, VDD). This cell is dynamically unstable because the stored data is not the same before and after the read operation. Now suppose that the time during which the WL is high is short enough; then the trajectory may not have enough time to cross the separatrix and reach the other EP, and the data is retained. In this case, the cell is dynamically stable. These two scenarios are


Fig. 21. Simulation results for MLWD at different levels of VF (WL).

Fig. 24. Layout of (a) MLWD and (b) CWD designs.

Fig. 22. Simulation results for CWD and MLWD.

Fig. 25. Layout of 128 Kb SRAM array including MLWD and CWD layouts.

Fig. 23. Thin-cell 6T-SRAM cell layout.

shown in Fig. 22. Hence, we understand that the use of DNM is necessary for the study of non-conventional wordline drivers such as the proposed MLWD driver.
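The role of the separatrix and of the wordline pulse width can be illustrated with a deliberately simplified sketch. It is not the SRAM cell model used in this paper: it is a normalized one-dimensional bistable system, dv/dt = v − v³ − d(t), in which v loosely stands for the differential storage-node state, v = +1 and v = −1 play the role of the two stable EPs, v = 0 plays the role of the separatrix in hold mode, and d(t) is a hypothetical read disturbance applied only while the wordline pulse is active. The disturbance amplitude is chosen large enough that only one equilibrium exists during the pulse, mimicking a statically unstable cell. All names and values below are assumptions for illustration.

```python
def simulate(pulse_width: float, disturb: float = 1.0,
             t_end: float = 20.0, dt: float = 1e-3) -> float:
    """Forward-Euler integration of the toy model dv/dt = v - v**3 - d(t).

    d(t) equals `disturb` while t < pulse_width (wordline active) and 0
    afterwards (hold mode). The state starts at the stable EP v = +1.
    Returns the state at t_end in normalized units.
    """
    v = 1.0
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        d = disturb if t < pulse_width else 0.0
        v += dt * (v - v ** 3 - d)
    return v


if __name__ == "__main__":
    for width in (0.5, 3.0):
        final = simulate(width)
        outcome = "data retained" if final > 0 else "data flipped"
        print(f"pulse width {width}: final state {final:+.3f} ({outcome})")
```

In this toy model the short pulse ends before the state reaches v = 0, so the trajectory relaxes back to its starting EP, whereas the long pulse carries the state across v = 0 and it settles at the opposite EP once the disturbance is removed. This mirrors, qualitatively only, the dynamically stable and dynamically unstable scenarios described above.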

5.6. Implementation area

We have implemented the layout of a thin-cell SRAM cell, as depicted in Fig. 23. The area of this layout is larger than that of the standard SRAM layout because it is drawn using logic design rules, which results in 1.09 μm², 2× larger than the standard SRAM layout. The layout of the MLWD and CWD designs is shown in Fig. 24. Note that the capacitor Cboost is implemented using a MOSCAP device, which shows different capacitance values at different biases, for instance Cboost = 17 fF, 10 fF and 2 fF for capacitance voltage equal to 1 V, 0 V, and 1 V respectively. To drive larger wordline loads, a larger MOSCAP is required. Although the area overhead of the proposed MLWD compared to the CWD design is 46%, the total area overhead due to the MLWD circuitry compared to CWD is less than 2.5% for a 128 Kb memory array, as shown in Fig. 25. This is attributed to the small area of the whole wordline driver circuit compared to the total area of the SRAM array. The whole area of the MLWD array is 7.9% of the area of a 128 Kb SRAM array without lateral circuitry (128 Kb SRAM array area: 144,300 μm², MLWD array: 12,451 μm², and CWD array: 8,493 μm²). Assuming that a standard SRAM array layout occupies half the area of our SRAM array layout, the area overhead is 4.9%. As a result, the area overhead of the proposed design is small compared to modifying the SRAM cell circuit, which gives a significantly larger area overhead; for example, a 10T-SRAM cell gives a 2× larger area overhead compared to a 6T-SRAM cell.

6. MLWD in other SRAM topologies

Fig. 26 shows the schematics of the 8T-SRAM and Schmitt-trigger SRAM (ST-SRAM) cells proposed in Refs. [33,34], respectively. When MLWD is applied to the 8T-SRAM cell, the access time degrades due to the single-ended bitline of the 8T-SRAM cell. Since the read and write wordlines are separated in the 8T-SRAM cell, the trade-off between read and write has been resolved. Therefore, our proposed technique can be applied to the 8T-SRAM cell to improve the writability of the circuit through the boosted write wordline. However, since the wordline voltage in MLWD is pulse-shaped, the access time degradation is inevitable. Moreover, the conventional 8T-SRAM cell still suffers from half-select disturb, similarly to the 6T-SRAM cell, due to the boosted wordline voltage.


Fig. 26. Schematics of ST-SRAM cells [34] and 8T-SRAM [33].

Fig. 27. (a) TGPT-SRAM cell and (b) TGPT-SRAM cell with separate BLs.

The ST-SRAM cell improves read SNM by 1.47× compared to the 6T-SRAM cell. Therefore, it is interesting to consider the effect of MLWD on this design. By applying the MLWD technique, both read and write margins are improved. The main reason for this improvement is the lowered level of the wordline voltage during read. For instance, the voltage level during the initial part of the read cycle is 0.96 V for a load CWL = 10 fF, showing that access time is degraded by a few percent while SNM is improved. However, as explained previously, for larger CWL values, such as 30 fF, the read wordline voltage is reduced to 0.92 V, which results in an improved SNM. Moreover, as mentioned, the proposed MLWD technique improves DNM significantly. Write margin is also improved, because the wordline is boosted to a level up to 1.184 V for CWL = 10 fF. By applying MLWD to the ST-SRAM cell, we obtain at least a 1.94× read SNM improvement compared to using the CWD scheme. This technique can also be applied to a transmission-gate pass-transistor SRAM (TGPT-SRAM) cell, as shown in Fig. 27a. By using high-Vth (HVT) devices for the write-assist access transistors, leakage through the access transistors can be reduced when the data is read. Furthermore, by using low-Vth devices for the access transistors, the access time degradation at lower supply voltages can be compensated. However, this comes with a penalty in read noise margin. Furthermore, similar to the 8T-SRAM cell, non-selected columns are affected by the high voltage of the wordline during write. The use of the TGPT-SRAM cell leads to a 25% area penalty compared to the 6T-SRAM cell. As is clear from the SRAM topology shown in Fig. 27a, the trade-off between read and write can be resolved thanks to independent access transistors for read and write. For instance, to improve the write margin, the access transistor with its gate connected to WWL can be upsized without any effect on read noise margin and access time. On the other hand, the access transistor for read can be weakened to improve static noise margin, although access time degradation is then inevitable. Another configuration is to separate the RBL (read bitline) and WBL (write bitline), as shown in Fig. 27b. This SRAM topology provides a further improvement in read SNM and access time. Note that in this case, we should use separate wordline drivers for read and write. As a consequence, there is an area overhead due to the required separate lines for read and write. The main advantage of this circuit is that it is not necessary to precharge the bitlines for each write/read. In this case, the write bitlines can be left floating, which benefits the write time and write margin of the proposed SRAM cell. Furthermore, this topology does not suffer from dummy reads, due to the separated read and write lines.

7. Conclusions

In this paper, we proposed a multi-level wordline driver (MLWD) to improve write margin and leakage power while keeping the degradation in read access time low. Furthermore, the proposed MLWD scheme improves the read noise margin by at least 2× compared to the conventional wordline driver (CWD). When the proposed MLWD is applied to an 8T-SRAM and a Schmitt-trigger SRAM cell, the read SNM is improved by at least 1.96× compared to the conventional wordline driver. The total power reduction depends on the number of reads and writes in an SRAM array. In general, leakage power is reduced due to the negative level of the wordline during hold mode. Furthermore, we proposed an 8T-SRAM cell (TGPT-SRAM) using the MLWD scheme to separate read and write wordlines, resulting in improved read SNM. Although the proposed MLWD scheme can be applied to any SRAM cell, we suggest using it with SRAM cells that have dual bitlines.
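The dependence of the total power reduction on the read/write mix can be made concrete with a small sketch based on the array-level read and write powers reported in Table 2. It rests on two assumptions that are not stated explicitly in the paper: that the entries at a 1.0 V wordline correspond to CWD, and that the 0.9 V read and 1.2 V write entries correspond to MLWD. Leakage, precharge, and peripheral power are ignored, so the numbers are not meant to reproduce the totals reported elsewhere in the paper. The function names are illustrative.

```python
# Array-level access power in microwatts, taken from Table 2.
# ASSUMPTION: 1.0 V wordline entries = CWD; 0.9 V read / 1.2 V write = MLWD.
READ_UW = {"CWD": 395.33, "MLWD": 390.0}
WRITE_UW = {"CWD": 392.0, "MLWD": 395.0}


def average_access_power(scheme: str, read_fraction: float) -> float:
    """Access power averaged over a mix of reads and writes (microwatts)."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must lie in [0, 1]")
    return (read_fraction * READ_UW[scheme]
            + (1.0 - read_fraction) * WRITE_UW[scheme])


def saving_percent(read_fraction: float) -> float:
    """Relative saving of MLWD over CWD; negative values mean a penalty."""
    cwd = average_access_power("CWD", read_fraction)
    mlwd = average_access_power("MLWD", read_fraction)
    return 100.0 * (cwd - mlwd) / cwd


def break_even_read_fraction() -> float:
    """Read fraction at which the read gain equals the write penalty."""
    read_gain = READ_UW["CWD"] - READ_UW["MLWD"]
    write_cost = WRITE_UW["MLWD"] - WRITE_UW["CWD"]
    return write_cost / (read_gain + write_cost)


if __name__ == "__main__":
    # 0.75 corresponds to the three-reads-per-write mix cited in Section 5.3.
    print(f"saving at 75% reads: {saving_percent(0.75):.2f}%")
    print(f"break-even read fraction: {break_even_read_fraction():.2f}")
```

Under these assumptions, the access-power balance tips in favor of MLWD once reads make up a little more than a third of all accesses, which is in line with the qualitative statement that read-dominated workloads benefit most.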

References

[1] E. Chang, B. Stine, T. Maung, R. Divecha, D. Boning, J. Chung, K. Chang, G. Ray, D. Bradbury, O.S. Nakagawa, S. Oh, D. Bartelink, Using a statistical metrology framework to identify systematic and random sources of die- and wafer-level ILD thickness variation in CMP processes, IEDM Tech. Dig. (1995) 499–502.
[2] K.A. Bowman, S.G. Duvall, J.D. Meindl, Impact of die-to-die and within-die parameter fluctuations on the maximum clock frequency distribution for gigascale integration, IEEE J. Solid-State Circuits 37 (2) (2002) 183–190.
[3] S. Mukhopadhyay, H. Mahmoodi, K. Roy, Modeling of failure probability and statistical design of SRAM array for yield enhancement in nanoscaled CMOS, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 24 (12) (2005) 1859–1880.
[4] M.E. Sinangil, N. Verma, A.P. Chandrakasan, A reconfigurable 8T ultra-dynamic voltage scalable (U-DVS) SRAM in 65 nm CMOS, IEEE J. Solid-State Circuits 44 (11) (2009) 3163–3173.
[5] F. Hamzaoglu, Y. Ye, A. Keshavarzi, K. Zhang, S. Narendra, S. Borkar, M. Stan, V. De, Analysis of dual-VT SRAM cells with full-swing single-ended bit line sensing for on-chip cache, IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 10 (2) (2002) 91–95.
[6] F. Hamzaoglu, Y. Ye, A. Keshavarzi, K. Zhang, S. Narendra, S. Borkar, M. Stan, V. De, Dual-VT SRAM cells with full-swing single-ended bit line sensing for high-performance on-chip cache in 0.13 μm technology generation, in: Proceedings of the International Symposium on Low Power Electronics and Design ISLPED, 2000, pp. 15–19.
[7] J.B. Kuang, H.C. Ngo, K.J. Nowka, S. Ehrenreich, A.J. Drake, J. Pille, S. Kosonocky, R. Joshi, T. Nguyen, I. Vo, The design and implementation of a low-overhead supply-gated SRAM, in: Proceedings of the 32nd European Solid-State Circuits Conference ESSCIRC, Sept. 2006, pp. 287–290.
[8] E. Morifuji, T. Yoshida, M. Kanda, S. Matsuda, S. Yamada, F. Matsuoka, Supply and threshold-voltage trends for scaled logic and SRAM MOSFETs, IEEE Trans. Electron. Devices 53 (6) (2006) 1427–1432.
[9] C.H. Kim, Jae-Joon Kim, Ik-Joon Chang, K. Roy, PVT-aware leakage reduction for on-die caches with improved read stability, IEEE J. Solid-State Circuits 41 (1) (2006) 170–178.
[10] Verma Naveen, A.P. Chandrakasan, A 65 nm 8T sub-Vt SRAM employing sense-amplifier redundancy, in: Proceedings of the IEEE International Solid-State Circuits Conference ISSCC, Digest of Technical Papers, 11–15 Feb. 2007, pp. 328–606.
[11] N. Verma, A.P. Chandrakasan, A 256 Kb 65 nm 8T subthreshold SRAM employing sense-amplifier redundancy, IEEE J. Solid-State Circuits 43 (1) (2008) 141–149.
[12] H. Morimura, N. Shibata, A step-down boosted-wordline scheme for 1-V battery-operated fast SRAM's, IEEE J. Solid-State Circuits 33 (8) (1998) 1220–1227.
[13] R.V. Joshi, R. Kanj, K. Kim, R.Q. Williams, C.T. Chuang, A floating-body dynamic supply boosting technique for low-voltage SRAM in nanoscale PD/SOI CMOS technologies, in: Proceedings of the ISLPED, 2007, pp. 8–13.
[14] Y. Wang, U. Bhattacharya, F. Hamzaoglu, P. Kolar, Y. Ng, L. Wei, Y. Zhang, K. Zhang, M. Bohr, A 4.0 GHz 291 Mb voltage-scalable SRAM design in 32 nm high-κ metal-gate CMOS with integrated power management, in: Proceedings of the IEEE International Solid-State Circuits Conference ISSCC, Digest of Technical Papers, 8–12 Feb. 2009, pp. 456–457,457a.
[15] K. Nii, M. Yabuuchi, Y. Tsukamoto, S. Ohbayashi, Y. Oda, K. Usui, T. Kawamura, N. Tsuboi, T. Iwasaki, K. Hashimoto, H. Makino, H. Shinohara, A 45-nm single-port and dual-port SRAM family with robust read/write stabilizing circuitry under DVFS environment, in: Proceedings of the IEEE Symposium on VLSI Circuits, 18–20 June 2008, pp. 212–213.
[16] M. Yabuuchi, K. Nii, Y. Tsukamoto, S. Ohbayashi, S. Imaoka, H. Makino, Y. Yamagami, S. Ishikura, T. Terano, T. Oashi, K. Hashimoto, A. Sebe, G. Okazaki, K. Satomi, H. Akamatsu, H. Shinohara, A 45 nm low-standby-power embedded SRAM with improved immunity against process and temperature variations, in: Proceedings of the IEEE International Solid-State Circuits Conference ISSCC, Digest of Technical Papers, 11–15 Feb. 2007, pp. 326–606.
[17] S. Mukhopadhyay, R.M. Rao, J.J. Kim, C.T. Chuang, SRAM write-ability improvement with transient negative bit-line voltage, IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 9 (1) (2011) 24–32.
[18] M. Yabuuchi, K. Nii, Y. Tsukamoto, S. Ohbayashi, Y. Nakase, H. Shinohara, A 45 nm 0.6 V cross-point 8T SRAM with negative biased read/write assist, in: Proceedings of the Symposium on VLSI Circuits, 16–18 June 2009, pp. 158–159.
[19] O. Hirabayashi, A. Kawasumi, A. Suzuki, Y. Takeyama, K. Kushida, T. Sasaki, A. Katayama, G. Fukano, Y. Fujimura, T. Nakazato, Y. Shizuki, N. Kushiyama, T. Yabe, A process-variation-tolerant dual-power-supply SRAM with 0.179 μm² cell in 40 nm CMOS using level-programmable wordline driver, in: Proceedings of the IEEE International Solid-State Circuits Conference ISSCC, Digest of Technical Papers, 8–12 Feb. 2009, pp. 458–459,459a.
[20] K. Nii, M. Yabuuchi, Y. Tsukamoto, Y. Hirano, T. Iwamatsu, Y. Kihara, A 0.5 V 100 MHz PD-SOI SRAM with enhanced read stability and write margin by asymmetric MOSFET and forward body bias, in: Proceedings of the IEEE International Solid-State Circuits Conference ISSCC, Digest of Technical Papers, 7–11 Feb. 2010, pp. 356–357.
[21] Nho Hyunwoo, P. Kolar, F. Hamzaoglu, Wang Yih, E. Karl, Ng Yong-Gee, U. Bhattacharya, K. Zhang, A 32 nm high-k metal gate SRAM with adaptive dynamic stability enhancement for low-voltage operation, in: Proceedings of the IEEE International Solid-State Circuits Conference ISSCC, Digest of Technical Papers, 7–11 Feb. 2010, pp. 346–347.
[22] Y. Fujimura, O. Hirabayashi, T. Sasaki, A. Suzuki, A. Kawasumi, Y. Takeyama, K. Kushida, G. Fukano, A. Katayama, Y. Niki, T. Yabe, A configurable SRAM with constant-negative-level write buffer for low-voltage operation with 0.149 μm² cell in 32 nm high-k metal-gate CMOS, in: Proceedings of the IEEE International Solid-State Circuits Conference ISSCC, Digest of Technical Papers, 7–11 Feb. 2010, pp. 348–349.
[23] V. Ramadurai, R. Joshi, R. Kanj, A disturb decoupled column select 8T SRAM cell, in: Proceedings of the CICC, Digest of Technical Papers, 2007, p. 25.
[24] R. Joshi, R. Houle, K. Batson, et al., 6.6+ GHz low Vmin, read and half select disturb-free 1.2 Mb SRAM, in: Proceedings of the IEEE Symposium on VLSI Circuits, Jun. 2007, pp. 250–251.
[25] K. Honda, et al., Elimination of half select disturb in 8T-SRAM by local injected electron asymmetric pass gate transistor, in: Proceedings of the IEEE Custom Integrated Circuits Conference, Sep. 2010, pp. 1–4.
[26] F. Moradi, D.T. Wisland, S. Aunet, H. Mahmoodi, Tuan-Vu Cao, 65 nm subthreshold 11T-SRAM for ultra-low voltage applications, in: Proceedings of the IEEE International SOC Conference, 17–20 Sept. 2008, pp. 113–118.
[27] F. Moradi, D.T. Wisland, H. Mahmoodi, Y. Berg, Tuan-Vu Cao, New SRAM design using body bias technique for ultra-low power applications, in: Proceedings of the 11th International Symposium on Quality Electronic Design, ISQED 22–24 March 2010, pp. 468–471.
[28] C. McNairy, D. Soltis, Itanium 2 processor microarchitecture, IEEE Micro 23 (2) (2003) 44–55.
[29] T. Seki, E. Itoh, C. Furukawa, I. Maeno, T. Ozawa, H. Sano, N. Suzuki, A 6-ns 1-Mb CMOS SRAM with latched sense amplifier, IEEE J. Solid-State Circuits 28 (4) (1993) 478–483.
[30] M. Sharifkhani, M. Sachdev, SRAM cell stability: a dynamic perspective, IEEE J. Solid-State Circuits 44 (2) (2009) 609–619.
[31] D.E. Khalil, M. Khellah, Nam-Sung Kim, Y. Ismail, T. Karnik, V.K. De, Accurate estimation of SRAM dynamic stability, IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 16 (12) (2008) 1639–1647.
[32] Zhang Bin, A. Arapostathis, S. Nassif, M. Orshansky, Analytical modeling of SRAM dynamic stability, in: Proceedings of the IEEE/ACM International Conference on Computer-Aided Design ICCAD, 5–9 Nov. 2006, pp. 315–322.
[33] L. Chang, D. Fried, J. Hergenrother, J. Sleight, R. Dennard, R.R. Montoye, L. Sekaric, S. McNab, W. Topol, C. Adams, K. Guarini, W. Haensch, Stable SRAM cell design for the 32 nm node and beyond, in: Proceedings of the Symposium on VLSI Technology, Digest of Technical Papers, 2005, pp. 128–129.
[34] J.P. Kulkarni, K. Kim, K. Roy, A 160 mV robust Schmitt trigger based subthreshold SRAM, IEEE J. Solid-State Circuits 42 (10) (2007) 2303–2313.