Modeling the Impact of Process Variation on Critical Charge Distribution

Qian DING1, Rong LUO1, Hui WANG1, Huazhong YANG1, Yuan XIE2
(1 Dept. of Electronic Engineering, Tsinghua University, Beijing, 100084, China)
(2 Dept. of Computer Science & Engineering, Pennsylvania State University, University Park, PA, 16802, USA)

ABSTRACT
In this paper, we investigate the impact of process variation on soft error vulnerability with Monte Carlo analysis. Our simulation results show that Qcritical variation (3σ/mean) of four types of storage circuits caused by process variation can be as large as 13.6%. We also propose an empirical model to estimate the Qcritical variation caused by gate length and threshold voltage variations. Simulation results show that this simple model is very accurate. Based on this model, the dependence of Qcritical variation on gate length variation, threshold voltage variation, and correlation between gate lengths is studied, using 70nm SRAM as benchmark circuit.

I. INTRODUCTION
As technology scales, process variation is becoming a big concern as a result of uncertainty in the device and interconnect characteristics, such as gate length, the thickness of gate oxide, and doping concentrations [1]. The variation of process parameters, including inter-die variations and intra-die variations, makes circuits less predictable, changing the design methodology for nanometer system-on-chip (SOC) from deterministic to probabilistic.

Along with the technology scaling, nanometer SOC designs become increasingly vulnerable to three major ground level soft error donors: alpha particles, fast neutrons, and thermal neutrons [2, 3, 4]. For memory elements, if charge deposited by the particle strike at the storage node is more than a minimum value, the node will be flipped and a soft error occurs. This minimum value is called critical charge (or Qcritical), which is used by many researchers to measure the circuit's vulnerability to soft errors.
Worst case (or corner-based) analysis of the impact of process variation on soft error vulnerability has been investigated [5], and Qcritical was found to vary from -33.5% to 81.7% relative to the case without process variation. However, worst-case analysis of Qcritical variation leads to overly pessimistic results. In this paper, instead of worst case (or corner-based) analysis, we study the impact of process variation on soft error vulnerability with Monte Carlo analysis. Compared to the worst case analysis, the result of Monte Carlo analysis is less pessimistic. The Qcritical variation (3σ/mean) is typically under 10% and can be as large as 13.6%.

0-7803-9782-7/06/$20.00 ©2006 IEEE


Since Monte Carlo analysis is very time consuming, we also propose a simple empirical model to estimate the distribution of Qcritical caused by gate length variation and threshold voltage variation, and we study the impact of correlation between gate lengths. The time efficiency of the model makes it suitable for the analysis of larger circuit designs.

The paper is organized as follows. Section 2 reviews related work. Section 3 introduces our experimental setup. Section 4 presents our simulation results. A simple Qcritical model is presented in Section 5. Based on this model, the dependence of Qcritical variation on gate length variation, threshold voltage variation and correlation between gate lengths is studied in Section 6. We summarize and conclude in Section 7.

II. RELATED WORK
Previous work focuses on the impact of process variation on performance and power. For example, the maximum clock frequency (FMAX) distribution was studied by Bowman et al. [1], who modeled the influence of process variation on the number of independent critical paths and on logic depth. A microarchitecture level variability model was proposed based on Bowman's model [6]. Statistical timing analysis is of big concern as well [7]. Because gate delay variability accumulates, there is no longer a single critical path, unlike in traditional deterministic timing analysis. The impact of process variation on full chip leakage power is investigated by Chang et al. [8].

Soft error analysis and optimization for nanometer circuits have been investigated intensively. For example, the impact of scaling on SER (soft error rate) [4] and the effect of threshold voltage on SER [9] have been studied, and circuit optimization techniques to reduce SER were proposed [10]. However, the influence of process variation on soft error has only gained attention recently. In [5], worst case (or corner-based) analysis of Qcritical fluctuations has been investigated. Gate length variation and threshold voltage variation were found to affect Qcritical considerably, while the variation caused by gate oxide thickness variability was found to be small.

III. EXPERIMENTAL SETUP
Soft error rate in circuits can be estimated using an empirical model [2], which describes the relationship between SER and Qcritical:

$$\mathrm{SER} \propto N_{flux} \times CS \times e^{-Q_{critical}/Q_S} \tag{1}$$

In (1), Nflux refers to the intensity of the neutron flux, CS is the cross section area of the node, and QS is the charge collection efficiency. Qcritical is the minimum charge that can cause a bit flip.

Particles that strike the silicon bulk will deposit a track of carriers. The carriers may recombine and form a very short current pulse at the circuit node. (2) can be used to estimate this effect [10].

$$I(t) = I_{peak} \times \left(e^{-t/\tau_\alpha} - e^{-t/\tau_\beta}\right) \tag{2}$$

Ipeak is the amplitude of the pulse, τα is the collection time constant, and τβ is the ion-track establishment time constant. The pulse shape is illustrated in Fig. 1. The value of Qcritical is estimated by SPICE simulation. A current source that models the particle-strike-induced current pulse is injected into the circuit node that is hit by particles. A series of simulation runs is used to determine the minimum amount of charge required to flip the node.

Figure 1: Shape of the Current Source used to Model Particle Strike.

Four types of storage elements implemented with BSIM models at different technology nodes are studied. The HSPICE models used are the 100 nm, 70 nm, and 45 nm Berkeley Predictive Technology Models [11]. For each storage element, the most vulnerable node of the circuit is chosen to measure Qcritical. Please refer to [5,9] for detailed schematics.

We assume gate length has a Gaussian distribution with 15% 3σ variation and threshold voltage is also Gaussian distributed with 13% 3σ variation. These values are taken from an industry source. The impact of gate oxide thickness (Tox) variation on Qcritical was found to be small [5] and therefore we ignore it in our analysis. We also consider variation correlations in our analysis.

Correlation between gate lengths can be modeled as a function of the distance x between devices [12,13]. (3) can be used to describe this function.

$$\rho = \begin{cases} 1 - \dfrac{x}{X_{corr}}\,(1 - \rho_{min}), & 0 < x \le X_{corr} \\ \rho_{min}, & x \ge X_{corr} \end{cases} \tag{3}$$

Xcorr refers to the correlation length and ρmin refers to the correlation baseline. However, the correlation will never reach 1, because even very close devices still show variation. A typical maximum value of correlation is about 0.8 [13]. Because we only study small memory cells in this paper, this peak value of the correlation coefficient is chosen to describe the correlation between gate lengths. Threshold voltage correlation is assumed weak according to [12]. The impact of different correlation coefficients is studied in Section 6.

IV. EXPERIMENTAL RESULTS
Tables I, II, III, and IV present simulation results of Qcritical variation (3σ/mean) for SRAM, TGFF, C2MOSFF, and Dynamic Latch, respectively. Compared to the worst case analysis, Monte Carlo analysis considering spatial correlation is more accurate and less pessimistic. The 3σ variation is typically under 10% of the mean and can be as large as 13.6%. Due to the exponential relationship between SER and Qcritical, the impact of process variation still cannot be ignored.

TABLE I. Qcritical Variation from SRAM.
Parameter   Flip    45 nm    70 nm    100 nm
Lgate       1->0    12.6%    7.7%     11.6%
Lgate       0->1    13.6%    5.2%     5.3%
Vth         1->0    5.8%     5.2%     5.9%
Vth         0->1    6.0%     4.5%     4.8%

TABLE II. Qcritical Variation from TGFF.
Parameter   Flip    45 nm    70 nm    100 nm
Lgate       1->0    12.4%    6.2%     6.7%
Lgate       0->1    12.6%    3.5%     3.8%
Vth         1->0    5.4%     4.2%     4.4%
Vth         0->1    5.7%     3.9%     4.1%

TABLE III. Qcritical Variation from C2MOS.
Parameter   Flip    45 nm    70 nm    100 nm
Lgate       1->0    8.4%     8.4%     5.9%
Lgate       0->1    10.1%    5.0%     5.0%
Vth         1->0    4.1%     4.8%     5.4%
Vth         0->1    5.8%     4.9%     5.8%

TABLE IV. Qcritical Variation from Dynamic Latch.
Parameter   Flip    45 nm    70 nm    100 nm
Lgate       1->0    1.1%     5.7%     5.6%
Lgate       0->1    0.5%     5.8%     6.2%
Vth         1->0    0.5%     0.5%     0.6%
Vth         0->1    0.6%     0.4%     0.5%
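The 3σ/mean figures in Tables I to IV come from Monte Carlo runs over Gaussian gate lengths and threshold voltages. The Python sketch below covers only the sampling side of such a run: the correlation function of (3), a pair of correlated Gaussian gate lengths, and the 3σ/mean statistic. It is an illustration under stated assumptions, not the authors' flow: all function names are hypothetical, only two transistors are sampled, and the SPICE step that turns each sample into a Qcritical value is omitted.

```python
import math
import random
import statistics


def spatial_correlation(x, x_corr, rho_min):
    """Correlation model of (3): linear fall-off up to x_corr, then rho_min."""
    if x >= x_corr:
        return rho_min
    return 1.0 - (x / x_corr) * (1.0 - rho_min)


def sample_gate_length_pair(nominal, three_sigma_frac, rho, rng):
    """Draw two Gaussian gate lengths whose correlation coefficient is rho."""
    sigma = nominal * three_sigma_frac / 3.0
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    # Cholesky factor of the 2x2 correlation matrix [[1, rho], [rho, 1]].
    l1 = nominal + sigma * z1
    l2 = nominal + sigma * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return l1, l2


def three_sigma_over_mean(samples):
    """The variation metric used in the tables: 3 * sigma / mean."""
    return 3.0 * statistics.pstdev(samples) / statistics.fmean(samples)


def sample_correlation(xs, ys):
    """Pearson correlation coefficient of two equally long sample lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# Example: 70 nm nominal gate length, 15% 3-sigma variation, rho = 0.8.
rng = random.Random(0)  # fixed seed so the run is repeatable
pairs = [sample_gate_length_pair(70.0, 0.15, 0.8, rng) for _ in range(20000)]
first = [p[0] for p in pairs]
second = [p[1] for p in pairs]
print("3 sigma / mean of L1:", three_sigma_over_mean(first))
print("sample correlation  :", sample_correlation(first, second))
```

In a full run, each sampled parameter set would be passed to the circuit simulator, and `three_sigma_over_mean` would be applied to the resulting Qcritical values rather than to the gate lengths.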

V. Qcritical DISTRIBUTION MODEL
Monte Carlo simulation needs a long time to produce a reliable result, which makes it impractical for large circuit analysis. The impact of different distributions of gate length and threshold voltage on the Qcritical distribution needs to be investigated swiftly and accurately. In this section, we propose a simple empirical model to estimate the impact of gate length variation and threshold voltage variation on Qcritical. Results are calculated directly from two simple equations, which is much more efficient than Monte Carlo simulation. The time efficiency of the model makes it suitable for circuit level design exploration and tool implementation.

The impact of gate length and threshold voltage variation on Qcritical can be expressed by two simple equations, respectively:

$$Q_{approximate}^{L_{gate}} = a_1 \times L_1 + a_2 \times L_2 + \dots + b_1 \times L_1^2 + b_2 \times L_2^2 + \dots + C \tag{4}$$

$$Q_{approximate}^{V_{th}} = d_1 \times V_{th1} + d_2 \times V_{th2} + \dots + D \tag{5}$$

Li and Vthi are the gate length and threshold voltage of the ith transistor that affects Qcritical of the circuit.

For gate length variation, the second order part is needed to improve estimation precision, but the first order part is much more important. For threshold voltage variation, the first order equation is precise enough. Because the first order part of both equations dominates, Qcritical can be estimated as a sum of Gaussian distributed random variables. This implies that the distribution of Qcritical is Gaussian-like, because a linear combination of jointly Gaussian variables is itself Gaussian.

The estimation error of the standard deviation, |σmonte-carlo - σmodel| / σmonte-carlo, for SRAM, TGFF and C2MOS is listed in Tables V, VI and VII. The 3σ variation of the dynamic latch is very small compared to the other circuits, so it is not studied. The tables show that the error in estimating the standard deviation of Qcritical varies from 0.0% to 3.7%, but is typically less than 1.0%.

TABLE V. Estimation Errors for SRAM.
Parameter   Flip    45 nm    70 nm    100 nm
Lgate       1->0    0.5%     0.1%     0.0%
Lgate       0->1    0.6%     0.0%     0.0%
Vth         1->0    0.1%     0.0%     0.0%
Vth         0->1    0.0%     0.0%     0.0%

TABLE VI. Estimation Errors for TGFF.
Parameter   Flip    45 nm    70 nm    100 nm
Lgate       1->0    0.5%     0.0%     0.1%
Lgate       0->1    0.3%     3.5%     2.9%
Vth         1->0    3.7%     0.1%     0.1%
Vth         0->1    0.0%     3.5%     2.5%

TABLE VII. Estimation Errors for C2MOS.
Parameter   Flip    45 nm    70 nm    100 nm
Lgate       1->0    0.6%     0.0%     0.1%
Lgate       0->1    0.5%     0.9%     2.8%
Vth         1->0    0.2%     0.1%     0.0%
Vth         0->1    0.1%     1.7%     2.0%
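To make the model of (4) and (5) concrete, the following Python sketch evaluates both equations and computes the standard deviation implied by the first-order terms for correlated Gaussian inputs. It rests on assumptions: the paper does not state how the coefficients are fitted or how σ is derived from the equations, so the function names, the single shared correlation coefficient, and the coefficient values in the example are all hypothetical.

```python
import math


def q_from_gate_lengths(lengths, a, b, c):
    """Evaluate (4): sum(a_i * L_i) + sum(b_i * L_i**2) + C."""
    linear = sum(ai * li for ai, li in zip(a, lengths))
    quadratic = sum(bi * li * li for bi, li in zip(b, lengths))
    return linear + quadratic + c


def q_from_thresholds(vths, d, d0):
    """Evaluate (5): sum(d_i * Vth_i) + D."""
    return sum(di * vi for di, vi in zip(d, vths)) + d0


def first_order_sigma(coeffs, sigmas, rho):
    """Standard deviation of sum(coeffs[i] * X_i) for jointly Gaussian X_i.

    Assumes every distinct pair (i, j) shares one correlation coefficient rho.
    """
    var = 0.0
    for i, (ci, si) in enumerate(zip(coeffs, sigmas)):
        for j, (cj, sj) in enumerate(zip(coeffs, sigmas)):
            r = 1.0 if i == j else rho
            var += ci * cj * r * si * sj
    return math.sqrt(max(var, 0.0))  # clamp tiny negative rounding errors


# Example with made-up coefficients (NOT fitted values from the paper).
sigma_l = 70.0 * 0.15 / 3.0  # 15% 3-sigma variation of a 70 nm gate length
for rho in (0.2, 0.8, 0.999):
    print(rho, first_order_sigma([2.0, -1.5], [sigma_l, sigma_l], rho))
```

Because each evaluation is a handful of multiplications, sweeping the input variation or the correlation coefficient is cheap compared with repeating the circuit simulations.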

Fig. 2 illustrates the variation of Qcritical caused by gate length variability. The "Simulation Result" curve comes from the simulation results of a 70 nm SRAM for a 1-0 flip (the SRAM bit flips from 1 to 0 due to particle strikes). The "Estimated Normal Distribution" curve is a Gaussian distribution which has the same mean and standard deviation.

Figure 2: Normalized Cumulative Distribution of Qcritical.

VI. IMPACT OF VARIATION ON Qcritical VARIABILITY
The new distribution model eases the calculation of Qcritical variability. In addition, the impact of variation on Qcritical variability can be estimated efficiently. In this section, we analyze the Qcritical variability caused by different 3σ gate length and threshold voltage variations, using a 70nm SRAM circuit as the benchmark. The 3σ variations are listed in Tables VIII and IX. The estimated cumulative distributions of Qcritical are shown in Fig. 3 and 4. These results can be obtained in several seconds, whereas the corresponding Monte Carlo simulations can take more than two days to finish.

TABLE VIII. Threshold Voltage Impact.
3σ Variation of Vth          12%     8%      4%
3σ Variation of Qcritical    5.1%    3.3%    1.6%

TABLE IX. Gate Length Impact.
3σ Variation of Lgate        15%     10%     5%
3σ Variation of Qcritical    7.7%    5.3%    2.5%

Using our empirical model, we also study the dependence of Qcritical variability on the correlation between gate lengths. The correlation coefficients and Qcritical 3σ variations are listed in Table X. The estimated cumulative distributions of Qcritical are shown in Fig. 5. Correlation has a large impact on Qcritical variability: the Qcritical variation is very small when the gates are perfectly matched. According to [5], two of the transistors are much more important than the other four transistors in an SRAM cell design. Their gate length coefficients in (4) have opposite signs. When these two transistors are perfectly matched, their effects on Qcritical variation cancel out.
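The cancellation can be made explicit with a short derivation. As simplifying assumptions that the paper does not state, keep only the two dominant gate lengths $L_1$ and $L_2$ in the first-order part of (4), give both the same standard deviation $\sigma_L$ and correlation coefficient $\rho$, and let their coefficients have equal magnitude and opposite sign ($a_2 = -a_1$).

```latex
\begin{align*}
Q &\approx a_1 L_1 + a_2 L_2 + C \\
\operatorname{Var}(Q) &= a_1^2 \sigma_L^2 + a_2^2 \sigma_L^2 + 2 \rho\, a_1 a_2\, \sigma_L^2 \\
 &= 2 a_1^2 \sigma_L^2 \,(1 - \rho) \qquad \text{for } a_2 = -a_1 \\
\sigma_Q &= |a_1|\, \sigma_L \sqrt{2\,(1 - \rho)}
\end{align*}
```

Under these assumptions $\sigma_Q$ shrinks as $\sqrt{1-\rho}$ and vanishes at $\rho = 1$. The scaling predicts a factor of $\sqrt{0.8/0.2} = 2$ between $\rho = 0.2$ and $\rho = 0.8$, close to the ratio of the 15.5% and 7.7% values reported in Table X. The values in Table X fall more slowly near $\rho = 1$ (1.6% at $\rho = 0.999$), which suggests that the remaining transistors and unequal coefficient magnitudes, both ignored here, also contribute.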

Figure 3: Normalized Cumulative Distribution of Qcritical of different Gate Length Variations.

Figure 4: Normalized Cumulative Distribution of Qcritical of different Threshold Voltage Variations.

TABLE X. Correlation Impact.
Correlation Coefficient      0.2      0.6      0.8     0.9     0.95    0.999
3σ Variation of Qcritical    15.5%    11.3%    7.7%    5.7%    4.3%    1.6%

Figure 5: Normalized Cumulative Distribution of Qcritical of different Correlations.

VII. CONCLUSIONS
The impact of process variation on critical charge distribution is investigated in this paper. Instead of worst case (or corner-based) analysis, Monte Carlo analysis is used. Four types of storage elements implemented with BSIM models at different technology nodes are studied. Compared to the worst case analysis, Monte Carlo analysis considering spatial correlation is less pessimistic. The 3σ variation is typically under 10% of the mean and can be as large as 13.6%. Still, due to the exponential relationship between SER (soft error rate) and Qcritical, the impact of process variation cannot be ignored.

We also present a simple model to estimate Qcritical variation without Monte Carlo simulation. The simulation results show that the error in estimating the standard deviation of Qcritical varies from 0.0% to 3.7%. Based on this model, the dependence of Qcritical variation on gate length variation, threshold voltage variation, and correlation between gate lengths is studied efficiently, using 70nm SRAM as benchmark circuit.

REFERENCES
1. K. A. Bowman, S. G. Duvall and J. D. Meindl, "Impact of die-to-die and within-die parameter fluctuations on the maximum clock frequency distribution for gigascale integration," JSSC, Vol. 37, No. 2, pp. 183-190, Feb. 2002.
2. P. Hazucha and C. Svensson, "Impact of CMOS technology scaling on the atmospheric neutron soft error rate," IEEE Trans. Nucl. Sci., Vol. 47, No. 6, pp. 2586-2594, 2000.
3. P. Shivakumar, M. Kistler, S. Keckler, D. Burger, and L. Alvisi, "Modeling the effect of technology trends on the soft error rate of combinational logic," ICDSN, pp. 389-398, 2002.
4. N. Seifert, X. Zhu, and L. W. Massengill, "Impact of scaling on soft-error rates in commercial microprocessors," IEEE Trans. Nucl. Sci., Vol. 49, pp. 3100-3106, 2002.
5. Q. Ding, R. Luo and Y. Xie, "Impact of process variation on soft error vulnerability for nanometer VLSI circuits," ASICON, pp. 1023-1026, 2005.
6. D. Marculescu and E. Talpes, "Variability and energy awareness: a microarchitecture-level perspective," DAC, pp. 11-16, 2005.
7. C. Viswesvariah, "Statistical timing of digital integrated circuits," ISSCC, 2004.
8. H. Chang and S. S. Sapatnekar, "Full-chip analysis of leakage power under process variations, including spatial correlations," DAC, pp. 523-528, 2005.
9. V. Degalahal, R. Ramanarayanan, N. Vijaykrishnan, Y. Xie and M. J. Irwin, "The effect of threshold voltages on the soft error rate," ISQED, pp. 503-508, San Jose, CA.
10. Q. Zhou and K. Mohanram, "Cost-effective radiation hardening technique for logic circuits," ICCAD, pp. 100-106, 2004.
11. Y. Cao, T. Sato, D. Sylvester, M. Orshansky and C. Hu, "New paradigm of predictive MOSFET and interconnect modeling for early circuit design," CICC, pp. 201-204, 2000.
12. Y. Cao and L. T. Clark, "Mapping statistical process variations toward circuit performance variability: an analytical modeling approach," DAC, pp. 658-663, 2005.
13. P. Friedberg, Y. Cao, J. Cain, R. Wang, J. M. Rabaey and C. Spanos, "Modeling within-die spatial correlation effects for process-design co-optimization," ISQED, 2005.