On the Superiority of DO-RE-ME / MPG-D Over Stuck-at-Based Defective Part Level Prediction*

Jennifer Dworak, Michael R. Grimaila, Brad Cobb, T-C. Wang, Li-C. Wang, and M. Ray Mercer
Department of Electrical Engineering, Computer Engineering Group
Texas A&M University, College Station, Texas 77843-3259
E-mail: [email protected]

*This material is based upon work supported under a National Science Foundation Graduate Research Fellowship. This work was also supported by the Texas Advanced Technology Program, Project No. 036327-152.

Abstract
The focus of the research program which produced this paper is twofold: (1) to develop accurate defective part level predictors (MPG-D is our best mathematical model to date) [DWOR00], and (2) to use such predictors to produce superior test pattern generation techniques (DO-RE-ME is currently our best approach) [GRIM99]. In this paper, we first present data based upon a novel analysis of a benchmark logic circuit (C432) [BRGL85] to demonstrate that the sets of test patterns that detect actual defects (such as bridges, i.e., shorted circuits) are dramatically different from the test pattern sets produced using the single stuck-at fault model as currently practiced in industry. Next, we identify that a minor modification to standard stuck-at fault simulation (in particular, measuring the number of times a stuck-at fault is detected) can produce a significantly superior basis for defective part level prediction. Finally, we describe the MPG-D model in detail and compare the accuracy of its defective part level predictions with those based upon single stuck-at fault coverage [WILL81]. Data used in these comparisons were produced as a result of testing an actual industrial circuit with two separate test pattern sets and recording the number of failing parts as a function of the test pattern applied.
In this paper we use data collected from benchmark circuit simulations to examine the relationship between the tests which detect stuck-at faults and those which detect bridging surrogates. We show that the coefficient of correlation between these tests approaches zero as the stuck-at fault coverage approaches 100%. An enhanced version of the MPG-D model, which is based upon the number of detections of each site in a logic circuit, is shown to be superior to stuck-at fault coverage-based defective part level prediction. We then compare the accuracy of both predictors for an industrial circuit tested using two different test pattern sequences.
1
Introduction
After the design of an integrated circuit is complete, but before first silicon is returned from the manufacturing facility, a set of test patterns is prepared to separate the good parts from those which are defective. The objective of applying this test pattern set to every manufactured part is to reduce the fraction of defective parts that are erroneously sold to customers as defect-free parts. Ideally, the defective part level would be accurately estimated based upon analysis of the circuit structure and the applied test pattern set. If the expected defective part level exceeds some specified value, then either the test pattern set or (in extreme cases) the design can be modified to achieve an adequate quality level. In fact, at this time defective part level prediction is rarely, if ever, utilized in a commercial environment to assess the adequacy of a test pattern set to achieve a targeted defective part level - because the accuracy of existing techniques/models is inadequate.
2
A Quantitative Evaluation of the Correlation between the Single Stuck-at Fault Model and Bridging Defects

In the most general sense, a fault model restricts the set of all possible test patterns, and only patterns that are elements of this set can be used in the testing process. For example, consider single stuck-at faults. One by one, a fault is selected and one test for that fault is determined (usually via ATPG). As more tests are produced, fault coverage increases, the set of faults that remain undetected shrinks, and so does the set of all possible patterns that detect at least one of the remaining undetected faults. In the limit, when the stuck-at fault coverage is 100%, the set of all possible additional test patterns is empty.
In a similar way, there exists for the same integrated circuit the set of all possible defects, and each nonredundant defect has some set of patterns that will detect that defect. As test patterns are successively applied, more and more defects are detected, and the set of test patterns that detect at least one remaining undetected defect shrinks. If there is a perfect match between the faults and defects, then when the fault coverage reaches 100%, all defects have been detected as well, and the set of all possible tests for at least one undetected defect is also empty. In contrast, for real circuits, a mismatch exists between the fault set and the defect set such that even when the fault coverage is 100%, some defects remain undetected and the set of all tests that detect at least one remaining defect is not empty.

The essence of successful defective part level reduction via testing involves the amount of overlap between the set of all tests for undetected faults and the set of all tests for undetected defects. When this overlap is high, there is a large probability that a test selected to detect a fault also detects at least one defect. In contrast, if the overlap is very small, the probability of detecting at least one defect using a test for a fault is also small. More formally, we can consider the event "a test detects a fault" to be a predictor for the event "the same test detects at least one defect." With this formulation, we can calculate the (four-point) coefficient of correlation between faults and defects as a function of fault coverage.
Early research explored the use of random patterns as a means to enhance defect detection [AGRA72][SCHN75][LOSQ76]. Although this technique showed promise, especially for built-in self-test applications, it has the drawback that it requires an excessive number of test patterns to be effective. It was also found that targeting stuck-at faults may detect many bridges and vice versa [MEI74][MILL88]. Subsequently, researchers have attempted to characterize the exact nature of defects to create more accurate models for test generation [FERG91]. To accomplish this, the fault simulation engine is modified to allow the simulation of the "enhanced fault model" to more accurately emulate real defects encountered in the manufacturing process. Unfortunately, using complex defect models during the ATPG process is too costly in both time and memory. Multiple fault models with multiple testing methods have also been proposed and studied [MAX92A][MAX92B]. However, with millions of transistors on a single device, it is clear that it is not practical to generate deterministic test sets for an infinite set of defect models. The massive scale of resources required to deterministically generate robust tests has provided the motivation for our probabilistic approach to detecting defects.
Figure 1: Venn diagrams of the testing space: (a) FC = 17.8%, (b) FC = 80.3%, (c) FC = 95.1%. W_I represents the number of tests that detect both defects and faults, W_II represents the number of tests that detect only defects, W_III represents the number of tests that detect neither defects nor faults, and W_IV represents the number of tests that detect only faults.

Fig. 1 contains a set of three Venn diagrams to depict this process for circuit C432. Fig. 1a shows the Venn diagram that exists early in the testing process when the fault coverage is 17.8%. Call the set of all possible tests for at least one undetected fault "SF." This set is represented by the dark oval. Call the set of all possible tests for at least one undetected defect "SD." This set is represented by the light oval. If we think of fault detection as a predictor of defect detection, then we can define four possible disjoint events. The intersection of SF with SD is the event that a test for a fault also detects at least one defect, and in Fig. 1 we associate that event with the point (1,1) on the graph (in quadrant I). The number of tests that satisfy these conditions determines the "weight," W_I, in quadrant I. The intersection of the complement of SF with SD is the event that a test that detects no fault detects at least one defect, and in Fig. 1 we associate that event with the point (-1,1) on the graph (in quadrant II). The intersection of the complements of SF and SD is the event that a test detects no fault and no defect. In Fig. 1, the number of such tests is associated with quadrant III. Finally, the number of
tests in the intersection of SF with the complement of SD are entered into quadrant IV. Fig. 1b represents the data when the fault coverage is 80.3% and shows a line that is the optimal estimator of defect detection based upon fault detection; the slope of this line (after the appropriate normalization) indicates the coefficient of correlation between fault detection and defect detection. Fig. 1c represents the data when the fault coverage is 95.1% and shows a line that is a poor estimator of defect detection based upon fault detection; the slope of this line (after the appropriate normalization) indicates the coefficient of correlation between fault detection and defect detection. Fig. 1a through 1c show how SF and SD change in general with increasing fault coverage. When the fault coverage is low, both SF and SD are large, and, as shown in Fig. 1a and 1b, both the amount of resulting overlap between tests and the coefficient of correlation are high. As the fault coverage increases, the amount of overlap and the coefficient of correlation decline. In the limit, when the fault coverage is 100%, the coefficient of correlation will become zero for real situations.

Figure 2: Coefficient of correlation between faults and defects (undetected stuck-at faults vs. undetected bridging surrogates) as a function of fault coverage.

Fig. 2 shows an actual plot of the coefficient of correlation between faults and defects for the C432 benchmark circuit. The set of all non-redundant stuck-at faults was used, and about six times as many bridging surrogates (half AND shorts and half OR shorts) were used to model the defects of interest. Ordered Binary Decision Diagrams were used to calculate the set of all possible test patterns for each fault and each defect. The number of tests for each of the four cases above was calculated by simple logic operations (OR and AND) between the appropriate OBDDs. It is important to note that the coefficient of correlation declines rapidly after about 80% fault coverage.
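The quadrant weights and the resulting correlation can be reproduced with elementary set operations once the test sets SF and SD are available (however they are obtained, e.g., from OBDDs). The sketch below is illustrative only; the toy universe of tests and the helper function name are ours, not part of the original experiment.

```python
from math import sqrt

def phi_coefficient(tests, SF, SD):
    """Four-point (phi) correlation between the events "a test detects a fault"
    (membership in SF) and "the same test detects at least one defect" (SD)."""
    w1 = len(tests & SF & SD)    # quadrant I: detects a fault and a defect
    w2 = len((tests - SF) & SD)  # quadrant II: detects only a defect
    w3 = len(tests - SF - SD)    # quadrant III: detects neither
    w4 = len((tests & SF) - SD)  # quadrant IV: detects only a fault
    denom = sqrt((w1 + w2) * (w3 + w4) * (w1 + w4) * (w2 + w3))
    return (w1 * w3 - w2 * w4) / denom if denom else 0.0

# Toy example: 8 possible tests, most fault-detecting tests also hit a defect.
tests = set(range(8))
SF = {0, 1, 2, 3}   # tests that detect at least one undetected fault
SD = {0, 1, 2, 4}   # tests that detect at least one undetected defect
print(phi_coefficient(tests, SF, SD))   # 0.5 for this toy data
```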
This is extremely bad news for those who wish to use stuck-at fault coverage to predict defective part levels. For example, the Williams-Brown model estimates defective part level as
$$DL = 1 - \text{yield}^{\,(1 - \text{fault coverage})} \qquad (1)$$
Just in the interval (near 100%) where it is most important that fault coverage be an accurate predictor of defect coverage, the predictor is at its extreme worst! In such a situation, it is highly desirable to develop an alternate, and more accurate, method for defective part level prediction.
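Equation (1) is straightforward to evaluate; a minimal sketch follows (the yield and coverage values are arbitrary, chosen only for illustration).

```python
def williams_brown_dl(process_yield, fault_coverage):
    """Williams-Brown model: DL = 1 - yield^(1 - fault coverage)."""
    return 1.0 - process_yield ** (1.0 - fault_coverage)

# Example: 90% yield at 97% stuck-at fault coverage.
print(williams_brown_dl(0.90, 0.97))   # about 0.0032, i.e., roughly 3156 defective parts per million
```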
3
The MPG-D Defective Part Level Prediction Model
Of course, the true purpose of testing is to reduce the number of undetected defective parts, and this decrease is measured as a reduction in the defective part level. Defective parts are detected when tests are applied which both excite the defect and propagate the incorrect value that occurs at the defect sites to a primary output. Thus, both excitation of defects and observation of circuit sites must occur for the defective part level to decrease, and the new defective part level model must therefore take both of these factors into account [DWOR99]. MPG-D assumes that defects are evenly distributed across all circuit sites and initially assigns to each circuit site i an equal share of the overall defect level.
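A minimal sketch of this even-distribution initialization is shown below; the number of sites and the choice of the starting overall level (taken here as the zero-coverage Williams-Brown value, 1 - yield) are illustrative assumptions of ours, not values from the text.

```python
def init_site_contributions(num_sites, initial_overall_dl):
    """Spread an assumed initial defective part level evenly over all circuit sites."""
    return [initial_overall_dl / num_sites] * num_sites

# Hypothetical example: 10,000 sites, 90% yield, so the starting level is 1 - 0.90.
dl = init_site_contributions(10_000, 1.0 - 0.90)
```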
Thus, the overall defect level is merely the sum of the contributions from each individual site. Once a test pattern has been applied, some of the circuit sites have been observed, some of the defects have been excited, and some of the defects have been detected. Then the defect level becomes:

$$DL(v) = \sum_{i=1}^{\#\text{-of-sites}} DL_i(v)$$
where each site's defective part level contribution has changed after the application of a test vector according to the following equation:
$$DL_i(v+1) = DL_i(v)\,(1 - A \cdot Pexcite_i)^{observed_i} - C \left[ \sum_{j=1}^{\#\text{-of-sites}} \Delta DL_j \right] \frac{DL_i(v)}{\sum_{k=1}^{\#\text{-of-sites}} DL_k(v)}$$
The probability of excitation, Pexcite_i, is defined as follows:

$$Pexcite_i = e^{-\#obs_i/\tau}$$

where #obs_i is the number of times site i has been observed so far and τ is a time constant,
and ΔDL_j is defined as:

$$\Delta DL_j = \begin{cases} DL_j(v) \cdot A \cdot Pexcite_j, & \text{if } observed_j = 1, \\ 0, & \text{if } observed_j = 0. \end{cases}$$
In this equation, Pexcite_i is a function of the number of times that site i has been observed and is calculated using the decaying exponential function with time constant τ (the time constant for this experiment is chosen to be 2.0 in order to match the data). The constant A is a measure of what fraction of the site's contribution to the defective part level will be removed given that at least one as yet undetected defect is excited and observed. Thus, when this constant is multiplied by the probability of exciting at least one undetected defect (given that the site is observed), we have a value for the average reduction in this site's current contribution to the defective part level (given that it was observed for the current number of observations). Subtracting this quantity from one gives the fraction of the site's defective part level contribution which remains given that the site is observed. This entire quantity is raised to the power of observed_i, which is equal to one if site i was observed on the current vector or zero if the site was not observed. Thus, observation of some sites is required to reduce the defective part level contribution. Finally, this entire quantity is multiplied by the site's current contribution to the defective part level in order to find the new contribution value. However, an additional point must be taken into account: a single defect may affect more than one site. For example, if we assume that all of the defects are AND/OR bridging defects, then they will involve two sites. The defective part level contribution of one of these bridging surrogates should really be divided equally between the involved sites. When a defect is detected at either of the two sites, equal portions of the reduction in DL contribution should be removed from each of these sites.

The constant C determines the portion of the defective part level removal at an observed site which must also be removed from somewhere else. Thus, we are assuming that a certain reduction in the defective part level should occur at a given site based upon how many times it has been observed so far (and thus its current probability of excitation) and the fact that it was observed by the current vector. However, a certain percentage of its defect level contribution (or all of it if we assume bridging defects) also involves other sites. In the case of bridging defects, the defect level contribution reduction which occurs at this site is only one half of the overall reduction that should occur because of the detection of defects at this site. Thus, this remaining reduction must be removed in some way. Ideally, we would remove it from the exact sites which are bridged to the current site and whose bridges were detected. Since we do not know which sites these would be, we divide the extra reduction among all the sites, so that every site experiences some corresponding decrease in its defective part level contribution. We make the amount of additional removal from the other sites proportional to the defective part level contribution which remains at that site. In other words, if a site contributes 10% of the overall defective part level, 10% of the additional removal will be taken from that site. If a second site contributes 1% of the overall defective part level, then it will receive a 1% share of the additional reduction.
One more simplifying assumption is made. Since the additional reduction in the defective part level at any particular site due to a reduction at the current observed site is relatively small (because it is divided among many sites), the error introduced by allowing some of the additional reduction to occur at the current site tends to be very small. Thus, it is not strictly true that exactly one half of the defective part level reduction due to the excitation and observation at a given site is attributed to that site, since the additional removal of defect level occurs at this site also. (The same overall reduction in defective part level occurs in both cases. It is merely how much of that reduction occurs at each site which is affected.) However, making this modification allows us to simplify and shorten the overall calculation process tremendously. Instead of trying to distribute the overall reduction for each site separately, we can collect this additional defective part level reduction in a running total for a given vector based upon the reduction at all of the observed sites. Then, after this analysis is completed, we can redistribute the running total among all of the sites. Thus, the redistribution loop must only be entered once for each vector instead of once for each site that was observed and had a corresponding reduction in the defective part level. Once all of the additional defective part level reduction is redistributed, the defective part level contributions of all of the sites are summed together to obtain the final defective part level.
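The per-vector bookkeeping described above can be summarized in a short routine. This is a sketch based on our reading of the equations and the running-total simplification, not the authors' code; the data structures, the clamping at zero, and the exact point at which the observation count is incremented are our own choices. The default constants are the values reported later in the paper.

```python
import math

def mpgd_vector_update(dl, observed, num_obs, A=0.2751, C=3.0, tau=2.0):
    """Apply one test vector's MPG-D update to the per-site contributions.

    dl       : per-site defective part level contributions DL_i(v)
    observed : 1 if site i was observed by this vector, else 0
    num_obs  : number of times each site has been observed so far
    Returns the updated contributions DL_i(v+1) and the updated counts.
    """
    new_dl = list(dl)
    extra = 0.0                                      # running total to redistribute
    for i, obs in enumerate(observed):
        if obs:
            p_excite = math.exp(-num_obs[i] / tau)   # decaying excitation probability
            delta = dl[i] * A * p_excite             # reduction credited to site i
            new_dl[i] -= delta
            extra += C * delta                       # share owed by the other sites
            num_obs[i] += 1
    # Redistribute the running total in proportion to each site's remaining
    # contribution (the simplification lets observed sites share in it too).
    total = sum(new_dl)
    if total > 0.0:
        new_dl = [max(x - extra * (x / total), 0.0) for x in new_dl]
    return new_dl, num_obs

def predicted_defect_level(dl):
    """Overall predicted defective part level: the sum of the per-site contributions."""
    return sum(dl)
```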
4
A Comparison of Traditional Defective Part Level Predictors with MPG-D on an Actual Industrial Circuit
Our defective part level model (MPG-D) has previously been tested through surrogate simulation of seven of the ISCAS'85 benchmark circuits [DWOR00][BRGL85]. We now examine results from two commercial experiments conducted on the same design with the same test pattern sets. The data was collected on two different lots of production wafers. The commercial integrated circuit, consisting of more than 75,000 two-input NAND equivalent logic gates, was tested using two different test pattern sets. We will designate the first test pattern set as COMMERCIAL because it is exactly what is used in the standard manufacturing test flow. Because the chip was designed with 100% scannable flip-flops, ATPG was performed using the Mentor Graphics FASTSCAN program. The COMMERCIAL test pattern set consisted of approximately 3,000 test patterns where each test pattern was applied using one scan chain load/unload. The stuck-at fault coverage for this test pattern set was just above 97%. We will designate the second test pattern set as RESEARCH because it was produced using a new ATPG method we call DO-RE-ME (Deterministic Observation, Random Excitation, and MPG Estimation). The RESEARCH set of test patterns was also produced by the Mentor Graphics FASTSCAN program; the length of this test pattern set was exactly the same as the COMMERCIAL set, and its stuck-at fault coverage was 96.7%. However, the RESEARCH set differed from standard commercial practice in that the number of observations at sites which were difficult to observe was maximized using the DO-RE-ME method.

In the first commercial experiment conducted in October 1998, 6,986 die passed all parametric tests, and these were tested using the two test pattern sets described above; 220 were declared defective using the COMMERCIAL test pattern set, and 229 were declared defective using the RESEARCH test pattern set. We arbitrarily assume that twelve defective die were never detected by either of the test pattern sets.

In the second commercial experiment conducted in February 1999, 20,591 die passed all parametric tests, and these were tested using the two test pattern sets described above; 245 were declared defective using the COMMERCIAL test pattern set, and 246 were declared defective using the RESEARCH test pattern set. We arbitrarily assume that twelve defective die were never detected by either of the test pattern sets.
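The measured curves in Figs. 3 through 6 can be derived directly from per-die tester results; the sketch below assumes each detected defective die is summarized by the index of the first vector that failed it. The function and field names and the example first-fail indices are hypothetical, but the die counts and the twelve assumed escapes follow the description above.

```python
def measured_dl_curve(first_fail_vectors, total_die, assumed_escapes, num_vectors):
    """Defective part level remaining after each applied test vector.

    first_fail_vectors : for each detected defective die, the 1-based index of
                         the first vector that failed it
    assumed_escapes    : defective die assumed never detected by either pattern set
    Returns dl[v] = (defective die still undetected after v vectors) / total_die.
    """
    dl = []
    for v in range(num_vectors + 1):
        undetected = sum(1 for f in first_fail_vectors if f > v) + assumed_escapes
        dl.append(undetected / total_die)
    return dl

# October 1998 experiment, COMMERCIAL set: 6,986 die and 220 detected defective die,
# plus twelve assumed escapes. The first-fail indices below are invented placeholders.
curve = measured_dl_curve([3, 15, 40, 1200], total_die=6986,
                          assumed_escapes=12, num_vectors=3000)
print(curve[0], curve[-1])
```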
Figure 3: Experimental results and defective part level prediction for COMMERCIAL vectors in October 1998 experiment
Figure 4: Experimental results and defective part level prediction for COMMERCIAL vectors in February 1999 experiment

In Fig. 3 to Fig. 6, we compare the measured experimental defective part level to the MPG-D predictions and the predictions of the Williams-Brown model. The constants for the MPG-D model were chosen to fit the defective part level data which resulted after the application of the RESEARCH vectors in the second experiment. We chose τ = 2.0, C = 3.0, and A = 0.2751. (C was chosen to be three because all of our observation data for the industrial circuit were in terms of faults instead of sites. Thus, if we assume bridging defects, each defect involves four faults corresponding to two sites. Therefore, the total reduction in defective part level contribution at an observed "fault" is only one fourth of the required reduction, and thus C should be 3.) These same constants were then used to make our MPG-D predictions for the other three experimental runs. MPG-D is a significant improvement over Williams-Brown in every case. Williams-Brown is too pessimistic at the beginning of each run and optimistic at the end. In addition, since the fault coverage is slightly higher for the set of COMMERCIAL vectors than for the set of RESEARCH vectors, Williams-Brown incorrectly predicts that the COMMERCIAL vectors should detect more defects. In reality, the RESEARCH vectors were always better.
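A worked restatement of the parenthetical argument above: if the reduction taken at the single observed fault accounts for only a fraction f of the reduction that the detected defect warrants, then C, the multiple of that reduction which must be removed elsewhere, is (1 - f)/f. With bridging defects measured at the fault level,

$$f = \frac{1}{2 \text{ sites} \times 2 \text{ stuck-at polarities}} = \frac{1}{4}, \qquad C = \frac{1 - f}{f} = \frac{3/4}{1/4} = 3.$$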
MPG-D was able to make very accurate predictions when the defects changed (October vs. February) and when the test patterns changed (RESEARCH vs. COMMERCIAL) in all but one instance: the application of the COMMERCIAL vectors in October. In this case, the COMMERCIAL vectors detected fewer defects than were predicted by either MPG-D or Williams-Brown (but the MPG-D prediction was closer). The exact reason for this error is unknown, but it may have been related to the nature of the defects which occurred. Finally, the effectiveness of the RESEARCH vectors was better estimated by the two defective part level models than the effectiveness of the COMMERCIAL vectors because the shape of the predicted curves for the RESEARCH vectors better matched what actually happened. Thus, the RESEARCH vectors and their test generation method seem to lend themselves to the ability to make more accurate predictions.
Figure 5: Experimental results and defective part level prediction for RESEARCH vectors in October 1998 experiment

Figure 6: Experimental results and defective part level prediction for RESEARCH vectors in February 1999 experiment
5
Conclusions
We have described the MPG-D defective part level model and the effectiveness of the DO-RE-ME test generation method. An analysis using OBDD simulation of the C432 benchmark circuit was used to illustrate how the overlap between the testing spaces of faults and defects, and the corresponding coefficient of correlation, change as fault coverage increases. Because the amount of the overlap decreases significantly as we approach 100% fault coverage, tests generated to detect the remaining stuck-at faults are considerably less effective at detecting the remaining defects. In addition, using fault coverage as a metric to predict defective part level becomes less reliable at high fault coverages.
The accuracy of defective part level prediction by MPG-D represents a significant enhancement over all previous approaches. Not only does the MPG-D model allow relatively more accurate DL predictions, it is also capable of quantitatively comparing the effectiveness of an ATPG strategy in terms of defective part level reduction.
Acknowledgments: The authors would like to express appreciation to Ken Butler and Bret Stewart of Texas Instruments for providing access to commercial fabrication facilities. We also value the stimulating comments and observations by Jaehong Park. Additionally, we would like to thank D.S. Ha for use of the academic ATPG tool ATLANTA and Don Ross of Mentor Graphics for the use of FASTSCAN, a commercial ATPG tool.
References
[AGRA72] Agrawal, V.D. and Agrawal, P., "An Automatic Test Generation System for Illiac IV Logic Boards," IEEE Trans. on Computers, Vol. C-21, No. 9, 1972, pp. 1015-1017.

[AGRA82] Agrawal, V.D., Seth, S.C., and Agrawal, P., "Fault coverage requirement in production testing of LSI circuits," IEEE Journal of Solid-State Circuits, Vol. SC-17, No. 1, February 1982, pp. 57-61.

[BRGL85] Brglez, F. and Fujiwara, H., "A Neutral Netlist of 10 Combinational Benchmark Circuits and a Target Translator in FORTRAN," Proc. Int. Symp. on Circuits and Systems, 1985.

[DWOR99] Dworak, J., Grimaila, M.R., Lee, S., Wang, L.C., and Mercer, M.R., "Modeling the probability of defect excitation for a commercial IC with implications for stuck-at fault-based ATPG strategies," Proc. International Test Conference, 1999, pp. 1031-1037.

[DWOR00] Dworak, J., Grimaila, M.R., Lee, S., Wang, L.C., and Mercer, M.R., "Enhanced DO-RE-ME Based Defect Level Prediction Using Defect Site Aggregation - MPG-D," Proc. International Test Conference, 2000.

[ELDR59] Eldred, R.D., "Test routines based upon symbolic logic statements," J. Assoc. Comput. Mach., Vol. 6, 1959, pp. 33-36.

[FERG91] Ferguson, F.J. and Larrabee, T., "Test pattern generation for realistic bridge faults in CMOS ICs," Proc. Int. Test Conf., 1991, pp. 492-499.

[GRIM99] Grimaila, M.R., Lee, S., Dworak, J., Butler, K.M., Stewart, B., Balachandran, H., Houchins, B., Mathur, V., Park, J., Wang, L.C., and Mercer, M.R., "REDO - Random Excitation and Deterministic Observation - First Commercial Experiment," Proc. VLSI Test Symposium, 1999, pp. 268-274.

[KAPUR92] Kapur, R., Park, J., and Mercer, M.R., "All tests for a fault are not equally valuable for defect detection," Proc. International Test Conference, 1992, pp. 762-769.

[LEE93] Lee, H.K. and Ha, D.S., "On the generation of test patterns for combinational circuits," Technical Report No. 12, Department of Electrical Engineering, Virginia Polytechnic Institute and State University, 1993.

[LOSQ76] Losq, J., "Referenceless Random Testing," Proc. Int. Symp. on Fault-Tolerant Computing, 1976, pp. 108-113.

[MAX92A] Maxwell, P.C. and Aitken, R.C., "IDDQ Testing as a Component of a Test Suite: The Need for Several Fault Coverage Metrics," J. of Electronic Testing (JETTA), Vol. 3, 1992, pp. 305-316.

[MAX92B] Maxwell, P.C., Aitken, R.C., Johansen, V., and Chiang, I., "The Effectiveness of IDDQ, Functional, and Scan Tests: How Many Fault Coverages Do We Need?," Proc. Int. Test Conf., 1992, pp. 168-177.

[MEI74] Mei, K.C., "Bridging and stuck-at faults," IEEE Trans. on Computers, Vol. C-23, No. 7, 1974.

[MILL88] Millman, S.D. and McCluskey, E.J., "Detecting bridging faults with stuck-at test sets," Proc. Int. Test Conf., 1988, pp. 773-783.

[PARK94] Park, J., Naivar, M., Kapur, R., Mercer, M.R., and Williams, T.W., "Limitations in predicting defect level based on stuck-at fault coverage," Proc. VLSI Test Symposium, 1994, pp. 186-191.

[SCHN75] Schnurmann, H.D., Lindbloom, E., and Carpenter, R.G., "The Weighted Random Test Pattern Generator," IEEE Trans. on Computers, Vol. C-24, No. 7, 1975, pp. 695-700.

[WANG95] Wang, L.C., Mercer, M.R., Williams, T.W., and Kao, S.W., "On the Decline of Testing Efficiency as Fault Coverage Approaches 100%," Proc. VLSI Test Symposium, 1995, pp. 74-83.

[WILL81] Williams, T.W. and Brown, N.C., "Defect level as a function of fault coverage," IEEE Trans. on Computers, Vol. C-30, No. 12, 1981, pp. 987-988.