
Cost-effective IR-Drop Failure Identification and Yield Recovery through a Failure-adaptive Test Scheme

Mingjing Chen and Alex Orailoglu
CSE Department, UC San Diego, La Jolla, CA 92093, USA
{mjchen,alex}@cs.ucsd.edu

Abstract—Ever-increasing test mode IR-drop results in a significant number of defect-free chips failing at-speed testing. The lack of a systematic IR-drop failure identification technique engenders highly increased failure analysis time and cost as well as significant yield loss. In this paper, we propose a failure-adaptive test scheme that enables fast differentiation of IR-drop induced failures from actual defects in the chip. The proposed technique debugs the failing chips using low IR-drop vectors that are custom-generated from the observed faulty response. Since these special vectors are designed in such a way that all the actual defects captured by the original vectors remain manifestable, their application can clearly pinpoint whether the root cause of failure is IR-drop, thus eliminating reliance on an intrusive debugging process that incurs quite a high cost. Such a test scheme further enables effective yield recovery from failing chips by passing the ones validated by debugging vectors whose IR-drop level matches the functional mode. Experimental results show that the proposed scheme delivers a significant IR-drop reduction in the second test (debugging) phase, thus enabling highly effective IR-drop failure identification and yield recovery at a slightly increased test cost.

I. INTRODUCTION

The aggressive voltage scaling trend significantly reduces the noise margin in VLSI circuits. At the same time, excessive power density in nanometer chips debilitates the delivery of sufficient current by the power supply network, causing significant IR-drop in the power nets. The resulting supply voltage variations are directly implicated in the high timing uncertainties currently observed. It has been reported that a 10-15% IR-drop can induce a 20-30% increase in gate delay [1]. This constitutes a critical reliability challenge for modern VLSI systems designed with very tight timing slacks.

The current commercial design flow handles IR-drop through an appropriate power network design. However, since the power budget is determined based on functional mode operation, IR-drop induced errors have become an increasingly prevalent failure mechanism in today's scan-based at-speed test [2], [3]. Since structural test vectors result in much higher power density (and thus IR-drop level) than the functional mode, chips that operate properly in functional mode may fail the test due to excessive test mode signal noise [2], [4], [5]. Such an effect results in a highly complicated, expensive failure analysis process and a significant yield loss due to the overscreening of passable chips.

Several approaches have been proposed in the literature to mitigate test mode IR-drop. One category of approaches imposes power constraints [7] in the test compaction phase to guarantee that the compacted test vectors fulfill IR-drop thresholds [6]. However, the effectiveness of these techniques hinges on the IR-drop distribution of the original test set.


X-filling techniques aim to reduce the mismatch between test stimuli and test responses for capture power reduction in scan test [5], [8]. However, the focus of these techniques on capture edge transition reduction may increase the flip-flop toggling occurring at the clock event between the launch cycle and the capture cycle, which also contributes significantly to the capture cycle IR-drop. Techniques have also been proposed to examine whether a test vector is power-safe [4], [9] through a layout-aware margin analysis.

All of the previous techniques focus on the pre-testing optimization of test vectors. While quite effective in reducing the risk of IR-drop failure during testing, they provide little guidance for IR-drop failure analysis. Avoidance of intrusive silicon inspection and minimization of failure analysis cost necessitate rapid differentiation of IR-drop failures from defect-induced ones. Once the defective parts within the set of failing chips have been screened out, the IR-drop tolerance capability of the remaining defect-free parts can be examined, and the ones whose tolerance margin exceeds the functional mode IR-drop level can be added to the passing category to attain yield recovery.

In this paper, we propose a two-phase test methodology that enables the aforementioned IR-drop failure identification and yield recovery scheme for the launch-off-shift at-speed test, which is widely used in industry for its low ATPG complexity, low test volume, and high fault coverage. The proposed scheme collects faulty responses during the regular testing phase and generates customized low IR-drop vectors based on this information. The customized vectors are generated in such a way that they incur minimum capture mode IR-drop in the layout grids that are accessible from the observed errors, while retaining the same defect activation/propagation conditions as the original vectors. Therefore, the application of these vectors in the second phase retains the manifestation of actual defects while minimizing the IR-drop failure in the chip, thus delivering a clear differentiation between the defective and defect-free chips.

The rest of the paper is organized as follows. In Section II, we outline the proposed failure-adaptive test scheme. Section III presents a technical overview of the proposed methodology for generating debugging tests. The test relaxation scheme that guarantees defect detection is discussed in Section IV. The generation of customized, low IR-drop tests from the relaxed vectors is detailed in Section V. Experimental results are presented in Section VI, and a brief set of conclusions is provided in Section VII.

II. PROPOSED TEST SCHEME

The high cost of failure analysis constitutes the major challenge for process learning and improvement. The increasing occurrence of IR-drop failures further complicates the failure analysis step due to the lack of identifiable silicon defects in chips that failed due to IR-drop.

Fig. 1. The proposed test & debugging scheme (regular test phase and failure response collection, debugging vector generation, IR-drop failure debugging phase; defect-free chips proceed to yield recovery, defective chips to failure analysis)

For the purpose of minimizing the failure analysis cost, it is necessary to first confirm or preclude IR-drop as the root cause of the failures in a faulty chip, and to send only the truly defective ones to costly intrusive silicon debugging.

We propose a two-phase test scheme to achieve this goal. In the first phase, regular test vectors are applied to the chips and any faulty response is recorded. If a chip fails at a particular vector, a debugging vector custom-generated according to the observed faulty response is applied to that chip again in the second test phase to identify whether the failure is caused by IR-drop or by actual defects. To attain an accurate differentiation between IR-drop and defects, the debugging vector must fulfill the following conditions.

1) It must reproduce the manifestation conditions (including signal and clocking requirements) for any defect that can possibly lead to the observed faulty response.
2) The IR-drop it incurs must be reduced to the levels prevalent in functional operation. Practically, this translates to the minimization of the test mode IR-drop, as even aggressive reductions typically still exceed functional IR-drop levels.

A faulty response to the debugging test signifies the existence of defects in a chip. Such chips can be sent to the follow-up intrusive silicon debugging phase for process learning purposes. The chips that pass the debugging test suffer solely from a test mode IR-drop problem. A subset of these chips, whose IR-drop tolerance exceeds the functional mode IR-drop, can still operate properly in functional mode. A yield recovery process for these chips can thus be performed, guided by input from the design engineers. Figure 1 presents a high-level overview of the proposed test scheme.

The failure-adaptive strategy enables the test engineers to identify IR-drop critical grids in the layout. The chip can be partitioned into a number of power grids, with the center of each grid being the crosspoint of the vertical and horizontal power nets [6], as shown in Figure 2(a). Distinct grids may have varying criticality in terms of IR-drop intensity. The grid criticality information can be attained by tracing the fault range in the layout. If the observed failure is caused by IR-drop, the fault site where the IR-drop occurs must be within the fault range; otherwise, its effect cannot be propagated to the primary outputs or flip-flops. Therefore, the power grids covered by the fault range are the candidate fault sites for both defects and IR-drop failures (e.g., the gray grids in Figure 2(a)). This effect has been confirmed by the diagnostic result of a medium-size chip with IR-drop failures. Figure 2(b) illustrates the test mode IR-drop profile of this chip attained through RedHawk simulation. It can be clearly seen that the IR-drop level varies highly across grids and that the grids around the failure flip-flop (the red area) exhibit the most severe IR-drop. The debugging vector generation process only needs to focus on minimizing the IR-drop in these grids.
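As a concrete illustration of the grid-tracing step, the following minimal Python sketch backtraces the fan-in cones of the failing flip-flops and collects the layout grids they touch; the `fanin_of` and `cell_grid` structures are hypothetical stand-ins for the netlist and layout data, not the paper's actual implementation (which is written in C).

```python
# Minimal sketch of critical-grid identification from observed failing flip-flops.
# The netlist traversal (fanin_of) and cell-to-grid mapping (cell_grid) are
# hypothetical placeholders for illustration only.

from collections import deque

def critical_grids(failing_ffs, fanin_of, cell_grid):
    """Return the set of layout grids touched by the fan-in cones
    (i.e., the candidate fault range) of the failing flip-flops."""
    grids, visited = set(), set()
    queue = deque(failing_ffs)
    while queue:
        cell = queue.popleft()
        if cell in visited:
            continue
        visited.add(cell)
        grids.add(cell_grid[cell])            # grid (row, col) hosting this cell
        queue.extend(fanin_of.get(cell, ()))  # walk back toward possible fault sites
    return grids

# Toy example: a two-gate cone feeding flip-flop "FF7".
fanin = {"FF7": ["G3"], "G3": ["G1", "G2"]}
grid = {"FF7": (4, 5), "G3": (4, 4), "G1": (3, 4), "G2": (4, 3)}
print(critical_grids(["FF7"], fanin, grid))   # grids covering FF7 and its fan-in cone
```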

Fig. 2. Identification of IR-drop critical grids: (a) IR-drop critical grids; (b) failure flip-flop distribution

Traditional IR-drop tests are afflicted by a sizable ambiguity zone, forcing an arbitrary tradeoff between yield loss and test escapes; our focused technique sharply reduces this region, enabling the appropriate classification of the members of this ambiguity class and thus delivering a sharp reduction in yield loss while retaining low levels of test escapes.

III. TECHNICAL OVERVIEW

The generation of IR-drop aware tests for the debugging phase constitutes the major challenge for the proposed test scheme. In this section, we analyze the contributing factors of capture mode IR-drop and outline the technical framework that delivers IR-drop resilience during the generation of debugging tests.

A. Capture mode IR-drop

The capture mode IR-drop during at-speed testing is induced by the transition activities during the capture cycle. Traditional techniques only consider the capture edge transitions as the source of capture mode IR-drop [5], [8]. However, this IR-drop model only reflects a portion of the capture mode IR-drop in a launch-off-shift test scheme, thus leading to a suboptimal solution for IR-drop mitigation. Figure 3 illustrates this problem in detail. Assume a test vector has the value 101 in the three scan flip-flops shown in the figure. The next state logic of the circuit generates a capture transition at scan flip-flop FFn, as shown in Figure 3(c). This transition occurs at the capture edge (i.e., clock event t2) and also induces further transitions in the combinational logic.

A comprehensive analysis should also take into account the IR-drop contribution of transitions occurring at clock event t1. These launch edge transitions are induced by the last shift operation (e.g., the transitions at FFn and FFn+1 shown in Figure 3(b)), and they result in a capture cycle voltage drop which eventually impacts the correct capturing of the test responses. The semantics of the launch-off-shift test precludes the possibility of eliminating the launch transition impact through the use of a reduced clock frequency, as the cycle between the launch and capture edges must be at-speed to detect delay defects. Since both types of transitions (the launch and the capture transitions) contribute significantly to capture mode IR-drop, the optimal solution necessitates the successful minimization of both of them in a concurrent manner.
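To make the two transition sources concrete, the sketch below counts, for a launch-off-shift vector, the launch-edge transitions (between the state before the last shift and the launched state) and the capture-edge transitions (between the launched state and the captured response) in a set of critical flip-flops. The three state snapshots are assumed to come from logic simulation; the function and argument names are illustrative, not the paper's tool interface.

```python
# Minimal sketch: counting launch (t1) and capture (t2) transitions of critical
# flip-flops for a launch-off-shift vector, given simulated state snapshots.

def transition_counts(pre_launch, launched, captured, critical_ffs):
    """pre_launch: FF values before the last shift clock (just before t1)
       launched:   FF values after the launch edge (the applied vector)
       captured:   FF values after the capture edge (t2)
       critical_ffs: indices of flip-flops residing in IR-drop critical grids"""
    launch_toggles  = sum(pre_launch[i] != launched[i] for i in critical_ffs)
    capture_toggles = sum(launched[i]  != captured[i]  for i in critical_ffs)
    return launch_toggles, capture_toggles

# Toy example with four flip-flops, of which FF0 and FF1 are critical.
print(transition_counts([1, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 0], [0, 1]))  # (1, 1)
```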

Fig. 3. Capture mode IR-drop sources: (a) launch-off-shift test clock (launch at the nth shift edge t1, capture at t2); (b) transitions at t1 (launch edge); (c) transitions at t2 (capture edge)

B. IR-drop aware test generation overview

The generation of IR-drop aware tests necessitates the tuning of the failing vectors to minimize both the launch and capture edge transitions in the critical power grids. The failure-adaptive strategy of generating debugging tests delivers higher potential for IR-drop reduction than pre-test IR-drop minimization techniques, as the debugging vector only needs to reproduce the manifestation condition for the observed fault effect. Since a fault usually manifests itself at a small number of flip-flops, the signal constraints in the fault manifestation range can be justified by specifying a small subset of bits in the test vector, with all the remaining bits used for IR-drop reduction purposes. To take advantage of the large optimization space provided by the failure-adaptive scheme, the proposed technique first relaxes each failing vector to a form wherein only a small number of bits are specified to reproduce the defect manifestation condition for the observed failure, and then fills the unspecified bits in such a way that the flip-flop transitions in the critical grids are minimized.

The elimination of a target transition imposes certain signal constraints. If multiple target transitions exist, the signal constraints imposed by them might conflict with each other. This poses two technical challenges for the generation of the debugging tests: the order in which the transition targets are processed, and the choice of the appropriate signal constraint set for each target. The proposed technique resolves these challenges by maximally identifying unspecified-bit assignment cases that eliminate a large portion of the transition targets while preserving the optimality of the solution. The remaining transition targets are handled through a highly guided signal justification heuristic. We have observed that the cost of justifying a certain signal target depends on the number of sensitized paths from this target to the unspecified input bits and on the logic relationships along these paths. A metric is proposed to quantify the impact of these factors on the justification cost, thus helping identify the justification target/path that consumes minimum optimization space. Such a frugal signal justification process guarantees the maximal elimination of target transitions.

IV. FAILURE-ADAPTIVE TEST RELAXATION

The failure information collected in the regular test phase can be utilized to relax the original test to a highly unspecified form, providing flexibility for IR-drop reduction while retaining the defect detection capability.

If a failure is captured at a particular primary output (PO) or pseudo-primary output (PPO) by a test vector, the corresponding fault, regardless of its physical nature, must be observable at that PO/PPO. This observation can be utilized to identify the range of possible fault sites (the critical grids discussed in Section II); the corresponding candidate faults form a subset of the faults that are detectable by the current test vector. For each original test vector, this subset of faults can be identified by fault simulation and added into a candidate fault set.

In order to completely reproduce the fault manifestation condition of the original vector, the signals that activate and propagate the candidate faults must be set to the values that match the ones produced by the original vector. The use of such strict signal constraints ensures retention of the original vector's detection capability for defects that are not modeled by stuck-at faults (e.g., only a subset of the stuck-at test vectors targeting the same stuck-at fault can detect the transistor-level faults at the same fault site [10]). In order to also guarantee the manifestation of possible layout-dependent defects, such as bridging faults, the values on nets that are physically adjacent to the aforementioned signals need to match the original vector as well. The union of these two groups of signals constitutes the set of justification targets during test relaxation.

Once the justification targets are identified, a justification process similar to the technique in [11] is performed to generate the relaxed vector from the original one. Since the justification target set is a subset of the signals produced by the original vector, the identification of a relaxed vector that fulfills all the targets is guaranteed. It is important to note that, unlike the technique in [11] where justification is needed for all the possible faults detected by a test, the failure-adaptive strategy employed in our work results in a smaller target set, thus creating more unspecified bits in the relaxed vector, which significantly increases the potential for IR-drop reduction.
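The candidate-fault selection step can be pictured with the small sketch below: it keeps only the faults, detected by the failing vector during fault simulation, that are observable at the outputs that actually failed; the activation/propagation signals of these faults then become the justification targets. The fault-simulation interface shown is a hypothetical placeholder, not the paper's actual tool.

```python
# Minimal sketch of failure-adaptive candidate-fault selection (Section IV).
# `detected_faults` maps each fault detected by the vector to the set of
# POs/PPOs at which it is observed; this interface is a hypothetical stand-in
# for a real fault simulator.

def candidate_faults(detected_faults, failing_outputs):
    """Keep only the faults whose observation points intersect the
    outputs that actually failed on the tester."""
    failing = set(failing_outputs)
    return {f for f, obs_points in detected_faults.items()
            if obs_points & failing}

# Toy example: faults "a_sa0" and "c_sa0" reach PPO "FF3", which failed.
sim = {"a_sa0": {"FF3"}, "b_sa1": {"PO1"}, "c_sa0": {"FF3", "PO2"}}
print(candidate_faults(sim, ["FF3"]))   # {'a_sa0', 'c_sa0'}
```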
V. DEBUGGING TEST GENERATION

The unspecified bits in the relaxed vectors enable the generation of low IR-drop debugging tests by appropriately assigning values to those bits. Since it has been shown that the transition density in flip-flops is highly correlated with power consumption [7], the proposed method focuses on concurrently minimizing the launch and capture transitions of the flip-flops residing in the critical grids. This set of flip-flops is referred to as the critical flip-flops in the proposed scheme. Since distinct transition targets may raise conflicting signal constraints, the debugging test generation process needs to minimize the impact of each transition reduction decision it makes in order to leave sufficient optimization space for as many targets as possible. We propose to handle the launch transitions prior to the capture transitions, as the launch transition reduction process is more constrained and its impact is local compared to capture transition reduction.

A. Launch transition minimization

The elimination of a launch transition necessitates a critical flip-flop to share the same value as its immediate successor. In order to help minimize this type of transition, a number of critical segments are identified in the scan chain. A critical segment consists of a sequence of consecutive critical scan flip-flops, plus the non-critical flip-flop immediately after them. The filling of unspecified bits can be performed independently for distinct critical segments in the scan chains.
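A minimal sketch of the critical-segment extraction is shown below, assuming the critical flip-flops have already been flagged along the scan chain order; the one-flag-per-scan-position representation is an assumption made for illustration.

```python
# Minimal sketch of critical-segment extraction along a scan chain (Section V-A).
# `is_critical` flags flip-flops residing in IR-drop critical grids, listed in
# scan-chain order.

def critical_segments(is_critical):
    """Return segments as (start, end) index pairs: each segment covers a run
    of consecutive critical flip-flops plus the following non-critical one."""
    segments, start = [], None
    for i, crit in enumerate(is_critical):
        if crit and start is None:
            start = i                      # a run of critical flip-flops begins
        elif not crit and start is not None:
            segments.append((start, i))    # include this non-critical successor
            start = None
    if start is not None:                  # run reaches the end of the chain
        segments.append((start, len(is_critical) - 1))
    return segments

print(critical_segments([0, 1, 1, 0, 1, 0, 0]))   # [(1, 3), (4, 5)]
```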

Fig. 4. Launch transition minimization: critical-segment filling examples (Cases 1-3)

Depending on the condition of the specified bits in a critical segment, different filling strategies need to be applied to attain the optimal result. If all the specified values in a critical segment are identical, the transitions in all critical flip-flops can be eliminated by adjacent filling, as shown in Case 1 of Figure 4. If none of the bits in a critical segment is specified, as shown in Case 2 of Figure 4, the transitions can be eliminated by either an all-1 filling or an all-0 filling. However, these two filling possibilities may incur different capture transition densities. Taking this into account, we first simulate the partially specified vector and check the responses in the critical segment (the values after the slash in Figure 4); the dominant value among these response bits is then filled into the segment. If the specified values in a critical segment contain both 1 and 0, as shown in Case 3, it is impossible to eliminate all transitions. For this case, we choose the flip-flop with the smallest portion of its fan-out cone residing in the critical grids to serve as the transition point, as such a flip-flop has the lowest impact on critical grid IR-drop. Since the critical flip-flops are identified by tracing the observed faulty responses, their number is small; this step therefore only consumes a small portion of the unspecified bits and retains most of them intact for the subsequent capture reduction phase.
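The three filling cases can be sketched as follows for a single critical segment. The response-simulation input and the per-flip-flop critical fan-out counts are illustrative placeholders; for Case 3 the sketch only selects the transition point, leaving the surrounding fill implicit.

```python
# Minimal sketch of the three critical-segment filling cases (Section V-A).
# bits: segment stimulus values ('0', '1', or 'X'); responses: simulated
# response bits of the segment; critical_fanout: per-flip-flop count of
# fan-out cells inside critical grids (all assumed inputs).

def fill_segment(bits, responses, critical_fanout):
    """Return a fully specified segment (Cases 1 and 2), or the index chosen
    as the transition point (Case 3)."""
    specified = {b for b in bits if b != 'X'}
    if len(specified) == 1:                      # Case 1: identical specified values
        return [specified.pop()] * len(bits)     # adjacent fill removes all toggles
    if not specified:                            # Case 2: fully unspecified segment
        ones = sum(r == '1' for r in responses)
        v = '1' if 2 * ones >= len(responses) else '0'
        return [v] * len(bits)                   # fill with the dominant response value
    # Case 3: both 0 and 1 specified -- a launch transition is unavoidable.
    # Pick the flip-flop with the least critical-grid fan-out as the transition
    # point; the unspecified bits are then filled to confine the toggle there.
    return min(range(len(bits)), key=lambda i: critical_fanout[i])

print(fill_segment(['1', 'X', '1', 'X'], ['0', '1', '0', '0'], [3, 1, 2, 4]))  # Case 1
print(fill_segment(['X', 'X', 'X'],      ['0', '1', '0'],      [1, 1, 1]))     # Case 2
print(fill_segment(['1', 'X', '0'],      ['0', '1', '0'],      [3, 1, 2]))     # Case 3 -> 1
```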

Fig. 5. Capture transition minimization flow (direct assignments on the debugging vector while possible; otherwise justification of the target with the lowest justification cost, followed by vector update and simulation, until the target set is empty)

B. Capture transition minimization

The remaining unspecified bits in the debugging vector can be utilized to produce the response values in such a way that the match between the stimuli and the responses in the critical flip-flops is maximized. However, the possible signal conflicts between the justification processes of different targets constitute the major challenge for this problem. To attain a near-optimal solution, the proposed algorithm maximally identifies the assignments on the unspecified bits that reduce the search space without destroying the optimality of the solution. Only when no such assignments can be further identified is a highly guided justification heuristic applied to further reduce a target transition at the minimum cost in terms of unspecified bit consumption. This step in turn may create further optimality-preserving assignments in the updated debugging vector, restarting the signal assignment iteration. Figure 5 presents a high-level overview of this algorithm.

1) Optimality-preserving signal assignment: Since an unspecified bit in the stimuli can only assume one of the two binary values, it is computationally effective to check for each unspecified bit whether either of the direct assignments on it can reduce the target transitions without destroying the optimality of the solution. If the minimum benefit of setting a bit to a particular value exceeds the maximum possible cost of doing so, then this bit can be directly set to that value without impacting global optimality. The benefit and cost associated with a direct assignment on an unspecified bit can be evaluated in the following manner.

If a partially specified vector generates a sensitized path from an unspecified bit, bi, to a target, fj, then fj is considered to be dependent on bi. For each unspecified bit bi, we define five sets of targets, namely the SN1, SN0, S1, S0 and D sets. A target fj is added into the SN1(0) set of bi if the following condition holds: fj can be justified if and only if bi is set to 1(0). From the definition, it can be seen that the targets in SN1(0) are uniquely dependent on bi. A target fj is added into the S1(0) set of bi if setting bi to 1(0) is a sufficient but not necessary condition for justifying fj. The D set of bi contains all targets that are dependent on bi. The size of SN1(0) thus represents a lower bound on the benefit uniquely provided by setting bi to 1(0). On the other hand, setting bi to 1(0) may reduce the chance of justifying the targets in D − S1(0), which constitutes the maximum possible cost of doing so. The five sets defined above strictly comply with the following theorem.

Theorem 1: For any unspecified bit bi, if |SN1| ≥ |D − S1|, then |SN0| ≤ |D − S0|; if |SN0| ≥ |D − S0|, then |SN1| ≤ |D − S1|.

Proof: It is trivial to see that SN1 ⊆ D, because every target in SN1 must be dependent on bi. Since bi = 1 is sufficient and necessary for justifying any target in SN1, setting bi to 0 will fail to justify any of them. Therefore, SN1 ∩ S0 = ∅. It follows that SN1 ∩ (D − S0) = (SN1 ∩ D) − (SN1 ∩ S0) = SN1 − ∅ = SN1. Therefore, it must hold that SN1 ⊆ (D − S0) and |SN1| ≤ |D − S0|. By analogous reasoning, it can be proven that |SN0| ≤ |D − S1|. As a result, if |SN1| ≥ |D − S1|, then |SN0| ≤ |D − S1| ≤ |SN1| ≤ |D − S0|; if |SN0| ≥ |D − S0|, then |SN1| ≤ |D − S0| ≤ |SN0| ≤ |D − S1|.

For each unspecified bit, the proposed scheme checks whether |SN1| ≥ |D − S1| or |SN0| ≥ |D − S0| holds. The first case indicates that the benefit of setting bi to 1 is no less than the maximum possible cost, whereas the second case indicates a similar situation for setting bi to 0. Theorem 1 shows that if the inequality condition for setting a signal to a particular value holds, then setting it to the opposite value would under no circumstances yield a better solution. Thus unspecified bits that fulfill either of these two conditions can be directly set to the corresponding value without impacting the optimality of the solution. When |SN1| < |D − S1| and |SN0| < |D − S0| hold for all unspecified bits, no further direct assignment can be performed. An efficient heuristic is proposed to further justify a target at the smallest cost. We reserve the use of the heuristic only for tie-breaking purposes, with the expectation that this step will create new direct assignment possibilities.
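The direct-assignment check can be expressed compactly as below, assuming the five target sets of a bit have already been computed by sensitization analysis; the set representation is illustrative.

```python
# Minimal sketch of the optimality-preserving direct assignment rule.
# Each unspecified bit carries its SN1, SN0, S1, S0 and D target sets
# (assumed to be precomputed elsewhere).

def direct_assignment(sn1, sn0, s1, s0, d):
    """Return '1' or '0' if the bit can be fixed without losing optimality,
    or None if no direct assignment is possible (heuristic needed)."""
    if len(sn1) >= len(d - s1):   # guaranteed benefit of 1 >= worst-case cost of 1
        return '1'
    if len(sn0) >= len(d - s0):   # guaranteed benefit of 0 >= worst-case cost of 0
        return '0'
    return None

# Toy example: two targets depend on the bit and both require it to be 1.
d, sn1, s1 = {'f1', 'f2'}, {'f1', 'f2'}, set()
sn0, s0 = set(), set()
print(direct_assignment(sn1, sn0, s1, s0, d))   # '1'  (|SN1| = 2 >= |D - S1| = 2)
```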

Fig. 6. A logic circuit and the corresponding relationship matrix R (with the derived popularity matrix P)

2) Justification heuristic: The proposed heuristic eliminates a target transition through signal justification. To attain maximal justification efficiency, it is essential to guarantee that the justification target selected incurs the least amount of cost (i.e., impact on the justification opportunities for the remaining targets).

It can be observed that the influence of distinct unspecified bits on the quality of the solution is highly skewed. A bit that has multiple dependent targets is much more important than one that only impacts a single target, as multiple targets may have to compete for the popular bit during their own justification. Therefore, each unspecified bit can be labeled with a weight that reflects its popularity level. A justification process that consumes a popular bit needs to be penalized, as it might eliminate the justification hope for a large number of other targets. To help perform the cost analysis, a matrix R is defined to model the dependence relationship between the unspecified bits and the targets, wherein each column of the matrix corresponds to an unspecified bit in the vector, and each row corresponds to a target to be justified. If the target fi is dependent on the unspecified bit bj, the element rij of matrix R is set to 1, and to 0 otherwise. Figure 6 provides a relationship matrix example. A popularity matrix P can be defined as P = R^T R. As shown in Figure 6, the diagonal element pii represents the number of targets that are dependent on bi. A large pii indicates that bi might be useful to a large number of targets, leading to bi being assigned a larger weight to reflect its importance. The off-diagonal elements pij, i ≠ j, represent the number of dependent targets of bi that are also shared by bj. A larger pij indicates that the importance of bi to its dependent targets wanes, as the justification of a large number of them can also rely on the value of bj. In line with these observations, the weight function can be defined for each unspecified bit bi in the unspecified bit set B in the following manner:

w_i = p_ii / Σ_{1 ≤ j ≤ |B|, j ≠ i} p_ij    (1)
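The popularity matrix and the weights of Equation (1) follow directly from the dependence matrix R, as in the small pure-Python sketch below; the matrix encoding and the toy example data are assumptions for illustration, and the denominator guard is ours, not part of Equation (1).

```python
# Minimal sketch: popularity matrix P = R^T R and per-bit weights (Equation 1).
# R[t][b] = 1 if target t depends on unspecified bit b, else 0.

def popularity(R):
    n = len(R[0])                      # number of unspecified bits (columns of R)
    return [[sum(R[t][i] * R[t][j] for t in range(len(R)))
             for j in range(n)] for i in range(n)]

def weights(P):
    # w_i = p_ii / sum of off-diagonal entries in row i (Equation 1);
    # denominator guarded against zero for bits sharing no targets (our addition).
    return [row[i] / max(sum(row) - row[i], 1) for i, row in enumerate(P)]

# Toy dependence matrix: 3 targets, 4 unspecified bits.
R = [[1, 0, 0, 1],
     [0, 1, 0, 1],
     [0, 1, 1, 1]]
P = popularity(R)
print(P)           # [[1,0,0,1],[0,2,1,2],[0,1,1,1],[1,2,1,3]]
print(weights(P))  # [1.0, 0.666..., 0.5, 0.75]: the last bit depends on the most
                   # targets but shares all of them, which tempers its weight
```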

In addition to the weight of each unspecified bit, another issue that needs to be considered for the selection of the justification target is the Boolean relationship in the circuit. To incorporate the impact of distinct gate types on signal justifiability, we propose a SCOAP-like metric to evaluate the cost of justifying a particular target. It differs from the traditional SCOAP [12] testability measure in that it takes into account the skewed popularity of the unspecified bits and the impact of the already specified bits. The proposed technique assigns a 1-justification cost (JC1) and a 0-justification cost (JC0) to each bit of the partially specified debugging vector under processing. If a bit is already specified to a value v, v ∈ {0, 1}, then its v-justification cost is set to zero and its v̄-justification cost (the cost of the opposite value) is set to infinity.

Fig. 7. Justification cost computation

For each unspecified bit, both its JC1 and JC0 are set to the weight value computed by Equation (1) to reflect the cost of consuming this bit during the justification of a particular target. The justification costs of internal signals with specified values are determined in the same manner as for the specified bits in the debugging vector. For the unspecified internal signals, the justification costs are computed in a SCOAP-like manner. Figure 7 exemplifies the justification cost computation process.

The cost of justifying a target f to value v is denoted by JCv(f), which enables the identification of the target whose justification has the least impact on the other targets' justifiability. A justification process is then performed for the selected target to eliminate the capture transition on it. The path selection during justification strongly impacts the optimality of the solution. When the justification process reaches a gate where only one of its inputs needs to be set to the controlling value of the gate, the algorithm first attempts the path with the lowest justification cost. If the target cannot be justified through the selected path, the algorithm backtracks to the branching point and tries the path with the second lowest justification cost, and so on. This strategy forces the algorithm to justify a target by maximally using its dedicated unspecified bits while consuming as few shared bits as possible.
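A minimal sketch of the SCOAP-like cost propagation described above is given below for a few basic gate types, assuming the input costs are already known. The leaf-cost rules follow the text (zero/infinity for specified bits, the popularity weight for unspecified ones); the gate-level combinations shown are the usual SCOAP-style rules and may differ in detail from the paper's exact cost functions.

```python
# Minimal sketch of SCOAP-like justification cost propagation (JC0, JC1).
# Leaf costs: a specified bit costs 0 for its value and infinity for the
# complement; an unspecified bit uses its popularity weight for both costs.

INF = float('inf')

def leaf_costs(value, weight):
    """Return (JC0, JC1) for a vector bit: value in {'0', '1', 'X'}."""
    if value == '0':
        return (0.0, INF)
    if value == '1':
        return (INF, 0.0)
    return (weight, weight)          # unspecified: consuming it costs its weight

def gate_costs(kind, inputs):
    """inputs: list of (JC0, JC1) pairs for the gate inputs."""
    if kind == 'NOT':
        jc0, jc1 = inputs[0]
        return (jc1, jc0)
    if kind == 'AND':                # out=1 needs all inputs 1; out=0 needs any input 0
        return (min(c0 for c0, _ in inputs), sum(c1 for _, c1 in inputs))
    if kind == 'OR':                 # out=1 needs any input 1; out=0 needs all inputs 0
        return (sum(c0 for c0, _ in inputs), min(c1 for _, c1 in inputs))
    raise ValueError(kind)

# Example: g = AND(b1, b2) with b1 specified to 1 and b2 unspecified (weight 0.75).
print(gate_costs('AND', [leaf_costs('1', 0.0), leaf_costs('X', 0.75)]))  # (0.75, 0.75)
```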

VI. EXPERIMENTAL RESULTS

The proposed scheme has been evaluated by comprehensive simulations. The failure-adaptive debugging test generation scheme has been implemented in the C language. The effectiveness of the debugging phase in identifying IR-drop failures strongly correlates with the capability of the debugging vectors in reducing the IR-drop. In our simulations, the IR-drop levels of the original and the debugging vectors have been evaluated in terms of the capture-mode power consumption, using the power estimation model in [13]. The proposed scheme has been applied to the largest circuits in the ISCAS89 benchmark set, and the MINTEST vectors [14] have been used as the original tests applied in the regular testing phase. In order to examine the proposed technique's effectiveness under different failure densities, we randomly generated for each circuit a large set of failure responses with varying failure densities (the percentage of faulty bits in the response ranging from 10% to 90%). The average IR-drop reduction at each failure density level is shown in Table I. It can be seen that a significant IR-drop reduction can be attained in the debugging phase, especially when the failure density is low.

TABLE I. IR-DROP REDUCTION IN THE DEBUGGING PHASE UNDER VARIOUS FAULT DENSITY RANGES

Circuit   (0,0.1)  (0.1,0.2)  (0.2,0.3)  (0.3,0.4)  (0.4,0.5)  (0.5,0.6)  (0.6,0.7)  (0.7,0.8)  (0.8,0.9)
s13207     80.0%     67.9%      54.2%      43.3%      35.9%      27.8%      22.4%      15.5%      8.9%
s15850     82.7%     68.3%      49.4%      37.0%      27.6%      20.7%      16.7%      13.2%      4.6%
s35932     68.4%     55.9%      24.3%      16.9%      11.7%       8.8%       3.0%       2.8%      0.2%
s38417     47.9%     45.7%      33.6%      24.5%      18.7%      14.4%      10.4%       7.6%      4.8%
s38584     73.3%     61.4%      42.8%      32.2%      24.5%      16.6%      13.1%       8.5%      4.9%

A high correlation between the IR-drop reduction and the failure density is observed, which can provide additional insight for the test engineers to trade between the IR-drop level and debugging time. If the debugging vector is generated from the failure response of a single chip, a high IR-drop reduction can be attained, as the failure density of a single chip is usually quite low. However, this strategy precludes the sharing of debugging vectors among multiple failing chips. Enabling vector sharing among multiple chips necessitates the generation of the debugging vector from the union of the failing responses of these chips, which reduces the debugging time at the cost of a reduced IR-drop reduction. The high correlation between the failure density and the IR-drop reduction enables the test engineers to accurately predict the expected IR-drop for a particular failure density and identify the best tradeoff point.

Furthermore, the test time overhead incurred by the debugging phase has been evaluated as well. It has been assumed in our simulations that the total number of chips is 1K and that the tester is able to concurrently test 6 chips. We have simulated the debugging phase with randomly generated failure distributions and computed the time overhead as the ratio of the debugging time to the regular phase test time. Figure 8 plots the debugging time overhead of s15850 as a function of the yield and the failing vector percentage. As expected, the debugging time increases along with the number of failing chips and the number of failing vectors per chip. However, it has been observed that, when the yield is above 80% and the failing vector percentage per chip is below 20% (which approximates the most common situation in practical testing), the debugging time overhead can be kept below 10%. In the case of 90% yield and 10% failing vectors, the overhead can be as low as 2.8%. A similar debugging time overhead has been observed for the other benchmarks.

It should also be noted that the debugging time evaluation in our simulation is performed in a conservative manner. In actuality, there might be strong correlations among the failing vectors of chips in the same lot due to their correlated failure mechanisms. This effect reduces the total number of failing vectors and increases the probability of debugging vector sharing, thus possibly leading to a much lower debugging overhead than the one conservatively estimated in our simulation.

Fig. 8. Debugging time overhead

VII. CONCLUSION

Test mode IR-drop induced failures constitute a crucial cause of significantly increased failure analysis cost and yield loss. A systematic methodology is proposed to address this critical challenge by providing an efficient IR-drop failure identification technique. The proposed scheme applies debugging vectors custom-generated from the observed failure response in order to differentiate IR-drop failures from actual defects. This scheme can maximally eliminate unnecessary intrusive silicon debugging of defect-free chips, thus significantly reducing the engineering time and cost. The failing chips that are identified as defect-free can be graded with design engineer input, and a subset of them with a relatively high IR-drop threshold can be categorized as passing parts, thus minimizing the yield loss caused by overtesting. The proposed scheme requires no circuit modification or DFT insertion and can be easily incorporated into state-of-the-art ATE flows.

REFERENCES

[1] Y.-M. Jiang and K.-T. Cheng. Analysis of performance impact caused by power supply noise in deep submicron devices. Proc. DAC, pages 760-765, 1999.
[2] J. Saxena, K. M. Butler, V. B. Jayaram, S. Kundu, N. V. Arvind, P. Sreeprakash and M. Hachinger. A case study of IR-drop in structured at-speed testing. Proc. ITC, pages 1098-1104, 2003.
[3] K.-T. Cheng, S. Dey, M. J. Rodgers and K. Roy. Test challenges for deep sub-micron technologies. Proc. DAC, pages 142-149, 2000.
[4] A. A. Kokrady and C. P. Ravikumar. Static verification of test vectors for IR drop failure. Proc. ICCAD, pages 760-764, 2003.
[5] X. Wen, K. Miyase, T. Suzuki, S. Kajihara, Y. Ohsumi and K. Saluja. Critical-path-aware X-filling for effective IR-drop reduction in at-speed scan testing. Proc. DAC, pages 527-532, 2007.
[6] J. Lee, S. Narayan, M. Kapralos and M. Tehranipoor. Layout-aware, IR-drop tolerant transition fault pattern generation. Proc. DATE, pages 1172-1177, 2008.
[7] R. Sankaralingam, R. Oruganti and N. Touba. Static compaction techniques to control scan vector power dissipation. Proc. VTS, pages 35-40, 2000.
[8] X. Wen, Y. Yamashita, S. Kajihara, L.-T. Wang, K. Saluja and K. Kinoshita. On low-capture-power test generation for scan testing. Proc. VTS, pages 265-270, 2005.
[9] V. R. Devanathan, C. P. Ravikumar and V. Kamakoti. Variation-tolerant, power-safe pattern generation. IEEE Design & Test of Computers, vol. 24, no. 4, pages 374-384, 2007.
[10] E. J. McCluskey and C.-W. Tseng. Stuck-fault tests vs. actual defects. Proc. ITC, pages 336-343, 2000.
[11] K. Miyase and S. Kajihara. XID: Don't care identification of test patterns for combinational circuits. IEEE Trans. on CAD, vol. 23, no. 2, pages 321-326, 2004.
[12] L. H. Goldstein. Controllability/observability analysis of digital circuits. IEEE Trans. on Circuits and Systems, vol. CAS-26, no. 9, pages 685-693, 1979.
[13] P. Girard. Survey of low-power testing of VLSI circuits. IEEE Design & Test of Computers, vol. 19, no. 3, pages 80-90, 2002.
[14] I. Hamzaoglu and J. H. Patel. Test set compaction algorithms for combinational circuits. Proc. ICCAD, pages 283-289, 1998.