Localizing Open Interconnect Defects using Targeted Routing in FPGA’s

Dave Mark, (408) 879-4648, [email protected]
Jenny Fan, (408) 879-4450, [email protected]
2100 Logic Dr., San Jose, CA 95124

ITC INTERNATIONAL TEST CONFERENCE 0-7803-8580-2/04 $20.00 Copyright 2004 IEEE
Paper 22.1, pp. 627–634

Abstract

A breakthrough test strategy for detecting and localizing open metal interconnect faults is described in this paper. It utilizes the reprogrammable nature of FPGAs to quickly isolate metal defects on individual metal interconnect layers using existing FPGA resources. By programming the FPGA only once, multiple defects are isolated for each physical metal layer. Experimental results from a low-yield wafer, presented in this paper, verified the methodology by successfully identifying the physical defect locations. This strategy can also be applied to detecting metal bridging defects.

Introduction

We first define important terms used in this paper.

Terminology

Metal test patterns: A group of test patterns, each targeting one specific physical metal layer. For instance, the Metal 4 (m4) test pattern creates the interconnects under test using m4 only; a test failure indicates that an m4 interconnect line has an open or short defect.
Interconnect layer: Physical interconnect metal layer processed during chip manufacturing.
Interconnect wire: FPGA interconnect used to physically connect signals in a circuit.
Capture: Captures the logic state of an interconnect wire in a flip-flop and transfers it to a memory cell.
Read back: Dumps the memory cell contents to a file for post-processing to find the failing bits.
Programming: Setting SRAM bits within the FPGA which control the logical behavior of the device.
Reprogrammability: The ability of the FPGA to change its logical behavior by configuring the device with a different set of SRAM cell data.
PIP: Programmable Interconnect Point. Connects two interconnect wires with a programmable connection.

In the last two decades, semiconductor process technology has greatly shrunk lithography size and increased the metal interconnect layer count from a single metal layer to 10 layers. The interconnect material has also changed from aluminum to copper. Open defects occur more frequently in copper traces than in aluminum because of the trench process, and they are very hard to detect with traditional failure analysis (FA) fault localization techniques.

SRAM and various types of test structures are used to monitor process health and for process development [1]. Failure analysis is performed on these on-chip SRAM and test structures to find the root cause of low yield. The failure analysis results from SRAM and test structures help in understanding defect and process problems in the earlier physical process layers, usually from silicon up to metal 3. They do not help in improving yield for the upper metal process layers. Test structures can be designed to test every physical layer, but their area is limited to control cost; therefore, they often do not present a true picture of the problems existing in production devices.

Most test strategies are focused on test fault coverage. If a failure occurs in a middle interconnect layer, in most cases neither traditional nor newer emerging FA techniques can detect a hot spot or emission from the front or back side of a failing die. For instance, localizing a defect in metal 4 of a nine-layer metal process is very time consuming if it is based on fault-coverage-oriented test results such as functional and scan test. Yield improvement demands a faster way to perform FA, since a large number of FA results are needed to help identify the root cause of either an intrinsic process problem or a yield crash.

Many researchers have studied the diagnosis of interconnect faults in FPGAs, but their methods localize only single faults [2], [3] or require many programming steps [4], [5]. Other researchers propose design changes to the FPGA for faster diagnosis, but such changes are unavailable for an existing product and add cost [6]. This paper presents a new methodology, implemented in FPGA products, to quickly detect and localize metal open defects in one programming step. The idea is also applicable to detecting metal bridging defects.

The reprogrammability of an FPGA makes it possible to generate test patterns targeted to each individual interconnect layer. The metal test patterns for the individual interconnect layers are designed similarly: each is a long shift register chain in which the interconnect wires under test are routed between flip-flops. Furthermore, the logic state of each interconnect wire segment under test can be captured to memory cells via flip-flops, and the contents of the memory cells can then be read back. Post-processing the read back data localizes the failing interconnect wire segment. The size of an isolated failing segment depends on the original FPGA logic structure; it can range from 600 µm to 100 µm in length for a specific physical layer and particular wire.

1. Fault Diagnosis and Localization – The Old Way

Fault diagnosis and fault localization are laborious and time-consuming. A typical analysis flow is as follows. The unit is first tested and identified as defective. Fault isolation begins with more detailed testing of the unit to obtain a data log of all failing tests. One of the failing tests is chosen for further analysis. Device knowledge, along with an understanding of the test stimulus and test response, is required to make a best guess at where the fault may occur in the device schematic. The localized fault is found in the device layout, and all possible layers are highlighted for a physical failure analysis engineer. The device is delayered and each layer is inspected at the suspected faulty layout area until either a defect is found or all possible layers have been inspected. The identified defect may be classified as a metal interconnect failure or one of many other types.

Performing this failure analysis successfully requires an enormous amount of time and device knowledge. Each successive defect on the device requires an equal investment of time and effort to isolate, so isolating multiple defects on a single device is very difficult.

2. Fault Diagnosis and Localization – The New Way

A new automated method was developed that requires little or no device knowledge and targets open interconnect defects in an individual metal layer. The new method quickly identifies the number and location of multiple defects. It can be thought of as a bottom-up approach, in contrast to the top-down old method described previously.

2.1 Test Pattern Targeted to an Individual Interconnect Layer

The targeted test pattern is designed to fail when defects occur on a specific individual metal interconnect layer. It is based on the simple shift register test approach [7] shown in Figure 1. This approach is commonly used, but here interconnect wires residing in the same physical metal layer form the connections between flip-flops. Only three registers are shown in the illustration; the actual diagnostic pattern uses many registers.

Figure 1: Typical shift register test pattern.

All registers are D-type flip-flops that are initially set to zero. The pattern is tested by clocking a single high pulse through the registers until it reaches the last register. This is commonly referred to as “shifting a 1 in a sea of zeros”. To guarantee complete test coverage, the complementary test, walking a 0 through a sea of 1s, should also be performed. The output of the last flip-flop is monitored and must match the expected behavior on each clock cycle. The test stimulus and test response for the 3 flip-flop chain are shown in Table 1.

Table 1: Test vector.

Vector  Clock  Input  Output
1       C      1      L
2       C      0      L
3       C      0      H
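The walking-one test of Table 1 can be reproduced with a small behavioral simulation. The sketch below is illustrative only: the function name `run_chain` and the fault-injection parameters `stuck_wire` and `stuck_value` are assumptions introduced here and are not part of the original test program.

```python
from typing import List, Optional


def run_chain(inputs: List[int], n_ffs: int = 3,
              stuck_wire: Optional[int] = None,
              stuck_value: int = 1) -> List[str]:
    """Clock `inputs` through a chain of D flip-flops, all initially 0.

    Wire i connects flip-flop i to flip-flop i + 1.
    `stuck_wire` / `stuck_value` are hypothetical fault-injection knobs.
    Returns 'H' or 'L' for the last flip-flop after each clock.
    """
    ffs = [0] * n_ffs
    outputs = []
    for bit in inputs:
        # On the clock edge every flip-flop samples its D input.
        nxt = [bit] + ffs[:-1]
        if stuck_wire is not None:
            # A faulty wire overrides whatever the previous stage drove.
            nxt[stuck_wire + 1] = stuck_value
        ffs = nxt
        outputs.append("H" if ffs[-1] else "L")
    return outputs


good = run_chain([1, 0, 0])                                  # matches Table 1
stuck_high = run_chain([1, 0, 0], stuck_wire=0, stuck_value=1)
stuck_low = run_chain([1, 0, 0], stuck_wire=0, stuck_value=0)
print(good, stuck_high, stuck_low)
```

In this model the fault-free chain produces L, L, H, a stuck high wire makes the output go high one vector early, and a stuck low wire keeps the output low for all three vectors.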

An internal stuck high fault in the path of the chain causes the high state to emerge earlier than expected. A stuck low fault causes the output never to go high. This test approach provides few clues to the location of the defect. In the case of a stuck high fault, the first detected failure identifies only the faulty register nearest the end of the chain. No information on multiple defects is available, and there is not enough information to perform a physical failure analysis: the old fault diagnosis and localization method is still needed before there is a good probability of isolating the defect. In the case of an internal stuck low fault, even less is known about the fault location.

However, this test pattern is used for production go/no-go testing to reject failing units, and it provides the additional information of which metal interconnect layer is failing. Table 2 gives an example of how each test pattern targets an individual interconnect layer.

Table 2: FPGA Routing Resources Example.

Pattern Name  Routing Resource Type  Metal Layer
Pattern 1     A                      Metal 4
Pattern 2     B                      Metal 5
Pattern 3     C                      Metal 6
Pattern 4     D                      Metal 7

An example of a targeted test pattern is shown in Figure 2. The circles represent Programmable Interconnect Points (PIPs), each of which connects two interconnect wire segments. A1 and A2 are two separate interconnect wire segments belonging to routing resource type A, which resides on metal 4. In this way, the reprogrammability of the FPGA allows each test pattern to test an individual physical layer separately (see Table 2).

Figure 2: Shift register with targeted routing.

2.2 New Fault Localization Approach

As noted above, the shift register test pattern localizes the fault only to the register closest to the last register. The new methodology localizes the fault to the wire between two PIPs. The open metal interconnect defect at node 2 creates a stuck high fault at node 3 (Figure 3). Node 1 and the PIP before node 2 are driven low by the output of the D flip-flop. Because of the defect at node 2, the interconnect floats, and node 3 is interpreted as a stuck high fault due to a feature of the PIP.

Figure 3: Test pattern with open interconnect defect.

A simplified PIP structure (Figure 4) can be drawn with a weak pull-up resistor before each connection driving the next routing resource. If the interconnect is driven from the previous PIP, the weak pull-up has no effect. With an open defect, however, the weak pull-up drives the open interconnect to a logic ‘high’, which forces the PIP to drive a stuck high state to the next routing resource. The new methodology takes advantage of this behavior to localize a fault to a single routing element. Existing circuits in each PIP perform the pull-up operation.
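The effect of the weak pull-up can be illustrated with a simple behavioral model of a routed wire. This is a sketch under simplifying assumptions, not a circuit-level description of the PIP: the function `segment_states` and its parameters are hypothetical names, and each PIP is assumed to re-drive exactly the logic value it sees at its input.

```python
from typing import List, Optional


def segment_states(driven: int, n_segments: int,
                   open_segment: Optional[int] = None) -> List[int]:
    """Logic value seen at the PIP input at the end of each wire segment.

    Segment 0 is driven by the flip-flop; each PIP re-drives the next segment.
    An open in `open_segment` leaves that PIP input floating, so the weak
    pull-up reads 1 and every later segment is driven high.
    """
    states = []
    value = driven
    for seg in range(n_segments):
        if seg == open_segment:
            value = 1  # floating input pulled high by the weak pull-up
        states.append(value)
    return states


print(segment_states(0, 4))                  # fault-free wire driven low
print(segment_states(0, 4, open_segment=2))  # open in the third segment
print(segment_states(1, 4, open_segment=2))  # same open, wire driven high
```

In this model an open on a wire driven low turns every downstream segment high, while the same open on a wire driven high is indistinguishable from a good wire, which is consistent with the text driving the wire under test low.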


Figure 4: Test pattern with simplified PIP structure.

2.3 Fault Localization using Probe, Capture and Read Back

One advantage of an FPGA is that a routing resource can drive multiple PIPs, which allows the state of each wire segment to be monitored. The targeted test pattern is modified with this concept (Figure 5) to read the state of each routing segment. At each PIP intersection, an additional PIP is enabled and routed to an existing unused flip-flop. These additional flip-flops are referred to as “capture flip-flops”, and the addition of an extra PIP at each intersection is referred to as “probing”. The connection between the new PIP and the additional register uses routing that excludes the interconnect wires on the physical layer under test.

Figure 5: Targeted test pattern with Capture flip-flops added.

The logical state of each wire segment can then be loaded into all flip-flops with one clock cycle. In an FPGA, each flip-flop is paired with a memory cell that is loaded with the state of the flip-flop when a capture operation is executed. Figure 6 shows how a specific SRAM configuration memory cell is loaded with the value from a flip-flop in an FPGA. When the capture signal is enabled, the SRAM memory cell paired with the flip-flop is loaded with the value of the register [8].

Figure 6: Example of Capture Operation. SRAM memory cell loaded with value of flip-flop when Capture Signal is enabled.

Each SRAM configuration memory cell has a specific address location in the configuration memory map and is available in the readback data stream. Each SRAM memory cell that is paired with an FPGA flip-flop has a known, fixed location; therefore, after a read back operation, the contents of all flip-flops are known [8].

A fault is localized with the probed test pattern shown in Figure 7 after one clock cycle. Node 1 is driven low by the output of the flip-flop. The flip-flop at node 3 is driven low by node 1. An open defect at node 2 causes a stuck high fault and drives the flip-flops at nodes 4, 5 and 6 to an incorrect high value. The fault is determined to lie between the last known good reading at node 3 and the first failing reading at node 4. This is the key idea behind the new fault localization methodology. A new set of test patterns is generated with this fault diagnosis and localization ability. Multiple defects can be detected at the same time if they occur between different pairs of driving flip-flops.
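The post-processing step described above can be sketched as a lookup followed by a scan along each wire path. Everything in this example is hypothetical: the function names `decode_captures` and `localize_open`, the probe names, and the bit positions in `cell_index` are made up for illustration. The real location of each capture cell in the readback stream comes from the device configuration documentation [8] and is not reproduced here.

```python
from typing import Dict, List, Optional, Tuple


def decode_captures(readback: List[int],
                    cell_index: Dict[str, int]) -> Dict[str, int]:
    """Look up each capture flip-flop's paired memory cell in the readback data."""
    return {name: readback[idx] for name, idx in cell_index.items()}


def localize_open(probes: List[str], captured: Dict[str, int],
                  driven: int = 0) -> Optional[Tuple[Optional[str], str]]:
    """Return (last good probe, first failing probe) along one wire path.

    `probes` lists the capture flip-flops in driving order. The result is
    None when every probe reads the driven value; the first element is None
    when even the first probe fails.
    """
    last_good = None
    for name in probes:
        if captured[name] != driven:
            return (last_good, name)
        last_good = name
    return None


# Hypothetical data: two wire paths with three probes each.
cell_index = {"a0": 5, "a1": 9, "a2": 2, "b0": 0, "b1": 7, "b2": 4}
readback = [0, 1, 1, 0, 1, 0, 1, 0, 0, 1]   # made-up readback bits
captured = decode_captures(readback, cell_index)
paths = {"A": ["a0", "a1", "a2"], "B": ["b0", "b1", "b2"]}
faults = {p: localize_open(names, captured) for p, names in paths.items()}
print(faults)
```

Because each path between a pair of driving flip-flops is scanned independently, one readback yields one suspect segment per failing path, which mirrors the statement that multiple defects can be detected at the same time.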

Figure 7: Probed test pattern with defect after one clock cycle. The defect between node 3 and 4 will be detected.

2.4 Register Flush Test

The new test methodology requires that all supporting circuitry be known good before the targeted routing patterns are used. A register flush test pattern, shown in Figure 8, tests the drive flip-flops used in the targeted routing patterns. Using the flush test pattern increases the probability that a failure in a targeted routing pattern is due to an open interconnect. Only devices that pass the register flush test and fail the targeted routing patterns are further analyzed as having failing interconnect routing.

Figure 8: Register flush test pattern.

Experimental Results

The method was carried out on a Virtex-II Pro FPGA device. A full device description can be found at the Xilinx website (www.xilinx.com/virtex2pro) [9]. A wafer with high fallout due to interconnect failures was chosen for the experiment. Wafer sort test results indicate that most dies from this wafer failed the interconnect test pattern targeted at metal 4. The metal 4 test pattern with fault diagnosis and localization ability was then applied to the failing dies. Data was captured and read back from the memory cells of each failing die after one clock cycle. Post-processing of the data generated a symbolic failure map.

An obvious geometric failing pattern was observed on many failing dies, as shown in Figure 9 and Figure 10. Metal 4 runs horizontally across the die, and the failures clearly occur where the embedded IPs exist (the failing probe points are represented as darker dots). No failures were found in the areas above and below the embedded IP region. This points to a potential systematic process problem.

Figure 9: Symbolic failing map showing broken horizontal metal 4 interconnect wires around embedded IPs for SN1. It shows a potential systematic problem.


Figure 10: Symbolic failing map showing broken horizontal metal 4 interconnect wires around embedded IPs for SN2.

Probe point data for each failing wire connection between driving flip-flops are plotted in Figure 11 and Figure 12. Figure 11 plots failing interconnect wires driven from right to left. Figure 12 plots failing interconnect wires driven from left to right.

Figure 11: Plot of probe points for wires driving right to left on SN1. Open interconnect defect localized between the last “.” and first “F” for each line number.

In Figure 11 and Figure 12, each number in the first column identifies an individual physical line between two driving flip-flops. The columns numbered 0 to 6 represent probe points along the path. In these columns, the character ‘.’ represents ‘pass’, ‘F’ represents ‘fail’, and ‘0’ means that the point does not exist (this eases plot generation). In Figure 11, the lines are driven from right to left, so passing probe points are on the right and failures are on the left. An open interconnect fault lies at the transition from ‘.’ to ‘F’ on each line.
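The reading rule for these plots can be written down as a short scan over one row of symbols. The sketch below assumes each row is a string with one symbol per probe column and that ‘0’ entries are simply skipped wherever they appear; the function name `find_transition` and the example rows are invented for illustration and are not taken from Figure 11 or Figure 12.

```python
from typing import Optional, Tuple


def find_transition(row: str,
                    right_to_left: bool) -> Optional[Tuple[Optional[int], int]]:
    """Return (last passing column, first failing column) for one plot row.

    `row` holds one symbol per probe column: '.', 'F' or '0'.
    Columns marked '0' (no probe point) are skipped.
    Returns None when the row contains no failing probe point.
    """
    if right_to_left:
        cols = range(len(row) - 1, -1, -1)  # scan in driving direction
    else:
        cols = range(len(row))
    last_pass = None
    for col in cols:
        symbol = row[col]
        if symbol == "0":
            continue
        if symbol == "F":
            return (last_pass, col)
        last_pass = col
    return None


print(find_transition("FFF....", right_to_left=True))
print(find_transition("..FFF00", right_to_left=False))
print(find_transition("....000", right_to_left=False))
```

Scanning in the driving direction keeps the rule identical for both figures: the suspect wire segment lies between the two returned probe columns.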

Figure 12: Plot of probe points for wires driving left to right on SN1.


Similar data were obtained from analyzing other failing dies. A study of the layout shows that the metal 4 interconnect wires bridge up to metal 6 where they cross the embedded IP block, for physical layout convenience. Most of the localized faults correspond to the metal 4 to metal 6 bridging area. The newly designed metal 4 test pattern with fault diagnosis and localization ability precisely localizes the failing physical layer and location.

Two failure mechanisms were suspected before the physical FA started for this case: an open via between metal 4 and metal 6, or an open metal 6 interconnect caused by the drawn 45-degree diagonal lines. This unique geometric pattern may cause lithographic deformation during manufacturing. The physical FA results from a number of failing dies show metal 6 broken at the 45-degree diagonal lines. As suspected, lithographic deformation was the root cause. See Figure 13 and Figure 14.

Figure 13: Physical FA showing broken line at metal 6 from left side of embedded IP. Defect occurs at interconnect wires drawn at a 45-degree angle.

Figure 14: Similar broken metal 6 lines found on the right side of the embedded IP block.

This novel technique helped to promptly correct the problematic process step by diagnosing and localizing the failure in a very timely manner. Additional failing dies were analyzed with the traditional fault diagnosis and localization method, which takes significantly longer, and the identical root cause was found. See Figure 15.

Figure 15: FA results from traditional fault diagnosis and localization method.

The random metal particle shown in Figure 16 caused open interconnect metal failures, which were easily found by this technique.

Figure 16: Particle causing open metal lines.

Conclusion

The fault isolation time is reduced from days to minutes by using the targeted test pattern with fault localization ability. Multiple defects are localized on the same unit at the same time without additional effort. This methodology is now regularly applied to all possible metal interconnect layers for quick fault localization.

Acknowledgements

We greatly appreciate the management support and team effort from Zhi-Min Ling, Xiao-Yu Li, Randy Simmons, Tarek Elden, Jason Xu, James Guan, Teymour Mansour, Ian McEwen, Carlis Collins, Shahin Toutounchi, and Eric Thorne.

References

[1] A. Skumanich, Man-Ping Cai, J. Educato, and D. Yost, “Use of Test Structures for Cu Interconnect Process Development and Yield Enhancement”, Proceedings of the 2000 International Conference on Microelectronic Test Structures, pp. 63–66, March 2000.
[2] M.B. Tahoori, “Diagnosis of Open Defects in FPGA Interconnect”, Proceedings 2002 IEEE International Conference on Field-Programmable Technology, pp. 328–331, Dec 2002.
[3] Yinlei Yu, Jian Xu, Wei Kang Huang, and F. Lombardi, “Diagnosing Single Faults for Interconnects in SRAM Based FPGAs”, Proceedings of the Design Automation Conference, Asia and South Pacific, pp. 283–286 vol. 1, 1999.
[4] Sying-Jyan Wang and Chao-Neng Huang, “Testing and Diagnosis of Interconnect Structures in FPGAs”, Proceedings Seventh Asian Test Symposium, pp. 283–287, 1998.
[5] Yinlei Yu, Jian Xu, Wei Kang Huang, and F. Lombardi, “Minimizing the Number of Programming Steps for Diagnosis of Interconnect Faults in FPGAs”, Proceedings Eighth Asian Test Symposium, pp. 357–362, 1999.
[6] Yue Wang and Dongang Liu, “A Fast Diagnosis Method for Interconnect Fault in FPGA”, The 2002 45th Midwest Symposium on Circuits and Systems, pp. III 231-4 vol. 3, 2002.
[7] S. Toutounchi and A. Lai, “FPGA Test and Coverage”, Proceedings IEEE International Test Conference, pp. 599–608, 2002.
[8] “Virtex Configuration and Readback”, XAPP138, www.xilinx.com.
[9] Virtex-II Pro Platform Users Guide, Xilinx, Inc.