
A Diversified Memory Built-In Self-Repair Approach for Nanotechnologies Michael Nicolaidis, Nadir Achouri iRoC Technologies, Grenoble, France

Lorena Anghel TIMA Laboratory, Grenoble, France

Abstract
Memory Built-In Self-Repair (BISR) has been gaining importance for several years. Because defect densities are increasing with submicron scaling, more advanced solutions may be required for memories produced with the upcoming nanometric CMOS process generations. This problem will be exacerbated with nanotechnologies, where defect densities are predicted to reach levels that are several orders of magnitude higher than in current CMOS technologies. For such defect densities, traditional memory repair is not adequate. This work presents a diversified repair approach merging ECC codes and self-repair, for repairing memories affected by high defect densities. The approach was validated by means of statistical fault injection simulations considering defect densities as high as 3×10⁻² (3% of the cells are defective). The obtained results show that the approach provides close to 100% memory yield, at reasonable hardware cost, for technologies of very poor quality. Thus, the extreme defect densities that many authors predict for nanotechnologies do not represent a showstopper, at least as far as memories are concerned.

Keywords: memory repair, high defect densities, nanotechnologies, ECC codes, word repair

1. Introduction
CMOS technology is running into significant physical limits, expected to be reached shortly after 2010. The reasons are not only technical (leakage currents, signal integrity, power density, node storage capacities, ...) but also economic (e.g. the excessive cost expected for the fab lines of future CMOS process generations). Single-electron devices, quantum cellular automata, molecular components and nanocrystals are some possibilities for future technologies. Manufacturing alternatives are appearing on the horizon with the fast sophistication of chemical synthesis processes, which make it possible to synthesize electronic components and their interconnections chemically, and thus to create very complex systems at low cost.


These circuits would integrate hundreds of billions of devices in regular networks. However, the statistical yield of the chemical synthesis used in these approaches will result in systems affected by defect densities several orders of magnitude higher than in current technologies [1] [14], which could be as high as a few defective cells for every 100 memory cells. As a matter of fact, fault-tolerant approaches are becoming an enabling factor for these technologies. But a new fault-tolerant design paradigm is needed, since with such high defect densities both the regular resources and the redundant ones will be affected by defects, disabling the basic principle of traditional fault tolerance (the use of a fault-free redundant unit to perform the job of a faulty regular unit). The approaches presented in [1] [14] propose solutions for repairing processor designs affected by high defect densities. The present paper addresses memory repair for high defect densities. Existing memory repair approaches target defect densities affecting current CMOS technologies. Thus, they consider memories affected by a few faults. The word repair scheme in [3] considers low numbers of faults; in addition, faults in the spare parts are not considered. Subsequent works on word BISR [4], [5], [6] also consider low numbers of faults (e.g. two faults in [5]), and no faults in the spare units. Work on column/data BISR is more recent, due to the difficulty of elaborating the reconfiguration functions for this repair. Kim et al. [7] present a first work in this domain. To master the complexity of the reconfiguration process, the scheme repairs a single fault per test phase. This simplifies the work of the BISR control unit, but the test and repair time becomes unacceptable if large numbers of faults have to be repaired. That paper also considers a low number of faults. Another work [9] uses non-volatile memory cells to fix the repair of manufacturing faults once and for all. This problem is also treated more extensively in [2], where a taxonomy of approaches for repairing and fixing manufacturing and field faults is presented.

Again, the considered number of faults is low.

2. Built-In Self-Repair for High Defect Densities
A recent paper [10] derives optimal reconfiguration functions for data-bit repair. These functions repair multiple faults affecting both the regular and the spare elements, and perform the repair in a single test pass. The scheme also minimizes the hardware cost for implementing the repair control and for storing the reconfiguration information. It thus optimises both the BISR cost and the repair efficiency, which are important attributes when high defect densities have to be considered. A further paper [13] improves the above scheme by introducing a dynamic repair approach, which increases the multiplicity of repaired faults by using a single spare unit for repairing faults affecting several regular units. Thanks to these attributes, the scheme is a good candidate for addressing high defect densities. Thus, it was used in [11] for repairing memories affected by high defect densities. To further improve the repair efficiency, [11] introduced a diversified repair approach, which distributes the spare resources between several repair schemes.

The justification for this approach is based on the fact that a repair technique must repair all the faulty parts of a memory; otherwise the memory is rejected as un-repairable. Consider a memory composed of several parts and a scheme repairing each of them. For a given defect density, the distribution of the faults will result in a few parts having a number of faults much larger than that of the majority of the parts. These parts correspond to the right side of the defect distribution curve (see figure 1). Then, to be able to repair all parts with a high probability, the designer must a priori associate with each part an amount of redundancy much higher than what is needed for the majority of parts. The repair becomes very inefficient if the memory includes a large number of parts, since this increases the probability of having a few parts with a much larger number of faults than the majority of the parts. In this case it is more efficient to use, within each part, an amount of redundancy able to repair the moderate number of faults that affect the majority of them, and to add some extra (spare) parts to replace the few ones that include a larger number of faults. For implementing the diversified approach, a block repair scheme was also proposed in [11] and combined with the dynamic data-bit repair scheme. Evaluation experiments show that the combined scheme can repair memories affected by defect densities several orders of magnitude higher than in current technologies, at a moderate area cost. For instance, for a 1 Mbit memory affected by a 10⁻³ defect density (one defective cell for every 1000 memory cells), a yield of 93% was obtained by means of a 68.4% area overhead. This is a significant result; however, the approach becomes inefficient as we move to higher defect densities (e.g. of the order of 10⁻²). To cope with such defect densities, the present paper presents a new repair approach exploiting the diversified repair principle introduced in [11]. The scheme combines error-correcting codes with word repair. In the above example, and also in the rest of the paper, we consider as defect density Dd the defect probability per memory cell instead of the defect probability per memory area.

Figure 1: defect distribution within a total number of Q parts (number of parts versus number of defects per part; the majority of the parts, ≈Q, contain few defects, while a few parts contain many more).

3. Diversified Repair Combining ECC with Word Repair


To repair defects affecting a memory array, we will use an error-correcting code (ECC) to repair the majority of the faulty memory words, and a word repair scheme to repair the remaining words, left un-repaired because they include a larger number of faults. Usually, error-correcting codes are used to correct transient faults, such as soft errors induced when ionizing particles strike a memory. Such errors cannot be fixed by repair techniques: since particles can strike any memory cell at random, the affected bit cannot be fixed by reconfiguring the memory in an a priori repair approach. On the other hand, when the memory is affected by a few fabrication faults, using a repair scheme requires a lower area cost than an error-correcting code. For instance, to cope with 8 faults in a 512 Kbit memory using 32-bit words, a repair scheme using 2 spare bits requires 6.9% extra area, while the lowest-cost ECC (a Hamming code) correcting a single error per memory word requires 6 extra bits, or about 18.7% extra cost. In addition, the speed penalty of the ECC is significantly higher. Thus, error-correcting codes are used for transient faults, while repair is preferred for fabrication faults. However, when the defect density increases, the hardware cost of the repair approaches also increases, and at a certain level of defect density it becomes higher than the cost of the ECC codes.

Thus, ECC codes can become more efficient than other repair schemes. But ECC codes can correct only a few errors within a memory word. They quickly become impractical in terms of area cost, coding and decoding circuit complexity, and speed penalty as we increase the number of correctable errors per memory word. For instance, to use a double-error correcting code, we have to pay twice as many code bits and a significant speed penalty. For higher error multiplicity (i.e. for triple or higher error correction per memory word) ECC becomes very impractical. As a matter of fact, in a context of memories affected by high defect densities, repair using error-correcting codes alone becomes inefficient. In fact, for high defect densities, several words may include many errors (they correspond to the right side of the defect distribution curve of figure 1). Thus, repairing all memory words would require impractical codes able to correct errors of high multiplicity. However, if the majority of memory words include a small number of errors, the ECC will correct these errors. For instance, if the defect density is 10⁻² and we use a Hamming code to repair a memory using 32-bit words, the probability for a word to be left un-repaired is equal to 0.033 (i.e. the probability of having two or more faulty cells in the word). This probability is too high and will leave many words un-repaired in a large memory, resulting in a yield equal to 0. However, it is 8.3 times lower than the probability of a word being faulty. This reduces significantly the number of words that still need repair. It is therefore clear that error-correcting codes are good candidates for applying the diversified repair approach. This approach requires a scheme able to repair the majority of the memory parts (here the ECC code), and a second scheme able to repair the remaining parts. Since this second scheme must be able to repair words including multiple faulty cells, a word repair scheme is best suited for this purpose. However, for high defect densities, some spare words will also be affected by faults. Thus, the word-repair scheme must also cope with these faults, otherwise the repair will fail.
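As an illustration of these per-word probabilities (an example we add here; it is not part of the original evaluation flow), the short Python sketch below evaluates a simple binomial defect model: for a given defect density Dd and number of cells per word, it gives the probability that a word is faulty at all and the probability that it contains two or more faulty cells, i.e. that it is left un-repaired by a single-error-correcting code. The exact figures quoted above (such as 0.033) depend on which cells are counted (data bits only, or data plus check bits), so the values produced by this simple model are indicative only.

def word_fault_stats(dd, n_cells):
    # Binomial model: each of the n_cells cells is defective
    # independently with probability dd.
    p0 = (1 - dd) ** n_cells                       # word is fault-free
    p1 = n_cells * dd * (1 - dd) ** (n_cells - 1)  # exactly one defective cell
    p_faulty = 1 - p0                              # at least one defective cell
    p_beyond_sec = 1 - p0 - p1                     # two or more: beyond a SEC code
    return p_faulty, p_beyond_sec

# Example: Dd = 1e-2 and a 32-bit word (check bits not counted here).
p_faulty, p_beyond_sec = word_fault_stats(1e-2, 32)
print(round(p_faulty, 3), round(p_beyond_sec, 3))  # about 0.28 and 0.04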

3.1. Word Repair Considering Faulty Spare Words
Word repair is a well-known scheme introduced in [3]. This scheme uses a content-addressable memory (CAM) with k locations. During the test and repair mode, a counter is used to select a CAM location. When a fault is detected, the current memory address is stored in the selected CAM location and the counter is incremented. During a read or a write operation of the functional mode, the current memory address is compared in parallel against the address field of all the CAM locations.


A hit of the current memory address with the address field of a CAM location enables reading from or writing to the data field of the hit CAM location. On the other hand, an address miss enables reading from or writing to the memory. Because a memory test algorithm may address the same memory word several times, it may detect a faulty memory word several times. This would lead to storing the same memory address in several CAM locations, wasting spare resources. To avoid this, our scheme activates the address comparison mechanism also during the test and repair phase. Thus, if a fault is detected but at the same time the hit signal is activated, no address is written into the CAM. Existing word repair solutions consider the CAM to be fault-free and do not use mechanisms able to repair faults in the CAM. Thus, if a fault affects a CAM word, the only possible action consists in testing the CAM and rejecting the memory if a fault is detected in it. However, in the context of high defect densities, the memory may include thousands of words left un-repaired by the ECC code. Thus, the size of the CAM is large, and the probability of having a fault-free CAM is negligible, especially in technologies with high defect densities. Consequently, rejecting the memory each time a fault affects the CAM will result in a yield equal to 0. It is therefore clear that we need to cope with faults affecting the CAM. A possible solution consists in repairing the faulty CAM locations. However, this may add significant circuitry, consisting of the spare CAM locations and the reconfiguration logic. A more economical solution can be elaborated if we exploit the specificities of the operation of a CAM. In fact, since any CAM location can be used to replace a faulty memory word, we can simply invalidate the faulty CAM locations in a manner that ensures only fault-free CAM locations are used to replace faulty memory words. To do so, we add a flag cell to each CAM location (referred to as the fault indication flag). During a specific CAM test phase, we write in this flag the value 0 if the CAM word is found fault-free, and the value 1 if it is found faulty. The value 1 is then employed to disable the use of a faulty CAM location for performing repair. This is achieved as described below:
1- During the test and repair mode of the memory, each time the address counter of the CAM selects a CAM location in which the fault indication flag is equal to 1, the address counter is immediately incremented. This action selects another CAM location for storing the next faulty memory address.
2- The value 1 in the flag of a CAM location forces to 0 the output of the comparator of the address field of this location, as illustrated for flag F1 in figure 2 (the reasons for flag F2 shown in this figure will be discussed later).

This action ensures that, during the normal operation mode, a fault in the address field of a CAM location, or in the address comparator of this location, does not erroneously select the data field of the CAM location for performing a read or write operation. It also ensures that a CAM location containing a faulty data field is never selected.
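To make the mechanism concrete, the following Python sketch models the behaviour of such a CAM during the test-and-repair phase and the functional phase. It is a simplified software model written for this text (the class and method names are ours, not taken from the paper or from any tool), and it assumes the fault indication flags have already been set during the CAM test phase.

class RepairCAM:
    """Simplified model of a word-repair CAM with fault indication flags."""

    def __init__(self, flags):
        # flags[i] == 1 means CAM location i was found faulty during the CAM test.
        self.flags = list(flags)
        self.addr = [None] * len(flags)   # address field of each location
        self.data = [0] * len(flags)      # data field of each location
        self.counter = 0                  # selects the next free CAM location

    def _skip_faulty(self):
        # Test-and-repair mode: a location whose flag is 1 is skipped immediately.
        while self.counter < len(self.flags) and self.flags[self.counter] == 1:
            self.counter += 1

    def _hit(self, address):
        # Functional mode: parallel compare against all address fields.
        # A flag value of 1 forces the comparator output of that location to 0.
        for i, stored in enumerate(self.addr):
            if self.flags[i] == 0 and stored == address:
                return i
        return None

    def report_fault(self, address):
        # Called during test-and-repair when the memory word at `address` fails.
        if self._hit(address) is not None:
            return            # address already repaired: do not waste a location
        self._skip_faulty()
        if self.counter >= len(self.flags):
            raise RuntimeError("repair fails: no fault-free CAM location left")
        self.addr[self.counter] = address
        self.counter += 1

    def access(self, address):
        # Functional mode: returns the CAM location to use, or None to use the memory.
        return self._hit(address)

For instance, a CAM created as RepairCAM([0, 1, 0]) silently skips its faulty middle location and can still repair two faulty memory words; a real implementation performs the hit detection with the parallel comparators sketched in figure 2 rather than with a software loop.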

[Figure 2 diagram: a CAM location comprising an address field with its comparator, the flags F1 and F2 gating the comparator output, and a data field.]

Figure 2. Using a flag to invalidate the comparator of the address field of a CAM

Let us now illustrate this scheme by means of a numerical example. For a 1 Mbit memory having 32-bit words (32K x 32 bits), each CAM location will include an address field of 15 bits, a flag, and a data field of 32 bits. We have used designs where the area of the cells of the address field is 2.8 times larger than that of memory cells. The complete CAM location occupies an area equivalent to 2.4 memory words of 32 bits. Considering that the defect probability is similar for two circuits of similar area, the defect probability of the above CAM location is equivalent to the defect probability of a 77-bit memory word. Then, for a defect probability of 0.01 per memory cell (Dd = 10⁻²), the probability of a CAM location being fault-free will be 0.46. On the other hand, the probability of a 32-bit memory word being left un-repaired by the Hamming code will be 0.033. Thus, on average, 1000 words of the 32K x 32-bit memory will be un-repaired. A CAM of 2175 locations will be required on average to obtain the 1000 fault-free CAM locations needed for repairing this memory. This corresponds to 16.3% of the area of the 32K x 32-bit memory. This example is given for illustration purposes only, so rough calculations were used; more exact area overhead versus fabrication yield figures will be presented in the evaluation section. However, we can already see that the proposed diversified repair approach can achieve a good fabrication yield by adding a 16.3% area overhead on top of the 18.7% area overhead already added for the Hamming code. This extra area is less than the 19% extra area required to pass from the Hamming code to a double-error correcting code, although the yield obtained by means of that code is still 0. In fact, we have a 0.996 probability for a word to be fault-free or to be corrected by the 2-error correcting code, resulting in a yield of 1.15×10⁻⁵⁷ for the whole memory (the probability that all the memory words are fault-free or corrected by the code).
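The rough sizing arithmetic of this example can be reproduced with the short calculation below (our own back-of-the-envelope sketch; the paper rounds several intermediate values, so small discrepancies with the figures quoted above are expected).

# Sizing of the word-repair CAM for a 32K x 32-bit memory at Dd = 1e-2.
dd = 1e-2
n_words = 32 * 1024
cam_equiv_bits = 77        # a CAM location has the defect exposure of ~77 memory cells
cam_area_words = 2.4       # ...and the silicon area of ~2.4 memory words

p_cam_ok = (1 - dd) ** cam_equiv_bits   # ~0.46: a CAM location is fault-free
unrepaired_words = 1000                 # the text rounds 32K * 0.033 to ~1000 words
cam_locations = unrepaired_words / p_cam_ok          # locations needed on average
cam_overhead = 100 * cam_locations * cam_area_words / n_words
print(round(p_cam_ok, 2), round(cam_locations), round(cam_overhead, 1))
# Roughly 0.46, ~2170 locations and ~16% extra area (the text quotes 2175 and 16.3%).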


This illustrates the efficiency of the diversified approach, which employs a first repair scheme for repairing the majority of the defective parts, and a second repair scheme that specifically targets the fault distributions for which the first scheme is inefficient. These numerical results assume that each faulty CAM location is deselected by means of its fault indication flag. However, if a CAM location contains a fault in the address field or in the data field together with a fault in the flag, such that the flag is not able to invalidate the faulty CAM location, the repair may fail. The probability of this situation is computed by combining the probability of having a faulty CAM location (found earlier to be 0.46) with the probability of having an error in the flag cell or in the transistor driven by this cell (0.0116). This gives a probability of 0.0054 for such a combined fault to affect a CAM location. Thus, the probability that no CAM location is affected by such a combined fault is equal to 7.68×10⁻⁶, bringing the yield to 0. To achieve a reasonable yield, we can use two flags F1, F2 to disable the output of the comparator of the address field of each CAM location, as shown in figure 2. This results in a 0.13 probability of having an un-repaired memory due to this problem. This probability becomes 0.002 if we triplicate the flag. This guarantees that combined faults affecting a CAM location and its fault indication flags will not affect the yield significantly.
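The benefit of duplicating or triplicating the flag can be checked with the following small calculation (a sketch we add here, assuming faults in the flag copies and in the rest of the CAM location are independent, and reusing the per-location figures quoted above).

# Probability that at least one CAM location escapes invalidation, i.e. the
# location is faulty AND all k copies of its fault indication flag are faulty too.
p_loc_faulty = 0.46    # per-location fault probability used in the text
p_flag_fault = 0.0116  # fault in a flag cell or in the transistor it drives
n_locations = 2175

for k in (1, 2, 3):
    p_escape = p_loc_faulty * p_flag_fault ** k          # one given location escapes
    p_repair_fails = 1 - (1 - p_escape) ** n_locations   # at least one escapes
    print(k, round(p_repair_fails, 3))
# With a single flag the memory is almost surely un-repairable; two flags give
# about 0.13 and three flags about 0.002, matching the figures discussed above.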

4. Evaluation
To evaluate the repair efficiency, we need to determine the yield improvement obtained by means of our scheme, and the corresponding hardware cost. This has to be done for various defect densities. The hardware cost was determined for a commercial 0.18 um process. Concerning the repair efficiency, we have considered two approaches: one uses analytical expressions, and the other uses a probabilistic fault injection tool. The analytical approach ran into serious computational problems, because it involves very large terms such as Nw! (Nw being the number of memory words) and very small terms such as Pw0/1^Nw (Pw0/1 being the probability of a word being fault-free or including a single fault). By rearranging the expressions, it was possible to multiply each factor of a large term with one factor of a small term (i.e. a factor of Nw! by a factor of Pw0/1^Nw), to avoid the appearance of intermediate results with very large and very small values. These rearrangements allowed the computations to be performed for several defect densities. But as we increased the defect densities, the number of required spare words increased drastically. This resulted in a new family of terms with very large and very small values. Thus, a second level of rearrangement is required, resulting in very complex formulas. These new rearrangements are under implementation. Due to these problems, the repair efficiency presented in table 1 was obtained by means of a probabilistic fault injection tool that we have developed within the Fracture IST European project. The evaluation of various repair schemes performed in the Fracture project showed that the analytical and the fault injection approaches give very similar results.
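As a side note (our own remark, not from the original evaluation), the same over/underflow problem can also be sidestepped by carrying the whole computation out in the logarithmic domain, as in the sketch below; the factor-pairing rearrangement described above is an alternative that stays in the linear domain.

import math

def log_binom_pmf(k, n, log_p, log_q):
    # log of C(n, k) * p^k * (1 - p)^(n - k), computed without ever forming
    # n! or p^n explicitly, so no overflow or underflow occurs.
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * log_p + (n - k) * log_q)

# Toy example: probability that at most `spares` words of a 32K-word memory need
# word repair, each word needing it independently with probability p_w.
p_w, n_words, spares = 0.033, 32 * 1024, 1100
log_p, log_q = math.log(p_w), math.log(1 - p_w)
prob = sum(math.exp(log_binom_pmf(k, n_words, log_p, log_q)) for k in range(spares + 1))
print(prob)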

Table 1: 1 Mbit memory, 32-bit word length (for all defect densities the yield without repair is 0%)

Defect density Dd          Spare words (%) / area overhead (%)    Yield (%)
1×10⁻⁴ (100 defects)       0 / 19                                 1
                           0.01 / 19.03                           19
                           0.03 / 19.09                           89
                           0.05 / 19.15                           100
3×10⁻⁴ (300 defects)       0.01 / 19.03                           0
                           0.05 / 19.15                           30
                           0.07 / 19.21                           80
                           0.1 / 19.3                             99
                           0.5 / 20.5                             100
1×10⁻³ (1000 defects)      0.1 / 19.3                             0
                           0.2 / 19.6                             10
                           0.3 / 19.9                             99
                           0.5 / 20.5                             100
3×10⁻³ (3000 defects)      0.1 / 19.3                             0
                           0.9 / 21.7                             27
                           1 / 22                                 98
                           5 / 34                                 100
1×10⁻² (10000 defects)     1 / 22                                 0
                           5 / 34                                 5
                           6 / 37                                 100
3×10⁻² (30000 defects)     20 / 79                                0
                           25 / 94                                100

The evaluations shown in table 1 were done for a 1 Mbit memory using 32-bit words. We have considered defect densities of Dd = 10⁻⁴, 3×10⁻⁴, 10⁻³, 3×10⁻³, 10⁻², and 3×10⁻². The first column of the table gives the defect density considered in the experiments and the mean number of faults per memory for this defect density. The second column gives the number of CAM (spare) words and the area overhead (area of the ECC bits plus area of the CAM). The last column shows the yield. In all cases we have used CAMs with one or two flag cells per word; for CAMs with no flag cells the yield was very low for the 10⁻⁴ defect density, and 0 for all other defect densities.
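For reference, a Monte Carlo evaluation in the spirit of such a fault injection tool can be sketched in a few lines of Python. This is our own simplified illustration, not the Fracture tool: it models only the ECC-plus-CAM repair path, assumes independent cell defects, and ignores the flag mechanism and the peripheral logic, so its absolute numbers should not be compared directly with table 1.

import random

def memory_yield(p_word_unrepaired, p_cam_location_ok,
                 n_words=32 * 1024, cam_locations=2175, trials=1000):
    # A memory is counted as repaired in a trial when the number of words left
    # over by the ECC does not exceed the number of fault-free CAM locations.
    repaired = 0
    for _ in range(trials):
        need_cam = sum(random.random() < p_word_unrepaired for _ in range(n_words))
        usable_cam = sum(random.random() < p_cam_location_ok for _ in range(cam_locations))
        repaired += need_cam <= usable_cam
    return repaired / trials

# Example: memory_yield(0.033, 0.46) estimates the yield for the rough per-word and
# per-location figures of section 3.1; the real tool models the hardware in detail.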


From table 1 we observe that for the ECC code alone (0 spare words), the yield is always 0. Let us now make some comparisons. For Dd = 10⁻⁴, the combined ECC and word repair scheme requires an area overhead of 19.1%. This is better than the results obtained with the best of the schemes discussed in previous works targeting high defect densities [11] and [12], which for Dd = 10⁻⁴ require an area overhead of 28.7%. The approaches in [11] and [12] become better only for lower defect densities (3×10⁻⁵, 10⁻⁵, or less), since for these defect densities they require an area overhead (18.4% and 14.2%) which is even lower than that of the Hamming code alone (19%). As we increase the defect density, the superiority of the approach proposed here becomes more obvious, as it is the only one capable of repairing memories affected by defect densities as high as 10⁻² with a moderate extra area (100% yield for 37% extra area). For higher defect densities (i.e. 3×10⁻² in the last row of table 1), the repair cost increases significantly (94%). The reason is that for this defect density the mean number of faulty cells per memory word and its code becomes 1.14. This is higher than the single faulty cell that can be corrected by the ECC. Thus, for this mean value, the distribution of defects will give a very large number of words that are un-repairable by the ECC (in figure 1, this corresponds to the right-half part of the curve of defect distribution over the memory words). This situation requires a very large number of CAM words (about 25% of the memory words and 75% of the memory area). To reduce the area cost we could use a double-error correcting code, since in this case the number of correctable faults per memory word (2) becomes much higher than the mean number of faulty cells per memory word and its double-error correcting code (1.3). However, such codes require a very complex and slow encoder/decoder. Thus, a better solution is to divide the memory word into two halves of 16 bits each, and to associate a Hamming code with each half of a memory word. In this case, the mean number of faulty cells per half memory word and its code becomes 0.63, which is much smaller than 1.18. As a matter of fact, we can expect this solution to reduce significantly the area of the CAM, although some of the saved area will be absorbed by the higher area cost of the ECC. For the defect density Dd = 3×10⁻², the results for this implementation are shown in table 2. A significant reduction of the area overhead is obtained, since we achieve a 100% yield by means of a 61.5% area overhead, instead of the 94% area overhead of table 1.
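The mean-fault figures used in this comparison follow from a simple product of the protected cell count and the defect density, as the short sketch below shows (the check-bit counts assumed for the 16-bit Hamming code and for the double-error correcting code are ours, chosen to be consistent with the figures quoted in the text).

dd = 3e-2  # defect density of the last row of table 1

configs = {
    "32-bit word + Hamming code (6 check bits)":         32 + 6,
    "32-bit word + double-error correcting code (~12)":  32 + 12,
    "16-bit half word + Hamming code (5 check bits)":    16 + 5,
}
for name, cells in configs.items():
    print(name, "-> mean number of faulty cells:", round(cells * dd, 2))
# Roughly 1.14, 1.32 and 0.63: only the partitioned scheme keeps the mean well
# below the single fault per (half) word that a Hamming code can correct.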

Table 2: 1 Mbit memory, 32-bit word length, Dd = 3×10⁻² (30000 defects), one ECC per 16 bits

Spare words (%) / area overhead (%)    Yield (%)
0 / 31.5                               0
5 / 46.5                               0
10 / 61.5                              100

This solution can also be applied for defect densities lower than 3×10⁻², in situations where the memory word is larger than 32 bits. In such a case the mean number of faulty cells per word can be higher than 1 even for lower defect densities. Thus, partitioning the memory word into two or more parts, and using one Hamming code per part, can be more efficient.

Conclusions
This paper considers memory repair approaches for nanotechnologies. For these technologies many authors expect very high defect densities. Thus, repair approaches able to cope with very high defect densities are mandatory for enabling such technologies. To tackle this problem, we propose a diversified repair scheme using ECC codes to repair the majority of the faulty memory words, and a word repair scheme to repair the words left un-repaired by the ECC code. For the considered defect densities, the CAM used for performing word repair will also be affected by faults. So, we have also proposed a low-cost CAM repair architecture. Instead of using spare CAM locations to replace defective CAM locations, this architecture uses just a flag cell to discard faulty CAM locations; the reconfiguration circuitry is thus simplified drastically. We have illustrated that this diversified scheme can repair memories affected by high defect densities, while a scheme based on ECC codes alone results in zero yield, even if it uses ECC codes whose redundancy cost is higher than that of the mixed scheme. Of course, several other combinations of repair schemes can be used with the diversified repair approach, including the use of more than two schemes. The important condition for making the approach successful is to select a first scheme that efficiently repairs the majority of the faulty units, and a second scheme that repairs the particular fault distributions left un-repaired by the first scheme. An important outcome of this work is that high memory fabrication yield can be achieved, by means of low area overhead, even for technologies affected by extremely high defect densities. For instance, for a 1 Mbit memory using 32-bit words and for a 3×10⁻² defect density, a 100% yield is achieved by means of 61.5% extra area, while for a 10⁻² defect density a 100% yield is achieved by means of 37% extra area.


In comparison, fault tolerant solutions for logic circuits affected by similar levels of defect density require a huge hardware overhead: for instance, a 25000% area overhead is required for a 10⁻³ defect density [14]. Thus, an important outcome of this work is that the extreme defect densities that many authors predict for nanotechnologies do not represent a showstopper, at least as far as memories are concerned.

References
[1] Heath J.R., Kuekes P.J., Snider G.S., Williams R.S., "A Defect-Tolerant Computer Architecture: Opportunities for Nanotechnology", Science, vol. 280, June 12, 1998.
[2] Zorian Y., "Embedded Memory Test & Repair: Infrastructure IP for SOC Yield", 2002 IEEE International Test Conference.
[3] Sawada K., Sakurai T., Uchino Y., Yamada K., "Built-In Self Repair Circuit for High Density ASMIC", IEEE 1989 Custom Integrated Circuits Conference.
[4] Tanabe A. et al., "A 30-ns 64-Mb DRAM with Built-in Self-Test and Self-Repair Function", IEEE Journal of Solid-State Circuits, vol. 27, no. 11, pp. 1525-1533, Nov. 1992.
[5] Bhavsar D.K., Edmondson J.H., "Testability Strategy of the Alpha AXP 21164 Microprocessor", 1994 IEEE International Test Conference.
[6] Benso A. et al., "A Family of Self-Repair SRAM Cores", in Proc. IEEE International On-Line Testing Workshop, July 3-5, 2000.
[7] Kim I., Zorian Y., Komoriya G., Pham H., Higgins F.P., Lewandowski J.L., "Built-In Self Repair for Embedded High-Density SRAM", Proc. International Test Conference, 1998, pp. 1112-1119.
[9] Schober V., Paul S., Picot O., "Memory Built-In Self-Repair Using Redundant Words", 2001 IEEE International Test Conference.
[10] Nicolaidis M., Achouri N., Boutobza S., "Optimal Reconfiguration Functions for Column or Data-bit Built-In Self-Repair", Design, Automation and Test in Europe (DATE'03), March 3-7, 2003, Munich, Germany.
[11] Nicolaidis M., Achouri N., Anghel L., "Memory Built-In Self-Repair for Nanotechnologies", 2003 IEEE International On-Line Testing Symposium, July 7-9, 2003, Kos, Greece.
[12] Nicolaidis M., Achouri N., Anghel L., "A Memory Built-In Self-Repair for High Defect Densities Based on Error Polarity", in Proceedings of the 2003 IEEE Defect and Fault Tolerance Symposium, November 3-5, 2003, Cambridge, MA, USA.
[13] Nicolaidis M., Achouri N., Boutobza S., "Dynamic Data-bit Memory Built-In Self-Repair", in Proceedings of the IEEE International Conference on Computer-Aided Design, November 2003, USA.
[14] Han J., Jonker P., "A System Architecture Solution for Unreliable Nanoelectronic Devices", IEEE Transactions on Nanotechnology, vol. 1, no. 4, December 2002.