Multiple Fault Diagnosis in Crossbar Nano-architectures

Navid Farazmand (1), Mehdi B. Tahoori (1, 2)

(1) Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA. Email: [email protected]

(2) Faculty of Informatik (ITEC), Karlsruhe Institute of Technology, Karlsruhe, Germany. Email: [email protected]

Abstract—Bottom-up self-assembly of nano-crossbars from carbon nano-tubes and semiconductor nano-wires has shown the potential to overcome the limitations of lithographic CMOS fabrication for further down-scaling. However, very high permanent and transient fault rates necessitate efficient fault tolerance techniques capable of handling multiple faults. Self repair provides fault tolerance through fault detection, diagnosis, and re-configuration to recover from permanent faults. In this paper, we present a multiple fault diagnosis scheme based on dual rail error checking for nano-crossbar architectures. The proposed scheme is capable of identifying multiple faulty crosspoints with very low performance and area overheads. Experimental results show that all of the injected multiple faults are correctly diagnosed.

I. INTRODUCTION

Carbon Nano-Tubes (CNT) and Semiconductor Nano-Wires (SNW) have been shown to be promising materials for the fabrication of nano-electronic circuits, continuing Moore’s law beyond the limitations of CMOS. These materials are used to implement diodes and transistors as well as interconnect wires [1], [2]. Bottom-up self-assembly is used to combine devices built from CNTs and SNWs into regular crossbar structures (nano-crossbars). Different Programmable Logic Array (PLA)-like architectures using nano-crossbars as the basic block have been proposed [1], [3], [4].

Due to the high susceptibility of nano-crossbars to transient faults and wear-out defects, it is necessary to incorporate appropriate fault tolerance schemes into the circuit. Fault tolerance techniques fall into two main categories: i) fault masking and ii) self repair. However, experimental analyses show that at the high fault rates anticipated for nano-crossbars, fault masking techniques such as N-Modular Redundancy (NMR) become ineffective [5], [6]. Self repair techniques are based on fault detection and recovery. Recovery techniques such as checkpointing and rollback recovery [7] exist for transient faults. For permanent faults, self repair consists of three steps: i) fault detection, ii) fault location (diagnosis), and iii) circuit re-configuration.

In our previous work [6], [8], we proposed an efficient online multiple error detection scheme based on dual rail logic implementation. The objective of this paper is to present a multiple fault diagnosis and re-configuration (repair) scheme. In a circuit composed of arrays of nano-crossbars, a faulty crossbar, or a product/output line in the faulty crossbar, could

978-1-4244-5833-2/10/$26.00 ©2010 IEEE


be identified to be replaced. The effectiveness of a diagnosis scheme can be described in terms of 1) diagnosis resolution, 2) diagnosis latency (number of test vectors and diagnosis re-configurations), and 3) required hardware resources.

Diagnosis has been used in the scope of defect tolerance for yield enhancement [9], [10], [11], [12]. These techniques suffer from very large test application time [9], [10] or re-configuration time [11], [12]. Recent diagnosis techniques for fault tolerance in nano-crossbars cannot detect some cases of multiple faults [13], cannot be implemented efficiently on nano-crossbars [14], and suffer from high diagnosis time [13], [14].

In this paper, we propose a multiple fault diagnosis technique compatible with dual rail online detection [8]. Our scheme can locate multiple faults at different resolutions: block level, product/output lines, and crosspoints. Therefore, it provides a tradeoff between diagnosis resolution and latency. By analyzing the dual-rail outputs, the test circuit configuration, and test vector application, we balance diagnosis resolution against latency. Experimental results show that the hardware overhead of our scheme is about 28%.

The rest of this paper is organized as follows: Preliminaries and previous work are presented in Section II. The proposed diagnosis architecture and algorithms are presented in Section III. Experimental results are presented in Section IV. Finally, Section V concludes the paper.

II. PRELIMINARY AND PREVIOUS WORK

A. PLA-like nano-crossbar architectures

Bottom-up integration approaches are used to assemble nano-materials (CNTs and SNWs) into larger structures [15]. A common feature of most of them is that they can only fabricate random or regular structures [15]. Nano-crossbars are composed of perpendicular sets of parallel nano-wires (Fig. 1). Diodes and transistors are formed at the intersections of CNTs/SNWs. Combining these logic elements with molecular [15] or electromechanical [16] bistable switches at the intersections results in a configurable regular logic structure. Bistable switches can be electrically (re)programmed to either the activated (ON) or the deactivated (OFF) state, leading to a flexible re-configurable regular logic [2]. Programmable Logic Array (PLA)-like nano-architectures have been proposed that employ nano-crossbars to realize large (re)configurable circuits [1], [2], [3], [4]. This


re-configurability can be used for defect and fault tolerance [11], [12], [13], [14]. There are two cost components associated with re-configuration time [4], [3], [1]: 1) the time for the serialized re-configuration of crosspoint switches, which is proportional to the total number of switches being programmed, and 2) the time required to switch between normal operation and re-configuration.

B. Fault model

We consider two types of crosspoint faults: insertion and deletion. A crosspoint insertion fault is an activated (ON) crosspoint that is supposed to be deactivated (OFF) in the fault-free circuit. Similarly, a crosspoint deletion fault is a deactivated (OFF) crosspoint that is supposed to be activated (ON). We consider crosspoint insertion and deletion in both the AND and OR planes. These faults represent the highest diagnosis resolution.

C. Previous work

Defect tolerance techniques based on re-configuration have been proposed for CMOS PLAs [9], [10]. Using “walking 0” on the input lines and “walking 1” on the product lines, the current PLA configuration is extracted. By comparing this configuration with the expected configuration, faulty crosspoints can be identified with a total of k × (m + 1) tests (m: # of inputs, k: # of products). These techniques employ CMOS shift registers [9], [10] to select individual products/inputs. Such constructs are not available in nano-crossbars. Besides, these techniques require the application of large test sets.

Built-In Self Test (BIST) and diagnosis schemes configure the circuit as groups of Test Generators (TG) and Response Generators (RG) [11], [12]. They test and diagnose the circuit through multiple test configurations. They generally have low diagnosis resolution, i.e. crossbar level, and employ a large number of (re)configurations.

Row-column based test and diagnosis for nano-crossbars has been presented in [13]. This technique lowers the diagnosis resolution to reduce the number of test patterns. Some multiple fault scenarios that may invalidate the test (and diagnosis) process are not addressed in that paper. Also, the number of test sets is still high.

Lastly, the authors of [14] propose an online diagnosis scheme for nano-crossbars which is run upon the detection of a faulty value on an output. They follow the same “walking 0” process mentioned above for AND-plane faults, only for zeros in the

Fig. 2. Multistage dual rail nano-crossbar (signal pairs PI1/PI1' to PI3/PI3', I1/I1' to I3/I3', O1/O1' to O3/O3', and PO1/PO1' to PO3/PO3'). Fault effects are propagated through the PLAs all the way to the primary outputs (PO). There, at the POs, faults are detected by checking the dual rail outputs.

input pattern (“walking 1” is used for OR-plane faults only on the active products for the current input). The limitations of this scheme include the assumption of access to the outputs of both the AND-plane and the OR-plane, and the successive checking of a large number of products and outputs using a 1-bit comparator. Besides, the authors do not present an implementation of “walking 0s/1s” on the inputs and products.

D. Dual rail online fault detection scheme

Our proposed diagnosis scheme is based on our previous work on dual-rail online multiple error detection [8]. Fig. 2 shows an example of a two-stage circuit implemented with PLA-like nano-crossbars. To detect faults with the dual rail scheme, both the outputs (Ok) and their complements (Ok') are implemented, assuming that both the inputs and their complements (ik and ik') are available. Therefore, in the fault-free situation all outputs are in dual rail form, i.e. each pair Ok Ok' is 01 or 10. By checking the dual rail property of the outputs we are able to detect a very high percentage of multiple faults in the circuit [8], [6]. As shown in Fig. 2, dual rail checking is done only at the primary outputs, thus eliminating intermediate checkers for each nano-crossbar stage.

When multiple crosspoint faults occur in a nano-crossbar, a Non-Dual Rail Output (NDRO) might be generated. Suppose that Ok Ok' is 01 in the fault-free case. The effects of different faults on the dual rail property of the output are as follows:
• Crosspoint insertion in the AND-plane: causes a one-to-zero change in the output Ok', resulting in the faulty value 00 on Ok Ok'. Hereafter we refer to this effect as ZNDRO (Zero NDRO).
• Crosspoint insertion in the OR-plane: generates the faulty value 11 on Ok Ok', referred to as ONDRO (One NDRO).
• Crosspoint deletion in the AND-plane: results in ONDRO.
• Crosspoint deletion in the OR-plane: results in ZNDRO.

III. PROPOSED DIAGNOSIS SCHEME

Fig. 1. Typical nano-crossbar regular structure formed by self-assembly of CNTs and SNWs. Bistable programmable switches exist at the crosspoints ([2]).

Diagnosis in nano-crossbar based architectures is done in two levels: i) block level diagnosis which identifies faulty nano-crossbar blocks, and ii) detailed diagnosis which identifies faulty resources within blocks, i.e. product lines, output lines, and individual crosspoints (Fig. 2).
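The fault effects listed in Section II-D (ZNDRO and ONDRO) can be reproduced with a small software model. The following is only an illustrative sketch: it assumes an idealized PLA in which a product is the AND of the input lines it has crosspoints on and an output is the OR of the products connected to it. The names `eval_pla` and `classify` and the two-variable example circuit are hypothetical and do not come from the paper.

```python
def eval_pla(and_plane, or_plane, lines):
    """Evaluate an idealized PLA.

    and_plane[p] is the set of input-line indices with a crosspoint on
    product p; or_plane[o] is the set of products connected to output o.
    """
    products = [all(lines[i] for i in xp) for xp in and_plane]
    return [int(any(products[p] for p in conn)) for conn in or_plane]


def classify(pair):
    """Name the dual rail status of an output pair (O, O')."""
    if pair == [0, 0]:
        return "ZNDRO"
    if pair == [1, 1]:
        return "ONDRO"
    return "dual rail"


# Hypothetical circuit: O = a AND b, O' = a' OR b'.
# Input lines are ordered a, a', b, b'.
AND = [{0, 2}, {1}, {3}]   # P0 = a.b, P1 = a', P2 = b'
OR = [{0}, {1, 2}]         # O <- P0; O' <- P1, P2
lines = [0, 1, 1, 0]       # a = 0, b = 1, so fault-free O O' = 01

# Insertion in the AND-plane: extra crosspoint (a, P1) deactivates P1.
and_ins = [set(s) for s in AND]
and_ins[1].add(0)
# Insertion in the OR-plane: extra crosspoint (P1, O).
or_ins = [set(s) for s in OR]
or_ins[0].add(1)
# Deletion in the AND-plane: crosspoint (a, P0) removed, so P0 = b = 1.
and_del = [set(s) for s in AND]
and_del[0].discard(0)
# Deletion in the OR-plane: crosspoint (P1, O') removed.
or_del = [set(s) for s in OR]
or_del[1].discard(1)

results = {
    "fault free": classify(eval_pla(AND, OR, lines)),
    "AND insertion": classify(eval_pla(and_ins, OR, lines)),
    "OR insertion": classify(eval_pla(AND, or_ins, lines)),
    "AND deletion": classify(eval_pla(and_del, OR, lines)),
    "OR deletion": classify(eval_pla(AND, or_del, lines)),
}
```

In this toy model the four single faults map to ZNDRO, ONDRO, ONDRO and ZNDRO in the order listed, matching the effects described for a fault-free value of 01.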


TABLE I
FAULT LOCATION BASED ON FAULT-FREE (RO1 RO2) AND FAULTY (F F') VALUES

RO1 RO2 | F F' = 00 | F F' = 11
00 | insertion in AND-plane | insertion in AND-plane, insertion in OR-plane
01/10 | deletion in OR-plane | insertion in OR-plane
11 | deletion in AND-plane, deletion in OR-plane | deletion in AND-plane

A. Block level diagnosis

The fault detection scheme propagates faulty outputs all the way to the checkers at the primary outputs (Fig. 2). So, faulty blocks need to be diagnosed first, and then faulty resources within the blocks are identified hierarchically. The following two schemes are proposed for block level diagnosis:
• Re-configure the circuit to incorporate dual rail checkers, implemented with nano-scale devices, for each nano-crossbar. This identifies which individual nano-crossbars should be diagnosed. When available resources are limited, one dual rail checker can be used to check all blocks iteratively through rerouting.
• Observe the outputs of the individual crossbars iteratively through the interface with the CMOS controllers. This interface is also necessary for configuring the architecture to carry out a specific function.
The focus of the rest of this paper is on detailed diagnosis.

Fig. 3. A nano-crossbar with two dual rail outputs, shown for input ABCD = 0101 (F1 F1' = 10, F2 F2' = 01). RO1 and RO2 are redundant re-configurable outputs in the OR-plane used for plane level diagnosis. Fault-free crosspoints are drawn with one marker in the AND-plane and with ‘×’ in the OR-plane. A separate marker denotes a faulty crosspoint added to the AND-plane, and an outline around a crosspoint denotes the deletion of that crosspoint due to a fault. Finally, solid triangles are OR-plane crosspoints configured on RO1 RO2 according to the outputs being diagnosed (F1 F1' in this figure).

Fig. 4. The PLA of Fig. 3 equipped with a decoder in the AND-plane (decoder inputs DI0 to DI2) to enable selecting individual product lines. The decoder employs an m-out-of-n code; each codeword selects one product from every output. Only products belonging to the outputs being tested are observed at RO1 and RO2.

B. Detailed diagnosis

The location of faults can be confined to a plane, a product line, an output line, or individual crosspoints (various diagnosis resolutions). This provides a tradeoff between diagnosis resolution and latency, depending on the availability of spare resources and other design requirements.

1) Identifying faulty plane (AND or OR plane): To identify the faulty plane of the PLA we take advantage of two additional spare outputs (RO1 and RO2 in Fig. 3). These spares can be shared between the diagnosis and recovery phases. After detecting a non-dual rail output in a PLA, diagnosis for that output starts with configuring RO1 RO2 to implement the desired function of that output. By comparing the faulty outputs with the values on the spare lines, the location of the fault(s) can be identified based on Table I. This process is illustrated in Fig. 5-a.

For example, assume that ABCD = 0101, which implies the values 10 on F1 F1' and 01 on F2 F2' (P2 and P9 are 1). Suppose that a crosspoint insertion occurs at the intersection of P2 and D, which deactivates P2 and thus changes the value on F1 F1' to 00. RO1 RO2 are configured to implement the F1 F1' function (Fig. 3). Since none of the products used in either F1 or F1' is active, the value on RO1 RO2 is 00 in this case. According to Table I, crosspoint insertion in the AND-plane is identified correctly.

In short, a simple interpretation of Table I is that whenever the faulty outputs are the same as RO1 RO2, the fault(s) is (are) in the AND-plane. Based on the fault effects presented in Section II-D, if the faulty value is a ZNDRO the fault type is crosspoint insertion; otherwise it is crosspoint deletion. If RO1 RO2 is dual rail (10 or 01), the fault(s) is (are) in the OR-plane, and the faulty output is the one whose value differs from the corresponding ROi. As an example, consider a crosspoint deletion in Fig. 3 at the intersection of P2 and F1, with the same input as in the previous example. F1 F1' becomes 00 while RO1 RO2 remains 10. In the OR-plane, a faulty ZNDRO is due to crosspoint deletion and an ONDRO is the result of crosspoint insertion. Finally, if the faulty output pair is 00 (11) and RO1 RO2 is 11 (00), both output lines are faulty (OR-plane fault) and should be replaced. In addition, in this case there are some faults in the AND-plane which cause the non-dual rail value on RO1 RO2.

2) Identifying faulty product in AND-plane: In the first step of diagnosis, the faulty plane was identified. If the fault(s) is (are) in the OR-plane, the exact faulty output is already identified in that step and no further step is required.
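The plane level decision of Table I can be written as a short lookup. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function name `locate_fault` and the encoding of output pairs as two-character strings are hypothetical. It uses the two rules stated in the text: a non-dual rail value on RO1 RO2 indicates an AND-plane fault (00 for insertion, 11 for deletion), and a faulty pair that differs from RO1 RO2 indicates an OR-plane fault (ZNDRO for deletion, ONDRO for insertion).

```python
def locate_fault(ff, ro):
    """Plane level diagnosis following Table I.

    ff: faulty output pair F F', either "00" (ZNDRO) or "11" (ONDRO).
    ro: value observed on the spare outputs RO1 RO2.
    Returns a list of (fault type, plane) tuples.
    """
    if ff not in ("00", "11"):
        raise ValueError("F F' must be a non-dual rail value")
    if ro not in ("00", "01", "10", "11"):
        raise ValueError("RO1 RO2 must be a two-bit value")

    faults = []
    # A non-dual rail value on the spares can only come from the AND-plane.
    if ro in ("00", "11"):
        kind = "insertion" if ro == "00" else "deletion"
        faults.append((kind, "AND-plane"))
    # If the output pair differs from the spares, the OR-plane is faulty too.
    if ff != ro:
        kind = "deletion" if ff == "00" else "insertion"
        faults.append((kind, "OR-plane"))
    return faults


# The two examples from the text (input ABCD = 0101 in Fig. 3):
example_and = locate_fault("00", "00")   # insertion at (P2, D)
example_or = locate_fault("00", "10")    # deletion at (P2, F1)
```

Writing the table as two independent rules, rather than six hard-coded cells, makes the mixed cases (F F' = 00 with RO1 RO2 = 11, and the reverse) fall out as the union of an AND-plane and an OR-plane fault.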
However, if the fault(s) is (are) in the AND-plane, further diagnosis is required to identify the faulty product. Our approach for diagnosis in the AND-plane is divided into two processes, depending on whether there is a crosspoint insertion or a crosspoint deletion in the AND-plane.

Crosspoint deletion: Due to crosspoint deletion in the AND-plane, expansion occurs in some product lines, i.e. some products which are supposed to be 0 become 1 due to faults. Our fault detection scheme detects faults by checking the dual rail property of the outputs (Section II-D) and does not rely on


expected outputs. Thus, it only identifies the faulty output pair, not the particular faulty output. Since only one of F and F' should be 1, the following steps are taken to identify faults and recover from them:
A) Identify all active products in both F (set APF) and F' (set APF').
B) Re-configure all of them on the spare lines.
After these two steps the output should be corrected. The problem here is that some of the spare lines have been wasted on the re-configuration of fault-free products. To eliminate this effect, we check which output, F or F', is 1 after re-configuration. Since its value has not changed, its product lines (those in APF or APF' used before re-configuration) can be marked as fault-free spare lines (Fig. 5-b). Note that, if there are enough spares to replace the faulty products (APF or APF') but not enough to replace all of them at once, the re-configuration can be performed in (possibly) two steps. First, re-configure APF and see if the problem is solved. If not, in a second step re-configure APF', which will correct the output.

The next issue is how to identify the active products in F and F'. Using “walking 1” through a shift register on top of the AND-plane has been a common approach [9], [10]. Such a shift register is not available in current nano-architectures, and its implementation is not as simple as in CMOS PLAs. Here we propose to use a decoder implemented in the AND-plane of the nano-crossbar by just adding a few input lines. Using m-out-of-n codes, the decoder enables selecting individual product lines. Fig. 4 presents the same nano-crossbar as in Fig. 3, with the decoder added to the AND-plane. As can be seen in Fig. 4, each codeword selects one product from each and every output. The correct configuration of RO1 and RO2 makes it possible to observe only the selected products from F and F' (the faulty output pair). The number of required codewords is equal to the maximum number of products among all outputs (3 in Fig. 4). To minimize the number of additional lines for the decoder, it makes sense to use an n/2-out-of-n code. During AND-plane diagnosis for crosspoint deletion, the appropriate codewords are applied to the decoder one at a time, so that the active products in F and F' can be recognized. In this way, the sets APF and APF' required in the diagnosis process described above can be formed.

Crosspoint insertion: Recall from Table I that diagnosis for crosspoint insertion in the AND-plane is performed when RO1 RO2 is 00, meaning that all the products for F and F' are 0. Thus, it is required to identify the products which were supposed to be 1 for the current input pattern. Assume that each product is represented by a bit-vector, VP, whose length equals the number of inputs. A 1 in position i of this vector corresponds to the existence of a crosspoint at the intersection of the product with input i. Every input vector which activates product P can be obtained from VP by changing some of the 0’s to 1, but all 1’s in the vector must remain intact. Obviously, VP itself is an input which activates product P and reveals crosspoint insertion faults in product P. An extra crosspoint in product P, while VP is applied to the inputs,

evaluates P to 0. Based on this observation, the diagnosis process for crosspoint insertion in the products belonging to F is as follows:

for i from 1 to length(P(F)) do
    apply VPi at the input;
    if RO1 is 0 then
        Pi is faulty;
    end
end

The same procedure should be used for F' (observing RO2). This algorithm is illustrated in Fig. 5-c. Note that here we assume that VPi only activates Pi. In a multiple fault scenario, crosspoint deletions in other products could activate them when VPi is applied and make the output 1. This could mask the crosspoint insertion faults in Pi. However, in our scheme, it is known that both RO1 and RO2 are 0 before running this procedure, i.e. no product is active for the current input. So, it is guaranteed that applying VPi does not activate any product except (fault-free) Pi.

3) Identifying faulty crosspoints: After identifying the faulty output/product, individual faulty crosspoints can be identified by applying more test vectors. This part is rather straightforward and hence is discussed only briefly. Assume that the faulty product, Pi, has been identified. By applying “walking 0” at the inputs and observing RO1 (RO2), the existence of a crosspoint at the intersection of the product and the input with value 0 can be verified. The number of test vectors can be reduced as follows: if the product has a crosspoint insertion, “walking 0” is applied only to the bit-positions in which VPi is 0 (and, if the product has a crosspoint deletion, only to the bit-positions in which VPi is 1).

C. Discussion

1) False diagnosis: The proposed diagnosis technique for dual rail implementations effectively deals with multiple faults. However, there are some combinations of multiple faults which lead to a false diagnosis and re-configuration; i.e. after re-configuration, the output is dual rail but incorrect. As an example, consider the fault-free circuit in Fig. 3. Fault-free F1 F1' is 10. Due to a crosspoint insertion at the crosspoint of D and P2, (D, P2), F1 becomes 0. A crosspoint deletion at (C, P5) activates P5, but at the same time a crosspoint deletion at (P5, F1') causes F1' to remain 0. The proposed detailed diagnosis procedure identifies only the fault in F1'. After re-configuration, F1 F1' will erroneously be 01. Such false diagnosis and recovery occur only when some faults deactivate all products in Fi and other faults activate a product in Fi'. Experimental results (Section IV) show that such cases are very rare. Besides, for other input patterns the faults might become detectable/diagnosable; in this example, for ABCD = 0100, both products P1 and P5 become 1 and the faults are correctly diagnosed.

2) Testing decoder implemented in the AND-plane: The decoder proposed in Section III to implement “walking 1” on the product lines is implemented in the AND-plane of the same nano-crossbar. Therefore, it is also prone to the same


Fig. 5. Proposed diagnosis algorithm: (a) complete algorithm, using black boxes for AND-plane diagnosis; (b) AND-plane crosspoint deletion diagnosis (DIAGNOSE AND-DELETE in part (a)); (c) AND-plane crosspoint insertion diagnosis (DIAGNOSE AND-INSERT in part (a)).

faults as the nano-crossbar. Recall from Section III that each product line is connected to m of the n decoder lines (m crosspoints). Crosspoint deletion faults in the decoder do not invalidate the diagnosis scheme. Instead, the scheme might report some fault-free products as faulty, while the faults are in the decoder and not in the user logic. So, spare resources are used inefficiently, but diagnosis and recovery are done correctly (this situation also depends on the input pattern and might not happen). If a crosspoint insertion occurs in the decoder, some products may not be selected at all (because only m of the inputs are 1 in each codeword, while the number of crosspoints is more than m due to the insertion). The decoder can be tested for crosspoint insertion faults as follows:

all inputs ← 1;
for i from 1 to max(length(P(F)), length(P(F'))) do
    decoder input ← m-of-n(i);
    if i ≤ length(P(F)) && RO1 is 0 then
        PFi is faulty;
    end
    if i ≤ length(P(F')) && RO2 is 0 then
        PF'i is faulty;
    end
end

IV. EXPERIMENTAL RESULTS

There are some cases of multiple faults for which the proposed diagnosis algorithm fails. So, the effectiveness of the proposed diagnosis scheme should be evaluated in terms of the percentage of successful diagnoses. The evaluation is done for some benchmark circuits with a simulator written in C++. This simulator is capable of injecting multiple faults into the circuit. Since simulating all possible multiple faults is infeasible ($O(2^n)$ for n fault sites), random multiple fault injection is used. The required time for diagnosis also depends on the location of the fault (AND-plane or OR-plane) and the number of products for the corresponding output. So, with random fault injection we can also determine the running time of the algorithm based on the number, type, and location of the injected faults.

The benchmark circuits used in the experiments are adapted from a subset of the MCNC benchmarks. The benchmarks have been converted to multi-stage PLAs suitable for our experiments using PLAMAP from the RASP mapping tool. Experimental results obtained by simulation are presented in Table II and Table III. The fault rate in these experiments is $10^{-3}$, which leads to a maximum of 605 simultaneous faults for the ‘alu4’ benchmark. To identify faulty nano-crossbar blocks, the second approach for block level diagnosis (presented in Section III-A) has been used. Detailed diagnosis in faulty blocks is performed starting from the blocks with smaller logic depth (blocks with primary inputs as their inputs are of logic depth zero; the highest logic depth is assigned to the blocks calculating primary outputs). At each depth, after diagnosis, blocks are recovered using re-configuration, and then the blocks at the next depth are diagnosed. As can be seen in Table II, the proposed scheme is capable of diagnosing and recovering from all cases of multiple faults with low performance and hardware overheads, even with a high number of simultaneous faults in the nano-crossbar blocks.
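Random multiple fault injection of the kind described above can be sketched as follows. The paper's simulator is written in C++ and its details are not given, so everything here is an assumption made for illustration: the function name `inject_faults`, the 0/1 matrix representation of a crossbar configuration, and the interpretation of the fault rate as an independent per-crosspoint probability.

```python
import random


def inject_faults(config, rate, rng):
    """Flip each crosspoint of a 0/1 configuration matrix with probability `rate`.

    A 0 -> 1 flip models a crosspoint insertion, a 1 -> 0 flip a deletion.
    Returns the faulty matrix and the list of (row, col, kind) faults.
    """
    faulty = [row[:] for row in config]
    faults = []
    for r, row in enumerate(config):
        for c, bit in enumerate(row):
            if rng.random() < rate:
                faulty[r][c] = 1 - bit
                faults.append((r, c, "deletion" if bit else "insertion"))
    return faulty, faults


rng = random.Random(0)   # fixed seed so that a run is repeatable
# Hypothetical 100 x 100 crossbar with an arbitrary checkerboard configuration.
config = [[(r + c) % 2 for c in range(100)] for r in range(100)]
faulty, faults = inject_faults(config, 1e-3, rng)
```

Sampling is the practical alternative to enumeration here: with n fault sites there are $2^n$ possible fault combinations, whereas one sampled trial only visits each site once.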

TABLE II
EXPERIMENTAL RESULTS FOR THE PROPOSED SCHEME

Circuit | Hardware overhead (%) | Max # of faults | # of performed diagnoses | Successful diagnosis (%)
alu4 | 27 | 605 | 23073 | 100
alu1 | 27 | 21 | 24230 | 100
duke2 | 27 | 262 | 54780 | 100
rd84 | 27 | 251 | 41980 | 100
term1 | 30 | 155 | 59004 | 100
x4 | 29 | 217 | 83356 | 100
Average | 28 | 251 | 47737 | 100

TABLE III
DIAGNOSIS TIME FOR DIFFERENT SCHEMES (OS*: OUR SCHEME)

Circuit | # test application steps, crosspoint resolution: OS* | # test application steps, crosspoint resolution: RAO [14] | # test application steps, line resolution: OS* | # test application steps, line resolution: GAR. [13] | # config. steps for OS*: crosspoint | # config. steps for OS*: line
alu4 | 32 | 291 | 7 | 115 | 16 | 2
alu1 | 32 | 290 | 7 | 115 | 16 | 2
duke2 | 31 | 293 | 6 | 117 | 16 | 2
rd84 | 32 | 304 | 7 | 122 | 18 | 2
term1 | 33 | 338 | 8 | 135 | 20 | 2
x4 | 33 | 292 | 8 | 117 | 19 | 2
Average | 32 | 301 | 7 | 120 | 17 | 2

TABLE IV
COMPARISON OF DIAGNOSIS ALGORITHMS FOR NANO-CROSSBARS

Feature | GARCIA [13] | RAO [14] | Our scheme
Diag. level: Block | × | × | √
Diag. level: Line | √ | × | √
Diag. level: Crosspoint | × | √ | √
Test generation | √ | × | ×
Re-configuration | × | × | √
False diagnosis | √ | × | √
Comparison with expected configuration | × | √ | ×

A. Comparison with related techniques

Table III and Table IV provide information for comparing the proposed technique with two related diagnosis techniques, GARCIA [13] and RAO [14]. Columns 2-7 in Table III represent diagnosis time in terms of the required steps for test application and re-configuration. These values have been calculated for the related techniques as follows. For RAO’s technique, diagnosis time has been calculated based on the number of 0s in the input pattern, the number of active product lines, and the size of the PLA, as given by the algorithm in [14]. Diagnosis time in GARCIA’s technique is a function of m + k (m: # of inputs, k: # of products). The average value of 2 × (m + k), which is suggested by the experimental results in [13], is used in our experiments. Only the scheme proposed in this paper involves re-configuration in the diagnosis phase. The number of re-configuration steps is presented in columns 6-7. Depending on the architecture, configuration time could be equal to the number of configured crosspoints (column 6) or the number of configured output lines (column 7). As shown by the results, taking into account both test application and re-configuration time, the proposed technique is almost ten times faster than the two other techniques at the same resolution.

Table IV summarizes other key features of these three approaches. Our scheme supports diagnosis at various resolutions, which enables a tradeoff between diagnosis resolution and latency (diagnosis time). Our method and the RAO [14] approach do not require offline test generation. However, our scheme requires a small number of re-configurations in the OR-plane, which are shared with the recovery phase if the fault(s) is (are) in the OR-plane. False diagnosis and recovery might happen in our scheme and in GARCIA [13]. Analytically, more multiple fault scenarios lead to false diagnosis in GARCIA’s approach than in our scheme. On the other hand, RAO’s technique involves a comparison between the current configuration of the PLA and the expected one during diagnosis.

V. CONCLUSION

High permanent and transient failure rates in nano-crossbars, leading to multiple faults, necessitate the incorporation of appropriate fault tolerance techniques. The general scheme of a self repair system provides fault tolerance in a two-step process: i) fault detection and ii) recovery. A recovery process based on diagnosis and re-configuration is required to cope with permanent faults. In this paper we proposed an efficient online multiple fault diagnosis scheme for dual rail nano-crossbars. The proposed technique can achieve diagnosis at different resolutions (block, line, crosspoint) in order to balance

diagnosis resolution and latency. False diagnosis cases for multiple faults are very rare in our scheme; no false diagnosis occurred in our experiments. As shown for different benchmark circuits, our scheme is more efficient than related diagnosis techniques in terms of diagnosis time.

ACKNOWLEDGEMENT

This work was supported in part by the National Science Foundation Grant No. CCF-0746836.

REFERENCES

[1] S. Goldstein and M. Budiu. Nanofabrics: Spatial computing using molecular electronics. Journal of computer architecture news, 29(2):178–191, 2001.
[2] M. R. Stan, P. D. Franzon, S. C. Goldstein, J. C. Lach, and M. M. Ziegler. Molecular electronics: From devices and interconnect to circuits and architecture. Proc. of the IEEE, 91:1940–1957, 2003.
[3] D. B. Strukov and K. K. Likharev. A reconfigurable architecture for hybrid CMOS/nanodevice circuits. In Proc. of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA), pages 131–140, New York, NY, USA, 2006. ACM.
[4] A. DeHon. Array-based architecture for FET-based, nanoscale electronics. IEEE Trans. on Nanotechnology, 2(1):23–32, Mar 2003.
[5] K. Nikolic, A. Sadek, and M. Forshaw. Architectures for reliable computing with unreliable nanodevices. In IEEE Conf. on Nanotechnology (IEEE-NANO), pages 254–259, 2001.
[6] N. Farazmand and M.B. Tahoori. Online multiple error detection in crossbar nano-architectures. In IEEE International Conference on Computer Design (ICCD), October 2009.
[7] N.S. Bowen and D.K. Pradhan. Processor- and memory-based checkpoint and rollback recovery. Computer, 26(2):22–31, Feb 1993.
[8] N. Farazmand and M.B. Tahoori. Online detection of multiple faults in crossbar nano-architectures using dual rail implementations. In IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH), pages 79–82, July 2009.
[9] Sy-Yen Kuo and W.K. Fuchs. Fault diagnosis and spare allocation for yield enhancement in large reconfigurable PLAs. IEEE Transactions on Computers, 41(2):221–226, Feb 1992.
[10] C.L. Way. Fault location in repairable programmable logic arrays. In International Test Conference (ITC), pages 679–685, Aug 1989.
[11] M. Tehranipoor. Defect tolerance for molecular electronics-based nanofabrics using built-in self-test procedure. In IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (DFT), pages 305–313, Oct. 2005.
[12] Z. Wang and K. Chakrabarty. Using built-in self-test and adaptive recovery for defect tolerance in molecular electronics-based nanofabrics. In IEEE International Test Conference (ITC), pages 10 pp.–486, Nov. 2005.
[13] S. Garcia and A. Orailoglu. Online test and fault-tolerance for nanoelectronic programmable logic arrays. In IEEE International Symposium on Nanoscale Architectures (NANOARCH), pages 8–15, June 2008.
[14] W. Rao, A. Orailoglu, and R. Karri. Fault tolerant approaches to nanoelectronic programmable logic arrays. In IEEE/IFIP Int’l Conf. on Dependable Systems and Networks (DSN), pages 216–224, June 2007.
[15] M. Butts, A. DeHon, and S.C. Goldstein. Molecular electronics: devices, systems and tools for gigagate, gigabit chips. In IEEE/ACM International Conference on Computer Aided Design (ICCAD), pages 433–440, Nov. 2002.
[16] T. Rueckes, K. Kim, E. Joselevich, G.Y. Tseng, C.L. Cheung, and C.M. Lieber. Carbon nanotube-based nonvolatile random access memory for molecular computing. Science, 289(5476):94–97, 2000.
