# Heterogeneous Two-Level Logic and Its Density and Fault Tolerance Implications in Nanoscale Fabrics

Teng Wang, Pritish Narayanan, and Csaba Andras Moritz, Member, IEEE

Abstract-Most proposed nanoscale computing architectures are based on a certain type of two-level logic family, e.g., AND-OR, NOR-NOR, NAND-NAND, etc. In this paper, a new fabric architecture that combines different logic families in the same nanofabric is proposed for higher density and better defect tolerance. To achieve this, we apply very minor modifications on the way of controlling nanogrids, while the basic manufacturing requirements remain the same. The fabric that is based on the new heterogeneous two-level logic yields higher density for the applications mapped to it. We find that it also improves the efficiency of fault tolerance techniques as it significantly simplifies the designs. In addition, we found that it enables voting at nanoscale that can improve fault tolerance further. A nanoscale processor is implemented for evaluation purposes. We found that compared with an implementation on a Nanoscale Application-Specific IC (NASIC) fabric with one type of two-level logic, the density of this processor improves by up to 52% by using the heterogeneous logic. Furthermore, the yield is improved by 15% at 2% defective transistors and by 147% at 5% defect rates. Detailed analysis on density and yield is provided. The approach is applicable in grid-based fabrics in general, e.g., it can be used in both NASIC and hybrid semiconductor/nanowire/molecular (CMOL) designs.

*Index Terms*—hybrid semiconductor/nanowire/molecular (CMOL), nanoelectronics, nanofabrics, Nanoscale Application-Specific IC (NASIC), nanoscale processors, semiconductor nanowires (NWs).

## I. INTRODUCTION

Researchers have shown that they can grow semiconductor nanowires (NWs) and control their electrical properties [1]. They can also assemble these NWs into crossbars [18]. Diodes and FETs can be implemented at the crosspoints of crossbar structures [2]. Furthermore, rapid progress on manufacturing makes computing systems at very high density levels (e.g., $10^{11}-10^{12}$ transistors/cm<sup>2</sup>) a promising direction beyond conventional CMOS.

Integrating nanodevices into computing systems poses new challenges not encountered in conventional CMOS. Self-assembly-based manufacturing [3] imposes doping/layout constraints on nanoscale circuits, restricting routing and placement.

The authors are with the University of Massachusetts, Amherst, MA 01002 USA (e-mail: twang@ecs.umass.edu; pnarayan@ecs.umass.edu; ras@ecs.umass.edu).


Digital Object Identifier 10.1109/TNANO.2008.2007645

It is fairly common that nanoscale circuits are based on AND–OR (or equivalent) two-level logic: this is almost an obvious choice on a grid given the layout restrictions. In two-level logic, complementary signals are typically required to implement arbitrary logic functions.

Several fabric architectures have been proposed based on a certain grid-based two-level logic family. For example, hybrid semiconductor/nanowire/molecular (CMOL) [6], [15] is using NOR–NOR logic wherein the OR logic is implemented by NWs; CMOS cells provide signal inversion and restoration. Nanoscale programmable logic array (NanoPLA) [5] uses reprogrammable switches for logic and FETs for signal restoration but overall with a similar logic style.

Nanoscale Application-Specific IC (NASIC) [7] is an FET-based nanoscale fabric architecture. It uses AND–OR logic or an equivalent two-level logic family such as NAND–NAND, and it masks errors in the circuit itself, avoiding the need for reconfigurable devices.

This paper proposes a new fabric style that combines two different logic families in the same logic stage in the fabric, and evaluates it in the context of NASIC fabrics. With some simple circuit modifications, heterogeneous two-level (H2L) logic such as AND–OR/NOR is implemented instead of the pure AND–OR logic. This new fabric can easily generate complementary signals, can omit complementary signals in some cases, and requires fewer partial products, which reduces the number of corresponding NWs. In this way, a significant reduction can be achieved in the area of nanoscale designs mapped to the fabric.

The new H2L logic technique can be easily combined with the built-in defect/fault tolerance techniques at different levels proposed for the original NASIC designs [7], [8]. These include *N*-way redundancy, built-in error correction circuitry, and system-level voting at key architectural points. In addition, we found that system-level voting, which had to be implemented in CMOS before [9]—because for voting to be beneficial it needed to achieve a certain level of fault resilience compared to the rest of the logic—can now be enabled at nanoscale by the new H2L technique. Simulations show that the H2L logic in conjunction with nanoscale voting significantly improves the yield of NASIC designs. Compared with voting in CMOS, which would require complex nano-/microinterfacing, nanoscale voting is tightly integrated into the design.

While the techniques in this paper are discussed and evaluated in the context of NASICs, the ideas could be applied in other 2-D grid-based architectures such as CMOL. The idea appears almost obvious, but to the best of our knowledge, it has not been proposed or evaluated earlier. In fact, its beauty is that it can be applied in a wide range of fabrics with minor modifications without adding any new manufacturing requirement.

Manuscript received March 30, 2008; accepted July 26, 2008. First published October 31, 2008; current version published January 16, 2009. This work was supported in part by the Focus Center Research Program (FCRP), by the Center on Functional Engineered Nano Architectonics (FENA), by the Center for Hierarchical Manufacturing (CHM), by the University of Massachusetts, Amherst, MA, and by the National Science Foundation (NSF) under Award CCR:0105516, Award NER:0508382, and Award CCR:0541066. The review of this paper was arranged by Associate Editor K. K. Likharev.



Fig. 1. Dynamic circuits implementing AND, NAND, OR, and NOR logic functions on NWs.

We use the wire streaming processor (WISP)-0 [13] design to evaluate the benefits of the modified NASIC fabric with the new logic family. For the purpose of the evaluation, WISP-0 is implemented on both the AND–OR and the AND–OR/NOR fabrics as well as in CMOS. Furthermore, some of the original defect tolerance techniques used in WISP-0 and NASICs are added in both versions and complemented with heterogeneous logic-based nanoscale voting.

The results show that the density of WISP-0 on the new AND-OR/NOR fabric is up to 52% better than on the original fabric. We similarly found that the new AND-OR/NOR fabric can also improve the efficiency of the built-in fault tolerance techniques. Simulation shows that the yield of WISP-0 with H2L logic is significantly better than WISP-0 with pure AND-OR logic. For example, the yield of WISP-0 when both two-way redundancy and system-level nanoscale voting are used can be improved by 15% at 2% defective transistors. The same improvement would be 147% at 5% defect rates. It appears that the improvement is increasing further at higher defect rates.

The paper is organized as follows. In Section II, we provide a brief overview of NASICs and WISP-0 to make the paper as self-contained as possible. Section III describes the proposed AND–OR/NOR NASIC fabric architecture in detail through simple circuit examples. Section IV discusses the implementations of nanoscale voting with AND–OR and H2L logics. The yield and density simulation results for WISP-0 are provided in Section V. Section VI concludes the paper.

## II. NANOCIRCUITS, NASICS, AND WISP-0 PROCESSOR

# A. Dynamic Nanocircuits on Semiconductor NWs

Dynamic circuits have been widely used in MOS designs. We can similarly implement dynamic circuits at nanoscale with the help of control signals generated in CMOS. For example, the circuits in Fig. 1 show how to implement basic logic functions (i.e., AND, NAND, OR, and NOR) in a dynamic style on semiconductor NWs.

A novel aspect of dynamic circuits in NASICs is the addition of the hold phase that is used to enable correct cascading. A variety of schemes have been proposed achieving different throughputs. In NASICs, this hold phase also provides temporary storage of output values on NWs. Fig. 2 shows a waveform that illustrates the discharge–evaluate–hold phases for AND circuits. Details on dynamic circuits and their applications in NASICs can be found in [10] and [13]. To validate the concept of dynamic circuits and analyze the sensitivity of circuits to key device parameters, we evaluate the signal integration issue of cascaded dynamic circuits using circuit-level simulations in [23].

Fig. 2. Waveform for dynamic AND circuit. The hold phase is added for cascading purposes.

Comparing dynamic AND and NAND circuits, we find that the only difference between them is their connections to power supply ( $V_{dd}$ ) and ground (Gnd). It can be seen that one can easily generate complementary outputs by interchanging the power and Gnd NWs. Similar observation can be made for dynamic OR and NOR circuits. This observation is the key to our new fabric proposed in this paper that will be detailed in Section III. But let us first briefly review some more details on the NASIC fabric and the processor design that we will evaluate to allow the introduction of this logic style and associated new NASIC fabric architecture.
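The rail-swapping observation can be illustrated with a small behavioral model. The sketch below is not a circuit simulation. It assumes, for illustration only, that every input gates an FET that conducts when the input is high and that all FETs on one NW are in series. The function and parameter names are hypothetical.

```python
def dynamic_gate(inputs, initial_level, rail_level):
    """Behavioral model of one dynamic NW stage (illustrative assumption).

    The output NW is first forced to `initial_level` (discharge or precharge).
    During evaluation it is tied to the opposite rail, `rail_level`, only if
    every series FET conducts, i.e., all inputs are high. Otherwise the NW
    keeps its initial level.
    """
    path_conducts = all(inputs)
    return rail_level if path_conducts else initial_level


def dynamic_and(inputs):
    # Discharged to Gnd (0), then evaluated toward Vdd (1).
    return dynamic_gate(inputs, initial_level=0, rail_level=1)


def dynamic_nand(inputs):
    # Same FET chain, with the Vdd and Gnd connections interchanged.
    return dynamic_gate(inputs, initial_level=1, rail_level=0)
```

Under these assumptions, `dynamic_nand` returns the complement of `dynamic_and` for every input combination, and the only difference between the two functions is which rail is used in which phase.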

# B. Overview of NASICs

NASIC designs are based on dynamic circuits implemented on semiconductor NWs; various optimizations are applied to work around the layout and manufacturing constraints as well as defects [8], [11]. While still based on cascaded two-level logic style, e.g., AND-OR, NASIC designs are optimized according to specific applications to achieve higher density and defect/fault masking. The selection of this logic family is due to its simplicity and applicability on a 2-D style fabric where arbitrary placement and routing is not possible. Furthermore, due to manufacturing constraints (such as layout and uniform doping in each NW dimension), it may be impossible to use, for example, complementary devices close to each other, such as in CMOS or orient devices in arbitrary ways. By using dynamic circuits and pipelining on the wires, NASICs eliminate the need for explicit flip-flops in many areas of the design [10] and achieve unique pipelining schemes.

Fig. 3 demonstrates the design of a simple 1-bit NASIC full adder in dynamic AND–OR style [13]. The thinner wires represent NWs. All horizontal NWs are doped to n-type while all vertical NWs are doped to p-type. The signals *hdis*, *heva*, *vpre*, and *veva* correspond to discharge, evaluation, precharge, and evaluation phases on different NWs. Each nanotile is surrounded by microwires (MWs) (thicker wires in the figure), which carry Gnd, power supply voltage ($V_{dd}$), and control signals for the dynamic evaluation of outputs. The control signals are generated in CMOS. As we mentioned before, complementary signals are required to implement arbitrary logic functions in two-level logic style. In the circuit in Fig. 3, we generate negative outputs $\sim c_1$ and $\sim s_0$ for cascading in multitile designs. Refer to [7], [8], [10], [11], and [13] for more details. NASICs can use a single type of FET, as shown in [14]: this simplifies manufacturing and improves overall performance.

Fig. 3. One-bit dynamic NASIC full adder using AND–OR cascaded logic. Arrows show propagation of data through the tile.

Fig. 4. Floorplan of the WISP-0 processor.

# C. Overview of the WISP-0 Processor

WISP-0 is a stream processor that implements a five-stage pipelined streaming architecture in five nanotiles: *PC*, *ROM*, *DEC*, *RF*, and *ALU*. Local communication between adjacent nanotiles is provided by NWs. Each nanotile is surrounded by MWs that carry Gnd, power supply voltage, and some control signals. WISP-0 uses a 3-bit opcode and 2-bit operands. It supports many different arithmetic operations including multiplication.

Fig. 4 shows the layout of WISP-0 with AND–OR logic style. A nanotile is shown as a box surrounded by dashed lines. More details about various circuits used can be found in [10], [11], [13], and [14]. In this paper, we use WISP-0 mainly to evaluate our new nanofabric and focus on the density and defect/fault-tolerance-related tradeoffs and implications.

# D. Built-In Defect/Fault Tolerance Techniques in NASICs

Nanoscale computing systems including NASICs have to deal with the high defect rates of nanodevices and faults introduced by manufacturing of fabrics. In NASICs, we consider a fairly generic fault model with both uniform and clustered defects and three main types of permanent defects: NWs may be broken, the transistors at the crosspoints may be stuck-on (no active transistor at crosspoint), or stuck-off (channel is switched off).

We consider defect rates of up to 15% at the finest granularity that is the device level. Our previous work indicates that device-level defect rates greater than 15% would likely eliminate the density benefits of nanoscale fabrics compared to projected CMOS technology, in the context of microprocessor designs. We also assume that the stuck-on transistor is much more prevalent than stuck-off transistors in an NASIC fabric due to the metallization process [4] in manufacturing steps. Stuck-off FETs are also less likely in depletion mode fabrics [16]. Note that a 15% defect rate is much higher than say a 50% defect rate at a cell level of designs (or circuit component level) that is assumed by some other researchers. Clearly, with 15% device-level defects, any reasonable size circuit would be defective, so even assuming a rate of 40%–50% at a component granularity seems highly unrealistic.

Built-in fault tolerance techniques are applied at various granularities for NASICs to make NASIC designs functional even in the presence of errors, while carefully managing area tradeoffs. Compared with reconfiguration-based approaches, this strategy also simplifies the micro-/nanointerfacing: no access to every crosspoint in the nanoarray is necessary. Furthermore, a defect map is not needed and the devices used do not have to be reconfigurable. The built-in fault tolerance techniques that are applied on the new fabric techniques include two-way redundancy and system-level voting, e.g., triple modular redundancy (TMR). A nanoscale voting is introduced in addition in this paper. The density and yield of WISP-0 under different fault tolerance scenarios are evaluated for the new fabric and compared with the original fabric. Comprehensive description of built-in fault tolerance techniques in NASICs can be found in [7] and [9] and is beyond the scope of this paper.

## III. COMBINING LOGIC FAMILIES IN NASIC FABRIC

In the design shown in Fig. 3, the outputs ($c_1$ and $s_0$, and their negative versions $\sim c_1$ and $\sim s_0$) are generated in the sum-of-products form of the inputs. The signals on horizontal NWs (excluding the control NWs such as *veva* and *vpre*) correspond to different partial products. For example, the signal on the top horizontal NW corresponds to partial product $a_0b_0c_0$, and the signal on the second NW corresponds to partial product $a_0b_0\sim c_0$. Each output signal is the sum of selected partial products.

From Fig. 3, we can see that different output signals require different groups of partial products. The output $c_1$, for example, requires partial products $a_0 \sim b_0 \sim c_0$, $\sim a_0 b_0 \sim c_0$, $\sim a_0 \sim b_0 c_0$, and $\sim a_0 \sim b_0 \sim c_0$ while the output $\sim c_1$ requires $a_0 b_0 c_0$, $a_0 b_0 \sim c_0$, $a_0 \sim b_0 c_0$, and $\sim a_0 b_0 c_0$. The observation here is that *a positive output and its negative version require different partial products if both of them are implemented in the same AND–OR logic planes*. With single-FET designs such as shown in [14], this would be NAND–NAND. We have mentioned in Section II-A that the negative outputs can be easily generated by interchanging the power supply and Gnd connections. This way we can generate negative outputs in AND–NOR style; note that the negative output then uses the same partial products as the positive output. We may therefore reduce the number of required partial products (i.e., the number of horizontal NWs) if a different control scheme is used. This thinking leads to our new nanofabric.

Fig. 5. One-bit adder using the H2L logic.

We propose to combine AND–OR and AND–NOR logic families (or NAND–NAND and NAND–AND) into the same NASIC logic plane. This requires some modifications on the OR plane. For comparison, the new circuit for the same 1-bit full adder but with H2L logic technique is shown in Fig. 5. Note that in the design of Fig. 3, all output NWs ($c_1$, $s_0$, $\sim c_1$, and $\sim s_0$) in the OR plane connect to the Gnd MW at the top and to the $V_{dd}$ MW at the bottom. In the design of Fig. 5, however, all negative output NWs ($\sim c_1$ and $\sim s_0$) are connected to $V_{dd}$ and Gnd MWs in the opposite way. All positive outputs ($c_1$ and $s_0$) of the design in Fig. 5 are generated by AND–OR logic while all negative outputs ($\sim c_1$ and $\sim s_0$) are generated by AND–NOR logic instead. The right logic plane in Fig. 5 now combines OR and NOR functions in the same plane. Compared with the design in Fig. 3, the partial product $a_0b_0c_0$ (corresponding to the top horizontal NW) is not necessary, and therefore is removed from the new design in Fig. 5. This way we can reduce the number of horizontal NWs, and indirectly the overall number of transistors. The approach can be automated and applied on larger scale designs.
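The saving in horizontal NWs for the 1-bit adder can be checked by counting product terms. The sketch below is a minimal illustration that uses the textbook full-adder truth table and unminimized three-literal product terms. It does not model the signal polarity of the physical NWs in Fig. 3, so only the number of product terms is meaningful here, not which particular term is dropped. All names are hypothetical.

```python
from itertools import product


def minterms(fn, n_inputs=3):
    """Return the set of input tuples for which fn evaluates to 1."""
    return {bits for bits in product((0, 1), repeat=n_inputs) if fn(*bits)}


def carry(a, b, c):
    return (a & b) | (a & c) | (b & c)


def total(a, b, c):
    return a ^ b ^ c


pos_carry = minterms(carry)
pos_sum = minterms(total)
neg_carry = minterms(lambda a, b, c: 1 - carry(a, b, c))
neg_sum = minterms(lambda a, b, c: 1 - total(a, b, c))

# Pure AND-OR: positive and negative outputs each need their own products.
and_or_terms = pos_carry | pos_sum | neg_carry | neg_sum
# AND-OR/NOR: negative outputs reuse the products of the positive outputs.
h2l_terms = pos_carry | pos_sum

print(len(and_or_terms), len(h2l_terms))  # 8 7
```

In this model the pure AND–OR plane needs all eight product terms, while the heterogeneous plane needs seven, which matches the removal of one horizontal NW described above.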

## A. Manufacturing Implications

A key advantage of this new fabric is that it effectively improves the density but does not introduce any new manufacturing challenges. For details on the proposed manufacturing of NASICs, we refer the reader to [7]. The only modifications that are made are at the connections from NWs to $V_{dd}$ and Gnd MWs. This manufacturing step is accomplished at microscale in a fashion similar to the original fabric style. Compared with the design in Fig. 3, we have changed the order of vertical NWs in Fig. 5, effectively segregating the OR and NOR logics. This rearrangement of vertical NWs ensures that the nano-/microinterfacing is still at the microscale. Hence, no additional manufacturing constraints are imposed. As can be seen in Fig. 5, the dynamic control scheme otherwise remains completely unchanged. Positive and negative output NWs share the same control signals as previously.

Fig. 6. TMR configuration in a pipelined system assuming voting circuits do not fail.

## B. Fault Tolerance Implications

Another interesting benefit of the AND–OR/NOR fabric is that it also improves the yield of NASIC designs. The reason is quite simple: the total number of horizontal NWs and associated FETs are reduced compared to the original design—we can get the job done with fewer transistors. For a given defect rate, the expected number of defects in a design is also reduced. A design can therefore achieve better yield in AND–OR/NOR fabric as compared to the original AND–OR. We will evaluate the impact of this for WISP-0.
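The argument can be made concrete with a back-of-the-envelope model. The sketch below assumes independent, uniformly distributed device defects and no redundancy, which is far simpler than the fault model and simulator used in this paper. The device counts in the usage lines are hypothetical and chosen only to show the trend.

```python
def expected_defects(n_devices, defect_rate):
    """Expected number of defective devices under independent defects."""
    return n_devices * defect_rate


def defect_free_probability(n_devices, defect_rate):
    """Probability that none of the devices is defective."""
    return (1.0 - defect_rate) ** n_devices


# Hypothetical device counts for an original and a reduced design.
for n in (1000, 480):
    print(n, expected_defects(n, 0.02), defect_free_probability(n, 0.02))
```

With fewer devices, the expected defect count drops proportionally and the chance of a defect-free circuit rises, which is the direction of the effect described above.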

# C. Applicability to Other Types of Two-Level Fabrics

The H2L logic technique can be easily applied onto nanofabrics based on two-level logic. For example, on n-channel FET (nFET)-only NASIC fabrics [14], we can design circuits based on NAND–NAND/AND logic families. The approach can be applied in grid-based designs in general. For example, it can also be applied to NOR–NOR-based CMOL fabrics. The new logic family for CMOL would be NOR–NOR/OR. We are currently exploring such CMOL designs.

## IV. NANOSCALE VOTING ENABLED BY H2L LOGIC

# A. Voting With Reliable Voting Circuits

System-level voting techniques have been widely used in conventional CMOS systems to improve the reliability. TMR is the most popular one among these techniques [17]. The basic concept is illustrated in Fig. 6. There are three identical modules (e.g.,  $A_1, A_2$ , and  $A_3$ ) performing a given task. All three modules perform the task independently and their outputs (i.e.,  $a_1, a_2$ , and  $a_3$ ) are fed into a majority-voting circuit (shown as a shadowed box with label "V" in Fig. 6). The output of voting circuit "a" is sent to the next stage.
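As a minimal illustration of the majority function computed by the box labeled "V", the sketch below votes on three single-bit outputs. It is written as a sum of products, the same form that a two-level AND–OR plane implements. The function name is hypothetical.

```python
def majority(a1, a2, a3):
    """Return the value held by at least two of the three 1-bit inputs."""
    return (a1 & a2) | (a1 & a3) | (a2 & a3)


# One faulty module is outvoted by the other two.
print(majority(1, 1, 0))  # 1
print(majority(0, 1, 0))  # 0
```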

In our previous work, we have investigated the possibility of implementing voting circuits in CMOS at certain architectural points. CMOS voting circuits are much more reliable than nanoscale circuits: the defect density at 65 nm technology node of CMOS logic is 1395 defects/m<sup>2</sup>, which can be translated to a defect rate of  $9 \times 10^{-8}$  [22]. Compared with defect rates of nanodevices, as mentioned, this is negligible.

Therefore, the reliability of signal $a$, $R_{\rm TMR}^0$, is determined solely by the reliability $R_M$ of each module ($A_1$, $A_2$, and $A_3$) [17]:

$$R_{\rm TMR}^0(R_M) = R_M^3 + 3R_M^2(1 - R_M). \tag{1}$$

Different voting schemes yield different reliabilities. For example, six modular redundancy (6MR), which means voting on six redundant copies, yields the following reliability:

$$R_{\rm 6MR}^{0}(R_{M}) = R_{M}^{6} + 6R_{M}^{5}(1 - R_{M}) + 15R_{M}^{4}(1 - R_{M})^{2} + 10R_{M}^{3}(1 - R_{M})^{3}. \tag{2}$$
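Equations (1) and (2) can be evaluated directly. The sketch below transcribes them as written, with hypothetical function names. The coefficient 10 in the last term of (2) is half of the 20 ways to have exactly three correct copies out of six, which is consistent with half of the tied votes resolving correctly, although that interpretation is an assumption here.

```python
def r_tmr0(r_m):
    """Eq. (1): TMR reliability with a perfect voter."""
    return r_m**3 + 3 * r_m**2 * (1 - r_m)


def r_6mr0(r_m):
    """Eq. (2): 6MR reliability with a perfect voter."""
    q = 1 - r_m
    return r_m**6 + 6 * r_m**5 * q + 15 * r_m**4 * q**2 + 10 * r_m**3 * q**3


for r_m in (0.5, 0.9):
    print(r_m, round(r_tmr0(r_m), 5), round(r_6mr0(r_m), 5))
```

For $R_M = 0.9$ the two expressions give about 0.972 and 0.991, and both equal 0.5 at $R_M = 0.5$, so with a perfect voter the schemes help only when modules are more likely right than wrong.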

However, there are many challenges for implementing voting circuits in CMOS in a nanoscale fabric including manufacturability: inserting CMOS voting circuits between nanoscale modules would need a complex interfacing between nano- and microcircuits. Although nano-/microcontacts have been demonstrated, no proposals, with exception of perhaps CMOL (which has other challenges), address the alignment problem well. In addition, performance would be severely impacted by CMOS voting circuits, which will present heavy load capacitances to the nanoscale circuits. Density benefits of nanoscale implementation may also be reduced.

## B. Nanoscale Voting With Unreliable Voting Circuits

In this paper, we explore possible strategies to implement voting circuits at nanoscale. The challenge is to achieve reliable voting using unreliable nanoscale voting circuits: the final reliability  $R_{\rm TMR}$  depends not only on the reliability of each module but also on the reliability of voting circuits

$$R_{\rm TMR}(R_M, R_V) = R_V R_{\rm TMR}^0(R_M) \tag{3}$$

where $R_V$ is the reliability of voting circuits. A similar expression applies to 6MR. The equations rest on the same assumption as in [17], i.e., voting circuits are independent of computing modules in reliability.<sup>1</sup>

Fig. 7 shows the overall reliabilities ($R$) with voting circuits given different $R_M$ and $R_V$. The black dashed line represents the reliabilities of original signals. Three thin solid lines represent the reliabilities of TMR outputs and three thick lines represent the reliabilities of 6MR outputs. From the figure, we can see that if the voting is perfect ($R_V = 1$), it always improves the reliability. 6MR is more efficient than TMR but typically at the cost of more components. If the voting circuits are faulty, then voting helps only in a certain range of $R_M$.
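The range of $R_M$ in which imperfect TMR voting still helps can be derived from (1) and (3) as written. Voting helps when $R_V R_{\rm TMR}^0(R_M) > R_M$. Since $R_{\rm TMR}^0(R_M) = 3R_M^2 - 2R_M^3$, dividing by $R_M > 0$ gives the quadratic condition $2R_V R_M^2 - 3R_V R_M + 1 < 0$. The sketch below solves it. The function name is hypothetical, and the result applies only under the independence assumption behind (3).

```python
import math


def tmr_benefit_range(r_v):
    """Interval of R_M where R_V * R_TMR0(R_M) > R_M, or None if empty."""
    disc = 9 * r_v**2 - 8 * r_v
    if r_v <= 0 or disc <= 0:
        return None  # the voter is too unreliable for TMR to help anywhere
    root = math.sqrt(disc)
    return ((3 * r_v - root) / (4 * r_v), (3 * r_v + root) / (4 * r_v))


print(tmr_benefit_range(1.0))   # (0.5, 1.0)
print(tmr_benefit_range(0.85))  # None
```

The discriminant is positive only for $R_V > 8/9$, so under this model a TMR voter with reliability at or below about 0.89 cannot improve on a single module for any $R_M$.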

From (3), there are three possible ways to improve the overall reliability: 1) improve the reliability of each module ($R_M$); 2) improve the reliability of voting circuits ($R_V$); and 3) improve the voting logic itself (e.g., 6MR versus TMR). Based on this discussion, we will show the implementations of nanoscale voting in AND–OR and new AND–OR/NOR fabrics and discuss the benefit of H2L-logic-based implementation.

Fig. 7. Reliabilities of output (R) after the TMR/6MR voting circuits.

Fig. 8. Nanoscale TMR design in pure AND–OR fabric.

Fig. 9. Nanoscale 6MR design in AND–OR/NOR fabric.

## C. Nanoscale TMR in AND–OR NASIC Fabric

As mentioned before, complementary signals are necessary in a fabric based on two-level logic. An original signal and its complementary version can provide "dual-rail" redundancy. However, voting on "dual-rail" signals (e.g.,  $a_1$  and  $\sim a_1$  in Fig. 8) requires signal inversion, which is difficult to achieve with two-level AND–OR logic on the 2-D grid. Therefore, as shown in Fig. 8, the voters on original signals and complementary signals are separated from each other in a pure AND–OR fabric, and the "dual-rail" redundancy is effectively unutilized.

## D. Nanoscale 6MR in AND–OR/NOR NASIC Fabric

With H2L logic, it is possible to produce complementary signals. Given this capability, we can vote only on original outputs and generate the complementary signals in voting circuits using AND–NOR logic only when they are necessary. As shown in Fig. 9, there is no need to generate complementary output in each module ($A_1$, $A_2$, and $A_3$) for voting. Instead, we generate three more original copies ($a'_1$, $a'_2$, and $a'_3$) for more redundancy and vote on all six signals (6MR). In the voting circuit, the original and complementary outputs are generated for the next stage using H2L logic.

<sup>1</sup>This is relatively conservative since the voting result may still be correct even when two out of three modules and the voting circuit are faulty. However, this expression reveals the effective factors that determine the overall reliability R.

TABLE I

Reduction of Area in AND–OR/NOR Fabric

| Nanotile | AND–OR nanoarray area (nm<sup>2</sup>) | AND–OR/NOR nanoarray area (nm<sup>2</sup>) |
|----------|----------------------------------------|--------------------------------------------|
| PC       | 35,200                                 | 22,400                                     |
| ROM      | 26,400                                 | 13,200                                     |
| DEC      | 57,600                                 | 28,800                                     |
| RF       | 476,000                                | 265,200                                    |
| ALU      | 59,400                                 | 28,800                                     |
| Total    | 654,600                                | 358,400                                    |

Note that the area of  $A_1$ ,  $A_2$ , and  $A_3$  in Fig. 9 is actually smaller than in Fig. 8 and they provide more redundancy (six copies compared with three copies in Fig. 8). By using H2L logic, we not only improve the voting scheme (use 6MR instead of TMR) but also the yield of each computing module ( $R_M$ ). Simulations (Section V) indicate that the overall reliability is significantly improved.

## V. EVALUATION

We developed a simulator to estimate the yield of the H2L-logic-based WISP-0 for various defect rates and distributions. We also evaluate the yield improvement of our new nanoscale TMR/6MR technique.

## A. WISP-0 on AND–OR/NOR NASIC Fabric

Table I shows the comparison between WISP-0 designs in the new AND–OR/NOR and the AND–OR NASIC fabrics. The area of each nanotile used in the nanoarray is listed. A 10-nm pitch between NWs is assumed [7]. We can see that the new AND–OR/NOR fabric saves between 36% (PC) and 52% (ALU) of the nanoarray area per tile, and about 45% in total. The number of required transistors in each tile is also reduced. In total, the number of transistors in WISP-0 is reduced by 52%.
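The area savings follow directly from the entries of Table I. The sketch below recomputes them; the variable and function names are hypothetical, and the numbers are copied from the table.

```python
# Nanoarray areas from Table I in nm^2: (AND-OR, AND-OR/NOR).
AREAS_NM2 = {
    "PC": (35_200, 22_400),
    "ROM": (26_400, 13_200),
    "DEC": (57_600, 28_800),
    "RF": (476_000, 265_200),
    "ALU": (59_400, 28_800),
}


def area_reduction(old, new):
    """Fraction of area saved when going from `old` to `new`."""
    return 1.0 - new / old


total_old = sum(old for old, _ in AREAS_NM2.values())
total_new = sum(new for _, new in AREAS_NM2.values())

for name, (old, new) in AREAS_NM2.items():
    print(f"{name}: {area_reduction(old, new):.1%}")
print(f"Total: {area_reduction(total_old, total_new):.1%}")
```

The computed totals also reproduce the "Total" row of the table, which is a quick consistency check on the per-tile entries.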

## B. Density Evaluation of WISP-0

To get a more accurate evaluation on density, we need to take the area overhead of MWs into account. Note that the pitch between MWs in nanoscale WISP-0 also scales down with CMOS technology nodes—a reason why NASIC WISP-0 density changes somewhat with assumptions on MWs. Technology parameters used in the calculations are listed in Table II.

To get a better sense of what the densities actually mean, we normalize the density of nanoscale designs to an equivalent WISP-0 processor synthesized in CMOS. We designed this processor in Verilog, synthesized it to 180-nm CMOS, and derived the area with the help of the Synopsys Design Compiler. Next, we scaled it to various projected technology nodes based on the predicted parameters by the International Technology Roadmap for Semiconductors (ITRS), assuming area scales down quadratically [7].
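The quadratic scaling assumption amounts to multiplying the synthesized area by the square of the ratio of feature sizes. The sketch below shows that step only. The function name is hypothetical, and the input area is a unit placeholder because the synthesized 180-nm area is not stated here.

```python
def scale_area(area, from_node_nm, to_node_nm):
    """Scale a layout area between technology nodes, assuming area ~ node^2."""
    return area * (to_node_nm / from_node_nm) ** 2


# Relative area of a 180-nm design projected to the nodes used in Table II.
for node in (70, 45, 32, 18):
    print(node, scale_area(1.0, 180, node))
```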

TABLE II

Technology Parameters

| Parameter | Value  |
|-----------|--------|
| NW pitch  | 10 nm  |
| NW width  | 3–4 nm |

| Technology node (ITRS 2004) | MW pitch |
|-----------------------------|----------|
| 70 nm                       | 170 nm   |
| 45 nm                       | 108 nm   |
| 32 nm                       | 76 nm    |
| 18 nm                       | 42 nm    |



Fig. 10. Density improvement of WISP-0 using H2L logic under different fault tolerance scenarios.

For the purpose of this paper, we assume that the CMOS version of WISP-0 is defect free. It is nevertheless expected that even CMOS designs would need redundancy and other techniques to deal with defects and mask delay variations due to process parameter variations. This means that our CMOS ASIC numbers are fairly optimistic. The normalized density of WISP-0 for various scenarios is shown in Fig. 10.

The notation used in the graphs is: *w/o Red* stands for WISP-0 without fault tolerance techniques (or baseline); 2-*way* stands for WISP-0 with two-way redundancy; and 2-*way* +*TMR/6MR* stands for two-way redundancy plus nanoscale TMR/6MR. The prefix *AND*–*OR* represents WISP-0 designed with pure AND–OR logic and the prefix *H2L* stands for WISP-0 with the new H2L logic. While other combinations are possible, we found these to be insightful and representative.

We can see from the results that the new AND–OR/NOR fabric improves the density of WISP-0 significantly for all possible scenarios. For NASIC WISP-0 without redundancy, at 45-nm CMOS technology node assumed for its MWs, the improvement is 26%. After applying two-way redundancy and nanoscale TMR, the improvement of density would be 43%. At 18-nm CMOS technology node, the improvement of the density for WISP-0 without redundancy is 41%. After applying two-way redundancy, the density improvement is 52%. Overall, the density improvement increases assuming MWs available from more advanced CMOS processes. This is because the area overhead of MWs in NASICs assuming MWs at 18-nm technology is much smaller than at 45-nm technology node, and thus, the corresponding area reduction is more prominent.
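The dependence on MW overhead can be illustrated with a deliberately simplified model. The sketch below assumes that both fabrics carry the same fixed MW area on top of the nanoarray totals from Table I. This is an assumption made for illustration, since the real overhead depends on tile dimensions, and the overhead values are hypothetical, so the output does not reproduce the percentages reported above.

```python
NANO_AREA_AND_OR = 654_600  # nm^2, Table I total for AND-OR
NANO_AREA_H2L = 358_400     # nm^2, Table I total for AND-OR/NOR


def density_improvement(mw_overhead_nm2):
    """Relative density gain of H2L when both designs share one MW overhead."""
    old_total = NANO_AREA_AND_OR + mw_overhead_nm2
    new_total = NANO_AREA_H2L + mw_overhead_nm2
    return old_total / new_total - 1.0


# Hypothetical overheads: a smaller MW area leaves more of the gain visible.
for overhead in (1_000_000, 500_000, 100_000):
    print(overhead, round(density_improvement(overhead), 2))
```

In this model, a fixed overhead dilutes the nanoarray saving, so the improvement grows as the MW area shrinks, which is the trend described above.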



Fig. 11. Yield improvement of WISP-0 with H2L logic and nanoscale 6MR assuming stuck-on transistor.



Fig. 12. Yield improvement of WISP-0 with H2L logic and nanoscale 6MR assuming broken NWs.

## C. Comparison of AND–OR/NOR WISP-0 and CMOS Version

We found WISP-0 in the AND–OR/NOR NASIC fabric to be  $4\times$  (with two-way redundancy and 6MR) and  $15\times$  (with two-way redundancy alone) denser than the corresponding CMOS WISP-0 processor at projected 18-nm technology node.

## D. Yield Evaluation of the New WISP-0 Designs

We extended the NASIC simulator to verify the improvement of the new AND–OR/NOR fabric on the yield of WISP-0. This study assumes the manufacturing, defect, and fault model as discussed in [7], and its purpose is to show the impact of the logic family if the design also incorporates fault tolerance.

First, we present results assuming uniformly distributed defects. Clustered defects are addressed in Section V-E. The simulation results for permanent defects are provided in Fig. 11 (stuck-on FETs) and Fig. 12 (broken NWs). The results show that the H2L logic technique improves the yield considerably. Compared with the AND–OR approach in the 2-way Red + TMR scenario, the improvement of H2L logic on the yield of WISP-0 with 2-way Red + 6MR is 15% when the defect rate of transistors is 2% and 147% at 5% defect rate. Note that the improvement is greater for higher defect rates. For broken NWs, the improvement of yield is 21% at 2% defect rate and 90% at 5% defect rate.

Fig. 13. Yield improvement of WISP-0 with H2L logic and nanoscale 6MR assuming clustered stuck-on transistors.

Fig. 14. Yield improvement of WISP-0 with H2L logic and nanoscale 6MR assuming clustered broken NWs.

The nanoscale TMR technique in the AND–OR NASIC fabric improves the yield when the defect rate is low. However, the improvement vanishes for higher defect rates. For example, if the defect rate for transistors is higher than 7% or the defect rate for broken NWs is higher than 3%, nanoscale TMR actually deteriorates the overall yield. This is because, at high defect rates, the voting circuits themselves become so unreliable that they impact the yield negatively.
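To illustrate why voting can lower the yield at high defect rates, the following is a minimal sketch based on the textbook TMR reliability expression, not on the NASIC simulator. It assumes, purely for illustration, that defects are independent and that a module or voter works only if all of its devices are defect-free. The device counts (`module_devices`, `voter_devices`) and all function names are hypothetical and are not taken from the WISP-0 design.

```python
def block_reliability(defect_rate: float, n_devices: int) -> float:
    """Probability that a block of n_devices has no defective device.

    Assumption: independent defects, and any single defect breaks the block.
    """
    return (1.0 - defect_rate) ** n_devices


def tmr_reliability(r_module: float, r_voter: float) -> float:
    """Textbook TMR model: the voter works and at least 2 of 3 modules work."""
    majority_ok = 3.0 * r_module**2 - 2.0 * r_module**3
    return r_voter * majority_ok


def tmr_helps(defect_rate: float, module_devices: int, voter_devices: int) -> bool:
    """True if the voted triple is more reliable than one unprotected module."""
    r_module = block_reliability(defect_rate, module_devices)
    r_voter = block_reliability(defect_rate, voter_devices)
    return tmr_reliability(r_module, r_voter) > r_module


if __name__ == "__main__":
    # Hypothetical sizes: a 10-device module guarded by a 3-device voter.
    for p in (0.01, 0.02, 0.05, 0.07):
        r_m = block_reliability(p, 10)
        r_tmr = tmr_reliability(r_m, block_reliability(p, 3))
        print(f"defect rate {p:.2f}: module {r_m:.3f}, TMR {r_tmr:.3f}")
```

With these illustrative device counts, the voted design is more reliable than a single module at a 2% defect rate but less reliable at 5%. This mirrors the qualitative trend only; the crossover point depends entirely on the assumed counts and on the simplified failure model.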

However, with H2L logic, the effectiveness of nanoscale 6MR is significantly better. In addition to the yield improvement for the logic itself, the nanoscale 6MR technique in the new NASIC fabric consistently improves the yield of WISP-0. Compared with WISP-0 with H2L logic and two-way redundancy alone, nanoscale 6MR improves the yield by 7% and 47% when 2% and 5% of the transistors are defective, respectively. For broken NWs, similar results are achieved.

## E. Impact of Clustered Defects on NASIC WISP-0

In the previous results, we assumed that all defects are uniformly distributed. However, defects can also be clustered, as a group of adjacent FETs or a group of adjacent NWs could be damaged during the manufacturing process. To evaluate the impact of clustered defects, we need a model for them. In this paper, we assume the same model as in [7] for comparison purposes. In this model, the probability of defects decreases from the center of the cluster toward its margins. The model assumes a uniform cluster shape; because the actual shape depends on the manufacturing process, we are currently working on modeling other possible cluster shapes for more accurate estimates. Nevertheless, for the purposes of this comparison, we mainly focus on trends that are due to the new logic style.
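As a rough illustration of a clustered model of this kind, the sketch below injects defects into a grid of devices so that the defect probability is highest at each cluster center and falls off toward the margin. The linear falloff, the circular cluster shape, and all names and parameters (`radius`, `p_center`, `n_clusters`) are assumptions made here for illustration; the exact model used in the simulations is the one defined in [7].

```python
import math
import random


def cluster_defect_probability(distance: float, radius: int, p_center: float) -> float:
    """Defect probability at a given distance from a cluster center.

    Assumed linear falloff: p_center at the center, zero beyond the radius.
    """
    if distance > radius:
        return 0.0
    return p_center * (1.0 - distance / (radius + 1))


def inject_clustered_defects(rows, cols, n_clusters, radius, p_center, rng):
    """Return a rows x cols grid of booleans; True marks a defective device."""
    defective = [[False] * cols for _ in range(rows)]
    for _ in range(n_clusters):
        # Cluster centers are placed uniformly at random on the grid.
        center_r, center_c = rng.randrange(rows), rng.randrange(cols)
        for r in range(max(0, center_r - radius), min(rows, center_r + radius + 1)):
            for c in range(max(0, center_c - radius), min(cols, center_c + radius + 1)):
                d = math.hypot(r - center_r, c - center_c)
                if rng.random() < cluster_defect_probability(d, radius, p_center):
                    defective[r][c] = True
    return defective


if __name__ == "__main__":
    grid = inject_clustered_defects(16, 32, n_clusters=3, radius=3,
                                    p_center=0.9, rng=random.Random(0))
    for row in grid:
        print("".join("x" if bad else "." for bad in row))
```

Each cluster can contribute several defective devices, so the number of clusters and the number of defects are distinct quantities in such a model.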

Fig. 13 shows the yield of WISP-0 assuming clustered transistor defects. Fig. 14 shows the yield with clustered broken NWs. The results indicate that the H2L logic technique also helps to tolerate clustered defects/faults better: in Fig. 13, the yield of WISP-0 with 2-way Red + 6MR remains at 46% even when the cluster defect rate of transistors is 5% for the parameters simulated. Note that each defect cluster may contain multiple defects. As with uniform defects, the nanoscale 6MR technique works much better in the AND–OR/NOR fabric than nanoscale TMR does in the pure AND–OR fabric. For example, if the cluster defect rate for transistors is higher than 4%, nanoscale TMR in the pure AND–OR fabric actually deteriorates the overall yield. For clustered broken NWs, nanoscale TMR does not appear to work at all. However, in the AND–OR/NOR fabric, nanoscale 6MR consistently improves the yield for clustered defective transistors. Even for clustered broken NWs, the 6MR technique still improves the yield of WISP-0 when the cluster defect rate is below 4%.

## VI. CONCLUSION

In this paper, we demonstrated a new nanofabric that combines two different logic families in the same logic stage. Circuit designs in the new fabric can be significantly simplified compared with the previous fabric. Our simulation results show that it is possible to achieve much denser designs than with other two-level logic approaches. In addition, the yield of the fault-tolerant processor WISP-0 can be improved significantly on the new fabric, in some cases by up to an order of magnitude. The heterogeneous logic technique also enables majority voting at nanoscale, which improves the yield of NASIC designs without adding manufacturing requirements. It can also be applied in other grid-based nanofabrics. We are currently exploring this approach in wider data paths to gauge the benefits for larger scale designs.

## ACKNOWLEDGMENT

The authors would like to thank their colleagues at the Functional Engineered Nano Architectonics (FENA)/Microelectronics Advanced Research Corporation (MARCO) and the Center for Hierarchical Manufacturing (CHM), National Science Foundation (NSF) for their valuable input.

## REFERENCES

- [1] Y. Cui, X. Duan, J. Hu, and C. M. Lieber, "Doping and electrical transport in silicon nanowires," *J. Phys. Chem. B*, vol. 104, no. 22, pp. 5213–5216, Jun. 2000.
- [2] Y. Huang, X. Duan, Y. Cui, L. J. Lauhon, K-Y. Kim, and C. M. Lieber, "Logic gates and computation from assembled nanowire building blocks," *Science*, vol. 294, no. 5545, pp. 1313–1317, Nov. 2001.
- [3] Y. Huang, X. Duan, Q. Wei, and C. M. Lieber, "Directed assembly of one-dimensional nanostructures into functional networks," *Science*, vol. 291, no. 5504, pp. 630–633, Jan. 2001.
- [4] Y. Wu, J. Xiang, C. Yang, W. Lu, and C. M. Lieber, "Single-crystal metallic nanowires and metal/semiconductor nanowire heterostructures," *Nature*, vol. 430, pp. 61–65, Jul. 2004.
- [5] A. DeHon, "Nanowire-based programmable architectures," ACM J. Emerging Technol. Comput. Syst., vol. 1, no. 2, pp. 109–162, Jul. 2005.
- [6] K. K. Likharev and D. B. Strukov, "CMOL: Devices, circuits, and architectures. Introducing molecular electronics," *Lecture Notes Phys.*, vol. 680, pp. 447–477, 2005.
- [7] C. A. Moritz, T. Wang, P. Narayanan, M. Leuchtenburg, Y. Guo, C. Dezan, and M. Bennaser, "Fault-tolerant nanoscale processors on semiconductor nanowire grids," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 54, no. 11, pp. 2422–2437, Nov. 2007.

- [8] C. A. Moritz and T. Wang, "Towards defect-tolerant nanoscale architectures," in *Proc. 6th IEEE Conf. Nanotechnol. (IEEE-Nano 2006)*, Jun., vol. 1, pp. 331–334.
- [9] T. Wang, M. Bennaser, Y. Guo, and C. A. Moritz, "Combining circuit level and system level techniques for defect-tolerant nanoscale architectures," in *Proc. 2nd IEEE Int. Workshop Defect Fault Tolerant Nanoscale Archit.* (*NanoArch*), Boston, MA, Jun. 2006, pp. 101–108.
- [10] C. A. Moritz and T. Wang, "Latching on the wire and pipelining in nanoscale designs," presented at the 3rd Non-Silicon Comput. Workshop (NSC-3), Munich, Germany, 2004.
- [11] T. Wang, Z. Qi, and C. A. Moritz, "Opportunities and challenges in application-tuned circuits and architectures based on nanodevices," in *Proc. 1st ACM Int. Conf. Comput. Frontiers*, Ischia, Italy, 2004, pp. 503– 511.
- [12] D. Whang, S. Jin, Y. Wu, and C. M. Lieber, "Large-scale hierarchical organization of nanowire arrays for integrated nanosystems," *Nano Lett.*, vol. 3, pp. 1255–1259, Sep. 2003.
- [13] T. Wang, M. Bennaser, Y. Guo, and C. A. Moritz, "Wire-streaming processors on 2-D nanowire fabrics," in *Proc. Nano Sci. Technol. Inst.*, *Nanotech Conf. 2005.* Anaheim, CA, vol. 2, pp. 619–622.
- [14] P. Narayanan, M. Leuchtenburg, T. Wang, and C. A. Moritz, "CMOS-control enabled single-type FET NASIC," in *Proc. IEEE VLSI 2008*, pp. 191–196.
- [15] D. B. Strukov and K. K. Likharev, "Defect-tolerant architecture for nanoelectronic crossbar memories," *J. Nanosci. Nanotechnol.*, vol. 7, no. 1, pp. 151–167, Jan. 2006.
- [16] Y. W. Heo, L. C. Tien, Y. Kwon, D. P. Norton, S. J. Pearton, B. S. Kang, and F. Ren, "Depletion-mode ZnO nanowire field-effect transistor," *Appl. Phys. Lett.*, vol. 85, no. 12, pp. 2274–2276, Sep. 2004.
- [17] R. E. Lyons and W. Vanderkulk, "The use of triple modular redundancy to improve computer reliability," *IBM J. Res. Dev.*, vol. 6, no. 2, pp. 200–209, 1962.
- [18] Y. Luo, C. P. Collier, J. O. Jeppesen, K. A. Nielsen, E. DeIonno, G. Ho, J. Perkins, H. Tseng, T. Yamamoto, J. F. Stoddart, and J. R. Heath, "Two-dimensional molecular electronics circuits," *ChemPhysChem*, vol. 3, no. 6, pp. 519–525, 2002.
- [19] A. B. Greytak, L. J. Lauhon, M. S. Gudiksen, and C. M. Lieber, "Growth and transport properties of complementary germanium nanowire field-effect transistors," *Appl. Phys. Lett.*, vol. 84, no. 21, pp. 4176–4178, May 2004.
- [20] H. T. Ng, J. Han, T. Yamada, P. Nguyen, Y. P. Chen, and M. Meyyappan, "Single crystal nanowire vertical surround-gate field-effect transistor," *Nano Lett.*, vol. 4, no. 7, pp. 1247–1252, May 2004.
- [21] W. Lu and C. M. Lieber, "Semiconductor nanowires," J. Phys. D: Appl. Phys., vol. 39, pp. R387–R406, Oct. 2006.
- [22] International Technology Roadmap for Semiconductors. (2006). [Online]. Available: http://public.itrs.net/
- [23] T. Wang, P. Narayanan, and C. A. Moritz, "Dynamic style, single-type-FET based 2-D nano fabrics," submitted for publication.



Teng Wang received the B.S. degree in electronic engineering and information science from the University of Science and Technology of China (USTC), Hefei, China and the M.S. degree from the Chinese Academy of Sciences in 1999 and 2002, respectively. He is currently working toward the Ph.D. degree in electrical and computer engineering at University of Massachusetts, Amherst.

His research interests include nanoscale systems, computer architecture, and VLSI design.



**Pritish Narayanan** received the B.E. (Hons) degree in electrical and electronics engineering and M.Sc. (Hons) degree in chemistry from the Birla Institute of Technology and Science, Pilani, India in 2005. He is currently working toward the Ph.D. degree and is a Research Assistant in Electrical and Computer Engineering at the University of Massachusetts, Amherst.

He was previously employed as a Research and Development Engineer at IBM, where he worked on process variation and statistical timing analysis. His interests include nanoarchitectures and neural networks.



Csaba Andras Moritz (M'85) received the Ph.D. degree in computer systems from the Royal Institute of Technology, Stockholm, Sweden in 1998.

From 1997 to 2000, he was a Research Scientist at the MIT Laboratory for Computer Science. He has consulted for several technology companies in Scandinavia and held industrial positions ranging from CEO to CTO to founder. His most recent startup company, BlueRISC Inc., develops security microprocessors and hardware-assisted security solutions. He is currently a tenured Associate Professor in the Department of Electrical and Computer Engineering, University of Massachusetts, Amherst. His research interests include computer architecture, compilers, low power design, security, and nanoscale systems.