# On the Design of Ultra-High Density 14nm Finfet based Transistor-Level Monolithic 3D ICs

Jiajun Shi<sup>1,2</sup>, Deepak Nayak<sup>1</sup>, Motoi Ichihashi<sup>1</sup>, Srinivasa Banna<sup>1</sup> and Csaba Andras Moritz<sup>2</sup> <sup>1</sup>Technology Research, GLOBALFOUNDRIES, Santa Clara, CA, USA <sup>2</sup>Department of ECE, University of Massachusetts, Amherst, MA, USA <u>jiajun@umass.edu</u>, deepak.nayak@globalfoundries.com, andras@ecs.umass.edu

Abstract- Conventional 2D CMOS faces severe challenges at sub-22nm nodes. Monolithic 3D (M3D) IC technology enables ultra-high-density vertical connections and provides a promising path for continued technology node scaling. Transistor-level (TR-L) monolithic 3D IC is the most advanced and fine-grained M3D IC technology. In this paper, for the first time, we explore the detailed design, benefits, and challenges of a TR-L M3D IC technology based on a silicon-validated 14nm Finfet process design kit (PDK). TR-L M3D standard cell layouts are built using 14nm Finfet design rules and feature sizes. A semi-customized RC extraction methodology is used for accurate 3D cell RC extraction. After extensive simulation, TR-L M3D cell power, delay, and area are evaluated and compared with equivalent 2D cells in the same technology node. System-level benchmarking with several circuits shows up to 55% reduced footprint, 25% shorter wire length, and 18% lower power with TR-L M3D vs. 2D CMOS.

### Keywords—monolithic 3D IC, 3D RC extraction, 14nm Finfet, 3D performance characterization

### I. INTRODUCTION

As technology nodes scale down, CMOS faces severe challenges that result from device scaling limitations, interconnect bottlenecks, and manufacturing complexity. To continue scaling, 3D ICs based on TSV or wire bonding technology have been explored in recent years. Compared with these 3D IC technologies, gate- and transistor-level monolithic 3D integration uses extremely small Monolithic Inter-layer Vias (MIVs) to achieve fine-grained vertical integration of devices and interconnects.

Transistor-level Monolithic 3D (TR-L M3D) technology, which is the focus of this paper, builds fine-grained 3D integration by creating 3D standard cells. The 3D cells are designed by splitting the PMOS and NMOS transistors of a standard cell into two tiers, and MIVs are used for cell-internal vertical interconnection. Figure 1 shows an overview of the TR-L M3D cell structure and cell-to-cell connection. There are two metal layers (M1, M2) and one silicon layer (for PMOS) in the bottom tier (bot-tier), and one silicon layer (for NMOS) in the top-tier, with an inter-layer dielectric (ILD) for isolation. This way, the pull-up and pull-down networks of each standard cell are split so that each silicon layer has only one type of transistor. The MIV, which penetrates the ILD and connects to M2 in the bottom tier, connects the pull-up and pull-down networks of each standard cell. A full metal stack is used in the top-tier for routing between cells, similar to cell routing in typical 2D designs. M1 and M2 in the bot-tier are used only for intra-cell routing, and M2 is needed only for large cells such as the Data Flip-flop (DFF) and AND-OR-Inverter3x2 (AOI3x2).



As discussed in [1], TR-L M3D has considerable benefits over 2D CMOS. In [1], a cell-folding technique is used to design TR-L 3D standard cells by folding 2D standard cells in half and inserting MIVs to connect the pull-up and pull-down networks. However, that approach keeps the same pull-down and pull-up network designs as the 2D cell and shows little cell-level benefit over 2D cells. Moreover, the design and evaluation in [1] were based on 45nm planar MOSFET technology, which is significantly different from the tri-gate Finfet technologies available today. In this work, we design and evaluate, for the first time, 14nm Finfet based TR-L M3D ICs using a silicon-validated 14nm Finfet process design kit (PDK). We design compact 3D standard cells in which the pull-up and pull-down networks are redesigned to fully use the 3D routing space while following Finfet design rules. The proposed design not only provides an accurate projection of fine-grained M3D benefits at a state-of-the-art technology node, but also indicates the benefits to expect in future technology nodes.

We investigate the MIV dimensions in TR-L M3D considering cell footprint savings and manufacturability. The 3D standard cells are designed for compatibility with 14nm Finfet design rules and the MIV dimensions. The standard cell RC is extracted precisely using Calibre xACT and Sentaurus Interconnect. Based on the resulting TR-L M3D standard cell library, we then extensively benchmark several circuits to evaluate this technology. After a fair comparison with 2D benchmarks in the same technology node, detailed analyses of wire length, timing, and power are presented to show the benefits over 2D.

The key contributions of our 3D design include:

- We propose an efficient method to determine the MIV dimensions in 14nm Finfet based TR-L M3D.
- We make full use of 3D routing space to redesign many small but critical cells for improved efficiency vs. the state of the art. Higher benefits over 2D are shown than with the published cell-folding scheme in [1].
- We apply an accurate RC extraction methodology that extracts full RC information inside the cell using commercial EDA tools, whereas the method in [1] can extract only limited types of RC.
- We explain, for the first time, why TR-L M3D cells show significant RC reduction, through a detailed comparison of cell-internal RC between TR-L M3D cells and 2D cells.
- We address the system-level routing congestion issue in TR-L M3D by adjusting the cell footprint to reduce routing density, while still maintaining a significant benefit over 2D.

### II. TR-L M3D CELL DESIGN AND RC EXTRACTION

#### A. Overview of TR-L M3D Challenges

A sequential processing of M3D ICs has been proposed in [2]. A major challenge is to preserve the performance of bot-tier devices and interconnects, while keeping the top-tier processing temperature below 650°C. In addition, the top-tier device performance of M3D should ideally match the baseline 2D CMOS device performance. To achieve this, several process technology innovations (stable silicide, low-temperature S/D epitaxy, implant activation via solid phase epitaxy, ns laser anneal) have been proposed to match the low temperature (LT) device characteristics with those of high temperature (HT) devices [3]. Furthermore, Tungsten wiring may be used in the bottom tier to avoid metal line degradation during top-tier processing. However, Tungsten wiring raises patterning and deposition issues whose feasibility and associated impact remain highly uncertain for current manufacturing technology. In this paper, we assume that both NMOS and PMOS in M3D have the same performance as in 2D, and that Copper interconnect is used in both the bot- and top-tier.

#### B. 14nm Finfet based TR-L M3D Cell Feature Sizes

The design of each TR-L M3D cell contains three parts: the pull-up network (PUN) in bot-tier, pull-down network (PDN) in top-tier, and MIVs that connect input/output ports between PUN and PDN. Each component in PUN and PDN is designed following 14nm Finfet design rules, and they are designed with regular 2D style. The dimensions of MIV are designed considering the compatibility with design rules, cell footprint savings, as well as MIV aspect ratio.

Figure 2A shows the typical 9-track cell design, where each active device region uses 3 tracks, each power rail region uses one track, and one track in the middle is used for isolation of P- and N-devices. Figure 2B shows the top view of our proposed 3D cell. The PDN is placed in the top-tier, exactly aligned with the PUN in the bot-tier. For both PUN and PDN, the power rail uses 1 track and the active device region uses 3 tracks. The MIVs are placed in the 5<sup>th</sup> metal track, with a spacing to the active device region equal to the minimum M1 spacing, which avoids M1 routing violations inside the cell. The total height of the 3D cell is the width of 5 metal tracks (~8 fin pitches). Since each MIV is connected to M1 in the top-tier, we set the minimum MIV-to-MIV spacing equal to the minimum M1 spacing.

The MIV width is critical because it affects both cell footprint and MIV manufacturability. Figure 3 shows how MIV width impacts the footprint of a 3D cell, expressed as the ratio of 3D to 2D footprint (3D/2D). Each ratio curve has an inflection point: the 3D cell loses footprint savings rapidly once the MIV width exceeds about 50nm. At the same time, the MIV should be as wide as possible to reduce its aspect ratio and ease manufacturing. Therefore, we set the width to 50nm, which gives the 3D cells about 45% footprint saving over 2D cells (see Figure 3).

The MIV height is determined by the ILD thickness and the dimensions of the device interconnect components. The MIV connects M2 in the bot-tier to M1 in the top-tier by penetrating the ILD layer and the NMOS dielectric layer. According to the study in [5], a minimum ILD thickness of 100nm is required to avoid coupling between devices in the top- and bot-tier (Figure 4A). As shown in Figure 4B, the NMOS dielectric layer thickness equals the sum of the heights of the NMOS fin, trench silicide region (TS), diffusion contact (CA), and via0. Based on our 14nm Finfet data, this sum is 200nm. The MIV height is the sum of the ILD thickness (100nm) and the top-tier dielectric thickness up to the M1 layer (200nm), which adds up to 300nm. The aspect ratio of the MIV is therefore 6 (300nm/50nm), which is acceptable for state-of-the-art fabrication techniques.
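The MIV geometry arithmetic above can be reproduced with a short sketch. The numbers are the ones quoted in this section; the function and constant names are illustrative and not part of any PDK or tool. The track ratio only compares cell heights (5 tracks vs. 9 tracks) and assumes equal cell width, so it ignores the MIV-width effect shown in Figure 3.

```python
# Values quoted in Section II.B; all names here are illustrative.
ILD_THICKNESS_NM = 100.0        # minimum ILD thickness, per [5]
TOP_TIER_DIELECTRIC_NM = 200.0  # NMOS fin + TS + CA + via0
MIV_WIDTH_NM = 50.0             # chosen MIV width


def miv_height_nm(ild_nm, dielectric_nm):
    """MIV height: ILD thickness plus top-tier dielectric up to M1."""
    return ild_nm + dielectric_nm


def aspect_ratio(height_nm, width_nm):
    """Height-to-width ratio of the via."""
    return height_nm / width_nm


def track_height_ratio(tracks_3d, tracks_2d):
    """3D/2D cell height ratio, assuming equal cell width (assumption)."""
    return tracks_3d / tracks_2d


height = miv_height_nm(ILD_THICKNESS_NM, TOP_TIER_DIELECTRIC_NM)
print(f"MIV height: {height:.0f} nm")
print(f"MIV aspect ratio: {aspect_ratio(height, MIV_WIDTH_NM):.1f}")
print(f"3D/2D height ratio: {track_height_ratio(5, 9):.2f}")
```

The height ratio of about 0.56 matches the footprint ratio reported later for cells that are folded without redesign.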



Figure 4: A) ILD thickness [5] B) Side view of MIV dielectric

#### C. Standard Cell Design



Figure 5: A) 2D NAND3 layout B) Unfolded 3D NAND3 layout

For some 2D standard cells in 14nm Finfet technology, an extra polysilicon line, called 'dummy poly', is inserted between the polysilicon lines to create space for internal signal routing. This enlarges the lateral width of the cell to provide extra routing space and eliminate certain routing violations, at the cost of area overhead. The 2D NAND3 design in 14nm Finfet is shown in Figure 5A.

The cell-folding technique proposed in [1] takes the original design of a 2D standard cell and folds it to generate the 3D cell design. The PUN and PDN of the 3D cell are separated into two tiers and connected using MIVs. The 3D cell uses the same PUN and PDN design as the 2D cell, where M1 is mainly used for intra-cell routing. Therefore, in that TR-L M3D technology, M1 of the bot-tier is used extensively for routing inside each cell, while M2 of the bot-tier is needed only for intra-cell routing of large cells such as DFF and AOI3x2. As a result, M2 of the bot-tier is not fully used.

In this paper, our approach to cell design is to abandon the original 2D cell design and redesign the 3D cell to make full use of the 3D routing space, especially in the vertical dimension. The scheme uses the idle M2 layer in the bot-tier to create more routing space for design optimization. Following this scheme, M2 of the bot-tier is used in some small but critical cell designs (NAND3, AOI21 and AOI22). Figure 5B shows our 3D NAND3 cell in an unfolded layout view. Using the M2 layer in the bot-tier generates extra routing space in the vertical dimension, which eliminates certain routing violations and congestion in the PUN (bot-tier). Instead of inserting dummy poly to create routing space horizontally, as in the 2D design, our approach uses the vertical routing space created by metal stacking, which removes the dummy poly and thus reduces cell footprint and diffusion RC.

Figure 6: A) Sprocess layout for vertical coupling capacitance extraction B) Sprocess layout for MIV RC extraction

Compared with the 3D cell design in [1], which simply folds the original 2D standard cells, our design scheme fully uses the fine-grained 3D routing space and produces much more compact 3D standard cells. As a result, our 3D cells achieve significant RC reduction. The evaluation results are shown in section III.

#### D. RC Extraction Methodology

3D cell RC extraction is a significant challenge because no commercial tool supports RC extraction of 3D ICs. In [1], the authors extract cell RC using Calibre xACT, which was designed for RC extraction of 2D designs and accepts only one silicon layer for RC analysis. Therefore, only limited types of capacitance can be modeled, which results in a fairly inaccurate evaluation of 3D cells.

In our work, we propose a method that accurately models capacitance values, including silicon-to-silicon, silicon-to-metal and metal-to-metal capacitances.

As shown in Figure 7, RC in a 3D cell is composed of four parts: RC inside the PUN, RC inside the PDN, RC of the MIVs, and coupling capacitance between the PUN and PDN. As noted, the PUN and PDN are each designed in the normal 2D style. Their layouts are prepared independently using Virtuoso, and the RC values of each part are then extracted using Calibre xACT.

To evaluate the vertical coupling capacitance between the PUN and PDN, we instead build 3D layouts of the PUN and PDN (Figure 6A) using Sentaurus Process (Sprocess) and then model the capacitance using the Sentaurus Interconnect (Sinterconnect) TCAD physical simulator. Notably, the 3D layout is built layer by layer, and each layer is built by a sequence of process steps for a highly realistic and practical evaluation: deposit oxide, make mask, etch, and deposit selected materials.

Figure 8: A) Capacitance of 2D INV B) Capacitance of 3D INV

The key parameters, such as dielectric constant, diffusion doping concentration and gate metal work function, are defined based on our foundry data. The dimensions of each component are set according to the design rules of the technology. Table II shows the extracted capacitance values of top-to-bot coupling. The RC of the MIVs is extracted in a similar way to the vertical coupling capacitance. The MIVs mainly couple to the diffusion regions, power rails (VDD, VSS) and metal layers (M1, M2) in both the top- and bot-tier. Each MIV also has coupling capacitance to the adjacent MIVs. The resistance of the MIV is evaluated using Sentaurus Interconnect physical simulation; each MIV has a resistance of around 5.5 $\Omega$ and a capacitance of 30-50aF. Figure 6B shows the layout of the NAND3 cell in Sprocess.
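As a rough order-of-magnitude check, the extracted MIV resistance and capacitance can be multiplied to give an intrinsic RC product. The sketch below uses only the 5.5 Ω and 30-50aF figures quoted above; the lumped single-pole view of the MIV is a simplifying assumption for illustration, not the extraction model used in this work.

```python
# Values quoted in the text; the lumped-RC view is an assumption.
MIV_RESISTANCE_OHM = 5.5
MIV_CAP_RANGE_AF = (30.0, 50.0)


def rc_product_fs(resistance_ohm, capacitance_af):
    """Return R*C in femtoseconds (1 ohm * 1 aF = 1e-18 s = 1e-3 fs)."""
    return resistance_ohm * capacitance_af * 1e-3


for cap_af in MIV_CAP_RANGE_AF:
    tau = rc_product_fs(MIV_RESISTANCE_OHM, cap_af)
    print(f"C = {cap_af:.0f} aF -> RC = {tau:.3f} fs")
```

Under this simplification the product stays well below one femtosecond, which suggests the MIV's own RC is small next to the 15ps input slew used in the cell evaluation.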

### III. CELL EVALUATION AND ANALYSIS

For the standard cell tests, we use an FO4 load, which assumes each standard cell in the system drives an average fan-out of four inverters (INVs). The input signal slew is set to 15ps. Power is measured as the average dynamic power over all switching scenarios, assuming the same input signal frequency of 1GHz for every standard cell. Delay is measured as the worst-case propagation delay. Normalized evaluation results are shown in Table I.

The 3D INV, NAND2, NOR2, DFF cells show up to 13% delay reduction, around 11% lower power, and 44% reduced footprint compared to the 2D cells.

For the redesigned cells NAND3, AOI21 and AOI22, we achieve up to 22% reduced delay, around 20% lower power, and 55% reduced footprint compared to 2D designs. Related to this, Table III shows the effective internal capacitance values in 3D cells vs. 2D cells.

The key factor behind these benefits is the vertical splitting of the PDN and PUN in our cell design, which effectively eliminates cross coupling between the PDN and PUN.



Figure 9: Measured device and interconnect capacitance ratios between 3D cells and 2D cell (3D/2D)

Figure 8A shows a typical 2D INV layout with examples of internal capacitances that are effectively eliminated in our 3D design. In this 2D design, since the PDN and PUN are placed close together, the diffusions in the PDN have a large cross-coupling capacitance to the PUN, which counts as device capacitance. In our 3D design, however, the PDN and PUN are placed far from each other, with an ILD layer inserted for isolation. This way, the vertical coupling between components in the PDN and PUN becomes negligible. Table II shows the more significant vertical coupling capacitance values between the PUN and PDN of the 3D INV.

Additionally, in the 3D design, the ILD layer isolates the PUN and PDN and helps to reduce the coupling capacitance of some intra-cell routing metal, which counts as interconnect capacitance. For example, in the 2D design, the entire output metal couples to both VDD and VSS, while in the 3D design (Figure 8B) the output metal in the PDN (top-tier) is far away from the VDD metal and couples only to the VSS metal in the top-tier. Similarly, the output metal in the PUN couples only to the VDD power rail in the bot-tier. Therefore, the parasitic capacitance of the output metal is significantly reduced in TR-L M3D cells.

Table I: Normalized Delay, Power, and Footprint Results of Cells

| Std-cells | Delay (2D) | Delay (3D) | Power (2D) | Power (3D) | Footprint (2D) | Footprint (3D) |
|-----------|------------|------------|------------|------------|----------------|----------------|
| INV       | 1.00       | 0.90(90%)  | 1.00       | 0.83(87%)  | 1.00           | 0.56(56%)      |
| NAND2     | 1.64       | 1.43(88%)  | 1.08       | 0.92(88%)  | 1.49           | 0.84(56%)      |
| NOR2      | 1.61       | 1.43(89%)  | 1.07       | 0.92(88%)  | 1.49           | 0.84(56%)      |
| NAND3     | 2.55       | 2.20(86%)  | 1.14       | 0.94(83%)  | 2.49           | 1.11(45%)      |
| AOI21     | 1.71       | 1.45(86%)  | 1.11       | 0.89(81%)  | 2.49           | 1.11(45%)      |
| AOI22     | 1.87       | 1.60(85%)  | 1.20       | 0.97(80%)  | 2.99           | 1.39(46%)      |
| DFF       | 5.09       | 4.53(89%)  | 4.32       | 3.93(90%)  | 9.47           | 5.26(56%)      |

Table II: Vertical Coupling Capacitance of 3D INV (Unit: aF)

|           | Top Diffusion | Top Gate | Top Power | Top Contact | Top Via0 | Top M1 |
|-----------|---------------|----------|-----------|-------------|----------|--------|
| Bot Power | 2             | 0.7      | NA        | 0.2         | 0.09     | 0.009  |
| Bot M1    | 3             | 1        | 0.02      | 0.1         | 0.09     | 0.01   |

We carried out an accurate measurement of the internal capacitance reduction of the 3D cell design compared with the 2D cell, which is not explained or analyzed in detail in [1]. We divide cell internal capacitance into two parts: (i) the device capacitance, which includes the diode/diffusion capacitance inside the n- and p-type Finfet transistors and the cross-coupling capacitance between the n-type device's diffusions (S/D) and the p-type device's diffusions (S/D); and (ii) the interconnect capacitance, which is the parasitic capacitance of the routing metal, diffusion/gate contacts and metal-to-contact vias. After extracting RC from the layout without interconnect metal and diffusion/gate contacts, the effective device capacitance of each cell is calculated through energy testing in HSPICE. The total effective internal capacitance of the cell is measured with the same method by importing the layout with the interconnect components. The difference between the total effective internal capacitance and the effective device capacitance is the effective interconnect capacitance. Figure 9 shows the device and interconnect capacitance reduction of 3D cells compared with 2D cells. For the cells without redesign (INV and NAND2), the device capacitance is reduced by around 10% and the interconnect capacitance by around 6%, which together contribute about 16% effective internal capacitance reduction compared to the 2D cell. The redesigned cell (NAND3) has a total capacitance reduction of 22%, of which device capacitance reduction contributes 15% and interconnect capacitance reduction contributes 7%. Thus, splitting the PUN and PDN in the 3D design contributes to both device capacitance reduction and interconnect capacitance reduction.
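The device/interconnect decomposition described above can be sketched as follows. The text does not state the formula used in the HSPICE energy test, so the relation C_eff = E / VDD² is an assumption (the energy drawn from the supply when charging a capacitance to VDD), and the supply voltage and energy values below are hypothetical placeholders rather than measured data.

```python
def effective_capacitance_ff(energy_fj, vdd_v):
    """Assumed relation C_eff = E / VDD^2; fJ / V^2 yields fF."""
    return energy_fj / (vdd_v ** 2)


def split_capacitance(energy_device_only_fj, energy_full_fj, vdd_v):
    """Return (device, interconnect, total) effective capacitance in fF.

    energy_device_only_fj: energy from the netlist extracted without
        interconnect metal and diffusion/gate contacts.
    energy_full_fj: energy from the netlist with all interconnect included.
    """
    c_device = effective_capacitance_ff(energy_device_only_fj, vdd_v)
    c_total = effective_capacitance_ff(energy_full_fj, vdd_v)
    return c_device, c_total - c_device, c_total


# Hypothetical inputs for illustration only (not from the paper).
VDD_V = 0.8
c_dev, c_int, c_tot = split_capacitance(0.16, 0.24, VDD_V)
print(f"device: {c_dev:.3f} fF, interconnect: {c_int:.3f} fF, total: {c_tot:.3f} fF")
```

The interconnect term is obtained purely by subtraction, which mirrors the two-extraction procedure in the text: one run without interconnect components and one with them.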

Though in [1] the authors point out that the splitting of PUN and PDN also contributes to the reduction of polysilicon length, our 3D cell has no significant reduction of polysilicon length due to the high density cell design and strict design rule in 14nm Finfet technology. Additionally, for each input node of our 3D cell, two high-resistance metal-to-silicon contacts are used for MIV-PUN and MIV-PDN connections, while the 2D cell only uses one contact for each input node. Overall, the driving resistance of each 3D cell is therefore comparable to the 2D cell.

Table III: Internal Capacitance Comparison (Unit: fF)

|    | INV  | NAND2 | NOR2 | NAND3 | AOI21 | AOI22 |
|----|------|-------|------|-------|-------|-------|
| 2D | 0.43 | 0.55  | 0.56 | 0.62  | 0.61  | 0.68  |
| 3D | 0.36 | 0.46  | 0.47 | 0.48  | 0.46  | 0.54  |
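The relative reductions implied by Table III can be computed directly from the tabulated values. The sketch below only restates the table; the dictionary and function names are illustrative.

```python
# (2D, 3D) effective internal capacitance in fF, copied from Table III.
CAP_FF = {
    "INV": (0.43, 0.36),
    "NAND2": (0.55, 0.46),
    "NOR2": (0.56, 0.47),
    "NAND3": (0.62, 0.48),
    "AOI21": (0.61, 0.46),
    "AOI22": (0.68, 0.54),
}


def reduction_pct(c2d_ff, c3d_ff):
    """Percentage reduction of the 3D value relative to the 2D value."""
    return 100.0 * (c2d_ff - c3d_ff) / c2d_ff


for cell, (c2d, c3d) in CAP_FF.items():
    print(f"{cell}: {reduction_pct(c2d, c3d):.1f}% lower in 3D")
```

For INV and NAND2 this gives roughly 16%, and for NAND3 between 22% and 23%, in line with the device-plus-interconnect breakdown discussed for Figure 9.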

### IV. SYSTEM-LEVEL BENCHMARK

#### A. Benchmarking Methodology





The overall system-level benchmarking flow is shown in Figure 10. Since cell-to-cell routing exists only in the full metal stack of the top-tier, only the PDN layouts of the 3D cells are used to generate the LEF file. The Lib file is generated by Synopsys SiliconSmart from the post-extraction HSPICE netlists of the cells. These netlists contain device models and the extracted cell RC information produced by Calibre xACT and Sentaurus Interconnect, as discussed in section II.D. We synthesize the benchmark circuits based on our 3D standard cell library and the benchmark design constraints. Next, we use Encounter to build the physical layout from the synthesized netlist of each benchmark circuit. Our full-chip timing/power optimization and analysis flows for TR-L M3D and 2D are the same, because our M3D chip-level routing style is exactly the same as that of 2D. Then, we use Synopsys PrimeTime to carry out power and timing analysis based on the routing RC information and cell libraries. We perform statistical power analysis with the switching activity of both inputs and sequential outputs set to 0.2.



#### B. Clock Tree and Power Delivery Network Design

The clock tree design in TR-L M3D follows the convention of 2D design. The clock tree is distributed only in the full metal stack of the top-tier, and the clock input pin of the DFF cell is placed in the top-tier.

We use standard power delivery design techniques for the power delivery network design of top-tier. The topmost metal layer of top-tier is used as global input and one intermediate layer is used to carry the power and ground signal. In TR-L M3D, top-tier only has VSS rails (Figure 13) and bot-tier only has VDD rails. So only the VSS stripes of intermediate layer connect to the power rails (VSS) of M1 layer through normal via. The VDD power signals of intermediate metal layer are carried to the power ring of bot-tier through via stack composed of normal via and MIV. Multiple via stacks are placed in a form of array for low-resistance connection. Figure 11 shows our power delivery network of TR-L M3D.

#### C. Routing Congestion Issue

The major drawback of TR-L M3D is routing congestion, which is caused by reduced pin access to the input/output metal ports of each cell. A typical 14nm Finfet based 2D cell has at least 6 pin access points on each input/output port, while our 3D cell has only 3-4 pin access points on each input/output port exposed on the top-tier for cell-to-cell routing. This is caused by the reduced footprint of the 3D cells. Figure 12 shows the accessible pins of the 3D and 2D INV layouts. Yet the number of pins in a 3D cell remains the same as in 2D while the footprint is reduced by about 50%, which means 50% increased pin access demand. Therefore, the 3D cell sometimes does not have enough access space for system-level routing, which results in routing congestion.

Figure 12: A) Tracks and pin accesses of 2D INV B) Tracks and pin accesses in PDN (top-tier) of TR-L M3D INV

Routing congestion is unique to this M3D technology because of the heavily reduced footprint. Currently, the only solution we have is to increase the 3D cell height to create more pin access points. This increases cell-to-cell routing length and decreases system-level design benefits. Therefore, a comprehensive and realistic evaluation of power, performance and area (PPA) is necessary to assess these trade-offs (see section IV.D).

Table IV: Evaluation Results of System-level Benchmarks

| Benchmark Name | Design Type | Total Wirelength | Leakage | Wire Power | Cell Pin Power | Cell Internal Power | Total Power | Congestion Rate | Footprint | PPA  |
|----------------|-------------|------------------|---------|------------|----------------|---------------------|-------------|-----------------|-----------|------|
| DES            | 2D          | 1.00             | 1.00    | 1.00       | 1.00           | 1.00                | 1.00        | 0%              | 1.00      | 1.00 |
|                | 3D-7fp      | 0.77(-22%)       | 0.98    | 0.81(-19%) | 0.85(-15%)     | 0.83(-17%)          | 0.83(-17%)  | 6%              | 0.45      | 2.60 |
|                | 3D-8fp      | 0.84(-16%)       | 0.98    | 0.84(-16%) | 0.84(-16%)     | 0.81(-19%)          | 0.82(-18%)  | 0%              | 0.55      | 2.20 |
| JPEG           | 2D          | 1.00             | 1.00    | 1.00       | 1.00           | 1.00                | 1.00        | 0%              | 1.00      | 1.00 |
|                | 3D-7fp      | 0.75(-25%)       | 0.99    | 0.78(-22%) | 0.84(-16%)     | 0.83(-17%)          | 0.83(-17%)  | 12%             | 0.49      | 2.50 |
|                | 3D-8fp      | 0.79(-21%)       | 0.99    | 0.83(-17%) | 0.83(-17%)     | 0.81(-19%)          | 0.82(-18%)  | 0%              | 0.54      | 2.20 |

#### D. Results and Analysis

The DES and JPEG circuits are used for system-level evaluation. We create two versions of the 3D standard cells: one with a cell height of 7 fin pitches (3D-7fp) and the other with 8 fin pitches (3D-8fp). The metric performance/(power\*footprint), called PPA, is used for a comprehensive evaluation of each technology. Figure 13 shows the DES benchmark layouts of the 2D, 3D-7fp and 3D-8fp versions routed by Encounter. The comparison among 2D, 3D-7fp and 3D-8fp is shown in Table IV with normalized results. 3D-7fp has 55% footprint reduction in each benchmark. 3D-7fp has up to 25% reduced wire length, which contributes to 22% wire power reduction. 3D-7fp keeps similar leakage to 2D but achieves around 17% reduced total power. However, the cell congestion rate is much higher (6% for DES and 12% for JPEG) compared to 0% in 2D, due to reduced pin access.

Figure 13: DES Layouts of 2D, 7fp-3D and 8fp-3D versions

For 3D-8fp, the total wire length reduction is smaller than for 3D-7fp because of the increased footprint. Therefore, the wire power saving of 3D-8fp is lower than that of 3D-7fp, and the footprint reduction goes down to 45%. However, the congestion rate in 3D-8fp goes down to 0%. A lower congestion rate means greater routing flexibility, which effectively eliminates system-level routing violations. Additionally, the enhanced routing flexibility helps Encounter optimize the routing of the benchmark. For the 3D-7fp version, routing congestion increases routing complexity, reducing wire savings and even generating a considerable number of routing violations. In contrast, 3D-8fp based benchmarks have 0% congestion rate and 0 routing violations, which means Encounter successfully routes the design with sufficient routing space. Therefore, 3D-8fp even has wire length savings comparable to 3D-7fp. Compared to 3D-7fp, 3D-8fp has 25% reduced PPA due to increased footprint, but it retains a significant PPA advantage over 2D.
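The PPA metric defined above can be evaluated from the normalized total power and footprint columns of Table IV. Table IV does not list a separate performance column, so the sketch assumes a normalized performance of 1.0 (equal to 2D) for every design; that value is an assumption made here for illustration.

```python
def ppa(performance, power, footprint):
    """PPA metric from this section: performance / (power * footprint)."""
    return performance / (power * footprint)


# (total power, footprint) pairs copied from Table IV, normalized to 2D.
TABLE_IV = {
    ("DES", "3D-7fp"): (0.83, 0.45),
    ("DES", "3D-8fp"): (0.82, 0.55),
    ("JPEG", "3D-7fp"): (0.83, 0.49),
    ("JPEG", "3D-8fp"): (0.82, 0.54),
}

# Assumption: performance equals the 2D baseline (not stated in Table IV).
ASSUMED_PERFORMANCE = 1.0

for (bench, version), (power, footprint) in TABLE_IV.items():
    value = ppa(ASSUMED_PERFORMANCE, power, footprint)
    print(f"{bench} {version}: PPA ~ {value:.2f}")
```

Under this assumption the computed values land close to, but not exactly on, the PPA column of Table IV, so the reported figures may reflect a performance term or rounding that this sketch does not reproduce.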

Since the MIV-to-PUN/PDN distance becomes larger in 3D-8fp cells, the MIV coupling capacitance is reduced. However, the input/output metal length increases in the PDN and PUN because the cell height goes up, which results in increased intrinsic capacitance in the PDN/PUN. Therefore, for the JPEG and DES benchmarks, 3D-7fp and 3D-8fp have similar cell pin power and internal power. The major difference is the wire power reduction. Since both DES and JPEG are cell-dominated benchmarks, the wire power does not contribute much to the total power, so 3D-7fp and 3D-8fp have comparable total power. Further research that considers cell heights of 9 or 10 fin pitches may be needed to determine the optimal design. However, the increased cell height may cause additional area overhead and reduce wire length savings compared to the 3D-7fp and 3D-8fp versions.

### V. CONCLUSION

We present a 3D IC design approach and a quantified analysis of its design benefits. An accurate RC extraction method is proposed to evaluate the internal RC of 3D standard cells. In 14nm Finfet based transistor-level monolithic 3D ICs, the splitting of PUN and PDN leads to a reduction of cell internal capacitance as well as cell footprint. These benefits result in better power efficiency of system-level designs. We performed system-level evaluations and showed up to 18% power benefits compared to 2D counterparts in 14nm Finfet. Additionally, a solution to the routing congestion issue in TR-L M3D design is proposed.

### REFERENCES

- [1] Y. Lee et al., "Power Benefits Study for Ultra-High Density Transistor-Level Monolithic 3D ICs", DAC, 2013, pp. 1-10.
- [2] P. Batude et al., "Low temperature FDSOI devices, a key enabling technology for 3D sequential integration", VLSI-TSA, 2013, pp. 1-4.
- [3] P. Batude et al., "3DVLSI with CoolCube process: An alternative Path to Scaling", VLSI-T, 2015, pp. T48-T49.
- [4] S. Panth et al., "Power-Performance Study of Block-Level Monolithic 3D-ICs Considering Inter-Tier Performance Variations", DAC, 2014, pp. 1-6.
- [5] P. Batude et al., "GeOI and SOI 3D monolithic cell integrations for high density applications", VLSI-T, 2009, pp. 166-167.