Reliability and Performance-aware 3D SRAM Design

Mohit Pathak and Sung Kyu Lim
School of Electrical and Computer Engineering, Georgia Institute of Technology
Abstract-In 3D integrated circuits (3D ICs), through-silicon vias (TSVs) connect the dies stacked on top of each other. TSVs occupy significant silicon area and are many times larger than gates. Depending on the fabrication technique, TSVs differ in area and fabrication cost. TSVs can also cause reliability challenges for 3D ICs by reducing the yield of the chip. In this paper, we discuss how to perform physical design of bank-level 3D SRAM. We show that a tradeoff exists between reliability and performance for 3D SRAMs. We also show the impact of via-first versus via-last TSVs on the layout quality of 3D SRAM designs. All our results are based on GDSII layouts. Based on our results, different SRAM organizations may be chosen according to the reliability versus performance tradeoff.

This material is based upon work supported by the National Science Foundation under CAREER Grant No. CCF-0546382, the SRC Interconnect Focus Center (IFC), and Intel Corporation.

I. INTRODUCTION

3D ICs are a promising technology that can help reduce power, improve performance, and reduce signal integrity challenges. However, 3D ICs may cause increased thermal challenges, and the use of TSVs may decrease the yield of 3D chips. Various methods exist for the fabrication of TSVs for stacked 3D ICs [1], [2]. The two most popular methods are (i) via first [3], and (ii) via last [1]. Depending on the fabrication method, TSVs differ in area, parasitics, routing resource usage, and fabrication cost [2].

The authors in [4] and [5] explore different ways of stacking SRAM in 3D. They propose that the stacking can be done by stacking banks, by column-on-column stacking, or by row-on-row stacking. The authors in [6] propose a method for design exploration and timing optimization of 3D SRAM. However, none of the above papers generates an actual layout of 3D SRAM or considers the impact of TSV area. The authors in [7] provide the first detailed placement and routing results for 3D ICs based on GDSII layout. They show that through-silicon vias can have a significant impact on the area and wire length of a circuit.

In this paper we explore various methods of TSV management for via-first and via-last TSVs in bank-level 3D SRAM ICs. The contributions of this paper are:
1) We explore various design choices in 3D SRAM ICs that can impact the number of TSVs, area, and performance.
2) We explore the impact of TSV fabrication technology (via first versus via last) on 3D SRAM ICs.

II. OVERALL DESIGN FLOW

In this section we discuss our overall flow for 3D SRAM ICs. To generate the layout we first determine the location of the SRAM banks. Since SRAM banks are much larger than standard cells, we determine their location before we place standard cells or TSVs. Once the SRAM banks have been fixed, the locations of the standard cells and TSVs are determined in the remaining whitespace.

To determine the location of TSVs and standard cells we use the approach of [8], which can be briefly summarized as follows. We perform (1) global placement, (2) global routing, (3) detailed placement, and (4) detailed routing, in this sequence. Global placement in step (1) provides the rough location of the standard cells. Based on this rough location we perform global routing in step (2). The location and number of the TSVs are determined during routing topology generation for 3D nets. Detailed placement in step (3), followed by detailed routing in step (4), is then performed based on the global placement and routing results obtained above.
A. SRAM bank location

The SRAM banks are placed in each die of the 3D design in a regular three-dimensional grid G(C x R x Z), where C is the number of columns of SRAM banks, R is the number of rows of SRAM banks, and Z is the number of dies. For example, to place 16 SRAM banks in four dies we place them in a regular array G(2 x 2 x 4).

The SRAM banks in the design are connected to each other through the decoder and the multiplexer logic. The structure of both the decoder logic and the multiplexer logic is typically hierarchical. For example, a 4-to-16 decoder can be built using two 3-to-8 decoder logic blocks and some extra logic. The 3-to-8 decoder logic block can in turn be made from two 2-to-4 decoder logic blocks, and so on. Multiplexer logic can be built in an analogous way. Thus the SRAM banks are connected to each other in the form of a hierarchical structure [6]. For a given SRAM design we call its corresponding hierarchical tree S_tree.

In a 2D IC design two cut orientations exist: a V cut, which divides the tree along the vertical axis, and an H cut, which divides the tree along the horizontal axis. For a 3D IC design an additional cut orientation, the Z cut, divides the tree into different dies. We obtain the bank placement of a given SRAM design by performing cuts in three stages. In stage one, V1 x H1 represents the number of vertical and horizontal cuts performed before the banks are partitioned into different dies. In stage two, the banks are partitioned into different dies using Z cuts (Z2). Finally, stage three consists of additional vertical and horizontal cuts V3 x H3 that give the final bank placement. In addition, V1 * V3 = C, which ensures that the total number of columns is as required, and H1 * H3 = R, which ensures that the total number of rows is as required.

To better understand the bank placement method, consider the example in Figure 1. In this example 16 SRAM banks (B0 ... B15) are placed in a three-dimensional grid G(2 x 2 x 4). In Figure 1(a), V1 x H1 is 1 x 1 and V3 x H3 is 2 x 2. Since V1 and H1 are equal to 1, no horizontal or vertical cuts are performed initially. This is followed by Z cuts.
The Z cuts ensure that blocks (B0, B1, B2, B3) lie in die0, blocks (B4, B5, B6, B7) lie in die1, blocks (B8, B9, B10, B11) lie in die2, and blocks (B12, B13, B14, B15) lie in die3. Finally, the last set of cuts V3 x H3 divides the blocks in each die accordingly. Similarly, for Figure 1(b), V1 x H1 is 2 x 2 and V3 x H3 is 1 x 1.

In a 2D IC design, the order of V and H cuts has little impact on the various metrics of the layout. In a 3D IC design, however, the additional Z cut can have a significant impact, as it affects both the number of TSVs required and the total wire length. This happens because the interface logic blocks of a bank-level SRAM design are connected to each other in the same hierarchical tree as the banks. For example, in Figure 1(a) banks B0, B1, B2, B3 lie in the same die, so their corresponding interface logic can also lie in the same die. In Figure 1(b), banks B0, B1, B2, B3 lie in different dies, so their corresponding interface logic tends to be split across dies, resulting in 3D nets. As more interface logic occurs in the lower levels of the hierarchical tree, dividing lower-level banks among different dies leads to a greater number of 3D nets and hence more TSVs. To determine the location and the number of TSVs needed in the design, we use the approach of [8].

Fig. 1. Cut sequence and final bank placement in different dies for a 16-bank SRAM design placed in a 2 x 2 x 4 3D grid. (a) Cut sequence where V1 x H1 is 1 x 1 and V3 x H3 is 2 x 2. (b) Cut sequence where V1 x H1 is 2 x 2 and V3 x H3 is 1 x 1.

978-1-61284-857-0/11/$26.00 ©2011 IEEE
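The three-stage cut sequence of this section can be illustrated with a short Python sketch. It enumerates every (V1, H1, V3, H3) combination that satisfies V1 * V3 = C and H1 * H3 = R, and then reports which die each bank lands in. The function names are hypothetical, and the bank-to-die mapping is an assumption: it supposes that banks are numbered as leaves of the hierarchical tree, so that consecutive indices share the lowest tree levels. This numbering is chosen only because it reproduces the two cases of Figure 1; it is not taken from the flow of [8].

```python
from itertools import product


def divisors(n):
    """Return all positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def cut_sequences(C, R):
    """Enumerate (V1, H1, V3, H3) with V1 * V3 == C and H1 * H3 == R."""
    return [(v1, h1, C // v1, R // h1)
            for v1, h1 in product(divisors(C), divisors(R))]


def die_of_bank(b, v3, h3, Z):
    """Die index of bank b.

    Assumption (not from the paper): banks are numbered as leaves of the
    hierarchical tree, so the V3 * H3 banks produced by the stage-three
    cuts have consecutive indices and share one die.
    """
    return (b // (v3 * h3)) % Z


if __name__ == "__main__":
    C, R, Z = 2, 2, 4  # the G(2 x 2 x 4) example with 16 banks
    for v1, h1, v3, h3 in cut_sequences(C, R):
        dies = [die_of_bank(b, v3, h3, Z) for b in range(C * R * Z)]
        print(f"{v1} x {h1} - {v3} x {h3}: dies of B0..B3 = {dies[:4]}")
```

Under this numbering, the sequence with V3 x H3 = 2 x 2 keeps B0 to B3 in one die, while the sequence with V3 x H3 = 1 x 1 spreads them over four dies, which is the situation that splits the interface logic into 3D nets.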
B. TSV and Gate Planning

In this section we discuss how we reserve whitespace for standard cells and TSVs in 3D IC design. While TSVs can help reduce the length of long global wires, they can cause reliability problems due to thermo-mechanical stress [9]. Thus, to achieve increased reliability, a safe distance is needed to keep the TSVs separate from the SRAM banks and the interface logic. We implement several floorplanning options, each with its own pros and cons. These options are discussed in detail below.

1) TSV periphery: The floorplan for this option is shown in Figure 2(a). In this option we allow the TSVs to be present only at the boundary of the 3D design. The standard cells of the interface logic are placed in the channels between the SRAM banks. In this option TSVs have minimal interaction with the logic gates on the die. Such a configuration also allows easy stacking of already manufactured or verified designs. However, as shown later in our experiments, such a configuration results in the worst performance.

2) TSV island: The floorplan for this option is shown in Figure 2(b). In this option we allow TSVs and interface logic gates to be placed in the channel space between SRAM banks. However, TSVs and the standard cells of the interface logic are placed in distinct islands reserved for each type. In such a configuration both the SRAM banks and the interface logic are under a greater influence of TSV stress than in the TSV periphery option. However, as shown later in our experiments, it has better performance.

3) TSV spread: The floorplan for this option is shown in Figure 2(c). This option is similar to the TSV island option, in that TSVs and logic gates are placed in the channel space between the SRAM banks. In this option, however, we do not create distinct regions for TSVs and standard cells. TSVs and standard cells can be placed together as long as they pass the design rule check (DRC).
In such a configuration the logic gates are under the maximum stress impact of TSVs. In terms of performance, as shown later in our results, such a configuration gives the best results.

Fig. 2. Organization of TSVs and interface logic in 3D SRAM IC design. Only a single die of the 3D stack is shown. (a) TSVs are allowed along the periphery only, and interface logic is placed in the channel space between SRAM banks. (b) TSVs and interface logic are both present in the channel space between SRAM banks, each in its own distinct region. (c) TSVs and interface logic are both present in the channel space between SRAM banks and are allowed to spread unevenly across the available space.

III. VIA FIRST OR VIA LAST

As discussed in Section I, there exist two major methods for fabricating TSVs. The authors of [2] show that each fabrication method can result in TSVs of different diameter and cost. Via-first fabrication results in TSVs with smaller diameter, thus allowing greater TSV density, whereas via-last fabrication results in TSVs with larger diameter, thus allowing lower TSV density. In terms of cost, however, via-last fabrication is cheaper than via first. In addition, via-last TSVs occupy routing resources in all the metal layers, as discussed in [1]. Thus choosing the right via type can be important. We later show in our results the area, wire-length, and performance impact of using via first or via
last TSV.

IV. EXPERIMENTAL RESULTS

In this section we discuss our experimental results in detail. All our experiments are performed using a 130nm library provided by Global Foundries. We use the Artisan ARM memory compiler to generate the SRAM banks. All our experiments are run on a 2GHz Linux machine, and our 3D SRAM ICs are built on a 4-die stack. We test three different SRAM sizes (1MB, 4MB, and 16MB). The SRAM banks used in the designs have a capacity of 64KB. The different TSV types we use in our experiments are shown in Table II. We use two different TSV sizes (7.39 µm x 7.39 µm and 40.59 µm x 40.59 µm). Via-last and via-first TSVs differ in the amount of routing resources they use. While a via-first TSV uses just the metal 1 (M1) layer for its landing pad, a via-last TSV uses all the metal layers, as the TSV is drilled from the front side of the die [1].

TABLE I
IMPACT OF CUT SEQUENCE ON AREA (A (µm²)), WIRE-LENGTH (WL (µm)), NUMBER OF TSVS (TSV), AND PERFORMANCE (LPD (ns)) FOR DIFFERENT SRAM DESIGNS

Circuit   Cut sequence (V1 x H1 - V2 x H2)   A             WL            LPD     TSV
1MB       1 x 1 - 2 x 2                      6.72 x 10^6   6.8 x 10^5    4.51    475
1MB       1 x 2 - 2 x 1                      6.84 x 10^6   5.83 x 10^5   4.36    612
1MB       2 x 2 - 1 x 1                      6.96 x 10^6   5.7 x 10^5    4.4     728
4MB       1 x 1 - 4 x 4                      2.68 x 10^7   3.61 x 10^6   7.19    601
4MB       1 x 2 - 4 x 2                      2.73 x 10^7   3.70 x 10^6   7.18    812
4MB       2 x 2 - 2 x 2                      2.73 x 10^7   3.40 x 10^6   7.01    1172
4MB       2 x 4 - 2 x 1                      2.78 x 10^7   2.77 x 10^6   5.77    1834
4MB       4 x 4 - 1 x 1                      2.8 x 10^7    2.72 x 10^6   5.81    2203
16MB      1 x 1 - 8 x 8                      1.07 x 10^8   1.70 x 10^7   13.82   912
16MB      1 x 2 - 8 x 4                      1.09 x 10^8   1.78 x 10^7   16.2    1134
16MB      2 x 2 - 4 x 4                      1.09 x 10^8   1.64 x 10^7   12.8    1987
16MB      2 x 4 - 4 x 2                      1.11 x 10^8   1.69 x 10^7   11.7    2134
16MB      4 x 4 - 2 x 2                      1.12 x 10^8   1.48 x 10^7   12.1    2987
16MB      4 x 8 - 2 x 1                      1.12 x 10^8   1.28 x 10^7   12.3    4324
16MB      8 x 8 - 1 x 1                      1.15 x 10^8   1.14 x 10^7   11.2    8976

TABLE II
DIFFERENT TSV TYPES

Type   Dimensions (µm x µm)   Resistance (Ω)   Capacitance (fF)   Fabrication type
TSVa   7.39 x 7.39            20               25                 via first
TSVb   7.39 x 7.39            20               25                 via last
TSVc   40.59 x 40.59          5                75                 via first
TSVd   40.59 x 40.59          5                75                 via last

TABLE III
IMPACT ON WIRE-LENGTH WL, AREA A, AND DELAY LPD FOR DIFFERENT TSV TYPES

Circuit   TSV type   A      WL     LPD
1MB       TSVa       0.88   0.58   0.93
1MB       TSVb       0.88   0.59   0.92
1MB       TSVc       1.00   1.00   1.00
1MB       TSVd       1.00   1.01   1.01
4MB       TSVa       0.87   0.65   0.98
4MB       TSVb       0.87   0.66   0.99
4MB       TSVc       1.00   1.00   1.00
4MB       TSVd       1.00   1.02   0.99
16MB      TSVa       0.89   0.69   0.99
16MB      TSVb       0.89   0.7    1.00
16MB      TSVc       1.00   1.00   1.00
16MB      TSVd       1.00   1.02   1.01

A. SRAM Bank Placement Impact

In this section we discuss the impact of bank placement on metrics such as number of TSVs, area, wire-length, and performance. The results are shown in Table I. We observe that as more V and H cuts are performed before the Z cut, more TSVs are required in the design. Performing the Z cut late in the cut sequence partitions more gates in the lower levels of logic, as described in Section II-A. This results in a greater number of 3D nets, which require more TSVs. A larger number of TSVs also results in greater area. Performing the Z cut late helps reduce wire-length, because the lower-level banks and their corresponding logic are placed on top of each other. The impact of bank placement on performance (longest path delay) is also shown in Table I. We again observe that performing the Z cut late in the sequence helps improve performance.

1) Discussion: The results reveal interesting trade-offs for SRAM design. On one hand, performing the Z cut late during bank placement helps reduce wire-length and, more importantly, helps improve performance. On the other hand, it requires more TSVs, leading to greater area usage. In addition, a greater number of TSVs may impact the yield and reliability of the 3D SRAM.
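The trade-off between performing the Z cut early and late can be made concrete with a few lines of Python. The sketch below is only an illustration: the numbers are transcribed from the first row (Z cut performed first) and the last row (Z cut performed last) of each design in Table I, and the dictionary layout and function names are hypothetical.

```python
# Values transcribed from Table I as (A in um^2, WL in um, LPD in ns, TSV count).
# "z_early" is the row with V1 x H1 = 1 x 1; "z_late" is the row where all
# V and H cuts are performed before the Z cut.
TABLE_I = {
    "1MB": {"z_early": (6.72e6, 6.8e5, 4.51, 475),
            "z_late": (6.96e6, 5.7e5, 4.4, 728)},
    "4MB": {"z_early": (2.68e7, 3.61e6, 7.19, 601),
            "z_late": (2.8e7, 2.72e6, 5.81, 2203)},
    "16MB": {"z_early": (1.07e8, 1.70e7, 13.82, 912),
             "z_late": (1.15e8, 1.14e7, 11.2, 8976)},
}
METRICS = ("A", "WL", "LPD", "TSV")


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


def tradeoff(table):
    """Per-design percent change of each metric when the Z cut moves late."""
    return {
        design: {
            metric: percent_change(early, late)
            for metric, early, late in zip(METRICS, rows["z_early"], rows["z_late"])
        }
        for design, rows in table.items()
    }


if __name__ == "__main__":
    for design, changes in tradeoff(TABLE_I).items():
        summary = ", ".join(f"{m}: {c:+.1f}%" for m, c in changes.items())
        print(f"{design}: {summary}")
```

For every design the script reports a negative change in WL and LPD and a positive change in A and TSV, which matches the direction of the trade-off described above; intermediate cut sequences are not covered by this comparison.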
B. TSV Placement Style

In this section we discuss the impact of different TSV placement styles on the wire-length, area, and performance (longest path delay) of 3D SRAM ICs. For each SRAM we fix the bank placement style that gives the best performance, as discussed in Section IV-A. We assume the TSV type to be TSVc, as shown in Table II. The impact on wire-length is shown in Figure 3(a). We observe that the TSV periphery placement style gives the worst wire-length, whereas the TSV spread placement style gives the best. In the TSV periphery placement style the TSVs are restricted to the outside periphery of the die. This results in longer detours for 3D nets, leading to greater wire-length. The impact on area is shown in Figure 3(b), and the impact on performance is shown in Figure 3(c). The best performance is obtained with the TSV spread placement style, whereas the worst is obtained with the TSV periphery placement style.

C. Via First or Via Last
The impact of TSV type on (1) area, (2) wire-length, and (3) performance is reported in Table III. All results are normalized with respect to TSV type TSVc. A significant difference in wire-length (up to about 40%) is observed when we compare the large TSV size to the smaller TSV size. We observe little difference in the total wire-length (around 1%) when we compare via-first with via-last TSVs. Even though via-last TSVs block the routing resources above them, they do not cause much routing congestion. This happens because the regions where the TSVs are placed contain no standard cells, resulting in little routing congestion. In addition, there is significant spacing between the TSVs to meet the routing demands. The results show that smaller TSVs help reduce area for all benchmarks (around 12%), while the choice between via-first and via-last fabrication has no impact on the area needed. Finally, we again observe that the choice between via-first and via-last fabrication has little impact on performance. However, for the smaller SRAM design (1MB) a smaller TSV can help improve performance by as much as 8%. For larger designs the size of the TSV has little impact. We believe this happens because for a smaller design TSV capacitance can play a significant role, but for a larger design the impact of TSV capacitance decreases due to the larger die dimensions. Figure 4 shows some of the layout images we obtained.

Fig. 3. Comparison between different placement styles (TSV periphery, TSV island, TSV spread) for the 1MB, 4MB, and 16MB designs. The results are plotted as a ratio relative to the base case of the TSV spread placement style. (a) Impact on wire-length. (b) Impact on area. (c) Impact on longest path delay.

Fig. 4. GDSII shots of via-first vs via-last TSVs in SRAM design: top die of a 4-die 3D SRAM using via-first TSVs, and top die of a 4-die 3D SRAM using via-last TSVs. Routing is permitted over the via-first TSVs. Via-last TSVs are larger.

V. CONCLUSION

In this paper we show that bank placement plays a significant role in the design of 3D SRAM ICs. In addition, we observe the impact of TSV size and of the TSV fabrication process (via-first or via-last) on metrics such as wire-length, area, and performance. Based on our observations, the designer may have to make trade-offs according to the design specifications.

REFERENCES

[1] C. Huyghebaert, J. Olmen, Y. Civale, A. Phommhaxay, A. Jourdian, S. Sood, S. Farrens, and P. Soussan, "Cu to Cu interconnect using 3D-TSV and Wafer to Wafer thermo-compression bonding," in Proc. International Interconnect Technology Conference, 2010.
[2] D. Velenis, M. Stucchi, E. Marinissen, B. Swinnen, and E. Beyne, "Impact of 3D Design choices on manufacturing cost," in Proc. 3D System Integration 3DIC, 2009.
[3] M. Puech, J. M. Thevenoud, J. M. Gruffat, N. Launay, N. Arnal, and P. Godinat, "Fabrication of 3D packaging TSV using DRIE," in Design, Test, Integration and Packaging of MEMS/MOEMS, 2008.
[4] K. Puttaswamy and G. H. Loh, "Implementing caches in 3D Technology for High Performance MicroProcessors," in Proc. IEEE Int. Conf. on Computer Design, 2005.
[5] --, "3D-Integrated SRAM Components for High Performance Microprocessors," IEEE Trans. on Computers, pp. 1369-1381, 2009.
[6] X. Chen and W. R. Davis, "Delay analysis and design exploration for 3D SRAM," in Proc. 3D System Integration 3DIC, 2009.
[7] D. Kim, K. Athikulwongse, and S. Lim, "A study of Through-Silicon-Via Impact on 3D Stacked ICs," in Proc. IEEE Int. Conf. on Computer-Aided Design, 2009.
[8] M. Pathak and S. Lim, "Through Silicon Via Management during 3D Physical Design: When to Add and How Many?" in Proc. IEEE Int. Conf. on Computer-Aided Design, 2010.
[9] A. P. Karmakar, X. Xiaopeng, and V. Moroz, "Performance and reliability analysis of 3D-integration structures employing Through Silicon Via (TSV)," in Proc. IEEE Reliability Physics Symposium, 2009.