IEEE 2007 Custom Integrated Circuits Conference (CICC)
Standard Cell and Custom Circuit Optimization using Dummy Diffusions through STI Width Stress Effect Utilization
Rasit Onur Topaloglu
University of California at San Diego, Advanced Micro Devices
Abstract: Starting at the 65nm node, stress engineering to improve the performance of transistors has been a major industry focus. An intrinsic stress source, shallow trench isolation (STI), has not been fully utilized up to now for circuit performance improvement. In this paper, we present a new methodology that enables the exploitation of STI stress for performance improvement of standard cells and custom integrated circuits. We start with process simulation of a 65nm STI technology, and generate mobility models for STI stress based on these simulations. Based on these models, we are able to perform STI stress-aware modeling and simulation using SPICE. We then present our optimization of STI stress in standard-cell and custom designs using active-layer (dummy) fill insertion to alter the STI widths. Circuit level experimental results are based on a miscellaneous ring oscillator, which is known to correlate well to silicon. Using a generic 65nm cell library, we show that the STI-optimized designs provide up to 8% improvement in clock frequency. The frequency improvement through exploitation of STI stress comes at practically zero cost with respect to area and wire length.
I. INTRODUCTION
In 65nm technologies, a number of stress optimization methods have been introduced. These methods are summarized below.
SiGe Stress From Underneath the Channel
Embedded SiGe from the Source and Drain
Stress Liner
Stress Memorization Technique
Hybrid Orientation Technique
The initial stress methods employed a silicon germanium layer underneath the channel, which improves the channel mobility. More recently, embedded SiGe (e-SiGe) [2] [3] [5] has been used in the source and drain regions, exerting a stress along the channel. e-SiGe can improve PMOS speed only. Another stress engineering option is the deposition of stressed liners over the transistors, on top of the polysilicon. Single stress liners, where the entire wafer is covered by either a compressive or a tensile stressed liner, are available. A modification of this scheme is the dual stress liner [7] [8], where NMOS transistors are covered by a tensile liner, whereas PMOS transistors are covered by a compressive liner. The stress memorization technique [9] [10] relies on plastic deformation of certain materials during a process step and the consequent memorization of the applied stress in the channel. The stress memorization technique improves NMOS speed only. In the hybrid orientation technique [4] [6], different crystal orientations are used to enhance NMOS and PMOS speeds separately.
1-4244-1623-X/07/$25.00 ©2007 IEEE
In the next sections, we provide the motivation and review the previous work. Section IV is devoted to a brief introduction to stress, followed by the process steps we have simulated using TCAD and the proposed stress models. In Section V, we introduce the STI optimization for standard cells and custom circuits. In Section VI, we provide the circuit level experimental results. We then conclude the paper.
II. MOTIVATION
It is evident that stress techniques and mobility improvement will dominate over traditional scaling starting with 65nm technologies. There has long been a stress source which has not been fully utilized until now: the stress from the shallow trench isolation (STI). STI usually exerts a compressive stress. It is well known by now that PMOS device mobility improves through compressive stress applied along the channel, i.e. in the current flow direction. The opposite type of stress, i.e. tensile stress, degrades PMOS performance in this direction. Generally speaking, NMOS is complementary to PMOS in terms of how it is affected by stress: STI stress degrades NMOS mobility if applied along the channel. A mobility increase corresponds to a speed increase. Hence, it is possible to utilize STI, which is used to separate device regions, for improving performance.
III. PREVIOUS WORK
In the area of stress modeling and characterization, Rueda [1] has provided general models for stress. Gallon et al. [15] have analyzed the stress induced by STI. Bradley et al. [16] have characterized the piezoresistance of CMOS transistors. Sheu et al. [21] have modeled the well edge proximity effect on MOSFETs. Sheu et al. [14] have modeled the effect of mechanical stress on dopant diffusion. Su et al. [17] have proposed a scalable model for the layout dependence of stress.
In the area of STI process, Elbel et al. [12] have proposed an STI process flow based on selective oxide deposition. Lee et al. [13] have proposed an optimization for densification of the STI fill oxide to reduce the stress.
Stress TCAD simulations have been conducted by Moroz et al. in [18] and [20] and by Smith in [19]. Possible ways to enhance performance using STI stress have been indicated, yet no circuit level optimization has been explained.
Although optimizations exist to reduce STI stress, these optimizations usually fail to totally eliminate the stress. Introduction of e-SiGe in the source and drain may reduce the
Authorized licensed use limited to: ADVANCED MICRO DEVICES. Downloaded on September 11, 2009 at 19:35 from IEEE Xplore. Restrictions apply.
STI width effect for PMOS, but NMOS performance can still be improved by utilizing the STI width effect. Models are needed for relating the stress due to the STI width effect to transistor mobilities. There also seems to be a lack of stress optimization methods in the literature. Such methods, together with efficient simulation, modeling and analysis methods, are required to aid the technology development for strained transistors. In this paper, we strive to enable this.
TABLE I
STI PROCESS STEPS
1. Deposit pad oxide
2. Deposit pad nitride
3. Deposit photoresist for STI lithography
4. Anisotropically etch nitride and oxide
5. Strip photoresist
6. Directional etch at a rate of 0.01 for 40 seconds at an 86° angle
7. Deposit TEOS oxide at a rate of 1.0 for 0.2 seconds
8. Deposit trench fill oxide
9. Temperature ramp up from 600 °C to 1000 °C at a rate of 50 °C/min
10. Hold temperature for 1 minute at 1000 °C
11. Temperature ramp down from 1000 °C to 600 °C at a rate of 50 °C/min
12. Diffuse oxide
13. STI CMP
14. Etch nitride isotropically at a rate of 0.15 for 1.5 minutes
15. Etch oxide isotropically for 1 minute at a rate of 0.02
16. Temperature ramp up from 600 °C to 800 °C at a rate of 40 °C/min
17. Diffuse for 5 minutes using dry O2
18. Temperature ramp down from 800 °C to 600 °C at a rate of 40 °C/min
19. Deposit polysilicon gate
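As a quick consistency check on the thermal steps of Table I, the duration of each ramp follows from the temperature difference divided by the ramp rate. The sketch below is only illustrative: it assumes the ramp rates are given in °C per minute (the unit is an assumption), it leaves out step 12 because no duration is given for it, and all names are hypothetical.

```python
# Thermal steps of Table I. Ramps are (start_C, end_C, rate), where the rate
# is ASSUMED to be in degrees C per minute; holds are given in minutes.
RAMPS = {
    "step 9 (ramp up)": (600.0, 1000.0, 50.0),
    "step 11 (ramp down)": (1000.0, 600.0, 50.0),
    "step 16 (ramp up)": (600.0, 800.0, 40.0),
    "step 18 (ramp down)": (800.0, 600.0, 40.0),
}
HOLDS_MIN = {
    "step 10 (hold)": 1.0,
    "step 17 (5 minute diffusion)": 5.0,
}


def ramp_minutes(start_c, end_c, rate_c_per_min):
    """Duration of a linear temperature ramp, in minutes."""
    return abs(end_c - start_c) / rate_c_per_min


def total_thermal_minutes():
    """Total time of the listed ramps and holds, in minutes."""
    ramp_time = sum(ramp_minutes(*ramp) for ramp in RAMPS.values())
    return ramp_time + sum(HOLDS_MIN.values())


if __name__ == "__main__":
    for name, ramp in RAMPS.items():
        print(name, ramp_minutes(*ramp), "min")
    print("total", total_thermal_minutes(), "min")
```

Under the per-minute assumption, the two high-temperature ramps dominate the listed thermal time, which is in line with the later remark that the temperature cycles are responsible for thermal mismatch.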
IV. STI STRESS MODELING
A. Brief Introduction to Stress
Based on the understanding in [1], the stress components on a unit cell are shown in Figure 1.
Fig. 1. Stress components on a unit volume.
The stress vector acting on the face normal to axis $i$ consists of one normal component $\sigma_{ii}$ and two shear components $\tau_{ij}$, $j \neq i$. The stress tensor is defined by the three stress vectors:
$$
\boldsymbol{\sigma} =
\begin{bmatrix}
\sigma_{xx} & \tau_{xy} & \tau_{xz} \\
\tau_{yx} & \sigma_{yy} & \tau_{yz} \\
\tau_{zx} & \tau_{zy} & \sigma_{zz}
\end{bmatrix}
$$
In this equation, the $\sigma_{ii}$'s are stress components normal to the unit cube faces, whereas the $\tau_{ij}$'s are shear components directed towards $j$ on the face orthogonal to $i$. The $\sigma_{ii}$'s are used for analyzing the impact of stress. Using the individual stress components, we have converted the stress values to mobility [20].
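The conversion from normal stress components to a mobility change can be illustrated with a first-order piezoresistance model. The sketch below is not the conversion used in [20]. It assumes the linear relation Δμ/μ ≈ -(π_L σ_L + π_T σ_T) and commonly quoted bulk-silicon coefficients for a <110> channel on a (100) wafer; the coefficient values, function name and sign convention (tensile positive) are assumptions introduced here for illustration.

```python
# First-order piezoresistance estimate of the mobility change caused by
# in-plane normal stress. Illustrative only: the coefficients are commonly
# quoted bulk-silicon values for a <110> channel on a (100) wafer (an
# assumption, not taken from this paper), in units of 1/Pa.
PIEZO_COEFF = {
    "nmos": {"longitudinal": -31.6e-11, "transverse": -17.6e-11},
    "pmos": {"longitudinal": 71.8e-11, "transverse": -66.3e-11},
}


def mobility_multiplier(device, sigma_long_pa, sigma_trans_pa):
    """Return mu/mu0 for stresses in Pa (tensile > 0, compressive < 0)."""
    pi = PIEZO_COEFF[device]
    # Relative resistivity change; mobility changes with the opposite sign.
    delta_rho = (pi["longitudinal"] * sigma_long_pa
                 + pi["transverse"] * sigma_trans_pa)
    return 1.0 - delta_rho


if __name__ == "__main__":
    compressive = -200e6  # 200 MPa compressive stress along the channel
    for dev in ("nmos", "pmos"):
        print(dev, round(mobility_multiplier(dev, compressive, 0.0), 4))
```

With compressive stress along the channel, this estimate gives a multiplier above one for PMOS and below one for NMOS, which matches the qualitative behavior described in Section II.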
In the remainder of this section, we describe a generic STI process flow and then present the STI models we propose.
B. Process Steps
In this section, we describe the process recipe we have used for the simulation of STI stress. We have simulated the structure up to the gate deposition step, using the Sentaurus process simulator. The process steps are shown in Table I.¹ The structure after STI CMP is shown in Figure 3 without and with the mesh. We have made the mesh especially dense between the STI and the region underneath the channel to get accurate finite element calculations close to the channel. The temperature cycles, such as steps 9 to 11, are responsible for thermal mismatches. In step 13, STI CMP is applied. At the end of this step, the top of the STI is left above the active region on purpose. The basic reason is to avoid defects such as delamination of the STI oxide. At the edges of the channel, this step height difference introduces threshold voltage variations and so-called width effects.
¹ If a foundry process is used, a detailed process flow is not available. The models in this case should be provided by the foundry.
Fig. 3. PMOS and NMOS Dependency on STI Width in Parallel Direction. (a) Process after CMP step. Mesh is not shown. (b) Process after STI CMP step. Mesh is shown.
C. STI Models
Limitation of Current STI Stress Models. BSIM models contain an STI stress model. However, only the impact of the distance from the transistor channel to the STI boundary is modeled. The dependency on the STI width (STIW) beyond the active (diffusion) regions is not present in the models. Our simulations, as well as simulations and data in the literature, indicate that the STIW effect has been neglected so far. STIW is illustrated in Figure 2. In this paper we not only show the significance of the STIW effect, but also use this effect to improve circuit performance at no area cost.
Fig. 2. STIW parameter. LOD is accounted for in BSIM models. STIW impact is not modeled. Parallel and orthogonal distances with respect to a transistor are also indicated in the figure. [Figure labels: orthogonal, parallel, W, SA, SB, STIW, LOD.]
We have developed the models below for STIW correction based on our TCAD simulations. The SA term appears in the equations, as the STIW impact differs according to the choice of LOD: a large distance between the channel and the edge of the diffusion pushes the STI away from the channel. The models can be used on top of the BSIM stress models for the STIW correction.
Parallel Models: [Equation (1) is illegible in the source text.]
Orthogonal Models:² [Equation (2) is illegible in the source text.]
The model dependencies are shown in Figure 4. Here, mulu0 is the mobility multiplier. To find the final mobility, the models in the four directions are multiplied and the mobility change is saturated at ±60%, as larger modifications would not be physical.
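The combination rule described above can be sketched as follows. This is a minimal illustration, assuming that the four directional multipliers have already been evaluated from the STIW models and that saturation at 60% means clamping the final multiplier to the range 0.4 to 1.6. The function name and the example values are hypothetical.

```python
from math import prod


def combined_mobility_multiplier(directional_multipliers, limit=0.60):
    """Multiply per-direction multipliers and clamp the result to 1 +/- limit."""
    total = prod(directional_multipliers)
    return max(1.0 - limit, min(1.0 + limit, total))


if __name__ == "__main__":
    # Hypothetical multipliers for the two parallel and two orthogonal sides.
    print(combined_mobility_multiplier([1.05, 1.05, 0.98, 0.98]))
    # A product beyond the bound is saturated at the upper limit.
    print(combined_mobility_multiplier([1.3, 1.3, 1.1, 1.1]))
```

The resulting value is the kind of quantity that would be passed to the simulator as the mobility multiplier of each transistor instance.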
TABLE II
MODEL PARAMETER TABLE
            Parallel    Orthogonal
NMOS X       0.1102      -0.0552
NMOS Y       0.0729      -0.0367
PMOS X      -0.0859       0.0430
PMOS Y      -0.0450       0.0224
Fig. 4. PMOS and NMOS Dependency on STI Width in Parallel Direction. (a) Increasing STIW or decreasing LOD increases PMOS mobility. (b) Decreasing STIW or increasing LOD increases NMOS mobility. [Both panels plot mulu0 versus LOD (um) for STIW = 2um, 1um, 0.5um and 0.2um.]
V. STANDARD CELL OPTIMIZATION
An example standard cell is shown in Figure 5 (a). The PRX region (active region for PMOS devices) is shown at the top. The NRX region (active region for NMOS devices) is shown at the bottom.
Dummy Diffusion Insertion. STI exists in the areas between diffusion regions. A wider STI causes a larger stress.³ Hence, dummy diffusions, which are neither electrically connected nor contain any active devices, can be inserted in large STI regions to reduce the STI width and hence the resultant stress.⁴ The dummy diffusion guidelines to improve performance are summarized below:
insert dummy N-diffusion in parallel direction near NMOS
insert dummy P-diffusion in orthogonal direction near PMOS
For optimization, as illustrated in Figure 5 (b), N-diffusion fills are inserted next to the NMOS regions in the lateral (parallel) direction. Regions next to the PMOS regions are left blank in the lateral direction, hence these regions contain STI. In the orthogonal direction, P-diffusion fills are inserted next to the PMOS regions. Regions next to the NMOS regions are left blank in the orthogonal direction, hence these regions contain STI. During placement, usually every second row of cells is mirrored along the x-axis such that the NMOS regions, and likewise the PMOS regions, of cells in neighboring rows are right next to each other.
Fig. 5. Inter-layer coupling for second neighboring layers. (a) A representative standard cell. (b) Standard cell optimized with dummy diffusions to improve performance. [Figure labels: PRX, NRX.]
² We have developed the orthogonal models based on silicon experience, which gives roughly 50% less impact as compared to stress along the parallel direction.
³ We have found through our simulations that there is a maximum STI width beyond which the stress in the channel does not increase further. We have found a value of 2 µm for this saturation effect.
⁴ Notice that insertion of dummy diffusions does not increase the number of masks needed for the process.
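The dummy diffusion guidelines of this section amount to a small lookup on device type and direction. The sketch below encodes them under that reading. The function name and the string labels are hypothetical, and neither complex diffusion shapes nor design rule checks are considered.

```python
# Dummy diffusion guideline as a lookup table (labels are hypothetical).
# Key: (device type, direction relative to the channel).
# Value: fill type to insert, or None to leave the region as STI.
FILL_GUIDELINE = {
    ("nmos", "parallel"): "n_diffusion_fill",
    ("nmos", "orthogonal"): None,
    ("pmos", "parallel"): None,
    ("pmos", "orthogonal"): "p_diffusion_fill",
}


def recommended_fill(device, direction):
    """Return the fill type to insert next to a device, or None to keep STI."""
    return FILL_GUIDELINE[(device.lower(), direction.lower())]


if __name__ == "__main__":
    for key in FILL_GUIDELINE:
        print(key, "->", recommended_fill(*key))
```

Keeping the rule in a table makes it easy to review against the guideline text and to extend if a technology shows a different stress response.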
More complex diffusion regions, such as the one shown in Figure 6, are also likely to be encountered. In this case, care must be taken when inserting fills into the dotted region, as a dummy diffusion inside the dotted area will adversely affect transistors A and B. We have left such regions as STI, i.e. we have not inserted any dummy diffusions there.
Fig. 6. Complex Diffusion Regions. No dummy fills should be inserted. [Figure labels: A, B.]
A. Practical Considerations
Design Rules. When active layer fills are inserted, there are two primary design rules to consider. One is the active-to-active spacing. The standard cell designer needs to make sure that this spacing is preserved after cell placement. The other important design rule is the active layer density. There is a limit on the active layer density due to STI CMP. The designer needs to check that this bound is not exceeded.
Leakage. The original purpose of STI is to provide isolation between devices. Dummy diffusion insertion reduces the STI width. However, the diffusion-to-diffusion spacing is preserved according to the design rules, which should originally have been set to make the leakage negligible at that spacing.
RC Extraction. The insertion of fills may slightly increase the total capacitances. However, as the fills are floating and not grounded, this increase will be negligible. Furthermore, the insertion of floating active layer fills can reduce the line-to-line coupling for the M1 routing lines, as the fill will draw some of the electrical flux between the overlying lines. In order to understand the impact of dummy diffusion insertion, we have conducted 2D field solver simulations with Raphael. A dense M1 layer with M2 routing on top and an underlying active layer is simulated. For M1 routing, we have compared the following three cases: an underlying active layer which is grounded (case1), no underlying active layer (case2) and an underlying dummy active layer (case3). The results, given in Table IV, indicate that the total M1 capacitance with floating dummy diffusion fill lies between the grounded-fill and no-fill values, closer to the latter. The coupling capacitance is essentially the same as in the case with no diffusion layer underneath.
TABLE IV
COMPARISON OF CAPACITANCES DUE TO DUMMY DIFFUSION
               case1       case2       case3
M1 total       1.71E-16    1.60E-16    1.64E-16
M1 coupling    6.120E-17   6.85E-17    6.86E-17
VI. CIRCUIT LEVEL EXPERIMENTAL ANALYSIS
For circuit level analysis, we have used a set of 65nm standard cells. To conduct the simulations, we have used HSPICE 2006.03. RC extraction is handled using Cadence Assura. We have used the MULU0 parameter for the mobility multipliers. We have written an extractor to extract the STIW parameters given a GDS and a netlist. We have designed a miscellaneous ring oscillator using 33 of these cells with a fanout of 2 each. We have selected a miscellaneous ring oscillator as we know through experience that such oscillators correlate well to silicon results for larger circuits. The cells contain inverters, NANDs, NORs, OAIs and MUXes.
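In its simplest form, an STIW extractor could measure the gap between an active region and its nearest neighbor in a given direction. The sketch below is a hypothetical simplification and not the author's tool. It works on axis-aligned rectangles rather than GDS polygons, handles only the gap to the right of a device, and caps the result at 2 um, on the assumption that the stress saturates beyond that STI width. All names are introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned active (diffusion) rectangle, coordinates in um."""
    x0: float
    y0: float
    x1: float
    y1: float


def sti_width_right(target, others, max_width=2.0):
    """STI width to the right of `target`: the gap to the nearest active
    rectangle that overlaps it vertically, capped at `max_width`."""
    width = max_width
    for r in others:
        overlaps_vertically = r.y0 < target.y1 and r.y1 > target.y0
        if overlaps_vertically and r.x0 >= target.x1:
            width = min(width, r.x0 - target.x1)
    return width


if __name__ == "__main__":
    device = Rect(0.0, 0.0, 0.5, 0.3)
    neighbors = [Rect(1.2, 0.1, 1.6, 0.4), Rect(0.8, 1.0, 1.0, 1.3)]
    print(sti_width_right(device, neighbors))
    # A dummy diffusion placed in the gap narrows the STI seen by the device.
    dummy = Rect(0.7, 0.0, 0.9, 0.3)
    print(sti_width_right(device, neighbors + [dummy]))
```

The second call shows the effect that the optimization relies on: adding a dummy rectangle changes the extracted STIW without moving the device or its real neighbors.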
We have manually inserted dummy diffusions to optimize the STI widths according to the given guidelines. We have conducted simulations using the nominal circuit with BSIM models, the nominal circuit with BSIM models updated with STIW models, and the optimized circuit with BSIM models updated with STIW models. We have extracted the RCs separately for each case. We have obtained oscillation periods of 0.4126ns, 0.4678ns and 0.4288ns, respectively. Comparing the two STIW-aware simulations, the optimization shortens the period from 0.4678ns to 0.4288ns. Hence, an 8% performance improvement is obtained using the circuit optimization.
VII. CONCLUSIONS
We have conducted TCAD process simulations to generate models which relate transistor mobilities to the stress induced by the shallow trench isolation width. We have devised an optimization methodology based on dummy diffusion insertion to modify the STI width. We have discussed the implications of dummy diffusion insertion and conducted field solver simulations to investigate the impact on coupling. The proposed optimization method can be used for both standard cell and custom circuit optimization. We have applied the proposed optimization flow to a miscellaneous standard cell ring oscillator. Our analysis shows that STI width optimization can increase performance by up to 8% with no area penalty. The proposed optimization can form the basis of circuit optimization for upcoming stress engineered transistor technologies.
VIII. ACKNOWLEDGEMENTS
The author would like to thank Frank Geelhaar of AMD for the discussions.
REFERENCES
[1] H.A. Rueda, “Modeling of Mechanical Stress in Silicon Isolation Technology and Its Influence on Device Characteristics,” Ph.D. Thesis, 1999.
[2] J.-P. Han et al., “Novel Enhanced Stressor with Graded Embedded SiGe Source/Drain for High Performance CMOS Devices,” IEDM, 2006.
[3] Q. Ouyang et al., “Characteristics of High Performance PFETs with Embedded SiGe Source/Drain and <100> Channels on 45° Rotated Wafers,” Int. Symp. on VLSI Technology, 2005.
[4] Y. Tateshita et al., “High-Performance and Low-Power CMOS Device Technologies Featuring Metal/High-k Gate Stacks with Uniaxial Strained Silicon Channels on (100) and (110) Substrates,” IEDM, 2006.
[5] Q. Ouyang et al., “Investigation of CMOS Devices with Embedded SiGe Source/Drain on Hybrid Orientation Substrates,” Symposium on VLSI Technology, 2005.
[6] Min Yang et al., “Hybrid-Orientation Technology (HOT): Opportunities and Challenges,” IEEE Tran. on Electron Devices, Vol. 53, No. 5, 2006.
[7] W.-H. Lee et al., “High Performance 65 nm SOI Technology with Enhanced Transistor Strain and Advanced-Low-K BEOL,” IEDM, 2005.
[8] H.S. Yang et al., “Dual Stress Liner for High Performance sub-45nm Gate Length SOI CMOS Manufacturing,” IEDM, 2004.
[9] C. Ortolland, “Stress Memorization Technique (SMT) Optimization for 45nm CMOS,” Symp. on VLSI Technology, 2006.
[10] C.-H. Chen et al., “Stress Memorization Technique (SMT) by Selectively Strained-Nitride Capping for Sub-65nm High-Performance Strained-Si Device Application,” Symp. on VLSI Technology, 2004.
[11] C.S. Smith, “Piezoresistance Effect in Germanium and Silicon,” Physical Review, Vol. 94, No. 1, pp. 42-49, 1954.
[12] N. Elbel, Z. Gabric, W. Langheinrich and B. Neureither, “A New STI Process Based on Selective Oxide Deposition,” Symposium on VLSI Technology Digest of Technical Papers, pp. 208-209, 1998.
[13] H. S. Lee et al., “An Optimized Densification of the Filled Oxide for Quarter Micron Shallow Trench Isolation (STI),” Symp. on VLSI Technology Digest of Technical Papers, pp. 158-159, 1996.
[14] Y.-M. Sheu et al., “Modeling Mechanical Stress Effect on Dopant Diffusion in Scaled MOSFETs,” IEEE Tran. on Electron Devices, Vol. 52, No. 1, Jan. 2005.
[15] C. Gallon et al., “Electrical Analysis of Mechanical Stress Induced by STI in Short MOSFETs Using Externally Applied Stress,” IEEE Tran. on Electron Devices, Vol. 51, No. 8, Aug. 2004.
[16] A.T. Bradley, R.C. Jaeger, J.C. Suhling and K.J. O’Connor, “Piezoresistive Characteristics of Short-Channel MOSFETs on (100) Silicon,” IEEE Tran. on Electron Devices, Vol. 48, No. 9, Sep. 2001.
[17] K.-W. Su et al., “A Scaleable Model for STI Mechanical Stress Effect on Layout Dependence of MOS Electrical Characteristics,” IEEE Custom Integrated Circuits Conference, 2003.
[18] V. Moroz et al., “The Impact of Layout on Stress-Enhanced Transistor Performance,” Int. Conf. on Simulation of Semiconductor Processes and Devices, 2005.
[19] L. Smith, “TCAD Modeling of Strain-Engineered MOSFETs,” Mater. Res. Soc. Symp. Proc., Vol. 913, 2006.
[20] V. Moroz, L. Smith, X.-W. Lin, D. Pramanik and G. Rollins, “Stress-Aware Design Methodology,” Int. Symposium on Quality Electronic Design, 2006.
[21] Y.-M. Sheu et al., “Modeling Well Edge Proximity Effect on Highly-Scaled MOSFETs,” IEEE Custom Integrated Circuits Conference, 2005.