A Power Optimization Method for CMOS Op-Amps Using Sub-Space Based Geometric Programming

Wei Gao
Computer Science & Engineering Department, York University, Toronto, Canada
[email protected]

Richard Hornsey, Senior Member, IEEE
Computer Science & Engineering Department, York University, Toronto, Canada
[email protected]

Abstract— A new sub-space max-monomial modeling scheme for CMOS transistors in sub-micron technologies is proposed to improve the modeling accuracy. Major electrical parameters of CMOS transistors in each sub-space of the design space are modeled with max-monomials. This approach is shown to be more accurate for sub-micron technologies than single-space models. Geometric programming (GP) power optimization based on sub-space modeling has been successfully applied to three different op-amps in 0.18µm technology. HSPICE simulation results show that GP optimization based on sub-space modeling allows efficient and accurate analog design. The computational effort of searching the sub-spaces for each transistor can be kept at an acceptable level by using practical constraints. An efficient scheme for dealing with the non-convex constraint inherent in Kirchhoff's voltage law is also suggested. With this scheme, a non-convex constraint, such as a posynomial equality, can be relaxed to a convex constraint without affecting the result.

Keywords-power optimization; CMOS op-amps; geometric programming; monomial; posynomial
I. INTRODUCTION
Automatic synthesis is an efficient route to power-optimized analog design: a power-optimal design constrained by other performance measures can be found simply by including power consumption in the objective function. Analog circuit synthesis based on geometric programming (GP) has become an active research subject over the last decade. Its efficient computation and reliable global optimum have allowed researchers to explore the automated design of CMOS op-amps [1, 2, 3, 4], pipelined ADCs [5], and CMOS DC-DC buck converters [6]. Most published works use first-order transistor models based on square-law theory. These models are very accurate in long-channel technologies but fail in sub-micron technologies [7]. Geometric-programming-based analog circuit synthesis faces two major challenges in modern sub-micron technologies. The first is to improve the accuracy of posynomial models of CMOS transistor parameters. The second is to cope with non-convex constraints efficiently. To the authors' knowledge, no posynomial models of CMOS transistors with a minimum feature size below 0.18µm have been reported. The published posynomial modeling methods in 0.18µm CMOS technology include convex piecewise-linear fitting (PWL) [7], and
978-3-9810801-6-2/DATE10 © 2010 EDAA
genetic-algorithm-based posynomial modeling (GAP) [8]. PWL fits the experimental data with a set of max-affine functions, which turns the fitting problem into an ordinary linear least-squares problem. In contrast, GAP combines a genetic algorithm with quadratic programming to synthesize a posynomial model of the experimental data. It was reported to perform better than PWL [8] for CMOS transistor models in 0.18µm technology. Both schemes model CMOS transistor parameters in a single design space (ranging from the weak- to the strong-inversion region). Although both methods improve the model accuracy for CMOS transistors in short-channel technologies, large errors in some parameters, such as the output conductance gds [7] and the transconductance gm [8], can lead to significant prediction errors in some circuit performance measures. In fact, CMOS transistors behave very differently in different inversion regions. For example, gm is convex in Vgs in the weak and moderate inversion regions, while it is concave in the strong inversion region. This makes it problematic to model a CMOS transistor accurately with a single posynomial model across the entire design space.

In this paper, a new sub-space modeling method is suggested to further improve the accuracy of posynomial models of CMOS transistors in sub-micron technologies. The PWL modeling method is selected in this work due to its simple implementation. Five design parameters (W, L, Vgs, Vds, Ids) are used to achieve better model accuracy for some parameters. An efficient scheme to deal with the non-convex constraint inherent in the Kirchhoff's voltage law (KVL) constraint is suggested as well. Without introducing an extra operating point convergence process as in [1], this scheme guarantees that the KVL constraint is met with only one GP process.

In Section II, the principle of geometric programming is explained. The sub-space modeling method is described in Section III.
Design examples of a simple two-stage op-amp, a symmetrical operational transconductance amplifier (OTA), and a common mode feedback (CMFB) OTA are presented in Section IV. The conclusion is given in Section V.

II. GEOMETRIC PROGRAMMING
Geometric programming (GP) is a class of mathematical optimization problems with the following standard form:

$$
\begin{aligned}
\text{minimize} \quad & f_0(x) \\
\text{subject to} \quad & f_i(x) \le 1, \quad i = 1, \dots, m, \\
& g_j(x) = 1, \quad j = 1, \dots, p,
\end{aligned}
\qquad (1)
$$

where $x \in \mathbb{R}^n_{+}$ is a vector of n real positive variables, $f_i$ are posynomial functions (posynomials), and $g_j$ are monomial functions (monomials). A monomial function has the form
$$
g(x) = c\, x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}, \qquad (2)
$$
where $c > 0$ and $a_i \in \mathbb{R}$. A posynomial function is a sum of one or more monomials and has the form

$$
f(x) = \sum_{k=1}^{K} c_k \prod_{j=1}^{n} x_j^{a_{kj}}, \qquad (3)
$$

where $c_k > 0$, $a_{kj} \in \mathbb{R}$, and K is the number of monomial terms.

TABLE I. SUB-SPACE MAP FOR ALL PARAMETERS BUT 1/gm
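As a minimal sketch of the forms in (2) and (3), the Python snippet below evaluates a posynomial and its counterpart under the change of variables y = log x, where the logarithm of a posynomial becomes a log-sum-exp of affine functions. The function names and the example coefficients are hypothetical and chosen only for illustration; they are not taken from the transistor models in this paper.

```python
import math

def monomial(c, exponents, x):
    """Evaluate c * prod_j x_j ** a_j for positive x, as in (2)."""
    value = c
    for xj, aj in zip(x, exponents):
        value *= xj ** aj
    return value

def posynomial(terms, x):
    """Evaluate a sum of monomials, as in (3).

    terms is a list of (c_k, [a_k1, ..., a_kn]) pairs with c_k > 0.
    """
    return sum(monomial(c, a, x) for c, a in terms)

def log_posynomial(terms, y):
    """Evaluate log f(exp(y)), a log-sum-exp of affine functions of y."""
    z = [math.log(c) + sum(aj * yj for aj, yj in zip(a, y)) for c, a in terms]
    z_max = max(z)  # subtract the maximum for numerical stability
    return z_max + math.log(sum(math.exp(v - z_max) for v in z))

# Hypothetical example: f(x) = 2 * x1 * x2^-0.5 + 0.3 * x1^-1 * x2^2
terms = [(2.0, [1.0, -0.5]), (0.3, [-1.0, 2.0])]
x = [1.5, 4.0]
y = [math.log(v) for v in x]

# Both evaluations agree: log f(x) == F(y) with y = log x.
assert math.isclose(math.log(posynomial(terms, x)), log_posynomial(terms, y))

# Midpoint check of convexity in the log domain for two arbitrary points.
ya = [math.log(0.2), math.log(5.0)]
yb = [math.log(3.0), math.log(0.4)]
mid = [(a + b) / 2 for a, b in zip(ya, yb)]
f_mid = log_posynomial(terms, mid)
f_avg = (log_posynomial(terms, ya) + log_posynomial(terms, yb)) / 2
assert f_mid <= f_avg + 1e-12
```

The midpoint check is only a spot test on two points, not a proof; the convexity itself follows from the log-sum-exp structure.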
A posynomial or monomial function can be transformed into a convex function by a logarithmic transformation of the design variables [1]. In this way, a geometric programming problem can be converted to a convex programming problem, whose global optimum can be found very efficiently (a few minutes for a problem with 1000 variables and 10000 constraints) by standard interior-point algorithms [9]. In addition to finding a true global optimum, GP can detect infeasibility of the problem unambiguously. Compared to other optimization algorithms, GP is more restrictive because the objective and constraints must be in posynomial or monomial form. Therefore, more effort is needed to write a problem in GP form.

Figure 1. gm versus Vgs for an NMOS transistor in 0.18µm technology with L=1.02µm, W: 0.4µm ~ 100µm, Vds: 0.2V ~ 1.8V working in (a) moderate inversion (b) strong inversion region.

III. SUB-SPACE MODELING
Unlike in long-channel technologies, the parameters of a CMOS transistor in sub-micron technologies cannot be accurately modeled with a single monomial equation over the entire design space. By carefully studying MOS transistor parameters in TSMC 0.18µm technology, we find that some parameters behave very differently in different operating regions. An example is illustrated in Fig. 1: gm is convex in Vgs in the moderate inversion region, while it is concave in the strong inversion region. This implies that different models have to be applied in different operating regions. In this work, the PWL modeling method is selected due to its simple implementation. Two models, derived from 4 design variables (W, L, Vds, and Ids) and 5 design variables (W, L, Vgs, Vds, and Ids), are compared for gm, 1/gm, and gds, and we find that the 5-variable model is more accurate. Other parameters are modeled sufficiently accurately with 4 design variables (W, L, Vds, and Vgs). A further study reveals that only 1/gm can be modeled accurately with three simple sub-design spaces (weak, moderate, and strong inversion regions) in 0.18µm technology. We note that the model accuracy is most sensitive to the channel length (L) and the gate-source voltage (Vgs), so it is better to select sub-design spaces according to ranges of L and Vgs. Table I lists the detailed sub-space map used in our study. We currently only consider moderate
| Vgs (V) | L = 0.5~1µm | L = 1~2µm | L = 2~5µm | L = 5~10µm | L = 10~21µm |
|---|---|---|---|---|---|
| 0.4~0.5 | mod1 | mod4 | mod7 | mod10 | mod10 |
| 0.5~0.6 | mod2 | mod5 | mod8 | mod11 | mod11 |
| 0.6~0.7 | mod3 | mod6 | mod9 | mod12 | mod12 |
| 0.7~0.8 | st1 | st9 | st17 | st25 | st33 |
| 0.8~1.0 | st2 | st10 | st18 | st26 | st34 |
| 1.0~1.2 | st3 | st11 | st19 | st27 | st35 |
| 1.2~1.4 | st4 | st12 | st20 | st28 | st36 |
| 1.4~1.5 | st5 | st13 | st21 | st29 | st37 |
| 1.5~1.6 | st6 | st14 | st22 | st30 | st38 |
| 1.6~1.7 | st7 | st15 | st23 | st31 | st39 |
| 1.7~1.8 | st8 | st16 | st24 | st32 | st40 |

mod #: model in moderate inversion region; st #: model in strong inversion region
TABLE II. COMPARISON OF MRAE (%) FOR DIFFERENT MODELS

| Model | gm | 1/gm | gds | Ids | Cgd | Cgs | Cgb | Cdb | Csb | Cdg |
|---|---|---|---|---|---|---|---|---|---|---|
| PWLsub | 2.15 | 0.47 | 3.37 | 9.36 | 3.42 | 4.19 | 4.96 | 2.03 | 4.32 | 2.24 |
| PWL [7] | - | 1.70 | 9.40 | - | - | 3.10 | - | - | - | - |
| GAP [8] | 13.00 | - | 7.21 | - | 0.28 | 4.32 | - | 0.18 | - | - |
and strong inversion regions due to a tradeoff between power consumption and speed. There are 52 sub-spaces used in our study for all parameters except 1/gm. In each sub-space, the drain-to-source voltage (Vds) ranges from 0.2V to 1.8V, and the channel width ranges from 0.4µm to 100µm. Experimental data for the CMOS transistors are extracted from HSPICE simulations using the BSIM3 model. Ten parameters are modeled in this work. The mean relative error (MRAE) is calculated in each sub-space, and the worst case among all sub-spaces is presented in Table II for comparison. The sub-space based PWL (PWLsub) modeling approach brings the MRAE under 10% for every parameter. Compared to the other two single-space schemes, it significantly improves the accuracy of gm and gds. Among all transistor parameters, these two play the most important roles in circuit synthesis.
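The worst-case reporting described above can be sketched as follows. This is only an illustration under stated assumptions: MRAE is taken here to be the mean of |model − reference| / |reference| expressed in percent, the function names are hypothetical, and the sample values are made up rather than extracted from HSPICE.

```python
def mrae_percent(modeled, reference):
    """Mean of |modeled - reference| / |reference| over all samples, in percent."""
    errors = [abs(m - r) / abs(r) for m, r in zip(modeled, reference)]
    return 100.0 * sum(errors) / len(errors)

def worst_case_mrae(per_subspace):
    """Return (sub-space name, MRAE) for the sub-space with the largest MRAE.

    per_subspace maps a sub-space name to a (modeled, reference) pair of lists.
    """
    scores = {name: mrae_percent(m, r) for name, (m, r) in per_subspace.items()}
    worst = max(scores, key=scores.get)
    return worst, scores[worst]

# Made-up gm samples (in siemens) for two sub-spaces, for illustration only.
samples = {
    "mod1": ([1.02e-4, 1.95e-4], [1.00e-4, 2.00e-4]),
    "st1": ([5.5e-4, 7.6e-4], [5.0e-4, 8.0e-4]),
}

worst_name, worst_value = worst_case_mrae(samples)
print(worst_name, round(worst_value, 2))
```

Reporting the maximum over sub-spaces, rather than an average, is what makes a single table entry a bound on the error of every sub-space model for that parameter.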
If a blind search is performed over all 52 sub-spaces, GP circuit synthesis with sub-space posynomial models will suffer from poor computing efficiency. Compared to a single posynomial model method, the computational effort of the sub-space method increases by a factor on the order of O(m^n), where m is the total number of sub-spaces to be searched and n is the number of transistors in the circuit. Even for a simple differential op-amp (5 transistors), the computational effort of a blind search is not acceptable. However, given a specific performance requirement and circuit topology, each transistor can only work in a few sub-spaces. In a differential op-amp design, the input stage is symmetrical, which allows fewer transistors to be considered. By constraining the search space of each transistor according to the performance requirements, we can make sub-space based GP circuit synthesis feasible. Some examples are demonstrated in the following section.

IV. GP CIRCUIT SYNTHESIZING EXAMPLES
The feasibility of GP circuit synthesis based on sub-space posynomial models is examined on three different op-amps in TSMC 0.18µm technology.

A. Two-stage op-amp design

The two-stage op-amp in Fig. 2 has been studied by many researchers in their GP synthesis methods [1, 2, 4, 7]. It is revisited here for comparison. As mentioned before, five design variables (L, W, Vgs, Vds, and Ids) are used in our sub-space models for gm, 1/gm, and gds. In theory, the design variables in posynomial models should be independent. In fact, Ids depends on the other four variables. To guarantee that Ids has the right value, a constraint on Ids has to be added:

$$
I_{ds\_cal} = I_{ds}, \qquad (4)
$$
where Ids_cal is the drain-to-source current calculated from a monomial model, and Ids is the design variable for the drain-to-source current. To solve the GP optimization problem, circuit performance measures have to be modeled in posynomial or monomial form and expressed as GP constraints. We use model equations similar to those in [2] for Av, PM, GBW, and SR, as shown in Table III. If 1/gm is used in the models, the open-loop voltage gain (Av) and gain bandwidth product (GBW) are inverse posynomials (i.e., 1/Av and 1/GBW are posynomials). The slew rate, SR, is an inverse posynomial. The phase shift (PSH) induced by the dominant pole (p1) is usually around 90 degrees, so we only consider the phase shift induced by the output pole (p2) and compensation pole (p3). When the phase shift is less than 25 degrees, tan^-1(x) ≈ x, so the phase shift can be modeled in posynomial form. In this way, the circuit performance measures can be effectively constrained with GP constraints. In Table III, AV-SPEC, GBWSPEC, SRSPEC, and PMSPEC are the specifications for AV, GBW, SR, and PM, respectively. The same bias constraints used in [2] are employed in this GP synthesis.

TABLE III. TWO-STAGE OP-AMP MODELS AND CONSTRAINTS

| Perform. | Model | Constraint |
|---|---|---|
| Av | $\dfrac{g_{m1}\, g_{m6}}{(g_{ds1} + g_{ds3})(g_{ds6} + g_{ds7})}$ | $\dfrac{A_{V\text{-}SPEC}}{A_V} \le 1$ |
| GBW | $\dfrac{g_{m1}}{2\pi C_C}$ | $\dfrac{GBW_{SPEC}}{GBW} \le 1$ |
| SR | $\min\left(\dfrac{I_{tail}}{C_C},\ \dfrac{I_{ds6}}{C_C + C_{OUT}}\right)$ | $\dfrac{SR_{SPEC}}{SR} \le 1$ |
| p1 | $\dfrac{g_{m1}}{2\pi A_V C_C}$ | |
| p2 | $\dfrac{g_{m6}}{2\pi (C_L + C_1 + C_2)}$ | |
| p3 | $\dfrac{g_{m6}}{2\pi C_1}$ | |
| PSH | $\tan^{-1}\!\left(\dfrac{GBW}{p_1}\right) + \tan^{-1}\!\left(\dfrac{GBW}{p_2}\right) + \tan^{-1}\!\left(\dfrac{GBW}{p_3}\right)$ | |
| PM | $180^\circ - PSH$ | $\dfrac{PSH}{180^\circ - PM_{SPEC}} \le 1$ |

C1 = Cdb7 + Cdg7 + Cdb6 + Cdg6 + Cgs6, C2 = Cdb1 + Cdg1 + Cdb3 + Cdg3, Cout = CL + Cdb7 + Cdg7 + Cdb6 + Cdg6

To achieve an efficient search, the search space of each transistor has to be constrained according to the performance requirements. This circuit has a symmetrical input stage (M1, M2, M3, M4, and M5), so only one half (M1, M3, and M5) needs to be considered. Hence, the number of active transistors reduces to five. The common mode input range also constrains the search space. The minimum and maximum common mode input voltages are:

$$
V_{com\_min} = V_{dsat5} + V_{gs1}, \qquad (5a)
$$

$$
V_{com\_max} = V_{dd} + V_{gd1\_max} - V_{sd3} = V_{dd} + V_{gs1} - V_{dsat1} - V_{sd3}, \qquad (5b)
$$
where Vdsat1 and Vdsat5 are the drain-to-source voltages at which M1 and M5 enter saturation, respectively, and Vgd1_max is the maximum gate-to-drain voltage of M1. Substituting (5a) into (5b), we find

$$
V_{com\_max} = V_{dd} + V_{com\_min} - V_{dsat5} - V_{dsat1} - V_{sd3}. \qquad (6)
$$
Figure 2. Two-stage compensated op-amp

To have Vcom_max = Vdd, Vsd3 must satisfy

$$
V_{sd3} = V_{com\_min} - V_{dsat5} - V_{dsat1}. \qquad (7)
$$
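Equations (5) to (7) can be checked numerically with a short sketch. The values below are assumptions for illustration only: Vdd = 1.8V is assumed for the 0.18µm process, Vcom_min = 0.7V is an example specification, and both saturation voltages are set to 0.11V. The function names are hypothetical.

```python
def vcom_min(vdsat5, vgs1):
    """Minimum common mode input voltage, equation (5a)."""
    return vdsat5 + vgs1

def vcom_max(vdd, vgs1, vdsat1, vsd3):
    """Maximum common mode input voltage, equation (5b)."""
    return vdd + vgs1 - vdsat1 - vsd3

def vsd3_for_rail_input(vcom_min_value, vdsat5, vdsat1):
    """Vsd3 that makes Vcom_max equal to Vdd, equation (7)."""
    return vcom_min_value - vdsat5 - vdsat1

# Assumed example values (volts), not results from the paper.
VDD = 1.8
VCOM_MIN = 0.7
VDSAT1 = 0.11
VDSAT5 = 0.11

vgs1 = VCOM_MIN - VDSAT5                               # rearranged (5a)
vsd3 = vsd3_for_rail_input(VCOM_MIN, VDSAT5, VDSAT1)   # (7)

# With this Vsd3, (5b) returns Vdd, so the input range reaches the supply rail.
assert abs(vcom_min(VDSAT5, vgs1) - VCOM_MIN) < 1e-12
assert abs(vcom_max(VDD, vgs1, VDSAT1, vsd3) - VDD) < 1e-12
print(round(vgs1, 2), round(vsd3, 2))
```

With these assumed numbers the sketch gives Vgs1 = 0.59V and Vsd3 = 0.48V.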
In our defined search spaces (as shown in Table I), the minimum Vdsat is ~0.11V; therefore, Vsd3 < (Vcom-min − 0.22V). For example, if Vcom-min = 0.7V, Vsd3 < 0.48V. Since Vsg3 = Vsd3, the transistor M3 has to work in the moderate inversion region. The search spaces for M3 are sub-spaces mod1, mod4, mod7, and mod10. The source-to-gate voltage of M6 is the same as that of M3, so the search spaces for M6 are the same as for M3. From (5a), we know that

$$
V_{dsat5} = V_{com\_min} - V_{gs1}, \qquad (8)
$$
and we have Vdsat5-max = Vcom-min − Vgs1-min = Vcom-min − 0.4V. If Vcom-min = 0.7V, Vdsat5-max = 0.3V. To have Vdsat5 ≤ 0.3V, Vgs5 should satisfy 0.4V ≤ Vgs5 ≤ 0.7V; therefore, the search spaces for M5 are mod1 ~ mod12, st1, st9, st17, st25, and st33. A transistor working in strong inversion has a larger Vdsat. If a small Vcom-min is wanted, the search spaces can be restricted to the moderate inversion region (mod1 ~ mod12). M7 has the same gate-to-source voltage and the same channel length as M5. Therefore, M5 and M7 can be searched in the same spaces at the same time. From (5a), we have

$$
V_{gs1} = V_{com\_min} - V_{dsat5}. \qquad (9)
$$
That means 0.4V ≤ Vgs1 ≤ (Vcom-min − 0.11V). If Vcom-min = 0.7V, 0.4V ≤ Vgs1 ≤ 0.59V, so the search spaces for M1 are mod1, mod2, mod4, mod5, mod7, mod8, mod10, and mod11. As a result, the total search effort for the two-stage op-amp is 8 (M1) × 4 (M3) × 12 (M5, M7) × 4 (M6) = 1536 iterations. A MATLAB-based GP solver [10] is used to solve this optimization problem. It runs on a Xeon workstation with 8 processors at 2.66GHz. The average computing time for one iteration on one processor is 5 minutes. With all 8 processors running at the same time, the search takes 0.67 days. This computational effort is acceptable as long as the accuracy of the synthesis is acceptable.

In this two-stage op-amp design, the sum of the drain-source voltages along each signal path equals Vdd, which is a posynomial equality. As we know from (1), a posynomial equality is not allowed in a GP. Several methods can be used to deal with it. Vanderhaegen et al. [3] suggested a branch-and-bound algorithm that replaces the posynomial equality with a set of posynomial inequalities. A GP is solved for each posynomial inequality, and the minimum over all these GPs is the global optimum. Mandal et al. [1] introduced an extra procedure to search for the DC operating point. J. Kim [11] relaxed the posynomial equality to a posynomial inequality. If the relaxed posynomial constraint and the objective vary with a chosen design variable in opposite monotonic directions, the posynomial equality is met at the solution. The first two methods need more than one GP and therefore more computational effort. The third method solves the problem with only one GP and is more computationally efficient. Therefore, a scheme similar to that in [11] is used here to solve our GP power optimization problem. The objective in this problem is

$$
obj = \min\left( w_1 (I_{tail} + I_{out}) + w_2 \frac{1}{V_{ds5}} + w_3 \frac{1}{V_{ds7}} \right), \qquad (10)
$$
where w1, w2, and w3 are weights, and they are positive real numbers. Weight w1 usually takes a large value compared to w2 and w3 to reduce the effect of the second and third terms on the power optimization. In this case, w1 = 10^6, and w2 = w3 = 1. The same power consumption is observed when the GP is solved with (10) and with the objective without the second and third terms. The relaxed posynomial equality in the input stage is

$$
V_{ds3} + V_{ds1} + V_{ds5} \le V_{dd}. \qquad (11)
$$

TABLE IV. TWO-STAGE OP-AMP DESIGN VERIFICATION WITH HSPICE

| Performance | Spec. | Predic. | HSPICE | RE |
|---|---|---|---|---|
| AV | 40dB | 76.7dB | 80dB | 4.1% |
| GBW | 10MHz | 10MHz | 9.04MHz | 10.1% |
| SR | 5.7V/µs | 6.52V/µs | 6.16V/µs | 5.8% |
| PM | 70º | 70º | 65º | 7.7% |
| Istatic | minimum | 15.89µA | 16.75µA | 5.1% |
We can see that the left side of (11) increases as Vds5 increases, but (10) decreases as Vds5 increases. Other performance constraints are not affected by Vds5. As a result, the equality in (11) will always be satisfied. In the outer signal path, the sum of Vds is

$$
V_{ds6} + V_{ds7} \le V_{dd}. \qquad (12)
$$
We can see that the left side of (12) increases with Vds7, but (10) decreases with Vds7. Other performance constraints, such as Av, SR, and PM, are non-decreasing with Vds7. Therefore, the equality in (12) will always be met. With constrained search spaces, GP-based local power optimization is performed in each subset of the search spaces. In fact, several designs (different transistor sizes and biases) can result in the same minimum static power consumption. The one with the minimum area is selected as the solution.

A two-stage op-amp is synthesized with the models and constraints mentioned above. The synthesis gives the sizes and biases of all transistors and the predicted performance of the circuit. The transistor sizes and biases obtained from the GP synthesis are then fed into an HSPICE test bench to verify the predicted performance. Table IV presents the predicted performance (Predic.) with the optimized power consumption from GP synthesis and the equivalent results from HSPICE simulation. The relative errors (RE) between these two results are presented as well. We can see that they are in very good agreement. The circuit specifications are presented in the second column (Spec.). Note that these simulations do not include layout effects.
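The search-space pruning and the resulting effort estimate given earlier in this subsection can be reproduced with a small sketch. It assumes that the moderate-inversion models mod1 to mod12 of Table I are arranged as four channel-length groups times three Vgs bins, and it uses the Vgs limits quoted in the text for Vcom-min = 0.7V. The helper name and data layout are hypothetical.

```python
from math import prod

# Vgs bins (V) of the moderate-inversion rows in Table I.
MOD_VGS_BINS = [(0.4, 0.5), (0.5, 0.6), (0.6, 0.7)]
# Assumption: mod1..mod12 form 4 channel-length groups x 3 Vgs bins.
N_LENGTH_GROUPS = 4

def moderate_subspaces(vgs_low, vgs_high):
    """Names of moderate-inversion sub-spaces whose Vgs bin overlaps the range."""
    names = []
    for group in range(N_LENGTH_GROUPS):
        for row, (lo, hi) in enumerate(MOD_VGS_BINS):
            if lo < vgs_high and hi > vgs_low:
                names.append(f"mod{3 * group + row + 1}")
    return names

# Vgs (or Vsg) limits per transistor for Vcom-min = 0.7V, as quoted in the text.
search_spaces = {
    "M1": moderate_subspaces(0.40, 0.59),
    "M3": moderate_subspaces(0.40, 0.48),
    "M5, M7": moderate_subspaces(0.40, 0.70),  # searched together
    "M6": moderate_subspaces(0.40, 0.48),
}

total_runs = prod(len(spaces) for spaces in search_spaces.values())

MINUTES_PER_RUN = 5   # average time of one GP solve on one processor
N_PROCESSORS = 8
days = total_runs * MINUTES_PER_RUN / N_PROCESSORS / (60 * 24)

print(total_runs, round(days, 2))  # prints: 1536 0.67
```

The product structure makes the benefit of pruning visible: each sub-space removed from one transistor's list divides the number of GP solves by a corresponding factor.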
B. Symmetrical OTA design

Symmetrical-OTA-based design is very attractive in high-speed and low-power applications [12]. A conventional symmetrical OTA and a common mode feedback OTA (CMFB OTA), as shown in Fig. 3, are considered in this study. Model equations and constraints on Av, GBW, SR, and PM of these two op-amps are listed in Table V and Table VI, respectively. In the symmetrical OTA, Av, GBW, and SR are inverse posynomials. The phase shift can be treated in the same way as for the two-stage op-amp, and it can be approximated as a posynomial. In this way, the circuit performance measures can be effectively constrained with GP constraints. A method similar to that of the two-stage op-amp design is used in this GP synthesis to make the sum of Vds converge to Vdd. The objective function in this design is

$$
obj = \min\left( w_1 (I_{tail} + I_{out}) + w_2 \frac{1}{V_{ds5}} + w_3 \frac{1}{V_{ds6}} \right), \qquad (13)
$$
where w1, w2, and w3 have the same meaning and values as in the two-stage op-amp case. Search spaces for each transistor can be constrained effectively with the same scheme as in the two-stage op-amp design.

In the CMFB OTA, Av and GBW are inverse posynomials. The phase shift can be estimated as a posynomial. As a result, they can be constrained with GP constraints. Unlike in the symmetrical OTA, the slew rate in the CMFB OTA is independent of the static current in the output path. During slewing, the source-gate voltage of M6 is adaptively adjusted by the common mode feedback resistor Rc. The slew current (Islew-cal) drawn from M6 is the current when Vsg6 = Vsg3 + ½(ItailRc). The slew rate cannot be constrained with a GP constraint if the slew current Islew-cal is used in the model. To solve the GP for the CMFB OTA, two extra variables are introduced: the slew current (Islew) and the slewing source-gate voltage of M6 (Vsg6-slew). Islew is a monomial with input variables Vsg6-slew, Vds6, L6, and W6. The slew rate can now be constrained with a GP constraint if Islew is used in the model. The GP can be solved by adding the following objective and constraint:

$$
obj = \min\left( p_1 (I_{tail} + I_{out}) + p_2 \frac{1}{GBW} + p_3 I_{slew} \right), \qquad \text{constraint: } \frac{I_{slew\text{-}cal}}{I_{slew}} \le 1 \qquad (14)
$$
the objective) at the same time. Therefore, the equality of the constraint in (14) can always be met as long as the power is gain bandwidth product dominant.

TABLE V. CONVENTIONAL SYMMETRICAL OTA MODELS AND CONSTRAINTS

| Perform. | Models & Constraints |
|---|---|
| Av | $\dfrac{g_{m1}\, g_{m6}}{g_{m3} (g_{ds6} + g_{ds7})}$ |
| GBW | $\dfrac{g_{m6}\, g_{m1}}{2\pi C_{OUT}\, g_{m3}}$ |
| SR | $\dfrac{2 I_{ds6}}{C_{OUT}}$ |
where p1, p2 and p3 are weights, and they are positive real numbers. In this case, we let p2