IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 51, NO. 12, DECEMBER 2006
Constrained Optimal Control of Hybrid Systems With a Linear Performance Index Mato Baotic´, Frank J. Christophersen, and Manfred Morari, Fellow, IEEE
Abstract—We consider the constrained finite and infinite time optimal control problem for the class of discrete-time linear hybrid systems. When a linear performance index is used the finite and infinite time optimal solution is a piecewise affine state feedback control law. In this paper, we present algorithms that compute the optimal solution to both problems in a computationally efficient manner and with guaranteed convergence and error bounds. Both algorithms combine a dynamic programming exploration strategy with multiparametric linear programming and basic polyhedral manipulation. Index Terms—Constrained systems, discrete-time, dynamic programming, finite time, hybrid systems, infinite time, multiparametric linear program, optimal control, piecewise affine systems.
I. INTRODUCTION

OVER THE last few years, several different techniques have been developed for the analysis and controller synthesis of hybrid systems [1]–[8]. A significant amount of the research in this field has focused on solving constrained optimal control problems, both for continuous-time and discrete-time hybrid systems. We consider the class of discrete-time linear hybrid systems, in particular, the class of constrained piecewise affine (PWA) systems that are obtained by partitioning the extended state-input space into polyhedral regions and associating with each region a different affine state update equation, cf. [3], [9]. As shown in [9], the class of piecewise affine systems is rather general and equivalent to many other hybrid system formalisms, such as, for example, mixed logical dynamical systems or linear complementarity systems. For piecewise affine systems the constrained finite time optimal control (CFTOC) problem can be solved by means of multiparametric programming [7]. The solution is a piecewise affine state feedback control law and can be computed by using multiparametric mixed-integer quadratic programming (mp-MIQP) for a quadratic performance index and multiparametric mixed-integer linear programming (mp-MILP) for a linear performance index, cf. [7], [10].
Manuscript received May 18, 2005; revised December 19, 2005 and April 10, 2006. Recommended by Associate Editor C. Abdallah. M. Baotić is with the Automatic Control Laboratory, ETH Zurich, CH-8092 Zürich, Switzerland, and also with the Faculty of Electrical Engineering and Computing, University of Zagreb, HR-10000 Zagreb, Croatia (e-mail: [email protected]; [email protected]). F. J. Christophersen and M. Morari are with the Automatic Control Laboratory, ETH Zurich, CH-8092 Zürich, Switzerland (e-mail: [email protected]; [email protected]). Color versions of Figs. 2–8 and 10–13 are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TAC.2006.886486
As recently shown by Borrelli et al. [11] for a quadratic performance index and by [12], [13] for a linear performance index, it is possible to obtain the optimal solution to the CFTOC problem without the use of integer programming. In [11], [12] the authors propose efficient algorithms based on a dynamic programming strategy combined with multiparametric quadratic or linear program (mp-QP or mp-LP) solvers. However, stability and feasibility (constraint satisfaction) of the closed-loop system are not guaranteed if the solution to the CFTOC problem is used in a receding horizon control strategy. To remedy this deficiency various schemes have been proposed in the literature. For constrained linear systems stability can be (artificially) enforced by introducing “proper” terminal set constraints and/or a terminal cost to the formulation of the CFTOC problem [14]. For the class of constrained PWA systems, very few and restrictive stability criteria are known, e.g., [15], [14]. Only recently ideas used for enforcing closed-loop stability of the CFTOC problem for constrained linear systems have been extended to PWA systems [16]. Unfortunately, the technique presented in [16] introduces a certain level of sub-optimality in the solution. The main advantages of the infinite time solution, compared to the corresponding finite time solution of the optimal control problem, are the inherent guaranteed stability and feasibility as well as optimality of the closed-loop system [14], [17]–[19]. In this paper, we present novel, computationally efficient algorithms1 to solve the constrained finite time optimal control problem and the constrained infinite time optimal control (CITOC) problem with a linear performance index for PWA systems. The algorithms combine a dynamic programming exploration strategy with a multiparametric linear programming solver and basic polyhedral manipulation. 
In the case of the CITOC problem, the developed algorithm guarantees convergence to the solution to the Bellman equation (if a bounded solution exists) which corresponds to the solution to the CITOC problem and thus avoids potential pitfalls of other conservative approaches. The algorithm cannot obtain optimal solutions that have an unbounded cost, but this is hardly a practical limitation since in most applications we want to steer the state to some equilibrium point by spending a finite amount of “energy.” The presented algorithms to solve the CFTOC and the CITOC problem are implemented in the Multiparametric 1The problems considered in this paper belong to the class of combinatorial problems, which in general have an exponential worst-case complexity. The algorithms we introduce for solving CFTOC and CITOC problems are ‘efficient’ in the sense that they are outperforming the mp-MILP formulation, which currently is the only other viable method of obtaining the closed form solution to the problems at hand. Furthermore, unlike the mp-MILP approach, the algorithms proposed here can cope with moderate size problems.
0018-9286/$20.00 © 2006 IEEE
Toolbox (MPT) [28] for MATLAB. The toolbox can be downloaded at http://control.ee.ethz.ch/mpt/.

II. LINEAR HYBRID SYSTEMS

PWA systems are equivalent to many other hybrid system classes [3], [9], such as, for example, mixed logical dynamical systems [15], linear complementarity systems [2], or min–max-plus-scaling systems, and thus form a rather general class of linear hybrid systems. Moreover, piecewise affine systems lend themselves as a powerful class for identifying or approximating generic nonlinear systems via multiple linearizations at different operating points [3], [20], [21]. Even though hybrid systems (and in particular PWA systems) are a special class of nonlinear systems, most of nonlinear system and control theory does not apply because it requires certain smoothness assumptions. For the same reason, we also cannot simply use linear control theory in some approximate manner to design controllers for PWA systems. We consider the class of discrete-time, stabilizable, linear hybrid systems that can be described as constrained continuous² piecewise affine (PWA) systems of the following form:
x(k+1) = f_PWA(x(k), u(k)) = A_i x(k) + B_i u(k) + f_i,  if  [x(k)' u(k)']' ∈ D_i   (1)

where x(k) ∈ R^n is the state, u(k) ∈ R^m is the control input, the domain D = ∪_{i=1}^{n_D} D_i of f_PWA is a nonempty compact set in R^{n+m} with n_D the number of system dynamics, and {D_i}_{i=1}^{n_D} denotes the polyhedral partition of the domain D, i.e., D = ∪_{i=1}^{n_D} D_i and D_i ∩ D_j = ∅ for all i ≠ j. Note that linear state and input constraints can be incorporated in the description of the D_i.

²Here, a PWA system defined over a disjoint domain is called continuous if f_PWA is continuous over connected subsets of the domain.

The standing assumption throughout this paper is as follows.

Assumption II.1 (Equilibrium at the Origin): The origin in the extended state–input space is an equilibrium point of the PWA system (1), i.e., [0' 0']' ∈ D and f_PWA(0, 0) = 0, which implies f_i = 0 for all i with [0' 0']' ∈ D̄_i, where D̄_i denotes the closure of D_i.

The previous assumption does not limit the scope of the paper. For simplicity, we consider only costs that penalize the deviation of the state and control action from the origin (equilibrium point) in the extended state–input space. However, all presented results also hold for any nonzero equilibrium point, since such problems are easily translated to the "steer-to-the-origin" problem by a simple linear substitution of the variables.

In the rest of this paper, we will refer to the following definition of feasibility.

Definition II.2 (Feasibility): Consider the general optimization problem

J*(x) = min_u J(x, u)  subject to  (x, u) ∈ C   (2)

where J : C → R is a real-valued function defined over the domain C ⊆ R^n × R^m, x denotes a parameter vector (or state), and u denotes the optimization variable (or input). We say that the optimization problem (2) is feasible if C ≠ ∅. The point x (respectively, u) is called feasible if and only if there exists u (respectively, x) such that (x, u) ∈ C.

Please note that the terminology of stability of a system, i.e., stability of the origin (as it is considered in this paper), only makes sense for feasible trajectories. Therefore, trajectories leaving a feasible set cannot be considered "unstable": the terminology of stability (in the classical sense) is not defined outside of a feasible set.

III. CONSTRAINED FINITE TIME OPTIMAL CONTROL

We consider the piecewise affine system (1) and define the constrained finite time optimal control (CFTOC) problem

J*_T(x(0)) = min_{U_T} J_T(x(0), U_T)   (3)

subject to

x(k+1) = f_PWA(x(k), u(k)),  x(T) ∈ X_f   (4)

with the cost function (also called performance index)

J_T(x(0), U_T) = ||P x(T)||_p + Σ_{k=0}^{T−1} ( ||Q x(k)||_p + ||R u(k)||_p )   (5)

where U_T := [u(0)', …, u(T−1)']' is the optimization variable defined as the input sequence, T is the prediction horizon, X_f is a compact terminal set in R^n, and ||·||_p with p ∈ {1, ∞} in (5) denotes the corresponding standard vector 1- or ∞-norm. The optimal value of the cost function, denoted with J*_T(x(0)), is called the value function. The optimization variable that achieves J*_T(x(0)) is called the optimizer and we denote it with U*_T. With a slight abuse of notation, when the CFTOC problem (3)–(4) has multiple solutions, i.e., when the optimizer is not unique, U*_T denotes one (arbitrarily chosen) realization from the set of possible optimizers.

Note that it is common practice to use the term linear performance index when referring to (5) even though, strictly speaking, the cost function in (5) is a piecewise affine function of its arguments.

The CFTOC problem (3)–(4) implicitly defines the set of feasible initial states X_T = {x(0) ∈ R^n | ∃ U_T such that (4) holds} and the set of feasible input sequences. Our goal is to find an explicit (closed-form) expression for the set X_T and for the functions J*_T(x(0)) and U*_T(x(0)), where x(0) ∈ X_T.

Remark III.1 (Infimum Problem): Strictly speaking, we should formulate the CFTOC problem (3)–(4) as a search for the infimum rather than for the minimum. However, in our case the cost function comprises a finite number of linear norms. Furthermore, since the set X_f is compact, f_PWA is continuous over connected subsets of D, T is finite, and the sets D_i are compact, we know that the feasible space is compact. Consequently, the Bolzano–Weierstrass existence
theorem guarantees that the minimum and the infimum problem are equivalent.

Remark III.2 (Choice of P, Q, and R): The problem (3)–(4) can be posed and solved for any choice of the matrices P, Q, and R. However, from a practical point of view, if we want to avoid unnecessary controller action while steering the state to the origin, the choice of a full-column rank R is a necessity. Moreover, for stability reasons (as will be shown in Section IV) a full-column rank Q is assumed.

Remark III.3 (Time-Varying System and/or Cost): Problem (3)–(4) naturally extends to PWA systems and/or cost functions with time-varying parameters, i.e., A_i(k), B_i(k), f_i(k), and D_i(k), as well as Q(k) and R(k) for k = 0, …, T−1. For simplicity we focus on the time-invariant case, but the CFTOC problem with time-varying parameters is of the same form and complexity as the CFTOC problem with time-invariant parameters and therefore it can be solved in an analogous manner.

We summarize the main result concerning the solution to the CFTOC problem (3)–(4), which is proved in [22] and [7].

Theorem III.4 (Solution to CFTOC): The solution to the optimal control problem (3)–(4) with p ∈ {1, ∞} is a piecewise affine value function of the initial state

J*_T(x(0)) = Φ^j x(0) + Γ^j  if  x(0) ∈ P^j   (6)

and the optimal input U*_T is a time-varying piecewise affine function of the initial state

U*_T(x(0)) = F^j x(0) + G^j  if  x(0) ∈ P^j   (7)

where {P^j}_{j=1}^{N} is a polyhedral partition of the set of feasible states

X_T = ∪_{j=1}^{N} P^j   (8)

with the closure of P^j given by the polyhedron P̄^j.

A. The CFTOC Solution via mp-MILP

One way of solving the CFTOC problem (3)–(4) is by reformulating the PWA system (1) into a set of inequalities with integer variables δ_i(k) ∈ {0, 1} acting as switches between the different 'dynamics' of the hybrid system. By using an upper bound for each of the components of the cost function (5), e.g., ε_k ≥ ||Q x(k)||_p, the CFTOC problem can be rewritten as a mixed-integer linear program (MILP) of the form

min_z c'z  subject to  G z ≤ W + S x(0)   (9)

where c, G, W, and S are matrices of suitable dimension, and the optimization variable z is of the form z = [U_T', δ', ε']', where ε denotes an auxiliary continuous variable. Note that x(0) can be considered as a parameter of the mp-MILP. The matrices c, G, W, and S contain the whole information on the state and input constraints, the weighting matrices P, Q, and R, as well as the state update equation for the whole time horizon T. The reader is referred to [12] for further details on the computation of the CFTOC solution via an mp-MILP.

B. The CFTOC Solution via Dynamic Programming

Making use of Bellman's optimality principle [23]–[25], the constrained finite time optimal control problem (3)–(4) can be solved in a computationally efficient way by solving an equivalent dynamic program (DP) backwards in time [12], [26], [27]. The corresponding DP has the following form:

J*_k(x) = min_u ||Q x||_p + ||R u||_p + J*_{k−1}(f_PWA(x, u))   (10)

subject to

f_PWA(x, u) ∈ X_{k−1}   (11)

for k = 1, …, T, cf. Fig. 1, and with boundary conditions

J*_0(x) = ||P x||_p  and  X_0 = X_f   (12)

where

X_k = {x ∈ R^n | ∃ u, [x' u']' ∈ D, f_PWA(x, u) ∈ X_{k−1}}   (13)

is the set of all states at time T − k for which the problem (10)–(12) is feasible.

Fig. 1. Relation of the time axis and iteration step of the dynamic program.

Since J*_{k−1} is a piecewise affine function of the state, the dynamic programming problem (10)–(12) can be solved by multiparametric linear programs, cf. [7], [12], where the state x is treated as a parameter and the control input u as an optimization variable. By solving such programs at each iteration step k, going backwards in time, starting from the target set X_f, we obtain the set X_k, the optimal control law μ*_k, and the value function J*_k that represents the so-called "cost-to-go." Properties of the solution are given in the following theorem, cf. [22], [7].

Theorem III.5 (Solution to CFTOC via DP): The solution to the optimal control problem (10)–(12) with p ∈ {1, ∞} is a piecewise affine value function

J*_k(x) = Φ_k^j x + Γ_k^j  if  x ∈ P_k^j   (14)

and the optimal input is a time-varying piecewise affine function of the state, i.e., it is given as a state feedback control law

μ*_k(x) = F_k^j x + G_k^j  if  x ∈ P_k^j   (15)
where {P_k^j}_{j=1}^{N_k} is a polyhedral partition of the set of feasible states at time T − k

X_k = ∪_{j=1}^{N_k} P_k^j   (16)

with the closure of P_k^j given by the polyhedron P̄_k^j.

Theorem III.4 states that the solution to the CFTOC problem (3)–(4), i.e., the optimal input sequence U*_T given by (7), is a function of the initial state x(0) only. On the other hand, Theorem III.5 describes the solution to the dynamic program (10)–(12) as the optimal state feedback control law μ*_k(x). Since we know that both solutions must be identical (assuming that the optimizer is unique), this implies that there is a connection between the matrices F^j and G^j in (7) and the matrices F_k^j and G_k^j in (15). It is easy to see that the first block row of (7), i.e., the expression for u*(0), coincides with the feedback law (15) for k = T. To establish the connection for the other coefficients one would have to carry out the tedious sequence of substitutions in (15), which would eventually express U*_T as a function of x(0) only. However, in this paper we focus on the DP approach to solving the CFTOC problem and, since both approaches give the same solution, we will not go beyond this note in establishing an explicit connection between those coefficients. Having this in mind, from this point onwards, when we speak of the solution to the CFTOC problem we consider the solution in the form given in Theorem III.5.

In the rest of this paper, with μ we denote a generic state feedback control law that maps a set of states to a set of control actions. Thus, μ(x(k)) specifies the control action (or input action) that will be chosen at time k when the state is x(k).

C. An Efficient Algorithm for the CFTOC Solution

In order to present Algorithm III.6, which solves the CFTOC problem via the dynamic program (10)–(12), some explanation of the notation and employed functions needs to be given.

Algorithm III.6 (Generating the CFTOC Solution)
INPUT  f_PWA(x, u), {D_i}_{i=1}^{n_D}, p, P, Q, R, T, X_f
OUTPUT The CFTOC solution J*_k, μ*_k, and {P_k^j}_{j=1}^{N_k}, k = 1, …, T
LET J*_0(x) := ||P x||_p, X_0 := X_f
FOR k = 1 TO T
    FOR i = 1 TO n_D
        FOR EACH region P_{k−1}^j, j = 1, …, N_{k−1}
            SOLVE the mp-LP
                S_{i,j} := { min_u ||Q x||_p + ||R u||_p + J*_{k−1}(f_PWA(x, u))
                             subject to f_PWA(x, u) ∈ P_{k−1}^j, [x' u']' ∈ D_i }
        END
    END
    LET S*_k := INTERSECT & COMPARE {S_{i,j}}
END
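To make the SOLVE step concrete, the following sketch evaluates the inner minimization of (10) at a single fixed state for a toy one-dimensional PWA system; the dynamics, constraints, and weights are illustrative assumptions, not the paper's example, and scipy is used in place of a parametric solver. The mp-LP in the algorithm solves this same problem parametrically in x; here the 1-norm cost is handled with standard epigraph variables and one LP per queried state.

```python
import numpy as np
from scipy.optimize import linprog

# Toy 1-D PWA system (illustrative, not the paper's example):
#   x+ = 1.5*x + u  if x >= 0,   x+ = 0.5*x + u  if x < 0,
# input constraint u in [-1, 1], target set X_0 = [-1, 1],
# stage cost |Q*x| + |R*u| with Q = 1, R = 0.1, terminal cost J_0(x) = |x|.
Q, R = 1.0, 0.1

def dp_step(x):
    """J_1(x) = min_u Q|x| + R|u| + J_0(f_PWA(x,u)) s.t. f_PWA(x,u) in X_0,
    solved as one LP in (u, t_u, t_xp) epigraph form for this fixed x."""
    a = 1.5 if x >= 0 else 0.5            # active dynamic for this state
    c = [0.0, R, 1.0]                     # minimize R*t_u + t_xp
    A_ub = [[ 1, -1,  0],                 #   u - t_u          <= 0
            [-1, -1,  0],                 #  -u - t_u          <= 0
            [ 1,  0, -1],                 #  (a*x + u) - t_xp  <= 0
            [-1,  0, -1],                 # -(a*x + u) - t_xp  <= 0
            [ 1,  0,  0],                 #   a*x + u <= 1  (successor in X_0)
            [-1,  0,  0]]                 # -(a*x + u) <= 1
    b_ub = [0.0, 0.0, -a*x, a*x, 1.0 - a*x, 1.0 + a*x]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(-1, 1), (0, None), (0, None)])
    assert res.success
    return Q*abs(x) + res.fun

print(dp_step(0.5))   # deadbeat u = -0.75 is optimal: 0.5 + 0.1*0.75 = 0.575
```

Sweeping dp_step over a grid of states traces out the piecewise affine shape of J*_1 pointwise; the algorithm above obtains the same function exactly, as an explicit polyhedral partition, from a finite number of mp-LPs.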
When we say SOLVE an iteration of a DP, we mean that we formulate several multiparametric linear programs (mp-LPs) and obtain a triplet of expressions for the value function, the optimizer, and the polyhedral partition of the feasible state space, for example by using MPT [28]. By inspection of the DP problem (10)–(12) we see that at each iteration step we are solving n_D · N_{k−1} mp-LPs, where n_D is the number of system dynamics and N_{k−1} is the number of regions in the partition of X_{k−1}. After that, by using polyhedral manipulation, we have to compare all generated regions, check if they intersect, and remove the redundant ones, before storing a new partition that has N_k regions.

In the step where INTERSECT & COMPARE is performed, we are removing redundant polyhedra, i.e., we remove those polyhedra that are completely covered by other polyhedra [29], [28] which have a "better" (meaning smaller) corresponding value function expression. If some polyhedron is only partially covered by "better" regions, the part of it with the smaller value function can be partitioned into a set of convex polyhedra. Thus, we preserve the polyhedral nature of the feasible state space partition in each iteration.

D. Comments on the Dynamic Programming Approach for the CFTOC Problem

In this section, some general remarks on important issues regarding the new technique are given. An important advantage of the dynamic programming approach, compared to the approach based on multiparametric mixed-integer programming described briefly in Section III-A, is that after every iteration step, starting from k = 1 to k = T, the data of all the intermediate optimal control laws, the polyhedral partitions of the state space, and the piecewise affine value functions are available. Thus, after completing T dynamic programming iterations, the solutions to T different CFTOC problems with time horizons varying from 1 to T are available and can be used for analysis and control purposes. This, in addition, makes it possible to detect if the solution for a specific time horizon is identical to the infinite time solution, i.e., if for some horizon T̄ < T the cost J*_T̄ as a function of the initial state x(0) is identical to the cost J*_{T̄+1} for all feasible states. For further details on the infinite time solution and explanation, see Section IV.
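One building block of the INTERSECT & COMPARE step is deciding whether a polyhedron is covered by another; for a single candidate cover in H-representation this reduces to one LP per facet (maximize each facet function of the second polyhedron over the first). Below is a minimal sketch with scipy, under the assumption of bounded, nonempty polyhedra; the full step in [29], [28] additionally handles unions of regions and the comparison of the associated value function expressions.

```python
import numpy as np
from scipy.optimize import linprog

def contained_in(A1, b1, A2, b2, tol=1e-9):
    """Check {x: A1 x <= b1} subset of {x: A2 x <= b2} by maximizing
    each facet of the second polyhedron over the first (one LP per facet)."""
    for a, b in zip(A2, b2):
        # maximize a'x over the first polyhedron == minimize -a'x
        res = linprog(-np.asarray(a, float), A_ub=A1, b_ub=b1,
                      bounds=[(None, None)] * len(a))
        if not res.success or -res.fun > b + tol:
            return False
    return True

# Unit box vs. box [-2, 2]^2, both in H-representation (illustrative data)
box = (np.vstack([np.eye(2), -np.eye(2)]), np.ones(4))
big = (np.vstack([np.eye(2), -np.eye(2)]), 2 * np.ones(4))
print(contained_in(*box, *big), contained_in(*big, *box))  # True False
```

Redundancy elimination against a union of covering regions is harder than this pairwise test, which is one reason the INTERSECT & COMPARE step dominates the polyhedral bookkeeping of the algorithm.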
In some parts of the state space, especially in the regions around the origin, it is likely that in two successive steps of the dynamic programming algorithm identical "regions" (in terms of the regions' dimensions and the associated value function) are generated. Such a case is depicted in Figs. 6 and 7 for Example (19), where the white encircled regions are identical. However, note that at some future iteration step the solution may change again, and only when the piecewise affine value function converges on the whole feasible state space can we claim that the infinite time solution is obtained in any part of the state space. As a consequence, it would be wrong to deduce
that the infinite time solution was obtained in parts of the state space for some horizon T < T_∞. Such a claim can only be made a posteriori, i.e., after computing the solution to the CFTOC problem with T ≥ T_∞.

A modification of the algorithm that aims for the construction of the infinite time solution in a computationally efficient manner, by limiting the exploration of the state space in intermediate iteration steps of the dynamic programming algorithm, is presented in Section IV.

E. Receding Horizon Control

In the case that the receding horizon (RH) control policy [14] is used in closed-loop, the control is given as a time-invariant state feedback control law of the form

μ_RH(x) = F_T^j x + G_T^j  if  x ∈ P_T^j   (17)

with j = 1, …, N_T, and the time-invariant value function is

J_RH(x) = J*_T(x) = Φ_T^j x + Γ_T^j  if  x ∈ P_T^j   (18)

for x ∈ X_T. Thus, only N_T (in the worst case different) control laws have to be stored. Note that in general J_RH(x) in (18) does not represent the value function of the closed-loop system when the receding horizon control law μ_RH is applied, because J*_T(x(k)) denotes the cost-to-go from x(k) to x(k + T) when the open-loop input sequence U*_T is applied. In the special case when the finite time solution is equivalent to the infinite time solution, i.e., J*_T = J*_∞ for some finite T, J_RH in fact does represent the value function of the closed-loop system when applying μ_RH (see also Remark IV.10).

F. Example: Constrained PWA System

Consider the piecewise affine system [15]

x(k+1) = 0.8 [cos α(k)  −sin α(k); sin α(k)  cos α(k)] x(k) + [0; 1] u(k),
α(k) = π/3 if [1 0] x(k) ≥ 0,  α(k) = −π/3 if [1 0] x(k) < 0   (19)

with x(k) ∈ [−10, 10] × [−10, 10] and u(k) ∈ [−1, 1]. The CFTOC problem (3)–(4) is solved for the horizon T = 8.

Fig. 2. State-space partition of the finite time solution for T = 8 for Example (19) derived with the dynamic programming algorithm. The same color corresponds to the same affine control law μ(x(0)). The control law, comprising 19 different affine expressions, is defined over 262 polyhedral regions.

Fig. 2 shows the state-space partition of the finite time solution computed with the dynamic programming algorithm proposed in Section III-C. The same color corresponds to the same affine control law μ*(x(0)). The control law, comprising 19 different affine expressions, is defined over 262 polyhedral regions. Each polyhedral region has a different affine value function expression assigned. Fig. 3 depicts the state-space partition for the infinite time solution computed with the dynamic programming algorithm of Section III-C. A posteriori it can be shown with the dynamic programming procedure that the finite time solution for a horizon T = 11 is in fact identical to the infinite time solution of the constrained optimal control problem, cf. Section III-D. The infinite time solution for this example was computed in 1515 s on a Pentium 4, 2.2-GHz machine running MATLAB 6.1. The same coloring scheme corresponds to the same affine control law. The optimal control law μ*_∞(x), comprising 23 different affine expressions, is defined over 252 polyhedral regions. Fig. 4 shows the corresponding value function over the state-space partition. The same color corresponds to the same cost. The minimum cost is naturally achieved at the origin. Fig. 5 shows the state and control action evolution for the infinite time solution obtained with the dynamic programming procedure, with initial state x(0) = [−10 10]'.

IV. CONSTRAINED INFINITE TIME OPTIMAL CONTROL

As in the previous section, we consider the piecewise affine system (1), subject to state and input constraints, and by letting T → ∞
Fig. 3. State-space partition of the infinite time solution (T = T_∞ = 11) for Example (19) derived with the dynamic programming algorithm. The same color corresponds to the same affine control law μ(x(0)).

Fig. 4. State-space partition of the infinite time solution J_∞(x) with T = T_∞ = 11 for Example (19) derived with the dynamic programming algorithm. The same color corresponds to the same cost value.

Fig. 5. State and control action evolution of the infinite time solution derived with the dynamic programming algorithm for Example (19). Initial state x(0) = [−10 10]'.
the cost function (5) takes the following form (assuming that the limit exists)

J_∞(x(0), U_∞) = lim_{T→∞} Σ_{k=0}^{T−1} ℓ(x(k), u(k))   (20)

where the stage-cost function ℓ is defined by ℓ(x(k), u(k)) = ||Q x(k)||_p + ||R u(k)||_p with p ∈ {1, ∞}. Moreover, (3)–(4) becomes the constrained infinite time optimal control (CITOC) problem

J*_∞(x(0)) = min_{U_∞} J_∞(x(0), U_∞)   (21)

subject to

x(k+1) = f_PWA(x(k), u(k)),  k = 0, 1, …   (22)

where by U_∞ := [u(0)', u(1)', …]' we denote the optimization input sequence and by U*_∞ the optimizer of (21) and (22). In order to guarantee closed-loop stability we assume that Q is of full-column rank, as will be shown in the following, cf. Lemma IV.3. Additionally, also in the infinite time case it can be assumed that R is of full-column rank even though these assumptions are not strictly needed, cf. Remark III.2.

Assumption IV.1 (Boundedness of J*_∞): The CITOC problem (21)–(22) is well defined, i.e., the minimum is achieved for some feasible input sequence, and J*_∞(x(0)) < ∞ for any feasible state x(0) on a compact set X_∞.

Without Assumption IV.1 the optimal control problem (21)–(22) would in general be undecidable, cf. [30, Sec. 4]. Additionally, Assumption IV.1 is reasonable from a practical point of view, since in most applications we want to steer the state from some given state or set to some equilibrium point (here the origin, cf. Assumption II.1 and the paragraph that follows) by spending a finite amount of "energy." In the following example, we additionally illustrate the reasoning behind Assumption IV.1.

Example IV.2 (Constrained LTI System): Consider the simple CITOC problem

J*(x(0)) = min_{U_∞} lim_{T→∞} Σ_{k=0}^{T−1} |Q x(k)| + |R u(k)|   (23)

for the constrained one-dimensional LTI system

x(k+1) = 2 x(k) + u(k),  |u(k)| ≤ 1   (24)

Problem (23)–(24) is feasible for all initial states in [−1, 1], and one can observe that the closed-loop system for the optimal state feedback control law

μ*(x) = −1 if x ∈ (1/2, 1],  μ*(x) = −2x if x ∈ [−1/2, 1/2],  μ*(x) = 1 if x ∈ [−1, −1/2)   (25)
Fig. 6. State-space partition of the finite time solution J(x) for T = 4 for Example (19) derived with the dynamic programming algorithm. The same color corresponds to the same cost value. The white marked region is identical to the infinite time solution.

Fig. 7. State-space partition of the finite time solution J(x) for T = 5 for Example (19) derived with the dynamic programming algorithm. The same color corresponds to the same cost value. The white marked region is identical to the infinite time solution.
has three equilibria, at −1, 0, and 1. However, the closed-loop system is only asymptotically stable for the open set (−1, 1). Fig. 8 illustrates that the optimal value function J*(x) → ∞ as |x| → 1 and therefore the problem is not well defined in the sense of Assumption IV.1. In practice, one can compute an ε-close approximation of J*(x) on a closed subset of (−1, 1), as was done for obtaining Fig. 8. Note that choosing different weights Q and R only influences the "shape" of J*(x) and μ*(x) but does not influence the above-mentioned characteristic behavior of the solution.

Please note that most of the following results in Section IV hold also (or are straightforwardly extended) for general continuous nonlinear systems and are not restricted to the considered class of PWA systems.
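The boundedness issue behind Assumption IV.1 can be reproduced numerically. The sketch below simulates the one-dimensional system x+ = 2x + u, |u| ≤ 1, under a saturated deadbeat feedback; the law and the weights are illustrative assumptions, not the optimal law of Example IV.2. The accumulated stage cost converges for initial states strictly inside (−1, 1) but grows without bound when the state is stuck at the boundary.

```python
# Closed-loop cost accumulation for the 1-D system x+ = 2x + u, |u| <= 1,
# under a saturated deadbeat law u = clip(-2x, -1, 1) (illustrative, not
# necessarily optimal).  Inside (-1, 1) the trajectory reaches the origin
# and the cost converges; at x0 = 1 the state cannot be moved and the
# accumulated cost grows without bound.
def cost(x0, steps=200, q=1.0, r=0.1):
    x, J = x0, 0.0
    for _ in range(steps):
        u = max(-1.0, min(1.0, -2.0 * x))   # saturated deadbeat input
        J += q * abs(x) + r * abs(u)        # stage cost q|x| + r|u|
        x = 2.0 * x + u
    return J

print(round(cost(0.9), 4))   # finite: 2.84
print(cost(1.0) > 100)       # cost keeps growing at the boundary: True
```

This mirrors the qualitative picture of Fig. 8: the accumulated cost stays finite in the interior of the feasible set but blows up toward its boundary, which is exactly what Assumption IV.1 rules out.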
Fig. 8. ε-close approximation of the optimal value function J*(x) for Example IV.2. The colored x-axis denotes the different regions over which the piecewise affine value function J*(x) is defined.
Lemma IV.3 (Stability of the CITOC Solution): Consider the CITOC problem (21)–(22) and let its solution fulfill Assumption IV.1. Then the following hold.
a) The origin is part of the infinite time solution, i.e., 0 ∈ X_∞ with J*_∞(0) = 0 and μ*_∞(0) = 0.
b) By applying the optimizer U*_∞ to the system, any system state x(0) ∈ X_∞ is driven to the origin (attractiveness), i.e., if x(0) ∈ X_∞ then x(k) → 0.
c) If J*_∞ is continuous at x = 0 (it can be discontinuous elsewhere), then by applying the optimizer U*_∞ to the system, the equilibrium point 0 of the closed-loop system is asymptotically stable in the Lyapunov sense.

Proof:
a) Because x = 0 is an equilibrium point of the system (1) and ℓ(0, 0) = 0, the minimum of J_∞(0, U_∞) is achieved, e.g., with u(k) = 0 for all k. That means J*_∞(0) = 0, which is the smallest possible value of J_∞.
b) By Assumption IV.1 we have 0 ≤ J*_∞(x(0)) < ∞, i.e., J*_∞ is bounded from above and below. Additionally, the sequence of partial sums Σ_{k=0}^{K} ℓ(x(k), u*(k)), as K increases, is nondecreasing. Since we are using nonnegative stage costs, the sequence converges to J*_∞(x(0)). Consequently, for every ε > 0 there exists a K' with Σ_{k=K'}^{∞} ℓ(x(k), u*(k)) < ε for all K' large enough. Therefore, necessarily ||Q x(k)||_p → 0, and because Q is of full-column rank it follows x(k) → 0.
c) The origin is an equilibrium point of the closed-loop system. Because Q is of full-column rank, we have ||Q x||_p > 0 for any x ≠ 0. In general, for a full-column rank Q there exists a finite w > 0 such that ||Q x||_p ≥ w ||x||. Therefore, we have J*_∞(x) ≥ ℓ(x, μ*_∞(x)) ≥ w ||x||. On the other side, with Assumption IV.1 and the continuity of J*_∞ at x = 0 (it can be discontinuous elsewhere), there always
exists a K-class function α [31] with J*_∞(x) ≤ α(||x||) for all x in a neighborhood of the origin. Thus, w ||x(k)|| ≤ J*_∞(x(k)) ≤ J*_∞(x(0)) ≤ α(||x(0)||) for all k ≥ 0, and it follows that ||x(k)|| ≤ α(||x(0)||)/w. Clearly, for each ε > 0 the choice of δ such that α(δ) ≤ w ε satisfies the Lyapunov stability definition in [31], i.e., for all ||x(0)|| < δ it follows that ||x(k)|| < ε for all k ≥ 0. Hence, the origin is a Lyapunov stable point, and together with the attractiveness of the origin [Lemma IV.3(b)] we have that the origin is asymptotically stable.

Remark IV.4 (Continuity at the Origin): It should be remarked that Lemma IV.3 only requires the continuity of J*_∞ at the single point x = 0. No further (continuity) assumptions on J*_∞ are needed for closed-loop asymptotic stability, which is essential for the considered problem class, where discontinuities of the solution may naturally appear. This is in contrast to most results in the literature, where either a Lyapunov function and/or the closed-loop system description is assumed to be (Lipschitz) continuous in a neighborhood around the origin, cf., e.g., [32]. An exception, where only continuity at the origin is required, is [33], which uses a different approach from the result presented in Lemma IV.3. Note that, if there exist some finite constants bounding J*_∞ from above and below by power functions of ||x|| for all x, then the equilibrium point of the closed-loop system is locally exponentially stable in the Lyapunov sense, cf. [32]–[34]. The arguments in Remark IV.4 also hold for J*_k and J*_∞ of Lemma IV.5 in Section IV-A.
, and Assumption II.1 it is easy to From see that the following properties: and
for all
(32)
of the dynamic program (26)–(28) hold. Lemma IV.5 (Stability of the DP Solution): Consider the DP be bounded on problem (26)–(28). In addition, let and continuous at . Then, when applying the corfor all responding optimal control law to the system, the following hold. for all , with a) The limit value function , is a global Lyapunov function for the closed-loop system. is driven to the origin (atb) Any system state then . tractiveness), i.e., if c) The origin is an asymptotically stable equilibrium point. Proof: a) From property (32), we have that . Because is a limit function of the DP (26)–(28) we have
for all
. Thus, with
, it follows that
A. The CITOC Solution via Dynamic Programming Similar to the recasting of the CFTOC problem into a recursive dynamic program as presented in Section III-B, it is possible to formulate for the CITOC problem (21)–(22) the corresponding dynamic program (DP) as follows
and the limit value function of the dynamic program (26)–(28) by
. Because is of full column rank, there exists a with and thus for all . This means that is always bounded below by some -class function. Similarly, we have that for all and some finite . By similar argument as in Lemma IV.3(c) there exists a -class function bounding from above. From these statements, it follows directly is a global Lyapunov function [31], [33] for the that closed-loop system. b) c) Because a global Lyapunov function exists [Lemma it follows IV.5(a)] for the closed-loop system on the set is a global asymptotically stable immediately that the origin then . [31], [33] and, thus, if It should be remarked that Lemma IV.5(a) can be proven at , cf. without imposing continuity of Appendix. However, due to the analogy with Lemma IV.3(c) the above proof was chosen. Note that in the infinite time case, in contrast to the finite time case discussed in Section III, the equivalence of the solution and the optimal solution of the dynamic program of the CITOC problem is not immediate. Before we prove this equivalence, it is useful to introduce the following operators. ): For any function Definition IV.6 (Operator and we define the following mapping:
(31)
(33)
(26) (27)
subject to for
, with initial conditions and (28)
for all . The set of all initial states for which (26)–(28) is feasible at iteration step is given by
(29) Furthermore, we define the feasible set of states as
by (30)
for all finite
BAOTIC´ et al.: CONSTRAINED OPTIMAL CONTROL OF HYBRID SYSTEMS WITH A LINEAR PERFORMANCE INDEX
where the set of feasible control actions implicitly through the domains of and
is defined
transforms the function on into the function . denotes the -times operator of with itself, i.e., and with . Accordingly, we use (34) and any control function for any function defined on the state–space . The DP procedure (26)–(28) can now be simply stated as , with . , satisfies the Bellman The solution to the DP procedure, equation
As in the CFTOC case, the infinite time problem might have . With a slight abuse of notation, in the multiple optimizers one case when the optimizer is not unique, we denote with (arbitrarily chosen) realization from the set of possible optimizers. Now, we are ready to state the theorem that characterizes the and the optimal state feedback control law optimal solution . Theorem IV.9 (Solution to the CITOC and DP Problem): Under Assumption IV.1 the solution to the optimal control is a piecewise affine value problem (21)–(22) with function if
(36) cf. [35, Sec. 4]. Now, if the DP (26)–(28) has a solution fulfilling property (36) then it is unique and according to [35, Th. 4.3] this solution satisfies the CITOC problem. From Lemma IV.5 and the Appendix, it follows immediately that the DP solution does in fact satisfy property (36). Having established this result, in the following we will denote the solution to both problems, the CITOC problem (21)–(22) . A control law is and the DP problem (26)–(28), with called stationary if for all . [26, Prop. 3.1.3]: A Lemma IV.8 (Optimal Control Law is optimal if and only if stationary control law for all . In other words, is optimal if and only if the minimum of (26) is obtained with for . all
(37)
and the optimal state feedback control law is of the time-invariant piecewise affine form if
(35) which is effectively being used as a stopping criterion to decide when the DP procedure has terminated. To prove that the solution to the CITOC problem (21)–(22), , is identical to the solution of the dynamic program , we actually have to answer two questions: (26)–(28), First, under which conditions does the DP procedure (26)–(28) is a unique solution to the converge, and second, when DP procedure (26)–(28). Theorem IV.7 (Equivalence: CITOC Solution-DP Solution): , satisfy Let the solution to the CITOC problem (21)–(22), . Let Assumption IV.1 and let it be continuous at be the solution to the dynamic program (26)–(28). for all . Moreover, the Then solution is a unique solution of the dynamic program (26)–(28). Proof: According to [26, Sec. 3], the solution to the CITOC problem (21)–(22) satisfies the dynamic program . On the other hand, in general, (26)–(27), that is the Bellman equation (35) may have no solution or it may have multiple solutions, but at most one solution has the property
1911
(38)
, is a polyhedral partition of the set of where at time with . feasible states Proof: Theorem IV.9 follows from the construction of the DP iterations (26)–(28), Theorem III.5, and Assumption IV.1. Remark IV.10 (Receding Horizon Control): Note that the infinite time optimal control law (38) has the same form as the corresponding RH control policy (17). This means that with the the closed-loop and open-loop response optimal control law of the system (1) are identical. Moreover, the value is to the origin, applying the opthe total “cost-to-go” from timal control policy. In the dynamic program (26)–(28), we start the value function iteration procedure with the zero-function (28) as initial condition. Due to the monotonicity property of the operator , cf. [26, Lemma 1.1.1], this guarantees the convergence (“from of “arbitrary” dynamic probelow”) to the optimal solution gramming problems if is for all feasible a stationary solution, i.e., , cf. [26, Prop. 3.1.5]. However, in numerical computations it might happen that at some point of the DP iteration an erroneous , with for some , is computed and thus convergence (“from below”) of the DP is not guaranteed. Moreover, is available in the case that a tight upper bound one might want to incorporate this information into the dynamic programming procedure because if convergence (“from above”) can be guaranteed, due to the monotonicity property of the operator , it is likely that this “speeds up” the convergence of the DP. The following results show that we can guarantee converfrom (almost) arbitrary ; gence to the optimal solution but first it is necessary to prove some preliminary result. Lemma IV.11 (Lower Bound ): Let the function with and (39) for all
then
.
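The value function iteration $J_k = \mathcal{T}J_{k-1}$ of (26)–(28) can be mimicked numerically on a gridded toy problem. The sketch below (all system data, weights, and grid sizes are hypothetical illustration choices, not taken from this paper) runs the iteration for a one-dimensional constrained linear system with a 1-norm stage cost, starting from $J_0 \equiv 0$, i.e., "from below," and stops once the Bellman equation (35) holds on the grid:

```python
import numpy as np

# Hypothetical 1-D constrained system x+ = a*x + u, |x| <= xmax, |u| <= umax,
# with stage cost |Q x| + |R u|, Q = 1, R = 0.5 (illustrative values only).
a, umax, xmax, Q, R = 1.2, 1.0, 4.0, 1.0, 0.5
xs = np.linspace(-xmax, xmax, 161)          # state grid
us = np.linspace(-umax, umax, 81)           # input grid
J = np.zeros_like(xs)                       # J_0 == 0: iterate "from below"

for k in range(400):
    Xn = a * xs[:, None] + us[None, :]      # successor states f(x, u)
    feas = np.abs(Xn) <= xmax               # state constraint, cf. (27)
    stage = Q * np.abs(xs)[:, None] + R * np.abs(us)[None, :]
    total = np.where(feas, stage + np.interp(Xn, xs, J), np.inf)
    J_new = total.min(axis=1)               # Bellman operator T, cf. (26), (33)
    if np.max(np.abs(J_new - J)) < 1e-9:    # fixed point J = T J, cf. (35)
        break
    J = J_new

# Near the origin the input can cancel a*x in one step, so J(x) = 1.6 |x| there.
print(J[np.argmin(np.abs(xs - 0.5))])       # ~ 0.8
```

The iterates increase monotonically toward the fixed point, as guaranteed by the monotonicity of $\mathcal{T}$ when starting from the zero function.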
Proof: By Lemma IV.8, the control input sequence generated by the stationary optimal law $\mu_\infty$ obtains the optimal value function $J_\infty(x)$ for all $x \in \mathcal{X}_\infty$. Using a $\bar J_0$ with the properties given in (39), it follows that $(\mathcal{T}^k \bar J_0)(x) \ge (\mathcal{T}^k 0)(x) = J_k(x)$ for all $x \in \mathcal{X}_\infty$. Moreover, under Assumption IV.1, from Lemma IV.3 we have $J_\infty(0) = 0$ and $\mu_\infty(0) = 0$, i.e., the origin is an equilibrium of the optimal closed loop. Therefore, it follows that $\lim_{k\to\infty}(\mathcal{T}^k \bar J_0)(x) \ge J_\infty(x)$.

Theorem IV.12 (Initial Value Function $\bar J_0$): Let Assumption IV.1 hold, let $J_\infty$ be continuous at $x = 0$, and let $\bar J_0$, with $\bar J_0(x) \ge 0$ and $\bar J_0(0) = 0$, be the initial value function of the DP iteration (26)–(27). Moreover, let

$\bar J_\infty(x) := \lim_{k\to\infty} (\mathcal{T}^k \bar J_0)(x)$  (40)

be a finitely bounded limit function, i.e., $\bar J_\infty(x) < \infty$ for all $x \in \mathcal{X}_\infty$, with

$\bar \mu_\infty$  (41)

its corresponding stationary control law.

a) If $\bar J_0$ is some (arbitrary) finitely bounded function on $\mathcal{X}_\infty$, and if after some (possibly infinite) number of DP iterations $\bar J_\infty$ is a stationary solution, i.e., $\bar J_\infty = \mathcal{T}\bar J_\infty$ for all $x \in \mathcal{X}_\infty$, then $\bar J_\infty = J_\infty$ and $\bar\mu_\infty$ is an optimal control law for all $x \in \mathcal{X}_\infty$.

b) If $\bar J_0$ is some (arbitrary) finitely bounded function on $\mathcal{X}_\infty$ with either $\bar J_0(x) \le J_\infty(x)$ for all $x$ or $\bar J_0(x) \ge J_\infty(x)$ for all $x$, then the existence of $\bar J_\infty$ is guaranteed, $\bar J_\infty = J_\infty$, and $\bar\mu_\infty$ is an optimal control law for all $x \in \mathcal{X}_\infty$.

c) If $\bar J_0$ is the finitely bounded cost realization of some feasible input sequence, then the existence of $\bar J_\infty$ is guaranteed, $\bar J_\infty = J_\infty$, and $\bar\mu_\infty$ is an optimal control law for all $x \in \mathcal{X}_\infty$.

Proof:
a) We have that $\bar J_\infty$ is a stationary solution of the dynamic program (26)–(27), i.e., $\bar J_\infty = \mathcal{T}\bar J_\infty$ for all $x \in \mathcal{X}_\infty$. Applying the optimal control law $\mu_\infty$ for $k$ steps yields $(\mathcal{T}^k \bar J_0)(x) \le (\mathcal{T}_{\mu_\infty}^k \bar J_0)(x)$; because the optimal closed-loop state is driven to the origin, where $\bar J_0$ vanishes, it follows that $\bar J_\infty(x) \le J_\infty(x)$ for all $x \in \mathcal{X}_\infty$. At the same time, from Lemma IV.11 it follows that $\bar J_\infty(x) \ge J_\infty(x)$ for all $x \in \mathcal{X}_\infty$. Therefore, $\bar J_\infty = J_\infty$.

b) For the case $\bar J_0(x) \le J_\infty(x)$ for all $x$, see [26, Prop. 1.1.5]. Case $\bar J_0(x) \ge J_\infty(x)$ for all $x$: Due to monotonicity of the operator $\mathcal{T}$ [26, Lemma 1.1.1], we have $(\mathcal{T}^k \bar J_0)(x) \ge (\mathcal{T}^k J_\infty)(x) = J_\infty(x)$ for all $x$ and $k$. Additionally, the sequence $\{(\mathcal{T}^k \bar J_0)(x)\}$ with increasing $k$ is bounded from below by $J_\infty(x)$ for all $x$; thus the sequence converges to some stationary, bounded solution $\bar J_\infty$, i.e., $\bar J_\infty = \mathcal{T}\bar J_\infty$ for all $x \in \mathcal{X}_\infty$. In addition, from part a) of this theorem we then have $\bar J_\infty = J_\infty$.

c) For an arbitrary feasible input sequence, due to sub-optimality, we clearly have $\bar J_0(x) \ge J_\infty(x)$ for all $x \in \mathcal{X}_\infty$. Because $\bar J_0$ is a realizable cost, it is finitely bounded for all $x \in \mathcal{X}_\infty$. The rest follows directly from part b) of this theorem.

a)–c) From $\bar J_\infty = J_\infty$ and the stationarity of $\bar J_\infty$ it follows with Lemma IV.8 directly that $\bar\mu_\infty$ is an optimal control law for all $x \in \mathcal{X}_\infty$.

Corollary IV.13 (Control Lyapunov Function): If $\bar J_0$ is continuous at $x = 0$ and chosen according to Theorem IV.12(b) with $\bar J_0(x) \ge J_\infty(x)$ for all $x$, then it is a (global) control Lyapunov function on $\mathcal{X}_\infty$ for the dynamical system. Moreover, the existence of $\bar J_\infty$ [as defined in (40)] is guaranteed, $\bar J_\infty = J_\infty$, and $\bar\mu_\infty(x) = \mu_\infty(x)$ for all $x \in \mathcal{X}_\infty$.

Proof: Corollary IV.13 follows directly from Theorem IV.12(b) and the definition of a control Lyapunov function [36], [37].

Theorem IV.12 (together with Corollary IV.13) is a rather strong and computationally important result guaranteeing that we will find the unique optimal finitely bounded solution if one exists.

Theorem IV.14 (Convergence Rate): Let $J_k$ be the solutions to the DP procedure (26)–(28) such that $J_{k^*}(x) = J_\infty(x)$ for some finite $k^*$ and all $x \in \mathcal{X}_\infty$. Then, the convergence rate

$\gamma(k) := \max_{x \in \mathcal{X}_\infty} |J_k(x) - J_{k-1}(x)|$  (42)

is a monotonically nonincreasing function, i.e., $\gamma(k+1) \le \gamma(k)$, for all $k$.

Proof: From the definition of $\mathcal{T}$, we see that the following holds for all $x \in \mathcal{X}_\infty$:

$J_k(x) = \|Qx\|_p + \|Ru_k^*(x)\|_p + J_{k-1}(f_{\text{PWA}}(x, u_k^*(x)))$  (43)
$J_{k+1}(x) \le \|Qx\|_p + \|Ru_k^*(x)\|_p + J_k(f_{\text{PWA}}(x, u_k^*(x)))$  (44)

where $u_k^*(x)$ is the optimal control for a given state $x$, i.e., it is a function of $x$. Hence, it follows:
$J_{k+1}(x) - J_k(x) \le J_k(f_{\text{PWA}}(x, u_k^*(x))) - J_{k-1}(f_{\text{PWA}}(x, u_k^*(x))) \le \max_{z \in \mathcal{X}_\infty} |J_k(z) - J_{k-1}(z)| = \gamma(k)$

where the first inequality is due to (44), and the second one is due to the removal of some of the constraints from the problem (the bound is taken over all of $\mathcal{X}_\infty$ instead of over the particular successor state only). In a similar way, using the optimizer $u_{k+1}^*(x)$, we get $J_k(x) - J_{k+1}(x) \le \gamma(k)$. Finally, by taking into account that for any function the absolute value at every point is bounded by its maximum, the rest of the proof follows easily:

$\gamma(k+1) = \max_{x \in \mathcal{X}_\infty} |J_{k+1}(x) - J_k(x)| \le \gamma(k).$

Fig. 9. Typical behavior of the "error" function $\gamma(k)$.

Remark IV.15 (Local Convergence): Effectively, Theorem IV.14 states that the DP procedure (26)–(28) does not exhibit quasi-stationary behavior (confer Fig. 9), i.e., it cannot happen that a succession of functions $J_k$ differ only slightly and then suddenly a drastic change appears in one additional iteration step of the DP. However, note that we compare the functions $J_k$ and $J_{k-1}$ over the whole feasible state space and not for a single point $x$. For a fixed $x$ the local convergence rate is not necessarily a monotonic function.

Theorem IV.16 (Stabilizing Sub-Optimal Control): Let Assumption IV.1 hold and let $\bar J_k$ be some arbitrary finitely bounded function with $\bar J_k(0) = 0$ and continuous at $x = 0$. If

$\bar J_k(x) \ge (\mathcal{T}\bar J_k)(x)$ for all $x \in \mathcal{X}_\infty$  (45)

then the corresponding one-step control law $\bar\mu_k$ is a globally asymptotically stabilizing (sub-optimal) control law for all $x \in \mathcal{X}_\infty$. Moreover, $\bar J_k$ is a global Lyapunov function for the controlled system when the control law $\bar\mu_k$ is applied.

Proof: First, from (45) it follows directly that $\bar J_k(x) - \bar J_k(f_{\text{PWA}}(x, \bar\mu_k(x))) \ge \|Qx\|_p + \|R\bar\mu_k(x)\|_p \ge \|Qx\|_p$ for all $x \in \mathcal{X}_\infty$. Because $Q$ is of full-column rank, there always exists a finite $c_1 > 0$ with $\|Qx\|_p \ge c_1\|x\|$ and, thus, the decrease of $\bar J_k$ along closed-loop trajectories is always bounded below by some $\mathcal{K}$-class function. Second, from the finite boundedness of $\bar J_k$ it follows that $\bar J_k(x) \le c_2$ for some finite $c_2$. By a similar argument as in Lemma IV.3(c) there exists a $\mathcal{K}$-class function bounding $\bar J_k$ from above. From these three statements it follows that $\bar J_k$ is a global Lyapunov function [31], [33] on $\mathcal{X}_\infty$ for the closed-loop system and, thus, the stability claim follows directly.

This result is somewhat intuitive but at the same time very interesting: It allows one to compute in an "easy" way a stabilizing controller as well as a bound on the optimality/performance of the controller. Simultaneously, a Lyapunov function, $\bar J_k$, for the controlled system is given. Moreover, this result shows that if the value function iteration is started with some $\bar J_0$ [confer, e.g., Theorem IV.12(a)] and at some iteration of the dynamic program it is detected that condition (45) holds for all $x$, then at all the following iteration steps of the dynamic program a stabilizing controller and a continuously improving optimality bound are computed. For initial value functions which fulfill the conditions of the second case of Theorem IV.12(b) or of Theorem IV.12(c), this is already true from the beginning, i.e., for all $k$. Please note that for initial value functions which fulfill the classical convergence conditions of Theorem IV.12(b), i.e., $\bar J_0(x) \le J_\infty(x)$, this stabilization and optimality bound property can only be given after computing the limit function, but not for the intermediate iteration steps.

As observed by the authors of [12], it may happen that the dynamic program (26)–(28) converges after a finite number of steps, confer Section III-F. Here, we mean by convergence that in two successive iterations of the dynamic program the value function (including its domain) does not change, i.e., $J_k(x) = J_{k-1}(x)$ for all $x$. From this, however, one should not assume that the optimal control law steers any feasible state after at most $k$ time steps to the origin. Several observations should be made, since the $p$-norm CITOC problem may lead to two types of solutions: a) an optimal control sequence that in a finite number of time steps steers the state to the origin, and b) an optimal control sequence that takes an infinite number of time steps to steer the state to the origin. This type of behavior may be observed even for constrained linear systems, as shown in the following example.
Fig. 10. Feasible state-input space and optimal infinite time control law for Example IV.17.
Example IV.17 (CITOC of a Constrained LTI System): Consider a one-dimensional constrained linear system. The CITOC problem (21)–(22) is solved for this system with a linear performance index. The feasible state–space and the optimal infinite time control law are depicted in the extended state–input space in Fig. 10.
If one considers Fig. 10, it is obvious that on one part of the feasible set the infinite time optimal control law keeps the state within that part, and therefore the optimal value function, or "cost-to-go," can be computed there in closed form. From this, we see that for any state starting in this part of the set it will take an infinite number of steps until the state reaches the origin, whereas for the remaining part of the feasible set the infinite time optimal control law drives the state in one single step to the origin. With similar considerations for the rest of the feasible state–space, one finds piecewise expressions for the infinite time optimal control law and for the cost-to-go. The optimal infinite time closed-loop trajectory for three different initial values is depicted in Fig. 11.
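The two behavior types can be reproduced on a hypothetical one-dimensional system (not the system of this example, whose data is shown in Fig. 10): with $x^+ = 0.5x + u$ and stage cost $|x| + R|u|$, expensive control ($R = 10$) makes coasting with $u \equiv 0$ optimal, so the state reaches the origin only asymptotically (type b), while cheap control ($R = 0.1$) makes $u = -0.5x$ optimal, reaching the origin in one step (type a). Both candidate cost-to-go functions can be verified against the Bellman equation (35) on a grid:

```python
import numpy as np

# Hypothetical example (not the paper's data): x+ = 0.5 x + u,
# |x| <= 2, |u| <= 1, stage cost |x| + R |u|.
xs = np.linspace(-2, 2, 81)
us = np.linspace(-1, 1, 81)

def bellman(J, R):
    """One application of the DP operator T on the grid, cf. (26)-(27)."""
    Xn = 0.5 * xs[:, None] + us[None, :]
    feas = np.abs(Xn) <= 2.0
    tot = np.where(feas, np.abs(xs)[:, None] + R * np.abs(us)[None, :]
                   + np.interp(Xn, xs, J), np.inf)
    return tot.min(axis=1)

# Expensive control (R = 10): coasting u = 0 is optimal, J(x) = 2 |x|;
# the closed loop only reaches the origin asymptotically (type b).
J_slow = 2.0 * np.abs(xs)
assert np.max(np.abs(bellman(J_slow, 10.0) - J_slow)) < 1e-9

# Cheap control (R = 0.1): u = -0.5 x is optimal, J(x) = 1.05 |x|;
# the closed loop reaches the origin in one step (type a).
J_fast = 1.05 * np.abs(xs)
assert np.max(np.abs(bellman(J_fast, 0.1) - J_fast)) < 1e-9

x = 2.0
for _ in range(20):
    x = 0.5 * x            # optimal trajectory for R = 10: apply u = 0
assert 0 < x < 1e-5        # still not at the origin after 20 steps

x = 2.0
x = 0.5 * x - 0.5 * x      # optimal input u = -0.5 x for R = 0.1
print(x)                   # origin reached in a single step
```

The contrast illustrates why finite convergence of the DP iteration does not imply finite-time convergence of the closed-loop state.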
Fig. 11. Closed-loop simulation for Example IV.17 for 3 different initial values.
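Theorem IV.16 also suggests a cheap numerical certificate: any finitely bounded $\bar J$ with $\bar J \ge \mathcal{T}\bar J$ yields a stabilizing greedy controller together with a Lyapunov function. The sketch below checks the condition on a grid for a hypothetical one-dimensional system with candidate $\bar J(x) = 30|x|$ (all numbers are illustrative assumptions, not data from the paper):

```python
import numpy as np

# Hypothetical data: x+ = 1.2 x + u, |x| <= 4, |u| <= 1, stage cost |x| + 0.5|u|.
a, umax, xmax = 1.2, 1.0, 4.0
xs = np.linspace(-xmax, xmax, 161)
us = np.linspace(-umax, umax, 81)
Jbar = 30.0 * np.abs(xs)                  # candidate sub-optimal value function

# Check the hypothesis of Theorem IV.16 on the grid: Jbar >= T Jbar, cf. (45).
Xn = a * xs[:, None] + us[None, :]
feas = np.abs(Xn) <= xmax
TJbar = np.where(feas, np.abs(xs)[:, None] + 0.5 * np.abs(us)[None, :]
                 + np.interp(Xn, xs, Jbar), np.inf).min(axis=1)
assert np.all(TJbar <= Jbar + 1e-9)

# For Jbar(x) = 30|x| the greedy law cancels as much of a*x as the input allows.
def mu_bar(x):
    return -np.clip(a * x, -umax, umax)

# Closed loop: Jbar decreases by at least the stage cost, and x -> 0.
x = 4.0
for _ in range(15):
    u = mu_bar(x)
    x_next = a * x + u
    assert 30 * abs(x_next) <= 30 * abs(x) - abs(x) + 1e-9   # Lyapunov decrease
    x = x_next
print(abs(x))   # the state has been driven to the origin
```

This is the stabilization and optimality-bound check described after Theorem IV.16, performed pointwise on a state grid rather than on a polyhedral partition.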
B. An Efficient Algorithm for the CITOC Solution
In this section, we first describe how the properties of the solution to the rather general DP problem (26)–(28) can be exploited for an efficient computation of the infinite time solution. Subsequently, the specific implementation of the algorithm is presented.
In the rest of this paper, for some set or function we denote by a subscript 0 its restriction to the neighborhood of the origin. For instance, the restricted domain describes the domain of those PWA dynamics that are valid for the origin, while the restricted system function represents the restriction of the PWA dynamics to that domain. Furthermore, as in Section III-C, when we say SOLVE an iteration of a DP, we mean: formulate several multiparametric linear programs for it and obtain a triplet of expressions for the value function, the control law (optimizer), and the polyhedral partition of the feasible state–space
(46)
cf. Algorithms IV.18 and IV.19. By inspection of the DP problem (26)–(28), we see that at each iteration step we are solving a number of mp-LPs. After that, by using polyhedral manipulation, we have to compare all generated regions, check if they intersect, and remove the redundant ones before storing a new partition.
Under Assumptions II.1 and IV.1, all closed-loop trajectories that converge to the origin in an infinite number of time steps have to go through some of the PWA dynamics associated with the domains, and regions, that are touching the origin, cf. Example IV.17. Thus, at the beginning, instead of focusing on the whole feasible state space and the whole domain of the system, we can limit our algorithm to the neighborhood of the origin, and only after attaining convergence do we proceed with the exploration of the rest of the state–space. In this way, at each iteration
step of the DP we would, on average, have to solve a much smaller number of mp-LPs. We will call the solution to such a restricted problem the "core". Note that in general the core is a nonconvex polyhedral partition.
Any positive invariant set is a valid candidate for the core, as long as an associated control strategy is feasible and steers the state to the origin. The only prerequisite is that for any given initial feasible state, i.e., a state for which the original problem has a solution, we can reach at least one element of the core in a finite number of time steps. However, as its name says, the core is used as a "seed" or starting point for the future construction and exploration of the feasible state space. Thus, obtaining a good suboptimal solution for the core is desirable in order to limit the number of further DP iterations which improve this very part of the state–space.
The task of solving the CITOC problem (21)–(22) is split into two subproblems and respective algorithms. In the first algorithm (Algorithm IV.18) we explore the portion of the state space around the origin and construct the "core" of the infinite time solution. In the second algorithm (Algorithm IV.19), starting from the core, we build a sequence of additions, named "rings," to the core until the algorithm converges for the whole feasible state–space $\mathcal{X}_\infty$. At the end, we have the infinite time solution $\mathcal{S}_\infty$. Here the solutions $\mathcal{S}_k$, cores $\mathcal{C}_k$, and rings $\mathcal{R}_k$ denote triplets of the form given in (46).
In an ideal scenario the core would be part of an infinite time optimal solution, and every ring would also be a part of an infinite time solution. Then in all intermediate steps we would have to explore only the one-step-ahead optimal transitions from the whole domain of the PWA dynamics to the latest ring (instead of going from the whole domain of the dynamics to the initial core and all previous rings).
In practice, we are likely to observe subideal scenarios: The newly generated ring may contain polyhedra with associated cost functions that are "worse" (meaning bigger) than the infinite time solution, and thus such polyhedra will be altered in the future steps of the algorithm. As stated in Section III-C, the function INTERSECT & COMPARE removes such polyhedra that are completely covered with other polyhedra [29], [28] which have a corresponding "better" (meaning smaller) value function expression. If some polyhedron is only partially covered with better "regions," the part of it with the smaller cost can be partitioned into a set of convex polyhedra. Thus, we preserve the polyhedral nature of the feasible state space partition in each iteration of Algorithm IV.18.
Note that in the SOLVE step we are solving a smaller number of problems than in the general DP. Since we are restricting ourselves to the neighborhood of the origin, the number of regions at each step is likely to remain rather small. However, the choice of the initial value function may lead to a large number of iterations, depending on the desired precision. If a better or other initial guess for the value function is known, confer Section IV-C, it can be used to speed up Algorithm IV.18. Note that the core is a positive invariant set by construction.
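When an initial guess for the value function is available (cf. Section IV-C), the iteration can be warm-started from above. The sketch below (hypothetical data throughout) contrasts the two initializations admitted by Theorem IV.12(b) on a gridded one-dimensional toy problem: starting from $\bar J_0 \equiv 0$ (from below) and from a crude upper bound $\bar J_0(x) = 50|x|$ (from above), both approach the same fixed point:

```python
import numpy as np

# Hypothetical 1-D system x+ = 1.2 x + u, |x| <= 4, |u| <= 1, stage |x| + 0.5|u|.
a, umax, xmax = 1.2, 1.0, 4.0
xs = np.linspace(-xmax, xmax, 161)
us = np.linspace(-umax, umax, 81)

def bellman(J):
    """One application of the DP operator T on the grid, cf. (26)-(27)."""
    Xn = a * xs[:, None] + us[None, :]
    feas = np.abs(Xn) <= xmax
    total = np.where(feas, np.abs(xs)[:, None] + 0.5 * np.abs(us)[None, :]
                     + np.interp(Xn, xs, J), np.inf)
    return total.min(axis=1)

def iterate(J0, iters=400, tol=1e-9):
    J = J0
    for _ in range(iters):
        Jn = bellman(J)
        if np.max(np.abs(Jn - J)) < tol:
            break
        J = Jn
    return J

J_below = iterate(np.zeros_like(xs))     # classical initialization "from below"
J_above = iterate(50.0 * np.abs(xs))     # assumed upper bound: "from above"
i = np.argmin(np.abs(xs - 0.5))
print(abs(J_above[i] - J_below[i]))      # both runs agree at this state
```

By the monotonicity of $\mathcal{T}$, the first run increases and the second decreases toward the common fixed point, matching the "from below"/"from above" discussion around Theorem IV.12.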
Algorithm IV.18 (Generating the CORE of the Infinite Time Solution)

  INPUT: $k_{\max}$, $f_{\text{PWA}}(x,u)$, $\{\mathcal{D}_i\}_{i=1}^{n_d}$, $p$, $Q$, $R$
  OUTPUT: The core $\mathcal{C}_0$
  LET $J_0(x) := 0$, $\mu_0(x) := 0$, $\mathcal{P}_{0,1} := \mathcal{X}_0$
  LET $k \leftarrow 0$, finished $\leftarrow$ FALSE
  WHILE $k < k_{\max}$ AND NOT finished
      LET $k \leftarrow k + 1$
      FOR $i = 1$ TO $n_d$
          FOR EACH $\mathcal{P}_{k-1,j} \in \mathcal{S}_{k-1}$
              SOLVE $s_{i,j}$: $\min_u\; \|Qx\|_p + \|Ru\|_p + J_{k-1}(f_{\text{PWA}}(x,u))$
                  subject to $[x', u']' \in \mathcal{D}_i$, $f_{\text{PWA}}(x,u) \in \mathcal{P}_{k-1,j}$
          END
      END
      LET $\mathcal{S}_k \leftarrow$ INTERSECT & COMPARE $\{s_{i,j}\}$
      LET $\mathcal{S}_k \leftarrow$ RESTRICTION of $\mathcal{S}_k$ to the origin
      IF $\mathcal{S}_k = \mathcal{S}_{k-1}$ THEN finished $\leftarrow$ TRUE, $\mathcal{C}_0 \leftarrow \mathcal{S}_k$
  END

After we have constructed the initial core, $\mathcal{C}_0$, we can proceed with the exploration of the rest of the state–space as described in Algorithm IV.19.

Algorithm IV.19 (Generating the Infinite Time Solution)

  INPUT: $k_{\max}$, $f_{\text{PWA}}(x,u)$, $\{\mathcal{D}_i\}_{i=1}^{n_d}$, $\mathcal{C}_0$, $p$, $Q$, $R$
  OUTPUT: The infinite time solution $\mathcal{S}_\infty$
  LET Solution $\mathcal{S}_0 \leftarrow \mathcal{C}_0$, Ring $\mathcal{R}_0 \leftarrow \mathcal{C}_0$
  LET $k \leftarrow 0$, finished $\leftarrow$ FALSE
  WHILE $k < k_{\max}$ AND NOT finished
      LET $k \leftarrow k + 1$
      FOR $i = 1$ TO $n_d$
          FOR EACH $\mathcal{P}_{k-1,j} \in \mathcal{R}_{k-1}$
              SOLVE $s_{i,j}$: $\min_u\; \|Qx\|_p + \|Ru\|_p + J_{k-1}(f_{\text{PWA}}(x,u))$
                  subject to $[x', u']' \in \mathcal{D}_i$, $f_{\text{PWA}}(x,u) \in \mathcal{P}_{k-1,j}$
          END
      END
      LET $\mathcal{S}_k \leftarrow$ INTERSECT & COMPARE $\{\mathcal{S}_{k-1}, \{s_{i,j}\}\}$
      LET $\mathcal{C}_k \leftarrow \mathcal{S}_k \cap \mathcal{S}_{k-1}$, $\mathcal{R}_k \leftarrow \mathcal{S}_k \setminus \mathcal{C}_k$
      IF $\mathcal{R}_k = \emptyset$ THEN finished $\leftarrow$ TRUE, $J_\infty(x) \leftarrow J_k(x)$, $\mu_\infty(x) \leftarrow \mu_k(x)$, $\mathcal{S}_\infty \leftarrow \mathcal{S}_k$
  END
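Both algorithms rely on the INTERSECT & COMPARE step. In one dimension this step reduces to comparing affine cost pieces over intervals. The helper below is a hypothetical stand-in for the polyhedral routine of [28], [29]: it keeps the pointwise smaller of two affine pieces sharing an interval and splits the interval at their crossing point, mirroring how a partially covered polyhedron is partitioned into convex pieces:

```python
def compare_pieces(p, q):
    """p, q = (a, b, lo, hi): cost a*x + b on the interval [lo, hi].
    Return the pieces of the pointwise minimum as a list of such tuples."""
    a1, b1, lo, hi = p
    a2, b2, lo2, hi2 = q
    assert (lo, hi) == (lo2, hi2), "pieces must share the same interval"
    if a1 == a2:                       # parallel lines: one piece fully covers
        return [p if b1 <= b2 else q]
    xc = (b2 - b1) / (a1 - a2)         # crossing point of the two lines
    if not (lo < xc < hi):             # no interior crossing: compare at midpoint
        xm = 0.5 * (lo + hi)
        return [p if a1 * xm + b1 <= a2 * xm + b2 else q]
    left = p if a1 * lo + b1 <= a2 * lo + b2 else q   # winner on [lo, xc]
    right = q if left is p else p                     # winner on [xc, hi]
    return [(left[0], left[1], lo, xc), (right[0], right[1], xc, hi)]

# A piece is split where a cheaper piece partially covers it:
print(compare_pieces((1.0, 0.0, 0.0, 2.0), (-1.0, 3.0, 0.0, 2.0)))
# -> [(1.0, 0.0, 0.0, 1.5), (-1.0, 3.0, 1.5, 2.0)]
```

In the polyhedral setting the same comparison is carried out region by region, and the "worse" parts are discarded, which is what keeps the partition polyhedral after each DP iteration.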
Note that if Algorithm IV.18 ends successfully, and if any optimal closed-loop trajectory starting in the core happens to stay in the core for all time, then the value function computed in Algorithm IV.18 and associated with the core is in fact the optimal value function associated with this feasible set. However, if any optimal closed-loop trajectory starting in the core leaves this set in finite time, then the value function computed in Algorithm IV.18 is only the best current upper bound of the optimal value function. In the case of the latter scenario, by optimality, Algorithm IV.19 will account for this fact and improve the value function in the corresponding part of the state–space by going through additional DP iterations until optimality is reached. The same argument holds for all the computed intermediate "cores," $\mathcal{C}_k$, of Algorithm IV.19 as well.

C. Alternative Choices of the Initial Value Function

Theorem IV.12 guarantees the convergence of the dynamic programming based algorithm to the optimal solution starting from an (almost) arbitrary initial value function $\bar J_0$. This can be used in order to decrease or limit the number of iterations needed in the proposed algorithm. Moreover, at the same time an upper bound to the optimal solution can be given.
From the previous discussion, we know that $J_\infty(0) = 0$ as well as $\mu_\infty(0) = 0$; and for some set around the origin the optimal value function $J_\infty$ and the optimal control law $\mu_\infty$ are piecewise linear functions of the state $x$. In Algorithm IV.18, we limit the exploration and computation to such regions around the origin and consider only the domains of the system dynamics that touch the origin or have the origin in the interior.
Now, consider the special case of system (1) where $n_d = 1$, i.e., the linear, stabilizable system

$x(k+1) = Ax(k) + Bu(k)$  (47)

with polyhedral state and input constraints. In addition, consider for this system the stabilizing state feedback control law for states around the origin

$u(k) = Kx(k)$  (48)

where $\|\cdot\|_p$ denotes the standard Hölder matrix 1- or $\infty$-norm [38].
1) $\bar J_0$ as an Upper Bound to $J_\infty$ by Approximation: Then, for the control law (48) we have for all $x$
(49)
Fig. 12. Value function $J_\infty$ of the infinite time solution $\mathcal{S}_\infty$ for system (19). Same coloring implies the same cost value.
(50)
(51)
Clearly, for any bounded state constraint set we have that $\bar J_0$ is finitely bounded.
Note that this choice of $\bar J_0$ is of a very simple type and might serve as a "good" first try for the DP iteration procedure, but it does not necessarily belong to the class of functions which are covered by Theorem IV.12. However, finding a tight bound for all $x$, i.e.,
(52)
(53)
might be a difficult nonlinear or even infeasible problem due to condition (53), depending on the system data.
2) $\bar J_0$ as an Upper Bound to $J_\infty$ by Lyapunov Function Construction: For a given stabilizing state feedback control law (48) for system (47), we can always compute a Lyapunov function for the closed-loop system of the type $V(x) = \|Px\|_\infty$ with $P$ as proposed in [39], [40]. In order to guarantee that such a Lyapunov function is always an upper bound to $J_\infty$, the scaling
(54)
with a sufficiently large scaling factor needs to be performed. Due to stabilizability³ of the system (47) one can always design a controller gain $K$ such that the closed-loop system matrix $A + BK$ has at least one real eigenvector with its corresponding real eigenvalue. Then, it follows for an initial condition along this eigenvector that
(55)
(56)
³We need to assume that at least one mode of the system (47) can be influenced by the feedback controller (48).
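The scaling idea of Section IV-C-2 can be checked numerically on a hypothetical one-dimensional instance of (47)–(48): with closed-loop dynamics $x^+ = (a + K)x = \lambda x$, $|\lambda| < 1$, the stage cost per step is $c|x|$ for a constant $c$, so the accumulated closed-loop cost is the geometric series $c|x(0)|/(1-\lambda)$. Scaling the Lyapunov function $|x|$ by $\alpha = c/(1-\lambda)$ therefore upper-bounds the cost of this feasible stabilizing policy near the origin, and hence the optimal cost. All numbers below are illustrative assumptions:

```python
# Hypothetical 1-D instance of (47)-(48): x+ = 1.2 x + u, feedback u = K x
# with K = -0.7, so the closed loop is x+ = 0.5 x (eigenvalue lam = 0.5).
a, K = 1.2, -0.7
lam = a + K                            # closed-loop eigenvalue, |lam| < 1
c = 1.0 + 0.5 * abs(K)                 # per-step stage cost |x| + 0.5|Kx| = c|x|
alpha = c / (1.0 - lam)                # scaling from the geometric-series bound

def Jbar0(x):                          # candidate initial value function
    return alpha * abs(x)

# Accumulate the actual closed-loop cost from x0 = 1 and compare with Jbar0.
x, total = 1.0, 0.0
for _ in range(60):
    u = K * x
    total += abs(x) + 0.5 * abs(u)     # stage cost of this step
    x = lam * x
assert total <= Jbar0(1.0) + 1e-9      # Jbar0 upper-bounds the policy's cost
print(round(total, 6), Jbar0(1.0))     # -> 2.7 2.7
```

Here the bound is tight because the policy's cost is exactly the geometric series; for a generic Lyapunov function the scaled bound in (54) is merely an over-approximation.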
Fig. 13. State–space partition of the value function and of the piecewise affine control law of system (19). Same coloring in (a)–(e) implies the same cost value. (a) Value function of the “inner core” C . (b) Value function at the intermediate construction step k = 4. (c) “Core” of the value function at the intermediate construction step k = 4. (d) “Ring” of the value function at the intermediate construction step k = 4. (e) Value function of the infinite time solution. (f) PWA control law of the infinite time solution.
Consequently, for the choice of
(57)
we have $\bar J_0(x) \ge J_\infty(x)$ for all $x$. Another way to find an appropriately scaled $\bar J_0$ is proposed in [41].
A $\bar J_0$ of the type described by (54) and (57) fulfills the prerequisites of Corollary IV.13, and thus using it as the initial value function of the dynamic programming iteration guarantees convergence to the optimal solution.
3) $\bar J_0$ as an Upper Bound to $J_\infty$ by Lyapunov Function Construction for PWA Systems: Another, rather involved, Lyapunov function construction for the considered class of PWA systems (1) is described in [42]. However, one would still need to scale
the computed Lyapunov function in a similar way as described in the previous section in order to guarantee that the obtained $\bar J_0$ is an upper bound to the optimal solution $J_\infty$. Again, this fulfills the prerequisites of Corollary IV.13 and, thus, serves as an initial value function of the dynamic programming iteration and guarantees convergence to the optimal solution.

D. Example: Constrained PWA System

Consider again the constrained piecewise affine system (19) from Section III-F, for which the CITOC problem (21)–(22) is solved. As described in Section IV-B, the algorithm is divided into two parts: First, the so-called "inner core" is constructed via a dynamic programming approach (Algorithm IV.18). After this inner core has converged it serves as a "seed" (or current upper bound of the optimal value function restricted to this particular part of the state–space) for the second part of the algorithm (Algorithm IV.19), where from the seed the rest of the feasible state space is explored until the piecewise affine value function does not change for the whole feasible state space for two successive steps in the exploration procedure.
For the PWA system (19) the inner core is computed in 11.6 s in five iteration steps. Fig. 13(a) shows the state–space partition, comprising 10 polyhedral regions, of the value function of the inner core. Fig. 13(b) shows the state–space partition of the value function, with 188 polyhedral regions, at an intermediate step of the second part of the algorithm. Fig. 13(c) shows the state–space partition of the current optimal upper bound of the infinite time value function, which does not change at that intermediate step and consists of 104 regions. This can be viewed as the new "core" from which the "ring" in Fig. 13(d) was computed in the subsequent step. After the second part of the algorithm has explored the whole feasible state–space and the value function does not change over two successive steps (Figs. 13(e) and 12), the infinite time solution is obtained. The constructed state–space partition consists of 252 polyhedral regions. The PWA control law of the infinite time solution consists of 23 different piecewise affine expressions and is depicted in Fig. 13(f). The infinite time solution to (21)–(22) for this example was obtained in 184 s on a Pentium 4, 2.2-GHz machine running MATLAB 6.1. This shows the efficiency of the proposed algorithm compared to the approach given in Section III or [12], where the computation of the infinite time solution took 1515 s on the same machine.

APPENDIX
ALTERNATIVE PROOF OF LEMMA IV.5(A)

Note that we do not need to impose continuity of $J_\infty$ at $x = 0$. It is sufficient to assume that the CITOC solution $J_\infty^*$ is bounded and continuous at $x = 0$, and therefore there always exists a $\mathcal{K}$-class function (see Lemma IV.3) bounding $J_\infty^*$ from above.
We have $J_0 \equiv 0 \le J_\infty^*$, and according to [26, Sec. 3] $J_\infty^*$ fulfills the Bellman equation. Thus it follows, together with the monotonicity property of $\mathcal{T}$ [26, Lemma 1.1.1], that $J_k \le J_\infty^*$ for all $k$, and consequently $J_\infty \le J_\infty^*$. Thus, it is automatic that $J_\infty$ is bounded on $\mathcal{X}_\infty$ and continuous at $x = 0$.
REFERENCES
[1] A. van der Schaft and H. Schumacher, An Introduction to Hybrid Dynamical Systems, ser. Lecture Notes in Control and Information Sciences, M. Thoma, Ed. New York: Springer-Verlag, 2000, vol. 251.
[2] W. P. M. H. Heemels, "Linear complementarity systems: A study in hybrid dynamics," Ph.D. dissertation, Technische Univ. Eindhoven, Eindhoven, The Netherlands, Nov. 1999.
[3] E. D. Sontag, "Nonlinear regulation: The piecewise linear approach," IEEE Trans. Autom. Control, vol. AC-26, no. 2, pp. 346–358, Apr. 1981.
[4] J. Lygeros, C. Tomlin, and S. Sastry, "Controllers for reachability specifications for hybrid systems," Automatica, vol. 35, no. 3, pp. 349–370, 1999.
[5] M. S. Branicky and G. Zhang, "Solving hybrid control problems: Level sets and behavioral programming," in Proc. Amer. Control Conf., Chicago, IL, Jun. 2000, vol. 2, pp. 1175–1180.
[6] A. Bemporad, F. Borrelli, and M. Morari, "Optimal controllers for hybrid systems: Stability and piecewise linear explicit form," in Proc. Conf. Decision Control, Sydney, Australia, Dec. 2000, vol. 2, pp. 1810–1815.
[7] F. Borrelli, Constrained Optimal Control of Linear and Hybrid Systems, ser. Lecture Notes in Control and Information Sciences. New York: Springer-Verlag, 2003, vol. 290.
[8] M. Johansson, Piecewise Linear Control Systems: A Computational Approach, ser. Lecture Notes in Control and Information Sciences, M. Thoma and M. Morari, Eds. New York: Springer-Verlag, 2003, vol. 284.
[9] W. P. M. Heemels, B. De Schutter, and A. Bemporad, "Equivalence of hybrid dynamical models," Automatica, vol. 37, no. 7, pp. 1085–1091, 2001.
[10] V. Dua and E. N. Pistikopoulos, "An algorithm for the solution of multiparametric mixed integer linear programming problems," Ann. Oper. Res., vol. 99, pp. 123–139, 2000.
[11] F. Borrelli, M. Baotić, A. Bemporad, and M. Morari, "An efficient algorithm for computing the state feedback solution to optimal control of discrete time hybrid systems," in Proc. Amer. Control Conf., Denver, CO, Jun. 2003, vol. 6, pp. 4717–4722.
[12] M. Baotić, F. J. Christophersen, and M. Morari, "A new algorithm for constrained finite time optimal control of hybrid systems with a linear performance index," in Proc. Eur. Control Conf., Cambridge, U.K., Sep. 2003.
[13] E. C. Kerrigan and D. Q. Mayne, "Optimal control of constrained, piecewise affine systems with bounded disturbances," in Proc. Conf. Decision Control, Las Vegas, NV, Dec. 2002, pp. 1552–1557.
[14] D. Q. Mayne, J. B. Rawlings, C. V. Rao, and P. O. M. Scokaert, "Constrained model predictive control: Stability and optimality," Automatica, vol. 36, no. 6, pp. 789–814, Jun. 2000.
[15] A. Bemporad and M. Morari, "Control of systems integrating logic, dynamics, and constraints," Automatica, vol. 35, no. 3, pp. 407–427, Mar. 1999.
[16] P. Grieder, M. Kvasnica, M. Baotić, and M. Morari, "Low complexity control of piecewise affine systems with stability guarantee," in Proc. Amer. Control Conf., Boston, MA, Jun. 2004, vol. 2, pp. 1196–1201.
[17] M. Sznaier and M. J. Damborg, "Suboptimal control of linear systems with state and control inequality constraints," in Proc. Conf. Decision Control, Dec. 1987, vol. 1, pp. 761–762.
[18] R. R. Bitmead, M. Gevers, and V. Wertz, Adaptive Optimal Control: The Thinking Man's GPC, ser. International Series in Systems and Control Engineering. Upper Saddle River, NJ: Prentice-Hall, 1990.
[19] P. Grieder, F. Borrelli, F. Torrisi, and M. Morari, "Computation of the constrained infinite time linear quadratic regulator," Automatica, vol. 40, pp. 701–708, 2004.
[20] G. Ferrari-Trecate, M. Muselli, D. Liberati, and M. Morari, "A clustering technique for the identification of piecewise affine systems," Automatica, vol. 39, no. 2, pp. 205–217, Feb. 2003.
[21] J. Roll, A. Bemporad, and L. Ljung, "Identification of piecewise affine systems via mixed-integer programming," Automatica, vol. 40, pp. 37–50, 2004.
[22] D. Q. Mayne, "Constrained optimal control," in Eur. Control Conf., Sep. 2001, Plenary Lecture.
[23] R. Bellman, Dynamic Programming. Princeton, NJ: Princeton Univ. Press, 1957.
[24] R. E. Bellman and S. E. Dreyfus, Applied Dynamic Programming, 2nd ed. Princeton, NJ: Princeton Univ. Press, 1962.
[25] D. P. Bertsekas, Dynamic Programming and Optimal Control, 2nd ed. Belmont, MA: Athena Scientific, 2000, vol. I.
[26] D. P. Bertsekas, Dynamic Programming and Optimal Control, 2nd ed. Belmont, MA: Athena Scientific, 2001, vol. II.
[27] D. P. Bertsekas and S. Shreve, Stochastic Optimal Control: The Discrete-Time Case. Belmont, MA: Athena Scientific, 1996.
[28] M. Kvasnica, P. Grieder, and M. Baotić, Multi-Parametric Toolbox (MPT). Zurich, Switzerland: ETH, 2004 [Online]. Available: http://control.ee.ethz.ch/~mpt/
[29] M. Baotić, "Optimal control of piecewise affine systems—A multi-parametric approach," Dr.Sc. thesis, ETH Zurich, Zurich, Switzerland, Mar. 2005 [Online]. Available: http://control.ee.ethz.ch/index.cgi?page=publications;action=details;id=2235
[30] V. D. Blondel and J. N. Tsitsiklis, "A survey of computational complexity results in systems and control," Automatica, vol. 36, no. 9, pp. 1249–1274, 2000.
[31] M. Vidyasagar, Nonlinear Systems Analysis, 2nd ed. Upper Saddle River, NJ: Prentice-Hall, 1993.
[32] G. C. Goodwin, M. M. Seron, and J. A. De Doná, Constrained Control and Estimation: An Optimisation Approach, ser. Communications and Control Engineering, E. D. Sontag, M. Thoma, A. Isidori, and J. H. van Schuppen, Eds. London, U.K.: Springer-Verlag, 2005.
[33] M. Lazar, W. P. M. H. Heemels, S. Weiland, and A. Bemporad, "Nonsmooth model predictive control: Stability and applications to hybrid systems," Eindhoven Univ. Technol., Eindhoven, The Netherlands, Sep. 2005 [Online]. Available: http://www.cs.ele.tue.nl/MLazar/TechnicalReport05.pdf
[34] P. O. M. Scokaert, J. B. Rawlings, and E. S. Meadows, "Discrete-time stability with perturbations: Application to model predictive control," Automatica, vol. 33, no. 3, pp. 463–470, Mar. 1997.
[35] N. L. Stokey and R. E. Lucas, Recursive Methods in Economic Dynamics. Cambridge, MA: Harvard Univ. Press, 1989.
[36] E. D. Sontag, "A 'universal' construction of Artstein's theorem on nonlinear stabilization," Syst. Control Lett., vol. 13, pp. 117–123, 1989.
[37] M. Krstić, I. Kanellakopoulos, and P. Kokotović, Nonlinear and Adaptive Control Design, ser. Adaptive and Learning Systems for Signal Processing, Communications, and Control, S. Haykin, Ed. New York: Wiley, 1995.
[38] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge, U.K.: Cambridge Univ. Press, 1985.
[39] A. Polanski, "On infinity norms as Lyapunov functions for linear systems," IEEE Trans. Autom. Control, vol. 40, no. 7, pp. 1270–1274, Jul. 1995.
[40] F. J. Christophersen and M. Morari, "Further results on 'infinity norms as Lyapunov functions for linear systems'," IEEE Trans. Autom. Control, 2007, to be published.
[41] A. Bemporad, F. Borrelli, and M. Morari, "Model predictive control based on linear programming—The explicit solution," IEEE Trans. Autom. Control, vol. 47, no. 12, pp. 1974–1985, Dec. 2002.
[42] M. Lazar, W. Heemels, S. Weiland, A. Bemporad, and O. Pastravanu, "Infinity norms as Lyapunov functions for model predictive control of constrained PWA systems," in Proc. Int. Workshop on Hybrid Systems: Computation and Control, ser. Lecture Notes in Computer Science, vol. 3414. Zurich, Switzerland: Springer-Verlag, Mar. 2005.
Mato Baotić received the B.Sc. and M.Sc. degrees, both in electrical engineering, from the Faculty of Electrical Engineering and Computing (FER Zagreb), University of Zagreb, Zagreb, Croatia, in 1997 and 2000, respectively, and the Ph.D. degree from ETH Zurich, Zurich, Switzerland, in 2005. As a recipient of the ESKAS scholarship of the Swiss Government, he was a Visiting Researcher at the Automatic Control Laboratory, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland, during the academic year 2000–2001. Currently, he is a Postdoctoral Researcher with the Department of Control and Computer Engineering, FER Zagreb, Croatia. His research interests include mathematical programming, hybrid systems, optimal control, and model predictive control.
Frank J. Christophersen received the diploma in engineering cybernetics from the University of Stuttgart, Stuttgart, Germany, in 2002, and the Ph.D. degree from ETH Zurich, Zurich, Switzerland, in 2006. He received a research scholarship from the German Academic Exchange Service (DAAD) and an Erasmus/Socrates scholarship in 2000. He conducted his diploma thesis research at the Faculty of Applied Mathematics, University of Twente, Enschede, The Netherlands, in 2000–2001. From 2001 to 2002, he worked at the DaimlerChrysler Research Department for System Dynamics of Vehicles in Stuttgart, Germany. His research interests are in optimal and robust control and the analysis of hybrid systems.
Manfred Morari (M’84–F’06) received the diploma from ETH Zurich, Zurich, Switzerland, and the Ph.D. degree from the University of Minnesota, Minneapolis, both in chemical engineering, in 1974 and 1977, respectively. He was appointed Head of the Automatic Control Laboratory at ETH Zurich in 1994. Before that, he was the McCollum–Corcoran Professor of Chemical Engineering and Executive Officer for Control and Dynamical Systems at the California Institute of Technology, Pasadena. His interests are in hybrid systems and the control of biomedical systems. Dr. Morari received numerous awards in recognition of his research contributions, among them the Donald P. Eckman Award of the Automatic Control Council, the Allan P. Colburn Award and the Professional Progress Award of the AIChE, the Curtis W. McGraw Research Award of the ASEE, Doctor Honoris Causa from Babes-Bolyai University, and the IEEE Control Systems Field Award. He was elected to the National Academy of Engineering. He has held appointments with Exxon and ICI plc and serves on the technical advisory board of several major corporations.