© 2004 Society for Industrial and Applied Mathematics

SIAM J. CONTROL OPTIM. Vol. 43, No. 3, pp. 1120–1150

STOCHASTIC LINEAR-QUADRATIC CONTROL WITH CONIC CONTROL CONSTRAINTS ON AN INFINITE TIME HORIZON∗

XI CHEN† AND XUN YU ZHOU‡

Abstract. This paper is concerned with a stochastic linear-quadratic (LQ) control problem in the infinite time horizon where the control is constrained in a given, arbitrary closed cone, the cost weighting matrices are allowed to be indefinite, and the state is scalar-valued. First, the (mean-square, conic) stabilizability of the system is defined, which is then characterized by a set of simple conditions involving linear matrix inequalities (LMIs). Next, the issue of well-posedness of the underlying optimal LQ control, which is necessitated by the indefiniteness of the problem, is addressed in great detail, and necessary and sufficient conditions of the well-posedness are presented. On the other hand, to address the LQ optimality two new algebraic equations à la Riccati, called extended algebraic Riccati equations (EAREs), along with the notion of their stabilizing solutions, are introduced for the first time. Optimal feedback control as well as the optimal value are explicitly derived in terms of the stabilizing solutions to the EAREs. Moreover, several cases when the stabilizing solutions do exist are discussed and algorithms of computing the solutions are presented. Finally, numerical examples are provided to illustrate the theoretical results established.

Key words. stochastic linear-quadratic control, infinite time horizon, constrained control, cone, (conic) stabilizability, well-posedness, extended algebraic Riccati equations

AMS subject classifications. 93E20, 93E15, 49K40

DOI. 10.1137/S0363012903429529

1. Introduction. Linear-quadratic (LQ) control, pioneered by Kalman [17] for deterministic systems and extended to stochastic systems by Wonham [31, 32] and Bismut [5], constitutes, in both theory and applications, an extremely important class of control problems. In recent years, there has been considerable renewed interest in stochastic LQ control. In particular, the notion of mean-square stabilizability and detectability was introduced in [12]. On the other hand, initiated by Chen, Li, and Zhou [10], extensive research has been carried out in the so-called indefinite stochastic LQ control, where, quite contrary to the conventional belief, the cost weighting matrices are allowed to be indefinite; see [11, 2, 1, 33]. Moreover, this new theory has found applications in financial portfolio selection; see [35, 19, 18]. For systematic accounts of the deterministic and stochastic LQ theory, refer to [4] and [34], respectively.

A key assumption in the LQ theory at large, deterministic and stochastic alike, is that the control variable is unconstrained. This assumption renders the feedback control constructed via the Riccati equation automatically admissible, and in turn (along with the underlying LQ structure) makes possible the elegant explicit solution to the optimal LQ control problem. Because of this, the whole conventional LQ approach would collapse in the presence of any control constraint.

∗ Received by the editors June 9, 2003; accepted for publication (in revised form) March 8, 2004; published electronically November 17, 2004. http://www.siam.org/journals/sicon/43-3/42952.html
† Center for Intelligent and Networked Systems, Tsinghua University, Beijing, China ([email protected]). The work of this author was done while she was a Postdoctoral Fellow at the Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong.
‡ Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, NT, Hong Kong ([email protected]). This author's work was supported by RGC Earmarked grants CUHK 4175/00E and CUHK 4234/01E.


On the other hand, from a practical point of view, LQ control with control constraints is a well-defined and sensible problem. For example, in many real applications the control variable is required to take only nonnegative values. The mean-variance portfolio selection problem with short-selling prohibition exemplifies such problems. Other applications include models in medicine, chemistry, and economics where system inputs are inherently constrained.

There have been some attempts at dealing with deterministic LQ problems with positive controls or, more generally, with controls contained in a given cone. For example, controllability of the linear system $\dot x = Ax + Bu$ with positive/conic controls was studied in [26, 7, 25, 14, 8, 22]. These papers investigated the necessary and sufficient conditions of different types of controllability (null-controllability, global controllability, differential controllability, etc.). Later, conic stabilization was addressed in [23]. In a recent work on positive feedback stabilization [30], a stabilizing positive feedback controller was derived based on the pole placement technique. Deterministic continuous-time LQ optimal control problems with positive controllers were studied in [21, 9, 13]. Discrete-time versions can be found in [27, 28]. In these works, however, only some necessary and sufficient conditions for optimality were derived based on Pontryagin's maximum principle and/or Bellman's dynamic programming, and some numerical schemes were suggested. The special LQ structure was not fully taken advantage of, and no explicit result comparable to those of unconstrained control was obtained.

As for the constrained stochastic LQ control, to the best of our knowledge it has never been studied by anyone else in the literature. A related, albeit specific, problem is the mean-variance portfolio selection model with no-shorting constraint solved in Li, Zhou, and Lim [18], which was formulated as a stochastic LQ control problem with positive controls in a finite time horizon. The approach developed in [18] is nevertheless rather ad hoc (via the Hamilton–Jacobi–Bellman equation and viscosity solution theory) and by no means suggests a remedy for a more general problem. More recently, a stochastic LQ control problem with conic control constraint, random coefficients, as well as possibly singular cost weighting matrices in a finite time horizon was solved by Hu and Zhou [15], with explicit solutions based on Tanaka's formula and the backward stochastic differential equation theory.

In this paper, we study stochastic LQ control in the infinite time horizon, where the control variable is constrained in a cone (which includes the problem with positive controls as a special case). Moreover, the problem is allowed to be indefinite in the sense that the cost weighting matrices are possibly indefinite. A main assumption of the paper is that the state variable is scalar-valued. Note that this assumption is valid in many meaningful practical applications, in particular in the area of finance where the one-dimensional wealth process is typically taken as the state. The investigation in this paper centers around several key issues associated with the problem, namely, conic stabilizability, well-posedness, and optimality. Conic stabilizability refers to the question of whether the system can be stabilized by a control satisfying the given conic constraint.
It arises from the infinity of the time horizon under consideration, and is quite different from the normal stabilizability for unconstrained control. In this paper we will derive simple necessary and sufficient conditions for the conic stabilizability. The second issue, well-posedness of the LQ problem, becomes an issue because the problem is indefinite. To ensure well-posedness the problem data must coordinate well, which will be characterized in this paper by the nonemptiness of certain sets in the real space. Finally, for optimality, we aim to obtain explicit solutions comparable


to those classical unconstrained-control counterparts. To this end, we will introduce, for the first time in this paper, two algebraic equations termed the extended algebraic Riccati equations (EAREs) along with the notion of their stabilizing solutions. Then it will be shown that the existence of the stabilizing solutions is sufficient for the existence of optimal feedback control of the constrained LQ problem, and explicit forms of the optimal feedback control as well as the optimal cost value will be derived in terms of the stabilizing solutions. Furthermore, several important cases, including that of the definite LQ control, will be discussed where stabilizing solutions to the EAREs do exist, and algorithms for computing these solutions will be presented. To demonstrate the theoretical results obtained, numerical examples will be given.

The rest of the paper is organized as follows. In section 2, the constrained stochastic LQ control problem is formulated and conic stabilizability of the system defined. As a prelude to the main analysis, two technical lemmas are presented in section 3. The subsequent three sections, sections 4, 5, and 6, are devoted to the three major issues, namely, stabilizability, well-posedness, and optimality, respectively. Numerical examples are reported in section 7. Finally, section 8 concludes the paper.

2. Problem formulation. Notation. We make use of the following basic notation in this paper:
$\mathbf{R}^n$: the set of $n$-dimensional column vectors.
$\mathbf{R}$: $= \mathbf{R}^1$.
$\mathbf{R}^n_+$: the subset of $\mathbf{R}^n$ consisting of elements with nonnegative components.
$\mathbf{R}^{m\times n}$: the set of all $m \times n$ matrices.
$\mathcal{S}^{n\times n}$: the set of all $n \times n$ symmetric matrices.
$|v|$: $= \sqrt{v'v}$, $v \in \mathbf{R}^n$.
$x^+$: $= \max\{x, 0\}$, $x \in \mathbf{R}$.
$x^-$: $= \max\{-x, 0\}$, $x \in \mathbf{R}$.
$M'$: the transpose of a matrix $M$.
$M > 0$: the square matrix $M$ is positive definite.
$M \ge 0$: the square matrix $M$ is positive semidefinite.
$E[x]$: the expectation of a random variable $x$.

Let $(\Omega, \mathcal{F}, P; \mathcal{F}_t)$ be a given filtered probability space with a standard $\mathcal{F}_t$-adapted, $k$-dimensional Brownian motion $w(t) \equiv (w_1(t), w_2(t), \dots, w_k(t))$ on $[0, +\infty)$. Let $\Gamma \subseteq \mathbf{R}^m$ be a given closed cone; i.e., $\Gamma$ is closed, and if $u \in \Gamma$, then $\alpha u \in \Gamma$ $\forall \alpha \ge 0$. Typical examples of such a cone are $\Gamma = \mathbf{R}^m_+$, $\Gamma = \{u \in \mathbf{R}^m \mid Mu \le 0\}$, and $\Gamma = \{u \in \mathbf{R}^m \mid Mu = 0\}$, where $M \in \mathbf{R}^{n\times m}$, or the so-called second-order cone (cf., e.g., [20, p. 221]) $\Gamma = \{(t, u) \in \mathbf{R}\times\mathbf{R}^{m-1} \mid t \ge |u|\}$. Next, for $M \in \mathcal{S}^{m\times m}$, if there is $\delta > 0$ so that $K'MK \ge \delta|K|^2$ $\forall 0 \ne K \in \Gamma$, then we denote $M|_\Gamma > 0$. Similarly we write $M|_\Gamma \ge 0$ if $K'MK \ge 0$ $\forall K \in \Gamma$. Clearly $M > 0$ (respectively, $M \ge 0$) implies $M|_\Gamma > 0$ (respectively, $M|_\Gamma \ge 0$). Finally, define the following Hilbert space:
\[
L^2_{\mathcal{F}}(\Gamma) = \left\{ \phi(\cdot,\cdot) : [0,+\infty)\times\Omega \to \Gamma \;\Big|\; \phi(\cdot,\cdot) \text{ is } \mathcal{F}_t\text{-adapted, measurable, and } E\int_0^{+\infty}|\phi(t,\omega)|^2\,dt < +\infty \right\}
\]
with the norm $\|\phi(\cdot,\cdot)\| := \big(E\int_0^{+\infty}|\phi(t,\omega)|^2\,dt\big)^{1/2}$.


Consider the Itô stochastic differential equation (SDE)
\[
(1)\qquad
\begin{cases}
dx(t) = [Ax(t) + Bu(t)]\,dt + \displaystyle\sum_{j=1}^{k}[C_j x(t) + D_j u(t)]\,dw_j(t), & t \in [0, \infty),\\[1ex]
x(0) = x_0 \in \mathbf{R},
\end{cases}
\]

where $A, C_j \in \mathbf{R}$ and $B, D_j \in \mathbf{R}^{1\times m}$. A process $u(\cdot)$ is called a control (with conic constraint) if $u(\cdot) \in L^2_{\mathcal{F}}(\Gamma)$.

Definition 2.1. A control $u(\cdot) \in L^2_{\mathcal{F}}(\Gamma)$ is called (mean-square, conic) stabilizing with respect to $x_0$ if the corresponding state $x(\cdot)$ of (1) with the initial state $x_0$ satisfies $\lim_{t\to+\infty}E|x(t)|^2 = 0$.

Definition 2.2. System (1) is said to be (mean-square, conic) stabilizable if there is a feedback control of the form $u(t) = K_+x^+(t) + K_-x^-(t)$, where $K_+$ and $K_-$ are constant vectors with $K_+ \in \Gamma$ and $K_- \in \Gamma$, which is (mean-square, conic) stabilizing with respect to every initial state $x_0$.

Now, for any $x_0 \in \mathbf{R}$, we define the set of admissible controls
\[
(2)\qquad \mathcal{U}_{x_0} := \{u(\cdot) \in L^2_{\mathcal{F}}(\Gamma) \mid u(\cdot) \text{ is stabilizing with respect to } x_0\}.
\]
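Definitions 2.1 and 2.2 are easy to experiment with numerically. The sketch below (not part of the paper) discretizes (1) with a plain Euler–Maruyama scheme under a conic feedback u(t) = K_+x^+(t) + K_-x^-(t) and estimates E|x(t)|^2 by Monte Carlo; Γ = R^m_+ is assumed, and all numerical data in the example are hypothetical placeholders.

```python
import numpy as np

def simulate_mean_square(A, B, C, D, K_plus, K_minus, x0,
                         T=10.0, dt=1e-3, n_paths=2000, seed=0):
    """Euler-Maruyama simulation of dx = (A x + B u) dt + sum_j (C_j x + D_j u) dw_j
    under the conic feedback u = K_plus * x^+ + K_minus * x^-.
    Returns the Monte Carlo estimate of E|x(T)|^2."""
    rng = np.random.default_rng(seed)
    k = len(C)                               # number of Brownian motions
    x = np.full(n_paths, float(x0))
    for _ in range(int(T / dt)):
        xp, xm = np.maximum(x, 0.0), np.maximum(-x, 0.0)
        u = np.outer(K_plus, xp) + np.outer(K_minus, xm)   # shape (m, n_paths)
        drift = A * x + B @ u
        dw = rng.normal(scale=np.sqrt(dt), size=(k, n_paths))
        diffusion = sum((C[j] * x + D[j] @ u) * dw[j] for j in range(k))
        x = x + drift * dt + diffusion
    return np.mean(x ** 2)

# Hypothetical data (m = 2, k = 1), Gamma = R^2_+: both gains are componentwise
# nonnegative, hence admissible feedback gains for the cone constraint.
if __name__ == "__main__":
    est = simulate_mean_square(A=0.5, B=np.array([-1.0, 1.0]),
                               C=[0.2], D=[np.zeros(2)],
                               K_plus=np.array([1.0, 0.0]),
                               K_minus=np.array([0.0, 1.0]), x0=1.0)
    print("estimated E|x(T)|^2:", est)
```

For the hypothetical data above one can check that F_+(K_+) = F_-(K_-) = -0.96 < 0 in the sense of (11)–(12) below, so the estimate should decay roughly like e^{-0.96 t}.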

If $u(\cdot) \in \mathcal{U}_{x_0}$ and $x(\cdot)$ is the corresponding solution of (1), then $(x(\cdot), u(\cdot))$ is called an admissible pair (with respect to $x_0$). For each $(x_0, u(\cdot)) \in \mathbf{R}\times\mathcal{U}_{x_0}$ the associated cost to system (1) is
\[
(3)\qquad J(x_0; u(\cdot)) := E\int_0^{+\infty}[Qx(t)^2 + u(t)'Ru(t)]\,dt,
\]

where $Q \in \mathbf{R}$ and $R \in \mathcal{S}^{m\times m}$. Note that here $Q$ and $R$ are not assumed to be nonnegative/positive semidefinite. As a result, $J(x_0; u(\cdot))$ is not necessarily bounded below. The indefinite LQ control problem with conic constraint entails minimizing the cost functional (3), for a given $x_0$, subject to (1) and $u(\cdot) \in \mathcal{U}_{x_0}$. Such a problem is denoted as problem (LQ). An admissible control $u(\cdot) \in \mathcal{U}_{x_0}$ is called optimal (with respect to $x_0$) if $u(\cdot)$ achieves the infimum of (3), and in this case problem (LQ) is also referred to as attainable (with respect to $x_0$). The value function $V$ is defined as
\[
(4)\qquad V(x_0) := \inf_{u(\cdot)\in\mathcal{U}_{x_0}} J(x_0; u(\cdot)), \quad x_0 \in \mathbf{R},
\]

where $V(x_0)$ is set to be $+\infty$ in the case when $\mathcal{U}_{x_0}$ is empty.

Definition 2.3. Problem (LQ) is called well-posed if
\[
(5)\qquad V(x_0) > -\infty \quad \forall x_0 \in \mathbf{R}.
\]

It is well known that V is a continuous, though not necessarily differentiable, function when problem (LQ) is well-posed. Also note that a well-posed problem is not necessarily attainable with respect to any x0 (see Example 6.2).


3. Two lemmas. In this section we present two lemmas that are useful in what follows.

Lemma 3.1 (Tanaka's formula). Let $X(t)$ be a continuous semimartingale. Then
\[
(6)\qquad
\begin{aligned}
dX^+(t) &= 1_{(X(t)>0)}\,dX(t) + \tfrac{1}{2}\,dL(t),\\
dX^-(t) &= -1_{(X(t)\le 0)}\,dX(t) + \tfrac{1}{2}\,dL(t),
\end{aligned}
\]
where $L(\cdot)$ is an increasing continuous process, called the local time of $X(\cdot)$ at 0, satisfying
\[
(7)\qquad \int_0^t |X(s)|\,dL(s) = 0, \quad \text{P-a.s.}
\]

In particular, $X^+(t)$ and $X^-(t)$ are semimartingales.

Proof. See, for example, [24, Chapter VI, Theorem 1.2 and Proposition 1.3].

Lemma 3.2. Let constants $N_+, N_- \in \mathbf{R}$ be given. Then for any admissible pair $(x(\cdot), u(\cdot))$ with respect to $x_0$, we have, for every $t \ge 0$,
\[
(8)\qquad
\begin{aligned}
&E\int_0^t[Qx(s)^2 + u(s)'Ru(s)]\,ds\\
&= N_+(x_0^+)^2 + N_-(x_0^-)^2 - E[N_+x^+(t)^2] - E[N_-x^-(t)^2]\\
&\quad + E\int_0^t\Bigg\{Qx(s)^2 + \Big(2A + \sum_{j=1}^k C_j^2\Big)N_+x^+(s)^2 + \Big(2A + \sum_{j=1}^k C_j^2\Big)N_-x^-(s)^2\\
&\qquad\qquad + u(s)'\Big[R + 1_{(x(s)>0)}N_+\sum_{j=1}^k D_j'D_j + 1_{(x(s)\le 0)}N_-\sum_{j=1}^k D_j'D_j\Big]u(s)\\
&\qquad\qquad + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)u(s)N_+x^+(s) - 2\Big(B + \sum_{j=1}^k C_jD_j\Big)u(s)N_-x^-(s)\Bigg\}\,ds.
\end{aligned}
\]

Proof. Let $x(\cdot)$ be the solution of (1) under an arbitrary $u(\cdot) \in \mathcal{U}_{x_0}$. By Lemma 3.1, we have
\[
\begin{aligned}
dx^+(t) &= 1_{(x(t)>0)}[Ax(t) + Bu(t)]\,dt + 1_{(x(t)>0)}\sum_{j=1}^k[C_jx(t) + D_ju(t)]\,dw_j(t) + \tfrac{1}{2}\,dL(t),\\
dx^-(t) &= -1_{(x(t)\le 0)}[Ax(t) + Bu(t)]\,dt - 1_{(x(t)\le 0)}\sum_{j=1}^k[C_jx(t) + D_ju(t)]\,dw_j(t) + \tfrac{1}{2}\,dL(t).
\end{aligned}
\]


In the above equations, $L(\cdot)$ is the local time as specified in Lemma 3.1. Applying Itô's formula, we get
\[
(9)\qquad
\begin{aligned}
d[N_+x^+(t)^2] &= 2N_+x^+(t)\Big\{1_{(x(t)>0)}[Ax(t)+Bu(t)]\,dt + 1_{(x(t)>0)}\sum_{j=1}^k[C_jx(t)+D_ju(t)]\,dw_j(t) + \tfrac{1}{2}\,dL(t)\Big\}\\
&\quad + 1_{(x(t)>0)}N_+\sum_{j=1}^k[C_jx(t)+u(t)'D_j'][C_jx(t)+D_ju(t)]\,dt\\
&= N_+\Big\{2Ax^+(t)^2 + 2Bu(t)x^+(t) + 1_{(x(t)>0)}\sum_{j=1}^k(C_jx(t)+u(t)'D_j')(C_jx(t)+D_ju(t))\Big\}\,dt\\
&\quad + \sum_{j=1}^k N_+[2C_jx^+(t)^2 + 2D_ju(t)x^+(t)]\,dw_j(t)\\
&= \Big[\Big(2A + \sum_{j=1}^k C_j^2\Big)N_+x^+(t)^2 + 2Bu(t)N_+x^+(t) + 2\sum_{j=1}^k C_jD_ju(t)N_+x^+(t)\\
&\qquad + 1_{(x(t)>0)}N_+u(t)'\sum_{j=1}^k D_j'D_ju(t)\Big]dt + \sum_{j=1}^k N_+\big[2C_jx^+(t)^2 + 2D_ju(t)x^+(t)\big]\,dw_j(t),
\end{aligned}
\]
where we have used the fact that $x^+(t)\,dL(t) = 0$ by virtue of (7). Similarly, for the constant $N_- \in \mathbf{R}$,
\[
(10)\qquad
\begin{aligned}
d[N_-x^-(t)^2] &= N_-\Big[2Ax^-(t)^2 - 2Bu(t)x^-(t) + 1_{(x(t)\le 0)}\sum_{j=1}^k(C_jx(t)+u(t)'D_j')(C_jx(t)+D_ju(t))\Big]\,dt\\
&\quad + \sum_{j=1}^k N_-[2C_jx^-(t)^2 - 2D_ju(t)x^-(t)]\,dw_j(t)\\
&= \Big[\Big(2A + \sum_{j=1}^k C_j^2\Big)N_-x^-(t)^2 - 2Bu(t)N_-x^-(t) - 2\sum_{j=1}^k C_jD_ju(t)N_-x^-(t)\\
&\qquad + 1_{(x(t)\le 0)}N_-u(t)'\sum_{j=1}^k D_j'D_ju(t)\Big]dt + \sum_{j=1}^k N_-\big[2C_jx^-(t)^2 - 2D_ju(t)x^-(t)\big]\,dw_j(t).
\end{aligned}
\]
Fix $t \ge 0$ and define a sequence of stopping times
\[
\tau_n := \inf\Bigg\{r \in [0, t] : \int_0^r\Big|N_+\sum_{j=1}^k\big(2C_jx^+(s)^2 + 2D_ju(s)x^+(s)\big)\Big|^2ds + \int_0^r\Big|N_-\sum_{j=1}^k\big(2C_jx^-(s)^2 - 2D_ju(s)x^-(s)\big)\Big|^2ds \ge n\Bigg\}, \quad n = 1, 2, \dots,
\]


where $\inf\emptyset := t$. It is clear that $\tau_n \uparrow t$ as $n \to +\infty$, due to $E\int_0^t(|x(s)|^2 + |u(s)|^2)\,ds < +\infty$. Now, summing up (9) and (10), taking integration from 0 to $\tau_n$, and then taking expectation, we obtain (8) with $t$ replaced by $\tau_n$. Thus, (8) follows by sending $n \to +\infty$ together with Fatou's lemma.

4. Conic stabilizability. In this section we address the issue of the conic stabilizability of system (1). Notice that conic stabilizability is different from the usual stabilizability with unconstrained controls, for clearly the former requires more stringent conditions. Here we will give a complete characterization of conic stabilizability in terms of simple conditions involving linear matrix inequalities (LMIs). Introduce a pair of functions $F_+, F_-$ from $\Gamma$ to $\mathbf{R}$:

\[
(11)\qquad F_+(K) := 2A + \sum_{j=1}^k C_j^2 + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)K + K'\sum_{j=1}^k D_j'D_jK,
\]
and
\[
(12)\qquad F_-(K) := 2A + \sum_{j=1}^k C_j^2 - 2\Big(B + \sum_{j=1}^k C_jD_j\Big)K + K'\sum_{j=1}^k D_j'D_jK.
\]
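Theorem 4.1 below shows that conic stabilizability amounts to finding $K_+ \in \Gamma$ and $K_- \in \Gamma$ with $F_+(K_+) < 0$ and $F_-(K_-) < 0$ (equivalently, the LMIs (13)–(14)). The following minimal sketch searches for such gains numerically, assuming Γ = R^m_+ so that the search is a bound-constrained minimization; it is illustrative only, and the data at the bottom are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def make_F(A, B, C, D, sign):
    """Return F_plus (sign=+1) or F_minus (sign=-1) from (11)-(12) as a function of K."""
    B_eff = B + sum(C[j] * D[j] for j in range(len(C)))    # B + sum_j C_j D_j (row vector)
    DD = sum(np.outer(D[j], D[j]) for j in range(len(D)))  # sum_j D_j' D_j   (m x m)
    c0 = 2 * A + sum(c * c for c in C)
    return lambda K: c0 + 2 * sign * float(B_eff @ K) + float(K @ DD @ K)

def conic_stabilizable(A, B, C, D, m):
    """Search for gains K in Gamma = R^m_+ with F_plus(K) < 0 and F_minus(K) < 0 (Theorem 4.2)."""
    bounds = [(0.0, None)] * m          # componentwise nonnegativity = the cone R^m_+
    gains = {}
    for name, sign in (("K_plus", +1), ("K_minus", -1)):
        F = make_F(A, B, C, D, sign)
        res = minimize(F, x0=np.ones(m), bounds=bounds)
        gains[name] = (res.x, F(res.x))
    ok = all(val < 0 for _, val in gains.values())
    return ok, gains

# Hypothetical data (m = 2, k = 1), for illustration only:
ok, gains = conic_stabilizable(A=0.5, B=np.array([-1.0, 1.0]), C=[0.2],
                               D=[np.array([0.0, 0.0])], m=2)
print(ok, gains)
```

A negative value of $F_+$ or $F_-$ found anywhere on the cone suffices for the feasibility check, so the optimizer need not locate a true global minimizer.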

Theorem 4.1. The following assertions are equivalent.
(i) System (1) is mean-square conic stabilizable.
(ii) There exist $K_+ \in \Gamma$ and $K_- \in \Gamma$ such that $F_+(K_+) < 0$ and $F_-(K_-) < 0$. In this case the feedback control $u(t) = K_+x^+(t) + K_-x^-(t)$ is stabilizing.
(iii) There exist $K_+ \in \Gamma$ and $K_- \in \Gamma$ such that
\[
(13)\qquad
\begin{pmatrix}
2A + \sum_{j=1}^k C_j^2 + 2\big(B + \sum_{j=1}^k C_jD_j\big)K_+ & K_+'D'\\
DK_+ & -I
\end{pmatrix} < 0,
\]
\[
(14)\qquad
\begin{pmatrix}
2A + \sum_{j=1}^k C_j^2 - 2\big(B + \sum_{j=1}^k C_jD_j\big)K_- & K_-'D'\\
DK_- & -I
\end{pmatrix} < 0,
\]
where $D \in \mathbf{R}^{m\times m}$ satisfies $D'D = \sum_{j=1}^k D_j'D_j$. In this case the feedback control $u(t) = K_+x^+(t) + K_-x^-(t)$ is stabilizing.

Proof. Take a feedback control $u(t) = K_+x^+(t) + K_-x^-(t)$ and consider the corresponding state $x(\cdot)$ with an initial state $x_0$. Note that by standard SDE theory (cf., e.g., [16]) such $x(\cdot)$ uniquely exists. Moreover,
\[
(15)\qquad E\sup_{0\le t\le T}|x(t)|^p < +\infty \quad \forall T > 0\ \forall p \ge 0.
\]


Making use of (9) with $N_+ = 1$ and $u(t) = K_+x^+(t) + K_-x^-(t)$, we obtain
\[
\begin{aligned}
dx^+(t)^2 &= \Big[2A + \sum_{j=1}^k C_j^2 + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)K_+ + K_+'\sum_{j=1}^k D_j'D_jK_+\Big]x^+(t)^2\,dt\\
&\quad + \sum_{j=1}^k(2C_j + 2D_jK_+)x^+(t)^2\,dw_j(t)\\
&= F_+(K_+)x^+(t)^2\,dt + \sum_{j=1}^k(2C_j + 2D_jK_+)x^+(t)^2\,dw_j(t).
\end{aligned}
\]
Taking integration and then expectation yields (after a localization argument as in the proof of Lemma 3.2)
\[
(16)\qquad \frac{dE[x^+(t)^2]}{dt} = F_+(K_+)E[x^+(t)^2],
\]
where the expectation of the Itô integral vanishes due to (15). Similarly, we have
\[
(17)\qquad \frac{dE[x^-(t)^2]}{dt} = F_-(K_-)E[x^-(t)^2].
\]

Hence, the equivalence between the assertions (i) and (ii) is evident noting that $\lim_{t\to+\infty}E|x(t)|^2 = 0$ if and only if $\lim_{t\to+\infty}E|x^+(t)|^2 = 0$ and $\lim_{t\to+\infty}E|x^-(t)|^2 = 0$. Finally, the equivalence between the assertions (ii) and (iii) follows from Schur's lemma [6].

Conditions (13) and (14) are in terms of LMIs. On the other hand, some of the conic constraints can also be expressed as LMIs, e.g., when $\Gamma = \mathbf{R}^m_+$ or when $\Gamma$ is a second-order cone. Hence, Theorem 4.1(iii) provides an easy way of numerically checking the stabilizability of system (1) due to the availability of many LMI solvers. Note that in general there may exist many feasible solutions to LMIs. Denote the two sets of feedback gains as
\[
\mathcal{K}_+ := \{K \in \Gamma \mid F_+(K) < 0\}, \qquad \mathcal{K}_- := \{K \in \Gamma \mid F_-(K) < 0\}.
\]
Then Theorem 4.1 implies the following.

Theorem 4.2. System (1) is conic stabilizable if and only if $\mathcal{K}_+ \ne \emptyset$ and $\mathcal{K}_- \ne \emptyset$.

Corollary 4.1. If $2A + \sum_{j=1}^k C_j^2 < 0$, then system (1) is conic stabilizable.

Proof. In this case $0 \in \mathcal{K}_+ \cap \mathcal{K}_-$.

Proposition 4.1. We have the following results.
(i) $\mathcal{K}_+$ and $\mathcal{K}_-$ are convex sets if $\Gamma$ is a convex set.
(ii) $\mathcal{K}_+$ and $\mathcal{K}_-$ are bounded if $\sum_{j=1}^k D_j'D_j|_\Gamma > 0$.

Proof. (i) Define the following operator $M_+$ from $\Gamma$ to $\mathcal{S}^{(m+1)\times(m+1)}$:
\[
M_+(K) = \begin{pmatrix}
2A + \sum_{j=1}^k C_j^2 + 2\big(B + \sum_{j=1}^k C_jD_j\big)K & K'D'\\
DK & -I
\end{pmatrix},
\]
where $D'D = \sum_{j=1}^k D_j'D_j$. By Theorem 4.1, $\mathcal{K}_+$ can be equivalently represented as $\mathcal{K}_+ = \{K \in \Gamma \mid M_+(K) < 0\}$. Thus the convexity of $\mathcal{K}_+$ follows from that of $\Gamma$ together


with the fact that the operator $M_+$ is affine. Similarly we can show the convexity of $\mathcal{K}_-$.
(ii) If $\mathcal{K}_+$ is unbounded, then there is a sequence $\{K_n\} \subset \mathcal{K}_+$ so that $|K_n| \to +\infty$. Since $\sum_{j=1}^k D_j'D_j|_\Gamma > 0$, we have $K_n'\sum_{j=1}^k D_j'D_jK_n \ge \delta|K_n|^2 \to +\infty$ as $n \to +\infty$, where $\delta > 0$ is some constant. Therefore $F_+(K_n) \to +\infty$ as $n \to +\infty$. This contradicts $F_+(K_n) < 0$ $\forall n$. Hence $\mathcal{K}_+$ is bounded. Similarly, $\mathcal{K}_-$ is bounded.

5. Well-posedness. Since the cost weighting matrices $Q$ and $R$ are allowed to be indefinite, the well-posedness of the problem is no longer automatic or trivial (as opposed to the classical definite case when $Q \ge 0$ and $R > 0$). In fact, the well-posedness for an indefinite LQ control problem is a prerequisite for the optimality and is an interesting problem in its own right. In this section we will carry out an extensive investigation on the well-posedness and some related issues, including necessary and sufficient conditions for the well-posedness in terms of the nonemptiness of certain sets.

5.1. Representation of value function. In this subsection we present the following representation result, which is a key to many results of this paper.

Proposition 5.1. Problem (LQ) is well-posed if and only if the value function can be represented as
\[
(18)\qquad V(x_0) = P_+(x_0^+)^2 + P_-(x_0^-)^2 \quad \forall x_0 \in \mathbf{R}
\]

for some $P_+, P_- \in \mathbf{R}$.

Proof. We prove only the "only if" part, as the "if" part is evident. Assume that problem (LQ) is well-posed. Fix any $x > 0$, $y > 0$. Since $V(y) > -\infty$, for any $\varepsilon > 0$ there is $u^\varepsilon(\cdot) \in \mathcal{U}_y$ along with the corresponding state $x^\varepsilon(\cdot)$ (with the initial state $y$) satisfying
\[
(19)\qquad V(y) \ge J(y; u^\varepsilon(\cdot)) - \varepsilon = E\int_0^{+\infty}[Qx^\varepsilon(t)^2 + u^\varepsilon(t)'Ru^\varepsilon(t)]\,dt - \varepsilon.
\]
Now, as $x > 0$, the linearity of the dynamics (1) and the conic control constraint ensure that $xu^\varepsilon(\cdot) \in \mathcal{U}_{xy}$ with the corresponding state $xx^\varepsilon(\cdot)$. Hence it follows from (19) that
\[
(20)\qquad V(y) \ge \frac{1}{x^2}E\int_0^{+\infty}[Q|xx^\varepsilon(t)|^2 + (xu^\varepsilon(t))'R(xu^\varepsilon(t))]\,dt - \varepsilon \ge \frac{1}{x^2}V(xy) - \varepsilon.
\]
Sending $\varepsilon \to 0$ we obtain
\[
(21)\qquad V(xy) \le V(y)x^2 \quad \forall x > 0,\ y > 0.
\]
Similarly, one can show that
\[
(22)\qquad V(xy) \le V(-y)x^2 \quad \forall x < 0,\ y > 0.
\]
Now for any $x > 0$, by (21) we have $V(x) \le V(1)x^2$. On the other hand, (21) also implies $V(1) = V(x\cdot\tfrac{1}{x}) \le \tfrac{1}{x^2}V(x)$. So we have shown that
\[
(23)\qquad V(x) = V(1)x^2 \quad \forall x > 0.
\]
Similarly, in view of (22) we can prove that
\[
(24)\qquad V(x) = V(-1)x^2 \quad \forall x < 0.
\]


Finally, the continuity of the value function along with (23) yields
\[
(25)\qquad V(0) = 0.
\]

The desired result (18) thus follows from (23)–(25) with $P_+ := V(1)$ and $P_- := V(-1)$.

Remark 5.1. The preceding proposition, which suggests the form of the value function when the underlying LQ problem is well-posed, is crucial for all the main results in this paper. In fact, the proofs for the stabilizability in the previous section, the characterization of the well-posedness in this section, and the optimality in the next section are all inspired by this result. This also explains why one needs to apply Tanaka's formula to evaluate $dx^+(t)$ and $dx^-(t)$, as we have seen in the previous section and will continue to see in the subsequent sections.

Remark 5.2. We saw that the value function for the constrained LQ problem is not smooth.

5.2. Characterization of well-posedness. Define the following functions from $\mathbf{R}$ to $\mathbf{R}\cup\{-\infty\}$:
\[
\begin{aligned}
\Phi_+(P) &:= \inf_{K\in\Gamma}\Big[K'\Big(R + P\sum_{j=1}^k D_j'D_j\Big)K + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)PK\Big],\\
\Phi_-(P) &:= \inf_{K\in\Gamma}\Big[K'\Big(R + P\sum_{j=1}^k D_j'D_j\Big)K - 2\Big(B + \sum_{j=1}^k C_jD_j\Big)PK\Big].
\end{aligned}
\]
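$\Phi_+$ and $\Phi_-$ have no closed form in general, but for a concrete cone they can be approximated by a constrained quadratic minimization. A minimal sketch, assuming Γ = R^m_+ and a generic bound-constrained optimizer, is given below; the data are hypothetical. The returned minimizer also approximates the maps ξ_±(P) introduced in section 6.1.

```python
import numpy as np
from scipy.optimize import minimize

def phi(P, R, B, C, D, sign, m):
    """Numerically approximate Phi_plus(P) (sign=+1) or Phi_minus(P) (sign=-1),
    assuming Gamma = R^m_+.  If the true infimum is -infinity the optimizer
    simply returns a very negative value."""
    DD = sum(np.outer(D[j], D[j]) for j in range(len(D)))
    B_eff = B + sum(C[j] * D[j] for j in range(len(C)))
    obj = lambda K: float(K @ (R + P * DD) @ K) + 2 * sign * P * float(B_eff @ K)
    res = minimize(obj, x0=np.zeros(m), bounds=[(0.0, None)] * m)
    return obj(res.x), res.x            # value and (approximate) minimizer

# Hypothetical data (m = 2, k = 1), for illustration only:
R = np.diag([1.0, 0.5])
val_plus, K_star = phi(P=0.3, R=R, B=np.array([-1.0, 1.0]), C=[0.2],
                       D=[np.array([0.1, 0.0])], sign=+1, m=2)
print(val_plus, K_star)
```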

Remark 5.3. Since $0 \in \Gamma$, we must have
\[
(26)\qquad \Phi_+(P) \le 0, \quad \Phi_-(P) \le 0 \quad \forall P \in \mathbf{R}.
\]

On the other hand, $\Phi_+(P)$ and $\Phi_-(P)$ have finite values if $(R + P\sum_{j=1}^k D_j'D_j)|_\Gamma > 0$. Indeed, in this case there exist constants $\alpha_1 = \alpha_1(P) > 0$ and $\alpha_2 = \alpha_2(P) > 0$ such that
\[
K'\Big(R + P\sum_{j=1}^k D_j'D_j\Big)K + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)PK \ge \alpha_1|K|^2 - \alpha_2|K| = \alpha_1|K|\Big(|K| - \frac{\alpha_2}{\alpha_1}\Big) \quad \forall K \in \Gamma.
\]
If $|K| > \frac{\alpha_2}{\alpha_1}$, then the above expression is positive. Taking (26) into consideration we conclude
\[
\Phi_+(P) = \inf_{K\in\Gamma,\ |K|\le\frac{\alpha_2}{\alpha_1}}\Big[K'\Big(R + P\sum_{j=1}^k D_j'D_j\Big)K + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)PK\Big] > -\infty.
\]

Hence, $\Phi_+(P)$ is finite. The same is true for $\Phi_-(P)$. Next, we define the following two sets:
\[
\begin{aligned}
\mathcal{P}_+ &:= \Big\{P \in \mathbf{R}\ \Big|\ \Big(2A + \sum_{j=1}^k C_j^2\Big)P + Q + \Phi_+(P) \ge 0,\ \Big(R + P\sum_{j=1}^k D_j'D_j\Big)\Big|_\Gamma \ge 0\Big\},\\
\mathcal{P}_- &:= \Big\{P \in \mathbf{R}\ \Big|\ \Big(2A + \sum_{j=1}^k C_j^2\Big)P + Q + \Phi_-(P) \ge 0,\ \Big(R + P\sum_{j=1}^k D_j'D_j\Big)\Big|_\Gamma \ge 0\Big\}.
\end{aligned}
\]
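Membership of a given $P$ in $\mathcal{P}_+$ (and similarly $\mathcal{P}_-$) can be tested numerically by checking the two defining conditions: the cone-restricted positive semidefiniteness of $R + P\sum_j D_j'D_j$ and the sign of $(2A + \sum_j C_j^2)P + Q + \Phi_+(P)$. The rough sketch below does this for Γ = R^m_+ and reuses the phi() helper from the previous sketch; the tolerances and the Rayleigh-quotient trick are ad hoc choices of the sketch, not part of the paper.

```python
import numpy as np
from scipy.optimize import minimize

def cone_min_quadratic(M, m):
    """Approximate min of K' M K over K in R^m_+ with |K| = 1 (Gamma = R^m_+ assumed)."""
    obj = lambda K: float(K @ M @ K) / max(float(K @ K), 1e-12)
    res = minimize(obj, x0=np.ones(m), bounds=[(0.0, None)] * m)
    return obj(res.x)

def in_P_plus(P, A, B, C, D, Q, R, m, tol=1e-8):
    """Numerically test membership of P in the set P_+ defined above,
    reusing phi() from the previous sketch."""
    DD = sum(np.outer(D[j], D[j]) for j in range(len(D)))
    cone_psd = cone_min_quadratic(R + P * DD, m) >= -tol
    c0 = 2 * A + sum(c * c for c in C)
    L_plus = c0 * P + Q + phi(P, R, B, C, D, +1, m)[0]
    return cone_psd and L_plus >= -tol
```

The quantity L_plus computed here is exactly the function $L_+(P)$ used in the algorithm of section 5.3 and in the EAREs of section 6.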


The following is the main result of the section, which characterizes the well-posedness of problem (LQ) by the nonemptiness of the sets $\mathcal{P}_+$ and $\mathcal{P}_-$.

Theorem 5.1. Assume that system (1) is conic stabilizable. Then problem (LQ) is well-posed if and only if $\mathcal{P}_+ \ne \emptyset$ and $\mathcal{P}_- \ne \emptyset$. Moreover, in this case
\[
(27)\qquad V(x_0) \ge P_+(x_0^+)^2 + P_-(x_0^-)^2 \quad \forall x_0 \in \mathbf{R},\ \forall P_+ \in \mathcal{P}_+,\ P_- \in \mathcal{P}_-.
\]

Proof. First we prove the "if" part. For any $x_0 \in \mathbf{R}$ let $x(\cdot)$ be the solution of (1) under an arbitrary $u(\cdot) \in \mathcal{U}_{x_0}$. Pick any $P_+ \in \mathcal{P}_+$ and $P_- \in \mathcal{P}_-$. By Lemma 3.2, we have
\[
(28)\qquad
\begin{aligned}
&E\int_0^t[Qx(s)^2 + u(s)'Ru(s)]\,ds\\
&= P_+(x_0^+)^2 + P_-(x_0^-)^2 - E[P_+x^+(t)^2] - E[P_-x^-(t)^2]\\
&\quad + E\int_0^t\Bigg\{Qx(s)^2 + \Big(2A + \sum_{j=1}^k C_j^2\Big)P_+x^+(s)^2 + \Big(2A + \sum_{j=1}^k C_j^2\Big)P_-x^-(s)^2\\
&\qquad\qquad + u(s)'\Big[R + 1_{(x(s)>0)}P_+\sum_{j=1}^k D_j'D_j + 1_{(x(s)\le 0)}P_-\sum_{j=1}^k D_j'D_j\Big]u(s)\\
&\qquad\qquad + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)u(s)P_+x^+(s) - 2\Big(B + \sum_{j=1}^k C_jD_j\Big)u(s)P_-x^-(s)\Bigg\}\,ds.
\end{aligned}
\]

Denote by $\psi(x(s), u(s))$ the integrand on the right-hand side of (28) and fix $s \in [0, t]$. If $x(s) > 0$, then write $u(s) = Kx(s)$ (note that $K$ may depend on $s$). Since $u(s) \in \Gamma$ and $\Gamma$ is a cone, we have $K \in \Gamma$. Hence at $s$, bearing in mind that $1_{(x(s)\le 0)} = 0$, we have
\[
(29)\qquad
\begin{aligned}
\psi(x(s), u(s)) &= Qx(s)^2 + \Big(2A + \sum_{j=1}^k C_j^2\Big)P_+x(s)^2 + K'\Big[R + P_+\sum_{j=1}^k D_j'D_j\Big]Kx(s)^2 + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)KP_+x(s)^2\\
&= \Big[\Big(2A + \sum_{j=1}^k C_j^2\Big)P_+ + Q + K'\Big(R + P_+\sum_{j=1}^k D_j'D_j\Big)K + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)KP_+\Big]x(s)^2\\
&\ge \Big[\Big(2A + \sum_{j=1}^k C_j^2\Big)P_+ + Q + \Phi_+(P_+)\Big]x(s)^2\\
&\ge 0.
\end{aligned}
\]
If $x(s) < 0$, then write $u(s) = -Kx(s)$. Again $K \in \Gamma$. An argument similar to that above yields
\[
\psi(x(s), u(s)) \ge \Big[\Big(2A + \sum_{j=1}^k C_j^2\Big)P_- + Q + \Phi_-(P_-)\Big]x(s)^2 \ge 0
\]


at $s$. Finally, if $x(s) = 0$ at $s$, then
\[
\psi(x(s), u(s)) = u(s)'\Big[R + P_-\sum_{j=1}^k D_j'D_j\Big]u(s) \ge 0.
\]

The preceding analysis shows that it always holds that $\psi(x(s), u(s)) \ge 0$ $\forall s \in [0, t]$. Consequently, it follows from (28) that
\[
E\int_0^t[Qx(s)^2 + u(s)'Ru(s)]\,ds \ge P_+(x_0^+)^2 + P_-(x_0^-)^2 - E[P_+x^+(t)^2] - E[P_-x^-(t)^2].
\]
Letting $t \to +\infty$ and noting that $u(\cdot)$ is conic stabilizing, we obtain
\[
J(x_0; u(\cdot)) = \lim_{t\to+\infty}E\int_0^t[Qx(s)^2 + u(s)'Ru(s)]\,ds \ge P_+(x_0^+)^2 + P_-(x_0^-)^2.
\]
Since $u(\cdot) \in \mathcal{U}_{x_0}$ is arbitrary, we conclude $V(x_0) \ge P_+(x_0^+)^2 + P_-(x_0^-)^2 > -\infty$. Hence problem (LQ) is well-posed.

To prove the "only if" part, suppose that the LQ problem is well-posed. Then by Proposition 5.1 the value function has the following representation:

\[
V(x) = P_+(x^+)^2 + P_-(x^-)^2 \quad \forall x \in \mathbf{R}.
\]
We want to show that $P_+ \in \mathcal{P}_+$, $P_- \in \mathcal{P}_-$. To this end, applying the optimality principle of dynamic programming and noting the time-invariance of the underlying system, we obtain
\[
(30)\qquad P_+(x_0^+)^2 + P_-(x_0^-)^2 \le E\left\{\int_0^h[Qx(t)^2 + u(t)'Ru(t)]\,dt + P_+[x^+(h)]^2 + P_-[x^-(h)]^2\right\}
\]
$\forall h > 0$, $\forall u(\cdot) \in \mathcal{U}_{x_0}$, $\forall x_0 \in \mathbf{R}$. Using the above and applying Lemma 3.2, we obtain
\[
(31)\qquad E\int_0^h\psi(x(s), u(s))\,ds \ge 0 \quad \forall h > 0,\ \forall u(\cdot) \in \mathcal{U}_{x_0},\ \forall x_0 \in \mathbf{R},
\]
where, as before, the mapping $\psi$ is defined via the integrand on the right-hand side of (28). Set $x_0 > 0$ and take the following control
\[
u_1(t) := \begin{cases} K, & 0 \le t < 1,\\ K_+x^+(t) + K_-x^-(t), & t \ge 1, \end{cases}
\]
where $K \in \Gamma$ is arbitrarily fixed and $K_+x^+(t) + K_-x^-(t)$ is a conic stabilizing feedback control which exists due to the stabilizability assumption. Clearly $u_1(\cdot) \in \mathcal{U}_{x_0}$, and


let $x_1(\cdot)$ be the corresponding state. Since $x_1(s) \to x_0$ and $u_1(s) \to K$ as $s \to 0$, P-a.s., we have
\[
\psi(x_1(s), u_1(s)) \to \Big[\Big(2A + \sum_{j=1}^k C_j^2\Big)P_+ + Q\Big]x_0^2 + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)KP_+x_0 + K'\Big(R + P_+\sum_{j=1}^k D_j'D_j\Big)K \quad \text{as } s \to 0,\ \text{P-a.s.}
\]
Thus appealing to (31), the dominated convergence theorem yields
\[
0 \le \frac{1}{h}E\int_0^h\psi(x_1(s), u_1(s))\,ds \to \Big[\Big(2A + \sum_{j=1}^k C_j^2\Big)P_+ + Q\Big]x_0^2 + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)KP_+x_0 + K'\Big(R + P_+\sum_{j=1}^k D_j'D_j\Big)K \quad \text{as } h \to 0.
\]

Letting $x_0 \to 0$ we obtain $K'(R + P_+\sum_{j=1}^k D_j'D_j)K \ge 0$. The arbitrariness of $K \in \Gamma$ then implies $(R + P_+\sum_{j=1}^k D_j'D_j)|_\Gamma \ge 0$. On the other hand, take $x_0 > 0$ and consider the following feedback control
\[
u_2(t) := \begin{cases} Kx^+(t) + \tilde Kx^-(t), & 0 \le t < 1,\\ K_+x^+(t) + K_-x^-(t), & t \ge 1, \end{cases}
\]
where $K \in \Gamma$ and $\tilde K \in \Gamma$ are arbitrarily fixed. Let $x_2(\cdot)$ be the state under $u_2(\cdot)$. Noting $x_2(s) \to x_0$ and $u_2(s) = Kx_2^+(s) + \tilde Kx_2^-(s) \to Kx_0$ as $s \to 0$, P-a.s., we obtain
\[
\psi(x_2(s), u_2(s)) \to \Big[\Big(2A + \sum_{j=1}^k C_j^2\Big)P_+ + Q + K'\Big(R + P_+\sum_{j=1}^k D_j'D_j\Big)K + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)KP_+\Big]x_0^2
\]
as $s \to 0$, P-a.s. An analysis similar to the preceding one leads to
\[
\Big(2A + \sum_{j=1}^k C_j^2\Big)P_+ + Q + K'\Big(R + P_+\sum_{j=1}^k D_j'D_j\Big)K + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)KP_+ \ge 0.
\]
Since $K \in \Gamma$ is arbitrary, we arrive at $(2A + \sum_{j=1}^k C_j^2)P_+ + Q + \Phi_+(P_+) \ge 0$. So far we have shown $P_+ \in \mathcal{P}_+$. Similarly, we can prove $P_- \in \mathcal{P}_-$. Finally, the inequality (27) has been proved in the proof of the "if" part.

Remark 5.4. From the above proof we see that the stabilizability assumption is in fact not necessary for the "if" part of the theorem.
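The inequality (27) can be probed by simulation: for any stabilizing conic feedback, the Monte Carlo estimate of the cost (3) should dominate $P_+(x_0^+)^2 + P_-(x_0^-)^2$ for every $P_+ \in \mathcal{P}_+$ and $P_- \in \mathcal{P}_-$. The sketch below (hypothetical data, infinite horizon truncated at a finite T) extends the Euler–Maruyama simulation from section 2 by accumulating the running cost.

```python
import numpy as np

def estimate_cost(A, B, C, D, Q, R, K_plus, K_minus, x0,
                  T=20.0, dt=1e-3, n_paths=2000, seed=1):
    """Monte Carlo estimate of J(x0; u) in (3) under u = K_plus x^+ + K_minus x^-,
    truncating the time horizon at T (Euler-Maruyama discretization)."""
    rng = np.random.default_rng(seed)
    k = len(C)
    x = np.full(n_paths, float(x0))
    cost = np.zeros(n_paths)
    for _ in range(int(T / dt)):
        xp, xm = np.maximum(x, 0.0), np.maximum(-x, 0.0)
        u = np.outer(K_plus, xp) + np.outer(K_minus, xm)       # shape (m, n_paths)
        cost += (Q * x**2 + np.einsum('ip,ij,jp->p', u, R, u)) * dt
        dw = rng.normal(scale=np.sqrt(dt), size=(k, n_paths))
        x = x + (A * x + B @ u) * dt + sum((C[j] * x + D[j] @ u) * dw[j] for j in range(k))
    return cost.mean()

# The returned estimate can be compared against the lower bound
# P_plus * (x0^+)**2 + P_minus * (x0^-)**2 for specific (hypothetical) data.
```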


Remark 5.5. The above theorem tells that the positive definiteness or positive semidefiniteness of $Q$ and $R$ is not necessary for problem (LQ) to be well-posed.

Corollary 5.1. If $R|_\Gamma \ge 0$ and $Q \ge 0$, then problem (LQ) is well-posed.

Proof. In this case $0 \in \mathcal{P}_+ \cap \mathcal{P}_-$.

Proposition 5.2. $\mathcal{P}_+$ and $\mathcal{P}_-$ are both convex sets (hence they are both intervals). Moreover, if system (1) is stabilizable and problem (LQ) is well-posed, then $\mathcal{P}_+$ and $\mathcal{P}_-$ each has a finite maximum element.

Proof. The convexity of $\mathcal{P}_+$ and $\mathcal{P}_-$ is clear noticing that the functions $\Phi_+(P)$ and $\Phi_-(P)$ are concave in $P$. To prove the existence of the finite maximum elements, we note that Proposition 5.1 provides
\[
(32)\qquad V(x_0) = P_+^*(x_0^+)^2 + P_-^*(x_0^-)^2 \quad \forall x_0 \in \mathbf{R}
\]

for some $P_+^*, P_-^* \in \mathbf{R}$. Moreover, the proof of Theorem 5.1 implies $P_+^* \in \mathcal{P}_+$ and $P_-^* \in \mathcal{P}_-$. Hence it follows from (27) that $P_+^*$ and $P_-^*$ are the maximum elements of $\mathcal{P}_+$ and $\mathcal{P}_-$, respectively.

Remark 5.6. Proposition 5.2 indicates that if system (1) is stabilizable and problem (LQ) is well-posed, then the infimum value of problem (LQ) or, equivalently, the $P_+$ and $P_-$ in the representation of the value function as stipulated in Proposition 5.1 can be obtained by solving the following two mathematical programming problems, respectively:
\[
(33)\qquad
\begin{array}{ll}
\text{maximize} & P\\
\text{subject to} & \Big(2A + \sum_{j=1}^k C_j^2\Big)P + Q + \Phi_+(P) \ge 0,\\
& \Big(R + P\sum_{j=1}^k D_j'D_j\Big)\Big|_\Gamma \ge 0,
\end{array}
\]
and
\[
(34)\qquad
\begin{array}{ll}
\text{maximize} & P\\
\text{subject to} & \Big(2A + \sum_{j=1}^k C_j^2\Big)P + Q + \Phi_-(P) \ge 0,\\
& \Big(R + P\sum_{j=1}^k D_j'D_j\Big)\Big|_\Gamma \ge 0.
\end{array}
\]
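Since $\mathcal{P}_+$ and $\mathcal{P}_-$ are intervals (Proposition 5.2), programs (33) and (34) can be attacked by a crude bisection on $P$ once a feasibility oracle is available. The sketch below assumes Γ = R^m_+ and reuses the phi() and cone_min_quadratic() helpers from the earlier sketches; the bracketing values, tolerances, and grid scan are ad hoc choices of the sketch, not prescriptions from the paper.

```python
import numpy as np

def max_P(A, B, C, D, Q, R, m, sign=+1, p_lo=-50.0, p_hi=50.0, iters=60):
    """Rough bisection sketch for program (33) (sign=+1) or (34) (sign=-1):
    find (approximately) the largest feasible P, using the fact that P_+ and
    P_- are intervals.  Reuses phi() and cone_min_quadratic() defined earlier."""
    DD = sum(np.outer(D[j], D[j]) for j in range(len(D)))
    c0 = 2 * A + sum(c * c for c in C)

    def feasible(P):
        cone_ok = cone_min_quadratic(R + P * DD, m) >= -1e-8
        return cone_ok and c0 * P + Q + phi(P, R, B, C, D, sign, m)[0] >= -1e-8

    if not feasible(p_lo):              # crude scan for a feasible starting point
        grid = [P for P in np.linspace(p_lo, p_hi, 201) if feasible(P)]
        if not grid:
            return None                 # the set appears empty: problem not well-posed
        p_lo = grid[0]
    for _ in range(iters):              # bisect on the upper end of the interval
        mid = 0.5 * (p_lo + p_hi)
        if feasible(mid):
            p_lo = mid
        else:
            p_hi = mid
    return p_lo
```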

5.3. An algorithm. Theorem 5.1 stipulates that it suffices to check the nonemptiness of $\mathcal{P}_+$ and $\mathcal{P}_-$ or the feasibility of the problems (33) and (34) in order to verify the well-posedness of a given LQ problem. However, it is sometimes hard to check numerically the aforementioned feasibility because, on one hand, the functions $\Phi_+(\cdot)$ and $\Phi_-(\cdot)$ in general do not have analytical forms, and on the other hand, the constraint $(R + P\sum_{j=1}^k D_j'D_j)|_\Gamma \ge 0$ is usually very hard to verify for a general cone $\Gamma$ (except second-order cones, for which the constraint can be reformulated as an LMI; see [29, Theorem 1]). In this subsection we give an algorithm that can check the well-posedness more directly. First we need a lemma.


Lemma 5.1. Assume that problem (LQ) is well-posed. Given $K_+ \in \mathcal{K}_+$ and $K_- \in \mathcal{K}_-$, set $\tilde P_+ := -\frac{Q + K_+'RK_+}{F_+(K_+)}$ and $\tilde P_- := -\frac{Q + K_-'RK_-}{F_-(K_-)}$. Then
\[
(35)\qquad P_+ \le \tilde P_+ \ \forall P_+ \in \mathcal{P}_+ \quad \text{and} \quad P_- \le \tilde P_- \ \forall P_- \in \mathcal{P}_-.
\]

Proof. By their definitions $\tilde P_+$ and $\tilde P_-$ satisfy, respectively,
\[
(36)\qquad \Big(2A + \sum_{j=1}^k C_j^2\Big)\tilde P_+ + Q + K_+'\Big(R + \tilde P_+\sum_{j=1}^k D_j'D_j\Big)K_+ + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)K_+\tilde P_+ = 0
\]
and
\[
(37)\qquad \Big(2A + \sum_{j=1}^k C_j^2\Big)\tilde P_- + Q + K_-'\Big(R + \tilde P_-\sum_{j=1}^k D_j'D_j\Big)K_- - 2\Big(B + \sum_{j=1}^k C_jD_j\Big)K_-\tilde P_- = 0.
\]
Take a feedback control $u(t) = K_+x^+(t) + K_-x^-(t)$, which is stabilizing by Theorem 4.1 (bearing in mind the definitions of $\mathcal{K}_+$ and $\mathcal{K}_-$), and let $x(\cdot)$ be the corresponding state with $x(0) = x_0$. Then a similar calculation as in (28) yields
\[
(38)\qquad E\int_0^t[Qx(s)^2 + u(s)'Ru(s)]\,ds = \tilde P_+(x_0^+)^2 + \tilde P_-(x_0^-)^2 - E[\tilde P_+x^+(t)^2] - E[\tilde P_-x^-(t)^2] + E\int_0^t\psi(x(s), u(s))\,ds,
\]

where $\psi(x(s), u(s))$ is as the integrand on the right-hand side of (28), with $P_+$ and $P_-$ replaced by $\tilde P_+$ and $\tilde P_-$, respectively. However, $u(s) = K_+x(s)$ whenever $x(s) > 0$; hence
\[
\psi(x(s), u(s)) = \Big[\Big(2A + \sum_{j=1}^k C_j^2\Big)\tilde P_+ + Q + K_+'\Big(R + \tilde P_+\sum_{j=1}^k D_j'D_j\Big)K_+ + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)K_+\tilde P_+\Big]x(s)^2 = 0
\]
in view of (36). Similarly, based on (37) one can show that $\psi(x(s), u(s)) = 0$ whenever $x(s) \le 0$. It then follows from (38) that
\[
E\int_0^t[Qx(s)^2 + u(s)'Ru(s)]\,ds = \tilde P_+(x_0^+)^2 + \tilde P_-(x_0^-)^2 - E[\tilde P_+x^+(t)^2] - E[\tilde P_-x^-(t)^2].
\]
Since $u(\cdot)$ is stabilizing, we have
\[
J(x_0; u(\cdot)) = \lim_{t\to+\infty}E\int_0^t[Qx(s)^2 + u(s)'Ru(s)]\,ds = \tilde P_+(x_0^+)^2 + \tilde P_-(x_0^-)^2.
\]


By virtue of Theorem 5.1, we conclude
\[
\tilde P_+(x_0^+)^2 + \tilde P_-(x_0^-)^2 = J(x_0; u(\cdot)) \ge V(x_0) \ge P_+(x_0^+)^2 + P_-(x_0^-)^2 \quad \forall P_+ \in \mathcal{P}_+,\ \forall P_- \in \mathcal{P}_-.
\]

This proves (35).

We now assume that (1) is conic stabilizable. According to Theorem 4.1 there exist $K_+ \in \Gamma$ and $K_- \in \Gamma$ such that $F_+(K_+) < 0$ and $F_-(K_-) < 0$. Take $\delta > 0$ sufficiently small with $\delta < \min\{-F_+(K_+), -F_-(K_-)\}$. Calculate
\[
(39)\qquad P_+^{(1)} := \min_{F_+(K)\le-\delta}\Big\{-\frac{Q + K'RK}{F_+(K)}\Big\}, \qquad P_-^{(1)} := \min_{F_-(K)\le-\delta}\Big\{-\frac{Q + K'RK}{F_-(K)}\Big\}.
\]
In view of Lemma 5.1, $P_+^{(1)}$ (respectively, $P_-^{(1)}$) is a (very tight) upper bound of $\mathcal{P}_+$ (respectively, $\mathcal{P}_-$) under the well-posedness assumption. As a consequence, if problem (LQ) is well-posed, then it is necessary that $(R + P_+^{(1)}\sum_{j=1}^k D_j'D_j)|_\Gamma \ge 0$ and $(R + P_-^{(1)}\sum_{j=1}^k D_j'D_j)|_\Gamma \ge 0$. Now, calculate
\[
(40)\qquad P^{(0)} := \min_{(R + P\sum_{j=1}^k D_j'D_j)|_\Gamma \ge 0} P.
\]

Then we know that points of $\mathcal{P}_+$ (respectively, $\mathcal{P}_-$), if any, must lie between $P^{(0)}$ and $P_+^{(1)}$ (respectively, $P_-^{(1)}$). Inspired by the above discussion, we have the following algorithm.

Step 1. Apply Theorem 4.1(iii) to obtain $K_+, K_- \in \Gamma$ with $F_+(K_+) < 0$ and $F_-(K_-) < 0$. Set $\delta := \min\{\varepsilon, -F_+(K_+), -F_-(K_-)\}$, where $\varepsilon$ is a very small number allowed by the computer.

Step 2. Calculate $P_+^{(1)}$ and $P_-^{(1)}$ via (39). If either $(R + P_+^{(1)}\sum_{j=1}^k D_j'D_j)|_\Gamma < 0$ or $(R + P_-^{(1)}\sum_{j=1}^k D_j'D_j)|_\Gamma < 0$ holds, stop, and problem (LQ) is not well-posed.

Step 3. Calculate $P^{(0)}$ via (40).

Step 4. If there exists a $P \in [P^{(0)}, P_+^{(1)})$ satisfying $L_+(P) \ge 0$, then go to Step 5; otherwise, stop, and problem (LQ) is not well-posed.

Step 5. If there exists a $P \in [P^{(0)}, P_-^{(1)})$ satisfying $L_-(P) \ge 0$, then stop, and problem (LQ) is well-posed; otherwise, stop, and problem (LQ) is not well-posed.

5.4. Well-posedness margin. In view of Remark 5.5, problem (LQ) may still be well-posed when $R$ is indefinite or even negative definite. That said, it is clear that $R$ cannot be too negative for the well-posedness. Therefore, it is interesting to study the range of $R$ over which problem (LQ) is well-posed, given that all the other data is fixed. Specifically, define
\[
(41)\qquad r^* := \inf\{r \in \mathbf{R} \mid \text{problem (LQ) is well-posed for any } R \in \mathcal{S}^{m\times m} \text{ with } R > rI\},
\]
where $\inf\emptyset := +\infty$. The value $r^*$ is called the well-posedness margin. By its very definition, $r^*$ has the following interpretation: Problem (LQ) is well-posed if the smallest eigenvalue of $R$, $\lambda_{\min}(R)$, is such that $\lambda_{\min}(R) > r^*$, and is not well-posed if the largest eigenvalue of $R$, $\lambda_{\max}(R)$, is such that $\lambda_{\max}(R) < r^*$. It follows from Theorem 5.1 that, provided that system (1) is stabilizable, the well-posedness margin $r^*$ can be obtained by solving the following nonlinear program


(with $P$ and $r$ being the decision variables):
\[
(42)\qquad
\begin{array}{ll}
\text{minimize} & r\\
\text{subject to} & \Big(2A + \sum_{j=1}^k C_j^2\Big)P + Q + \Phi_+(P, r) \ge 0,\\
& \Big(2A + \sum_{j=1}^k C_j^2\Big)P + Q + \Phi_-(P, r) \ge 0,\\
& \Big(rI + P\sum_{j=1}^k D_j'D_j\Big)\Big|_\Gamma \ge 0,
\end{array}
\]

where
\[
\begin{aligned}
\Phi_+(P, r) &:= \inf_{K\in\Gamma}\Big[K'\Big(rI + P\sum_{j=1}^k D_j'D_j\Big)K + 2\Big(B + \sum_{j=1}^k C_jD_j\Big)PK\Big],\\
\Phi_-(P, r) &:= \inf_{K\in\Gamma}\Big[K'\Big(rI + P\sum_{j=1}^k D_j'D_j\Big)K - 2\Big(B + \sum_{j=1}^k C_jD_j\Big)PK\Big].
\end{aligned}
\]

Notice, again, that it is hard to solve the preceding mathematical program as, in addition to the difficulty associated with the last constraint, $\Phi_+(P, r)$ and $\Phi_-(P, r)$ in general do not have analytical forms. In the following, we provide an explicit lower bound of $r^*$.

Theorem 5.2. Assume that system (1) is stabilizable. Then $\bar r := \max\{\bar r_+, \bar r_-\}$ is a lower bound of the well-posedness margin, where
\[
(43)\qquad \bar r_+ := \begin{cases} \dfrac{\lambda Q}{\inf_{K\in\mathcal{K}_+}\{F_+(K) - \lambda K'K\}} & \text{if } Q \ge 0,\\[2ex] \dfrac{\lambda Q}{\sup_{K\in\mathcal{K}_+}\{F_+(K) - \lambda K'K\}} & \text{if } Q < 0, \end{cases}
\]
and
\[
(44)\qquad \bar r_- := \begin{cases} \dfrac{\lambda Q}{\inf_{K\in\mathcal{K}_-}\{F_-(K) - \lambda K'K\}} & \text{if } Q \ge 0,\\[2ex] \dfrac{\lambda Q}{\sup_{K\in\mathcal{K}_-}\{F_-(K) - \lambda K'K\}} & \text{if } Q < 0, \end{cases}
\]
with $\lambda := \inf_{K\in\Gamma,\ |K|=1} K'\sum_{j=1}^k D_j'D_jK \ge 0$.

Proof. Suppose problem (LQ) is well-posed with $R = rI$, $r \in \mathbf{R}$. Then $\mathcal{P}_+ \ne \emptyset$. Since system (1) is stabilizable, we can take a $K \in \mathcal{K}_+$. It follows from Lemma 5.1 that $P := -\frac{Q + rK'K}{F_+(K)}$ is an upper bound of the nonempty set $\mathcal{P}_+$. Because $(rI + P_+\sum_{j=1}^k D_j'D_j)|_\Gamma \ge 0$ $\forall P_+ \in \mathcal{P}_+$, we conclude $(rI + P\sum_{j=1}^k D_j'D_j)|_\Gamma \ge 0$, which is equivalent to $r + PK'\sum_{j=1}^k D_j'D_jK \ge 0$ for any $K \in \Gamma$, $|K| = 1$. Hence $r + P\lambda \ge 0$. Substituting $P = -\frac{Q + rK'K}{F_+(K)}$ we obtain $r \ge \frac{\lambda Q}{F_+(K) - \lambda K'K}$. Since $K \in \mathcal{K}_+$ is arbitrary, we easily obtain that $r \ge \bar r_+$. Similarly, we have $r \ge \bar r_-$. Our analysis implies that problem (LQ) is not well-posed whenever the largest eigenvalue of $R$, $\lambda_{\max}(R)$, is such that $\lambda_{\max}(R) < \bar r$. Hence $\bar r$ is a lower bound of the well-posedness margin $r^*$.
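The quantities entering (43)–(44) can be approximated numerically for a concrete cone. The sketch below treats only the case Q ≥ 0 with Γ = R^m_+, reuses make_F() from the stabilizability sketch in section 4, and keeps the search inside K_+ or K_- by a simple penalty; it is a rough illustration with hypothetical choices, not the paper's algorithm.

```python
import numpy as np
from scipy.optimize import minimize

def margin_lower_bound(A, B, C, D, Q, m):
    """Rough numerical sketch of r_bar = max(r_bar_+, r_bar_-) from Theorem 5.2,
    assuming Gamma = R^m_+ and Q >= 0 (the Q < 0 branch would use a sup).
    Reuses make_F() defined in the earlier stabilizability sketch."""
    DD = sum(np.outer(D[j], D[j]) for j in range(len(D)))
    # lambda := inf over unit-norm K in the cone of K' (sum_j D_j'D_j) K
    lam = minimize(lambda K: float(K @ DD @ K) / max(float(K @ K), 1e-12),
                   x0=np.ones(m), bounds=[(0.0, None)] * m).fun
    r_bars = []
    for sign in (+1, -1):
        F = make_F(A, B, C, D, sign)
        h = lambda K: F(K) - lam * float(K @ K)
        g = lambda K: h(K) + 1e6 * max(F(K), 0.0)   # penalty keeps the search inside K_+ / K_-
        res = minimize(g, x0=np.ones(m), bounds=[(0.0, None)] * m)
        denom = h(res.x)
        r_bars.append(lam * Q / denom if denom < 0 else -np.inf)
    return max(r_bars)
```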


6. Optimality. This section is devoted to solving the optimal LQ control problem under consideration. We will first introduce two algebraic equations, in the spirit of the classical Riccati equation (for the unconstrained LQ problem), along with the notion of the so-called stabilizing solution. Then the optimality of problem (LQ) is addressed via the stabilizing solutions of the two algebraic equations. We impose the following assumptions on the rest of the paper. Assumption 6.1. System (1) is conic stabilizable. Assumption 6.2. Problem (LQ) is well-posed. 6.1. Extended algebraic Riccati equations. In this subsection we define the two algebraic equations that play a key role in solving problem (LQ). Denote k R := {P ∈ R|(R + P j=1 Dj Dj )|Γ > 0} and consider the following two functions from R to R: ⎡ ⎛ ⎞ ⎛ ⎞ ⎤ k k



ξ+ (P ) := arg min ⎣K  ⎝R + P Dj Dj ⎠ K + 2 ⎝B + Cj Dj ⎠ P K ⎦ , K∈Γ



j=1



ξ− (P ) := arg min ⎣K  ⎝R + P K∈Γ

k



j=1



Dj Dj ⎠ K − 2 ⎝B +

j=1

k





Cj Dj ⎠ P K ⎦ .

j=1

Note that the minimizers above are uniquely achievable due to a similar argument in Remark 5.3 and the fact that Γ is closed. Moreover, it is evident that both ξ+ (·) and ξ− (·) are continuous on R. Define a pair of functions L+ and L− from R to R: ⎞ ⎛ k

Cj2 ⎠ P + Q + Φ+ (P ), L+ (P ) := ⎝2A + j=1

⎛ L− (P ) := ⎝2A +

k

⎞ Cj2 ⎠ P + Q + Φ− (P ).

j=1

The two equations

⎞    ⎠ ⎝ R+P Dj Dj  > 0, L+ (P ) = 0,  j=1 ⎛

(45)

k

Γ

⎞    ⎠ ⎝ R+P Dj Dj  > 0 L− (P ) = 0,  j=1 ⎛

(46)

k

Γ

are called extended algebraic Riccati equations (EAREs). Note that the constraint k (R + P j=1 Dj Dj )|Γ > 0 is part of each of the two equations; so the EAREs are not exactly equations in a strict sense. Also, being an algebraic equation, each of them may admit more than one solution, or may admit no solution at all. Note that the EAREs introduced here both reduce to the same stochastic algebraic Riccati equation extensively studied in [2]. Definition 6.1. A solution P of the EARE (45) (respectively, (46)) is called a stabilizing solution if ξ+ (P ) ∈ K+ (respectively, ξ− (P ) ∈ K− ).

1138

XI CHEN AND XUN YU ZHOU

It should be noted that the EAREs may not admit any stabilizing solution (see Proposition 6.1). Before we conclude this subsection, we will present several lemmas. Lemma 6.1. We have the inequalities (47)

(P2 − P1 )[F+ (ξ+ (P2 )) − F+ (ξ+ (P1 ))] ≤ 0,

(48)

(P2 − P1 )[F− (ξ− (P2 )) − F− (ξ− (P1 ))] ≤ 0,

(49)

L+ (P2 ) − L+ (P1 ) ≤ (P2 − P1 )F+ (ξ+ (P1 )),

(50)

L− (P2 ) − L− (P1 ) ≤ (P2 − P1 )F− (ξ− (P1 ))

for any P1 , P2 ∈ R. Proof. Denoting v1 := ξ+ (P1 ) and v2 := ξ+ (P2 ), we have ⎛ v2 ⎝R + P1

k



Dj Dj ⎠ v2 + 2 ⎝B +

j=1

⎛ v1 ⎝R + P2

k





k

⎞ Cj Dj ⎠ P1 v2 − Φ+ (P1 ) ≥ 0,

j=1



Dj Dj ⎠ v1 + 2 ⎝B +

k

⎞ Cj Dj ⎠ P2 v1 − Φ+ (P2 ) ≥ 0.

j=1

j=1

Then, adding the two inequalities, we get ⎡



⎣v2 ⎝R + P1 ⎡

k





Dj Dj ⎠ v2 + 2 ⎝B +

j=1



≥ Φ+ (P1 ) − ⎣v1 ⎝R + P2

k



k

j=1





Cj Dj ⎠ P1 v2 ⎦ − Φ+ (P2 ) ⎛

Dj Dj ⎠ v1 + 2 ⎝B +

j=1

k





Cj Dj ⎠ P2 v1 ⎦ .

j=1

Recall that the infimum in Φ+ (Pi ) is achieved by vi , i = 1, 2; hence the above yields ⎡ (P2 − P1 ) ⎣v2 ⎡ ≤ (P2 − P1 ) ⎣v1

k

j=1 k

j=1

⎛ Dj Dj v2 + 2 ⎝B + ⎛ Dj Dj v1 + 2 ⎝B +

k

j=1 k

j=1

This is equivalent to (47). Similarly we can prove (48).





Cj Dj ⎠ v2 ⎦ ⎞



Cj Dj ⎠ v1 ⎦ .

1139

STOCHASTIC LQ CONTROL WITH CONIC CONTROL CONSTRAINT

Next, we calculate L+ (P2 ) − L+ (P1 ) ⎛ ⎞ k

= ⎝2A + Cj2 ⎠ (P2 − P1 ) + Φ+ (P2 ) − Φ+ (P1 ) j=1

⎛ ≤ ⎝2A + ⎛

k



Cj2 ⎠ (P2 − P1 ) + v1 ⎝R + P2

j=1

+ 2 ⎝B +



k

k

⎞ Dj Dj ⎠ v1

j=1

⎞ Cj Dj ⎠ P2 v1 − Φ+ (P1 )

j=1

⎡⎛

= (P2 − P1 ) ⎣⎝2A +

k

j=1

⎞ Cj2 ⎠ + v1

k

⎛ Dj Dj v1 + 2 ⎝B +

j=1

k





Cj Dj ⎠ v1 ⎦

j=1

= (P2 − P1 )F+ (ξ+ (P1 )). This proves (49). Similarly we can show (50). Lemma 6.2. Assume that P1 ∈ R and P2 ≥ P1 . If ξ+ (P1 ) ∈ K+ (respectively, ξ− (P1 ) ∈ K− ), then ξ+ (P2 ) ∈ K+ (respectively, ξ− (P2 ) ∈ K− ). Proof. Since P2 ≥ P1 it follows from (47) of Lemma 6.1 that F+ (ξ+ (P2 )) − F+ (ξ+ (P1 )) ≤ 0. As F+ (ξ+ (P1 )) < 0 we have F+ (ξ+ (P2 )) < 0, implying ξ+ (P2 ) ∈ K+ . Similarly we can prove the assertion for K− . (0) (0) Lemma 6.3. If there exists P+ ∈ R (respectively, P−0 ∈ R) with ξ+ (P+ ) ∈ K+ 0 (respectively, ξ− (P− ) ∈ K− ), then L+ (·) (respectively, L− (·)) is strictly decreasing on (0) [P+ , +∞) (respectively, [P−0 , +∞)). (0) Proof. Take P2 > P1 ≥ P+ . It follows from Lemma 6.2 that F+ (ξ+ (P1 )) < 0. On the other hand, it is clear that P1 , P2 ∈ R. Hence Lemma 6.1 yields L+ (P2 ) − L+ (P1 ) ≤ (P2 − P1 )F+ (ξ+ (P1 )) < 0. This proves that L+ (P2 ) < L+ (P1 ). We can prove the assertion for L− (P ) in a similar manner. 6.2. Optimality of the LQ problem via EAREs. In this subsection we prove that stabilizing solutions of (45) and (46), if any, lead to a complete and explicit solution to problem (LQ). Theorem 6.2. If the EAREs (45) and (46) admit stabilizing solutions P+∗ and ∗ P− respectively, then the feedback control (51)

u∗ (t) = ξ+ (P+∗ )x+ (t) + ξ− (P−∗ )x− (t)

is optimal for problem (LQ) with respect to any initial state x0 . Moreover, the value function is (52)

− 2 2 ∗ V (x0 ) = P+∗ (x+ 0 ) + P− (x0 ) ∀x0 ∈ R.

1140

XI CHEN AND XUN YU ZHOU

Proof. Since P+∗ solves (45), we have P+∗ = −

Q + ξ+ (P+∗ ) Rξ+ (P+∗ ) . F+ (ξ+ (P+∗ ))

P−∗ = −

Q + ξ− (P−∗ ) Rξ− (P−∗ ) . F− (ξ− (P−∗ ))

Similarly,

Moreover, ξ+ (P+∗ ) ∈ K+ , ξ− (P−∗ ) ∈ K− as both P+∗ and P−∗ are stabilizing solutions. − 2 2 ∗ Thus the proof of Lemma 5.1 yields V (x0 ) ≤ J(x0 ; u∗ (·)) = P+∗ (x+ 0 ) + P− (x0 ) . ∗ ∗ On the other hand, P+ ∈ P+ and P− ∈ P− . Hence it follows from Theorem 5.1 + 2 − 2 ∗ ∗ 2 ∗ that V (x0 ) ≥ P+∗ (x+ 0 ) + P− (x0 ) . Therefore, V (x0 ) = J(x0 ; u (·)) = P+ (x0 ) + − 2 ∗ P− (x0 ) . The above proof has also shown the following result. Corollary 6.1. If the EAREs (45) and (46) admit stabilizing solutions P+∗ and ∗ P− , respectively, then P+∗ = max{P |P ∈ P+ } and P−∗ = max{P |P ∈ P− }. As a result, (45) and (46) each has at most one stabilizing solution. Corollary 6.1 guarantees that any stabilizing solution is the maximal solution of the respective EAREs. This result is in parallel with the unconstrained case (see, e.g., [3, Theorem 2.3]). Note that the converse of Theorem 6.2 is not necessarily true. The following example shows that the existence of a solution to the EAREs is not necessary for the LQ problem to be attainable with respect to any initial state. Example 6.1. Consider the LQ problem  +∞ minimize J(x0 ; u(·)) = E [|x(t)|2 − |u(t)|2 ]dt 0 (53)  dx(t) = [−x(t) + u(t)]dt + [−x(t) + u(t)]dw(t), subject to x(0) = x0 , where all the variables are scalar-valued and Γ = R. This example was originally discussed in [33, Example 6.1, p. 817]. It was verified in [33] that the system is stabilizable, and the LQ problem is attainable with respect to any x0 (in fact there are infinitely many optimal feedback controls). But both EAREs (45) and (46) in this case reduce to −p + 1 = 0, −1 + p > 0, which clearly admits no solution at all. In spite of the preceding remarks and example, the following result shows that under an additional assumption, the EAREs indeed admit stabilizing solutions if problem (LQ) is attainable. Theorem 6.3. Assume that there exist P+ ∈ P+ and P− ∈ P− such that (R + k P j=1 Dj Dj )|Γ > 0 for P = P+ , P− . If problem (LQ) is attainable with respect to any x0 ∈ R, then the EAREs (45) and (46) admit stabilizing solutions P+∗ and P−∗ , respectively. Moreover, any optimal control with respect to a given x0 must be unique and represented by the feedback control (51). Proof. The proof of Proposition 5.2 yields that, under Assumptions 6.1 and 6.2, the value function can be represented as (32), where P+∗ and P−∗ are the maximum elements of P+ and P− , respectively. Moreover, by the assumption we have ⎞ ⎞ ⎛ ⎛   k k



  ⎝R + P+∗ (54) Dj Dj ⎠ > 0, ⎝R + P−∗ Dj Dj ⎠ > 0.   j=1 j=1 Γ

Γ

STOCHASTIC LQ CONTROL WITH CONIC CONTROL CONSTRAINT

1141

Now, for any x0 ∈ R let x∗ (·) be the solution of (1) under an optimal control u∗ (·) ∈ Ux0 which exists by the attainability assumption. Then a similar calculation to that of (28) leads to  t E [Qx∗ (s)2 + u∗ (s) Ru∗ (s)]ds 0  t + 2 − 2 ∗ ∗ ∗ ∗+ 2 ∗ ∗− 2 ψ(x∗ (s), u∗ (s))ds, = P+ (x0 ) + P− (x0 ) − E[P+ x (t) ] − E[P− x (t) ] + E 0

(55) where ψ(x∗ (s), u∗ (s)) is the same as the integrand on the right-hand side of (28), with P+ and P− replaced by P+∗ and P−∗ , respectively. Letting t → +∞ and noting that u∗ (·) is stabilizing, we obtain  t ∗ V (x0 ) ≡ J(x0 ; u (·)) = lim E [Qx∗ (s)2 + u∗ (s) Ru∗ (s)]ds t→+∞ 0  +∞ + 2 − 2 ∗ ∗ ψ(x∗ (s), u∗ (s))ds. = P+ (x0 ) + P− (x0 ) + E 0

2 P−∗ (x− 0)

2 P+∗ (x+ 0)

+ Recalling that V (x0 ) = same proof of Theorem 5.1), we conclude

and that ψ(x∗ (s), u∗ (s)) ≥ 0 (via the

ψ(x∗ (s), u∗ (s)) = 0, a.e. s ∈ [0, +∞), P-a.s. Fix s ∈ [0, +∞), satisfying the above equality. If x∗ (s) > 0, then we can write u∗ (s) = K(s)x∗ (s), where K(s) ∈ Γ. Going through the same analysis as in (29), we obtain ∗ ∗ 0=⎡ ψ(x ⎛ (s), u (s)) ⎞ ⎛ ⎞ k k



= ⎣⎝2A + Cj2 ⎠ P+∗ + Q + K(s) ⎝R + P+∗ Dj Dj ⎠ K(s)



+ 2 ⎝B + ⎡⎛ ≥ ⎣⎝2A +



j=1 k

j=1 k



j=1

Cj Dj ⎠ K(s)P+∗ ⎦ x∗ (s)2 ⎞



Cj2 ⎠ P+∗ + Q + Φ+ (P+∗ )⎦ x∗ (s)2

j=1

≥ 0. Thus, all the inequalities above become equalities and, noting that x∗ (s) = 0, one has K(s) = ξ+ (P+∗ ) and L+ (P+∗ ) = 0. As a result, u∗ (s) ≡ K(s)x∗ (s) = ξ+ (P+∗ )x∗ (s) at s when x∗ (s) > 0. Similarly, we can prove that u∗ (s) = −ξ− (P−∗ )x∗ (s) at s when x∗ (s) ≤ 0, and L− (P−∗ ) = 0. To summarize, we have shown that any optimal control u∗ (·) can be represented by (51), and hence the uniqueness of optimal control follows. On the other hand, we have also proved that P+∗ and P−∗ are solutions to the EAREs (45) and (46), respectively. Moreover, they must be stabilizing solutions because u∗ (·), which is now represented by (51), is a stabilizing control. Remark 6.1. Theorem 6.3 shows that the existence of stabilizing solutions to the EAREs (45) and (46) is almost necessary for the attainability of problem (LQ). The only exception, as also demonstrated by Example 6.1, is the “singular” case when k (R + P j=1 Dj Dj )|Γ = 0 for all elements P in at least one of the sets P+ and P− .

1142

XI CHEN AND XUN YU ZHOU

6.3. Existence of stabilizing solutions to EAREs. Theorem 6.2 asserts that if one can find stabilizing solutions to the EAREs, then the original optimal LQ control can be solved completely and explicitly in terms of obtaining the optimal feedback control as well as the value function. The next natural questions are, then, when do the EAREs admit stabilizing solutions, and how do we find them? These are the issues that we are going to address in this subsection. Indeed, we will identify and discuss three cases when the EAREs do have the stabilizing solutions. (0) (0) Theorem 6.4. If there exist P+ , P− satisfying ⎛

⎞   Dj Dj ⎠ > 0,  j=1

k

(0)

(0) L+ (P+ ) ≥ 0, ⎝R + P+

(56)

Γ



(0) L− (P− ) ≥ 0, ⎝R + P−

(57)

⎞    Dj Dj ⎠ > 0,  j=1

k

(0)

Γ

and (0)

(0)

F+ (ξ+ (P+ )) < 0,

(58)

F− (ξ− (P− )) < 0,

then the EAREs (45) and (46) admit unique stabilizing solutions P+∗ and P−∗ , respectively. (0) (0) Proof. If L+ (P+ ) = 0, then P+ is the stabilizing solution to (45) and we are k (0) (0) (0) done. So let us assume that L+ (P+ ) ≡ (2A + j=1 Cj2 )P+ + Q + Φ+ (P+ ) > 0, namely, F+ (ξ+ (P+ ))P+ + Q + ξ+ (P+ ) Rξ+ (P+ ) > 0. (0)

(0)

(0)

(0)

(0)

Since F+ (ξ+ (P+ )) < 0, we have Q + ξ+ (P+ ) Rξ+ (P+ ) (0)

(0)

P+ < −

(59)

(0)

(0)

F+ (ξ+ (P+ ))

(1)

(1)

:= P+ .

(1)

(0)

By Lemma 6.2, we have ξ+ (P+ ) ∈ K+ because P+ > P+ . Moreover, ⎛ (1) L+ (P+ )

≡ ⎝2A +

⎞ Cj2 ⎠ P+ + Q + Φ+ (P+ ) (1)

j=1

⎛ (60)

k

≤ ⎝2A + ⎛

k



k

j=1

= 0,



(1) Cj2 ⎠ P+

j=1

+ 2 ⎝B +

(1)

+Q+

(0) ξ+ (P+ )

⎝R +

(1) P+

k

j=1

⎞ Cj Dj ⎠ P+ ξ+ (P+ ) (1)

(0)

⎞ (0) Dj Dj ⎠ ξ+ (P+ )

1143

STOCHASTIC LQ CONTROL WITH CONIC CONTROL CONSTRAINT (1)

(1)

where the last inequality was due to the very definition of P+ . Now, if L+ (P+ ) = 0, (1) (1) then P+ is the stabilizing solution of (45). If L+ (P+ ) < 0, then, noting that L+ (·) (0) (1) is a strictly decreasing (by Lemma 6.3) continuous function on the interval [P+ , P+ ] (0) (0) (1) along with L+ (P+ ) > 0, we conclude that there exists a unique P+∗ ∈ (P+ , P+ )  k such that L+ (P+∗ ) = 0. Clearly (R + P+∗ j=1 Dj Dj )|Γ > 0, and ξ+ (P+∗ ) ∈ K+ thanks ∗ to Lemma 6.2. Hence P+ is the stabilizing solution of (45). The proof for the existence of the stabilizing solution to the EARE (46) is completely analogous. Finally, the uniqueness of the stabilizing solutions follows from Corollary 6.1. (0) (0) Theorem 6.5. If there exist P+ , P− satisfying ⎞ ⎛  k

 (0) (0) (61) Dj Dj ⎠ > 0, L+ (P+ ) > 0, ⎝R + P+  j=1 Γ

and



⎞   Dj Dj ⎠ > 0,  j=1

k

(0)

(0) L− (P− ) > 0, ⎝R + P−

(62)

Γ

then the EAREs (45) and (46) admit unique stabilizing solutions P+∗ and P−∗ , respectively. Proof. Take K+ ∈ K+ , which exists by the stabilizability assumption. Set (1)

P+ := −

(63)

 Q + K+ RK+ . F+ (K+ )

(1)

(0)

(0)

It follows from Lemma 5.1 that P+ ≥ P+ since P+ ∈ P+ . Hence ⎞ ⎛  k

 (1)  ⎝R + P+ Dj Dj ⎠ > 0.  j=1 Γ

Moreover, ⎛ L+ (P+ ) ≡ ⎝2A + (1)

⎞ Cj2 ⎠ P+ + Q + Φ+ (P+ ) (1)

j=1

⎛ (64)

k

≤ ⎝2A + ⎛

k





k

(1)

 ⎝ R + P+ Cj2 ⎠ P+ + Q + K+ (1)

j=1

+ 2 ⎝B +

(1)

k

⎞ Dj Dj ⎠ K+

j=1

⎞ Cj Dj ⎠ P+ K+ (1)

j=1

= 0. (0)

Applying Lemma 6.1 and noting that L+ (P+ ) > 0 we get (65)

(0)

(1)

(0)

(1)

(1)

0 < L+ (P+ ) − L+ (P+ ) ≤ (P+ − P+ )F+ (ξ+ (P+ )).

1144

XI CHEN AND XUN YU ZHOU (1)

(1)

Hence F+ (ξ+ (P+ )) < 0 or ξ+ (P+ ) ∈ K+ . Now set Q + ξ+ (P+ ) Rξ+ (P+ ) (1)

(2)

P+ := −

(66)

(1)

(1)

.

F+ (ξ+ (P+ )) k

 j=1 Dj Dj )|Γ (1) (2) (1) 0. On the other hand, the fact that L+ (P+ ) ≤ 0 can be rewritten as P+ ≤ P+ (1) view of the relation (66). Moreover, analysis similar to that for P+ above leads (2) (2) (2) L+ (P+ ) ≤ 0, and ξ+ (P+ ) ∈ K+ or F+ (ξ+ (P+ )) < 0. (2)

(2)

(0)

Again by Lemma 5.1 we obtain P+ ≥ P+ and, therefore, (R+P+

> in to

In general, we construct iteratively the following sequence: Q + ξ+ (P+ ) Rξ+ (P+ ) (i)

(i+1) P+

(67)

:= −

(i)

(i)

, i = 1, 2, . . . .

F+ (ξ+ (P+ )) (0)

(i+1)

(i)

(1)

≤ P+ · · · ≤ P+ , (R + An induction argument shows that P+ ≤ · · · ≤ P+ (i) (i) (i) k  P+ j=1 Dj Dj )|Γ > 0, L+ (P+ ) ≤ 0, and ξ+ (P+ ) ∈ K+ , i = 1, 2, . . . . Since the sequence {P+ } is decreasing with a lower bound P+ , there exists P+∗ ∈ R so k (i) that P+∗ = limi→∞ P+ . Moreover it is clear that (R + P+∗ j=1 Dj Dj )|Γ ≥ (R + (i) (0) k  ∗ P+ j=1 Dj Dj )|Γ > 0. On the other hand, L+ (P+ ) ≤ 0 since each L+ (P+ ) ≤ 0. Thus an argument similar to (65) yields ξ+ (P+∗ ) ∈ K+ or F+ (ξ+ (P+∗ )) < 0. As a (i)

(0)

result, we can pass the limit in (67) to obtain P+∗ = −

∗  ∗ Q+ξ+ (P+ ) Rξ+ (P+ ) , ∗ )) F+ (ξ+ (P+

which is

equivalent to L+ (P+∗ ) = 0. This shows that P+∗ is the stabilizing solution to (45). Similarly we can prove that (46) admits the stabilizing solution. Remark 6.2. Recall that Theorem 5.1 characterizes the well-posedness of problem (LQ) by the nonemptiness of the sets P+ and P− . Theorems 6.4 and 6.5 spell out two important cases when the EAREs (45) and (46) have stabilizing solutions and, therefore, problem (LQ) can be completely solved with explicit solutions. These two cases are specified in terms of the existence of certain “special elements” of the sets P+ and P− . Specifically, the case with Theorem 6.4 is one when each of P+ and P− has a “stabilizing element” in the sense that (58) holds. On the other hand, Theorem 6.5 asserts that the nonemptiness of the interiors of P+ and P− is sufficient for the existence of stabilizing solutions to the EAREs. In view of the fact that the nonemptiness of P+ and P− is the minimum requirement for the underlying LQ problem to be meaningful, the sufficient conditions respectively given in Theorems 6.4 and 6.5 are very mild indeed. Remark 6.3. The proof of Theorem 6.5 constitutes an algorithm for finding the stabilizing solutions to the EAREs. In fact it is given by the iterative scheme (67) with an initial point (63). On the other hand, although the proof of Theorem 6.4 has not given an explicit algorithm for computing the stabilizing solutions, one can use a middle-point algorithm to find them based on the proof. Alternatively, one may use (1) the same iterative scheme (67) with the initial point, P+ , given by (59). It can be proved, using almost the same analysis as that in the proof of Theorem 6.5, that the constructed sequence converges to the desired point, P+∗ . The only argument that needs to be modified is that for proving ξ+ (P+∗ ) ∈ K+ . In this case, ξ+ (P+∗ ) ∈ K+ is (0) (0) seen from the fact that P+∗ ≥ P+ and ξ+ (P+ ) ∈ K+ as well as from Lemma 6.2. Finally we present the results on the definite case Q ≥ 0 and R|Γ ≥ 0 (including the so-called singular case when R is allowed to be singular).

1145

STOCHASTIC LQ CONTROL WITH CONIC CONTROL CONSTRAINT

Finally we present the results on the definite case $Q \geq 0$ and $R|_\Gamma \geq 0$ (including the so-called singular case when $R$ is allowed to be singular).

Theorem 6.6. Assume $Q \geq 0$ and $R|_\Gamma \geq 0$. Then the EAREs (45) and (46) admit unique stabilizing solutions $P_+^*$ and $P_-^*$, respectively, under one of the following additional conditions:
(i) $Q > 0$ and $R|_\Gamma > 0$.
(ii) $Q = 0$, $R|_\Gamma > 0$, and $2A + \sum_{j=1}^k C_j^2 \neq 0$.
(iii) $Q > 0$, $R|_\Gamma \geq 0$, and $\sum_{j=1}^k D_j' D_j|_\Gamma > 0$.

Proof. (i) In this case $P_+^{(0)} = P_-^{(0)} = 0$ satisfy the assumption of Theorem 6.5.

(ii) If $2A + \sum_{j=1}^k C_j^2 < 0$, then take $P_+^{(0)} = P_-^{(0)} = 0$. We see that $L_+(P_+^{(0)}) = L_-(P_-^{(0)}) = Q = 0$ and $F_+(\xi_+(P_+^{(0)})) = F_-(\xi_-(P_-^{(0)})) = 2A + \sum_{j=1}^k C_j^2 < 0$. Thus the assumption of Theorem 6.4 is satisfied.

If $2A + \sum_{j=1}^k C_j^2 > 0$, then by the stabilizability assumption there is $0 \neq K_+ \in \mathcal{K}_+$. Set
$$P_+^{(1)} := -\frac{K_+' R K_+}{F_+(K_+)}.$$
As $F_+(K_+) < 0$, $R|_\Gamma > 0$, and $K_+ \neq 0$, we have $P_+^{(1)} > 0$. Define
$$(68)\qquad P_+^{(i+1)} := -\frac{\xi_+(P_+^{(i)})'\, R\, \xi_+(P_+^{(i)})}{F_+(\xi_+(P_+^{(i)}))}, \qquad i = 1, 2, \ldots.$$
Then an analysis similar to that in the proof of Theorem 6.5 leads to $0 \leq \cdots \leq P_+^{(i+1)} \leq P_+^{(i)} \leq \cdots \leq P_+^{(1)}$, $(R + P_+^{(i)} \sum_{j=1}^k D_j' D_j)|_\Gamma > 0$, $L_+(P_+^{(i)}) \leq 0$, and $\xi_+(P_+^{(i)}) \in \mathcal{K}_+$, $i = 1, 2, \ldots$. Hence there exists $P_+^* \geq 0$ so that $P_+^* = \lim_{i\to\infty} P_+^{(i)}$. Moreover $(R + P_+^* \sum_{j=1}^k D_j' D_j)|_\Gamma \geq R|_\Gamma > 0$. Note that at this point we can no longer apply the same argument as that used in the proof of Theorem 6.5 to conclude $F_+(\xi_+(P_+^*)) < 0$, because the element $0$, which substitutes for the point $P_+^{(0)}$ of Theorem 6.5, does not satisfy $L_+(0) > 0$. To get around this, let us suppose $F_+(\xi_+(P_+^*)) = 0$ (recall that $F_+(\xi_+(P_+^*)) \leq 0$ always holds, since $F_+(\xi_+(P_+^{(i)})) < 0$). Multiplying (68) by $F_+(\xi_+(P_+^{(i)}))$ and then passing to the limit, we obtain $\xi_+(P_+^*)' R\, \xi_+(P_+^*) = 0$, resulting in $\xi_+(P_+^*) = 0$. Thus $F_+(\xi_+(P_+^*)) = 2A + \sum_{j=1}^k C_j^2 > 0$, which is a contradiction. This proves that $F_+(\xi_+(P_+^*)) < 0$. The rest of the proof is the same as that of Theorem 6.5.

(iii) First note that for any $P > 0$,
$$0 \geq \Phi_+(P) \geq \inf_{K\in\Gamma} K'RK + P \inf_{K\in\Gamma}\Bigl[K'\sum_{j=1}^k D_j'D_j K + 2\Bigl(B + \sum_{j=1}^k C_jD_j\Bigr)K\Bigr] = P \inf_{K\in\Gamma}\Bigl[K'\sum_{j=1}^k D_j'D_j K + 2\Bigl(B + \sum_{j=1}^k C_jD_j\Bigr)K\Bigr].$$
Hence $\lim_{P\to 0^+}\Phi_+(P) = 0$. Since $Q > 0$, we have, by the definition of $L_+(\cdot)$, that there exists $P_+^{(0)} > 0$ so that $L_+(P_+^{(0)}) > 0$. On the other hand, it is clear that $(R + P_+^{(0)}\sum_{j=1}^k D_j'D_j)|_\Gamma \geq P_+^{(0)}\sum_{j=1}^k D_j'D_j|_\Gamma > 0$. Consequently Theorem 6.5 applies.
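As a side note, the iteration (68) used in part (ii) of the proof is straightforward to implement. The sketch below is illustrative only: xi_plus and F_plus are hypothetical names for user-supplied callables standing for the minimizer map $\xi_+(\cdot)$ and the closed-loop mean-square growth rate $F_+(\cdot)$ defined earlier in the paper, and K_init is any stabilizing gain with $F_+(K_{\mathrm{init}}) < 0$.

import numpy as np

# Fixed-point iteration (68) for the EARE (45) in the case Q = 0 of
# Theorem 6.6(ii).  xi_plus(P) and F_plus(K) are assumed user-supplied.
def iterate_eare_68(xi_plus, F_plus, R, K_init, tol=1e-12, max_iter=1000):
    K = np.asarray(K_init, dtype=float)
    P = -float(K @ R @ K) / F_plus(K)              # P^(1) as in the proof
    xi = K
    for _ in range(max_iter):
        xi = np.asarray(xi_plus(P), dtype=float)   # xi_+(P^(i))
        P_next = -float(xi @ R @ xi) / F_plus(xi)  # update (68)
        if abs(P_next - P) < tol:
            P = P_next
            break
        P = P_next
    return P, xi                                   # candidate P_+^* and xi_+(P_+^*)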

One may be curious about what happens if $Q = 0$, $R|_\Gamma > 0$, and $2A + \sum_{j=1}^k C_j^2 = 0$ (refer to Theorem 6.6(ii)). It turns out that in this case the EAREs never admit stabilizing solutions.

Proposition 6.1. Neither (45) nor (46) admits any stabilizing solution when $Q = 0$, $R|_\Gamma > 0$, and $2A + \sum_{j=1}^k C_j^2 = 0$.

Proof. By Assumption 6.1, there exist $0 \neq K_+ \in \mathcal{K}_+$ and $0 \neq K_- \in \mathcal{K}_-$; they are necessarily nonzero since $F_+(0) = F_-(0) = 0$. Denote $K_+^\varepsilon := \varepsilon K_+$ and $K_-^\varepsilon := \varepsilon K_-$ for $\varepsilon \in (0,1]$. Then $K_+^\varepsilon \in \Gamma$ and $K_-^\varepsilon \in \Gamma$.

For any fixed $P_+ > 0$, set $\varepsilon := \min\bigl\{\frac{-P_+ F_+(K_+)}{2K_+'RK_+},\, 1\bigr\} \in (0,1]$. Then
$$L_+(P_+) = \Phi_+(P_+) \leq (K_+^\varepsilon)'\Bigl(R + P_+\sum_{j=1}^k D_j'D_j\Bigr)K_+^\varepsilon + 2P_+\Bigl(B + \sum_{j=1}^k C_jD_j\Bigr)K_+^\varepsilon$$
$$\leq \varepsilon\Bigl\{\varepsilon K_+'RK_+ + P_+\Bigl[K_+'\sum_{j=1}^k D_j'D_j K_+ + 2\Bigl(B + \sum_{j=1}^k C_jD_j\Bigr)K_+\Bigr]\Bigr\} \leq \varepsilon\Bigl\{\frac{-P_+F_+(K_+)}{2K_+'RK_+}\,(K_+'RK_+) + P_+F_+(K_+)\Bigr\} = \varepsilon\,\frac{P_+F_+(K_+)}{2} < 0.$$
This implies that there exists no positive solution to the EARE (45).

Next, for any fixed $P_+ < 0$ with $(R + P_+\sum_{j=1}^k D_j'D_j)|_\Gamma > 0$, set
$$\varepsilon := \min\Bigl\{\frac{-|P_+|F_-(K_-)}{2K_-'RK_-},\, 1\Bigr\} \in (0,1].$$
Then
$$L_+(P_+) = \Phi_+(P_+) \leq (K_-^\varepsilon)'\Bigl(R + P_+\sum_{j=1}^k D_j'D_j\Bigr)K_-^\varepsilon + 2P_+\Bigl(B + \sum_{j=1}^k C_jD_j\Bigr)K_-^\varepsilon$$
$$\leq \varepsilon\Bigl\{\varepsilon K_-'RK_- + |P_+|\Bigl[K_-'\sum_{j=1}^k D_j'D_j K_- - 2\Bigl(B + \sum_{j=1}^k C_jD_j\Bigr)K_-\Bigr]\Bigr\} \leq \varepsilon\Bigl\{\frac{-|P_+|F_-(K_-)}{2K_-'RK_-}\,(K_-'RK_-) + |P_+|F_-(K_-)\Bigr\} = \varepsilon\,\frac{|P_+|F_-(K_-)}{2} < 0.$$
Hence there is no negative solution to the EARE (45). Finally, when $P_+ = 0$, we do have $L_+(P_+) = 0$, but $F_+(\xi_+(P_+)) = F_+(0) = 0$, so $P_+ = 0$ is not a stabilizing solution either. Similarly, we can prove the nonexistence of a stabilizing solution to (46).

Although the conclusion of Proposition 6.1 does not necessarily imply the nonexistence of an optimal feedback control for the corresponding LQ problem (refer to section 6.2), the following example shows that the latter can indeed occur.

Example 6.2. Consider the LQ problem
$$(69)\qquad \begin{array}{ll}\text{minimize} & J(x_0; u(\cdot)) = E\displaystyle\int_0^{+\infty} r|u(t)|^2\,dt\\[4pt] \text{subject to} & dx(t) = [ax(t) + bu(t)]\,dt + cx(t)\,dw(t),\quad x(0) = x_0,\end{array}$$
where all the variables are scalar-valued, $\Gamma = \mathbf{R}$, $2a + c^2 = 0$, $b > 0$, and $r > 0$. It is easy to verify that the problem is stabilizable and well-posed. Take a feedback control $u^\varepsilon(t) = -\varepsilon x^\varepsilon(t)$ for $\varepsilon > 0$. Under this control the state satisfies
$$E|x^\varepsilon(t)|^2 = e^{(2a + c^2 - 2b\varepsilon)t}x_0^2 = e^{-2b\varepsilon t}x_0^2.$$
Hence $u^\varepsilon$ is stabilizing. Moreover, the cost under this control is
$$J(x_0; u^\varepsilon(\cdot)) = \varepsilon^2 r\, E\int_0^{+\infty} |x^\varepsilon(t)|^2\,dt = \frac{r x_0^2}{2b}\,\varepsilon.$$
Letting $\varepsilon \to 0$ we see that $V(x_0) = 0$ for all $x_0 \in \mathbf{R}$. Note that this value cannot be attained if $x_0 \neq 0$, for whenever $E\int_0^{+\infty} r|u^*(t)|^2\,dt = 0$ it is necessary that $u^*(t) = 0$ a.e. $t \geq 0$; however, this control $u^*(t)$ is not stabilizing when $x_0 \neq 0$. In other words, the LQ problem is not attainable with respect to $x_0 \neq 0$.
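A quick numerical check of this example, not part of the paper, makes the phenomenon concrete: the cost of $u^\varepsilon$ shrinks linearly in $\varepsilon$, so the value $0$ is approached but never attained for $x_0 \neq 0$. The parameter values below are arbitrary illustrative choices satisfying $2a + c^2 = 0$, $b > 0$, and $r > 0$.

# Closed-form cost of the feedback u^eps(t) = -eps * x(t) in Example 6.2:
#   J(x0; u^eps) = r * eps * x0**2 / (2 * b),
# which tends to 0 as eps -> 0, while J = 0 itself would force u = 0,
# a control that is not stabilizing when x0 != 0.
a, b, r, x0 = -0.5, 1.0, 2.0, 1.0     # illustrative values
c = (-2.0 * a) ** 0.5                 # chosen so that 2a + c^2 = 0

def cost(eps):
    return r * eps * x0 ** 2 / (2.0 * b)

for eps in [1.0, 0.1, 0.01, 0.001]:
    print(f"eps = {eps:7.3f}   J(x0; u^eps) = {cost(eps):.6f}")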

Remark 6.4. Theorem 6.6(i),(ii) and Proposition 6.1, together with Example 6.2, give a complete answer to the question of optimality for problem (LQ) in the classical definite case $Q \geq 0$ and $R|_\Gamma > 0$. Moreover, Theorem 6.6(iii) addresses the case when $R$ is possibly singular. Note that this case occurs often in financial applications (where typically $R = 0$).

Remark 6.5. In view of Theorem 6.2, under the respective assumptions of Theorems 6.4, 6.5, and 6.6, problem (LQ) has the optimal feedback control (51) and the value function (52). Moreover, as per Remark 5.6, in these cases the stabilizing solutions $P_+^*$ and $P_-^*$ can also be obtained, in addition to the preceding algorithms, by solving the mathematical programs (33) and (34) if the corresponding constraints are tractable.

7. Numerical examples. To numerically calculate the optimal solution to problem (LQ) one needs to carry out two steps: the first is to check the conic stabilizability and the well-posedness, and the second is to find the stabilizing solutions to the EAREs. The procedures for the first step are described in sections 4 and 5.3, whereas that for the second step is described in section 6. Here we give an example to illustrate the whole process (we used the computing tool Scilab to carry out all the calculations).

Example 7.1. Consider problem (LQ) with $m = k = 3$, $\Gamma = \mathbf{R}_+^3$, and the dynamics coefficients as follows:
$$A = 2.00,\qquad B = (-50\ \ {-100}\ \ 200),$$
$$C_1 = -0.84,\quad D_1 = (6.85\ \ {-8.78}\ \ 0.68),\qquad C_2 = -3.78,\quad D_2 = (11.22\ \ 13.24\ \ 14.53),\qquad C_3 = 0.849,\quad D_3 = (-1.98\ \ {-5.44}\ \ {-2.32}).$$
The eigenvalues of $\sum_{j=1}^3 D_j'D_j$ are 1.5880509, 126.23912, and 547.84943, so $\sum_{j=1}^3 D_j'D_j$ is positive definite in this case. The cost parameters are
$$Q = 10,\qquad R = \begin{pmatrix}0 & 0 & 0\\ 0 & -5.0 & 0\\ 0 & 0 & 4.0\end{pmatrix}.$$

Hence this is an indefinite LQ problem. To solve this problem, we first apply Theorem 4.1(iii) (note that in this example the constraint $K \in \mathbf{R}_+^3$ is also an LMI) to obtain stabilizing feedback gains
$$K_+ = \begin{pmatrix}0.5435353\\ 0.5307289\\ 0.0701460\end{pmatrix},\qquad K_- = \begin{pmatrix}0.0816967\\ 0.0562570\\ 0.7185273\end{pmatrix}.$$
Next we use the algorithm in section 5.3 to find out that problem (LQ) is well-posed. Furthermore, when $P_+^{(0)} = P_-^{(0)} = 0.1$, the eigenvalues of $R + 0.1\sum_{j=1}^3 D_j'D_j$ are 1.641309, 10.293806, and 54.632545; hence $R + 0.1\sum_{j=1}^3 D_j'D_j > 0$. On the other hand, $L_+(0.1) = 1.6073676 > 0$ and $L_-(0.1) = 4.0651986 > 0$. According to Theorem 6.5, problem (LQ) has an optimal feedback control. Now we use the algorithm given in the proof of Theorem 6.5 to obtain the optimal control and optimal value. First, set the initial values $P_+^{(1)}$ and $P_-^{(1)}$ by using $K_+$, $K_-$ and formulas similar to (63), respectively. They are
$$P_+^{(1)} = 1.1814412,\qquad P_-^{(1)} = 13.167238.$$
By the iterative formula (67), we obtain
$$P_+^* = 0.1225762,\quad \xi_+(P_+^*) = \begin{pmatrix}0.2889240\\ 0.4918755\\ 0\end{pmatrix},\qquad P_-^* = 0.1561608,\quad \xi_-(P_-^*) = \begin{pmatrix}0\\ 0\\ 0.5875815\end{pmatrix}.$$
Therefore, the optimal feedback control is
$$u^*(t) = \begin{pmatrix}0.2889240\\ 0.4918755\\ 0\end{pmatrix}x^+(t) + \begin{pmatrix}0\\ 0\\ 0.5875815\end{pmatrix}x^-(t),$$
with the optimal cost
$$J^*(x_0) = 0.1225762\,(x_0^+)^2 + 0.1561608\,(x_0^-)^2.$$
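Although the computations above were done in Scilab, the whole iteration fits in a short Python sketch. The sketch rests on two assumptions that are not restated in this section: that $\xi_+(P)$ is the minimizer over $\Gamma = \mathbf{R}_+^3$ of $K'(R + P\sum_j D_j'D_j)K + 2P(B + \sum_j C_jD_j)K$, and that the update (67) has the fixed-point form $P \leftarrow -(Q + \xi'R\xi)/F_+(\xi)$ suggested by $L_+(P) = 0$, where $F_+$ is the closed-loop mean-square growth rate used throughout this section. The authoritative definitions are (63) and (67) earlier in the paper, so the figures above need not be reproduced to the last digit.

import numpy as np
from scipy.optimize import minimize

# Data of Example 7.1: scalar state, three-dimensional control, Gamma = R^3_+.
A, Q = 2.00, 10.0
B = np.array([-50.0, -100.0, 200.0])
C = np.array([-0.84, -3.78, 0.849])
D = np.array([[6.85, -8.78, 0.68],       # D_1
              [11.22, 13.24, 14.53],     # D_2
              [-1.98, -5.44, -2.32]])    # D_3
R = np.diag([0.0, -5.0, 4.0])

DD = D.T @ D                  # sum_j D_j' D_j  (each D_j is a row of D)
CD = C @ D                    # sum_j C_j D_j
c0 = 2.0 * A + float(C @ C)   # 2A + sum_j C_j^2

def F_plus(K):
    # Closed-loop mean-square growth rate under the gain K (assumed form).
    return c0 + float(K @ DD @ K) + 2.0 * float((B + CD) @ K)

def xi_plus(P):
    # Assumed definition of xi_+(P): minimizer over the cone R^3_+.
    obj = lambda K: float(K @ (R + P * DD) @ K) + 2.0 * P * float((B + CD) @ K)
    res = minimize(obj, x0=np.ones(3), method="L-BFGS-B", bounds=[(0.0, None)] * 3)
    return res.x

P = 1.1814412                 # the initial value P_+^(1) reported above
for _ in range(500):
    xi = xi_plus(P)
    P_new = -(Q + float(xi @ R @ xi)) / F_plus(xi)   # assumed fixed-point form of (67)
    if abs(P_new - P) < 1e-10:
        P = P_new
        break
    P = P_new
print("P_+^* ~", round(P, 7), "  xi_+(P_+^*) ~", np.round(xi, 7))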

In the next example we demonstrate the calculation of a lower bound of the well-posedness margin (refer to section 5.4).

Example 7.2. Using the same values of the coefficients $A$, $B$, $C_j$, $D_j$, $j = 1, 2, 3$, and $Q$ as in Example 7.1, we want to compute a lower bound of the well-posedness margin $r^*$. According to Theorem 5.2, we first calculate $\lambda = 176.7313$. Next we have $\bar r_+ = -9.434553$ and $\bar r_- = -6.4967141$. Hence the lower bound of the well-posedness margin is $\bar r = -6.4967141$. Note that, as seen in Example 7.1, the problem is still well-posed when one of the eigenvalues of $R$ is $-5$.

8. Conclusion. In this paper, we studied an indefinite stochastic LQ control problem in the infinite time horizon with a conic control constraint. Several key issues, including conic stabilizability, well-posedness, and optimality, were addressed with complete solutions. In particular, two algebraic equations, the EAREs, were newly introduced in lieu of the classical algebraic Riccati equation; their stabilizing solutions give rise to the explicit forms of the optimal feedback control and the value function. It was also seen that the representation of the value function given by Proposition 5.1 served as the technical key to all the main results of this paper, and it motivated the use of the celebrated Tanaka formula.

It should be stressed again that the approach of this paper depends crucially on the special structure of the problem. One main assumption is that the state of the system is one-dimensional. While the conclusion of Proposition 5.1 appears to hold, mutatis mutandis, for the problem with a multidimensional state variable, it seems that an analogue of Lemma 3.2, if any, would be far more complicated. This makes the multidimensional problem very challenging. Another structural property of the model is that the dynamics of the system are homogeneous (in state and control) and the cost contains neither a first-order term in the state variable nor a control-state cross term. As a result, our approach will fail, for instance, when the noise does not depend on the state and the control (i.e., has a fixed variance). Solving these kinds of problems calls for different techniques. Finally, an even more difficult problem is stochastic LQ control with state constraints.

Acknowledgment. We thank the three anonymous reviewers for their careful reading of an earlier version of the paper and for their constructive comments that led to an improved version.