Math. Control Signals Syst. (2011) 23:141–157 DOI 10.1007/s00498-011-0064-9 ORIGINAL ARTICLE
Quadratic costs and second moments of jump linear systems with general Markov chain Eduardo F. Costa · Alessandro N. Vargas · João B. R. do Val
Received: 16 November 2008 / Accepted: 23 August 2011 / Published online: 14 September 2011 © Springer-Verlag London Limited 2011
Abstract This paper presents an analytic, systematic approach to handle quadratic functionals associated with Markov jump linear systems with general jumping state. The Markov chain is finite state, but otherwise general, possibly reducible and periodic. We study how the second moment dynamics are affected by the additive noise and by the asymptotic behaviour, either oscillatory or invariant, of the Markov chain. The paper comprises a series of evaluations that lead to a tight two-sided bound for quadratic cost functionals. A tight two-sided bound for the norm of the second moment of the system is also obtained. These bounds allow us to show that the long-run average cost is well defined for systems that are stable in the mean square sense, in spite of the possibly periodic behaviour of the chain and taking into consideration that it may not be unique, as it may depend on the initial distribution. We also address the important question of approximation of the long-run average cost via adherence of finite horizon costs.
Research supported by FAPESP Grants 03/06736-7 and 04/06947-0 and by the CNPq Grants 304429/2007-4, 304856/2007-0 and 306466/2010-4. E. F. Costa (B) Universidade de São Paulo (USP), Depto. de Matemática Aplicada e Estatística, C.P. 668, São Carlos, SP 13560-970, Brazil e-mail:
[email protected] A. N. Vargas · J. B. R. do Val Universidade Estadual de Campinas (UNICAMP), Fac. de Eng. Elétrica e de Computação, Depto. de Telemática, C.P. 6101, Campinas, SP 13081-970, Brazil e-mail:
[email protected] J. B. R. do Val e-mail:
[email protected]
Keywords Linear systems · Markov jump systems · Long-run average cost · Bounds for finite horizon cost
Mathematics Subject Classification (2000) 93C05 · 60J05 · 74H40 · 74G45 · 74H55
1 Introduction

The paper focuses on the Markov jump linear system (MJLS) as studied in [1–3], in the situation when the Markov chain is finite-state and homogeneous, but otherwise general. It may form a reducible or irreducible chain with either periodic or aperiodic states. The MJLS has a composite state (x(k), θ(k)), where x ∈ R^n is the continuous state and θ is the jumping state that evolves according to a homogeneous Markov chain. An additive wide-sense white noise sequence is also taken into account. Basically, MJLSs are stochastic linear systems that comprise, as particular subclasses, deterministic linear systems, periodic systems, and even systems with i.i.d. random parameters. They are well suited to model dynamic systems of practical importance, as they convey abrupt changes in structure or parameters; e.g., see [4] and [5] for applications in economics and stochastic finance, and [6] for the control of paper mills. We consider the long-run average cost functional

C_T := (1/T) Σ_{k=0}^{T} E[ x(k)' Q(k, θ(k)) x(k) ],    (1)
for finite T, and we investigate bounds for C_T and the limiting behaviour when the number of stages T tends to infinity, under the assumption that the MJLS is stable in the mean square sense. The control problem associated with (1) when T → ∞ is an important paradigm for many applications; see for instance [7] or [8]. We also study how those bounds can be extended to the conditional second moments of the system.

In the literature, bounds for the cost C_T are connected to basic conditions in different contexts, such as conditions for the existence of stabilizing receding horizon controls [9,10] or uniform cost convergence [11,12], among others. Note that even in the context of MJLSs with complete state observation and a general Markov chain there is no available result paralleling the ones obtained here. The results in [3, Thm. 4.6] rely on an ergodicity assumption and require complete state observation.

This paper obtains bounds for the cost in (1) in a general scenario. The result is a two-sided evaluation that may vary cyclically with the period of the Markov chain; when the chain is aperiodic, these bounds are symmetric with respect to an asymptote, see Theorem 1. The evaluation is tight, in the sense that if one considers an appropriate initial distribution for the chain and appropriate initial conditional second moment matrices for the system, the two-sided evaluation shows no gap, see Remark 2. For the limiting situation (T → ∞), the idea is to address an adequate limit notion for (1) and to obtain an asymptotic evaluation on its own terms, valid for mean square
stable MJLSs. These are hard tasks, as the straightforward limit may not exist or may be nonunique; in the general case one seeks the limit supremum or infimum of (1), e.g., see [7,8] in the context of Markov decision processes. The aforementioned bounds allow us to demonstrate that the long-run average cost is well defined and finite under mean square stability. The results are relevant to controlled MJLSs when the observation of the Markov state and/or the continuous state is not completely available to the controller. The corresponding finite cost for any fixed control is easily evaluated, and it plays the role of an approximation to the long-run average cost, with error bounds evaluated by Theorem 1, yielding some understanding of this hard control problem.

The bounds obtained for C_T are extended to the conditional second moment of the system, showing that the difference between the second moment at time instant k and the limiting (possibly cyclic) second moment is limited by linear terms involving the initial distribution of the Markov chain and the initial conditional second moment, see Corollary 1. It is remarkable that the initial distribution of the Markov chain has to be taken explicitly into account, since the bounds cannot be expressed solely in terms of the initial conditional second moment, as illustrated by Example 1.

After some preliminaries in Sect. 2, we deal in Sect. 3 with Markov chain convergence in the sense of a periodic limiting probability. In Sect. 4, we obtain the two-sided bound evaluation for the cost and for the conditional second moment of the system, which are presented in Theorem 1 and Corollary 1. After that, we are prepared to deal with the long-run average cost in Sect. 4.1.

2 Definitions and preliminary results

Let R^n denote the n-dimensional Euclidean space with the usual norm | · |. Let R^{r,s} (respectively, R^r) represent the normed linear space formed by all r × s (respectively, r × r) real matrices, let R^{r∗} denote the set {U ∈ R^r : U = U'}, where U' denotes the transpose of U, and consider R^{r0} the closed convex cone {U ∈ R^r : U = U' ≥ 0}; U ≥ V signifies that U − V ∈ R^{r0}. ‖U‖ stands for the largest singular value of U. Let N := {1, . . . , N} be a finite set, and let M^{r,s} denote the linear space formed by a number N of matrices such that M^{r,s} = {U = (U_1, . . . , U_N) : U_i ∈ R^{r,s}, i ∈ N}; also, M^r ≡ M^{r,r}. For U, V ∈ M^r, U ≥ V signifies that U_i − V_i ∈ R^{r0} for each i ∈ N, and similarly for other mathematical relations. We denote by M^{r∗} (M^{r0}) the set M^r when it is made up of U_i ∈ R^{r∗} (U_i ∈ R^{r0}) for all i ∈ N. Consider tr{·} as the trace operator. It is known that M^{r,s} equipped with the inner product

⟨U, V⟩ = Σ_{i=1}^{N} tr{U_i' V_i}
forms a Hilbert space. For convenience, instead of dealing with the induced norm, we consider the equivalent supremum norm ‖U‖ = max_{i∈N} ‖U_i‖. For an operator U : M^{r∗} → M^{r∗}, σ(U) stands for the spectral radius of U, and the corresponding norm is defined as ‖U‖ = sup_{V∈M^{r∗}, V≠0} ‖U(V)‖/‖V‖.
Let P = [p_{ij}], i, j ∈ N, be a stochastic matrix, i.e., Σ_{j=1}^{N} p_{ij} = 1 and p_{ij} ≥ 0 for all i, j ∈ N. For convenience, we adopt here the notation of [13] and define, for each U ∈ M^n and V ∈ M^{n∗}, the operator T_U : M^{n∗} → M^{n∗} according to

T_{U,i}(V) := Σ_{j=1}^{N} p_{ji} U_j V_j U_j',   ∀i ∈ N.    (2)
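For readers who wish to experiment numerically, the following Python/NumPy sketch implements the inner product, the supremum norm and the operator T_U of (2). The function names are our own choices, not notation from the paper.

import numpy as np

def inner(U, V):
    # <U, V> = sum_i tr(U_i' V_i), for collections U, V of N matrices
    return sum(np.trace(Ui.T @ Vi) for Ui, Vi in zip(U, V))

def sup_norm(U):
    # equivalent supremum norm: max_i of the largest singular value of U_i
    return max(np.linalg.norm(Ui, 2) for Ui in U)

def T_op(P, U, V):
    # operator (2): T_{U,i}(V) = sum_j p_{ji} U_j V_j U_j', returned for all i in N
    N = len(U)
    return [sum(P[j, i] * U[j] @ V[j] @ U[j].T for j in range(N)) for i in range(N)]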
We denote T^0(V) = V, and for t ≥ 1 we define T^t(V) recursively by T^t(V) = T(T^{t−1}(V)) (the notation T is used instead of T_U whenever unambiguous). Notice that the operator T is linear and continuous. Let us define the set of identity matrices I = {I_1, . . . , I_N} ∈ M^{n0}. We need the following auxiliary results, the proofs of which are either omitted or quoted.

Proposition 1 Let U ∈ M^{n∗} and V ∈ M^{n0}. The following statements hold:
(i) ⟨‖U‖ T^k(I), V⟩ ≥ ⟨T^k(U), V⟩;
(ii) ⟨−‖U‖ T^k(I), V⟩ ≤ ⟨T^k(U), V⟩.
Proposition 2 The following statements hold:
(i) [3, Proposition 2.5] Let U be an operator U : M^n → M^n. Then σ(U) < 1 if and only if ‖U^k‖ ≤ ηξ^k, k = 0, 1, . . ., for some η ≥ 0 and 0 < ξ < 1.
(ii) Consider the following system in R^n: z(k + 1) = Hz(k), ∀k ≥ 0, z(0) = z_0, H ∈ R^n. If z(k) → 0 as k → ∞, then there exist η ≥ 0 and 0 ≤ ξ < 1 such that |z(k)| ≤ ηξ^k |z_0|.

Let (Ω, F, P) be the fundamental probability space. Let Θ := {θ(k); k = 0, 1, . . .} be the discrete-time homogeneous Markov chain with state space N. The state θ of the Markov chain at time k has an associated probability distribution, namely, π_i(k) := Pr(θ(k) = i). Considering the vector π(k) = [π_1(k), . . . , π_N(k)], the state distribution of the chain is given by π(k) = π(0)P^k. Assuming that

π(∞) = lim_{k→∞} π(k)

exists, Proposition 2 (ii) with z(k) = π(k) − π(∞) yields a geometric convergence rate for π(k); this, in addition to the results in [14, Ch. 3], leads to the next facts.

Proposition 3 The following statements hold:
(i) Let Θ be a finite Markov chain. If Θ is periodic with period δ, then P^δ is the transition matrix of an aperiodic Markov chain.
(ii) Let Θ be a finite aperiodic Markov chain. Then for each initial distribution π_0, there exist π(∞) and η ≥ 0, 0 ≤ ξ < 1 such that ‖π(k) − π(∞)‖ ≤ ηξ^k.
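A small numerical sketch of how the limits in Proposition 3 (and the cyclic limits π_ℓ(∞) used later in Lemma 2, Sect. 3) can be approximated: raise P to a large multiple of the period, which is also how Example 2 below generates near-stationary initial distributions. The helper name and the truncation parameter M are our own choices.

import numpy as np

def periodic_limits(P, pi0, delta, M=2000):
    # approximate pi_l(inf) = lim_k pi0 P^(k*delta + l), for l = 0, ..., delta-1
    Pbig = np.linalg.matrix_power(P, M * delta)
    return [pi0 @ Pbig @ np.linalg.matrix_power(P, l) for l in range(delta)]

# the two-state (aperiodic, delta = 1) chain of Example 1 below:
P = np.array([[0.9, 0.1],
              [0.0, 1.0]])
print(periodic_limits(P, np.array([1.0, 0.0]), delta=1))   # tends to [0, 1]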
2.1 The discrete-time Markov jump linear system

Consider the discrete-time MJLS H, defined in a probability space (Ω, F, P), by

H:  x(k + 1) = A_{θ(k)} x(k) + E_{θ(k)} w(k),
    z(k) = x(k)' Q_{θ(k)}(k) x(k),
    k ≥ 0,  x(0) = x_0,  θ(0) ∼ π_0,

where (x, θ) is the state, with x ∈ R^n and θ taking values in N; θ is the state of a Markov chain and the joint process {x, θ} is Markovian. The second expression in H represents the cost per stage z. The stochastic process {w(k); k ≥ 0} is a second-order i.i.d. sequence of random vectors with zero mean and covariance matrix E[w(k)w(k)'] = I, ∀k ≥ 0, where E[·] represents the expected value; {w(k); k ≥ 0} is independent of {θ(k); k ≥ 0}, and in particular x(k) and w(k) are independent random vectors. The finite sets of matrices A ∈ M^n, E (with E_i of compatible dimensions) and Q(k) ∈ M^{n0}, for each k ≥ 0, are given, and we assume that there exists Q̄ ∈ M^{n0} such that Q(k) ≤ Q̄, ∀k ≥ 0. In this notation, A_{θ(k)} = A_i ∈ A, E_{θ(k)} = E_i ∈ E and Q_{θ(k)}(k) = Q_i(k) ∈ Q(k) whenever θ(k) = i. We consider the quadratic cost functional with a horizon of T stages, associated with H, defined by

J^T := Σ_{k=0}^{T} E_{x_0,π_0}[ z(k) ],    (3)
where E_{x_0,π_0}[·] ≡ E[· | x(0) = x_0, θ(0) ∼ π_0] and the horizon T may be infinite.

2.2 Second moment matrices and related results

Let us define the set of matrices of second moments of the state, X(k) ∈ M^{n0}, as

X_i(k) = E_{x_0,π_0}[ x(k)x(k)' 1_{{θ(k)=i}} ],   ∀i ∈ N, k ≥ 0,    (4)

where π_0 and x(0) = x_0 are known vectors and 1_C represents the indicator function of the set C. For instance, with this notation we can write E_{x_0,π_0}[ x(k)' Q_{θ(k)}(k) x(k) ] = ⟨X(k), Q(k)⟩ and hence express the cost J^T in (3) as

J^T(X(0)) := Σ_{k=0}^{T} ⟨X(k), Q(k)⟩.    (5)
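The cost (3) can also be estimated directly by Monte Carlo simulation of H, which is a convenient sanity check for the deterministic expression (5). The sketch below is a minimal simulator of our own; it is instantiated with the data of Example 1 below, and the sample size is an arbitrary choice.

import numpy as np

rng = np.random.default_rng(0)

def simulate_cost(A, E, Q, P, x0, pi0, T, n_paths=5000):
    # Monte Carlo estimate of J^T in (3): average over sample paths of sum_k z(k)
    N = len(A)
    total = 0.0
    for _ in range(n_paths):
        x = np.array(x0, dtype=float)
        th = rng.choice(N, p=pi0)
        for k in range(T + 1):
            total += x @ Q[th] @ x                    # z(k) = x(k)' Q_theta(k)(k) x(k)
            w = rng.standard_normal(E[th].shape[1])   # zero mean, E[w w'] = I
            x = A[th] @ x + E[th] @ w                 # x(k+1) = A_theta x(k) + E_theta w(k)
            th = rng.choice(N, p=P[th])               # theta(k+1) ~ row theta(k) of P
    return total / n_paths

# data of Example 1 below: scalar modes, noise acting only in mode 1
A = [np.array([[0.5]]), np.array([[0.5]])]
E = [np.array([[1.0]]), np.array([[0.0]])]
Q = [np.eye(1), np.eye(1)]
P = np.array([[0.9, 0.1], [0.0, 1.0]])
print(simulate_cost(A, E, Q, P, x0=[0.0], pi0=[1.0, 0.0], T=50))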
Fig. 1 Norm of the conditional second moments for the system in Example 1
We also denote Π(k) = (π_1(k)I, . . . , π_N(k)I) and define Ψ(k) ∈ M^{n0}, k ≥ 0, as

Ψ(k) = T_E(Π(k)).    (6)
It is well known that the dynamics of X(k), k ≥ 0, can be represented using a linear operator, see e.g. [3], [15] or [13]. In particular, the next result is proven in [3, Ch. 3].

Proposition 4 Consider V(k) ∈ M^{n0} defined according to

V(k + 1) = T_A(V(k)) + Ψ(k),   k ≥ 0,   V(0) = V ∈ M^{n0}.    (7)
Then X(k) = V(k), provided that V_i = π_i(0)x(0)x(0)', ∀i ∈ N.

Remark 1 Note that Ψ(k) defined in (6) depends on π_0, P and E, and not on x_0. As Ψ is the forcing term of (7), the second moments X (and consequently the cost) present a direct dependence on π_0 that is not necessarily captured by X(0), see the illustrative Example 1.

Example 1 Consider system H with A = (0.5, 0.5), E = (1, 0), Q(k) = (1, 1), k ≥ 0, x_0 = 0, and p_11 = 0.9, p_12 = 0.1, p_21 = 0 and p_22 = 1 in P. As x_0 = 0, we have that X(0) = 0. Figure 1 shows 5 plots of ‖X(k)‖ for different values of π_0, suggesting that X(∞) = 0 (in fact, Proposition 4 yields X(1000) ≈ (2.59, 0.398) × 10^{−46} for π_0 = [1 0]), and illustrating how X(k) depends on π_0 rather than on X(0) or X(∞).
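A minimal sketch of the recursion of Proposition 4 applied to the data of Example 1; it reproduces the qualitative behaviour reported there (X(k) → 0, with the transient driven by π_0 through the forcing term Ψ(k)). The helper name and the horizon are our own choices.

import numpy as np

A = [np.array([[0.5]]), np.array([[0.5]])]
E = [np.array([[1.0]]), np.array([[0.0]])]
P = np.array([[0.9, 0.1], [0.0, 1.0]])

def second_moments(pi0, x0, K):
    # Proposition 4: X(k+1) = T_A(X(k)) + Psi(k), Psi(k) = T_E(Pi(k)), X_i(0) = pi_i(0) x0 x0'
    N = len(A)
    pi = np.array(pi0, dtype=float)
    X = [pi[i] * np.outer(x0, x0) for i in range(N)]
    for _ in range(K):
        Psi = [sum(P[j, i] * pi[j] * E[j] @ E[j].T for j in range(N)) for i in range(N)]
        X = [sum(P[j, i] * A[j] @ X[j] @ A[j].T for j in range(N)) + Psi[i] for i in range(N)]
        pi = pi @ P
    return X

X = second_moments(pi0=[1.0, 0.0], x0=np.zeros(1), K=1000)
print(max(Xi.item() for Xi in X))   # of order 1e-46 for pi0 = [1, 0], in line with Example 1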
We consider the standard notion of mean square stability, which applied here states that H is mean square stable (MSS) if E[|x(k)|^2] → 0 as k → ∞ for each x(0) = x_0 and π(0) = π_0, whenever E ≡ 0 (no additive noise); equivalently, the operator T_A is contractive, see [13,15]. The next lemma is a mere consequence of the MSS definition and Proposition 1.

Lemma 1 The following statements hold.
(i) H is MSS if and only if σ(T_A) < 1.
(ii) If H is MSS, then for any U ∈ M^{n∗} and V(k) ≤ V̄ ∈ M^{n0}, ∀k ≥ 0, there exists a scalar ρ > 0 such that, for any integers 0 ≤ t ≤ T, Σ_{k=t}^{T} ⟨T_A^k(U), V(k)⟩ ≤ ρ‖U‖.

3 Periodic limiting distribution and related evaluations

We first show that π presents a geometric convergence to a periodic limiting distribution, with parameters η and ξ that do not depend on π_0. The literature frequently focuses on the rate of convergence only, and the parameter η remains dependent on π_0.

Lemma 2 Assume that Θ is a finite Markov chain with period δ, and define π_ℓ(∞) = lim_{k→∞} π_0 P^{kδ+ℓ}, 0 ≤ ℓ ≤ δ − 1. Then, there exist η ≥ 0 and 0 ≤ ξ < 1 such that

‖π(kδ + ℓ) − π_ℓ(∞)‖ ≤ ηξ^k ‖π(0)P^ℓ − π_ℓ(∞)‖,   0 ≤ ℓ ≤ δ − 1, ∀k ≥ 0.

The proof of Lemma 2 follows in a straightforward manner from the fact that, for a periodic Markov chain with period δ, θ(kδ + ℓ) is an aperiodic Markov chain; Proposition 2, part (ii), then applies. As a consequence, from the sequence π(k), k ≥ 0, we have for each 0 ≤ ℓ ≤ δ − 1 that the subsequence π(kδ + ℓ) converges to a periodic limiting distribution π_ℓ(∞), which will pervade most evaluations in this paper. One way to avoid a complex notation and to obtain a formulation that resembles the situation when Θ is aperiodic is to define the partial costs

J_ℓ^T = Σ_{k=0}^{n_ℓ} ⟨X_ℓ(k), Q_ℓ(k)⟩,   ℓ = 0, . . . , δ − 1,    (8)
where n_ℓ is the largest integer such that n_ℓ δ + ℓ ≤ T, and

X_ℓ(k) = X(kδ + ℓ),   Q_ℓ(k) = Q(kδ + ℓ),

which allows us to express the cost (3) for all T ≥ δ as

J^T = Σ_{ℓ=0}^{δ−1} J_ℓ^T.    (9)
We also define, for each k ≥ 0 and ℓ = 0, . . . , δ − 1, Ψ_ℓ(k) = Ψ(kδ + ℓ). The following relation with the original system (7) is straightforward and the proof is omitted.

Lemma 3 For 0 ≤ ℓ ≤ δ − 1, we have X_ℓ(0) = T_A^ℓ(X(0)) + Σ_{j=1}^{ℓ} T_A^{j−1}(Ψ_{ℓ−j}(0)) and

X_ℓ(k) = T_A^δ(X_ℓ(k − 1)) + S_ℓ(k),   k ≥ 1,    (10)
where

S_ℓ(k) = Σ_{j=1}^{δ} T_A^{j−1}(Ψ_{ℓ−j}(k)).    (11)

In addition, it is simple to check that, for each 0 ≤ ℓ ≤ δ − 1, the sequence Π_ℓ(k), k ≥ 0, defined as Π_ℓ(k) = Π(kδ + ℓ) = (π_1(kδ + ℓ)I, . . . , π_N(kδ + ℓ)I), converges to Π_ℓ(∞) = lim_{k→∞} Π(kδ + ℓ). This allows us to define Ψ_ℓ(∞), S_ℓ(∞) ∈ M^{n0}, respectively, as

Ψ_ℓ(∞) = T_E(Π_ℓ(∞)),   S_ℓ(∞) = lim_{k→∞} S_ℓ(k).    (12)
4 Bounds for the finite horizon cost

In this section, we assess different transient behaviours that are involved in the conditional second moments X of system H to produce, in Theorem 1, a tight evaluation for the finite horizon cost. We start by addressing the forcing term of (10), S_ℓ. As a consequence of the relation between S_ℓ and Ψ given in (11), these quantities depend on π_0 and not on x_0, hence they may not be described solely by X(0), see Remark 1. This is why the following result is expressed in terms of π_0 instead of X(0).

Lemma 4 The Markov chain for system H is either periodic (δ > 1) or aperiodic (δ = 1). If H is mean square stable, there exists a positive scalar γ such that, for each ℓ = 0, . . . , δ − 1,

Σ_{k=1}^{∞} ‖S_ℓ(k) − S_ℓ(∞)‖ ≤ γ ‖π(0) − π_0(∞)‖,    (13)
where π_0(∞) = lim_{k→∞} π_0 P^{kδ}. Moreover, S_ℓ(k) is Cauchy summable.

Proof Proposition 2 (i) assures that if σ(T_A) < 1, there exist η ≥ 0 and 0 ≤ ξ < 1 such that ‖T_A^m‖ ≤ ηξ^m, m = 0, 1, . . .. We employ this fact, (11) and a change of variable to evaluate, for each ℓ = 0, . . . , δ − 1,

Σ_{k=1}^{∞} ‖S_ℓ(k) − S_ℓ(∞)‖ = Σ_{k=1}^{∞} ‖ Σ_{m=1}^{δ} T_A^{m−1}( Ψ_{ℓ−m}(k) − Ψ_{ℓ−m}(∞) ) ‖
  ≤ ηξ Σ_{k=1}^{∞} Σ_{m=1}^{δ} ‖ Ψ(kδ + ℓ − m) − lim_{k→∞} Ψ(kδ + ℓ − m) ‖
  = ηξ Σ_{k=1}^{∞} Σ_{n=ℓ}^{δ+ℓ−1} ‖ Ψ((k − 1)δ + n) − lim_{k→∞} Ψ((k − 1)δ + n) ‖
  ≤ ηξ Σ_{k=1}^{∞} Σ_{n=0}^{2δ−1} ‖ Ψ((k − 1)δ + n) − lim_{k→∞} Ψ((k − 1)δ + n) ‖ =: ϕ.    (14)
Employing (6) and (12), we can write for an element j of the collection Ψ(m) that Ψ_j(m) = Σ_{i∈N} p_{ij} π_i(m) E_i E_i' = Σ_{i∈N} p_{ij} E_i E_i' [π(m − r)P^r e_i], where e_i ∈ R^N represents the ith basis vector e_i = [0, . . . , 0, 1, 0, . . . , 0]'. Note also from Proposition 3 (ii) that lim_{k→∞} π_i(kδ + n) = π_n(∞)e_i = π_0(∞)P^n e_i. Now, we can evaluate ϕ as

ϕ = ηξ Σ_{k=1}^{∞} Σ_{n=0}^{δ−1} ( max_{j∈N} Σ_{i∈N} p_{ij} ‖E_i E_i'‖ |[π((k − 1)δ) − π_0(∞)]P^n e_i|
        + max_{j∈N} Σ_{i∈N} p_{ij} ‖E_i E_i'‖ |[π(kδ) − π_0(∞)]P^n e_i| )
  ≤ ηξ ( max_{j∈N} Σ_{i∈N} p_{ij} ‖E_i E_i'‖ ) Σ_{k=1}^{∞} Σ_{n=0}^{δ−1} η̃ ξ̃^{k−1} (1 + ξ̃) ‖π(0)P^n − π_0(∞)P^n‖,    (15)

where η̃ and ξ̃ are numbers as in Lemma 2. Recalling that ‖P^n‖ ≤ 1, n = 0, . . . , δ − 1, and defining

γ = ηξ δ ( 2η̃ / (1 − ξ̃) ) ( max_{j∈N} Σ_{i∈N} p_{ij} ‖E_i E_i'‖ ),    (16)

we obtain ϕ ≤ γ ‖π(0) − π_0(∞)‖; this and (14) yield (13). Now we show that S_ℓ(k) is Cauchy summable. Similarly to (14) and (15), we have

Σ_{k=1}^{∞} sup_{v≥0} ‖S_ℓ(k + v) − S_ℓ(∞)‖
  ≤ ηξ ( max_{j∈N} Σ_{i∈N} p_{ij} ‖E_i E_i'‖ ) Σ_{k=1}^{∞} sup_{v≥0} Σ_{n=0}^{δ−1} η̃ ξ̃^{k+v−1} (1 + ξ̃) ‖π(0)P^n − π_0(∞)P^n‖ ≤ γ ‖π(0) − π_0(∞)‖.

This and (13) yield

Σ_{k=1}^{∞} sup_{v≥0} ‖S_ℓ(k + v) − S_ℓ(k)‖ ≤ Σ_{k=1}^{∞} [ sup_{v≥0} ‖S_ℓ(k + v) − S_ℓ(∞)‖ + ‖S_ℓ(∞) − S_ℓ(k)‖ ] < ∞.

The fact that S_ℓ(k) is Cauchy summable allows us to obtain a simple adaptation of [3, Proposition 3.36, p. 52], as follows; the proof is omitted.
Proposition 5 Suppose that H is mean square stable and that Θ is a finite Markov chain with period δ. Then there exists, for each ℓ = 0, . . . , δ − 1, a limiting value X_ℓ(∞) ∈ M^{n0} with the following properties:
(i) X_ℓ(k) → X_ℓ(∞) as k → ∞.
(ii) X_ℓ(∞) = T_A^δ(X_ℓ(∞)) + S_ℓ(∞).
(iii) X_ℓ(∞) = Σ_{k=0}^{∞} T_A^{kδ}(S_ℓ(∞)).
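Proposition 5 (iii) suggests a direct way of computing the cyclic limits numerically: build S_ℓ(∞) from the limiting distributions π_ℓ(∞) as in (11)–(12) and sum the geometrically decaying series of T_A^{kδ} terms. The Python sketch below assumes the system is MSS (so the series converges) and that the data are given as NumPy arrays; the truncation lengths M and K are arbitrary choices of ours.

import numpy as np

def T_op(P, U, V):
    # operator (2): T_{U,i}(V) = sum_j p_{ji} U_j V_j U_j'
    N = len(U)
    return [sum(P[j, i] * U[j] @ V[j] @ U[j].T for j in range(N)) for i in range(N)]

def X_limits(A, E, P, pi0, delta, M=500, K=500):
    # X_l(inf) = sum_{k>=0} T_A^(k*delta)(S_l(inf))   (Proposition 5 (iii)), with
    # S_l(inf) = sum_{j=1..delta} T_A^(j-1)(Psi_{l-j}(inf)) and Psi built from pi_l(inf) as in (12)
    N, n = len(A), A[0].shape[0]
    pi_inf = [pi0 @ np.linalg.matrix_power(P, M * delta + l) for l in range(delta)]
    def Psi_inf(l):
        p = pi_inf[l % delta]
        return [sum(P[j, i] * p[j] * E[j] @ E[j].T for j in range(N)) for i in range(N)]
    X_inf = []
    for l in range(delta):
        S = [np.zeros((n, n)) for _ in range(N)]
        for j in range(1, delta + 1):
            term = Psi_inf(l - j)
            for _ in range(j - 1):
                term = T_op(P, A, term)
            S = [Si + Ti for Si, Ti in zip(S, term)]
        X = [np.zeros((n, n)) for _ in range(N)]
        acc = S
        for _ in range(K):                 # truncated series of Proposition 5 (iii)
            X = [Xi + ai for Xi, ai in zip(X, acc)]
            for _ in range(delta):
                acc = T_op(P, A, acc)
        X_inf.append(X)
    return X_inf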
Next we address the forced solution of (10). We define M_ℓ(k) ∈ M^{n0} as

M_ℓ(k) = Σ_{t=1}^{k} T_A^{(k−t)δ}(S_ℓ(t)),   k ≥ 1, ℓ = 0, . . . , δ − 1,    (17)

allowing us to write the solution of (10) as the sum of the free and forced solutions,

X_ℓ(k) = T_A^{kδ}(X_ℓ(0)) + M_ℓ(k),   k ≥ 1.    (18)

It will be convenient to consider the quantities S_ℓ(k) and M_ℓ(k) as they approach their limiting values S_ℓ(∞) and M_ℓ(∞), by defining S̃_ℓ(k) ∈ M^{n∗} and M̃_ℓ(k) ∈ M^{n∗} as

S̃_ℓ(k) = S_ℓ(k) − S_ℓ(∞),   k ≥ 1, ℓ = 0, . . . , δ − 1,    (19)

M̃_ℓ(k) = Σ_{t=1}^{k} T_A^{(k−t)δ}(S̃_ℓ(t)),   k ≥ 1, ℓ = 0, . . . , δ − 1.    (20)
A key consequence of Lemma 4 is that we can obtain bounds for the forced solution of (10), M̃_ℓ, that are related to ‖π(0) − π_0(∞)‖. One interpretation is that the second moment of the additive noise “accumulated around” M_ℓ(∞), described by Σ_{t=0}^{T} M̃_ℓ(t), is bounded for all T ≥ 0. Here, we use ⟨·, Q_ℓ(t)⟩ as a measure that is convenient for evaluating the cost, but the result can be extended to the norm (see Corollary 1).

Lemma 5 The Markov chain for system H is either periodic (δ > 1) or aperiodic (δ = 1). If H is mean square stable, then for each ℓ = 0, . . . , δ − 1 there exists a positive scalar α such that

−α‖π(0) − π_0(∞)‖ ≤ Σ_{t=1}^{T} ⟨M̃_ℓ(t), Q_ℓ(t)⟩ ≤ α‖π(0) − π_0(∞)‖,   ∀T ≥ 1,    (21)

where π_0(∞) = lim_{k→∞} π_0 P^{kδ} whenever π(0) = π_0.

Proof Note from Lemma 1 (i) that, since H is MSS, σ(T_A) < 1 and consequently σ(T_A^n) < 1 for each n > 1. In this proof, T stands for T_A^δ for simplicity.
Set Υ ∈ M^{n0} as Υ = Σ_{k=1}^{∞} ‖S̃_ℓ(k)‖ I. It follows from Lemma 4 that ‖Υ‖ ≤ γ ‖π(0) − π_0(∞)‖. We employ Lemma 1 (ii) to an MSS system with associated operator T to write

Σ_{t=0}^{∞} ⟨T^t(Υ), Q̄⟩ ≤ ρ‖Υ‖ ≤ ργ ‖π(0) − π_0(∞)‖.    (22)

On the other hand, recalling that Q̄ ≥ Q_ℓ(k), k ≥ 0, we have

Σ_{t=0}^{∞} ⟨T^t(−Υ), Q̄⟩ = Σ_{t=0}^{∞} ⟨ T^t( Σ_{k=1}^{∞} −‖S̃_ℓ(k)‖ I ), Q̄ ⟩
  ≤ Σ_{t=0}^{∞} Σ_{k=1}^{∞} ⟨ −‖S̃_ℓ(k)‖ T^t(I), Q_ℓ(k + t) ⟩
  = Σ_{u=1}^{∞} Σ_{k=1}^{u} ⟨ T^{u−k}( −‖S̃_ℓ(k)‖ I ), Q_ℓ(u) ⟩
  ≤ Σ_{u=1}^{T} Σ_{k=1}^{u} ⟨ −‖S̃_ℓ(k)‖ T^{u−k}(I), Q_ℓ(u) ⟩,    (23)

and Proposition 1 (ii) leads to

Σ_{t=0}^{∞} ⟨T^t(−Υ), Q̄⟩ ≤ Σ_{t=1}^{T} Σ_{k=1}^{t} ⟨ T^{t−k}(S̃_ℓ(k)), Q_ℓ(t) ⟩ = Σ_{t=1}^{T} ⟨ M̃_ℓ(t), Q_ℓ(t) ⟩.    (24)
The left-hand side of (21) follows from (22) and (24), with α = ργ. For the right-hand side of (21), replace, in (23) and (24), −Υ and −‖S̃_ℓ(k)‖ by Υ and ‖S̃_ℓ(k)‖, respectively, and employ Proposition 1 (i) to obtain

Σ_{t=0}^{∞} ⟨T^t(Υ), Q̄⟩ ≥ Σ_{t=1}^{T} ⟨ M̃_ℓ(t), Q_ℓ(t) ⟩,

and (22) leads to the result.
Lemma 6 The Markov chain for system H is either periodic (δ > 1) or aperiodic (δ = 1), and assume that H is mean square stable. Then, for each T ≥ δ, there exist non-negative scalars α and β (that do not depend on T, X(0) or π(0)) such that, for each ℓ = 0, . . . , δ − 1,

| J_ℓ^T − ⟨ X_ℓ(∞), Σ_{k=0}^{n_ℓ} Q_ℓ(k) ⟩ | ≤ α‖π_0(∞) − π(0)‖ + β‖X_ℓ(∞) − X_ℓ(0)‖,

where n_ℓ is the largest integer such that n_ℓ δ + ℓ ≤ T, X_ℓ(∞) = Σ_{k=0}^{∞} T_A^{kδ}(S_ℓ(∞)), and π_0(∞) = lim_{k→∞} π_0 P^{kδ}, with S_ℓ(∞) as in (12).
Proof First suppose that T ≥ δ. Let n_ℓ be the largest integer such that n_ℓ δ + ℓ ≤ T for each ℓ = 0, . . . , δ − 1. Here T stands for T_A^δ. From Proposition 5 (iii), we write

X_ℓ(∞) = Σ_{k=0}^{∞} T^k(S_ℓ(∞)) = Σ_{t=0}^{k−1} T^t(S_ℓ(∞)) + Σ_{t=k}^{∞} T^t(S_ℓ(∞))
       = Σ_{t=0}^{k−1} T^t(S_ℓ(∞)) + T^k( Σ_{t=0}^{∞} T^t(S_ℓ(∞)) ) = Σ_{t=0}^{k−1} T^t(S_ℓ(∞)) + T^k(X_ℓ(∞)),

which, in addition to (19)–(20), allows us to evaluate

M̃_ℓ(k) = Σ_{t=1}^{k} T^{k−t}( S_ℓ(t) − S_ℓ(∞) )
       = Σ_{t=1}^{k} T^{k−t}(S_ℓ(t)) − Σ_{j=0}^{k−1} T^j(S_ℓ(∞))
       = M_ℓ(k) − X_ℓ(∞) + T^k(X_ℓ(∞)).    (25)
Now, for the proof of the expression in the statement, let n_ℓ be the number defined there. We employ (18) and (25) to obtain an equivalent way of expressing a difference involving the “partial” cost (8):

J_ℓ^T = Σ_{k=0}^{n_ℓ} ⟨X_ℓ(k), Q_ℓ(k)⟩ = Σ_{k=0}^{n_ℓ} ⟨T^k(X_ℓ(0)), Q_ℓ(k)⟩ + Σ_{k=1}^{n_ℓ} ⟨M_ℓ(k), Q_ℓ(k)⟩
     = Σ_{k=0}^{n_ℓ} ⟨T^k(X_ℓ(0)), Q_ℓ(k)⟩ + Σ_{k=1}^{n_ℓ} ⟨ M̃_ℓ(k) + X_ℓ(∞) − T^k(X_ℓ(∞)), Q_ℓ(k) ⟩
     = ⟨ X_ℓ(∞), Σ_{k=0}^{n_ℓ} Q_ℓ(k) ⟩ + Σ_{k=0}^{n_ℓ} ⟨ T^k(X_ℓ(0) − X_ℓ(∞)), Q_ℓ(k) ⟩ + Σ_{k=1}^{n_ℓ} ⟨ M̃_ℓ(k), Q_ℓ(k) ⟩.    (26)
Let us consider the second term on the right-hand side of (26). It follows from Lemma 1 (ii) that there exists a positive scalar β such that

Σ_{k=0}^{n_ℓ} ⟨ T^k( X_ℓ(0) − X_ℓ(∞) ), Q_ℓ(k) ⟩ ≤ β‖X_ℓ(∞) − X_ℓ(0)‖.
This evaluation, together with the one in Lemma 5 for the third term on the right-hand side of (26), provides the following upper bound:

J_ℓ^T ≤ ⟨ X_ℓ(∞), Σ_{k=0}^{n_ℓ} Q_ℓ(k) ⟩ + α‖π_0(∞) − π(0)‖ + β‖X_ℓ(∞) − X_ℓ(0)‖.

Now, for the lower bound, we employ Lemma 1 (ii) again to write

Σ_{k=0}^{n_ℓ} ⟨ T^k( X_ℓ(0) − X_ℓ(∞) ), Q_ℓ(k) ⟩ = − Σ_{k=0}^{n_ℓ} ⟨ T^k( X_ℓ(∞) − X_ℓ(0) ), Q_ℓ(k) ⟩ ≥ −β‖X_ℓ(∞) − X_ℓ(0)‖.

This evaluation, Lemma 5 and (26) lead to the result.
From the cost decomposition expressed in (9) and Lemma 6, the next evaluation of bounds for the cost J^T follows.

Theorem 1 Suppose that H is mean square stable and that δ is the period of Θ. Let n_ℓ, X_ℓ(∞) and π_0(∞) be as in Lemma 6. Then, for each T ≥ δ, there exist non-negative scalars α and β (that do not depend on T, X(0) or π(0)) such that

| J^T(X(0)) − Σ_{ℓ=0}^{δ−1} Σ_{k=0}^{n_ℓ} ⟨ X_ℓ(∞), Q(kδ + ℓ) ⟩ | ≤ αδ‖π_0(∞) − π(0)‖ + Σ_{ℓ=0}^{δ−1} β‖X_ℓ(∞) − X(ℓ)‖.    (27)
By choosing Q adequately in Theorem 1, Q(kδ + ℓ) = e_i e_j' + e_j e_i' for some 1 ≤ i, j ≤ n and Q(·) = 0 otherwise (e_i and e_j are coordinate vectors), we get the evaluation in the following corollary.

Corollary 1 The Markov chain for system H is either periodic (δ > 1) or aperiodic (δ = 1), and assume that H is mean square stable. Then, there exist γ and ρ such that

‖X(kδ + ℓ) − X_ℓ(∞)‖ ≤ γ ‖π_0(∞) − π(0)‖ + Σ_{ℓ=0}^{δ−1} ρ‖X_ℓ(∞) − X(ℓ)‖    (28)

for ℓ = 0, . . . , δ − 1, where X_ℓ(∞) and π_0(∞) are as in Lemma 6.

Example 2 Consider system H with A = (A_1, A_2, A_3, A_4),

A_1 = A_3 = [ −3.48  −5.4  ]      A_2 = A_4 = [ 0.01  0    ]
            [ −0.67  −1.09 ],                 [ 0     0.01 ],

    [ 0  0  0.8  0.2 ]
P = [ 1  0  0    0   ]
    [ 0  1  0    0   ]
    [ 0  1  0    0   ],

and E = Q(k) = (I, I, I, I). It is simple to note that the Markov chain is periodic with δ = 3. The cost J^T for a specific initial condition X(0) and π(0) (π(0) = [0.240 0.364 0.250 0.146], with X_i(0) = x_0 x_0' π_i(0), i ∈ N, and x_0 = [0.871 0.491]') and the associated bound given by Theorem 1 are illustrated in Fig. 2.

Fig. 2 The cost J^T and the bounds given in Theorem 1 (for T ≥ δ = 3) in a numerical example

Fig. 3 The cost J^T and the associated quantities V and E for different initial conditions

For estimating the bound parameters α and β, we proceeded as follows. First, we set π(0) = π_0(∞) (in fact, we pick a random initial distribution μ and set π(0) = μP^{Mδ} with M large enough), in such a manner that the first term on the right-hand side of (27) vanishes and α becomes irrelevant for estimating β. Then, we employ a bisection algorithm in β, starting with a (typically large) initial guess β_0 and checking whether (27) holds for several values of x_0 and μ. This yields β = 2.28. For α, we proceed similarly, using a bisection method and evaluating (27), now with random π(0) and x_0, and using the estimated β. We have obtained α = 320. Figure 3 shows costs for 20 different initial conditions, and the quantities V and E associated with each initial condition, where V stands for the second term on the left-hand side of (27),
V = Σ_{ℓ=0}^{δ−1} Σ_{k=0}^{n_ℓ} ⟨ X_ℓ(∞), Q(kδ + ℓ) ⟩,

and E = | J^T − V | − αδ‖π_0(∞) − π(0)‖ − β Σ_{ℓ=0}^{δ−1} ‖X_ℓ(∞) − X(ℓ)‖. Note that each E is not positive, in accordance with Theorem 1.

Remark 2 Some special cases can be evaluated by means of Theorem 1. For instance, if the Markov chain is aperiodic (δ = 1) and π(0) = π(∞), it yields |J^T(X(0)) − (T + 1)⟨X(∞), Q⟩| ≤ β‖X(∞) − X(0)‖. This is the situation with no jumps (N = 1); one has that δ = 1 and π(0) = π(∞) = 1, necessarily. If δ = 1 and one sets the initial conditions X(0) = X(∞) and π(0) = π(∞), we get J^T(X(∞)) = (T + 1)⟨X(∞), Q⟩, i.e., the total cost is the cost per stage ⟨X(∞), Q⟩ times the number of stages T + 1, with no transient behaviour.

4.1 The long-run average cost

Let us consider the long-run average costs (LRAC) defined as

J^+ := lim sup_{T→∞} (1/T) Σ_{k=0}^{T} E_{x_0,π_0}[z(k)],   J^− := lim inf_{T→∞} (1/T) Σ_{k=0}^{T} E_{x_0,π_0}[z(k)].
Theorem 2 The Markov chain for system H is either periodic (δ > 1) or aperiodic (δ = 1); assume that H is mean square stable and Q(·) ≡ Q. Then,

J^− = J^+ = (1/δ) Σ_{ℓ=0}^{δ−1} ⟨ X_ℓ(∞), Q ⟩.    (29)
Proof Note from the definition of n_ℓ in Lemma 6 that T − δ + 1 ≤ n_ℓ δ + ℓ ≤ T for each ℓ = 0, . . . , δ − 1. Hence,

(T − 2δ + 2)/δ ≤ n_ℓ ≤ T/δ.    (30)

The two-sided evaluation above can be used in (27) to get two-sided bounds for J^T that do not depend on the number n_ℓ, and indeed not on ℓ. It follows from Theorem 1 that

lim inf_{T→∞} (1/T) Σ_{k=0}^{T} E_{x_0,π_0}[ z(k) ] = lim inf_{T→∞} (1/T) J^T(X(0))
  ≥ lim inf_{T→∞} (1/T) [ −αδ‖π_0(∞) − π(0)‖ − Σ_{ℓ=0}^{δ−1} β‖X_ℓ(∞) − X(ℓ)‖ + Σ_{ℓ=0}^{δ−1} Σ_{k=0}^{n_ℓ} ⟨ X_ℓ(∞), Q(kδ + ℓ) ⟩ ]
  ≥ (1/δ) Σ_{ℓ=0}^{δ−1} ⟨ X_ℓ(∞), Q ⟩,    (31)
where the last inequality comes from the evaluation in (30), which yields

Σ_{ℓ=0}^{δ−1} Σ_{k=0}^{n_ℓ} ⟨ X_ℓ(∞), Q(kδ + ℓ) ⟩ = Σ_{ℓ=0}^{δ−1} (n_ℓ + 1) ⟨ X_ℓ(∞), Q ⟩ ≥ ((T − δ + 2)/δ) Σ_{ℓ=0}^{δ−1} ⟨ X_ℓ(∞), Q ⟩.
Similarly,

lim sup_{T→∞} (1/T) Σ_{k=0}^{T} E_{x_0,π_0}[ z(k) ] = lim sup_{T→∞} (1/T) J^T(X(0)) ≤ (1/δ) Σ_{ℓ=0}^{δ−1} ⟨ X_ℓ(∞), Q ⟩,    (32)

where again the last inequality comes from (30), which applied here yields

Σ_{ℓ=0}^{δ−1} Σ_{k=0}^{n_ℓ} ⟨ X_ℓ(∞), Q(kδ + ℓ) ⟩ = Σ_{ℓ=0}^{δ−1} (n_ℓ + 1) ⟨ X_ℓ(∞), Q ⟩ ≤ ((T + δ)/δ) Σ_{ℓ=0}^{δ−1} ⟨ X_ℓ(∞), Q ⟩.
Theorems 1 and 2 yield an error bound for the approximation of the LRAC by the associated finite horizon cost.

Corollary 2 The Markov chain for system H is either periodic (δ > 1) or aperiodic (δ = 1), and assume that H is mean square stable. Then, for some M > 0,

| J^± − J^T / T | ≤ M / T.    (33)
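Theorem 2 and Corollary 2 can be checked numerically: for a mean square stable system, the finite-horizon average J^T/T approaches (1/δ) Σ_ℓ ⟨X_ℓ(∞), Q⟩ with a gap of order 1/T. The Python sketch below does this for the data of Example 2, computing X(k) by the recursion of Proposition 4 and approximating X_ℓ(∞) by X(kδ + ℓ) for large k; the horizon and the way the cyclic limits are read off are our own choices.

import numpy as np

A = [np.array([[-3.48, -5.4], [-0.67, -1.09]]), 0.01 * np.eye(2),
     np.array([[-3.48, -5.4], [-0.67, -1.09]]), 0.01 * np.eye(2)]
E = [np.eye(2)] * 4
Q = [np.eye(2)] * 4
P = np.array([[0.0, 0.0, 0.8, 0.2],
              [1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
delta, N = 3, 4

def step(X, pi):
    # one step of Proposition 4: X_i(k+1) = sum_j p_{ji} (A_j X_j A_j' + pi_j E_j E_j')
    Xn = [sum(P[j, i] * (A[j] @ X[j] @ A[j].T + pi[j] * E[j] @ E[j].T) for j in range(N))
          for i in range(N)]
    return Xn, pi @ P

inner_Q = lambda X: sum(np.trace(Q[i] @ X[i]) for i in range(N))   # <X, Q>

x0 = np.array([0.871, 0.491])
pi = np.array([0.240, 0.364, 0.250, 0.146])
X = [pi[i] * np.outer(x0, x0) for i in range(N)]

T, cost, per_phase = 3000, 0.0, [0.0] * delta
for k in range(T + 1):
    cost += inner_Q(X)
    if k > T - delta:                  # read the cyclic limits off the tail of the run
        per_phase[k % delta] = inner_Q(X)
    X, pi = step(X, pi)

lrac = sum(per_phase) / delta          # (1/delta) sum_l <X_l(inf), Q>, eq. (29)
print(cost / T, lrac)                  # the gap shrinks like 1/T, cf. (33)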
Remark 3 Since the Markov chain considered here is general, perhaps a reducible multichain, the LRAC J^± may depend on the initial distribution π_0, but not on x_0, as (29) indicates. When Θ is ergodic, the limiting values X_ℓ(∞), ℓ = 0, . . . , δ − 1, do not depend on X(0), e.g., see [3, Prop. 3.36]. Then the limiting value J^± = J^+ = J^− in Theorem 2 and in Corollary 2 does not depend on the initial value X(0): the LRAC is unique.

5 Conclusions

This paper derives the two-sided bound of Theorem 1 for the finite horizon cost J^T associated with MJLS with additive noise, in a setting that encompasses the situation
when the Markov chain is finite-state but otherwise general. The bound presents some interesting features: (1) it is tight, in the sense that there is no gap when one guesses the cyclical stationary solutions X_ℓ(∞), 0 ≤ ℓ ≤ δ − 1, from the initial states; (2) it provides a similar bound for the conditional second moments of the state process, see Corollary 1; (3) it expresses an error bound for the approximation of the LRAC by the associated finite horizon cost, see (33); and (4) it assures the existence of the limiting cost J^± in this general setting for MJLS, as long as the system is stable in the mean square sense. In the context of the LRAC, the obtained results set an initial landmark in the pursuit of approximating solutions for the important problem of the LRAC for controlled MJLS with incomplete observation.

References

1. Costa OLV, Fragoso MD (1993) Stability results for discrete-time linear systems with Markovian jumping parameters. J Math Anal Appl 179:154–178
2. Ji Y, Chizeck HJ (1990) Jump linear quadratic Gaussian control: steady-state solution and testable conditions. Control Theory Adv Technol 6(3):289–319
3. Costa OLV, Fragoso MD, Marques RP (2005) Discrete-time Markovian jump linear systems. Springer, New York
4. Zampolli F (2006) Optimal monetary policy in a regime-switching economy: the response to abrupt shifts in exchange rate dynamics. J Econ Dynam Control 30:1527–1567
5. Costa OLV, de Paulo WL (2007) Indefinite quadratic with linear costs optimal control of Markov jump with multiplicative noise systems. Automatica 43:587–597
6. Khanbaghi M, Malhame RP, Perrier M (2002) Optimal white water and broke recirculation policies in paper mills via jump linear quadratic control. IEEE Trans Automat Control 10(4):578–588
7. Hernández-Lerma O, Lasserre JB (1996) Discrete-time Markov control processes: basic optimality criteria. Springer, New York
8. Arapostathis A, Borkar VS, Fernández-Gaucherand E, Ghosh MK, Marcus SI (1993) Discrete-time controlled Markov processes with average cost criterion: a survey. SIAM J Control Optim 31(2):282–344
9. Grimm G, Messina MJ, Tuna SE, Teel AR (2005) Model predictive control: for want of a local control Lyapunov function, all is not lost. IEEE Trans Automat Control 50:546–558
10. Costa EF, do Val JBR (2009) Uniform approximation of infinite horizon control problems for nonlinear systems and stability of the approximating controls. IEEE Trans Automat Control 54(4):881–886
11. Costa EF, do Val JBR (2006) Obtaining stabilizing stationary controls via finite horizon cost. In: Proc. American Control Conference, Minneapolis, Minnesota, pp 4297–4302
12. Jadbabaie A, Hauser J (2005) On the stability of receding horizon control with a general terminal cost. IEEE Trans Automat Control 50:674–678
13. Costa EF, do Val JBR, Fragoso MD (2005) A new approach to detectability of discrete-time Markov jump linear systems. SIAM J Control Optim 43(6):2132–2156
14. Çinlar E (1975) Introduction to stochastic processes. Prentice Hall, New York
15. Costa OLV, Fragoso MD (1995) Discrete-time LQ-optimal control problems for finite Markov jump parameters systems. IEEE Trans Automat Control 40:2076–2088