Ergodic Control for Constrained Diffusions: Characterization using HJB Equations.

Vivek Borkar∗ School of Technology and Computer Science, Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai 400005

Amarjit Budhiraja Department of Statistics University of North Carolina Chapel Hill, NC 27599-3260, USA

November 12, 2002

Abstract

Recently in [8] an ergodic control problem for a class of diffusion processes, constrained to take values in a polyhedral cone, was considered. The main result of that paper was that under appropriate conditions on the model, there is a Markov control for which the infimum of the cost function is attained. In the current work we characterize the value of the ergodic control problem via a suitable Hamilton-Jacobi-Bellman (HJB) equation. The theory of existence and uniqueness of classical solutions, for PDEs in domains with corners and reflection fields which are oblique, discontinuous and multi-valued on corners, is not available. We show that the natural HJB equation for the ergodic control problem admits a unique continuous viscosity solution which enables us to characterize the value function of the control problem. The existence of a solution to this HJB equation is established via the classical vanishing discount argument. The key step is proving the pre-compactness of the family of suitably re-normalized discounted value functions. In this regard we use a recent technique, introduced in [4], of using the Athreya-Ney-Nummelin pseudo-atom construction for obtaining a coupling of a pair of embedded, discrete time, controlled Markov chains.

Keywords: Ergodic control, Optimal Markov control, Controlled reflected diffusions, Constrained processes, HJB equation, Viscosity solutions, Domains with corners, Oblique Neumann problem, Pseudo-atom, Coupling.

∗ Research supported in part by a grant for ‘Nonlinear Studies’ from the Indian Space Research Organization and the Defense Research and Development Organization, Government of India, administered through the Indian Institute of Science.

1 Introduction

In a recent work [8] an ergodic control problem for a class of constrained diffusion processes, in polyhedral cones, was studied. Such constrained diffusion processes are common in the heavy traffic analysis of queuing networks coming from problems in computer, communications and manufacturing systems. The control of such queuing systems and the corresponding limit diffusions is of significant interest (cf. [18], [16], [17], [15], [19]). The domain $G \subset \mathbb{R}^k$, which is the state space of the controlled Markov process, is given as an intersection of $N$ half spaces $G_i$; $i = 1, \cdots, N$. Associated with each $G_i$ is a vector $d_i$ which defines the “direction of constraint” in the relative interior of $\partial G_i$. At a point $x \in \partial G$ where several faces meet, there is more than one possible direction of constraint; in fact the set of permissible directions is a cone denoted by $d(x)$. Roughly speaking, the constrained version of a given unrestricted trajectory in $\mathbb{R}^k$ is obtained by pushing back the trajectory, whenever it is about to exit the domain, in one of the permissible directions of constraint using the minimal force required to keep the trajectory inside the domain. Precise definitions will be given in Section 2. The constraining mechanism is described via the notion of a Skorohod problem. Under appropriate conditions on $(d_i)_{i=1}^N$ it follows from the results in [12] that one can define the “Skorohod map”, denoted as $\Gamma(\cdot)$, which takes an unrestricted trajectory $\psi(\cdot)$ and maps it to a trajectory $\phi(\cdot) \doteq \Gamma(\psi)(\cdot)$ such that $\phi(t) \in G$ for all $t \in (0, \infty)$. The controlled constrained diffusion process that we will study is obtained as a solution to the equation:

$$X(t) = \Gamma\left( X(0) + \int_0^{\cdot} b(X(s), u(s))\,ds + \int_0^{\cdot} \sigma(X(s))\,dW(s) \right)(t); \quad t \in [0, \infty), \tag{1.1}$$

where $W(\cdot)$ is a standard Wiener process, $b : G \times U \to \mathbb{R}^k$; $\sigma : G \to \mathbb{R}^{k \times k}$ are suitable coefficients, $U$ is a given control set and $u(\cdot)$ is a $U$ valued “admissible” control process. The cost of interest is the ergodic cost criterion:

$$\limsup_{T \to \infty} \frac{1}{T} \int_0^T k(X(s), u(s))\,ds, \tag{1.2}$$

where the limit above is taken almost surely and $k : G \times U \to \mathbb{R}$ is a suitable map. In control theory, one of the most desirable features of a good control is that it should depend only on the current value of the state and not on the whole history of the state and/or the control process. Namely, one is interested in obtaining controls $u(\cdot)$ such that there exists some measurable map $v : G \to U$ satisfying $u(t) = v(X(t))$, a.s. for all $t \in [0, \infty)$. Under such a control the solution to (1.1) becomes a Markov process and for this reason the map $v(\cdot)$ is referred to as a “Markov control”. The main result of [8] is that, under appropriate conditions on the model (Conditions 2.2, 2.4, 2.5, 2.8), there is a Markov control for which the infimum of the cost in (1.2) is attained. The other important goal in stochastic control theory is the characterization of the value function of the control problem via a suitable Hamilton-Jacobi-Bellman (HJB) equation. For unconstrained diffusions this problem has been extensively studied and we refer the reader to [5] for a detailed account. For the controlled Markov processes in the present work, the problem is quite challenging since the domain in which the process is constrained to lie is not smooth (because of the corners where the faces meet) and the reflection field is oblique, discontinuous and multi-valued at the boundary points which lie on more than one face. The theory of existence and uniqueness of classical solutions for PDEs in such domains is not available. However, using the fundamental ideas of Crandall and Lions [11, 20], Dupuis and Ishii [13] have developed existence and uniqueness theory of viscosity solutions for fully nonlinear second order elliptic PDEs on such domains. In this work we will show that the value of the ergodic control problem introduced above can be characterized via the unique viscosity solution of an appropriate HJB equation. The usual approach to the HJB equation for the ergodic control is via the “vanishing discount method” (cf. [10, 7, 5, 3]). In this approach one first studies the value function $V_\alpha(x)$ of the discounted control problem:

$$V_\alpha(x) = \inf_u \mathbb{E}\left( \int_0^\infty e^{-\alpha s} k(X^x(s), u(s))\,ds \right), \tag{1.3}$$

where $\alpha \in (0, \infty)$, the infimum is taken over all admissible controls $u$ and $X^x(\cdot)$ is the solution of (1.1) with $X(0) \equiv x$. For $f \in C_b^2(G)$ let $Lf : G \times U \to \mathbb{R}$ be defined as:

$$(Lf)(x, u) \doteq \frac{1}{2} \sum_{i,j=1}^k a_{ij}(x) \frac{\partial^2 f}{\partial x_i \partial x_j}(x) + \sum_{i=1}^k b_i(x, u) \frac{\partial f}{\partial x_i}(x); \quad (x, u) \in G \times U, \tag{1.4}$$

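where $a(x) = (a_{ij}(x)) \doteq \sigma(x)\sigma^T(x)$. Using results of [13] we will show that the value function $V_\alpha(x)$ is the unique viscosity solution (see Definition 3.3) of the following HJB equation.

$$\begin{aligned} \inf_{u \in U} \left( L\psi(x, u) + k(x, u) - \alpha\psi(x) \right) &= 0, \quad x \in G \\ \langle \nabla\psi(x), d_i \rangle &= 0, \quad x \in \partial G;\ i \in \mathrm{In}(x), \end{aligned} \tag{1.5}$$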

where $\mathrm{In}(x) \doteq \{i \in \{1, 2, \cdots, N\} : x \in \partial G_i\}$. We remark that the work [13] considers the case where the domain is bounded; however, by a slight modification the techniques there can be used to cover the case in the present work. In order to study the HJB equation of the ergodic control problem, we need to take the limit as $\alpha \to 0$. The key step in this program is to show that the family:

$$\{\bar V_\alpha(\cdot) \doteq V_\alpha(\cdot) - V_\alpha(0),\ \alpha \in (0, \infty)\} \tag{1.6}$$

is pre-compact in C(G). The classical derivation (see Theorem VI.3.1 of [5]) of such a result makes use of certain gradient estimates on Vα (x), uniform in α, which we are unable to prove for the model considered in the present work. Another approach based on viscosity solutions, taken in [3], proves the above pre-compactness by making some strong stability assumptions on the model (a restoring force towards bounded sets that grows without bound as |x| → ∞) which are not satisfied in the current setup because of the radially homogeneous nature of the problem. In the present work we prove the pre-compactness of the family in (1.6) by using the Athreya-Ney-Nummelin pseudo-atom construction which was recently introduced in the context of partially observed ergodic control problems in [4]. Using this construction, the pre-compactness of the family of (re-normalized) discounted value functions for a partially observed control problem was proved in [6]. One of the key requirements for the coupling methods used in the above cited works to work, is the existence of a suitable Lyapunov function for the underlying controlled Markov processes. For the processes considered in the present work, the existence of such a Lyapunov function was proved in [1]. Using this Lyapunov function one can show that a Foster type drift criterion is satisfied for an appropriate embedded discrete time controlled Markov chain. This, along with the pseudo-atom construction enables us to show that the coupling time, for two embedded controlled Markov chains driven by the same control and noise processes but two different initial conditions, has finite expected value. The above step is the main ingredient to the proof of the pre-compactness of (1.6). Once the pre-compactness is proved, one can take the limit of (αVα (0), V α (·)), along a subsequence, as α → 0. 
Then by stability (under perturbations) properties of viscosity solutions it follows that the limit, denoted as (ρ, V (·)) is a viscosity solution of the HJB equation for the ergodic control problem (see (5.28)). The rest of the work involves showing that this equation admits a unique solution and that ρ is the infimum, over all admissible controls, of the cost function in (1.2).
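The vanishing discount limit described above can be illustrated in a much simpler setting than the one treated in this paper. The following sketch assumes a finite-state, discrete-time, uncontrolled Markov chain (a hypothetical toy model, not the constrained diffusion of (1.1)), with a discount factor `beta` playing the role of $e^{-\alpha}$. It shows numerically that the scaled value $(1-\beta)V_\beta(0)$ approaches the long-run average cost and that the re-normalized values $V_\beta(\cdot) - V_\beta(0)$ stabilize as `beta` tends to 1. The matrix `P`, the cost `c` and all function names are illustrative assumptions.

```python
import numpy as np

def discounted_value(P, c, beta):
    """Discounted cost V = sum_n beta^n P^n c, obtained by solving (I - beta P) V = c."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - beta * P, c)

def stationary_distribution(P):
    """Row vector pi with pi P = pi and sum(pi) = 1 (P is assumed irreducible)."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    rhs = np.append(np.zeros(n), 1.0)
    pi, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return pi

def vanishing_discount(P, c, beta):
    """Return ((1 - beta) V(0), V - V(0)), the discrete analogue of (alpha V_alpha(0), V_alpha - V_alpha(0))."""
    V = discounted_value(P, c, beta)
    return (1.0 - beta) * V[0], V - V[0]

# Hypothetical three-state chain and running cost.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
c = np.array([0.0, 1.0, 4.0])

# Long-run average cost of the toy chain.
rho = float(stationary_distribution(P) @ c)

for beta in (0.9, 0.99, 0.999):
    scaled, relative = vanishing_discount(P, c, beta)
    print(beta, scaled, relative)
```

In this toy model the pre-compactness of the re-normalized family is immediate because the state space is finite; the difficulty addressed in the paper is precisely that no such compactness is available a priori for the constrained diffusion.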


The paper is organized as follows. In Section 2 we present some preliminary definitions and known results that will be used in this work. Section 3 is devoted to showing that the value function of the discounted cost problem is the unique solution of the HJB equation in (1.5). In Section 4 we present the pseudo-atom construction and use it to show the pre-compactness of the family in (1.6). In Section 5, by taking limit as α → 0 we obtain a viscosity solution of the HJB equation for the ergodic control problem. Finally, we characterize the value function by showing that the equation admits a unique viscosity solution.

2 Preliminaries and Background Results.

Let $G \subset \mathbb{R}^k$ be a polyhedral cone with the vertex at origin given as the intersection of half spaces $G_i$, $i = 1, \cdots, N$. Each half space $G_i$ is associated with a unit vector $n_i$ via the relation $G_i = \{x \in \mathbb{R}^k : \langle x, n_i \rangle \ge 0\}$, where $\langle \cdot, \cdot \rangle$ denotes the usual inner product in $\mathbb{R}^k$. Denote the boundary of a set $B \subset \mathbb{R}^k$ by $\partial B$. We will denote the set $\{x \in \partial G : \langle x, n_i \rangle = 0\}$ by $F_i$. For $x \in \partial G$, define the set, $n(x)$, of inward normals to $G$ at $x$ by $n(x) \doteq \{r : |r| = 1, \langle r, x - y \rangle \le 0, \forall y \in G\}$. With each face $F_i$ we associate a unit vector $d_i$ such that $\langle d_i, n_i \rangle > 0$. This vector defines the direction of constraint associated with the face $F_i$. For $x \in \partial G$ define

$$d(x) \doteq \left\{ d \in \mathbb{R}^k : d = \sum_{i \in \mathrm{In}(x)} \alpha_i d_i;\ \alpha_i \ge 0;\ \|d\| = 1 \right\}.$$

We will denote the collection of all subsets of $\{1, \cdots, N\}$ by $\Lambda$. Let $D([0, \infty) : \mathbb{R}^k)$ denote the set of functions mapping $[0, \infty)$ to $\mathbb{R}^k$ that are right continuous and have limits from the left. We endow $D([0, \infty) : \mathbb{R}^k)$ with the usual Skorohod topology. Let $D_G([0, \infty) : \mathbb{R}^k) \doteq \{\psi \in D([0, \infty) : \mathbb{R}^k) : \psi(0) \in G\}$. For $\eta \in D([0, \infty) : \mathbb{R}^k)$ let $|\eta|(T)$ denote the total variation of $\eta$ on $[0, T]$ with respect to the Euclidean norm on $\mathbb{R}^k$.

Definition 2.1 Let $\psi \in D_G([0, \infty) : \mathbb{R}^k)$ be given. Then $(\phi, \eta) \in D([0, \infty) : \mathbb{R}^k) \times D([0, \infty) : \mathbb{R}^k)$ solves the Skorohod problem (SP) for $\psi$ with respect to $G$ and $d$ if and only if $\phi(0) = \psi(0)$, and for all $t \in [0, \infty)$:
(1) $\phi(t) = \psi(t) + \eta(t)$;
(2) $\phi(t) \in G$;
(3) $|\eta|(t) < \infty$;
(4) $|\eta|(t) = \int_{[0,t]} I_{\{\phi(s) \in \partial G\}}\, d|\eta|(s)$;
(5) There exists (Borel) measurable $\gamma : [0, \infty) \to \mathbb{R}^k$ such that $\gamma(t) \in d(\phi(t))$ ($d|\eta|$-almost everywhere) and $\eta(t) = \int_{[0,t]} \gamma(s)\, d|\eta|(s)$.
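Definition 2.1 can be made concrete in the simplest possible case. The sketch below assumes a one-dimensional domain $G = [0, \infty)$ with the single direction of constraint $d = +1$, and a path sampled at finitely many times; this is a special case chosen for illustration and not the polyhedral-cone setting of the paper. In that case the solution is given by the classical explicit formula $\eta(t) = \max(0, \max_{s \le t}(-\psi(s)))$, $\phi = \psi + \eta$. The function name `skorokhod_map_1d` is hypothetical.

```python
import numpy as np

def skorokhod_map_1d(psi):
    """Solve the 1-D Skorohod problem on G = [0, inf) with d = +1 for a sampled path.

    Returns (phi, eta) with phi = psi + eta, phi >= 0, and eta nondecreasing,
    where eta increases only at times when phi sits on the boundary {0}.
    """
    psi = np.asarray(psi, dtype=float)
    if psi[0] < 0:
        raise ValueError("psi[0] must lie in G = [0, inf)")
    # Running maximum of -psi, floored at zero: the minimal push keeping phi in G.
    eta = np.maximum(np.maximum.accumulate(-psi), 0.0)
    phi = psi + eta
    return phi, eta

psi = np.array([1.0, 0.2, -0.5, -0.3, 0.4])
phi, eta = skorokhod_map_1d(psi)
print(phi)  # constrained path
print(eta)  # cumulative constraining term
```

In this one-dimensional case the map $\psi \mapsto \phi$ is known to be Lipschitz in the supremum norm with constant 2, which is a simple instance of the kind of regularity that the conditions below are designed to guarantee in the polyhedral setting.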

On the domain $D \subset D_G([0, \infty) : \mathbb{R}^k)$ on which there is a unique solution to the Skorohod problem we define the Skorohod map (SM) $\Gamma$ as $\Gamma(\psi) \doteq \phi$, if $(\phi, \phi - \psi)$ is the unique solution of the Skorohod problem posed by $\psi$. The following is the key assumption made in [8] on the data defining the Skorohod problem.

Condition 2.2
(a) There exists a compact, convex set $B \subset \mathbb{R}^k$ with $0 \in B^0$, such that if $\nu(z)$ denotes the set of inward normals to $B$ at $z \in \partial B$, then for $i = 1, 2, \cdots, N$, $z \in \partial B$ and $|\langle z, n_i \rangle| < 1$ implies that $\langle \nu, d_i \rangle = 0$ for all $\nu \in \nu(z)$.
(b) There exists a map $\pi : \mathbb{R}^k \to G$ such that if $y \in G$, then $\pi(y) = y$, and if $y \notin G$, then $\pi(y) \in \partial G$, and $y - \pi(y) = \alpha\gamma$ for some $\alpha \le 0$ and $\gamma \in d(\pi(y))$.
(c) For every $x \in \partial G$, there is $n \in n(x)$ such that $\langle d, n \rangle > 0$ for all $d \in d(x)$.

An important consequence of the above assumption is the regularity of the Skorohod map in the following sense.

Theorem 2.3 (Dupuis and Ishii [12]) Under Condition 2.2 the Skorohod map is well defined on all of $D_G([0, \infty) : \mathbb{R}^k)$, i.e., $D = D_G([0, \infty) : \mathbb{R}^k)$, and the SM is Lipschitz continuous in the following sense. There exists a $K < \infty$ such that for all $\phi_1, \phi_2 \in D_G([0, \infty) : \mathbb{R}^k)$:

$$\sup_{0 \le t < \infty} |\Gamma(\phi_1)(t) - \Gamma(\phi_2)(t)| < K \sup_{0 \le t < \infty} |\phi_1(t) - \phi_2(t)|.$$

[…] $\epsilon > 0$ such that

$$\langle d_i, D\psi(y) \rangle < 0 \quad \forall y \in G \text{ satisfying } |x_0 - y| \le \epsilon. \tag{3.10}$$

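Now

$$\begin{aligned} \limsup_{h \to 0} \frac{1}{h} \mathbb{E}\left( \int_0^h \langle d_i, D\psi(X^{x_0}(s)) \rangle\, dY_i(s) \right) &\le |D\psi|_\infty \limsup_{h \to 0} \frac{1}{h} \mathbb{E}\left( 1_{\{\sup_{0 \le s \le h} |X^{x_0}(s) - x_0| \ge \epsilon\}} Y_i(h) \right) \\ &\le |D\psi|_\infty \limsup_{h \to 0} \sqrt{\mathbb{E}(Y_i^2(h))}\, \frac{\sqrt{\mathbb{E}\left(\sup_{0 \le s \le h} |X^{x_0}(s) - x_0|^5\right)}}{h\, \epsilon^{5/2}} \\ &\le C |D\psi|_\infty \sqrt{\mathbb{E}(Y_i^2(1))}\, \limsup_{h \to 0} \frac{h^{5/4}}{h\, \epsilon^{5/2}} \\ &= 0, \end{aligned}$$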

for a suitable constant $C$, where the second inequality follows on applying the Cauchy-Schwarz inequality and Chebychev’s inequality; the final inequality follows on using the Lipschitz property of the Skorohod map, boundedness of the drift and diffusion coefficients and the Burkholder-Gundy inequalities. This proves (3.9) and hence part 1. of Definition 3.3. Next let $\psi \in C^2(G)$ be such that $x_0$ is a strict minimum point of $V_\alpha - \psi$. To complete the proof we need to show that (3.7) holds with $\phi$ there replaced by $V_\alpha$. From (3.1) we have that

$$V_\alpha(x_0) = \inf \mathbb{E}\left( \int_0^\epsilon e^{-\alpha t} k(X^{x_0}(t), u(t))\,dt + e^{-\alpha\epsilon} V_\alpha(X^{x_0}(\epsilon)) \right).$$

Let $u^\epsilon(\cdot)$ be a feedback control such that

$$V_\alpha(x_0) + \epsilon^2 \ge \mathbb{E}\left( \int_0^\epsilon e^{-\alpha t} k(X^{x_0,\epsilon}(t), u^\epsilon(t))\,dt + e^{-\alpha\epsilon} V_\alpha(X^{x_0,\epsilon}(\epsilon)) \right),$$

where $X^{x_0,\epsilon}(\cdot)$ solves (1.1) with $u(\cdot)$ there replaced by $u^\epsilon(\cdot)$ and $X(0) \equiv x_0$. Let $\kappa$ be as before. Then

$$\begin{aligned} \psi(x_0) &= V_\alpha(x_0) + \kappa \\ &\ge \mathbb{E}\left( \int_0^\epsilon e^{-\alpha t} k(X^{x_0,\epsilon}(t), u^\epsilon(t))\,dt + (e^{-\alpha\epsilon} - 1) V_\alpha(X^{x_0,\epsilon}(\epsilon)) + \psi(X^{x_0,\epsilon}(\epsilon)) \right) - \epsilon^2. \end{aligned}$$

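Dividing by $\epsilon$ and taking limit as $\epsilon \to 0$, we have

$$\begin{aligned} 0 &\ge \liminf_{\epsilon \to 0} \frac{1}{\epsilon} \mathbb{E}\left( \int_0^\epsilon e^{-\alpha t} k(X^{x_0,\epsilon}(t), u^\epsilon(t))\,dt + (e^{-\alpha\epsilon} - 1) V_\alpha(X^{x_0,\epsilon}(\epsilon)) + \psi(X^{x_0,\epsilon}(\epsilon)) - \psi(x_0) \right) \\ &= \liminf_{\epsilon \to 0} \frac{1}{\epsilon} \mathbb{E}\left( \int_0^\epsilon e^{-\alpha t} k(x_0, u^\epsilon(t))\,dt + \int_0^\epsilon (L\psi)(x_0, u^\epsilon(t))\,dt + \sum_{i=1}^N \int_0^\epsilon (D_i\psi)(X^{x_0,\epsilon}(t))\,dY_i(t) \right) - \alpha V_\alpha(x_0). \end{aligned}$$

The last step follows on using the continuity and/or Lipschitz properties of $k$, $L\psi$, $\sigma$, $b$ and observing that

$$\limsup_{\epsilon \to 0} \mathbb{E}\left( \sup_{0 \le s \le \epsilon} |X^{x_0,\epsilon}(s) - x_0| \right)^p \le \limsup_{\epsilon \to 0} \sup \mathbb{E}\left( \sup_{0 \le s \le \epsilon} |X^{x_0}(s) - x_0| \right)^p = 0,$$

where the supremum on the right side is taken over all admissible controls. Thus we have that

$$\begin{aligned} 0 &\ge \liminf_{\epsilon \to 0} \frac{1}{\epsilon} \mathbb{E}\left( \int_0^\epsilon k(x_0, u^\epsilon(t))\,dt + \int_0^\epsilon (L\psi)(x_0, u^\epsilon(t))\,dt + \sum_{i=1}^N \int_0^\epsilon (D_i\psi)(X^{x_0,\epsilon}(t))\,dY_i(t) \right) - \alpha V_\alpha(x_0) \\ &\ge \inf_u \left( k(x_0, u) + (L\psi)(x_0, u) \right) - \alpha V_\alpha(x_0) + \liminf_{\epsilon \to 0} \frac{1}{\epsilon} \mathbb{E}\left( \sum_{i=1}^N \int_0^\epsilon (D_i\psi)(X^{x_0,\epsilon}(t))\,dY_i(t) \right) \\ &= -F_\alpha(x_0, V_\alpha(x_0), D\psi(x_0), D^2\psi(x_0)) + \liminf_{\epsilon \to 0} \sum_{i=1}^N \frac{1}{\epsilon} \mathbb{E}\left( \int_0^\epsilon (D_i\psi)(X^{x_0,\epsilon}(t))\,dY_i(t) \right). \end{aligned}$$

From the above inequality one can prove part 2. of Definition 3.3 exactly in the way part 1. was proved from (3.8). This proves the theorem.

Next, we will show that under the standing assumptions of this paper, there is a unique viscosity solution of (1.5). We begin by considering the following equation. For $n \in \mathbb{N}$, let $B_n \doteq \{x \in G \mid |x| < n\}$. Let $\psi \in C_b(G)$ be given.

$$\begin{aligned} \inf_{u \in U} \left( L\phi(x, u) + k(x, u) - \alpha\phi(x) \right) &= 0, \quad x \in G \cap B_n \\ \langle \nabla\phi(x), d_i \rangle &= 0, \quad x \in \partial G \cap B_n;\ i \in \mathrm{In}(x) \\ \phi(x) &= \psi(x); \quad x \in \partial B_n \end{aligned} \tag{3.11}$$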

Definition 3.5 We say that $\phi \in C_b(G)$ is a viscosity solution of (3.11) if 1. and 2. in Definition 3.3 hold for all $x_0 \in G \cap B_n$ and $\phi(x) = \psi(x)$ for all $x \in \partial B_n$.

For $x \in \bar B_n$, let $X^x(\cdot)$ be given as a solution of (1.1) with $X(0) \equiv x$ and some admissible pair $(u(\cdot), W(\cdot))$. Let

$$\tau_n \equiv \tau_n(x) \doteq \inf\{t : X^x(t) \in B_n^c\} \tag{3.12}$$

and define

$$V^n(x) \doteq \inf \mathbb{E}\left( \int_0^{\tau_n} e^{-\alpha s} k(X^x(s), u(s))\,ds + e^{-\alpha\tau_n} \psi(X^x(\tau_n)) \right), \tag{3.13}$$

where the infimum above is taken over all admissible controls. The existence part of the following result is proved exactly as Theorem 3.4. The proof of uniqueness, essentially, follows from results in [13]. A sketch of the argument is provided in the Appendix for the reader’s convenience.

Theorem 3.6 Let $\alpha \ge 0$ and let $V^n(\cdot)$ be defined via (3.13). Then $V^n(\cdot)$ is the unique viscosity solution of (3.11).

An immediate consequence of the above theorem is the following result.

Theorem 3.7 Let $\alpha \in (0, \infty)$. Then $V_\alpha(\cdot)$ defined via (1.3) is the unique viscosity solution of (1.5).

Proof. From Theorem 3.4 we know that $V_\alpha(\cdot)$ is a viscosity solution of (1.5). Now let $\tilde V$ be another viscosity solution of (1.5). Let $\tau_n(x)$ be defined via (3.12). Define

$$\phi(x) \doteq \inf \mathbb{E}\left( \int_0^{\tau_n(x)} e^{-\alpha s} k(X^x(s), u(s))\,ds + e^{-\alpha\tau_n(x)} \tilde V(X^x(\tau_n(x))) \right),$$

where the infimum is taken over all admissible controls. From Theorem 3.6, $\phi$ is the unique viscosity solution of (3.11), with $\psi$ there replaced by $\tilde V$. However, since $\tilde V$ solves (1.5), clearly, it is also a solution of (3.11) (once more with $\psi$ there replaced by $\tilde V$). Thus we have that $\phi = \tilde V$ and so

$$\tilde V(x) = \inf \mathbb{E}\left( \int_0^{\tau_n(x)} e^{-\alpha t} k(X^x(t), u(t))\,dt + e^{-\alpha\tau_n(x)} \tilde V(X^x(\tau_n(x))) \right).$$

Also from (3.1) we have that the above equality holds with $\tilde V$ replaced by $V_\alpha$. Thus we have that for $x \in G$ and $n$ large enough so that $x \in B_n$

$$\begin{aligned} |\tilde V(x) - V_\alpha(x)| &\le \sup \left| \mathbb{E}\left( e^{-\alpha\tau_n(x)} \tilde V(X^x(\tau_n(x))) - e^{-\alpha\tau_n(x)} V_\alpha(X^x(\tau_n(x))) \right) \right| \\ &\le (|\tilde V|_\infty + |V_\alpha|_\infty) \sup\left( \mathbb{E}(e^{-\alpha\tau_n(x)}) \right), \end{aligned}$$

where the supremum in the above display is taken over all admissible controls. Using the boundedness of the drift and diffusion coefficients and the Lipschitz property of the Skorohod map, it follows that $\sup(\mathbb{E}(e^{-\alpha\tau_n(x)})) \to 0$ as $n \to \infty$. This shows that $\tilde V(x) = V_\alpha(x)$ for all $x \in G$.

4 The Vanishing Discount Limit.

In this section we will show that if $V_\alpha$ is given via (1.3) and $\bar V_\alpha(x) \doteq V_\alpha(x) - V_\alpha(0)$; $x \in G$, then the family $\{\bar V_\alpha; \alpha \in (0, \infty)\}$ is pre-compact in $C(G)$. We begin, following [4], by an embedding of the continuous time control problem in a discrete time control problem. Define $\mathcal{U} \doteq \{\theta : [0, 1] \to U : \theta \text{ is a measurable map}\}$. We endow $\mathcal{U}$ with the coarsest topology under which, for every $e \in L^2[0, 1]$ and $f \in C_b(S)$, the map $\Psi : \mathcal{U} \to \mathbb{R}$ defined as $\Psi(u) \doteq \int_0^1 e(t) \int_S f(\theta)\, u_t(d\theta)\,dt$ is continuous. Let $\hat\Phi \subset \mathcal{P}(C([0, 1] : \mathbb{R}^k) \times \mathcal{U})$ be the class of all probability measures which correspond to the probability law of some admissible pair $(u(t), W(t))_{0 \le t \le 1}$. It follows from [5], Chapter 1, that $\hat\Phi$ is a compact metric space. Let $\phi \in \hat\Phi$ and let $(u(\cdot), W(\cdot))$ be the corresponding admissible pair on a filtered probability space $(\Omega, \mathcal{F}, P, (\mathcal{F}_t))$. Define

$$\hat k_\alpha(x, \phi) \doteq \mathbb{E}\left( \int_0^1 e^{-\alpha s} k(X^x(s), u(s))\,ds \right), \tag{4.14}$$

where $X^x(\cdot)$ is given as a solution of (1.1) with $X(0) \equiv x$. Then setting $\hat\alpha \doteq e^{-\alpha}$ we have that

$$\begin{aligned} \mathbb{E}\left( \int_0^\infty e^{-\alpha t} k(X^x(t), u(t))\,dt \right) &= \sum_{n=0}^\infty \mathbb{E}\left( \mathbb{E}\left( \int_n^{n+1} e^{-\alpha t} k(X^x(t), u(t))\,dt \,\Big|\, \mathcal{F}_n \right) \right) \\ &= \sum_{n=0}^\infty \hat\alpha^n\, \mathbb{E}\left( \hat k_\alpha(X_n^x, \phi_n) \right), \end{aligned} \tag{4.15}$$

where for $n \in \mathbb{N}_0$, $\phi_n$ is the conditional law of $(u(n + s), W(n + s) - W(n))_{0 \le s \le 1}$ given $\mathcal{F}_n$ and $X_n^x \doteq X^x(n)$. Note that $\{\phi_n\}$ is a sequence of $\mathcal{F}_n$ measurable, $\hat\Phi$ valued random variables. We will call the sequence $\{\phi_n\}$ the admissible control sequence corresponding to the admissible pair $(u(\cdot), W(\cdot))$. We now introduce a controlled probability transition kernel on $H \doteq G \times G$, defined as follows. For $x \equiv (x_1, x_2) \in H$ and $\phi \in \hat\Phi$, let $p(x, \phi, dy) \in \mathcal{P}(H)$ be defined as:

$$\int_H f(y)\, p(x, \phi, dy) \doteq \mathbb{E}(f(X_1^{x_1}, X_1^{x_2})), \quad f \in BM(H), \tag{4.16}$$

where for $i = 1, 2$, $X_1^{x_i} = X^{x_i}(1)$ and $(X^{x_i}(t))_{0 \le t \le 1}$ is given via (1.1) with $X(0) = x_i$ and an admissible pair $(u(t), W(t))_{0 \le t \le 1}$ having the probability law $\phi$. For future reference we also introduce a controlled probability transition kernel $\hat p(x_1, \phi, dy_1)$, on $G$, given as follows

$$\int_G f(y_1)\, \hat p(x_1, \phi, dy_1) \doteq \mathbb{E}(f(X_1^{x_1})), \quad f \in BM(G), \tag{4.17}$$

where $X_1^{x_1}$ and $\phi$ are as above. We now introduce a Lyapunov function for the controlled Markov chain $\{X_n^x\}$. This Lyapunov function was constructed in [1].

Theorem 4.1 [1] There exists a function $F : G \to \mathbb{R}$ such that it is twice continuously differentiable on $G \setminus \{0\}$ and such that the following hold.
(a) There exist $c_1, c_2 \in (0, \infty)$ such that $c_1|x| \le F(x) \le c_2|x|$, for all $x \in G$.
(b) For all $\epsilon > 0$ there exists $M \in (0, \infty)$ such that ($x \in G$, $|x| \ge M$) implies $\|D^2F(x)\| \le \epsilon$.
(c) There exists $c \in (0, \infty)$ such that $DF(x) \cdot r \le -c$, for $r \in C(\delta)$ and $x \in G \setminus \{0\}$, and $DF(x) \cdot d \le -c$, for $d \in d(x)$ and $x \in \partial G \setminus \{0\}$.
(d) There exists $L \in (0, \infty)$ such that $\sup_{x \in G} |DF(x)| \le L$.

Using the above theorem we can now prove the following result.

Theorem 4.2 There exist $c_0, \ell_0, M_0 \in (0, \infty)$ such that for any admissible pair $(u(\cdot), W(\cdot))$ on some filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}, P)$, $x \in G$ and $X^x(\cdot)$ given by (1.1), we have that

$$\mathbb{E}(F(X_{n+1}^x) \mid \mathcal{F}_n) - F(X_n^x) \le -c_0 1_{X_n^x \in B^c} + M_0 1_{X_n^x \in B}, \tag{4.18}$$

where $B \doteq \{x \in G \mid |x| \le \ell_0\}$ and $X_n^x \doteq X^x(n)$.
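The drift inequality (4.18) can be checked numerically in a toy case. The sketch below is only an illustration under strong simplifying assumptions that are not part of the paper: a one-dimensional reflected Brownian motion on $[0, \infty)$ with constant drift $-1$ and unit diffusion coefficient, the Lyapunov function $F(x) = x$, and an Euler scheme with projection onto $[0, \infty)$. The function name `one_step_drift` and all parameter values are hypothetical.

```python
import numpy as np

def one_step_drift(x0, drift=-1.0, sigma=1.0, n_steps=200, n_paths=20000, seed=0):
    """Monte Carlo estimate of E[F(X_1)] - F(x0) with F(x) = x.

    X is a reflected Brownian motion with constant drift on [0, inf), started
    at x0 and simulated over one unit of time by a projected Euler scheme.
    """
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    x = np.full(n_paths, float(x0))
    for _ in range(n_steps):
        incr = drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
        # Projection onto [0, inf): the discrete one-dimensional constraining step.
        x = np.maximum(x + incr, 0.0)
    return x.mean() - x0

# Far from the origin the expected one-step change is close to the drift (negative);
# near the origin it is positive but bounded, as in the two terms of (4.18).
print(one_step_drift(10.0))
print(one_step_drift(0.0))
```

The two regimes correspond to the indicator terms $1_{X_n^x \in B^c}$ and $1_{X_n^x \in B}$ in (4.18): a strictly negative drift outside a bounded set and a bounded increase inside it.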

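Proof. An application of Itô’s formula gives that

$$\begin{aligned} F(X_{n+1}^x) = F(X_n^x) &+ \int_n^{n+1} \langle DF(X^x(s)), b(X^x(s), u(s)) \rangle\,ds + \sum_{i=1}^N \int_n^{n+1} \langle DF(X^x(s)), d_i \rangle\,dY_i(s) \\ &+ \frac{1}{2} \int_n^{n+1} \mathrm{tr}\left( \sigma^*(X^x(s)) D^2F(X^x(s)) \sigma(X^x(s)) \right) ds + \int_n^{n+1} \langle DF(X^x(s)), \sigma(X^x(s))\,dW(s) \rangle. \end{aligned}$$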

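Conditioning with respect to $\mathcal{F}_n$ and using parts (c) and (d) of Theorem 4.1, we have that

$$\mathbb{E}(F(X_{n+1}^x) \mid \mathcal{F}_n) - F(X_n^x) \le -c + \frac{1}{2} \mathbb{E}\left( \int_n^{n+1} \mathrm{tr}\left( \sigma^*(X^x(s)) D^2F(X^x(s)) \sigma(X^x(s)) \right) ds \,\Big|\, \mathcal{F}_n \right). \tag{4.19}$$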

Let $\tilde g(x) \doteq \frac{1}{2} \mathrm{tr}(\sigma^*(x) D^2F(x) \sigma(x))$. From part (b) of Theorem 4.1 we can find $\ell_0 \in (0, \infty)$ such that $|\tilde g(x)| \le \frac{c}{2}$ for all $x \in G$ such that $|x| > \frac{\ell_0}{2}$. This implies that

$$\frac{1}{2} \int_n^{n+1} \mathrm{tr}\left( \sigma^*(X^x(s)) D^2F(X^x(s)) \sigma(X^x(s)) \right) ds \le |\tilde g|_\infty 1_{X_n^x \in B} + \left( |\tilde g|_\infty 1_{\{\sup_{n \le s \le n+1} |X^x(s) - X_n^x| > \frac{\ell_0}{2}\}} + \frac{c}{2} \right) 1_{X_n^x \in B^c}.$$

From the above display and (4.19) we have that

$$\begin{aligned} \mathbb{E}(F(X_{n+1}^x) \mid \mathcal{F}_n) - F(X_n^x) &\le \left( -c + |\tilde g|_\infty \mathbb{E}\left( 1_{\{\sup_{n \le s \le n+1} |X^x(s) - X_n^x| > \frac{\ell_0}{2}\}} \,\Big|\, \mathcal{F}_n \right) + \frac{c}{2} \right) 1_{X_n^x \in B^c} + |\tilde g|_\infty 1_{X_n^x \in B} \\ &\le \left( -\frac{c}{2} + |\tilde g|_\infty \frac{\mathbb{E}\left( \sup_{n \le s \le n+1} |X^x(s) - X_n^x| \mid \mathcal{F}_n \right)}{\ell_0/2} \right) 1_{X_n^x \in B^c} + |\tilde g|_\infty 1_{X_n^x \in B}. \end{aligned}$$

Using the boundedness of the drift and diffusion coefficients we can find $\ell$ such that

$$\mathbb{E}\left( \sup_{n \le s \le n+1} |X^x(s) - X_n^x| \,\Big|\, \mathcal{F}_n \right) \le \ell;$$

also, without loss of generality, we can assume that $\ell_0$ is large enough so that $\frac{2|\tilde g|_\infty \ell}{\ell_0} < \frac{c}{4}$. Using these bounds in the above display, we have that

$$\mathbb{E}(F(X_{n+1}^x) \mid \mathcal{F}_n) - F(X_n^x) \le -\frac{c}{4} 1_{X_n^x \in B^c} + |\tilde g|_\infty 1_{X_n^x \in B}.$$

The result now follows on setting $c_0 = \frac{c}{4}$ and $M_0 = |\tilde g|_\infty$.

Now let $x_1, x_2 \in G$ and fix an admissible pair $(u(\cdot), W(\cdot))$ and the corresponding control sequence $\{\phi_n\}$ on some filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}, P)$. Define for $i = 1, 2$, $X^{x_i}(\cdot)$ via (1.1) using the admissible pair $(u(\cdot), W(\cdot))$ and with $X(0) = x_i$, respectively. It is easy to see that if $x = (x_1, x_2)$ then $\{\bar X_n^x\} \doteq \{(X_n^{x_1}, X_n^{x_2})\}$ is an $H$ valued controlled Markov chain, starting at $x$, with the controlled probability transition kernel $p(x, \phi, dy)$ and the control sequence $\{\phi_n\}$. Also, it is easy to see that $X_n^{x_1}$ is a $G$ valued controlled Markov chain, starting at $x_1$, with the controlled probability transition kernel $\hat p(x_1, \phi, dy_1)$ and the control sequence $\{\phi_n\}$.

The Pseudo-Atom Construction. We will now proceed, as in [4], to adapt the Athreya-Ney-Nummelin construction of a pseudo-atom to the current problem. Let $H \doteq G \times G$ and $B$ be as in the statement of Theorem 4.2. Define $B^* \doteq B \times B$ and let $H^* \doteq H \times \{0, 1\} = G \times G \times \{0, 1\}$. Let $\lambda$ denote the Lebesgue measure on $G$. Define $\nu \in \mathcal{P}(H)$ as

$$\nu(A) \doteq \frac{(\lambda \times \lambda)(A \cap B^*)}{(\lambda(B))^2}. \tag{4.20}$$

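Using the uniform non-degeneracy of the diffusion coefficient in (1.1), it follows that there exists $0 < \delta^* < 1$ such that

$$p(x, \phi, A) \ge \delta^* 1_{B^*}(x) \nu(A), \quad \forall\, x \in H,\ A \in \mathcal{B}(H),\ \phi \in \hat\Phi. \tag{4.21}$$

For a set $A \in \mathcal{B}(H)$, we let $A_0 \doteq A \times \{0\}$ and $A_1 \doteq A \times \{1\}$. For every $\mu \in \mathcal{P}(G \times G)$ we define a $\mu^* \in \mathcal{P}(H^*)$ as follows. For $A \in \mathcal{B}(H)$,

$$\begin{aligned} \mu^*(A_0) &\doteq (1 - \delta^*) \mu(AB^*) + \mu(A(B^*)^c) \\ \mu^*(A_1) &\doteq \delta^* \mu(AB^*). \end{aligned} \tag{4.22}$$

Clearly, $\mu^*(A_0) + \mu^*(A_1) = \mu(A)$ and if $A \subset (B^*)^c$ then $\mu^*(A_0) = \mu(A)$.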
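The splitting of measures in (4.22), together with a minorization condition of the form (4.21), can be illustrated on a finite state space. The sketch below assumes an uncontrolled three-state chain (a hypothetical example, not the controlled kernel $p(x, \phi, dy)$ of the paper), a small set `in_C` playing the role of $B^*$, a probability vector `nu` and a constant `delta` with `P[x] >= delta * nu` for all `x` in the small set. It builds the split measure and a split-chain kernel of the Athreya-Ney-Nummelin type, and checks that the first coordinate of the split chain moves according to the original kernel `P`. All names are illustrative assumptions.

```python
import numpy as np

def split_measure(mu, in_C, delta):
    """Split mu on a finite set into mu* on states x {0, 1} (column index = level i)."""
    mu = np.asarray(mu, dtype=float)
    star = np.zeros((mu.size, 2))
    star[:, 0] = np.where(in_C, (1.0 - delta) * mu, mu)
    star[:, 1] = np.where(in_C, delta * mu, 0.0)
    return star

def split_kernel_row(P, nu, in_C, delta, x, i):
    """Transition law q((x, i), .) of the split chain, as an array of shape (n, 2)."""
    if i == 1:
        # From level 1 the chain regenerates: the next state is drawn from nu*.
        return split_measure(nu, in_C, delta)
    if in_C[x]:
        # Level 0 inside the small set: residual kernel after removing delta * nu.
        return split_measure((P[x] - delta * nu) / (1.0 - delta), in_C, delta)
    return split_measure(P[x], in_C, delta)

def marginal_one_step(P, nu, in_C, delta, x):
    """Law of the first coordinate after one step, starting from the split of a point mass at x."""
    start = split_measure(np.eye(P.shape[0])[x], in_C, delta)[x]  # weights on (x, 0), (x, 1)
    law = sum(start[i] * split_kernel_row(P, nu, in_C, delta, x, i) for i in (0, 1))
    return law.sum(axis=1)

# Hypothetical data: P[x] >= delta * nu holds for x in the small set {0, 1}.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
in_C = np.array([True, True, False])
nu = np.array([0.5, 0.5, 0.0])
delta = 0.4

for x in range(3):
    print(x, marginal_one_step(P, nu, in_C, delta, x))
```

The level-1 copy of the small set acts as an atom: from any of its points the next state has the same law, which is what makes coupling arguments of the kind used below possible.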

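On a suitable probability space $(\Omega^*, \mathcal{F}^*, P^*)$, define an $H^*$ valued controlled Markov chain: $Z_n \equiv (X_n^*, i_n^*)$, where $X_n^* \equiv (\tilde X_n^*, \hat X_n^*)$, with a $\hat\Phi$ valued control process $\phi_n^*$ so that:

(1). The controlled transition kernel of $Z_n$ is given as follows. For $\mathbf{z} \equiv (z, i) \in H^*$ and $\phi \in \hat\Phi$

$$q(\mathbf{z}, \phi, d\mathbf{y}) = \begin{cases} p^*(z, \phi, d\mathbf{y}) & \text{if } \mathbf{z} \in H_0 \setminus B_0^* \\ \frac{1}{1 - \delta^*} \left( p^*(z, \phi, d\mathbf{y}) - \delta^* \nu^*(d\mathbf{y}) \right) & \text{if } \mathbf{z} \in B_0^* \\ \nu^*(d\mathbf{y}) & \text{if } \mathbf{z} \in H_1, \end{cases} \tag{4.23}$$

where $\mathbf{y} \equiv (y, j) \in H^*$.

(2). The initial distributions are given as follows.

$$\begin{aligned} P^*(Z_0 \in A_0, \phi_0^* \in A') &\doteq \left( (1 - \delta^*) 1_{AB^*}(x) + 1_{A(B^*)^c}(x) \right) P(\phi_0 \in A') \\ P^*(Z_0 \in A_1, \phi_0^* \in A') &\doteq \delta^* 1_{AB^*}(x) P(\phi_0 \in A'), \end{aligned} \qquad A \in \mathcal{B}(H),\ A' \in \mathcal{B}(\hat\Phi).$$

(3). The control sequence $\{\phi_n^*\}$ is given as follows. For $n \in \mathbb{N}$, $A' \in \mathcal{B}(\hat\Phi)$ and $(z_m, i_m, \alpha_m) \in H \times \{0, 1\} \times \hat\Phi$

$$\begin{aligned} &P^*(\phi_n^* \in A' \mid Z_m = (z_m, i_m),\ \phi_{m-1}^* = \alpha_{m-1},\ m \le n) \\ &\quad = P(\phi_n \in A' \mid (X_m^{x_1}, X_m^{x_2}) = z_m,\ \phi_{m-1} = \alpha_{m-1},\ m \le n). \end{aligned} \tag{4.24}$$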

The above construction assures that the probability laws of $\{X_n^*, \phi_n^*\}_{n \in \mathbb{N}_0}$ and $\{(X_n^{x_1}, X_n^{x_2}), \phi_n\}_{n \in \mathbb{N}_0}$ are the same. Furthermore, $B_1^*$ is an accessible atom of $\{Z_n\}$ in the sense of [21]. One can now show, in a similar manner as in Lemma 3.3 of [4], that the hitting time of $B_1^*$ by the controlled Markov chain $\{Z_n\}$ has a finite expected value.

Theorem 4.3 Let $\tau(x_1, x_2) \doteq \inf\{n \in \mathbb{N}_0 : Z_n \in B_1^*\}$. Then, for every $M_0 \in (0, \infty)$,

$$\sup_{x_i \in G;\ |x_i| \le M_0;\ i = 1, 2}\ \sup\, \mathbb{E}^*(\tau(x_1, x_2)) < \infty,$$

where the inner supremum is taken over all admissible control pairs.

Proof. Let $F$ be as in Theorem 4.1 and let $(X^x(t))_{0 \le t \le 1}$ be given via (1.1) with $X(0) = x$ and admissible pair $(u(t), W(t))_{0 \le t \le 1}$ having the probability law $\phi$. Note that since for $x, y \in G$, $|F(x) - F(y)| < L|x - y|$, we have from the boundedness of the drift and diffusion coefficients that

$$\mathbb{E}\left( e^{|F(X^x(1)) - F(x)|} \right) \le C_1 \mathbb{E}\left( \sup_{0 \le s \le 1} e^{C_2 \left| \int_0^s \sigma(X^x(r))\,dW(r) \right|} \right) < \infty.$$

Now fix $\delta_0 \in (0, 1)$ and set $\mathcal{V}(y) \doteq e^{\delta_0 F(y)}$, $y \in G$. Then as in Theorem 16.3.1 of [21], one has that for all $x \in B^c$

$$\begin{aligned} \frac{\mathbb{E}(\mathcal{V}(X^x(1)))}{\mathcal{V}(x)} &= \int_G \hat p(x, \phi, dy)\, e^{\delta_0 (F(y) - F(x))} \\ &\le 1 - \bar C_1 \delta_0 + \bar C_2 \delta_0^{2 - \xi}, \end{aligned}$$

for some $\xi \in (0, 1)$ and $\bar C_1, \bar C_2 \in (0, \infty)$ which are independent of $\delta_0$. Now choose $\delta_0$ small enough so that $1 - \bar C_1 \delta_0 + \bar C_2 \delta_0^{2 - \xi} < 1$. Then we have that there exist $\beta \in (0, 1)$ and $b \in (0, \infty)$ such that for all $x \in G$

$$\sup_{\phi \in \hat\Phi} \mathbb{E}(\mathcal{V}(X^x(1))) \le (1 - \beta) \mathcal{V}(x) + b 1_B(x).$$

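Now, let $\mathbf{V} : H^* \to \mathbb{R}_+$ be defined as $\mathbf{V}(x_1, x_2, i) \doteq \mathcal{V}(x_1) + \mathcal{V}(x_2)$, where $(x_1, x_2, i) \in H^*$. Also, for $n \in \mathbb{N}_0$, let $\Gamma_n^* \doteq \sigma(X_m^*, i_m^*, \phi_m^*; m \le n)$. Observe that

$$\begin{aligned} \mathbb{E}^*(\mathbf{V}(Z_{n+1}) \mid \Gamma_n^*) - \mathbf{V}(Z_n) &= \mathbb{E}^*(\mathcal{V}(\tilde X_{n+1}^*) \mid \Gamma_n^*) - \mathcal{V}(\tilde X_n^*) + \mathbb{E}^*(\mathcal{V}(\hat X_{n+1}^*) \mid \Gamma_n^*) - \mathcal{V}(\hat X_n^*) \\ &\le -\beta \mathcal{V}(\hat X_n^*) - \beta \mathcal{V}(\tilde X_n^*) + 2b \\ &= -\beta \mathbf{V}(Z_n) + 2b. \end{aligned}$$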

The result now follows as in Lemma 3.3 of [4].

We now prove the main result of this section.

Theorem 4.4 For every $M_0 \in (0, \infty)$, there exist $C_1, C_2 \in (0, \infty)$, $\theta : \mathbb{N} \to \mathbb{R}_+$ such that $\theta(m) \to 0$ as $m \to \infty$, and for all $M \in (0, \infty)$

$$\sup_{x_i \in G,\ |x_i| \le M_0,\ i = 1, 2} |\bar V_\alpha(x_1) - \bar V_\alpha(x_2)| \le C_1 e^{M C_2} |x - y| + \theta(M).$$

Proof. We begin by observing that

$$|\bar V_\alpha(x_1) - \bar V_\alpha(x_2)| \le \sup \mathbb{E}\left| \int_0^\infty e^{-\alpha t} \left( k(X^{x_1}(t), u(t)) - k(X^{x_2}(t), u(t)) \right) dt \right|, \tag{4.25}$$

where the supremum on the right side is taken over all admissible pairs $(u(\cdot), W(\cdot))$. From (4.15) we have that the term on the right side above can be written as

$$\left| \sum_{n=0}^\infty \hat\alpha^n\, \mathbb{E}\left( \hat k_\alpha(X_n^{x_1}, \phi_n) - \hat k_\alpha(X_n^{x_2}, \phi_n) \right) \right|,$$

where $\{\phi_n\}$ is the admissible control sequence corresponding to the admissible pair $(u(\cdot), W(\cdot))$. Now from the above pseudo-atom construction we have that the above display equals:

$$\left| \sum_{n=0}^\infty \hat\alpha^n\, \mathbb{E}^*\left( \hat k_\alpha(\tilde X_n^*, \phi_n^*) - \hat k_\alpha(\hat X_n^*, \phi_n^*) \right) \right|.$$

Let $\tau \equiv \tau(x_1, x_2)$ be as in Theorem 4.3. Then the above expression can be written as […]

[…] $> 0$, $p, q \in \mathbb{R}^n$ and $x, y \in G$. Assume that

$$\left( p, q, \alpha \begin{pmatrix} I & -I \\ -I & I \end{pmatrix} + \beta \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} \right) \in D^{2,+} w(x, y).$$

Then there are $X, Y \in S^k$ for which

$$-C\alpha \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} \le \begin{pmatrix} X - \beta I & 0 \\ 0 & Y - \beta I \end{pmatrix} \le C\alpha \begin{pmatrix} I & -I \\ -I & I \end{pmatrix}$$

and

$$(u(x), p, X) \in \overline{D}^{2,+} u(x) \quad \text{and} \quad (v(y), -q, -Y) \in \overline{D}^{2,-} v(y),$$

where $C \in (1, \infty)$ is an absolute constant.

Proof of Theorem 3.6. As stated above Theorem 3.6, the proof that $V^n(\cdot)$ defined via (3.13) is a viscosity solution of (3.11) follows exactly as the proof of Theorem 3.4. Now let $V_1(\cdot)$ and $V_2(\cdot)$ be two viscosity solutions of (3.11). Let $g$ be as in Lemma 4.2 of [8] (cf. (3.2)). For $\gamma, \beta \in (0, \infty)$, define

$$V_{\gamma\beta}(x) \doteq V_1(x) - \gamma g(x) + \beta \quad \text{and} \quad U_{\gamma\beta}(x) \doteq V_2(x) - \gamma g(x) - \beta, \quad x \in G.$$

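Define $\tilde F_{\alpha,*}$ and $\tilde F_\alpha^*$ as maps from $G \times \mathbb{R} \times \mathbb{R}^k \times S^k$ to $\mathbb{R}$, via (3.4) and (3.5) respectively, by replacing the set $\{-\langle d_i, p \rangle; i \in \mathrm{In}(x)\}$ in (3.4) by the set $\{-\langle d_i, p \rangle + \gamma; i \in \mathrm{In}(x)\}$ and in (3.5) by the set $\{-\langle d_i, p \rangle - \gamma; i \in \mathrm{In}(x)\}$. Then as in Theorem 2.1 of [13], to every $\beta \in (0, \infty)$ there exists a $\gamma \equiv \gamma(\beta) \le \beta$ in $(0, \infty)$ such that

$$\tilde F_{\alpha,*}(x, U_{\gamma\beta}(x), p, A) \le 0 \quad \text{for } x \in G \text{ and } (p, A) \in D^{2,+} U_{\gamma\beta}(x), \tag{6.38}$$

$$\tilde F_\alpha^*(x, V_{\gamma\beta}(x), p, A) \ge 0 \quad \text{for } x \in G \text{ and } (p, A) \in D^{2,-} V_{\gamma\beta}(x). \tag{6.39}$$

Now fix a $\beta$ and the corresponding $\gamma$ in $(0, \infty)$. Suppose that

$$\kappa_0 \doteq \max_{x \in G \cap B_n} \left( U_{\gamma\beta}(x) - V_{\gamma\beta}(x) \right) > 0. \tag{6.40}$$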

Then noting that $U_{\gamma\beta}(x) - V_{\gamma\beta}(x) \le 0$ for all $x \in \partial B_n$, it follows through standard maximum principle arguments that

$$\kappa_0 = U_{\gamma\beta}(z) - V_{\gamma\beta}(z), \quad \text{for some } z \in \partial G \cap B_n. \tag{6.41}$$

We now show that (6.41) leads to a contradiction. Let $F$ be as in (3.3). Then by using the boundedness and Lipschitz property of the coefficients one can show that the following hold.

• There is a function $m_1 \in C([0, \infty))$ satisfying $m_1(0) = 0$ such that for all $\theta_0 \ge 1$, $x, y \in G \cap \bar B_n$, $r \in \mathbb{R}$, $p \in \mathbb{R}^k$ and $X, Y \in S^k$,
$$F_\alpha(y, r, p, -Y) - F_\alpha(x, r, p, X) \le m_1\left( |x - y|(|p| + 1) + \theta_0 |x - y|^2 \right),$$
whenever
$$-\theta_0 \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} \le \begin{pmatrix} X & 0 \\ 0 & Y \end{pmatrix} \le \theta_0 \begin{pmatrix} I & -I \\ -I & I \end{pmatrix}.$$

• There is a neighborhood $U$ of $\partial G$ in $G \cap \bar B_n$ and a function $m_2 \in C([0, \infty))$ satisfying $m_2(0) = 0$ for which
$$|F(x, r, p, X) - F(x, r, q, Y)| \le m_2(|p - q| + \|X - Y\|), \tag{6.42}$$
for $x \in U$, $r \in \mathbb{R}$, $p, q \in \mathbb{R}^k$ and $X, Y \in S^k$.

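Clearly, we can find an open neighborhood $V$ of $z$ (which is as small as we want) such that

$$\mathrm{In}(x) \subset \mathrm{In}(z) \text{ for } x \in V \cap \partial G, \quad V \cap G \subset U, \tag{6.43}$$

and

$$\langle y - x, n_i \rangle \le \theta |x - y|, \quad \text{for } i \in \mathrm{In}(z),\ x \in V \cap G_i \text{ and } y \in V \cap \partial G_i. \tag{6.44}$$

Also, from Theorem 4.1 of [13] we can find a family $\{w_\epsilon\}_{\epsilon > 0}$ of continuous functions on $V \times V$, and a positive constant $\theta$ having the property: For any $\epsilon > 0$ and $x, y \in V$, there are $p \equiv p(\epsilon, x, y)$, $q \equiv q(\epsilon, x, y) \in \mathbb{R}^k$ such that for all $i \in \mathrm{In}(z)$, the following hold (cf. equations (3.15) - (3.19) of [13]).

$$w_\epsilon(x, x) = 0, \quad w_\epsilon(x, y) \ge \theta \frac{|x - y|^2}{\epsilon}, \tag{6.45}$$

$$\langle d_i, p \rangle \ge -\frac{|x - y|^2}{\epsilon} \quad \text{if } \langle y - x, n_i \rangle \ge -\theta |x - y|, \tag{6.46}$$

$$\langle d_i, q \rangle \ge -\frac{|x - y|^2}{\epsilon} \quad \text{if } \langle y - x, n_i \rangle \le \theta |x - y|, \tag{6.47}$$

$$|q| \le \frac{|x - y|}{\epsilon}, \quad |p + q| \le \frac{|x - y|^2}{\epsilon}, \tag{6.48}$$

and

$$\left( p, q, \frac{1}{\epsilon} \begin{pmatrix} I & -I \\ -I & I \end{pmatrix} + \frac{|x - y|^2}{\epsilon} \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} \right) \in D^{2,+} w_\epsilon(x, y). \tag{6.49}$$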

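Henceforth, we will write $U_{\gamma\beta}$ as $u$ and $V_{\gamma\beta}$ as $v$. Now fix $\delta > 0$ and define $\tilde u \in USC(G)$ by

$$\tilde u(x) \doteq u(x) - \frac{\delta}{2} |x - z|^2. \tag{6.50}$$

Clearly, $z$ is the unique maximum point of $\tilde u - v$. Fix $\epsilon > 0$, and define $\phi \in USC([V \cap G] \times [V \cap G])$ by $\phi(x, y) \doteq \tilde u(x) - v(y) - w_\epsilon(x, y)$. Let $(x, y) \equiv (x(\epsilon), y(\epsilon)) \in [V \cap G] \times [V \cap G]$ be a maximum point of $\phi$. Then it follows exactly as in [13] (cf. equation (3.23) of that paper) that as $\epsilon \to 0$

$$\frac{|x - y|^2}{\epsilon} \to 0, \quad x, y \to z, \quad \tilde u(x) \to \tilde u(z) \quad \text{and} \quad v(y) \to v(z). \tag{6.51}$$

Now let $\epsilon$ be small enough so that $x, y \in V$. Define $w(x, y) \doteq \tilde u(x) - v(y)$, $x, y \in G$. Let $p = p(\epsilon, x, y)$, $q = q(\epsilon, x, y)$. From (6.49) it follows that

$$\left( p, q, \frac{1}{\epsilon} \begin{pmatrix} I & -I \\ -I & I \end{pmatrix} + \frac{|x - y|^2}{\epsilon} \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} \right) \in D^{2,+} w_\epsilon(x, y) \subset D^{2,+} w(x, y),$$

where the last inclusion follows on noting that $(x, y)$ is the maximum point of $\phi$. Hereafter, denote $\frac{|x - y|^2}{\epsilon}$ by $\kappa$. From Lemma 6.1 one can find matrices $X, Y \in S^k$ such that

$$-\frac{C}{\epsilon} \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} \le \begin{pmatrix} X & 0 \\ 0 & Y \end{pmatrix} \le \frac{C}{\epsilon} \begin{pmatrix} I & -I \\ -I & I \end{pmatrix} + C\kappa \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix}$$

and

$$(\tilde u(x), p, X) \in \overline{D}^{2,+} \tilde u(x) \quad \text{and} \quad (v(y), -q, -Y) \in \overline{D}^{2,-} v(y), \tag{6.52}$$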

where $C$ is as in Lemma 6.1. Using (6.50) one has that
$$(u(x),\ p + \delta(x-z),\ X + \delta I) \in D^{2,+} u(x). \tag{6.53}$$
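Two short computations sit behind the displays above and can be written out. This is a sketch under the standard definition of the second-order superjet (assumed here, since the definition appears outside this section); the vectors $\xi, \eta \in \mathbb{R}^k$ and the function $\psi$ are auxiliary names introduced only for illustration.

```latex
% (i) Evaluate the upper matrix inequality from Lemma 6.1 on the vector (\xi,\eta):
\begin{align*}
\langle X\xi,\xi\rangle + \langle Y\eta,\eta\rangle
  &\le \frac{C}{\epsilon}\,|\xi-\eta|^2 + C\kappa\,\bigl(|\xi|^2 + |\eta|^2\bigr).
\end{align*}
% Choosing \eta = \xi removes the 1/\epsilon term:
\begin{align*}
\langle (X+Y)\xi,\xi\rangle \le 2C\kappa\,|\xi|^2
  \quad\text{for all } \xi,
  \qquad\text{i.e.}\qquad X - C\kappa I \le -Y + C\kappa I .
\end{align*}
% (ii) By (6.50), u = \tilde u + \psi with \psi(x') = (\delta/2)|x'-z|^2, which is smooth:
\begin{align*}
D\psi(x) = \delta(x-z), \qquad D^2\psi(x) = \delta I .
\end{align*}
% Adding (D\psi(x), D^2\psi(x)) to the pair (p, X) from (6.52) yields (6.53).
```

Part (i) explains why only the combination $X + Y$ is controlled by a quantity that vanishes as $\epsilon \to 0$, even though $X$ and $Y$ individually may be of order $1/\epsilon$.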

Also, using (6.51) we have that for $\epsilon$ small enough and $i \in \mathrm{In}(z)$,
$$\langle d_i, p + \delta(x-z) \rangle + \gamma \ge \langle d_i, p \rangle + \frac{\gamma}{2}. \tag{6.54}$$
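One way to see (6.54) is to use only the Cauchy-Schwarz inequality and the convergence $x \to z$ from (6.51). The sketch below assumes $\gamma > 0$ (it is fixed outside this section).

```latex
\begin{align*}
\langle d_i, p + \delta(x-z)\rangle + \gamma
  &= \langle d_i, p\rangle + \delta\,\langle d_i, x-z\rangle + \gamma \\
  &\ge \langle d_i, p\rangle + \gamma - \delta\,|d_i|\,|x-z| \\
  &\ge \langle d_i, p\rangle + \frac{\gamma}{2}
  \qquad\text{as soon as } \delta\,|d_i|\,|x-z| \le \frac{\gamma}{2}.
\end{align*}
```

Because $x = x(\epsilon) \to z$ as $\epsilon \to 0$ and $\mathrm{In}(z)$ is a finite index set, the last condition holds for all $i \in \mathrm{In}(z)$ once $\epsilon$ is small enough.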

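Using (6.46), (6.47) and (6.51) we have that for $i \in \mathrm{In}(z)$ and small enough $\epsilon$,
$$\langle d_i, p \rangle + \frac{\gamma}{2} > 0 \ \text{ if } x \in \partial G_i, \quad \text{and} \quad -\langle d_i, q \rangle - \frac{\gamma}{2} < 0 \ \text{ if } y \in \partial G_i. \tag{6.55}$$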

From (6.43) we have that
$$V \cap \partial G \subset \bigcup_{i \in \mathrm{In}(z)} \partial G_i. \tag{6.56}$$
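The inclusion (6.56) can be read off from (6.43) in one line. The sketch assumes that $\mathrm{In}(x)$ denotes the set of indices $i$ with $x \in \partial G_i$, and that this set is nonempty for $x \in \partial G$; both are taken from the usual setup for polyhedral domains and are defined outside this section.

```latex
% Assumption: In(x) = \{ i : x \in \partial G_i \}, nonempty whenever x \in \partial G.
\begin{align*}
x \in V \cap \partial G
  \;\Longrightarrow\;
  \exists\, i \in \mathrm{In}(x) \subset \mathrm{In}(z) \text{ with } x \in \partial G_i
  \;\Longrightarrow\;
  x \in \bigcup_{i \in \mathrm{In}(z)} \partial G_i .
\end{align*}
```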

Combining (6.56), (6.55) and (6.54) we have that, for small enough $\epsilon$,
$$\langle d_i, p + \delta(x-z) \rangle + \gamma > 0 \ \text{ if } x \in \partial G \text{ and } i \in \mathrm{In}(x), \quad \text{and} \quad \langle d_i, -q \rangle - \gamma < 0 \ \text{ if } y \in \partial G \text{ and } i \in \mathrm{In}(y). \tag{6.57}$$
This, along with (6.38), (6.39), (6.53) and (6.52), shows that we can find $\delta_0 \in (0,\infty)$ such that for all $0 < \epsilon \le \delta_0$,
$$F_\alpha(x, u(x), p + \delta(x-z), X + \delta I) \le 0 \le F_\alpha(y, v(y), -q, -Y).$$
Thus we have that
$$\begin{aligned}
0 &\ge F_\alpha(x, u(x), p + \delta(x-z), X + \delta I) - F_\alpha(y, v(y), -q, -Y) \\
&\ge F_\alpha(x, u(x), -q, X - C\kappa I) - F_\alpha(y, u(x), -q, -Y + C\kappa I) \\
&\qquad + \alpha(u(x) - v(y)) - m_2(\kappa + \delta|x-z| + \delta + C\kappa) - m_2(C\kappa) \\
&\ge \alpha(u(x) - v(y)) - m_1(|x-y| + 2C\kappa) - m_2(\kappa + \delta|x-z| + \delta + C\kappa) - m_2(C\kappa),
\end{aligned}$$
where the second inequality is obtained from (6.52) while the third inequality is obtained from (6.53). Taking the limit as $\epsilon \to 0$, and then letting $\delta \to 0$, we get that $\alpha(u(z) - v(z)) \le 0$, which contradicts (6.40) and (6.41). This shows that $U_{\gamma,\beta}(x) - V_{\gamma,\beta}(x) \le 0$ for all $x \in G \cap B_n$. Taking the limit as $\beta \to 0$, we see that $V_2(x) \le V_1(x)$ for all $x \in G \cap B_n$. Reversing the roles of $V_1$ and $V_2$, we see that $V_2(x) = V_1(x)$ for all $x \in G \cap B_n$.
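The final limiting step can be unpacked as follows. This is a sketch assuming that $m_1$, like $m_2$, is continuous with $m_1(0) = 0$ (its definition lies outside this section) and that the point $z$ does not depend on $\delta$.

```latex
% As \epsilon \to 0, (6.51) gives |x-y| \to 0, \kappa \to 0, x \to z,
% u(x) = \tilde u(x) + (\delta/2)|x-z|^2 \to u(z), and v(y) \to v(z).
\begin{align*}
\lim_{\epsilon \to 0}\Bigl[\alpha\bigl(u(x)-v(y)\bigr) - m_1(|x-y| + 2C\kappa)
   &- m_2(\kappa + \delta|x-z| + \delta + C\kappa) - m_2(C\kappa)\Bigr] \\
   &= \alpha\bigl(u(z) - v(z)\bigr) - m_2(\delta) \;\le\; 0 .
\end{align*}
% The bracket is nonpositive for every small \epsilon, so the limit is too. Hence
\begin{align*}
\alpha\bigl(u(z) - v(z)\bigr) \le m_2(\delta) \longrightarrow 0
  \quad\text{as } \delta \to 0 .
\end{align*}
```

The term $m_2(\delta)$ survives the limit in $\epsilon$, which is why $\delta$ must also be sent to zero before the contradiction with (6.40) and (6.41) is reached.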

References

[1] R. Atar and A. Budhiraja. Stability properties of constrained jump-diffusion processes. To appear in Electronic Journal of Probability.

[2] R. Atar, A. Budhiraja, and P. Dupuis. On positive recurrence of constrained diffusion processes. Annals of Probability, 29:979–1000, 2001.

[3] G. K. Basak, V. S. Borkar, and M. K. Ghosh. Ergodic control of degenerate diffusions. Stochastic Anal. Appl., 1:1–17, 1997.

[4] V. S. Borkar. Dynamic programming for ergodic control with partial observations. To appear in Stoch. Proc. App.

[5] V. S. Borkar. Optimal Control of Diffusion Processes. Longman Scientific and Technical, 1989.

[6] V. S. Borkar and A. Budhiraja. A further remark on dynamic programming for partially observed Markov processes. Submitted for publication.

[7] V. S. Borkar and M. K. Ghosh. Ergodic control of multidimensional diffusions. II. Adaptive control. Appl. Math. Optim., 21:191–220, 1990.

[8] A. Budhiraja. An ergodic control problem for constrained diffusion processes: Existence of optimal Markov control. To appear in SIAM J. Cont. Opt.

[9] A. Budhiraja and P. Dupuis. Simple necessary and sufficient conditions for the stability of constrained processes. SIAM J. Applied Math., 59:1686–1700, 1999.

[10] R. M. Cox. Stationary and discounted control of diffusion processes. Ph.D. thesis, Columbia University, 1984.

[11] M. G. Crandall and P. L. Lions. Viscosity solutions of Hamilton-Jacobi equations. Trans. Amer. Math. Soc., 277:1–42, 1983.

[12] P. Dupuis and H. Ishii. On Lipschitz continuity of the solution mapping to the Skorokhod problem, with applications. Stochastics, 35:31–62, 1991.

[13] P. Dupuis and H. Ishii. On oblique derivative problems for fully nonlinear second-order elliptic PDE's on domains with corners. Hokkaido Mathematical Journal, 20:135–164, 1991.

[14] P. Dupuis and K. Ramanan. Convex duality and the Skorokhod Problem. I, II. Probability Theory and Related Fields, 2:153–195, 197–236, 1999.

[15] J. M. Harrison and J. A. Van Mieghem. Dynamic control of Brownian networks: State space collapse and equivalent workload formulation. Annals of Applied Probability, 7:747–771, 1997.

[16] J. M. Harrison and L. Wein. Scheduling networks of queues: Heavy traffic analysis of a simple open network. Queueing Systems, 5:265–280, 1989.

[17] F. P. Kelly and C. N. Laws. Dynamic routing in open queuing networks: Brownian models, cut constraints and resource pooling. Queueing Systems, 13:47–86, 1993.

[18] H. J. Kushner. Heavy Traffic Analysis of Controlled Queueing and Communication Networks. Springer-Verlag, New York, May 2001.

[19] H. J. Kushner and L. F. Martins. Routing and singular control for queuing networks in heavy traffic. SIAM J. Control Optim., 28:1209–1233, 1990.

[20] P.-L. Lions. Optimal control of diffusion processes and Hamilton-Jacobi-Bellman equations. II. Viscosity solutions and uniqueness. Comm. Partial Differential Equations, 8(11):1229–1276, 1983.

[21] S. Meyn and R. Tweedie. Markov Chains and Stochastic Stability. Springer-Verlag, 1993.
