Existence of Optimal Feedback Production Plans in Stochastic Flowshops with Limited Buffers

E. Presman Central Economics and Mathematical Institute of the Russian Academy of Sciences, Moscow, Russia

S.P. Sethi Faculty of Management, University of Toronto, Toronto, Ontario, Canada M5S 3E6

W. Suo Faculty of Management, University of Toronto, Toronto, Ontario, Canada M5S 3E6

We consider an N-machine flowshop with unreliable machines and bounds on work-in-process. Machine capacities and demand processes are finite state Markov chains. The problem is to choose the rates of production on the machines over time to minimize the expected discounted costs of production and inventory/backlog. We show that the value function of the problem is locally Lipschitz and is a solution to a dynamic programming equation with a certain boundary condition. We provide a verification theorem, and derive the optimal feedback control policy in terms of the directional derivatives of the value function.

Key words: Manufacturing systems; production control; flow control; dynamic programming; feedback control; state constraints; Markov decision processes.


This work was partly supported by Grant RBRF 94-01-01609 and NSERC Grant A4619. Corresponding author: Professor Suresh Sethi, Tel. +1 416 978 4162; Fax +1 416 978 5433; Email [email protected].

Preprint submitted to Elsevier Preprint

3 April 1997

1 Introduction

We consider the problem of a stochastic manufacturing system consisting of N machines in a flowshop configuration that must meet the demand for its product at a minimum cost. The stochastic nature of the system is due to the machines being failure-prone and to uncertainty in demand. The machine capacities and demand processes are assumed to be finite state Markov chains. The decision variables are the input rates to the machines. We take the number of parts in the buffers of the first N−1 machines and the difference between the actual and planned cumulative production at the last machine, known as the surplus, as the states of the system. We consider all buffers to have finite storage capacities. Moreover, the number of parts in the internal buffers between any two machines must remain nonnegative or, more generally, must not fall below given lower bounds. As is usual in the literature, we treat the material processed in the system as a continuous fluid; see Chapter 2 in Sethi and Zhang [8] and the references therein for justification and details. We also assume that processing resumes upon a machine's repair from the point at which it was interrupted when the machine broke down. This ensures that the surplus process is continuous in time.

Our objective is to choose admissible input rates to minimize a sum of expected discounted surplus and production costs over an infinite horizon. The problem can be formulated as a stochastic dynamic programming problem. We show that there exists a unique optimal control under a strict convexity condition on the cost function, and that the optimal control can be represented as a feedback control. Furthermore, as in Presman, Sethi and Zhang (PSZ hereafter) [4], we write the dynamic programming equation for the problem in terms of directional derivatives (DPEDD hereafter) for interior and boundary points, and we prove a verification theorem corresponding to our dynamic programming formulation.
This paper extends the work of PSZ, who only required the work-in-process to be nonnegative. The inclusion of lower and upper bound constraints on the work-in-process and of an upper bound on the finished goods surplus represents an important feature that is usually present in real life. There has been a substantial number of works related to the problems considered here. In view of the literature reviews in PSZ and in Sethi and Zhang [8], we choose not to discuss the earlier literature in this paper. Instead, we discuss the relevant research that has appeared subsequent to PSZ. Fong and Zhou [2] have treated a two-machine flowshop with limited buffers in the context of hierarchical controls. While they are not able to show the local Lipschitz continuity of the value function, as in PSZ for an N-machine flowshop with unlimited buffers or in Sethi, Zhang and Zhou [9] for a two-machine flowshop with a limited internal buffer, they prove a weaker property that is sufficient for their analysis of hierarchical controls. When it comes to optimal controls, however, the local Lipschitz property of the value function is one of the most important things to establish. We do not know how to extend the construction procedure used in PSZ and in Sethi, Zhang and Zhou [9] to allow for upper bounds on the buffer sizes. In this paper, therefore, we develop a new methodology that allows us to prove that the value function of a general N-machine flowshop is locally Lipschitz continuous. While useful in the present context, we believe that the methodology would find applications in other contexts as well. It therefore represents a main contribution of this paper.

The plan of the paper is as follows. In Section 2, we give a formulation of the problem, state the result on the existence and uniqueness of the optimal feedback control, and establish the corresponding verification theorem. We also state the main lemma required to prove that the value function is locally Lipschitz continuous. The main lemma is proved in Section 3, and the reader is referred to PSZ for proofs of the other results. Section 4 concludes the paper.

2 Problem Formulation and Main Results

We consider a manufacturing system producing a single finished product using N machines, M_1, ..., M_N, in tandem that are subject to breakdown and repair; see Fig. 1 in PSZ. We are given a stochastic process k(t) = (k_1(t), ..., k_{N+1}(t)) on the standard probability space (Ω, F, P), where k_n(t), n = 1, ..., N, is the production capacity of the n-th machine M_n at time t, and k_{N+1}(t) ≡ d(t) is the demand process, with constant demand representing a special case. We use u_n(t) to denote the input rate to the n-th machine, n = 1, ..., N, and x_n(t) to denote the number of parts in the buffer between the n-th and (n+1)-th machines, n = 1, ..., N−1. It is assumed that material is always available at M_1. Finally, the difference between cumulative production and cumulative demand, called the surplus, is denoted by x_N(t). If x_N(t) > 0, we have finished goods inventory, and if x_N(t) < 0, we have a backlog.
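As an aside, the finite-state capacity/demand process k(·) can be simulated directly from its jump rates. The sketch below is ours, not the paper's; the two-state generator (a single machine that breaks down at rate 0.5 and is repaired at rate 2.0) is purely illustrative.

```python
import random

def simulate_ctmc(Q, states, k0, horizon, rng):
    """Simulate a finite-state continuous-time Markov chain with
    generator Q (Q[i][j] = jump rate i -> j for i != j).
    Returns the list of (time, state) jump epochs up to `horizon`."""
    path = [(0.0, k0)]
    t, i = 0.0, states.index(k0)
    n = len(states)
    while True:
        rate = sum(Q[i][j] for j in range(n) if j != i)  # total exit rate
        if rate <= 0:
            break  # absorbing state
        t += rng.expovariate(rate)  # exponential holding time
        if t >= horizon:
            break
        # choose the next state with probability Q[i][j] / rate
        r, acc = rng.random() * rate, 0.0
        for j in range(n):
            if j == i:
                continue
            acc += Q[i][j]
            if r <= acc:
                i = j
                break
        path.append((t, states[i]))
    return path

# Hypothetical two-state machine: capacity 0 (down) or 1 (up);
# repair rate 2.0, breakdown rate 0.5 (numbers are invented).
Q = [[-2.0, 2.0],
     [0.5, -0.5]]
rng = random.Random(42)
path = simulate_ctmc(Q, states=[0, 1], k0=1, horizon=10.0, rng=rng)
```

Between the jump epochs of such a path, k(·) is constant, which is what later allows the problem to be reduced to a deterministic control problem between jumps.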

The dynamics of the system can then be written as follows:

ẋ_n(t) = u_n(t) − u_{n+1}(t), x_n(0) = x_n, n = 1, ..., N,   (1)

where u_{N+1}(t) ≡ d(t). This relation can also be written in vector form:

ẋ(t) = Au(t), x(0) = x,   (2)

where A : R^{N+1} → R^N is the corresponding linear operator. Since the number of parts in the internal buffers cannot be negative and buffers usually have limited storage capacities, we impose the state constraints L_n ≤ x_n(t) ≤ H_n, n = 1, ..., N−1, and x_N(t) ≤ H_N, with 0 ≤ L_n < H_n ≤ ∞, n = 1, ..., N−1, where L_n and H_n represent the lower and upper bounds on the work-in-process in the n-th buffer, respectively, and H_N represents the upper bound on the finished product surplus. To formulate the problem precisely, let S = ∏_{n=1}^{N−1} [L_n, H_n] × (−∞, H_N] ⊂ R^N denote the state constraint domain and let

U(k) = {u = (u_1, ..., u_{N+1}) : 0 ≤ u_n ≤ k_n, n = 1, ..., N; u_{N+1} = k_{N+1}},

for k = (k_1, ..., k_{N+1}), k_n ≥ 0, n = 1, ..., N+1. For x ∈ S, let

U(x, k) = {u ∈ U(k) : x_n = L_n ⇒ u_n − u_{n+1} ≥ 0, n = 1, ..., N−1; x_n = H_n ⇒ u_n − u_{n+1} ≤ 0, n = 1, ..., N}.

Let F_t = σ{k(s) : 0 ≤ s ≤ t}. We now define the concept of admissible controls.

Definition. We say that a control u(·) = (u_1(·), ..., u_{N+1}(·)) is admissible with respect to the initial state vector x = (x_1, ..., x_N) ∈ S if: (i) u(·) is an F_t-adapted measurable process, (ii) u(t) ∈ U(k(t)) for all t ≥ 0, and (iii) the corresponding state process x(t) = (x_1(t), ..., x_N(t)) ∈ S for all t ≥ 0.
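As a minimal numerical sketch of the operator A in (2) and of the boundary sets U(x, k), consider the following (the function names and the two-machine numbers are ours, not the paper's):

```python
def apply_A(u):
    """The linear operator A of (2): (Au)_n = u_n - u_{n+1},
    mapping the N+1 input/demand rates to the N buffer/surplus rates."""
    return [u[n] - u[n + 1] for n in range(len(u) - 1)]

def in_U_xk(x, u, k, L, H, tol=1e-12):
    """Membership test for U(x, k): capacity bounds, u_{N+1} = k_{N+1},
    and the boundary conditions u_n - u_{n+1} >= 0 at x_n = L_n (n < N)
    and u_n - u_{n+1} <= 0 at x_n = H_n (n <= N)."""
    N = len(x)
    if any(u[n] < -tol or u[n] > k[n] + tol for n in range(N)):
        return False
    if abs(u[N] - k[N]) > tol:
        return False            # the last rate must equal the demand
    rates = apply_A(u)
    for n in range(N):
        if n < N - 1 and abs(x[n] - L[n]) < tol and rates[n] < -tol:
            return False        # buffer n would drop below its lower bound
        if abs(x[n] - H[n]) < tol and rates[n] > tol:
            return False        # buffer/surplus n would exceed its upper bound
    return True

# Illustrative two-machine data (numbers are hypothetical):
x, k = [0.0, 1.0], [2.0, 2.0, 1.0]   # buffer 1 empty; demand d = 1
L, H = [0.0], [5.0, 10.0]
```

With buffer 1 sitting at its lower bound, a control with u_1 < u_2 is rejected, which is exactly condition (iii) of the definition restated pointwise.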

We use A(x, k) to denote the set of all admissible controls with respect to x ∈ S and k(0) = k.

Remark 1 Condition (iii) is equivalent to u(t) ∈ U(x(t), k(t)), t ≥ 0.

The problem is to find an admissible control u(·) that minimizes the cost function

J(x, k, u(·)) = E ∫_0^∞ e^{−ρt} h(x(t), u(t), k(t)) dt,   (3)

where h(·, ·, ·) defines the cost of surplus and production under the current value of k(t), k = (k_1, ..., k_{N+1}) is the initial value of k(t), and ρ > 0 is the discount rate. The value function is then defined as

v(x, k) = inf_{u(·) ∈ A(x,k)} J(x, k, u(·)).   (4)

We make the following assumptions on the random process k(t) and the cost function h(·, ·, ·):

(A1) The capacity/demand process k(t) ∈ M is a finite state Markov chain with the infinitesimal generator Q = (q_{ij}); see PSZ for details.

(A2) For any given k, h(·, ·, k) is a nonnegative jointly convex function. For all x, x′ ∈ S and u, u′ ∈ U(k_j), j = 1, ..., p, there exist constants C and K_h ≥ 1 such that

|h(x, u, k) − h(x′, u′, k)| ≤ C (1 + |x|^{K_h} + |x′|^{K_h}) (|x − x′| + |u − u′|),

where |·| denotes an appropriate norm.

Remark 2 The restriction on the growth rate of the cost h in (A2) is a usual assumption to ensure the existence of a solution to the dynamic programming equation to be considered later. For the special case L_n = 0 and H_n = ∞, n = 1, 2, ..., N−1, and H_N = ∞ considered in PSZ, it was assumed that the cost h(x, u, k) does not depend on k, so that it can be written as h(x, u), and that h(x, u) satisfies

|h(x, u) − h(x′, u′)| ≤ C (1 + |x|^{K_h} + |x′|^{K_h}) |x − x′| + C′ |u − u′|,   (5)

instead of the corresponding condition in (A2). Note that all their results hold for the more general cost as in (A2) above.

As in PSZ, we can formally write the DPEDD for our problem as:

ρ v(x, k) = inf_{u ∈ U(x,k)} { v′_{Au}(x, k) + h(x, u, k) } + Qv(x, k),   (6)

where v′_p(x, k) is the directional derivative of v(·, k) along the direction p ∈ R^N.

Theorem 1 (Existence and Verification) The following results hold:
(a) There exists an optimal control u*(·) ∈ A(x, k). Furthermore, if h(·, ·, k) for some k is strictly convex in either x or u or both, then u*(·) is unique and can be represented as a feedback control, i.e., there exists a function u(·, ·) such that for any x we have u*(t) = u(x*(t), k(t)), t ≥ 0, where x*(·) is the optimal state process, the solution of (2) for u*(·) with x*(0) = x.
(b) v(x, k) satisfies equation (6) for all x ∈ S.
(c) If some continuous convex function ṽ(x, k) satisfies (6) and (9) with x′ = 0, then ṽ(x, k) ≤ v(x, k). Moreover, if there exists a feedback control u(x, k) providing the infimum in (6) for ṽ(x, k), then ṽ(x, k) = v(x, k) and u(x, k) is an optimal feedback control.


(d) Assume that h(x, u, k) is strictly convex in u for each fixed x and k. Let u(x, k) denote the minimizer of the right-hand side of (6). Then

ẋ(t) = Au(x(t), k(t)), x(0) = x,

has a solution x(t), and u(t) = u(x(t), k(t)) is the optimal control.

For each k ∈ M, y ∈ S, let us consider the following deterministic optimal control problem P(y, k):

ẏ(t) = Au(t), y(0) = y,   (7)

v̂(y, k) = inf_{u(t) ∈ U(y(t),k)} ∫_0^∞ e^{−β(k)t} [ h(y(t), u(t), k) + Σ_{k′≠k} q_{k,k′} v(y(t), k′) ] dt,   (8)

where β(k) = ρ + Σ_{k′≠k} q_{k,k′} is known as the killing rate in the PDP literature; see Davis [1].

PSZ proved Theorem 1 in the special case when L_n = 0 for n = 1, ..., N−1, H_n = +∞ for n = 1, ..., N, and h(y, u, k) does not depend on k and satisfies (5). For this purpose, they used a reduction of the initial problem to P(y, k).
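A minimal numerical illustration of the killing rate β(k) (the three-state generator and discount rate below are invented for the example):

```python
# Hypothetical three-state generator Q = (q_ij); each row sums to zero.
Q = [[-3.0, 2.0, 1.0],
     [1.0, -1.5, 0.5],
     [0.0, 4.0, -4.0]]
rho = 0.1  # discount rate (illustrative)

def killing_rate(Q, rho, i):
    """beta(k_i) = rho + sum_{j != i} q_ij: the effective discount rate
    between jumps of k(.) in the deterministic problem P(y, k)."""
    return rho + sum(Q[i][j] for j in range(len(Q)) if j != i)

betas = [killing_rate(Q, rho, i) for i in range(3)]
```

Since the rows of a generator sum to zero, β(k_i) also equals ρ − q_{ii}: discounting and the exit rate from the current state are folded into one exponential factor.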

Remark 3 In the context of a single fluid model, Rajgopal [5] also uses a reduction to an equivalent deterministic problem in his analysis.

The essential reduction step in PSZ was the proof of the Lipschitz property of the value function, namely,

|v(x, k) − v(x′, k)| ≤ C (1 + |x|^{K_h} + |x′|^{K_h}) |x − x′|.   (9)

To establish (9), PSZ proved an intermediate result (see Lemma 1 in PSZ). In Section 3, we provide a constructive proof of the following Main Lemma, which leads to the required Lipschitz property in this paper.

Main Lemma. Given x = (x_1, ..., x_N), x′ = (x′_1, ..., x′_N) ∈ S and k ∈ M, let u(·) = (u_1(·), ..., u_N(·), k_{N+1}(·)) ∈ A(x, k) and let x(·) be the corresponding trajectory. Then there exists u′(·) = (u′_1(·), ..., u′_N(·), k_{N+1}(·)) ∈ A(x′, k), with corresponding trajectory x′(·), such that
(a) 0 ≤ u′_n(t) ≤ u_n(t) for all t ≥ 0, n = 1, ..., N;
(b) ∫_0^t |u_n(s) − u′_n(s)| ds ≤ Σ_{i=1}^N |x_i − x′_i|, n = 1, ..., N;
(c) Σ_{i=1}^N |x_i(t) − x′_i(t)| ≤ Σ_{i=1}^N |x_i − x′_i|.

Armed with the Main Lemma in place of Lemma 1 of PSZ and with Remark 2, the proof of Theorem 1 is completely analogous to that in PSZ, and we omit it.

3 Proof of the Results

In order to prove the Main Lemma, let us first consider the deterministic problem P(y, k) for some initial y and k without the objective function, and deterministic functions y(t) and u(t), 0 ≤ t < ∞, which satisfy y(0) = y ∈ S, ẏ(t) = Au(t), and u(t) ∈ U(y(t), k) for the given value of k (it is easy to see that y(t) ∈ S for all 0 ≤ t < ∞). For any y′ ∈ S, we will construct functions y′(t) and u′(t) which satisfy y′(0) = y′, dy′(t)/dt = Au′(t), u′(t) ∈ U(y′(t), k) for all 0 ≤ t < ∞, and such that Σ_{i=1}^N [y_i(t) − y′_i(t)]^+ and Σ_{i=1}^N [y_i(t) − y′_i(t)]^− are nonincreasing in t.

We begin our construction by defining, for any y′ = (y′_1, ..., y′_N) ∈ S and u = (u_1, ..., u_{N+1}), the disjoint index sets

Γ^+(y′, u) = {n : y′_n = H_n, 1 ≤ n ≤ N, u_n − u_{n+1} > 0},
Γ^−(y′, u) = {n : y′_n = L_n, 1 ≤ n ≤ N−1, u_n − u_{n+1} ≤ 0}.   (10)

Clearly, Γ^+(y′, u) is the set of indices of the components of the state vector y′ that would violate the upper bound constraints if the given control u were used, while Γ^−(y′, u) is the set of indices that would either stay at or violate the lower bound constraints if the given control u were used.

Whenever the use of the control u would violate a constraint, we must modify it to avoid the violation. Two simple principles will guide our modification procedure. First, the modified control should remain nonnegative and should not exceed the given machine capacity; the way we ensure this is to modify the control only downward, while never making it negative. Second, we shall modify the control in such a way that the new trajectory y′(·) stays close to the given trajectory y(·).

Fig. 1. Upstream and downstream buffers of machine M_n: the rate u_{n−1} feeds the (n−1)-th buffer (level y_{n−1}); M_n drains it at rate u_n into the n-th buffer (level y_n), which is in turn drained at rate u_{n+1}.

Let us first describe the modification when n ∈ Γ^+(y′, u). As is obvious from Figure 1, reducing the upstream control u_n to equal the value of the downstream control u_{n+1} would keep the modified control between zero and the machine capacity, and would ensure that the trajectory y′_n(·) does not violate the upper bound H_n. But when the control u_n is reduced, it increases the rate at which the inventory in the (n−1)-th buffer accumulates. We then invoke the second principle, which suggests that we should reduce the control u_{n−1} (upstream to the (n−1)-th buffer) as well, so as to preserve the difference (u_{n−1} − u_n) as much as possible, i.e., without making the modified value of u_{n−1} negative.

These ideas lead us to define the modified control u^H(y′, u) = (u^H_1(y′, u), ..., u^H_{N+1}(y′, u)) by the following backward induction:

u^H_{N+1}(y′, u) = u_{N+1},
u^H_n(y′, u) = u^H_{n+1}(y′, u)  if n ∈ Γ^+(y′, u),
u^H_n(y′, u) = [u_n − u_{n+1} + u^H_{n+1}(y′, u)]^+  if n ∉ Γ^+(y′, u).   (11)

Similar arguments lead us to a forward induction procedure for modifying the control when n ∈ Γ^−(y′, u). The procedure results in u^L(y′, u) = (u^L_1(y′, u), ..., u^L_{N+1}(y′, u)), defined as follows:

u^L_1(y′, u) = u_1,
u^L_{n+1}(y′, u) = u^L_n(y′, u)  if n ∈ Γ^−(y′, u),
u^L_{n+1}(y′, u) = [u_{n+1} − u_n + u^L_n(y′, u)]^+  if n ∉ Γ^−(y′, u), n < N,
u^L_{n+1}(y′, u) = u_{N+1}  if n = N.   (12)

It should be obvious that if u violates no state constraints, then no modification is required. Furthermore, if using u would violate only the upper bound (respectively, lower bound) constraints, then u^H(y′, u) (respectively, u^L(y′, u)) would provide the modified control we are seeking. In general, however, upper bound constraints may be violated at some buffers and lower bound constraints at others. It is then reasonable first to modify the control u according to the backward procedure (11), designed to avoid violations of the upper bound constraints, and then to further modify the resulting control u^H according to the forward procedure (12), designed to avoid violations of the lower bound constraints. Thus, we consider the following control

u′(y′, u) = u^L(y′, u^H(y′, u)),   (13)

and the resulting trajectory obtained from solving the differential equation

dy′(t)/dt = Au′(y′(t), u(t)), y′(0) = y′.   (14)
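To make the order of the two inductions concrete, here is a direct transcription of (11)–(13) into code (0-based indices; the function name and the worked numbers are ours — a sketch under the reconstructed formulas, not the paper's implementation):

```python
def modify_control(y_prime, u, L, H, tol=1e-12):
    """Transformation (13): u' = u^L(y', u^H(y', u)).
    y_prime: buffer/surplus levels (length N); u: rates (length N+1,
    u[N] being the demand); L: lower bounds (length N-1); H: upper
    bounds (length N). Returns the modified rate vector u'."""
    N = len(y_prime)

    def gamma_plus(v):   # indices about to violate an upper bound
        return {n for n in range(N)
                if abs(y_prime[n] - H[n]) < tol and v[n] - v[n + 1] > tol}

    def gamma_minus(v):  # indices that stay at or violate a lower bound
        return {n for n in range(N - 1)
                if abs(y_prime[n] - L[n]) < tol and v[n] - v[n + 1] <= tol}

    # Backward induction (11): u^H
    gp = gamma_plus(u)
    uH = [0.0] * (N + 1)
    uH[N] = u[N]
    for n in range(N - 1, -1, -1):
        uH[n] = uH[n + 1] if n in gp else max(u[n] - u[n + 1] + uH[n + 1], 0.0)

    # Forward induction (12): u^L applied to u^H
    gm = gamma_minus(uH)
    uL = [0.0] * (N + 1)
    uL[0] = uH[0]
    for n in range(N):
        if n == N - 1:
            uL[N] = uH[N]          # the demand rate is never modified
        elif n in gm:
            uL[n + 1] = uL[n]      # hold the buffer at its lower bound
        else:
            uL[n + 1] = max(uH[n + 1] - uH[n] + uL[n], 0.0)
    return uL
```

In the first test below, buffer 1 sits at its lower bound and the surplus at its upper bound; the result satisfies Lemma 1: the modified rates stay in [0, u_n] and respect both boundary conditions. In the second, no bound is active and the control is returned unchanged.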

Remark 4 In order to prove their Lemma 1 (corresponding to the Main Lemma of this paper), PSZ constructed the stochastic control process u′(·) ∈ A(x′, k) for the new initial condition x′, given x, k, and u(·) ∈ A(x, k). Here we construct the transformation (13), which is local in nature and independent of the machine state k, and which depends only on the current value (or value at time t) of y′ and the time-t value of the given control u. While this transformation resembles a projection of u onto the set of admissible directions, it is not the same. Moreover, we are unable to prove Lemma 2 below by the projection method.

We can now state and prove the following results concerning the deterministic control defined by (13) and the deterministic trajectory defined by (14). The first lemma says that the modified control u′ is nonnegative and no larger than the given control u. The second says that the corresponding trajectory y′(t) stays within the required upper and lower bounds for t ∈ [0, ∞). Intuitively, it is not difficult to see why these results should follow from our construction. The second lemma also says that the deterministic trajectory y(·), emanating from the given initial condition y and the control u(·), and the new trajectory y′(·) do not stray too far from one another from where they begin at time zero.

Lemma 1 If u ∈ U(y′, k), then (i) 0 ≤ u′_n(y′, u) ≤ u_n, 1 ≤ n ≤ N; (ii) u′(y′, u) ∈ U(y′, k).

Proof. In order to prove (i), it suffices, in view of u^H_{N+1}(y′, u) = u_{N+1}, to show the following:

(H_1) 0 ≤ u^H_n(y′, u) ≤ u_n, 1 ≤ n ≤ N+1, and
(L_1) 0 ≤ u^L_n(y′, u) ≤ u_n, 1 ≤ n ≤ N+1.

To prove (H_1), we use the induction argument with the knowledge that 0 ≤ (a + b)^+ ≤ a for a = u_n ≥ 0 and b = u^H_{n+1}(y′, u) − u_{n+1} ≤ 0. We can prove (L_1) analogously.

With (i) in hand, the proof of (ii) will be complete once we show that

(H_2) u′_n(y′, u) − u′_{n+1}(y′, u) ≤ 0 for n ∈ Γ^+(y′, u), and
(L_2) u′_n(y′, u) − u′_{n+1}(y′, u) ≥ 0 for n ∈ Γ^−(y′, u).

In order to prove (H_2), note that if n ∈ Γ^+(y′, u), then y′_n = H_n and, therefore, n ∉ Γ^−(y′, u^H(y′, u)). According to (13),

u′_{n+1}(y′, u) = u^L_{n+1}(y′, u^H(y′, u)) = [u^H_{n+1}(y′, u) − u^H_n(y′, u) + u^L_n(y′, u^H(y′, u))]^+ = [u^H_{n+1}(y′, u) − u^H_n(y′, u) + u′_n(y′, u)]^+.

But n ∈ Γ^+(y′, u) and (11) imply that u^H_n(y′, u) = u^H_{n+1}(y′, u). Thus,

u′_{n+1}(y′, u) − u′_n(y′, u) = 0.

This completes the proof of (H_2). The proof of (L_2) is similar. □

Lemma 2 Equation (14) has a solution y′(·) such that y′(t) ∈ S, 0 ≤ t < ∞. Moreover, y′(·) is such that Σ_{i=1}^N [y_i(t) − y′_i(t)]^+ and Σ_{i=1}^N [y_i(t) − y′_i(t)]^− are nonincreasing in t.

Proof. Let us first consider the case when u(·) is a piecewise-constant, right-continuous deterministic function. Then, from the construction of the feedback control u′(y′, u) and Lemma 1 (ii), it is easy to see that a solution y′(·) to (14) exists with y′(t) ∈ S for 0 ≤ t < ∞.

To prove the second part, we use (14) to write

d/dt (y(t) − y′(t)) = A(u(t) − u′(y′(t), u(t))).

For y, y′ ∈ S, let us define the following partition of {1, ..., N}:

Θ^+(y, y′, u) = {n : y_n − y′_n < 0, or y_n − y′_n = 0 and (A(u − u′(y′, u)))_n < 0},
Θ^−(y, y′, u) = {n : y_n − y′_n > 0, or y_n − y′_n = 0 and (A(u − u′(y′, u)))_n ≥ 0},   (15)

where we note that

(A(u − u′(y′, u)))_n = u_n − u_{n+1} − (u′_n(y′, u) − u′_{n+1}(y′, u)).   (16)

As we shall see next, the definition (15) of the new index sets allows us to consider the negative and positive summands of Σ_{n=1}^N (y_n(t) − y′_n(t)) separately.

We first consider the sum of the negative differences. It is easy to see that for the piecewise-constant, right-continuous control u(·), a fixed t, and a sufficiently small value of s, we have

Σ_{n=1}^N [y_n(t+s) − y′_n(t+s)]^− = Σ_{n ∈ Θ^+(y(t), y′(t), u(t))} [y′_n(t+s) − y_n(t+s)]
 = Σ_{n=1}^N [y_n(t) − y′_n(t)]^− − s Σ_{n ∈ Θ^+(y(t), y′(t), u(t))} (A(u(t) − u′(y′(t), u(t))))_n.

It is clear that for Σ_{n=1}^N [y_n(t+s) − y′_n(t+s)]^− to be nonincreasing, it is enough that for any y, y′, and u, we have

Σ_{n ∈ Θ^+(y, y′, u)} (A(u − u′(y′, u)))_n ≥ 0.   (17)

To show (17), we need to explore the relationship between the index sets defined in (10) and (15). Since u_n − u_{n+1} > 0 and y′_n = H_n imply y_n < H_n, it is immediate that Γ^+(y′, u) ⊆ Θ^+(y, y′, u). On the other hand, u_n − u_{n+1} ≤ 0 and y′_n = L_n imply y_n ≥ L_n and, when y_n = L_n, (A(u − u′(y′, u)))_n = 0. Thus Γ^−(y′, u) ⊆ Θ^−(y, y′, u).

We can now prove (17). We rewrite (16) as

(A(u − u′(y′, u)))_n = (A(u − u^H(y′, u)))_n + (A(u^H(y′, u) − u^L(y′, u^H(y′, u))))_n.   (18)

From the definition of u^H(y′, u), we can easily conclude that

• (A(u − u^H(y′, u)))_n = 0 if n ∉ Γ^+(y′, u) and u_n − u_{n+1} + u^H_{n+1}(y′, u) ≥ 0, and
• (A(u − u^H(y′, u)))_n ≤ 0 if n ∉ Γ^+(y′, u) and u_n − u_{n+1} + u^H_{n+1}(y′, u) < 0.

Therefore,

(A(u − u^H(y′, u)))_n ≤ 0 for all n ∉ Γ^+(y′, u).   (19)

Moreover, by (16), (11), and Lemma 1, we have

Σ_{n=1}^N (A(u − u^H(y′, u)))_n = u_1 − u^H_1(y′, u) ≥ 0.   (20)

Inequalities (19) and (20), together with the fact that Γ^+(y′, u) ⊆ Θ^+(y, y′, u), allow us to conclude that

Σ_{n ∈ Θ^+(y, y′, u)} (A(u − u^H(y′, u)))_n = − Σ_{n ∉ Θ^+(y, y′, u)} (A(u − u^H(y′, u)))_n + Σ_{n=1}^N (A(u − u^H(y′, u)))_n ≥ 0.   (21)

Similarly, the following is verified easily from the definition of u^L(y′, u^H(y′, u)):

• For n ∉ Γ^−(y′, u^H(y′, u)) and u^H_{n+1}(y′, u) − u^H_n(y′, u) + u^L_n(y′, u^H(y′, u)) ≥ 0,
 (A(u^H(y′, u) − u^L(y′, u^H(y′, u))))_n = 0.
• For n ∉ Γ^−(y′, u^H(y′, u)) and u^H_{n+1}(y′, u) − u^H_n(y′, u) + u^L_n(y′, u^H(y′, u)) < 0,
 (A(u^H(y′, u) − u^L(y′, u^H(y′, u))))_n ≥ 0.

These imply

(A(u^H(y′, u) − u^L(y′, u^H(y′, u))))_n ≥ 0 for all n ∉ Γ^−(y′, u^H(y′, u)).   (22)

Next, we show that

Γ^−(y′, u^H(y′, u)) = Γ^−(y′, u).   (23)

Indeed, if n ∈ Γ^−(y′, u), then from (10) and (11) we have y′_n = L_n, u_n − u_{n+1} ≤ 0, and

u^H_n(y′, u) = [u_n − u_{n+1} + u^H_{n+1}(y′, u)]^+ ≤ u^H_{n+1}(y′, u).

From (10), we then conclude that n ∈ Γ^−(y′, u^H(y′, u)), and thus Γ^−(y′, u) ⊆ Γ^−(y′, u^H(y′, u)). Similarly, for any n ∈ Γ^−(y′, u^H(y′, u)), we have y′_n = L_n and u^H_n(y′, u) − u^H_{n+1}(y′, u) ≤ 0. But from the definition of u^H(y′, u), we can easily see that this inequality is possible only when u_n − u_{n+1} ≤ 0, and as a result n ∈ Γ^−(y′, u). Thus (23) is proved.

From (23) and the fact Γ^−(y′, u) ⊆ Θ^−(y, y′, u) proved earlier, we have that if n ∈ Θ^+(y, y′, u), then n ∉ Γ^−(y′, u^H(y′, u)). Thus, from (22) we have

Σ_{n ∈ Θ^+(y, y′, u)} (A(u^H(y′, u) − u^L(y′, u^H(y′, u))))_n ≥ 0.   (24)

Now (17) follows from (18), (21), and (24).

Similarly, we can show that Σ_{i=1}^N [y_i(t) − y′_i(t)]^+ is also nonincreasing.

For the general case, i.e., when u(·) is not necessarily piecewise-constant and right-continuous, the results can be proved through a limiting procedure. □

So far we have obtained the transformation (13), which has allowed us to construct a modified deterministic control, and the corresponding trajectory, for any new initial condition; the new trajectory does not stray too far from the original one. We shall now show that this is also the case for our stochastic problem when we apply the construction procedure sequentially between the successive jumps of the Markov chain k(·).

Proof of Main Lemma. For given u(·) ∈ A(x, k) and the new initial condition x′ ∈ S, we can construct the continuous process x′(t) satisfying

dx′(t)/dt = Au′(x′(t), u(t)), x′(0) = x′,

by applying the existence statement of Lemma 2 between successive jumps of the process k(·). It follows from the above construction that the properties of controls and trajectories specified in Lemmas 1 and 2 hold also for the processes x(·), u(·), x′(·) and u′(·). This proves statements (a) and (c) of the Main Lemma. In order to prove (b), we know from (1) that

d/dt Σ_{i=n}^N (x_i(t) − x′_i(t)) = u_n(t) − u′_n(t).   (25)

Integrating from 0 to t and using the results of Lemmas 1 and 2, we have

∫_0^t |u_n(s) − u′_n(s)| ds = ∫_0^t (u_n(s) − u′_n(s)) ds = Σ_{i=n}^N (x_i(t) − x′_i(t)) − Σ_{i=n}^N (x_i − x′_i)
 ≤ Σ_{i=1}^N [x_i(t) − x′_i(t)]^+ + Σ_{i=1}^N [x_i − x′_i]^−
 ≤ Σ_{i=1}^N [x_i − x′_i]^+ + Σ_{i=1}^N [x_i − x′_i]^− = Σ_{i=1}^N |x_i − x′_i|.

This proves statement (b), and completes the proof of the Main Lemma. □

4 Concluding Remarks

In this paper we have developed a theoretical framework for production planning in an N-machine flowshop with finite buffer capacities. We use directional derivatives to describe the dynamic programming equation of the problem and the associated boundary condition. We prove the important property that the value function is locally Lipschitz continuous and that it solves the dynamic programming equation. The optimal feedback control is given in terms of the value function of the problem. The theoretical results characterizing the optimal control policies can be used in developing numerical approaches for solving production planning problems in dynamic stochastic flowshops. In view of the verification part of Theorem 1, the problem is reduced to solving the DPEDD (6) to obtain the value function and to find the optimal controls. In this connection, the numerical schemes developed by Kushner and Dupuis [3] can be used to compute the solution to (6).

Many important problems remain open. One of them is the problem in which machine breakdown rates depend on the production rates, as in Section 3.6 of Sethi and Zhang [8]. Another is the extension of our results to problems with the average cost criterion, since so far only single or parallel machine problems have been dealt with in this context; see Sethi et al. [6,7].
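To illustrate the kind of computation such a numerical approach entails, here is a hedged sketch of value iteration on a grid for a toy single-machine version of the model (a generic discrete-time Markov chain approximation in the spirit of, but not the detailed scheme of, [3]; every parameter below is invented):

```python
import math

# Toy single-machine instance: surplus x in [-B, H], demand d = 1,
# capacity 2 when the machine is up, 0 when it is down.
B, Hs, delta = 5.0, 2.0, 0.1
rho, d = 0.2, 1.0
q_up_down, q_down_up = 0.5, 2.0
h = lambda x: max(x, 0.0) + 3.0 * max(-x, 0.0)  # inventory/backlog cost

xs = [round(-B + i * delta, 10) for i in range(int(round((B + Hs) / delta)) + 1)]
nx = len(xs)
dt = delta                                   # unit drifts move one grid cell
caps = {0: 0.0, 1: 2.0}                      # machine state -> capacity
jump = {0: q_down_up * dt, 1: q_up_down * dt}  # switching probabilities
disc = math.exp(-rho * dt)

v = [[0.0] * nx for _ in (0, 1)]
for _ in range(2000):                        # value iteration to a fixed point
    new = [[0.0] * nx for _ in (0, 1)]
    for i in (0, 1):
        for m, x in enumerate(xs):
            best = float("inf")
            for u in (0.0, 1.0, 2.0):
                if u > caps[i]:
                    continue                 # capacity constraint
                mp = max(m + int(round(u - d)), 0)  # reflect at the -B truncation
                if mp >= nx:
                    continue                 # the state constraint x <= H
                cont = (1.0 - jump[i]) * v[i][mp] + jump[i] * v[1 - i][mp]
                best = min(best, h(x) * dt + disc * cont)
            new[i][m] = best
    v = new
```

The minimization over u at each grid point is the discrete analogue of the infimum in (6), and the state constraint x ≤ H is enforced by simply removing the controls that would push the chain off the grid.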

References

[1] M.H.A. Davis, Markov Models and Optimization, Chapman & Hall, London, 1993.
[2] N.-T. Fong and X.Y. Zhou, Hierarchical production policies in stochastic two-machine flowshops with finite buffers, J.O.T.A. 89 (1996).
[3] H.J. Kushner and P.G. Dupuis, Numerical Methods for Stochastic Control Problems in Continuous Time, Springer-Verlag, New York, NY, 1992.
[4] E.L. Presman, S.P. Sethi and Q. Zhang, Optimal feedback production planning in a stochastic N-machine flowshop, Automatica 31 (1995) 1325–1332.
[5] S. Rajgopal, Optimal Control of Stochastic Fluid-Flow Systems with Applications to Telecommunication and Manufacturing Systems, Ph.D. Thesis, University of North Carolina, 1994.
[6] S.P. Sethi, W. Suo, M.I. Taksar and Q. Zhang, Optimal production planning in a stochastic manufacturing system with long-run average cost, J.O.T.A. 92, No. 1 (1997).
[7] S.P. Sethi, W. Suo, M.I. Taksar and H. Yan, Optimal production planning in multiproduct stochastic manufacturing systems with long-run average cost, Preprint, Faculty of Management, University of Toronto.
[8] S.P. Sethi and Q. Zhang, Hierarchical Decision Making in Stochastic Manufacturing Systems, Birkhäuser Boston, Cambridge, MA, 1994.
[9] S.P. Sethi, Q. Zhang and X.Y. Zhou, Hierarchical controls in a stochastic two-machine flowshop with a finite internal buffer, Proceedings of the 31st IEEE Conference on Decision and Control, Tucson, Arizona, December 16–18, 1992, 2074–2079.
