
Technion - Computer Science Department - Technical Report CS0592 - 1989

TECHNION - Israel Institute of Technology Computer Science Department

OPTIMAL ROUTING TO TWO PARALLEL HETEROGENEOUS SERVERS WITH RESEQUENCING by

S. Ayoun and Z. Rosberg

Technical Report #592 November 1989


OPTIMAL ROUTING TO TWO PARALLEL HETEROGENEOUS SERVERS WITH RESEQUENCING Zvi Rosberg

Serge Ayoun

(November 1989)

Technion - IIT Computer Science Dept. Haifa 32000, Israel

Abstract

Customers arrive at a single service queue according to a Poisson process with rate $\lambda$, from which they are routed to two parallel heterogeneous exponential servers whose rates are $\mu_1 > \mu_2$. Customers are released from the system after service completion, according to their arrival order, a requirement that introduces additional resequencing delays. Customers that are delayed due to resequencing wait in a resequencing queue. We consider the optimal routing problem under the class of fixed-position routing policies, which route customers to the faster server from the head of the service queue, and to the slower server from position $J$. The cost function is taken as the long-run average holding cost of the customers in the system. We show that an optimal stationary policy exists and is of the following type: The faster server is kept active as long as the service queue is not empty. The decision whether or not to route a customer to the slower server is independent of the state of the resequencing queue. If the position $J$ is greater than $J_0 = \lceil \ln(1-a)/\ln a \rceil$, $a = \mu_1/(\mu_1+\mu_2)$, then customers are routed to the slower server if and only if the length of the service queue is at least $m^*$ (a threshold policy). We also show that the routing position $J_0$ is 'optimal' in the sense that every policy can be improved by dispatching a customer from position $J_0$ (if not empty), rather than from position $J$.

Keywords: Exponential Queues, Parallel Servers, Resequencing, Routing, Optimal Control, Markov Decision Processes.

1 Introduction

In this paper we consider a queueing system (Figure 1) which is composed of an infinite capacity queue, Q, attended by two exponential servers operating at rates $\mu_1 > \mu_2$. Customers arrive into the system according to a Poisson process with rate $\lambda$, and are assigned consecutive integers which serve as their identifiers. Throughout we assume the stability condition $\lambda < \mu := \mu_1 + \mu_2$. Arriving customers join at the end of queue Q and are routed to one of the servers according to some given routing policy (to be defined below). Customers in service cannot be re-routed.

In many applications of routing in communication networks, customers (messages) are released from the service system (the channel and receiver) according to the order of their arrivals. That is, customer $i$ is not released from the system unless he and all customers whose numbers are smaller than $i$ have finished their service. The waiting time of a customer that has completed his service, for the release of customers with lower sequence numbers, is referred to as resequencing delay. Note that resequencing delays are possible since the servers are operating at different rates. Moreover, a routing policy may assign customers from an arbitrary position of the queue. Customers which are being delayed due to resequencing wait in one of two resequencing queues: R1 for customers which have been served by server 1, and R2 for those which have been served by server 2.

The positions in queue Q from which customers are being routed to the servers (which are perceived as two alternative routes) clearly affect the overall resequencing delays (see [3]). The optimal routing problem with variable positions turned out to be extremely difficult. Therefore, we restrict our attention to fixed-position routing policies, which route customers to server 1 only from the head of queue Q, and to server 2 only from a fixed position $J$, $J \ge 2$. By position $J$ we mean the $J$-th customer among those in server 1 and in queue Q. Besides tractability, this restriction is also motivated by the result in [2]. It has been shown there that if routing positions are allowed to vary in time, then under light and heavy loads one can take the optimal policy within the class of fixed-position routings. Also, as will become apparent, it is not optimal to keep server 1 idle if queue Q is not empty, and therefore the requirement of $J \ge 2$ does not exclude the head of the line.

Let $X(t)$ be a tuple denoting the state of the system at time $t$ (to be defined below) and $|X(t)|$ be the number of customers in the system at that state. A routing policy $\pi$ is any rule that at every time $t > 0$ decides, on the basis of past states and of past decisions up to time $t$, which idle servers to activate. Policies may leave a server idle even when there is a customer in the corresponding position. With a holding cost accrued at a fixed rate of 1, the long-run average cost associated with the policy $\pi$ is then defined by

$V_\pi(x) = \limsup_{T \to \infty} \frac{1}{T} E_x^\pi \left[ \int_0^T |X(t)| \, dt \right], \quad x \in S,$ (1)

where $E_x^\pi$ denotes the expectation with respect to the probability measure induced by the policy $\pi$ on the process $X = \{X(t), t \ge 0\}$ starting in state $x$. A routing policy $\pi^*$ is optimal if it minimizes (1), i.e., if

$V_{\pi^*}(x) \le V_\pi(x), \quad x \in S,$

for any other policy $\pi$.

For the exponential system considered here, the optimization problem associated with (1) falls within the purview of continuous-time Markov decision processes which are uniformizable, i.e., which are equivalent to uniformized discrete-time Markov decision processes [6]. The reader is referred for details to [4], where the same problem without resequencing delays is studied.

To define the discrete-time decision process, consider that at any given instant, each server is working either on a real customer, if activated, or on a dummy customer otherwise. Dummy customers always return to queue Q upon completing service and incur no contribution to the cost. Arrivals and service completions at one of the servers, of a customer either real or dummy, determine the free transitions. These free transitions occur according to a Poisson process of rate $\lambda + \mu$. A (free) transition due to an arrival occurs with probability $\lambda/(\lambda+\mu)$, whereas a transition due to a service completion at server $i$ occurs with probability $\mu_i/(\lambda+\mu)$. If in state $x$ before a transition, the process will jump after this transition to a state which depends on the current state $x$ and on the action taken under the policy $\pi$ in use. The cost function for using policy $\pi$ which corresponds to (1) is then given by

$V_\pi(x) = \limsup_{M \to \infty} \frac{1}{M} E_x^\pi \left[ \sum_{m=0}^{M-1} |X(m)| \right], \quad x \in S,$ (2)

where $X(m)$ now denotes the state sampled at the $m$-th transition. We also need the total $\beta$-discounted cost ($0 < \beta < 1$) associated with the policy $\pi$, which is defined by

$V^\beta_\pi(x) = E_x^\pi \left[ \sum_{m=0}^{\infty} \beta^m |X(m)| \right], \quad x \in S.$ (3)
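The uniformized quantities above can be illustrated by a minimal Python sketch. The function names `free_transition_probs` and `discounted_cost` are hypothetical and not part of the report; the second function only evaluates the sum inside the expectation of (3) along one finite sampled path, so it is a truncation and not the expectation itself.

```python
def free_transition_probs(lam, mu1, mu2):
    """Probabilities that a free transition is an arrival, a completion
    at server 1, or a completion at server 2 (hypothetical helper)."""
    total = lam + mu1 + mu2
    return lam / total, mu1 / total, mu2 / total


def discounted_cost(sizes, beta):
    """Sum of beta**m * |X(m)| over a finite sampled path.

    `sizes` is the list |X(0)|, |X(1)|, ... observed at the transitions.
    """
    return sum(beta ** m * size for m, size in enumerate(sizes))


if __name__ == "__main__":
    print(free_transition_probs(1.0, 2.0, 1.0))
    print(discounted_cost([1, 2, 3], 0.5))
```

Averaging `discounted_cost` over many simulated paths would give a Monte Carlo estimate of the truncated version of (3), under whatever policy generated the paths.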

The complex structure of the state space of $X$ (see Section 2) results in a complex class of stationary policies. A simpler sub-class consists of the policies whose decisions are functions of the length of queue Q only. This sub-class will be referred to as the resequencing-invariant class. A still simpler sub-class consists of the threshold policies. A policy $t_m$ is a threshold policy with level $m \ge J$ if: (i) The first customer in queue Q is routed to server 1 whenever that server becomes free; (ii) The customer from position $J$ is routed to server 2 when and only when that server is free and the number of customers in server 1 and queue Q is at least $m$.
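As an illustration of the definition of $t_m$, the following Python sketch returns the two routing decisions for a given occupancy. The function name and argument names are hypothetical; `n_total` is assumed to count the customers in server 1 and in queue Q together, as in condition (ii).

```python
def threshold_decision(n_total, server1_busy, server2_busy, J, m):
    """Routing decisions of a threshold policy t_m (illustrative sketch).

    Returns a pair (route_to_1, route_to_2) of booleans.
    """
    if m < J:
        raise ValueError("the threshold level must satisfy m >= J")
    # (i) the head of queue Q goes to server 1 whenever server 1 is free
    route_to_1 = (not server1_busy) and n_total >= 1
    # (ii) position J goes to server 2 iff server 2 is free and n_total >= m
    route_to_2 = (not server2_busy) and n_total >= m
    return route_to_1, route_to_2


if __name__ == "__main__":
    print(threshold_decision(5, True, False, J=3, m=4))
```

Since $m \ge J$, position $J$ is always occupied whenever condition (ii) holds, so no separate occupancy check is needed in this sketch.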

One result of this study is that the optimal policy can be taken within the resequencing-invariant class. Another result is that for a certain range of positions $J$, the optimal policy can be taken within the threshold class. We also show that there is a preferable routing position $J_0$.

For the routing problem without resequencing delays, the routing position $J$ is irrelevant since service requirements are identically distributed. This problem was first studied in [7], where it was conjectured that the optimal policy would be of threshold type. In [1], a version of the problem with $N$ servers was considered under the assumptions that the system has an initial load of $n$ customers and no new customers enter the system, i.e., $\lambda = 0$. A simple policy which minimizes the expected flow time has been determined. This optimal policy has the following simple form [1]: For $1 < j \le N$, set

$R_j := \frac{\mu_1 + \dots + \mu_{j-1}}{\mu_j} - (j - 1)$ (4)


and define $R_1 = 0$. If there are $n$ customers that remain unprocessed and server $j$ is the fastest server available (i.e., with the largest $\mu_j$), then the idle server $j$ is activated, and a customer dispatched to it, if and only if $n > R_j$.

The conjecture from [7] on the threshold form of the optimal policy was settled in the affirmative in [4] for $N = 2$. Using policy iteration, it has been shown that the optimal policy is of threshold type with threshold level $R(\lambda)$ (which depends on $\lambda$). It was also conjectured there that as $\lambda \downarrow 0$, $R(\lambda)$ increases and converges to $R_2$ given by (4). In [13], simple stochastic coupling arguments were used to prove the optimality of the threshold policy for $N = 2$. Motivated by the conjecture made in [4], it has been shown in [10] (for a general number of parallel servers) and in [8] (for two servers) that the threshold policy above, which is optimal for $\lambda = 0$, is also optimal for small enough values of the arrival rate $\lambda$.

In light of the results above, one is naturally led to explore the idea that when resequencing delays are introduced, the optimal policy would also be of threshold type. We settle this question in the affirmative only for $J > J_0$.

The issue of resequencing delays in this context was first introduced in [3], where queueing statistics were evaluated under the class of fixed-position threshold policies. It was further shown there that for a given threshold level $m$, there is an optimal position $J^*$ from which one should route customers to server 2. This position is given by

$J^* = \begin{cases} m, & \text{if } m < J_0; \\ J_0, & \text{if } m \ge J_0, \end{cases}$

where

$a = \frac{\mu_1}{\mu_1 + \mu_2}$ and $J_0 = \left\lceil \frac{\ln(1-a)}{\ln a} \right\rceil.$ (5)

In words: when a customer has to be routed to server 2 according to the threshold policy $t_m$, the best fixed position is the one nearest to $J_0$. This property of $J_0$ will be referred to as its 'optimality property'. Reviewing the optimality property of $J_0$ for a threshold policy, and considering the fact that threshold policies may not necessarily be optimal, we are intrigued by another question: whether $J_0$ has the optimality property for a more general class of policies. We will show that this is indeed the case.
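The position $J_0$ of (5) and the resulting $J^*$ are easy to evaluate numerically. The Python sketch below assumes $a = \mu_1/(\mu_1+\mu_2)$ with $\mu_1 > \mu_2$; the function names are hypothetical, and rate pairs for which $\ln(1-a)/\ln a$ is exactly an integer are not treated specially.

```python
import math


def preferred_position(mu1, mu2):
    """J_0 = ceil(ln(1 - a) / ln(a)) with a = mu1 / (mu1 + mu2)."""
    if not mu1 > mu2 > 0:
        raise ValueError("expected mu1 > mu2 > 0")
    a = mu1 / (mu1 + mu2)
    return math.ceil(math.log(1.0 - a) / math.log(a))


def best_position(m, mu1, mu2):
    """J* for a threshold level m: m if m < J_0, and J_0 otherwise."""
    j0 = preferred_position(mu1, mu2)
    return m if m < j0 else j0


if __name__ == "__main__":
    print(preferred_position(3.0, 1.0), best_position(7, 3.0, 1.0))
```

Because $\mu_1 > \mu_2$ gives $a > 1/2$, the ratio inside the ceiling exceeds 1, so the sketch always returns $J_0 \ge 2$.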


Independently, an attempt based on value iteration has been made in [12] to prove that the threshold policy is optimal for the case $J \ge 2$. The proofs there, however, raise in our mind some unsettled questions. The most severe one is the validity of the inequalities [12, Eqs. (6.49), p. 120], for $K = 1$. These inequalities are crucial for the validity of Lemma 5.7.1 there.

The paper is organized as follows. In Section 2, we define the state space and the transitions under fixed-position routings. Section 3 is sub-divided into two parts. In Sub-section 3.1, we show that the faster server should be kept active as long as the service queue is not empty. In Sub-section 3.2, which is further sub-divided, we consider the optimal control of the slower server. In Sub-section 3.2.1, we show that the optimal control is independent of the state of the resequencing queues. In Sub-section 3.2.2, we show the 'optimality property' of position $J_0$, and in Sub-section 3.2.3 we show that for $J > J_0$, the optimal policy is of threshold type.

2 The state process - definitions and basic results

In this section we define the states and the transitions of the Markov decision process that describes our routing problem, and examine its state evolution.

2.1 States and transitions

We start with the state definition. After every transition $t$, $t = 0, 1, \dots$, in the discrete-time decision process, let $n(t)$ denote the number of customers in queue Q, and $e_i(t)$, $i = 1, 2$, denote the state of server $i$ (with the understanding that $e_i(t) = 1$ if server $i$ is busy, and $e_i(t) = 0$ otherwise). To describe the resequencing queues R1 and R2 we need the following notion. We say that customer $i$ in a resequencing queue is being delayed by customer $k_0$ if:

(i) Customer $k_0$ did not finish service.
(ii) $k_0 < i$.
(iii) $k_0$ is the maximal $k$ that satisfies (i) and (ii).

Thus, customer $i$ is released immediately after the service completion of customer $k_0$. Let $l(t)$ be the number of customers in queue R1 (after the $t$-th transition) that are being delayed by the customer which is being served by server 2. Here, $l(t) = 0$ if $e_2(t) = 0$.


Also (see Figure 1), denote by $i_1(t) < i_2(t)$ ...

..., if $k > 0$, or $k = 0$ and $e_2 = 0$;
$(l+1, (0, \dots, 0))$, if $k = 0$ and $e_2 = 1$,

$(l, (l_1, \dots, l_{k-1}, l_k + 1, l_{k+1}, \dots, l_{J-1}))$, if $k > 0$;
$(0, (0, \dots, 0))$, if $k = 0$.


The transformation $S^1_{e_2}$ defines the state that queues R1 and R2 would jump to from state $x$, when server 1 would complete service of a real customer. Observe that by definition, if $k = 0$ and $e_2 = 1$, then the customer that is being served by server 2 is the 'oldest' in the system. Otherwise, the customer that is being served by server 1 is the 'oldest'. Thus, if $k > 0$, or $k = 0$ and $e_2 = 0$, when server 1 would complete service of a real customer, this customer and those in queue R2 which are being delayed by him would leave the system. In this case we necessarily have $l = 0$. If $k = 0$ and $e_2 = 1$, we necessarily have $R = (l, (0, \dots, 0))$, and the customer that would finish service in server 1 would join queue R1. (These observations are proven in the next sub-section.)

The transformation $S^2$ defines the state that queues R1 and R2 would jump to from state $x$, when server 2 would complete service of a real customer. Recall that for $k = 0$ and $e_2 = 1$ we necessarily have $R = (l, (0, \dots, 0))$. Therefore, when the customer that is being served by server 2 would finish service, he and the customers in queue R1 would leave the system. If $k > 0$ and $e_2 = 1$, then the customer that would finish service in server 2 would join queue R2 and would be delayed by customer $i_k$.

Now, the free transitions of process $X$ from state $x \in S$ (when no routings are made) are as follows.

if $e_1 = 0$;

if $e_2 = 0$;

where $z^+ = \max\{0, z\}$. The probabilities that a free transition $A(x)$, $D_1(x)$ or $D_2(x)$ occurs are $\lambda/(\lambda+\mu)$, $\mu_1/(\lambda+\mu)$ and $\mu_2/(\lambda+\mu)$, respectively.

Here, it is convenient to identify a stationary policy $\pi$ with a function $\pi : S \to \{P_h, P_1, P_2, P_b\}$ as follows. Assume that a free transition, either an arrival or a service completion, occurs that would make the state jump to $x \in S$ if no action were taken. The policy $\pi$ uses at state $x$ an operator $P_a$, $a \in \{h, 1, 2, b\}$, that makes the state jump instantaneously from $x$ to $P_a(x)$, where

$P_h(x) = x$;
$P_1(n, 0, e_2, R, k) = (n-1, 1, e_2, R, k)$, $n \ge 1$;
$P_2(n, e_1, 0, R, 0) = (n-1, e_1, 1, R, J-1)$, $n > 1$;
$P_b(n, 0, 0, R, 0) = (n-2, 1, 1, R, J-1)$, $n > 2$.

The operator $P_h$ does not route any customers, $P_1$ routes the customer from the head of the queue to server 1, $P_2$ routes the customer from position $J$ to server 2, and $P_b$ does $P_1$ and $P_2$. (Notice that from the way we define the position $J$, the order in $P_b$ is irrelevant.)
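The four routing operators can be sketched in Python on a state tuple $(n, e_1, e_2, R, k)$. The `State` type and the function names are hypothetical, $R$ is treated as an opaque value, and the guards on $n$ simply follow the conditions as printed above (which are themselves an assumption about the source); the second coordinate of $P_2$ is taken to be the unchanged $e_1$.

```python
from collections import namedtuple

# Hypothetical container for the state (n, e1, e2, R, k).
State = namedtuple("State", "n e1 e2 R k")


def p_h(x):
    """P_h: route nobody."""
    return x


def p_1(x):
    """P_1: route the head of queue Q to an idle server 1."""
    if x.e1 != 0 or x.n < 1:
        raise ValueError("P_1 is not applicable in this state")
    return x._replace(n=x.n - 1, e1=1)


def p_2(x, J):
    """P_2: route the customer in position J to an idle server 2."""
    if x.e2 != 0 or x.k != 0 or x.n <= 1:
        raise ValueError("P_2 is not applicable in this state")
    return x._replace(n=x.n - 1, e2=1, k=J - 1)


def p_b(x, J):
    """P_b: apply both P_1 and P_2."""
    if x.e1 != 0 or x.e2 != 0 or x.k != 0 or x.n <= 2:
        raise ValueError("P_b is not applicable in this state")
    return x._replace(n=x.n - 2, e1=1, e2=1, k=J - 1)


if __name__ == "__main__":
    x = State(n=5, e1=0, e2=0, R=(0, (0, 0)), k=0)
    print(p_b(x, J=3))
```

In this sketch, applying `p_1` and `p_2` in either order gives the same state as `p_b`, which mirrors the remark that the order in $P_b$ is irrelevant.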

2.2 Basic results

Since the cost function is linear in the state variable and the total number of customers in the system changes by at most one at every transition, it is well known that an optimal policy exists for the $\beta$-discounted problem (associated with (3)), and that it can be taken in the class of Markov stationary policies [11]. One of the conclusions of this study is that the exact same result also holds for the long-run average cost criterion (2). Furthermore, for every stationary policy $\pi$, the limit in (2) exists and is independent of the initial state $x$. Without loss of generality we may assume that $\lambda + \mu = 1$. Under any stationary policy $\pi$, the forward equations of $V^\beta_\pi(x)$ are

$V^\beta_\pi(x) = |x| + \beta \left[ \lambda V^\beta_\pi(\pi(A(x))) + \mu_1 V^\beta_\pi(\pi(D_1(x))) + \mu_2 V^\beta_\pi(\pi(D_2(x))) \right],$

where $\pi(y) \in \{P_h(y), P_1(y), P_2(y), P_b(y)\}$.

In the following lemmas we present some basic properties of the state evolution. The first lemma resolves the order among the customers at any instant. Denote (see Figure 1):

$r^1_s(t)$ - The customer (i.e., its sequence number) in the $s$-th position in queue R1 at the $t$-th transition (time $t$).
$r^2_s(t)$ - The customer in the $s$-th position in queue R2 at time $t$.
$q_s(t)$ - The customer in the $s$-th position in queue Q at time $t$.
$n_1(t)$ - The number of customers in queue R1 at time $t$.
$n_2(t)$ - The number of customers in queue R2 at time $t$.
$1(t)$ - The customer which is being served by server 1 at time $t$, or 0 if the server is idle.
$2(t)$ - The customer which is being served by server 2 at time $t$, or 0 if the server is idle.

Lemma 2.1 At every time $t$ and for every occupied positions $s$ and $p$ in the corresponding queues:

(a) $q_p(t) < q_s(t)$, for $p < s$;
(b) $r^1_p(t) < r^1_s(t)$, for $p < s$;
(c) $r^2_p(t) < r^2_s(t)$, for $p < s$;
(d) $q_s(t') < q_s(t)$, for $t' < t$;
(e) $r^1_s(t) < 1(t) < q_1(t)$;
(f) $r^2_s(t) < 2(t) < q_J(t)$;
(g) $2(t) < r^1_s(t)$;
(h) There exists a $p$, $1 \le p \le J-2$, such that $1(t) < r^2_s(t)$ or $q_p(t) < r^2_s(t)$.

Proof: Properties (a)-(f) are direct consequences of the facts that customers join at the end of the queues and are being dispatched from fixed positions. Property (g): Customer $r^1_s(t)$ is being delayed by a lower customer. From properties (a), (b) and (e), it could only be customer $2(t)$. Thus, $2(t) < r^1_s(t)$. Property (h): Similarly for customer $r^2_s(t)$. From properties (f) and (a) it could only be one of the customers in $\{1(t), q_1(t), \dots, q_{J-2}(t)\}$. □

In the next lemma we show that the two resequencing queues cannot be non-empty at the same time.

Lemma 2.2 At every time $t$, at least one of the queues R1 or R2 is empty.

Remark 2.1: If $1(t) = 0$ then $I_1(t) = \emptyset$. Otherwise, $I_J(t) = \emptyset$. That is, there are at most $J-1$ non-empty sets from $\{I_1(t), I_2(t), \dots, I_J(t)\}$.

3 Optimal routing

In this section we consider the $\beta$-discounted and the average-cost Markov decision processes. The optimal control is split into two parts: routing to the faster server and routing to the slower server. In Sub-section 3.1 we show by probabilistic arguments that the faster server should be utilized as long as queue Q is not empty. In Sub-section 3.2, which is further sub-divided, we consider the optimal control of the slower server. In Sub-section 3.2.1, we show that the optimal control is independent of the state of the resequencing queues. In Sub-section 3.2.2, we show the 'optimality property' of position $J_0$, and in Sub-section 3.2.3 we show that for $J > J_0$, the optimal policy is of threshold type.

3.1 Routing to the faster server

In this sub-section we use arguments similar to those presented in [13] in order to show that server 1 is kept active if queue Q is not empty. To fix the notation, all the proofs in this section are based on pathwise comparison arguments between an original state process $X$ under a given policy $\pi$, and another state process $\tilde{X}$ under a policy $\tilde{\pi}$ derived from $\pi$. The latter is referred to as the tilde system, and we use a tilde to denote all relevant quantities in the tilde system.

Lemma 3.1 For every $0 < \beta < 1$, the $\beta$-optimal policy has the property that whenever it activates a server, it activates the fastest available one.

Proof: Let $\pi$ be any given policy and let $X(0) = x$ be an initial state at which $\pi$ activates server 2 while leaving server 1 idle. By definition, server 2 is activated by the $J$-th customer from queue Q. We will show that $\pi$ can be strictly improved.

To simplify notation we may assume without loss of generality that the customers in queue Q have consecutive numbers starting from 1. (This is possible since only the order among them determines their departure times from the system. Also, from the state definition of the resequencing queues, this assumption does not change the system state.)


Define a policy $\tilde{\pi}$ and a corresponding process $\tilde{X}$ as follows. The initial state is $\tilde{X}(0) = X(0)$, and at time 0, $\tilde{\pi}$ takes the same action as $\pi$, except that it activates server 1 (with customer number 1) instead of server 2. From then on, the realizations of $X$ and $\tilde{X}$ are coupled. This is done by feeding both systems with the same arrival process and assuming that the first service time at server 2 in $X$ equals $T_2 = \frac{\mu_1}{\mu_2} \tilde{T}_1$. (Here, $T_j$ is the service time of a customer at server $j$.) Observe that this coupling is made possible by the fact that $\tilde{T}_1$ is exponentially distributed with parameter $\mu_1$ and therefore $T_2$ is exponentially distributed with parameter $\mu_2$.
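The scaling step of this coupling can be checked numerically. The Python sketch below draws $\tilde{T}_1$ from an exponential distribution with rate $\mu_1$ and sets $T_2 = (\mu_1/\mu_2)\tilde{T}_1$; the function name is hypothetical, and the sketch only illustrates the time-scaling property of the exponential distribution, not the full pathwise construction.

```python
import random


def coupled_service_times(mu1, mu2, rng):
    """Return (t1_tilde, t2) with t2 = (mu1 / mu2) * t1_tilde.

    t1_tilde is exponential with rate mu1, so t2 is exponential
    with rate mu2 (scaling property of the exponential law).
    """
    t1_tilde = rng.expovariate(mu1)
    t2 = (mu1 / mu2) * t1_tilde
    return t1_tilde, t2


if __name__ == "__main__":
    rng = random.Random(0)
    pairs = [coupled_service_times(3.0, 1.0, rng) for _ in range(100000)]
    mean_t2 = sum(t2 for _, t2 in pairs) / len(pairs)
    print("empirical mean of T2:", mean_t2, "target:", 1.0 / 1.0)
```

Since $\mu_1 > \mu_2$, every coupled pair satisfies $T_2 \ge \tilde{T}_1$, which is what allows the tilde system to finish the first customer earlier on every sample path.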

After time 0, policy $\tilde{\pi}$ mimics the actions of policy $\pi$ (activates the same servers by customers from the appropriate positions) with one exception: (i) Let $\pi$ activates server 1. If $T$ ... $T\}$ implies the condition $\{T_2 > \frac{\mu_1}{\mu_2} T\}$, and therefore the residual service time of customer $J$ in $X$ from time $\frac{\mu_1}{\mu_2} T$ ($> T$) is exponentially distributed with parameter $\mu_2$.

Hence, we can couple the residual service time of customer 1 in $\tilde{X}$ from time $T$ with his service time in $X$ which starts at time $T$. Furthermore, we can also couple the residual service time of $J$ in $X$ from time $\frac{\mu_1}{\mu_2} T$ ($> T$) with $\tilde{T}_2$, the service time of $J$ in $\tilde{X}$ from time $T$. The latter implies that customer $J$ leaves $X$ at time $\frac{\mu_1}{\mu_2} T + \tilde{T}_2$, while he leaves $\tilde{X}$ at time $T + \tilde{T}_2$. After time $T$, $\tilde{\pi}$ continues to mimic $\pi$'s actions. From the coupling above and the definition of $T$, it


is clear that this is feasible. Hence, for all realizations in $X$ where (i) occurs we have

$|\tilde{X}(t)| = \begin{cases} |X(t)| - 1, & \text{for } \dots \\ |X(t)|, & \text{otherwise.} \end{cases}$

For all other realizations in $\{T > \tilde{T}_1\}$, $\tilde{\pi}$ mimics $\pi$'s actions and we obtain

$|\tilde{X}(t)| = |X(t)|$, for $0$ ... $J_0$, the optimal policy is of threshold type.

3.2.1 The resequencing-invariant property

The next lemma is essential for the proof that state $R$ does not play any role in the optimal routing decision. Observe that from Lemma 2.2, Remark 2.1 and Lemma 3.2, the feasible values of $R$ are of the form $(0, (l_1, \dots, l_{J-1}))$ or $(l, (0, \dots, 0))$, where $l_1, \dots, l_{J-1}$ correspond to the customers in server 1 and in the first $(J-2)$ positions of queue Q. For $R = (0, (0, \dots, 0))$ we fix the notation $[0]$.

Lemma 3.3 There is a function $h_\beta(R)$ such that for every routing policy $\pi$ whose decisions are independent of $R$,

$V^\beta_\pi(n, e_1, e_2, R, k) = V^\beta_\pi(n, e_1, e_2, [0], k) + h_\beta(R).$ (9)

Proof: Let $x_0 = (n, e_1, e_2, R, k)$ and $\tilde{x}_0 = (n, e_1, e_2, [0], k)$ be two initial states, and $X$ and $\tilde{X}$ the processes that are governed by policy $\pi$ and start at $x_0$ and $\tilde{x}_0$, respectively. Since $\pi$ is independent of $R(t)$, we may couple the arrivals and service times in both systems. This

is made possible by the same evolutions of $(n(t), e_1(t), e_2(t))$ and $(\tilde{n}(t), \tilde{e}_1(t), \tilde{e}_2(t))$. (Here we use the tilde notation as in Section 2.) There are two cases of $R$ that have to be considered.

Case (i): Assume that $R = (0, (l_1, \dots, l_{J-1}))$. For every $1 \le j \le J-1$ with $l_j \ne 0$, let $T_j$ ($\tilde{T}_j$) be the instant that the customer present at time 0 in position $j$ leaves the system. For $l_j = 0$ or $j = 0$ define $T_j = \tilde{T}_j = 0$. By the coupling, $T_j$ and $\tilde{T}_j$ are identical. Since $\pi$ routes from position $J$, the customers that are present at time 0 in the first $(J-1)$ positions will be routed to server 1. Thus, $T_j$ is distributed as the sum of $j$ independent geometric r.v.'s with parameter $\mu_1$. By the definition of the resequencing delay, we therefore have

$|X(t)| = \begin{cases} |\tilde{X}(t)| + \sum_{i=j}^{J-1} l_i, & \text{for } T_{j-1} \le t < T_j, \ 1 \le j \le J-1; \\ |\tilde{X}(t)|, & \text{for } t \ge T_{J-1}. \end{cases}$

For this case the lemma follows by defining

$h_\beta(R) = \sum_{i=1}^{J-1} l_i E[1 + \beta + \dots + \beta^{T_i - 1}].$ (10)

Case (ii): Assume that $R = (l, (0, \dots, 0))$. If $l = 0$ then the lemma is trivial. For $l > 0$, let $T$ be the instant that the customer present at time 0 in server 2 completes his service. Clearly, $T$ is geometrically distributed with parameter $\mu_2$. We have

$|X(t)| = \begin{cases} |\tilde{X}(t)| + l, & \text{for } 0 \le t < T; \\ |\tilde{X}(t)|, & \text{for } t \ge T. \end{cases}$

For this case the lemma follows by defining

$h_\beta(R) = l E[1 + \beta + \dots + \beta^{T - 1}].$ (11)

From (10) and (11), the function

$h_\beta(R) = l E[1 + \beta + \dots + \beta^{T - 1}] + \sum_{i=1}^{J-1} l_i E[1 + \beta + \dots + \beta^{T_i - 1}]$ (12)

satisfies (9). Here the expectations are taken with respect to the geometric r.v.'s, which are clearly independent of $\pi$.

□

The function $h_\beta(R)$ represents the accrued discounted cost that is contributed by the customers present at time 0 in the resequencing queues. For later reference denote $h_\beta(k) := h_\beta((0, (0, \dots, 0, 1, 0, \dots, 0)))$, where the 1 corresponds to position $k$. Observe that from (10),

$h_\beta(k) = E[1 + \beta + \dots + \beta^{T_k - 1}].$ (13)
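The expectations appearing in $h_\beta(R)$ have a closed form that is convenient for numerical checks. The Python sketch below assumes that a geometric r.v. with parameter $p$ takes values in $\{1, 2, \dots\}$ (number of transitions up to and including the completion), so that $E[\beta^T] = p\beta/(1-(1-p)\beta)$ and, for a sum $T_i$ of $i$ independent copies, $E[1+\beta+\dots+\beta^{T_i-1}] = (1 - E[\beta^T]^i)/(1-\beta)$. All function names are hypothetical.

```python
def geometric_pgf(p, beta):
    """E[beta**T] for T geometric on {1, 2, ...} with success probability p."""
    return p * beta / (1.0 - (1.0 - p) * beta)


def expected_discounted_sum(p, count, beta):
    """E[1 + beta + ... + beta**(T - 1)], where T is the sum of `count`
    independent geometric(p) variables and 0 < beta < 1."""
    return (1.0 - geometric_pgf(p, beta) ** count) / (1.0 - beta)


def h_beta(l, ls, mu1, mu2, beta):
    """Sketch of h_beta(R) for R = (l, (l_1, ..., l_{J-1})).

    `ls` is the sequence (l_1, ..., l_{J-1}); mu1 and mu2 are the
    per-transition completion probabilities (normalization lam + mu = 1).
    """
    total = l * expected_discounted_sum(mu2, 1, beta)
    for i, l_i in enumerate(ls, start=1):
        total += l_i * expected_discounted_sum(mu1, i, beta)
    return total


if __name__ == "__main__":
    print(h_beta(2, [0, 0], mu1=0.3, mu2=0.5, beta=0.5))
```

With $R = [0]$ the sketch returns 0, in agreement with the role of $h_\beta$ as the extra discounted cost contributed by the customers initially held in the resequencing queues.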

By using Lemma 3.3 and the following value and policy iterations, we will show that the routing decisions of the $\beta$-optimal policy are independent of $R$. Let $F$ be the Banach space of all functions $f : S \to \mathbb{R}$ with the norm $\| \cdot \|$ defined by

$\| f \| = \sup_{x \in S} \left| \frac{f(x)}{\max\{|x|, 1\}} \right|.$

From (6) and Lemma 3.2 we may define for every stationary policy $\pi$ the dynamic programming operator $T_\pi : F \to F$, by

where $\pi(y) \in \{0, 1\}$ with the understanding that $\pi(y) = 1$ if $\pi$ routes a customer to server 2 at state $y$, and 0 otherwise. Also, define the optimal dynamic programming operator $T : F \to F$, by

Notice that if the decision at state $y$, $\pi(y)$, for which the $\min_\pi f(\pi(y))$ is attained, is consistently chosen for every $y$, then $Tf$ defines a stationary policy $\pi^*$ which satisfies

$(T_{\pi^*} f)(x) = (Tf)(x) = \min_\pi (T_\pi f)(x).$ (16)

The procedure by which a new value function is derived by using operator $T$ is known as value iteration, and the procedure by which a new stationary policy is derived by using $T$ is known as policy iteration.

Theorem 3.1 The routing decisions of the $\beta$-optimal policy are independent of the state of the resequencing queues, $R$.


Proof: First we show that if $\pi$'s decisions are independent of $R$, so are the decisions of the policy derived by the policy iteration $T V^\beta_\pi$. Then we show that the optimal policy preserves the same property. For every $f \in F$, define

$g_f(n, R) = f(n, 1, 0, R, 0) - f(n-1, 1, 1, R, J-1)$, for $n > J-1$. (17)

Let $\pi_0$ be a policy whose routing decisions are independent of $R$, and for every $m \ge 0$ define $\pi_{m+1}$ as the policy that is derived by the policy iteration $T V^\beta_{\pi_m}$. That is, $T_{\pi_{m+1}} V^\beta_{\pi_m} = T V^\beta_{\pi_m}$. From (15) and (17), $\pi_{m+1}(y)$ is either 0 or 1, depending on whether $g_{V^\beta_{\pi_m}}(n, R)$ is negative or non-negative, respectively. From Lemma 3.3 it follows that if $\pi_m$'s decisions are independent of $R$, then $g_{V^\beta_{\pi_m}}(n, R) = g_{V^\beta_{\pi_m}}(n, [0])$, which implies that $\pi_{m+1}$'s decisions are also independent of $R$.

Since a limit point of $\{\pi_m\}$ does not necessarily exist, we cannot deduce the theorem by the policy iteration procedure. However, we can extract it by the value iteration procedure as follows. Consider the sign of $g_{V^\beta}$, where $V^\beta = \inf_\pi V^\beta_\pi$ is the $\beta$-value function. Since $\pi_0$'s decisions are independent of $R$, it follows by the argument above that so are $\pi_m$'s decisions, and by Lemma 3.3, the sign of $g_{V^\beta_{\pi_m}}(n, R)$ is independent of $R$, $m \ge 0$. Since $\lim_{m \to \infty} V^\beta_{\pi_m}$ exists and equals $V^\beta$ (see, e.g., [4, Lemma 3]), the sign of $g_{V^\beta}(n, R)$ is also independent of $R$.

To conclude the proof, observe that the $\beta$-optimal policy $\pi^*$ is the solution to the optimality equations $V^\beta = T V^\beta$. Now, from (15), $\pi^*(y) = 1$ if and only if $g_{V^\beta}(n, R) = g_{V^\beta}(n, [0]) \ge 0$, and the solution is independent of $R$.

□

Theorem 3.1 would also apply to the optimal policy with respect to the average cost, if one could guarantee the following limits.

Then, since $g_{V^\beta}(n, R)$ is independent of $R$, the result is a straightforward consequence of the following optimality equations for the average cost problem:

By Lemma 3.7 and Remark 3.3 below, if $\lambda < \mu_1$ the limits above follow from [5, Theorem 3].

Hereafter, we may further restrict attention to policies whose routing decisions (to server 2) are functions of the length of queue Q only. Although this structure is the same as in the problem without resequencing delays, it does not imply that we have the same optimal policy. This is due to the different evolutions of the cost structures.

3.2.2 An optimal routing position

In Section 1, we described the optimality property of $J_0$ that has been derived in [3] for the class of fixed-position threshold policies $t_m$. In this sub-section, we extend this property to a more general class of fixed-position policies. From the results above, the $\beta$-optimal fixed-position routing policy can be taken in the sub-class of stationary policies that are functions of two parameters: (i) The set of lengths of queue Q at which a policy routes a customer to server 2 (if idle). (ii) The position $J$ from which customers are being dispatched. Note that since the position is fixed, the set in (i) is restricted by the position in (ii).

We say that a class of policies $\Pi$ is routing-invariant if the policies differ only by the positions from which customers are being dispatched. That is, for every $\pi \in \Pi$, the sets in (i) above are identical. One example is the class of threshold policies with level $m$ and routing positions $J$, $J \le m$. Let $J(\Pi)$ be the set of routing positions that correspond to class $\Pi$. We will show that the optimality property of $J_0$ holds for every routing-invariant class.

To proceed, we first characterize $J_0$ in terms of the expected delay of a customer in position $k$ under two alternative policies. One is a policy that routes customer $k$ to server 2, and customers $\{1, 2, \dots, k-1\}$ to server 1. The other policy routes customers $\{1, 2, \dots, k\}$ to server 1. Let $\{X_i\}$ be a sequence of independent geometric r.v.'s with parameter $\mu_1$, and $Y$ an independent geometric r.v. with parameter $\mu_2$. For $k \ge 1$, denote

$X_{(k)} = \sum_{i=1}^{k} X_i$ and $Z_k = \max\{Y, X_{(k-1)}\}$, where $X_{(0)} = 0$.

For every $0 < \beta \le 1$ define the function

$\gamma_\beta(k) := E[1 + \beta + \dots + \beta^{Z_k - 1}] - E[1 + \beta + \dots + \beta^{X_{(k)} - 1}].$

Note that for $\beta = 1$, $\gamma(k) := \gamma_1(k) = E[Z_k] - E[X_{(k)}]$.

The function "Yp(k) represents the difference in the accrued cost that is contributed by a customer present at time 0 in position k, under the two alternative routing policies above. ,we obtain by the forward-equations Recalling that a = --a-+ 1£1 1£2

"Yp(k) = aE[,8To hp(k -.1) - (1 - a)EfPTo+X(lI-1>(1 -:I- {3+, ... , +{3X,,-1 )],

(19)

where To is the time of the first service completion at one of the servers - either real or dummy.

($T_0$ is geometrically distributed with parameter $\mu_1 + \mu_2 - \mu_1\mu_2$.) From (19) it is clear that $\gamma_\beta(k)$ is a decreasing function of $k$. Furthermore, there is a $\beta_0 < 1$ such that $\lim_{k \to \infty} \gamma_\beta(k) < 0$ for every $\beta \ge \beta_0$. The latter property is an immediate consequence of the facts that $\lim_{k \to \infty} \left([Z_k - X_{(k)}] - [X_{(k-1)} - X_{(k)}]\right) = 0$ with probability one, $E[X_{(k-1)} - X_{(k)}] = -\frac{1}{\mu_1}$ and $\gamma_1(k) = E[Z_k] - E[X_{(k)}]$. Since $\gamma_\beta(1) > 0$, there exists an integer $J_0(\beta) \ge 2$ for which the function $\gamma_\beta(k)$ ($\beta > \beta_0$) becomes strictly negative for the

first time. As our primary interest is the average cost criterion, we consider $\beta$'s for which $J_0(\beta) = J_0(1)$. Since the $J_0(\beta)$'s are integers and $\gamma_\beta(k) \to \gamma(k)$ as $\beta \to 1$, it is clear that there exists a $\beta_1 < 1$ such that $J_0(\beta) = J_0(1)$ for every $\beta \ge \beta_1$.

Finally, we show that $J_0(1) = J_0$ (which is defined in (5)). From the memoryless property of the geometric distribution it is standard to show that

$$\gamma(k) = \frac{\alpha^{k-1}}{\mu_2} - \frac{1}{\mu_1},$$

from which it follows that $\gamma(k) < 0$ if and only if $\left(\frac{\mu_2}{\mu_1}\right)\left(1 + \frac{\mu_2}{\mu_1}\right)^{k-1} > 1$. The latter relation implies that $J_0(1) = J_0$.
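The first index at which $\gamma(k)$ turns negative can be checked numerically. The following is a minimal Python sketch, not taken from the report: it iterates recursion (19) at $\beta = 1$, assuming $\alpha = \mu_1/(\mu_1 + \mu_2)$, $E[X_k] = 1/\mu_1$ and $\gamma(1) = 1/\mu_2 - 1/\mu_1$. The function names and the sample rates are hypothetical.

```python
def gamma_sequence(mu1, mu2, k_max):
    """Return [gamma(1), ..., gamma(k_max)] from recursion (19) at beta = 1.

    Assumption (not verified against the report): alpha = mu1 / (mu1 + mu2).
    """
    alpha = mu1 / (mu1 + mu2)
    gammas = [1.0 / mu2 - 1.0 / mu1]  # gamma(1) = E[Y] - E[X_1]
    for _ in range(2, k_max + 1):
        # gamma(k) = alpha * gamma(k-1) - (1 - alpha) * E[X_k]
        gammas.append(alpha * gammas[-1] - (1.0 - alpha) / mu1)
    return gammas


def first_negative_index(mu1, mu2):
    """Smallest k with gamma(k) < 0; terminates because gamma(k) -> -1/mu1."""
    alpha = mu1 / (mu1 + mu2)
    k, g = 1, 1.0 / mu2 - 1.0 / mu1
    while g >= 0:
        k += 1
        g = alpha * g - (1.0 - alpha) / mu1
    return k


if __name__ == "__main__":
    # Hypothetical rates with mu1 > mu2, as in the model.
    print(gamma_sequence(0.5, 0.2, 4))
    print(first_negative_index(0.5, 0.2))
```

With these sample rates the sequence is positive for $k = 1, 2, 3$ and negative at $k = 4$, which illustrates how the sign change of $\gamma(k)$ singles out one routing position.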

Now we are ready to prove the optimality property of $J_0$. We need the following policy transformations, which are applied also to non fixed-position policies:

(i) For every routing policy $\pi$ and $k \ge 2$, define $T_k^+(\pi)$ as the non-stationary policy that differs from $\pi$ only by the following action at the first step. If $\pi$ routes a customer from position $k$, then $T_k^+(\pi)$ routes a customer from position $k+1$, if not empty. Otherwise (at the first step and further steps), it takes the same routing actions as $\pi$. That is, at step one, $T_k^+(\pi)$ routes customers to server 2 at the same lengths of queue $Q$ that $\pi$ routes, but possibly from a higher position. After step one, $T_k^+(\pi)$ routes customers to server 2 at the same lengths and from the same positions as $\pi$ routes.

(ii) For every routing policy $\pi$ and $k \ge 3$, define $T_k^-(\pi)$ as the policy that differs from $\pi$ only by the following action at the first step. If $\pi$ routes a customer from position $k$, then $T_k^-(\pi)$ routes a customer from position $k-1$.
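The first-step action of the two transformations can be written down directly. The sketch below is illustrative only: the function names and the `occupied` argument (the number of occupied positions at the first step) are hypothetical and do not appear in the report.

```python
def t_plus_position(k: int, occupied: int) -> int:
    """First-step routing position under T_k^+ when pi routes from position k.

    Routes from k + 1 if that position is occupied, otherwise from k.
    """
    if k < 2:
        raise ValueError("T_k^+ is defined for k >= 2")
    return k + 1 if occupied >= k + 1 else k


def t_minus_position(k: int) -> int:
    """First-step routing position under T_k^- when pi routes from position k."""
    if k < 3:
        raise ValueError("T_k^- is defined for k >= 3")
    return k - 1


if __name__ == "__main__":
    print(t_plus_position(3, 5))  # position 4 is occupied
    print(t_plus_position(3, 3))  # position 4 is empty, fall back to 3
    print(t_minus_position(3))
```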

In the following lemma we consider policies that may route from variable positions (with some restrictions), and show that if position $k$, $k \ne J_0$, is feasible under $\pi$, then $\pi$ can be improved by one of the transformations above.

Lemma 3.4 The following hold for every $\beta$, $\beta_1 < \beta < 1$:

(i) If $\pi$ routes customers to server 2 from positions larger than or equal to $k$, $k < J_0$, and $k$ is a feasible position, then $T_k^+(\pi)$ is at least as good as $\pi$.

(ii) If $\pi$ routes customers to server 2 from positions larger than or equal to $k-1$, $k > J_0$, and $k$ is a feasible position, then there is a $\beta_2$, $\beta_1 \le \beta_2 < 1$, such that $T_k^-(\pi)$ strictly improves $\pi$ for every $\beta_2 \le \beta < 1$.

Proof: The proof is based on a pathwise comparison between the state process $X$ under policy $\pi$ and the state process $\hat{X}$ under policy $T_k^+(\pi)$ (for part (i)) or $T_k^-(\pi)$ (for part (ii)). To compare realizations we couple the arrivals and service completions in both systems. (Note that a service completion may correspond to different customers in $X$ and $\hat{X}$.)

Part (i). Let $x_0 = (n, 1, 0, R, 0)$, $n \ge k$, be an initial state at which $\pi$ and $T_k^+(\pi)$ route from different positions. Since for all other initial states $V_\beta^{\pi}(x) = V_\beta^{T_k^+(\pi)}(x)$, we have to show that

$$V_\beta^{\pi}(n-1, 1, 1, R, k-1) - V_\beta^{\pi}(n-1, 1, 1, R, k) \ge 0.$$

This is due to the fact that after the first action, $X$ (respectively $\hat{X}$) instantaneously jumps to state $(n-1, 1, 1, R, k-1)$ (respectively, to $(n-1, 1, 1, R, k)$). From then on, both processes are governed by policy $\pi$. As in the proof of Lemma 3.1, we may assume without loss of generality that the customers in server 1 and in queue $Q$ are numbered by $1, 2, \ldots, n+1$. We will show that for every customer $i \ne k$, its departure times in both systems are the same, while the expected departure time of customer $k$ is smaller in $\hat{X}$.

Since $\pi$ routes from positions larger than or equal to $k$, it follows from the coupling that every customer $i < k$ (and those that at time 0 are being delayed by him) would leave both systems at the same time. Furthermore, the departure times of customer $k+1$ (and those that at time 0 are being delayed by him) would also be the same. This is plain from the fact that $(k+1)$ leaves the system at the first instant at which customers $\{1, 2, \ldots, k+1\}$ have been released. Since states $(n-1, 1, 1, R, k-1)$ and $(n-1, 1, 1, R, k)$ differ only by the locations of customers $k$ and $k+1$, it is apparent from the coupling that this instant is the same in both systems.

Every customer $i > k+1$ in both systems is routed at the same time and to the same server, and its completion time is also the same. Since he would leave the system at the first instant at which he and all preceding customers would have been released, it follows by induction that its departure time must be the same in both systems. Hence, it is left to show that the expected accrued cost due to the delay of customer $k$ is smaller in $\hat{X}$.
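The release rule used in this argument, that a customer leaves at the first instant at which he and all preceding customers have completed service, amounts to a running maximum over completion times. A minimal Python sketch, with a hypothetical function name and made-up completion times:

```python
from itertools import accumulate


def departure_times(completion_times):
    """Departure time of each customer under resequencing.

    completion_times[i] is the service completion time of the customer with
    arrival rank i; a customer is released only once every earlier arrival
    has also completed, i.e. at the running maximum of completion times.
    """
    return list(accumulate(completion_times, max))


if __name__ == "__main__":
    # Customer 0 completes late and delays customer 1 in the resequencing queue.
    print(departure_times([3, 1, 4, 2]))
```

Because each departure time depends only on the maximum over the customer's own and earlier completion times, two coupled systems that agree on those completion times agree on the departure time as well.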

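Let $\tau$ ($\hat{\tau}$) be the departure time of customer $k$ from system $X$ ($\hat{X}$). From the identities for the rest of the departure times,

$$V_\beta^{\pi}(n-1, 1, 1, R, k-1) - V_\beta^{\pi}(n-1, 1, 1, R, k) = (l_k + 1)\left\{E\left[1 + \beta + \cdots + \beta^{\tau - 1}\right] - E\left[1 + \beta + \cdots + \beta^{\hat{\tau} - 1}\right]\right\}. \qquad (20)$$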
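The sums inside the braces are discounted holding costs of a customer that stays in the system for a given number of steps. As a small illustration (a sketch with a hypothetical function name, assuming one unit of holding cost per step), the sum $1 + \beta + \cdots + \beta^{\tau - 1}$ has a closed form and is increasing in $\tau$, so an earlier departure never costs more:

```python
import math


def discounted_holding(tau: int, beta: float) -> float:
    """Return 1 + beta + ... + beta**(tau - 1) for an integer tau >= 0."""
    if tau < 0:
        raise ValueError("tau must be non-negative")
    if beta == 1.0:
        return float(tau)  # undiscounted case: the cost equals the sojourn time
    return (1.0 - beta ** tau) / (1.0 - beta)


if __name__ == "__main__":
    print(discounted_holding(3, 0.5))  # 1 + 0.5 + 0.25
    print(discounted_holding(4, 1.0))
```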

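Thus, we have to show that the expression within the braces is non-negative. To prove this, first note that customer $k$ in system $\hat{X}$ is routed to server 2 if and only if $(k+1)$ in $X$ is routed to server 2. Also note that since $\pi$ routes from positions $k$ or higher, customer $k$ in $\hat{X}$ would definitely be served by server 1 if this server would complete its first service before server 2 does. This event occurs with probability $\frac{\mu_1}{\mu_1 + \mu_2}$.

Let $T_0$ be as in (19) and $T_1$ be the number of steps after $T_0$ that it takes to route customer $(k+1)$ in $X$ to server 2 (and infinite, if he is routed to server 1). Denote by $\eta$ the conditional probability (conditioned on the state at time $T_0$) that $\{T_1 < \infty\}$. By using the forward equations from time 0 to $T_0$, it follows from the definitions of $Z_k$ and $X_{(k)}$ that

$$\begin{aligned}
E\left[\left(1 + \beta + \cdots + \beta^{\tau - 1}\right) - \left(1 + \beta + \cdots + \beta^{\hat{\tau} - 1}\right)\right]
&= \frac{\mu_1}{\mu_1 + \mu_2} E\left[\beta^{T_0}\left(\left(1 + \beta + \cdots + \beta^{Z_{k-1} - 1}\right) - \left(1 + \beta + \cdots + \beta^{X_{(k-1)} - 1}\right)\right)\right] \\
&\quad + \frac{\mu_2}{\mu_1 + \mu_2} E\left[\eta \beta^{T_0 + T_1}\left(\left(1 + \beta + \cdots + \beta^{X_{(k-1)} - 1}\right) - \left(1 + \beta + \cdots + \beta^{Z_k - 1}\right)\right)\right] \\
&\quad - \frac{\mu_2}{\mu_1 + \mu_2} E\left[(1 - \eta) \beta^{T_0 + X_{(k-1)}}\left(1 + \beta + \cdots + \beta^{X_k - 1}\right)\right].
\end{aligned}$$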


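Thus, from (18), (19) and the fact that $\beta^{T_1} \le 1$ with probability one, we have

$$\begin{aligned}
E\left[\left(1 + \beta + \cdots + \beta^{\tau - 1}\right) - \left(1 + \beta + \cdots + \beta^{\hat{\tau} - 1}\right)\right]
&= \gamma_\beta(k) + \frac{\mu_2}{\mu_1 + \mu_2} E\left[\eta \beta^{T_0 + X_{(k-1)}}\left(1 + \beta + \cdots + \beta^{X_k - 1}\right)\right] \\
&\quad - \frac{\mu_2}{\mu_1 + \mu_2} E\left[\eta \beta^{T_0 + T_1}\right] \gamma_\beta(k) - \frac{\mu_2}{\mu_1 + \mu_2} E\left[\eta \beta^{T_0 + T_1 + X_{(k-1)}}\left(1 + \beta + \cdots + \beta^{X_k - 1}\right)\right] \\
&\ge \left(1 - \frac{\mu_2}{\mu_1 + \mu_2} E\left[\eta \beta^{T_0 + T_1}\right]\right) \gamma_\beta(k) \\
&\ge 0.
\end{aligned}$$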

The last inequality follows from the monotonicity of $\gamma_\beta(k)$, the definition of $J_0$ and the fact that $k < J_0$. This completes the proof of part (i).

Part (ii). Let $x_0 = (n, 1, 0, R, 0)$, $n > k - 1$, be an initial state at which $\pi$ and $T_k^-(\pi)$ route from different positions. Since for all other initial states $V_\beta^{\pi}(x) = V_\beta^{T_k^-(\pi)}(x)$, we have to show that

$$V_\beta^{\pi}(n-1, 1, 1, R, k-2) - V_\beta^{\pi}(n-1, 1, 1, R, k-1) < 0.$$

As in part (i), the departure times of every customer $i \ne (k-1)$ in both systems are the same, and therefore it suffices to show that

$$E\left[1 + \beta + \cdots + \beta^{\hat{\tau} - 1}\right] - E\left[1 + \beta + \cdots + \beta^{\tau - 1}\right] < 0. \qquad (21)$$

Here, $\tau$ and $\hat{\tau}$ relate to the departure times of customer $(k-1)$. Similarly, alter the definitions of $T_1$ and $\eta$ in part (i) by relating them to customer $(k-1)$. Again, by the forward equations we have,

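$$\begin{aligned}
E\left[\left(1 + \beta + \cdots + \beta^{\hat{\tau} - 1}\right) - \left(1 + \beta + \cdots + \beta^{\tau - 1}\right)\right]
&= \frac{\mu_1}{\mu_1 + \mu_2} E\left[\beta^{T_0}\left(\left(1 + \beta + \cdots + \beta^{Z_{k-2} - 1}\right) - \left(1 + \beta + \cdots + \beta^{X_{(k-2)} - 1}\right)\right)\right] \\
&\quad + \frac{\mu_2}{\mu_1 + \mu_2} E\left[\eta \beta^{T_0 + T_1}\left(\left(1 + \beta + \cdots + \beta^{X_{(k-2)} - 1}\right) - \left(1 + \beta + \cdots + \beta^{Z_{k-1} - 1}\right)\right)\right] \\
&\quad - \frac{\mu_2}{\mu_1 + \mu_2} E\left[(1 - \eta) \beta^{T_0 + X_{(k-2)}}\left(1 + \beta + \cdots + \beta^{X_{k-1} - 1}\right)\right] \\
&= \left(1 - \frac{\mu_2}{\mu_1 + \mu_2} E\left[\eta \beta^{T_0 + T_1}\right]\right) \gamma_\beta(k-1) \\
&\quad + \frac{\mu_2}{\mu_1 + \mu_2} E\left[\eta \beta^{T_0 + X_{(k-2)}}\left(1 + \beta + \cdots + \beta^{X_{k-1} - 1}\right)\left(1 - \beta^{T_1}\right)\right].
\end{aligned} \qquad (22)$$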

To complete the proof we have to reduce the last positive term above. Observe that given $\{T_1 < \infty\}$, $T_1$ is definitely smaller than the first service time that would be given by server 1 after $T_0$. Therefore, $T_1$ is stochastically smaller than $X_1$. Hence, by Jensen's inequality the second summand in the right-hand side of (22) can be made arbitrarily close to zero for $\beta$ sufficiently close to one. Finally, by the definition of $J_0$ and the fact that $(k-1) \ge J_0$, it follows that $\gamma_\beta(k-1) < 0$. Thus, there is a $\beta_2$, $\beta_1 \le \beta_2 < 1$, such that the right-hand side of (22) is negative for every $\beta_2 < \beta < 1$. This completes the proof of part (ii). $\Box$

Remark 3.2: If $\gamma_\beta(J_0 - 1) > 0$, then in part (i) of Lemma 3.4, $T_k^+(\pi)$ strictly improves $\pi$.

From Lemma 3.4, $T_k^+(\pi)$ and $T_k^-(\pi)$ are improvement transformations of policies that route from positions other than $J_0$. Therefore, they could successively be used to obtain a limiting stationary policy. Let $\pi_0$ be a fixed-position routing policy that routes from position $J \ne J_0$. For every $l \ge 0$, recursively define the non-stationary policy $\pi_{l+1} = T_k^+(\pi_l)$ (alternatively $\pi_{l+1} = T_k^-(\pi_l)$). Notice that for every $l$, $\pi_l$ satisfies the conditions of Lemma 3.4 and therefore $\pi_{l+1}$ improves $\pi_l$.

Let $\pi_\infty$ be the limiting policy. Policy $\pi_\infty$ is stationary and routes customers at the same queue lengths (of queue $Q$) that $\pi_0$ does. However, if $\pi_0$ routes from position $J > J_0$, then $\pi_\infty$ routes from position $(J-1)$. If $\pi_0$ routes from position $J < J_0$, then $\pi_\infty$ routes either from position $(J+1)$ (if not empty) or from position $J$ (otherwise). I.e., for $J < J_0$, $\pi_\infty$ routes from position $\min\{n(t) + 1, J + 1\}$.

Furthermore, Lemma 3.4 and Remark 3.2 imply that for $J > J_0$, $\pi_\infty$ is strictly better than $\pi_0$, and for $J < J_0$, $\pi_\infty$ is at least as good as $\pi_0$. For $J < J_0 - 1$, and for $J = J_0 - 1$ with $\gamma_\beta(J_0 - 1) > 0$, $\pi_\infty$ is strictly better than $\pi_0$.

The following theorem extends the optimality property of $J_0$ to any routing-invariant class.

Theorem 3.2 The following hold with respect to the $\beta$-discounted cost, $\beta_2 < \beta < 1$, and to the average cost criteria.

(a) For every routing-invariant class $\Pi$ which results in positive recurrent Markov chains:

(a.1) If $J_0 \in J(\Pi)$, then the policy $\pi \in \Pi$ that routes from position $J_0$ is optimal within $\Pi$.

(a.2) If for every $J \in J(\Pi)$, $J > J_0$, then the policy $\pi \in \Pi$ that routes from position $J^* = \min\{J \mid J \in J(\Pi)\}$ is optimal within $\Pi$.

(a.3) If for every $J \in J(\Pi)$, $J < J_0$, then the policy $\pi \in \Pi$ that routes from position $J^* = \max\{J \mid J \in J(\Pi)\}$ is optimal within $\Pi$.

(b) For every fixed-position routing policy $\pi$ that routes from position $J$ and results in a positive recurrent Markov chain:

(b.1) If $J > J_0$, then $\pi$ is inferior to the policy that routes at the same lengths of queue $Q$, but from position $J_0$.

(b.2) If $J < J_0$, then $\pi$ is inferior to the non-fixed position policy that routes at the same lengths of queue $Q$, but from position $(J+1)$ if not empty, and from $J$ otherwise.
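Part (a) of the theorem can be read as a selection rule over the set $J(\Pi)$. The sketch below is a hypothetical helper, not part of the report, that returns the routing position singled out by (a.1)-(a.3); it returns `None` when $J(\Pi)$ contains positions on both sides of $J_0$ but not $J_0$ itself, a case these three parts do not address.

```python
def best_position(positions, j0):
    """Routing position that parts (a.1)-(a.3) single out within J(Pi).

    positions: iterable of the routing positions available in the class.
    j0: the position J_0.
    """
    positions = set(positions)
    if not positions:
        raise ValueError("the class must contain at least one routing position")
    if j0 in positions:
        return j0              # (a.1)
    if min(positions) > j0:
        return min(positions)  # (a.2): all positions lie above J_0
    if max(positions) < j0:
        return max(positions)  # (a.3): all positions lie below J_0
    return None                # mixed case, not covered by (a.1)-(a.3)


if __name__ == "__main__":
    print(best_position({2, 4, 6}, 4))
    print(best_position({5, 7}, 4))
    print(best_position({2, 3}, 4))
```

In each covered case the rule picks the available position closest to $J_0$.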

Proof: The proof for the $\beta$-discounted cost criterion is immediate from Lemma 3.4 and the discussion that follows. Indeed, within a routing-invariant class $\Pi$, one can successively improve a policy by gradually increasing (accordingly, decreasing) the routing position within $J(\Pi)$, until one hits $J_0$, $\max\{J \mid J \in J(\Pi)\}$ or $\min\{J \mid J \in J(\Pi)\}$. Parts (a.1), (a.2) and (a.3) follow respectively. Furthermore, (b.1) is an immediate consequence of (a.1), and (b.2) follows from part (i) of Lemma 3.4. The results for the average cost criterion follow from [5] by using the convergence which holds for problems with a linear cost structure and continuous state jumps as ours (see [5]). $\Box$

3.2.3 An optimal policy of threshold type

In this sub-section we show that if the routing position $J$ is greater than $J_0$, then an optimal policy with respect to the average cost criterion exists, and is of threshold type. Assume that $J > J_0$ and start with the $\beta$-discounted problem, $\beta_0 \le \beta < 1$. Recall that for such $J$'s,

$$\gamma_\beta(J-1) < 0, \qquad \beta_2 \le \beta < 1. \qquad (23)$$

The proof is based on policy iteration and develops along the same lines as the proof in [4], with some changes that are required by our different state space. Define a partial order "$\preceq$" on the states, as follows. Recall that a state $x$ is a tuple $x = (n, e_1, e_2, R, k)$. We say that $x \preceq y$, $x, y \in S$, if at least one of the following conditions holds:

(i) $x = y$ (component-wise).

(iv) $A(x) = y$.

(v) All components of $x$ and $y$ are equal except for one, which is smaller in $x$.

(vi) There is a $z \in S$ such that $x \preceq z$ and $z \preceq y$.

For every $f \in \mathcal{F}$ we also define the functions:

$$\Delta_f(n, k) = \begin{cases} f(n-2, 1, 1, [0], k) - f(n-3, 1, 1, [0], k), & n \ge 3,\ 0 \le k \le \min\{n-2, J-1\}; \\ f(0, 1, 1, [0], 1) - f(0, 0, 1, [0], 0), & n = 2,\ k = 0. \end{cases} \qquad (24)$$

$$\Delta^1_f(n) = \begin{cases} f(n-1, 1, 0, [0], 0) - f(n-2, 1, 0, [0], 0), & n \ge 2; \\ f(0, 1, 0, [0], 0) - f(0, 0, 0, [0], 0), & n = 1. \end{cases} \qquad (25)$$

In the following lemma we list some properties of $f \in \mathcal{F}$ that propagate to $T_{t_m} f$, $m \ge J$. This will be used to show that under every threshold policy $t_m$, $V_\beta^{t_m}$ also satisfies the same properties.

Lemma 3.5 If $f \in \mathcal{F}$ satisfies the following properties, so does $T_{t_m} f$, $m \ge J$.

(a) For every $x, y \in S$, if $x \preceq y$ then $f(x) \le f(y)$.

(b) For every $n > 2$, $\Delta_f(n, J-1) > \gamma_\beta(\min\{n-2, J-1\})$.

(c) For every $n > 2$, $\Delta^1_f(n) > \gamma_\beta(\min\{n-1, J-1\})$.

(d) For every $n > 2$, $\Delta_f(n, k) = \Delta_f(n, 0)$, $0 < k < \min\{n-2, J-1\}$.

(f) $f(n-1, 1, 1, [0], k-1) - f(n-1, 1, 1, [0], k) = \gamma_\beta(k)$, $1 < k < \min\{n, J-1\}$.

The proof of this lemma is standard but extremely tedious and we do not present the details here. The main lines are as follows. The function $T_{t_m} f$ is represented via Eq. (14) and the properties are verified one by one. The full verification is given in [2] and the reader may reproduce it based on the following properties, which are easily shown:

$$f(n+1, 1, 1, [0], k) > f(n, 1, 1, [0] + 1_k, J-1), \qquad k \le J-1,$$

$$f(n, 1, 1, [0], J-1) - f(n, 1, 0, [0], 0) \ge \gamma_\beta(J-1), \qquad n \le J-2.$$

Here, $1_k = (l, (l_1, \ldots, l_k, \ldots, l_{J-1})) = (0, (0, \ldots, 1, \ldots, 0))$.

From Lemma 3.5 one may show by successively using the operator $T_{t_m}$ that $V_\beta^{t_m}$ also satisfies properties (a)-(f) of the lemma. Indeed, it is easy to construct in a recursive manner a function $f_0$ that satisfies properties (a)-(f). From the lemma it follows that $T_{t_m}^{n+1} f_0 \stackrel{\text{def}}{=} T_{t_m}(T_{t_m}^{n} f_0)$, $n \ge 1$, also satisfies these properties. Now, since $\lim_{n \to \infty} T_{t_m}^{n} f_0 = V_\beta^{t_m}$, we obtain the following corollary.

Corollary 3.1: For every $m \ge J$, the $\beta$-discounted cost function under policy $t_m$ satisfies properties (a)-(f) of Lemma 3.5.

The next lemma is the basis of our final result and its proof is similar to that in [4, Lemma 4]. The assumption $J > J_0$ and the property in (23) are crucial for reproducing the proof. The lemma asserts that the new policy that is obtained from $t_{m_0}$ by the policy iteration procedure is also of threshold type.

Lemma 3.6 For every $m_0$, $2 \le J_0 < J \le m_0 < \infty$, there exists an $m_1$, $J < m_1 \le m_0 + 1$, such that $T_{t_{m_1}} V_\beta^{t_{m_0}} = T V_\beta^{t_{m_0}}$.

Proof: To prove the lemma we need to explore the properties of the function

$$g_{V_\beta^{t_{m_0}}}(n, R) = V_\beta^{t_{m_0}}(n, 1, 0, R, 0) - V_\beta^{t_{m_0}}(n-1, 1, 1, R, J-1), \qquad \text{for } n \ge J-1. \qquad (26)$$

From Lemma 3.3 it suffices to explore the function $g(n) \stackrel{\text{def}}{=} g_{V_\beta^{t_{m_0}}}(n, [0])$. This will be carried

out by using the forward equations in (6) and representing $g(n)$ in a recursive form. The forward equations depend on the value $n$ and we separately consider all possible cases.

Case (i): $1 < J - 1 < n < m_0 - 2$. (The policy $t_{m_0}$ does not route a customer at queue lengths $n+1$ and below.) From (6),

$$\begin{aligned}
g(n) &= \beta\lambda\left[V_\beta^{t_{m_0}}(n+1, 1, 0, [0], 0) - V_\beta^{t_{m_0}}(n, 1, 1, [0], J-1)\right] \\
&\quad + \beta\mu_1\left[V_\beta^{t_{m_0}}(n-1, 1, 0, [0], 0) - V_\beta^{t_{m_0}}(n-2, 1, 1, [0], J-2)\right] \\
&\quad + \beta\mu_2\left[V_\beta^{t_{m_0}}(n, 1, 0, [0], 0) - V_\beta^{t_{m_0}}(n-1, 1, 0, [0] + 1_{J-1}, 0)\right].
\end{aligned} \qquad (27)$$

First note that the expression in the first braces is $g(n+1)$. Next, add and subtract $V_\beta^{t_{m_0}}(n-2, 1, 1, [0], J-1)$ within the second braces. From the proof of part (ii) of Lemma 3.4 and the assumption $J > J_0$, we have

$$V_\beta^{t_{m_0}}(n-2, 1, 1, [0], J-1) - V_\beta^{t_{m_0}}(n-2, 1, 1, [0], J-2) > 0. \qquad (28)$$

Thus, by (27) we obtain

$$\begin{aligned}
g(n) &> \beta\lambda g(n+1) + \beta\mu_1 g(n-1) + \beta\mu_2\left[V_\beta^{t_{m_0}}(n, 1, 0, [0], 0) - V_\beta^{t_{m_0}}(n-1, 1, 0, [0] + 1_{J-1}, 0)\right] \\
&\ge \beta\lambda g(n+1) + \beta\mu_1 g(n-1) + \beta\mu_2\left[V_\beta^{t_{m_0}}(n, 1, 0, [0], 0) - V_\beta^{t_{m_0}}(n-1, 1, 1, [0], J-1)\right] \\
&= \beta\lambda g(n+1) + \beta\mu_1 g(n-1) + \beta\mu_2 g(n).
\end{aligned}$$

The last inequality follows from Corollary 3.1 and property (a) of Lemma 3.5. Since $\lambda + \mu_1 + \mu_2 = 1$, we obtain for this case,

$$(1 - \beta) g(n) - \beta\lambda\left(g(n+1) - g(n)\right) > \beta\mu_1\left(g(n-1) - g(n)\right).$$

The same inequality is obtained for 1 [...]

$$[...] \quad \beta\mu_1 g(n-1) + \beta\mu_2\left(\Delta^1_{V_\beta^{t_{m_0}}}(n+1) - \gamma_\beta(J-1)\right) \ge \beta\mu_1 g(n-1). \qquad (30)$$

The last inequality follows from property (c) of Lemma 3.5. The same inequality is obtained for the case $n = m_0 - 2 = 1$. Observe that the assumption $J > J_0 \ge 2$ and the requirement $m_0 \ge J$ imply that $m_0 \ge 3$.

Case (iii): $n = m_0 - 1 \ge 2$. (The policy $t_{m_0}$ routes a customer at queue length $n$ and above.) From (6),

$$\begin{aligned}
g(n) &= \beta\mu_1\left[V_\beta^{t_{m_0}}(n-1, 1, 0, [0], 0) - V_\beta^{t_{m_0}}(n-2, 1, 1, [0], J-2)\right] \\
&\quad + \beta\mu_2\left[V_\beta^{t_{m_0}}(n-1, 1, 1, [0], J-1) - V_\beta^{t_{m_0}}(n-1, 1, 0, [0] + 1_{J-1}, 0)\right].
\end{aligned}$$

Again, from (28), the expression in the first braces is greater than $g(n-1)$. Furthermore, from Corollary 3.1 and property (a) of Lemma 3.5,

$$g(n) \ge \beta\mu_1 g(n-1). \qquad (31)$$

Case (iv): $n \ge m_0 \ge 3$. (The policy $t_{m_0}$ routes a customer at queue length $n-1$ and above.) From (6),

$$\begin{aligned}
g(n) &= \beta\mu_1\left[V_\beta^{t_{m_0}}(n-2, 1, 1, [0], J-1) - V_\beta^{t_{m_0}}(n-2, 1, 1, [0], J-2)\right] \\
&\quad + \beta\mu_2\left[V_\beta^{t_{m_0}}(n-1, 1, 1, [0], J-1) - V_\beta^{t_{m_0}}(n-2, 1, 1, [0] + 1_{J-1}, J-1)\right].
\end{aligned}$$

From (28), the expression in the first braces is positive. From Corollary 3.1 and property (a) of Lemma 3.5, the expression in the second braces is also non-negative. Thus,

$$g(n) \ge 0, \qquad n \ge m_0. \qquad (32)$$

To complete the proof note that from (29)-(32), $g(n)$ satisfies the conditions of the corresponding function in [4, Eq. (10)]. As a consequence, the rest of the proof is identical to the proof of [4, Lemma 4], and our lemma follows. $\Box$

The assertion of the next theorem and its proof are identical to [4, Theorem 5]. The proof applies the convergence of the policy iteration to the $\beta$-optimal policy.

Theorem 3.3 For every $J > J_0$ and $\beta_2 < \beta < 1$:

(i) There exists a stationary policy of threshold type, with threshold $m^*(\beta)$ [...]

(ii) If [...]

[...] $J_0$, be the optimal policy with respect to the average cost, given that the routing position is $J$. From the proof of Lemma 3.4 part (ii), it is clear that $t_{m^*(J_0+1)}$ is at least as good as $t_{m^*(J)}$. Hence, $t_{m^*(J_0+1)}$ is at least as good as any other fixed-position policy that routes from $J > J_0$. Furthermore, from Theorem 3.2 part (b.1), the policy $t'_{m^*(J_0+1)}$ that routes customers whenever $t_{m^*(J_0+1)}$ does, but from position $J_0$, is at least as good as $t_{m^*(J_0+1)}$. Thus, the following corollary is obtained.

Corollary 3.2: The policy $t'_{m^*(J_0+1)}$ is at least as good as any other policy that routes from position $J > J_0$.

References

[1] A. K. Agrawala, E. G. Coffman, Jr., M. R. Garey and S. K. Tripathi, A Stochastic Optimization Algorithm Minimizing Expected Flow Times on Uniform Processors, IEEE Transactions on Computers, C-33(4):351-356, April 1984.

[2] S. Ayoun, Optimal Control of a Queueing System with Two Heterogeneous Servers with Resequencing, M.S. Thesis, Dept. of EE, Technion, Haifa 32000, Israel, February 1989.

[3] I. Iliadis and Y. C. Lien, Resequencing Delay for a Queueing System with Two Heterogeneous Servers Under a Threshold-type Scheduling, IEEE Transactions on Communications, COM-36:692-702, 1988.

[4] W. Lin and P. R. Kumar, Optimal Control of Queueing Systems with Two Heterogeneous Servers, IEEE Transactions on Automatic Control, AC-29:696-703, August 1984.

[5] S. A. Lippman, Semi-Markov Decision Processes with Unbounded Rewards, Management Science, 19:717-731, 1973.

[6] S. A. Lippman, Applying a New Device in the Optimization of Exponential Queueing Systems, Operations Research, 23:687-710, 1975.

[7] R. L. Larsen, Control of Multiple Exponential Servers with Application to Computer Systems, Ph.D. Thesis, Technical Report No. 1041, University of Maryland, 1981.

[8] M. I. Reiman, Optimal Control of a Heterogeneous Two Server Queue in Light Traffic, AT&T Bell Laboratories, Murray Hill, NJ 07974, 1989.

[9] M. I. Reiman and B. Simon, Open Queueing Systems in Light Traffic, Math. Oper. Res., 1989.

[10] Z. Rosberg and A. Makowski, Optimal Routing to Parallel Heterogeneous Servers - Small Arrival Rates, submitted to IEEE Transactions on Automatic Control, September 1988.

[11] M. Schäl, Conditions for Optimality in Dynamic Programming and for the Limit of n-stage Optimal Policies to be Optimal, Z. Wahrscheinlichkeitstheorie verw. Gebiete, 32:179-196, 1975.

[12] S. Varma, Some Problems in Queueing Systems with Resequencing, M.S. Thesis, Technical Report TR-87-192, University of Maryland, College Park, 1987.

[13] J. Walrand, A Note on the Optimal Control of a Queueing System with Two Heterogeneous Servers, Systems and Control Letters, 4:131-134, 1984.


[Figure 1: Routing from position J. The diagram shows the service queue Q, Server 2, and the resequencing queues R1 and R2.]
