Technion - Computer Science Department - Technical Report CS0592 - 1989
TECHNION - Israel Institute of Technology Computer Science Department
OPTIMAL ROUTING TO TWO PARALLEL HETEROGENEOUS SERVERS WITH RESEQUENCING by
S. Ayoun and Z. Rosberg
Technical Report #592 November 1989
OPTIMAL ROUTING TO TWO PARALLEL HETEROGENEOUS SERVERS WITH RESEQUENCING Zvi Rosberg
Serge Ayoun
(November 1989)
Technion - IIT Computer Science Dept. Haifa 32000, Israel
Abstract
Customers arrive at a single service queue according to a Poisson process with rate $\lambda$, from which they are routed to two parallel heterogeneous exponential servers whose rates are $\mu_1 > \mu_2$. Customers are released from the system after service completion, according to their arrival order, a requirement that introduces additional resequencing delays. Customers that are delayed due to resequencing wait in a resequencing queue. We consider the optimal routing problem under the class of fixed-position routing policies, which route customers to the faster server from the head of the service queue, and to the slower server from position $J$. The cost function is taken as the long-run average holding cost of the customers in the system. We show that an optimal stationary policy exists and is of the following type: The faster server is kept active as long as the service queue is not empty. The decision whether or not to route a customer to the slower server is independent of the state of the resequencing queue. If the position $J$ is greater than $J_0 = \lceil \ln(1-a)/\ln a \rceil$, $a = \mu_1/(\mu_1+\mu_2)$, then customers are routed to the slower server if and only if the length of the service queue is at least a certain threshold level (a threshold policy). We also show that the routing position $J_0$ is 'optimal' in the sense that every policy can be improved by dispatching a customer from position $J_0$ (if not empty), rather than from position $J$.
Keywords: Exponential Queues, Parallel Servers, Resequencing, Routing, Optimal Control, Markov Decision Processes.
1 Introduction
In this paper we consider a queueing system (Figure 1) which is composed of an infinite-capacity queue, $Q$, attended by two exponential servers operating at rates $\mu_1 > \mu_2$. Customers arrive into the system according to a Poisson process with rate $\lambda$, and are assigned consecutive integers which serve as their identifiers. Throughout we assume the stability condition $\lambda < \mu \stackrel{\mathrm{def}}{=} \mu_1 + \mu_2$.
Arriving customers join at the end of queue $Q$ and are routed to one of the servers according to some given routing policy (to be defined below). Customers in service cannot be re-routed. In many applications of routing in communication networks, customers (messages) are released from the service system (the channel and receiver) according to the order of their arrivals. That is, customer $i$ is not released from the system unless he and all customers whose numbers are smaller than $i$ have finished their service. The waiting time of a customer that has completed his service, for the release of customers with lower sequence numbers, is referred to as resequencing delay. Note that resequencing delays are possible since the servers operate at different rates. Moreover, a routing policy may assign customers from an arbitrary position of the queue. Customers which are being delayed due to resequencing wait in one of two resequencing queues: R1 for customers which have been served by server 1, and R2 for those which have been served by server 2.

The positions in queue $Q$ from which customers are being routed to the servers (which are perceived as two alternative routes) clearly affect the overall resequencing delays (see [3]). The optimal routing problem with variable positions turned out to be extremely difficult. Therefore, we restrict our attention to fixed-position routing policies which route customers to server 1 only from the head of queue $Q$, and to server 2 only from a fixed position $J$, $J \ge 2$. By position $J$ we mean the $J$-th customer among those in server 1 and in queue $Q$. Besides tractability, this restriction is also motivated by the result in [2]. It has been shown there that if routing positions are allowed to vary in time, then under light and heavy loads one can take the optimal policy within the class of fixed-position routings. Also, as will become apparent, it is not optimal to keep server 1 idle if queue $Q$ is not empty, and therefore the requirement $J \ge 2$ does not exclude the head of the line.
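The release rule described above (a customer leaves only once it and all earlier arrivals have finished service) can be illustrated with a minimal sketch. The function names and the representation of a sample path as a list of service-completion times in arrival order are assumptions made for illustration only; they do not appear in the report.

```python
from itertools import accumulate


def release_times(completion_times):
    """Release time of each customer, listed in arrival order.

    Customer i is released once it and every earlier arrival have
    completed service, i.e. at the running maximum of completion times.
    """
    return list(accumulate(completion_times, max))


def resequencing_delays(completion_times):
    """Time each customer waits after its own service completion."""
    released = release_times(completion_times)
    return [r - c for r, c in zip(released, completion_times)]
```

For instance, if the first customer is still in service while the second and third complete, the latter two accumulate a resequencing delay until the first one finishes.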
Let $X(t)$ be a tuple denoting the state of the system at time $t$ (to be defined below) and $|X(t)|$ be the number of customers in the system at that state. A routing policy $\pi$ is any rule that at every time $t > 0$ decides, on the basis of past states and of past decisions up to time $t$, which idle servers to activate. Policies may leave a server idle even when there is a customer in the corresponding position. With a holding cost accrued at a fixed rate of 1, the long-run average cost associated with the policy $\pi$ is then defined by

$$V_\pi(z) = \limsup_{T \to \infty} \frac{1}{T} E_z^\pi \left[ \int_0^T |X(t)| \, dt \right], \quad z \in S, \qquad (1)$$

where $E_z^\pi[\cdot]$ denotes the expectation with respect to the probability measure induced by the policy $\pi$ on the process $X = \{X(t), t \ge 0\}$ starting in state $z$. A routing policy $\pi^*$ is optimal if it minimizes (1), i.e., if

$$V_{\pi^*}(z) \le V_\pi(z), \quad z \in S,$$
for any other policy $\pi$. For the exponential system considered here, the optimization problem associated with (1) falls within the purview of continuous-time Markov decision processes which are uniformizable, i.e., which are equivalent to uniformized discrete-time Markov decision processes [6]. The reader is referred for details to [4], where the same problem without resequencing delays is studied.

To define the discrete-time decision process, consider that at any given instant, each server is working either on a real customer, if activated, or on a dummy customer otherwise. Dummy customers always return to queue $Q$ upon completing service and incur no contribution to the cost. Arrivals and service completions at one of the servers, of a customer either real or dummy, determine the free transitions. These free transitions occur according to a Poisson process of rate $\lambda + \mu$. A (free) transition due to an arrival occurs with probability $\lambda/(\lambda+\mu)$, whereas a transition due to a service completion at server $i$ occurs with probability $\mu_i/(\lambda+\mu)$. If in state $z$ before a transition, the process will jump after this transition to a state which depends on the current state $z$ and on the action taken under the policy $\pi$ in use. The cost function for using policy $\pi$ which corresponds to (1) is then given by

$$V_\pi(z) = \limsup_{M \to \infty} \frac{1}{M} E_z^\pi \left[ \sum_{m=0}^{M-1} |X(m)| \right], \quad z \in S, \qquad (2)$$

where $X(m)$ now denotes the state sampled at the $m$-th transition. We also need the total $\beta$-discounted cost ($0 < \beta < 1$) associated with the policy $\pi$, which is defined by

$$V_\pi^\beta(z) = E_z^\pi \left[ \sum_{m=0}^{\infty} \beta^m |X(m)| \right], \quad z \in S. \qquad (3)$$
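The uniformization step and the discounted criterion described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the path passed to `discounted_cost` is assumed to be a finite prefix of the sampled sizes $|X(0)|, |X(1)|, \dots$, so the sum is a truncation of the infinite series.

```python
def free_transition_probabilities(lam, mu1, mu2):
    """Probabilities of the three free transitions after uniformization.

    The uniformized chain jumps at rate lam + mu1 + mu2; each jump is an
    arrival or a (real or dummy) service completion at server 1 or 2.
    """
    if not (mu1 > mu2 > 0 and 0 <= lam < mu1 + mu2):
        raise ValueError("need mu1 > mu2 > 0 and 0 <= lam < mu1 + mu2")
    total = lam + mu1 + mu2
    return {
        "arrival": lam / total,
        "server1": mu1 / total,
        "server2": mu2 / total,
    }


def discounted_cost(path_sizes, beta):
    """Truncated beta-discounted cost of one sampled path of |X(m)|."""
    if not 0 < beta < 1:
        raise ValueError("need 0 < beta < 1")
    return sum((beta ** m) * size for m, size in enumerate(path_sizes))
```

The three probabilities sum to one, which is why the rates can later be normalized so that their total equals 1.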
The complex structure of the state space of $X$ (see Section 2) results in a complex class of stationary policies. A simpler sub-class consists of the policies whose decisions are functions of the length of queue $Q$ only. This sub-class will be referred to as the resequencing-invariant class. A still simpler sub-class is that of the threshold policies. A policy $t_m$ is a threshold policy with level $m \ge J$ if: (i) The first customer in queue $Q$ is routed to server 1 whenever that server becomes free; (ii) The customer from position $J$ is routed to server 2 when and only when that server is free and the number of customers in server 1 and queue $Q$ is at least $m$.
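Rules (i) and (ii) of the threshold policy can be written as a small decision function. This is a sketch under stated assumptions: the function name, its arguments, and the encoding of the state as a customer count plus two busy flags are hypothetical and chosen only for illustration.

```python
def threshold_actions(count, server1_busy, server2_busy, m, J):
    """Routing decisions of a threshold policy t_m (illustrative sketch).

    count: customers in server 1 plus those waiting in queue Q.
    Returns (route_head_to_server1, route_position_J_to_server2).
    """
    if J < 2 or m < J:
        raise ValueError("need J >= 2 and threshold level m >= J")
    waiting = count - (1 if server1_busy else 0)
    # (i) the head of queue Q goes to server 1 whenever server 1 is free
    to_server1 = (not server1_busy) and waiting >= 1
    # (ii) position J goes to server 2 only when server 2 is free and
    #      the count in server 1 and queue Q has reached the level m
    to_server2 = (not server2_busy) and count >= m
    return to_server1, to_server2
```

Because $m \ge J$, reaching the level $m$ guarantees that position $J$ is occupied, so no separate occupancy check is needed in rule (ii).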
One result of this study is that the optimal policy can be taken within the resequencing-invariant class. Another result is that for a certain range of positions $J$, the optimal policy can be taken within the threshold class. We also show that there is a preferable routing position $J_0$.

For the routing problem without resequencing delays, the routing position $J$ is irrelevant since service requirements are identically distributed. This problem was first studied in [7], where it was conjectured that the optimal policy would be of threshold type. In [1], a version of the problem with $N$ servers was considered under the assumptions that the system has an initial load of $n$ customers and no new customers enter the system, i.e., $\lambda = 0$. A simple policy which minimizes the expected flow time has been determined. This optimal policy has the following simple form [1]: For $1 < j \le N$, set

$$R_j \stackrel{\mathrm{def}}{=} \frac{\mu_1 + \dots + \mu_{j-1}}{\mu_j} - (j-1) \qquad (4)$$

and define $R_1 = 0$. If there are $n$ customers that remain unprocessed and server $j$ is the fastest server available (i.e., with the largest $\mu_j$), then the idle server $j$ is activated, and a customer dispatched to it, if and only if $n > R_j$.
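The activation rule for the case without arrivals can be sketched as follows. The sketch assumes that (4) reads $R_j = (\mu_1 + \dots + \mu_{j-1})/\mu_j - (j-1)$ with $R_1 = 0$, and that the rates are listed in decreasing order; the function names are hypothetical.

```python
def activation_thresholds(rates):
    """Return [R_1, ..., R_N] for service rates sorted in decreasing order.

    Assumes R_1 = 0 and R_j = (mu_1 + ... + mu_{j-1}) / mu_j - (j - 1).
    """
    rates = list(rates)
    if not rates or any(r <= 0 for r in rates):
        raise ValueError("rates must be a non-empty list of positive numbers")
    if rates != sorted(rates, reverse=True):
        raise ValueError("rates must be sorted from fastest to slowest")
    thresholds = [0.0]
    faster_sum = 0.0
    for j in range(2, len(rates) + 1):
        faster_sum += rates[j - 2]          # adds mu_{j-1}
        thresholds.append(faster_sum / rates[j - 1] - (j - 1))
    return thresholds


def should_activate(n, j, rates):
    """Activate idle server j (1-based) iff n unprocessed customers exceed R_j."""
    return n > activation_thresholds(rates)[j - 1]
```

With rates 4, 2 and 1, the second server is used only when more than one customer remains, and the third only when more than four remain.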
The conjecture from [7] on the threshold form of the optimal policy was settled in the affirmative in [4] for $N = 2$. Using policy iteration, it has been shown that the optimal policy is of threshold type with threshold level $R(\lambda)$ (which depends on $\lambda$). It was also conjectured there that as $\lambda \downarrow 0$, $R(\lambda)$ increases and converges to $R_2$ given by (4). In [13], simple stochastic coupling arguments were used to prove the optimality of the threshold policy for $N = 2$. Motivated by the conjecture made in [4], it has been shown in [10] (for a general number of parallel servers) and in [8] (for two servers) that the threshold policy above for $\lambda = 0$ is also optimal for small enough values of the arrival rate $\lambda$.
In light of the results above, one is naturally led to explore the idea that when resequencing delays are introduced, the optimal policy would also be of threshold type. We settle this question in the affirmative only for $J > J_0$.
The issue of resequencing delays in this context was first introduced in [3], where queueing statistics have been evaluated under the class of fixed-position threshold policies. It has been further shown there that for a given threshold level $m$, there is an optimal position $J^*$ from which one should route customers to server 2. This position is given by

$$J^* = \begin{cases} m, & \text{if } m \le J_0; \\ J_0, & \text{if } m > J_0, \end{cases}$$

where $a = \mu_1/(\mu_1+\mu_2)$ and

$$J_0 = \left\lceil \frac{\ln(1-a)}{\ln a} \right\rceil. \qquad (5)$$
In words: when a customer has to be routed to server 2 according to the threshold policy $t_m$, the best fixed position is the one nearest to $J_0$. This property of $J_0$ will be referred to as its 'optimality property'. Reviewing the optimality property of $J_0$ for a threshold policy, and considering the fact that threshold policies may not necessarily be optimal, we are intrigued by another question: whether $J_0$ has the optimality property for a more general class of policies. We will show that this is indeed the case.
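The position $J_0$ and the best fixed position for a given threshold level can be computed directly. The sketch below assumes that (5) reads $J_0 = \lceil \ln(1-a)/\ln a \rceil$ with $a = \mu_1/(\mu_1+\mu_2)$, and that the best position is the feasible one nearest to $J_0$, i.e. $\min\{m, J_0\}$; the function names are hypothetical.

```python
import math


def preferred_position(mu1, mu2):
    """J_0 = ceil(ln(1 - a) / ln(a)) with a = mu1 / (mu1 + mu2).

    Note: if the ratio is an exact integer, floating-point rounding may
    push the result up by one; this sketch does not treat that edge case.
    """
    if not mu1 > mu2 > 0:
        raise ValueError("need mu1 > mu2 > 0")
    a = mu1 / (mu1 + mu2)
    return math.ceil(math.log(1.0 - a) / math.log(a))


def best_fixed_position(m, j0):
    """Best routing position J* for threshold level m: nearest feasible to J_0."""
    return min(m, j0)
```

For example, with $\mu_1 = 2$ and $\mu_2 = 1$ the ratio is about 2.71, so the sketch returns 3; a server ten times faster than the other pushes the value much higher.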
Independently, an attempt based on value iteration has been made in [12] to prove that the threshold policy is optimal for the case $J \ge 2$. The proofs there, however, raise in our mind some unsettled questions. The most severe one is the validity of the inequalities [12, Eqs. (6.49), p. 120] for $K = 1$. These inequalities are crucial for the validity of Lemma 5.7.1 there.

The paper is organized as follows. In Section 2, we define the state space and the transitions under fixed-position routings. Section 3 is sub-divided into two parts. In Sub-section 3.1, we show that the faster server should be kept active as long as the service queue is not empty. In Sub-section 3.2, which is further sub-divided, we consider the optimal control of the slower server. In Sub-section 3.2.1, we show that the optimal control is independent of the state of the resequencing queues. In Sub-section 3.2.2, we show the 'optimality property' of position $J_0$, and in Sub-section 3.2.3 we show that for $J > J_0$, the optimal policy is of threshold type.
2 The state process - definitions and basic results
In this section we define the states and the transitions of the Markov decision process that describes our routing problem, and examine its state evolution.
2.1 States and transitions
We start with the state definition. After every transition $t$, $t = 0, 1, \dots$, in the discrete-time decision process, let $n(t)$ denote the number of customers in queue $Q$, and $e_i(t)$, $i = 1, 2$, denote the state of server $i$ (with the understanding that $e_i(t) = 1$ if server $i$ is busy, and $e_i(t) = 0$ otherwise). To describe the resequencing queues R1 and R2 we need the following notion. We say that customer $i$ in a resequencing queue is being delayed by customer $k_0$ if:
(i) Customer $k_0$ did not finish service.
(ii) $k_0 < i$.
(iii) $k_0$ is the maximal $k$ that satisfies (i) and (ii).
Thus, customer $i$ is released immediately after the service completion of customer $k_0$. Let $l(t)$ be the number of customers in queue R1 (after the $t$-th transition) that are being delayed by the customer which is being served by server 2. Here, $l(t) = 0$ if $e_2(t) = 0$.
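Also (see Figure 1), denote by $i_1(t) < i_2(t)$ …

$$S_{1,e_2} = \begin{cases} \dots, & \text{if } k > 0, \text{ or } k = 0 \text{ and } e_2 = 0; \\ (l+1, (0, \dots, 0)), & \text{if } k = 0 \text{ and } e_2 = 1, \end{cases}$$

$$S_2 = \begin{cases} (l, (l_1, \dots, l_{k-1}, l_k + 1, l_{k+1}, \dots, l_{J-1})), & \text{if } k > 0; \\ (0, (0, \dots, 0)), & \text{if } k = 0. \end{cases}$$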
The transformation $S_{1,e_2}$ defines the state that queues R1 and R2 would jump to from state $z$ when server 1 completes service of a real customer. Observe that by definition, if $k = 0$ and $e_2 = 1$, then the customer that is being served by server 2 is the 'oldest' in the system. Otherwise, the customer that is being served by server 1 is the 'oldest'. Thus, if $k > 0$, or $k = 0$ and $e_2 = 0$, when server 1 completes service of a real customer, this customer and those in queue R2 which are being delayed by him leave the system. In this case we necessarily have $l = 0$. If $k = 0$ and $e_2 = 1$, we necessarily have $R = (l, (0, \dots, 0))$, and the customer that finishes service in server 1 joins queue R1. (These observations are proven in the next sub-section.)

The transformation $S_2$ defines the state that queues R1 and R2 would jump to from state $z$ when server 2 completes service of a real customer. Recall that for $k = 0$ and $e_2 = 1$ we necessarily have $R = (l, (0, \dots, 0))$. Therefore, when the customer that is being served by server 2 finishes service, he and the customers in queue R1 leave the system. If $k > 0$ and $e_2 = 1$, then the customer that finishes service in server 2 joins queue R2 and is delayed by customer $i_k$.

Now, the free transitions of process $X$ from state $z \in S$ (when no routings are made) are as follows.
if el
if
e2
= 0;
= 0;
where $z^+ = \max\{0, z\}$. The probabilities that a free transition $A(z)$, $D_1(z)$ or $D_2(z)$ occurs are $\lambda/(\lambda+\mu)$, $\mu_1/(\lambda+\mu)$ and $\mu_2/(\lambda+\mu)$, respectively.

Here, it is convenient to identify a stationary policy $\pi$ with a function $\pi : S \to \{P_h, P_1, P_2, P_b\}$ as follows. Assume that a free transition, either an arrival or a service completion, occurs that would make the state jump to $z \in S$ if no action were taken. The policy $\pi$ uses at state $z$ an operator $P_a$, $a \in \{h, 1, 2, b\}$, that makes the state jump instantaneously from $z$ to $P_a(z)$,
where

$$P_h(z) = z;$$
$$P_1(n, 0, e_2, R, k) = (n-1, 1, e_2, R, k), \quad n \ge 1;$$
$$P_2(n, e_1, 0, R, 0) = (n-1, e_1, 1, R, J-1), \quad n > 1;$$
$$P_b(n, 0, 0, R, 0) = (n-2, 1, 1, R, J-1), \quad n > 2.$$

The operator $P_h$ does not route any customers, $P_1$ routes the customer from the head of the queue to server 1, $P_2$ routes the customer from position $J$ to server 2, and $P_b$ does $P_1$ and $P_2$. (Notice that from the way we define the position $J$, the order in $P_b$ is irrelevant.)
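The four routing operators act on the state tuple $(n, e_1, e_2, R, k)$ and can be sketched as plain functions. The `State` type and the function names are hypothetical. The lower bounds on $n$ checked below are only the minimal ones needed to keep $n$ non-negative, which is an assumption; the exact feasibility conditions should be taken from the definitions above.

```python
from typing import NamedTuple, Tuple


class State(NamedTuple):
    n: int      # customers waiting in queue Q
    e1: int     # 1 if server 1 is busy, else 0
    e2: int     # 1 if server 2 is busy, else 0
    R: Tuple    # resequencing-queue descriptor, untouched by routing
    k: int      # position index attached to the customer in server 2


def p_hold(z):
    """P_h: route nobody."""
    return z


def p_route1(z):
    """P_1: route the head of queue Q to the idle server 1."""
    if z.e1 != 0 or z.n < 1:
        raise ValueError("P_1 needs server 1 idle and a waiting customer")
    return z._replace(n=z.n - 1, e1=1)


def p_route2(z, J):
    """P_2: route the customer in position J to the idle server 2."""
    if z.e2 != 0 or z.k != 0 or z.n < 1:
        raise ValueError("P_2 needs server 2 idle, k = 0 and a waiting customer")
    return z._replace(n=z.n - 1, e2=1, k=J - 1)


def p_route_both(z, J):
    """P_b: apply P_1 and P_2 together when both servers are idle."""
    if z.e1 != 0 or z.e2 != 0 or z.k != 0 or z.n < 2:
        raise ValueError("P_b needs both servers idle, k = 0 and two customers")
    return z._replace(n=z.n - 2, e1=1, e2=1, k=J - 1)
```

None of the operators modifies the component $R$, which matches the definitions: routing changes only the queue length, the server indicators and the index $k$.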
2.2 Basic results
Since the cost function is linear in the state variable and the total number of customers in the system changes by at most one at every transition, it is well known that an optimal policy exists for the $\beta$-discounted problem (associated with (3)), and that it can be taken in the class of Markov stationary policies [11]. One of the conclusions of this study is that the exact same result also holds for the long-run average cost criterion (2). Furthermore, for every stationary policy $\pi$, the limit in (2) exists and is independent of the initial state $z$. Without loss of generality we may assume that $\lambda + \mu = 1$. Under any stationary policy $\pi$, the forward equations of $V_\pi^\beta(z)$ are

where $\pi(y) \in \{P_h(y), P_1(y), P_2(y), P_b(y)\}$.

In the following lemmas we present some basic properties of the state evolution. The first lemma resolves the order among the customers at any instant. Denote (see Figure 1):
$r^1_s(t)$ - The customer (i.e., its sequence number) in the $s$-th position in queue R1 at the $t$-th transition (time $t$).
$r^2_s(t)$ - The customer in the $s$-th position in queue R2 at time $t$.
$q_s(t)$ - The customer in the $s$-th position in queue $Q$ at time $t$.
$n_1(t)$ - The number of customers in queue R1 at time $t$.
$n_2(t)$ - The number of customers in queue R2 at time $t$.
$1(t)$ - The customer which is being served by server 1 at time $t$, or 0 if the server is idle.
$2(t)$ - The customer which is being served by server 2 at time $t$, or 0 if the server is idle.
Lemma 2.1 At every time $t$ and for every occupied positions $s$ and $p$ in the corresponding queues:
(a) $q_p(t) < q_s(t)$, for $p < s$;
(b) $r^1_p(t) < r^1_s(t)$, for $p < s$;
(c) $r^2_p(t) < r^2_s(t)$, for $p < s$;
(d) $q_s(t') \le q_s(t)$, for $t' < t$;
(e) $r^1_s(t) < 1(t) < q_1(t)$;
(f) $r^2_s(t) < 2(t) < q_J(t)$;
(g) $2(t) < r^1_s(t)$;
(h) There exists a $p$, $1 \le p \le J-2$, such that $1(t) < r^2_s(t)$ or $q_p(t) < r^2_s(t)$.

Proof: Properties (a)-(f) are direct consequences of the facts that customers join at the end of the queues and are being dispatched from fixed positions. Property (g): Customer $r^1_s(t)$ is being delayed by a lower customer. From properties (a), (b) and (e), it could only be customer $2(t)$. Thus, $2(t) < r^1_s(t)$. Property (h): Similarly for customer $r^2_s(t)$. From properties (f) and (a) it could only be one of the customers in $\{1(t), q_1(t), \dots, q_{J-2}(t)\}$. □

In the next lemma we show that the two resequencing queues cannot be non-empty at the same time.
Lemma 2.2 At every time $t$, at least one of the queues R1 or R2 is empty.

Remark 2.1: If $1(t) = 0$ then $I_1(t) = \emptyset$. Otherwise, $I_J(t) = \emptyset$. That is, there are at most $J-1$ non-empty sets from $\{I_1(t), I_2(t), \dots, I_J(t)\}$.
3 Optimal routing
In this section we consider the $\beta$-discounted and the average-cost Markov decision processes. The optimal control is split into two parts: routing to the faster server and routing to the slower server. In Sub-section 3.1 we show by probabilistic arguments that the faster server should be utilized as long as queue $Q$ is not empty. In Sub-section 3.2, which is further sub-divided, we consider the optimal control of the slower server. In Sub-section 3.2.1, we show that the optimal control is independent of the state of the resequencing queues. In Sub-section 3.2.2, we show the 'optimality property' of position $J_0$, and in Sub-section 3.2.3 we show that for $J > J_0$, the optimal policy is of threshold type.
3.1 Routing to the faster server
In this sub-section we use arguments similar to those presented in [13] in order to show that server 1 is kept active if queue $Q$ is not empty. To fix the notation, all the proofs in this section are based on pathwise comparison arguments between an original state process $X$ under a given policy $\pi$, and another state process $\tilde X$ under policy $\tilde\pi$ derived from $\pi$. The latter is referred to as the tilde system, and we use a tilde to denote all relevant quantities in the tilde system.
Lemma 3.1 For every $0 < \beta < 1$, the $\beta$-optimal policy has the property that whenever it activates a server, it activates the fastest available one.
Proof: Let $\pi$ be any given policy and let $X(0) = z$ be an initial state at which $\pi$ activates server 2 while leaving server 1 idle. By definition, server 2 is activated by the $J$-th customer from queue $Q$. We will show that $\pi$ can be strictly improved.

To simplify notation we may assume without loss of generality that the customers in queue $Q$ have consecutive numbers starting from 1. (This is possible since only the order among them determines their departure times from the system. Also, from the state definition of the resequencing queues, this assumption does not change the system state.)
Define a policy $\tilde\pi$ and a corresponding process $\tilde X$ as follows. The initial state $\tilde X(0) = X(0)$, and at time 0, $\tilde\pi$ takes the same action as $\pi$, except that it activates server 1 (with customer number 1) instead of server 2. From then on, the realizations of $X$ and $\tilde X$ are coupled. This is done by feeding both systems with the same arrival process and assuming that the first service time at server 2 in $X$ equals $\tau_2 = \frac{\mu_1}{\mu_2} \tilde\tau_1$. (Here, $\tau_j$ is the service time of a customer at server $j$.) Observe that this coupling is made possible by the fact that $\tilde\tau_1$ is exponentially distributed with parameter $\mu_1$ and therefore $\tau_2$ is exponentially distributed with parameter $\mu_2$.
After time 0, policy $\tilde\pi$ mimics the actions of policy $\pi$ (activates the same servers by customers from the appropriate positions) with one exception: (i) Let $T$ be the first time at which $\pi$ activates server 1. If $T$ … $> T\}$ implies the condition $\{\tau_2 > \frac{\mu_1}{\mu_2} T\}$, and therefore the residual service time of customer $J$ in $X$ from time $\frac{\mu_1}{\mu_2} T$ ($> T$) is exponentially distributed with parameter $\mu_2$. Hence, we can couple the residual service time of customer 1 in $\tilde X$ from time $T$ with his service time in $X$ which starts at time $T$. Furthermore, we can also couple the residual service time of $J$ in $X$ from time $\frac{\mu_1}{\mu_2} T$ ($> T$) with $\tilde\tau_2$, the service time of $J$ in $\tilde X$ from time $T$. The latter implies that customer $J$ leaves $X$ at time $\frac{\mu_1}{\mu_2} T + \tilde\tau_2$, while from $\tilde X$ at time $T + \tilde\tau_2$. After time $T$, $\tilde\pi$ continues to mimic $\pi$'s actions. From the coupling above and the definition of $T$, it is clear that this is feasible. Hence, for all realizations in $X$ where (i) occurs we have

$$|\tilde X(t)| = \begin{cases} |X(t)| - 1, & \text{for } \dots; \\ |X(t)|, & \text{otherwise.} \end{cases}$$

For all other realizations in $\{T > \tilde\tau_1\}$, $\tilde\pi$ mimics $\pi$'s actions and we obtain $|\tilde X(t)| = |X(t)|$, for $0$ …

… $J > J_0$, the optimal policy is of threshold type.
3.2.1 The resequencing-invariant property
The next lemma is essential for the proof that state $R$ does not play any role in the optimal routing decision. Observe that from Lemma 2.2, Remark 2.1 and Lemma 3.2, the feasible values of $R$ are of the form $(0, (l_1, \dots, l_{J-1}))$ or $(l, (0, \dots, 0))$, where $l_1, \dots, l_{J-1}$ correspond to the customers in server 1 and in the first $(J-2)$ positions of queue $Q$. For $R = (0, (0, \dots, 0))$ we fix the notation $[0]$.
Lemma 3.3 There is a function $h_\beta(R)$ such that for every routing policy $\pi$ whose decisions are independent of $R$,

$$V_\pi^\beta(n, e_1, e_2, R, k) = V_\pi^\beta(n, e_1, e_2, [0], k) + h_\beta(R). \qquad (9)$$

Proof: Let $x_0 = (n, e_1, e_2, R, k)$ and $\tilde x_0 = (n, e_1, e_2, [0], k)$ be two initial states, and $X$ and $\tilde X$ the processes that are governed by policy $\pi$ and start at $x_0$ and $\tilde x_0$, respectively. Since $\pi$ is independent of $R(t)$, we may couple the arrivals and service times in both systems. This is made possible by the same evolutions of $(n(t), e_1(t), e_2(t))$ and $(\tilde n(t), \tilde e_1(t), \tilde e_2(t))$. (Here we use the tilde notation as in Sub-section 3.1.) There are two cases of $R$ that have to be considered.

Case (i): Assume that $R = (0, (l_1, \dots, l_{J-1}))$. For every $1 \le j \le J-1$ with $l_j \ne 0$, let $\tau_j$ ($\tilde\tau_j$) be the instant that the customer present at time 0 in position $j$ leaves the system. For $l_j = 0$ or $j = 0$ define $\tau_j = \tilde\tau_j = 0$. By the coupling, $\tau_j$ and $\tilde\tau_j$ are identical. Since $\pi$ routes from position $J$, the customers that are present at time 0 in the first $(J-1)$ positions will be routed to server 1. Thus, $\tau_j$ is distributed as the sum of $j$ independent geometric r.v.'s with parameter $\mu_1$. By the definition of the resequencing delay, we therefore have

$$|X(t)| = \begin{cases} |\tilde X(t)| + \sum_{i=j}^{J-1} l_i, & \text{for } \tau_{j-1} \le t < \tau_j, \ 1 \le j \le J-1; \\ |\tilde X(t)|, & \text{for } t \ge \tau_{J-1}. \end{cases}$$

For this case the lemma follows by defining

$$h_\beta(R) = \sum_{i=1}^{J-1} l_i E[1 + \beta + \dots + \beta^{\tau_i - 1}]. \qquad (10)$$

Case (ii): Assume that $R = (l, (0, \dots, 0))$. If $l = 0$ then the lemma is trivial. For $l > 0$, let $\tau$ be the instant that the customer present at time 0 in server 2 completes his service. Clearly, $\tau$ is geometrically distributed with parameter $\mu_2$. We have

$$|X(t)| = \begin{cases} |\tilde X(t)| + l, & \text{for } 0 \le t < \tau; \\ |\tilde X(t)|, & \text{for } t \ge \tau. \end{cases}$$

For this case the lemma follows by defining

$$h_\beta(R) = l E[1 + \beta + \dots + \beta^{\tau - 1}]. \qquad (11)$$

From (10) and (11), the function

$$h_\beta(R) = l E[1 + \beta + \dots + \beta^{\tau - 1}] + \sum_{i=1}^{J-1} l_i E[1 + \beta + \dots + \beta^{\tau_i - 1}] \qquad (12)$$

satisfies (9). Here the expectations are taken with respect to the geometric r.v.'s, which are clearly independent of $\pi$.
□
The function $h_\beta(R)$ represents the accrued discounted cost that is contributed by the customers present at time 0 in the resequencing queues. For later reference denote $h_\beta(k) \stackrel{\mathrm{def}}{=} h_\beta((0, (0, \dots, 0, 1, 0, \dots, 0)))$, where the 1 corresponds to position $k$. Observe that from (10),

$$h_\beta(k) = E[1 + \beta + \dots + \beta^{\tau_k - 1}]. \qquad (13)$$
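The expectations entering $h_\beta(R)$ have a closed form that is easy to evaluate. The sketch below is a worked illustration under stated assumptions, not the report's own derivation: it takes a geometric r.v. with parameter $p$ to live on $\{1, 2, \dots\}$, so that $E[\beta^{\tau}] = p\beta/(1-(1-p)\beta)$, uses $E[1+\beta+\dots+\beta^{\tau-1}] = (1 - E[\beta^{\tau}])/(1-\beta)$, and treats $\tau_i$ as a sum of $i$ independent geometric r.v.'s with parameter $\mu_1$ as stated in the proof. The function names and the split of $R$ into `l` and a list `ls` are hypothetical.

```python
def geometric_pgf(p, beta):
    """E[beta**tau] for tau geometric on {1, 2, ...} with success probability p."""
    if not (0 < p <= 1 and 0 < beta < 1):
        raise ValueError("need 0 < p <= 1 and 0 < beta < 1")
    return p * beta / (1.0 - (1.0 - p) * beta)


def discounted_sojourn(pgf_value, beta):
    """E[1 + beta + ... + beta**(tau - 1)] = (1 - E[beta**tau]) / (1 - beta)."""
    return (1.0 - pgf_value) / (1.0 - beta)


def h_beta(l, ls, mu1, mu2, beta):
    """Evaluate the sum of the two terms of h_beta for R = (l, (l_1, ..., l_{J-1})).

    The first term uses tau ~ geometric(mu2); the i-th summand uses tau_i,
    a sum of i independent geometric(mu1) r.v.'s, whose generating
    function is the i-th power of the single-variable one.
    """
    total = l * discounted_sojourn(geometric_pgf(mu2, beta), beta)
    for i, l_i in enumerate(ls, start=1):
        pgf_sum = geometric_pgf(mu1, beta) ** i
        total += l_i * discounted_sojourn(pgf_sum, beta)
    return total
```

As $\beta$ approaches 1 the discounted sojourn term tends to the mean $E[\tau]$, which gives a quick sanity check of the formula.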
By using Lemma 3.3 and the following value and policy iterations, we will show that the routing decisions of the $\beta$-optimal policy are independent of $R$. Let $F$ be the Banach space of all functions $f : S \to \mathbb{R}$ with the norm $\| \cdot \|$ defined by

$$\| f \| = \sup_{z \in S} \left| \frac{f(z)}{\max\{1, |z|\}} \right|.$$
From (6) and Lemma 3.2 we may define for every stationary policy $\pi$ the dynamic programming operator $T_\pi : F \to F$, by

where $\pi(y) \in \{0, 1\}$ with the understanding that $\pi(y) = 1$ if $\pi$ routes a customer to server 2 at state $y$, and 0 otherwise. Also, define the optimal dynamic programming operator $T : F \to F$, by

Notice that if the decision at state $y$, $\pi(y)$, for which the $\min_\pi f(\pi(y))$ is attained, is consistently chosen for every $y$, then $Tf$ defines a stationary policy $\pi'$ which satisfies

$$(T_{\pi'} f)(z) = (Tf)(z) = \min_\pi (T_\pi f)(z). \qquad (16)$$

The procedure by which a new value function is derived by using operator $T$ is known as value iteration, and that by which a new stationary policy is derived by using $T$, as policy iteration.
Theorem 3.1 The routing decisions of the $\beta$-optimal policy are independent of the state of the resequencing queues, $R$.

Proof: First we show that if $\pi$'s decisions are independent of $R$, so are the decisions of the policy derived by the policy iteration $T V_\pi^\beta$. Then we show that the optimal policy preserves the same property. For every $f \in F$, define

$$g_f(n, R) = f(n, 1, 0, R, 0) - f(n-1, 1, 1, R, J-1), \quad \text{for } n \ge J-1. \qquad (17)$$

Let $\pi_0$ be a policy whose routing decisions are independent of $R$, and for every $m \ge 0$ define $\pi_{m+1}$ as the policy that is derived by the policy iteration $T V_{\pi_m}^\beta$. That is, $T_{\pi_{m+1}} V_{\pi_m}^\beta = T V_{\pi_m}^\beta$. From (15) and (17), $\pi_{m+1}(y)$ is either 0 or 1, depending on whether $g_{V_{\pi_m}^\beta}(n, R)$ is negative or non-negative, respectively. From Lemma 3.3 it follows that if $\pi_m$'s decisions are independent of $R$, then $g_{V_{\pi_m}^\beta}(n, R) = g_{V_{\pi_m}^\beta}(n, [0])$, which implies that $\pi_{m+1}$'s decisions are also independent of $R$. Since a limit point of $\{\pi_m\}$ does not necessarily exist, we cannot deduce the theorem by the policy iteration procedure. However, we can extract it by the value iteration procedure as follows. Consider the sign of $g_{V^\beta}$, where $V^\beta = \inf_\pi V_\pi^\beta$ is the $\beta$-value function. Since $\pi_0$'s decisions are independent of $R$, it follows by the argument above that so are $\pi_m$'s decisions, and by Lemma 3.3, the sign of $g_{V_{\pi_m}^\beta}(n, R)$ is independent of $R$, $m \ge 0$. Since $\lim_{m \to \infty} V_{\pi_m}^\beta$ exists and equals $V^\beta$ (see, e.g., [4, Lemma 3]), the sign of $g_{V^\beta}(n, R)$ is also independent of $R$. To conclude the proof, observe that the $\beta$-optimal policy $\pi^*$ is the solution to the optimality equations $V^\beta = T V^\beta$. Now, from (15), $\pi^*(y) = 1$ if and only if $g_{V^\beta}(n, R) = g_{V^\beta}(n, [0]) \ge 0$, and the solution is independent of $R$.
□
Theorem 3.1 would also apply to the optimal policy with respect to the average cost, if one could guarantee the following limits.

Then, since $g_{V^\beta}(n, R)$ is independent of $R$, the result is a straightforward consequence of the following optimality equations for the average cost problem:

By Lemma 3.7 and Remark 3.3 below, if $\lambda < \mu_1$ the limits above follow from [5, Theorem 3].

Hereafter, we may further restrict attention to policies whose routing decisions (to server 2) are functions of the length of queue $Q$ only. Although this structure is the same as in the problem without resequencing delays, it does not imply that we have the same optimal policy. This is due to the different evolutions of the cost structures.
3.2.2 An optimal routing position
In Section 1, we described the optimality property of $J_0$ that has been derived in [3] for the class of fixed-position threshold policies $t_m$. In this sub-section, we extend this property to a more general class of fixed-position policies. From the results above, the $\beta$-optimal fixed-position routing policy can be taken in the sub-class of stationary policies that are functions of two parameters: (i) The set of lengths of queue $Q$ at which a policy routes a customer to server 2 (if idle). (ii) The position $J$ from which customers are being dispatched. Note that since the position is fixed, the set in (i) is restricted by the position in (ii).

We say that a class of policies $\Pi$ is routing-invariant if the policies differ only by the positions from which customers are being dispatched. That is, for every $\pi \in \Pi$, the sets in (i) above are identical. One example is the class of threshold policies with level $m$ and routing positions $J$, $J \le m$. Let $J(\Pi)$ be the set of routing positions that correspond to class $\Pi$. We will show that the optimality property of $J_0$ holds for every routing-invariant class.

To proceed, we first characterize $J_0$ in terms of the expected delay of a customer in position $k$ under two alternative policies. One is a policy that routes customer $k$ to server 2, and customers $\{1, 2, \dots, k-1\}$ to server 1. The other policy routes customers $\{1, 2, \dots, k\}$ to server 1. Let $\{X_i\}$ be a sequence of independent geometric r.v.'s with parameter $\mu_1$, and $Y$ an independent geometric r.v. with parameter $\mu_2$. For $k \ge 1$, denote

$$X_{(k)} = \sum_{i=1}^{k} X_i \quad \text{and} \quad Z_k = \max\{Y, X_{(k-1)}\}, \quad \text{where } X_{(0)} = 0.$$

For every $0 < \beta \le 1$ define the function

$$\gamma_\beta(k) \stackrel{\mathrm{def}}{=} E[1 + \beta + \dots + \beta^{Z_k - 1}] - E[1 + \beta + \dots + \beta^{X_{(k)} - 1}].$$

Note that for $\beta = 1$, $\gamma(k) \equiv \gamma_1(k) = E[Z_k] - E[X_{(k)}]$.
The function "Yp(k) represents the difference in the accrued cost that is contributed by a customer present at time 0 in position k, under the two alternative routing policies above. ,we obtain by the forward-equations Recalling that a = --a-+ 1£1 1£2
"Yp(k) = aE[,8To hp(k -.1) - (1 - a)EfPTo+X(lI-1>(1 -:I- {3+, ... , +{3X,,-1 )],
(19)
where To is the time of the first service completion at one of the servers - either real or dummy.
(To is geometrically distributed with parameter lSI + 1S2 - ISlIS2.) From (19) it is clear that "YP(k) is a decreasing function. Furthermore, there are
{30 < 1, such that lim "Yp(k)
0 and
~ {30. The latter property is an immediate
consequence of the facts that lim ([Zk - X(k)] - [X(k-l) - X(k)]) = 0 with probability one, #;-00
E(X(k_l) - X(#;)] = -
:1 and 92l 1p(k) = E[Zk] - E[X(k)].
integer Jo({3) ~ 2, for which the function "Yp(k) ({3
> {30)
Since "Yp(1)
> 0, there exists an
becomes strictly negative for the
first time. As our primer interest is the average cost criterion, we consider (3's for which J o({3) = Jo(1). Since the 10{(3)'s are fntegers and "YP(k) -+ "Y(k), it is clear that there exists a (31 < 1 such that J o({3) = J o(l) for every {3 ;:::
f3t.
!
Finally, we show that $J_0(1) = J_0$ (which is defined in (5)). From the memoryless property of the geometric distribution it is standard to obtain a closed-form expression for $\gamma(k)$ in terms of $a$, $\mu_1$ and $\mu_2$, from which it follows that $\gamma(k) < 0$ if and only if
$$\frac{\mu_2}{\mu_1}\Big(1+\frac{\mu_2}{\mu_1}\Big)^{k-1} > 1.$$
The latter relation implies that $J_0(1) = J_0$.
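The first position at which $\gamma(k)$ turns negative can also be found numerically. The sketch below is only an illustration under stated assumptions: it treats the $X_i$ and $Y$ as exact independent geometric variables on $\{1, 2, \ldots\}$ with success probabilities `mu1` and `mu2`, and the function names are hypothetical. Under this exact discretization the probability that $Y$ outlasts one service of server 1 is $\mu_1(1-\mu_2)/(\mu_1+\mu_2-\mu_1\mu_2)$, which is close to, but not identical with, $a = \mu_1/(\mu_1+\mu_2)$, so the result need not coincide with the closed form in the text for large rates.

```python
def gamma(k, mu1, mu2):
    """gamma(k) = E[max(Y, X_(k-1))] - E[X_(k)] for independent geometric
    r.v.'s on {1, 2, ...}: X_i with parameter mu1, Y with parameter mu2.
    (Assumed discretization; not taken from the report.)"""
    if not (0.0 < mu2 < mu1 < 1.0):
        raise ValueError("expected 0 < mu2 < mu1 < 1")
    q = 1.0 - mu2
    # P(Y > X_1) = E[q^{X_1}], the generating function of X_1 evaluated at q.
    p = mu1 * q / (1.0 - (1.0 - mu1) * q)
    # E[max(Y, S)] = E[S] + P(Y > S) * E[Y], by memorylessness of Y,
    # with S = X_(k-1) and P(Y > S) = p^(k-1).
    expected_z = (k - 1) / mu1 + p ** (k - 1) / mu2
    return expected_z - k / mu1


def first_negative_position(mu1, mu2):
    """Smallest k >= 1 with gamma(k) < 0 (the role played by J_0(1))."""
    k = 1
    while gamma(k, mu1, mu2) >= 0.0:
        k += 1
    return k
```

For example, with `mu1 = 0.2` and `mu2 = 0.1`, `gamma` is positive at positions 1 and 2 and negative at position 3.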
Now we are ready to prove the optimality property of $J_0$. We need the following policy transformations, which are applied also to non-fixed-position policies:
(i) For every routing policy $\pi$ and $k \ge 2$, define $T_k^+(\pi)$ as the non-stationary policy that differs from $\pi$ only by the following action at the first step. If $\pi$ routes a customer from position $k$, then $T_k^+(\pi)$ routes a customer from position $k+1$, if not empty. Otherwise (at the first step and further steps), it takes the same routing actions as $\pi$. That is, at step one, $T_k^+(\pi)$ routes customers to server 2 at the same lengths of queue Q that $\pi$ routes, but possibly from a higher position. After step one, $T_k^+(\pi)$ routes customers to server 2 at the same lengths and from the same positions as $\pi$ routes.
(ii) For every routing policy $\pi$ and $k \ge 3$, define $T_k^-(\pi)$ as the policy that differs from $\pi$ only by the following action at the first step. If $\pi$ routes a customer from position $k$, then $T_k^-(\pi)$ routes a customer from position $k-1$.
In the following lemma we consider policies that may route from variable positions (with some restrictions), and show that if position $k$, $k \ne J_0$, is feasible under $\pi$, then $\pi$ can be improved by one of the transformations above.
Lemma 3.4 The following hold for every $\beta$, $\beta_1 < \beta < 1$:
(i) If $\pi$ routes customers to server 2 from positions larger than or equal to $k$, $k < J_0$, and $k$ is a feasible position, then $T_k^+(\pi)$ is at least as good as $\pi$.
(ii) If $\pi$ routes customers to server 2 from positions larger than or equal to $k-1$, $k > J_0$, and $k$ is a feasible position, then there is a $\beta_2 < 1$ such that $T_k^-(\pi)$ strictly improves $\pi$ for every $\beta_2 \le \beta < 1$.
Proof: The proof is based on a pathwise comparison between the state process $X$ under policy $\pi$ and the state process $\hat X$ under policy $T_k^+(\pi)$ (for part (i)) or $T_k^-(\pi)$ (for part (ii)). To compare realizations we couple the arrivals and service completions in both systems. (Note that a service completion may correspond to different customers in $X$ and $\hat X$.)
Part (i). Let $x_0 = (n, 1, 0, R, 0)$, $n \ge k$, be an initial state at which $\pi$ and $T_k^+(\pi)$ route from different positions. Since for all other initial states $V^\beta_\pi(x) = V^\beta_{T_k^+(\pi)}(x)$, we have to show that
$$V^\beta_\pi(n-1, 1, 1, R, k-1) - V^\beta_\pi(n-1, 1, 1, R, k) \ge 0.$$
This is due to the fact that after the first action, $X$ (respectively $\hat X$) instantaneously jumps to state $(n-1, 1, 1, R, k-1)$ (respectively, to $(n-1, 1, 1, R, k)$). From then on, both processes are governed by policy $\pi$. As in the proof of Lemma 3.1, we may assume without loss of generality that the customers in server 1 and in queue Q are numbered by $1, 2, \ldots, n+1$. We will show that for every customer $i \ne k$, its departure times in both systems are the same, while the expected departure time of customer $k$ is smaller in $\hat X$.
Since $\pi$ routes from positions larger than or equal to $k$, it follows from the coupling that every customer $i < k$ (and those that, at time 0, are being delayed by him) would leave both systems at the same time. Furthermore, the departure times of customer $k+1$ (and those that, at time 0, are being delayed by him) would also be the same. This is plain from the fact that $(k+1)$ leaves the system at the first instant at which customers $\{1, 2, \ldots, k+1\}$ have been released. Since states $(n-1, 1, 1, R, k-1)$ and $(n-1, 1, 1, R, k)$ differ only by the locations of customers $k$ and $k+1$, it is apparent from the coupling that this instant is the same in both systems.
Every customer $i > k+1$ in both systems is routed at the same time and to the same server, and its completion time is also the same. Since he would leave the system at the first instant at which he and all preceding customers would have been released, it follows by induction that its departure time must be the same in both systems. Hence, it is left to show that the expected accrued cost due to the delay of customer $k$ is smaller in $\hat X$.
Let $\tau$ ($\hat\tau$) be the departure time of customer $k$ from system $X$ ($\hat X$). From the identities for the rest of the departure times,
$$V^\beta_\pi(n-1, 1, 1, R, k-1) - V^\beta_\pi(n-1, 1, 1, R, k) = (i_k+1)\Big\{E\big[1+\beta+\cdots+\beta^{\tau-1}\big] - E\big[1+\beta+\cdots+\beta^{\hat\tau-1}\big]\Big\}. \qquad (20)$$
Thus, we have to show that the expression within the braces is non-negative. To prove this, first note that customer $k$ in system $\hat X$ is routed to server 2 if and only if $(k+1)$ in $X$ is routed to server 2. Also note that since $\pi$ routes from positions $k$ or higher, customer $k$ in $\hat X$ would definitely be served by server 1 if this server would complete its first service before server 2 does. This event occurs with probability $\mu_1/(\mu_1+\mu_2)$. Let $T_0$ be as in (19) and $T_1$ be the number of steps after $T_0$ that it takes to route customer $(k+1)$ in $X$ to server 2 (and infinite, if he is routed to server 1). Denote by $\eta$ the conditional probability (conditioned on the state at time $T_0$) that $\{T_1 < \infty\}$. By using the forward equations from time 0 to $T_0$, it follows from the definitions of $Z_k$ and $X_{(k)}$ that
$$\begin{aligned}
E\big[(1+\beta+\cdots+\beta^{\tau-1}) - (1+\beta+\cdots+\beta^{\hat\tau-1})\big]
&= \frac{\mu_1}{\mu_1+\mu_2}\,E\Big[\beta^{T_0}\big((1+\beta+\cdots+\beta^{Z_{k-1}-1}) - (1+\beta+\cdots+\beta^{X_{(k-1)}-1})\big)\Big] \\
&\quad + \frac{\mu_2}{\mu_1+\mu_2}\,E\Big[\eta\,\beta^{T_0+T_1}\big((1+\beta+\cdots+\beta^{X_{(k-1)}-1}) - (1+\beta+\cdots+\beta^{Z_k-1})\big)\Big] \\
&\quad - \frac{\mu_2}{\mu_1+\mu_2}\,E\Big[(1-\eta)\,\beta^{T_0+X_{(k-1)}}(1+\beta+\cdots+\beta^{X_k-1})\Big].
\end{aligned}$$
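Thus, from (18), (19) and the fact that
$$(1+\beta+\cdots+\beta^{X_{(k)}-1}) - (1+\beta+\cdots+\beta^{X_{(k-1)}-1}) = \beta^{X_{(k-1)}}(1+\beta+\cdots+\beta^{X_k-1})$$
with probability one, we have
$$\begin{aligned}
E\big[(1+\beta+\cdots+\beta^{\tau-1}) - (1+\beta+\cdots+\beta^{\hat\tau-1})\big]
&= \gamma_\beta(k) - \frac{\mu_2}{\mu_1+\mu_2}\,E\big[\eta\,\beta^{T_0+T_1}\big]\,\gamma_\beta(k) \\
&\quad + \frac{\mu_2}{\mu_1+\mu_2}\,E\big[\eta\,\beta^{T_0+X_{(k-1)}}(1+\beta+\cdots+\beta^{X_k-1})\big] \\
&\quad - \frac{\mu_2}{\mu_1+\mu_2}\,E\big[\eta\,\beta^{T_0+T_1+X_{(k-1)}}(1+\beta+\cdots+\beta^{X_k-1})\big] \\
&\ge \Big(1 - \frac{\mu_2}{\mu_1+\mu_2}\,E\big[\eta\,\beta^{T_0+T_1}\big]\Big)\gamma_\beta(k) \\
&\ge 0.
\end{aligned}$$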
The last inequality follows from the monotonicity of $\gamma_\beta(k)$, the definition of $J_0$ and the fact that $k < J_0$. This completes the proof of part (i).
Part (ii). Let $x_0 = (n, 1, 0, R, 0)$, $n > k-1$, be an initial state at which $\pi$ and $T_k^-(\pi)$ route from different positions. Since for all other initial states $V^\beta_\pi(x) = V^\beta_{T_k^-(\pi)}(x)$, we have to show that
$$V^\beta_\pi(n-1, 1, 1, R, k-2) - V^\beta_\pi(n-1, 1, 1, R, k-1) < 0.$$
As in part (i), the departure times of every customer $i \ne (k-1)$ in both systems are the same, and therefore it suffices to show that
$$E\big[1+\beta+\cdots+\beta^{\hat\tau-1}\big] - E\big[1+\beta+\cdots+\beta^{\tau-1}\big] < 0. \qquad (21)$$
Here, $\tau$ and $\hat\tau$ relate to the departure times of customer $(k-1)$. Similarly, alter the definitions of $T_1$ and $\eta$ in part (i) by relating them to customer $(k-1)$. Again, by the forward equations we have,
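$$\begin{aligned}
E\big[(1+\beta+\cdots+\beta^{\hat\tau-1}) - (1+\beta+\cdots+\beta^{\tau-1})\big]
&= \frac{\mu_1}{\mu_1+\mu_2}\,E\Big[\beta^{T_0}\big((1+\beta+\cdots+\beta^{Z_{k-2}-1}) - (1+\beta+\cdots+\beta^{X_{(k-2)}-1})\big)\Big] \\
&\quad + \frac{\mu_2}{\mu_1+\mu_2}\,E\Big[\eta\,\beta^{T_0+T_1}\big((1+\beta+\cdots+\beta^{X_{(k-2)}-1}) - (1+\beta+\cdots+\beta^{Z_{k-1}-1})\big)\Big] \\
&\quad - \frac{\mu_2}{\mu_1+\mu_2}\,E\Big[(1-\eta)\,\beta^{T_0+X_{(k-2)}}(1+\beta+\cdots+\beta^{X_{k-1}-1})\Big] \\
&= \Big(1 - \frac{\mu_2}{\mu_1+\mu_2}\,E\big[\eta\,\beta^{T_0+T_1}\big]\Big)\gamma_\beta(k-1) \\
&\quad + \frac{\mu_2}{\mu_1+\mu_2}\,E\Big[\eta\,\beta^{T_0+X_{(k-2)}}(1+\beta+\cdots+\beta^{X_{k-1}-1})(1-\beta^{T_1})\Big]. \qquad (22)
\end{aligned}$$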
To complete the proof we have to reduce the last positive term above. Observe that given $\{T_1 < \infty\}$, $T_1$ is definitely smaller than the first service time that would be given by server 1 after $T_0$. Therefore, $T_1$ is stochastically smaller than $X_1$. Hence, by Jensen's inequality the second summand in the right-hand side of (22) can be made arbitrarily close to zero, for $\beta$ sufficiently close to one. Finally, by the definition of $J_0$ and the fact that $(k-1) \ge J_0$, it follows that $\gamma_\beta(k-1) < 0$. Thus, there is a $\beta_2$, $\beta_1 \le \beta_2 < 1$, such that the right-hand side of (22) is negative for every $\beta_2 < \beta < 1$. This completes the proof of part (ii).
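The vanishing of the factor $(1-\beta^{T_1})$ as $\beta \to 1$ can be illustrated numerically. The sketch below does not reproduce the Jensen argument of the proof; it uses a simpler bound that follows from stochastic dominance alone: since $\beta^t$ is decreasing in $t$ and $T_1$ is stochastically smaller than $X_1$, $E[1-\beta^{T_1}] \le 1 - E[\beta^{X_1}]$. It assumes $X_1$ is geometric on $\{1, 2, \ldots\}$ with parameter `mu1`; the function names are hypothetical.

```python
def geometric_pgf(beta, mu):
    """E[beta^X] for X geometric on {1, 2, ...} with success probability mu."""
    return mu * beta / (1.0 - (1.0 - mu) * beta)


def discount_gap_bound(beta, mu1):
    """Upper bound 1 - E[beta^{X_1}] on E[1 - beta^{T_1}], valid whenever
    T_1 is stochastically smaller than X_1 (assumed geometric with mu1)."""
    return 1.0 - geometric_pgf(beta, mu1)


if __name__ == "__main__":
    for beta in (0.9, 0.99, 0.999, 0.9999):
        print(beta, discount_gap_bound(beta, 0.2))
```

The printed bound shrinks toward zero as `beta` approaches one, which is the behaviour the proof relies on when choosing $\beta_2$.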
□
Remark 3.2: If $\gamma_\beta(J_0-1) > 0$, then in part (i) of Lemma 3.4, $T_k^+(\pi)$ strictly improves $\pi$.
From Lemma 3.4, $T_J^+(\pi)$ and $T_J^-(\pi)$ are improvement transformations of policies that route from positions other than $J_0$. Therefore, they could successively be used to obtain a limiting stationary policy. Let $\pi_0$ be a fixed-position routing policy that routes from position $J \ne J_0$. For every $l \ge 0$, recursively define the non-stationary policy $\pi_{l+1} = T_J^+(\pi_l)$ (alternatively $\pi_{l+1} = T_J^-(\pi_l)$). Notice that for every $l$, $\pi_l$ satisfies the conditions of Lemma 3.4 and therefore $\pi_{l+1}$ improves $\pi_l$.
Let $\pi_\infty$ be the limiting policy. Policy $\pi_\infty$ is stationary and routes customers at the same queue lengths (of queue Q) that $\pi_0$ does. However, if $\pi_0$ routes from position $J > J_0$, then $\pi_\infty$ routes from position $(J-1)$. If $\pi_0$ routes from position $J < J_0$, then $\pi_\infty$ routes either from position $(J+1)$ (if not empty) or from position $J$ (otherwise).
Furthermore, Lemma 3.4 and Remark 3.1 imply that for $J > J_0$, $\pi_\infty$ is strictly better than $\pi_0$, and for $J < J_0$, $\pi_\infty$ is at least as good as $\pi_0$. For $J < J_0-1$, and for $J = J_0-1$ with $\gamma_\beta(J_0-1) > 0$, $\pi_\infty$ is strictly better than $\pi_0$.
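The routing position used by the limiting policy can be written as a small decision rule. This is only a sketch of the description above; the function name and the boolean flag `next_position_occupied` (whether position $J+1$ currently holds a customer) are hypothetical, and returning $J$ unchanged when $J = J_0$ is an assumption, since the construction is stated only for $J \ne J_0$.

```python
def limiting_routing_position(J, J0, next_position_occupied):
    """Position from which the limiting policy routes, given that the
    original fixed-position policy routes from position J."""
    if J > J0:
        # Repeated T^- steps move the routing position one step down.
        return J - 1
    if J < J0:
        # Repeated T^+ steps route from J + 1 when that position is occupied.
        return J + 1 if next_position_occupied else J
    # J == J0: nothing to improve (assumption, not covered by the text).
    return J
```

For instance, with $J_0 = 3$ a policy routing from position 5 is replaced by one routing from position 4, while a policy routing from position 2 routes from position 3 whenever that position is occupied.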
The following theorem extends the optimality property of $J_0$ to any routing-invariant class.
Theorem 3.2 The following hold with respect to the $\beta$-discounted cost, $\beta_2 < \beta < 1$, and to the average cost criteria.
(a) For every routing-invariant class $\Pi$ which results in positive recurrent Markov chains:
(a.1) If $J_0 \in J(\Pi)$, then the policy $\pi \in \Pi$ that routes from position $J_0$ is optimal within $\Pi$.
(a.2) If for every $J \in J(\Pi)$, $J > J_0$, then the policy $\pi \in \Pi$ that routes from position $J^* = \min\{J \mid J \in J(\Pi)\}$ is optimal within $\Pi$.
(a.3) If for every $J \in J(\Pi)$, $J < J_0$, then the policy $\pi \in \Pi$ that routes from position $J^* = \max\{J \mid J \in J(\Pi)\}$ is optimal within $\Pi$.
(b) For every fixed-position routing policy $\pi$ that routes from position $J$ and results in a positive recurrent Markov chain:
(b.1) If $J > J_0$, then $\pi$ is inferior to the policy that routes at the same lengths of queue Q, but from position $J_0$.
(b.2) If $J < J_0$, then $\pi$ is inferior to the non-fixed-position policy that routes at the same lengths of queue Q, but from position $(J+1)$ if not empty, and from $J$ otherwise.
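Part (a) of the theorem amounts to a simple selection rule over the set of admissible routing positions $J(\Pi)$. The sketch below encodes only the three cases stated in (a.1)-(a.3); the function name is hypothetical, and returning `None` when the set contains positions on both sides of $J_0$ but not $J_0$ itself is a choice made here, because the theorem does not address that case.

```python
def optimal_position_in_class(positions, J0):
    """Routing position singled out by Theorem 3.2(a) for a class whose
    admissible positions are `positions` (an iterable of integers)."""
    candidates = set(positions)
    if not candidates:
        raise ValueError("the class must admit at least one routing position")
    if J0 in candidates:
        return J0                      # case (a.1)
    if all(J > J0 for J in candidates):
        return min(candidates)         # case (a.2)
    if all(J < J0 for J in candidates):
        return max(candidates)         # case (a.3)
    return None                        # mixed case: not covered by the theorem
```

In every covered case the selected position is the admissible one closest to $J_0$.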
Proof: The proof for the $\beta$-discounted cost criterion is immediate from Lemma 3.4 and the discussion that follows. Indeed, within a routing-invariant class $\Pi$, one can successively improve a policy by gradually increasing (accordingly, decreasing) the routing position within $J(\Pi)$, until one hits $J_0$, $\max\{J \mid J \in J(\Pi)\}$ or $\min\{J \mid J \in J(\Pi)\}$. Parts (a.1), (a.2) and (a.3) follow respectively. Furthermore, (b.1) is an immediate consequence of (a.1), and (b.2) follows from part (i) of Lemma 3.4. The results for the average cost criterion follow from [5] by using the convergence of $(1-\beta)V^\beta_\pi(x)$ to the average cost as $\beta \to 1$, which holds for problems with a linear cost structure and continuous state jumps as ours (see [5]).
□
3.2.3 An optimal policy of threshold type
In this sub-section we show that if the routing position $J$ is greater than $J_0$, then an optimal policy with respect to the average cost criterion exists, and is of threshold type. Assume that $J > J_0$ and start with the $\beta$-discounted problem, $\beta_0 \le \beta < 1$. Recall that for such $J$'s,
$$\gamma_\beta(J-1) < 0, \qquad \beta_2 \le \beta < 1. \qquad (23)$$
The proof is based on policy iteration and develops along the same lines as the proof in [4], with some changes that are required by our different state space. Define a partial order "$\preceq$" on the states, as follows. Recall that a state $x$ is a tuple $x = (n, e_1, e_2, R, k)$. We say that $x \preceq y$, $x, y \in S$, if at least one of the following conditions holds:
(i) $x = y$ (component-wise).
(iv) $A(x) = y$.
(v) All components of $x$ and $y$ are equal except for one, which is smaller in $x$.
(vi) There is a $z \in S$ such that $x \preceq z$ and $z \preceq y$.
For every $f \in \mathcal{F}$ we also define the functions:
$$\Delta_f(n, k) = \begin{cases} f(n-2, 1, 1, [0], k) - f(n-3, 1, 1, [0], k), & n \ge 3,\ 0 \le k \le \min\{n-2, J-1\}; \\ f(0, 1, 1, [0], 0) - f(0, 0, 1, [0], 0), & n = 2,\ k = 0. \end{cases} \qquad (24)$$
$$\Delta_f(n) = \begin{cases} f(n-1, 1, 0, [0], 0) - f(n-2, 1, 0, [0], 0), & n \ge 2; \\ f(0, 1, 0, [0], 0) - f(0, 0, 0, [0], 0), & n = 1. \end{cases} \qquad (25)$$
In the following lemma we list some properties of $f \in \mathcal{F}$ that propagate to $T_{t_m} f$, $m \ge J$. This will be used to show that under every threshold policy $t_m$, $V^\beta_{t_m}$ also satisfies the same properties.
Lemma 3.5 If $f \in \mathcal{F}$ satisfies the following properties, so does $T_{t_m} f$, $m \ge J$.
(a) For every $x, y \in S$, if $x \preceq y$ then $f(x) \le f(y)$.
(b) For every $n > 2$, $\Delta_f(n, J-1) > h_\beta(\min\{n-2, J-1\})$.
(c) For every $n > 2$, $\Delta_f(n) > h_\beta(\min\{n-1, J-1\})$.
(d) For every $n > 2$, $\Delta_f(n, k) = \Delta_f(n, 0)$, $0 < k < \min\{n-2, J-1\}$.
(f) $f(n-1, 1, 1, [0], k-1) - f(n-1, 1, 1, [0], k) = \gamma_\beta(k)$, $1 < k < \min\{n, J-1\}$.
The proof of this lemma is standard but extremely tedious and we do not present the details here. The main lines are as follows. The function $T_{t_m} f$ is represented via Eq. (14) and the properties are verified one by one. The full verification is given in [2] and the reader may reproduce it based on the following properties, which are easily shown:
$$f(n+1, 1, 1, [0], k) > f(n, 1, 1, [0] + 1_k, J-1), \qquad k \le J-1,$$
$$f(n, 1, 1, [0], J-1) - f(n, 1, 0, [0], 0) \ge h_\beta(J-1), \qquad n \le J-2.$$
Here, $1_k = (r, (i_1, \ldots, i_k, \ldots, i_{J-1})) = (0, (0, \ldots, 1, \ldots, 0))$.
From Lemma 3.5 one may show, by successively using the operator $T_{t_m}$, that $V^\beta_{t_m}$ also satisfies properties (a)-(f) of the lemma. Indeed, it is easy to construct in a recursive manner a function $f_0$ that satisfies properties (a)-(f). From the lemma it follows that $T^{n+1}_{t_m} f_0 \stackrel{\rm def}{=} T_{t_m}(T^n_{t_m} f_0)$, $n \ge 1$, also satisfies these properties. Now, since $\lim_{n\to\infty} T^n_{t_m} f_0 = V^\beta_{t_m}$, we obtain the following corollary.
Corollary 3.1: For every $m \ge J$, the $\beta$-discounted cost function under policy $t_m$ satisfies properties (a)-(f) of Lemma 3.5.
The next lemma is the basis of our final result and its proof is similar to that in [4, Lemma 4]. The assumption $J > J_0$ and the property in (23) are crucial for reproducing the proof. The lemma asserts that the new policy that is obtained from $V^\beta_{t_{m_0}}$ by the policy iteration procedure is also of threshold type.
Lemma 3.6 For every $m_0$, $2 \le J_0 < J \le m_0 < \infty$, there exists an $m_1$, $J < m_1 \le m_0+1$, such that $T_{t_{m_1}} V^\beta_{t_{m_0}} = T V^\beta_{t_{m_0}}$.
Proof: To prove the lemma we need to explore the properties of the function
$$g_{V^\beta_{t_{m_0}}}(n, R) = V^\beta_{t_{m_0}}(n, 1, 0, R, 0) - V^\beta_{t_{m_0}}(n-1, 1, 1, R, J-1), \qquad n \ge J-1. \qquad (26)$$
From Lemma 3.3 it suffices to explore the function $g(n) \stackrel{\rm def}{=} g_{V^\beta_{t_{m_0}}}(n, [0])$. This will be carried out by using the forward equations in (6) and representing $g(n)$ in a recursive form. The forward equations depend on the value $n$ and we separately consider all possible cases.
Case (i): $1 < J-1 \le n < m_0-2$. (The policy $t_{m_0}$ does not route a customer at queue lengths $n+1$ and below.) From (6),
$$\begin{aligned}
g(n) &= \beta\lambda\big[V^\beta_{t_{m_0}}(n+1, 1, 0, [0], 0) - V^\beta_{t_{m_0}}(n, 1, 1, [0], J-1)\big] \\
&\quad + \beta\mu_1\big[V^\beta_{t_{m_0}}(n-1, 1, 0, [0], 0) - V^\beta_{t_{m_0}}(n-2, 1, 1, [0], J-2)\big] \\
&\quad + \beta\mu_2\big[V^\beta_{t_{m_0}}(n, 1, 0, [0], 0) - V^\beta_{t_{m_0}}(n-1, 1, 0, [0]+1_{J-1}, 0)\big]. \qquad (27)
\end{aligned}$$
First note that the expression in the first braces is $g(n+1)$. Next, add and subtract $V^\beta_{t_{m_0}}(n-2, 1, 1, [0], J-1)$ within the second braces. From the proof of part (ii) of Lemma 3.4 and the assumption $J > J_0$, we have
$$V^\beta_{t_{m_0}}(n-2, 1, 1, [0], J-1) - V^\beta_{t_{m_0}}(n-2, 1, 1, [0], J-2) > 0. \qquad (28)$$
Thus, by (27) we obtain
$$\begin{aligned}
g(n) &> \beta\lambda\, g(n+1) + \beta\mu_1\, g(n-1) + \beta\mu_2\big[V^\beta_{t_{m_0}}(n, 1, 0, [0], 0) - V^\beta_{t_{m_0}}(n-1, 1, 0, [0]+1_{J-1}, 0)\big] \\
&\ge \beta\lambda\, g(n+1) + \beta\mu_1\, g(n-1) + \beta\mu_2\big[V^\beta_{t_{m_0}}(n, 1, 0, [0], 0) - V^\beta_{t_{m_0}}(n-1, 1, 1, [0], J-1)\big] \\
&= \beta\lambda\, g(n+1) + \beta\mu_1\, g(n-1) + \beta\mu_2\, g(n).
\end{aligned}$$
The last inequality follows from Corollary 3.1 and property (a) of Lemma 3.5. Since $\lambda + \mu_1 + \mu_2 = 1$, we obtain for this case,
$$(1-\beta)g(n) - \beta\lambda\big(g(n+1) - g(n)\big) > \beta\mu_1\big(g(n-1) - g(n)\big). \qquad (29)$$
The same inequality is obtained for 1 [...]
$$\beta\mu_1\, g(n-1) + \beta\mu_2\big(\Delta_{V^\beta_{t_{m_0}}}(n+1) - h_\beta(J-1)\big) \ge \beta\mu_1\, g(n-1). \qquad (30)$$
The last inequality follows from property (c) of Lemma 3.5. The same inequality is obtained for the case $n = m_0-2 = 1$. Observe that the assumption $J > J_0 \ge 2$ and the requirement $m_0 \ge J$ imply that $m_0 \ge 3$.
Case (iii): $n = m_0-1 \ge 2$. (The policy $t_{m_0}$ routes a customer at queue length $n$ and above.) From (6),
$$\begin{aligned}
g(n) &= \beta\mu_1\big[V^\beta_{t_{m_0}}(n-1, 1, 0, [0], 0) - V^\beta_{t_{m_0}}(n-2, 1, 1, [0], J-2)\big] \\
&\quad + \beta\mu_2\big[V^\beta_{t_{m_0}}(n-1, 1, 1, [0], J-1) - V^\beta_{t_{m_0}}(n-1, 1, 0, [0]+1_{J-1}, 0)\big].
\end{aligned}$$
Again, from (28), the expression in the first braces is greater than $g(n-1)$. Furthermore, from Corollary 3.1 and property (a) of Lemma 3.5,
$$g(n) \ge \beta\mu_1\, g(n-1). \qquad (31)$$
Case (iv): $n \ge m_0 \ge 3$. (The policy $t_{m_0}$ routes a customer at queue length $n-1$ and above.) From (6),
$$\begin{aligned}
g(n) &= \beta\mu_1\big[V^\beta_{t_{m_0}}(n-2, 1, 1, [0], J-1) - V^\beta_{t_{m_0}}(n-2, 1, 1, [0], J-2)\big] \\
&\quad + \beta\mu_2\big[V^\beta_{t_{m_0}}(n-1, 1, 1, [0], J-1) - V^\beta_{t_{m_0}}(n-2, 1, 1, [0]+1_{J-1}, J-1)\big].
\end{aligned}$$
From (28), the expression in the first braces is positive. From Corollary 3.1 and property (a) of Lemma 3.5, the expression in the second braces is also non-negative. Thus,
$$g(n) \ge 0, \qquad n \ge m_0. \qquad (32)$$
To complete the proof note that from (29)-(32), $g(n)$ satisfies the conditions of the corresponding function in [4, Eq. (10)]. As a consequence, the rest of the proof is identical to the proof of [4, Lemma 4], and our lemma follows.
The assertion of the next theorem and its proof are identical to [4, Theorem 5]. The proof applies the convergence of the policy iteration to the $\beta$-optimal policy.
Theorem 3.3 For every $J > J_0$ and $\beta_2 < \beta < 1$:
(i) There exists a stationary policy of threshold type, with threshold $m^*(\beta)$ [...]
(ii) If $V^\beta_t(x)$ [...]
[...] Let $t_{m^*(J)}$, $J > J_0$, be the optimal policy with respect to the average cost, given that the routing position is $J$. From the proof of Lemma 3.4 part (ii), it is clear that $t_{m^*(J_0+1)}$ is at least as good as $t_{m^*(J)}$. Hence, $t_{m^*(J_0+1)}$ is at least as good as any other fixed-position policy that routes from $J > J_0$. Furthermore, from Theorem 3.2 part (b.1), the policy $t'_{m^*(J_0+1)}$ that routes customers whenever $t_{m^*(J_0+1)}$ does, but from position $J_0$, is at least as good as $t_{m^*(J_0+1)}$. Thus, the following corollary is obtained.
Corollary 3.2: The policy $t'_{m^*(J_0+1)}$ is at least as good as any other policy that routes from position $J > J_0$.
References
[1] A. K. Agrawala, E. G. Coffman, Jr., M. R. Garey and S. K. Tripathi, A Stochastic Optimization Algorithm Minimizing Expected Flow Times on Uniform Processors, IEEE Transactions on Computers, C-33(4):351-356, April 1984.
[2] S. Ayoun, Optimal Control of a Queueing System with Two Heterogeneous Servers with Resequencing, M.S. Thesis, Dept. of EE, Technion, Haifa 32000, Israel, February 1989.
[3] I. Iliadis and Y. C. Lien, Resequencing Delay for a Queueing System with Two Heterogeneous Servers Under a Threshold-type Scheduling, IEEE Transactions on Communications, COM-36:692-702, 1988.
[4] W. Lin and P. R. Kumar, Optimal Control of Queueing Systems with Two Heterogeneous Servers, IEEE Transactions on Automatic Control, AC-29:696-703, August 1984.
[5] S. A. Lippman, Semi-Markov Decision Processes with Unbounded Rewards, Management Science, 19:717-731, 1973.
[6] S. A. Lippman, Applying a New Device in the Optimization of Exponential Queueing Systems, Management Science, 23:687-710, 1975.
[7] R. L. Larsen, Control of Multiple Exponential Servers with Application to Computer Systems, Ph.D. Thesis, Technical Report No. 1041, University of Maryland, 1981.
[8] M. I. Reiman, Optimal Control of a Heterogeneous Two Server Queue in Light Traffic, AT&T Bell Lab., Murray Hill, NJ 07974, 1989.
[9] M. I. Reiman and B. Simon, Open Queueing Systems in Light Traffic, Math. Oper. Res., 1989.
[10] Z. Rosberg and A. Makowski, Optimal Routing to Parallel Heterogeneous Servers - Small Arrival Rates, Submitted to IEEE Transactions on Automatic Control, September 1988.
[11] M. Schäl, Conditions for Optimality in Dynamic Programming and for the Limit of n-stage Optimal Policies to be Optimal, Z. Wahrscheinlichkeitstheorie Verw. Gebiete, 32:179-196, 1975.
[12] S. Varma, Some Problems in Queueing Systems with Resequencing, M.S. Thesis, Technical Report TR-87-192, University of Maryland, College Park, 1987.
[13] J. Walrand, A Note on the Optimal Control of a Queueing System with Two Heterogeneous Servers, Systems and Control Letters, 4:131-134, 1984.
Figure 1: Routing from position J. (Diagram labels: service queue Q with indicators $q_1(t)$ and $q_2(t)$, position counters $\{i_1(t), \ldots, i_{J-1}(t)\}$, Server 2, and resequencing queues $R_1$ and $R_2$.)