Technion - Computer Science Department - Technical Report CS0219 - 1981
TECHNION - Israel Institute of Technology
Computer Science Department

OPTIMAL DECENTRALIZED CONTROL IN A MULTI-ACCESS CHANNEL WITH PARTIAL INFORMATION

by Zvi Rosberg

Technical Report #219
September 1981
OPTIMAL DECENTRALIZED CONTROL IN A MULTI-ACCESS CHANNEL WITH PARTIAL INFORMATION
by
Zvi Rosberg
ABSTRACT

We consider two transmission stations sharing a single communication channel. For different values of the input message rates $r_i$, $i = 1,2$, a simple open-loop control policy is shown to be optimal for the long-run average throughput criterion.
INTRODUCTION
We consider $I$ transmission stations sharing a single communication channel, subject to the following dynamic system:

$$X_i(t+1) = V_i(t) + X_i(t)\,(1 - V_i(t))\,(1 - u_i(t)), \qquad 1 \le i \le I, \tag{1.1}$$
11.1.
t
{X.(O),V.(t)ll :
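The recursion (1.1) can be illustrated with a minimal sketch. It assumes, purely for illustration, that $X_i(t)$, $V_i(t)$ and $u_i(t)$ all take values in {0, 1} (a one-message buffer indicator, an arrival indicator and a transmit decision); the function names `next_state` and `step_all` are hypothetical and do not come from the report.

```python
def next_state(x, v, u):
    """One step of recursion (1.1) for a single station.

    x: current value X_i(t), assumed to be 0 or 1
    v: arrival indicator V_i(t), assumed to be 0 or 1
    u: control u_i(t), assumed to be 0 or 1 (1 = transmit)
    """
    # A new arrival sets the state to 1; otherwise the old value
    # survives only if the station did not transmit.
    return v + x * (1 - v) * (1 - u)


def step_all(xs, vs, us):
    """Apply the same recursion to every station i = 1, ..., I."""
    return [next_state(x, v, u) for x, v, u in zip(xs, vs, us)]
```

For example, `step_all([1, 0], [0, 1], [1, 0])` applies the recursion to two stations at once: the first transmits with no new arrival, the second receives an arrival and stays idle.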
Clearly, Lemma 3.3:
o.
o
0, and the lemma is proved.
For every state
$(k_1,k_2)$
then
the proof is similar.) From Lemma 3.2
Define
does If
$\pi$
$\pi^*$
If
$\pi$
$*$
can only be a policy of type
as follows:
$= \pi_2(1,k_2)$ or $\pi(1,1)$, then let $\pi$ be the policy which always
$u$ $\pi^*$
it follows that
$= (1,0)$.
$= \pi_1(k_1,k_2)$ or
From Remark 3.1, it is easy to check that in any case
o
which is a contradiction.

Corollary 3.2: If $r_i > \frac{1}{2}$ for some $i$, then

Proof: The corollary follows from Lemma 3.3, since $P_i(k)$ is increasing in $k$.

Lemma 3.4: If $u^*(k_1,k_2) = (1,1)$ for some state $(k_1,k_2)$, then $u^* = (1,1)$.
Proof:
From Corollary 3.2 we may assume that
$= V(\pi^*) = V$
$(k_1,k_2)$ implies
for some state
$i = 1,2$.
From Lemma 3.2 it follows that any non-randomized stationary policy is
or
of the form $\pi(k_1,k_2)$, which are given in Figures 3.1 - 3.4. It also follows that any non-transient state is $(1,k)$ or $(k,1)$.
Suppose $u^*(k,1) = (1,1)$. (For $u^*(1,k) = (1,1)$, the proof is similar.) If $k = 1$, then $\pi^* = \pi(1,1)$ and the lemma is proved. If $k > 1$, then the process $k(t)$, $t = 0,1,2,\ldots$, under $\pi^*$, enters into state $(k,1)$ only from state $(k-1,1)$, and the control action at state $(k-1,1)$ is
From state $(k,1)$, the process under $\pi^*$ moves on to state $(1,1)$. The reward under $\pi^*$ during the transitions from $(k-1,1)$ to $(k,1)$ and then to $(1,1)$ is
(3.3)
Now, let $\bar\pi$ be the non-stationary policy which does the same as $\pi^*$ except when the process enters into state $(k-1,1)$. At this state, $\bar\pi$ takes the control action $u = (1,1)$. Then, the process moves on to state $(1,1)$, in which $\bar\pi$ takes again the control action $u = (1,1)$. At this point $\bar\pi$ proceeds as $\pi^*$. The reward under $\bar\pi$ during these two consecutive $(1,1)$-control actions is
(3.4)
It follows from (3.3) and (3.4) that
$\bar r > r^*$. (3.5)
All the other immediate rewards remain unchanged under $\bar\pi$. Thus from the mean ergodic theorem we obtain $V(\bar\pi) > V(\pi^*)$, which is a contradiction.
The following theorem is a direct consequence of Lemma 3.4 and Corollary 3.2.

Theorem 3.2: $k_i \ge 1$, $i = 1,2$.
Next, we shall find the values $(k_1,k_2)$ which maximize $V(\pi(k_1,k_2))$, $k_i \ge 2$, $i = 1,2$, and then we shall show that this maximum is larger than $V(\pi(1,1))$.
From (2.5), Figures 3.3, 3.4 and Remark 3.1, we have
and
(3.7)
From (3.7) we have

$$V(\pi(k_1,k_2+1)) = \frac{k_1+k_2-2}{k_1+k_2-1}\,V(\pi(k_1,k_2)) + \frac{1}{k_1+k_2-1}\bigl(P_2(k_2+1) - P_2(k_2) + P_1(1)\bigr)$$

and

$$V(\pi(k_1+1,k_2)) = \frac{k_1+k_2-2}{k_1+k_2-1}\,V(\pi(k_1,k_2)) + \frac{1}{k_1+k_2-1}\bigl(P_1(k_1+1) - P_1(k_1) + P_2(1)\bigr). \tag{3.8}$$

Thus
if and only if
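The weighted-average structure of (3.8) can be sketched numerically. The helper below is a hypothetical illustration, not part of the report: `increment` stands for the bracketed term in (3.8), for instance $P_2(k_2+1) - P_2(k_2) + P_1(1)$, and is passed in as a plain number because the definition of $P_i(k)$ is not reproduced here.

```python
def extended_value(v, k1, k2, increment):
    """Right-hand side of (3.8) for one step from pi(k1, k2).

    v:         current value V(pi(k1, k2))
    k1, k2:    current policy parameters (k1 + k2 > 1 is required)
    increment: the bracketed term of (3.8), supplied by the caller
    """
    n = k1 + k2
    # Weighted average of the old value (weight n - 2) and the
    # increment (weight 1), normalised by n - 1.
    return ((n - 2) * v + increment) / (n - 1)
```

Subtracting `v` from the returned value gives `(increment - v) / (k1 + k2 - 1)`, so under this update the value rises exactly when the increment exceeds the current value.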
Lemma 3.5:
(a) If $r_2 \ge r_1$ then $V(\pi(2,2)) > P_2(3) - P_2(2) + P_1(1)$.
(b) If $r_1 \ge r_2$ then $V(\pi(2,2)) > P_1(3) - P_1(2) + P_2(1)$.

Proof: Since $P_i(k+1) - P_i(k)$ decreases in $k$, the lemma follows by a straightforward computation using the definition of $V(\pi(2,2))$ given in (3.7).

Theorem 3.3: For finding the optimal control policies when $r_2 \ge r_1$ ($r_1 \ge r_2$), the only policies which have to be considered are $\pi(1,1)$ and $\pi(k,2)$ ($\pi(2,k)$), $k = 2,3,\ldots$

Proof: Let $k_1 \ge 2$. If $V(\pi(k_1,2)) > P_2(3) - P_2(2) + P_1(1)$, then from (3.7) and (3.9) it follows that

$$\max_{k_2} V(\pi(k_1,k_2)) = V(\pi(k_1,2)). \tag{3.10}$$

If $V(\pi(k_1,2)) \le P_2(3) - P_2(2) + P_1(1)$ then, since $P_i(k+1) - P_i(k)$ decreases in $k$, it follows from (3.7), (3.9) and Lemma 3.5 that
(3.11)
Now, the theorem follows from (3.10) and (3.11).

In Theorem 3.4 below it will be shown that
$\pi(1,1)$ is not optimal.

Lemma 3.6: For every $k = 0,1,2,\ldots$,

$$(1-r)^k \le 1 - kr + r^2(k-1)^2.$$

Proof: By a standard induction on $k$.

Theorem 3.4: For $r_2 \ge r_1$ ($r_1 \ge r_2$), the policy $\pi(k_0,2)$ ($\pi(2,k_0)$), where $k_0$ is the smallest integer not smaller than
is better than $\pi(1,1)$.

Proof: We consider only the case
From (3.6) and (3.7) we have
(3.12)
where
(3.13)
From (3.12) and (3.13) it is sufficient to show that
From Lemma 3.6 and the definition of $k_0$ we obtain
2 g(k ) ~ 1- (l-k r + r 2 (k _1)2) - r - k r + 2k r r o 1 2 001 0 201
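The bound of Lemma 3.6 used in the proof above can be spot-checked numerically. The sketch below assumes the lemma is the upper bound $(1-r)^k \le 1 - kr + r^2(k-1)^2$ with $0 \le r \le 1$ and $k = 0,1,2,\ldots$; the function names, the grid of test points and the rounding tolerance are illustrative choices, not taken from the report.

```python
def lemma_3_6_bound(r, k):
    """Right-hand side 1 - k*r + r^2 * (k - 1)^2 of the assumed bound."""
    return 1 - k * r + r ** 2 * (k - 1) ** 2


def bound_holds(r, k, tol=1e-12):
    """Check (1 - r)^k <= bound, allowing a small floating-point slack."""
    return (1 - r) ** k <= lemma_3_6_bound(r, k) + tol


# Grid check over r = 0.00, 0.01, ..., 1.00 and k = 0, ..., 50.
violations = [
    (r / 100, k)
    for r in range(101)
    for k in range(51)
    if not bound_holds(r / 100, k)
]
```

A grid check of this kind does not replace the induction argument; it only guards against a misreading of the inequality. The bound is tight at $k = 1$ and $k = 2$, which is why a tolerance is used.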
Finally, we present the conclusive theorem, which determines the optimal control policy.
Theorem 3.5:
(a) If $r_2 \ge r_1$, then $\pi^* = \pi(k^*,2)$, where $k^*$ is the smallest integer satisfying
(3.14)
Moreover, $V(\pi^*)$
(b) If $r_1 \ge r_2$, then $\pi^* = \pi(2,k^*)$, where $k^*$ is the smallest integer satisfying
(3.15)
Moreover, $V(\pi^*)$

Proof:
We consider only (a).
$V(\pi(k+1,2))$ $V(\pi(k,2))$
~1
From (3.12) it follows that
if an.