Anglers’ fishing problem
arXiv:1112.2271v1 [math.PR] 10 Dec 2011
Anna Karpowicz and Krzysztof Szajowski
Abstract The considered model will be formulated as related to "the fishing problem", even if other applications of it are more obvious. The angler goes fishing. He uses various techniques and has at most two fishing rods. He buys a fishing ticket for a fixed time. The fishes are caught with the use of different methods according to renewal processes. The fishes' values and the interarrival times are given by sequences of independent, identically distributed (i.i.d.) random variables with known distribution functions. This forms a marked renewal–reward process. The angler's measure of satisfaction is given by the difference between the utility function, depending on the value of the fishes caught, and the cost function connected with the time of fishing. In this way, the angler's relative opinion about the methods of fishing is modelled. The angler's aim is to derive as much satisfaction as possible, and additionally he has to leave the lake before a fixed moment. Therefore his goal is to find two optimal stopping times in order to maximize his satisfaction. At the first moment he changes the technique of fishing, e.g. by excluding one rod and concentrating on the remaining one. Next, he decides when he should stop the expedition. These stopping times have to be shorter than the fixed time of fishing. Dynamic programming methods are used to find these two optimal stopping times and to specify the expected satisfaction of the angler at these times.

Key words: fishing problem, optimal stopping, dynamic programming, semi-Markov process, marked renewal process, renewal–reward process, infinitesimal generator

AMS 2010 Subject Classifications: 60G40, 60K99, 90A46
Anna Karpowicz, Bank Zachodni WBK, Rynek 9/11, 50-950 Wrocław, Poland, e-mail: [email protected]

Krzysztof Szajowski, Institute of Mathematics and Computer Sci., Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland, e-mail: [email protected]
1 Introduction

Before we start the analysis of the double optimal stopping problem (cf. the idea of multiple stopping for stochastic sequences in Haggstrom [8] and Nikolaev [16]) for the marked renewal process related to the angler's behavior, let us present the so-called "fishing problem". One of the first authors to consider the basic version of this problem was Starr [19]; further generalizations were made by Starr and Woodroofe [21], Starr, Wardrop and Woodroofe [20], Kramer and Starr [14], and others. A detailed review of the papers related to the fishing problem was given by Ferguson [7]. A simple formulation of the fishing problem, in which the angler changes the fishing place or technique before leaving, was solved by Karpowicz [12]. We extend the problem to a more advanced model by taking into account various techniques of fishing used at the same time (parallel renewal–reward processes, or a multivariate renewal–reward process). This is motivated by natural, more precise models of the known, real applications of the fishing problem. The typical process of software testing consists of checking subroutines. At the beginning, many kinds of bugs are searched for. The consecutive stopping times are the moments when the expert stops the general testing of modules and starts checking for the most important, dangerous type of error. Similarly, in proofreading it is natural to look first for typographic and grammar errors at the same time, and only later for language mistakes. As the various tasks are done by different groups of experts, it is natural that the groups compete with each other. If the first period of work is meant for one group and the second period needs other experts, then the groups can be treated as players of a game. In this case the proposed solution is to find a Nash equilibrium whose strategies are the stopping times.
The techniques applied for modelling and for finding the optimal solution are similar to those used in the formulation and solution of the optimal stopping problem for the risk process. Both models are based on the methodology explicated by Boshuizen and Gouweleeuw [1]. The background mathematics for further reading can be found in the monographs by Brémaud [3], Davis [4] and Shiryaev [18]. Optimal stopping problems for the risk process are the subject of papers by Jensen [10], Ferenstein and Sierociński [6], and Muciek [15]. A similar problem for a risk process with disruption (i.e. when the probability structure of the considered process changes at some moment θ) has been analyzed by Ferenstein and Pasternak-Winiarski [5]. The model of that paper brings to mind the change of fishing methods considered here; however, here the change is made by the decision maker, not by the environment. The following two sections present the details of the model. It is proper to emphasize that the slight modification of the background assumptions, by adopting multivariate tools (two rods) and allowing control of the number of rods in use, forces a different structure of the base model (the underlying process, and the sets of strategies, i.e. admissible filtrations and stopping times). This modified structure allows the introduction of a new kind of knowledge selection, which consequently leads to a game model of the anglers' expedition problem in Sections 1.2 and 2.2. After a quite general
formulation, a version of the problem will be chosen for a detailed solution. However, the solution is presented as a scalable procedure depending on parameters which reflect various circumstances. It is not difficult to adapt the solution to a wide range of natural cases.
1.1 Single angler's expedition

The angler goes fishing. He buys a fishing ticket for a fixed time $t_0$ which gives him the right to use at most two rods. The total cost of fishing depends on the real time each piece of equipment is used and on the number of rods used simultaneously. He starts fishing with two rods up to the moment $s$. The effect on each rod can be modelled by the renewal processes $\{N_i(t), t\ge 0\}$, where $N_i(t)$ is the number of fishes caught on rod $i$, $i\in A := \{1,2\}$, during time $t$. Let us combine them together into a marked renewal process. The usage of the $i$-th rod up to time $t$ generates a cost $c_i : [0,t_0]\to\Re$ (when the rod is used simultaneously with other rods this is denoted by an index depending on the set of rods in use, e.g. $a$: $c_i^a$) and a reward represented by i.i.d. random variables $X_1^{\{i\}}, X_2^{\{i\}},\ldots$ (the values of the fishes caught on the $i$-th rod) with cumulative distribution function $H_i$.¹ The streams of the two kinds of fishes are mutually independent, and they are independent of the sequence of random moments at which the fishes have been caught. The 2-vector process $\vec N(t) = (N_1(t), N_2(t))$, $t\ge 0$, can also be represented by a sequence of random variables $T_n$ taking values in $[0,\infty]$ such that

$$T_0 = 0, \qquad T_n < \infty \Rightarrow T_n < T_{n+1}, \quad n\in\mathbb N, \tag{1}$$

and a sequence of $A$-valued random variables $z_n$, $n\in\mathbb N\cup\{0\}$ (see Brémaud [3], Ch. II, and Jacobsen [9]). The random variable $T_n$ denotes the moment of catching the $n$-th fish ($T_0 = 0$) of any kind and the random variable $z_n$ indicates to which kind the $n$-th fish belongs. The processes $N_i(t)$ can be defined by the sequence $\{(T_n, z_n)\}_{n=0}^\infty$ as

$$N_i(t) = \sum_{n=1}^{\infty} \mathbb I_{\{T_n\le t\}}\mathbb I_{\{z_n=i\}}. \tag{2}$$

Both the 2-variate process $\vec N(t)$ and the double sequence $\{(T_n, z_n)\}_{n=0}^\infty$ are called a 2-variate renewal process. The optimal stopping problem for the compound risk process based on a 2-variate renewal process was considered by Szajowski [22]. Let us define, for $i\in A$ and $k\in\mathbb N$, the sequence

$$n_0^{\{i\}} = 0, \qquad n_{k+1}^{\{i\}} = \inf\{n > n_k^{\{i\}} : z_n = i\} \tag{3}$$

¹ The following convention is used throughout the paper: $\vec x = (x_1, x_2, \ldots, x_s)$ for the ordered collection of the elements $\{x_i\}_{i=1}^s$.
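The merged stream $\{(T_n, z_n)\}$ and the counts $N_i(t)$ of (2) can be sketched in a short simulation. This is only an illustrative sketch: the exponential waiting times, the rates, and the function names are assumptions made here for the example, not part of the model's generality.

```python
import random

def simulate_catches(t0, rates, seed=0):
    """Simulate the 2-variate renewal process: merged catch times T_n
    with rod labels z_n, one independent renewal stream per rod."""
    rng = random.Random(seed)
    events = []  # list of (T_n, z_n)
    for i, rate in enumerate(rates, start=1):
        t = 0.0
        while True:
            t += rng.expovariate(rate)   # S_n^{i}, i.i.d. with c.d.f. F_i
            if t > t0:
                break
            events.append((t, i))
    events.sort()                        # merge the two streams by catch time
    return events

def N(events, i, t):
    # N_i(t) = sum_n I{T_n <= t} I{z_n = i}, as in formula (2)
    return sum(1 for (T, z) in events if T <= t and z == i)

events = simulate_catches(t0=10.0, rates=(1.0, 0.5))
print(events[:3], N(events, 1, 10.0), N(events, 2, 10.0))
```

The labels $z_n$ carry the "mark" of the marked renewal process; everything below (values of catches, pay-offs) is built on top of this merged stream.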
and put $T_k^{\{i\}} = T_{n_k^{\{i\}}}$. Let us define the random variables $S_n^{\{i\}} = T_n^{\{i\}} - T_{n-1}^{\{i\}}$ and assume that they are i.i.d. with continuous cumulative distribution function $F_i(t) = P(S_n^{\{i\}}\le t)$ and conditional distribution function $F_i^s(t) = P(S_n^{\{i\}}\le t \mid S_n^{\{i\}}\ge s)$. In Section 2.1 an alternative representation of the 2-variate renewal process will be proposed. There is also a mild extension of the model in which the stream of events after some moment changes to another stream of events.

Remark 1. In various procedures it is necessary to localize the events in a group of the renewal processes. Let $C$ be the set of indices related to such a group. The sequence $\{n_k^C\}_{k=0}^\infty$ such that $n_0^C = 0$, $n_{k+1}^C := \inf\{n > n_k^C : z_n\in C\}$ has an obvious meaning. Analogously, $n^C(t) := \inf\{n : T_n > t, z_n\in C\}$.

Let $i, j\in A$. The angler's satisfaction measure (the net reward) in period $a$ from rod $i$ is the difference between the utility function $g_i^a : [0,\infty)^2\times A\times\Re^+\to[0, G_i^a]$, which can be interpreted as the reward from the $i$-th rod when the last success was on rod $j$ (additionally, it depends on the values of the fishes caught and on the moment of evaluation), and the cost function $c_i^a : [0,t_0]\to[0, C_i^a]$ reflecting the cost of the duration of the angler's expedition. We assume that $g_i^a$ and $c_i^a$ are continuous and bounded, and additionally that the $c_i^a$ are differentiable. Each fishing method is evaluated with different utility and cost functions; in this way the angler's relative opinion about the methods is modelled. The angler can change his method of fishing at the moment $s$ and decide to use only one rod. It could be one of the rods used up to the moment $s$ or another one. Even though the rod used after $s$ may be one of those used before $s$, its effectiveness could differ before and after $s$. Following these arguments, the mathematical model of catching fishes, and of their values after $s$, could (and in practice should) be different from the one for the rods used before $s$.
A reason for reducing the number of rods could be the better effectiveness of the remaining one. The value of the fishes caught up to time $t$, if the change of the fishing technique took place at time $s$, is given by

$$M_t^s = \sum_{i\in A}\sum_{n=1}^{N_i(s\wedge t)} X_n^{\{i\}} + \sum_{n=1}^{N_3((t-s)^+)} X_n^{\{3\}} = M_{s\wedge t} + \sum_{n=1}^{N_3((t-s)^+)} X_n^{\{3\}},$$

where $M_t^{\{i\}} = \sum_{n=1}^{N_i(t)} X_n^{\{i\}}$ and $M_t = \sum_{i=1}^2 M_t^{\{i\}}$. We denote $\vec M_t = (M_t^{\{1\}}, M_t^{\{2\}})$. Let $Z(s,t)$ denote the angler's pay-off for stopping at time $t$ (the end of the expedition) if the change of the fishing method took place at time $s$. The effect of extending the expedition after $s$ is described by $g_j^b : (\Re^+)^2\times A\times[0,t_0]\times\Re\times[0,t_0]\to[0, G_j^b]$, $j\in B$, minus the additional cost of time $c_j^b(\cdot)$, where $c_j^b : [0,t_0]\to[0, C_j^b]$ (when $\mathrm{card}(B) = 1$ the index $j$ will be dropped; also $c^b = \sum_{j\in B} c_j^b$ will be used where adequate). The pay-off can be expressed as:
$$Z(s,t) = \begin{cases} g^a(\vec M_t, z_{N(t)}, t) - c^a(t) & \text{if } t < s\le t_0,\\[2pt] g^a(\vec M_s, z_{N(s)}, s) - c^a(s) + g^b(\vec M_s, z_{N(s)}, s, M_t^s, t) - c^b(t-s) & \text{if } s\le t\le t_0,\\[2pt] -C & \text{if } t_0 < t, \end{cases} \tag{4}$$

where the function $c^a(t)$, the function $g^a(\vec m, i, t)$ and the constant $C$ can be taken as follows: $c^a(t) = \sum_{i=1}^2 c_i^a(t)$, $g^a(\vec m, j, t) = \sum_{i=1}^2 g_i^a(\vec m, j, t)$, $C = C_1^a + C_2^a + C^b$. After the moment $s$ the modelling process is a renewal–reward one with a stream of i.i.d. random variables $X_n^{\{3\}}$ at the moments $T_n^{\{3\}}$ (i.e. appearing according to the renewal process $N_3(t)$). With the notation $w^b(\vec m, i, s, \tilde m, t) = w^a(\vec m, i, s) + g^b(\vec m, i, s, \tilde m, t) - c^b(t-s)$ and $w^a(\vec m, i, t) = g^a(\vec m, i, t) - c^a(t)$, formula (4) is reduced to

$$Z(s,t) = Z^{\{z_{N(t)}\}}(s,t)\,\mathbb I_{\{t<s\le t_0\}} + Z^{\{z_{N(s)}\}}(s,t)\,\mathbb I_{\{s\le t\}},$$

where

$$Z^{\{i\}}(s,t) = \mathbb I_{\{t<s\le t_0\}}\, w^a(\vec M_t, i, t) + \mathbb I_{\{s\le t\le t_0\}}\, w^b(\vec M_s, i, s, M_t^s, t) - \mathbb I_{\{t_0<t\}}\, C.$$
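The case structure of the pay-off (4) can be illustrated with a small numerical sketch. The concrete utilities and costs below ($g^a$, $g^b$, $c^a$, $c^b$ and all numbers) are placeholder choices made here for illustration; the paper only assumes they are bounded, continuous, etc.

```python
import math

T0 = 10.0          # length of the fishing ticket t_0
C = 5.0            # penalty for exceeding t_0

def g_a(m, t):  return 2.0 * math.sqrt(m)      # concave utility of the mass caught
def g_b(m, t):  return 1.5 * math.sqrt(m)
def c_a(t):     return 0.4 * t                 # cost of fishing with two rods
def c_b(t):     return 0.2 * t                 # cost after switching to one rod

def payoff(s, t, mass_before, mass_after):
    """Z(s,t): net reward for switching methods at s and stopping at t."""
    if t > T0:
        return -C                               # overstayed the ticket
    if t < s:                                   # still in the two-rod phase
        return g_a(mass_before, t) - c_a(t)
    # switched at s: reward accrued before s plus the one-rod continuation
    return (g_a(mass_before, s) - c_a(s)
            + g_b(mass_after, t) - c_b(t - s))

print(payoff(s=4.0, t=8.0, mass_before=3.0, mass_after=1.0))
```

Note how the three branches mirror the three cases of (4): stopping before the switch, stopping after it, and overrunning the horizon $t_0$.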
$$= \mathbb I_{\{R_n^b\le t_0-T_n\}}\bar F(R_n^b)\,\hat w^b(M_s, s, M_n^s, T_n + R_n^b) + \mathbf E\big[\mathbb I_{\{S_{n+1}^{\{3\}}\le R_n^b\}}\,\mathbf E[Z(s, \tau\vee T_{n+1})\mid\mathcal F_{n+1}^s]\ \big|\ \mathcal F_n^s\big].$$

Let $\sigma\in\mathcal T_{n+1}^b$. For every $\tau\in\mathcal T_n$ we have

$$\tau = \begin{cases}\sigma & \text{if } R_n^b\ge S_{n+1}^{\{3\}},\\ T_n + R_n^b & \text{if } R_n^b < S_{n+1}^{\{3\}}.\end{cases}$$

We have

$$\mathbf E[Z(s,\tau)\mid\mathcal F_n^s] = \mathbf E\big[\mathbb I_{\{S_{n+1}^{\{3\}}\le R_n^b\}}\,\mathbf E[Z(s,\sigma)\mid\mathcal F_{n+1}^s]\ \big|\ \mathcal F_n^s\big] + \mathbb I_{\{R_n^b\le t_0-T_n\}}\bar F(R_n^b)\,\hat w^b(M_s, s, M_n^s, T_n + R_n^b)$$
$$\le \sup_{R\in\mathrm{Mes}(\mathcal F_n^s)}\Big\{\mathbf E\big[\mathbb I_{\{S_{n+1}^{\{3\}}\le R\}}\,\Gamma_{n+1,K}^s\mid\mathcal F_n^s\big] + \mathbb I_{\{R\le t_0-T_n\}}\bar F(R)\,\hat w^b(M_s, s, M_n^s, T_n^{\{3\}} + R)\Big\} = \mathbf E[Z(s,\tau_{n,K}^\star)\mid\mathcal F_n^s].$$

It follows that $\sup_{\tau\in\mathcal T_n^s}\mathbf E[Z(s,\tau)\mid\mathcal F_n^s]\le\mathbf E[Z(s,\tau_{n,K}^\star)\mid\mathcal F_n^s]\le\sup_{\tau\in\mathcal T_n^b}\mathbf E[Z(s,\tau)\mid\mathcal F_n^s]$, where the last inequality holds because $\tau_{n,K}^\star\in\mathcal T_{n,K}^s$. We apply the induction hypothesis, which completes the proof. □
Lemma 4. $\Gamma_{n,K}^s = \gamma_{K-n}^{s,M_s}(M_n^s, T_n^{\{3\}})$ for $n = K,\ldots,0$, where the sequence of functions $\gamma_j^{s,m}$ is given recursively as follows:

$$\gamma_0^{s,m}(\tilde m, t) = \mathbb I_{\{t\le t_0\}}\,\hat w^b(m, s, \tilde m, t) - C\,\mathbb I_{\{t>t_0\}},$$
$$\gamma_j^{s,m}(\tilde m, t) = \mathbb I_{\{t\le t_0\}}\sup_{r\ge 0}\kappa^b_{\gamma_{j-1}^{s,m}}(m, s, \tilde m, t, r) - C\,\mathbb I_{\{t>t_0\}}, \tag{19}$$

where

$$\kappa_\delta^b(m, s, \tilde m, t, r) = \bar F(r)\big[\mathbb I_{\{r\le t_0-t\}}\,\hat w^b(m, s, \tilde m, t+r) - C\,\mathbb I_{\{r>t_0-t\}}\big] + \int_0^r dF(z)\int_0^\infty \delta(\tilde m + x, t + z)\,dH(x).$$
Proof of Lemma 4. Since the case $t > t_0$ is obvious, let us assume that $T_n^{\{3\}}\le t_0$ for $n\in\{0,\ldots,K-1\}$. Let us notice that according to Lemma 3 we obtain $\Gamma_{K,K}^s = \gamma_0^{s,M_s}(M_K^s, T_K^{\{3\}})$, thus the proposition is satisfied for $n = K$. Let $n = K-1$; then Lemma 3 and the induction hypothesis lead to

$$\Gamma_{K-1,K}^s = \operatorname*{ess\,sup}_{R_{K-1}^b\in\mathrm{Mes}(\mathcal F_{s,K-1})}\Big\{\bar F(R_{K-1}^b)\big[\mathbb I_{\{R_{K-1}^b\le t_0-T_{K-1}^{\{3\}}\}}\,\hat w^b(M_s, s, M_{K-1}^s, T_{K-1}^{\{3\}}+R_{K-1}^b) - C\,\mathbb I_{\{R_{K-1}^b>t_0-T_{K-1}^{\{3\}}\}}\big] + \mathbf E\big[\mathbb I_{\{S_K^{\{3\}}\le R_{K-1}^b\}}\,\gamma_0^{s,M_s}(M_K^s, T_K^{\{3\}})\mid\mathcal F_{s,K-1}\big]\Big\},$$

where $M_K^s = M_{K-1}^s + X_K^{\{3\}}$, $T_K^{\{3\}} = T_{K-1}^{\{3\}} + S_K^{\{3\}}$ a.s., and the random variables $X_K^{\{3\}}$ and $S_K^{\{3\}}$ are independent of $\mathcal F_{s,K-1}$. Moreover $R_{K-1}^b$, $M_{K-1}^s$ and $T_{K-1}^{\{3\}}$ are $\mathcal F_{s,K-1}$-measurable. It follows that

$$\Gamma_{K-1,K}^s = \operatorname*{ess\,sup}_{R_{K-1}^b\in\mathrm{Mes}(\mathcal F_{s,K-1})}\Big\{\bar F(R_{K-1}^b)\big[\mathbb I_{\{R_{K-1}^b\le t_0-T_{K-1}^{\{3\}}\}}\,\hat w^b(M_s, s, M_{K-1}^s, T_{K-1}^{\{3\}}+R_{K-1}^b) - C\,\mathbb I_{\{R_{K-1}^b>t_0-T_{K-1}^{\{3\}}\}}\big] + \int_0^{R_{K-1}^b} dF(z)\int_0^\infty \gamma_0^{s,M_s}(M_{K-1}^s + x, T_{K-1}^{\{3\}} + z)\,dH(x)\Big\} = \gamma_1^{s,M_s}(M_{K-1}^s, T_{K-1}^{\{3\}})\quad\text{a.s.}$$

Let $n\in\{1,\ldots,K-1\}$ and suppose that $\Gamma_{n,K}^s = \gamma_{K-n}^{s,M_s}(M_n^s, T_n^{\{3\}})$. Similarly as before, we conclude by Lemma 3 and the induction hypothesis that

$$\Gamma_{n-1,K}^s = \operatorname*{ess\,sup}_{R_{n-1}^b\in\mathrm{Mes}(\mathcal F_{n-1}^s)}\Big\{\bar F(R_{n-1}^b)\big[\mathbb I_{\{R_{n-1}^b\le t_0-T_{n-1}^{\{3\}}\}}\,\hat w^b(M_s, s, M_{n-1}^s, T_{n-1}^{\{3\}}+R_{n-1}^b) - C\,\mathbb I_{\{R_{n-1}^b>t_0-T_{n-1}^{\{3\}}\}}\big] + \int_0^{R_{n-1}^b} dF(z)\int_0^\infty \gamma_{K-n}^{s,M_s}(M_{n-1}^s + x, T_{n-1}^{\{3\}} + z)\,dH(x)\Big\}\quad\text{a.s.},$$

therefore $\Gamma_{n-1,K}^s = \gamma_{K-(n-1)}^{s,M_s}(M_{n-1}^s, T_{n-1}^{\{3\}})$. □
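The recursion (19) lends itself to numerical evaluation. The sketch below computes one backward-induction step $\gamma_{j-1}\mapsto\gamma_j$ under simplifying assumptions made here, not in the paper: exponential inter-catch times $F(z) = 1 - e^{-\lambda z}$, a two-point catch-size distribution $H$, a placeholder net reward $\hat w^b$, and ad hoc midpoint quadrature with grid search over $r$.

```python
import math

lam, T0, C = 1.0, 10.0, 5.0
H = [(0.5, 1.0), (0.5, 2.0)]          # (probability, fish value x)

def w_b(m, s, m_tilde, t):            # placeholder net reward w-hat^b
    return 1.5 * math.sqrt(m_tilde - m) - 0.2 * (t - s)

def kappa(delta, m, s, m_tilde, t, r, nz=200):
    # kappa^b_delta(m, s, m_tilde, t, r) of Lemma 4, by midpoint quadrature
    Fbar = math.exp(-lam * r)
    head = Fbar * (w_b(m, s, m_tilde, t + r) if r <= T0 - t else -C)
    tail, dz = 0.0, r / nz
    for k in range(nz):
        z = (k + 0.5) * dz
        dF = lam * math.exp(-lam * z) * dz          # dF(z)
        tail += dF * sum(p * delta(m_tilde + x, t + z) for p, x in H)
    return head + tail

def gamma0(m, s):
    def g(m_tilde, t):
        return w_b(m, s, m_tilde, t) if t <= T0 else -C
    return g

def gamma_step(delta, m, s, nr=50):
    # gamma_j(m_tilde, t) = sup_r kappa(...), searched on a grid of r
    def g(m_tilde, t):
        if t > T0:
            return -C
        rs = [(T0 - t) * k / nr for k in range(nr + 1)]
        return max(kappa(delta, m, s, m_tilde, t, r) for r in rs)
    return g

g0 = gamma0(m=0.0, s=2.0)
g1 = gamma_step(g0, m=0.0, s=2.0)
print(g1(1.0, 3.0))
```

Since $r = 0$ is always a candidate, each step can only improve the value, mirroring the monotonicity of the value functions in the number of remaining catches.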
From now on we will use $\alpha_i$ to denote the hazard rate of the distribution $F_i$ (i.e. $\alpha_i = f_i/\bar F_i$) and, to shorten notation, we set $\Delta^{\cdot}(a) = \mathbf E\big[\hat g^{\cdot}(a + X^{\{i\}}) - \hat g^{\cdot}(a)\big]$, where $\cdot$ stands for $a$ or $b$.

Remark 3. The sequence of functions $\gamma_j^{s,m}$ can be expressed as

$$\gamma_0^{s,m}(\tilde m, t) = \mathbb I_{\{t\le t_0\}}\,\hat w^b(m, s, \tilde m, t) - C\,\mathbb I_{\{t>t_0\}},$$
$$\gamma_j^{s,m}(\tilde m, t) = \mathbb I_{\{t\le t_0\}}\big[\hat w^b(m, s, \tilde m, t) + y_j^b(\tilde m - m, t-s, t_0-t)\big] - C\,\mathbb I_{\{t>t_0\}},$$

where $y_j^b(a,b,c)$ is given recursively as follows:

$$y_0^b(a,b,c) = 0, \qquad y_j^b(a,b,c) = \max_{0\le r\le c}\varphi^b_{y_{j-1}^b}(a,b,c,r),$$

where

$$\varphi_\delta^b(a,b,c,r) = \int_0^r \bar F(z)\big\{\alpha_2(z)\big[\Delta^b(a) + \mathbf E\,\delta(a + X^{\{3\}}, b+z, c-z)\big] - c^{b\prime}(b+z)\big\}\,dz.$$

Proof of Remark 3. Clearly

$$\int_0^r dF(z)\int_0^\infty \gamma_{j-1}^{s,m}(\tilde m + x, t+z)\,dH(x) = \mathbf E\big[\mathbb I_{\{S^{\{3\}}\le r\}}\,\gamma_{j-1}^{s,m}(\tilde m + X^{\{3\}}, t + S^{\{3\}})\big],$$

where $S^{\{3\}}$ has c.d.f. $F$ and $X^{\{3\}}$ has c.d.f. $H$. Since $F$ is continuous and $\kappa^b_{\gamma_{j-1}^{s,m}}(m, s, \tilde m, t, r)$ is bounded and continuous for $t\in\mathbb R_+\setminus\{t_0\}$, the supremum in (19) can be changed into a maximum. Let $r > t_0 - t$; then

$$\kappa^b_{\gamma_{j-1}^{s,m}}(m, s, \tilde m, t, r) = \mathbf E\big[\mathbb I_{\{S^{\{3\}}\le t_0-t\}}\,\gamma_{j-1}^{s,m}(\tilde m + X^{\{3\}}, t + S^{\{3\}})\big] - C\,\bar F(t_0-t)$$
$$\le \mathbf E\big[\mathbb I_{\{S^{\{3\}}\le t_0-t\}}\,\gamma_{j-1}^{s,m}(\tilde m + X^{\{3\}}, t + S^{\{3\}})\big] + \bar F(t_0-t)\,\hat w^b(m, s, \tilde m, t_0) = \kappa^b_{\gamma_{j-1}^{s,m}}(m, s, \tilde m, t, t_0-t).$$

The above calculations show that $\gamma_j^{s,m}(\tilde m, t) = \mathbb I_{\{t\le t_0\}}\max_{0\le r\le t_0-t}\varphi_j(m, s, \tilde m, t, r) - C\,\mathbb I_{\{t>t_0\}}$, where

$$\varphi_j(m, s, \tilde m, t, r) = \bar F(r)\,\hat w^b(m, s, \tilde m, t+r) + \mathbf E\big[\mathbb I_{\{S^{\{3\}}\le r\}}\,\gamma_{j-1}^{s,m}(\tilde m + X^{\{3\}}, t + S^{\{3\}})\big].$$

Obviously for $S^{\{3\}}\le r$ and $r\le t_0-t$ we have $S^{\{3\}}\le t_0$, therefore we can consider the cases $t\le t_0$ and $t > t_0$ separately. Let $t\le t_0$; then $\gamma_0^{s,m}(\tilde m, t) = \hat w^b(m, s, \tilde m, t)$ and the given hypothesis is true for $j = 0$. The task is now to calculate $\gamma_{j+1}^{s,m}(\cdot,\cdot)$ given $\gamma_j^{s,m}$. The induction hypothesis implies that for $t\le t_0$
$$\varphi_{j+1}(m, s, \tilde m, t, r) = \bar F(r)\,\hat w^b(m, s, \tilde m, t+r) + \mathbf E\big[\mathbb I_{\{S^{\{3\}}\le r\}}\,\gamma_j^{s,m}(\tilde m + X^{\{3\}}, t + S^{\{3\}})\big]$$
$$= \hat g^a(m) - c^a(s) + \bar F(r)\big[\hat g^b(\tilde m - m) - c^b(t-s+r)\big] + \int_0^r f(z)\big\{\mathbf E\hat g^b(\tilde m - m + X^{\{3\}}) - c^b(t-s+z) + \mathbf E\, y_j^b(\tilde m - m + X^{\{3\}}, t-s+z, t_0-t-z)\big\}\,dz.$$

It is clear that for any $a$ and $b$

$$\bar F(r)\big[\hat g^b(a) - c^b(b+r)\big] = \hat g^b(a) - c^b(b) - \int_0^r\big\{f(z)\big[\hat g^b(a) - c^b(b+z)\big] + \bar F(z)\,c^{b\prime}(b+z)\big\}\,dz,$$

therefore

$$\varphi_{j+1}(m, s, \tilde m, t, r) = \hat w^b(m, s, \tilde m, t) + \int_0^r \bar F(z)\big\{\alpha_2(z)\big[\Delta^b(\tilde m - m) + \mathbf E\, y_j^b(\tilde m - m + X^{\{3\}}, t-s+z, t_0-t-z)\big] - c^{b\prime}(t-s+z)\big\}\,dz,$$

which proves the remark; the case $t > t_0$ is trivial. □

Following the methods of Ferenstein and Sierociński [6], we find the second optimal stopping time. Let $B = B([0,\infty)\times[0,t_0]\times[0,t_0])$ be the space of all bounded, continuous functions with the norm $\|\delta\| = \sup_{a,b,c}|\delta(a,b,c)|$. It is easy to check that $B$ with the supremum norm is a complete space. The operator $\Phi^b : B\to B$ is defined by

$$(\Phi^b\delta)(a,b,c) = \max_{0\le r\le c}\varphi_\delta^b(a,b,c,r). \tag{20}$$

Let us observe that $y_j^b(a,b,c) = (\Phi^b y_{j-1}^b)(a,b,c)$. Remark 3 now implies that there exists a function $r_j^{b*}(a,b,c)$ such that $y_j^b(a,b,c) = \varphi^b_{y_{j-1}^b}(a,b,c,r_j^{b*}(a,b,c))$, and this gives

$$\gamma_j^{s,m}(\tilde m, t) = \mathbb I_{\{t\le t_0\}}\big[\hat w^b(m, s, \tilde m, t) + \varphi^b_{y_{j-1}^b}(\tilde m - m, t-s, t_0-t, r_j^{b*}(\tilde m - m, t-s, t_0-t))\big] - C\,\mathbb I_{\{t>t_0\}}.$$
The consequence of the foregoing considerations is the following theorem, which determines the optimal stopping times $\tau_{n,K}^{b*}$:

Theorem 1. Let $R_i^{b*} = r_{K-i}^{b*}(M_i^s - M_s,\, T_i^{\{3\}} - s,\, t_0 - T_i^{\{3\}})$ for $i = 0,1,\ldots,K$; moreover, let $\eta_{n,K}^s = K\wedge\inf\{i\ge n : R_i^{b*} < S_{i+1}^{\{3\}}\}$. Then the stopping time $\tau_{n,K}^{b*} = T_{\eta_{n,K}^s}^{\{3\}} + R_{\eta_{n,K}^s}^{b*}$ is optimal in the class $\mathcal T_{n,K}^s$ and $\Gamma_{n,K}^s = \mathbf E\big[Z(s, \tau_{n,K}^{b*})\mid\mathcal F_n^s\big]$.
3.2 Infinite number of fishes caught

The task is now to find the function $J(s)$ and the stopping time $\tau^{b*}$ which is optimal in the class $\mathcal T^s$. In order to obtain the solution of the one-stopping problem for an infinite number of fishes caught it is necessary to impose the restriction $F(t_0) < 1$.

Lemma 5. If $F(t_0) < 1$ then the operator $\Phi^b : B\to B$ defined by (20) is a contraction.

Proof of Lemma 5. Let $\delta_i\in B$, $i\in\{1,2\}$. There exists $\rho_i$ such that $(\Phi^b\delta_i)(a,b,c) = \varphi_{\delta_i}^b(a,b,c,\rho_i)$. We thus get

$$(\Phi^b\delta_1)(a,b,c) - (\Phi^b\delta_2)(a,b,c) = \varphi_{\delta_1}^b(a,b,c,\rho_1) - \varphi_{\delta_2}^b(a,b,c,\rho_2) \le \varphi_{\delta_1}^b(a,b,c,\rho_1) - \varphi_{\delta_2}^b(a,b,c,\rho_1)$$
$$= \int_0^{\rho_1} dF(z)\int_0^\infty [\delta_1 - \delta_2](a+x, b+z, c-z)\,dH(x) \le \int_0^{\rho_1} dF(z)\int_0^\infty \sup_{a,b,c}|[\delta_1 - \delta_2](a,b,c)|\,dH(x)$$
$$\le F(c)\,\|\delta_1 - \delta_2\| \le F(t_0)\,\|\delta_1 - \delta_2\| \le C\,\|\delta_1 - \delta_2\|,$$

where $0\le C < 1$. Similarly as before, $(\Phi^b\delta_2)(a,b,c) - (\Phi^b\delta_1)(a,b,c)\le C\,\|\delta_2 - \delta_1\|$. Finally we conclude that $\|\Phi^b\delta_1 - \Phi^b\delta_2\|\le C\,\|\delta_1 - \delta_2\|$, which completes the proof. □
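Because the contraction modulus is $F(t_0) < 1$, successive approximations starting from $y_0^b = 0$ converge geometrically to the fixed point. A toy numerical sketch of this iteration, with all concrete choices (exponential $F$, a one-dimensional state $c$, a simplified operator of the same maximization-over-$r$ structure as (20)) being assumptions made here for illustration only:

```python
import math

lam, T0, N = 0.05, 10.0, 100
q = 1.0 - math.exp(-lam * T0)          # contraction modulus F(t0) < 1
dt = T0 / N

def Phi(y):
    # y is a grid function: y[k] approximates y(k*dt) for c in [0, T0];
    # the new value maximizes, over candidate delays r = j*dt <= c, a term
    # bounded by F(r) * (constant + ||y||), so Phi contracts with modulus q
    out = []
    for k in range(N + 1):             # state c = k*dt
        best = 0.0
        for j in range(k + 1):         # candidate r = j*dt
            gain = (1 - math.exp(-lam * j * dt)) * (1.0 + y[k - j])
            best = max(best, gain)
        out.append(best)
    return out

y = [0.0] * (N + 1)                    # successive approximations y_K
for _ in range(30):
    y = Phi(y)
print(y[N])
```

After 30 iterations the residual $\|\Phi^b y - y\|$ is of order $q^{30}$, which is negligible here; the fixed point is also bounded by $q/(1-q)$, the geometric-series bound that the contraction argument yields.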
Applying Remark 3, Lemma 5 and the fixed point theorem we conclude:

Remark 4. There exists $y^b\in B$ such that $y^b = \Phi^b y^b$ and $\lim_{K\to\infty}\|y_K^b - y^b\| = 0$.

According to the above remark, $y^b$ is the uniform limit of $y_K^b$ as $K$ tends to infinity, which implies that $y^b$ is measurable and that $\gamma^{s,m} = \lim_{K\to\infty}\gamma_K^{s,m}$ is given by

$$\gamma^{s,m}(\tilde m, t) = \mathbb I_{\{t\le t_0\}}\big[\hat w^b(m, s, \tilde m, t) + y^b(\tilde m - m, t-s, t_0-t)\big] - C\,\mathbb I_{\{t>t_0\}}. \tag{21}$$

We can now calculate the optimal strategy and the expected gain after the change of the fishing method.

Theorem 2. If $F(t_0) < 1$ and $F$ has the density function $f$, then

(i) for $n\in\mathbb N$ the limit $\tau_n^{b\star} = \lim_{K\to\infty}\tau_{n,K}^{b*}$ exists a.s. and $\tau_n^{b\star}\le t_0$ is an optimal stopping rule in the set $\mathcal T^s\cap\{\tau\ge T_n^{\{3\}}\}$;
(ii) $\mathbf E\big[Z(s, \tau_n^{b\star})\mid\mathcal F_n^s\big] = \gamma^{s,M_s}(M_n^s, T_n^{\{3\}})$ a.s.

Proof. (i) Let us first prove the existence of $\tau_n^{b\star}$. By the definition of $\Gamma_{n,K+1}^s$ we have

$$\Gamma_{n,K+1}^s = \operatorname*{ess\,sup}_{\tau\in\mathcal T_{n,K+1}^s}\mathbf E[Z(s,\tau)\mid\mathcal F_n^s] = \operatorname*{ess\,sup}_{\tau\in\mathcal T_{n,K}^s}\mathbf E[Z(s,\tau)\mid\mathcal F_n^s]\ \vee\ \operatorname*{ess\,sup}_{\tau\in\mathcal T_{K,K+1}^s}\mathbf E[Z(s,\tau)\mid\mathcal F_n^s] = \mathbf E\big[Z(s, \tau_{n,K}^{b*})\mid\mathcal F_n^s\big]\vee\mathbf E[Z(s, \sigma^*)\mid\mathcal F_n^s],$$
thus we observe that $\tau_{n,K+1}^{b*}$ is equal to $\tau_{n,K}^{b*}$ or $\sigma^*$, where $\tau_{n,K}^{b*}\in\mathcal T_{n,K}^s$ and $\sigma^*\in\mathcal T_{K,K+1}^s$ respectively. It follows that $\tau_{n,K+1}^{b*}\ge\tau_{n,K}^{b*}$, which implies that the sequence $\tau_{n,K}^{b*}$ is nondecreasing with respect to $K$. Moreover $R_i^{b*}\le t_0 - T_i^{\{3\}}$ for all $i\in\{0,\ldots,K\}$, thus $\tau_{n,K}^{b*}\le t_0$ and therefore $\tau_n^{b\star}\le t_0$ exists.

Let us now look at the process $\xi^s(t) = (t, M_t^s, V(t))$, where $s$ is fixed and $V(t) = t - T_{N_3(t)}^{\{3\}}$. $\xi^s(t)$ is a Markov process with the state space $[s,t_0]\times[m,\infty)\times[0,\infty)$. In the general case the infinitesimal operator for $\xi^s$ is given by

$$\mathcal A p^{s,m}(t, \tilde m, v) = \frac{\partial p^{s,m}}{\partial t}(t, \tilde m, v) + \frac{\partial p^{s,m}}{\partial v}(t, \tilde m, v) + \alpha_2(v)\Big[\int_{\mathbb R_+} p^{s,m}(t, \tilde m + x, 0)\,dH(x) - p^{s,m}(t, \tilde m, v)\Big],$$

where $p^{s,m}(t, \tilde m, v) : [0,\infty)\times[0,\infty)\times[0,\infty)\to\mathbb R$ is continuous, bounded and measurable with bounded left-hand derivatives with respect to $t$ and $v$ (see [1] and [17]). Let us notice that for $t\ge s$ the process $Z(s,t)$ can be expressed as $Z(s,t) = p^{s,m}(\xi^s(t))$, where

$$p^{s,m}(\xi^s(t)) = \begin{cases}\hat g^a(M_s) - c^a(s) + \hat g^b(M_t^s - M_s) - c^b(t-s) & \text{if } s\le t\le t_0,\\ -C & \text{if } t_0 < t.\end{cases}$$

It follows easily that in our case $\mathcal A p^{s,m}(t, \tilde m, v) = 0$ for $t_0 < t$ and

$$\mathcal A p^{s,m}(t, \tilde m, v) = \alpha_2(v)\big[\mathbf E\hat g^b(\tilde m + X^{\{3\}} - m) - \hat g^b(\tilde m - m)\big] - c^{b\prime}(t-s) \tag{22}$$

for $s\le t\le t_0$. The process $p^{s,m}(\xi^s(t)) - p^{s,m}(\xi^s(s)) - \int_s^t(\mathcal A p^{s,m})(\xi^s(z))\,dz$ is a martingale with respect to $\sigma(\xi^s(z), z\le t)$, which is the same as $\mathcal F_{s,t}$; this can be found in [4]. Since $\tau_{n,K}^{b*}\le t_0$, applying Dynkin's formula we obtain

$$\mathbf E\big[p^{s,m}(\xi^s(\tau_{n,K}^{b*}))\mid\mathcal F_n^s\big] - p^{s,m}(\xi^s(T_n^{\{3\}})) = \mathbf E\Big[\int_{T_n^{\{3\}}}^{\tau_{n,K}^{b*}}(\mathcal A p^{s,m})(\xi^s(z))\,dz\ \Big|\ \mathcal F_n^s\Big]\quad\text{a.s.} \tag{23}$$
From (22) we conclude that

$$\int_{T_n^{\{3\}}}^{\tau_{n,K}^{b*}}(\mathcal A p^{s,m})(\xi^s(z))\,dz = \big[\mathbf E\hat g^b(M_n^s + X^{\{3\}} - m) - \hat g^b(M_n^s - m)\big]\int_{T_n^{\{3\}}}^{\tau_{n,K}^{b*}}\alpha_2(z - T_n^{\{3\}})\,dz - \int_{T_n^{\{3\}}}^{\tau_{n,K}^{b*}} c^{b\prime}(z-s)\,dz.$$

Moreover let us check that
$$\int_{T_n^{\{3\}}}^{\tau_{n,K}^{b*}}\alpha_2(z - T_n^{\{3\}})\,dz \le \frac{1}{\bar F(t_0)}\int_{T_n^{\{3\}}}^{\tau_{n,K}^{b*}} f(z - T_n^{\{3\}})\,dz \le \frac{1}{\bar F(t_0)} < \infty,$$
$$\int_{T_n^{\{3\}}}^{\tau_{n,K}^{b*}} c^{b\prime}(z-s)\,dz = c^b(\tau_{n,K}^{b*} - s) - c^b(T_n^{\{3\}} - s) < \infty,$$
$$\mathbf E\hat g^b(M_n^s + X^{\{3\}} - m) - \hat g^b(M_n^s - m) < \infty,$$

where the two last bounds result from the fact that the functions $\hat g^b$ and $c^b$ are bounded. On account of the above observations we can use the dominated convergence theorem and obtain

$$\lim_{K\to\infty}\mathbf E\Big[\int_{T_n^{\{3\}}}^{\tau_{n,K}^{b*}}(\mathcal A p^{s,m})(\xi^s(z))\,dz\ \Big|\ \mathcal F_n^s\Big] = \mathbf E\Big[\int_{T_n^{\{3\}}}^{\tau_n^{b\star}}(\mathcal A p^{s,m})(\xi^s(z))\,dz\ \Big|\ \mathcal F_n^s\Big]. \tag{24}$$

Since $\tau_n^{b\star}\le t_0$, applying Dynkin's formula to the right-hand side of (24) we conclude that

$$\mathbf E\Big[\int_{T_n^{\{3\}}}^{\tau_n^{b\star}}(\mathcal A p^{s,m})(\xi^s(z))\,dz\ \Big|\ \mathcal F_n^s\Big] = \mathbf E\big[p^{s,m}(\xi^s(\tau_n^{b\star}))\mid\mathcal F_n^s\big] - p^{s,m}(\xi^s(T_n^{\{3\}}))\quad\text{a.s.} \tag{25}$$

Combining (23), (24) and (25) we see that

$$\lim_{K\to\infty}\mathbf E\big[p^{s,m}(\xi^s(\tau_{n,K}^{b*}))\mid\mathcal F_n^s\big] = \mathbf E\big[p^{s,m}(\xi^s(\tau_n^{b\star}))\mid\mathcal F_n^s\big], \tag{26}$$

hence $\lim_{K\to\infty}\mathbf E[Z(s, \tau_{n,K}^{b*})\mid\mathcal F_n^s] = \mathbf E[Z(s, \tau_n^{b\star})\mid\mathcal F_n^s]$. We next prove the optimality of $\tau_n^{b\star}$ in the class $\mathcal T^s\cap\{\tau\ge T_n^{\{3\}}\}$. Let $\tau\in\mathcal T^s\cap\{\tau\ge T_n^{\{3\}}\}$; it is clear that $\tau\wedge T_K^{\{3\}}\in\mathcal T_{n,K}^s$. As $\tau_{n,K}^{b*}$ is optimal in the class $\mathcal T_{n,K}^s$ we have

$$\lim_{K\to\infty}\mathbf E\big[p^{s,m}(\xi^s(\tau_{n,K}^{b*}))\mid\mathcal F_n^s\big] \ge \lim_{K\to\infty}\mathbf E\big[p^{s,m}(\xi^s(\tau\wedge T_K^{\{3\}}))\mid\mathcal F_n^s\big]. \tag{27}$$

From (26) and (27) we conclude that $\mathbf E[p^{s,m}(\xi^s(\tau_n^{b\star}))\mid\mathcal F_n^s]\ge\mathbf E[p^{s,m}(\xi^s(\tau))\mid\mathcal F_n^s]$ for any stopping time $\tau\in\mathcal T^s\cap\{\tau\ge T_n^{\{3\}}\}$, which implies that $\tau_n^{b\star}$ is optimal in this class.

(ii) Lemma 4 and (26) lead to $\mathbf E\big[Z(s, \tau_n^{b\star})\mid\mathcal F_n^s\big] = \gamma^{s,M_s}(M_n^s, T_n^{\{3\}})$. □
The remainder of this section will be devoted to the proof of the left-hand differentiability of the function $\gamma^{s,m}(m,s)$ with respect to $s$. This property is necessary to construct the first optimal stopping time. First, for $\delta\in B$ let us briefly denote $\bar\delta(c) = \delta(0,0,c)$.

Lemma 6. Let $\bar\nu(c) = \Phi^b\bar\delta(c)$, $\bar\delta(c)\in B$ and $\bar\delta_+^\prime(c)\le A_1$ for $c\in[0,t_0)$; then $\bar\nu_+^\prime(c)\le A_2$.
Proof of Lemma 6. First observe that the derivative $\bar\nu_+^\prime(c)$ exists because $\bar\nu(c) = \max_{0\le r\le c}\bar\varphi^b(c,r)$, where $\bar\varphi^b(c,r)$ is differentiable with respect to $c$ and $r$. Fix $h\in(0, t_0-c)$ and define $\bar\delta_1(c) = \bar\delta(c+h)\in B$ and $\bar\delta_2(c) = \bar\delta(c)\in B$. Obviously $\|\Phi^b\bar\delta_1 - \Phi^b\bar\delta_2\|\ge|\Phi^b\bar\delta_1(c) - \Phi^b\bar\delta_2(c)| = |\Phi^b\bar\delta(c+h) - \Phi^b\bar\delta(c)|$, and on the other side, using Taylor's formula for right-hand derivatives, we obtain

$$\|\bar\delta_1 - \bar\delta_2\| = \sup_c|\bar\delta(c+h) - \bar\delta(c)| \le h\sup_c\bar\delta_+^\prime(c) + |o(h)|.$$

From the above and Remark 8 it follows that

$$-C\Big(\sup_c\bar\delta_+^\prime(c) + \frac{|o(h)|}{h}\Big) \le \frac{\bar\nu(c+h) - \bar\nu(c)}{h} \le C\Big(\sup_c\bar\delta_+^\prime(c) + \frac{|o(h)|}{h}\Big),$$

and letting $h\to 0^+$ gives $\bar\nu_+^\prime(c)\le C A_1 = A_2$. □
The significance of Lemma 6 is that the function $\bar y^b(t_0 - s)$ has a bounded left-hand derivative with respect to $s$ for $s\in(0, t_0]$. An important consequence of this fact is the following:

Remark 5. The function $\gamma^{s,m}$ can be expressed as $\gamma^{s,m}(m,s) = \mathbb I_{\{s\le t_0\}}\,u(m,s) - C\,\mathbb I_{\{s>t_0\}}$, where $u(m,s) = \hat g^a(m) - c^a(s) + \hat g^b(0) - c^b(0) + \bar y^b(t_0 - s)$ is continuous, bounded and measurable with bounded left-hand derivatives with respect to $s$.

At the end of this section we determine the conditional value function of the second optimal stopping problem. According to (10), Theorem 2 and Remark 5 we have

$$J(s) = \mathbf E\big[Z(s, \tau^{b*})\mid\mathcal F_s\big] = \gamma^{s,M_s}(M_s, s)\quad\text{a.s.} \tag{28}$$
4 Construction of the optimal first stopping time

In this section we formulate the solution of the double stopping problem. In the first epoch of the expedition the admissible strategies (stopping times) depend on the formulation of the problem. For the optimization problem the most natural are the stopping times from $\mathcal T$ (see the relevant problem considered in Szajowski [22]). However, when the bilateral problem is considered, the natural class of admissible strategies depends on who uses the strategy: it should be $\mathcal T^{\{i\}}$ for the $i$-th player. Here the optimization problem with the restriction to strategies from $\mathcal T^{\{1\}}$ in the first epoch is investigated. Let us first notice that the function $u(m,s)$ has properties similar to those of the function $\hat w^b(m, s, \tilde m, t)$, and the process $J(s)$ has a structure similar to that of the process $Z(s,t)$. By this observation one can follow the calculations of Section 3 to get $J(s)$. Let us define again $\Gamma_{n,K} = \operatorname*{ess\,sup}_{\tau^a\in\mathcal T_{n,K}}\mathbf E[J(\tau^a)\mid\mathcal F_n]$, $n = K,\ldots,1,0$, which fulfills the following representation.
Lemma 7. $\Gamma_{n,K} = \gamma_{K-n}(\hat M_n^{\{1\}}, T_n^{\{1\}})$ for $n = K,\ldots,0$, where the sequence of functions $\gamma_j$ can be expressed as:

$$\gamma_0(m,s) = \mathbb I_{\{s\le t_0\}}\,u(m,s) - C\,\mathbb I_{\{s>t_0\}},$$
$$\gamma_j(m,s) = \mathbb I_{\{s\le t_0\}}\big[u(m,s) + y_j^a(m, s, t_0-s)\big] - C\,\mathbb I_{\{s>t_0\}},$$

and $y_j^a(a,b,c)$ is given recursively as follows:

$$y_0^a(a,b,c) = 0, \qquad y_j^a(a,b,c) = \max_{0\le r\le c}\varphi^a_{y_{j-1}^a}(a,b,c,r),$$

where

$$\varphi_\delta^a(a,b,c,r) = \int_0^r \bar F_1(z)\big\{\alpha_1(z)\big[\Delta^a(a) + \mathbf E\,\delta(a + X^{\{1\}}, b+z, c-z)\big] - \big(\bar y_-^{b\prime}(c-z) + c^{a\prime}(b+z)\big)\big\}\,dz.$$

Lemma 7 corresponds to the combination of Lemma 4 and Remark 3 from Subsection 3.1. Let the operator $\Phi^a : B\to B$ be defined by

$$(\Phi^a\delta)(a,b,c) = \max_{0\le r\le c}\varphi_\delta^a(a,b,c,r). \tag{29}$$

Lemma 7 implies that there exists a function $r_{1,j}^*(a,b,c)$ such that

$$\gamma_j(m,s) = \mathbb I_{\{s\le t_0\}}\big[u(m,s) + \varphi^a_{y_{j-1}^a}(m, s, t_0-s, r_{1,j}^*(m, s, t_0-s))\big] - C\,\mathbb I_{\{s>t_0\}}. \tag{30}$$

We can now state the analogue of Theorem 1.

Theorem 3. Let $R_i^{a*} = r_{1,K-i}^*(\hat M_i^{\{1\}}, T_i^{\{1\}}, t_0 - T_i^{\{1\}})$ for $i = 0,1,\ldots,K$ and $\eta_{n,K} = K\wedge\inf\{i\ge n : R_i^{a*} < S_{i+1}^{\{1\}}\}$; then the stopping time $\tau_{n,K}^{a*} = T_{\eta_{n,K}}^{\{1\}} + R_{\eta_{n,K}}^{a*}$ is optimal in the class $\mathcal T_{n,K}$ and $\Gamma_{n,K} = \mathbf E\big[J(\tau_{n,K}^{a*})\mid\mathcal F_n\big]$.

We can now formulate our main results.

Theorem 4. If $F_1(t_0) < 1$ and $F_1$ has the density function $f_1$, then
a∗
a∗
(i) for n ∈ N the limit τn = limK→∞ τn,K a.s. exists and τn ≤ t0 is an optimal stop{1}
ping rule in the set T ∩ {τ ≥ Tn }, ∗ {1} (ii)E J(τna )|Fn = γ (Mn , Tn ) a.s.
P ROOF. The proof follows the same method as in Theorem 2. The difference lies in the form of the infinitesimal operator. Define the processes ξ (s) = (s, Ms ,V (s)) {1} where V (s) = s − TN (s) . Like before ξ (s) is the Markov process with the state space 1 [0, ∞) × [0, ∞) × [0, ∞). Notice that J(s) = p(ξ (s)) and p(s, m, v) : [0,t0 ] × [0, ∞) × [0, ∞) → R continuous, bounded, measurable with the bounded left-hand derivatives with respect seen that Ap(s, m, v) = α1 (v)[Egˆa (m + x{1}) − h to s and v. It is easily i ′
gˆa (m)] − y¯b− (t0 − s) + ca′ (s) for s ≤ t0 . The rest of the proof remains the same as in the proof of Theorem 2. Summarizing, the solution of a double stopping problem is given by ∗
∗
∗
{1}
EZ(τ a , τ b ) = EJ(τ a ) = γ (M0 , T0 ) = γ (0, 0), ∗
∗
where τ a and τ b are defined according to Theorem 2 and Theorem 4 respectively.
5 Examples

The form of the solution means that it is difficult to obtain it in an analytic way. In this section we present examples of conditions under which the solution can be calculated exactly.

Remark 7. If the process $\zeta_2(t) = \mathcal A p^{s,m}(\xi^s(t))$ has decreasing paths, then the second optimal stopping time is given by $\tau_n^{b*} = \inf\{t\in[T_n^{\{3\}}, t_0] : \mathcal A p^{s,m}(\xi^s(t))\le 0\}$; on the other hand, if $\zeta_2(t)$ has non-decreasing paths, then the second optimal stopping time is equal to $t_0$. Similarly, if the process $\zeta_1(s) = \mathcal A p(\xi(s))$ has decreasing paths, then the first optimal stopping time is given by $\tau_n^{a*} = \inf\{s\in[T_n^{\{1\}}, t_0] : \mathcal A p(\xi(s))\le 0\}$; on the other hand, if $\zeta_1(s)$ has non-decreasing paths, then the first optimal stopping time is equal to $t_0$.

Proof. From (25) we obtain $\mathbf E[Z(s, \tau_n^{b\star})\mid\mathcal F_n^s] = Z(s, T_n^{\{3\}}) + \mathbf E\big[\int_{T_n^{\{3\}}}^{\tau_n^{b\star}}(\mathcal A p^{s,m})(\xi^s(z))\,dz\mid\mathcal F_n^s\big]$ a.s., and applying the results of Jensen and Hsu [11] completes the proof. □
Corollary 2. If $S^{\{3\}}$ has an exponential distribution with constant hazard rate $\alpha_2$, the function $\hat g^b$ is increasing and concave, the cost function $c^b$ is convex, and $t_{2,n} = T_n^{\{3\}}$, $m_n^s = M_n^s$, then

$$\tau_n^{b*} = \inf\{t\in[t_{2,n}, t_0] : \alpha_2\big[\mathbf E\hat g^b(m_n^s + X^{\{3\}} - m) - \hat g^b(m_n^s - m)\big]\le c^{b\prime}(t-s)\}, \tag{31}$$

where $s$ is the moment of changing the method. Moreover, if $S^{\{1\}}$ has an exponential distribution with constant hazard rate $\alpha_1$, $\hat g^a$ is increasing and concave, $c^a$ is convex, and $t_{1,n} = T_n^{\{1\}}$, $m_n = \hat M_n^{\{1\}}$, then

$$\tau_n^{a*} = \inf\{s\in[t_{1,n}, t_0] : \alpha_1\big[\mathbf E\hat g^a(m_n + X^{\{1\}}) - \hat g^a(m_n)\big]\le c^{a\prime}(s)\}.$$

Proof. The form of $\tau_n^{a*}$ and $\tau_n^{b*}$ is a consequence of Remark 7. Let us observe that by our assumptions $\zeta_2(t) = \alpha_2\Delta^b(M_t^s - m) - c^{b\prime}(t-s)$ has decreasing paths for $t\in[T_n^{\{3\}}, T_{n+1}^{\{3\}})$. It suffices to prove that $\zeta_2(T_n^{\{3\}}) - \zeta_2(T_n^{\{3\}-}) = \alpha_2\big[\Delta^b(M_n^s - m) - \Delta^b(M_{n-1}^s - m)\big] < 0$ for all $n\in\mathbb N$.

It remains to check that $\bar y_-^{b\prime}(t_0-s) = 0$. We can see that $\tau^{b*} = \tau^{b*}(s)$ is deterministic, which is clear from (31). Let us notice that if $s\le t_0$ then combining (25), (26) and (28) gives $\gamma^{s,m}(m,s) = \mathbf E[Z(s, \tau^{b*})\mid\mathcal F_s] = Z(s,s) + \mathbf E\big[\int_s^{\tau^{b*}}(\mathcal A p^{s,m})(\xi^s(z))\,dz\mid\mathcal F_s\big]$. By Remark 5 it follows that

$$\bar y^b(t_0-s) = \mathbf E\Big[\int_s^{\tau^{b*}(s)}(\mathcal A p^{s,m})(\xi^s(z))\,dz\Big] = \int_s^{\tau^{b*}(s)}\big[\alpha_2\Delta^b(0) - c_2^\prime(z-s)\big]\,dz,$$

and this yields

$$\bar y_-^{b\prime}(t_0-s) = \int_s^{\tau^{b*}(s)} c_2^{\prime\prime}(z-s)\,dz + \tau^{b*\prime}(s)\big[\alpha_2\Delta^b(0) - c_2^\prime(\tau^{b*}(s)-s)\big] - \big[\alpha_2\Delta^b(0) - c_2^\prime(0)\big]$$
$$= c_2^\prime(\tau^{b*}(s)-s) - c_2^\prime(0) - \big[\alpha_2\Delta^b(0) - c_2^\prime(0)\big] = 0. \tag{32}$$

The rest of the proof runs as before. □

Corollary 3. If for $i = 1$ and $i = 2$ the functions $g_i$ are increasing and convex, the $c_i$ are concave, and the $S^{\{i\}}$ have exponential distributions with constant hazard rates $\alpha_i$, then $\tau_n^{a*} = \tau_n^{b*} = t_0$ for $n\in\mathbb N$.

Proof. It is a straightforward consequence of Remark 7. It suffices to check that $\bar y_-^{b\prime}(t_0-s)$ is non-increasing with respect to $s$. First observe that $\tau^{b*}(s) = t_0$. Considering (32) it is obvious that $\bar y_-^{b\prime}(t_0-s) = \alpha_2\Delta^b(0) - c_2^\prime(t_0-s)$, and this completes the proof. □
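The threshold rule (31) of Corollary 2 can be illustrated numerically. All concrete choices below are assumptions made here for the example: exponential inter-catch times with hazard $\alpha_2$, concave utility $\hat g^b(m) = \sqrt m$, convex cost $c^b(t) = 0.1\,t^2$ (so $c^{b\prime}(t) = 0.2\,t$), and a degenerate catch-size distribution $X^{\{3\}}\equiv 1$.

```python
import math

alpha2, T0, s = 1.0, 10.0, 2.0        # hazard rate, horizon t_0, switch moment s

def ghat_b(m):     return math.sqrt(m)
def c_b_prime(t):  return 0.2 * t      # derivative of c^b(t) = 0.1 t^2

def marginal_gain(m):
    # alpha2 * E[ghat_b(m + X) - ghat_b(m)] with X = 1 (degenerate H)
    return alpha2 * (ghat_b(m + 1.0) - ghat_b(m))

def stop_time(m, t_now):
    # smallest t in [t_now, T0] with marginal_gain(m) <= c_b'(t - s):
    # keep fishing while the expected marginal catch beats the marginal cost;
    # for the linear c_b' the threshold solves g = 0.2 (t - s) in closed form
    g = marginal_gain(m)
    t = s + g / 0.2
    return min(max(t, t_now), T0)

# after each catch the accumulated mass m grows, the marginal gain shrinks,
# and the planned stopping time moves earlier
print(stop_time(m=0.0, t_now=3.0), stop_time(m=4.0, t_now=3.0))
```

The monotone behavior (richer catch so far, earlier planned stop) reflects the concavity of $\hat g^b$ and the convexity of $c^b$, the same properties that make $\zeta_2$ decreasing in the proof of Corollary 2.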
6 Conclusions

This article presents the solution of the double stopping problem in the "fishing model" with a finite horizon. The analytical properties of the reward function in the one-stopping problem played a crucial role in our considerations and allowed us to obtain the solution of the extended problem of double stopping. Let us notice that, by repeating the considerations of Section 4, it is easy to generalize our model and its solution to the multiple stopping problem, although the notation becomes inconvenient. The construction of the equilibrium in the two-person non-zero-sum game formulated in Section 2 can be reduced to two double optimal stopping problems when the payoff structure is given by (5), (6) and (11). The key assumptions were related to the properties of the distribution functions. Assuming general distributions and an infinite horizon, one can obtain extensions of the above model.
References

1. Boshuizen, F., Gouweleeuw, J.: General optimal stopping theorems for semi-Markov processes. Adv. in Appl. Probab. 4, 825–846 (1993)
2. Boshuizen, F.A.: A general framework for optimal stopping problems associated with multivariate point processes, and applications. Sequential Anal. 13(4), 351–365 (1994)
3. Brémaud, P.: Point Processes and Queues. Martingale Dynamics. Springer-Verlag, New York (1981)
4. Davis, M.H.A.: Markov Models and Optimization. Chapman and Hall, New York (1993)
5. Ferenstein, E., Pasternak-Winiarski, A.: Optimal stopping of a risk process with disruption and interest rates. In: M. Breton, K. Szajowski (eds.) Advances in Dynamic Games: Differential and Stochastic Games: Theory, Application and Numerical Methods, Annals of the International Society of Dynamic Games, vol. 11, 18 pages. Birkhäuser, Boston (2010)
6. Ferenstein, E., Sierociński, A.: Optimal stopping of a risk process. Applicationes Mathematicae 24(3), 335–342 (1997)
7. Ferguson, T.: A Poisson fishing model. In: Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics (D. Pollard, E. Torgersen and G. Yang, eds.). Springer, New York (1997)
8. Haggstrom, G.: Optimal sequential procedures when more than one stop is required. Ann. Math. Statist. 38, 1618–1626 (1967)
9. Jacobsen, M.: Point Process Theory and Applications. Marked Point and Piecewise Deterministic Processes. Probability and Its Applications, vol. 7. Birkhäuser, Boston (2006)
10. Jensen, U.: An optimal stopping problem in risk theory. Scand. Actuarial J. 2, 149–159 (1997)
11. Jensen, U., Hsu, G.: Optimal stopping by means of point process observations with applications in reliability. Mathematics of Operations Research 18(3), 645–657 (1993)
12. Karpowicz, A.: Double optimal stopping in the fishing problem. J. Appl. Probab. 46(2), 415–428 (2009). DOI 10.1239/jap/1245676097
13. Karpowicz, A., Szajowski, K.: Double optimal stopping of a risk process. Stochastics: An International Journal of Probability and Stochastic Processes 79, 155–167 (2007)
14. Kramer, M., Starr, N.: Optimal stopping in a size dependent search. Sequential Anal. 9, 59–80 (1990)
15. Muciek, B.K., Szajowski, K.: Optimal stopping of a risk process when claims are covered immediately. In: Mathematical Economics, RIMS Kôkyûroku, vol. 1557, pp. 132–139 (2007)
16. Nikolaev, M.: Obobshchennye posledovatel'nye procedury [Generalized sequential procedures]. Litovskiĭ Matematicheskiĭ Sbornik 19, 35–44 (1979)
17. Rolski, T., Schmidli, H., Schmidt, V., Teugels, J.: Stochastic Processes for Insurance and Finance. John Wiley & Sons, Chichester (1998)
18. Shiryaev, A.: Optimal Stopping Rules. Springer-Verlag, New York, Heidelberg, Berlin (1978)
19. Starr, N.: Optimal and adaptive stopping based on capture times. J. Appl. Prob. 11, 294–301 (1974)
20. Starr, N., Wardrop, R., Woodroofe, M.: Estimating a mean from delayed observations. Z. für Wahr. 35, 103–113 (1976)
21. Starr, N., Woodroofe, M.: Gone fishin': Optimal stopping based on catch times. U. Mich. Report, Dept. of Statistics No. 33 (1974)
22. Szajowski, K.: Optimal stopping of a 2-vector risk process. In: Stability in Probability, Banach Center Publications, vol. 90, pp. 179–191. PWN, Warszawa (2010)
Index
filtrations
$\mathcal{F}^{\{i\}}_n$ – the short denotation of $\mathcal{F}_{T^{\{i\}}_n}$, 7
$\mathcal{F}_t$, $\mathcal{F}^A_t$ – the filtration generated by the A-marked renewal–reward process up to the moment t, 6

pay-off functions
$C^b_j$ – the bounds of the costs, 4
$Z_i(j,s,t)$ – the pay-off process of the anglers, when the first stop has been forced by the i-th one, 6
$\tilde{w}^a_i(\vec{m}, j, s, k, \tilde{m}, t)$ – the pay-off of the i-th angler at moment t, when his change of fishing method to k ∈ B has been forced by the angler j at s (≤ t) and the state of the renewal–reward process is $\vec{m}$, 6
$c^b_j(t)$ – the cumulative costs of fishing after the change of fishing method, using method j, 4
$c_i$, $c^a_i$, $c^a$ – the cumulative cost of usage of the i-th rod in the period a, 4
$g^a(\vec{m}, j, t)$, $g^a_i(\vec{m}, j, t)$ – the utility of the fishes caught up to the moment t (at the i-th rod) when the last catch was at the j-th rod and the state of the renewal–reward process is $\vec{m}$, 5
$g^b_j(\vec{m}, i, s, \tilde{m}, t)$ – the reward function after the change of the fishing methods, when the state of the renewal–reward processes at s has been $\vec{m}$ and the final state of the renewal–reward process at t (≥ s) has been $\tilde{m}$, 4
$w^a_i(\vec{m}, j, t)$ – the i-th player's pay-off at moment t when the stop has been made by the j-th one and the state of the renewal–reward process is $\vec{m}$, 5

renewal–reward processes
$(T^{\{i\}}_n, X^{\{i\}}_n)$ – the renewal–reward processes, 6
$F_i(t)$ – the distribution function of the holding times of the i-th type, 4
$M^{\{i\}}_t$ – the renewal–reward process at moment t related to the i-th rod, 4
$M_t$ ($M^s_t$) – the renewal–reward process at moment t (with change of a structure at moment s), 4
$N_i(t)$ – the number of fishes caught on the rod i up to the moment t, 3
$S^{\{i\}}_n$ – the n-th holding time of the i-th type, 4
$T^{\{i\}}_k$ – the k-th jump time of the i-th type, 4
$T_n$ – the n-th jump moment, 3
$X^{\{i\}}_k$ – the value of the k-th fish caught on the i-th rod, 3
$F(t)$, $f(t)$ – the distribution and density functions of the holding times after the change of fishing method, 9
$H(t)$ – the distribution function of the rewards after the change of fishing method, 9
$z_n$ – the index of the n-th jump, 3
$\vec{N}(t)$ – the 2-dimensional renewal process, 3
$n^{\{i\}}_k$ – the index of the k-th jump of the i-th type, 4

stopping times
$\mathcal{T}$, $\mathcal{T}_{n,K}$ – sets of stopping times with respect to the σ-fields $\{\mathcal{F}_t\}$, 7
$\mathcal{T}^{\{i\}}_{n,K}$ – the stopping times bounded by $T^{\{i\}}_n$ and $T^{\{i\}}_K$, 7
$\mathcal{T}^{\{i\}}_n$ – the stopping times bounded by $T^{\{i\}}_n$, 7
$\mathcal{T}_{n,K}$ – the subset of stopping times $\tau \in \mathcal{T}$ with respect to the filtration $\{\mathcal{F}_t\}$ such that $T_n \le \tau \le T_K$, 7
$\tau^{a*}$ – the optimal moment of the first decision, 8
$\tau^{b*}$ – the optimal moment of the second decision, 8
$\tau_{n,K}$ – an element of the set $\mathcal{T}_{n,K}$, 7
$\tau^{b*}_{0,K}$, $\tau^{b*}_K$ – the second optimal stopping time in a restricted problem, 9