49th IEEE Conference on Decision and Control December 15-17, 2010 Hilton Atlanta Hotel, Atlanta, GA, USA
A Markov-Modulated Stochastic Control Problem with Optimal Multiple Stopping with Application to Finance
Tim Siu-Tang Leung
Abstract: This paper studies the valuation of multiple American options in an incomplete market where asset prices follow Markov-modulated dynamics. The holder's optimal hedging and exercising strategies are determined from a utility maximization problem with optimal multiple stopping. We analyze the associated system of variational inequalities for the holder's utility indifference price, and construct a duality formula involving relative entropy minimization over a random horizon.
I. INTRODUCTION
The standard no-arbitrage pricing model assumes that all option positions can be hedged perfectly by continuously trading the underlying asset. However, in many financial applications, the underlying is non-traded; instead, the investor (option buyer or writer) trades a correlated asset as a proxy to minimize risk exposure. Some examples are employee stock options [1], weather derivatives [2], hedging basket options with indexes [3], and options on illiquid assets [4]. Moreover, asset prices are often seen to depend on market conditions. One popular approach involves modulating asset price dynamics by a continuous-time finite-state Markov chain $(\xi_t)_{t \ge 0}$ representing the stochastic market regime; see, among others, [5] and [6]. In these so-called regime-switching market models, it is common that the risks associated with regime changes are unhedgeable.
In this paper, we consider the problem of dynamically hedging a long position in American (early exercisable) options written on a non-traded asset in a regime-switching market. In the model, the investor (option holder) faces the idiosyncratic risk from the non-tradability of the underlying as well as the regime-switching risk. These two sources of unhedgeable risk render the market incomplete. Since not all risks can be hedged, the holder's risk preferences play a key role in the valuation and investment decisions. We adopt the utility indifference pricing methodology, whereby the optimal hedging and exercising strategies are determined through the associated utility maximization problems. In addition, our approach accounts for the partial hedge with a correlated liquid asset and the multiple early exercises of American options. In our formulation, the holder's utility maximization involves stochastic control (due to dynamic hedging) and optimal stopping (due to early exercises).
(First version: March 28, 2010. This version: July 10, 2010. Dept. of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore MD 21218; [email protected]. Research supported by NSF grant DMS-0908295 and the Johns Hopkins Duncan Travel Fund. 978-1-4244-7744-9/10/$26.00 ©2010 IEEE)
Our solution approach involves the analytic and numerical studies of the associated
variational inequalities (VIs) of Hamilton-Jacobi-Bellman (HJB) type. By a series of transformations, we simplify the fully nonlinear VIs into semilinear free boundary problems of reaction-diffusion type, and develop an efficient finite-difference method to solve for the optimal exercise boundaries. Our analysis provides both mathematical and financial interpretations for the holder's subjective price, or the indifference price, for the American options.
Furthermore, we analyze the dual optimization problem associated with the holder's utility maximization problem. The key idea is the well-known connection between exponential utility and relative entropy minimization; see, among others, [7] and [8]. Specifically, we first compute explicitly the minimal entropy martingale measure (MEMM) in the regime-switching market. This allows us to construct a duality formula that involves optimizing the expected payoff over a set of stopping times and martingale measures while penalizing by an entropy distance from the MEMM. We apply the duality formula to derive the risk-aversion asymptotics of the holder's indifference price.
The paper is structured as follows. In Section II, we study the dynamic hedging of a single American option on a nontraded asset in a regime-switching market. In Section III, we extend our results to the case of American options with multiple exercises. Then, we present the numerical solutions in Section IV. In Section V, we establish the duality results, and study the asymptotics of the holder's indifference price.
II. DYNAMIC HEDGING OF A SINGLE AMERICAN OPTION
We fix a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, where $\mathbb{P}$ is the historical measure. Let $\xi$ be a continuous-time irreducible finite-state Markov chain with state space $E = \{1, 2, \ldots, m\}$. The generator matrix of $\xi$ is denoted by $A$, which has constant entries $A(i, j) = a_{ij}$ for $i, j \in E$, such that $a_{ij} \ge 0$ for $i \ne j$ and $\sum_{j \in E} a_{ij} = 0$ for each $i \in E$.
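The two conditions on the generator matrix $A$ can be checked mechanically, and a path of $\xi$ can be simulated from exponential holding times. The following is a minimal sketch, not part of the paper: the function names `is_generator` and `simulate_chain`, the 0-based state labels, and the two-state rates are illustrative assumptions.

```python
import random

def is_generator(A, tol=1e-12):
    """Check a_ij >= 0 for i != j and that every row of A sums to zero."""
    n = len(A)
    for i in range(n):
        if len(A[i]) != n:
            return False
        if any(A[i][j] < 0 for j in range(n) if j != i):
            return False
        if abs(sum(A[i])) > tol:
            return False
    return True

def simulate_chain(A, start, horizon, rng):
    """Simulate jump times and states of xi on [0, horizon) (states are 0-based)."""
    t, state = 0.0, start
    times, states = [0.0], [start]
    while True:
        rate = -A[state][state]          # total exit rate from the current state
        if rate <= 0:                    # absorbing state: no further jumps
            break
        t += rng.expovariate(rate)       # exponential holding time
        if t >= horizon:
            break
        others = [j for j in range(len(A)) if j != state]
        weights = [A[state][j] for j in others]
        state = rng.choices(others, weights=weights)[0]
        times.append(t)
        states.append(state)
    return times, states

# Hypothetical two-regime generator with a12 = a21 = 1
A = [[-1.0, 1.0], [1.0, -1.0]]
times, states = simulate_chain(A, start=0, horizon=0.5, rng=random.Random(0))
```

The simulation uses the standard jump-chain construction; any other exact sampling scheme for a finite-state chain would serve equally well.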
This Markov chain represents the changing regime of the financial market, and it influences the dynamics of assets.
The financial market consists of a liquid asset $S$ and a nontraded asset $Y$, along with the riskless money market account with a constant interest rate $r \ge 0$. Throughout, we shall work with discounted cash flows. The discounted prices of $(S, Y)$ are modeled as correlated regime-switching (Markov-modulated) geometric Brownian motions:
$$dS_t = \mu(\xi_t) S_t\,dt + \sigma(\xi_t) S_t\,dW_t, \tag{1}$$
$$dY_t = \nu(\xi_t) Y_t\,dt + \eta(\xi_t) Y_t \left(\rho(\xi_t)\,dW_t + \tilde\rho(\xi_t)\,d\tilde W_t\right), \tag{2}$$
with correlation coefficient $\rho(i) \in (-1, 1)$ for $i \in E$, and $\tilde\rho(i) = \sqrt{1 - \rho^2(i)}$. The processes $W$ and $\tilde W$ are two independent Brownian motions under $\mathbb{P}$, and are independent of $\xi$. We define the filtration $\mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T}$ with $\mathcal{F}_t$ being the augmented $\sigma$-algebra generated by $\{W_u, \tilde W_u, \xi_u;\ 0 \le u \le t\}$. For each $i \in E$, the coefficients $\mu(i), \nu(i), \sigma(i), \eta(i)$ are known constants, with $\sigma(i), \eta(i) > 0$. The Sharpe ratios of $S$ and $Y$ are $\lambda(i) = \mu(i)/\sigma(i)$ and $\kappa(i) = \nu(i)/\eta(i)$ respectively. For convenience, we may use the subscript notation for these constants, e.g. $\mu_i \equiv \mu(i)$, and $\rho_i \equiv \rho(i)$.
We consider the problem of hedging a long position in an American option, which is written on the nontraded asset $Y$, with a smooth bounded payoff $g_\tau := g(\tau, Y_\tau, \xi_\tau)$ at any exercise time $\tau \le T$. The set of admissible exercise times, denoted by $\mathcal{T}$, consists of all stopping times with respect to $\mathbb{F}$ taking values in $[0, T]$. For any two stopping times $s, u \in \mathcal{T}$ with $s \le u$, we define $\mathcal{T}_{s,u} := \{\tau \in \mathcal{T} : s \le \tau \le u\}$.
We assume that the holder measures utility from wealth at the terminal time $T$ by an exponential utility function
$$U(x) = -e^{-\gamma x}, \qquad x \in \mathbb{R},$$
with a constant absolute risk aversion $\gamma \in (0, \infty)$. The option holder can partially hedge by dynamically trading in $S$ and the money market account throughout $[0, T]$. A trading strategy $(\theta_t)_{0 \le t \le T}$ is the discounted cash amount invested in $S$. As is standard (see [9]), a strategy $\theta$ is deemed admissible if it is self-financing, $\mathcal{F}_t$-progressively measurable, and satisfies the integrability condition $\mathbb{E}\{\int_0^T \theta_t^2\,dt\} < \infty$. Denote by $\Theta_{s,u}$ the set of admissible strategies over the period $[s, u]$ with $s \le u$. For any strategy $\theta$, the holder's trading wealth follows
$$dX_t^\theta = \theta_t \frac{dS_t}{S_t} = \theta_t \mu(\xi_t)\,dt + \theta_t \sigma(\xi_t)\,dW_t. \tag{3}$$
Note that dynamic trading in $S$ hedges against the risk from $W$, but there remain the idiosyncratic risk due to $\tilde W$ and the regime-switching risk from $\xi$. These two unhedgeable sources of risk render the market incomplete.
The utility indifference pricing methodology is based on the comparison of maximal expected utilities corresponding to dynamic investments with and without the options. The first is the Merton [10] portfolio optimization problem without any options, modified here to incorporate regime-switching price dynamics. With initial wealth $x$ at $t \in [0, T]$, the investor's Merton value function is
$$M(t, x, i) = \sup_{\theta \in \Theta_{t,T}} \mathbb{E}\{U(X_T^\theta) \mid X_t = x, \xi_t = i\}. \tag{4}$$
Numerous variations of this classical problem have been developed; for example, see [11] for a risk-sensitive control approach.
Next, we consider the dynamic investment problem including the American option $g$. The holder selects the optimal trading strategy and exercise time in order to maximize his expected utility from trading wealth plus the option payoff. Upon exercise of the option, the holder will re-invest the option payoff, if any, into his portfolio, and continue to trade up to time $T$. Therefore, the holder faces the Merton problem during $[\tau, T]$, and his value function is given by
$$V(t, x, y, i) = \sup_{\tau \in \mathcal{T}_{t,T}} \sup_{\theta \in \Theta_{t,\tau}} \mathbb{E}_{t,x,y,i}\{M(\tau, X_\tau^\theta + g_\tau, \xi_\tau)\}, \tag{5}$$
where the notation $\mathbb{E}_{t,x,y,i}\{\cdot\} \equiv \mathbb{E}\{\cdot \mid X_t = x, Y_t = y, \xi_t = i\}$.
The holder's indifference price of the American option $g$ is defined as the cash amount $p$ such that he is indifferent between optimally investing without the claim, and optimally investing with the claim at an initial cost $p$.
Definition 1: The American option holder's indifference price $p \equiv p(t, x, y, i)$ is defined by the indifference relation:
$$M(t, x, i) = V(t, x - p, y, i). \tag{6}$$
As we shall see in Theorem 3, exponential utility yields wealth-independent indifference prices, and hence the $x$-argument of $p$ will be dropped. We shall analyze $p$ through $M$ and $V$ by studying their associated HJB PDEs and VIs.
A. Merton Portfolio Optimization with Regime-Switching
First, we provide a closed-form formula for $M$.
Theorem 2: The Merton function admits a separation of variables
$$M(t, x, i) = -e^{-\gamma x} F_i(t), \tag{7}$$
for $(t, x, i) \in [0, T] \times \mathbb{R} \times E$, where $F(t) := (F_1(t), \ldots, F_m(t))'$ is given by
$$F(t) = \exp\{(A - D)(T - t)\}\,\mathbf{1}, \tag{8}$$
where $D = \mathrm{diag}(\lambda_1^2/2, \ldots, \lambda_m^2/2)$, $A$ is the generator matrix, $\mathbf{1}$ is a vector of ones, and $\exp$ is the matrix exponential. In addition, each $F_i(t)$ admits the probabilistic representation
$$F_i(t) = \mathbb{E}\left\{\exp\left(-\int_t^T \frac{\lambda^2(\xi_s)}{2}\,ds\right) \,\Big|\, \xi_t = i\right\}. \tag{9}$$
Proof: The HJB equation associated with $M(t, x, i)$ is
$$M_t^i + \max_\theta \left(\frac{\theta^2 \sigma_i^2}{2} M_{xx}^i + \theta \mu_i M_x^i\right) + \sum_{j \in E} a_{ij} M^j = 0, \qquad M(T, x, i) = U(x), \tag{10}$$
on $[0, T] \times \mathbb{R} \times E$, where the notation $M^i \equiv M(t, x, i)$. Then, substituting (7) into (10) yields the system of linear ODEs:
$$F'(t) = (D - A)F(t), \qquad F(T) = \mathbf{1}. \tag{11}$$
Direct computation shows that (8) solves (11), and (9) is the corresponding Feynman-Kac representation.
By performing the maximization over $\theta$ in (10), we obtain the optimal trading strategy for the Merton problem:
$$\hat\theta(t, i) = \frac{\lambda_i}{\gamma \sigma_i}. \tag{12}$$
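Formulas (8) and (12) are straightforward to evaluate numerically. The sketch below is illustrative only: it uses a hand-rolled scaling-and-squaring matrix exponential so that it runs on the standard library alone, and the function names and the two-regime inputs (`A`, `lam`) are assumptions, not values taken from the paper.

```python
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, squarings=10, terms=20):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    n = len(m)
    scaled = [[x / 2 ** squarings for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def merton_factor(A, lam, tau):
    """F(t) = exp{(A - D) tau} 1 with tau = T - t and D = diag(lam_i^2 / 2), as in (8)."""
    n = len(A)
    m = [[(A[i][j] - (lam[i] ** 2 / 2 if i == j else 0.0)) * tau
          for j in range(n)] for i in range(n)]
    return [sum(row) for row in mat_exp(m)]   # multiplying by the vector of ones

def merton_strategy(lam_i, gamma, sigma_i):
    """Optimal cash amount in S from (12): lambda_i / (gamma * sigma_i)."""
    return lam_i / (gamma * sigma_i)

# Hypothetical two-regime inputs: generator and Sharpe ratios per regime
A = [[-1.0, 1.0], [1.0, -1.0]]
lam = [0.3, 0.4]
F = merton_factor(A, lam, tau=0.5)
```

By the probabilistic representation (9), each $F_i$ must lie between $e^{-\max_i \lambda_i^2 \tau/2}$ and $e^{-\min_i \lambda_i^2 \tau/2}$, which gives a quick sanity check on any implementation.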
Note that $\hat\theta$ is inversely proportional to risk aversion $\gamma$, but is independent of wealth $x$. It stays constant in each regime but jumps when $\xi$ switches states. When there is only one regime, $\hat\theta$ coincides with the standard Merton solution [10].
In view of (7) and (9), $F_i(t)$ acts as a regime-dependent discounting factor to the utility function $U(x)$. In fact, one can view the Merton function $M(t, x, i)$ as a dynamic regime-dependent utility function. Precisely, it measures utility of wealth at intermediate time $t$ in regime $i$ by accounting for the investment opportunity available to the investor over the period $[0, T]$. This perspective gives another intuitive explanation for its role in the definition of $V$ in (5). Furthermore, the Merton function satisfies the following dynamic programming principle:
$$M(t, x, i) = \sup_{\theta \in \Theta_{t,\tau}} \mathbb{E}\{M(\tau, X_\tau^\theta, \xi_\tau) \mid X_t = x, \xi_t = i\},$$
for every $\tau \in \mathcal{T}_{t,T}$. This is a specific example of Proposition 2.6 of [12], which establishes this property of the Merton function in a general semimartingale market.
B. The Indifference Price of an American Option
Next, we establish a direct connection between the holder's value function $V$ and the indifference price $p$. To facilitate presentation, we introduce the differential operators
$$\mathcal{L}_i u = \frac{\eta_i^2 y^2}{2} \frac{\partial^2 u}{\partial y^2} + \nu_i y \frac{\partial u}{\partial y},$$
$$\mathcal{L}_i^0 u = \mathcal{L}_i u - \rho_i \eta_i \lambda_i y \frac{\partial u}{\partial y}, \tag{13}$$
$$\mathcal{A}_i u = \frac{\partial u}{\partial t} + \mathcal{L}_i^0 u - \frac{1}{2} \gamma (1 - \rho_i^2) \eta_i^2 y^2 \left(\frac{\partial u}{\partial y}\right)^2, \tag{14}$$
and the Hamiltonian
$$H_i(u_{xx}, u_{xy}, u_x) = \max_\theta \left(\frac{\theta^2 \sigma_i^2}{2} u_{xx} + \theta (\rho_i \sigma_i \eta_i y\, u_{xy} + \mu_i u_x)\right).$$
Note that $\mathcal{L}_i$ is the infinitesimal generator of $Y$ under measure $\mathbb{P}$, while $\mathcal{L}_i^0$ is the infinitesimal generator of $Y$ under the minimal martingale measure $Q^0$ (see Definition 6 below). Also, $\mathcal{A}_i$ is semilinear and depends on $\gamma$.
Theorem 3: The holder's value function is given by
$$V(t, x, y, i) = M(t, x, i)\, e^{-\gamma p(t, y, i)}, \tag{15}$$
for $(t, x, y, i) \in [0, T] \times \mathbb{R} \times \mathbb{R}_+ \times E$, where $p(t, y, i)$ is the holder's indifference price (defined in (6)). In addition, $p(t, y, i)$ satisfies the system of VIs
$$\begin{aligned}
&\mathcal{A}_i p(t, y, i) + \frac{1}{\gamma} \sum_{j \in E \setminus \{i\}} a_{ij} \frac{F_j(t)}{F_i(t)} \left(1 - e^{-\gamma (p(t, y, j) - p(t, y, i))}\right) \le 0, \\
&p(t, y, i) \ge g(t, y, i), \\
&\left(\mathcal{A}_i p(t, y, i) + \frac{1}{\gamma} \sum_{j \in E \setminus \{i\}} a_{ij} \frac{F_j(t)}{F_i(t)} \left(1 - e^{-\gamma (p(t, y, j) - p(t, y, i))}\right)\right) \cdot \left(g(t, y, i) - p(t, y, i)\right) = 0,
\end{aligned} \tag{16}$$
for $(t, y, i) \in [0, T) \times \mathbb{R}_+ \times E$, and $p(T, y, i) = g(T, y, i)$, for $(y, i) \in \mathbb{R}_+ \times E$.
Proof: We present the main steps of the proof, and refer the reader to the companion paper [13] for details. First, we derive the HJB VI for the value function, namely
$$\begin{aligned}
&V_t^i + \mathcal{L}_i V^i + H_i(V_{xx}^i, V_{xy}^i, V_x^i) + \sum_{j \in E} a_{ij} V^j \le 0, \\
&V(t, x, y, i) \ge M(t, x + g(t, y, i), i), \\
&\left(V_t^i + \mathcal{L}_i V^i + H_i(V_{xx}^i, V_{xy}^i, V_x^i) + \sum_{j \in E} a_{ij} V^j\right) \cdot \left(M(t, x + g(t, y, i), i) - V(t, x, y, i)\right) = 0,
\end{aligned} \tag{17}$$
for $(t, x, y, i) \in [0, T) \times \mathbb{R} \times \mathbb{R}_+ \times E$, and $V(T, x, y, i) = U(x + g(T, y, i))$, for $(x, y, i) \in \mathbb{R} \times \mathbb{R}_+ \times E$, where we use the shorthand notation $V^i \equiv V(t, x, y, i)$. Maximizing over $\theta$ in (17) yields the optimal strategy in feedback form:
$$\theta^*(t, x, y, i) = -\frac{\lambda_i}{\sigma_i} \frac{V_x(t, x, y, i)}{V_{xx}(t, x, y, i)} - \frac{\rho_i \eta_i}{\sigma_i}\, y\, \frac{V_{xy}(t, x, y, i)}{V_{xx}(t, x, y, i)}. \tag{18}$$
Then, substituting (7) and (15) into (17) gives the VI for $p(t, y, i)$ in (16) as claimed. By (7) and (15), $p(t, y, i)$ satisfies $V(t, x, y, i) = M(t, x + p(t, y, i), i)$, implying that $p(t, y, i)$ is indeed the holder's indifference price by Definition 1.
Formula (15) in Theorem 3 allows us to express the holder's optimal hedging strategy in terms of $p$. By substituting (15) into (18), we obtain
$$\theta^*(t, y, i) = \frac{\lambda_i}{\gamma \sigma_i} - \rho_i \frac{\eta_i}{\sigma_i}\, y\, p_y(t, y, i),$$
which is again wealth-independent. The first part of the strategy resembles the Merton strategy $\hat\theta$ in (12), while the second part represents the residual investment in $S$ arising from hedging the option $g$. In particular, if the correlation $\rho(k) = 0$ for some $k \in E$, then the second term vanishes and we have $\theta^*(t, y, k) = \hat\theta(t, k)$. This is intuitive because zero correlation between $S$ and $Y$ implies that trading $S$ does not reduce the risk from $Y$, leading the holder to adopt the Merton strategy in this case.
By Definition 1, we deduce that the holder's optimal exercise time is
$$\begin{aligned}
\tau^* &= \inf\left\{0 \le t \le T : V(t, X_t^{\theta^*}, Y_t, \xi_t) = M(t, X_t^{\theta^*} + g_t, \xi_t)\right\} \\
&= \inf\left\{0 \le t \le T : p(t, Y_t, \xi_t) = g(t, Y_t, \xi_t)\right\}.
\end{aligned} \tag{19}$$
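The wealth-independent hedge and the stopping rule (19) reduce to two one-line computations once the indifference price and its $y$-derivative are available on a grid. The following minimal sketch assumes those inputs are supplied by some solver; the function names and the tolerance `tol` are hypothetical.

```python
def hedge_position(lam_i, sigma_i, eta_i, rho_i, gamma, y, p_y):
    """Cash amount in S: Merton part plus the residual hedge of the option."""
    merton_part = lam_i / (gamma * sigma_i)
    option_part = -rho_i * (eta_i / sigma_i) * y * p_y
    return merton_part + option_part

def should_exercise(price, payoff, tol=1e-10):
    """Rule (19): exercise once the indifference price has come down to the payoff.

    The tolerance is a numerical convenience (an assumption of this sketch),
    since p >= g always holds in the continuous problem.
    """
    return price - payoff <= tol

# With zero correlation the hedge collapses to the Merton strategy
theta_uncorrelated = hedge_position(0.3, 0.3, 0.4, 0.0, 1.0, y=1.2, p_y=0.6)
```

In practice `p_y` would be approximated by a finite difference of the computed price, so the hedge inherits the grid error of the price solver.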
See also [12] for details on the optimal exercise time (19). This provides an intuitive interpretation for the holder's optimal exercising strategy: the holder will exercise as soon as his indifference price equals the option payoff. The optimal exercising strategy depends on the underlying price $Y_t$ and the Markov chain $\xi_t$, but, like the indifference price $p$, it is wealth-independent. For other utility functions, the optimal exercise time has the same interpretation but it may depend on wealth.
III. DYNAMIC HEDGING OF AMERICAN OPTIONS WITH MULTIPLE EXERCISES
We proceed to study indifference pricing for American options with multiple exercise opportunities. We consider the holder of $N \ge 2$ integer units of American option $g$ that can be exercised separately. We denote by $\tau_n \in \mathcal{T}_{t,T}$ the exercise time of the next unit of American option when $n \le N$ units remain unexercised at time $t \in [0, T]$. After exercising one American option at $\tau_n$, the holder has $(n - 1)$ options left. If the holder decides to exercise multiple options at the same time, then some exercise times may coincide.
As in Section II, the holder dynamically trades $S$ and the money market account throughout $[0, T]$, and his trading wealth follows (3). Upon exercising each American option, the holder reinvests the option payoff, if any, into the portfolio until $T$. Therefore, the holder's value function for holding $n \ge 2$ units of American options $g$ is defined recursively by
$$V^{(n)}(t, x, y, i) = \sup_{\tau_n \in \mathcal{T}_{t,T}} \sup_{\theta \in \Theta_{t,\tau_n}} \mathbb{E}_{t,x,y,i}\left\{V^{(n-1)}(\tau_n, X_{\tau_n} + g_{\tau_n}, Y_{\tau_n}, \xi_{\tau_n})\right\}, \tag{20}$$
with $V^{(1)}(t, x, y, i) = V(t, x, y, i)$ in (5) and $V^{(0)}(t, x, y, i) = M(t, x, i)$ with no option. We remark that (20) is a stochastic control problem with optimal multiple stopping.
Definition 4: The holder's indifference price $p_i^{(n)} \equiv p^{(n)}(t, y, i)$ for $n \in \mathbb{N}$ units of American option $g$ satisfies
$$M(t, x, i) = V^{(n)}\left(t, x - p_i^{(n)}, y, i\right). \tag{21}$$
Note that $p^{(1)}(t, y, i) = p(t, y, i)$ in (6), and $p^{(0)} = 0$. Applying (21) to (20) yields
$$\begin{aligned}
V^{(n)}(t, x, y, i) &= \sup_{\tau_n \in \mathcal{T}_{t,T}} \sup_{\theta \in \Theta_{t,\tau_n}} \mathbb{E}_{t,x,y,i}\left\{M\left(\tau_n, X_{\tau_n} + g_{\tau_n} + p^{(n-1)}(\tau_n, Y_{\tau_n}, \xi_{\tau_n}), \xi_{\tau_n}\right)\right\} \\
&= V(t, x, y, i\,;\, g + p^{(n-1)}),
\end{aligned} \tag{22}$$
where $V(t, x, y, i\,;\, g + p^{(n-1)})$ is the value function (5) for an American option with payoff $g(\tau, Y_\tau, \xi_\tau) + p^{(n-1)}(\tau, Y_\tau, \xi_\tau)$ at any $\tau \in \mathcal{T}$. The last equality in (22) represents a crucial connection between the cases with single exercise and multiple exercises. It allows us to tackle the optimal multiple stopping problem by solving the single optimal stopping problem sequentially. With this simplification, we can apply the results from Section II to study the indifference price $p_i^{(n)}$ and the corresponding hedging and exercising strategies.
Proposition 5: The value function $V^{(n)}$ is given by
$$V^{(n)}(t, x, y, i) = M(t, x, i)\, e^{-\gamma p^{(n)}(t, y, i)}, \tag{23}$$
where $p^{(n)}(t, y, i)$ satisfies the VI:
$$\begin{aligned}
&\mathcal{A}_i p_i^{(n)} + \frac{1}{\gamma} \sum_{j \in E \setminus \{i\}} a_{ij} \frac{F_j(t)}{F_i(t)} \left(1 - e^{-\gamma (p_j^{(n)} - p_i^{(n)})}\right) \le 0, \\
&p^{(n)}(t, y, i) \ge g(t, y, i) + p^{(n-1)}(t, y, i), \\
&\left(\mathcal{A}_i p_i^{(n)} + \frac{1}{\gamma} \sum_{j \in E \setminus \{i\}} a_{ij} \frac{F_j(t)}{F_i(t)} \left(1 - e^{-\gamma (p_j^{(n)} - p_i^{(n)})}\right)\right) \cdot \left(g(t, y, i) + p^{(n-1)}(t, y, i) - p^{(n)}(t, y, i)\right) = 0,
\end{aligned} \tag{24}$$
for $(t, y, i) \in [0, T) \times \mathbb{R}_+ \times E$, and $p^{(n)}(T, y, i) = n\,g(T, y, i)$, for $(y, i) \in \mathbb{R}_+ \times E$.
This result follows directly from Theorem 3 by replacing the payoff $g$ with $g + p^{(n-1)}$ and by induction. With multiple exercises, we obtain a chain of VIs for the indifference prices. In order to solve for $\{p_i^{(n)}\}_{i \in E}$, we first solve for $\{p_i^{(1)}\}_{i \in E}$ via (16), and sequentially solve for $\{p_i^{(2)}\}_{i \in E}$, $\{p_i^{(3)}\}_{i \in E}$, and so on, via (24) above.
In view of (19), the holder's optimal exercise time for the next option when $n$ options remain unexercised is
$$\tau^{(n)*} = \inf\left\{t \le T : p^{(n)}(t, Y_t, \xi_t) - p^{(n-1)}(t, Y_t, \xi_t) = g(t, Y_t, \xi_t)\right\}.$$
This implies that the holder, while holding $n$ options, will exercise the next option when the indifference price increment $p^{(n)} - p^{(n-1)}$ reaches the option payoff $g$.
IV. NUMERICAL SOLUTION OF THE INDIFFERENCE PRICE
Theorem 3 yields a system of semilinear free boundary problems of reaction-diffusion type. The regime-switching dynamics of asset prices gives rise to the reaction-diffusion terms (the summation terms in (16)). In this section, we discuss an analytic simplification and a numerical scheme applicable to both VIs (16) and (24).
First, for $n \in \mathbb{N}$, we apply the transformation
$$p^{(n)}(t, y, i) = -\frac{\delta_i}{\gamma} \log w^{(n)}(t, y, i), \tag{25}$$
with $\delta_i = (1 - \rho_i^2)^{-1}$ for $i \in E$. Direct substitution of (25) into (24) (which reduces to (16) when $n = 1$) yields the system of VIs:
$$\begin{aligned}
&\frac{\partial w_i^{(n)}}{\partial t} + \mathcal{L}_i^0 w_i^{(n)} - \frac{1}{\gamma \delta_i}\, w_i^{(n)} \sum_{j \in E \setminus \{i\}} \hat A_{ij}(t) \left(1 - \frac{(w_j^{(n)})^{\delta_j}}{(w_i^{(n)})^{\delta_i}}\right) \ge 0, \\
&w^{(n)}(t, y, i) \le w^{(n-1)}(t, y, i)\, e^{-\gamma (1 - \rho_i^2) g(t, y, i)}, \\
&\left(\frac{\partial w_i^{(n)}}{\partial t} + \mathcal{L}_i^0 w_i^{(n)} - \frac{1}{\gamma \delta_i}\, w_i^{(n)} \sum_{j \in E \setminus \{i\}} \hat A_{ij}(t) \left(1 - \frac{(w_j^{(n)})^{\delta_j}}{(w_i^{(n)})^{\delta_i}}\right)\right) \cdot \left(w^{(n-1)}(t, y, i)\, e^{-\gamma (1 - \rho_i^2) g(t, y, i)} - w^{(n)}(t, y, i)\right) = 0,
\end{aligned} \tag{26}$$
for $(t, y, i) \in [0, T) \times \mathbb{R}_+ \times E$, and $w^{(n)}(T, y, i) = e^{-\gamma (1 - \rho_i^2)\, n\, g(T, y, i)}$, for $(y, i) \in \mathbb{R}_+ \times E$. Here, we have used the notations $w_i^{(n)} \equiv w^{(n)}(t, y, i)$, and $\hat A_{ij}(t) \equiv a_{ij} F_j(t)/F_i(t)$ for $i, j \in E$. Note that the differential operator in (26) is linear, and nonlinearity comes only through the summation term. Working with a linear operator significantly simplifies the numerical solution of this problem.
We apply an implicit-explicit finite-difference method to solve for the indifference prices and optimal exercise boundaries in all regimes. Specifically, we discretize the differential inequalities in (26) using central differences for the $y$-derivatives, a backward difference for the $t$-derivative, and an explicit approximation for the summation term. Our numerical method iterates backward in time starting at maturity $T$. At each time step, the constraint is enforced by the projected successive-over-relaxation (PSOR) algorithm, which iteratively solves the implicit time-stepping equations while preserving the constraint between iterations. Similar numerical schemes can be found in [1] and [14].
In Figure 1, we illustrate the case of exercising a single American call option on a non-tradable stock in a two-regime market. Regime 1 is a state of more bullish market condition than regime 2. The holder will exercise the call as soon as the underlying price $Y$ exceeds the exercise boundary in the current regime. The holder's optimal exercise boundary in regime 1 is higher than in regime 2, meaning that he intends to exercise the call earlier in regime 2 than in regime 1. Moreover, a more risk-averse holder tends to exercise the option earlier, which we will prove in Proposition 13.
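The PSOR step in the scheme above can be sketched in isolation. After discretization, each time step and regime leads to a problem of the form $Bw \le b$, $w \le \psi$, $(Bw - b)(w - \psi) = 0$, where $\psi$ is the discretized upper obstacle from the second line of (26). The code below is a generic illustration under the assumption that $B$ is tridiagonal with positive diagonal; the matrix entries, right-hand side, obstacle and relaxation parameter `omega` are hypothetical and are not the paper's discretization.

```python
def psor_upper_obstacle(lower, diag, upper, rhs, obstacle,
                        omega=1.2, tol=1e-10, max_iter=10_000):
    """Projected SOR for: B w <= rhs, w <= obstacle, complementarity.

    B is tridiagonal: lower[i] multiplies w[i-1], upper[i] multiplies w[i+1].
    """
    n = len(diag)
    w = list(obstacle)                      # feasible starting point
    for _ in range(max_iter):
        err = 0.0
        for i in range(n):
            s = rhs[i]
            if i > 0:
                s -= lower[i] * w[i - 1]
            if i < n - 1:
                s -= upper[i] * w[i + 1]
            gauss_seidel = s / diag[i]
            relaxed = w[i] + omega * (gauss_seidel - w[i])
            new = min(obstacle[i], relaxed)  # projection onto the constraint
            err = max(err, abs(new - w[i]))
            w[i] = new
        if err < tol:
            break
    return w

# Small hypothetical system: B = tridiag(-1, 2.5, -1), flat obstacle at 1.2
n = 5
lower, diag, upper = [-1.0] * n, [2.5] * n, [-1.0] * n
rhs, obstacle = [1.0] * n, [1.2] * n
w = psor_upper_obstacle(lower, diag, upper, rhs, obstacle)
```

In this toy system the constraint binds at the three interior nodes and the equation holds at the two end nodes, which mirrors the split between the exercise and continuation regions; the boundary between them plays the role of the optimal exercise boundary.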
[Plots not reproduced. Figure 1 shows the optimal exercise boundaries, Stock Price (Y) against Time in Years, for γ = 1 and γ = 2 in regimes 1 and 2. Figure 2 has two panels, "Optimal Exercise Boundaries in Regime 1" and "Optimal Exercise Boundaries in Regime 2", on the same axes.]
Fig. 1. The holder's exercise boundary in regime 1 (good state) is higher than that in regime 2 (bad state) (see top two curves for γ = 1, or bottom two curves for γ = 2). Increasing the risk aversion from 1 to 2 results in a lower exercise boundary in each regime, leading to earlier exercise. Here, the parameters are K = 1, T = 0.5, r = 3%, µ = (9%, 8%), σ = (30%, 20%), ν = (11%, 7%), q = (2%, 1%), η = (40%, 28%), ρ = 30%, γ = 1, a12 = a21 = 1.
In Figure 2, we show the optimal exercise boundaries for 5 American calls in the two-regime market. In either regime, the holder will exercise the first option as soon as the underlying price $Y$ exceeds the lowest boundary, and then the subsequent options at higher boundaries. Observe that holding more options makes the holder more willing to exercise the additional options at lower underlying prices.
Fig. 2. In each regime, the holder exercises the first call at the lowest exercise boundary, and subsequent ones at higher boundaries. The exercise boundaries in regime 1 dominate the corresponding ones in regime 2. Here, γ = 1, and other parameters are taken from Figure 1. The highest boundaries from these two graphs correspond to the top two boundaries in Figure 1.
V. INDIFFERENCE PRICING VIA ENTROPIC PENALIZATION WITH OPTIMAL STOPPING
An alternative interpretation of indifference pricing is through its dual representation, which involves selecting a pricing measure via relative entropy penalization. This is related to finding the optimal risk premia for the idiosyncratic and regime-switching risks due to $\tilde W$ and $\xi$ respectively.
A. Minimal Entropy Risk Premia
We begin by defining the set of equivalent local martingale measures in the regime-switching market.
Definition 6: We define the probability measure $Q^\phi$ by
$$\frac{dQ^\phi}{d\mathbb{P}} = \exp\left(-\frac{1}{2} \int_0^T \left(\lambda^2(\xi_s) + \phi_s^2\right) ds - \int_0^T \lambda(\xi_s)\,dW_s - \int_0^T \phi_s\,d\tilde W_s\right), \tag{27}$$
where $(\phi_t)_{t \ge 0}$ is an $\mathcal{F}_t$-adapted process with $\mathbb{E}^{Q^\phi}\{\int_0^T \phi_t^2\,dt\} < \infty$. Its density process is denoted by $Z_t^\phi = \mathbb{E}\{\frac{dQ^\phi}{d\mathbb{P}} \mid \mathcal{F}_t\}$.
By Girsanov's Theorem, $(S, Y)$ under $Q^\phi$ satisfies
$$dS_t = \sigma(\xi_t) S_t\,dW_t^\phi,$$
$$dY_t = \left(\nu(\xi_t) - \rho(\xi_t) \eta(\xi_t) \lambda(\xi_t) - \tilde\rho(\xi_t) \eta(\xi_t) \phi_t\right) Y_t\,dt + \eta(\xi_t) Y_t \left(\rho(\xi_t)\,dW_t^\phi + \tilde\rho(\xi_t)\,d\tilde W_t^\phi\right),$$
where $W_t^\phi = W_t + \int_0^t \lambda(\xi_s)\,ds$ and $\tilde W_t^\phi = \tilde W_t + \int_0^t \phi_s\,ds$ are independent Brownian motions under $Q^\phi$. The discounted price $S$ is a $Q^\phi$-local martingale, so $Q^\phi$ is an equivalent local martingale measure. When $\phi = 0$, we obtain the well-known minimal martingale measure $Q^0$ (see [15]).
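The effect of the risk premium $\phi$ on the nontraded asset can be read off the drift of $Y$ under $Q^\phi$. As a small illustration (the function name and the numerical inputs are assumptions of this sketch), the per-regime drift coefficient is:

```python
import math

def drift_under_q_phi(nu_i, eta_i, rho_i, lam_i, phi):
    """Drift rate of Y in regime i under Q^phi: nu - rho*eta*lambda - rho_tilde*eta*phi."""
    rho_tilde = math.sqrt(1.0 - rho_i ** 2)
    return nu_i - rho_i * eta_i * lam_i - rho_tilde * eta_i * phi

# phi = 0 recovers the drift under the minimal martingale measure Q^0
drift_q0 = drift_under_q_phi(nu_i=0.11, eta_i=0.40, rho_i=0.30, lam_i=0.30, phi=0.0)
```

With $\phi = 0$ this is the first-order coefficient of the operator $\mathcal{L}_i^0$ in (13), divided by $y$.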
563
R
R
The process φ is the risk premium accounting for W˜ only, and we need to specify the premium for the regime-switching risk. To this end, we summarize Girsanov’s Theorem for Markov Chain (see Chapter IV.22 of [16]). Theorem 7: Define the probability measure Qφ ,α by dQφ ,α dQφ dQφ ,α = , dP dP dQφ with
Next, we consider the relative entropy minimization ( ) φ ,α ZT Qφ ,α h(t, i) = inf log φ ,α | ξt = i , (31) IE Qφ ,α ∈M (P) Zt ˆ and provide explicit formulae for h(t, i) and Q. ˆ ˆ Theorem 9: The MEMM is given by Q = Qφ ,αˆ , where the minimal entropy risk premia are
dQφ dP
defined in (27) and ¶ µ ZT dQφ ,α ˜ s (ξs , ξs ) − A(ξs , ξs ) ds = exp − ( A dQφ 0 · ∏ αs (ξs− , ξs ),
φˆt = 0,
and
h(t, i) = − log Fi (t) µ ½ µ Z = − log IE exp −
(28)
where {αt (i, j)}i6= j is a family of positive bounded adapted processes, and αt (i, j)A(i, j) if i 6= j, ˜ (29) At (i, j) = − ∑ A˜ t (i, k) if i = j. k6=i
Qφ ,α
Then, the measure is equivalent to Qφ , and thus to P. In addition, to preserve the Markovian property of ξ under the new measure Qφ ,α , we require that all αt (i, j) be Markovian. Then, under Qφ ,α , the generator matrix of ξ is A˜ = [A˜ s (i, j)]i, j∈E . In view of (29), the collection {αt (i, j)}i6= j can be considered as the risk premium factors for the regimeswitching risk. As a result, the set of the equivalent local martingale measures with respect to P, denoted by M (P) := {Qφ ,α }φ ,α , is parameterized by the risk premia pair (φ , α ). For any measure Q, the relative entropy of Q with respect to P is defined as ( n o IE Q log dQ , Q ≪ P, dP H(Q|P) := +∞ , otherwise . ˆ minThe minimal entropy martingale measure (MEMM), Q, imizes the relative entropy with respect to P over the set of equivalent local martingale measures M (P):
t
T
(33) ¶ ¾¶ λ 2 (ξs ) ds | ξt = i , 2 (34)
+
Z
∑ 1{ξs− 6= j} (1 − αs (ξs− , j)) A(ξs− , j)
(t,T ] j∈E
¾
+ αs (ξs− , j)As (ξs− , j) log αs (ξs− , j)ds | ξt = i . The HJB equations associated with hi (t) ≡ h(t, i) are ¡ λi2 φ2 + inf + inf ∑ (1 − α i j )ai j φ 2 2 α i j j∈E\{i} ¢ + α i j ai j log α i j + ai j α i j (h j (t) − hi (t)) = 0,
h′i (t) +
(35)
for (t, i) ∈ [0, T )×E, with hi (T ) = 0 for i ∈ E. Minimizing over φ and α i j in (35) yields the optimal controls φˆ = 0 and
αˆ t (i, j) = e−(h j (t)−hi (t)) ,
(36)
and leads to the system of first-order nonlinear ODEs:
Q∈M (P)
Key results on the MEMM in a general semimartingale market framework can be found in [7] and [17]. Definition 8: For any Qφ ,α ∈ M (P), the conditional relative entropy of Q with respect to P at time t ∈ [0, T ] is ) ( φ ,α ZT φ ,α T Qφ ,α Ht (Q |P) := IE (30) log φ ,α |Ft , Zt φ ,α
where Zt := IE { dQdP |Ft } is the related density process. As is well known, it follows from Jensen’s inequality that HtT (Qφ ,α |P) ≥ 0 for any Qφ ,α ∈ M (P). It turns out that the MEMM Qˆ also minimizes the conditional relative entropy (see Proposition 4.1 of [18]). Namely, for any t ∈ [0, T ], Q∈M (P)
(32)
with Fi (t) given in (8). Proof: Applying (28) to (30), the conditional relative entropy of Q with respect to P is given by ( ) φ ,α ZT Qφ ,α log φ ,α | ξt = i IE Z ½ Zt T φ , α 1 (λ 2 (ξs ) + φs2 ) ds = IE Q 2 t
Qˆ = arg min H(Q|P).
ˆ ess inf HtT (Q|P) = HtT (Q|P),
i 6= j.
ˆ The minimal relative entropy HtT (Q|P) = h(t, ξt ), where
0≤s≤T ξs− 6=ξs
φ ,α
αˆ t (i, j) = Fj (t)/Fi (t),
h′i (t) +
λi2 − ∑ ai j e−(h j (t)−hi (t)) = 0, 2 j∈E
(37)
for (t, i) ∈ [0, T )×E, with hi (T ) = 0, i ∈ E. Direct substitution of hi (t) = − log Fi (t) into (37) leads to (11), and (33) follows. Finally, applying (34) to (36) gives αˆ t (i, j) in (32). This theorem has a number of important implications. First, applying (33) to (4) yields a duality formula for the Merton function, namely M(t, x, i) = −e−γ x e−h(t,i) . In turn, this implies that 1 U −1 (M(t, 0, i)) = h(t, i). γ
t ∈T .
564
(38)
Therefore, the minimal relative entropy, when scaled by risk aversion, can be viewed as the certainty equivalent of the Merton investment with zero initial wealth. From (32) and (38), we observe that ξ becomes a timeinhomogeneous Markov chain under the MEMM Qˆ with the ˆ whose off-diagonal elements are generator matrix A, Fj (t) M(t, x, j) = A(i, j) , Aˆ t (i, j) = A(i, j) Fi (t) M(t, x, i)
Theorem 11: The indifference price is given by ¶ µ 1 ˆτ Qφ ,α p(t, y, i) = sup inf IE t,y,i {gτ } + Hφ ,α (t, y, i) . γ τ ∈Tt,T φ ,α Proof: By (39) and (40), the entropy term is Hˆ φτ ,α (t, y, i) ½ Zτ Qφ ,α 1 φ 2 ds = IE t,y,i 2 t s
for i 6= j.
This indicates that Qˆ scales the transition rates A(i, j) by the ratio of the discounting factors Fj (t) and Fi (t), or equivalently, by the relative Merton investment performances. As a ˆ the Markov chain ξ is more result of changing from P to Q, likely to switch to the states with higher M values. Remark 10: When φ = 0 and α (i, j) = 1 for i 6= j, the resulting measure Q0,1 ≡ Q0 is the minimal martingale measure, and it does not minimize the relative entropy with respect to P, despite the contrary claim by [6]. Indeed, ¾ ½ ZT 2 T 0 Q0 1 λ (ξs ) ds| Ft Ht (Q |P) = IE 2 t ½ ZT ¾ 1 2 λ (ξs ) ds| Ft = IE 2 t ¶ µ R T λ 2 (ξs ) ˆ ≥ − log IE {e− t 2 ds |Ft } = HtT (Q|P), by (34) and Jensen’s Inequality. The inequality is strict unless ˆ there is no regime switching. This shows that Q0 6= Q.
+
0≤s≤T ξs− 6=ξs
Let and
us denote f i ≡ f (t, y, i) as the right-hand side of (41), write down the variational inequality for f i . ¡ φ2 ¢ fti + Li0 f i + inf − ρ˜ i ηi y fyi φ + φ 2γ µ 1 ij + inf ∑ ai j [(α − αˆ t (i, j)) α i j j∈E\{i} γ
¶ +α (log α − log αˆ t (i, j))] + α ai j ( f − f ) ≤ 0, ij
ij
ij
j
i
f (t, y, i) ≥ g(t, y, i), µ ¡ φ2 ¢ fti + Li0 f i + inf − ρ˜ i ηi y fyi φ + φ 2γ µ 1 + inf ∑ ai j [(α i j − αˆ t (i, j)) α i j j∈E\{i} γ
¶¶ ij ij i j j i +α (log α − log αˆ t (i, j))] + α ai j ( f − f ) · (g(t, y, i) − f (t, y, i)) = 0, for (t, y, i) ∈ [0, T )×IR+ ×E, f (T, y, i) = g(T, y, i), for (y, i) ∈ IR ×E. +
The optimal controls are given by
φ ∗ (t, y, i) = −γ ρ˜ i ηi y fy (t, y, i),
(42)
α ∗ (t, y, i, j) = αˆ t (i, j) e−γ ( f (t,y, j)− f (t,y,i)) , (39)
i 6= j,
(43)
and the first inequality for f becomes 1 fti + Li0 f i − γ (1 − ρi2 )ηi2 y2 ( fyi )2 2 j i 1 + ∑ ai j αˆ t (i, j)(1 − e−γ ( f − f ) ) ≤ 0. γ j∈E\{i}
αs (ξs− , ξs ) . αˆ s (ξs− , ξs )
ˆ ˆ Here, W˜ Q is a Q-standard Brownian motion, and we have ˆ Q ˜ ˜ W = W since φˆ = 0. Denote the associated density proφ ,α ˆ φ ,α cess by Zˆt := IE Q { dQdQˆ |Ft }, and the conditional relative entropy of Qφ ,α with respect to Qˆ over the period [t, τ ] by ( ) ˆ τφ ,α φ ,α Z Q τ Hˆ φ ,α (t, y, i) := IE t,y,i log φ ,α , (40) Zˆt φ ,α
˜ ξs− , j) − αˆ s (ξs− , j)A(ξs− , j)) ds ∑ 1{ξs− 6= j} (A(
(t,τ ] j∈E
¾ αs (ξs− , j) ˜ + ∑ 1{ξs− 6= j} As (ξs− , j)(log αˆ s (ξs− , j) ) ds . (t,τ ] j∈E
0
∏
Z
Z
B. Duality Formula for the Indifference Price The MEMM Qˆ also plays a crucial role in the indifference price characterization. To illustrate, we first consider the relative entropy with respect to Qˆ instead of P. Following Theorem 7, we define the Radon-Nikodym derivative dQφ ,α dP dQφ ,α = dP dQˆ dQˆ ¶ µ Z Z T 1 T 2 Qˆ ˜ = exp − φ ds − φs dWs 2 0 s 0 µ ZT ¶ ˆ ˜ · exp − As (ξs , ξs ) − A(ξs , ξs ) ds
(41)
By comparing this with VI (16) and using that αˆ t (i, j) = Fj (t)/Fi (t), we conclude that f (t, y, i) = p(t, y, i). The duality formula (41) shows that the option holder selects a pricing measure that minimizes the expected discounted payoff plus a relative entropic penalty up to the exercise time τ . Note that the holder’s risk aversion γ plays the role of scaling the penalty term. By (15) and (43), we can express the risk premium α ∗ as
φ ,α
Q {·} ≡ IE Q {·|Yt = y, ξt = i}. Note that with IE t,y,i Hˆ φτ ,α (t, y, i) ≥ 0 by Jensen’s inequality, and it vanishes ˆ This entropy term represents the penalty in when Qφ ,α = Q. our indifference price formula.
α ∗ (t, y, i, j) = A(i, j)
V (t, x, y, j) , V (t, x, y, i)
for i 6= j.
This provides a financial interpretation that the holder assigns the optimal risk premium factor α ∗ to ξ by scaling the
565
transition rates according to the relative values of the value function in different states. Finally, we directly apply Theorem 11 to American options with multiple exercises.

Proposition 12: The indifference price $p^{(n)}(t,y,i)$ satisfies

\[
p^{(n)}(t,y,i) = \sup_{\tau_n \in \mathcal{T}_{t,T}} \inf_{\phi,\alpha} \Big( \mathbb{E}^{Q^{\phi,\alpha}}_{t,y,i}\big\{ g_{\tau_n} + p^{(n-1)}(\tau_n, Y_{\tau_n}, \xi_{\tau_n}) \big\} + \frac{1}{\gamma}\,\hat{H}^{\tau_n}_{\phi,\alpha}(t,y,i) \Big).
\]
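To make the recursive structure of Proposition 12 concrete, the following is a minimal discrete-time sketch, not the paper's variational-inequality method. It rests on assumptions that are not in the paper: there is no traded proxy asset and no regime switching, and $Y$ is replaced by a finite-state Markov chain whose transition matrix stands in for $\hat{Q}$. Under these assumptions the one-step infimum over measures with entropic penalty has the closed form $-\gamma^{-1}\log \mathbb{E}^{\hat{Q}}[e^{-\gamma v}]$ (the Gibbs variational formula), so the price with $n$ exercise rights can be computed by backward induction from $p^{(0)} = 0$. All function names, the grid, and the payoff are hypothetical.

```python
import numpy as np


def random_walk_matrix(m, p_up=0.5):
    """Transition matrix of a random walk on m grid points, held at the ends."""
    trans = np.zeros((m, m))
    for k in range(m):
        trans[k, min(k + 1, m - 1)] += p_up
        trans[k, max(k - 1, 0)] += 1.0 - p_up
    return trans


def entropic_continuation(next_values, trans, gamma):
    """One-step value of inf_Q { E^Q[v] + H(Q | Q_hat) / gamma }.

    By the Gibbs variational formula this equals
    -(1/gamma) * log E^{Q_hat}[exp(-gamma * v)], evaluated for every state.
    """
    z = -gamma * next_values
    z_max = z.max()  # shift before exponentiating, for numerical stability
    return -(z_max + np.log(trans @ np.exp(z - z_max))) / gamma


def multi_exercise_prices(payoff, trans, gamma, n_steps, n_rights):
    """Backward induction for p^(n), starting from p^(0) = 0 (toy assumption)."""
    m = payoff.shape[0]
    prev = np.zeros((n_steps + 1, m))  # p^(0): no exercise rights left
    for _ in range(n_rights):
        cur = np.empty_like(prev)
        cur[n_steps] = payoff + prev[n_steps]  # remaining rights used at maturity
        for t in range(n_steps - 1, -1, -1):
            cont = entropic_continuation(cur[t + 1], trans, gamma)
            # Exercise one right now (keeping n-1), or continue with n rights.
            cur[t] = np.maximum(payoff + prev[t], cont)
        prev = cur
    return prev


y_grid = np.linspace(5.0, 15.0, 21)       # hypothetical price grid
payoff = np.maximum(y_grid - 10.0, 0.0)   # call payoff with hypothetical strike 10
trans = random_walk_matrix(len(y_grid))
p_low = multi_exercise_prices(payoff, trans, gamma=0.1, n_steps=20, n_rights=2)
p_high = multi_exercise_prices(payoff, trans, gamma=2.0, n_steps=20, n_rights=2)
print(p_low[0, 10], p_high[0, 10])
```

In this toy setting the price computed with the larger $\gamma$ is never above the one computed with the smaller $\gamma$, because the penalty enters scaled by $1/\gamma$.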
C. Risk Aversion Asymptotics

Next, we apply the duality formula (41) to analyze some properties of the indifference price, with an emphasis on the impact of the risk aversion $\gamma$ and correlation coefficient $\rho(i)$. Let us denote the indifference price and optimal exercise time by $p^{(\gamma)}(t,y,i)$ and $\tau^{(\gamma)*}$, respectively, to highlight their dependence on $\gamma$. We first show that higher risk aversion reduces the option holder's indifference price and leads to an earlier exercise time.

Proposition 13: If $\gamma_2 \ge \gamma_1 > 0$, then $p^{(\gamma_2)}(t,y,i) \le p^{(\gamma_1)}(t,y,i)$, and $\tau^{(\gamma_2)*} \le \tau^{(\gamma_1)*}$ almost surely.

Proof: For any $\tau$, $\phi$, and $\alpha$, the entropic penalty in (41), $\gamma^{-1}\hat{H}^{\tau}_{\phi,\alpha}(t,y,i)$, is non-negative and non-increasing in $\gamma$. Taking the infimum over $(\phi,\alpha)$ and then the supremum over $\tau$ preserves this ordering, which yields $p^{(\gamma_2)}(t,y,i) \le p^{(\gamma_1)}(t,y,i)$. By (19), this implies that $p^{(\gamma_2)}$ reaches the payoff $g$ no later than $p^{(\gamma_1)}$ almost surely, and therefore $\tau^{(\gamma_2)*} \le \tau^{(\gamma_1)*}$ almost surely.

This is reflected in Figure 1, where the holder's optimal exercise boundary shifts downward as risk aversion increases.

We observe that, as $\gamma \uparrow \infty$, the entropic penalty in (41) vanishes. Consequently, we obtain the limit price:

\[
\lim_{\gamma\to\infty} p^{(\gamma)}(t,y,i) = \sup_{\tau\in\mathcal{T}_{t,T}} \inf_{\phi,\alpha} \mathbb{E}^{Q^{\phi,\alpha}}\big\{ g(\tau, Y_\tau, \xi_\tau) \mid Y_t = y,\ \xi_t = i \big\}.
\]
This price is commonly referred to as the sub-hedging price (see [19]). On the other hand, as $\gamma \downarrow 0$, we deduce from (41) that it is optimal not to deviate from $\hat{Q}$, resulting in zero penalty. The limit price is

\[
\lim_{\gamma\to 0} p^{(\gamma)}(t,y,i) = \sup_{\tau\in\mathcal{T}_{t,T}} \mathbb{E}^{\hat{Q}}\big\{ g(\tau, Y_\tau, \xi_\tau) \mid Y_t = y,\ \xi_t = i \big\}.
\]
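The two limits in $\gamma$ can be checked numerically in a one-period, finite-state toy problem. This is only an illustration under assumptions not made in the paper: there is no stopping and no traded asset, and the infimum runs over all probability vectors absolutely continuous with respect to a reference vector `q_hat` that stands in for $\hat{Q}$. In that setting the penalized infimum has the closed form $-\gamma^{-1}\log \mathbb{E}^{\hat{Q}}[e^{-\gamma g}]$ and is attained by an exponentially tilted measure. All names and numbers below are hypothetical.

```python
import numpy as np


def penalized_price(g, q_hat, gamma):
    """inf over Q of E^Q[g] + H(Q | q_hat) / gamma, in closed form (toy setting)."""
    z = -gamma * g
    z_max = z.max()  # shift before exponentiating, for numerical stability
    return -(z_max + np.log(np.dot(q_hat, np.exp(z - z_max)))) / gamma


def minimizing_measure(g, q_hat, gamma):
    """Exponentially tilted measure, proportional to q_hat * exp(-gamma * g)."""
    w = q_hat * np.exp(-gamma * (g - g.min()))
    return w / w.sum()


def relative_entropy(q, q_hat):
    """H(q | q_hat) = sum q * log(q / q_hat), with the convention 0 * log 0 = 0."""
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / q_hat[mask])))


g = np.array([0.0, 1.0, 3.0, 6.0])       # hypothetical payoffs
q_hat = np.array([0.2, 0.3, 0.3, 0.2])   # hypothetical reference measure

for gamma in (1e-6, 0.5, 5.0, 1e3):
    q = minimizing_measure(g, q_hat, gamma)
    # Evaluate the penalized objective directly at the tilted measure.
    direct = float(np.dot(q, g)) + relative_entropy(q, q_hat) / gamma
    print(gamma, penalized_price(g, q_hat, gamma), direct)
```

For small `gamma` the value approaches the expectation of `g` under `q_hat`, and for large `gamma` it approaches the smallest payoff, which is the toy counterpart of the sub-hedging price.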
In this small-risk-aversion limit, the MEMM is the pricing measure: it assigns zero risk premium ($\hat{\phi} = 0$) to $\tilde{W}$, but places a time-varying premium $\hat{\alpha}_t(i,j)$ on the regime-switching risk (see (32)).

D. The Case of Perfect Correlation

Our model applies even in the case of perfect correlation, namely, $|\rho(i)| = 1$ for every $i \in E$. In this case, $\tilde{W}$ vanishes from the dynamics of $Y$, and both $S$ and $Y$ are driven by the same Brownian motion $W$ (in addition to $\xi$). Consequently, $Y$ is effectively "traded" via the proxy asset $S$. To avoid arbitrage, we require that $\lambda(i) = \kappa(i)$ for every $i \in E$. However, the market is still incomplete due to the unhedgeable regime-switching risk. Using these facts, Theorem 3 can be directly applied to obtain the VI for the indifference price; the only change here is that the differential operator $\mathcal{A}_i$ in (16) reduces to a linear one:

\[
\mathcal{A}_i u = \frac{\partial u}{\partial t} + \frac{1}{2}\,\eta_i^2\, y^2\, u_{yy}.
\]

The duality formula (41) also holds, with $\phi = 0$ due to the absence of a second Brownian motion. Hence, the entropy minimization is conducted over the collection of measures $\{Q^{0,\alpha}\}$:

\[
p(t,y,i) = \sup_{\tau\in\mathcal{T}_{t,T}} \inf_{\alpha} \Big( \mathbb{E}^{Q^{0,\alpha}}_{t,y,i}\{ g_\tau \} + \frac{1}{\gamma}\,\hat{H}^{\tau}_{0,\alpha}(t,y,i) \Big).
\]
We refer to [12] for a detailed discussion on indifference price asymptotics in a general semimartingale market.

REFERENCES
[1] T. Leung and R. Sircar, “Accounting for risk aversion, vesting, job termination risk and multiple exercises in valuation of employee stock options,” Mathematical Finance, vol. 19, no. 1, pp. 99–128, January 2009.
[2] M. Davis, “Pricing weather derivatives by marginal value,” Quantitative Finance, vol. 1, pp. 1–4, 2001.
[3] M. Davis, “Option pricing in incomplete markets,” in Mathematics of Derivative Securities, M. Dempster and S. Pliska, Eds. Cambridge University Press, 1997, pp. 227–254.
[4] A. Oberman and T. Zariphopoulou, “Pricing early exercise contracts in incomplete markets,” Computational Management Science, vol. 1, pp. 75–107, 2003.
[5] X. Guo and Q. Zhang, “Closed-form solutions for perpetual American put options with regime switching,” SIAM Journal on Applied Mathematics, vol. 64, no. 6, pp. 2034–2049, 2004.
[6] R. J. Elliott, L. Chan, and T. K. Siu, “Option pricing and Esscher transform under regime switching,” Annals of Finance, vol. 1, pp. 423–432, 2005.
[7] M. Frittelli, “The minimal entropy martingale measure and the valuation problem in incomplete markets,” Mathematical Finance, vol. 10, pp. 39–52, 2000.
[8] F. Delbaen, P. Grandits, T. Rheinländer, D. Samperi, M. Schweizer, and C. Stricker, “Exponential hedging and entropic penalties,” Mathematical Finance, vol. 12, pp. 99–123, 2002.
[9] I. Karatzas and S. Shreve, Methods of Mathematical Finance. Springer, 1998.
[10] R. Merton, “Lifetime portfolio selection under uncertainty: the continuous-time case,” Review of Economics and Statistics, vol. 51, pp. 247–257, 1969.
[11] W. H. Fleming and S. Sheu, “Optimal long term growth rate of expected utility of wealth,” Ann. Appl. Probab., vol. 9, no. 3, pp. 871–903, 1999.
[12] T. Leung and R. Sircar, “Exponential hedging with optimal stopping and application to ESO valuation,” SIAM Journal on Control and Optimization, vol. 48, no. 3, pp. 1422–1451, 2009.
[13] T. Leung, “The impacts of risk aversion and market regimes on dynamic hedging and multiple exercises of American options,” 2010, working paper, Johns Hopkins University.
[14] P. Wilmott, S. Howison, and J. Dewynne, The Mathematics of Financial Derivatives. Cambridge University Press, 1995.
[15] H. Föllmer and D. Sondermann, “Hedging of non-redundant contingent claims,” in Contributions to Mathematical Economics: In Honor of Gerard Debreu, W. Hildenbrand and A. Mas-Colell, Eds. Amsterdam, North-Holland, 1986.
[16] L. Rogers and D. Williams, Diffusions, Markov Processes and Martingales, 2nd ed. Cambridge University Press, UK, 2000.
[17] P. Grandits and T. Rheinländer, “On the minimal entropy martingale measure,” Annals of Probability, vol. 30, pp. 1003–1038, 2002.
[18] Y. Kabanov and C. Stricker, “On the optimal portfolio for the exponential utility maximization: Remarks to the six-author paper,” Mathematical Finance, vol. 12, pp. 125–134, 2002.
[19] I. Karatzas and S. Kou, “Hedging American contingent claims with constrained portfolios,” Finance and Stochastics, vol. 2, pp. 215–258, 1998.