Entropy Coherent and Entropy Convex Measures of Risk∗

Roger J. A. Laeven
Dept. of Econometrics and Operations Research
Tilburg University, CentER
and EURANDOM
[email protected]
http://center.uvt.nl/staff/laeven

Mitja A. Stadje
Dept. of Econometrics and Operations Research
Tilburg University
and CentER
[email protected]
http://center.uvt.nl/staff/stadje

This Version: September 30, 2011
Abstract

We introduce two subclasses of convex measures of risk, referred to as entropy coherent and entropy convex measures of risk. Entropy coherent and entropy convex measures of risk are special cases of ϕ-coherent and ϕ-convex measures of risk. Contrary to the classical use of coherent and convex measures of risk, which for a given probabilistic model entails evaluating a financial position by considering its expected loss, ϕ-coherent and ϕ-convex measures of risk evaluate a financial position under a given probabilistic model by considering its normalized expected ϕ-loss. We prove that (i) entropy coherent and entropy convex measures of risk are obtained by requiring ϕ-coherent and ϕ-convex measures of risk to be translation invariant; (ii) convex, entropy convex and entropy coherent measures of risk emerge as certainty equivalents under variational, homothetic and multiple priors preferences upon requiring the certainty equivalents to be translation invariant; and (iii) ϕ-convex measures of risk are certainty equivalents under variational and homothetic preferences if and only if they are convex and entropy convex measures of risk. In addition, we study the properties of entropy coherent and entropy convex measures of risk, derive their dual conjugate function, and characterize entropy coherent and entropy convex measures of risk in terms of properties of the corresponding acceptance sets.

Keywords: Multiple priors; Variational and homothetic preferences; Robustness; Convex risk measures; Exponential utility; Relative entropy; Translation invariance; Convexity; Certainty equivalent.

AMS 2010 Classification: Primary: 91B06, 91B16, 91B30; Secondary: 60E15, 62P05.

OR/MS Classification: Risk, Asset pricing, Insurance.

JEL Classification: D81, G10, G20.

∗We are very grateful to a referee for helpful comments and suggestions that significantly improved the paper. We are also grateful to Patrick Cheridito, Hans Föllmer, Dilip Madan, Alexander Schied, Hans Schumacher and seminar and conference participants at the Fields Institute in Toronto, the AMaMeF Conference in Berlin, the EURANDOM Lecture Day on Advances in Financial Mathematics, and the Colloquium on Risk Management and Risk Measures at Leibniz Universität Hannover for their comments and suggestions. This research was funded in part by the Netherlands Organization for Scientific Research (Laeven) under grants NWO VENI 2006 and NWO VIDI 2009.
1 Introduction
Convex measures of risk have played an increasingly important role since their introduction by Föllmer and Schied [15], Frittelli and Rosazza Gianin [18] and Heath and Ku [28], generalizing Artzner et al. [2]; see also the early work of Deprez and Gerber [13] and Ben-Tal and Teboulle [4, 5], and the more recent Ben-Tal and Teboulle [6] and Ruszczyński and Shapiro [35, 36]. For a given financial position X, defined on a measurable space (Ω, F), a convex risk measure ρ returns the minimal amount of capital the economic agent holding X is required to commit and add to the financial position in order to make it ‘safe’: the theory of convex risk measures postulates that from the viewpoint of the supervisory authority, the financial position X + ρ(X) is acceptably insured against adverse shocks. Convex risk measures are characterized by the axioms of monotonicity, translation invariance and convexity. They can — under additional assumptions on the space of financial positions and on continuity properties of the risk measures; see Section 2 — be represented in the form

ρ(X) = sup_{Q∈Q} {E_Q[−X] − α(Q)},   (1.1)
where Q is a set of probability measures on (Ω, F), and α is a penalty function defined on probability measures on (Ω, F). With

α(Q) = 0 if Q ∈ M ⊂ Q, and α(Q) = ∞ otherwise,

we obtain the particular subclass of coherent measures of risk, represented in the form

ρ(X) = sup_{Q∈M} E_Q[−X].
Under the probabilistic model Q, the esteemed plausibility of which is measured by α(Q), convex measures of risk evaluate the financial position X by considering its expected loss E_Q[−X]. This is equivalent to assuming a risk neutral valuation, given the probabilistic model Q. A more general, and potentially more cautious, approach to evaluating X under the probabilistic model Q consists in considering its normalized expected ϕ-loss c_ϕ(X, Q) = ϕ^{-1}(E_Q[ϕ(−X)]), with ϕ increasing. The case of a linear ϕ corresponds to a risk neutral evaluation, while a non-linear and convex ϕ corresponds to a risk averse evaluation. The generalized risk measure ρ then takes the form

ρ(X) = sup_{Q∈Q} {c_ϕ(X, Q) − θ(Q)},   (1.2)
where θ is a penalty function defined on probability measures on (Ω, F). Henceforth, we call risk measures of the form (1.2) ϕ-convex measures of risk. They reduce to ϕ-coherent measures of risk whenever θ is an indicator function that takes the value zero if Q ∈ M and ∞ otherwise. The central objects of this paper, referred to as entropy convex measures of risk, are obtained by specifying ϕ(x) = exp{x/γ}, γ ∈ (0, ∞), in (1.2), so that c_ϕ(X, Q) = γ log(E_Q[exp{−X/γ}]). Entropy coherent measures of risk occur when ϕ(x) = exp{x/γ}, γ ∈ (0, ∞), and θ is an indicator function.

In a related strand of the literature, in decision theory, among the most popular theories for decision under uncertainty is the multiple priors model, postulating that an economic agent
evaluates the consequences (payoffs) of a decision alternative (financial position) X, according to

U(X) = inf_{Q∈M} E_Q[u(X)],   (1.3)
where u : R → R is an increasing function, and M ⊂ Q a set of probability measures (priors). The function u, referred to as a utility function, captures the agent’s risk aversion, and the set M represents the agent’s uncertainty about the correct probabilistic model. Gilboa and Schmeidler [21] establish a preference axiomatization of this robust Savage representation, generalizing Savage [37] in the framework of Anscombe and Aumann [1]. The representation of Gilboa and Schmeidler [21], also referred to as maxmin expected utility, is a decision-theoretic foundation of the classical decision rule of Wald [40] — see also Huber [29] — that had long seen little popularity outside (robust) statistics. The multiple priors model is a special case of interest in the class of variational preferences axiomatized by Maccheroni, Marinacci and Rustichini [30]. Under variational preferences, the numerical representation takes the form

U(X) = inf_{Q∈Q} {E_Q[u(X)] + α(Q)},   (1.4)
where α is an ambiguity index (penalty function) on probability measures on (Ω, F). Multiple priors occurs when α is an indicator function that takes the value zero if Q ∈ M and ∞ otherwise. Under multiple priors, the degree of ambiguity is reflected by the multiplicity of the priors. Under variational preferences, the degree of ambiguity is reflected by the multiplicity of the priors and the esteemed plausibility of the prior according to the ambiguity index. Recently, Chateauneuf and Faro [9] and, slightly more generally, Cerreia-Vioglio et al. [8] axiomatized a multiplicative analog of variational preferences, henceforth referred to as homothetic preferences, represented as

U(X) = inf_{Q∈Q} {β(Q) E_Q[u(X)]},   (1.5)
with β a penalty function on probability measures on (Ω, F); it also includes multiple priors as a special case (β(Q) ≡ 1). To measure the ‘risk’ related to a financial position X, the theories of variational and homothetic preferences sketched above would lead to the definition of a loss functional L(X) = −U(X), satisfying

L(X) = sup_{Q∈Q} {E_Q[ϕ(−X)] − α(Q)}   and   L(X) = sup_{Q∈Q} {β(Q) E_Q[ϕ(−X)]},
respectively, where ϕ(x) = −u(−x). One could, then, look at the capital amount m̄_X that is ‘equivalent’ to the potential loss of X, solving for m̄_X in L(m̄_X) = L(X). This number is commonly referred to as the certainty equivalent of X, which is a classical notion in decision theory to evaluate X; see, e.g., Gollier [22]. However, because we are interested in the amount of capital needed to compensate or counterbalance the risk from the financial position X, we will rather look at the negative certainty equivalent of X, m_X, given by −m̄_X, satisfying L(−m_X) = ϕ(m_X) = L(X), or equivalently,

m_X = ϕ^{-1}( sup_{Q∈Q} {E_Q[ϕ(−X)] − α(Q)} )   and   m_X = ϕ^{-1}( sup_{Q∈Q} {β(Q) E_Q[ϕ(−X)]} ).   (1.6)
The contribution of this paper is twofold. First we derive precise connections between risk measurement using ϕ-convex measures of risk — (1.2) — and risk measurement under the theories of variational, homothetic and multiple priors preferences — (1.6). Specifically, we prove the following three main results. Clearly, ϕ-coherent measures of risk coincide with negative certainty equivalents under multiple priors preferences. In this case, θ in (1.2) is an indicator function,

θ(Q) = 0 if Q ∈ M, and θ(Q) = ∞ otherwise,

meaning that all probabilistic models considered are esteemed equally plausible. But if θ is not an indicator function, we prove that ϕ-convex measures of risk are negative certainty equivalents under variational and homothetic preferences if and only if they are convex and entropy convex measures of risk, respectively (Theorem 4.18 and Remark 4.20). In the former case, ϕ is linear, inducing risk neutrality; in the latter case ϕ is exponential, inducing risk aversion.

We further prove that entropy coherent and entropy convex measures of risk are obtained by requiring ϕ-coherent and ϕ-convex measures of risk to be translation invariant (Theorem 4.1). It entails that entropy coherent and entropy convex measures of risk are the only convex (hence translation invariant) risk measures of the form (1.2) with a non-linear ϕ, thus allowing for risk aversion. The property of translation invariance is motivated by the interpretation of a risk measure as a minimal amount of risk capital. It ensures that ρ(X + ρ(X)) = 0. We also prove that negative certainty equivalents under variational, homothetic and multiple priors preferences are translation invariant if and only if they are convex, entropy convex, and entropy coherent measures of risk, respectively (Theorem 4.3, Corollary 4.9 and Theorem 4.11). It entails that convex, entropy convex and entropy coherent measures of risk induce linear (risk neutral) or exponential (risk averse) utility functions in the theories of variational, homothetic and multiple priors preferences. We show further that, under a normalization condition, this characterization remains valid when the condition of translation invariance is replaced by requiring convexity (Theorem 4.15, Corollary 4.16 and Remark 4.17). The mathematical details in the proofs of these three main characterization results are delicate.

The three characterization results identify entropy coherent and entropy convex measures of risk as distinctive and important subclasses of convex measures of risk. Our second contribution, then, is to study the classes of entropy coherent and entropy convex measures of risk in detail. We show that they satisfy many appealing properties. We prove various results on the dual conjugate function for entropy coherent and entropy convex measures of risk. We show in particular that, quite exceptionally, the dual conjugate function can explicitly be identified under some technical conditions. We also provide characterizations of entropy coherent and entropy convex risk measures in terms of their acceptance sets.

In the traditional setting of Von Neumann and Morgenstern [39], where the probabilistic model is known and given so that simply U(X) = E[u(X)], analogs of (some of) the main characterization results established in this paper are relatively easy to obtain; see Hardy, Littlewood and Pólya [27] (p. 88, Theorem 106), Gerber [19] (Chapter 5) and Goovaerts, De Vylder and Haezendonck [23] (Chapter 3).
It is intriguingly more complicated for the variational, homothetic and multiple priors preferences considered here, and we will show that without richness assumptions on the probability space and subdifferentiability conditions on ρ, our representation theorems in fact break down. In recent work, Cheridito and Kupper [10] (Example 3.6.3) suggest (without formal proof) a connection between certainty equivalents in
the pure multiple priors model and convex measures of risk. They restrict, however, to a specific and simple probabilistic setting which, as we will see below, can be viewed as supplementary (and non-overlapping) to a special case of the general setting considered here. While there is a rich literature on both theories (1.1) and (1.6), to the best of our knowledge there is no other work establishing precise connections between these prominent paradigms, let alone between the more general (1.2) and (1.6).

The rest of this paper is organized as follows: in Section 2, we review some preliminaries for coherent and convex measures of risk. In Section 3, we introduce ϕ-coherent and ϕ-convex measures of risk, and their special cases referred to as entropy coherent and entropy convex measures of risk, provide some motivating examples, and discuss some of their basic properties. In Section 4 we prove axiomatic characterization results for convex, entropy convex and entropy coherent measures of risk. Section 5 studies the dual conjugate function for entropy coherent and entropy convex measures. Section 6 provides characterizations of entropy coherent and entropy convex measures of risk in terms of acceptance sets. Conclusions are in Section 7.
2 Preliminaries
We fix a probability space (Ω, F, P). Throughout this paper, equalities and inequalities between random variables are understood in the P-almost sure sense. We let L∞(Ω, F, P) ≡ L∞ denote the space of all real-valued random variables X on (Ω, F, P) for which ||X||∞ := inf{c > 0 | P[|X| ≤ c] = 1} < ∞, where two random variables are identified if they are P-almost surely equal. We denote (0, ∞) by R_+ and (−∞, 0] by R_0^-.

Definition 2.1 We call a mapping ρ : L∞ → R a convex risk measure if it has the following properties:

• Normalization: ρ(0) = 0.

• Translation Invariance: ρ(X + m) = ρ(X) − m for all m ∈ R.

• Monotonicity: If X ≤ Y, then ρ(X) ≥ ρ(Y).

• Convexity: ρ(λX + (1 − λ)Y) ≤ λρ(X) + (1 − λ)ρ(Y) for λ ∈ [0, 1].

• Continuity from above: If X_n ∈ L∞ is a decreasing sequence converging to X ∈ L∞, then ρ(X_n) ↑ ρ(X).

Furthermore, ρ is called a coherent risk measure if additionally it is positively homogeneous, i.e.,

• Positive Homogeneity: For λ > 0: ρ(λX) = λρ(X).

We denote by Q(P) ≡ Q all probability measures that are absolutely continuous with respect to P. If Q ∈ Q, we also write Q ≪ P. It is well-known that if ρ is a convex risk measure then there exists a unique lower-semicontinuous and convex function α : Q → R ∪ {∞}, referred to as the dual conjugate of ρ, such that the following dual representation holds:

ρ(X) = sup_{Q∈Q} {E_Q[−X] − α(Q)}.   (2.1)
Furthermore,

α(Q) = sup_{X∈L∞} {E_Q[−X] − ρ(X)};   (2.2)

α is minimal in the sense that for every other (possibly non-convex or non-lower-semicontinuous) function α′ satisfying (2.1), α ≤ α′; see, for instance, Föllmer and Schied [16] and Ruszczyński and Shapiro [35, 36]. We define the subdifferential of ρ by

∂ρ(X) = {Q ∈ Q | ρ(X) = E_Q[−X] − α(Q)}.   (2.3)

We say that ρ is subdifferentiable if for every X ∈ L∞, ∂ρ(X) ≠ ∅. In this paper, we furthermore denote by C^n(E) the space of all functions from R to R for which the first n derivatives exist and which are continuous in an open set E. Finally, for a set M ⊂ Q, we denote by Ī_M the penalty function that is zero if Q ∈ M and ∞ otherwise.
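To fix ideas, the following minimal numerical sketch (not part of the original text; the sample space, models and positions are illustrative assumptions) evaluates a coherent risk measure through the dual representation (2.1), with α ≡ 0 on a finite set M of models, and checks the properties of Definition 2.1 on sample positions.

```python
import numpy as np

# Minimal numerical sketch (not part of the paper): a coherent risk measure on a
# finite sample space, evaluated through the dual representation (2.1) with a
# penalty alpha that is zero on a finite set M of models and +infinity elsewhere.
rng = np.random.default_rng(0)

M = [np.array([0.25, 0.25, 0.25, 0.25]),   # illustrative candidate models Q << P
     np.array([0.10, 0.20, 0.30, 0.40]),
     np.array([0.40, 0.30, 0.20, 0.10])]

def rho(X):
    """rho(X) = max_{Q in M} E_Q[-X], i.e. (2.1) with alpha = 0 on M."""
    return max(float(q @ (-X)) for q in M)

X, Y = rng.normal(size=4), rng.normal(size=4)
m, lam = 1.3, 0.6

print(np.isclose(rho(X + m), rho(X) - m))                                          # translation invariance
print(rho(lam * X + (1 - lam) * Y) <= lam * rho(X) + (1 - lam) * rho(Y) + 1e-12)   # convexity
print(np.isclose(rho(2.0 * X), 2.0 * rho(X)))                                      # positive homogeneity
print(rho(np.minimum(X, Y)) >= rho(X) - 1e-12)                                     # monotonicity
```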
3 Entropy Coherence and Entropy Convexity: Definitions and Basic Properties
From the representation of convex risk measures (1.1) it is apparent that under every probabilistic model Q, the financial position X is evaluated risk neutrally, that is, by considering its expected loss E_Q[−X]. We now consider, more generally, ϕ-convex measures of risk, which allow for a risk averse evaluation of X, given the probabilistic model Q. Specifically, given Q, ϕ-convex measures of risk evaluate the financial position X by considering its normalized expected ϕ-loss c_ϕ(X, Q) = ϕ^{-1}(E_Q[ϕ(−X)]), with ϕ : R → R increasing. If ϕ is linear, the evaluation under Q is risk neutral, but if ϕ is non-linear and convex, the evaluation is risk averse. We state the following definitions:

Definition 3.1 The mapping ρ : L∞ → R is a ϕ-convex measure of risk if there exists an increasing function ϕ : R → R and a penalty function θ : Q → [0, ∞], with inf_{Q∈Q} θ(Q) = 0, such that

ρ(X) = sup_{Q∈Q} {c_ϕ(X, Q) − θ(Q)},   (3.1)

with c_ϕ(X, Q) = ϕ^{-1}(E_Q[ϕ(−X)]).

Definition 3.2 We call a mapping ρ : L∞ → R a ϕ-coherent measure of risk if there exists an increasing function ϕ : R → R and a set M ⊂ Q such that

ρ(X) = sup_{Q∈M} c_ϕ(X, Q),

with c_ϕ(X, Q) = ϕ^{-1}(E_Q[ϕ(−X)]).

For a mapping ρ : L∞ → R we define

ρ*(Q) = sup_{X∈L∞} {c_ϕ(X, Q) − ρ(X)}   and   ρ**(X) = sup_{Q≪P} {c_ϕ(X, Q) − ρ*(Q)}.   (3.2)

Lemma 3.3 If ρ is a ϕ-convex measure of risk, then for every X ∈ L∞,

ρ**(X) ≤ ρ(X).   (3.3)
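Indeed, by the definition of ρ*, ρ*(Q) ≥ c_ϕ(X, Q) − ρ(X) for every X ∈ L∞ and every Q ≪ P; hence c_ϕ(X, Q) − ρ*(Q) ≤ ρ(X) for every Q ≪ P, and taking the supremum over Q ≪ P yields (3.3).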
The next proposition establishes a basic duality result for ϕ-convex measures of risk:

Proposition 3.4 A normalized mapping ρ is a ϕ-convex measure of risk if and only if ρ** = ρ. Furthermore, ρ* is the minimal penalty function.

Proof. This duality result follows in principle from the general duality results in Moreau [32]. We provide a short proof to be self-contained. The ‘if’ part holds because if ρ(X) = ρ**(X) = sup_{Q≪P} {c_ϕ(X, Q) − ρ*(Q)}, then by virtue of the equalities

0 = −ρ(0) = − sup_{Q∈Q} {−ρ*(Q)} = inf_{Q∈Q} ρ*(Q),

ρ is a ϕ-convex measure of risk. Let us prove the ‘only if’ direction. We already know from Lemma 3.3 that ρ** ≤ ρ. We will prove that ρ** ≥ ρ. If ρ is a ϕ-convex measure of risk there exists a penalty function θ such that

ρ(X) = sup_{Q∈Q} {c_ϕ(X, Q) − θ(Q)}.

Thus, for every Q ≪ P we have θ(Q) ≥ c_ϕ(X, Q) − ρ(X). By the definition of ρ* this yields θ(Q) ≥ ρ*(Q). This proves that every penalty function θ dominates ρ*. Moreover,

ρ**(X) = sup_{Q≪P} {c_ϕ(X, Q) − ρ*(Q)} ≥ sup_{Q≪P} {c_ϕ(X, Q) − θ(Q)} = ρ(X).
□

Proposition 3.4 suggests a way to find out whether a mapping ρ is a ϕ-convex measure of risk: compute ρ* and ρ**, and verify whether ρ** = ρ. For later reference, we state the following definition:

Definition 3.5 For a ϕ-convex measure of risk ρ we denote by

∂_# ρ(X) = {Q ∈ Q | ρ(X) = c_ϕ(X, Q) − θ(Q)}   and   ∂_# ρ*(Q) = {X ∈ L∞ | ρ*(Q) = c_ϕ(X, Q) − ρ(X)}

the #-subdifferentials. Furthermore, if for every X ∈ L∞, ∂_# ρ(X) ≠ ∅, then we say that ρ is #-subdifferentiable. We define #-subdifferentiability of ρ* similarly.

A risk measure that is particularly popular in insurance and financial mathematics (Gerber [19], Föllmer and Schied [16] and Mania and Schweizer [31]), macroeconomics (Hansen and Sargent [25, 26]), and decision theory (Gollier [22] and Strzalecki [38]), is the (standard) entropic risk measure defined by

e_γ(X) = γ log E[exp{−X/γ}],   γ ∈ R_+,

with e_0(X) = lim_{γ↓0} e_γ(X) = − ess inf X and e_∞(X) = lim_{γ↑∞} e_γ(X) = −E[X]. In a setting with distribution invariance, it is commonly referred to as the exponential premium; see Gerber
[19] and Goovaerts et al. [24]. It occurs as a special case of c_ϕ(X, P) by taking ϕ(x) = exp{x/γ}, γ ∈ R_+, and is the negative certainty equivalent of an economic agent with exponential (CARA) expected utility preferences. As is well-known (Csiszár [11]),

e_γ(X) = sup_{P̄≪P} {E_P̄[−X] − γH(P̄|P)},

where H(P̄|P) is the relative entropy, i.e.,

H(P̄|P) = E_P̄[log(dP̄/dP)] if P̄ ≪ P, and H(P̄|P) = ∞ otherwise.

The relative entropy is also known as the Kullback-Leibler divergence; it measures the distance between the measures P̄ and P. Risk measurement with the relative entropy is natural in the following setting: the economic agent has a reference measure P; the measure P is, however, an approximation to the probabilistic model of the payoff X rather than the true model. The agent therefore does not fully trust the measure P and considers many measures P̄, with esteemed plausibility decreasing proportionally to their distance from the approximation P. Note that for every given X, the mapping γ → e_γ(X) is increasing. Consequently, the parameter γ may be viewed as measuring the degree of trust the agent puts in the reference measure P. If γ = 0, then e_0(X) = − ess inf X, which corresponds to a maximal level of distrust; in this case only the zero sets of the measure P are considered reliable. If, on the other hand, γ = ∞, then e_∞(X) = −E[X], which corresponds to a maximal level of trust in the measure P. Hence, on the one hand, e_γ(X) has the interpretation of being the negative certainty equivalent of an economic agent with exponential expected utility preferences and coefficient of absolute risk aversion equal to γ. On the other hand, e_γ(X) may be seen as a robust expectation with respect to a reference measure P, with the relative entropy as distance measure. It is well-known that ∂e_γ(X) is given by the Esscher density with respect to P: exp{−X/γ}/E[exp{−X/γ}], γ ∈ R_+. In certain situations the agent can consider other (possibly reference) measures Q ≪ P. Therefore we define the entropy e_{γ,Q} with respect to Q as

e_{γ,Q}(X) = γ log E_Q[exp{−X/γ}].

Consider now the following two examples:

Example 3.6 An economic agent with an exponential (CARA) utility function u(x) = 1 − e^{−x/γ}, γ ∈ R_+, computes the certainty equivalent to the financial position X under the reference measure P, that is, u^{-1}(E[u(X)]) = −γ log(E[exp{−X/γ}]). This evaluation of X coincides, upon a sign change, with the normalized expected ϕ-loss c_ϕ(X, P) when ϕ(x) = exp{x/γ}, γ ∈ R_+. The agent is, however, uncertain about the probabilistic model under the reference measure, and therefore takes the infimum over all probability measures Q absolutely continuous with respect to P, including an additive penalty function θ(Q) measuring the esteemed plausibility of the probabilistic model under Q. The robust certainty equivalent thus computed is

−ρ(X) = inf_{Q∈Q} {−γ log(E_Q[exp{−X/γ}]) + θ(Q)}.

Upon a sign change, it is apparent that in this case ρ(X) is a ϕ-convex measure of risk with risk averse ϕ-loss function ϕ(x) = exp{x/γ}, γ ∈ R_+.
Example 3.7 Suppose that the economic agent is only interested in downside tail risk. The standard risk measure focusing on tail risk is the Tail-Value-at-Risk (TV@R), also referred to as Conditional-Value-at-Risk or Average-Value-at-Risk (Rockafellar and Uryasev [33] and Rockafellar, Uryasev and Zabarankin [34]). TV@R is defined by

TV@R^α(X) = (1/α) ∫_0^α V@R_λ(X) dλ,   α ∈ (0, 1),

with V@R_λ(X) = −q_X^+(λ), where q_X^+ is the upper quantile function of X: q_X^+(λ) = inf{x | P[X ≤ x] > λ}. If the distribution of X is continuous, TV@R^α(X) = E[−X | X ≤ q_X^+(α)], so that TV@R computes the average over the left tail of the distribution of X up to q_X^+(α). It is well-known that

TV@R^α(X) = sup_{Q∈M_α} E_Q[−X],

where M_α is the set of all probability measures Q ≪ P such that dQ/dP ≤ 1/α. Let dQ/dP = (1/α) I_{X < q_X^+(α)} + c I_{X = q_X^+(α)}, where c should be chosen such that E[dQ/dP] = 1. Then one can

(ϕ′ ◦ ϕ^{-1})′′(t) ≠ 0 ≥ 0. There are two cases:

(i) There exists a nonempty interval J = (u, t) ⊂ R_+ such that (ϕ′ ◦ ϕ^{-1})′′ < 0, i.e., ϕ′ ◦ ϕ^{-1} is strictly concave on J.

(ii) There exists a nonempty interval J = (u, t) ⊂ R_+ such that (ϕ′ ◦ ϕ^{-1})′′ > 0, i.e., ϕ′ ◦ ϕ^{-1} is strictly convex on J.

Let ε > 0 such that (1 − ε)t > u. Since the probability space is rich we may choose X ∈ L∞ satisfying both of the following two properties:

(a) −X ∈ ϕ^{-1}(((1 − ε)t, t)) ⊂ ϕ^{-1}(J).

(b) −X is diffuse.
From (a) it follows in particular that ϕ(−X) ∈ ((1 − ε)t, t) ⊂ J. Since, with Q ∈ ∂_# ρ(X), Q ≪ P, and −X is diffuse under P, we have that Q[−X = x] = 0 for every x ∈ ϕ^{-1}(J). Thus, −X is also diffuse under Q. Finally, let us derive the contradiction. Assume case (i) above. Then

ϕ′ ◦ ϕ^{-1}(E_Q[ϕ(−X)]) > E_Q[ϕ′ ◦ ϕ^{-1}(ϕ(−X))] = E_Q[ϕ′(−X)],

where the inequality holds because of Jensen’s inequality for strictly concave functions for the diffuse random variable ϕ(−X) (where we used that ϕ(−X) ∈ J and the strict concavity of ϕ′ ◦ ϕ^{-1} on J). The (strict) inequality above is a contradiction to (4.1). Deriving a contradiction in case (ii) can be done similarly. This proves the theorem. □

Theorem 4.1 shows in particular that entropy convex measures of risk are the only convex risk measures among ϕ-convex measures of risk with non-linear ϕ, thus allowing for risk aversion.
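The dichotomy behind Theorem 4.1 can be illustrated numerically (a toy sketch, not part of the original text; the two models, the position and the cubic ϕ are illustrative assumptions): for a ϕ-coherent measure over a finite set of models, the exponential ϕ-loss is translation invariant, whereas a non-exponential, non-linear ϕ-loss in general is not.

```python
import numpy as np

# Toy sketch (not part of the paper): a phi-coherent measure
# rho(X) = max_{Q in M} phi^{-1}(E_Q[phi(-X)]) over two illustrative models.
# With exponential phi it is translation invariant; with a cubic phi
# (a hypothetical non-exponential choice) it is not.
M = [np.array([0.5, 0.5]), np.array([0.2, 0.8])]
X = np.array([2.0, -1.0])
m, gamma = 1.0, 1.0

def rho(X, phi, phi_inv):
    return max(float(phi_inv(q @ phi(-X))) for q in M)

exp_phi, exp_phi_inv = (lambda x: np.exp(x / gamma)), (lambda y: gamma * np.log(y))
cube,    cube_inv    = (lambda x: x ** 3),            (lambda y: np.cbrt(y))

print(rho(X + m, exp_phi, exp_phi_inv), rho(X, exp_phi, exp_phi_inv) - m)   # equal
print(rho(X + m, cube, cube_inv), rho(X, cube, cube_inv) - m)               # differ
```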
4.2 Homothetic Preferences and Entropy Convex Measures of Risk
In this and the following subsection, we axiomatize convex, entropy convex and entropy coherent measures of risk, showing that they emerge as certainty equivalents under variational, homothetic and multiple priors preferences, respectively, upon requiring the certainty equivalents to be translation invariant. In the characterization theorems (Theorem 4.3, Corollary 4.9 and Theorem 4.11), we consider, more specifically, negative certainty equivalents of the form ρ(X) = ϕ^{-1}(ρ̄(−ϕ(−X))), with

ρ̄(X) = sup_{Q∈Q} β(Q)E_Q[−X],   ρ̄(X) = sup_{Q∈M} E_Q[−X],   ρ̄(X) = sup_{Q∈Q} {E_Q[−X] − α(Q)},

respectively. These constitute the negative certainty equivalents in the theories of homothetic, multiple priors and variational preferences, respectively; cf. (1.6), and also Section I.3 in Föllmer, Schied and Weber [17]. We first state the following proposition, which shows that ρ̄(X) = sup_{Q∈Q} β(Q)E_Q[−X], i.e., the mapping that induces negative certainty equivalents under homothetic preferences through ρ(X) = ϕ^{-1}(ρ̄(−ϕ(−X))), is characterized (axiomatized) by the properties of monotonicity, convexity, positive homogeneity, and the property that ρ̄(m) = −m for all m ≤ 0.

Proposition 4.2 Suppose that ρ̄ : L∞ → R is monotone, convex, positively homogeneous and continuous from above and for all m ∈ R_0^-, ρ̄(m) = −m. Then there exists a set M ⊂ Q and a function β : M → [0, 1] with sup_{Q∈M} β(Q) = 1, such that for all X ∈ L∞ with X ≤ 0,

ρ̄(X) = sup_{Q∈M} β(Q)E_Q[−X].   (4.2)
Furthermore, if additionally we have ρ̄(1) = −1 then ρ̄ is a coherent risk measure. In particular, M can be chosen such that β(Q) = 1 for all Q ∈ M.

Proof. By standard arguments (see, for example, Lemma A64 in the appendix of Föllmer and Schied [16]), we may conclude that ρ̄ is weak∗ lower-semicontinuous. Proposition 3.1.2 in Dana [12] implies that

ρ̄(X) = sup_{X′∈L^1_+} {E[−X′X] − ρ̄ˆ(X′)},
and it follows from standard results in convex analysis that the positive homogeneity of ρ̄ entails that ρ̄ˆ is an indicator function of a convex nonempty set, say H ⊂ L^1_+. Hence,

ρ̄(X) = sup_{X′∈H} E[−X′X] = sup_{X′∈H} E[X′] E[−(X′/E[X′])X] = sup_{X′∈H} E[X′] E_{Q^{X′}}[−X],   (4.3)

where in the case that X′ ≡ 0, we set 0/0 = 1 and Q^{X′} = P. Now set M = {Q ∈ Q | there exists a λ ≥ 0 such that λ dQ/dP ∈ H}. Then (4.3) entails that for all X ∈ L∞ with X ≤ 0,

ρ̄(X) = sup_{Q∈M} β(Q)E_Q[−X],

where for Q ∈ M, β(Q) = sup{λ ≥ 0 | λ dQ/dP ∈ H}. This shows (4.2). Furthermore,

sup_{Q∈M} β(Q) = ρ̄(−1) = 1.
To see the last part of the proposition note that if ρ̄(1) = −1 then we must have −1 = ρ̄(1) = sup_{X′∈H} E[−X′]. This implies that

inf_{X′∈H} E[X′] = 1.

On the other hand, since ρ̄(−1) = 1, we also have that sup_{X′∈H} E[X′] = 1. Hence, for every X′ ∈ H we get that E[X′] = 1. This entails that ρ̄ is a coherent risk measure and by the definition of β we also obtain that β(Q) = 1 for every Q ∈ M. □

Subsequently, we will identify the measure β(Q)Q (given by (β(Q)Q)(A) = β(Q)Q(A) for every A ∈ F) with its density β(Q) dQ/dP. We say that an element X′ ∈ H ⊂ L^1_+ is in the subdifferential of ρ̄, ∂ρ̄(X), if it attains the supremum in (4.3), i.e., ρ̄(X) = E[−X′X]. We state the following theorem:

Theorem 4.3 Suppose that the probability space is rich and that ρ̄ : L∞ → R is monotone, convex, positively homogeneous and continuous from above and for all m ∈ R_0^-, ρ̄(m) = −m. Let ϕ be a strictly increasing, non-linear and continuous function satisfying 0 ∈ closure(Image(ϕ)), ϕ(∞) = ∞ and ϕ ∈ C^3((ϕ^{-1}(0), ∞)). Then the following statements are equivalent:

(i) ρ(X) = ϕ^{-1}(ρ̄(−ϕ(−X))) is translation invariant and the subdifferential of ρ̄ is always nonempty.

(ii) ρ is γ-entropy convex with γ ∈ R_+ and the #-subdifferential is always nonempty.

Remark 4.4 In the proof of Theorem 4.3 (see also Corollary 4.9 below), it will become apparent that ρ(X) = ϕ^{-1}(ρ̄(−ϕ(−X))) is entropy coherent if and only if ρ̄ is a coherent risk measure. In this case, ρ̄(X) = sup_{Q∈M} E_Q[−X] for a set M ⊂ Q, and ρ(X) = ϕ^{-1}(ρ̄(−ϕ(−X))) is a negative certainty equivalent in the multiple priors model. Furthermore, we will see that the case that ρ(X) = ϕ^{-1}(ρ̄(−ϕ(−X))) is entropy convex corresponds to ρ being the negative certainty equivalent under homothetic preferences, with ρ̄(X) = sup_{Q∈M} β(Q)E_Q[−X], where β : M → [0, 1] can be viewed as a confidence measure. In this case, every probabilistic model
Q is ‘discounted’ by a factor β(Q) corresponding to its esteemed plausibility. If β(Q) = 1 for all Q ∈ M, we are back in the multiple priors framework. However, if there exists a Q ∈ M such that β(Q) < 1, ρ is entropy convex but not entropy coherent. In both cases, ϕ turns out to be exponential, hence risk averse.

Remark 4.5 The direction (i)⇒(ii) in Theorem 4.3 does not hold (even not in the case that we additionally assume that ρ̄ is translation invariant as in Corollary 4.9 below) if the probability space is not rich, or if the assumption on the subdifferential of ρ̄ is omitted. Suppose, for instance, that Ω = {ω_1, ω_2, . . . , ω_n} and that, without loss of generality, P[{ω_i}] = p_i > 0, i = 1, . . . , n. Then for a payoff X we can define ρ̄(X) = max_{i=1,...,n} −X(ω_i), where the maximum is attained in the measure Q that sets Q[{ω_{i_0}}] = 1, where ω_{i_0} = arg max_ω −X(ω). Such a discrete worst case measure of risk is popular in robust optimization. Let ϕ be a strictly increasing and continuous function. Then it always holds that

ϕ^{-1}(ρ̄(−ϕ(−X))) = ϕ^{-1}(max_i ϕ(−X(ω_i))) = ϕ^{-1}(ϕ(−X(ω_{i_0}))) = −X(ω_{i_0}) = ρ̄(X).
In particular, ρ(X) = ϕ^{-1}(ρ̄(−ϕ(−X))) = ρ̄(X) is translation invariant for every function ϕ that is strictly increasing and continuous. This shows that (i)⇒(ii) in Theorem 4.3 does not hold if the probability space is finite. If, on the other hand, the probability space is rich but we omit the assumption that ρ̄ is subdifferentiable, then the coherent risk measure ρ̄(X) = ess sup −X satisfies for every strictly increasing and continuous function ϕ that ρ(X) = ϕ^{-1}(ρ̄(−ϕ(−X))) = ρ̄(X) is a convex risk measure. The equality may be seen to hold as ϕ^{-1}(ρ̄(−ϕ(−X))) = ess sup ϕ^{-1}(ϕ(−X)) = ess sup −X = ρ̄(X).

Remark 4.6 We recall that, in financial mathematics, translation invariance is typically motivated by the interpretation of a risk measure on L∞ as a minimal amount of risk capital, as it ensures that ρ(X + ρ(X)) = 0.

Remark 4.7 Notice that since ϕ is positive somewhere and 0 ∈ closure(Image(ϕ)) we have that ϕ^{-1}(δ) is well-defined for all δ > 0 small enough and we can define ϕ^{-1}(0) = lim_{δ↓0} ϕ^{-1}(δ). The common condition that ϕ(∞) = ∞ implies that ρ remains loss sensitive.

Lemma 4.8 Suppose that ρ̄ : L∞ → R is monotone, convex, positively homogeneous and continuous from above and for all m ∈ R_0^-, ρ̄(m) = −m. Let X ∈ L∞ with X > 0. Then for every Q with β(Q)Q ∈ ∂ρ̄(−X) we have that

β(Q) ≥ ess inf X / ess sup X.

Proof. Choose Q ∈ M such that β(Q)Q ∈ ∂ρ̄(−X). Then by (4.2) and the monotonicity of ρ̄,

ess inf X = ρ̄(− ess inf X) ≤ ρ̄(−X) = β(Q)E_Q[X] ≤ β(Q) ess sup X,

where the last inequality holds as β(Q) ≥ 0. Dividing both sides by ess sup X completes the proof. □

Proof of Theorem 4.3. (i)⇒(ii): Since ϕ is positive somewhere and 0 ∈ closure(Image(ϕ)), there are two cases:
(H1) There exists an x_0 such that ϕ(x_0) = 0.

(H2) lim_{x→−∞} ϕ(x) = 0 and for every x ∈ R we have ϕ(x) > 0.

Let ϕ_z(·) := ϕ(· + z) for z ∈ R. By translation invariance,

ϕ_z^{-1}(ρ̄(−ϕ_z(−X))) = ϕ^{-1}(ρ̄(−ϕ_z(−X))) − z = ϕ^{-1}(ρ̄(−ϕ(−X))).

Thus, by considering ϕ_z instead of ϕ, we may assume without loss of generality that:

• If (H1) holds then ϕ(0) = 0 and ϕ ∈ C^3((ϕ^{-1}(0), ∞)) = C^3(R_+).

• If (H2) holds then ϕ(0) > 0 and ϕ ∈ C^3((ϕ^{-1}(0), ∞)) = C^3((−∞, ∞)).

In particular, we may always assume that

ϕ^{-1}(0) ∈ {−∞, 0} and ϕ(0) ≥ 0.   (4.4)
Next, let us look at X ∈ L∞ such that X < 0. By assumption, ∂ρ̄(−ϕ(−X)) ≠ ∅. As −ϕ(−X) < 0 (since ϕ(0) ≥ 0 and ϕ is strictly increasing), by (4.2) and the assumption that the subdifferential of ρ̄ is always nonempty we have that, for β(Q)Q ∈ ∂ρ̄(−ϕ(−X)),

ρ̄(−ϕ(−X)) = β(Q)E_Q[ϕ(−X)].   (4.5)
Let X ∈ L∞ with X < 0. Taking the derivative of the function m → ϕ^{-1}(ρ̄(−ϕ(−X + m))), it follows from Lemma 8.1 in the appendix and translation invariance that

1 = β(Q)E_Q[ϕ′(−X)] / ϕ′ ◦ ϕ^{-1}(β(Q)E_Q[ϕ(−X)]),

hence

ϕ′ ◦ ϕ^{-1}(β(Q)E_Q[ϕ(−X)]) = β(Q)E_Q[ϕ′(−X)].   (4.6)
Assume that (i)⇒(ii) does not hold, i.e., there does not exist p, γ, q such that, for all x ∈ (ϕ^{-1}(0), ∞), ϕ(x) = p exp{x/γ} + q. Furthermore, by assumption it cannot hold that ϕ(x) = px + q. We will then derive a contradiction to (4.6). As in the proof of Theorem 4.1 it may be seen that there are two cases:

(i) There exists a nonempty interval J = (u, t) ⊂ R_+ such that (ϕ′ ◦ ϕ^{-1})′′ < 0, i.e., ϕ′ ◦ ϕ^{-1} is strictly concave on J.

(ii) There exists a nonempty interval J = (u, t) ⊂ R_+ such that (ϕ′ ◦ ϕ^{-1})′′ > 0, i.e., ϕ′ ◦ ϕ^{-1} is strictly convex on J.

Let ε > 0 such that (1 − ε)²t > u. Since the probability space is rich we may choose X ∈ L∞ satisfying both of the following two properties:

(a) −X ∈ ϕ^{-1}(((1 − ε)t, t)) ⊂ ϕ^{-1}(J).

(b) −X is diffuse.
From (a) it follows in particular that ϕ(−X) ∈ ((1 − ε)t, t) ⊂ J. Since, with β(Q)Q ∈ ∂ρ̄(−ϕ(−X)), Q ≪ P, and −X is diffuse under P, we have that Q[−X = x] = 0 for every x ∈ ϕ^{-1}(J). Thus, −X is also diffuse under Q. As, by (a) and (4.4), ϕ(−X) ∈ J ⊂ R_+ and ϕ(0) ≥ 0, we have that ϕ(−X) > 0. Since β(Q)Q ∈ ∂ρ̄(−ϕ(−X)), Lemma 4.8 gives

β(Q) ≥ ess inf ϕ(−X) / ess sup ϕ(−X) ≥ (1 − ε)t/t = 1 − ε > 0.
Therefore, β(Q)ϕ(−X) is a diffuse random variable under Q and

t > ϕ(−X) ≥ β(Q)ϕ(−X) ≥ (1 − ε)ϕ(−X) ≥ (1 − ε)²t > u,

where the second inequality holds as β(Q) ∈ (0, 1]. In particular, β(Q)ϕ(−X) ∈ J. Finally, let us derive the contradiction. Assume case (i) above. Then

ϕ′ ◦ ϕ^{-1}(β(Q)E_Q[ϕ(−X)]) > E_Q[ϕ′ ◦ ϕ^{-1}(β(Q)ϕ(−X))]
 = lim_{δ↓0} E_Q[ϕ′ ◦ ϕ^{-1}(β(Q)ϕ(−X) + (1 − β(Q))δ)]
 ≥ lim inf_{δ↓0} E_Q[β(Q)ϕ′ ◦ ϕ^{-1}(ϕ(−X)) + (1 − β(Q))ϕ′ ◦ ϕ^{-1}(δ)]
 = β(Q)E_Q[ϕ′(−X)] + (1 − β(Q)) lim inf_{δ↓0} ϕ′ ◦ ϕ^{-1}(δ)
 ≥ β(Q)E_Q[ϕ′(−X)],

where the first inequality holds because of Jensen’s inequality for strictly concave functions for the diffuse random variable β(Q)ϕ(−X) (where we used that β(Q)ϕ(−X) ∈ J and the strict concavity of ϕ′ ◦ ϕ^{-1} on J). The second inequality holds by the concavity of the function ϕ′ ◦ ϕ^{-1} on (0, t). The third inequality holds because ϕ′ ◦ ϕ^{-1}(δ) > 0 for every δ > 0 such that ϕ^{-1}(δ) is well-defined, as ϕ′ is positive. The (strict) inequality above is a contradiction to (4.6), applying to case (i).

Now consider the more challenging case (ii): Then the function ϕ′ ◦ ϕ^{-1} is convex on (0, t] and strictly convex on J. Choosing a sequence δ_n ↓ 0 such that lim inf_{δ↓0} ϕ′ ◦ ϕ^{-1}(δ) = lim_n ϕ′ ◦ ϕ^{-1}(δ_n), the same argumentation as before yields

ϕ′ ◦ ϕ^{-1}(β(Q)E_Q[ϕ(−X)]) < E_Q[ϕ′ ◦ ϕ^{-1}(β(Q)ϕ(−X))]
 ≤ lim_n E_Q[β(Q)ϕ′ ◦ ϕ^{-1}(ϕ(−X)) + (1 − β(Q))ϕ′ ◦ ϕ^{-1}(δ_n)]
 = β(Q)E_Q[ϕ′(−X)] + (1 − β(Q)) lim inf_{δ↓0} ϕ′ ◦ ϕ^{-1}(δ).   (4.7)
Notice that if

(1 − β(Q)) lim inf_{δ↓0} ϕ′ ◦ ϕ^{-1}(δ) = 0,   (4.8)

then (4.7) would imply that ϕ′ ◦ ϕ^{-1}(β(Q)E_Q[ϕ(−X)]) < β(Q)E_Q[ϕ′(−X)], which is a contradiction to (4.6). To see that (1 − β(Q)) lim inf_{δ↓0} ϕ′ ◦ ϕ^{-1}(δ) = 0 note that there are two cases:
(1.) ρ̄(1) = −1,

(2.) ρ̄(1) ≠ −1.

In the first case the second part of Proposition 4.2 implies that β(Q) = 1 and, in particular, (4.8) is satisfied. Let us look at the second case: by positive homogeneity (2.) entails that ρ̄(m) ≠ −m for all m > 0. Now suppose that there exists x_0 ∈ R such that ϕ(−x_0) < 0. Since by assumption there also exists x_1 such that ϕ(−x_1) > 0 the continuity of ϕ yields that the assumption (H1) above holds. In particular, ϕ(0) = 0. By (2.) and the positive homogeneity of ρ̄, ρ̄(−ϕ(−x_0)) ≠ ϕ(−x_0). This gives

ϕ^{-1}(ρ̄(−ϕ(−x_0))) ≠ −x_0.   (4.9)

However, by translation invariance and since ρ̄(0) = 0,

ϕ^{-1}(ρ̄(−ϕ(−x_0))) = −x_0 + ϕ^{-1}(ρ̄(−ϕ(0))) = −x_0 + ϕ^{-1}(ρ̄(0)) = −x_0 + ϕ^{-1}(0) = −x_0,

which is a contradiction to (4.9). Hence, ϕ(x) ≥ 0 for all x ∈ R and the assumption (H2) holds, i.e.,

lim_{x→−∞} ϕ(x) = 0.   (4.10)
By construction in (H2) we have ϕ ∈ C^3(R). Now (4.10) implies that the positive function ϕ′(x) cannot be bounded constantly away from zero on (−∞, z) for any z ∈ R. This means that there is a sequence x_n converging to −∞ such that

lim inf_n ϕ′(x_n) = 0.

Choose δ̄_n = ϕ(x_n). By (4.10) we have that lim_n δ̄_n = 0 and

0 ≤ lim inf_{δ↓0} ϕ′ ◦ ϕ^{-1}(δ) ≤ lim_n ϕ′(ϕ^{-1}(δ̄_n)) = lim_n ϕ′(x_n) = 0.

Consequently,

lim inf_{δ↓0} ϕ′ ◦ ϕ^{-1}(δ) = 0.
This proves (4.8). Hence, we have derived a contradiction to (4.6), applying to case (ii). Furthermore, we have seen that the cases (H1) and (1.), and (H2) and (2.) coincide, respectively. Hence, (4.6) implies that the function ϕ′ ◦ ϕ^{-1} has to be linear, and by Lemma 8.2 this implies that there exist constants p, γ, q ∈ R such that ϕ(x) = pe^{x/γ} + q or ϕ(x) = px + q for all x ∈ (ϕ^{-1}(0), ∞) (where in case (H1) ϕ^{-1}(0) = 0 and in case (H2) ϕ^{-1}(0) = −∞). However, since ϕ is assumed to be non-linear, ϕ has to be exponential, with γ ∈ R_+ and q = 0. As ϕ is strictly increasing we have p > 0. Now in the case (H2) we must have that ϕ(x) = exp{x/γ} (with q = 0) as only then lim_{x→−∞} ϕ(x) = 0. On the other hand, in the case (H1), condition (1.) holds and the second part of Proposition 4.2 implies that β(Q) = 1 for all Q ∈ M. Therefore, ϕ^{-1}(ρ̄(−ϕ(−X))) is invariant under positive affine transformations of ϕ. Thus, we may always assume that q = 0. It is

ϕ^{-1}(ρ̄(−ϕ(−X))) = ϕ^{-1}( sup_{Q∈M} β(Q)E_Q[ϕ(−X)] )
 = γ log( sup_{Q∈M} β(Q)E_Q[exp{−X/γ}] )
 = sup_{Q∈M} {γ log(E_Q[exp{−X/γ}]) + γ log(β(Q))}
 = sup_{Q∈M} {e_{γ,Q}(X) − θ(Q)},

with θ(Q) = −γ log(β(Q)) ≥ 0 if Q ∈ M and θ(Q) = ∞ else. Thus, indeed ϕ^{-1}(ρ̄(−ϕ(−X))) is γ-entropy convex. As the supremum on the right-hand side of the first equality is always attained because ∂ρ̄(−ϕ(−X)) ≠ ∅, (ii) follows. This completes the proof of the implication (i)⇒(ii) of Theorem 4.3.

Proof of Theorem 4.3. (ii)⇒(i): To see the direction (ii)⇒(i), we let ϕ(x) = e^{x/γ}, and ρ̄(X) = sup_{Q∈Q} β(Q)E_Q[−X], with β(Q) = e^{−ρ*(Q)/γ} ≥ 0. Then ρ(X) = γ log(ρ̄(−e^{−X/γ})) = ϕ^{-1}(ρ̄(−ϕ(−X))). Clearly, ρ̄ is monotone, convex, positively homogeneous and continuous from above. As inf_{Q∈Q} ρ*(Q) = 0, we get that sup_{Q∈Q} β(Q) = 1. This implies that for m ∈ R_0^-, ρ̄(m) = −m. Furthermore, because ρ is entropy convex it is translation invariant. □

Corollary 4.9 In the setting of Theorem 4.3, if ρ̄ is additionally assumed to be translation invariant, then statement (i) implies that ρ is γ-entropy coherent with γ ∈ R_+.

Proof. As ρ̄ is assumed to be translation invariant, we have that ρ̄(m) = −m for all m ∈ R. By Proposition 4.2 this implies that in the proof of Theorem 4.3 we can choose M ⊂ Q such that β(Q) = 1 for all Q ∈ M. Hence, we get θ(Q) = γ log(β(Q)) = 0 if β(Q) = 1 and ∞ else. Thus, indeed ϕ^{-1}(ρ̄(−ϕ(−X))) is entropy coherent. □

Remark 4.10 In recent work, Cheridito and Kupper [10] (Example 3.6.3) suggest (without formal proof) a result quite similar to, but essentially different from, Corollary 4.9. Their suggested result can in a way be viewed as supplementary to the statement in Corollary 4.9: they restrict attention to a specific and simple probabilistic setting with a finite outcome space Ω and consider only strictly positive probability measures on Ω. By contrast, in Corollary 4.9, we consider a rich outcome space and allow for weakly positive probability measures.
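The identity at the end of the proof of the implication (i)⇒(ii) can also be checked numerically (a minimal sketch, not part of the original text; the models, confidence weights and position are illustrative assumptions): with ϕ(x) = exp{x/γ}, the homothetic evaluation γ log(sup_Q β(Q)E_Q[exp{−X/γ}]) coincides with the entropy convex form sup_Q {e_{γ,Q}(X) − θ(Q)}, θ(Q) = −γ log β(Q), and is translation invariant.

```python
import numpy as np

# Minimal sketch (not part of the paper) of the identity concluding the proof of
# (i)=>(ii): with phi(x) = exp(x/gamma), the homothetic evaluation
# gamma*log( sup_Q beta(Q) E_Q[exp(-X/gamma)] ) equals the entropy convex form
# sup_Q { e_{gamma,Q}(X) - theta(Q) } with theta(Q) = -gamma*log(beta(Q)),
# and it is translation invariant.  Models, weights and X are illustrative.
gamma  = 2.0
models = [np.array([0.5, 0.3, 0.2]), np.array([0.1, 0.4, 0.5])]
beta   = [1.0, 0.7]                        # confidence weights, sup beta = 1
X, m   = np.array([1.0, -2.0, 0.5]), 0.75

def e_gamma(X, Q):                         # entropic risk measure under Q
    return gamma * np.log(Q @ np.exp(-X / gamma))

def rho_homothetic(X):
    return gamma * np.log(max(b * (Q @ np.exp(-X / gamma)) for b, Q in zip(beta, models)))

def rho_entropy_convex(X):
    return max(e_gamma(X, Q) - (-gamma * np.log(b)) for b, Q in zip(beta, models))

print(np.isclose(rho_homothetic(X), rho_entropy_convex(X)))       # same value
print(np.isclose(rho_homothetic(X + m), rho_homothetic(X) - m))   # translation invariance
```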
4.3 Variational Preferences and Convex Measures of Risk
We state the following theorem:

Theorem 4.11 Suppose that the probability space is rich and that ρ̄ : L∞ → R is a convex risk measure with dual conjugate α that has uniformly integrable sublevel sets. Let ϕ be a strictly increasing and convex function with ϕ ∈ C^3(R) satisfying either ϕ(−∞) = −∞ or ϕ(x)/x → ∞ as x → ∞. Then the following statements are equivalent:
(i) ρ(X) = ϕ^{-1}(ρ̄(−ϕ(−X))) is translation invariant and the subdifferential of ρ̄ is always nonempty.

(ii) ρ is a convex risk measure and the subdifferential is always nonempty. Furthermore, in the case that ϕ(x)/x → ∞ as x → ∞, ρ is γ-entropy coherent with γ ∈ R_+ and #-subdifferentiable.

Remark 4.12 Note that under the conditions of Theorem 4.11, ρ̄(X) = sup_{Q∈Q} {E_Q[−X] − α(Q)}, so that ρ(X) = ϕ^{-1}(ρ̄(−ϕ(−X))) is a negative certainty equivalent under variational preferences. In the proof of Theorem 4.11, we will see that ϕ is either linear or exponential. In the latter case, α, the dual conjugate of ρ̄, only takes the values zero and ∞ and ρ is γ-entropy coherent with γ ∈ R_+. It means that entropy coherent measures of risk are the only convex risk measures among negative certainty equivalents under variational preferences with non-linear utility.

Proof of Theorem 4.11. The direction (ii)⇒(i) is straightforward. Let us prove (i)⇒(ii). Clearly, for m_0 ∈ R we can consider ϕ(x) + m_0 instead of ϕ(x). Hence, we may assume without loss of generality that ϕ(0) = ϕ^{-1}(0) = 0. Now similarly as in (4.6) it follows by translation invariance and Lemma 8.1 in the appendix that, for Q ∈ ∂ρ̄(−ϕ(−X)),

ϕ′ ◦ ϕ^{-1}(E_Q[ϕ(−X)] − α(Q)) = E_Q[ϕ′(−X)].   (4.11)

We also need:

Lemma 4.13 For any X ∈ L∞ such that Q ∈ ∂ρ̄(X),

0 ≤ α(Q) ≤ ess sup −X − ess inf −X.

Proof. Since ρ̄(0) = 0 we must have that α(Q) ≥ 0. Furthermore, by monotonicity and translation invariance of ρ̄,

α(Q) = E_Q[−X] − ρ̄(X) ≤ ess sup −X − ess inf −X. □

Continuation of the Proof of Theorem 4.11. (i)⇒(ii): First of all note that as ϕ is strictly increasing and convex we must have that ϕ(∞) = ∞. Assume now that there does not exist p, γ, q such that, for all x ∈ (ϕ^{-1}(−∞), ∞), ϕ(x) = p exp{x/γ} + q or ϕ(x) = px + q. Let us derive a contradiction to (4.11). By Lemma 8.2 in the appendix, this assumption implies that ϕ′ ◦ ϕ^{-1} is not linear on (ϕ(−∞), ∞). As ϕ is in C^3(R) we have that ϕ′ ◦ ϕ^{-1} is in C^2((ϕ(−∞), ∞)). Now the second derivative of ϕ′ ◦ ϕ^{-1} cannot be constantly zero on (ϕ(−∞), ∞) as ϕ′ ◦ ϕ^{-1} is not linear. Hence, one may see as in the proof of Theorem 4.3 that there are the following two cases:

(a) There exists a nonempty interval J = (u, t) such that ϕ′ ◦ ϕ^{-1} is strictly convex on J.

(b) ϕ′ ◦ ϕ^{-1} is concave on (ϕ(−∞), ∞). Furthermore, there exists a nonempty interval J = (u, t) such that ϕ′ ◦ ϕ^{-1} is strictly concave on J.
Assume case (a). Choose an ε > 0 such that ((1 − ε)t, t) ⊂ J. Since the probability space is rich we may choose X ∈ L∞ satisfying both of the following two properties:

(a’) −X ∈ ϕ^{-1}(((1 − (2/3)ε)t, (1 − (1/3)ε)t)) ⊂ ϕ^{-1}(J).

(b’) −X is diffuse.

Similar to the proof of Theorem 4.3, it may be seen that, with Q ∈ ∂ρ̄(−ϕ(−X)), −X is diffuse under Q. From (a’) and Lemma 4.13 it follows in particular that ϕ(−X) − α(Q) is in ((1 − ε)t, t) ⊂ J. Now let us derive the contradiction. We write

ϕ′ ◦ ϕ^{-1}(E_Q[ϕ(−X)] − α(Q)) < E_Q[ϕ′ ◦ ϕ^{-1}(ϕ(−X) − α(Q))] ≤ E_Q[ϕ′ ◦ ϕ^{-1}(ϕ(−X))] = E_Q[ϕ′(−X)],

where the strict inequality holds because of Jensen’s inequality for strictly concave functions for the diffuse random variable ϕ(−X) − α(Q) ∈ J. The second inequality holds since α(Q) ≥ 0. The (strict) inequality above is a contradiction to (4.11).

Now assume that case (a) does not hold. Then we are in case (b) and ϕ′ ◦ ϕ^{-1} is concave on (ϕ(−∞), ∞). Note that, by assumption, ϕ′ ◦ ϕ^{-1} is also increasing and positive (as ϕ is convex and strictly increasing). Since no non-constant concave function having domain equal to R is bounded from below, ϕ(−∞) ≠ −∞. Hence, by our assumptions on ϕ, we must have that in this case lim_{x→∞} ϕ(x)/x = ∞. Next, note that since the derivative of ϕ′ ◦ ϕ^{-1} is decreasing and positive it must converge to a constant, say c ≥ 0. By the monotonicity of ϕ′ ◦ ϕ^{-1} (as ϕ is assumed to be convex) there exists a constant d ∈ R such that for every ε > 0 there exists M_ε > 0 such that

cx + d ≤ ϕ′ ◦ ϕ^{-1}(x) ≤ cx + d + ε,   for all x > M_ε.

As ϕ′ ◦ ϕ^{-1} = 1/(ϕ^{-1})′, we get that for any ε > 0 there exists a constant M_ε > 0 such that for all x > M_ε,

1/(cx + d) ≤ (ϕ^{-1})′(x) ≤ 1/(cx + d − ε).   (4.12)

If c = 0 then (4.12) would imply that ϕ grows at most linearly, contradicting lim_{x→∞} ϕ(x)/x = ∞. Hence, c > 0 and (4.12) implies

ϕ^{-1}(M_ε) + (1/c) log((cx + d)/(cM_ε + d)) ≤ ϕ^{-1}(x) ≤ ϕ^{-1}(M_ε) + (1/c) log((cx + d)/(cM_ε + d − ε)),   (4.13)

for all x ∈ (M_ε, ∞), which yields that

(1/c)[(cM_ε + d) exp{c(x − ϕ^{-1}(M_ε))} − d] ≥ ϕ(x) ≥ (1/c)[(cM_ε + d − ε) exp{c(x − ϕ^{-1}(M_ε))} − d],   for all x ∈ (ϕ^{-1}(M_ε), ∞).   (4.14)
linear or exponential. Furthermore, if ϕ(−∞) = −∞, we must have that ϕ is linear, while if limx→∞ ϕ(x) x = ∞, ϕ is exponential. Now all what is left to show is that if ϕ has an exponential form, then α, the dual conjugate of ρ¯, has to be an indicator function that only takes the values zero and ∞. In this case ρ would be γ-entropy coherent with γ ∈ R+ . However, note that if ϕ has an exponential form, then (4.12) even holds for a certain d and = M = 0. This implies that (4.13)-(4.14) also hold for every > 0 and every M chosen large enough such that < cM + d. Hence, the fact that in this case α is an indicator function also follows from Lemma 4.14 below. That ρ is #-subdifferentiable follows directly from the fact that the supremum in ρ¯ is attained. This completes the proof. 2 Lemma 4.14 Suppose Theorem 4.11(i) and that there exist c > 0 and d ∈ R such that for every > 0 there exists M > 0 such that (4.13)-(4.14) hold. Then ρ¯ is coherent. Proof. The lemma would be proved if we could show that α, the dual conjugate of ρ¯, is an indicator function. Let cM + d − b() := , cM + d 1 and denote b−1 () = b() . Without loss of generality we may assume that M converges to ∞ as tends to zero so that b() tends to one. We will prove the lemma by contradiction. So assume that there exists Q0 such that 0 < α(Q0 ) < ∞. Let α(Q0 ) M = Q ∈ Q|α(Q) ≤ . (4.15) 2
As M is closed and convex, by the Hahn-Banach Theorem there exists an X00 ∈ L∞ such that EQ0 −X00 > sup EQ −X00 . Q∈M
By considering X0 := X00 + m we may, if we choose m suitably, assume that EQ0 [−X0 ] > 0 > sup EQ [−X0 ] . Q∈M
For > 0 with < cM + d let λ :=
cb()α(Q0 )+ cM +d− EQ0
[X0 ]−1 . Then
cb()α(Q0 ) cb()α(Q) EQ0 [−λ X0 ] > > 0 > sup EQ [−λ X0 ] ≥ sup EQ [−λ X0 ] − , cM + d − cM + d − Q∈M Q∈M
where we used that α ≥ 0 in the last inequality. Hence, cb()α(Q0 ) cb()α(Q) EQ0 [−λ X0 ] − > 0 > sup EQ [−λ X0 ] − . cM + d − cM + d − Q∈M
(4.16)
Clearly this inequality also holds for −λ X0 + m ¯ for any constant m ¯ ∈ R. Let us choose a log(−Z ) suitable constant so that −Z = −λ X0 + m ¯ > 1 and consequently is well-defined and c positive. Define Z := λ X0 − ||λ X0 ||∞ − − 1. 22
Then
log(−Z ) c
> 0 and
cb()α(Q0 ) + (4.17) 2||X0 ||∞ EQ0 [X0 ]−1 + + 1. cM + d − n o Let ρ˜(X) := supQ∈Q EQ [−X] − cMcb() α(Q) . By assumption the sublevel sets of α are +d− weakly compact. This entails that for every with < cM + d there exists Q∗ ∈ Q such that ||Z ||∞ ≤
ρ˜(Z ) = EQ∗ [−Z ] −
cb()α(Q∗ ) . cM + d −
It follows from (4.16) that cb()α(Q∗ ) cb()α(Q) EQ∗ [−Z ] − = sup EQ [−Z ] − cM + d − Q∈Q cM + d − cb()α(Q) cb()α(Q0 ) > sup EQ [−Z ] − . ≥ EQ0 [−Z ] − cM + d − Q∈M cM + d − Therefore, Q∗ ∈ / M which by (4.15) implies that for every > 0 we have that α(Q∗ ) > α(Q0 )/2 > 0. Next, choose > 0 small enough such that cMc +d < 1, < cM + d and (b−1 () − b())EQ∗ [−Z ] = (b−1 () − 1) + (1 − b()) EQ∗ [−Z ] = + EQ∗ [−Z ] cM + d − cM + d cb()α(Q0 ) + −1 ≤ + 2||X0 ||∞ EQ0 [X0 ] + + 1 cM + d − cM + d cM + d − cα(Q∗ ) cα(Q0 ) < , < 2(cM + d − ) cM + d − where we used (4.17) in the first inequality. In particular, (b−1 () − b())EQ∗ [−Z ] −
cα(Q∗ ) < 0. cM + d
Next, choose m0 > 0 large enough such that (b−1 () − b())EQ∗ [−Z ] −
cα(Q∗ ) 0 < −e−cm α(Q∗ ). cM + d −
This is equivalent to n o cb()α(Q∗ ) 0 −1 b () EQ∗ [−Z ] − < b() EQ∗ [−Z ] − e−cm b−1 ()α(Q∗ ) . cM + d −
23
(4.18)
Finally, let us derive a contradiction. We write log(−Z ) − log(−Z ) −1 −1 0 −1 0 ρ¯ −ϕ ρ − ϕ (M ) − m = ϕ + ϕ (M ) + m c c 1 1 1 ≥ log c¯ ρ − [(cM + d − ) c cM + d c −1 0 −1 + d + ϕ−1 (M ) × exp log(−Z ) + cϕ (M ) + cm − cϕ (M ) − d !! 0 1 c ecm = log + ϕ−1 (M ) ρ¯ (cM + d − )Z c cM + d c 1 cM + d − cm0 ≥ log ρ¯ + ϕ−1 (M ) e Z c cM + d ! n h i o 1 0 + ϕ−1 (M ) = log sup EQ −ecm b()Z − α(Q) c Q∈Q ! n o 1 cm0 −cm0 −1 b ()α(Q) + ϕ−1 (M ) = log e b() sup EQ [−Z ] − e c Q∈Q ! n o 1 −cm0 −1 0 b ()α(Q) + ϕ−1 (M ) = m + log b() sup EQ [−Z ] − e c Q∈Q n o 1 0 ≥ m0 + log b() EQ∗ [−Z ] − e−cm b−1 ()α(Q∗ ) + ϕ−1 (M ), (4.19) c ) +m0 +ϕ−1 (M ) > where we have used we have used (4.13)-(4.14) in the first inequality as log(−Z c log(−Z ) + ϕ−1 (M ) > ϕ−1 (M ). In the second equality we have used translation invariance and c performed obvious simplifications. In the second inequality we have used that, as ρ¯ is convex and ρ¯(0) = 0, we must have, for 0 ≤ λ = cMc +d ≤ 1, that λ¯ ρ(X) = λ¯ ρ(X)+(1−λ)¯ ρ(0) ≥ ρ¯(λX). On the other hand we obtain − log(−Z ) −1 ρ − ϕ (M ) c 1 1 1 log(−Z ) ≤ log c¯ ρ − (cM + d) exp c − d + d + ϕ−1 (M ) c cM + d − c c 1 c cM + d = log ρ¯ Z + ϕ−1 (M ) c cM + d − c ! 1 c cM + d = log sup EQ − Z − α(Q) + ϕ−1 (M ) c cM + d − Q∈Q c ! 1 cb()α(Q) = log b−1 () sup EQ [−Z ] − + ϕ−1 (M ) c cM + d − Q∈Q 1 ρ(Z ) + ϕ−1 (M ) = log b−1 ()˜ c 1 cb()α(Q∗ ) −1 = log b () EQ∗ [−Z ] − + ϕ−1 (M ), (4.20) c cM + d −
24
where we have used (4.13)-(4.14) in the first inequality. Finally, we may conclude − log(−Z ) − log(−Z ) 0 −1 0 −1 ρ − ϕ (M ) − m = m + ρ − ϕ (M ) c c cb()α(Q∗ ) 1 0 −1 ≤ m + log b () EQ∗ [−Z ] − + ϕ−1 (M ) c cM + d − o n 1 0 < m0 + log b() EQ∗ [−Z ] − e−cm b−1 ()α(Q∗ ) + ϕ−1 (M ) c − log(−Z ) −1 0 ≤ρ (4.21) − ϕ (M ) − m , c where we have used (4.20) in the first inequality, (4.18) in the strict inequality and (4.19) in the last inequality. The equality holds by translation invariance. The strict inequality (4.21) is a contradiction. 2
4.4 Convexity Without the Translation Invariance Axiom
In the previous two subsections the axiom of translation invariance played a key role; see Theorems 4.3(i) and 4.11(i). As is well-documented (see, for example, Cheridito and Kupper [10]), the axiom of translation invariance is equivalent to the axiom of convexity for general certainty equivalents under fairly weak conditions (e.g., continuity with respect to the L∞-norm). In this subsection we adapt and apply this equivalence relation to the present setting, to replace the axiom of translation invariance by the axiom of convexity, which will now play the key role. Throughout this subsection, we suppose the probability space (Ω, F, P) is rich. We state the following theorem:

Theorem 4.15 Let ρ̄ : L∞ → R be monotone, convex, positively homogeneous and continuous from above, and let for all m ∈ R_0^-, ρ̄(m) = −m. Suppose that the subdifferential of ρ̄ is always nonempty. Furthermore, suppose that r : L∞ → R is defined by r(X) = ϕ^{-1}(ρ̄(−ϕ(−X))), for a strictly increasing, non-linear and continuous function ϕ ∈ C^3((ϕ^{-1}(0), ∞)). Finally, suppose that 0 ∈ closure(Image(ϕ)) and that ϕ(∞) = ∞. Then the following statements are equivalent:

(i) r is convex and r(m) = −m for all m ∈ R.

(ii) r is γ-entropy convex with γ ∈ R_+.

Proof. The direction from (ii) to (i) is straightforward. Let us show the reverse direction. First, notice that r is continuous with respect to the L∞-norm. This can be seen as follows: from the proof of Proposition 4.2, we have that ρ̄(X) = sup_{X′∈H} {E[−X′X]} with H ⊂ L^1_+ and sup_{X′∈H} {E[|X′|]} = sup_{X′∈H} {E[X′]} = 1. Hence, for X, Y ∈ L∞,

ρ̄(Y) − ρ̄(X) = sup_{X′∈H} {E[−X′Y]} − sup_{X′∈H} {E[−X′X]} ≤ sup_{X′∈H} {E[−X′Y] − E[−X′X]} ≤ ||Y − X||∞ sup_{X′∈H} E[|X′|] = ||Y − X||∞.
Switching the roles of X and Y it follows that ρ̄ is indeed continuous with respect to the L∞-norm. Now as ϕ is continuous we can conclude that r is continuous with respect to the L∞-norm as well. But then it follows from Proposition 2.5-(8) in Cheridito and Kupper [10] that r is translation invariant. The argument is simple, namely, for λ ∈ (0, 1) we have

r(X + m) ≤ λr(X/λ) + (1 − λ)r(m/(1 − λ)) = λr(X/λ) − m.

Letting λ converge to one and using the continuity of r with respect to the L∞-norm we find that r(X + m) ≤ r(X) − m. Replacing X by X + m and m by −m yields the stated result. Therefore, r is indeed translation invariant. Now upon application of Theorem 4.3, the direction from (i) to (ii) follows. □

Using Corollary 4.9, we now obtain directly the following corollary:

Corollary 4.16 In the setting of Theorem 4.15, suppose that ρ̄ is additionally assumed to be translation invariant. Then the following statements are equivalent:

(i) r is convex.

(ii) r is γ-entropy coherent with γ ∈ R_+.

Remark 4.17 It is straightforward to adapt the proof of Theorem 4.15 to show that, similarly, the condition of translation invariance in Theorem 4.11 can also be replaced by convexity and the condition that r(m) = −m for all m ∈ R.
4.5 ϕ-Convex Measures of Risk and Homothetic and Variational Preferences
It is straightforward to verify that ϕ-coherent measures of risk coincide with negative certainty equivalents under multiple priors preferences, for any ϕ-loss function. Except for the subclasses of coherent and entropy coherent measures of risk, ϕ-coherent measures of risk are not translation invariant or convex (cf. Theorem 4.1 and Corollaries 4.9 and 4.16), and therefore do not belong to the class of convex risk measures. Under ϕ-coherent measures of risk and multiple priors preferences, all probabilistic models M ⊂ Q are esteemed equally plausible. The following theorem (remarks) show(s) that, when the penalty function θ is not an indicator function, that is, when the probabilistic models are not considered equally plausible, ϕ-convex measures of risk are negative certainty equivalents under homothetic (variational) preferences if and only if they are entropy convex (classical convex) measures of risk.

Theorem 4.18 Suppose that ρ̄ : L∞ → R is monotone, convex, positively homogeneous and continuous from above and for all m ∈ R_0^-, ρ̄(m) = −m. Let ϕ satisfy 0 ∈ closure(Image(ϕ)) and ϕ ∈ C^2(R). Furthermore, suppose that ρ is a ϕ-convex measure of risk with minimal penalty function θ that is L^1-continuous, #-subdifferentiable in its interior, and not an indicator function. Then the following statements are equivalent:

(i) ρ(X) = ϕ^{-1}(ρ̄(−ϕ(−X))) and the subdifferential of ρ̄ is always nonempty.

(ii) ρ is γ-entropy convex with γ ∈ R_+ and the #-subdifferential is always nonempty.
Proof. The direction (ii)⇒(i) is straightforward. Let us prove (i)⇒(ii). First of all note that, as both ϕ-convex measures of risk and ϕ^{-1}(ρ̄(−ϕ(−X))) are invariant under scaling of ϕ, we may assume without loss of generality that ϕ′(0) = 1. Note further that if ρ̄(1) = −1 would hold, then, by Proposition 4.2, ρ̄ would be a coherent risk measure. In particular, in that case there exists a convex closed set M ⊂ Q such that for all X ∈ L∞,

ρ̄(X) = max_{Q∈M} E_Q[−X],

hence

ρ(X) = max_{Q∈M} ϕ^{-1}(E_Q[ϕ(−X)]).
But this entails that ρ∗ is given by IM , which is a contradiction to the assumption that the minimal penalty function of ρ is not an indicator function. Therefore, we may conclude that ρ¯(1) 6= −1. Then, by positive homogeneity, ρ¯(m) 6= −m for all m > 0. Since at the same time ρ(m) = −m for all m ∈ R as ρ is a ϕ-convex measure of risk, ρ(X) = ϕ−1 (¯ ρ(−ϕ(−X))) can only hold if ϕ ≥ 0. As ϕ is strictly increasing and 0 is in the closure of the image of ϕ, we must have that ϕ > 0 and Image(ϕ) = R+ . Applying the transformation Y = ϕ(−X) and using that ρ is a ϕ-convex measure of risk, we obtain from ρ(X) = ϕ−1 (¯ ρ(−ϕ(−X))) that n o max ϕ ϕ−1 (EQ [Y ]) − θ(Q) = ρ¯(−Y ). (4.22) Q∈Q
Note that (4.22) holds only for random variables in the image of ϕ. Define ρ¯Q (−Y ) := ϕ ϕ−1 (EQ [Y ]) − θ(Q) . By (4.22), for all Y taking values in R+ = Image(ϕ) a.s., we have ρ¯(−Y ) = max ρ¯Q (−Y ).
(4.23)
Q∈Q
For every such Y, we denote by Q^Y the measure which attains the maximum in (4.23). By positive homogeneity of ρ̄, we have for every λ > 0 that ρ̄(−λY) = λρ̄(−Y). Note that Y ∈ Image(ϕ) = R+ a.s. if and only if λY ∈ Image(ϕ) a.s. with λ > 0. Now (4.22) entails that for all Y > 0 and λ > 0,
ρ̄^{Q^Y}(−λY) ≤ λρ̄^{Q^Y}(−Y).
This may be seen as follows: Suppose that ρ̄^{Q^Y}(−λY) > λρ̄^{Q^Y}(−Y). Then
ρ̄(−λY) ≥ ρ̄^{Q^Y}(−λY) > λρ̄^{Q^Y}(−Y) = λρ̄(−Y),
which is a contradiction to the positive homogeneity property of ρ̄. Hence, indeed ρ̄^{Q^Y}(−λY) ≤ λρ̄^{Q^Y}(−Y) for all Y > 0 and λ > 0. Dividing both sides by λ we get ρ̄^{Q^Y}(−Y) ≤ (1/λ)ρ̄^{Q^Y}(−λY) ≤ ρ̄^{Q^Y}(−Y), which entails that
ρ̄^{Q^Y}(−λY) = λρ̄^{Q^Y}(−Y),   Y > 0 and λ > 0.
Now by assumption, there exists Q_0 ∈ Q such that 0 < θ(Q_0) < ∞. As inf_{Q∈Q} θ(Q) = 0, for all ε > 0 sufficiently small there exist measures Q_ε such that θ(Q_ε) ≤ ε. By considering the
mapping λ → θ(λQ_ε + (1 − λ)Q_0) for λ ∈ [0, 1] and using L1-continuity on the interior of the domain of θ, we may conclude that (ε, θ(Q_0)) ⊂ Image(θ). As ε can be chosen arbitrarily small, we have (0, θ(Q_0)) ⊂ Image(θ). Thus, for arbitrary a ∈ (0, θ(Q_0)) there exists a Q_a ∈ Q such that θ(Q_a) = a. As ∂_# θ(Q_0) ≠ ∅, there exists a Y_a for which the maximum in (4.23) is attained in Q_a.
For λ ∈ R+, define the mapping h(λ) = ρ̄^{Q_a}(−λY_a) = λρ̄^{Q_a}(−Y_a). If we take the derivative of h with respect to λ, then by the definition of ρ̄^{Q_a} and the chain rule we get that
[ϕ'(ϕ^{-1}(E_{Q_a}[λY_a]) − a) / ϕ'(ϕ^{-1}(E_{Q_a}[λY_a]))] E_{Q_a}[Y_a] = h'(λ) = ρ̄^{Q_a}(−Y_a),
where we applied in the last equation that, by positive homogeneity, h(λ) = λρ̄^{Q_a}(−Y_a). Hence,
ϕ'(ϕ^{-1}(E_{Q_a}[λY_a]) − a) / ϕ'(ϕ^{-1}(E_{Q_a}[λY_a])) = ρ̄^{Q_a}(−Y_a) / E_{Q_a}[Y_a] =: g(−a).   (4.24)
The last definition is justified since Q_a and Y_a only depend on a. Denote f^a(λ) = ϕ^{-1}(E_{Q_a}[λY_a]). Note that f^a is a continuously differentiable and strictly increasing function on R+ with Image(f^a) = R. (The latter statement holds as P[Y_a > 0] = 1, so that Q_a[Y_a > 0] = 1.) Now we have shown in (4.24) that for any λ > 0, ϕ'(f^a(λ) − a) = ϕ'(f^a(λ))g(−a). Choosing λ = (f^a)^{-1}(0) and using that ϕ'(0) = 1, we obtain g(−a) = ϕ'(−a). Thus, ϕ'(f^a(λ) − a) = ϕ'(f^a(λ))ϕ'(−a). As Image(f^a) = R, it follows in particular that for any h ∈ R,
ϕ'(h − a) = ϕ'(h)ϕ'(−a).   (4.25)
As ϕ' is continuous, (4.25) holds for all a ∈ [0, θ(Q_0)] (and not only on the open interval). Now since ϕ' is differentiable, it is well known that (4.25) yields that there exists γ > 0 such that ϕ'(−a) = exp(−a/γ) for any a ∈ [0, θ(Q_0)]. (Just look at the difference quotient of ϕ' and apply (4.25).) Finally, by using a decomposition x = −b_1θ(Q_0) − b_2 for x ∈ (−∞, 0) and x = b_1θ(Q_0) + b_2 for x ∈ [0, ∞), with b_1 ∈ N and b_2 ∈ [0, θ(Q_0)), and applying (4.25) (or ϕ'(h − a)/ϕ'(−a) = ϕ'(h)) multiple times, it may be seen that (4.25) entails that ϕ'(x) = exp(x/γ) for any x ∈ R with γ ∈ R+. Consequently, ϕ(x) = γ exp(x/γ) + b for a constant b. Thus, we may conclude that ρ is γ-entropy convex. This proves the theorem. □

Remark 4.19 In the setting of Theorem 4.18, ϕ^{-1}(ρ̄(−ϕ(−X))) is a negative certainty equivalent in the sense of (1.6), under homothetic preferences.

Remark 4.20 If a ϕ-convex measure of risk ρ is a negative certainty equivalent under variational preferences, then we must have that
sup_{Q∈Q} {ϕ^{-1}(E_Q[ϕ(−X)]) − θ(Q)} = sup_{Q∈Q} ϕ^{-1}(E_Q[ϕ(−X)] − α(Q)).
If we exclude the risk neutral case that ϕ is linear, it may be seen by a similar argument as above that this entails that θ is an indicator function. Conversely, if we exclude the case that θ is an indicator function, then ϕ must be linear, in which case ρ is a convex risk measure.
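The exponential loss function singled out by Theorem 4.18 is the one for which the normalized expected ϕ-loss reduces to the entropic evaluation γ log E_Q[exp(−X/γ)]. The following Python sketch is a simple numerical check of this reduction on a finite probability space (the model Q, the position X and the value of γ are illustrative choices, not taken from the paper); since the normalized expected ϕ-loss is invariant under positive affine transformations of ϕ, the additive constant b can be dropped.

import numpy as np

# Illustrative check: with the exponential loss phi(x) = gamma * exp(x / gamma),
# phi^{-1}(E_Q[phi(-X)]) equals gamma * log E_Q[exp(-X / gamma)].
gamma = 2.0
phi = lambda x: gamma * np.exp(x / gamma)
phi_inv = lambda y: gamma * np.log(y / gamma)

q = np.array([0.1, 0.2, 0.3, 0.4])     # an illustrative probabilistic model Q
X = np.array([1.0, -0.5, 2.0, -1.5])   # an illustrative bounded position

lhs = phi_inv(np.sum(q * phi(-X)))
rhs = gamma * np.log(np.sum(q * np.exp(-X / gamma)))
print(lhs, rhs)                        # the two values agree up to floating-point error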
5
The Dual Conjugate
In this section we study the dual conjugate function, defined in (2.2), for entropy coherent and entropy convex measures of risk. Quite unusually, some explicit results on the dual conjugate function can be obtained. Let γ ∈ R+. We state the following proposition:

Proposition 5.1 Suppose that ρ is γ-entropy convex. Then
ρ*(Q) = sup_{P̄≪P} {α(P̄) − γH(P̄|Q)}.   (5.1)
Proof. We write
ρ*(Q) = sup_{X∈L∞} {e_{γ,Q}(X) − ρ(X)} = sup_{X∈L∞} sup_{P̄≪P} {E_{P̄}[−X] − γH(P̄|Q) − ρ(X)}
= sup_{P̄≪P} sup_{X∈L∞} {E_{P̄}[−X] − ρ(X) − γH(P̄|Q)} = sup_{P̄≪P} {α(P̄) − γH(P̄|Q)}. □

Notice that (5.1) yields that α(P̄) ≤ ρ*(Q) + γH(P̄|Q). Hence,
α(P̄) ≤ inf_{Q∈Q} {ρ*(Q) + γH(P̄|Q)}.
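The first equality in the proof above rests on the variational formula e_{γ,Q}(X) = sup_{P̄≪Q} {E_{P̄}[−X] − γH(P̄|Q)}, whose supremum is attained at the Gibbs change of measure dP̄/dQ ∝ exp(−X/γ). As a sanity check (not part of the argument), the following Python sketch verifies this identity numerically on an illustrative finite space; all numbers are assumptions chosen for the example.

import numpy as np

gamma = 1.0
q = np.array([0.25, 0.25, 0.25, 0.25])   # illustrative model Q
X = np.array([0.5, -1.0, 2.0, -0.3])     # illustrative bounded position

def rel_entropy(p_bar, q):
    # H(P_bar | Q) = sum p_bar * log(p_bar / q), with the convention 0 * log 0 = 0.
    mask = p_bar > 0
    return np.sum(p_bar[mask] * np.log(p_bar[mask] / q[mask]))

e_gamma_Q = gamma * np.log(np.sum(q * np.exp(-X / gamma)))

# Candidate maximizer: the Gibbs measure with density proportional to exp(-X / gamma).
w = q * np.exp(-X / gamma)
p_gibbs = w / w.sum()
value_at_gibbs = np.sum(p_gibbs * (-X)) - gamma * rel_entropy(p_gibbs, q)

# Randomly sampled measures never exceed the entropic value.
rng = np.random.default_rng(0)
best_random = max(np.sum(p * (-X)) - gamma * rel_entropy(p, q)
                  for p in rng.dirichlet(np.ones(4), size=2000))

print(e_gamma_Q, value_at_gibbs, best_random)   # first two coincide; the third is not larger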
The next penalty function duality theorem shows that, under additional assumptions, the inequality derived above is actually an equality. It also establishes the explicit relationship between the dual conjugate α and the penalty function θ for γ-entropy convex measures of risk.

Theorem 5.2 Suppose that ρ is γ-entropy convex with penalty function θ. Then:
(i) The dual conjugate of ρ, defined in (2.2), is given by the largest convex and lower-semicontinuous function α dominated by inf_{Q∈Q} {γH(P̄|Q) + θ(Q)}.
(ii) If θ is convex and lower-semicontinuous, then α is the largest lower-semicontinuous function dominated by inf_{Q∈Q} {γH(P̄|Q) + θ(Q)}.
(iii) If θ is convex and lower-semicontinuous and for every r ∈ R+ the set B_r = {Q ∈ Q | θ(Q) ≤ r} is uniformly integrable, then
α(P̄) = min_{Q∈Q} {γH(P̄|Q) + θ(Q)}.   (5.2)
Proof. (i): We write
ρ(X) = sup_{Q∈Q} {γ log E_Q[exp(−X/γ)] − θ(Q)} = sup_{Q∈Q} sup_{P̄≪P} {E_{P̄}[−X] − γH(P̄|Q) − θ(Q)}
= sup_{P̄≪P} sup_{Q∈Q} {E_{P̄}[−X] − γH(P̄|Q) − θ(Q)} = sup_{P̄≪P} {E_{P̄}[−X] − inf_{Q∈Q} {γH(P̄|Q) + θ(Q)}},
where we have used in the second equality that H(P̄|Q) = ∞ if P̄ is not absolutely continuous with respect to Q. Since α is the minimal lower-semicontinuous and convex function satisfying (2.1), statement (i) follows.
(ii): Now assume that θ is convex and lower-semicontinuous. We will first show that:
(a) γH(P̄|Q) is jointly convex in (P̄, Q).
(b) If P̄_n and Q_n converge weakly to P̄ and Q, respectively, then γH(P̄|Q) ≤ lim inf_n γH(P̄_n|Q_n).

To see (a), note that for every X ∈ L∞, −γ log E_Q[exp(−X/γ)] is convex in Q, and E_{P̄}[−X] is convex in P̄. Hence, E_{P̄}[−X] − γ log E_Q[exp(−X/γ)] is jointly convex in (P̄, Q), and therefore
γH(P̄|Q) = sup_{X∈L∞} {E_{P̄}[−X] − γ log E_Q[exp(−X/γ)]}
is jointly convex in (P̄, Q) as well.
(b) If Q_n ∈ Q converges weakly to Q, and P̄_n ∈ Q converges weakly to P̄, then for every X ∈ L∞ we have E_{Q_n}[−X] → E_Q[−X] and E_{P̄_n}[−X] → E_{P̄}[−X] as n → ∞. Since
E_{P̄}[−X] − γ log E_Q[exp(−X/γ)] = lim_n {E_{P̄_n}[−X] − γ log E_{Q_n}[exp(−X/γ)]}
≤ lim inf_n sup_{X∈L∞} {E_{P̄_n}[−X] − γ log E_{Q_n}[exp(−X/γ)]},
it follows that
γH(P̄|Q) = sup_{X∈L∞} {E_{P̄}[−X] − γ log E_Q[exp(−X/γ)]}
≤ lim inf_n sup_{X∈L∞} {E_{P̄_n}[−X] − γ log E_{Q_n}[exp(−X/γ)]} = lim inf_n γH(P̄_n|Q_n).
This proves (b). (a) and (b) imply that γH(P̄|Q) is jointly convex and lower-semicontinuous in (P̄, Q). Furthermore, θ(Q) is convex and lower-semicontinuous. Therefore γH(P̄|Q) + θ(Q) is jointly convex and lower-semicontinuous as well. By Theorem 2.1.3 (v) of Zalinescu [42] it follows that inf_{Q∈Q} {γH(P̄|Q) + θ(Q)} is convex in P̄. Now (ii) follows since α is the minimal lower-semicontinuous and convex function satisfying (2.1).
(iii): If we could show that
β(P̄) = inf_{Q∈Q} {γH(P̄|Q) + θ(Q)}   (5.3)
is also lower-semicontinuous and that the infimum is attained, then (5.2) would follow from the uniqueness of α. First of all let us show that the infimum in (5.3) is attained. Let Q_k ≪ P be a minimizing sequence. Since θ ≢ ∞ we have for all P̄ that β(P̄) < ∞. Thus,
lim sup_k θ(Q_k) ≤ lim sup_k {γH(P̄|Q_k) + θ(Q_k)} = β(P̄) < ∞.
In particular, (θ(Q_k))_k is a bounded sequence. By our assumptions, (Q_k)_k must be a uniformly integrable sequence, and by the Theorem of Dunford-Pettis, see for instance Theorem IV.8.9 in Dunford and Schwartz [14], the sequence (Q_k)_k is weakly relatively compact. Hence, for fixed P̄ we may take the infimum in (5.3) over the weakly compact closure of {Q_1, Q_2, . . .}. As, by (b) above, Q → γH(P̄|Q) + θ(Q) is lower-semicontinuous, we may infer that the infimum is attained.
So suppose that P̄_n converges weakly to P̄. For the lower-semicontinuity we have to show that
β(P̄) ≤ lim inf_n β(P̄_n).   (5.4)
If lim inf_n β(P̄_n) = ∞ then clearly (5.4) holds. So assume that r := lim inf_n β(P̄_n) < ∞. Denote by (n_j)_j the subsequence such that lim inf_n β(P̄_n) = lim_j β(P̄_{n_j}). Let
Q̄_{n_j} ∈ arg min_{Q∈Q} {γH(P̄_{n_j}|Q) + θ(Q)}.
As lim sup_j θ(Q̄_{n_j}) ≤ lim_j {γH(P̄_{n_j}|Q̄_{n_j}) + θ(Q̄_{n_j})} = r, the sequence (Q̄_{n_j})_j is uniformly integrable. Again by the Theorem of Dunford-Pettis, (Q̄_{n_j})_j has a subsequence, denoted by (n_{j_k})_k, converging weakly to a measure Q̄ ∈ Q. Hence, by the lower-semicontinuity of the mapping (P̄, Q) → H(P̄|Q) proved in (b),
β(P̄) = min_{Q∈Q} {γH(P̄|Q) + θ(Q)} ≤ γH(P̄|Q̄) + θ(Q̄)
≤ lim inf_k {γH(P̄_{n_{j_k}}|Q̄_{n_{j_k}}) + θ(Q̄_{n_{j_k}})} = lim inf_n β(P̄_n),
where the last equality holds because (n_{j_k})_k is a subsequence of the sequence (n_j)_j. Hence, indeed β is lower-semicontinuous and we can conclude that β = α. □

Corollary 5.3 Suppose that
ρ(X) = sup_{Q∈M} e_{γ,Q}(X)
for a convex set M ⊂ Q. Then the dual conjugate of ρ is given by the largest lower-semicontinuous function α dominated by inf_{Q∈M} γH(P̄|Q). Furthermore, if M is weakly relatively compact, then
α(P̄) = min_{Q∈M} γH(P̄|Q).   (5.5)

Proof. The first part of the corollary is precisely (ii) of Theorem 5.2 with θ = I_M. The second part follows as for all r ∈ R+ we have {Q ∈ Q | θ(Q) ≤ r} = M. (5.5) now follows as, by the Theorem of Dunford-Pettis, M is weakly relatively compact if and only if M is uniformly integrable. □

Corollary 5.4 Suppose that ρ is a convex risk measure with dual conjugate α for which α(P) = 0 and α(Q) > 0 if Q ≠ P. Then ρ is γ-entropy coherent if and only if ρ(X) = e_γ(X).

Proof. The 'if' direction is trivial. Let us prove the 'only if' direction. If ρ is γ-entropy coherent, then by Corollary 5.3 we must have α(P̄) ≤ inf_{Q∈M} γH(P̄|Q) for a convex set M. Note that if P̄ ∈ M then 0 ≤ α(P̄) ≤ inf_{Q∈M} γH(P̄|Q) = 0. By the assumptions on α this implies that M can at most contain P. Hence, either α(P̄) = γH(P̄|P) for all P̄ ≪ P, or M = ∅ and α = ∞. However, as inf_Q α(Q) = ρ(0) = 0, we must have that α(P̄) = γH(P̄|P). Therefore, by (2.1) indeed
ρ(X) = sup_{P̄∈Q} {E_{P̄}[−X] − γH(P̄|P)} = e_γ(X). □
Corollary 5.5 Let ρ be a convex risk measure. Then the following statements are equivalent:
(i) For a convex and lower-semicontinuous function θ from Q to [0, ∞] with inf_{Q∈Q} θ(Q) = 0 and uniformly integrable sublevel sets we have
α(P̄) = min_{Q∈Q} {γH(P̄|Q) + θ(Q)}.   (5.6)
(ii) ρ is γ-entropy convex with a convex and lower-semicontinuous penalty function θ which has uniformly integrable sublevel sets.

Proof. The direction from (ii) to (i) is precisely part (iii) of Theorem 5.2. The reverse direction holds since
ρ(X) = sup_{P̄∈Q} {E_{P̄}[−X] − α(P̄)} = sup_{P̄∈Q} {E_{P̄}[−X] − min_{Q∈Q} [γH(P̄|Q) + θ(Q)]}
= sup_{Q∈Q} sup_{P̄∈Q} {E_{P̄}[−X] − γH(P̄|Q) − θ(Q)} = sup_{Q∈Q} {e_{γ,Q}(X) − θ(Q)}. □
In the case that the penalty functions admit uniformly integrable sublevel sets, the next theorem establishes a complete characterization of entropy convexity involving only the dual conjugate α. It shows that entropy convexity is equivalent to a min-max being a max-min.

Theorem 5.6 Suppose that ρ is a convex risk measure. Furthermore, let θ be defined by θ(Q) := sup_{P̂≪P} {α(P̂) − γH(P̂|Q)}. Then the following statements are equivalent:
(i) ρ is γ-entropy convex with ρ* having uniformly integrable sublevel sets.
(ii) θ is convex and lower-semicontinuous with inf_{Q∈Q} θ(Q) = 0 and uniformly integrable sublevel sets, and for every P̄ ∈ Q,
inf_{Q∈Q} sup_{P̂∈Q} {γH(P̄|Q) + α(P̂) − γH(P̂|Q)} = sup_{P̂∈Q} inf_{Q∈Q} {γH(P̄|Q) + α(P̂) − γH(P̂|Q)}.   (5.7)
Proof. We can write the right-hand side of (5.7) as
sup_{P̂∈Q} inf_{Q∈Q} {γE_Q[log(dP̄/dQ) − log(dP̂/dQ)] + α(P̂)} = sup_{P̂∈Q} inf_{Q∈Q} {γE_Q[log(dP̄/dP̂)] + α(P̂)}.
If dP̄/dP̂ ≠ 1 on a non-zero set, we have that log(dP̄/dP̂) < 0 on a non-zero set. But then
inf_{Q∈Q} γE_Q[log(dP̄/dP̂)] = −∞.
Consequently, we have to choose P̂ = P̄ in the supremum above, which implies that the right-hand side in (5.7) is equal to α(P̄). Moreover, for the left-hand side we have that
inf_{Q∈Q} sup_{P̂∈Q} {γH(P̄|Q) + α(P̂) − γH(P̂|Q)} = inf_{Q∈Q} {γH(P̄|Q) + sup_{P̂∈Q} {α(P̂) − γH(P̂|Q)}}
= inf_{Q∈Q} {γH(P̄|Q) + θ(Q)}.
Now the theorem follows from Proposition 5.1 and Corollary 5.5. □
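The duality between the penalty function θ and the dual conjugate α expressed in (5.2) and (5.6) can be illustrated numerically. The Python sketch below works on a two-point sample space with an illustrative quadratic penalty (all parameters are assumptions chosen for the example, not taken from the paper); it computes ρ(X) once from the γ-entropy convex representation with penalty θ, and once from the convex risk measure representation (2.1) with α given by the inf-convolution (5.2), and the two values agree up to the grid error.

import numpy as np

gamma, c = 1.0, 4.0
x = np.array([1.0, -2.0])                 # illustrative position on a 2-point space
grid = np.linspace(1e-4, 1 - 1e-4, 2001)  # probabilities of the first state

def H(p, q):
    # Relative entropy H(P_bar | Q) on two points.
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

theta = lambda q: c * (q - 0.5) ** 2      # illustrative convex penalty with inf theta = 0

# Entropy convex representation: rho(X) = sup_Q { gamma * log E_Q[exp(-X/gamma)] - theta(Q) }.
e_gamma = gamma * np.log(grid * np.exp(-x[0] / gamma) + (1 - grid) * np.exp(-x[1] / gamma))
rho_primal = np.max(e_gamma - theta(grid))

# Dual conjugate alpha(P_bar) = min_Q { gamma * H(P_bar|Q) + theta(Q) }, then representation (2.1).
alpha = np.array([np.min(gamma * H(p, grid) + theta(grid)) for p in grid])
rho_dual = np.max(grid * (-x[0]) + (1 - grid) * (-x[1]) - alpha)

print(rho_primal, rho_dual)               # agree up to discretization error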
6
Acceptance Sets
Throughout this section we assume that the probability space is rich. We further denote by F_X^Q the cumulative distribution function (cdf) of X under the probabilistic model Q. In the theory of convex risk measures, the notion of acceptability plays an important role. Specifically, for a convex risk measure ρ, the set of all acceptable positions A_ρ is defined by
A_ρ = {X ∈ L∞ | ρ(X) ≤ 0}.   (6.1)
Every acceptance set induces a rejection set R_ρ = L∞ \ A_ρ. As is well known (see, for instance, Föllmer and Schied [16]),
ρ(X) = inf{m ∈ R | X + m ∈ A_ρ},   (6.2)
which means that a convex risk measure ρ(X) can be interpreted as the minimal amount of capital needed to make the financial position X acceptable. One may verify that ρ is a convex risk measure if and only if A_ρ defined in (6.1) contains 0, is closed, monotone (if X ≤ Y and X is acceptable, then Y is acceptable as well), and convex (if X and Y are both acceptable, then, for every λ ∈ [0, 1], the 'diversified' portfolio λX + (1 − λ)Y is acceptable as well). Convexity captures the notion of diversification. Coherent risk measures correspond to A_ρ being additionally assumed to be positively homogeneous, i.e., a cone (X ∈ A implies λX ∈ A for any λ > 0). Rather than starting with a convex risk measure ρ and defining A_ρ through (6.1), one can also start with a set A ⊂ L∞, satisfying the appropriate conditions, and define a convex risk measure ρ through (6.2). In particular, ρ is uniquely defined through its acceptance set.
In this section, we characterize entropy coherent and entropy convex measures of risk in terms of properties of their acceptance sets. Before turning to the general case of entropy convex measures of risk, we first consider the case of entropy coherent measures of risk. To highlight the difference with coherent risk measures we also provide an alternative characterization of coherent risk measures in terms of their acceptance sets.
For a fixed probabilistic model Q and a non-empty, monotone set A^Q ⊂ L∞ with R^Q = L∞ \ A^Q ≠ ∅, we consider the following properties:
(RN) Risk neutrality: If X ∈ A^Q, then, for any λ ≥ 1, λX ∈ A^Q.
(NRN) Non-risk neutrality: If X ∈ A^Q and Q[X < 0] > 0, then there exists a λ ≥ 1 such that λX ∉ A^Q.
(i) Acceptance additivity: If X, Y ∈ A^Q are independent under Q, then also X + Y ∈ A^Q.
(ii) Rejection additivity: If X, Y ∈ R^Q are independent under Q, then also X + Y ∈ R^Q.
(iii) Mixing: If X, Y ∈ A^Q, then, for any λ ∈ [0, 1], the random variable Z with cdf λF_X^Q + (1 − λ)F_Y^Q under Q is in A^Q as well. Furthermore, if X, Y ∈ R^Q, then, for any λ ∈ [0, 1], the random variable Z with cdf λF_X^Q + (1 − λ)F_Y^Q under Q is in R^Q as well.
(iv) Continuity: A^Q is closed with respect to convergence in distribution under Q.
(v) ϕ-convexity: There exist constants x_1 ∈ int(A^Q) and x_2 ∈ R^Q such that the function
ϕ̄(x) := (z − ᾱ(x))/(1 − ᾱ(x)) if x ≥ 0, and ϕ̄(x) := z/ᾱ(x) if x < 0,   (6.3)
is convex, where z := sup{0 ≤ α ≤ 1 | αδ_{x_2} + (1 − α)δ_{x_1} ∈ A^Q}, ᾱ(x) = sup{0 ≤ α ≤ 1 | αδ_{x_2} + (1 − α)δ_x ∈ A^Q} for x ≥ 0, and ᾱ(x) = sup{0 ≤ α ≤ 1 | αδ_x + (1 − α)δ_{x_1} ∈ A^Q} for x < 0. Here, δ_x is a Dirac point mass in x. We define ϕ(x) = ϕ̄(−x).
Note that in (v) random variables are identified with their cdf's under Q. (By (iii), if F_X^Q = F_Y^Q, then X ∈ A^Q if and only if Y ∈ A^Q, and similarly for R^Q.) Now we can consider a whole family of sets (A^Q)_Q satisfying properties (i)-(v), where the upper index Q should express that these properties are satisfied for every given probabilistic model Q, i.e., model-wise. For a family of sets (A^Q)_Q, we introduce an additional property:
(vi) Acceptance neutrality: If X with cdf F_X^Q under Q and Y with cdf F_Y^{Q'} under Q' satisfy F_X^Q = F_Y^{Q'}, then X ∈ A^Q if and only if Y ∈ A^{Q'}.
The following theorem provides an alternative characterization of coherent risk measures in terms of their acceptance sets. It features properties (RN),(i)-(vi).

Theorem 6.1 A mapping ρ : L∞ → R defined by (6.2) is a coherent risk measure if and only if there exists a closed and convex set M ⊂ Q such that the acceptance set A_ρ can be written as
A_ρ = ∩_{Q∈M} A^Q,   (6.4)
where (A^Q)_{Q∈M} satisfy (RN),(i)-(vi).

Proof. Recall that ρ is coherent if and only if there exists a set M ⊂ Q such that ρ(X) = sup_{Q∈M} E_Q[−X]. Moreover, M can be chosen as a closed and convex subset of Q. To prove '⇒', define A^Q := {X ∈ L∞ | E_Q[−X] ≤ 0}. Then it is straightforward to see that the (A^Q)_{Q∈M} satisfy (RN),(i)-(vi) (with x_1 = 1, x_2 = −1 and ϕ̄(x) = (1/2)(1 − x) in (v)) and that A_ρ = ∩_{Q∈M} A^Q. Next, let us show that '⇐' holds. Suppose that
A_ρ = ∩_{Q∈M} A^Q,
where (A^Q)_{Q∈M} satisfy (RN),(i)-(vi). From Weber [41], Theorem 3.3, it follows that conditions (iii)-(iv) imply that there exists an increasing function ϕ(x) = ϕ̄(−x), with ϕ̄ as defined in (6.3), and a constant w such that A^Q = {X ∈ L∞ | ϕ^{-1}(E_Q[ϕ(−X)]) ≤ w}. Now condition (v) guarantees that ϕ is convex. Define
ρ^Q(X) = inf{m ∈ R | X + m ∈ A^Q}.   (6.5)
Note that if we could show that ρ^Q is additive for independent random variables, then we must have ρ^Q(0) = 0. Furthermore, it would then follow from Gerber [20] that ϕ is linear or exponential. But condition (RN) entails that ϕ cannot be exponential, so ϕ would have to be linear. Let X and Y be independent under Q. Note that, by (6.2),
ρ^Q(X + Y) = inf{m ∈ R | X + Y + m ∈ A^Q} = inf{m = m_X + m_Y ∈ R | X + m_X + Y + m_Y ∈ A^Q}
≤ inf{m_X ∈ R | X + m_X ∈ A^Q} + inf{m_Y ∈ R | Y + m_Y ∈ A^Q} = ρ^Q(X) + ρ^Q(Y),   (6.6)
where the inequality holds because, by property (i), if X + m_X, Y + m_Y ∈ A^Q, then (since X + m_X and Y + m_Y are independent under Q) also X + m_X + Y + m_Y ∈ A^Q. On the other hand, using translation invariance of ρ^Q, it is straightforward to prove that ρ^Q(X) = sup{m ∈ R | X + m ∈ R^Q}. Therefore,
ρ^Q(X + Y) = sup{m ∈ R | X + Y + m ∈ R^Q} = sup{m = m_X + m_Y ∈ R | X + m_X + Y + m_Y ∈ R^Q}
≥ sup{m_X ∈ R | X + m_X ∈ R^Q} + sup{m_Y ∈ R | Y + m_Y ∈ R^Q} = ρ^Q(X) + ρ^Q(Y).   (6.7)
It follows from (6.6)-(6.7) that ρ^Q is additive for independent random variables. Therefore, ϕ is linear and hence ρ is a coherent risk measure. □
Note that the probabilistic models Q ∈ M may be viewed as stress test measures. By Theorem 6.1, with A_ρ = ∩_{Q∈M} A^Q, a financial position X is acceptable if and only if it is acceptable under every stress test measure Q ∈ M. In this context, A^Q may be referred to as a stress test set. The following theorem provides a characterization of entropy coherent measures of risk in terms of their stress test sets. It features properties (NRN),(i)-(vi). It shows that entropy coherent measures of risk can be obtained just as coherent risk measures in Theorem 6.1, but by moving from risk neutrality (RN) of the stress test sets to assuming non-risk neutrality (NRN).

Theorem 6.2 A mapping ρ : L∞ → R defined by (6.2) is an entropy coherent measure of risk if and only if there exists a closed and convex set M ⊂ Q such that the acceptance set A_ρ can be written as (6.4), where (A^Q)_{Q∈M} satisfy (NRN),(i)-(vi).

Proof. Entropy coherence means that there exists a set M ⊂ Q such that ρ(X) = sup_{Q∈M} e_{γ,Q}(X). Again, M can be chosen as a closed and convex subset of Q. One easily verifies that, if X and Y are independent under Q,
e_{γ,Q}(X + Y) = e_{γ,Q}(X) + e_{γ,Q}(Y).   (6.8)
To prove '⇒', define A^Q := {X ∈ L∞ | e_{γ,Q}(X) ≤ 0}. Then it is straightforward to see that the (A^Q)_{Q∈M} satisfy (NRN),(i)-(vi) (with ϕ̄(x) = a e^{−x/γ} + b, γ ∈ R+, and a and b chosen such that ϕ̄(1) = 0 and ϕ̄(−1) = 1 in (v)). In particular, the fact that (i) and (ii) hold for every A^Q follows from (6.8). Furthermore, by definition of ρ, clearly, A_ρ = ∩_{Q∈M} A^Q. To see the direction '⇐', note that one can show exactly as in the proof of Theorem 6.1 that there exists a convex function ϕ(x) = ϕ̄(−x), with ϕ̄ given by (6.3), and a constant w such that A^Q = {X ∈ L∞ | ϕ^{-1}(E_Q[ϕ(−X)]) ≤ w}, and that ρ^Q defined in (6.5) is additive for independent random variables. But then Gerber [20] yields that ϕ is linear or exponential, and (NRN) now implies that ϕ has to be exponential. Furthermore, 0 ∈ A^Q entails that w = 0. It follows that there exists γ ∈ R+ such that ρ^Q(X) = e_{γ,Q}(X). As A_ρ = ∩_{Q∈M} A^Q, we have that X ∈ A_ρ if and only if sup_{Q∈M} e_{γ,Q}(X) ≤ 0. Now it follows from (6.2) and the translation invariance of e_{γ,Q} that indeed ρ(X) = sup_{Q∈M} e_{γ,Q}(X). This proves that ρ is an entropy coherent measure of risk. □
Theorem 6.2 shows that moving from coherent to entropy coherent measures of risk exactly and solely means moving from scale risk neutral to non-risk neutral stress test sets, where scale risk neutrality is expressed by scale invariance of the stress test sets. In fact, it follows from Gerber [20] that, under acceptance and rejection additivity, scale risk neutrality implies full risk neutrality. It makes explicit that entropy coherent measures of risk may be viewed as the non-risk neutral counterparts of the risk neutral coherent risk measures.
Now let us turn to the more general entropy convex case. Let θ : Q → R ∪ {∞} be a penalty function. For a fixed probabilistic model Q with θ(Q) < ∞, and a monotone set A^Q with A^Q ≠ ∅, R^Q ≠ ∅, we consider the following properties:
(NRN') Non-risk neutrality with penalty: If X ∈ A^Q and Q[X < 0] > 0, then there exists a λ ≥ 1 such that λX + (λ − 1)θ(Q) ∉ A^Q.
(i') Acceptance additivity with penalty: If X, Y ∈ A^Q are independent under Q, then also X + Y + θ(Q) ∈ A^Q.
(ii') Rejection additivity with penalty: If X, Y ∈ R^Q are independent under Q, then also X + Y + θ(Q) ∈ R^Q.
Notice that since θ(Q) ≥ 0, θ(Q) in properties (NRN'), (i') and (ii') can be interpreted as an additional regulator's charge for increasing the scale of a financial position. We let axioms (iii)-(vi) remain unchanged.
It may then be proved, similarly to Theorem 6.1, that convex risk measures are equivalent to acceptance sets of the form (6.4) with (A^Q)_{Q∈Q} satisfying properties (i'),(ii'),(iii)-(vi). Furthermore, entropy convex measures of risk are equivalent to (A^Q)_{Q∈Q} satisfying properties (NRN'),(i'),(ii'),(iii)-(vi). In particular, this may be seen by noting that ρ̃^Q(X) := ρ^Q(X) + θ(Q) = inf{m ∈ R | X + m ∈ Ã^Q}, where ρ^Q is as defined in (6.5), and Ã^Q := A^Q + θ(Q) satisfies (NRN),(i),(ii) whenever A^Q satisfies (NRN'),(i'),(ii'). To characterize convex risk measures, an analog of axiom (RN) is not needed since even if ϕ is exponential, ρ is still a convex risk measure. This means that entropy convex measures of risk arise from convex risk measures exactly and solely by additionally requiring the property of non-risk neutrality (NRN') of the stress test sets. It makes explicit that entropy convex measures of risk may be viewed as those convex risk measures that are non-risk neutral.
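As a numerical illustration of the acceptance set perspective (with illustrative numbers, not taken from the paper), the following Python sketch builds the entropy coherent stress test sets A^Q = {X : e_{γ,Q}(X) ≤ 0} on a finite space, recovers ρ(X) = sup_{Q∈M} e_{γ,Q}(X) as the minimal capital m making X + m acceptable under every stress test measure, and exhibits non-risk neutrality (NRN): a position that is acceptable but has losses on a non-null set becomes unacceptable once it is scaled up sufficiently.

import numpy as np

gamma = 1.0
# Illustrative finite set M of stress test measures on a 3-point space.
M = [np.array([0.2, 0.3, 0.5]), np.array([0.5, 0.3, 0.2])]
X = np.array([-0.5, 1.0, 2.0])           # an illustrative position with Q[X < 0] > 0

def e_gamma(X, q):
    # Entropic evaluation e_{gamma,Q}(X) = gamma * log E_Q[exp(-X/gamma)].
    return gamma * np.log(np.sum(q * np.exp(-X / gamma)))

def rho_from_acceptance(X, eps=1e-6):
    # Minimal capital m such that X + m lies in every stress test set A^Q (bisection search).
    lo, hi = -10.0, 10.0
    while hi - lo > eps:
        m = 0.5 * (lo + hi)
        if all(e_gamma(X + m, q) <= 0 for q in M):
            hi = m
        else:
            lo = m
    return hi

print(rho_from_acceptance(X), max(e_gamma(X, q) for q in M))   # the two coincide

# Non-risk neutrality: X itself is acceptable, but a large enough multiple is not.
print(all(e_gamma(X, q) <= 0 for q in M), all(e_gamma(10 * X, q) <= 0 for q in M))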
7
Conclusions
In this paper, we have introduced two subclasses of convex risk measures: entropy coherent and the more general entropy convex measures of risk. A variety of representation and duality results as well as some examples have made explicit that entropy coherent and entropy convex measures of risk are distinct and important classes of risk measures, and satisfy many appealing properties. These results include the facts that entropy convex measures of risk are (i) the only convex risk measures among ϕ-convex measures of risk with non-linear ϕ, thus allowing for risk aversion; (ii) the only convex risk measures among negative certainty equivalents under variational and homothetic preferences with non-linear utility; (iii) the only ϕ-convex measures of risk with non-linear ϕ and non-trivial penalty function among negative certainty equivalents under variational and homothetic preferences. Furthermore, we have shown that the acceptance sets generating entropy convex measures of risk satisfy the same characteristic properties as
the acceptance sets generating convex risk measures, the only difference being that the former additionally satisfy a non-risk neutrality property that the latter need not satisfy. Entropy coherent and entropy convex measures of risk are moreover the natural generalizations of the popular entropic measure of risk. The theory developed in this paper is of a static nature. In future research we intend to develop its dynamic counterpart.
8
Appendix
Lemma 8.1 Let U ⊂ R and M ⊂ Q. Let (f_Q(·))_{Q∈M} be a family of functions mapping from U to R. Suppose that for every Q ∈ M, f_Q(·) is a differentiable function. Let
f(m) := sup_{Q∈M} f_Q(m),   (8.9)
and fix m_0 ∈ U. Suppose that for f(m_0) the supremum is attained in a Q_0 ∈ M. Assume further that f is differentiable in m_0. Then we must have that f'(m_0) = f'_{Q_0}(m_0).

Proof. Since f is differentiable in m_0 we must have
f'(m_0) = lim_{ε→0} [f(m_0 + ε) − f(m_0)]/ε
= lim_{ε→0} [f(m_0 + ε) − f_{Q_0}(m_0 + ε)]/ε + lim_{ε→0} [f_{Q_0}(m_0 + ε) − f_{Q_0}(m_0)]/ε
= lim_{ε→0} [f(m_0 + ε) − f_{Q_0}(m_0 + ε)]/ε + f'_{Q_0}(m_0).   (8.10)
In particular, lim_{ε→0} [f(m_0 + ε) − f_{Q_0}(m_0 + ε)]/ε must exist. However, as for every ε we have that f(m_0 + ε) − f_{Q_0}(m_0 + ε) ≥ 0, it follows that
lim_{ε↓0} [f(m_0 + ε) − f_{Q_0}(m_0 + ε)]/ε ≥ 0,
lim_{ε↑0} [f(m_0 + ε) − f_{Q_0}(m_0 + ε)]/ε ≤ 0.
Therefore, lim_{ε→0} [f(m_0 + ε) − f_{Q_0}(m_0 + ε)]/ε must be equal to zero. Thus, it follows from (8.10) that indeed f'(m_0) = f'_{Q_0}(m_0). □
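Lemma 8.1 is an envelope-type statement: at a point where the supremum is attained and the upper envelope is differentiable, the envelope and the attaining function share the same derivative. The following Python sketch checks this numerically for an illustrative parametric family (all choices are assumptions made for the example).

import numpy as np

# Family f_q(m) = q*m - q**2 / 2 indexed by q in [0, 2]; the envelope is f(m) = sup_q f_q(m).
qs = np.linspace(0.0, 2.0, 2001)

def f(m):
    return np.max(qs * m - qs**2 / 2)       # upper envelope over the family

m0 = 0.7
q0 = qs[np.argmax(qs * m0 - qs**2 / 2)]     # maximizer Q_0 at m_0 (here equal to m0 on this grid)

h = 1e-5
envelope_slope = (f(m0 + h) - f(m0 - h)) / (2 * h)   # numerical derivative of the envelope
member_slope = q0                                     # derivative of f_{q0}(m) = q0*m - q0**2/2
print(envelope_slope, member_slope)                   # the two derivatives agree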
Lemma 8.2 Suppose that ϕ ∈ C³ and that there do not exist p, γ, q such that, for all x ∈ (ϕ^{-1}(0), ∞), ϕ(x) = γ exp{x/γ} + q or ϕ(x) = px + q. Then the function ϕ' ∘ ϕ^{-1} is not linear on ϕ((ϕ^{-1}(0), ∞)) = R+.

Proof. Suppose that there exist c, d such that ϕ' ∘ ϕ^{-1}(x) = cx + d for all x ∈ R+. As ϕ' ∘ ϕ^{-1} = 1/(ϕ^{-1})', we get that
(ϕ^{-1})'(x) = 1/(cx + d).
If c = 0 then ϕ is linear on (ϕ^{-1}(0), ∞), contrary to our assumptions. As ϕ^{-1} is strictly increasing on R+, we must have that c > 0. This entails ϕ^{-1}(x) = (1/c) log(cx + d), which yields that ϕ(x) = (1/c) exp{cx} − d/c on (ϕ^{-1}(0), ∞). This contradicts again our assumptions. Hence, under the stated assumptions, ϕ' ∘ ϕ^{-1} is not linear on R+. □
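The dichotomy behind Lemma 8.2 — ϕ' ∘ ϕ^{-1} is affine precisely for linear and exponential loss functions — can also be seen numerically: below, a least-squares line is fitted to ϕ' ∘ ϕ^{-1} for an exponential ϕ and for a non-exponential, non-linear alternative (both illustrative choices, not from the paper), and only the first fit has negligible residuals.

import numpy as np

def affine_fit_residual(phi, dphi, xs):
    # Sample y = phi(x) and phi'(x) on a grid, so that the pairs (y, dphi(x)) trace phi' o phi^{-1};
    # then measure the largest residual of the best affine fit to these pairs.
    y, d = phi(xs), dphi(xs)
    coeffs = np.polyfit(y, d, 1)
    return np.max(np.abs(np.polyval(coeffs, y) - d))

xs = np.linspace(-1.0, 3.0, 400)
gamma = 2.0

# Exponential loss: phi' o phi^{-1}(y) = y / gamma is exactly affine, so the residual is essentially zero.
print(affine_fit_residual(lambda x: gamma * np.exp(x / gamma),
                          lambda x: np.exp(x / gamma), xs))

# A strictly increasing, convex but non-exponential loss: the affine fit is visibly imperfect.
print(affine_fit_residual(lambda x: np.exp(x) + 0.5 * x**2 + x,
                          lambda x: np.exp(x) + x + 1.0, xs))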
References
[1] Anscombe, F.J. and R.J. Aumann (1963). A definition of subjective probability. Annals of Mathematical Statistics 34, 199-205.
[2] Artzner, Ph., F. Delbaen, J.-M. Eber and D. Heath (1999). Coherent measures of risk. Mathematical Finance 9, 203-228.
[3] Barrieu, P. and N. El Karoui (2005). Inf-convolution of risk measures and optimal risk transfer. Finance & Stochastics 9, 269-298.
[4] Ben-Tal, A. and M. Teboulle (1986). Expected utility, penalty functions, and duality in stochastic nonlinear programming. Management Science 32, 1445-1466.
[5] Ben-Tal, A. and M. Teboulle (1987). Penalty functions and duality in stochastic programming via ϕ-divergence functionals. Mathematics of Operations Research 12, 224-240.
[6] Ben-Tal, A. and M. Teboulle (2007). An old-new concept of convex risk measures: The optimized certainty equivalent. Mathematical Finance 17, 449-476.
[7] Borch, K. (1962). Equilibrium in a reinsurance market. Econometrica 30, 424-444.
[8] Cerreia-Vioglio, S., F. Maccheroni, M. Marinacci and L. Montrucchio (2008). Uncertainty averse preferences. Collegio Carlo Alberto Notebook 77.
[9] Chateauneuf, A. and J.H. Faro (2010). Ambiguity through confidence functions. Journal of Mathematical Economics 45, 535-558.
[10] Cheridito, P. and M. Kupper (2009). Recursiveness of indifference prices and translation-invariant preferences. Mathematical and Financial Economics 2, 173-188.
[11] Csiszár, I. (1975). I-divergence geometry of probability distributions and minimization problems. Annals of Probability 3, 146-158.
[12] Dana, R.-A. (2005). A representation result for concave Schur concave functions. Mathematical Finance 15, 613-634.
[13] Deprez, O. and H.U. Gerber (1985). On convex principles of premium calculation. Insurance: Mathematics and Economics 4, 179-189.
[14] Dunford, N. and J.T. Schwartz (1958). Linear Operators. Part I: General Theory. Interscience, New York.
[15] Föllmer, H. and A. Schied (2002). Convex measures of risk and trading constraints. Finance & Stochastics 6, 429-447.
[16] Föllmer, H. and A. Schied (2004). Stochastic Finance. 2nd ed., De Gruyter, Berlin.
[17] Föllmer, H., A. Schied and S. Weber (2009). Robust preferences and robust portfolio choice. In: Ciarlet, P., A. Bensoussan and Q. Zhang (Eds.). Mathematical Modelling and Numerical Methods in Finance. Handbook of Numerical Analysis 15, 29-88.
[18] Frittelli, M. and E. Rosazza Gianin (2002). Putting order in risk measures. Journal of Banking and Finance 26, 1473-1486.
[19] Gerber, H.U. (1979). An Introduction to Mathematical Risk Theory. S.S. Huebner Foundation Monograph 8, Irwin, Homewood.
[20] Gerber, H.U. (1985). On additive principles of zero utility. Insurance: Mathematics and Economics 4, 249-251.
[21] Gilboa, I. and D. Schmeidler (1989). Maxmin expected utility with non-unique prior. Journal of Mathematical Economics 18, 141-153.
[22] Gollier, C. (2001). The Economics of Risk and Time. MIT Press, Cambridge.
[23] Goovaerts, M.J., F.E.C. De Vylder and J. Haezendonck (1984). Insurance Premiums. North-Holland Publishing, Amsterdam.
[24] Goovaerts, M.J., R. Kaas, R.J.A. Laeven and Q. Tang (2004). A comonotonic image of independence for additive risk measures. Insurance: Mathematics and Economics 35, 581-594.
[25] Hansen, L.P. and T.J. Sargent (2001). Robust control and model uncertainty. American Economic Review 91, 60-66.
[26] Hansen, L.P. and T.J. Sargent (2007). Robustness. Princeton University Press, Princeton.
[27] Hardy, G.H., J.E. Littlewood and G. Pólya (1934). Inequalities. Cambridge University Press, Cambridge (2nd ed. 1952, reprinted 1978).
[28] Heath, D. and H. Ku (2004). Pareto equilibria with coherent measures of risk. Mathematical Finance 14, 163-172.
[29] Huber, P.J. (1981). Robust Statistics. Wiley, New York.
[30] Maccheroni, F., M. Marinacci and A. Rustichini (2006). Ambiguity aversion, robustness, and the variational representation of preferences. Econometrica 74, 1447-1498.
[31] Mania, M. and M. Schweizer (2005). Dynamic exponential utility indifference valuation. Annals of Applied Probability 15, 2113-2143.
[32] Moreau, J.J. (1970). Inf-convolution, sous-additivité, convexité des fonctions numériques. Journal de Mathématiques Pures et Appliquées 49, 109-154.
[33] Rockafellar, R.T. and S.P. Uryasev (2000). Optimization of conditional value-at-risk. Journal of Risk 2, 21-42.
[34] Rockafellar, R.T., S.P. Uryasev and M. Zabarankin (2008). Risk tuning with generalized linear regression. Mathematics of Operations Research 33, 712-729.
[35] Ruszczyński, A. and A. Shapiro (2006). Optimization of convex risk functions. Mathematics of Operations Research 31, 433-452.
[36] Ruszczyński, A. and A. Shapiro (2006). Conditional risk mappings. Mathematics of Operations Research 31, 544-561.
[37] Savage, L.J. (1954). The Foundations of Statistics. Wiley, New York (2nd ed. 1972, Dover, New York).
[38] Strzalecki, T. (2011). Axiomatic foundations of multiplier preferences. Econometrica 79, 47-73.
[39] Von Neumann, J. and O. Morgenstern (1944). Theory of Games and Economic Behavior. Princeton University Press, Princeton (3rd ed. 1953).
[40] Wald, A. (1950). Statistical Decision Functions. Wiley, New York.
[41] Weber, S. (2006). Distribution-invariant risk measures, information, and dynamic consistency. Mathematical Finance 16, 419-441.
[42] Zalinescu, C. (2002). Convex Analysis in General Vector Spaces. World Scientific, Singapore.