Correlation Detection and an Operational Interpretation of the Rényi Mutual Information

Masahito Hayashi (1, 2) and Marco Tomamichel (3, 2)

1 Graduate School of Mathematics, Nagoya University, Furocho, Chikusaku, Nagoya, 464-860, Japan
2 Centre for Quantum Technologies, National University of Singapore, Singapore 117543, Singapore
3 School of Physics, The University of Sydney, Sydney 2006, Australia

Recently, a variety of new measures of quantum Rényi mutual information and quantum Rényi conditional entropy have been proposed, and some of their mathematical properties explored. Here, we show that the Rényi mutual information attains operational meaning in the context of composite hypothesis testing, when the null hypothesis is a fixed bipartite state and the alternate hypothesis consists of all product states that share one marginal with the null hypothesis. This hypothesis testing problem occurs naturally in channel coding, where it corresponds to testing whether a state is the output of a given quantum channel or of a “useless” channel whose output is decoupled from the environment. Similarly, we establish an operational interpretation of Rényi conditional entropy by choosing an alternative hypothesis that consists of product states that are maximally mixed on one system. Specialized to classical probability distributions, our results also establish an operational interpretation of Rényi mutual information and Rényi conditional entropy.

I. INTRODUCTION

In order to distill useful measures of Rényi mutual information and Rényi conditional entropy from a plethora of possible definitions, it is important to find out which definitions correspond to relevant operational quantities. For this purpose, let us consider how efficiently an arbitrary bipartite correlated state $\rho_{AB}$ on systems $A$ and $B$ can be distinguished from product states when the marginal of $\rho_{AB}$ on $A$ is known to be $\rho_A$. This problem can be regarded as the problem of detecting correlations in the state $\rho_{AB}$. Formally, we consider the following binary composite hypothesis testing problem for $n$ copies of such a state (see footnote 1):

Null Hypothesis: The state is $\rho_{AB}^{\otimes n}$.

Alternate Hypothesis: The state is of the form $\rho_A^{\otimes n} \otimes \sigma_{B^n}$ with $\sigma_{B^n}$ any state on $n$ copies of $B$.

This hypothesis test figures prominently when analyzing the converse to various channel coding questions in classical as well as quantum information processing (see footnote 2). There, the problem is specified by a description of a channel $\mathcal{E}_{A' \to B}$ and a bipartite state $\rho_{AA'}$, where the system $A$ constitutes an environment of the channel, $A'$ is the channel input, and $B$ its output. We are given an unknown state on $n$ copies of $A$ and $B$ and consider the following two hypotheses.

Null Hypothesis: The state is the output of $n$ uses of the channel $\mathcal{E}_{A' \to B}$, namely the state is exactly $\rho_{AB}^{\otimes n}$ where $\rho_{AB} := \mathcal{E}_{A' \to B}[\rho_{AA'}]$.

Alternate Hypothesis: The state is the output of a “useless” channel and decoupled from the environment, namely it is of the form $\rho_A^{\otimes n} \otimes \sigma_{B^n}$ with $\sigma_{B^n}$ any state on $n$ copies of $B$.

Footnote 1: We want to consider the speed with which the probability that we erroneously support the state $\rho_{AB}^{\otimes n}$, when the actual state is a product state of the form $\rho_A^{\otimes n} \otimes \sigma_{B^n}$, vanishes under a constraint on the opposite error. As explained later, this problem can be discussed in terms of the Hoeffding bound and Stein’s lemma under this formulation.

Footnote 2: There exists an intimate connection between quantum channel coding and binary hypothesis testing (see, e.g., [21]). This connection is particularly important when analyzing how much information can be transmitted with a single use of a quantum channel [28, 45] or when approximating how much information can be transmitted with finitely many uses of the channel [11, 44]. (See also [19, 37] for the classical case. In particular, Polyanskiy [36, Sec. II] discusses the classical special case of this hypothesis testing problem.)

A hypothesis test for this problem is a binary positive operator-valued measure $\{Q_{A^nB^n},\, 1_{A^nB^n} - Q_{A^nB^n}\}$ on the $n$ copies of the systems $A$ and $B$, determined by an operator $0 \le Q_{A^nB^n} \le 1_{A^nB^n}$. If the operator $Q_{A^nB^n}$ “clicks” on our state, we conclude that the null hypothesis is correct, whereas otherwise we conclude that the alternate hypothesis is correct. The error of the first kind, $\alpha_n(Q_{A^nB^n})$, is defined as the probability with which we wrongly conclude that the alternate hypothesis is correct even if the state is $\rho_{AB}^{\otimes n}$, given by

$$\alpha_n(Q_{A^nB^n}) = \operatorname{tr}\big[\rho_{AB}^{\otimes n}\,(1_{A^nB^n} - Q_{A^nB^n})\big]. \tag{1}$$

Conversely, the error of the second kind, $\beta_n(Q_{A^nB^n})$, is defined as the probability with which we wrongly conclude that the null hypothesis is correct even if the state is of the form $\rho_A^{\otimes n} \otimes \sigma_{B^n}$ for some $\sigma_{B^n}$, given by

$$\beta_n(Q_{A^nB^n}) = \max_{\sigma_{B^n}} \operatorname{tr}\big[(\rho_A^{\otimes n} \otimes \sigma_{B^n})\, Q_{A^nB^n}\big], \tag{2}$$

where the maximum is taken over all states $\sigma_{B^n}$ on $n$ copies of $B$.

Main Results. The main contribution of this paper is an asymptotic analysis of the fundamental trade-off between these two errors as $n$ goes to infinity. To investigate this trade-off, we ask the following question: assuming that our test is such that $\beta_n(Q_{A^nB^n}) \le \exp(-nR)$, what is the minimum value of $\alpha_n(Q_{A^nB^n})$ we can achieve? The answer is different depending on whether $R$ is smaller or larger than the mutual information between $A$ and $B$, denoted $I(A:B)_\rho$. If $R < I(A:B)_\rho$, we show that the minimal error of the first kind vanishes exponentially fast in $n$. This implies a quantum Stein’s lemma [22] for the above composite hypothesis testing problem. More formally, we define

$$\hat\alpha_n(nR) = \min_{0 \le Q_{A^nB^n} \le 1} \big\{ \alpha_n(Q_{A^nB^n}) \,\big|\, \beta_n(Q_{A^nB^n}) \le \exp(-nR) \big\} \tag{3}$$

and investigate the exact exponents with which this error vanishes as $n$ goes to infinity, yielding a quantum Hoeffding bound [18, 32] for our composite hypothesis testing problem. We find that the exponents are determined by the Rényi mutual information, defined as

$$I_\alpha(A:B)_\rho = \min_{\sigma_B} D_\alpha(\rho_{AB} \| \rho_A \otimes \sigma_B) \quad \text{for} \quad \alpha \in (0,1), \tag{4}$$

where $D_\alpha(\rho\|\sigma) := \frac{1}{\alpha-1} \log \operatorname{tr}\big[\sigma^{\frac{1-\alpha}{2}} \rho^{\alpha} \sigma^{\frac{1-\alpha}{2}}\big]$ is the Rényi relative entropy first investigated by Petz (see, e.g. [35]) and the minimization is over all states $\sigma_B$ on $B$. We obtain

$$\lim_{n\to\infty} \Big\{ -\frac{1}{n} \log \hat\alpha_n(nR) \Big\} = \sup_{s \in (0,1)} \Big\{ \frac{1-s}{s} \big( I_s(A:B)_\rho - R \big) \Big\}. \tag{5}$$

On the other hand, if $R > I(A:B)_\rho$, we show that $\hat\alpha_n(nR)$ must approach one exponentially fast in $n$. This implies the strong converse for quantum Stein’s lemma [34] for our problem. We then find the exact exponents (also called strong converse exponents, see [17, Ch. 3] and [29, 34]) with which the error of the first kind goes to one as $n$ goes to infinity, and we find that in our case the exponent is determined by the sandwiched Rényi mutual information [4, 15], given as

$$\tilde I_\alpha(A:B)_\rho = \min_{\sigma_B} \tilde D_\alpha(\rho_{AB} \| \rho_A \otimes \sigma_B) \quad \text{for} \quad \alpha > 1, \tag{6}$$

where $\tilde D_\alpha(\rho\|\sigma) := \frac{1}{\alpha-1} \log \operatorname{tr}\big[\big(\sigma^{\frac{1-\alpha}{2\alpha}} \rho\, \sigma^{\frac{1-\alpha}{2\alpha}}\big)^{\alpha}\big]$ is the (sandwiched) Rényi divergence [31, 46]. We obtain

$$\lim_{n\to\infty} \Big\{ -\frac{1}{n} \log \big(1 - \hat\alpha_n(nR)\big) \Big\} = \sup_{s > 1} \Big\{ \frac{s-1}{s} \big( R - \tilde I_s(A:B)_\rho \big) \Big\}. \tag{7}$$

Hence, we show that the above composite hypothesis testing problem yields an operational interpretation for different definitions of the Rényi mutual information for the two ranges of $\alpha$, paralleling the observation in [29].

Finally, we also perform a second-order analysis for quantum Stein’s lemma [25, 43] and show that the minimal error of the first kind converges to a constant if $\beta_n(Q_{A^nB^n}) \le \exp(-nI(A:B)_\rho - \sqrt{n}\, r)$ for some $r \in \mathbb{R}$. Specifically, for any $r \in \mathbb{R}$, we have

$$\lim_{n\to\infty} \hat\alpha_n\big(nI(A:B)_\rho + \sqrt{n}\, r\big) = \Phi\bigg( \frac{r}{\sqrt{V(A:B)_\rho}} \bigg), \tag{8}$$

where $\Phi$ is the cumulative standard normal (Gaussian) distribution and

$$V(A:B)_\rho := \operatorname{tr}\Big[ \rho_{AB} \big( \log \rho_{AB} - \log \rho_A \otimes \rho_B - I(A:B)_\rho \big)^2 \Big] \tag{9}$$

is the mutual information variance.

Analogously, an operational interpretation for conditional Rényi entropies is established by considering the following binary hypothesis testing problem, which is motivated by the task of decoupling of quantum states. The problem is specified by a description of a state $\rho_{AB}$. Given an unknown state on $A^n$ and $B^n$, consider the following two hypotheses:

Null Hypothesis: The state is the $n$-fold product of $\rho_{AB}$, namely $\rho_{AB}^{\otimes n}$.

Alternate Hypothesis: The state is uniform on $A^n$ and decoupled from $B^n$, i.e. it is of the form $\pi_A^{\otimes n} \otimes \sigma_{B^n}$, where $\pi_A$ is the fully mixed state on $A$.

The same analysis as above applied to this problem reveals that the exponents in the quantum Hoeffding bound are determined by the Rényi conditional entropies defined as [42]

$$H^{\uparrow}_\alpha(A|B)_\rho = -\min_{\sigma_B} D_\alpha(\rho_{AB} \| 1_A \otimes \sigma_B) \quad \text{for} \quad \alpha \in (0,1), \tag{10}$$

and the strong converse exponents are determined by the sandwiched conditional Rényi entropies [31]

$$\tilde H^{\uparrow}_\alpha(A|B)_\rho = -\min_{\sigma_B} \tilde D_\alpha(\rho_{AB} \| 1_A \otimes \sigma_B) \quad \text{for} \quad \alpha > 1. \tag{11}$$
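To make the second-order statement (8) concrete, the following minimal sketch evaluates its right-hand side for a classical joint distribution, where the mutual information and the variance (9) reduce to the mean and variance of the log-likelihood ratio. The function names and the example distribution are illustrative assumptions, not part of the paper, and the sketch assumes $V(A:B)_\rho > 0$.

```python
import math

def mutual_information_and_variance(P):
    """Return (I, V) in nats for a classical joint distribution P[a][b].

    I is the mutual information and V the mutual information variance,
    i.e. the mean and variance of log(P / (P_A x P_B)) under P.
    """
    pa = [sum(row) for row in P]
    pb = [sum(P[a][b] for a in range(len(P))) for b in range(len(P[0]))]
    terms = [
        (P[a][b], math.log(P[a][b] / (pa[a] * pb[b])))
        for a in range(len(P))
        for b in range(len(P[0]))
        if P[a][b] > 0
    ]
    info = sum(p * x for p, x in terms)
    var = sum(p * (x - info) ** 2 for p, x in terms)
    return info, var

def limiting_type1_error(P, r):
    """Right-hand side of (8): Phi(r / sqrt(V)), assuming V > 0."""
    _, var = mutual_information_and_variance(P)
    # Standard normal CDF written via the error function.
    return 0.5 * (1.0 + math.erf(r / math.sqrt(2.0 * var)))

# Illustrative example: a noisy copy of a uniform bit.
P = [[0.4, 0.1], [0.1, 0.4]]
I, V = mutual_information_and_variance(P)
print(I, V, limiting_type1_error(P, 0.0), limiting_type1_error(P, 1.0))
```

At $r = 0$ the limit equals one half, and it increases with $r$, since a larger $r$ imposes a stricter constraint on the error of the second kind.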

Related Work. Complementary and concurrent to this work, Cooney et al. [9] investigated the strong converse exponents for a similar hypothesis testing problem when adaptive strategies are allowed. However, they did not treat the case of a composite alternate hypothesis and they also did not analyze the error exponents in the quantum Hoeffding bound. Our proof of the strong converse exponents parallels the development in a very recent preprint by Mosonyi and Ogawa [30]. There, the authors consider correlated states and use the Gärtner-Ellis theorem of classical large deviation theory in order to investigate the asymptotic error exponents in the presence of correlations. Here, we are not interested in correlated states per se, but our proof technique based on pinching naturally leads us to a classical hypothesis testing problem with correlated distributions, for which the Gärtner-Ellis theorem again provides the right solution.

Outline. The remainder of this paper is structured as follows. In Section II we introduce the necessary notation and mathematical preliminaries, and we discuss some properties of the Rényi divergence. We believe that Lemma 2 and Corollary 3 may be of independent interest. In Section III we define the generalized Rényi mutual information (which formally generalizes both Rényi mutual information and Rényi conditional entropy) and discuss various properties, including a duality relation and additivity. Most importantly, in Proposition 7, we show that it can be represented as an asymptotic limit of classical Rényi divergences.

Then, in Section IV we formally define the composite hypothesis test we consider and the required operational quantities. In doing so, we introduce a slightly more general problem that includes the two hypothesis testing problems discussed previously as special cases. In Section V we prove an analogue of the quantum Hoeffding bound that establishes the operational meaning for Rényi mutual information and Rényi conditional entropy for α < 1. Moreover, in Section VI we find the strong converse exponents for our problem, yielding an operational meaning for the Rényi mutual information and Rényi conditional entropy for α > 1. As in the non-composite case, for α > 1 the relevant Rényi divergence is the “sandwiched” Rényi divergence. We conclude our treatment of the problem by considering the second order asymptotics in Section VII. This section is interesting on its own since it provides a new and more intuitive proof of the achievability of the second order that also easily adapts to non-composite hypothesis testing.

II. NOTATION AND PRELIMINARIES

We model quantum systems, denoted by capital letters (e.g., $A$, $B$), by finite-dimensional Hilbert spaces (e.g., $\mathcal{H}_A$, $\mathcal{H}_B$). Moreover, $A^n$ denotes a quantum system composed of $n$ copies of the system $A$, modeled by an $n$-fold tensor product of Hilbert spaces, $\mathcal{H}_{A^n} = \mathcal{H}_A^{\otimes n}$. We denote by $\mathcal{U}(A)$, $\mathcal{H}(A)$ and $\mathcal{P}(A)$ the set of unitary, Hermitian, and positive semi-definite operators acting on $\mathcal{H}_A$, respectively. We denote the identity operator on $\mathcal{H}_A$ by $1_A$ and the partial trace by $\operatorname{tr}_A$. Furthermore, we use $|A|$ to denote the dimension of the Hilbert space $\mathcal{H}_A$. Let $\mathcal{S}(A)$ be the set of quantum states, i.e., $\mathcal{S}(A) := \{\rho_A \in \mathcal{P}(A) \mid \operatorname{tr}[\rho_A] = 1\}$, where $\operatorname{tr}$ denotes the trace. Given a bipartite state $\rho_{AB} \in \mathcal{S}(AB)$, we denote by $\rho_A = \operatorname{tr}_B[\rho_{AB}]$ its marginal on $A$. We consistently use subscripts to indicate which physical system an operator acts on. Finally, $\pi_A \in \mathcal{S}(A)$ denotes the maximally mixed state given by $\pi_A = 1_A / |A|$.

A. Projectors and Pinching

For two Hermitian operators $L, K \in \mathcal{H}$, we write $L \le K$ if and only if $K - L \in \mathcal{P}$ and we write $L \ll K$ if the support of $L$ is contained in the support of $K$. Moreover, we write $\{L \ge K\} = 1 - \{L < K\}$ for the projector onto the subspace spanned by eigenvectors corresponding to non-negative eigenvalues of $L - K$. By definition we have $(L - K)\{L \ge K\} \ge 0$, and, thus,

$$L\{L \ge K\} \ge K\{L \ge K\}. \tag{12}$$

For any unitary $V$, we further have

$$V \{L \ge K\} V^\dagger = \{ V L V^\dagger \ge V K V^\dagger \}. \tag{13}$$

We will also use an inequality by Audenaert et al. [1, Thm. 1], which can be conveniently stated as follows [3, Eq. (24)]. Let $L$ and $K$ be positive semi-definite and $s \in (0, 1)$. Then,

$$\operatorname{tr}\big[L^s K^{1-s}\big] \ge \operatorname{tr}\big[K\{L \ge K\}\big] + \operatorname{tr}\big[L\{L < K\}\big]. \tag{14}$$

For any Hermitian $L$, we write its spectral decomposition as $L = \sum_{\lambda \in \operatorname{spec}(L)} \lambda P_L^\lambda$, where $P_L^\lambda$ are projectors and $\operatorname{spec}(L) \subset \mathbb{R}$ is its spectrum. We denote by $\mathcal{P}_L$ the pinching map for this spectral decomposition, i.e. the following completely positive trace-preserving map:

$$\mathcal{P}_L : K \mapsto \sum_{\lambda \in \operatorname{spec}(L)} P_L^\lambda K P_L^\lambda. \tag{15}$$

B. Permutation Invariance and Universal State

We will use the following observation from the representation theory of the group $S_n$ of permutations of $n$ elements. Let $U_{A^n} : S_n \to \mathcal{U}(A^n)$ denote the natural unitary representation of $S_n$ that permutes the subsystems $A_1, A_2, \ldots, A_n$. An operator $L_{A^n}$ is called permutation invariant if it satisfies $U_{A^n}(\pi) L_{A^n} U_{A^n}(\pi)^\dagger = L_{A^n}$ for all $\pi \in S_n$. Similarly, we say that $L_{A^n}$ is invariant under ($n$-fold) product unitaries if it satisfies $V_A^{\otimes n} L_{A^n} (V_A^\dagger)^{\otimes n} = L_{A^n}$ for all $V_A \in \mathcal{U}(A)$.

Lemma 1. Let $A$ be a system with $|A| = d$. For all $n \in \mathbb{N}$ there exists a state $\omega^n_{A^n} \in \mathcal{S}(A^n)$, which we call universal state, such that the following holds:

1. For all permutation invariant states $\tau_{A^n} \in \mathcal{S}(A^n)$, we have

$$\tau_{A^n} \le g_{n,d}\, \omega^n_{A^n} \quad \text{with} \quad g_{n,d} = \binom{n + d^2 - 1}{n} \le (n+1)^{d^2 - 1}. \tag{16}$$

2. The universal state has the following eigenvalue decomposition:

$$\omega^n_{A^n} = \bigoplus_{\lambda \in \Lambda_{n,d}} p_\lambda P^\lambda_{A^n}, \tag{17}$$

where $\Lambda_{n,d}$ is the set of Young diagrams of size $n$ and depth $d$ and satisfies $|\Lambda_{n,d}| \le (n+1)^{d-1}$, $\{P^\lambda_{A^n}\}_\lambda$ are mutually orthogonal projectors and $\{p_\lambda\}_\lambda$ is a probability distribution. In particular, $\omega^n_{A^n}$ is permutation invariant and invariant under product unitaries, and commutes with all permutation invariant states.

Note that a related construction is presented in [7], and we refer the reader to [6] for a thorough discussion of group representation theory in the context of quantum information. A different explicit construction of such a universal state is also proposed in [20, Sec. 3], but the constant given there instead of $g_{n,d}$ is not optimal.

Proof. Since $\tau_{A^n}$ is invariant under permutations, it has a purification $\tau_{A^n A'^n}$ in the symmetric subspace of $(\mathcal{H}_A \otimes \mathcal{H}_{A'})^{\otimes n}$, where $\mathcal{H}_{A'} \equiv \mathcal{H}_A$ are isomorphic (see, e.g., [38, Lem. 4.2.2]). Let $P^{\mathrm{symm}}_{A^n A'^n}$ denote the projector onto this symmetric subspace, and denote its dimension by $g_{n,d}$. Then,

$$\tau_{A^n A'^n} \le P^{\mathrm{symm}}_{A^n A'^n}, \quad \text{where} \quad P^{\mathrm{symm}}_{A^n A'^n} = \frac{1}{|S_n|} \sum_{\pi \in S_n} U_{A^n}(\pi) \otimes U_{A'^n}(\pi). \tag{18}$$

Let us now define the universal state as

$$\omega^n_{A^n} := \frac{1}{g_{n,d}} \operatorname{tr}_{A'^n}\big[ P^{\mathrm{symm}}_{A^n A'^n} \big] = \frac{1}{g_{n,d}\, |S_n|} \sum_{\pi \in S_n} \operatorname{tr}\big[U_{A'^n}(\pi)\big]\, U_{A^n}(\pi). \tag{19}$$

The state clearly has the desired first property, due to (18). Moreover, it is evident from (19) that $\omega^n_{A^n}$ is invariant under permutations and product unitaries. By the Schur-Weyl duality, the natural representation of $S_n \times \mathcal{U}(A)$ given by $\pi \times V_A \mapsto U_{A^n}(\pi) \cdot V_A^{\otimes n}$ decomposes into different irreducible representations labelled by the Young diagrams in $\Lambda_{n,d}$, and Schur’s lemma thus ensures that $\omega^n_{A^n}$ is of the form given in (17). The number $|\Lambda_{n,d}|$ is upper bounded by the number of types of strings of length $n$ with $d$ symbols, which in turn is bounded by $(n+1)^{d-1}$. (See, e.g., [20, Eq. (1)].)

Finally, since the subspaces $P^\lambda_{A^n}$ are spanned by the irreducible representations of $S_n$ of a type $\lambda$, by Schur’s lemma every permutation invariant state $\tau_{A^n}$ can be written in the form

$$\tau_{A^n} = \bigoplus_{\lambda \in \Lambda_{n,d}} \tau^\lambda_{A^n}, \tag{20}$$

where $\tau^\lambda_{A^n}$ is supported in $P^\lambda_{A^n}$. It is thus evident that such states commute with $\omega^n_{A^n}$.
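The two polynomial bounds used in Lemma 1 are easy to check numerically. The sketch below is only an illustration with hypothetical helper names: it evaluates the constant $g_{n,d}$ from (16) and the number of types of strings of length $n$ over $d$ symbols, and compares both with the stated bounds.

```python
import math

def g(n, d):
    """The constant g_{n,d} = binom(n + d^2 - 1, n) from Eq. (16)."""
    return math.comb(n + d * d - 1, n)

def num_types(n, d):
    """Number of types (empirical distributions) of strings of length n over d symbols."""
    return math.comb(n + d - 1, d - 1)

for d in (2, 3):
    for n in (1, 10, 100, 1000):
        assert g(n, d) <= (n + 1) ** (d * d - 1)
        assert num_types(n, d) <= (n + 1) ** (d - 1)
        # log g_{n,d} grows only logarithmically in n, so (1/n) log g_{n,d} -> 0.
        print(d, n, g(n, d), num_types(n, d), math.log(g(n, d)) / n)
```

Because $\log g_{n,d} = O(\log n)$, factors of $g_{n,d}$ do not affect exponential rates, which is how the universal state is used in the later proofs.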

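C. Rényi Divergence

Let us define the following two families of Rényi divergences for $\alpha \in (0, 1) \cup (1, \infty)$. For any quantum state $\rho \in \mathcal{S}$ and positive semi-definite operator $\sigma \ge 0$ satisfying $\rho \ll \sigma$, we define the Rényi relative entropy [35] and the (sandwiched) Rényi divergence [31, 46], respectively, as

$$D_\alpha(\rho\|\sigma) := \frac{1}{\alpha-1} \log \operatorname{tr}\Big[ \sigma^{\frac{1-\alpha}{2}} \rho^{\alpha} \sigma^{\frac{1-\alpha}{2}} \Big] \quad \text{and} \tag{21}$$

$$\tilde D_\alpha(\rho\|\sigma) := \frac{1}{\alpha-1} \log \operatorname{tr}\Big[ \Big( \sigma^{\frac{1-\alpha}{2\alpha}} \rho\, \sigma^{\frac{1-\alpha}{2\alpha}} \Big)^{\alpha} \Big]. \tag{22}$$

The two families of entropies coincide when $\rho$ and $\sigma$ commute. For $\alpha \in \{0, 1, \infty\}$ we define $D_\alpha(\rho\|\sigma)$ and $\tilde D_\alpha(\rho\|\sigma)$ as the corresponding limit. The relative entropy emerges when we take the limit $\alpha \to 1$ in both cases, namely

$$\lim_{\alpha \to 1} \tilde D_\alpha(\rho\|\sigma) = \lim_{\alpha \to 1} D_\alpha(\rho\|\sigma) = \operatorname{tr}\big[ \rho (\log \rho - \log \sigma) \big] =: D(\rho\|\sigma). \tag{23}$$

Some special cases of these entropies, in particular $D_0(\rho\|\sigma)$ and $\tilde D_\infty(\rho\|\sigma)$, have previously been discussed in [10] and are based on Renner’s min- and max-entropy [38]. A comprehensive overview of other special cases is given in [31].

For the second order analysis we will employ the information variance [25, 43], given as

$$V(\rho\|\sigma) := \operatorname{tr}\Big[ \rho \big( \log \rho - \log \sigma - D(\rho\|\sigma) \big)^2 \Big]. \tag{24}$$

In particular, we will use the fact that [27, Prop. 35]

$$\frac{\partial}{\partial \alpha} D_\alpha(\rho\|\sigma) \Big|_{\alpha=1} = \frac{\partial}{\partial \alpha} \tilde D_\alpha(\rho\|\sigma) \Big|_{\alpha=1} = \frac{V(\rho\|\sigma)}{2}. \tag{25}$$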
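For commuting $\rho$ and $\sigma$, both families reduce to the classical Rényi divergence of the two eigenvalue distributions, which makes the definitions above easy to explore numerically. The following is a minimal sketch under that classical assumption; the function names are illustrative, and the sketch assumes the support condition $\rho \ll \sigma$. It also checks the derivative formula (25) by a finite difference.

```python
import math

def renyi_divergence(p, q, alpha):
    """Classical Renyi divergence D_alpha(p||q) in nats, assuming supp(p) within supp(q)."""
    if any(pi > 0 and qi == 0 for pi, qi in zip(p, q)):
        raise ValueError("support condition p << q is violated")
    if abs(alpha - 1.0) < 1e-12:
        # alpha -> 1 limit: the relative entropy D(p||q), cf. Eq. (23).
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(s) / (alpha - 1.0)

def information_variance(p, q):
    """Classical information variance V(p||q), cf. Eq. (24)."""
    d = renyi_divergence(p, q, 1.0)
    return sum(pi * (math.log(pi / qi) - d) ** 2 for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]
h = 1e-4
# Central finite difference of alpha -> D_alpha at alpha = 1, to compare with V/2 in Eq. (25).
slope = (renyi_divergence(p, q, 1 + h) - renyi_divergence(p, q, 1 - h)) / (2 * h)
print(slope, information_variance(p, q) / 2)
```

The two printed numbers should agree up to the discretization error of the finite difference.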

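Let $|\operatorname{spec}(\sigma)|$ denote the number of mutually different eigenvalues of $\sigma$. The following property of the sandwiched Rényi divergence is crucial for our derivations:

Lemma 2. Let $\rho \in \mathcal{S}$ and $\sigma \in \mathcal{P}$. For all $\alpha \ge 0$, we have

$$\tilde D_\alpha(\mathcal{P}_\sigma(\rho)\|\sigma) \le \tilde D_\alpha(\rho\|\sigma) \le \tilde D_\alpha(\mathcal{P}_\sigma(\rho)\|\sigma) + \begin{cases} \log |\operatorname{spec}(\sigma)| & \text{if } \alpha \in [0, 2] \\ 2 \log |\operatorname{spec}(\sigma)| & \text{if } \alpha > 2 \end{cases}. \tag{26}$$

Proof. Since $\mathcal{P}_\sigma(\sigma) = \sigma$, the first inequality is a special case of the data-processing inequality. This special case was first established in [31, Prop. 15]. To derive the upper bound for $\alpha \in (1, 2]$, we write

$$\exp\big( (\alpha-1) \tilde D_\alpha(\rho\|\sigma) \big) = \operatorname{tr}\Big[ \Big( \sigma^{\frac{1-\alpha}{2\alpha}} \rho\, \sigma^{\frac{1-\alpha}{2\alpha}} \Big)^{\alpha-1} \sigma^{\frac{1-\alpha}{2\alpha}} \rho\, \sigma^{\frac{1-\alpha}{2\alpha}} \Big] \tag{27}$$

$$\le \operatorname{tr}\Big[ \Big( \sigma^{\frac{1-\alpha}{2\alpha}} |\operatorname{spec}(\sigma)|\, \mathcal{P}_\sigma(\rho)\, \sigma^{\frac{1-\alpha}{2\alpha}} \Big)^{\alpha-1} \sigma^{\frac{1-\alpha}{2\alpha}} \rho\, \sigma^{\frac{1-\alpha}{2\alpha}} \Big] \tag{28}$$

$$= |\operatorname{spec}(\sigma)|^{\alpha-1} \operatorname{tr}\Big[ \Big( \sigma^{\frac{1-\alpha}{2\alpha}} \mathcal{P}_\sigma(\rho)\, \sigma^{\frac{1-\alpha}{2\alpha}} \Big)^{\alpha-1} \sigma^{\frac{1-\alpha}{2\alpha}} \mathcal{P}_\sigma(\rho)\, \sigma^{\frac{1-\alpha}{2\alpha}} \Big] \tag{29}$$

$$= |\operatorname{spec}(\sigma)|^{\alpha-1} \exp\big( (\alpha-1) \tilde D_\alpha(\mathcal{P}_\sigma(\rho)\|\sigma) \big). \tag{30}$$

To establish (28), we use [17, Lem. 9], which states that

$$\rho \le |\operatorname{spec}(\sigma)|\, \mathcal{P}_\sigma(\rho) \tag{31}$$

and, since the function $x \mapsto x^{\alpha-1}$ is operator monotone for $\alpha \in (1, 2)$,

$$\Big( \sigma^{\frac{1-\alpha}{2\alpha}} \rho\, \sigma^{\frac{1-\alpha}{2\alpha}} \Big)^{\alpha-1} \le |\operatorname{spec}(\sigma)|^{\alpha-1} \Big( \sigma^{\frac{1-\alpha}{2\alpha}} \mathcal{P}_\sigma(\rho)\, \sigma^{\frac{1-\alpha}{2\alpha}} \Big)^{\alpha-1}. \tag{32}$$

An analogous argument, with the opposite inequality (28), holds for $\alpha \in (0, 1)$. Thus, for all $\alpha \in (0, 1) \cup (1, 2]$, we conclude that

$$\tilde D_\alpha(\rho\|\sigma) \le \tilde D_\alpha(\mathcal{P}_\sigma(\rho)\|\sigma) + \log |\operatorname{spec}(\sigma)|. \tag{33}$$

To get an upper bound for $\alpha > 2$ we observe that

$$\exp\big( (\alpha-1) \tilde D_\alpha(\rho\|\sigma) \big) = \operatorname{tr}\Big[ \Big( \sigma^{\frac{1-\alpha}{2\alpha}} \rho\, \sigma^{\frac{1-\alpha}{2\alpha}} \Big)^{\alpha} \Big] \tag{34}$$

$$\le |\operatorname{spec}(\sigma)|^{\alpha} \operatorname{tr}\Big[ \Big( \sigma^{\frac{1-\alpha}{2\alpha}} \mathcal{P}_\sigma(\rho)\, \sigma^{\frac{1-\alpha}{2\alpha}} \Big)^{\alpha} \Big], \tag{35}$$

since $A \le B$ implies $\operatorname{tr}[f(A)] \le \operatorname{tr}[f(B)]$ for every monotonically increasing function $f$. Thus,

$$\tilde D_\alpha(\rho\|\sigma) \le \tilde D_\alpha(\mathcal{P}_\sigma(\rho)\|\sigma) + \frac{\alpha}{\alpha-1} \log |\operatorname{spec}(\sigma)| \le \tilde D_\alpha(\mathcal{P}_\sigma(\rho)\|\sigma) + 2 \log |\operatorname{spec}(\sigma)|. \tag{36}$$

Finally, note that the inequality thus also holds for the limiting cases $\alpha \in \{0, 1, \infty\}$.

As a direct consequence of Lemma 2 and the fact (see footnote 3) that $|\operatorname{spec}(\sigma^{\otimes n})| \le (n+1)^{d-1}$, we find

Corollary 3. Let $\rho \in \mathcal{S}$ and $\sigma \in \mathcal{P}$. For all $\alpha \ge 0$, we have

$$\lim_{n \to \infty} \frac{1}{n} D_\alpha\big( \mathcal{P}_{\sigma^{\otimes n}}(\rho^{\otimes n}) \,\big\|\, \sigma^{\otimes n} \big) = \tilde D_\alpha(\rho\|\sigma). \tag{37}$$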

This extends a prior result by Mosonyi and Ogawa in [29] to all $\alpha \ge 0$.

Finally, note that the first inequality in Lemma 2 (but only the first) also holds if we replace $\tilde D_\alpha$ with $D_\alpha$. Taking the asymptotic limit in (37), this yields that $\tilde D_\alpha(\rho\|\sigma) \le D_\alpha(\rho\|\sigma)$.

III. GENERALIZED RÉNYI MUTUAL INFORMATION

We state our results in a general form that allows us to treat mutual information and conditional entropies at the same time. The usual mutual information and the conditional entropies can then be recovered as special cases.

A. Definitions

For this purpose, let us define the generalized Rényi mutual information and the generalized sandwiched Rényi mutual information for a bipartite state $\rho_{AB} \in \mathcal{S}(AB)$ and any $\tau_A \ge 0$ such that $\tau_A \gg \rho_A$ as follows:

$$I_\alpha(\rho_{AB}\|\tau_A) := \inf_{\sigma_B \in \mathcal{S}(B)} D_\alpha(\rho_{AB} \| \tau_A \otimes \sigma_B), \tag{38}$$

$$\tilde I_\alpha(\rho_{AB}\|\tau_A) := \inf_{\sigma_B \in \mathcal{S}(B)} \tilde D_\alpha(\rho_{AB} \| \tau_A \otimes \sigma_B). \tag{39}$$

Footnote 3: If $\sigma$ has $k$ different eigenvalues, the number of different eigenvalues of $\sigma^{\otimes n}$ is at most the number of types of sequences of length $n$ with $k$ symbols.

It is easy to verify that the infimum in the above definitions can be replaced by a minimum since the divergences are continuous in $\sigma_B$ whenever $\sigma_B \gg \rho_B$ and diverge to $+\infty$ otherwise. Specializing to $\alpha = 1$, we find that

$$I(\rho_{AB}\|\tau_A) := \min_{\sigma_B \in \mathcal{S}(B)} D(\rho_{AB} \| \tau_A \otimes \sigma_B) = D(\rho_{AB} \| \tau_A \otimes \rho_B), \tag{40}$$

i.e. the minimizer is given by the marginal $\rho_B$ at $\alpha = 1$. We also define

$$V(\rho_{AB}\|\tau_A) := V(\rho_{AB} \| \tau_A \otimes \rho_B). \tag{41}$$

Note that the Rényi mutual information and the sandwiched Rényi mutual information [4, 15] are recovered by choosing $\tau_A = \rho_A$, namely we define (see footnote 4)

$$I_\alpha(A:B)_\rho := I_\alpha(\rho_{AB}\|\rho_A) \quad \text{and} \quad \tilde I_\alpha(A:B)_\rho := \tilde I_\alpha(\rho_{AB}\|\rho_A). \tag{42}$$

Similarly, we define the Rényi conditional entropy [42] and the sandwiched Rényi conditional entropy [31] by choosing $\tau_A = 1_A$. Using the notation of [42], we have

$$H^{\uparrow}_\alpha(A|B)_\rho := -I_\alpha(\rho_{AB}\|1_A) = \log |A| - I_\alpha(\rho_{AB}\|\pi_A) \quad \text{and} \tag{43}$$

$$\tilde H^{\uparrow}_\alpha(A|B)_\rho := -\tilde I_\alpha(\rho_{AB}\|1_A) = \log |A| - \tilde I_\alpha(\rho_{AB}\|\pi_A). \tag{44}$$

B. Duality Relation for $\tilde I_\alpha$

We will take advantage of the following duality relation for the mutual information:

Lemma 4. Let $\rho_{AB} \in \mathcal{S}(AB)$, $\tau_A \ge 0$ such that $\tau_A \gg \rho_A$ and $\alpha, \beta \ge \frac{1}{2}$ with $\frac{1}{\alpha} + \frac{1}{\beta} = 2$. Then, for any purification $\rho_{ABC}$ of $\rho_{AB}$, we have

$$\tilde I_\alpha(\rho_{AB}\|\tau_A) = -\tilde I_\beta(\rho_{AC}\|\tau_A^{-1}), \tag{45}$$

where the inverse is taken on the support of $\tau_A$.

This result is a rather straightforward generalization of the duality relation for the conditional Rényi entropy that was recently established independently in [31] and [4]. We provide a proof in Appendix B for completeness.

C. Additivity of $I_\alpha$ and $\tilde I_\alpha$

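We are interested in the additivity of the mutual informations $I_\alpha$ and $\tilde I_\alpha$ defined above. For $I_\alpha$ we can find the state $\sigma_B$ that minimizes the Rényi divergence using Sibson’s identity.

Lemma 5. For all $\alpha \ge 0$ and any states $\rho_{AB}$, $\omega_{A'B'}$, $\tau_A$, and $\pi_{A'}$, we have

$$I_\alpha\big( \rho_{AB} \otimes \omega_{A'B'} \,\big\|\, \tau_A \otimes \pi_{A'} \big) = I_\alpha(\rho_{AB}\|\tau_A) + I_\alpha(\omega_{A'B'}\|\pi_{A'}). \tag{46}$$

Footnote 4: For $\alpha = \infty$, various variants of the latter definition have been investigated in [8] in the context of the smooth entropy framework (see [41]).

Proof. The following quantum Sibson’s identity is adapted from [39, Lem. 3 in Suppl. Mat.]. Let $\rho_{AB} \in \mathcal{S}(AB)$, $\tau_A \in \mathcal{S}(A)$, and $\sigma_B \in \mathcal{S}(B)$. For any $\alpha > 0$, we have

$$D_\alpha(\rho_{AB} \| \tau_A \otimes \sigma_B) = D_\alpha(\rho_{AB} \| \tau_A \otimes \sigma_B^*(\alpha)) + D_\alpha(\sigma_B^*(\alpha) \| \sigma_B), \tag{47}$$

where

$$\sigma_B^*(\alpha) := \frac{ \Big( \operatorname{tr}_A\Big\{ \tau_A^{\frac{1-\alpha}{2}} \rho_{AB}^{\alpha} \tau_A^{\frac{1-\alpha}{2}} \Big\} \Big)^{\frac{1}{\alpha}} }{ \operatorname{tr}\Big[ \Big( \operatorname{tr}_A\Big\{ \tau_A^{\frac{1-\alpha}{2}} \rho_{AB}^{\alpha} \tau_A^{\frac{1-\alpha}{2}} \Big\} \Big)^{\frac{1}{\alpha}} \Big] }. \tag{48}$$

Furthermore, as an immediate consequence of the positive definiteness of $D_\alpha(\sigma_B^*(\alpha)\|\sigma_B)$, we find that $\arg\min_{\sigma_B \in \mathcal{S}(B)} D_\alpha(\rho_{AB} \| \tau_A \otimes \sigma_B) = \sigma_B^*(\alpha)$ is unique. In particular,

$$I_\alpha(\rho_{AB}\|\tau_A) = \frac{\alpha}{\alpha-1} \log \operatorname{tr}\Big[ \Big( \operatorname{tr}_A\Big\{ \tau_A^{\frac{1-\alpha}{2}} \rho_{AB}^{\alpha} \tau_A^{\frac{1-\alpha}{2}} \Big\} \Big)^{\frac{1}{\alpha}} \Big], \tag{49}$$

which implies that $I_\alpha$ is additive and concludes the proof.

We note that a trivial extension of [4, Th. 10 and 11] establishes that $\tilde I_\alpha$ is additive for $\alpha \ge \frac{1}{2}$. However, the result also directly follows from the duality relation in Lemma 4 (see footnote 5).

Lemma 6. For all $\alpha \ge \frac{1}{2}$ and any states $\rho_{AB}$, $\omega_{A'B'}$, $\tau_A$, and $\pi_{A'}$, we have

$$\tilde I_\alpha\big( \rho_{AB} \otimes \omega_{A'B'} \,\big\|\, \tau_A \otimes \pi_{A'} \big) = \tilde I_\alpha(\rho_{AB}\|\tau_A) + \tilde I_\alpha(\omega_{A'B'}\|\pi_{A'}). \tag{50}$$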
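Sibson’s identity (47) and the closed form (49) used in the proof of Lemma 5 can be checked numerically in the classical (commuting) case, where the partial trace becomes a sum over the first index. The sketch below assumes a joint distribution `P[a][b]` and a strictly positive reference distribution `tau[a]`; all function names are hypothetical and chosen only for illustration.

```python
import math

def renyi_div(p, q, alpha):
    """Classical Renyi divergence D_alpha(p||q) for alpha != 1 and q > 0 on supp(p)."""
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(s) / (alpha - 1.0)

def joint_divergence(P, tau, sigma, alpha):
    """D_alpha(P || tau x sigma) for a joint distribution P[a][b]."""
    p = [P[a][b] for a in range(len(P)) for b in range(len(P[0]))]
    q = [tau[a] * sigma[b] for a in range(len(P)) for b in range(len(P[0]))]
    return renyi_div(p, q, alpha)

def sibson_minimizer(P, tau, alpha):
    """Classical analogue of Eqs. (48) and (49): returns (I_alpha(P||tau), sigma_star)."""
    f = [
        sum(tau[a] ** (1.0 - alpha) * P[a][b] ** alpha for a in range(len(P)))
        for b in range(len(P[0]))
    ]
    w = [fb ** (1.0 / alpha) for fb in f]
    z = sum(w)
    return alpha / (alpha - 1.0) * math.log(z), [x / z for x in w]

P = [[0.5, 0.1], [0.1, 0.3]]
tau = [0.6, 0.4]
alpha = 0.5
I_alpha, sigma_star = sibson_minimizer(P, tau, alpha)
sigma = [0.3, 0.7]  # an arbitrary competitor
lhs = joint_divergence(P, tau, sigma, alpha)
rhs = I_alpha + renyi_div(sigma_star, sigma, alpha)
print(lhs, rhs)  # the two sides of Sibson's identity (47)
```

Since the closed form is the logarithm of a sum that factorizes over tensor products, additivity as stated in Lemma 5 follows directly in this classical setting.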

D. Uniform Asymptotic Achievability of $\tilde I_\alpha$

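The following result forms the core of our proof for the achievability of the strong-converse exponent. It establishes that the mutual information $\tilde I_\alpha(\rho_{AB}\|\tau_A)$ can be expressed as a limit of classical Rényi divergences. More precisely, we have the following.

Proposition 7. Let $\rho_{AB} \in \mathcal{S}(AB)$ and $\tau_A \in \mathcal{S}(A)$ such that $\tau_A \gg \rho_A$. For any $\alpha \ge \frac{1}{2}$, we have

$$\frac{1}{n} \tilde D_\alpha\Big( \mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}(\rho_{AB}^{\otimes n}) \,\Big\|\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \Big) = \tilde I_\alpha(\rho_{AB}\|\tau_A) + O\Big( \frac{\log n}{n} \Big). \tag{51}$$

Moreover, the convergence is uniform in $\alpha$.

Proof. For any $\sigma_B \in \mathcal{S}(B)$, employing the data-processing inequality we find

$$\tilde D_\alpha\Big( \mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}(\rho_{AB}^{\otimes n}) \,\Big\|\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \Big) \le \tilde D_\alpha\big( \rho_{AB}^{\otimes n} \,\big\|\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \big) \tag{52}$$

$$\le \tilde D_\alpha\big( \rho_{AB}^{\otimes n} \,\big\|\, \tau_A^{\otimes n} \otimes \sigma_B^{\otimes n} \big) + \log g_{n,d}, \tag{53}$$

where we used [31, Prop. 5] in the last step and set $d = \max\{|A|, |B|\}$. And thus, in particular,

$$\frac{1}{n} \tilde D_\alpha\Big( \mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}(\rho_{AB}^{\otimes n}) \,\Big\|\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \Big) \le \min_{\sigma_B \in \mathcal{S}(B)} \tilde D_\alpha\big( \rho_{AB} \,\big\|\, \tau_A \otimes \sigma_B \big) + \frac{\log g_{n,d}}{n} \tag{54}$$

$$= \tilde I_\alpha(\rho_{AB}\|\tau_A) + \frac{\log g_{n,d}}{n}. \tag{55}$$

Footnote 5: To see this, note that the inequality trivially holds in one direction by definition. The other direction then follows by applying the duality relation on both sides for a product purification.

The upper bound then follows by taking the limit $n \to \infty$. For the lower bound, we first invoke Lemma 2, which yields

$$\tilde D_\alpha\Big( \mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}(\rho_{AB}^{\otimes n}) \,\Big\|\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \Big) \ge \tilde D_\alpha\big( \rho_{AB}^{\otimes n} \,\big\|\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \big) - \log v_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}} \tag{56}$$

$$\ge \tilde I_\alpha\big( \rho_{AB}^{\otimes n} \,\big\|\, \tau_A^{\otimes n} \big) - \log v_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}, \tag{57}$$

where we use the shorthand notation $v_\sigma = |\operatorname{spec}(\sigma)|$. Next, we recall that Lemma 6 establishes the additivity of $\tilde I_\alpha$, in particular $\tilde I_\alpha(\rho_{AB}^{\otimes n} \| \tau_A^{\otimes n}) = n \tilde I_\alpha(\rho_{AB}\|\tau_A)$. Thus, we have

$$\frac{1}{n} \tilde D_\alpha\Big( \mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}(\rho_{AB}^{\otimes n}) \,\Big\|\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \Big) \ge \tilde I_\alpha(\rho_{AB}\|\tau_A) - \frac{\log v_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}}{n}. \tag{58}$$

Finally, note that $v_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}} \le v_{\tau_A^{\otimes n}}\, v_{\omega^n_{B^n}} \le (n+1)^{2(d-1)}$ to conclude the proof.

It is important that the correction terms in the above derivation are of the order $o\big(n^{-\frac{1}{2}}\big)$. This allows for the following corollary.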

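Corollary 8. For any $t \in \mathbb{R}$, we have

$$\lim_{n \to \infty} \frac{t}{\sqrt{n}} \Big( D_{1 + \frac{t}{\sqrt{n}}}\Big( \mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}(\rho_{AB}^{\otimes n}) \,\Big\|\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \Big) - n I(\rho_{AB}\|\tau_A) \Big) = \frac{t^2}{2} V(\rho_{AB}\|\tau_A). \tag{59}$$

Proof. According to Proposition 10 below, the Taylor expansion of $\tilde I_\alpha(\rho_{AB}\|\tau_A)$ for $\alpha$ close to 1 is

$$\tilde I_\alpha(\rho_{AB}\|\tau_A) = D(\rho_{AB} \| \tau_A \otimes \rho_B) + \frac{\alpha-1}{2} V(\rho_{AB} \| \tau_A \otimes \rho_B) + O\big( (\alpha-1)^2 \big). \tag{60}$$

Substituting this into (51) with $\alpha = 1 + \frac{t}{\sqrt{n}}$ yields the desired limit.

From the fact that the function $\alpha \mapsto \tilde I_\alpha(\rho_{AB}\|\tau_A)$ is a pointwise limit of a sequence of classical Rényi divergences and the convergence is uniform in $\alpha$, we can immediately deduce the following:

Corollary 9. The function $\alpha \mapsto \tilde I_\alpha(\rho_{AB}\|\tau_A)$ is continuous and monotonically increasing. Moreover, the function $t \mapsto t\, \tilde I_{1+t}(\rho_{AB}\|\tau_A)$ is continuous and convex.

E. Differentiability of $\alpha \mapsto \tilde I_\alpha(\rho_{AB}\|\tau_A)$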

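However, the argument used to derive Corollary 9 does not suffice to establish differentiability of the sandwiched Rényi mutual information.

Proposition 10. Let $\rho_{AB} \in \mathcal{S}(AB)$ and $\tau_A \in \mathcal{S}(A)$ such that $\tau_A \gg \rho_A$. Then, the function $\alpha \mapsto \tilde I_\alpha(\rho_{AB}\|\tau_A)$ is continuously differentiable for $\alpha \ge \frac{1}{2}$. Moreover,

$$\frac{\partial}{\partial \alpha} \tilde I_\alpha(\rho_{AB}\|\tau_A) \Big|_{\alpha=1} = \frac{1}{2} V(\rho_{AB}\|\tau_A). \tag{61}$$

Let us remark that continuity of $\alpha \mapsto \tilde I_\alpha(\rho_{AB}\|\tau_A)$ also directly follows from the fact that $\alpha \mapsto \tilde D_\alpha(\rho_{AB} \| \tau_A \otimes \sigma_B)$ is continuous and the duality relation. However, due to the optimization over $\sigma_B$ involved in the definition of $\tilde I_\alpha(\rho_{AB}\|\tau_A)$, it is not at all clear that the function is differentiable. We show this in Appendix C by first characterizing the optimal state $\sigma_B^*(\alpha)$ for each $\alpha$ and then showing that $\alpha \mapsto \sigma_B^*(\alpha)$ is continuously differentiable. Here, we present the proof for the second statement and establish the derivative at $\alpha = 1$.

Proof. First, note that

$$\limsup_{h \to 0} \Big\{ \frac{1}{h} \big( \tilde I_{1+h}(\rho_{AB}\|\tau_A) - I(\rho_{AB}\|\tau_A) \big) \Big\} \tag{62}$$

$$\le \limsup_{h \to 0} \Big\{ \frac{1}{h} \big( \tilde D_{1+h}(\rho_{AB} \| \tau_A \otimes \rho_B) - D(\rho_{AB} \| \tau_A \otimes \rho_B) \big) \Big\} \tag{63}$$

$$= \frac{\partial}{\partial h} \tilde D_{1+h}(\rho_{AB} \| \tau_A \otimes \rho_B) \Big|_{h=0} = \frac{1}{2} V(\rho_{AB} \| \tau_A \otimes \rho_B). \tag{64}$$

On the other hand, using Lemma 4, we find

$$\tilde I_{1+h}(\rho_{AB}\|\tau_A) - I(\rho_{AB}\|\tau_A) = I(\rho_{AC}\|\tau_A^{-1}) - \tilde I_{1-f(h)}(\rho_{AC}\|\tau_A^{-1}), \tag{65}$$

where $f : h \mapsto \frac{h}{1+2h}$ satisfies $f(0) = 0$ and $f'(0) = 1$. Using this, we find

$$\liminf_{h \to 0} \Big\{ \frac{1}{h} \big( \tilde I_{1+h}(\rho_{AB}\|\tau_A) - I(\rho_{AB}\|\tau_A) \big) \Big\} \tag{66}$$

$$= \liminf_{h \to 0} \Big\{ \frac{1}{h} \big( I(\rho_{AC}\|\tau_A^{-1}) - \tilde I_{1-f(h)}(\rho_{AC}\|\tau_A^{-1}) \big) \Big\} \tag{67}$$

$$\ge \liminf_{h \to 0} \Big\{ \frac{1}{h} \big( D(\rho_{AC} \| \tau_A^{-1} \otimes \rho_C) - \tilde D_{1-f(h)}(\rho_{AC} \| \tau_A^{-1} \otimes \rho_C) \big) \Big\} \tag{68}$$

$$= -\frac{\partial}{\partial h} \tilde D_{1-f(h)}(\rho_{AC} \| \tau_A^{-1} \otimes \rho_C) \Big|_{h=0} = \frac{V(\rho_{AC} \| \tau_A^{-1} \otimes \rho_C)}{2}. \tag{69}$$

It is easy to verify that $V(\rho_{AC}\|\tau_A^{-1}) = V(\rho_{AB}\|\tau_A)$, which concludes the proof.

IV. PROBLEM DEFINITION AND OPERATIONAL QUANTITIES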

We define a more general hypothesis testing problem that allows us to treat both problems discussed in the introduction together. Let $\rho_{AB} \in \mathcal{S}(AB)$ be a bipartite quantum state on systems $A$ and $B$ and let $\tau_A \in \mathcal{S}(A)$ be a state on system $A$. We are interested in the following composite hypothesis testing problem:

null hypothesis: state is $\rho_{AB}$ (70)

alternate hypothesis: state is $\tau_A \otimes \sigma_B$, for any state $\sigma_B \in \mathcal{S}(B)$. (71)

We consider arbitrary bipartite hypothesis tests, given by an operator $0 \le Q_{AB} \le 1$ on $AB$, and define the type-I error and type-II error, respectively, as follows:

$$\alpha(Q_{AB}; \rho_{AB}) := \operatorname{tr}\big[ (1 - Q_{AB}) \rho_{AB} \big], \quad \text{and} \tag{72}$$

$$\beta(Q_{AB}; \tau_A) := \max_{\sigma_B \in \mathcal{S}(B)} \operatorname{tr}\big[ Q_{AB} (\tau_A \otimes \sigma_B) \big]. \tag{73}$$
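In the classical special case, where $\rho_{AB}$ is a joint distribution and the test is a table of acceptance probabilities, the two errors in (72) and (73) can be computed directly. The sketch below is an illustration under that assumption, with hypothetical function names; it uses the fact that the maximum of a linear function over the probability simplex is attained at a point mass, so the maximization over $\sigma_B$ reduces to a maximum over single symbols $b$.

```python
def type1_error(Q, P):
    """alpha(Q; P): probability of rejecting the null hypothesis when the state is P."""
    return sum(
        P[a][b] * (1.0 - Q[a][b])
        for a in range(len(P))
        for b in range(len(P[0]))
    )

def type2_error(Q, tau):
    """beta(Q; tau): worst case over sigma_B of accepting the null under tau x sigma_B.

    The objective is linear in sigma_B, so the maximum is attained at a point mass.
    """
    return max(
        sum(tau[a] * Q[a][b] for a in range(len(tau)))
        for b in range(len(Q[0]))
    )

# Illustrative example: accept the null hypothesis exactly when a == b.
P = [[0.4, 0.1], [0.1, 0.4]]
tau = [0.5, 0.5]
Q = [[1.0, 0.0], [0.0, 1.0]]
print(type1_error(Q, P), type2_error(Q, tau))
```

The trivial test that always accepts the null hypothesis has type-I error zero and type-II error one, which illustrates the trade-off studied in the remainder of the paper.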

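It is convenient to define the quantity $\hat\alpha(\mu; \rho_{AB}\|\tau_A)$ as the minimum type-I error when the type-II error is at most $\mu$, i.e. we consider the following optimization problem:

$$\hat\alpha(\mu; \rho_{AB}\|\tau_A) := \min_{0 \le Q_{AB} \le 1} \big\{ \alpha(Q_{AB}; \rho_{AB}) \,\big|\, \beta(Q_{AB}; \tau_A) \le \mu \big\} \tag{74}$$

and note that this quantity can trivially be bounded as

$$\hat\alpha(\mu; \rho_{AB}\|\tau_A) \ge \max_{\sigma_B \in \mathcal{S}(B)} \; \min_{\substack{0 \le Q_{AB} \le 1 \\ \operatorname{tr}[Q_{AB}(\tau_A \otimes \sigma_B)] \le \mu}} \operatorname{tr}\big[ (1 - Q_{AB}) \rho_{AB} \big] \tag{75}$$

$$\ge \min_{\substack{0 \le Q_{AB} \le 1 \\ \operatorname{tr}[Q_{AB}(\tau_A \otimes \rho_B)] \le \mu}} \operatorname{tr}\big[ (1 - Q_{AB}) \rho_{AB} \big]. \tag{76}$$

The quantity $\hat\alpha(\mu; \rho_{AB}\|\tau_A)$ is the object of our study here. More precisely, for any fixed $n \in \mathbb{N}$, we consider the following $n$-fold extension of this composite hypothesis testing problem:

null hypothesis: state is $\rho_{AB}^{\otimes n}$ (77)

alternate hypothesis: state is $\tau_A^{\otimes n} \otimes \sigma_{B^n}$, for any state $\sigma_{B^n} \in \mathcal{S}(B^n)$. (78)

Here, it is important to note that $\sigma_{B^n}$ is an arbitrary state in $\mathcal{S}(B^n)$, and not restricted to product or permutation invariant states. We are interested in the asymptotic behavior of $\hat\alpha\big(\mu_n; \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big)$ for suitably chosen sequences $\{\mu_n\}_n$ for large $n$.

V. HOEFFDING BOUND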

Our first result considers the case where the error of the second kind goes to zero exponentially with a rate below the mutual information $I(\rho_{AB}\|\tau_A)$. In this case, we find that the error of the first kind converges to zero exponentially fast, and the exponent is determined by the generalized Rényi mutual information, $I_\alpha(\rho_{AB}\|\tau_A)$, for $\alpha < 1$.

Theorem 11. Let $\rho_{AB} \in \mathcal{S}(AB)$ and $\tau_A \in \mathcal{S}(A)$. Then, for any $R > 0$, we have

$$\lim_{n \to \infty} \Big\{ -\frac{1}{n} \log \hat\alpha\big( \exp(-nR); \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n} \big) \Big\} = \sup_{s \in (0,1)} \Big\{ \frac{1-s}{s} \big( I_s(\rho_{AB}\|\tau_A) - R \big) \Big\}. \tag{79}$$
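The right-hand side of (79) is straightforward to evaluate for classical distributions, where $I_s$ has the closed form (49). The following sketch approximates the supremum on a grid of values of $s$; the grid search, the function names, and the example distribution are illustrative assumptions rather than part of the paper.

```python
import math

def renyi_mi(P, tau, alpha):
    """Classical I_alpha(P||tau) via the closed form (49), for alpha > 0, alpha != 1."""
    z = sum(
        sum(tau[a] ** (1.0 - alpha) * P[a][b] ** alpha for a in range(len(P))) ** (1.0 / alpha)
        for b in range(len(P[0]))
    )
    return alpha / (alpha - 1.0) * math.log(z)

def hoeffding_exponent(P, tau, R, grid=1000):
    """Grid approximation of sup over s in (0,1) of (1-s)/s * (I_s(P||tau) - R)."""
    # The objective tends to 0 as s -> 1, so the supremum is never negative.
    best = 0.0
    for k in range(1, grid):
        s = k / grid
        best = max(best, (1.0 - s) / s * (renyi_mi(P, tau, s) - R))
    return best

P = [[0.4, 0.1], [0.1, 0.4]]
tau = [0.5, 0.5]  # the marginal of P on A, so I_s is the Renyi mutual information
for R in (0.05, 0.10, 0.30):
    print(R, hoeffding_exponent(P, tau, R))
```

For this example the mutual information is about 0.193 nats, so the exponent is positive for the two smaller rates and zero for $R = 0.30$, in line with the discussion of the case $R \ge I(\rho_{AB}\|\tau_A)$.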

Note that if $R \ge I(\rho_{AB}\|\tau_A)$ the right hand side of (79) evaluates to zero, revealing that in this case the error of the first kind will decay slower than exponentially in $n$ (see footnote 6). Furthermore, if $R < I_0(\rho_{AB}\|\tau_A)$, we find that the right hand side of (79) diverges to $+\infty$, indicating that the decay is faster than exponential in $n$. This includes the case where the error of the first kind is identically zero for sufficiently large $n$, e.g. in zero-error channel coding. We also consider the following two special cases:

Corollary 12. Let $\rho_{AB} \in \mathcal{S}(AB)$. Then, for any $R > 0$, we have

$$\lim_{n \to \infty} \Big\{ -\frac{1}{n} \log \hat\alpha\big( \exp(-nR); \rho_{AB}^{\otimes n} \big\| \rho_A^{\otimes n} \big) \Big\} = \sup_{s \in (0,1)} \Big\{ \frac{1-s}{s} \big( I_s(A:B)_\rho - R \big) \Big\}, \tag{80}$$

$$\lim_{n \to \infty} \Big\{ -\frac{1}{n} \log \hat\alpha\big( \exp(-nR); \rho_{AB}^{\otimes n} \big\| \pi_A^{\otimes n} \big) \Big\} = \sup_{s \in (0,1)} \Big\{ \frac{1-s}{s} \big( \log |A| - H^{\uparrow}_s(A|B)_\rho - R \big) \Big\}. \tag{81}$$

This corollary establishes an operational interpretation of the Rényi mutual information, $I_\alpha(A:B)_\rho$, as well as the Rényi conditional entropies, $H^{\uparrow}_\alpha(A|B)_\rho$, for $0 \le \alpha \le 1$. In the following, we treat the proof of the achievability and optimality in Theorem 11 separately.

Footnote 6: In fact, we will see in Theorems 13 and 17, respectively, that the error of the first kind will converge to 1 exponentially fast in $n$ if $R > I(\rho_{AB}\|\tau_A)$, and that it will converge to $\frac{1}{2}$ if $R = I(\rho_{AB}\|\tau_A)$.

A. Proof of Achievability

The achievability is shown using a quantum Neyman-Pearson test comparing $\rho_{AB}^{\otimes n}$ with $\tau_A^{\otimes n} \otimes \omega^n_{B^n}$, where $\omega^n_{B^n}$ is the universal state defined in Lemma 1. The analysis follows the lines of the proof of the direct part of the quantum Hoeffding bound given in [3, Sec. 5.5] and further hinges on the additivity of the mutual information expressed in Lemma 5. Note that the expression on the right-hand side of (79) is zero if $R \ge I(\rho_{AB}\|\tau_A)$, so the inequality holds trivially in that case. In the following we show that, for any $0 < R < I(\rho_{AB}\|\tau_A)$,

\[
\liminf_{n\to\infty} \left\{ -\frac{1}{n} \log \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big) \right\}
\ge \sup_{s\in(0,1)} \left\{ \frac{1-s}{s} \big( I_s(\rho_{AB}\|\tau_A) - R \big) \right\}. \tag{82}
\]

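Proof of Achievability in Theorem 11. Let us fix $s \in (0,1)$, let $\{\lambda_n\}_{n\in\mathbb{N}}$ be real numbers to be specified later, and define the sequence of tests

\[
Q^n_{A^nB^n} := \big\{ \rho_{AB}^{\otimes n} \ge \exp(\lambda_n)\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \big\}, \tag{83}
\]

where $\omega^n_{B^n}$ is the universal state introduced in Lemma 1. First, note that the natural representation of $S_n$ decomposes as $U_{A^nB^n}(\pi) = U_{A^n}(\pi) \otimes U_{B^n}(\pi)$ and that $Q^n_{A^nB^n}$ is permutation invariant as a direct consequence of Eq. (13). Thus, we have

\begin{align}
\beta(Q^n_{A^nB^n}; \tau_A^{\otimes n})
&= \max_{\sigma_{B^n}\in\mathcal{S}(B^n)} \operatorname{tr}\big[ Q^n_{A^nB^n} (\tau_A^{\otimes n} \otimes \sigma_{B^n}) \big] \tag{84} \\
&= \max_{\sigma_{B^n}\in\mathcal{S}(B^n)} \frac{1}{|S_n|} \sum_{\pi\in S_n} \operatorname{tr}\big[ U_{A^nB^n}(\pi)\, Q^n_{A^nB^n} (\tau_A^{\otimes n} \otimes \sigma_{B^n})\, U_{A^nB^n}(\pi)^\dagger \big] \tag{85} \\
&= \max_{\sigma_{B^n}\in\mathcal{S}(B^n)} \operatorname{tr}\big[ Q^n_{A^nB^n} (\tau_A^{\otimes n} \otimes \widetilde{\sigma}_{B^n}) \big], \tag{86}
\end{align}

where $\widetilde{\sigma}_{B^n} := \frac{1}{|S_n|} \sum_{\pi\in S_n} U_{B^n}(\pi)\, \sigma_{B^n}\, U_{B^n}(\pi)^\dagger$ is permutation invariant. Lemma 1 then yields

\[
\beta(Q^n_{A^nB^n}; \tau_A^{\otimes n}) \le g_{n,d} \operatorname{tr}\big[ Q^n_{A^nB^n} (\tau_A^{\otimes n} \otimes \omega^n_{B^n}) \big], \tag{87}
\]

where we set $d = |B|$. Furthermore, using Audenaert et al.'s inequality (14) we find

\begin{align}
\beta(Q^n_{A^nB^n}; \tau_A^{\otimes n})
&\le g_{n,d} \exp(-s\lambda_n) \operatorname{tr}\Big[ \big(\rho_{AB}^{\otimes n}\big)^{s} \big(\tau_A^{\otimes n} \otimes \omega^n_{B^n}\big)^{1-s} \Big] \tag{88} \\
&= g_{n,d} \exp(-s\lambda_n) \exp\Big( -(1-s)\, D_s\big(\rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n} \otimes \omega^n_{B^n}\big) \Big) \tag{89} \\
&\le g_{n,d} \exp(-s\lambda_n) \exp\Big( -(1-s)\, I_s\big(\rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big) \Big). \tag{90}
\end{align}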

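Observing that $I_s\big(\rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big) = n\, I_s(\rho_{AB}\|\tau_A)$ due to the additivity of the mutual information established in Lemma 5, we find that the choice

\[
\lambda_n = \frac{1}{s} \Big( \log g_{n,d} + n \big( R - (1-s)\, I_s(\rho_{AB}\|\tau_A) \big) \Big) \tag{91}
\]

achieves the desired bound $\beta(Q^n_{A^nB^n}; \tau_A^{\otimes n}) \le \exp(-nR)$. On the other hand, again using (14) and Lemma 5, we find

\begin{align}
\alpha(Q^n_{A^nB^n}; \rho_{AB}^{\otimes n})
&= \operatorname{tr}\Big[ \big\{ \rho_{AB}^{\otimes n} < \exp(\lambda_n)\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \big\}\, \rho_{AB}^{\otimes n} \Big] \tag{92} \\
&\le \exp\big( (1-s)\lambda_n \big) \operatorname{tr}\Big[ \big(\rho_{AB}^{\otimes n}\big)^{s} \big(\tau_A^{\otimes n} \otimes \omega^n_{B^n}\big)^{1-s} \Big] \tag{93} \\
&\le \exp\big( (1-s)\lambda_n - n(1-s)\, I_s(\rho_{AB}\|\tau_A) \big). \tag{94}
\end{align}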

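Substituting (91) for $\lambda_n$, we thus find

\[
\hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big)
\le \alpha(Q^n_{A^nB^n}; \rho_{AB}^{\otimes n})
\le \exp\left( \frac{1-s}{s} \big( \log g_{n,d} + nR - n\, I_s(\rho_{AB}\|\tau_A) \big) \right). \tag{95}
\]

Since $\log g_{n,d} = O(\log n)$, taking the limit $n \to \infty$ yields

\[
\liminf_{n\to\infty} \left\{ -\frac{1}{n} \log \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big) \right\}
\ge \frac{1-s}{s} \big( I_s(\rho_{AB}\|\tau_A) - R \big). \tag{96}
\]

Finally, since this holds for all $s \in (0,1)$, we have established the direct part.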
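The bookkeeping behind the choice (91) can be checked with a few lines of arithmetic. The sketch below treats $\log g_{n,d}$, $I_s$ and $R$ as plain numbers (the values used in the example are hypothetical, and the polynomial prefactor is only a stand-in for the one from Lemma 1) and confirms that the threshold (91) turns the bound (90) into $\exp(-nR)$ and the bound (94) into the exponent appearing in (95).

```python
import math

def lambda_n(n, s, rate, i_s, log_g):
    """Threshold of Eq. (91): (log g + n * (R - (1 - s) * I_s)) / s."""
    return (log_g + n * (rate - (1.0 - s) * i_s)) / s

def log_beta_bound(n, s, i_s, log_g, lam):
    """Logarithm of the right-hand side of (90), using I_s(rho^n || tau^n) = n * I_s."""
    return log_g - s * lam - n * (1.0 - s) * i_s

def log_alpha_bound(n, s, i_s, lam):
    """Exponent on the right-hand side of (94)."""
    return (1.0 - s) * lam - n * (1.0 - s) * i_s

# Hypothetical numbers, chosen only to exercise the algebra.
n, s, rate, i_s = 50, 0.4, 0.1, 0.3
log_g = 3.0 * math.log(n + 1)  # stand-in for a polynomial prefactor
lam = lambda_n(n, s, rate, i_s, log_g)
print(log_beta_bound(n, s, i_s, log_g, lam), -n * rate)
print(log_alpha_bound(n, s, i_s, lam), (1 - s) / s * (log_g + n * rate - n * i_s))
```

Each printed pair agrees up to floating-point rounding, which is the substitution step leading from (91) and (94) to (95).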

B. Proof of Optimality

To show optimality, we will directly employ the converse of the quantum Hoeffding bound established in [32] together with a minimax theorem derived in Appendix A. Recall that it remains to show that, for any $R > 0$, we have

\[
\limsup_{n\to\infty} \left\{ -\frac{1}{n} \log \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big) \right\}
\le \sup_{s\in(0,1)} \left\{ \frac{1-s}{s} \big( I_s(\rho_{AB}\|\tau_A) - R \big) \right\}. \tag{97}
\]

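Proof of Optimality in Theorem 11. We fix $\sigma_B \in \mathcal{S}(B)$ and note that

\[
\hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big)
\ge \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n} \otimes \sigma_B^{\otimes n}\big). \tag{98}
\]

At this point we can apply the converse of the quantum Hoeffding bound [32] to the expression on the right-hand side, which yields

\begin{align}
& \limsup_{n\to\infty} \left\{ -\frac{1}{n} \log \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big) \right\} \tag{99} \\
&\qquad \le \limsup_{n\to\infty} \left\{ -\frac{1}{n} \log \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n} \otimes \sigma_B^{\otimes n}\big) \right\} \tag{100} \\
&\qquad \le \sup_{s\in(0,1)} \left\{ \frac{1-s}{s} \big( D_s(\rho_{AB}\|\tau_A \otimes \sigma_B) - R \big) \right\}. \tag{101}
\end{align}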

Since this holds for all $\sigma_B \in \mathcal{S}(B)$, the limit in (99) is in fact upper bounded by

\[
\inf_{\sigma_B\in\mathcal{S}(B)}\ \sup_{s\in(0,1)} \left\{ \frac{1-s}{s} \big( D_s(\rho_{AB}\|\tau_A \otimes \sigma_B) - R \big) \right\}. \tag{102}
\]

It remains to observe that the functional

\[
(s, \sigma_B) \mapsto (1-s)\, D_s(\rho_{AB}\|\tau_A \otimes \sigma_B) = -\log \operatorname{tr}\big[ \rho_{AB}^{s} (\tau_A \otimes \sigma_B)^{1-s} \big] \tag{103}
\]

is convex in $\sigma_B$ for $s \in (0,1)$ due to Lieb's theorem [26] and concave in $s$ as was shown in [2, Lem. 2.1]. Hence, the minimax theorem (Proposition 18) in Appendix A applies to (102). This, together with the definition of $I_\alpha$ in (38), concludes the proof.

VI. STRONG CONVERSE EXPONENT

Our second result considers the case where the error of the second kind goes to zero exponentially with a rate exceeding the mutual information $I(\rho_{AB}\|\tau_A)$. In this case, we find that the error of the first kind converges to 1 exponentially fast, and the exponent is determined by the sandwiched Rényi mutual information, $\widetilde{I}_s(\rho_{AB}\|\tau_A)$, with $s > 1$.

Theorem 13. Let $\rho_{AB} \in \mathcal{S}(AB)$ and $\tau_A \in \mathcal{S}(A)$. Then, for any $R > 0$, we have

\[
\lim_{n\to\infty} \left\{ -\frac{1}{n} \log \Big( 1 - \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big) \Big) \right\}
= \sup_{s>1} \left\{ \frac{s-1}{s} \big( R - \widetilde{I}_s(\rho_{AB}\|\tau_A) \big) \right\}. \tag{104}
\]
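As a rough illustration of (104), the sketch below evaluates the strong converse exponent for a classical pair of bits and also reports where on the grid the supremum is attained. The assumptions are the same as for any classical toy example and are not taken from the text: the states commute, so the sandwiched quantity reduces to the ordinary Rényi divergence, and the minimization over the second marginal uses the closed form stated in the docstring. All names and numbers are hypothetical.

```python
import math

def renyi_mi_classical(p_xy, t_x, s):
    """Classical min_Q D_s(P_XY || T_X x Q_Y) for s != 1, via the closed form
    s/(s-1) * log sum_y (sum_x P(x,y)^s T(x)^(1-s))^(1/s) (an assumption of this sketch)."""
    total = 0.0
    for y in range(len(p_xy[0])):
        a_y = sum(p_xy[x][y] ** s * t_x[x] ** (1.0 - s) for x in range(len(t_x)))
        total += a_y ** (1.0 / s)
    return s / (s - 1.0) * math.log(total)

def strong_converse_exponent(p_xy, t_x, rate, s_max=21.0, steps=2000):
    """Grid approximation of sup_{s > 1} (s-1)/s * (R - I_s); returns (value, argmax)."""
    best, best_s = -math.inf, None
    for k in range(1, steps + 1):
        s = 1.0 + (s_max - 1.0) * k / steps
        value = (s - 1.0) / s * (rate - renyi_mi_classical(p_xy, t_x, s))
        if value > best:
            best, best_s = value, s
    return best, best_s

P = [[0.4, 0.1], [0.1, 0.4]]   # mutual information is about 0.193 nats
T = [0.5, 0.5]
print(strong_converse_exponent(P, T, 0.30))  # rate above the mutual information
print(strong_converse_exponent(P, T, 0.10))  # rate below it
```

For the rate above the mutual information the exponent is strictly positive; for the rate below it the grid maximum sits at the smallest $s$ and is close to zero, consistent with the supremum being approached as $s \to 1$.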

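Again, we are interested in the following two special cases:

Corollary 14. Let $\rho_{AB} \in \mathcal{S}(AB)$. Then, for any $R > 0$, we have

\begin{align}
\lim_{n\to\infty} \left\{ -\frac{1}{n} \log \Big( 1 - \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \rho_A^{\otimes n}\big) \Big) \right\}
&= \sup_{s>1} \left\{ \frac{s-1}{s} \big( R - \widetilde{I}_s(A:B)_\rho \big) \right\}, \tag{105} \\
\lim_{n\to\infty} \left\{ -\frac{1}{n} \log \Big( 1 - \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \pi_A^{\otimes n}\big) \Big) \right\}
&= \sup_{s>1} \left\{ \frac{s-1}{s} \big( R - \log|A| + \widetilde{H}_s^{\downarrow}(A|B)_\rho \big) \right\}. \tag{106}
\end{align}

This corollary establishes an operational interpretation of the sandwiched Rényi mutual information, $\widetilde{I}_\alpha^{\downarrow}(A:B)_\rho$, as well as the sandwiched Rényi conditional entropies, $\widetilde{H}_\alpha^{\uparrow}(A|B)_\rho$, for $\alpha \ge 1$. Before we commence with the proof, we discuss the classical Neyman-Pearson test we use and some results from classical large deviation theory. Following this, we treat the proofs of achievability and optimality in Theorem 13 separately.

A. Classical Neyman-Pearson Test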

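To show the direct part we employ a classical Neyman-Pearson test for the pinched state $\mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}\big(\rho_{AB}^{\otimes n}\big)$ and the state $\tau_A^{\otimes n} \otimes \omega^n_{B^n}$. The idea to use a classical Neyman-Pearson test on the pinched state goes back to [16]. We start by discussing some properties of this test.

Lemma 15. Let $\rho_{AB} \in \mathcal{S}(AB)$, $\tau_A \in \mathcal{S}(A)$, $n \in \mathbb{N}$ and $\mu_n \in \mathbb{R}$. Consider the test

\[
Q^n_{A^nB^n} := \Big\{ \mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}\big(\rho_{AB}^{\otimes n}\big) \ge \exp(\mu_n)\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \Big\}. \tag{107}
\]

Let $\{|\phi_{x_n}\rangle\}_{x_n}$ be a common orthonormal eigenbasis of $\mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}\big(\rho_{AB}^{\otimes n}\big)$ and $\tau_A^{\otimes n} \otimes \omega^n_{B^n}$, and define the probability distributions

\[
P_n(x_n) = \big\langle \phi_{x_n} \big|\, \mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}\big(\rho_{AB}^{\otimes n}\big) \big| \phi_{x_n} \big\rangle
\quad\text{and}\quad
Q_n(x_n) = \big\langle \phi_{x_n} \big|\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \big| \phi_{x_n} \big\rangle. \tag{108}
\]

Then, with $X_n$ distributed according to the law $P_n$ and $X_n'$ according to the law $Q_n$, we have

\begin{align}
\alpha(Q^n_{A^nB^n}; \rho_{AB}^{\otimes n}) &= \Pr\big[ P_n(X_n) < \exp(\mu_n)\, Q_n(X_n) \big], \quad\text{and} \tag{109} \\
\beta(Q^n_{A^nB^n}; \tau_A^{\otimes n}) &\le g_{n,d} \Pr\big[ P_n(X_n') \ge \exp(\mu_n)\, Q_n(X_n') \big], \tag{110}
\end{align}

where $d = |B|$.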

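Proof. It is easy to verify that the pinched quantity is permutation invariant, and thus $Q^n_{A^nB^n}$ is permutation invariant as well. Let us evaluate

\begin{align}
\beta(Q^n_{A^nB^n}; \tau_A^{\otimes n})
&= \max_{\sigma_{B^n}\in\mathcal{S}(B^n)} \operatorname{tr}\big[ Q^n_{A^nB^n} (\tau_A^{\otimes n} \otimes \sigma_{B^n}) \big] \tag{111} \\
&= \max_{\sigma_{B^n}\in\mathcal{S}(B^n)} \frac{1}{|S_n|} \sum_{\pi\in S_n} \operatorname{tr}\big[ U_{A^nB^n}(\pi)\, Q^n_{A^nB^n} (\tau_A^{\otimes n} \otimes \sigma_{B^n})\, U_{A^nB^n}(\pi)^\dagger \big] \tag{112} \\
&= \max_{\sigma_{B^n}\in\mathcal{S}(B^n)} \operatorname{tr}\big[ Q^n_{A^nB^n} (\tau_A^{\otimes n} \otimes \widetilde{\sigma}_{B^n}) \big], \tag{113}
\end{align}

where $\widetilde{\sigma}_{B^n} := \frac{1}{|S_n|} \sum_{\pi\in S_n} U_{B^n}(\pi)\, \sigma_{B^n}\, U_{B^n}(\pi)^\dagger$ is permutation invariant. Lemma 1 then yields

\begin{align}
\beta(Q^n_{A^nB^n}; \tau_A^{\otimes n})
&\le g_{n,d} \operatorname{tr}\big[ Q^n_{A^nB^n} (\tau_A^{\otimes n} \otimes \omega^n_{B^n}) \big] \tag{114} \\
&= g_{n,d} \operatorname{tr}\Big[ \Big\{ \mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}\big(\rho_{AB}^{\otimes n}\big) \ge \exp(\mu_n)\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \Big\} (\tau_A^{\otimes n} \otimes \omega^n_{B^n}) \Big]. \tag{115}
\end{align}

The two operators $\mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}\big(\rho_{AB}^{\otimes n}\big)$ and $\tau_A^{\otimes n} \otimes \omega^n_{B^n}$ in (115) commute. Let $\{|\phi_{x_n}\rangle\}_{x_n}$ be a common orthonormal eigenbasis for these operators and define the probability distributions in (108) as well as the corresponding random variables. Then we can simplify (115) by noting that

\[
\operatorname{tr}\Big[ \Big\{ \mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}\big(\rho_{AB}^{\otimes n}\big) \ge \exp(\mu_n)\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \Big\} (\tau_A^{\otimes n} \otimes \omega^n_{B^n}) \Big]
= \Pr\big[ P_n(X_n') \ge \exp(\mu_n)\, Q_n(X_n') \big], \tag{116}
\]

which yields (110). Finally, it is easy to verify that

\begin{align}
\alpha(Q^n_{A^nB^n}; \rho_{AB}^{\otimes n})
&= \operatorname{tr}\Big[ \Big\{ \mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}\big(\rho_{AB}^{\otimes n}\big) < \exp(\mu_n)\, \tau_A^{\otimes n} \otimes \omega^n_{B^n} \Big\}\, \mathcal{P}_{\tau_A^{\otimes n} \otimes \omega^n_{B^n}}\big(\rho_{AB}^{\otimes n}\big) \Big] \tag{117} \\
&= \Pr\big[ P_n(X_n) < \exp(\mu_n)\, Q_n(X_n) \big]. \tag{118}
\end{align}

B. Classical Large Deviation Theory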

Our proof will rely on a variant of the Gärtner-Ellis theorem of large deviation theory (see, e.g., [12, Sec. 2 and Sec. 3.4] for an overview), which we recall here. Given a sequence of random variables $\{Z_n\}_{n\in\mathbb{N}}$, we introduce its asymptotic cumulant generating function as

\[
\Lambda_Z(t) := \lim_{n\to\infty} \left\{ \frac{1}{n} \log \mathbb{E}\big[ \exp(n t Z_n) \big] \right\}, \tag{119}
\]

if it exists. For our purposes it is sufficient to use the following variant of the Gärtner-Ellis theorem due to Chen [5, Thm. 3.6] (see also [30, Lem. B.2]).

Proposition 16. Let us assume that $t \mapsto \Lambda_Z(t)$ as defined in (119) exists and is differentiable in some interval $(a, b)$. Then, for any $z \in \big( \lim_{t\searrow a} \Lambda_Z'(t),\ \lim_{t\nearrow b} \Lambda_Z'(t) \big)$, we have

\[
\limsup_{n\to\infty} \left\{ -\frac{1}{n} \log \Pr[Z_n \ge z] \right\} \le \sup_{t\in(a,b)} \big\{ zt - \Lambda_Z(t) \big\}. \tag{120}
\]

Finally, in order to evaluate the asymptotic cumulant generating function, we will employ the asymptotic achievability in Proposition 7, namely the fact that

\[
\lim_{n\to\infty} \frac{1}{n} D_\alpha(P_n\|Q_n) = \widetilde{I}_\alpha(\rho_{AB}\|\tau_A). \tag{121}
\]
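A standard toy case may help to read (119) and (120). If $Z_n$ is the sample mean of $n$ i.i.d. Bernoulli($p$) variables, then $\mathbb{E}[\exp(ntZ_n)] = (1-p+pe^t)^n$, so $\Lambda_Z(t) = \log(1-p+pe^t)$. The sketch below (not taken from the paper; the function names and the grid search are illustrative choices) approximates the Legendre-type expression on the right-hand side of (120) over $(a,b) = (0, t_{\max})$ and compares it with the binary relative entropy, which is its known closed form for $z \in (p, 1)$.

```python
import math

def cgf(t, p):
    """Asymptotic cumulant generating function (119) for the sample mean of
    i.i.d. Bernoulli(p) variables: log(1 - p + p * e^t)."""
    return math.log(1.0 - p + p * math.exp(t))

def legendre(z, p, t_max=20.0, steps=20000):
    """Grid approximation of sup_{t in (0, t_max)} { z * t - cgf(t) }."""
    return max(z * (t_max * k / steps) - cgf(t_max * k / steps, p)
               for k in range(1, steps))

def kl_bernoulli(z, p):
    """Binary relative entropy D(z || p) in nats."""
    return z * math.log(z / p) + (1.0 - z) * math.log((1.0 - z) / (1.0 - p))

print(legendre(0.6, 0.3), kl_bernoulli(0.6, 0.3))
```

The two printed numbers agree up to the grid resolution. In this example $\Lambda_Z'(t)$ increases from $p$ at $t \to 0$ to $1$ as $t \to \infty$, so the condition on $z$ in Proposition 16 reads $z \in (p, 1)$.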

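C. Proof of Achievability

We are now ready to present the proof of achievability, namely we show that

\[
\limsup_{n\to\infty} \left\{ -\frac{1}{n} \log \Big( 1 - \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big) \Big) \right\}
\le \sup_{s>1} \left\{ \frac{s-1}{s} \big( R - \widetilde{I}_s(\rho_{AB}\|\tau_A) \big) \right\}. \tag{122}
\]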

We restrict our attention to the case where $I(\rho_{AB}\|\tau_A) < R < \widetilde{I}_\infty(\rho_{AB}\|\tau_A)$, for which we provide a novel proof. For general $R > 0$ we refer the reader to a recent analysis of the strong converse exponent by Mosonyi and Ogawa [30] that can be adapted to cover the situation at hand here. (See also [33] for an earlier discussion of this issue in classical hypothesis testing.)

Proof of Achievability in Theorem 13. First, we introduce the function

\[
f : (s, t) \mapsto t \Big( R - s\, \widetilde{I}_{1+t}(\rho_{AB}\|\tau_A) + (s-1)\, \widetilde{I}_s(\rho_{AB}\|\tau_A) \Big). \tag{123}
\]

As is shown later, we have

\[
\limsup_{n\to\infty} \left\{ -\frac{1}{n} \log \Big( 1 - \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big) \Big) \right\}
\le \sup_{t>0} \frac{f(s,t)}{s} \tag{124}
\]

for all $s \in (1, \infty)$. Taking the infimum over $s$, we find that

\[
\limsup_{n\to\infty} \left\{ -\frac{1}{n} \log \Big( 1 - \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big) \Big) \right\}
\le \inf_{s>1}\, \sup_{t>0} \frac{f(s,t)}{s}. \tag{125}
\]

It is straightforward to verify that $f(s,t)$ is concave in $t$ and convex in $s$ since $t \mapsto t\, \widetilde{I}_{1+t}(\rho_{AB}\|\tau_A)$ is convex (cf. Corollary 9). Thus, by Proposition 18 in Appendix A, we have

\[
\inf_{s>1}\, \sup_{t>0} \frac{f(s,t)}{s} = \sup_{t>0}\, \inf_{s>1} \frac{f(s,t)}{s} \le \sup_{t>0} \frac{f(t+1,t)}{t+1}, \tag{126}
\]

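where we simply chose $s = t + 1$ in the last step. The latter term corresponds to the right-hand side of (122) by a suitable change of variable, since $f(t+1,t)/(t+1) = \frac{t}{t+1}\big( R - \widetilde{I}_{1+t}(\rho_{AB}\|\tau_A) \big)$. Then, we obtain (122). Hence, it suffices to show (124) for all $s \in (1, \infty)$. Given an arbitrary fixed $s \in (1, \infty)$, we choose a sequence $\{\mu_n\}_{n\in\mathbb{N}}$ of real numbers as

\[
\mu_n = \frac{1}{s} \Big( \log g_{n,d} + nR + (s-1)\, D_s(P_n\|Q_n) \Big). \tag{127}
\]

Consider the sequence of tests given by Lemma 15. Then, due to (110), we have

\begin{align}
\beta(Q^n_{A^nB^n}; \tau_A^{\otimes n})
&\le g_{n,d} \Pr\big[ P_n(X_n') \ge \exp(\mu_n)\, Q_n(X_n') \big] \tag{128} \\
&\le g_{n,d} \exp(-s\mu_n) \sum_{x} P_n(x)^{s}\, Q_n(x)^{1-s} \tag{129} \\
&= g_{n,d} \exp(-s\mu_n) \exp\big( (s-1)\, D_s(P_n\|Q_n) \big). \tag{130}
\end{align}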
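The step from (128) to (129) is a Markov-type bound: on the event $P_n(x) \ge \exp(\mu_n) Q_n(x)$ the factor $\big(P_n(x)/(\exp(\mu_n) Q_n(x))\big)^s$ is at least one. The following sketch checks this inequality numerically for a small pair of distributions; the distributions and thresholds are hypothetical and the prefactor $g_{n,d}$ is left out.

```python
import math

def tail_prob(p, q, mu):
    """Pr_{X ~ Q}[ P(X) >= exp(mu) * Q(X) ], the probability appearing in (128)."""
    return sum(qx for px, qx in zip(p, q) if px >= math.exp(mu) * qx)

def chernoff_bound(p, q, mu, s):
    """exp(-s * mu) * sum_x P(x)^s Q(x)^(1-s), the right-hand side of (129)
    without the prefactor; equals exp(-s * mu + (s - 1) * D_s(P || Q))."""
    return math.exp(-s * mu) * sum(px ** s * qx ** (1.0 - s) for px, qx in zip(p, q))

# Hypothetical distributions with full support.
P = [0.7, 0.2, 0.1]
Q = [0.2, 0.3, 0.5]
for mu in (0.0, 0.5, 1.0, 1.5):
    for s in (1.5, 2.0, 3.0):
        assert tail_prob(P, Q, mu) <= chernoff_bound(P, Q, mu, s) + 1e-12
print(tail_prob(P, Q, 1.0), chernoff_bound(P, Q, 1.0, 2.0))
```

The bound is loose for a single fixed $s$; in the proof the freedom in $s > 1$ is what is later optimized.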

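Hence, the requirement that $\beta(Q^n_{A^nB^n}; \tau_A^{\otimes n}) \le \exp(-nR)$ is satisfied by the choice (127). Let us now take a closer look at the error of the first kind in (109). We find

\[
1 - \alpha(Q^n_{A^nB^n}; \rho_{AB}^{\otimes n}) = \Pr\big[ P_n(X_n) \ge \exp(\mu_n)\, Q_n(X_n) \big] = \Pr[Z_n \ge R], \tag{131}
\]

where we introduced the sequence of random variables $\{Z_n\}_{n\in\mathbb{N}}$ with

\begin{align}
Z_n(X_n) &:= \frac{1}{n} \big( \log P_n(X_n) - \log Q_n(X_n) - \mu_n \big) + R \tag{132} \\
&= \frac{1}{n} \big( \log P_n(X_n) - \log Q_n(X_n) \big) - \frac{\log g_{n,d}}{sn} + \frac{s-1}{s} \Big( R - \frac{1}{n} D_s(P_n\|Q_n) \Big). \tag{133}
\end{align}

Since $\beta(Q^n_{A^nB^n}; \tau_A^{\otimes n}) \le \exp(-nR)$ holds for our test, (131) yields

\[
1 - \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big) \ge \Pr[Z_n \ge R], \tag{134}
\]

which implies that

\[
\limsup_{n\to\infty} \left\{ -\frac{1}{n} \log \Big( 1 - \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big) \Big) \right\}
\le \limsup_{n\to\infty} \left\{ -\frac{1}{n} \log \Pr[Z_n \ge R] \right\}. \tag{135}
\]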

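Now, we want to tackle the asymptotics of this quantity using the Gärtner-Ellis theorem. We therefore calculate the asymptotic cumulant generating function, as in (119), for $t \ge -\tfrac{1}{2}$ as follows:

\begin{align}
\Lambda_Z(t) &= \lim_{n\to\infty} \left\{ \frac{1}{n} \log \mathbb{E}\big[ \exp(n t Z_n) \big] \right\} \tag{136} \\
&= \lim_{n\to\infty} \left\{ \frac{1}{n} \log \mathbb{E}\left[ \frac{P_n(X_n)^t}{Q_n(X_n)^t} \right] - \frac{t \log g_{n,d}}{sn} + \frac{t(s-1)}{s} \Big( R - \frac{1}{n} D_s(P_n\|Q_n) \Big) \right\} \tag{137} \\
&= t \lim_{n\to\infty} \left\{ \frac{1}{n} D_{1+t}(P_n\|Q_n) + \frac{s-1}{s} \Big( R - \frac{1}{n} D_s(P_n\|Q_n) \Big) \right\} \tag{138} \\
&= t \left( \widetilde{I}_{1+t}(\rho_{AB}\|\tau_A) + \frac{s-1}{s} \big( R - \widetilde{I}_s(\rho_{AB}\|\tau_A) \big) \right). \tag{139}
\end{align}

We used Proposition 7 in the form of (121) twice to establish (139). Now, it is easy to verify that $\Lambda_Z(t)$ is continuously differentiable for $t \ge -\tfrac{1}{2}$ due to Proposition 10. Furthermore, we have

\begin{align}
\lim_{t\to 0} \Lambda_Z'(t) &= I(\rho_{AB}\|\tau_A) + \frac{s-1}{s} \big( R - \widetilde{I}_s(\rho_{AB}\|\tau_A) \big) \tag{140} \\
&\le I(\rho_{AB}\|\tau_A) + \frac{s-1}{s} \big( R - I(\rho_{AB}\|\tau_A) \big) < R, \tag{141}
\end{align}

where we used that $R > I(\rho_{AB}\|\tau_A)$ and $\frac{s-1}{s} < 1$ in the last step. On the other hand, using the convexity of $t \mapsto t\, \widetilde{I}_{1+t}(\rho_{AB}\|\tau_A)$ (cf. Corollary 9), we find (see footnote 7)

\begin{align}
\lim_{t\to\infty} \Lambda_Z'(t) &\ge \widetilde{I}_\infty(\rho_{AB}\|\tau_A) + \frac{s-1}{s} \big( R - \widetilde{I}_s(\rho_{AB}\|\tau_A) \big) \tag{142} \\
&\ge \widetilde{I}_\infty(\rho_{AB}\|\tau_A) + \frac{s-1}{s} \big( R - \widetilde{I}_\infty(\rho_{AB}\|\tau_A) \big) > R, \tag{143}
\end{align}

where we used that $R < \widetilde{I}_\infty(\rho_{AB}\|\tau_A)$. Hence, we may apply Proposition 16, which yields

\[
\limsup_{n\to\infty} \left\{ -\frac{1}{n} \log \Pr[Z_n \ge R] \right\} \le \sup_{t>0} \big\{ tR - \Lambda_Z(t) \big\} = \sup_{t>0} \frac{f(s,t)}{s}. \tag{144}
\]

Therefore, the combination of (144) and (135) yields (124), which concludes the proof.

Footnote 7: More precisely, recall that $\phi(t) := t\, \widetilde{I}_{1+t}(\rho_{AB}\|\tau_A)$ is convex and $\phi(0) = 0$. This implies that $\phi(\lambda t) \le \lambda \phi(t)$ for all $\lambda \in (0,1)$. Moreover, $\phi'(t) = \lim_{\lambda\to 1} \frac{\phi(t) - \phi(\lambda t)}{t(1-\lambda)} \ge \frac{\phi(t)}{t}$.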

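D. Proof of Optimality

It remains to show that, for all $R > 0$,

\[
\liminf_{n\to\infty} \left\{ -\frac{1}{n} \log \Big( 1 - \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big) \Big) \right\}
\ge \sup_{s>1} \left\{ \frac{s-1}{s} \big( R - \widetilde{I}_s(\rho_{AB}\|\tau_A) \big) \right\}. \tag{145}
\]

Proof of Optimality in Theorem 13. Analogous to the optimality proof for Theorem 11, we first fix $\sigma_B \in \mathcal{S}(B)$ and this time apply the converse bound in [29, Thm. IV.9]. This yields

\begin{align}
& \liminf_{n\to\infty} \left\{ -\frac{1}{n} \log \Big( 1 - \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n}\big) \Big) \right\} \tag{146} \\
&\qquad \ge \liminf_{n\to\infty} \left\{ -\frac{1}{n} \log \Big( 1 - \hat{\alpha}\big(\exp(-nR);\, \rho_{AB}^{\otimes n} \big\| \tau_A^{\otimes n} \otimes \sigma_B^{\otimes n}\big) \Big) \right\} \tag{147} \\
&\qquad \ge \sup_{s>1} \left\{ \frac{s-1}{s} \big( R - \widetilde{D}_s(\rho_{AB}\|\tau_A \otimes \sigma_B) \big) \right\}. \tag{148}
\end{align}

Since this holds for all $\sigma_B \in \mathcal{S}(B)$, we may maximize the expression in (148) with respect to $\sigma_B$, yielding the desired result.

VII. SECOND ORDER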

For completeness, we also investigate the second order behavior, namely we investigate the error of the first kind when the error of the second kind vanishes as $\exp\big( -nI(\rho_{AB}\|\tau_A) - \sqrt{n}\, r \big)$. This analysis takes a step beyond the quantum Stein's lemma [22, 34]. Paralleling the results in [25, 43] for simple hypothesis tests, we find that the error of the first kind converges to a constant in this case.

Theorem 17. Let $\rho_{AB}$ be a bipartite quantum state on $AB$ and let $\tau_A$ be a quantum state on $A$. Then, for any $r \in \mathbb{R}$, we have

\[
\lim_{n\to\infty} \hat{\alpha}\Big( \exp\big( -nI(\rho_{AB}\|\tau_A) - \sqrt{n}\, r \big);\, \rho_{AB}^{\otimes n} \Big\| \tau_A^{\otimes n} \Big)
= \Phi\left( \frac{r}{\sqrt{V(\rho_{AB}\|\tau_A)}} \right), \tag{149}
\]

where $\Phi$ is the cumulative standard normal (Gaussian) distribution.

The proof of optimality follows directly from the bound in Eq. (76) and the second order expansion for binary quantum hypothesis testing independently established in [25] and [43], and we omit it here. Second-order achievability is proven in the following using the hypothesis test of Section VI together with Corollary 8.

A. Proof of Achievability

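Proof of Achievability in Theorem 17. We again use the test in Lemma 15 and set $\mu_n = nI(\rho_{AB}\|\tau_A) + \sqrt{n}\, r + \log g_{n,d}$. Then, Eq. (110) yields

\begin{align}
\beta(Q^n_{A^nB^n}; \tau_A^{\otimes n})
&\le g_{n,d} \Pr\big[ P_n(X_n') \ge \exp(\mu_n)\, Q_n(X_n') \big] \tag{150} \\
&\le g_{n,d} \exp(-\mu_n) \Pr\big[ P_n(X_n) \ge \exp(\mu_n)\, Q_n(X_n) \big] \tag{151} \\
&\le \exp\big( -nI(\rho_{AB}\|\tau_A) - \sqrt{n}\, r \big). \tag{152}
\end{align}

Moreover, using (109), we find

\begin{align}
\alpha(Q^n_{A^nB^n}; \rho_{AB}^{\otimes n})
&= \Pr\big[ \log P_n(X_n) - \log Q_n(X_n) < nI(\rho_{AB}\|\tau_A) + \sqrt{n}\, r + \log g_{n,d} \big] \tag{153} \\
&= \Pr[Y_n < r], \tag{154}
\end{align}

where we defined the following sequence of random variables:

\[
Y_n(X_n) := \frac{1}{\sqrt{n}} \big( \log P_n(X_n) - \log Q_n(X_n) - nI(\rho_{AB}\|\tau_A) - \log g_{n,d} \big), \tag{155}
\]

with $X_n$ distributed according to the law $P_n$ as usual. Now, note that the cumulant generating function of the sequence $\{Y_n\}_n$ converges to

\begin{align}
\Lambda_Y(t) &:= \lim_{n\to\infty} \log \mathbb{E}\big[ \exp(t Y_n) \big] \tag{156} \\
&= \lim_{n\to\infty} \left\{ \frac{t}{\sqrt{n}} \Big( D_{1+\frac{t}{\sqrt{n}}}(P_n\|Q_n) - nI(\rho_{AB}\|\tau_A) \Big) \right\} - \lim_{n\to\infty} \left\{ \frac{t}{\sqrt{n}} \log g_{n,d} \right\} \tag{157} \\
&= \frac{t^2}{2} V(\rho_{AB}\|\tau_A). \tag{158}
\end{align}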

In the last step we used Corollary 8 to evaluate the first term and the fact that $\log g_{n,d} = O(\log n)$ to evaluate the second term. Hence, by Lévy's continuity theorem (see, e.g., [14, Ch. 14, Thm. 21]), the sequence of random variables $\{Y_n\}_n$ converges in distribution to a random variable $Y$ with cumulant generating function $\Lambda_Y(t)$, i.e., a Gaussian random variable with zero mean and variance $V(\rho_{AB}\|\tau_A)$. In particular, this yields

\[
\lim_{n\to\infty} \Pr[Y_n < r] = \Pr[Y < r] = \Phi\left( \frac{r}{\sqrt{V(\rho_{AB}\|\tau_A)}} \right). \tag{159}
\]

Finally, due to (152) we have

\[
\limsup_{n\to\infty} \hat{\alpha}\Big( \exp\big( -nI(\rho_{AB}\|\tau_A) - \sqrt{n}\, r \big);\, \rho_{AB}^{\otimes n} \Big\| \tau_A^{\otimes n} \Big)
\le \lim_{n\to\infty} \alpha(Q^n_{A^nB^n}; \rho_{AB}^{\otimes n}). \tag{160}
\]

Combining this with (154) and (159) concludes the proof.

Acknowledgements: MT thanks Milán Mosonyi for enlightening discussions throughout this project, for many comments that helped improve the presentation, and for sharing his notes on the Gärtner-Ellis theorem. We thank Mark M. Wilde for comments. MH is partially supported by a MEXT Grant-in-Aid for Scientific Research (A) No. 23246071, and by the National Institute of Information and Communication Technology (NICT), Japan. MT acknowledges support from the MOE Tier 3 Grant “Random numbers from quantum processes” (MOE2012-T3-1-009). The Centre for Quantum Technologies is funded by the Singapore Ministry of Education and the National Research Foundation as part of the Research Centers of Excellence program.

Appendix A: A Minimax Theorem

Here we show a useful minimax theorem that is essentially a corollary of König's minimax theorem [24]. First, we need to introduce a weaker notion of concavity and convexity. A function $f : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ is 1/2 concavelike in $\mathcal{X}$ if, for every $x_1, x_2 \in \mathcal{X}$, there exists $x_3 \in \mathcal{X}$ such that

\[
f(x_3, y) \ge \frac{1}{2} \big( f(x_1, y) + f(x_2, y) \big) \quad\text{for every } y \in \mathcal{Y}. \tag{A1}
\]

Analogously, $f$ is 1/2 convexlike in $\mathcal{Y}$ if, for every $y_1, y_2 \in \mathcal{Y}$, there exists $y_3 \in \mathcal{Y}$ such that

\[
f(x, y_3) \le \frac{1}{2} \big( f(x, y_1) + f(x, y_2) \big) \quad\text{for every } x \in \mathcal{X}. \tag{A2}
\]

König's minimax theorem now reads as follows [24] (see also [23]). Let $\mathcal{Y}$ be a compact Hausdorff space and let $f(x, \cdot)$ be lower-semicontinuous for every $x \in \mathcal{X}$. Moreover, let $f$ be 1/2 concavelike in $\mathcal{X}$ and 1/2 convexlike in $\mathcal{Y}$. Then, we have

\[
\sup_{x\in\mathcal{X}}\, \inf_{y\in\mathcal{Y}} f(x, y) = \inf_{y\in\mathcal{Y}}\, \sup_{x\in\mathcal{X}} f(x, y). \tag{A3}
\]

We sacrifice a bit of generality to state the following result.

Proposition 18. Let $\mathcal{X} \subset \mathbb{R}_+$ be convex and let $\mathcal{Y}$ be a convex compact Hilbert space. Further, let $f : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ be a function that is convex in $\mathcal{Y}$ and concave in $\mathcal{X}$. Then,

\[
\sup_{x\in\mathcal{X}}\, \inf_{y\in\mathcal{Y}} \frac{f(x, y)}{x} = \inf_{y\in\mathcal{Y}}\, \sup_{x\in\mathcal{X}} \frac{f(x, y)}{x}. \tag{A4}
\]

Proof. We just need to show that $\frac{f(x,y)}{x}$ satisfies the conditions required for (A3) to hold. First, since $f$ is a convex function on the convex set $\mathcal{Y}$, it is in particular 1/2 convexlike and lower-semicontinuous in $\mathcal{Y}$. It is also 1/2 concavelike in $\mathcal{X}$ due to the following argument. Let $x_1, x_2 \in \mathcal{X}$ with $x_1 < x_2$ and $y \in \mathcal{Y}$ be arbitrary. We have

\begin{align}
\frac{1}{2} \left( \frac{f(x_1, y)}{x_1} + \frac{f(x_2, y)}{x_2} \right)
&= \frac{x_1 + x_2}{2 x_1 x_2} \left( \frac{x_2}{x_1 + x_2} f(x_1, y) + \frac{x_1}{x_1 + x_2} f(x_2, y) \right) \tag{A5} \\
&\le \frac{x_1 + x_2}{2 x_1 x_2}\, f\left( \frac{2 x_1 x_2}{x_1 + x_2},\, y \right) \tag{A6}
\end{align}

by the concavity of $f(\cdot, y)$. Thus, choosing $x_3 = \frac{2 x_1 x_2}{x_1 + x_2} \in [x_1, x_2] \subset \mathcal{X}$, we see that $\frac{f(x,y)}{x}$ is indeed 1/2 concavelike according to (A1).
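The key step (A5) to (A6) says that the harmonic mean of $x_1$ and $x_2$ serves as the midpoint $x_3$ required in (A1) for the function $x \mapsto f(x,y)/x$. The sketch below checks this numerically for a fixed $y$, using a concave test function that is purely an illustrative choice.

```python
import math

def harmonic_midpoint(x1, x2):
    """The point x3 = 2 * x1 * x2 / (x1 + x2) chosen in the proof."""
    return 2.0 * x1 * x2 / (x1 + x2)

def concavelike_gap(f, x1, x2):
    """f(x3)/x3 - (f(x1)/x1 + f(x2)/x2)/2 with x3 the harmonic mean.

    By (A5)-(A6) this is non-negative whenever f is concave on the positive reals.
    """
    x3 = harmonic_midpoint(x1, x2)
    return f(x3) / x3 - 0.5 * (f(x1) / x1 + f(x2) / x2)

# Hypothetical concave test function; it takes negative values near zero on purpose.
f = lambda x: math.log(1.0 + x) - 0.3
points = [0.1, 0.5, 1.0, 2.0, 5.0]
print(min(concavelike_gap(f, a, b) for a in points for b in points))
```

For an affine function the gap vanishes, because the weights in (A5) average $x_1$ and $x_2$ to exactly the harmonic mean.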

Appendix B: Proof of Lemma 4

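Proof. By symmetry it is sufficient to prove the statement for $\alpha \in \big[\tfrac{1}{2}, 1\big)$ so that $\beta > 0$. First, recall that

\[
\exp\big( -\gamma\, \widetilde{I}_\alpha(\rho_{AB}\|\tau_A) \big) = \sup_{\sigma_B\in\mathcal{S}(B)} \big\| \big( \tau_A^{\gamma} \otimes \sigma_B^{\gamma} \big) \times \rho_{AB} \big\|_\alpha, \tag{B1}
\]

where we set $\gamma := \frac{1-\alpha}{\alpha} \in (0, 1]$ and use the notation $L \times R$ to denote the Hermitian operator $L^{1/2} R L^{1/2}$. By introducing the purification $\rho_{ABC}$, we see that

\begin{align}
\sup_{\sigma_B\in\mathcal{S}(B)} \big\| \big( \tau_A^{\gamma} \otimes \sigma_B^{\gamma} \big) \times \rho_{AB} \big\|_\alpha
&= \sup_{\sigma_B\in\mathcal{S}(B)} \Big\| \operatorname{tr}_C\big[ \big( \tau_A^{\gamma} \otimes \sigma_B^{\gamma} \otimes \mathbb{1}_C \big) \times \rho_{ABC} \big] \Big\|_\alpha \tag{B2} \\
&= \sup_{\sigma_B\in\mathcal{S}(B)} \Big\| \operatorname{tr}_{AB}\big[ \big( \tau_A^{\gamma} \otimes \sigma_B^{\gamma} \otimes \mathbb{1}_C \big) \times \rho_{ABC} \big] \Big\|_\alpha \tag{B3} \\
&= \sup_{\sigma_B\in\mathcal{S}(B)}\ \inf_{\substack{\sigma_C\in\mathcal{S}(C) \\ \sigma_C > 0}} \operatorname{tr}\Big[ \sigma_C^{-\gamma} \operatorname{tr}_{AB}\big[ \big( \tau_A^{\gamma} \otimes \sigma_B^{\gamma} \otimes \mathbb{1}_C \big) \times \rho_{ABC} \big] \Big] \tag{B4} \\
&= \sup_{\sigma_B\in\mathcal{S}(B)}\ \inf_{\substack{\sigma_C\in\mathcal{S}(C) \\ \sigma_C > 0}} \operatorname{tr}\Big[ \big( \tau_A^{\gamma} \otimes \sigma_B^{\gamma} \otimes \sigma_C^{-\gamma} \big) \rho_{ABC} \Big], \tag{B5}
\end{align}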

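where we employed [31, Lm. 12] to establish (B4). Now, it is easy to verify that $\mathcal{S}(B)$ is convex and compact, the set of strictly positive elements of $\mathcal{S}(C)$ is convex, and the function $\operatorname{tr}\big[ \big( \tau_A^{\gamma} \otimes \sigma_B^{\gamma} \otimes \sigma_C^{-\gamma} \big) \rho_{ABC} \big]$ is concave in $\sigma_B$ and convex in $\sigma_C$. Thus, Sion's minimax theorem [40] applies and yields the following alternative expression:

\begin{align}
\exp\big( -\gamma\, \widetilde{I}_\alpha(\rho_{AB}\|\tau_A) \big)
&= \inf_{\substack{\sigma_C\in\mathcal{S}(C) \\ \sigma_C > 0}}\ \sup_{\sigma_B\in\mathcal{S}(B)} \operatorname{tr}\Big[ \big( \tau_A^{\gamma} \otimes \sigma_B^{\gamma} \otimes \sigma_C^{-\gamma} \big) \rho_{ABC} \Big] \tag{B6} \\
&= \inf_{\substack{\sigma_C\in\mathcal{S}(C) \\ \sigma_C > 0}}\ \sup_{\sigma_B\in\mathcal{S}(B)} \operatorname{tr}\Big[ \sigma_B^{\gamma} \operatorname{tr}_{AC}\big[ \big( \tau_A^{\gamma} \otimes \sigma_C^{-\gamma} \big) \times \rho_{ABC} \big] \Big] \tag{B7} \\
&= \inf_{\substack{\sigma_C\in\mathcal{S}(C) \\ \sigma_C > 0}} \Big\| \operatorname{tr}_{AC}\big[ \big( \tau_A^{\gamma} \otimes \sigma_C^{-\gamma} \big) \times \rho_{ABC} \big] \Big\|_\beta \tag{B8} \\
&= \inf_{\substack{\sigma_C\in\mathcal{S}(C) \\ \sigma_C > 0}} \big\| \big( \tau_A^{\gamma} \otimes \sigma_C^{-\gamma} \big) \times \rho_{AC} \big\|_\beta. \tag{B9}
\end{align}

We again used [31, Lm. 12] to establish (B8) and note that $\beta = \frac{\alpha}{2\alpha - 1} = \frac{1}{1-\gamma}$. Substituting $\gamma = -\frac{1-\beta}{\beta}$ in (B9) establishes the desired equality.

Appendix C: Differentiability of the Rényi Mutual Information

1. Directional Derivatives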

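We define the following directional derivatives. For two density operators $\sigma, \omega \in \mathcal{S}$, define

\[
\partial_\omega f(\sigma) = \lim_{s\searrow 0} \frac{\partial}{\partial s} f\big( (1-s)\sigma + s\omega \big)
\quad\text{and}\quad
\partial_\omega^2 f(\sigma) = \lim_{s\searrow 0} \frac{\partial^2}{\partial s^2} f\big( (1-s)\sigma + s\omega \big). \tag{C1}
\]

First, we establish the following property.

Lemma 19. For $\gamma \in (0, 1)$, we have

\begin{align}
\partial_\omega f(\sigma)^\gamma &= \frac{\sin(\pi\gamma)}{\pi} \int_0^\infty \mathrm{d}t\; t^\gamma\, \big( t\mathbb{1} + f(\sigma) \big)^{-1} \partial_\omega f(\sigma)\, \big( t\mathbb{1} + f(\sigma) \big)^{-1}, \tag{C2} \\
\partial_\omega^2 f(\sigma)^\gamma &= \frac{\sin(\pi\gamma)}{\pi} \int_0^\infty \mathrm{d}t\; t^\gamma \Big( \big( t\mathbb{1} + f(\sigma) \big)^{-1} \partial_\omega^2 f(\sigma)\, \big( t\mathbb{1} + f(\sigma) \big)^{-1} \tag{C3} \\
&\qquad - 2 \big( t\mathbb{1} + f(\sigma) \big)^{-1} \partial_\omega f(\sigma)\, \big( t\mathbb{1} + f(\sigma) \big)^{-1} \partial_\omega f(\sigma)\, \big( t\mathbb{1} + f(\sigma) \big)^{-1} \Big). \tag{C4}
\end{align}

Furthermore, the same relations hold if $\gamma \in (-1, 0)$ and $f(\sigma) > 0$.

Proof. Let $\sigma(s)$ be an arbitrary function of $s$, and let $\dot{\sigma}(s)$ and $\ddot{\sigma}(s)$ be its first and second derivative with respect to $s$, respectively. For $\gamma \in (0, 1)$, we have the following integral representation:

\[
\sigma(s)^\gamma = \frac{\sin(\pi\gamma)}{\pi} \int_0^\infty \mathrm{d}t\; t^{\gamma-1}\, \sigma(s) \big( t\mathbb{1} + \sigma(s) \big)^{-1}. \tag{C5}
\]

We also recall that

\[
\frac{\partial}{\partial s} \big( t\mathbb{1} + \sigma(s) \big)^{-1} = - \big( t\mathbb{1} + \sigma(s) \big)^{-1} \dot{\sigma}(s) \big( t\mathbb{1} + \sigma(s) \big)^{-1}. \tag{C6}
\]

This allows us to evaluate

\begin{align}
\frac{\partial}{\partial s} \sigma(s)^\gamma
&= \frac{\sin(\pi\gamma)}{\pi} \int_0^\infty \mathrm{d}t\; t^{\gamma-1} \Big( \dot{\sigma}(s) \big( t\mathbb{1} + \sigma(s) \big)^{-1} - \sigma(s) \big( t\mathbb{1} + \sigma(s) \big)^{-1} \dot{\sigma}(s) \big( t\mathbb{1} + \sigma(s) \big)^{-1} \Big) \tag{C7} \\
&= \frac{\sin(\pi\gamma)}{\pi} \int_0^\infty \mathrm{d}t\; t^{\gamma} \big( t\mathbb{1} + \sigma(s) \big)^{-1} \dot{\sigma}(s) \big( t\mathbb{1} + \sigma(s) \big)^{-1} \quad\text{and} \tag{C8} \\
\frac{\partial^2}{\partial s^2} \sigma(s)^\gamma
&= \frac{\sin(\pi\gamma)}{\pi} \int_0^\infty \mathrm{d}t\; t^{\gamma} \Big( \big( t\mathbb{1} + \sigma(s) \big)^{-1} \ddot{\sigma}(s) \big( t\mathbb{1} + \sigma(s) \big)^{-1} \tag{C9} \\
&\qquad - 2 \big( t\mathbb{1} + \sigma(s) \big)^{-1} \dot{\sigma}(s) \big( t\mathbb{1} + \sigma(s) \big)^{-1} \dot{\sigma}(s) \big( t\mathbb{1} + \sigma(s) \big)^{-1} \Big). \tag{C10}
\end{align}

Finally, for $\gamma \in (-1, 0)$, multiplying (C5) by $\sigma(s)^{-1}$, we find

\begin{align}
\sigma(s)^\gamma = \sigma(s)^{-1} \sigma(s)^{\gamma+1}
&= \frac{\sin(\pi(\gamma+1))}{\pi} \int_0^\infty \mathrm{d}t\; t^{\gamma} \big( t\mathbb{1} + \sigma(s) \big)^{-1} \tag{C11} \\
&= - \frac{\sin(\pi\gamma)}{\pi} \int_0^\infty \mathrm{d}t\; t^{\gamma} \big( t\mathbb{1} + \sigma(s) \big)^{-1}, \tag{C12}
\end{align}

and the result follows by taking the derivative.

2. Characterization of the Optimal Marginal State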

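The following lemma characterizes the marginal state that minimizes the sandwiched mutual information.

Lemma 20. Let $\rho_{AB} \in \mathcal{S}(AB)$ and $\tau_A \in \mathcal{S}(A)$ and let $\alpha \in \big[\tfrac{1}{2}, 1\big)$. Consider the functional

\[
\sigma_B \mapsto \chi(\sigma_B) := \operatorname{tr}\Big[ \Big( (\tau_A \otimes \sigma_B)^{\frac{1-\alpha}{2\alpha}}\, \rho_{AB}\, (\tau_A \otimes \sigma_B)^{\frac{1-\alpha}{2\alpha}} \Big)^{\alpha} \Big]. \tag{C13}
\]

Then, $\sup_{\sigma_B\in\mathcal{S}(B)} \chi(\sigma_B)$ is uniquely achieved by the fixed-point of the following non-linear map:

\[
\sigma_B \mapsto X_\alpha(\sigma_B) := \frac{1}{\chi(\sigma_B)} \operatorname{tr}_A\Big[ \Big( (\tau_A \otimes \sigma_B)^{\frac{1-\alpha}{2\alpha}}\, \rho_{AB}\, (\tau_A \otimes \sigma_B)^{\frac{1-\alpha}{2\alpha}} \Big)^{\alpha} \Big]. \tag{C14}
\]

We focus here on the case where $\alpha \in \big[\tfrac{1}{2}, 1\big)$ since this is sufficient for our purposes. However, the same characterization is also expected to hold for $\alpha > 1$. In fact, our proof below shows that any fixed point of the map $X_\alpha$ achieves $\sup_{\sigma_B\in\mathcal{S}(B)} \chi(\sigma_B)$ for all $\alpha \ge \tfrac{1}{2}$; however, we do not show uniqueness for $\alpha > 1$.

Proof. First, we derive a sufficient condition for a state $\sigma_B$ to be a maximizer of the above optimization as follows. Let $\alpha \in (\tfrac{1}{2}, 1) \cup (1, \infty)$ and set $\gamma = \frac{1-\alpha}{\alpha} \in (0, 1)$. Moreover, set

\[
X = \tau^{\gamma/2} \rho^{1/2}, \quad\text{such that}\quad \chi(\sigma) = \operatorname{tr}\big[ \big( X^\dagger \sigma^\gamma X \big)^{\alpha} \big], \tag{C15}
\]

where we omitted the identity symbol and dropped the subscripts to make the presentation more concise in the following. Since $\chi$ is concave [13], a sufficient condition for $\sigma$ to be a maximizer is that $\partial_\omega f(\sigma) = 0$ for all $\omega_B \in \mathcal{S}(B)$ (here and below, $f$ denotes the functional $\chi$). Using the cyclicity of the trace (see also Lemma 19), we find that

\begin{align}
\partial_\omega f(\sigma) &= \alpha \operatorname{tr}\Big[ \big( X^\dagger \sigma^\gamma X \big)^{\alpha-1} \cdot \partial_\omega \big( X^\dagger \sigma^\gamma X \big) \Big] \tag{C16} \\
&= \alpha \operatorname{tr}\Big[ \sigma^{\gamma/2} X \big( X^\dagger \sigma^\gamma X \big)^{\alpha-1} X^\dagger \sigma^{\gamma/2} \cdot \sigma^{-\gamma/2} \big( \partial_\omega \sigma^\gamma \big) \sigma^{-\gamma/2} \Big] \tag{C17} \\
&= \alpha \operatorname{tr}\Big[ \operatorname{tr}_A\big[ \big( \sigma^{\gamma/2} X X^\dagger \sigma^{\gamma/2} \big)^{\alpha} \big] \cdot \sigma^{-\gamma/2} \big( \partial_\omega \sigma^\gamma \big) \sigma^{-\gamma/2} \Big]. \tag{C18}
\end{align}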

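Now let us check the condition $\partial_\omega f(\bar{\sigma}) = 0$ for an arbitrary fixed-point $\bar{\sigma}$ of the map $X_\alpha$. Such a state always exists by Brouwer's fixed-point theorem, and by definition satisfies

\[
\chi(\bar{\sigma}) \cdot \bar{\sigma} = \operatorname{tr}_A\big[ \big( \bar{\sigma}^{\gamma/2} X X^\dagger \bar{\sigma}^{\gamma/2} \big)^{\alpha} \big]. \tag{C19}
\]

Substituting this into (C18) yields

\[
\frac{1}{\alpha \chi(\bar{\sigma})}\, \partial_\omega f(\bar{\sigma}) = \operatorname{tr}\big[ \bar{\sigma}^{1-\gamma} \cdot \partial_\omega \bar{\sigma}^{\gamma} \big] = \gamma \operatorname{tr}\big[ \partial_\omega \bar{\sigma} \big] = \gamma \operatorname{tr}\big[ \omega - \bar{\sigma} \big] = 0. \tag{C20}
\]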

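Here, we again took advantage of the cyclicity of the trace to evaluate the derivative $\partial_\omega \bar{\sigma}^{\gamma}$. Hence, we have established that $\bar{\sigma}$ is indeed a maximizer. In the following, we will show that $\bar{\sigma}$ is unique. Since $\chi$ is concave, in order to establish uniqueness it is sufficient to prove that $\partial_\omega^2 f(\bar{\sigma}) < 0$ for all $\omega \ne \bar{\sigma}$ for one maximizer $\bar{\sigma}$. Using Lemma 19, we find

\begin{align}
\partial_\omega^2 f(\sigma) = \frac{\sin(\pi\alpha)}{\pi} \int_0^\infty \mathrm{d}t\; t^{\alpha} \Big( & \operatorname{tr}\Big[ \big( t\mathbb{1} + Q(\sigma) \big)^{-1} \partial_\omega^2 Q(\sigma)\, \big( t\mathbb{1} + Q(\sigma) \big)^{-1} \Big] \tag{C21} \\
& - 2 \operatorname{tr}\Big[ \big( t\mathbb{1} + Q(\sigma) \big)^{-1} \partial_\omega Q(\sigma)\, \big( t\mathbb{1} + Q(\sigma) \big)^{-1} \partial_\omega Q(\sigma)\, \big( t\mathbb{1} + Q(\sigma) \big)^{-1} \Big] \Big), \tag{C22}
\end{align}

where we set $Q(\sigma) = X^\dagger \sigma^\gamma X$. Due to the cyclicity of the trace, we can integrate the first term. Furthermore, we note that the second trace term is never negative, and thus

\begin{align}
\partial_\omega^2 f(\sigma) &\le \alpha \operatorname{tr}\Big[ \big( X^\dagger \sigma^\gamma X \big)^{\alpha-1} \cdot \partial_\omega^2 \big( X^\dagger \sigma^\gamma X \big) \Big] \tag{C23} \\
&= \alpha \operatorname{tr}\Big[ \operatorname{tr}_A\big[ \big( \sigma^{\gamma/2} X X^\dagger \sigma^{\gamma/2} \big)^{\alpha} \big] \cdot \sigma^{-\gamma/2} \big( \partial_\omega^2 \sigma^\gamma \big) \sigma^{-\gamma/2} \Big]. \tag{C24}
\end{align}

Specializing this to a fixed-point $\bar{\sigma}$ satisfying (C19) yields

\[
\partial_\omega^2 f(\bar{\sigma}) \le \alpha \chi(\bar{\sigma}) \operatorname{tr}\big[ \bar{\sigma}^{1-\gamma} \cdot \partial_\omega^2 \bar{\sigma}^{\gamma} \big]. \tag{C25}
\]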

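Furthermore, evaluating the derivative using Lemma 19 and using the fact that $\partial_\omega \sigma = \omega - \sigma$ and $\partial_\omega^2 \sigma = 0$, we find that the trace term on the right-hand side evaluates to

\begin{align}
& \operatorname{tr}\big[ \bar{\sigma}^{1-\gamma} \cdot \partial_\omega^2 \bar{\sigma}^{\gamma} \big] \tag{C26} \\
&\qquad = - \frac{2 \sin(\pi\gamma)}{\pi} \int_0^\infty \mathrm{d}t\; t^{\gamma} \operatorname{tr}\Big[ \bar{\sigma}^{1-\gamma} \big( t\mathbb{1} + \bar{\sigma} \big)^{-1} (\omega - \bar{\sigma}) \big( t\mathbb{1} + \bar{\sigma} \big)^{-1} (\omega - \bar{\sigma}) \big( t\mathbb{1} + \bar{\sigma} \big)^{-1} \Big] \tag{C27} \\
&\qquad \le - \frac{2 \sin(\pi\gamma)}{\pi} \int_0^\infty \mathrm{d}t\; \frac{t^{\gamma}}{t+1} \operatorname{tr}\Big[ \bar{\sigma}^{1-\gamma} \big( t\mathbb{1} + \bar{\sigma} \big)^{-2} (\omega - \bar{\sigma})^2 \Big], \tag{C28}
\end{align}

where we used that $\big( t\mathbb{1} + \bar{\sigma} \big)^{-1} \ge (t+1)^{-1} \mathbb{1}$ and the cyclicity of the trace in the last step. We can now integrate this operator to get

\[
Z(\bar{\sigma}) = - \frac{2 \sin(\pi\gamma)}{\pi} \int_0^\infty \mathrm{d}t\; \frac{t^{\gamma}}{t+1}\, \bar{\sigma}^{1-\gamma} \big( t\mathbb{1} + \bar{\sigma} \big)^{-2}
= 2 (\mathbb{1} - \bar{\sigma})^{-2} \big( \bar{\sigma}^{1-\gamma} - \gamma (\mathbb{1} - \bar{\sigma}) - \bar{\sigma} \big). \tag{C29}
\]

Since $t \mapsto t^{1-\gamma}$ is strictly concave, taking the tangent at $t = 1$ yields $\bar{\sigma}^{1-\gamma} \le \mathbb{1} - (1-\gamma)(\mathbb{1} - \bar{\sigma})$, where the inequality is strict unless $\bar{\sigma}$ has a unit eigenvalue. Hence, we find $Z(\bar{\sigma}) < 0$ and, collecting the results starting from (C25), we have

\[
\partial_\omega^2 f(\bar{\sigma}) \le \alpha \chi(\bar{\sigma}) \operatorname{tr}\big[ Z(\bar{\sigma}) (\omega - \bar{\sigma})^2 \big] < 0 \tag{C30}
\]

for all $\omega \ne \bar{\sigma}$.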
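In the classical (commuting) special case the fixed-point characterization of Lemma 20 can be explored directly. The sketch below is an illustration under that assumption only: $\rho_{AB}$ is replaced by a joint distribution `p_xy`, $\tau_A$ by a distribution `t_x`, and $\sigma_B$ by a distribution `q_y` (all hypothetical names), so that $\chi(Q) = \sum_y a_y Q(y)^{1-\alpha}$ with $a_y = \sum_x P(x,y)^\alpha T(x)^{1-\alpha}$ and the map (C14) becomes $Q(y) \mapsto a_y Q(y)^{1-\alpha} / \chi(Q)$. Plain iteration of the map is used here as a convenient way to locate the fixed point; it is not a procedure taken from the proof.

```python
def chi_and_map(p_xy, t_x, q_y, alpha):
    """Classical versions of chi(Q) in (C13) and of the map X_alpha(Q) in (C14)."""
    terms = []
    for y, qy in enumerate(q_y):
        a_y = sum(p_xy[x][y] ** alpha * t_x[x] ** (1.0 - alpha) for x in range(len(t_x)))
        terms.append(a_y * qy ** (1.0 - alpha))
    chi = sum(terms)
    return chi, [term / chi for term in terms]

def fixed_point(p_xy, t_x, alpha, iterations=200):
    """Iterate the map from the uniform distribution (an illustrative choice)."""
    q = [1.0 / len(p_xy[0])] * len(p_xy[0])
    for _ in range(iterations):
        _, q = chi_and_map(p_xy, t_x, q, alpha)
    return q

# Hypothetical example with alpha in [1/2, 1).
P = [[0.5, 0.1], [0.1, 0.3]]
T = [0.6, 0.4]
q_star = fixed_point(P, T, 0.7)
chi_star, _ = chi_and_map(P, T, q_star, 0.7)
print(q_star, chi_star)
```

In this commuting case the fixed point is proportional to $a_y^{1/\alpha}$, and comparing `chi_star` with the value of `chi_and_map` at other distributions illustrates that the fixed point maximizes $\chi$.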

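3. Proof of Proposition 10

It remains to show that $\alpha \mapsto \widetilde{I}_\alpha(\rho_{AB}\|\tau_A)$ is differentiable for $\alpha \ge \tfrac{1}{2}$.

Proof. For $\alpha \in \big[\tfrac{1}{2}, 1\big)$ we recall Lemma 20, which establishes that $\widetilde{I}_\alpha(\rho_{AB}\|\tau_A) = \widetilde{D}_\alpha\big(\rho_{AB} \big\| \tau_A \otimes \bar{\sigma}_B(\alpha)\big)$, where $\bar{\sigma}_B(\alpha)$ is the unique zero of the functional $X_\alpha(\bar{\sigma}_B) - \bar{\sigma}_B$. Since $X_\alpha$ is continuously differentiable in $\alpha$, its zero $\bar{\sigma}_B(\alpha)$ is continuously differentiable in $\alpha$ as well by the implicit function theorem. Thus, by the continuity of $\widetilde{D}_\alpha\big(\rho_{AB} \big\| \tau_A \otimes \bar{\sigma}_B(\alpha)\big)$ in $\alpha$ and $\bar{\sigma}_B$, this implies that $\widetilde{I}_\alpha(\rho_{AB}\|\tau_A)$ is continuously differentiable. For $\alpha > 1$, the same result follows due to the duality relation in Lemma 4.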

[1] K. M. R. Audenaert, L. Masanes, A. Acín, and F. Verstraete. Discriminating States: The Quantum Chernoff Bound. Phys. Rev. Lett., 98(16), Apr. 2007. DOI: 10.1103/PhysRevLett.98.160501.
[2] K. M. R. Audenaert, M. Mosonyi, and F. Verstraete. Quantum State Discrimination Bounds for Finite Sample Size. J. Math. Phys., 53(12):122205, Apr. 2012. DOI: 10.1063/1.4768252.
[3] K. M. R. Audenaert, M. Nussbaum, A. Szkoła, and F. Verstraete. Asymptotic Error Rates in Quantum Hypothesis Testing. Commun. Math. Phys., 279(1):251–283, Feb. 2008. DOI: 10.1007/s00220-008-0417-5.
[4] S. Beigi. Sandwiched Rényi Divergence Satisfies Data Processing Inequality. J. Math. Phys., 54(12):122202, June 2013. DOI: 10.1063/1.4838855.
[5] P.-N. Chen. Generalization of Gärtner-Ellis Theorem. IEEE Trans. on Inf. Theory, 46(7):2752–2760, 2000.
[6] M. Christandl. The Structure of Bipartite Quantum States - Insights from Group Theory and Cryptography. PhD thesis, Apr. 2006. Available online: http://arxiv.org/abs/quant-ph/0604183.
[7] M. Christandl, R. König, and R. Renner. Postselection Technique for Quantum Channels with Applications to Quantum Cryptography. Phys. Rev. Lett., 102(2), Jan. 2009. DOI: 10.1103/PhysRevLett.102.020504.
[8] N. Ciganovic, N. J. Beaudry, and R. Renner. Smooth Max-Information as One-Shot Generalization for Mutual Information. IEEE Trans. on Inf. Theory, 60(3):1573–1581, Mar. 2014. DOI: 10.1109/TIT.2013.2295314.
[9] T. Cooney, M. Mosonyi, and M. M. Wilde. Strong Converse Exponents for a Quantum Channel Discrimination Problem and Quantum-Feedback-Assisted Communication. Aug. 2014. arXiv: 1408.3373.
[10] N. Datta. Min- and Max- Relative Entropies and a New Entanglement Monotone. IEEE Trans. on Inf. Theory, 55(6):2816–2826, 2009. DOI: 10.1109/TIT.2009.2018325.
[11] N. Datta, M. Tomamichel, and M. M. Wilde. Second-Order Coding Rates for Entanglement-Assisted Communication. May 2014. arXiv: 1405.1797.
[12] A. Dembo and O. Zeitouni. Large Deviations Techniques and Applications. Stochastic Modelling and Applied Probability. Springer, 2nd edition, 1998.
[13] R. L. Frank and E. H. Lieb. Monotonicity of a Relative Rényi Entropy. J. Math. Phys., 54(12):122201, June 2013. DOI: 10.1063/1.4838835.
[14] B. E. Fristedt and L. F. Gray. A Modern Approach to Probability Theory. Probability and Its Applications. Birkhäuser, Boston, 1996.
[15] M. K. Gupta and M. M. Wilde. Multiplicativity of Completely Bounded p-Norms Implies a Strong Converse for Entanglement-Assisted Capacity. Oct. 2013. arXiv: 1310.7028.
[16] M. Hayashi. Optimal Sequence of Quantum Measurements in the Sense of Stein's Lemma in Quantum Hypothesis Testing. J. Phys. A: Math. Gen., 35(50):10759–10773, Dec. 2002. DOI: 10.1088/0305-4470/35/50/307.
[17] M. Hayashi. Quantum Information — An Introduction. Springer, 2006.
[18] M. Hayashi. Error Exponent in Asymmetric Quantum Hypothesis Testing and its Application to Classical-Quantum Channel Coding. Phys. Rev. A, 76(6):062301, Dec. 2007. DOI: 10.1103/PhysRevA.76.062301.
[19] M. Hayashi. Information Spectrum Approach to Second-Order Coding Rate in Channel Coding. IEEE Trans. on Inf. Theory, 55(11):4947–4966, Nov. 2009. DOI: 10.1109/TIT.2009.2030478.
[20] M. Hayashi. Universal Coding for Classical-Quantum Channel. Commun. Math. Phys., 289(3):1087–1098, May 2009. DOI: 10.1007/s00220-009-0825-1.
[21] M. Hayashi and H. Nagaoka. General Formulas for Capacity of Classical-Quantum Channels. IEEE Trans. on Inf. Theory, 49(7):1753–1768, July 2003. DOI: 10.1109/TIT.2003.813556.
[22] F. Hiai and D. Petz. The Proper Formula for Relative Entropy and its Asymptotics in Quantum Probability. Commun. Math. Phys., 143(1):99–114, Dec. 1991. DOI: 10.1007/BF02100287.
[23] G. Kassay. A Simple Proof for König's Minimax Theorem. Acta Mathematica Hungarica, 63(4):371–374, Dec. 1994. DOI: 10.1007/BF01874462.
[24] H. König. Über das von Neumannsche Minimax-Theorem. Archiv der Mathematik, 19(5):482–487, Dec. 1968. DOI: 10.1007/BF01898769.
[25] K. Li. Second-Order Asymptotics for Quantum Hypothesis Testing. Ann. Stat., 42(1):171–189, Feb. 2014. DOI: 10.1214/13-AOS1185.
[26] E. H. Lieb. Convex trace functions and the Wigner-Yanase-Dyson conjecture. Adv. in Math., 11(3):267–288, Dec. 1973. DOI: 10.1016/0001-8708(73)90011-X.
[27] S. Lin. Investigating Properties of a New Rényi Divergence. Final year dissertation, National University of Singapore, 2014. Available online: http://quantuminfo.quantumlah.org/etc/CP4101_Final_Report.pdf.
[28] W. Matthews and S. Wehner. Finite blocklength converse bounds for quantum channels. Oct. 2012. arXiv: 1210.4722.
[29] M. Mosonyi and T. Ogawa. Quantum Hypothesis Testing and the Operational Interpretation of the Quantum Renyi Relative Entropies. Sept. 2013. arXiv: 1309.3228.
[30] M. Mosonyi and T. Ogawa. The strong converse rate of quantum hypothesis testing for correlated quantum states. July 2014. arXiv: 1407.3567.
[31] M. Müller-Lennert, F. Dupuis, O. Szehr, S. Fehr, and M. Tomamichel. On Quantum Rényi Entropies: A New Generalization and Some Properties. J. Math. Phys., 54(12):122203, June 2013. DOI: 10.1063/1.4838856.
[32] H. Nagaoka. The Converse Part of The Theorem for Quantum Hoeffding Bound. Nov. 2006. arXiv: quant-ph/0611289.
[33] K. Nakagawa and F. Kanaya. On the Converse Theorem in Statistical Hypothesis Testing for Markov Chains. IEEE Trans. on Inf. Theory, 39(2):629–633, 1993.
[34] T. Ogawa and H. Nagaoka. Strong converse and Stein's lemma in quantum hypothesis testing. IEEE Trans. on Inf. Theory, 46(7):2428–2433, Nov. 2000. DOI: 10.1109/18.887855.
[35] M. Ohya and D. Petz. Quantum Entropy and Its Use. Springer, 1993.
[36] Y. Polyanskiy. Saddle Point in the Minimax Converse for Channel Coding. IEEE Trans. on Inf. Theory, 59(5):2576–2595, May 2013. DOI: 10.1109/TIT.2012.2236382.
[37] Y. Polyanskiy, H. V. Poor, and S. Verdú. Channel Coding Rate in the Finite Blocklength Regime. IEEE Trans. on Inf. Theory, 56(5):2307–2359, May 2010. DOI: 10.1109/TIT.2010.2043769.
[38] R. Renner. Security of Quantum Key Distribution. PhD thesis, ETH Zurich, Dec. 2005. arXiv: quant-ph/0512258.
[39] N. Sharma and N. A. Warsi. Fundamental Bound on the Reliability of Quantum Information Transmission. Phys. Rev. Lett., 110(8):080501, Feb. 2013. DOI: 10.1103/PhysRevLett.110.080501.
[40] M. Sion. On General Minimax Theorems. Pacific J. Math., 8:171–176, 1958.
[41] M. Tomamichel. A Framework for Non-Asymptotic Quantum Information Theory. PhD thesis, ETH Zurich, Mar. 2012. arXiv: 1203.2142.
[42] M. Tomamichel, M. Berta, and M. Hayashi. Relating different quantum generalizations of the conditional Rényi entropy. J. Math. Phys., 55(8):082206, Aug. 2014. DOI: 10.1063/1.4892761.
[43] M. Tomamichel and M. Hayashi. A Hierarchy of Information Quantities for Finite Block Length Analysis of Quantum Tasks. IEEE Trans. on Inf. Theory, 59(11):7693–7710, Nov. 2013. DOI: 10.1109/TIT.2013.2276628.
[44] M. Tomamichel and V. Y. F. Tan. On the Gaussian Approximation for the Classical Capacity of Quantum Channels. Aug. 2014. arXiv: 1308.6503.
[45] L. Wang and R. Renner. One-Shot Classical-Quantum Capacity and Hypothesis Testing. Phys. Rev. Lett., 108(20), May 2012. DOI: 10.1103/PhysRevLett.108.200501.
[46] M. M. Wilde, A. Winter, and D. Yang. Strong Converse for the Classical Capacity of Entanglement-Breaking and Hadamard Channels via a Sandwiched Rényi Relative Entropy. Comm. Math. Phys., 331(2):593–622, July 2014. DOI: 10.1007/s00220-014-2122-x.