Abstract
The sandwiched Rényi divergences of two finite-dimensional density operators quantify their asymptotic distinguishability in the strong converse domain. This establishes the sandwiched Rényi divergences as the operationally relevant ones among the infinitely many quantum extensions of the classical Rényi divergences for Rényi parameter \(\alpha >1\). The known proof of this goes by showing that the sandwiched Rényi divergence coincides with the regularized measured Rényi divergence, which in turn is proved by asymptotic pinching, a fundamentally finite-dimensional technique. Thus, while the notion of the sandwiched Rényi divergences was extended recently to density operators on an infinite-dimensional Hilbert space (in fact, even for states of an arbitrary von Neumann algebra), these quantities were so far lacking an operational interpretation similar to the finite-dimensional case, and it has also been open whether they coincide with the regularized measured Rényi divergences. In this paper we fill this gap by answering both questions in the positive for density operators on an infinite-dimensional Hilbert space, using a simple finite-dimensional approximation technique. We also initiate the study of the sandwiched Rényi divergences, and the related problem of the strong converse exponent, for pairs of positive semi-definite operators that are not necessarily trace-class (this corresponds to considering weights in a general von Neumann algebra setting). This is motivated by the need to define conditional Rényi entropies in the infinite-dimensional setting, while it might also be interesting from the purely mathematical point of view of extending the concept of Rényi (and other) divergences to settings beyond the standard one of positive trace-class operators (positive normal functionals in the von Neumann algebra setting). 
In this spirit, we also discuss the definition and some properties of the more general family of Rényi \((\alpha ,z)\)-divergences of positive semi-definite operators on an infinite-dimensional separable Hilbert space.
1 Introduction
In a simple binary i.i.d. quantum state discrimination problem, an experimenter is presented with several identically prepared quantum systems, all in the same state that is either described by a density operator \(\varrho \) on the system’s Hilbert space \({\mathcal {H}}\) (null hypothesis \(H_0\)), or by another density operator \(\sigma \) (alternative hypothesis \(H_1\)). The experimenter’s task is to guess which hypothesis is correct, based on the result of a 2-outcome measurement, represented by a pair of operators \((T_n(0)=:T_n,T_n(1)=I-T_n)\), where \(T_n\in {{\mathcal {B}}}({\mathcal {H}}_n)\) is a test on \({\mathcal {H}}_n{:}{=}{\mathcal {H}}^{\otimes n}\), i.e., \(0\le T_n\le I\), and n is the number of identically prepared systems. If the outcome of the measurement is k, described by the measurement operator \(T_n(k)\), the experimenter decides that hypothesis k is true. The type I success probability, i.e., the probability that the experimenter correctly identifies the state to be \(\varrho \), and the type II error probability, i.e., the probability that the experimenter erroneously identifies the state to be \(\varrho \), are given by
\[ {{\,\textrm{Tr}\,}}\varrho _nT_n\quad \text {and}\quad {{\,\textrm{Tr}\,}}\sigma _nT_n, \]
respectively, where \(\varrho _n=\varrho ^{\otimes n}\), \(\sigma _n=\sigma ^{\otimes n}\).
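As a minimal numerical sketch of the type I success and type II error probabilities, consider the commuting case, assuming \(\varrho =\textrm{diag}(p,1-p)\) and \(\sigma =\textrm{diag}(q,1-q)\), so that n copies reduce to n coin flips. The function name, its parameters, and the likelihood-ratio threshold rule are illustrative assumptions and not notation from the paper.

```python
import math


def success_and_error(n, p, q, threshold):
    """Commuting sketch with rho = diag(p, 1-p) and sigma = diag(q, 1-q).

    The (hypothetical) test T_n accepts the null hypothesis on those outcome
    strings whose per-copy log-likelihood ratio is at least `threshold`.
    Returns (Tr rho_n T_n, Tr sigma_n T_n).
    """
    success, error = 0.0, 0.0
    for k in range(n + 1):  # k = number of copies showing the first outcome
        llr = (k * math.log(p / q)
               + (n - k) * math.log((1 - p) / (1 - q))) / n
        if llr >= threshold:
            weight = math.comb(n, k)
            success += weight * p**k * (1 - p) ** (n - k)
            error += weight * q**k * (1 - q) ** (n - k)
    return success, error


if __name__ == "__main__":
    for r in (0.0, 0.1, 0.3):
        print(r, success_and_error(50, 0.8, 0.5, r))
```

On the acceptance region the n-copy likelihood ratio is at least \(e^{n\cdot \text {threshold}}\), so the type II error of this test is at most \(e^{-n\cdot \text {threshold}}\) times its type I success probability. This is the kind of exponential constraint on the type II error that the asymptotic analysis imposes.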
In the asymptotic analysis of the problem, it is customary to look for the optimal asymptotics of the type I success probabilities under the constraint that the type II error probabilities decrease at least as fast as \(\beta _n\sim e^{-nr}\) with some fixed r. It is known that if r is smaller than the relative entropy of \(\varrho \) and \(\sigma \) then the type I success probabilities converge to 1 exponentially fast, and the optimal exponent (the so-called direct exponent) is equal to the Hoeffding divergence \(H_r\) of \(\varrho \) and \(\sigma \) [3, 15, 26, 37]. The Hoeffding divergences are defined from the Petz-type Rényi divergences \(D_{\alpha }\) with \(\alpha \in (0,1)\), and the above result establishes the operational significance of these divergences [32, 37].
On the other hand, it was shown in [33] (see also [16, 36, 38]) that if the Hilbert space is finite-dimensional (equivalently, the density operators are of finite rank), and r is larger than the relative entropy, then the type I success probabilities converge to 0 exponentially fast, and the optimal exponent (the so-called strong converse exponent) is equal to the Hoeffding anti-divergence \(H_r^{*}\) of \(\varrho \) and \(\sigma \). (\(H_r^{*}\), as well as the various divergences mentioned below, will be precisely defined in the main text.) The Hoeffding anti-divergences are defined from the sandwiched Rényi divergences \(D_{\alpha }^{*}\) with \(\alpha >1\) [35, 46], and this result establishes the operational significance of these divergences.
A key step in the proof of the strong converse exponent in [33] is showing that the regularized measured Rényi divergence \(\overline{D}_{\alpha }^{\text {meas}}\) coincides with the sandwiched Rényi divergence \(D_{\alpha }^{*}\) for any \(\alpha >1\), which was proved using the pinching inequality [14], a fundamentally finite-dimensional technique. Thus, while the notion of the sandwiched Rényi divergences was extended recently to density operators on an infinite-dimensional Hilbert space (in fact, even for states of an arbitrary von Neumann algebra) in [6] and [27], these quantities were so far lacking an operational interpretation similar to the finite-dimensional case described above, and it has also been open whether they coincide with the regularized measured Rényi divergences. In this paper we fill this gap by answering both questions in the positive for density operators on an infinite-dimensional Hilbert space.
We also initiate the study of the sandwiched Rényi divergences, and the related problem of the strong converse exponents, for pairs of positive semi-definite operators that are not necessarily trace-class (this corresponds to considering weights in a general von Neumann algebra setting). This is motivated by the need to define conditional Rényi entropies in the infinite-dimensional setting, while it might also be interesting from the purely mathematical point of view of extending the concept of Rényi (and other) divergences to settings beyond the standard one of positive trace-class operators (or positive normal functionals, in the von Neumann algebra setting). In this spirit, we also discuss the definition and some properties of the more general family of Rényi \((\alpha ,z)\)-divergences [4, 25] in this setting. To the best of our knowledge, this is new even for trace-class operators when the underlying Hilbert space is infinite-dimensional.
The structure of the paper is as follows. In Sect. 2 we collect some necessary preliminaries. In Sect. 3 we define the Rényi \((\alpha ,z)\)-divergences for an arbitrary pair of positive semi-definite operators on a possibly infinite-dimensional Hilbert space, and establish some of their properties. The most important part of this section for the later applications is the recoverability of the sandwiched Rényi divergence from finite-dimensional restrictions, given in Proposition 3.40. Based on this, in Sect. 3.4 we show that the sandwiched Rényi divergence is equal to the regularized measured Rényi divergence for pairs of states, extending the finite-dimensional result of [33] to infinite dimension. In Sect. 4.1 we consider a generalization of the state discrimination problem where the hypotheses are given by (not necessarily trace-class) positive semi-definite operators, and establish lower and upper bounds on the strong converse exponents in this setting. In particular, we show that the strong converse exponent is equal to the Hoeffding anti-divergence for quantum states, thereby giving an operational interpretation of the sandwiched Rényi divergences analogous to the finite-dimensional case. Moreover, we prove the above equality also in the case where the reference operator \(\sigma \) is only assumed to be compact, and to dominate the first operator as \(\varrho \le \lambda \sigma \) for some \(\lambda >0\). In Sect. 4.2, we give a direct operational interpretation to the sandwiched Rényi divergences as generalized cutoff rates, extending the analogous interpretations given previously for classical [8] and finite-dimensional quantum states [33]. In Sect. 4.3 we use the strong converse result from Sect. 4.1 to show the monotonicity of the sandwiched Rényi divergences under the action of the dual of a normal unital completely positive map.
While this follows from [6, 27] for density operators, our proof is completely different, and also applies to other settings, e.g., for a compact \(\sigma \) that dominates \(\varrho \).
2 Preliminaries
Throughout the paper, \({\mathcal {H}}\) and \({{\mathcal {K}}}\) will denote separable Hilbert spaces (of finite or infinite dimension), and \({{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})\) will denote the set of everywhere defined bounded linear operators from \({\mathcal {H}}\) to \({{\mathcal {K}}}\), with \({{\mathcal {B}}}({\mathcal {H}},{\mathcal {H}})=:{{\mathcal {B}}}({\mathcal {H}})\). We will use the notations \({{\mathcal {B}}}({\mathcal {H}})_{\text {sa}}\) for the set of self-adjoint, and \({{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) for the set of non-zero positive semi-definite (PSD) operators in \({{\mathcal {B}}}({\mathcal {H}})\), respectively, and
for the set of tests in \({{\mathcal {B}}}({\mathcal {H}})\). A test T is projective if \(T^2=T\). We will denote the set of all projections on \({\mathcal {H}}\) by \(\mathbb {P}({\mathcal {H}})\), and the set of finite rank projections by \(\mathbb {P}_f({\mathcal {H}})\). The set of finite-rank operators on \({\mathcal {H}}\) will be denoted by \({{\mathcal {B}}}_f({\mathcal {H}})\). The set of density operators, or states, on \({\mathcal {H}}\) will be denoted by \({{\mathcal {S}}}({\mathcal {H}})\). For two PSD operators , we will use the notations
and , , , where .
For a (possibly unbounded) self-adjoint operator A on a Hilbert space \({\mathcal {H}}\), let \(P^A(\cdot )\) denote its spectral PVM, and for any complex-valued measurable function f defined at least on \({{\,\textrm{spec}\,}}(A)\), let \(f(A)=\int _{\mathbb {R}}f\,dP^A\) be the operator defined via the usual functional calculus. We will use the relations
where \(\overline{f}\) stands for the pointwise complex conjugate of f, and for a closable operator X, \(\overline{X}\) denotes its closure.
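We say that a (not necessarily everywhere defined or bounded) linear operator A on a Hilbert space is positive semi-definite (PSD), if it is self-adjoint, and \({{\,\textrm{spec}\,}}(A)\subseteq [0,+\infty )\). If A is PSD then we may define its real powers as
\[ A^p{:}{=}{{\,\textrm{id}\,}}_{(0,+\infty )}^p(A)=\int _{(0,+\infty )}{{\,\textrm{id}\,}}_{(0,+\infty )}^p\,dP^A,\quad p\in \mathbb {R}. \]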
In particular, \(A^0\) is the projection onto \((\ker A)^{\perp }={{\,\mathrm{\overline{{{\,\textrm{ran}\,}}}}\,}}A=:{{\,\textrm{supp}\,}}A\),
and
Note that here we use the notation \({{\,\textrm{id}\,}}_B\) for the identity map on any subset \(B\subseteq \mathbb {R}\). Analogously, we will denote the characteristic function, or indicator function, of a subset \({{\mathcal {X}}}_0\) of a set \({{\mathcal {X}}}\) by \(\textbf{1}_{{{\mathcal {X}}}_0}\).
For any \(X\in {{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})\) with polar decomposition \(X=V|X|\), we have \(|X^*|=V|X|V^*\), whence \(|X^*|^p=V|X|^pV^*\) for any \(p\in \mathbb {R}\). In particular,
\[ {{\,\textrm{Tr}\,}}(X^*X)^p={{\,\textrm{Tr}\,}}(XX^*)^p,\quad p>0, \]
which we will use in many proofs below without further notice. We will use the notation \(\left\| X\right\| _p{:}{=}({{\,\textrm{Tr}\,}}|X|^p)^{1/p}\) for \(X\in {{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})\) and \(p>0\). When \(p\ge 1\), \(\left\| \cdot \right\| _p\) is a norm on the Schatten p-class
\[ {{\mathcal {L}}}^p({\mathcal {H}}){:}{=}\{X\in {{\mathcal {B}}}({\mathcal {H}}):\,\left\| X\right\| _p<+\infty \}. \]
We will denote the usual operator norm on \({{\mathcal {B}}}({\mathcal {H}})\) by \(\left\| \cdot \right\| _{\infty }\).
The following is well known; see, e.g., [18, Proposition 2.7] and [31, Theorem 2.3].
Lemma 2.1
(Hölder inequality). Let \(p_0,p_1,p>0\) be such that \(\frac{1}{p_0}+\frac{1}{p_1}=\frac{1}{p}\). For any \(A,B\in {{\mathcal {B}}}({\mathcal {H}})\),
\[ \left\| AB\right\| _p\le \left\| A\right\| _{p_0}\left\| B\right\| _{p_1}. \tag{2.4} \]
Moreover, if \(\left\| A\right\| _{p_0}\left\| B\right\| _{p_1}<+\infty \) then equality holds in (2.4) if and only if \(A=\lambda B\) or \(B=\lambda A\) for some \(\lambda \ge 0\).
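The Hölder inequality can be checked numerically in the simplest situation, assuming that A and B are diagonal in a common basis, so that the Schatten norms reduce to \(\ell ^p\) (quasi-)norms of the diagonal entries. The following is only a sketch under that assumption; the helper names are hypothetical.

```python
def schatten_norm_diag(entries, p):
    """Schatten p-(quasi-)norm of a diagonal operator with the given entries."""
    return sum(abs(x) ** p for x in entries) ** (1.0 / p)


def holder_gap(a, b, p0, p1):
    """Return ||A||_{p0} ||B||_{p1} - ||AB||_p for diagonal A and B,
    where 1/p = 1/p0 + 1/p1. Hoelder's inequality says this is >= 0."""
    p = 1.0 / (1.0 / p0 + 1.0 / p1)
    product = [x * y for x, y in zip(a, b)]
    return (schatten_norm_diag(a, p0) * schatten_norm_diag(b, p1)
            - schatten_norm_diag(product, p))


if __name__ == "__main__":
    print(holder_gap([1.0, 2.0, 3.0], [0.5, 0.1, 4.0], 2.0, 4.0))
```

The diagonal case does not probe non-commutativity; it only illustrates how the exponents \(p_0,p_1,p\) enter.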
We will use the notations \(\mathrm{(wo)}\lim \) and \(\mathrm{(so)}\lim \) for limits in the weak and the strong operator topologies, respectively. The following two statements are from [13].
Lemma 2.2
Let \(A\in {{\mathcal {L}}}^p({\mathcal {H}})\) for some \(p\ge 1\), and \(B_n\in {{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})\), \(C_n\in {{\mathcal {B}}}({{\mathcal {K}}},{\mathcal {H}})\), \(n\in \mathbb {N}\), be two sequences bounded in operator norm and converging strongly to some \(B_{\infty }\in {{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})\) and \(C_{\infty }\in {{\mathcal {B}}}({{\mathcal {K}}},{\mathcal {H}})\), respectively. Then
Proof
The first limit in (2.5) is immediate from [13, Theorem 1], and the second limit follows from it trivially. \(\square \)
The following is Theorem 2 in [13]:
Lemma 2.3
Let \(p\in [1,+\infty )\) and \(A,A_n\in {{\mathcal {L}}}^p({\mathcal {H}})\), \(n\in \mathbb {N}\), be such that \(\mathrm{(so)}\lim _n A_n=A\), \(\mathrm{(so)}\lim _n A_n^*=A^*\), and \(\lim _n\left\| A_n\right\| _p=\left\| A\right\| _p\). Then \(\lim _n\left\| A_n-A\right\| _p=0\).
The following is a special case of [18, Proposition 2.11]:
Lemma 2.4
Assume that a sequence \(A_n\in {{\mathcal {B}}}({\mathcal {H}})\), \(n\in \mathbb {N}\), converges to some \(A\in {{\mathcal {B}}}({\mathcal {H}})\) in the weak operator topology. For any \(p\in [1,+\infty ]\),
\[ \left\| A\right\| _p\le \liminf _{n\rightarrow +\infty }\left\| A_n\right\| _p. \]
We will need the following straightforward generalization of the minimax theorem from [32, Corollary A.2]. The proof is essentially the same; we include it for the reader’s convenience.
Lemma 2.5
Let X be a compact topological space, Y be an upward directed partially ordered set, and let \(f:\,X\times Y\rightarrow \mathbb {R}\cup \{-\infty ,+\infty \}\) be a function. Assume that
(i) \(f(.\,,\,y)\) is upper semicontinuous for every \(y\in Y\), and
(ii) \(f(x,\,.)\) is monotonic decreasing for every \(x\in X\).
Then
\[ \sup _{x\in X}\inf _{y\in Y}f(x,y)=\inf _{y\in Y}\sup _{x\in X}f(x,y), \tag{2.6} \]
and the suprema in (2.6) can be replaced by maxima.
Proof
The inequality \(\sup _{x\in X}\inf _{y\in Y}f(x,y)\le \inf _{y\in Y}\sup _{x\in X}f(x,y)\) is trivial, and for the converse inequality it is sufficient to prove that for any finite subset \(Y'\subseteq Y\),
\[ \sup _{x\in X}\min _{y\in Y'}f(x,y)\ge \inf _{y\in Y}\sup _{x\in X}f(x,y), \]
according to [32, Lemma A.1] (applied to \(-f\) in place of f). Due to Y being upward directed, for any finite subset \(Y'\subseteq Y\), there exists a \(y^*\in Y\) such that \(y\le y^*\) for every \(y\in Y'\). Since \(f(x,\,.)\) is assumed to be monotone decreasing, we get
\[ \sup _{x\in X}\min _{y\in Y'}f(x,y)\ge \sup _{x\in X}f(x,y^*)\ge \inf _{y\in Y}\sup _{x\in X}f(x,y), \]
as required. The assertion about the maxima is straightforward from the assumed semi-continuity and the compactness of X. \(\square \)
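The role of the monotonicity assumption can be seen in a toy instance, assuming a finite set X (compact in the discrete topology, where every function is upper semicontinuous) and a finite totally ordered set Y (hence upward directed). The functions and grids below are illustrative choices only.

```python
def sup_inf(f, xs, ys):
    """max over x of min over y of f(x, y) on finite grids."""
    return max(min(f(x, y) for y in ys) for x in xs)


def inf_sup(f, xs, ys):
    """min over y of max over x of f(x, y) on finite grids."""
    return min(max(f(x, y) for x in xs) for y in ys)


def decreasing_in_y(x, y):
    """Satisfies the hypothesis: y -> f(x, y) is decreasing for every x."""
    return x * (1.0 - x) + 1.0 / (1.0 + y)


def not_monotone(x, y):
    """Violates the hypothesis: y -> f(x, y) is not monotone."""
    return 1.0 if x == y else 0.0


if __name__ == "__main__":
    xs = [i / 10 for i in range(11)]
    ys = list(range(20))
    print(sup_inf(decreasing_in_y, xs, ys), inf_sup(decreasing_in_y, xs, ys))
    print(sup_inf(not_monotone, [0, 1], [0, 1]),
          inf_sup(not_monotone, [0, 1], [0, 1]))
```

For the decreasing function both orders of optimization agree, since they are attained at the largest element of Y. For the second function the two values differ, so the interchange genuinely depends on the monotonicity in y.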
3 The Rényi \((\alpha ,z)\)-Divergences in Infinite Dimension
The sandwiched Rényi \(\alpha \)-divergences for pairs of finite-dimensional density operators were introduced in [35, 46]. The Rényi \((\alpha ,z)\)-divergences [4, 25] give a 2-parameter extension of this family, which includes both the sandwiched Rényi divergences (corresponding to \(z=\alpha \)) and the Petz-type, or standard Rényi divergences [40] (corresponding to \(z=1\)) as special cases.
The concept of the sandwiched Rényi divergences was extended recently to pairs of positive normal linear functionals on a general von Neumann algebra in [6, 27, 28], while the Petz-type Rényi divergences have been studied in this more general setting for a long time [21, 29, 39]. These extensions require advanced knowledge of von Neumann algebras, and the details of the proofs might be difficult to verify for those who are not experts in the subject. Below we give a more pedestrian exposition of the definition and basic properties of the Rényi divergences in the simpler case where the states are represented by density operators on a possibly infinite-dimensional Hilbert space, while at the same time we also generalize the above works in this setting to the case where density operators may be replaced by arbitrary positive semi-definite operators. Since these are mostly not assumed to be trace-class, they cannot be normalized to states in the properly infinite-dimensional case. Moreover, we also consider the more general notion of Rényi \((\alpha ,z)\)-divergences in this setting.
The recoverability of the sandwiched Rényi divergences from finite-size restrictions, given in Proposition 3.40, seems to be new even for density operators, although in that case it follows easily from the known properties of monotonicity and lower semi-continuity of the sandwiched Rényi divergences.
3.1 Definition and basic properties
The sandwiched Rényi divergence of \(\varrho \) and \(\sigma \) is finite according to the definition in [27] if and only if \(\varrho \) is in Kosaki’s interpolation space \({{\mathcal {L}}}^{\alpha }({\mathcal {H}},\sigma )\). The following lemma gives various alternative characterizations of this condition, and also an extension that we will use in the definition of the Rényi \((\alpha ,z)\)-divergences in this setting. The lemma is essentially a special case of Douglas’ range inclusion theorem [10] for PSD operators with \(A{:}{=}\varrho ^{\frac{\alpha }{2z}}\) and \(B{:}{=}\sigma ^{\frac{\alpha -1}{2z}}\) (points (iv)–(vi)) as well as an extension with further equivalent characterizations (points (i)–(iii)), and it is inspired by a similar statement for the \(\alpha =z=+\infty \) case given in [30].
Let us introduce the notation
\[ \mathbb {A}{:}{=}\big ((1,+\infty )\times (0,+\infty )\big )\cup \{(+\infty ,+\infty )\}. \]
For \((\alpha ,z){:}{=}(+\infty ,+\infty )\), we will use the convention \(\frac{\alpha }{z}{:}{=}1\), and define similar expressions by a formal calculus, e.g., \(\frac{\alpha }{2z}{:}{=}\frac{1}{2}\frac{\alpha }{z}=\frac{1}{2}\), \(\frac{\alpha -1}{2z}{:}{=}\frac{\alpha }{2z}-\frac{1}{2z}=\frac{1}{2}\), etc.
Lemma 3.1
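Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), and let \((\alpha ,z)\in \mathbb {A}\). The following are equivalent:
(i) There exists an \(R\in {{\mathcal {B}}}({\mathcal {H}})\) such that
\[ \varrho ^{\frac{\alpha }{z}}=\sigma ^{\frac{\alpha -1}{2z}}R\sigma ^{\frac{\alpha -1}{2z}}. \tag{3.1} \]
(ii) \(\varrho ^0\le \sigma ^0\), and \(\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma ^{\frac{1-\alpha }{2z}}\) is densely defined and bounded.
(iii) \(\varrho ^0\le \sigma ^0\), and for any/some sequences \(0<c_n<d_n\) with \(c_n\rightarrow 0\), \(d_n\rightarrow +\infty \), the sequence of bounded operators
\[ \sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}},\quad n\in \mathbb {N}, \tag{3.2} \]
converges in the weak/strong operator topology, where \(\sigma _n{:}{=}P_n\sigma P_n\), \(P_n{:}{=}\textbf{1}_{(c_n,d_n)}(\sigma )\).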
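(iv) \({{\,\textrm{ran}\,}}\varrho ^{\frac{\alpha }{2z}}\subseteq {{\,\textrm{ran}\,}}\sigma ^{\frac{\alpha -1}{2z}}\).
(v) \(\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\in {{\mathcal {B}}}({\mathcal {H}})\).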
(vi) There exists a \(\lambda \ge 0\) such that \(\varrho ^{\frac{\alpha }{z}}\le \lambda \sigma ^{\frac{\alpha -1}{z}}\).
Moreover, if the above hold then , and among all operators R as in (3.1) there exists a unique PSD operator with the property \(R^0\le \sigma ^0\), denoted by \(\varrho _{\sigma ,\alpha ,z}\), which can be expressed as
where \((\sigma _n)_{n\in \mathbb {N}}\) is any sequence as in (iii). This unique \(\varrho _{\sigma ,\alpha ,z}\) is in the von Neumann algebra generated by \(\varrho \) and \(\sigma \), and its operator norm is equal to the smallest \(\lambda \) for which (vi) holds.
Proof
Note that if \(R\in {{\mathcal {B}}}({\mathcal {H}})\) satisfies (3.1) then so does \(\sigma ^0R\sigma ^0\). Moreover, any of the conditions above implies \(\varrho ^0\le \sigma ^0\). Hence, we may assume without loss of generality that \({{\,\textrm{supp}\,}}\sigma ={\mathcal {H}}\), so that \({{\,\mathrm{\overline{{{\,\textrm{ran}\,}}}}\,}}(\sigma ^{\frac{\alpha -1}{2z}})= \big (\ker \big (\sigma ^{\frac{\alpha -1}{2z}}\big )\big )^{\perp }=(\ker \sigma )^{\perp }={\mathcal {H}}\).
Assume that (i) holds. Then \(\varrho ^0\le \sigma ^0\) holds trivially, and
whence its closure is equal to R. This proves (ii) and the existence of the unique with the postulated properties, as well as the first equality in (3.3). Moreover, for any \(0<c_n<d_n\) with \(c_n\rightarrow 0\), \(d_n\rightarrow +\infty \), we have, with \(\sigma _n\) as in (iii),
where we used (2.2). This proves (iii) and the first equality in (3.4). Since is in the von Neumann algebra generated by and \(\sigma \), so is , according to (3.5). Obviously, (i) also implies
whence (vi) follows with . As a consequence, , where \(\lambda _{\min }\) denotes the smallest \(\lambda \) for which (vi) holds. Conversely, let \(\lambda \) be as in (vi). Multiplying both sides by \(\sigma _n^{\frac{1-\alpha }{2z}}\) yields , which in combination with (3.5) gives . Thus, , as stated.
Assume next that (ii) holds. Then
where the last equality follows from the assumption . Since is everywhere defined, it is actually equal to the first operator in (3.6), and thus (i) holds. Moreover, if (3.6) holds then for any \(\phi \in {\mathcal {H}}\),
Since \({{\,\textrm{ran}\,}}\sigma ^{\frac{\alpha -1}{2z}}\) is dense and is bounded, it follows that is PSD.
Assume now (iii), i.e., that for some sequences \(0<c_n<d_n\) with \(c_n\rightarrow 0\), \(d_n\rightarrow +\infty \), the sequence of operators converges in the weak operator topology to some operator \(R_{\sigma ,\alpha ,z}\). Then
and hence (i) holds, as well as the second equality in (3.4).
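The equivalence of (iv), (v), and (vi) follows from Douglas’ range inclusion theorem [10]. Note that (iv)\(\Longleftrightarrow \)(v) is simple, as \(\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\) being everywhere defined is equivalent to \({{\,\textrm{ran}\,}}\varrho ^{\frac{\alpha }{2z}}\subseteq {{\,\textrm{ran}\,}}\sigma ^{\frac{\alpha -1}{2z}}\), and boundedness of \(\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\) is automatic from the boundedness of \(\varrho ^{\frac{\alpha }{2z}}\) and the closedness of \(\sigma ^{\frac{1-\alpha }{2z}}\), due to the closed graph theorem. Moreover, we have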
whence
which is densely defined and bounded. Thus, . Finally,
where the last equality follows from the assumption \({{\,\textrm{ran}\,}}\varrho ^{\frac{\alpha }{2z}}\subseteq {{\,\textrm{ran}\,}}\sigma ^{\frac{\alpha -1}{2z}}\). Thus, (i) follows with \(\varrho _{\sigma ,\alpha ,z}= \big (\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\big )\overline{\varrho ^{\frac{\alpha }{2z}}\sigma ^{\frac{1-\alpha }{2z}}}\), and we also have the second and the third equalities in (3.3). Note that the last expression in (3.3) gives another proof for the positive semi-definiteness of \(\varrho _{\sigma ,\alpha ,z}\). \(\square \)
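In the commuting, finite-dimensional case the content of Lemma 3.1 becomes elementary, which gives a quick sanity check. The sketch below assumes that \(\varrho \) and \(\sigma \) are diagonal in the same basis with strictly positive entries for \(\sigma \); under this assumption the operator in (i) is again diagonal, and the smallest \(\lambda \) with \(\varrho ^{\frac{\alpha }{z}}\le \lambda \sigma ^{\frac{\alpha -1}{z}}\) is a maximum of entrywise ratios. All function names are hypothetical.

```python
def sandwiched_operator_diag(r, s, alpha, z):
    """Diagonal entries of sigma^{(1-alpha)/(2z)} rho^{alpha/z} sigma^{(1-alpha)/(2z)}
    for commuting diagonal rho = diag(r), sigma = diag(s) with all s_k > 0."""
    return [rk ** (alpha / z) * sk ** ((1.0 - alpha) / z) for rk, sk in zip(r, s)]


def smallest_lambda_diag(r, s, alpha, z):
    """Smallest lam with rho^{alpha/z} <= lam * sigma^{(alpha-1)/z} (diagonal case)."""
    return max(rk ** (alpha / z) / sk ** ((alpha - 1.0) / z) for rk, sk in zip(r, s))


if __name__ == "__main__":
    r, s = [0.5, 0.2, 0.3], [0.1, 0.6, 0.3]
    big_r = sandwiched_operator_diag(r, s, alpha=2.0, z=1.5)
    # operator norm of a diagonal operator = largest absolute entry
    print(max(big_r), smallest_lambda_diag(r, s, alpha=2.0, z=1.5))
```

Here the operator norm of the diagonal operator agrees with the smallest admissible \(\lambda \), matching the last statement of the lemma. In infinite dimension the same entrywise ratios may be unbounded, which is exactly when the conditions of the lemma fail.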
Definition 3.2
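For \(\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in \mathbb {A}\), let
\[ {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma ){:}{=}\Big \{\varrho \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}:\,\varrho ^{\frac{\alpha }{z}}=\sigma ^{\frac{\alpha -1}{2z}}R\sigma ^{\frac{\alpha -1}{2z}}\text { for some }R\in {{\mathcal {B}}}({\mathcal {H}})\Big \}. \]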
When \(\alpha =z\), we will use the shorthand notation \({{\mathcal {B}}}^{\alpha ,\alpha }({\mathcal {H}},\sigma )=:{{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma )\).
Remark 3.3
Note that \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) if and only if it satisfies (i) in Lemma 3.1, which is equivalently characterized by all the other points in Lemma 3.1. In particular, there exists a unique PSD \(\varrho _{\sigma ,\alpha ,z}\) satisfying \(\varrho _{\sigma ,\alpha ,z}^0\le \sigma ^0\) and \(\varrho ^{\frac{\alpha }{z}} = \sigma ^{\frac{\alpha -1}{2z}}\varrho _{\sigma ,\alpha ,z}\sigma ^{\frac{\alpha -1}{2z}}\), and thus the map \(\varrho \mapsto \varrho _{\sigma ,\alpha ,z}\) is well-defined from \({{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) onto \(\{\tau \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}:\,\tau ^0\le \sigma ^0\}\), and it is also injective, hence it is a bijection. When \(\alpha =z\), we will use the notation \(\varrho _{\sigma ,\alpha ,\alpha }=:\varrho _{\sigma ,\alpha }\).
Lemma 3.4
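For any \(\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), \((0,+\infty )\ni z\mapsto {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) is increasing, i.e.,
\[ {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\subseteq {{\mathcal {B}}}^{\alpha ,z'}({\mathcal {H}},\sigma ),\quad 0<z\le z'. \]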
Proof
Let \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\). Then, by (vi) of Lemma 3.1, \(\varrho ^{\frac{\alpha }{z}}\le \lambda \sigma ^{\frac{\alpha -1}{z}}\) for some \(\lambda \in (0,+\infty )\). Since \(z\le z'\), \({{\,\textrm{id}\,}}_{[0,+\infty )}^{\frac{z}{z'}}\) is operator monotone, whence
\[ \varrho ^{\frac{\alpha }{z'}}=\big (\varrho ^{\frac{\alpha }{z}}\big )^{\frac{z}{z'}}\le \lambda ^{\frac{z}{z'}}\sigma ^{\frac{\alpha -1}{z'}}. \]
Again by (vi) of Lemma 3.1, \(\varrho \in {{\mathcal {B}}}^{\alpha ,z'}({\mathcal {H}},\sigma )\). \(\square \)
Remark 3.5
By (3.3)–(3.4), for \(P_n\) and \(\sigma _n\) as in (3.2),
and if \(\alpha =z\), then we further have
Thus, with \(\varrho _n{:}{=}P_n\varrho P_n\),
Remark 3.6
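Note that if \(\varrho \in {{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma )\), i.e., \(\varrho =\sigma ^{\frac{\alpha -1}{2\alpha }}\varrho _{\sigma ,\alpha }\sigma ^{\frac{\alpha -1}{2\alpha }}\) with \(\varrho _{\sigma ,\alpha }\in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), \(\varrho _{\sigma ,\alpha }^0\le \sigma ^0\), then for any \(1<\alpha '<\alpha \),
\[ \varrho =\sigma ^{\frac{\alpha '-1}{2\alpha '}}\Big (\sigma ^{\frac{\alpha -\alpha '}{2\alpha \alpha '}}\varrho _{\sigma ,\alpha }\sigma ^{\frac{\alpha -\alpha '}{2\alpha \alpha '}}\Big )\sigma ^{\frac{\alpha '-1}{2\alpha '}}, \]
whence \(\varrho \in {{\mathcal {B}}}^{\alpha '}({\mathcal {H}},\sigma )\), and
\[ \varrho _{\sigma ,\alpha '}=\sigma ^{\frac{\alpha -\alpha '}{2\alpha \alpha '}}\varrho _{\sigma ,\alpha }\sigma ^{\frac{\alpha -\alpha '}{2\alpha \alpha '}}. \]
In particular, if \(\varrho \in {{\mathcal {B}}}^{\infty }({\mathcal {H}},\sigma )\), i.e., \(\varrho =\sigma ^{1/2}\varrho _{\sigma ,\infty }\sigma ^{1/2}\) with some \(\varrho _{\sigma ,\infty }\in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), \(\varrho _{\sigma ,\infty }^0\le \sigma ^0\), then \(\varrho \in {{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma )\) for every \(\alpha >1\), and
\[ \varrho _{\sigma ,\alpha }=\sigma ^{\frac{1}{2\alpha }}\varrho _{\sigma ,\infty }\sigma ^{\frac{1}{2\alpha }}. \]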
As an immediate consequence,
where
\[ D_{\max }(\varrho \Vert \sigma ){:}{=}\log \inf \{\lambda >0:\,\varrho \le \lambda \sigma \} \]
is the max-relative entropy of \(\varrho \) and \(\sigma \) [9, 41].
Definition 3.7
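For \(\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\), let
\[ {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma ){:}{=}\big \{\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma ):\,\varrho _{\sigma ,\alpha ,z}\in {{\mathcal {L}}}^{z}({\mathcal {H}})\big \}. \]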
Again, when \(\alpha =z\), we will use the notation \({{\mathcal {L}}}^{\alpha ,\alpha }({\mathcal {H}},\sigma )=:{{\mathcal {L}}}^{\alpha }({\mathcal {H}},\sigma )\).
Remark 3.8
Note that for \(\alpha >1\), \(\sigma ^{\frac{\alpha -1}{2z}}\in {{\mathcal {B}}}({\mathcal {H}})\), and if \(z\ge 1\) then \({{\mathcal {L}}}^{z}({\mathcal {H}})\) is an ideal in \({{\mathcal {B}}}({\mathcal {H}})\). Thus, by (i) of Lemma 3.1, if \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) then \(\varrho ^{\frac{\alpha }{z}}\in {{\mathcal {L}}}^{z}({\mathcal {H}})\), or equivalently, \(\varrho \in {{\mathcal {L}}}^{\alpha }({\mathcal {H}})\). Therefore,
\[ {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\subseteq {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\cap {{\mathcal {L}}}^{\alpha }({\mathcal {H}}),\quad z\ge 1. \]
Assume now that \(\sigma \) is trace-class and \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) for some \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\). Then, by (i) of Lemma 3.1 and the operator Hölder inequality, \({{\,\textrm{Tr}\,}}\big (\varrho ^{\frac{\alpha }{z}}\big )^r<+\infty \), where \(\frac{1}{r}=\frac{\alpha -1}{2z}+\frac{1}{z}+\frac{\alpha -1}{2z}=\frac{\alpha }{z}\), or equivalently, \(\varrho \in {{\mathcal {L}}}^1({\mathcal {H}})\). Thus, we get
\[ {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\subseteq {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\cap {{\mathcal {L}}}^{1}({\mathcal {H}}). \]
It is easy to see that the above inclusion is strict. Indeed, let \(\sigma \in {{\mathcal {B}}}(l^2(\mathbb {N}))\) be diagonal in the canonical basis of \(l^2(\mathbb {N})\), i.e., \(\sigma =\sum _{k\in \mathbb {N}}s(k)\left| \textbf{1}_{\{k\}}\right\rangle \!\left\langle \textbf{1}_{\{k\}}\right| \) for some \(s:\,\mathbb {N}\rightarrow (0,+\infty )\) such that \(\sum _{k\in \mathbb {N}}s(k)<+\infty \) (i.e., \(\sigma \) is trace-class) and \(\sum _{k\in \mathbb {N}}s(k)^{\frac{\alpha -1}{\alpha }}<+\infty \). Define \(\varrho {:}{=}\sum _{k\in \mathbb {N}}s(k)^{\frac{\alpha -1}{\alpha }}\left| \textbf{1}_{\{k\}}\right\rangle \!\left\langle \textbf{1}_{\{k\}}\right| \), so that \(\varrho ^{\frac{\alpha }{z}}=\sigma ^{\frac{\alpha -1}{z}}\). Then \(\varrho \) is trace-class, and for any sequence \((P_n=\textbf{1}_{(c_n,d_n)}(\sigma ))_{n\in \mathbb {N}}\) as in Lemma 3.1,
\[ \sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}}=P_n, \]
which goes to I in the strong operator topology. Hence, \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\cap {{\mathcal {L}}}^{1}({\mathcal {H}})\), but \(\varrho _{\sigma ,\alpha ,z}=I\) is not in \({{\mathcal {L}}}^{z}({\mathcal {H}})\), and therefore \(\varrho \notin {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\).
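The strictness example can be made concrete by truncating to the first n basis vectors. The sketch below works in the sandwiched case \(z=\alpha \), and assumes the specific choices \(\alpha =2\), \(s(k)=4^{-k}\) and \(\varrho =\sigma ^{\frac{\alpha -1}{\alpha }}\); these numbers and the function name are illustrative and not taken from the paper.

```python
def truncated_traces(n, alpha=2.0):
    """Traces over the first n basis vectors for sigma = diag(4^{-k}), k = 1..n,
    and rho = sigma^{(alpha-1)/alpha}, in the sandwiched case z = alpha.

    Returns (Tr sigma, Tr rho, Tr (rho_{sigma,alpha})^alpha) on the truncation.
    """
    s = [4.0 ** (-k) for k in range(1, n + 1)]
    r = [sk ** ((alpha - 1.0) / alpha) for sk in s]
    # For z = alpha: rho_{sigma,alpha} = sigma^{(1-alpha)/(2 alpha)} rho sigma^{(1-alpha)/(2 alpha)}
    r_sigma = [rk * sk ** ((1.0 - alpha) / alpha) for rk, sk in zip(r, s)]
    return sum(s), sum(r), sum(x ** alpha for x in r_sigma)


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, truncated_traces(n))
```

The first two traces stay bounded as n grows, while the third grows linearly in n, because every diagonal entry of the truncated \(\varrho _{\sigma ,\alpha }\) equals 1. This is the finite-size shadow of \(\varrho _{\sigma ,\alpha }=I\) failing to be in a Schatten class.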
The following is an extension of the Rényi \((\alpha ,z)\)-divergences [4] to the case of infinite-dimensional PSD operators. It is also a special case of Jenčová’s definition of the sandwiched Rényi divergence [27] when \(\varrho \) and \(\sigma \) are trace-class, and \(z=\alpha \), and it is a natural extension of it otherwise.
Definition 3.9
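For \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\), let
\[ Q_{\alpha ,z}(\varrho \Vert \sigma ){:}{=}{\left\{ \begin{array}{ll} {{\,\textrm{Tr}\,}}\varrho _{\sigma ,\alpha ,z}^z, &{} \varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma ),\\ +\infty , &{} \text {otherwise}, \end{array}\right. } \]
with \(\varrho _{\sigma ,\alpha ,z}\) as in Lemma 3.1. The Rényi \((\alpha ,z)\)-divergence of \(\varrho \) and \(\sigma \) is defined as
\[ D_{\alpha ,z}(\varrho \Vert \sigma ){:}{=}\frac{1}{\alpha -1}\log Q_{\alpha ,z}(\varrho \Vert \sigma ). \]
We use the notations \(Q_{\alpha }^{*}{:}{=}Q_{\alpha ,\alpha }\) and \(D_{\alpha }^{*}{:}{=}D_{\alpha ,\alpha }\), and call the latter the sandwiched Rényi \(\alpha \)-divergence.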
We also define the following variants of the Rényi \((\alpha ,z)\)-divergences for trace-class operators:
Definition 3.10
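For PSD trace-class operators \(\varrho ,\sigma \in {{\mathcal {L}}}^1({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\), let
\[ \tilde{D}_{\alpha ,z}(\varrho \Vert \sigma ){:}{=}D_{\alpha ,z}(\varrho \Vert \sigma )-\frac{1}{\alpha -1}\log {{\,\textrm{Tr}\,}}\varrho . \]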
We also use the notation \(\tilde{D}_{\alpha }^{*}{:}{=}\tilde{D}_{\alpha ,\alpha }\).
Remark 3.11
For a convex function f on \([0,+\infty )\), the quantum f -divergence of a pair of positive normal functionals on a von Neumann algebra is defined using the relative modular operator; see [21, 39]. In particular, it is well-defined for a pair of positive trace-class operators \(\varrho ,\sigma \) on a Hilbert space and \(f_{\alpha }{:}{=}{{\,\textrm{id}\,}}_{[0,+\infty )}^{\alpha }\) for any \(\alpha >1\); let it be denoted by \(Q_{f_{\alpha }}(\varrho \Vert \sigma )\). According to [21, Theorem 3.6],
In particular, for PSD trace-class operators \(\varrho \) and \(\sigma \), \(D_{\alpha ,1}(\varrho \Vert \sigma )\) in Definition 3.9 coincides with the Petz-type or standard quantum Rényi \(\alpha \)-divergence of \(\varrho \) and \(\sigma \), just as in the finite-dimensional case; see, e.g. [4].
Remark 3.12
Note that for any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and any \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\),
and
Remark 3.13
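It is clear from their definitions that \(Q_{\alpha ,z}\), \(D_{\alpha ,z}\) and \(\tilde{D}_{\alpha ,z}\) satisfy the scaling properties
\[ Q_{\alpha ,z}(\lambda \varrho \Vert \eta \sigma )=\lambda ^{\alpha }\eta ^{1-\alpha }Q_{\alpha ,z}(\varrho \Vert \sigma ), \]
\[ D_{\alpha ,z}(\lambda \varrho \Vert \eta \sigma )=D_{\alpha ,z}(\varrho \Vert \sigma )+\frac{\alpha }{\alpha -1}\log \lambda -\log \eta , \]
\[ \tilde{D}_{\alpha ,z}(\lambda \varrho \Vert \eta \sigma )=\tilde{D}_{\alpha ,z}(\varrho \Vert \sigma )+\log \lambda -\log \eta , \]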
valid for any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \(\lambda ,\eta \in (0,+\infty )\).
Remark 3.14
According to Lemma 3.1, if \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) then
which is a straightforward generalization of the formula for PSD operators on a finite-dimensional Hilbert space. Moreover, Lemma 3.1 also yields the formula
which generalizes the finite-dimensional expression \({{\,\textrm{Tr}\,}}\big (\varrho ^{\frac{\alpha }{2z}}\sigma ^{\frac{1-\alpha }{z}}\varrho ^{\frac{\alpha }{2z}}\big )^z\). Note that by Lemma 3.1, (3.14) can also be written as
where we use the notation \(\left\| \cdot \right\| _z=({{\,\textrm{Tr}\,}}|\cdot |^z)^{1/z}\) also for \(z\in (0,1)\).
A further connection to the finite-dimensional formula is given by the following:
Lemma 3.15
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(\varrho ^0\le \sigma ^0\), and let \((\alpha ,z)\in (1,+\infty )\times [1,+\infty )\). Then \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\), or equivalently, \(Q_{\alpha ,z}(\varrho \Vert \sigma )<+\infty \), if and only if for any/some sequences \(0<c_n<d_n\) with \(c_n\rightarrow 0\), \(d_n\rightarrow +\infty \), \(\big (\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}} \sigma _n^{\frac{1-\alpha }{2z}}\big )_{n\in \mathbb {N}}\) is a convergent sequence in \({{\mathcal {L}}}^{z}({\mathcal {H}})\), where \(\sigma _n{:}{=}{{\,\textrm{id}\,}}_{(c_n,d_n)}(\sigma )\).
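Moreover, if \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) then
\[ \lim _{n\rightarrow +\infty }\Big \Vert \sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}}-\varrho _{\sigma ,\alpha ,z}\Big \Vert _z=0, \tag{3.15} \]
and if \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) then
\[ Q_{\alpha ,z}(\varrho \Vert \sigma )=\lim _{n\rightarrow +\infty }{{\,\textrm{Tr}\,}}\Big (\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}}\Big )^z \tag{3.16} \]
for any sequences as above.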
Proof
The “if” part follows since convergence in z-norm implies \(\mathrm{(so)}\) convergence, whence \(\varrho _{\sigma ,\alpha ,z}\) exists as in Lemma 3.1, and the \(\mathrm{(so)}\) limit coincides with the z-norm limit, whence \(\varrho _{\sigma ,\alpha ,z}\in {{\mathcal {L}}}^z({\mathcal {H}})\), i.e., \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\).
Assume now that \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\). Then \(\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}} \sigma _n^{\frac{1-\alpha }{2z}} =P_n\varrho _{\sigma ,\alpha ,z}P_n\), with \(P_n{:}{=}\textbf{1}_{(c_n,d_n)}(\sigma )\), and the “only if” part, as well as (3.15), follows from Lemma 2.2.
Note that (3.15) trivially implies (3.16) when \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\). Assume thus that \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\setminus {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\), so that \(Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \). Since \(\sigma _n^{\frac{1-\alpha }{2z}}\varrho \sigma _n^{\frac{1-\alpha }{2z}}=P_n\varrho _{\sigma ,\alpha ,z}P_n\) converges to \(\varrho _{\sigma ,\alpha ,z}\) in the weak operator topology, Lemma 2.4 yields that
from which (3.16) follows. \(\square \)
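The truncation criterion can be illustrated numerically in a commuting toy case. The sketch below is only an illustration under assumptions that are not part of the text: \(\varrho \) and \(\sigma \) are taken to be simultaneously diagonal with the hypothetical eigenvalues \(2^{-k/2}\) and \(2^{-k}\), \(k=0,1,\ldots \), in which case the z-th power of the z-norm of the truncated operator is the sum of \(\varrho _k^{\alpha }\sigma _k^{1-\alpha }=2^{k(\alpha /2-1)}\) over the retained indices, independently of z. The function names `truncated_q` and `cutoff` are made up for this example.

```python
def truncated_q(alpha, c, d, kmax=400):
    """Sum of rho_k**alpha * sigma_k**(1 - alpha) over eigenvalues sigma_k in (c, d).

    Hypothetical commuting example: sigma_k = 2**-k, rho_k = 2**(-k/2), k = 0, 1, ...
    """
    total = 0.0
    for k in range(kmax):
        sigma_k = 2.0 ** (-k)
        rho_k = 2.0 ** (-k / 2)
        if c < sigma_k < d:  # keep only the spectral window (c, d) of sigma
            total += rho_k ** alpha * sigma_k ** (1 - alpha)
    return total


def cutoff(n):
    """Window (c_n, d_n) = (2**-(n + 0.5), 2), which keeps exactly k = 0, ..., n."""
    return 2.0 ** (-(n + 0.5)), 2.0


if __name__ == "__main__":
    for alpha in (1.5, 2.0):
        print(alpha, [round(truncated_q(alpha, *cutoff(n)), 6) for n in (10, 50, 200)])
```

In this toy case the truncated values stabilise for \(\alpha <2\) (a convergent geometric series) and grow without bound at \(\alpha =2\), where \(\varrho ^{\alpha /z}\le \lambda \sigma ^{(\alpha -1)/z}\) still holds with \(\lambda =1\); this matches the distinction between \({{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) and \({{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\setminus {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\).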
Proposition 3.16
For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), and any \(\alpha \in (1,+\infty )\),
In particular, for any \(\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), \({{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) is increasing in z, i.e.,
Proof
It is sufficient to prove that for any \(0<z<z'\), \(Q_{\alpha ,z}(\varrho \Vert \sigma )\ge Q_{\alpha ,z'}(\varrho \Vert \sigma )\) holds. This is obvious when \(Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \), and hence for the rest we assume the contrary, i.e., that \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\). By Lemma 3.4, this implies that \(\varrho \in {{\mathcal {B}}}^{\alpha ,z'}({\mathcal {H}},\sigma )\). Thus, by Lemma 3.15,
According to Araki’s inequality [2, Theorem 2], \({{\,\textrm{Tr}\,}}\varphi \!\big (B^{1/2}AB^{1/2}\big )^q\!\!\le \!{{\,\textrm{Tr}\,}}\varphi \!\big (B^{q/2}A^q B^{q/2}\big )\) for any \(A,B\in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), \(q\in [1,+\infty )\), and monotone increasing continuous function \(\varphi \) on \([0,+\infty )\) such that \(\varphi (0)=0\) and \(t\mapsto \varphi (e^t)\) is convex on \(\mathbb {R}\). Applying this to \(A{:}{=}\varrho ^{\frac{\alpha }{z'}}\), \(B{:}{=}\sigma _n^{\frac{1-\alpha }{z'}}\), \(q{:}{=}\frac{z'}{z}\), and \(\varphi {:}{=}{{\,\textrm{id}\,}}_{[0,+\infty )}^{z}\) yields
for every \(n\in \mathbb {N}\). Thus, by (3.17),
where the equality is again due to Lemma 3.15. \(\square \)
Remark 3.17
As a special case of Proposition 3.16, we get that for any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\),
i.e., the sandwiched Rényi \(\alpha \)-divergence cannot be larger than the Petz-type Rényi \(\alpha \)-divergence. This has been proved for positive normal functionals on a von Neumann algebra (positive trace-class operators in our case) in [6, Theorem 12] and [27, Corollary 3.6] using different methods than in the proof of Proposition 3.16 above.
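The monotonicity in z can be sanity-checked on small matrices. The following is a minimal sketch, assuming invertible real symmetric \(2\times 2\) inputs and the finite-dimensional expression \(Q_{\alpha ,z}(\varrho \Vert \sigma )={{\,\textrm{Tr}\,}}\big (\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma ^{\frac{1-\alpha }{2z}}\big )^z\); the helper names `mat_pow`, `mat_mul` and `q_alpha_z` are hypothetical, and a numerical check of this kind is of course not a proof.

```python
import math


def mat_pow(m, p):
    """p-th power of a real symmetric positive definite 2x2 matrix ((a, b), (b, d))."""
    (a, b), (_, d) = m
    mean = (a + d) / 2
    half_gap = math.hypot((a - d) / 2, b)
    hi, lo = mean + half_gap, mean - half_gap  # eigenvalues, hi >= lo > 0
    if half_gap == 0:  # multiple of the identity
        return ((hi ** p, 0.0), (0.0, hi ** p))
    gap = 2 * half_gap
    # Entries of the spectral projection onto the hi-eigenspace: (m - lo*I) / gap
    e11, e12, e22 = (a - lo) / gap, b / gap, (d - lo) / gap
    f_hi, f_lo = hi ** p, lo ** p
    return ((f_lo + (f_hi - f_lo) * e11, (f_hi - f_lo) * e12),
            ((f_hi - f_lo) * e12, f_lo + (f_hi - f_lo) * e22))


def mat_mul(x, y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))


def q_alpha_z(rho, sigma, alpha, z):
    """Tr (sigma^{(1-alpha)/2z} rho^{alpha/z} sigma^{(1-alpha)/2z})^z for invertible inputs."""
    s = mat_pow(sigma, (1 - alpha) / (2 * z))
    inner = mat_mul(mat_mul(s, mat_pow(rho, alpha / z)), s)
    off = (inner[0][1] + inner[1][0]) / 2  # symmetrise away rounding asymmetry
    out = mat_pow(((inner[0][0], off), (off, inner[1][1])), z)
    return out[0][0] + out[1][1]


if __name__ == "__main__":
    rho = ((0.7, 0.2), (0.2, 0.3))
    sigma = ((0.4, -0.1), (-0.1, 0.6))
    for z in (0.5, 1.0, 2.0, 4.0):  # z = 1: Petz-type, z = alpha = 2: sandwiched
        print(z, q_alpha_z(rho, sigma, 2.0, z))
```

For commuting inputs the value does not depend on z, which gives a simple consistency test for the helpers.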
Remark 3.18
Assume that \(Q_{\alpha }^*(\varrho \Vert \sigma )<+\infty \), i.e., \(\varrho \in {{\mathcal {L}}}^{\alpha }({\mathcal {H}},\sigma )\), for some \(\alpha >1\), and let \(1<\alpha '<\alpha \). Then, by (3.8),
where the inequality follows by the operator Hölder inequality. In particular, if \(\sigma \) is trace-class then \(Q_{\alpha '}^{*}(\varrho \Vert \sigma )<+\infty \). If \({{\,\textrm{Tr}\,}}\sigma =1\) then a simple rearrangement yields
Note that this is weaker than \(D_{\alpha '}^{*}(\varrho \Vert \sigma )\le D_{\alpha }^{*}(\varrho \Vert \sigma )\), which was proved in [28, Proposition 3.7].
Remark 3.19
Since we do not assume the second operator to be trace-class, the expression \(D_{\alpha ,z}(\varrho \Vert I)\) makes sense, and we recover the following identity for the Rényi \(\alpha \)-entropy of a state \(\varrho \in {{\mathcal {S}}}({\mathcal {H}})\), which is well-known in the finite-dimensional case:
(In fact, this makes sense for an arbitrary PSD operator \(\varrho \).)
More importantly, allowing non-trace-class operators enables the definition of conditional \((\alpha ,z)\)-entropies. Following [44], we define two different notions of conditional \((\alpha ,z)\)-entropy between systems A and B in a state \(\varrho _{AB}\in {{\mathcal {S}}}({\mathcal {H}}_A\otimes {\mathcal {H}}_B)\) as
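As a short worked check of the entropy identity, here is a sketch of the computation for a state \(\varrho \in {{\mathcal {S}}}({\mathcal {H}})\). It assumes the expression \(Q_{\alpha ,z}(\varrho \Vert \sigma )={{\,\textrm{Tr}\,}}\big (\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma ^{\frac{1-\alpha }{2z}}\big )^z\) and the normalization \(D_{\alpha ,z}(\varrho \Vert \sigma )=\frac{1}{\alpha -1}\log Q_{\alpha ,z}(\varrho \Vert \sigma )\) for \({{\,\textrm{Tr}\,}}\varrho =1\). With \(\sigma =I\) the outer factors are trivial, so that
\[
Q_{\alpha ,z}(\varrho \Vert I)={{\,\textrm{Tr}\,}}\big (\varrho ^{\frac{\alpha }{z}}\big )^{z}={{\,\textrm{Tr}\,}}\varrho ^{\alpha },\qquad -D_{\alpha ,z}(\varrho \Vert I)=-\frac{1}{\alpha -1}\log {{\,\textrm{Tr}\,}}\varrho ^{\alpha }=\frac{1}{1-\alpha }\log {{\,\textrm{Tr}\,}}\varrho ^{\alpha },
\]
independently of z. For instance, if \(\varrho =P/d\) for a projection P of rank d then \({{\,\textrm{Tr}\,}}\varrho ^{\alpha }=d^{1-\alpha }\), and the right-hand side equals \(\log d\).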
where \(\varrho _B={{\,\textrm{Tr}\,}}_A\varrho _{AB}\) denotes the marginal of \(\varrho _{AB}\) on system B. Again, (3.19)–(3.20) make sense even when \(\varrho _{AB}\) is only assumed to be PSD. Note that while the Rényi entropies (3.18) can be defined directly for \(\varrho \) without reference to any Rényi divergences, this is not the case for the conditional Rényi entropies (3.19)–(3.20), and the ability to take non-trace-class operators at least in the second argument of the divergence is crucial for the definition.
According to Proposition 3.16, for any fixed \(\varrho _{AB}\in {{\mathcal {S}}}({\mathcal {H}}_{A}\otimes {\mathcal {H}}_B)\), and any \(\alpha >1\),
In particular, either version of the sandwiched conditional Rényi entropy is at least as large as the corresponding version of the Petz-type conditional Rényi entropy.
Lemma 3.20
For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\),
Proof
The assertion is trivial when \(\varrho \notin {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\), and hence we assume the contrary. Then
contrary to the assumption that \(\varrho \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\). Hence, the inequalities in (3.21) hold. \(\square \)
Remark 3.21
Stronger bounds than the ones in (3.21) are given below in Corollary 3.27 for trace-class operators.
Lemma 3.22
Let \(\varrho _k,\sigma _k\in {{\mathcal {B}}}({\mathcal {H}}_k){}_{\gneq 0}\), \(k=1,2\). For any \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\),
and \((\varrho _1\otimes \varrho _2)_{\sigma _1\otimes \sigma _2,\alpha ,z}= (\varrho _1)_{\sigma _1,\alpha ,z}\otimes (\varrho _2)_{\sigma _2,\alpha ,z}\). As a consequence,
Proof
The right to left implications in (3.22, 3.23) are obvious from choosing \(R{:}{=} (\varrho _1)_{\sigma _1,\alpha ,z}\otimes (\varrho _2)_{\sigma _2,\alpha ,z}\) in (i) of Lemma 3.1. Assume that \(\varrho _1\otimes \varrho _2\in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}}_1\otimes {\mathcal {H}}_2,\sigma _1\otimes \sigma _2)\). By (iv) of Lemma 3.1, there exists a \(\lambda \ge 0\) such that
Choose any \(\psi _2\notin \ker (\varrho _2)\). For any \(\psi _1\in {\mathcal {H}}_1\), we get
Thus, \(\varrho _1^{\frac{\alpha }{z}}\le \lambda (\kappa _2/\kappa _1)\sigma _1^{\frac{\alpha -1}{z}}\), and again by (iv) of Lemma 3.1, \(\varrho _1\in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}}_1,\sigma _1)\). An exactly analogous argument gives \(\varrho _2\in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}}_2,\sigma _2)\). This proves the left to right implication in (3.22), and we also get
from which \((\varrho _1\otimes \varrho _2)_{\sigma _1\otimes \sigma _2,\alpha ,z}= (\varrho _1)_{\sigma _1,\alpha ,z}\otimes (\varrho _2)_{\sigma _2,\alpha ,z}\), according to (3.3), and thus (3.24) and (3.25) follow due to the multiplicativity of the trace. The left to right implication in (3.23) follows immediately from the above. \(\square \)
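The multiplicativity under tensor products is easy to check in the commuting case. The sketch below assumes diagonal operators with strictly positive \(\sigma \)-eigenvalues, for which \(Q_{\alpha ,z}\) reduces to \(\sum _i p_i^{\alpha }q_i^{1-\alpha }\) for every z; the names `q_classical` and `tensor` are hypothetical helpers introduced only for this illustration.

```python
import math
from itertools import product


def q_classical(p, q, alpha):
    """Q_alpha for commuting operators with eigenvalue lists p, q (q > 0 entrywise)."""
    return sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q))


def tensor(u, v):
    """Eigenvalues of the tensor product of two diagonal operators, in product order."""
    return [a * b for a, b in product(u, v)]


if __name__ == "__main__":
    p1, q1 = [0.5, 0.5], [0.9, 0.1]
    p2, q2 = [0.2, 0.3, 0.5], [1.0, 2.0, 3.0]  # the second operators need not have trace 1
    alpha = 1.7
    lhs = q_classical(tensor(p1, p2), tensor(q1, q2), alpha)
    rhs = q_classical(p1, q1, alpha) * q_classical(p2, q2, alpha)
    print(lhs, rhs, math.isclose(lhs, rhs, rel_tol=1e-12))
```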
3.2 Variational formulas
The following variational representations of \(Q_{\alpha ,z}\) and \(D_{\alpha ,z}\) are very useful to establish their fundamental properties. We will use these variational formulas to prove monotonicity of \(Q_{\alpha ,z}\) under restrictions of the operators to subspaces (Lemma 3.32, Corollary 3.34) and to give a lower bound on the strong converse exponent (Lemma 4.2).
For \(z=\alpha \) (the case of the sandwiched Rényi divergence), the variational formula in (3.26) was given first in [12] for finite-dimensional PSD operators, and was extended to the case of pairs of positive normal functionals on a general von Neumann algebra in [28] (see also [21, Lemma 3.19] for the case \(\alpha <1\)), while the variational formula in (3.27) can be obtained as an intermediate step in the proof of the first variational formula, and it was given in [5] in the finite-dimensional case.
For finite-dimensional invertible PSD operators and arbitrary \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\), both variational formulas (3.26)–(3.27) follow as special cases of [47, Theorem 3.3].
The version below is an extension of the above when \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\) are arbitrary, and the operators \(\varrho ,\sigma \) can be PSD operators on an infinite-dimensional Hilbert space satisfying the conditions in Lemma 3.23. Our proof follows essentially that of [47, Theorem 3.3].
For any \(\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), let
Lemma 3.23
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\), and assume that one of the following holds: a) \(\varrho \notin {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\); b) \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\setminus {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) and \(\sigma \) is compact; c) \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\). Then
The equality in (3.26) still holds if the supremum is taken over \({{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}^+\). Moreover, in cases a) and b), and in case c) if \(\sigma \) is compact, the H operators in (3.26) and (3.27) may additionally be required to be of finite rank.
Proof
For any \(H\in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), let
Assume first that \(\varrho \notin {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\), and hence \(Q_{\alpha ,z}(\varrho \Vert \sigma )=\log Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \). By (iv) of Lemma 3.1, for every \(\lambda >0\) there exists a vector \(x_{\lambda }\in {\mathcal {H}}\) such that
Clearly, for any \(x\in {\mathcal {H}}\), \(H_x{:}{=}\left| x\right\rangle \!\left\langle x\right| \in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}\cap {{\mathcal {B}}}_f({\mathcal {H}})\), and
If \(\langle x_{\lambda },\sigma ^{\frac{\alpha -1}{z}}x_{\lambda }\rangle =0\) for some \(\lambda >0\) then let \(x_{\lambda ,t}{:}{=}tx_{\lambda }+t^{-1}y\), \(t>0\), where \(y\in (\ker \sigma )^{\perp }\setminus \{0\}\) is some fixed vector. Then \(H_{x_{\lambda ,t}}\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}^+\cap {{\mathcal {B}}}_f({\mathcal {H}})\), (3.29) implies that \(\left\langle x_{\lambda } , \varrho ^{\frac{\alpha }{z}}x_{\lambda }\right\rangle >0\), and it is straightforward to verify that
Thus,
and therefore (3.26)–(3.27) hold.
If \(\langle x_{\lambda },\sigma ^{\frac{\alpha -1}{z}}x_{\lambda }\rangle >0\) for every \(\lambda >0\) then let \(\tilde{x}_{\lambda }{:}{=}x_{\lambda } \langle x_{\lambda },\sigma ^{\frac{\alpha -1}{z}}x_{\lambda }\rangle ^{-1/2}\). Then
and
according to (3.29) and (3.30). Thus,
and therefore (3.26)–(3.27) hold, even with the optimizations restricted to finite-rank operators.
This completes the proof of case a), and hence for the rest we assume that b) or c) holds.
Consider any sequences \(0<c_n<d_n\) with \(c_n\rightarrow 0\), \(d_n\rightarrow +\infty \), and let \(P_n{:}{=}\textbf{1}_{(c_n,d_n)}(\sigma )\), \(\sigma _n{:}{=}{{\,\textrm{id}\,}}_{(c_n,d_n)}(\sigma )=P_n\sigma P_n\), and
Then
and similarly,
We have
in particular,
Moreover, if \(Q_{\alpha ,z}(\varrho \Vert \sigma )<+\infty \), i.e., in case c), or if \(\sigma \) is compact (in which case \(H_n\) and \(P_n\varrho _{\sigma ,\alpha ,z}P_n\) are of finite rank) then \(F(H_n)=G(H_n)<+\infty \), whence \(H_n\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}\). Since \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) implies \(\varrho ^0\le \sigma ^0\), it is also true that \(0\ne \sigma _n^{\frac{1-\alpha }{2z}} \varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}}\), and hence \(H_n\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}^+\), for all large enough n.
When \(z\ge 1\), Lemma 2.4 yields
When \(z\in (0,1)\), \({{\,\textrm{id}\,}}_{[0,+\infty )}^z\) is operator concave, and hence \((P_n\varrho _{\sigma ,\alpha ,z}P_n)^z\ge P_n\varrho _{\sigma ,\alpha ,z}^zP_n\), whence
This completes the proof when \(Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \), i.e., in case b).
Assume for the rest that case c) holds. By the above considerations, we have LHS\(\le \)RHS in (3.26)–(3.27), and hence we only have to show the converse inequalities. By Lemma 3.1 and Definition 3.7, \(\varrho ^{\frac{\alpha }{z}}=\sigma ^{\frac{\alpha -1}{2z}}\varrho _{\sigma ,\alpha ,z}\sigma ^{\frac{\alpha -1}{2z}}\), where \({{\,\textrm{Tr}\,}}\varrho _{\sigma ,\alpha ,z}^z<+\infty \). For any \(H\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}\), we have
where we used that \({{\,\textrm{ran}\,}}\varrho ^{\frac{\alpha }{2z}}\subseteq {{\,\textrm{dom}\,}}\sigma ^{\frac{1-\alpha }{2z}}\) and \(\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\in {{\mathcal {B}}}({\mathcal {H}})\), according to Lemma 3.1, and the expression in (3.14) for \(Q_{\alpha ,z}(\varrho \Vert \sigma )\). The first inequality above is due to the operator Hölder inequality, and the second inequality is trivial from the convexity of the exponential function. A simple rearrangement yields that LHS\(\ge \)RHS in (3.26, 3.27), completing the proof. \(\square \)
Remark 3.24
It is interesting that one can formally take the logarithm of each term in (3.26) to obtain (3.27).
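The variational formulas can be explored numerically in the commuting case. The sketch below rests on two assumptions that should be checked against (3.26)–(3.27): that the objective in (3.26) is \(\alpha {{\,\textrm{Tr}\,}}\big (H^{1/2}\varrho ^{\frac{\alpha }{z}}H^{1/2}\big )^{\frac{z}{\alpha }}-(\alpha -1){{\,\textrm{Tr}\,}}\big (H^{1/2}\sigma ^{\frac{\alpha -1}{z}}H^{1/2}\big )^{\frac{z}{\alpha -1}}\), with (3.27) obtained by taking logarithms termwise, and that \(\varrho ,\sigma ,H\) are diagonal with strictly positive entries. Only diagonal test operators H are sampled, so this is a consistency check rather than a verification of the full supremum; all function names are hypothetical.

```python
import math
import random


def q_classical(p, q, alpha):
    """Q_alpha for commuting operators with eigenvalue lists p, q (q > 0 entrywise)."""
    return sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q))


def traces(h, p, q, alpha, z):
    """The two trace terms of the objective for diagonal H = diag(h)."""
    first = sum(hi ** (z / alpha) * pi for hi, pi in zip(h, p))
    second = sum(hi ** (z / (alpha - 1)) * qi for hi, qi in zip(h, q))
    return first, second


def objective(h, p, q, alpha, z):
    """Assumed objective of (3.26), evaluated at diagonal H."""
    first, second = traces(h, p, q, alpha, z)
    return alpha * first - (alpha - 1) * second


def log_objective(h, p, q, alpha, z):
    """Assumed objective of (3.27): termwise logarithm of the two traces."""
    first, second = traces(h, p, q, alpha, z)
    return alpha * math.log(first) - (alpha - 1) * math.log(second)


def optimizer(p, q, alpha, z):
    """Candidate maximizer h_i = (p_i / q_i)**(alpha*(alpha-1)/z), from the first-order condition."""
    return [(pi / qi) ** (alpha * (alpha - 1) / z) for pi, qi in zip(p, q)]


def random_search(fn, p, q, alpha, z, trials=1000, seed=0):
    """Largest value of fn found over randomly sampled diagonal H."""
    rng = random.Random(seed)
    return max(fn([rng.uniform(0.01, 5.0) for _ in p], p, q, alpha, z)
               for _ in range(trials))


if __name__ == "__main__":
    p, q, alpha, z = [0.6, 0.3, 0.1], [0.2, 0.3, 0.5], 1.5, 1.2
    print(q_classical(p, q, alpha), objective(optimizer(p, q, alpha, z), p, q, alpha, z))
    print(random_search(objective, p, q, alpha, z))
```

In this setting the candidate maximizer attains \(Q_{\alpha ,z}(\varrho \Vert \sigma )\) in both forms, and randomly sampled H never exceed it, which is the Hölder step of the proof in its simplest form.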
Remark 3.25
The variational formulas in (3.26)–(3.27) hold for the sandwiched quantities (\(z=\alpha \)) when \(\varrho \in {{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma )\setminus {{\mathcal {L}}}^{\alpha }({\mathcal {H}},\sigma )\) even if \(\sigma \) is not compact [20]. However, we won’t need this fact in the rest of the paper.
Remark 3.26
Note that the case \(z=1\) corresponds to the Petz-type Rényi divergences. By the above, \(\varrho \in {{\mathcal {B}}}^{\alpha ,1}({\mathcal {H}},\sigma )\) if and only if \(\varrho ^{\alpha }\le \lambda \sigma ^{\alpha -1}\) with some \(\lambda \ge 0\), in which case
(See [21, Theorem 3.6] for a generalization of the above in the setting of von Neumann algebras, and also for an analogous formula in the case \(\alpha \in (0,1)\).) Moreover, we have the variational formulas
where \({{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,1}^+= \left\{ H\in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}:\, 0<{{\,\textrm{Tr}\,}}\big (H^{1/2}\sigma ^{\alpha -1}H^{1/2}\big )^{\frac{1}{\alpha -1}}<+\infty \right\} \).
These variational expressions for the Petz-type Rényi divergences do not seem to have appeared in the literature before, even for finite-dimensional operators, although in that case they follow easily from the results of [47].
The variational formulas in Lemma 3.23 can be used to prove the following important properties of the Rényi \((\alpha ,z)\)-divergences.
Corollary 3.27
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(\sigma \) is trace-class. For every \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\),
If, moreover, \(\varrho \) is trace-class then we have
and
Proof
The second inequality in (3.40) follows simply from the convexity of \({{\,\textrm{id}\,}}_{[0,+\infty )}^{\alpha }\). The first inequality is obvious when \(Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \), and hence we may assume the contrary, i.e., that \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\). The assumption that \(\sigma \) is trace-class yields that \(I\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}^+\), and the variational formula in (3.27) with \(H{:}{=}I\) yields the first inequality in (3.40). In fact, we don’t need the “full power” of the variational formula in (3.27) to obtain the first inequality in (3.40), as it follows simply from the Hölder inequality as in (3.36)–(3.39), with \(H=I\).
Assume for the rest that \(\varrho \) is trace-class. The right to left implications are straightforward to verify in both (3.41) and (3.42). Assume now that the equality on the LHS of (3.41) holds. It implies that \(Q_{\alpha ,z}(\varrho \Vert \sigma )\) is finite, i.e., \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\), and the inequality in (3.37) holds as an equality for \(H=I\). Thus, by the characterization of the equality case in Hölder’s inequality (Lemma 2.1), \(\sigma =\lambda \left| \big (\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\big )^* \right| ^{2z}=\lambda \varrho _{\sigma ,\alpha ,z}^z\) for some \(\lambda >0\). From this we get \(\sigma ^{\frac{1}{z}}=\lambda ^{\frac{1}{z}}\varrho _{\sigma ,\alpha ,z}\), and
for every \(n\in \mathbb {N}\), where \(P_n\) is as in Lemma 3.1. Rearranging yields \(\sigma _n^{\frac{\alpha }{z}}=\lambda ^{\frac{1}{z}}P_n\varrho ^{\frac{\alpha }{z}}P_n\). Thus,
(In the last equality we use that \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) implies \(\varrho ^0\le \sigma ^0\).) Hence, \(\sigma =\lambda ^{\frac{1}{\alpha }}\varrho \), i.e., the RHS of (3.41) holds true.
Finally, assume that the equality on the LHS of (3.42) is true. By (3.40), this implies that the equality on the LHS of (3.41) is true, and hence, by the above, \(\sigma =\eta \varrho \) for some \(\eta \in (0,+\infty )\). Moreover, the second inequality in (3.40) holds as an equality, whence \({{\,\textrm{Tr}\,}}\varrho ={{\,\textrm{Tr}\,}}\sigma \), so we get \(\varrho =\sigma \) as given on the RHS of (3.42). \(\square \)
Corollary 3.28
For any \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\), the Rényi \((\alpha ,z)\)-divergence \(D_{\alpha ,z}\) is strictly positive in the sense that for any two density operators \(\varrho ,\sigma \in {{\mathcal {S}}}({\mathcal {H}})\),
Proof
Immediate from Corollary 3.27. \(\square \)
Remark 3.29
Non-negativity of the Rényi \((\alpha ,z)\)-divergences has been proved in [34] in the finite-dimensional case, by different methods. Strict positivity of the sandwiched Rényi \(\alpha \)-divergences with \(\alpha >1\) has been proved in the general von Neumann algebra case in [27].
Finally, we prove the lower semi-continuity of \(Q_{\alpha ,z}\) and \(D_{\alpha ,z}\) on pairs of trace-class operators from the variational formula; we will use this later in the proof of Lemma 3.39.
Corollary 3.30
For any \(\alpha >1\) and \(z\ge \alpha \), \(Q_{\alpha ,z}\) and \(D_{\alpha ,z}\) are lower semi-continuous on \({{\mathcal {L}}}^1({\mathcal {H}})\times {{\mathcal {L}}}^1({\mathcal {H}})\).
Proof
Let \(\varrho _n,\sigma _n\in {{\mathcal {L}}}^1({\mathcal {H}})\), \(n\in \mathbb {N}\), be convergent sequences in trace-norm, with \(\varrho {:}{=}\lim _{n\rightarrow +\infty }\varrho _n\), \(\sigma {:}{=}\lim _{n\rightarrow +\infty }\sigma _n\). Then
Since \(\left\| \varrho _n-\varrho \right\| _{\infty }\le \left\| \varrho _n-\varrho \right\| _1\rightarrow 0\), the continuity of the functional calculus implies \(\Vert {\varrho _n^{\frac{\alpha }{z}}-\varrho ^{\frac{\alpha }{z}}}\Vert _{\infty }\rightarrow 0\); in particular, \(\varrho _n^{\frac{\alpha }{z}}\rightarrow \varrho ^{\frac{\alpha }{z}}\) in the strong operator topology. Hence, by Lemma 2.3, \(\lim _n\left\| \varrho _n^{\frac{\alpha }{z}}-\varrho ^{\frac{\alpha }{z}}\right\| _{\frac{z}{\alpha }}=0\). Thus, for any \(H\in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\),
This shows that for any \(H\in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), \(\varrho \mapsto {{\,\textrm{Tr}\,}}\big (H^{1/2}\varrho ^{\frac{\alpha }{z}} H^{1/2}\big )^{\frac{z}{\alpha }} = \Vert {H^{1/2}\varrho ^{\frac{\alpha }{z}} H^{1/2}}\Vert _{\frac{z}{\alpha }}^{\frac{z}{\alpha }}\) is continuous on \({{\mathcal {L}}}^1({\mathcal {H}})\), and continuity of \(\sigma \mapsto {{\,\textrm{Tr}\,}}\big (H^{1/2}\sigma ^{\frac{\alpha -1}{z}} H^{1/2}\big )^{\frac{z}{\alpha -1}}\) on \({{\mathcal {L}}}^1({\mathcal {H}})\) can be proved in the same way. Thus, by Lemma 3.23, \({{\mathcal {L}}}^1({\mathcal {H}})\times {{\mathcal {L}}}^1({\mathcal {H}})\ni (\varrho ,\sigma )\mapsto Q_{\alpha ,z}(\varrho \Vert \sigma )\) is the supremum of continuous functions, and hence it is lower semi-continuous. The assertion about the lower semi-continuity of \(D_{\alpha ,z}\) follows trivially from this. \(\square \)
Remark 3.31
Lower semi-continuity of the sandwiched Rényi \(\alpha \)-divergences for \(\alpha >1\) (i.e., \(z=\alpha >1\)) was given in [27, Proposition 3.10] in the general von Neumann algebra setting, with a different proof.
3.3 Finite-dimensional approximations
Our next goal is to investigate the relation between the sandwiched Rényi divergences of finite-dimensional restrictions of the operators and those of the unrestricted operators. We start with the following:
Lemma 3.32
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), and \(K\in {{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})_{\varrho ,\sigma }^+\) be a contraction. For any \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\) with \(\max \{\alpha -1,\alpha /2\}\le z\le \alpha \),
Proof
By assumption, \({{\,\textrm{id}\,}}_{[0,+\infty )}^{\frac{\alpha }{z}}\) is operator convex and \({{\,\textrm{id}\,}}_{[0,+\infty )}^{\frac{\alpha -1}{z}}\) is operator concave, whence
according to the operator Jensen inequality [7, Theorem 11].
If \(\varrho \notin {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) then \(Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \) by definition, and (3.43) holds trivially. Hence, for the rest we assume that \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\). Lemma 3.1 yields the existence of some \(\lambda \ge 0\) such that \(\varrho ^{\frac{\alpha }{z}}\le \lambda \sigma ^{\frac{\alpha -1}{z}}\). Thus,
where the first and the last inequalities are due to (3.44). Hence, again by Lemma 3.1, \(K\varrho K^*\in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},K\sigma K^*)\); in particular, the variational formulas in Lemma 3.23 hold for \(K\varrho K^*\) and \(K\sigma K^*\) in place of \(\varrho \) and \(\sigma \), respectively.
For any \(H\in {{\mathcal {B}}}({{\mathcal {K}}})_{K\sigma K^*,\alpha ,z}\),
where the second inequality is due to (3.44). In particular, \(K^*HK\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}\). Similarly,
Plugging (3.45)–(3.46) into the variational formula yields
\(\square \)
Remark 3.33
When \(z=\alpha \) and \(K=P\) is a projection in Lemma 3.32, one could appeal to the monotonicity of \(Q_{\alpha }^{*}\) under positive trace-preserving maps, and its additivity on direct sums [27, Proposition 3.11], to obtain the inequality (3.43) as
Note, however, that these properties were only proved in [27] for positive normal functionals, i.e., positive trace-class operators in our setting, and hence this argument gives (3.43) in a restricted setting compared to that of Lemma 3.32, even when we only consider \(z=\alpha \) and reductions by projections.
Recall that the set of projections on \({\mathcal {H}}\) is an upward directed partially ordered set w.r.t. the PSD order. Lemma 3.32 yields the following:
Corollary 3.34
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\) as in Lemma 3.32. For any contraction \(K\in {{\mathcal {B}}}({\mathcal {H}})_{\varrho ,\sigma }^+\), and any projection \(P\in \mathbb {P}({\mathcal {H}})\) such that \(|K|^0\le P\),
In particular,
Proof
Since \(K(P\varrho P)K^*=K\varrho K^*\) and \(K(P\sigma P)K^*=K\sigma K^*\), (3.47) follows immediately by replacing \(\varrho \) with \(P\varrho P\) and \(\sigma \) with \(P\sigma P\) in Lemma 3.32. The monotonicity in (3.48) follows immediately from this. \(\square \)
Definition 3.35
For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\), let
be the finite-dimensional approximations of \(Q_{\alpha ,z}(\varrho \Vert \sigma )\) and \(D_{\alpha ,z}(\varrho \Vert \sigma )\), respectively. If, moreover, \(\varrho \) is trace-class then we also define
Remark 3.36
It is clear from (3.10) that for any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\),
Lemma 3.37
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\) as in Lemma 3.32. Then
and
Proof
Immediate from Corollary 3.34. \(\square \)
Our next goal is to see when equality in (3.50) holds.
Lemma 3.38
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}})_+\) and \(1<\alpha \le z\) be such that \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\), let \(0<c_n<d_n\), \(n\in \mathbb {N}\), be sequences such that \(c_n\rightarrow 0\), \(d_n\rightarrow +\infty \), and \(P_n{:}{=}\textbf{1}_{(c_n,d_n)}(\sigma )\). Then
Proof
Note that by assumption, \(\varrho ^0\le \sigma ^0\), and for every large enough n, \(P_n\in {{\mathcal {B}}}({\mathcal {H}})_{\varrho ,\sigma }^+\). By Lemma 3.1,
where the inequality follows from the operator Jensen inequality [7, Theorem 11] due to the fact that \(\alpha /z\in (0,1]\) by assumption. Hence,
where the first inequality is due to Lemma 2.4, and the second inequality follows from (3.52). \(\square \)
The range of \((\alpha ,z)\) pairs to which both Lemma 3.37 and Lemma 3.38 apply is \(1<\alpha =z\), i.e., the case of the sandwiched Rényi divergences, and hence for the rest we restrict to this case. Fortunately, this is sufficient for the intended applications in the rest of the paper.
Lemma 3.39
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be trace-class, and \(K_n\in {{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})_{\varrho ,\sigma }^+\), \(n\in \mathbb {N}\), be contractions such that
(That is, \((K_n)_{n\in \mathbb {N}}\) converges in the strong\(^*\) operator topology.) Then
In particular, if \(P_n\in \mathbb {P}({\mathcal {H}})_{\varrho ,\sigma }^+\), \(n\in \mathbb {N}\), is a sequence of projections strongly converging to some \(P_{\infty }\) with \(P_{\infty }\varrho P_{\infty }=\varrho \) and \(P_{\infty }\sigma P_{\infty }=\sigma \) then
Proof
The second inequality in (3.54) is obvious from Lemma 3.32, and the first inequality is trivial. By the assumptions and Lemma 2.2,
Since \(Q_{\alpha }^{*}\) is lower semi-continuous on \({{\mathcal {L}}}^1({\mathcal {H}})\times {{\mathcal {L}}}^1({\mathcal {H}})\) (see Corollary 3.30, or [27, Proposition 3.10]), we get the inequality in (3.53). The last assertion follows obviously. \(\square \)
Lemmas 3.37, 3.38, and 3.39 imply immediately the following:
Proposition 3.40
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), and assume that \(\varrho \) and \(\sigma \) are trace-class, or that \(\sigma \) is compact and \(\varrho \in {{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma )\). Then
for every \(\alpha >1\), where the convergence in (3.56) and (3.58) is a net convergence in the strong operator topology, and (3.57) and (3.59) hold for any sequence \((P_n)_{n\in \mathbb {N}}\) as in Lemma 3.38. If, moreover, \(\varrho \) is trace-class then
Remark 3.41
Finite-dimensional approximability for the standard f-divergences was given in [19, Theorem 4.5] in the general von Neumann algebra setting. In particular, it shows that for any two PSD trace-class operators on a Hilbert space, the standard (or Petz-type) Rényi divergences satisfy
It is an open question whether finite-dimensional approximability holds for any \((\alpha ,z)\) pairs other than \(\alpha \in [0,2]\) and \(z=1\), and \(z=\alpha \in (1,+\infty )\).
There are cases apart from the ones treated in Proposition 3.40 where the inequality in (3.50) holds with equality. In particular, we have the following trivial case, which we will use in the proof of Proposition 4.3.
Lemma 3.42
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\).
Proof
Assume that \(\varrho ^0\nleq \sigma ^0\), so that there exists a unit vector \(\psi \in {\mathcal {H}}\) such that \(\sigma ^0\psi =0\), \(\varrho ^0\psi \ne 0\). Let \(\phi \) be any unit vector such that \(\sigma ^0\phi =\phi \), and for every \(t\in [0,1]\), define \(\psi _t{:}{=}\sqrt{1-t}\psi +\sqrt{t}\phi \), \(P_t{:}{=}\left| \psi _t\right\rangle \!\left\langle \psi _t\right| \). Then
while \(P_t\sigma P_t\ne 0\) for every \(t\in (0,1]\). Thus,
Since \(\varrho ^0\nleq \sigma ^0\) implies that \(\varrho \notin {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) (see Lemma 3.1), we also get \(Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \). \(\square \)
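The blow-up along the rank-one projections \(P_t\) can be made concrete in two dimensions. The sketch below uses a hypothetical example not taken from the text: \(\sigma =\textrm{diag}(1,0)\), \(\varrho =\textrm{diag}(a,b)\) with \(b>0\), \(\phi =e_1\) and \(\psi =e_2\). Since \(P_t\varrho P_t\) and \(P_t\sigma P_t\) are multiples of the same rank-one projection, the value is \(\langle \psi _t,\varrho \psi _t\rangle ^{\alpha }\langle \psi _t,\sigma \psi _t\rangle ^{1-\alpha }\) for every z.

```python
def q_rank_one(t, a, b, alpha):
    """Q_{alpha,z}(P_t rho P_t || P_t sigma P_t) for rho = diag(a, b), sigma = diag(1, 0).

    psi_t = sqrt(1 - t) * e2 + sqrt(t) * e1, so <psi_t, sigma psi_t> = t and
    <psi_t, rho psi_t> = t * a + (1 - t) * b; the value does not depend on z.
    """
    r = t * a + (1 - t) * b
    s = t
    return r ** alpha * s ** (1 - alpha)


if __name__ == "__main__":
    for t in (1e-2, 1e-4, 1e-6):
        print(t, q_rank_one(t, a=0.5, b=0.5, alpha=2.0))
```

As \(t\rightarrow 0\) the numerator stays bounded away from zero while \(\langle \psi _t,\sigma \psi _t\rangle ^{1-\alpha }\) diverges, so the finite-dimensional approximations are unbounded, in line with \(Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \).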
The finite-dimensional approximability of the sandwiched Rényi divergences in Proposition 3.40 is the key property used in proving the main results of the paper, the equality of the sandwiched and the regularized measured Rényi divergences, and the determination of the strong converse exponent of state discrimination, in Sects. 3.4 and 4.1.
The following monotonicity result has been proved for finite-rank states in [35], and for states of a general von Neumann algebra in [6, 27]. We give a different proof of it in our setting as an illustration of the use of the finite-dimensional approximability in extending finite-dimensional results to infinite dimension. We will give yet another proof in Sect. 3.4, using a different representation of the sandwiched Rényi divergences.
Corollary 3.43
Let \(\varrho ,\sigma \in {{\mathcal {L}}}^1({\mathcal {H}}){}_{\gneq 0}\) be PSD trace-class operators. Then
and
Proof
These are well-known when \(\varrho \) and \(\sigma \) are finite-rank [35]. Thus, by (3.60), the monotonicity in (3.62) holds. This also shows that the first limit in (3.63) exists, and it is trivial by definition that it is equal to the second limit. To show the last equality in (3.63), it is sufficient to consider the case when \(\varrho \) and \(\sigma \) are density operators, due to the scaling properties in Remark 3.13. Then \(D_{\alpha }^{*}(\varrho \Vert \sigma )=\tilde{D}_{\alpha }^{*}(\varrho \Vert \sigma )\) for every \(\alpha >1\), and
Here, the first three equalities are trivial, and the fourth one follows by Proposition 3.40. The fifth equality is again trivial, and the sixth one is by definition. In the seventh equality we use that both \(\alpha \mapsto \tilde{D}_{\alpha }^{*}(P\varrho P\Vert P\sigma P)\) and \(\alpha \mapsto \frac{1}{\alpha -1}\log {{\,\textrm{Tr}\,}}P\varrho P\) are increasing, and hence the supremum of their sum over \(\alpha >1\) is the sum of their limits at \(\alpha \rightarrow +\infty \), which is equal to \(D_{\max }(P\varrho P\Vert P\sigma P)\), according to the known behaviour in the finite-dimensional case. The last equality is straightforward to verify. \(\square \)
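The finite-dimensional behaviour invoked in the proof can be observed directly in the commuting case. The following sketch assumes probability vectors p, q with q strictly positive, for which the sandwiched Rényi divergence reduces to the classical one and \(D_{\max }\) is the logarithm of the largest ratio \(p_i/q_i\); the names `renyi` and `d_max` are hypothetical.

```python
import math


def renyi(p, q, alpha):
    """Classical Renyi divergence of probability vectors p, q (q > 0 entrywise), alpha > 1."""
    return math.log(sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q))) / (alpha - 1)


def d_max(p, q):
    """log of the smallest lambda with p <= lambda * q entrywise."""
    return math.log(max(pi / qi for pi, qi in zip(p, q)))


if __name__ == "__main__":
    p, q = [0.6, 0.3, 0.1], [0.2, 0.3, 0.5]
    for alpha in (1.5, 2.0, 5.0, 20.0, 200.0):
        print(alpha, renyi(p, q, alpha))
    print("D_max", d_max(p, q))
```

The printed values increase with \(\alpha \) and approach \(D_{\max }\) from below.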
3.4 Regularized measured Rényi divergence
A finite-outcome positive operator-valued measure (POVM) on a Hilbert space \({\mathcal {H}}\) is a map \(M:\,[r]\rightarrow {{\mathcal {B}}}({\mathcal {H}})\), where \([r]{:}{=}\{1,\ldots ,r\}\), each \(M_i\) is PSD, and \(\sum _{i=1}^rM_i=I\). (We assume without loss of generality that the set of possible outcomes is a subset of \(\mathbb {N}\).) We denote the set of such POVMs by \(\text {POVM}({\mathcal {H}},[r])\). For two PSD trace-class operators \(\varrho ,\sigma \in {{\mathcal {L}}}^1({\mathcal {H}}){}_{\gneq 0}\), their measured Rényi divergence is defined as
where in the second expression we have the classical Rényi divergence [42] of the given non-negative functions on [r]. This is defined for \(p,q\in [0,+\infty )^{[r]}\backslash \{0\} \) as
One might consider more general POVMs for the definition, but that does not change the value of the measured Rényi divergence; see, e.g., [21, Proposition 5.2]. The regularized measured Rényi divergence of \(\varrho \) and \(\sigma \) is then defined as
The following has been shown in [33]:
Lemma 3.44
For finite-rank PSD operators \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\),
In the proof of the next theorem, we will use the monotonicity of the sandwiched Rényi \(\alpha \)-divergences under finite-outcome measurements for \(\alpha >1\). The more general statement of monotonicity under quantum operations has been proved in [6, Theorem 14] and [27, Theorem 3.14] in the general von Neumann algebra setting. We give a different proof for trace-class operators on a Hilbert space in Corollary 4.15 below.
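Monotonicity under finite-outcome measurements has a simple classical counterpart that can be checked numerically. The sketch below assumes diagonal \(\varrho =\textrm{diag}(p)\), \(\sigma =\textrm{diag}(q)\) and diagonal POVM elements, and it assumes the normalization \(D_{\alpha }(p\Vert q)=\frac{1}{\alpha -1}\log \sum _ip_i^{\alpha }q_i^{1-\alpha }-\frac{1}{\alpha -1}\log \sum _ip_i\) for the classical Rényi divergence with q strictly positive; the helper names are hypothetical.

```python
import math


def renyi(p, q, alpha):
    """Classical Renyi divergence of non-negative vectors p, q (q > 0 entrywise), alpha > 1.

    Assumed normalization: (log sum p^alpha q^(1-alpha) - log sum p) / (alpha - 1).
    """
    q_val = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q))
    return (math.log(q_val) - math.log(sum(p))) / (alpha - 1)


def measure(povm, p):
    """Outcome weights Tr(M_i rho) for rho = diag(p) and diagonal M_i = diag(povm[i])."""
    return [sum(m_k * p_k for m_k, p_k in zip(m, p)) for m in povm]


def is_povm(povm, tol=1e-12):
    """Check that the diagonal elements are non-negative and sum to the identity."""
    dim = len(povm[0])
    return (all(m_k >= 0 for m in povm for m_k in m)
            and all(abs(sum(m[k] for m in povm) - 1.0) < tol for k in range(dim)))


if __name__ == "__main__":
    p, q = [0.6, 0.3, 0.1], [0.2, 0.3, 0.5]
    povm = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]  # two-outcome coarse-graining
    print(is_povm(povm), renyi(measure(povm, p), measure(povm, q), 2.0), renyi(p, q, 2.0))
```

In this commuting setting the measurement in the common eigenbasis reproduces \(D_{\alpha }(p\Vert q)\) itself, and any coarser POVM can only decrease the value.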
Theorem 3.45
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be trace-class, and \(\alpha >1\). Then
Proof
The inequality \(\overline{D}_{\alpha }^{\text {meas}}(\varrho \Vert \sigma )\le D_{\alpha }^{*}(\varrho \Vert \sigma )\) is trivial from the monotonicity of \(D_{\alpha }^{*}\) under quantum operations and its additivity under tensor products (Lemma 3.22), and hence we only need to prove the converse inequality. By Proposition 3.40, for any \(c<D_{\alpha }^{*}(\varrho \Vert \sigma )\) there exists a finite-rank projection \(P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\) such that
By Lemma 3.44, there exist \(n\in \mathbb {N}\), a number \(r\in \mathbb {N}\), and \(M_i\in {{\mathcal {B}}}({\mathcal {H}}^{\otimes n}){}_{\gneq 0}\), \(M_i^0\le P^{\otimes n}\), \(i\in [r]\), with \(\sum _{i\in [r]}M_i=P^{\otimes n}\), such that
Let us define \(\tilde{M}_i{:}{=}M_i\), \(i\in [r]\), and \(\tilde{M}_{r+1}{:}{=}I_{{\mathcal {H}}^{\otimes n}}-P^{\otimes n}\). Then \((\tilde{M}_i)_{i\in [r+1]}\) is a POVM on \({\mathcal {H}}^{\otimes n}\), and we have
where the first inequality is by (3.64), the equality and the second inequality are trivial, and the third and the fourth inequalities are by definition. Thus, \(c<\overline{D}_{\alpha }^{\text {meas}}(\varrho \Vert \sigma )\), and since the above holds for every \(c<D_{\alpha }^{*}(\varrho \Vert \sigma )\), the assertion follows. \(\square \)
The representation of the sandwiched Rényi divergences given in Theorem 3.45 distinguishes them among all quantum generalizations of the classical Rényi divergences; in particular, it gives special importance to the \(\alpha =z\) case in the family of Rényi \((\alpha ,z)\)-divergences, at least for \(\alpha >1\). It also allows one to deduce some important properties of the sandwiched Rényi divergences from those of the classical Rényi divergences; we present such an example in Corollary 3.46. Note that the properties in Corollary 3.46 were also proved in [6, 27] in the general von Neumann algebra setting, by different methods. Yet another proof was given in our setting in Corollary 3.43.
Corollary 3.46
Let \(\varrho ,\sigma \in {{\mathcal {L}}}^1({\mathcal {H}}){}_{\gneq 0}\) be PSD trace-class operators. Then
and
Proof
The increasing property in (3.65) is well-known and easy to verify for commuting finite-rank states (i.e., in the finite-dimensional classical setting). The general case follows immediately from this and Theorem 3.45. The first equality in (3.66) is immediate from the increasing property in (3.65), and the second equality is trivial by definition.
Note that the equality in (3.68) is by definition (see (3.9)), and it is clear that \(D_{\max }(\varrho \Vert \sigma )\) is an upper bound on (3.69). To prove the converse inequality, note first that (3.69) is equal to \(+\infty \) if \(\varrho ^0\nleq \sigma ^0\), and hence for the rest we assume the contrary. Let \(0<\lambda <\exp (D_{\max }(\varrho \Vert \sigma ))\). By definition, there exists a unit vector \(\psi \in {\mathcal {H}}\) such that \(\left\langle \psi , \varrho \psi \right\rangle >\lambda \left\langle \psi , \sigma \psi \right\rangle \). In particular, \(\left\langle \psi , \varrho \psi \right\rangle >0\), and hence also \(\left\langle \psi , \sigma \psi \right\rangle >0\), due to the assumption that \(\varrho ^0\le \sigma ^0\). Choosing \(T{:}{=}\left| \psi \right\rangle \!\left\langle \psi \right| \) shows that (3.69) is lower bounded by \(\log \lambda \) for any such \(\lambda \), and hence it is also lower bounded by \(D_{\max }(\varrho \Vert \sigma )\). Thus, we get the equality in (3.69).
It is also straightforward to verify that the expressions in (3.66) are upper bounded by \(D_{\max }(\varrho \Vert \sigma )\). To prove the converse inequality, note that for any test T as in (3.69),
whence
Taking the supremum over T yields that \(\lim _{\alpha \rightarrow +\infty }D_{\alpha }^{*}(\varrho \Vert \sigma )\) is lower bounded by (3.69), which in turn is equal to \(D_{\max }(\varrho \Vert \sigma )\) by the above. \(\square \)
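The two properties in Corollary 3.46 are easy to observe numerically for commuting finite-rank states, which is the case the proof reduces to. The sketch below assumes the same normalisation of the classical Rényi divergence as stated in its docstring, assumes strictly positive \(q\), and uses hypothetical helper names (`renyi_divergence`, `d_max`); it is an illustration, not part of the argument.

```python
import math

def renyi_divergence(p, q, alpha):
    """Classical Renyi divergence of order alpha > 1, evaluated in the log domain.

    Assumed convention: [log sum_i p_i^alpha q_i^(1-alpha) - log sum_i p_i] / (alpha - 1),
    with all q_i > 0 so that the divergence is finite.
    """
    log_terms = [alpha * math.log(pi) + (1 - alpha) * math.log(qi)
                 for pi, qi in zip(p, q) if pi > 0]
    m = max(log_terms)  # log-sum-exp shift to avoid overflow for large alpha
    log_q_alpha = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return (log_q_alpha - math.log(sum(p))) / (alpha - 1)

def d_max(p, q):
    """log of the smallest lambda with p_i <= lambda * q_i for every i."""
    return math.log(max(pi / qi for pi, qi in zip(p, q) if pi > 0))

p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
alphas = [1.5, 2.0, 5.0, 20.0, 200.0]
values = [renyi_divergence(p, q, a) for a in alphas]
limit = d_max(p, q)

for a, v in zip(alphas, values):
    print(f"alpha = {a:6.1f}   D_alpha = {v:.6f}")
print(f"D_max = {limit:.6f}")
```

The printed values increase with \(\alpha \) and approach \(D_{\max }\) from below, as the corollary asserts for trace-class operators.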
3.5 The Hoeffding anti-divergences
For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), let
We will need these quantities to define the Hoeffding anti-divergences, which will give the strong converse exponent of state discrimination in Sect. 4.
Lemma 3.47
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) with \(\varrho ^0\le \sigma ^0\).
(i) For any finite-rank projection \(P\!\!\in \!\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\), \(\psi ^{*}\!(P\varrho P\Vert P\sigma P|\cdot \!)\) and \(\tilde{\psi }^{*}\!(P\varrho P\Vert P\sigma P|\cdot \!)\) are finite-valued convex functions on \((1,+\infty )\) and (0, 1), respectively, and hence they are continuous. Moreover,
and the so extended functions \(\psi ^{*}(P\varrho P\Vert P\sigma P|\cdot )\) and \(\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|\cdot )\) are convex and continuous on \([1,+\infty ]\) and on [0, 1], respectively.
(ii) For every \(\alpha \in [1,+\infty ]\), \(P\mapsto \psi ^{*}(P\varrho P\Vert P\sigma P|\alpha )\) is monotone increasing on \(\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\).
(iii) For every \(u\in [0,1]\), \(P\mapsto \tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u)\) is monotone increasing on \(\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\).
Proof
By [33, Corollary 3.11], \(\psi ^{*}(P\varrho P\Vert P\sigma P|\cdot )\) is a finite-valued convex function on \((1,+\infty )\). Hence, it can be written as \(\psi ^{*}(P\varrho P\Vert P\sigma P|\alpha )=\sup _{i\in {{\mathcal {I}}}}\{c_i\alpha +d_i\}\), \(\alpha \in (1,+\infty )\), with some \(c_i,d_i\in \mathbb {R}\) and an index set \({{\mathcal {I}}}\). This implies that \(\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u)=(1-u)\sup _{i\in {{\mathcal {I}}}}\{c_i(1-u)^{-1}+d_i\}= \sup _{i\in {{\mathcal {I}}}}\{c_i+d_i(1-u)\}\), and therefore \(\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|\cdot )\) is also convex and finite-valued on (0, 1), and thus it is continuous as well. The limits in (3.70)–(3.71) follow by a straightforward computation, using in the second limit that \(\lim _{\alpha \rightarrow +\infty }D_{\alpha }^{*}(\omega \Vert \tau )=D_{\max }(\omega \Vert \tau )\) for finite-rank states \(\omega ,\tau \) (see [35] or Corollary 3.46). Convexity and continuity of the extensions are obvious from the definitions. The monotonicity statements in (ii) and (iii) are immediate from Corollary 3.34. \(\square \)
Corollary 3.48
For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), the functions
are convex and lower semi-continuous on \(\mathbb {R}\) (and on \(\mathbb {R}\cup \{+\infty \}\) in the case of \(\psi ^{*}(\varrho \Vert \sigma |\cdot ){}_{\textrm{fa}}\)).
Proof
If \(\varrho ^0\nleq \sigma ^0\) then all three functions are easily seen to be constant \(+\infty \) on \(\mathbb {R}\), and hence for the rest we assume that \(\varrho ^0\le \sigma ^0\). Since the supremum of convex functions is again convex, and the supremum of lower semi-continuous functions is again lower semi-continuous, both properties hold for the above functions on \([1,+\infty ]\), [0, 1], and [0, 1], respectively, according to Lemma 3.47, and it is trivial to verify that the same is true on the whole of \(\mathbb {R}\). \(\square \)
Remark 3.49
It is clear that
This motivates us to define
Remark 3.50
By Corollary 3.48, if \(\varrho \) and \(\sigma \) are such that \(Q_{\alpha }^{*}(\varrho \Vert \sigma )= Q_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\), \(\alpha >1\), then \(\psi ^{*}(\varrho \Vert \sigma |\cdot )\) and \(\tilde{\psi }^{*}(\varrho \Vert \sigma |\cdot )\) are convex and lower semi-continuous on \((1,+\infty )\) and on (0, 1), respectively. In particular, this holds when both \(\varrho \) and \(\sigma \) are trace-class, according to Proposition 3.40.
Recall the definition of the finite-dimensional approximation of the sandwiched Rényi divergences as a special case of Definition 3.35: For \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\),
Analogously, we define
Definition 3.51
For \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \(r\in \mathbb {R}\), let
Here, \(H_r^{*}(\varrho \Vert \sigma )\) and \({\hat{H}}_r^{*}(\varrho \Vert \sigma )\) are two different versions of the Hoeffding anti-divergence of \(\varrho \) and \(\sigma \) with parameter \(r\in \mathbb {R}\), and the rest of the quantities are different finite-dimensional approximations.
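For commuting finite-rank density operators, the Legendre-Fenchel structure of the Hoeffding anti-divergence can be explored numerically. The sketch below rests on assumptions that should be read as such: it takes \(\psi ^{*}(\varrho \Vert \sigma |\alpha )=\log \sum _i p_i^{\alpha }q_i^{1-\alpha }\) for probability vectors \(p,q\), uses \(\tilde{\psi }^{*}(u)=(1-u)\psi ^{*}((1-u)^{-1})\) with the value \(D_{\max }\) at \(u=1\), and approximates the supremum of \(ur-\tilde{\psi }^{*}(u)\) on a uniform grid in \([0,1]\). The function names are hypothetical.

```python
import math

def log_q_alpha(p, q, alpha):
    """log sum_i p_i^alpha q_i^(1-alpha), computed stably via log-sum-exp."""
    log_terms = [alpha * math.log(pi) + (1 - alpha) * math.log(qi)
                 for pi, qi in zip(p, q) if pi > 0]
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))

def psi_tilde(p, q, u):
    """(1 - u) * log Q_{1/(1-u)} on [0, 1), extended by D_max at u = 1 (assumed convention)."""
    if u == 1.0:
        return math.log(max(pi / qi for pi, qi in zip(p, q) if pi > 0))
    return (1.0 - u) * log_q_alpha(p, q, 1.0 / (1.0 - u))

def hoeffding_anti_divergence(p, q, r, grid=2000):
    """Grid approximation of max over u in [0, 1] of u * r - psi_tilde(u)."""
    return max(k / grid * r - psi_tilde(p, q, k / grid) for k in range(grid + 1))

p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
relative_entropy = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

for r in (0.2, 0.5, 2.0):
    print(f"r = {r:.1f}   H_r = {hoeffding_anti_divergence(p, q, r):.6f}")
print(f"D(p||q) = {relative_entropy:.6f}")
```

In this example the computed value vanishes for \(r\) below the relative entropy and is strictly positive above it, and for large \(r\) the maximum is attained at \(u=1\), giving \(r-D_{\max }\).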
Remark 3.52
\(H_r^{*}\) and \({\hat{H}}_r^{*}\) are called anti-divergences because for trace-class operators they are monotone non-decreasing under quantum operations; this is immediate from the monotone non-increasing property of \(D_{\alpha }^{*}\) under such maps for \(\alpha > 1\); see [6, Theorem 14], [27, Theorem 3.14], or Theorem 4.14.
The Hoeffding anti-divergences are defined as Legendre-Fenchel transforms (polar functions). For some of them this transformation can be reversed as follows; this will be used in Theorem 4.14.
Lemma 3.53
For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\),
Proof
By Corollary 3.48, \(\tilde{\psi }^{*}(\varrho \Vert \sigma |\cdot ){}_{\textrm{fa}}\) and \(\tilde{\psi }^{*}(\varrho \Vert \sigma |\cdot )_{\overline{\textrm{fa}}}\) are convex and lower semi-continuous on \(\mathbb {R}\), and hence (3.83)–(3.84) follow from (3.80)–(3.81) according to the bipolar theorem (see, e.g., [11, Proposition 4.1]). Likewise, \(r\mapsto H_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\) is the polar function of \(f(u){:}{=}\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}+(+\infty )\textbf{1}_{\{0,1\}}(u)\), \(u\in \mathbb {R}\), and hence, by [11, Proposition 4.1], its polar function is the largest convex and lower semi-continuous minorant of f, which is exactly \(\tilde{\psi }^{*}(\varrho \Vert \sigma |\cdot ){}_{\textrm{fa}}\). This proves (3.82). \(\square \)
The different variants of the Hoeffding anti-divergence defined above appear naturally in different bounds on the strong converse exponents; see Sect. 4. Our next goal is to explore their relations; in particular, to find sufficient conditions for some or all of them to coincide. Note that this is not always the case, as shown in Examples 3.59–3.60.
Lemma 3.54
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\). For any \(u\in [0,1]\), and any \(r\in \mathbb {R}\),
and
In particular, if \(\varrho \) and \(\sigma \) are trace-class, or \(\sigma \) is compact and \(\varrho \in {{\mathcal {B}}}^{\infty }({\mathcal {H}},\sigma )\), then all equalities in (3.85)–(3.86) hold.
Proof
The inequalities are immediate from (3.50) and the definitions of the given quantities. The equivalence in (3.85) is trivial by definition, as is the implication in (3.86). The last assertion follows from Proposition 3.40. \(\square \)
Lemma 3.55
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(D_{\alpha _0}^{*}(\varrho \Vert \sigma )<+\infty \) (equivalently, \(\varrho \in {{\mathcal {L}}}^{\alpha _0}({\mathcal {H}},\sigma )\)) for some \(\alpha _0\in (1,+\infty )\). Then
Proof
It is enough to prove that
We prove the first equality, as the second one follows the same way. If \(\tilde{\psi }^{*}(\varrho \Vert \sigma |0){}_{\textrm{fa}}=+\infty \) then there is nothing to prove, and hence we assume the contrary. Also by assumption,
where \(u_0{:}{=}(\alpha _0-1)/\alpha _0\). By Corollary 3.48, \(\tilde{\psi }^{*}(\varrho \Vert \sigma |\cdot ){}_{\textrm{fa}}\) is convex on [0, 1], and finiteness at 0 and \(u_0\) implies \(\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}<+\infty \), \(u\in [0,u_0]\). By Lemma 3.20, we also have \(\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}>-\infty \), \(u\in [0,u_0]\). Hence, \(u\mapsto ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}\) is a finite-valued concave and upper semi-continuous function on \([0,u_0]\), whence it is also continuous on \([0,u_0]\). This proves the asserted equality. \(\square \)
Proposition 3.56
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(D_{\alpha _0}^{*}(\varrho \Vert \sigma )<+\infty \) for some \(\alpha _0\in (1,+\infty )\), and \(Q_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}=Q_{\alpha }^{*}(\varrho \Vert \sigma )\), \(\alpha >1\). Then, for every \(r\in \mathbb {R}\),
Proof
Immediate from Lemmas 3.54 and 3.55. \(\square \)
Remark 3.57
Some further properties of, and relations among, the different Hoeffding anti-divergences are given in Appendix A. While these are not used in the rest of the paper, they might give some extra insight into the different bounds given in Proposition 4.5.
We close this section with some statements on the possible values of the Hoeffding anti-divergences. For these, we will need the notion of the Umegaki relative entropy [45]. For two finite-rank PSD operators \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), it is defined as
where \({{\,\mathrm{\widehat{\log }}\,}}x{:}{=}\log x\), \(x>0\), and \({{\,\mathrm{\widehat{\log }}\,}}0{:}{=}0\). For positive normal functionals on a von Neumann algebra, it may be defined using the relative modular operator [1]. In the simple case of PSD trace-class operators \(\varrho ,\sigma \) on a separable Hilbert space \({\mathcal {H}}\), their relative entropy may be expressed equivalently as [19, Theorem 4.5]
where the second equality holds for any increasing sequence \(P_n\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\), \(n\in \mathbb {N}\), converging strongly to I. For non-zero PSD trace-class operators \(\varrho ,\sigma \) and \(\lambda ,\eta \in (0,+\infty )\), the scaling laws
are easy to verify from the definitions (see also Remark 3.13). It was shown in [6, 27] that
Lemma 3.58
Let \(\varrho ,\sigma \in {{\mathcal {L}}}^1({\mathcal {H}}){}_{\gneq 0}\) be PSD trace-class operators.
(i) For every \(r\in \mathbb {R}\),
(ii) If there exists an \(\alpha _0\in (1,+\infty )\) such that \(D_{\alpha _0}^{*}(\varrho \Vert \sigma )<+\infty \) then
(iii) If \(D_{\alpha }^{*}(\varrho \Vert \sigma )=+\infty \) for every \(\alpha \in (1,+\infty )\) then
Proof
(i) By the scaling laws (3.90)–(3.92),
According to Corollary 3.46, \(\displaystyle {\lim _{\alpha \rightarrow +\infty }}\) \(D_{\alpha }^{*}\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg ) =D_{\max }\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg ) =D_{\max }\big (\varrho \Vert \sigma \big )-\log {{\,\textrm{Tr}\,}}\varrho +\log {{\,\textrm{Tr}\,}}\sigma \), and hence,
proving (3.95).
(ii) The equalities in (3.96) follow from Proposition 3.56. Using the assumption \(D_{\alpha _0}^{*}(\varrho \Vert \sigma )<+\infty \), (3.94) and (3.92) give
In particular, the above limit is finite, and thus
where in the second expression we used (3.100), and the inequality is by definition. On the other hand, (3.100) shows that \(H_r^{*}(\varrho \Vert \sigma )>-\log {{\,\textrm{Tr}\,}}\varrho \) holds if and only if
where the equalities are due to (3.101). Note that (3.102) is exactly the condition in the second line of (3.97), and hence we obtain the first line in (3.97). Assume now that r is as in (3.102). Then
proving the second line of (3.97).
(iii) By Lemma 3.54, \(\tilde{\psi }^{*}(\varrho \Vert \sigma |u)=\tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}=\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}=+\infty \) for every \(u\in (0,1)\), whence \(H_r^{*}(\varrho \Vert \sigma )=H_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}=H_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}= -\infty \). On the other hand, \(\tilde{\psi }^{*}(\varrho \Vert \sigma |0)=\tilde{\psi }^{*}(\varrho \Vert \sigma |0)_{\overline{\textrm{fa}}}=\tilde{\psi }^{*}(\varrho \Vert \sigma |0){}_{\textrm{fa}}=\log {{\,\textrm{Tr}\,}}\varrho \), according to Remark 3.49, and \(\tilde{\psi }^{*}(\varrho \Vert \sigma |1)=\tilde{\psi }^{*}(\varrho \Vert \sigma |1)_{\overline{\textrm{fa}}}=\tilde{\psi }^{*}(\varrho \Vert \sigma |1){}_{\textrm{fa}}=D_{\max }(\varrho \Vert \sigma )=+\infty \), where the last equality follows from Corollary 3.46. Hence, \({\hat{H}}_r^{*}(\varrho \Vert \sigma )={\hat{H}}_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}={\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}=-\log {{\,\textrm{Tr}\,}}\varrho \). \(\square \)
Example 3.59
Let \((e_n)_{n\in \mathbb {N}}\) be an orthonormal basis in \({\mathcal {H}}\), and \(\varrho {:}{=}c_1\sum _{n=1}^{+\infty }n^{-\beta }\left| e_n\right\rangle \!\left\langle e_n\right| \), \(\sigma {:}{=}c_2\sum _{n=1}^{+\infty }n^{-n^{\gamma }}\left| e_n\right\rangle \!\left\langle e_n\right| \), with some \(\beta >1\) and \(\gamma >0\), where \(c_1\) and \(c_2\) are chosen so that \(\varrho \) and \(\sigma \) are density operators. Obviously, \(\varrho \) and \(\sigma \) are commuting (classical). For \(P_N{:}{=}\sum _{n=1}^N\left| e_n\right\rangle \!\left\langle e_n\right| \), we have
whence
and
according to Lemma 3.58. Note also that
For the relative entropy we get
if \(\beta >\gamma +1\). Hence, the assumption \(D(\varrho \Vert \sigma )<+\infty \) alone is not sufficient for the conclusions of Lemma 3.55 and Lemma 3.58.
This also gives an example where
which is contrary to the case where \(D_{\alpha _0}^{*}(\varrho \Vert \sigma )<+\infty \) for some \(\alpha _0>1\); see (3.94). This kind of behaviour was already pointed out in [19, Remark 5.4].
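The behaviour in Example 3.59 can be made tangible with a short numerical experiment on the truncations by \(P_N\). The sketch below is illustrative only: it fixes \(\beta =3\), \(\gamma =1\) and \(\alpha =2\), approximates \(c_1,c_2\) by truncated series, uses that for commuting operators \(Q_{\alpha }^{*}\) reduces to \(\sum _n\varrho _n^{\alpha }\sigma _n^{1-\alpha }\) on the eigenvalues, and assumes the relative entropy formula \(\sum _n\varrho _n(\log \varrho _n-\log \sigma _n)\). All function names are hypothetical.

```python
import math

BETA, GAMMA = 3.0, 1.0            # illustrative parameters with BETA > GAMMA + 1
NORMALISATION_CUTOFF = 200_000    # truncation used to approximate c_1

# Normalisation constants, approximated by truncating the defining series.
log_c1 = -math.log(sum(n ** -BETA for n in range(1, NORMALISATION_CUTOFF + 1)))
log_c2 = -math.log(sum(math.exp(-(n ** GAMMA) * math.log(n)) for n in range(1, 200)))

def log_rho_eig(n):
    """log of the n-th eigenvalue c_1 * n^(-beta) of rho."""
    return log_c1 - BETA * math.log(n)

def log_sigma_eig(n):
    """log of the n-th eigenvalue c_2 * n^(-n^gamma) of sigma."""
    return log_c2 - (n ** GAMMA) * math.log(n)

def log_q_alpha_truncated(alpha, N):
    """log sum_{n<=N} rho_n^alpha sigma_n^(1-alpha), via log-sum-exp."""
    log_terms = [alpha * log_rho_eig(n) + (1 - alpha) * log_sigma_eig(n)
                 for n in range(1, N + 1)]
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))

def relative_entropy_truncated(N):
    """sum_{n<=N} rho_n (log rho_n - log sigma_n)."""
    return sum(math.exp(log_rho_eig(n)) * (log_rho_eig(n) - log_sigma_eig(n))
               for n in range(1, N + 1))

for N in (10, 20, 50):
    print(f"N = {N:3d}   log Q_2 = {log_q_alpha_truncated(2.0, N):.3f}")
for N in (1000, 2000):
    print(f"N = {N:5d}   truncated relative entropy = {relative_entropy_truncated(N):.5f}")
```

The truncated \(\log Q_{\alpha }\) grows without bound in \(N\), whereas the truncated relative entropy stabilises, which is consistent with the example's claim that finiteness of \(D(\varrho \Vert \sigma )\) does not force \(D_{\alpha }^{*}(\varrho \Vert \sigma )<+\infty \) for any \(\alpha >1\).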
Example 3.60
Let \(\varrho =\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(\varrho \) is not trace-class. Then \(\varrho =\varrho ^{\frac{\alpha -1}{2\alpha }}\varrho ^{\frac{1}{\alpha }}\varrho ^{\frac{\alpha -1}{2\alpha }}\), whence \(\varrho \in {{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma )\) with \(\varrho _{\sigma ,\alpha }=\varrho ^{\frac{1}{\alpha }}\notin {{\mathcal {L}}}^{\alpha }({\mathcal {H}})\), and
Thus, for every \(r \in \mathbb {R}\),
In particular, this holds also when \(\varrho \) is compact, and obviously \(\varrho \in {{\mathcal {B}}}^{\infty }({\mathcal {H}},\varrho )\). This shows that the assumption \(D_{\alpha _0}^{*}(\varrho \Vert \sigma )<+\infty \) for some \(\alpha _0<+\infty \) is also important in this case of Proposition 3.56.
Note also that this is an example where
This cannot happen when \(\varrho \) and \(\sigma \) are both trace-class, according to Corollary 3.43 or Corollary 3.46.
4 The Strong Converse Exponent
4.1 The strong converse exponents and the Hoeffding anti-divergences
Before restricting our attention to the i.i.d. case in the main result, we first consider a generalization of the binary state discrimination problem described in the Introduction. First, we do not assume the hypotheses to be represented by density operators, but by general positive semi-definite operators. Second, we do not assume the problem to be i.i.d. In the most general case, a simple asymptotic binary operator discrimination problem is specified by a sequence of Hilbert spaces \({\mathcal {H}}_n\), \(n\in \mathbb {N}\), and for each \(n\in \mathbb {N}\), a pair \(\varrho _n,\sigma _n\in {{\mathcal {B}}}({\mathcal {H}}_n){}_{\gneq 0}\), representing the null and the alternative hypotheses, respectively. Since the operators are not assumed to be trace-class, the expressions in (1.1) may not make sense, and need to be modified as
to define the generalized type I success and type II errors, respectively. These expressions are equal to those in (1.1) when \(\varrho _n\) and \(\sigma _n\) are trace-class.
Definition 4.1
Let \(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}{:}{=}(\varrho _n)_{n\in \mathbb {N}}\), \(\mathop {\sigma }\limits ^{{\tiny \rightarrow }}{:}{=}(\sigma _n)_{n\in \mathbb {N}}\) be as above. The strong converse exponents of the simple asymptotic binary operator discrimination problem \(H_0:\,\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\) vs. \(H_1:\,\mathop {\sigma }\limits ^{{\tiny \rightarrow }}\) with type II exponent \(r\in \mathbb {R}\) are defined as
where the infima are taken along all test sequences \(T_n\in {{\mathcal {B}}}({\mathcal {H}}_n)_{[0,I]}\), \(n\in \mathbb {N}\), satisfying the indicated condition, and in the last expression also that the limit exists.
We will need an extension of the notion of the Hoeffding anti-divergence in the above setting. Let
where we used (3.74)–(3.75), and
The inequality in the following lemma is called the optimality part of the Hoeffding bound. For trace-class operators, it can be easily obtained from the monotonicity of the sandwiched Rényi divergence under measurements; see [6, 33, 36]. If we do not assume \(\varrho _n\) and \(\sigma _n\) to be trace-class, we can still obtain it using the variational formula in (3.27), as we show below.
Proposition 4.2
For every \(r\in \mathbb {R}\),
Proof
All the inequalities are trivial by definition, except for the first one. Thus, we need to show that for any \(r\in \mathbb {R}\) and any \(u\in [0,1]\),
Let us fix \(r\in \mathbb {R}\) for the rest. First, note that for any test \(T_n\),
Thus, for any sequence of tests \((T_n)_{n\in \mathbb {N}}\),
proving (4.2) for \(u=0\). For \(u=1\), (4.2) is trivial when \(\psi ^{*}(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}|1)=+\infty \), and hence we assume the contrary; in particular, \(D_{\max }(\varrho _n\Vert \sigma _n)<+\infty \) for any large enough n. Let \((T_n)_{n\in \mathbb {N}}\) be a test sequence such that
Then for any \(r'<r\) and any large enough n, \(\beta _n(T_n|\sigma _n)\le \exp (-nr')\), whence
Thus,
This gives (4.2) for \(u=1\).
For the rest, let us fix a \(u\in (0,1)\), and corresponding \(\alpha =1/(1-u)>1\). If \(\tilde{\psi }^{*}(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}|u)=+\infty \) then
holds trivially. Hence, we assume that \(\tilde{\psi }^{*}(\varrho _n\Vert \sigma _n|u)<+\infty \), or equivalently, \(\varrho _n\in {{\mathcal {L}}}^{\alpha }({\mathcal {H}}_n,\sigma _n)\) for every large enough n. In particular, the variational formula (3.27) holds (with \(z=\alpha )\).
Consider now a sequence of tests \((T_n)_{n\in \mathbb {N}}\) such that \(\liminf _{n\rightarrow +\infty }\!-\frac{1}{n}\!\log \!{{\,\textrm{Tr}\,}}(T_n^{1/2}\!\sigma _n T_n^{1/2}) \ge r\). Then \({{\,\textrm{Tr}\,}}\,(T_n^{1/2}\sigma _n T_n^{1/2})<+\infty \) for every large enough n, and we have
where the first inequality is due to the operator Jensen inequality [7, Theorem 11]. Hence, \(T_n\in {{\mathcal {B}}}({\mathcal {H}}_n)_{\sigma _n,\alpha ,\alpha }\). If \({{\,\textrm{Tr}\,}}\big (T_n^{1/2}\sigma _n^{\frac{\alpha -1}{\alpha }}T_n^{1/2}\big )^{\frac{\alpha }{\alpha -1}}>0\) then the variational formula (3.27) yields
where the second inequality is due to (4.4). In particular, we also have \({{\,\textrm{Tr}\,}}\,(T_n^{1/2}\varrho _n T_n^{1/2})<+\infty \). By a simple rearrangement, we get
If \({{\,\textrm{Tr}\,}}\big (T_n^{1/2}\sigma _n^{\frac{\alpha -1}{\alpha }}T_n^{1/2}\big )^{\frac{\alpha }{\alpha -1}}=0\) then \(T_n^{1/2}\sigma _n^{\frac{\alpha -1}{\alpha }}T_n^{1/2}=0\). Since \(\varrho _n\in {{\mathcal {L}}}^{\alpha }({\mathcal {H}}_n,\sigma _n)\), this implies \(T_n^{1/2}\varrho _n T_n^{1/2}=0\), according to Lemma 3.1, and therefore (4.5) holds trivially, with both sides equal to \(+\infty \).
Taking the liminf in (4.5) yields
Since this holds for every test sequence as above, we get \(ur-\tilde{\psi }^{*}(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}|u)\le \underline{\textrm{sc}}_r(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }})\), as required. \(\square \)
For the rest, we restrict our attention to the i.i.d. case, where
for some Hilbert space \({\mathcal {H}}\) and \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\). Note that by Lemma 3.22, \(\tilde{\psi }^{*}\big ((\varrho ^{\otimes n})_{n\in \mathbb {N}}\Vert (\sigma ^{\otimes n})_{n\in \mathbb {N}}|u\big )= \tilde{\psi }^{*}(\varrho \Vert \sigma |u)\), \(u\in (0,1)\), \(n\in \mathbb {N}\), and the same identity is straightforward to verify for \(u=0,1\), whence
We replace the notations \(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\) and \(\mathop {\sigma }\limits ^{{\tiny \rightarrow }}\) with \(\varrho \) and \(\sigma \), respectively, in the strong converse exponents introduced above. Let
be defined the same way as \(\underline{\textrm{sc}}(\varrho \Vert \sigma )\), \(\overline{\textrm{sc}}(\varrho \Vert \sigma )\), and \(\textrm{sc}(\varrho \Vert \sigma )\), respectively, but with the restrictions that only finite-rank tests are used. Obviously,
The following lower bound follows by a straightforward adaptation of Nagaoka’s method [36].
Proposition 4.3
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\). For every \(r\in \mathbb {R}\),
Proof
Let us fix an \(r\in \mathbb {R}\). We need to prove that for every \(u\in [0,1]\),
The cases \(u=0\) and \(u=1\) can be proved exactly the same way as in the proof of Proposition 4.2 above. For the rest, let us fix a \(u\in (0,1)\), with corresponding \(\alpha =1/(1-u)>1\). If \(\tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}=+\infty \) then (4.6) holds trivially, and hence for the rest we assume that \(\tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}<+\infty \). In particular, we have \(\varrho ^0\le \sigma ^0\), according to Lemma 3.42.
Let \(T_n\in {{\mathcal {B}}}({\mathcal {H}}^{\otimes n})_{[0,I]}\), \(n\in \mathbb {N}\), be a sequence of finite-rank tests such that
Assume first that \(T_n^{1/2}\varrho ^{\otimes n}T_n^{1/2}\ne 0\), whence, by the assumption that \(\varrho ^0\le \sigma ^0\), we also have \(T_n^{1/2}\sigma ^{\otimes n}T_n^{1/2}\ne 0\). By Lemma 3.37,
where the second inequality follows from Corollary 3.27. A simple rearrangement yields
These inequalities also hold (trivially, with the leftmost expression being \(+\infty \)) when \(T_n^{1/2}\varrho ^{\otimes n}T_n^{1/2}=0\). Thus, we get
Since this holds for every test sequence as above, (4.6) follows. \(\square \)
Lemma 4.4
For finite-rank PSD operators \(\varrho ,\sigma \) on a Hilbert space, with \(0\ne \varrho ^0\le \sigma ^0\), we have
Proof
The inequality in (4.7) was proved in [33, Theorem 4.10] for finite-rank density operators, under the implicit assumption that \(D(\varrho \Vert \sigma )\ne D_{\max }(\varrho \Vert \sigma )\), and it was proved in [22] in the case \(D(\varrho \Vert \sigma )= D_{\max }(\varrho \Vert \sigma )\). The case of general PSD operators follows easily by replacing \(\varrho \) and \(\sigma \) with \(\varrho /{{\,\textrm{Tr}\,}}\varrho \) and \(\sigma /{{\,\textrm{Tr}\,}}\sigma \), respectively, and using the scaling laws (3.91) and \(\overline{\textrm{sc}}_r(\lambda \varrho \Vert \eta \sigma )=\overline{\textrm{sc}}_{r+\log \eta }(\varrho \Vert \sigma )-\log \lambda \). \(\square \)
Proposition 4.5
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(\varrho ^0\le \sigma ^0\). For every \(r\in \mathbb {R}\),
Proof
By Propositions 4.2 and 4.3, we only need to prove \(\textrm{sc}_r(\varrho \Vert \sigma )_\mathrm{{f}}\le {\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\). Let \(P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\). According to Lemma 4.4, there exists a sequence of tests \((S_{P,n})_{n\in \mathbb {N}}\) such that \(S_{P,n}\le P^{\otimes n}\), and
where the equality is due to Propositions 3.56 and 3.40. Note that
and therefore (4.8)–(4.9) yield
Thus,
By Lemma 3.47, \(u\mapsto ur-\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u)\) is continuous on the compact set [0, 1] for every \(P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\). On the other hand, \(\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\) is an upward directed partially ordered set with respect to the PSD order, and for any \(u\in [0,1]\), \(P\mapsto ur-\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u)\) is monotone decreasing on \(\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\), again by Lemma 3.47. Hence, by Lemma 2.5, we may exchange the inf and the max in (4.10). Thus, we get the upper bound
as required. \(\square \)
Theorem 4.6
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(D_{\alpha }^{*}(\varrho \Vert \sigma )<+\infty \) for some \(\alpha \in (1,+\infty )\), and \(Q_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}=Q_{\alpha }^{*}(\varrho \Vert \sigma )\), \(\alpha >1\). Then
On the other hand, if \(\varrho ,\sigma \) are trace-class and \(D_{\alpha }^{*}(\varrho \Vert \sigma )=+\infty \) for all \(\alpha \in (1,+\infty )\), then
Proof
Immediate from Propositions 4.5, 3.56, and Lemma 3.58. \(\square \)
As a special case of Theorem 4.6, we get the exact characterization of the strong converse exponent of discriminating quantum states on a separable Hilbert space, as follows:
Corollary 4.7
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be density operators. For every \(r\in \mathbb {R}\),
and
Proof
The equalities in (4.11) are immediate from Theorem 4.6, and the characterization of positivity in (4.13) follows from Lemma 3.58. \(\square \)
Remark 4.8
Let \(\varrho \) and \(\sigma \) be density operators. According to the direct part of the quantum Stein’s lemma [24, 26], for every \(r<D(\varrho \Vert \sigma )\) there exists a test sequence \(T_n\in {{\mathcal {B}}}({\mathcal {H}}^{\otimes n})_{[0,I]}\), \(n\in \mathbb {N}\), such that
It was shown in [36, 38] that in the finite-dimensional case, for any test sequence \(T_n\in {{\mathcal {B}}}({\mathcal {H}}^{\otimes n})_{[0,I]}\), \(n\in \mathbb {N}\),
That is, if the type II error decreases with an exponent larger than the relative entropy then the type I error goes to 1; this is called the strong converse to Stein’s lemma. The optimal (lowest) speed of convergence to 1 is exponential, with the exponent being equal to the Hoeffding anti-divergence \(H_r^{*}(\varrho \Vert \sigma )\), according to [33]. Corollary 4.7 generalizes this to the infinite-dimensional case, with one important difference. While in the finite-dimensional case finiteness of the relative entropy implies strict positivity of \(H_r^{*}(\varrho \Vert \sigma )\) for every \(r>D(\varrho \Vert \sigma )\), and hence the strong converse property, in the infinite-dimensional case it might happen that \(D(\varrho \Vert \sigma )<+\infty \), yet \(H_r^{*}(\varrho \Vert \sigma )=0\) for every \(r\in \mathbb {R}\), and hence the type I error sequence does not converge to 1 with an exponential speed along a test sequence \((T_n)_{n\in \mathbb {N}}\), even if \(\liminf _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}\sigma ^{\otimes n}T_n>D(\varrho \Vert \sigma )\). According to Corollary 4.7, this happens if and only if \(D_{\alpha }^{*}(\varrho \Vert \sigma )=+\infty \) for every \(\alpha >1\). It is an open question what kind of behaviour can occur in this case; if the type II exponent is above the relative entropy, do the type I error probabilities still go to 1 (strong converse property) but with a sub-exponential speed, or may it happen that the strong converse property does not hold, i.e., (4.15) is not satisfied? Note that the monotonicity of the relative entropy under measurements implies that
see, e.g., the proof of (2.4) in [24], or [23, Proposition 5.2]. In particular, (4.14) cannot hold with \(r>D(\varrho \Vert \sigma )\) for any test sequence.
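The finite-dimensional strong converse behaviour described in Remark 4.8 can be observed in the simplest classical setting. The sketch below is a toy illustration with assumed parameters: the hypotheses are Bernoulli distributions with success probabilities 0.6 (null) and 0.3 (alternative), and the hypothetical threshold test accepts the null hypothesis when at least 80% of the \(n\) samples are successes. It is not claimed that this test family is optimal.

```python
import math

def binomial_tail(n, k_min, prob):
    """P(Bin(n, prob) >= k_min), summed exactly term by term."""
    return sum(math.comb(n, k) * prob ** k * (1 - prob) ** (n - k)
               for k in range(k_min, n + 1))

def bernoulli_kl(a, b):
    """Relative entropy between Bernoulli(a) and Bernoulli(b), for 0 < a, b < 1."""
    return a * math.log(a / b) + (1 - a) * math.log((1 - a) / (1 - b))

P_NULL, P_ALT = 0.6, 0.3   # assumed toy hypotheses

results = {}
for n in (100, 200):
    k_min = (4 * n + 4) // 5                       # smallest integer >= 0.8 * n
    type_two_error = binomial_tail(n, k_min, P_ALT)
    type_one_success = binomial_tail(n, k_min, P_NULL)
    results[n] = (-math.log(type_two_error) / n, type_one_success)
    print(f"n = {n}: type II exponent = {results[n][0]:.4f}, "
          f"type I success = {results[n][1]:.3e}")

print(f"relative entropy D = {bernoulli_kl(P_NULL, P_ALT):.4f}")
```

Here the type II error exponent stays above the relative entropy, and the type I success probability decays exponentially in \(n\), so the type I error tends to 1.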
4.2 Generalized cutoff rates
Corollary 4.7 gives an operational interpretation to the Hoeffding anti-divergences, but not directly to the sandwiched Rényi divergences. To get such an operational interpretation, one can consider the following quantity, introduced originally in [8] for the finite-dimensional classical case:
Definition 4.9
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \(\kappa \in (0,1)\). The generalized \(\kappa \)-cutoff rate \(C_{\kappa }(\varrho \Vert \sigma )\) is defined to be the infimum of all \(r_0\in \mathbb {R}\) such that \(\underline{\textrm{sc}}_r(\varrho \Vert \sigma )\ge \kappa (r-r_0)\) holds for every \(r\in \mathbb {R}\). Analogously, \(C_{\kappa }(\varrho \Vert \sigma ){}_{\textrm{fa}}\) is defined to be the infimum of all \(r_0\in \mathbb {R}\) such that \(\underline{\textrm{sc}}_r(\varrho \Vert \sigma )_\mathrm{{f}}\ge \kappa (r-r_0)\) holds for every \(r\in \mathbb {R}\).
Proposition 4.10
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\).
(i) For any \(\kappa \in (0,1)\),
(ii) If \(\kappa \) is such that there exist \(0<\kappa _1<\kappa<\kappa _2<1\) for which \(D_{\frac{1}{1-\kappa _j}}^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}<+\infty \), \(j=1,2\), then
If, moreover, \(D_{\frac{1}{1-\kappa }}^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}=D_{\frac{1}{1-\kappa }}^{*}(\varrho \Vert \sigma )\), then all the inequalities in (4.17) hold as equalities.
Proof
(i) The first inequality in (4.16) is trivial by definition. If \(D_{\frac{1}{1-\kappa }}^{*}(\varrho \Vert \sigma )=+\infty \) then the second inequality in (4.16) holds trivially, and hence we assume the contrary. By Proposition 4.2,
from which the second inequality in (4.16) follows by definition.
(ii) By the assumptions, \(\tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa _j){}_{\textrm{fa}}<+\infty \), \(j=1,2\), and hence \(\tilde{\psi }^{*}(\varrho \Vert \sigma |\cdot ){}_{\textrm{fa}}\) is a finite-valued convex function on \([\kappa _1,\kappa _2]\), according to Remark 3.36 and Corollary 3.48. Thus, in particular, its left and right derivatives \(\partial ^{-} \tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa )\) and \(\partial ^{+} \tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa )\) at \(\kappa \) exist and are finite: \(-\infty<\partial ^{-} \tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa ){}_{\textrm{fa}}\le \partial ^{+} \tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa ){}_{\textrm{fa}}<+\infty \). Moreover, \(\varrho ^0\le \sigma ^0\), according to Lemma 3.42. For any \(r\in [\partial ^{-} \tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa ){}_{\textrm{fa}},\partial ^{+} \tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa ){}_{\textrm{fa}}]\),
where the first inequality is due to Proposition 4.5. This yields the first inequality in (4.17), and the rest have already been proved in the previous point. \(\square \)
Proposition 4.10 and Corollary 3.40 yield immediately the following:
Theorem 4.11
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(\varrho \) and \(\sigma \) are trace-class, or \(\sigma \) is compact and \(\varrho \in {{\mathcal {B}}}^{\infty }({\mathcal {H}},\sigma )\). Let \(\kappa \in (0,1)\), and assume that \(D_{\alpha }^{*}(\varrho \Vert \sigma )<+\infty \) for \(\alpha \) in a neighborhood of \(\alpha _0{:}{=}1/(1-\kappa )\). Then
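The relation between the generalized cutoff rate of Definition 4.9 and the Rényi divergence of order \(1/(1-\kappa )\) can be checked numerically in a toy case. The sketch below assumes commuting (diagonal) states in finite dimension, where the strong converse exponent \(\underline{\textrm{sc}}_r\) is known to equal the Hoeffding anti-divergence [33], and it evaluates the infimum in Definition 4.9 as \(\sup _r\{r-H_r^{*}/\kappa \}\) on a finite grid of rates. The helper names, the grid sizes and the bound `r_max` on the rates are assumptions made for the illustration.

```python
import math

def psi_tilde(p, q, u):
    """(1-u) * log sum_i p_i^a q_i^(1-a) with a = 1/(1-u), for 0 <= u < 1."""
    exps = [(math.log(pi) - u * math.log(qi)) / (1.0 - u)
            for pi, qi in zip(p, q) if pi > 0.0]
    m = max(exps)
    return (1.0 - u) * (m + math.log(sum(math.exp(e - m) for e in exps)))

def renyi_divergence(p, q, alpha):
    """Classical Renyi divergence of order alpha > 1 for probability vectors."""
    s = sum(pi ** alpha * qi ** (1.0 - alpha)
            for pi, qi in zip(p, q) if pi > 0.0)
    return math.log(s) / (alpha - 1.0)

def cutoff_rate(p, q, kappa, r_max, n_r=600, n_u=1000):
    """Smallest r0 with H_r >= kappa*(r - r0) for all grid rates r in [0, r_max]."""
    us = [k / n_u for k in range(n_u)]
    psis = [psi_tilde(p, q, u) for u in us]
    best = -math.inf
    for j in range(n_r + 1):
        r = r_max * j / n_r
        # Grid approximation of the Hoeffding anti-divergence at rate r.
        h_r = max(u * r - s for u, s in zip(us, psis))
        # The constraint at rate r forces r0 >= r - h_r / kappa.
        best = max(best, r - h_r / kappa)
    return best

# Hypothetical example; kappa = 0.5 corresponds to the Renyi parameter alpha = 2.
p = [0.5, 0.5]
q = [0.9, 0.1]
kappa = 0.5
c_kappa = cutoff_rate(p, q, kappa, r_max=3.0)
d_alpha = renyi_divergence(p, q, 1.0 / (1.0 - kappa))
print(c_kappa, d_alpha)
```

The two printed numbers agree up to the discretization error of the grids; the agreement relies on `r_max` being large enough to contain the rate at which the supremum over \(r\) is attained.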
4.3 Monotonicity of the Rényi divergences
The operational representation of the Hoeffding anti-divergences in Sect. 4.1 can be used to obtain the monotonicity of the sandwiched Rényi divergences under quantum operations.
In the Heisenberg picture, a quantum operation from a system with Hilbert space \({\mathcal {H}}\) to a system with Hilbert space \({{\mathcal {K}}}\) is given by a unital normal completely positive map \(\Phi :\,{{\mathcal {B}}}({{\mathcal {K}}})\rightarrow {{\mathcal {B}}}({\mathcal {H}})\), which can be written as
\[\Phi (X)=V^*(X\otimes I_{{\mathcal {H}}_E})V=\sum _{i\in {{\mathcal {I}}}}V_i^*XV_i,\qquad X\in {{\mathcal {B}}}({{\mathcal {K}}}),\tag{4.18}\]
where \(V:\,{\mathcal {H}}\rightarrow {{\mathcal {K}}}\otimes {\mathcal {H}}_E\) is an isometry, \(V_i{:}{=}(I_{{{\mathcal {K}}}}\otimes \left\langle e_i\right| )V\) for some ONB \((e_i)_{i\in {{\mathcal {I}}}}\) in the auxiliary Hilbert space \({\mathcal {H}}_E\), and the sum in (4.18) converges in the strong operator topology [17, 43]. As everywhere in the paper, we assume that \({\mathcal {H}},{{\mathcal {K}}}\) are separable, in which case the auxiliary Hilbert space \({\mathcal {H}}_E\) can be chosen to be separable, and the index set \({{\mathcal {I}}}\) in (4.18) countable.
In the Schrödinger picture, a density operator \(\varrho \in {{\mathcal {S}}}({\mathcal {H}})\) is transformed by the predual map
\[\Phi ^*(\varrho )=\sum _{i\in {{\mathcal {I}}}}V_i\varrho V_i^*,\tag{4.19}\]
where the sum converges in trace-norm, and the result is a density operator on \({{\mathcal {K}}}\). Note that the predual map of \(\Phi \) maps from \({{\mathcal {L}}}^1({\mathcal {H}})\) to \({{\mathcal {L}}}^1({{\mathcal {K}}})\), and is usually denoted by \(\Phi _*\). On the other hand, if \({{\mathcal {B}}}({\mathcal {H}})\) and \({{\mathcal {B}}}({{\mathcal {K}}})\) are both equipped with the ultraweak topology then the corresponding dual map of \(\Phi \) coincides with the predual map, which justifies the notation \(\Phi ^*\).
If \(\varrho \) is PSD but not trace-class then the sum in (4.19) need not converge (in the weak, equivalently, in the strong operator topology), but it may, in which case we say that \(\Phi ^*\) is defined on \(\varrho \), and define \(\Phi ^*(\varrho ){:}{=}\sum _{i\in {{\mathcal {I}}}}V_i\varrho V_i^*\). A trivial case where \(\Phi ^*\) is defined on every \(\varrho \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) is when \(\Phi \) has only finitely many operators in its Kraus decomposition, or equivalently, \({\mathcal {H}}_E\) is finite-dimensional.
Lemma 4.12
Let \(\varrho \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \(\Phi :\,{{\mathcal {B}}}({{\mathcal {K}}})\rightarrow {{\mathcal {B}}}({\mathcal {H}})\) be a unital normal completely positive map. If \(\Phi ^*\) is defined on \(\varrho \) then
Proof
Let \((f_j)_{j\in {{\mathcal {J}}}}\) be an orthonormal basis in \({{\mathcal {K}}}\). Then
\(\square \)
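For a map with finitely many Kraus operators, the two pictures and the trace duality \({{\,\textrm{Tr}\,}}\Phi ^*(\varrho )T={{\,\textrm{Tr}\,}}\varrho \Phi (T)\) between them (the form in which Lemma 4.12 is used in the proof of Lemma 4.13) can be verified directly. The Python sketch below uses the qubit amplitude damping channel as an assumed example; the matrix helpers, the damping parameter and the operators `rho` and `test_op` are hypothetical choices, with `rho` deliberately unnormalized.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def madd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def heisenberg(kraus, x):
    """Phi(X) = sum_i V_i^* X V_i."""
    terms = [matmul(dagger(v), matmul(x, v)) for v in kraus]
    out = terms[0]
    for t in terms[1:]:
        out = madd(out, t)
    return out

def schrodinger(kraus, rho):
    """Phi^*(rho) = sum_i V_i rho V_i^*."""
    terms = [matmul(v, matmul(rho, dagger(v))) for v in kraus]
    out = terms[0]
    for t in terms[1:]:
        out = madd(out, t)
    return out

# Amplitude damping channel with damping parameter g (illustrative choice).
g = 0.3
kraus = [[[1.0, 0.0], [0.0, math.sqrt(1.0 - g)]],
         [[0.0, math.sqrt(g)], [0.0, 0.0]]]

identity = [[1.0, 0.0], [0.0, 1.0]]
rho = [[2.0, 1.0], [1.0, 3.0]]      # PSD, not normalized
test_op = [[0.5, 0.2], [0.2, 0.7]]  # a test T with 0 <= T <= I

phi_identity = heisenberg(kraus, identity)                # unitality check
lhs = trace(matmul(schrodinger(kraus, rho), test_op))     # Tr Phi^*(rho) T
rhs = trace(matmul(rho, heisenberg(kraus, test_op)))      # Tr rho Phi(T)
print(phi_identity, lhs, rhs)
```

Since \({\mathcal {H}}_E\) is finite-dimensional here, \(\Phi ^*\) is defined on every PSD operator and no convergence question arises; the infinite-dimensional subtleties discussed above are not captured by this example.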
The transformation on multiple systems is given by
If \(\Phi ^*\) is defined on \(\varrho \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) then \((\Phi ^{\otimes n})^*\) is defined on \(\varrho ^{\otimes n}\), and \((\Phi ^{\otimes n})^*(\varrho ^{\otimes n})=(\Phi ^*(\varrho ))^{\otimes n}\).
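The identity \((\Phi ^{\otimes n})^*(\varrho ^{\otimes n})=(\Phi ^*(\varrho ))^{\otimes n}\) can be checked for \(n=2\) with a small numerical example. The sketch below assumes that the Kraus operators of \(\Phi \otimes \Phi \) are taken to be the Kronecker products \(V_i\otimes V_j\), and it reuses the amplitude damping channel as a hypothetical example; all helper names are illustrative.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def madd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def kron(a, b):
    """Kronecker product; rows are indexed by (i, k), columns by (j, l)."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def schrodinger(kraus, rho):
    """Phi^*(rho) = sum_i V_i rho V_i^*."""
    terms = [matmul(v, matmul(rho, dagger(v))) for v in kraus]
    out = terms[0]
    for t in terms[1:]:
        out = madd(out, t)
    return out

g = 0.3
kraus = [[[1.0, 0.0], [0.0, math.sqrt(1.0 - g)]],
         [[0.0, math.sqrt(g)], [0.0, 0.0]]]
rho = [[2.0, 1.0], [1.0, 3.0]]  # PSD, not normalized

# Kraus operators of the two-copy map: all products V_i (x) V_j.
kraus_2 = [kron(vi, vj) for vi in kraus for vj in kraus]

joint = schrodinger(kraus_2, kron(rho, rho))   # (Phi (x) Phi)^*(rho (x) rho)
single = schrodinger(kraus, rho)
product = kron(single, single)                 # Phi^*(rho) (x) Phi^*(rho)

max_diff = max(abs(joint[i][j] - product[i][j])
               for i in range(4) for j in range(4))
print(max_diff)
```

The printed difference is at the level of floating-point rounding.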
In the context of operator discrimination, a transformation \(\Phi \) effectively reduces the available tests for discriminating \(\varrho \) and \(\sigma \), thereby increasing the strong converse exponent, as expressed by the following:
Lemma 4.13
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \(\Phi :\,{{\mathcal {B}}}({{\mathcal {K}}})\rightarrow {{\mathcal {B}}}({\mathcal {H}})\) be a unital normal completely positive map such that \(\Phi ^*\) is defined on \(\varrho \) and \(\sigma \). Then
Proof
We only prove the assertion for \(\overline{\textrm{sc}}_r\), as the proof for \(\underline{\textrm{sc}}_r\) goes the same way. We have
where the first two infima are taken over sequences of tests \(T_n\in {{\mathcal {B}}}({{\mathcal {K}}}^{\otimes n})_{[0,I]}\), \(n\in \mathbb {N}\), satisfying the given conditions, the third infimum is taken over sequences of tests \(S_n\in {{\mathcal {B}}}({\mathcal {H}}^{\otimes n})_{[0,I]}\), \(n\in \mathbb {N}\), satisfying the given condition, the first equality is by definition, the second equality follows from Lemma 4.12, the inequality is obvious from the fact that \(T_n\in {{\mathcal {B}}}({{\mathcal {K}}}^{\otimes n})_{[0,I]}\Longrightarrow \Phi ^{\otimes n}(T_n)\in {{\mathcal {B}}}({\mathcal {H}}^{\otimes n})_{[0,I]}\), and the last equality is again by definition. \(\square \)
The proof of the following monotonicity result is similar to the proof given in [37, Remark 2] for the monotonicity of the Petz-type Rényi divergences of finite-dimensional density operators. The main ingredients are the bounds on the strong converse exponents given in Proposition 4.5, the monotonicity of the strong converse exponents given in Lemma 4.13, and the fact that the Rényi divergences can be recovered from the Hoeffding anti-divergences by Legendre-Fenchel transformation (Lemma 3.53).
Theorem 4.14
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that
and let \(\Phi :\,{{\mathcal {B}}}({{\mathcal {K}}})\rightarrow {{\mathcal {B}}}({\mathcal {H}})\) be a unital normal completely positive linear map such that \(\Phi ^*\) is defined on \(\varrho \) and \(\sigma \). Then
Proof
By assumption, \({\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}={\hat{H}}_r^{*}(\varrho \Vert \sigma )\), \(r\in \mathbb {R}\), and
where the first and the last inequalities follow from Proposition 4.5, and the second inequality from Lemma 4.13. Hence, by Lemma 3.53,
which is equivalent to (4.22). \(\square \)
Corollary 4.15
Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), and let \(\Phi :\,{{\mathcal {B}}}({{\mathcal {K}}})\rightarrow {{\mathcal {B}}}({\mathcal {H}})\) be a unital normal completely positive map. Assume that
(a) \(\varrho \) and \(\sigma \) are both trace-class,
or
(b) \(\Phi ^*\) is defined on \(\varrho \) and \(\sigma \), \(\sigma \) and \(\Phi ^*(\sigma )\) are compact, and \(\varrho \in {{\mathcal {B}}}^{\infty }({\mathcal {H}},\sigma )\).
Then
Proof
If \(\varrho \) and \(\sigma \) are both trace-class then \(\Phi ^*\) is automatically defined on them, and \(\Phi ^*(\varrho )\) and \(\Phi ^*(\sigma )\) are both trace-class. Note that \(\varrho \in {{\mathcal {B}}}^{\infty }({\mathcal {H}},\sigma )\) \(\Longleftrightarrow \) \(\varrho \le \lambda \sigma \) for some \(\lambda \ge 0\), whence \(\Phi ^*(\varrho )\le \lambda \Phi ^*(\sigma )\), i.e., \(\Phi ^*(\varrho )\in {{\mathcal {B}}}^{\infty }({{\mathcal {K}}},\Phi ^*(\sigma ))\). By Remark 3.6, \(\varrho \in {{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma )\) and \(\Phi ^*(\varrho )\in {{\mathcal {B}}}^{\alpha }({{\mathcal {K}}},\Phi ^*(\sigma ))\), \(\alpha >1\). Thus, by Lemma 3.40, the assumptions guarantee that \(D_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}=D_{\alpha }^{*}(\varrho \Vert \sigma )\) and \(D_{\alpha }^{*}(\Phi ^*(\varrho )\Vert \Phi ^*(\sigma )){}_{\textrm{fa}}=D_{\alpha }^{*}(\Phi ^*(\varrho )\Vert \Phi ^*(\sigma ))\), \(\alpha >1\), and therefore (4.23) follows immediately from Theorem 4.14. \(\square \)
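The monotonicity in case (a) can be illustrated in the simplest possible setting. The following Python sketch assumes commuting finite-dimensional density operators, represented by probability vectors, and a classical stochastic map in the role of \(\Phi ^*\); under these assumptions the sandwiched Rényi divergence reduces to the classical Rényi divergence. The function names and the example data are hypothetical.

```python
import math

def renyi_divergence(p, q, alpha):
    """Classical Renyi divergence of order alpha > 1 for probability vectors."""
    s = sum(pi ** alpha * qi ** (1.0 - alpha)
            for pi, qi in zip(p, q) if pi > 0.0)
    return math.log(s) / (alpha - 1.0)

def apply_channel(w, p):
    """Push p through the stochastic matrix w, where w[i][j] = Prob(j | i)."""
    return [sum(p[i] * w[i][j] for i in range(len(p)))
            for j in range(len(w[0]))]

p = [0.5, 0.3, 0.2]
q = [0.2, 0.2, 0.6]
w = [[0.9, 0.1],
     [0.5, 0.5],
     [0.2, 0.8]]

results = []
for alpha in (1.5, 2.0, 5.0):
    before = renyi_divergence(p, q, alpha)
    after = renyi_divergence(apply_channel(w, p), apply_channel(w, q), alpha)
    results.append((alpha, before, after))
    print(alpha, before, after)
```

For each listed value of \(\alpha \), the divergence after the map does not exceed the divergence before it.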
Remark 4.16
Monotonicity of the form (4.23) in the case where both \(\varrho \) and \(\sigma \) are trace-class is a special case of [6, Theorem 14] and [27, Theorem 3.14], where monotonicity was proved in the more general setting of normal positive linear functionals on a von Neumann algebra. Our proof above is completely different from the proofs given in [6] and [27].
5 Conclusion
We have shown that for any \(\alpha >1\), the sandwiched Rényi \(\alpha \)-divergence of infinite-dimensional density operators has the same operational interpretation in the context of state discrimination as in the finite-dimensional case, and also that it coincides with the regularized measured Rényi \(\alpha \)-divergence, again analogously to the finite-dimensional case. Our results can be extended to more general operator algebraic settings, as shown in [22].
It is worth noting that while in [33] the equality of the sandwiched Rényi divergence and the regularized measured Rényi divergence was an important ingredient of showing the equality of the strong converse exponent and the Hoeffding anti-divergence, the extensions to the infinite-dimensional case can be done separately, building in each problem only on the corresponding finite-dimensional result and the recoverability of the sandwiched Rényi divergences from finite-dimensional restrictions.
We also considered the extension of the sandwiched Rényi divergences (and more generally, Rényi \((\alpha ,z)\)-divergences) to pairs of not necessarily trace-class positive semi-definite operators, and established some properties of this extension. Related to this, we considered a generalization of the state discrimination problem, where the hypotheses may be represented by general positive semi-definite operators. We gave bounds on the strong converse exponent in this problem, and showed that at least in some cases, the equality between the strong converse exponent and the Hoeffding anti-divergence still holds in this generalized setting.
There are a number of interesting problems left open in the paper. Probably the most important is clarifying whether \(Q_{\alpha }^{*}(\varrho \Vert \sigma )=Q_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\) holds for every pair of PSD operators \(\varrho ,\sigma \) and every \(\alpha >1\), and if not then whether there exist other examples for which it holds, apart from the ones given in Proposition 3.40 and Lemma 3.42. Such examples would extend the applicability of Proposition 4.5 and Theorem 4.14, among others. While less relevant for the problem of operator discrimination, the same question may be asked for more general Rényi \((\alpha ,z)\)-divergences, which seems interesting from the matrix analysis point of view. Finally, from the point of view of quantum information theory, the most important question seems to be to clarify the optimal asymptotics of the type I error probability when the type II exponent is strictly above the relative entropy of the two states (assuming that the latter is finite), while all their sandwiched Rényi \(\alpha \)-divergences are \(+\infty \) for \(\alpha >1\); see Remark 4.8.
References
Araki, H.: Relative entropy of states of von Neumann algebras. Publ. RIMS Kyoto Univ. 11, 809–833 (1976)
Araki, H.: On an inequality of Lieb and Thirring. Lett. Math. Phys. 19, 167–170 (1990)
Audenaert, K.M.R., Nussbaum, M., Szkola, A., Verstraete, F.: Asymptotic error rates in quantum hypothesis testing. Commun. Math. Phys. 279, 251–283 (2008). arXiv:0708.4282
Audenaert, K.M.R., Datta, N.: \(\alpha -z\)-relative Renyi entropies. J. Math. Phys. 56, 022202 (2015). arXiv:1310.7178
Berta, M., Fawzi, O., Tomamichel, M.: On variational expressions for quantum relative entropies. Lett. Math. Phys. 107(12), 2239–2265 (2017). arXiv:1512.02615
Berta, M., Scholz, V.B., Tomamichel, M.: Rényi divergences as weighted non-commutative vector-valued \({L}_p\)-spaces. Ann. Henri Poincaré 19, 1843–1867 (2018). arXiv:1608.05317
Brown, L.G., Kosaki, H.: Jensen’s inequality in semi-finite von Neumann algebras. J. Oper. Theory 23(1), 3–19 (1990)
Csiszár, I.: Generalized cutoff rates and Rényi’s information measures. IEEE Trans. Inf. Theory 41(1), 26–34 (1995)
Datta, N.: Min- and max-relative entropies and a new entanglement monotone. IEEE Trans. Inf. Theory 55(6), 2816–2826 (2009)
Douglas, R.G.: On majorization, factorization, and range inclusion of operators on Hilbert space. Proc. Am. Math. Soc. 47(2), 413–415 (1966)
Ekeland, I., Témam, R.: Convex Analysis and Variational Problems. SIAM (1999)
Frank, R.L., Lieb, E.H.: Monotonicity of a relative Rényi entropy. J. Math. Phys. 54(12), 122201 (2013). arXiv:1306.5358
Grümm, H.: Two theorems about \(\cal{C} _p\). Rep. Math. Phys. 4(3), 211–215 (1973)
Hayashi, M.: Optimal sequence of POVM’s in the sense of Stein’s lemma in quantum hypothesis testing. J. Phys. A Math. Gen. 35, 10759–10773 (2002)
Hayashi, M.: Error exponent in asymmetric quantum hypothesis testing and its application to classical-quantum channel coding. Phys. Rev. A 76(6), 062301 (2007). arXiv:quant-ph/0611013
Hayashi, M.: Quantum information theory: mathematical foundation, 2nd ed. In Graduate Texts in Physics. Springer (2017)
Hellwig, K.-E., Kraus, K.: Operations and measurements: II. Commun. Math. Phys. 16, 142–147 (1970)
Hiai, F.: Log-majorizations and norm inequalities for exponential operators. Banach Center Publ. 38, 119–181 (1997)
Hiai, F.: Quantum \(f\)-divergences in von Neumann algebras: I-standard \(f\)-divergences. J. Math. Phys. 59, 102202 (2018)
Hiai, F.: Private communication (2021)
Hiai, F.: Quantum f-Divergences in von Neumann Algebras. Springer, Berlin (2021)
Hiai, F., Mosonyi, M.: Quantum Rényi divergences and the strong converse exponent of state discrimination in operator algebras. Ann. Henri Poincaré (2022). arXiv:2110.07320
Hiai, F., Mosonyi, M., Ogawa, T.: Error exponents in hypothesis testing for correlated states on a spin chain. J. Math. Phys. 49, 032112 (2008)
Hiai, F., Petz, D.: The proper formula for relative entropy and its asymptotics in quantum probability. Commun. Math. Phys. 143(1), 99–114 (1991)
Jaksic, V., Ogata, Y., Pautrat, Y., Pillet, C.-A.: Entropic fluctuations in quantum statistical mechanics. An introduction. In: Quantum Theory from Small to Large Scales, August 2010, volume 95 of Lecture Notes of the Les Houches Summer School. Oxford University Press (2012)
Jaksic, V., Ogata, Y., Pillet, C.-A., Seiringer, R.: Quantum hypothesis testing and non-equilibrium statistical mechanics. Rev. Math. Phys. 24(6), 1230002 (2012). arXiv:1109.3804
Jenčová, A.: Rényi relative entropies and noncommutative \({L}_p\)-spaces. Ann. Henri Poincaré 19, 2513–2542 (2018). arXiv:1609.08462
Jenčová, A.: Rényi relative entropies and noncommutative \({L}_p\)-spaces II. Ann. Henri Poincaré 22, 3235–3254 (2021). arXiv:1707.00047
Kosaki, H.: Interpolation theory and the Wigner-Yanase-Dyson-Lieb concavity. Commun. Math. Phys. 87, 315–329 (1982)
Li, Y., Gao, S., Hao, H.: The sandwiched Rényi divergence and quantum positive evidence order in infinite-dimensional Hilbert space. Rep. Math. Phys. 88(2), 175–193 (2021)
McCarthy, C.A.: \(\cal{C} _p\). Israel J. of Math. 5, 249–271 (1967)
Mosonyi, M., Hiai, F.: On the quantum Rényi relative entropies and related capacity formulas. IEEE Trans. Inf. Theory 57(4), 2474–2487 (2011)
Mosonyi, M., Ogawa, T.: Quantum hypothesis testing and the operational interpretation of the quantum Rényi relative entropies. Commun. Math. Phys. 334(3), 1617–1648 (2015). arXiv:1309.3228
Mosonyi, M., Ogawa, T.: Divergence radii and the strong converse exponent of classical-quantum channel coding with constant compositions. IEEE Trans. Inf. Theory 67(3), 1668–1698 (2021). arXiv:1811.10599
Müller-Lennert, M., Dupuis, F., Szehr, O., Fehr, S., Tomamichel, M.: On quantum Rényi entropies: a new generalization and some properties. J. Math. Phys. 54(12), 122203 (2013). arXiv:1306.3142
Nagaoka, H.: Strong converse theorems in quantum information theory. In: Proceedings of ERATO Workshop on Quantum Information Science, page 33, 2001. Also appeared in Asymptotic Theory of Quantum Statistical Inference, ed. M. Hayashi, World Scientific (2005)
Nagaoka, H.: The converse part of the theorem for quantum Hoeffding bound. arXiv:quant-ph/0611289 (2006)
Ogawa, T., Nagaoka, H.: Strong converse and Stein’s lemma in quantum hypothesis testing. IEEE Trans. Inf. Theory 46(7), 2428–2433 (2000). arXiv:quant-ph/9906090
Petz, D.: Quasi-entropies for states of a von Neumann algebra. Publ. RIMS Kyoto Univ. 21, 787–800 (1985)
Petz, D.: Quasi-entropies for finite quantum systems. Rep. Math. Phys. 23, 57–65 (1986)
Renner, R.: Security of Quantum Key Distribution. PhD thesis, Swiss Federal Institute of Technology Zurich (2005). Diss. ETH No. 16242
Rényi, A.: On measures of entropy and information. In Proc. 4th Berkeley Sympos. Math. Statist. and Prob., volume I, pages 547–561. Univ. California Press, Berkeley, California (1961)
Stinespring, W.F.: Positive functions on \(C^*\)-algebras. Proc. Am. Math. Soc. 6, 211–216 (1955)
Tomamichel, M., Berta, M., Hayashi, M.: Relating different quantum generalizations of the conditional Rényi entropy. J. Math. Phys. 55, 082206 (2014). arXiv:1311.3887
Umegaki, H.: Conditional expectation in an operator algebra, IV (entropy and information). Kodai Math. Sem. Rep. 14, 59–85 (1962)
Wilde, M.M., Winter, A., Yang, D.: Strong converse for the classical capacity of entanglement-breaking and Hadamard channels via a sandwiched Rényi relative entropy. Commun. Math. Phys. 331(2), 593–622 (2014). arXiv:1306.1586
Zhang, H.: From Wigner-Yanase-Dyson conjecture to Carlen-Frank-Lieb conjecture. Adv. Math. 365, 107053 (2020). arXiv:1811.01205
Acknowledgements
This work was partially funded by the National Research, Development and Innovation Office of Hungary via the research grants K124152 and KH129601, and by the Ministry of Innovation and Technology and the National Research, Development and Innovation Office within the Quantum Information National Laboratory of Hungary. The author is grateful to Péter Vrana for discussions at the early stage of the project, to Ludovico Lami for comments on the strong converse property that led to Remark 4.8, and to Fumio Hiai for numerous helpful comments. The author is also grateful to an anonymous referee for several comments that helped to improve the presentation.
Funding
Open access funding provided by Budapest University of Technology and Economics.
Data availability
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
Additional information
Communicated by G. Chiribella.
Appendix A: Some Further Properties of the Hoeffding Anti-divergences
Lemma A.1
For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), and for every \(u\in \mathbb {R}\),
and
Proof
The assertions are trivial when \(u\in \mathbb {R}\setminus [0,1]\), because then all quantities above are equal to \(+\infty \). For \(u\in (0,1)\), superadditivity is obvious from restricting to projections of the form \(P_1\otimes P_2\), \(P_1\in \mathbb {P}_f({\mathcal {H}}^{\otimes n})_{\varrho ^{\otimes n},\sigma ^{\otimes n}}^+\), \(P_2\in \mathbb {P}_f({\mathcal {H}}^{\otimes m})_{\varrho ^{\otimes m},\sigma ^{\otimes m}}^+\) in the definition of \(\tilde{\psi }^{*}(\varrho ^{\otimes (n+m)} \Vert \sigma ^{\otimes (n+m)} |u){}_{\textrm{fa}}\), according to Lemma 3.22, and the cases \(u\in \{0,1\}\) follow by taking limits (in fact, even additivity holds there, according to (3.72)–(3.73)). The rest of the assertions are straightforward consequences. \(\square \)
Lemma A.2
For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), and every \(n\in \mathbb {N}\),
Proof
The first equality is immediate from the multiplicativity of \(Q_{\alpha }^{*}\), given in Lemma 3.22, and the second one follows from (A.1), as
\(\square \)
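The multiplicativity of \(Q_{\alpha }^{*}\) under tensor powers, on which the first equality rests, is easy to confirm in the commuting case. The Python sketch below assumes diagonal states represented by probability vectors, so that \(Q_{\alpha }^{*}\) becomes \(\sum _i p_i^{\alpha }q_i^{1-\alpha }\) and tensor powers become product distributions; the function names and the example data are illustrative.

```python
import math
from itertools import product

def q_alpha(p, q, alpha):
    """Classical Q_alpha(p||q) = sum_i p_i^alpha * q_i^(1-alpha)."""
    return sum(pi ** alpha * qi ** (1.0 - alpha)
               for pi, qi in zip(p, q) if pi > 0.0)

def tensor_power(p, n):
    """Entries of the n-fold product vector, in lexicographic order."""
    return [math.prod(c) for c in product(p, repeat=n)]

p = [0.5, 0.3, 0.2]
q = [0.2, 0.2, 0.6]
alpha, n = 2.0, 3

single = q_alpha(p, q, alpha)
power = q_alpha(tensor_power(p, n), tensor_power(q, n), alpha)
print(power, single ** n)

# Consequently (1-u) * log Q_{1/(1-u)} is additive; here u = 1 - 1/alpha.
u = 1.0 - 1.0 / alpha
psi_single = (1.0 - u) * math.log(single)
psi_power = (1.0 - u) * math.log(power)
print(psi_power, n * psi_single)
```

Both printed pairs agree up to rounding, which is the classical counterpart of the additivity used in the proof.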
Lemma A.3
For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \(r\in \mathbb {R}\),
where \(\mathbb {H}_r^{*}\) stands for any of the Hoeffding anti-divergences in Definition 3.51. Moreover,
Proof
(A.3) is immediate from Lemma A.2, and it trivially implies subadditivity for the quantities defined in (3.76), (3.78), (3.79), (3.81). By Lemma A.1, \(n\mapsto \tilde{\psi }^{*}(\varrho ^{\otimes n}\Vert \sigma ^{\otimes n}|u){}_{\textrm{fa}}\) is superadditive for every \(u\in \mathbb {R}\), which implies the subadditivity of \(n\mapsto H_{nr}^{*}(\varrho ^{\otimes n}\Vert \sigma ^{\otimes n}){}_{\textrm{fa}}\) and \(n\mapsto {\hat{H}}_{nr}^{*}(\varrho ^{\otimes n}\Vert \sigma ^{\otimes n}){}_{\textrm{fa}}\). \(\square \)
Proposition A.4
For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \(r\in \mathbb {R}\),
Proof
The second equality in (A.4) follows from the subadditivity of \(n\mapsto {\hat{H}}_{nr}^{*}(\varrho ^{\otimes n}\Vert \sigma ^{\otimes n}){}_{\textrm{fa}}\) given in Lemma A.3. To see the other two equalities in (A.4)–(A.5), note that by definition,
On the other hand, by Corollary 3.48 and Lemma A.1, \(\frac{1}{2^n}\left\{ u2^n r- \tilde{\psi }^{*}(\varrho ^{\otimes 2^n}\Vert \sigma ^{\otimes 2^n}|u){}_{\textrm{fa}}\right\} \) is upper semi-continuous in u on the compact set [0, 1], and monotone decreasing in n, whence
where the second equality is due to Lemma 2.5, and the third equality follows from Lemma A.1. \(\square \)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Mosonyi, M. The Strong Converse Exponent of Discriminating Infinite-Dimensional Quantum States. Commun. Math. Phys. 400, 83–132 (2023). https://doi.org/10.1007/s00220-022-04598-1