1 Introduction

In a simple binary i.i.d. quantum state discrimination problem, an experimenter is presented with several identically prepared quantum systems, all in the same state that is either described by a density operator \(\varrho \) on the system’s Hilbert space \({\mathcal {H}}\) (null hypothesis \(H_0\)), or by another density operator \(\sigma \) (alternative hypothesis \(H_1\)). The experimenter’s task is to guess which hypothesis is correct, based on the result of a 2-outcome measurement, represented by a pair of operators \((T_n(0)=:T_n,T_n(1)=I-T_n)\), where \(T_n\in {{\mathcal {B}}}({\mathcal {H}}_n)\) is a test on \({\mathcal {H}}_n{:}{=}{\mathcal {H}}^{\otimes n}\), i.e., \(0\le T_n\le I\), and n is the number of identically prepared systems. If the outcome of the measurement is k, described by the measurement operator \(T_n(k)\), the experimenter decides that hypothesis k is true. The type I success probability, i.e., the probability that the experimenter correctly identifies the state to be \(\varrho \), and the type II error probability, i.e., the probability that the experimenter erroneously identifies the state to be \(\varrho \), are given by

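$$\begin{aligned} {{\,\textrm{Tr}\,}}\varrho _nT_n\,\, \,\, \,\, \text {and}\,\, \,\, \,\, \beta _n{:}{=}{{\,\textrm{Tr}\,}}\sigma _nT_n, \end{aligned}$$
(1.1)

respectively, where \(\varrho _n=\varrho ^{\otimes n}\), \(\sigma _n=\sigma ^{\otimes n}\).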
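As a rough illustration of these two probabilities, the following sketch evaluates them in the commuting (classical) special case, where both states are diagonal qubit states and the test is the projection onto the n-bit strings with large likelihood ratio. The function name, the parameters `p`, `q`, `threshold`, and the choice of a likelihood-ratio test are assumptions made for this example only; they are not taken from the paper.

```python
from math import comb, log

def classical_test_probs(p, q, n, threshold):
    """Type I success and type II error of a likelihood-ratio test.

    rho = diag(1 - p, p) and sigma = diag(1 - q, q) are commuting qubit states
    (hypothetical example data). The test T_n is the projection onto all
    n-bit strings whose per-copy log-likelihood ratio is at least `threshold`.
    """
    success_I = 0.0   # Tr(rho_n T_n)
    error_II = 0.0    # Tr(sigma_n T_n)
    for k in range(n + 1):          # k = number of ones in the string
        llr = (k * log(p / q) + (n - k) * log((1 - p) / (1 - q))) / n
        if llr >= threshold:
            weight = comb(n, k)     # number of strings with k ones
            success_I += weight * p**k * (1 - p)**(n - k)
            error_II += weight * q**k * (1 - q)**(n - k)
    return success_I, error_II

for n in (10, 50, 200):
    print(n, classical_test_probs(0.8, 0.3, n, threshold=0.2))
```

By construction, every accepted string has likelihood ratio at least \(e^{n\cdot \texttt{threshold}}\), so the type II error of this test is at most \(e^{-n\cdot \texttt{threshold}}\) times its type I success probability.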

In the asymptotic analysis of the problem, it is customary to look for the optimal asymptotics of the type I success probabilities under the constraint that the type II error probabilities decrease at least as fast as \(\beta _n\sim e^{-nr}\) with some fixed r. It is known that if r is smaller than the relative entropy of \(\varrho \) and \(\sigma \) then the type I success probabilities converge to 1 exponentially fast, and the optimal exponent (the so-called direct exponent) is equal to the Hoeffding divergence \(H_r\) of \(\varrho \) and \(\sigma \) [3, 15, 26, 37]. The Hoeffding divergences are defined from the Petz-type Rényi divergences \(D_{\alpha }\) with \(\alpha \in (0,1)\), and the above result establishes the operational significance of these divergences [32, 37].

On the other hand, it was shown in [33] (see also [16, 36, 38]) that if the Hilbert space is finite-dimensional (equivalently, the density operators are of finite rank), and r is larger than the relative entropy, then the type I success probabilities converge to 0 exponentially fast, and the optimal exponent (the so-called strong converse exponent) is equal to the Hoeffding anti-divergence \(H_r^{*}\) of \(\varrho \) and \(\sigma \). (\(H_r^{*}\), as well as the various divergences mentioned below, will be precisely defined in the main text.) The Hoeffding anti-divergences are defined from the sandwiched Rényi divergences \(D_{\alpha }^{*}\) with \(\alpha >1\) [35, 46], and this result establishes the operational significance of these divergences.

A key step in the proof of the strong converse exponent in [33] is showing that the regularized measured Rényi divergence \(\overline{D}_{\alpha }^{\text {meas}}\) coincides with the sandwiched Rényi divergence \(D_{\alpha }^{*}\) for any \(\alpha >1\), which was proved using the pinching inequality [14], a fundamentally finite-dimensional technique. Thus, while the notion of the sandwiched Rényi divergences was extended recently to density operators on an infinite-dimensional Hilbert space (in fact, even for states of an arbitrary von Neumann algebra) in [6] and [27], these quantities were so far lacking an operational interpretation similar to the finite-dimensional case described above, and it has also been open whether they coincide with the regularized measured Rényi divergences. In this paper we fill this gap by answering both questions in the positive for density operators on an infinite-dimensional Hilbert space.

We also initiate the study of the sandwiched Rényi divergences, and the related problem of the strong converse exponents, for pairs of positive semi-definite operators that are not necessarily trace-class (this corresponds to considering weights in a general von Neumann algebra setting). This is motivated by the need to define conditional Rényi entropies in the infinite-dimensional setting, while it might also be interesting from the purely mathematical point of view of extending the concept of Rényi (and other) divergences to settings beyond the standard one of positive trace-class operators (or positive normal functionals, in the von Neumann algebra setting). In this spirit, we also discuss the definition and some properties of the more general family of Rényi \((\alpha ,z)\)-divergences [4, 25] in this setting. To the best of our knowledge, this is new even for trace-class operators when the underlying Hilbert space is infinite-dimensional.

The structure of the paper is as follows. In Sect. 2 we collect some necessary preliminaries. In Sect. 3 we define the Rényi \((\alpha ,z)\)-divergences for an arbitrary pair of positive semi-definite operators on a possibly infinite-dimensional Hilbert space, and establish some of their properties. The most important part of this section for the later applications is the recoverability of the sandwiched Rényi divergence from finite-dimensional restrictions, given in Proposition 3.40. Based on this, in Sect. 3.4 we show that the sandwiched Rényi divergence is equal to the regularized measured Rényi divergence for pairs of states, extending the finite-dimensional result of [33] to infinite dimension. In Sect. 4.1 we consider a generalization of the state discrimination problem where the hypotheses are given by (not necessarily trace-class) positive semi-definite operators, and establish lower and upper bounds on the strong converse exponents in this setting. In particular, we show that the strong converse exponent is equal to the Hoeffding anti-divergence for quantum states, thereby giving an operational interpretation of the sandwiched Rényi divergences analogous to the finite-dimensional case. Moreover, we prove the above equality also in the case where the reference operator \(\sigma \) is only assumed to be compact, and to dominate the first operator as \(\varrho \le \lambda \sigma \) for some \(\lambda >0\). In Sect. 4.2, we give a direct operational interpretation to the sandwiched Rényi divergences as generalized cutoff rates, extending the analogous interpretations given previously for classical [8] and finite-dimensional quantum states [33]. In Sect. 4.3 we use the strong converse result from Sect. 4.1 to show the monotonicity of the sandwiched Rényi divergences under the action of the dual of a normal unital completely positive map. While this follows from [6, 27] for density operators, our proof is completely different, and also applies to other settings, e.g., for a compact \(\sigma \) that dominates \(\varrho \).

2 Preliminaries

Throughout the paper, \({\mathcal {H}}\) and \({{\mathcal {K}}}\) will denote separable Hilbert spaces (of finite or infinite dimension), and \({{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})\) will denote the set of everywhere defined bounded linear operators from \({\mathcal {H}}\) to \({{\mathcal {K}}}\), with \({{\mathcal {B}}}({\mathcal {H}},{\mathcal {H}})=:{{\mathcal {B}}}({\mathcal {H}})\). We will use the notations \({{\mathcal {B}}}({\mathcal {H}})_{\text {sa}}\) for the set of self-adjoint operators and \({{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) for the set of non-zero positive semi-definite (PSD) operators in \({{\mathcal {B}}}({\mathcal {H}})\), respectively, and

$$\begin{aligned} {{\mathcal {B}}}({\mathcal {H}})_{[0,I]}{:}{=}\{T\in {{\mathcal {B}}}({\mathcal {H}}):\,0\le T\le I\} \end{aligned}$$

for the set of tests in \({{\mathcal {B}}}({\mathcal {H}})\). A test T is projective if \(T^2=T\). We will denote the set of all projections on \({\mathcal {H}}\) by \(\mathbb {P}({\mathcal {H}})\), and the set of finite rank projections by \(\mathbb {P}_f({\mathcal {H}})\). The set of finite-rank operators on \({\mathcal {H}}\) will be denoted by \({{\mathcal {B}}}_f({\mathcal {H}})\). The set of density operators, or states, on \({\mathcal {H}}\) will be denoted by \({{\mathcal {S}}}({\mathcal {H}})\). For two PSD operators , we will use the notations

and , , , where .

For a (possibly unbounded) self-adjoint operator A on a Hilbert space \({\mathcal {H}}\), let \(P^A(\cdot )\) denote its spectral PVM, and for any complex-valued measurable function f defined at least on \({{\,\textrm{spec}\,}}(A)\), let \(f(A)=\int _{\mathbb {R}}f\,dP^A\) be the operator defined via the usual functional calculus. We will use the relations

$$\begin{aligned}&(f(A))^*=\overline{f}(A), \end{aligned}$$
(2.1)
$$\begin{aligned}&\overline{f(A)g(A)}=(fg)(A),\,\, \,\, {{\,\textrm{dom}\,}}(f(A)g(A))={{\,\textrm{dom}\,}}(g(A))\cap {{\,\textrm{dom}\,}}((fg)(A)), \end{aligned}$$
(2.2)

where \(\overline{f}\) stands for the pointwise complex conjugate of f, and for a closable operator X, \(\overline{X}\) denotes its closure.

We say that a (not necessarily everywhere defined or bounded) linear operator A on a Hilbert space is positive semi-definite (PSD), if it is self-adjoint, and \({{\,\textrm{spec}\,}}(A)\subseteq [0,+\infty )\). If A is PSD then we may define its real powers as

$$\begin{aligned} A^p{:}{=}{{\,\textrm{id}\,}}_{(0,+\infty )}^p(A)=\int _{(0,+\infty )}{{\,\textrm{id}\,}}_{(0,+\infty )}^p\,dP^A,\,\, \,\, \,\, p\in \mathbb {R}. \end{aligned}$$

In particular, \(A^0\) is the projection onto \((\ker A)^{\perp }={{\,\mathrm{\overline{{{\,\textrm{ran}\,}}}}\,}}A=:{{\,\textrm{supp}\,}}A\),

$$\begin{aligned} (A^p)^{-1}=(A^{-1})^p=A^{-p},\,\, \,\, \,\, p\in \mathbb {R}, \end{aligned}$$

and

$$\begin{aligned} A\,\, \text { is bounded}\,\, \,\, \Longrightarrow \,\, \,\, A^{-p} A^p=I,\,\, \,\, \,\, A^pA^{-p}=I_{{{\,\textrm{ran}\,}}A^p},\,\, \,\, \,\, p>0. \end{aligned}$$
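The convention that the power function is integrated over \((0,+\infty )\) only can be made concrete in finite dimension, where a PSD matrix is determined by its eigenvalues. The following is a minimal sketch under that assumption; the helper `psd_power` and the sample spectrum are hypothetical and serve only as an illustration.

```python
def psd_power(eigs, p):
    """Eigenvalues of A^p for a PSD matrix A with eigenvalues `eigs`.

    The power function is applied on (0, +inf) only, so zero eigenvalues
    stay zero for every real p (including p = 0 and p < 0).
    """
    return [x ** p if x > 0 else 0.0 for x in eigs]

eigs = [4.0, 0.25, 0.0]               # spectrum of a hypothetical PSD matrix A
support = psd_power(eigs, 0)          # A^0: 1 on supp A, 0 on ker A
inv_sqrt = psd_power(eigs, -0.5)      # A^{-1/2}, again vanishing on ker A
# (A^p)^{-1} and A^{-p} have the same eigenvalues
same = psd_power(psd_power(eigs, 2), -1) == psd_power(eigs, -2)
print(support, inv_sqrt, same)
```

In particular, negative powers in this convention are generalized inverses: they invert A on its support and vanish on its kernel.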

Note that here we use the notation

$$\begin{aligned} {{\,\textrm{id}\,}}_{B}:\,x\mapsto x \end{aligned}$$

for the identity map on any subset \(B\subseteq \mathbb {R}\). Analogously, we will denote the characteristic function, or indicator function, of a subset \({{\mathcal {X}}}_0\) of a set \({{\mathcal {X}}}\) by

$$\begin{aligned} \textbf{1}_{{{\mathcal {X}}}_0}:\,x\mapsto {\left\{ \begin{array}{ll}1,&{}x\in {{\mathcal {X}}}_0,\\ 0,&{}x\notin {{\mathcal {X}}}_0.\end{array}\right. } \end{aligned}$$

For any \(X\in {{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})\) with polar decomposition \(X=V|X|\), we have \(|X^*|=V|X|V^*\), whence \(|X^*|^p=V|X|^pV^*\) for any \(p\in \mathbb {R}\). In particular,

$$\begin{aligned} {{\,\textrm{Tr}\,}}(X^*X)^p={{\,\textrm{Tr}\,}}(XX^*)^p,\,\, \,\, \,\, p>0, \end{aligned}$$
(2.3)

which we will use in many proofs below without further notice. We will use the notation \(\left\| X\right\| _p{:}{=}({{\,\textrm{Tr}\,}}|X|^p)^{1/p}\) for \(X\in {{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})\) and \(p>0\). When \(p\ge 1\), \(\left\| \cdot \right\| _p\) is a norm on the Schatten p-class

$$\begin{aligned} {{\mathcal {L}}}^p({\mathcal {H}}){:}{=}\{X\in {{\mathcal {B}}}({\mathcal {H}}):\,{{\,\textrm{Tr}\,}}|X|^p<+\infty \}. \end{aligned}$$

We will denote the usual operator norm on \({{\mathcal {B}}}({\mathcal {H}})\) by \(\left\| \cdot \right\| _{\infty }\).

The following is well known; see, e.g., [18, Proposition 2.7] and [31, Theorem 2.3].

Lemma 2.1

(Hölder inequality). Let \(p_0,p_1,p>0\) be such that \(\frac{1}{p_0}+\frac{1}{p_1}=\frac{1}{p}\). For any \(A,B\in {{\mathcal {B}}}({\mathcal {H}})\),

$$\begin{aligned} \left\| AB\right\| _p\le \left\| A\right\| _{p_0}\left\| B\right\| _{p_1}. \end{aligned}$$
(2.4)

Moreover, if \(\left\| A\right\| _{p_0}\left\| B\right\| _{p_1}<+\infty \) then equality holds in (2.4) if and only if \(|A|^{p_0}=\lambda |B^*|^{p_1}\) or \(|B^*|^{p_1}=\lambda |A|^{p_0}\) for some \(\lambda \ge 0\).
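For diagonal (commuting) operators the inequality (2.4) reduces to the classical Hölder inequality for sequences, which is easy to check numerically. The sketch below assumes finite diagonal matrices given by their diagonal entries; the function names and sample data are illustrative assumptions.

```python
def schatten_norm(diag, p):
    """(Tr |X|^p)^(1/p) for a diagonal operator with diagonal entries `diag`."""
    return sum(abs(x) ** p for x in diag) ** (1.0 / p)

def holder_sides(a, b, p0, p1):
    """Left- and right-hand sides of (2.4) for diagonal A and B."""
    p = 1.0 / (1.0 / p0 + 1.0 / p1)
    lhs = schatten_norm([x * y for x, y in zip(a, b)], p)
    rhs = schatten_norm(a, p0) * schatten_norm(b, p1)
    return lhs, rhs

lhs, rhs = holder_sides([1.0, 2.0, 3.0], [0.5, 1.0, 4.0], p0=2.0, p1=4.0)
print(lhs, rhs, lhs <= rhs)
```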

We will use the notations \(\mathrm{(wo)}\lim \) and \(\mathrm{(so)}\lim \) for limits in the weak and the strong operator topologies, respectively. The following two statements are from [13].

Lemma 2.2

Let \(A\in {{\mathcal {L}}}^p({\mathcal {H}})\) for some \(p\ge 1\), and \(B_n\in {{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})\), \(C_n\in {{\mathcal {B}}}({{\mathcal {K}}},{\mathcal {H}})\), \(n\in \mathbb {N}\), be two sequences bounded in operator norm and converging strongly to some \(B_{\infty }\in {{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})\) and \(C_{\infty }\in {{\mathcal {B}}}({{\mathcal {K}}},{\mathcal {H}})\), respectively. Then

$$\begin{aligned} \lim _{n\rightarrow +\infty }\left\| B_nAC_n-B_{\infty }AC_{\infty }\right\| _p=0,\,\, \,\, \,\, \lim _{n\rightarrow +\infty }\left\| B_nAC_n\right\| _p=\left\| B_{\infty }AC_{\infty }\right\| _p. \end{aligned}$$
(2.5)

Proof

The first limit in (2.5) is immediate from [13, Theorem 1], and the second limit follows from it trivially. \(\square \)

The following is Theorem 2 in [13]:

Lemma 2.3

Let \(p\in [1,+\infty )\) and \(A,A_n\in {{\mathcal {L}}}^p({\mathcal {H}})\), \(n\in \mathbb {N}\), be such that \(\mathrm{(so)}\lim _n A_n=A\), \(\mathrm{(so)}\lim _n A_n^*=A^*\), and \(\lim _n\left\| A_n\right\| _p=\left\| A\right\| _p\). Then \(\lim _n\left\| A_n-A\right\| _p=0\).

The following is a special case of [18, Proposition 2.11]:

Lemma 2.4

Assume that a sequence \(A_n\in {{\mathcal {B}}}({\mathcal {H}})\), \(n\in \mathbb {N}\), converges to some \(A\in {{\mathcal {B}}}({\mathcal {H}})\) in the weak operator topology. For any \(p\in [1,+\infty ]\),

$$\begin{aligned} \left\| A\right\| _{p}\le \liminf _{n\rightarrow +\infty }\left\| A_n\right\| _{p}. \end{aligned}$$

We will need the following straightforward generalization of the minimax theorem from [32, Corollary A.2]. Its proof is essentially the same as that of the original statement; we include it for the reader’s convenience.

Lemma 2.5

Let X be a compact topological space, Y be an upward directed partially ordered set, and let \(f:\,X\times Y\rightarrow \mathbb {R}\cup \{-\infty ,+\infty \}\) be a function. Assume that

  1. (i)

    \(f(.\,,\,y)\) is upper semicontinuous for every \(y\in Y\) and

  2. (ii)

    f(x, .) is monotonic decreasing for every \(x\in X\).

Then

$$\begin{aligned} \sup _{x\in X}\inf _{y\in Y}f(x,y)= \inf _{y\in Y}\sup _{x\in X}f(x,y), \end{aligned}$$
(2.6)

and the suprema in (2.6) can be replaced by maxima.

Proof

The inequality \(\sup _{x\in X}\inf _{y\in Y}f(x,y)\le \inf _{y\in Y}\sup _{x\in X}f(x,y)\) is trivial, and for the converse inequality it is sufficient to prove that for any finite subset \(Y'\subseteq Y\),

$$\begin{aligned} \sup _{x\in X}\inf _{y\in Y'}f(x,y)\ge \inf _{y\in Y}\sup _{x\in X}f(x,y), \end{aligned}$$

according to [32, Lemma A.1] (applied to \(-f\) in place of f). Due to Y being upward directed, for any finite subset \(Y'\subseteq Y\), there exists a \(y^*\in Y\) such that \(y\le y^*\) for every \(y\in Y'\). Since f(x, .) is assumed to be monotone decreasing, we get

$$\begin{aligned} \sup _{x\in X}\inf _{y\in Y'}f(x,y)\ge \sup _{x\in X}f(x,y^*) \ge \inf _{y\in Y}\sup _{x\in X}f(x,y), \end{aligned}$$

as required. The assertion about the maxima is straightforward from the assumed semi-continuity and the compactness of X. \(\square \)
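The statement of Lemma 2.5 can be sanity-checked on a toy instance in which X is finite (hence compact in the discrete topology) and Y is a finite totally ordered set (hence upward directed). The function f and both sets below are hypothetical choices for illustration only.

```python
def minimax_sides(f, xs, ys):
    """Return (sup_x inf_y f, inf_y sup_x f) over finite sets xs, ys."""
    sup_inf = max(min(f(x, y) for y in ys) for x in xs)
    inf_sup = min(max(f(x, y) for x in xs) for y in ys)
    return sup_inf, inf_sup

xs = [i / 10 for i in range(11)]   # finite stand-in for a compact set X
ys = range(1, 51)                  # totally ordered, so upward directed

def f(x, y):
    # y -> f(x, y) is decreasing for every fixed x
    return x * (1 - x) + 1.0 / y

sup_inf, inf_sup = minimax_sides(f, xs, ys)
print(sup_inf, inf_sup)
```

As in the proof, both sides are attained at the largest element of Y, which plays the role of the upper bound \(y^*\).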

3 The Rényi \((\alpha ,z)\)-Divergences in Infinite Dimension

The sandwiched Rényi \(\alpha \)-divergences for pairs of finite-dimensional density operators were introduced in [35, 46]. The Rényi \((\alpha ,z)\)-divergences [4, 25] give a 2-parameter extension of this family, which includes both the sandwiched Rényi divergences (corresponding to \(z=\alpha \)) and the Petz-type, or standard Rényi divergences [40] (corresponding to \(z=1\)) as special cases.

The concept of the sandwiched Rényi divergences was extended recently to pairs of positive normal linear functionals on a general von Neumann algebra in [6, 27, 28], while the Petz-type Rényi divergences have been studied in this more general setting for a long time [21, 29, 39]. These extensions require advanced knowledge of von Neumann algebras, and the details of the proofs might be difficult to verify for those who are not experts in the subject. Below we give a more pedestrian exposition of the definition and basic properties of the Rényi divergences in the simpler case where the states are represented by density operators on a possibly infinite-dimensional Hilbert space, while at the same time we also generalize the above works in this setting to the case where density operators may be replaced by arbitrary positive semi-definite operators. Since these are mostly not assumed to be trace-class, they cannot be normalized to states in the properly infinite-dimensional case. Moreover, we also consider the more general notion of Rényi \((\alpha ,z)\)-divergences in this setting.

The recoverability of the sandwiched Rényi divergences from finite-size restrictions, given in Proposition 3.40, seems to be new even for density operators, although in that case it follows easily from the known properties of monotonicity and lower semi-continuity of the sandwiched Rényi divergences.

3.1 Definition and basic properties

The sandwiched Rényi divergence of \(\varrho \) and \(\sigma \) is finite according to the definition in [27] if and only if \(\varrho \) is in Kosaki’s interpolation space \({{\mathcal {L}}}^{\alpha }({\mathcal {H}},\sigma )\). The following lemma gives various alternative characterizations of this condition, and also an extension that we will use in the definition of the Rényi \((\alpha ,z)\)-divergences in this setting. The lemma is essentially a special case of Douglas’ range inclusion theorem [10] for PSD operators with \(A{:}{=}\varrho ^{\frac{\alpha }{2z}}\) and \(B{:}{=}\sigma ^{\frac{\alpha -1}{2z}}\) (points (iv)–(vi)), together with an extension giving further equivalent characterizations (points (i)–(iii)), and it is inspired by a similar statement for the \(\alpha =z=+\infty \) case given in [30].

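Let us introduce the notation

$$\begin{aligned} \mathbb {A}{:}{=}\big ((1,+\infty )\times (0,+\infty )\big )\cup \{(+\infty ,+\infty )\}. \end{aligned}$$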

For \((\alpha ,z){:}{=}(+\infty ,+\infty )\), we will use the convention \(\frac{\alpha }{z}{:}{=}1\), and define similar expressions by a formal calculus, e.g., \(\frac{\alpha }{2z}{:}{=}\frac{1}{2}\frac{\alpha }{z}=\frac{1}{2}\), \(\frac{\alpha -1}{2z}{:}{=}\frac{\alpha }{2z}-\frac{1}{2z}=\frac{1}{2}\), etc.

Lemma 3.1

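Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), and let \((\alpha ,z)\in \mathbb {A}\). The following are equivalent:

  1. (i)

    There exists an \(R\in {{\mathcal {B}}}({\mathcal {H}})\) such that

    $$\begin{aligned} \varrho ^{\frac{\alpha }{z}}=\sigma ^{\frac{\alpha -1}{2z}}R\sigma ^{\frac{\alpha -1}{2z}}. \end{aligned}$$
    (3.1)
  2. (ii)

    \(\varrho ^0\le \sigma ^0\), and \(\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma ^{\frac{1-\alpha }{2z}}\) is densely defined and bounded.

  3. (iii)

    \(\varrho ^0\le \sigma ^0\), and for any/some sequences \(0<c_n<d_n\) with \(c_n\rightarrow 0\), \(d_n\rightarrow +\infty \), the sequence of bounded operators

    $$\begin{aligned} \sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}},\,\, \,\, \,\, n\in \mathbb {N}, \end{aligned}$$
    (3.2)

    converges in the weak/strong operator topology, where \(\sigma _n{:}{=}P_n\sigma P_n\), \(P_n{:}{=}\textbf{1}_{(c_n,d_n)}(\sigma )\).

  4. (iv)

    \(\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\in {{\mathcal {B}}}({\mathcal {H}})\).

  5. (v)

    \({{\,\textrm{ran}\,}}\varrho ^{\frac{\alpha }{2z}}\subseteq {{\,\textrm{ran}\,}}\sigma ^{\frac{\alpha -1}{2z}}\).

  6. (vi)

    There exists a \(\lambda \ge 0\) such that \(\varrho ^{\frac{\alpha }{z}}\le \lambda \sigma ^{\frac{\alpha -1}{z}}\).

Moreover, if the above hold then \(\varrho ^0\le \sigma ^0\), and among all operators R as in (3.1) there exists a unique PSD operator with the property \(R^0\le \sigma ^0\), denoted by \(\varrho _{\sigma ,\alpha ,z}\), which can be expressed as

$$\begin{aligned} \varrho _{\sigma ,\alpha ,z}&=\overline{\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma ^{\frac{1-\alpha }{2z}}} =\big (\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\big )\overline{\varrho ^{\frac{\alpha }{2z}}\sigma ^{\frac{1-\alpha }{2z}}} =\big (\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\big )\big (\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\big )^* \end{aligned}$$
(3.3)
$$\begin{aligned} \varrho _{\sigma ,\alpha ,z}&=\mathrm{(so)}\lim _{n\rightarrow +\infty }\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}} =\mathrm{(wo)}\lim _{n\rightarrow +\infty }\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}}, \end{aligned}$$
(3.4)

where \((\sigma _n)_{n\in \mathbb {N}}\) is any sequence as in (iii). This unique \(\varrho _{\sigma ,\alpha ,z}\) is in the von Neumann algebra generated by \(\varrho \) and \(\sigma \), and its operator norm is equal to the smallest \(\lambda \) for which (vi) holds.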

Proof

Note that if \(R\in {{\mathcal {B}}}({\mathcal {H}})\) satisfies (3.1) then so does \(\sigma ^0R\sigma ^0\). Moreover, any of the conditions above implies \(\varrho ^0\le \sigma ^0\). Hence, we may assume without loss of generality that \({{\,\textrm{supp}\,}}\sigma ={\mathcal {H}}\), so that \({{\,\mathrm{\overline{{{\,\textrm{ran}\,}}}}\,}}(\sigma ^{\frac{\alpha -1}{2z}})= \big (\ker \big (\sigma ^{\frac{\alpha -1}{2z}}\big )\big )^{\perp }=(\ker \sigma )^{\perp }={\mathcal {H}}\).

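Assume that (i) holds. Then \(\varrho ^0\le \sigma ^0\) holds trivially, and

$$\begin{aligned} \sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma ^{\frac{1-\alpha }{2z}} =\sigma ^{\frac{1-\alpha }{2z}}\sigma ^{\frac{\alpha -1}{2z}}R\sigma ^{\frac{\alpha -1}{2z}}\sigma ^{\frac{1-\alpha }{2z}} =R\,I_{{{\,\textrm{ran}\,}}\sigma ^{\frac{\alpha -1}{2z}}}, \end{aligned}$$

whence its closure is equal to R. This proves (ii) and the existence of the unique \(\varrho _{\sigma ,\alpha ,z}\) with the postulated properties, as well as the first equality in (3.3). Moreover, for any \(0<c_n<d_n\) with \(c_n\rightarrow 0\), \(d_n\rightarrow +\infty \), we have, with \(\sigma _n\) as in (iii),

$$\begin{aligned} \sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}} =\sigma _n^{\frac{1-\alpha }{2z}}\sigma ^{\frac{\alpha -1}{2z}}\varrho _{\sigma ,\alpha ,z}\sigma ^{\frac{\alpha -1}{2z}}\sigma _n^{\frac{1-\alpha }{2z}} =P_n\varrho _{\sigma ,\alpha ,z}P_n \xrightarrow [n\rightarrow +\infty ]{\mathrm{(so)}}\varrho _{\sigma ,\alpha ,z}, \end{aligned}$$
(3.5)

where we used (2.2). This proves (iii) and the first equality in (3.4). Since \(\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}}\) is in the von Neumann algebra generated by \(\varrho \) and \(\sigma \), so is \(\varrho _{\sigma ,\alpha ,z}\), according to (3.5). Obviously, (i) also implies

$$\begin{aligned} \varrho ^{\frac{\alpha }{z}}=\sigma ^{\frac{\alpha -1}{2z}}\varrho _{\sigma ,\alpha ,z}\sigma ^{\frac{\alpha -1}{2z}} \le \left\| \varrho _{\sigma ,\alpha ,z}\right\| _{\infty }\sigma ^{\frac{\alpha -1}{z}}, \end{aligned}$$

whence (vi) follows with \(\lambda {:}{=}\left\| \varrho _{\sigma ,\alpha ,z}\right\| _{\infty }\). As a consequence, \(\lambda _{\min }\le \left\| \varrho _{\sigma ,\alpha ,z}\right\| _{\infty }\), where \(\lambda _{\min }\) denotes the smallest \(\lambda \) for which (vi) holds. Conversely, let \(\lambda \) be as in (vi). Multiplying both sides by \(\sigma _n^{\frac{1-\alpha }{2z}}\) yields \(\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}}\le \lambda P_n\), which in combination with (3.5) gives \(\varrho _{\sigma ,\alpha ,z}\le \lambda I\). Thus, \(\left\| \varrho _{\sigma ,\alpha ,z}\right\| _{\infty }=\lambda _{\min }\), as stated.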

Assume next that (ii) holds. Then

(3.6)

where the last equality follows from the assumption . Since is everywhere defined, it is actually equal to the first operator in (3.6), and thus (i) holds. Moreover, if (3.6) holds then for any \(\phi \in {\mathcal {H}}\),

Since \({{\,\textrm{ran}\,}}\sigma ^{\frac{\alpha -1}{2z}}\) is dense and is bounded, it follows that is PSD.

Assume now (iii), i.e., that for some sequences \(0<c_n<d_n\) with \(c_n\rightarrow 0\), \(d_n\rightarrow +\infty \), the sequence of operators converges in the weak operator topology to some operator \(R_{\sigma ,\alpha ,z}\). Then

and hence (i) holds, as well as the second equality in (3.4).

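The equivalence of (iv), (v), and (vi) follows from Douglas’ range inclusion theorem [10]. Note that (iv)\(\Longleftrightarrow \)(v) is simple, as \(\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\) being everywhere defined is equivalent to \({{\,\textrm{ran}\,}}\varrho ^{\frac{\alpha }{2z}}\subseteq {{\,\textrm{ran}\,}}\sigma ^{\frac{\alpha -1}{2z}}\), and boundedness of \(\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\) is automatic from the boundedness of \(\varrho ^{\frac{\alpha }{2z}}\) and the closedness of \(\sigma ^{\frac{1-\alpha }{2z}}\), due to the closed graph theorem. Moreover, we have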

whence

which is densely defined and bounded. Thus, . Finally,

(3.7)

where the last equality follows from the assumption \({{\,\textrm{ran}\,}}\varrho ^{\frac{\alpha }{2z}}\subseteq {{\,\textrm{ran}\,}}\sigma ^{\frac{\alpha -1}{2z}}\). Thus, (i) follows with \(\varrho _{\sigma ,\alpha ,z}= \big (\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\big )\overline{\varrho ^{\frac{\alpha }{2z}}\sigma ^{\frac{1-\alpha }{2z}}}\), and we also have the second and the third equalities in (3.3). Note that the last expression in (3.3) gives another proof for the positive semi-definiteness of \(\varrho _{\sigma ,\alpha ,z}\). \(\square \)

Definition 3.2

For \(\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in \mathbb {A}\), let

$$\begin{aligned} {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )&{:}{=}\left\{ \varrho \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}:\, \exists \,R\in {{\mathcal {B}}}({\mathcal {H}})\,\, \text {s.t.} \,\, \varrho ^{\frac{\alpha }{z}} = \sigma ^{\frac{\alpha -1}{2z}}R\sigma ^{\frac{\alpha -1}{2z}} \right\} . \end{aligned}$$

When \(\alpha =z\), we will use the shorthand notation \({{\mathcal {B}}}^{\alpha ,\alpha }({\mathcal {H}},\sigma )=:{{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma )\).

Remark 3.3

Note that \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) if and only if it satisfies (i) in Lemma 3.1, which is equivalently characterized by all the other points in Lemma 3.1. In particular, there exists a unique PSD \(\varrho _{\sigma ,\alpha ,z}\) satisfying \(\varrho _{\sigma ,\alpha ,z}^0\le \sigma ^0\) and \(\varrho ^{\frac{\alpha }{z}} = \sigma ^{\frac{\alpha -1}{2z}}\varrho _{\sigma ,\alpha ,z}\sigma ^{\frac{\alpha -1}{2z}}\), and thus the map \(\varrho \mapsto \varrho _{\sigma ,\alpha ,z}\) is well-defined from \({{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) onto \(\{\tau \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}:\,\tau ^0\le \sigma ^0\}\), and it is also injective, hence it is a bijection. When \(\alpha =z\), we will use the notation \(\varrho _{\sigma ,\alpha ,\alpha }=:\varrho _{\sigma ,\alpha }\).

Lemma 3.4

For any \(\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), \((0,+\infty )\ni z\mapsto {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) is increasing, i.e.,

$$\begin{aligned} 0<z<z'\,\, \Longrightarrow \,\, {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\subseteq {{\mathcal {B}}}^{\alpha ,z'}({\mathcal {H}},\sigma ). \end{aligned}$$

Proof

Let \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\). Then, by (vi) of Lemma 3.1, \(\varrho ^{\frac{\alpha }{z}}\le \lambda \sigma ^{\frac{\alpha -1}{z}}\) for some \(\lambda \in (0,+\infty )\). Since \(z<z'\), \({{\,\textrm{id}\,}}_{[0,+\infty )}^{\frac{z}{z'}}\) is operator monotone, whence

$$\begin{aligned} \varrho ^{\frac{\alpha }{z'}} = \big (\varrho ^{\frac{\alpha }{z}}\big )^{\frac{z}{z'}} \le \lambda ^{\frac{z}{z'}}\big (\sigma ^{\frac{\alpha -1}{z}}\big )^{\frac{z}{z'}} = \lambda ^{\frac{z}{z'}}\sigma ^{\frac{\alpha -1}{z'}}. \end{aligned}$$

Again by (vi) of Lemma 3.1, \(\varrho \in {{\mathcal {B}}}^{\alpha ,z'}({\mathcal {H}},\sigma )\). \(\square \)

Remark 3.5

By (3.3)–(3.4), for \(P_n\) and \(\sigma _n\) as in (3.2),

$$\begin{aligned} P_n\varrho _{\sigma ,\alpha ,z}P_n = \sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}}, \end{aligned}$$

and if \(\alpha =z\), then we further have

$$\begin{aligned} P_n\varrho _{\sigma ,\alpha }P_n =\sigma _n^{\frac{1-\alpha }{2\alpha }}\varrho \sigma _n^{\frac{1-\alpha }{2\alpha }} =(P_n\sigma P_n)^{\frac{1-\alpha }{2\alpha }}(P_n\varrho P_n) (P_n\sigma P_n)^{\frac{1-\alpha }{2\alpha }}. \end{aligned}$$

Thus, with \(\varrho _n{:}{=}P_n\varrho P_n\),

$$\begin{aligned} \varrho _{\sigma ,\alpha }= \mathrm{(wo)}\lim _{n\rightarrow +\infty } \sigma _n^{\frac{1-\alpha }{2\alpha }}\varrho _n \sigma _n^{\frac{1-\alpha }{2\alpha }}. \end{aligned}$$

Remark 3.6

Note that if \(\varrho \in {{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma )\), i.e., \(\varrho =\sigma ^{\frac{\alpha -1}{2\alpha }}\varrho _{\sigma ,\alpha }\sigma ^{\frac{\alpha -1}{2\alpha }}\) with \(\varrho _{\sigma ,\alpha }\in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), \(\varrho _{\sigma ,\alpha }^0\le \sigma ^0\), then for any \(1<\alpha '<\alpha \),

$$\begin{aligned} \varrho =\sigma ^{\frac{\alpha '-1}{2\alpha '}} \sigma ^{\frac{1}{2\alpha '}-\frac{1}{2\alpha }} \varrho _{\sigma ,\alpha } \sigma ^{\frac{1}{2\alpha '}-\frac{1}{2\alpha }} \sigma ^{\frac{\alpha '-1}{2\alpha '}}, \end{aligned}$$

whence \(\varrho \in {{\mathcal {B}}}^{\alpha '}({\mathcal {H}},\sigma )\), and

$$\begin{aligned} \varrho _{\sigma ,\alpha '}=\sigma ^{\frac{1}{2\alpha '}-\frac{1}{2\alpha }} \varrho _{\sigma ,\alpha } \sigma ^{\frac{1}{2\alpha '}-\frac{1}{2\alpha }}. \end{aligned}$$
(3.8)

In particular, if \(\varrho \in {{\mathcal {B}}}^{\infty }({\mathcal {H}},\sigma )\), i.e., \(\varrho =\sigma ^{1/2}\varrho _{\sigma ,\infty }\sigma ^{1/2}\) with some \(\varrho _{\sigma ,\infty }\in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), \(\varrho _{\sigma ,\infty }^0\le \sigma ^0\), then \(\varrho \in {{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma )\) for every \(\alpha >1\), and

$$\begin{aligned} \varrho _{\sigma ,\alpha }=\sigma ^{\frac{1}{2\alpha }}\varrho _{\sigma ,\infty }\sigma ^{\frac{1}{2\alpha }}. \end{aligned}$$

As an immediate consequence,

$$\begin{aligned} \cap _{\alpha >1}{{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma ) \supseteq {{\mathcal {B}}}^{\infty }({\mathcal {H}},\sigma )=\left\{ \varrho \in {{\mathcal {B}}}({\mathcal {H}}):\,D_{\max }(\varrho \Vert \sigma )<+\infty \right\} , \end{aligned}$$

where

$$\begin{aligned} D_{\max }(\varrho \Vert \sigma ){:}{=}\inf \{\kappa \in \mathbb {R}:\,\varrho \le e^{\kappa }\sigma \} \end{aligned}$$
(3.9)

is the max-relative entropy of \(\varrho \) and \(\sigma \) [9, 41].
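For commuting operators in finite dimension, (3.9) reduces to the largest log-ratio of eigenvalues. The following sketch assumes that both operators are given by their eigenvalue lists in a common eigenbasis; the function name `d_max` and the sample inputs are illustrative assumptions.

```python
from math import log, inf

def d_max(rho, sigma):
    """Max-relative entropy (3.9) for commuting PSD operators given by
    their eigenvalue lists in a common eigenbasis."""
    kappa = -inf
    for r, s in zip(rho, sigma):
        if r > 0:
            if s == 0:
                return inf   # rho is not dominated by any multiple of sigma
            kappa = max(kappa, log(r / s))
    return kappa

print(d_max([0.5, 0.5], [0.25, 0.75]))
print(d_max([1.0, 1.0], [1.0, 0.0]))
```

The second call returns infinity because the support condition \(\varrho ^0\le \sigma ^0\) fails.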

Definition 3.7

For \(\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\), let

$$\begin{aligned} {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )&{:}{=} \left\{ \varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma ):\, {{\,\textrm{Tr}\,}}\varrho _{\sigma ,\alpha ,z}^z<+\infty \right\} . \end{aligned}$$

Again, when \(\alpha =z\), we will use the notation \({{\mathcal {L}}}^{\alpha ,\alpha }({\mathcal {H}},\sigma )=:{{\mathcal {L}}}^{\alpha }({\mathcal {H}},\sigma )\).

Remark 3.8

Note that for \(\alpha >1\), \(\sigma ^{\frac{\alpha -1}{2z}}\in {{\mathcal {B}}}({\mathcal {H}})\), and if \(z\ge 1\) then \({{\mathcal {L}}}^{z}({\mathcal {H}})\) is an ideal in \({{\mathcal {B}}}({\mathcal {H}})\). Thus, by (i) of Lemma 3.1, if \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) then \(\varrho ^{\frac{\alpha }{z}}\in {{\mathcal {L}}}^{z}({\mathcal {H}})\), or equivalently, \(\varrho \in {{\mathcal {L}}}^{\alpha }({\mathcal {H}})\). Therefore,

$$\begin{aligned} {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\subseteq {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\cap {{\mathcal {L}}}^{\alpha }({\mathcal {H}}),\,\, \,\, \alpha >1,\,\, z\ge 1. \end{aligned}$$

Assume now that \(\sigma \) is trace-class and \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) for some \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\). Then, by (i) of Lemma 3.1 and the operator Hölder inequality, \({{\,\textrm{Tr}\,}}\big (\varrho ^{\frac{\alpha }{z}}\big )^r<+\infty \), where \(\frac{1}{r}=\frac{\alpha -1}{2z}+\frac{1}{z}+\frac{\alpha -1}{2z}=\frac{\alpha }{z}\), or equivalently, \(\varrho \in {{\mathcal {L}}}^1({\mathcal {H}})\). Thus, we get

$$\begin{aligned} \sigma \text { trace-class }\,\, \Longrightarrow \,\, {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\subseteq {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\cap {{\mathcal {L}}}^{1}({\mathcal {H}}),\,\, \,\, \alpha>1,\,\, z>0. \end{aligned}$$

It is easy to see that the above inclusion is strict in general. Indeed, let \(\sigma \in {{\mathcal {B}}}(l^2(\mathbb {N}))\) be diagonal in the canonical basis of \(l^2(\mathbb {N})\), i.e., \(\sigma =\sum _{k\in \mathbb {N}}s(k)\left| \textbf{1}_{\{k\}}\right\rangle \!\left\langle \textbf{1}_{\{k\}}\right| \) for some \(s:\,\mathbb {N}\rightarrow (0,+\infty )\) such that \(\sum _{k\in \mathbb {N}}s(k)<+\infty \) (i.e., \(\sigma \) is trace-class) and \(\sum _{k\in \mathbb {N}}s(k)^{\frac{\alpha -1}{\alpha }}<+\infty \). Define \(\varrho {:}{=}\sum _{k\in \mathbb {N}}s(k)^{\frac{\alpha -1}{\alpha }}\left| \textbf{1}_{\{k\}}\right\rangle \!\left\langle \textbf{1}_{\{k\}}\right| \). Then \(\varrho \) is trace-class, \(\varrho ^{\frac{\alpha }{z}}=\sigma ^{\frac{\alpha -1}{z}}\), and for any sequence \((P_n=\textbf{1}_{(c_n,d_n)}(\sigma ))_{n\in \mathbb {N}}\) as in Lemma 3.1,

$$\begin{aligned} \sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}} =\sum _{k:\,c_n< s(k)<d_n}\left| \textbf{1}_{\{k\}}\right\rangle \!\left\langle \textbf{1}_{\{k\}}\right| , \end{aligned}$$

which goes to I in the strong operator topology. Hence, \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\cap {{\mathcal {L}}}^{1}({\mathcal {H}})\), but \(\varrho _{\sigma ,\alpha ,z}=I\), so that \({{\,\textrm{Tr}\,}}\varrho _{\sigma ,\alpha ,z}^z=+\infty \), and therefore \(\varrho \notin {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\).
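The mechanism behind this counterexample can be observed numerically on finite truncations. The sketch below treats the sandwiched case \(z=\alpha \) with the hypothetical choice \(s(k)=k^{-2\alpha /(\alpha -1)}\), for which \(\varrho \) has eigenvalues \(k^{-2}\); the function name and this particular choice of s are assumptions made for illustration.

```python
def truncated_traces(alpha, n_terms):
    """Diagonal example on l^2(N) with s(k) = k^(-2*alpha/(alpha-1)), z = alpha.

    Returns (Tr sigma, Tr rho, Tr rho_{sigma,alpha}^alpha), each truncated to
    the first n_terms basis vectors.
    """
    tr_sigma = tr_rho = tr_core = 0.0
    for k in range(1, n_terms + 1):
        s = k ** (-2.0 * alpha / (alpha - 1.0))
        r = s ** ((alpha - 1.0) / alpha)          # eigenvalue of rho, equals k^-2
        core = s ** ((1.0 - alpha) / alpha) * r   # eigenvalue of rho_{sigma,alpha}
        tr_sigma += s
        tr_rho += r
        tr_core += core ** alpha
    return tr_sigma, tr_rho, tr_core

for n in (10, 100, 1000):
    print(n, truncated_traces(2.0, n))
```

The truncated traces of \(\sigma \) and \(\varrho \) stay bounded as the cutoff grows, while the truncated value of \({{\,\textrm{Tr}\,}}\varrho _{\sigma ,\alpha }^{\alpha }\) grows linearly with the cutoff.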

The following is an extension of the Rényi \((\alpha ,z)\)-divergences [4] to the case of infinite-dimensional PSD operators. It is also a special case of Jenčová’s definition of the sandwiched Rényi divergence [27] when \(\varrho \) and \(\sigma \) are trace-class, and \(z=\alpha \), and it is a natural extension of it otherwise.

Definition 3.9

For \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\), let

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma ){:}{=} {\left\{ \begin{array}{ll} {{\,\textrm{Tr}\,}}\varrho _{\sigma ,\alpha ,z}^{z}, &{}\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma ),\\ +\infty ,&{}\text {otherwise}, \end{array}\right. } \end{aligned}$$

with \(\varrho _{\sigma ,\alpha ,z}\) as in Lemma 3.1. The Rényi \((\alpha ,z)\)-divergence of \(\varrho \) and \(\sigma \) is defined as

$$\begin{aligned} D_{\alpha ,z}(\varrho \Vert \sigma ){:}{=}\frac{1}{\alpha -1}\log Q_{\alpha ,z}(\varrho \Vert \sigma ). \end{aligned}$$

We use the notations \(Q_{\alpha }^{*}{:}{=}Q_{\alpha ,\alpha }\) and \(D_{\alpha }^{*}{:}{=}D_{\alpha ,\alpha }\), and call the latter the sandwiched Rényi \(\alpha \)-divergence.
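In the commuting finite-dimensional case, Definition 3.9 can be evaluated directly on eigenvalues, since then \(\varrho _{\sigma ,\alpha ,z}\) has eigenvalues \(\varrho _k^{\alpha /z}\sigma _k^{(1-\alpha )/z}\) on the support of \(\sigma \). The following sketch assumes this setting; the function names and inputs are hypothetical and only meant to illustrate the definition.

```python
from math import log, inf

def q_alpha_z(rho, sigma, alpha, z):
    """Q_{alpha,z}(rho||sigma) for commuting PSD matrices (eigenvalue lists)."""
    total = 0.0
    for r, s in zip(rho, sigma):
        if r == 0:
            continue
        if s == 0:
            return inf   # rho^0 <= sigma^0 fails, so rho is not in B^{alpha,z}(H, sigma)
        core = r ** (alpha / z) * s ** ((1.0 - alpha) / z)  # eigenvalue of rho_{sigma,alpha,z}
        total += core ** z
    return total

def d_alpha_z(rho, sigma, alpha, z):
    """Renyi (alpha,z)-divergence for non-zero commuting PSD matrices."""
    return log(q_alpha_z(rho, sigma, alpha, z)) / (alpha - 1.0)

rho, sigma = [0.5, 0.5], [0.25, 0.75]
print(d_alpha_z(rho, sigma, 2.0, 1.0), d_alpha_z(rho, sigma, 2.0, 2.0))
```

For commuting operators the value does not depend on z, so the Petz-type (\(z=1\)) and sandwiched (\(z=\alpha \)) quantities agree in this example; the parameter z only matters in the non-commuting case.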

We also define the following variants of the Rényi \((\alpha ,z)\)-divergences for trace-class operators:

Definition 3.10

For PSD trace-class operators \(\varrho ,\sigma \in {{\mathcal {L}}}^1({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\), let

$$\begin{aligned} \tilde{D}_{\alpha ,z}(\varrho \Vert \sigma ){:}{=}D_{\alpha ,z}(\varrho \Vert \sigma )-\frac{1}{\alpha -1}\log {{\,\textrm{Tr}\,}}\varrho . \end{aligned}$$

We also use the notation \(\tilde{D}_{\alpha }^{*}{:}{=}\tilde{D}_{\alpha ,\alpha }\).

Remark 3.11

For a convex function f on \([0,+\infty )\), the quantum f-divergence of a pair of positive normal functionals on a von Neumann algebra is defined using the relative modular operator; see [21, 39]. In particular, it is well-defined for a pair of positive trace-class operators \(\varrho ,\sigma \) on a Hilbert space and \(f_{\alpha }{:}{=}{{\,\textrm{id}\,}}_{[0,+\infty )}^{\alpha }\) for any \(\alpha >1\); let it be denoted by \(Q_{f_{\alpha }}(\varrho \Vert \sigma )\). According to [21, Theorem 3.6],

$$\begin{aligned} Q_{f_{\alpha }}(\varrho \Vert \sigma )=Q_{\alpha ,1}(\varrho \Vert \sigma ),\,\, \,\, \,\, \alpha >1. \end{aligned}$$

In particular, for PSD trace-class operators \(\varrho \) and \(\sigma \), \(D_{\alpha ,1}(\varrho \Vert \sigma )\) in Definition 3.9 coincides with the Petz-type or standard quantum Rényi \(\alpha \)-divergence of \(\varrho \) and \(\sigma \), just as in the finite-dimensional case; see, e.g. [4].

Remark 3.12

Note that for any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and any \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\),

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma )>0,\,\, \,\, \,\, D_{\alpha ,z}(\varrho \Vert \sigma )>-\infty , \end{aligned}$$
(3.10)

and

$$\begin{aligned} D_{\alpha ,z}(\varrho \Vert \sigma )<+\infty \,\, \Longleftrightarrow \,\, Q_{\alpha ,z}(\varrho \Vert \sigma )<+\infty \,\, \Longleftrightarrow \,\, \varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma ). \end{aligned}$$

Remark 3.13

It is clear from their definitions that \(Q_{\alpha ,z}\), \(D_{\alpha ,z}\) and \(\tilde{D}_{\alpha ,z}\) satisfy the scaling properties

$$\begin{aligned} Q_{\alpha ,z}(\lambda \varrho \Vert \eta \sigma )&=\lambda ^{\alpha }\eta ^{1 -\alpha }Q_{\alpha ,z}(\varrho \Vert \sigma ), \end{aligned}$$
(3.11)
$$\begin{aligned} D_{\alpha ,z}(\lambda \varrho \Vert \eta \sigma )&=D_{\alpha ,z}(\varrho \Vert \sigma ) +\frac{\alpha }{\alpha -1}\log \lambda -\log \eta , \end{aligned}$$
(3.12)
$$\begin{aligned} \tilde{D}_{\alpha ,z}(\lambda \varrho \Vert \eta \sigma )&=\tilde{D}_{\alpha ,z} (\varrho \Vert \sigma )+\log \lambda -\log \eta , \end{aligned}$$
(3.13)

valid for any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \(\lambda ,\eta \in (0,+\infty )\).
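
To see where the exponents in (3.11) come from, the computation can be sketched as follows. This is a heuristic outline that assumes \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) and writes \(\varrho _{\sigma ,\alpha ,z}\) formally as \(\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma ^{\frac{1-\alpha }{2z}}\), omitting closures:

$$\begin{aligned} (\lambda \varrho )_{\eta \sigma ,\alpha ,z}&=(\eta \sigma )^{\frac{1-\alpha }{2z}}(\lambda \varrho )^{\frac{\alpha }{z}}(\eta \sigma )^{\frac{1-\alpha }{2z}} =\lambda ^{\frac{\alpha }{z}}\eta ^{\frac{1-\alpha }{z}}\varrho _{\sigma ,\alpha ,z},\\ Q_{\alpha ,z}(\lambda \varrho \Vert \eta \sigma )&={{\,\textrm{Tr}\,}}\big (\lambda ^{\frac{\alpha }{z}}\eta ^{\frac{1-\alpha }{z}}\varrho _{\sigma ,\alpha ,z}\big )^{z} =\lambda ^{\alpha }\eta ^{1-\alpha }Q_{\alpha ,z}(\varrho \Vert \sigma ). \end{aligned}$$

Applying \(\frac{1}{\alpha -1}\log \) to both sides gives (3.12), and subtracting \(\frac{1}{\alpha -1}\log {{\,\textrm{Tr}\,}}(\lambda \varrho )=\frac{1}{\alpha -1}(\log \lambda +\log {{\,\textrm{Tr}\,}}\varrho )\) gives (3.13), since \(\frac{\alpha }{\alpha -1}-\frac{1}{\alpha -1}=1\).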

Remark 3.14

According to Lemma 3.1, if \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) then

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma )= {{\,\textrm{Tr}\,}}\overline{\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma ^{\frac{1-\alpha }{2z}}}^{\,z}, \end{aligned}$$

which is a straightforward generalization of the formula for PSD operators on a finite-dimensional Hilbert space. Moreover, Lemma 3.1 also yields the formula

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma )= {{\,\textrm{Tr}\,}}\big (\overline{\varrho ^{\frac{\alpha }{2z}}\sigma ^{\frac{1-\alpha }{2z}}} \big (\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}} \big )\big )^{z}, \end{aligned}$$
(3.14)

which generalizes the finite-dimensional expression \({{\,\textrm{Tr}\,}}\big (\varrho ^{\frac{\alpha }{2z}}\sigma ^{\frac{1-\alpha }{z}}\varrho ^{\frac{\alpha }{2z}}\big )^z\). Note that by Lemma 3.1, (3.14) can also be written as

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma )=\left\| \sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}} \right\| _{2z}^{2z}, \end{aligned}$$

where we use the notation \(\left\| \cdot \right\| _z=({{\,\textrm{Tr}\,}}|\cdot |^z)^{1/z}\) also for \(z\in (0,1)\).

A further connection to the finite-dimensional formula is given by the following:

Lemma 3.15

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(\varrho ^0\le \sigma ^0\), and let \((\alpha ,z)\in (1,+\infty )\times [1,+\infty )\). Then \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\), or equivalently, \(Q_{\alpha ,z}(\varrho \Vert \sigma )<+\infty \), if and only if for any/some sequences \(0<c_n<d_n\) with \(c_n\rightarrow 0\), \(d_n\rightarrow +\infty \), \(\big (\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}} \sigma _n^{\frac{1-\alpha }{2z}}\big )_{n\in \mathbb {N}}\) is a convergent sequence in \({{\mathcal {L}}}^{z}({\mathcal {H}})\), where \(\sigma _n{:}{=}{{\,\textrm{id}\,}}_{(c_n,d_n)}(\sigma )\).

Moreover, if \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) then

$$\begin{aligned} \lim _{n\rightarrow +\infty }\left\| \varrho _{\sigma ,\alpha ,z}-\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}} \sigma _n^{\frac{1-\alpha }{2z}}\right\| _z=0, \end{aligned}$$
(3.15)

and if \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) then

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma ) = \lim _{n\rightarrow +\infty }{{\,\textrm{Tr}\,}}\big (\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}} \sigma _n^{\frac{1-\alpha }{2z}}\big )^z \end{aligned}$$
(3.16)

for any sequences as above.

Proof

The “if” part follows since convergence in z-norm implies \(\mathrm{(so)}\) convergence, whence \(\varrho _{\sigma ,\alpha ,z}\) exists as in Lemma 3.1, and the \(\mathrm{(so)}\) limit coincides with the z-norm limit, whence \(\varrho _{\sigma ,\alpha ,z}\in {{\mathcal {L}}}^z({\mathcal {H}})\), i.e., \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\).

Assume now that \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\). Then \(\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}} \sigma _n^{\frac{1-\alpha }{2z}} =P_n\varrho _{\sigma ,\alpha ,z}P_n\), with \(P_n{:}{=}\textbf{1}_{(c_n,d_n)}(\sigma )\), and the “only if” part, as well as (3.15), follows from Lemma 2.2.

Note that (3.15) trivially implies (3.16) when \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\). Assume thus that \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\setminus {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\), so that \(Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \). Since \(\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}} \sigma _n^{\frac{1-\alpha }{2z}}=P_n\varrho _{\sigma ,\alpha ,z}P_n\) converges to \(\varrho _{\sigma ,\alpha ,z}\) in the weak operator topology, Lemma 2.4 yields that

$$\begin{aligned} +\infty =Q_{\alpha ,z}(\varrho \Vert \sigma )= \left\| \varrho _{\sigma ,\alpha ,z}\right\| _z^z \le \liminf _{n\rightarrow +\infty }\left\| \sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}} \sigma _n^{\frac{1-\alpha }{2z}}\right\| _z^z = \liminf _{n\rightarrow +\infty }{{\,\textrm{Tr}\,}}\big (\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}} \sigma _n^{\frac{1-\alpha }{2z}}\big )^z, \end{aligned}$$

from which (3.16) follows. \(\square \)

Proposition 3.16

For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), and any \(\alpha \in (1,+\infty )\),

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma ), D_{\alpha ,z}(\varrho \Vert \sigma ), \tilde{D}_{\alpha ,z}(\varrho \Vert \sigma )\,\, \,\, \text {are decreasing in }\,\, z. \end{aligned}$$

In particular, for any \(\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), \({{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) is increasing in z, i.e.,

$$\begin{aligned} 0<z\le z'\,\, \Longrightarrow \,\, {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\subseteq {{\mathcal {L}}}^{\alpha ,z'}({\mathcal {H}},\sigma ). \end{aligned}$$

Proof

It is sufficient to prove that for any \(0<z<z'\), \(Q_{\alpha ,z}(\varrho \Vert \sigma )\ge Q_{\alpha ,z'}(\varrho \Vert \sigma )\) holds. This is obvious when \(Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \), and hence for the rest we assume the contrary, i.e., that \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\). By Lemma 3.4, this implies that \(\varrho \in {{\mathcal {B}}}^{\alpha ,z'}({\mathcal {H}},\sigma )\). Thus, by Lemma 3.15,

$$\begin{aligned} Q_{\alpha ,z'}(\varrho \Vert \sigma ) =\lim _{n\rightarrow +\infty }{{\,\textrm{Tr}\,}}\big (\sigma _n^{\frac{1-\alpha }{2z'}}\varrho ^{\frac{\alpha }{z'}} \sigma _n^{\frac{1-\alpha }{2z'}}\big )^{z'}. \end{aligned}$$
(3.17)

According to Araki’s inequality [2, Theorem 2], \({{\,\textrm{Tr}\,}}\varphi \big (\big (B^{1/2}AB^{1/2}\big )^q\big )\le {{\,\textrm{Tr}\,}}\varphi \big (B^{q/2}A^q B^{q/2}\big )\) for any \(A,B\in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), \(q\in [1,+\infty )\), and monotone increasing continuous function \(\varphi \) on \([0,+\infty )\) such that \(\varphi (0)=0\) and \(t\mapsto \varphi (e^t)\) is convex on \(\mathbb {R}\). Applying this to \(A{:}{=}\varrho ^{\frac{\alpha }{z'}}\), \(B{:}{=}\sigma _n^{\frac{1-\alpha }{z'}}\), \(q{:}{=}\frac{z'}{z}\), and \(\varphi {:}{=}{{\,\textrm{id}\,}}_{[0,+\infty )}^{z}\) yields

$$\begin{aligned} {{\,\textrm{Tr}\,}}\big (\sigma _n^{\frac{1-\alpha }{2z'}}\varrho ^{\frac{\alpha }{z'}} \sigma _n^{\frac{1-\alpha }{2z'}}\big )^{z'}= {{\,\textrm{Tr}\,}}\left[ \big (\sigma _n^{\frac{1-\alpha }{2z'}}\varrho ^{\frac{\alpha }{z'}} \sigma _n^{\frac{1-\alpha }{2z'}}\big )^z\right] ^{\frac{z'}{z}} \le {{\,\textrm{Tr}\,}}\big (\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}} \sigma _n^{\frac{1-\alpha }{2z}}\big )^{z} \end{aligned}$$

for every \(n\in \mathbb {N}\). Thus, by (3.17),

$$\begin{aligned} Q_{\alpha ,z'}(\varrho \Vert \sigma ) \le \lim _{n\rightarrow +\infty }{{\,\textrm{Tr}\,}}\big (\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}} \sigma _n^{\frac{1-\alpha }{2z}}\big )^{z} =Q_{\alpha ,z}(\varrho \Vert \sigma ), \end{aligned}$$

where the equality is again due to Lemma 3.15. \(\square \)
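
The monotonicity in z can be checked numerically in the simplest non-commuting setting. The following is a minimal sketch, assuming real symmetric positive definite \(2\times 2\) matrices and using the finite-dimensional formula \({{\,\textrm{Tr}\,}}\big (\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma ^{\frac{1-\alpha }{2z}}\big )^z\); the helper names (`eig_sym2`, `mat_pow`, `q_alpha_z`) and the matrices `RHO`, `SIGMA` are hypothetical choices made for illustration only.

```python
import math

def eig_sym2(m):
    """Eigenvalues and orthonormal eigenvectors of a real symmetric 2x2 matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean, diff = (a + d) / 2, (a - d) / 2
    radius = math.hypot(diff, b)
    theta = 0.5 * math.atan2(b, diff)  # rotation angle diagonalizing m
    c, s = math.cos(theta), math.sin(theta)
    return (mean + radius, mean - radius), ((c, s), (-s, c))

def mat_pow(m, p):
    """Functional-calculus power m**p of a positive definite symmetric 2x2 matrix."""
    (l1, l2), (v1, v2) = eig_sym2(m)
    return [[l1 ** p * v1[i] * v1[j] + l2 ** p * v2[i] * v2[j] for j in range(2)]
            for i in range(2)]

def mat_mul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def q_alpha_z(rho, sigma, alpha, z):
    """Tr (sigma^{(1-alpha)/2z} rho^{alpha/z} sigma^{(1-alpha)/2z})^z in finite dimension."""
    s = mat_pow(sigma, (1 - alpha) / (2 * z))
    x = mat_mul(mat_mul(s, mat_pow(rho, alpha / z)), s)
    l1, l2 = eig_sym2(x)[0]
    return l1 ** z + l2 ** z

# Hypothetical, non-commuting test data (assumptions of this sketch).
RHO = [[0.7, 0.2], [0.2, 0.3]]
SIGMA = [[0.5, -0.1], [-0.1, 0.5]]
ZS = [0.5, 1.0, 2.0, 4.0]
VALUES = [q_alpha_z(RHO, SIGMA, 2.0, z) for z in ZS]

if __name__ == "__main__":
    for z, value in zip(ZS, VALUES):
        print(f"z = {z}: Q_(2,z) = {value:.6f}")
```

For commuting (diagonal) inputs the same function returns a value independent of z, which is a quick sanity check of the implementation.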

Remark 3.17

As a special case of Proposition 3.16, we get that for any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\),

$$\begin{aligned} D_{\alpha }^{*}(\varrho \Vert \sigma )= D_{\alpha ,\alpha }(\varrho \Vert \sigma ) \le D_{\alpha ,1}(\varrho \Vert \sigma ), \end{aligned}$$

i.e., the sandwiched Rényi \(\alpha \)-divergence cannot be larger than the Petz-type Rényi \(\alpha \)-divergence. This has been proved for positive normal functionals on a von Neumann algebra (positive trace-class operators in our case) in [6, Theorem 12] and [27, Corollary 3.6] using different methods than in the proof of Proposition 3.16 above.

Remark 3.18

Assume that \(Q_{\alpha }^*(\varrho \Vert \sigma )<+\infty \), i.e., \(\varrho \in {{\mathcal {L}}}^{\alpha ,\alpha }({\mathcal {H}},\sigma )\), for some \(\alpha >1\), and let \(1<\alpha '<\alpha \). Then, by (3.8),

$$\begin{aligned} Q_{\alpha '}^{*}(\varrho \Vert \sigma )&={{\,\textrm{Tr}\,}}\varrho _{\sigma ,\alpha '}^{\alpha '} =\left\| \varrho _{\sigma ,\alpha '}\right\| _{\alpha '}^{\alpha '} \le \left\| \sigma ^{\frac{1}{2\alpha '}-\frac{1}{2\alpha }}\right\| _{\frac{2\alpha \alpha '}{\alpha -\alpha '}}^{\alpha '} \left\| \varrho _{\sigma ,\alpha }\right\| _{\alpha }^{\alpha '} \left\| \sigma ^{\frac{1}{2\alpha '}-\frac{1}{2\alpha }}\right\| _{\frac{2\alpha \alpha '}{\alpha -\alpha '}}^{\alpha '}\\&=({{\,\textrm{Tr}\,}}\sigma )^{1-\frac{\alpha '}{\alpha }}Q_{\alpha }^{*}(\varrho \Vert \sigma )^{\frac{\alpha '}{\alpha }}, \end{aligned}$$

where the inequality follows by the operator Hölder inequality. In particular, if \(\sigma \) is trace-class then \(Q_{\alpha '}^{*}(\varrho \Vert \sigma )<+\infty \). If \({{\,\textrm{Tr}\,}}\sigma =1\) then a simple rearrangement yields

$$\begin{aligned} \frac{\alpha '-1}{\alpha '}D_{\alpha '}^{*}(\varrho \Vert \sigma )\le \frac{\alpha -1}{\alpha }D_{\alpha }^{*}(\varrho \Vert \sigma ). \end{aligned}$$

Note that this is weaker than \(D_{\alpha '}^{*}(\varrho \Vert \sigma )\le D_{\alpha }^{*}(\varrho \Vert \sigma )\), which was proved in [28, Proposition 3.7].
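
The rearrangement used in Remark 3.18 can be spelled out as follows; this is only a sketch of the elementary step, assuming \({{\,\textrm{Tr}\,}}\sigma =1\) and \(Q_{\alpha }^{*}(\varrho \Vert \sigma )<+\infty \). Taking logarithms in the bound \(Q_{\alpha '}^{*}(\varrho \Vert \sigma )\le Q_{\alpha }^{*}(\varrho \Vert \sigma )^{\frac{\alpha '}{\alpha }}\) and using \(\log Q_{\alpha }^{*}=(\alpha -1)D_{\alpha }^{*}\) gives

$$\begin{aligned} \log Q_{\alpha '}^{*}(\varrho \Vert \sigma )&\le \frac{\alpha '}{\alpha }\log Q_{\alpha }^{*}(\varrho \Vert \sigma ),\\ (\alpha '-1)D_{\alpha '}^{*}(\varrho \Vert \sigma )&\le \frac{\alpha '}{\alpha }(\alpha -1)D_{\alpha }^{*}(\varrho \Vert \sigma ), \end{aligned}$$

and dividing by \(\alpha '>0\) yields the stated inequality.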

Remark 3.19

Since we do not assume the second operator to be trace-class, the expression \(D_{\alpha ,z}(\varrho \Vert I)\) makes sense, and we recover the following identity for the Rényi \(\alpha \)-entropy of a state \(\varrho \in {{\mathcal {S}}}({\mathcal {H}})\), which is well-known in the finite-dimensional case:

$$\begin{aligned} S_{\alpha }(\varrho ){:}{=}\frac{1}{1-\alpha }\log {{\,\textrm{Tr}\,}}\varrho ^{\alpha }=-D_{\alpha ,z}(\varrho \Vert I),\,\, \,\, \,\, \alpha >1. \end{aligned}$$
(3.18)

(In fact, this makes sense for an arbitrary PSD operator \(\varrho \).)
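
The identity (3.18) can be verified directly from Definition 3.9; the following is a brief sketch. For \(\sigma =I\) we have \(\varrho ^{\frac{\alpha }{z}}\le \left\| \varrho \right\| ^{\frac{\alpha }{z}}I\), so \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},I)\) with \(\varrho _{I,\alpha ,z}=\varrho ^{\frac{\alpha }{z}}\), and hence

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert I)&={{\,\textrm{Tr}\,}}\big (\varrho ^{\frac{\alpha }{z}}\big )^{z}={{\,\textrm{Tr}\,}}\varrho ^{\alpha },\\ D_{\alpha ,z}(\varrho \Vert I)&=\frac{1}{\alpha -1}\log {{\,\textrm{Tr}\,}}\varrho ^{\alpha }=-S_{\alpha }(\varrho ), \end{aligned}$$

independently of z. As a hypothetical example, if \(\varrho =P/d\) for a projection P of finite rank d, then \({{\,\textrm{Tr}\,}}\varrho ^{\alpha }=d^{1-\alpha }\) and \(S_{\alpha }(\varrho )=\log d\).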

More importantly, allowing non trace-class operators enables the definition of conditional \((\alpha ,z)\)-entropies. Following [44], we define two different notions of conditional \((\alpha ,z)\)-entropy between systems A and B in a state \(\varrho _{AB}\in {{\mathcal {S}}}({\mathcal {H}}_A\otimes {\mathcal {H}}_B)\) as

$$\begin{aligned} S_{\alpha ,z}(A|B)^{\downarrow }&{:}{=}-D_{\alpha ,z}(\varrho _{AB}\Vert I_A\otimes \varrho _B), \end{aligned}$$
(3.19)
$$\begin{aligned} S_{\alpha ,z}(A|B)^{\uparrow }&{:}{=}-\inf _{\omega _B\in {{\mathcal {S}}}({\mathcal {H}}_B)}D_{\alpha ,z}(\varrho _{AB}\Vert I_A\otimes \omega _B), \end{aligned}$$
(3.20)

where \(\varrho _B={{\,\textrm{Tr}\,}}_A\varrho _{AB}\) denotes the marginal of \(\varrho _{AB}\) on system B. Again, (3.19)–(3.20) make sense even when \(\varrho _{AB}\) is only assumed to be PSD. Note that while the Rényi entropies (3.18) can be defined directly for \(\varrho \) without reference to any Rényi divergences, this is not the case for the conditional Rényi entropies (3.19)–(3.20), and the ability to take non-trace-class operators at least in the second argument of the divergence is crucial for the definition.

According to Proposition 3.16, for any fixed \(\varrho _{AB}\in {{\mathcal {S}}}({\mathcal {H}}_{A}\otimes {\mathcal {H}}_B)\), and any \(\alpha >1\),

$$\begin{aligned} S_{\alpha ,z}(A|B)^{\downarrow }\,\, \text {and}\,\, S_{\alpha ,z}(A|B)^{\uparrow }\,\, \text {are monotone increasing in}\,\, z. \end{aligned}$$

In particular, either version of the sandwiched conditional Rényi entropy is at least as large as the corresponding version of the Petz-type conditional Rényi entropy.
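
The two conditional entropies (3.19)–(3.20) can be illustrated in the classical special case. The sketch below assumes that \(\varrho _{AB}\) is diagonal with entries given by a joint probability distribution `p[a][b]` whose B-marginal is strictly positive, in which case the divergence does not depend on z, and it restricts the infimum in (3.20) to diagonal \(\omega _B\). Both restrictions are assumptions of this example. The closed-form optimizer \(\omega (b)\propto (\sum _a p(a,b)^{\alpha })^{1/\alpha }\) comes from a standard Lagrange-multiplier computation and is not taken from the text; all function names are hypothetical.

```python
import math

def neg_div(p, omega, alpha):
    """-D_alpha(p || 1_A x omega) for a joint distribution p[a][b] and weights omega[b] > 0."""
    q = sum(p_ab ** alpha * omega[b] ** (1 - alpha)
            for row in p for b, p_ab in enumerate(row))
    return -math.log(q) / (alpha - 1)

def marginal_b(p):
    """Marginal distribution on system B."""
    return [sum(row[b] for row in p) for b in range(len(p[0]))]

def cond_renyi_down(p, alpha):
    """Classical analogue of (3.19): the second argument uses the marginal of p on B."""
    return neg_div(p, marginal_b(p), alpha)

def cond_renyi_up(p, alpha):
    """Classical analogue of (3.20), using the closed-form minimizer over diagonal omega."""
    c = [sum(row[b] ** alpha for row in p) for b in range(len(p[0]))]
    z_norm = sum(c_b ** (1 / alpha) for c_b in c)
    return -alpha * math.log(z_norm) / (alpha - 1)

if __name__ == "__main__":
    joint = [[0.5, 0.1], [0.1, 0.3]]  # hypothetical joint distribution
    print("down:", cond_renyi_down(joint, 2.0))
    print("up:  ", cond_renyi_up(joint, 2.0))
```

By construction the "up" version is never smaller than the "down" version, and for a product distribution both reduce to the Rényi entropy of the A-marginal.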

Lemma 3.20

For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\),

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma )>0,\,\, \,\, \,\, D_{\alpha ,z}(\varrho \Vert \sigma )>-\infty . \end{aligned}$$
(3.21)

Proof

The assertion is trivial when \(\varrho \notin {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\), and hence we assume the contrary. Then

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma )={{\,\textrm{Tr}\,}}\varrho _{\sigma ,\alpha ,z}^z=0\,\, \Longleftrightarrow \,\, \varrho _{\sigma ,\alpha ,z}=0\,\, \Longrightarrow \,\, \varrho ^{\frac{\alpha }{z}}=\sigma ^{\frac{\alpha -1}{2z}}\varrho _{\sigma ,\alpha ,z}\sigma ^{\frac{\alpha -1}{2z}}=0, \end{aligned}$$

contrary to the assumption that \(\varrho \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\). Hence, the inequalities in (3.21) hold. \(\square \)

Remark 3.21

Stronger bounds than the ones in (3.21) are given below in Corollary 3.27 for trace-class operators.

Lemma 3.22

Let \(\varrho _k,\sigma _k\in {{\mathcal {B}}}({\mathcal {H}}_k){}_{\gneq 0}\), \(k=1,2\). For any \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\),

$$\begin{aligned}&\varrho _1\otimes \varrho _2\in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}}_1\otimes {\mathcal {H}}_2,\sigma _1\otimes \sigma _2)\,\, \Longleftrightarrow \,\, \varrho _k\in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}}_k,\sigma _k),\,\, \,\, k=1,2, \end{aligned}$$
(3.22)
$$\begin{aligned}&\varrho _1\otimes \varrho _2\in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}}_1\otimes {\mathcal {H}}_2,\sigma _1\otimes \sigma _2)\,\, \Longleftrightarrow \,\, \varrho _k\in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}}_k,\sigma _k),\,\, \,\, k=1,2, \end{aligned}$$
(3.23)

and \((\varrho _1\otimes \varrho _2)_{\sigma _1\otimes \sigma _2,\alpha ,z}= (\varrho _1)_{\sigma _1,\alpha ,z}\otimes (\varrho _2)_{\sigma _2,\alpha ,z}\). As a consequence,

$$\begin{aligned}&Q_{\alpha ,z}(\varrho _1\otimes \varrho _2\Vert \sigma _1\otimes \sigma _2)= Q_{\alpha ,z}(\varrho _1\Vert \sigma _1)Q_{\alpha ,z}(\varrho _2\Vert \sigma _2), \end{aligned}$$
(3.24)
$$\begin{aligned}&D_{\alpha ,z}(\varrho _1\otimes \varrho _2\Vert \sigma _1\otimes \sigma _2)= D_{\alpha ,z}(\varrho _1\Vert \sigma _1)+D_{\alpha ,z}(\varrho _2\Vert \sigma _2). \end{aligned}$$
(3.25)

Proof

The right to left implications in (3.22)–(3.23) are obvious from choosing \(R{:}{=} (\varrho _1)_{\sigma _1,\alpha ,z}\otimes (\varrho _2)_{\sigma _2,\alpha ,z}\) in (i) of Lemma 3.1. Assume that \(\varrho _1\otimes \varrho _2\in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}}_1\otimes {\mathcal {H}}_2,\sigma _1\otimes \sigma _2)\). By (iv) of Lemma 3.1, there exists a \(\lambda \ge 0\) such that

$$\begin{aligned} \varrho _1^{\frac{\alpha }{z}}\otimes \varrho _2^{\frac{\alpha }{z}} = \big (\varrho _1\otimes \varrho _2\big )^{\frac{\alpha }{z}} \le \lambda \big (\sigma _1\otimes \sigma _2\big )^{\frac{\alpha -1}{z}} = \lambda \sigma _1^{\frac{\alpha -1}{z}}\otimes \sigma _2^{\frac{\alpha -1}{z}}. \end{aligned}$$

Choose any \(\psi _2\notin \ker (\varrho _2)\). For any \(\psi _1\in {\mathcal {H}}_1\), we get

$$\begin{aligned} \left\langle \psi _1 , \varrho _1^{\frac{\alpha }{z}}\psi _1\right\rangle \underbrace{\left\langle \psi _2 , \varrho _2^{\frac{\alpha }{z}}\psi _2\right\rangle }_{=:\kappa _1>0} \le \lambda \left\langle \psi _1 , \sigma _1^{\frac{\alpha -1}{z}}\psi _1\right\rangle \underbrace{\left\langle \psi _2 , \sigma _2^{\frac{\alpha -1}{z}}\psi _2\right\rangle }_{=:\kappa _2}. \end{aligned}$$

Thus, \(\varrho _1^{\frac{\alpha }{z}}\le \lambda (\kappa _2/\kappa _1)\sigma _1^{\frac{\alpha -1}{z}}\), and again by (iv) of Lemma 3.1, \(\varrho _1\in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}}_1,\sigma _1)\). An exactly analogous argument gives \(\varrho _2\in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}}_2,\sigma _2)\). This proves the left to right implication in (3.22), and we also get

$$\begin{aligned} (\sigma _1\otimes \sigma _2)^{\frac{1-\alpha }{2z}} (\varrho _1\otimes \varrho _2)^{\frac{\alpha }{2z}} =\big (\sigma _1^{\frac{1-\alpha }{2z}}\varrho _1^{\frac{\alpha }{2z}}\big )\otimes \big (\sigma _2^{\frac{1-\alpha }{2z}}\varrho _2^{\frac{\alpha }{2z}}\big ), \end{aligned}$$

from which \((\varrho _1\otimes \varrho _2)_{\sigma _1\otimes \sigma _2,\alpha ,z}= (\varrho _1)_{\sigma _1,\alpha ,z}\otimes (\varrho _2)_{\sigma _2,\alpha ,z}\), according to (3.3), and thus (3.24) and (3.25) follow due to the multiplicativity of the trace. The left to right implication in (3.23) follows immediately from the above. \(\square \)
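
As a simple illustration of the additivity (3.25), combined with (3.18) and (3.19), one may consider a product state \(\varrho _{AB}=\varrho _A\otimes \varrho _B\) with \(\varrho _A\in {{\mathcal {S}}}({\mathcal {H}}_A)\), \(\varrho _B\in {{\mathcal {S}}}({\mathcal {H}}_B)\). This worked example is a sketch added for orientation. Since \(\varrho _B^{\frac{\alpha }{z}}\le \left\| \varrho _B\right\| ^{\frac{1}{z}}\varrho _B^{\frac{\alpha -1}{z}}\), we have \(\varrho _B\in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}}_B,\varrho _B)\) with \((\varrho _B)_{\varrho _B,\alpha ,z}=\varrho _B^{\frac{1}{z}}\), so that \(Q_{\alpha ,z}(\varrho _B\Vert \varrho _B)={{\,\textrm{Tr}\,}}\varrho _B=1\) and \(D_{\alpha ,z}(\varrho _B\Vert \varrho _B)=0\). Therefore,

$$\begin{aligned} S_{\alpha ,z}(A|B)^{\downarrow }&=-D_{\alpha ,z}(\varrho _A\otimes \varrho _B\Vert I_A\otimes \varrho _B) =-D_{\alpha ,z}(\varrho _A\Vert I_A)-D_{\alpha ,z}(\varrho _B\Vert \varrho _B) =S_{\alpha }(\varrho _A), \end{aligned}$$

i.e., for a product state this conditional entropy reduces to the Rényi \(\alpha \)-entropy of the first marginal, for every z.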

3.2 Variational formulas

The following variational representations of \(Q_{\alpha ,z}\) and \(D_{\alpha ,z}\) are very useful to establish their fundamental properties. We will use these variational formulas to prove monotonicity of \(Q_{\alpha ,z}\) under restrictions of the operators to subspaces (Lemma 3.32, Corollary 3.34) and to give a lower bound on the strong converse exponent (Lemma 4.2).

For \(z=\alpha \) (the case of the sandwiched Rényi divergence), the variational formula in (3.26) was given first in [12] for finite-dimensional PSD operators, and was extended to the case of pairs of positive normal functionals on a general von Neumann algebra in [28] (see also [21, Lemma 3.19] for the case \(\alpha <1\)), while the variational formula in (3.27) can be obtained as an intermediate step in the proof of the first variational formula, and it was given in [5] in the finite-dimensional case.

For finite-dimensional invertible PSD operators and arbitrary \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\), both variational formulas (3.26)–(3.27) follow as special cases of [47, Theorem 3.3].

The version below is an extension of the above when \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\) are arbitrary, and the operators \(\varrho ,\sigma \) can be PSD operators on an infinite-dimensional Hilbert space satisfying the conditions in Lemma 3.23. Our proof follows essentially that of [47, Theorem 3.3].

For any \(\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), let

$$\begin{aligned} {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}&{:}{=} \left\{ H\in {{\mathcal {B}}}({\mathcal {H}})_{\ge 0}:\, {{\,\textrm{Tr}\,}}\big (H^{1/2}\sigma ^{\frac{\alpha -1}{z}}H^{1/2}\big )^{\frac{z}{\alpha -1}}<+\infty \right\} ,\\ {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}^+&{:}{=} \left\{ H\in {{\mathcal {B}}}({\mathcal {H}})_{\ge 0}:\, 0<{{\,\textrm{Tr}\,}}\big (H^{1/2}\sigma ^{\frac{\alpha -1}{z}}H^{1/2}\big )^{\frac{z}{\alpha -1}}<+\infty \right\} . \end{aligned}$$

Lemma 3.23

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\), and assume that one of the following holds: a) \(\varrho \notin {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\); b) \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\setminus {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) and \(\sigma \) is compact; c) \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\). Then

$$\begin{aligned}&Q_{\alpha ,z}(\varrho \Vert \sigma )=\sup _{H\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}}\left\{ \alpha {{\,\textrm{Tr}\,}}\big (H^{1/2}\varrho ^{\frac{\alpha }{z}} H^{1/2}\big )^{\frac{z}{\alpha }}\right. \nonumber \\&\left. + (1-\alpha ) {{\,\textrm{Tr}\,}}\big (H^{1/2}\sigma ^{\frac{\alpha -1}{z}}H^{1/2}\big )^{\frac{z}{\alpha -1}}\right\} , \end{aligned}$$
(3.26)
$$\begin{aligned}&\log Q_{\alpha ,z}(\varrho \Vert \sigma )=\sup _{H\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}^+}\left\{ \alpha \log {{\,\textrm{Tr}\,}}\big (H^{1/2}\varrho ^{\frac{\alpha }{z}} H^{1/2}\big )^{\frac{z}{\alpha }}\right. \nonumber \\&\left. +(1-\alpha )\log {{\,\textrm{Tr}\,}}\big (H^{1/2}\sigma ^{\frac{\alpha -1}{z}}H^{1/2}\big )^{\frac{z}{\alpha -1}}\right\} . \end{aligned}$$
(3.27)

The equality in (3.26) still holds if the supremum is taken over \({{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}^+\). Moreover, in cases a) and b), and in case c) if \(\sigma \) is compact, the H operators in (3.26) and (3.27) may additionally be required to be of finite rank.

Proof

For any \(H\in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), let

$$\begin{aligned} F(H)&{:}{=}{{\,\textrm{Tr}\,}}\big (H^{1/2}\varrho ^{\frac{\alpha }{z}} H^{1/2}\big )^{\frac{z}{\alpha }},\,\, \,\, \,\, G(H){:}{=}{{\,\textrm{Tr}\,}}\big (H^{1/2}\sigma ^{\frac{\alpha -1}{z}}H^{1/2}\big )^{\frac{z}{\alpha -1}}. \end{aligned}$$
(3.28)

Assume first that \(\varrho \notin {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\), and hence \(Q_{\alpha ,z}(\varrho \Vert \sigma )=\log Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \). By (iv) of Lemma 3.1, for every \(\lambda >0\) there exists a vector \(x_{\lambda }\in {\mathcal {H}}\) such that

$$\begin{aligned} \left\langle x_{\lambda } , \varrho ^{\frac{\alpha }{z}}x_{\lambda }\right\rangle > \lambda \langle x_{\lambda },\sigma ^{\frac{\alpha -1}{z}}x_{\lambda }\rangle . \end{aligned}$$
(3.29)

Clearly, for any \(x\in {\mathcal {H}}\), \(H_x{:}{=}\left| x\right\rangle \!\left\langle x\right| \in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}\cap {{\mathcal {B}}}_f({\mathcal {H}})\), and

$$\begin{aligned} F(H_x)= \left\langle x , \varrho ^{\frac{\alpha }{z}}x\right\rangle ^{\frac{z}{\alpha }},\,\, \,\, \,\, G(H_x)= \langle x,\sigma ^{\frac{\alpha -1}{z}}x\rangle ^{\frac{z}{\alpha -1}}. \end{aligned}$$

If \(\langle x_{\lambda },\sigma ^{\frac{\alpha -1}{z}}x_{\lambda }\rangle =0\) for some \(\lambda >0\) then let \(x_{\lambda ,t}{:}{=}tx_{\lambda }+t^{-1}y\), \(t>0\), where \(y\in (\ker \sigma )^{\perp }\setminus \{0\}\) is some fixed vector. Then \(H_{x_{\lambda ,t}}\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}^+\cap {{\mathcal {B}}}_f({\mathcal {H}})\), (3.29) implies that \(\left\langle x_{\lambda } , \varrho ^{\frac{\alpha }{z}}x_{\lambda }\right\rangle >0\), and it is straightforward to verify that

$$\begin{aligned} \lim _{t\rightarrow +\infty }F(H_{x_{\lambda ,t}})=+\infty ,\,\, \,\, \lim _{t\rightarrow +\infty }G(H_{x_{\lambda ,t}})=0. \end{aligned}$$

Thus,

$$\begin{aligned} \lim _{t\rightarrow +\infty }\big (\alpha F(H_{x_{\lambda ,t}})+(1-\alpha )G(H_{x_{\lambda ,t}})\big )&=+\infty =Q_{\alpha ,z}(\varrho \Vert \sigma )=\log Q_{\alpha ,z}(\varrho \Vert \sigma )\\&=\lim _{t\rightarrow +\infty }\big (\alpha \log F(H_{x_{\lambda ,t}})+(1-\alpha )\log G(H_{x_{\lambda ,t}})\big ), \end{aligned}$$

and therefore (3.26)–(3.27) hold.

If \(\langle x_{\lambda },\sigma ^{\frac{\alpha -1}{z}}x_{\lambda }\rangle >0\) for every \(\lambda >0\) then let \(\tilde{x}_{\lambda }{:}{=}x_{\lambda } \langle x_{\lambda },\sigma ^{\frac{\alpha -1}{z}}x_{\lambda }\rangle ^{-1/2}\). Then

$$\begin{aligned} \langle \tilde{x}_{\lambda },\sigma ^{\frac{\alpha -1}{z}}\tilde{x}_{\lambda }\rangle =1=\langle \tilde{x}_{\lambda },\sigma ^{\frac{\alpha -1}{z}}\tilde{x}_{\lambda }\rangle ^{\frac{z}{\alpha -1}}=G(H_{\tilde{x}_{\lambda }}), \end{aligned}$$
(3.30)

and

$$\begin{aligned} F(H_{\tilde{x}_{\lambda }}) =\left\langle \tilde{x}_{\lambda } , \varrho ^{\frac{\alpha }{z}}\tilde{x}_{\lambda }\right\rangle ^{\frac{z}{\alpha }} >\big (\lambda \langle \tilde{x}_{\lambda },\sigma ^{\frac{\alpha -1}{z}}\tilde{x}_{\lambda }\rangle \big )^{\frac{z}{\alpha }} =\lambda ^{\frac{z}{\alpha }}, \end{aligned}$$

according to (3.29) and (3.30). Thus,

$$\begin{aligned} \lim _{\lambda \rightarrow +\infty }\big (\alpha F(H_{\tilde{x}_{\lambda }})+ (1-\alpha )G(H_{\tilde{x}_{\lambda }})\big )&=+\infty =Q_{\alpha ,z}(\varrho \Vert \sigma )=\log Q_{\alpha ,z}(\varrho \Vert \sigma )\\&=\lim _{\lambda \rightarrow +\infty }\big (\alpha \log F(H_{\tilde{x}_{\lambda }})+ (1-\alpha )\log G(H_{\tilde{x}_{\lambda }})\big ), \end{aligned}$$

and therefore (3.26)–(3.27) hold, even with the optimizations restricted to finite-rank operators.

This completes the proof of case a), and hence for the rest we assume that b) or c) holds.

Consider any sequences \(0<c_n<d_n\) with \(c_n\rightarrow 0\), \(d_n\rightarrow +\infty \), and let \(P_n{:}{=}\textbf{1}_{(c_n,d_n)}(\sigma )\), \(\sigma _n{:}{=}{{\,\textrm{id}\,}}_{(c_n,d_n)}(\sigma )=P_n\sigma P_n\), and

$$\begin{aligned} H_n{:}{=}\sigma _n^{\frac{1-\alpha }{2z}} (\sigma _n^{\frac{1-\alpha }{2z}} \varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}})^{\alpha -1} \sigma _n^{\frac{1-\alpha }{2z}} =\sigma _n^{\frac{1-\alpha }{2z}}(P_n\varrho _{\sigma ,\alpha ,z}P_n)^{\alpha -1}\sigma _n^{\frac{1-\alpha }{2z}}. \end{aligned}$$

Then

$$\begin{aligned} F(H_n)&={{\,\textrm{Tr}\,}}\big (H_n^{1/2}\varrho ^{\frac{\alpha }{z}} H_n^{1/2}\big )^{\frac{z}{\alpha }}\nonumber \\&={{\,\textrm{Tr}\,}}\big (\varrho ^{\frac{\alpha }{2z}}H_n\varrho ^{\frac{\alpha }{2z}} \big )^{\frac{z}{\alpha }}\nonumber \\&={{\,\textrm{Tr}\,}}\big (\varrho ^{\frac{\alpha }{2z}}\sigma _n^{\frac{1-\alpha }{2z}} (\sigma _n^{\frac{1-\alpha }{2z}} \varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}})^{\alpha -1}\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}} \big )^{\frac{z}{\alpha }}\nonumber \\&={{\,\textrm{Tr}\,}}\Big ( (\sigma _n^{\frac{1-\alpha }{2z}} \varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}})^{\frac{\alpha -1}{2}} \sigma _n^{\frac{1-\alpha }{2z}} \varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}} (\sigma _n^{\frac{1-\alpha }{2z}} \varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}})^{\frac{\alpha -1}{2}} \Big )^{\frac{z}{\alpha }}\nonumber \\&= {{\,\textrm{Tr}\,}}\Big (\sigma _n^{\frac{1-\alpha }{2z}} \varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}}\Big )^{z}\nonumber \\&={{\,\textrm{Tr}\,}}\big (P_n\varrho _{\sigma ,\alpha ,z}P_n\big )^z, \end{aligned}$$
(3.31)

and similarly,

$$\begin{aligned} G(H_n)&={{\,\textrm{Tr}\,}}\big (H_n^{1/2}\sigma ^{\frac{\alpha -1}{z}}H_n^{1/2}\big )^{\frac{z}{\alpha -1}} = {{\,\textrm{Tr}\,}}\big (\sigma ^{\frac{\alpha -1}{2z}}H_n\sigma ^{\frac{\alpha -1}{2z}}\big )^{\frac{z}{\alpha -1}} \nonumber \\&= {{\,\textrm{Tr}\,}}\Big ( \underbrace{\sigma ^{\frac{\alpha -1}{2z}} \sigma _n^{\frac{1-\alpha }{2z}}}_{=P_n} (\sigma _n^{\frac{1-\alpha }{2z}} \varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}})^{\alpha -1} \underbrace{\sigma _n^{\frac{1-\alpha }{2z}} \sigma ^{\frac{\alpha -1}{2z}}}_{=P_n} \Big )^{\frac{z}{\alpha -1}}\nonumber \\&= {{\,\textrm{Tr}\,}}\Big (\sigma _n^{\frac{1-\alpha }{2z}} \varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}}\Big )^{z}\nonumber \\&={{\,\textrm{Tr}\,}}(P_n\varrho _{\sigma ,\alpha ,z}P_n)^z. \end{aligned}$$
(3.32)

We have

$$\begin{aligned} {{\,\textrm{Tr}\,}}(P_n\varrho _{\sigma ,\alpha ,z}P_n)^z = {{\,\textrm{Tr}\,}}\big (\varrho _{\sigma ,\alpha ,z}^{1/2}P_n\varrho _{\sigma ,\alpha ,z}^{1/2}\big )^z \le {{\,\textrm{Tr}\,}}\varrho _{\sigma ,\alpha ,z}^z = Q_{\alpha ,z}(\varrho \Vert \sigma ); \end{aligned}$$

in particular,

$$\begin{aligned} \limsup _{n\rightarrow +\infty }{{\,\textrm{Tr}\,}}(P_n\varrho _{\sigma ,\alpha ,z}P_n)^z\le Q_{\alpha ,z}(\varrho \Vert \sigma ). \end{aligned}$$
(3.33)

Moreover, if \(Q_{\alpha ,z}(\varrho \Vert \sigma )<+\infty \), i.e., in case c), or if \(\sigma \) is compact (in which case \(H_n\) and \(P_n\varrho _{\sigma ,\alpha ,z}P_n\) are of finite rank) then \(F(H_n)=G(H_n)<+\infty \), whence \(H_n\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}\). Since \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) implies \(\varrho ^0\le \sigma ^0\), it is also true that \(0\ne \sigma _n^{\frac{1-\alpha }{2z}} \varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}}\), and hence \(H_n\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}^+\), for all large enough n.

When \(z\ge 1\), Lemma 2.4 yields

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma ) = {{\,\textrm{Tr}\,}}\varrho _{\sigma ,\alpha ,z}^z=\left\| \varrho _{\sigma ,\alpha ,z}\right\| _z^z&\le \liminf _{n\rightarrow +\infty }\left\| P_n\varrho _{\sigma ,\alpha ,z}P_n\right\| _z^z \nonumber \\&= \liminf _{n\rightarrow +\infty } {{\,\textrm{Tr}\,}}(P_n\varrho _{\sigma ,\alpha ,z}P_n)^z. \end{aligned}$$
(3.34)

When \(z\in (0,1)\), \({{\,\textrm{id}\,}}_{[0,+\infty )}^z\) is operator concave, and hence \((P_n\varrho _{\sigma ,\alpha ,z}P_n)^z\ge P_n\varrho _{\sigma ,\alpha ,z}^zP_n\), whence

$$\begin{aligned} \liminf _{n\rightarrow +\infty }{{\,\textrm{Tr}\,}}(P_n\varrho _{\sigma ,\alpha ,z}P_n)^z \ge \liminf _{n\rightarrow +\infty }{{\,\textrm{Tr}\,}}P_n\varrho _{\sigma ,\alpha ,z}^zP_n={{\,\textrm{Tr}\,}}\varrho _{\sigma ,\alpha ,z}^z= Q_{\alpha ,z}(\varrho \Vert \sigma ). \end{aligned}$$
(3.35)

Combining (3.31)–(3.35) gives

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma )&= \lim _{n\rightarrow +\infty }{{\,\textrm{Tr}\,}}\Big (\sigma _n^{\frac{1-\alpha }{2z}} \varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}}\Big )^{z}\\&=\lim _{n\rightarrow +\infty }\left( \alpha {{\,\textrm{Tr}\,}}\big (H_n^{1/2}\varrho ^{\frac{\alpha }{z}} H_n^{1/2}\big )^{\frac{z}{\alpha }}\right. \\&\left. +(1-\alpha ) {{\,\textrm{Tr}\,}}\big (H_n^{1/2}\sigma ^{\frac{\alpha -1}{z}}H_n^{1/2}\big )^{\frac{z}{\alpha -1}}\right) ,\\ \log Q_{\alpha ,z}(\varrho \Vert \sigma )&=\lim _{n\rightarrow +\infty }\log {{\,\textrm{Tr}\,}}\Big (\sigma _n^{\frac{1-\alpha }{2z}} \varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}}\Big )^{z}\\&=\lim _{n\rightarrow +\infty }\left( \alpha \log {{\,\textrm{Tr}\,}}\big (H_n^{1/2}\varrho ^{\frac{\alpha }{z}} H_n^{1/2}\big )^{\frac{z}{\alpha }}\right. \\&\left. + (1-\alpha ) \log {{\,\textrm{Tr}\,}}\big (H_n^{1/2}\sigma ^{\frac{\alpha -1}{z}}H_n^{1/2}\big )^{\frac{z}{\alpha -1}}\right) . \end{aligned}$$

This completes the proof when \(Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \), i.e., in case b).

Assume for the rest that case c) holds. By the above considerations, we have LHS\(\le \)RHS in (3.26)–(3.27), and hence we only have to show the converse inequalities. By Lemma 3.1 and Definition 3.7, \(\varrho ^{\frac{\alpha }{z}}=\sigma ^{\frac{\alpha -1}{2z}}\varrho _{\sigma ,\alpha ,z}\sigma ^{\frac{\alpha -1}{2z}}\), where \({{\,\textrm{Tr}\,}}\varrho _{\sigma ,\alpha ,z}^z<+\infty \). For any \(H\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}\), we have

$$\begin{aligned} {{\,\textrm{Tr}\,}}\big (H^{1/2}\varrho ^{\frac{\alpha }{z}} H^{1/2}\big )^{\frac{z}{\alpha }}&= {{\,\textrm{Tr}\,}}\big (\varrho ^{\frac{\alpha }{2z}}H\varrho ^{\frac{\alpha }{2z}}\big )^{\frac{z}{\alpha }} = {{\,\textrm{Tr}\,}}\left| H^{1/2}\varrho ^{\frac{\alpha }{2z}} \right| ^{\frac{2z}{\alpha }} = {{\,\textrm{Tr}\,}}\left| H^{1/2}\sigma ^{\frac{\alpha -1}{2z}}\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}} \right| ^{\frac{2z}{\alpha }} \end{aligned}$$
(3.36)
$$\begin{aligned}&=\left\| H^{1/2}\sigma ^{\frac{\alpha -1}{2z}}\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\right\| _{\frac{2z}{\alpha }}^{\frac{2z}{\alpha }} \le \left\| H^{1/2}\sigma ^{\frac{\alpha -1}{2z}}\right\| _{\frac{2z}{\alpha -1}}^{\frac{2z}{\alpha }} \left\| \sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\right\| _{2z}^{\frac{2z}{\alpha }} \end{aligned}$$
(3.37)
$$\begin{aligned}&=\left[ {{\,\textrm{Tr}\,}}\big (\sigma ^{\frac{\alpha -1}{2z}}H \sigma ^{\frac{\alpha -1}{2z}}\big )^{\frac{z}{\alpha -1}}\right] ^{\frac{\alpha -1}{\alpha }} \Bigg [\underbrace{{{\,\textrm{Tr}\,}}\big (\big (\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\big )^* \sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}} \big )^{z}}_{=Q_{\alpha ,z}(\varrho \Vert \sigma )}\Bigg ]^{\frac{1}{\alpha }} \end{aligned}$$
(3.38)
$$\begin{aligned}&\le \frac{\alpha -1}{\alpha }{{\,\textrm{Tr}\,}}\big (\sigma ^{\frac{\alpha -1}{2z}}H \sigma ^{\frac{\alpha -1}{2z}}\big )^{\frac{z}{\alpha -1}} +\frac{1}{\alpha }Q_{\alpha ,z}(\varrho \Vert \sigma ), \end{aligned}$$
(3.39)

where we used that \({{\,\textrm{ran}\,}}\varrho ^{\frac{\alpha }{2z}}\subseteq {{\,\textrm{dom}\,}}\sigma ^{\frac{1-\alpha }{2z}}\) and \(\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\in {{\mathcal {B}}}({\mathcal {H}})\), according to Lemma 3.1, and the expression in (3.14) for \(Q_{\alpha ,z}(\varrho \Vert \sigma )\). The first inequality above is due to the operator Hölder inequality, and the second inequality is the weighted arithmetic-geometric mean inequality, which follows from the convexity of the exponential function. A simple rearrangement yields that LHS\(\ge \)RHS in (3.26)–(3.27), completing the proof. \(\square \)
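
For the reader's convenience, the rearrangement at the end of the proof can be spelled out as follows; the shorthands \(A_H\) and \(B_H\) are introduced only for this illustration. For an operator \(H\) over which the supremum is taken, write \(A_H{:}{=}{{\,\textrm{Tr}\,}}\big (H^{1/2}\varrho ^{\frac{\alpha }{z}} H^{1/2}\big )^{\frac{z}{\alpha }}\) and \(B_H{:}{=}{{\,\textrm{Tr}\,}}\big (H^{1/2}\sigma ^{\frac{\alpha -1}{z}}H^{1/2}\big )^{\frac{z}{\alpha -1}}={{\,\textrm{Tr}\,}}\big (\sigma ^{\frac{\alpha -1}{2z}}H \sigma ^{\frac{\alpha -1}{2z}}\big )^{\frac{z}{\alpha -1}}\). Then (3.36)–(3.39) read

$$\begin{aligned} A_H\le B_H^{\frac{\alpha -1}{\alpha }}\,Q_{\alpha ,z}(\varrho \Vert \sigma )^{\frac{1}{\alpha }} \le \frac{\alpha -1}{\alpha }B_H+\frac{1}{\alpha }Q_{\alpha ,z}(\varrho \Vert \sigma ). \end{aligned}$$

Multiplying the outer inequality by \(\alpha \) and moving the \(B_H\) term to the left gives \(\alpha A_H+(1-\alpha )B_H\le Q_{\alpha ,z}(\varrho \Vert \sigma )\), which is the bound needed for (3.26). Taking logarithms in the first inequality instead (assuming \(B_H\in (0,+\infty )\), and with the convention \(\log 0=-\infty \) if \(A_H=0\)) gives \(\alpha \log A_H\le (\alpha -1)\log B_H+\log Q_{\alpha ,z}(\varrho \Vert \sigma )\), i.e., \(\alpha \log A_H+(1-\alpha )\log B_H\le \log Q_{\alpha ,z}(\varrho \Vert \sigma )\), which is the bound needed for (3.27).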

Remark 3.24

It is interesting that one can formally take the logarithm of each term in (3.26) to obtain (3.27).

Remark 3.25

The variational formulas in (3.26)–(3.27) hold for the sandwiched quantities (\(z=\alpha \)) when \(\varrho \in {{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma )\setminus {{\mathcal {L}}}^{\alpha }({\mathcal {H}},\sigma )\) even if \(\sigma \) is not compact [20]. However, we won’t need this fact in the rest of the paper.

Remark 3.26

Note that the case \(z=1\) corresponds to the Petz-type Rényi divergences. By the above, \(\varrho \in {{\mathcal {B}}}^{\alpha ,1}({\mathcal {H}},\sigma )\) if and only if \(\varrho ^{\alpha }\le \lambda \sigma ^{\alpha -1}\) with some \(\lambda \ge 0\), in which case

$$\begin{aligned} Q_{\alpha }(\varrho \Vert \sigma ){:}{=}Q_{\alpha ,1}(\varrho \Vert \sigma ) ={{\,\textrm{Tr}\,}}\overline{\sigma ^{\frac{1-\alpha }{2}}\varrho ^{\alpha }\sigma ^{\frac{1-\alpha }{2}}}. \end{aligned}$$

(See [21, Theorem 3.6] for a generalization of the above in the setting of von Neumann algebras, and also for an analogous formula in the case \(\alpha \in (0,1)\).) Moreover, we have the variational formulas

$$\begin{aligned} Q_{\alpha }(\varrho \Vert \sigma )&=\sup _{H\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,1}^+}\left\{ \alpha {{\,\textrm{Tr}\,}}\big (H^{1/2}\varrho ^{\alpha } H^{1/2}\big )^{\frac{1}{\alpha }}\right. \\&\left. + (1-\alpha ) {{\,\textrm{Tr}\,}}\big (H^{1/2}\sigma ^{\alpha -1}H^{1/2}\big )^{\frac{1}{\alpha -1}}\right\} ,\\ \log Q_{\alpha }(\varrho \Vert \sigma )&=\sup _{H\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,1}^+}\left\{ \alpha \log {{\,\textrm{Tr}\,}}\big (H^{1/2}\varrho ^{\alpha } H^{1/2}\big )^{\frac{1}{\alpha }}\right. \\&\left. +(1-\alpha )\log {{\,\textrm{Tr}\,}}\big (H^{1/2}\sigma ^{\alpha -1}H^{1/2}\big )^{\frac{1}{\alpha -1}}\right\} , \end{aligned}$$

where \({{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,1}^+= \left\{ H\in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}:\, 0<{{\,\textrm{Tr}\,}}\big (H^{1/2}\sigma ^{\alpha -1}H^{1/2}\big )^{\frac{1}{\alpha -1}}<+\infty \right\} \).

These variational expressions for the Petz-type Rényi divergences do not seem to have appeared in the literature before, even for finite-dimensional operators, although in that case they follow easily from the results of [47].

The variational formulas in Lemma 3.23 can be used to prove the following important properties of the Rényi \((\alpha ,z)\)-divergences.

Corollary 3.27

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(\sigma \) is trace-class. For every \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\),

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma ) \ge ({{\,\textrm{Tr}\,}}\varrho )^{\alpha }({{\,\textrm{Tr}\,}}\sigma )^{1-\alpha } \ge \alpha {{\,\textrm{Tr}\,}}\varrho +(1-\alpha ){{\,\textrm{Tr}\,}}\sigma . \end{aligned}$$
(3.40)

If, moreover, \(\varrho \) is trace-class then we have

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma )=({{\,\textrm{Tr}\,}}\varrho )^{\alpha }({{\,\textrm{Tr}\,}}\sigma )^{1-\alpha }\,\, \,\, \Longleftrightarrow \,\, \,\, \sigma =\eta \varrho \text { for some }\eta \in (0,+\infty ), \end{aligned}$$
(3.41)

and

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma )=\alpha {{\,\textrm{Tr}\,}}\varrho +(1-\alpha ){{\,\textrm{Tr}\,}}\sigma \,\, \,\, \Longleftrightarrow \,\, \,\, \sigma =\varrho . \end{aligned}$$
(3.42)

Proof

The second inequality in (3.40) follows simply from the convexity of \({{\,\textrm{id}\,}}_{[0,+\infty )}^{\alpha }\). The first inequality is obvious when \(Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \), and hence we may assume the contrary, i.e., that \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\). The assumption that \(\sigma \) is trace-class yields that \(I\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}^+\), and the variational formula in (3.27) with \(H{:}{=}I\) yields the first inequality in (3.40). In fact, we don’t need the “full power” of the variational formula in (3.27) to obtain the first inequality in (3.40), as it follows simply from the Hölder inequality as in (3.36)–(3.39), with \(H=I\).

Assume for the rest that \(\varrho \) is trace-class. The right to left implications are straightforward to verify in both (3.41) and (3.42). Assume now that the equality on the LHS of (3.41) holds. It implies that \(Q_{\alpha ,z}(\varrho \Vert \sigma )\) is finite, i.e., \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\), and the inequality in (3.37) holds as an equality for \(H=I\). Thus, by the characterization of the equality case in Hölder’s inequality (Lemma 2.1), \(\sigma =\lambda \left| \big (\sigma ^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{2z}}\big )^* \right| ^{2z}=\lambda \varrho _{\sigma ,\alpha ,z}^z\) for some \(\lambda >0\). From this we get \(\sigma ^{\frac{1}{z}}=\lambda ^{\frac{1}{z}}\varrho _{\sigma ,\alpha ,z}\), and

$$\begin{aligned} \sigma _n^{\frac{1}{z}} = P_n\sigma ^{\frac{1}{z}} P_n = \lambda ^{\frac{1}{z}}P_n\varrho _{\sigma ,\alpha ,z} P_n = \lambda ^{\frac{1}{z}}\sigma _n^{\frac{1-\alpha }{2z}}\varrho ^{\frac{\alpha }{z}}\sigma _n^{\frac{1-\alpha }{2z}} \end{aligned}$$

for every \(n\in \mathbb {N}\), where \(P_n\) is as in Lemma 3.1. Rearranging yields \(\sigma _n^{\frac{\alpha }{z}}=\lambda ^{\frac{1}{z}}P_n\varrho ^{\frac{\alpha }{z}}P_n\). Thus,

$$\begin{aligned} \sigma ^{\frac{\alpha }{z}} =\mathrm{(so)}\lim _n \sigma _n^{\frac{\alpha }{z}} =\lambda ^{\frac{1}{z}}\mathrm{(so)}\lim _n P_n\varrho ^{\frac{\alpha }{z}}P_n =\lambda ^{\frac{1}{z}}\varrho ^{\frac{\alpha }{z}}. \end{aligned}$$

(In the last equality we use that \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) implies \(\varrho ^0\le \sigma ^0\).) Hence, \(\sigma =\lambda ^{\frac{1}{\alpha }}\varrho \), i.e., the RHS of (3.41) holds true.

Finally, assume that the equality on the LHS of (3.42) is true. By (3.40), this implies that the equality on the LHS of (3.41) is true, and hence, by the above, \(\sigma =\eta \varrho \) for some \(\eta \in (0,+\infty )\). Moreover, the second inequality in (3.40) holds as an equality, whence \({{\,\textrm{Tr}\,}}\varrho ={{\,\textrm{Tr}\,}}\sigma \), so we get \(\varrho =\sigma \) as given on the RHS of (3.42). \(\square \)
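
The inequalities in (3.40) and their equality cases can be checked numerically in the commuting case. The sketch below assumes that \(\varrho \) and \(\sigma \) are diagonal in a common basis with eigenvalue lists `p` and `q`, in which case \(Q_{\alpha ,z}(\varrho \Vert \sigma )=\sum _i p_i^{\alpha }q_i^{1-\alpha }\) for every \(z\); the function names are hypothetical and chosen only for this illustration.

```python
import math

def q_classical(p, q, alpha):
    """Q_alpha for commuting operators with eigenvalue lists p and q.

    Assumes alpha > 1; returns +inf when supp p is not contained in supp q.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # a zero eigenvalue of rho contributes nothing
        if qi == 0.0:
            return math.inf  # support condition violated
        total += pi ** alpha * qi ** (1.0 - alpha)
    return total

def bounds(p, q, alpha):
    """Return (Q, middle, lower): the three quantities compared in (3.40)."""
    tr_p, tr_q = sum(p), sum(q)
    middle = tr_p ** alpha * tr_q ** (1.0 - alpha)
    lower = alpha * tr_p + (1.0 - alpha) * tr_q
    return q_classical(p, q, alpha), middle, lower

# Generic pair: both inequalities are strict.
print(bounds([1.0, 1.0], [1.0, 3.0], 2.0))
# sigma = 2 * rho: equality in the first inequality only, as in (3.41).
print(bounds([1.0, 2.0], [2.0, 4.0], 2.0))
# sigma = rho: equality throughout, as in (3.42).
print(bounds([0.5, 0.5], [0.5, 0.5], 3.0))
```

The three calls correspond to a generic pair, a proportional pair \(\sigma =\eta \varrho \) with \(\eta \ne 1\), and the case \(\sigma =\varrho \), respectively.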

Corollary 3.28

For any \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\), the Rényi \((\alpha ,z)\)-divergence \(D_{\alpha ,z}\) is strictly positive in the sense that for any two density operators \(\varrho ,\sigma \in {{\mathcal {S}}}({\mathcal {H}})\),

$$\begin{aligned} D_{\alpha ,z}(\varrho \Vert \sigma )\ge 0,\,\, \,\, \text {with equality if and only if}\,\, \,\, \varrho =\sigma . \end{aligned}$$

Proof

Immediate from Corollary 3.27. \(\square \)

Remark 3.29

Non-negativity of the Rényi \((\alpha ,z)\)-divergences has been proved in [34] in the finite-dimensional case, by different methods. Strict positivity of the sandwiched Rényi \(\alpha \)-divergences with \(\alpha >1\) has been proved in the general von Neumann algebra case in [27].

Finally, we prove the lower semi-continuity of \(Q_{\alpha ,z}\) and \(D_{\alpha ,z}\) on pairs of trace-class operators from the variational formula; we will use this later in the proof of Lemma 3.39.

Corollary 3.30

For any \(\alpha >1\) and \(z\ge \alpha \), \(Q_{\alpha ,z}\) and \(D_{\alpha ,z}\) are lower semi-continuous on \({{\mathcal {L}}}^1({\mathcal {H}})\times {{\mathcal {L}}}^1({\mathcal {H}})\).

Proof

Let \(\varrho _n,\sigma _n\in {{\mathcal {L}}}^1({\mathcal {H}})\), \(n\in \mathbb {N}\), be convergent sequences in trace-norm, with \(\varrho {:}{=}\lim _{n\rightarrow +\infty }\varrho _n\), \(\sigma {:}{=}\lim _{n\rightarrow +\infty }\sigma _n\). Then

$$\begin{aligned} \Vert {\varrho _n^{\frac{\alpha }{z}}}\Vert _{\frac{z}{\alpha }}= ({{\,\textrm{Tr}\,}}\varrho _n)^{\frac{\alpha }{z}} =\left\| \varrho _n\right\| _1^{\frac{\alpha }{z}} \xrightarrow [n\rightarrow +\infty ]{}\left\| \varrho \right\| _1^{\frac{\alpha }{z}}=({{\,\textrm{Tr}\,}}\varrho )^{\frac{\alpha }{z}} =\Vert {\varrho ^{\frac{\alpha }{z}}}\Vert _{\frac{z}{\alpha }}. \end{aligned}$$

Since \(\left\| \varrho _n-\varrho \right\| _{\infty }\le \left\| \varrho _n-\varrho \right\| _1\rightarrow 0\), the continuity of the functional calculus implies \(\Vert {\varrho _n^{\frac{\alpha }{z}}-\varrho ^{\frac{\alpha }{z}}}\Vert _{\infty }\rightarrow 0\); in particular, \(\varrho _n^{\frac{\alpha }{z}}\rightarrow \varrho ^{\frac{\alpha }{z}}\) in the strong operator topology. Hence, by Lemma 2.3, \(\lim _n\left\| \varrho _n^{\frac{\alpha }{z}}-\varrho ^{\frac{\alpha }{z}}\right\| _{\frac{z}{\alpha }}=0\). Thus, for any \(H\in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\),

$$\begin{aligned} \left| \left\| H^{1/2}\varrho _n^{\frac{\alpha }{z}} H^{1/2}\right\| _{\frac{z}{\alpha }} - \left\| H^{1/2}\varrho ^{\frac{\alpha }{z}} H^{1/2}\right\| _{\frac{z}{\alpha }} \right|&\le \left\| H^{1/2}\varrho _n^{\frac{\alpha }{z}} H^{1/2} - H^{1/2}\varrho ^{\frac{\alpha }{z}} H^{1/2}\right\| _{\frac{z}{\alpha }}\\&\le \left\| H\right\| _{\infty }\left\| \varrho _n^{\frac{\alpha }{z}}-\varrho ^ {\frac{\alpha }{z}}\right\| _{\frac{z}{\alpha }}\xrightarrow [n\rightarrow +\infty ]{}0. \end{aligned}$$

This shows that for any \(H\in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), \(\varrho \mapsto {{\,\textrm{Tr}\,}}\big (H^{1/2}\varrho ^{\frac{\alpha }{z}} H^{1/2}\big )^{\frac{z}{\alpha }} = \Vert {H^{1/2}\varrho ^{\frac{\alpha }{z}} H^{1/2}}\Vert _{\frac{z}{\alpha }}^{\frac{z}{\alpha }}\) is continuous on \({{\mathcal {L}}}^1({\mathcal {H}})\), and continuity of \(\sigma \mapsto {{\,\textrm{Tr}\,}}\big (H^{1/2}\sigma ^{\frac{\alpha -1}{z}} H^{1/2}\big )^{\frac{z}{\alpha -1}}\) on \({{\mathcal {L}}}^1({\mathcal {H}})\) can be proved in the same way. Thus, by Lemma 3.23, \({{\mathcal {L}}}^1({\mathcal {H}})\times {{\mathcal {L}}}^1({\mathcal {H}})\ni (\varrho ,\sigma )\mapsto Q_{\alpha ,z}(\varrho \Vert \sigma )\) is the supremum of continuous functions, and hence it is lower semi-continuous. The assertion about the lower semi-continuity of \(D_{\alpha ,z}\) follows trivially from this. \(\square \)
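
That lower semi-continuity cannot be improved to continuity in general can be seen from a simple commuting example (a hypothetical illustration with operators chosen here, not taken from the cited references). Let \({\mathcal {H}}=\mathbb {C}^2\) with orthonormal basis \(e_1,e_2\), and

$$\begin{aligned} \sigma {:}{=}\left| e_1\right\rangle \!\left\langle e_1\right| ,\,\, \,\, \varrho _n{:}{=}\Big (1-\frac{1}{n}\Big )\left| e_1\right\rangle \!\left\langle e_1\right| +\frac{1}{n}\left| e_2\right\rangle \!\left\langle e_2\right| ,\,\, \,\, \varrho {:}{=}\left| e_1\right\rangle \!\left\langle e_1\right| . \end{aligned}$$

Then \(\left\| \varrho _n-\varrho \right\| _1=2/n\rightarrow 0\), while \(\varrho _n^0\nleq \sigma ^0\) gives \(Q_{\alpha ,z}(\varrho _n\Vert \sigma )=+\infty \) for every \(n\), and \(Q_{\alpha ,z}(\varrho \Vert \sigma )=1\). Thus \(Q_{\alpha ,z}(\varrho \Vert \sigma )<\liminf _{n\rightarrow +\infty }Q_{\alpha ,z}(\varrho _n\Vert \sigma )\), which is consistent with lower semi-continuity but rules out upper semi-continuity at \((\varrho ,\sigma )\).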

Remark 3.31

Lower semi-continuity of the sandwiched Rényi \(\alpha \)-divergences for \(\alpha >1\) (i.e., \(z=\alpha >1\)) was given in [27, Proposition 3.10] in the general von Neumann algebra setting, with a different proof.

3.3 Finite-dimensional approximations

Our next goal is to investigate the relation between the sandwiched Rényi divergences of finite-dimensional restrictions of the operators and those of the unrestricted operators. We start with the following:

Lemma 3.32

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), and \(K\in {{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})_{\varrho ,\sigma }^+\) be a contraction. For any \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\) with \(\max \{\alpha -1,\alpha /2\}\le z\le \alpha \),

$$\begin{aligned} Q_{\alpha ,z}(K\varrho K^*\Vert K\sigma K^*)\le Q_{\alpha ,z}(\varrho \Vert \sigma ). \end{aligned}$$
(3.43)

Proof

By assumption, \({{\,\textrm{id}\,}}_{[0,+\infty )}^{\frac{\alpha }{z}}\) is operator convex and \({{\,\textrm{id}\,}}_{[0,+\infty )}^{\frac{\alpha -1}{z}}\) is operator concave, whence

$$\begin{aligned} (K\varrho K^*)^{\frac{\alpha }{z}}\le K\varrho ^{\frac{\alpha }{z}} K^*,\,\, \,\, \,\, \,\, (K\sigma K^*)^{\frac{\alpha -1}{z}} \ge K\sigma ^{\frac{\alpha -1}{z}}K^*, \end{aligned}$$
(3.44)

according to the operator Jensen inequality [7, Theorem 11].

If \(\varrho \notin {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) then \(Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \) by definition, and (3.43) holds trivially. Hence, for the rest we assume that \(\varrho \in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},\sigma )\). Lemma 3.1 yields the existence of some \(\lambda \ge 0\) such that \(\varrho ^{\frac{\alpha }{z}}\le \lambda \sigma ^{\frac{\alpha -1}{z}}\). Thus,

$$\begin{aligned} (K\varrho K^*)^{\frac{\alpha }{z}}\le K\varrho ^{\frac{\alpha }{z}} K^* \le \lambda K\sigma ^{\frac{\alpha -1}{z}}K^*\le \lambda (K\sigma K^*)^{\frac{\alpha -1}{z}}, \end{aligned}$$

where the first and the last inequalities are due to (3.44). Hence, again by Lemma 3.1, \(K\varrho K^*\in {{\mathcal {L}}}^{\alpha ,z}({\mathcal {H}},K\sigma K^*)\); in particular, the variational formulas in Lemma 3.23 hold for \(K\varrho K^*\) and \(K\sigma K^*\) in place of \(\varrho \) and \(\sigma \), respectively.

For any \(H\in {{\mathcal {B}}}({{\mathcal {K}}})_{K\sigma K^*,\alpha ,z}\),

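$$\begin{aligned} +\infty&>{{\,\textrm{Tr}\,}}\big (H^{1/2}(K\sigma K^*)^{\frac{\alpha -1}{z}}H^{1/2}\big )^{\frac{z}{\alpha -1}} \ge {{\,\textrm{Tr}\,}}\big (H^{1/2}K\sigma ^{\frac{\alpha -1}{z}} K^*H^{1/2}\big )^{\frac{z}{\alpha -1}}\nonumber \\&= {{\,\textrm{Tr}\,}}\big (\sigma ^{\frac{\alpha -1}{2z}} K^*HK\sigma ^{\frac{\alpha -1}{2z}}\big )^{\frac{z}{\alpha -1}} = {{\,\textrm{Tr}\,}}\big ((K^*HK)^{1/2}\sigma ^{\frac{\alpha -1}{z}}(K^*HK)^{1/2}\big )^{\frac{z}{\alpha -1}}, \end{aligned}$$
(3.45)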

where the second inequality is due to (3.44). In particular, \(K^*HK\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}\). Similarly,

$$\begin{aligned} {{\,\textrm{Tr}\,}}\big (H^{1/2}(K\varrho K^*)^{\frac{\alpha }{z}}H^{1/2}\big )^{\frac{z}{\alpha }}&\le {{\,\textrm{Tr}\,}}\big (H^{1/2}K\varrho ^{\frac{\alpha }{z}} K^*H^{1/2}\big )^{\frac{z}{\alpha }}\nonumber \\&= {{\,\textrm{Tr}\,}}\big (\varrho ^{\frac{\alpha }{2z}} K^*HK\varrho ^{\frac{\alpha }{2z}}\big )^{\frac{z}{\alpha }}\nonumber \\&= {{\,\textrm{Tr}\,}}\big ((K^*HK)^{1/2}\varrho ^{\frac{\alpha }{z}}(K^*HK)^{1/2}\big )^{\frac{z}{\alpha }}. \end{aligned}$$
(3.46)

Plugging (3.45)–(3.46) into the variational formula yields

$$\begin{aligned}&Q_{\alpha ,z}(K\varrho K^*\Vert K\sigma K^*)\\&\quad = \sup _{H\in {{\mathcal {B}}}({{\mathcal {K}}})_{K\sigma K^*,\alpha ,z}}\left\{ \alpha {{\,\textrm{Tr}\,}}\big (H^{1/2}(K\varrho K^*)^{\frac{\alpha }{z}} H^{1/2}\big )^{\frac{z}{\alpha }}\right. \\&\qquad \left. + (1-\alpha ) {{\,\textrm{Tr}\,}}\big (H^{1/2}(K\sigma K^*)^{\frac{\alpha -1}{z}}H^{1/2}\big )^{\frac{z}{\alpha -1}}\right\} \\&\,\, \le \sup _{H\in {{\mathcal {B}}}({{\mathcal {K}}})_{K\sigma K^*,\alpha ,z}}\left\{ \alpha {{\,\textrm{Tr}\,}}\big ((K^*HK)^{1/2}\varrho ^{\frac{\alpha }{z}} (K^*HK)^{1/2}\big )^{\frac{z}{\alpha }}\right. \\&\qquad \left. + (1-\alpha ) {{\,\textrm{Tr}\,}}\big ((K^*HK)^{1/2}\sigma ^{\frac{\alpha -1}{z}}(K^*HK)^{1/2}\big )^{\frac{z}{\alpha -1}}\right\} \\&\,\, \le \sup _{H\in {{\mathcal {B}}}({\mathcal {H}})_{\sigma ,\alpha ,z}}\left\{ \alpha {{\,\textrm{Tr}\,}}\big (H^{1/2}\varrho ^{\frac{\alpha }{z}} H^{1/2}\big )^{\frac{z}{\alpha }}+ (1-\alpha ) {{\,\textrm{Tr}\,}}\big (H^{1/2}\sigma ^{\frac{\alpha -1}{z}}H^{1/2}\big )^{\frac{z}{\alpha -1}}\right\} \\&\,\, = Q_{\alpha ,z}(\varrho \Vert \sigma ). \end{aligned}$$

\(\square \)
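
As a numerical sanity check of (3.43), the following sketch evaluates both sides for real \(2\times 2\) matrices in the sandwiched case \(\alpha =z=2\), where \(Q_{2,2}(\varrho \Vert \sigma )={{\,\textrm{Tr}\,}}\big (\sigma ^{-1/2}\varrho \sigma ^{-1/2}\varrho \big )\) for invertible \(\sigma \). The matrices `rho`, `sigma`, `k` and all helper names are hypothetical choices made for this illustration; only the Python standard library is used.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def inv_sqrt(a):
    """Inverse square root of a real symmetric positive definite 2x2 matrix.

    Uses sqrt(A) = (A + s I) / t with s = sqrt(det A), t = sqrt(tr A + 2 s).
    """
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    s = math.sqrt(det)
    t = math.sqrt(a[0][0] + a[1][1] + 2.0 * s)
    r = [[(a[0][0] + s) / t, a[0][1] / t],
         [a[1][0] / t, (a[1][1] + s) / t]]
    det_r = r[0][0] * r[1][1] - r[0][1] * r[1][0]
    return [[r[1][1] / det_r, -r[0][1] / det_r],
            [-r[1][0] / det_r, r[0][0] / det_r]]

def q2_sandwiched(rho, sigma):
    """Q_{2,2}(rho||sigma) = Tr(sigma^{-1/2} rho sigma^{-1/2} rho)."""
    m_rho = matmul(inv_sqrt(sigma), rho)
    x = matmul(m_rho, m_rho)
    return x[0][0] + x[1][1]

def conjugate(k, a):
    """K A K^T for real 2x2 matrices."""
    return matmul(matmul(k, a), transpose(k))

rho = [[0.6, 0.2], [0.2, 0.4]]
sigma = [[0.5, 0.1], [0.1, 0.5]]
k = [[0.8, 0.3], [0.0, 0.5]]  # operator norm is about 0.88, so k is a contraction

before = q2_sandwiched(rho, sigma)
after = q2_sandwiched(conjugate(k, rho), conjugate(k, sigma))
print(after, "<=", before)
```

The closed-form square root is specific to \(2\times 2\) matrices; for larger matrices one would use an eigendecomposition instead.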

Remark 3.33

When \(z=\alpha \) and \(K=P\) is a projection in Lemma 3.32, one could appeal to the monotonicity of \(Q_{\alpha }^{*}\) under positive trace-preserving maps, and its additivity on direct sums [27, Proposition 3.11], to obtain the inequality (3.43) as

$$\begin{aligned} Q_{\alpha }^{*}(\varrho \Vert \sigma )&\ge Q_{\alpha }^{*}(P\varrho P+P^{\perp }\varrho P^{\perp }\Vert P\sigma P+P^{\perp }\sigma P^{\perp }) \\&= Q_{\alpha }^{*}(P\varrho P\Vert P\sigma P)+ Q_{\alpha }^{*}( P^{\perp }\varrho P^{\perp }\Vert P^{\perp }\sigma P^{\perp }) \\&\ge Q_{\alpha }^{*}(P\varrho P\Vert P\sigma P). \end{aligned}$$

Note, however, that these properties were only proved in [27] for positive normal functionals, i.e., positive trace-class operators in our setting, and hence this argument gives (3.43) in a restricted setting compared to that of Lemma 3.32, even when we only consider \(z=\alpha \) and reductions by projections.

Recall that the set of projections on \({\mathcal {H}}\) is an upward directed partially ordered set w.r.t. the PSD order. Lemma 3.32 yields the following:

Corollary 3.34

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\) as in Lemma 3.32. For any contraction \(K\in {{\mathcal {B}}}({\mathcal {H}})_{\varrho ,\sigma }^+\), and any projection \(P\in \mathbb {P}({\mathcal {H}})\) such that \(|K|^0\le P\),

$$\begin{aligned} Q_{\alpha ,z}(K\varrho K^*\Vert K\sigma K^*)\le Q_{\alpha ,z}(P\varrho P\Vert P\sigma P). \end{aligned}$$
(3.47)

In particular,

$$\begin{aligned} \mathbb {P}({\mathcal {H}})_{\varrho ,\sigma }^+\ni P\mapsto Q_{\alpha ,z}(P\varrho P\Vert P\sigma P)\,\, \,\, \text {is increasing.} \end{aligned}$$
(3.48)

Proof

Since \(K(P\varrho P)K^*=K\varrho K^*\) and \(K(P\sigma P)K^*=K\sigma K^*\), (3.47) follows immediately by replacing \(\varrho \) with \(P\varrho P\) and \(\sigma \) with \(P\sigma P\) in Lemma 3.32. The monotonicity in (3.48) follows immediately from this. \(\square \)

Definition 3.35

For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\), let

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma ){}_{\textrm{fa}}&{:}{=}\sup _{P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+}Q_{\alpha ,z}(P\varrho P\Vert P\sigma P),\\ D_{\alpha ,z}(\varrho \Vert \sigma ){}_{\textrm{fa}}&{:}{=}\sup _{P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+}D_{\alpha ,z}(P\varrho P\Vert P\sigma P) =\frac{1}{\alpha -1}\log Q_{\alpha ,z}(\varrho \Vert \sigma ){}_{\textrm{fa}}, \end{aligned}$$

be the finite-dimensional approximations of \(Q_{\alpha ,z}(\varrho \Vert \sigma )\) and \(D_{\alpha ,z}(\varrho \Vert \sigma )\), respectively. If, moreover, \(\varrho \) is trace-class then we also define

$$\begin{aligned} \tilde{D}_{\alpha ,z}(\varrho \Vert \sigma ){}_{\textrm{fa}}&{:}{=} D_{\alpha ,z}(\varrho \Vert \sigma ){}_{\textrm{fa}}-\frac{1}{\alpha -1}\log {{\,\textrm{Tr}\,}}\varrho . \end{aligned}$$

Remark 3.36

It is clear from (3.10) that for any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\in (1,+\infty )\times (0,+\infty )\),

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma ){}_{\textrm{fa}}>0,\,\, \,\, \,\, D_{\alpha ,z}(\varrho \Vert \sigma ){}_{\textrm{fa}}>-\infty . \end{aligned}$$
(3.49)

Lemma 3.37

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \((\alpha ,z)\) as in Lemma 3.32. Then

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma ){}_{\textrm{fa}}\le Q_{\alpha ,z}(\varrho \Vert \sigma ), \end{aligned}$$
(3.50)

and

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma ){}_{\textrm{fa}}&=\sup \left\{ Q_{\alpha ,z}(T\varrho T\Vert T\sigma T):\,T\in {{\mathcal {B}}}({\mathcal {H}})_{[0,I]}\cap {{\mathcal {B}}}_f({\mathcal {H}})_{\varrho ,\sigma }^+\right\} \\&= \sup \left\{ Q_{\alpha ,z}(K\varrho K^*\Vert K\sigma K^*):\,K\in {{\mathcal {B}}}_f({\mathcal {H}})_{\varrho ,\sigma }^+,\,\left\| K\right\| \le 1\right\} . \end{aligned}$$

Proof

Immediate from Corollary 3.34. \(\square \)

Our next goal is to see when equality in (3.50) holds.

Lemma 3.38

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}})_+\) and \(1<\alpha \le z\) be such that \(\varrho \in {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\), let \(0<c_n<d_n\), \(n\in \mathbb {N}\), be sequences such that \(c_n\rightarrow 0\), \(d_n\rightarrow +\infty \), and \(P_n{:}{=}\textbf{1}_{(c_n,d_n)}(\sigma )\). Then

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma )\le \liminf _{n\rightarrow +\infty }Q_{\alpha ,z}(P_n\varrho P_n\Vert P_n\sigma P_n). \end{aligned}$$
(3.51)

Proof

Note that by assumption, \(\varrho ^0\le \sigma ^0\), and for every large enough n, \(P_n\in {{\mathcal {B}}}({\mathcal {H}})_{\varrho ,\sigma }^+\). By Lemma 3.1,

$$\begin{aligned} \varrho _{\sigma ,\alpha ,z}=\mathrm{(wo)}\lim _{n\rightarrow +\infty }(P_n\sigma P_n)^{\frac{1-\alpha }{2z}} \underbrace{(P_n\varrho ^{\frac{\alpha }{z}} P_n)}_{\le (P_n\varrho P_n)^{{\frac{\alpha }{z}}}} (P_n\sigma P_n)^{\frac{1-\alpha }{2z}}, \end{aligned}$$
(3.52)

where the inequality follows from the operator Jensen inequality [7, Theorem 11] due to the fact that \(\alpha /z\in (0,1]\) by assumption. Hence,

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma ) =\left\| \varrho _{\sigma ,\alpha ,z}\right\| _{z}^{z}&\le \liminf _{n\rightarrow +\infty } \left\| (P_n\sigma P_n)^{\frac{1-\alpha }{2z}}(P_n\varrho ^{\frac{\alpha }{z}} P_n) (P_n\sigma P_n)^{\frac{1-\alpha }{2z}}\right\| _{z}^{z}\\&\le \liminf _{n\rightarrow +\infty } \left\| (P_n\sigma P_n)^{\frac{1-\alpha }{2z}}(P_n\varrho P_n)^{\frac{\alpha }{z}} (P_n\sigma P_n)^{\frac{1-\alpha }{2z}}\right\| _{z}^{z} \\&= \liminf _{n\rightarrow +\infty } Q_{\alpha ,z}(P_n\varrho P_n\Vert P_n\sigma P_n), \end{aligned}$$

where the first inequality is due to Lemma 2.4, and the second inequality follows from (3.52). \(\square \)

The range of \((\alpha ,z)\) pairs to which both Lemma 3.37 and Lemma 3.38 apply is \(1<\alpha =z\), i.e., the case of the sandwiched Rényi divergences, and hence for the rest we restrict to this case. Fortunately, this is sufficient for the intended applications in the rest of the paper.

Lemma 3.39

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be trace-class, and \(K_n\in {{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})_{\varrho ,\sigma }^+\), \(n\in \mathbb {N}\), be contractions such that

$$\begin{aligned} \exists \mathrm{(so)}\lim _{n\rightarrow +\infty }K_n=:K_{\infty }\in {{\mathcal {B}}}({\mathcal {H}},{{\mathcal {K}}})_{\varrho ,\sigma }^+,\,\, \,\, \,\, \exists \mathrm{(so)}\lim _{n\rightarrow +\infty }K_n^*. \end{aligned}$$

(That is, \((K_n)_{n\in \mathbb {N}}\) converges in the strong\(^*\) operator topology.) Then

$$\begin{aligned} Q_{\alpha }^{*}(K_{\infty }\varrho K_{\infty }^*\Vert K_{\infty }\sigma K_{\infty }^*)&\le \liminf _{n\rightarrow +\infty }Q_{\alpha }^{*}(K_n\varrho K_n^*\Vert K_n\sigma K_n^*) \end{aligned}$$
(3.53)
$$\begin{aligned}&\le \limsup _{n\rightarrow +\infty }Q_{\alpha }^{*}(K_n\varrho K_n^*\Vert K_n\sigma K_n^*) \le Q_{\alpha }^{*}(\varrho \Vert \sigma ). \end{aligned}$$
(3.54)

In particular, if \(P_n\in \mathbb {P}({\mathcal {H}})_{\varrho ,\sigma }^+\), \(n\in \mathbb {N}\), is a sequence of projections strongly converging to some \(P_{\infty }\) with \(P_{\infty }\varrho P_{\infty }=\varrho \) and \(P_{\infty }\sigma P_{\infty }=\sigma \) then

$$\begin{aligned} \lim _{n\rightarrow +\infty }Q_{\alpha }^{*}(P_n\varrho P_n\Vert P_n\sigma P_n) = Q_{\alpha }^{*}(\varrho \Vert \sigma ). \end{aligned}$$

Proof

The second inequality in (3.54) is obvious from Lemma 3.32, and the first inequality is trivial. By the assumptions and Lemma 2.2,

$$\begin{aligned} \lim _{n\rightarrow +\infty }\left\| K_n\varrho K_n^*-K_{\infty }\varrho K_{\infty }^*\right\| _1=0= \lim _{n\rightarrow +\infty }\left\| K_n\sigma K_n^*-K_{\infty }\sigma K_{\infty }^*\right\| _1. \end{aligned}$$
(3.55)

Since \(Q_{\alpha }^{*}\) is lower semi-continuous on \({{\mathcal {L}}}^1({\mathcal {H}})\times {{\mathcal {L}}}^1({\mathcal {H}})\) (see Corollary 3.30, or [27, Proposition 3.10]), we get the inequality in (3.53). The last assertion follows obviously. \(\square \)

Lemmas 3.37, 3.38, and 3.39 imply immediately the following:

Proposition 3.40

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), and assume that \(\varrho \) and \(\sigma \) are trace-class, or that \(\sigma \) is compact and \(\varrho \in {{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma )\). Then

$$\begin{aligned} Q_{\alpha }^{*}(\varrho \Vert \sigma )= Q_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}&= \lim _{\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\ni P\nearrow I} Q_{\alpha }^{*}(P\varrho P\Vert P\sigma P) \end{aligned}$$
(3.56)
$$\begin{aligned}&= \lim _{n\rightarrow +\infty } Q_{\alpha }^{*}(P_n\varrho P_n\Vert P_n\sigma P_n), \end{aligned}$$
(3.57)
$$\begin{aligned} D_{\alpha }^{*}(\varrho \Vert \sigma )= D_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}&=\lim _{\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\ni P\nearrow I} D_{\alpha }^{*}(P\varrho P\Vert P\sigma P) \end{aligned}$$
(3.58)
$$\begin{aligned}&= \lim _{n\rightarrow +\infty } D_{\alpha }^{*}(P_n\varrho P_n\Vert P_n\sigma P_n), \end{aligned}$$
(3.59)

for every \(\alpha >1\), where the convergence in (3.56) and (3.58) is a net convergence in the strong operator topology, and (3.57) and (3.59) hold for any sequence \((P_n)_{n\in \mathbb {N}}\) as in Lemma 3.38. If, moreover, \(\varrho \) is trace-class then

$$\begin{aligned} \tilde{D}_{\alpha }^{*}(\varrho \Vert \sigma )= \tilde{D}_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}&=\lim _{\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\ni P\nearrow I} \tilde{D}_{\alpha }^{*}(P\varrho P\Vert P\sigma P)\nonumber \\&=\lim _{n\rightarrow +\infty } \tilde{D}_{\alpha }^{*}(P_n\varrho P_n\Vert P_n\sigma P_n),\,\, \,\, \,\, \alpha >1. \end{aligned}$$
(3.60)
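
The convergence in (3.57) can be illustrated numerically for \(\alpha =2\). The sketch below assumes a hypothetical example on \(\ell ^2(\mathbb {N})\): \(\sigma \) is diagonal with eigenvalues \(2^{-i}\), \(i\ge 1\), and \(\varrho =\left| \psi \right\rangle \!\left\langle \psi \right| \) with \(\psi _i^2=3\cdot 4^{-i}\), so that both have unit trace, and \(P_n\) projects onto the first \(n\) eigenvectors of \(\sigma \). For diagonal \(\sigma \) one has \(Q_{2}^{*}(P_n\varrho P_n\Vert P_n\sigma P_n)=\sum _{i,j\le n}\varrho _{ij}^2/\sqrt{s_is_j}\), and a geometric series gives the limiting value.

```python
import math

def q2_truncated(n):
    """Q_2^*(P_n rho P_n || P_n sigma P_n) on the first n basis vectors.

    Hypothetical example: sigma = diag(2^-i), rho = |psi><psi| with
    psi_i^2 = 3 * 4^-i, i = 1, 2, ...
    """
    s = [2.0 ** -i for i in range(1, n + 1)]
    psi = [math.sqrt(3.0 * 4.0 ** -i) for i in range(1, n + 1)]
    # Tr(sigma^{-1/2} rho sigma^{-1/2} rho) = sum_{i,j} rho_ij^2 / sqrt(s_i s_j)
    return sum((psi[i] * psi[j]) ** 2 / math.sqrt(s[i] * s[j])
               for i in range(n) for j in range(n))

def q2_limit():
    """Closed form (3 r / (1 - r))^2 with r = sqrt(2) / 4 (geometric series)."""
    r = math.sqrt(2.0) / 4.0
    return (3.0 * r / (1.0 - r)) ** 2

values = [q2_truncated(n) for n in (1, 2, 4, 8, 16, 32)]
print(values, q2_limit())
```

The printed values increase with \(n\), in line with (3.48), and approach the closed-form limit.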

Remark 3.41

Finite-dimensional approximability for the standard f-divergences was given in [19, Theorem 4.5] in the general von Neumann algebra setting. In particular, it shows that for any two PSD trace-class operators on a Hilbert space, the standard (or Petz-type) Rényi divergences satisfy

$$\begin{aligned}&D_{\alpha ,1}(\varrho \Vert \sigma )= D_{\alpha ,1}(\varrho \Vert \sigma ){}_{\textrm{fa}}{:}{=} \sup _{P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+}D_{\alpha ,1}(P\varrho P\Vert P\sigma P)\,, \quad \alpha \in [0,2]. \end{aligned}$$

It is an open question whether finite-dimensional approximability holds for any \((\alpha ,z)\) pairs other than \(\alpha \in [0,2]\) and \(z=1\), and \(z=\alpha \in (1,+\infty )\).

There are cases apart from the ones treated in Proposition 3.40 where the inequality in (3.50) holds with equality. In particular, we have the following trivial case, which we will use in the proof of Proposition 4.3.

Lemma 3.42

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\).

$$\begin{aligned} \text {If}\,\, \,\, \varrho ^0\nleq \sigma ^0\,\, \,\, \text {then}\,\, \,\, Q_{\alpha ,z}(\varrho \Vert \sigma ){}_{\textrm{fa}}=+\infty =Q_{\alpha ,z}(\varrho \Vert \sigma ),\,\, \,\, \alpha \in (1,+\infty ),\,\, z\in (0,+\infty ). \end{aligned}$$
(3.61)

Proof

Assume that \(\varrho ^0\nleq \sigma ^0\), so that there exists a unit vector \(\psi \in {\mathcal {H}}\) such that \(\sigma ^0\psi =0\), \(\varrho ^0\psi \ne 0\). Let \(\phi \) be any unit vector such that \(\sigma ^0\phi =\phi \), and for every \(t\in [0,1]\), define \(\psi _t{:}{=}\sqrt{1-t}\psi +\sqrt{t}\phi \), \(P_t{:}{=}\left| \psi _t\right\rangle \!\left\langle \psi _t\right| \). Then

$$\begin{aligned} P_t\varrho P_t&=\left| \psi _t\right\rangle \!\left\langle \psi _t\right| \left\langle \psi _t , \varrho \psi _t\right\rangle \xrightarrow [t\rightarrow 0]{} \left| \psi \right\rangle \!\left\langle \psi \right| \underbrace{\left\langle \psi , \varrho \psi \right\rangle }_{>0},\\ P_t\sigma P_t&=\left| \psi _t\right\rangle \!\left\langle \psi _t\right| \left\langle \psi _t , \sigma \psi _t\right\rangle \xrightarrow [t\rightarrow 0]{}0, \end{aligned}$$

while \(P_t\sigma P_t\ne 0\) for every \(t\in (0,1]\). Thus,

$$\begin{aligned} Q_{\alpha ,z}(\varrho \Vert \sigma ){}_{\textrm{fa}}\ge \lim _{t\searrow 0}Q_{\alpha ,z}(P_t\varrho P_t\Vert P_t\sigma P_t) = \lim _{t\searrow 0}\left\langle \psi _t , \varrho \psi _t\right\rangle ^{\alpha }\left\langle \psi _t , \sigma \psi _t\right\rangle ^{1-\alpha } =+\infty . \end{aligned}$$

Since \(\varrho ^0\nleq \sigma ^0\) implies that \(\varrho \notin {{\mathcal {B}}}^{\alpha ,z}({\mathcal {H}},\sigma )\) (see Lemma 3.1), we also get \(Q_{\alpha ,z}(\varrho \Vert \sigma )=+\infty \). \(\square \)
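
The divergence in the proof can be seen in a minimal example. The sketch below assumes the hypothetical choice \({\mathcal {H}}=\mathbb {C}^2\), \(\sigma =\left| e_2\right\rangle \!\left\langle e_2\right| \), \(\varrho =I/2\), \(\psi =e_1\), \(\phi =e_2\), and evaluates \(\left\langle \psi _t , \varrho \psi _t\right\rangle ^{\alpha }\left\langle \psi _t , \sigma \psi _t\right\rangle ^{1-\alpha }\) for decreasing \(t\).

```python
import math

def q_rank_one(t, alpha):
    """<psi_t, rho psi_t>^alpha * <psi_t, sigma psi_t>^(1 - alpha) on C^2.

    Hypothetical example: sigma = |e2><e2|, rho = I/2,
    psi_t = sqrt(1 - t) e1 + sqrt(t) e2 with 0 < t <= 1.
    """
    psi_t = (math.sqrt(1.0 - t), math.sqrt(t))
    rho_exp = 0.5 * (psi_t[0] ** 2 + psi_t[1] ** 2)  # <psi_t, rho psi_t>
    sigma_exp = psi_t[1] ** 2                        # <psi_t, sigma psi_t>
    return rho_exp ** alpha * sigma_exp ** (1.0 - alpha)

# Values for t = 10^-1, ..., 10^-6 with alpha = 2; they grow like 0.25 / t.
values = [q_rank_one(10.0 ** -k, 2.0) for k in range(1, 7)]
print(values)
```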

The finite-dimensional approximability of the sandwiched Rényi divergences in Proposition 3.40 is the key property used in proving the main results of the paper, the equality of the sandwiched and the regularized measured Rényi divergences, and the determination of the strong converse exponent of state discrimination, in Sects. 3.4 and 4.1.

The following monotonicity result has been proved for finite-rank states in [35], and for states of a general von Neumann algebra in [6, 27]. We give a different proof of it in our setting as an illustration of the use of the finite-dimensional approximability in extending finite-dimensional results to infinite dimension. We will give yet another proof in Sect. 3.4, using a different representation of the sandwiched Rényi divergences.

Corollary 3.43

Let \(\varrho ,\sigma \in {{\mathcal {L}}}^1({\mathcal {H}}){}_{\gneq 0}\) be PSD trace-class operators. Then

$$\begin{aligned} (1,+\infty )\ni \alpha \mapsto \tilde{D}_{\alpha }^{*}(\varrho \Vert \sigma )\,\, \,\, \text {is increasing}, \end{aligned}$$
(3.62)

and

$$\begin{aligned} \lim _{\alpha \rightarrow +\infty }\tilde{D}_{\alpha }^{*}(\varrho \Vert \sigma ) =\lim _{\alpha \rightarrow +\infty } D_{\alpha }^{*}(\varrho \Vert \sigma ) =D_{\max }(\varrho \Vert \sigma ). \end{aligned}$$
(3.63)

Proof

These are well-known when \(\varrho \) and \(\sigma \) are finite-rank [35]. Thus, by (3.60), the monotonicity in (3.62) holds. This also shows that the first limit in (3.63) exists, and it is trivial by definition that it is equal to the second limit. To show the last equality in (3.63), it is sufficient to consider the case when \(\varrho \) and \(\sigma \) are density operators, due to the scaling properties in Remark 3.13. Then \(D_{\alpha }^{*}(\varrho \Vert \sigma )=\tilde{D}_{\alpha }^{*}(\varrho \Vert \sigma )\) for every \(\alpha >1\), and

$$\begin{aligned} \lim _{\alpha \rightarrow +\infty } D_{\alpha }^{*}(\varrho \Vert \sigma )&= \lim _{\alpha \rightarrow +\infty } \tilde{D}_{\alpha }^{*}(\varrho \Vert \sigma ) = \sup _{\alpha>1}\tilde{D}_{\alpha }^{*}(\varrho \Vert \sigma ) = \sup _{\alpha>1} D_{\alpha }^{*}(\varrho \Vert \sigma )\\&=\sup _{\alpha>1}\sup _{P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+}D_{\alpha }^{*}(P\varrho P\Vert P\sigma P)\\&=\sup _{P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+}\sup _{\alpha>1}D_{\alpha }^{*}(P\varrho P\Vert P\sigma P)\\&=\sup _{P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+}\sup _{\alpha >1} \left\{ \tilde{D}_{\alpha }^{*}(P\varrho P\Vert P\sigma P)+\frac{1}{\alpha -1}\log {{\,\textrm{Tr}\,}}P\varrho P\right\} \\&=\sup _{P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+} D_{\max }(P\varrho P\Vert P\sigma P) =D_{\max }(\varrho \Vert \sigma ). \end{aligned}$$

Here, the first three equalities are trivial, and the fourth one follows by Proposition 3.40. The fifth equality is again trivial, and the sixth one is by definition. In the seventh equality we use that both \(\alpha \mapsto \tilde{D}_{\alpha }^{*}(P\varrho P\Vert P\sigma P)\) and \(\alpha \mapsto \frac{1}{\alpha -1}\log {{\,\textrm{Tr}\,}}P\varrho P\) are increasing, and hence the supremum of their sum over \(\alpha >1\) is the sum of their limits at \(\alpha \rightarrow +\infty \), which is equal to \(D_{\max }(P\varrho P\Vert P\sigma P)\), according to the known behaviour in the finite-dimensional case. The last equality is straightforward to verify. \(\square \)

3.4 Regularized measured Rényi divergence

A finite-outcome positive operator-valued measure (POVM) on a Hilbert space \({\mathcal {H}}\) is a map \(M:\,[r]\rightarrow {{\mathcal {B}}}({\mathcal {H}})\), where \([r]{:}{=}\{1,\ldots ,r\}\), every \(M_i\) is PSD, and \(\sum _{i=1}^rM_i=I\). (We assume without loss of generality that the set of possible outcomes is a subset of \(\mathbb {N}\).) We denote the set of such POVMs by \(\text {POVM}({\mathcal {H}},[r])\). For two PSD trace-class operators \(\varrho ,\sigma \in {{\mathcal {L}}}^1({\mathcal {H}}){}_{\gneq 0}\), their measured Rényi divergence is defined as

$$\begin{aligned} D_{\alpha }^{\text {meas}}(\varrho \Vert \sigma ){:}{=}\sup _{r\in \mathbb {N}}\sup _{M\in \text {POVM}({\mathcal {H}},[r])} D_{\alpha }\big (\big ({{\,\textrm{Tr}\,}}\varrho M_i\big )_{i\in [r]}\Big \Vert \big ({{\,\textrm{Tr}\,}}\sigma M_i\big )_{i\in [r]}\big ), \end{aligned}$$

where in the second expression we have the classical Rényi divergence [42] of the given non-negative functions on [r]. This is defined for \(p,q\in [0,+\infty )^{[r]}\backslash \{0\} \) as

$$\begin{aligned} D_{\alpha }(p\Vert q){:}{=}{\left\{ \begin{array}{ll} \frac{1}{\alpha -1}\log \sum _{i\in [r]}p(i)^{\alpha }q(i)^{1-\alpha },&{}{{\,\textrm{supp}\,}}p\subseteq {{\,\textrm{supp}\,}}q,\\ +\infty ,&{}\text {otherwise}. \end{array}\right. } \end{aligned}$$
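As a purely illustrative aside, the classical Rényi divergence above can be evaluated numerically. The following is a minimal Python sketch under stated assumptions: the function name `renyi_divergence` is hypothetical, only \(\alpha >1\) is covered, and the inputs are plain lists of non-negative weights that need not be normalized.

```python
import math

def renyi_divergence(p, q, alpha):
    """Classical Renyi divergence D_alpha(p||q) for alpha > 1.

    p and q are non-negative, non-zero sequences of equal length; they need
    not be normalized, matching the unnormalized convention used in the text.
    (Hypothetical helper, for illustration only.)
    """
    if alpha <= 1:
        raise ValueError("this sketch only covers alpha > 1")
    # If supp p is not contained in supp q, the divergence is +infinity.
    if any(pi > 0 and qi == 0 for pi, qi in zip(p, q)):
        return math.inf
    # Terms with p(i) = 0 contribute nothing, so q(i)^(1-alpha) is never
    # evaluated at q(i) = 0.
    total = sum(pi**alpha * qi**(1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(total) / (alpha - 1)

p = [0.5, 0.5]
q = [0.25, 0.75]
# For alpha = 2 the sum is 0.25/0.25 + 0.25/0.75 = 4/3.
d2 = renyi_divergence(p, q, 2.0)
# Support violation: p charges a point that q does not.
d_inf = renyi_divergence([1.0, 1.0], [1.0, 0.0], 2.0)
```

Here `d2` agrees with \(\log (4/3)\) up to rounding, and `d_inf` is \(+\infty \) by the support condition.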

One might consider more general POVMs for the definition, but that does not change the value of the measured Rényi divergence; see, e.g., [21, Proposition 5.2]. The regularized measured Rényi divergence of \(\varrho \) and \(\sigma \) is then defined as

$$\begin{aligned} \overline{D}_{\alpha }^{\text {meas}}(\varrho \Vert \sigma ){:}{=} \sup _{n\in \mathbb {N}}\frac{1}{n}D_{\alpha }^{\text {meas}}\big (\varrho ^{\otimes n}\Vert \sigma ^{\otimes n}\big )= \lim _{n\rightarrow \infty }\frac{1}{n}D_{\alpha }^{\text {meas}}\big (\varrho ^{\otimes n}\Vert \sigma ^{\otimes n}\big ). \end{aligned}$$

The following has been shown in [33]:

Lemma 3.44

For finite-rank PSD operators \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\),

$$\begin{aligned} \overline{D}_{\alpha }^{\text {meas}}(\varrho \Vert \sigma ) =D_{\alpha }^{*}(\varrho \Vert \sigma ),\,\, \,\, \,\, \alpha >1. \end{aligned}$$

In the proof of the next theorem, we will use the monotonicity of the sandwiched Rényi \(\alpha \)-divergences under finite-outcome measurements for \(\alpha >1\). The more general statement of monotonicity under quantum operations has been proved in [6, Theorem 14] and [27, Theorem 3.14] in the general von Neumann algebra setting. We give a different proof for trace-class operators on a Hilbert space in Corollary 4.15 below.

Theorem 3.45

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be trace-class, and \(\alpha >1\). Then

$$\begin{aligned} \overline{D}_{\alpha }^{\text {meas}}(\varrho \Vert \sigma )=D_{\alpha }^{*}(\varrho \Vert \sigma ). \end{aligned}$$

Proof

The inequality \(\overline{D}_{\alpha }^{\text {meas}}(\varrho \Vert \sigma )\le D_{\alpha }^{*}(\varrho \Vert \sigma )\) is trivial from the monotonicity of \(D_{\alpha }^{*}\) under quantum operations and its additivity under tensor products (Lemma 3.22), and hence we only need to prove the converse inequality. By Proposition 3.40, for any \(c<D_{\alpha }^{*}(\varrho \Vert \sigma )\) there exists a finite-rank projection \(P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\) such that

$$\begin{aligned} c<D_{\alpha }^{*}(P\varrho P\Vert P\sigma P)\le D_{\alpha }^{*}(\varrho \Vert \sigma ). \end{aligned}$$

By Lemma 3.44, there exist \(n\in \mathbb {N}\), a number \(r\in \mathbb {N}\), and \(M_i\in {{\mathcal {B}}}({\mathcal {H}}^{\otimes n}){}_{\gneq 0}\), \(M_i^0\le P^{\otimes n}\), \(i\in [r]\), with \(\sum _{i\in [r]}M_i=P^{\otimes n}\), such that

$$\begin{aligned} c< \frac{1}{n}D_{\alpha }\big (\big ({{\,\textrm{Tr}\,}}(P\varrho P)^{\otimes n}M_i\big )_{i\in [r]}\big \Vert \big ({{\,\textrm{Tr}\,}}(P\sigma P)^{\otimes n}M_i\big )_{i\in [r]}\big ). \end{aligned}$$
(3.64)

Let us define \(\tilde{M}_i{:}{=}M_i\), \(i\in [r]\), and \(\tilde{M}_{r+1}{:}{=}I_{{\mathcal {H}}^{\otimes n}}-P^{\otimes n}\). Then \((\tilde{M}_i)_{i\in [r+1]}\) is a POVM on \({\mathcal {H}}^{\otimes n}\), and we have

$$\begin{aligned} c&<\frac{1}{n}D_{\alpha }\big (\big ({{\,\textrm{Tr}\,}}(P\varrho P)^{\otimes n}M_i\big )_{i\in [r]}\big \Vert \big ({{\,\textrm{Tr}\,}}(P\sigma P)^{\otimes n}M_i\big )_{i\in [r]}\big )\\&= \frac{1}{n}D_{\alpha }\big (\big ({{\,\textrm{Tr}\,}}\varrho ^{\otimes n}\tilde{M}_i\big )_{i\in [r]}\Big \Vert \big ({{\,\textrm{Tr}\,}}\sigma ^{\otimes n}\tilde{M}_i\big )_{i\in [r]}\big )\\&\le \frac{1}{n}D_{\alpha }\big (\big ({{\,\textrm{Tr}\,}}\varrho ^{\otimes n}\tilde{M}_i\big )_{i\in [r+1]}\Big \Vert \big ({{\,\textrm{Tr}\,}}\sigma ^{\otimes n}\tilde{M}_i\big )_{i\in [r+1]}\big )\\&\le \frac{1}{n}D_{\alpha }^{\text {meas}}\big (\varrho ^{\otimes n}\Vert \sigma ^{\otimes n}\big )\\&\le \overline{D}_{\alpha }^{\text {meas}}(\varrho \Vert \sigma ), \end{aligned}$$

where the first inequality is by (3.64), the equality and the second inequality are trivial, and the third and the fourth inequalities are by definition. Thus, \(c<\overline{D}_{\alpha }^{\text {meas}}(\varrho \Vert \sigma )\), and since the above holds for every \(c<D_{\alpha }^{*}(\varrho \Vert \sigma )\), the assertion follows. \(\square \)
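The step of appending the outcome \(I-P^{\otimes n}\) can be illustrated on the level of outcome weights. The sketch below is only a classical toy computation with hypothetical numbers (the lists `p_sub`, `q_sub` stand in for \({{\,\textrm{Tr}\,}}\varrho ^{\otimes n}M_i\) and \({{\,\textrm{Tr}\,}}\sigma ^{\otimes n}M_i\), assuming normalized states); it shows that adding one more outcome contributes a non-negative term to the sum and therefore cannot decrease the classical Rényi divergence for \(\alpha >1\).

```python
import math

def renyi_divergence(p, q, alpha):
    """Classical Renyi divergence for alpha > 1 (unnormalized weights allowed)."""
    if any(pi > 0 and qi == 0 for pi, qi in zip(p, q)):
        return math.inf
    total = sum(pi**alpha * qi**(1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(total) / (alpha - 1)

# Hypothetical outcome weights for i in [r]; they sum to less than 1 because
# the operators M_i only add up to the projection P^{(n)}, not to I.
p_sub = [0.50, 0.30]
q_sub = [0.20, 0.40]

# Weights of the extra outcome I - P^{(n)}, assuming normalized states.
p_ext = p_sub + [1.0 - sum(p_sub)]
q_ext = q_sub + [1.0 - sum(q_sub)]

alpha = 2.0
before = renyi_divergence(p_sub, q_sub, alpha)
after = renyi_divergence(p_ext, q_ext, alpha)
```

In this example `after` is at least `before`, which is the second inequality in the chain above in miniature.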

The representation given in Theorem 3.45 distinguishes the sandwiched Rényi divergences among all quantum generalizations of the classical Rényi divergences; in particular, it gives special importance to the \(\alpha =z\) case in the family of Rényi \((\alpha ,z)\)-divergences, at least for \(\alpha >1\). It also allows one to deduce some important properties of the sandwiched Rényi divergences from those of the classical Rényi divergences; we present such an example in Corollary 3.46. Note that the properties in Corollary 3.46 were also proved in [6, 27] in the general von Neumann algebra setting, by different methods. Yet another proof was given in our setting in Corollary 3.43.

Corollary 3.46

Let \(\varrho ,\sigma \in {{\mathcal {L}}}^1({\mathcal {H}}){}_{\gneq 0}\) be PSD trace-class operators. Then

$$\begin{aligned} (1,+\infty )\ni \alpha \mapsto \tilde{D}_{\alpha }^{*}(\varrho \Vert \sigma )\,\, \,\, \text {is increasing}, \end{aligned}$$
(3.65)

and

$$\begin{aligned} \sup _{\alpha >1}\tilde{D}_{\alpha }^{*}(\varrho \Vert \sigma ) =\lim _{\alpha \rightarrow +\infty }\tilde{D}_{\alpha }^{*}(\varrho \Vert \sigma )&= \lim _{\alpha \rightarrow +\infty } D_{\alpha }^{*}(\varrho \Vert \sigma ) \end{aligned}$$
(3.66)
$$\begin{aligned}&= D_{\max }(\varrho \Vert \sigma ) \end{aligned}$$
(3.67)
$$\begin{aligned}&= \log \inf \{\lambda > 0:\,\varrho \le \lambda \sigma \} \end{aligned}$$
(3.68)
$$\begin{aligned}&=\log \sup \left\{ \frac{{{\,\textrm{Tr}\,}}\varrho T}{{{\,\textrm{Tr}\,}}\sigma T}:\,T\in {{\mathcal {B}}}({\mathcal {H}})_{[0,1]},\,{{\,\textrm{Tr}\,}}\sigma T>0\right\} . \end{aligned}$$
(3.69)

Proof

The increasing property in (3.65) is well-known and easy to verify for commuting finite-rank states (i.e., in the finite-dimensional classical setting). The general case follows immediately from this and Theorem 3.45. The first equality in (3.66) is immediate from the increasing property in (3.65), and the second equality is trivial by definition.

Note that the equality in (3.68) is by definition (see (3.9)), and it is clear that \(D_{\max }(\varrho \Vert \sigma )\) is an upper bound on (3.69). To prove the converse inequality, note first that (3.69) is equal to \(+\infty \) if \(\varrho ^0\nleq \sigma ^0\), and hence for the rest we assume the contrary. Let \(0<\lambda <\exp (D_{\max }(\varrho \Vert \sigma ))\). By definition, there exists a unit vector \(\psi \in {\mathcal {H}}\) such that \(\left\langle \psi , \varrho \psi \right\rangle >\lambda \left\langle \psi , \sigma \psi \right\rangle \). In particular, \(\left\langle \psi , \varrho \psi \right\rangle >0\), and hence also \(\left\langle \psi , \sigma \psi \right\rangle >0\), due to the assumption that \(\varrho ^0\le \sigma ^0\). Choosing \(T{:}{=}\left| \psi \right\rangle \!\left\langle \psi \right| \) shows that (3.69) is lower bounded by \(\log \lambda \) for any such \(\lambda \), and hence it is also lower bounded by \(D_{\max }(\varrho \Vert \sigma )\). Thus, we get the equality in (3.69).

It is also straightforward to verify that the expressions in (3.66) are upper bounded by \(D_{\max }(\varrho \Vert \sigma )\). To prove the converse inequality, note that for any test T as in (3.69),

$$\begin{aligned} D_{\alpha }^{*}(\varrho \Vert \sigma )&=\overline{D}_{\alpha }^{\text {meas}}(\varrho \Vert \sigma ) \ge D_{\alpha }\big (({{\,\textrm{Tr}\,}}\varrho T,{{\,\textrm{Tr}\,}}\varrho (I-T)),({{\,\textrm{Tr}\,}}\sigma T,{{\,\textrm{Tr}\,}}\sigma (I-T))\big )\\&\ge \frac{1}{\alpha -1}\log \left[ ({{\,\textrm{Tr}\,}}\varrho T)^{\alpha }({{\,\textrm{Tr}\,}}\sigma T)^{1-\alpha }\right] = \frac{\alpha }{\alpha -1}\log {{\,\textrm{Tr}\,}}\varrho T-\log {{\,\textrm{Tr}\,}}\sigma T, \end{aligned}$$

whence

$$\begin{aligned} \lim _{\alpha \rightarrow +\infty }D_{\alpha }^{*}(\varrho \Vert \sigma )\ge \log \frac{{{\,\textrm{Tr}\,}}\varrho T}{{{\,\textrm{Tr}\,}}\sigma T}\,. \end{aligned}$$

Taking the supremum over T yields that \(\lim _{\alpha \rightarrow +\infty }D_{\alpha }^{*}(\varrho \Vert \sigma )\) is lower bounded by (3.69), which in turn is equal to \(D_{\max }(\varrho \Vert \sigma )\) by the above. \(\square \)
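For commuting finite-rank states the statements of Corollary 3.46 reduce to elementary facts about probability vectors, which can be checked numerically. The following Python sketch is an illustration only, with hypothetical helper names and example vectors; it evaluates \(D_{\alpha }\) for increasing \(\alpha \), compares with \(D_{\max }(p\Vert q)=\log \max _i p(i)/q(i)\), and also evaluates the two-outcome lower bound from the proof for the test supported on the index with the largest ratio.

```python
import math

def renyi_divergence(p, q, alpha):
    """Classical Renyi divergence for alpha > 1 and strictly positive q."""
    total = sum(pi**alpha * qi**(1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(total) / (alpha - 1)

def d_max(p, q):
    """log inf{lam > 0 : p <= lam q}, i.e. log max_i p(i)/q(i) in the commuting case."""
    return math.log(max(pi / qi for pi, qi in zip(p, q) if pi > 0))

# Hypothetical example: two probability vectors on three points.
p = [0.6, 0.3, 0.1]
q = [0.2, 0.3, 0.5]

alphas = [1.5, 2.0, 5.0, 20.0, 200.0]
values = [renyi_divergence(p, q, a) for a in alphas]
limit = d_max(p, q)

# Two-outcome bound from the proof, with T the indicator of index 0:
# D_alpha >= alpha/(alpha-1) * log Tr(rho T) - log Tr(sigma T).
bounds = [a / (a - 1) * math.log(p[0]) - math.log(q[0]) for a in alphas]
```

The list `values` is increasing and stays below `limit`, approaching it as \(\alpha \) grows, while each entry of `bounds` is a lower bound on the corresponding entry of `values` (up to rounding).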

3.5 The Hoeffding anti-divergences

For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), let

$$\begin{aligned} \psi ^{*}(\varrho \Vert \sigma |\alpha )&{:}{=} \log Q_{\alpha }^{*}(\varrho \Vert \sigma )=(\alpha -1)D_{\alpha }^{*}(\varrho \Vert \sigma ),\,\, \,\, \,\, \alpha >1,\\ \tilde{\psi }^{*}(\varrho \Vert \sigma |u)&{:}{=} (1-u)\psi ^{*}\big (\varrho \Vert \sigma |(1-u)^{-1}\big ),\,\, \,\, \,\, u\in (0,1). \end{aligned}$$

We will need these quantities to define the Hoeffding anti-divergences, which will give the strong converse exponent of state discrimination in Sect. 4.

Lemma 3.47

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) with \(\varrho ^0\le \sigma ^0\).

(i) For any finite-rank projection \(P\!\!\in \!\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\), \(\psi ^{*}\!(P\varrho P\Vert P\sigma P|\cdot \!)\) and \(\tilde{\psi }^{*}\!(P\varrho P\Vert P\sigma P|\cdot \!)\) are finite-valued convex functions on \((1,+\infty )\) and (0, 1), respectively, and hence they are continuous. Moreover,

$$\begin{aligned} \psi ^{*}(P\varrho P\Vert P\sigma P|1) {:}{=} \tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|0)&{:}{=}\lim _{u\searrow 0}\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u) =\log {{\,\textrm{Tr}\,}}P\varrho P, \end{aligned}$$
(3.70)
$$\begin{aligned} \psi ^{*}(P\varrho P\Vert P\sigma P|+\infty ) {:}{=} \tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|1)&{:}{=}\lim _{u\nearrow 1}\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u)\nonumber \\&= D_{\max }(P\varrho P\Vert P\sigma P), \end{aligned}$$
(3.71)

and the so extended functions \(\psi ^{*}(P\varrho P\Vert P\sigma P|\cdot )\) and \(\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|\cdot )\) are convex and continuous on \([1,+\infty ]\) and on [0, 1], respectively.

(ii) For every \(\alpha \in [1,+\infty ]\), \(P\mapsto \psi ^{*}(P\varrho P\Vert P\sigma P|\alpha )\) is monotone increasing on \(\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\).

(iii) For every \(u\in [0,1]\), \(P\mapsto \tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u)\) is monotone increasing on \(\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\).

Proof

By [33, Corollary 3.11], \(\psi ^{*}(P\varrho P\Vert P\sigma P|\cdot )\) is a finite-valued convex function on \((1,+\infty )\). Hence, it can be written as \(\psi ^{*}(P\varrho P\Vert P\sigma P|\alpha )=\sup _{i\in {{\mathcal {I}}}}\{c_i\alpha +d_i\}\), \(\alpha \in (1,+\infty )\), with some \(c_i,d_i\in \mathbb {R}\) and an index set \({{\mathcal {I}}}\). This implies that \(\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u)=(1-u)\sup _{i\in {{\mathcal {I}}}}\{c_i(1-u)^{-1}+d_i\}= \sup _{i\in {{\mathcal {I}}}}\{c_i+d_i(1-u)\}\), and therefore \(\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|\cdot )\) is also convex and finite-valued on (0, 1), and thus it is continuous as well. The limits in (3.70)–(3.71) follow by a straightforward computation, using in the second limit that \(\lim _{\alpha \rightarrow +\infty }D_{\alpha }^{*}(\omega \Vert \tau )=D_{\max }(\omega \Vert \tau )\) for finite-rank states \(\omega ,\tau \) (see [35] or Corollary 3.46). Convexity and continuity of the extensions are obvious from the definitions. The monotonicity in (ii) and (iii) is immediate from Corollary 3.34. \(\square \)
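The endpoint limits (3.70)–(3.71) and the convexity of \(\tilde{\psi }^{*}\) can be checked numerically in the commuting finite-rank case, where \(\psi ^{*}(\cdot |\alpha )=\log \sum _i p(i)^{\alpha }q(i)^{1-\alpha }\). The Python sketch below is an illustration under stated assumptions: the helper names `psi` and `psi_tilde` and the weight vectors are hypothetical, the weights are strictly positive, and a log-sum-exp evaluation is used so that large \(\alpha \) does not overflow.

```python
import math

def psi(p, q, alpha):
    """psi^*(alpha) = log sum_i p(i)^alpha q(i)^(1-alpha), strictly positive weights.

    Evaluated with the log-sum-exp trick to avoid overflow for large alpha.
    """
    terms = [alpha * math.log(pi) + (1 - alpha) * math.log(qi)
             for pi, qi in zip(p, q)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

def psi_tilde(p, q, u):
    """psi-tilde^*(u) = (1 - u) * psi^*(1 / (1 - u)) for u in (0, 1)."""
    return (1 - u) * psi(p, q, 1 / (1 - u))

# Hypothetical unnormalized weights (total weight of p is 0.5).
p = [0.3, 0.2]
q = [0.1, 0.4]

near_zero = psi_tilde(p, q, 1e-6)       # compare with log Tr rho, cf. (3.70)
near_one = psi_tilde(p, q, 1 - 1e-6)    # compare with D_max, cf. (3.71)
log_trace = math.log(sum(p))
d_max = math.log(max(pi / qi for pi, qi in zip(p, q)))

# Midpoint convexity on a grid in (0, 1), with a small rounding tolerance.
grid = [k / 100 for k in range(1, 100)]
convex_on_grid = all(
    psi_tilde(p, q, (a + b) / 2)
    <= (psi_tilde(p, q, a) + psi_tilde(p, q, b)) / 2 + 1e-12
    for a in grid for b in grid
)
```

In this example `near_zero` is close to `log_trace`, `near_one` is close to `d_max`, and `convex_on_grid` evaluates to `True`.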

Corollary 3.48

For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), the functions

$$\begin{aligned} \psi ^{*}(\varrho \Vert \sigma |\alpha ){}_{\textrm{fa}}&{:}{=} {\left\{ \begin{array}{ll} \sup _{P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+}\psi ^{*}(P\varrho P\Vert P\sigma P|\alpha ),&{}\,\, \,\, \alpha \in [1,+\infty ],\\ +\infty ,&{}\,\, \,\, \alpha \in (-\infty ,1), \end{array}\right. }\\ \tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}&{:}{=} {\left\{ \begin{array}{ll} \sup _{P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+}\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u),&{}\,\, \,\, u\in [0,1],\\ +\infty ,&{}\,\, \,\, u\in \mathbb {R}\setminus [0,1], \end{array}\right. }\\ \tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}&{:}{=} {\left\{ \begin{array}{ll} \sup _{n\in \mathbb {N}}\frac{1}{n}\tilde{\psi }^{*}(\varrho ^{\otimes n}\Vert \sigma ^{\otimes n}|u){}_{\textrm{fa}},&{} \,\, \,\, \,\, u\in [0,1],\\ +\infty ,&{}\,\, \,\, \,\, u\in \mathbb {R}\setminus [0,1], \end{array}\right. } \end{aligned}$$

are convex and lower semi-continuous on \(\mathbb {R}\) (and on \(\mathbb {R}\cup \{+\infty \}\) in the case of \(\psi ^{*}(\varrho \Vert \sigma |\cdot ){}_{\textrm{fa}}\)).

Proof

If \(\varrho ^0\nleq \sigma ^0\) then all three functions are easily seen to be constant \(+\infty \) on \(\mathbb {R}\), and hence for the rest we assume that \(\varrho ^0\le \sigma ^0\). Since the supremum of convex functions is again convex, and the supremum of lower semi-continuous functions is again lower semi-continuous, both properties hold for the above functions on \([1,+\infty ]\), [0, 1], and [0, 1], respectively, according to Lemma 3.47, and it is trivial to verify that the same is true on the whole of \(\mathbb {R}\). \(\square \)

Remark 3.49

It is clear that

$$\begin{aligned} \tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}&= (1-u)\psi ^{*}(\varrho \Vert \sigma |1/(1-u)){}_{\textrm{fa}}= (1-u)\log Q_{1/(1-u)}^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}},\,\, \,\, \,\, u\in (0,1), \end{aligned}$$

and (3.70)–(3.71) yield

$$\begin{aligned} \tilde{\psi }^{*}(\varrho \Vert \sigma |0){}_{\textrm{fa}}&=\log {{\,\textrm{Tr}\,}}\varrho = \tilde{\psi }^{*}(\varrho \Vert \sigma |0)_{\overline{\textrm{fa}}}, \end{aligned}$$
(3.72)
$$\begin{aligned} \tilde{\psi }^{*}(\varrho \Vert \sigma |1){}_{\textrm{fa}}&= D_{\max }(\varrho \Vert \sigma ) = \tilde{\psi }^{*}(\varrho \Vert \sigma |1)_{\overline{\textrm{fa}}}. \end{aligned}$$
(3.73)

This motivates the definitions

$$\begin{aligned} \tilde{\psi }^{*}(\varrho \Vert \sigma |0)&{:}{=}\log {{\,\textrm{Tr}\,}}\varrho , \end{aligned}$$
(3.74)
$$\begin{aligned} \tilde{\psi }^{*}(\varrho \Vert \sigma |1)&{:}{=} D_{\max }(\varrho \Vert \sigma ). \end{aligned}$$
(3.75)

Remark 3.50

By Corollary 3.48, if \(\varrho \) and \(\sigma \) are such that \(Q_{\alpha }^{*}(\varrho \Vert \sigma )= Q_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\), \(\alpha >1\), then \(\psi ^{*}(\varrho \Vert \sigma |\cdot )\) and \(\tilde{\psi }^{*}(\varrho \Vert \sigma |\cdot )\) are convex and lower semi-continuous on \((1,+\infty )\) and on (0, 1), respectively. In particular, this holds when both \(\varrho \) and \(\sigma \) are trace-class, according to Proposition 3.40.

Recall the definition of the finite-dimensional approximation of the sandwiched Rényi divergences as a special case of Definition 3.35: For \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\),

$$\begin{aligned} D_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}&{:}{=} \sup _{P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+}D_{\alpha }^{*}(P\varrho P\Vert P\sigma P) =\frac{1}{\alpha -1}\psi ^{*}(\varrho \Vert \sigma |\alpha ){}_{\textrm{fa}}. \end{aligned}$$

Analogously, we define

$$\begin{aligned} D_{\alpha }^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}&{:}{=} \sup _{n\in \mathbb {N}}\frac{1}{n}D_{\alpha }^{*}(\varrho ^{\otimes n}\Vert \sigma ^{\otimes n}){}_{\textrm{fa}}=\frac{1}{\alpha -1}\psi ^{*}(\varrho \Vert \sigma |\alpha )_{\overline{\textrm{fa}}}. \end{aligned}$$

Definition 3.51

For \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \(r\in \mathbb {R}\), let

$$\begin{aligned} H_r^{*}(\varrho \Vert \sigma )&{:}{=} \sup _{\alpha >1}\frac{\alpha -1}{\alpha } \left[ r-D_{\alpha }^{*}(\varrho \Vert \sigma )\right] = \sup _{u\in (0,1)}\left\{ ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u)\right\} , \end{aligned}$$
(3.76)
$$\begin{aligned} H_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}&{:}{=}\sup _{\alpha >1}\frac{\alpha -1}{\alpha } \left[ r-D_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\right] =\sup _{u\in (0,1)}\left\{ ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}\right\} , \end{aligned}$$
(3.77)
$$\begin{aligned} H_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}&{:}{=} \sup _{\alpha >1}\frac{\alpha -1}{\alpha } \left[ r-D_{\alpha }^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}\right] = \sup _{u\in (0,1)}\left\{ ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}\right\} , \end{aligned}$$
(3.78)
$$\begin{aligned} {\hat{H}}_r^{*}(\varrho \Vert \sigma )&{:}{=} \sup _{u\in [0,1]}\left\{ ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u)\right\} , \end{aligned}$$
(3.79)
$$\begin{aligned} {\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}&{:}{=} \max _{u\in [0,1]}\left\{ ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}\right\} = \max _{u\in \mathbb {R}}\left\{ ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}\right\} , \end{aligned}$$
(3.80)
$$\begin{aligned} {\hat{H}}_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}&{:}{=} \max _{u\in [0,1]}\left\{ ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}\right\} = \max _{u\in \mathbb {R}}\left\{ ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}\right\} . \end{aligned}$$
(3.81)

Here, \(H_r^{*}(\varrho \Vert \sigma )\) and \({\hat{H}}_r^{*}(\varrho \Vert \sigma )\) are two different versions of the Hoeffding anti-divergence of \(\varrho \) and \(\sigma \) with parameter \(r\in \mathbb {R}\), and the rest of the quantities are different finite-dimensional approximations.
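To make the two expressions in (3.76) concrete, the following Python sketch evaluates both in the commuting finite-rank case and approximates the supremum on a grid. It is a sketch under stated assumptions: the helper names, the example vectors, and the grid size are hypothetical, and the grid maximum only approximates the supremum from below.

```python
import math

def psi(p, q, alpha):
    """psi^*(alpha) = log sum_i p(i)^alpha q(i)^(1-alpha), via log-sum-exp."""
    terms = [alpha * math.log(pi) + (1 - alpha) * math.log(qi)
             for pi, qi in zip(p, q)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

def psi_tilde(p, q, u):
    """psi-tilde^*(u) = (1 - u) * psi^*(1 / (1 - u)) for u in (0, 1)."""
    return (1 - u) * psi(p, q, 1 / (1 - u))

def hoeffding_anti(p, q, r, grid=2000):
    """Lower grid approximation of sup_{u in (0,1)} { u r - psi-tilde^*(u) }."""
    return max(k / grid * r - psi_tilde(p, q, k / grid) for k in range(1, grid))

# Hypothetical probability vectors and rate.
p = [0.6, 0.3, 0.1]
q = [0.2, 0.3, 0.5]
r = 1.0

# The two expressions in (3.76) agree term by term under u = (alpha - 1) / alpha.
gaps = []
for alpha in [1.5, 2.0, 3.0, 10.0]:
    u = (alpha - 1) / alpha
    d_alpha = psi(p, q, alpha) / (alpha - 1)
    alpha_form = (alpha - 1) / alpha * (r - d_alpha)
    u_form = u * r - psi_tilde(p, q, u)
    gaps.append(abs(alpha_form - u_form))

h = hoeffding_anti(p, q, r)
```

The entries of `gaps` vanish up to rounding, reflecting the identity \(\tilde{\psi }^{*}(\varrho \Vert \sigma |u)=uD_{\alpha }^{*}(\varrho \Vert \sigma )\) for \(u=(\alpha -1)/\alpha \).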

Remark 3.52

\(H_r^{*}\) and \({\hat{H}}_r^{*}\) are called anti-divergences because for trace-class operators they are monotone non-decreasing under quantum operations; this is immediate from the monotone non-increasing property of \(D_{\alpha }^{*}\) under such maps for \(\alpha > 1\); see [6, Theorem 14], [27, Theorem 3.14], or Theorem 4.14.

The Hoeffding anti-divergences are defined as Legendre-Fenchel transforms (polar functions). For some of them this transformation can be reversed as follows; this will be used in Theorem 4.14.

Lemma 3.53

For any \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\),

$$\begin{aligned} \tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}&= \sup _{r\in \mathbb {R}}\left\{ ur- H_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\right\} , \,\, \,\, \,\, u\in \mathbb {R}\setminus \{0,1\}, \end{aligned}$$
(3.82)
$$\begin{aligned} \tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}&= \sup _{r\in \mathbb {R}}\left\{ ur-{\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\right\} , \,\, \,\, \,\, u\in \mathbb {R} \end{aligned}$$
(3.83)
$$\begin{aligned} \tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}&= \sup _{r\in \mathbb {R}}\left\{ ur-{\hat{H}}_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}\right\} , \,\, \,\, \,\, u\in \mathbb {R}. \end{aligned}$$
(3.84)

Proof

By Corollary 3.48, \(\tilde{\psi }^{*}(\varrho \Vert \sigma |\cdot ){}_{\textrm{fa}}\) and \(\tilde{\psi }^{*}(\varrho \Vert \sigma |\cdot )_{\overline{\textrm{fa}}}\) are convex and lower semi-continuous on \(\mathbb {R}\), and hence (3.83)–(3.84) follow from (3.80)–(3.81) according to the bipolar theorem (see, e.g., [11, Proposition 4.1]). Likewise, \(r\mapsto H_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\) is the polar function of \(f(u){:}{=}\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}+(+\infty )\textbf{1}_{\{0,1\}}(u)\), \(u\in \mathbb {R}\), and hence, by [11, Proposition 4.1], its polar function is the largest convex and lower semi-continuous minorant of f, which is exactly \(\tilde{\psi }^{*}(\varrho \Vert \sigma |\cdot ){}_{\textrm{fa}}\). This proves (3.82). \(\square \)

The different variants of the Hoeffding anti-divergence defined above appear naturally in different bounds on the strong converse exponents; see Sect. 4. Our next goal is to explore their relations; in particular, to find sufficient conditions for some or all of them to coincide. Note that this is not always the case, as shown in Examples 3.59–3.60.

Lemma 3.54

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\). For any \(u\in [0,1]\), and any \(r\in \mathbb {R}\),

and

$$\begin{aligned} Q_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}=Q_{\alpha }^{*}(\varrho \Vert \sigma ),\,\, \,\, \alpha >1&\,\, \,\, \Longleftrightarrow \,\, \,\, \tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}= \tilde{\psi }^{*}(\varrho \Vert \sigma |u),\,\, \,\, u\in [0,1] \end{aligned}$$
(3.85)
$$\begin{aligned}&\,\, \,\, \Longrightarrow \,\, \,\, {\left\{ \begin{array}{ll} H_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}= H_r^{*}(\varrho \Vert \sigma ),\,\, \,\, r\in \mathbb {R},\\ {\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}={\hat{H}}_r^{*}(\varrho \Vert \sigma ),\,\, \,\, r\in \mathbb {R}. \end{array}\right. } \end{aligned}$$
(3.86)

In particular, if \(\varrho \) and \(\sigma \) are trace-class, or \(\sigma \) is compact and \(\varrho \in {{\mathcal {B}}}^{\infty }({\mathcal {H}},\sigma )\), then all equalities in (3.85)–(3.86) hold.

Proof

The inequalities are immediate from (3.50) and the definitions of the given quantities. The equivalence in (3.85) is trivial by definition, as is the implication in (3.86). The last assertion follows from Proposition 3.40. \(\square \)

Lemma 3.55

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(D_{\alpha _0}^{*}(\varrho \Vert \sigma )<+\infty \) (equivalently, \(\varrho \in {{\mathcal {L}}}^{\alpha _0}({\mathcal {H}},\sigma )\)) for some \(\alpha _0\in (1,+\infty )\). Then

$$\begin{aligned} H_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}={\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}},\,\, \,\, \,\, r\in \mathbb {R}. \end{aligned}$$
(3.87)

Proof

It is enough to prove that

$$\begin{aligned} \sup _{u\in [0,1)}\left\{ ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}\right\} =H_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}=\sup _{u\in (0,1]}\left\{ ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}\right\} . \end{aligned}$$
(3.88)

We prove the first equality, as the second one follows the same way. If \(\tilde{\psi }^{*}(\varrho \Vert \sigma |0){}_{\textrm{fa}}=+\infty \) then there is nothing to prove, and hence we assume the contrary. Also by assumption,

$$\begin{aligned} +\infty >(\alpha _0-1)D_{\alpha _0}^{*}(\varrho \Vert \sigma )=\psi ^{*}(\varrho \Vert \sigma |\alpha _0) \ge \psi ^{*}(\varrho \Vert \sigma |\alpha _0){}_{\textrm{fa}}= \frac{\tilde{\psi }^{*}(\varrho \Vert \sigma |u_0){}_{\textrm{fa}}}{1-u_0}, \end{aligned}$$

where \(u_0{:}{=}(\alpha _0-1)/\alpha _0\). By Corollary 3.48, \(\tilde{\psi }^{*}(\varrho \Vert \sigma |\cdot ){}_{\textrm{fa}}\) is convex on [0, 1], and finiteness at 0 and \(u_0\) implies \(\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}<+\infty \), \(u\in [0,u_0]\). By Lemma 3.20, we also have \(\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}>-\infty \), \(u\in [0,u_0]\). Hence, \(u\mapsto ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}\) is a finite-valued concave and upper semi-continuous function on \([0,u_0]\), whence it is also continuous on \([0,u_0]\). This proves the asserted equality. \(\square \)

Proposition 3.56

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(D_{\alpha _0}^{*}(\varrho \Vert \sigma )<+\infty \) for some \(\alpha _0\in (1,+\infty )\), and \(Q_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}=Q_{\alpha }^{*}(\varrho \Vert \sigma )\), \(\alpha >1\). Then, for every \(r\in \mathbb {R}\),

(3.89)

Proof

Immediate from Lemmas 3.54 and 3.55. \(\square \)

Remark 3.57

Some further properties of, and relations among, the different Hoeffding anti-divergences are given in Appendix A. While these are not used in the rest of the paper, they might give some extra insight into the different bounds given in Proposition 4.5.

We close this section with some statements on the possible values of the Hoeffding anti-divergences. For these, we will need the notion of the Umegaki relative entropy [45]. For two finite-rank PSD operators \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), it is defined as

$$\begin{aligned} D(\varrho \Vert \sigma ){:}{=}{\left\{ \begin{array}{ll} {{\,\textrm{Tr}\,}}\varrho ({{\,\mathrm{\widehat{\log }}\,}}\varrho -{{\,\mathrm{\widehat{\log }}\,}}\sigma ),&{}\varrho ^0\le \sigma ^0,\\ +\infty ,&{}\text {otherwise}, \end{array}\right. } \end{aligned}$$

where \({{\,\mathrm{\widehat{\log }}\,}}x{:}{=}\log x\), \(x>0\), and \({{\,\mathrm{\widehat{\log }}\,}}0{:}{=}0\). For positive normal functionals on a von Neumann algebra, it may be defined using the relative modular operator [1]. In the simple case of PSD trace-class operators \(\varrho ,\sigma \) on a separable Hilbert space \({\mathcal {H}}\), their relative entropy may be expressed equivalently as [19, Theorem 4.5]

$$\begin{aligned} D(\varrho \Vert \sigma )=\lim _{\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\ni P\nearrow I}D(P\varrho P\Vert P\sigma P) =\lim _{n\rightarrow +\infty }D(P_n\varrho P_n\Vert P_n\sigma P_n), \end{aligned}$$

where the second equality holds for any increasing sequence \(P_n\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\), \(n\in \mathbb {N}\), converging strongly to I. For non-zero PSD trace-class operators \(\varrho ,\sigma \) and \(\lambda ,\eta \in (0,+\infty )\), the scaling laws

$$\begin{aligned} D_{\alpha }^{*}(\lambda \varrho \Vert \eta \sigma )&=D_{\alpha }^{*}(\varrho \Vert \sigma )+\frac{\alpha }{\alpha -1}\log \lambda -\log \eta , \end{aligned}$$
(3.90)
$$\begin{aligned} H_r^{*}(\lambda \varrho \Vert \eta \sigma )&=H_{r+\log \eta }^{*}(\varrho \Vert \sigma )-\log \lambda , \end{aligned}$$
(3.91)
$$\begin{aligned} D(\lambda \varrho \Vert \eta \sigma )&= \lambda D(\varrho \Vert \sigma )+({{\,\textrm{Tr}\,}}\varrho )\lambda \log \frac{\lambda }{\eta }, \end{aligned}$$
(3.92)
$$\begin{aligned} D_{\max }(\lambda \varrho \Vert \eta \sigma )&= D_{\max }(\varrho \Vert \sigma )+\log \lambda -\log \eta , \end{aligned}$$
(3.93)

are easy to verify from the definitions (see also Remark 3.13). It was shown in [6, 27] that

$$\begin{aligned} \exists \,\alpha _0>1:\,\, D_{\alpha _0}^{*}(\varrho \Vert \sigma )<+\infty \,\, \Longrightarrow \,\, \lim _{\alpha \searrow 1}\tilde{D}_{\alpha }^{*}(\varrho \Vert \sigma ) = \inf _{\alpha >1}\tilde{D}_{\alpha }^{*}(\varrho \Vert \sigma ) = \frac{1}{{{\,\textrm{Tr}\,}}\varrho }D(\varrho \Vert \sigma ). \end{aligned}$$
(3.94)
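The scaling laws (3.90), (3.92), (3.93) and the limit in (3.94) can be sanity-checked numerically for commuting finite-rank operators. The Python sketch below is an illustration only; the helper names, the weight vectors, and the scaling factors `lam`, `eta` are hypothetical choices, and all weights are assumed strictly positive. The normalized divergence is computed as \(\tilde{D}_{\alpha }=D_{\alpha }-\frac{1}{\alpha -1}\log {{\,\textrm{Tr}\,}}\varrho \).

```python
import math

def renyi(p, q, alpha):
    """D_alpha(p||q) for strictly positive weights and alpha > 1."""
    total = sum(pi**alpha * qi**(1 - alpha) for pi, qi in zip(p, q))
    return math.log(total) / (alpha - 1)

def rel_ent(p, q):
    """Relative entropy sum_i p(i) (log p(i) - log q(i)), unnormalized weights."""
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))

def d_max(p, q):
    """log max_i p(i)/q(i)."""
    return math.log(max(pi / qi for pi, qi in zip(p, q)))

p = [0.6, 0.3, 0.1]
q = [0.2, 0.3, 0.5]
lam, eta, alpha = 2.5, 0.4, 3.0
lp = [lam * x for x in p]
eq = [eta * x for x in q]

# Differences between the left- and right-hand sides of the scaling laws.
gap_390 = renyi(lp, eq, alpha) - (
    renyi(p, q, alpha) + alpha / (alpha - 1) * math.log(lam) - math.log(eta))
gap_392 = rel_ent(lp, eq) - (
    lam * rel_ent(p, q) + sum(p) * lam * math.log(lam / eta))
gap_393 = d_max(lp, eq) - (d_max(p, q) + math.log(lam) - math.log(eta))

# (3.94): the normalized divergence tends to D / Tr(rho) as alpha -> 1 from above.
a = 1 + 1e-6
tilde_near_one = renyi(lp, eq, a) - math.log(sum(lp)) / (a - 1)
gap_394 = tilde_near_one - rel_ent(lp, eq) / sum(lp)
```

The first three gaps vanish up to rounding, and `gap_394` is small, of the order of \(\alpha -1\).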

Lemma 3.58

Let \(\varrho ,\sigma \in {{\mathcal {L}}}^1({\mathcal {H}}){}_{\gneq 0}\) be PSD trace-class operators.

(i) For every \(r\in \mathbb {R}\),

$$\begin{aligned} H_r^{*}(\varrho \Vert \sigma )\ge r-D_{\max }(\varrho \Vert \sigma ). \end{aligned}$$
(3.95)

(ii) If there exists an \(\alpha _0\in (1,+\infty )\) such that \(D_{\alpha _0}^{*}(\varrho \Vert \sigma )<+\infty \) then

$$\begin{aligned} H_r^{*}(\varrho \Vert \sigma )=&H_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}=H_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}={\hat{H}}_r^{*}(\varrho \Vert \sigma )={\hat{H}}_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}={\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\\&{\left\{ \begin{array}{ll} =-\log {{\,\textrm{Tr}\,}}\varrho ,&{} r\le \frac{1}{{{\,\textrm{Tr}\,}}\varrho }D(\varrho \Vert \sigma )-\log {{\,\textrm{Tr}\,}}\varrho ,\\ \in \big (-\log {{\,\textrm{Tr}\,}}\varrho , r-\frac{1}{{{\,\textrm{Tr}\,}}\varrho }D(\varrho \Vert \sigma )\big ),&{}r> \frac{1}{{{\,\textrm{Tr}\,}}\varrho }D(\varrho \Vert \sigma )-\log {{\,\textrm{Tr}\,}}\varrho . \end{array}\right. } \end{aligned}$$
(3.96)

(iii) If \(D_{\alpha }^{*}(\varrho \Vert \sigma )=+\infty \) for every \(\alpha \in (1,+\infty )\) then

$$\begin{aligned}&H_r^{*}(\varrho \Vert \sigma )=H_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}=H_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}= -\infty \\&<-\log {{\,\textrm{Tr}\,}}\varrho = {\hat{H}}_r^{*}(\varrho \Vert \sigma )= {\hat{H}}_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}={\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}. \end{aligned}$$
(3.97)
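The case distinction in (ii) and the bound in (i) can be observed numerically for commuting density operators, where \({{\,\textrm{Tr}\,}}\varrho ={{\,\textrm{Tr}\,}}\sigma =1\). The following Python sketch is an illustration under stated assumptions: the helper names, the probability vectors, and the three rates are hypothetical, and the supremum is replaced by a grid maximum, which approximates it from below, so the comparisons hold only up to a small grid error.

```python
import math

def psi(p, q, alpha):
    """psi^*(alpha) = log sum_i p(i)^alpha q(i)^(1-alpha), via log-sum-exp."""
    terms = [alpha * math.log(pi) + (1 - alpha) * math.log(qi)
             for pi, qi in zip(p, q)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

def hoeffding_anti(p, q, r, grid=2000):
    """Lower grid approximation of sup_{u in (0,1)} { u r - (1-u) psi^*(1/(1-u)) }."""
    best = -math.inf
    for k in range(1, grid):
        u = k / grid
        best = max(best, u * r - (1 - u) * psi(p, q, 1 / (1 - u)))
    return best

# Hypothetical probability vectors.
p = [0.6, 0.3, 0.1]
q = [0.2, 0.3, 0.5]
rel_ent = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
d_max = math.log(max(pi / qi for pi, qi in zip(p, q)))

h_low = hoeffding_anti(p, q, 0.3)    # r below D(p||q)
h_mid = hoeffding_anti(p, q, 1.0)    # r between D(p||q) and D_max(p||q)
h_high = hoeffding_anti(p, q, 2.0)   # r above D_max(p||q)
```

In this example `h_low` is approximately \(0=-\log {{\,\textrm{Tr}\,}}\varrho \), `h_mid` lies strictly between 0 and \(1-D(p\Vert q)\), and `h_high` is at least \(2-D_{\max }(p\Vert q)\) up to the grid error.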

Proof

(i) By the scaling laws (3.90)–(3.92),

$$\begin{aligned} H_r^{*}(\varrho \Vert \sigma )&= H_{r+\log {{\,\textrm{Tr}\,}}\sigma }^{*}\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg )-\log {{\,\textrm{Tr}\,}}\varrho \nonumber \\&=\sup _{\alpha >1}\frac{\alpha -1}{\alpha }\left[ r+\log {{\,\textrm{Tr}\,}}\sigma -D_{\alpha }^{*}\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg )\right] -\log {{\,\textrm{Tr}\,}}\varrho . \end{aligned}$$
(3.98)

According to Corollary 3.46, \(\displaystyle {\lim _{\alpha \rightarrow +\infty }}\) \(D_{\alpha }^{*}\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg ) =D_{\max }\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg ) =D_{\max }\big (\varrho \Vert \sigma \big )-\log {{\,\textrm{Tr}\,}}\varrho +\log {{\,\textrm{Tr}\,}}\sigma \), and hence,

$$\begin{aligned} H_r^{*}(\varrho \Vert \sigma )&\ge \lim _{\alpha \rightarrow +\infty } \frac{\alpha -1}{\alpha }\left[ r+\log {{\,\textrm{Tr}\,}}\sigma -D_{\alpha }^{*}\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg )\right] -\log {{\,\textrm{Tr}\,}}\varrho \\&=r-D_{\max }\big (\varrho \Vert \sigma \big ), \end{aligned}$$

proving (3.95).

(ii) The equalities in (3.96) follow from Proposition 3.56. Using the assumption \(D_{\alpha _0}^{*}(\varrho \Vert \sigma )<+\infty \), (3.94) and (3.92) give

$$\begin{aligned} \inf _{\alpha >1}D_{\alpha }^{*}\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg )&= \lim _{\alpha \searrow 1}D_{\alpha }^{*}\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg ) = D\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg )\nonumber \\&= \frac{1}{{{\,\textrm{Tr}\,}}\varrho }D(\varrho \Vert \sigma )-\log {{\,\textrm{Tr}\,}}\varrho +\log {{\,\textrm{Tr}\,}}\sigma . \end{aligned}$$
(3.101)

In particular, the above limit is finite, and thus

$$\begin{aligned} -\log {{\,\textrm{Tr}\,}}\varrho = \lim _{\alpha \searrow 1} \frac{\alpha -1}{\alpha }\left[ r+\log {{\,\textrm{Tr}\,}}\sigma -D_{\alpha }^{*}\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg )\right] -\log {{\,\textrm{Tr}\,}}\varrho \le H_r^{*}(\varrho \Vert \sigma ), \end{aligned}$$

where in the second expression we used (3.100), and the inequality is by definition. On the other hand, (3.100) shows that \(H_r^{*}(\varrho \Vert \sigma )>-\log {{\,\textrm{Tr}\,}}\varrho \) holds if and only if

$$\begin{aligned} r+\log {{\,\textrm{Tr}\,}}\sigma&>\inf _{\alpha >1}D_{\alpha }^{*}\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg )=D\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg )\nonumber \\&=\frac{1}{{{\,\textrm{Tr}\,}}\varrho }D(\varrho \Vert \sigma )-\log {{\,\textrm{Tr}\,}}\varrho +\log {{\,\textrm{Tr}\,}}\sigma , \end{aligned}$$
(3.102)

where the equalities are due to (3.101). Note that (3.102) is exactly the condition in the second line of (3.97), and hence we obtain the first line in (3.97). Assume now that r is as in (3.102). Then

$$\begin{aligned} \underbrace{\frac{\alpha -1}{\alpha }}_{\in (0,1)} \underbrace{\left[ r+\log {{\,\textrm{Tr}\,}}\sigma -D_{\alpha }^{*}\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg )\right] }_{\le \, r+\log {{\,\textrm{Tr}\,}}\sigma -D\Bigg (\frac{\varrho }{{{\,\textrm{Tr}\,}}\varrho }\Big \Vert \frac{\sigma }{{{\,\textrm{Tr}\,}}\sigma }\Bigg )\,\in (0,+\infty ) }-\log {{\,\textrm{Tr}\,}}\varrho <r-\frac{1}{{{\,\textrm{Tr}\,}}\varrho }D(\varrho \Vert \sigma ), \end{aligned}$$

proving the second line of (3.97).

(iii) By Lemma 3.54, \(\tilde{\psi }^{*}(\varrho \Vert \sigma |u)=\tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}=\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}=+\infty \) for every \(u\in (0,1)\), whence \(H_r^{*}(\varrho \Vert \sigma )=H_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}=H_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}= -\infty \). On the other hand, \(\tilde{\psi }^{*}(\varrho \Vert \sigma |0)=\tilde{\psi }^{*}(\varrho \Vert \sigma |0)_{\overline{\textrm{fa}}}=\tilde{\psi }^{*}(\varrho \Vert \sigma |0){}_{\textrm{fa}}=\log {{\,\textrm{Tr}\,}}\varrho \), according to Remark 3.49, and \(\tilde{\psi }^{*}(\varrho \Vert \sigma |1)=\tilde{\psi }^{*}(\varrho \Vert \sigma |1)_{\overline{\textrm{fa}}}=\tilde{\psi }^{*}(\varrho \Vert \sigma |1){}_{\textrm{fa}}=D_{\max }(\varrho \Vert \sigma )=+\infty \), where the last equality follows from Corollary 3.46. Hence, \({\hat{H}}_r^{*}(\varrho \Vert \sigma )={\hat{H}}_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}={\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}=-\log {{\,\textrm{Tr}\,}}\varrho \). \(\square \)

Example 3.59

Let \((e_n)_{n\in \mathbb {N}}\) be an orthonormal basis in \({\mathcal {H}}\), and \(\varrho {:}{=}c_1\sum _{n=1}^{+\infty }n^{-\beta }\left| e_n\right\rangle \!\left\langle e_n\right| \), \(\sigma {:}{=}c_2\sum _{n=1}^{+\infty }n^{-n^{\gamma }}\left| e_n\right\rangle \!\left\langle e_n\right| \), with some \(\beta >1\) and \(\gamma >0\), where \(c_1\) and \(c_2\) are chosen so that \(\varrho \) and \(\sigma \) are density operators. Obviously, \(\varrho \) and \(\sigma \) are commuting (classical). For \(P_N{:}{=}\sum _{n=1}^N\left| e_n\right\rangle \!\left\langle e_n\right| \), we have

$$\begin{aligned} Q_{\alpha }^{*}(P_N\varrho P_N\Vert P_N\sigma P_N)=c_1^{\alpha }c_2^{1-\alpha } \sum _{n=1}^N n^{-\alpha \beta -(1-\alpha )n^{\gamma }} \xrightarrow [N\rightarrow +\infty ]{}+\infty ,\,\, \,\, \,\, \alpha \in (1,+\infty ), \end{aligned}$$

whence

$$\begin{aligned} D_{\alpha }^{*}(\varrho \Vert \sigma )&=+\infty ,\,\, \,\, \alpha \in (1,+\infty ), \end{aligned}$$

and

$$\begin{aligned} H_r^{*}(\varrho \Vert \sigma )=H_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}=H_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}&= -\infty \\&<-\log {{\,\textrm{Tr}\,}}\varrho = {\hat{H}}_r^{*}(\varrho \Vert \sigma )= {\hat{H}}_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}={\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}, \end{aligned}$$

according to Lemma 3.58. Note also that

$$\begin{aligned}&\tilde{\psi }^{*}(\varrho \Vert \sigma |u)=\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}=\tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}=+\infty ,\,\, \,\, u\in (0,1),\\&\tilde{\psi }^{*}(\varrho \Vert \sigma |0){}_{\textrm{fa}}=\tilde{\psi }^{*}(\varrho \Vert \sigma |0)_{\overline{\textrm{fa}}}=\log {{\,\textrm{Tr}\,}}\varrho =0 <\lim _{u\searrow 0}\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}},\\&\tilde{\psi }^{*}(\varrho \Vert \sigma |1){}_{\textrm{fa}}=\tilde{\psi }^{*}(\varrho \Vert \sigma |1)_{\overline{\textrm{fa}}}=D_{\max }(\varrho \Vert \sigma )=+\infty = \lim _{u\nearrow 1}\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}. \end{aligned}$$

For the relative entropy we get

$$\begin{aligned} D(\varrho \Vert \sigma ) = c_1\sum _{n=1}^{+\infty }\frac{1}{n^{\beta }}\log \frac{c_1n^{n^{\gamma }}}{c_2n^{\beta }} = \log \frac{c_1}{c_2}+c_1\sum _{n=1}^{+\infty }\frac{(n^{\gamma }-\beta )\log n}{n^{\beta }} <+\infty , \end{aligned}$$

if \(\beta >\gamma +1\). Hence, assuming that \(D(\varrho \Vert \sigma )<+\infty \) is not sufficient for Lemma 3.55 and Lemma 3.58.

This also gives an example where

$$\begin{aligned} D(\varrho \Vert \sigma )<+\infty =\lim _{\alpha \searrow 1}D_{\alpha }^{*}(\varrho \Vert \sigma ), \end{aligned}$$

which is contrary to the case where \(D_{\alpha _0}^{*}(\varrho \Vert \sigma )<+\infty \) for some \(\alpha _0>1\); see (3.94). This kind of behaviour was already pointed out in [19, Remark 5.4].
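
The two series in Example 3.59 can be probed numerically. The following Python sketch is an illustration only and not part of the argument. It drops the constants \(c_1,c_2\), works in the log domain to avoid overflow, and fixes the illustrative values \(\alpha =3/2\), \(\beta =3\), \(\gamma =1\) (so that \(\beta >\gamma +1\)); all function names are ours.

```python
import math

def log_term(n, alpha, beta, gamma):
    # log of n**(-alpha*beta - (1 - alpha) * n**gamma); constants c1, c2 are dropped
    return ((alpha - 1.0) * n ** gamma - alpha * beta) * math.log(n)

def log_partial_sum(N, alpha, beta, gamma):
    # log of the N-th partial sum, computed with log-sum-exp to avoid overflow
    logs = [log_term(n, alpha, beta, gamma) for n in range(1, N + 1)]
    m = max(logs)
    return m + math.log(sum(math.exp(x - m) for x in logs))

def entropy_partial_sum(N, beta, gamma):
    # N-th partial sum of sum_n (n**gamma - beta) * log(n) / n**beta
    return sum((n ** gamma - beta) * math.log(n) / n ** beta for n in range(1, N + 1))

alpha, beta, gamma = 1.5, 3.0, 1.0  # illustrative choice with beta > gamma + 1
growth = [log_partial_sum(N, alpha, beta, gamma) for N in (10, 100, 1000)]
tails = [entropy_partial_sum(N, beta, gamma) for N in (10_000, 20_000)]
print("log of partial sums of Q_alpha:", growth)
print("partial sums of the entropy series:", tails)
```

With these values the logarithm of the partial sums of \(Q_{\alpha }^{*}\) keeps growing with N, while the partial sums of the relative entropy series stabilize, in line with \(D(\varrho \Vert \sigma )<+\infty =D_{\alpha }^{*}(\varrho \Vert \sigma )\). A finite computation of course only suggests, and does not prove, the asymptotic behaviour.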

Example 3.60

Let \(\varrho =\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(\varrho \) is not trace-class. Then \(\varrho =\varrho ^{\frac{\alpha -1}{2\alpha }}\varrho ^{\frac{1}{\alpha }}\varrho ^{\frac{\alpha -1}{2\alpha }}\), whence \(\varrho \in {{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma )\) with \(\varrho _{\sigma ,\alpha }=\varrho ^{\frac{1}{\alpha }}\notin {{\mathcal {L}}}^{\alpha }({\mathcal {H}})\), and

$$\begin{aligned}&\tilde{\psi }^{*}(\varrho \Vert \sigma |u)=\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}=\tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}=+\infty ,\,\, \,\, u\in (0,1),\\&\tilde{\psi }^{*}(\varrho \Vert \sigma |0){}_{\textrm{fa}}=\tilde{\psi }^{*}(\varrho \Vert \sigma |0)_{\overline{\textrm{fa}}}=\log {{\,\textrm{Tr}\,}}\varrho =+\infty =\lim _{u\searrow 0}\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}},\\&\tilde{\psi }^{*}(\varrho \Vert \sigma |1){}_{\textrm{fa}}=\tilde{\psi }^{*}(\varrho \Vert \sigma |1)_{\overline{\textrm{fa}}}=D_{\max }(\varrho \Vert \sigma )=0 < \lim _{u\nearrow 1}\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}. \end{aligned}$$

Thus, for every \(r \in \mathbb {R}\),

$$\begin{aligned} H_r^{*}(\varrho \Vert \sigma )=H_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}=H_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}&=-\infty \\&<r ={\hat{H}}_r^{*}(\varrho \Vert \sigma )={\hat{H}}_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}={\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}. \end{aligned}$$

In particular, this holds also when \(\varrho \) is compact, and obviously \(\varrho \in {{\mathcal {B}}}^{\infty }({\mathcal {H}},\varrho )\). This shows that the assumption \(D_{\alpha _0}^{*}(\varrho \Vert \sigma )<+\infty \) for some \(\alpha _0<+\infty \) is also important in this case of Proposition 3.56.

Note also that this is an example where

$$\begin{aligned} \exists \,\lim _{\alpha \rightarrow +\infty }D_{\alpha }^{*}(\varrho \Vert \sigma )\,\, (=+\infty )\,\, \ne D_{\max }(\varrho \Vert \sigma ). \end{aligned}$$

This cannot happen when \(\varrho \) and \(\sigma \) are both trace-class, according to Corollary 3.43 or Corollary 3.46.

4 The Strong Converse Exponent

4.1 The strong converse exponents and the Hoeffding anti-divergences

Before restricting our attention to the i.i.d. case in the main result, we first consider a generalization of the binary state discrimination problem described in the Introduction. First, we do not assume the hypotheses to be represented by density operators, but by general positive semi-definite operators. Second, we do not assume the problem to be i.i.d. In the most general case, a simple asymptotic binary operator discrimination problem is specified by a sequence of Hilbert spaces \({\mathcal {H}}_n\), \(n\in \mathbb {N}\), and for each \(n\in \mathbb {N}\), a pair \(\varrho _n,\sigma _n\in {{\mathcal {B}}}({\mathcal {H}}_n){}_{\gneq 0}\), representing the null and the alternative hypotheses, respectively. Since the operators are not assumed to be trace-class, the expressions in (1.1) may not make sense, and need to be modified as

$$\begin{aligned} \gamma _n(T_n|\varrho _n)&{:}{=}{{\,\textrm{Tr}\,}}\, (T_n^{1/2}\varrho _n T_n^{1/2}) ={{\,\textrm{Tr}\,}}\,(\varrho _n^{1/2}T_n\varrho _n^{1/2}),\\ \beta _n(T_n|\sigma _n)&{:}{=}{{\,\textrm{Tr}\,}}\, (T_n^{1/2}\sigma _n T_n^{1/2})= {{\,\textrm{Tr}\,}}\, (\sigma _n^{1/2} T_n\sigma _n^{1/2}), \end{aligned}$$

to define the generalized type I success and type II errors, respectively. These expressions are equal to those in (1.1) when \(\varrho _n\) and \(\sigma _n\) are trace-class.

Definition 4.1

Let \(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}{:}{=}(\varrho _n)_{n\in \mathbb {N}}\), \(\mathop {\sigma }\limits ^{{\tiny \rightarrow }}{:}{=}(\sigma _n)_{n\in \mathbb {N}}\) be as above. The strong converse exponents of the simple asymptotic binary operator discrimination problem \(H_0:\,\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\) vs. \(H_1:\,\mathop {\sigma }\limits ^{{\tiny \rightarrow }}\) with type II exponent \(r\in \mathbb {R}\) are defined as

$$\begin{aligned} \underline{\textrm{sc}}_r(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }})&{:}{=} \inf \left\{ \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log \gamma _n(T_n|\varrho _n):\, \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log \beta _n(T_n|\sigma _n)\ge r \right\} ,\\ \overline{\textrm{sc}}_r(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }})&{:}{=} \inf \left\{ \limsup _{n\rightarrow +\infty }-\frac{1}{n}\log \gamma _n(T_n|\varrho _n):\, \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log \beta _n(T_n|\sigma _n)\ge r \right\} ,\\ \textrm{sc}_r(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }})&{:}{=} \inf \left\{ \lim _{n\rightarrow +\infty }-\frac{1}{n}\log \gamma _n(T_n|\varrho _n):\, \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log \beta _n(T_n|\sigma _n)\ge r \right\} , \end{aligned}$$

where the infima are taken along all test sequences \(T_n\in {{\mathcal {B}}}({\mathcal {H}}_n)_{[0,I]}\), \(n\in \mathbb {N}\), satisfying the indicated condition, and in the last expression also that the limit exists.

We will need an extension of the notion of the Hoeffding anti-divergence in the above setting. Let

$$\begin{aligned} \psi ^{*}(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}|\alpha )&{:}{=}\limsup _{n\rightarrow +\infty }\frac{1}{n}\psi ^{*}(\varrho _n\Vert \sigma _n|\alpha ),\,\, \,\, \,\, \alpha \in (1,+\infty ),\\ \tilde{\psi }^{*}(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}|u)&{:}{=} \limsup _{n\rightarrow +\infty }\frac{1}{n}\tilde{\psi }^{*}(\varrho _n\Vert \sigma _n|u)\\&= {\left\{ \begin{array}{ll} (1-u)\psi ^{*}\big (\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}|(1-u)^{-1}\big ),&{} u\in \!(0,1),\\ \limsup _{n\rightarrow +\infty }\frac{1}{n}\log {{\,\textrm{Tr}\,}}\varrho _n,&{}u=0,\\ \limsup _{n\rightarrow +\infty }\frac{1}{n}D_{\max }(\varrho _n\Vert \sigma _n),&{}u=1, \end{array}\right. } \end{aligned}$$

where we used (3.74)–(3.75), and

$$\begin{aligned} {\hat{H}}_r^{*}\big (\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}\big ){:}{=} \sup _{u\in [0,1]}\{ur-\tilde{\psi }^{*}(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}|u)\}. \end{aligned}$$

The first inequality in the following proposition is called the optimality part of the Hoeffding bound. For trace-class operators, it can be easily obtained from the monotonicity of the sandwiched Rényi divergence under measurements; see [6, 33, 36]. If we do not assume \(\varrho _n\) and \(\sigma _n\) to be trace-class, we can still obtain it using the variational formula in (3.27), as we show below.

Proposition 4.2

For every \(r\in \mathbb {R}\),

$$\begin{aligned} {\hat{H}}_r^{*}(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }})\le \underline{\textrm{sc}}_r(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }})\le \overline{\textrm{sc}}_r(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }})\le \textrm{sc}_r(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}). \end{aligned}$$
(4.1)

Proof

All the inequalities are trivial by definition, except for the first one. Thus, we need to show that for any \(r\in \mathbb {R}\) and any \(u\in [0,1]\),

$$\begin{aligned} \underline{\textrm{sc}}_r(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }})\ge ur-\tilde{\psi }^{*}(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}|u). \end{aligned}$$
(4.2)

Let us fix \(r\in \mathbb {R}\) for the rest. First, note that for any test \(T_n\),

$$\begin{aligned} \gamma _n(T_n|\varrho _n)={{\,\textrm{Tr}\,}}\,(\varrho _n^{1/2}T_n\varrho _n^{1/2})\le {{\,\textrm{Tr}\,}}\varrho _n. \end{aligned}$$

Thus, for any sequence of tests \((T_n)_{n\in \mathbb {N}}\),

$$\begin{aligned} \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log \gamma _n(T_n|\varrho _n)\ge -\limsup _{n\rightarrow +\infty } \frac{1}{n}\log {{\,\textrm{Tr}\,}}\varrho _n=0\cdot r-\tilde{\psi }^{*}(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}|0), \end{aligned}$$

proving (4.2) for \(u=0\). For \(u=1\), (4.2) is trivial when \(\tilde{\psi }^{*}(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}|1)=+\infty \), and hence we assume the contrary; in particular, \(D_{\max }(\varrho _n\Vert \sigma _n)<+\infty \) for every large enough n. Let \((T_n)_{n\in \mathbb {N}}\) be a test sequence such that

$$\begin{aligned} \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log \beta _n(T_n|\sigma _n)\ge r. \end{aligned}$$

Then for any \(r'<r\) and any large enough n, \(\beta _n(T_n|\sigma _n)\le \exp (-nr')\), whence

$$\begin{aligned} \gamma _n(T_n|\varrho _n)= {{\,\textrm{Tr}\,}}\, (T_n^{1/2}\varrho _n T_n^{1/2})&\le \exp ({D_{\max }(\varrho _n\Vert \sigma _n)}){{\,\textrm{Tr}\,}}\, (T_n^{1/2}\sigma _n T_n^{1/2})\\&\le \exp (D_{\max }(\varrho _n\Vert \sigma _n)-nr'). \end{aligned}$$

Thus,

$$\begin{aligned} \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log \gamma _n(T_n|\varrho _n)\ge r'-\limsup _{n\rightarrow +\infty }\frac{1}{n}D_{\max }(\varrho _n\Vert \sigma _n) = r'-\tilde{\psi }^{*}(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}|1). \end{aligned}$$

This gives (4.2) for \(u=1\).

For the rest, let us fix a \(u\in (0,1)\), and the corresponding \(\alpha =1/(1-u)>1\). If \(\tilde{\psi }^{*}(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}|u)=+\infty \) then

(4.3)

holds trivially. Hence, we assume that \(\tilde{\psi }^{*}(\varrho _n\Vert \sigma _n|u)<+\infty \), or equivalently, \(\varrho _n\in {{\mathcal {L}}}^{\alpha }({\mathcal {H}}_n,\sigma _n)\) for every large enough n. In particular, the variational formula (3.27) holds (with \(z=\alpha )\).

Consider now a sequence of tests \((T_n)_{n\in \mathbb {N}}\) such that \(\liminf _{n\rightarrow +\infty }\!-\frac{1}{n}\!\log \!{{\,\textrm{Tr}\,}}(T_n^{1/2}\!\sigma _n T_n^{1/2}) \ge r\). Then \({{\,\textrm{Tr}\,}}\,(T_n^{1/2}\sigma _n T_n^{1/2})<+\infty \) for every large enough n, and we have

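$$\begin{aligned} {{\,\textrm{Tr}\,}}\big (T_n^{1/2}\sigma _n^{\frac{\alpha -1}{\alpha }}T_n^{1/2}\big )^{\frac{\alpha }{\alpha -1}} \le {{\,\textrm{Tr}\,}}\,(T_n^{1/2}\sigma _n T_n^{1/2}) <+\infty , \end{aligned}$$
(4.4)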

where the first inequality is due to the operator Jensen inequality [7, Theorem 11]. Hence, \(T_n\in {{\mathcal {B}}}({\mathcal {H}}_n)_{\sigma _n,\alpha ,\alpha }\). If \({{\,\textrm{Tr}\,}}\big (T_n^{1/2}\sigma _n^{\frac{\alpha -1}{\alpha }}T_n^{1/2}\big )^{\frac{\alpha }{\alpha -1}}>0\) then the variational formula (3.27) yields

$$\begin{aligned} \psi ^{*}(\varrho _n\Vert \sigma _n|\alpha )\!&\ge \! \alpha \log {{\,\textrm{Tr}\,}}\,(T_n^{1/2}\varrho _n T_n^{1/2}) +(1-\alpha )\log {{\,\textrm{Tr}\,}}\big (T_n^{1/2}\sigma _n^{\frac{\alpha -1}{\alpha }}T_n^{1/2}\big )^{\frac{\alpha }{\alpha -1}}\\&\ge \alpha \log {{\,\textrm{Tr}\,}}\,( T_n^{1/2}\varrho _n T_n^{1/2}) +(1-\alpha )\log {{\,\textrm{Tr}\,}}\big (T_n^{1/2}\sigma _nT_n^{1/2}\big ), \end{aligned}$$

where the second inequality is due to (4.4). In particular, we also have \({{\,\textrm{Tr}\,}}\,(T_n^{1/2}\varrho _n T_n^{1/2})<+\infty \). By a simple rearrangement, we get

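$$\begin{aligned} -\frac{1}{n}\log {{\,\textrm{Tr}\,}}\,(T_n^{1/2}\varrho _n T_n^{1/2})&\ge \frac{\alpha -1}{\alpha }\Bigg (-\frac{1}{n}\log {{\,\textrm{Tr}\,}}\,(T_n^{1/2}\sigma _n T_n^{1/2})\Bigg ) -\frac{1}{\alpha }\frac{1}{n}\psi ^{*}(\varrho _n\Vert \sigma _n|\alpha )\\&= u\Bigg (-\frac{1}{n}\log {{\,\textrm{Tr}\,}}\,(T_n^{1/2}\sigma _n T_n^{1/2})\Bigg ) -\frac{1}{n}\tilde{\psi }^{*}(\varrho _n\Vert \sigma _n|u). \end{aligned}$$
(4.5)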

If \({{\,\textrm{Tr}\,}}\big (T_n^{1/2}\sigma _n^{\frac{\alpha -1}{\alpha }}T_n^{1/2}\big )^{\frac{\alpha }{\alpha -1}}=0\) then \(T_n^{1/2}\sigma _n^{\frac{\alpha -1}{\alpha }}T_n^{1/2}=0\). Since \(\varrho _n\in {{\mathcal {L}}}^{\alpha }({\mathcal {H}}_n,\sigma _n)\), this implies \(T_n^{1/2}\varrho _n T_n^{1/2}=0\), according to Lemma 3.1, and therefore (4.5) holds trivially, with both sides equal to \(+\infty \).

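Taking the liminf in (4.5) yields

$$\begin{aligned} \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}\,(T_n^{1/2}\varrho _n T_n^{1/2})\ge ur-\tilde{\psi }^{*}(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}|u). \end{aligned}$$

Since this holds for every test sequence as above, we get \(ur-\tilde{\psi }^{*}(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }}|u)\le \underline{\textrm{sc}}_r(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\Vert \mathop {\sigma }\limits ^{{\tiny \rightarrow }})\), as required. \(\square \)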

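For the rest, we restrict our attention to the i.i.d. case, where

$$\begin{aligned} {\mathcal {H}}_n={\mathcal {H}}^{\otimes n},\,\, \,\, \,\, \varrho _n=\varrho ^{\otimes n},\,\, \,\, \,\, \sigma _n=\sigma ^{\otimes n},\,\, \,\, \,\, n\in \mathbb {N}, \end{aligned}$$

for some Hilbert space \({\mathcal {H}}\) and \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\). Note that by Lemma 3.22, \(\tilde{\psi }^{*}\big ((\varrho ^{\otimes n})_{n\in \mathbb {N}}\Vert (\sigma ^{\otimes n})_{n\in \mathbb {N}}|u\big )= \tilde{\psi }^{*}(\varrho \Vert \sigma |u)\), \(u\in (0,1)\), \(n\in \mathbb {N}\), and the same identity is straightforward to verify for \(u=0,1\), whence

$$\begin{aligned} {\hat{H}}_r^{*}\big ((\varrho ^{\otimes n})_{n\in \mathbb {N}}\Vert (\sigma ^{\otimes n})_{n\in \mathbb {N}}\big )={\hat{H}}_r^{*}(\varrho \Vert \sigma ),\,\, \,\, \,\, r\in \mathbb {R}. \end{aligned}$$

We replace the notations \(\mathop {\varrho }\limits ^{{\tiny \rightarrow }}\) and \(\mathop {\sigma }\limits ^{{\tiny \rightarrow }}\) with \(\varrho \) and \(\sigma \), respectively, in the strong converse exponents introduced above. Let

$$\begin{aligned} \underline{\textrm{sc}}_r(\varrho \Vert \sigma )_\mathrm{{f}},\,\, \,\, \,\, \overline{\textrm{sc}}_r(\varrho \Vert \sigma )_\mathrm{{f}},\,\, \,\, \,\, \textrm{sc}_r(\varrho \Vert \sigma )_\mathrm{{f}} \end{aligned}$$

be defined the same way as \(\underline{\textrm{sc}}_r(\varrho \Vert \sigma )\), \(\overline{\textrm{sc}}_r(\varrho \Vert \sigma )\), and \(\textrm{sc}_r(\varrho \Vert \sigma )\), respectively, but with the restriction that only finite-rank tests are used. Obviously,

$$\begin{aligned} \underline{\textrm{sc}}_r(\varrho \Vert \sigma )\le \underline{\textrm{sc}}_r(\varrho \Vert \sigma )_\mathrm{{f}},\,\, \,\, \,\, \overline{\textrm{sc}}_r(\varrho \Vert \sigma )\le \overline{\textrm{sc}}_r(\varrho \Vert \sigma )_\mathrm{{f}},\,\, \,\, \,\, \textrm{sc}_r(\varrho \Vert \sigma )\le \textrm{sc}_r(\varrho \Vert \sigma )_\mathrm{{f}}. \end{aligned}$$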

The following lower bound follows by a straightforward adaptation of Nagaoka’s method [36].

Proposition 4.3

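Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\). For every \(r\in \mathbb {R}\),

$$\begin{aligned} {\hat{H}}_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}\le \underline{\textrm{sc}}_r(\varrho \Vert \sigma )_\mathrm{{f}}. \end{aligned}$$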

Proof

Let us fix an \(r\in \mathbb {R}\). We need to prove that for every \(u\in [0,1]\),

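$$\begin{aligned} \underline{\textrm{sc}}_r(\varrho \Vert \sigma )_\mathrm{{f}}\ge ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}. \end{aligned}$$
(4.6)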

The cases \(u=0\) and \(u=1\) can be proved exactly the same way as in the proof of Proposition 4.2 above. For the rest, let us fix a \(u\in (0,1)\), with the corresponding \(\alpha =1/(1-u)>1\). If \(\tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}=+\infty \) then (4.6) holds trivially, and hence for the rest we assume that \(\tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}<+\infty \). In particular, we have \(\varrho ^0\le \sigma ^0\), according to Lemma 3.42.

Let \(T_n\in {{\mathcal {B}}}({\mathcal {H}}^{\otimes n})_{[0,I]}\), \(n\in \mathbb {N}\), be a sequence of finite-rank tests such that

$$\begin{aligned} \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}T_n^{1/2}\sigma ^{\otimes n}T_n^{1/2}\ge r. \end{aligned}$$

Assume first that \(T_n^{1/2}\varrho ^{\otimes n}T_n^{1/2}\ne 0\), whence, by the assumption that \(\varrho ^0\le \sigma ^0\), we also have \(T_n^{1/2}\sigma ^{\otimes n}T_n^{1/2}\ne 0\). By Lemma 3.37,

$$\begin{aligned} Q_{\alpha }^{*}(\varrho ^{\otimes n}\Vert \sigma ^{\otimes n}){}_{\textrm{fa}}&\ge Q_{\alpha }^{*}(T_n^{1/2}\varrho ^{\otimes n}T_n^{1/2}\Vert T_n^{1/2}\sigma ^{\otimes n}T_n^{1/2})\\&\ge \big ({{\,\textrm{Tr}\,}}T_n^{1/2}\varrho ^{\otimes n}T_n^{1/2}\big )^{\alpha } \big ({{\,\textrm{Tr}\,}}T_n^{1/2}\sigma ^{\otimes n}T_n^{1/2}\big )^{1-\alpha }, \end{aligned}$$

where the second inequality follows from Corollary 3.27. A simple rearrangement yields

$$\begin{aligned} -\frac{1}{n}\log {{\,\textrm{Tr}\,}}T_n^{1/2}\varrho ^{\otimes n}T_n^{1/2}&\ge \frac{\alpha -1}{\alpha }\Bigg (-\frac{1}{n}\log {{\,\textrm{Tr}\,}}T_n^{1/2}\sigma ^{\otimes n}T_n^{1/2}\Bigg ) -\frac{1}{\alpha }\frac{1}{n}\psi ^{*}(\varrho ^{\otimes n}\Vert \sigma ^{\otimes n}|\alpha ){}_{\textrm{fa}}\\&= u\Bigg (-\frac{1}{n}\log {{\,\textrm{Tr}\,}}T_n^{1/2}\sigma ^{\otimes n}T_n^{1/2}\Bigg ) -\underbrace{\frac{1}{n}\tilde{\psi }^{*}(\varrho ^{\otimes n}\Vert \sigma ^{\otimes n}|u){}_{\textrm{fa}}}_{ \le \tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}}\\&\ge u\Bigg (-\frac{1}{n}\log {{\,\textrm{Tr}\,}}T_n^{1/2}\sigma ^{\otimes n}T_n^{1/2}\Bigg ) -\tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}. \end{aligned}$$

These inequalities also hold (trivially, with the leftmost expression being \(+\infty \)) when \(T_n^{1/2}\varrho ^{\otimes n}T_n^{1/2}=0\). Thus, we get

$$\begin{aligned} \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}T_n^{1/2}\varrho ^{\otimes n}T_n^{1/2}&\ge ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u)_{\overline{\textrm{fa}}}. \end{aligned}$$

Since this holds for every test sequence as above, (4.6) follows. \(\square \)
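
As a sanity check of the key inequality in the proof above, one can test its single-copy commuting analogue numerically. The Python sketch below assumes finite probability vectors p, q with full support (so that all variants of \(\tilde{\psi }^{*}\) coincide and reduce to the classical expression), and tests t with entries in (0, 1]; the vectors, the random seed and the function names are illustrative choices of ours, not taken from the text.

```python
import math
import random

def psi_tilde(p, q, u):
    # (1 - u) * log sum_i p_i**alpha * q_i**(1 - alpha), alpha = 1/(1 - u), 0 < u < 1
    alpha = 1.0 / (1.0 - u)
    logs = [alpha * math.log(a) + (1.0 - alpha) * math.log(b) for a, b in zip(p, q)]
    m = max(logs)
    return (1.0 - u) * (m + math.log(sum(math.exp(x - m) for x in logs)))

def bound_gap(p, q, t, u):
    # left side minus right side of  -log<t,p> >= u * (-log<t,q>) - psi_tilde(u)
    tp = sum(ti * a for ti, a in zip(t, p))
    tq = sum(ti * b for ti, b in zip(t, q))
    return -math.log(tp) - (u * (-math.log(tq)) - psi_tilde(p, q, u))

rng = random.Random(0)  # fixed seed so that the run is reproducible
p = [0.5, 0.3, 0.2]
q = [0.1, 0.2, 0.7]
gaps = [
    bound_gap(p, q, [rng.uniform(0.01, 1.0) for _ in p], rng.uniform(0.05, 0.95))
    for _ in range(2000)
]
print("smallest gap over random tests:", min(gaps))
```

In this commuting setting the inequality is an instance of Hölder's inequality, so the smallest gap should be non-negative up to rounding.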

Lemma 4.4

For finite-rank PSD operators \(\varrho ,\sigma \) on a Hilbert space, with \(0\ne \varrho ^0\le \sigma ^0\), we have

$$\begin{aligned} \textrm{sc}_r(\varrho \Vert \sigma )\le H_r^{*}(\varrho \Vert \sigma ),\,\, \,\, \,\, r\in \mathbb {R}. \end{aligned}$$
(4.7)

Proof

The inequality in (4.7) was proved in [33, Theorem 4.10] for finite-rank density operators, under the implicit assumption that \(D(\varrho \Vert \sigma )\ne D_{\max }(\varrho \Vert \sigma )\), and it was proved in [22] in the case \(D(\varrho \Vert \sigma )= D_{\max }(\varrho \Vert \sigma )\). The case of general PSD operators follows easily by replacing \(\varrho \) and \(\sigma \) with \(\varrho /{{\,\textrm{Tr}\,}}\varrho \) and \(\sigma /{{\,\textrm{Tr}\,}}\sigma \), respectively, and using the scaling laws (3.91) and \(\overline{\textrm{sc}}_r(\lambda \varrho \Vert \eta \sigma )=\overline{\textrm{sc}}_{r+\log \eta }(\varrho \Vert \sigma )-\log \lambda \). \(\square \)
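
The scaling behaviour used in the last step can be illustrated on the level of \({\hat{H}}_r^{*}\) in the finite-dimensional commuting case. The Python sketch below is only a numerical check under assumptions of ours: p, q are positive vectors, \({\hat{H}}_r^{*}\) is approximated by a maximum over a uniform grid in u, and the values of \(\lambda \), \(\eta \), r are arbitrary.

```python
import math

def psi_tilde(p, q, u):
    # classical psi-tilde on [0, 1] for positive vectors p, q (not necessarily normalized)
    if u == 0.0:
        return math.log(sum(p))                            # log Tr rho
    if u == 1.0:
        return max(math.log(a / b) for a, b in zip(p, q))  # D_max
    alpha = 1.0 / (1.0 - u)
    logs = [alpha * math.log(a) + (1.0 - alpha) * math.log(b) for a, b in zip(p, q)]
    m = max(logs)
    return (1.0 - u) * (m + math.log(sum(math.exp(x - m) for x in logs)))

def hat_h(p, q, r, grid=1000):
    # grid approximation of sup over u in [0, 1] of u*r - psi_tilde(u)
    return max(k / grid * r - psi_tilde(p, q, k / grid) for k in range(grid + 1))

p = [0.5, 0.5]
q = [0.9, 0.1]
lam, eta, r = 2.0, 3.0, 1.2  # illustrative scaling factors and type II exponent
lhs = hat_h([lam * a for a in p], [eta * b for b in q], r)
rhs = hat_h(p, q, r + math.log(eta)) - math.log(lam)
print(lhs, rhs)
```

The two printed values should agree up to rounding, since \(ur-\tilde{\psi }^{*}(\lambda \varrho \Vert \eta \sigma |u)=u(r+\log \eta )-\tilde{\psi }^{*}(\varrho \Vert \sigma |u)-\log \lambda \) holds for each \(u\in [0,1]\) separately in this setting.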

Proposition 4.5

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(\varrho ^0\le \sigma ^0\). For every \(r\in \mathbb {R}\),

Proof

By Propositions 4.2 and 4.3, we only need to prove \(\textrm{sc}_r(\varrho \Vert \sigma )_\mathrm{{f}}\le {\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\). Let \(P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\). According to Lemma 4.4, there exists a sequence of tests \((S_{P,n})_{n\in \mathbb {N}}\) such that \(S_{P,n}\le P^{\otimes n}\), and

$$\begin{aligned} \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}(P\sigma P)^{\otimes n}S_{P,n}&\ge r, \end{aligned}$$
(4.8)
$$\begin{aligned} \lim _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}(P\varrho P)^{\otimes n}S_{P,n}&\le H_r^{*}(P\varrho P\Vert P\sigma P) \nonumber \\&= \max _{u\in [0,1]}\left\{ ur-\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u)\right\} , \end{aligned}$$
(4.9)

where the equality is due to Propositions 3.56 and 3.40. Note that

$$\begin{aligned} {{\,\textrm{Tr}\,}}(P\sigma P)^{\otimes n}S_{P,n}&= {{\,\textrm{Tr}\,}}\sigma ^{\otimes n} \underbrace{\big (P^{\otimes n}S_{P,n}P^{\otimes n}\big )}_{=S_{P,n}},\,\, \,\, \,\, \\ {{\,\textrm{Tr}\,}}(P\varrho P)^{\otimes n}S_{P,n}&= {{\,\textrm{Tr}\,}}\varrho ^{\otimes n} \underbrace{\big (P^{\otimes n}S_{P,n}P^{\otimes n}\big )}_{=S_{P,n}}, \end{aligned}$$

and therefore (4.8)–(4.9) yield

$$\begin{aligned} \textrm{sc}_r(\varrho \Vert \sigma )_\mathrm{{f}}\le \max _{u\in [0,1]}\left\{ ur-\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u)\right\} . \end{aligned}$$

Thus,

$$\begin{aligned} \textrm{sc}_r(\varrho \Vert \sigma )_\mathrm{{f}}&\le \inf _{\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+}\max _{u\in [0,1]}\left\{ ur-\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u)\right\} . \end{aligned}$$
(4.10)

By Lemma 3.47, \(u\mapsto ur-\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u)\) is continuous on the compact set [0, 1] for every \(P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\). On the other hand, \(\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+\) is an upward directed partially ordered set with respect to the PSD order, and for any \(u\in [0,1]\), \(P\mapsto ur-\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u)\) is monotone decreasing on \(\mathbb {P}_f({\mathcal {H}})\), again by Lemma 3.47. Hence, by Lemma 2.5, we may exchange the inf and the max in (4.10). Thus, we get the upper bound

$$\begin{aligned} \textrm{sc}_r(\varrho \Vert \sigma )_\mathrm{{f}}&\le \max _{u\in [0,1]}\inf _{\mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+}\left\{ ur-\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u)\right\} \\&=\max _{u\in [0,1]}\left\{ ur-\sup _{P\in \mathbb {P}_f({\mathcal {H}})_{\varrho ,\sigma }^+}\tilde{\psi }^{*}(P\varrho P\Vert P\sigma P|u)\right\} \\&= \max _{u\in [0,1]}\left\{ ur-\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}\right\} = {\hat{H}}_r^{*}(\varrho \Vert \sigma )_{\textrm{fa}}, \end{aligned}$$

as required. \(\square \)

Theorem 4.6

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(D_{\alpha }^{*}(\varrho \Vert \sigma )<+\infty \) for some \(\alpha \in (1,+\infty )\), and \(Q_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}=Q_{\alpha }^{*}(\varrho \Vert \sigma )\), \(\alpha >1\). Then

$$\begin{aligned} \underline{\textrm{sc}}_r(\varrho \Vert \sigma )&=\overline{\textrm{sc}}_r(\varrho \Vert \sigma )=\textrm{sc}_r(\varrho \Vert \sigma )= \underline{\textrm{sc}}_r(\varrho \Vert \sigma )_\mathrm{{f}}=\overline{\textrm{sc}}_r(\varrho \Vert \sigma )_\mathrm{{f}}=\textrm{sc}_r(\varrho \Vert \sigma )_\mathrm{{f}}\\&= H_r^{*}(\varrho \Vert \sigma )=H_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}=H_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\\&={\hat{H}}_r^{*}(\varrho \Vert \sigma )={\hat{H}}_r^{*}(\varrho \Vert \sigma )_{\overline{\textrm{fa}}}={\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}, \quad r \in \mathbb {R}. \end{aligned}$$

On the other hand, if \(\varrho ,\sigma \) are trace-class and \(D_{\alpha }^{*}(\varrho \Vert \sigma )=+\infty \) for all \(\alpha \in (1,+\infty )\), then

Proof

Immediate from Propositions 4.5, 3.56, and Lemma 3.58. \(\square \)

As a special case of Theorem 4.6, we get the exact characterization of the strong converse exponent of discriminating quantum states on a separable Hilbert space, as follows:

Corollary 4.7

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be density operators. For every \(r\in \mathbb {R}\),

$$\begin{aligned} \underline{\textrm{sc}}_r(\varrho \Vert \sigma )&=\overline{\textrm{sc}}_r(\varrho \Vert \sigma )=\textrm{sc}_r(\varrho \Vert \sigma )= \underline{\textrm{sc}}_r(\varrho \Vert \sigma )_{f}=\overline{\textrm{sc}}_r(\varrho \Vert \sigma )_{f}=\textrm{sc}_r(\varrho \Vert \sigma )_{f}\\&= {\hat{H}}_r^{*}(\varrho \Vert \sigma )\ge 0, \end{aligned}$$
(4.11)

and

$$\begin{aligned} {\hat{H}}_r^{*}(\varrho \Vert \sigma )>0\,\, \Longleftrightarrow \,\, \exists \,\alpha>1:\,D_{\alpha }^{*}(\varrho \Vert \sigma )<+\infty \,\, \text {and}\,\, r>D(\varrho \Vert \sigma ). \end{aligned}$$
(4.13)

Proof

The equalities in (4.11) are immediate from Theorem 4.6, and the characterization of positivity in (4.13) follows from Lemma 3.58. \(\square \)
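
In the finite-dimensional commuting case, where \(D_{\alpha }^{*}(\varrho \Vert \sigma )<+\infty \) for every \(\alpha >1\) as soon as \(\sigma \) has full support, the positivity criterion of Corollary 4.7 reduces to \(r>D(\varrho \Vert \sigma )\) and is easy to observe numerically. The Python sketch below is a hedged illustration: the probability vectors, the grid approximation of the supremum over u, and the function names are our own choices.

```python
import math

def psi_tilde(p, q, u):
    # classical psi-tilde on [0, 1] for probability vectors with full support
    if u == 0.0:
        return math.log(sum(p))                            # log Tr rho
    if u == 1.0:
        return max(math.log(a / b) for a, b in zip(p, q))  # D_max
    alpha = 1.0 / (1.0 - u)
    logs = [alpha * math.log(a) + (1.0 - alpha) * math.log(b) for a, b in zip(p, q)]
    m = max(logs)
    return (1.0 - u) * (m + math.log(sum(math.exp(x - m) for x in logs)))

def hat_h(p, q, r, grid=1000):
    # grid approximation of sup over u in [0, 1] of u*r - psi_tilde(u)
    return max(k / grid * r - psi_tilde(p, q, k / grid) for k in range(grid + 1))

def relative_entropy(p, q):
    return sum(a * math.log(a / b) for a, b in zip(p, q))

p = [0.5, 0.5]
q = [0.9, 0.1]
d = relative_entropy(p, q)
for r in (0.3, 1.0, 3.0):
    print(f"r = {r}: r > D is {r > d}, hat_H_r = {hat_h(p, q, r):.6f}")
```

For this pair, the exponent vanishes for the value of r below the relative entropy and is strictly positive for the two values above it; for large r the maximum is attained at \(u=1\), giving \(r-D_{\max }(\varrho \Vert \sigma )\).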

Remark 4.8

Let \(\varrho \) and \(\sigma \) be density operators. According to the direct part of the quantum Stein’s lemma [24, 26], for every \(r<D(\varrho \Vert \sigma )\) there exists a test sequence \(T_n\in {{\mathcal {B}}}({\mathcal {H}}^{\otimes n})_{[0,1]}\), \(n\in \mathbb {N}\), such that

$$\begin{aligned} \lim _{n\rightarrow +\infty }{{\,\textrm{Tr}\,}}\varrho ^{\otimes n}(I-T_n)=0,\,\, \,\, \text {and}\,\, \,\, \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}\sigma ^{\otimes n}T_n\ge r. \end{aligned}$$
(4.14)

It was shown in [36, 38] that in the finite-dimensional case, for any test sequence \(T_n\in {{\mathcal {B}}}({\mathcal {H}}^{\otimes n})_{[0,1]}\), \(n\in \mathbb {N}\),

$$\begin{aligned} r{:}{=}\liminf _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}\sigma ^{\otimes n}T_n>D(\varrho \Vert \sigma ) \,\, \Longrightarrow \,\, \lim _{n\rightarrow +\infty }{{\,\textrm{Tr}\,}}\varrho ^{\otimes n}(I-T_n)=1. \end{aligned}$$
(4.15)

That is, if the type II error decreases with an exponent larger than the relative entropy then the type I error goes to 1; this is called the strong converse to Stein’s lemma. The optimal (lowest) speed of convergence to 1 is exponential, with the exponent being equal to the Hoeffding anti-divergence \(H_r^{*}(\varrho \Vert \sigma )\), according to [33]. Corollary 4.7 generalizes this to the infinite-dimensional case, with one important difference. While in the finite-dimensional case finiteness of the relative entropy implies strict positivity of \(H_r^{*}(\varrho \Vert \sigma )\) for every \(r>D(\varrho \Vert \sigma )\), and hence the strong converse property, in the infinite-dimensional case it might happen that \(D(\varrho \Vert \sigma )<+\infty \), yet \(H_r^{*}(\varrho \Vert \sigma )=0\) for every \(r\in \mathbb {R}\), and hence the type I error sequence does not converge to 1 with an exponential speed along a test sequence \((T_n)_{n\in \mathbb {N}}\), even if \(\liminf _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}\sigma ^{\otimes n}T_n>D(\varrho \Vert \sigma )\). According to Corollary 4.7, this happens if and only if \(D_{\alpha }^{*}(\varrho \Vert \sigma )=+\infty \) for every \(\alpha >1\). It is an open question what kind of behaviour can occur in this case; if the type II exponent is above the relative entropy, do the type I error probabilities still go to 1 (strong converse property) but with a sub-exponential speed, or may it happen that the strong converse property does not hold, i.e., (4.15) is not satisfied? Note that the monotonicity of the relative entropy under measurements implies that

$$\begin{aligned} \lim _{n\rightarrow +\infty }{{\,\textrm{Tr}\,}}\varrho ^{\otimes n}(I-T_n)=0\,\, \Longrightarrow \,\, \limsup _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}\sigma ^{\otimes n}T_n\le D(\varrho \Vert \sigma ); \end{aligned}$$

see, e.g., the proof of (2.4) in [24], or [23, Proposition 5.2]. In particular, (4.14) cannot hold with \(r>D(\varrho \Vert \sigma )\) for any test sequence.

4.2 Generalized cutoff rates

Corollary 4.7 gives an operational interpretation to the Hoeffding anti-divergences, but not directly to the sandwiched Rényi divergences. To get such an operational interpretation, one can consider the following quantity, introduced originally in [8] for the finite-dimensional classical case:

Definition 4.9

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \(\kappa \in (0,1)\). The generalized \(\kappa \)-cutoff rate \(C_{\kappa }(\varrho \Vert \sigma )\) is defined to be the infimum of all \(r_0\in \mathbb {R}\) such that \(\underline{\textrm{sc}}_r(\varrho \Vert \sigma )\ge \kappa (r-r_0)\) holds for every \(r\in \mathbb {R}\). Analogously, \(C_{\kappa }(\varrho \Vert \sigma ){}_{\textrm{fa}}\) is defined to be the infimum of all \(r_0\in \mathbb {R}\) such that \(\underline{\textrm{sc}}_r(\varrho \Vert \sigma )_\mathrm{{f}}\ge \kappa (r-r_0)\) holds for every \(r\in \mathbb {R}\).
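The definition can be unpacked into a single formula, which may be easier to work with. For fixed \(\kappa \in (0,1)\), a number \(r_0\) satisfies \(\underline{\textrm{sc}}_r(\varrho \Vert \sigma )\ge \kappa (r-r_0)\) for every \(r\in \mathbb {R}\) if and only if \(r_0\ge r-\frac{1}{\kappa }\underline{\textrm{sc}}_r(\varrho \Vert \sigma )\) for every \(r\in \mathbb {R}\). The following display is only this direct rewriting of Definition 4.9 (with the usual conventions for infinite values), and the same rewriting applies to \(C_{\kappa }(\varrho \Vert \sigma ){}_{\textrm{fa}}\):

$$\begin{aligned} C_{\kappa }(\varrho \Vert \sigma )=\sup _{r\in \mathbb {R}}\left\{ r-\frac{1}{\kappa }\,\underline{\textrm{sc}}_r(\varrho \Vert \sigma )\right\} . \end{aligned}$$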

Proposition 4.10

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\).

(i) For any \(\kappa \in (0,1)\),

$$\begin{aligned} C_{\kappa }(\varrho \Vert \sigma ){}_{\textrm{fa}}\le C_{\kappa }(\varrho \Vert \sigma )\le D_{\frac{1}{1-\kappa }}^{*}(\varrho \Vert \sigma ). \end{aligned}$$
(4.13)

(ii) If \(\kappa \) is such that there exist \(0<\kappa _1<\kappa<\kappa _2<1\) for which \(D_{\frac{1}{1-\kappa _j}}^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}<+\infty \), \(j=1,2\), then

$$\begin{aligned} D_{\frac{1}{1-\kappa }}^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\le C_{\kappa }(\varrho \Vert \sigma ){}_{\textrm{fa}}\le C_{\kappa }(\varrho \Vert \sigma )\le D_{\frac{1}{1-\kappa }}^{*}(\varrho \Vert \sigma ). \end{aligned}$$
(4.14)

If, moreover, \(D_{\frac{1}{1-\kappa }}^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}=D_{\frac{1}{1-\kappa }}^{*}(\varrho \Vert \sigma )\), then all the inequalities in (4.14) hold as equalities.

Proof

(i) The first inequality in (4.13) is trivial by definition. If \(D_{\frac{1}{1-\kappa }}^{*}(\varrho \Vert \sigma )=+\infty \) then the second inequality in (4.13) holds trivially, so we may assume the contrary. By Proposition 4.2,

$$\begin{aligned} \underline{\textrm{sc}}_r(\varrho \Vert \sigma )&\ge {\hat{H}}_r^{*}(\varrho \Vert \sigma ) = \sup _{u\in [0,1]}\{u r-\tilde{\psi }^{*}(\varrho \Vert \sigma |u)\} \ge \kappa r-\tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa )\\&= \kappa \Big ( r-\underbrace{\frac{1}{\kappa }\tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa )}_{=D_{\frac{1}{1-\kappa }}^{*}(\varrho \Vert \sigma )}\Big ), \end{aligned}$$

from which the second inequality in (4.13) follows by definition.

(ii) By the assumptions, \(\tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa _j){}_{\textrm{fa}}<+\infty \), \(j=1,2\), and hence \(\tilde{\psi }^{*}(\varrho \Vert \sigma |\cdot ){}_{\textrm{fa}}\) is a finite-valued convex function on \([\kappa _1,\kappa _2]\), according to Remark 3.36 and Corollary 3.48. Thus, in particular, its left and right derivatives at \(\kappa \) exist and are finite: \(-\infty<\partial ^{-} \tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa ){}_{\textrm{fa}}\le \partial ^{+} \tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa ){}_{\textrm{fa}}<+\infty \). Moreover, \(\varrho ^0\le \sigma ^0\), according to Lemma 3.42. For any \(r\in [\partial ^{-} \tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa ){}_{\textrm{fa}},\partial ^{+} \tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa ){}_{\textrm{fa}}]\),

$$\begin{aligned} \textrm{sc}_r(\varrho \Vert \sigma )_\mathrm{{f}}\le {\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}&=\max _{u\in [0,1]}\{u r-\tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}\} = \kappa r- \tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa ){}_{\textrm{fa}}\\&= \kappa \Big ( r-\underbrace{\frac{1}{\kappa }\tilde{\psi }^{*}(\varrho \Vert \sigma |\kappa ){}_{\textrm{fa}}}_{=D_{\frac{1}{1-\kappa }}^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}}\Big )\,, \end{aligned}$$

where the first inequality is due to Proposition 4.5. This yields the first inequality in (4.14), and the rest have already been proved in the previous point. \(\square \)

Proposition 4.10 and Corollary 3.40 yield immediately the following:

Theorem 4.11

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that \(\varrho \) and \(\sigma \) are trace-class, or \(\sigma \) is compact and \(\varrho \in {{\mathcal {B}}}^{\infty }({\mathcal {H}},\sigma )\). Let \(\kappa \in (0,1)\), and assume that \(D_{\alpha }^{*}(\varrho \Vert \sigma )<+\infty \) for \(\alpha \) in a neighborhood of \(\alpha _0{:}{=}1/(1-\kappa )\). Then

$$\begin{aligned} C_{\kappa }(\varrho \Vert \sigma ){}_{\textrm{fa}}&= C_{\kappa }(\varrho \Vert \sigma )= D_{\frac{1}{1-\kappa }}^{*}(\varrho \Vert \sigma ), \quad \textrm{or}\, \textrm{equivalently,}\\ D_{\alpha _0}^{*}(\varrho \Vert \sigma )&=C_{\frac{\alpha _0-1}{\alpha _0}}(\varrho \Vert \sigma )=C_{\frac{\alpha _0-1}{\alpha _0}}(\varrho \Vert \sigma ){}_{\textrm{fa}}. \end{aligned}$$
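As a numerical sanity check of Theorem 4.11 in the simplest possible setting, the following sketch treats two commuting finite-dimensional states, represented by probability vectors `p` and `q` (an assumption made purely for illustration; in this case the sandwiched Rényi divergence reduces to the classical one). It replaces the operational quantity \(\underline{\textrm{sc}}_r\) by the closed-form Hoeffding anti-divergence \(\sup _{u\in [0,1]}\{ur-\tilde{\psi }^{*}(u)\}\), which is justified in finite dimension by [33], and approximates the cutoff rate \(\sup _r\{r-H_r^{*}/\kappa \}\) on a grid. All function names, the example vectors and the grid sizes are hypothetical choices.

```python
import math

def log_q(p, q, alpha):
    """log sum_x p(x)^alpha q(x)^(1-alpha), computed via log-sum-exp."""
    terms = [alpha * math.log(px) + (1.0 - alpha) * math.log(qx)
             for px, qx in zip(p, q) if px > 0]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

def renyi(p, q, alpha):
    """Classical Renyi divergence D_alpha(p||q), alpha > 1, p normalized."""
    return log_q(p, q, alpha) / (alpha - 1.0)

def psi_tilde(p, q, u):
    """psi~*(p||q|u) = (1-u) log Q_{1/(1-u)}(p||q); at u = 1 this is D_max."""
    if u >= 1.0:
        return max(math.log(px / qx) for px, qx in zip(p, q) if px > 0)
    return (1.0 - u) * log_q(p, q, 1.0 / (1.0 - u))

def cutoff_rate(p, q, kappa, n_u=200, r_max=3.0, n_r=300):
    """Grid approximation of sup_r { r - H_r^* / kappa }.

    r_max is an assumed search range; it must contain the maximizing r.
    """
    us = [k / n_u for k in range(n_u + 1)]
    psis = [psi_tilde(p, q, u) for u in us]
    best = -math.inf
    for k in range(n_r + 1):
        r = r_max * k / n_r
        # Hoeffding anti-divergence as a Legendre-Fenchel transform on the u-grid
        hoeffding_anti = max(u * r - s for u, s in zip(us, psis))
        best = max(best, r - hoeffding_anti / kappa)
    return best

p = [0.7, 0.2, 0.1]
q = [0.2, 0.3, 0.5]
kappa = 0.5
c = cutoff_rate(p, q, kappa)
d = renyi(p, q, 1.0 / (1.0 - kappa))
print("cutoff rate (grid):", c, " Renyi divergence:", d)
```

Because \(\kappa \) lies on the \(u\)-grid, the grid value never exceeds \(D_{1/(1-\kappa )}\); the gap from below shrinks as the \(r\)-grid is refined.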

4.3 Monotonicity of the Rényi divergences

The operational representation of the Hoeffding anti-divergences in Sect. 4.1 can be used to obtain the monotonicity of the sandwiched Rényi divergences under quantum operations.

In the Heisenberg picture, a quantum operation from a system with Hilbert space \({\mathcal {H}}\) to a system with Hilbert space \({{\mathcal {K}}}\) is given by a unital normal completely positive map \(\Phi :\,{{\mathcal {B}}}({{\mathcal {K}}})\rightarrow {{\mathcal {B}}}({\mathcal {H}})\), which can be written as

$$\begin{aligned} \Phi :\,{{\mathcal {B}}}({{\mathcal {K}}})\ni A\mapsto V^*(A\otimes I_E)V=\sum _{i\in {{\mathcal {I}}}} V_i^*AV_i, \end{aligned}$$
(4.15)

where \(V:\,{\mathcal {H}}\rightarrow {{\mathcal {K}}}\otimes {\mathcal {H}}_E\) is an isometry, \(V_i{:}{=}(I_{{{\mathcal {K}}}}\otimes \left\langle e_i\right| )V\) for some ONB \((e_i)_{i\in {{\mathcal {I}}}}\) in the auxiliary Hilbert space \({\mathcal {H}}_E\), and the sum in (4.15) converges in the strong operator topology [17, 43]. As everywhere in the paper, we assume that \({\mathcal {H}},{{\mathcal {K}}}\) are separable, in which case the auxiliary Hilbert space \({\mathcal {H}}_E\) can be chosen to be separable, and the index set \({{\mathcal {I}}}\) in (4.15) to be countable.

In the Schrödinger picture, a density operator \(\varrho \in {{\mathcal {S}}}({\mathcal {H}})\) is transformed by the predual map

$$\begin{aligned} \Phi ^*(\varrho ){:}{=}\sum _{i\in {{\mathcal {I}}}}V_i\varrho V_i^* = \sum _{i\in {{\mathcal {I}}}}(I_{{{\mathcal {K}}}}\otimes \left\langle e_i\right| )V\varrho V^*(I_{{{\mathcal {K}}}}\otimes \left| e_i\right\rangle ) = {{\,\textrm{Tr}\,}}_EV\varrho V^*, \end{aligned}$$
(4.16)

where the sum converges in trace-norm, and the result is a density operator on \({{\mathcal {K}}}\). Note that the predual map of \(\Phi \) maps from \({{\mathcal {L}}}^1({\mathcal {H}})\) to \({{\mathcal {L}}}^1({{\mathcal {K}}})\), and is usually denoted by \(\Phi _*\). On the other hand, if \({{\mathcal {B}}}({\mathcal {H}})\) and \({{\mathcal {B}}}({{\mathcal {K}}})\) are both equipped with the ultraweak topology then the corresponding dual map of \(\Phi \) coincides with the predual map, which justifies the notation \(\Phi ^*\).

If \(\varrho \) is PSD but not trace-class then the sum in (4.16) need not converge (in the weak, equivalently, in the strong operator topology), but it may, in which case we say that \(\Phi ^*\) is defined on \(\varrho \), and define \(\Phi ^*(\varrho ){:}{=}\sum _{i\in {{\mathcal {I}}}}V_i\varrho V_i^*\). A trivial case where \(\Phi ^*\) is defined on every \(\varrho \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) is when \(\Phi \) has only finitely many operators in its Kraus decomposition, or equivalently, \({\mathcal {H}}_E\) is finite-dimensional.

Lemma 4.12

Let \(\varrho \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \(\Phi :\,{{\mathcal {B}}}({{\mathcal {K}}})\rightarrow {{\mathcal {B}}}({\mathcal {H}})\) be a unital normal completely positive map. If \(\Phi ^*\) is defined on \(\varrho \) then

$$\begin{aligned} {{\,\textrm{Tr}\,}}A^{1/2}\Phi ^*(\varrho )A^{1/2}={{\,\textrm{Tr}\,}}\Phi (A)^{1/2}\varrho \Phi (A)^{1/2},\,\, \,\, \,\, A\in {{\mathcal {B}}}({{\mathcal {K}}}){}_{\gneq 0}. \end{aligned}$$

Proof

Let \((f_j)_{j\in {{\mathcal {J}}}}\) be an orthonormal basis in \({{\mathcal {K}}}\), and \((g_k)_{k\in {{\mathcal {J}}}'}\) an orthonormal basis in \({\mathcal {H}}\). All summands below are non-negative, so the order of summation may be interchanged, and \({{\,\textrm{Tr}\,}}X^*X={{\,\textrm{Tr}\,}}XX^*\) holds (with values in \([0,+\infty ]\)) for every bounded operator \(X\). Then

$$\begin{aligned} {{\,\textrm{Tr}\,}}A^{1/2}\Phi ^*(\varrho )A^{1/2}&= \sum _{j\in {{\mathcal {J}}}}\underbrace{\left\langle A^{1/2}f_j , \Phi ^*(\varrho )A^{1/2}f_j\right\rangle }_{ =\sum _{i\in {{\mathcal {I}}}}\left\langle A^{1/2}f_j , V_i\varrho V_i^*A^{1/2}f_j\right\rangle }\\&=\sum _{i\in {{\mathcal {I}}}}\underbrace{\sum _{j\in {{\mathcal {J}}}}\left\langle A^{1/2}f_j , V_i\varrho V_i^*A^{1/2}f_j\right\rangle }_{= {{\,\textrm{Tr}\,}}A^{1/2}V_i\varrho V_i^*A^{1/2}={{\,\textrm{Tr}\,}}\varrho ^{1/2}V_i^*AV_i\varrho ^{1/2}}\\&=\sum _{i\in {{\mathcal {I}}}}{{\,\textrm{Tr}\,}}\varrho ^{1/2}V_i^*AV_i\varrho ^{1/2}\\&=\sum _{i\in {{\mathcal {I}}}}\sum _{k\in {{\mathcal {J}}}'}\left\langle \varrho ^{1/2}g_k , V_i^*AV_i\varrho ^{1/2}g_k\right\rangle \\&=\sum _{k\in {{\mathcal {J}}}'}\underbrace{\sum _{i\in {{\mathcal {I}}}}\left\langle \varrho ^{1/2}g_k , V_i^*AV_i\varrho ^{1/2}g_k\right\rangle }_{ =\left\langle \varrho ^{1/2}g_k , \Phi (A)\varrho ^{1/2}g_k\right\rangle }\\&={{\,\textrm{Tr}\,}}\varrho ^{1/2}\Phi (A)\varrho ^{1/2}\\&={{\,\textrm{Tr}\,}}\Phi (A)^{1/2}\varrho \Phi (A)^{1/2}. \end{aligned}$$

\(\square \)
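In finite dimension both sides of Lemma 4.12 reduce, by cyclicity of the trace, to the familiar duality \({{\,\textrm{Tr}\,}}A\,\Phi ^*(\varrho )={{\,\textrm{Tr}\,}}\Phi (A)\varrho \). The following sketch checks this numerically for a qubit amplitude damping channel. The channel, the damping parameter `g`, the matrices `rho` and `a`, and all helper names are illustrative assumptions that do not appear in the text above; matrices are plain nested lists so that only the standard library is needed.

```python
import math

def zeros(n, m):
    return [[0j] * m for _ in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    # conjugate transpose
    return [[complex(a[i][j]).conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def heisenberg(kraus, a):
    """Phi(A) = sum_i V_i^* A V_i (unital when sum_i V_i^* V_i = I)."""
    n = len(kraus[0][0])  # dimension of the input space H
    out = zeros(n, n)
    for v in kraus:
        out = add(out, matmul(dagger(v), matmul(a, v)))
    return out

def schroedinger(kraus, rho):
    """Phi^*(rho) = sum_i V_i rho V_i^*."""
    n = len(kraus[0])  # dimension of the output space K
    out = zeros(n, n)
    for v in kraus:
        out = add(out, matmul(v, matmul(rho, dagger(v))))
    return out

g = 0.3  # assumed damping parameter
kraus = [[[1, 0], [0, math.sqrt(1 - g)]],
         [[0, math.sqrt(g)], [0, 0]]]
rho = [[0.6, 0.2 - 0.1j], [0.2 + 0.1j, 0.4]]  # a density matrix
a = [[0.5, 0.1j], [-0.1j, 0.9]]               # a positive definite observable

lhs = trace(matmul(a, schroedinger(kraus, rho)))
rhs = trace(matmul(heisenberg(kraus, a), rho))
print(lhs, rhs)
```

The two printed numbers agree up to rounding; the infinite-dimensional content of the lemma, namely the interchange of infinite sums, is of course not visible in such a finite example.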

The transformation on multiple systems is given by

$$\begin{aligned} \Phi ^{\otimes n}:\,{{\mathcal {B}}}({{\mathcal {K}}}^{\otimes n})\ni A&\mapsto (V^{\otimes n})^*(A\otimes I_{E^n})V^{\otimes n}\nonumber \\&=\sum _{\underline{i}\in {{\mathcal {I}}}^n} (V_{i_1}{{\,\mathrm{\otimes \ldots \otimes }\,}}V_{i_n})^*A (V_{i_1}{{\,\mathrm{\otimes \ldots \otimes }\,}}V_{i_n}). \end{aligned}$$
(4.17)

If \(\Phi ^*\) is defined on \(\varrho \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) then \((\Phi ^{\otimes n})^*\) is defined on \(\varrho ^{\otimes n}\), and \((\Phi ^{\otimes n})^*(\varrho ^{\otimes n})=(\Phi ^*(\varrho ))^{\otimes n}\).
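For orientation, the case \(n=2\) of this identity can be written out as follows. This is only a formal sketch that takes the convergence of the double sum for granted (which is part of what the statement above asserts when \(\varrho \) is not trace-class):

$$\begin{aligned} (\Phi ^{\otimes 2})^*(\varrho \otimes \varrho )&=\sum _{i_1,i_2\in {{\mathcal {I}}}}(V_{i_1}\otimes V_{i_2})(\varrho \otimes \varrho )(V_{i_1}\otimes V_{i_2})^*\\&=\sum _{i_1,i_2\in {{\mathcal {I}}}}(V_{i_1}\varrho V_{i_1}^*)\otimes (V_{i_2}\varrho V_{i_2}^*) =\Phi ^*(\varrho )\otimes \Phi ^*(\varrho ). \end{aligned}$$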

In the context of operator discrimination, a transformation \(\Phi \) effectively reduces the available tests for discriminating \(\varrho \) and \(\sigma \), thereby increasing the strong converse exponent, as expressed by the following:

Lemma 4.13

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) and \(\Phi :\,{{\mathcal {B}}}({{\mathcal {K}}})\rightarrow {{\mathcal {B}}}({\mathcal {H}})\) be a unital normal completely positive map such that \(\Phi ^*\) is defined on \(\varrho \) and \(\sigma \). Then

$$\begin{aligned} \underline{\textrm{sc}}_r(\Phi ^*(\varrho )\Vert \Phi ^*(\sigma ))\ge \underline{\textrm{sc}}_r(\varrho \Vert \sigma ),\,\, \,\, \,\, \overline{\textrm{sc}}_r(\Phi ^*(\varrho )\Vert \Phi ^*(\sigma ))\ge \overline{\textrm{sc}}_r(\varrho \Vert \sigma ),\,\, \,\, \,\, r\in \mathbb {R}. \end{aligned}$$

Proof

We only prove the assertion for \(\overline{\textrm{sc}}_r\), as the proof for \(\underline{\textrm{sc}}_r\) goes the same way. We have

$$\begin{aligned} \overline{\textrm{sc}}_r(\Phi ^*(\varrho )\Vert \Phi ^*(\sigma ))&= \inf \left\{ \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}T_n^{1/2}(\Phi ^*(\varrho ))^{\otimes n}T_n^{1/2}:\right. \\&\left. \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}T_n^{1/2}(\Phi ^*(\sigma ))^{\otimes n}T_n^{1/2}\ge r \right\} \\&=\inf \left\{ \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}(\Phi ^{\otimes n}(T_n))^{1/2}\varrho ^{\otimes n}(\Phi ^{\otimes n}(T_n))^{1/2}:\,\right. \\&\left. \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}(\Phi ^{\otimes n}(T_n))^{1/2}\sigma ^{\otimes n}(\Phi ^{\otimes n}(T_n))^{1/2}\ge r \right\} \\&\ge \inf \left\{ \liminf _{n\rightarrow +\infty }\!-\!\frac{1}{n}\log {{\,\textrm{Tr}\,}}S_n^{1/2}\varrho ^{\otimes n}S_n^{1/2}:\right. \\&\qquad \qquad \left. \liminf _{n\rightarrow +\infty }-\frac{1}{n}\log {{\,\textrm{Tr}\,}}S_n^{1/2}\sigma ^{\otimes n}S_n^{1/2}\!\ge \! r \right\} \\&= \overline{\textrm{sc}}_r(\varrho \Vert \sigma ), \end{aligned}$$

where the first two infima are taken over sequences of tests \(T_n\in {{\mathcal {B}}}({{\mathcal {K}}}^{\otimes n})_{[0,I]}\), \(n\in \mathbb {N}\), satisfying the given conditions, the third infimum is taken over sequences of tests \(S_n\in {{\mathcal {B}}}({\mathcal {H}}^{\otimes n})_{[0,I]}\), \(n\in \mathbb {N}\), satisfying the given condition, the first equality is by definition, the second equality follows from Lemma 4.12, the inequality is obvious from the fact that \(T_n\in {{\mathcal {B}}}({{\mathcal {K}}}^{\otimes n})_{[0,I]}\Longrightarrow \Phi ^{\otimes n}(T_n)\in {{\mathcal {B}}}({\mathcal {H}}^{\otimes n})_{[0,I]}\), and the last equality is again by definition. \(\square \)

The proof of the following monotonicity result is similar to the one given in [37, Remark 2] for the monotonicity of the Petz-type Rényi divergences of finite-dimensional density operators. It rests on three ingredients: the bounds on the strong converse exponents given in Proposition 4.5, the monotonicity of the strong converse exponents given in Lemma 4.13, and the fact that the Rényi divergences can be recovered from the Hoeffding anti-divergences by Legendre-Fenchel transformation, i.e., Lemma 3.53.

Theorem 4.14

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\) be such that

$$\begin{aligned} D_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}= D_{\alpha }^{*}(\varrho \Vert \sigma ),\,\, \,\, \,\, \alpha >1, \end{aligned}$$
(4.18)

and let \(\Phi :\,{{\mathcal {B}}}({{\mathcal {K}}})\rightarrow {{\mathcal {B}}}({\mathcal {H}})\) be a unital normal completely positive linear map such that \(\Phi ^*\) is defined on \(\varrho \) and \(\sigma \). Then

$$\begin{aligned} D_{\alpha }^{*}(\Phi ^*(\varrho )\Vert \Phi ^*(\sigma )){}_{\textrm{fa}}\le D_{\alpha }^{*}(\varrho \Vert \sigma ),\,\, \,\, \,\, \alpha >1. \end{aligned}$$
(4.19)

Proof

By assumption, \({\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}={\hat{H}}_r^{*}(\varrho \Vert \sigma )\), \(r\in \mathbb {R}\), and

$$\begin{aligned} {\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}={\hat{H}}_r^{*}(\varrho \Vert \sigma )\le \overline{\textrm{sc}}_r(\varrho \Vert \sigma )&\le \overline{\textrm{sc}}_r(\Phi ^*(\varrho )\Vert \Phi ^*(\sigma ))\\&\le {\hat{H}}_r^{*}(\Phi ^*(\varrho )\Vert \Phi ^*(\sigma )){}_{\textrm{fa}}\,,\,\, \,\, \,\, r\in \mathbb {R}, \end{aligned}$$

where the first and the last inequalities follow from Proposition 4.5, and the second inequality from Lemma 4.13. Hence, by Lemma 3.53,

$$\begin{aligned} \tilde{\psi }^{*}(\varrho \Vert \sigma |u){}_{\textrm{fa}}&= \sup _{r\in \mathbb {R}}\left\{ ur-{\hat{H}}_r^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\right\} \ge \sup _{r\in \mathbb {R}}\left\{ ur-{\hat{H}}_r^{*}(\Phi ^*(\varrho )\Vert \Phi ^*(\sigma )){}_{\textrm{fa}}\right\} \\&= \tilde{\psi }^{*}(\Phi ^*(\varrho )\Vert \Phi ^*(\sigma )|u){}_{\textrm{fa}},\,\, \,\, u\in \mathbb {R}, \end{aligned}$$

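which is equivalent to (4.19), since \(\tilde{\psi }^{*}(\cdot \Vert \cdot |u){}_{\textrm{fa}}=u\,D_{\frac{1}{1-u}}^{*}(\cdot \Vert \cdot ){}_{\textrm{fa}}\) for \(u\in (0,1)\) (see the proof of Proposition 4.10), and \(D_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}=D_{\alpha }^{*}(\varrho \Vert \sigma )\) by the assumption (4.18). \(\square \)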

Corollary 4.15

Let \(\varrho ,\sigma \in {{\mathcal {B}}}({\mathcal {H}}){}_{\gneq 0}\), and let \(\Phi :\,{{\mathcal {B}}}({{\mathcal {K}}})\rightarrow {{\mathcal {B}}}({\mathcal {H}})\) be a unital normal completely positive map. Assume that

(a) \(\varrho \) and \(\sigma \) are both trace-class,

or

(b) \(\Phi ^*\) is defined on \(\varrho \) and \(\sigma \), \(\sigma \) and \(\Phi ^*(\sigma )\) are compact, and \(\varrho \in {{\mathcal {B}}}^{\infty }({\mathcal {H}},\sigma )\).

Then

$$\begin{aligned} D_{\alpha }^{*}(\Phi ^*(\varrho )\Vert \Phi ^*(\sigma ))\le D_{\alpha }^{*}(\varrho \Vert \sigma ),\,\, \,\, \,\, \alpha >1. \end{aligned}$$
(4.20)

Proof

If \(\varrho \) and \(\sigma \) are both trace-class then \(\Phi ^*\) is automatically defined on them, and \(\Phi ^*(\varrho )\) and \(\Phi ^*(\sigma )\) are both trace-class. Note that \(\varrho \in {{\mathcal {B}}}^{\infty }({\mathcal {H}},\sigma )\) \(\Longleftrightarrow \) \(\varrho \le \lambda \sigma \) for some \(\lambda \ge 0\), whence \(\Phi ^*(\varrho )\le \lambda \Phi ^*(\sigma )\), i.e., \(\Phi ^*(\varrho )\in {{\mathcal {B}}}^{\infty }({{\mathcal {K}}},\Phi ^*(\sigma ))\). By Remark 3.6, \(\varrho \in {{\mathcal {B}}}^{\alpha }({\mathcal {H}},\sigma )\) and \(\Phi ^*(\varrho )\in {{\mathcal {B}}}^{\alpha }({{\mathcal {K}}},\Phi ^*(\sigma ))\), \(\alpha >1\). Thus, by Lemma 3.40, the assumptions guarantee that \(D_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}=D_{\alpha }^{*}(\varrho \Vert \sigma )\) and \(D_{\alpha }^{*}(\Phi ^*(\varrho )\Vert \Phi ^*(\sigma )){}_{\textrm{fa}}=D_{\alpha }^{*}(\Phi ^*(\varrho )\Vert \Phi ^*(\sigma ))\), \(\alpha >1\), and therefore (4.20) follows immediately from Theorem 4.14. \(\square \)
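The monotonicity just proved can be observed numerically in the commuting finite-dimensional special case, where the states are probability vectors, the map \(\Phi ^*\) acts as a column-stochastic matrix, and the sandwiched Rényi divergence coincides with the classical Rényi divergence. The sketch below is only an illustration under these assumptions; the vectors `p`, `q`, the matrix `w` and the function names are hypothetical.

```python
import math

def renyi(p, q, alpha):
    """Classical Renyi divergence D_alpha(p||q) for alpha > 1, p normalized."""
    s = sum(px ** alpha * qx ** (1.0 - alpha) for px, qx in zip(p, q) if px > 0)
    return math.log(s) / (alpha - 1.0)

def push_forward(w, p):
    """(W p)(y) = sum_x W(y|x) p(x), where w[y][x] = W(y|x)."""
    return [sum(wyx * px for wyx, px in zip(row, p)) for row in w]

p = [0.7, 0.2, 0.1]
q = [0.2, 0.3, 0.5]
# Each column of w sums to 1, so w maps probability vectors to probability vectors.
w = [[0.9, 0.2, 0.1],
     [0.1, 0.8, 0.9]]

results = []
for alpha in (1.1, 2.0, 5.0):
    before = renyi(p, q, alpha)
    after = renyi(push_forward(w, p), push_forward(w, q), alpha)
    results.append((alpha, after, before))
    print(alpha, after, "<=", before)
```

For each tested \(\alpha \), the divergence of the output pair does not exceed that of the input pair, in line with the data processing inequality.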

Remark 4.16

Monotonicity of the form (4.20) in the case where both \(\varrho \) and \(\sigma \) are trace-class is a special case of [6, Theorem 14] and [27, Theorem 3.14], where monotonicity was proved in the more general setting of normal positive linear functionals on a von Neumann algebra. Our proof above is completely different from the proofs given in [6] and [27].

5 Conclusion

We have shown that for any \(\alpha >1\), the sandwiched Rényi \(\alpha \)-divergence of infinite-dimensional density operators has the same operational interpretation in the context of state discrimination as in the finite-dimensional case, and also that it coincides with the regularized measured Rényi \(\alpha \)-divergence, again analogously to the finite-dimensional case. Our results can be extended to more general operator algebraic settings, as shown in [22].

It is worth noting that while in [33] the equality of the sandwiched Rényi divergence and the regularized measured Rényi divergence was an important ingredient in showing the equality of the strong converse exponent and the Hoeffding anti-divergence, the extensions to the infinite-dimensional case can be done separately, building in each problem only on the corresponding finite-dimensional result and the recoverability of the sandwiched Rényi divergences from finite-dimensional restrictions.

We also considered the extension of the sandwiched Rényi divergences (and more generally, Rényi \((\alpha ,z)\)-divergences) to pairs of not necessarily trace-class positive semi-definite operators, and established some properties of this extension. Related to this, we considered a generalization of the state discrimination problem, where the hypotheses may be represented by general positive semi-definite operators. We gave bounds on the strong converse exponent in this problem, and showed that at least in some cases, the equality between the strong converse exponent and the Hoeffding anti-divergence still holds in this generalized setting.

There are a number of interesting problems left open in the paper. Probably the most important is clarifying whether \(Q_{\alpha }^{*}(\varrho \Vert \sigma )=Q_{\alpha }^{*}(\varrho \Vert \sigma ){}_{\textrm{fa}}\) holds for every pair of PSD operators \(\varrho ,\sigma \) and every \(\alpha >1\), and if not then whether there exist other examples for which it holds, apart from the ones given in Proposition 3.40 and Lemma 3.42. Such examples would extend the applicability of Proposition 4.5 and Theorem 4.14, among others. While less relevant for the problem of operator discrimination, the same question may be asked for more general Rényi \((\alpha ,z)\)-divergences, which seems interesting from the matrix analysis point of view. Finally, from the point of view of quantum information theory, the most important question seems to be to clarify the optimal asymptotics of the type I error probability when the type II exponent is strictly above the relative entropy of the two states (assuming that the latter is finite), while all their sandwiched Rényi \(\alpha \)-divergences are \(+\infty \) for \(\alpha >1\); see Remark 4.8.