Symmetric distinguishability as a quantum resource

We develop a resource theory of symmetric distinguishability, the fundamental objects of which are elementary quantum information sources, i.e., sources that emit one of two possible quantum states with given prior probabilities. Such a source can be represented by a classical-quantum state of a composite system $XA$, corresponding to an ensemble of two quantum states, with $X$ being classical and $A$ being quantum. We study the resource theory for two different classes of free operations: $(i)$ ${\rm{CPTP}}_A$, which consists of quantum channels acting only on $A$, and $(ii)$ conditional doubly stochastic (CDS) maps acting on $XA$. We introduce the notion of symmetric distinguishability of an elementary source and prove that it is a monotone under both these classes of free operations. We study the tasks of distillation and dilution of symmetric distinguishability, both in the one-shot and asymptotic regimes. We prove that in the asymptotic regime, the optimal rate of converting one elementary source to another is equal to the ratio of their quantum Chernoff divergences, under both these classes of free operations. This imparts a new operational interpretation to the quantum Chernoff divergence. We also obtain interesting operational interpretations of the Thompson metric, in the context of the dilution of symmetric distinguishability.


Introduction
Distinguishability plays a central role in all of modern science. The ability to distinguish one possibility from another allows for making inferences from experimental data and making decisions based on these inferences or developing new theories. Thus, it is essential to understand distinguishability from a fundamental perspective. Furthermore, distinguishability is a resource, in the sense that fewer trials of an experiment are needed to arrive at conclusions when two different possibilities are more distinguishable from one another.
In this paper, we adopt a resource-theoretic approach to distinguishability in quantum mechanics that ultimately is helpful in and enriches our fundamental understanding of distinguishability. We note here that, more generally, the resource-theoretic approach to quantum information processing [CG19] has illuminated not only quantum information science but also other areas of research in physics and mathematical statistics. Our work differs from prior developments with a related motivation [Mat10,WW19a,WW19b], in that here we focus instead on what we call symmetric distinguishability, or alternatively, the Bayesian approach to distinguishability. The outcome of our efforts is a resource theory with a plethora of appealing features, including asymptotic reversibility with the optimal conversion rate being given by a ratio of Chernoff divergences. We explain these concepts in more detail in what follows. Our theory is similar in spirit to that proposed previously in [Mor09], but there are some notable differences and our conclusions are arguably stronger than those presented in [Mor09]. We refer to the resource theory that we propose here as the resource theory of symmetric distinguishability (RTSD).

Overview of the resource theory of symmetric distinguishability
The basic objects of this resource theory are elementary quantum information sources, which emit one of two quantum states with certain prior probabilities (i.e., the source emits a state $\rho_0$ with probability $p$ or a state $\rho_1$ with probability $1-p$). Such a source can be represented by the following classical-quantum (c-q) state:
$$\rho_{XA} := p\,|0\rangle\langle 0|_X \otimes \rho_0 + (1-p)\,|1\rangle\langle 1|_X \otimes \rho_1, \tag{1.1}$$
where $p \in [0, 1]$ is a prior probability and $\rho_0$ and $\rho_1$ are quantum states. Note that the classical system is equivalently specified by a random variable that we also denote as $X$. In analogy with the notation used in the resource theory of asymmetric distinguishability [WW19a,WW19b], the source can be equivalently represented by a quantum box given by the triple $(p, \rho_0, \rho_1)$. The nomenclature "box" is used here to indicate that a quantum system is prepared in the state $\rho_0$ with probability $p$ and $\rho_1$ with probability $1-p$, and it is not known which is the case (thus, the system is analogous to an unopened box). An important goal of the resource theory is to transform a state of the above form to the state
$$\sigma_{XB} := q\,|0\rangle\langle 0|_X \otimes \sigma_0 + (1-q)\,|1\rangle\langle 1|_X \otimes \sigma_1, \tag{1.2}$$
via a chosen set of free operations, where $q \in [0, 1]$ and $\sigma_0$ and $\sigma_1$ are quantum states. Note that the target system $B$ need not be isomorphic to the initial system $A$. This corresponds to the following transformation between boxes: $(p, \rho_0, \rho_1) \to (q, \sigma_0, \sigma_1)$. In what follows, we often suppress the subscripts denoting the quantum systems, for notational simplicity. Given such an elementary quantum source $\rho_{XA} \equiv (p, \rho_0, \rho_1)$, a natural way to study distinguishability is to consider the binary hypothesis testing task of discriminating between the states $\rho_0$ and $\rho_1$. There are two possible errors that can be incurred in the process, namely, the type I error (mistaking $\rho_0$ for $\rho_1$) and the type II error (mistaking $\rho_1$ for $\rho_0$).
In the setting of asymmetric hypothesis testing, one minimizes the type II error probability under the constraint that the type I error probability is below a given threshold. In contrast, in symmetric hypothesis testing, the two error probabilities are considered on the same footing and weighted by the prior distribution. The latter (also known as Bayesian discrimination) is arguably the first problem ever considered in the field of quantum information theory and solved in the single-copy case by Helstrom [Hel67,Hel69] and Holevo [Hol72]. The operational quantity in this task is the minimum (average) error probability, which we formally define in (3.2) and denote as p err (ρ XA ).
Let $\rho^{(n)}_{XA}$ denote the c-q state corresponding to a source that emits the state $\rho_0^{\otimes n}$ (resp. $\rho_1^{\otimes n}$) with probability $p$ (resp. $1-p$). It is known that $p_{\rm err}(\rho^{(n)}_{XA})$ decays exponentially in $n$, with exponent given by the Chernoff divergence (also known as the quantum Chernoff bound) of the states $\rho_0$ and $\rho_1$ [NS09, ACMnT+07]. Just as the resource theory of asymmetric distinguishability (RTAD) provides a resource-theoretic perspective on asymmetric hypothesis testing [Mat10,Mat11,WW19a], our resource theory (RTSD) provides a resource-theoretic framework for symmetric hypothesis testing. By the quantum Stein's lemma, the relevant operational quantity in asymmetric hypothesis testing is characterized by the quantum relative entropy [HP91,ON00], and in the RTAD, the optimal rate of transformation between quantum boxes was proved to be given by a ratio of quantum relative entropies [WW19a] (see also [BST19] in this context). In analogy, and given the result of [ACMnT+07], it is natural to expect that, in the RTSD, the corresponding optimal asymptotic rate of transformation between quantum boxes is given by a ratio of Chernoff divergences. We show that this is indeed the case. By contrast, only one-shot transformations are considered in [Mor09]; asymptotic transformations are not studied there.
We consider the RTSD for two different choices of free operations: (i) local quantum channels (i.e., linear, completely positive, trace-preserving maps) acting on the system A alone, and (ii) the more general class of conditional doubly stochastic (CDS) maps. We denote the former class of free operations by CPTP A . A CDS map acting on a c-q state ρ XA defined through (1.1) consists of quantum operations acting on the system A and associated permutations of the letters x ∈ X . A detailed justification behind the choice of CDS maps as free operations is given in Section 2, where the RTSD is introduced via an axiomatic approach.
In any quantum resource theory, there are several pertinent questions to address. What are the conditions for the feasibility of transforming a source state to a target state? If one cannot perform a transformation exactly, how well can one do so approximately? What is an appropriate measure for approximation when converting a source state to a target state? Is there a "golden unit" resource that one can go through as an intermediate step when converting a source state to a target state? At what rate can one convert repetitions (i.e., multiple copies) of a source state to repetitions of a target state, either exactly or approximately? More specifically, at what rate can one distill repetitions of a source state to the golden unit resource, either exactly or approximately?
Conversely, at what rate can one dilute the golden unit resource to repetitions of a target state? Is the resource theory asymptotically reversible? In this paper, we address all of these questions within the context of the RTSD. Given an elementary quantum source $\rho_{XA}$, a natural starting point for quantifying its symmetric distinguishability is the minimum error probability $p_{\rm err}(\rho_{XA})$ of Bayesian state discrimination (mentioned above). Accordingly, we take
$$\mathrm{SD}(\rho_{XA}) := -\log\left(2\,p_{\rm err}(\rho_{XA})\right) \tag{1.3}$$
as the measure of the symmetric distinguishability (SD) contained in $\rho_{XA}$. A justification of this choice arises from the consideration of free states and infinite-resource states. For a detailed discussion of the notion of symmetric distinguishability in a more general setting, see Section 2.
A natural choice of a free state for this resource theory is a c-q state of the form (1.1), for which p = 1/2 and ρ 0 and ρ 1 are identical and hence indistinguishable. For such a state, p err (ρ XA ) = 1/2, which is achieved by random guessing, and hence SD(ρ XA ) = 0. Note that the converse is true also (i.e., SD(ρ XA ) = 0 implies that ρ 0 and ρ 1 are identical and p = 1/2). Hence our choice of SD respects the requirement that a state has zero SD if and only if it is free. On the other hand, a natural choice of an infinite-resource state is a c-q state of the form (1.1), for which ρ 0 and ρ 1 have mutually orthogonal supports. This corresponds to an elementary quantum information source that emits perfectly distinguishable states. For such a state, p err (ρ XA ) = 0 and hence SD(ρ XA ) = +∞. Thus our choice of SD validates the identification of such states as infinite-resource states. An infinite-resource state has the desirable property that it can be converted to any other c-q state via CDS maps. Under CPTP A , it can be transformed to any other c-q state with the same prior. Moreover, any given c-q state cannot be transformed to an infinite-resource state unless it is itself an infinite-resource state.
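The two extreme cases described above can be checked numerically. The following is a minimal sketch, assuming the well-known Helstrom-Holevo closed form $p_{\rm err} = \frac{1}{2}\left(1 - \|p\rho_0 - (1-p)\rho_1\|_1\right)$ and a base-2 logarithm in the definition of SD; the function names `p_err` and `sd` and the tolerance `tol` are illustrative choices, not notation from the paper.

```python
import numpy as np

def p_err(p, rho0, rho1):
    """Minimum error probability: 0.5 * (1 - ||p*rho0 - (1-p)*rho1||_1)."""
    delta = p * rho0 - (1 - p) * rho1               # Hermitian difference operator
    trace_norm = np.abs(np.linalg.eigvalsh(delta)).sum()
    return 0.5 * (1.0 - trace_norm)

def sd(p, rho0, rho1, tol=1e-12):
    """SD = -log2(2 * p_err); returns +inf when p_err vanishes (up to tol)."""
    err = p_err(p, rho0, rho1)
    if err < tol:
        return np.inf
    return max(0.0, -np.log2(2.0 * err))            # clip tiny negative round-off

proj0 = np.array([[1.0, 0.0], [0.0, 0.0]])          # |0><0|
proj1 = np.array([[0.0, 0.0], [0.0, 1.0]])          # |1><1|
mixed = np.eye(2) / 2                               # maximally mixed qubit state

sd_free = sd(0.5, mixed, mixed)                     # identical states, uniform prior
sd_infinite = sd(0.25, proj0, proj1)                # orthogonal supports
print(sd_free, sd_infinite)
```

For the free state the sketch returns zero SD, and for the pair with orthogonal supports it returns infinity, matching the discussion above.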
We coin the word SD-bit to refer to the unit of symmetric distinguishability, the basic currency of this resource theory. In fact, for every non-negative real number $m \geq 0$, it is useful to identify a family of c-q states that have $m$ SD-bits. In Definition 3.4 of Section 3.1, we consider a natural choice for such a family of c-q states. They are parametrized by $M \equiv 2^m$ and denoted by $\gamma$ (with the labels given in Definition 3.4); we refer to such a state as an $M$-golden unit. Consideration of an $M$-golden unit also leads naturally to a clear definition of the fundamental tasks of distillation and dilution in the RTSD as the conversions of a given c-q state to and from, respectively, an $M$-golden unit under free operations. The previously mentioned task of transformations between two arbitrary c-q states under free operations can be achieved by first distilling an $M$-golden unit (for the maximal possible $M$) from the initial state and then diluting the distilled $M$-golden unit to the desired target state. This is discussed in detail in Sections 4 and 5.
One fundamental setting of interest is the one-shot setting, with the question being to determine the minimum error in converting an initial source to a target source. We prove that this minimum error can be calculated by means of a semi-definite program. Thus, we can efficiently calculate this error in time polynomial in the dimensions of the A and A ′ systems.
Moving on to the case of asymptotic transformations, we consider the following fundamental conversion task via free operations:
$$\left(p, \rho_0^{\otimes n}, \rho_1^{\otimes n}\right) \to \left(q, \sigma_0^{\otimes m}, \sigma_1^{\otimes m}\right), \tag{1.4}$$
where $p, q \in [0, 1]$ and, for $i \in \{0, 1\}$, $\rho_i$ and $\sigma_i$ are states of quantum systems $A$ and $B$, respectively. The goal here is, for a fixed $n$, to make $m$ as large as possible, and to evaluate the optimal asymptotic rate $m/n$ of the transformation in the limit as $n$ becomes arbitrarily large. We allow for approximations in the transformation and require the approximation error to vanish in the asymptotic limit ($n \to \infty$). This approximation error is measured with respect to an error measure (defined for c-q states of the form considered in this paper) that we denote by the symbol $D'$. The precise definition of $D'$ and its mathematical properties are given in Section 3.2.

Main results
In this paper, we develop a consistent and systematic resource theory of symmetric distinguishability that answers the most important questions associated with a resource theory. In brief, the main contributions of this paper can be summarized as follows. In the following, all c-q states are assumed to be of the form (1.1), and they hence represent elementary quantum information sources.
• We define two new examples of generalized divergences, each of which satisfies the data-processing inequality (DPI). These are denoted as $\xi_{\min}$ and $\xi_{\max}$. Moreover, we define a quantity, denoted as $\xi^{\star}_{\max}$, on c-q states of the form (1.1), and we prove that it is monotone under CDS maps.
• All of the above quantities are of operational significance in the RTSD:
  - The one-shot exact distillable-SD of $\rho_{XA}$ under ${\rm CPTP}_A$ maps is given by $\xi_{\min}(\rho_{XA})$ (Theorem 4.3);
  - The one-shot exact SD-cost of $\rho_{XA}$ under ${\rm CPTP}_A$ maps is given by $\xi_{\max}(\rho_{XA})$ (Theorem 5.3);
  - The one-shot exact distillable-SD of $\rho_{XA}$ under CDS maps is given by its symmetric distinguishability, $\mathrm{SD}(\rho_{XA})$ (Theorem 4.5);
  - The one-shot exact SD-cost of $\rho_{XA}$ under CDS maps is given by $\xi^{\star}_{\max}(\rho_{XA})$ (Theorem 5.5).
• $\xi_{\max}(\rho_{XA})$ and $\xi^{\star}_{\max}(\rho_{XA})$ are both defined in terms of the Thompson metric of the state $\rho_{XA}$ (see Theorems 5.3 and 5.5), thus providing operational interpretations of the latter in the context of the RTSD.
• The optimal asymptotic rate of exact and approximate SD-distillation for a state $\rho_{XA} \equiv (p, \rho_0, \rho_1)$, under both ${\rm CPTP}_A$ and CDS maps, is equal to its quantum Chernoff divergence, $\xi(\rho_0, \rho_1)$ (Theorems 4.7 and 4.18), where
$$\xi(\rho_0, \rho_1) := \sup_{s \in [0,1]} \left(-\log \operatorname{Tr}\left[\rho_0^{s}\, \rho_1^{1-s}\right]\right). \tag{1.5}$$
• The optimal asymptotic rate of exact SD-dilution for a state $\rho_{XA}$ is equal to its Thompson metric (see Theorem 5.12). This provides another clear operational interpretation for the latter.
• The optimal asymptotic rate of approximate SD-dilution for a state $\rho_{XA}$ is equal to its quantum Chernoff divergence (see Theorem 5.13).
• The optimal asymptotic rate of transforming one c-q state to another, under both ${\rm CPTP}_A$ and CDS maps, is equal to the ratio of their quantum Chernoff divergences (see Theorem 7.4). This result constitutes a novel operational interpretation of the Chernoff divergence beyond that reported in earlier work on symmetric quantum hypothesis testing [NS09, ACMnT+07]. It also demonstrates that the resource theory of symmetric distinguishability (RTSD) is asymptotically reversible.
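To make the rate statements above concrete, the quantum Chernoff divergence can be evaluated numerically. The following is a rough sketch, assuming the standard definition $\xi(\rho_0,\rho_1) = -\log \min_{s\in[0,1]} \operatorname{Tr}[\rho_0^{s}\rho_1^{1-s}]$ with a base-2 logarithm and a simple grid search over $s$; the helper names `mpow`, `chernoff_divergence`, and `conversion_rate` are hypothetical, and the direction of the ratio (source divergence over target divergence) is our reading of the statement above.

```python
import numpy as np

def mpow(rho, s, tol=1e-12):
    """Fractional power of a positive semi-definite matrix, taken on its support."""
    vals, vecs = np.linalg.eigh(rho)
    powered = np.where(vals > tol, np.clip(vals, 0.0, None) ** s, 0.0)
    return (vecs * powered) @ vecs.conj().T

def chernoff_divergence(rho0, rho1, grid=2001):
    """-log2 of the minimum over s in [0, 1] of Tr[rho0^s rho1^(1-s)] (grid search)."""
    s_values = np.linspace(0.0, 1.0, grid)
    q_min = min(np.trace(mpow(rho0, s) @ mpow(rho1, 1.0 - s)).real for s in s_values)
    return -np.log2(q_min)

def conversion_rate(rho0, rho1, sigma0, sigma1):
    """Ratio of Chernoff divergences: source pair over target pair."""
    return chernoff_divergence(rho0, rho1) / chernoff_divergence(sigma0, sigma1)

rho0 = np.diag([0.9, 0.1])
rho1 = np.diag([0.1, 0.9])
xi = chernoff_divergence(rho0, rho1)
rate = conversion_rate(rho0, rho1, rho0, rho1)   # converting a source to itself
print(xi, rate)
```

For this commuting pair the minimum sits at $s = 1/2$, where the trace equals $2\sqrt{0.09} = 0.6$, so the divergence is $-\log_2 0.6$; a grid search is only a heuristic for general states, although convexity of the objective in $s$ makes it reliable in practice.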
In the following sections, we develop all of the above claims in more detail. In particular, in Section 2, we introduce a general resource theory of symmetric distinguishability, for arbitrary quantum information sources (given by an ensemble {p x , ρ x } x∈X of quantum states), via an axiomatic approach. The resource theory studied in the rest of the paper is a special case of the above, namely, the one for elementary quantum information sources (i.e., corresponding to the choice |X | = 2). Certain necessary ingredients of the RTSD are introduced in this section and in Section 3. These include the notion of golden units, which facilitates a study of distillation and dilution of symmetric distinguishability (SD). SD-distillation and SD-dilution are studied in Sections 4 and 5, respectively, both in the one-shot and asymptotic regimes. In Section 6, we elucidate the salient features of the RTSD for certain examples of elementary quantum information sources. The interesting task of converting one elementary quantum information source to another via free operations is studied in Section 7. We conclude the main part of the paper with a summary and some open questions for future research. Various relevant quantities of the RTSD can be formulated as semi-definite programs (SDPs). These are stated in Sections 3 and 4, but some of their proofs appear in the appendices.
2 An axiomatic approach to the resource theory of symmetric distinguishability
In this section, we introduce an axiomatic approach to a resource theory of symmetric distinguishability, from which the particular resource theory that we study in this paper arises as a natural special case, corresponding to the choice $|X| \equiv |\mathcal{X}| = 2$ in what follows. Consider an ensemble $\{p_x, \rho_x\}_{x\in\mathcal{X}}$ of quantum states. Such an ensemble can be described by a c-q state
$$\rho_{XA} := \sum_{x\in\mathcal{X}} p_x\, |x\rangle\langle x|_X \otimes \rho_x. \tag{2.1}$$
There are many functions that can be used to quantify the distinguishability of the ensemble of states above. Perhaps the function that is most operationally motivated is the guessing probability:
$$p_{\rm guess}(X|A) := \max_{\{\Lambda_x\}_x} \sum_{x\in\mathcal{X}} p_x \operatorname{Tr}\left[\Lambda_x \rho_x\right], \tag{2.2}$$
where the maximum is over every possible POVM $\{\Lambda_x\}_{x\in\mathcal{X}}$. We are interested in a notion of ensemble distinguishability that takes into account the prior distribution $\{p_x\}_{x\in\mathcal{X}}$ while distinguishing between the states in the ensemble, as opposed to state distinguishability, which is concerned only with the distinguishability of the states in the set $\{\rho_x\}_{x\in\mathcal{X}}$.
Motivated by the guessing probability and symmetric hypothesis testing, we identify an ensemble as free in the resource theory of symmetric distinguishability if the guessing probability takes on its minimum value; i.e., if the guessing probability is equal to $1/|\mathcal{X}|$, which is the same value attained by a random guessing strategy. Note that the guessing probability is equal to $1/|\mathcal{X}|$ if and only if all the states of the ensemble $\{p_x, \rho_x\}_{x\in\mathcal{X}}$ are identical and the prior distribution is uniform. For the sake of completeness, we include a proof of this fact at the end of this section (see Lemma 2.6 below).
Thus, for all such ensembles, the symmetric distinguishability is equal to zero. The corresponding c-q state $\rho_{XA} = \pi_X \otimes \omega_A$, where $\pi_X := I_X/|\mathcal{X}|$ is the completely mixed state, is then a 'free' state of the resource theory of symmetric distinguishability because it has zero SD. This leads us to identify the set of free states in the resource theory of symmetric distinguishability as follows:
$$\left\{\pi_X \otimes \omega_A : \omega_A \in \mathcal{D}(A)\right\}. \tag{2.3}$$
Note that, in the above, we use the notation $\mathcal{D}(A)$ to denote the set of quantum states (i.e., density matrices) of the quantum system $A$.
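To make the notion of symmetric distinguishability (SD) precise, we introduce an axiomatic approach. Here, we define a preorder relation $\prec$ on the set of all c-q states, which satisfies Axioms I-V below (see Definition 2.1). We say that a c-q state $\rho_{XA}$ has less symmetric distinguishability than $\sigma_{X'A'}$ if $\rho_{XA} \prec \sigma_{X'A'}$, and we write $\rho_{XA} \sim \sigma_{X'A'}$ if both $\rho_{XA} \prec \sigma_{X'A'}$ and $\sigma_{X'A'} \prec \rho_{XA}$ hold. In this approach, SD is a property of a composite physical system shared between two parties, say, Xiao and Alice, with Xiao possessing classical systems (denoted by $X$, $X'$, etc.) and Alice possessing quantum systems (denoted by $A$, $A'$, etc.).
Axiom I. The word "symmetric" refers to the fact that the distinguishability is symmetric with respect to the ordering in the ensemble; i.e., for every permutation $\pi$, the ensemble $\{p_x, \rho_x\}_x$ has the same SD as the ensemble $\{p_{\pi(x)}, \rho_{\pi(x)}\}_x$ because the latter is just a relabeling of the former. We write this equivalence as the following relation, for every permutation channel $\mathcal{P}_X$ on the classical system:
$$\mathcal{P}_X(\rho_{XA}) \sim \rho_{XA}. \tag{2.4}$$
Axiom II. Similarly, every isometric channel $\mathcal{V} \in {\rm CPTP}(A \to A')$ that acts on the states $\{\rho_x\}_x$ does not change their "overlap" (i.e., their Hilbert-Schmidt inner product), and this leads to the next axiom:
$$\mathcal{V}_{A\to A'}(\rho_{XA}) \sim \rho_{XA}. \tag{2.5}$$
Axiom III. The fact that states with zero SD cannot add SD leads to the next axiom: for every classical system $X'$ on Xiao's side and every state $\omega_{A'}$ on Alice's side,
$$\rho_{XA} \otimes \pi_{X'} \otimes \omega_{A'} \sim \rho_{XA}. \tag{2.6}$$
Axiom IV. The next axiom concerns two c-q states $\rho_{XAA'}, \sigma_{XAA'} \in \mathcal{D}(XAA')$ that have the form
$$\rho_{XAA'} = \sum_{y\in\mathcal{Y}} q_y\, \rho^{y}_{XA} \otimes |y\rangle\langle y|_{A'}, \qquad \sigma_{XAA'} = \sum_{y\in\mathcal{Y}} q_y\, \sigma^{y}_{XA} \otimes |y\rangle\langle y|_{A'}, \tag{2.7}$$
with $\mathcal{Y}$ a finite alphabet, $q_y \in [0, 1]$ the elements of a probability distribution, and $\{|y\rangle\}_{y\in\mathcal{Y}}$ an orthonormal basis. Since the label $y$ is distinguishable in such states, we assume that
$$\rho^{y}_{XA} \sim \sigma^{y}_{XA} \ \ \forall y \in \mathcal{Y} \quad \Longrightarrow \quad \rho_{XAA'} \sim \sigma_{XAA'}. \tag{2.8}$$
The motivation behind Axiom IV is the following: Since the label $y$ can be perfectly inferred by a measurement of system $A'$, the symmetric distinguishability of the states $\rho_{XAA'}$ and $\sigma_{XAA'}$ should be fully determined by the symmetric distinguishability of the individual states $\rho^{y}_{XA}$ and $\sigma^{y}_{XA}$. Hence, if $\rho^{y}_{XA} \sim \sigma^{y}_{XA}$ for all $y$, then their mixtures $\rho_{XAA'}$ and $\sigma_{XAA'}$ should also be equivalent.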
Axiom V. The last axiom is the following natural assumption: The SD of a c-q state does not increase by discarding subsystems; i.e.,
$$\operatorname{Tr}_{A'}\left[\rho_{XAA'}\right] \prec \rho_{XAA'}. \tag{2.9}$$
The above axioms now lead to the formal definition of the preorder of SD:
Definition 2.1 (Preorder of SD) Let $\mathcal{D}(XA)$ denote the set of c-q states on a classical system $X$ and a quantum system $A$, and let
$$\mathcal{D}_{\rm cq} := \bigcup_{X, A} \mathcal{D}(XA)$$
be the union over all finite-dimensional systems $X$ and $A$. The preorder of SD is the smallest preorder relation on $\mathcal{D}_{\rm cq}$ that satisfies Axioms I-V.
In Appendix A, we discuss a number of consequences of Axioms I-V and the preorder of SD. There, we also define general resource measures to quantify symmetric distinguishability, and we provide several examples.
We now use this preorder to define the set of free operations.
Definition 2.2 (Free operations of the RTSD) A map $\mathcal{N} \in {\rm CPTP}(XA \to X'A')$ is said to be a free operation if
$$\mathcal{N}_{XA\to X'A'}(\rho_{XA}) \prec \rho_{XA}$$
for every c-q state $\rho_{XA}$. We denote the set of such free operations by $F(XA \to X'A')$.
The above definition of the free operations defines the resource theory of symmetric distinguishability (SD). Specifically, we identify SD as a property of a composite physical system, shared between two parties, Xiao and Alice (with Xiao's systems being classical and Alice's being quantum), that can neither be generated nor increased by the set F of free operations.
There is an important class of operations that play a central role in the resource theory of symmetric distinguishability. These are referred to as conditionally doubly stochastic (CDS) maps and were first introduced in [GGH + 18]. In fact, when the classical input and output systems of the resource theory are the same (i.e., X = X ′ ), then the set of free operations, defined above, reduces to the set of CDS maps. This is stated in Lemma 2.3 below. Before proceeding to the lemma, we introduce the notion of CDS maps in the next section.

Conditional doubly stochastic (CDS) maps
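Consider the following problem: Xiao picks a state $\rho_x \in \{\rho_1, \ldots, \rho_{|\mathcal{X}|}\}$ at random, with prior probability $p_x$, and then sends the state $\rho_x$ to Alice through a noiseless quantum channel. Alice knows the probability distribution $\{p_x\}_x$ from which Xiao sampled the state $\rho_x$, but she does not know the value of $x$. Therefore, the overall state can be represented as a classical-quantum state of the form
$$\rho_{XA} = \sum_{x\in\mathcal{X}} p_x\, |x\rangle\langle x|_X \otimes \rho_x, \tag{2.11}$$
where $X$ is a classical register to which Alice does not have access and $A$ is a quantum register to which she does have access. What are the ways in which Alice can manipulate the state $\rho_{XA}$? There are two basic operations that she can perform:
1. Alice can perform a generalized measurement on her system $A$.
2. Alice can partially lose her knowledge of the distribution $p_x$, by performing random relabeling on the alphabet of the classical system, $\mathcal{X} := \{1, 2, \ldots, |\mathcal{X}|\}$.
That is, Alice can perform a generalized measurement on the quantum system $A$, and based on the outcome, say $j$, apply a random relabeling map $\mathcal{D}^{(j)}$ on the classical system $X$. Hence, the most general operation that Alice can perform is a CPTP map $\mathcal{N} \in {\rm CPTP}(XA \to XA')$ of the form
$$\mathcal{N}_{XA\to XA'} = \sum_{j} \mathcal{D}^{(j)}_{X\to X} \otimes \mathcal{E}^{(j)}_{A\to A'}, \tag{2.12}$$
where each $\mathcal{D}^{(j)}$ is a classical doubly stochastic channel and each $\mathcal{E}^{(j)}$ is a completely positive (CP) map such that $\sum_j \mathcal{E}^{(j)}$ is CPTP. Alternatively, since every doubly stochastic matrix can be expressed as a convex combination of permutation matrices, there exist conditional probabilities $t_{z|j}$ such that
$$\mathcal{D}^{(j)}_{X\to X} = \sum_{z=1}^{|\mathcal{X}|!} t_{z|j}\, \mathcal{P}^{z}_{X\to X}, \tag{2.13}$$
where $\mathcal{P}^{z}$ is the permutation channel defined by $\mathcal{P}^{z}(|x\rangle\langle x|) = |\pi_z(x)\rangle\langle \pi_z(x)|$, with $\pi_z$ being one of the $|\mathcal{X}|!$ permutations. We therefore conclude that
$$\mathcal{N}_{XA\to XA'} = \sum_{z=1}^{|\mathcal{X}|!} \mathcal{P}^{z}_{X\to X} \otimes \tilde{\mathcal{E}}^{(z)}_{A\to A'}, \tag{2.14}$$
where $\tilde{\mathcal{E}}^{(z)} \equiv \sum_j t_{z|j}\, \mathcal{E}^{(j)}$. Hence, we can assume, without loss of generality, that in (2.12) the map $\mathcal{D}^{(z)} = \mathcal{P}^{z}$, with $z = 1, \ldots, |\mathcal{X}|!$, so that the maps that Alice can perform are given by
$$\mathcal{N}_{XA\to XA'} = \sum_{z=1}^{|\mathcal{X}|!} \mathcal{P}^{z}_{X\to X} \otimes \mathcal{E}^{(z)}_{A\to A'}, \tag{2.15}$$
where each $\mathcal{E}^{(z)}$ is a CP map and $\sum_z \mathcal{E}^{(z)}$ is CPTP. We call such CPTP maps conditional doubly stochastic (CDS) maps.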
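The action of a CDS map on a c-q state can be sketched in a few lines. The example below is illustrative only: it stores a c-q state as its unnormalized blocks $p_x\rho_x$, represents each branch of the map by a permutation of the labels together with Kraus operators of the corresponding CP map, and uses the Helstrom-Holevo closed form for the binary error probability. The names `apply_cds` and `p_err_binary`, as well as the specific measurement chosen, are assumptions made for this sketch.

```python
import numpy as np

def apply_cds(blocks, branches):
    """Apply sum_z P^z (x) E^z to a c-q state given as blocks[x] = p_x * rho_x.

    branches is a list of (perm, kraus_ops): perm[x] is the new label of x,
    and kraus_ops are the Kraus operators of the CP map in that branch.
    """
    d_out = branches[0][1][0].shape[0]
    out = [np.zeros((d_out, d_out), dtype=complex) for _ in blocks]
    for perm, kraus_ops in branches:
        for x, block in enumerate(blocks):
            out[perm[x]] += sum(K @ block @ K.conj().T for K in kraus_ops)
    return out

def p_err_binary(blocks):
    """Helstrom-Holevo error probability for a two-letter c-q state."""
    delta = blocks[0] - blocks[1]
    return 0.5 * (1.0 - np.abs(np.linalg.eigvalsh(delta)).sum())

proj0 = np.array([[1.0, 0.0], [0.0, 0.0]])
proj1 = np.array([[0.0, 0.0], [0.0, 1.0]])
rho0 = np.diag([0.8, 0.2])
rho1 = np.array([[0.5, 0.5], [0.5, 0.5]])            # |+><+|
blocks_in = [0.6 * rho0, 0.4 * rho1]

# Measure in the computational basis; keep the labels on outcome 0, flip them on outcome 1.
branches = [((0, 1), [proj0]), ((1, 0), [proj1])]
blocks_out = apply_cds(blocks_in, branches)

err_in = p_err_binary(blocks_in)
err_out = p_err_binary(blocks_out)
print(err_in, err_out)
```

In this instance the error probability grows from roughly 0.23 to 0.32, so the map does not make the source easier to discriminate.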
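The following lemma states that, when the input and output classical systems are the same (i.e., $X = X'$), the free operations for the resource theory of symmetric distinguishability (RTSD), introduced in Definition 2.2, are given by CDS maps, denoted by ${\rm CDS}(XA \to XA')$.
Lemma 2.3 For a classical system $X$ and quantum systems $A$ and $A'$, the following set equivalence holds:
$$F(XA \to XA') = {\rm CDS}(XA \to XA'). \tag{2.16}$$
Proof. We first prove that
$${\rm CDS}(XA \to XA') \subseteq F(XA \to XA'). \tag{2.17}$$
Since partial trace and isometries acting on Alice's systems are free, it follows that every quantum instrument on Alice's side is free. Let
$$\mathcal{E}_{A\to A'A''} := \sum_{y\in\mathcal{Y}} \mathcal{E}^{y}_{A\to A'} \otimes |y\rangle\langle y|_{A''} \tag{2.18}$$
be a quantum instrument on Alice's side, and let $\rho_{XA} \in \mathcal{D}(XA)$. Then, the action of the quantum instrument on $\rho_{XA}$ yields the state
$$\sum_{y\in\mathcal{Y}} \mathcal{E}^{y}_{A\to A'}(\rho_{XA}) \otimes |y\rangle\langle y|_{A''}, \tag{2.19}$$
and from Axiom I, we have, for all $y \in \mathcal{Y}$, the equivalence (understood for the normalized states)
$$\mathcal{E}^{y}_{A\to A'}(\rho_{XA}) \sim \mathcal{P}^{y}_{X\to X} \circ \mathcal{E}^{y}_{A\to A'}(\rho_{XA}), \tag{2.20}$$
where each $\mathcal{P}^{y}_{X\to X}$ is a permutation channel. Combining this with Axiom IV, we conclude that the c-q state in (2.19) can further be transformed to
$$\sum_{y\in\mathcal{Y}} \left(\mathcal{P}^{y}_{X\to X} \otimes \mathcal{E}^{y}_{A\to A'}\right)(\rho_{XA}) \otimes |y\rangle\langle y|_{A''}. \tag{2.21}$$
Finally, tracing out system $A''$ yields the overall transformation
$$\rho_{XA} \mapsto \sum_{y\in\mathcal{Y}} \left(\mathcal{P}^{y}_{X\to X} \otimes \mathcal{E}^{y}_{A\to A'}\right)(\rho_{XA}), \tag{2.22}$$
which is the general form of a CDS map. This completes the proof of (2.17).
Conversely, note that only Axiom III is not covered by CDS maps. Therefore, the most general transformation $\mathcal{N}_{XA\to XA'} \in F(XA \to XA')$ has the form
$$\mathcal{N}_{XA\to XA'}(\rho_{XA}) = \sum_{y\in\mathcal{Y}} \operatorname{Tr}_{X'}\left[\mathcal{P}^{y}_{XX'}\left(\mathcal{E}^{y}_{A\to A'}(\rho_{XA}) \otimes \pi_{X'}\right)\right], \tag{2.23}$$
where $\pi_{X'}$ is the maximally mixed state, $\{\mathcal{E}^{y}\}_y$ is a quantum instrument, and $\mathcal{P}^{y}_{XX'} \in {\rm CPTP}(XX' \to XX')$ are joint permutation channels. However, observe that
$$\mathcal{D}^{y}_{X\to X}(\cdot) := \operatorname{Tr}_{X'}\left[\mathcal{P}^{y}_{XX'}\left((\cdot) \otimes \pi_{X'}\right)\right] \tag{2.24}$$
is a classical doubly stochastic channel and therefore can be expressed as a convex combination of permutation channels. We therefore conclude that $\mathcal{N}_{XA\to XA'} \in {\rm CDS}(XA \to XA')$.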
Lemma 2.3 above demonstrates that if the dimension of the classical system is fixed, then the free operations in the resource theory of SD are CDS maps. However, we point out that the lemma above can also be used to characterize $F(XA \to X'A')$ when $|X| \neq |X'|$. In particular, observe that
$$\rho_{XA} \prec \sigma_{X'A'} \quad \Longleftrightarrow \quad \rho_{XA} \otimes \pi_{X'} \prec \pi_{X} \otimes \sigma_{X'A'}$$
for all $\rho_{XA}, \sigma_{X'A'} \in \mathcal{D}_{\rm cq}$, because the maximally mixed states $\pi_X$ and $\pi_{X'}$ are free. In the general case, when $|X| \neq |X'|$, the set $F(XA \to X'A')$ can be viewed as a special subset of conditional thermal operations [NG17], corresponding to a thermodynamical system with a completely degenerate Hamiltonian. We therefore call it the set of conditional noisy operations.
In the rest of the paper, we focus on the case in which the input classical system X has the same dimension as the output classical system X ′ , and furthermore, we constrain both of them to have dimension equal to two. Thus, according to Lemma 2.3, the set of free operations reduces to CDS maps in all of our discussions that follow.
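Moreover, in addition to CDS maps, we will also consider the set
$${\rm CPTP}_A := \left\{{\rm id}_X \otimes \mathcal{N}_{A\to A'} : \mathcal{N}_{A\to A'} \in {\rm CPTP}(A \to A')\right\}$$
as a possible set of transformations. Note that ${\rm CPTP}_A$ has the clear physical interpretation of applying a fixed quantum channel to the quantum part of the c-q state without changing its classical probability distribution. Furthermore, it is clear from the definition of CDS maps that we have the inclusion
$${\rm CPTP}_A \subset {\rm CDS}. \tag{2.27}$$
The following lemma shows that the minimum error probability $p_{\rm err}(\rho_{XA})$ can only be increased by application of a CDS map.
Lemma 2.4 Let $\rho_{XA}$ be a c-q state and let $\mathcal{N} \in {\rm CDS}(XA \to XA')$. Then
$$p_{\rm err}(\mathcal{N}(\rho_{XA})) \geq p_{\rm err}(\rho_{XA}).$$
Proof. Since $p_{\rm err} = 1 - p_{\rm guess}$, it suffices to show that the guessing probability does not increase under $\mathcal{N}$. The guessing probability $p_{\rm guess}(X|A)$ can be written as follows [KRS09]:
$$p_{\rm guess}(X|A)_{\rho} = \inf_{\omega_A \in \mathcal{D}(A)} 2^{D_{\max}(\rho_{XA} \| I_X \otimes \omega_A)},$$
where $D_{\max}$ denotes the max-relative entropy, which for a state $\rho$ and a positive semi-definite operator $\sigma$ is defined as follows [Dat09]:
$$D_{\max}(\rho \| \sigma) := \inf\left\{\lambda \in \mathbb{R} : \rho \leq 2^{\lambda} \sigma\right\}.$$
Thus, we conclude that
$$p_{\rm guess}(X|A')_{\mathcal{N}(\rho)} = \inf_{\sigma_{A'}} 2^{D_{\max}(\mathcal{N}(\rho_{XA}) \| I_X \otimes \sigma_{A'})} \leq \inf_{\omega_A} 2^{D_{\max}(\mathcal{N}(\rho_{XA}) \| \mathcal{N}(I_X \otimes \omega_A))} \leq \inf_{\omega_A} 2^{D_{\max}(\rho_{XA} \| I_X \otimes \omega_A)} = p_{\rm guess}(X|A)_{\rho}.$$
The first inequality follows from the fact that for each quantum state $\omega_A$ on system $A$, the image under the CDS map $\mathcal{N}$ can be written as $\mathcal{N}(I_X \otimes \omega_A) = I_X \otimes \sigma_{A'}$ with $\sigma_{A'}$ a state on system $A'$ (which is evident from the definition of CDS maps). The second inequality follows from the data-processing inequality for the max-relative entropy [Dat09].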
The above lemma immediately leads to a natural choice of a measure of SD for the particular case of the RTSD that we study in this paper, namely, one in which the dimension of the classical system $X$ is fixed to $|X| = 2$. This measure was mentioned in (1.3), and we recall its definition here:
Definition 2.5 For a fixed dimension $|X| = 2$, we define
$$\mathrm{SD}(\rho_{XA}) := -\log\left(2\,p_{\rm err}(\rho_{XA})\right). \tag{2.30}$$
This function is equal to zero on free states and behaves monotonically under CDS maps, as is evident from Lemma 2.4 above.
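Lemma 2.6 For an ensemble $\{p_x, \rho_x\}_{x\in\mathcal{X}}$, the following lower bound holds for the guessing probability:
$$p_{\rm guess}(X|A) \geq \frac{1}{|\mathcal{X}|}, \tag{2.31}$$
with $p_{\rm guess}(X|A)$ defined in (2.2). This lower bound is saturated (i.e., $p_{\rm guess}(X|A) = 1/|\mathcal{X}|$) if and only if
$$\exists\, \sigma_A \in \mathcal{D}(A) \ \text{such that} \ \forall x \in \mathcal{X}: \ \rho_x = \sigma_A \ \text{and} \ p_x = \frac{1}{|\mathcal{X}|}. \tag{2.32}$$
Proof. It is known that [KRS09]
$$p_{\rm guess}(X|A) = \frac{1}{|\mathcal{X}|} \inf_{\omega_A \in \mathcal{D}(A)} 2^{D_{\max}(\rho_{XA} \| \pi_X \otimes \omega_A)}, \tag{2.33}$$
where the infimum is over every state $\omega_A$. In the above, $\rho_{XA}$ is the c-q state corresponding to the ensemble and is defined in (2.1), and $\pi_X = I_X/|\mathcal{X}|$ is the completely mixed state. The lower bound in (2.31) is then a direct consequence of (2.33) and the fact that $D_{\max}(\rho \| \sigma) \geq 0$ for states $\rho$ and $\sigma$. As a consequence of (2.33), we also conclude that
$$p_{\rm guess}(X|A) = \frac{1}{|\mathcal{X}|} \quad \Longleftrightarrow \quad \inf_{\omega_A \in \mathcal{D}(A)} D_{\max}(\rho_{XA} \| \pi_X \otimes \omega_A) = 0. \tag{2.34}$$
Now note that the infimum on the right-hand side of (2.34) is actually a minimum (due to the finiteness of $\inf_{\omega_A} D_{\max}(\rho_{XA} \| \pi_X \otimes \omega_A)$ and continuity of $2^{\lambda}\sigma$ in $\lambda$). Therefore,
$$p_{\rm guess}(X|A) = \frac{1}{|\mathcal{X}|} \quad \Longleftrightarrow \quad \exists\, \omega_A \in \mathcal{D}(A): \ \rho_{XA} = \pi_X \otimes \omega_A, \tag{2.35}$$
where we also employed the property of the max-relative entropy that, for states $\rho$ and $\sigma$, $D_{\max}(\rho \| \sigma) = 0$ if and only if $\rho = \sigma$.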
Remark 2.7 By rearranging (2.33), we find that
$$\log\left(|\mathcal{X}|\, p_{\rm guess}(X|A)\right) = \inf_{\omega_A \in \mathcal{D}(A)} D_{\max}(\rho_{XA} \| \pi_X \otimes \omega_A).$$
This quantity is a measure of symmetric distinguishability alternative to $\mathrm{SD}(\rho_{XA})$, as defined in (1.3). Indeed, this measure of symmetric distinguishability has the appealing feature that it is the max-relative entropy from the state $\rho_{XA}$ of interest to the set of free states. As is common in quantum resource theories [CG19], one could then define an infinite number of SD measures based on a generalized divergence between the state of interest and the set of free states.
3 Some key ingredients of the resource theory of symmetric distinguishability
3.1 Infinite-resource states and golden units
In the following sections, we study the information-theoretic tasks of distillation and dilution within the framework of the resource theory of symmetric distinguishability (SD), with respect to the golden unit of Definition 3.4 below. We refer to these tasks as SD-distillation and SD-dilution, respectively. We consider two different choices of free operations: (i) CPTP maps on the quantum system $A$ and the identity channel on the classical system, which we denote as ${\rm CPTP}_A$, and (ii) CDS maps, which were introduced in the previous section. The reason for considering both of these is that they lead to novel and interesting results. Also, in some cases, a proof for one choice of free operations follows as a simple corollary from that for the other choice. We study the exact and approximate one-shot cases, as well as the asymptotic case, for both of these tasks.
In this section, we prove certain key results that serve as prerequisites for the above study; they involve infinite-resource states and golden units, notions that were introduced in Section 1. We recall their definitions before stating the relevant results. As mentioned in the Introduction, the basic objects of the RTSD are elementary quantum information sources represented by
$$\rho_{XA} := p\,|0\rangle\langle 0|_X \otimes \rho_0 + (1-p)\,|1\rangle\langle 1|_X \otimes \rho_1, \tag{3.1}$$
where $p \in [0, 1]$ is a prior probability and $\rho_0$ and $\rho_1$ are quantum states. The main operational quantity associated with such a state, in the context of symmetric hypothesis testing, is the minimum error probability of Bayesian state discrimination of the states $\rho_0$ and $\rho_1$:
$$p_{\rm err}(\rho_{XA}) := \min_{0 \leq \Lambda \leq I} \left\{ p \operatorname{Tr}\left[(I - \Lambda)\rho_0\right] + (1-p) \operatorname{Tr}\left[\Lambda \rho_1\right] \right\}. \tag{3.2}$$
Remark 3.1 Since the c-q state $\rho_{XA}$ is also represented by the quantum box $(p, \rho_0, \rho_1)$ (as mentioned in the Introduction), we sometimes use the notation $p_{\rm err}(p, \rho_0, \rho_1)$ instead of $p_{\rm err}(\rho_{XA})$ in the following.
The following well-known theorem gives an explicit expression for this minimum error probability [Hel67,Hel69,Hol72].
Theorem 3.2 (Helstrom-Holevo Theorem) For a c-q state $\rho_{XA}$ of the form in (3.1), the following equality holds: $p_{\mathrm{err}}(\rho_{XA}) = \frac{1}{2}\left(1 - \| p\rho_0 - (1-p)\rho_1 \|_1\right)$. (3.3)

Recall from Section 1 that a c-q state $\rho_{XA}$ is said to be an infinite-resource state if $p_{\mathrm{err}}(\rho_{XA}) = 0$, which is equivalent to the quantum states $\rho_0$ and $\rho_1$ having mutually orthogonal supports. Hence, the symmetric distinguishability given by $\mathrm{SD}(\rho_{XA}) = -\log(2 p_{\mathrm{err}}(\rho_{XA}))$ is infinite in this case.
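As a numerical illustration of the Helstrom-Holevo expression, the following Python sketch evaluates $p_{\mathrm{err}}$ and $\mathrm{SD} = -\log(2p_{\mathrm{err}})$ for qubit sources. It is not taken from the paper: the function names are our own, the restriction to $2\times 2$ Hermitian matrices is an assumption made to avoid external libraries, and the logarithm is assumed to be base 2.

```python
import math

def trace_norm_2x2(m):
    """Trace norm of a 2x2 Hermitian matrix given as nested lists."""
    a, d = m[0][0].real, m[1][1].real
    b = abs(m[0][1])
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    # The eigenvalues of a 2x2 Hermitian matrix are mean +/- radius.
    return abs(mean + radius) + abs(mean - radius)

def p_err(p, rho0, rho1):
    """Minimum error probability (1 - ||p rho0 - (1-p) rho1||_1) / 2."""
    diff = [[p * rho0[i][j] - (1 - p) * rho1[i][j] for j in range(2)]
            for i in range(2)]
    return 0.5 * (1.0 - trace_norm_2x2(diff))

def symmetric_distinguishability(p, rho0, rho1, tol=1e-12):
    """SD = -log2(2 p_err); infinite for an infinite-resource state."""
    err = p_err(p, rho0, rho1)
    return math.inf if err <= tol else -math.log2(2.0 * err)

# Example states (hypothetical choices for illustration).
ket0 = [[1.0, 0.0], [0.0, 0.0]]   # |0><0|
ket1 = [[0.0, 0.0], [0.0, 1.0]]   # |1><1|
plus = [[0.5, 0.5], [0.5, 0.5]]   # |+><+|
```

For instance, `symmetric_distinguishability(0.5, ket0, ket1)` is infinite because the two states have orthogonal supports, whereas `p_err(0.5, ket0, plus)` reproduces the pure-state value $\frac{1}{2}(1 - \sqrt{1 - |\langle 0|+\rangle|^2})$.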
The following lemma shows that we can transform any infinite-resource state to any other c-q state of the form in (3.1) via CDS maps.
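Lemma 3.3 Let $\omega_{XA}$ be an infinite-resource state, and let $\sigma_{XB}$ be a general c-q state. Then there exists a CDS map $\mathcal{N}: XA \to XB$ such that $\mathcal{N}(\omega_{XA}) = \sigma_{XB}$. (3.4)

Proof. We write the c-q states explicitly as $\omega_{XA} = p|0\rangle\langle 0| \otimes \omega_0 + (1-p)|1\rangle\langle 1| \otimes \omega_1$ and $\sigma_{XB} = q|0\rangle\langle 0| \otimes \sigma_0 + (1-q)|1\rangle\langle 1| \otimes \sigma_1$ for some $p, q \in [0, 1]$ and quantum states $\omega_0$, $\omega_1$, $\sigma_0$, and $\sigma_1$. As $\omega_{XA}$ is an infinite-resource state, $\omega_0$ and $\omega_1$ have mutually orthogonal supports, so we can pick a POVM $\{\Lambda, \mathbb{1} - \Lambda\}$ such that $\mathrm{Tr}(\Lambda\omega_0) = \mathrm{Tr}((\mathbb{1} - \Lambda)\omega_1) = 1$ and consequently $\mathrm{Tr}(\Lambda\omega_1) = \mathrm{Tr}((\mathbb{1} - \Lambda)\omega_0) = 0$. Consider a pair $(\mathcal{E}_0, \mathcal{E}_1)$ of quantum operations, i.e., completely positive, trace non-increasing linear maps that sum to a CPTP map, defined as follows: $\mathcal{E}_0(\cdot) := q\,\mathrm{Tr}(\Lambda\,\cdot)\,\sigma_0 + (1-q)\,\mathrm{Tr}((\mathbb{1}-\Lambda)\,\cdot)\,\sigma_1$ and $\mathcal{E}_1(\cdot) := (1-q)\,\mathrm{Tr}(\Lambda\,\cdot)\,\sigma_1 + q\,\mathrm{Tr}((\mathbb{1}-\Lambda)\,\cdot)\,\sigma_0$, and consider the corresponding CDS map $\mathcal{N} := \mathrm{id}_X \otimes \mathcal{E}_0 + \mathcal{F}_X \otimes \mathcal{E}_1$, where $\mathcal{F}_X$ denotes the flip channel on the classical system $X$. We then immediately get $\mathcal{N}(\omega_{XA}) = q|0\rangle\langle 0| \otimes \sigma_0 + (1-q)|1\rangle\langle 1| \otimes \sigma_1 = \sigma_{XB}$, which concludes the proof.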
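The statement of Lemma 3.3 can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not the construction used in the paper's proof verbatim: it takes a CDS map of the form $\mathcal{N} = \mathrm{id}_X \otimes \mathcal{E}_0 + \mathcal{F}_X \otimes \mathcal{E}_1$ with one particular measure-and-prepare choice of $(\mathcal{E}_0, \mathcal{E}_1)$, and represents a c-q state by its two unnormalised blocks $(p\,\omega_0, (1-p)\,\omega_1)$. All function names are hypothetical.

```python
def trace_of_product(a, b):
    """Tr(a b) for square matrices stored as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

def combine(c0, m0, c1, m1):
    """Return c0*m0 + c1*m1 for square matrices of equal size."""
    n = len(m0)
    return [[c0 * m0[i][j] + c1 * m1[i][j] for j in range(n)] for i in range(n)]

def make_cds_map(q, sigma0, sigma1, lam):
    """CDS map N = id_X (x) E_0 + F_X (x) E_1 built from the POVM {lam, 1 - lam}."""
    dim = len(lam)
    comp = [[(1.0 if i == j else 0.0) - lam[i][j] for j in range(dim)]
            for i in range(dim)]

    def e0(block):  # E_0 = q Tr(lam .) sigma0 + (1-q) Tr((1-lam) .) sigma1
        return combine(q * trace_of_product(lam, block), sigma0,
                       (1 - q) * trace_of_product(comp, block), sigma1)

    def e1(block):  # E_1 = (1-q) Tr(lam .) sigma1 + q Tr((1-lam) .) sigma0
        return combine((1 - q) * trace_of_product(lam, block), sigma1,
                       q * trace_of_product(comp, block), sigma0)

    def cds(block0, block1):
        # The flip channel F_X swaps the classical labels 0 and 1.
        new0 = combine(1.0, e0(block0), 1.0, e1(block1))
        new1 = combine(1.0, e0(block1), 1.0, e1(block0))
        return new0, new1

    return cds

# Infinite-resource source (p, |0><0|, |1><1|) and an arbitrary target (q, sigma0, sigma1).
p, q = 0.3, 0.8
ket0 = [[1.0, 0.0], [0.0, 0.0]]
ket1 = [[0.0, 0.0], [0.0, 1.0]]
sigma0 = [[0.7, 0.2], [0.2, 0.3]]
sigma1 = [[0.5, 0.0], [0.0, 0.5]]
cds = make_cds_map(q, sigma0, sigma1, lam=ket0)
out0, out1 = cds(combine(p, ket0, 0.0, ket0), combine(1 - p, ket1, 0.0, ket1))
```

The output blocks `out0` and `out1` equal $q\sigma_0$ and $(1-q)\sigma_1$, independently of the source prior $p$, which is the content of (3.4) for this example.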
As mentioned in the Introduction, it is useful to consider a particular class of c-q states that lead naturally to a clear definition of the fundamental tasks of distillation and dilution in the RTSD. These states are parametrized by $M \in [1, \infty]$ and $q \in (0, 1)$, and for $M$ large enough have SD equal to $\log M$. We refer to such a state as an $(M, q)$-golden unit. It is defined as follows:

Definition 3.4 (Golden unit) We choose the following class of classical-quantum (c-q) states of a composite system $XQ$, where $Q$ is a qubit. Each state is labelled by a parameter $M \in [1, \infty]$ and a probability $q \in (0, 1)$ and is defined as follows: $\gamma^{(M,q)}_{XQ} := q|0\rangle\langle 0| \otimes \pi_M + (1-q)|1\rangle\langle 1| \otimes \sigma^{(1)}\pi_M\sigma^{(1)}$, where $\pi_M := \left(1 - \frac{1}{2M}\right)|0\rangle\langle 0| + \frac{1}{2M}|1\rangle\langle 1|$ is a state of a qubit $Q$ and $\sigma^{(1)}$ denotes the Pauli-$x$ matrix. We call the state $\gamma^{(M,q)}_{XQ}$ an $(M, q)$-golden unit. Note that for $M = \infty$, we have $\pi_\infty = |0\rangle\langle 0|$ and hence the golden unit reduces to an infinite-resource state.
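The claim that an $(M, q)$-golden unit has SD equal to $\log M$ for $M$ large enough can be illustrated with a few lines of Python. This is a sketch under assumptions: it takes $\pi_M = (1 - \frac{1}{2M})|0\rangle\langle 0| + \frac{1}{2M}|1\rangle\langle 1|$, which is consistent with $\pi_\infty = |0\rangle\langle 0|$ and with the stated scaling but is our reading of the definition, and it uses base-2 logarithms. Since all matrices involved are diagonal, the trace norm reduces to a sum of absolute values.

```python
import math

def golden_unit_sd(M, q, tol=1e-12):
    """SD of the (M, q)-golden unit, assuming pi_M = diag(1 - 1/(2M), 1/(2M))."""
    a = 0.0 if M == math.inf else 1.0 / (2.0 * M)
    # q*pi_M - (1-q)*X pi_M X is diagonal; its trace norm is the sum of |entries|.
    trace_norm = abs(q * (1 - a) - (1 - q) * a) + abs(q * a - (1 - q) * (1 - a))
    err = 0.5 * (1.0 - trace_norm)
    return math.inf if err <= tol else -math.log2(2.0 * err)
```

With these assumptions, `golden_unit_sd(8, 0.5)` and `golden_unit_sd(8, 1/3)` both give 3, i.e. $\log_2 8$, while `golden_unit_sd(1, 1/3)` is strictly positive: for small $M$ the SD is dominated by the non-uniform prior, in line with the qualifier "for $M$ large enough" (here $\frac{1}{2M} \le \min\{q, 1-q\}$).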
The goodness of this choice of the golden unit lies in the fact that its SD has a useful scaling property, as stated in the following lemma.
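Proof. As $q_1 \prec q_2$, there exists a $\lambda \in [0, 1]$ such that $\lambda q_2 + (1 - \lambda)(1 - q_2) = q_1$. Now consider the CDS map $\mathcal{N} := \lambda\, \mathrm{id}_X \otimes \mathrm{id}_Q + (1 - \lambda)\, \mathcal{F}_X \otimes \mathcal{F}_Q$, where $\mathcal{F}_X$ and $\mathcal{F}_Q$ denote the flip channels on systems $X$ and $Q$, respectively. Since $(\mathcal{F}_X \otimes \mathcal{F}_Q)(\gamma^{(M,q_2)}_{XQ}) = \gamma^{(M,1-q_2)}_{XQ}$, this directly gives $\mathcal{N}(\gamma^{(M,q_2)}_{XQ}) = \gamma^{(M,q_1)}_{XQ}$, which finishes the proof.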
When we consider CDS maps as free operations, it suffices to focus on the case q = 1/2. We denote the corresponding golden unit simply as γ (M ) XQ , and call it the M -golden unit. For future reference, we write it out explicitly: Remark 3.7 Note that the golden unit γ XQ is equivalent under CDS maps to π M ⊗ ω, for an arbitrary quantum state ω. To see that, consider first the CDS map (3.11) and note that N (γ To see the other direction, consider the CDS map

A suitable error measure for approximate transformation tasks
In this section, we introduce the notion of a minimum conversion error for the transformation of one c-q state to another (both of the form defined in (3.1)) via free operations. For the transformation ρ XA → σ XB , we denote this quantity as d ′ FO (ρ XA → σ XB ). The latter is defined in terms of a scaled trace distance, which we denote as D ′ (ρ XA , σ XB ). We also discuss some of the properties of the above quantities. These quantities are then used to define the one-shot approximate distillable-SD and the one-shot approximate SD-cost in the following sections.
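Definition 3.8 For general c-q states $\rho_{XA}$ and $\sigma_{XA}$, define the scaled trace distance $D'(\rho_{XA}, \sigma_{XA})$ as follows: $D'(\rho_{XA}, \sigma_{XA}) := \frac{\frac{1}{2}\|\rho_{XA} - \sigma_{XA}\|_1}{p_{\mathrm{err}}(\sigma_{XA})}$ if $p_{\mathrm{err}}(\sigma_{XA}) > 0$; $D'(\rho_{XA}, \sigma_{XA}) := 0$ if $p_{\mathrm{err}}(\sigma_{XA}) = 0$ and $\rho_{XA} = \sigma_{XA}$; and $D'(\rho_{XA}, \sigma_{XA}) := \infty$ if $p_{\mathrm{err}}(\sigma_{XA}) = 0$ and $\rho_{XA} \neq \sigma_{XA}$. (3.13)

Remark 3.9 Note that the scaling factor in the definition of the scaled trace distance $D'(\rho_{XA}, \sigma_{XA})$ depends on the state $\sigma_{XA}$ in the second slot.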
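Definition 3.10 For a set of free operations denoted by FO, we define the minimum conversion error corresponding to the scaled trace distance $D'(\cdot, \cdot)$ as follows: $d'_{\mathrm{FO}}(\rho_{XA} \to \sigma_{XB}) := \min_{\mathcal{A} \in \mathrm{FO}} D'(\mathcal{A}(\rho_{XA}), \sigma_{XB})$, with $\rho_{XA}$ and $\sigma_{XB}$ being general c-q states on the classical system $X$ and the quantum systems $A$ and $B$, respectively.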
Proof. By inspecting the definition of d ′ FO (ρ XA → σ XB ), we see that it remains to prove the following: where FO is either CPTP A or CDS. We leave the proof of the equality above to Appendix B (see Lemmas B.3 and B.4 therein).
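Remark 3.12 The reason for considering the scaled trace distance $D'(\cdot, \cdot)$ instead of the usual trace distance as an error measure for transformations in the RTSD is that using the latter would allow for the unreasonable possibility of a finite-resource state being arbitrarily close to an infinite-resource state. In particular, as any infinite-resource state can be transformed to any other c-q state via CDS maps (see Lemma 3.3), this would imply that, for any finite allowed error measured in trace norm, the transformation would be possible at an infinite rate (as long as $\rho_0 \neq \rho_1$). To see this, for all $n \in \mathbb{N}$, pick a POVM $\{\Lambda_n, \mathbb{1} - \Lambda_n\}$ on the composite system of $n$ copies of system $A$ such that both type I and II error probabilities corresponding to the source $(p, \rho_0^{\otimes n}, \rho_1^{\otimes n})$ vanish asymptotically, i.e., $\lim_{n\to\infty} \mathrm{Tr}((\mathbb{1} - \Lambda_n)\rho_0^{\otimes n}) = \lim_{n\to\infty} \mathrm{Tr}(\Lambda_n\rho_1^{\otimes n}) = 0$.

This is possible because $\rho_0 \neq \rho_1$. Hence, considering the infinite-resource state $\omega_{XQ} := p|0\rangle\langle 0| \otimes |0\rangle\langle 0| + (1-p)|1\rangle\langle 1| \otimes |1\rangle\langle 1|$, with the quantum system being a qubit $Q$, and the measure-prepare channel $\mathcal{E}_n(\cdot) = \mathrm{Tr}(\Lambda_n\,\cdot)\,|0\rangle\langle 0| + \mathrm{Tr}((\mathbb{1} - \Lambda_n)\,\cdot)\,|1\rangle\langle 1|$, we see that $\lim_{n\to\infty} \|(\mathrm{id}_X \otimes \mathcal{E}_n)(\rho^{(n)}_{XA}) - \omega_{XQ}\|_1 = 0$, where $\rho^{(n)}_{XA} := p|0\rangle\langle 0| \otimes \rho_0^{\otimes n} + (1-p)|1\rangle\langle 1| \otimes \rho_1^{\otimes n}$. Therefore, as the infinite-resource state $\omega_{XQ}$ can be transformed to any other c-q state without error, we see that for every $\varepsilon > 0$, we can pick $n \in \mathbb{N}$ large enough such that for every c-q state $\sigma_{XB} \equiv (q, \sigma_0, \sigma_1)$ and $m \in \mathbb{N}$ there exists a CDS map $\mathcal{N}$ such that $\frac{1}{2}\|\mathcal{N}(\rho^{(n)}_{XA}) - \sigma^{(m)}_{XB}\|_1 \le \varepsilon$, with $\sigma^{(m)}_{XB} := q|0\rangle\langle 0| \otimes \sigma_0^{\otimes m} + (1-q)|1\rangle\langle 1| \otimes \sigma_1^{\otimes m}$. Hence, the transformation $\rho^{(n)}_{XA} \to \sigma^{(m)}_{XB}$ is possible at an infinite rate and with arbitrarily small error measured in trace distance. For $q = p$, this transformation can also be performed at an infinite rate by just using $\mathrm{CPTP}_A$ operations.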
Note that $D'(\cdot, \cdot)$ is defined precisely so that, for every infinite-resource state $\omega_{XA}$ and every $\rho_{XA} \neq \omega_{XA}$, we have $D'(\rho_{XA}, \omega_{XA}) = \infty$. Therefore, the problem discussed above, in which we obtain unreasonable infinite rates in the transformations in the RTSD, does not occur when we choose $D'(\cdot, \cdot)$ as the error measure. Moreover, in the following, we see that $D'(\cdot, \cdot)$ has many desirable properties that lead to reasonable asymptotic rates in SD-distillation, SD-dilution, and the transformation of general elementary quantum sources.
The scaled trace distance $D'$ satisfies the data-processing inequality under CDS maps:

Lemma 3.13 (DPI for $D'$ under CDS maps) Let $\rho_{XA}$, $\sigma_{XA}$ be two c-q states and $\mathcal{N}$ a CDS map. Then $D'(\mathcal{N}(\rho_{XA}), \mathcal{N}(\sigma_{XA})) \le D'(\rho_{XA}, \sigma_{XA})$. (3.20)

Proof. The statement directly follows from the data-processing inequality for the trace distance under general CPTP maps, i.e., $\|\mathcal{N}(\rho_{XA}) - \mathcal{N}(\sigma_{XA})\|_1 \le \|\rho_{XA} - \sigma_{XA}\|_1$, and the monotonicity of the minimum error probability under CDS maps proven in Lemma 2.4, i.e., $p_{\mathrm{err}}(\mathcal{N}(\sigma_{XA})) \ge p_{\mathrm{err}}(\sigma_{XA})$. This concludes the proof.

The following lemma now establishes a bound relating the minimum error probabilities of two c-q states $\rho_{XA}$ and $\sigma_{XA}$, involving a multiplicative term related to their scaled trace distance $D'$. This lemma is the key ingredient for proving all converses in approximate asymptotic SD-distillation, SD-dilution, and the transformation of general elementary quantum sources.
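Lemma 3.14 Let $\rho_{XA}$ and $\sigma_{XA}$ be c-q states such that $D'(\rho_{XA}, \sigma_{XA})$ is finite. Then $p_{\mathrm{err}}(\rho_{XA}) \le \left(D'(\rho_{XA}, \sigma_{XA}) + 1\right) p_{\mathrm{err}}(\sigma_{XA})$. (3.23)

Proof. If $p_{\mathrm{err}}(\sigma_{XA}) = 0$, then finiteness of $D'$ forces $\rho_{XA} = \sigma_{XA}$ and the claim is trivial, so assume $p_{\mathrm{err}}(\sigma_{XA}) > 0$. First, it is helpful to note that by writing the c-q states explicitly as $\rho_{XA} = p|0\rangle\langle 0| \otimes \rho_0 + (1-p)|1\rangle\langle 1| \otimes \rho_1$ and $\sigma_{XA} = q|0\rangle\langle 0| \otimes \sigma_0 + (1-q)|1\rangle\langle 1| \otimes \sigma_1$, they can be block-diagonalised in the same basis and hence we can write $\|\rho_{XA} - \sigma_{XA}\|_1 = \|p\rho_0 - q\sigma_0\|_1 + \|(1-p)\rho_1 - (1-q)\sigma_1\|_1$. We can hence bound the minimum error probability of $\rho_{XA}$, using Theorem 3.2 and the triangle inequality, as $p_{\mathrm{err}}(\rho_{XA}) = \frac{1}{2}\left(1 - \|p\rho_0 - (1-p)\rho_1\|_1\right) \le \frac{1}{2}\left(1 - \|q\sigma_0 - (1-q)\sigma_1\|_1\right) + \frac{1}{2}\|\rho_{XA} - \sigma_{XA}\|_1 = \left(1 + D'(\rho_{XA}, \sigma_{XA})\right) p_{\mathrm{err}}(\sigma_{XA})$. This concludes the proof.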
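The scaled trace distance and the bound of Lemma 3.14 can be explored numerically for qubit sources. The sketch below is illustrative only: it assumes that $D'$ is the normalised trace distance $\frac{1}{2}\|\rho_{XA} - \sigma_{XA}\|_1$ divided by $p_{\mathrm{err}}(\sigma_{XA})$ (with the values $0$ and $\infty$ in the degenerate cases), represents a c-q state as a hypothetical tuple `(prior, state0, state1)`, and uses the block-diagonal structure to split the trace norm into two $2\times 2$ contributions.

```python
import math

def trace_norm_2x2(m):
    """Trace norm of a 2x2 Hermitian matrix (eigenvalues are mean +/- radius)."""
    a, d = m[0][0].real, m[1][1].real
    mean, radius = (a + d) / 2, math.hypot((a - d) / 2, abs(m[0][1]))
    return abs(mean + radius) + abs(mean - radius)

def combine(c0, m0, c1, m1):
    """Return c0*m0 + c1*m1 for 2x2 matrices."""
    return [[c0 * m0[i][j] + c1 * m1[i][j] for j in range(2)] for i in range(2)]

def p_err(p, rho0, rho1):
    """Helstrom-Holevo minimum error probability."""
    return 0.5 * (1.0 - trace_norm_2x2(combine(p, rho0, -(1 - p), rho1)))

def scaled_trace_distance(rho, sigma, tol=1e-12):
    """D'(rho, sigma) for c-q states given as (prior, state0, state1)."""
    p, r0, r1 = rho
    q, s0, s1 = sigma
    # Block-diagonal operators: the trace norm is the sum over the two blocks.
    dist = 0.5 * (trace_norm_2x2(combine(p, r0, -q, s0))
                  + trace_norm_2x2(combine(1 - p, r1, -(1 - q), s1)))
    err = p_err(q, s0, s1)
    if err > tol:
        return dist / err
    return 0.0 if dist <= tol else math.inf

ket0 = [[1.0, 0.0], [0.0, 0.0]]
ket1 = [[0.0, 0.0], [0.0, 1.0]]
plus = [[0.5, 0.5], [0.5, 0.5]]
rho = (0.5, ket0, plus)
sigma = (0.4, [[0.9, 0.0], [0.0, 0.1]], [[0.2, 0.0], [0.0, 0.8]])
infinite_resource = (0.5, ket0, ket1)
```

Here `scaled_trace_distance(rho, infinite_resource)` is infinite, as intended by the definition, while for the finite-resource pair the inequality $p_{\mathrm{err}}(\rho_{XA}) \le (D' + 1)\,p_{\mathrm{err}}(\sigma_{XA})$ can be verified directly.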

Semi-definite program for the scaled trace distance D ′
We now prove that the scaled trace distance D ′ (·, ·) can be calculated by means of a semi-definite program (SDP). SDPs can be computed efficiently by numerical solvers [VB96]. As semi-definite programming is a powerful theoretical and numerical tool for quantum information theory, with a plethora of applications, we expect that the following SDP characterizations of D ′ (·, ·) may be useful for a further understanding of this quantity.
Proposition 3.15 For general c-q states ρ XA and σ XA with p err (σ XA ) > 0, the scaled trace distance, D ′ (ρ XA , σ XA ), in Definition 3.8 is given by the following semi-definite program: (3.28) The dual SDP is as follows: (3.29) If p err (σ XA ) > 0, then strong duality holds, so that the optimal value in (3.28) is equal to the optimal value in (3.29).
The proof of Proposition 3.15 can be found in Appendix C.

Semi-definite program for the minimum conversion error
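The one-shot transformation task from a source $\rho_{XA} := p|0\rangle\langle 0| \otimes \rho_0 + (1-p)|1\rangle\langle 1| \otimes \rho_1$ to a target $\sigma_{XB} := q|0\rangle\langle 0| \otimes \sigma_0 + (1-q)|1\rangle\langle 1| \otimes \sigma_1$ using CDS as the set of free operations can be phrased as the following optimization task: $d'_{\mathrm{CDS}}(\rho_{XA} \to \sigma_{XB}) = \min_{\mathcal{N} \in \mathrm{CDS}} D'(\mathcal{N}(\rho_{XA}), \sigma_{XB})$. (3.30) We now prove that the minimum conversion error in (3.30) can be calculated by means of a semi-definite program.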
Proposition 3.16 The minimum conversion error in (3.30) can be evaluated by the following semi-definite program: The proof of Proposition 3.16 can be found in Appendix C.

SD-distillation
In this section, we study the fundamental task of distillation of symmetric distinguishability (SD-distillation), both in the one-shot and asymptotic settings.

One-shot exact SD-distillation
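One-shot exact SD-distillation of a given c-q state $\rho_{XA} := p|0\rangle\langle 0| \otimes \rho_0 + (1-p)|1\rangle\langle 1| \otimes \rho_1$, (4.1) with $p \in [0, 1]$ and $\rho_0$, $\rho_1$ states of a quantum system $A$, is the task of converting a single copy of it to an $M$-golden unit via free operations. The maximal value of $\log M$ for which this conversion is possible is equal to the one-shot exact distillable-SD for the chosen set of free operations. This is defined formally as follows:

Definition 4.1 For a set of free operations denoted by FO and $q \in [0, 1]$, the one-shot exact distillable-SD of the c-q state $\rho_{XA}$ defined in (4.1) is given by $\xi_d^{\mathrm{FO},q}(\rho_{XA}) := \log \sup\{M \ge 1 : \mathcal{A}(\rho_{XA}) = \gamma^{(M,q)}_{XQ},\ \mathcal{A} \in \mathrm{FO}\}$. (4.2) For the choice $\mathrm{FO} \equiv \mathrm{CPTP}_A$, the only sensible choice in (4.2) is $q = p$, as free operations of the form $\mathrm{id} \otimes \mathcal{E}$ cannot change the prior in the c-q state. In that case, the above quantity is called the one-shot exact distillable-SD under $\mathrm{CPTP}_A$ maps and we simply write $\xi_d(\rho_{XA}) \equiv \xi_d^{\mathrm{CPTP}_A,p}(\rho_{XA})$. Whereas for the choice $\mathrm{FO} \equiv \mathrm{CDS}$ and $q = 1/2$, the above quantity is called the one-shot exact distillable-SD under CDS maps and we use the notation $\xi_d^{\star}(\rho_{XA}) \equiv \xi_d^{\mathrm{CDS},1/2}(\rho_{XA})$. Explicitly, for a c-q state $\rho_{XA}$ given by (4.1) we then have $\xi_d(\rho_{XA}) = \log \sup\{M \ge 1 : (\mathrm{id}_X \otimes \mathcal{E})(\rho_{XA}) = \gamma^{(M,p)}_{XQ},\ \mathcal{E} \in \mathrm{CPTP}\}$ and $\xi_d^{\star}(\rho_{XA}) = \log \sup\{M \ge 1 : \mathcal{N}(\rho_{XA}) = \gamma^{(M)}_{XQ},\ \mathcal{N} \in \mathrm{CDS}\}$, (4.7) where $\gamma^{(M)}_{XQ} \equiv \gamma^{(M,1/2)}_{XQ}$ as stated previously.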
Remark 4.2 Note that in the case of prior p ∈ (0, 1) and the free operations being CPTP A , the distillable-SD is, by definition, independent of p. In fact, in that case it can be equivalently written as For p ∈ {0, 1} one easily sees that Therefore, we restrict to the non-singular case p ∈ (0, 1) in the following Theorem 4.3.
We state the following theorems now.
Theorem 4.3 The one-shot exact distillable-SD under $\mathrm{CPTP}_A$ maps of a c-q state $\rho_{XA}$, defined through (4.1), with $p \in (0, 1)$ is given by $\xi_d(\rho_{XA}) = \xi_{\min}(\rho_0, \rho_1) := -\log Q_{\min}(\rho_0, \rho_1)$, (4.11) and $Q_{\min}(\rho_0, \rho_1)$ is given by the following SDP: (4.12)

Remark 4.4 An alternative way of writing $Q_{\min}(\rho_0, \rho_1)$ is as follows: (4.13) This clarifies that the optimization is over all POVMs $\{\Lambda, \mathbb{1} - \Lambda\}$ such that the Type I error probability $\mathrm{Tr}(\Lambda\rho_0)$ is (greater than or) equal to the Type II error probability $\mathrm{Tr}((\mathbb{1} - \Lambda)\rho_1)$.
To see the second equality in (4.13), note first that the last line is trivially smaller than or equal to the right-hand side of the first line, since we are minimising over a larger set. To arrive at the other inequality, let $\tilde{\Lambda}_{\min}$ be a minimiser of the last line of (4.13) and $c := \mathrm{Tr}(\tilde{\Lambda}_{\min}(\rho_0 + \rho_1)) \ge 1$.
from which we conclude the equality in (4.13).
Theorem 4.5 The one-shot exact distillable-SD under CDS maps of a c-q state ρ XA , defined through (4.1), is given by (4.14) Remark 4.6 As a consequence of Theorem 4.3, it follows that the one-shot exact distillable-SD under CPTP A maps can be calculated by means of a semi-definite program, due to the form of Q min (ρ 0 , ρ 1 ) in (4.12). As a consequence of Theorem 4.5, it follows that the one-shot exact distillable-SD under CDS maps can be calculated by means of a semi-definite program, due to the expression for p err (ρ XA ) in (3.2).
Proofs of the above theorems are given in Sections 4.4 and 4.5, respectively. The quantities ξ min and Q min appearing in Theorem 4.3 have several useful and interesting properties, which are given in Section 4.3.

Optimal asymptotic rate of exact SD-distillation
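Consider the c-q state $\rho^{(n)}_{XA} := p|0\rangle\langle 0| \otimes \rho_0^{\otimes n} + (1-p)|1\rangle\langle 1| \otimes \rho_1^{\otimes n}$, and let $\xi_d(\rho^{(n)}_{XA})$ and $\xi_d^{\star}(\rho^{(n)}_{XA})$ denote its one-shot exact distillable-SD under $\mathrm{CPTP}_A$ maps and CDS maps, respectively. Then the optimal asymptotic rates of exact SD-distillation under $\mathrm{CPTP}_A$ maps and CDS maps are defined by the following two quantities, respectively: $\lim_{n\to\infty} \frac{1}{n}\xi_d(\rho^{(n)}_{XA})$ and $\lim_{n\to\infty} \frac{1}{n}\xi_d^{\star}(\rho^{(n)}_{XA})$. (4.15) The next theorem asserts that both limits in (4.15) actually exist and are equal to the well-known quantum Chernoff divergence [ACMnT+07, NS09]:

Theorem 4.7 (Optimal asymptotic rate of exact SD-distillation) For $p \in (0, 1)$, the optimal asymptotic rates of exact SD-distillation under $\mathrm{CPTP}_A$ and CDS maps are given by the following expression: $\lim_{n\to\infty} \frac{1}{n}\xi_d(\rho^{(n)}_{XA}) = \lim_{n\to\infty} \frac{1}{n}\xi_d^{\star}(\rho^{(n)}_{XA}) = \xi(\rho_0, \rho_1) := -\log \min_{0 \le s \le 1} \mathrm{Tr}(\rho_0^{s}\rho_1^{1-s})$. (4.16) Here, the restriction to $p \in (0, 1)$ is sensible as for $p \in \{0, 1\}$ the one-shot distillable-SD is already infinite (see Remark 4.2). A proof of the above theorem is given in Section 4.6.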
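To make the quantum Chernoff divergence $\xi(\rho_0, \rho_1) = -\log \min_{0 \le s \le 1} \mathrm{Tr}(\rho_0^{s}\rho_1^{1-s})$ concrete, the following sketch evaluates it in the special case of commuting states, where it reduces to the classical Chernoff divergence of the two eigenvalue distributions. The restriction to commuting states, the ternary search over $s$ (justified by convexity of $s \mapsto \sum_i p_i^{s} q_i^{1-s}$), the base-2 logarithm, and the function name are all assumptions made for this illustration.

```python
import math

def chernoff_divergence_classical(p, q, iterations=200):
    """-log2 min_s sum_i p_i^s q_i^(1-s) for probability vectors p and q."""
    def overlap(s):
        # Only indices in the common support contribute.
        return sum(a ** s * b ** (1.0 - s) for a, b in zip(p, q) if a > 0 and b > 0)

    lo, hi = 0.0, 1.0
    for _ in range(iterations):  # ternary search on a convex function
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if overlap(m1) < overlap(m2):
            hi = m2
        else:
            lo = m1
    best = overlap(0.5 * (lo + hi))
    return math.inf if best <= 0.0 else -math.log2(best)

def tensor_square(p):
    """Distribution of two independent copies, matching rho^{(x)2} for diagonal rho."""
    return [a * b for a in p for b in p]
```

For example, with `p = [0.9, 0.1]` and `q = [0.1, 0.9]` the minimum is attained at $s = 1/2$ and the divergence equals $-\log_2 0.6$; applying the function to `tensor_square(p)` and `tensor_square(q)` doubles the value, illustrating the additivity used in the asymptotic analysis.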

Properties of Q min and ξ min
In this section, we establish some basic properties of the distinguishability measures Q min and ξ min .
From that we see The reversed inequality can be obtained by symmetry, which yields By a straightforward consequence of the definition of ξ min (ρ 0 , ρ 1 ) in terms of Q min (ρ 0 , ρ 1 ), it follows that ξ min (ρ 0 , ρ 1 ) is symmetric in its arguments.
The quantity $\xi_{\min}$ also satisfies a data-processing inequality (DPI) under CPTP maps, which is the statement of the following lemma:

Lemma 4.9 (DPI for $\xi_{\min}$ under CPTP maps) Let $\mathcal{E}$ be a quantum channel. Then $\xi_{\min}(\mathcal{E}(\rho_0), \mathcal{E}(\rho_1)) \le \xi_{\min}(\rho_0, \rho_1)$. (4.18) In the last inequality, we have used that $0 \le \mathcal{E}^{*}(\Lambda) \le \mathbb{1}$ for each $0 \le \Lambda \le \mathbb{1}$, implying that we are effectively minimising over a smaller set. Hence, directly from the definition of $\xi_{\min}$ in (4.11), we conclude the data-processing inequality (4.18).
The next lemma gives upper and lower bounds on Q min , which turn out to be the key ingredients for proving the asymptotic result in (4.16).
this completes the proof. Note that in order to see (4.24), [BK02, Eq. (13)] gives the following lower bound on the guessing probability of the pretty good measurement: where $p_{\mathrm{guess}}(1/2, \rho_0, \rho_1)$ is the optimal guessing probability in the discrimination task. This bound immediately implies (4.24) since $(1 - x)^2 \ge 1 - 2x$ for every real number $x$.
Remark 4.12 Using Lemma 4.8, ξ min can be written as (4.27) The use of the subscript in ξ min is motivated by the similarity of the above expression (modulo the additive constant) with D min : (4.28) The notation ξ min is further motivated by analogy with the resource theory of asymmetric distinguishability [WW19a], where the quantity analogous to ξ min is the min-relative entropy D min [Dat09].
Let Λ min be a minimiser of the optimisation problem corresponding to Q min given in (4.12), i.e., 2Tr(Λ min ρ 0 ) = Q min (ρ 0 , ρ 1 ). Let E be the following measure-and-prepare channel: (4.29) Hence, using Tr(Λ min ρ 0 ) = Tr((½ − Λ min )ρ 1 ) we get ). This implies that To obtain the converse, i.e., the reverse inequality ξ d (ρ 0 , ρ 1 ) ≤ ξ min (ρ 0 , ρ 1 ), we first note that and therefore Now considering E to be an arbitrary CPTP map such that for some M ≥ 1, we see by the data processing inequality in Lemma 4.9 that As E is an arbitrary CPTP map satisfying the constraint in (4.2), we get

Proof of Theorem 4.5 - One-shot exact distillable-SD under CDS maps
Proof. We start with the achievability part, i.e., the lower bound Let Λ min be the minimiser of Define the quantum operations, i.e., completely positive, trace non-increasing maps and note that E 0 and E 1 sum to a CPTP map. Hence, we can define the corresponding CDS map as where id X and F X denote the identity and flip channels on the classical system X, respectively. Noting that and by symmetry Hence, To obtain the upper bound in (4.14), we use monotonicity of the minimum error probability under CDS maps. More precisely, let M ≥ 1 satisfy the constraint in (4.7); i.e., there exists a CDS map N such that (4.39) Using the monotonicity of the minimum error probability p err under CDS maps, we obtain (4.40) As M is arbitrary under the constraints in (4.7), we have shown that which finishes the proof.

Proof of Theorem 4.7 - Optimal asymptotic rate of exact SD-distillation
Proof. We first prove the result in the case of free operations being CPTP A maps. As a consequence of Theorem 4.3, we have the equality and so it suffices to prove the asymptotic result for ξ min . Using (4.26) and the fact that the quantum Chernoff divergence ξ is additive, we get To get the other inequality, we use the lower bound in (4.20) to see that for every q ∈ (0, 1) (4.43) Hence, by using the main result in [ACMnT+07], we conclude that In the case of free operations being CDS maps, the equality directly follows from Theorem 4.5, together with the main result of [ACMnT+07], which finishes the proof.

Approximate SD-distillation
We now define the one-shot approximate distillable-SD for a general c-q state XQ , the one-shot approximate distillable-SD of the c-q state ρ XA is given by where the minimum conversion error d ′ FO is defined in Definition 3.10. For the choice the only sensible choice in (4.47) is q = p, as free operations of the form id ⊗ E cannot change the prior in the c-q state. In that case, we simply write Whereas for the choice FO ≡ CDS and q = 1/2, we use the notation (4.50) 4.7.1 One-shot approximate distillable-SD as a semi-definite program In this section, we prove that the one-shot approximate distillable-SD under CPTP A can be evaluated by means of a semi-definite program and comment on SDP formulations of the one-shot approximate distillable-SD under CDS. Under CPTP A this quantity is defined as follows: and for CDS maps as (4.52) We begin with the following: Proposition 4.14 For all ε ≥ 0 and p ∈ (0, 1) and for every elementary source described by the c-q state ρ XA , the one-shot approximate distillable-SD under CPTP A maps can be evaluated by the following semi-definite program: So we find that the optimal value is given by as a consequence of the data-processing inequality for D ′ given in Lemma 3.13 and with the optimization over every measurement channel M. Now consider that and .
(4.58) By applying (3.7), we know that where the last equality follows from a simplification that holds for p ∈ [0, 1] and r ∈ [0, 1]. Now consider that So then we find that So the optimization problem is equivalent to the following: (4.64) Let us make the substitution r → 2r, and the above becomes the following: This concludes the proof.
Remark 4.15 We can see that if ε = 0 on the right-hand side of (4.53), this expression reduces to the ξ min quantity defined in (4.11).
We now provide an alternate characterization of the semi-definite program for the one-shot approximate distillable-SD in Proposition 4.16 below. The proof is given in Appendix D and starts from the SDP for the minimum conversion error in (3.15).

Proposition 4.16
The approximate one-shot distillable-SD under CPTP A maps can be calculated by means of the following semi-definite program: (4.66) We find that the one-shot approximate distillable-SD under CDS maps can be evaluated by a semi-definite program as well, and the proof of Proposition 4.17 below is given in Appendix E: Proposition 4.17 The approximate one-shot distillable-SD under CDS maps can be calculated by means of the following semi-definite program:

Optimal asymptotic rate of approximate SD-distillation
We now consider the asymptotic case of approximate SD-distillation. Theorem 4.7 already established that in the exact case the optimal rates for free operations being CDS or CPTP A are given by the quantum Chernoff divergence. The following theorem shows that also when an error is allowed in the transformation, the corresponding asymptotic rates are still given by the quantum Chernoff divergence.
Remark 4.19 Note that Theorem 4.18 establishes the strong converse property for the task of asymptotic distillation in the RTSD. Another way of interpreting this statement is as follows: for a sequence of SD distillation protocols with rate above the asymptotic distillable-SD, the error necessarily converges to ∞ in the limit as n becomes large.
Proof of Theorem 4.18. As we have, for every c-q state ρ XA , we immediately get the lower bounds from Theorem 4.7.
To establish the upper bounds, we use Lemma 3.14. We only prove the upper bound for ξ ⋆,ε d , as the one for ξ ε d exactly follows the same lines. Let M be such that it satisfies the constraint in the following optimization: Hence, there exists a CDS map N such that Then, by the monotonicity of the minimum error probability under CDS maps and the bound in Lemma 3.14, we see As M is arbitrary under the constraint in (4.73), we get

SD-dilution
We now turn to the case of dilution of symmetric distinguishability. We begin with the exact one-shot case in Section 5.1, establish some properties of relevant divergences in Section 5.2, provide proofs in Sections 5.3 and 5.4, consider the one-shot approximate case in Section 5.5, and evaluate asymptotic quantities in Sections 5.6 and 5.7.

One-shot exact SD-dilution
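One-shot exact SD-dilution of a given c-q state $\rho_{XA} := p|0\rangle\langle 0| \otimes \rho_0 + (1-p)|1\rangle\langle 1| \otimes \rho_1$, (5.1) with $p \in [0, 1]$, is the task of converting an $M$-golden unit to the target state $\rho_{XA}$ via free operations. The minimal value of $\log M$ for which this conversion is possible is equal to the one-shot exact SD-cost for the chosen set of free operations. This is formally defined as follows:

Definition 5.1 For a set of free operations denoted by FO and $q \in [0, 1]$, the one-shot exact SD-cost of the c-q state $\rho_{XA}$ defined in (5.1) is given by $\xi_c^{\mathrm{FO},q}(\rho_{XA}) := \log \inf\{M \ge 1 : \mathcal{A}(\gamma^{(M,q)}_{XQ}) = \rho_{XA},\ \mathcal{A} \in \mathrm{FO}\}$. (5.2) For the choice $\mathrm{FO} \equiv \mathrm{CPTP}_A$, the only sensible choice in (5.2) is $q = p$, as free operations of the form $\mathrm{id} \otimes \mathcal{E}$ cannot change the prior in the c-q state. In that case, the above quantity is called the one-shot exact SD-cost under $\mathrm{CPTP}_A$ maps and we simply write $\xi_c(\rho_{XA}) \equiv \xi_c^{\mathrm{CPTP}_A,p}(\rho_{XA})$. Whereas for the choice $\mathrm{FO} \equiv \mathrm{CDS}$ and $q = 1/2$, the above quantity is called the one-shot exact SD-cost under CDS maps and we use the notation $\xi_c^{\star}(\rho_{XA}) \equiv \xi_c^{\mathrm{CDS},1/2}(\rho_{XA})$. Explicitly, for a c-q state $\rho_{XA}$ given by (5.1), we then have $\xi_c(\rho_{XA}) = \log \inf\{M \ge 1 : (\mathrm{id}_X \otimes \mathcal{E})(\gamma^{(M,p)}_{XQ}) = \rho_{XA},\ \mathcal{E} \in \mathrm{CPTP}\}$ and $\xi_c^{\star}(\rho_{XA}) = \log \inf\{M \ge 1 : \mathcal{N}(\gamma^{(M)}_{XQ}) = \rho_{XA},\ \mathcal{N} \in \mathrm{CDS}\}$.

Remark 5.2 Note that in the case of the free operations being $\mathrm{CPTP}_A$ and prior $p \in (0, 1)$ the SD-cost, just as the distillable-SD (see Remark 4.2), is independent of $p$ by definition. In fact, in that case, it can be equivalently written as $\xi_c(\rho_0, \rho_1) = \log \inf\{M \ge 1 : \mathcal{E}(\pi_M) = \rho_0,\ \mathcal{E}(\sigma^{(1)}\pi_M\sigma^{(1)}) = \rho_1,\ \mathcal{E} \in \mathrm{CPTP}\}$. For $p \in \{0, 1\}$, one easily sees that $\xi_c(\rho_{XA}) = 0$, since a replacer channel already prepares the single relevant state. Therefore, we restrict to the non-singular case $p \in (0, 1)$ in the following Theorem 5.3.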
Remark 5.4 Note that $\xi_{\max}$ can be written as $\xi_{\max}(\rho_0, \rho_1) = \max\left\{\log\left(2^{D_{\max}(\rho_0\|\rho_1)} + 1\right),\ \log\left(2^{D_{\max}(\rho_1\|\rho_0)} + 1\right)\right\} - 1$. (5.14) The use of the subscript in $\xi_{\max}$ is motivated by the fact that it is a divergence that is essentially a symmetrized version of $D_{\max}$. The notation $\xi_{\max}$ is further motivated by analogy with the resource theory of asymmetric distinguishability [WW19a], in which the quantity analogous to $\xi_{\max}$, arising as the cost of exact dilution of asymmetric distinguishability, is the max-relative entropy $D_{\max}$ [Dat09].
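The expression (5.14) is easy to evaluate for commuting states, for which $D_{\max}(\rho_0\|\rho_1)$ is the logarithm of the largest ratio of eigenvalues. The sketch below is a hedged illustration: it assumes $\xi_{\max} = \log(2^{T} + 1) - 1$ with $T = \max\{D_{\max}(\rho_0\|\rho_1), D_{\max}(\rho_1\|\rho_0)\}$ the Thompson metric, base-2 logarithms, and the diagonal state $\pi_M = \mathrm{diag}(1 - \frac{1}{2M}, \frac{1}{2M})$ for the golden unit; the function names are our own.

```python
import math

def d_max_classical(p, q):
    """D_max(p||q) = log2 max_i p_i/q_i for probability vectors (inf if unsupported)."""
    ratio = 0.0
    for a, b in zip(p, q):
        if a > 0:
            if b == 0:
                return math.inf
            ratio = max(ratio, a / b)
    return math.log2(ratio)

def xi_max_classical(p, q):
    """xi_max = log2(2^T + 1) - 1, with T the Thompson metric of p and q."""
    thompson = max(d_max_classical(p, q), d_max_classical(q, p))
    if thompson == math.inf:
        return math.inf
    return math.log2(2.0 ** thompson + 1.0) - 1.0

def pi_diag(M):
    """Eigenvalues of the assumed golden-unit state pi_M."""
    a = 1.0 / (2.0 * M)
    return [1.0 - a, a]
```

As a consistency check, for the pair $(\pi_M, \sigma^{(1)}\pi_M\sigma^{(1)})$ one finds $T = \log(2M - 1)$ and hence $\xi_{\max} = \log M$: under these assumptions, diluting to a golden unit costs exactly what the golden unit is worth.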
We also note that Theorem 5.3 can be inferred from [BST19, Lemma 3.1], but we include a self-contained proof below for completeness.
Theorem 5.5 The one-shot exact SD-cost under CDS maps of a c-q state ρ XA , defined through (5.1), is given by Remark 5.6 Note that Theorems 5.3 and 5.5 give an operational interpretation to the Thompson metric in the context of exact one-shot dilution in the resource theory of symmetric distinguishability. To the best of our knowledge, this is the first time an operational meaning in quantum information has been given to the Thompson metric.
Remark 5.7 As a consequence of Theorem 5.3, it follows that the one-shot exact SD-cost under CPTP A maps can be calculated by means of a semi-definite program, due to the expression in (5.12).
As a consequence of Theorem 5.5, it follows that the one-shot exact SD-cost under CDS maps can be calculated by means of a semi-definite program, due to the expression in (5.17).
Properties of ξ max , Q max , ξ ⋆ max , and Q ⋆ max are discussed in Section 5.2. Proofs of the above theorems are given in Sections 5.3 and 5.4.

Properties of ξ max , Q max , ξ ⋆ max , and Q ⋆ max
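The quantity $\xi_{\max}$ satisfies the data-processing inequality under CPTP maps:

Lemma 5.8 (DPI for $\xi_{\max}$) Let $\mathcal{E}$ be a CPTP map. Then $\xi_{\max}(\mathcal{E}(\rho_0), \mathcal{E}(\rho_1)) \le \xi_{\max}(\rho_0, \rho_1)$.

Proof. Follows immediately from the data-processing inequality for $D_{\max}$ under CPTP maps [Dat09].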
We now prove that ξ ⋆ max is decreasing under CDS maps. For that, we find that ξ ⋆ max can also be expressed in the following ways, as a consequence of the definition (5.17) of Q ⋆ max : where F X (·) = σ (1) · σ (1) denotes the flip channel on the system X.
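Lemma 5.9 Let $\mathcal{N} \in \mathrm{CDS}$. Then $\xi^{\star}_{\max}(\mathcal{N}(\rho_{XA})) \le \xi^{\star}_{\max}(\rho_{XA})$. (5.20)

Proof. Noting that every CDS channel $\mathcal{N}$ commutes with the channel $\mathcal{F}_X \otimes \mathrm{id}_A$, (5.20) follows directly from the data-processing inequality for $D_{\max}$.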

One-shot approximate SD-dilution
We can now define the one-shot approximate SD-cost for a general c-q state Definition 5.10 For ε ≥ 0 and golden unit γ (M,q) XQ , the one-shot approximate SD-cost of the c-q state ρ XA is given by Whereas for the choice FO ≡ CDS and q = 1/2, we use the notation ξ ⋆,ε c (ρ XA ) ≡ ξ CDS,1/2,ε c (ρ XA ). (5.55) In the case of the dilution task, the one-shot approximate SD-cost can be directly obtained from the corresponding exact quantity.
Lemma 5.11 For FO being c-q state preserving and ε ≥ 0, we have where we have defined the ball of c-q states around $\rho_{XA}$ with radius ε with respect to the scaled trace distance $D'(\cdot, \cdot)$ Hence, in particular we get for free operations being CPTP A or CDS maps Proof. The proof simply follows by Here, for the third equality we have used that $\tilde{\rho}_{XA}$ is a c-q state because FO is c-q state preserving. Hence $D'(\tilde{\rho}_{XA}, \rho_{XA}) \le \varepsilon$ already implies $\tilde{\rho}_{XA} \in B'_{\varepsilon}(\rho_{XA})$.

Optimal asymptotic rates of exact and approximate SD-dilution
Consider the c-q state ρ Theorem 5.12 (Exact asymptotic SD-cost) For all p ∈ (0, 1), the optimal asymptotic rates of exact SD-dilution under CPTP A and CDS maps is given by Hence, unlike the case of distillation, the optimal asymptotic rates in the case of exact dilution do not match the quantum Chernoff bound, as they are too large. We also note here that (5.62) gives an operational interpretation of the Thompson metric in quantum information theory.
However, allowing errors with respect to the scaled trace distance D ′ in the conversion, the corresponding approximate quantities converge to the Chernoff divergence.
Remark 5.14 The fact that the limits in (5.63) hold without any restriction on the value of ε > 0 implies that the strong converse holds for the asymptotic SD-cost. Another way of interpreting this statement is as follows: for a sequence of SD dilution protocols with rate below the asymptotic SD-cost, the error necessarily converges to infinity as n → ∞.
Remark 5.15 Given that the asymptotic distillable-SD and SD-cost are equal to the quantum Chernoff divergence, it follows that the resource theory of symmetric distinguishability is asymptotically reversible. This means that, in the asymptotic limit of large n, one can convert the source state ρ XA at the inverse rate with no loss (in the asymptotic limit). The procedure to do so, for the first aforementioned state conversion, is to distill golden-unit states from ρ (n) XA at a rate equal to the Chernoff divergence. Then we dilute these golden-unit states to σ (m) XA ′ , and the overall conversion rate is equal to the ratio of Chernoff divergences. Then we go back from σ (m) in a similar manner, and there is no loss in the asymptotic limit. We discuss these points in much more detail in Section 7.
The desired upper bound (5.63) in Theorem 5.13 can now be deduced from the following lemma.
Lemma 5.18 For all ε > 0, we have Proof. We prove the statement only for ξ ⋆,ε c , as the proof for ξ ε c follows exactly along the same lines. Let $A_n$ denote the quantum system consisting of $n$ copies of the quantum system $A$. Moreover, for a generic c-q state $\tilde{\rho}_{XA_n}$, we use the notation $\tilde{\rho}_{XA_n} = \tilde{p}|0\rangle\langle 0| \otimes \tilde{\rho}_0 + (1 - \tilde{p})|1\rangle\langle 1| \otimes \tilde{\rho}_1$.

Examples
In this section, we detail a few examples of the RTSD to illustrate some of the key theoretical concepts developed in the previous sections. We begin with a first example. Let $\rho_0$ and $\rho_1$ be the following states: $\rho_0 := \mathcal{A}_{\gamma,N}(|0\rangle\langle 0|)$, (6.1) $\rho_1 := \mathcal{A}_{\gamma,N}(|1\rangle\langle 1|)$, (6.2) where $\mathcal{A}_{\gamma,N}$ is the generalized amplitude damping channel, defined as $\mathcal{A}_{\gamma,N}(\rho) := \sum_{i=1}^{4} A_i \rho A_i^{\dagger}$, with $\gamma, N \in [0, 1]$ and $A_1 := \sqrt{1-N}\left(|0\rangle\langle 0| + \sqrt{1-\gamma}\,|1\rangle\langle 1|\right)$, $A_2 := \sqrt{\gamma(1-N)}\,|0\rangle\langle 1|$, $A_3 := \sqrt{N}\left(\sqrt{1-\gamma}\,|0\rangle\langle 0| + |1\rangle\langle 1|\right)$, $A_4 := \sqrt{\gamma N}\,|1\rangle\langle 0|$. The parameter $\gamma \in [0, 1]$ is a damping parameter and $N \in [0, 1]$ is a thermal noise parameter. The generalized amplitude damping channel models the dynamics of a two-level system in contact with a thermal bath at non-zero temperature [NC10] and can be used as a phenomenological model for relaxation noise in superconducting qubits [CB08]. See [KSW20] for an in-depth study of the information-theoretic properties of this channel and for a discussion of how this channel can be interpreted as a qubit thermal attenuator channel. We choose the prior probabilities for the states $\rho_0$ and $\rho_1$ to be $q$ and $1-q$ respectively, with $q = 1/3$, so that the c-q state describing the elementary quantum source is $\rho_{XA} := \frac{1}{3}|0\rangle\langle 0| \otimes \rho_0 + \frac{2}{3}|1\rangle\langle 1| \otimes \rho_1$.

In Figure 1, we set the thermal noise parameter $N = 0.1$ and plot the exact one-shot distillable-SD of $\rho_{XA}$ under $\mathrm{CPTP}_A$ maps, and under CDS maps, and the exact one-shot SD-cost of $\rho_{XA}$ under $\mathrm{CPTP}_A$ maps, and under CDS maps. As expected, when the damping parameter $\gamma$ increases, each measure of SD decreases. The exact distillable-SD and SD-cost under CDS maps do not decrease to zero due to the non-uniform prior ($q = 1/3$). However, they do decrease to zero under $\mathrm{CPTP}_A$ maps because the prior $q$ does not play a role in this case. Additionally, the SD-cost under $\mathrm{CPTP}_A$ maps is strictly larger than the distillable-SD under $\mathrm{CPTP}_A$ maps for all $\gamma \in (0, 1)$, demonstrating that the RTSD is not reversible in this one-shot scenario. The same holds for distillable-SD and SD-cost under CDS maps.
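The first example can be reproduced in part with a short, dependency-free Python sketch (the paper's own programs are in Matlab). The code below assumes the standard Kraus representation of the generalized amplitude damping channel, with $A_2 \propto |0\rangle\langle 1|$ and $A_4 \propto |1\rangle\langle 0|$, and evaluates only $\mathrm{SD}(\rho_{XA}) = -\log_2(2p_{\mathrm{err}}(\rho_{XA}))$ via the Helstrom-Holevo formula, not the SDP-based quantities plotted in the figures. Function names are hypothetical.

```python
import math

def gadc_kraus(gamma, N):
    """Kraus operators of the generalized amplitude damping channel (real 2x2)."""
    r = math.sqrt
    return [
        [[r(1 - N), 0.0], [0.0, r((1 - N) * (1 - gamma))]],
        [[0.0, r(gamma * (1 - N))], [0.0, 0.0]],
        [[r(N * (1 - gamma)), 0.0], [0.0, r(N)]],
        [[0.0, 0.0], [r(gamma * N), 0.0]],
    ]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def apply_channel(kraus, rho):
    """Sum_i K_i rho K_i^T (the Kraus operators here are real)."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for k in kraus:
        term = matmul(matmul(k, rho), transpose(k))
        out = [[out[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return out

def trace_norm_2x2(m):
    """Trace norm of a real symmetric 2x2 matrix."""
    mean = (m[0][0] + m[1][1]) / 2
    radius = math.hypot((m[0][0] - m[1][1]) / 2, m[0][1])
    return abs(mean + radius) + abs(mean - radius)

def sd_of_source(q, gamma, N, tol=1e-12):
    """SD of the source (q, A(|0><0|), A(|1><1|)), in bits."""
    kraus = gadc_kraus(gamma, N)
    rho0 = apply_channel(kraus, [[1.0, 0.0], [0.0, 0.0]])
    rho1 = apply_channel(kraus, [[0.0, 0.0], [0.0, 1.0]])
    diff = [[q * rho0[i][j] - (1 - q) * rho1[i][j] for j in range(2)]
            for i in range(2)]
    err = 0.5 * (1.0 - trace_norm_2x2(diff))
    return math.inf if err <= tol else -math.log2(2.0 * err)
```

With $q = 1/3$ and $N = 0.1$, `sd_of_source` decreases as $\gamma$ grows and saturates at $\log_2(3/2)$, the value fixed by the prior alone; at $\gamma = 1$ both output states coincide with the thermal state $\mathrm{diag}(1-N, N)$. This matches the qualitative behaviour described above for the quantities under CDS maps.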
The SD-cost under $\mathrm{CPTP}_A$ maps is smaller than that under CDS maps because we use different golden units in these two cases. In this context, recall Definition 5.1. It is not possible for $\mathrm{CPTP}_A$ maps to change the prior $q$. So we are forced to use the golden unit with prior $q$, i.e., $\gamma^{(M,q)}$, which in this case we chose to be $q = 1/3$. CDS maps, however, can change the prior, and as mentioned in Definition 5.1, we pick the prior of the golden unit to be the canonical choice of 1/2. Also, note that $\gamma^{(M,1/3)}$ has more SD than $\gamma^{(M,1/2)}$; i.e., it dominates $\gamma^{(M,1/2)}$ in the preorder of SD and hence can be transformed into the prior-1/2 golden unit via CDS (see Lemma 3.6). So the SD costs under $\mathrm{CPTP}_A$ and CDS maps are different, as we are paying with a less valuable currency in the case of CDS maps.

We next consider the following example: $\rho_0 := \mathcal{A}_{\gamma,N}(|0\rangle\langle 0|)$, (6.9) $\rho_1 := e^{i\varphi\sigma^{(1)}} \mathcal{A}_{\gamma,N}(|1\rangle\langle 1|)\, e^{-i\varphi\sigma^{(1)}}$, where the angle $\varphi \in [0, \pi/2]$. We choose the prior $q$ to be the same (i.e., $q = 1/3$). All of the quantities mentioned above are plotted in Figure 2. As the angle $\varphi$ increases from zero to $\pi/2$, the states $\mathcal{A}_{\gamma,N}(|0\rangle\langle 0|)$ and $e^{i\varphi\sigma^{(1)}} \mathcal{A}_{\gamma,N}(|1\rangle\langle 1|)\, e^{-i\varphi\sigma^{(1)}}$ become less distinguishable and become the same state when $\varphi = \pi/2$. Thus, we expect the various measures of SD to decrease as $\varphi$ increases from zero to $\pi/2$. Similar statements as given above apply regarding the difference between the SD quantities under $\mathrm{CPTP}_A$ and CDS maps.
As another example, we plot the logarithm of the one-shot approximate distillable-SD of the states in (6.1)-(6.2) under both $\mathrm{CPTP}_A$ and CDS maps, as a function of the damping parameter $\gamma$. We set the approximation error $\varepsilon = 0.1$, the prior probability $q = 1/3$, and the noise parameter $N = 0.1$. For reference, we also plot the logarithm of the exact distillable-SD under both $\mathrm{CPTP}_A$ and CDS maps. See Figure 3. We have plotted the logarithm of the number of SD bits in order to distinguish the curves more clearly. The difference in the behavior of the curves has to do with the fact that CDS maps can change the prior probability while $\mathrm{CPTP}_A$ maps cannot. Here, the distillable-SD under CDS maps (both in the approximate and exact cases) flattens out for values of the damping parameter greater than $\gamma \approx 0.5$, as in this case the symmetric distinguishability of the considered box is exclusively due to the non-uniform prior ($q = 1/3$) and does not decrease further even if the quantum states themselves become less distinguishable.

As a final example, we plot the minimum conversion error in (3.30) when transforming a first box, with channel parameters $\gamma_1$ and $N_1$, to the box $(1/4, \mathcal{A}_{\gamma_2,N_2}(|0\rangle\langle 0|), e^{i\varphi\sigma^{(1)}} \mathcal{A}_{\gamma_2,N_2}(|1\rangle\langle 1|)\, e^{-i\varphi\sigma^{(1)}})$ as a function of the angle $\varphi \in [0, \pi/2]$, with $\gamma_1 = 0.5$, $N_1 = 0.3$, $\gamma_2 = 0.25$, and $N_2 = 0.1$. To do so, we make use of the semi-definite program from Proposition 3.16. The minimum conversion error is plotted in Figure 4 as a function of the angle $\varphi$. Intuitively, for small values of the angle $\varphi$, it should be more difficult to perform the conversion because the states in the first box are less distinguishable than those in the second, and so we expect the error to be higher. However, as the angle $\varphi$ increases, the states in the second box become less distinguishable and so the transformation becomes easier. The difference in the prior probabilities of the boxes is a fundamental limitation that cannot be overcome, even as $\varphi$ becomes closer to $\pi/2$, so that the minimum conversion error plateaus for angle values greater than $\approx 0.9$. All Matlab programs that generate the above plots (along with the semi-definite programs) are available with the arXiv ancillary files of this paper.

Asymptotic transformation task
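Let $\rho_{XA}$ and $\sigma_{XB}$ be c-q states explicitly given by
$$\rho_{XA} := p\,|0\rangle\langle 0|_X \otimes \rho_0 + (1-p)\,|1\rangle\langle 1|_X \otimes \rho_1, \qquad \sigma_{XB} := q\,|0\rangle\langle 0|_X \otimes \sigma_0 + (1-q)\,|1\rangle\langle 1|_X \otimes \sigma_1,$$
with $\rho_0, \rho_1$ states of a quantum system $A$, and $\sigma_0, \sigma_1$ states of a quantum system $B$. Moreover, we assume that $p, q \in [0, 1]$. We use the shorthand notation $\mathbf{p} := (p, 1-p)$ and $\mathbf{q} := (q, 1-q)$ for the priors (distributions) of $\rho_{XA}$ and $\sigma_{XB}$, respectively, and write $\mathbf{p} \succ \mathbf{q}$ if $\mathbf{p}$ majorizes $\mathbf{q}$. Let $\rho^{(n)}_{XA}$ and $\sigma^{(m)}_{XB}$ be as follows:
$$\rho^{(n)}_{XA} := p\,|0\rangle\langle 0|_X \otimes \rho_0^{\otimes n} + (1-p)\,|1\rangle\langle 1|_X \otimes \rho_1^{\otimes n}, \qquad \sigma^{(m)}_{XB} := q\,|0\rangle\langle 0|_X \otimes \sigma_0^{\otimes m} + (1-q)\,|1\rangle\langle 1|_X \otimes \sigma_1^{\otimes m}.$$

Definition 7.1 Let $\rho_{XA}, \sigma_{XB}$ be c-q states, and let FO denote the set of free operations. For $n, m \in \mathbb{N}$ and $\varepsilon > 0$, we say that there exists an $(n, m, \varepsilon)$ FO-transformation protocol for the states $\rho_{XA}$ and $\sigma_{XB}$ if $\rho^{(n)}_{XA}$ can be converted to $\sigma^{(m)}_{XB}$ by a free operation up to error $\varepsilon$. That is, there exists an $\mathcal{A} \in {\rm FO}$ such that
$$D'\big(\mathcal{A}(\rho^{(n)}_{XA}),\, \sigma^{(m)}_{XB}\big) \le \varepsilon.$$
We denote such a transformation protocol in short by the notation $\rho_{XA} \to \sigma_{XB}$.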
Definition 7.2 The rate $R \ge 0$ is an achievable rate for the transformation $\rho_{XA} \to \sigma_{XB}$ under free operations FO, if for all $\varepsilon, \delta > 0$ and $n \in \mathbb{N}$ large enough there exists an $(n, \lfloor n(R - \delta) \rfloor, \varepsilon)$ FO-transformation protocol. The optimal rate is given by the supremum over all achievable rates; we use this notion both for free operations being ${\rm CPTP}_A$ and for free operations being CDS. Note that by the inclusion ${\rm CPTP}_A \subset {\rm CDS}$, the optimal rate under ${\rm CPTP}_A$ is at most the optimal rate under CDS; this is inequality (7.10).

Definition 7.3 (Strong converse rate) The rate $R \ge 0$ is a strong converse rate for the transformation $\rho_{XA} \to \sigma_{XB}$ under free operations FO, if for all $\varepsilon, \delta > 0$ and $n \in \mathbb{N}$ large enough there does not exist an $(n, \lceil n(R + \delta) \rceil, \varepsilon)$ FO-transformation protocol. The optimal strong converse rate is given by the infimum over all strong converse rates; again we use this notion both for ${\rm CPTP}_A$ and for CDS. By definition, the optimal rate is at most the optimal strong converse rate, and, moreover, by again using the fact that ${\rm CPTP}_A \subset {\rm CDS}$, the optimal strong converse rate under ${\rm CPTP}_A$ is at most that under CDS.

The following theorem gives expressions for the optimal achievable and strong converse rates for the transformation $\rho_{XA} \to \sigma_{XB}$ under both CDS and ${\rm CPTP}_A$. Note that in the following we interpret $\infty/\infty$ as $\infty$.
For free operations being CDS we have: for $\xi(\sigma_0, \sigma_1) > 0$

For $\xi(\sigma_0, \sigma_1) = 0$ and $\xi(\rho_0, \rho_1) > 0$ we have

For $\xi(\sigma_0, \sigma_1) = \xi(\rho_0, \rho_1) = 0$, we have

For free operations being ${\rm CPTP}_A$ we have: in the case of $\rho_{XA}$ and $\sigma_{XB}$ having equal priors

Here, we interpreted $0/0$ as $\infty$. In the case of the priors being different we get

Remark 7.5 For simplicity we excluded in Theorem 7.4 the case of singular priors, i.e., $p \in \{0, 1\}$ or $q \in \{0, 1\}$. For completeness we now state the corresponding results on optimal and strong converse rates in these cases: For free operations being CDS we have

and for $p \in (0, 1)$ and $q \in \{0, 1\}$ we have

For free operations being ${\rm CPTP}_A$ and $q \in \{0, 1\}$ we have

And for $p \in \{0, 1\}$ and $q \in (0, 1)$ we have similarly to (7.21) and (7.22)

Remark 7.6 Further to what was already stated in Remark 5.15, Theorem 7.4 expresses the fact that the resource theory of symmetric distinguishability is asymptotically reversible. Indeed, the optimal asymptotic rate at which one can convert $\rho_{XA}$ to $\sigma_{XB}$ is equal to the ratio of their quantum Chernoff divergences. The rate at which one can convert back is thus equal to the reciprocal of the forward rate. Since the product of these two rates is equal to one, we conclude that the RTSD is asymptotically reversible.
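The reversibility statement of Remark 7.6 can be illustrated numerically. The following is a minimal sketch, not the authors' code, restricted to the special case of commuting (diagonal) states, where the quantum Chernoff divergence $-\log \min_{s \in [0,1]} {\rm Tr}[\rho_0^s \rho_1^{1-s}]$ reduces to the classical Chernoff divergence of two probability vectors. The function names `chernoff_divergence` and `conversion_rate` are hypothetical, the natural logarithm is used (the ratio does not depend on the base), and the degenerate cases with zero or infinite divergence are deliberately not handled.

```python
import math


def chernoff_divergence(p_dist, q_dist, iterations=200):
    """Chernoff divergence -log min_{s in [0,1]} sum_i p_i^s q_i^(1-s).

    Assumption: the two states commute, so they are given as probability
    vectors (their eigenvalues in a common eigenbasis).
    """
    def overlap(s):
        # Entries outside the common support contribute zero for every s.
        return sum(a ** s * b ** (1 - s)
                   for a, b in zip(p_dist, q_dist) if a > 0 and b > 0)

    # overlap(s) is convex in s, so a ternary search finds its minimum.
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if overlap(m1) < overlap(m2):
            hi = m2
        else:
            lo = m1
    q_min = overlap((lo + hi) / 2)
    return math.inf if q_min == 0 else -math.log(q_min)


def conversion_rate(xi_source, xi_target):
    """Ratio of Chernoff divergences; only the generic finite case is covered."""
    if not (0 < xi_source < math.inf and 0 < xi_target < math.inf):
        raise ValueError("degenerate cases are not handled in this sketch")
    return xi_source / xi_target


xi_rho = chernoff_divergence([0.9, 0.1], [0.1, 0.9])
xi_sigma = chernoff_divergence([0.8, 0.2], [0.2, 0.8])
forward = conversion_rate(xi_rho, xi_sigma)
backward = conversion_rate(xi_sigma, xi_rho)
print(forward, backward, forward * backward)
```

The product of the forward and backward rates equals one up to floating-point error, which is the reversibility property in this toy setting.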
7.1 Proof of Theorem 7.4

Achievability
We start the proof of Theorem 7.4 by proving the achievability part. In particular, we show the following lemma.
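Proof. We prove the result for free operations being CDS, since for free operations being ${\rm CPTP}_A$ the proof follows the same lines. Let us first consider the case $\xi(\sigma_0, \sigma_1) = 0$, in which case necessarily $\sigma_0 = \sigma_1 \equiv \sigma$. Moreover, first assume that $\xi(\rho_0, \rho_1) = 0$ (which implies that $\rho_0 = \rho_1 \equiv \rho$) and $\mathbf{p} \succ \mathbf{q}$. In that case, there exists a $\lambda \in [0, 1]$ such that $q = \lambda p + (1-\lambda)(1-p)$ and consequently $1 - q = \lambda(1-p) + (1-\lambda)p$. Therefore, considering for every $m \in \mathbb{N}$ the maps $\mathcal{E}^{(m)}$, which are quantum operations summing to a CPTP map, and the corresponding CDS map, we see that the transformation to the target state can be carried out without error via a CDS map for an arbitrary $m \in \mathbb{N}$.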
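The majorization step in the proof above can be made concrete. In dimension two, $\mathbf{p} \succ \mathbf{q}$ holds exactly when $\mathbf{q}$ is a convex combination of $\mathbf{p}$ and its flipped version $(1-p, p)$. The following minimal sketch computes the corresponding weight with exact rational arithmetic; the function names `majorizes` and `mixing_weight` are hypothetical and introduced only for illustration.

```python
from fractions import Fraction


def majorizes(p, q):
    """True if (p, 1-p) majorizes (q, 1-q)."""
    return max(q, 1 - q) <= max(p, 1 - p)


def mixing_weight(p, q):
    """Return lam in [0, 1] with q = lam*p + (1-lam)*(1-p), or None.

    None is returned when (p, 1-p) does not majorize (q, 1-q).
    """
    p, q = Fraction(p), Fraction(q)
    if not majorizes(p, q):
        return None
    if 2 * p == 1:
        # Then q = 1/2 as well and every weight works; pick lam = 1.
        return Fraction(1)
    return (q - (1 - p)) / (2 * p - 1)


p, q = Fraction(3, 4), Fraction(2, 3)
lam = mixing_weight(p, q)
print(lam, lam * p + (1 - lam) * (1 - p) == q)
```

Mixing the identity with weight `lam` and the flip of the classical register with weight `1 - lam` then maps the prior $\mathbf{p}$ to the prior $\mathbf{q}$.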
Next consider the case in which ξ(σ 0 , σ 1 ) > 0. We can assume without loss of generality that ξ(ρ 0 , ρ 1 ) > 0, since otherwise (7.31) is trivially satisfied. The case ξ(ρ 0 , ρ 1 ) = ∞ follows by Lemma 3.3, i.e., by the fact that we can transform any infinite-resource state to any other c-q state via CDS maps without error. Note here that ξ(ρ 0 , ρ 1 ) = ∞ if and only if ρ 0 and ρ 1 have orthogonal supports. Furthermore, the case in which ξ(ρ 0 , ρ 1 ) < ∞ and ξ(σ 0 , σ 1 ) = ∞ follows from the fact that any transformation from a finite resource to an infinite-resource has infinite error with respect to the scaled trace distance D ′ .

Summary and open questions
In summary, we have introduced the resource theory of symmetric distinguishability (RTSD) and have answered many of the fundamental questions associated with it. In particular, we have developed an axiomatic approach to the RTSD, which led to the conclusion that CDS maps are the natural choice for free operations, with CPTP A maps being a special case. We then introduced the golden units of the RTSD and argued why a particular scaled trace distance is a more appropriate figure of merit for approximate transformations, instead of the standard trace distance. We finally defined and studied the tasks of dilution, distillation, and transformation, in the exact and approximate cases, both in the one-shot and asymptotic scenarios. We proved that the rate at which asymptotic transformations are possible is equal to the ratio of quantum Chernoff divergences of the elementary information sources, and we thus concluded that the RTSD is asymptotically reversible. Going forward from here, it would be interesting to generalize the RTSD that we developed in this paper for elementary information sources to more general information sources, i.e., to c-q states for which the classical alphabet has a size greater than two. We note here that many of the concepts considered in our paper, such as the basic axioms for the RTSD, CDS maps, and the scaled trace distance D ′ (·, ·) already apply to this more general setting. In light of the seminal result in [Li16], it is a tantalizing possibility that the optimal conversion rate between quantum information sources would be equal to a ratio of multiple-state Chernoff divergences, as a generalization of Theorem 7.4, but it remains open to determine if it is the case. It is also interesting to determine expressions for the one-shot distillable-SD and SD-cost, as generalizations of ξ min and ξ max . 
As an additional open direction, it is worth exploring whether there is an operational interpretation of the scaled trace distance D ′ (·, ·) that we introduced in Section 3.2. Finally, it is an open question to determine if the one-shot approximate SD-cost can be evaluated by a semi-definite program. We prove in Appendix F that the two variants of approximate SD-cost (based on CPTP A and CDS maps) can be evaluated by means of bilinear programs, so that the methods of [HKT20] can be used to evaluate these quantities. However, it is not clear to us if these bilinear programs can be simplified further to semi-definite programs.
2. Maximal Elements. For all $\rho_{XA} \in \mathcal{D}(XA)$, $\{p_x\}_{x=1}^{|X|}$ a probability distribution, and $A'$ a quantum system with $|A'| \ge |X|$, we have
3. Reduction to Majorisation. For $\rho_X, \sigma_X \in \mathcal{D}(X)$ classical states of the same dimension, the preorder of SD is equivalent to the majorisation preorder.
Proof. The first property follows immediately from Axiom III and Axiom V. For the second property, first observe that Axiom III implies that $|1\rangle\langle 1|_X \sim |1\rangle\langle 1|_X \otimes \omega_A$ for some $\omega_A$. From the form of CDS maps we get that $|1\rangle\langle 1|_X \otimes \omega_A$ can be converted to any c-q state in $\mathcal{D}(XA')$. Hence, since $\mathfrak{F}(XA \to XA') = {\rm CDS}(XA \to XA')$ (see Lemma 2.3), the assertion follows.

For the third property, note that by Axioms I, III, and V, $\sigma_X$ can be converted to $\rho_X$ with respect to the preorder of SD if and only if
$$\rho_X = {\rm Tr}_{X'}\big[\mathcal{P}_{XX'}(\sigma_X \otimes \pi_{X'})\big]$$
for some classical system $X'$ and permutation channel $\mathcal{P}_{XX'}$ on the joint classical system $XX'$. The above channels are known as noisy operations [HHO03]. First, observe that noisy operations are doubly stochastic, so that if such a permutation channel $\mathcal{P}_{XX'}$ exists, then $\sigma_X$ majorizes $\rho_X$. Conversely, suppose $\rho = \sum_{y=1}^{m} t_y U_y \sigma U_y^{\dagger}$, where $\{t_y\}_{y=1}^{m}$ are the components of a probability distribution, the $U_y$ are permutation matrices on system $X$, and for convenience of the exposition we removed the subscript $X$. It is well known that such $\{t_y\}$ and $\{U_y\}$ exist iff $\rho \prec \sigma$. Suppose that the components $t_y = n_y / n$ are rational, where $n = \sum_{y=1}^{m} n_y$ is the common denominator and each $n_y \in \mathbb{N}$. Let $X'$ be a classical system of dimension $|X'| = n$. Define a permutation matrix $P_{XX'}$ by its action on the basis elements,
$$P_{XX'}\big(|x\rangle \otimes |x'\rangle\big) := U_{y_{x'}}|x\rangle \otimes |x'\rangle,$$
where $y_{x'} \in [m]$ is the index satisfying
$$\sum_{y=1}^{y_{x'}-1} n_y < x' \le \sum_{y=1}^{y_{x'}} n_y,$$
and we used the convention that the left-hand side of the above inequality is zero for $y_{x'} = 1$. With this definition we have
$${\rm Tr}_{X'}\big[P_{XX'}(\sigma \otimes \pi_{X'})P_{XX'}^{\dagger}\big] = \sum_{y=1}^{m} \frac{n_y}{n}\, U_y \sigma U_y^{\dagger} = \rho.$$
Therefore, since arbitrary weights $t_y$ can be approximated by rational ones, noisy operations can approximate any such mixture arbitrarily well.

Note that the lemma above indicates that $|1\rangle\langle 1|_X$ is the maximal resource in the fixed dimension of $X$. If, for example, $X'$ is another system with a higher dimension $|X'| > |X|$, then the maximal resource on $X'$ has strictly more SD than the maximal resource on $X$. That is, the 'embedding' of $X$ into $X'$ by adding zero components to matrices/vectors is not allowed in this resource theory since it can increase the value of the resource.
To get the intuition behind it, consider Xiao possessing either one of the two classical states $|1\rangle\langle 1|_{X\tilde{X}}$ or $|1\rangle\langle 1|_X \otimes \pi_{\tilde{X}}$. In the first case, Xiao has complete information about the state in her possession since she knows the values of both the random variables $X$ and $\tilde{X}$. On the other hand, in the second case Xiao has no information about $\tilde{X}$, since it is in a uniform state. Therefore, the first state is more distinguishable than the second one. Hence, for the case $X' = X\tilde{X}$, (A.7) reduces to (A.6). More generally, the SD of a c-q state $\rho_{XA}$ represents the ability of Alice to distinguish the elements in Xiao's system. Therefore, the greater $|X|$ is, the more elements there are to distinguish, and consequently, the maximal resource has greater SD. This, in particular, applies to the minimum error probability $p_{\rm err}(\rho_{XA})$. That is, suppose for example that two states $\rho_{XA}, \sigma_{X'A'} \in \mathcal{D}_{cq}$, with $|X| < |X'|$, satisfy $p_{\rm err}(\rho_{XA}) = p_{\rm err}(\sigma_{X'A'})$. Then, we can expect that $\sigma_{X'A'}$ has more SD since Alice is able to distinguish among $|X'| > |X|$ elements with the same error as she would have if she held $\rho_{XA}$. This means in particular that if we consider inter-conversions among c-q states with different classical dimensions, then the minimum error probability is not a good measure of SD. In the following subsection we show how the minimum error probability needs to be re-scaled with the classical dimension so that it becomes a proper measure of SD.
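The construction in the proof of the third property (a mixture of permutations with rational weights realised as a single permutation on $XX'$ followed by discarding $X'$) can be checked on a small example. The sketch below is an illustration under stated assumptions, not the authors' code: classical states are probability vectors, a permutation is stored as a list `perm` with `perm[x]` the image of basis element `x`, the ancilla $X'$ is uniform of dimension $n = \sum_y n_y$, and all function names are hypothetical.

```python
from fractions import Fraction


def block_index(x_prime, counts):
    """Index y of the block containing x_prime (1-based), for block sizes n_y."""
    total = 0
    for y, n_y in enumerate(counts):
        total += n_y
        if x_prime <= total:
            return y
    raise ValueError("x_prime exceeds the ancilla dimension")


def noisy_operation(sigma, perms, counts):
    """Apply |x>|x'> -> U_{y(x')}|x>|x'> to sigma (x) uniform, then discard X'."""
    n = sum(counts)
    out = [Fraction(0)] * len(sigma)
    for x, weight in enumerate(sigma):
        for x_prime in range(1, n + 1):
            y = block_index(x_prime, counts)
            out[perms[y][x]] += weight * Fraction(1, n)
    return out


def mixture_of_permutations(sigma, perms, counts):
    """Directly evaluate sum_y (n_y / n) U_y sigma U_y^dagger."""
    n = sum(counts)
    out = [Fraction(0)] * len(sigma)
    for perm, n_y in zip(perms, counts):
        for x, weight in enumerate(sigma):
            out[perm[x]] += Fraction(n_y, n) * weight
    return out


def is_majorized_by(rho, sigma):
    """True if sigma majorizes rho (vectors of equal length)."""
    a, b = sorted(rho, reverse=True), sorted(sigma, reverse=True)
    return all(sum(a[:k]) <= sum(b[:k]) for k in range(1, len(a) + 1))


sigma = [Fraction(3, 4), Fraction(1, 4), Fraction(0)]
perms = [[0, 1, 2], [1, 2, 0]]   # identity and a cyclic shift
counts = [1, 2]                  # weights t = (1/3, 2/3)
rho = noisy_operation(sigma, perms, counts)
print(rho, rho == mixture_of_permutations(sigma, perms, counts),
      is_majorized_by(rho, sigma))
```

Exact fractions are used so that the equality between the noisy operation and the mixture can be tested without rounding issues.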

A.1 Quantification of SD
SD is quantified with functions that preserve the preorder of SD.
Definition A.2 A function f : D cq → R is called a measure of SD if: 1. For any ρ XA , σ X ′ A ′ ∈ D cq we have is a measure of SD, since in any quantum resource theory, a function of the form min ω∈F(A) D(ρ A ω A ) is a measure of a resource [CG19].
Example A.4 (Normalized guessing probability) A special example of the above family of measures of SD is obtained when setting $D$ to be the max-relative entropy $D_{\max}$. Specifically, in [KRS09] it was shown that the guessing probability can be written as
$$p_{\rm guess}(\rho_{XA}) = \min_{\omega_A} 2^{D_{\max}(\rho_{XA} \| I_X \otimes \omega_A)}.$$
Therefore, replacing $I_X$ with the maximally mixed state $\pi_X$ we get that the function
$$\log\big(|X|\, p_{\rm guess}(\rho_{XA})\big) = \min_{\omega_A} D_{\max}(\rho_{XA} \| \pi_X \otimes \omega_A)$$
is a measure of SD. Note that the dimension of the classical system is included on the left-hand side so that the expression remains invariant under replacement of $\rho_{XA}$ with $\rho_{XA} \otimes \pi_{X'}$.
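The invariance noted at the end of Example A.4 can be seen in the simplest possible setting. The sketch below assumes a trivial quantum system $A$, so that the guessing probability of a classical state is just its largest probability; this restriction and the function names are assumptions made for illustration only.

```python
from fractions import Fraction


def guessing_probability(dist):
    """Best guess without side information: pick the most likely symbol."""
    return max(dist)


def normalized_guessing_probability(dist):
    """|X| times the guessing probability."""
    return len(dist) * guessing_probability(dist)


def tensor_with_uniform(dist, extra_dim):
    """Probability vector of rho_X (x) pi_{X'} on the joint classical system XX'."""
    return [p * Fraction(1, extra_dim) for p in dist for _ in range(extra_dim)]


dist = [Fraction(3, 4), Fraction(1, 4)]
extended = tensor_with_uniform(dist, 3)
print(guessing_probability(dist), guessing_probability(extended))
print(normalized_guessing_probability(dist),
      normalized_guessing_probability(extended))
```

The plain guessing probability drops from 3/4 to 1/4 when the uniform register is appended, whereas the dimension-weighted quantity stays at 3/2.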
In this paper we have focused on the RTSD for the particular case in which the dimension of the classical system $X$ is fixed to $|X| = 2$. In this case, as mentioned in the main text, it suffices to consider measures of SD that behave monotonically under CDS but not necessarily under conditional noisy operations (i.e., under free operations that change the dimension of $X$). The measure of SD that we have chosen in the paper is given by Definition 2.5. As shown in Theorem 4.5, it has the particularly pleasing feature of having an operational meaning in the context of SD distillation.

Let $(p\rho_0, (1-p)\rho_1)$ be a pair of subnormalized states, with $p \in (0, 1)$, and $\rho_0$ and $\rho_1$ states. Then this pair is in one-to-one correspondence with the following classical-quantum state:
$$\rho_{XB} := p\,|0\rangle\langle 0|_X \otimes \rho^0_B + (1-p)\,|1\rangle\langle 1|_X \otimes \rho^1_B.$$
Let $(q\sigma_0, (1-q)\sigma_1)$ be another pair of subnormalized states, with $q \in (0, 1)$, and $\sigma_0$ and $\sigma_1$ states. Then this pair is in one-to-one correspondence with the following classical-quantum state:
$$\sigma_{XB'} := q\,|0\rangle\langle 0|_X \otimes \sigma^0_{B'} + (1-q)\,|1\rangle\langle 1|_X \otimes \sigma^1_{B'}.$$
The trace-distance conversion error of $\rho_{XB}$ to $\sigma_{XB'}$ is defined as follows:
$$\min_{\mathcal{N}_{XB \to XB'} \in {\rm CDS}} \frac{1}{2}\big\|\mathcal{N}_{XB \to XB'}(\rho_{XB}) - \sigma_{XB'}\big\|_1,$$
where CDS is the set of conditional doubly stochastic (CDS) maps. We first show that the trace-distance conversion error can be computed by means of a semi-definite program.
Proposition B.1 The trace-distance conversion error between the initial pair (pρ 0 B , (1 − p) ρ 1 B ) and the target pair (qσ 0 B ′ , (1 − q) σ 1 B ′ ) can be calculated by means of the following semi-definite program: The dual program is given by Proof. Recall that an arbitrary CDS channel has the following form: where N 0 B→B ′ and N 1 B→B ′ are completely positive maps such that N 0 B→B ′ + N 1 B→B ′ is trace preserving, and P X is a unitary channel that flips |0 0| and |1 1|. This means that its action on an input p|0 0| X ⊗ ρ 0 is as follows: The semi-definite specifications for the completely positive maps N 0 B→B ′ and N 1 B→B ′ are as follows: Furthermore, the output state is given by Recall that the dual semi-definite program for computing the normalized trace distance of two quantum states ρ and σ is as follows (see, e.g., [WW19a]): So in this case, it follows that where we have now called the output system X for clarity. Then the SDP for the trace-distance conversion error is given by min It is clear that the optimal Y XB ′ respects the classical-quantum structure. So this means that the final SDP can be written as follows: Now we compute the dual of the semi-definite program above. Recall the standard form of primal and dual SDPs: From inspecting the above, we see that Now we need to find the adjoint map of Φ † . Consider that So this means that Then the dual program is given by This can be simplified to the following: Then we can set X 3 We can finally make the substitution Y B → Y T B and the optimal value is unchanged. Since the operators on the right-hand side of the inequalities just above are separable, the partial transpose has no effect and can be removed. This concludes the proof.

B.1 Minimum error probability and minimum conversion error (in terms of trace distance) to infinite-resource states
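As mentioned in the main text, the minimum error probability is given by
$$p_{\rm err}(\rho_{XB}) = \frac{1}{2}\Big(1 - \big\|p\rho^0_B - (1-p)\rho^1_B\big\|_1\Big).$$
An alternative expression for it is given by
$$p_{\rm err}(\rho_{XB}) = \max_{Y_B \in {\rm Herm}} \big\{{\rm Tr}[Y_B] : Y_B \le p\rho^0_B,\ Y_B \le (1-p)\rho^1_B\big\}, \qquad (B.58)$$
where Herm denotes the set of Hermitian operators acting on the system $B$. Note that the maximising operator $Y_B$ on the right-hand side of the above equation is called the "greatest lower bound (GLB) operator" of the operators $p\rho^0_B$ and $(1-p)\rho^1_B$. The GLB operator is defined in Eq. (84) of [AM14], and the above result was established as Lemma A.7 of the same paper.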
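For a single qubit, the minimum error probability can be evaluated in closed form from the standard Helstrom expression $p_{\rm err} = \frac{1}{2}(1 - \|p\rho_0 - (1-p)\rho_1\|_1)$. The following minimal sketch uses only the Python standard library and is restricted to $2 \times 2$ Hermitian matrices, whose eigenvalues are available analytically; the function names are hypothetical and chosen for illustration.

```python
import math


def trace_norm_2x2(h):
    """Trace norm of a 2x2 Hermitian matrix given as nested lists."""
    (a, b), (_, d) = h
    mean = (a.real + d.real) / 2
    radius = math.sqrt(((a.real - d.real) / 2) ** 2 + abs(b) ** 2)
    # The eigenvalues are mean + radius and mean - radius.
    return abs(mean + radius) + abs(mean - radius)


def min_error_probability(p, rho0, rho1):
    """Helstrom error for discriminating rho0 (prior p) from rho1 (prior 1-p)."""
    diff = [[p * rho0[i][j] - (1 - p) * rho1[i][j] for j in range(2)]
            for i in range(2)]
    return 0.5 * (1 - trace_norm_2x2(diff))


zero = [[1.0, 0.0], [0.0, 0.0]]   # |0><0|
one = [[0.0, 0.0], [0.0, 1.0]]    # |1><1|
plus = [[0.5, 0.5], [0.5, 0.5]]   # |+><+|

print(min_error_probability(1 / 3, zero, one))    # orthogonal states
print(min_error_probability(1 / 3, zero, zero))   # identical states
print(min_error_probability(1 / 2, zero, plus))   # non-orthogonal pure states
```

Orthogonal states give zero error, identical states give $\min\{p, 1-p\}$ (the error of guessing from the prior alone), and the pair $|0\rangle, |+\rangle$ with uniform prior gives $\frac{1}{2}(1 - 1/\sqrt{2})$.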
Consider the infinite-resource state $\gamma^{(\infty,q)}$, which is the $(M, q)$-golden unit (Definition 3.4) with $M = \infty$. It follows from Lemma 3.3 that for all $q, q' \in [0, 1]$, it is possible to perform the transformation $\mathcal{N}_{XQ}(\gamma^{(\infty,q)}_{XQ}) = \gamma^{(\infty,q')}_{XQ}$ by means of a CDS channel $\mathcal{N}_{XQ}$.

Lemma B.2 Let $\rho_{XB} \equiv (p, \rho_0, \rho_1)$ be a c-q state. Then the following equality holds for all $q, q' \in [0, 1]$:

Proof. We first establish the inequality $\ge$. Let $\mathcal{N}'_{XQ}$ be the CDS channel that converts $\gamma^{(\infty,q)}_{XQ}$ to $\gamma^{(\infty,q')}_{XQ}$. Then consider that

The first inequality follows because $\mathcal{N}'_{XQ} \circ \mathcal{N}_{XB \to XQ}$ is a member of the set of CDS channels. The second inequality follows from the data-processing inequality for the trace distance. We can then apply the same argument, with the roles of $q$ and $q'$ exchanged, to arrive at the opposite inequality.
by demonstrating the existence of a value of q ∈ [0, 1] and a CDS channel for which the left-hand side is equal to p err (ρ XB ). From Lemma B.2, it follows that the left-hand side of (B.64) is independent of q ∈ [0, 1]. So we can pick q = p, and the value is unchanged. Now consider that the channel used in state discrimination is a simple local channel of the following form: and so id X ⊗M B→Q is a CDS channel. Acting with it on ρ XB leads to the following state: Since this is a particular choice, it follows that min q∈[0,1], Now let us compute the trace distance between M B→Q (ρ XB ) and the simple state γ (∞,p) XQ : We now establish the opposite inequality. Since the value of q does not matter, let us set it to 1/2, so that By applying Proposition B.1 and weak duality of semi-definite programming, we conclude that the trace-distance conversion error min q∈[0,1], is not smaller than the optimal value of the following SDP: the above constraints are equivalent to the following: This SDP is thus equal to the following one: This quantity is precisely the trace of the greatest lower bound operator, and so we conclude by applying (B.58).
Lemma B.4 Let ρ XB ≡ p, ρ 0 B , ρ 1 B be a c-q state. Then the following equality holds Proof. The inequality ≤ follows by the same reasoning given at the beginning of the proof of the previous theorem. The opposite inequality follows because we can apply a completely dephasing channel to the Q system and the state γ (∞,p) XQ remains invariant, while the channel N is transformed to a measurement channel. The trace distance does not increase under such a channel and evaluating it leads to an expression for the error probability under a particular measurement.
C Derivation of the SDPs for scaled trace distance $D'$ and minimum conversion error in Propositions 3.15 and 3.16
Proof of Proposition 3.15. We begin by rewriting the scaled trace distance $D'(\rho_{XA}, \sigma_{XA})$ as follows: Furthermore, let us introduce
$$t = \frac{{\rm Tr}[L_{XA}(\rho_{XA} - \sigma_{XA})]}{1 - {\rm Tr}[P_A(q\sigma_0 - (1-q)\sigma_1)]}$$
and obtain The constraints in the optimization above still have bilinear conditions. However, we can absorb $tP_A$ into a single variable and obtain the simplified SDP in (3.28).

We now continue with the derivation of the dual SDP stated in (3.29). The standard form of primal and dual SDPs is as follows [Wat18]: (C.8) In standard form, this SDP is as follows:
$$X = {\rm diag}(t, L^0_{XA}, L^1_{XA}, P^0_A, P^1_A), \qquad (C.9)$$
$$A = {\rm diag}(1, 0, 0, 0, 0), \qquad (C.10)$$
where $\Pi_{\rho_{XA} \ge \sigma_{XA}}$ is the projection onto the non-negative eigenspace of $\rho_{XA} - \sigma_{XA}$ and $\Pi_{\rho_{XA} < \sigma_{XA}}$ is the projection onto the strictly negative eigenspace. Then all of the following constraints are satisfied with strict inequality (except for the final equality):
$$t, L^0_{XA}, L^1_{XA}, P^0_A, P^1_A \ge 0, \qquad (C.33)$$
where $P$ is the positive part of $q\sigma_0 - (1-q)\sigma_1$ and $N$ is the negative part of $q\sigma_0 - (1-q)\sigma_1$. Under these choices, we find that the constraints from the dual program are met, i.e., as follows:

Finally, suppose now that $\|\rho_{XA} - \sigma_{XA}\|_1 = 0$ and $\|q\sigma_0 - (1-q)\sigma_1\|_1 < 1$. Then the choices $t = 0$, $L_{XA} = 0$, and $P_A = 0$ are feasible for the primal and lead to a value of zero for the objective function. Also, setting $B_{XA}$ to be the positive part of $s(\rho_{XA} - \sigma_{XA})$ and $C_{XA}$ to be the negative part of $s(\rho_{XA} - \sigma_{XA})$, with the same choices for $s$, $D_A$, and $E_A$ as given above, leads to feasible choices for the dual, for which the objective function also evaluates to zero. So strong duality holds in this case also.
Proof of Proposition 3.16. Using (3.29), the scaled trace distance D ′ (·, ·) for states τ XA := t|0 0| ⊗ τ 0 + (1 − t) |1 1| ⊗ τ 1 and ω XA := w|0 0| ⊗ ω 0 + (1 − w) |1 1| ⊗ ω 1 can be written as the following semi-definite program As written, this is not a semi-definite program, due to the bilinear term sτ XA ′ in the second line above, given that s is an optimization variable and τ XA ′ includes the optimization variables Γ N 0 AA ′ and Γ N 1 AA ′ . However, we observe that s ≥ 1, due to the constraints Tr[D A ′ + E A ′ ] ≤ s − 1 and D A ′ , E A ′ ≥ 0. We can then make the reassignments sΓ N 0 AA ′ → Ω 0 AA ′ and sΓ N 1 AA ′ → Ω 1 AA ′ to rewrite the above optimization as follows: This concludes the proof. As written, this is not an SDP. However, through the substitutions B XQ → sB XQ , C XQ → sC XQ , D Q → sD Q , and E Q → sE Q and observing that s ≥ 1, we arrive at the following SDP: It suffices to take B XQ and C XQ to have the following form: c i,j |i i| X ⊗ |j j| Q , (D.14) with b i,j , c i,j ≥ 0. Also, it suffices to take with d i , e i ≥ 0. We also have that pπ 1/r − (1 − p) σ (1) π 1/r σ (1) (1/r,1/2) XQ , D Q − E Q = 1 2 (π 1/r − σ (1) π 1/r σ (1) ), Then by applying a completely dephasing channel to the Q system, the state γ (1/r,1/2) XQ does not change, whereas the CDS channel becomes a measurement channel of the following form: The effect of the CDS channel N XA→XQ on the c-q state ρ XA is as follows: This is also a bilinear program, due to terms like (2M − 1) ρ 1 and (2M − 1) ρ 1 appearing in the optimization.