Computable lower bounds on the entanglement cost of quantum channels

A class of lower bounds for the entanglement cost of any quantum state was recently introduced in [arXiv:2111.02438] in the form of entanglement monotones known as the tempered robustness and tempered negativity. Here we extend their definitions to point-to-point quantum channels, establishing a lower bound for the asymptotic entanglement cost of any channel, whether finite or infinite dimensional. This leads, in particular, to a bound that is computable as a semidefinite program and that can outperform previously known lower bounds, including ones based on quantum relative entropy. In the course of our proof we establish a useful link between the robustness of entanglement of quantum states and quantum channels, which requires several technical developments such as showing the lower semicontinuity of the robustness of entanglement of a channel in the weak*-operator topology on bounded linear maps between spaces of trace class operators.


I. INTRODUCTION
From a modern perspective, one of the fundamental conceptual contributions of classical thermodynamics [1][2][3][4] is the realisation that information itself is an entity with observable physical consequences. The advent of quantum mechanics and the epistemological revolution it brought [5][6][7][8] have endowed this statement with a more profound meaning. On the other hand, information theory [9] has taught us that information can only be understood by means of the operational tasks it enables. In this sense, it is only natural that a key role in the discipline is played by the processes that allow information to be manipulated. In the theory of quantum information [10][11][12], which aims to combine quantum physics and information theory, such processes are represented by quantum channels [13,14].
Quantum information is concerned, among other things, with how resources can be interconverted [15,16]. In this spirit, a great effort has been devoted to the problem of understanding how quantum channels can be transformed into each other. For example, given many uses of a point-to-point quantum channel $\Lambda : A \to B$ connecting Alice's system $A$ to Bob's system $B$, one is often interested in determining how much information Alice can transmit to Bob, i.e. the quantum capacity of the channel [17][18][19][20]. Equivalently, this problem can be thought of as that of understanding how efficiently one can simulate the noiseless qubit identity channel $\mathrm{id}_2$ given $\Lambda$.
While this question has received much attention in the last decades, its converse, i.e. the problem of determining the rate at which resources are needed to simulate a given (noisy) quantum channel Λ, although conceptually appealing, is much less studied. Here, the word 'resources' can take, depending on the context, different meanings, with the two main ones referring to entanglement and classical communication. For example, a notable result in this area is the quantum reverse Shannon theorem, stating that in the presence of free entanglement the classical communication cost of a quantum channel is equal to its entanglement-assisted capacity, i.e. to the amount of classical communication that it could have conveyed in the first place, again in the presence of free entanglement [21][22][23]. While this result is aesthetically pleasing because it establishes a reversible theory, it is not easy to imagine situations in which quantum entanglement, notoriously hard to maintain over long distances, could be counted as a free resource. A complementary approach, instead, is to consider classical communication free and to look primarily at the cost in terms of entanglement consumption. These considerations have inspired the notion of entanglement cost for quantum channels [24,25]. As is the case for states [26][27][28], no single-letter (let alone closed-form) expression is known for the entanglement cost of a quantum channel. In the channel case, the situation is in fact even more intricate than for states, because dynamical resources can be used in a sequential order, where each channel use can influence the subsequent ones. Therefore, one can identify at least two notions of entanglement cost of a quantum channel, one corresponding to what is needed to simulate parallel instances [24], and the other encompassing possible overheads required for sequential simulation [25].
In fact, even more general schemes for the transformations of quantum channels can be conceived by allowing more exotic protocols that do not assume a fixed causal order of the channel uses [29,30].
In this work, we focus on the challenging problem of computing lower bounds on the entanglement cost of quantum channels. The fundamental mathematical difficulty associated with this problem is, as for states, the absence of a single-letter formula. In the channel setting, however, further complications may arise, linked to the optimisation over entangled input states across many uses of the channel [24, Eq. (1)]. Here we bypass these difficulties by generalising the tempering method recently introduced by us [31] to the dynamical setting of quantum channels. Our fundamental result is a semidefinite-programming-computable lower bound on the parallel (and hence also on more general notions, such as the sequential) entanglement cost of a channel in terms of a quantity called the tempered negativity. We show with an example that our new result can improve upon all previously known lower bounds. Building on these findings, we conclude by providing a complete proof of the result announced in [31] that the theory of point-to-point quantum channel manipulation is fundamentally irreversible under the set of all channel-to-channel transformations that preserve either the set of entanglement-breaking channels, or that of channels with a positive partial transpose.
The rest of the paper is structured as follows. We begin in Section II with an introduction to the concepts underlying the investigation of quantum channels and quantum entanglement, and recall the tempering method for quantum states developed in [31]. Section III then deals with the problem of how to choose a suitable topology on the space of quantum channels to study their properties: as we show, the most appropriate choice here is an often overlooked weak*-operator topology. Section IV contains the main results of our work: here we rigorously introduce the notions of quantum capacity and quantum channel entanglement cost, generalise the tempering method to quantum channels, and then use the tempered monotones to provide a new lower bound on the entanglement cost of any channel. In Section V, we show that the bound can perform better than previously known computable bounds for entanglement cost, and prove the general asymptotic irreversibility of channel manipulation. Our last Section VI is devoted to a complete proof of one of our technical results used in Section IV, namely the equivalence of the measure of entanglement known as the robustness for states and channels.

A. Quantum systems
Quantum systems, denoted with capital letters $A$, $B$, etc., are mathematically represented by separable Hilbert spaces $\mathcal{H}_A$, $\mathcal{H}_B$, and so on. In this paper we shall consider the fully general case of infinite-dimensional spaces, which is arguably the most fundamental: in fact, all quantum fields that we suspect to model the fundamental constituents of matter are intrinsically infinite-dimensional. Among the spaces of trace class, compact, and bounded operators on $\mathcal{H}$, the inclusions
\[ \mathcal{T}(\mathcal{H}) \subseteq \mathcal{C}(\mathcal{H}) \subseteq \mathcal{B}(\mathcal{H}) \]
hold; here, the inclusions are intended to be between sets (and not Banach spaces), and are all strict unless $\dim \mathcal{H} < \infty$.
If $\mathcal{H}$ is separable, as we will always assume, both $\mathcal{C}(\mathcal{H})$ and $\mathcal{T}(\mathcal{H})$ (but not $\mathcal{B}(\mathcal{H})$!) can be shown to be separable as well (as Banach spaces). The separability of $\mathcal{T}(\mathcal{H})$ can be proved simply by taking as a dense subset the set of all operators having a finite expansion with rational coefficients in a fixed orthonormal basis. Since separability of the dual space implies separability of the primal [32, Theorem III.7], and $\mathcal{T}(\mathcal{H}) = \mathcal{C}(\mathcal{H})^*$, it follows immediately that $\mathcal{C}(\mathcal{H})$ is also separable. By contrast, $\mathcal{B}(\mathcal{H})$ is not separable whenever $\mathcal{H}$ is infinite dimensional [33].

B. Quantum channels
Mathematically, a quantum channel from a quantum system $A$ to a quantum system $B$, denoted by $\Lambda : A \to B$, is first and foremost a linear map $\Lambda : \mathcal{T}(\mathcal{H}_A) \to \mathcal{T}(\mathcal{H}_B)$. In order to be a bona fide quantum channel, $\Lambda$ must satisfy two additional conditions:
(i) Complete positivity, which requires that $\mathrm{id}_k \otimes \Lambda : \mathcal{T}(\mathbb{C}^k \otimes \mathcal{H}_A) \to \mathcal{T}(\mathbb{C}^k \otimes \mathcal{H}_B)$ is a positive map for all $k \in \mathbb{N}_+$, where $\mathrm{id}_k$ denotes the identity map acting on the space of $k \times k$ complex matrices, and positivity of a map $\Gamma$ means that $\Gamma(X) \geq 0$ is positive semidefinite for all positive semidefinite $X \geq 0$.
(ii) Trace preservation, which requires that the identity $\mathrm{Tr}\, \Lambda(X) = \mathrm{Tr}\, X$ is obeyed for all $X \in \mathcal{T}(\mathcal{H}_A)$.
In what follows, the set of maps satisfying (i)-(ii) will be denoted with $\mathrm{CPTP}_{A \to B}$. A Hilbert space, or more generally a Banach space, is said to be separable if it admits a countable norm-dense subset.
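As a finite-dimensional illustration of conditions (i)-(ii), the following sketch applies a qubit dephasing channel and checks trace preservation. It assumes the channel is specified through Kraus operators $K_i$, a representation that is not used in the text above: in that form complete positivity holds automatically and trace preservation amounts to $\sum_i K_i^\dagger K_i = 1$. The helper names and the dephasing strength `p` are hypothetical choices made for this example only.

```python
# Sketch only: a qubit dephasing channel in Kraus form (an assumed
# representation chosen for illustration). Matrices are nested lists.

def dagger(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def apply_channel(kraus, x):
    """Compute sum_i K_i X K_i^dagger."""
    n = len(x)
    out = [[0j] * n for _ in range(n)]
    for k in kraus:
        out = add(out, matmul(matmul(k, x), dagger(k)))
    return out

def is_trace_preserving(kraus, tol=1e-12):
    """Check sum_i K_i^dagger K_i = identity."""
    n = len(kraus[0])
    s = [[0j] * n for _ in range(n)]
    for k in kraus:
        s = add(s, matmul(dagger(k), k))
    return all(abs(s[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

p = 0.3  # hypothetical dephasing strength
a, b = (1 - p) ** 0.5, p ** 0.5
dephasing = [[[a, 0], [0, a]], [[b, 0], [0, -b]]]  # sqrt(1-p) * 1, sqrt(p) * Z
rho = [[0.5, 0.5], [0.5, 0.5]]                     # the state |+><+|
out = apply_channel(dephasing, rho)
```

The output keeps unit trace while the off-diagonal entries are damped by the factor $1 - 2p$.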

C. Separability and the PPT criterion
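The Hilbert space associated with a bipartite quantum system $AB$ is simply the tensor product of the local spaces, in formula $\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B$. A very important set of states within $\mathcal{D}(\mathcal{H}_{AB})$ is composed of separable states, formally defined as the closed convex hull of product states, i.e.
\[ \mathcal{S}^1_{AB} := \mathrm{cl}\, \mathrm{conv} \left\{ |\psi\rangle\langle\psi|_A \otimes |\phi\rangle\langle\phi|_B :\ |\psi\rangle \in \mathcal{H}_A,\ |\phi\rangle \in \mathcal{H}_B,\ \langle\psi|\psi\rangle = 1 = \langle\phi|\phi\rangle \right\} . \]
Here, the closure is taken with respect to the trace norm topology (see Section III for an introduction to topologies for quantum systems). It can be shown [34] that a state $\sigma_{AB}$ is separable if and only if it admits the expression
\[ \sigma_{AB} = \int |\psi\rangle\langle\psi|_A \otimes |\phi\rangle\langle\phi|_B \ \mathrm{d}\mu(\psi, \phi) , \]
where $\mu$ is a Borel probability measure on the product of the sets of local (normalised) pure states. The cone generated inside $\mathcal{T}_+(\mathcal{H}_{AB})$ by the set of separable states is
\[ \mathcal{S}_{AB} := \left\{ \lambda \sigma :\ \lambda \in [0, \infty),\ \sigma \in \mathcal{S}^1_{AB} \right\} . \tag{4} \]
Since deciding whether a state is separable or not is a notoriously intractable problem [35,36], some handy criteria have been developed to facilitate this task. The most notable of those is the positive partial transposition (PPT) criterion [37]. The partial transpose of some $X \in \mathcal{T}(\mathcal{H}_{AB})$, denoted $X^\Gamma$, is defined by first assuming that $X = X_A \otimes X_B$, in which case $X^\Gamma = X_A \otimes X_B^\intercal$, the transposition being with respect to a fixed (but immaterial) basis of $\mathcal{H}_B$, and then extending the operation to the whole $\mathcal{T}(\mathcal{H}_{AB})$ by linearity. It is worth observing that the resulting operator $X^\Gamma \in \mathcal{B}(\mathcal{H}_{AB})$ will be bounded but in general not of trace class. We refer the reader to [31, Section VII.A, Supplementary Information] for further details on some subtleties concerning the infinite-dimensional case. We can now observe that the partial transpose of any separable state is necessarily a positive semidefinite operator. Therefore,
\[ \mathcal{S}_{AB} \subseteq \mathcal{PPT}_{AB} := \left\{ X \in \mathcal{T}_+(\mathcal{H}_{AB}) :\ X^\Gamma \geq 0 \right\} , \tag{5} \]
which is precisely the aforementioned PPT criterion.
Note. Hereafter we will denote with $\mathcal{K}$ any one of the two cones $\mathcal{S}$ or $\mathcal{PPT}$, defined by (4) and (5), respectively. Therefore, a statement involving $\mathcal{K}$ will be intended to hold equally well for $\mathcal{K} = \mathcal{S}$ or $\mathcal{K} = \mathcal{PPT}$.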
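The PPT criterion can be made concrete on the two-qubit maximally entangled state. The sketch below, restricted to real matrices in finite dimension, computes the partial transpose on the second factor and exhibits a vector with negative expectation value, which certifies that the partially transposed operator is not positive semidefinite and hence that the state is entangled. The function names and the index convention $(i, j) \mapsto i d + j$ are assumptions of this example.

```python
# Sketch only: partial transpose of a real (d*d) x (d*d) matrix and a
# negativity witness for the maximally entangled state Phi_d.

def partial_transpose(x, d):
    """Transpose the second tensor factor: <i l|out|k j> = <i j|x|k l>."""
    n = d * d
    out = [[0.0] * n for _ in range(n)]
    for i in range(d):
        for j in range(d):
            for k in range(d):
                for l in range(d):
                    out[i * d + l][k * d + j] = x[i * d + j][k * d + l]
    return out

def max_entangled(d):
    """Phi_d = (1/d) sum_{i,k} |ii><kk|."""
    n = d * d
    phi = [[0.0] * n for _ in range(n)]
    for i in range(d):
        for k in range(d):
            phi[i * d + i][k * d + k] = 1.0 / d
    return phi

def expectation(x, v):
    """<v|x|v> for a real vector v."""
    n = len(v)
    return sum(v[r] * x[r][c] * v[c] for r in range(n) for c in range(n))

d = 2
phi_pt = partial_transpose(max_entangled(d), d)
singlet = [0.0, 2 ** -0.5, -(2 ** -0.5), 0.0]  # (|01> - |10>)/sqrt(2)
value = expectation(phi_pt, singlet)            # equals -1/d
```

Here the partial transpose of $\Phi_d$ is the swap operator divided by $d$, so any antisymmetric vector yields the expectation value $-1/d$.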

D. Robustness, negativity, and tempering
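From now on all states are implicitly understood to be on a bipartite system $AB$, even though we will often omit the subscripts. Given a state $\rho = \rho_{AB}$, how can one quantify its entanglement content? A particularly simple and arguably fruitful idea is to use its (standard) $\mathcal{K}$-robustness, defined by
\[ R^s_{\mathcal{K}}(\rho) := \inf \left\{ \mathrm{Tr}\, \delta :\ \delta \in \mathcal{K},\ \rho + \delta \in \mathcal{K} \right\} = \frac{1}{2} \left( \sup \left\{ \mathrm{Tr}\, X \rho :\ X \in [-1, 1]_{\mathcal{K}^*} \right\} - 1 \right) . \tag{6} \]
In the above equation, $\delta$ is assumed to be of trace class, the dual cone $\mathcal{K}^*$ is defined by
\[ \mathcal{K}^* := \left\{ X = X^\dagger \in \mathcal{B}(\mathcal{H}_{AB}) :\ \mathrm{Tr}\, X Y \geq 0 \ \ \forall\, Y \in \mathcal{K} \right\} , \tag{7} \]
and the corresponding operator interval is
\[ [-1, 1]_{\mathcal{K}^*} := \left\{ X = X^\dagger \in \mathcal{B}(\mathcal{H}_{AB}) :\ 1 - X \in \mathcal{K}^*,\ 1 + X \in \mathcal{K}^* \right\} . \tag{8} \]
Note that $R^s_{\mathcal{PPT}}(\rho) \leq R^s_{\mathcal{S}}(\rho)$ for all states $\rho$, simply because of the inclusion in (5). Let us observe in passing that in this paper we use the original definition of Vidal and Tarrach [38], instead of adopting the more recent convention of Ref. [39,40], according to which the robustness would be defined as $1 + R^s_{\mathcal{K}}$.
A simpler quantity to compute is the negativity, defined by [41,42]
\[ N(\rho) := \left\| \rho^\Gamma \right\|_1 = \sup \left\{ \mathrm{Tr}\, X \rho :\ \left\| X^\Gamma \right\|_\infty \leq 1 \right\} . \tag{9} \]
It is also useful to consider its logarithmic version, the logarithmic negativity, defined by
\[ E_N(\rho) := \log_2 N(\rho) = \log_2 \left\| \rho^\Gamma \right\|_1 . \tag{10} \]
Note that the convention that we are adopting here differs slightly from the one employed by Vidal and Werner [41], who take the negativity to be $\frac{1}{2} \left( N(\rho) - 1 \right)$. Our logarithmic negativity, instead, is the same as in [41].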
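In [31], we introduced a technique called 'tempering' that can be applied to yield modified versions of the robustness and the negativity. For a pair of states $\rho$, $\omega$ on a bipartite system $AB$, the $\omega$-tempered $\mathcal{K}$-robustness is defined by
\[ R^\tau_{\mathcal{K}}(\rho | \omega) := \sup \left\{ \mathrm{Tr}\, X \rho :\ X \in [-1, 1]_{\mathcal{K}^*},\ \| X \|_\infty = \mathrm{Tr}\, X \omega \right\} , \]
where $[-1, 1]_{\mathcal{K}^*}$ is given by (8) (cf. (6)). Clearly, this is a convex program, and even a semidefinite one (a.k.a. SDP) for the special case $\mathcal{K} = \mathcal{PPT}$. Analogously, the $\omega$-tempered negativity is defined by
\[ N_\tau(\rho | \omega) := \sup \left\{ \mathrm{Tr}\, X \rho :\ \left\| X^\Gamma \right\|_\infty \leq 1,\ \| X \|_\infty = \mathrm{Tr}\, X \omega \right\} . \tag{14} \]
Compare this with (9). Once again, the expression in (14) is in fact an SDP. When a state is tempered by itself we simply write $R^\tau_{\mathcal{K}}(\rho) := R^\tau_{\mathcal{K}}(\rho | \rho)$ and $N_\tau(\rho) := N_\tau(\rho | \rho)$. The corresponding tempered logarithmic negativity is
\[ E^\tau_N(\rho) := \log_2 N_\tau(\rho) . \]
For a survey of the properties of the tempered robustness and negativity we refer the reader to [31, Proposition S5]. The main application of these quantities in [31] was to establish a universal and computable lower bound on the entanglement cost of any quantum state; the generalisation of this result to channels is precisely the aim of this work.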
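As a worked example of these definitions, consider the maximally entangled state $\Phi_d$ of local dimension $d$. The computation below is a sketch that assumes the negativity is taken in the form $N(\rho) = \|\rho^\Gamma\|_1$ and the tempered negativity in the form $\sup\{\mathrm{Tr}\, X\rho : \|X^\Gamma\|_\infty \leq 1,\ \|X\|_\infty = \mathrm{Tr}\, X\omega\}$ with $\omega = \rho = \Phi_d$, following [31]; the symbol $F$ for the swap operator is introduced here for convenience.

```latex
\begin{align*}
\Phi_d^{\Gamma}
  &= \frac{1}{d} \sum_{i,j=1}^{d} |i\rangle\langle j| \otimes |j\rangle\langle i|
   = \frac{F}{d},
  \qquad F |a\rangle|b\rangle = |b\rangle|a\rangle, \\
N(\Phi_d)
  &= \left\| \Phi_d^\Gamma \right\|_1 = \frac{\|F\|_1}{d} = \frac{d^2}{d} = d,
  \qquad E_N(\Phi_d) = \log_2 d, \\
X := d\, \Phi_d :\quad
  & \left\| X^\Gamma \right\|_\infty = \|F\|_\infty = 1, \qquad
    \|X\|_\infty = d = \mathrm{Tr}\, X \Phi_d .
\end{align*}
```

The operator $X = d\,\Phi_d$ is thus feasible for the tempered problem and achieves the value $\mathrm{Tr}\, X\Phi_d = d$. Since the tempered negativity only adds a constraint to the supremum defining $N$, it cannot exceed $N(\Phi_d) = d$, so in this example tempering costs nothing and both quantities equal $d$.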

III. THE UNJUSTLY OVERLOOKED WEAK*-OPERATOR TOPOLOGY
Before proceeding with the investigation of entanglement of quantum channels, let us address an issue pertinent to the study of their properties: the choice of a suitable topology on the space of quantum channels. The purpose of this section is to introduce and discuss the notion of the weak*-operator topology, which will prove instrumental to some of our proofs.

A. Topologies on the set of quantum states
Let us begin by considering topologies at the level of states. The space of interest here is therefore $\mathcal{T}(\mathcal{H})$, i.e. the Banach space of trace class operators on some separable Hilbert space $\mathcal{H}$. We will consider mainly two topologies on $\mathcal{T}(\mathcal{H})$, namely:
• The trace norm topology, induced by the native norm $\|\cdot\|_1$. A sequence $(T_n)_{n \in \mathbb{N}}$ of trace class operators is said to converge with respect to the trace norm topology to some $T \in \mathcal{T}(\mathcal{H})$, and we write $T_n \xrightarrow[n\to\infty]{\mathrm{tn}} T$, if $\lim_{n\to\infty} \|T_n - T\|_1 = 0$.
• The weak* topology induced by the duality $\mathcal{T}(\mathcal{H}) = \mathcal{C}(\mathcal{H})^*$, where $\mathcal{C}(\mathcal{H})$ denotes the space of compact operators on $\mathcal{H}$. Equivalently, it can be defined as the coarsest topology that makes all functionals of the form $T \mapsto \mathrm{Tr}\, T K$, where $K \in \mathcal{C}(\mathcal{H})$, continuous. Accordingly, a sequence $(T_n)_{n \in \mathbb{N}}$ in $\mathcal{T}(\mathcal{H})$ will be said to converge to $T \in \mathcal{T}(\mathcal{H})$ with respect to the weak* topology, denoted $T_n \xrightarrow[n\to\infty]{w^*} T$, if $\lim_{n\to\infty} \mathrm{Tr}\, T_n K = \mathrm{Tr}\, T K$ for all $K \in \mathcal{C}(\mathcal{H})$.
Clearly, the weak* topology is coarser than the trace norm topology, which implies that any sequence that converges with respect to the latter topology converges (to the same limit) also with respect to the former. Trace norm and weak* topology are however genuinely different in infinite dimension. For example, picking an orthonormal basis $\{|n\rangle\}_{n \in \mathbb{N}}$ of $\mathcal{H}$, one sees that the sequence $(|n\rangle\langle n|)_{n \in \mathbb{N}}$ has no limit with respect to the former topology, yet it satisfies $|n\rangle\langle n| \xrightarrow[n\to\infty]{w^*} 0$. A word of caution before proceeding is advisable: the weak* topology considered here is not the same as the one commonly used in the von Neumann algebra approach to quantum theory [43, Remark 2].
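The example of the rank-one projectors onto the vectors of an orthonormal basis can be verified in two lines. The sketch below uses only the standard fact that a compact operator maps weakly null sequences to norm null sequences; the symbols $K$, $n$, $m$ are generic.

```latex
\begin{align*}
\left| \mathrm{Tr}\, K\, |n\rangle\langle n| \right|
  &= \left| \langle n | K | n \rangle \right|
   \leq \left\| K |n\rangle \right\| \xrightarrow[n\to\infty]{} 0
  \qquad \text{for every compact } K, \\
\left\| \, |n\rangle\langle n| - |m\rangle\langle m| \, \right\|_1
  &= 2 \qquad \text{for all } n \neq m .
\end{align*}
```

The first line shows weak* convergence to $0$. The second shows that the sequence is not Cauchy in trace norm, so it has no trace norm limit at all. Note also that the weak* limit $0$ has trace $0$ although every element of the sequence has trace $1$: the trace is not weak*-continuous.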
The fundamental reason why the weak* topology is so useful to us lies in the Banach-Alaoglu theorem, which states that for every Banach space $\mathcal{X}$, the unit ball of $\mathcal{X}^*$ is weak*-compact [32, Theorem IV.21]. In our operator setting, by applying this result to the duality $\mathcal{T}(\mathcal{H}) = \mathcal{C}(\mathcal{H})^*$ we immediately deduce that the trace norm unit ball $\{T \in \mathcal{T}(\mathcal{H}) : \|T\|_1 \leq 1\}$ is weak*-compact.

B. Topologies on the set of quantum channels
We now move on to the discussion of topologies on spaces of quantum channels. For the sake of this presentation, let us fix two quantum systems $A$ and $B$, and let us use the shorthand notation $\mathcal{T}_A := \mathcal{T}(\mathcal{H}_A)$ and $\mathcal{T}_B := \mathcal{T}(\mathcal{H}_B)$ for the respective spaces of trace class operators. Quantum channels can be thought of as elements of the space $\mathcal{B}(\mathcal{T}_A \to \mathcal{T}_B)$ of linear maps $\Lambda : \mathcal{T}_A \to \mathcal{T}_B$ that are bounded with respect to the trace norm, i.e. that satisfy $\|\Lambda\|_{1\to1} := \sup_{X \in \mathcal{T}_A,\ \|X\|_1 \leq 1} \|\Lambda(X)\|_1 < \infty$. We can turn $\mathcal{B}(\mathcal{T}_A \to \mathcal{T}_B)$ into a Banach space by equipping it with the norm $\|\cdot\|_{1\to1}$. As it turns out, every completely positive and trace preserving map, and hence any quantum channel, belongs to $\mathcal{B}(\mathcal{T}_A \to \mathcal{T}_B)$; moreover, its norm is precisely 1.
When it comes to the choice of a topology on $\mathcal{B}(\mathcal{T}_A \to \mathcal{T}_B)$, there are several possibilities. The most common choices are however essentially two:
• The diamond norm topology, induced by a norm alternative to $\|\cdot\|_{1\to1}$ called the diamond norm (completely bounded trace norm). This is given by [44]
\[ \|\Lambda\|_\diamond := \sup_{\rho \in \mathcal{D}(\mathcal{H}_A \otimes \mathcal{H}_A)} \left\| [\mathrm{id}_A \otimes \Lambda](\rho) \right\|_1 , \tag{16} \]
with the optimisation being over all bipartite quantum states $\rho \in \mathcal{D}(\mathcal{H}_A \otimes \mathcal{H}_A)$ on two copies of the Hilbert space of Alice's system $A$. The induced distance represents a natural extension of the trace distance to quantum channels, obeying an equivalent of the Helstrom-Holevo theorem: the diamond norm distance between any two quantum channels captures the difficulty in distinguishing them operationally [46].
According to the diamond norm topology, a sequence $(\Lambda_n)_{n \in \mathbb{N}}$ in $\mathcal{B}(\mathcal{T}_A \to \mathcal{T}_B)$ converges to $\Lambda$ if $\lim_{n\to\infty} \|\Lambda_n - \Lambda\|_\diamond = 0$. This is essentially the choice made in the definitions (26)-(27).
(A remark on the weak* topology used throughout this section: strictly speaking, since the weak* topology is non-metrisable in general, we should be talking about nets rather than sequences. However, we will see that this technical complication can be avoided in most cases of interest here.)
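• Since the above topology turns out to be too strong for many purposes, most notably when dealing with infinite-dimensional systems [47][48][49], it is customary to employ also the strong operator topology, induced by the family of semi-norms $\Lambda \mapsto \|\Lambda(X)\|_1$, for all $X \in \mathcal{T}_A$. This implies that a sequence of channels $(\Lambda_n)_{n \in \mathbb{N}}$ in $\mathcal{B}(\mathcal{T}_A \to \mathcal{T}_B)$ converges to $\Lambda$ with respect to the strong operator topology, and we write $\Lambda_n \xrightarrow[n\to\infty]{\mathrm{so}} \Lambda$, if $\lim_{n\to\infty} \|\Lambda_n(X) - \Lambda(X)\|_1 = 0$ for all $X \in \mathcal{T}_A$.
Despite the name, the strong operator topology is actually weaker (i.e. coarser) than the diamond norm topology. And still, for what we have in mind it is too strong. In order to exploit the power of the Banach-Alaoglu theorem, we need to devise a version of the weak* topology that applies to the channel setting. The simple solution is the following: we define the weak*-operator topology on $\mathcal{B}(\mathcal{T}_A \to \mathcal{T}_B)$ as the coarsest topology that makes all functionals of the form $\Lambda \mapsto \mathrm{Tr}\, \Lambda(X) K$, where $X \in \mathcal{T}_A$ and $K \in \mathcal{C}_B := \mathcal{C}(\mathcal{H}_B)$, continuous. We immediately see that a sequence $(\Lambda_n)_{n \in \mathbb{N}}$ converges to $\Lambda$ with respect to the weak*-operator topology, which we will write $\Lambda_n \xrightarrow[n\to\infty]{w^*o} \Lambda$, if and only if $\lim_{n\to\infty} \mathrm{Tr}\, \Lambda_n(X) K = \mathrm{Tr}\, \Lambda(X) K$ for all $X \in \mathcal{T}_A$ and $K \in \mathcal{C}_B$.
Remark. The use of the weak* topology in the context of quantum resource theories of states was explored in [39,40,43,50]. Building on that, the possibility of extending these concepts to spaces of quantum channels was considered in [51]. Related topologies, such as the bounded weak topology of [52,53], appeared in the literature before, but they have been employed in a rather different way: as is customary in the von Neumann algebra community, they were defined for sets of unital maps, which can be considered as adjoints of quantum channels, acting on operators in $\mathcal{B}(\mathcal{H})$ rather than $\mathcal{T}(\mathcal{H})$ (i.e. the Heisenberg picture).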
With these tools at hand, we can now obtain the following.
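Lemma. The unit ball $\Xi := \{\Lambda \in \mathcal{B}(\mathcal{T}_A \to \mathcal{T}_B) : \|\Lambda\|_{1\to1} \leq 1\}$ is compact with respect to the weak*-operator topology.
Proof. In order to apply the Banach-Alaoglu theorem we need to identify the weak*-operator topology with the weak* topology induced on $\mathcal{B}(\mathcal{T}_A \to \mathcal{T}_B)$ by a pre-dual. Construct the vector space
\[ \mathcal{T}_A \otimes \mathcal{C}_B := \mathrm{span} \left\{ X \otimes K :\ X \in \mathcal{T}_A,\ K \in \mathcal{C}_B \right\} . \]
We can turn it into a normed space by defining the projective tensor norm on it through the expression [54, p. 27]
\[ \|Z\|_\pi := \inf \left\{ \sum\nolimits_i \|X_i\|_1 \|K_i\|_\infty :\ Z = \sum\nolimits_i X_i \otimes K_i \right\} . \]
Let $\mathcal{T}_A \widehat{\otimes}_\pi \mathcal{C}_B$ denote the completion of $\mathcal{T}_A \otimes \mathcal{C}_B$ with respect to the norm $\|\cdot\|_\pi$. It is well known that [54, p. 27]
\[ \left( \mathcal{T}_A \widehat{\otimes}_\pi \mathcal{C}_B \right)^* = \mathcal{B}\left( \mathcal{T}_A \to \mathcal{C}_B^* \right) = \mathcal{B}(\mathcal{T}_A \to \mathcal{T}_B) , \]
with the duality taking the form
\[ \langle \Lambda, X \otimes K \rangle = \mathrm{Tr}\, \Lambda(X) K \]
for all $\Lambda \in \mathcal{B}(\mathcal{T}_A \to \mathcal{T}_B)$, $X \in \mathcal{T}_A$, and $K \in \mathcal{C}_B$, and extended by linearity and continuity to the whole $\mathcal{T}_A \widehat{\otimes}_\pi \mathcal{C}_B$.
Hence, $\mathcal{T}_A \widehat{\otimes}_\pi \mathcal{C}_B$ is a pre-dual of $\mathcal{B}(\mathcal{T}_A \to \mathcal{T}_B)$. Then, the Banach-Alaoglu theorem [32, Theorem IV.21] tells us that the dual unit ball $\Xi$ is compact in the weak* topology induced by $\mathcal{T}_A \widehat{\otimes}_\pi \mathcal{C}_B$. According to this topology, a net $(\Lambda_\alpha)_\alpha$ converges to $\Lambda$ if and only if $\langle \Lambda_\alpha, Z \rangle \to \langle \Lambda, Z \rangle$ for all $Z \in \mathcal{T}_A \widehat{\otimes}_\pi \mathcal{C}_B$. By choosing $Z$ to be of the form $Z = X \otimes K$, we see that the weak* topology we just defined is finer than the weak*-operator topology discussed above. Hence, since $\Xi$ is compact with respect to the former topology, it must be such with respect to the latter as well.
(A remark on the diamond norm (16): it is sometimes defined through an optimisation over only pure states $\psi \in \mathcal{D}(\mathcal{H}_A \otimes \mathcal{H}_A)$, or over all states $\rho \in \mathcal{D}(\mathcal{H}_R \otimes \mathcal{H}_A)$ with the system $R$ arbitrary; all such notions are equivalent [45].)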

A. Quantum capacity and entanglement cost of a channel
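We are now interested in the study of quantum communication, in which the manipulated objects are quantum channels themselves. To this end, we specify the relevant sets of channels which can be regarded as having no entanglement, and hence, as basically useless for the purposes of transmitting quantum systems. These should be thought of as the equivalent of separable and PPT states at the level of maps. Recalling that $\mathcal{K}$ denotes either one of the cones $\mathcal{S}$ and $\mathcal{PPT}$, let us define the set of $\mathcal{K}$-enforcing channels, denoted KE, as
\[ \mathrm{KE}_{A\to B} := \left\{ \Gamma \in \mathrm{CPTP}_{A\to B} :\ [\mathrm{id}_R \otimes \Gamma](X) \in \mathcal{K}_{RB} \ \ \forall\, X \in \mathcal{T}_+(\mathcal{H}_R \otimes \mathcal{H}_A),\ \forall\, R \right\} , \]
where $R$ is an arbitrary ancillary system. In particular, any normalised quantum state $\rho$ satisfies $[\mathrm{id} \otimes \Gamma](\rho) \in \mathcal{K}^1$ when $\Gamma \in \mathrm{KE}$, with $\mathcal{K}^1$ standing for the set of operators in $\mathcal{K}$ with trace one. Without loss of generality, leveraging the fact that $\mathcal{K}$ is weak*-closed (in fact, it would suffice to have it trace norm closed), one can constrain the ancillary space that the identity channel is acting on to be finite-dimensional [34], in the sense that
\[ \mathrm{KE}_{A\to B} = \left\{ \Gamma \in \mathrm{CPTP}_{A\to B} :\ [\mathrm{id}_k \otimes \Gamma](X) \in \mathcal{K} \ \ \forall\, X \in \mathcal{T}_+(\mathbb{C}^k \otimes \mathcal{H}_A),\ \forall\, k \in \mathbb{N}_+ \right\} . \]
When $\mathcal{K} = \mathcal{S}$, the $\mathcal{K}$-enforcing channels are known as entanglement breaking [55]; when $\mathcal{K} = \mathcal{PPT}$, they correspond to PPT-binding (or entanglement-binding) channels [56]. In finite-dimensional spaces, these are precisely the channels whose Choi-Jamiołkowski states $[\mathrm{id} \otimes \Gamma](\Phi_d)$ are separable or PPT, respectively [55,56].
(A remark on the compactness proof of Section III.B: in fact, the two topologies compared there can be shown to coincide on $\Xi$, by virtue of the general fact that a subset of a Banach space and its norm closure generate the same weak* topology on any bounded subset of the dual space.)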
In the context of entanglement theory, the manipulation of quantum resources is typically realised using the class of local operations and classical communication (LOCC) [24,57], which consists of all protocols where the two communicating parties can perform arbitrary channels on the local parts of their systems, and communicate classical information (e.g. measurement results) to each other. However, many fruitful bounds and relations have been obtained by relaxing the considered set of processes, allowing the two parties to employ larger classes of protocols [58][59][60][61][62][63][64]. In order to understand the ultimate capabilities of such channel manipulation schemes, and indeed also to avoid the ambiguity in choosing a 'right' type of transformations to consider, we instead follow the axiomatic approach of [31,65,66] inspired by ideas that first emerged in the context of thermodynamics [67][68][69], and set out to establish a bound that would apply to all relevant processes, without making assumptions about their structure.
As the basic axiom of any communication scheme, we assume that a valid channel manipulation protocol should transform K-enforcing channels without generating any additional resources. We consider this to be the weakest constraint that any physical communication protocol that could reasonably be deemed as 'free', i.e. effectively inexpensive to implement, should satisfy.
To understand why, it is instructive to look at a setting that violates our assumption, such as that of the reverse Shannon theorem. In this framework, pre-shared entanglement is provided for free to the parties, and the only costly resource is instead classical communication [22,23]. Clearly, by adding entanglement it is possible to transform an entanglement-breaking channel into something that is not entanglement-breaking, which contradicts our assumption. As mentioned in the Introduction, this setting is interesting because it leads to a reversibility result: the classical communication cost of implementing the channel is the same as the amount of classical communication that can be extracted from it [21][22][23]. And yet, it is not entirely clear in what concrete setting entanglement could be considered to be a cheaper resource than classical communication. The present state of affairs, in which we have serious difficulties establishing entanglement over distances larger than a thousand kilometers [70] but we routinely communicate classically with the Voyager 1 probe, more than 7 orders of magnitude more distant [71], casts some doubts on the practicality of this route. Our axioms, instead, do not incur this problem, as they prohibit the creation of entanglement for free altogether. They can thus be thought of as a possible extension of the LOCC paradigm that, although much more permissive than the standard LOCC framework, treats entanglement as a costly resource at all stages of the protocol.
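We thus define, first at the level of transformations of single channels, the set of KE-preserving quantum processes:
\[ \mathrm{KEP}_{[A\to B]\to[A'\to B']} := \left\{ \Upsilon : \mathrm{CPTP}_{A\to B} \to \mathrm{CPTP}_{A'\to B'} \ :\ \Upsilon(\Gamma) \in \mathrm{KE}_{A'\to B'} \ \ \forall\, \Gamma \in \mathrm{KE}_{A\to B} \right\} . \]
We do not assume any specific structure of such protocols; although physical transformations of channels are typically taken to have the form of so-called quantum superchannels [72], here we do not need to presuppose that. We can also write $\mathrm{KEP}_{[A\to B]^{\otimes n}\to[A'\to B']^{\otimes m}}$ to denote processes which act on $n$ parallel copies of the space of maps from system $A$ to system $B$, in the sense that
\[ \mathrm{KEP}_{[A\to B]^{\otimes n}\to[A'\to B']^{\otimes m}} := \mathrm{KEP}_{[A^n\to B^n]\to[A'^m\to B'^m]} . \]
However, there are more general ways in which transformation protocols could access $n$ uses of a given channel [29,30,73]. We use the notation $[A\to B]^{\times n}$ to denote $n$-tuples of maps from $A$ to $B$, representing arbitrary uses of multiple channels. That is, a process which uses $n$ channels $(\Lambda_1, \ldots, \Lambda_n) \in [A\to B]^{\times n}$ does not have to use them in parallel, but can use them in any physically consistent manner, including transformations which do not have a fixed causal order. A more general form of a KE-preserving quantum process can then be defined as
\[ \mathrm{KEP}_{[A\to B]^{\times n}\to[A'\to B']} := \left\{ \Upsilon :\ \Upsilon(\Gamma_1, \ldots, \Gamma_n) \in \mathrm{KE}_{A'\to B'} \ \ \forall\, \Gamma_1, \ldots, \Gamma_n \in \mathrm{KE}_{A\to B} \right\} . \]
We note that each transformation $\Upsilon \in \mathrm{KEP}_{[A\to B]^{\times n}\to[A'\to B']}$ is assumed to be an $n$-linear map.
The quantum capacity $Q_{\mathrm{KEP}}(\Lambda)$ is then the maximum rate at which KE-preserving $n$-channel processes can simulate the noiseless communication channel $\mathrm{id}_2^{\otimes m}$ when the channel $\Lambda$ is used $n$ times. The (parallel) entanglement cost $E_{c,\mathrm{KEP}}(\Lambda)$, on the other hand, is given by the least rate at which noiseless identity channels $\mathrm{id}_2$ are required in order to simulate the action of $n$ parallel copies of the given noisy channel $\Lambda$. Precisely,
\[ Q_{\mathrm{KEP}}(\Lambda) := \sup \left\{ r :\ \lim_{n\to\infty}\ \inf_{\Upsilon_n \in \mathrm{KEP}_{[A\to B]^{\times n} \to [A_0\to B_0]^{\otimes \lfloor rn \rfloor}}} \tfrac{1}{2} \left\| \Upsilon_n\!\left(\Lambda^{\times n}\right) - \mathrm{id}_2^{\otimes \lfloor rn \rfloor} \right\|_\diamond = 0 \right\} , \tag{26} \]
\[ E_{c,\mathrm{KEP}}(\Lambda) := \inf \left\{ r :\ \lim_{n\to\infty}\ \inf_{\Upsilon_n \in \mathrm{KEP}_{[A_0\to B_0]^{\otimes \lfloor rn \rfloor} \to [A\to B]^{\otimes n}}} \tfrac{1}{2} \left\| \Upsilon_n\!\left(\mathrm{id}_2^{\otimes \lfloor rn \rfloor}\right) - \Lambda^{\otimes n} \right\|_\diamond = 0 \right\} , \tag{27} \]
where $A_0$, $B_0$ are single-qubit systems. The distance used in the above definitions is the operationally meaningful diamond norm, given by (16) [44,46]. Let us remark on the different systems in play in (26)-(27). In the definition of $Q_{\mathrm{KEP}}(\Lambda)$, we write $\Upsilon_n(\Lambda^{\times n})$ to denote that the $n$ copies of the channel $\Lambda$ do not have to be provided as a tensor product $\Lambda^{\otimes n}$, but can be used in any desired way. The target of this protocol is the channel $\mathrm{id}_2^{\otimes \lfloor rn \rfloor}$, representing $\lfloor rn \rfloor$ qubits of noiseless quantum communication.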
In contrast, in our definition of $E_{c,\mathrm{KEP}}(\Lambda)$ we use a certain number of identity channels to simulate the action of $\Lambda^{\otimes n}$ in parallel. Importantly, this is not the most general definition of channel entanglement cost, and indeed more general simulation schemes can be considered [25]. However, the important point for us is that this definition is the lowest possible entanglement cost of $\Lambda$: having to simulate $\Lambda^{\otimes n}$ is easier than having to simulate $n$ arbitrary uses of it, so our definition of $E_{c,\mathrm{KEP}}$ lower bounds more general ones [25].
We also stress that the choice of the broad class of KE-preserving quantum processes means that other choices of manipulation process, and in particular LOCC, are necessarily subsets of KEP. This immediately gives $Q_{\mathrm{KEP}}(\Lambda) \geq Q_{\mathrm{LOCC}}(\Lambda)$ and $E_{c,\mathrm{KEP}}(\Lambda) \leq E_{c,\mathrm{LOCC}}(\Lambda)$. Finally, we can also consider an extension of the definitions in (26)-(27) which incorporates a non-zero transformation error $\varepsilon$: that is, we no longer demand that the transformation be asymptotically exact, but only that the final error does not exceed a given threshold. The resulting modified notions of quantum capacity and channel entanglement cost, denoted $Q^\varepsilon_{\mathrm{KEP}}(\Lambda)$ and $E^\varepsilon_{c,\mathrm{KEP}}(\Lambda)$, are obtained from (26)-(27) by requiring that the diamond norm error asymptotically not exceed $\varepsilon$ instead of vanishing.
Mirroring the quantification of resources such as entanglement for quantum states, one can ask about how to effectively measure the resource content of a channel. Although such concepts date back to the early days of quantum information [74], it was not until recently that resource measures of quantum channels were formalised [24,47,[75][76][77][78][79][80]. One such measure can be naturally defined by extending the concept of robustness measures [38], which we encountered in the definition of $R^s_{\mathcal{K}}$. The (standard) $\mathcal{K}$-enforcing robustness is
\[ R^s_{\mathrm{KE}}(\Lambda) := \inf \left\{ \lambda \geq 0 :\ \frac{\Lambda + \lambda \Gamma}{1 + \lambda} \in \mathrm{KE}_{A\to B},\ \Gamma \in \mathrm{KE}_{A\to B} \right\} . \]
Note that this quantity is in general not just the robustness of entanglement of the corresponding Choi state [81]. The crucial property of the robustness is its monotonicity under all KE-preserving quantum processes, which we show explicitly for completeness. The robustness $R^s_{\mathrm{KE}}$ is defined at the level of channels, rather than states, which prevents a direct application of methods established for the state case, such as those in Ref. [31]. However, we will show this quantity to obey a very strong relation with the state-based robustness measure $R^s_{\mathcal{K}}$: the channel robustness can be computed by optimising the state robustness $R^s_{\mathcal{K}}$ over all input states.

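Lemma 5 (Channel-state equivalence of the robustness). For any channel $\Lambda : A \to B$, it holds that
\[ R^s_{\mathrm{KE}}(\Lambda) = \sup_{\rho} R^s_{\mathcal{K}}\big( [\mathrm{id}_A \otimes \Lambda](\rho) \big) , \]
where the maximisation is over all states $\rho \in \mathcal{D}(\mathcal{H}_A \otimes \mathcal{H}_A)$ (or, equivalently, over all pure states $\psi$, or over states $\rho \in \mathcal{D}(\mathcal{H}_R \otimes \mathcal{H}_A)$ with $R$ arbitrary).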
The proof of this Lemma is one of the main technical contributions of this work. Due to its length, we defer it to Sec. VI.
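Our main idea will therefore be to go from quantities defined at the level of channels to quantities defined at the level of states, which will allow us to extend the reasoning of Ref. [31] to channel manipulation. In particular, we define the channel tempered robustness and channel tempered negativity as
\[ R^\tau_{\mathrm{KE}}(\Lambda) := \sup_{\rho} R^\tau_{\mathcal{K}}\big( [\mathrm{id}_A \otimes \Lambda](\rho) \big) , \qquad N_\tau(\Lambda) := \sup_{\rho} N_\tau\big( [\mathrm{id}_A \otimes \Lambda](\rho) \big) , \]
with the suprema over input states $\rho \in \mathcal{D}(\mathcal{H}_A \otimes \mathcal{H}_A)$ and each output state tempered by itself, as in [31].
The final auxiliary result that we will need gives the precise value of the channel-based robustness $R^s_{\mathrm{KE}}$ for the identity channel.
Lemma 6. For the identity channel $\mathrm{id}_d$ on a $d$-dimensional system, it holds that $R^s_{\mathrm{KE}}(\mathrm{id}_d) = d - 1$, both for $\mathcal{K} = \mathcal{S}$ and for $\mathcal{K} = \mathcal{PPT}$.
Proof. Recall from [38] that a maximally entangled state can be written as $\Phi_d = d\,\sigma_+ - (d - 1)\,\sigma_-$, where $\sigma_\pm$ are both separable states [82]. It is not difficult to notice that $\sigma_\pm$ correspond to the Choi states of valid quantum channels, i.e. $\sigma_\pm = [\mathrm{id} \otimes \Gamma_\pm](\Phi_d)$ for some $\Gamma_\pm$, which means that such channels are entanglement breaking [55]. This gives a valid feasible decomposition for $\mathrm{id}_d$ as $\mathrm{id}_d = d\,\Gamma_+ - (d - 1)\,\Gamma_-$, implying that $R^s_{\mathrm{KE}}(\mathrm{id}_d) \leq d - 1$. On the other hand, it is known that the state-based robustness $R^s_{\mathcal{K}}$ satisfies $R^s_{\mathcal{K}}(\Phi_d) = d - 1$ [38], which by Lemma 5 gives
\[ R^s_{\mathrm{KE}}(\mathrm{id}_d) \geq R^s_{\mathcal{K}}\big( [\mathrm{id} \otimes \mathrm{id}_d](\Phi_d) \big) = R^s_{\mathcal{K}}(\Phi_d) = d - 1 \]
for both the separable and the PPT cone, so equality must hold.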
Remark. The fact that the robustness $R^s_{\mathrm{KE}}(\Lambda)$ of a channel equals the state-based robustness $R^s_{\mathcal{K}}$ of the corresponding Choi-Jamiołkowski state $[\mathrm{id} \otimes \Lambda](\Phi_d)$ is a more general property satisfied by so-called teleportation-simulable channels [47,57,62,83,84]. Here we limited ourselves to the channel $\mathrm{id}_d$ for simplicity.
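The decomposition of the maximally entangled state used in the proof above can be checked numerically. The sketch below assumes the explicit choice $\sigma_+ = (1 + d\,\Phi_d)/(d(d+1))$ and $\sigma_- = (1 - \Phi_d)/(d^2 - 1)$, i.e. two isotropic states with overlap at most $1/d$ with $\Phi_d$; this particular choice is not spelled out in the text and is made here for illustration. The code verifies only the algebraic identity and the normalisation, not separability, and the function names are hypothetical.

```python
# Sketch only: check Phi_d = d*sigma_plus - (d-1)*sigma_minus for an
# assumed explicit pair of isotropic states (separability is not tested).

def max_entangled(d):
    """Phi_d = (1/d) sum_{i,k} |ii><kk| with index (i, j) -> i*d + j."""
    n = d * d
    phi = [[0.0] * n for _ in range(n)]
    for i in range(d):
        for k in range(d):
            phi[i * d + i][k * d + k] = 1.0 / d
    return phi

def decomposition(d):
    """Return Phi_d together with the assumed states sigma_plus, sigma_minus."""
    n = d * d
    phi = max_entangled(d)
    eye = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    sigma_plus = [[(eye[r][c] + d * phi[r][c]) / (d * (d + 1))
                   for c in range(n)] for r in range(n)]
    sigma_minus = [[(eye[r][c] - phi[r][c]) / (d * d - 1)
                    for c in range(n)] for r in range(n)]
    return phi, sigma_plus, sigma_minus

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

d = 3
n = d * d
phi, sp, sm = decomposition(d)
# Largest entrywise deviation between d*sigma_plus - (d-1)*sigma_minus and Phi_d.
max_dev = max(abs(d * sp[r][c] - (d - 1) * sm[r][c] - phi[r][c])
              for r in range(n) for c in range(n))
```

With this choice the weights $d$ and $d - 1$ differ by one, as required for the left-hand side to have unit trace, and the negative weight $d - 1$ is exactly the robustness value claimed for the identity channel.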

C. General bounds on
With the definitions in place, we are ready to state and prove the main technical result of this paper.

Theorem 7.
With KE denoting either entanglement-breaking or PPT-binding channels, the entanglement cost under KE-preserving quantum processes satisfies a two-part lower bound on $\inf_{\varepsilon \in [0, 1/2)} E^\varepsilon_{c,\mathrm{KEP}}(\Lambda)$: a first bound (40) in terms of regularised tempered quantities of $\Lambda^{\otimes n}$, and a second, single-letter bound (41) in terms of the channel tempered negativity.
Remark. The evaluation of the left-hand side of (40) is, in general, very difficult due to two obstacles: one is the optimisation over the transformation error $\varepsilon$, and the other is the computation of the limit $n \to \infty$ of the number of channel copies which is used to define $E^\varepsilon_{c,\mathrm{KEP}}$ (Eq. (29)). Our first bound in (40) alleviates the former problem, giving in particular a general lower bound on the entanglement cost $E_{c,\mathrm{KEP}}$. The resulting bound, however, still requires an asymptotic regularisation. The crucial aspect of our second bound in (41) is that it is single letter: no optimisation over many channel copies is needed, and the quantity corresponds to an optimisation problem of a fixed size. For any input state $\rho$, the computation of the tempered negativity $N_\tau([\mathrm{id} \otimes \Lambda](\rho))$ is a semidefinite program [31], making the bound efficiently computable in practice.
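Proof. The argument follows closely that given in the proof of Theorem S7 in Ref. [31]. Let $r$ be an achievable rate for the entanglement cost $E^\varepsilon_{c,\mathrm{KEP}}(\Lambda)$ at some error threshold $\varepsilon \in [0, 1/2)$, as per the definition in (29). Consider a sequence of operations $\Upsilon_n \in \mathrm{KEP}_{[A_0\to B_0]^{\otimes \lfloor rn \rfloor} \to [A\to B]^{\otimes n}}$, with $A_0$, $B_0$ single-qubit systems, such that the diamond norm error in the simulation of $\Lambda^{\otimes n}$ asymptotically does not exceed $\varepsilon$. For all sufficiently large $n$, we then write the chain of inequalities (44). Let us go through each of the steps in detail.
(iii) Here we go from the channel-based quantity $R^s_{\mathrm{KE}}$ to the state-based quantity $R^s_{\mathcal{K}}$ through an application of the channel-state equivalence (Lemma 5).
(vi) This step is a consequence of the super-multiplicativity of the channel tempered negativity; explicitly, with $N_\tau(\Lambda)$ understood as the supremum of $N_\tau([\mathrm{id} \otimes \Lambda](\rho))$ over input states $\rho$,
\begin{align*}
N_\tau\!\left(\Lambda^{\otimes n}\right) &= \sup_{\rho_n} N_\tau\big( [\mathrm{id} \otimes \Lambda^{\otimes n}](\rho_n) \big) \\
&\geq \sup_{\rho} N_\tau\big( [\mathrm{id} \otimes \Lambda^{\otimes n}](\rho^{\otimes n}) \big) \\
&= \sup_{\rho} N_\tau\Big( \big( [\mathrm{id} \otimes \Lambda](\rho) \big)^{\otimes n} \Big) \\
&\geq \sup_{\rho} N_\tau\big( [\mathrm{id} \otimes \Lambda](\rho) \big)^{n} \\
&= N_\tau(\Lambda)^n ,
\end{align*}
where in the second inequality (on the fourth line) we used the supermultiplicativity of $N_\tau$ for states, i.e. the fact that $N_\tau(\rho^{\otimes n}) \geq N_\tau(\rho)^n$, which was shown in [31, Proposition S5(e)].
Let us now go back to (44). Applying the logarithm, dividing by $n$, and taking the limit superior as $n \to \infty$ concludes the proof. The stated inequality with the channel tempered robustness follows by applying this procedure to the inequality in step (iv).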
Let us stress again that the KE-preserving processes considered here are a larger class than typically employed ones, such as LOCC or PPT processes. Since the entanglement cost can only increase when a smaller type of channel manipulation schemes is used, the bound of Theorem 7 applies also to any smaller class, and in particular for any channel Λ it holds that ,LOCC (Λ) ≥ (Λ).
We will shortly see that this bound can outperform previously known ones.

Remark.
We observe in passing that the same reasoning used to derive the bounds appearing in Theorem 7 in terms of tempered quantities, which ultimately relies on the properties of the partial transpose, can be repeated for another operation known as reshuffling (or realignment) [85][86][87]. The outcome is another family of lower bounds for the channel entanglement cost, possibly independent of the one provided here. Indeed, the underlying reasoning can be extended also beyond the resource theory of entanglement. A complete account of these developments will be published soon [88].

V. IRREVERSIBILITY OF CHANNEL MANIPULATION: A DETAILED PROOF
In [31] we presented a result establishing the fundamental irreversibility under non-entangling operations of the theory of entanglement manipulation for states, as well as its extension to the channel setting [31, Methods]. The purpose of this section is to provide a complete proof of this result, leveraging the technical tools honed in the previous section. We start by recalling the definition of the two-qutrit state $\omega_3$, whose irreversibility under general non-entangling protocols was shown in [31]; it is defined by
$$\omega_3 := \tfrac{1}{2}\,(P_3 - \Phi_3).$$
Here, $P_3 := \sum_{i=1}^{3} |ii\rangle\langle ii|$ is the projector onto the maximally correlated subspace, and $\Phi_3 = |\Phi_3\rangle\langle\Phi_3|$, with $|\Phi_3\rangle := \frac{1}{\sqrt{3}}\sum_{i=1}^{3}|ii\rangle$, is the maximally entangled state of dimension 3. We then considered the qutrit-to-qutrit channel $\Omega_3$ whose Choi state is $\omega_3$. This is given by [31, Methods]
$$\Omega_3 := \tfrac{3}{2}\,\Delta - \tfrac{1}{2}\,\mathrm{id}_3, \qquad (49)$$
where $\Delta(\cdot) = \sum_{i=1}^{3} |i\rangle\langle i| \cdot |i\rangle\langle i|$ is the completely dephasing channel, which sets to 0 all off-diagonal elements of the input matrix. The fact that the entanglement cost of the channel $\Omega_3$ exceeds its corresponding quantum capacity, even under all KE-preserving transformations, has been announced in [31], as we now recall.
Theorem 8 [31]. The qutrit-to-qutrit channel $\Omega_3$ defined by (49) satisfies
$$E^{\varepsilon}_{c,\mathrm{KEP}}(\Omega_3) \geq 1 > \log_2\tfrac{3}{2} \geq Q^{\varepsilon}_{\mathrm{KEP}}(\Omega_3)$$
for all $\varepsilon\in[0,1/2)$, where $Q^{\varepsilon}_{\mathrm{KEP}}$ denotes the quantum capacity assisted by KE-preserving processes. In particular, the resource theory of communication is irreversible under quantum processes which preserve either the set of entanglement-breaking channels or that of PPT-binding channels.
The above result establishes the fundamental irreversibility of the theory of manipulation of point-to-point channels, and it is therefore in direct analogy with the other main findings of [31] concerning the theory of bipartite states. In the above setting, we consider as free all those protocols that in some sense do not introduce additional entanglement into the system, a philosophy encapsulated in our choice of free operations as KE-preserving processes. It was already known [25] that the theory of quantum channel manipulation is irreversible when only local operations and classical communication are allowed, while here we extend this to the much broader class of KEP transformations.
As we have mentioned previously, a classic result of quantum information known as the reverse quantum Shannon theorem [21][22][23] establishes instead the reversibility of the theory under different circumstances, i.e. when unlimited entanglement is given for free, and instead it is classical communication that is deemed a costly resource. Since it is easy to see that this latter approach does not comply with our assumptions, our results and the reverse quantum Shannon theorem are not in direct contradiction and instead complement each other. Indeed, Theorem 8 can be thought of as a general no-go result: when no entanglement creation is allowed, the irreversibility of quantum communication cannot be circumvented even by going beyond LOCC. The ability to generate entanglement is therefore necessary to achieve reversible channel transformations and establish an equivalent of the reverse Shannon theorem.
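Proof of Theorem 8. The lower bound on $E^{\varepsilon}_{c,\mathrm{KEP}}$ follows from Theorem 7: we have that
$$E^{\varepsilon}_{c,\mathrm{KEP}}(\Omega_3) \geq E^{\tau}_{N}(\Omega_3) \geq \log_2 N_{\tau}(\omega_3) = 1,$$
where the last equality was shown in Theorem S9 of [31].
To bound the quantum capacity of $\Omega_3$, we will use the channel divergence based on the max-relative entropy $D_{\max}$ [24,89] (also known as generalised robustness). This bound first appeared in Ref. [90] for transformation protocols $\Upsilon$ restricted to adaptive LOCC quantum combs. Recently, it was shown in Ref. [64] that the max-relative entropy in fact provides a strong converse bound on the quantum capacity assisted by general KE-preserving quantum processes. Specifically, it holds that
$$Q^{\varepsilon}_{\mathrm{KEP}}(\Lambda) \leq \inf_{\Gamma\in\mathrm{KE}} D_{\max}(\Lambda\|\Gamma),$$
where
$$D_{\max}(\Lambda\|\Gamma) = \log_2 \inf\{1+\lambda : \Lambda + \lambda\Theta = (1+\lambda)\Gamma,\ \Theta \text{ a quantum channel}\}. \qquad (54)$$
Since the completely dephasing channel $\Delta$ is explicitly entanglement breaking, we get
$$Q^{\varepsilon}_{\mathrm{KEP}}(\Omega_3) \leq D_{\max}(\Omega_3\|\Delta) \leq \log_2\tfrac{3}{2},$$
where we used the ansatz $\Omega_3 + \frac{1}{2}\,\mathrm{id}_3 = \frac{3}{2}\,\Delta$ as a feasible solution for (54).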
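The ansatz used in the last step can be verified exactly at the level of Choi matrices. The sketch below is illustrative only: it reads the channel off the ansatz as $\Omega_3 = \frac{3}{2}\Delta - \frac{1}{2}\mathrm{id}_3$, builds the $9\times 9$ Choi matrices of $\mathrm{id}_3$, $\Delta$ and $\Omega_3$, and checks that the ansatz holds and that twice the Choi matrix of $\Omega_3$ is a projector (so that $\Omega_3$ is completely positive). All function names are hypothetical. The code only confirms that $\lambda = 1/2$ is feasible, which gives the upper bound $\log_2\frac{3}{2}$; it makes no claim about optimality.

```python
from fractions import Fraction
from math import log2

D = 3  # qutrit dimension


def unit(i, j):
    """Matrix unit |i><j| on a qutrit."""
    return [[Fraction(int((r, c) == (i, j))) for c in range(D)] for r in range(D)]


def identity_channel(X):
    return [row[:] for row in X]


def dephase(X):
    """Completely dephasing channel: keeps the diagonal, zeroes the rest."""
    return [[X[r][c] if r == c else Fraction(0) for c in range(D)] for r in range(D)]


def omega_channel(X):
    """Omega_3 = (3/2) Delta - (1/2) id, as read off from the ansatz in the text."""
    DX = dephase(X)
    return [[Fraction(3, 2) * DX[r][c] - Fraction(1, 2) * X[r][c]
             for c in range(D)] for r in range(D)]


def choi(channel):
    """Choi matrix [id (x) channel](Phi_3); row index (i, r) -> 3*i + r."""
    J = [[Fraction(0) for _ in range(D * D)] for _ in range(D * D)]
    for i in range(D):
        for j in range(D):
            out = channel(unit(i, j))
            for r in range(D):
                for c in range(D):
                    J[D * i + r][D * j + c] = out[r][c] / D
    return J


def scale(a, A):
    return [[a * x for x in row] for row in A]


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


J_id, J_deph, J_omega = choi(identity_channel), choi(dephase), choi(omega_channel)

# Ansatz from the text, at the level of Choi matrices:
# Omega_3 + (1/2) id_3 = (3/2) Delta.
ansatz_holds = add(J_omega, scale(Fraction(1, 2), J_id)) == scale(Fraction(3, 2), J_deph)

# 2 * J_omega squares to itself, i.e. it is a projector, so J_omega >= 0.
two_J = scale(Fraction(2), J_omega)
is_projector = matmul(two_J, two_J) == two_J

# Feasible value lambda = 1/2 gives the upper bound log2(1 + 1/2) on the capacity.
capacity_upper_bound = log2(1 + 0.5)

if __name__ == "__main__":
    print(ansatz_holds, is_projector, capacity_upper_bound)
```

The resulting value, roughly 0.585, lies strictly below the lower bound of 1 on the entanglement cost obtained from Theorem 7, which is the gap asserted in Theorem 8.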
Previous lower bounds on the entanglement cost fall broadly into two categories. The first consists of quantities that require complicated optimisation and are typically intractable in practice, such as the regularised relative entropy of entanglement $E^{\infty}_{r}$ [91] or the squashed entanglement [92][93][94][95]. The second consists of computable measures, which can be efficiently evaluated. The latter category includes the measured relative entropy of entanglement [96] and the SDP lower bound of Ref. [97]. Importantly, to date, all of the computable bounds were in fact lower bounds on the regularised relative entropy of entanglement, and thus they can never perform better than the bound obtained using $E^{\infty}_{r}$. Our bound based on $N_{\tau}$, on the other hand, can be strictly better: since the quantum relative entropy is upper bounded by $D_{\max}$ [89], we have that where the equality in the second line was shown in [98, Lemma 12], and with $D(\rho\|\sigma) = \mathrm{Tr}\,\rho\,(\log_2\rho - \log_2\sigma)$. This shows that the tempered negativity bound, itself efficiently computable as a semidefinite program, can outperform not only all other computable bounds, but even the regularised relative entropy bound.

VI. PROOF OF LEMMA 5
Let us restate the result used before for the reader's convenience.

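Lemma 5. For any channel $\Lambda: A\to B$, it holds that
$$R^{s}_{\mathrm{KE}}(\Lambda) = \sup_{\rho} R^{s}_{\mathcal{K}}\big([\mathrm{id}\otimes\Lambda](\rho)\big),$$
where the maximisation is over all states $\rho\in\mathcal{D}(\mathcal{H}_A\otimes\mathcal{H}_A)$ (or, equivalently, over all pure states $\psi$, or over states $\rho\in\mathcal{D}(\mathcal{H}_R\otimes\mathcal{H}_A)$ with $R$ arbitrary).
We begin with a helpful lemma that will allow us to recast the robustness $R^{s}_{\mathrm{KE}}$ as an optimisation over sub-normalised quantum operations. Specifically, we define a K-enforcing subchannel to be a completely positive map $\Gamma$ which satisfies $[\mathrm{id}_k\otimes\Gamma](\rho)\in\mathcal{K}$ for all $k\in\mathbb{N}$ and $\rho\geq 0$, and which is also trace non-increasing, in the sense that $\mathrm{Tr}\,\Gamma(\rho)\leq 1$ for all density operators $\rho$. We denote the set of all K-enforcing subchannels with $\widetilde{\mathrm{KE}}$. We then have the following.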

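Lemma 10. The robustness $R^{s}_{\mathrm{KE}}$ can be equivalently expressed by optimising over K-enforcing subchannels. Specifically, for any positive and trace preserving map $\Lambda$ it holds that
$$R^{s}_{\mathrm{KE}}(\Lambda) = R^{s}_{\widetilde{\mathrm{KE}}}(\Lambda) := \inf\{\lambda\geq 0 : \Lambda + \lambda\Gamma = (1+\lambda)\Theta,\ \Gamma,\Theta\in\widetilde{\mathrm{KE}}\}.$$
Proof. First, notice that when $\Lambda + \lambda\Gamma = (1+\lambda)\Theta$ for trace-preserving maps $\Lambda$ and $\Theta$, in the nontrivial case where $\lambda > 0$, also $\Gamma$ is automatically constrained to be trace preserving. Given any pair of subchannels $\Gamma,\Theta\in\widetilde{\mathrm{KE}}$ that is feasible for $R^{s}_{\widetilde{\mathrm{KE}}}(\Lambda)$ with some $\lambda$, we thus write
$$\Gamma'(X) := \Gamma(X) + \mathrm{Tr}[X - \Gamma(X)]\,\sigma, \qquad \Theta'(X) := \Theta(X) + \mathrm{Tr}[X - \Theta(X)]\,\sigma$$
for a fixed state $\sigma\in\mathcal{T}(\mathcal{H}_B)$. Now, a map of the form $X\mapsto\mathrm{Tr}[X-\Gamma(X)]\,\sigma$ is a K-enforcing map, which means in particular that $\Gamma',\Theta'\in\mathrm{KE}$ by the convexity of $\mathcal{K}$. But then $\Lambda + \lambda\Gamma' = (1+\lambda)\Theta'$, so $R^{s}_{\mathrm{KE}}(\Lambda)\leq\lambda$. Since this holds for arbitrary feasible $\lambda$, and the reverse inequality is immediate from $\mathrm{KE}\subseteq\widetilde{\mathrm{KE}}$, we get $R^{s}_{\mathrm{KE}}(\Lambda) = R^{s}_{\widetilde{\mathrm{KE}}}(\Lambda)$ as desired.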
The next ingredient we need is the compactness of KE with respect to an appropriate topology.
Corollary 11. For $\mathcal{K} = \mathcal{S}, \mathcal{PPT}$, the set $\widetilde{\mathrm{KE}}$ of K-enforcing subchannels is compact with respect to the weak*-operator topology.
Proof. First, the cone of positive maps inside $\mathcal{B}(\mathcal{T}_A\to\mathcal{T}_B)$ is weak*-operator closed. To see this, consider a net of positive maps $(\Lambda_\alpha)_\alpha$ converging to $\Lambda$ in the weak*-operator topology, where $\Lambda_\alpha,\Lambda\in\mathcal{B}(\mathcal{T}_A\to\mathcal{T}_B)$. For all $X\in\mathcal{T}(\mathcal{H}_A)$, $X\geq 0$, and all $|\psi\rangle\in\mathcal{H}_B$, we have that $\langle\psi|\Lambda(X)|\psi\rangle = \lim_\alpha \langle\psi|\Lambda_\alpha(X)|\psi\rangle \geq 0$, where we computed the limit thanks to the fact that $|\psi\rangle\langle\psi|$ is a compact operator. Hence, $\Lambda(X)\geq 0$; since this holds for all $X\geq 0$, we deduce that $\Lambda$ is positive, as claimed.
(A net on a set $\mathcal{X}$ is simply a function $x:\mathcal{A}\to\mathcal{X}$, where $\mathcal{A}$ is an arbitrary directed set, i.e. a set equipped with a pre-order $\leq$ such that given any two elements $\alpha,\beta\in\mathcal{A}$ one can find a common upper bound $\alpha,\beta\leq\gamma\in\mathcal{A}$. In this context, we need to use nets rather than simple sequences because the weak*-operator topology is 'non-metrisable', i.e. it is not induced by any metric, unless $\dim\mathcal{H}<\infty$.)
Secondly, also the cone of completely positive maps is weak*-operator closed. In fact, with the above notation, $\Lambda_\alpha \xrightarrow{w^*o} \Lambda$ implies that $\mathrm{id}_k\otimes\Lambda_\alpha \xrightarrow{w^*o} \mathrm{id}_k\otimes\Lambda$ for all $k\in\mathbb{N}$, because tensoring with a finite-dimensional space cannot affect weak*-operator convergence. Since $\mathrm{id}_k\otimes\Lambda_\alpha$ is positive for all $\alpha$, by the above result so is $\mathrm{id}_k\otimes\Lambda$. This ensures that $\Lambda$ is completely positive.
Thirdly, it is straightforward to verify that the cone $\mathrm{cone}(\widetilde{\mathrm{KE}}) := \{\lambda\Lambda : \lambda\geq 0,\ \Lambda\in\widetilde{\mathrm{KE}}\}$ of K-enforcing maps is weak*-operator closed as well. This follows from a similar reasoning as above (recalling from Ref. [34] that it suffices to verify that $[\mathrm{id}_k\otimes\Lambda](X)\in\mathcal{K}$ for all $X\in\mathcal{T}_+$ for finite-dimensional ancillary spaces), together with the fact that $\mathcal{K}$ itself is weak*-closed.
Finally, we can write
$$\widetilde{\mathrm{KE}} = \mathrm{cone}(\widetilde{\mathrm{KE}}) \cap \Xi, \qquad (62)$$
where $\Xi$ is the unit ball of the space $\mathcal{B}(\mathcal{T}_A\to\mathcal{T}_B)$, defined by (17). To see why, notice that $\|\Lambda\|_{1\to 1} = \sup_\rho \mathrm{Tr}\,\Lambda(\rho)$ whenever $\Lambda$ is positive (in particular, when it is completely positive), so that for completely positive maps $\|\Lambda\|_{1\to 1}\leq 1$ amounts to $\sup_\rho \mathrm{Tr}\,\Lambda(\rho)\leq 1$. Having established (62), we deduce that $\widetilde{\mathrm{KE}}$, being an intersection of a weak*-operator compact set (cf. Lemma 3) and a weak*-operator closed set, is itself weak*-operator compact.
Following the techniques of Refs. [39,40], we now show that the above result implies the lower semicontinuity of the channel robustness $R^{s}_{\mathrm{KE}}$.

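Lemma 12. The channel robustness $R^{s}_{\mathrm{KE}}$ is lower semicontinuous with respect to the weak*-operator topology, in the sense that
$$\Lambda_\alpha \xrightarrow{w^*o} \Lambda \quad\Longrightarrow\quad R^{s}_{\mathrm{KE}}(\Lambda) \leq \liminf_\alpha R^{s}_{\mathrm{KE}}(\Lambda_\alpha).$$
Proof. Due to Lemma 10, we can see that $2R^{s}_{\mathrm{KE}}+1$ is the gauge function (Minkowski functional) with respect to the set $\mathrm{conv}\big(\widetilde{\mathrm{KE}}\cup -\widetilde{\mathrm{KE}}\big)$, that is,
$$2R^{s}_{\mathrm{KE}}(\Lambda)+1 = \inf\big\{\mu\geq 0 : \Lambda\in\mu\,\mathrm{conv}\big(\widetilde{\mathrm{KE}}\cup -\widetilde{\mathrm{KE}}\big)\big\}.$$
Crucially, from the weak*-operator compactness of $\widetilde{\mathrm{KE}}$ established in Corollary 11 we have that $\mathrm{conv}\big(\widetilde{\mathrm{KE}}\cup -\widetilde{\mathrm{KE}}\big)$ is also weak*-operator compact, which in particular implies that it is weak*-operator closed. The proof is then completed by noting that the gauge of a closed set is always lower semicontinuous. More explicitly, lower semicontinuity of $2R^{s}_{\mathrm{KE}}+1$ is equivalent [99, Proposition 2.5] to the weak*-operator closedness of the sublevel sets $\{\Lambda : 2R^{s}_{\mathrm{KE}}(\Lambda)+1\leq\mu\}$ for all $\mu$, which is immediate from the closedness of $\mathrm{conv}\big(\widetilde{\mathrm{KE}}\cup -\widetilde{\mathrm{KE}}\big)$.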
We then proceed by establishing the identity in Lemma 5 for the case of finite-dimensional channels.

Lemma 13. For any point-to-point channel $\Lambda: A\to B$ where $d_A, d_B<\infty$, it holds that
$$R^{s}_{\mathrm{KE}}(\Lambda) = \max_{\rho} R^{s}_{\mathcal{K}}\big([\mathrm{id}\otimes\Lambda](\rho)\big).$$
Proof of Lemma 13. We use $J_\Gamma := [\mathrm{id}\otimes\Gamma](\Phi_d)$ to denote the Choi state of a channel $\Gamma: A\to B$. Recall that $\Gamma\in\mathrm{KE}$ if and only if $J_\Gamma\in\mathcal{K}$ [55,56]. By Lemma 10 we can write where $\leq_{\mathcal{K}}$ denotes inequality with respect to the cone $\mathcal{K}$, in the sense that $X\leq_{\mathcal{K}} Y \iff Y - X\in\mathcal{K}$. We then have by Sion's minimax theorem [100] . The rest of the proof will follow an argument similar to [61, Lemma 7]. By continuity, it suffices to consider $\rho>0$. Since conjugation by a product operator preserves K-ness (that is, it cannot map a separable / PPT operator to an operator which is not separable / PPT, respectively) we have that Defining $\widetilde{J}_\Theta := (\rho^{1/2}\otimes 1)\,J_\Theta\,(\rho^{1/2}\otimes 1)$, we similarly have that $J_\Theta\in\mathcal{K} \iff \widetilde{J}_\Theta\in\mathcal{K}$. Altogether, this gives where $\Phi$ denotes the maximally entangled state. Since any pure state $\psi$ can (up to an inconsequential local unitary on the second system) be written as $(\rho^{1/2}\otimes 1)\,\Phi\,(\rho^{1/2}\otimes 1)$ for some $\rho$ and, conversely, any state $\rho$ can be purified to a state $\psi$, we get
(Sion's theorem gives us sufficient conditions for a function $f:\mathcal{X}\times\mathcal{Y}\to\mathbb{R}$ to satisfy the 'minimax' property
$$\min_{x\in\mathcal{X}}\sup_{y\in\mathcal{Y}} f(x,y) = \sup_{y\in\mathcal{Y}}\min_{x\in\mathcal{X}} f(x,y).$$
A set of conditions under which the above equality holds is as follows: (i) $\mathcal{X}$ is compact and convex; (ii) $\mathcal{Y}$ is convex; (iii) $f(\cdot,y)$ is convex and lower semicontinuous on $\mathcal{X}$ for every $y\in\mathcal{Y}$; and (iv) $f(x,\cdot)$ is concave and upper semicontinuous on $\mathcal{Y}$ for every $x\in\mathcal{X}$. In our case, $f$ is actually a bilinear function on a finite-dimensional space, hence verifying the above conditions (i)-(iv) is straightforward.)
The proof is concluded by noting that the convexity of $\mathcal{K}$ ensures that $R^{s}_{\mathcal{K}}\big([\mathrm{id}\otimes\Lambda](\rho)\big)$ is convex in $\rho$, which means that we can equivalently optimise over all states $\rho\in\mathcal{D}(\mathcal{H}_A\otimes\mathcal{H}_A)$, as the maximum will anyway be achieved at an extreme point $\psi$.
The final step is to extend this relation to infinite-dimensional spaces. The first part of the proof is a standard argument based on finite-dimensional approximations of infinite-dimensional channels [101], where we employ in particular a normalised, trace-preserving construction found e.g. in [102] in order to avoid normalisation issues. The second part of the proof relies on the lower semicontinuity that we have shown in Lemma 12.
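Proof of Lemma 5. Let $\{\Pi^A_n\}_{n\in\mathbb{N}}$ and $\{\Pi^B_n\}_{n\in\mathbb{N}}$ be increasing sequences of finite-rank orthogonal projectors which converge strongly to the identity operator on $\mathcal{H}_A$ and $\mathcal{H}_B$, respectively. For any channel $\Lambda: A\to B$, we define the maps
$$\Lambda_n(X) := \Pi^B_n\,\Lambda(\Pi^A_n X \Pi^A_n)\,\Pi^B_n + \mathrm{Tr}\big[X - \Pi^B_n\,\Lambda(\Pi^A_n X \Pi^A_n)\,\Pi^B_n\big]\,\sigma,$$
where $\sigma\in\mathcal{T}(\mathcal{H}_B)$ is some fixed state satisfying $\sigma\leq\Pi^B_n$ for all sufficiently large $n$, but otherwise arbitrary. Equivalently, this means that $\mathrm{supp}(\sigma)\subseteq\mathrm{supp}(\Pi^B_N)$ for some $N$, and hence for all $n\geq N$. It is not difficult to see [102] that the maps $\Lambda_n$ converge to $\Lambda$ in the topology of strong convergence: specifically, for any $X\in\mathcal{T}(\mathcal{H}_A)$, it holds that
$$\lim_{n\to\infty}\|\Lambda_n(X)-\Lambda(X)\|_1 = 0. \qquad (74)$$
Our strategy will now be to show that
$$\limsup_{n\to\infty}\,\sup_\rho R^{s}_{\mathcal{K}}\big([\mathrm{id}\otimes\Lambda_n](\rho)\big) \overset{\text{(i)}}{\leq} \sup_\rho R^{s}_{\mathcal{K}}\big([\mathrm{id}\otimes\Lambda](\rho)\big) \overset{\text{(ii)}}{\leq} R^{s}_{\mathrm{KE}}(\Lambda) \overset{\text{(iii)}}{\leq} \liminf_{n\to\infty} R^{s}_{\mathrm{KE}}(\Lambda_n)$$
and use the finite-dimensional result of Lemma 13 to conclude that equality holds between the leftmost and rightmost terms, since each $\Lambda_n$ can be equivalently understood as a map between finite-dimensional spaces.
We begin with the leftmost inequality (i). Clearly, constraining to finite-dimensional input states $\rho$ such that $\rho = (1\otimes\Pi^A_n)\,\rho\,(1\otimes\Pi^A_n)$ can only decrease the value of $\sup_\rho R^{s}_{\mathcal{K}}\big([\mathrm{id}\otimes\Lambda](\rho)\big)$. For any such state $\rho$, consider then any feasible solution for $R^{s}_{\mathcal{K}}\big([\mathrm{id}\otimes\Lambda](\rho)\big)$, that is, states $\gamma,\theta\in\mathcal{K}$ such that $[\mathrm{id}\otimes\Lambda](\rho) + \lambda\gamma = (1+\lambda)\theta$. Note then that, since $\rho = (1\otimes\Pi^A_n)\,\rho\,(1\otimes\Pi^A_n)$, it holds that $[\mathrm{id}\otimes\Lambda_n](\rho) = (\mathrm{id}\otimes\Phi_n)\circ(\mathrm{id}\otimes\Lambda)(\rho)$, where $\Phi_n(X) := \Pi^B_n X \Pi^B_n + \mathrm{Tr}[X(1-\Pi^B_n)]\,\sigma$. The crucial observation is that $\mathrm{id}\otimes\Phi_n$ is a K-preserving channel: simply projecting with a local projection $1\otimes\Pi^B_n$ cannot generate entanglement or non-positive partial transpose, and the measure-and-prepare map $X\mapsto\mathrm{Tr}[X(1-\Pi^B_n)]\,\sigma$ is K-enforcing, so by convexity of $\mathcal{K}$ we have that $\theta\in\mathcal{K}\Rightarrow[\mathrm{id}\otimes\Phi_n](\theta)\in\mathcal{K}$. This gives
$$[\mathrm{id}\otimes\Lambda_n](\rho) + \lambda\,[\mathrm{id}\otimes\Phi_n](\gamma) = (1+\lambda)\,[\mathrm{id}\otimes\Phi_n](\theta),$$
which constitutes a feasible solution for the robustness of $[\mathrm{id}\otimes\Lambda_n](\rho)$. Thus
$$R^{s}_{\mathcal{K}}\big([\mathrm{id}\otimes\Lambda_n](\rho)\big) \leq R^{s}_{\mathcal{K}}\big([\mathrm{id}\otimes\Lambda](\rho)\big)$$
for any $n$, from which inequality (i) follows. We now move on to inequality (ii). Consider any feasible solution for $R^{s}_{\mathrm{KE}}(\Lambda)$, that is, take any pair of channels $\Gamma,\Theta\in\mathrm{KE}$ such that $\Lambda + \lambda\Gamma = (1+\lambda)\Theta$. Then, for any $\rho$ we have that
$$[\mathrm{id}\otimes\Lambda](\rho) + \lambda\gamma = (1+\lambda)\theta$$
for some states $\gamma,\theta\in\mathcal{K}$, by definition of KE. This then gives $R^{s}_{\mathcal{K}}\big([\mathrm{id}\otimes\Lambda](\rho)\big)\leq\lambda$. As this holds for any input state $\rho$ and any feasible $\lambda$, we get the desired inequality. Finally, inequality (iii) is just the lower semicontinuity of $R^{s}_{\mathrm{KE}}$ established in Lemma 12. To see that this is applicable, observe that the strong operator convergence in (74) implies in particular that $\Lambda_n \xrightarrow{w^*o} \Lambda$ as $n\to\infty$. This concludes the proof.
Remark. All of the considerations of this section, and in particular the main result of Lemma 5, can be analogously applied to another resource measure closely related to the robustness $R^{s}_{\mathrm{KE}}$: the generalised robustness $R^{g}_{\mathrm{KE}}$, defined as
$$R^{g}_{\mathrm{KE}}(\Lambda) := \inf\{\lambda\geq 0 : \Lambda + \lambda\Gamma = (1+\lambda)\Theta,\ \Theta\in\mathrm{KE},\ \Gamma \text{ a quantum channel}\},$$
where now $\Gamma$ is not required to be a K-enforcing channel. Indeed, the finite-dimensional variant of this result (analogous to our Lemma 13) appeared already in [64, Lemma 17]. An extension of this finding to infinite-dimensional spaces, including a proof that $R^{g}_{\mathrm{KE}}$ is weak*-operator lower semicontinuous, can be obtained in direct analogy with our Lemmas 12 and 5.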
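The 'project and re-prepare' map appearing in inequality (i), read here as $\Phi_n(X) = \Pi_n X \Pi_n + \mathrm{Tr}[X(1-\Pi_n)]\,\sigma$, can be illustrated in a finite-dimensional toy setting. The sketch below is a hedged illustration, not the construction of Ref. [102]: it assumes that $\Pi_n$ projects onto the first $k$ basis vectors of a 3-dimensional space and that $\sigma$ is a fixed state supported inside that subspace; the function names and the test matrix are hypothetical. It checks that the map preserves the trace, returns an operator supported on the truncated subspace, and leaves already truncated operators unchanged.

```python
from fractions import Fraction as F


def project_and_reprepare(X, k, sigma):
    """Phi(X) = Pi X Pi + Tr[X (1 - Pi)] sigma, with Pi onto the first k basis vectors.

    sigma is assumed to be a state supported inside the range of Pi.
    """
    n = len(X)
    leaked = sum(X[i][i] for i in range(k, n))  # weight outside the subspace
    return [[(X[r][c] if r < k and c < k else 0) + leaked * sigma[r][c]
             for c in range(n)] for r in range(n)]


def trace(X):
    return sum(X[i][i] for i in range(len(X)))


# Toy qutrit density matrix (symmetric, diagonally dominant, unit trace).
rho = [[F(1, 2), F(1, 8), F(0)],
       [F(1, 8), F(1, 4), F(1, 8)],
       [F(0),    F(1, 8), F(1, 4)]]

# Fixed re-preparation state sigma = |0><0|, supported in the first two levels.
sigma = [[F(1), F(0), F(0)],
         [F(0), F(0), F(0)],
         [F(0), F(0), F(0)]]

out = project_and_reprepare(rho, 2, sigma)

if __name__ == "__main__":
    print(out, trace(out))
```

The weight that leaks outside the subspace, here $1/4$, is re-injected through $\sigma$, which is what keeps the truncated map trace preserving and avoids the normalisation issues mentioned in the text.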