Indefinite causal order enables perfect quantum communication with zero capacity channels

Quantum mechanics is compatible with scenarios where the relative order between two events can be indefinite. Here we show that two independent instances of a noisy process can behave as a perfect quantum communication channel when used in a coherent superposition of two alternative orders. This phenomenon occurs even if the original process has zero capacity to transmit quantum information. In contrast, perfect quantum communication does not occur when the message is sent directly from the sender to the receiver through a superposition of alternative paths, with an independent noise process acting on each path. The possibility of perfect quantum communication through independent noisy channels highlights a fundamental difference between the superposition of orders in time and the superposition of paths in space.


I. INTRODUCTION
The framework of information theory was established in the seminal work of Claude Shannon [1], who laid the foundations of our current communication technology. In his work, Shannon modelled the devices used to store and transfer information as classical systems, whose internal state can in principle be determined without errors, and whose arrangement in space and time is always well-defined. At the fundamental level, however, physical systems obey the laws of quantum mechanics, which in principle can be exploited to achieve communication tasks that are impossible in classical physics [2][3][4]. The ability of quantum channels to transmit information has been quantified by various types of quantum capacities, such as the classical capacity [5,6], the quantum capacity [7][8][9], and the entanglement-assisted capacity [10,11]. By now, the theory of communication with quantum systems has become a thoroughly developed discipline, known as quantum Shannon theory [12,13].
The standard model of communication in quantum Shannon theory generally assumes that the available communication channels are used in a definite configuration. In principle, however, quantum theory is compatible with scenarios where the configuration of the communication channels is in a quantum superposition. For example, a photon could travel through a superposition of different paths between the sender and receiver [14][15][16], and interference between the noise processes on different paths could offer an opportunity to filter out some of the noise affecting the transmission [17]. More recently, it has been observed that the superposition of channel configurations can also involve the order of the channels in time, in a scenario known as the quantum SWITCH [18,19]. In the quantum SWITCH, the relative order of two channels is controlled by a qubit, and superpositions in the state of such qubit lead to indefinite causal order. The indefiniteness of the order is sometimes called causal non-separability [20][21][22].
In recent years, the applications of the quantum SWITCH and other causally non-separable processes have attracted increasing interest, leading to the discovery of quantum advantages in various tasks, such as testing properties of quantum channels [23,24], winning non-causal games [20], reducing quantum communication complexity [25], boosting the precision of quantum metrology [26], and achieving thermodynamic advantages [27]. Experimental investigations of the quantum SWITCH have been recently proposed in various photonic setups [28][29][30][31][32][33][34]. The quantum SWITCH also admits more exotic realizations, which could take place in new physical regimes involving quantum superpositions of spacetimes [35] or closed timelike curves [18,19].
arXiv:1810.10457v4 [quant-ph] 5 Sep 2022
The extension of quantum Shannon theory to scenarios involving the superposition of causal orders has been recently initiated by Ebler, Salek, and one of the authors [36,37]. In these works, the authors established a number of advantages with respect to the standard communication model of quantum Shannon theory, where channels are arranged in a definite configuration, and no additional side channels are used [38]. Recently, some of the Shannon theoretic advantages of the quantum SWITCH have been demonstrated experimentally in photonic setups [32,34,39]. A natural question is whether these advantages are an exclusive feature of the superposition of orders, or whether instead they could be reproduced by the superposition of paths originally considered in Refs. [14][15][16]. Recently, Abbott et al. [40] argued for the latter, showing that some of the advantages of the quantum SWITCH can be reproduced in a scenario where multiple independent channels are put in parallel between the sender and receiver, and the message is sent through them along a superposition of paths. But can all the advantages of the quantum SWITCH be reproduced in this way?
Here we answer the question in the negative, showing that the combination of two independent channels in an indefinite order leads to a phenomenon that cannot be achieved through the combination of independent channels on alternative paths between the sender and the receiver. Specifically, we show that an entanglement-breaking channel, which in normal conditions cannot send any quantum information, can become a perfect quantum communication channel when two independent uses of it are combined by the quantum SWITCH. In contrast, we prove that perfect quantum communication cannot take place when the two independent uses of the channel are placed on two alternative paths between the sender and the receiver, letting the message travel on a coherent superposition of these paths. More generally, we show that no superposition of any finite number of independent noisy channels can lead to a complete noise removal. Our result proves that any communication model that reproduces all the advantages of the quantum SWITCH through the superposition of paths between the sender and the receiver must necessarily feature correlations between the processes occurring on the different paths [38,41,42].
The quantum communication advantage established here is also interesting as an extreme form of activation of the quantum capacity. In our example, two channels with zero quantum capacity are combined into a channel that has not only positive capacity, but also maximal capacity for the given input size. We characterize the set of channels that give rise to such an extreme form of activation, showing that our example is unique up to changes of basis.
In Section II we briefly review the quantum SWITCH and its application to quantum Shannon theory. In Section III we present our example of perfect activation of channels with zero quantum capacity, and we show that, for qubits, our example is essentially unique. In Section IV we prove that perfect activation cannot be achieved through the superposition of independent noisy channels. In Section V we discuss the implications of the theorem proved in Section IV. The conclusions of the paper are given in Section VI.

II. QUANTUM SWITCH
A general quantum process, transforming an input system A into an output system B, is described by a quantum channel, namely a linear, completely positive, trace-preserving map from L(H_A) to L(H_B), where H_A and H_B denote the Hilbert spaces of systems A and B, respectively, and L(H) denotes the space of linear operators on a generic Hilbert space H. The set of density operators on the Hilbert space H will be denoted as D(H). The action of a generic quantum channel E on an input state ρ ∈ D(H_A) can be expressed in the Kraus representation as
$$ \mathcal{E}(\rho) = \sum_i E_i \, \rho \, E_i^\dagger , $$
where {E_i} is a set of operators from H_A to H_B satisfying the normalization condition ∑_i E_i† E_i = I.
Two communication channels E and F can be combined in different configurations, either by Nature itself or by a communication provider that sets up the communication network between the sender and the receiver [38]. Classically, the channels E and F can be combined in a variety of well-defined configurations. For example, they can be combined in parallel, giving rise to the product channel E ⊗ F, or in a sequence (if their inputs and outputs match), giving rise either to the channel E ∘ F or to the channel F ∘ E. The parallel configuration corresponds to the scenario where the two channels are used by a sender to communicate directly to a receiver. The sequential configuration corresponds to the scenario where the information travels through two causally connected regions before reaching the receiver. More generally, one could think of a sequential composition where a third process R takes place in between E and F. Here R could be an operation performed at an intermediate station placed between the sender and the receiver.
In principle, quantum theory allows for scenarios where two processes, E and F, are combined in a quantum superposition of two alternative orders, via a higher-order operation called the quantum SWITCH [18,19]. For simplicity, consider the case of two processes E and F transforming system A to system B, with H_A ≃ H_B. The new quantum channel S(E, F) resulting from the combination of E and F in an order controlled by a control qubit C is described by the Kraus operators
$$ S_{ij} = E_i F_j \otimes |0\rangle\langle 0|_C + F_j E_i \otimes |1\rangle\langle 1|_C , \tag{1} $$
where {E_i} and {F_j} are the Kraus operators of the channels E and F, respectively, and {|0⟩_C, |1⟩_C} are orthogonal states of system C. Note that the definition of the channel S(E, F) is independent of the choice of Kraus representation used for E and F [19].
Mathematically, the quantum SWITCH is a quantum supermap [19,43,44], transforming a pair of quantum channels E and F into the new quantum channel S(E, F). In a communication scenario, the quantum SWITCH supermap can be interpreted as an operation performed by a communication provider that sets up the communication network between the sender and the receiver [38]. In this model, the control system can be accessed by the communication provider, but remains inaccessible to the sender, who can only encode information in the target system. The same model is generally applied to the superposition of paths, where the path degree of freedom is not used to directly encode information, but only to assist the communication [17,38,40,41].
In both communication models, the control (or path) system is set to a fixed state ω, viewed as a parameter of the communication network between the sender and the receiver. In the case of the quantum SWITCH, the effective channel between sender and receiver is the channel S_ω(E, F) defined by
$$ \mathcal{S}_\omega(\mathcal{E},\mathcal{F})(\rho) := \mathcal{S}(\mathcal{E},\mathcal{F})(\rho \otimes \omega) . \tag{2} $$
When the two channels coincide, inserting the Kraus operators (1) yields
$$ \mathcal{S}_\omega(\mathcal{E},\mathcal{E})(\rho) = \frac14 \sum_{i,j} \{E_i,E_j\}\, \rho\, \{E_i,E_j\}^\dagger \otimes \omega \;+\; \frac14 \sum_{i,j} [E_i,E_j]\, \rho\, [E_i,E_j]^\dagger \otimes Z\omega Z , \tag{3} $$
where {E_i, E_j} := E_i E_j + E_j E_i and [E_i, E_j] := E_i E_j − E_j E_i are the anticommutator and the commutator, respectively, and Z is the Pauli matrix acting on the control qubit. The cross terms vanish because the anticommutator is symmetric and the commutator is antisymmetric under the exchange of i and j. Note that there is no entanglement between the target system and the control system at the output state of channel S_ω(E, E).
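The Kraus-operator construction of the quantum SWITCH can be sketched numerically. The snippet below is only an illustration under stated assumptions: it takes the switched Kraus operators to be E_i F_j ⊗ |0⟩⟨0|_C + F_j E_i ⊗ |1⟩⟨1|_C, orders the tensor factors as target ⊗ control, and uses plain nested lists so that it has no dependencies. The helper names (`switch_kraus`, `completeness`) and the bit-flip and phase-flip test channels are hypothetical choices made for the example. It checks that the switched operators satisfy the normalization condition of a quantum channel.

```python
import math
from itertools import product

def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    """Kronecker product a (x) b."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def add(a, b):
    """Entry-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    """Multiply every entry of a by the scalar c."""
    return [[c * x for x in row] for row in a]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
P0 = [[1, 0], [0, 0]]   # |0><0| on the control qubit
P1 = [[0, 0], [0, 1]]   # |1><1| on the control qubit

def switch_kraus(E, F):
    """Kraus operators S_ij = E_i F_j (x) |0><0| + F_j E_i (x) |1><1|."""
    return [add(kron(matmul(Ei, Fj), P0), kron(matmul(Fj, Ei), P1))
            for Ei, Fj in product(E, F)]

def completeness(kraus):
    """Return sum_k K_k^dagger K_k, which is the identity for a channel."""
    dim = len(kraus[0][0])
    total = [[0] * dim for _ in range(dim)]
    for K in kraus:
        total = add(total, matmul(dagger(K), K))
    return total

# Hypothetical test channels: a bit flip and a phase flip, each with probability 0.3.
bit_flip = [scale(math.sqrt(0.7), I2), scale(math.sqrt(0.3), X)]
phase_flip = [scale(math.sqrt(0.7), I2), scale(math.sqrt(0.3), Z)]

S = switch_kraus(bit_flip, phase_flip)
T = completeness(S)
print(len(S), "switched Kraus operators")
for row in T:
    print([round(abs(x), 12) for x in row])
```

Because the control projectors |0⟩⟨0| and |1⟩⟨1| are orthogonal, the normalization of the switched operators follows directly from the normalization of the two input channels, which is what the printed matrix is meant to confirm.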
In the following we assume that the communication provider measures the control system and communicates the outcome to the receiver through a classical transmission line, as illustrated in Figure 1. This setting is similar to the setting of quantum communication with classical assistance from the environment [45,46]. An important difference, however, is that we do not assume that the whole environment is accessible: the only part of the environment that needs to be accessible to the communication provider is the two-dimensional system responsible for the order of the channels E and F .

III. PERFECT ACTIVATION OF THE QUANTUM CAPACITY
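The relevant quantity for the transmission of quantum information is the coherent information [47], defined as
$$ I_c(\mathcal{E}) := \max_{|\Psi\rangle} I(A\rangle B)_\sigma , \qquad I(A\rangle B)_\sigma := S(\sigma_B) - S(\sigma_{AB}) , $$
where the maximum is over the pure states |Ψ⟩ of the channel input together with a reference system A, σ_AB := (I_A ⊗ E)(|Ψ⟩⟨Ψ|) is the resulting joint state of the reference and of the output B, σ_B := Tr_A[σ_AB], and S(τ) := − Tr[τ log τ] is the von Neumann entropy of a generic quantum state τ, with the logarithm taken in base 2. When the state σ is of the separable form σ = ∑_i q_i σ_{A,i} ⊗ σ_{B,i}, one has I(A⟩B)_σ ≤ 0, with equality if and only if system A is in a pure state. This implies that entanglement-breaking channels, which transform every state into a separable state, have zero coherent information.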
When the channel E is used in parallel for an asymptotically large number of times, its ability to transmit quantum information is measured by the quantum capacity Q(E), which can be computed in terms of the coherent information as Q(E) = lim_{n→∞} I_c(E^{⊗n})/n [7][8][9]. For an entanglement-breaking channel E, the coherent information I_c(E^{⊗n}) is zero for every n, and therefore the quantum capacity is zero [48,49].
We now show that the combination of two entanglement-breaking channels in the quantum SWITCH can generate a perfect quantum communication channel. Our example involves two Pauli channels, that is, two qubit channels E_p with a Kraus decomposition of the form E_p(ρ) = p_0 ρ + p_1 XρX + p_2 YρY + p_3 ZρZ, where p ≡ (p_0, p_1, p_2, p_3) is a probability vector, and (X, Y, Z) are the three Pauli matrices.
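Suppose that two uses of the same Pauli channel E_p are combined in the quantum SWITCH. The action of the resulting channel can be obtained from Equation (3), which yields
$$ \mathcal{S}_\omega(\mathcal{E}_p,\mathcal{E}_p)(\rho) = q_+ \, \mathcal{C}_+(\rho) \otimes \omega + q_- \, \mathcal{C}_-(\rho) \otimes Z\omega Z , \tag{5} $$
with q_− := 2(p_1 p_2 + p_2 p_3 + p_1 p_3), q_+ := 1 − q_−, and
$$ \mathcal{C}_+(\rho) := \frac{(p_0^2+p_1^2+p_2^2+p_3^2)\,\rho + 2p_0\,(p_1 X\rho X + p_2 Y\rho Y + p_3 Z\rho Z)}{q_+} , \qquad \mathcal{C}_-(\rho) := \frac{2\,(p_2 p_3\, X\rho X + p_1 p_3\, Y\rho Y + p_1 p_2\, Z\rho Z)}{q_-} . $$
For ω = |+⟩⟨+|, the final states of the control system are the orthogonal states |+⟩ and |−⟩, with |±⟩ = (|0⟩ ± |1⟩)/√2. In other words, the output of the channel S_{|+⟩⟨+|}(E_p, E_p) exhibits perfect classical correlations between the evolution of the target system and two orthogonal states of the control system.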
A measurement on the control system can then separate the two channels C_+ and C_− in Equation (5). By measuring the environment and communicating the outcome to the receiver, a communication provider can improve the quality of the transmission, giving the receiver the opportunity to decode the channels C_+ and C_− separately. Now, the key point is that the channels C_+ and C_− can be noiseless even if the original channel E_p was noisy:
• if p_0 is zero, then channel C_+ is the identity,
• if one of the three probabilities p_1, p_2, or p_3 is zero, then channel C_− is unitary.
When both conditions are satisfied, the quantum channel S_{|+⟩⟨+|}(E_p, E_p) enables a perfect, deterministic transmission of a qubit from the sender to the receiver. This is the case for the channel E_XY(ρ) = (XρX + YρY)/2.
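The two noiselessness conditions can be checked with a few lines of arithmetic. The sketch below assumes the closed forms that follow from the Pauli commutation relations: the branch that leaves the control in ω carries the unnormalised weights (p_0² + p_1² + p_2² + p_3²) on the identity and 2p_0p_k on the k-th Pauli gate, while the branch that maps the control to ZωZ carries the weights 2p_2p_3, 2p_1p_3 and 2p_1p_2 on X, Y and Z. The function names are hypothetical.

```python
def switched_pauli_branches(p):
    """Unnormalised Pauli weights of the two branches of S_omega(E_p, E_p).

    p = (p0, p1, p2, p3) are the probabilities of I, X, Y, Z.
    The 'plus' branch leaves the control in omega,
    the 'minus' branch maps it to Z omega Z.
    """
    p0, p1, p2, p3 = p
    plus = {"I": p0**2 + p1**2 + p2**2 + p3**2,
            "X": 2 * p0 * p1, "Y": 2 * p0 * p2, "Z": 2 * p0 * p3}
    minus = {"X": 2 * p2 * p3, "Y": 2 * p1 * p3, "Z": 2 * p1 * p2}
    return plus, minus

def is_single_unitary(weights, tol=1e-12):
    """A mixture of distinct Pauli gates is proportional to a single
    unitary only if at most one weight is non-zero."""
    return sum(1 for w in weights.values() if w > tol) <= 1

# The channel E_XY: equal mixture of X and Y.
plus, minus = switched_pauli_branches((0.0, 0.5, 0.5, 0.0))
print("E_XY  plus:", plus, "noiseless:", is_single_unitary(plus))
print("E_XY minus:", minus, "noiseless:", is_single_unitary(minus))

# For comparison, the completely depolarising Pauli channel.
plus_d, minus_d = switched_pauli_branches((0.25, 0.25, 0.25, 0.25))
print("depolarising noiseless branches:",
      is_single_unitary(plus_d), is_single_unitary(minus_d))
```

The total weights of the two branches always add up to one, since they sum to (p_0 + p_1 + p_2 + p_3)².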
The channel E_XY exhibits an extreme example of activation of the quantum capacity. It is entanglement-breaking, because its action can be equivalently expressed in the measure-and-reprepare form E_XY(ρ) = |1⟩⟨1| ⟨0|ρ|0⟩ + |0⟩⟨0| ⟨1|ρ|1⟩. Hence, E_XY has zero quantum capacity: in the standard communication model of quantum Shannon theory it cannot transmit any quantum information, even if used infinitely many times in parallel or in sequence. In contrast, the quantum channel S_{|+⟩⟨+|}(E_XY, E_XY) has unit capacity, which is the maximum capacity one could possibly obtain with a qubit input. Recently, the extreme activation offered by the quantum SWITCH was experimentally observed in [32], up to a small error due to the unavoidable imperfections affecting any realistic setup.
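Both statements about E_XY can be verified on a test state. The following sketch is illustrative only: the input state `rho` is an arbitrary choice, the tensor factors are ordered as target ⊗ control, and the switched Kraus operators are assumed to be E_i E_j ⊗ |0⟩⟨0| + E_j E_i ⊗ |1⟩⟨1|. It checks that a direct use of E_XY swaps the populations and erases the coherences, and that the switched channel with control |+⟩⟨+| outputs ρ/2 ⊗ |+⟩⟨+| + ZρZ/2 ⊗ |−⟩⟨−|.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def apply_channel(kraus, rho):
    """Return sum_k K_k rho K_k^dagger."""
    n = len(kraus[0])
    out = [[0] * n for _ in range(n)]
    for K in kraus:
        out = add(out, matmul(matmul(K, rho), dagger(K)))
    return out

X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
P0 = [[1, 0], [0, 0]]            # |0><0| on the control
P1 = [[0, 0], [0, 1]]            # |1><1| on the control
PLUS = [[0.5, 0.5], [0.5, 0.5]]      # |+><+|
MINUS = [[0.5, -0.5], [-0.5, 0.5]]   # |-><-|

s = 1 / math.sqrt(2)
E_XY = [scale(s, X), scale(s, Y)]
S = [add(kron(matmul(A, B), P0), kron(matmul(B, A), P1))
     for A in E_XY for B in E_XY]

rho = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]   # arbitrary test state

# Direct use of E_XY: measure-and-reprepare action.
direct = apply_channel(E_XY, rho)

# Two uses combined in the quantum SWITCH, control prepared in |+>.
switched = apply_channel(S, kron(rho, PLUS))
expected = add(scale(0.5, kron(rho, PLUS)),
               scale(0.5, kron(matmul(matmul(Z, rho), Z), MINUS)))
deviation = max(abs(a - b) for ra, rb in zip(switched, expected)
                for a, b in zip(ra, rb))

print("direct output:", direct)
print("largest deviation from the expected switched output:", deviation)
```

Measuring the control in the basis {|+⟩, |−⟩} then tells the receiver whether to apply no correction or a Z gate.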
Physically, one can ask which resources are responsible for the activation of the quantum capacity shown in our example. From the point of view of the communication provider, who sets up the communication network between sender and receiver, the resource is the ability to coherently control the order of two channels, and the ability to perform a measurement in the basis {|+⟩, |−⟩}, whose vectors are coherent superpositions of the vectors {|0⟩, |1⟩} controlling the choice of orders. If the control qubit were prepared in an incoherent mixture of the states {|0⟩, |1⟩}, or if the measurement were performed in the incoherent basis {|0⟩, |1⟩}, then the evolution of the target system would be described by the entanglement-breaking channel E_XY² = E_XY ∘ E_XY, and no advantage could take place. Hence, a key resource in the protocol is quantum coherence [50] in the qubit controlling the causal order. From the point of view of the sender and receiver, who do not have direct access to the control system, the key resource is the correlation between the evolution of the target system and the classical information received from the communication provider. In this respect, our example can be seen as a special instance of quantum communication with classical assistance from the environment [45,46].
At this point, it is natural to ask which quantum channels exhibit the extreme activation phenomenon shown in our example. Interestingly, we find that our example is essentially unique. First of all, we show that every qubit channel E satisfying the conditions Q(E) = 0 and Q(S_ω(E, E)) = 1 must be unitarily equivalent to the channel E_XY. Since the quantum capacity Q(S_ω(E, E)) quantifies the amount of information that can be decoded with full access to the output of channel S_ω(E, E), our result covers in particular the case where the control is measured and the outcome is communicated to the receiver: if unit capacity is to be achieved at all, then the channel E must be unitarily equivalent to E_XY.
To obtain the above result, we first characterize the qubit channels that achieve unit capacity when inserted into the quantum SWITCH.
Theorem 1. For a qubit channel E , the unit-capacity condition Q(S ω (E , E )) = 1 is satisfied if and only if one of the following conditions is satisfied is a probability, U is an arbitrary unitary gate, and X and Y are Pauli matrices.
The proof of Theorem 1 is provided in Appendix A.
With Theorem 1 at hand, we can show that maximal activation of the quantum capacity only occurs for channels that are unitarily equivalent to E_XY:
Theorem 2. If a qubit channel E satisfies the conditions Q(E) = 0 and Q(S_ω(E, E)) = 1 for some state ω, then E is unitarily equivalent to the channel E_XY.
The proof is simple: by Theorem 1, we know that channel E is either unitary or unitarily equivalent to E_q(ρ) := q XρX + (1 − q) YρY, for some value of q ∈ [0, 1]. The unitary option is ruled out by the zero-capacity condition Q(E) = 0. Likewise, the zero-capacity condition rules out all probability values except q = 1/2, due to the hashing bound Q(E_q) ≥ 1 − h(q), where h(q) := −q log q − (1 − q) log(1 − q) is the binary entropy. Hence, E must be unitarily equivalent to E_{1/2} ≡ E_XY. This concludes the proof of Theorem 2.
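The role of the hashing bound in this argument can be illustrated numerically. The sketch below assumes the bound in the form Q(E_q) ≥ 1 − h(q), with h the binary entropy in base 2, and tabulates the lower bound for a few values of q. The function names are hypothetical.

```python
import math

def binary_entropy(q):
    """Binary entropy h(q) in bits, with the convention 0 log 0 = 0."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def hashing_lower_bound(q):
    """Lower bound 1 - h(q) on the quantum capacity of
    E_q(rho) = q X rho X + (1 - q) Y rho Y."""
    return 1.0 - binary_entropy(q)

for q in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"q = {q:4.2f}   1 - h(q) = {hashing_lower_bound(q):.4f}")
```

The bound is strictly positive for every q other than 1/2, so a channel E_q with zero quantum capacity is only possible at q = 1/2, where the bound vanishes.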
Theorem 1 characterizes all the qubit channels that admit maximal activation of the quantum capacity. A natural question is whether maximal activation can occur for higher dimensional systems. For a channel with d-dimensional input, the maximum value of the capacity is log d. Hence, the question is whether there exists a channel E acting on a d-dimensional quantum system such that Q(E) = 0 and Q(S_ω(E, E)) = log d for some state ω. As it turns out, the answer is negative for every d > 2:
Theorem 3. For every d > 2, there exists no channel E acting on a d-dimensional quantum system such that Q(E) = 0 and Q(S_ω(E, E)) = log d for some state ω.
The proof of Theorem 3 is provided in Appendix E.
Summarizing, we have shown that switching the order of two uses of a zero capacity channel can yield maximal capacity only for a specific type of qubit channels, that is, channels that are unitarily equivalent to a uniform mixture of the X and Y Pauli gates. For quantum systems of dimension d > 2, activation from zero to maximal capacity could still occur through variants of the quantum SWITCH that permute the order of N > 2 uses of the given channel [52,53]. Finding examples of such activation, however, is beyond the scope of this paper, which focusses on the N = 2 case.

IV. NO PERFECT ACTIVATION VIA SUPERPOSITION OF INDEPENDENT NOISY CHANNELS
We now show that the extreme activation phenomenon shown in the previous section disappears if, instead of combining two independent uses of E XY in a superposition of orders, one places them on two alternative paths between the sender and receiver, sending a message through both paths in a coherent quantum superposition, as illustrated in Figure 2. In other words, if the sender sends a quantum message to the receiver through a superposition of two alternative paths, and each path leads to an independent instance of the channel E XY , then the output state will suffer from some uneliminable noise. In fact, we prove a much stronger result: if the sender sends a quantum message to the receiver through a superposition of any finite number N of paths, and if the N paths lead to N independent noisy channels, then the output state will necessarily suffer from noise.
The evolution experienced by a quantum particle travelling on a superposition of paths was discussed by Aharonov, Anandan, Popescu, and Vaidman in the unitary case [14]. The definition of superposition of quantum evolutions was subsequently extended to noisy channels in a series of works [15,16,40,54]. In the following we briefly review the notion of superposition of noisy channels, following the framework of [16,54].
This framework is inspired by quantum optics, where a single photon travelling along N possible paths can be equivalently modelled as the one-photon subspace of N spatial modes of the electromagnetic field.
For a generic quantum system S (hereafter called a "particle"), the superposition of N paths is described by introducing N abstract "modes". For simplicity, here we present the framework for N = 2, leaving the extension to arbitrary N to Appendix F.
Consider two abstract modes, labelled as 0 and 1, each coming with an internal degree of freedom. In the quantum optics example, the internal degree of freedom is the polarization: each abstract mode is the composite system of a pair of polarization modes, such as vertical and horizontal polarization, associated to the same path. A generic quantum state of mode m ∈ {0, 1} can be expressed as |Ψ⟩ = ∑_{n=0}^{n_max} c_n |ψ_n⟩, where n labels the number of particles, n_max is the maximum number of particles in mode m, (c_n) are complex amplitudes, and (|ψ_n⟩) are states of the subspace H_{n,m} associated to n particles in mode m. For each mode m, we assume that
1. the zero-particle subspace H_{0,m} is one-dimensional, meaning that there is a unique vacuum state, hereafter denoted as |0, m⟩, and
2. the one-particle subspace H_{1,m} has dimension d, independently of m.
Both assumptions are satisfied in the motivating example of quantum optics, where the vacuum of the electromagnetic field is one-dimensional, and the one-particle subspace associated to each path is a qubit, spanned by the two orthogonal states of horizontal and vertical polarization, respectively.
Suppose that the evolution of mode m ∈ {0, 1} is described by a quantum channel Ẽ^(m) that preserves the number of particles. Preservation of the number of particles implies that the Kraus operators of Ẽ^(m) have the block-diagonal form
$$ \widetilde{E}^{(m)}_i = \bigoplus_{n=0}^{n_{\max}} E^{(m)}_{i,n} , $$
where the operator E^(m)_{i,n} acts on the n-particle subspace H_{n,m} [55]. In particular, since the zero-particle subspace is one-dimensional, the operators E^(m)_{i,0} are complex numbers, called the vacuum amplitudes of channel Ẽ^(m) [54]. In the following, we will use the shorthand notation α^(m)_i := E^(m)_{i,0}. On the one-particle subspace, the channel Ẽ^(m) acts as a quantum channel E^(m) with Kraus operators E^(m)_i := E^(m)_{i,1}. We call channel Ẽ^(m) an extension of channel E^(m). In the example of a single photon's polarization, the channel E^(m) represents the effective evolution of the polarization degree of freedom of a single photon travelling on the m-th path. Instead, the extension Ẽ^(m) describes the full evolution of the polarization modes associated to the m-th path.
Now, consider a single particle propagating in a coherent superposition of two possible paths. The state space of the single particle is the one-particle subspace of the corresponding modes. A generic state in the one-particle subspace is of the form
$$ |\Psi\rangle = c_0 \, |\psi_0\rangle \otimes |0,1\rangle + c_1 \, |0,0\rangle \otimes |\psi_1\rangle , $$
that is, it is a linear combination of product states where one mode is in a one-particle state and the other mode is in the vacuum. The one-particle subspace can be equivalently represented as a bipartite system, whose subsystems are the internal degree of freedom of the particle (denoted by S), and the particle's path (denoted by C). Explicitly, the one-particle states can be written as
$$ |\Psi\rangle = c_0 \, |\psi_0\rangle_S \otimes |0\rangle_C + c_1 \, |\psi_1\rangle_S \otimes |1\rangle_C , $$
where we associated the orthonormal vectors |0⟩_C and |1⟩_C to the two possible paths, and we introduced the notation |ψ_0⟩_S ⊗ |0⟩_C := |ψ_0⟩ ⊗ |0, 1⟩ and |ψ_1⟩_S ⊗ |1⟩_C := |0, 0⟩ ⊗ |ψ_1⟩.
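Assuming that the modes 0 and 1 evolve independently, the evolution of the single particle is simply the restriction of the product channel Ẽ^(0) ⊗ Ẽ^(1) to the one-particle subspace. The action of the Kraus operators Ẽ^(0)_i ⊗ Ẽ^(1)_j on a generic state |Ψ⟩ = c_0 |ψ_0⟩ ⊗ |0, 1⟩ + c_1 |0, 0⟩ ⊗ |ψ_1⟩ in the one-particle subspace is
$$ \big(\widetilde{E}^{(0)}_i \otimes \widetilde{E}^{(1)}_j\big) |\Psi\rangle = c_0 \, \alpha^{(1)}_j \, E^{(0)}_i |\psi_0\rangle \otimes |0,1\rangle + c_1 \, \alpha^{(0)}_i \, |0,0\rangle \otimes E^{(1)}_j |\psi_1\rangle , $$
having used the decomposition (6). Note that the resulting vector again belongs to the one-particle subspace. Hence, the restriction of the product channel Ẽ^(0) ⊗ Ẽ^(1) to the one-particle subspace is the channel R(Ẽ^(0), Ẽ^(1)) whose Kraus operators R_ij are the restrictions of the operators Ẽ^(0)_i ⊗ Ẽ^(1)_j. Regarding the one-particle subspace as the composite system SC, made of the internal degree of freedom and the path, the Kraus operators of the channel R(Ẽ^(0), Ẽ^(1)) can be expressed as
$$ R_{ij} = \alpha^{(1)}_j \, E^{(0)}_i \otimes |0\rangle\langle 0|_C + \alpha^{(0)}_i \, E^{(1)}_j \otimes |1\rangle\langle 1|_C . $$
In the following, we make the standard assumption that the path is initialized in a fixed state ω, independent of the state of the internal degree of freedom S [15-17, 38, 40, 54]. Then, the communication between the sender and the receiver is described by the effective channel R_ω(Ẽ^(0), Ẽ^(1)) defined by the relation
$$ \mathcal{R}_\omega\big(\widetilde{\mathcal{E}}^{(0)}, \widetilde{\mathcal{E}}^{(1)}\big)(\rho) := \mathcal{R}\big(\widetilde{\mathcal{E}}^{(0)}, \widetilde{\mathcal{E}}^{(1)}\big)(\rho \otimes \omega) . $$
We call the channel R_ω(Ẽ^(0), Ẽ^(1)) a superposition of the channels E^(0) and E^(1), or simply, the superposition channel.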
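The superposition of two independent channels can be explored numerically for a specific choice of vacuum amplitudes. The sketch below is an illustration, not a proof of any general statement: it assumes Kraus operators of the form α^(1)_j E^(0)_i ⊗ |0⟩⟨0|_C + α^(0)_i E^(1)_j ⊗ |1⟩⟨1|_C applied to a path prepared in the pure state c_0|0⟩ + c_1|1⟩, places the channel E_XY on both paths, and picks the vacuum amplitudes (1/√2, 1/√2) as a hypothetical example. It then evaluates the standard Knill-Laflamme criterion for exact correctability, which requires every product K_a† K_b to be proportional to the identity. All function names are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

KET0 = [[1], [0]]   # path state |0>
KET1 = [[0], [1]]   # path state |1>

def superposition_kraus(E0, alpha0, E1, alpha1, path):
    """Operators c0 * alpha1_j * E0_i (x) |0> + c1 * alpha0_i * E1_j (x) |1>,
    mapping the internal degree of freedom to internal (x) path."""
    c0, c1 = path
    return [add(scale(c0 * b, kron(A, KET0)), scale(c1 * a, kron(B, KET1)))
            for A, a in zip(E0, alpha0) for B, b in zip(E1, alpha1)]

def completeness(kraus):
    """Return sum_k K_k^dagger K_k."""
    dim = len(kraus[0][0])
    total = [[0] * dim for _ in range(dim)]
    for K in kraus:
        total = add(total, matmul(dagger(K), K))
    return total

def knill_laflamme_defect(kraus):
    """Largest deviation of any K_a^dagger K_b from a multiple of the identity."""
    worst = 0.0
    for Ka in kraus:
        for Kb in kraus:
            M = matmul(dagger(Ka), Kb)
            d = len(M)
            mean = sum(M[i][i] for i in range(d)) / d
            for i in range(d):
                for j in range(d):
                    target = mean if i == j else 0
                    worst = max(worst, abs(M[i][j] - target))
    return worst

s = 1 / math.sqrt(2)
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
E_XY = [scale(s, X), scale(s, Y)]
alpha = [s, s]          # hypothetical vacuum amplitudes, sum of |alpha_i|^2 = 1

K = superposition_kraus(E_XY, alpha, E_XY, alpha, path=(s, s))
norm = completeness(K)
defect = knill_laflamme_defect(K)
print("normalisation:", [[round(abs(x), 12) for x in row] for row in norm])
print("Knill-Laflamme defect:", defect)
```

For this choice of amplitudes the defect is strictly positive, so the resulting channel is not exactly correctable. Other amplitudes can be tried by changing `alpha` and `path`.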
Note that the superposition channel depends not only on the original channels E^(0) and E^(1), but also on the vacuum amplitudes. Physically, this dependence is due to the fact that the full description of the transmission lines is provided by the channels Ẽ^(0) and Ẽ^(1) acting on the two modes, rather than the channels E^(0) and E^(1) acting on the one-particle subspaces of such modes.
We now show that, if the one-particle channels E^(0) and E^(1) are noisy, then the superposition channel R_ω(Ẽ^(0), Ẽ^(1)) cannot be perfectly corrected. Hence, any message sent through it will suffer from some unavoidable noise. This result holds for every finite number of paths:
Theorem 4. Suppose that the state of a finite-dimensional quantum system is encoded in the internal degree of freedom of a single particle, which is transmitted through a superposition of N < ∞ paths traversing N independent channels. If all channels are noisy, then the initial quantum state cannot be retrieved without errors.
The proof of Corollary 1 is provided in Appendix H. Theorem 4 and Corollary 1 establish a fundamental result valid for arbitrary noisy channels and for arbitrary finite numbers of independent uses. Combined with our extreme example of activation, these results strengthen an earlier observation made in [37], which showed that the superposition of orders can give rise to a noiseless heralded transmission of quantum states through two entanglement-breaking channels, while the superposition of these two channels cannot. The key difference is that the perfect quantum communication exhibited by our example takes place deterministically, and therefore it guarantees a reliable transmission of entanglement over many uses of the channel. In contrast, heralded quantum communication can only be used to transmit quantum states involving entanglement among a few particles, because the probability of successful transmission decays exponentially with the number of particles.
It is worth stressing that Theorem 4 refers to the scenario where the number of paths is finite. When the number of paths N tends to infinity, it is known that nearly perfect quantum communication can sometimes be achieved also through the superposition of N independent noisy channels, with an error vanishing as 1/N [54]. The crucial point, however, is that the quantum SWITCH can achieve perfect deterministic communication with N = 2.

V. IMPLICATIONS OF THEOREM 4
Theorem 4 helps to understand the nature of the superposition of orders, by contrasting it with other types of superposition.
First of all, the quantum SWITCH of two channels E and F is not a superposition of the channels E^(0) := E ∘ F and E^(1) := F ∘ E regarded as two independent channels. If it were, then the extreme activation phenomenon shown earlier in this paper would be in contradiction with Theorem 4.
Mathematically, the difference between the quantum SWITCH and the superposition of independent channels is evident from the Kraus representation. For two independent channels E^(0) = E ∘ F and E^(1) = F ∘ E, the Kraus operators are E^(0)_{ij} := E_i F_j and E^(1)_{kl} := F_k E_l, respectively. The superposition of the channels E^(0) and E^(1) results in a new channel with Kraus operators given by Eq. (F6), which now reads
$$ R_{ijkl} = \alpha^{(1)}_{kl} \, E_i F_j \otimes |0\rangle\langle 0|_C + \alpha^{(0)}_{ij} \, F_k E_l \otimes |1\rangle\langle 1|_C , $$
where (α^(0)_{ij}) and (α^(1)_{kl}) are vacuum amplitudes associated to channels E^(0) and E^(1), respectively. The above Kraus operators are clearly different from the Kraus operators of the channel produced by the quantum SWITCH, shown in Eq. (1): here the two branches carry independent indices and are weighted by vacuum amplitudes, whereas in the quantum SWITCH the same pair of operators E_i and F_j appears in both branches.
The channel produced by the quantum SWITCH can still be regarded as a "superposition of the channels E ∘ F and F ∘ E", in a more general sense discussed in [16,54]. This generalized kind of superposition is realized by sending a particle on two paths, with the channel on one path correlated with the channel on the other path. An explicit realization of the switched channel S(E, F) using correlated channels on two paths has been provided in [42,54] (see also [56]). Physically, the correlations between the channels on the two paths can be understood by modelling the quantum channels E and F as "collisions" between the system and two other particles [57], with the order of the collisions being determined by a control qubit. In this way, the occurrence of a collision realizing channel E on one path is anti-correlated with the occurrence of a collision realizing channel E on the other path, and similarly for channel F. From this physical perspective, our result highlights the value of the correlations between the channels on the two paths as a communication resource.
Theorem 4 also enables an interesting comparison between the superposition of channel configurations in space and the superposition of channel configurations in time. Suppose that a communication provider is given two communication devices, which can take as input either one particle or the vacuum. Let Ẽ and F̃ be the two quantum channels describing the two devices. One way to use the devices is to place them in two spatially separated regions, R_0 and R_1: the communication provider could place channel Ẽ in region R_0 and channel F̃ in region R_1, or the other way round. By letting the placement of the devices be controlled by a quantum system, the provider could also create a coherent superposition of these two alternative configurations, obtaining a new channel T(Ẽ, F̃) with Kraus operators
$$ T_{ij} = \widetilde{E}_i \otimes \widetilde{F}_j \otimes |0\rangle\langle 0|_D + \widetilde{F}_j \otimes \widetilde{E}_i \otimes |1\rangle\langle 1|_D , \tag{13} $$
where the first two tensor factors refer to the modes in regions R_0 and R_1, respectively, and {|0⟩_D, |1⟩_D} are orthonormal states of a suitable control qubit D. A single particle could then be sent in a superposition of two paths, passing through regions R_0 and R_1, respectively. Let |ψ⟩_S be the initial state of the particle's internal degree of freedom, c_0 |0⟩_C + c_1 |1⟩_C be the initial state of the path, and d_0 |0⟩_D + d_1 |1⟩_D be the initial state of the qubit controlling the channels' configuration. In terms of modes, the state of the particle can be expressed as c_0 |ψ⟩ ⊗ |0, 1⟩ + c_1 |0, 0⟩ ⊗ |ψ⟩, using the notation of the previous section. The action of the Kraus operator T_ij in Eq. (13) then yields the state
$$ d_0 \big( c_0 \beta_j \, E_i|\psi\rangle \otimes |0,1\rangle + c_1 \alpha_i \, |0,0\rangle \otimes F_j|\psi\rangle \big) \otimes |0\rangle_D + d_1 \big( c_0 \alpha_i \, F_j|\psi\rangle \otimes |0,1\rangle + c_1 \beta_j \, |0,0\rangle \otimes E_i|\psi\rangle \big) \otimes |1\rangle_D , \tag{14} $$
where α_i and β_j are the vacuum amplitudes of channels Ẽ and F̃, respectively, and E_i and F_j are the Kraus operators of the one-particle restrictions of channels Ẽ and F̃, denoted by E and F respectively. The state (14) can be equivalently written as
$$ \sqrt{q_0}\, \beta_j \, E_i|\psi\rangle_S \otimes |\Phi_0\rangle_{CD} + \sqrt{q_1}\, \alpha_i \, F_j|\psi\rangle_S \otimes |\Phi_1\rangle_{CD} , \tag{15} $$
with |Φ_0⟩ := (c_0 d_0 |0⟩_C|0⟩_D + c_1 d_1 |1⟩_C|1⟩_D)/√q_0, |Φ_1⟩ := (c_0 d_1 |0⟩_C|1⟩_D + c_1 d_0 |1⟩_C|0⟩_D)/√q_1, q_0 := |c_0 d_0|² + |c_1 d_1|², and q_1 := |c_0 d_1|² + |c_1 d_0|². The state (15) is formally identical to the state one would get by sending the particle on two paths, leading to channels Ẽ and F̃, and associated to the orthogonal states |Φ_0⟩ and |Φ_1⟩ of a composite control system CD.
In other words, the effective channel acting on the particle's internal degree of freedom is a superposition of the channels E and F. By Theorem 4, no choice of the extensions Ẽ and F̃ can enable a perfect transmission of quantum messages when each of the channels E and F is noisy.
In contrast, suppose that the regions R_0 and R_1 are causally connected, i.e. that it is possible to send signals from R_0 to R_1. In particular, this implies that region R_0 precedes region R_1 in time. Also, suppose that there exists a mechanism that can place the available communication devices into regions R_0 and R_1, so that the choice of which device is placed in which region is controlled coherently by a qubit. Such a mechanism could be used to realize the quantum SWITCH of channels Ẽ and F̃. When a single particle is transmitted, the quantum SWITCH of channels Ẽ and F̃ reduces to the quantum SWITCH of channels E and F. Hence, the example provided earlier in the paper shows that perfect quantum communication is possible even if both channels E and F are noisy.
In summary, Theorem 4 can be used to highlight a difference between the coherent placement of two quantum channels on two spatially separated regions, and the coherent placement of two channels on two causally connected regions. When a single particle is sent, one placement permits perfect quantum communication, while the other does not. Informally, this can be viewed as a difference between superpositions of channel placements in space and superpositions of channel placements in time.

VI. CONCLUSIONS
In this work we showed that the possibility of indefinite causal order gives rise to an extreme activation phenomenon: two uses of a zero-capacity quantum channel can be deterministically converted into a single use of a quantum channel with maximal capacity. Remarkably, such an extreme form of activation cannot be achieved by sending a particle on a superposition of paths between the sender and the receiver, as long as the processes encountered along different paths are independent and the number of paths is finite.
Our results are particularly relevant in light of the observation that some of the benefits of the superposition of causal orders can be obtained also through the superposition of paths in space [40]. While the advantages in both scenarios exhibit similarities, our findings highlight a fundamental difference between the type of advantages arising from independent channels placed on a superposition of alternative paths and independent channels placed in a superposition of alternative orders.

Appendix A: Proof of Theorem 1
The proof of Theorem 1 is based on three lemmas, whose proofs are provided in the subsequent appendices. The first, Lemma 1, states that a quantum channel C with input system A has maximal quantum capacity, Q(C) = log d_A, if and only if it is correctable. Its proof is provided in Appendix B. In particular, Lemma 1 implies that the quantum capacity Q(S_ω(E, E)) is maximal if and only if the channel S_ω(E, E) is correctable. A necessary condition for the correctability of S_ω(E, E) is provided by the following lemma: Lemma 2. Let E be a channel from a generic quantum system A (of dimension d_A ≥ 2) to itself. If the channel S_ω(E, E) is correctable for some state ω, then the channel S_{|γ⟩⟨γ|}(E, E) is correctable for every |γ⟩ in the support of ω, and the same correction channel works for both S_ω(E, E) and S_{|γ⟩⟨γ|}(E, E).
The proof is elementary, and is provided in Appendix C for completeness. Thanks to Lemma 2, we can restrict our attention to the case where the state ω is pure without loss of generality. For a pure state ω = |γ⟩⟨γ| with |γ⟩ = c_0|0⟩ + c_1|1⟩, the Kraus operators of the channel S_ω(E, E) are
S_ij = c_0 E_i E_j ⊗ |0⟩ + c_1 E_j E_i ⊗ |1⟩ , (A1)
where we used the notation O ⊗ |ψ⟩ to denote the operator defined by (O ⊗ |ψ⟩)|φ⟩ := (O|φ⟩) ⊗ |ψ⟩, for generic vectors |φ⟩ and |ψ⟩, and for a generic operator O. Correctability is determined by the Knill-Laflamme condition [58], which reads
S_ij† S_mn = p E_j† E_i† E_m E_n + (1 − p) E_i† E_j† E_n E_m = τ_{ij,mn} I , (A2)
where τ is a density matrix and p = |c_0|². Now, let us restrict our attention to the qubit case d_A = 2. In this case, the Knill-Laflamme condition implies that the Kraus representation of E contains at most two linearly independent operators. Lemma 3. For every unit vector |γ⟩ ∈ C², if the channel S_{|γ⟩⟨γ|}(E, E) satisfies the Knill-Laflamme condition (A2), then E has at most two linearly independent Kraus operators.
The proof is provided in Appendix D.
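Equipped with the above lemmas, we are now ready to prove Theorem 1. Proof of Theorem 1. Let us start from the "if" part. If E is unitary, then the channel S_ω(E, E) is equal to E² ⊗ ω and can be corrected by discarding the control system and applying the inverse of E twice. Now, suppose that the channel E is of the form E(ρ) = q (UXU†) ρ (UXU†) + (1 − q) (UYU†) ρ (UYU†), for some unitary matrix U. The switched channel S_{|γ⟩⟨γ|}(E, E) has Kraus operators given by Eq. (A1), which reads
S_11 = q I ⊗ |γ_+⟩ , S_22 = (1 − q) I ⊗ |γ_+⟩ , S_12 = i √(q(1 − q)) UZU† ⊗ |γ_−⟩ , S_21 = −i √(q(1 − q)) UZU† ⊗ |γ_−⟩ ,
with |γ_±⟩ := c_0|0⟩ ± c_1|1⟩. For |c_0| = |c_1| = 1/√2, the vectors |γ_+⟩ and |γ_−⟩ are orthogonal, and the channel can be corrected by measuring the control system in the basis {|γ_+⟩, |γ_−⟩} and applying UZU† when the outcome corresponds to |γ_−⟩.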
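The "if" part for the mixture of X and Y can also be checked numerically. The following is a minimal sketch, not part of the original proof: it assumes U = I, an illustrative value q = 0.3, and switched Kraus operators of the form S_ij = c_0 E_i E_j ⊗ |0⟩ + c_1 E_j E_i ⊗ |1⟩. The helper names (`attach`, `switch_kraus`, `knill_laflamme_holds`) are hypothetical, and matrices are plain nested lists so that only the standard library is needed.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(s, a):
    return [[s * x for x in row] for row in a]

def attach(op, ket):
    """Matrix of op ⊗ |ket>, which maps |phi> to (op|phi>) ⊗ |ket>."""
    return [[op[a][col] * ket[c] for col in range(len(op[0]))]
            for a in range(len(op)) for c in range(len(ket))]

def switch_kraus(kraus, c0, c1):
    """Assumed form of the switched Kraus operators for a pure control state."""
    ops = []
    for e_i in kraus:
        for e_j in kraus:
            ops.append(add(attach(matmul(e_i, e_j), [c0, 0]),
                           attach(matmul(e_j, e_i), [0, c1])))
    return ops

def knill_laflamme_holds(kraus, c0, c1, tol=1e-12):
    """True if every product S_a^dagger S_b is a multiple of the 2x2 identity."""
    ops = switch_kraus(kraus, c0, c1)
    for a in ops:
        for b in ops:
            m = matmul(dagger(a), b)
            if abs(m[0][1]) > tol or abs(m[1][0]) > tol or abs(m[0][0] - m[1][1]) > tol:
                return False
    return True

X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
q = 0.3  # illustrative mixing probability
xy_channel = [scale(math.sqrt(q), X), scale(math.sqrt(1 - q), Y)]
r = 1 / math.sqrt(2)

print(knill_laflamme_holds(xy_channel, r, r))      # balanced superposition of orders
print(knill_laflamme_holds(xy_channel, 1.0, 0.0))  # definite order
```

With a balanced control state the check succeeds, whereas with a definite order it fails, because the product of the first two Kraus operators is then proportional to Z rather than to the identity.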
We now prove the "only if" part. Assume that there exists a state ω such that the quantum capacity of the switched channel S_ω(E, E) is maximal. Then, Lemma 1 implies that the channel S_ω(E, E) is correctable. Furthermore, Lemma 2 implies that the channel S_{|γ⟩⟨γ|}(E, E) is correctable for every |γ⟩ in the support of ω. In the following, we will fix one such state |γ⟩ and we will consider the channel S_{|γ⟩⟨γ|}(E, E).
Lemma 3 guarantees that channel E has a Kraus representation with only two Kraus operators E_1 and E_2. Setting i = j = m = n in the Knill-Laflamme condition (A2), we obtain the relation (E_i†)² (E_i)² = τ_{ii,ii} I, meaning that each operator E_i² is proportional to a unitary gate. We now characterize the operators O such that O² is unitary. The condition implies that O is invertible and that one has O†O = (OO†)^{−1}. In terms of the singular value decomposition O = ∑_k √λ_k |v_k⟩⟨w_k|, the above relation reads ∑_k λ_k |w_k⟩⟨w_k| = ∑_k λ_k^{−1} |v_k⟩⟨v_k|. For two-dimensional systems, this means that there are only two possibilities:
1. λ_k = 1 ∀k. In this case, O is unitary.
2. λ_1 ≠ 1 and λ_2 = 1/λ_1. In this case, one must have |v_1⟩ ∝ |w_2⟩ and |v_2⟩ ∝ |w_1⟩. In short, O is of the form O = a|v_1⟩⟨v_2| + b|v_2⟩⟨v_1| for some complex numbers a and b.
Case 1. If one of the two Kraus operators E_1 and E_2 is proportional to a unitary matrix, then the normalization condition E_1†E_1 + E_2†E_2 = I implies that the other Kraus operator is also proportional to a unitary matrix. Hence, the channel E is of the random-unitary form E(ρ) = q U_1ρU_1† + (1 − q) U_2ρU_2† for some probability q ∈ (0, 1) and some pair of unitary gates U_1 and U_2. Choosing i = j = 1 and m = n = 2 in the Knill-Laflamme condition (A2) we obtain (U_1²)†U_2² ∝ I, or equivalently, U_1² ∝ U_2². Then, there are two possibilities: either U_1² ∝ U_2² ∝ I, or the unitaries U_1 and U_2 have the form U_1 = e^{iθ_1}(|v_1⟩⟨v_1| + e^{iθ}|v_2⟩⟨v_2|) and U_2 = e^{iθ_2}(|v_1⟩⟨v_1| − e^{iθ}|v_2⟩⟨v_2|), for some phases θ_1, θ_2, θ ∈ R. In the second case the unitaries U_1 and U_2 commute, and therefore the Knill-Laflamme condition (A2) is reduced to the Knill-Laflamme condition for the channel E². In turn, the correctability of channel E² implies the correctability of E, which means that E must be unitary, because E is a channel from a quantum system to itself.
The other possibility is U_1² ∝ U_2² ∝ I. This condition means that the unitaries U_1 and U_2 are proportional to self-adjoint unitaries, with eigenvalues +1 and −1. Since the proportionality constant is an irrelevant global phase, we can discard it without loss of generality. Hence, we can take the unitaries U_1 and U_2 to be self-adjoint. Now, the Choi operator of channel E is given by E = ∑_{m,n} E(|m⟩⟨n|) ⊗ |m⟩⟨n| = q|U_1⟩⟩⟨⟨U_1| + (1 − q)|U_2⟩⟩⟨⟨U_2|, using the notation |A⟩⟩ = ∑_{m,n} A_mn |m⟩ ⊗ |n⟩, for a generic matrix A. Since the unitaries U_1 and U_2 are self-adjoint, the product ⟨⟨U_1|U_2⟩⟩ = Tr[U_1U_2] is a real number. This means that the Gram-Schmidt construction applied to {|U_1⟩⟩, |U_2⟩⟩} yields an orthonormal basis {|Ψ_1⟩, |Ψ_2⟩} where the vectors |Ψ_1⟩ and |Ψ_2⟩ are linear combinations of |U_1⟩⟩ and |U_2⟩⟩ with real coefficients. In this basis, the Choi operator can be written as a real symmetric matrix. Hence, it can be diagonalized as E = λ_1|Φ_1⟩⟨Φ_1| + λ_2|Φ_2⟩⟨Φ_2|, where |Φ_1⟩ and |Φ_2⟩ are linear combinations of |U_1⟩⟩ and |U_2⟩⟩ with real coefficients, and ⟨Φ_1|Φ_2⟩ = 0. Equivalently, the channel E can be decomposed as E(ρ) = λ_1 A_1ρA_1† + λ_2 A_2ρA_2† where A_1 and A_2 are real linear combinations of U_1 and U_2 and Tr[A_1A_2] = 0. We observe that every real linear combination of self-adjoint 2 × 2 unitaries is proportional to a self-adjoint unitary (this is because the 2 × 2 self-adjoint unitaries are of the form U = n · σ where n ∈ R³ is a unit vector, and σ := (X, Y, Z) is the vector with the three Pauli matrices as entries). Thanks to this observation, we know that the operators A_1 and A_2 are proportional to self-adjoint unitaries, say A_1 = α_1 n_1 · σ and A_2 = α_2 n_2 · σ, for proportionality constants α_i > 0 and unit vectors n_i ∈ R³, i ∈ {1, 2}. Finally, the condition Tr[A_1A_2] = 0 implies n_1 · n_2 = 0, which in turn implies that the two unitaries n_1 · σ and n_2 · σ are of the form UXU† and UYU† for some suitable unitary U.
Case 2. In the remaining case, the channel E is of the form E(ρ) = AρA† + BρB†, with A = a|v_1⟩⟨v_2| + b|v_2⟩⟨v_1| and B = c|v_1⟩⟨v_2| + d|v_2⟩⟨v_1| and |a|² + |c|² = |b|² + |d|² = 1. Its Choi operator is E = |A⟩⟩⟨⟨A| + |B⟩⟩⟨⟨B|, where the vector |A⟩⟩ ∈ C² ⊗ C² is defined as |A⟩⟩ = (A ⊗ I)|I⟩⟩, with |I⟩⟩ = ∑_n |n⟩ ⊗ |n⟩. In the two-dimensional subspace spanned by the vectors |v_1⟩ ⊗ |v_2⟩ and |v_2⟩ ⊗ |v_1⟩, the Choi operator has a 2 × 2 matrix representation, and the diagonalization of this matrix shows that E is the Choi operator of a random unitary channel. In summary, the channel E is random unitary. This brings us back to Case 1.

Appendix B: Proof of Lemma 1
Proof. The "if" part is trivial: clearly, a correctable channel has capacity Q(C) = log d_A. For the "only if" part, we use the Holevo-Werner upper bound Q(C) ≤ log ‖T_B ∘ C‖_⋄, where T_B denotes the transpose map on system B, and ‖∆‖_⋄ denotes the diamond norm of a generic Hermitian-preserving map ∆ [48]. Note that the upper bound can be equivalently written as Q(C) ≤ log ‖D ∘ T_A‖_⋄ with D = T_B ∘ C ∘ T_A. Now, suppose that Q(C) = log d_A. Using the notation |A⟩⟩ = (A ⊗ I)|I⟩⟩, |I⟩⟩ = ∑_n |n⟩ ⊗ |n⟩, we obtain a chain of inequalities. Note that the right-hand side is zero if and only if ρ|φ⟩ is proportional to |φ⟩ for every |φ⟩ ∈ H_A, that is, if and only if ρ = I_A/d_A. In other words, the state |Ψ⟩ must be maximally entangled.
For the canonical maximally entangled state |Ψ⟩ = ∑_{n=1}^{d_A} |n⟩ ⊗ |n⟩/√d_A, the Holevo-Werner bound yields a chain of inequalities, which again implies that all inequalities must be saturated. In particular, the triangle inequality (B5) must hold with the equality sign, meaning that the operators (I_A ⊗ D)(P_+) and (I_A ⊗ D)(P_−) must have orthogonal support, namely Tr[(I_A ⊗ D)(P_+) (I_A ⊗ D)(P_−)] = 0. Expanding the channel D in a Kraus representation D(ρ) = ∑_i D_i ρ D_i†, and using the fact that each map D_i · D_i† is completely positive, we obtain a condition on every pair of Kraus operators D_i and D_j. This condition is satisfied if and only if the vector D_j†D_i|φ⟩ is proportional to |φ⟩ for every |φ⟩, that is, if and only if D_j†D_i = τ_ij I, for some proportionality constant τ_ij ∈ C. This is nothing but the Knill-Laflamme condition for error correction [58]. Hence, there must exist a correction channel K such that K ∘ D = I_A. Recalling that D is equal to T_B ∘ C ∘ T_A, we then obtain the chain of equalities I_A = T_A ∘ (K ∘ D) ∘ T_A = (T_A ∘ K ∘ T_B) ∘ C. Since C is a quantum channel, we conclude that C is correctable.

Appendix C: Proof of Lemma 2
Proof. If |γ⟩ is in the support of ω, then ω can be decomposed as ω = t |γ⟩⟨γ| + (1 − t)σ, where t > 0 is a nonzero probability and σ is a suitable density matrix. By linearity, one has S_ω(E, E) = t S_{|γ⟩⟨γ|}(E, E) + (1 − t) S_σ(E, E). Now, let C be a correction for S_ω(E, E). The decomposition of S_ω(E, E) implies the condition t C ∘ S_{|γ⟩⟨γ|}(E, E) + (1 − t) C ∘ S_σ(E, E) = I_A. Since the identity is an extreme point of the set of quantum channels, the above condition implies C ∘ S_{|γ⟩⟨γ|}(E, E) = I_A. This proves that S_{|γ⟩⟨γ|}(E, E) is correctable and admits the same correction as S_ω(E, E).

Appendix D: Proof of Lemma 3
Proof. For an arbitrary channel C with arbitrary input and output Hilbert spaces H_in and H_out, error correction on arbitrary inputs is possible only if the quantum packing bound
r d_in ≤ d_out (D1)
is satisfied (see e.g. [59]), where d_out and d_in are the dimensions of H_out and H_in, respectively, and r is the number of linearly independent Kraus operators of the channel C. For the switched channel S_{|γ⟩⟨γ|}(E, E), we have d_in = d and d_out = 2d. Hence, we have the bound
r_switch ≤ 2 , (D2)
where r_switch is the number of linearly independent Kraus operators of S_{|γ⟩⟨γ|}(E, E). We now show that also the original channel E can have at most 2 linearly independent Kraus operators. To this purpose, consider the Knill-Laflamme condition (A2) and set i = j = m = n. With this choice, we obtain (E_i†)²(E_i)² = τ_{ii,ii} I for every i, which implies that each non-zero Kraus operator E_i is invertible. Now, suppose that E has r linearly independent Kraus operators (E_i)_{i=1}^r. For every fixed j, the operators (E_i E_j)_{i=1}^r must be linearly independent, and so must be the operators (E_j E_i)_{i=1}^r. Consequently, the Kraus operators (S_ij)_{i=1}^r of Eq. (A1) must be linearly independent. This means that the switched channel S_ω(E, E) has at least r linearly independent Kraus operators, namely r_switch ≥ r . (D3) In conclusion, we obtained the bound r ≤ 2.
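For readers who want to see where a bound of the form r d_in ≤ d_out comes from, the following is a sketch of a standard argument based on the Knill-Laflamme condition. It is not taken from the original text: the symbols K_i (linearly independent Kraus operators of a correctable channel), V, p_k and W_k are introduced here for illustration only.

```latex
% Sketch: the Knill-Laflamme condition implies r d_in <= d_out.
% tau is the Gram matrix of the K_i divided by d_in, hence positive definite,
% and it can be diagonalized by a unitary V with eigenvalues p_k > 0.
\begin{align}
K_i^\dagger K_j &= \tau_{ij}\, I_{\mathrm{in}},
  \qquad \tau = V\, \mathrm{diag}(p_1,\dots,p_r)\, V^\dagger, \quad p_k > 0, \\
K'_k &:= \sum_{i} V_{ik}\, K_i
  \;\;\Longrightarrow\;\;
  (K'_k)^\dagger K'_l = p_k\, \delta_{kl}\, I_{\mathrm{in}}, \\
W_k &:= K'_k / \sqrt{p_k}
  \;\;\Longrightarrow\;\;
  W_k^\dagger W_l = \delta_{kl}\, I_{\mathrm{in}}, \\
d_{\mathrm{out}} &\ge \sum_{k=1}^{r} \dim \mathrm{Ran}(W_k) = r\, d_{\mathrm{in}} .
\end{align}
```

The operators W_k are isometries with mutually orthogonal ranges, each of dimension d_in, so the output space must be large enough to contain r such ranges. For the switched channel, d_in = d and d_out = 2d leave room for at most two of them.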

Appendix E: Proof of Theorem 3
Proof. Let E be a generic quantum channel with d-dimensional input system A and d-dimensional output system B, with d > 2. The maximal capacity condition Q(S_ω(E, E)) = log d implies that channel E has at most two linearly independent Kraus operators (by Lemmas 1 and 3). Hence, the Choi operator E := (E ⊗ I)(|I⟩⟩⟨⟨I|) has rank at most 2, and channel E has a Kraus representation of the form E(ρ) = E_1 ρ E_1† + E_2 ρ E_2†. Using this fact, we show that the channel E cannot have zero capacity for any d > 2. To this purpose, we use the fact that the quantum capacity is lower bounded by the coherent information of the channel, which in turn is lower bounded by the coherent information of the Choi state E/d. In formula,
Q(E) ≥ I_c(E) ≥ I_c(E/d) . (E1)
Then, it suffices to show that the Choi state has non-zero coherent information. Explicitly, the coherent information of the Choi state is I_c(E/d) = S(ρ_B) − S(E/d), where ρ_B := Tr_A[E/d] is the marginal state of system B. Since the Choi state has rank at most 2, its von Neumann entropy is at most log 2 = 1, and we have the bound
I_c(E/d) ≥ S(ρ_B) − 1 . (E3)
At this point, we recall the normalization condition E_1†E_1 + E_2†E_2 = I. Defining P := E_1†E_1, we then have E_2†E_2 = I − P. Moreover, we recall that, for every operator A, the operators A†A and AA† are unitarily equivalent. Hence, there exist two unitary operators U_1 and U_2 such that E_1E_1† = U_1 P U_1† and E_2E_2† = U_2 (I − P) U_2†. The state ρ_B can then be written as
ρ_B = [U_1 P U_1† + U_2 (I − P) U_2†]/d , (E4)
and its operator norm ‖ρ_B‖_∞ (equal to its maximum eigenvalue) satisfies the bound
‖ρ_B‖_∞ ≤ (‖P‖_∞ + ‖I − P‖_∞)/d ≤ 2/d . (E5)
Using this fact, we can lower bound the min-entropy S_min(ρ_B) := −log ‖ρ_B‖_∞ as S_min(ρ_B) ≥ log(d/2). Since the min-entropy is a lower bound to the von Neumann entropy, we obtain the bounds S(ρ_B) ≥ log(d/2) and
I_c(E/d) ≥ log(d/2) − 1 = log(d/4) . (E7)
For d > 4, this bound implies that the coherent information of the Choi state is strictly positive. In this case, Eq. (E1) implies that the quantum capacity is also strictly positive.
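A quick numerical reading of this bound shows why the cases d = 3 and d = 4 need a separate treatment. The snippet below is an illustrative sketch that assumes the bound takes the form log₂(d/4), as suggested by the rank-2 Choi state (entropy at most 1 bit) and the maximum eigenvalue 2/d of ρ_B; the function name is hypothetical.

```python
import math

def coherent_information_lower_bound(d):
    """Assumed lower bound log2(d/2) - 1 = log2(d) - 2, in bits."""
    return math.log2(d) - 2

for d in (3, 4, 5, 8):
    print(d, round(coherent_information_lower_bound(d), 3))
```

Under this assumption the bound is negative for d = 3, exactly zero for d = 4, and strictly positive from d = 5 onwards, so it is conclusive only for d > 4.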
To conclude the proof, we consider separately the cases of d = 4 and d = 3. For d = 4, the proof is based on the bound (E7), which guarantees that the coherent information of the Choi state is larger than, or equal to zero. If the coherent information is larger than zero, then Eq. (E1) implies that the quantum capacity is strictly positive. It remains to analyze the case where the coherent information is zero, namely the case in which the bound (E7) is attained with the equality sign. To achieve the equality, one must have the equality sign in Eq. (E5), meaning that the maximum eigenvalue of ρ_B is exactly 2/d. Moreover, one must have the equality in the bound S(ρ_B) ≥ S_min(ρ_B). Such equality is attained only when ρ_B is proportional to a projector. Since one of the eigenvalues is 2/d, we conclude that the projector has rank r = d/2 = 2. From the definition of ρ_B in Eq. (E4), we then obtain that the operator P should be a projector on a two-dimensional subspace. Finally, recall the definition P = E_1†E_1, which implies E_1 = U_1√P = U_1P for some unitary operator U_1. For every state ρ with support contained in the support of P, one has E(ρ) = E_1ρE_1† = U_1ρU_1†. Since the channel acts unitarily on a two-dimensional subspace, its quantum capacity is at least 1. In summary, for d = 4 any quantum channel E satisfying the condition Q(S_ω(E, E)) = log d for some state ω must have non-zero quantum capacity.
Let us move now to the d = 3 case. By Lemma 1, the maximal capacity condition Q(S_ω(E, E)) = log d implies that the channel S_ω(E, E) must be correctable, namely that there exists a channel C such that C ∘ S_ω(E, E) = I.
Now, consider the completely positive map {E_i, E_j}(·){E_i, E_j}†, where {A, B} := AB + BA denotes the anticommutator. Since this map transforms system S into itself, and is correctable, it must be proportional to a unitary channel. Hence, each operator {E_i, E_j} must be proportional to a unitary. In particular, E_1² and E_2² must be proportional to unitaries. Now, let us express E_1 and E_2 as E_1 = U_1√P_1 and E_2 = U_2√P_2, where U_1 and U_2 are unitaries, P_1 := E_1†E_1, and P_2 := E_2†E_2. For i ∈ {1, 2}, the condition that E_i² is proportional to a unitary can be written as U_i√P_i U_i√P_i = λ_i V_i, for some constant λ_i and some unitary operator V_i.
We now show that, if the constant λ_i is zero for some i, then the quantum channel has non-zero capacity. To see that this is the case, note that the condition λ_i = 0 implies U_i√P_i U_i√P_i = 0, and also √P_i U_i√P_i = 0. The last condition implies that the kernel of √P_i contains all vectors of the form U_i|ψ⟩, where |ψ⟩ is in the support of √P_i. Hence, the dimension of the kernel of √P_i cannot be smaller than the dimension of the support of √P_i. Since the total dimension is d = 3, this condition implies that the kernel has dimension 2 and the support has dimension 1. In other words, the operator P_i has the form P_i = |η_i⟩⟨η_i| for some (possibly subnormalized) vector |η_i⟩. Now, recall the definition P_i := E_i†E_i and the normalization condition P_1 + P_2 = I. If P_1 = |η_1⟩⟨η_1|, then P_2 = I − |η_1⟩⟨η_1|, and P_2 acts as a projector in the two-dimensional subspace orthogonal to |η_1⟩. For every vector |ψ⟩ in such subspace, one has E(|ψ⟩⟨ψ|) = U_2|ψ⟩⟨ψ|U_2†. Hence, the quantum capacity of E is at least 1. Similarly, if P_2 = |η_2⟩⟨η_2|, then P_1 = I − |η_2⟩⟨η_2|, and P_1 acts as a projector in the two-dimensional subspace orthogonal to |η_2⟩. For every vector |ψ⟩ in such subspace, one has E(|ψ⟩⟨ψ|) = U_1|ψ⟩⟨ψ|U_1†. Hence, the quantum capacity of E is at least 1.
Summarizing, the quantum capacity is non-zero whenever λ_1 = 0 or λ_2 = 0. Now, consider the case when λ_i is non-zero for every i ∈ {1, 2}. In the following we will show that, also in this case, the capacity of E is non-zero.
First, note that the condition U_i√P_i U_i√P_i = λ_i V_i implies √P_i U_i† P_i U_i √P_i = |λ_i|² I, or equivalently U_i† P_i U_i = |λ_i|² P_i^{−1}. The last equation implies that P_i and |λ_i|² P_i^{−1} have the same spectrum. Let (a, b, c) be the eigenvalues of P_1, listed in descending order a ≥ b ≥ c. Then, the eigenvalues of P_1^{−1} are 1/c, 1/b, 1/a, still listed in descending order. The condition that P_1 and |λ_1|² P_1^{−1} have the same spectrum implies |λ_1|² = b² and c = b²/a. Summarizing, the spectrum of P_1 is of the form (a, b, c) with a ≥ b ≥ c ≡ b²/a. Similarly, the spectrum of P_2 must be of the form (a′, b′, c′) with a′ ≥ b′ ≥ c′ ≡ b′²/a′. On the other hand, the condition P_1 + P_2 = I implies a′ = 1 − c, b′ = 1 − b, and c′ = 1 − a. Hence, we must have 1 − a = (1 − b)²/(1 − b²/a), that is, (a − b)² = 0. This condition implies a = b = c, and a′ = b′ = c′, meaning that the operators P_1 and P_2 are proportional to the identity. Hence, the operators E_1 = U_1√P_1 and E_2 = U_2√P_2 are proportional to unitaries. From Eq. (E4) and from the definition P := E_1†E_1 ≡ P_1 we obtain the equality ρ_B = I/d. Hence, the bound (E3) becomes I_c(E/d) ≥ log d − 1 > 0. Since the coherent information of the Choi state E/d is a lower bound to the quantum capacity, we proved that the channel E has non-zero capacity. Summarizing, any quantum channel E acting on a d-dimensional quantum system with d > 2 and satisfying the condition Q(S_ω(E, E)) = log d for some state ω must have Q(E) > 0. This concludes the proof of the theorem.

A generic pure state of a given mode m can be written as ∑_{n=0}^{n_max,m} c_n |ψ_n⟩, where n labels the number of particles, n_max,m is the maximum number of particles in mode m, (c_n) are complex amplitudes, and (|ψ_n⟩) are states of the subspace H_{n,m} associated to n particles in mode m. For Fermionic modes, one has n_max,m = 1, while for Bosonic modes one has n_max,m = ∞. To be fully general, we allow n_max,m to be any number in N ∪ {∞}, and possibly even to depend on m.
For each mode m, we assume that 1. the zero-particle subspace H_{0,m} is one-dimensional, meaning that there is a unique vacuum state, hereafter denoted as |0, m⟩, and 2. the one-particle subspace H_{1,m} has dimension d, independently of m.
The second assumption guarantees that we can interpret the one-particle subspace as representing "a d-dimensional quantum system travelling on path m." Suppose that the evolution of mode m is described by a quantum channel Ẽ^(m) that preserves the number of particles. The Kraus operators of any such channel must have the block-diagonal form Ẽ_i^(m) = ⊕_n E_{i,n}^(m), where (E_{i,n}^(m)) are operators acting on the n-particle subspace, and satisfying the normalization condition ∑_i E_{i,n}^(m)† E_{i,n}^(m) = I_n^(m), where I_n^(m) is the identity operator on the n-particle subspace of mode m [55]. In particular, since the zero-particle subspace is one-dimensional, the operators E_{i,0}^(m) are complex numbers, hereafter called vacuum amplitudes and denoted by α_i^(m). The operators E_i^(m) := E_{i,1}^(m) are the Kraus operators of a channel E^(m) acting on the one-particle subspace. We call channel Ẽ^(m) an extension of channel E^(m).
Now, consider the situation of a single particle propagating in a coherent superposition of N paths. The state space of the single particle is the one-particle subspace of the N modes associated to the N paths. A generic state in the one-particle subspace is of the form ∑_{m=0}^{N−1} c_m |0, 0⟩ ⊗ ... ⊗ |0, m − 1⟩ ⊗ |ψ_m⟩ ⊗ |0, m + 1⟩ ⊗ ... ⊗ |0, N − 1⟩, that is, it is a linear combination of product states where one mode is in a one-particle state and all the other modes are in the vacuum. The one-particle subspace can be equivalently represented as a bipartite system, whose subsystems are an internal degree of freedom of the particle (denoted by S), and the particle's path (denoted by C, in analogy to the control system in the quantum SWITCH). Explicitly, the one-particle states can be written as ∑_{m=0}^{N−1} c_m |ψ_m⟩_S ⊗ |m⟩_C, where we introduced the notation |ψ_m⟩_S ⊗ |m⟩_C := |0, 0⟩ ⊗ ... ⊗ |0, m − 1⟩ ⊗ |ψ_m⟩ ⊗ |0, m + 1⟩ ⊗ ... ⊗ |0, N − 1⟩, and associated the orthonormal vectors (|m⟩)_{m=0}^{N−1} to the N possible paths that the particle can traverse. Assuming that the N modes evolve independently under the channels (Ẽ^(m))_{m=0}^{N−1}, the evolution of the single particle is simply the restriction of the product channel Ẽ^(0) ⊗ Ẽ^(1) ⊗ ... ⊗ Ẽ^(N−1) to the one-particle subspace. We denote the one-particle restriction by R(Ẽ^(0), Ẽ^(1), ..., Ẽ^(N−1)), and its Kraus operators by R_{i_0 i_1 ... i_{N−1}} = ∑_{m=0}^{N−1} (∏_{l≠m} α_{i_l}^(l)) E_{i_m}^(m) ⊗ |m⟩⟨m|_C. Note that the one-particle restriction depends only on the one-particle channels (E^(m))_{m=0}^{N−1} and on the vacuum amplitudes (α^(m))_{m=0}^{N−1}. Hence, we can without loss of generality assume that the maximum number of particles is n_max,m = 1 for every mode m. This assumption will help simplify the characterization of the possible extensions of the original channels (E^(m))_{m=0}^{N−1}. In the following, we make the standard assumption that the path is initialized in a fixed state ω, independent of the state of the internal degree of freedom S [15-17, 38, 40, 54]. Then, the communication between the sender and the receiver is described by the effective channel R_ω(Ẽ^(0), ..., Ẽ^(N−1)) defined by the relation R_ω(Ẽ^(0), ..., Ẽ^(N−1))(ρ) := R(Ẽ^(0), ..., Ẽ^(N−1))(ρ ⊗ ω) . (F7) We call the channel R_ω(Ẽ^(0), ..., Ẽ^(N−1)) a coherent superposition of the channels (E^(m))_{m=0}^{N−1}, or simply, the superposition channel.
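The construction of the superposition channel can be illustrated for N = 2 paths. The sketch below is not from the original text: it assumes a pure path state c_0|0⟩ + c_1|1⟩, so that the Kraus operators take the form R_ij = c_0 β_j E_i ⊗ |0⟩ + c_1 α_i F_j ⊗ |1⟩, and it picks an illustrative bit-flip channel on both paths with vacuum amplitudes (1, 0). All function names are hypothetical, and matrices are plain nested lists.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(s, a):
    return [[s * x for x in row] for row in a]

def attach(op, ket):
    """Matrix of op ⊗ |ket>, which maps |phi> to (op|phi>) ⊗ |ket>."""
    return [[op[a][col] * ket[c] for col in range(len(op[0]))]
            for a in range(len(op)) for c in range(len(ket))]

def superposition_kraus(e_ops, alphas, f_ops, betas, c0, c1):
    """Two-path Kraus operators: on path 0 the particle meets E while mode 1
    contributes the vacuum amplitude beta_j, and vice versa on path 1."""
    ops = []
    for e_i, alpha_i in zip(e_ops, alphas):
        for f_j, beta_j in zip(f_ops, betas):
            ops.append(add(attach(scale(c0 * beta_j, e_i), [1, 0]),
                           attach(scale(c1 * alpha_i, f_j), [0, 1])))
    return ops

def completeness(ops):
    """Sum of R^dagger R over all Kraus operators (identity for a channel)."""
    total = [[0, 0], [0, 0]]
    for r_op in ops:
        total = add(total, matmul(dagger(r_op), r_op))
    return total

def knill_laflamme_holds(ops, tol=1e-12):
    """True if every product R_a^dagger R_b is a multiple of the 2x2 identity."""
    for a in ops:
        for b in ops:
            m = matmul(dagger(a), b)
            if abs(m[0][1]) > tol or abs(m[1][0]) > tol or abs(m[0][0] - m[1][1]) > tol:
                return False
    return True

p = 0.5  # illustrative bit-flip probability
identity = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
bit_flip = [scale(math.sqrt(1 - p), identity), scale(math.sqrt(p), X)]
vacuum_amplitudes = [1, 0]  # illustrative choice of extension
r = 1 / math.sqrt(2)

ops = superposition_kraus(bit_flip, vacuum_amplitudes, bit_flip, vacuum_amplitudes, r, r)
print(completeness(ops))          # numerically close to the identity
print(knill_laflamme_holds(ops))  # False for this example
```

In this example the Kraus operators sum to a trace-preserving map, but the Knill-Laflamme check fails, so this particular superposition of two noisy channels is not correctable. A single example does not prove the general statement; it only illustrates the kind of object the no-go result is about.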

Appendix G: Proof of Theorem 4
Here we show that, if all the channels (E^(m))_{m=0}^{N−1} are noisy, then it is impossible to find extensions (Ẽ^(m))_{m=0}^{N−1} and a state ω such that the superposition channel R_ω(Ẽ^(0), ..., Ẽ^(N−1)) is correctable.
First, note that we can assume without loss of generality that the state ω is pure. Indeed, the superposition channel R_ω(Ẽ^(0), ..., Ẽ^(N−1)) depends linearly on the state ω, and one has R_{pω+(1−p)ω′}(Ẽ^(0), ..., Ẽ^(N−1)) = p R_ω(Ẽ^(0), ..., Ẽ^(N−1)) + (1 − p) R_{ω′}(Ẽ^(0), ..., Ẽ^(N−1)). Since the convex combination of two channels is correctable only if each channel is correctable, if the superposition channel is correctable for a mixture pω + (1 − p)ω′, then it must be correctable for each of the two states ω and ω′. Hence, we can without loss of generality assume that the state ω is pure, namely ω = |φ⟩⟨φ| for some unit vector |φ⟩ = ∑_{m=0}^{N−1} c_m |m⟩. In this case, the Kraus operators of the superposition channel are R_{i_0 ... i_{N−1}} = ∑_{m=0}^{N−1} c_m (∏_{l≠m} α_{i_l}^(l)) E_{i_m}^(m) ⊗ |m⟩_C, where the notation A ⊗ |φ⟩_C denotes the linear operator from H_S to H_S ⊗ H_C defined by the relation (A ⊗ |φ⟩_C)|ψ⟩_S := (A|ψ⟩)_S ⊗ |φ⟩_C, ∀|ψ⟩ ∈ H_S. Second, note that we can assume without loss of generality that the unit vector |φ⟩ satisfies the condition c_m ≠ 0 for every path m. Indeed, if any of the paths has zero amplitude, we can simply remove it from the list of allowed paths, and focus our attention on the remaining paths.
Third, note that, for every given m, the extensions of a given channel E^(m) form a convex set: if Ẽ^(m) and Ẽ′^(m) are two extensions of E^(m), then also the channel p Ẽ^(m) + (1 − p) Ẽ′^(m) is an extension of E^(m), for every probability p ∈ [0, 1]. Without loss of generality, the extension Ẽ^(m) can be taken to be an extreme point of the convex set. Indeed, the superposition channel R_ω(Ẽ^(0), ..., Ẽ^(N−1)) depends linearly on the extensions (Ẽ^(m))_{m=0}^{N−1}, and convex combinations of the form p Ẽ^(m) + (1 − p) Ẽ′^(m) result in convex combinations of the form p R_ω(Ẽ^(0), ..., Ẽ^(m), ..., Ẽ^(N−1)) + (1 − p) R_ω(Ẽ^(0), ..., Ẽ′^(m), ..., Ẽ^(N−1)). Since the convex combination of two channels is correctable only if each channel is correctable (Lemma 2), this argument allows us to restrict our attention to the case where each channel Ẽ^(m) is an extreme point. Now, let Chan(S) be the set of channels from system S to itself. For a given channel E ∈ Chan(S), let Ẽ be an extension of channel E, acting on the original system S and on the vacuum. Let Ext(E) be the set of all such extensions. Again, since the dimension of the input system is finite, the set Ext(E) is a finite-dimensional compact set.