Entanglement generation secure against general attacks

We present a security proof for establishing private entanglement by means of recurrence-type entanglement distillation protocols over noisy quantum channels. We consider protocols where the local devices are imperfect, and show that nonetheless a confidential quantum channel can be established, and used to e.g. perform distributed quantum computation in a secure manner. While our results are not fully device independent (which we argue to be unachievable in settings with quantum outputs), our proof holds for arbitrary channel noise and noisy local operations, and even in the case where the eavesdropper learns the noise. Our approach relies on non-trivial properties of distillation protocols which are used in conjunction with de-Finetti and post-selection-type techniques to reduce a general quantum attack in a non-asymptotic scenario to an i.i.d. setting. As a side result, we also provide entanglement distillation protocols for non-i.i.d. input states.


Introduction
Entanglement is a key resource in quantum information processing. It can be used to teleport quantum information [1], to implement remote quantum gates [2], or for distributed quantum computation [3]. It allows one to perform tasks that are not possible by classical means, such as the secret key expansion vital for secure classical communication. The latter is achieved through the famous and extensively studied quantum key distribution (QKD) protocols [4,5,6,7,8,9,10]. In these works, security was proven in a variety of ever more general scenarios, considering noisy channels, imperfect devices and device-independent (DI) settings, where even the local quantum devices are untrusted [11,12,13]. In contrast, the perhaps equally important task of establishing private entanglement, and the closely related problem of establishing secure quantum channels, has not been resolved in equal generality. The latter has, historically, received significantly less attention [14], until the very recent increase of interest [15,16,17,18] in security under ideal settings. The task of establishing private entanglement has been considered in the context of noisy channels with either perfect operations [19] or operations subject to local depolarizing noise [20,21]. In these works, either independent and identically distributed (i.i.d.) initial states or asymptotic scenarios are assumed.
Here, we present a comprehensive treatment of the security of distillation protocols. To make our results broadly applicable, we generalize the security model (i.e. the powers of the adversary) over standard settings for protocols with quantum outputs. Furthermore, we remove the need for asymptotic or i.i.d. assumptions, allow for more general noise models, and formulate and prove security criteria which ensure composability, i.e. the security of the protocols when they are used in arbitrary contexts, e.g. as sub-routines of larger protocols. More specifically, we consider arbitrary attacks employed by an adversary (Eve, the distributor of noisy or corrupt Bell-pairs) and assume noisy communication channels and noisy local operations, that is, essentially arbitrary noise describing imperfect single- and two-qubit gates. We also extend the adversarial powers beyond the standard ones: the noisy apparatus may leak all the information about the noise processes which occurred in a run of the protocol to Eve. Our scenario, by necessity, falls short of full DI, as security under such weakest assumptions is not attainable for protocols with a quantum output: any device used in any protocol with which a client can interact classically, perhaps to test its performance, but which eventually outputs a quantum system, can always deviate from honest behavior when the final quantum output is eventually demanded (independent of how elaborate the testing may have been). This raises the questions of how DI assumptions can be relaxed such that security becomes possible also for quantum output protocols, or how standard security models can be further extended. DI assumptions can be understood as an extreme noisy scenario, where Eve has absolute control over the noise processes. Our model relaxes this: Eve's control is not exact (deterministic) but probabilistic, yet still perfectly heralded. While Eve may fail in her interventions, she still learns the noise realized.
In this sense, generalizing the types of noise under which the protocol is provably secure in our model corresponds to scenarios which are ever closer to DI. Naturally, other generalizations of DI settings which make sense for protocols with quantum outputs may be possible. We proceed by first providing a security analysis for i.i.d. inputs, and then generalize to non-i.i.d. states. This is done by employing de-Finetti and post-selection symmetrization-based techniques. However, since we are interested in security in arbitrary contexts, we must go beyond the standard scenarios considered in entanglement distillation works [19,20,21] and explicitly consider the adversarial quantum systems (containing e.g. purifications of all quantum states) as well. Therefore the symmetrization-based techniques cannot be straightforwardly applied, but need to be adapted. We present and discuss the required additional preprocessing steps, and provide entanglement distillation protocols that are not restricted to i.i.d. inputs, but are capable of dealing with general inputs. The latter is related to recent results in [22,23,24].

Structure of the paper
The paper is organized as follows. In Sec. 3 we introduce the basic concepts, specify the overall setting and define the confidentiality of entanglement distillation protocols. Next, we summarize our main contribution in Sec. 4. In Sec. 5 we show confidentiality of recurrence-type entanglement distillation protocols by proving confidentiality for i.i.d. inputs in Sec. 5.1, and we extend these results to arbitrary initial states in Sec. 5.2 and 5.3. Finally we prove confidentiality when the noise transcripts leak to Eve.
Figure: Illustration of the overall setting. Eve provides the initial pairs to Alice and Bob, who run the entanglement distillation protocol. The noisy apparatus may leak the specification of the realized noise map to Eve after every step of the protocol.
The proposed overall protocol under the i.i.d. assumption involves several steps. First, Eve distributes $n$ pairs (the initial states) to Alice and Bob, who apply local "twirl" operations (random, correlated local operations). Next, Alice and Bob sacrifice some $m \approx \sqrt{n}$ pairs to check, via local $\sigma_x$ and $\sigma_z$ measurements, whether the fidelity of the pairs is sufficient for entanglement distillation, where the fidelity of density operators $\rho$ and $\sigma$ is given by $F(\rho, \sigma) = \mathrm{tr}\sqrt{\rho^{1/2} \sigma \rho^{1/2}}$. If the fidelity $F$ relative to $|B_{00}\rangle$ is insufficient, they abort. Otherwise they proceed with a recurrence-type entanglement distillation protocol to produce a high-fidelity Bell-pair from the remaining initial states; this step may also be aborted. Finally, Alice and Bob output their final state. For i.i.d. inputs, the twirl ensures that local $\sigma_z$ and $\sigma_x$ correlation measurements can be used to estimate the fidelity of individual pairs. This estimate is crucial for ensuring entanglement distillation via recurrence-type entanglement distillation protocols. Later, we will generalize to non-i.i.d. settings by prepending the protocol with symmetrization (permuting of the pairs) and tracing-out steps.
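The parameter estimation step can be illustrated with a minimal sketch. It assumes, beyond what the text states, that the twirled pairs are of Werner form, so that both correlations equal $(4F-1)/3$ with $F = \langle B_{00}|\rho|B_{00}\rangle$ taken as the overlap with the target Bell state, and that half of the $m \approx \sqrt{n}$ sacrificed pairs are measured in each basis. The function names and the sampling model are hypothetical and serve only as illustration.

```python
import math
import random


def werner_fidelity_from_correlations(c_xx, c_zz):
    """Overlap F = <B00|rho|B00> of a Werner state from <XX> and <ZZ>.

    Assumption: for a Werner state both correlations equal (4F - 1) / 3,
    hence F = (3 * (c_xx + c_zz) + 2) / 8.
    """
    return (3.0 * (c_xx + c_zz) + 2.0) / 8.0


def sample_correlation(c, shots, rng):
    """Simulate `shots` products of +/-1 outcomes whose mean value is c."""
    p_plus = (1.0 + c) / 2.0
    total = sum(1 if rng.random() < p_plus else -1 for _ in range(shots))
    return total / shots


def estimate_fidelity(true_f, n, seed=0):
    """Estimate F by sacrificing m ~ sqrt(n) pairs (hypothetical split:
    half measured in the x basis, half in the z basis)."""
    rng = random.Random(seed)
    m = max(2, math.isqrt(n))
    c = (4.0 * true_f - 1.0) / 3.0  # ideal Werner correlation
    half = m // 2
    c_xx = sample_correlation(c, half, rng)
    c_zz = sample_correlation(c, m - half, rng)
    return werner_fidelity_from_correlations(c_xx, c_zz)


if __name__ == "__main__":
    print(estimate_fidelity(true_f=0.9, n=10**6))
```

In an actual run, Alice and Bob would compare the estimate with the minimum fidelity required by the chosen distillation protocol and abort if it falls below that value.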
To formalize the security requirements, we define the ideal map $\mathcal{F}^{\alpha,l}$, mapping the initial states of Alice and Bob to a single Bell-pair, where $\alpha$ (abstractly) characterizes the noise levels in the channels connecting Eve to Alice and Bob, and also the noise of the local devices, and $l$ indicates that the noise transcripts leak to Eve. The ideal map can intuitively be thought of as a map which simulates a real protocol as follows. In the case of an abort, it replaces the final state with a fixed state $\sigma^{\perp}_{ABE}$. In the non-aborting case, however, it replaces the actual output with a special state $\sigma^{\alpha,P,l}_{ABE}$, which corresponds to the output of a real protocol where the noise transcripts leak to Eve, utilizing distillation protocol $P$, that was successfully run with asymptotically many high-fidelity i.i.d. initial pairs. This is the best the noisy entanglement distillation protocol $P$ could ever do. As we show later, $\sigma^{\alpha,P,l}_{ABE}$ is a well-defined state for the entanglement distillation protocols and noise parameters considered here. That is, it depends on the local noise parameters only, and not on the initial states. Formally, we have for a given real map (that is, the map realized by the execution of a real protocol) a corresponding ideal map
$(\mathcal{F}^{\alpha,l} \otimes \mathrm{id}_E)(|\psi\rangle\langle\psi|_{ABE}) = p_\rho\, \sigma^{\alpha,P,l}_{ABE} \otimes |ok\rangle\langle ok|_f + (1 - p_\rho)\, \sigma^{\perp}_{ABE} \otimes |fail\rangle\langle fail|_f$ (2)
where $|\psi\rangle_{ABE}$ is a purification of the initial $n$-partite ensemble $\rho^{(n)}_{AB}$ provided by Eve, $p_\rho$ is the success probability depending on the initial state $\rho^{(n)}_{AB}$, and $\sigma^{\perp}_{ABE}$ is a fixed state output if the protocol is aborted. Observe that the corresponding success probabilities $p_\rho$ are, by definition, identical for the real and ideal maps $\mathcal{E}^{\alpha,l}$ and $\mathcal{F}^{\alpha,l}$ in (1) and (2) respectively. The two-level flag system $f$ distinguishes the accepting and aborting branches.
The state $\sigma^{\alpha,P,l}_{ABE}$ is the asymptotic state of the entanglement distillation protocol $P$ and is of the form
$\sigma^{\alpha,P,l}_{ABE} = \sum_{i,j} \omega_{ij}(\alpha, P)\, |B_{ij}\rangle\langle B_{ij}|_{AB} \otimes |\eta_{ij}\rangle\langle\eta_{ij}|_E$ (3)
where $|\eta_{ij}\rangle$ are the leaked noise transcripts of Eve, $|B_{ij}\rangle = (\mathrm{id} \otimes \sigma_x^j \sigma_z^i)|B_{00}\rangle$ are the Bell-basis states, and $\omega_{ij}(\alpha, P)$ are probabilities which depend on the noise level of the local devices and the entanglement distillation protocol $P$. For instance, if the local devices are perfect, then $\omega_{ij} = 1$ if and only if $i = j = 0$, hence $AB$ contains a perfect Bell-pair. Finally, the states $|\eta_{ij}\rangle$ specify the sequences of noise operations, and are orthogonal for different $i, j$. If the noise transcripts are not leaked to Eve, we denote the ideal protocol by $\mathcal{F}^{\alpha}$. In that case, $|\eta_{ij}\rangle$ in (3) is not accessible to Eve, hence we replace $\sigma^{\alpha,P,l}_{ABE}$ by $\sigma^{\alpha,P}$ in (2). Observe that the ideal map $\mathcal{F}^{\alpha,l}$, which mathematically defines the type of process we wish to realize, is a global operation beyond LOCC (local operations and classical communication). It can be decomposed by concatenating the real protocol $\mathcal{E}^{\alpha,l}$ and a replacement map $S$ (which replaces the final state only if the real protocol succeeds according to the system $f$ in (2)), i.e. $\mathcal{F}^{\alpha,l} = S \circ \mathcal{E}^{\alpha,l}$.
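An entanglement distillation protocol (together with the noise maps), given as a CPTP map $\mathcal{E}^{\alpha,(l)}$, is $\varepsilon$-confidential if it is $\varepsilon$-close to the ideal map, i.e. if
$\| (\mathcal{E}^{\alpha,(l)} \otimes \mathrm{id}_E)(|\psi\rangle\langle\psi|_{ABE}) - (\mathcal{F}^{\alpha,(l)} \otimes \mathrm{id}_E)(|\psi\rangle\langle\psi|_{ABE}) \|_1 \le \varepsilon$ (4)
holds for all initial states $|\psi\rangle_{ABE}$, where $\|\rho\|_1 = \mathrm{tr}\sqrt{\rho\rho^{\dagger}}$ is the operator 1-norm for a density operator $\rho$.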
The system E above may contain any purification of the initial states Eve provided.
In this work, we use the term security in a generic sense, and the precise meaning depends on the context. For instance, in QKD applications, security means that Alice and Bob establish a perfectly random and secret key about which the adversary has negligible information [36,37,6,5,9,35]. In recent times, composable security definitions have become commonplace, in which, roughly speaking, security is defined via an ideal process, and the security level via the amount by which the process realized by the protocol deviates from the ideal process. In the context of QKD, this distance reduces to the distance between the final states generated by the ideal and the realized protocol. The ideal protocol outputs a completely mixed state on Alice's and Bob's systems which is in tensor product with Eve. More formally, see also [9], a QKD protocol $Q$ is said to be $\varepsilon$-secure for initial state $\rho_{ABE}$ if
$\| \sigma_{S_A S_B C E} - \sigma_{SS} \otimes \sigma_{CE} \|_1 \le \varepsilon$ (5)
holds, where $\sigma_{S_A S_B C E} = (Q \otimes \mathrm{id}_E)(\rho_{ABE})$, $S_A$ and $S_B$ denote the output systems of Alice and Bob (corresponding to the generated key), $C$ denotes the classical communication and $\sigma_{SS} = \frac{1}{|S|} \sum_{s \in S} |s\rangle\langle s| \otimes |s\rangle\langle s|$ for orthogonal states $|s\rangle$. The state $\sigma_{SS} \otimes \sigma_{CE}$ corresponds to the output of the ideal protocol.
The confidentiality criterion which we introduce here follows the distance-on-maps approach introduced in the context of QKD, as in e.g. [8]. Observe that such an approach is especially tailored to composing different protocols, as the confidentiality definition concerns the distance of the real process from an ideal process. Therefore the real and ideal maps $\mathcal{E}^{\alpha,(l)}$ and $\mathcal{F}^{\alpha,(l)}$ respectively are motivated by abstracting the protocol in terms of processes. It is straightforward to abstract and define the ideal map in terms of input and output relations, reflecting an ideal entanglement distillation process. As discussed above, the ideal protocol has an ok-branch and a fail-branch. The fail-branch corresponds to the case in which Alice and Bob abort the procedure, outputting the state $\sigma^{\perp}_{ABE}$. However, if the procedure succeeds, then we may think of the ideal map as running the entanglement distillation protocol for infinitely many initial states, ending up in the fixed state $\sigma^{\alpha,P,l}_{ABE}$ of the entanglement distillation protocol $P$ for noise level $\alpha$. We observe two important facts regarding that particular state: first, it is the best the entanglement distillation protocol $P$ can do in the presence of noise of level $\alpha$, and second, as Eve is disentangled from Alice and Bob, this state is useful for applications like quantum teleportation. Hence we refer to this state also as a private state, or equivalently, we say that Alice and Bob share private entanglement. In contrast to (5), the target state $\sigma^{\alpha,P,l}_{ABE}$ in the ok-branch is in tensor product with respect to Eve only if the noise transcripts do not leak to the adversary. In that case a secure quantum channel is feasible in terms of quantum teleportation. Otherwise, that is, if the noise transcripts $|\eta_{ij}\rangle$ leak to Eve, she is in a separable state with respect to Alice and Bob, which still enables confidential applications.
By confidential we mean here that, when the final state is used for quantum teleportation, no information about the teleported state is leaked; the final state does not, however, guarantee that Eve cannot change the teleported state. This observation motivates the term confidentiality rather than security. The classical communication is not correlated to the output of the real protocol, thus it can be ignored, see Appendix A for details. The robustness of the protocol is considered in Appendix E, which enables us to assume for the subsequent analysis that all basic distillation steps succeed.

Main contribution
We summarize the main findings of our paper as follows: recurrence-type entanglement distillation protocols prepended by a symmetrization and a system discarding step enable confidentiality, provided that the noise transcripts do not leak to the adversary, for all noise levels $\alpha$ for which distillation would be possible in the i.i.d. case. We also show that this alone implies that the final state in the accepting branch is close to a tensor product state, i.e. Eve is factored out. The results regarding the BBPSSW protocol [28] are analytic, whereas for the DEJMPS protocol [19] the results rely on strong numerical evidence. For low noise rates, we achieve better results via the post-selection-based reduction. In that case, no system discarding step is necessary. Finally, we find that if an entanglement distillation protocol is confidential when the noise transcripts do not leak, then it is also confidential if they do leak to the adversary. In particular, even in the case that Eve picks up information about all the realized noise processes during the protocol, the final output system still enables confidential quantum applications such as quantum teleportation. The paper proceeds as follows. We establish necessary conditions to guarantee confidentiality for recurrence-type entanglement distillation protocols restricted to i.i.d. inputs whenever the noise transcripts are not leaked to Eve. Then, we generalize these conditions to arbitrary initial states via the de-Finetti theorem [25]. Next, we use them to prove the confidentiality criterion (4) for entanglement distillation protocols where the noise transcripts are not leaked. Finally, this will be used to derive the confidentiality bound whenever the noise transcripts are leaked.

Entanglement distillation for i.i.d inputs
The basic step of a recurrence-type entanglement distillation protocol is summarized as follows: Alice and Bob share two noisy Bell-pairs, i.e. both have two qubits, each representing a "half" of a noisy Bell-pair, and they first apply local operations to their respective parts of the Bell-pairs; next, they measure one Bell-pair and classically communicate their outcomes. Depending on the entanglement distillation protocol and the outcomes, they either keep or discard the unmeasured pair. The basic step is applied to all pairs of the initial states, which comprises one distillation round. This distillation round is iterated, where the output states of the previous round are used as inputs for the next round. In the limit, a noiseless entanglement distillation protocol outputs a perfect Bell-pair (implying that Eve is factored out). Here, we allow for any type of noise acting (independently) on the single- and two-qubit gates appearing in the protocol. Using the results of [26], by utilizing random basis changes and adding additional noise, any such general noise can be brought to a standard form: depolarizing noise for imperfect single- and two-qubit CNOT-type operations, see Appendix A. Thus, it is sufficient to address noise in such standard form. For such noise, one can analytically show [27] that for the BBPSSW protocol [28] there exists a unique attracting fixed point of the protocol which depends only on the noise parameters. That is, whenever the fidelity of the initial states is above some minimum fidelity $F_{\min}$, depending on the noise parameters, the protocol converges towards that unique fixed point, which we denote by $\sigma^{\alpha;B}_{AB}$. Observe that $\sigma^{\alpha;B}_{AB}$ is related to $\sigma^{\alpha,P,l}_{ABE}$ of (3) by letting $P = B$ and tracing out Eve's system, i.e. $\sigma^{\alpha;B}_{AB} = \mathrm{tr}_E\, \sigma^{\alpha,B,l}_{ABE}$. In particular, by $P = B$ we mean that the BBPSSW protocol is used for entanglement distillation.
We find that the output state $\sigma^{N}_{AB}$, where $N = \log_2 n$ denotes the number of successfully completed distillation layers, satisfies $\|\sigma^{N}_{AB} - \sigma^{\alpha;B}_{AB}\|_1 \le \epsilon_B$, where $\epsilon_B$ is a function of $N$, and it holds that $\epsilon_B \le F(n) \in O(n^{-b_B(\alpha)})$ with $0 < b_B(\alpha) \le \log_2 3 - 1$. For the entanglement distillation protocol of Deutsch et al. [19] (referred to as the DEJMPS protocol) the fixed point analysis is more complicated. In the noiseless case, DEJMPS was proven to have a unique attracting fixed point [29]. For the noisy case, we can only provide extensive numerical evidence that there exists a unique attracting fixed point, depending on the noise parameters only, which we denote by $\sigma^{\alpha;D}_{AB}$, see Appendix A.1. Again, observe that $\sigma^{\alpha;D}_{AB}$ is related to $\sigma^{\alpha,P,l}_{ABE}$ of (3) by setting $P = D$ and tracing out Eve's system, i.e. $\sigma^{\alpha;D}_{AB} = \mathrm{tr}_E\, \sigma^{\alpha,D,l}_{ABE}$. We numerically find that the state $\sigma^{N}_{AB}$ obtained after successfully completing $N = \log_2 n$ layers of distillation satisfies $\|\sigma^{N}_{AB} - \sigma^{\alpha;D}_{AB}\|_1 \le \epsilon_D$, where $\epsilon_D$ is a function of $N$, and it holds that $\epsilon_D \le F(n) \in O(n^{-b_D(\alpha)})$, where $b_D(\alpha)$ is a positive function. We note that a similar analysis, but also with analytic findings for the noiseless DEJMPS protocol, was first performed in [29]. We reiterate that we assume for our analysis that all basic distillation steps succeed, since we deal with failures due to the entanglement distillation protocol with a quadratic overhead in terms of initial states, see Appendix E.
The final state of the entanglement distillation protocol $P$ in the ok-branch, $\sigma_{AB}$, depends on whether the parameter estimation on $\sqrt{n}$ initial states was accurate or not. An inaccurate estimate occurs with a probability that is exponentially small in the number of initial states, see the discussion of the robustness of the protocol in Appendix E. This in turn implies that the parameter estimation was accurate with probability exponentially close to unity. Therefore the results above regarding $n$ i.i.d. initial states as input to the distillation protocol $P$ imply that
$p_\rho \|\sigma_{AB} - \sigma^{\alpha;P}_{AB}\|_1 \le \epsilon_P(n) + 2p_{PE} \le \tilde{\epsilon}_P(n) =: \varepsilon_P(n + \sqrt{n})$.
This equation takes exactly the same form for both protocols, differing only in the labels: if we substitute $P$ with $B$ (writing, for example, $\epsilon_B(n)$) we refer to the BBPSSW protocol, whereas substituting $P$ with $D$ refers to the DEJMPS protocol. In similar fashion, from now on we write $\epsilon_P(n)$ for $\tilde{\epsilon}_P(n)$ for the sake of clarity. To summarize, the distance for $n + \sqrt{n}$ i.i.d. initial states in the ok-branch of the protocol is bounded by $\varepsilon_P(n + \sqrt{n})$. Since, in the abort case, the outputs of the overall protocol $\mathcal{E}^{\alpha}$ and the ideal protocol $\mathcal{F}^{\alpha}$ are identical, we obtain inequality (7), where the probability $p_\rho$ depends on the initial state $\rho$ for both protocols and corresponds to the probability of parameter estimation succeeding and of completing $\log_2(n - \sqrt{n})$ distillation layers successfully for initial state $\rho$. Hence, in both cases, the final distance to the respective fixed points scales polynomially in terms of $n$.
The functions $b_B(\alpha)$ and $b_D(\alpha)$ of the local noise level $\alpha$ govern the rate of convergence of the real protocol to the ideal protocol in the i.i.d. case for entanglement distillation protocols. We numerically found that these functions monotonically increase as the local noise rate $\alpha$ tends to zero, see Appendix A. Thus, increasing the fidelity of local devices (through e.g. fault tolerance) directly influences the rate of convergence, which in turn governs the confidentiality level. In contrast to $b_B(\alpha)$, the function $b_D(\alpha)$ is not upper bounded, which implies that for certain noise parameters $\alpha$ the DEJMPS protocol needs to perform fewer distillation rounds than the BBPSSW protocol to achieve the required confidentiality levels. Throughout, we assume that the noise characteristics of the quantum gates are constant throughout the protocol. This fast convergence is crucial for the powerful post-selection technique [8] for non-i.i.d. initial states, which is not applicable to the BBPSSW protocol. Now we use the established fixed point properties of entanglement distillation protocols for i.i.d. initial states to show that similar results hold for arbitrary initial states.
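The layered structure of the recurrence described in this section can be illustrated with a minimal sketch. It uses the textbook noiseless BBPSSW map on Werner states, which is a simplifying assumption: the analysis above concerns noisy local operations, for which the fixed point lies below unit fidelity and depends on $\alpha$. The function names are hypothetical, and every basic step is assumed to succeed.

```python
def bbpssw_step(f):
    """One noiseless BBPSSW step on two Werner pairs of fidelity f.

    Returns (output fidelity, success probability) of the standard
    textbook recurrence; local noise is NOT modelled here.
    """
    g = (1.0 - f) / 3.0  # weight of each of the three other Bell states
    p_success = f * f + 2.0 * f * g + 5.0 * g * g
    f_out = (f * f + g * g) / p_success
    return f_out, p_success


def distill(f0, n):
    """Fidelities after each of N = floor(log2 n) layers, starting from f0.

    Assumes n >= 1 initial pairs and that all basic steps succeed.
    """
    layers = n.bit_length() - 1  # floor(log2 n) without floating point
    fidelities = [f0]
    for _ in range(layers):
        f_next, _ = bbpssw_step(fidelities[-1])
        fidelities.append(f_next)
    return fidelities


if __name__ == "__main__":
    for layer, f in enumerate(distill(0.75, 1024)):
        print(layer, round(f, 6))
```

In this noiseless idealization the fidelity increases in every layer for any initial fidelity above 1/2, while 1/2 itself is left unchanged.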

Entanglement distillation for arbitrary inputs
In generalizing the previous results to arbitrary initial states we make use of the de-Finetti theorem [25]. The basic de-Finetti results guarantee that the reduced state $\mathrm{tr}_{n-k}\, \rho^{(n)}_{AB}$ of a permutation-invariant $n$-partite state $\rho^{(n)}_{AB}$ is close to a mixture of i.i.d. states $\int \sigma^{\otimes k}_{AB}\, d\sigma$, with a distance which scales as $O(k/n)$. This enables the following Lemma.
Lemma 2. Let $n, k \in \mathbb{N}$ where $k \le n$. Furthermore, let $\mathcal{E}^{s\&t}$ be the real protocol and $\mathcal{F}^{s\&t}$ the ideal protocol including symmetrization and the tracing out of $n - k$ pairs. Moreover, let $\rho_{AB}$ be a bipartite mixed state of $n$ systems shared by Alice and Bob and let $\mathcal{E}$ and $\mathcal{F}$ denote the real and ideal protocol after symmetrization and tracing out $n - k$ pairs. Then inequality (8) holds.
Proof. Let $\rho_{AB}$ be a mixed state. After Alice and Bob apply a symmetrization they share a permutation-invariant state $\bar{\rho}_{AB}$. Thus we can apply Theorem II.7 of [25] and have for $\xi^{k}_{AB} := \mathrm{tr}_{n-k}[\bar{\rho}_{AB}]$ the inequality $\| \xi^{k}_{AB} - \int \mu^{\otimes k}_{AB}\, dm(\mu_{AB}) \|_1 \le 32k/n$ for some probability measure $m$ on the set of mixed states on $AB$. Moreover we note that $\mathcal{E}$ and $\mathcal{F}$ are CPTP maps. We define $\tau^{k} := \int \mu^{\otimes k}_{AB}\, dm(\mu_{AB})$. A straightforward computation then shows (8), which completes the proof.
Therefore the application of the de-Finetti theorem introduces an additive term $64k/n$ when reducing arbitrary initial states to i.i.d. initial states. As the right-hand side of (8) is independent of the initial state $\rho_{AB}$, (8) holds for all initial states $\rho_{AB}$. In (8) we have omitted the superscript $\alpha$ characterizing the noise level, and we will use it only if it is specifically needed. Inequality (8) implies that the properties of the fixed point (uniqueness, attractivity, noise-dependence) also hold for arbitrary initial states, if the protocol is prepended by symmetrization and a trace-out step. This enables us to prove the confidentiality criterion of Definition 1 for entanglement distillation protocols where the noise transcripts of $L$ are not leaked, which will, in turn, imply the confidentiality criterion (4) whenever the noise transcripts are leaked.
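The trade-off implied by the additive term $64k/n$ can be made concrete with a minimal sketch. The helper names are hypothetical; only the term $64k/n$ itself is taken from the text, and the i.i.d. contribution to the bound is not modelled.

```python
import math


def de_finetti_penalty(n, k):
    """Additive term 64 k / n incurred when keeping k of n symmetrized pairs."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return 64.0 * k / n


def pairs_needed(k, delta):
    """Smallest n (with n >= k) such that 64 k / n <= delta."""
    if k <= 0 or delta <= 0:
        raise ValueError("k and delta must be positive")
    return max(k, math.ceil(64.0 * k / delta))


if __name__ == "__main__":
    # Keeping k = 1024 pairs with a penalty of at most 2**-10
    print(pairs_needed(1024, 2.0 ** -10))
```

The penalty only becomes small when $n$ exceeds $k$ by a large factor, which is the sense in which the de-Finetti-based reduction wastes pairs.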

Confidentiality of entanglement distillation protocols
The inequality in (7) establishes the local properties of the protocol, and is more or less typical for studies of the convergence of entanglement distillation protocols in the i.i.d. case. However, it falls short of the complete characterization captured by the confidentiality criterion (4) in two ways: first, the input states are restricted (i.i.d.); second, it fails to consider the purifying system of Eve, which is vital in cryptographic contexts. While the former issue is the subject of de-Finetti and post-selection-type reductions, the latter issue can be a problem in general, as a small distance between corresponding subsystems does not imply a small distance between the total systems. However, we can resolve this issue by using the fixed point properties of entanglement distillation protocols. More precisely, we relate the two distances by the following general Lemma, proven in Appendix B.1.

Lemma 3.
Let ρ be an arbitrary mixed state shared by Alice and Bob and let |ψ ABE be a purification thereof held by Eve. Furthermore, let P 1 correspond to a (distillation-type) real protocol and P 2 correspond to the associated (distillation-type) ideal protocol, i.e.
where α characterizes the level of the noise, σ α AB , and σ ⊥ AB are two fixed two qubit states. Furthermore, let P 1 and P 2 satisfy the following properties: (1) The noise transcripts do not leak to Eve.
(2) The protocol $P_1$ is guaranteed to converge towards some state $\sigma^{\alpha}_{AB}$ within the ok-branch of the protocol and
Then inequality (9) holds.
The factor $34 \cdot 4^{8} + 1$ arises as an upper bound on the distance of the given states from states in product form, based on the notion of non-steerability we introduce (see Appendix B.1 for details). In our computations we managed to prove the key lemma with a bound which is proportional to the dimension of the systems, more precisely, to the overall size of the corresponding density matrix. It may be the case that the bound of Lemma 3 holds without the dependence on the system size (and indeed, with smaller constants); however, this was not necessary for our purposes. Lemma 3 is vital as it allows us to employ the de-Finetti theorem [25]. Hence, for the protocols $\mathcal{E}^{s\&t}$ and $\mathcal{F}^{s\&t}$, by combining Lemma 2 with Lemma 3, we obtain the following Theorem.
Theorem 4 (de-Finetti-based reduction technique). Let $\mathcal{E}^{s\&t}$ be the real protocol and $\mathcal{F}^{s\&t}$ the ideal protocol including symmetrization and the tracing out of $n - k$ pairs, taking $n$ input pairs with $k \le n$ and utilizing entanglement distillation protocol $P$. Then inequality (10) holds, where $\varepsilon_P(k)$ denotes the maximum distance between the real and ideal protocol without the symmetrization and tracing-out step, using entanglement distillation protocol $P$, in the ok-branch for $k$ i.i.d. initial states, i.e. Eq. (7).
Proof. Suppose Eve prepares a purification $|\psi\rangle_{ABE}$ of the state $\rho_{AB}$ shared by Alice and Bob. Recall the real and ideal protocols including symmetrization and the tracing out of $n - k$ pairs, applied to the initial state $\rho_{AB}$, and observe that, by Lemma 2, inequality (11) holds for the initial state $\rho_{AB}$, where $\mathcal{E}$ and $\mathcal{F}$ denote the real and ideal protocol after symmetrization and tracing out $n - k$ pairs. Since the right-hand side of (11) is independent of the initial state $\rho_{AB}$, it holds for all initial states of the protocol. Therefore, the properties of the fixed point (unique, attracting and depending on the noise parameters only) translate from i.i.d. initial states to arbitrary initial states. Hence the protocol is guaranteed to converge towards the fixed point of the entanglement distillation protocol. Additionally, by inserting (7) in (11) we obtain (12). This implies that the real protocol indeed converges towards the fixed point, and thus we can apply Lemma 3 to the protocols $\mathcal{E}^{s\&t}$ and $\mathcal{F}^{s\&t}$ for the purification $|\psi\rangle_{ABE}$ of $\rho_{AB}$; using (12), we obtain (13). Taking the maximum in (13) completes the proof.
Thus, we can reach arbitrary confidentiality levels, however at the cost of wasting some pairs. The de-Finetti contribution to the confidentiality parameter, i.e. to the right-hand side of (10), decreases only inversely linearly in the number of initial states $n$ (it scales as $k/n$), due to the use of the "basic" de-Finetti approach.
If the local noise is low, we can do better in terms of scaling and efficiency, using the post-selection technique [8].
For that purpose, we first establish a result similar to (9) by using the fact that the resulting state of the protocol, including L, is pure, see Appendix A. More precisely, we have the following Lemma, proven in Appendix B.2.
Lemma 5. Let E be the real protocol which guarantees to converge towards a unique and attracting fixed point depending on the noise parameter only and let F be the ideal protocol. Furthermore let ρ be a mixed state (consisting of n systems) shared by Alice and Bob. If the extension of E and F to the system of L satisfies for all purifications |ψ ABE of ρ.
This Lemma allows us to prove the closeness on any purification from the closeness of the reduced systems, and finally to derive confidentiality from the performance of the ideal protocol via the following Theorem.
Theorem 6 (Post-selection-based reduction technique). Let $\mathcal{E}^{s}$ be the real protocol and $\mathcal{F}^{s}$ the ideal protocol preceded by a symmetrization step operating on $n$ input pairs. Furthermore, let (7) hold, where $\mathcal{E}$ and $\mathcal{F}$ denote the sub-protocols after symmetrization (i.e. the protocols without the symmetrization step) and $P$ the entanglement distillation protocol. Then inequality (14) holds, where $g_{n,d} = \binom{n+15}{n}$.
Proof. We observe that $\mathcal{E}^{s}$ and $\mathcal{F}^{s}$ are permutation-invariant maps due to the symmetrization step. Thus we can apply the post-selection technique of [8], which implies (15), where $|\tau\rangle_{ABE}$ is a purification of the de-Finetti Hilbert-Schmidt state, hence $\mathrm{tr}_E[|\tau\rangle\langle\tau|_{ABE}] = \int \mu^{\otimes n}_{AB}\, d\eta(\mu) =: \tau$, where $\eta$ is the measure induced by the Hilbert-Schmidt metric on $\mathrm{End}(\mathbb{C}^4)$. Furthermore, we note that (16) holds for the extensions of $\mathcal{E}^{s}$ and $\mathcal{F}^{s}$ to $L$, i.e. the maps $\mathcal{E}^{s}_{L}$ and $\mathcal{F}^{s}_{L}$. According to Appendix A.1.1, the distance including $L$ scales as the square root of the 1-norm induced distance without $L$ (i.e. on Alice and Bob only), so using the assumption we obtain a bound on (16). As $|\tau\rangle_{ABE}$ is a purification of $\tau$, we can apply Lemma 5, which gives the claimed bound for (15) and completes the proof.
Observe that $\varepsilon_P(n)$, which governs the rate of convergence of the overall protocol, relates to the rate of convergence of the entanglement distillation protocol $P$ via $\varepsilon_P(n) = \epsilon_P(n - \sqrt{n})$, as $\sqrt{n}$ initial states are used for parameter estimation. We remind the reader that the preprocessing steps (symmetrization, tracing out) of the entanglement distillation protocol and the Lemmas of this section are non-trivial and crucial for the proof of the de-Finetti-based and post-selection-based reduction techniques. Furthermore we point out that the proof regarding the BBPSSW protocol is analytic and necessarily relies on the de-Finetti-based reduction technique because of its slow convergence rate. The rate of convergence for the BBPSSW protocol can easily be derived, see Appendix A for details. For the DEJMPS protocol it turns out that we have polynomial scaling depending on the noise parameter $\alpha$, see (7). However, the protocol needs to converge sufficiently quickly, as the post-selection technique incurs a multiplicative increase in the effective distance between real and ideal protocols, which scales as a degree-15 polynomial in $n$, see (14). The resulting confidentiality level therefore scales as $O(n^{15 - b_D(\alpha)/4})$, which leads to an acceptable noise level that is rather low, e.g. about $10^{-19}$ for the DEJMPS protocol in the setting of binary pairs, see Appendix A.1.1. This very low rate is due to the polynomial factor introduced by applying the post-selection technique, i.e. $g_{n,d}$ in (14) with $d = 4$. Observe that these small rates are determined by properties of recurrence-type entanglement distillation protocols, i.e. $b(\alpha)$ for the recurrence-type entanglement distillation protocols studied here, and may be improved by either considering hashing-type protocols [30] or through fault-tolerant constructions. Indeed, the noise threshold for fault-tolerant quantum computation also applies to this case, yielding a tolerable noise level of about $10^{-4}$.
We reiterate that the post-selection technique is not applicable to the BBPSSW protocol, due to its slow convergence.
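To make the role of the post-selection prefactor concrete, the following minimal sketch evaluates $g_{n,d} = \binom{n+d^2-1}{n}$ (the expression restated in Appendix A) for d = 4 and compares it with the leading term $n^{15}/15!$ of the corresponding degree-15 polynomial. It also evaluates the exponent $15 - b/4$ of the confidentiality scaling quoted above. The function names and the sample values of n and b are illustrative assumptions, not part of the original analysis.

```python
from math import comb, factorial


def g(n: int, d: int) -> int:
    """Post-selection prefactor g_{n,d} = C(n + d^2 - 1, n)."""
    return comb(n + d * d - 1, n)


def leading_term(n: int, d: int) -> float:
    """Leading term n^(d^2-1) / (d^2-1)! of g_{n,d} seen as a polynomial in n."""
    k = d * d - 1
    return n ** k / factorial(k)


def decay_exponent(b: float, d: int = 4) -> float:
    """Exponent of n in the scaling O(n^{(d^2-1) - b/4}).

    The bound only tends to zero if this exponent is negative,
    i.e. for d = 4 if the convergence exponent b exceeds 60.
    """
    return (d * d - 1) - b / 4


if __name__ == "__main__":
    d = 4
    for n in (10**2, 10**4, 10**6):
        # The ratio approaches 1, illustrating the degree-15 polynomial growth.
        print(n, g(n, d) / leading_term(n, d))
    for b in (40.0, 60.0, 80.0):  # hypothetical convergence exponents
        print(b, decay_exponent(b))
```

Under the quoted scaling, a recurrence protocol therefore needs a convergence exponent above 60 before the polynomial prefactor is outweighed, which is one way to read why only very small noise levels are tolerated.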

Confidentiality of entanglement distillation protocols when the noise transcripts leak
Finally, we provide confidentiality guarantees for entanglement distillation protocols when the noise transcripts are leaked to Eve. For that purpose, we relate the confidentiality criterion (4) for protocols where the noise transcripts are leaked to the earlier results. More formally, we have the following Theorem.
Theorem 7. Let E be the real protocol and F be the ideal protocol satisfying the assumptions of Lemma 3. Furthermore, let E l denote the real and F l the ideal protocol when the noise transcripts leak to Eve. Then for all purifications |ψ ABE of initial state ρ AB consisting of n systems.
The proof, see Appendix C, uses the unitary equivalence of purifications. Theorem 7 establishes via (18) that if an entanglement distillation protocol is $\varepsilon$-confidential according to Definition 1, then the protocol is $2\sqrt{\varepsilon}$-confidential if the noisy apparatus leaks the noise transcripts.

Discussion
We have shown that recurrence-type entanglement distillation protocols ensure private entanglement without referring to the asymptotic limit. This holds true even when the local devices are noisy, and when the potential eavesdropper is able to completely monitor the operation of these devices at run-time (i.e., the noisy apparatus leaks information about the realized noise processes). If the noise transcripts are not leaked, Eve is "factored out", i.e. in tensor product with Alice and Bob; otherwise she is only classically correlated with them. Our protocol can, for instance, be used to realize confidential quantum channels by means of teleportation: the only information that may leak to Eve after teleportation is which noise map was applied to the sent state, but nothing about the state itself (see Appendix F for details). More generally, our results imply the confidentiality of the protocols in arbitrary settings (beyond the application to quantum channels), thus opening the way for the confidential realization of various quantum tasks, from establishing quantum channels and quantum networks to applications such as distributed quantum computation. Aside from cryptographic aspects, the proposed protocol can be used to generate high-quality entanglement from non-i.i.d. sources.

Acknowledgments:
We acknowledge the support by the Austrian Science Fund (FWF) through the SFB FoQuS F 4012 and project P28000-N27. AP and VD are grateful to Christopher Portmann for useful discussions, comments, and advice concerning technical aspects of this work.
Appendix A.1. The DEJMPS protocol

We first provide an overview of the DEJMPS protocol [19] and then extend the description incrementally to our proposed setting (including L and Eve). The DEJMPS protocol is a recurrence-type entanglement distillation protocol which combines several noisy copies of a mixed state $\rho$ to distill a state arbitrarily close to the maximally entangled state $|B_{00}\rangle$ (the Bell basis is denoted by $|B_{ij}\rangle$ with $i, j \in \{0, 1\}$), provided that the fidelity $F = \langle B_{00}|\rho|B_{00}\rangle$ satisfies F > 1/2 for the noiseless case. If the apparatus is noisy, then the fidelity F needs to satisfy $F > F_{\min}$ (where $F_{\min}$ depends on the noise level of the apparatus) to achieve distillation. For more details on recurrence-type entanglement distillation protocols in general we refer the interested reader to [31]. A basic step of the DEJMPS protocol is as follows:

Protocol 1: Basic step of the DEJMPS protocol
Require: Input state of Alice and Bob: $\rho^{(a_1,b_1)} \otimes \rho^{(a_2,b_2)}$
1: Alice and Bob apply the local basis change $U_x = e^{-i\pi/4\,\sigma^{(a_1)}}$
2: Alice and Bob apply a bilateral CNOT (BCNOT)
3: Alice and Bob perform a $\sigma_z^{(a_2)} = \sigma_z \otimes \mathrm{id}$ and a $\sigma_z^{(b_2)} = \mathrm{id} \otimes \sigma_z$ measurement
4: Alice and Bob communicate their measurement outcomes, $z_a$ and $z_b$ respectively, over a classical authentic channel
5: if $z_a = z_b$ then
6:   Alice and Bob keep the subsystems $a_1$ and $b_1$ of step 2
7:   Alice and Bob discard the measured subsystems $a_2$ and $b_2$
8: else
9:   Alice and Bob discard both pairs
10: end if

Hence, we can write one basic distillation step of the DEJMPS protocol as the linear map modulo a normalization factor and where $P_z = |z\rangle\langle z|$, $z \in \{0, 1\}$, denotes the projector for the respective outcome of step 3 of Protocol 1. The basic step is applied to all initial pairs, which constitutes one distillation round. Distillation rounds are iterated, with the output states of one round serving as inputs for the next.
We summarize the DEJMPS protocol as follows: starting from the input state of Alice and Bob, Protocol 1 is applied to all pairs, and the outputs of each distillation round are used as inputs for the next round; this loop is repeated. We remind the reader that the recurrence relations of the protocol (i.e. the update functions of the coefficients of an ensemble) are central for the convergence analysis of the DEJMPS protocol. For Bell-diagonal states, i.e. states of the form $\rho = \sum_{i,j} p_{ij} |B_{ij}\rangle\langle B_{ij}|$, the recurrence relations (A.1) are normalized by $N = (p_{00} + p_{11})^2 + (p_{01} + p_{10})^2$, see e.g. [19].
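The convergence behaviour of the noiseless recurrence can be illustrated with a short sketch. It assumes the update rules commonly quoted in the literature for DEJMPS on Bell-diagonal states, with the normalization $N = (p_{00}+p_{11})^2 + (p_{01}+p_{10})^2$ given above. The assignment of the two product terms to $p_{10}$ and $p_{11}$ is an assumption about the labelling convention, and the function names are illustrative.

```python
def dejmps_round(p):
    """One successful noiseless DEJMPS round on Bell-diagonal coefficients.

    p = (p00, p01, p10, p11); the index convention for the last two
    outputs is an assumption, the normalization N follows the text.
    """
    p00, p01, p10, p11 = p
    norm = (p00 + p11) ** 2 + (p01 + p10) ** 2
    return (
        (p00 ** 2 + p11 ** 2) / norm,
        (p01 ** 2 + p10 ** 2) / norm,
        2 * p00 * p11 / norm,
        2 * p01 * p10 / norm,
    )


def iterate(p, rounds):
    """Apply the recurrence for a given number of successful rounds."""
    for _ in range(rounds):
        p = dejmps_round(p)
    return p


if __name__ == "__main__":
    p = (0.7, 0.1, 0.1, 0.1)  # initial fidelity p00 = 0.7 > 1/2
    for n in range(6):
        print(n, iterate(p, n)[0])  # fidelity with |B00> after n rounds
```

Since n successful rounds consume $2^n$ input pairs, the round index in this sketch corresponds to $n = \log_2 N$ in the scaling arguments of this appendix.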
In [29] it has been shown analytically that the recurrence relations (A.1) converge towards a unique and attracting fixed point provided the initial fidelity with $|B_{00}\rangle$, $p_{00}$, is above 1/2. The recurrence relations of the DEJMPS protocol that take independent single-qubit white noise into account, i.e. noise of the form $\mathcal{N}\rho = f\rho + \frac{1-f}{4}(\rho + \sigma_x\rho\sigma_x + \sigma_y\rho\sigma_y + \sigma_z\rho\sigma_z)$ acting on each qubit of Alice, are considerably more complex. In the presence of noise we have strong numerical evidence that the DEJMPS protocol converges towards a unique and attracting fixed point depending on the noise level f only. Figure A1 suggests a linear relationship between $\log\|\rho_{\rm fix} - \rho_n\|_1$ (where $\rho_{\rm fix}$ and $\rho_n$ denote the fixed point and the state after successfully completing n distillation rounds, respectively) and the number of successful distillation rounds n. We immediately observe that the slope only depends on the noise parameter f, i.e. we have that Using $\log_2 N = n$, where N denotes the number of input pairs, this implies as mentioned in the main text. Furthermore, we numerically find that the function b(f) grows monotonically for $f \to 1$.
For two qubit correlated noise, we refer the reader to the analysis including L, as the fixed point and the scaling can be recovered from that analysis by tracing out the system of L. Appendix A.1.1. Detailed analysis including L We outline the remainder of this section as follows: First we derive the recurrence relations of the DEJMPS protocol in the most general setting, taking the noise applied by L into account as well as assuming that Eve receives the leaked noise transcripts of L. We use those recurrence relations in the next subsection to provide analytical results regarding the fixed point of the recurrence relations, where the inputs are binary pairs and L only applies either id or σ x operators. We close the section with numerical results for general i.i.d. Bell-diagonal pairs and the most general noise maps of L.
The recurrence relations For i.i.d. input states the state of each system subject to distillation at an intermediate distillation round of the DEJMPS protocol is of the form |Ψ ABEL = i,j,k,l P ijkl |B ij AB |kl L |ijkl E , where P ijkl are probability amplitudes, if we assume the noise is leaked to Eve after every distillation round. The system AB models the pair of Alice and Bob, L the system of L (where the content of the register corresponds to the effective noise introduced to AB) and E the system of Eve. L applies the noise processes before a basic protocol step to the systems of Alice. Moreover, L keeps track of the effective noise introduced using its system in a sense we clarify later. In the following we use the notation σ 0,0 = id, σ 0,1 = σ x , σ 1,0 = σ z , σ 1,1 = σ y for the four Pauli-operators. Furthermore we denote by superscripts in brackets particle labels and by superscripts without brackets the power of an operator. L introduces the noise maps U α1,β1,α2,β2 = U . We observe that applying the noise map U α1,β1,α2,β2 might flip the contents of the registers L 1 and L 2 depending on the values of α 1 , β 1 , α 2 and β 2 . This enables L to keep track of the noise introduced to a pair. There are two approaches how L can apply the noise maps U α1,β1,α2,β2 : stochastically in terms of CPTP maps, or coherently in terms of unitaries acting on an enlarged Hilbert space. Here we assume the latter approach, but provide the analysis of the noisy DEJMPS protocol in terms of CPTP maps and purifications.
To show that these are equivalent, first suppose that L owns a register H set to the state α1,β1,α2,β2 f α1,β1,α2,β2 |α 1 β 1 α 2 β 2 H wheref α1,β1,α2,β2 are the probabilities of applying the respective noise map U α1,β1,α2,β2 . L uses the register H to apply the noise maps U α1,β1,α2,β2 coherently controlled to the input state |Ψ ABEL . We observe that tracing out H after applying all the noise maps U α1,β1,α2,β2 in a controlled fashion yields On the other hand, assume that L applies the noise process in terms of a CPTP map N , i.e.
We observe that N ρ will be, in general, a mixed state, thus there exists a purification on a larger Hilbert space. As all purifications are unitarily equivalent, see e.g. [32], we choose the purification Hence tr H [|Φ Φ|] = N ρ. Furthermore, we observe that the pure state |Φ can be generated by applying the unitaries U α1,β1,α2,β2 , coherently controlled by the register H, to |Ψ ⊗ |Ψ ⊗ α1,β1,α2,β2 f α1,β1,α2,β2 |α 1 β 1 α 2 β 2 H . This equivalence allows us to assume that L introduces the noise as a CPTP map, applying U α1,β1,α2,β2 with respective probabilities f α1,β1,α2,β2 and purifying the state after the basic distillation step is executed by Alice and Bob. Since the noise of L is applied before the basic distillation step is executed by Alice and Bob, the result of one noisy distillation step reads as which needs finally to be purified. In order to evaluate (A.2), we proceed as follows: • Step 1: We first compute which corresponds to the state after the noise map U α2,β2 is applied by L and the basic distillation step of the entanglement distillation protocol is executed by Alice and Bob.
• Step 2: We apply the unitary U u , which acts only on L's systems and whose purpose we clarify later, to the previous equality. • Step 3: We have to determine the purification held by Eve if the noise is leaked to her. In doing so, we trace out Eve and then provide her with the purification of the resulting state (which corresponds to leaking the noise transcripts to Eve).
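The equivalence between applying the noise coherently (controlled by the register H, which is then traced out) and applying it stochastically as a CPTP map can be checked on a toy example. The sketch below is a deliberate simplification: it uses a single qubit and the four Pauli operators $\sigma_{\alpha,\beta}$ instead of the two-pair maps $U_{\alpha_1,\beta_1,\alpha_2,\beta_2}$, and all function names and numerical values are illustrative assumptions.

```python
import math

# Pauli operators in the notation sigma_{0,0}=id, sigma_{0,1}=X, sigma_{1,0}=Z, sigma_{1,1}=Y
PAULIS = {
    (0, 0): [[1, 0], [0, 1]],
    (0, 1): [[0, 1], [1, 0]],
    (1, 0): [[1, 0], [0, -1]],
    (1, 1): [[0, -1j], [1j, 0]],
}


def apply_op(op, vec):
    """Apply a 2x2 matrix to a single-qubit state vector."""
    return [op[r][0] * vec[0] + op[r][1] * vec[1] for r in range(2)]


def outer(vec):
    """Return |vec><vec| as a 2x2 matrix."""
    return [[vec[r] * complex(vec[c]).conjugate() for c in range(2)] for r in range(2)]


def add_scaled(acc, mat, weight):
    """In-place update acc += weight * mat for 2x2 matrices."""
    for r in range(2):
        for c in range(2):
            acc[r][c] += weight * mat[r][c]


def stochastic_noise(psi, probs):
    """CPTP picture: rho -> sum_k f_k sigma_k |psi><psi| sigma_k^dagger."""
    rho = [[0j, 0j], [0j, 0j]]
    for label, f in probs.items():
        add_scaled(rho, outer(apply_op(PAULIS[label], psi)), f)
    return rho


def coherent_noise_then_trace(psi, probs):
    """Coherent picture: H is prepared in sum_k sqrt(f_k)|k>, sigma_k is applied
    controlled on |k>_H, and H is traced out afterwards."""
    # One branch sqrt(f_k) sigma_k |psi> per orthonormal control state |k>_H.
    branches = [
        [math.sqrt(f) * amp for amp in apply_op(PAULIS[label], psi)]
        for label, f in probs.items()
    ]
    rho = [[0j, 0j], [0j, 0j]]
    for branch in branches:
        # Orthonormality of |k>_H turns the partial trace into a sum over branches.
        add_scaled(rho, outer(branch), 1.0)
    return rho


if __name__ == "__main__":
    psi = [0.6, 0.8j]
    f = 0.9
    probs = {(0, 0): f, (0, 1): (1 - f) / 3, (1, 0): (1 - f) / 3, (1, 1): (1 - f) / 3}
    print(stochastic_noise(psi, probs))
    print(coherent_noise_then_trace(psi, probs))
```

Both functions return the same reduced state, which is the single-qubit analogue of the statement that tracing out H after the controlled noise maps reproduces the CPTP description.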
Step 1: We observe that applying the noise map U (a1) α,β to |Ψ yields This observation suggests the following notational simplifications: Using this notation we rewrite (A.3) as U (a1) . This is the state of Alice, Bob, L, and Eve after the noise map U (a1) α,β is applied by L to the first pair. In order to compute (A.2) we define which corresponds to the state after the noise map U α2,β2 is applied and Following Protocol 1, a σ z -measurement of the target pair of the BCNOT, i.e. the subsystem AB 2 , is applied to (A.6). Next Alice and Bob communicate their respective measurement outcomes over a classic authentic channel. If the measurement outcomes coincide, Alice and Bob keep the source pair, i.e. subsystem AB 1 of step 2, else they discard both subsystems AB 1 and AB 2 . We assume that both measurements yield the outcome 1. If both measurement outcomes yield 0, no phase factor (−1) i2 would be required in the expression (A.7). The coinciding measurement outcomes imply i 1 ⊕ j 1 ⊕ i 2 ⊕ j 2 = 0. To summarize, the state post-selected on the measurement outcomes 1 of Alice and Bob is i1,j1,i2,j2 k1,l1,k2,l2 . (A.7) Step 2: Recall that L stores in its register attached to the pair of Alice and Bob the effective noise introduced. For that purpose we introduce the unitary U u as well as an ancilla system L 3 set to the state |00 L3 . Applying U u to all three registers of L yields U u |00 |i |j |i |j = |u(i, j, i , j ) |i |j |i |j where u is the so called flag update function defined in [20]. The function u returns the effective noise introduced on the source pair of step 2 of Protocol 1. Applying U u to (A.7) gives Ψ α1,β1,α2,β2 = i1,j1,i2,j2 k1,l1,k2,l2 .
We remind the reader that Ψ α1,β1,α2,β2 is the state after the application of i) the noise map U α2,β2 , ii) a basic distillation step, and iii) the update of L's noise register by U u .
From (A.19) we observe that in the limit the 'cross-probabilities' $p_{01}$ and $p_{10}$ vanish, hence L is fully correlated to AB. It is of central importance for convergence that the fixed point $p^\infty$ is an attractor, as only this ensures convergence towards that fixed point. Note that $p^\infty$ is an attractor if and only if the largest eigenvalue $\lambda_{\max}$ of $f'(p^\infty)$ satisfies $\lambda_{\max} < 1$. We easily find that $\lambda_{\max} = (\tilde f_0\sqrt{4\tilde f_0 - 3} - \tilde f_0)/(2\tilde f_0 - 1) < 1$ for $0.78 \le \tilde f_0 \le 1$. The fixed point $p^\infty$ enables us to determine the rate of convergence. For that purpose, we expand f in terms of its Taylor series around the fixed point $p^\infty$, i.e. $\tilde p = f(p) \approx f(p^\infty) + f'(p^\infty)(p - p^\infty)$. Hence by defining $e = p - p^\infty$ we find $\tilde e = f'(p^\infty) e$, providing an estimate of the error propagation for one successful distillation round. The states of Alice, Bob, and L after n successful distillation rounds and at the fixed point read $\rho_n = \sum_{ij} p^{(n)}_{ij} |B_{0i}\rangle\langle B_{0i}|_{AB} \otimes |\eta_j\rangle\langle\eta_j|_L$ and $\rho_{\rm fix} = \sum_i p^\infty_{ii} |B_{0i}\rangle\langle B_{0i}|_{AB} \otimes |\eta_i\rangle\langle\eta_i|_L$ respectively, which implies a bound on their distance induced by the 1-norm, (A.20). Note that (A.20) only concerns the systems of Alice, Bob, and L.

To complete the analysis we recall that Eve purifies $\rho_n$ and $\rho_{\rm fix}$ with the leaked noise transcripts of L. If we take this purifying system, E, into account, i.e. consider $\| |\psi_n\rangle\langle\psi_n|_{ABEL} - |\psi_\alpha\rangle\langle\psi_\alpha|_{ABEL} \|_1$ where $\rho_n = \mathrm{tr}_E[|\psi_n\rangle\langle\psi_n|_{ABEL}]$ and $|\psi_\alpha\rangle$ analogously purifies $\rho_{\rm fix}$, we obtain (A.21), since purifications scale with a square root. In order to apply the post-selection-based reduction, we need to relate the previously obtained results for i.i.d. input pairs to general ensembles. As stated in the main text, we exclude the parameter estimation step on $\sqrt{n}$ initial states for simplicity. We remind the reader, as we have stated in the main text, that for all purifications $|\psi\rangle_{ABE}$ of an n-partite input state $\rho_{AB}$ we have (A.22), where $g_{n,d} = \binom{n+d^2-1}{n}$. Thus, inserting the previous result for $2^n$ i.i.d. input states (necessary to achieve n rounds of distillation) in (A.22) yields One square root in the expression above arises from inequality (A.21) and the other square root appears from inequality (A.22). Hence, for confidentiality we necessarily need $g_{2^n,d}\,\epsilon_n^{1/4} \to 0$ for $n \to \infty$. Thus $\epsilon_n^{1/4}$ should decay faster than $g_{2^n,d}$ grows in n. Numerical simulations suggest that, for $\tilde f_0 = 1 - 10^{-19}$, this turns out to be true, i.e. the post-selection-based reduction is applicable (see Figure A2). As stated in the main text, such noise levels are unlikely to be achievable on the physical level, but they are, at least in principle, possible through fault-tolerant constructions.

Fixed point and convergence -General pairs
In the following we show that the previously established results also hold true for the general i.i.d. setting where L applies all four Pauli operators and each individual pair is arbitrary. We remind the reader that the recurrence relations for states $\sum_{i,j,k,l} P_{ijkl} |B_{ij}\rangle_{AB} \otimes |\eta_{kl}\rangle_L \otimes |\eta_{ijkl}\rangle_E$ (i.e. Eve purifies $\rho_n = \sum_{i,j,k,l} |P_{ijkl}|^2 |B_{ij}\rangle\langle B_{ij}|_{AB} \otimes |\eta_{kl}\rangle\langle\eta_{kl}|_L$ with the leaked noise transcripts) read (by denoting $|P_{ijkl}|^2 = p_{ijkl}$) as modulo the normalization factor $\sum_{\delta_0\delta_1\gamma_0\gamma_1} \tilde p_{\delta_0\delta_1\gamma_0\gamma_1}$. For simplicity we assume independent single-qubit white noise, i.e. $\tilde f_{\alpha_1,\beta_1,\alpha_2,\beta_2} = \tilde f_{\alpha_1,\beta_1}\tilde f_{\alpha_2,\beta_2}$ as well as $\tilde f_{\alpha_1,\beta_1} = f$ if $\alpha_1 = \beta_1 = 0$ and $(1 - f)/3$ otherwise. Furthermore, we assume that the initial fidelity F with $|B_{00}\rangle$ is sufficiently high for distillation. Numerically iterating the recurrence relations (which we again denote by $p \mapsto \tilde p = f(p)$) reveals that, for a sufficiently large number of iterations, the 'cross-probabilities' vanish, i.e. $p^\infty_{ijkl} = 0 \Leftrightarrow i \ne k$ or $j \ne l$. Hence, to obtain a fixed point $p^\infty = (p^\infty_{ijkl})_{i,j,k,l=0}^{1}$ of f, it is reasonable to assume that $p^\infty_{ijkl} = 0 \Leftrightarrow i \ne k$ or $j \ne l$. Thus the fixed point $p^\infty$ is determined by four equations in four unknowns, namely the equations where $\delta_0, \delta_1 \in \{0, 1\}$ and $N = \sum_{\delta_0,\delta_1} p_{\delta_0\delta_1\delta_0\delta_1}$. Figure A3 illustrates the numerical estimate of $p^\infty_{0000}$ as a function of f.

Figure A3. The fixed-point value $p^\infty_{0000}$ (vertical axis) as a function of f (horizontal axis). The fidelity with $|B_{00}\rangle$ of the asymptotic state is equal to unity for a perfect apparatus.
Similar to the case of binary pairs, we can write the recurrence relations f in terms of its Taylor series expansion around the fixed point p ∞ , i.e.p = f (p) ≈ f (p ∞ ) + f (p ∞ )(p − p ∞ ). Hence by defining e = p − p ∞ we havẽ e = f (p ∞ )e, i.e. as for binary pairs, the error induced by the 1−norm of the state of Alice, Bob, and L after n successful distillation rounds satisfies (A.23) Figure A4 suggests a linear relationship between the number of successful distillation rounds n and log f (p ∞ ) n−1 for each noise level f , i.e. b(f )n + a(f ) = log f (p ∞ ) n−1 . As the number N of pairs necessary to achieve n distillation rounds is N = 2 n (⇔ n = log 2 N ) we have b(f ) log 2 N + a(f ) = log f (p ∞ ) n−1 , which is equivalent to .
What is left to show is that the fixed point $p^\infty$ is an attracting fixed point. For that purpose we numerically compute the largest eigenvalue of $f'(p^\infty)$, see Fig. A5, and observe that, for noise below $10^{-1}$, i.e. $1 - f < 10^{-1}$, the largest eigenvalue $\lambda_{\max}$ of $f'(p^\infty)$ fulfills $\lambda_{\max} < 1$, showing numerically that $p^\infty$ is an attracting fixed point. This implies that, if the initial fidelity F with $|B_{00}\rangle$ is sufficiently large for distillation, the DEJMPS protocol necessarily converges towards the fixed point $p^\infty$ where the 'cross-probabilities' vanish. The analysis so far still lacks Eve's system E for the leaked noise transcripts. Suppose $|\psi_n\rangle_{ABEL}$ and $|\psi_f\rangle_{ABEL}$ are purifications of $\rho_n$ and $\rho_{\rm fix}$, i.e. $\rho_n = \mathrm{tr}_E[|\psi_n\rangle\langle\psi_n|]$ and $\rho_{\rm fix} = \mathrm{tr}_E[|\psi_f\rangle\langle\psi_f|]$ respectively. This implies, as in the case of binary pairs, that the distance between the purifications scales as the square root of the distance between $\rho_n$ and $\rho_{\rm fix}$, which we also confirmed with our numerical results.

It is straightforward to extend the analysis above to two-qubit correlated noise introduced by L on the system of Alice and Bob. For that purpose we assume that $\tilde f_{\alpha_1,\beta_1,\alpha_2,\beta_2} = \tilde f + (1 - \tilde f)/16$ if $\alpha_1 = \beta_1 = \alpha_2 = \beta_2 = 0$ and $(1 - \tilde f)/16$ otherwise. Also in that case we numerically observe that $p^\infty_{ijkl} = 0 \Leftrightarrow i \ne k$ or $j \ne l$. Hence it is reasonable to assume that $p^\infty_{ijkl} = 0 \Leftrightarrow i \ne k$ or $j \ne l$ in order to obtain a fixed point. The fixed point $p^\infty$ is determined by four equations in four unknowns, namely the equations where $\delta_0, \delta_1 \in \{0, 1\}$ and $N = \sum_{\delta_0,\delta_1} p_{\delta_0\delta_1\delta_0\delta_1}$. Figure A6 illustrates the numerical estimate of $p^\infty_{0000}$ as a function of $\tilde f$.

Figure A6. The fixed-point value $p^\infty_{0000}$ (vertical axis) as a function of $\tilde f$ (horizontal axis) for two-qubit correlated noise. The fidelity with $|B_{00}\rangle$ of the asymptotic state is equal to unity for a perfect apparatus.
Furthermore we numerically compute the largest eigenvalue of f (p ∞ ) and observe that iff > 0.8284, the largest eigenvalue λ max of f (p ∞ ) fulfills λ max < 1, hence p ∞ is an attracting fixed point, see Fig. A7. Finally, we obtain again a linear relationship between the number of successful distillation rounds n and log f (p ∞ ) n−1 for each noise levelf , i.e. b 2 (f )n + a 2 (f ) = log f (p ∞ ) n−1 , see Fig. A8. This implies, similar to the case of single qubit white noise, that the right-hand-side of (A.23) converges polynomial fast towards zero in terms of initial states. The rate of convergence is governed byf , i.e. ρ n −ρ fix Taking the system of leaking noise transcripts into account,this implies that n = |ψ n ψ n | − ψf ψf To conclude the analysis, we now show that the noise model of two-qubit depolarizing noise is actually sufficient to cover any noise process for two-qubit operations. This is the case because for any CNOT-type gate (which we need to apply in the case of both recurrence-type entanglement distillation protocols we consider), one can depolarize these gates to a standard form [26]. This is done by randomly applying single-qubit operations before and after the application of the gate, which allows one to reduce any noise characteristics to a specific form with 8 parameters without altering the fidelity of the gate. A further simplification is possible if the noise characteristic of the apparatus is known [26], which could in some cases be achieved through quantum process tomography. In this case, one can add additional (local) noise by randomly choosing to apply the gate, or some other (separable) operation. This allows one to bring any CNOT-type gate (i.e. any two-qubit gate that is equivalent to a CNOT gate up to single qubit unitary operations that are applied before and after the gate) to the standard form As outlined in [26] this depolarization procedure causes a change in the gate fidelity of the utilized quantum gates. 
More precisely, if the fidelity of the quantum gate before the depolarization was F g = 1 − x then the gate fidelity after the depolarization is F g > 1 − 17x. Thus one reduces the quality of the gate by about an order of magnitude in the worst case by depolarizing to this standard form.
We observe that (A.24) can be rewritten as α1,β1,α2,β2 σ α1,β1 σ α2,β2 ρσ α1,β1 σ α2,β2 We observe that the noise maps U α2,β2 in (A.26) act on Alice's part of the systems only. But this is sufficient due to the symmetry of Bell-states -noise on Bobs side can be moved to the other side. Furthermore the additional σ x -flips introduced on the system(s) of L by the unitaries U α2,β2 are used to keep track of the noise map applied. Because Alice and Bob apply the depolarization procedure as described in [26] and L keeps track of the effective error introduced, we can safely assume that the additional σ x -flips will be introduced after Alice and Bob complete the depolarization procedure, hence it is sufficient to consider two qubit correlated noise introduced at Alice's part of the systems.

Appendix A.2. The BBPSSW protocol
The protocol proposed in [28] (also referred to as BBPSSW protocol) is very similar to the DEJMPS protocol. Instead of step 1 of Protocol 1 Alice and Bob apply a correlated depolarization procedure (twirl) to their input states which brings them to Werner form. For the subsequent analysis, suppose that each pair of Alice and Bob is of the form ρ(p) = p |B 00 B 00 | +(1−p) 1 4 id. We assume that the apparatus applies independent and identical noise of the form N ρ(p) = f ρ(p)+(1−f )/4(ρ(p)+ σ x ρ(p)σ x + σ y ρ(p)σ y + σ z ρ(p)σ z ) before each distillation step. In similar fashion to the DEJMPS protocol one easily obtains the recurrence relation for the noisy BBPSSW protocol: The fixed point p ∞ of the protocol is obtained by solving the equation b(p ∞ ) = p ∞ . A straightforward computation gives the fixed point p ∞ = 2/3 + 1/3 4 − 9/f 2 + 6/f (which depends on the noise parameter f ). It was shown in [27] that this fixed point is an attractor assuming sufficiently high initial fidelity with |B 00 per input pair. Expressing the recurrence relation b in terms of its Taylor series around p ∞ leads tõ Hence, (A.27) provides an approximation of the error in terms of fidelity with |B 00 after n+1 successful distillation rounds, i.e. n+1 = (b (p ∞ )) n 1 , see also the plots within Fig. A9. Moreover, we compute the first derivative of b by From this we conclude that, if the apparatus is perfect, i.e. f = 1 in (A.28), the error in terms of fidelity with |B 00 after n + 1 successful distillation rounds scales as n+1 = (2/3) n 1 . Using log 2 N = n, where N denotes the number of initial states, we infer for n+1 that This implies that n+1 scales as F (N ) ∈ O(N log 2 b (p ∞ ) ) and thus ρ fix − ρ n 1 , where ρ fix and ρ n denote the fixed point and the state after n successful distillation rounds respectively, scales also as F (N ) ∈ O(N log 2 b (p ∞ ) ) as mentioned in the main text. 
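The fixed point and the convergence rate quoted above can be cross-checked numerically. The sketch below assumes that the white noise simply rescales the Werner parameter, $p \to fp$, on each pair before an ideal BBPSSW step, which gives the recurrence $b(p) = (4f^2p^2 + 2fp)/(3(1 + f^2p^2))$. This explicit form is a reconstruction under that assumption rather than a quotation, but it reproduces the stated fixed point $p^\infty = 2/3 + \tfrac{1}{3}\sqrt{4 - 9/f^2 + 6/f}$ and the factor 2/3 for a perfect apparatus. All function names are illustrative.

```python
import math


def b(p: float, f: float) -> float:
    """One successful noisy BBPSSW round on the Werner parameter p.

    Assumption: noise acts as p -> f*p on each input pair before the ideal step.
    """
    q = f * p
    return (4 * q * q + 2 * q) / (3 * (1 + q * q))


def b_prime(p: float, f: float) -> float:
    """Derivative of b with respect to p (chain rule through q = f*p)."""
    q = f * p
    return f * (2 + 8 * q - 2 * q * q) / (3 * (1 + q * q) ** 2)


def fixed_point(f: float) -> float:
    """Fixed point p_inf = 2/3 + (1/3) sqrt(4 - 9/f^2 + 6/f) as stated in the text."""
    return 2 / 3 + math.sqrt(4 - 9 / f ** 2 + 6 / f) / 3


def distill(p: float, f: float, rounds: int) -> float:
    """Iterate the recurrence for a number of successful rounds."""
    for _ in range(rounds):
        p = b(p, f)
    return p


if __name__ == "__main__":
    f = 0.99
    p_inf = fixed_point(f)
    print("fixed point:", p_inf, "residual:", b(p_inf, f) - p_inf)
    print("contraction factor b'(p_inf):", b_prime(p_inf, f))
    # Exponent of N in the error scaling O(N^{log2 b'(p_inf)}) for a perfect apparatus.
    print("log2 b'(1) at f = 1:", math.log2(b_prime(1.0, 1.0)))
```

Because $b'(p^\infty) < 1$ in this regime, the error contracts by a constant factor per round, which translates into the polynomial scaling in the number of initial states $N = 2^n$ discussed above.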
For the analysis of two-qubit correlated noise we assume that the noisy operations used by the BBPSSW protocol are of the form where $\rho$ is a two-qubit density operator and $O^{\rm ideal}_{12}$ denotes the ideal two-qubit quantum gate. Observe that (A.29) coincides with the standard form of [26]. If the noisy quantum gates are not of the form (A.29), we bring them to that standard form via the same depolarization procedure mentioned in the analysis of the DEJMPS protocol. Hence the following analysis is not restricted to this specific noise model, but actually applies to arbitrary noise processes describing noisy two-qubit gates. It has been shown in [27] that the BBPSSW protocol converges for noisy CNOT gates of the form (A.29) to a unique and attracting fixed point if $\tilde f$ is sufficiently high. The recurrence relation for the fidelity relative to $|B_{00}\rangle$ obtained in [27] is given by the formula Hence one obtains as in [27] the respective fixed points of (A.30) to be For $F \in (F_{\min}, F_{\max})$ we have that $F' > F$, which shows that $F_{\max}$ is an attracting fixed point. By replacing F in (A.30) with $\tilde b(F)$ we observe, similar to (A.27), that the error after n + 1 successful distillation rounds scales for two-qubit correlated noise as $F(N) \in O(N^{\log_2 \tilde b'(F_{\max})})$, where N denotes the number of initial states.

Finally we provide a worst-case analysis of the BBPSSW protocol. For that purpose assume the following scenario: the noisy apparatus performs with probability $f_I$ the ideal distillation step $\mathcal{E}_I$ and introduces with probability $1 - f_I$ an arbitrary noise map $\mathcal{E}_\perp$. More precisely, we decompose the distillation step taken by Alice and Bob before the measurement of the target system as the CP map where $\rho$ is a four-qubit density operator. Notice that one can always decompose a noisy map in this form, where both maps are completely positive and trace preserving. We remark, however, that the map $\mathcal{E}_I$ denotes the ideal protocol which includes an abort option, i.e. we only keep the first pair if the results of the measurements on the second pair coincide. The map $\mathcal{E}_\perp$ may similarly contain such an abort branch. The noise parameter $f_I$ describes the quality of the overall map, i.e. one can think of the process as follows: with probability $f_I$ the desired procedure (including gates and measurements) is performed, while with probability $1 - f_I$ something else happens (described by the map $\mathcal{E}_\perp$).
We will now consider the worst case for the map $\mathcal{E}_\perp$ with respect to entanglement distillation. The worst case for the BBPSSW protocol is that the apparatus introduces a state orthogonal to $|B_{00}\rangle$ on the source system and the state $|B_{00}\rangle$ on the target system, as this will always contribute to the overall success probability of a distillation step of the BBPSSW protocol but lead at the same time to a lower fidelity relative to $|B_{00}\rangle$ after the measurement of the target system compared to the ideal distillation step. One example for such a map is given by $\mathcal{E}_\perp(\rho) = |B_{01}\rangle\langle B_{01}| \otimes |B_{00}\rangle\langle B_{00}|$. Any other map will lead to a larger fidelity after the distillation step followed by depolarization to Werner form. We thus have for the fidelity relative to $|B_{00}\rangle$ . This formula can be understood as follows: the ideal protocol is applied with probability $f_I$, and succeeds with probability $f_{\rm suc}$, thereby producing a fidelity $\tilde F$. The map $\mathcal{E}_\perp$ is applied with probability $1 - f_I$, never aborts, and does not contribute to the final fidelity (which is clearly the worst case).

We now analyze the worst-case scenario, i.e. we assume equality in (A.31). Since we know that at each step the actual noise map produces an output density operator with a larger fidelity than the worst-case map, we can conclude that the resulting fidelity of any noise map will be larger than the fixed point which is achieved by the worst-case map. We remark, however, that this does not constitute a full confidentiality proof for arbitrary noise maps, as it is not evident from this analysis that for any fixed noise map a unique fixed point is reached. Assuming equality in (A.31), one can compute that the fixed points of the noisy BBPSSW protocol are in this case given by the solutions of which only depend on the noise parameter $f_I$. We define $g_{\rm fix}(x, f_I) = -f_I + (9 - 2f_I)x - 14 f_I x^2 + 8 f_I x^3$, which implies that (A.32) reads $g_{\rm fix}(F_\infty, f_I) = 0$.

The number of real solutions of (A.32) is determined by the discriminant of $g_{\rm fix}$. Evaluating the discriminant of $g_{\rm fix}$, we obtain three possible fixed points for $f_I > f_{I{\rm crit}}$. Hence we need to show that the fixed point with the highest fidelity relative to $|B_{00}\rangle$ obtained via (A.32) is an attracting fixed point. We do so by showing that $F' > F$ for $F \in (F_{\min}, F_{\max})$ (where $F_{\min}$ denotes the second, and $F_{\max}$ the third fixed point in Fig. A11). From Fig. A12 we find that $F' - F > 0$ for $f_I > f_{I{\rm crit}}$, hence $F' > F$, which shows that $F_{\max}$ is an attracting fixed point whenever starting with initial fidelity $F > F_{\min}$. Furthermore, by assuming equality in (A.31) and replacing F with $b_\perp(F)$, we observe, similar to (A.27), that the error after n + 1 successful distillation rounds scales in this worst-case analysis as $F(N) \in O(N^{\log_2 b_\perp'(F_{\max})})$, where N denotes the number of initial states.
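The discriminant argument can be explored numerically with a short sketch. It uses the cubic $g_{\rm fix}(x, f_I) = -f_I + (9 - 2f_I)x - 14 f_I x^2 + 8 f_I x^3$ defined above together with the textbook discriminant of a cubic, and locates by bisection the value of $f_I$ at which the discriminant changes sign. The bracketing interval [0.9, 1], the function names, and the resulting numerical threshold are properties of this sketch and are not values quoted from the analysis.

```python
def g_fix(x: float, f_i: float) -> float:
    """Fixed-point polynomial g_fix(x, f_I) of the worst-case BBPSSW analysis."""
    return -f_i + (9 - 2 * f_i) * x - 14 * f_i * x ** 2 + 8 * f_i * x ** 3


def discriminant(f_i: float) -> float:
    """Discriminant of a*x^3 + b*x^2 + c*x + d with the coefficients of g_fix.

    A positive value means three distinct real roots, a negative value one real root.
    """
    a, b, c, d = 8 * f_i, -14 * f_i, 9 - 2 * f_i, -f_i
    return (18 * a * b * c * d - 4 * b ** 3 * d + b * b * c * c
            - 4 * a * c ** 3 - 27 * a * a * d * d)


def critical_noise(lo: float = 0.9, hi: float = 1.0, tol: float = 1e-12) -> float:
    """Bisection for the sign change of the discriminant.

    Assumes discriminant(lo) < 0 < discriminant(hi) on the chosen bracket.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if discriminant(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # For a perfect apparatus the cubic factorizes with roots 1/4, 1/2 and 1.
    print([g_fix(x, 1.0) for x in (0.25, 0.5, 1.0)])
    print("discriminant at f_I = 1:", discriminant(1.0))
    print("sign change of the discriminant near f_I =", critical_noise())
```

For $f_I = 1$ the three roots 1/4, 1/2 and 1 coincide with the familiar fixed points of the noiseless BBPSSW recurrence, which serves as a sanity check of the cubic.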

Appendix B. Confidentiality of entanglement distillation protocols
In this section we provide the proofs of Lemma 3 and Lemma 5 of the main text, crucial for the de-Finetti-based and post-selection-based reduction techniques. Both proofs require only one specific property of the real protocol E α : after passing the parameter estimation phase the entanglement distillation protocol always converges to one fixed point, i.e. the fixed point is unique, an attractor for all the states which pass the parameter estimation and depends on the noise parameters only, as this implies that the distance with respect to the 1−norm within the ok−branch of the protocol is bounded and converges towards zero.
Appendix B.1. Proof of Lemma 3 We first state the following lemma which establishes a connection between measurements on one subsystem of a bipartite state and tensor product states.
where |φ ∈ H A and p A (φ) = tr(|φ φ| ρ A ). If ρ φ B − ρ B 1 ≤ for all |φ ∈ H A , then where C only depends on the dimensions of A and B. In particular, if we fix the number of qubits of A and B to 2 respectively, then we have C = 4 8 .
Proof. In the following we denote the four Pauli operators by First we decompose ρ AB in the Pauli basis, i.e. we have where n and m denote the number of qubits of A and B respectively and we use the notations i = (i 1 , .., i n ) and j = (j 1 , .., j m ) where each i k and j k are in {0, .., 3} as well as σ i = n k=1 σ i k and σ j = m k=1 σ j k . Recall that tr(σ 0 ) = 2 and tr(σ 1 ) = tr(σ 2 ) = tr(σ 3 ) = 0. From this one easily computes ρ A and ρ B by where a = (α 00 , .., α 3 n 3 m ), a = (α 00 α 00 , .., α 3 n 0 α 03 m ) and · 1;C 4 n+m denotes the 1−norm of vectors in C 4 n+m .
Hence in order to prove (B.1) it is sufficient to prove $\|a - a'\|_{1;\mathbb{C}^{4^{n+m}}} \le 2C\epsilon$. By assumption we have for $\rho_B^{\phi}$, where $|\phi\rangle \in \mathcal{H}_A$ and $p_A(\phi) = \mathrm{tr}((|\phi\rangle\langle\phi| \otimes I)\rho_{AB})$, that $\|\rho_B^{\phi} - \rho_B\|_1 \le \epsilon$ for all $|\phi\rangle \in \mathcal{H}_A$. Moreover, according to Theorem 9.1 in [32] we have for all $|\xi\rangle \in \mathcal{H}_B$ where $p_B(\xi|\phi)$ denotes the conditional probability of obtaining the outcome $\phi$ on system A and the outcome $\xi$ on system B and $\{E_m\}$ denotes a POVM on the subsystem B. Suppose we perform a projective measurement on the systems of A and B, denoted by $\{|\psi_k\rangle_{AB}\} = \{|\phi_k\rangle_A \otimes |\xi_k\rangle_B\}$ where $k \in \{1, .., 4^{n+m}\}$, on $\rho_{AB}$ and $\rho_A \otimes \rho_B$. This yields for the respective probabilities $p_{AB}(\psi_k)$ and $q_{AB}(\psi_k)$ of observing outcome $k$ for $\rho_{AB}$ and $\rho_A \otimes \rho_B$ where $p_B(\xi_k|\phi_k)$ denotes the conditional probability of obtaining outcome $\phi_k$ on system A first and obtaining outcome $\xi_k$ on system B. We observe $p_A(\phi_k) = q_A(\phi_k)$. Thus we obtain using (B.6). In order to compute a bound for (B.5) we use quantum state tomography, see e.g. [33]. For that purpose we perform an informationally complete POVM induced by different separable bases on $\mathcal{H}_A \otimes \mathcal{H}_B$. More precisely, we choose as many POVMs as are needed to obtain $4^{n+m}$ different outcomes in total. We observe for $|\psi_k\rangle_{AB} = |\phi_k\rangle_A \otimes |\xi_k\rangle_B$ that Enumerating (B.7) for $1 \le k \le 4^{n+m}$ yields $4^{n+m}$ equations for $a$, i.e.
as well as $4^{n+m}$ equations for $a'$ ...
Roughly speaking, Lemma 8 states that if all post-selected reduced states of a bipartite state, where each partition consists of two qubits, are $\eta$-close, then the overall state is $2 \cdot 4^8 \eta$-close to a product state. We gave the lemma in a more general form as it may have utility beyond the scope of this paper. However, for our purposes we need a stronger, but more specific result. In the following lemma we will show that we can achieve the same result even if the measurements must succeed above a threshold, which is important in the application of the lemma.
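The coefficient vectors used in the proof of Lemma 8 can be illustrated numerically. The following is a minimal sketch for the smallest case $n = m = 1$, assuming the normalization $\alpha_{ij} = \mathrm{tr}(\rho_{AB}\,\sigma_i \otimes \sigma_j)$, which is a choice made here because the decomposition equation itself is not shown above. The function names are hypothetical. With this convention the coefficients of $\rho_A \otimes \rho_B$ are $\alpha_{i0}\alpha_{0j}$, so the vector 1-norm $\|a - a'\|_1$ vanishes exactly for product states.

```python
import numpy as np

# Single-qubit Pauli operators; sigma_0 is the identity.
PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def pauli_coefficients(rho_ab):
    """alpha[i, j] = tr(rho_ab (sigma_i x sigma_j)) for one qubit per side.

    The normalization is an assumption of this sketch, not taken from the text.
    """
    alpha = np.empty((4, 4))
    for i, s_i in enumerate(PAULIS):
        for j, s_j in enumerate(PAULIS):
            alpha[i, j] = np.trace(rho_ab @ np.kron(s_i, s_j)).real
    return alpha


def coefficient_distance(rho_ab):
    """Vector 1-norm between a (state) and a' (product of its marginals)."""
    alpha = pauli_coefficients(rho_ab)
    alpha_product = np.outer(alpha[:, 0], alpha[0, :])  # alpha_{i0} * alpha_{0j}
    return np.abs(alpha - alpha_product).sum()


zero = np.array([1, 0], dtype=complex)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
product_state = np.kron(np.outer(zero, zero.conj()), np.outer(plus, plus.conj()))

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
bell_state = np.outer(bell, bell.conj())

print("product state:", coefficient_distance(product_state))
print("Bell state:   ", coefficient_distance(bell_state))
```

The product state gives a vanishing distance, while the maximally entangled state does not, which is the qualitative behaviour the lemma quantifies.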
Lemma 9. In the situation of Lemma 8 for $n = m = 2$ it suffices to consider measurements on the subsystem A which have a probability greater than or equal to 1/16. More precisely, for every state $\rho_{AB}$ there exists a unitary $U$ acting on system A and a state $\rho'_{AB} = (U \otimes I_B)\rho_{AB}(U \otimes I_B)^\dagger$, such that if the state $\rho'_{AB}$ meets the conditions of Lemma 8, i.e. subsystem B is $\epsilon$-non-steerable via measurements on subsystem A for all measurements with probability greater than or equal to 1/16, then the conclusion of Lemma 8 holds.
Proof. First we construct the state $\rho'_{AB}$ associated with $\rho_{AB}$ and show that it suffices to consider measurements of probability greater than or equal to 1/16. Recall the situation of Lemma 8. Let $\rho_{AB}$ be a bipartite (in general, mixed) state and let $\rho_A = \mathrm{tr}_B[\rho_{AB}]$ and $\rho_B = \mathrm{tr}_A[\rho_{AB}]$. Furthermore let $\rho_B^{\phi}$ be defined as where $|\phi\rangle \in \mathcal{H}_A$ and $p_A(\phi) = \mathrm{tr}(|\phi\rangle\langle\phi| \rho_A)$. Then the claim of Lemma 8 was: where $C$ only depends on the dimensions of A and B. In particular, if we fix the number of qubits of A and B to 2 each, then we have $C = 4^8$. Further recall the set (B.18)-(B.21). In order to prove the claim, we use the following observation: The state $\rho_A = \mathrm{tr}_B[\rho_{AB}]$ is a two-qubit quantum state, so it can be written as where the states $|\Psi_j\rangle$ correspond to the (orthogonal) eigenstates of $\rho_A$ for the real non-negative eigenvalues $\lambda_j$. Since the four eigenvalues sum to one, there exists at least one $j \in \{0, 1, 2, 3\}$ such that $\lambda_j \ge 1/4$, which corresponds to the maximum of the eigenvalues $\lambda_j$. Now we choose a local unitary $U$ such that $U|\Psi_j\rangle = |0\rangle \otimes |0\rangle$. Applying this unitary to (B.22) therefore leads to the state where $|\varphi_j\rangle = U|\Psi_j\rangle$. We compute the probability for any projector applied on $\rho'_A$ which is taken from the set (B.18)-(B.21) and of the form $|\phi\rangle\langle\phi| = |\phi_k\rangle\langle\phi_k| \otimes |\phi_l\rangle\langle\phi_l|$ by where $p_A(\phi) \ge 1/16$. Furthermore assume as in Lemma 8 that $\|\rho_B^{\phi} - \rho_B\|_1 \le \epsilon$ for all such $|\phi\rangle \in \mathcal{H}_A$.
Then Lemma 8 yields its bound for $\rho'_{AB}$. The proof is completed by observing that $\rho'_{AB}$ and $\rho'_A \otimes \rho'_B$ are related by the local unitary $U$ to $\rho_{AB}$ and $\rho_A \otimes \rho_B$, together with the unitary invariance of the trace distance, i.e. $\|\rho'_{AB} - \rho'_A \otimes \rho'_B\|_1 = \|\rho_{AB} - \rho_A \otimes \rho_B\|_1$.
We observe that, due to the proof of Lemma 9, which relies on the informationally complete set (B.18)-(B.21), it suffices to be non-steerable with respect to the measurements within that set for a probability of measurement above or equal to 1/16. We actually have proven a stronger result, as the actual choice of measurements does not matter, provided the probability of success is above or equal to the threshold 1/16.
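Two ingredients of the proof of Lemma 9 are easy to check numerically: the largest eigenvalue of a two-qubit state is at least 1/4, and a local unitary on A leaves the trace distance to the product of the marginals unchanged. The sketch below does this for a randomly drawn state; the helper names and the random test state are assumptions made for illustration, and the measurement set (B.18)-(B.21) is not modelled.

```python
import numpy as np


def random_density_matrix(dim, rng):
    """Random full-rank density matrix (illustrative test input only)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real


def reduced_states(rho_ab, dim_a, dim_b):
    """Partial traces of a bipartite state ordered as A (x) B."""
    t = rho_ab.reshape(dim_a, dim_b, dim_a, dim_b)
    rho_a = np.einsum("abcb->ac", t)  # trace out B
    rho_b = np.einsum("abad->bd", t)  # trace out A
    return rho_a, rho_b


def rotate_top_eigenvector(rho_a):
    """Return (U, lambda_max) with U |Psi_max> = |00>, the first basis vector."""
    eigvals, eigvecs = np.linalg.eigh(rho_a)  # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # largest first
    basis = eigvecs[:, order]
    return basis.conj().T, eigvals[order[0]]


def trace_norm(x):
    """Sum of singular values, i.e. the 1-norm used in the text."""
    return np.linalg.svd(x, compute_uv=False).sum()


rng = np.random.default_rng(0)
rho_ab = random_density_matrix(16, rng)          # two qubits on each side
rho_a, rho_b = reduced_states(rho_ab, 4, 4)
u, lam_max = rotate_top_eigenvector(rho_a)

big_u = np.kron(u, np.eye(4))                    # U acts on A only
rho_ab_rot = big_u @ rho_ab @ big_u.conj().T
rho_a_rot, rho_b_rot = reduced_states(rho_ab_rot, 4, 4)

dist_before = trace_norm(rho_ab - np.kron(rho_a, rho_b))
dist_after = trace_norm(rho_ab_rot - np.kron(rho_a_rot, rho_b_rot))

print("largest eigenvalue:", lam_max)
print("population of |00> after rotation:", rho_a_rot[0, 0].real)
print("trace distances:", dist_before, dist_after)
```

After the rotation the population of $|0\rangle \otimes |0\rangle$ equals the largest eigenvalue, which is what allows the proof to lower-bound the success probabilities of the measurements it considers.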

Lemma (Lemma 3 in main text - Product Form Lemma).
Let $\rho$ be an arbitrary mixed state shared by Alice and Bob and let $|\psi\rangle_{ABE}$ be a purification thereof held by Eve. Furthermore, let $P_1$ correspond to a (distillation-type) real protocol and $P_2$ correspond to the associated (distillation-type) ideal protocol, i.e.
where $\alpha$ characterizes the level of the noise, and $\sigma^{\alpha}_{AB}$ and $\sigma^{\perp}_{AB}$ are two fixed two-qubit states. Furthermore, let $P_1$ and $P_2$ satisfy the following properties: (1) The noise transcripts do not leak to Eve.
(2) The protocol $P_1$ is guaranteed to converge towards some state $\sigma^{\alpha}_{AB}$ within the ok-branch of the protocol and $\max_{\mu_{AB}} \|(P_1 - P_2)(\mu_{AB})\|_1 \le \varepsilon$.
Then it holds that
Proof. The proof relies on Lemmas 8 and 9. Suppose Eve prepares the pure state $|\psi\rangle_{ABE}$ and let $\mathrm{tr}_E[|\psi\rangle\langle\psi|] = \rho_{AB}$ be the state received by Alice and Bob. Then we have If we post-select Eq. (B.29) on the ok-branch we have after normalization Since the protocol is by definition performed by Alice and Bob only, any measurement of Eve in the ok-branch can be commuted to the beginning of the protocol $P_1$, because Eve is not part of the protocol. Hence her measurement only changes the input of the protocol $P_1$ and thus either causes an abort or not.
because p E (φ) ≥ 1/16 and max µ AB (P 1 − P 2 )(µ AB ) 1 is bounded by ε by assumption. Hence we apply Lemma 8 to σ ABE with = 17 pρ ε which implies for the distance between σ ABE and σ AB ⊗ σ E that where the factor 4 8 is the constant C of Lemma 8 depending on the dimensions of the systems of Alice/Bob and Eve, for which we have n = m = 2. Furthermore, this implies via Lemma 9 that because σ ABE and σ AB ⊗ σ E are unitarily related to σ ABE and σ AB ⊗ σ E via the unitary U on Eve's system. Finally, employing (B.36) in (B.28) yields
Proof. As mentioned in the main text, we introduce a two-level flag system held by Alice which indicates whether they aborted the protocol or not. So we observe where E denotes the system of leaked noise transcripts to Eve. By assumption we have E L (ρ) − F L (ρ) 1 ≤ ε(n). This is equivalent to p ρ σ ABEL − |ψ f ψ f | ABEL 1 ≤ ε(n) since E L (ρ) and F L (ρ) are equal on the fail branch. This we can rewrite to σ ABEL − |ψ f ψ f | ABEL 1 ≤ ε(n)/p ρ . Moreover, applying the real and ideal protocol to the purification |ψ ABE results in Again, both expressions are equal in the fail branch, thus the 1-norm simplifies to Hence it is sufficient to show p ρ σ ABEE − σ f ABE ⊗ ρ E 1 ≤ 4 ε(n). We observe that by introducing the system L held by L that One easily verifies tr E [σ ABELE ] = σ ABEL and tr ABEL [σ ABELE ] = ρ E because the system E is not changed by the protocol E. Moreover, by assumption we have σ ABEL − |ψ f ψ f | ABEL 1 ≤ ε(n)/p ρ . Thus we apply Lemma 10 to ρ A B := σ ABELE and ϕ A B = |ψ f ψ f | ABEL ⊗ ρ E where A := ABEL and B := E which implies which completes the proof.
Appendix C. Confidentiality of entanglement distillation protocols whenever the noise transcripts leak
In this section we show how the confidentiality guarantees regarding an entanglement distillation protocol can be extended to the case where the noise transcripts leak to Eve. We remind the reader that it is not necessary to leak the noise transcripts to Eve after every single distillation round. It is sufficient to copy all noise transcripts to Eve's register at the very end, as L is not accessible and Eve is not part of the protocol being executed by Alice and Bob.
Theorem (Theorem 7 in main text). Let $\mathcal{E}$ be the real protocol and $\mathcal{F}$ be the ideal protocol. Furthermore, let $\mathcal{E}_l$ be the real and $\mathcal{F}_l$ be the ideal protocol when the noise transcripts leak to Eve. Then for all purifications $|\psi\rangle_{ABE}$ of the initial state $\rho_{AB}$ consisting of $n$ systems

Suppose Alice and Bob apply a secret twirl $T$ to (D.1), i.e. they stochastically apply the family of operators $\{\mathrm{id}, K_1, K_2, K_1 K_2\}$ where $K_1 = \sigma_x \otimes \sigma_x$ and $K_2 = \sigma_z \otimes \sigma_z$. These are two stabilizers of the Bell state. Hence, applying the secret twirl $T$ to (D.1) gives $T\rho_{ABE} = \sum_{r_1, r_2} \sum_{i_1, i_2, j_1, j_2, k, l} \dots$ Note that in the resulting state $\sum_{i_1, j_1} |B_{i_1 j_1}\rangle\langle B_{i_1 j_1}| \otimes \sum_{k, l} |P_{i_1 j_1 k l}|^2\, |\eta_{i_1 j_1 k l}\rangle\langle \eta_{i_1 j_1 k l}|$ Eve decouples, i.e. Alice/Bob and Eve have a separable state. The obtained resource state can be used to establish a confidential quantum channel by means of quantum teleportation.
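The effect of the twirl on the Alice/Bob system alone can be checked with a short numerical sketch. It averages a two-qubit state over $\{\mathrm{id}, K_1, K_2, K_1 K_2\}$ with equal weights and confirms that the result is diagonal in the Bell basis, because the four Bell states carry four distinct pairs of eigenvalues under $K_1$ and $K_2$. Eve's register is left out, and the uniform weights, the ordering of the Bell basis and the random test state are assumptions of this sketch.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

K1 = np.kron(X, X)  # sigma_x (x) sigma_x
K2 = np.kron(Z, Z)  # sigma_z (x) sigma_z
TWIRL_GROUP = [np.eye(4, dtype=complex), K1, K2, K1 @ K2]


def twirl(rho):
    """Uniform average over {id, K1, K2, K1 K2} acting by conjugation."""
    return sum(g @ rho @ g.conj().T for g in TWIRL_GROUP) / len(TWIRL_GROUP)


# Columns are the four Bell states (ordering chosen for this sketch).
s = 1 / np.sqrt(2)
BELL = np.array(
    [
        [s, 0, 0, s],    # (|00> + |11>)/sqrt(2)
        [s, 0, 0, -s],   # (|00> - |11>)/sqrt(2)
        [0, s, s, 0],    # (|01> + |10>)/sqrt(2)
        [0, s, -s, 0],   # (|01> - |10>)/sqrt(2)
    ],
    dtype=complex,
).T


def in_bell_basis(rho):
    return BELL.conj().T @ rho @ BELL


rng = np.random.default_rng(1)
g = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rho = g @ g.conj().T
rho /= np.trace(rho).real

before = in_bell_basis(rho)
after = in_bell_basis(twirl(rho))
off_diagonal = after - np.diag(np.diag(after))

print("largest off-diagonal magnitude after twirl:", np.abs(off_diagonal).max())
print("Bell populations before:", np.diag(before).real)
print("Bell populations after: ", np.diag(after).real)
```

The Bell-basis populations are untouched by the twirl; only the coherences between different Bell states are removed.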

Appendix E. Robustness of recurrence-type entanglement distillation protocol
To complete the security characterization of entanglement distillation protocols we also consider the robustness of an entanglement distillation protocol. To define this term precisely we first need the definition of an honest eavesdropper.
Definition 11. We call an eavesdropper honest if the states sent by the eavesdropper are of the form $|B_{00}\rangle^{\otimes 2^n}$.
It is obvious that an honest eavesdropper is not entangled with the ensemble delivered to Alice and Bob via the noisy quantum channel. Moreover we formally define the robustness of a protocol by:
Definition 12 (Robustness of a protocol). We call a protocol $\mathcal{E}_\alpha$ $\varepsilon_R$-robust if, for an honest eavesdropper, the probability of aborting the protocol is at most $\varepsilon_R$.
Now we show that we can tune the robustness of a recurrence-type entanglement distillation protocol to be exponentially small in terms of the necessary number of input pairs.
Theorem 13. Let $M \in \mathbb{N}$ be such that Alice and Bob achieve $\varepsilon$-confidentiality by successfully completing $M$ rounds of a recurrence-type entanglement distillation protocol. Furthermore assume that Alice and Bob receive $n$ pairs from an honest eavesdropper over the quantum channel $\Phi^{\otimes n}$ (where $\Phi(\rho) = \beta\rho + (1 - \beta)/4 \sum_{i,j} \sigma_{i,j}\rho\sigma_{i,j}$) such that, after the parameter estimation step of the proposed protocol, $k - \sqrt{k}$ pairs (where $k - \sqrt{k} = c2^M$ and $c = \xi 2^{M+2}$) are left for entanglement distillation. Then, the robustness $\varepsilon_R$ of the protocol is bounded by
Proof. The basic idea of the proof is to request sufficiently many pairs from Eve such that the probabilities of abort during the protocol become exponentially small while still having enough pairs left to complete $M$ rounds of a recurrence-type entanglement distillation protocol. We divide the proof into two parts:
• Part 1: We prove that the probability of aborting the recurrence-type entanglement distillation protocol due to parameter estimation is exponentially small.
• Part 2: We prove the same holds true for aborting the protocol during entanglement distillation.
Part 1: Suppose Eve sends the state $|B_{00}\rangle^{\otimes n}$ through the noisy quantum channel $\Phi^{\otimes n}$ to Alice and Bob. Applying $\Phi$ to $|B_{00}\rangle\langle B_{00}|$ yields
$\rho_{AB} = \Phi(|B_{00}\rangle\langle B_{00}|) = \frac{3\beta + 1}{4} |B_{00}\rangle\langle B_{00}| + \frac{1 - \beta}{4} \left(|B_{10}\rangle\langle B_{10}| + |B_{01}\rangle\langle B_{01}| + |B_{11}\rangle\langle B_{11}|\right)$. (E.1)
Thus the state Alice and Bob receive is $\rho_{AB}^{\otimes n}$. According to the preceding protocols proposed in the main text, Alice and Bob apply a symmetrization to $\rho_{AB}^{\otimes n}$, and, depending on the noise level of the apparatus, they might have to trace out $n - k$ pairs or not. For the subsequent analysis we assume that this tracing out step is necessary, i.e. the de-Finetti-based reduction needs to be applied. Hence, Alice and Bob continue by applying a twirl to each remaining pair. Since $\rho_{AB}^{\otimes k}$ is invariant under permutations and $\rho_{AB}$ is Bell-diagonal, the remaining state after twirling is equal to $\rho_{AB}^{\otimes k}$. Next, they apply the parameter estimation to $\sqrt{k}$ of the remaining $k$ pairs in order to estimate the fidelity of each pair. A necessary condition for convergence of all recurrence-type entanglement distillation protocols is that the fidelity $F$ of $\rho_{AB}$ with $|B_{00}\rangle$ satisfies $F > F_{\min}(\alpha)$. Hence this step is crucial in order to guarantee successful distillation. For that purpose, we measure $\sqrt{k}$ of the $k$ pairs by applying two-qubit measurements.
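Equation (E.1) can be verified numerically. The sketch below assumes that the operators $\sigma_{i,j}$ in the definition of $\Phi$ are the four single-qubit Pauli operators (including the identity) acting on one qubit of the pair, and that $|B_{00}\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$, the Bell state stabilized by $\sigma_x \otimes \sigma_x$ and $\sigma_z \otimes \sigma_z$. Both are readings chosen here because they reproduce (E.1); the value of $\beta$ is arbitrary.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
SINGLE_QUBIT_PAULIS = [I2, X, Y, Z]


def channel(rho, beta):
    """beta * rho + (1 - beta)/4 * sum_P (P x I) rho (P x I)^dagger.

    Assumption of this sketch: the Paulis act on the first qubit of the pair.
    """
    noise = sum(
        np.kron(p, I2) @ rho @ np.kron(p, I2).conj().T
        for p in SINGLE_QUBIT_PAULIS
    )
    return beta * rho + (1 - beta) / 4 * noise


b00 = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
beta = 0.7  # illustrative value

rho_out = channel(np.outer(b00, b00.conj()), beta)
fidelity = (b00.conj() @ rho_out @ b00).real
spectrum = np.sort(np.linalg.eigvalsh(rho_out))

print("fidelity with |B00>:", fidelity)
print("(3*beta + 1)/4     :", (3 * beta + 1) / 4)
print("spectrum           :", spectrum)
```

The output state has weight $(3\beta + 1)/4$ on $|B_{00}\rangle$ and $(1 - \beta)/4$ on each of the three orthogonal Bell states, in agreement with (E.1).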
To be more precise, we apply a $\sigma_x \otimes \sigma_x$ measurement to the first and a $\sigma_z \otimes \sigma_z$ measurement to the second pair. We refer to these measurements as $M_1$ and $M_2$ respectively. We observe that the state $|B_{00}\rangle$ is a common eigenstate of $M_1$ and $M_2$ with eigenvalue 1. We assign to each pair of pairs a random variable $X_i$ for $i \in \{1, .., \sqrt{k}/2\}$ with $X_i = 1$ whenever both measurements $M_1$ and $M_2$ yield outcome 1 and $X_i = 0$ otherwise. Furthermore we assume for the expected value $E(X)$ of the fidelity with $|B_{00}\rangle$ that $E(X) = F_{\min}(\alpha) + \delta$, where $\delta > 0$ will be fixed below. The protocol will be aborted if the estimate is below $F_{\min}(\alpha) + \delta$. From (E.1) we observe that, whenever $(3\beta + 1)/4 \le F_{\min}(\alpha)$, the entanglement distillation protocol will not distill any entanglement. This implies for the quantum channel $\Phi$ that, if $\beta \le (4F_{\min}(\alpha) - 1)/3$, the parameter estimation step will abort, independent of the input provided by Eve. Thus we assume for the subsequent analysis that $\beta > (4F_{\min}(\alpha) - 1)/3$. Moreover we define $\eta = \delta/2$. Hence, by Hoeffding's inequality [34], we get for the probability of an error larger than $\eta$ in our measured estimate $X$ for the fidelity the following expression: $P(|E(X) - X| \ge \eta) \le \exp\left(-\eta^2 \sqrt{k}/2\right) =: p_{\text{pe-abort}}$.
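To get a feeling for the scaling of $p_{\text{pe-abort}}$, the bound above can be evaluated directly. The sketch below simply evaluates $\exp(-\eta^2\sqrt{k}/2)$ as stated in the text and inverts it for $k$; the function names and the numerical values of $\eta$ and of the target probability are illustrative assumptions.

```python
import math


def pe_abort_bound(eta, k):
    """Bound exp(-eta^2 * sqrt(k) / 2) on the parameter-estimation error probability."""
    return math.exp(-(eta ** 2) * math.sqrt(k) / 2)


def pairs_for_target(eta, target):
    """Real-valued k at which the bound equals `target`.

    Solves exp(-eta^2 sqrt(k) / 2) = target, i.e. sqrt(k) = 2 ln(1/target) / eta^2.
    """
    return (2 * math.log(1 / target) / eta ** 2) ** 2


eta = 0.05      # illustrative estimation accuracy
target = 1e-3   # illustrative target for the bound

k_needed = pairs_for_target(eta, target)
print("k needed:", math.ceil(k_needed))
for k in (1e4, 1e6, 1e8):
    print(f"k = {k:.0e}: bound = {pe_abort_bound(eta, k):.3e}")
```

Because only $\sqrt{k}$ of the $k$ pairs are measured, the bound decays exponentially in $\sqrt{k}$, so the number of pairs needed grows with the inverse fourth power of $\eta$.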
Thus the probability of aborting the protocol due to an error in the parameter estimation is exponentially small in the number of necessary input pairs. In order to fix $\delta$ we recognize that Alice and Bob abort the protocol whenever $(3\beta + 1)/4 < F_{\min}(\alpha) + \delta$. This is equivalent to $\delta > (3\beta - 4F_{\min}(\alpha) + 1)/4$. Inserting the definition of $\eta$ yields $\eta > (3\beta - 4F_{\min}(\alpha) + 1)/8$ and thus $p_{\text{pe-abort}} < \exp\left(-(3\beta - 4F_{\min}(\alpha) + 1)^2 \sqrt{k}/128\right)$.
Part 2: What remains to be shown is that the probability of aborting the protocol in the distillation phase is also exponentially small in the number of input pairs. For that purpose, we assume that the noise level $\alpha$ of the apparatus is such that distillation is feasible. In the following we show that we can force the probability of abort due to entanglement distillation to be exponentially small in terms of requested input pairs. We assume that Alice and Bob are left with $c2^M$ pairs after parameter estimation. Recall that the Chernoff inequality for a sequence of independent Bernoulli random variables $X_1, ..., X_n$ where $P(X_i = 1) = p$ and $d \in [0, 1]$ reads as Moreover, we observe that a basic distillation step can be modelled by a Bernoulli random variable $X_i$ where $P(X_i = 1) = p$ is the probability of succeeding (measurement outcomes coincide).