Memory-assisted quantum key distribution resilient against multiple-excitation effects

Memory-assisted quantum key distribution (MA-QKD) has recently been proposed as a technique to improve the rate-versus-distance behavior of QKD systems by using existing, or nearly-achievable, quantum technologies. The promise is that MA-QKD would require less demanding quantum memories than the ones needed for probabilistic quantum repeaters. Nevertheless, early investigations suggest that, in order to beat the conventional no-memory QKD schemes, the quantum memories used in the MA-QKD protocols must have high bandwidth-storage products and short interaction times. Among different types of quantum memories, ensemble-based memories offer some of the required specifications, but they typically suffer from multiple excitation effects. To avoid the latter issue, in this paper, we propose two new variants of MA-QKD both relying on single-photon sources (SPSs) for entangling purposes. One is based on known techniques for entanglement distribution in quantum repeaters. This scheme turns out to offer no advantage even if one uses ideal SPSs. By finding the root cause of the problem, we then propose another setup, which can outperform single no-QM setups even if we allow for some imperfections in our SPSs. For such a scheme, we compare the key rate for different types of ensemble-based memories and show that certain classes of atomic ensembles can improve the rate-versus-distance behavior.

FIG. 1. MA-MDI-QKD schemes with (a) heralding and (b) non-heralding QMs. In (a), we assume that, using certain mechanisms, the photon transmitted by the user can be written into the QM and the memory can herald its successful loading [10]. In (b), the dual-rail configuration for ensemble-based QMs is shown. Here, in each round, one entangles QMs $A_1$ and $A_2$, and, similarly, $B_1$ and $B_2$, with two optical modes in the vacuum or single-photon state. At the transmitters, users encode their bits using phase-encoded BB84 as explained in [13]. The BSM is performed using two single-photon detectors and a 50:50 beam splitter on each rail; see the BSM box for memory $A_1$. All other BSM boxes in (b) and (c) are the same. A click on only one detector would herald success for the corresponding BSM. Once both BSMs on one side are successful, we assume that the user's state has been teleported to the corresponding QMs. One then continues with loading the other two QMs, and, once done, one proceeds to perform the middle BSMs. (c) MA-MDI-QKD with EPR sources. At each round, one generates an entangled state in the form $|\psi_{\mathrm{entg}}\rangle_{AP}$, uses half of it to do the side BSM, and, if successful, attempts to store the other half in the QMs. Note that the dual-rail configuration in (b) and (c) is for illustration purposes only. In practice, one can use the equivalent single-rail time-bin encoding techniques.

... makes the implementation of the system easier, but also has an additional operational advantage: now, the repetition rate of the protocol is not determined by the distance, or the transmission delay, between the end users. Instead, one can in principle run the protocol as fast as the QMs and optical sources allow, without the need to wait for classical signals to acknowledge the success of entanglement distribution. If one employs QMs that feature short light-matter interaction times, then one may improve the total key generation rate per unit of time. One still, however, requires the storage of the photon in the QM to be heralded. Direct heralding mechanisms for writing photons into QMs are often slow, which is why the authors of Ref. [10] suggested using the teleportation idea. That is, by first entangling a photon with the QM, and performing a side BSM on this photon and the photon sent by the user, one can indirectly herald the transfer of the user's state to the corresponding QM.
One of the first investigations [12] of the above technique utilized atomic-ensemble based QMs in conjunction with a heralding scheme based on off-resonant Raman interactions [4].
By using such a scheme [4] for interaction between weak pump signals and atomic ensembles, one can generate states with dominant terms in the form (neglecting normalization factors throughout this section) $|0\rangle_P|0\rangle_A + \sqrt{p_c}\,|1\rangle_P|1\rangle_A$, where $|0\rangle_P$ and $|1\rangle_P$ are, respectively, vacuum and single-photon states, $|0\rangle_A$ represents an ensemble with all atoms in their ground states, and $|1\rangle_A$ is an ensemble with only one atom, randomly, in a metastable excited state, while the rest are in the ground state. Using two such states, see Fig. 1(b), plus postselection succeeding with a rate proportional to $p_c$, one can then end up with an entangled state between two ensembles $A_1$ and $A_2$, and their corresponding photonic modes $P_1$ and $P_2$, in the form of $|\psi_{\mathrm{entg}}\rangle_{AP} = |0\rangle_{P_1}|0\rangle_{A_1}|1\rangle_{P_2}|1\rangle_{A_2} + |1\rangle_{P_1}|1\rangle_{A_1}|0\rangle_{P_2}|0\rangle_{A_2}$, provided that $p_c$, the excitation probability, is much lower than one. The setup in Fig. 1(b) was investigated in [12], and it turned out that primarily the $|1\rangle_{P_1}|1\rangle_{A_1}|1\rangle_{P_2}|1\rangle_{A_2}$ state, which would be generated with probability $p_c^2$, could result in such an amount of error that it would prevent this system from outperforming no-QM systems. We refer to this issue as the two excited QM (TEQM) problem. Note that reducing $p_c$ would also reduce the success rate of the post-selection mechanism, and, on balance, would not result in an overall rate advantage.
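The origin of the $p_c$ and $p_c^2$ weights quoted above can be seen by expanding the product of two copies of the unnormalized single-ensemble state. The following is only a sketch under the same convention of neglected normalization; it is not taken from Ref. [12].

```latex
\begin{align}
&\left(|0\rangle_{P_1}|0\rangle_{A_1} + \sqrt{p_c}\,|1\rangle_{P_1}|1\rangle_{A_1}\right)
 \otimes
 \left(|0\rangle_{P_2}|0\rangle_{A_2} + \sqrt{p_c}\,|1\rangle_{P_2}|1\rangle_{A_2}\right) \nonumber\\
&\quad = |0\rangle_{P_1}|0\rangle_{A_1}|0\rangle_{P_2}|0\rangle_{A_2}
 + \sqrt{p_c}\left(|0\rangle_{P_1}|0\rangle_{A_1}|1\rangle_{P_2}|1\rangle_{A_2}
 + |1\rangle_{P_1}|1\rangle_{A_1}|0\rangle_{P_2}|0\rangle_{A_2}\right) \nonumber\\
&\qquad + p_c\,|1\rangle_{P_1}|1\rangle_{A_1}|1\rangle_{P_2}|1\rangle_{A_2}.
\end{align}
```

The bracketed term, with amplitude $\sqrt{p_c}$, is the desired state $|\psi_{\mathrm{entg}}\rangle_{AP}$ and therefore carries a weight proportional to $p_c$, whereas the doubly excited term has amplitude $p_c$ and weight $p_c^2$. Lowering $p_c$ suppresses the doubly excited term relative to the desired one, but lowers the weight of the desired term as well, which is the trade-off noted in the text.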
There are several solutions to the TEQM problem. First, one may consider only quasi-single-atom QMs, such as nitrogen-vacancy (NV) centers in diamond, as proposed in Ref. [14]. In order to obtain a significant improvement in the key rate, however, the NV centers must be embedded into microcavities [14]. While it is shown that the required cavity cooperativity is not necessarily high, their entangling protocol requires an appropriate SPS to entangle a photon with the electron spin of the NV center [14], a combination that has yet to be demonstrated. Another remedy to TEQM, proposed in Ref. [12], is to use nearly ideal entangled-photon (EPR) sources for creating the initial QM-photon entanglement; see Fig. 1(c). The idea is that, if one has an EPR source that ideally generates only one pair of photons per trigger, then one can efficiently store one of these photons into a QM without concerns about multiple-excitation issues. In Ref. [12], the authors show that conventional EPR sources relying on parametric down-conversion would not solve the problem, but suggest using instead quantum-dot based EPR sources, which have been shown to have very low second-order coherence [15]. Similarly, this solution would benefit quantum repeater implementations [16]. The other benefit of the EPR-based approach is that one only needs to write into QMs if the corresponding side-BSM is successful. We refer to this technique as "delayed writing", which further reduces the requirements on the access times of QMs as they do not need to be initialized in every round.
Our proposed solutions consider SPSs as a replacement for EPR sources for implementing the above ideas. SPSs are at a more advanced stage of development than EPR sources, which opens up the possibility of a proof-of-principle experiment to be accomplished in the short term. For instance, SPSs relying on quantum dot structures [17] can offer high-rate and low-noise performance that is suitable for the systems we propose here.
Specifically, we propose two setups. The first, which resembles a noiseless linear amplifier (NLA) [18], involves an entangling procedure that is based on the method described in Refs. [19,20]. Simply put, the user's photons are passed through an NLA before being stored in the QMs. For this setup, we optimize over the NLA parameters to maximize the key rate, finding that, while the TEQM issue is resolved, the rate scaling does not improve.
Our second, improved solution consists of a "quasi-EPR" source relying on two SPSs. This setup provides the required entanglement after post-selection (via the side BSMs), solves the TEQM issue, improves the rate, and is compatible with some non-ideal SPSs [21].
The key rate of our system not only depends on the entangling procedure but also on the characteristics of the QMs that are employed [12]. Thus, we calculate the secret key rate of the proposed MA-MDI-QKD protocols considering different types of ensemble-based QMs.
The latter may differ in coherence time, efficiency, bandwidth and access time, reading and writing procedures, and operating wavelength, for example. In particular, we consider a selection of state-of-the-art memories based on warm vapors at room temperature [22-25], cold ensembles of rubidium atoms [26-28], and cryogenically cooled rare-earth-ion-doped crystals (REICs) [29-33]. In the latter case, we utilize the property of ensemble memories to store multiple excitations (each in distinguishable modes) to our advantage, in that we account for the possibility of spectral multi-mode storage [34].
We consider all major sources of errors in each MA-MDI-QKD setup, such as channel loss, efficiency and background noise due to photodetection and frequency conversion, as well as the coherence time and the writing-reading efficiencies of QMs. Based on our calculations, we find existing and near-future candidates of MA-MDI-QKD systems that offer better performance than existing QKD links.
The paper is structured as follows. In Sec. II we describe our two proposed setups. In Sec. III, we study the performance of these setups by calculating their secret key rates. In Sec. IV, we present our numerical results by comparing the key rate with the fundamental rate bounds for the distribution of secure keys over a lossy channel found in Ref. [35]. We also determine the secret key rate of the quasi-EPR-based setup for different types of ensemble-based QMs and we compare the rate with that of the no-memory systems. In Sec. V, we draw our conclusions.

II. SYSTEM DESCRIPTION
We describe our two SPS-based MA-MDI-QKD setups in this section. The setups we present here both use the EPR source structure in Fig. 1(c) except that, instead of the actual EPR source, we use SPS-based modules that have similar functionality. We run the protocol with a repetition rate $R_S = 1/T$, where $T$ is the repetition period, which is mainly specified by the SPS. In both cases we assume that the delayed writing procedure is used. That is, one attempts to write the photons into the QMs $A_1$ and $A_2$ only if both corresponding side BSMs are successful, and does similarly for $B_1$ and $B_2$. The delay required for this step can be on the order of nanoseconds, corresponding to the measurement time at the BSMs [36], and should not incur much additional loss or complexity. The benefit is that the potentially time-consuming initialization of the QMs need only be done once the memories have been loaded and read, instead of in every round. The loading/reading step occurs at a much lower rate, especially at long distances. In the following, we first explain our NLA- and quasi-EPR-based setups, and then give a precise description of all components used in these systems.

FIG. 2 (caption fragment). Users' photons will be effectively amplified before being stored in the QMs. A successful loading would be declared if the two NLAs corresponding to each user are both successful, that is, their corresponding BSM modules have a single click. We use the same BSM modules as in Fig. 1(b). The FC box represents the frequency converter.

A. NLA-based MA-MDI-QKD
The key requirement of the setups of Figs. 1(b) and (c) is to generate entanglement between photons and QMs. Our aim is to achieve the same objective by using SPSs. A solution that may be envisioned utilizes entanglement distribution techniques that rely on SPSs. Not surprisingly, there is a class of probabilistic quantum repeaters that have such a property. In the scheme proposed in Ref. [19], the authors use an SPS and an imbalanced beam splitter to create spin-photon entanglement. They interfere two such photonic modes at a BSM to entangle the corresponding QMs. In Fig. 2, we have used a similar idea to create our desired entangled state in the form $|\psi_{\mathrm{entg}}\rangle_{AP}$, for $A_1$ and $A_2$ and their corresponding photonic modes $P_1$ and $P_2$. The same can be done for $B_1$ and $B_2$ in Fig. 2. Here, we use a beam splitter with reflectivity $\eta$ to split a single photon into two paths: one would be stored into a QM, and the other interferes at a BSM with the signal sent by the user.
This structure, as shown for memory $A_1$ in Fig. 2, then resembles an NLA module based on quantum scissors [18]. We consider the QMs to be loaded with the user's transmitted state (within a known rotation) if both NLAs on one side are successful, meaning that each of their BSM modules generates exactly one click. As shown below, the above NLA structure can provide us with the required entanglement.
Suppose the writing efficiency into QMs is unity, and, without loss of generality, let us focus on QMs $A_1$ and $A_2$. Assuming ideal on-demand SPSs, the joint state of the QMs and their corresponding optical modes $P_1$ and $P_2$ is given by Eq. (1), where the first term, in brackets, is the desired entangled state. After the postselection by the two BSMs, which requires exactly one click in each module, the last term in (1) would be ideally removed. This last term is what could cause the TEQM problem. Therefore, this scheme resolves the TEQM issue. There is, however, a remaining term in the form $|11\rangle_{P_1P_2}|00\rangle_{A_1A_2}$, which is unwanted but can result in successful BSMs with a probability proportional to $\eta^2$, whether or not the transmitted photons have survived the path loss.
That is, because of one background photon in each leg, in the asymptotic limit, when the distance $L$ is large, the success rate of the side BSMs is nonzero. Let us give a name to this issue and call it the "two loss-independent click" (TLIC) problem. We will show in Sec. III how this problem prevents us from getting any rate advantage over no-QM setups. The scheme of Ref. [12] as shown in Fig. 1(b) also suffers from the TLIC issue. Note again that reducing $\eta$ alone may not solve the problem, as our desired term occurs with a probability proportional to $\eta$. In principle, dark counts could also cause the TLIC problem, but we may ignore them for now if their contribution is small in comparison with other sources of background photons.
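The structure of the joint state discussed above for the NLA-based setup can be sketched as follows. This is a reconstruction under stated assumptions rather than a quotation of the paper's Eq. (1): we assume that each ideal SPS emits exactly one photon, that the beam splitter sends the photon to the BSM (mode $P_i$) with amplitude $\sqrt{\eta}$ and to the QM (mode $A_i$) with amplitude $\sqrt{1-\eta}$, that writing is perfect, and that all amplitudes can be taken real.

```latex
\begin{align}
|1\rangle \;&\longrightarrow\;
 \sqrt{\eta}\,|1\rangle_{P_i}|0\rangle_{A_i} + \sqrt{1-\eta}\,|0\rangle_{P_i}|1\rangle_{A_i},
 \qquad i = 1, 2, \\
|\Psi\rangle_{AP} \;&=\;
 \sqrt{\eta(1-\eta)}\left[\,|10\rangle_{P_1P_2}|01\rangle_{A_1A_2}
 + |01\rangle_{P_1P_2}|10\rangle_{A_1A_2}\right] \nonumber\\
 &\quad + \eta\,|11\rangle_{P_1P_2}|00\rangle_{A_1A_2}
 + (1-\eta)\,|00\rangle_{P_1P_2}|11\rangle_{A_1A_2}.
\end{align}
```

Under these assumptions the bracketed term occurs with probability proportional to $\eta(1-\eta)$, the $|11\rangle_{P_1P_2}|00\rangle_{A_1A_2}$ term with probability $\eta^2$, and the doubly excited memory term with probability $(1-\eta)^2$, in line with the weights quoted in the text. The last term places no photon in the $P$ modes, so it cannot by itself produce a click in both side BSMs, whereas the $\eta^2$ term places one photon in each leg regardless of channel loss.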
We comment on the effect of dark counts later in this section and fully account for it in our key rate analysis. Next, we examine another solution that resolves the TLIC problem as well as the TEQM one.

B. Quasi-EPR-based MA-MDI-QKD
Our proposed quasi-EPR module is shown in Fig. 3(b), which may be built using integrated optics. It produces the desired entangled states by interfering two single photons at different balanced beam splitters. It also generates additional spurious terms, which we aim to select out after successful side BSMs. Analyzing the circuit in Fig. 3(b), and using ideal $A_1$ and $A_2$ memories, the joint state of the $A$ and $P$ modes can be written as in Eq. (2), where, again, the first term, in brackets, on the right-hand side, represents the desired entangled state. The last term represents the no-photon term; hence, apart from negligible dark-count effects, it cannot result in successful side BSMs, and it would be selected out. The term in the middle could result in successful BSMs, provided that the user's photon survives the path loss and/or because of dark counts. But, then, the QMs are both in their ground states, and, except for a probability proportional to the dark count rate, they will not produce successful results at the middle BSM, and will be selected out at that stage.
Specifically, by proper use of the quantum interference effect in Fig. 3(b), we have managed to group the unwanted states into terms that ensure both photons appear at the same output port. This creates only one background-induced click, making it easier to remove them by postselection. In the case of ideal SPSs, the above solution does resolve both the TEQM and TLIC problems. Even in the case of the second term, in order to get two successful side BSMs, one needs to have a user's photon arriving at the receiver, the probability of which goes to zero at large distances. All of the previous discussion is based on the assumption that dark counts are negligible. The situation would be different if we had non-ideal SPSs with non-zero probabilities for emitting more than one photon, or substantial dark count or background noise. We will consider these scenarios later in our paper.
We made some idealistic assumptions in explaining how our proposed entanglement generation processes work. In the next section, we properly model major non-idealities in the system from which a realistic account of the key rate performance can be obtained.

C. Device modeling
We model different components of our system as follows.
BB84 encoders: We use phase encoding in the dual rail setup, or, equivalently, and if allowed by the QM setups, time-bin encoding in a single-rail setup, in bases Z and X [13].
We assume that efficient QKD protocols are in use [37], where basis Z is chosen most often.
We also assume that both users employ ideal SPSs for their BB84 encoding. In principle, they can use the decoy-state version of BB84, but, for the sake of our comparison, it would be sufficient to assume that both QM-assisted and no-QM systems use single photons to encode their bits. The multi-photon terms in a decoy-state protocol can be characterized by statistical analysis and they will not impose a change in the rate scaling [13]. The pulse duration is denoted by $\tau_p$, and it is assumed to be equal to $T$ in our numerical analysis.
Channel: We denote the total channel length by $L$, and its attenuation length by $L_{\mathrm{att}}$.
That is, the total channel transmissivity will be given by $\exp(-L/L_{\mathrm{att}})$. We assume that the channel does not impose any phase or polarization distortions. In practice, such effects can be compensated by classical-feedback mechanisms. The error in such compensating mechanisms can then be analytically modeled via misalignment parameters. In this work, we neglect such errors as they are not major error-bearing components of our system, and they are common to both QM-assisted and no-QM systems. One can use the methods proposed in Refs. [10,12] to account for such imperfections.
Single-photon detectors (SPDs): All our employed SPDs are assumed to be non-resolving detectors with efficiency $\eta_D$. The dark count rate is denoted by $\gamma_{\mathrm{dc}}$, which results in a dark count probability $d_c = \gamma_{\mathrm{dc}}\tau_p$ per pulse. Here we assume that photodetectors are gated with an opening time that is identical to the pulse duration. The time that it takes to detect a photon and prepare the detector for the next measurement is denoted by $\tau_M$. Using self-differencing techniques [36], $\tau_M$ can be on the order of nanoseconds.
Quantum memory: We consider several characteristics of QMs pertinent to our setups. The writing efficiency into QMs, i.e., the probability of successfully transferring the qubit state encoded into a single photon to the QM, is denoted by $\eta_w$. The probability of successfully reading the QM, i.e., transferring the qubit state encoded into a QM (back) onto a single photon, is denoted by $\eta_r$. The latter will be affected by amplitude decay with time constant $T_r$. The reading efficiency at time $t$ after the loading is then assumed to be given by $\eta_r = \eta_{r0}\exp(-t/T_r)$ [10,12], where $\eta_{r0}$ is the reading efficiency right after the loading.
The exponential decay is not necessarily the case for all memories studied in this work. For instance, the decay is Gaussian for AFC-based QMs that do not compensate for dephasing induced by ground-level inhomogeneous broadening. In the regime of interest, where the relevant system time parameters are shorter than $T_r$, the exponential decay assumption will then be a pessimistic one for such QMs and it would not alter the overall conclusions made in our work. We also denote the required time to initialize the memory by $\tau_{\mathrm{init}}$ and the time needed to interact with single photons by $\tau_{\mathrm{int}}$.
Single-photon source: We assume that the SPSs used in the middle site of Figs. 2 and 3 are identical but probabilistic. That is, upon trigger, there is a likelihood $\eta_{\mathrm{SPS}}$ that they generate a normalized state with single-photon and two-photon components, where $|2\rangle$ is the two-photon state, and $p_1\eta_{\mathrm{SPS}}$ and $p_2\eta_{\mathrm{SPS}}$ are, respectively, the single-photon and double-photon probabilities. For most of this paper, we assume that $p_2 = 0$. We will examine the range of values for $p_2$ that are tolerable for our setups.
Frequency converter: Given that many QMs do not operate at telecom wavelengths, we may need to convert the frequency of some of the generated photons to match that of the QM or the telecom channel used. We consider three scenarios: (1) use SPSs that generate photons at telecom wavelengths. One may then need an upconverter right before the QMs. The advantage is that the side BSM can be done more efficiently. On the downside, however, all the errors in the upconversion will affect the QM as well; (2) generate photons that are matched to the QM, but downconvert the photons that enter the side BSM. Here, the advantage is that one can possibly use a matched SPS, in terms of the QM bandgap and its bandwidth, to maximize the writing efficiency, but one will have noisier side-BSMs in this case; and (3), which is similar to (2), but one upconverts the photons sent by the user before the side BSM. In this work, we adopt the second scenario and assume that the wavelength and the bandwidth of the SPSs match those of the QMs. In order to do side-BSMs, one may need to use a down-converter to match the wavelength of the two interfering photons [38-40]. We account for the conversion efficiency of such devices in our analysis. We also assume that the additional background Raman photons generated by the down-converter would modify the dark count of the side-BSM detectors.
In all devices, the sources of inefficiency are modeled by fictitious beam splitters with proper transmissivities.
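The three closed-form device relations stated in this subsection (channel transmissivity, dark count probability per gate, and memory reading efficiency) can be collected in a short Python sketch. The function and argument names are ours, chosen for illustration, and any numerical values passed to them are assumptions rather than parameters from this paper.

```python
import math


def channel_transmissivity(length, attenuation_length):
    """Total channel transmissivity exp(-L / L_att); both lengths in the same unit."""
    return math.exp(-length / attenuation_length)


def dark_count_probability(gamma_dc, tau_p):
    """Dark count probability per pulse, d_c = gamma_dc * tau_p.

    Assumes, as in the text, that the detector gate equals the pulse duration.
    """
    return gamma_dc * tau_p


def reading_efficiency(t, eta_r0, T_r):
    """Reading efficiency eta_r = eta_r0 * exp(-t / T_r) at time t after loading."""
    return eta_r0 * math.exp(-t / T_r)


if __name__ == "__main__":
    # Illustrative (assumed) numbers only: 1 dark count per second, 1 ns gate.
    print(dark_count_probability(gamma_dc=1.0, tau_p=1e-9))
    # Half of a 400 km link with an assumed attenuation length of 22 km.
    print(channel_transmissivity(length=200.0, attenuation_length=22.0))
```

The exponential form of `reading_efficiency` follows the pessimistic assumption adopted in the text; a Gaussian decay, as mentioned for some AFC-based memories, would need a different function.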

III. KEY RATE ANALYSIS
In this section, we find the secret key generation rate for our proposed schemes shown in Figs. 2 and 3. We assume that there is no eavesdropping and that we are only affected by device imperfections of the system as modeled in Sec. II C. For convenience, we assume that both setups are symmetric. Under these conditions, in the infinite-key setting, the secret key generation rate in the setups of Figs. 2 and 3 is lower bounded by an expression in the following quantities: $e^{\mathrm{QM}}_{11;X}$ and $e^{\mathrm{QM}}_{11;Z}$, respectively, represent the quantum bit error rate (QBER) between Alice and Bob in the X and Z basis, and $R_S Y^{\mathrm{QM}}_{11}$ is the rate at which one generates raw key bits; the index 11 means that single photons are used at BB84 encoders; $f$ denotes the inefficiency of the error correction scheme; and the error rates enter the bound through the Shannon binary entropy function [10,41].
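For concreteness, a bound of this kind can be evaluated numerically as sketched below. The functional form used here, $R_S Y_{11}[1 - h(e_{11;X}) - f\,h(e_{11;Z})]$, is the commonly used single-photon MDI-QKD expression and is an assumption on our part about the shape of the bound; the default value `f=1.16` and the function names are likewise illustrative and not taken from this paper.

```python
import math


def binary_entropy(q):
    """Shannon binary entropy h(q) in bits, with h(0) = h(1) = 0."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)


def key_rate_lower_bound(R_S, Y11, e11_X, e11_Z, f=1.16):
    """Assumed form: R_S * Y11 * [1 - h(e11_X) - f * h(e11_Z)], floored at zero.

    R_S   : repetition rate (pulses per second)
    Y11   : probability per pulse of obtaining a raw key bit
    e11_X : QBER in the X basis (taken here as the phase-error estimate)
    e11_Z : QBER in the Z basis (taken here as the error-corrected key basis)
    f     : error-correction inefficiency (f >= 1); the default is illustrative
    """
    per_pulse = Y11 * (1.0 - binary_entropy(e11_X) - f * binary_entropy(e11_Z))
    return R_S * max(per_pulse, 0.0)


if __name__ == "__main__":
    # Illustrative (assumed) inputs only.
    print(key_rate_lower_bound(R_S=1e9, Y11=1e-6, e11_X=0.01, e11_Z=0.01))
```

Flooring the result at zero reflects that no key can be extracted once the error terms exceed one; it does not model any finite-key effects, which the text excludes by working in the infinite-key setting.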
We use the techniques of Refs. [10,12] to calculate the above terms in the scenarios of interest. Without fully repeating the detailed calculations, here we just highlight the key steps in the derivation that are important in our understanding of the key-rate behavior of the setups in Figs. 2 and 3. The key idea behind calculating $Y^{\mathrm{QM}}_{11}$ is to decompose the problem into two parts: (1) how often one loads the QMs on both sides, and (2) once loaded, how often the middle BSMs succeed. Let us denote the former by $P_{\mathrm{SBSM}}$ and the latter by $P_{\mathrm{MBSM}}$. Here, $P_{\mathrm{SBSM}}$ partly depends on the probability to obtain two successful side-BSMs on one side, and partly on memory reading and writing times. Once both QMs are loaded, one has to spend a time equivalent to $\tau_r = \tau_{\mathrm{int}} + \tau_M + \tau_{\mathrm{init}}$ to obtain a measurement outcome for the middle BSM, and prepare the QMs for the next round [14]. Accounting for $\tau_w = \tau_{\mathrm{int}} + \tau_M$ to write into the QM, there is a minimum time of $\tau_w + \tau_r$ to get one raw key bit. The inverse of this parameter then sets a bound on the maximum key rate achievable from our delayed-writing schemes. At long distances, however, the challenge of ensuring both sides are loaded would take precedence; hence we would expect that $P_{\mathrm{SBSM}} \approx \frac{2}{3}\Pr(\text{successful side-BSMs on one side})$ [10]. As for $P_{\mathrm{MBSM}}$, the difficult part is to account for the decay of the QMs that may be loaded earlier. This requires us to average over the statistics of loading as has been detailed in [10,12]. The same averaging is required in the calculation of $e^{\mathrm{QM}}_{11;X}$ and $e^{\mathrm{QM}}_{11;Z}$. Note that when $T_r$ is sufficiently large, we can ignore the averaging, and we have $P_{\mathrm{MBSM}} \approx \Pr(\text{two successful middle BSMs})$.
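The factor of 2/3 in the long-distance approximation for $P_{\mathrm{SBSM}}$ can be checked with a few lines of Python. The sketch below assumes that the two sides load independently, each succeeding with the same probability `p` per round, and that the $\tau_w + \tau_r$ overhead is negligible compared with the waiting time; the function names are ours.

```python
def expected_rounds_to_load_both(p):
    """Mean number of rounds until both sides have loaded.

    Each side is modeled as an independent geometric trial with per-round
    success probability p, so the loading time is the maximum of two geometric
    variables: E[max] = E[G1] + E[G2] - E[min], where min(G1, G2) is geometric
    with success probability 1 - (1 - p)**2.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    q = 1.0 - p
    return 2.0 / p - 1.0 / (1.0 - q * q)


def loading_probability_per_round(p):
    """Effective per-round probability that both sides are loaded."""
    return 1.0 / expected_rounds_to_load_both(p)


if __name__ == "__main__":
    for p in (1e-1, 1e-2, 1e-4):
        # The ratio approaches 2/3 as p becomes small (long distances).
        print(p, loading_probability_per_round(p) / p)
```

In closed form the ratio equals $(2-p)/(3-2p)$, which tends to $2/3$ as $p \to 0$, matching the approximation quoted from Ref. [10].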
All above terms are found by calculating the relevant output density matrices in the setups of interest. We analytically obtain the pre-measurement state of the system by applying a series of transformations on the input density matrix considering channel transit, the entangling circuits, and the BSM modules. After applying relevant measurement operators, one can then find the post-measurement states and the relevant probability terms. In our case, this has been implemented using a generic Maple code developed for such setups.
Next, we examine the key rate scaling of the NLA-based and the quasi-EPR MA-QKD setups.
A. Key rate scaling: NLA-based setup
In this section, we investigate how the secret key rate of the scheme shown in Fig. 2 behaves at long distances. Here we ignore all inefficiencies except for the channel loss for simplicity. We also assume that $T_r$ is sufficiently large. Under these conditions, from Eq. (1), there are two major terms that correspond to successful side-BSMs. The first term in brackets on the right-hand side of Eq. (1), corresponding to the desired entangled state, would result in successful side-BSMs provided that the user's photon has survived the path loss. This happens with a probability proportional to $(1-\eta)\eta\exp[-L/(2L_{\mathrm{att}})]$, for which the QMs are left in the desired state. The other term that could result in successful side-BSMs is $|11\rangle_{P_1P_2}|00\rangle_{A_1A_2}$, which succeeds with probability $\eta^2$ and would leave the QMs in their ground state. At long distances, the post-measurement state of the QMs would be roughly given by a state $\rho^{(\mathrm{PM})}$ in which $\rho^{(\mathrm{TX})}_K$ represents the transmitted state (up to a known rotation) by user $K = A, B$.
Starting with $\rho^{(\mathrm{PM})}_{B_1B_2}$, then, for the middle BSM, one obtains an expression for $P_{\mathrm{MBSM}}$. In the regime of operation where $\eta \gg (1-\eta)e^{-L/(2L_{\mathrm{att}})} \gg d_c$, we then obtain a key rate that scales with the loss in the entire channel, as is the case for a conventional no-QM system, and one should not expect any benefit from the NLA-based setup. As mentioned before, the distance-independent terms in Eqs. (6) and (7) ...

B. Key rate scaling: quasi-EPR setup
Using a similar analysis as in the previous section, we calculate how the secret key rate scales for the quasi-EPR setup of Fig. 3. In this case, for an ideal SPS with $p_1 = 1$ and $p_2 = 0$ and ignoring dark counts, from Eq. (2), one has $P_{\mathrm{SBSM}} \propto \exp[-L/(2L_{\mathrm{att}})]$, as none of the terms will generate detection events corresponding to both BSMs in the cases whereby the users' photons are lost. This implies that $\rho^{(\mathrm{PM})}$ is a mixture weighted by some constants $\alpha$ and $\beta$, adding up to one. $P_{\mathrm{MBSM}}$ would then be given by an expression which, when $d_c \ll e^{-L/(2L_{\mathrm{att}})}$, results in Eq. (11). From Eq. (11), we infer that the key rate for the setup of Fig. 3 scales similarly to that of a single-node quantum repeater system. Although we have not yet accounted for the errors in this rough analysis, the quasi-EPR setup promises to outperform the no-memory schemes. We examine this conjecture in the next section.
The above conclusion relies on the assumption that $p_2 = 0$ and results in $P_{\mathrm{SBSM}}$ being proportional to the channel loss. If the probability to obtain the two-photon terms of the SPS is non-zero, then the distance-independent terms in $P_{\mathrm{SBSM}}$ are on the order of $p_2$, similar to $\eta^2$ in Eq. (7). Such terms would result in the TLIC problem as before once $p_2/p_1$ is comparable to $e^{-L/(2L_{\mathrm{att}})}$. The same holds if $d_c$ is on the order of $e^{-L/(2L_{\mathrm{att}})}$, which could happen if the frequency converters generate a large background noise. In the following section, we explore the requirements on the employed devices in practical setups.
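The two scaling behaviors discussed in this section can be illustrated with a deliberately crude toy model in Python. It is not the paper's analysis: it ignores dark counts, errors, and all inefficiencies other than channel loss, assumes that the middle BSM succeeds only when both sides hold the transmitted state, treats the quasi-EPR middle-BSM probability as a constant, and uses an assumed attenuation length of 22 km. The function names and the choice `eta=0.1` are ours.

```python
import math

L_ATT_KM = 22.0  # assumed attenuation length (roughly 0.2 dB/km); not from the text


def half_channel_transmissivity(L_km):
    """Transmissivity of half the link, exp(-L / (2 L_att))."""
    return math.exp(-L_km / (2.0 * L_ATT_KM))


def nla_toy_rate(L_km, eta=0.1):
    """Toy raw-key probability per round for the NLA-based setup."""
    t = half_channel_transmissivity(L_km)
    good = eta * (1.0 - eta) * t   # desired term: needs the user's photon
    bad = eta ** 2                 # loss-independent clicks (TLIC term)
    p_sbsm = good + bad            # side BSMs succeed in either case
    frac_good = good / p_sbsm      # share of loaded QMs in the desired state
    p_mbsm = frac_good ** 2        # both sides must hold the desired state
    return p_sbsm * p_mbsm


def quasi_epr_toy_rate(L_km):
    """Toy raw-key probability per round for the quasi-EPR setup."""
    p_sbsm = half_channel_transmissivity(L_km)  # needs the user's photon
    p_mbsm = 1.0                                # assumed distance independent
    return p_sbsm * p_mbsm


def scaling_exponent(rate_fn, L1_km=400.0, L2_km=600.0):
    """Exponent s such that rate ~ exp(-s * L / L_att) between L1 and L2."""
    slope = (math.log(rate_fn(L2_km)) - math.log(rate_fn(L1_km))) / (L2_km - L1_km)
    return -slope * L_ATT_KM


if __name__ == "__main__":
    print("NLA exponent:", scaling_exponent(nla_toy_rate))
    print("quasi-EPR exponent:", scaling_exponent(quasi_epr_toy_rate))
```

Within this toy model the NLA-based rate falls off with an exponent close to 1, that is, with the loss of the entire channel, while the quasi-EPR rate falls off with an exponent of 1/2, that is, with the loss of half the channel. Adding a distance-independent term to the quasi-EPR `p_sbsm`, as would arise from a non-zero $p_2$ or large $d_c$, would push its exponent back toward 1 at long distances.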

IV. NUMERICAL RESULTS
In this section we calculate the secret key rate that can be achieved using the schemes illustrated in Figs. 2 and 3. Specifically, we first calculate the secret key rate with the assumption that ideal QMs, meaning those that feature no limitations in performance, are employed. We compare the secret key rate per pulse of both schemes to the maximum rate achievable over a lossy channel, which we refer to as the PLOB bound [35]. We find that the quasi-EPR scheme outperforms the bound, while the NLA scheme, due to the TLIC problem, fails to surpass it. Next we calculate the secret key rate, in bits per second, corresponding to the quasi-EPR scheme in conjunction with experimentally measured properties of state-of-the-art warm and cold atomic ensembles as well as solid-state QMs based on REICs. For comparison we also plot the secret key rate for a no-memory MDI-QKD implementation driven at 1 GHz repetition rate; we use the "no-memory" label to refer to this system. We find that, under certain assumptions, some cold atom memories can surpass the no-memory system due to their favorable coherence properties.
We also calculate the secret key rate of the quasi-EPR scheme with the assumption that we employ QMs that feature properties with modest improvements over the state-of-the-art memories. We refer to these as "near-future" QMs, and find that almost all near-future memories can outperform the no-memory system. We conclude with a discussion around other possible sources of imperfection, such as multi-photon events and background noise, and explore how these impact the quasi-EPR scheme.

FIG. 4 (caption fragment). The only sources of nonideality are listed in Table I for a detection gate of 1 ns. We compare the rate with the PLOB bound obtained in [35].

A. Ideal quantum memories
Let us first consider the case of ideal QMs. Specifically, these memories feature unity reading and writing efficiencies and fidelities, infinitely long coherence times, unlimited bandwidth, and zero interaction and initialization times. We calculate the secret key rate for the NLA and quasi-EPR schemes and that provided by the PLOB bound. The results are shown in Fig. 4, where we have used the values in Table I. Due to its improved rate-versus-distance scaling, the quasi-EPR scheme can, however, beat the PLOB bound at distances roughly greater than 150 km. Note that this scheme improves the key rate by nearly 5 orders of magnitude over the PLOB bound at a distance of 700 km. Based on this performance, in the following sections, we only focus on the quasi-EPR scheme for practical and near-future QMs.
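As a reference for the comparison above, the PLOB bound for a pure-loss channel of transmissivity $\eta_{\mathrm{ch}}$ is $-\log_2(1-\eta_{\mathrm{ch}})$ secret bits per channel use. The Python sketch below evaluates it as a function of distance using the channel model $\eta_{\mathrm{ch}} = \exp(-L/L_{\mathrm{att}})$ from Sec. II C; the attenuation length of 22 km in the usage example and the function names are assumptions for illustration, not values taken from Table I.

```python
import math


def plob_bound(eta_ch):
    """PLOB bound -log2(1 - eta_ch), in secret bits per channel use."""
    if not 0.0 <= eta_ch < 1.0:
        raise ValueError("transmissivity must lie in [0, 1)")
    # log1p keeps precision when eta_ch is very small (long distances).
    return -math.log1p(-eta_ch) / math.log(2.0)


def plob_bound_vs_distance(L_km, L_att_km):
    """PLOB bound for a lossy channel of length L_km > 0 and attenuation length L_att_km."""
    return plob_bound(math.exp(-L_km / L_att_km))


if __name__ == "__main__":
    for L_km in (150.0, 400.0, 700.0):
        # 22 km is an assumed attenuation length, used only for illustration.
        print(L_km, plob_bound_vs_distance(L_km, L_att_km=22.0))
```

At long distances the bound approaches $\eta_{\mathrm{ch}}/\ln 2 \approx 1.44\,\eta_{\mathrm{ch}}$, so it decays as $\exp(-L/L_{\mathrm{att}})$, whereas a rate proportional to $\exp[-L/(2L_{\mathrm{att}})]$ eventually overtakes it.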

B. State-of-the-art quantum memories
Here we evaluate the performance of the quasi-EPR scheme using a selection of state-of-the-art ensemble-based memories. There are a variety of systems that have been utilized for optical QMs; see Ref. [8] for a recent overview. We consider ensemble-based memories due to their strong light-matter coupling and, in several cases, the possibility of long coherence times (up to seconds [26]) and high bandwidths (up to several GHz [23]). Furthermore, they offer the possibility of multi-mode storage [7,8]. By multi-mode we are referring to memories that can simultaneously store more than one qubit during a single storage event by encoding many qubits, each into a different mode. This feature has been exploited to enhance secret key generation rates in certain quantum repeater schemes [6,7,34]. It is important to stress that the definition of multi-mode storage differs from our reference to (the detrimental) storage of multiple excitations. The former involves many excitations, in which each individual excitation occupies a single distinguishable mode (or a pair as required for a qubit), while the latter concerns many excitations that occupy a single mode, so that the individual excitations cannot be distinguished. Motivated by their impressive, and continually improving, experimental record, we specifically consider warm vapor (Cs and Rb atomic gas) and cold atom (Rb atoms in a magneto-optical trap or atomic lattice) systems [8] that rely on the so-called Raman QM protocol [44] as well as cryogenically cooled REICs that utilize AFCs [45].
Raman memory schemes [44] rely on three energy levels, usually a Λ-level system that features long-lived ground levels. A strong control pulse maps a propagating off-resonant photon onto the ground level. This is called the "writing" step and at this point the photon is "stored". To retrieve the excitation, a control pulse is applied again, in which the excitation is mapped back onto a propagating photon. This is referred to as the "reading" step. Note that the Raman protocol has been applied to photons that encode qubits with respect to various degrees of freedom (see Refs. [8,27] and references therein). Along with the convenience of operation at room temperature, warm vapor Raman QMs feature the possibility to efficiently store GHz-bandwidth photons with microsecond-long coherence times (with up to 100 µs being possible [24]) [8,22,23]. Cold atoms reduce the impact of collision- or motion-induced decoherence, and, if magnetic-field-insensitive states are used, they offer very long coherence times reaching hundreds of ms and possibly more [8,26,28].
In a similar way, on-demand AFC QMs also require a Λ-level system except here an optical inhomogeneously-broadened transition is tailored into a series of narrow absorption lines (the "comb"), each of which is detuned from the others by an integer multiple of a fixed detuning [45]. A photon is absorbed by the comb, creating a delocalized atomic excitation and, using an optical control pulse, the excitation is reversibly mapped onto a long-lived spin level. The photon is emitted due to a quantum interference effect between the absorption lines of the comb. Ensembles of rare-earth ions are particularly suited for AFC QMs due to the long coherence times of both the optical (hundreds of microseconds [46,47]) and spin (up to milliseconds [46,47] or even seconds [48]) transitions in conjunction with level structures that allow for efficient AFCs over ∼MHz bandwidths [8,46,47].
In the following, we study the performance of certain representatives from each group of memories. In this subsection and the next, we focus mainly on the memory characteristics and neglect two-photon emissions from the source (i.e., p_2 = 0), or other issues that may arise in the photonic part of the system. We address the latter issues in Sec. IV D. We also assume that memories feature no additional noise for the purpose of our simulations except for the decoherence effect and coupling issues already accounted for. This assumption is supported by several recent rare-earth AFC [29,33], as well as cold and warm Raman experiments [23,25,27] that have shown storage of non-classical light. We have ensured that the repetition rate of each QM does not exceed the corresponding memory bandwidth. Furthermore, the choice of τ_p = T would minimize any inefficiency due to bandwidth mismatch between the source and the QM. In practice, one may need to choose τ_p to be shorter than T, in which case its effect on the coupling efficiency must be considered. For all memories considered, we also assume that τ_init = 0 given that these QMs would ideally go back to the desired initial state after being read out. In practice, memory re-initialization may be occasionally needed to avoid the spread of error. We assume that the frequency at which the initialization is needed is sufficiently low that it would not affect our key rate analysis.
of efficiency and coherence time as well as low noise, while Ref. [22] uses an anti-resonance of a Fabry-Perot cavity to suppress four-wave-mixing-induced noise that is present in Ref. [23]. The limitations of coherence time in these demonstrations are largely due to imperfect magnetic shielding, allowing magnetic-field-induced dephasing. The experiment of Ref. [24] employs exceptional magnetic shielding, but does not feature storage of non-classical light.
Reference [25] uses a ladder energy-level system to achieve storage in an excited level, which opens the possibility of storage of light pulses of less than ∼100 ps duration. The storage time is, however, restricted to ∼100 ns. Figure 5 shows the secret key rate of the quasi-EPR scheme using the memories listed in Table I and the warm vapor memories featured in Table II.
time is so low that the entire curve is parallel to that of the no-memory curve, indicating the same rate-versus-distance scaling. In Section C, we show that the no-memory bound may be overcome with some minor improvement in these QMs.
Cold atoms: We consider the three experiments described in Refs. [26][27][28]. Reference [27] utilizes 85 Rb in a magneto-optical trap while Refs. [26,28] feature atomic lattices of 87 Rb. The coherence times of the magneto-optical trap implementations are limited by, among many factors, atomic diffusion in comparison to those of the atomic lattice [26][27][28]. The exceptional coherence time of Ref. [26] is due to compensation of light shifts, the insensitivity of the spin states to magnetic field fluctuations, and use of dynamical decoupling. We note that even though Ref. [26] did not explicitly show storage of nonclassical light, the experiment of Ref. [49] importantly shows that no noise is introduced by dynamical decoupling. Our simulations, which are presented in Fig. 6, show that the atomic lattice experiments of Refs. [26,28] can allow rates that surpass the no-memory bound.
Both memories have such long coherence times that, in both cases, the maximum security distance has been dictated by the dark count noise, and not the memory decoherence.
However, these memories are only useful if a low secret key rate is acceptable. Towards the possibility of higher rates and shorter-distance operation, we consider small improvements to memory properties (e.g. bandwidth) in Section C. Note that the experiments of [26] and [28] employ off-resonant Raman scattering to achieve memory-photon entanglement and have not explicitly performed storage of an externally-provided photon as is required for the quasi-EPR scheme. We assume that the QM parameters derived from these experiments may be translated to a Raman memory demonstration (as is achieved in [27]). We also mention that there is a theoretical proposal [50] for Raman memory using an optical lattice.
Rare-earth-ion-doped crystals: We consider the five AFC experiments described in Refs. [29][30][31][32][33]. Europium-doped Y 2 SiO 5 crystals are employed in the investigations of Refs. [29,30] while the well-studied Pr:Y 2 SiO 5 is featured in Refs. [32,33]. On-demand storage at the single-photon level is shown in Ref. [29], in which dynamical decoupling techniques are also used to overcome dephasing due to spin inhomogeneous broadening. Reference [30] utilizes a low-finesse cavity to show (up to 50%) efficient and on-demand storage of strong pulses. Efficient storage using a low-finesse cavity is achieved in [31], while on-demand storage of qubits and heralded single photons are shown in Refs. [32] and [33], respectively.
Again we simulate the key rate of the quasi-EPR scheme and find that none of the REIC implementations will surpass the no-memory performance. The best performance is offered by REIC2, which has a high efficiency and a decent coherence time. Taking into consideration the technical challenges of obtaining both high efficiency and low noise in a REIC-based AFC system, in Section C we explore the possibility of using several (spectral) modes to overcome the no-memory bound. Note that coherence times of 6 hours [48] and one minute [51] have been measured using magnetically-insensitive ground-level transitions of 151 Eu:Y 2 SiO 5 and Pr:Y 2 SiO 5 , respectively. However, it has yet to be shown that these coherence times may be combined with the possibility of efficient and broadband storage, hence these transitions may not be suitable for MA-MDI-QKD.
[Figure caption fragment: ... Table I and the REIC memories featured in Table IV.]

C. Near-future quantum memories
In this section we evaluate the performance of the quasi-EPR scheme using near-future QMs. Specifically, we suggest memory parameters that could be obtained with realistic experimental improvements to the memories of Refs. [22][23][24][25][26][27][28][29][30][31][32][33]. We attempt to be conservative with our suggested parameters, in particular with those of efficiency and coherence time, and acknowledge that there are fundamental limitations of some parameters, e.g. the restriction of bandwidth due to a certain energy level structure. Our enhanced memory parameters may represent a short-term goal for developing QMs.
Warm vapor: Here we consider three potential QMs with properties displayed in Table V and corresponding quasi-EPR key rates shown in Fig. 8. The first we refer to as "excellent coherence" (ExC), in which improved magnetic shielding eliminates inhomogeneous spin dephasing such that the coherence time of Ref. [24] is achieved. Furthermore, we assume that a cavity is used to ensure low-noise operation [22] and that the efficiency is enhanced to that of Ref. [23], either by the cavity or by control field tailoring [44]. We find that this memory enables surpassing the bound at just over 200 km and obtains maximal advantage at 400-500 km. This is a promising result given that MDI-QKD has been demonstrated over 400 km [2], a distance for which channel stabilization has been realized. The second we refer to as "enhanced coherence" (EnC), in which we keep all parameters the same as ExC except the coherence time, which corresponds to the minimum required to surpass the no-memory bound. Surprisingly, we find that a (reasonable) coherence time of approximately 10 µs will beat the bound at around 200 km, while the difference with memory ExC lies in the rate-distance scaling at longer distances. The last QM we refer to as "enhanced efficiency" (EnE), in which we keep the parameters the same as in Ref. [23] except that we find the minimum efficiency to beat the bound, this being an efficiency of 60% at a distance of less than 200 km.
Table V. Parameters for near-future warm vapor memories, and the corresponding interaction times and repetition rates, used for our numerical calculation of the secret key rate assuming the setup of Fig. 3. Memory abbreviations are explained in the main text.
[Figure caption fragment: ... Table I and the near-future warm vapor memories featured in Table V.]
Although it is likely that the EnE memory is challenging to achieve without added noise, improvements in experimental geometry in conjunction with control field optimization may reach this requirement without any compromise to coherence time. The QM of Ref. [25] is not useful for MA-MDI-QKD due to the limited coherence time (up to 100 ns) of the (excited) level used for storage.
Cold atoms: Here we consider the QMs outlined in Table VI, with the corresponding key rates shown in Fig. 9. We consider the memory of [27] with a bandwidth expanded to 1 GHz (CA2+BW), which results in R_S ∼ 667 MHz. Note that the bandwidth must be less than half of the 3 GHz ground-state splitting of 85 Rb to ensure minimum impact of noise.
Unfortunately, we find that, due to its low coherence time, this QM will only (just) beat the no-memory bound if it is ∼90% efficient. Next we assume that a magnetically-insensitive ground-state transition is employed for the investigation of [27] (CA2+MI), finding that about 50% efficiency is needed to beat the bound, which can be realized by control pulse shaping or backwards retrieval [44]. We also consider the QM of Ref. [28], except that we allow the bandwidth to be expanded to 100 MHz, with R_S = 95 MHz (CA3+BW), which is well below the limitations given by the ground-state structure, but may pose a challenge if a cavity setup is employed. Encouragingly, we find that this QM easily overcomes the bound if it is 30% efficient. Finally, if the highly-coherent memory of Ref. [26] is employed and its bandwidth is expanded from 12.2 to 100 MHz (CA1+BW), only 10% efficiency is required to be useful for MDI-QKD for distances greater than 600 km, albeit at a low key rate. Note that, in the majority of cases, the cross-over distance is around 300 km.
Rare-earth-ion-doped crystals: The corresponding QM properties and key rates are shown in Table VII and Fig. 10, respectively. We employ the 151 Eu:Y 2 SiO 5 memory of Ref. [29], except that we assume perfect dynamical decoupling is in use to achieve a coherence time that is entirely limited by the ground-level homogeneous broadening (Eu+DD), and we employ the Pr:Y 2 SiO 5 crystal of [32] and [33] (Pr+DD) in a similar way. Even with perfect efficiency, we find that neither of the QMs overcomes the bound, mainly due to their limited bandwidth in comparison to the Raman QMs. To gain an advantage, we first assume the cavity-enhanced setups of [30] and [31] in conjunction with memories Eu+DD and Pr+DD, respectively. We then consider the possibility of multi-mode storage and find the minimum number of modes that are needed to beat the bound, which we refer to as memories Eu+MM and Pr+MM for Eu- and Pr-doped Y 2 SiO 5 , respectively. Since our implementation is already intrinsically temporally multi-mode, a convenient degree of freedom to use for multiplexing is that of frequency. This is especially true of REICs where their sub-level structure limits the AFC bandwidth, but their inhomogeneously-broadened lines offer simultaneous storage of many, in some cases up to one thousand [34], spectral modes [46,47]. Praseodymium-doped Y 2 SiO 5 offers the possibility to store up to ∼100 spectral modes given its hyperfine structure and its ∼5 GHz inhomogeneous linewidth [32], while 151 Eu:Y 2 SiO 5 only offers the possibility of storing a single spectral mode [29]. Nonetheless, one could employ spatial multiplexing, or explore the possibility to increase the inhomogeneous linewidth by co-doping methods [52].
In order to use the multi-mode feature of the memory we may need to employ an array of SPSs, each generating single photons at a different wavelength or in a different spatial mode. One should account for this if the normalized rate per channel use is of interest.
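The mode-counting argument above can be sketched as follows. This is a minimal illustration under a loudly stated assumption: the total key rate is taken to grow linearly with the number of multiplexed modes, which the text does not claim and which ignores cross-talk or added loss. The function names and the example numbers are hypothetical.

```python
import math

def min_modes_to_beat(rate_single_mode, rate_bound):
    """Smallest integer N with N * rate_single_mode > rate_bound.

    Assumes ideal multiplexing: the total rate scales linearly with N.
    """
    if rate_single_mode <= 0:
        raise ValueError("single-mode rate must be positive")
    return max(1, math.floor(rate_bound / rate_single_mode) + 1)

def rate_per_channel_use(total_rate, n_modes):
    """Normalize a multiplexed rate by the number of channels (modes) used."""
    if n_modes < 1:
        raise ValueError("need at least one mode")
    return total_rate / n_modes

if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    single, bound = 2.0e-9, 1.5e-8
    n = min_modes_to_beat(single, bound)
    print(n, rate_per_channel_use(n * single, n))
```

Under this linear model the per-channel rate equals the single-mode rate again, which is the reason the normalization matters when comparing against a bound stated per channel use.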

D. Near-future memories with additional system imperfections
Table VII. Parameters of near-future rare-earth-ion-doped memories, and the corresponding interaction times and repetition rates, used for our numerical calculation of the secret key rate assuming the setup of Fig. 3. Memory abbreviations are explained in the main text.
[Figure caption fragment: ... Table I and the REIC memories featured in Table VII.]
The implication that atomic ensembles could outperform the no-QM QKD is based on several assumptions. The key assumption is that the two SPSs in the quasi-EPR setup can generate identical single photons that are (bandwidth-) matched to the QMs. We have also thus far ignored the additional background noise coming from the frequency converters. Any deviation from these assumptions may change the rate scaling and add to the QBER of the system. Below, we use our rough calculations of Sec. III B to investigate how resilient our setup is to the following imperfections.
• Multi-photon terms: We now test the resilience of our setup against possible multiple-photon components in the SPS. In fact, one can say that so long as p_2/p_1 ≪ exp(−L/(2L_att)), our system is immune against the two-photon terms generated by the source. At L = 200 km, that would require p_2/p_1 ≪ 0.003, which is almost achievable with today's quantum dot technology for generating entangled and/or single photons [15], and possibly even those that rely on parametric down-conversion. In the latter case, a bank of downconverters is needed to boost the trigger rate of the system [53].
The additional QBER due to two-photon terms is also on the order of p 2 , which is negligible.
• Photon distinguishability: If the two single photons generated by the two SPSs in Fig. 3(b) do not fully couple to each other at the 50:50 beam splitters, then some TLIC-related issues occur at the side BSMs. Yet, similar to the two-photon terms, our system can tolerate the same order of magnitude (0.1-1%) mismatch between the corresponding modes of the two single photons, which is again achievable by the current technology [17]. The additional QBER is also expected to be on the same order. The overlap between the user's photon and the SPSs in the middle node is important, but not as vital as the overlap between the photons of the two SPSs. The former issue could increase the QBER to some extent but, given that long-distance MDI-QKD has been demonstrated, this issue can be dealt with using existing technologies.
to tolerate a dark count on the order of 10^−4 per pulse, which is an order of magnitude higher than the typical background noise from frequency converters [39].
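The multi-photon condition in the list above compares the ratio p_2/p_1 with exp(−L/(2L_att)). A minimal sketch of that comparison follows. The attenuation length of 21.7 km (about 0.2 dB/km) and the safety margin of 3 are assumptions chosen for illustration, not parameters from the paper, and the function names are hypothetical.

```python
import math

# Assumed attenuation length (about 0.2 dB/km); not taken from the paper.
DEFAULT_L_ATT_KM = 21.7

def two_photon_threshold(distance_km, att_length_km=DEFAULT_L_ATT_KM):
    """exp(-L / (2 L_att)): the scale that p2/p1 should stay well below."""
    return math.exp(-distance_km / (2.0 * att_length_km))

def is_comfortably_below(p2_over_p1, distance_km, margin=3.0,
                         att_length_km=DEFAULT_L_ATT_KM):
    """True if p2/p1 is at least `margin` times smaller than the threshold.

    `margin` is a hypothetical stand-in for "much smaller than".
    """
    threshold = two_photon_threshold(distance_km, att_length_km)
    return p2_over_p1 * margin <= threshold

if __name__ == "__main__":
    for L in (100, 200, 300, 400):
        print(f"L = {L} km: threshold = {two_photon_threshold(L):.2e}")
```

Because the threshold falls exponentially with distance, a source that is adequate at 200 km may no longer be adequate at 400 km, so the check is best repeated at the longest distance of interest.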
In order to test the above expectations, in Fig. 11, we have plotted the effect of dark counts from the side-BSM modules on the key rate of the MA-QKD system that uses memory ExC from near-future warm vapor atomic ensembles. Since warm vapor QMs are employed, no loss due to bandwidth mismatch is considered. The results show that, at d_c = 10^−6, the rate is nearly one order of magnitude above the MDI-QKD curve at L = 300 km, which leaves room for losses due to other experimental imperfections. Note that such a study of dark noise also guides the development of future Raman QMs based on warm vapor, which, without special considerations, are plagued by four-wave-mixing-induced noise [22].
[Figure caption fragment: ... Tables V and I.]

V. CONCLUSIONS
In this paper we explored the possibility of using ensemble-based QMs in MA-MDI-QKD setups. Such QMs promise high efficiencies due to their strong light-matter coupling, large time-bandwidth products, and the ability to store multiple modes. By using single-photon sources, which are at an advanced stage of development, we proposed setups that could remove or alleviate the (single-mode) multiple-excitation problem. We identified the key problems in previously-proposed setups or the ones that resembled NLAs, and proposed a quasi-EPR setup that could outperform single no-memory QKD links. We showed that our solution is resilient against the main imperfections in the source, the QM module, and other required devices such as frequency converters and single-photon detectors. Based on our calculations, warm vapor atomic ensembles have the best chance to improve the rate-versus-distance behavior at channel distances above 200 km provided their efficiencies and coherence times can be improved. Cold atomic ensembles also offer a good performance