Modeling a measurement-device-independent quantum key distribution system.

We present a detailed description of a widely applicable mathematical model for quantum key distribution (QKD) systems implementing the measurement-device-independent (MDI) protocol. The model is tested by comparing its predictions with data taken using a proof-of-principle, time-bin qubit-based QKD system in a secure laboratory environment (i.e. in a setting in which eavesdropping can be excluded). The good agreement between the predictions and the experimental data allows the model to be used to optimize mean photon numbers per attenuated laser pulse, which are used to encode quantum bits. This in turn allows optimization of secret key rates of existing MDI-QKD systems, identification of rate-limiting components, and projection of future performance. In addition, we also performed measurements over deployed fiber, showing that our system's performance is not affected by environment-induced perturbations.


Introduction
From the first proposal in 1984 to now, the field of quantum key distribution (QKD) has evolved significantly [1,2]. For instance, experimentally, systems delivering key at Mbps rates [3] as well as key distribution over more than 100 km [4,5] have been reported. From a theoretical perspective, efforts aim at developing QKD protocols and security proofs with minimal assumptions about the devices used [6]. Of particular practical importance are two recently developed protocols that do not require trusted single photon detectors (SPDs) [7,8]. One of these, the so-called measurement-device-independent QKD (MDI-QKD) protocol, has already been implemented experimentally [9][10][11][12]. Hence, it is foreseeable that it will play an important role in the future of QKD, and it is thus important to understand the interplay between experimental imperfections (which will always remain in real systems) and system performance to maximize the latter.
In this work, we derive a widely applicable mathematical model describing systems that implement the MDI-QKD protocol. The model is based on facts about our [9], and other existing experimental setups [10][11][12], and takes into account carefully characterized imperfect state preparation, loss in the quantum channel, as well as limited detector efficiency and noise. It is tested by comparing its predictions with data taken with a proof-of-principle QKD system [9] employing time-bin qubits and implemented in a laboratory environment. Our model, which contains no free parameter, reproduces the experimental data within statistical uncertainties over three orders of magnitude of a relevant parameter. The excellent agreement allows optimizing central parameters that determine secret key rates, such as mean photon numbers used to encode qubits, and to identify rate-limiting components for future system improvement. In addition, we also find that the model accurately reproduces experimental data obtained over deployed fibers, showing that our system minimizes environment-induced perturbation to quantum key distribution in real-world settings.
This paper is organized in the following way: In section 2 we detail some of the side-channel attacks (i.e. attacks exploiting incorrect assumptions about the working of QKD devices) proposed so far and review technological countermeasures. In section 3 we briefly describe the MDI-QKD protocol, which instead exploits fundamental quantum physical laws to render the most important of these attacks useless. Our model of MDI-QKD systems is presented in section 4. This section is followed by an in-depth account of experimental imperfections that affect MDI-QKD performance and a description of how we characterized them in our system (section 5). Section 6 shows the results of the comparison between modelled and measured quantities, and section 7 details how to optimize the performance of our MDI-QKD system using the model. Finally, we conclude the article in section 8.

Side-channel attacks
A healthy development of QKD requires investigating the vulnerabilities of QKD implementations in terms of potential side-channel attacks. Side-channels in QKD are channels over which information about the key may leak out unintentionally. One of the first QKD side-channel attacks proposed was the photon number splitting (PNS) attack [13] in which the eavesdropper, Eve, exploits the fact that attenuated laser pulses sometimes include more than one photon to obtain information about the key. This attack can be detected if the decoy state protocol [14][15][16] is implemented. In the decoy state protocol, Alice varies the mean photon number per pulse in order to allow her and Bob to distill the secret key only from information stemming from single photon emissions. More proposals of side-channel attacks followed, including the Trojan-horse attack [17], for which the countermeasure is an optical isolator [17], and the phase remapping attack [18], for which the countermeasure is phase randomization [18]. Later on, attacks that took advantage of SPD vulnerabilities were also proposed and demonstrated [19][20][21][22]. For example, the time-shift attack [20] exploits a difference in the quantum efficiencies of the SPDs used in a QKD system. This attack can be prevented by actively selecting one of the two bases for the projection measurement, as well as by monitoring the temporal distribution of photon detections [20]. Another example is the detector blinding attack [22] in which the eavesdropper uses high intensity pulses to modify the performance (i.e. blind) the SPDs. It can be detected by monitoring the intensity of light at the entrance of Bob's devices with a photodiode [22][23][24]. Nevertheless, due to its power, the blinding attack is currently of particular concern.
It is important to mention that open side-channels do not necessarily compromise the security of the final key if the information that Eve may have obtained through an attack is properly removed during privacy amplification. However, as technological fixes (as discussed above) or additional privacy amplification can only thwart known attacks, it is important to develop and implement protocols that use a minimum number of assumptions about the devices used to implement the protocol. An important example is the measurement-device-independent QKD protocol, which we will introduce in the next section.

The measurement-device-independent quantum key distribution protocol
The MDI-QKD protocol is a time-reversed version of entanglement-based QKD. In this protocol, the users, Alice and Bob, are each connected to Charlie, a third party, through a quantum channel, e.g. optical fiber (see Fig. 1). In the ideal version, the users have a source of single photons that they prepare randomly in one of the BB84 qubit states [25] |0 , |1 , |+ and |− , where |± = 2 −1/2 (|0 ± |1 ). The qubits are sent to Charlie where the SPDs are located. Charlie performs a partial Bell state measurement (BSM) through a 50/50 beam splitter and then announces the events for which the measurement resulted in a projection onto the state. Alice and Bob then publicly exchange information about the used bases (z, spanned by |0 and |1 , or x, spanned by |+ and |− ). Associating quantum states with classical bits (e.g. |0 , |− ≡ 0, and |1 , |+ ≡ 1) and keeping only events in which Charlie found |ψ − and they picked the same basis, Alice and Bob now establish anticorrelated key strings. (Note that a projection of two photons onto |ψ − indicates that the two photons, if prepared in the same basis, must have been in orthogonal states.) Bob then flips all his bits, thereby converting the anti-correlated strings into correlated ones. Next, the so-called x-key is formed out of all key bits for which Alice and Bob prepared their photons in the xbasis; its error rate is used to bound the information an eavesdropper may have acquired during photon transmission. Furthermore, Alice and Bob form the z-key out of those bits for which both picked the z-basis. Finally, they perform error correction and privacy amplification [1,2] to the z-key, which results in the secret key.
The advantage of the MDI-QKD protocol over conventional prepare-and-measure or entangled photon-based QKD protocols is that, in the case of Charlie performing an ideal (partial) BSM as described above, detection events are uncorrelated with the final secret key bits. This is because a projection onto |ψ − only indicates that Alice and Bob sent orthogonal states, but does not reveal who sent which state. As a result, Charlie (or Eve) is unable to gain any information about the key from passively monitoring the detectors. Furthermore, a measurement that is different from the ideal BSM leads to an increased error rate and thus to a smaller, but still secret, key once privacy amplification has been applied. Notably, it does not matter wether the difference is due to experimental imperfections or to an eavesdropper (possibly Charlie himself) trying to gather information about the states that Alice and Bob sent by replacing or modifying the measurement apparatus. Hence, all detector side channels are closed in MDI-QKD.
In the ideal scenario introduced above, Alice and Bob use single photon sources to generate qubits. However, it is possible to implement the protocol using light pulses attenuated to the single photon level. Indeed, as in prepare-and-measure QKD, randomly varying the mean photon number of photons per attenuated light pulse between a few different values (so-called decoy and signal states) allows making the protocol practical while protecting against a possible PNS attack [7,26]. The secret key rate is then given by [7]: where h 2 is the binary entropy function, f indicates the error correction efficiency, Q indicates the gain (the probability of a projection onto |ψ − per emitted pair of pulses [27]) and e indicates error rates (the ratio of erroneous to total projections onto |ψ − ). Furthermore, the superscripts, x or z, denote if gains or error rates are calculated for qubits prepared in the xor the z-basis, respectively. Similarly, the subscripts, µ and σ , show that the quantity under concern is calculated or measured for pulses with mean photon number µ (sent by Alice) and σ (sent by Bob), respectively. Finally, the subscript 11 indicates quantities stemming from detection events for which the pulses emitted by Alice and Bob contain only one photon each. Note that Q 11 and e 11 cannot be measured; their values must be bounded using either a decoy state method, or employing qubit tagging [13]. However, the latter yields smaller key rates and distances than the former. Shortly after the original proposal [7], a practical decoy state protocol for MDI-QKD was proposed [26]. It requires Alice and Bob to randomly pick mean photon numbers between two decoy states and a signal state. One of the decoy states must have a mean photon number lower than the signal state, while the other one must be vacuum. A finite number of decoy states results in a lower bound for Q x,z 11 and an upper bound for e x 11 , which in turn gives a lower bound for the secret key rate in Eq. (1). We will elaborate more on decoy states in section 7.1.

The model
Our model takes into account imperfections present in a typical QKD system. Regarding the sources, located at Alice and Bob, we take into account imperfect preparation of the quantum state of each photon. Furthermore, we consider transmission loss of the links between Alice and Charlie, and Bob and Charlie. And finally, concerning the measurement apparatus at Charlie's, we consider imperfect projection measurement stemming from non-maximum quantum interference on Charlie's beam splitter, detector noise such as dark counts and afterpulsing, and limited detector efficiency. See also [28] for another model describing MDI-QKD performance, but with a more restrictive set of imperfections and not yet tested against actual experimental data.
In the following paragraphs we present a detailed description of our model. It relies on the assumption of phase randomized laser pulses at Charlie's. While Alice and Bob generate coherent states in our proof-of-principle setup, this assumption is correct as the long fibres used to connect Alice and Bob with Charlie introduce random global phase variations (we will discuss the impact of the lack of phase randomization at Alice's and Bob's on the security of distributed keys in section 8). We note that, in order to facilitate explanations, we have adopted the terminology of time-bin encoding. However, our model is general and can also be applied to MDI-QKD systems implementing other types of encoding [11].

State preparation
In the MDI-QKD protocol, Alice and Bob derive key bits whenever Charlie announces a projection onto the |ψ − Bell state. We model the probability of a |ψ − projection for various quantum states of photons emitted by Alice and Bob as a function of the mean photon number per pulse (µ and σ , respectively) and transmission coefficients of the fiber links (t A and t B , respectively). We consider photons in qubit states described by: where |0 and |1 denote orthogonal modes (i.e. early and late temporal modes assuming timebin qubits), respectively. Note that |ψ describes any pure state [29] and the presence of the m x,z and b x,z terms in Eq. (2), as opposed to using only one parameter, is motivated by the fact that they model different experimentally characterizable imperfections. In the ideal case, m z ∈ [0, 1] for photon preparation in the z-basis (in this case, the value of φ z is irrelevant), m x = 1 2 and φ x ∈ [0, π] for the x-basis, and b x,z = 0 for both bases. Imperfect preparation of photon states is modelled by using non-ideal m x,z , φ x,z and b x,z for Alice and Bob. The parameter b x,z is included to represent the background light emitted and modulated by an imperfect source. Furthermore, in principle, the various states generated by Alice and Bob could have differences in other degrees of freedom (i.e. polarization, spectral, spatial, temporal modes). This is not included in Eq. (2), but would be reflected in a reduced quality of the BSM, which will be discussed below.

Conditional probability for projections onto |ψ −
A projection onto |ψ − occurs if one of the SPDs after Charlie's 50/50 beam splitter signals a detection in an early time-bin (a narrow time interval centered on the arrival time of photons occupying an early temporal mode) and the other detector signals a detection in a late time-bin (a narrow time-interval centered on the arrival time of photons occupying a late temporal mode). Note that, in the following paragraphs, this is the desired detection pattern we search for when modeling possible interference cases or noise effects. Also, note that we assume that Charlie's two single-photon detectors have identical properties. A deviation from this approximation does not open a potential security loophole (in contrast to prepare-and-measure and entangled photon based QKD), as all detector side-channel attacks are removed in MDI-QKD.
We build up the model by first considering the probabilities that particular outputs from the beam splitter (at Charlie's) will generate the detection pattern associated with a projection onto |ψ − . The outputs are characterized by the number of photons per output port as well as their joint quantum state. The probabilities for each of the possible outputs to occur can then be calculated based on the inputs to the beam splitter (characterized by the number of photons per input port and their quantum states, as defined in Eq. (2)). Note that for the simple cases of inputs containing zero or one photon (summed over both input modes), we calculate the probabilities leading to the desired detection pattern directly, i.e. without going through the intermediate step of calculating outputs from the beam splitter. Finally, the probability for each input to occur is calculated based on the probability for Alice and Bob to send attenuated light pulses containing exactly i photons, all in a state given by Eq. (2). The probability for a particular input to occur also depends on the transmissions of the quantum channels, t A and t B . We note that this model considers up to three photons incident on the beam splitter. This is sufficient as, in the case of heavily attenuated light pulses and lossy transmission, higher order terms do not contribute significantly to projections onto |ψ − . However, we limit the following description to two photons at most: the extension to three is lengthy but straightforward and follows the methodology presented for two photons.
Detector noise Let us begin by considering the simplest case in which no photons are input into the beam splitter. In this case, detection events can only be caused by detector noise. We denote the probability that a detector indicates a spurious detection as P n . Detector noise stems from two effects: dark counts and afterpulsing [32]. Dark counts represent the base level of noise in the absence of any light, and we denote the probability that a detector generates a dark count per time-bin as P d . Afterpulsing is an additional noise source produced by the detector as a result of prior detection events. The probability of afterpulsing depends on the total count rate, hence we denote the afterpulsing probability per time-bin as P a , which is a function of the mean photon number per pulse from Alice and Bob (µ and σ ), the transmission of the channels (t A and t B ) and the efficiency of the detectors (η) located at Charlie (see below for afterpulse characterization). The total probability of a noise count in a particular time-bin is thus P n = P d + P a . All together, we find the probability for generating the detection pattern associated with a projection onto the |ψ − -state, conditioned on having no photons at the input, specified by "in", of the beam splitter, to be : Here and henceforward, we have ignored the multiplication factor (1-P n ) ∼ 1 [30], which indicates the probability that a noise event did not occur in the early time-bin (this is required in order to see a detection during the late time-bin assuming detectors with recovery time larger than the separation between the |0 and |1 temporal modes). Note that the probability conditioned on having no photons at the inputs of the beam splitter equals the one conditioned on having no photons at the outputs (specified in Eq. (3) by the conditional "out").
One-photon case Next, we consider the case in which a single photon arrives at the beam splitter. To generate the detection pattern associated with |ψ − , either the photon must be detected and a noise event must occur in the other detector in the opposite time-bin, or, if the photon is not detected, two noise counts must occur as in Eq. (3). We find where η denotes the probability to detect a photon that occupies an early (late) temporal mode during an early (late) time-bin (we assume η to be the same for both detectors).
Two-photon case We now consider detection events stemming from two photons entering the beam splitter. The possible outputs can be broken down into three cases. In the first case, both photons exit the beam splitter in the same output port and are directed to the same detector. This yields only a single detection event, even if the photons are in different temporal modes (the latter is due to detector dead time. Note that as our model calculates detections in units of bits per gate, modeling a dead-time free detector is straightforward.). The probability for Charlie to declare a projection onto |ψ − is then In the second case, the photons are directed towards different detectors and occupy the same temporal mode. Hence, to find detections in opposite time-bins in the two detectors, at least one photon must not be detected. This leads to P(|ψ − |2 photons, 2 spatial modes, 1 temporal mode, out) = In the final case, both photons occupy different spatial as well as temporal modes. In contrast to the previous case, a projection onto |ψ − can now also originate from the detection of both photons. This leads to P(|ψ − |2 photons, 2 spatial modes, 2 temporal modes, out) = In order to find the probability for each of these three two-photon outputs to occur, we must examine two-photon inputs to the beam splitter. We note that it is possible for the two photons to be subject to a two-photon interference effect (known as photon bunching) when impinging on the beam splitter. As this quantum interference can lead to an entangled state between the output modes, the calculation must proceed with quantum mechanical operators. We consider three cases: two photons arrive at the same input of the beam splitter, one photon arrives at each input of the beam splitter and the two photons are distinguishable, and one photon arrives at each input of the beam splitter and the two photons are indistinguishable. For ease of analysis, we first introduce some notation: where b x,z 1,2 and m x,z 1,2 are the parameters introduced in Eq. (2); the subscripts label the photon (one or two) whose state is specified by the parameters. Furthermore, p x,z (i, j) is proportional to finding photon one before the beam-splitter in temporal mode i and photon two in temporal mode j, where i, j ∈ [0, 1]. Finally, b x,z norm is a normalization factor. First, considering the situation in which the two photons impinge from the same input on the beam splitter, one has the state whereâ † (0) andâ † (1) are the creation operators for a photon in the |0 or |1 state, respectively. Evolving this state through the standard unitary transformation for a lossless, 50/50 beam splitter, described byâ † → (ĉ † +d † )/ √ 2 (whereĉ † andd † are the two output modes of the beam splitter), one finds that with probability 1/2 the two photons exit the beam splitter in the same output port (or spatial mode) and with probability 1/2 in different ports. Furthermore, with norm we find the photons in different spatial modes and in the same temporal mode, and with probability B = [p x,z (0, 1) + p x,z (1, 0)]/2b x,z norm we find the photons in different spatial and temporal modes. By symmetry, we find the same result if the two photons arrive from the other input mode of the beam splitter.
Second, consider the situation in which the two photons come from different inputs, and are completely distinguishable in some degree of freedom. This can be modelled by starting with the input state whereb † is the creation operator for a photon in the second input mode of the beam splitter. One can then evolve the state with the beam splitter unitary described byâ † → (ĉ † +d † )/ √ 2 (as before) andb † → (−ê † +f † ) √ 2, whereĉ † andê † correspond to the same spatial output mode but with distinguishability in another degree of freedom, and similarly for the other spatial output mode described byd † andf † . One finds the same result as for the previous case, described by Eq. (10): P (|ψ − |2 photons, 2 spatial modes, non-interfering, in) = P(|ψ − |2 photons, 1 spatial mode, in) The definition reflects that there is no two-photon interference in both cases. Finally, consider the case in which the two photons impinge from different inputs are indistinguishable, and interfere on the beam splitter. This can be modelled by considering the same input state as in Eq. (11), but using a beam splitter unitary described byâ † → ( In this case, the probabilities of finding the outputs from the beam splitter discussed in Eqs. (5-7) depend on the difference between the phases φ x,z 1 and φ x,z 2 that specify the states of photons one and two, ∆φ x,z ≡ φ x,z 1 − φ x,z 2 . Note that, due to the two-photon interference effect, finding the two photons in different spatial modes and the same temporal mode is impossible. We are thus left with the case of having two photons in the same output port (the same spatial mode), which occurs with probability C = [p x,z (0, 0) + p x,z (1, 1) + 0.5(p x,z (0, 1) + p x,z (1, 0)) + p x,z (0, 1)p x,z (1, 0) cos(∆φ x,z )]/b x,z norm , and the case of having the photons in different temporal and spatial modes, which occurs with probability D = [0.5(p x,z (0, 1) + p x,z (1, 0)) − p x,z (0, 1)p x,z (1, 0) cos(∆φ x,z )]/b x,z norm . This leads to P(|ψ − |2 photons, interfering, in) = C × P(|ψ − |2 photons, 1 spatial mode, out) + D × P(|ψ − |2 photons, 2 spatial modes, 2 temporal modes, out).
4.3. Aggregate probability for projections onto |ψ − Now that we have calculated the conditional probabilities of a detection pattern indicating |ψ − for various inputs to the beam splitter, let us consider with what probability each case occurs. This requires that we know the photon number distribution of the pulses arriving at Charlie's beam splitter from Alice and Bob, which can be computed based on the photon number distribution at the sources and the properties of the quantum channels. For the following discussion, we assume that the channels from Alice to Charlie, and from Bob to Charlie are characterized by the loss t A and t B , respectively, yielding pulses with number distribution D and mean photon number, µt A and σt B , respectively. This is equivalent to assuming that no PNS attack takes place, which was ensured by performing experiments with the entire setup (including the fiber transmission lines) inside a single laboratory in which no eavesdropping took place during the experiments. We limit our discussion to the cases with two or less photons at the input of the beam splitter (but recall that the actual calculation includes up to three photons). Hence, the cases we consider and their probabilities of occurrence, P O , are given by: • 0 photons at the input from both sources: • 1 photon at the input from Alice and 0 photons from Bob: • 0 photons at the input from Alice and 1 photon from Bob: • 2 photons at the input from Alice and 0 photons from Bob: • 0 photons at the input from Alice and 2 photons from Bob: • 1 photon at the input from both sources: where we denote the probability of having i photons from a distribution D with mean number µ as D i (µ). For each of these cases, we have already computed the probability that Charlie obtains the detection pattern associated with the |ψ − -state for arbitrary input states of the photons (as defined in Eq. (2)). When zero or one photons arrive at the beam splitter, Eq. (3) and Eq. (4) are used, respectively. In the case in which two photons arrive from the same source, Eq. (12) is used. Finally, in the case in which one photon arrives from each source at the beam splitter, Eq. (13) would be used in the ideal case. However, perfect indistinguishability of the photons cannot be guaranteed in practice. We characterize the degree of indistinguishability by the visibility, V , that we would observe in a closely-related Hong-Ou-Mandel (HOM) interference experiment [33] with single-photon inputs. Taking into account partial distinguishability, the probability of finding a detection pattern corresponding to the projection onto |ψ − is given by Equations 3-14 detail all possible causes for observing the detection pattern associated with a projection onto the |ψ − Bell state, if up to two photons at the beam splitter input are taken into account. We remind the reader that all calculations in the following sections take up to three photons at the input of the beam splitter into account. To calculate the gains, Q x,z µσ , using these equations, we need only substitute in the correct values of µ, σ , t A , t B , m x,z , b x,z , and ∆φ x,z for the cases in which Alice and Bob both sent attenuated light pulses in the x-basis or z-basis, respectively. The error rates, e x,z µ , can then be computed by separating the projections onto |ψ − into those where Alice and Bob sent photons in different states (yielding correct key bits) and in the same state (yielding erroneous key bits). More precisely, the error rates, e x,z µσ , are calculated as e x,z µσ = p x,z wrong /(p x,z correct + p x,z wrong ) where p x,z wrong (p x,z correct ) denotes the probability for detections yielding an erroneous (correct) bit in the x (or z)-key.

Characterizing experimental imperfections
The parameters used to model our system are derived from data established through independent measurements. To test our model, the characterization of experimental imperfections in our MDI-QKD implementation [9] is very technical at times. It can be broken down into timeresolved energy measurements at the single photon level (required to extract µ, σ , b x,z and m x,z for Alice and Bob, as well as dark count and afterpulsing probabilities), measurements of phase (required to establish φ x,z for Alice and Bob), and visibility measurements. In the following paragraphs we describe the procedures we followed to obtain these parameters from our system.

Our MDI-QKD implementation
In our implementation of MDI-QKD [9] Alice's and Bob's setups are identical. Each setup consists of a CW laser with large coherence time, emitting at 1550nm wavelength. Time-bin qubits, encoded into single photon-level light pulses with Poissonian photon number statistics, are created through an attenuator, an intensity modulator and a phase modulator located in a temperature controlled box. More precisely, the intensity modulator is used to tailor pulse pairs out of the cw laser light, the phase modulator is used to change their relative phase, and the attenuator attenuates these pulses to the single-photon level. The two temporal modes defining each time-bin qubit are of 500 ps (FWHM) duration and are separated by 1.4 ns. Each source generates qubits at 2 MHz rate.
We emphasize that our qubit generation procedure justifies the assumption of a pure state in Eq. (2). Indeed, all photons, including background photons due to light leaking through imperfect intensity modulators, have to be generated by the CW lasers whose coherence times exceeds the separation between the temporal modes |0 and |1 [31]. Note that in all experiments reported to date [9][10][11][12] background photons always add coherently to the modes describing qubits, making our pure-state description widely applicable.
The time-bin qubits are sent to Charlie through an optical fiber link. The link consisted of spooled fiber (for the measurements in which Alice, Bob and Charlie were all located in the same laboratory) or deployed fiber (for the measurements in which the three parties were located in different locations within the city of Calgary). We remind the reader that all pulses arriving at Charlie's are phase randomized, due to the use of long fibers. Charlie performs a BSM on the qubits he receives using a 50/50 beamsplitter and two SPDs. See Figure 2. Note that, in order to perform a Bell state measurement the photons arriving to Charlie must be indistinguishable in all degrees of freedom: polarization, frequency, time and spatial mode. The indistinguishability of the photons is assessed through a Hong-Ou-Mandel interference measurement [33]. As our system employs attenuated laser pulses, the maximum visibility we can obtain in this measurement is V max = 50% (and not 100% as it would be with single photons) [34]. In our implementation the visibility measurements resulted in V = (47 ± 1), irrespective of whether they were taken with spooled fiber inside the lab, or over deployed fiber.

Time-resolved energy measurements
First, we characterize the dark count probability per time-bin, P d , of the SPDs (InGaAsavalanche photodiodes operated in gated Geiger mode [32]) by observing their count rates when the optical inputs are disconnected. We then send attenuated laser pulses so that they arrive just after the end of the 10 ns long gate that temporarily enables single photon detection. The observed change in the count rate is due to background light transmitted by the intensity modulators (whose extinction ratios are limited) and allows us to establish b x,z (per time-bin) for Alice and Bob. Next, we characterize the afterpulsing probability per time-bin, P a , by placing the pulses within the gate, and observing the change in count rate in the region of the gate prior to the arrival of the pulse. The afterpulsing model we use to assess P a from these measurements is described below.
Once the background light and the sources of detector noise are characterized, the values of m x,z can be calculated by generating all required states and observing the count rates in the two time-bins corresponding to detecting photons generated in early and late temporal modes. Observe that m z=1 for photons generated in state |1 (the late temporal mode) is zero, since all counts in the early time-bin are attributed to one of the three sources of background described above. Furthermore, we observed that m z=0 for photons generated in the |0 state (the early temporal mode) is smaller than one due to electrical ringing in the signals driving the intensity modulators. Note that, in our implementation, the duration of a temporal mode exceeds the width of a time-bin, i.e. it is possible to detect photons outside a time-bin (see Figure 3 for a schematical representation). Hence, it will be useful to also define the probability for detecting a photon arriving at any time during a detector gate; we will refer to this quantity as η gate .The count rate per gate, after having subtracted the rates due to background and detector noise, together with the detection efficiency, η gate (η gate , as well as η, have been characterized previously based on the usual procedure [32]), allows calculating the mean number of photons per pulse from Alice or Bob (µ or σ , respectively). The efficiency coefficient relevant for our model, η, is smaller than η gate . Finally, we point out that the entire characterization described above was repeated for all experimental configurations investigated (the configurations are detailed in Table 2). We found all parameters to be constant in µσt A t B , with the obvious exception of the afterpulsing probability. . Sketch (not to scale) of the probability density p(t) for a detection event to occur as a function of time within one gate. Detection events can arise from a photon within an optical pulse (depicted here as a pulse in the late temporal mode), or be due to optical background, a dark count, or afterpulsing. Also shown are the 400 ps wide time-bins. Within the early time-bin only optical background, dark counts and afterpulsing give rise to detection events in this case. Note that the width of the temporal mode exceeds the widths of the time-bins.

Phase measurements
To detail the assessment of the phase values φ x,z determining the superposition of photons in early and late temporal modes, let us assume for the moment that the lasers at Alice's and Bob's emit light at the same frequency. First, we defined the phase of Bob's |+ state to be zero (this can always be done by appropriately defining the time difference between the two temporal modes |0 and |1 ). Next, to measure the phase describing any other state (generated by either Alice or Bob) with respect to Bob's |+ state, we sequentially send unattenuated laser pulses encoding the two states through a common reference interferometer. This reference interferometer featured a path-length difference equal to the time-difference between the two temporal modes defining Alices and Bob's qubits. For the phase measurement of qubit states |+ and |− (generate by Alice), and |− generated by Bob), first, the phase of the interferometer was set such that Bob's |+ state generated equal intensities in each output of the interferometer (i.e. the interferometers phase was set to π/4). Thus, sending any of the other three states through the interferometer and comparing the output intensities, we can calculate the phase difference. We note that any frequency difference between Alice's and Bob's lasers results in an additional phase difference. Its upper bound for our maximum frequency difference of 10 MHz is denoted by φ f req .

Measurements of afterpulsing
We now turn to the characterization of afterpulsing. After a detector click (or detection event, which includes photon detection, dark counts and afterpulsing), the probability of an afterpulse occuring due to that detection event decays exponentially with time. The SPDs are gated, with the afterpulse probability per gate being a discrete sampling of the exponential decay. This can be expressed using a geometric distribution: supposing a detection event occurred at gate k = −1, the probability of an afterpulse occuring in gate k is given by P k = α p(1 − p) k . Thus, if there are no other sources of detection events, the probability of an afterpulse occuring due to a detection event is given by ∑ ∞ k=0 α p(1 − p) k . In a realistic situation, the geometric distribution for the afterpulses will be cut off by other detection events, either stemming from photons, or dark counts. In addition, the SPDs have a deadtime after each detection event during which the detector is not gated until k ≥ k dead (note that time and the number of gates applied to the detector are proportional). The deadtime can simply be accounted for by starting the above summation at k = k dead rather than k = 0. However, for an afterpulse to occur during the k th gate following a particular detection event, no other detection events must have occured in prior gates. This leads to the following equation for the probability of an afterpulse per detection event: where: and P d,gate denotes the detector dark count probability per gate (as opposed to per time-bin), and µ avg (µ, σ ,t A ,t B ) expresses the average number of photons present on the detector during each gate as follows: where b A and b B characterize the amount of background light per gate from Alice and Bob, respectively, and the factor of 1 2 comes from Charlie's beam splitter. The terms in the sum of Eq. (15) describe the probabilities of neither having an optical detection (γ), either caused by a modulated pulse or background light, nor a detector dark count (υ) in any gate before and including gate k, and not having an afterpulse in any gate before gate k (ρ), followed by an afterpulse in gate k (P k ). Equation (15) takes into account that afterpulsing within each time-bin is influenced by all detections within each detector gate, and not only those happening within the time-bins that we post-select when acquiring experimental data.
The afterpulse probability, P a,gate , for given µ, σ , t A and t B can then be found by multiplying Eq. (15) by the total count rate P a,gate = µ avg (µ, σ ,t A ,t B )η gate + P d,gate + P a,gate P(a,det).
This equation expresses that afterpulsing can arise from prior afterpulsing, which explains the appearance of P a,gate on both sides of the equation. Equation (18) simplifies to Finally, to extract the afterpulsing probability per time-bin, P a (µ, σ ,t A ,t B ), we note that we found that the distribution of afterpulsing across the gate to be the same as the distribution of dark counts across the gate. Hence, Fitting our afterpulse model to the measured afterpulse probabilities, we find α = 1.79 × 10 −1 , p = 2.90 × 10 −2 , and P d P d,gate = 4.97 × 10 −2 for k dead = 20. The fit, along with the measured values, is shown in Figure 4 as a function of the average number of photons arriving at the detector per gate µ avg (µ, σ ,t A ,t B ). A summary of all the values obtained through these measurements is shown in Table 1.
6. Testing the model, and real-world tests

Comparing modelled with actual performance
To test our model, and to verify our ability to perform, in principle, QKD with deployed (realworld) fiber, we now compare the model's predictions with experimental data obtained using the QKD system characterized by the parameters listed in Table 1. We performed experiments in two configurations: inside the laboratory using spooled fiber (for four different distances between Alice and Bob ranging between 42 km and 103 km), and over deployed fiber (18 km). The first configuration allows testing the model, and the second configurations shines light on our system's capability to compensate for environment-induced perturbations, e.g. due to temperature fluctuations. For each test, three different mean photon numbers (0.1, 0.25 and 0.5) were used. All the configurations tested (as well as the specific parameters used in each Table 1. Experimentally established values for all parameters required to describe the generated quantum states, as defined in Eq. (2), as well as two-photon interference parameters and detector properties.

Parameter
Alice's value Bob's value b z=0 = b z=1 (7.12 ± 0.98) × 10 −3 (1.14 ± 0.49) × 10 −3 b x=− = b x=+ (5.45 ± 0.37) × 10 −3 (1.14 ± 0.49) × 10 −3 m z=0 0.9944 ± 0.0018 0.9967 ± 0.0008 π + (0.075 ± 0.015) π − (0.075 ± 0.015) Parameter Value test) and the results obtained are listed in Table 2. In Figure 5 we show the simulated values for the error rates (e z,x ) and gains (Q z,x ) predicted by the model as a function of µσt A t B . The plot includes uncertainties from the measured parameters, leading to a range of values (bands) as opposed to single values. The figure also shows the experimental values of e z,x and Q z,x from our MDI-QKD system in both the laboratory environment and over deployed fiber.
Considering the data taken inside the lab, the modelled values and the experimental results agree within experimental uncertainties over three orders of magnitude. This shows that the model is suitable for predicting error rates and gains. In turn, this allows us to optimize performance of our QKD systems in terms of secret key rate (see section 7). In particular, the model allows optimizing the mean photon number per pulse that Alice and Bob use to encode signal and decoy states as a function of transmission loss, and identifying rate-limiting components.
Furthermore, the measurement results over deployed fibre are also well described by the same model, indicating that this more-difficult measurement worked correctly. The increased difficult across real-world fiber arises due to the fact that BSMs require incoming photons to be indistinguishable in all degrees of freedom (i.e. arrive within their respective coherence times, with identical polarization, and with large spectral overlap). As we have shown in [9], time-varying properties of optical fibers in the outside environment (e.g. temperature dependent polarization and travel-time changes) can remove indistinguishability in less than a minute. Active stabilization of these properties is thus required to achieve functioning BSMs and, in fact, three such stabilization systems were deployed during the MDI-QKD measurements presented here (more details are contained in [9]). That our measurement results agree with the predicted values of the model demonstrates that the impact of environmental perturbations on the ability to perform Bell state measurements is negligible (which is the same conclusion drawn in [9]).

Decoy-state analysis
To calculate secret key rates for various system parameters, which allows optimizing these parameters, first, it is necessary to compute the gain, Q z 11 , and the error rate, e x 11 , that stem from events in which both sources emit a single photon. We consider the three-intensity decoy state method for the MDI-QKD protocol proposed in [26], which derives a lower bound for the secret key rate using lower bounds for Q x,z 11 and an upper bound for e x 11 . Note that we assume here that the the only effect of imperfectly generated qubit states on the secret key rate that we consider . The difference in gains and error rates in the x-and the z-basis, respectively is due to the fact that, in the case in which one party sends a laser pulse containing more than one photon and the other party sends zero photons, projections onto the |ψ − Bell state can only occur if both pulses encode qubits belonging to the x-basis. The Bell state projection cannot occur if both prepare qubits belonging to the z-basis (we ignore detector noise for the sake of this argument). This causes increased gain for the x-basis and, due to an error rate of 50% associated with these projections, also an increased error rate for the x-basis.
here is that it increases the error rates (further considerations require advancements to security proofs, which are under way [26,35]) increases of error rates. We denote the signal, decoy, and vacuum intensities by µ s , µ d , and µ v , respectively, for Alice, and, similarly, as σ s , σ d , and σ v for Bob. Note that µ v = σ v = 0 by definition. This decoy analysis assumes that perfect vacuum intensities are achievable, which may not be correct in an experimental implementation. However, note that, first, intensity modulators with more than 50 dB extinction ratio exist, which allows obtaining almost zero vacuum intensity, and second, that a similar decoy state analysis with non-zero vacuum intensity values is possible as well [28]. For the purpose of this analysis, we take both channels to have the same transmission coefficients (that is t A = t B ≡ t), according to our experimental configuration, and Alice and Bob hence both select the same mean photon numbers for each of the three intensities (that is µ s = σ s ≡ τ s , Additionally, for compactness of notation, we omit the µ and σ when describing the gains and error rates (e.g. we write Q z ss to denote the gain in the z-basis when Alice and Bob both send photons using the signal intensity). Under these assumptions, the lower bound on Q x,z 11 is given by where the various D i (τ) denote the probability that a pulse with photon number distribution D and mean τ contains exactly i photons, and Q x,z 0 (τ d ) and Q x,z 0 (τ s ) are given by Table 2. Measured error rates, e x,z µσ , and gains, Q x,z µσ , for different mean photon numbers, µ and σ (where µ = σ ), lengths of fiber connecting Alice and Charlie, and Charlie and Bob, A and B , respectively, and total transmission loss, l. The last set of data details realworld measurements using deployed fiber. Uncertainties are calculated using Poissonian detection statistics.
The error rate e x 11 can then be computed as where the upper bound holds if a lower bound is used for Q x 11 . Note that Q x,z 11 , Q x,z 0 (τ d ), Q x,z 0 (τ s ) and e x 11 (Eqs. (21-24)) are uniquely determined through measurable gains and error rates.

Optimization of signal and decoy intensities
For each set of experimental parameters (i.e. distribution function D, channel transmissions and all parameters describing imperfect state preparation and measurement), the secret key rate (Eq. (1)) can be maximized by properly selecting the intensities of the signal and decoy states (τ s and τ d , respectively). Here we consider its optimization as a function of the total transmission (or distance) between Alice and Bob. We make the assumptions that both the channel between Alice and Charlie and the channel between Bob and Charlie have the same transmission coefficient, t, and that Alice and Bob use the same signal and decoy intensities.
We considered values of τ d in the range 0.01 ≤ τ d < 0.99 and values of τ s in the range τ d < τ s ≤ 1. An exhaustive search computing the secret key rate for an error correction efficiency f = 1.14 [36] is performed from 2 km to 200 km total distance (assuming 0.2 dB/km loss), with increments of 0.01 photons per pulse for both τ s and τ d . For each point, the model described in section 4 is used to compute all the experimentally accessible quantities required to compute secret key rates using the three-intensity decoy state method summarized in Eqs. (21-24). In our optimization, we found that, in all cases, τ d = 0.01 is the optimal decoy intensity. We attribute this to the fact that τ d has a large impact on the tightness of the upper bound on e x 11 in   Fig. 6. a) Optimum signal state intensity, τ s , and b) corresponding secret key rate as a function of total loss in dB. The secondary axis shows distances assuming typical loss of 0.2 dB/km in optical fiber without splices. The optimum values for µ s for small loss have to be taken with caution as in this regime the model needs to be expanded to higher photon number terms.
Eq. (24) (this is due to the fact that all errors in the cases in which both parties sent at least one photon, which increases with τ d , are attributed to the case in which both parties sent exactly one photon). Figure 6 shows, as a function of total loss (or distance), the optimum values of the signal state intensity, τ s , and the corresponding secret key rate, S, for decoy intensities of τ d ∈ [0.01, 0.05, 0.1], as well as for a perfect decoy state protocol (i.e. using values of Q z 11 and e x 11 computed from the model, as detailed in the preceeding section).

Rate-limiting components
Finally, we use our model to simulate the performance of the MDI-QKD protocol given improved components. We consider two straightforward modifications to the system: replacing the InGaAs single photon detectors (SPDs) with superconducting single photon detectors (SSPDs) [37], and improving the intensity modulation (IM). For various combinations of these improvements, the optimized signal intensities and secret key rates for µ d = 0.05 are shown in Figure 7. First, using state-of-the-art SSPDs in [37], the detection efficiency (η) is improved from 14.5% to 93%, and the dark count probability (P d ) is reduced by nearly two orders of magnitude. Furthermore, the mechanisms leading to afterpulsing in InGaAs SPDs are not present in SSPDs (that is P a = 0). This improvement results in a drastic increase in the secret key rate and maximum distance as both the probability of projection onto |ψ − and the signal-to-noise-ratio are improved significantly. Second, imperfections in the intensity modulation system used to create pulses in our implementation contribute significantly to the observed error rates, particularly in the z-basis. Using commercially-available, state-of-the-art intensity modulators [38] allow suppressing the background light (represented by b x,z in general quantum state given in Eq. (2)) by an additional 10-20 dB, corresponding to an extinction ration of 40 dB. Furthermore, we considered improvements to the driving electronics that reduces ringing in our pulse generation by a factor of 5, bringing the values of m x,z in Eq. (2) closer to the ideal values. As seen in Figure 7, this provides a modest improvement to the secret key rate, both when applied to our existing implementation, and when applied in conjunction with the SSPDs. Note that in the case of improved detectors and intensity modulation system the optimized τ s for small loss (under 10 dB) is likely overestimated due to neglected higher-order terms.

Discussion and conclusion
We have developed a widely applicable model for systems implementing the Measurement-Device-Independent QKD protocol. Our model is based on facts about the experimental setup and takes into account carefully characterized experimental imperfections in sources and measurement devices as well as transmission loss. It is evaluated against data taken with a real, time-bin qubit-based QKD system. The excellent agreement between observed values and predicted data confirms the model. In turn, this allows optimizing mean photon numbers for signal and decoy states and finding rate-limiting components for future improvements. We believe that our model, which is straightforward to generalize to other types of qubit encoding, as well as the detailed description of the characterization of experimental imperfections will be useful to improve QKD beyond its current state of the art. To finish, let us emphasize that tests of a model that describes the performance of a QKD system in terms of secret key rates has to happen in a setting in which eavesdropping can be excluded (i.e. within a secure lab and using spooled fibre) -otherwise, the measured data, which depends on the (unknown) type and amount of eavesdropping, may deviate from the predicted performance and no conclusion about the suitability of the model can be drawn. Interestingly, this implies that neither phase randomization, nor random selection of qubit states or intensities of attenuated laser pulses used to encode qubit states is necessary to test a model, as their presence (or absence) does not impact the measured data. However, it is obvious that these modulations are crucial to ensure the security of a key that is distributed through a hostile environment. We note that in this article, all effects of imperfections in the system on the measured quantities are still attributed to an eavesdropper, and accounted for in the calculation of the secret key rate as well in the optimization of system parameters.