Application of an improved version of McDiarmid inequality in finite-key-length decoy-state quantum key distribution

In practical decoy-state quantum key distribution, the raw key length is finite. Thus, deviation of the estimated single-photon yield and single-photon error rate from their respective true values due to finite sample size can seriously lower the provably secure key rate R. The current method for obtaining a lower bound of R follows an indirect path: it first bounds the yields and error rates conditioned on the type of decoy used, then uses these bounds to deduce the single-photon yield and error rate, which in turn give a lower bound of the key rate R. Here we report an improved version of the McDiarmid inequality in statistics and show how to use it to directly compute a lower bound of R via the so-called centering sequence. A novelty of this work is the optimization of the bound through the freedom of choosing possible centering sequences. The provably secure key rate of a realistic 100 km long quantum channel obtained by our method is at least twice that of the state-of-the-art procedure when the raw key length ℓ_raw ≈ 10^5–10^6. In fact, our method can improve the key rate significantly over a wide range of raw key lengths from about 10^5 to 10^11. More importantly, this is achieved by pure theoretical analysis without altering the experimental setup or the post-processing method. In a broader context, this work introduces powerful concentration inequality techniques in statistics to tackle physics problems beyond straightforward statistical data analysis, especially when the data are correlated so that tools like the central limit theorem are not applicable.


Introduction
Quantum key distribution (QKD) enables two trusted parties, Alice and Bob, to share a provably secure secret key by preparing and measuring quantum states that are transmitted through a noisy channel controlled by an eavesdropper Eve. One of the major challenges in making QKD practical is to increase the number of secure bits generated per second [1]. That is why most QKD experiments to date use photons as the quantum information carriers; and these photons come from phase-randomized Poissonian sources instead of the much less efficient single-photon sources. In addition, the decoy-state method is used to combat Eve's photon-number-splitting attack on the multi-photon events emitted by the Poissonian sources [2, 3]. From the theoretical point of view, a more convenient figure of merit is the key rate, namely, the number of provably secure secret bits per average number of photon pulses prepared by Alice. This is because the key rate measures the intrinsic performance of a QKD protocol (in other words, the software issue) without taking the pulse repetition rate (which is a hardware issue) into account. This is analogous to the use of time complexity rather than actual runtime to gauge the performance of an algorithm in theoretical computer science.
Surely, the provably secure lower bound of the key rate R (which we simply call the key rate from now on) of a QKD scheme depends on various photon yields as well as error rates of the detected photons, to be precisely defined in equations (1) and (2) below. The problem is that Alice and Bob can only transmit a finite number of photons in practice. Consequently, the yields and error rates estimated by any sampling technique may differ from their actual values. If Alice and Bob ignore these deviations, the actual number of secret key bits they obtain could be smaller than that computed from the key rate R, posing a security threat.
Various key rate formulae that take the above finite-size statistical fluctuations into account for a few (decoy-state-based) QKD schemes have been reported in the literature. For instance, Lim et al [4] computed the key rate of a certain implementation of the BB84 QKD scheme [5] using three types of decoy; recently, Chau [6] extended it to the case of more than three types of decoys. Hayashi and Nakayama investigated the key rate for the BB84 scheme [7]. Brádler et al derived the key rate for a qudit-based QKD scheme using up to three mutually unbiased preparation and measurement bases [8]. And Wang et al proved that errors and fluctuations in the decoy photon intensities have only a minor effect on the final key rate [9]. In brief, the provably secure key rate of a QKD scheme has so far been found using the following three-step strategy. First, the yields Q^B_{μ_n} and error rates E^B_{μ_n}, conditioned on the preparation and measurement basis B as well as the photon intensity μ_n used, are determined by comparing the relevant measurement outcomes of Bob, if any, with the states prepared by Alice. The second step is to deduce the yields and error rates conditioned on the number of photons emitted by the source [2-4, 10]. Nevertheless, the latter quantities cannot be determined precisely because equations (1) and (2) form under-determined systems of equations given the Q^B_{μ_n}'s and E^B_{μ_n}'s, provided that the number of photon intensities k used is finite. To make things worse, in the finite-raw-key-length (FRKL) situation, the measured values of the Q^B_{μ_n}'s and E^B_{μ_n}'s deviate from their true values due to finite sampling. Fortunately, effective lower bounds on Y^B_0 and Y^B_1 as well as an upper bound on e^B_1 are available [2-4, 6, 10, 11]. In the FRKL situation, these bounds can be deduced with the help of Hoeffding's inequality [12]. (See, for example, [4, 6] for details.
Note that here we cannot assume the measurement outcomes are statistically independent, and hence cannot use more familiar tools such as the central limit theorem, because Eve may launch a coherent attack on all the photon pulses. In fact, we do not even know what kind of statistical distributions the Q^B_{μ_n}'s and E^B_{μ_n}'s follow.) The third step is to deduce R from these bounds [2-4, 8, 10].
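For concreteness, the size of the Hoeffding correction in this strategy can be sketched numerically. The following snippet (all parameter values are hypothetical and purely illustrative, not taken from [4, 6]) solves the two-sided Hoeffding tail bound for the deviation that must be budgeted at a given failure probability ε:

```python
import math

def hoeffding_deviation(n, eps, width=1.0):
    """Deviation delta such that, by the two-sided Hoeffding inequality
        Pr(|empirical mean - true mean| >= delta) <= 2*exp(-2*n*delta**2/width**2),
    the failure probability is at most eps for n samples of range `width`."""
    return width * math.sqrt(math.log(2.0 / eps) / (2.0 * n))

# Hypothetical numbers: 10**6 relevant detection events and a failure
# budget of 10**-10 for this one estimated quantity.
print(hoeffding_deviation(10**6, 1e-10))
```

Even with a million events, the estimate must be slackened by about 3.4 × 10^-3, which illustrates why finite-size effects bite hard at short raw key lengths.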
Computing a lower bound of R using this indirect strategy is not satisfactory in the FRKL situation because it is unlikely for all of the finite-size fluctuations in the Q^B_{μ_n}'s and E^B_{μ_n}'s to decrease the value of the provably secure key rate simultaneously. In fact, for a given security parameter, the worst-case bounds on Y^B_0 and Y^B_1 cannot be attained simultaneously if the raw key length is finite. (This is evident, say, from the bounds on Y^B_0 and Y^B_1 given by inequalities (2) and (3) in [4] or inequalities (12a) and (12b) in [6]. Note that there is a typo in inequality (12b) of [6]. In all cases, the finite-size statistical fluctuation that leads to the saturation of the lower bound for Y^B_0 does not cause the saturation of the lower bound for Y^B_1 and vice versa.) It would be more effective if one could directly investigate the influence of the finite key length on the key rate. To do so, one has to go beyond the use of Hoeffding's inequality to bound the statistical fluctuation, for that inequality only works for an equally weighted sum of random variables that are either statistically independent or drawn from a finite population without replacement [12]. Here we use the computation of the key rate of a specific BB84 QKD protocol [5] that generates the raw key solely from X-basis measurement results as an example to illustrate how to directly tackle statistical fluctuation in the FRKL situation by means of a McDiarmid-type inequality [13] in statistics. The technique used here can easily be adapted to compute the key rates of other QKD schemes using finite-dimensional qudits in the FRKL situation. Our work here is based on an earlier preprint by one of us [14]. Here we greatly extend and improve the original proposal by first proving a new and slightly extended McDiarmid-type inequality on so-called centering sequences. (See definition 1 for the precise definition of a centering sequence.)
Then we apply it through four different methods, each giving a separate provably secure key rate. We also optimize the provably secure key rate R by exploiting our freedom to pick the centering sequence. To our knowledge, this is the first time such an optimization has been performed. In contrast, this type of optimization is not possible in the previous approach, which makes use of the less general Hoeffding's inequality. It turns out that each method works best in a different situation; and the best provably secure key rate among the four methods in realistic practical situations is at least about 10% better than the state-of-the-art method that predates [14]. Moreover, for raw key lengths ℓ_raw ≈ 10^5–10^7, this work almost doubles the secure key rate of the original proposal in [14] when four different photon intensities are used. From a broader perspective, the technique we introduce here is also applicable to bounding the conclusion of a general physics experiment, in the form of a real number, against finite-size statistical fluctuations of more than one type of measurement outcome that are possibly statistically dependent.

The QKD scheme by Chau in [6] and the assumptions of the security proof
To illustrate how a McDiarmid-type inequality can be used to give a better key rate, we consider the QKD scheme studied by Chau, whose details can be found in [6]. Note that this scheme is a slight variation of the one studied by Lim et al in [4]. The only difference is that they use three different photon intensities while we consider the slightly more general case of k ≥ 2 different photon intensities. In essence, the scheme in [6] is a decoy-state BB84 scheme with one-way classical communication that uses the X-basis measurement results as the raw key and the Z-basis measurement results for phase error estimation.
We assume that the light source is Poissonian with intensities μ_1 > μ_2 > ... > μ_k > 0, where k ≥ 2. Using the result in [9], we simplify our discussion by assuming that these photon intensities are accurately determined and fixed throughout the experiment. This is fine because the fluctuation of the photon intensity of a laser source is negligible in practice. Since our aim is to demonstrate our technique of using a McDiarmid-type inequality in the simplest possible QKD implementation, we do not consider twin-field [15] or measurement-device-independent [16] setups, although adaptation to these situations is straightforward though tedious. The measurement is performed using threshold photon detectors with random bit assignment in the event of multiple detector clicks. Last but not least, we assume both Alice and Bob have access to their own private perfect random number generators when choosing their preparation and measurement bases.
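For reference, the photon-number statistics of such a Poissonian source can be sketched as follows; the intensity values in the loop are illustrative only, not the optimized decoy values used later:

```python
import math

def photon_number_prob(n, mu):
    """P(n | mu) = exp(-mu) * mu**n / n! : photon-number distribution of
    a phase-randomized Poissonian (weak coherent) source of intensity mu."""
    return math.exp(-mu) * mu**n / math.factorial(n)

def multi_photon_prob(mu):
    """Probability of two or more photons in a pulse -- the events the
    photon-number-splitting attack exploits."""
    return 1.0 - photon_number_prob(0, mu) - photon_number_prob(1, mu)

# Purely illustrative intensities (not the optimized decoy values):
for mu in (0.5, 0.1, 0.001):
    print(mu, multi_photon_prob(mu))
```

Weaker intensities suppress the vulnerable multi-photon events but also suppress detections, which is the trade-off the decoy-state analysis balances.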

Finite-size decoy-state key rate
Recall that the key rate for this particular variation of the decoy-state BB84 QKD scheme using one-way classical communication is lower-bounded by [4, 6] … where p_X denotes the probability that Alice (Bob) uses X as the preparation (measurement) basis, h(·) is the binary entropy function, e_p is the phase error rate of the single-photon events in the raw key, and Λ_EC is the actual number of bits of information that leaks to Eve as Alice and Bob perform error correction on their raw bits. It is given by … if they use the most efficient (classical) error-correcting code to do the job. In addition, ℓ_raw is the raw sifted key length measured in bits, ε_cor is an upper bound on the chance that the final secret keys shared between Alice and Bob are different, and … Here p_abort is the chance that the scheme aborts without generating a key, ρ_AE is the classical-quantum state describing the joint state of Alice and Eve, U_A is the uniform mixture of all the possible raw keys created by Alice, ρ_E is the reduced density matrix of Eve, and ‖·‖_1 is the trace norm [17-19]. Thus, Eve's information on the final key is at most ε_sec. Last but not least, χ is a QKD-scheme-specific factor which depends on the detailed security analysis used. In general, χ may also depend on other factors used in the QKD scheme, such as the number of photon intensities k [4, 6].
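The binary entropy function in the key rate bound, together with the error-correction leakage Λ_EC in the most-efficient-code limit described above, can be sketched as follows (the efficiency factor f_ec is our own illustrative addition; the text assumes the ideal limit f_ec = 1):

```python
import math

def binary_entropy(x):
    """Binary entropy h(x) = -x log2(x) - (1 - x) log2(1 - x),
    with the convention h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def error_correction_leakage(l_raw, e_obs, f_ec=1.0):
    """Bits leaked to Eve during error correction on l_raw raw bits with
    observed bit error rate e_obs.  f_ec = 1 is the most-efficient-code
    limit assumed in the text; f_ec > 1 would model practical codes
    (an illustrative extension, not from the paper)."""
    return f_ec * l_raw * binary_entropy(e_obs)

print(binary_entropy(0.5))                 # 1.0
print(error_correction_leakage(10**6, 0.03))
```
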
For BB84, e_p → e^Z_1 as ℓ_raw → +∞. More importantly, the best known bound on the difference between e_p and e^Z_1 due to the finite-sample-size correction, obtained using properties of the hypergeometric distribution, is reported in [6, 20] (if k is even). Note that in our subsequent analysis we also need the following two inequalities, which can be proven using the same method as inequality (7b): … Substituting into inequality (3) gives the following lower bound of the key rate … (The other cases can be dealt with in the same way by changing the definition of b_n accordingly; but those cases are not interesting, for they likely imply R = 0 in realistic channels.) Note that the worst-case key rate corresponds to the situation in which the spin-flip and phase-shift errors in the raw key are uncorrelated, so that Alice and Bob cannot use correlation information to increase the efficiency of entanglement distillation. Thus, we may separately consider the statistical fluctuations in the Q^X_{μ_n}'s and in e^Z_1 in the FRKL situation.

An improved version of McDiarmid inequality
We now prove an improved version of a deep mathematical statistics result before applying it to improve the key rate R. Our insight is that the statistical fluctuations in the Q^X_{μ_n}'s and in e^Z_1 can be bounded using a McDiarmid-type inequality. The first inequality of this type was proven for the case of statistically independent random variables using a martingale technique in [13]. The inequality we need here is a straightforward extension of theorem 6.7 in [13] and theorem 2.3 in [21] to statistically dependent random variables. (See also a closely related version in [22].) We first introduce the concept of a centering sequence [21]. The definition below is written in a manner more transparent to physicists.
Definition 1. Let W = (W_1, W_2, …, W_t) be a random real vector whose components W_i are possibly statistically dependent random variables, each taking values in a set Ω_i. Let f_m be a real-valued bounded function of W. Set … Note that the centering property implicitly depends on the distribution of W through the conditional expectations. Theorem 1. Using the notation in definition 1, for a fixed … Here the symbols ess sup and ess inf denote the essential supremum and infimum, respectively. Further set r̂ … where Pr denotes the occurrence probability of its argument.
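To build intuition for definition 1, consider the canonical example of such a sequence when the W_i's happen to be independent: the Doob sequence V_m = E[f(W) | W_1, …, W_m]. The toy sketch below (our own illustrative construction, not taken from [21]) evaluates this sequence for f(W) = Σ_i W_i with independent fair bits; V_0 is the a priori expectation, V_t is f itself, and consecutive terms differ by a bounded amount:

```python
import random

def doob_sequence(p, bits):
    """Doob (centering-type) sequence V_m = E[f | W_1, ..., W_m] for
    f(W) = W_1 + ... + W_t with independent bits that equal 1 with
    probability p: the first m outcomes are known exactly, while the
    remaining t - m bits each contribute their expectation p."""
    t = len(bits)
    return [sum(bits[:m]) + (t - m) * p for m in range(t + 1)]

random.seed(1)
bits = [random.random() < 0.5 for _ in range(8)]
V = doob_sequence(0.5, bits)
# V[0] is the a priori expectation of f, V[-1] equals f(bits) itself,
# and every one-step change |V[m+1] - V[m]| is exactly 0.5 here.
```
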
Remark 1. This version of the McDiarmid inequality is slightly stronger than the one reported in [13] because we also utilize information on w in obtaining r̂, whereas the original version in [13] used the worst-case w. The proof of this theorem is based on that of theorem 2.2 in [21].
To proceed, we consider the function … with equality holding whenever … Therefore, Taylor's theorem gives … for any h > 0. The rhs of inequality (14) is minimized by setting h = 4δ/r̂²; and with this h, inequality (14) becomes inequality (11a).
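Assuming the displayed steps follow the standard Hoeffding/McDiarmid moment-bound argument, the minimization over h can be checked numerically; the values of δ and r̂ below are arbitrary illustrative choices:

```python
import math

def moment_bound(h, delta, r_hat):
    """The objective exp(-h*delta + h**2 * r_hat**2 / 8) that appears in
    Hoeffding/McDiarmid-type tail-probability proofs."""
    return math.exp(-h * delta + h**2 * r_hat**2 / 8.0)

delta, r_hat = 0.01, 0.2
h_star = 4.0 * delta / r_hat**2            # claimed analytic minimizer
best = moment_bound(h_star, delta, r_hat)

# A grid search confirms no h does better, and the minimum matches the
# familiar exp(-2*delta**2 / r_hat**2) tail value:
grid = [moment_bound(0.01 * i, delta, r_hat) for i in range(1, 500)]
assert best <= min(grid) + 1e-12
assert abs(best - math.exp(-2.0 * delta**2 / r_hat**2)) < 1e-12
```
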
Finally, by applying the same argument to the −f_m's instead of the f_m's, we get inequality (11b). This completes our proof. Corollary 1. Let W = (W_1, W_2, …, W_t) be a random vector such that each W_m takes values from the same bounded set of real numbers … Proof. This proof is adapted from example 1 in [21]. From definition 1, it suffices to show that … is a decreasing function of υ. Suppose the W_i's are drawn from a collection of M objects, out of which M_j take the value a_j for each j. Suppose further that, among the W_i's with 1 ≤ i < m, there are m_j of them taking the value a_j for each j. Then the probability that … for all m and V_{m−1}. Hence, it is proved. Remark 2. The above corollary was first proven by Hoeffding in [12] without using the concept of a centering sequence; indeed, corollary 1 is more often referred to as Hoeffding's inequality. Hoeffding's inequality has been used to compute the provably secure key rate R for finite raw key length ℓ_raw in previous works [4, 6-8]. In section 5 below, we use the above corollary to bound e^Z_1 in Methods A and B.
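A quick Monte Carlo sketch (with an artificial finite population of our own choosing) illustrates that the Hoeffding tail bound of corollary 1 indeed holds for sampling without replacement:

```python
import math
import random

def hoeffding_tail(n, delta, width):
    """One-sided Hoeffding tail exp(-2*n*delta**2 / width**2) for the
    mean of n draws of range `width`; by corollary 1 it also covers
    draws without replacement from a finite population."""
    return math.exp(-2.0 * n * delta**2 / width**2)

random.seed(7)
population = [0.0] * 500 + [1.0] * 500     # artificial population, mean 0.5
n, delta, trials = 100, 0.1, 5000
exceed = 0
for _ in range(trials):
    sample = random.sample(population, n)   # sampling without replacement
    if sum(sample) / n - 0.5 >= delta:
        exceed += 1
print(exceed / trials, "<=", hoeffding_tail(n, delta, 1.0))
```

The empirical exceedance probability (a few percent here) sits well below the bound exp(−2) ≈ 0.135, as it must; without-replacement sampling is in fact more concentrated than the independent case the bound is usually quoted for.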
Corollary 2. Let P : {1, 2, …, t} → {1, 2, …, t} be an arbitrary but fixed permutation. Suppose … where d is a small correlation term of order … Furthermore, by picking x to be the rhs of inequality (17), inequality (11) holds with … Proof. Since … is also a multivariate hypergeometrically distributed random vector, we only need to prove the case in which P is the identity permutation, as the general case can be proven in the same way. From equation (15) …
Observe that … From inequality (15), … we may expand … as a power series in Δw via Taylor's theorem. In this way, the rhs of equation (20) can be expressed in the form … where … and the correlation term g_2 obeys … A sufficient condition for {V_m} to be centering is g_1 Δw + g_2 ≥ 0 for all m and w. Moreover, this condition is satisfied if … We now switch back to the situation of an arbitrary but fixed permutation P. To optimize the bound in theorem 1, we use the freedom of picking a suitable permutation P to minimize r̂. From theorem 1, …, which is a decreasing function of both x and w. Hence, the optimal situation occurs when we pick the permutation so that w_{P(i)} is a decreasing function of i. In this case, … is a decreasing function of m. In this way, we arrive at r̂_2 in equation (18). Remark 3. The ability to optimize r̂ by picking the best possible permutation P, and hence the best possible centering sequence, is a novel feature of the McDiarmid inequality. As far as we know, this feature has not been exploited before. In contrast, from the proof of corollary 1, it is clear that the value of r̂ obtained from Hoeffding's inequality does not depend on the choice of P. In section 5 below, we fully exploit this freedom of picking P to bound e^Z_1 in Method D. Note, however, that the above corollary requires knowledge of the M_j's; in addition, r̂_2 is written as a rather involved sum. Let us replace every w_{P(i)} in equation (18) by the average observed value, namely w̄ = Σ_{i=1}^t w_i / t. In this way, r̂ would increase by a factor of … as well (whether or not inequality (17) holds). Then r̂ would change by a factor of … most of the time due to statistical fluctuation. Thus, in practice, we may replace r̂ in equation (18) by the following more convenient and useful expression, which does not depend on knowledge of the M_j's: … This expression for r̂ shall be used to bound e^Z_1 in Method C, to be reported in section 5.

Application of the improved McDiarmid inequality in finding the key rate
There is a subtlety in applying theorem 1 to study the statistical fluctuation of e^Z_1. A naive way to do so is to use inequalities (5) and (7), treat the Q^Z_{μ_n}'s as random variables, and directly apply theorem 1 and definition 1 to the rhs of the resultant inequality. Nonetheless, this does not work, for the rhs of that inequality need not be bounded. Besides, the bound obtained is not strong enough even if we ignore the boundedness problem.
To proceed, we first write Q^Z_{μ_n} s^Z_{μ_n} = Σ_j W_{nj}, where s^Z_{μ_n} is the number of photon pulses that Alice prepares using photon intensity μ_n and that Bob tries to measure (but may or may not detect) in the Z basis. In addition, W_{nj} denotes the possibly correlated random variable whose value is 1 (0) if the jth photon pulse among the s^Z_{μ_n} photon pulses is detected (not detected). Here W_{i,Z} is the random variable that takes the value … if the ith photon pulse that is prepared by Alice and then successfully measured by Bob, both in the Z basis, is in fact prepared using photon intensity μ_n. Recall that Eve knows the number of photons in each pulse and may act accordingly. However, she knows neither the photon intensity parameter used in each pulse nor the preparation basis until the pulse is measured by Bob. Hence, the W_{i,Z}'s may be correlated. Actually, the most general situation is that the W_{i,Z}'s are drawn from a larger population without replacement. That is to say, these random variables obey the multivariate hypergeometric distribution. By the same argument, inequalities (7c) and (7d) give … Incidentally, this is the method reported in the preprint by one of us [14]. Moreover, similar bounds on the statistical fluctuations of the Q^B_{μ_n}'s and the Q^B_{μ_n} E^B_{μ_n}'s have been obtained using Hoeffding's inequality in [4, 6]. That method is not as effective as the one reported here since it deals only indirectly with the finite-sampling statistical fluctuations of Y^Z_1 and Y^Z_1 e^Z_1.
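The multivariate hypergeometric model above can be mimicked numerically as follows; the class counts and weight values are hypothetical placeholders, not the actual protocol weights:

```python
import random

def multivariate_hypergeometric_sample(counts, values, n, rng):
    """Draw n values without replacement from a population holding
    counts[j] copies of values[j] -- the multivariate hypergeometric
    model obeyed by the W_{i,Z}'s in the text."""
    pool = [v for c, v in zip(counts, values) for _ in range(c)]
    rng.shuffle(pool)
    return pool[:n]

rng = random.Random(3)
counts, values = [600, 300, 100], [1.0, 2.0, 6.0]   # hypothetical classes
draw = multivariate_hypergeometric_sample(counts, values, 50, rng)
pop_mean = sum(c * v for c, v in zip(counts, values)) / sum(counts)
# The sample mean concentrates around pop_mean, but successive draws are
# correlated: each removed object changes the composition of the pool.
print(pop_mean, sum(draw) / len(draw))
```

This correlation is precisely why the central limit theorem for independent samples cannot be invoked, while corollaries 1 and 2 still apply.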
B. Alternatively, we may use inequality (25b) and corollary 1 to bound e^Z_1. Specifically, the true value of … C. An even more interesting way to bound e^Z_1 is to use inequality (25b), corollary 2 and remark 3. Since ⟨w⟩ in this case is the measured value of Y^Z_1 e^Z_1, which is lower-bounded by inequality (7e), remark 3 gives … with probability at least … D. There is an alternative, quite aggressive way to apply inequality (25b) and corollary 2 to find Δe^Z_1 in inequality (26f). Since the w_i's take values in a set of at most k elements, the rhs of the above inequality can be simplified to a sum of at most k terms. To be more explicit, suppose the descending sequence {w_{P(m)}}_{m=1}^t contains n^(1) copies of w^(1), followed by n^(2) copies of w^(2), and so on, until ending with n^(k) copies of w^(k). Surely, … where r̂ is given by the rhs of inequality (26i).
In reality, we use the minimum of the above four methods to upper-bound the value of e^Z_1. To study the statistical fluctuation of R, it remains to consider the fluctuation of Q^X_{μ_n} in the first term of expression (8). (Although the second term also depends on the Q^X_{μ_n}'s implicitly through Λ_EC, statistical fluctuation is absent from this term. This is because Λ_EC is the amount of information leaking to Eve during the classical post-processing of the measured raw bits. Thus, it depends on the observed values of the Q^X_{μ_n}'s and E^X_{μ_n}'s rather than their true values.) Using the same technique as in the estimation of the statistical fluctuation of e^Z_1, the first term of expression (8) can be rewritten as … for n = 1, 2, ..., k − 1 [6]. This means the number of photon intensities k used in practice should be ≲ 10.

Performance analysis
We study the following quantum channel, which models a commonly used 100 km long optical fiber in QKD experiments, to test the performance of this new key rate formula in a realistic situation. The findings here are generic, as the general trend and performance improvement are also found in other situations, including the same fiber at different lengths as well as other randomly generated quantum channels. The yield and error rate of this quantum channel are given by … In addition, the transmittance of the system is η_sys = 0.1 η_ch, and the transmittance of the fiber is given by η_ch = 10^(−0.2 L / 10), where L is the length of the fiber in km. These parameters are obtained from an experiment on a 100 km long optical fiber in [24], and have been used in [4, 6] to study the performance of decoy-state QKD in the FRKL situation. We also follow [4, 6] by using the following security parameters: … where … is the length of the final key measured in bits. Note that κ can be interpreted as the secrecy leakage per final secret bit. Table 1 compares the optimized key rates of the state-of-the-art method reported recently in equation (3) of [6] with those of equation (27) for various s_X and k. (The former is the best provably secure key rate obtained before the posting of the original proposal using the McDiarmid inequality by one of us in [14].) The optimized rates are found by fixing the minimum photon intensity to 1 × 10^(−6) while maximizing over p_X as well as all the other photon intensities μ_n and all the p_{μ_n}'s. This optimization is done by a Monte Carlo method plus simulated annealing with a sample size of at least 10^10 for each data entry in table 1. For Method D, the optimized key rate depends on the actual Z-basis measurement results. Here we simply fix the n^(i)'s to their expectation values. The table clearly shows that using the McDiarmid inequality improves the optimized key rates in almost all cases.
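The channel model above is easy to reproduce; the following sketch implements the stated fiber-loss law (0.2 dB/km) and system transmittance:

```python
def fiber_transmittance(length_km, loss_db_per_km=0.2):
    """eta_ch = 10**(-loss_db_per_km * L / 10) for a fiber of length L km."""
    return 10.0 ** (-loss_db_per_km * length_km / 10.0)

def system_transmittance(length_km):
    """eta_sys = 0.1 * eta_ch, per the channel model in the text."""
    return 0.1 * fiber_transmittance(length_km)

print(fiber_transmittance(100.0))    # ~= 0.01
print(system_transmittance(100.0))   # ~= 0.001
```

At L = 100 km the fiber alone attenuates the signal a hundredfold, which is why finite-size fluctuations in the few surviving detection events matter so much.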
It also shows that, for any method used, the provably secure key rate increases as the raw key length s_X increases; and they all gradually converge to the same infinite-size key rate. Besides, the asymptotic key rate generally increases with k. These trends are natural, as a longer s_X implies smaller finite-size statistical fluctuations, and a larger number of decoys k allows better estimation of the bounds on the various yields and error rates. Among the four methods introduced here, Method A almost always gives the lowest provably secure key rate. This implies that it is more effective to estimate a lower bound for Y^Z_1 via separately bounding Y^Z_1 e^Z_1 and Y^Z_1 (1 − e^Z_1). Method B is slightly better than Method C for large s_X (say, when s_X ≳ 10^8, the improvement is about a few percent). Method D is about 5%-15% better than Method C when 10^8 ≲ s_X ≲ 10^11. This is not unexpected for the following reason. Although Method D is more aggressive than Method C in estimating the statistical fluctuation of e^Z_1 and hence the key rate, it requires an additional condition for lower-bounding ⟨w⟩. Thus the value of χ for Method D is 1 greater than that of Method C. As a result, for a small raw key length, the improvement in estimating e^Z_1 by Method D may not compensate for the need to control the statistical fluctuation of one more variable. Table 1 also shows that Method D is about 5%-20% better than Method B when 10^8 ≲ s_X ≲ 10^11. Furthermore, for fixed s_Z and κ and a fixed method of computing the bound on e^Z_1, the provably secure key rate reaches a maximum at a finite k. This is not unexpected because, even though the χ we deduce is independent of the number of photon intensities k used, Width(·) diverges as k → +∞. Last but not least, in the case of k = 4, Method D always gives the best key rate. We do not have a good explanation for this observation; it is instructive to study it in future.

Summary and outlook
To summarize, for s_X ≈ 10^5–10^6, at least one of the four methods reported here produces a provably secure key rate that is at least twice that of the state-of-the-art method. And for s_X ≈ 10^8, Method D is at least 40% better than the state-of-the-art method. These improvements are of great value in practical QKD because the computational and time costs of classical post-processing can be quite high when the raw key length s_X is long. More importantly, the McDiarmid inequality method reported here is effective in increasing the key rate for real-time, or close to real-time, on-demand generation of secret keys, an application that will become possible in the near future with the advancement of laser technology. It would be instructive to extend our McDiarmid inequality method to handle the case of FRKL decoy-state measurement-device-independent QKD and compare it with existing methods in the literature, such as the one that uses the Chernoff bound [25] and its extension specifically for decoys with four different intensities [26].
In addition to QKD, powerful concentration inequalities in statistics such as the McDiarmid inequality could also be used beyond straightforward statistical data analysis. One possibility is to use them to construct model-independent tests for physics experiments that involve a large number of parameters but relatively few data points.