Optimal randomness generation from optical Bell experiments

Genuine randomness can be certified from Bell tests without any detailed assumptions about the working of the devices with which the test is implemented. An important class of experiments for implementing such tests consists of optical setups based on polarisation measurements of entangled photons distributed from a spontaneous parametric down-conversion source. Here we compute the maximal amount of randomness that can be certified in such setups under realistic conditions. We provide optimal values for the physical parameters, some of them counter-intuitive, and certify up to four times more randomness than previous methods.


Introduction
Quantum systems have the potential to provide a strong form of randomness which cannot be attributed to incomplete knowledge of any classical variable of the system. At the basis of such genuine randomness lies a quantitative relation between the amount by which a Bell inequality is violated [1] and the degree of predictability of the results of the test [2]. Intuitively, the violation of a Bell inequality certifies the presence of nonlocal correlations [3], and in turn, this guarantees that the outcomes of the measurements cannot be determined in advance [4,5]. Furthermore, this genuine randomness can be certified without any detailed assumptions about the internal working of the devices used, that is, in a "device-independent" fashion. Device independence is advantageous since it provides immunity to attacks that exploit imperfections in the physical implementation, to which device-dependent protocols are susceptible [6]. For this reason, device-independent randomness generation has recently received much attention [7,8,9,10,11,12].
An intense research effort has been devoted to the experimental realisation of device-independent randomness generation. A few years ago, Pironio et al. [2] implemented the first proof-of-principle experiment. It involved two entangled atomic ion qubits confined in two independent vacuum chambers separated by approximately one meter. This implementation, which was based on light-matter interaction, managed to certify 42 random bits over a period of one month.
The principal challenge for a device-independent randomness generation experiment is that it must close the detection loophole [13,14], i.e. it must provide a Bell inequality violation without post-selection on the data, since otherwise violation can be faked by classical resources [15] and no genuine randomness can be guaranteed. The detection loophole was first successfully closed on several systems relying on light-matter interaction; see for instance [16,17,18]. Very recently it has been closed in optical setups [19,20], based on polarisation measurements of entangled photons distributed from a spontaneous parametric down-conversion (SPDC) source. These optical implementations represent an important achievement as they enable much higher rates of genuine random bits per time unit.
Given these experimental achievements, the natural question that arises is how to generate this genuine randomness efficiently. What is the maximal amount of randomness that a given physical implementation allows for? And most importantly, how should the relevant physical parameters of the setup be tuned to provide such an optimal amount? Here we answer these questions for the case of optical implementations based on SPDC, for which a thorough physical characterization has been recently presented in [21].
We start out by constructing a general framework and methods for optimal randomness certification in Bell experiments. The idea is to keep as much information as possible by avoiding any sort of binning of outcomes, then to use the methods recently introduced in [7] to estimate randomness by constructing a device-independent guessing probability optimized over all possible Bell inequalities, and finally to optimize the latter quantity over all the tunable physical parameters of the experiment. We then narrow our focus to entirely optical polarisation-based implementations (e.g. [19,20]). We first characterize the realistic parameters of such Bell setups and then apply our methods to determine optimal amounts of global and local randomness under realistic conditions. We provide interesting bounds on the experimental parameters (some of them counter-intuitive and perhaps unexpected) and certify up to four times more randomness than what a standard analysis, based on a binning of the outcomes and on the CHSH inequality [22], can achieve [2].

Methods
Here we describe methods that allow for optimal device-independent randomness certification. The general idea consists of three steps, which are given in Box 1. Since we make no physical characterization of the source or the devices, the results are kept general and can be applied to any bipartite Bell experiment free of the detection loophole (cf. [16,17,18,19,20]).

Scenario
To begin, we recall the device-independent scenario [2,7,23]. Two parties, Alice and Bob, are located in two secure laboratories from which no unwanted classical information can leak out. At each round of the experiment, they receive a quantum state ρ AB from a source S, perform on it one out of m A (m B ) possible measurements x = 0, 1, ..., m A − 1 (y = 0, 1, ..., m B − 1), and retrieve one out of o A (o B ) possible outcomes a = 0, 1, ..., o A − 1 (b = 0, 1, ..., o B − 1). We make no assumption on ρ AB other than the fact that it is a quantum state. In fact, ρ AB could have any dimension, and could even be correlated with another quantum system in the possession of a malicious eavesdropper Eve ‡, such that ρ AB = Tr E ρ ABE .
Moreover, Alice and Bob do not trust the devices they use to measure ρ AB . These devices can be thought of as measurements characterized by positive operator-valued measures (POVMs) with elements {M a|x } and {M b|y } acting on ρ AB . Their probabilistic behaviour is given by Born's rule,

p(ab|xy) = Tr[ρ AB M a|x ⊗ M b|y ]. (1)

There are a total of m A m B o A o B such probabilities, which can be seen as the components of a vector p = {p(ab|xy)} ∈ R^(m A m B o A o B ). We call p the behaviour associated with the quantum realization Q defined by the state ρ AB and the measurements with elements {M a|x } and {M b|y }.

‡ We consider that Eve is limited by the laws of quantum mechanics. We also assume that the behaviour of the boxes is independent and identically distributed from one round to another, though, interestingly, the bound (3) has been proved secure under less demanding assumptions (see [24]).
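As an illustration of how a behaviour vector is assembled from a quantum realization, the following sketch (our own construction, not taken from the paper; the singlet state and the measurement angles are arbitrary choices) evaluates Born's rule for every combination of settings and outcomes:

```python
import numpy as np

def projector(theta, phi, outcome):
    """Qubit projector onto the +/- direction (theta, phi) of the Bloch sphere."""
    n = np.array([np.sin(theta) * np.cos(phi),
                  np.sin(theta) * np.sin(phi),
                  np.cos(theta)])
    pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
             np.array([[0, -1j], [1j, 0]]),
             np.array([[1, 0], [0, -1]], dtype=complex)]
    sign = 1 if outcome == 0 else -1
    return 0.5 * (np.eye(2) + sign * sum(c * s for c, s in zip(n, pauli)))

# Singlet state (an arbitrary choice for this illustration)
psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
rho = np.outer(psi, psi.conj())

# Two Bloch-sphere measurement directions per party (hypothetical angles)
alice = [(np.pi / 2, 0.0), (np.pi / 2, np.pi / 2)]
bob = [(np.pi / 2, np.pi / 4), (np.pi / 2, -np.pi / 4)]

# Behaviour vector p(ab|xy) = Tr[rho (M_a|x tensor M_b|y)], indexed [a, b, x, y]
p = np.zeros((2, 2, 2, 2))
for x, (ta, fa) in enumerate(alice):
    for y, (tb, fb) in enumerate(bob):
        for a in range(2):
            for b in range(2):
                M = np.kron(projector(ta, fa, a), projector(tb, fb, b))
                p[a, b, x, y] = np.real(np.trace(rho @ M))
```

Each of the m A m B = 4 setting pairs yields a normalized distribution over the o A o B = 4 outcome pairs, giving the 16-component vector p of the text.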

Bounding the device-independent guessing probability
The optimal amount of randomness that Alice and Bob can certify from an observed quantum behaviour p is measured here by the min-entropy of the device-independent guessing probability G p [25], i.e. h = − log 2 (G p ). To estimate G p , consider that for some round of the experiment Alice and Bob have chosen and performed some measurements x = x* and y = y* on ρ AB . Without loss of generality, any strategy z of Eve can be seen as a POVM measurement with o A o B elements {M e|z } that she applies on her reduced state ρ E = Tr AB ρ ABE . Whenever she obtains the output e = (a*, b*) she guesses that Alice's (Bob's) outcome was a* (b*). It can be shown that G p , the average probability that Eve correctly guesses the outputs of Alice and Bob's boxes using an optimal strategy, is the solution to the following conic linear program [7,8]:

G p (x*, y*) = max over {p e } of Σ e p e (e|x* y*) subject to Σ e p e = p and p e ∈ Q for all e. (2)

Each p e is an unnormalized behaviour "prepared" for Alice and Bob and conditioned on the outcome e of the measurement with POVM elements {M e|z } performed by Eve. Hence, the probability that p e is prepared is the probability that Eve obtains the corresponding outcome e, i.e. p(e|z) = Tr[ρ E M e|z ]. To be precise, p e (ab|xy) = Tr[ρ ABE M a|x ⊗ M b|y ⊗ M e|z ], and Q is the set of all such unnormalized quantum behaviours. The first constraint in the program translates the fact that the behaviours p e should on average reproduce Alice and Bob's observed behaviour p. The second constraint demands that every behaviour should be quantum§. The program maximizes the success of Eve's strategy over all possible states ρ ABE and POVMs {M e|z } compatible with the observed behaviour p. The program presented in (2) is in general intractable due to the lack of a precise characterization of Q, but semi-definite programming (SDP) relaxations similar to the ones presented in [26] can be used to put bounds on G p . One then defines a convergent hierarchy of convex sets having a precise characterization and such that Q 1 ⊇ Q 2 ⊇ ... ⊇ Q [7,26]. This hierarchy approximates the quantum set Q from the outside, and thus one can relax the problem (to the order k) by replacing Q in (2) by Q k . The solution G k p of the k-th SDP program sets an upper bound on the guessing probability G p , which in turn sets a lower bound h k = − log 2 (G k p ) on the number h of global random bits that are certified from p and from the measurements (x*, y*).

§ A behaviour p is said to be quantum whenever there exists a realization Q (i.e. a quantum state + measurements) which reproduces p through Born's rule (1).
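Implementing the SDP hierarchy is beyond a short snippet, but the structure of program (2) can be illustrated with a simplified, hedged sketch: below we relax the quantum set to the larger no-signalling polytope, which turns the program into a linear program. The resulting bound on the guessing probability is looser than the level-k SDP bound but still valid. The Tsirelson-point input behaviour and the scipy-based solver are our own choices, not the paper's.

```python
import numpy as np
from scipy.optimize import linprog

def ns_guessing_probability(p, x_star=0, y_star=0):
    """Upper-bound Eve's probability of guessing the pair (a, b) at (x*, y*)
    by optimizing program (2) over the no-signalling polytope instead of Q.
    p has shape (2, 2, 2, 2), indexed as p[a, b, x, y]."""
    nvar = 4 * 16  # 4 strategies e, each an unnormalized 16-entry behaviour
    idx = lambda e, a, b, x, y: e * 16 + a * 8 + b * 4 + x * 2 + y

    A_eq, b_eq = [], []
    # Constraint 1: the strategies must on average reproduce p
    for a in range(2):
        for b in range(2):
            for x in range(2):
                for y in range(2):
                    row = np.zeros(nvar)
                    for e in range(4):
                        row[idx(e, a, b, x, y)] = 1.0
                    A_eq.append(row); b_eq.append(p[a, b, x, y])
    # Constraint 2 (relaxed): each unnormalized q_e is no-signalling
    for e in range(4):
        for a in range(2):
            for x in range(2):
                row = np.zeros(nvar)
                for b in range(2):
                    row[idx(e, a, b, x, 0)] += 1.0
                    row[idx(e, a, b, x, 1)] -= 1.0
                A_eq.append(row); b_eq.append(0.0)
        for b in range(2):
            for y in range(2):
                row = np.zeros(nvar)
                for a in range(2):
                    row[idx(e, a, b, 0, y)] += 1.0
                    row[idx(e, a, b, 1, y)] -= 1.0
                A_eq.append(row); b_eq.append(0.0)

    # Objective: maximize sum_e q_e(e | x*, y*), with e encoded as 2a + b
    c = np.zeros(nvar)
    for a in range(2):
        for b in range(2):
            c[idx(2 * a + b, a, b, x_star, y_star)] = -1.0
    res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=[(0, None)] * nvar, method="highs")
    return -res.fun

# Tsirelson-point correlations: p(ab|xy) = [1 + (-1)^(a+b+xy)/sqrt(2)] / 4
p = np.zeros((2, 2, 2, 2))
for a in range(2):
    for b in range(2):
        for x in range(2):
            for y in range(2):
                p[a, b, x, y] = (1 + (-1) ** (a + b + x * y) / np.sqrt(2)) / 4

G = ns_guessing_probability(p)
h = -np.log2(G)  # certified global random bits under the no-signalling relaxation
```

Since the no-signalling polytope strictly contains every Q k , the value G computed here upper-bounds the SDP value G k p, and h lower-bounds the certifiable randomness under this weaker relaxation.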
It is worth mentioning that the methods presented so far can be adapted straightforwardly to local randomness evaluation. In this case, the situation is considered from Alice's perspective, for example, and a program equivalent to (2) is derived to obtain the local guessing probability G p (x*). Computationally speaking, local randomness is appealing as the number of POVM elements of Eve's strategies is reduced from o A o B to o A , since Eve only needs to guess Alice's outcome. To conclude this section, notice that the optimal Bell inequality which yields G k p can be accessed from the dual formulation of (2). The advantage with respect to previous methods (which assess the problem via a fixed Bell inequality, e.g. [2]) has been found to be significant in both [7] and [8,9,10].

Keeping as much data as possible
In subsection 2.2 we discussed how to quantify the maximal amount of randomness available to Alice and Bob from an observed behaviour p. Still, there are several degrees of freedom in p that can be further optimized to provide even more randomness. More precisely, tailoring these degrees of freedom leads to different behaviours, which in turn yield different, and hopefully higher, amounts of randomness. We can distinguish two types of such degrees of freedom: those that require adjustments in the experimental setup (e.g. increasing the efficiency of the detectors), and those that do not. Here we deal with the latter, and leave the former for subsection 2.4. In particular, the numbers of outcomes o A and o B can be adjusted without much experimental effort. All Bell experiments that have so far managed to close the detection loophole have relied on the violation of the CHSH inequality [22] (or similar ones [27]). This assumes the local observation of two outcomes per party. However, in addition to the two good outcomes, loss and imperfections lead to events where no detector clicks, resulting in a third outcome per party; this means that a local binning process was applied in all these experiments to reduce the size of the original behaviour to two outcomes.
It is intuitive to expect that more randomness can be certified when binning strategies are avoided; any binning strategy represents a loss of potentially useful information. Still, it could be the case that the amount of certifiable randomness is not diminished by some particular binning. Our results in section 4 show that this is not the case in general. In fact, in Appendix A we explicitly show how any binning strategy applied to CHSH correlations with inefficient detectors systematically decreases the amount of certifiable randomness. Hence, to certify optimal amounts of randomness, Alice and Bob must keep the numbers of outcomes o A and o B as high as possible.
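To make the effect of binning concrete, here is a small self-contained sketch (our own toy model, not the paper's): starting from ideal CHSH correlations, an independent-detection model with efficiency eta adds a third "no-click" outcome, and the standard binning maps it back to outcome 0. Under this model the binned CHSH value degrades as 2*sqrt(2)*eta^2 + 2*(1-eta)^2, whereas the unbinned three-outcome behaviour keeps the no-click information intact.

```python
import numpy as np

def tsirelson_behaviour():
    """Ideal two-outcome CHSH correlations p(ab|xy), indexed [a, b, x, y]."""
    p = np.zeros((2, 2, 2, 2))
    for a in range(2):
        for b in range(2):
            for x in range(2):
                for y in range(2):
                    p[a, b, x, y] = (1 + (-1) ** (a + b + x * y) / np.sqrt(2)) / 4
    return p

def add_no_click(p2, eta):
    """Append a 'no click' outcome (label 2), assuming each party's detector
    fires independently with probability eta (a simplifying assumption)."""
    p3 = np.zeros((3, 3, 2, 2))
    for a in range(2):
        for b in range(2):
            p3[a, b] += eta * eta * p2[a, b]
            p3[2, b] += (1 - eta) * eta * p2[a, b]
            p3[a, 2] += eta * (1 - eta) * p2[a, b]
            p3[2, 2] += (1 - eta) ** 2 * p2[a, b]
    return p3

def bin_no_click(p3):
    """Standard local binning: relabel 'no click' as outcome 0."""
    p2 = p3[:2, :2].copy()
    p2[0, :] += p3[2, :2]
    p2[:, 0] += p3[:2, 2]
    p2[0, 0] += p3[2, 2]
    return p2

def chsh(p2):
    E = lambda x, y: sum((-1) ** (a + b) * p2[a, b, x, y]
                         for a in range(2) for b in range(2))
    return E(0, 0) + E(0, 1) + E(1, 0) - E(1, 1)

eta = 0.9
p3 = add_no_click(tsirelson_behaviour(), eta)  # 3 outcomes per party: kept
binned = bin_no_click(p3)                      # 2 outcomes: information lost
```

Any randomness analysis run on `binned` can only be as good as one run on `p3`, since the former is a deterministic function of the latter.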

Taking experimental parameters into account
The observed quantum behaviour p possesses physical degrees of freedom that can be adjusted in the experimental setup to produce higher amounts of randomness. The solution of (2) can be minimized over all the possible realistic values that such parameters (which we label P) can take. In this way, the optimal amount of randomness that can be certified to the order k is the solution of:

G k (x*, y*) = min over P of G k p(P) (x*, y*). (3)

In particular, notice that this program optimizes G k p (x*, y*) over the numbers of measurements m A and m B , which are implicit quantities in P (see also section 4.1). The methods presented above are general and can be adjusted to any bipartite Bell experiment. In the following we focus on the architecture of optical implementations based on polarisation measurements of entangled photons distributed from an SPDC source (see Fig. 1), which was thoroughly analysed in [21].
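Structurally, program (3) nests a parameter search around the SDP bound of (2). The sketch below shows only that outer structure; `toy_bound` is a smooth placeholder standing in for the level-k SDP (so the snippet runs), and the optimizer choice is our own assumption, not the paper's.

```python
import numpy as np
from scipy.optimize import minimize

def optimal_randomness(inner_guessing_bound, P0, bounds):
    """Outer minimization of program (3): tune the physical parameters P so as
    to minimize the (upper bound on the) guessing probability G^k_{p(P)}."""
    res = minimize(inner_guessing_bound, P0, bounds=bounds, method="L-BFGS-B")
    G = res.fun
    return res.x, -np.log2(G)  # optimal parameters and certified bits h

# Placeholder for the SDP bound: a quadratic with minimum G = 0.6 at P = (1, 0.3)
toy_bound = lambda P: 0.6 + 0.2 * ((P[0] - 1.0) ** 2 + (P[1] - 0.3) ** 2)

P_opt, h = optimal_randomness(toy_bound, P0=[0.5, 0.0],
                              bounds=[(0.0, 2.0), (-0.5, 0.5)])
```

In a real run the inner function would solve the SDP relaxation of (2) for the behaviour p(P), so each outer iteration is expensive; the bounds argument plays the role of the realistic parameter ranges discussed in section 4.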

Realistic optical implementations
The source is characterized by three adjustable quantities: two squeezing parameters g 1 and g 2 and a total number of modes N onto which the photons may be distributed. Each mode locally splits into two orthogonal polarisations. In terms of bosonic creation operators, the unnormalized state produced by S is given by [21]:

|ψ⟩ = exp( g 1 Σ i=1..N a† i b† i⊥ + g 2 Σ i=1..N a† i⊥ b† i ) |0⟩, (4)

where |0⟩ is the vacuum state associated to the 4N bosonic operators a† 1 , ..., a† N⊥ , b† 1 , ..., b† N⊥ , and the a-modes (b-modes) are distributed to Alice (Bob). All the different types of losses, including detector inefficiencies, are modelled, without loss of generality, by two beam-splitters (not shown in Fig. 1) placed at any point between the users and the source. The transmittance η of these beam-splitters is the overall detection efficiency of the experiment.
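A useful closed form in this loss model: each arm of a two-mode squeezed mode has a thermal (geometric) photon-number distribution P(n) = (1−λ)λ^n with λ = tanh²g, and a non-photon-number-resolving detector of efficiency η misses each photon independently, so it fails to click with probability Σ_n P(n)(1−η)^n = (1−λ)/(1−λ(1−η)). This is our own derivation, consistent with the click/no-click detector model of the text, and can be checked numerically:

```python
import numpy as np

LAM = lambda g: np.tanh(g) ** 2  # thermal parameter of each squeezed arm

def p_no_click_series(g, eta, nmax=500):
    """No-click probability summed term by term over the photon distribution."""
    n = np.arange(nmax)
    return np.sum((1 - LAM(g)) * LAM(g) ** n * (1 - eta) ** n)

def p_no_click_closed(g, eta):
    """Closed form of the geometric series above."""
    return (1 - LAM(g)) / (1 - LAM(g) * (1 - eta))
```

At η = 1 this reduces to the vacuum probability 1 − tanh²g = 1/cosh²g, and at η = 0 it is 1, as expected.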
The measurements are performed with polarizing beam-splitters (PBS) and half-wave plates (HWP) and quarter-wave plates (QWP) which allow splitting the orthogonal modes along arbitrary directions [19,20,21]. Each measurement u is fully characterized by two angles (θ u , φ u ) defining a projection in the Bloch sphere. Each of the parties holds two detectors, which do not resolve photon number. Hence, for each detector only the outcomes "0=No click" and "1=Click" can be distinguished, and the maximal number of local outcomes (without binning) is o A = o B = 4.

Results
In this section we apply the methods presented in section 2 to the optical setup described in section 3.

Constructing P, p and G
Considering that Alice and Bob respectively perform m A and m B measurements, the experiment is characterized by 4 + 2(m A + m B ) physical parameters: the number of modes N, the squeezing parameters g 1 and g 2 , the overall detection efficiency η, and the two angles (θ u , φ u ) of each of the m A + m B measurements. All of these parameters are adjustable within some range of realistic values, except η which, as discussed above, represents the main restriction for an optical implementation. Hence, the adjustable parameters read:

P = (N, g 1 , g 2 , θ A 1 , φ A 1 , ..., θ A m A , φ A m A , θ B 1 , φ B 1 , ..., θ B m B , φ B m B ). (5)

The analytic expression of p as a function of P and η is at first computed only for the first measurements of Alice and Bob, (θ A 1 , φ A 1 ) and (θ B 1 , φ B 1 ). In this case P consists of seven parameters, i.e. P = (N, g 1 , g 2 , θ A 1 , φ A 1 , θ B 1 , φ B 1 ). Since the numbers of outcomes are kept as high as possible (o A = o B = 4), this expression is obtained by solving a linear system of 4 × 4 = 16 equations; 15 of these equations correspond to the "no-click" probabilities of all the detectors, which can be found in the supplementary material of [21]. The remaining equation is a normalization condition. Next, this expression (obtained only for the first measurements) is generalized to arbitrary (m A , m B ) by concatenating all the individual behaviours:

p = (p 11 , p 12 , ..., p m A m B ), (6)

where p xy denotes the behaviour obtained for the pair of measurements (x, y). In particular, all the individual behaviours have the same analytical structure as the behaviour obtained for the first measurements, and hence one only needs to substitute the corresponding angles in each block of (6). This yields the desired m A m B o A o B -sized quantum behaviour (see subsection 2.1).
Finally, it is necessary to set realistic limits on P; otherwise, the minimization in (3) is unbounded. We let 1 ≤ N ≤ 100, −1/2 ≤ g 1 , g 2 ≤ 1/2 (corresponding to about 4.3 dB of squeezing) and we let all the measurement angles vary in a 2π-length interval.

Optimal randomness for m A = m B = 2
Optimal randomness is retrieved from (3) upon optimization of all adjustable parameters, which include the number of measurements in the experiment. Optimizing G k over m A and m B is of particular relevance for the setup that we consider, as distinct rotation directions of the incoming modes can be achieved by adjusting the HWP and QWP, i.e. without the need for further experimental resources. Still, to illustrate the performance of our methods we consider here the simplest case m A = m B = 2.
We find that whenever the parties are restricted to o bin = 2 outcomes, more global randomness is certified when no specific Bell inequality is considered. This was to be expected following subsection 2.2 and the line of research of [7,8,9] (see dashed and dotted curves in Fig. 2). However, we improve considerably on this expected result by suppressing the binning of the outcomes and letting o = 4, as explained in subsection 2.3 (solid curve in Fig. 2). For η = 1 our methods certify 0.74 bits of global randomness per source use, four times more than the 0.19 bits that are certified from the CHSH inequality (we provide the Bell inequality that certifies this improvement in Appendix B). All our results were obtained at the order k = 1 + AB, which corresponds to an intermediate stage between the first and second levels of the SDP hierarchy; see [26] for details.

Figure 2. Certified random bits as a function of the overall detection efficiency η, for the optimal method, binning only, and binning + CHSH.

The numerical values of the optimal parameters P are given in Fig. 3 for several values of η. Intuitively, the ratio t = tanh(g 1 )/ tanh(g 2 ) quantifies the degree of entanglement of the source, as (4) shows. For η = 1 optimal randomness is obtained from a "maximally entangled" state, i.e. t = 100%, but as η decreases t also decreases. This was to be expected for the lower values of η, where nonlocality can only be certified with non-maximally entangled states [27]. Interestingly, for η ≈ 1 the optimal measurements are not similar to the ones that intuitively maximize the violation of the CHSH inequality on two maximally entangled qubits (e.g. they are not mutually unbiased); see Appendix B for the exact expressions. That is, the optimal measurements for optimal randomness certification are not the same as those maximizing the CHSH violation.
The number of modes attains the maximal value that we allow (N = 100) whenever η is greater than 2√2 − 2 ≈ 0.83. For η smaller than this value, the single-mode case N = 1 is sufficient to obtain maximal randomness; this fact was noticed in [21] for the maximization of the CHSH inequality violation. Finally, we have found that the improvement obtained when increasing the number of modes beyond ≈ 25 is very small.

Figure 3. Color online. Optimal parameters P for different values of η. t is the ratio between tanh(g 1 ) and tanh(g 2 ), while g = max(g 1 , g 2 ). N always reaches 100. Blue (Red): optimal measurements for Alice (Bob) in the Bloch sphere representation. All these quantities were obtained after solving program (3).

Optimal randomness with more than two measurements
Our next goal is to see whether deploying more measurements yields an improvement in the number of random bits. In the previous subsection we considered the case m A = m B = 2; however, by adjusting the HWP and QWP located in front of their PBS, Alice and Bob can measure their incoming subsystem along any arbitrary polarisation direction of the Bloch sphere. These adjustments can thus be obtained at relatively low experimental cost, the main drawback being a non-negligible increase in the amount of statistical data (the size of the observed behaviour p increases with m A m B ).
Our results in Table 1 show that more measurements certify more randomness, even in scenarios for which a binning strategy had to be considered and P could not be fully optimized due to computational limitations. The time required to solve (3) becomes large as the number of measurements increases, since the total number of SDP variables describing the behaviours p e in (2) increases as (m A m B ) 2 . The increase is less dramatic when local randomness is certified, e.g. from Alice's perspective, as there are only o A (instead of o A o B ) SDP matrices in (2) for each choice of P.
In particular, with four measurements per party we certify 0.557 local random bits. This is 3 times more than the amount that is certified from the CHSH inequality (≈ 0.17 bits) under the same considerations.

Table 1. Certified random bits for different numbers of measurements (m A , m B ). The * symbol is used when full optimization was not possible, and instead: (i) the optimization was only carried out over the number of modes, with g 1 = g 2 = 0.1; (ii) the measurements were inspired from the chained inequality [28] and (iii) we considered 3 outcomes per party by locally binning the "no click-no click" and the "click-click" outcomes.

Experiments with only one detector per side
The setup depicted in Fig. 1 has been hitherto central in our analysis as it captures the general architecture of Bell experiments with entangled photons. Unfortunately, state-of-the-art superconducting detectors, i.e. those which achieve detection efficiencies above 70% and thus enable a true Bell violation without post-selection, represent an extremely high experimental cost nowadays. This situation can be alleviated (the cost can be reduced by half) by realizing that a Bell test can still be carried out with the use of only one detector on each arm of the experiment [19,20]. Given the techniques that we have shown so far, it is interesting to see how the optimal amount of randomness is affected. For a fixed overall detection efficiency η, how does the optimal amount of randomness that can be certified in an experiment with only one detector compare to that certified with two detectors?
The statistics of an experiment with only one detector per side are straightforwardly obtained from the statistics of an experiment with two detectors (those which we presented in subsection 4.1). As discussed in section 3, the possible local outcomes of an experiment with two detectors are 00, 01, 10 and 11, where the first (second) number labels the outcome of the first (second) detector: "0 = No click" and "1 = Click". Then, applying the local binning B 1Det = {00 → 0, 01 → 0, 10 → 1, 11 → 1} on Alice and Bob's sides yields the statistics of the experiment without the second detector.
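The binning B 1Det can be applied mechanically to any 4-outcome behaviour; a minimal sketch (the randomly generated input behaviour is just for illustration):

```python
import numpy as np

# Local outcomes encoded as 2*(first detector) + (second detector),
# with 0 = no click and 1 = click, so the strings 00, 01, 10, 11
# become the integers 0, 1, 2, 3.
B_1DET = {0: 0, 1: 0, 2: 1, 3: 1}  # keep only the first detector's result

def bin_one_detector(p4):
    """Map a 4-outcome behaviour p4[a, b, x, y] to the 2-outcome statistics
    observed when each party removes their second detector."""
    p2 = np.zeros((2, 2) + p4.shape[2:])
    for a in range(4):
        for b in range(4):
            p2[B_1DET[a], B_1DET[b]] += p4[a, b]
    return p2

# Illustration on a randomly generated normalized behaviour
rng = np.random.default_rng(0)
p4 = rng.random((4, 4, 2, 2))
p4 /= p4.sum(axis=(0, 1), keepdims=True)  # normalize each setting pair
p2 = bin_one_detector(p4)
```

Because the map acts locally on each party's outcome, it preserves normalization and no-signalling of the original behaviour.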
We observe that for η ≲ 0.8 no disadvantage occurs if the second detector is removed: the optimal amount of local and global randomness that can be certified in both cases is ∼ 6 × 10 −4 bits. On the other hand, as η approaches 1, removing a detector negatively affects the optimal amount of randomness: for η = 1 the optimal amount of local (global) random bits certified with two detectors is ≈ 0.45 (≈ 0.73) bits, while with only one detector the optimal amount is ≈ 0.31 (≈ 0.34) bits.

Discussion
Summarizing, in the present article we have explicitly shown the benefits of optimizing randomness in a Bell experiment over all possible inequalities, and the negative consequences that occur when information is lost through a binning of the resulting outcomes. We carefully analysed and characterized optical setups based on SPDC and certified up to four times more randomness when all of the physical parameters were optimized.
To put it in a nutshell, here are the important facts to be aware of in order to retrieve optimal amounts of randomness from an optical Bell implementation based on SPDC (and their experimental cost):
1. Keep the whole statistics and avoid binning the outcomes (no cost).
2. Use as many polarisation measurements as possible (small cost).
3. Use many modes to distribute the entangled photons (high cost in principle, but keep in mind that more than ≈ 25 modes will provide little improvement).
4. For η ≈ 1, the optimal measurements for randomness extraction are not the ones that maximize the violation of the CHSH inequality (no cost).
5. For η ≲ 0.8 it is enough to use a single mode to distribute entanglement and a single detector per side (no cost).
We hope that this work will be useful for the future development of Bell-type randomness generation experiments.

Acknowledgments
We thank M Hoban and S Pironio for interesting discussions and for the proof presented in Appendix A. We also thank N Sangouard for sharing with us the exact expressions of the no-click probabilities discussed in subsection 4.1. The SDP calculations were performed using the code QMBOUND written by JD Bancal. This work was supported by the EU projects QITBOX and SIQS and the John Templeton Foundation. JB was supported by the Swiss National Science Foundation (QSIT director's reserve) and SEFRI (COST action MP1006), DC by the Beatriu de Pinós fellowship (BP-DGR 2013), AM by the Mexican CONACYT graduate fellowship program, and PS by the Marie Curie COFUND action through the ICFOnest program.
of the elements. The same occurs in (A.3) whenever they apply a different binning. It is therefore sufficient to evaluate the optimal randomness available from p BB and from p BB , for example. In Fig. A1 we plot the percentage by which the guessing probability for these quantum behaviours is increased with respect to the guessing probability obtained from p η . We find that for any 2√2 − 2 < η < 1 it is always advantageous to keep the no-click outcome.

Figure A1. The binning disadvantage is the difference in the number of bits that are certified from either p BB or p BB with respect to p η .

Appendix B. Bell Inequality and relevant parameters expressions
As explained in the main text, the dual formulation of (2) yields the expression of the Bell inequality that certifies the optimal amount of randomness [7]. It is therefore possible to retrieve the Bell inequality associated with the optimal parameters. One first solves the program (3) for fixed η; this yields some optimal parameters P = P*. Then one solves the dual program of (2) using as input p(P*). In the Collins-Gisin parametrization, the 7 × 7 Bell inequality which certifies 0.74 bits of global randomness (see subsection 4.2) is: