Tsirelson Polytopes and Randomness Generation

We classify the extreme points of a polytope of probability distributions in the (2,2,2) CHSH-Bell setting that is induced by a single Tsirelson bound. We do the same for a class of polytopes obtained from a parametrized family of multiple Tsirelson bounds interacting non-trivially. Such constructions can be applied to device-independent random number generation using the method of probability estimation factors [Phys. Rev. A, 98:040304(R), 2018]. We demonstrate a meaningful improvement in certified randomness applying the new polytopes characterized here.


Introduction
The phenomenon of Bell nonlocality [2] is a prediction of quantum physics in which entangled particles display behavior incompatible with any local realistic (or "classical") explanation. Originally discovered during an examination of foundational assumptions about physics, Bell nonlocality was later found to have applications to tasks in quantum information theory. Specifically, communication protocols based on an underlying Bell nonlocality experiment can be designed to be "device-independent," in the sense that users of the protocols can be assured of security so long as the observed data meets certain statistical benchmarks, without having to make detailed assumptions about the internal functioning of their devices. Applications include device-independent quantum key distribution [3,4] and device-independent random number generation [5,6,7,8,9,10].
A Bell experiment will generate measurement outcomes according to a probability distribution. Only certain probability distributions exhibit Bell nonlocality while others do not, and the set of quantum-achievable probability distributions that do exhibit Bell nonlocality is complicated with a curved boundary [11,12]. Even in the simplest (2,2,2) Bell scenario (2 parties, 2 settings, 2 outcomes), new discoveries about the structure of this set are still being made [13].
Here, we describe a method for approximating the set of quantum-achievable distributions from the outside with convex polytopes, motivated by an application to device-independent random number generation [1,14]. We obtain the approximation by first restricting to the set of no-signaling probability distributions (itself a polytope containing the quantum set [3]) and then reducing to a smaller polytope by intersecting with half-spaces defined by so-called "Tsirelson" inequalities: linear constraints obeyed by quantum-achievable probability distributions. We did this in an earlier work [14] for single instances of Tsirelson's original inequality [15]; in the current paper, we generalize and formalize this approach so that it applies to any generalized Tsirelson inequality, such as those found in Refs. [16,17]. We also simultaneously incorporate multiple Tsirelson inequalities that interact non-trivially and describe the resulting smaller polytopes. While the extreme points of a given polytope characterized by known linear constraints can be found algorithmically, our work goes beyond this by classifying polytopes as parametrized families depending on parameters of the generalized Tsirelson inequalities that induce them. These analytic results provide a useful tool for optimizing approximating polytopes for a given task.
The specific application motivating our study is the Probability Estimation Factor (PEF) method [1,14] for device-independent random number generation. As demonstrated recently [10,18,19] the PEF method is effective for certifying randomness in feasible experiments. In its basic form it certifies randomness secure against an adversary holding classical side information, as done in [18], and it can also be used [10,19] as a tool for constructing the necessary machinery to execute the quantum probability estimation protocol of [20] which is secure against more general quantum side information.
The relevance of bounding polytopes to the PEF method is explained in detail in Section 4 below, but it can be understood roughly as follows: if a candidate for a randomness-certifying function is found to be valid, loosely speaking, for a finite set of probability distributions, then it will be valid for any convex mixture of these probability distributions. If the set of these convex mixtures (which is a polytope) contains the entire quantum set, the function is then confirmed to be appropriate for certifying randomness, and the validity condition only had to be checked for a finite number of probability distributions. In contrast to the polytopes presented here, other methods for approximating the quantum set, such as the non-linear methods of Navascués, Pironio, and Acín [11,21], are not readily applicable to the PEF method. It is possible then that our polytopes may have applications to other quantum information tasks for which a linear approximation of the quantum set is desirable.
In the remainder of the paper, we review basic facts about the set of quantum-achievable probability distributions and related polytopes (Section 2), then derive our results about Tsirelson polytopes in Section 3. In Section 4, we review the PEF method and present a scenario in which using the new polytopes yields a demonstrable improvement over previous implementations, and then we finish with concluding remarks in Section 5. An appendix contains some of the more technical proofs.

Definitions and Background
Our setting is the (2,2,2) Bell scenario, in which there are two spatially separated parties ("Alice" and "Bob") making measurements, two measurement settings for each party, and two possible measurement outcomes for each party. We can use random variables O_A and O_B to represent Alice's and Bob's outcomes, respectively, and random variables S_A and S_B to represent their respective settings. The outcome random variables O_A and O_B both take values in the set {0, +} and the settings random variables S_A and S_B take values in the sets {a, a′} and {b, b′}, respectively. A (2,2,2) Bell experiment is thus governed by a set of four conditional probability distributions corresponding to the four possible measurement configurations:
$$P(O_A O_B \mid ab), \quad P(O_A O_B \mid ab'), \quad P(O_A O_B \mid a'b), \quad P(O_A O_B \mid a'b'),$$
and we follow the terminology of [13]:
Definition 2.1 A behavior, denoted P, is a vector in R^16 listing the set of 16 conditional probabilities P(++|ab), P(+0|ab), ..., P(00|a′b′) corresponding to all possible measurement configurations and outcomes of the (2,2,2) experiment. A Bell function is a vector B in R^16 inducing a function from behaviors to R via the dot product: B · P : R^16 → R.
Bell functions will be used to introduce constraints of the form B · P ≤ R, where R is a real number, which will be satisfied only by some behaviors.
By the laws of probability, any valid behavior P must have only nonnegative entries and satisfy the following normalization equations:
$$\sum_{o_A, o_B \in \{+,0\}} P(o_A o_B \mid s_A s_B) = 1 \quad \text{for each } (s_A, s_B) \in \{a, a'\} \times \{b, b'\}. \tag{1}$$
Furthermore, we will study only behaviors that additionally satisfy the no-signaling constraints:
$$\begin{aligned}
P(++|ab) + P(+0|ab) &= P(++|ab') + P(+0|ab') \\
P(++|a'b) + P(+0|a'b) &= P(++|a'b') + P(+0|a'b') \\
P(++|ab) + P(0+|ab) &= P(++|a'b) + P(0+|a'b) \\
P(++|ab') + P(0+|ab') &= P(++|a'b') + P(0+|a'b')
\end{aligned} \tag{2}$$
The no-signaling constraints express the condition that Alice's marginal outcome distribution should not depend on Bob's measurement choice and vice versa. For instance, the first equation above requires that when Alice's setting is a, her probability of getting a "+" is the same whether Bob has setting b or b′. The first equation also implies, when combined with the laws of probability expressed in (1), that Alice's probability of getting a "0" does not depend on Bob's setting. We will refer to the class of valid behaviors satisfying (2) as the no-signaling set NS. NS forms a closed convex polytope: a bounded set formed by the intersection of finitely many closed half spaces. For NS, the half spaces are obtained from the 8 linear equality constraints in (1) and (2) combined with 16 linear inequality constraints ensuring that all entries are nonnegative, and the boundedness follows from the observation that NS ⊆ [0, 1]^16. It is well known [3] that NS can be expressed as the convex hull of 24 so-called extreme points, 8 of which are called "Popescu-Rohrlich (PR) boxes" [22] and 16 of which are called "local deterministic" behaviors; see [23] Tables A1 and A2 for a list. Below, individual PR boxes are written PR_i and individual local deterministic behaviors are written L_i. The standard definition (such as in [13]) of an extreme point of a set S is a point that cannot be expressed as a non-trivial convex combination of other points in S, and the Krein-Milman theorem implies that any convex compact set in R^n is equal to the convex hull of its extreme points. This allows us to use the following working definition of a set of extreme points:
Definition 2.2 Given a convex set S, a subset E ⊆ S is the set of extreme points of S if
1. S is contained in the convex hull of E, denoted S ⊆ Conv(E)
2. No element of E can be expressed as a convex combination of other points in E
There are two important subsets of NS to mention. First is the quantum set Q, consisting of behaviors that can be induced by quantum measurements of a quantum system. Second is the local set L, consisting of behaviors that admit a decomposition of the form
$$P(o_A o_B \mid s_A s_B) = \sum_{\lambda} P(\Lambda = \lambda)\, P(o_A \mid s_A, \lambda)\, P(o_B \mid s_B, \lambda)$$
for a random variable Λ that represents local hidden variables. L is equal to the convex hull of the 16 local deterministic behaviors; recall these are some (but not all) of the extreme points of the set NS. Quantum behaviors in NS \ L can be shown to contain certifiable randomness.
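The 24 extreme points of NS described above are easy to enumerate directly. The following is a minimal sketch, not taken from the paper: the index layout `idx`, the bit encoding of settings and outcomes, and all function names are illustrative assumptions. It builds the 16 local deterministic behaviors and the 8 PR boxes and checks that each satisfies nonnegativity, normalization (1), and no-signaling (2).

```python
from itertools import product

# Assumed encoding: setting bit 0 = a (or b), bit 1 = a' (or b');
# outcome bit 0 = "+", bit 1 = "0". Order of pairs: ab, ab', a'b, a'b' and ++, +0, 0+, 00.
SETTINGS = [(sa, sb) for sa in (0, 1) for sb in (0, 1)]
OUTCOMES = [(oa, ob) for oa in (0, 1) for ob in (0, 1)]

def idx(oa, ob, sa, sb):
    """Position of P(oa ob | sa sb) inside the length-16 behavior vector."""
    return 4 * (2 * sa + sb) + (2 * oa + ob)

def local_deterministic_behaviors():
    """Each party's outcome is a fixed function of its own setting: 2^4 = 16 behaviors."""
    out = []
    for f0, f1, g0, g1 in product((0, 1), repeat=4):
        f, g = (f0, f1), (g0, g1)
        P = [0.0] * 16
        for sa, sb in SETTINGS:
            P[idx(f[sa], g[sb], sa, sb)] = 1.0
        out.append(tuple(P))
    return out

def pr_boxes():
    """Outcomes satisfy oa XOR ob = sa*sb XOR al*sa XOR be*sb XOR ga: 8 behaviors."""
    out = []
    for al, be, ga in product((0, 1), repeat=3):
        P = [0.0] * 16
        for sa, sb in SETTINGS:
            for oa, ob in OUTCOMES:
                if (oa ^ ob) == ((sa & sb) ^ (al & sa) ^ (be & sb) ^ ga):
                    P[idx(oa, ob, sa, sb)] = 0.5
        out.append(tuple(P))
    return out

def in_no_signaling_set(P, tol=1e-12):
    """Check nonnegativity, normalization (1) and the no-signaling constraints (2)."""
    if min(P) < -tol:
        return False
    for sa, sb in SETTINGS:
        if abs(sum(P[idx(oa, ob, sa, sb)] for oa, ob in OUTCOMES) - 1.0) > tol:
            return False
    for s in (0, 1):
        for o in (0, 1):
            # Alice's marginal for setting s must not depend on Bob's setting, and vice versa.
            alice = [sum(P[idx(o, ob, s, sb)] for ob in (0, 1)) for sb in (0, 1)]
            bob = [sum(P[idx(oa, o, sa, s)] for oa in (0, 1)) for sa in (0, 1)]
            if abs(alice[0] - alice[1]) > tol or abs(bob[0] - bob[1]) > tol:
                return False
    return True

EXTREME_POINTS = local_deterministic_behaviors() + pr_boxes()
```

A behavior in which Alice's outcome copies Bob's setting is normalized but fails `in_no_signaling_set`, which is a quick way to see that the check is doing more than testing normalization.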
Every Bell function has a maximum local value, a maximum quantum value, and a maximum no-signaling value which, given a Bell function B, we define as
$$LB = \sup_{P \in L} B \cdot P, \qquad TB = \sup_{P \in Q} B \cdot P, \qquad NSB = \sup_{P \in NS} B \cdot P.$$
For L and NS, the above suprema are indeed maxima, as the maximum values are achieved. This follows because L and NS are each equal to the convex hull of a finite set of extreme points, and so for such a scenario, given a behavior P we can re-express B · P as
$$B \cdot P = B \cdot \Big( \sum_i \lambda_i E_i \Big) = \sum_i \lambda_i \, (B \cdot E_i),$$
where the E_i are elements of the set of extreme points E and the λ_i are nonnegative numbers summing to one. Therefore no behavior can have a value of B · P that is greater than max_{E_i ∈ E} B · E_i. The fact that the maxima for L and NS are achieved at extreme points will be useful in the arguments below. Regarding the question of whether the supremum over the quantum set is a proper maximum, the situation is more complicated and discussed in [13], but this question is not material for the work below. Finally, it is well known that L ⊆ Q ⊆ NS, and so the following inequality holds in general:
$$LB \le TB \le NSB. \tag{4}$$
Thus Bell functions B for which LB < TB holds strictly can be used to witness certifiable randomness in quantum behaviors P satisfying B · P > LB.
One particularly important Bell function that we will discuss is the Clauser-Horne-Shimony-Holt (CHSH) Bell function B_CHSH [24], whose coefficients are given in Table 1. The famous CHSH inequality, B_CHSH · P ≤ 2 for P ∈ L, is a statement that LB is 2 for B_CHSH. The Tsirelson bound TB for B_CHSH is 2√2 [15]. The no-signaling bound NSB of 4 is achieved by the PR box behavior in Table 1 [22]. The local bound LB of 2 is achieved by eight local deterministic behaviors; each of these eight behaviors has an entry of "1" in exactly one place where the PR box has a "0," and this location is unique to each of the eight saturating local behaviors.
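The values quoted above for B_CHSH can be checked by evaluating the Bell function on the extreme points. The sketch below is illustrative only. Since Table 1 is not reproduced here, it assumes a particular convention: coefficient +1 for equal outcomes and −1 for unequal outcomes, with the overall sign flipped on the setting pair a′b′. The index layout and all names are likewise assumptions.

```python
from itertools import product

# Assumed encoding: setting bit 0 = a (or b), bit 1 = a' (or b'); outcome bit 0 = "+", bit 1 = "0".
SETTINGS = [(sa, sb) for sa in (0, 1) for sb in (0, 1)]
OUTCOMES = [(oa, ob) for oa in (0, 1) for ob in (0, 1)]

def idx(oa, ob, sa, sb):
    return 4 * (2 * sa + sb) + (2 * oa + ob)

def dot(B, P):
    return sum(b * p for b, p in zip(B, P))

# The 16 local deterministic behaviors: outcomes are functions f, g of the local setting.
LOCAL = []
for f0, f1, g0, g1 in product((0, 1), repeat=4):
    P = [0.0] * 16
    for sa, sb in SETTINGS:
        P[idx((f0, f1)[sa], (g0, g1)[sb], sa, sb)] = 1.0
    LOCAL.append(tuple(P))

# The 8 PR boxes: oa XOR ob = sa*sb XOR al*sa XOR be*sb XOR ga, each allowed cell has weight 1/2.
PR = []
for al, be, ga in product((0, 1), repeat=3):
    P = [0.0] * 16
    for (sa, sb), (oa, ob) in product(SETTINGS, OUTCOMES):
        if (oa ^ ob) == ((sa & sb) ^ (al & sa) ^ (be & sb) ^ ga):
            P[idx(oa, ob, sa, sb)] = 0.5
    PR.append(tuple(P))

# CHSH coefficients under the assumed convention (sign flipped on a'b').
B_CHSH = [0.0] * 16
for (sa, sb), (oa, ob) in product(SETTINGS, OUTCOMES):
    sign = -1.0 if (sa, sb) == (1, 1) else 1.0
    B_CHSH[idx(oa, ob, sa, sb)] = sign * (1.0 if oa == ob else -1.0)

# Maxima over L and NS are attained at extreme points, so a finite search suffices.
LB = max(dot(B_CHSH, L) for L in LOCAL)
NSB = max(dot(B_CHSH, P) for P in LOCAL + PR)
PR1 = max(PR, key=lambda P: dot(B_CHSH, P))
SATURATING = [L for L in LOCAL if dot(B_CHSH, L) == LB]

# For each saturating local behavior, the cells where it has a 1 and the PR box has a 0.
OUTSIDE_SUPPORT = [[k for k in range(16) if L[k] == 1.0 and PR1[k] == 0.0] for L in SATURATING]
```

Under this convention each of the eight saturating behaviors has exactly one entry outside the support of the maximizing PR box, and the eight locations are all different, matching the description in the text.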
There are symmetries of the convex sets NS, Q, and L, for which the associated transformations applied to the CHSH inequality generate new inequalities. There are eight inequivalent versions of the CHSH inequality obtained this way. Each version of the CHSH inequality corresponds to a unique PR box behavior obtaining the NSB of 4 with a corresponding set of eight local deterministic behaviors obtaining the LB of 2.
Table 1: The CHSH Bell function B_CHSH (left), a PR box behavior that achieves the no-signaling maximum NSB = 4 of B_CHSH (center), and one of the eight local deterministic behaviors that achieves the local maximum LB = 2 of B_CHSH (right). The entries of the table for B_CHSH give the coefficients that appear in the dot product B_CHSH · P defining the Bell function, so starting in the upper left, 1 is the number to be multiplied by P(++|ab), −1 is the number to be multiplied by P(+0|ab), etc. The entries of the table for the PR box behavior are the probabilities themselves, so P(++|ab) = 1/2, P(+0|ab) = 0, etc. The entries of the table for the local deterministic behavior are also probabilities.

Tsirelson Polytopes
Given a Bell function B and a real number TB* ≥ TB, we define the set
$$Q_T = \{ P \in NS \mid B \cdot P \le TB^* \}.$$
We allow for TB* to exceed the quantum supremum TB, because we might want to consider Bell functions for which the exact quantum limit is not known but a numerical upper bound can be found [11,21]. As Q_T is the intersection of NS with a half space defined by a linear inequality, Q_T will form a polytope. Not all Bell functions will lead to scenarios worth studying. For our purposes, any interesting Bell function should not have NSB = LB, which by (4) would lead to the degeneracy LB = TB = NSB, so we only consider Bell functions for which LB < NSB holds strictly. For a given Bell function B, one can effectively determine whether the strict inequality holds by checking the value of B · E for all extreme points of L and NS. Here is a useful fact about such Bell functions:
Fact 3.1 For any Bell function B for which LB < NSB holds, there is exactly one PR box for which B · PR > LB, and B · PR = NSB for this PR box.
Proof. NSB will be achieved at an extreme point of NS, and the assumption LB < NSB implies that this extreme point must be one of the PR boxes. If a second PR box satisfied B · PR > LB, then an equal mixture of these two PR boxes, which is in the local set L (see Theorem 2.1 in [23]), would exceed the local bound, a contradiction.
A priori, any Bell function satisfying LB < NSB will fall into one of two categories: LB ≤ TB < NSB and LB < TB = NSB. It turns out the latter of these is impossible: Appendix D of [13] explains that if TB = NSB, then LB = TB = NSB. Thus we need only consider Bell functions for which LB ≤ TB < NSB; this forms our general scenario of interest. For the rest of this paper, we assume this condition, as well as TB* < NSB (as TB* = NSB yields Q_T = NS).
Fact 3.1 tells us that when we intersect NS with {P | B · P ≤ TB*} for some TB* ∈ [TB, NSB), the resulting set contains all of the extreme points of NS save one: the sole PR box for which B · PR = NSB > LB. Thus many of the extreme points of NS are extreme points of Q_T. The following theorem, proved in the appendix, classifies the rest of the extreme points of Q_T.
Theorem 3.1 Let B be a Bell function for which LB ≤ TB < NSB holds, and let PR_1 denote the PR box for which B · PR_1 = NSB. Then the set of extreme points of Q_T is equal to the set E defined as follows: all of the local deterministic distributions, all of the PR boxes except PR_1, and all behaviors of the form
$$E_i = \lambda_i \, PR_1 + (1 - \lambda_i) \, L_i, \tag{5}$$
where L_i is one of the eight local distributions saturating the version of the CHSH inequality maximally violated by PR_1, and
$$\lambda_i = \frac{TB^* - B_i}{NSB - B_i}, \qquad B_i = B \cdot L_i.$$
We note that if LB = TB* holds, a behavior defined by (5) can coincide with a local deterministic distribution, in which case the statement of the theorem refers to this behavior twice in defining the set E.
Applying the above theorem to the special case of the original Tsirelson bound of 2 √ 2 for B CHSH , one obtains the same value of λ i = √ 2 − 1 for all eight versions of the E i behavior in (5). Furthermore, if one simultaneously introduces all eight versions of the CHSH inequality with Tsirelson bounds 2 √ 2, it follows from the above arguments that each version causes the corresponding PR box behavior achieving NSB = 4 to split into eight extreme points of the form (5), resulting in a polytope with 80 extreme points, as described in [14].
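The count of 80 extreme points can be reproduced numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it takes the weight on PR_1 in (5) to be λ_i = (TB* − B_i)/(NSB − B_i) with B_i = B · L_i (the value that makes B · E_i = TB*), it parametrizes the eight CHSH versions by which setting pair carries the sign flip together with an overall sign, and all names and the index layout are hypothetical.

```python
import math
from itertools import product

SETTINGS = [(sa, sb) for sa in (0, 1) for sb in (0, 1)]
OUTCOMES = [(oa, ob) for oa in (0, 1) for ob in (0, 1)]

def idx(oa, ob, sa, sb):
    return 4 * (2 * sa + sb) + (2 * oa + ob)

def dot(B, P):
    return sum(b * p for b, p in zip(B, P))

LOCAL = []  # 16 local deterministic behaviors
for f0, f1, g0, g1 in product((0, 1), repeat=4):
    P = [0.0] * 16
    for sa, sb in SETTINGS:
        P[idx((f0, f1)[sa], (g0, g1)[sb], sa, sb)] = 1.0
    LOCAL.append(tuple(P))

PR = []  # 8 PR boxes
for al, be, ga in product((0, 1), repeat=3):
    P = [0.0] * 16
    for (sa, sb), (oa, ob) in product(SETTINGS, OUTCOMES):
        if (oa ^ ob) == ((sa & sb) ^ (al & sa) ^ (be & sb) ^ ga):
            P[idx(oa, ob, sa, sb)] = 0.5
    PR.append(tuple(P))

def correlator_bell(weights):
    """Bell vector with coefficient weights[s] on equal outcomes and -weights[s] on unequal ones."""
    B = [0.0] * 16
    for (sa, sb), (oa, ob) in product(SETTINGS, OUTCOMES):
        B[idx(oa, ob, sa, sb)] = weights[(sa, sb)] * (1.0 if oa == ob else -1.0)
    return B

# Eight CHSH versions: choice of the sign-flipped setting pair (4) times an overall sign (2).
CHSH_VERSIONS = [
    correlator_bell({s: sign * (-1.0 if s == flip else 1.0) for s in SETTINGS})
    for flip in SETTINGS for sign in (1.0, -1.0)
]

def split_vertices(B, tb_star):
    """Return (lambda_i, E_i) pairs of the form (5) that replace the PR box maximizing B."""
    nsb = max(dot(B, P) for P in PR)
    pr1 = max(PR, key=lambda P: dot(B, P))
    # The CHSH version maximally violated by pr1, and the local behaviors saturating it.
    version = next(V for V in CHSH_VERSIONS if dot(V, pr1) == 4.0)
    result = []
    for L in LOCAL:
        if dot(version, L) == 2.0:
            b_i = dot(B, L)
            lam = (tb_star - b_i) / (nsb - b_i)
            result.append((lam, tuple(lam * p + (1.0 - lam) * l for p, l in zip(pr1, L))))
    return result

TB = 2.0 * math.sqrt(2.0)
# All 16 local deterministic behaviors survive; every PR box is split by its own CHSH version.
VERTICES = {tuple(round(x, 9) for x in L) for L in LOCAL}
for V in CHSH_VERSIONS:
    for lam, E in split_vertices(V, TB):
        VERTICES.add(tuple(round(x, 9) for x in E))
```

`split_vertices` accepts any Bell vector with LB < NSB, so the same function can be reused with a different Bell function and a different TB*; only the final loop is specific to the eight CHSH versions.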
The scenario of the 80-vertex polytope is straightforward to describe because the different Tsirelson bounds do not "interact" in the sense that the respective no-signaling maxima for their corresponding Bell functions are achieved by different PR box behaviors; this is depicted schematically in Figure 1a. However, the situation is more complicated if Tsirelson bounds are introduced for two distinct Bell functions maximized by the same PR box, as represented in Figure 1b. This turns out to be the scenario that yields improvement for randomness certification. With a little effort, the proof method for Theorem 3.1 can be adapted to find the extreme points of the more complicated polytopes with two Tsirelson bounds interacting non-trivially.
As an example, let us consider the "tilted" CHSH Bell function B_α whose coefficients are given in Table 2. The Tsirelson bound (6) for this Bell function is derived in Ref. [16] and is valid for all values of α > 1; there is a quantum behavior saturating the bound. For a fixed α > 1, the extreme points of the polytope induced by (6) alone are given by Theorem 3.1. Now let us consider the polytope of behaviors that obey Tsirelson's original inequality B_CHSH · P ≤ 2√2 as well as (6) for a fixed value of α > 1, as depicted schematically in Figure 1b. To describe the extreme points of this polytope, first we remark that of the eight local deterministic behaviors satisfying B_CHSH · L = 2, four of them will have a B_α · L value of 2, and four of them will have a B_α · L value of 2α. This is confirmed by inspection of Table 2. We label the first four of these local deterministic behaviors L^top_i and the second four L^bot_i, for i ∈ {1, 2, 3, 4}. We assert that the extreme points of the polytope are those for NS, minus the single PR box that violates both Tsirelson bounds, plus 24 new extreme points. Four of these extreme points are obtained from the expression (5) using the L^top_i vectors with NSB, B_i, and TB* generated by B_CHSH, and four are obtained from the expression (5) using the L^bot_i vectors with NSB, B_i, and TB* generated by B_α. Note these behaviors each saturate one of the two Tsirelson inequalities while strictly obeying the other one. The remaining 16 extreme points, which saturate both Tsirelson inequalities, are given as
$$E_{ij} = \lambda_{PR} \, PR_1 + \lambda_{top} \, L^{top}_i + \lambda_{bot} \, L^{bot}_j \tag{7}$$
for all choices of i, j ∈ {1, 2, 3, 4}, where the λ coefficients are found by solving the simultaneous set of equations that require the coefficients to sum to one and the behavior (7) to saturate both Tsirelson inequalities. The solution of this set of equations is referred to as (8). Furthermore, the condition α > 1 ensures that these coefficients are nonnegative (see footnote 3), so the expression (7) with the coefficients in (8) yields a valid convex combination. An outline of the proof that these are the extreme points is given in the appendix.
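The construction of the 24 new extreme points can be illustrated numerically. The sketch below rests on assumptions that are not confirmed by this text, because Table 2 and the bound (6) are not reproduced here: it takes B_α to weight the correlators of the setting pairs ab and ab′ by α, the pair a′b by 1 and the pair a′b′ by −1, and it takes the Tsirelson bound to be 2√(1 + α²). Both choices agree with the statements above (local values 2 and 2α, and reduction to B_CHSH · P ≤ 2√2 at α = 1), but they should be checked against Ref. [16] before being relied on. All function names are hypothetical.

```python
import math
from itertools import product

SETTINGS = [(sa, sb) for sa in (0, 1) for sb in (0, 1)]
OUTCOMES = [(oa, ob) for oa in (0, 1) for ob in (0, 1)]

def idx(oa, ob, sa, sb):
    return 4 * (2 * sa + sb) + (2 * oa + ob)

def dot(B, P):
    return sum(b * p for b, p in zip(B, P))

def mix(weights, points):
    """Convex combination sum_k weights[k] * points[k] of behavior vectors."""
    return tuple(sum(w * P[k] for w, P in zip(weights, points)) for k in range(16))

LOCAL, PR = [], []
for f0, f1, g0, g1 in product((0, 1), repeat=4):
    P = [0.0] * 16
    for sa, sb in SETTINGS:
        P[idx((f0, f1)[sa], (g0, g1)[sb], sa, sb)] = 1.0
    LOCAL.append(tuple(P))
for al, be, ga in product((0, 1), repeat=3):
    P = [0.0] * 16
    for (sa, sb), (oa, ob) in product(SETTINGS, OUTCOMES):
        if (oa ^ ob) == ((sa & sb) ^ (al & sa) ^ (be & sb) ^ ga):
            P[idx(oa, ob, sa, sb)] = 0.5
    PR.append(tuple(P))

def tilted_chsh(alpha):
    """ASSUMED form of B_alpha: alpha*(C_ab + C_ab') + C_a'b - C_a'b' (alpha = 1 gives CHSH)."""
    w = {(0, 0): alpha, (0, 1): alpha, (1, 0): 1.0, (1, 1): -1.0}
    B = [0.0] * 16
    for (sa, sb), (oa, ob) in product(SETTINGS, OUTCOMES):
        B[idx(oa, ob, sa, sb)] = w[(sa, sb)] * (1.0 if oa == ob else -1.0)
    return B

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(A, b):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(A)
    return [det3([[b[r] if c == k else A[r][c] for c in range(3)] for r in range(3)]) / d
            for k in range(3)]

def two_bound_vertices(alpha):
    """New extreme points for alpha > 1 under the assumptions stated in the lead-in."""
    b1, ba = tilted_chsh(1.0), tilted_chsh(alpha)
    t1, ta = 2.0 * math.sqrt(2.0), 2.0 * math.sqrt(1.0 + alpha ** 2)  # ta is an ASSUMED bound
    pr1 = max(PR, key=lambda P: dot(b1, P))
    saturating = [L for L in LOCAL if dot(b1, L) == 2.0]
    top = [L for L in saturating if abs(dot(ba, L) - 2.0) < 1e-9]
    bot = [L for L in saturating if abs(dot(ba, L) - 2.0 * alpha) < 1e-9]
    # Single-bound points of the form (5): weight on pr1 chosen to saturate one inequality.
    lam_t = (t1 - 2.0) / (dot(b1, pr1) - 2.0)
    lam_b = (ta - 2.0 * alpha) / (dot(ba, pr1) - 2.0 * alpha)
    single_top = [mix((lam_t, 1.0 - lam_t), (pr1, L)) for L in top]
    single_bot = [mix((lam_b, 1.0 - lam_b), (pr1, L)) for L in bot]
    # Double-bound points of the form (7): normalization plus saturation of both inequalities.
    lam = solve3([[1.0, 1.0, 1.0],
                  [dot(b1, pr1), 2.0, 2.0],
                  [dot(ba, pr1), 2.0, 2.0 * alpha]], [1.0, t1, ta])
    double = [mix(lam, (pr1, Lt, Lb)) for Lt in top for Lb in bot]
    return lam, single_top, single_bot, double
```

For α = 2 and the assumed bound, the solved coefficients come out nonnegative with the PR box weight equal to √2 − 1, and the 4 + 4 + 16 points are pairwise distinct.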

Applications to Device Independent Random Number Generation
The constructions of Section 3 can be applied to the task of random number generation via the method of probability estimation factors [1,14], which we now briefly review. A probability estimation factor (PEF) is a function, satisfying certain conditions, that maps the result of a Bell experiment to the nonnegative real numbers. In our (2,2,2) Bell scenario, the result of an experimental trial consists of the two settings choices and two outcomes for Alice and Bob, so any PEF is a function F : {+, 0} × {+, 0} × {a, a′} × {b, b′} → [0, ∞) with sixteen possible inputs. The precise definition of a PEF in this scenario is as follows: for a collection of behaviors 𝒫 and a parameter β > 0, a PEF with power β is a nonnegative function satisfying the following inequality for all behaviors P ∈ 𝒫:
$$E_P\big[ F(O_A O_B S_A S_B) \, P(O_A O_B \mid S_A S_B)^{\beta} \big] \le 1, \tag{9}$$
where E_P[·] denotes expected value with respect to the joint distribution P(O_A, O_B, S_A, S_B) of the settings and outcomes given by the behavior P with a fixed settings distribution. The conditional probability P(O_A O_B | S_A S_B) appearing in (9) is also according to the behavior P. PEFs are useful because in a sequence of n repeated trials of a Bell experiment, they can be used to derive a bound on the probability of the string of outcomes, and therefore indicate the presence of randomness. Furthermore, the validity of this bound does not require an assumption that the n trials are independent and identically distributed (i.i.d.).
Footnote 3: The subtracted fractions in the expressions for λ_top and λ_bot are less than one.
Specifically, in [1] it is shown that if F is a PEF with power β for a collection of behaviors 𝒫, then for any ε > 0, the following holds for any probability distribution P governing all n trials in which the trial-by-trial, settings-conditional outcome distributions are (possibly different) behaviors in 𝒫:
$$P\Big( P(\mathbf{O} \mid \mathbf{S}) \ge \Big( \epsilon \prod_{i=1}^{n} F_i \Big)^{-1/\beta} \Big) \le \epsilon, \tag{10}$$
where **O** and **S** denote the strings of outcomes and settings over the n trials and F_i is the value of the PEF on the result of trial i. The above expression, roughly speaking, says that when the product of the trial-by-trial PEF values F_i is large, and so the quantity (ε ∏_{i=1}^n F_i)^{−1/β} is small, it is unlikely (outer probability at most ε) that the observed outcome string occurs with more than a small probability (inner inequality). Often, we assume that the collection of behaviors 𝒫 is the set of quantum-achievable behaviors, and so the probability bound in (10) will hold under the assumption that quantum mechanics is correct.
Polytopes containing the quantum set of behaviors Q can be used to effectively construct valid PEFs satisfying the defining constraint (9). Specifically, Section V of the supplemental material of [1] shows that any function satisfying the PEF defining condition (9) for the extreme points of a polytope containing 𝒫 will satisfy (9) for all the behaviors in 𝒫. Hence it becomes possible to check that a PEF is valid by checking only a finite number of linear inequality constraints, one for each extreme point.
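The finite validity check described above can be sketched as follows. This is an illustration under assumptions rather than the implementation of [1]: it takes the defining condition to be E_P[F · P(O_A O_B | S_A S_B)^β] ≤ 1 with equiprobable settings (each setting pair has probability 1/4), it uses the 24 extreme points of NS as the bounding polytope, and the index layout and function names are hypothetical.

```python
from itertools import product

SETTINGS = [(sa, sb) for sa in (0, 1) for sb in (0, 1)]
OUTCOMES = [(oa, ob) for oa in (0, 1) for ob in (0, 1)]

def idx(oa, ob, sa, sb):
    return 4 * (2 * sa + sb) + (2 * oa + ob)

def ns_extreme_points():
    """The 16 local deterministic behaviors followed by the 8 PR boxes."""
    points = []
    for f0, f1, g0, g1 in product((0, 1), repeat=4):
        P = [0.0] * 16
        for sa, sb in SETTINGS:
            P[idx((f0, f1)[sa], (g0, g1)[sb], sa, sb)] = 1.0
        points.append(tuple(P))
    for al, be, ga in product((0, 1), repeat=3):
        P = [0.0] * 16
        for (sa, sb), (oa, ob) in product(SETTINGS, OUTCOMES):
            if (oa ^ ob) == ((sa & sb) ^ (al & sa) ^ (be & sb) ^ ga):
                P[idx(oa, ob, sa, sb)] = 0.5
        points.append(tuple(P))
    return points

def pef_expectation(F, P, beta, setting_prob=0.25):
    """E_P[F * P(o|s)^beta]: the joint probability of a trial result is setting_prob * P(o|s)."""
    return sum(setting_prob * F[k] * P[k] ** (1.0 + beta) for k in range(16))

def is_valid_pef(F, beta, extreme_points, tol=1e-12):
    """F is a PEF with power beta for the whole polytope if the condition holds at every vertex.

    For fixed beta the condition is linear in F, which is why checking vertices suffices.
    """
    if min(F) < 0.0:
        return False
    return all(pef_expectation(F, P, beta) <= 1.0 + tol for P in extreme_points)

EXTREME_POINTS = ns_extreme_points()
```

The constant function F = 1 passes the check for any β > 0 but certifies nothing; the optimization discussed below searches for a valid F whose expected logarithm under the target behavior is as large as possible. Replacing `EXTREME_POINTS` with the vertices of a smaller polytope loosens the constraints and admits more candidate functions.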
The bound in (10) is a rough measure of how much randomness is available in the data of a Bell experiment. To process the randomness into a final, near-uniform output string for information-theoretic applications, further work can be done [1,8,10,14] to account for the non-unity probability that the PEF product exceeds a certain threshold and to properly apply classical postprocessing machinery such as an extractor function [25]. Here, we will not implement these final steps and will just use the expression (10) as a measure of raw certifiable randomness for comparing performance of various PEFs.
With this as our criterion, we can choose a quantum-achievable behavior and compare the performance of PEFs for it by fixing an error bound ε, fixing a number n of i.i.d. trials sampled from the quantum behavior with equiprobable settings, and computing (ε ∏_{i=1}^n F_i)^{−1/β} in (10). Since ∏_{i=1}^n F_i is a random variable, the probability bound will depend on the particular instance of the experiment. However, we can anticipate a likely value for ∏_{i=1}^n F_i using the fact that for sufficiently large n, Σ_{i=1}^n log(F_i) will be either greater than or less than nE(log F) with roughly equal probability (this follows from the Central Limit Theorem; see [26], or [8] Supplementary Information Section 3 for details). Hence we use
$$\Big( \epsilon \, e^{\, n E(\log F)} \Big)^{-1/\beta}, \tag{11}$$
where the expectation is computed according to the chosen quantum behavior with equiprobable settings, as an anticipated value for (ε ∏_{i=1}^n F_i)^{−1/β}. Given a set of extreme points E of a polytope containing the quantum set Q, we can perform a maximization procedure to find a PEF, valid for all quantum behaviors, that optimizes the quantity (11) for a fixed number of trials n. The procedure is as follows: first, fix a choice of β, then solve the following convex optimization problem:
$$\begin{aligned}
\text{Maximize } \; & E(\log F) \\
\text{Subject to } \; & E_{P}\big[ F(O_A O_B S_A S_B) \, P(O_A O_B \mid S_A S_B)^{\beta} \big] \le 1 \;\; \text{for all } P \in E, \qquad F \ge 0.
\end{aligned} \tag{12}$$
This can be solved effectively with standard computer algorithms. Once this is done, plug the resulting optimized value of E(log F) into (11). Then repeat the procedure for various values of β looking for the highest possible value of (11); in all examples studied, testing values of β between 0.001 and 0.100 yields a curve with a clear optimum (see Figure 2).
Figure 2: Optimizing (11) by performing the maximization problem of (12) for various values of β, with ε = 10^−6 and n = 10,000. The distribution of the trial results is given by the unique quantum behavior P, given in [16], that saturates (6) with α = 2. The polytope used to generate the extreme points E in (12) is induced by (only) the original Tsirelson inequality B_CHSH · P ≤ 2√2. The Y-axis plots −log_2 of the quantity in (11), to measure the amount of raw randomness in bits. A similar phenomenon is observed in Figure 1 of [14].
Table 3: Comparison of the amount of randomness obtained from three different polytopes with ε = 10^−6 and n = 10,000. The "Raw Randomness (Bits)" quantity is obtained by calculating
$$\max_{\beta} \, \max_{F} \; -\log_2 \Big[ \Big( \epsilon \, e^{\, n E(\log F)} \Big)^{-1/\beta} \Big],$$
where the expectation is computed for the quantum distribution P that maximizes B_α · P for α = 2. The inner maximum is calculated according to (12), and then recalculated for multiple values of β ∈ (0, 0.1) to obtain the outer maximum.
Results. To demonstrate the method, we considered the amount of randomness certifiable from a specific quantum-achievable behavior. Our choice was the unique quantum behavior, given in [16], that maximizes the Bell function in Table 2 when α = 2. Since this Bell function uniquely determines the behavior, a reasonable conjecture might be that the polytope obtained from the corresponding Tsirelson bound (6) with α = 2 would result in a larger amount of randomness than could be obtained using the polytope induced by the original Tsirelson bound B_CHSH · P ≤ 2√2 (equivalent to (6) with α = 1). We found that the opposite was true. However, including both bounds to create a smaller polytope does result in a meaningful improvement compared to using either bound individually. Note that [1,10,14,18,19] all use the polytope with only B_CHSH · P ≤ 2√2, so the fact that the polytope using both bounds results in an improvement of almost 18% to the number of certified random bits is relevant for future implementations. Our results are summarized in Table 3.
We are of course not limited to the three polytopes analyzed in Table 3. We have derived a parametrized family of two-inequality polytopes for different values of α. Figure 3 shows how the amount of certifiable randomness varies with α, and supports the notion that the α = 2 figure of 6,805.23 bits reported in Table 3 is near-optimal for this family of polytopes. Numerical evidence indicates that the optimal value of α is closer to 2.03.
There are many different scenarios that can be examined. The Tsirelson bound for expression (17) of Ref. [13], computed using the analytic technique of Wolfe and Yelin [17], is saturated by a quantum behavior as well as a local deterministic behavior, and consequently is saturated by any convex mixture of these two behaviors. We found that the amount of randomness that can be certified from such a convex mixture is greater using the standard CHSH inequality polytope, compared to what can be certified using the polytope induced by (17) in Ref. [13], paralleling our finding above for the single-inequality polytopes induced by B_α and B_CHSH. Furthermore, initial explorations employing polytopes with two non-trivially interacting Tsirelson bounds to certify randomness in existing experimentally generated data sets, such as the data in Ref. [10], produced only marginal improvements over the results reported in the reference (which uses only the original Tsirelson inequality). It may be that the meaningful improvements reported in Table 3 are more characteristic of behaviors near the quantum boundary and/or behaviors inducing a large absolute violation of the original CHSH inequality.
Figure 3: For various choices of α, we compute the polytope induced by both (6) and B_CHSH · P ≤ 2√2. We then perform the optimization problem (12) using this polytope and the distribution used in Table 3. The optimization problem is performed multiple times with multiple values of β ∈ (0, 0.1). For each choice of α we report the largest value of −log_2 of (11) found for all β.

Conclusion
We have derived formulas for identifying the extreme points of polytopes induced by multiple Tsirelson bounds, relating the structure of these polytopes to parameters in the defining inequalities. We have demonstrated that these results can be used to improve the performance of the probability estimation factor method for certifying randomness. Our techniques outline a general approach for classifying the extreme points of such polytopes, and in future work it may be useful to incorporate three or more non-trivially interacting Tsirelson bounds to obtain better approximations of the quantum set. Unfortunately, the complexity of a polytope grows with each additional constraint and a law of diminishing returns will apply as each iteration removes a smaller volume of behaviors than the one before it. Ideally the limiting behavior of performance enhancements will become apparent before the procedure becomes intractable. The results presented here may find applications beyond probability estimation, given the growing scope of quantum information theory and in particular device-independent protocols.

A Proofs of Results
In this appendix, we prove Theorem 3.1, and then outline the proof of the classification of the extreme points of the polytope induced by B CHSH and B α .
Recall that in Theorem 3.1, L_i is one of the eight local distributions saturating the version of the CHSH inequality maximally violated by PR_1, and λ_i is the coefficient defined alongside (5).
Proof. First note that the relation B_i ≤ LB ≤ TB* < NSB ensures that 0 ≤ λ_i < 1 holds. Thus (5) defines a convex combination of the behaviors PR_1 and L_i, so the E_i are valid behaviors in NS. Furthermore, one can check directly that B · E_i ≤ TB* holds for any i (indeed, B · E_i = TB*) so the E_i are contained in Q_T. As for the other elements of E, these are all local deterministic distributions or PR boxes other than PR_1, which all satisfy B · P ≤ LB ≤ TB*. Thus E ⊆ Q_T. Now we demonstrate that E satisfies the statement of Definition 2.2. The first step is to show that every element of Q_T is in the convex hull of E. To do this, note that the convex hull of E includes L as well as every convex combination of local deterministic distributions and PR boxes other than PR_1. By the remarks following Theorem 2.2 of [23], any element of NS that does not fall into one of these categories can be expressed as a convex combination of the following form:
$$P = p_{PR} \, PR_1 + \sum_{i=1}^{8} p_i L_i. \tag{13}$$
The p coefficients are nonnegative and satisfy p_PR + Σ_{i=1}^8 p_i = 1. To show Q_T ⊆ Conv(E), it will thus suffice to show that any behavior P of the form (13) that obeys B · P ≤ TB* can be expressed as a convex combination of the E_i behaviors defined in (5) and the local distributions L_i.
Our strategy for this is to first re-write expression (13) as
$$P = p'_{PR} \, PR_1 + p_{E_1} E_1 + p'_1 L_1 + \sum_{i=2}^{8} p_i L_i, \tag{14}$$
where the new set of p coefficients on the right side of (14) are still nonnegative and sum to one, and the new coefficient p′_PR for PR_1 is smaller than p_PR. This process can be re-applied to p′_PR PR_1 + p_2 L_2 to further lessen the coefficient p′_PR, and the process is repeated while cycling from L_1 through L_8, until the entire weight of PR_1 is replaced with weight on the E vertices, leaving a convex combination solely of elements of E.
To demonstrate that it is possible to execute the step displayed in (14), we divide the problem into cases. First, let us suppose that TB* − B_1 > 0 and that p_1 ≥ [(NSB − TB*)/(TB* − B_1)] p_PR. Then the following equality holds:
$$p_{PR} \, PR_1 + p_1 L_1 = \frac{NSB - B_1}{TB^* - B_1} \, p_{PR} \, E_1 + \Big( p_1 - \frac{NSB - TB^*}{TB^* - B_1} \, p_{PR} \Big) L_1. \tag{15}$$
Our case assumptions ensure that the coefficients of E_1 and L_1 are nonnegative, and furthermore the sum of the coefficients of E_1 and L_1 is equal to p_PR + p_1. Thus (15) can replace the PR_1 and L_1 terms in (13) to obtain a well-defined convex combination of E_1 and L_1, ..., L_8 yielding the same behavior. In this case, containment in Conv(E) is demonstrated and further iterations of the procedure are unnecessary. Now let us suppose that either TB* − B_1 = 0, or TB* − B_1 > 0 and p_1 < [(NSB − TB*)/(TB* − B_1)] p_PR. In either case the following equality holds:
$$p_{PR} \, PR_1 + p_1 L_1 = \frac{NSB - B_1}{NSB - TB^*} \, p_1 \, E_1 + \Big( p_{PR} - \frac{TB^* - B_1}{NSB - TB^*} \, p_1 \Big) PR_1. \tag{16}$$
Once again, the coefficients on the right side of the above equation are nonnegative and sum to p_PR + p_1, and so (16) can replace the PR_1 and L_1 terms in (13). Now there will remain some weight on the PR box in the new convex combination, but it will be reduced (unless p_1 = 0 or TB* = B_1, in which case it is unchanged), and the above process can be reapplied to [p_PR − ((TB* − B_1)/(NSB − TB*)) p_1] PR_1 + p_2 L_2. The key is that repeated applications of the (16)-type substitution to successive L_i terms must eventually terminate with a (15)-type substitution that eliminates the coefficient of PR_1, prior to cycling through all eight of the L_i. If this did not happen, the final expression after eight applications of (16)-type substitutions would be of the form Σ_{i=1}^8 p̃_i E_i + p̃_PR PR_1 with p̃_PR > 0. But it is not possible for such an expression to be equivalent to (13), because B · E_i = TB* for all i and B · PR_1 > TB*, so the new expression would violate the bound B · P ≤ TB*, whereas the behavior in (13) was assumed to obey it.
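The substitution argument above is effectively an algorithm, and it can be sketched in a few lines. The code below is an illustration only: it works with the coefficients alone, it assumes the weight on PR_1 in (5) is λ_i = (TB* − B_i)/(NSB − B_i), and the function and argument names are hypothetical. Given the weights of a behavior of the form (13), it returns weights on the E_i and on the L_i that describe the same behavior with no weight left on PR_1.

```python
import math

def decompose(p_pr, p, b, nsb, tb_star, tol=1e-12):
    """Rewrite p_pr*PR_1 + sum_i p[i]*L_i as sum_i e[i]*E_i + sum_i l[i]*L_i.

    b[i] is the Bell value B.L_i; nsb is B.PR_1; tb_star is the bound TB* with b[i] <= tb_star < nsb.
    """
    e = [0.0] * len(p)
    l = list(p)
    for i in range(len(p)):
        if p_pr <= tol:
            break
        gap = tb_star - b[i]
        if gap > tol and l[i] >= (nsb - tb_star) / gap * p_pr:
            # Substitution of type (15): all remaining PR_1 weight is absorbed by E_i.
            e[i] = (nsb - b[i]) / gap * p_pr
            l[i] -= (nsb - tb_star) / gap * p_pr
            p_pr = 0.0
        else:
            # Substitution of type (16): all of the L_i weight is absorbed by E_i.
            e[i] = (nsb - b[i]) / (nsb - tb_star) * l[i]
            p_pr -= gap / (nsb - tb_star) * l[i]
            l[i] = 0.0
    if p_pr > tol:
        # Weight remains on PR_1 after all L_i are used up: the input violated B.P <= TB*.
        raise ValueError("behavior does not satisfy B.P <= TB*")
    return e, l

def pr_weight(e, b, nsb, tb_star):
    """Total PR_1 weight hidden inside the E_i terms, using lambda_i from (5)."""
    return sum(e_i * (tb_star - b_i) / (nsb - b_i) for e_i, b_i in zip(e, b))
```

As a usage example for B_CHSH with TB* = 2√2, the behavior with p_PR = 0.3 and the remaining weight spread evenly over the eight saturating L_i has B · P = 2.6, so `decompose(0.3, [0.7 / 8] * 8, [2.0] * 8, 4.0, 2 * math.sqrt(2))` succeeds; the returned weights sum to one, and `pr_weight` recovers the original 0.3. Raising p_PR to 0.9 gives B · P = 3.8 and the function raises `ValueError`, mirroring the contradiction at the end of the argument.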
We have thus shown that $Q_T \subseteq$ Conv(E). To complete the proof, we need to show that the second part of Definition 2.2 is satisfied; that is, no element of E can be expressed as a convex combination of other points in E. We need only check this for the new behaviors $E_i$ defined in (5). Note first that if $T_B^* - B_i = 0$, then $E_i = L_i$, and $L_i$ is already known to be extremal, so let us assume $T_B^* - B_i > 0$. In this case, $E_i$ violates the version of the CHSH inequality that is maximally violated by $PR_1$ and saturated by $L_i$. This version of the CHSH inequality is not violated by any of the local deterministic distributions or PR boxes in E, so any non-trivial convex combination of elements of E equaling $E_i$ would require positive weight on at least one of the other behaviors defined by expression (5). But this is impossible: referring to Table 2 of [23], we see that each $E_i$ behavior has the form illustrated in Table 4. Importantly, there is a single location containing the entry $p_i$ and seven locations containing zero, all located outside the support of the PR box. Furthermore, each different $E_i$ defined by (5) will contain a positive $p_i$ entry in a different one of these eight cells outside the support of the PR box, and zeros in the rest. This feature implies that no convex combination equaling $E_i$ can contain positive weight on any other $E_j$ with $j \neq i$.

Table 4: An example of an $E_i$ behavior. From (5), $p_{PR} = (T_B^* - B_i)/(NS_B - B_i)$ and $p_i = (NS_B - T_B^*)/(NS_B - B_i)$.

         ++                        +0    0+    00
  ab     $p_i + \frac{1}{2} p_{PR}$    0     0     $\frac{1}{2} p_{PR}$
  ab′    $p_i + \frac{1}{2} p_{PR}$    0     0
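The zero-pattern argument can be illustrated with a short check. This is a sketch under the same assumptions as a standard CHSH instance ($NS_B = 4$, $T_B^* = 2\sqrt{2}$, $B_i = 2$); the cell ordering and the variable name `marked` are illustrative and do not come from the paper.

```python
import itertools
import numpy as np

BITS = (0, 1)
CELLS = [(x, y, a, b)
         for x, y in itertools.product(BITS, repeat=2)
         for a, b in itertools.product(BITS, repeat=2)]

def vec(fn):
    """Evaluate fn(x, y, a, b) on all 16 setting/outcome cells."""
    return np.array([fn(x, y, a, b) for x, y, a, b in CELLS], dtype=float)

# Assumed instance: CHSH functional and the PR box that maximizes it.
B = vec(lambda x, y, a, b: (-1) ** (x * y + a + b))
PR = vec(lambda x, y, a, b: 0.5 if (a ^ b) == x * y else 0.0)
NSB, T = 4.0, 2 * np.sqrt(2)

outside = PR == 0   # the eight cells outside the support of the PR box
marked = []         # for each E_i: indices of its positive cells outside the support

strategies = itertools.product(itertools.product(BITS, repeat=2), repeat=2)
for f, g in strategies:
    L_i = vec(lambda x, y, a, b: float(a == f[x] and b == g[y]))
    B_i = B @ L_i
    if not np.isclose(B_i, 2.0):
        continue    # keep only the eight L_i that saturate this CHSH version
    E_i = ((T - B_i) * PR + (NSB - T) * L_i) / (NSB - B_i)
    marked.append(np.flatnonzero(outside & (E_i > 1e-12)))

for i, cells in enumerate(marked, start=1):
    print(f"E_{i}: positive entries outside PR support at cells {cells.tolist()}")
```

Each $E_i$ reports exactly one positive cell outside the PR-box support, and the eight cells are all different, which is the property used to rule out positive weight on any $E_j$ with $j \neq i$.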
Now we outline the proof that the extreme points of the polytope induced by $B_{\mathrm{CHSH}}$ and $B_\alpha$ are given by the expressions at the end of Section 3.
To demonstrate part 1 of Definition 2.2, consider an expression of the form (13) that obeys both Tsirelson bounds. Then, analogously to (14), replace a portion of the weight on the PR box with weight on one of the 16 behaviors of the form (7). This will be possible so long as there is positive weight on at least one $L^{\mathrm{top}}_i$ behavior and at least one $L^{\mathrm{bot}}_i$ behavior. Repeat this process with different choices of the 16 behaviors of the form (7) until it is no longer possible: either (1) all PR box weight has been converted, (2) there is no remaining weight on $L^{\mathrm{top}}_i$ behaviors, or (3) there is no remaining weight on $L^{\mathrm{bot}}_i$ behaviors. In case (1), the process is complete. In case (2), one can continue to replace PR box weight with $E_i$-type behaviors, as defined in (5), built from $L^{\mathrm{bot}}_i$ behaviors (which saturate (6) and strictly satisfy $B_{\mathrm{CHSH}} \cdot P < 2\sqrt{2}$). In case (3), one can instead continue to replace PR box weight with $E_i$-type behaviors, as defined in (5), built from $L^{\mathrm{top}}_i$ behaviors (which, conversely, strictly satisfy (6) and saturate $B_{\mathrm{CHSH}} \cdot P \leq 2\sqrt{2}$). The fact that the original behavior (13) satisfies both Tsirelson inequalities ensures that in all cases this process terminates with an equivalent expression consisting of a convex combination with weight solely on the new extreme points, and zero weight on the PR box.
To demonstrate part 2 of Definition 2.2, note that the 24 new extreme points all violate the CHSH inequality, and the other extreme points do not, so any convex combination replicating one of the 24 new extreme points would have to contain weight on at least one of the other 23 new extreme points. However, it can be verified by inspection that this cannot occur, by considering where these 24 extreme points contain zeros (recall Table 4) and which Tsirelson inequalities each of them saturates or strictly satisfies.