Bell non-locality using tensor networks and sparse recovery

I. INTRODUCTION
Bell's theorem [1] shows that quantum predictions are at odds with the physical intuition from classical physics. More precisely, the correlations obtained by local measurements on distant but entangled systems cannot be reproduced by any local hidden variable model, a phenomenon generally known as Bell non-locality [2]. Historically a topic in the foundations of quantum theory, non-locality is now understood, with the establishment of quantum information science, as a resource in a number of information processing applications, including randomness certification [3], secure communication [4], reduction in communication complexity [5] and self-testing [6]. It is also at the core of the device-independent framework [7], where information processing is achieved without the need for precise knowledge of the internal physical mechanisms of the state preparation and measurement apparatuses.
A central problem in the study of non-locality is to decide whether a given observed correlation is non-local [2] and, furthermore, to quantify it [8]. The standard approach is based on Bell inequalities, experimentally testable witnesses whose violation allows for the device-independent certification of the non-local nature of the correlations under test. The set of correlations compatible with a local hidden variable model forms a convex set [9], the non-trivial facets of which are precisely the Bell inequalities. However, the full characterization of the Bell inequalities bounding a given scenario can only be achieved for the simplest cases [2], and in practice one often has to rely on an incomplete set of inequalities [10][11][12]. Alternatively, the non-local behaviour of a given correlation can be tested directly by resorting to linear programming (LP) [2,13,14]. However, the LP approach also suffers from the curse of dimensionality and becomes impractical as the number of parties, measurement settings or measurement outcomes in the Bell scenario increases.
Motivated by these issues and the fact that new representations often lead to new insights [9][15][16][17][18][19], our aim in this paper is to offer an alternative view on Bell non-locality, based on tensor networks [20] and sparse recovery [21][22][23][24]. We also drew inspiration from category theory and its applications to quantum mechanics and probability theory, which in some sense represents a formal counterpart to the computation-focused tensor-network approach [25][26][27][28][29][30]. Tensor networks have become an important tool in condensed matter physics, and by themselves constitute a rapidly developing field that branches out into topics as varied as quantum gravity and machine learning [20]. The current surge of progress can be traced back to the invention of the density matrix renormalization group (DMRG) [31] and its reformulation in terms of matrix product states (MPS) [32]. The successful application of tensor networks, as well as of machine learning models such as neural networks [33], generally rests on a combination of two factors: (i) the computational problem at hand allows an efficient encoding in terms of the given model; (ii) there is an efficient way to fix the free parameters of the model by means of an optimization problem. In the case of finding the ground state of one-dimensional systems, the model is given by the MPS ansatz and DMRG is the optimization algorithm. In the case of neural networks, the algorithm is based on back propagation.
Here we show that the problem of determining Bell non-locality has a natural representation as a tensor network problem. It turns out that the optimization one has to perform is equivalent to the problem of basis pursuit known from the theory of compressed sensing. Compressed sensing refers to the idea that sparse signals can be reconstructed efficiently from a very limited number of observations (well below the Nyquist-Shannon limit) [21,22], using convex optimization for the recovery. This has led to many applications in the last decade [23,24].
Based on the tensor network approach we show a number of results. First, we show that non-signalling correlations (including non-local correlations) can be mapped to hidden variable models governed by quasi-probabilities, that is, probabilities that sum up to one but are not necessarily positive [34,35]. The negativity of this quasi-probability provides a natural way to quantify non-locality. Second, we provide an explicit singular value decomposition for the hidden variable model that introduces a natural basis in which to express the problem and points out a novel way to detect and quantify non-locality with tools originating from the field of compressed sensing [36]. In fact, as we show, sparse recovery algorithms allow for a significant speed-up in the detection of non-locality in comparison with the standard linear programming approach.

II. BELL SCENARIO AS A TENSOR NETWORK
We will focus here on the standard bipartite Bell scenario in which Alice and Bob each locally perform an experiment in spatially separated regions of spacetime. However, all our results generalize in a straightforward manner to more parties. Alice and Bob have the freedom to choose experimental settings, labeled $x$ and $y$ respectively, and obtain outcomes indexed by labels $a$ and $b$ with some probability. We will assume that $x, y$ can both take values $1, \ldots, m$ while $a, b$ take one of the values $0, \ldots, n-1$. The setup is fully described by a conditional probability $P(ab|xy)$, i.e. the probability to obtain outcomes $a$ and $b$ given the inputs $x$ and $y$. We will call $P$ a behavior.
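In order to be consistent with the laws of special relativity, the behavior $P$ must obey the no-signalling conditions
$$\sum_b P(ab|xy) = \sum_b P(ab|xy') \;\; \text{for all } a, x, y, y', \qquad \sum_a P(ab|xy) = \sum_a P(ab|x'y) \;\; \text{for all } b, y, x, x'. \tag{1}$$
In a quantum description, according to Born's rule, the probability distribution in a Bell scenario should be given by
$$P(ab|xy) = \mathrm{Tr}\left[ M^x_a \otimes M^y_b \, \rho \right], \tag{2}$$
where $\rho$ is the density matrix describing the quantum state shared between Alice and Bob and $M^x_a$ and $M^y_b$ are POVM operators describing their measurements.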
Clearly, quantum correlations are non-signalling; however, there are also non-signalling correlations of a post-quantum nature [37].
If moreover the experimental outcomes can be explained within the assumptions of local realism, the conditional probability allows a decomposition
$$P(ab|xy) = \sum_\lambda P(a|x\lambda)\, P(b|y\lambda)\, p(\lambda). \tag{3}$$
It is well known that we can replace the local conditional probabilities (e.g. $P(a|x\lambda)$ for Alice) by a deterministic process mapping each $x$ to some $a$ (i.e. a function), so that the local variable simply determines the probability for the combination of deterministic processes at Alice's and Bob's side. In other words, $\lambda$ can be taken to be the combination of two sequences $(a_1 \ldots a_m)$ and $(b_1 \ldots b_m)$ that prescribe the outputs $a_x$ and $b_y$ for each of the inputs $x$ and $y$, and we can write
$$P(ab|xy) = \sum_{a_1 \ldots a_m} \sum_{b_1 \ldots b_m} \delta_{a a_x}\, \delta_{b b_y}\, q_{a_1 \ldots a_m b_1 \ldots b_m}, \tag{4}$$
where $q_{a_1 \ldots a_m b_1 \ldots b_m}$ is the corresponding probability. Remarkably, as we will show next, any no-signalling conditional probability allows such a decomposition with $q$ a quasi probability, i.e. $q_{a_1 \ldots a_m b_1 \ldots b_m}$ can take negative values but still sums to 1. This was noted in [15], but in a slightly different form and in a totally different language.
We will arrive at this observation independently from a reasoning rooted in tensor network theory. Furthermore, the decomposition in Eq. (4) gives a new approach to testing Bell non-locality: one can now search the space of all $q$ compatible with $P$ for an element with only non-negative coefficients, i.e. a proper probability (see Sec. IV).
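The decomposition in Eq. (4) and the no-signalling conditions in Eq. (1) can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names (`deterministic_tensor`, `behavior_from_q`, `is_no_signalling`) are hypothetical and not from the paper, and inputs are labeled $0, \ldots, m-1$ rather than $1, \ldots, m$. It draws a random real vector $q$ that sums to 1 (entries may be negative) and builds the corresponding behavior.

```python
import itertools
import numpy as np

def deterministic_tensor(n, m):
    """Matrix with rows (a, x) and columns (a_0..a_{m-1}); entry is delta(a, a_x)."""
    D = np.zeros((n * m, n ** m))
    for col, seq in enumerate(itertools.product(range(n), repeat=m)):
        for x, a_x in enumerate(seq):
            D[a_x * m + x, col] = 1.0
    return D

def behavior_from_q(q, n, m):
    """Eq. (4): P[a, x, b, y] = sum over sequences of delta(a, a_x) delta(b, b_y) q."""
    D = deterministic_tensor(n, m)
    Q = np.asarray(q, dtype=float).reshape(n ** m, n ** m)  # rows: Alice, cols: Bob
    return (D @ Q @ D.T).reshape(n, m, n, m)

def is_no_signalling(P, tol=1e-12):
    """Alice's marginal must not depend on y, Bob's must not depend on x."""
    pa = P.sum(axis=2)  # indices (a, x, y)
    pb = P.sum(axis=0)  # indices (x, b, y)
    return bool(np.allclose(pa, pa[:, :, :1], atol=tol)
                and np.allclose(pb, pb[:1, :, :], atol=tol))

rng = np.random.default_rng(0)
n, m = 2, 2
q = rng.normal(size=n ** (2 * m))
q += (1.0 - q.sum()) / q.size      # shift so that q sums to 1 (a quasi probability)
P = behavior_from_q(q, n, m)
print(is_no_signalling(P), P.sum(axis=(0, 2)))
```

Any such $q$ yields a normalized, no-signalling $P$, although individual entries of $P$ may be negative unless $q$ is chosen appropriately.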
Let us establish some conventions and notations. Any object with multiple indices, such as a conditional probability $P(ab|xy)$ or a (quasi) probability $q_{a_1 \ldots a_m}$, will be viewed as a tensor. We will use mixed notations for indices (upper, lower, as function argument) without distinction. For clarity, we will not use the Einstein summation convention and keep summations explicit. If any or all of the indices are suppressed we will use bold face, such as $\mathbf{P}$ and $\mathbf{q}$. The graphical depiction of a tensor as a box with a line for each index is often useful. For example, we depict $\mathbf{P}$ as a box with lines for $x$, $y$, $a$ and $b$ [diagram missing in source], where we used an arrow on the lines to distinguish input from output indices. Connecting lines between tensors implies summation over the corresponding index, also called contraction. In this way one can create tensor networks representing a collection of tensors with a certain pattern of contractions.
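The decomposition in Eq. (4) of any no-signalling $P(ab|xy)$ can be shown using a typical tensor-network tool, namely the singular value decomposition (SVD). Recall that any matrix $M$ allows an SVD $M = U S V^\dagger$, where $U$, $V$ are unitary (orthogonal if $M$ is real) and $S$ is quasi diagonal. Furthermore, we will make use of the following lemma.

Lemma 1 (statement inferred from its proof and from its use below). A real matrix $M_{ax}$ has constant column sums, $\sum_a M_{ax} = C$ for every $x$, if and only if it can be written as
$$M_{ax} = \sum_{a_1 \ldots a_m} \delta_{a a_x}\, C_{a_1 \ldots a_m}$$
with real coefficients $C_{a_1 \ldots a_m}$. If in addition $M_{ax} \geq 0$, the coefficients can be chosen such that $C_{a_1 \ldots a_m} \geq 0$.

Proof. The if statement is immediate: each matrix $\delta_{a a_x}$ has exactly one entry equal to 1 in every column, so any superposition of them has column sums $\sum_{a_1 \ldots a_m} C_{a_1 \ldots a_m}$, independent of $x$. To show the only if statement, let us start with the case $M_{ax} \geq 0$. We do induction on the column sum. For $C = 0$, the only option is $M_{ax} = 0$ for all $a, x$. Suppose we proved the statement for $C' < C$ and we are given $M_{ax} \geq 0$ with column sums $C$. Pick the row indices $a_x$ of the smallest non-zero element of each column $x$ of $M$ and let $\lambda$ be the smallest value of all these elements. Then the matrix $M'$ with coefficients $M'_{ax} = M_{ax} - \lambda \delta_{a a_x}$, i.e. $M'_{a_x x} = M_{a_x x} - \lambda$ and $M'_{ax} = M_{ax}$ for $a \neq a_x$, has all coefficients non-negative and column sums $C' = C - \lambda < C$. Hence, by the induction step we can write
$$M'_{ax} = \sum_{a_1 \ldots a_m} \delta_{a a_x}\, C'_{a_1 \ldots a_m}$$
with $C'_{a_1 \ldots a_m} \geq 0$. Adding back $\lambda \delta_{a a_x}$ gives the required decomposition for $M$. This establishes the lemma for this case.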
Next, note that the matrices with a single $1$ and a single $-1$ in one of the columns and otherwise zeros can be constructed as $N_{ax} = \delta_{a a_x} - \delta_{a a'_x}$, where $\mathbf{a} = (a_1, \ldots, a_m)$ and $\mathbf{a}' = (a'_1, \ldots, a'_m)$ differ only in the entry for the label $x$ of that column, with $a_x \neq a'_x$ the row indices of the two non-zero coefficients. We can convert any matrix with constant column sums into any other matrix with the same constant column sums by adding a superposition of such matrices $N$. Combined with the case $M_{ax} \geq 0$, we have shown that the matrices $\delta_{a a_x}$ form an over-complete set that generates all matrices with constant column sum, and the lemma follows.
The conditional probability $P(a|x)$ can be viewed as a stochastic matrix with constant column sums 1 and non-negative coefficients. The lemma states that this can be decomposed as a superposition, with non-negative coefficients, of matrices with a single 1 in each column [example equation missing in source]. Such a decomposition is not necessarily unique. The white noise probability $P(a|x) = 1/n$, for example, can easily be seen to allow different decompositions. If we allow negative coefficients there is of course an even larger ambiguity in choosing the coefficients. For a conditional quasi probability $Q(a|x)$, a matrix with column sums equal to 1 but some negative elements, we will always find some negative expansion coefficients in the superposition [example equation missing in source]. The non-negative case of the lemma essentially states that stochastic matrices form a polytope with corners given by the matrices with coefficients $\delta_{a a_x}$, which simply encode functions or deterministic processes. As such it can be viewed as a generalization of the Birkhoff-von Neumann theorem.
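The inductive step in the proof of the lemma is constructive and can be turned into a small greedy routine. The sketch below assumes NumPy; the function names `decompose_stochastic` and `rebuild` and the example matrix are illustrative choices, not taken from the paper. At each step it picks the smallest non-zero entry of every column, subtracts the smallest of these values along the chosen deterministic matrix, and records the weight.

```python
import numpy as np

def decompose_stochastic(M, tol=1e-12):
    """Greedy decomposition of a non-negative matrix with constant column sums
    into deterministic matrices delta(a, a_x), following the induction argument.
    Returns a list of (weight, (a_0, ..., a_{m-1})) pairs."""
    M = np.array(M, dtype=float)
    cols = np.arange(M.shape[1])
    terms = []
    while M.max() > tol:
        masked = np.where(M > tol, M, np.inf)
        rows = masked.argmin(axis=0)        # row of smallest non-zero entry per column
        lam = masked.min(axis=0).min()      # smallest of these entries
        M[rows, cols] -= lam                # subtract lam * delta(a, a_x)
        terms.append((float(lam), tuple(int(r) for r in rows)))
    return terms

def rebuild(terms, n, m):
    """Sum the weighted deterministic matrices back up."""
    M = np.zeros((n, m))
    for weight, rows in terms:
        M[list(rows), np.arange(m)] += weight
    return M

# Example: P(a|x) with rows a and columns x
P_ax = np.array([[0.5, 0.2],
                 [0.5, 0.8]])
terms = decompose_stochastic(P_ax)
for weight, rows in terms:
    print(round(weight, 3), rows)
```

For this example the routine returns three deterministic terms with weights 0.2, 0.3 and 0.5. Other valid decompositions exist, in line with the non-uniqueness noted above.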
Let us now consider the bipartite $P(ab|xy)$ that satisfies the no-signalling property and see how the lemma implies the decomposition in terms of a quasi probability for the hidden variable model. We can interpret $\mathbf{P}$ as a matrix by grouping Alice's and Bob's indices and write the SVD as $P(ab|xy) = \sum_\lambda A^\lambda_{ax} \Lambda_\lambda B^\lambda_{by}$. From the no-signalling property it is clear that $\sum_a A^\lambda_{ax}$ is independent of $x$ for each $\lambda$ with $\Lambda_\lambda \neq 0$ and, similarly, that $\sum_b B^\lambda_{by}$ is independent of $y$. Then, applying the lemma for fixed $\lambda$, we find that $A^\lambda_{ax} = \sum_{a_1 \ldots a_m} \delta_{a a_x} A^\lambda_{a_1 \ldots a_m}$ and $B^\lambda_{by} = \sum_{b_1 \ldots b_m} \delta_{b b_y} B^\lambda_{b_1 \ldots b_m}$ for some $A^\lambda$ and $B^\lambda$. Hence we have found a decomposition of the form in Eq. (4) with $q_{a_1 \ldots a_m b_1 \ldots b_m} = \sum_\lambda A^\lambda_{a_1 \ldots a_m} B^\lambda_{b_1 \ldots b_m} \Lambda_\lambda$, which is real and sums to 1 over all indices. However, in general the coefficients of $q$ can be negative, hence it is a quasi probability.
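The construction above can be followed step by step in a few lines. This is a sketch under stated assumptions: it uses NumPy, the helper `deterministic_tensor` is a hypothetical name, and the per-$\lambda$ expansion coefficients are obtained with a least-squares solve (one valid choice among many, since the expansion is not unique). As input it takes the Popescu-Rohrlich box, $P(ab|xy) = 1/2$ if $a \oplus b = xy$ and zero otherwise, with $x, y \in \{0, 1\}$.

```python
import itertools
import numpy as np

def deterministic_tensor(n, m):
    """Matrix with rows (a, x) and columns (a_0..a_{m-1}); entry is delta(a, a_x)."""
    D = np.zeros((n * m, n ** m))
    for col, seq in enumerate(itertools.product(range(n), repeat=m)):
        for x, a_x in enumerate(seq):
            D[a_x * m + x, col] = 1.0
    return D

n = m = 2
P = np.zeros((n, m, n, m))                      # indices (a, x, b, y)
for a, x, b, y in itertools.product(range(2), repeat=4):
    if (a ^ b) == x * y:
        P[a, x, b, y] = 0.5
Pmat = P.reshape(n * m, n * m)                  # rows (a, x), columns (b, y)

U, s, Vh = np.linalg.svd(Pmat)
r = int((s > 1e-12).sum())                      # keep non-zero singular values only
D = deterministic_tensor(n, m)
# Expand each singular vector in deterministic matrices: D @ A_seq = U[:, :r]
A_seq = np.linalg.lstsq(D, U[:, :r], rcond=None)[0]
B_seq = np.linalg.lstsq(D, Vh[:r].T, rcond=None)[0]
Q = (A_seq * s[:r]) @ B_seq.T                   # q with rows a_0 a_1 and columns b_0 b_1

print("sum of q:", Q.sum())
print("most negative entry:", Q.min())
```

The resulting $q$ reproduces the PR box and sums to 1, and it necessarily has negative entries because the PR box is non-local.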
To give a concrete example, let us put $n = m = 2$ and consider the Popescu-Rohrlich (PR) box [37] (for the fully binary case we use $x, y \in \{0, 1\}$ and $a \oplus b = a + b \bmod 2$). We can find an explicit quasi probability $q_{a_0 a_1 b_0 b_1}$ [coefficients missing in source]. Doing the summation we recover $P(ab|xy) = \sum_{a_0 a_1 b_0 b_1} \delta_{a a_x} \delta_{b b_y}\, q_{a_0 a_1 b_0 b_1}$.
We can refine the decomposition (4) further. Let us introduce the deterministic tensor $D$ defined by the coefficients $D_{a x a_1 \ldots a_m} = \delta_{a a_x}$. Viewing $D$ as a matrix we can apply the singular value decomposition (SVD) $D = U S V^T$. Remarkably, we can find exact analytic results for this SVD. By a direct computation one can check that the definitions of $U$, $S$, $V$ and of the rotation $R$ given in Eqs. (10) to (15) [missing in source] are consistent with this decomposition.
FIG. 1. Tensor-network decomposition of a bipartite no-signalling conditional probability $P(ab|xy)$. (a) The no-signalling condition on $\mathbf{P}$ is equivalent to the decomposition $\mathbf{P} = (D \otimes D)\mathbf{q}$ with $\mathbf{q}$ a quasi probability and $D$ the deterministic tensor. (b) The singular value decomposition (SVD) of the deterministic tensor $D$. Writing $D = U S V^T$ we find that $U$ and $V$ are expressed in terms of the rotation $R$. (c) The full decomposition of $\mathbf{P}$. Indicated is the vector $\mathbf{z} = R^{\otimes 2m}\mathbf{q}$, i.e. the hidden-variable quasi probability in the basis of singular vectors of $D \otimes D$ (correlation basis). It is natural to compare $\mathbf{q}$ with the behavior $\mathbf{P}$ in this basis, as discussed in the text; this is used to obtain $\mathbf{q}$ via sparse recovery.

Note that $R$ defines a rotation to a basis whose first vector, labeled by $b = 0$, has all coefficients equal. It is thus the normalized all-ones vector, a joint eigenvector of every stochastic matrix, while all other basis vectors $b > 0$ have column sum zero. The second tensor appearing in the definitions acts like a controlled version of $R$: it implements the rotation only when the control index equals $a = 0$ and acts as the identity for $a \neq 0$, similar in spirit to the well-known CNOT gate. In the definition of $R_{xy}$ we have to replace $n$ by $m$ when compared to $R_{ab}$. Note that for the case $n = 2$ the matrix $R$ is simply the Hadamard gate [explicit matrix for $n = 3$ missing in source]. Note also that for a binary probability $p_a$ with $a = 0, 1$ the vector $\mathbf{z} = R\mathbf{p}$ has only a single non-trivial coefficient, $z_1 = \sum_a (-1)^a p_a / \sqrt{2}$, which up to normalization equals the imbalance or expectation value $\langle a \rangle = \sum_a (-1)^a p_a$. We will call the basis defined by the columns of $R$ the correlation basis, since rotating a binary probability $\mathbf{p}$ by the corresponding transformation is analogous to switching to a description in terms of the correlators (expectation values) rather than the probability itself. This becomes clearer still if we consider a probability with more indices, $p_{a_1 \ldots a_m}$, and apply $R$ to each index, as we will discuss in the next section. The rotation $R$ straightforwardly generalizes to the case $n > 2$, in which case the coefficients $z_a$ with $a > 0$ together form a generalized correlation-like description of $p_a$ that is mathematically equivalent.
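Since the explicit formulas for the SVD of $D$ are not available in this text, the structure can at least be probed numerically. In the sketch below (NumPy assumed; `deterministic_tensor` and `predicted_singular_values` are hypothetical names), the predicted spectrum is obtained here by diagonalizing $D D^T$ by hand and is an assumption of this example rather than a quotation of the paper's equations: one singular value $\sqrt{m\, n^{m-1}}$, $m(n-1)$ singular values $\sqrt{n^{m-1}}$, and $m - 1$ zeros. For $n = 2$ the sketch also checks that $V = R \otimes R$, with $R$ the Hadamard matrix, diagonalizes $D^T D$.

```python
import itertools
import numpy as np

def deterministic_tensor(n, m):
    """Matrix with rows (a, x) and columns (a_0..a_{m-1}); entry is delta(a, a_x)."""
    D = np.zeros((n * m, n ** m))
    for col, seq in enumerate(itertools.product(range(n), repeat=m)):
        for x, a_x in enumerate(seq):
            D[a_x * m + x, col] = 1.0
    return D

def predicted_singular_values(n, m):
    """Spectrum derived by hand for this sketch (not copied from the paper)."""
    top = np.sqrt(m * n ** (m - 1))
    mid = np.sqrt(n ** (m - 1))
    return np.array([top] + [mid] * (m * (n - 1)) + [0.0] * (m - 1))

n, m = 3, 2
s = np.linalg.svd(deterministic_tensor(n, m), compute_uv=False)  # decreasing order
print(np.round(s, 6))
print(np.round(predicted_singular_values(n, m), 6))

# n = m = 2: the Hadamard matrix plays the role of R
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
D2 = deterministic_tensor(2, 2)
V = np.kron(H, H)
G = V.T @ D2.T @ D2 @ V             # should be diagonal if V holds the right singular vectors
off_diagonal = G - np.diag(np.diag(G))
```

The number of non-zero singular values, $1 + m(n-1)$, matches the number of independent parameters of a single party's marginal $P(a|x)$ together with normalization.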
Lemma 1 and the explicit SVD of $D$ [Eqs. (10) to (15)] represent the main technical results of this paper. The first implies the decomposition in Eq. (4) which, although known, we provide with a new derivation and utility based on tensor networks. Refining the decomposition using the SVD of $D$ suggests attacking the problem of Bell non-locality in the basis of singular vectors of $D \otimes D$ for the hidden variable of Alice and Bob. Noting the tensor product structure of the matrix $V = R \otimes \ldots \otimes R$, it follows that this is exactly the correlation basis which we have just described.

III. THE CORRELATION BASIS
The problem of Bell non-locality is exactly equivalent to the marginal problem: given $P(ab|xy)$, can we find the probability $p_{a_1 \ldots a_m b_1 \ldots b_m}$ that reproduces all the correct marginals? Given the probability $\mathbf{p}$ we can construct the marginals such as $P(a|x)$ by summing over all indices except $a_x$, and $P(ab|xy)$ by summing over all indices except $a_x$ and $b_y$, and so on. Using the tensor network structure of the Bell non-locality problem laid out above, we can see exactly how this equivalence is manifest in the basis of singular vectors of $D \otimes D$ in the bipartite scenario (which we call the correlation basis). Specifying $P(ab|xy)$ fixes the coefficients of basis elements with non-zero singular values. The kernel of $D \otimes D$ then determines a subspace of quasi probabilities $\mathbf{q}$ which are compatible with $\mathbf{P}$. Whether or not this subspace contains a proper probability $\mathbf{p}$ is then a subsequent problem, which can be treated with algorithms from the theory of compressed sensing. This will be discussed in the next section.
Let us make a general definition: for a general (quasi) probability $q_{c_1 \ldots c_l}$ we can use the rotation $R$ to define
$$\mathbf{z} = (R \otimes \cdots \otimes R)\,\mathbf{q} \quad (l \text{ factors}),$$
and we refer to the standard basis in the indices of $\mathbf{z}$ as the correlation basis. In the bipartite scenario these indices are split into two groups, $a_x$ and $b_y$. But because $\mathbf{z}$ is defined by acting on $\mathbf{q}$ with a tensor product operator $V = R \otimes \ldots \otimes R$, the definition generalizes to multipartite scenarios or to the case where Alice and Bob have different numbers of inputs (or even outputs).

Given a no-signalling $P(ab|xy)$, we can obtain unambiguous marginals $P(a|x)$ and $P(b|y)$. Given the decomposition (4), it is straightforward to see that these are obtained by summing over all indices except $a_x$ and/or $b_y$. By the fact that $R_{a0} = 1/\sqrt{n}$ we can easily see that the coefficients of $\mathbf{z}$ with at most one non-zero index in Alice's group and at most one in Bob's group are linear functions of these marginals and of $P(ab|xy)$ [explicit expressions missing in source], so these elements are fixed by the given $P(ab|xy)$. Equivalently, one can look at the decomposition of $\mathbf{P}$ in terms of $\mathbf{z}$ as
$$\mathbf{P} = (U \otimes U)(S \otimes S)\,\mathbf{z}$$
and formally invert this by taking the inverse only of the non-zero singular values. The corresponding coefficients of $\mathbf{z}$ are exactly the ones fixed by the marginals that follow from the problem. Putting all other coefficients of $\mathbf{z}$ to zero corresponds to the $\mathbf{q}$ obtained by applying the Moore-Penrose pseudo inverse [38] of $D \otimes D$ to $\mathbf{P}$. This gives the solution of the linear equation $(D \otimes D)\mathbf{q} = \mathbf{P}$ of minimal $\ell_2$-norm. While this is a viable quasi probability consistent with $\mathbf{P}$, there is no guarantee that this $\mathbf{q}$ is a probability if $\mathbf{P}$ is local, since adding any element $\mathbf{k}$ of the kernel of $D \otimes D$ to $\mathbf{q}$ gives a $\mathbf{q}' = \mathbf{q} + \mathbf{k}$ that is still a quasi probability and reproduces $\mathbf{P}$. It may happen that the $\mathbf{q}$ with minimal $\ell_2$-norm lies outside the probability simplex while some $\mathbf{q}'$ lies inside it. The clearest example is given by the deterministic cases that form the corners of the local hidden variable polytope. Let us consider $n = m = 2$ and $P(00|xy) = 1$ with all other $P(ab|xy) = 0$. The correct hidden variable probability that generates this behavior is $q_{a_0 a_1 b_0 b_1} = \delta_{a_0 0}\,\delta_{a_1 0}\,\delta_{b_0 0}\,\delta_{b_1 0}$, but the minimal $\ell_2$-norm solution is in fact a different vector with negative entries [explicit expression missing in source]. This is illustrated in Fig. 2.
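The deterministic example can be reproduced with a pseudo inverse. The sketch below assumes NumPy and uses `numpy.linalg.pinv`; the helper `deterministic_tensor` is a hypothetical name. It builds $P(00|xy) = 1$ for $n = m = 2$ and compares the true hidden-variable probability with the minimal $\ell_2$-norm solution of $(D \otimes D)\mathbf{q} = \mathbf{P}$.

```python
import itertools
import numpy as np

def deterministic_tensor(n, m):
    """Matrix with rows (a, x) and columns (a_0..a_{m-1}); entry is delta(a, a_x)."""
    D = np.zeros((n * m, n ** m))
    for col, seq in enumerate(itertools.product(range(n), repeat=m)):
        for x, a_x in enumerate(seq):
            D[a_x * m + x, col] = 1.0
    return D

n = m = 2
D = deterministic_tensor(n, m)
DD = np.kron(D, D)                     # acts on q indexed by (a_0, a_1, b_0, b_1)

q_true = np.zeros(n ** (2 * m))
q_true[0] = 1.0                        # a_0 = a_1 = b_0 = b_1 = 0 with certainty
P = DD @ q_true                        # P(00|xy) = 1, all other entries 0

q_l2 = np.linalg.pinv(DD) @ P          # minimal l2-norm solution
print(q_l2.reshape(4, 4))              # rows: (a_0, a_1), columns: (b_0, b_1)
print("min entry:", q_l2.min())        # -3/16 for this example
```

Working the projection out by hand for this case gives the product of $(3/4, 1/4, 1/4, -1/4)$ for Alice with the same vector for Bob, so the minimal $\ell_2$-norm solution reproduces $\mathbf{P}$ but is not a probability, even though $\mathbf{P}$ is local.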
One way to guarantee that a solution $\mathbf{q}$ is a proper probability is to make sure it minimizes an $\ell_p$-norm with $0 \leq p \leq 1$. Algorithms to solve this problem will be discussed in the next section.
Bell inequalities are the traditional way to test Bell non-locality. We can offer a new perspective on Bell inequalities in terms of the correlation basis. Clearly, any local conditional probability $\mathbf{P}$ has by definition a proper probability $\mathbf{q} = \mathbf{p}$ for the hidden variable. For $\mathbf{q}$ to be a probability it simply has to have non-negative coefficients, which in terms of $\mathbf{z}$ means $\mathbf{V}\mathbf{z} \geq 0$ (the inequality is interpreted element-wise). Here $\mathbf{V} = R^{\otimes 2m}$. One can use quantifier elimination methods such as Fourier-Motzkin elimination to derive the domains for a restricted set of coefficients of $\mathbf{z}$ which can satisfy these inequalities, treating the other coefficients as free variables that can take any required value. The Bell setup amounts to fixing certain coefficients of $\mathbf{z}$, while any value for the other coefficients leads to a consistent $\mathbf{q}$. Hence, if the reduced set of inequalities is satisfied by the fixed coefficients of $\mathbf{z}$, we can find a true probability consistent with $\mathbf{P}$. These inequalities are therefore the Bell inequalities. Alternatively, one can proceed as usual: write down the deterministic strategies (the extremal points of the local polytope) in terms of the correlation basis and use standard convex optimization algorithms to obtain Bell inequalities.
To illustrate this point let us give the CHSH inequality [10] in the correlation basis. This corresponds to $n = m = 2$, in terms of which the non-trivial Bell inequality for the coefficients $z_{a_0 a_1 b_0 b_1}$ can be written down explicitly [inequality missing in source]. The transformation to the correlation basis may also be useful in attacking more general contextuality scenarios [39].
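One way to see what the CHSH inequality looks like in the correlation basis is the following sketch, which is derived here and may differ from the paper's displayed form by signs or normalization. For $n = m = 2$ and $R$ the Hadamard matrix, $z_{1010} = \langle A_0 B_0 \rangle / 4$, and similarly for the other three correlators, so the familiar bound $|\langle A_0B_0\rangle + \langle A_0B_1\rangle + \langle A_1B_0\rangle - \langle A_1B_1\rangle| \leq 2$ becomes $|z_{1010} + z_{1001} + z_{0110} - z_{0101}| \leq 1/2$. The code assumes NumPy; all function names are hypothetical.

```python
import itertools
import numpy as np

def deterministic_tensor(n, m):
    """Matrix with rows (a, x) and columns (a_0..a_{m-1}); entry is delta(a, a_x)."""
    D = np.zeros((n * m, n ** m))
    for col, seq in enumerate(itertools.product(range(n), repeat=m)):
        for x, a_x in enumerate(seq):
            D[a_x * m + x, col] = 1.0
    return D

D = deterministic_tensor(2, 2)
DD = np.kron(D, D)
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)    # R for n = 2
V = np.kron(np.kron(H, H), np.kron(H, H))               # symmetric and orthogonal

def chsh_in_correlation_basis(P):
    """P[a, x, b, y] -> z_1010 + z_1001 + z_0110 - z_0101 (local bound: 1/2)."""
    q = np.linalg.pinv(DD) @ P.reshape(-1)   # any q reproducing P gives the same fixed z
    z = (V @ q).reshape(2, 2, 2, 2)
    return z[1, 0, 1, 0] + z[1, 0, 0, 1] + z[0, 1, 1, 0] - z[0, 1, 0, 1]

P_pr = np.zeros((2, 2, 2, 2))               # PR box: a XOR b = x*y
P_det = np.zeros((2, 2, 2, 2))              # local deterministic: a = b = 0
for a, x, b, y in itertools.product(range(2), repeat=4):
    if (a ^ b) == x * y:
        P_pr[a, x, b, y] = 0.5
    if a == 0 and b == 0:
        P_det[a, x, b, y] = 1.0

print(chsh_in_correlation_basis(P_pr))      # 1.0, above the local bound 1/2
print(chsh_in_correlation_basis(P_det))     # 0.5, saturating the local bound
```

Only coefficients of $\mathbf{z}$ that are fixed by $\mathbf{P}$ enter the expression, which is why the pseudo-inverse solution suffices here.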

IV. RECOVERING THE HIDDEN VARIABLE AND QUANTIFYING NONLOCALITY
We will now consider the following problem: given a conditional probability $\mathbf{P}$ for a bipartite Bell scenario, determine whether it corresponds to a local hidden variable model. As a further refinement of the problem, one may be interested in quantifying by some distance measure the degree of non-locality in case $\mathbf{P}$ is found to be non-local. This problem is also considered in Ref. [13]. Here we will show how techniques from compressed sensing can be used to solve this problem efficiently.
In compressed sensing one is interested in solving the linear equation $Ax = b$ and finding the solution $x$ that is as sparse as possible. Formally, the most sparse solution minimizes the $\ell_0$-norm, but minimizing the $\ell_1$-norm instead is known to give a good approximation and has the benefit of being convex and amenable to the techniques of convex optimization.
In our case we can also focus attention on the $\ell_1$-norm, and we consider the optimization problem
$$\text{minimize } \|\mathbf{q}\|_1 \quad \text{subject to } (D \otimes D)\mathbf{q} = \mathbf{P}.$$
This problem is known as basis pursuit in the computer science literature. There are several classes of algorithms equipped to solve it. A well-known way is to map the problem to a linear program (LP). Although slightly differently formulated, this is very similar to the approach detailed in Ref. [13]. The problem with LPs, however, is that, while accurate, they become computationally expensive for large dimensions. For instance, the dimension of the hidden-variable space grows exponentially with the number of inputs, as $n^{2m}$.
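For small scenarios the basis pursuit problem can be solved directly with the standard LP reformulation $\mathbf{q} = \mathbf{u} - \mathbf{v}$, $\mathbf{u}, \mathbf{v} \geq 0$. The sketch below is this textbook LP route, not the NESTA method used in the paper, and it assumes NumPy and SciPy's `scipy.optimize.linprog` with the HiGHS backend; the function names are hypothetical.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def deterministic_tensor(n, m):
    """Matrix with rows (a, x) and columns (a_0..a_{m-1}); entry is delta(a, a_x)."""
    D = np.zeros((n * m, n ** m))
    for col, seq in enumerate(itertools.product(range(n), repeat=m)):
        for x, a_x in enumerate(seq):
            D[a_x * m + x, col] = 1.0
    return D

def min_l1_quasi_probability(P, n, m):
    """Solve: minimize ||q||_1 subject to (D x D) q = P, with P[a, x, b, y]."""
    D = deterministic_tensor(n, m)
    A = np.kron(D, D)
    N = A.shape[1]
    # q = u - v with u, v >= 0; at the optimum sum(u) + sum(v) equals ||q||_1
    res = linprog(c=np.ones(2 * N), A_eq=np.hstack([A, -A]), b_eq=P.reshape(-1),
                  bounds=(0, None), method="highs")
    if not res.success:
        raise ValueError("no compatible q found: " + res.message)
    return res.x[:N] - res.x[N:], res.fun

P_pr = np.zeros((2, 2, 2, 2))               # PR box
P_det = np.zeros((2, 2, 2, 2))              # local deterministic behavior
for a, x, b, y in itertools.product(range(2), repeat=4):
    if (a ^ b) == x * y:
        P_pr[a, x, b, y] = 0.5
    if a == 0 and b == 0:
        P_det[a, x, b, y] = 1.0

q_pr, l1_pr = min_l1_quasi_probability(P_pr, 2, 2)
q_det, l1_det = min_l1_quasi_probability(P_det, 2, 2)
print(l1_pr, l1_det)                         # 2.0 for the PR box, 1.0 for the local case
```

A minimal $\ell_1$-norm of exactly 1 signals that a non-negative $\mathbf{q}$ exists, so $\mathbf{P}$ is local. For the PR box the value 2 follows from the CHSH value 4 together with the fact that every deterministic strategy has CHSH value $\pm 2$.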
We will follow a different route here: Based on the tensor-network decomposition of P we can formulate the problem such that it precisely fits the most efficient version of NESTA (a shorthand for Nesterov's algorithm), a class of algorithms introduced in [36] to tackle exactly the basis pursuit problem.
Recall that fixing the no-signalling conditional probability $P(ab|xy)$ is equivalent to fixing certain coefficients of the vector $\mathbf{z}$ [Eqs. (19) to (22)]. Let us introduce the projector $\Pi$ that projects onto these coefficients and let us denote the vector of these coefficients by $\mathbf{z}_0$. Then the problem can be reformulated as
$$\text{minimize } \|\mathbf{V}\mathbf{z}\|_1 \quad \text{subject to } \Pi\mathbf{z} = \mathbf{z}_0,$$
where $\mathbf{V} = V \otimes V = R^{\otimes 2m}$. We have used the tensor-network decomposition of $\mathbf{P}$ and the NCON function in Matlab [40] to compute the vector $\mathbf{z}_0$ from $\mathbf{P}$. Subsequently we have fed this into the NESTA package available online [36,41] to obtain $\mathbf{q} = \mathbf{V}\mathbf{z}$. The main goal here is efficiency. If $\mathbf{q}$ is found to be non-negative, we have obtained a probability by construction and the original $\mathbf{P}$ is local.
In Fig. 3 we have compared the LP-based method from [13], implemented in Mathematica, with our current method. We see that, at least in the case of a varying number of outputs, the NESTA-based approach gives a very significant improvement in computation time for larger cases. To test the correctness we have also compared NL($\mathbf{P}$) with neg($\mathbf{P}$) and see that the relation is linear. Here NL($\mathbf{P}$) is the $\ell_1$ distance of $\mathbf{P}$ to the local polytope [13], while neg($\mathbf{P}$) is the minimum over all $\mathbf{q}$ compatible with $\mathbf{P}$ of the negativity
$$\mathrm{neg}(\mathbf{q}) = \sum_{a_1 \ldots a_m} \sum_{b_1 \ldots b_m} \max(-q_{a_1 \ldots a_m b_1 \ldots b_m}, 0),$$
and can also be understood as a measure of how non-local a given distribution $\mathbf{P}$ is. Since we also have $\mathrm{neg}(\mathbf{q}) = \frac{1}{2}\left(\|\mathbf{q}\|_1 - 1\right)$, minimizing the negativity is indeed equivalent to minimizing the $\ell_1$-norm. Remarkably, for even $n$ we find exact equality neg($\mathbf{P}$) = NL($\mathbf{P}$) and better performance of the NESTA algorithm, while for odd $n$ we find a non-trivial constant of proportionality and also longer computation times. An exact diagnosis of why this is the case is postponed to future research. One remark we can make, however, is that the longer computation time seems to be caused by the fact that more iterations are needed in NESTA to reach the stop criterion. A big part of the computational cost of NESTA comes from the matrix multiplication by $\mathbf{V}$ and $\mathbf{V}^T$. We leveraged the fact that $\mathbf{V} = R^{\otimes 2m}$ is a tensor product operator by using a multiplication routine that does not construct the full matrix $\mathbf{V}$ [42].
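The idea of multiplying by a tensor product operator without forming it can be illustrated as follows. This is a generic sketch assuming NumPy, not the routine of Ref. [42]; the names `apply_kron_power` and `negativity` are hypothetical. The vector is reshaped into a tensor with $k$ indices and $R$ is contracted with one index at a time, so the cost is of order $k\, n^{k+1}$ instead of $n^{2k}$. The second helper evaluates the negativity and can be used to check the relation $\mathrm{neg}(\mathbf{q}) = \frac{1}{2}(\|\mathbf{q}\|_1 - 1)$.

```python
import numpy as np

def apply_kron_power(R, v, k):
    """Return (R x ... x R) v with k factors, without building the full matrix."""
    n = R.shape[0]
    T = np.asarray(v, dtype=float).reshape((n,) * k)
    for axis in range(k):
        # contract R with one tensor index, then move the new index back in place
        T = np.moveaxis(np.tensordot(R, T, axes=([1], [axis])), 0, axis)
    return T.reshape(-1)

def negativity(q):
    """Sum of the absolute values of the negative entries of q."""
    return float(np.maximum(-np.asarray(q, dtype=float), 0.0).sum())

rng = np.random.default_rng(1)
R = rng.normal(size=(3, 3))                 # any square matrix works for this check
v = rng.normal(size=3 ** 3)
explicit = np.kron(np.kron(R, R), R) @ v    # reference result with the full matrix
fast = apply_kron_power(R, v, 3)

q = rng.normal(size=16)
q += (1.0 - q.sum()) / q.size               # quasi probability summing to 1
print(np.allclose(fast, explicit), negativity(q), 0.5 * (np.abs(q).sum() - 1.0))
```

For $\mathbf{V} = R^{\otimes 2m}$ one would call `apply_kron_power(R, z, 2 * m)`; multiplication by $\mathbf{V}^T$ uses `R.T` in the same way.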

V. CONCLUSIONS
Bell non-locality is a cornerstone of quantum theory and a central resource in a variety of quantum information processing protocols. To that end, a basic step is to decide whether a given observed correlation is non-local or not. Given its importance, over the years a few approaches have been developed to tackle the problem [2,13,14][43][44][45][46]; they are, however, limited in practice to relatively simple Bell scenarios, with a small number of distant parties, measurement settings and outcomes. Most such approaches are based on a geometric view of Bell scenarios, more precisely on the fact that local correlations form a polytope and that to detect non-locality means to find ways of certifying that a given correlation lies outside this local set. However, different perspectives, such as sheaf-theoretic [15], graph-theoretic [16], causal [18,19] or category-theoretic [47][48][49] ones, are also possible. Often a different perspective leads to new insights and computational methods.

FIG. 3. (a) Computation times for Bell 22n (bipartite, two inputs, n outputs). We computed $\mathbf{q}$ via the NESTA algorithm, which gives access to neg($\mathbf{P}$), and we computed the $\ell_1$ distance NL($\mathbf{P}$) via linear programming (LP). Non-zero values of neg($\mathbf{P}$) and NL($\mathbf{P}$) correspond to non-locality of $\mathbf{P}$. We see that the NESTA computation scales much more favourably with increasing dimension. The sampled data here is 100 points of a random convex combination of the form $\mathbf{P} = c_0 \mathbf{P}_{wn} + c_1 \mathbf{P}_{ld} + c_2 \mathbf{P}_{pr}$, where $P_{wn}(ab|xy) = 1/n^2$ is white noise, $P_{ld}(ab|xy) = \delta_{a0}\delta_{b0}$ is local deterministic and $P_{pr}(ab|xy) = 1/n$ if $b - a = xy \bmod 2$ and zero otherwise is a generalized PR box, i.e. a corner of the no-signalling set. Solid lines are fits with $f(n) = a[\exp(bn) - 1]$. For LP we find $b \approx 1.006$. For NESTA we find a distinct difference between even and odd cases, which we fit separately. This gives an exponent $b \approx 0.25$. Fitting the even and odd cases together (dashed blue line) gives $b \approx 0.18$. (b) We compare NL($\mathbf{P}$) with neg($\mathbf{q}$), showing that all cases which are deemed local by the LP method are indeed local according to the NESTA-based method. The relation is linear in all cases, but while for even $n$ we find equality neg($\mathbf{q}$) = NL($\mathbf{P}$), the cases with odd $n$ have a non-trivial constant of proportionality.
Here we propose a new perspective to understand and quantify Bell non-locality, based on a tensor network description. With it we showed that non-signalling correlations can be described by hidden variable models governed by quasi-probabilities, the negativity of which offers a natural way to quantify non-locality. By refining our description via a singular value decomposition we obtained a natural basis in which to attack the problem of quantifying non-locality, one that can be computationally implemented via efficient sparse recovery algorithms. To show its relevance we compared the NESTA algorithm used in compressed sensing with the standard linear programming tool used in the study of non-locality and showed a significant speed-up in computational time as a function of the complexity of the Bell scenario.
We believe this perspective on Bell's theorem opens a few avenues for future research. Local correlations are equivalent to hidden variable models governed by probabilities, while general non-signalling correlations imply quasi-probabilities. What are the restrictions imposed by Born's rule (the quantum mechanical rule) on such quasi-probabilities? There is an important research program trying to recover the quantum limitations on correlations [37,50]; however, to our knowledge, the intersection of this program with this quasi-probability description of non-locality has not been considered so far (see, however, recent results on general connections between quantum theory and quasi probabilities such as [28,35,51]). A potential application of the sparse recovery method is to combine it with the machine-learning approach that has been recently proposed [46]. There, neural networks are used in supervised learning algorithms that create a machine model of the local set. The bottleneck of the method is exactly the fact that the standard linear programming approach is used to generate the input data, consisting of a set of correlations and their respective degree of non-locality. Can the NESTA algorithm [36] provide a more scalable solution to this machine learning approach? Finally, the tensor network description can easily be extended to more complex Bell scenarios consisting of several independent sources [52] and leading to non-convex sets of correlations [53,54]. Can generalizations of sparse recovery algorithms adapted to deal with non-linear constraints [55] provide a new way to deal with such complicated Bell scenarios? Can tensor network ideas be further leveraged to manage the computational cost if the network of causal relations increases in size in a certain way? We hope our results might motivate further research along these directions.


FIG. 2. Conceptual illustration of the local versus non-local probabilities. (a) In the space of conditional probabilities $\mathbf{P}$ the no-signalling condition defines a polytope (NS). Strictly included in NS there is the polytope of local correlations (L). We show that NS exactly corresponds to those $\mathbf{P}$ which allow a hidden variable model defined by a quasi probability $\mathbf{q}$, i.e. solutions to the equation $(D \otimes D)\mathbf{q} = \mathbf{P}$. The local polytope L exactly corresponds to those $\mathbf{P}$ for which $\mathbf{q}$ can be chosen to be non-negative, i.e. a probability. The quantum set Q is defined as those $\mathbf{P}$ that can be obtained from a quantum mechanical model, $P(ab|xy) = \mathrm{Tr}[M^x_a \otimes M^y_b \rho]$. It is not a polytope and satisfies the strict inclusions L ⊂ Q ⊂ NS. (b) Illustration of the hyperplane of quasi probabilities $\mathbf{q}$ (red) with the probability simplex indicated (blue). It is shown that a solution $\mathbf{q}$ for the hidden-variable quasi probability that minimizes the $\ell_2$-norm can lie outside the simplex even when $\mathbf{P}$ is local, in which case a proper probability is reached as $\mathbf{q}' = \mathbf{q} + \mathbf{k}$ with $\mathbf{k} \in \ker(D \otimes D)$.

FIG. 3. (a) Computation times for Bell 22n (bipartite, two inputs, n outputs).