The simplest causal inequalities and their violation

In a scenario where two parties share, act on and exchange some physical resource, the assumption that the parties' actions are ordered according to a definite causal structure yields constraints on the possible correlations that can be established. We show that the set of correlations that are compatible with a definite causal order forms a polytope, whose facets define causal inequalities. We fully characterize this causal polytope in the simplest case of bipartite correlations with binary inputs and outputs. We find two families of nonequivalent causal inequalities; both can be violated in the recently introduced framework of process matrices, which extends the standard quantum formalism by relaxing the implicit assumption of a fixed causal structure. Our work paves the way to a more systematic investigation of causal inequalities in a theory-independent way, and of their violation within the framework of process matrices.


I. INTRODUCTION
In our common understanding of the world, we typically perceive events as being embedded in some causal structure, where events happening earlier can influence events happening later but not vice versa. Correlations can be established in such a picture by physical systems that may be shared or exchanged by different parties, and which may be used to communicate or convey causal influences.
It is well known, however, that this view is challenged by quantum correlations: Bell's theorem [1] shows for instance that these conflict with Reichenbach's common cause principle [2,3], so that quantum mechanics forces us to generalize the notion of causal influence [4-9]. Another implication of the causal picture above is that if one assumes that the parties interact only once with the physical medium, then only one-way influences (i.e., one-way signaling) are possible, which restricts the possible correlations that can be observed, independently of any assumptions on the physics of the involved systems.
But is this view that events should comply with a definite causal structure, and causal influences can only be unidirectional, necessary in any physical theory? Or could one envisage theories where the causal relations between events are not necessarily well defined [10,11]? To answer these questions, Oreshkov, Costa and Brukner developed the framework of process matrices as an extension of quantum theory, where the assumption of a fixed causal structure is relaxed [12]. Process matrices describe the physical resource that allows different parties to establish correlations. Oreshkov et al. showed that certain so-called causally nonseparable process matrices indeed do not comply with a definite causal structure.
The incompatibility of a certain causally nonseparable process matrix with a definite causal structure was proven in Ref. [12] by its ability to generate correlations that are incompatible with a definite causal order, as demonstrated by the violation of a so-called causal inequality. This can be tested in a device-independent manner, by just looking at the statistics observed in an experiment. It was recently shown that causal nonseparability could also be detected in a device-dependent manner by using so-called causal witnesses [13]. This approach is more powerful as it can detect all causally nonseparable process matrices, while not all causally nonseparable process matrices can violate a causal inequality [13,14]. Furthermore, physical implementations of certain (multipartite) causally nonseparable process matrices, and of corresponding causal witnesses that detect their causal nonseparability, have been proposed [13-15] and even realized experimentally [16], while it is still not known whether there actually exists any physically realizable process that violates a causal inequality. Nevertheless, the device-independent approach is still of interest as it relaxes the requirement to trust the functioning and the operations implemented by one's devices in an experiment. It is furthermore also theory-independent: causal inequalities can in principle be tested, and correlations with no definite causal order can be identified, whatever the description of the physical resource is, whether we use the process matrix framework or any other theory to be discovered in the future. A related open question is whether the ability to violate causal inequalities can (in analogy with Bell nonlocality [17]) be exploited as a resource, just like causally nonseparable process matrices provide advantages for information-theoretical [18] and computational [19] tasks.
Our paper aims at providing a better understanding of the device-independent characterization of correlations that are compatible with a definite causal order or not. We show that bipartite correlations with a definite causal order form a convex polytope, whose facets correspond to causal inequalities (Section II). We characterize this causal polytope in the simplest scenario where the two parties observe correlations with binary inputs and outputs, which gives us two families of new causal inequalities. We then investigate their possible violation in the framework of process matrices, and find that these can indeed be violated (Section III). This provides an example of "noncausal" process matrix correlations in a simpler scenario than that considered in Ref. [12], where one party had two input bits, or in Refs. [20,21], where more parties were involved.

A. "Causal correlations"
We consider an experiment with two parties, Alice and Bob, each of them having control over some closed laboratory. They both open their lab, let some physical system in, interact with it and send a physical system out, only once during each run of the experiment. Alice and Bob are given some classical inputs labeled by x and y, and return some classical outputs a and b, respectively. We assume that all inputs and outputs have a finite number of possible values. The correlation that Alice and Bob establish in such an experiment is described by the joint conditional probability distribution p(a, b|x, y).
In a situation where at each run of the experiment Alice's events precede Bob's events (denoted $A \prec B$), Alice could send her input and output to Bob, but not vice versa; hence, there cannot be any signaling from Bob to Alice, and their correlation, which we shall denote in this case $p^{A \prec B}$, must therefore satisfy
$$\forall\, x, y, y', a, \quad p^{A \prec B}(a|x,y) = p^{A \prec B}(a|x,y'), \tag{1}$$
with $p^{A \prec B}(a|x,y^{(\prime)}) = \sum_b p^{A \prec B}(a,b|x,y^{(\prime)})$. Similarly, in a situation where Bob's events precede Alice's ($B \prec A$), their correlation $p^{B \prec A}$ must satisfy the no-signaling-to-Bob constraint
$$\forall\, x, x', y, b, \quad p^{B \prec A}(b|x,y) = p^{B \prec A}(b|x',y), \tag{2}$$
with $p^{B \prec A}(b|x^{(\prime)},y) = \sum_a p^{B \prec A}(a,b|x^{(\prime)},y)$. Note that non-signaling correlations satisfy both Eqs. (1) and (2), and are compatible with both causal orders $A \prec B$ and $B \prec A$. More generally, if the correlation is compatible with the causal order $A \prec B$ with probability $q$, and with $B \prec A$ with probability $1-q$, then the correlation will be of the form
$$p(a,b|x,y) = q\, p^{A \prec B}(a,b|x,y) + (1-q)\, p^{B \prec A}(a,b|x,y). \tag{3}$$
Following Refs. [13,14,22], we call the bipartite probability distribution $p(a,b|x,y)$ (or the correlation it describes, equivalently) "causal" if it can be written as in Eq. (3), with $q \in [0,1]$ and $p^{A \prec B}$ and $p^{B \prec A}$ valid (i.e., nonnegative and normalized) probability distributions satisfying Eqs. (1) and (2), respectively. Causal correlations are those that can be obtained in a situation where every run of the experiment is compatible with a definite causal order ($A \prec B$ or $B \prec A$), which may however vary for each run, and is only determined probabilistically. Note that the decomposition (3) is in general not unique, as non-signaling contributions can be included in either $p^{A \prec B}$ or $p^{B \prec A}$.
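The one-way no-signaling conditions and the convex mixture that defines a causal correlation can be checked mechanically. The following is a minimal Python sketch for binary inputs and outputs; the dictionary representation and all function names are illustrative choices made here, not part of the paper.

```python
from itertools import product

BITS = (0, 1)

def marginal_a(p, a, x, y):
    """Alice's marginal p(a|x,y) = sum_b p(a,b|x,y)."""
    return sum(p[(a, b, x, y)] for b in BITS)

def marginal_b(p, b, x, y):
    """Bob's marginal p(b|x,y) = sum_a p(a,b|x,y)."""
    return sum(p[(a, b, x, y)] for a in BITS)

def no_signaling_to_alice(p, tol=1e-12):
    """Alice's marginal must not depend on Bob's input y (order A before B)."""
    return all(abs(marginal_a(p, a, x, 0) - marginal_a(p, a, x, 1)) <= tol
               for a, x in product(BITS, BITS))

def no_signaling_to_bob(p, tol=1e-12):
    """Bob's marginal must not depend on Alice's input x (order B before A)."""
    return all(abs(marginal_b(p, b, 0, y) - marginal_b(p, b, 1, y)) <= tol
               for b, y in product(BITS, BITS))

def deterministic(f_a, f_b):
    """Correlation in which a = f_a(x, y) and b = f_b(x, y) with certainty."""
    return {(a, b, x, y): float(a == f_a(x, y) and b == f_b(x, y))
            for a, b, x, y in product(BITS, repeat=4)}

def mix(q, p1, p2):
    """Convex mixture q * p1 + (1 - q) * p2."""
    return {k: q * p1[k] + (1 - q) * p2[k] for k in p1}

# A before B: Alice outputs 0, Bob copies Alice's input.
p_ab = deterministic(lambda x, y: 0, lambda x, y: x)
# B before A: Bob outputs 0, Alice copies Bob's input.
p_ba = deterministic(lambda x, y: y, lambda x, y: 0)
# Equal mixture of the two orders: causal by construction.
p_causal = mix(0.5, p_ab, p_ba)

print(no_signaling_to_alice(p_ab), no_signaling_to_bob(p_ba))
print(no_signaling_to_alice(p_causal), no_signaling_to_bob(p_causal))
```

The mixture signals in both directions on average, yet it is causal because each run has a definite order; two-way signaling alone does not imply noncausality.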

B. Causal polytopes and causal inequalities
Correlations that are compatible with the causal order $A \prec B$ satisfy nonnegativity ($p^{A \prec B}(a,b|x,y) \ge 0\ \forall\, x,y,a,b$) and normalization ($\sum_{a,b} p^{A \prec B}(a,b|x,y) = 1\ \forall\, x,y$) constraints, together with the no-signaling-to-Alice constraint (1). As these constitute a finite number of linear constraints on a bounded probability space¹, it follows that the set of correlations $p^{A \prec B}$ is a (convex) polytope [23]. Similarly, the set of correlations $p^{B \prec A}$ that are compatible with the causal order $B \prec A$ is also a polytope. Now, according to Eq. (3), the set of causal correlations is simply the convex hull of the sets of correlations $p^{A \prec B}$ and $p^{B \prec A}$, and is therefore itself a polytope, which we call the causal polytope.
By construction, the extremal points of the causal polytope are extremal points of either the polytope of $p^{A \prec B}$ correlations, or of the polytope of $p^{B \prec A}$ correlations (or of both polytopes); in Appendix A we show that these correspond to deterministic correlations compatible with either causal order (or both, in the case of non-signaling correlations). From this "V-representation" of the causal polytope in terms of its vertices, for a given number of inputs and outputs, one can in principle determine its equivalent "H-representation" in terms of its facets [23] (although in practice, this is a hard problem to solve when the number of inputs and outputs increases). Some of its facets are trivial, in the sense that they only correspond to the nonnegativity constraints $p(a,b|x,y) \ge 0$; its other, nontrivial facets define so-called causal inequalities [12], i.e., inequalities that are satisfied by any causal correlation.
The above characterization hints of course at a strong analogy with Bell inequalities, which may be obtained as facets of the "local polytope" [1,17,24] (or may not; in the same way that not all Bell inequalities are facets of the local polytope, not all causal inequalities are facets of the causal polytope, as they can also correspond to some external hyperplanes²). Causal inequalities are written as linear combinations of the conditional probabilities $p(a,b|x,y)$, constrained by some "causal bounds". They can also be translated into the language of "causal games" by considering for instance the linear combination to define the score, or possibly the probability of success (for some specific distribution of inputs), of some game. They can be tested experimentally in a device-independent way, i.e., by just considering the observed statistics, without making any assumptions on the functioning of the physical devices used in the experiment: a violation of a causal inequality guarantees that the observed correlation is incompatible with a definite causal order or, in short, is noncausal.

C. The simplest causal polytope
To illustrate the previous discussion, we now turn to the characterization of the simplest nontrivial causal polytope. Note that causal inequalities can only be nontrivial if each party has nontrivial inputs and outputs, i.e., if they can take at least two different values. Indeed, if one party only has trivial inputs or outputs, then clearly either (1) or (2) holds, so that any correlation is compatible with a definite causal order.
Hence, the simplest candidate for a nontrivial causal polytope is the case with a single bit of input and output for each of the two parties³ (which we shall denote by 0 or 1), reminiscent of the scenario considered by Clauser-Horne-Shimony-Holt (CHSH) in the case of nonlocality [26]. We generated the list of its 112 deterministic vertices (see Appendix A), and enumerated its 48 facets using the software lrs [27].
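The vertex count quoted above is easy to reproduce by brute force. The sketch below enumerates the deterministic strategies for each causal order in plain Python; representing a vertex as a tuple of output pairs, one per input pair, is a convention chosen here for illustration (the facet enumeration itself, done with lrs in the paper, is not reproduced).

```python
from itertools import product

BITS = (0, 1)
INPUTS = list(product(BITS, BITS))  # (x, y) in the order 00, 01, 10, 11

def vertices_a_first():
    """Deterministic strategies with A before B: a = alpha(x), b = beta(x, y)."""
    verts = set()
    for alpha in product(BITS, repeat=2):        # alpha[x]
        for beta in product(BITS, repeat=4):     # beta[2*x + y]
            verts.add(tuple((alpha[x], beta[2 * x + y]) for x, y in INPUTS))
    return verts

def vertices_b_first():
    """Deterministic strategies with B before A: b = beta(y), a = alpha(x, y)."""
    verts = set()
    for beta in product(BITS, repeat=2):         # beta[y]
        for alpha in product(BITS, repeat=4):    # alpha[2*x + y]
            verts.add(tuple((alpha[2 * x + y], beta[y]) for x, y in INPUTS))
    return verts

v_ab = vertices_a_first()
v_ba = vertices_b_first()
# Vertices per order, shared (non-signaling) vertices, and total.
print(len(v_ab), len(v_ba), len(v_ab & v_ba), len(v_ab | v_ba))  # 64 64 16 112
```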
16 of these facets are trivial, corresponding to the nonnegativity constraints $p(a,b|x,y) \ge 0$. By relabeling the inputs and outputs, the 32 remaining, nontrivial facets can be grouped into two nonequivalent families of causal inequalities: 16 facets are relabelings of the inequality
$$\frac14 \sum_{x,y,a,b} \delta_{a,y}\, \delta_{b,x}\; p(a,b|x,y) \le \frac12, \tag{4}$$
where $\delta_{i,j}$ is the Kronecker delta, while the last 16 facets are relabelings of the inequality
$$\frac14 \sum_{x,y,a,b} \delta_{x(a \oplus y),0}\, \delta_{y(b \oplus x),0}\; p(a,b|x,y) \le \frac34, \tag{5}$$
where $\oplus$ denotes addition modulo 2.
² […] 5056 vertices, only span a 21-dimensional affine subspace, while a facet of this 24-dimensional polytope should have dimension 23). Likewise, the causal inequalities (21-24) below are not facets of the causal polytope for binary inputs and outputs (they are only facets of its projection onto the plane considered in Subsection III C).
³ Actually, one also has a nontrivial causal polytope in a scenario where one of Alice's and/or Bob's inputs yields a binary output, while the other always gives the same output (or has no output, equivalently). In such a case the only nontrivial causal inequalities are of the LGYNI type, Eq. (5) or (7) (note on the other hand that the corresponding local polytope is trivial [25]). For simplicity however, we choose to impose throughout the paper that all inputs should have the same number of outputs.
The causal inequality (4) can be interpreted as a bound on the maximal probability of success for a bipartite "guess your neighbor's input" (GYNI) game [28] with uniform input bits $x, y$ (such that $p(x,y) = \frac14$), where Alice and Bob's task is to guess each other's input, i.e., to output $a = y$ and $b = x$. Implicitly assuming uniform input bits⁴, inequality (4) can indeed be written in a more compact form as
$$p_{\mathrm{GYNI}} := p(a = y,\, b = x) \le \frac12. \tag{6}$$
This causal bound on the probability of success $p_{\mathrm{GYNI}}$ can easily be understood: assuming that the correlation is compatible with the causal order $A \prec B$, Alice cannot know anything about Bob's input bit and can therefore only make a random guess, so that $p(a = y) = \frac12$ and therefore $p(a = y,\, b = x) \le \frac12$; a similar reasoning holds for the causal order $B \prec A$, and a convex mixture cannot increase the bound on $p_{\mathrm{GYNI}}$.
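Since the GYNI success probability is linear in the correlation, its maximum over the causal polytope is attained at a deterministic vertex, so the bound can be confirmed by exhaustive search. The sketch below does this in Python with exact fractions; the vertex representation (one output pair per input pair) and the function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)
INPUTS = list(product(BITS, BITS))  # (x, y) in the order 00, 01, 10, 11

def causal_vertices():
    """Deterministic output pairs (a, b) per input pair, for either causal order."""
    verts = set()
    for one, two in product(product(BITS, repeat=2), product(BITS, repeat=4)):
        # A first: a = one[x], b = two[2x + y]
        verts.add(tuple((one[x], two[2 * x + y]) for x, y in INPUTS))
        # B first: b = one[y], a = two[2x + y]
        verts.add(tuple((two[2 * x + y], one[y]) for x, y in INPUTS))
    return verts

def p_gyni(vertex):
    """Success probability of the GYNI game for uniform inputs."""
    wins = sum(1 for (x, y), (a, b) in zip(INPUTS, vertex) if a == y and b == x)
    return Fraction(wins, 4)

best = max(p_gyni(v) for v in causal_vertices())
# Perfect play (a = y, b = x) would need signaling in both directions at once.
two_way = tuple((y, x) for x, y in INPUTS)
print(best, p_gyni(two_way), two_way in causal_vertices())  # 1/2 1 False
```

The perfect strategy wins with probability 1 but is not among the causal vertices, which is why the value 1/2 is a nontrivial bound.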
Similarly, the causal inequality (5) can be interpreted as a bound on the maximal probability of success for what we shall call a "lazy GYNI" (LGYNI) game, still with uniformly random input bits, where Alice and Bob's task is now to guess each other's input only when their respective input is 1 (for an input 0, their output can be arbitrary). Implicitly assuming uniform input bits, inequality (5) can then also be written in a more compact form as
$$p_{\mathrm{LGYNI}} := p\big(x(a \oplus y) = 0,\ y(b \oplus x) = 0\big) \le \frac34. \tag{7}$$
This causal bound on the probability of success $p_{\mathrm{LGYNI}}$ can also easily be understood with a similar reasoning as above (taking into account that Alice for instance is only asked to guess Bob's input half of the time, when her input is 1).
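The reasoning behind the LGYNI bound can be spelled out as a short worked derivation. The sketch below assumes uniformly random inputs and writes $\mathcal{W}_A$ and $\mathcal{W}_B$ (labels introduced here for illustration only) for the events that Alice's and Bob's winning conditions are met.

```latex
% W_A : x (a \oplus y) = 0  (Alice's condition),   W_B : y (b \oplus x) = 0  (Bob's condition)
\begin{align*}
\text{Order } A \prec B:\quad
p_{\mathrm{LGYNI}} &= p(\mathcal{W}_A \wedge \mathcal{W}_B) \;\le\; p(\mathcal{W}_A) \\
  &= p(x=0) + p(x=1)\, p(a = y \mid x = 1) \\
  &= \tfrac12 + \tfrac12 \cdot \tfrac12 \;=\; \tfrac34 ,
\end{align*}
% because a is independent of y when there is no signaling to Alice, and y is uniform.
% Order B \prec A: the same argument with the roles of the parties exchanged gives
%   p_LGYNI \le p(W_B) = 3/4.
% A convex mixture q p^{A \prec B} + (1-q) p^{B \prec A} then also satisfies p_LGYNI \le 3/4.
```

The bound is tight: with the order $A \prec B$, the deterministic strategy $a = 0$, $b = x$ wins for the inputs $(x,y) = (0,0), (0,1), (1,0)$ and loses only for $(1,1)$.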

III. PROCESS MATRIX CORRELATIONS WITH NO DEFINITE CAUSAL ORDER
In this section we study the violation of our simplest causal inequalities in the framework of process matrices, introduced recently by Oreshkov, Costa and Brukner [12]. Let us first start with a brief overview of this framework.

A. The process matrix framework
The basic assumption of the framework is that quantum theory correctly describes what happens locally in Alice and Bob's laboratories; however, no assumption is being made about the global causal structure in which the parties operate.
More specifically, it is assumed that Alice and Bob can perform any operation allowed by the standard formulation of quantum theory, as described by quantum instruments [29] from some input Hilbert spaces $\mathcal{H}^{A_I}$ and $\mathcal{H}^{B_I}$ (for Alice and Bob, respectively) to some output Hilbert spaces $\mathcal{H}^{A_O}$ and $\mathcal{H}^{B_O}$. An instrument is a set of completely positive (CP), trace non-increasing maps from $\mathcal{L}(\mathcal{H}^{X_I})$ to $\mathcal{L}(\mathcal{H}^{X_O})$ (for $X = A, B$), where $\mathcal{L}(\mathcal{H}^{X_I})$ and $\mathcal{L}(\mathcal{H}^{X_O})$ are the spaces of linear operators over the Hilbert spaces $\mathcal{H}^{X_I}$ and $\mathcal{H}^{X_O}$. Each CP map of a given instrument is associated with a given classical output, which we shall again denote by $a$ and $b$ for Alice and Bob, and all CP maps of an instrument must sum up to a trace-preserving map. The various instruments that the parties can choose to apply shall be labeled by some classical "inputs" $x$ and $y$.
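Using the Choi-Jamiołkowski (CJ) isomorphism [30,31], one can represent Alice's maps as some operators $M^{A_I A_O}_{a|x} \in \mathcal{L}(\mathcal{H}^{A_I} \otimes \mathcal{H}^{A_O})$ (and similarly Bob's maps as operators $M^{B_I B_O}_{b|y}$). The conditions that each map is CP and that the maps of an instrument sum up to a trace-preserving map then read
$$M^{A_I A_O}_{a|x} \ge 0, \qquad \mathrm{tr}_{A_O} \sum_a M^{A_I A_O}_{a|x} = \mathbb{1}^{A_I}. \tag{8}$$
As shown in Ref. [12], the assumption of local consistency with quantum theory implies that the probability $p(a,b|x,y)$ of observing the classical outputs $a, b$ for a choice of instruments labeled by $x, y$ is a bilinear function of Alice and Bob's maps, which can be written as
$$p(a,b|x,y) = \mathrm{tr}\big[\big(M^{A_I A_O}_{a|x} \otimes M^{B_I B_O}_{b|y}\big)\, W\big] \tag{9}$$
for some Hermitian matrix $W \in \mathcal{L}(\mathcal{H}^{A_I} \otimes \mathcal{H}^{A_O} \otimes \mathcal{H}^{B_I} \otimes \mathcal{H}^{B_O})$. Requiring that the probabilities given by (9) are nonnegative and normalized for all possible choices of quantum operations (including operations involving possibly entangled ancillary systems) imposes some restrictions on the possible $W$ matrices [12]. As shown in Ref. [13], these constraints can be expressed as follows:
$$W \ge 0, \qquad \mathrm{tr}\, W = d_{A_O} d_{B_O}, \qquad {}_{B_I B_O}W = {}_{A_O B_I B_O}W, \qquad {}_{A_I A_O}W = {}_{A_I A_O B_O}W, \qquad W = {}_{A_O}W + {}_{B_O}W - {}_{A_O B_O}W, \tag{10}$$
where the last three conditions are written using the operation ${}_X\cdot$ defined by
$${}_X W := \frac{\mathbb{1}^X}{d_X} \otimes \mathrm{tr}_X W \tag{11}$$
for $X = A_I, A_O, B_I, B_O$, with $\mathbb{1}^X$ and $\mathrm{tr}_X$ denoting the identity operator and the partial trace over the Hilbert space $\mathcal{H}^X$, respectively, and $d_X$ denoting its dimension.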
Operators $W$ that satisfy these conditions are called process matrices. They represent the most general way to "connect" the output spaces $\mathcal{H}^{A_O} \otimes \mathcal{H}^{B_O}$ to the input spaces $\mathcal{H}^{A_I} \otimes \mathcal{H}^{B_I}$ (see Fig. 1) in a way that is locally consistent with quantum theory. While these conditions do not impose a global causal order a priori and therefore allow in general for two-way signaling, the nonnegativity and normalization conditions on the probabilities guarantee that no logical paradoxes, like the grandfather paradox for instance [32,33], appear. In the following we will refer to correlations of the form (9), with Alice and Bob's instruments satisfying Eq. (8) (together with its analogous form for Bob) and $W$ satisfying the conditions (10), as process matrix correlations.
Figure 1. […] way that what happens in Alice and Bob's labs is locally consistent with quantum theory [12]. Process matrices generalize in particular quantum states and quantum channels.
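As a sanity check on the statement that process matrices generalize quantum states, one can work out the special case of a shared bipartite state. The sketch below assumes the trace formula for probabilities, $p(a,b|x,y) = \mathrm{tr}[(M_{a|x} \otimes M_{b|y})\,W]$, and introduces the symbols $\rho$ and $E_{a|x}$ for illustration; depending on the transposition convention used in the CJ isomorphism, $E_{a|x}$ is the usual POVM element or its transpose.

```latex
\begin{align*}
W &= \rho^{A_I B_I} \otimes \mathbb{1}^{A_O B_O},
   \qquad \rho \ge 0,\quad \operatorname{tr}\rho = 1
   \quad\Rightarrow\quad W \ge 0,\ \ \operatorname{tr} W = d_{A_O} d_{B_O}, \\[2pt]
p(a,b|x,y)
  &= \operatorname{tr}\!\big[\big(M^{A_I A_O}_{a|x} \otimes M^{B_I B_O}_{b|y}\big)
       \big(\rho^{A_I B_I} \otimes \mathbb{1}^{A_O B_O}\big)\big]
   = \operatorname{tr}\!\big[\big(E_{a|x} \otimes E_{b|y}\big)\,\rho\big], \\[2pt]
E_{a|x} &:= \operatorname{tr}_{A_O} M^{A_I A_O}_{a|x} \ge 0,
   \qquad \sum_a E_{a|x} = \mathbb{1}^{A_I}
   \quad\text{(from the trace-preservation condition on instruments).}
\end{align*}
```

In this case the outputs of the laboratories are simply discarded, the probabilities reduce to the Born rule for local measurements on $\rho$, and the resulting correlations are non-signaling, hence compatible with both causal orders.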

B. Violation of the simplest causal inequalities by process matrix correlations
It was shown in Ref. [12] that certain process matrices⁶ could generate correlations with no definite causal order. A specific process matrix and specific instruments were indeed found, which violate a particular causal inequality with one input bit for Alice and two for Bob, and one output bit for each. Remarkably, one of Bob's input bits could be used to distinguish some runs of the experiment where signaling happened in one direction, and some runs where it happened in the other direction. It remained an open question whether this special input bit for Bob was necessary to obtain noncausal correlations in the process matrix framework, or whether any simpler causal inequality (with fewer inputs) could be violated.
Here we show that simpler causal inequalities can indeed be violated, by exhibiting violations of both our GYNI and LGYNI inequalities, Eqs. (6) and (7), by process matrix correlations.
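Let us start with a simple example with two-dimensional input and output systems ("qubits") for Alice and Bob (i.e., $d_{A_I} = d_{A_O} = d_{B_I} = d_{B_O} = 2$). One can check that the matrix
$$W = \frac14 \Big[ \mathbb{1}^{\otimes 4} + \frac{Z^{A_I} Z^{A_O} Z^{B_I} \mathbb{1}^{B_O} + Z^{A_I} \mathbb{1}^{A_O} X^{B_I} X^{B_O}}{\sqrt2} \Big], \tag{12}$$
where $Z$ and $X$ are the Pauli matrices and where tensor products are implicit, satisfies the constraints (10), so that it defines a valid process matrix. We choose Alice and Bob's operations to be the same, defined by
$$M_{0|0} = 0, \qquad M_{1|0} = 2\,|\Phi^+\rangle\langle\Phi^+|, \qquad M_{0|1} = |0\rangle\langle 0| \otimes |0\rangle\langle 0|, \qquad M_{1|1} = |1\rangle\langle 1| \otimes |0\rangle\langle 0|, \tag{13}$$
with $\{|0\rangle, |1\rangle\}$ denoting the computational basis (i.e., the eigenbasis of $Z$), and $|\Phi^+\rangle := (|00\rangle + |11\rangle)/\sqrt2$. These indeed satisfy (8), and thus constitute valid instruments. These operations can be interpreted as follows: when their input is 0, Alice and Bob simply transmit their incoming physical system, untouched ($2\,|\Phi^+\rangle\langle\Phi^+|$ being indeed the CJ representation of an identity channel), and output the value 1; when their input is 1, Alice and Bob perform a measurement in the $Z$ basis, whose result defines their classical output, and send out the fixed state $|0\rangle$. A direct calculation with Eq. (9) then gives $p_{\mathrm{GYNI}} = \frac{5}{16}\big(1 + \frac{1}{\sqrt2}\big) \approx 0.5335 > \frac12$ and $p_{\mathrm{LGYNI}} = p_{\mathrm{GYNI}} + \frac14 \approx 0.7835 > \frac34$, so that both causal inequalities (6) and (7) are violated.
⁶ A necessary condition for a process matrix to allow for a causal inequality violation is that it is causally nonseparable [12], i.e., that it is itself incompatible with a definite causal order. In the multipartite case this is known however not to be a sufficient condition [13,14]. It remains an open question whether there can be bipartite causally nonseparable process matrices that only generate causal correlations.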
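The qubit example can be checked numerically with a few lines of dependency-free Python. The explicit forms of W and of the instruments used below are assumptions written to match the description in the text (Pauli Z and X terms with implicit tensor products; identity channel with output 1 for input 0; Z measurement followed by preparation of a fixed state, taken here to be |0⟩, for input 1), and the helper names are illustrative. Positivity is checked indirectly: W = (1 + S)/4 with S² = 1, so its eigenvalues are 0 and 1/2.

```python
from itertools import product
from math import sqrt, isclose

I2, Z, X = [[1, 0], [0, 1]], [[1, 0], [0, -1]], [[0, 1], [1, 0]]

def kron(*ms):
    """Kronecker product of square matrices given as nested lists."""
    out = [[1.0]]
    for m in ms:
        out = [[a * b for a in r1 for b in r2] for r1 in out for r2 in m]
    return out

def lin(*terms):
    """Linear combination sum_k c_k * M_k from (c_k, M_k) pairs."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace_prod(a, b):
    """tr(a b) without forming the product."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))

def close(a, b, tol=1e-12):
    return all(abs(u - v) <= tol for ra, rb in zip(a, b) for u, v in zip(ra, rb))

def tr_replace(w, *positions):
    """W -> (1/2) 1^X (x) tr_X W for each qubit position (0=A_I, 1=A_O, 2=B_I, 3=B_O)."""
    for k in positions:
        mask = 1 << (3 - k)
        w = [[0.5 * (w[i & ~mask][j & ~mask] + w[i | mask][j | mask])
              if (i & mask) == (j & mask) else 0.0
              for j in range(16)] for i in range(16)]
    return w

ID16 = kron(I2, I2, I2, I2)
S = lin((1 / sqrt(2), kron(Z, Z, Z, I2)), (1 / sqrt(2), kron(Z, I2, X, X)))
W = lin((0.25, ID16), (0.25, S))          # assumed form of the process matrix

# Validity checks: positivity (via S*S = 1), normalisation, linear constraints.
is_positive = close(matmul(S, S), ID16)
is_normalised = isclose(sum(W[i][i] for i in range(16)), 4.0)
cond_a = close(tr_replace(W, 2, 3), tr_replace(W, 1, 2, 3))
cond_b = close(tr_replace(W, 0, 1), tr_replace(W, 0, 1, 3))
cond_c = close(W, lin((1, tr_replace(W, 1)), (1, tr_replace(W, 3)),
                      (-1, tr_replace(W, 1, 3))))

def proj(a):
    """Projector |a><a| on one qubit."""
    return [[float(i == j == a) for j in (0, 1)] for i in (0, 1)]

PHI = [[1.0, 0, 0, 1.0], [0, 0, 0, 0], [0, 0, 0, 0], [1.0, 0, 0, 1.0]]  # 2|Phi+><Phi+|
ZERO = [[0.0] * 4 for _ in range(4)]
# CJ operators M[(output, input)], identical for Alice and Bob (assumed form).
M = {(0, 0): ZERO, (1, 0): PHI,
     (0, 1): kron(proj(0), proj(0)), (1, 1): kron(proj(1), proj(0))}

p = {(a, b, x, y): trace_prod(kron(M[(a, x)], M[(b, y)]), W)
     for a, b, x, y in product((0, 1), repeat=4)}

p_gyni = sum(v for (a, b, x, y), v in p.items() if a == y and b == x) / 4
p_lgyni = sum(v for (a, b, x, y), v in p.items()
              if x * (a ^ y) == 0 and y * (b ^ x) == 0) / 4
print(round(p_gyni, 4), round(p_lgyni, 4))  # 0.5335 0.7835
```

Under these assumptions both success probabilities exceed the causal bounds of 1/2 and 3/4, respectively.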
One may now wonder what the largest possible violation of these two causal inequalities by process matrix correlations is. To optimize the violations for some input and output Hilbert spaces of a given dimension, we used a See-Saw algorithm inspired by that of Werner and Wolf [34], as described in Appendix B. Note that because the optimization problem is nonconvex, the algorithm is not guaranteed to converge to a global maximum. Nevertheless, for small enough dimensions (at least, for qubits), the repeatability of our results for different random starting points of the algorithm makes us confident that we indeed found the global maxima. From our numerical results, we thus conjecture that the maximal violations of our causal inequalities achievable with qubit systems are
$$p^{\max,d=2}_{\mathrm{GYNI}} \approx 0.5694, \qquad p^{\max,d=2}_{\mathrm{LGYNI}} \approx 0.8194 = p^{\max,d=2}_{\mathrm{GYNI}} + \tfrac14.$$
In Appendix C we give an analytical description of the process matrices that reach these values. Going to larger dimensions, we found that the maximal value of $p_{\mathrm{GYNI}}$ could increase, as summarized in Table I for dimensions up to 5; however, we did not find any larger value for $p_{\mathrm{LGYNI}}$ than $p^{\max,d=2}_{\mathrm{LGYNI}}$ above, which (provided our See-Saw algorithm did find the global maxima) reveals some fundamental difference between the two inequalities, despite their similarities (and in addition to the fact that contrary to GYNI, the outputs corresponding to certain inputs are irrelevant in the LGYNI game; see also footnote 3). It remains an open question which values are the true "Tsirelson bounds" [35] for these two causal inequalities, in the sense of the largest possible values for $p_{\mathrm{GYNI}}$ and $p_{\mathrm{LGYNI}}$ reachable with quantum process matrices of any dimension.
Table I. Maximal values of $p_{\mathrm{GYNI}}$ found through numerical optimization, as a function of the dimension of Alice and Bob's input and output Hilbert spaces d = dA […]

C. Boundary of the set of process matrix correlations
To finish with, let us picture the set of process matrix correlations vs that of causal correlations. In order to visualize the two, we shall project them onto the plane with coordinates $\big(p(a = y),\, p(b = x)\big)$, where we again implicitly assume uniformly random inputs for ease of notation⁷. In this plane the complementarity between the two directions of signaling, from Alice to Bob and from Bob to Alice, is conspicuous; the projected causal polytope is bounded here by the four causal inequalities⁸ (see Figure 2)
$$p(a = y) + p(b = x) \le \tfrac32, \tag{21}$$
$$p(a = y) - p(b = x) \le \tfrac12, \tag{22}$$
$$-p(a = y) + p(b = x) \le \tfrac12, \tag{23}$$
$$p(a = y) + p(b = x) \ge \tfrac12. \tag{24}$$
To obtain a lower bound for the boundary of the set of process matrix correlations, we again used the See-Saw algorithm described in Appendix B to maximize quantities of the form $\alpha\, p(a = y) + \beta\, p(b = x)$, with various weights $\alpha, \beta$. Different runs of the algorithm gave us different lower bounds (recall that the See-Saw algorithm is not guaranteed to always find the global optimum), which we combined to obtain the bounds represented in Figure 2.
A surprising feature of the set of process matrix correlations for dimension 2 is that it does not seem to be convex (see Figure 2, red region). We believe this is a true characteristic of it, not only a numerical artifact due to some failure to find global optima. Note, nevertheless, that the set of process matrix correlations for arbitrary dimensions is convex, as proven in Appendix D.
⁷ Without assuming uniformly random inputs, $p(a = y)$ and $p(b = x)$ in Eqs. (21-24) and in Figure 2 should be replaced by $\frac14 \sum_{x,y,a,b} \delta_{a,y}\, p(a,b|x,y)$ and $\frac14 \sum_{x,y,a,b} \delta_{b,x}\, p(a,b|x,y)$, respectively.
⁸ Note that inequality (21) […]
Figure 2. […] Its upper right corner corresponds for instance to a correlation such that Alice's output is always equal to Bob's input ($a = y$) and Bob's output is always equal to Alice's input ($b = x$), which requires perfect two-way signaling and violates Eqs. (6), (7) and (21) up to their algebraic maximum. This correlation may somehow be thought of as being analogous to the Popescu-Rohrlich (PR) box considered in the context of nonlocal correlations [37] (one difference, however, is that this correlation is deterministic, while the PR box correlations are not).
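The shape of the projected causal polytope can be recovered by projecting every deterministic causal vertex onto the plane (p(a = y), p(b = x)). The Python sketch below does so with exact fractions and checks the resulting diamond, written here as |u - 1/2| + |v - 1/2| <= 1/2; this compact form, like the vertex representation and function names, is an illustrative choice rather than the paper's notation.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)
INPUTS = list(product(BITS, BITS))  # (x, y) in the order 00, 01, 10, 11
HALF = Fraction(1, 2)

def causal_vertices():
    """Deterministic output pairs (a, b) per input pair, for either causal order."""
    verts = set()
    for one, two in product(product(BITS, repeat=2), product(BITS, repeat=4)):
        verts.add(tuple((one[x], two[2 * x + y]) for x, y in INPUTS))  # A first
        verts.add(tuple((two[2 * x + y], one[y]) for x, y in INPUTS))  # B first
    return verts

def project(vertex):
    """Coordinates (p(a = y), p(b = x)) for uniformly random inputs."""
    u = Fraction(sum(a == y for (x, y), (a, b) in zip(INPUTS, vertex)), 4)
    v = Fraction(sum(b == x for (x, y), (a, b) in zip(INPUTS, vertex)), 4)
    return u, v

def in_diamond(u, v):
    return abs(u - HALF) + abs(v - HALF) <= HALF

points = {project(v) for v in causal_vertices()}
inside = all(in_diamond(u, v) for u, v in points)
# Projected points lying on the boundary at maximal distance: the four corners.
corners = sorted(pt for pt in points if abs(pt[0] - HALF) + abs(pt[1] - HALF) == HALF)
print(inside, corners)
print(in_diamond(Fraction(1), Fraction(1)))  # perfect two-way signaling: False
```

Every vertex with order A before B projects onto the vertical segment p(a = y) = 1/2, and every vertex with order B before A onto the horizontal segment p(b = x) = 1/2, which makes the complementarity between the two signaling directions explicit.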

IV. CONCLUSION
We have shown that the set of correlations compatible with a definite causal order ("causal correlations") forms a convex polytope, which we fully characterized in the simplest nontrivial bipartite scenario with binary inputs and outputs. Two nonequivalent families of causal inequalities were obtained, Eqs. (6) and (7), for which we gave intuitive interpretations in terms of "causal games". These allow for a device-independent characterization of correlations with or without definite causal order, and can be tested independently of the physical theory under consideration. We exhibited in particular violations of these inequalities by process matrix correlations, which generalize standard quantum correlations. Because of their simplicity (and despite the violations we found being somewhat less intuitive), we expect these new inequalities, in particular the first one, interpreted as a "guess your neighbor's input" game, to become archetypical examples of causal inequalities, just like the CHSH inequality is the archetype of Bell inequalities [17,26].
Our approach can be used to characterize (non)causal correlations in more complex scenarios as well. It should be noted that because of the large dimension of the probability space and the large number of vertices of the causal polytope (see Appendix A), the full facet enumeration problem rapidly becomes intractable in practice as the number of inputs and outputs increases beyond the simplest binary case. Nevertheless, one could adapt some of the tricks developed for the derivation of Bell inequalities (see Ref. [17] for a review) to construct new causal inequalities for various scenarios of interest. Violations of these inequalities in the process matrix framework can then be investigated using our See-Saw algorithm. An interesting question is whether it would also be possible to derive nontrivial bounds on such violations from certain information-theoretic principles [38], along analogous lines to the research program that aims at restricting quantum nonlocal correlations from various principles [17].
In the present paper we focused for simplicity on the bipartite case. Our work can naturally also be extended to the case of more parties. With a proper generalization of the concept of noncausal correlations (see for instance Ref. [14]), it can also be shown that multipartite noncausal correlations form a convex polytope. Similar techniques can be used to characterize this polytope, construct causal inequalities and investigate their possible violation. Note that a remarkable new feature in the multipartite case is that violations are also possible with "classical process matrices" [21].
One of the main open questions along the line of research presented here is whether it would actually be possible, in practice, to observe correlations with no definite causal order and a violation of a causal inequality. As an extension of standard quantum theory, the framework of process matrices (which does indeed predict such violations theoretically) appears as a good candidate to provide such a possibility. However, to the best of our knowledge, no practical implementation has been identified for any of the process matrices that are known to violate a causal inequality [12,20,21] (including the ones presented here), while in contrast, a causally nonseparable quantum process has been recently demonstrated experimentally [16]. It is likely that more complex scenarios need to be considered, and a systematic investigation of causal inequalities and their violation by process matrix correlations might prove useful to find practical violations or, should it be the case, to clarify why such violations cannot be observed in practice. Our work makes the first crucial step in this direction.
Note added. While finishing writing up this manuscript, we became aware that the concept of causal polytopes introduced here was also referred to (with proper reference to our work) in Ref. [14], where the emphasis was put on multipartite scenarios, and in Ref. [39], where the authors also introduced, for the multipartite case as well, larger polytopes of logically consistent but possibly noncausal classical processes.
[…] them, we get
$$p^{A \prec B}(a,b|x,y) = \sum_{\alpha,\beta'} q_{\alpha,\beta'}\; \delta_{a,\alpha(x)}\, \delta_{b,\beta'(x,y)},$$
where $\beta_\alpha(x,y) = \beta(x,y,\alpha(x))$ and $q_{\alpha,\beta'} = q_\alpha\, q_{\beta'|\alpha}$ with $q_{\beta'|\alpha} = \sum_\beta \delta_{\beta_\alpha,\beta'}\, q_\beta$, such that $q_{\alpha,\beta'} \ge 0$ and $\sum_{\alpha,\beta'} q_{\alpha,\beta'} = 1$. Hence, any correlation $p^{A \prec B}$ can be written as a convex combination of deterministic correlations compatible with the order $A \prec B$, which thus correspond to the vertices of the corresponding polytope of correlations $p^{A \prec B}$.

Dimensions
Because of the $m_A m_B$ normalization constraints $\sum_{a,b} p(a,b|x,y) = 1$, the probability space of correlations $p(a,b|x,y)$ is of dimension $m_A m_B (k_A k_B - 1)$. With the no-signaling-to-Alice and no-signaling-to-Bob constraints (1) and (2), the dimensions of the polytopes of correlations $p^{A \prec B}$ and $p^{B \prec A}$ are reduced to $m_A m_B (k_A k_B - 1) - m_A (m_B - 1)(k_A - 1)$ and $m_A m_B (k_A k_B - 1) - m_B (m_A - 1)(k_B - 1)$, respectively. However, the causal polytope (i.e., their convex hull) remains of the same dimension as the full probability space.
In the case where both Alice and Bob's inputs and outputs take binary values, the 10-dimensional polytopes of correlations p A≺B and p B≺A both have 64 vertices, among which 16 are non-signaling vertices common to both polytopes. The 12-dimensional causal polytope thus has 64 + 64 − 16 = 112 different vertices.
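These dimensions can be confirmed from the vertices alone: the dimension of a polytope equals the rank of the differences between its vertices and any fixed one of them. The sketch below computes this with exact rational Gaussian elimination in plain Python; encoding each deterministic correlation as a 16-component probability vector is a choice made here for illustration.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)
INPUTS = list(product(BITS, BITS))  # (x, y) in the order 00, 01, 10, 11

a_first = sorted({tuple((al[x], be[2 * x + y]) for x, y in INPUTS)
                  for al in product(BITS, repeat=2) for be in product(BITS, repeat=4)})
b_first = sorted({tuple((al[2 * x + y], be[y]) for x, y in INPUTS)
                  for be in product(BITS, repeat=2) for al in product(BITS, repeat=4)})
both = sorted(set(a_first) | set(b_first))

def as_vector(vertex):
    """The 16 probabilities p(a,b|x,y) of a deterministic correlation."""
    return [Fraction(int((a, b) == out)) for out in vertex for a, b in product(BITS, BITS)]

def rank(rows):
    """Rank of a matrix of Fractions by Gauss-Jordan elimination."""
    rows = [r[:] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                f = rows[i][col] / rows[rk][col]
                rows[i] = [u - f * v for u, v in zip(rows[i], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk

def affine_dim(vertices):
    """Dimension of the affine hull of a list of vertices."""
    vecs = [as_vector(v) for v in vertices]
    base = vecs[0]
    return rank([[u - v for u, v in zip(vec, base)] for vec in vecs[1:]])

print(affine_dim(a_first), affine_dim(b_first), affine_dim(both))  # 10 10 12
```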
We enumerated the facets of the causal polytope for binary inputs and outputs by solving the convex hull problem using the software lrs [27]. As described in the main text, we obtained 48 facets, which can be grouped into 3 families of equivalent facets (up to relabelings of inputs and outputs). Explicitly, these are
• 16 trivial facets of the form $p(a,b|x,y) \ge 0$ for all $x, y, a, b = 0, 1$;
• 16 facets of the GYNI type, which can be written (in the same form as (6), implicitly assuming uniform input bits) as $p(a \oplus \alpha_x = y,\ b \oplus \beta_y = x) \le \frac12$ for all $\alpha_0, \alpha_1, \beta_0, \beta_1 = 0, 1$;
• 16 facets of the LGYNI type, which can be written (in the same form as (7), implicitly assuming uniform input bits) as $p\big((x \oplus \alpha_0)(a \oplus \alpha_1 \oplus y) = 0,\ (y \oplus \beta_0)(b \oplus \beta_1 \oplus x) = 0\big) \le \frac34$ for all $\alpha_0, \alpha_1, \beta_0, \beta_1 = 0, 1$.
Note that this causal polytope for binary inputs and outputs coincides with the polytope of correlations obtained from a local model augmented with one bit of communication, as described in Ref. [40]. This is because the use of just one bit of (one-way) communication is of course compatible with a definite causal order, either A ≺ B or B ≺ A, and for binary inputs one bit is enough for one party to send all the information about her input to the other party. In general however, the polytopes described in Ref. [40] are different from causal polytopes.

Appendix B: See-Saw algorithm
Maximizing the violation of a causal inequality over the process matrix and the instruments is a nonlinear problem, which is hard to tackle directly. To address this problem, we used an approach inspired by the See-Saw algorithm of Werner and Wolf [34]. The idea is that if Alice and Bob's instruments are fixed, then the combination of probabilities that enters the causal inequality is a linear function of the W matrix, and maximizing it is a semidefinite programming (SDP) problem [41] that can be solved efficiently. In the same spirit, if the W matrix and the instruments of one party are fixed, then the value of interest is a linear function of the instruments of the other party, and again its optimization is an SDP problem. Hence, one can try to approach the maximum violation of a causal inequality by optimizing over the process matrix and the parties' instruments in an iterative manner.
More specifically, let $\omega(W, A, B)$ be the value taken by the combination of probabilities in the causal inequality, considered as a function of the process matrix $W$ and the sets of instruments $A = \{M^{A_I A_O}_{a|x}\}_{a,x}$ and $B = \{M^{B_I B_O}_{b|y}\}_{b,y}$ (in their CJ representation). We start the algorithm by generating random sets of instruments $A_0$ and $B_0$, and for these fixed instruments we maximize $\omega$ considered as a function of $W$, via the following SDP problem:
$$\text{maximize } \omega(W, A_0, B_0) \quad \text{subject to the process matrix constraints (10) on } W.$$
With the optimal process matrix $W_0$ thus obtained and Bob's instruments $B_0$ kept fixed, we then optimize over Alice's set of instruments $A$, subject to the constraints (8), which is again an SDP problem. With the optimal set of instruments for Alice obtained now and the previously obtained process matrix $W_0$, we do the analogous optimization over Bob's set of instruments $B$, and iterate the three optimization steps of the algorithm until it converges. One can see that at each step of the algorithm the value of $\omega$ cannot decrease, so it is guaranteed to converge to a local maximum. One does not, however, always get the global maximum, and in practice one must repeat the algorithm several times to get a good lower bound on the maximal value of $\omega$. Note that this See-Saw algorithm can of course straightforwardly be adapted to more than two parties.
[…] or to the case of the well known CHSH Bell inequality [26,35]. Note also that, as mentioned in the main text, we could find higher violations of the GYNI inequality using higher-dimensional quantum systems (see Table I), while we could not find any higher violations of the LGYNI inequality.
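The alternating structure of the See-Saw iteration can be illustrated on a toy problem. The sketch below is not the SDP-based algorithm of this appendix: it replaces the process matrix and the two sets of instruments by three unit vectors w, a, b and the value ω by a random trilinear form, so that each of the three optimization steps has a closed-form solution. It only illustrates the two properties stated above, namely that the value never decreases and that different random starting points may end in different local maxima. All names and parameters are hypothetical.

```python
import random
from math import sqrt

def normalise(v):
    n = sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def value(T, w, a, b):
    """Toy objective: trilinear form sum_ijk T[i][j][k] w[i] a[j] b[k]."""
    d = range(len(T))
    return sum(T[i][j][k] * w[i] * a[j] * b[k] for i in d for j in d for k in d)

def gradient(T, which, w, a, b):
    """Coefficients of the objective seen as a linear function of one block."""
    d = range(len(T))
    if which == 0:
        return [sum(T[i][j][k] * a[j] * b[k] for j in d for k in d) for i in d]
    if which == 1:
        return [sum(T[i][j][k] * w[i] * b[k] for i in d for k in d) for j in d]
    return [sum(T[i][j][k] * w[i] * a[j] for i in d for j in d) for k in d]

def see_saw(T, rng, sweeps=50):
    """Alternately maximise over w, a and b; return the value after each sweep."""
    d = len(T)
    w, a, b = (normalise([rng.gauss(0, 1) for _ in range(d)]) for _ in range(3))
    history = [value(T, w, a, b)]
    for _ in range(sweeps):
        # Each step maximises a linear function over the unit sphere,
        # so the objective cannot decrease.
        w = normalise(gradient(T, 0, w, a, b))
        a = normalise(gradient(T, 1, w, a, b))
        b = normalise(gradient(T, 2, w, a, b))
        history.append(value(T, w, a, b))
    return history

rng = random.Random(0)
d = 3
T = [[[rng.gauss(0, 1) for _ in range(d)] for _ in range(d)] for _ in range(d)]
histories = [see_saw(T, rng) for _ in range(10)]   # random restarts
best = max(h[-1] for h in histories)
print(sorted(round(h[-1], 4) for h in histories), round(best, 4))
```

Keeping the best value over several restarts yields a lower bound on the true maximum, which is how the numerical results of the main text should be read.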