Causal hierarchy of multipartite Bell nonlocality

As with entanglement, different forms of Bell nonlocality arise in the multipartite scenario. These can be defined in terms of relaxations of the causal assumptions in local hidden-variable theories. However, a characterisation of all the forms of multipartite nonlocality has until now been out of reach, mainly due to the complexity of generic multipartite causal models. Here, we employ the formalism of Bayesian networks to reveal connections among different causal structures that make possible a classification that is both practical and physically meaningful. Our framework holds for arbitrarily many parties. We apply it to study the tripartite scenario in detail, where we fully characterise all the nonlocality classes. Remarkably, we identify new highly nonlocal causal structures that cannot reproduce all quantum correlations. This shows, to our knowledge, the strongest form of quantum multipartite nonlocality known to date. Finally, as a by-product, we derive a non-trivial Bell-type inequality with no quantum violation. Our findings constitute a significant step forward in the understanding of multipartite Bell nonlocality and open several avenues for future research.


I. INTRODUCTION
Bell nonlocality [1,2] represents a fundamental and intriguing aspect of nature: not all correlations observed in space-like separated measurements can be explained by a classical model respecting the causal assumptions of locality (that the measurement outcomes depend only on local variables) and measurement independence (that the observers choose their measurement settings freely). Thus, understanding to what extent one has to give up the causal assumptions for a classical model to reproduce nonlocal correlations provides both insights into the nature of quantum correlations [3,4] and a natural way of quantifying them [5][6][7][8][9]. In view of that, the study of causal relaxations in Bell scenarios has become a topic of intense interest [5][6][7][8][9][10][11][12][13][14][15][16][17][18]. In particular, in the bipartite scenario, it is known that all quantum correlations can be reproduced if either of the two causal assumptions is relaxed (see e.g. [6]). However, this is not in general the case in the multipartite scenario.
In the N-partite case, even if N − 1 parties communicate (a relaxation of locality), classical models cannot explain all quantum correlations [19][20][21]. This discovery gave rise to the notion of genuinely multipartite nonlocality (GMNL), with both fundamental and applied implications [22][23][24]. Since then, several forms of GMNL have been identified [25][26][27][28][29][30][31][32]. However, a unifying picture of the models arising from all the different causal relaxations, together with the classes of nonlocality they lead to, has until now been an unrealistic goal. A significant obstacle is the rapidly increasing complexity of generic multipartite causal structures as N grows.
Here, we develop a systematic characterisation of the classes of multipartite Bell nonlocality in terms of the causal relaxations required for a classical causal model to explain all the correlations in the class. We use the formalism of Bayesian networks [33,34], which allows us to identify equivalences among different causal relaxations in the context of nonsignaling correlations. As a result, one can define causal classes of Bell correlations each of which groups together many different causal structures. This enormously simplifies the problem, rendering a practical characterisation possible. Additionally, the classification has the built-in advantage of automatically yielding a natural hierarchy, from which one can directly read which nonlocality classes are stronger than others in the inclusion sense.
We develop the formalism in full generality and discuss in detail the tripartite scenario, where a full characterisation of Bell nonlocality is given. The hierarchy delivers 10 classes of tripartite causal models with nonsignaling violations. Interestingly, we prove that, in the nonsignaling scenario, 3 classes in different levels of the hierarchy collapse to a single class. This leaves us with 8 inequivalent causal Bell classes. Of these, at least 7 are violated by quantum correlations, including 1 class for which no quantum violations were known to date [25]. This proves that nature is nonlocal in a stronger sense than previously known, closing a long-standing open question [25]. Interestingly, the causal class for which we could not find a quantum violation produces correlations able to win, with unit probability, "guess your neighbour's input" (GYNI) [35], the celebrated nonlocal game without a quantum advantage. We identify a non-trivial Bell-type inequality with no quantum violation for this class, which can also be interesting on its own [36].

II. SCENARIO
We consider the correlations arising when $N$ parties perform local measurements on their respective shares of a joint $N$-partite system. These correlations are described by a conditional probability distribution $p(a_1, \ldots, a_N | x_1, \ldots, x_N)$, where $x_i$ and $a_i$ label, respectively, the measurement choices (inputs) and outcomes (outputs) of the $i$-th party, for $i = 1, \ldots, N$. If the parties are space-like separated, the correlations must be nonsignaling. That is, the marginal conditional probabilities $p(a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_N | x_1, \ldots, x_N) := \sum_{a_i} p(a_1, \ldots, a_i, \ldots, a_N | x_1, \ldots, x_N)$ over all but the $i$-th outcome must be independent of the $i$-th input [2]:

$$p(a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_N | x_1, \ldots, x_N) = p(a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_N | x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_N), \tag{1}$$

for all $a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_N$, all $x_1, \ldots, x_N$, and all $i$. This means that the input choice by the $i$-th party cannot influence the statistics observed by the others. Our goal is to study the nonsignaling correlations that can arise from different causal structures where relaxations of measurement independence and locality are allowed. Causal structures can be represented with directed acyclic graphs (DAGs) [33], examples of which are shown in Figs. 1, 2, and 3. Each node of a DAG represents a classical random variable, and each directed edge encodes a causal relation between two nodes. For each edge, one calls the start vertex the parent and the arrival one the child. The acyclicity of the graph prevents an effect from being its own cause. Then, given a collection of variables $V = \{v_1, v_2, \ldots, v_n\}$, $V$ forms a Bayesian network with respect to a DAG $G$ if the joint probability distribution $p(v_1, \ldots, v_n)$ describing the statistics of $V$ can be decomposed as

$$p(v_1, \ldots, v_n) = \prod_{i=1}^{n} p(v_i | \mathrm{pa}(v_i)), \tag{2}$$

where $\mathrm{pa}(v_i)$ denotes the set of parents of $v_i$ according to $G$.
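The nonsignaling condition of Eq. (1) is easy to test numerically for a given table of probabilities. The following is a minimal sketch, not part of the original analysis: the dictionary layout `p[(outputs, inputs)]`, the function names, and the two example boxes are illustrative assumptions. Exact fractions are used so that the comparison needs no tolerance.

```python
from fractions import Fraction
from itertools import product


def marginal(p, i, rest, x, n_out):
    """Sum p over the i-th output, keeping the other outputs `rest` and inputs `x` fixed."""
    return sum(p.get((rest[:i] + (ai,) + rest[i:], x), 0) for ai in range(n_out))


def is_nonsignaling(p, n_parties, n_in=2, n_out=2):
    """Check Eq. (1): marginals over all but party i must not depend on x_i."""
    for i in range(n_parties):
        for rest in product(range(n_out), repeat=n_parties - 1):
            for x in product(range(n_in), repeat=n_parties):
                x_ref = x[:i] + (0,) + x[i + 1:]  # same inputs with x_i reset to 0
                if marginal(p, i, rest, x, n_out) != marginal(p, i, rest, x_ref, n_out):
                    return False
    return True


def xor_box():
    """Tripartite box with a1 XOR a2 XOR a3 = x1*x2*x3, uniform over the valid outputs."""
    p = {}
    for x in product(range(2), repeat=3):
        for a in product(range(2), repeat=3):
            valid = (a[0] ^ a[1] ^ a[2]) == x[0] * x[1] * x[2]
            p[(a, x)] = Fraction(1, 4) if valid else Fraction(0)
    return p


def signalling_box():
    """Hypothetical signalling example: party 2 deterministically outputs party 1's input."""
    p = {}
    for x in product(range(2), repeat=3):
        for a in product(range(2), repeat=3):
            p[(a, x)] = Fraction(1) if a == (0, x[0], 0) else Fraction(0)
    return p
```

Calling `is_nonsignaling(xor_box(), 3)` accepts the first box, while the second one is rejected because the marginal of the second and third parties depends on $x_1$.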
Here, we are interested in a specific subclass of structures with two common features. First, they all possess an unobservable node, the hidden variable $\lambda$, and two sets of observable ones, the inputs $x_1, \ldots, x_N$ and the outputs $a_1, \ldots, a_N$, i.e. $V_{\rm Bell} = \{\lambda, a_1, \ldots, a_N, x_1, \ldots, x_N\}$. Second, the $i$-th output $a_i$ has $x_i$ and $\lambda$ among its parents, i.e. $\mathrm{pa}(a_i) \supseteq \{x_i, \lambda\}$. We refer to these DAGs as Bell DAGs (BDAGs). The simplest BDAG, shown at the top of Figs. 2 and 3 for N = 3, is the one for which $\mathrm{pa}(a_i) = \{x_i, \lambda\}$ for all $i$. This gives rise to the so-called local hidden-variable (LHV) models, with correlations of the form [1]:

$$p(a_1, \ldots, a_N | x_1, \ldots, x_N) = \sum_{\lambda} p(\lambda) \prod_{i=1}^{N} p(a_i | x_i, \lambda). \tag{3}$$

More complex causal structures are obtained by considering causal relaxations of the LHV BDAG. Relaxations of measurement independence [5,6,16] and locality [11][12][13]15] have been studied in the bipartite scenario. In the multipartite case, communication among N − 1 out of N parties is allowed in bi-LHV models [19][20][21]26], while causal influences from the input of one party towards the outputs of all others are accounted for in input-broadcasting models [26]. However, a systematic classification of N-partite causal structures for all the different causal relaxations was missing [37]. The problem consists of organising, in a physically meaningful and practical way, objects with an exponential complexity in the number of parties. In what follows, we propose a solution to this problem and, on the way, give a positive answer to the above-mentioned open question.

III. BELL CLASSES OF MULTIPARTITE CAUSAL NETWORKS
Our classification relies on two technical results that connect different causal relaxations in the nonsignaling framework. We say that a BDAG $G_1$ nonsignaling implies another, $G_2$, if all nonsignaling correlations compatible with $G_1$ (i.e., produced by a Bayesian network with respect to it) are also compatible with $G_2$. Note that, if all the causal relaxations in $G_1$ are also present in $G_2$, $G_1$ automatically nonsignaling implies $G_2$. In addition, if $G_1$ and $G_2$ nonsignaling imply one another, we say that they are nonsignaling equivalent. In particular, notice that if two BDAGs are nonsignaling equivalent, the maximal violation of any given Bell inequality is the same for both, and the correlations they produce are useful for the same information-theoretic protocols.
The first result asserts that the most general causal influence from one party to another is nonsignaling equivalent to a single locality relaxation from the input of the former towards the output of the latter. This is depicted in Fig. 1 a) for N = 2. Thus, any particular locality relaxation is nonsignaling implied by the input-to-output one.
Lemma 1 (Generic locality relaxation ↔ input-to-output locality relaxation). Let $G_{\rm gen}$ and $G_{\rm in-out}$ be any two BDAGs whose only difference is that there exist $1 \le j \neq i \le N$ such that, for $G_{\rm gen}$, $\mathrm{pa}(x_j) \supseteq \{a_i, x_i\}$ and $\mathrm{pa}(a_j) \supseteq \{a_i, x_i\}$, whereas, for $G_{\rm in-out}$, $\mathrm{pa}(a_j) \supseteq \{x_i\}$. Then, $G_{\rm gen}$ and $G_{\rm in-out}$ are nonsignaling equivalent.

Proof. We prove this lemma explicitly for the particular case of BDAGs of N = 2 parties. The proof for the general case is totally analogous. We have to prove the implication relations between the DAGs in Fig. 2a). The most general locality relaxation between two parties is represented by the BDAG $G_{\rm gen}$ in the left-hand panel of Fig. 2a). The simpler DAG $G_{\rm in-out}$ is represented in the right-hand panel of Fig. 2a). Clearly, since all the causal relaxations in $G_{\rm in-out}$ belong also to the set of causal relaxations in $G_{\rm gen}$, $G_{\rm in-out}$ nonsignalling implies $G_{\rm gen}$. We now prove that the converse also holds true, thus proving that both DAGs are nonsignaling equivalent. Any probability distribution compatible with $G_{\rm gen}$ can be decomposed as [Eqs. (4)-(8): display missing], where Bayes' rule has been repeatedly used. In addition, Eq. (6) follows, using Eq. (2), from the fact that, for the BDAG $G_{\rm gen}$, $x_1$ is neither a descendant nor a parent of $\lambda$. In turn, Eqs. (7) and (8) use the fact that, if the hidden variable $\lambda$ can take sufficiently many values (which can, without loss of generality, always be assumed to be the case), the outcome $a_1$ can be taken as a deterministic function of $x_1$ and $\lambda$, so that $p(a_1 | x_1, x_2, \lambda) = p(a_1 | x_1, \lambda)$ and $p(a_2 | a_1, x_1, x_2, \lambda) = p(a_2 | x_1, x_2, \lambda)$.

The last line of this decomposition is manifestly the explicit expression of generic correlations produced by Bayesian networks with respect to the IO BDAG $G_{\rm in-out}$, which finishes the proof.
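For orientation, the target expression of the proof can be written down directly from Eq. (2). The line below is a reconstruction rather than a copy of the original display: it assumes N = 2, the relaxation $x_1 \to a_2$, inputs that are conditioned upon, and a hidden variable $\lambda$ without parents.

```latex
% Generic correlations of a Bayesian network with respect to G_in-out (N = 2),
% obtained from Eq. (2) with pa(a_1) = {x_1, \lambda} and pa(a_2) = {x_1, x_2, \lambda}.
p(a_1, a_2 | x_1, x_2) = \sum_{\lambda} p(\lambda)\, p(a_1 | x_1, \lambda)\, p(a_2 | x_1, x_2, \lambda)
```

Compared with the LHV form of Eq. (3), the only change is the extra dependence of the second response function on $x_1$.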
The second result we will use states that allowing for a direct causation between any input and $\lambda$ is nonsignaling equivalent to broadcasting the input to the outputs of all N − 1 other parties [see Fig. 1 b)].

Lemma 2. Let $G^{(1)}_{\rm mir}$, $G^{(2)}_{\rm mir}$, and $G_{\rm ib}$ be any three BDAGs whose only differences are that there exists $1 \le i \le N$ such that, for $G^{(1)}_{\rm mir}$, $\lambda \in \mathrm{pa}(x_i)$, for $G^{(2)}_{\rm mir}$, $x_i \in \mathrm{pa}(\lambda)$, and for $G_{\rm ib}$, $x_i \in \mathrm{pa}(a_j)$ for all $1 \le j \le N$. Then, $G^{(1)}_{\rm mir}$, $G^{(2)}_{\rm mir}$, and $G_{\rm ib}$ are nonsignaling equivalent.

Proof. We prove this lemma explicitly for the particular case of N = 3 parties. The proof for the general case is totally analogous. The proof strategy consists in showing that, if nonsignalling holds, the most general expression for a correlation compatible with the input-broadcasting BDAG $G_{\rm ib}$ in the right-hand panel of Fig. 1 b) coincides with the most general expressions for correlations compatible with the measurement-dependence BDAGs $G^{(1)}_{\rm mir}$ and $G^{(2)}_{\rm mir}$ in the left-hand and central panels, respectively, of Fig. 1 b). This implies that the set of nonsignalling correlations compatible with $G_{\rm ib}$ and the sets of nonsignalling correlations compatible with $G^{(1)}_{\rm mir}$ and $G^{(2)}_{\rm mir}$ coincide.

The most general correlation produced by a Bayesian network with respect to $G_{\rm ib}$, where the setting $x_1$ is a cause of $a_2$, is given by [Eqs. (14) and (15): display missing]. Eq. (14) follows from Eq. (2) in the main text with respect to $G_{\rm ib}$. The right-hand side of Eq. (15), in turn, is simply the expression of the right-hand side of Eq. (14) in terms of the local deterministic response functions of each output $a_i$ given its parent inputs with respect to $G_{\rm ib}$, for $i = 1, 2, 3$, for which we have explicitly decomposed $\lambda$ as the tri-index variable $\lambda = (\lambda_1, \lambda_2, \lambda_3)$, with $\lambda_i$ labelling the local deterministic strategy of the $i$-th party. More precisely, the local deterministic response functions are defined as [Eq. (16): display missing], with $\delta$ denoting the Kronecker delta [...] have two arguments.

There are $|\Lambda_1| = |A_1|^{|X_1|}$ local deterministic strategies for the first party and $|\Lambda_i| = |A_i|^{|X_1| |X_i|}$ for $i = 2, 3$, where $|X_i|$ and $|A_i|$ denote the numbers of inputs and outputs, respectively, of the $i$-th party, for $i = 1, 2, 3$. This gives a total of $|\Lambda| = |\Lambda_1| \times |\Lambda_2| \times |\Lambda_3|$ different global deterministic strategies. Now, for any fixed $\lambda_i$ and $x_1$, the two-argument assignment [...] [Eqs. (17)-(19): display missing], where we have also introduced $\gamma_i$ [...]. Then, Eqs. (16) and (19) imply for the right-hand side of Eq. (15) that [Eq. (20): display missing], where, in the last equality, we have introduced the normalised conditional probability distribution

$$q(\lambda_1, \lambda'_2, \lambda'_3 | x_1) := \sum_{\lambda_2 : \gamma_2(\lambda_2, x_1) = \lambda'_2,\ \lambda_3 : \gamma_3(\lambda_3, x_1) = \lambda'_3} p(\lambda_1, \lambda_2, \lambda_3).$$

The right-hand side of the last line of Eq. (20) is readily identified as the decomposition into deterministic strategies of the most general correlation produced by a Bayesian network with respect to $G^{(2)}_{\rm mir}$. This, in turn, is also equivalent to the most general correlation produced by a Bayesian network with respect to $G^{(1)}_{\rm mir}$. This can be immediately seen by noting that, instead of [...]. However, the two expressions are trivially equal due to Bayes' theorem.
Lemmas 1 and 2 imply that every causal relaxation on an LHV model is, as far as nonsignaling correlations are concerned, accounted for (in the inclusion sense) by an input-to-output locality relaxation. We refer to any BDAG whose only causal relaxations consist of input-to-output locality relaxations as an input-output (IO) BDAG. Every IO BDAG can be defined by the subsets of inputs that are parents of each output (see Fig. 3 for more details). We emphasise that, as a consequence of the lemmas, the total number of BDAGs to scrutinise is hugely reduced. Namely, there are 15 different ways of drawing directed edges from one party to another [all the particular instances of the general locality relaxation of Fig. 1 a)]. All 15 corresponding BDAGs are grouped together with a single IO BDAG due to Lemma 1. Furthermore, each BDAG with directed edges from $\lambda$ to any of the inputs is grouped together with an IO BDAG due to Lemma 2. Hence, IO BDAGs serve as generic representatives of all possible causal relaxations in the nonsignaling framework. This leads us to the following natural classification.
Definition 3 (Causal classes of Bell correlations). Each IO BDAG G in−out defines a causal class of Bell correlations, or, for short, a causal Bell class, as the convex hull of nonsignaling correlations produced by Bayesian networks with respect to G in−out or any of its party-permutation equivalents. In addition, we call a causal Bell class nonsignaling interesting if there exist nonsignaling correlations incompatible with it; otherwise we call it nonsignaling boring [38]. Finally, each nonsignaling interesting causal Bell class defines a class of multipartite Bell nonlocality, as the set of all nonsignaling correlations outside the causal Bell class.
This characterisation offers, as mentioned, a physically meaningful tree-like hierarchy, where classes in a given level are nonsignalling implied by classes in the level below. We refer to it as the causal hierarchy, represented in Fig. 3 for N = 3. We note that only the nonsignalling interesting part of the hierarchy is plotted in the figure. The complete tripartite causal hierarchy is graphically represented in Fig. 2. It contains a total of 16 causal Bell classes, but 6 of them are nonsignalling boring (see also the appendix for a list of all the classes in the complete hierarchy for N = 4). Furthermore, apart from the above-mentioned implications that follow automatically from the hierarchy, other nonsignalling implications can take place. For N = 3, for instance, 3 of the 10 nonsignalling interesting classes turn out to be equivalent, thus collapsing to a single class. All this is formalised by the following theorem, proven in the appendix.

[FIG. 3 caption, beginning missing: [...] Fig. 2). Every dashed arrow from one DAG (in a given level) to another DAG (in a different level) indicates that the latter nonsignaling implies the former. Black dashed arrows represent the implications that the hierarchy automatically imposes, whereas red ones represent implications that we prove by other means (see text and appendix). The 6 light-grey shaded classes were known not to reproduce all quantum correlations [25]. Of the remaining four classes, the three light-green shaded ones collapse to the star class {(1), (2), (1, 2, 3)} (see red arrows). We find quantum correlations beyond this class.]

IV. STRONGER FORMS OF QUANTUM NONLOCALITY
The 6 IO BDAGs shaded in light grey in Fig. 3 belong to a class of models that were shown [25] to satisfy Svetlichny's inequality, which can be violated by quantum correlations [19]. The remaining 4 IO BDAGs define causal Bell classes for which no quantum violation was known so far. Three of them, shaded in light green in Fig. 3, are nonsignalling equivalent, as mentioned, so that the fourth level, and part of the third one, of the hierarchy collapse to the second level [39]. We refer to the resulting class as the star, because one party (the centre of the star) receives the inputs of all other parties (the rays). Remarkably, we find quantum correlations outside of it.
In the appendix, we show that the star class is nonsignalling boring for the specific scenario of 3 parties with 2 inputs and 2 outputs each. However, it satisfies a broad family of Bell inequalities that are nontrivial for higher output alphabets. Consider output alphabets where each output can be factorised into two integer variables. Then, the inequalities in question can be expressed in a unified fashion as

$$I_3 := I_2(A, B) + I_2(A', C) + I_2(B', C') \le \beta_L + 2 \beta_{NS}, \tag{21}$$

where $I_2$ stands for an arbitrary bipartite Bell expression with LHV and nonsignalling bounds $\beta_L$ and $\beta_{NS}$, respectively. $A$ and $A'$ are the two variables that encode the output of the first party, $B$ and $B'$ the output of the second one, and $C$ and $C'$ that of the third one. For example, for 2 inputs and 4 outputs per party, $I_2$ can be taken as the Clauser-Horne-Shimony-Holt inequality [40], with $\beta_L = 2$ and $\beta_{NS} = 4$ (note that this inequality has also been discussed, in a different context, in Ref. [41]). Then, $A$ and $A'$ are bits, generated, for instance, by making the same measurement on two independent subsystems, and equivalently for $B$, $B'$, $C$ and $C'$. In that case, the resulting tripartite inequality $I_3$ can be maximally violated by three bipartite boxes independently distributed among the parties. More precisely, we refer to the well-known PR boxes [42], which are post-quantum but nonsignalling.
Surprisingly, for higher dimensions, there exist choices of $I_2$ for which $I_3$ is maximally violated in quantum mechanics. Consider for instance $I_2$ as the all-versus-nothing Bell inequality studied in Refs. [43][44][45] (containing 3 inputs and 4 outputs per party). This bipartite inequality is tight [43] and can be violated up to its algebraic maximum by correlations obtained from a pair of singlets [43][44][45]. Hence, $I_3$ can be maximally violated with 6 singlets (2 singlets per pair of parties, same measurement on both pairs of qubits of each party). Since here each party holds two subsystems (e.g. $A$ and $A'$), in our case each party will have 16 outcomes. An alternative is to choose $I_2$ as the chained inequality, which requires a single singlet per pair but a very large number of inputs per party [46] (see the appendix). Importantly, for both choices, experiments with violations of $I_2$ high enough to violate $I_3$ have been demonstrated [44,45,47,48].

V. A NON-TRIVIAL BELL-TYPE INEQUALITY WITHOUT QUANTUM VIOLATION
The last nonsignalling interesting class to analyse is {(1, 3), (1, 2), (2, 3)} in the third level, where each party communicates his/her input to his/her nearest neighbour in a circle-like configuration. We refer to this class as the circle. Interestingly, correlations in this class attain a unit success probability of winning the GYNI nonlocal game of Ref. [35]. We have not found quantum violations of this class, but we found that it satisfies the binary-input-binary-output inequality [Eq. (22): display missing]. This is maximally violated by the nonsignaling extremal correlations $p(a_1, a_2, a_3 | x_1, x_2, x_3) = \delta_{a_1 \oplus a_2 \oplus a_3,\, x_1 x_2 x_3}/4$, where $\delta$ denotes the Kronecker delta and $\oplus$ addition modulo 2. Using the techniques of Refs. [49,50], we see that Eq. (22) is not violated by any quantum correlations. Thus, Eq. (22) constitutes a non-trivial Bell-type inequality with no quantum violation. Note, nevertheless, that this does not imply that the circle class contains all quantum correlations, as there might be other inequalities (involving not only full correlators) that admit quantum violations. See the appendix for details.
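To see why the extremal box quoted above saturates a full-correlator expression, one can tabulate its eight full correlators. The snippet below is an illustrative sketch: the function names and the $(-1)^a$ outcome labelling are assumptions, and the coefficients of Eq. (22) itself are not assumed or reproduced.

```python
from itertools import product


def box(a, x):
    """p(a1,a2,a3|x1,x2,x3) = 1/4 if a1 XOR a2 XOR a3 equals x1*x2*x3, else 0."""
    return 0.25 if (a[0] ^ a[1] ^ a[2]) == x[0] * x[1] * x[2] else 0.0


def full_correlator(x):
    """<A_x1 B_x2 C_x3> with each output bit a mapped to the outcome (-1)**a."""
    return sum((-1) ** (a[0] + a[1] + a[2]) * box(a, x)
               for a in product(range(2), repeat=3))


# One correlator per input triple (x1, x2, x3).
correlators = {x: full_correlator(x) for x in product(range(2), repeat=3)}
```

Every correlator has unit modulus, with a sign flip only for the input triple (1, 1, 1), which is why an expression with eight unit-weight correlator terms and matching signs reaches the algebraic value 8 for this box.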

VI. DISCUSSION
We proposed a hierarchical classification for all the relaxations of locality and measurement independence in Bell's theorem in terms of the nonsignalling correlations to which they lead. The nonsignalling correlations compatible with an arbitrary causal structure are always captured by a (typically much simpler) causal network involving only locality relaxations where the input of one party causally influences the outputs of others. The framework facilitates the study of unexplored forms of multipartite Bell nonlocality. For instance, we identified new tripartite causal structures that cannot reproduce all quantum correlations. This demonstrates the strongest form of quantum multipartite nonlocality known, closing a long-standing open question [25]. Furthermore, as another application, we derived a previously unknown non-trivial Bell-type inequality without a quantum violation.
Our work offers a number of exciting questions for future research. In particular, the discovery of new and stronger forms of quantum correlations offers a vast, unexplored territory. In addition, from a fundamental perspective, the fact that our framework naturally leads to a non-trivial Bell-type inequality with no quantum violation is appealing [36]. From an applied perspective, our results may have implications in communication complexity problems [51] or in the emerging field of quantum causal networks [52][53][54][55][56][57][58]. In conclusion, we believe our findings can open a new chapter in the understanding of multipartite Bell nonlocality.

ACKNOWLEDGMENTS
We would especially like to thank Fernando de Melo for the hospitality at Rio de Janeiro's CBPF, where the first ideas that led to this paper were conceived, as well as S. Pironio [...]

[...] (A1), with $q(\lambda) \ge 0$ and $\sum_{\lambda} q(\lambda) = 1$. Here, in a similar fashion to the previous section, the multi-variable decomposition $\lambda = (\lambda_1, \ldots, \lambda_N)$ is used and

$$D^{(i)}_{\lambda_i}(a_i | \mathrm{in}_i) := \delta_{a_i,\, f^{(i)}_{\lambda_i}(\mathrm{in}_i)}, \tag{A2}$$

with $\delta$ denoting the Kronecker delta and $f^{(i)}_{\lambda_i}$ being the $\lambda_i$-th local deterministic assignment of $\mathrm{in}_i$ into $a_i$. In addition, the $\lambda$-th global deterministic response function is given by the product

$$D_{\lambda}(a_1, \ldots, a_N | x_1, \ldots, x_N) := \prod_{i=1}^{N} D^{(i)}_{\lambda_i}(a_i | \mathrm{in}_i). \tag{A3}$$

Each $D_{\lambda}$ is clearly also a vector in $\mathbb{R}^{d_{A|X}}$, with components $D^{\lambda}_{a_1, \ldots, a_N, x_1, \ldots, x_N} := D_{\lambda}(a_1, \ldots, a_N | x_1, \ldots, x_N)$. It is convenient to identify each extremal vector $D_{\lambda}$ with the $\lambda$-th column of a $d_{A|X} \times |\Lambda|$ real matrix $D$, with components $D_{a_1, \ldots, a_N, x_1, \ldots, x_N, \lambda} := D_{\lambda}(a_1, \ldots, a_N | x_1, \ldots, x_N)$, and each $q(\lambda)$ as the $\lambda$-th component of a $|\Lambda|$-dimensional real vector $q$. With this, Eq. (A1) can be rewritten concisely as

$$p = D \cdot q, \tag{A4}$$

where the symbol $\cdot$ stands for contraction over the index $\lambda$. The tensor $D$ of deterministic strategies depends exclusively on the number of inputs and outputs per party, as well as on the causal structure in question. It characterises completely the polytope of all correlations (both signalling and nonsignalling) compatible with the IO BDAG $\{\mathrm{in}_1, \ldots, \mathrm{in}_N\}$. We call such a polytope the causal polytope of $\{\mathrm{in}_1, \ldots, \mathrm{in}_N\}$. In contrast, the vector $q$ is in one-to-one correspondence with the particular $p$.
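The columns of the matrix $D$ can be generated by brute-force enumeration for small scenarios. The sketch below assumes that an IO BDAG is encoded as a list `parents`, where `parents[i]` holds the 0-based indices of the parties whose inputs are parents of output $i$; this encoding and all function names are illustrative choices, not the notation of the original work.

```python
from itertools import product


def local_strategies(n_parent_inputs, n_in=2, n_out=2):
    """All deterministic maps from the parent inputs of one output to that output."""
    domain = list(product(range(n_in), repeat=n_parent_inputs))
    return [dict(zip(domain, values))
            for values in product(range(n_out), repeat=len(domain))]


def global_strategies(parents, n_in=2, n_out=2):
    """Iterate over global strategies, each a tuple with one local map per party."""
    per_party = [local_strategies(len(pa), n_in, n_out) for pa in parents]
    return product(*per_party)


def response(strategy, parents, x):
    """Outputs (a_1, ..., a_N) that a global strategy assigns to the input tuple x."""
    return tuple(f[tuple(x[j] for j in pa)] for f, pa in zip(strategy, parents))


def column(strategy, parents, n_in=2, n_out=2):
    """0/1 vector D_lambda, indexed first by output tuples and then by input tuples."""
    n = len(parents)
    return [int(response(strategy, parents, x) == a)
            for a in product(range(n_out), repeat=n)
            for x in product(range(n_in), repeat=n)]


LHV = [(0,), (1,), (2,)]                 # each output sees only its own input
BROADCAST = [(0,), (0, 1), (0, 2)]       # x_1 is also a parent of a_2 and a_3
```

For binary inputs and outputs, the LHV structure has $4^3 = 64$ global strategies, while broadcasting $x_1$ raises the count to $4 \times 16 \times 16 = 1024$, which illustrates how quickly $|\Lambda|$ grows with the number of relaxations.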
Hence, the problem of determining whether a given $p$ is compatible with a Bayesian network with respect to $\{\mathrm{in}_1, \ldots, \mathrm{in}_N\}$ is equivalent to determining whether there exists $q$ such that Eq. (A4) holds. Since Eq. (A4) defines a system of linear equations, this can always be solved efficiently in the length $|\Lambda|$ of the vector $q$. A practical tool to do this is linear programming. More precisely, solving the linear system given by Eq. (A4) is equivalent to solving the linear programme

$$\text{Given } D \text{ and } p, \quad \underset{q \in \mathbb{R}^{|\Lambda|}}{\text{minimise}}\ I \cdot q \quad \text{subject to}\quad D \cdot q = p,\ q \ge 0,\ \|q\| = 1, \tag{A5}$$

with $q \ge 0$ and $\|q\| = 1$ short-hand notations for $q(\lambda) \ge 0$, for all $\lambda = 1, \ldots, |\Lambda|$, and $\sum_{\lambda} q(\lambda) = 1$, respectively, and where $I$ is any vector in $\mathbb{R}^{|\Lambda|}$ (that encodes the so-called objective function). If the linear programme (A5) is feasible, $p$ is compatible with $\{\mathrm{in}_1, \ldots, \mathrm{in}_N\}$. Otherwise, $p$ is not inside the causal polytope of $\{\mathrm{in}_1, \ldots, \mathrm{in}_N\}$. In turn, using standard convex-optimization tools, such as the software PORTA [59], one can also find the dual description of the polytope in terms not of its extremal points but of its facets, i.e., its Bell inequalities. Finally, we recall that the causal Bell class associated to $\{\mathrm{in}_1, \ldots, \mathrm{in}_N\}$ is actually defined (see Def. 3 in the main text) not by all the correlations compatible with it but by the convex hull of all nonsignalling correlations compatible with any IO BDAG obtained via party exchanges from it. In that case, one proceeds in a similar way but taking into account all the different global deterministic-strategy tensors arising from party permutations of $\{\mathrm{in}_1, \ldots, \mathrm{in}_N\}$ and adding to the constraints of the linear programme the nonsignaling constraints on $p$, given by Eq. (1) in the main text.

[...] parents of all N outputs. The latter causal structure can reproduce all correlations (including the signalling ones), so that the corresponding causal Bell class is trivially nonsignalling boring.
For each fixed number $l$ of input-output locality relaxations, there are a total of $\binom{N(N-1)}{l}$ different IO BDAGs, but many of them are redundant, as they are equivalent up to party exchanges. Eliminating all the party-exchange redundancies leaves us with the IO BDAGs that define the complete causal hierarchy. Following the procedure above for N = 3 yields 16 different non-redundant IO BDAGs, graphically represented in Fig. 2. In addition, in Table I, we summarise the main properties of each of these IO BDAGs. (See also Sec. C for a brief description of the complete causal hierarchy for the four-partite case.)
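The counting procedure described above can be reproduced by brute force for small N. In the sketch below, an IO BDAG is encoded as a set of pairs (i, j), meaning that input $x_i$ is a parent of output $a_j$, and two edge sets are identified when a relabelling of the parties maps one onto the other. The encoding and the function names are assumptions made for illustration.

```python
from itertools import combinations, permutations


def canonical(edges, n):
    """Lexicographically smallest relabelled edge list over all party permutations."""
    return min(tuple(sorted((perm[i], perm[j]) for i, j in edges))
               for perm in permutations(range(n)))


def io_bdag_classes(n):
    """Map each number l of relaxations to the set of inequivalent IO BDAGs.

    An edge (i, j) with i != j means that input x_i is a parent of output a_j.
    """
    all_edges = [(i, j) for i in range(n) for j in range(n) if i != j]
    return {l: {canonical(edges, n) for edges in combinations(all_edges, l)}
            for l in range(len(all_edges) + 1)}


levels = io_bdag_classes(3)
counts = [len(levels[l]) for l in sorted(levels)]  # classes per number of relaxations
```

For N = 3 the entries of `counts` add up to 16, in agreement with the number of non-redundant IO BDAGs quoted above. The same routine runs for N = 4, at the cost of scanning $2^{12}$ edge sets under 24 permutations.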

Five nonsignalling boring classes in the tripartite scenario
In this subsection, we prove that the 5 causal Bell classes represented by the IO BDAGs in black boxes in Fig. 2 are nonsignalling boring, i.e., they can reproduce all nonsignalling correlations. We do this by explicitly proving that the class {(1), (1, 2), (1, 2, 3)} in the third level is nonsignalling boring. This automatically implies that the other 4 classes (3 in the fourth level and the one in the fifth level) are nonsignalling boring too, as they can all be obtained from {(1), (1, 2), (1, 2, 3)} by causal relaxations.

10 nonsignalling interesting tripartite classes, at least 7 of which with quantum violations
In this subsection, we prove that the remaining 10 classes are nonsignalling interesting. We do that by deriving Bell inequalities for each causal Bell class that are violated by nonsignalling correlations. Furthermore, for 7 of the classes, the violations that we find are not only nonsignalling but actually quantum.
The [...] (1, 2), (1, 3)} define, in the terminology of Ref. [25], partially paired correlations. In Ref. [25], it was shown that all partially paired correlations respect the Svetlichny inequality [19] [display missing]. This inequality is violated by quantum correlations obtained from local measurements on entangled quantum states, the maximum quantum violation being $4\sqrt{2}$, with Greenberger-Horne-Zeilinger states [19]. Thus, the six classes are not only nonsignalling interesting but they also admit quantum violations.
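The bound obeyed by a given IO BDAG can be checked by maximising a Bell expression over all of its global deterministic strategies. The sketch below does this for one standard form of Svetlichny's expression, which is an assumption here because sign conventions differ by relabellings. The list encoding of the parent sets (0-based party indices) and the three example structures are likewise illustrative.

```python
from itertools import product

# Signs of the eight full correlators <A_x1 B_x2 C_x3> in one standard form of
# Svetlichny's expression (assumed; other conventions differ by relabellings).
SIGNS = {(0, 0, 0): 1, (0, 0, 1): 1, (0, 1, 0): 1, (1, 0, 0): 1,
         (0, 1, 1): -1, (1, 0, 1): -1, (1, 1, 0): -1, (1, 1, 1): -1}


def local_maps(n_parent_inputs):
    """All deterministic +/-1 valued functions of the given number of binary inputs."""
    domain = list(product(range(2), repeat=n_parent_inputs))
    return [dict(zip(domain, vals)) for vals in product((1, -1), repeat=len(domain))]


def max_over_deterministic(parents):
    """Maximum of the expression over all global deterministic strategies.

    `parents[i]` lists the parties whose inputs are parents of output i.
    """
    best = None
    for strategy in product(*(local_maps(len(pa)) for pa in parents)):
        value = 0
        for x, sign in SIGNS.items():
            outcome = 1
            for f, pa in zip(strategy, parents):
                outcome *= f[tuple(x[j] for j in pa)]
            value += sign * outcome
        best = value if best is None else max(best, value)
    return best


lhv_max = max_over_deterministic([(0,), (1,), (2,)])
broadcast_max = max_over_deterministic([(0,), (0, 1), (0, 2)])
star_max = max_over_deterministic([(0,), (1,), (0, 1, 2)])
```

With this choice of signs, the LHV structure and the structure where $x_1$ is broadcast to the other two outputs both give 4, whereas the star structure {(1), (2), (1, 2, 3)} reaches the algebraic maximum 8. This is consistent with the star class not being constrained by Svetlichny's inequality.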
For the circle class, represented by {(1, 3), (1, 2), (2, 3)}, we derive a previously unknown non-trivial tight Bell inequality for full correlators. The set of full correlators compatible with a given class defines also a polytope. Thus, as discussed in Sec. A, standard convex optimization tools [59] can be used to obtain the Bell inequalities for full correlators. Notice also that, given a Bell inequality, a simple way to see if it is satisfied by a causal Bell class is to check that all global deterministic strategies of the class (see Sec. A) respect the inequality. We find that all full correlators compatible with the circle class satisfy the inequality [Eq. (22) of the main text: display missing]. This inequality is violated up to the algebraic maximal value 8 by the nonsignaling extremal correlations $p(a_1, a_2, a_3 | x_1, x_2, x_3) = \delta_{a_1 \oplus a_2 \oplus a_3,\, x_1 x_2 x_3}/4$, originally identified in Ref. [60]. This proves that {(1, 3), (1, 2), (2, 3)} is nonsignalling interesting. Using the techniques of Refs. [49,50], we see that Eq. (22) [...]

[...] (1, 2, 3)}, are found to be nonsignalling boring for the restricted case of binary inputs and outputs. To see this, we solved, once again with standard convex-optimization tools [59], the feasibility problem of Eq. (A4) for all the 46 extremal nonsignalling correlations [61] for the binary-input binary-output case, the same ones used for Table II.

[...] (1, 2, 3)}, has marginal (bipartite) correlations over Alice and Bob, the first and second parties, respectively, described by a bipartite LHV model. The intuitive explanation for this is that in none of the three BDAGs are there arrows going from Alice to Bob or from Bob to Alice. At the end of this subsection, we prove this fact formally. This fact implies that the three corresponding causal Bell classes consist exclusively of convex combinations of (tripartite) correlations each of which has an LHV bipartite marginal over some pair out of the three parties.
From this, in turn, it follows that the three causal Bell classes satisfy a broad family of non-trivial Bell inequalities that can all be described in a unified way by the generic expression

$$I_3 := I_2(A, B) + I_2(A', C) + I_2(B', C') \le \beta_L + 2 \beta_{NS}, \tag{B6}$$

where $I_2$ stands for any arbitrary bipartite linear Bell expression with local bound $\beta_L$ and nonsignalling bound $\beta_{NS}$.
$A$ and $A'$ are random variables associated to the outputs of Alice, $B$ and $B'$ to the outputs of Bob, and $C$ and $C'$ to those of Charlie, the third party. For instance, the simplest non-trivial example we find is in the scenario of 2 inputs and 4 outputs per party. There, each output can, without loss of generality, be represented by two bits: $A$ and $A'$ for Alice, $B$ and $B'$ for Bob, and $C$ and $C'$ for Charlie. Then, $I_2$ can be chosen as the Clauser-Horne-Shimony-Holt inequality [40]: [Eq. (B7): display missing]. Importantly, this inequality can be violated with three Popescu-Rohrlich boxes $p_{PR}(a, b | x, y) := (1/2)\, \delta_{a \oplus b,\, xy}$ [42] distributed among the three parties, such that the overall tripartite correlations are given by $p(a_1, a'_1, a_2, a'_2, a_3, a'_3 | x_1, x_2, x_3) = p_{PR}(a_1, a_2 | x_1, x_2)\, p_{PR}(a'_1, a_3 | x_1, x_3)\, p_{PR}(a'_2, a'_3 | x_2, x_3)$. The latter correlations yield the maximal algebraic value 12 for the lhs of (B7). This shows that the causal Bell classes {(1), (2), (1, 2, 3)}, {(1), (2, 3), (1, 2, 3)} and {(1, 3), (2, 3), (1, 2, 3)} are nonsignalling interesting. Furthermore, a very surprising fact arises in the scenario of 3 inputs and 16 outputs (4 bits) per party. There, quantum correlations exist that are incompatible with the three causal classes. More precisely, each output can now be represented by four bits: $A_1$, $A_2$, $A'_1$, and $A'_2$ for Alice, $B_1$, $B_2$, $B'_1$, and $B'_2$ for Bob, and $C_1$, $C_2$, $C'_1$, and $C'_2$ for Charlie. Then, $I_2$ is now chosen as a bipartite Bell inequality $I_{PM}$, for 3 inputs and 4 outputs per party, associated to the so-called Peres-Mermin square [44,45]. See Eq. (7) in Ref. [45], for instance, for an explicit expression of $I_{PM}(A_1, A_2, B_1, B_2)$. In turn, for this inequality the local and maximal nonsignalling bounds are $\beta_L = 7$ and $\beta_{NS} = 9$, respectively.
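The value 12 quoted above for three Popescu-Rohrlich boxes can be verified by direct summation. The sketch below assumes that the tripartite expression is the sum of three CHSH expressions evaluated on the output pairs $(A, B)$, $(A', C)$ and $(B', C')$, a pairing inferred from the way the three boxes are distributed. The function names, the bit ordering and the uniform averaging over the spectator's input are illustrative choices.

```python
from fractions import Fraction
from itertools import product


def p_pr(a, b, x, y):
    """Popescu-Rohrlich box: p(a,b|x,y) = 1/2 if a XOR b == x*y, else 0."""
    return Fraction(1, 2) if (a ^ b) == x * y else Fraction(0)


def p3(a1, a1p, a2, a2p, a3, a3p, x1, x2, x3):
    """Three PR boxes shared pairwise; each party uses the same input on both halves."""
    return p_pr(a1, a2, x1, x2) * p_pr(a1p, a3, x1, x3) * p_pr(a2p, a3p, x2, x3)


def chsh(pick_u, pick_v, in_u, in_v):
    """CHSH value sum_{s,t} (-1)**(s*t) <U_s V_t> for two chosen output bits of p3.

    Bits are ordered (a1, a1', a2, a2', a3, a3'); the spectator's input is averaged
    out, which is harmless because the correlations are nonsignaling.
    """
    total = Fraction(0)
    for x in product(range(2), repeat=3):
        for bits in product(range(2), repeat=6):
            sign = (-1) ** (x[in_u] * x[in_v] + bits[pick_u] + bits[pick_v])
            total += Fraction(1, 2) * sign * p3(*bits, *x)
    return total


# Assumed pairing: I_2(A, B) + I_2(A', C) + I_2(B', C').
i3 = chsh(0, 2, 0, 1) + chsh(1, 4, 0, 2) + chsh(3, 5, 1, 2)
bound = 2 + 2 * 4  # beta_L + 2 * beta_NS for CHSH
```

Each CHSH term evaluates to 4, so `i3` equals 12 and exceeds the value $\beta_L + 2\beta_{NS} = 10$ obtained when only one pair is restricted to the local bound.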
The interesting feature of $I_{PM}$ for our purposes is that it can be violated by quantum correlations obtained from a maximally entangled state of two ququarts, or, equivalently, two maximally entangled states of two qubits, up to the algebraic maximal value $\beta_{NS} = 9$. Thus, with this choice, Eq. (B6) gives the overall Bell inequality

$$I_{PM}(A_1, A_2, B_1, B_2) + I_{PM}(A_1', A_2', C_1, C_2) + I_{PM}(B_1', B_2', C_1', C_2') \le 25, \tag{B8}$$

which can be maximally violated up to the algebraic maximal value 27 with three maximally entangled states of two ququarts each appropriately distributed among the three parties. Another construction with equivalent implications would be to take $I_2$ as the celebrated chained inequality $I_{chain}$, with 4 outputs (2 bits) and different numbers of inputs per party. This can also be maximally violated up to its algebraic maximal value with quantum correlations. A maximally entangled state of just two qubits (instead of ququarts) is needed for this choice, but at the expense of requiring an infinitely large number of inputs [46]. Nevertheless, we note that the experimental violations obtained in Refs. [47] and [48] [25]. We emphasise that the question about the existence of quantum correlations more nonlocal than totally paired models had been open since the work of Ref. [25].

Consider then arbitrary correlations $p$ produced by a generic Bayesian network with respect to {(1, 3), (2, 3), (1, 2, 3)}, with elements $p(a_1, a_2, a_3|x_1, x_2, x_3)$. Then, the marginal correlations over Alice and Bob will have elements

\begin{align}
p(a_1, a_2|x_1, x_2) &:= \sum_{a_3, x_3, \lambda} p(a_1, a_2, a_3, x_3, \lambda|x_1, x_2) \nonumber \\
&= \sum_{a_3, x_3, \lambda} p(a_1|a_2, a_3, x_1, x_2, x_3, \lambda)\, p(a_2|a_3, x_1, x_2, x_3, \lambda)\, p(a_3|x_1, x_2, x_3, \lambda)\, p(x_3, \lambda|x_1, x_2) \tag{B9} \\
&= \sum_{a_3, x_3, \lambda} p(a_1|x_1, x_3, \lambda)\, p(a_2|x_2, x_3, \lambda)\, p(a_3|x_1, x_2, x_3, \lambda)\, p(x_3, \lambda) \tag{B10} \\
&= \sum_{x_3, \lambda} p(a_1|x_1, x_3, \lambda)\, p(a_2|x_2, x_3, \lambda)\, p(x_3, \lambda) \tag{B11} \\
&= \sum_{\lambda'} p(a_1|x_1, \lambda')\, p(a_2|x_2, \lambda')\, p(\lambda'), \nonumber
\end{align}

where $\lambda' := (x_3, \lambda)$.

TABLE II.

(1, 2, 3)}, and, clearly, also for convex combinations of correlations produced by them. This proves the → implications.
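The marginalisation in Eqs. (B9)-(B11) can be illustrated numerically. The sketch below is a sanity check under stated assumptions, not part of the proof: it draws random models in which Alice's output depends on $(x_1, x_3, \lambda)$ and Bob's on $(x_2, x_3, \lambda)$, with $(x_3, \lambda)$ independent of $(x_1, x_2)$, and evaluates the largest CHSH value of the Alice-Bob marginal. Binary outputs, a three-valued $\lambda$, and the names `random_star_marginal` and `max_chsh` are choices made here for illustration.

```python
import random
from itertools import product

BITS = (0, 1)

def random_star_marginal(rng, n_lambda=3):
    """Alice-Bob marginal of a random model with a1 <- (x1, x3, lam), a2 <- (x2, x3, lam)."""
    hidden = list(product(BITS, range(n_lambda)))            # values of (x3, lam)
    raw = [rng.random() for _ in hidden]
    norm = sum(raw)
    weight = {h: r / norm for h, r in zip(hidden, raw)}       # p(x3, lam)
    # Probability of output 0 for each party, given its own input and (x3, lam).
    alice0 = {(x, h): rng.random() for x in BITS for h in hidden}
    bob0 = {(x, h): rng.random() for x in BITS for h in hidden}

    def p(a1, a2, x1, x2):
        total = 0.0
        for h, w in weight.items():
            pa = alice0[(x1, h)] if a1 == 0 else 1.0 - alice0[(x1, h)]
            pb = bob0[(x2, h)] if a2 == 0 else 1.0 - bob0[(x2, h)]
            total += w * pa * pb      # a3 is already summed out, as in (B11)
        return total
    return p

def max_chsh(p):
    """Largest absolute value over the CHSH expressions with one flipped correlator."""
    corr = {(x, y): sum((-1) ** (a ^ b) * p(a, b, x, y)
                        for a, b in product(BITS, repeat=2))
            for x, y in product(BITS, repeat=2)}
    s = sum(corr.values())
    return max(abs(s - 2.0 * e) for e in corr.values())

worst = max(max_chsh(random_star_marginal(random.Random(seed)))
            for seed in range(200))
print(worst <= 2.0 + 1e-9)
```

Because $(x_3, \lambda)$ acts as a single shared hidden variable for the Alice-Bob marginal, no sampled model should exceed the local CHSH bound of 2, whereas a PR box passed to `max_chsh` reaches 4.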
As a final comment, we note that the last proof generalises straightforwardly to the case of arbitrary N. In other words, all causal Bell classes represented by IO BDAGs containing a star, i.e., for which all inputs go to the output of one party while any other locality relaxation involves the latter party, are equivalent.

In this section, as a further example of the applicability of our machinery, we list all the IO BDAGs, excluding party-exchange redundancies, that appear for N = 4. The complete hierarchy possesses a total of 52 causal Bell classes, taking into account both nonsignalling interesting as well as nonsignalling boring ones, and disregarding collapses among different classes.

Level    Number of IO BDAGs
0        1
1        1
2        5
3        13
4        27
5        38
6        48
7        38
8        27
9        13
10       5
11       1
12       1

TABLE III. Table with all the IO BDAGs that arise in the fourpartite scenario excluding party-exchange redundancies. The complete hierarchy possesses 52 causal Bell classes, including both nonsignalling boring and interesting ones, distributed in 12 levels.
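The entries of Table III can be cross-checked by brute force. The sketch below rests on an assumption made here rather than stated in the text: that an IO BDAG for N = 4 is fixed by which inputs $x_i$ influence which other parties' outputs $a_j$, so that IO BDAGs modulo party exchange correspond to directed graphs on four nodes modulo relabelling, and that the first table column counts those cross-party arrows. The names `canonical` and `count_by_edges` are illustrative.

```python
from itertools import permutations

N = 4
# All possible cross-party arrows "input of party i -> output of party j".
EDGES = [(i, j) for i in range(N) for j in range(N) if i != j]

def canonical(edge_set):
    """Lexicographically smallest relabelling of a graph under all party permutations."""
    best = None
    for perm in permutations(range(N)):
        relabelled = tuple(sorted((perm[i], perm[j]) for i, j in edge_set))
        if best is None or relabelled < best:
            best = relabelled
    return best

def count_by_edges():
    """Number of inequivalent graphs for each number of arrows, from 0 to 12."""
    classes = set()
    for mask in range(1 << len(EDGES)):
        edge_set = [e for k, e in enumerate(EDGES) if (mask >> k) & 1]
        classes.add(canonical(edge_set))
    counts = [0] * (len(EDGES) + 1)
    for graph in classes:
        counts[len(graph)] += 1
    return counts

counts = count_by_edges()
print(counts, sum(counts))
```

Under that assumption, the enumeration returns the same thirteen numbers as the second column of Table III, 218 graphs in total. The canonical form is computed by trying all 24 party permutations, which is affordable for N = 4 but would need a smarter isomorphism test for larger N.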