On the definition and characterisation of multipartite causal (non)separability

The concept of causal nonseparability has been recently introduced, in opposition to that of causal separability, to qualify physical processes that locally abide by the laws of quantum theory, but cannot be embedded in a well-defined global causal structure. While the definition is unambiguous in the bipartite case, its generalisation to the multipartite case is not so straightforward. Two seemingly different generalisations have been proposed, one for a restricted tripartite scenario and one for the general multipartite case. Here we compare the two, showing that they are in fact inequivalent. We propose our own definition of causal (non)separability for the general case, which, although a priori subtly different, turns out to be equivalent to the concept of "extensible causal (non)separability" introduced before, and which we argue is a more natural definition for general multipartite scenarios. We then derive necessary, as well as sufficient, conditions to characterise causally (non)separable processes in practice. These allow one to devise practical tests, by generalising the tool of witnesses of causal nonseparability.


I. INTRODUCTION
The notion of a causal order between events is an essential ingredient in our understanding of the world. Our conventional view of causality is that events are ordered according to some global time parameter, with past events influencing future events, but not vice versa. One may however wonder whether this concept is really fundamental, or whether scenarios without such an underlying background causal structure are conceivable. The situation is particularly interesting in quantum theory, where the properties of physical systems are not always well-defined, and where the question arises of whether the causal structure itself can be subject to quantum effects in a similar way. These questions are of great importance for the foundations of physics [1][2][3], but they are also motivated by a more practical point of view, as new resources for quantum information processing become available when the assumption of a definite causal structure is relaxed [4][5][6][7][8][9][10][11].
A particular model describing causal relations between quantum events is the so-called process matrix formalism [2]. In this framework, quantum events are assumed to take place locally, but the causal order between them is not specified a priori. The physical resource relating the local events is described by a process matrix, which, broadly speaking, is a generalisation of a multipartite density matrix allowing also for the description of signalling scenarios, such as quantum channels. As it turns out, some scenarios arising within this formalism are indeed incompatible with any definite causal order. The process matrices corresponding to these scenarios are called causally nonseparable, while the process matrices describing scenarios compatible with a well-defined causal structure are called causally separable.
The process matrix formalism was initially introduced for two local events. In that bipartite case, the notion of causal (non)separability is clearly defined and well understood. In particular, the causal (non)separability of any bipartite process matrix can be determined using witnesses of causal nonseparability [12,13], similar conceptually to entanglement witnesses. While the formalism of process matrices generalises rather easily to more parties [12,14,15], the notion of causal (non)separability becomes less clear. In fact, several different definitions have recently been proposed to generalise the bipartite case [12,15] which, as it turns out, are not equivalent.
In this work, we clarify the definition of causal (non)separability in multipartite scenarios. After recalling the framework and definitions in the bipartite case, we compare the generalisations of causal (non)separability that have been proposed so far, before proposing and motivating our own definition for the multipartite case (Definition 6). We then provide a characterisation of multipartite causally (non)separable processes via necessary as well as sufficient conditions (Propositions 8, 9 and 10), allowing us to generalise the tool of witnesses of causal nonseparability.

We consider two parties, Alice (A) and Bob (B), who each open their laboratories only once to let an incoming physical system enter, and once to send out an outgoing system. Alice and Bob may choose local operations to perform within their laboratories, possibly depending on some external (classical) input $x$ or $y$ for A and B, and producing (classical) measurement outcomes $a$ and $b$, respectively. The correlations established between the parties after repeating the experiment many times are described by the conditional probability distribution $P(a,b|x,y)$.
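While no assumption is made about the global causal order between the parties, we assume that the local operations performed inside the laboratories are described by standard quantum theory. We can therefore assign some "incoming" and "outgoing" Hilbert spaces to the parties, which we denote $\mathcal{H}^{A_I}$, $\mathcal{H}^{A_O}$ for Alice and $\mathcal{H}^{B_I}$, $\mathcal{H}^{B_O}$ for Bob. The spaces of Hermitian linear operators over these Hilbert spaces are denoted by $A_I$, $A_O$, $B_I$, $B_O$, with $A_{IO} := A_I \otimes A_O$ and $B_{IO} := B_I \otimes B_O$. In this paper, we will only consider finite-dimensional Hilbert spaces; for a generalisation of the framework to infinite-dimensional systems, see Ref. [16].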
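According to quantum theory, Alice and Bob's local operations can most generally be described as quantum instruments [17], that is, sets of completely positive (CP) maps that sum up to completely positive trace-preserving (CPTP) maps. The Choi-Jamiołkowski (CJ) isomorphism [18,19] allows one to represent these CP maps as positive semidefinite matrices $M^{A_{IO}}_{a|x} \in A_{IO}$ and $M^{B_{IO}}_{b|y} \in B_{IO}$, with the trace-preserving condition on their sums reading $\mathrm{Tr}_{A_O} \sum_a M^{A_{IO}}_{a|x} = \mathbb{1}^{A_I}$ and $\mathrm{Tr}_{B_O} \sum_b M^{B_{IO}}_{b|y} = \mathbb{1}^{B_I}$. Here, $\mathrm{Tr}_X$ denotes the partial trace over the system $X$, and $\mathbb{1}^X$ denotes the identity operator in the space $X$ (in general, superscripts on operators, which may be omitted when clear enough, denote the system(s) they apply to).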
As shown in Ref. [2], requiring compatibility with quantum mechanics locally and assuming the noncontextuality of the probabilities imply that the probabilities $P(a,b|x,y)$ must be bilinear in the CP maps associated with the operations of A and B or, equivalently, bilinear in their CJ representations. (Throughout this paper we will often refer to CP maps by their equivalent CJ representation and vice versa.) It follows that the overall process can be described by a Hermitian operator, a "process matrix" $W \in A_{IO} \otimes B_{IO}$ [2], such that the correlations are obtained via the generalised Born rule

$$P(a,b|x,y) = \mathrm{Tr}\big[ \big( M^{A_{IO}}_{a|x} \otimes M^{B_{IO}}_{b|y} \big) \cdot W \big] \qquad (1)$$

(where Tr is now the full trace).

The framework also permits the parties to share, in addition to the process matrix, some (possibly entangled) ancillary quantum state that can be accessed via their local operations. The parties may thus have access also to some extra incoming Hilbert spaces $\mathcal{H}^{A_{I'}}$ and $\mathcal{H}^{B_{I'}}$ of arbitrary (finite) dimension, and be able to perform CP maps that act on these spaces as well, with CJ representations in $A_{II'O} := A_I \otimes A_{I'} \otimes A_O$ and $B_{II'O} := B_I \otimes B_{I'} \otimes B_O$. This implies that any process matrix $W \in A_{IO} \otimes B_{IO}$ can be extended to a process matrix $W \otimes \rho \in A_{II'O} \otimes B_{II'O}$, for any extra incoming spaces $A_{I'}$, $B_{I'}$ and any $\rho \in A_{I'} \otimes B_{I'}$ [2].
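As a purely illustrative numerical check of the generalised Born rule, the following minimal sketch evaluates $\mathrm{Tr}[(M^{A_{IO}}_{a|x} \otimes M^{B_{IO}}_{b|y}) \cdot W]$ for a toy example. The process matrix, the states `rho_A`, `rho_B` and the helper names `instrument` and `born_rule` are assumptions chosen for this illustration and do not appear in the paper; all operators are real and diagonal, so that transposition conventions in the CJ isomorphism play no role.

```python
import numpy as np

id2 = np.eye(2)
# Projectors onto the computational basis states |0> and |1>
proj = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

# Toy process matrix (an assumption for this sketch): each party receives a
# fixed qubit state and its outgoing system is discarded.
# Subsystem order: A_I, A_O, B_I, B_O.
rho_A = np.diag([0.75, 0.25])
rho_B = np.diag([0.5, 0.5])
W = np.kron(np.kron(rho_A, id2), np.kron(rho_B, id2))  # Tr W = d_{A_O} d_{B_O} = 4

def instrument(outcome):
    """CJ matrix of 'measure in the computational basis, obtain `outcome`,
    then prepare |0> on the outgoing system' (acts on X_I ⊗ X_O)."""
    return np.kron(proj[outcome], proj[0])

def born_rule(M_A, M_B, W):
    """P(a,b|x,y) = Tr[(M_A ⊗ M_B) · W]."""
    return float(np.real(np.trace(np.kron(M_A, M_B) @ W)))

# Table of probabilities P(a, b) for the single setting considered here
P = np.array([[born_rule(instrument(a), instrument(b), W)
               for b in range(2)] for a in range(2)])
print(P)
print(P.sum())
```

In this toy case the table factorises into the outcome statistics of the two incoming states, and the four entries sum to one, as required for a valid process matrix and valid instruments.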
Requiring Eq. (1) to yield valid (i.e., nonnegative and normalised) probabilities, even when the parties share arbitrary ancillary states, is equivalent to $W$ satisfying the following constraints:

$$W \ge 0, \qquad W \in \mathcal{L}^{\{A,B\}}, \qquad \mathrm{Tr}\, W = d_{A_O} d_{B_O}, \qquad (2)$$

where $d_{A_O}$ and $d_{B_O}$ are the dimensions of the outgoing Hilbert spaces, for some particular linear subspace $\mathcal{L}^{\{A,B\}}$ of $A_{IO} \otimes B_{IO}$; see Sec. IV A and Appendix A 1 for an explicit characterisation [2,12]. In the following we will refer to a matrix satisfying the first two constraints above (i.e., without necessarily imposing the normalisation constraint) as a valid process matrix, and whenever we talk about a process matrix $W$ we always implicitly assume it is valid. Hermitian matrices that are not valid process matrices will simply be referred to as "matrices".

B. Bipartite causal (non)separability
One may now ask whether or not the situation described by a process matrix can be embedded in a well-defined causal structure, with a fixed causal order between the events happening in each party's laboratory.
A process matrix is said to be "compatible with (the causal order) $A \prec B$" (sometimes abbreviated to just "$A \prec B$", e.g., in superscripts) if all the correlations it generates are compatible with a causal order where A acts before B, which is to be understood operationally: such a process matrix $W^{A \prec B}$ does not allow for any signalling from B to A. More precisely, whatever the CP and CPTP maps $M^{A_{IO}}_{a|x}$, $M^{B_{IO}}_{y^{(\prime)}}$ of A and B, the resulting correlations respect the no-signalling condition $P(a|x,y) = P(a|x,y')$, or $\mathrm{Tr}[(M^{A_{IO}}_{a|x} \otimes M^{B_{IO}}_{y}) \cdot W^{A \prec B}] = \mathrm{Tr}[(M^{A_{IO}}_{a|x} \otimes M^{B_{IO}}_{y'}) \cdot W^{A \prec B}]$ according to Eq. (1). This constrains $W^{A \prec B}$ to be in a linear subspace $\mathcal{L}^{A \prec B} \subset \mathcal{L}^{\{A,B\}}$ of $A_{IO} \otimes B_{IO}$; see Sec. IV A and Appendix A 2 for an explicit characterisation of $\mathcal{L}^{A \prec B}$.
Likewise, process matrices that do not allow signalling from A to B are said to be compatible with the causal order $B \prec A$, and will typically be denoted $W^{B \prec A} \in \mathcal{L}^{B \prec A}$. One can also conceive of situations where the causal order is not fixed to be the same for all experimental runs, but where there is instead a probabilistic mixture of the two possibilities. Such a scenario is described by a convex combination of process matrices compatible with $A \prec B$ and $B \prec A$, respectively. Process matrices of this form remain compatible with an underlying causal framework and are the subject of the following definition, first introduced by Oreshkov, Costa and Brukner [2]:

Definition 1 (Bipartite causal (non)separability [2]). A bipartite process matrix $W$ is said to be causally separable if and only if it can be written as a convex combination

$$W = q\, W^{A \prec B} + (1-q)\, W^{B \prec A}, \qquad (3)$$

with $q \in [0,1]$ and where $W^{A \prec B}$ and $W^{B \prec A}$ are two process matrices compatible with the causal orders $A \prec B$ and $B \prec A$, respectively. A process matrix that cannot be decomposed as above is said to be causally nonseparable.
Causally separable process matrices thus describe the most general bipartite situations where one can identify a definite causal order between the parties, be it fixed for all experimental runs or subject to classical randomness. In contrast, if a process matrix is causally nonseparable, it is incompatible with any causal order between A and B. In the bipartite case, causal (non)separability can be easily and efficiently verified; in particular, any causally nonseparable process can be detected using a witness of causal nonseparability [12,13] (see Sec. IV D).

C. Towards generalising to more parties
The process matrix framework itself generalises rather easily to the multipartite case.
Let us first introduce some generalised notations. We shall consider $N$ parties denoted by $A_k$ for $k \in \mathcal{N} := \{1,\ldots,N\}$, with corresponding inputs and outputs denoted by $x_k$ and $a_k$, respectively. We define the input and output vectors $\vec{x} := (x_1,\ldots,x_N)$ and $\vec{a} := (a_1,\ldots,a_N)$. The "incoming" and "outgoing" Hilbert spaces for each party are denoted by $\mathcal{H}^{A^k_I}$, $\mathcal{H}^{A^k_O}$ (of dimensions $d_{A^k_I}$, $d_{A^k_O}$, respectively), while the spaces of Hermitian linear operators over these Hilbert spaces are denoted by $A^k_I$, $A^k_O$. We also define $A^k_{IO} := A^k_I \otimes A^k_O$, and $d_{A^k_{IO}} := d_{A^k_I} d_{A^k_O}$. For a subset $\mathcal{K} \subseteq \mathcal{N}$ of parties, we will denote by $\vec{x}_{\mathcal{K}}$ and $\vec{a}_{\mathcal{K}}$ the vectors of inputs and outputs restricted to the parties in $\mathcal{K}$, and use shorthand notations like $A^{\mathcal{K}}_{IO} := \bigotimes_{k \in \mathcal{K}} A^k_{IO}$ ($= \mathbb{R}$ if $\mathcal{K} = \emptyset$), $\mathbb{1}^{\mathcal{K}} := \bigotimes_{k \in \mathcal{K}} \mathbb{1}^{A^k_{IO}} = \mathbb{1}^{A^{\mathcal{K}}_{IO}}$, and $\mathrm{Tr}_{\mathcal{K}}$ for the trace over all (incoming and outgoing) systems of the parties in $\mathcal{K}$, i.e., $\mathrm{Tr}_{A^{\mathcal{K}}_{IO}}$ or $\mathrm{Tr}_{A^{\mathcal{K}}_{II'O}}$, as appropriate (see below), and with $\mathrm{Tr}_{\emptyset}$ the identity operation and $\mathrm{Tr}_{\mathcal{N}}$ the full trace. For notational simplicity, we shall identify the parties' names with their labels, and singletons of parties (e.g., $\{A_k\}$) with the parties themselves (e.g., $A_k$) or the corresponding label, so that $\mathcal{N} = \{1,\ldots,N\} \equiv \{A_1,\ldots,A_N\}$, $\mathcal{N} \setminus \{A_k\} \equiv \mathcal{N} \setminus k$, $\mathrm{Tr}_{\{A_k\}} \equiv \mathrm{Tr}_k$, etc.
The CP maps corresponding to the parties' operations are then denoted by $M^{A^k_{IO}}_{a_k|x_k}$, and the process is represented by a process matrix $W \in A^{\mathcal{N}}_{IO}$. The resulting correlations are then obtained through a generalised Born rule as before:

$$P(\vec{a}|\vec{x}) = \mathrm{Tr}\big[ \big( M^{A^1_{IO}}_{a_1|x_1} \otimes \cdots \otimes M^{A^N_{IO}}_{a_N|x_N} \big) \cdot W \big]. \qquad (4)$$

As in the bipartite case, the parties may also share some ancillary state $\rho$ in some extra incoming spaces $A^1_{I'} \otimes \cdots \otimes A^N_{I'} = A^{\mathcal{N}}_{I'}$, and extend their local operations to act on these spaces as well. Requiring again the nonnegativity and normalisation of all obtainable probabilities, including for arbitrary extensions $W \otimes \rho$ of $W$, imposes validity constraints on $W$. In the general multipartite case, they read

$$W \ge 0, \qquad W \in \mathcal{L}^{\mathcal{N}}, \qquad \mathrm{Tr}\, W = \prod_{k \in \mathcal{N}} d_{A^k_O}, \qquad (5)$$

for some particular linear subspace $\mathcal{L}^{\mathcal{N}}$ of $A^{\mathcal{N}}_{IO}$; see Sec. IV A and Appendix A 1 [2,12]. As for the bipartite case, in this paper a matrix will be called a (valid) process matrix whenever it satisfies the first two constraints above, without necessarily requiring that it is correctly normalised.
The no-signalling constraints can readily be generalised to the $N$-partite case, allowing the notion of compatibility with a fixed causal order to be extended accordingly. For instance, a process matrix is said to be compatible with the fixed causal order $A_1 \prec A_2 \prec \cdots \prec A_N$ if no party or group of parties can signal to other parties in their causal "past" (as defined by the specified causal order), which translates into the constraint that $P(a_1,\ldots,a_k|\vec{x}) = P(a_1,\ldots,a_k|x_1,\ldots,x_k)$ for all $k = 1,\ldots,N-1$. As before, this constrains such a process matrix $W^{A_1 \prec \cdots \prec A_N}$ to be in a linear subspace $\mathcal{L}^{A_1 \prec \cdots \prec A_N} \subset \mathcal{L}^{\mathcal{N}}$ of $A^{\mathcal{N}}_{IO}$; see Sec. IV A for an explicit characterisation of $\mathcal{L}^{A_1 \prec \cdots \prec A_N}$ (and Appendices A 2-A 4 for further discussions and characterisations of process matrices compatible with other fixed causal orders).
What is not so straightforward, however, is to generalise the concept of causal (non)separability, which turns out to be much more subtle for more than two parties. In particular, additional complexity arises in the multipartite case because the causal order can be dynamical as well as probabilistic, that is, the causal order of parties in the future can depend on operations of parties in the past [14,15,20]. Simply considering a convex combination of process matrices compatible with different fixed causal orders does not include scenarios with such dynamical causal orders, and is therefore too restrictive to capture all scenarios that should be considered compatible with a well-defined causal order. Perhaps more strikingly, as we shall see, the possibility to extend process matrices with ancillary quantum states has nontrivial implications for the definition of causal (non)separability for more than two parties [15]. The main objectives of this paper are precisely to discuss how the concept of causal (non)separability should properly be generalised to the multipartite case, and to characterise causally separable and causally nonseparable process matrices.

III.

The multipartite case was first considered in a restricted tripartite situation in which one party has no (or, equivalently, a trivial) outgoing system. This particular scenario was studied because of its relevance for a practical, physically implementable protocol where the causal order between two operations on a target system is controlled by another quantum system in a superposition. This so-called quantum switch constitutes a new resource for quantum computation that goes beyond causally ordered quantum circuits [4]. It can naturally be described in the process matrix formalism [12,15] and corresponds indeed to a tripartite process matrix for parties A, B and C (Charlie), where Charlie has no outgoing system and therefore cannot signal to the other parties.
The situation is thus relatively similar to the bipartite case, since the only relevant causal orders are those where Charlie acts last, i.e., $A \prec B \prec C$ and $B \prec A \prec C$. This observation led Araújo et al. to propose the following definition (as an initial, "1-step" generalisation of Definition 1) for this particular scenario:

Definition 2 (Araújo et al.'s causal separability [12]). In a tripartite scenario where party C has no outgoing system, a process matrix $W$ is said to be causally separable if and only if it can be written as a convex combination

$$W = q\, W^{A \prec B \prec C} + (1-q)\, W^{B \prec A \prec C}, \qquad (6)$$

with $q \in [0,1]$ and where $W^{A \prec B \prec C}$ and $W^{B \prec A \prec C}$ are two process matrices compatible with the causal orders $A \prec B \prec C$ and $B \prec A \prec C$, respectively.
It was shown that the process matrix describing the quantum switch is causally nonseparable as per Definition 2 [12], and this definition has subsequently been used e.g. in Refs. [13,16,21].

B. Oreshkov and Giarmatzi's definitions
While Araújo et al.'s definition recalled above applied only to a particular tripartite situation, Oreshkov and Giarmatzi (OG) considered in Ref. [15] the general multipartite case, taking into account, in particular, the possibility of dynamical causal orders. They defined in fact two possible generalisations of bipartite causal (non)separability, namely what they called the notions of "causal (non)separability" and "extensible causal (non)separability".
The definition they proposed for causal separability is recursive, in analogy with the definition of multipartite "causal correlations" [15,20], i.e., correlations that are compatible with a definite causal order. In Refs. [15,20], these were characterised as those for which it is possible to identify, up to some probability, a party that acts first, and such that, for any behaviour of this first party, the conditional correlations shared by the remaining parties are again causal. Oreshkov and Giarmatzi invoked an analogous "unraveling argument" for causally separable processes.
More specifically, their definition is based on the concept of a "conditional (process) matrix", defined for a given matrix $W$ and a given CP map $M_k := M^{A^k_{IO}}_{a_k|x_k}$ applied by a party $A_k$ as

$$W_{|M_k} := \mathrm{Tr}_k\big[ \big( M_k \otimes \mathbb{1}^{\mathcal{N} \setminus k} \big) \cdot W \big]. \qquad (7)$$

In general, even if $W$ is a valid process matrix, $W_{|M_k}$ thus defined may not be a valid process matrix (in which case we shall just talk about a "conditional matrix"). In fact, as we will see in Sec. IV A, a process matrix $W$ is compatible with party $A_k$ acting first (i.e., it does not allow signalling from the other parties to $A_k$) if and only if for any CP map $M_k$ the conditional matrix $W_{|M_k}$, as defined in Eq. (7), is (up to normalisation$^1$) a valid $(N-1)$-partite process matrix for the parties in $\mathcal{N} \setminus k$. In that case, the conditional process matrix $W_{|M_k}$ then represents the process shared by these $N-1$ parties, conditioned on party $A_k$ performing the CP map $M_k = M^{A^k_{IO}}_{a_k|x_k}$ (i.e., conditioned on both receiving the input $x_k$ and obtaining the outcome $a_k$).
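To make the notion of a conditional matrix concrete, the following minimal sketch computes it numerically for a bipartite toy example, assuming the conditional matrix takes the form $W_{|M_A} = \mathrm{Tr}_A[(M_A \otimes \mathbb{1}^B) \cdot W]$. The process matrix (a fixed state sent to Alice, followed by an identity channel from Alice to Bob), the state `rho` and the helper name `conditional_matrix` are assumptions made for this illustration only.

```python
import numpy as np

id2 = np.eye(2)
proj = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

# CJ matrix of the identity channel, |1>><<1| with |1>> = |00> + |11>
ket = np.array([1.0, 0.0, 0.0, 1.0])
cj_identity = np.outer(ket, ket)

# Toy process compatible with A ≺ B (an assumption for this sketch): Alice
# receives rho, her output is sent to Bob, and Bob's output is discarded.
# Subsystem order: A_I, A_O, B_I, B_O.
rho = np.diag([0.75, 0.25])
W = np.kron(np.kron(rho, cj_identity), id2)

def conditional_matrix(M_A, W, d_A, d_rest):
    """Tr_A[(M_A ⊗ 1) · W], with A the leading d_A-dimensional tensor factor."""
    X = np.kron(M_A, np.eye(d_rest)) @ W
    X = X.reshape(d_A, d_rest, d_A, d_rest)
    return np.trace(X, axis1=0, axis2=2)   # partial trace over Alice's systems

# Alice measures in the computational basis, obtains 0 and forwards |0>
M0 = np.kron(proj[0], proj[0])
W_cond = conditional_matrix(M0, W, d_A=4, d_rest=4)
print(W_cond)
```

In this example the conditional matrix is proportional to $|0\rangle\langle 0|^{B_I} \otimes \mathbb{1}^{B_O}$, i.e., Bob simply receives the state prepared by Alice, and the proportionality factor is the probability of Alice's outcome.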
Oreshkov and Giarmatzi then proposed the following (recursive) definition:$^2$

Definition 3 (Oreshkov and Giarmatzi's causal separability [15]). For $N = 1$, any process matrix is causally separable. For $N \ge 2$, an $N$-partite process matrix $W$ is said to be causally separable if and only if it can be decomposed as

$$W = \sum_{k \in \mathcal{N}} q_k\, W_{(k)}, \qquad (8)$$

with $q_k \ge 0$, $\sum_k q_k = 1$, and where for each $k$, $W_{(k)}$ is a process matrix compatible with party $A_k$ acting first, and is such that for any possible CP map $M_k \in A^k_{IO}$ applied by party $A_k$, the conditional $(N-1)$-partite process matrix $(W_{(k)})_{|M_k} := \mathrm{Tr}_k[(M_k \otimes \mathbb{1}^{\mathcal{N} \setminus k}) \cdot W_{(k)}]$ is itself causally separable.

$^1$ For a properly normalised process matrix $W$ compatible with $A_k$ first (i.e., which always gives $P(a_k|\vec{x}) = P(a_k|x_k)$) and a trace-non-increasing CP map $M_k = M_{a_k|x_k}$, one has $\mathrm{Tr}\, W_{|M_k} = P(a_k|x_k) \prod_{k' \in \mathcal{N} \setminus k} d_{A^{k'}_O}$, so that $W_{|M_k}$ must be divided by the factor $P(a_k|x_k)$ to also be properly normalised according to Eq. (5).

$^2$ More precisely, what we present here as their definition is actually presented in Ref. [15] (in a slightly different, but equivalent way) as a characterisation following from a more fundamental recursive definition of causally separable processes (not necessarily quantum mechanical).

As outlined in the previous section, the process matrix framework allows for process matrices to be extended by providing additional ancillary states to the parties. Taking this into account, OG introduced a second definition of causal separability for process matrices that are causally separable even under arbitrary such extensions:

Definition 4 (Oreshkov and Giarmatzi's extensible causal separability [15]). An $N$-partite process matrix $W$ is said to be extensibly causally separable if and only if it is causally separable (as per Definition 3 above), and it remains so under any extension with incoming systems in an arbitrary joint quantum state, i.e., if and only if for any extension $A^{\mathcal{N}}_{I'}$ of the parties' incoming systems and any ancillary quantum state $\rho \in A^{\mathcal{N}}_{I'}$, $W \otimes \rho$ is causally separable.
It is easy to see that OG's causal separability (CS) and extensible causal separability (ECS) are equivalent in the bipartite case, and, indeed, equivalent to Definition 1 given in Sec. II B: the process matrix $W \otimes \rho$ obtained by attaching an ancillary state $\rho$ to a causally separable process matrix $W$ of the form of Eq. (3) remains of the same form, with $W^{A \prec B} \otimes \rho$ ($W^{B \prec A} \otimes \rho$) compatible with A acting before B (B before A), and for both terms $W^{A \prec B} \otimes \rho$ and $W^{B \prec A} \otimes \rho$, whatever operation the first party applies, the resulting conditional process matrix for the other party is single-partite, hence trivially causally separable.
However, OG's CS and ECS are not equivalent in the general multipartite case and thus indeed represent two different possible multipartite generalisations of the same bipartite concept. Of course ECS implies CS, but the converse is not true in general, as a result of a phenomenon called "activation of causal nonseparability" in Ref. [15]. An explicit example of a CS process that is not ECS was indeed given in [15], in a tripartite scenario where one party has no incoming system; we will see another example in the following subsection.

C. Comparison
We thus now have three potential generalisations of the concept of causal separability to the particular tripartite situation where one party has no outgoing system, namely, the two different definitions of causal separability (Definitions 2 and 3), and that of extensible causal separability (Definition 4). How do they relate to one another? Are the two definitions of causal separability indeed equivalent? These questions are answered by the following result:

Proposition 5. In a tripartite scenario where party C has no outgoing system, Araújo et al.'s definition of causal separability (Definition 2) is equivalent to Oreshkov and Giarmatzi's definition of extensible causal separability (Definition 4), but nonequivalent to their definition of causal separability (Definition 3).
The equivalence between Definitions 2 and 4 for this particular tripartite scenario is proved explicitly in Appendix B 1 a, which we refer to for more details; we simply summarise the argument here as follows. Clearly, any process matrix $W$ of the form of Eq. (6) is ECS, as any $W \otimes \rho$ is also of that form (and also of the form of Eq. (8)), and for any $W^{A \prec B \prec C}$ and any $M_A$, the conditional process $(W^{A \prec B \prec C})_{|M_A}$ is compatible with the order $B \prec C$ (hence it is causally separable; similarly for any $W^{B \prec A \prec C}$ and any $M_B$). The proof that an ECS process matrix $W$ necessarily has the form of Eq. (6) is based on a "teleportation technique" (see Lemma 11 in Appendix B), already used in Ref. [15], that consists in introducing an ancillary system in a maximally entangled state $\rho$ shared by two parties, e.g. A and C. By definition, the global process $W \otimes \rho^{A_{I'} C_{I'}}$ has a decomposition of the form (8). It is then easy to see that the terms $W_{(A)}$ and $W_{(B)}$ compatible with parties A or B acting first are in fact compatible, since C has no outgoing system, with the causal orders $A \prec B \prec C$ and $B \prec A \prec C$, respectively, and thus contribute to the terms $W^{A \prec B \prec C}$ and $W^{B \prec A \prec C}$ in Eq. (6). For the term $W_{(C)}$ compatible with C acting first, letting C project his systems $C_{II'} := C_I \otimes C_{I'}$ onto the maximally entangled state effectively "teleports" his system to A. By definition, the conditional bipartite process then shared by A and B must be causally separable, and must therefore have a decomposition of the form (3), which also leads to a decomposition of the form (6) for $W_{(C)}$.
In order to prove the nonequivalence between Araújo et al.'s and OG's definitions of causal separability, we will now show that OG's CS and ECS are nonequivalent, i.e., that there can be "activation of causal nonseparability" (according to OG's terminology), in the scenario where party C has no outgoing system. Note that this scenario differs from that in which OG already gave an example of activation of causal nonseparability: they indeed considered a tripartite case where C has no incoming system, rather than no outgoing system.
Consider for that the following process matrix: [...] where the subsystems are written, for convenience, in the order [...]. Here, as in the other examples presented in this paper, x̂, ŷ, ẑ denote the Pauli matrices, 1 denotes the 2 × 2 identity matrix and tensor products between all matrices are implicit. We note first that $W^{\text{act.}}$ is compatible with Charlie act[...] construct a witness for $W^{\text{act.}}$, and we give one explicitly in Appendix C. Since, as stated above, the existence of such a decomposition (as in Definition 2) would be equivalent in the scenario considered here to OG's ECS (Definition 4), this implies that although $W^{\text{act.}}$ is CS according to OG's definition (see above), it is not ECS. This provides an explicit example of "activation of causal nonseparability" in that scenario.
Hence, OG's CS does not reduce (contrary to OG's ECS) to Araújo et al.'s definition of causal separability in this particular scenario. Definitions 2 and 3 of causal separability are therefore inconsistent. Our aim now is to rectify this inconsistency.

D. Our choice of definition
To fix this, we now propose our own definition of multipartite causal separability, which indeed resolves the inconsistency pointed out above, and which we argue is a more natural definition for general multipartite scenarios. Similarly to OG, we choose a recursive definition, based on the concept of a conditional process matrix and very much in the spirit of the recursive definitions that have been given for multipartite causal correlations [15,20]. For a process matrix to be compatible with a definite causal order, there should, in any run of the experiment, be a designated party that acts first (which party this is can be determined probabilistically, just like in the bipartite case) and the conditional process matrix for the remaining parties, which depends on the action of the first party, should again be causally separable for any CP map that the first party applies.
For several reasons, we consider it important to allow extensions with extra incoming systems, similar to OG's extensible causal separability. Firstly, the whole process matrix framework is constructed so as to allow for shared ancillary systems between the parties. For consistency, we should thus take into account such extensions with shared incoming quantum states when defining causal (non)separability. Indeed, entanglement is a very different resource from causal nonseparability: entangled systems do not by themselves allow signalling between parties, and should be able to be distributed between parties prior to an experiment without "activating" causal nonseparability. (Note, however, that entanglement can still play a crucial role in causal nonseparability, as e.g., in the quantum switch, where the control and target systems can end up being entangled after the parties' operations.) While a "resource theory" for causal nonseparability has not yet been developed, it is reasonable to expect that providing additional shared (entangled) incoming states should be a free operation in such an approach. These considerations lead us to propose the following definition.
Definition 6 ($N$-partite causal separability). For $N = 1$, any process matrix is causally separable. For $N \ge 2$, an $N$-partite process matrix $W$ is said to be causally separable if and only if, for any extension $A^{\mathcal{N}}_{I'}$ of the parties' incoming systems and any ancillary quantum state $\rho \in A^{\mathcal{N}}_{I'}$, $W \otimes \rho$ can be decomposed as

$$W \otimes \rho = \sum_{k \in \mathcal{N}} q_k\, W^{\rho}_{(k)},$$

with $q_k \ge 0$, $\sum_k q_k = 1$, and where for each $k$, $W^{\rho}_{(k)} \in A^{\mathcal{N}}_{II'O}$ is a process matrix compatible with party $A_k$ acting first, and is such that for any CP map $M_k \in A^k_{II'O}$ applied by party $A_k$, the conditional $(N-1)$-partite process matrix$^4$

$$(W^{\rho}_{(k)})_{|M_k} := \mathrm{Tr}_k\big[ \big( M_k \otimes \mathbb{1}^{\mathcal{N} \setminus k} \big) \cdot W^{\rho}_{(k)} \big]$$

is itself causally separable.

Note that there is a subtle difference between our definition here and that of OG's ECS (Definition 4). We indeed require all conditional process matrices appearing at all levels of the recursive decomposition to remain causally separable under extension with arbitrary ancillary states, while OG impose this a priori only for the original process matrix. In fact, although prima facie different, these definitions turn out to be equivalent; the proof of this is given in Appendix D.
From Definition 6 we recover the natural, intuitive definition of Araújo et al. [12] in the particular tripartite case where one party has a trivial outgoing system-a case of practical relevance, as the quantum switch is the first example of a causally nonseparable process that has been demonstrated and studied in laboratory experiments [21][22][23]. One can also readily verify that process matrices that are causally separable by Definition 6 cannot generate noncausal correlations (as defined in Refs. [15,20]); an explicit proof is given in Appendix E.
From now on, whenever we talk about causal (non)separability we will refer to our Definition 6.

IV. CHARACTERISING MULTIPARTITE CAUSAL (NON)SEPARABILITY
With the definition of causal (non)separability given above, we now turn to addressing the question of how to characterise causally separable process matrices in terms of simple conditions and how to demonstrate multipartite causal nonseparability in practice.
For that we will start by reviewing the characterisations of valid process matrices and of process matrices compatible with fixed causal orders, before recalling the characterisations of causally separable process matrices in the bipartite and tripartite cases, where we will give conditions for causal separability that are both necessary and sufficient. We will then present a generalisation to the $N$-partite case which, for $N \ge 4$, gives two conditions, one necessary and one sufficient, whose coincidence remains an open question.

$^4$ Note that compared to Eq. [...]
In this section we will not concern ourselves with the normalisation of process matrices (which can always be imposed later). Our characterisations will then be given in terms of linear subspaces of matrices (e.g., the spaces $\mathcal{L}^{\mathcal{N}}$ and $\mathcal{L}^{A_1 \prec \cdots \prec A_N}$ introduced already in Sec. II); when adding the requirement of positive semidefiniteness, the corresponding sets of (nonnormalised) process matrices will thus be closed convex cones of positive semidefinite matrices. This will allow the conditions we give to be checked efficiently with semidefinite programming (SDP) techniques. In particular, by generalising the techniques used for the bipartite and restricted tripartite cases in Refs. [12,13], we will extend the idea of witnesses of causal nonseparability to the multipartite case and show how multipartite witnesses can be constructed efficiently, allowing this causal nonseparability to be verified experimentally by having each party perform appropriately chosen measurements [21,23].

Following Ref. [12], we adopt the following notation, which will be used heavily throughout the rest of the paper:

$${}_{X}W := (\mathrm{Tr}_X W) \otimes \frac{\mathbb{1}^X}{d_X}, \qquad {}_{1}W := W, \qquad {}_{[\sum_X \alpha_X X]}W := \sum_X \alpha_X \, {}_{X}W,$$

with $d_X$ the dimension of the Hilbert space of system $X$ (note that $W \mapsto {}_{X}W$ defines a CPTP map). In particular, constraints of the form ${}_{[1-X]}W = 0$ (which will appear regularly) therefore mean that $W$ is of the form $W = (\mathrm{Tr}_X W) \otimes \frac{\mathbb{1}^X}{d_X}$.
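The "trace out and replace" map $W \mapsto {}_{X}W$ is straightforward to implement numerically. The following is a minimal sketch, assuming the map acts as ${}_{X}W = (\mathrm{Tr}_X W) \otimes \mathbb{1}^X/d_X$ with the tensor factors kept in their original order; the function name `trace_and_replace` and the example matrices are assumptions made for this illustration only.

```python
import numpy as np

def trace_and_replace(W, dims, systems):
    """Return _X W = (Tr_X W) ⊗ 1^X / d_X, keeping the original factor order.

    W       : square matrix on a tensor product of len(dims) subsystems
    dims    : dimensions of the subsystems, in tensor-product order
    systems : 0-based positions of the subsystems that make up X
    """
    dims = tuple(dims)
    n = len(dims)
    T = np.asarray(W).reshape(dims + dims)  # row axes 0..n-1, column axes n..2n-1
    for s in systems:
        reduced = np.trace(T, axis1=s, axis2=n + s)   # partial trace over subsystem s
        mixed = np.eye(dims[s]) / dims[s]             # normalised identity 1/d
        T = np.tensordot(reduced, mixed, axes=0)      # new axes are appended last
        T = np.moveaxis(T, [-2, -1], [s, n + s])      # move them back to position s
    return T.reshape(np.shape(W))

# Example on two qubits: replace the second subsystem
A = np.array([[0.7, 0.1], [0.1, 0.3]])
B = np.array([[0.2, 0.05], [0.05, 0.8]])
W = np.kron(A, B)
W_X = trace_and_replace(W, (2, 2), [1])

# A constraint of the form _[1-X] W = 0 is then the statement W - W_X = 0
residual = W - W_X
print(np.linalg.norm(residual))
```

Since the map is a projector, applying it twice gives the same result as applying it once, and it preserves the trace; a nonzero `residual` as above indicates that the matrix does not act as the identity on $X$.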
A. Valid process matrices and compatibility with a fixed causal order
Recall from Sec. II that the conditions for a process matrix $W$ to be valid arise from requiring that the generalised Born rule (4) should give valid probability distributions, even when the parties share arbitrary ancillary systems. The fact that these probabilities should be nonnegative imposes that $W$ must be positive semidefinite, while the requirement that these probabilities must sum to 1 implies that any valid (but, once again, not necessarily normalised) $W$ must be in a subspace $\mathcal{L}^{\mathcal{N}}$ of $A^{\mathcal{N}}_{IO}$ [2,12]. In Appendix A 1 we recall the proof (following Ref. [12]) that this subspace can be characterised as follows:

$$W \in \mathcal{L}^{\mathcal{N}} \ \Leftrightarrow \ \forall\, \mathcal{X} \subsetneq \mathcal{N},\ \mathcal{X} \neq \emptyset,\ \ \mathrm{Tr}_{\mathcal{N} \setminus \mathcal{X}} W \in \mathcal{L}^{\mathcal{X}} \ \ \text{and} \ \ {}_{\prod_{i \in \mathcal{N}}[1 - A^i_O]}W = 0 \qquad (14)$$

$$\Leftrightarrow \ \forall\, \mathcal{X} \subseteq \mathcal{N},\ \mathcal{X} \neq \emptyset,\ \ {}_{\prod_{i \in \mathcal{X}}[1 - A^i_O]\, A^{\mathcal{N} \setminus \mathcal{X}}_{IO}}W = 0. \qquad (15)$$

Written in the form of Eq. (14), the validity constraint for $W$ says that all reduced matrices $\mathrm{Tr}_{\mathcal{N} \setminus \mathcal{X}} W$ shared by the parties of any strict subset $\mathcal{X}$ of $\mathcal{N}$ (obtained after tracing out the parties that are not in $\mathcal{X}$) must be valid, and that $W$ must further satisfy the additional constraint that ${}_{\prod_{i \in \mathcal{N}}[1 - A^i_O]}W = 0$. The form of Eq. (15) expresses explicitly all the (linearly independent) constraints that these recursive validity conditions imply on $W$.$^5$ Denoting by $\mathcal{P}$ the convex cone of positive semidefinite matrices, the set of valid process matrices is then the convex cone $\mathcal{P} \cap \mathcal{L}^{\mathcal{N}}$.

In order to discuss the causal separability of process matrices, it is necessary to also characterise the subspaces of such matrices that are compatible with certain fixed causal relations between (subsets of) parties. Such causal relations, as for the particular cases of fixed causal orders discussed in the previous sections, are understood via the notion of signalling: if a party (or group of parties) is in the causal future of some others, then there is no way for them to signal to those earlier parties.
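Membership in the validity subspace can be tested numerically by enumerating the nonempty subsets of parties. The following minimal sketch assumes that the constraints take the form ${}_{\prod_{i \in \mathcal{X}}[1 - A^i_O]\, A^{\mathcal{N} \setminus \mathcal{X}}_{IO}}W = 0$ for all nonempty $\mathcal{X} \subseteq \mathcal{N}$; the helper names `trace_and_replace` and `in_validity_subspace` and the two example matrices are assumptions made for this illustration. Positive semidefiniteness has to be checked separately.

```python
import itertools
import numpy as np

def trace_and_replace(W, dims, s):
    """_X W for a single subsystem X at 0-based position s."""
    dims = tuple(dims)
    n = len(dims)
    T = W.reshape(dims + dims)
    reduced = np.trace(T, axis1=s, axis2=n + s)
    T = np.tensordot(reduced, np.eye(dims[s]) / dims[s], axes=0)
    return np.moveaxis(T, [-2, -1], [s, n + s]).reshape(W.shape)

def in_validity_subspace(W, dims, tol=1e-9):
    """Check the assumed constraints for every nonempty subset X of parties.

    dims lists the dimensions in the order A^1_I, A^1_O, ..., A^N_I, A^N_O.
    """
    n_parties = len(dims) // 2
    for size in range(1, n_parties + 1):
        for X in itertools.combinations(range(n_parties), size):
            T = W
            for k in range(n_parties):
                i_in, i_out = 2 * k, 2 * k + 1
                if k in X:
                    T = T - trace_and_replace(T, dims, i_out)   # [1 - A^k_O]
                else:
                    T = trace_and_replace(T, dims, i_in)        # A^k_IO
                    T = trace_and_replace(T, dims, i_out)
            if np.linalg.norm(T) > tol:
                return False
    return True

id2 = np.eye(2)
ket = np.array([1.0, 0.0, 0.0, 1.0])
cj_identity = np.outer(ket, ket)      # CJ matrix of the identity channel
rho = np.diag([0.75, 0.25])
dims = (2, 2, 2, 2)                   # A_I, A_O, B_I, B_O

# State sent to Alice, then identity channel from Alice to Bob
W_channel = np.kron(np.kron(rho, cj_identity), id2)
# Alice's output fed back into her own input: a causal loop
W_loop = np.kron(cj_identity, np.eye(4))

print(in_validity_subspace(W_channel, dims))
print(in_validity_subspace(W_loop, dims))
```

The first matrix describes a causally ordered process and passes all constraints, whereas the second, which would route Alice's outgoing system back into her own incoming system, violates the constraint associated with $\mathcal{X} = \{A\}$ and is rejected.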
We first consider the case of process matrices that are compatible with a given party $A_k$ acting first:⁶ regardless of the operations performed by the other parties $A_{k'}$ (for all $k' \neq k$), the marginal probability distribution for $A_k$ obtained from (4) must not depend on the CPTP maps $M^{A^{k'}_{IO}}_{x_{k'}}$ chosen by those other parties. As already mentioned in the previous section and shown in Appendix A 2, a given process matrix $W$ satisfies this condition if and only if, whatever CP map $M_k$ is applied by $A_k$, the conditional process matrix $W_{|M_k}$, as defined in Eq. (7), is a valid $(N-1)$-partite process matrix for the remaining parties in $\mathcal{N}\setminus A_k$.
We can in fact ignore here the assumption that $M_k \geq 0$, and the above constraint is equivalent to imposing that $W_{|M_k} \in \mathcal{L}^{\mathcal{N}\setminus A_k}$ for any $M_k \in A^k_{IO}$. Such a constraint defines a linear subspace of $A^{\mathcal{N}}_{IO}$. Taking its intersection with the subspace $\mathcal{L}^{\mathcal{N}}$, we denote the linear subspace of valid process matrices compatible with party $A_k$ first by $\mathcal{L}^{A_k\prec(\mathcal{N}\setminus A_k)}$. We find, using Eq. (15) above (and after removing redundant constraints; see Eq. (A12) in Appendix A 2):
$$ \begin{aligned} & W \in \mathcal{L}^{A_k\prec(\mathcal{N}\setminus A_k)} \\ & \Leftrightarrow\ \forall\, \mathcal{X} \subseteq \mathcal{N}\setminus\{k\},\ \mathcal{X} \neq \emptyset,\ {}_{\prod_{i\in\mathcal{X}}[1-A^i_O]\, A^{\mathcal{N}\setminus\mathcal{X}\setminus\{k\}}_{IO}} W = 0 \\ & \quad\ \text{and } \ {}_{[1-A^k_O]\, A^{\mathcal{N}\setminus\{k\}}_{IO}} W = 0. \end{aligned} \tag{17} $$
In Appendix A 2 we also derive constraints for more general causal orders of the form $\mathcal{K}_1 \prec \mathcal{K}_2 \prec \cdots \prec \mathcal{K}_K$, for various disjoint subsets $\mathcal{K}_i$ of $\mathcal{N}$. Of particular interest is the specific case in which each $\mathcal{K}_i$ is a singleton, which gives constraints on a process matrix $W$ being compatible with a fixed causal order such as $A_1 \prec A_2 \prec \cdots \prec A_N$. Such a $W$ must be compatible with $A_1$ acting first (and must therefore satisfy Eq. (17) for $k = 1$, in particular the constraint on its third line); then, whatever CP map $M_1$ party $A_1$ applies, the resulting conditional process matrix $W_{|M_1}$ must be a valid $(N-1)$-partite process matrix, compatible with party $A_2$ acting first (and must therefore satisfy Eq. (17) for $k = 2$, in particular its third line, with $\mathcal{N}$ replaced by $\mathcal{N}\setminus\{1\}$); etc. By iterating this argument (up until the party $A_N$), we find that the linear subspace $\mathcal{L}^{A_1\prec\cdots\prec A_N}$ of process matrices compatible with the causal order $A_1 \prec \cdots \prec A_N$ is characterised by (cf. Eq. (A16)) [12,24,25]
$$ W \in \mathcal{L}^{A_1\prec\cdots\prec A_N} \ \Leftrightarrow\ \forall\, k = 1, \ldots, N,\ {}_{[1-A^k_O]\, A^{(>k)}_{IO}} W = 0, \tag{18} $$
with $A^{(>k)}_{IO} := A^{\{k+1,\ldots,N\}}_{IO}$.
⁵ Note that the constraint in Eq. (15) can also be written as ${}_{\prod_{i\in\mathcal{X}}[1-A^i_O]} (\mathrm{Tr}_{\mathcal{N}\setminus\mathcal{X}} W) = 0$. In this paper we generically use the form of Eq. (15) for ease of notation; it may be useful, however, to keep in mind that this type of constraint is in fact a constraint on the reduced matrix $\mathrm{Tr}_{\mathcal{N}\setminus\mathcal{X}} W$ shared by the parties in $\mathcal{X}$, as written more explicitly in Eq. (14).
⁶ Note that a process matrix can be compatible with several different causal relations between parties. For example, if a matrix $W$ does not allow any party to signal to another, then it is compatible with any party or group of parties acting first.
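The fixed-order constraints can be checked numerically on a simple example. The sketch below assumes the bipartite form of the constraints for the order $X \prec Y$, namely ${}_{[1-Y_O]} W = 0$ and ${}_{[1-X_O]Y_{IO}} W = 0$, and tests them on a process in which $A$ receives the state $|0\rangle\langle 0|$ and an identity channel connects $A_O$ to $B_I$. The helper names and the subsystem ordering $(A_I, A_O, B_I, B_O)$ are illustrative choices, not taken from the paper.

```python
import numpy as np

def trace_replace(W, dims, systems):
    """Apply W -> _X W (partial trace over X, then tensor 1/d_X back in place)."""
    n = len(dims)
    D = int(np.prod(dims))
    out = W
    for s in systems:
        T = out.reshape(tuple(dims) * 2)
        reduced = np.trace(T, axis1=s, axis2=n + s)
        T = np.multiply.outer(reduced, np.eye(dims[s]) / dims[s])
        T = np.moveaxis(T, [2 * n - 2, 2 * n - 1], [s, n + s])
        out = T.reshape(D, D)
    return out

def satisfies_fixed_order(W, dims, X, Y, tol=1e-9):
    """Check the assumed bipartite constraints for the order X before Y.

    X and Y are (incoming_index, outgoing_index) pairs for the first and second party.
    """
    # _{[1 - Y_O]} W = 0 : the last party's outgoing system is trivial
    c1 = W - trace_replace(W, dims, [Y[1]])
    # _{[1 - X_O] Y_IO} W = 0 : after tracing out Y, X's outgoing system is trivial
    W_no_Y = trace_replace(W, dims, [Y[0], Y[1]])
    c2 = W_no_Y - trace_replace(W_no_Y, dims, [X[1]])
    return np.allclose(c1, 0, atol=tol) and np.allclose(c2, 0, atol=tol)

# Example process: state |0><0| into A_I, identity channel A_O -> B_I, B_O discarded.
dims = (2, 2, 2, 2)                         # (A_I, A_O, B_I, B_O), all qubits
A, B = (0, 1), (2, 3)
rho = np.diag([1.0, 0.0])
ket = np.array([1.0, 0.0, 0.0, 1.0])        # unnormalised |00> + |11>
channel = np.outer(ket, ket)                # CJ matrix of the identity channel
W_example = np.kron(np.kron(rho, channel), np.eye(2))

order_AB = satisfies_fixed_order(W_example, dims, A, B)
order_BA = satisfies_fixed_order(W_example, dims, B, A)
```

Since the example lets $A$ signal to $B$ but not vice versa, it passes the test for $A \prec B$ and fails it for $B \prec A$.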

B. Bipartite and tripartite causally (non)separable process matrices
In the bipartite scenario, the above characterisation of the subspaces L A≺B and L B≺A allows us, from Definition 1, to give the following explicit characterisation of causally separable process matrices.
Proposition 7 (Characterisation of bipartite causally separable process matrices). A matrix $W \in A_{IO} \otimes B_{IO}$ is a valid bipartite causally separable process matrix if and only if it can be decomposed as
$$ W = W_{(A,B)} + W_{(B,A)} \tag{19} $$
where, for each permutation $(X, Y)$ of the two parties $A$ and $B$, $W_{(X,Y)}$ is a positive semidefinite matrix satisfying
$$ {}_{[1-X_O]Y_{IO}} W_{(X,Y)} = 0, \qquad {}_{[1-Y_O]} W_{(X,Y)} = 0 \tag{20} $$
(i.e., $W_{(X,Y)}$ is a valid process matrix compatible with the causal order $X \prec Y$).
Note that, in contrast to Eq. (3) in Definition 1, we did not write the weights q and 1 − q explicitly in Eq. (19). Instead, for convenience and consistency with the characterisations of tripartite and N -partite causally separable processes which will follow, we decomposed W in terms of nonnormalised process matrices, writing W (A,B) = q W A≺B and W (B,A) = (1−q) W B≺A .
As we discussed in Sec. III, the tripartite case of causal separability was already studied by Oreshkov and Giarmatzi under the name "extensible causal separability" in Ref. [15]. In their Proposition 3.3 they provided a characterisation of tripartite (extensible) causal separability, albeit describing the constraints in a different way. In our approach, this characterisation can be expressed as follows:
Proposition 8 (Characterisation of tripartite causally separable process matrices). A matrix $W \in A_{IO} \otimes B_{IO} \otimes C_{IO}$ is a valid tripartite causally separable process matrix (as per Definition 6) if and only if it can be decomposed as
$$ W = W_{(A)} + W_{(B)} + W_{(C)} = W_{(A,B,C)} + W_{(A,C,B)} + W_{(B,A,C)} + W_{(B,C,A)} + W_{(C,A,B)} + W_{(C,B,A)} \tag{21} $$
where, for each permutation of the three parties $(X, Y, Z)$, $W_{(X,Y,Z)}$ and $W_{(X)} := W_{(X,Y,Z)} + W_{(X,Z,Y)}$ are positive semidefinite matrices satisfying
$$ {}_{[1-X_O]Y_{IO}Z_{IO}} W_{(X)} = 0, \tag{22} $$
$$ {}_{[1-Y_O]Z_{IO}} W_{(X,Y,Z)} = 0, \qquad {}_{[1-Z_O]} W_{(X,Y,Z)} = 0. \tag{23} $$
The proof of this characterisation was sketched in Ref. [15] using a somewhat different terminology from the one we employ; in particular, they express causal constraints in terms of restrictions on which terms are "allowed" in a Hilbert-Schmidt basis decomposition of a matrix (see Appendix A 4). We give a more detailed proof in Appendix B 1 b, which is again based on a "teleportation technique" (cf. Lemma 11 in Appendix B), similar in spirit to the one briefly sketched in Sec. III C.
Let us break down and analyse the terms appearing in the decomposition (21) to understand better this characterisation.
From the constraints in Eq. (23) it follows that, for each party $X$, the matrix $W_{(X)} = W_{(X,Y,Z)} + W_{(X,Z,Y)}$ satisfies ${}_{[1-Y_O]Z_{IO}} W_{(X)} = 0$, ${}_{[1-Z_O]Y_{IO}} W_{(X)} = 0$ and ${}_{[1-Y_O][1-Z_O]} W_{(X)} = 0$. Together with Eq. (22) and the fact that $W_{(X)}$ is positive semidefinite, this implies that $W_{(X)}$ is a valid tripartite process matrix compatible with party $X$ acting first (since it satisfies Eq. (17) for $A_k = X$). $W$ is thus decomposed in Eq. (21) as a sum of 3 valid process matrices, which ensures in particular that it is itself a valid process matrix.
On the other hand, the matrices $W_{(X,Y,Z)}$ in the decomposition (21) are not necessarily valid process matrices. Nevertheless, the constraints (23) imply that whatever the CP map $M_X$ applied by the first party $X$, the conditional process matrix $(W_{(X,Y,Z)})_{|M_X} := \mathrm{Tr}_X[M_X \otimes \mathbb{1}^{YZ} \cdot W_{(X,Y,Z)}]$ is a valid bipartite process matrix, compatible with the causal order $Y \prec Z$ (indeed, it satisfies Eq. (18) for this causal order: e.g., ${}_{[1-Z_O]} (W_{(X,Y,Z)})_{|M_X} = \mathrm{Tr}_X[M_X \otimes \mathbb{1}^{YZ} \cdot {}_{[1-Z_O]} W_{(X,Y,Z)}] = 0$).
The fact that the matrices $W_{(X,Y,Z)}$ are not necessarily valid process matrices, and thus that Eq. (21) does not simply decompose $W$ into a combination of process matrices compatible with fixed causal orders, is a consequence of the possibility of dynamical (but still well-defined, albeit not fixed) causal orders (recall the discussion at the end of Sec. II C). In Sec. IV E we will consider in more detail a concrete example of a process matrix allowing for such dynamical causal orders.
C. General multipartite causally (non)separable process matrices
As we will see below, it is possible to generalise the decomposition of Proposition 8 to the case of N-partite causal separability. While the generalisation clearly provides a sufficient condition for causal separability, it turns out that the proof that it is also a necessary condition does not readily generalise. Indeed, the proof for the tripartite case in Appendix B 1 b relies on the fact that each term $W_{(X)}$ in Eq. (21) is the sum of only two "base" terms, something that is not true in the natural generalisation of this decomposition. (To understand this better, we encourage the interested reader to look at the subtleties of that proof.) For the general multipartite case, we therefore provide the following, separate, necessary and sufficient conditions. Although these coincide in the bipartite and tripartite cases, it remains an open question whether this is the case in general (or if one is both necessary and sufficient but not the other, or if neither is).

Necessary condition
The necessary condition we present here is based on the teleportation technique and is a generalisation of the use of this approach in the proof of the tripartite characterisation. The teleportation technique is more formally described in Lemma 11 in Appendix B, but we briefly outline how it leads to the necessary condition to help understand the condition itself.
The idea is to provide, as ancillary incoming systems, a maximally entangled state between every pair of parties, defining an overall ancilla state $\rho$. If $W$ is a causally separable process matrix, then, by definition, $W \otimes \rho$ can be decomposed into a sum of process matrices $W^{\rho}_{(k)}$ compatible with a given party $A_k$ acting first (cf. Eq. (12) in Definition 6); furthermore, as $\rho$ is pure, one can write $W^{\rho}_{(k)} = W_{(k)} \otimes \rho$ with $W_{(k)}$ itself being compatible with $A_k$ first. For each such process matrix $W_{(k)}$ the party $A_k$ can then "teleport" the part of $W_{(k)}$ on their systems $A^k_{IO}$ to another party $A_{k'}$ by applying an appropriate CP map $M_k$. The effect is that the resulting $(N-1)$-partite conditional process matrix $(W^{\rho}_{(k)})_{|M_k}$ formally has the same form as $W_{(k)}$ (tensored with what is left over of the, now reduced, ancillary state $\rho$), except that the systems $A^k_{IO}$ are instead attributed ("teleported") to the ancillary incoming system $A^{k'}_{I'}$ of $A_{k'}$. From the definition of causal separability, $(W^{\rho}_{(k)})_{|M_k}$ must itself be causally separable, so the necessary condition can be recursively applied to this $(N-1)$-partite process matrix until the base case of $N = 3$, given by Proposition 8, is reached.
We give the full details of the proof of the necessary condition in Appendix B 2 a. However, in order to state more formally the condition itself, let us introduce the following notation. For a given matrix $W \in A^{\mathcal{N}}_{IO}$, we denote by $W^{A^k_{IO} \to A^{k'}_{I'}}$ the same matrix, except that the systems $A^k_{IO}$ are attributed to some other system $A^{k'}_{I'}$ (of the same dimension as $A^k_{IO}$). We then obtain the following recursive necessary condition:
Proposition 9 (Necessary condition for general multipartite causal separability). An N-partite causally separable process matrix $W \in A^{\mathcal{N}}_{IO}$ (as per Definition 6) must necessarily have a decomposition of the form
$$ W = \sum_{k \in \mathcal{N}} W_{(k)} \tag{25} $$
where each $W_{(k)}$ is a valid process matrix compatible with party $A_k$ acting first, and such that for each $k' \neq k$, $[W_{(k)}]^{A^k_{IO} \to A^{k'}_{I'}}$ is an $(N-1)$-partite causally separable process matrix.
Hence, any constraints satisfied by $(N-1)$-partite causally separable process matrices must also be satisfied by $W_{(k)}$ after re-attributing the system $A^{k'}_{I'}$ back to $A^k_{IO}$, i.e., after formally replacing $A^{k'}_{I}$ by $A^{k'}_{I'} A^{k'}_{I}$ and then $A^{k'}_{I'}$ by $A^k_{IO}$ in the constraints written using the notation defined in Eq. (13).
To further clarify this condition, let us illustrate, in the fourpartite case (with parties A, B, C, D), how one can use it to obtain explicit constraints on causally separable process matrices. Proposition 9 implies that a fourpartite causally separable process matrix $W$ must be decomposable as $W = W_{(A)} + W_{(B)} + W_{(C)} + W_{(D)}$, with each $W_{(X)}$ (for $X = A, B, C, D$) being a valid process matrix compatible with party $X$ acting first, hence satisfying Eq. (17) for $A_k = X$.⁷ For each $X$ and every other party $Y \neq X$, the recursive constraint that $[W_{(X)}]^{X_{IO} \to Y_{I'}}$ is a tripartite causally separable process matrix further implies, according to Proposition 8 (for the 3 parties $Y, Z, T \neq X$) and after re-attributing the system $Y_{I'}$ to $X_{IO}$ (i.e., replacing $Y_{IO}$ by $Y_{I'} Y_{IO}$ and then $Y_{I'}$ by $X_{IO}$ in the constraints), that there must exist a decomposition of $W_{(X)}$ of the form of Eq. (27),⁸ with one term for each ordering of the parties $Y, Z, T$ after $X$, where each term appearing in the decomposition is positive semidefinite and the terms satisfy the constraints of Eq. (28).
Finally, we remark that the constraints obtained by teleporting each party $X$'s system to just a single other party $Y$ (i.e., by just demanding the existence of a decomposition of the above form for some other party $Y$, rather than for all other parties $Y \neq X$) yield conditions that are still necessary for the causal separability of $W$, but which are generally weaker than those given in Proposition 9. Indeed, in Appendix F 1 we give an example of a fourpartite process matrix which satisfies those weaker conditions but not all of those given above.

Sufficient condition
A sufficient condition for causal separability can be obtained by considering a stricter form of the recursive decomposition (25) in Proposition 9. In particular, we demand that W has a decomposition into W (k) compatible with A k acting first and such that each W (k) itself recursively satisfies the sufficient constraints for an (N −1)-partite process matrix without A k IO being traced out. One can easily verify that the decomposition (21) in the tripartite case is a generalisation of this kind from the bipartite case. In the fourpartite case described explicitly above, this means that for each party X there should be a single decomposition of the form (27) (i.e., no longer dependent on Y ) such that the constraints (28) are satisfied without tracing out X IO on the first and fourth lines. The fact that, unlike in the necessary conditions, we only consider a single (recursive) decomposition of each W (k) means that we can give a more explicit formulation for the sufficient condition.
Before stating the sufficient condition, let us introduce some more notation. Let $\Pi$ denote the set of permutations (generically denoted by $\pi$) of $\mathcal{N}$. For an ordered subset $(k_1, \ldots, k_n)$ of $\mathcal{N}$ with $n$ elements (with $1 \leq n \leq N$, $k_i \neq k_j$ for $i \neq j$), let $\Pi_{(k_1,\ldots,k_n)}$ be the set of permutations of $\mathcal{N}$ for which the element $k_1$ is first, $k_2$ is second, ..., and $k_n$ is $n$th, i.e., $\Pi_{(k_1,\ldots,k_n)} = \{\pi \in \Pi \mid \pi(1) = k_1, \ldots, \pi(n) = k_n\}$. With this notation, we have the following sufficient condition, which directly generalises the decomposition of Proposition 8.
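The sets $\Pi_{(k_1,\ldots,k_n)}$ are easy to enumerate explicitly, which can be useful when setting up the decomposition numerically. The following is a minimal sketch; the function name `perms_with_prefix` and the representation of permutations as tuples of party labels $1, \ldots, N$ are illustrative choices.

```python
from itertools import permutations

def perms_with_prefix(N, prefix=()):
    """Enumerate the permutations of (1, ..., N) that start with the ordered subset `prefix`."""
    prefix = tuple(prefix)
    if len(set(prefix)) != len(prefix) or any(k < 1 or k > N for k in prefix):
        raise ValueError("prefix must consist of distinct party labels in 1..N")
    remaining = [k for k in range(1, N + 1) if k not in prefix]
    return [prefix + tail for tail in permutations(remaining)]

# For N = 4 parties: the full set of permutations, and those in which party 2 acts first.
all_perms = perms_with_prefix(4)
party2_first = perms_with_prefix(4, (2,))
```

Each set with a prefix of length $n$ contains $(N-n)!$ permutations, and the sets with prefixes of length one partition $\Pi$, which is what allows the terms $W_\pi$ to be grouped by which party acts first.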
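Proposition 10 (Sufficient condition for general multipartite causal separability). If a matrix $W \in A^{\mathcal{N}}_{IO}$ can be decomposed as a sum of $N!$ positive semidefinite operators $W_\pi \geq 0$ in the form
$$ W = \sum_{\pi \in \Pi} W_\pi \tag{29} $$
such that for any ordered subset of parties $(k_1, \ldots, k_n)$, the partial sum
$$ W_{(k_1,\ldots,k_n)} := \sum_{\pi \in \Pi_{(k_1,\ldots,k_n)}} W_\pi \tag{30} $$
satisfies the constraints of Eqs. (31) and (32), then $W$ is a valid causally separable process matrix (as per Definition 6).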
This decomposition was also suggested independently by Oreshkov as a possible generalisation of Proposition 8 [26] (although following the approach of Refs. [2,15], Oreshkov expressed it differently, namely in terms of allowed terms in a Hilbert-Schmidt basis decomposition of the matrices W (k1,...,kn) ; cf. Appendix A 4). The proof that the condition above is indeed sufficient is given in Appendix B 2 b. In order to understand it better, it is nonetheless worth discussing the form of the decomposition and the terms appearing within in a little more detail.
For $n = 1$, Eqs. (31) and (32) imply that each matrix $W_{(k_1)}$ ($\geq 0$) is a valid process matrix compatible with party $A_{k_1}$ acting first; indeed, Eq. (17) is satisfied for $A_k = A_{k_1}$. As $W = \sum_{k_1} W_{(k_1)}$ according to Eqs. (29)-(30), this ensures in particular that $W$ is indeed a valid process matrix.
As we have noted already, the condition of Proposition 10 coincides, in the bipartite and tripartite cases, with those given in Propositions 7 and 8, respectively. Indeed, for these cases, the necessary and sufficient conditions given here coincide. For four or more parties it remains an open question whether this is also the case. We performed several numerical searches for process matrices satisfying the necessary but not the sufficient condition (see Appendix F 2) and failed to find any such examples, although the complexity of the numerical searches means that we caution against interpreting this as evidence that the conditions coincide in general. In Appendix B 3, however, we show that they do coincide in the specific fourpartite case with $d_{D_O} = 1$. This is a rather restricted scenario (where any process matrix is compatible with D acting last), but nonetheless includes cases of interest such as the fourpartite variant of the quantum switch we discuss at the end of this section.
Finally, we note that the decomposition in Proposition 10 has consequences beyond the definition of causal separability meriting additional interest: as we show elsewhere [27], it characterises precisely (i.e., providing a necessary and sufficient condition for) quantum circuits with classical control of causal order.

D. Witnesses of causal nonseparability
While the previous characterisations provide mathematical descriptions of causally (non)separable process matrices, an important problem is the ability to detect and certify causal nonseparability in practice. One approach that has been explored extensively is to show the violation of causal inequalities [2,14,20,28,29], which is indeed only possible (within the process matrix formalism) with causally nonseparable process matrices (see Appendix E), and provides a device-independent certificate of noncausality. However, certain causally nonseparable process matrices are known not to violate any such inequalities-this is, e.g., the case for the quantum switch [12,15].
Another approach, first introduced in Ref. [12] for the bipartite and a restricted tripartite scenario, and further studied in Ref. [13], is to construct witnesses of causal nonseparability, or "causal witnesses" for short. A causal witness is defined as a Hermitian operator $S$ such that
$$ \mathrm{Tr}[S \cdot W^{\text{sep}}] \geq 0 \tag{33} $$
for all $W^{\text{sep}} \in \mathcal{W}^{\text{sep}}$, where $\mathcal{W}^{\text{sep}} \subset \mathcal{W}$ is the set of causally separable process matrices. For any causally nonseparable $W^{\text{ns}}$, it is known that there exists a causal witness $S$ such that $\mathrm{Tr}[S \cdot W^{\text{ns}}] < 0$ [12,13]. Given a process, a causal witness $S$ can be "measured" by having each party implement suitably chosen operations or measurements, providing a test of causal nonseparability that is now device-dependent. This approach has already been used to verify experimentally the causal nonseparability of two different implementations of the quantum switch [21,23].
Here we will show how this approach can be generalised to the multipartite case using the conditions described in the previous subsections. Propositions 7, 8, 9 and 10 allow for the characterisation of the convex cone $\mathcal{W}^{\text{sep}}$ of causally separable processes (or, for the latter two propositions, outer and inner approximations $\mathcal{W}^{\text{sep}}_{+}$ and $\mathcal{W}^{\text{sep}}_{-}$ thereof) in terms of Minkowski sums and intersections of linear subspaces and of the cone of positive semidefinite operators $\mathcal{P}$. The set of causal witnesses is then precisely the dual cone of $\mathcal{W}^{\text{sep}}$, $\mathcal{S} = (\mathcal{W}^{\text{sep}})^*$ [12,13]. A characterisation of $\mathcal{S}$ can, in general, be obtained from the description of $\mathcal{W}^{\text{sep}}$ by using the following duality relations for any two nonempty closed convex cones $\mathcal{C}_1$ and $\mathcal{C}_2$ [30]:
$$ (\mathcal{C}_1 \cap \mathcal{C}_2)^* = \mathcal{C}_1^* + \mathcal{C}_2^*, \qquad (\mathcal{C}_1 + \mathcal{C}_2)^* = \mathcal{C}_1^* \cap \mathcal{C}_2^* \tag{34} $$
(where $\mathcal{C}_1 + \mathcal{C}_2$ denotes the Minkowski sum of the two cones $\mathcal{C}_1$ and $\mathcal{C}_2$; note that all the cones we shall consider will be nonempty, closed and convex).
Since these cones are convex, the construction of causal witnesses (or of explicit decompositions of causally separable process matrices) can be efficiently performed with semidefinite programming (SDP), as first described in Ref. [12]; we will follow here the slightly different approach of Ref. [13]. The question of whether a given $W$ is causally separable can be reformulated as the optimisation problem of how much white noise can be added to a process matrix before it becomes causally separable. Let $\mathbb{1}^{\circ}$ be the "white noise" process matrix (which corresponds to each party just receiving a fully mixed state $\mathbb{1}^{A^k_I}/d_{A^k_I}$, and is causally separable), and consider the noisy process matrix
$$ W(r) = \frac{1}{1+r}\,(W + r\,\mathbb{1}^{\circ}). \tag{35} $$
Since the normalisation is irrelevant to membership of $\mathcal{W}^{\text{sep}}$, determining whether $W$ is causally separable can thus be phrased as the SDP optimisation problem
$$ \min\ r \quad \text{s.t.} \quad W + r\,\mathbb{1}^{\circ} \in \mathcal{W}^{\text{sep}}, \tag{36} $$
which can be efficiently solved using standard software by writing $\mathcal{W}^{\text{sep}}$ in terms of SDP constraints (see Ref. [13], the examples below and Appendix G for further details).
The solution to this problem, $r^*$, gives the random robustness $\max(r^*, 0)$ of $W$, and a value $r^* > 0$ implies that $W$ is causally nonseparable [12,13]. Eq. (36) is known as the primal problem, and is related to the dual problem defined over the dual cone $\mathcal{S}$ of $\mathcal{W}^{\text{sep}}$ [12,13]:
$$ \min\ \mathrm{Tr}[S \cdot W] \quad \text{s.t.} \quad S \in \mathcal{S},\ \ \mathrm{Tr}[S \cdot \mathbb{1}^{\circ}] = 1. \tag{37} $$
The optimal solution $S^*$ is a witness of the causal nonseparability of $W$ whenever $\mathrm{Tr}[S^* \cdot W] < 0$. The Strong Duality Theorem for SDP problems moreover relates these two problems (see Appendix G), stating that their solutions satisfy
$$ r^* = -\mathrm{Tr}[S^* \cdot W]. \tag{38} $$
This implies in particular that the witness $S^*$ thus obtained is optimal when $W$ is subject to white noise, in the sense that it witnesses the causal nonseparability of all noisy process matrices $W(r)$ with $r$ sufficiently small ($r < r^*$) for $W(r)$ to remain causally nonseparable.
For more than 3 parties, the witnesses in the set $\mathcal{S}_{+} = (\mathcal{W}^{\text{sep}}_{+})^*$ obtained from the cone $\mathcal{W}^{\text{sep}}_{+} \supseteq \mathcal{W}^{\text{sep}}$ arising from the necessary condition of Proposition 9 are also valid witnesses of causal nonseparability since $\mathcal{S}_{+} \subseteq \mathcal{S}$. On the other hand, by solving the primal SDP problem over the cone $\mathcal{W}^{\text{sep}}_{-}$ arising from the sufficient condition in Proposition 10, one can show the causal separability of any $W \in \mathcal{W}^{\text{sep}}_{-} \subseteq \mathcal{W}^{\text{sep}}$ (through the construction of an explicit causally separable decomposition for $W$ of the form given in Proposition 10). Recalling the claim that such process matrices correspond precisely to quantum circuits with classical control of causal order [27], the dual cone $\mathcal{S}_{-}$ is thus the set of "witnesses for no classical control of causal order" (which can thus be found by solving the dual SDP problem).
In Appendix G we give some concrete characterisations of the cones W sep and S for different scenarios in terms of SDP constraints, and which are relevant for the examples that we shall now give in the following section.
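To illustrate how the duality relations are used, the bipartite cone of witnesses can be derived in a few lines. The sketch below assumes the standard duality relations for closed convex cones, $(\mathcal{C}_1 \cap \mathcal{C}_2)^* = \mathcal{C}_1^* + \mathcal{C}_2^*$ and $(\mathcal{C}_1 + \mathcal{C}_2)^* = \mathcal{C}_1^* \cap \mathcal{C}_2^*$, together with two standard facts: the cone $\mathcal{P}$ of positive semidefinite matrices is self-dual, and the dual of a linear subspace $\mathcal{L}$ is its orthogonal complement $\mathcal{L}^{\perp}$ (with respect to the Hilbert-Schmidt inner product).

```latex
% Bipartite causally separable cone, written as a Minkowski sum of two fixed-order cones:
\begin{align*}
\mathcal{W}^{\text{sep}}
  &= \big(\mathcal{P} \cap \mathcal{L}^{A \prec B}\big)
   + \big(\mathcal{P} \cap \mathcal{L}^{B \prec A}\big), \\
% Dual of a sum is the intersection of the duals:
\mathcal{S} = (\mathcal{W}^{\text{sep}})^*
  &= \big(\mathcal{P} \cap \mathcal{L}^{A \prec B}\big)^*
   \cap \big(\mathcal{P} \cap \mathcal{L}^{B \prec A}\big)^* \\
% Dual of an intersection is the sum of the duals, with P^* = P and L^* = L^\perp:
  &= \big(\mathcal{P} + (\mathcal{L}^{A \prec B})^{\perp}\big)
   \cap \big(\mathcal{P} + (\mathcal{L}^{B \prec A})^{\perp}\big).
\end{align*}
```

In words, $S$ is a bipartite causal witness if and only if it can be written both as a positive semidefinite matrix plus a term orthogonal to $\mathcal{L}^{A \prec B}$ and as a positive semidefinite matrix plus a term orthogonal to $\mathcal{L}^{B \prec A}$, each of which is an SDP-representable condition.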

E. Examples
In the bipartite scenario and the restricted tripartite scenario in which C has no outgoing system, several examples of causally nonseparable process matrices have previously been formulated and studied in detail [2,12,15]. As mentioned already, the quantum switch is a particularly interesting example of a causally nonseparable process matrix in the latter scenario. The quantum switch has attracted particular interest as a consequence of being readily implementable, and indeed several implementations have been experimentally realised [21][22][23]31]. Subsequent work has sought to clarify whether such implementations can really be seen as genuine realisations of indefinite causal orders, and Ref. [32] gives arguments clarifying why they can be. The characterisations of the cones of causal witnesses that we give in Appendix G 2 for these bipartite and restricted tripartite scenarios (see Eqs. (G3) and (G5)) are equivalent to those given in Refs. [12,13], and can readily be used to verify the causal nonseparability of these examples, following the approach just outlined.
In this restricted tripartite scenario we have in fact also already looked at another explicit example: the process matrix $W^{\text{act.}}$ of Eq. (9), introduced in Sec. III C to show the "activation of causal nonseparability" under OG's definition of causal separability. An explicit witness from the cone (G5) is given in Appendix C, Eq. (C2), which could thus have been equally well found with the approach of Refs. [12,13].
Another example of "activation of causal nonseparability" under OG's terminology was given in Ref. [15] in the different tripartite case in which one party, say now A, has only a nontrivial outgoing system, and can thus always be seen as acting first. A witness for this example can be found by solving the dual SDP problem (37) using now the cone of witnesses (G7) corresponding to this restricted tripartite scenario.
Of more novel interest is the fourpartite scenario, in which causal separability has not previously been characterised. A particularly interesting and simple example here is a fourpartite version of the quantum switch, in which a party A(lice) has no incoming system (d A I = 1) and always acts first, while another party D(orothy) has no outgoing system (d D O = 1) and always acts last. Let us describe more precisely this version of the quantum switch.
The switch is composed of two qubits: a control qubit and a target qubit. Initially, Alice prepares the control qubit in some state of her choosing (in general as a function of her input x). (Note that it is here that the fourpartite switch differs from the tripartite one, where the control qubit is in a fixed superposition.) The target qubit, initially prepared (externally to the 4 parties) in some state $|\psi\rangle$, is then sent to Bob and Charlie, who act in an order that depends on the state of the control qubit: if it is $|0\rangle$ then Bob acts before Charlie ($B \prec C$), while if it is $|1\rangle$ then Charlie acts before Bob ($C \prec B$). If it is in a superposition, then Bob and Charlie can instead be seen to act in a superposition of different orders. Finally, both qubits are sent to Dorothy who can perform a measurement on them (for simplicity, we will consider that D simply ignores the target qubit and thus will trace it out, as this will not change the discussion that follows).⁹ Labelling the relevant incoming and outgoing systems with superscripts $c$ and $t$ (indicating control and target qubits), the process matrix for the quantum switch can be written as in Eq. (39) [12,15,32], where $|\mathbb{1}\rangle\rangle := |0\rangle|0\rangle + |1\rangle|1\rangle$ is the pure CJ representation (in the computational basis $\{|0\rangle, |1\rangle\}$) of an identity qubit channel. Note that, while Alice has control over the causal order of the other parties, this switch differs from a classical dynamical control of causal order in that she has coherent quantum control over the control qubit (and thus the causal orders).
In this particular restricted fourpartite scenario, our necessary and sufficient conditions for the causal separability of a process matrix $W$ coincide and reduce to the existence of a decomposition of the form given in Proposition 18 in Appendix B 3, together with linear constraints on its terms (one of which serves to ensure, with the others, that $W$ is valid). These conditions thus characterise precisely the cone $\mathcal{W}^{\text{sep}}$ in the scenario considered here, and the dual cone of causal witnesses $\mathcal{S}$ is then readily obtainable (see Eq. (G11) for the explicit characterisation).
⁹ We note that the quantum switch was also described as a fourpartite process in Ref. [23], with one party acting first, and one acting last. However, in that reference the first party was controlling the target qubit, rather than the control qubit as we consider here. In that case (with the first party controlling the target qubit), the random robustness is increased to 2.767. One could also have here a first party that controls both the target and control qubits (as in Ref. [32]), which further increases the tolerable white noise to 4.686; for simplicity we do not consider this possibility, as our goal here is just to illustrate the role of the control qubit. Note also that Rubino et al. [23] used yet another definition of causal nonseparability, different from the ones discussed in Sec. III, which did not allow for dynamical causal orders. As argued before and discussed in Refs. [15,20], such a definition is however too restrictive to really characterise processes that are compatible with a well-defined causal order, as one would like the notion of causal separability to do. Nevertheless, it turns out that the witness constructed and experimentally tested in Ref. [23] is not only a witness for fixed (nondynamical) causal orders, but also witnesses causal nonseparability as per our Definition 6.
The causal nonseparability of $W_{\text{switch}}$ can thus be verified by solving the dual SDP problem (37) and thereby obtaining a witness of its causal nonseparability. Doing so, we find that (up to numerical precision) the random robustness of $W_{\text{switch}}$ is 2.343 (note that this does not depend on the choice of initial state of the target qubit, so in solving the SDP problem numerically we can take, e.g., $|\psi\rangle = |0\rangle$). In experimental efforts to measure a witness and verify the causal nonseparability of a process matrix, one may only have access to a restricted set of operations for the parties. Many natural constraints of this kind can also be imposed as SDP constraints, as described in Ref. [13], allowing one to find implementable causal witnesses. A particularly natural such constraint is to restrict B and C's operations to unitary operations (as in the experimental implementation of the tripartite switch in Refs. [21,22]); we find that the tolerable white noise on $W_{\text{switch}}$ to witness its causal nonseparability is reduced, under such a restriction, to 0.746.
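To get a feeling for these numbers, the robustness values can be converted into the fraction of white noise in the mixture. The sketch below assumes the noisy process matrix has the form $W(r) = (W + r\,\mathbb{1}^{\circ})/(1+r)$, so that the weight of the noise term is $r/(1+r)$; the function name is hypothetical and the two values of $r^*$ are those quoted above.

```python
def white_noise_fraction(r):
    """Weight of the white-noise term in W(r) = (W + r * noise) / (1 + r) (assumed form)."""
    if r < 0:
        raise ValueError("the noise parameter r must be nonnegative")
    return r / (1.0 + r)

# Robustness values quoted in the text for the fourpartite switch.
robustness = {
    "arbitrary operations": 2.343,
    "unitary operations for B and C": 0.746,
}
fractions = {label: white_noise_fraction(r) for label, r in robustness.items()}

for label, fraction in fractions.items():
    print(f"{label}: tolerated white-noise fraction = {fraction:.3f}")
```

Under this assumption, a random robustness above 1 means that the process remains causally nonseparable even when white noise makes up more than half of the mixture.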
It is important to note that if we trace out the last party from $W_{\text{switch}}$ (i.e., $D^c_I$ in addition to $D^t_I$), we obtain the matrix $\mathrm{Tr}_D W_{\text{switch}}$ of Eq. (40), which is causally separable since it is of the form of Eq. (21) with just the first two terms, $W_{(A,B,C)}$ and $W_{(A,C,B)}$, being nonzero and satisfying the constraints of Eqs. (22) and (23). This was also the case with the original tripartite version of the quantum switch (in which the control qubit is in a fixed superposition state). There, one is left with a simple probabilistic mixture of channels in two different directions after tracing out the last party [12,15]. In contrast here, Eq. (40) is not compatible with any probabilistic mixture of fixed causal orders: indeed, $W_{(A,B,C)}$ and $W_{(A,C,B)}$ are not valid process matrices, as neither of them individually satisfies the constraint ${}_{[1-A_O]B_{IO}C_{IO}} W = 0$ (as would be required for it to be a valid process matrix). Rather, $\mathrm{Tr}_D W_{\text{switch}}$ is a "classical switch" in which A can incoherently control the causal order between B and C, which thus allows for dynamical causal orders.

V. DISCUSSION
In this paper we studied the question of how to generalise the concept of causal (non)separability to the multipartite case. We reviewed several definitions that had been proposed for multipartite scenarios in previous works, namely the definition of causal separability introduced by Araújo et al. [12] for a particular tripartite situation, and Oreshkov and Giarmatzi's definitions of causal separability (CS) and extensible causal separability (ECS) [15] for the general multipartite case. We established the equivalence between Araújo et al.'s (restricted) definition of causal separability and Oreshkov and Giarmatzi's definition of ECS in the particular tripartite situation considered by Araújo et al., thus linking two a priori different definitions for that case. Moreover, by showing that ECS and CS are different in that scenario, we found that the two definitions of causal separability proposed by Araújo et al. [12] and by Oreshkov and Giarmatzi [15] were inconsistent, a problem that thus needed to be addressed.
We proposed a new general definition of N -partite causal nonseparability, similar in spirit to the recursive definitions that have been proposed for multipartite causal correlations [15,20], and more consistent with the fact that the process matrix framework always allows for parties to share additional ancillary systems. Our definition thus avoids some unwanted features of the definition of CS in Ref. [15], such as the "activation" of causal nonseparability by shared entanglement. Moreover, we showed that our definition, although a priori different, in fact reduces to the notion of ECS proposed in [15], which also reduces to the definition of Araújo et al. [12] in the particular restricted scenario considered there.
We then focused on characterising causally separable process matrices, giving (in the general multipartite case) two conditions, one necessary and one sufficient (Propositions 9 and 10, respectively), for a given process matrix to be causally separable. These conditions allowed us to characterise the corresponding sets of process matrices through SDP constraints, and to generalise the tool of witnesses for causal nonseparability to the multipartite case. In the bipartite and tripartite cases, our necessary and sufficient conditions coincide and reduce to those previously described [2,12,15]. The principal open question raised by this work is whether this also holds in the general N-partite case with N ≥ 4, or whether one of the two is both necessary and sufficient (or if one could derive yet another distinct condition, that would be both necessary and sufficient).
As we show elsewhere, our sufficient condition characterises precisely the processes that can be realised as a quantum circuit with classical control of causal order [27]. If that condition is in fact also necessary, this would thus confirm the conjecture of Oreshkov and Giarmatzi, that causally separable process matrices (or "extensibly causally separable processes" using their terminology) are those realisable by such "classically controlled quantum circuits" [15]. This would provide a more solid foundation for our understanding of the notion of causal separability, which would then indeed correspond to our intuition (quantum circuits with possibly dynamical causal orders that are classically controlled). Furthermore, the proof in Ref. [27] would also provide a general explicit construction to realise any given causally separable process matrix in practice.
However, the forms of our necessary and sufficient conditions, and the fact that the proof for the necessity of the conditions in the tripartite case does not generalise straightforwardly to more parties, indeed leave open the possibility that our sufficient condition may turn out not to be necessary. If this is the case, it would mean that there exist causally separable process matrices that are not realisable as classically controlled quantum circuits, and which we would not currently know how to realise experimentally. It would certainly be interesting to understand what kind of situations such process matrices correspond to, and if (and how) they can be realised quantum mechanically. This question is reminiscent of the open problem of whether process matrices that allow for the violation of causal inequalities are realisable with "standard" quantum mechanics. Here the question would concern even less extreme situations: causally separable process matrices.
Another question that arises naturally in the multipartite case is whether a given phenomenon is genuinely multipartite, in the sense that its occurrence truly requires the coordinated action of a certain number of parties. It would be important for our understanding of multipartite process matrices to define a notion of "genuinely multipartite causal nonseparability", similar to the concept of "genuinely multipartite noncausality" for correlations [29] and analogous to the notions of genuinely multipartite entanglement [33] and nonlocality [34][35][36]. It would then also be interesting to study whether the definition can be refined to give a hierarchy of degrees of causal nonseparability, similar to the approach in Ref. [29] for correlations, and whether the characterisation of the corresponding process matrices and the construction of "witnesses of genuinely multipartite causal nonseparability" are still possible with SDP techniques. These questions are left for further research.
In this first appendix we show how the valid process matrices, as well as those compatible with a given causal order, can be characterised. We then discuss some properties of process matrices, and alternative characterisations.

Valid process matrices
A given matrix $W \in A^{\mathcal{N}}_{IO}$ defines a valid N-partite process matrix if and only if it generates nonnegative and normalised probabilities $P(\vec a|\vec x)$ through the generalised Born rule of Eq. (4), including in the case where an ancillary quantum state ρ in some extension $A^{\mathcal{N}}_{I'}$ of the parties' incoming spaces is attached to W (and thus shared among the N parties), and the parties' operations are allowed to act on their joint incoming systems $A^{k}_{II'} := A^{k}_{I} \otimes A^{k}_{I'}$. The constraint that the probabilities in Eq. (4) must be nonnegative for any set of CP maps (i.e., any positive semidefinite matrices $M_{a_k|x_k} \ge 0$) translates into the constraint that W must be positive semidefinite [2,12].
As for normalisation, the constraint is that Eq. (4) must give a total probability equal to 1 for any set of CPTP maps-i.e., any positive semidefinite matrices M , using the "traceout-and-replace" notation X · defined in Eq. (13). It is easy to see that the constraint of positive semidefiniteness does not play any role here; furthermore, note that for any matrix M ∈ A IO , M := 1 and that any M ∈ A IO satisfying The normalisation constraint thus translates into the constraint that Note that for simplicity we did not explicitly attach an ancillary state ρ to W here; doing so would have led to the same conclusion. Full details for this whole argument can be found in Appendix B of Ref. [12]. We shall in general ignore the normalisation constraint (A2) when talking about valid process matrices. The 2 N − 1 linear constraints of Eq. (A3) define a linear subspace of A N IO , which we denote by L N , the subspace of valid process matrices: explicitly (noting that the constraints as in Eq. (15) of the main text. It is furthermore straightforward to check that this is equivalent to the following recursive characterisation, as in Eq. (14): Summing up, the set of (nonnormalised) valid process matrices is the convex cone W = P ∩ L N , where P is the cone of positive semidefinite matrices.
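As an illustration, the bipartite case (parties A and B) can be written out explicitly. The sketch below assumes that the $2^N - 1$ linear constraints take the form of one "trace-out-and-replace" condition per nonempty subset $\mathcal{X}$ of parties, as stated in the comment; with ${}_X W := \frac{1}{d_X} 1^X \otimes \mathrm{Tr}_X W$ and ${}_{[1-X]}W := W - {}_X W$, this yields the familiar bipartite validity conditions (cf. Ref. [12]).

```latex
% Assumed general form, one constraint per nonempty subset X of the parties:
%   {}_{\prod_{i \in \mathcal{X}} [1 - A^i_O] \, A^{\mathcal{N} \setminus \mathcal{X}}_{IO}} W = 0
% For N = 2 this gives 2^2 - 1 = 3 constraints:
\begin{align}
\mathcal{X} = \{A\}:   &\quad {}_{[1-A_O]\, B_{IO}} W = 0
   \;\Leftrightarrow\; {}_{B_I B_O} W = {}_{A_O B_I B_O} W, \\
\mathcal{X} = \{B\}:   &\quad {}_{[1-B_O]\, A_{IO}} W = 0
   \;\Leftrightarrow\; {}_{A_I A_O} W = {}_{A_I A_O B_O} W, \\
\mathcal{X} = \{A,B\}: &\quad {}_{[1-A_O][1-B_O]} W = 0
   \;\Leftrightarrow\; W = {}_{A_O} W + {}_{B_O} W - {}_{A_O B_O} W .
\end{align}
```

The first two conditions forbid the reduced process of one party from depending on the other party's outgoing system, and the third excludes terms that act nontrivially on both outgoing systems at once.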

Compatibility with fixed causal orders
Let us now analyse the constraints imposed on process matrices by requiring that they are compatible with a given causal order.

a. Causal order between two subsets of parties
Consider two nonempty disjoint subsets of parties $\mathcal{K}_1, \mathcal{K}_2 \subset \mathcal{N}$. We say that the correlation $P(\vec a|\vec x)$ is compatible with the causal order $\mathcal{K}_1 \prec \mathcal{K}_2$ if and only if there is no signalling from the parties in $\mathcal{K}_2$ to the parties in $\mathcal{K}_1$, i.e., the marginal probability distribution for the outputs of parties in $\mathcal{K}_1$ does not depend on the inputs of parties in $\mathcal{K}_2$: $P(\vec a_{\mathcal{K}_1}|\vec x) = P(\vec a_{\mathcal{K}_1}|\vec x_{\mathcal{N}\setminus\mathcal{K}_2})$ for all $\vec x, \vec a_{\mathcal{K}_1}$. We then say that a (valid) process matrix W is compatible with the causal order $\mathcal{K}_1 \prec \mathcal{K}_2$ if and only if it only generates correlations (through the generalised Born rule (4), possibly allowing for extensions of W with some ancillary state ρ) that are compatible with $\mathcal{K}_1 \prec \mathcal{K}_2$.
Formally, this means (ignoring again for simplicity the possibility of attaching an ancillary state ρ; as before, the same reasoning also goes through if we allow for this possibility) that whatever the CP maps M applied by the parties in K 2 and in K 12 := N \(K 1 ∪ K 2 ) (which may be empty), respectively, one must have (i.e., the probabilities should be the same if the parties in K 2 apply the CPTP maps 1 As in the previous subsection, the constraint of positive semidefiniteness of the CJ matrices M k does not play any role here, and we can equivalently write the above constraint as for any matrices M k ∈ A k IO . Expanding this constraint in a similar way as above (or as it was done in more details in Appendix B of Ref. [12]), we find that it is equivalent to the following 2 N −|K1| −2 N −|K1|−|K2| linear constraints: Combining these conditions with those from Eq. (A4) to ensure that W is a valid process matrix, and removing redundant constraints, 10 one can then characterise the subspace L K1≺K2 of (valid) process matrices compatible with the causal order K 1 ≺ K 2 through the following 10 One can easily see that the constraints from Eq. (A4) for which X ∩ K 2 = ∅ are already implied by those of Eq. (A8): indeed, defining X 1 := X ∩ K 1 and X 2 := X ∩ (N \K 1 ) ⊆ N \K 1 , in such a case one has X 2 ∩ K 2 = ∅ and Let us assume now that K 1 and K 2 cover the full set N , so that K 1 ∪ K 2 = N . The characterisation above then simplifies to the following 2 |K1| + 2 |K2| − 2 constraints: Comparing these constraints with Eq. (A4), one can see that the third line of Eq. 
(A10) is equivalent to imposing that the reduced process Tr K2 W is in L K1 , the subspace of valid |K 1 |-partite process matrices for parties in K 1 ; while the fourth line is equivalent (using the fact that to imposing that whatever CP maps M K1 ∈ A K1 IO applied by the parties in K 1 , the conditional matrix W |M K 1 := Tr K1 [M K1 ⊗1 K2 ·W ] must be in the subspace L K2 of valid process matrices for the parties in K 2 (note that M K1 may or may not be of a product form k1∈K1 M k1 here, and that its complete positiveness is in fact irrelevant). 11 We thus equivalently have the following characterisation: These constraints are indeed quite intuitive: they simply correspond to the fact that for a process matrix correlation P ( a| x) to be compatible with the causal order K 1 ≺ K 2 (with K 1 ∪ K 2 = N ), the probability distributions P ( a K1 | x K1 ) and P ( a K2 | x, a K1 ) := P ( a| x)/P ( a K1 | x K1 ) can be calculated from Tr K2 W and W |M K 1 , and must be well-defined. In particular, for K 1 = {A k } (a singleton of just one party coming first) and K 2 = N \A k , Eq. (A10) becomes as in Eq. (17) of the main text. For K 1 = N \A k and K 2 = {A k } (a singleton of just one party coming last), 11 Although the constraints in the fourth line of Eq. (A10) are written exactly as those that would define L K 2 , we emphasise that they apply here to some matrix W ∈ A N IO , rather than to W ∈ A K 2 IO as in the definition of L K 2 . This is why they must of course not directly be interpreted as implying that W ∈ L K 2 , but W |M K 1 ∈ L K 2 for all M K 1 , as in Eq. (A11).
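Eq. (A10) becomes

b. Causal order between several subsets of parties
Consider now K disjoint subsets $\mathcal{K}_i$ of $\mathcal{N}$. Generalising the idea of causal order between two subsets of parties, we say that the correlation $P(\vec a|\vec x)$ is compatible with the causal order $\mathcal{K}_1 \prec \mathcal{K}_2 \prec \cdots \prec \mathcal{K}_K$ if and only if there is no signalling from "future parties" to "past parties", i.e., if for any k = 1, . . . , K − 1, the outputs of parties in $\mathcal{K}_{(\le k)} := \bigcup_{i=1}^{k} \mathcal{K}_i$ do not depend on the inputs of the parties in $\mathcal{K}_{(>k)} := \bigcup_{j=k+1}^{K} \mathcal{K}_j$: $P(\vec a_{\mathcal{K}_{(\le k)}}|\vec x) = P(\vec a_{\mathcal{K}_{(\le k)}}|\vec x_{\mathcal{N}\setminus\mathcal{K}_{(>k)}})$ for all $\vec x, \vec a_{\mathcal{K}_{(\le k)}}$, or equivalently, the correlation is compatible with the causal order $\mathcal{K}_{(\le k)} \prec \mathcal{K}_{(>k)}$ for all k = 1, . . . , K − 1.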
As before, we then say that a process matrix W is compatible with the causal order K 1 ≺ · · · ≺ K K if and only if it only generates correlations that are compatible with that order. Similarly to L K1≺K2 , we define the subspace of (valid) process matrices compatible with the causal order K 1 ≺ · · · ≺ K K .

c. Particular cases with d
Suppose there exists a party $A_f$ which has a trivial incoming space, i.e., such that $d_{A^f_I} = 1$. The constraints of Eq. (A4) can be written, depending on whether $A_f \in \mathcal{X}$ or $A_f \notin \mathcal{X}$ and renaming $\mathcal{X}\setminus A_f \to \mathcal{X}$ in the former case, in two corresponding forms. Summing up these two constraints in the case where $\mathcal{X} \neq \emptyset$ (and keeping the first one for the case where $\mathcal{X} = \emptyset$), we find that $L^{\mathcal{N}}$ is characterised by the same constraints as those characterising $L^{A_f \prec (\mathcal{N}\setminus A_f)}$. Hence, in that case any valid process matrix is compatible with party $A_f$ acting first. This corresponds indeed to the natural intuition that, because party $A_f$ does not receive any physical system from anyone, they do not need to wait for any other party to act before them.
In the case where several parties in A f := {A f1 , . . . , A fn } have trivial incoming spaces (such that d [For d Instead of d A f I = 1, suppose now that there exists a party A which has a trivial outgoing space, i.e., such that 12 We use here, in particular, the fact that Hence, similarly to the previous case, here any valid process matrix is compatible with party A acting last. This is again rather intuitive: as party A sends no physical system out and cannot signal to anyone, then they can always come last-see, e.g., the motivation for only considering fixed orders with party C coming last in Araújo et al.'s definition of causal separability [12]. If now several parties in A := {A 1 , . . . , A n } have trivial outgoing spaces (such that d It is worth emphasising that no similar property holds for several parties in A f = {A f1 , . . . , A fn } having trivial incoming spaces, as considered previously: any process matrix is compatible in that case with any causal order A fj ≺ (N \A fj ) (as in Eq. (A18)), but not necessarily with A f1 ≺ · · · ≺ A fn ≺ (N \A f ) (or with any other permutation of the first parties). 13 To finish here, note, furthermore, that if a party A k has both d A k I = d A k O = 1, then clearly one can just ignore it: in such a case, W ∈ L N ⇔ W ∈ L N \A k . 13 Note indeed, in a similar fashion, that while compatibility of a probability distribution P with the orders (K 1 ∪ K 2 ) ≺ K 3 and (K 1 ∪ K 3 ) ≺ K 2 implies compatibility with K 1 ≺ (K 2 ∪ K 3 ), and therefore with both K 1 ≺ K 2 ≺ K 3 and K 1 ≺ K 3 ≺ K 2 , it is not the case that compatibility with K 1 ≺ (K 2 ∪ K 3 ) and K 2 ≺ (K 1 ∪ K 3 ) necessarily implies (K 1 ∪ K 2 ) ≺ K 3 , and it therefore does also not necessarily imply compatibility with K 1 ≺ K 2 ≺ K 3 or K 2 ≺ K 1 ≺ K 3 . 
As a counter-example, one can see for instance that $P(a,b,c|x,y,z) = \frac{1}{2}\,\delta_{a\oplus b,z}\,\delta_{c,0}$ (with binary inputs and outputs, where δ is the Kronecker delta and ⊕ denotes addition modulo 2) is compatible with both A ≺ {B, C} and B ≺ {A, C}, but not with {A, B} ≺ C.

d. Comment on our use of the notation ≺
Let us comment briefly here on our use of the notation ≺. Recall that for two disjoint nonempty subsets $\mathcal{K}_1$ and $\mathcal{K}_2$ of $\mathcal{N}$, a probability distribution P is said to be compatible with the causal order $\mathcal{K}_1 \prec \mathcal{K}_2$ if and only if $P(\vec a_{\mathcal{K}_1}|\vec x) = P(\vec a_{\mathcal{K}_1}|\vec x_{\mathcal{N}\setminus\mathcal{K}_2})$ for all $\vec x, \vec a_{\mathcal{K}_1}$. It should be noted that the relation "compatible with $\mathcal{K}_1 \prec \mathcal{K}_2$" thus defined is not transitive, and therefore it does not define a partial order between events. For instance, $P(a,b,c|x,y,z) := \delta_{a,z}\,\delta_{b,0}\,\delta_{c,0}$ (with δ the Kronecker delta and a, z taking at least two different values) is compatible with A ≺ B and B ≺ C, but not with A ≺ C. This justifies why, considering more subsets, we defined the notation $\mathcal{K}_1 \prec \mathcal{K}_2 \prec \cdots \prec \mathcal{K}_K$ to formally mean $\mathcal{K}_{(\le k)} \prec \mathcal{K}_{(>k)}$, rather than just $\mathcal{K}_k \prec \mathcal{K}_{k+1}$, for all k = 1, . . . , K−1.
We note also that the notation ≺ was used differently in Ref. [15], where it denoted a strict partial order (and was hence transitive). Our use of the notation ≺ here is consistent e.g. with that of Refs. [12,13,16,20,21,28,29,37,38], and would instead correspond to the notation in Ref. [15] (also used in Ref. [2]).
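The two counterexamples quoted above (for footnote 13 and for the non-transitivity of ≺) can be checked numerically. The following is a minimal sketch: the function name `compatible` and the array layout (output axes first, then input axes) are choices made here for illustration only.

```python
import numpy as np

def compatible(P, K1, K2, tol=1e-12):
    """Check whether the correlation P is compatible with the order K1 ≺ K2.

    P has 2n axes: outputs a_1..a_n followed by inputs x_1..x_n.
    K1, K2 are disjoint, nonempty collections of 0-based party indices.
    """
    n = P.ndim // 2
    # Marginal distribution of the outputs of the parties in K1.
    other_outputs = tuple(k for k in range(n) if k not in K1)
    marg = P.sum(axis=other_outputs, keepdims=True)
    # No signalling from K2: the marginal must be constant along the input
    # axes of the parties in K2, i.e. equal to its own average over them.
    k2_inputs = tuple(n + k for k in K2)
    avg = marg.mean(axis=k2_inputs, keepdims=True)
    return bool(np.allclose(marg, avg, atol=tol))

A, B, C = 0, 1, 2

P1 = np.zeros((2,) * 6)   # P1 = 1/2 * delta_{a xor b, z} * delta_{c,0}
P2 = np.zeros((2,) * 6)   # P2 = delta_{a,z} * delta_{b,0} * delta_{c,0}
for a, b, c, x, y, z in np.ndindex(*P1.shape):
    P1[a, b, c, x, y, z] = 0.5 * ((a ^ b) == z) * (c == 0)
    P2[a, b, c, x, y, z] = float(a == z and b == 0 and c == 0)

print(compatible(P1, [A], [B, C]), compatible(P1, [B], [A, C]),
      compatible(P1, [A, B], [C]))
print(compatible(P2, [A], [B]), compatible(P2, [B], [C]),
      compatible(P2, [A], [C]))
```

In `P1` each single-party marginal is uniform, while the joint marginal of A and B reveals C's input z; in `P2` party A's output copies z, which rules out A ≺ C even though A ≺ B and B ≺ C both hold.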

Operations on process matrices
In this section we clarify how process matrices behave in general, with respect to their validity and their compatibility with fixed causal orders, when tracing out subsystems or attaching extensions, and when tracing out, adding or grouping parties. Consider first a matrix W defined on the parties' incoming and outgoing spaces together with some extensions $A^{\mathcal{N}}_{I'O'}$ of these spaces. If W is a valid process matrix, then so is the reduced matrix $W' := \mathrm{Tr}_{A^{\mathcal{N}}_{I'O'}} W$; similarly, if W is compatible with a causal order $\mathcal{K}_1 \prec \mathcal{K}_2$, then so is W'. Both properties are quite intuitive: clearly, ignoring some parts of the incoming and outgoing spaces cannot make a process matrix invalid, and cannot induce some signalling where there was none before. Note, however, that the converse is in general not true: if $W' = \mathrm{Tr}_{A^{\mathcal{N}}_{I'O'}} W$ is a valid process matrix (or is compatible with $\mathcal{K}_1 \prec \mathcal{K}_2$), this does not guarantee in general that W is also a valid process matrix (or is compatible with $\mathcal{K}_1 \prec \mathcal{K}_2$). There is nevertheless a case where the validity of a process matrix W ensures that a "larger" matrix W' (defined on more subsystems) is valid: namely, when one attaches to W some ancillary state ρ. Indeed, in constructing the framework of process matrices, it is always assumed that one can consider some extensions of the incoming spaces of each party, and distribute (possibly entangled) ancillary states shared by all parties. Hence, by definition, if a matrix $W \in A^{\mathcal{N}}_{IO}$ is a valid process matrix, then for any quantum state (i.e., any positive semidefinite matrix, up to normalisation) ρ in any extension $A^{\mathcal{N}}_{I'}$, the matrix $W' = W \otimes \rho \in A^{\mathcal{N}}_{II'O}$ defines a valid process matrix. Similarly, if W is compatible with a given causal order $\mathcal{K}_1 \prec \mathcal{K}_2$ between two disjoint subsets of parties, then so is W ⊗ ρ.
One may then wonder if instead of attaching an ancillary state $\rho \in A^{\mathcal{N}}_{I'}$ to the incoming spaces, one could attach any other positive semidefinite matrix $W_{\text{ext.}} \in A^{\mathcal{N}}_{I'O'}$ in some extension of both incoming and outgoing spaces. It is clear, from the previous remarks on the partial trace of subsystems, that for a valid (nonzero) process matrix W, a necessary condition for $W' := W \otimes W_{\text{ext.}}$ to define a valid process matrix is that $W_{\text{ext.}}$ itself is also a valid process matrix. For two parties and more, this condition is however not sufficient (as noted also in Refs. [39,40]): the product $W \otimes W_{\text{ext.}}$ of two valid process matrices can fail to be valid when W and $W_{\text{ext.}}$ allow for some signalling in two conflicting directions. Similarly, for a (nonzero) process matrix W compatible with $\mathcal{K}_1 \prec \mathcal{K}_2$, a necessary condition for $W' := W \otimes W_{\text{ext.}}$ to be a process matrix compatible with $\mathcal{K}_1 \prec \mathcal{K}_2$ is that $W_{\text{ext.}}$ is also a process matrix compatible with $\mathcal{K}_1 \prec \mathcal{K}_2$. As before, this condition is however not sufficient for three parties and more.
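The failure of validity under tensor products can be illustrated numerically. The sketch below is an illustrative instance rather than the specific example of the original text. It assumes that validity amounts to positive semidefiniteness plus one trace-out-and-replace constraint per nonempty subset $\mathcal{X}$ of parties (apply $[1 - A^i_O]$ for $i \in \mathcal{X}$ and trace-and-replace $A^j_{IO}$ for $j \notin \mathcal{X}$), and it ignores normalisation. The names `trace_replace`, `multi_replace` and `is_valid` are hypothetical helpers. The two factors are qubit identity channels, one from A to B and one from B to A.

```python
import itertools
import numpy as np

def trace_replace(W, dims, k):
    """Trace out subsystem k and replace it by the normalised identity."""
    n = len(dims)
    T = W.reshape(dims + dims)                    # row axes, then column axes
    red = np.trace(T, axis1=k, axis2=n + k)       # partial trace over k
    full = np.multiply.outer(red, np.eye(dims[k]) / dims[k])
    full = np.moveaxis(full, [2 * n - 2, 2 * n - 1], [k, n + k])
    return full.reshape(W.shape)

def multi_replace(W, dims, subsystems):
    for k in subsystems:
        W = trace_replace(W, dims, k)
    return W

def is_valid(W, dims, parties, tol=1e-9):
    """parties: list of (incoming_subsystems, outgoing_subsystems) per party."""
    if np.linalg.eigvalsh(W).min() < -tol:        # positive semidefiniteness
        return False
    n = len(parties)
    for r in range(1, n + 1):
        for X in itertools.combinations(range(n), r):
            V = W
            for j, (ins, outs) in enumerate(parties):
                if j in X:                        # apply [1 - A^j_O]
                    V = V - multi_replace(V, dims, outs)
                else:                             # trace-and-replace A^j_IO
                    V = multi_replace(V, dims, ins + outs)
            if np.linalg.norm(V) > tol:
                return False
    return True

I2 = np.eye(2)
v = np.eye(2).reshape(4)                          # |1>> = |00> + |11>
phi = np.outer(v, v)                              # CJ matrix of the identity channel

# Subsystem order: A_I, A_O, B_I, B_O (all qubits).
dims4 = [2, 2, 2, 2]
parties4 = [([0], [1]), ([2], [3])]
W_AB = np.kron(np.kron(I2 / 2, phi), I2)          # channel A_O -> B_I
# Channel B_O -> A_I: build in order (A_I, B_O, A_O, B_I), then reorder.
tmp = np.kron(np.kron(phi, I2), I2 / 2).reshape([2] * 8)
W_BA = tmp.transpose(0, 2, 3, 1, 4, 6, 7, 5).reshape(16, 16)

# Product: subsystems (A_I, A_O, B_I, B_O, A_I', A_O', B_I', B_O').
W_tot = np.kron(W_AB, W_BA)
dims8 = [2] * 8
parties8 = [([0, 4], [1, 5]), ([2, 6], [3, 7])]

print(is_valid(W_AB, dims4, parties4), is_valid(W_BA, dims4, parties4),
      is_valid(W_tot, dims8, parties8))
```

Each factor passes the check on its own, whereas the product violates the constraint for $\mathcal{X} = \{A, B\}$, since it contains terms acting nontrivially on both parties' (composite) outgoing spaces.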

b. Tracing out / Adding / Separating / Grouping parties
In the previous observations we were keeping the set of parties under consideration N fixed. Let us now consider how process matrices behave when changing the set of parties.
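Consider a nonempty subset $\mathcal{N}_0$ of $\mathcal{N}$. Clearly, if $W \in A^{\mathcal{N}}_{IO}$ is a valid N-partite process matrix, then its restriction to the parties in the subset $\mathcal{N}_0$, defined as $W_0 := \mathrm{Tr}_{\mathcal{N}\setminus\mathcal{N}_0} W$, is a valid $|\mathcal{N}_0|$-partite process matrix. Similarly, if W is compatible with a causal order $\mathcal{K}_1 \prec \mathcal{K}_2$, then $W_0$ is compatible with the order $(\mathcal{K}_1 \cap \mathcal{N}_0) \prec (\mathcal{K}_2 \cap \mathcal{N}_0)$. Let $\mathcal{N}_1$ and $\mathcal{N}_2$ be two disjoint sets of parties. If $W_1 \in A^{\mathcal{N}_1}_{IO}$ and $W_2 \in A^{\mathcal{N}_2}_{IO}$ are two valid process matrices for the parties in $\mathcal{N}_1$ and $\mathcal{N}_2$, respectively, then so is $W := W_1 \otimes W_2 \in A^{\mathcal{N}_1 \cup \mathcal{N}_2}_{IO}$ for all parties in $\mathcal{N}_1 \cup \mathcal{N}_2$. (Note however that if $\mathcal{N}_1$ and $\mathcal{N}_2$ are not disjoint, this may not hold any more, as in the case with $\mathcal{N}_1 = \mathcal{N}_2 = \mathcal{N}$ considered in the previous subsection.) If say $W_1$ is compatible with a causal order $\mathcal{K}_1 \prec \mathcal{K}_1'$ (with $\mathcal{K}_1, \mathcal{K}_1'$ two disjoint nonempty subsets of $\mathcal{N}_1$), then so is W. Furthermore, for any nonempty subsets $\mathcal{K}_1 \subseteq \mathcal{N}_1$ and $\mathcal{K}_2 \subseteq \mathcal{N}_2$, W is compatible with both orders $\mathcal{K}_1 \prec \mathcal{K}_2$ and $\mathcal{K}_2 \prec \mathcal{K}_1$.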
Suppose now that the incoming and outgoing spaces of a party, say A N , can be factorised into IO . One can then virtually "separate" A N into two new parties, A N (1) and A N (2) , with incoming and outgoing spaces A N (1) IO and A N (2) IO , respectively, and thus consider the new set of N +1 parties N = IO is a valid N -partite process matrix, one can then verify that when considering the set N , W ∈ A N IO is also is a valid (N +1)partite process matrix, i.e., W ∈ L N . If W is compatible with a causal order K 1 ≺ K 2 , then W is also com- 17 Conversely, for a given set of N ≥ 2 parties N , let us finally consider a set N obtained from N by now grouping two or more of the parties, e.g.
is considered to form a new effective party. Then W is not necessarily a valid (N −1)-partite process matrix for the parties in N . The reason for this is that valid process matrices are required to give valid probability distributions only for product operations of the parties; if two parties are grouped together and perform a joint operation, that may no longer yield valid probabilities. An explicit counterexample is for instance which represents a (dephasing) channel from A to B and is indeed a valid bipartite process matrix, but not a valid single-partite process if A and B are grouped together (as [ 17 Both properties are straightforward when recalling that the validity and compatibility with a fixed order constraints are obtained by imposing certain conditions for all operations M ∈ A N IO of party A N : clearly, these constraints are also satisfied if A N is separated into two parties A N (1) and A N (2) , and M takes the

IO
(and by noting that if M (1) and M (2) are CPTP, then so is M ). These properties can also be verified formally using the characterisations of Eqs. (A4) or (A9), after noting in particular that

Allowed and forbidden terms in the Hilbert-Schmidt basis decomposition of process matrices
In Refs. [2,15], the constraints characterising the set of valid process matrices and the set of process matrices compatible with a fixed causal order between two complementary subsets were formulated in a different way, namely by specifying which terms can appear in the decomposition of the corresponding operators in a Hilbert-Schmidt basis. To complete this appendix, we now establish the connection between these two alternative characterisations, and we prove their equivalence.
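A Hilbert-Schmidt basis of some space of linear operators X (acting on a $d_X$-dimensional Hilbert space) is given by a set of generalised Pauli matrices, i.e., a set of Hermitian operators $\{\sigma^X_\mu\}_{\mu=0}^{d_X^2-1}$ with $\sigma^X_0 = 1$, $\mathrm{Tr}\,\sigma^X_\mu = 0$ for $\mu \ge 1$, and $\mathrm{Tr}[\sigma^X_\mu \sigma^X_\nu] = d_X \delta_{\mu\nu}$. In such a basis, a process matrix $W \in A^{\mathcal{N}}_{IO}$ can be decomposed as $W = \sum_{\mu_1\nu_1\mu_2\nu_2\cdots} w_{\mu_1\nu_1\mu_2\nu_2\cdots}\, \sigma^{A^1_I}_{\mu_1} \otimes \sigma^{A^1_O}_{\nu_1} \otimes \sigma^{A^2_I}_{\mu_2} \otimes \sigma^{A^2_O}_{\nu_2} \otimes \cdots$ (A21). The approach of Refs. [2,15] looks at which terms $\sigma^{A^1_I}_{\mu_1} \otimes \sigma^{A^1_O}_{\nu_1} \otimes \sigma^{A^2_I}_{\mu_2} \otimes \sigma^{A^2_O}_{\nu_2} \otimes \cdots$ can appear with a nonzero coefficient $w_{\mu_1\nu_1\mu_2\nu_2\cdots}$ (i.e., are "allowed") in the above decomposition. According to Proposition 3.1 of Ref. [15], a Hermitian operator W is in the linear subspace $L^{\mathcal{N}}$ of valid process matrices if and only if, in addition to the identity term $1^{\mathcal{N}}$, it contains only terms for which at least one party $A_k$ has a nontrivial operator $\sigma_{\mu_k} \neq 1$ on $A^k_I$ and the identity operator 1 on $A^k_O$ (see footnote 19). To see that this statement is indeed equivalent to our own characterisation of $L^{\mathcal{N}}$, let us first verify that all terms of this kind fulfil all the constraints of Eq. (A4). This is clearly the case for the identity ial outgoing spaces, can be grouped together without changing the validity of the process matrix in question. 19 As clarified in Ref. [ x k , and are thus allowed (in addition to the identity) in the decomposition of a process matrix.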
O ] 1 N = 0 for any party A i . Consider then some generic Hilbert-Schmidt term T k of the form · · · σ A k I µ k 1 A k O · · · (with σ µ k = 1). Such a term satisfies T k = 0, whether k ∈ X or k ∈ N \X . By linearity, any operator W whose Hilbert-Schmidt decomposition (A21) only contains the identity or such terms T k thus satisfies all the constraints (A4). Conversely, suppose that the Hilbert-Schmidt decomposition of W contains a term F (with a nonzero weight) that is "forbidden" according to Proposition 3.1 of Ref. [15], that is, a term such that for all parties A k , there is either a nontrivial operator σ ν k = 1 on A k O , or an identity operator on both A k I and A k O (and where there is at least one party for which the former is true). Consider then the nonempty subset X ⊆ N of parties A i for which σ The process matrices that are compatible with the causal order K 1 ≺ K 2 , with K 1 ∪ K 2 = N , were likewise characterised in Ref. [15] in terms of allowed terms in a Hilbert-Schmidt basis decomposition. The following terminology was used: the restriction of a Hilbert-Schmidt term onto certain subsystems is the part of the term corresponding to the respective subsystems-for example, the restriction of the term σ ν2 . According to Proposition 3.2 in Ref. [15], the (valid) process matrices that do not allow signalling from K 2 to K 1 are those that contain only Hilbert-Schmidt terms whose restriction to the (incoming and outgoing systems of) parties in K 2 are of the allowed type for a |K 2 |-partite process matrix for those parties-that is, terms with either the O for all parties in K 2 , or for which there is some party A k2 ∈ K 2 with a nontrivial generalised Pauli operator σ µ k 2 = 1 on A k2 I and the identity operator on A k2 O . To see that this proposition is indeed equivalent to our characterisation of L K1≺K2 given by Eqs. (A10) or Eq. 
(A11), note that the restriction of a Hilbert-Schmidt term T to K 2 is precisely obtained, up to a multiplicative factor (which may be 0), by taking the partial trace Hence, imposing that all Hilbert-Schmidt terms T in the decomposition of W = T w T T have restrictions to K 2 that are in L K2 , as in the characterisation of Ref. [15] just recalled, is equivalent to imposing that for any M K1 ∈ A K1 IO , W |M K 1 = T w T T |M K 1 only have terms T |M K 1 ∈ L K2 , i.e., that W |M K 1 itself is in L K2 , as imposed in Eq. (A11). Note that Proposition 3.2 in Ref. [15] pre-supposed that the process matrix under consideration was valid. If this is not pre-supposed, one must in addition impose, according to the previous characterisation, that for Hilbert-Schmidt terms in the decomposition of W whose restriction to K 2 is the identity operator 1 K2 , there must either also be the identity operator for all parties in K 1 , or there must be some party A k1 ∈ K 1 with a restriction σ O -in other words, one must impose that Tr K2 W ∈ L K1 , so that one also recovers the first constraint of Eq. (A11).
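The "allowed terms" criterion for valid process matrices lends itself to a direct numerical check. The sketch below assumes that all incoming and outgoing systems are qubits, so that the Pauli matrices form the Hilbert-Schmidt basis, and it takes the rule quoted from Proposition 3.1 of Ref. [15] in the form "identity, or at least one party with a nontrivial operator on $A^k_I$ and the identity on $A^k_O$". The function names are hypothetical.

```python
import itertools
import numpy as np

PAULIS = [np.eye(2),
          np.array([[0, 1], [1, 0]]),
          np.array([[0, -1j], [1j, 0]]),
          np.array([[1, 0], [0, -1]])]

def pauli_coefficients(W, n_qubits, tol=1e-12):
    """Return {labels: coefficient} for the nonzero Pauli terms of W."""
    coeffs = {}
    for labels in itertools.product(range(4), repeat=n_qubits):
        op = np.array([[1.0]])
        for l in labels:
            op = np.kron(op, PAULIS[l])
        c = np.trace(op @ W) / 2 ** n_qubits
        if abs(c) > tol:
            coeffs[labels] = c
    return coeffs

def term_allowed(labels):
    """labels = (mu_1, nu_1, mu_2, nu_2, ...): (incoming, outgoing) per party."""
    if all(l == 0 for l in labels):
        return True                                # identity term
    pairs = zip(labels[0::2], labels[1::2])
    # Allowed: some party has a nontrivial operator on A_I and identity on A_O.
    return any(mu != 0 and nu == 0 for mu, nu in pairs)

def in_valid_subspace(W, n_parties):
    return all(term_allowed(t) for t in pauli_coefficients(W, 2 * n_parties))

I2 = np.eye(2)
v = np.eye(2).reshape(4)                           # |1>> = |00> + |11>
phi = np.outer(v, v)

# Subsystem order: A_I, A_O, B_I, B_O.
W_channel = np.kron(np.kron(I2 / 2, phi), I2)      # identity channel A_O -> B_I
W_loop = np.kron(phi, np.kron(I2, I2) / 2)         # "loop" from A_O back to A_I

print(sorted(pauli_coefficients(W_channel, 4)))
print(in_valid_subspace(W_channel, 2), in_valid_subspace(W_loop, 2))
```

The channel only contains the identity and terms of the type $\sigma^{A_O} \otimes \sigma^{B_I}$, for which party B has a nontrivial operator on $B_I$ and the identity on $B_O$; the loop contains terms $\sigma^{A_I} \otimes \sigma^{A_O}$, which are forbidden.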

Appendix B: Characterisation of causally separable process matrices
In this appendix we prove the propositions that allow us to characterise causally separable process matrices in terms of simple conditions. We start by proving the first part of Proposition 5, namely that in the particular tripartite scenario with $d_{C_O} = 1$, Araújo et al.'s definition of causal separability (Definition 2) is equivalent to OG's notion of extensible causal separability (Definition 4), and thus also to our Definition 6. Then we provide the proofs for the characterisation of general tripartite causally separable process matrices (Proposition 8) as well as for the necessary condition (Proposition 9) and the sufficient condition (Proposition 10) in the general N-partite case. Note that all the special cases follow from Propositions 9 and 10, and we could just give the proofs of those two general propositions. However, for pedagogical reasons we start with the simpler proofs, which may entail some repetition in the arguments, but allow for greater clarity in presenting the core ideas.
All of the proofs below (of increasing complexity) make use of the same type of argument to prove the necessity of the respective conditions. This argument is based on the "teleportation technique" that follows from the lemma below. Before stating it, let us introduce some further notation. For two Hilbert spaces H X , H X with the same dimension d, and denoting by {|i an orthonormal basis of either H X or H X , we will consider the maximally entangled state |Φ + X/X : We also recall that for a given matrix W ∈ A N IO , we denote by W A k IO →A k I the matrix in ( j∈N \k A j IO ) ⊗ A k I that has formally the same form as W , except that party A k 's system A k IO is now attributed to an extension A k I of party A k 's incoming space. Formally (recalling Eq. (24)), conditional matrix for the other N -1 parties is then Proof. For clarity, let us write explicitly as superscripts the spaces in which the various operators act. We have: We shall also use the following facts in (some of) the proofs below: in Definition (6) can be taken to be of the form W (k) ⊗ ρ. Eq. (12) then implies the direct decomposition W = k∈N q k W (k) , with each W (k) ∈ A N IO being a process matrix compatible with party A k acting first (and such that for any CP map M k ∈ A k II O , (W (k) ⊗ ρ) |M k is causally separable).
Proof. If ρ is pure, then from the extremality of pure states it follows that W ρ (k) = W (k) ⊗ ρ. If ρ is mixed one can first purify it by introducing an additional incoming system for some arbitrary party, obtain the appropriate decomposition (12) for its purification, and then trace out the additional incoming space just introduced to reach the same conclusion. As W ρ (k) is compatible with A k acting first and W (k) = Tr A N I W ρ (k) , then W (k) itself must also be compatible with A k acting first (see remarks in Appendix A 3 a). Proof. For N = 1 party any process matrix is by definition causally separable, so that the claim is trivial. Suppose the claim holds true in the (N −1)-partite case. If W ∈ A N II O is causally separable then by Definition 6, for any extension A N I of the parties' incoming systems and any ancillary quantum state ρ ∈ A N I , W ⊗ ρ has a decomposition of the form with each W ρ (k) a valid process matrix compatible with party A k first, and such that for any possible CP map M k ∈ A k II I O applied by party A k , the conditional is itself causally separable. One then has Tr I W ⊗ ρ = k q k (Tr I W ρ (k) ), with Tr I W ρ (k) a valid process matrix compatible with party A k first (see remarks in Appendix A 3 a). For any possible CP map M k ∈ A k II O applied by party A k , one has ( . As stated above, A k I is causally separable, and by the induction hypothesis so is Tr We thus have a valid causally separable decomposition of Tr I W ⊗ ρ for any extension ρ, which proves that Tr I W is causally separable, and which thus proves, by induction, the claim of Proposition 13.
Again, this property is quite intuitive: clearly, discarding some parts of the incoming systems cannot induce some causal nonseparability where there was none previously. As for the similar statements for valid process matrices and for process matrices compatible with a fixed causal order discussed in Appendix A 3 a, the converse is not necessarily true: if $\mathrm{Tr}_{I'} W$ is a causally separable process matrix, then W may not necessarily be causally separable (see footnote 20), unless W is of the product form $W = W_0 \otimes \rho$ with ρ an ancillary state on the extension of the incoming spaces, in which case by our Definition 6 if $W_0$ is a causally separable process matrix then so is $W = W_0 \otimes \rho$. Let us start by considering the tripartite case where party C has no outgoing system (or equivalently, a trivial outgoing system, i.e., $d_{C_O} = 1$). The following proposition directly implies (after proper re-normalisation with appropriate weights q, 1−q) the first part of Proposition 5, namely the equivalence in that case between Araújo et al.'s causal separability and OG's extensible causal separability (which, we recall, is what we simply call causal separability here).
Proposition 14 (Characterisation of tripartite causally separable process matrices with d C O = 1). In a tripartite scenario where party C has no outgoing system, a matrix W ∈ A IO ⊗ B IO ⊗ C I is a valid tripartite causally separable process matrix (as per Definition 6) if and only if it can be decomposed as where, for each permutation (X, Y ) of the two parties A and B, W (X,Y,C) is a positive semidefinite matrix satisfying Proof. Consider a causally separable process matrix W ∈ A IO B IO C I . Let us then introduce an extension A I ⊗ C I of parties A and C's incoming spaces, 20 As a counterexample, consider some causally nonseparable bipartite process matrix W ∈ A I O ⊗ B IO . The process matrix of dimensions d A I = d C I = d C I , and consider attaching to W the maximally entangled ancillary state ρ = |Φ + Φ + | C I /A I .
As W is assumed to be causally separable, according to Definition 6 and Proposition 12 it must be decomposable as where each term W (X) ∈ A IO B IO C I is a (nonnormalised) process matrix compatible with party X acting first, and such that whatever CP map that party applies to their share of W (X) ⊗ ρ, the resulting conditional process matrix for the other two parties is causally separable.
Consider now the term W (C) . Letting party C act first on W (C) ⊗ ρ = W (C) ⊗ |Φ + Φ + | C I /A I and project his incoming systems onto the maximally entangled state |Φ + C I /C I , according to Lemma 11 (with a trivial extra ancillary stateρ), parties A and B are then left with the conditional process matrix . (B7) By assumption this conditional process matrix must be a (bipartite) causally separable process matrix: according to Proposition 7, there must therefore exist a decomposition for W is formally the same matrix as W (C) , except that system C I is replaced by A I . Changing back A I into C I in Eq. (B8), we obtain the decomposition Conversely, any process matrix W that can be decomposed as in Eq. (B4), with process matrices W (A,B,C) and W (B,A,C) satisfying the constraints of Eq. (B5)-i.e., being compatible with the causal orders A ≺ B ≺ C and B ≺ A ≺ C-is clearly causally separable, which concludes the proof of Proposition 14.
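The identity behind this teleportation step can be verified numerically in a simplified setting. The sketch below is not the full statement of Lemma 11: it considers a single system X (standing for $C_I$), a "rest" R, and an ancilla $|\Phi^+\rangle\langle\Phi^+|$ on $X' \otimes X''$, and checks that projecting X and X' onto $|\Phi^+\rangle$ leaves, up to the factor $1/d^2$, formally the same matrix W with X relabelled as X''. All variable names and the chosen dimensions are illustrative.

```python
import numpy as np

def phi_plus(d):
    """Projector onto |Phi+> = d^{-1/2} sum_i |i>|i>."""
    v = np.eye(d).reshape(d * d) / np.sqrt(d)
    return np.outer(v, v)

d, r = 2, 3                       # dimension of X (= X' = X'') and of the rest R
rng = np.random.default_rng(0)
G = rng.normal(size=(d * r, d * r))
W = G @ G.T                       # a generic positive semidefinite matrix on X ⊗ R

# Attach the ancilla on X' ⊗ X''; subsystem order: X, R, X', X''.
total = np.kron(W, phi_plus(d)).reshape([d, r, d, d] * 2)

# Projector applied on X ⊗ X' (real symmetric, so equal to its transpose);
# axes: (row X, row X', column X, column X').
proj = phi_plus(d).reshape(d, d, d, d)

# Tr_{X X'}[(proj ⊗ 1) · total], leaving an operator on R ⊗ X''.
cond = np.einsum('yzxw,xrwvyszu->rvsu', proj, total)

# Expected: W / d^2 with X moved to the position of X'' (after R).
expected = W.reshape(d, r, d, r).transpose(1, 0, 3, 2) / d ** 2

print(np.allclose(cond, expected))
```

Since the conditional matrix is just a rescaled copy of W on relabelled systems, any property it is required to have (here, causal separability for the remaining parties) translates directly into a property of W itself, which is how the step is used in the proof.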

b. General tripartite causally separable process matrices
We now turn to proving Proposition 8, which characterises causal separability in the general tripartite scenario where all three parties have nontrivial incoming and outgoing systems.
where each term $W_{(X)} \in A_{IO} \otimes B_{IO} \otimes C_{IO}$ is a process matrix compatible with party X acting first (so that it satisfies in particular [...] (see Eq. (A12)), as in Eq. (22)) and such that whatever that party does on $W_{(X)} \otimes \rho$, the resulting conditional process matrix for the other two parties is causally separable. Consider the first term in Eq. (B13). Letting party A act first on $W_{(A)} \otimes \rho$ and perform the operation described by the CJ operator [...]$^{21}$ Together with Eq. (B17), we thus find that all constraints of Eq. (23) for $X = A$ are satisfied. One can similarly show that they are satisfied for $X = B, C$, which proves (since we noted before that Eq. (22) is also satisfied) that the decomposition of Proposition 8 is indeed a necessary condition for any causally separable process matrix $W$.

Conversely, suppose a matrix $W \in A_{IO} \otimes B_{IO} \otimes C_{IO}$ has a decomposition of the form (21) that satisfies Eqs. (22)-(23). Then, as we noted right after Proposition 8, each term $W_{(X)}$ is a valid process matrix, compatible with party X acting first. For any CP map $M_X$ applied by party X on its share of $W_{(X)}$, the resulting conditional process matrix for the other two parties Y, Z is [...] (and similarly for $(W_{(X,Z,Y)})_{|M_X}$), as follows from Eq. (23). This shows that $(W_{(X,Y,Z)})_{|M_X}$ and $(W_{(X,Z,Y)})_{|M_X}$ are valid bipartite process matrices compatible with the orders $Y \prec Z$ and $Z \prec Y$, respectively, so that $(W_{(X)})_{|M_X}$ is causally separable. Note that for any ancillary state $\rho$, $W \otimes \rho$ also has a decomposition as in Eq. (21), obtained simply by attaching the ancillary state to every individual term in the decomposition of $W$. Therefore, the same reasoning as above applies, which implies that $W$ is causally separable. This shows that the decomposition of Proposition 8 is also a sufficient condition for a matrix $W$ to represent a causally separable process matrix, which concludes the proof of that proposition.

21 Note that this is the step where the tripartite proof does not generalise straightforwardly to $N \ge 4$ parties. In particular, we cannot use the same argument to prove that the constraints (28) that appear in our necessary condition are satisfied without tracing out the $X_{IO}$ on the first and fourth lines (as one would need if the terms in Eq. (27) were to satisfy Eq. (31) and thus specify a decomposition satisfying also our sufficient condition for causal separability). One indeed obtains e.g. [...] $_{(X,T,Z,Y)}$ which, a priori, may still be nonzero.
Let us mention here that Proposition 14, for the particular tripartite case where d C O = 1, could also be obtained as a corollary of the general tripartite case considered by  Another particular tripartite case of interest is one where one party, say now A, has no incoming space (or a trivial one, with d A I = 1). The following characterisation is also obtained as a corollary of the general tripartite case above.
Proposition 15 (Characterisation of tripartite causally separable process matrices with $d_{A_I} = 1$). In a tripartite scenario where party A has no incoming system, a matrix $W \in A_O \otimes B_{IO} \otimes C_{IO}$ is a valid tripartite causally separable process matrix (as per Definition 6) if and only if it satisfies Eq. (B20) and can be decomposed as in Eq. (B21), where, for each permutation $(X, Y)$ of the two parties B and C, $W_{(A,X,Y)}$ is a positive semidefinite matrix satisfying Eq. (B22).

Note already that, contrary to the decomposition of Proposition 14, the two summands $W_{(A,B,C)}$ and $W_{(A,C,B)}$ above are not necessarily valid process matrices: indeed, they are not individually required to satisfy the constraint of Eq. (B20) (only their sum must). This allows for dynamical causal orders, where A (incoherently) controls the causal order between the next parties B and C.
Proof. According to Proposition 8, a causally separable process matrix $W \in A_O \otimes B_{IO} \otimes C_{IO}$ must have a decomposition of the form (21) that satisfies the constraints (22)-(23).
In particular, the constraints [...]

Conversely, it is clear that the decomposition of Proposition 15 is a particular case of that of Proposition 8 (with $W_{(B)} = W_{(C)} = 0$), so that any process matrix that can be decomposed as in Eq. (B21) and satisfies Eqs. (B20) and (B22) is causally separable according to Proposition 8.

d. Allowed and forbidden terms in a Hilbert-Schmidt basis decomposition
We note that Ref. [15] already provided a characterisation of general tripartite causally separable process matrices-or "extensibly" causally separable process matrices, in their terminology. Let us prove here the equivalence with our own characterisation (Proposition 8) explicitly.
According to Proposition 3.3 in Ref. [15], every tripartite (extensibly) causally separable process matrix $W \in A_{IO} \otimes B_{IO} \otimes C_{IO}$ can be written in the form [...] where each $W_{(X)}$ contains only Hilbert-Schmidt terms (see Appendix A 4) that are allowed in a process matrix compatible with party X acting first as per Proposition 3.2 in [15] (i.e., $W_{(X)} \in \mathcal{L}^{X \prec \{Y,Z\}}$ in our language) and has the form $W_{(X)} = \Omega_{(X,Y,Z)} \otimes \mathbb{1}^{Z_O} + \Omega_{(X,Z,Y)} \otimes \mathbb{1}^{Y_O}$, where $\Omega_{(X,Y,Z)} \in X_{IO} \otimes Y_{IO} \otimes Z_I$ and $\Omega_{(X,Z,Y)} \in X_{IO} \otimes Y_I \otimes Z_{IO}$ are positive semidefinite. (We changed here the notations of Ref. [15] to match ours; note in particular that unlike in [15], we again ignore the normalisation constraints in the decomposition of $W$.) Any such process matrix can thus be decomposed as [...] (see Eq. (A12)), as in Eq. (22). It is furthermore immediate to see that each $W_{(X,Y,Z)} = \Omega_{(X,Y,Z)} \otimes \mathbb{1}^{Z_O}$ satisfies the second constraint in (23), i.e., [...], the first constraint in Eq. (23). Thus, any process matrix that satisfies the characterisation of Proposition 3.3 from Ref. [15] also satisfies that of our Proposition 8.

Conversely, let $W$ be a process matrix that has a decomposition as in Proposition 8. As discussed after that proposition, the conditions (22) [...]

As emphasised before, the matrices $W_{(X,Y,Z)} = \Omega_{(X,Y,Z)} \otimes \mathbb{1}^{Z_O}$ need not be valid process matrices (the only requirement is that $\Omega_{(X,Y,Z)} \ge 0$). Both individual summands in Eq. (B24) can thus contain terms that are forbidden in a process matrix compatible with party X acting first, as long as these terms cancel out in the sum. More precisely, in addition to the terms that are allowed in a process matrix with X first, $W_{(X,Y,Z)}$ and $W_{(X,Z,Y)}$ can contain terms of the form $\sigma$ [...]

Let us now prove the necessary condition for general multipartite causal separability given by Proposition 9.
Proof. Consider an $N$-partite causally separable process matrix $W \in A^{\mathcal{N}}_{IO}$. Let us introduce now, for each party $A_k$, an extension [...] (i.e., we provide each pair of parties with two maximally entangled states, which will allow us to use the teleportation technique in either direction).
According to Definition 6 and Proposition 12, $W$ must be decomposable as [...] where each term $W_{(k)}$ is a process matrix compatible with party $A_k$ acting first, and such that whatever that party does on $W_{(k)} \otimes \rho$, the resulting conditional process matrix for the other $N-1$ parties is causally separable. Consider, for a given $k \in \mathcal{N}$, the process matrix $W_{(k)} \otimes \rho$, and let party $A_k$ perform, for a given $k' \neq k$, [...] The resulting conditional process matrix for the other $N-1$ parties is then, according to Lemma 11 (with $\rho = \mathrm{Tr}_{A_k}$ [...]), [...] As this conditional process matrix must be causally separable, it then follows from Proposition 13 that [...] itself must be causally separable, which concludes the proof of Proposition 9 (where [...]).

b. Sufficient condition for causal separability
Here we shall prove the sufficient condition for general multipartite causal separability given by Proposition 10.
Let us however first prove the claim that was made just after that proposition, namely that if Eq. (31) is satisfied for all $(k_1, \ldots, k_n)$, then one also has, for all $(k_1, \ldots, k_n)$ with $1 \le n < N$, that for all $\mathcal{X} \subseteq \mathcal{N} \setminus \{k_1, \ldots, k_n\}$ with $\mathcal{X} \neq \emptyset$, the constraint of Eq. (B27) holds.
Proof. This can be seen (by induction) as follows. Assume that Eq. (31) is satisfied for all $(k_1, \ldots, k_n)$.
In particular for the case n = 1, Eq. (B27) together with Eq. (31) imply that each matrix W (k1) is a valid process matrix compatible with party A k1 acting first (see Eq. (A12)).
Let us now prove Proposition 10 by induction.
Proof. Clearly, it trivially holds for $N = 1$ (in which case Eq. (31) ensures in particular that $W$ is a valid process matrix). (Note also that for $N = 2$ and 3, it reduces to the sufficient conditions of Propositions 7 and 8, respectively.) Suppose Proposition 10 holds in the $(N-1)$-partite case, and consider a matrix $W \in A^{\mathcal{N}}_{IO}$ that can be decomposed as in Eq. (29), with all partial sums $W_{(k_1,\ldots,k_n)}$ satisfying Eq. (31). Then we have $W = \sum_{k_1 \in \mathcal{N}} W_{(k_1)}$ with (as noted above) each $W_{(k_1)}$ being a valid process matrix compatible with party $A_{k_1}$ acting first.
Consider a CP map $M_{k_1}$ applied by party $A_{k_1}$ on $W_{(k_1)} = \sum_{\pi \in \Pi_{(k_1)}} W_\pi$. The resulting conditional process matrix for the remaining $N-1$ parties is given by Eq. (B29). By denoting by $\Pi^{\mathcal{N}\setminus k_1}$ the set of permutations of $\mathcal{N}\setminus k_1$ (and by $\Pi^{\mathcal{N}\setminus k_1}_{(k_2,\ldots,k_n)}$ the set of those that start with $k_2, \ldots, k_n$), by writing any permutation $\pi$ of $\mathcal{N}$ that starts with $k_1$ as $\pi = (k_1, \pi')$ with $\pi' \in \Pi^{\mathcal{N}\setminus k_1}$, and by defining $[(W_{(k_1)})_{|M_{k_1}}]_{\pi'} := (W_{(k_1,\pi')})_{|M_{k_1}}$, we can re-write Eq. (B29) as Eq. (B30), in a similar form to Eq. (29). For $n = 2, \ldots, N$, and for any ordered subset of parties $(k_2, \ldots, k_n)$ of $\mathcal{N}\setminus\{k_1\}$, the partial sums [...] by assumption (31). Thus, Eq. (B30) provides a decomposition of the $(N-1)$-partite process matrix $(W_{(k_1)})_{|M_{k_1}}$ of the same form as in Eq. (29), with positive semidefinite matrices $[(W_{(k_1)})_{|M_{k_1}}]_{\pi'}$ and with all partial sums satisfying constraints analogous to those of Eq. (31). By the induction hypothesis, this implies that $(W_{(k_1)})_{|M_{k_1}}$ is causally separable. Note that the exact same reasoning also goes through if instead of $W$ we consider $W \otimes \rho$ with any ancillary state $\rho$. Indeed, $W \otimes \rho$ also has a decomposition as in Eq. (29), obtained simply by attaching $\rho$ to every individual term in the decomposition of $W$. This shows that $W$ is causally separable, and by induction this proves that the decomposition of Proposition 10 is indeed a sufficient condition for $W$ to be causally separable.
For clarity, and to get some better intuition on how it generalises the characterisation of Proposition 8 for the tripartite case, let us write the sufficient condition of Proposition 10 explicitly in the fourpartite case:
Proposition 16 (Sufficient condition for fourpartite causally separable process matrices). If a matrix $W \in A_{IO} \otimes B_{IO} \otimes C_{IO} \otimes D_{IO}$ can be decomposed as in Eq. (B33), with the terms of the decomposition satisfying the constraints of Eqs. (B34)-(B36), then $W$ is a valid fourpartite causally separable process matrix (as per our Definition 6).
It follows from Eqs. (B35)-(B36) that for each party X, $W_{(X)}$ also satisfies [...] (see Eq. (B27)). This, together with Eq. (B34) and the fact that $W_{(X)} \ge 0$, implies that $W_{(X)}$ is a valid process matrix, compatible with party X acting first (see Eq. (A12)).
Similarly, it follows from Eq. (B36) that for each pair of parties X, Y, $W_{(X,Y)}$ also satisfies [...] This, together with Eq. (B35) and the fact that $W_{(X,Y)} \ge 0$, implies that whatever CP map $M_X$ party X applies, the conditional process matrix $(W_{(X,Y)})_{|M_X}$ is a valid tripartite process matrix for parties Y, Z, T, compatible with party Y acting first.
Finally, Eq. (B36) implies that whatever CP maps M X , M Y parties X and Y apply, the conditional matrix (W (X,Y,Z,T ) ) |M X ⊗M Y is a valid bipartite process matrix for parties Z, T , compatible with party Z acting first.

Fourpartite causally separable process matrices in the particular case with $d_{D_O} = 1$
Consider now a fourpartite situation where party D has no outgoing system (or a trivial one, with $d_{D_O} = 1$). It turns out that in such a case our sufficient condition above is also necessary, and it simplifies as follows (note the similarity with Proposition 8).
Proposition 17 (Characterisation of fourpartite causally separable process matrices with $d_{D_O} = 1$). In a fourpartite scenario where party D has no outgoing system, a matrix $W \in A_{IO} \otimes B_{IO} \otimes C_{IO} \otimes D_I$ is a valid fourpartite causally separable process matrix (as per Definition 6) if and only if it can be decomposed as in Eq. (B37), with the constraints of Eqs. (B38)-(B39) satisfied.
Proof. According to the necessary condition of Proposition 9, a causally separable process matrix $W \in A_{IO} \otimes B_{IO} \otimes C_{IO} \otimes D_I$ must have a decomposition of the form [...] where each $W_{(X)}$ is a process matrix compatible with X first, such that for [...] is causally separable.
Consider first $X = A$, and note already that as $W_{(A)}$ is compatible with A first, one has, from Eq. (A12), [...] Taking now $Y = B$, we have that [...] $\in B_{II'O} \otimes C_{IO} \otimes D_I$ is a tripartite causally separable process matrix in a scenario where one party (D) has no outgoing space. Using the characterisation of Proposition 14, and re-attributing the system $B_{I'}$ back to $A_{IO}$ (as we did, e.g., in the proof of Proposition 14), we obtain that $W_{(A)}$ must have a decomposition of the form [...]

Altogether, $W$ is thus a combination of terms that have a decomposition as in Proposition 17; combining these decompositions, it directly follows that $W$ itself has a decomposition of the form of Eq. (B37) that satisfies the required constraints.
Conversely, it is easy to see that if a matrix $W$ has a decomposition of the form (B37), then it is also of the form (B33) (where all terms $W_{(X,Y,Z,T)}$ with $T \neq D$ are 0, and thus only the terms $W_{(X,Y,Z,D)} = W_{(X,Y)}$ remain). Furthermore, if the decomposition satisfies the constraints of Eqs. (B38)-(B39), then it also satisfies those of Eqs. (B34)-(B36). According to Proposition 10, this implies that such a process matrix $W$ is causally separable.
One can further simplify the characterisation above in the particular fourpartite case where, in addition to one party (D) having no outgoing system, one also has a party (A) with no incoming system. We then obtain the following:
Proposition 18 (Characterisation of fourpartite causally separable process matrices with $d_{A_I} = 1$ and $d_{D_O} = 1$). In a fourpartite scenario where party A has no incoming system and party D has no outgoing system, a matrix $W \in A_O \otimes B_{IO} \otimes C_{IO} \otimes D_I$ is a valid fourpartite causally separable process matrix (as per Definition 6) if and only if [...] and $W$ can be decomposed as [...] where, for each permutation $(X, Y)$ of the two parties B and C, $W_{(A,X,Y,D)}$ is a positive semidefinite matrix satisfying [...]
We emphasise again that the two summands $W_{(A,B,C,D)}$ and $W_{(A,C,B,D)}$ above are not necessarily valid process matrices, thus allowing for dynamical causal orders. We omit the proof of Proposition 18 here, as it follows that of Proposition 15 very closely. We note, as an aside, that both Propositions 14 and 15 could be obtained as corollaries of Proposition 18 after removing one party. Namely, by imposing $d$ [...]
To conclude this section, we further note that Propositions 17 and 18 generalise straightforwardly to cases with more parties D, E, . . . that have no outgoing spaces (by simply replacing $D_I$ by $D_I E_I \cdots$). Hence, we can give necessary and sufficient conditions for causal separability in any $N$-partite scenario in which at most 3 parties have nontrivial outgoing spaces.
Appendix C: Explicit witness of causal nonseparability for W act.
In this appendix we provide an explicit witness of causal nonseparability for the process matrix W act. introduced in Sec. III C.
According to Eq. (G5) in Appendix G 2 below, in the tripartite scenario in which $W_{\text{act.}}$ is defined, where $d_{C_O} = 1$, the cone of causal witnesses can be characterised as in Eq. (C1). Using the approach of Sec. IV D, we obtained the following causal witness for $W_{\text{act.}}$, written, as in the definition (9) of $W_{\text{act.}}$, in the order $C_I A_I B_I A_O B_O$: [...] To see that $S_{\text{act.}}$ indeed defines a valid causal witness, one can verify that it admits decompositions as in Eq. (C1) above, with (still written in the same order) [...] One can easily check that all constraints in Eq. (C1) are satisfied.
With $S_{\text{act.}}$ thus defined, one finds $\mathrm{Tr}[S_{\text{act.}} \cdot W_{\text{act.}}] = -\left(\frac{4}{\sqrt{3}} - 2\right) < 0$, which proves that $W_{\text{act.}}$ is indeed causally nonseparable according to our Definition 6 (or equivalently, to Araújo et al.'s Definition 2, or "extensibly causally nonseparable" according to Definition 4), even though, as proven in Sec. III C, it is causally separable according to OG's Definition 3.
Since the causal witness $S_{\text{act.}}$ above was obtained with the SDP optimisation technique described in Sec. IV D, it allows us to determine the robustness of $W_{\text{act.}}$ to white noise. From Eq. (38) we thus find that its random robustness is $r^* = -\mathrm{Tr}[S_{\text{act.}} \cdot W_{\text{act.}}] = \frac{4}{\sqrt{3}} - 2 \approx 0.31$.
"Activation" of causal nonseparability with W act.
It is instructive to see explicitly how causal nonseparability can be "activated" by attaching an entangled ancillary state to W act. .
Recall that $W_{\text{act.}}$ is compatible with party C acting first. As shown in Sec. III C, it is such that for any CP map (or POVM element) $M_C$ applied by C, the conditional bipartite process matrix $(W_{\text{act.}})_{|M_C}$ is causally separable. This is precisely why $W_{\text{act.}}$ is considered to be causally separable according to OG's Definition 3.
Consider now attaching an ancillary maximally entangled state $\rho = |\Phi^+\rangle\langle\Phi^+|^{A_{I'}/C_{I'}}$, shared by A and C with dimensions $d_{A_{I'}} = d_{C_{I'}} = d_{C_I}$, and letting C project his two incoming systems onto $|\Phi^+\rangle\langle\Phi^+|^{C_I/C_{I'}}$. The resulting conditional process matrix $(W_{\text{act.}} \otimes \rho)_{|M_C = |\Phi^+\rangle\langle\Phi^+|}$ shared by A and B is then (up to normalisation) $(W_{\text{act.}})^{C_I \to A_{I'}}$, i.e., it is formally represented by the same matrix as $W_{\text{act.}}$, Eq. (9), with party C's incoming system now given to party A (see Lemma 11 of Appendix B).
One can verify that the conditional process matrix $(W_{\text{act.}} \otimes \rho)_{|M_C = |\Phi^+\rangle\langle\Phi^+|}$ thus obtained is causally nonseparable by constructing a (bipartite) causal witness using, for instance, the explicit characterisation of Eq. (G3) below, in a similar way to what we did for $W_{\text{act.}}$ above.
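The teleportation step used above can be checked numerically. The following is a minimal sketch, assuming the conditional process matrix is computed as $\mathrm{Tr}_{C_I C_{I'}}[(M_C \otimes \mathbb{1}) \cdot (W \otimes \rho)]$ with $M_C$ and $\rho$ both given by the normalised projector onto $|\Phi^+\rangle$. A random positive semidefinite matrix stands in for $W_{\text{act.}}$ (whose explicit form is not reproduced here), and the helper names `maximally_entangled` and `teleport_first_system` are hypothetical. The sketch only illustrates the relabelling $C_I \to A_{I'}$; it says nothing about causal (non)separability.

```python
import numpy as np

def maximally_entangled(d):
    """Projector onto |Phi+> = sum_i |ii>/sqrt(d), as a tensor [row1, row2, col1, col2]."""
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)
    return np.outer(phi, phi.conj()).reshape(d, d, d, d)

def teleport_first_system(W, d, d_rest):
    """Compute Tr_{C_I C_I'}[(M ⊗ 1)(W ⊗ rho)] with M = rho = |Phi+><Phi+|.

    W acts on C_I ⊗ R (R = all remaining systems), rho on A_I' ⊗ C_I',
    and M on C_I ⊗ C_I'. The output acts on A_I' ⊗ R.
    """
    Wt = W.reshape(d, d_rest, d, d_rest)   # indices [c, r, c_col, r_col]
    rho = maximally_entangled(d)           # indices [a, c', a_col, c'_col]
    M = maximally_entangled(d)             # indices [c, c', c_col, c'_col]
    # p, q: traced row indices of C_I, C_I'; x, y: indices contracted in the product.
    out = np.einsum('pqxy,xrps,aybq->arbs', M, Wt, rho)
    return out.reshape(d * d_rest, d * d_rest)

rng = np.random.default_rng(0)
d, d_rest = 2, 3                           # illustrative dimensions only
G = rng.normal(size=(d * d_rest, d * d_rest)) + 1j * rng.normal(size=(d * d_rest, d * d_rest))
W = G @ G.conj().T                         # random PSD stand-in for a process matrix

conditional = teleport_first_system(W, d, d_rest)
# Up to the normalisation factor 1/d^2, the conditional matrix is W itself,
# now read as acting on A_I' ⊗ R instead of C_I ⊗ R.
print(np.allclose(conditional, W / d**2))
```

With normalised $|\Phi^+\rangle$ for both the ancilla and the projection, the proportionality factor is $1/d^2$, which is the "up to normalisation" caveat in the text.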
Note, however, that this argument is not sufficient to conclude that $W_{\text{act.}}$ is (extensibly) causally nonseparable: one indeed needs to prove that for any possible decomposition of the form $W_{\text{act.}} \otimes \rho = W^\rho_{(A)} + W^\rho_{(B)} + W^\rho_{(C)}$, with each $W^\rho_{(X)}$ compatible with party X acting first, there exist CP maps $M_A$, $M_B$, or $M_C$ that make either $(W^\rho_{(A)})_{|M_A}$, $(W^\rho_{(B)})_{|M_B}$ or $(W^\rho_{(C)})_{|M_C}$ causally nonseparable.$^{22}$ Our construction of a causal witness for $W_{\text{act.}}$ confirms nonetheless that this must indeed be the case, which allows us to conclude, using OG's terminology, that the entangled ancillary state $\rho$ introduced here indeed "activates" the causal nonseparability of $W_{\text{act.}}$.

22 Indeed, a process matrix compatible with C first (in short, of the form $W = W_{(C)}$), and such that for some CP map $M_C$ the conditional process matrix $W_{|M_C}$ is causally nonseparable, may still be causally separable if it also has another, causally separable, decomposition of the form $W = W_{(A)} + W_{(B)} + W_{(C)}$. An example is for instance $W = W_0 \otimes |\Phi^+\rangle\langle\Phi^+|^{A_{I'}/C_{I'}}$ with (written again in the order [...]) [...] $(\mathbb{1}\mathbb{1}\mathbb{1} + \mathbb{1}\hat{z}\hat{z} + \hat{z}\mathbb{1}\hat{z} + \hat{z}\hat{z}\mathbb{1})\mathbb{1}\mathbb{1}$ [...] One can check that $W \in \mathcal{L}^{C \prec \{A,B\}}$ and that with $M_C = |\Phi^+\rangle\langle\Phi^+|^{C_I/C_{I'}}$, the bipartite conditional process matrix $W_{|M_C}$ is causally nonseparable, even though $W$ is also compatible with the fixed order $A \prec B \prec C$ (and is hence causally separable). (Here the ancillary entangled state attached to $W_0$ and the CP map $M_C$ allow party C to "teleport" their incoming system in $W_0$ to A; the same observation holds if C teleports his system to B instead.) A similar observation can be made at the level of correlations: a tripartite correlation $P(a, b, c|x, y, z)$ compatible with C first and such that the bipartite conditional correlation $P_{z,c}(a, b|x, y) := P(a, b|x, y, z, c)$ is noncausal for some $z, c$ may in general still be causal. An example with binary inputs and outputs 0, 1 is $P(a, b, c|x, y, z) := \frac{1}{2} \delta_{b,x} \delta_{c,a \oplus y}$, which is indeed compatible with C first (as $P(c|x, y, z) = \frac{1}{2}$ does not depend on $x, y$) and is such that, conditioned on C's output $c = 0$, the resulting conditional bipartite correlation $P_{z,c=0}(a, b|x, y) = \delta_{b,x} \delta_{a,y}$ shared by A and B is noncausal (it violates the "Guess Your Neighbour's Input" inequality [28] maximally). Nevertheless, $P$ is clearly also compatible with the fixed order $A \prec B \prec C$, and is hence causal. Note that the argument given by OG in Ref. [15] to show activation of causal nonseparability consisted precisely in proving that, after attaching an ancillary state, the correlations generated by a given tripartite process matrix were compatible with C first and such that the bipartite conditional correlation $P_{z,c}(a, b|x, y)$ was noncausal. In that case, however, C was performing a deterministic operation (i.e., $c$ could only take a single fixed value), so this argument was enough, in their case, to prove that the tripartite correlation under consideration was indeed noncausal [20,42].

Appendix D: Equivalence between Oreshkov and Giarmatzi's extensible causal (non)separability and our definition of causal (non)separability
In this appendix we prove that OG's Definition 4 of extensible causal (non)separability and our Definition 6 of multipartite causal (non)separability are equivalent.
Proof. Let $W$ be an $N$-partite process matrix that is causally separable as per our Definition 6. The conditional process matrices [...] in Definition 6 are again causally separable (as per our definition), and thus fulfil in particular Definition 3. Therefore, $W \otimes \rho$ is causally separable as per OG's Definition 3 (OG-CS) for any extension $A^{\mathcal{N}}_{I'}$ and any ancillary quantum state $\rho \in A^{\mathcal{N}}_{I'}$. That is, $W$ is extensibly causally separable as per OG's Definition 4 (OG-ECS).
The proof of the converse is more involved. The idea is to consider, for an $N$-partite OG-ECS process matrix $W$ and two ancillary quantum states $\rho'$ and $\rho''$, the extended process matrices $W \otimes \rho'$ and $W \otimes \rho' \otimes \rho''$, which are both OG-CS. By comparing the corresponding decompositions we will show that the conditional $(N-1)$-partite process matrices obtained from the decomposition of $W \otimes \rho'$ are not only OG-CS, but also OG-ECS. From there, one can conclude by induction that $W$ then also satisfies our Definition 6.
The difficulty here is that the causally separable decomposition of $W \otimes \rho$ (for $\rho = \rho'$ or $\rho = \rho' \otimes \rho''$ in our case here) depends, a priori, on $\rho$. The following proposition states, however, that there exists a decomposition of $W$ that provides a unique causally separable decomposition of $W \otimes \rho$ for any $\rho$.
Proposition 19. Any $N$-partite extensibly causally separable (OG-ECS) process matrix $W$, as per Definition 4, can be decomposed as
$W = \sum_{k \in \mathcal{N}} q_k W_{(k)}$, (D1)
with $q_k \ge 0$, $\sum_k q_k = 1$, and where for each $k$, $W_{(k)}$ is a process matrix compatible with party $A_k$ acting first, and is such that for any extension $A^{\mathcal{N}}_{I'}$, any ancillary quantum state $\rho \in A^{\mathcal{N}}_{I'}$ and any possible CP map $M_k \in A^k_{II'O}$ applied by party $A_k$, the conditional $(N-1)$-partite process matrix $(W_{(k)} \otimes \rho)_{|M_k} := \mathrm{Tr}_k[M_k \otimes \mathbb{1}^{\mathcal{N}\setminus k} \cdot W_{(k)} \otimes \rho]$ is causally separable (OG-CS) as per Definition 3.
Proof. Consider an $N$-partite OG-ECS process matrix $W$. By Definition 4, for any extension $A^{\mathcal{N}}_{I'}$ and any state $\rho \in A^{\mathcal{N}}_{I'}$, the extended $N$-partite process matrix $W \otimes \rho$ must have a decomposition of the form of Eq. (D2), where for each $k$, $W^\rho_{(k)}$ is a process matrix compatible with party $A_k$ acting first, and such that whatever that party does, the resulting conditional process matrix for the other $(N-1)$ parties is OG-CS.
By an argument similar to that of Proposition 12, it is easy to see that $W^\rho_{(k)}$ can without loss of generality be taken to be of the form $W^\rho_{(k)} = W_{(k)} \otimes \rho$. We emphasise again that the convex decomposition $W = \sum_{k \in \mathcal{N}} q_k W_{(k)}$ that then follows from Eq. (D2) could a priori depend on the ancillary state $\rho$. We will however now show that for all extensions and ancillary quantum states one can choose the same decomposition of $W$.
First, note that for any finite set of extensions $\{A^{\mathcal{N}}_{I'_1}, A^{\mathcal{N}}_{I'_2}, \ldots, A^{\mathcal{N}}_{I'_n}\}$ and ancillary quantum states $\{\rho_1 \in A^{\mathcal{N}}_{I'_1}, \rho_2 \in A^{\mathcal{N}}_{I'_2}, \ldots, \rho_n \in A^{\mathcal{N}}_{I'_n}\}$ under consideration, one can indeed choose the same decomposition: consider the ancillary state $\rho_1 \otimes \cdots \otimes \rho_n \in A^{\mathcal{N}}_{I'_1} \otimes \cdots \otimes A^{\mathcal{N}}_{I'_n}$, and the corresponding decomposition
$W \otimes \rho_1 \otimes \cdots \otimes \rho_n = \sum_{k \in \mathcal{N}} q_k W_{(k)} \otimes \rho_1 \otimes \cdots \otimes \rho_n$, (D3)
with each $W_{(k)} \otimes \rho_1 \otimes \cdots \otimes \rho_n$ (and therefore each $W_{(k)}$) a process matrix compatible with party $A_k$ acting first, and such that for any operation $M_k$ applied by $A_k$ the resulting conditional process matrix $(W_{(k)} \otimes \rho_1 \otimes \cdots \otimes \rho_n)_{|M_k}$ for the other $(N-1)$ parties is OG-CS. Proposition 13 now implies that these conditional process matrices remain causally separable when tracing out all but one of the ancillary states in the tensor product. Therefore, the decomposition of $W$ obtained from Eq. (D3) can be chosen for any of the individual $\rho_j \in A^{\mathcal{N}}_{I'_j}$.
Next, one uses the following statement from basic topology (Theorem 2.36 in Ref. [43]):
Proposition 20. If $\{K_\alpha\}$ is a collection of compact subsets of a metric space $X$ such that the intersection of every finite subcollection of $\{K_\alpha\}$ is nonempty, then $\bigcap_\alpha K_\alpha$ is nonempty.
Here, let the index set be the set of all possible ancillary quantum states (of any dimension), and the set $K_\rho$, indexed by some quantum state $\rho$, be the set of possible causally separable decompositions of $W$ corresponding to the ancillary state $\rho$. The finite intersection property follows from the observation above: for any finite set of ancillary states $\{\rho_1, \ldots, \rho_n\}$, there exists a common decomposition, that is, the intersection $K_{\rho_1} \cap \cdots \cap K_{\rho_n}$ is nonempty. As the conditions of Proposition 20 are satisfied,$^{23}$ it guarantees that the intersection $\bigcap_\rho K_\rho$ over all quantum states $\rho$ is nonempty. That is, there exists indeed a convex decomposition $W = \sum_{k \in \mathcal{N}} q_k W_{(k)}$, with $q_k \ge 0$, $\sum_k q_k = 1$, and where for each $k$, $W_{(k)}$ is a process matrix compatible with party $A_k$ acting first, and is such that for any extension $A^{\mathcal{N}}_{I'}$, any ancillary quantum state $\rho \in A^{\mathcal{N}}_{I'}$ and any possible CP map $M_k \in A^k_{II'O}$ applied by party $A_k$, the conditional $(N-1)$-partite process matrix $(W_{(k)} \otimes \rho)_{|M_k}$ is OG-CS.
One can now prove by induction that any OG-ECS process matrix is causally separable according to our Definition 6. In the single-partite case ($N = 1$), the claim is trivial. Suppose, for $N \ge 2$, that the claim holds true in the $(N-1)$-partite case. Let then $W$ be an $N$-partite OG-ECS process matrix. According to Proposition 19, $W$ has a decomposition of the form (D1), such that for any $k$, for any arbitrary extensions $A^{\mathcal{N}}_{I'}$, $A^{\mathcal{N}\setminus k}_{I''}$, any ancillary quantum states $\rho' \in A^{\mathcal{N}}_{I'}$, $\rho'' \in A^{\mathcal{N}\setminus k}_{I''}$, and any CP map $M_k \in A^k_{II'O}$ applied by party $A_k$, the conditional $(N-1)$-partite process matrices $(W_{(k)} \otimes \rho')_{|M_k}$ and [...] That is, for any $A^{\mathcal{N}}_{I'}$, $\rho'$ and $M_k$, $(W_{(k)} \otimes \rho')_{|M_k}$ is OG-ECS, and therefore, by the induction hypothesis, it is causally separable according to our Definition 6. Summing up, we thus have, for any $A^{\mathcal{N}}_{I'}$ and any $\rho' \in A^{\mathcal{N}}_{I'}$, a decomposition of the form $W \otimes \rho' = \sum_k q_k W_{(k)} \otimes \rho'$ such that for any $M_k \in A^k_{II'O}$, $(W_{(k)} \otimes \rho')_{|M_k}$ is causally separable. This means that $W$ itself is causally separable as per our Definition 6, which concludes the proof.

23 More precisely, let $X$ be the space of $N$-tuples of Hermitian matrices $\vec{W} = (W_{(1)}, \ldots, W_{(N)})$, equipped with the standard Euclidean metric, and let the sets $K_\rho$ be defined as in Eq. (D4). It follows from the positivity of the $W_{(k)}$'s and the normalisation of $W$ that the sets $K_\rho$ are bounded. One can further easily convince oneself that the sets characterised by the four individual conditions in Eq. (D4) are closed, and thus, as it is the intersection of closed sets, that $K_\rho$ is closed. The sets $K_\rho$ being bounded and closed, it follows from the Heine-Borel theorem that they are compact, as required for Proposition 20 to be applicable here.

Here we show explicitly that causally separable process matrices, according to our Definition 6, can only generate so-called "causal correlations" (even when attaching ancillary entangled states).
Let us first recall the definition of $N$-partite causal correlations given in Ref. [20] (which is equivalent to that first introduced in Ref. [15]):
Definition 21 ($N$-partite causal correlations). For $N = 1$, any correlation $P(a_1|x_1)$ is causal. For $N \ge 2$, an $N$-partite correlation $P(\vec{a}|\vec{x})$ is said to be causal if and only if it can be decomposed as
$P(\vec{a}|\vec{x}) = \sum_{k \in \mathcal{N}} q_k \, P_k(a_k|x_k) \, P_{k,x_k,a_k}(\vec{a}_{\mathcal{N}\setminus k}|\vec{x}_{\mathcal{N}\setminus k})$, (E1)
with $q_k \ge 0$, $\sum_k q_k = 1$, where (for each $k$) $P_k(a_k|x_k)$ is a single-partite (and hence causal) correlation and (for each $k, x_k, a_k$) $P_{k,x_k,a_k}(\vec{a}_{\mathcal{N}\setminus k}|\vec{x}_{\mathcal{N}\setminus k})$ is a causal $(N-1)$-partite correlation.
By this definition, for $N = 1$, any correlation (and in particular, any correlation generated by a (trivially) causally separable single-partite process matrix) is causal.
Assume, for $N > 1$, that any correlation generated by an $(N-1)$-partite causally separable process matrix is causal. Consider then an $N$-partite process matrix $W \in A^{\mathcal{N}}_{IO}$, some ancillary state $\rho \in A^{\mathcal{N}}_{I'}$, and some CP maps $M_{a_k|x_k} \in A^k_{II'O}$ (for any $k, x_k, a_k$), which all together generate the probability distribution
$P(\vec{a}|\vec{x}) = \mathrm{Tr}[M_{a_1|x_1} \otimes \cdots \otimes M_{a_N|x_N} \cdot W \otimes \rho]$, (E2)
as in Eq. (4). Assuming that $W$ is causally separable, according to Definition 6 $W \otimes \rho$ can be decomposed as in Eq. (12), which allows us to write $P(\vec{a}|\vec{x})$ as in Eq. (E3). Here $W^\rho_{(k)}$ is compatible with party $A_k$ acting first, so that for any set of CPTP maps $M_{x_{k'}}$ with $k' \neq k$, the quantity of Eq. (E4) does not depend on the choice of CPTP maps $M_{x_{k'}}$, and defines a probability distribution $P_k(a_k|x_k)$ for party $A_k$. The conditional process matrix $(W^\rho_{(k)})_{|M_{a_k|x_k}}$ for parties in $\mathcal{N}\setminus k$ can be renormalised (when nonzero) by defining $(\tilde{W}^\rho_{(k)})_{|M_{a_k|x_k}} := \frac{1}{P_k(a_k|x_k)} (W^\rho_{(k)})_{|M_{a_k|x_k}}$, so that $(\tilde{W}^\rho_{(k)})_{|M_{a_k|x_k}}$ is now a properly normalised process matrix (according to Eq. (E4) above that defines $P_k(a_k|x_k)$, [...], as required by Eq. (A2)). We can then write Eq. (E3) in the form of Eq. (E5). Now, by assumption and according to Definition 6, $(\tilde{W}^\rho_{(k)})_{|M_{a_k|x_k}}$ must be a causally separable process matrix; by the induction hypothesis it can only generate causal correlations, which implies that $P_{k,x_k,a_k}(\vec{a}_{\mathcal{N}\setminus k}|\vec{x}_{\mathcal{N}\setminus k})$ is causal. Eq. (E5) thus provides a causal decomposition of $P(\vec{a}|\vec{x})$ as in Eq. (E1) of Definition 21, which proves that the correlation $P(\vec{a}|\vec{x})$ obtained from the $N$-partite causally separable process matrix $W$ is causal, and which, by induction, concludes the proof.
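As a concrete illustration of why a noncausal conditional correlation does not by itself imply a noncausal joint correlation, the tripartite example of footnote 22, $P(a,b,c|x,y,z) = \frac{1}{2}\delta_{b,x}\delta_{c,a\oplus y}$, can be checked by brute force. The following is a minimal sketch; the function names are hypothetical, and compatibility with the fixed order $A \prec B \prec C$ is tested here through the no-signalling-backwards conditions ($P(a|x,y,z)$ independent of $y, z$, and $P(a,b|x,y,z)$ independent of $z$).

```python
from itertools import product

BITS = (0, 1)

def P(a, b, c, x, y, z):
    """P(a,b,c|x,y,z) = 1/2 * delta_{b,x} * delta_{c, a XOR y}."""
    return 0.5 if (b == x and c == (a ^ y)) else 0.0

def marginal_c(c, x, y, z):
    return sum(P(a, b, c, x, y, z) for a, b in product(BITS, BITS))

def marginal_a(a, x, y, z):
    return sum(P(a, b, c, x, y, z) for b, c in product(BITS, BITS))

def marginal_ab(a, b, x, y, z):
    return sum(P(a, b, c, x, y, z) for c in BITS)

def conditional_ab(a, b, x, y, z, c):
    """P_{z,c}(a,b|x,y) = P(a,b,c|x,y,z) / P(c|x,y,z)."""
    return P(a, b, c, x, y, z) / marginal_c(c, x, y, z)

# Compatible with C first: P(c|x,y,z) = 1/2 regardless of x and y.
c_first = all(marginal_c(c, x, y, z) == 0.5
              for c, x, y, z in product(BITS, repeat=4))

# Conditioned on c = 0, A outputs y and B outputs x with certainty:
# the "Guess Your Neighbour's Input" success probability for uniform inputs.
gyni_c0 = sum(conditional_ab(y, x, x, y, 0, 0)
              for x, y in product(BITS, BITS)) / 4

# Compatible with the fixed order A < B < C: no signalling to earlier parties.
order_abc = (
    all(marginal_a(a, x, y, z) == marginal_a(a, x, 0, 0)
        for a, x, y, z in product(BITS, repeat=4))
    and all(marginal_ab(a, b, x, y, z) == marginal_ab(a, b, x, y, 0)
            for a, b, x, y, z in product(BITS, repeat=5))
)

print(c_first, gyni_c0, order_abc)
```

The conditional correlation wins the game with certainty, yet the joint correlation admits the fixed order $A \prec B \prec C$ and is therefore causal in the sense of Definition 21.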
Appendix F: Relationship between our necessary and sufficient conditions for causal separability

A necessary but not sufficient condition
In our recursive necessary condition of Proposition 9 for general multipartite causal separability, we require the $(N-1)$-partite process matrices appearing there to be causally separable for each $k' \neq k$. In the tripartite case, it is not necessary to impose this explicitly, since considering the teleportation of $A_k$'s systems to some arbitrary $A_{k'}$ yields necessary conditions that already coincide with the sufficient conditions for tripartite causal separability (see the proof in Appendix B 1 b). In the general case, however, considering the teleportation to just one or some of the parties yields weaker necessary conditions that may not be sufficient. In this appendix we present an explicit fourpartite example.
We consider the fourpartite scenario where A has a trivial incoming space (d A I = 1) and D has a trivial outgoing space (d D O = 1), and define the following matrix in A O ⊗ B IO ⊗ C IO ⊗ D I : It is easy to verify that W gap satisfies Eq. (17) for A k = A, i.e., that it is a valid process matrix compatible with party A acting first (note its similarity with the original process matrix of Oreshkov, Costa and Brukner [2]). Furthermore, it (A,C,B,D) ) satisfying Eq. (28). In other words, the tripartite conditional process matrix that we obtain by teleporting A O to D I is causally separable (it is compatible with both fixed causal orders B ≺ C ≺ D and C ≺ B ≺ D).
However, this is not the case when teleporting $A_O$ to $B_{I'}$, or to $C_{I'}$. $W_{\text{gap}}$ indeed cannot be decomposed as in Eq. (27) with $Y = B$ or $C$, and is thus causally nonseparable. This can be certified by the causal witness $S_{\text{gap}}$ (obtained as described in Sec. IV D, with the characterisation of Eq. (G11) in Appendix G 2) [...], for which we obtain $\mathrm{Tr}[W_{\text{gap}} \cdot S_{\text{gap}}] = 1 - \sqrt{2} < 0$. This shows that, in the general multipartite case, there is indeed a gap between the necessary conditions obtained by teleporting to just some of the parties and those obtained by teleporting to each of the parties, and that the former are not sufficient.
(Note that in the example above D I did not play any role, as we always had 1 D I in all terms. We could in fact consider the case where D I is also trivial, d D I = 1. We kept here a nontrivial system D I to clarify the fact that W gap was defined in a fourpartite scenario, and that party D does play a role in the argument.)

Numerically investigating the (in)equivalence of our necessary and sufficient conditions
In order to investigate whether the (full version of the) necessary condition in Proposition 9 and the sufficient condition in Proposition 10 differ in general, we conducted numerical tests to see whether we could find process matrices contained in the cone $\mathcal{W}^{\text{sep}}_+$ but not in $\mathcal{W}^{\text{sep}}_-$ (i.e., the outer and inner approximations of $\mathcal{W}^{\text{sep}}$ arising from the necessary and sufficient conditions, respectively). To this end, we considered the following general approach: we first generated a large number of random process matrices. For each process matrix $W$, we then solved the primal SDP optimisation problem (36) over the cones $\mathcal{W}^{\text{sep}}_\pm$ to obtain the corresponding random robustnesses $r^*_\pm$. If we were to find $r^*_+ \neq r^*_-$ (up to numerical error; note that, since $\mathcal{W}^{\text{sep}}_- \subseteq \mathcal{W}^{\text{sep}}_+$, one always has $r^*_+ \le r^*_-$), this would imply that the cones differ, since one would have $W + r^*_+ \mathbb{1}^\circ \in \mathcal{W}^{\text{sep}}_+$ but $W + r^*_+ \mathbb{1}^\circ \notin \mathcal{W}^{\text{sep}}_-$.
The size of the SDP problems associated with finding the random robustness of a process matrix meant that we could not solve these problems for the "complete" fourpartite scenario with qubit incoming and outgoing spaces for each party (recall that, for three parties, the conditions are already known to coincide). We therefore considered the restricted scenario in which $d_{A_I} = 1$ while the remaining Hilbert spaces are two-dimensional, so that $W$ is $(128 \times 128)$-dimensional. We note that in any simpler scenario, the necessary and sufficient conditions can be proven to coincide, making this the simplest case of interest. Indeed, in Appendix B 3 we already showed that they coincide if one of the four parties has a trivial outgoing space.
If, on the other hand, a second party were to have a trivial incoming space (e.g., $d_{B_I} = 1$), it is not difficult to show that they again coincide by writing explicitly the necessary and sufficient conditions of Propositions 9 and 10, by using the fact that they simplify to Proposition 15 in a tripartite case where (at least) one party has a trivial incoming space, and by using the linearity of the subspaces appearing in the constraints. We leave the explicit proof of this as an exercise for the reader.
To generate random process matrices, one could follow the hit-and-run approach of Ref. [37]. Although this approach is guaranteed to sample process matrices uniformly, the high dimensionality of the space of valid process matrices (in this scenario it is 7597-dimensional) renders it intractable. Instead, forgoing uniformity, we generated matrices by randomly sampling Hermitian positive semidefinite matrices, projecting them onto the space $\mathcal{L}^{\mathcal{N}}$ of valid process matrices, and then adding white noise (i.e., $\mathbb{1}^\circ$) until the resulting matrix was again positive semidefinite.
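As an illustration only, the sampling procedure just described can be sketched in Python with NumPy. The sketch makes several assumptions that are not taken from the text: it treats the bipartite scenario (systems ordered $A_I, A_O, B_I, B_O$) rather than the restricted fourpartite scenario used for the tests, because there the projector onto the valid subspace has a short closed form in terms of trace-and-replace maps; the function names are hypothetical; and "adding white noise until positive semidefinite" is implemented in a single step using the smallest eigenvalue.

```python
import numpy as np

def trace_replace(W, dims, systems):
    """Trace out each listed subsystem and replace it by the normalised identity."""
    n = len(dims)
    T = W.reshape(dims + dims)  # row indices first, then column indices
    for s in systems:
        reduced = np.trace(T, axis1=s, axis2=n + s)          # partial trace over s
        T = np.multiply.outer(reduced, np.eye(dims[s]) / dims[s])
        T = np.moveaxis(T, [-2, -1], [s, n + s])              # put identity back at slot s
    return T.reshape(W.shape)

def project_valid_bipartite(W, dims):
    """Project onto the subspace of valid bipartite process matrices (assumed form)."""
    AI, AO, BI, BO = range(4)
    t = lambda *s: trace_replace(W, dims, s)
    return (t(AO) + t(BO) - t(AO, BO) - t(BI, BO) + t(AO, BI, BO)
            - t(AI, AO) + t(AI, AO, BO))

def random_process_matrix(dims, rng):
    """Random PSD matrix -> projection -> white noise until PSD -> normalisation."""
    dims = tuple(dims)
    D = int(np.prod(dims))
    d_out = dims[1] * dims[3]
    G = rng.normal(size=(D, D)) + 1j * rng.normal(size=(D, D))
    M = G @ G.conj().T                      # random Hermitian PSD matrix
    W = project_valid_bipartite(M, dims)    # generally no longer PSD
    W = (W + W.conj().T) / 2                # remove numerical asymmetry
    lam_min = np.linalg.eigvalsh(W)[0]
    if lam_min < 0:
        W = W - lam_min * np.eye(D)         # identity (white noise) stays in the subspace
    return W * d_out / np.trace(W).real     # normalise so that Tr W = d_AO * d_BO

rng = np.random.default_rng(0)
dims = (2, 2, 2, 2)
W = random_process_matrix(dims, rng)
```

Because the identity lies in the valid subspace, adding white noise after the projection restores positivity without leaving that subspace, which is why the projection is performed first.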
We solved the SDP optimisation problems for the necessary and sufficient conditions for approximately 1000 randomly generated process matrices (including several hundred in which an additional constraint, namely the invariance of $W$ under permutations of the parties B, C and D, was imposed). These numerical tests failed to provide any potential counterexamples: in all cases we found $r^*_+ = r^*_-$ up to numerical precision. However, since the space of valid process matrices is so high-dimensional and our sampling method is non-uniform, we do not believe that results on this number of samples provide enough evidence to reasonably conjecture that the necessary and sufficient conditions coincide in this scenario.
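The logic of the comparison can be summarised in one line. The following is a sketch in which the random robustness is written as a minimum over the amount of white noise, which is assumed here to be the content of Eq. (36); $r^*$ denotes the (not directly computed) random robustness with respect to the exact cone.

```latex
\mathcal{W}^{\text{sep}}_- \subseteq \mathcal{W}^{\text{sep}} \subseteq \mathcal{W}^{\text{sep}}_+
\quad\Longrightarrow\quad
r^*_+ \le r^* \le r^*_- ,
\qquad
r^*_\pm := \min\{\, r : W + r\,\mathbb{1}^\circ \in \mathcal{W}^{\text{sep}}_\pm \,\}.
```

In particular, whenever the two bounds agree for a given $W$, they also determine the random robustness of that $W$ with respect to the exact cone, even though no such agreement on a finite sample can establish that the cones themselves coincide.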

Appendix G: Construction of witnesses of causal nonseparability through SDP
In this appendix we give some further details relating to the construction of witnesses of causal nonseparability through SDP. Firstly, we discuss the duality of the two SDP problems given in Sec. IV D, showing that they are indeed dual and that the Strong Duality Theorem is satisfied. We then give some additional details on how the characterisations of causal separability can be explicitly translated into SDP constraints in order to find witnesses in practice, giving some explicit examples that both illustrate this and, at the same time, allow the results in Sec. IV E to be readily verified.

Duality of SDP problems
Since both the set of causally separable process matrices $\mathcal{W}^{\text{sep}}$ and its dual $\mathcal{S} = (\mathcal{W}^{\text{sep}})^*$ (or the inner and outer approximations $\mathcal{W}^{\text{sep}}_\pm$ of $\mathcal{W}^{\text{sep}}$ arising from Propositions 9 and 10 and their respective duals $\mathcal{S}_\pm = (\mathcal{W}^{\text{sep}}_\pm)^*$, see Sec. IV D) are convex cones, the problems of minimising the amount of white noise that must be added to make a process matrix causally separable and of finding the witness of causal nonseparability with the most negative value for a given process matrix can be formulated as SDP problems as in Eqs. (36) and (37), respectively. For these problems to be efficiently solvable with standard algorithmic techniques for SDP, however, one must show that they have no duality gap (i.e., no difference between the optimal values of an SDP problem and its dual). Here, we will show that Eqs. (36) and (37) are indeed a primal-dual pair, and that the Strong Duality Theorem holds [44], implying that their optimal values indeed coincide and can therefore be efficiently obtained. This shows, in particular, that the solution to the SDP problem (37) is the optimal witness with respect to the random robustness.
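To see informally why the two problems are related, one can write the Lagrangian bound for a generic conic problem of this type. The sketch below assumes that the primal problem minimises the white-noise weight $r$ and that the dual is normalised by $\mathrm{Tr}[S\,\mathbb{1}^\circ] = 1$; the precise form and normalisation of Eqs. (36) and (37) may differ.

```latex
\begin{align}
r^* &= \min_{r}\ \{\, r : W + r\,\mathbb{1}^\circ \in \mathcal{W}^{\text{sep}} \,\} \\
    &\ge \max_{S \in \mathcal{S}}\ \min_{r}\ \Big( r - \mathrm{Tr}\big[S\,(W + r\,\mathbb{1}^\circ)\big] \Big) \\
    &= \max_{S \in \mathcal{S},\ \mathrm{Tr}[S\,\mathbb{1}^\circ] = 1}\ \big( -\mathrm{Tr}[S\,W] \big).
\end{align}
```

The inequality is weak duality: for any feasible $r$ and any $S \in \mathcal{S}$ one has $\mathrm{Tr}[S\,(W + r\,\mathbb{1}^\circ)] \ge 0$. The inner minimisation over $r$ is unbounded unless $\mathrm{Tr}[S\,\mathbb{1}^\circ] = 1$, which gives the last line. Strong duality is the statement that the inequality is in fact an equality, so that the most negative witness value equals $-r^*$.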
Ref. [12] showed the duality of two variations of the SDP problems (36) and (37) in the bipartite case: rather than consider the robustness to white noise of a process matrix, they considered the robustness of mixing a given $W$ with any valid process matrix. The optimal solutions to the corresponding SDP problems give the generalised robustness of $W$. Nevertheless, their approach to proving duality, and the applicability of the Strong Duality Theorem, is easily adapted to (and even simpler for) the random robustness, and the bipartite and some restricted tripartite versions of Eqs. (36) and (37) were already given in Ref. [13]. The same approach can be used in the more general multipartite case to show that these problems (considering the cones $\mathcal{W}^{\text{sep}}$ or $\mathcal{W}^{\text{sep}}_\pm$) satisfy the required properties. Rather than repeating these (somewhat technical and lengthy) arguments, we instead refer the reader to Appendix E of Ref. [12] and prove explicitly only the main technical lemma needed to generalise their approach.
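First, as noted already in [12,13], it is sufficient to consider the restriction $\mathcal{S}^{\mathcal{N}} := \mathcal{S} \cap \mathcal{L}^{\mathcal{N}}$ of witnesses in $\mathcal{L}^{\mathcal{N}}$. Indeed, for any $S^\perp$ in the orthogonal subspace $(\mathcal{L}^{\mathcal{N}})^\perp$ of $\mathcal{L}^{\mathcal{N}}$ and any process matrix $W$ one has $\mathrm{Tr}[S^\perp \cdot W] = 0$, and thus for any $S \in \mathcal{S}$ there exists $S' \in \mathcal{S}^{\mathcal{N}}$ such that $\mathrm{Tr}[S' \cdot W] = \mathrm{Tr}[S \cdot W]$ for all $W \in \mathcal{L}^{\mathcal{N}}$. The formulations given in Eqs. (36) and (37) are only formally dual when $\mathcal{W}^{\text{sep}}$ and $\mathcal{S}$ are considered as subsets of the vector space $\mathcal{L}^{\mathcal{N}}$ [or when $\mathcal{S}$ is replaced by $\mathcal{S}^{\mathcal{N}}$ in Eq. (37)]. However, the fact that the restriction to $\mathcal{S}^{\mathcal{N}}$ does not change the optimal value of the problem ensures that the optimal solutions coincide in the more general formulation.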
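The restriction argument can be made explicit by splitting a witness into its components inside and orthogonal to the valid subspace. The following is a sketch; the symbols $S_\parallel$ and $S_\perp$ are introduced here for illustration and do not appear in the text.

```latex
S = S_\parallel + S_\perp, \quad
S_\parallel \in \mathcal{L}^{\mathcal{N}}, \ S_\perp \in (\mathcal{L}^{\mathcal{N}})^\perp
\quad\Longrightarrow\quad
\mathrm{Tr}[S \cdot W] = \mathrm{Tr}[S_\parallel \cdot W] + \underbrace{\mathrm{Tr}[S_\perp \cdot W]}_{=\,0}
= \mathrm{Tr}[S_\parallel \cdot W]
\quad \forall\, W \in \mathcal{L}^{\mathcal{N}}.
```

Since $S_\parallel$ takes the same values as $S$ on every causally separable process matrix, it is nonnegative on that set and is therefore itself a witness lying in the valid subspace, with the same value on any given $W$.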
The primary element of the proof in Ref. [12] which needs to be generalised beyond two parties is the need to show that $\mathcal{W}^{\text{sep}}$ has a nonempty interior (within $\mathcal{L}^{\mathcal{N}}$, cf. their Lemma 7; we also need to check that it is pointed, which is trivial). To this end, it is sufficient to show that the white noise process matrix $\mathbb{1}^\circ$ is in the interior of $\mathcal{W}^{\text{sep}}$, i.e., that there exists $\epsilon > 0$ such that for any $W \in \mathcal{L}^{\mathcal{N}}$ with $\|W\|_{\text{HS}} < \epsilon$ (where $\|\cdot\|_{\text{HS}}$ is the Hilbert-Schmidt norm), one has $\mathbb{1}^\circ + W \in \mathcal{W}^{\text{sep}}$.
Recalling from Appendix A 4 the characterisation of $\mathcal{L}^{\mathcal{N}}$ in terms of "allowed" terms in a Hilbert-Schmidt basis decomposition, let us first note that any allowed Hilbert-Schmidt term $T_k$ which contains $\cdots \sigma^{A_I^k}_{\mu_k} \mathbb{1}^{A_O^k} \cdots$ (with $\sigma^{A_I^k}_{\mu_k} \neq \mathbb{1}$) for some $k \in \mathcal{N}$ is compatible with any fixed causal order where party $A_k$ comes last; i.e., $T_k \in \mathcal{L}^{A_{\pi_k(1)} \prec \cdots \prec A_{\pi_k(N-1)} \prec A_k}$ for any permutation $\pi_k$ of the parties such that $\pi_k(N) = k$ (the same also trivially holds for the allowed identity term $\mathbb{1}^{\mathcal{N}}$). Indeed, ${}_{[1-A_O^k]}T_k = 0$ and ${}_{A_{IO}^k}T_k = 0$, so that Eq. (18) holds for any such order. It follows that any $W \in \mathcal{L}^{\mathcal{N}}$ can be written as $W = \sum_{k=1}^{N} \Omega_k$, where each $\Omega_k \in \mathcal{L}^{A_{\pi_k(1)} \prec \cdots \prec A_{\pi_k(N-1)} \prec A_k}$ (for some arbitrary $\pi_k$ for each $k$); furthermore, the terms $\Omega_k$ can be taken to be orthogonal, so that $\|W\|_{\text{HS}}^2 = \sum_{k=1}^{N} \|\Omega_k\|_{\text{HS}}^2$. Note that the $\Omega_k$'s may not, in general, be positive semidefinite. Nevertheless, if we take $W$ such that $\|W\|_{\text{HS}} < \epsilon := \frac{1}{N d_I}$, with $d_I := \prod_{k \in \mathcal{N}} d_{A_I^k}$, then each $\|\Omega_k\| \le \|\Omega_k\|_{\text{HS}} < \frac{1}{N d_I}$ (where $\|\cdot\|$ is now the spectral norm), so that $\frac{1}{N d_I}\mathbb{1} + \Omega_k \ge 0$. For any such $W$, we thus obtain a decomposition $\mathbb{1}^\circ + W = \sum_{k=1}^{N} \big( \frac{1}{N d_I}\mathbb{1} + \Omega_k \big)$ with $\frac{1}{N d_I}\mathbb{1} + \Omega_k \in \mathcal{P} \cap \mathcal{L}^{A_{\pi_k(1)} \prec \cdots \prec A_{\pi_k(N-1)} \prec A_k}$, which proves that $\mathbb{1}^\circ + W$ is the sum of (valid) process matrices compatible with fixed causal orders, and hence is causally separable: $\mathbb{1}^\circ + W \in \mathcal{W}^{\text{sep}}$, as desired.
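The positivity step in this argument can be spelled out as a chain of eigenvalue bounds. This is a sketch, with $\lambda_{\min}$ denoting the smallest eigenvalue (a symbol introduced here), and using only the orthogonality of the terms $\Omega_k$ and the standard fact that the spectral norm is bounded by the Hilbert-Schmidt norm.

```latex
\lambda_{\min}\!\Big( \tfrac{1}{N d_I}\,\mathbb{1} + \Omega_k \Big)
= \tfrac{1}{N d_I} + \lambda_{\min}(\Omega_k)
\ \ge\ \tfrac{1}{N d_I} - \|\Omega_k\|
\ \ge\ \tfrac{1}{N d_I} - \|\Omega_k\|_{\text{HS}}
\ \ge\ \tfrac{1}{N d_I} - \|W\|_{\text{HS}}
\ >\ 0 .
```

The third inequality uses $\|\Omega_k\|_{\text{HS}}^2 \le \sum_{j} \|\Omega_j\|_{\text{HS}}^2 = \|W\|_{\text{HS}}^2$, and summing the $N$ shifted terms recovers the white noise process matrix because $\sum_{k=1}^{N} \tfrac{1}{N d_I}\,\mathbb{1} = \tfrac{1}{d_I}\,\mathbb{1}$.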
With this verified, the approach of Ref. [12] can be applied, with the appropriate modifications for the random robustness, to show that the required duality indeed holds and that the conditions of the Strong Duality Theorem are satisfied. (Namely, one can change Eqs. (E.3)-(E.7) in Ref. [12] to $E = \mathcal{L}^{\mathcal{N}}$, $K = \mathcal{W}^{\text{sep}}$, $L = \{ r\,\mathbb{1}^\circ \mid r \in \mathbb{R} \}$, $b = W$ and $c = \mathbb{1} / \prod_{k \in \mathcal{N}} d_{A_O^k}$, using their notation for $E$, $K$, $L$, $b$, $c$, and then adapt the proof accordingly.)

Explicit SDP constraints and example constructions
In order to characterise $\mathcal{S}$ more explicitly for a given scenario, as well as to solve both the primal and dual SDP problems using convex optimisation algorithms [45], it is helpful to write $\mathcal{W}^{\text{sep}}$ explicitly as intersections and Minkowski sums of convex cones corresponding to individual constraints on causally separable process matrices. The duality relations (34) can then be exploited to describe $\mathcal{S}$. Here we give some examples to illustrate this procedure.
The simplest example is the bipartite scenario. From the definition in Eq. (3) we see that $\mathcal{W}^{\text{sep}} = \mathcal{W}^{A \prec B} + \mathcal{W}^{B \prec A}$, where $\mathcal{W}^{A \prec B} = \mathcal{P} \cap \mathcal{L}^{A \prec B}$ and similarly for $\mathcal{W}^{B \prec A}$. Using Eq. (18) to write $\mathcal{L}^{A \prec B}$ and $\mathcal{L}^{B \prec A}$ in terms of spaces defined by individual linear constraints, or directly referring to Proposition 7, we see that [...]

Note that a slightly different, but equivalent, characterisation was given for the bipartite scenario in Refs. [12,13]. Although their formulation is slightly simpler, we choose to give the above form as it shows more clearly the procedure of obtaining explicit SDP characterisations from the characterisations of causally separable process matrices given in the main text, and it generalises more directly to the multipartite scenario.
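For the bipartite case, the passage from the cone of causally separable process matrices to the cone of witnesses can be sketched as follows. The sketch assumes that the duality relations (34) are the standard identities for closed convex cones, namely $(K_1 + K_2)^* = K_1^* \cap K_2^*$, $(K_1 \cap K_2)^* = K_1^* + K_2^*$ (up to closure), $\mathcal{P}^* = \mathcal{P}$, and $\mathcal{L}^* = \mathcal{L}^\perp$ for a linear subspace $\mathcal{L}$; its presentation may differ from the displayed equations of the paper.

```latex
\begin{align}
\mathcal{W}^{\text{sep}} &= \big( \mathcal{P} \cap \mathcal{L}^{A \prec B} \big) + \big( \mathcal{P} \cap \mathcal{L}^{B \prec A} \big), \\
\mathcal{S} = (\mathcal{W}^{\text{sep}})^*
 &= \big( \mathcal{P} \cap \mathcal{L}^{A \prec B} \big)^* \cap \big( \mathcal{P} \cap \mathcal{L}^{B \prec A} \big)^* \\
 &= \big( \mathcal{P} + (\mathcal{L}^{A \prec B})^\perp \big) \cap \big( \mathcal{P} + (\mathcal{L}^{B \prec A})^\perp \big).
\end{align}
```

In an SDP solver this means a witness $S$ is constrained by two decompositions, $S = S_1^{P} + S_1^{\perp}$ and $S = S_2^{P} + S_2^{\perp}$, with $S_i^{P} \ge 0$ and $S_i^{\perp}$ in the orthogonal complement of the corresponding fixed-order subspace; these auxiliary variable names are hypothetical.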
The next simplest case is the tripartite scenario with $d_{C_O} = 1$. In this case, causally separable process matrices are characterised by Proposition 14, from which it follows that [...]

We note again that two slightly different, but once again equivalent, characterisations were given in Refs. [12,13] for this particular tripartite case.
In the tripartite scenario with $d_{A_I} = 1$ instead (as, e.g., in the example of "activation of causal nonseparability" given by Oreshkov and Giarmatzi [15]), Proposition 15 leads to [...] It follows that [...]

In the general tripartite case, the characterisation of Proposition 8 shows that we can write $\mathcal{W}^{\text{sep}}$ as [...], from which it follows that the cone of witnesses is [...] (G11)

Once again, for more general scenarios it remains an open question whether the necessary and sufficient conditions of Propositions 9 and 10 coincide. Nonetheless, the same approach can be applied to our necessary condition, which defines the cone $\mathcal{W}^{\text{sep}}_+$ that is an outer approximation of $\mathcal{W}^{\text{sep}}$, to characterise a subset $\mathcal{S}_+$ of causal witnesses. Solving the dual SDP problem (37) over this set allows one to find valid witnesses of causal nonseparability for a given process matrix $W$, even though (without proof that the necessary and sufficient conditions coincide) such a witness may not be optimal amongst the full set of causal witnesses $\mathcal{S}$.