Causal and causally separable processes

We develop rigorous notions of causality and causal separability in the process framework introduced in [Oreshkov, Costa, Brukner, Nat. Commun. 3, 1092 (2012)], which describes correlations between separate local experiments without a prior assumption of causal order between them. We consider the general multipartite case and take into account the possibility for dynamical causal order, where the order of a set of events can depend on other events in the past. Starting from a general definition of causality, we derive an iteratively formulated canonical decomposition of multipartite causal processes, and show that for a fixed number of settings and outcomes for each party, the respective correlations form a polytope whose facets define causal inequalities. In the case of quantum processes, we investigate the link between causality and the theory-dependent notion of causal separability, which we here extend to the multipartite case based on concrete principles. We show that causality and causal separability are not equivalent in general by giving an example of a physically admissible tripartite quantum process that is causal but not causally separable. We also show that there exist causally separable (and hence causal) quantum processes that become non-causal if extended by supplying the parties with entangled ancillas. This example of activation of non-causality motivates the concepts of extensibly causal and extensibly causally separable (ECS) processes, for which the respective property remains invariant under extension with arbitrary ancillas. We characterize the class of tripartite ECS processes in terms of simple conditions on the form of the process matrix, which generalize the form of bipartite causally separable process matrices. We show that the processes realizable by classically controlled quantum circuits are ECS and conjecture that the reverse also holds.

The idea that events are equipped with a partial causal order is central to our understanding of physics in the tested regimes: given two pointlike events A and B, either A is in the causal past of B, B is in the causal past of A, or A and B are space-like separated. Operationally, the meaning of these order relations corresponds to constraints on the possible correlations between experiments performed in the vicinities of the respective events: if A is in the causal past of B, an experimenter at A could signal to an experimenter at B but not the other way around, while if A and B are space-like separated, no signaling is possible in either direction. In the context of a concrete physical theory, the correlations compatible with a given causal configuration may obey further constraints. For instance, space-like correlations in quantum mechanics arise from local measurements on joint quantum states, while time-like correlations are established via quantum channels. Similarly to other variables, however, the causal order of a set of events could be random, and little is understood about the constraints that causality implies in this case. A main difficulty concerns the fact that the order of events can now generally depend on the operations performed at the locations of these events, since, for instance, an operation at A could influence the order in which B and C occur in A's future. So far, no formal theory of causality compatible with such dynamical causal order has been developed. Apart from being of fundamental interest in the context of inferring causal relations, such a theory is imperative for understanding recent suggestions that the causal order of events in quantum mechanics can be indefinite. Here, we develop such a theory in the general multipartite case. Starting from a background-independent definition of causality, we derive an iteratively formulated canonical decomposition of multipartite causal correlations. 
For a fixed number of settings and outcomes for each party, these correlations form a polytope whose facets define causal inequalities. The case of quantum correlations in this paradigm is captured by the process matrix formalism. We investigate the link between causality and the closely related notion of causal separability of quantum processes, which we here define rigorously in analogy with the link between Bell locality and separability of quantum states. We show that causality and causal separability are not equivalent in general by giving an example of a physically admissible tripartite quantum process that is causal but not causally separable. We also show that there are causally separable quantum processes that become non-causal if extended by supplying the parties with entangled ancillas. This motivates the concepts of extensibly causal and extensibly causally separable (ECS) processes, for which the respective property remains invariant under extension. We characterize the class of ECS quantum processes in the tripartite case via simple conditions on the form of the process matrix. We show that the processes realizable by classically controlled quantum circuits are ECS and conjecture that the reverse also holds.

I. INTRODUCTION
The possibility for dynamical and indefinite causal structures in quantum theory and more general probabilistic theories has recently attracted a great deal of interest, both from a foundational point of view and in the context of quantum information processing. Motivated by the long-standing search for a theory of quantum gravity, where the causal structure is expected to be dynamical as in General Relativity but fundamentally probabilistic in nature, as well as by the exploration of novel quantum architectures beyond the standard circuit model, operational ways of thinking about causal order in a probabilistic setting have provided new perspectives on quantum mechanics, its possible applications, and routes for potential extensions.
A general framework for the study of correlations between local experiments without the assumption of a predefined causal order between them was proposed in Ref. [4]. In this so called process framework, each experiment is associated with an input and an output system between which an experimenter can perform different operations, but no specific assumption about the existence of a causal structure in which the experiments are embedded is made. When the experiments take place at fixed locations in a background space-time in circumstances defined without post-selection, the causal structure of space-time imposes signaling constraints on the correlations between the experiments. For example, there can be signaling from one experiment to another only if the former takes place in the past light cone of the latter, but no signaling between space-like separated locations or from the future to the past is possible. In Ref. [4], it was shown that if the local operations are described by quantum mechanics, it is possible to conceive correlations that are incompatible with any underlying causal structure. Such correlations allow two parties, Alice and Bob, to establish correlations that violate a causal inequality, which is impossible if their operations take place in a causal order, even if that order is random. A similar possibility was subsequently shown to exist in a multipartite setting even when the local operations are purely classical [9], which in the bipartite case is not possible [4]. It is not known at present whether such joint processes could have a physical realization without post-selection, that is, whether one could prepare a setup that leads to correlations violating causal inequalities between separate experimenters who locally experience the validity of standard quantum mechanics.
Another peculiar effect that seems at odds with causality, which has a physical realization without post-selection, arises when local quantum operations are applied in an order that depends on the value of a variable prepared in a quantum superposition [3,5,6,10,17], a technique known as 'quantum switch' [3]. This approach allows achieving certain tasks that are impossible if the quantum operations are applied in a definite causal order. In contrast to the violation of a causal inequality, however, this conclusion depends on the assumed description of the local operations and is theory-dependent.
So far, the analysis of these effects has relied on semi-rigorous considerations about what it means for a process to be compatible with 'definite causal order'. A fully rigorous argument requires such considerations to be rooted in a clear notion of causality, which, however, in this background-independent setting has been lacking. Such a notion is expected to have a universal expression which can be applied in the context of any number of parties, but how to formulate it turns out to be a nontrivial problem. Simple considerations in the multipartite case show that the causal order of a set of local experiments should most generally be considered to be a random variable that can depend on the settings of these experiments. The latter possibility cannot be excluded since compatibly with our intuition of causality we can conceive of scenarios in which the setting in a given local experiment can influence the order in which other experiments take place in the future. In other words, causality should be expressed as a rule that constrains the joint conditional probabilities for the events in the local experiments and the causal order between them, allowing for the possibility that causal configurations unfold as a result of events in the past. A formal theory of such dynamical causal order is essential not only for understanding the subject of indefinite causal order in quantum mechanics or more general theories, but also for the problem of inferring causal structure beyond the classic paradigm of underlying deterministic variables and static causal relations [24].
In this paper, we develop rigorous theory-independent and theory-dependent notions of causality in the process framework and characterize the structure and relations between the corresponding classes of processes they define. Section II is devoted to the theory-independent perspective, which contains our core result. We formalize the process framework in theory-independent terms and propose a definition of causality which allows for the possibility of dynamical causal order. We develop a number of concepts, such as multipartite signaling, reduced and conditional processes, and derive necessary and sufficient conditions for a process to be causal, which are expressed in the form of an iteratively defined canonical decomposition of the probabilities in the process. This decomposition can be understood as describing a causal 'unraveling' of the events in the experiment in a sequence, showing that the proposed notion of causality yields the structure expected from intuition. Apart from being logically non-trivial, this result has important conceptual implications: it presents us with an understanding of causal order as a random function on random events rather than the ordering of underlying locations in which events happen. This perspective is in the spirit of the idea of background independence in general relativity, according to which there are no underlying locations, but only events and the relations between them. In Section III, we focus on the quantum process framework, where we develop different theory-dependent notions of causality, which in principle have analogues in more general process theories too. Specifically, we investigate several possible generalizations of the bipartite notion of causal separability, which was previously defined heuristically in the bipartite case by postulating a particular form of the quantum process matrix [4].
We show that this form can be understood as arising from the canonical decomposition of causal processes under the condition that each process in this decomposition is a valid quantum process. We define the multipartite concept based on this principle. We show that the sets of causal and causally separable processes are not equivalent in the multipartite case, by giving an explicit example of a class of processes that are causal but not causally separable. This example is based on the 'quantum switch' technique discussed earlier.
We also show that, surprisingly, there exist causally separable (and hence causal) quantum processes that become non-causal if extended by supplying the parties with an entangled input ancilla. This example of 'activation of non-causality' is constructed based on a suitable modification of the non-causal process matrix of Ref. [4]. This observation motivates the concepts of extensibly causal and extensibly causally separable (ECS) processes, for which the respective property remains invariant under extension with arbitrary input ancillas. We derive a characterization of the class of ECS quantum processes in the tripartite case in terms of simple conditions on the form of the process matrix, which generalize the known form of bipartite causally separable process matrices. In the bipartite case, causal separability and extensible causal separability are equivalent, hence the class of ECS processes can be regarded as another possible multipartite generalization of the previously known bipartite concept. Finally, we consider the class of processes realizable by classically controlled quantum circuits, which we show is inside the class of ECS processes. These, too, are equivalent to the causally separable processes in the bipartite case and provide a possible multipartite generalization based on a different principle. We conjecture that the processes that can be obtained by classically controlled quantum circuits are equivalent to the ECS processes, and hence are described by process matrices obeying the simple conditions we have derived. We provide arguments in favor of this conjecture based on analysis in the tripartite case. In Section IV, we summarize our results and discuss future research directions.
operational probabilistic theories in the circuit framework [25][26][27][28], it is understood that equivalence classes of the variables $s^X$, $o^X$, and $w^{A,B,\cdots}$, with regard to the probabilities (1), are taken, and these variables are identified with their equivalence classes.
But what are these variables supposed to describe in practice? In Refs. [15,16], it was argued that there are two main ideas that underlie the concept of operation in the standard circuit framework for operational probabilistic theories [25][26][27][28]. The first one, termed the closed-box assumption, is the idea that the input and output systems of an operation are the only means of information exchange responsible for the correlations between the outcomes of that operation and the outcomes of other operations in the global experiment. The second idea, termed the no-post-selection criterion, which makes sense assuming a predefined notion of temporal ordering as in the standard circuit formulation, is that the variable that defines an operation, or the setting $s^X$, can be known with certainty before the time of interaction with the input system unconditionally on any events in the future.
Since no predefined global time is assumed in our picture, the latter condition will be imagined to hold only with respect to the local temporal sequence of events observed by each experimenter. Furthermore, we will assume that the variable $w^{A,B,\cdots}$ that defines the global setup in which the individual experiments take place is also obtained without post-selection. We can make sense of this idea by imagining that the variable is associated with an event that fits within each of the local temporal frames of the experimenters and is such that it occurs before any of them receives the input system. We will call processes that describe experiments of this kind pre-selected processes. (For a generalization that admits post-selection, see Ref. [16].)
For the rest of this paper, we will consider only pre-selected processes. We will drop the explicit specification 'pre-selected' for brevity, and will refer to them simply as processes, unless we want to explicitly emphasize the assumption of pre-selection. We will also drop the explicit specification of the variable $w^{A,B,\cdots}$ on which the joint experiment is conditioned, and we will simply write $W^{A,B,\cdots} \equiv \{p(o^A, o^B, \cdots | s^A, s^B, \cdots)\}$, keeping in mind that every process describes circumstances defined by such a variable and hence all probabilities we consider are implicitly conditional on such a variable.

B. Causal processes
In the circuit framework for operational probabilistic theories, causality is defined as the property that the probability distribution over the outcomes of a given operation in a circuit does not depend on what operations take place in the absolute future or absolute elsewhere [33] of that operation as defined by the strict partial order (SPO) of the circuit composition [26,27]. More specifically, every circuit describes a set of operations taking place at the vertices of a directed acyclic graph, whose directed edges (the circuit 'wires') correspond to systems that go from one operation to another. Such a graph defines a SPO on the operations in a circuit (a precise definition of SPO is given below): one operation is in the absolute past of another (equivalently, the latter is in the absolute future of the former) if there exists a directed path from the former to the latter through the graph. If there is no directed path connecting two operations, we say that one is in the absolute elsewhere of the other. If we imagine that there is a local experiment taking place at every vertex of such a graph, the property of causality says that the probabilities for the outcomes of local experiments that are in the causal past or causal elsewhere of a given local experiment cannot depend on the setting of that experiment. A circuit theory that obeys this condition, such as standard quantum theory, is called causal, and for such a theory the SPO defined by the circuit composition can be interpreted as causal order [26,27]. This interpretation corresponds to the intuitive idea that, if the setting of a local experiment is regarded as up to the 'free choice' of an experimenter, then any correlations between that setting and other variables must indicate a causal influence of the setting on those variables. From this perspective, causality can be understood as the condition that a variable can influence only variables in its immediate location or in its absolute future.
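As an illustration of how a circuit graph induces a SPO, the following minimal Python sketch computes the absolute-past relation of a small directed acyclic graph by reachability. The function names and the example circuit are hypothetical and serve only to make the construction concrete.

```python
def absolute_past_pairs(vertices, wires):
    """Return all pairs (x, y) such that a directed path leads from x to y,
    i.e. x is in the absolute past of y. `wires` lists the directed edges."""
    successors = {v: set() for v in vertices}
    for source, target in wires:
        successors[source].add(target)
    pairs = set()
    for v in vertices:
        reachable, stack = set(), list(successors[v])
        while stack:  # depth-first search starting from v
            w = stack.pop()
            if w not in reachable:
                reachable.add(w)
                stack.extend(successors[w])
        pairs.update((v, w) for w in reachable)
    return pairs


def relation(pairs, x, y):
    """Relation of x to y for two distinct operations x and y."""
    if (x, y) in pairs:
        return "absolute past"      # x is in the absolute past of y
    if (y, x) in pairs:
        return "absolute future"    # x is in the absolute future of y
    return "absolute elsewhere"     # no directed path in either direction


# Hypothetical circuit with wires A -> B -> C and A -> D
order = absolute_past_pairs("ABCD", [("A", "B"), ("B", "C"), ("A", "D")])
```

In this example, A is in the absolute past of C by transitivity, while D is in the absolute elsewhere of both B and C.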
In the process framework, we do not assume the existence of a given circuit in which the local experiments are embedded. Thus, there is no natural SPO with respect to which to define causality. Nevertheless, we may ask whether the probabilities described by a given process are compatible with the existence of a SPO with respect to which causality is satisfied. How to formulate this precisely, however, is not immediately clear because the process framework can describe situations in which the SPO may be random. For instance, it can describe the correlations between local experiments that can be embedded in different circuits according to some probability distribution. Clearly, if the SPO between the local experiments is random, it must be the case that conditionally on that SPO taking any particular value, the probabilities of the outcomes of the parties given their settings must obey the above notion of causality. This condition, however, is not sufficient to capture the idea of causality. For example, consider the local experiments of two parties, Alice and Bob, which are embedded at random in one of two possible causal circuits where they occur in different orders. The probabilities for all events and the specific circuit could be such that, conditionally on any particular circuit being realized, the joint probabilities of the outcomes of the parties given their settings obey the above notion of causality, but nevertheless the setting of Alice could be correlated with the circuit in which her experiment is embedded, and thereby with the SPO on the two local experiments. Intuitively, such a situation should be in conflict with causality, because if Alice's setting could not influence events that occur in the past, it should not influence whether or not Bob performs an operation in the past. The circuit notion of causality cannot be used to define such an independence from the past, because there the past is defined assuming a fixed circuit. 
This indicates that we need a more general notion of causality that imposes constraints on how the SPO on the local experiments can depend on the parties' settings. A simple possibility is to require that the SPO on the local experiments must be independent of the parties' settings. This condition, however, is too restrictive, because, compatibly with the idea of causality, we can conceive of scenarios where the setting of a given party influences the order in which other parties perform their experiments in that party's absolute future. Thus, a more sophisticated definition of causality is needed for the process framework. We next develop such a definition.
First, let us review the properties of SPO and introduce some terminology. A SPO on a nonempty set of local elements $S = \{A, B, C, \cdots\}$ is a binary relation $\prec$ which satisfies the following conditions: (1) irreflexivity: not $A \prec A$; (2) transitivity: if $A \prec B$ and $B \prec C$, then $A \prec C$; (3) anti-symmetry: if $A \prec B$, then not $B \prec A$. When two local experiments $A$ and $B$ satisfy $A \prec B$ (equivalently, $B \succ A$), we will say that $A$ is in the absolute past of $B$, or that $B$ is in the absolute future of $A$ [33]. It will be convenient to introduce the notation $A \npreceq B$ (equivalently, $B \nsucceq A$), which means $A \neq B$ and not $A \prec B$, that is, $A$ and $B$ are different and $A$ is not in the absolute past of $B$ (equivalently, $B$ is not in the absolute future of $A$). We will also introduce the notation $A \npreceq\nsucceq B$, which means $A \npreceq B$ and $A \nsucceq B$, that is, $A$ and $B$ are different and $A$ is neither in the absolute past nor in the absolute future of $B$ (and hence, $B$ is neither in the absolute past nor in the absolute future of $A$). In the case when $A \npreceq\nsucceq B$, we will say that $A$ and $B$ are absolutely independent, or that $A$ is in the absolute elsewhere [33] of $B$ (and similarly, $B$ is in the absolute elsewhere of $A$). A prototypical example of these relations is the causal order between the points in a Minkowski space-time: the absolute past/future of a given point corresponds to the points in the past/future light-cone of this point, excluding the point itself, while the absolute elsewhere consists of the points that are space-like separated from the point.
Note that if a set of elements $S = \{A, B, \cdots\}$ is equipped with a SPO, the elements $X$ and $Y$ in any pair $(X, Y) \in S \times S$ are related by $X \prec Y$, $X \succ Y$, $X \npreceq\nsucceq Y$, or $X = Y$. The SPO on the set $S = \{A, B, \cdots\}$ is equivalently described by the list of respective relations for each such pair, which we will denote by $\kappa(A, B, \cdots)$. (This list obviously must respect the properties of SPO listed above.) Since for pairs $(X, X)$ of identical elements this relation is trivially $X = X$, when we explicitly describe $\kappa(A, B, \cdots)$, we will only list the pairwise relations for all pairs of distinct elements of the set (if any). Note that this description is generally redundant due to the transitivity of SPO. If we are given the pairwise relations for a set $S = \{A, B, \cdots\}$, we have, in particular, pairwise relations for any nonempty subset $S' = \{X, Y, \cdots\} \subset S$, i.e., a SPO $\kappa(A, B, \cdots)$ on $S$ implies a SPO $\kappa(X, Y, \cdots)$ on $S' \subset S$, $S' \neq \{\}$.
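The three defining properties and the restriction of a SPO to a subset can be checked by brute force for small sets. The following Python sketch is only an illustration: the representation of a SPO as a set of ordered pairs $(x, y)$ meaning $x \prec y$, and all function names, are assumptions made here and are not notation from the text.

```python
from itertools import product


def is_strict_partial_order(elements, prec):
    """prec is a set of ordered pairs (x, y) meaning x is in the absolute past of y."""
    irreflexive = all((x, x) not in prec for x in elements)
    asymmetric = all((y, x) not in prec for (x, y) in prec)
    transitive = all((x, z) in prec
                     for (x, y) in prec for (w, z) in prec if y == w)
    return irreflexive and asymmetric and transitive


def all_spos(elements):
    """Enumerate every SPO on `elements` by testing all sets of ordered pairs."""
    pairs = [(x, y) for x in elements for y in elements if x != y]
    spos = []
    for mask in product([False, True], repeat=len(pairs)):
        prec = frozenset(p for p, keep in zip(pairs, mask) if keep)
        if is_strict_partial_order(elements, prec):
            spos.append(prec)
    return spos


def restrict(prec, subset):
    """Pairwise relations induced on a subset of the elements."""
    return frozenset((x, y) for (x, y) in prec if x in subset and y in subset)


two_party = all_spos("AB")     # A before B, B before A, or absolutely independent
three_party = all_spos("ABC")
```

For two elements the enumeration returns the three configurations $A \prec B$, $B \prec A$, and $A \npreceq\nsucceq B$; for three elements it returns 19 configurations, and restricting any of them to two elements again gives a valid SPO.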
As discussed above, the SPO $\kappa(A, B, \cdots)$ on a set of local experiments $S = \{A, B, \cdots\}$ in terms of which causality would be defined can most generally be random and correlated with the events in these experiments. The notion of causality would impose constraints on the possible correlations. We want these constraints to formalize the following intuition about causality: The choice of setting in a local experiment cannot affect the occurrence of events in the absolute past or absolute elsewhere of that experiment, nor the SPO on such events and the experiment in question.
Since a process is defined by the conditional probabilities for the outcomes of the local experiments given their settings and does not assume the existence of probabilities for the settings, we will formulate the above constraint at the level of probabilities conditional on the settings. We define this as follows.
Definition II.3. (Causal process): A process $W^{A,B,\cdots} \equiv \{p(o^A, o^B, \cdots | s^A, s^B, \cdots)\}$ for a nonempty set of local experiments $S = \{A, B, \cdots\}$ is called causal if and only if there exists a probability distribution $p(\kappa(A, B, \cdots), o^A, o^B, \cdots | s^A, s^B, \cdots)$ satisfying $\sum_{\kappa(A,B,\cdots)} p(\kappa(A, B, \cdots), o^A, o^B, \cdots | s^A, s^B, \cdots) = p(o^A, o^B, \cdots | s^A, s^B, \cdots)$, where the random variable $\kappa(A, B, \cdots)$ takes values in the possible SPOs on $S = \{A, B, \cdots\}$, such that for every local experiment, e.g. $A$, every subset $\mathcal{X} = \{X, Y, \cdots\}$ of the rest of the local experiments, and every SPO $\kappa(A, X, Y, \cdots) \equiv \kappa(A, \mathcal{X})$ on the local experiment in question and that subset, we have
$$p(\kappa(A, \mathcal{X}), A \npreceq \mathcal{X}, o^{\mathcal{X}} | s^A, s^B, \cdots) = p(\kappa(A, \mathcal{X}), A \npreceq \mathcal{X}, o^{\mathcal{X}} | s^B, \cdots). \qquad (2)$$
Here, $o^{\mathcal{X}}$ denotes collectively the outcomes of all local experiments in $\mathcal{X}$, and $A \npreceq \mathcal{X}$ denotes the condition that all these local experiments are in the causal past or causal elsewhere of $A$ (i.e., $A \npreceq X$, $A \npreceq Y$, $\cdots$, for all $X, Y, \cdots \in \mathcal{X}$). [The probability $p(\kappa(A, \mathcal{X}), A \npreceq \mathcal{X}, o^{\mathcal{X}} | s^A, s^B, \cdots)$ is understood to be obtained from $p(\kappa(A, B, \cdots), o^A, o^B, \cdots | s^A, s^B, \cdots)$ by summing over all cases in which $\kappa(A, B, \cdots)$ is compatible with $\kappa(A, \mathcal{X})$ and $A \npreceq \mathcal{X}$ (obviously, if $\kappa(A, \mathcal{X})$ itself is not compatible with $A \npreceq \mathcal{X}$, the respective probability is zero) and over all possible outcomes of the local experiments in the complement of $\mathcal{X}$].

Remark. A monopartite process is trivially causal.
For a process $W^{A,B,\cdots}$ that is causal, the binary relation $\prec$ of the SPO $\kappa(A, B, \cdots)$ can be interpreted as causal order. In that case, we will use the terms 'causal past', 'causal future', 'causal elsewhere' and 'causally independent' in the place of 'absolute past', 'absolute future', 'absolute elsewhere' and 'absolutely independent', respectively. We will also refer to the list of pairwise relations $\kappa(A, B, \cdots)$ as the causal configuration of the local experiments (in the case of a monopartite process, the causal configuration is trivial).
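To make the content of Definition II.3 concrete in the simplest case, the following Python sketch tests whether a given joint distribution over the causal order and the outcomes of two parties satisfies the constraint of the definition. It is a sketch under stated assumptions: the array layout `p[k, oA, oB, sA, sB]`, the integer labels for the three bipartite orders, and the two toy distributions are hypothetical choices made here. The check applies to one candidate distribution only; it does not search for a distribution that would establish causality of a given process.

```python
import numpy as np

# Hypothetical labels for the three possible causal orders of two parties
A_BEFORE_B, B_BEFORE_A, ELSEWHERE = 0, 1, 2


def satisfies_causality_condition(p, tol=1e-12):
    """p[k, oA, oB, sA, sB] = p(kappa = k, oA, oB | sA, sB)."""
    if not np.allclose(p.sum(axis=(0, 1, 2)), 1.0, atol=tol):
        return False  # not normalised for some pair of settings
    # Party A with subset {B}: orders in which B is in the past or elsewhere of A.
    for k in (B_BEFORE_A, ELSEWHERE):
        marg = p[k].sum(axis=0)                    # axes: oB, sA, sB
        if not np.allclose(marg, marg[:, :1, :], atol=tol):
            return False                           # depends on Alice's setting
    # Party B with subset {A}: orders in which A is in the past or elsewhere of B.
    for k in (A_BEFORE_B, ELSEWHERE):
        marg = p[k].sum(axis=1)                    # axes: oA, sA, sB
        if not np.allclose(marg, marg[:, :, :1], atol=tol):
            return False                           # depends on Bob's setting
    return True


# Toy case 1: the order is chosen at random, independently of the settings;
# whoever acts second reads out the setting of whoever acts first.
random_order = np.zeros((3, 2, 2, 2, 2))
# Toy case 2: Alice's setting decides the order, all outcomes are 0.
setting_dependent_order = np.zeros((3, 2, 2, 2, 2))
for sA in range(2):
    for sB in range(2):
        random_order[A_BEFORE_B, 0, sA, sA, sB] = 0.5
        random_order[B_BEFORE_A, sB, 0, sA, sB] = 0.5
        order = A_BEFORE_B if sA == 0 else B_BEFORE_A
        setting_dependent_order[order, 0, 0, sA, sB] = 1.0
```

The second toy distribution has no signaling within either order, yet it fails the check because Alice's setting is correlated with whether Bob acts in her past, which is the situation discussed above as being in conflict with causality.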
Our goal next is to understand the structure of causal processes that arises from this definition and show that it corresponds exactly to what one expects from intuition.

C. Fixed-order causal processes, (no) signaling, reduced and conditional processes
Before we consider the case of general causal processes, it will be instructive to investigate the special case of causal processes for which the causal configuration of the local experiments is fixed. As we will show, the constraints on such processes can be expressed via the concept of signaling, which we develop below. We also introduce several related concepts that will be of use later.
Definition II.4. (Fixed-order causal process): A process $W^{A,B,\cdots} \equiv \{p(o^A, o^B, \cdots | s^A, s^B, \cdots)\}$ is called fixed-order causal if it is compatible with a deterministic causal configuration, i.e., if it satisfies condition (2) for a SPO $\kappa(A, B, \cdots)$ that takes a particular value $\kappa(A, B, \cdots) = \kappa^*(A, B, \cdots)$ with unit probability for all possible settings of the parties:
$$p(\kappa(A, B, \cdots) = \kappa^*(A, B, \cdots) | s^A, s^B, \cdots) = 1, \quad \forall s^A, s^B, \cdots.$$
Since our definition of causal process implies that the setting of a local experiment cannot be correlated with the outcomes of local experiments that are in the absolute past or absolute elsewhere of that experiment, one may expect that for any fixed causal configuration of the local experiments, causality would impose constraints on the possibility for signaling between them, similarly to the case in the circuit framework. In the case of two experiments, signaling can be defined as follows:
Definition II.5. (Bipartite signaling): We say that there is no signaling from Alice ($A$) to Bob ($B$) in a bipartite process $W^{A,B}$ if and only if the probabilities of the process satisfy
$$p(o^B | s^A, s^B) = p(o^B | s^B), \quad \forall s^A \in S^A, \ \forall s^B \in S^B, \ \forall o^B,$$
i.e., the marginal probabilities for the outcomes of Bob are independent of the setting of Alice for any possible setting of Bob. Equivalently, we say that there is signaling from Alice to Bob in the process $W^{A,B}$ if and only if this condition is not satisfied.
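The condition of Definition II.5 is straightforward to test numerically. The sketch below assumes a bipartite process stored as an array `p[oA, oB, sA, sB]`; this layout, the function names, and the example process are illustrative assumptions.

```python
import numpy as np


def no_signaling_from_alice_to_bob(p, tol=1e-12):
    """p[oA, oB, sA, sB] = p(oA, oB | sA, sB)."""
    bob = p.sum(axis=0)                     # Bob's marginal, axes: oB, sA, sB
    return bool(np.allclose(bob, bob[:, :1, :], atol=tol))


def no_signaling_from_bob_to_alice(p, tol=1e-12):
    alice = p.sum(axis=1)                   # Alice's marginal, axes: oA, sA, sB
    return bool(np.allclose(alice, alice[:, :, :1], atol=tol))


# Example process: Bob's outcome copies Alice's setting, Alice always obtains 0.
copy_process = np.zeros((2, 2, 2, 2))
for sA in range(2):
    for sB in range(2):
        copy_process[0, sA, sA, sB] = 1.0
```

Here there is signaling from Alice to Bob but not from Bob to Alice, as expected for a process compatible with Alice being in the causal past of Bob.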
For a fixed-order causal process $W^{A,B}$, where one of the relations $A \prec B$, $B \prec A$, or $A \npreceq\nsucceq B$ holds with unit probability for all settings of the parties, we can see that signaling is possible from one experiment to the other only if the former is in the causal past of the latter, which agrees with the notion of causality in the circuit framework [26,27]. Indeed, assume for example that $B \prec A$, i.e., $p(\kappa(A, B) = [B \prec A] | s^A, s^B) = 1$, $\forall s^A \in S^A$, $\forall s^B \in S^B$ (and hence $p(\kappa(A, B) = [A \prec B] | s^A, s^B) = 0$ and $p(\kappa(A, B) = [A \npreceq\nsucceq B] | s^A, s^B) = 0$, $\forall s^A \in S^A$, $\forall s^B \in S^B$). Then, applying condition (2) to $A$ and the subset $\{B\}$, we have
$$p(o^B | s^A, s^B) = p(\kappa(A, B) = [B \prec A], A \npreceq B, o^B | s^A, s^B) = p(\kappa(A, B) = [B \prec A], A \npreceq B, o^B | s^B) = p(o^B | s^B),$$
i.e., there is no signaling from Alice to Bob. In a similar way, we see that if $A \prec B$, there is no signaling from Bob to Alice, while if $A \npreceq\nsucceq B$, there is no signaling from Alice to Bob and no signaling from Bob to Alice. In the case of more than two local experiments, the relevant generalization of the above notion of signaling may not be immediately obvious. Notice that if a given bipartite process $W^{A,B}$ involves no signaling between $A$ and $B$, such a process is in principle compatible with the causal configuration $A \npreceq\nsucceq B$ (in fact, it is compatible with any causal configuration of the two parties). However, in the case of processes for more than two local experiments, even if there is lack of signaling between any pair of experiments for all possible settings of the rest of the experiments, the process may not be compatible with a causal configuration in which all experiments are causally independent.
To see this, consider three local experiments performed by Alice, Bob, and Charlie, where each party's input and output systems are classical bits, and each party is allowed to perform any classical stochastic operation from the input bit to the output bit. Let the experiments of Bob and Charlie be causally independent, and let Alice's experiment be in the absolute future of Bob's experiment, but in the absolute elsewhere of Charlie's experiment (i.e., the causal configuration of the three parties is $[B \prec A, A \npreceq\nsucceq C, B \npreceq\nsucceq C]$). Imagine that Charlie receives his input system in one of the two possible states 0 or 1 with probability 1/2, and depending on that state, Alice and Bob are in one of the following two scenarios. In the first scenario (say, when Charlie receives 0), Bob receives a random bit as an input system, his output bit is sent unaltered into the input system of Alice, and Alice's output bit is discarded. In the second scenario (when Charlie receives 1), Bob again receives a random input bit, but this time his output bit is flipped before sending it into Alice's input, and Alice's output bit is again discarded. In both cases, the output system of Charlie is discarded. Clearly, the described situation can be realized in agreement with a fixed causal configuration of the parties: all we need to do is supply Bob with a random bit and correlate the channel from Bob to Alice with the input system of Charlie, discarding the outcomes of Alice and Charlie. The mechanism realizing this is sketched in Fig. 1a. Note that the tripartite process corresponding to this scenario would involve no signaling from Bob to Alice in spite of the existence of a channel from Bob to Alice. This is the case irrespective of what operation Charlie performs.
Obviously, there can be no signaling from Alice to Bob either, since Alice operates in the future of Bob, nor can there be signaling between Alice and Charlie, or between Bob and Charlie, since Charlie is causally independent of both Alice and Bob. Thus, we have no signaling between any pair of parties, no matter what the setting of the third party is. Yet, the possible correlations between the parties cannot be realized if all parties are causally independent because if Alice and Charlie measure their input bits and collect the results of their measurements, they can infer the bit sent out by Bob, which is impossible if all parties are causally independent. We might say that in this case we have signaling from Bob to Alice and Charlie together. But intuitively, given the described scenario, this signaling should be from Bob to Alice only, since there is no channel connecting Bob's output system to Charlie's input. However, the latter conclusion is based on knowledge about the mechanism by means of which the correlations are established, or about the causal configuration of the parties, and does not follow solely from the correlations between them. Indeed, the tripartite joint probabilities for the outlined scenario are symmetric with respect to interchanging the roles of Alice and Charlie, and thus they could arise from a different mechanism in a situation where Alice is causally independent of both Bob and Charlie, and Charlie is in the causal future of Bob (Fig. 1b). They could also arise from a channel from Bob to both Alice and Charlie (Fig. 1c) which transforms Bob's output bit into either correlated or anti-correlated random input bits for Alice and Charlie.
We therefore see that, at the level of the joint probabilities for the parties' experiments, there is no way of distinguishing between these different mechanisms of information transmission, and hence no way of giving a definition of signaling among a proper subset of the parties that unambiguously captures the existence of such a mechanism. We can, however, give an unambiguous definition of lack of signaling between two complementary subsets of the parties (Fig. 2), as well as an associated notion of multipartite signaling, generalizing the bipartite case.
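The three-party scenario above can be checked numerically. The following is a minimal sketch, under the simplifying assumption (ours, not the text's) that Bob's setting `sB` is simply the bit he emits and that every party's outcome is the input bit it observes; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)

def p(oA, oB, oC, sB):
    """p(oA, oB, oC | sB): Bob's and Charlie's inputs are uniformly random,
    and Alice receives Bob's emitted bit sB, flipped when Charlie's input is 1."""
    return Fraction(1, 4) if oA == sB ^ oC else Fraction(0)

def marginal(keep, sB):
    """Marginal distribution of the outcomes of the parties named in `keep`."""
    dist = {}
    for oA, oB, oC in product(BITS, repeat=3):
        outcome = {"A": oA, "B": oB, "C": oC}
        key = tuple(outcome[x] for x in keep)
        dist[key] = dist.get(key, Fraction(0)) + p(oA, oB, oC, sB)
    return dist

def signals_from_bob(keep):
    """True if the marginal on `keep` depends on Bob's setting."""
    return marginal(keep, 0) != marginal(keep, 1)

pairwise = {keep: signals_from_bob(keep) for keep in ("A", "C")}
joint_AC = signals_from_bob("AC")
symmetric = all(p(a, b, c, s) == p(c, b, a, s)
                for a, b, c, s in product(BITS, repeat=4))

print(pairwise, joint_AC, symmetric)
```

In this toy model Bob signals to neither Alice nor Charlie individually, but he does signal to the two of them taken together, and the probabilities are unchanged when the roles of Alice and Charlie are interchanged.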
Equivalently, we say that there is signaling from (1 or · · · or k) to (k + 1 or · · · or n) if and only if this condition is not satisfied. Remark. There is no signaling from or to the empty subset.
Note that this definition only says whether there is signaling from one or more local experiments from a given subset to one or more local experiments from the complementary subset, but in the general case it does not identify pairs of experiments between which there is signaling. In the case of two experiments, the definition reduces to the notion of bipartite signaling defined earlier.
Definition II.7. (Non-signaling process): A process W 1,··· ,n for a set of local experiments S = {1, · · · , n}, n = 0, 1, · · · , is called non-signaling if and only if there is no signaling from A to B for any pair of complementary subsets A and B of S.
Remark. Monopartite processes and the trivial process are non-signaling.
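As an illustration of Definition II.7, the sketch below tests a finite table of conditional probabilities for signaling across every bipartition. The data layout (a dictionary from setting tuples to outcome distributions) and the function names are our own assumptions, chosen only for this example.

```python
from fractions import Fraction
from itertools import combinations, product

BITS = (0, 1)

def marginal(dist, idx):
    """Marginal of an outcome distribution on the parties listed in idx."""
    out = {}
    for outcomes, prob in dist.items():
        key = tuple(outcomes[i] for i in idx)
        out[key] = out.get(key, 0) + prob
    return {k: v for k, v in out.items() if v}  # drop zero entries

def no_signaling(process, src, n):
    """True if the outcomes of the parties outside `src` are independent
    of the settings of the parties in `src`."""
    dst = [i for i in range(n) if i not in src]
    seen = {}
    for settings, dist in process.items():
        key = tuple(settings[i] for i in dst)
        m = marginal(dist, dst)
        if seen.setdefault(key, m) != m:
            return False
    return True

def is_non_signaling(process, n):
    """No signaling from A to its complement, for every subset A."""
    subsets = (c for k in range(n + 1) for c in combinations(range(n), k))
    return all(no_signaling(process, set(s), n) for s in subsets)

# Two bipartite examples: a shared random bit, and a channel that copies
# the setting of party 0 into the outcome of party 1.
shared = {s: {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}
          for s in product(BITS, repeat=2)}
channel = {(s0, s1): {(0, s0): Fraction(1)}
           for s0, s1 in product(BITS, repeat=2)}

print(is_non_signaling(shared, 2), is_non_signaling(channel, 2))
```

The empty subset and the full set are included in the loop; for a normalized table they never produce signaling, in line with the remark above.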
From the definition of causal process, one easily obtains the following relation between the existence of multipartite signaling among the local experiments described by a given process and the causal configuration of these experiments.
It turns out that we can formulate necessary and sufficient conditions for a process to be fixed-order causal, which are expressed entirely in terms of the condition stated in Proposition II.1 applied to different subsets of the experiments. To formulate the conditions precisely, we will need to introduce the concept of reduced process.
Definition II.8. (Reduced process): Consider an n-partite process $W^{1,\cdots,n}$, n ≥ 0, for a set of local experiments S = {1, · · · , n}. Let A = {1, · · · , k} and B = {k + 1, · · · , n}, 0 ≤ k < n, be two complementary subsets of the experiments (specified up to relabeling), such that there is no signaling from B to A. This means that
$$p(o_1,\cdots,o_k|s_1,\cdots,s_n) = p(o_1,\cdots,o_k|s_1,\cdots,s_k),$$
i.e., we have well defined conditional probabilities $p(o_1,\cdots,o_k|s_1,\cdots,s_k)$ for the experiments in A. The collection of these probabilities will be called the reduced process for A and will be denoted by $W^{A} \equiv W^{1,\cdots,k}$.
Note that if a multipartite process is a valid pre-selected process, any of its reduced processes is also a valid pre-selected process because it is defined conditionally on the same pre-selected event. Note also that a general multipartite process need not admit any reduced processes apart from the trivial process and itself, since it may involve signaling from every proper subset of the local experiments to its complementary subset.
Before we state the conditions for a process to be fixed-order causal, we introduce another concept that will be needed later.
Definition II.9. (Conditional process): Consider an n-partite process $W^{1,\cdots,n}$, n ≥ 0, for a set of local experiments S = {1, · · · , n}. Let A = {1, · · · , k} and B = {k + 1, · · · , n}, 0 ≤ k < n, be two complementary subsets of the experiments (specified up to relabeling), such that there is no signaling from B to A (and hence we can define a reduced process $W^{A} \equiv W^{1,\cdots,k}$). For each fixed event $(s_1,o_1,\cdots,s_k,o_k)$ in A for which $p(o_1,\cdots,o_k|s_1,\cdots,s_k) \neq 0$, consider the collection of conditional probabilities $\{p(o_{k+1},\cdots,o_n|s_{k+1},\cdots,s_n,s_1,o_1,\cdots,s_k,o_k)\}$. These can be thought of as an (n−k)-partite process for B dependent on the event $(s_1,o_1,\cdots,s_k,o_k)$ in A. The collection of these processes for all values of $(s_1,o_1,\cdots,s_k,o_k)$ for which $p(o_1,\cdots,o_k|s_1,\cdots,s_k) \neq 0$ will be called the conditional process and will be denoted by $W^{B|A} \equiv W^{k+1,\cdots,n|1,\cdots,k}$. The relation between the whole process and the reduced and conditional processes can be written in the compact form
$$W^{1,\cdots,n} = W^{B|A} \bullet W^{A},$$
where the product • between $W^{B|A}$ and $W^{A}$ denotes multiplication of the respective probabilities of these processes, when defined, for the same value of the event in A:
$$p(o_1,\cdots,o_n|s_1,\cdots,s_n) = p(o_{k+1},\cdots,o_n|s_{k+1},\cdots,s_n,s_1,o_1,\cdots,s_k,o_k)\,p(o_1,\cdots,o_k|s_1,\cdots,s_k)$$
for $p(o_1,\cdots,o_k|s_1,\cdots,s_k) \neq 0$, and
$$p(o_1,\cdots,o_n|s_1,\cdots,s_n) = 0$$
for $p(o_1,\cdots,o_k|s_1,\cdots,s_k) = 0$.
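Definitions II.8 and II.9 can be illustrated with a small bipartite example. The sketch below assumes a toy process of our own choosing: Alice observes a uniformly random bit `oA`, XORs her setting `sA` into it and forwards the result to Bob, who observes it as `oB`; Bob's setting is ignored. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)

def p(oA, oB, sA, sB):
    """Joint probabilities p(oA, oB | sA, sB) of the toy process."""
    return Fraction(1, 2) if oB == oA ^ sA else Fraction(0)

def reduced(oA, sA):
    """Reduced process for Alice, p(oA | sA). It exists only if the
    marginal does not depend on Bob's setting (no signaling from B to A)."""
    values = {sum(p(oA, oB, sA, sB) for oB in BITS) for sB in BITS}
    assert len(values) == 1, "signaling from B to A: no reduced process"
    return values.pop()

def conditional(oB, sB, sA, oA):
    """Conditional process for Bob given Alice's event (sA, oA);
    undefined (None) when that event has zero probability."""
    pA = reduced(oA, sA)
    return None if pA == 0 else p(oA, oB, sA, sB) / pA

def recomposed(oA, oB, sA, sB):
    """Product of the conditional and reduced processes for the same event in A."""
    pA = reduced(oA, sA)
    return Fraction(0) if pA == 0 else conditional(oB, sB, sA, oA) * pA

ok = all(recomposed(*x) == p(*x) for x in product(BITS, repeat=4))
print(ok)
```

Here the recomposition reproduces the original table entry by entry, which is the content of the product relation between the whole process and its reduced and conditional parts.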
Proposition II.2. A process W 1,··· ,n for a set of local experiments S = {1, · · · , n}, n ≥ 1, is compatible with a deterministic causal configuration κ * (1, · · · , n) of these experiments (and is thereby fixed-order causal) if and only if, for the assumed causal configuration, Proposition II.1 holds for the full process and all of its reduced processes for all bipartitions of the local experiments into two complementary subsets. The Proof S1 is given in the Appendix.
We next turn to general causal processes, beginning with the bipartite case.

D. Bipartite causal processes
Consider a process $W^{A,B}$ describing the local experiments of two parties, Alice and Bob. If the process is causal, there exist probabilities $p(A\prec B|s_A,s_B)$, $p(B\prec A|s_A,s_B)$, $p(A\nprec\nsucc B|s_A,s_B)$, with $p(A\prec B|s_A,s_B) + p(B\prec A|s_A,s_B) + p(A\nprec\nsucc B|s_A,s_B) = 1$. We can therefore write the joint probabilities of the process in the form
$$p(o_A,o_B|s_A,s_B) = \sum_{\kappa(A,B)} p(\kappa(A,B)|s_A,s_B)\,p(o_A,o_B|s_A,s_B,\kappa(A,B)),$$
where each of the probability distributions $p(o_A,o_B|s_A,s_B,\kappa(A,B))$ is defined whenever the corresponding weight is non-zero. The condition for causality (2) implies $p(A\prec B|s_A,s_B) = p(A\prec B|s_A)$, $p(B\prec A|s_A,s_B) = p(B\prec A|s_B)$, and $p(A\nprec\nsucc B|s_A,s_B) = p(A\nprec\nsucc B)$. Since the sum of these probabilities must be unity, we obtain $p(A\prec B|s_A) = p(A\prec B)$, $p(B\prec A|s_B) = p(B\prec A)$, i.e., the causal configuration of the local experiments is independent of the parties' settings. Thus, the probabilities of a bipartite causal process $W^{A,B}_c$ have the form
$$p(o_A,o_B|s_A,s_B) = p(A\prec B)\,p(o_A,o_B|s_A,s_B,A\prec B) + p(B\prec A)\,p(o_A,o_B|s_A,s_B,B\prec A) + p(A\nprec\nsucc B)\,p(o_A,o_B|s_A,s_B,A\nprec\nsucc B), \tag{12}$$
where the probability distributions $p(o_A,o_B|s_A,s_B,A\prec B)$, $p(o_A,o_B|s_A,s_B,B\prec A)$, and $p(o_A,o_B|s_A,s_B,A\nprec\nsucc B)$, whenever defined, describe processes, which we will denote by $W^{A\prec B}$, $W^{B\prec A}$, and $W^{A\nprec\nsucc B}$, respectively. (Note that we can imagine that the causal configuration $\kappa(A,B)$, taking values $A\prec B$, $B\prec A$, or $A\nprec\nsucc B$, is associated with an event in the past of both A and B, i.e., the processes $W^{A\prec B}$, $W^{B\prec A}$, and $W^{A\nprec\nsucc B}$ can be thought of as proper pre-selected processes.) The assumption of causality imposes conditions on these processes too. Specifically, it can be seen that each of them must obey a no-signaling constraint compatible with the concrete causal configuration it is conditioned on: the first one must involve no signaling from Bob to Alice, $p(o_A|s_A,s_B,A\prec B) = p(o_A|s_A,A\prec B)$; the second one must involve no signaling from Alice to Bob, $p(o_B|s_A,s_B,B\prec A) = p(o_B|s_B,B\prec A)$; and the third one must involve no signaling in either direction, i.e., these are fixed-order causal processes. In a compact form, we can write
$$W^{A,B}_c = p(A\prec B)\,W^{A\prec B} + p(B\prec A)\,W^{B\prec A} + p(A\nprec\nsucc B)\,W^{A\nprec\nsucc B},$$
i.e., a bipartite causal process has the form of a probabilistic mixture of processes that are compatible with the different mutually exclusive causal configurations of the parties (and correspondingly involve only one-way signaling in the respective direction, or no signaling).
This form is not only necessary but also sufficient for a process to be causal because it explicitly gives a joint probability distribution $p(\kappa(A,B),o_A,o_B|s_A,s_B) = p(\kappa(A,B))\,p(o_A,o_B|s_A,s_B,\kappa(A,B))$ that obeys the condition for causality (2) when each conditional distribution $p(o_A,o_B|s_A,s_B,\kappa(A,B))$ obeys the no-signaling constraints compatible with $\kappa(A,B)$. Indeed, we have $p(A\prec B,o_A|s_A,s_B) = p(A\prec B)\,p(o_A|s_A,A\prec B) = p(A\prec B,o_A|s_A)$, and similarly for the other configurations. Since the probabilities of a non-signaling process $W^{A\nprec\nsucc B}$ are compatible with the one-way signaling constraints for the cases $A\prec B$ or $B\prec A$, we can also write the probabilities (12) in the non-unique form
$$p(o_A,o_B|s_A,s_B) = p(w^{A\nprec B})\,p(o_A,o_B|s_A,s_B,w^{A\nprec B}) + p(w^{B\nprec A})\,p(o_A,o_B|s_A,s_B,w^{B\nprec A}), \tag{15}$$
where $w^{A\nprec B}$ and $w^{B\nprec A}$ are two mutually exclusive variables for which the experiments of Alice and Bob respect the relations $A\nprec B$ and $B\nprec A$, respectively, with the probabilities of these variables satisfying $p(w^{A\nprec B}) + p(w^{B\nprec A}) = 1$. In a compact form, this can be written
$$W^{A,B}_c = p(w^{A\nprec B})\,W^{A\nprec B} + p(w^{B\nprec A})\,W^{B\nprec A}, \tag{16}$$
where $W^{Y\nprec X}$ is a process that involves no signaling from Y to X, i.e., $p(o_X|s_X,s_Y,w^{Y\nprec X}) = p(o_X|s_X,w^{Y\nprec X})$.

The constraint (16) (equivalently, (15)) provides a means of testing whether a given bipartite process theory is compatible with causal order. For every fixed number of settings and fixed number of outcomes for each party, the joint probabilities satisfying Eq. (15) form a convex polytope, which is the convex hull of the polytope of probabilities that involve no signaling from Alice to Bob, and the polytope of probabilities that involve no signaling from Bob to Alice [34]. The non-trivial facets of this 'causal polytope' define bipartite causal inequalities, similar to the one in Ref. [4], whose violation by a given process theory indicates that the theory is not compatible with causal order. Note that a causal inequality does not need to be a facet of the causal polytope: it may correspond to an external plane. For instance, the causal inequality of Ref. [4], which concerns the case where one party has a binary input and a binary output while the other one has a quaternary input and a binary output, is not a facet of the respective causal polytope [21]. One way of seeing this is to note that the derivation of the inequality in Ref. [4] only used certain consequences of the requirement that the causal configuration of the parties must be independent of the parties' settings, but not the full requirement. The bipartite causal polytope for binary inputs and binary outputs has been characterized by Branciard [34] (see Ref. [21]).
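The convex-hull structure of the bipartite causal polytope suggests a simple way to bound any linear functional of the probabilities: maximize it over the deterministic one-way-signaling strategies, which contain all vertices. The sketch below does this for binary settings and outcomes. The functional used (the probability that each party's outcome equals the other party's setting, for uniformly random settings) is our own illustrative choice and is not one of the inequalities discussed in the text.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)

def functions(n_args):
    """All deterministic functions from n_args bits to one bit, as lookup tables."""
    inputs = list(product(BITS, repeat=n_args))
    for values in product(BITS, repeat=len(inputs)):
        yield dict(zip(inputs, values))

def score(strategy):
    """Probability that a = y and b = x for uniformly random settings x, y."""
    wins = sum(strategy(x, y) == (y, x) for x, y in product(BITS, repeat=2))
    return Fraction(wins, 4)

def vertices_no_signaling_B_to_A():
    """Deterministic strategies with a = f(x), b = g(x, y)."""
    for f, g in product(functions(1), functions(2)):
        yield lambda x, y, f=f, g=g: (f[(x,)], g[(x, y)])

def vertices_no_signaling_A_to_B():
    """Deterministic strategies with b = f(y), a = g(x, y)."""
    for f, g in product(functions(1), functions(2)):
        yield lambda x, y, f=f, g=g: (g[(x, y)], f[(y,)])

all_vertices = (list(vertices_no_signaling_B_to_A())
                + list(vertices_no_signaling_A_to_B()))
causal_bound = max(score(v) for v in all_vertices)

# A table with signaling in both directions, which no causal process reproduces.
two_way = score(lambda x, y: (y, x))

print(causal_bound, two_way)
```

Because a linear functional attains its maximum over a polytope at a vertex, `causal_bound` is the largest value any probabilistic mixture of the form (15) can reach for this functional, while the two-way-signaling table exceeds it.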

E. The tripartite and n-partite causal processes
In the case of more than two parties, causal processes need not have the simple form of probabilistic mixtures of fixed-order causal processes with probability weights that are independent of the parties' settings. This is because, consistently with causality, we have the possibility that the causal configuration of a subset of the local experiments may depend on the settings of other local experiments in their past. For example, imagine that we have a tripartite experiment where the input and output systems of each party correspond to the internal (e.g., spin) degrees of freedom of a particle that enters the respective laboratory at a given instant and leaves it at a given later instant. The time at which each party receives her/his particle is determined by some predefined mechanism, which also governs any exchange of information taking place outside of the parties' laboratories.
(Note that in order for the internal degrees of freedom of the particle to constitute the only means of information exchange between each local experiment and the rest of the experiment, the experiment should be so designed that no communication via the times of input or output of the parties is possible. For example, each party may be restricted not to possess any common time reference frame with the rest of the experiment and to perform her/his operation during a fixed time interval with a stopwatch.) In such a case, if Charlie receives a particle first, the operation that he applies on the system could affect the order in which Alice and Bob receive their particles afterwards, since we can conceive of a mechanism that selects different future scenarios for that order conditionally on the outcome of a measurement performed on the internal degrees of freedom of the particle coming out of Charlie's laboratory. This can result in the different scenarios depicted in Fig. 3. By construction, the outlined setup is compatible with the condition that the setting of each local experiment can be chosen independently of events in the causal past and causal elsewhere of that experiment, as well as of the causal configuration of such events and the experiment in question, so it would be associated with a valid causal process. Clearly, the dependence of the causal configuration of the parties on the parties' settings cannot be arbitrary, because it must agree with causality. To formulate the constraints on this dependence, we will need to introduce some more terminology.
For any fixed causal configuration κ(1, · · · , n) of the local experiments S = {1, · · · , n}, there are local experiments that are in no-one else's causal future. The full set of such local experiments, {i, j, · · · } ⊂ {1, · · · , n}, will be referred to as the local experiments that are first, or as the first consecutive set and will be denoted by [i, j, · · · ] I . Next, if the first consecutive set does not include all of the local experiments, there are local experiments whose causal past contains local experiments from [i, j, · · · ] I and only from [i, j, · · · ] I . The full set of these will be referred to as the local experiments that are second, or as the second consecutive set, and will be denoted by [k, l, · · · ] II . Then, if the first and second consecutive sets do not include all local experiments, there are local experiments whose causal past contains local experiments from both sets [i, j, · · · ] I and [k, l, · · · ] II and only from those sets. The full set of these will be referred to as the local experiments that are third, or as the third consecutive set, and will be denoted by [p, q, · · · ] III , and so on.
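The construction of the consecutive sets can be phrased as a short procedure. The sketch below assumes that a causal configuration is given as a set of ordered pairs `(x, y)` meaning x ≺ y, forming a strict partial order; this encoding and the function name are ours.

```python
def consecutive_sets(parties, before):
    """Split the parties into first, second, third, ... consecutive sets.

    `before` is a set of pairs (x, y) meaning x is in the causal past of y.
    At each step, the next set consists of the remaining parties that have
    no remaining party in their causal past."""
    remaining = set(parties)
    layers = []
    while remaining:
        layer = {y for y in remaining
                 if not any((x, y) in before for x in remaining)}
        if not layer:
            raise ValueError("relation contains a cycle; not a causal configuration")
        layers.append(layer)
        remaining -= layer
    return layers

# Charlie first, then Alice, then Bob.
chain = consecutive_sets("ABC", {("C", "A"), ("A", "B"), ("C", "B")})
# Charlie first; Alice and Bob causally independent of each other.
fork = consecutive_sets("ABC", {("C", "A"), ("C", "B")})
# Bob before Alice; Charlie causally independent of both.
mixed = consecutive_sets("ABC", {("B", "A")})

print(chain, fork, mixed)
```

Removing the earlier sets one at a time is equivalent to the wording above: a party lands in the K-th set exactly when its causal past lies within the first K − 1 sets and reaches into the (K − 1)-th.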
The following proposition will play a central role in our derivation of the form of multipartite causal processes.
Proposition II.3. Consider a causal process for S = {1, · · · , n}, n ≥ 1, with an associated joint probability distribution p(κ(1, · · · , n), o 1 , · · · , o n |s 1 , · · · , s n ), where κ(1, · · · , n) are the causal configurations of the local experiments. The probability for the first K consecutive sets to consist of specific local experiments, [1 I , · · · , n I ] I , · · · , [1 K , · · · , n K ] K , these experiments to have a specific causal configuration κ(1 I , · · · , n K ), the experiments in the first K − 1 consecutive sets to have a specific set of outcomes o 1 I , · · · , o n K−1 , and a given (possibly empty) subset {1 K , · · · , g K } ⊂ {1 K , · · · , n K } of the local experiments in the K th set (given up to relabeling) to have specific outcomes o 1 K , · · · , o g K , can depend non-trivially only on the settings of the local experiments indicated in the first K − 1 consecutive sets and the subset {1 K , · · · , g K }, where we define the 0 th set as the empty set. Proof S2 is given in the Appendix.

An important consequence of Proposition II.3 is that the probability for a given set of local experiments to be first is independent of the settings of all parties (this is the case of K = 1 and the subset {1 K , · · · , g K } being empty). For example, consider the different causal configurations of three parties, Alice (A), Bob (B), and Charlie (C), which are compatible with [C] I (Fig. 3). Each of the individual configurations has a probability that may depend on the setting of Charlie, but the overall probability for Charlie to be first, i.e., for any one of these configurations to be realized (which is the sum of the probabilities for the individual configurations), is independent of the settings of all parties, including Charlie. This independence of the first consecutive set from the settings of all parties will play a key role in our characterization of the structure of multipartite causal processes.
We will first develop the characterization for the case of three parties in order to illustrate the underlying principle, and then we will extend it to the general multipartite case.
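Table I. Groups of tripartite causal configurations whose probabilities are independent of the parties' settings, defined by the set of parties that are first.

The groups of tripartite causal configurations compatible with the different possibilities for the first consecutive set of parties are listed in Table I. In terms of these possibilities, the probabilities of a tripartite causal process can be written
$$p(o_A,o_B,o_C|s_A,s_B,s_C) = \sum_{[\cdots]_I} p([\cdots]_I)\,p(o_A,o_B,o_C|s_A,s_B,s_C,[\cdots]_I), \tag{19}$$
where
$$\sum_{[\cdots]_I} p([\cdots]_I) = 1, \qquad p([\cdots]_I) \geq 0,$$
and the probability distributions $p(o_A,\ldots|s_A,\ldots,[\cdots]_I)$ for a given $[\cdots]_I$, defined whenever $p([\cdots]_I) \neq 0$, describe processes which we will denote by $W^{[\cdots]_I}$. (Note that we can imagine that the variable $[\cdots]_I$ is associated with an event in the past of all local experiments, i.e., these can be thought of as proper pre-selected processes.) In a compact form, Eq. (19) can be written
$$W^{A,B,C}_c = \sum_{[\cdots]_I} p([\cdots]_I)\,W^{[\cdots]_I}, \tag{21}$$
i.e., the overall process is a mixture of processes defined conditionally on the different scenarios $[\cdots]_I$. The processes $W^{[\cdots]_I}$ cannot be arbitrary but must be compatible with causality, the conditions for which we derive next.

Consider the case in which one party is first, say $[C]_I$ (Fig. 3). There are three distinct causal configurations compatible with this case, in which $A\prec B$, $B\prec A$, or $A\nprec\nsucc B$ (Table I). We can expand $p(o_A,o_B,o_C|s_A,s_B,s_C,[C]_I)$ conditionally on these configurations as follows:
$$p(o_A,o_B,o_C|s_A,s_B,s_C,[C]_I) = \sum_{\kappa(A,B)} p(o_C|s_A,s_B,s_C,[C]_I)\,p(\kappa(A,B)|s_A,s_B,s_C,o_C,[C]_I)\,p(o_A,o_B|s_A,s_B,s_C,o_C,\kappa(A,B),[C]_I). \tag{22}$$
From Proposition II.3, we have that
$$p(o_C|s_A,s_B,s_C,[C]_I) = p(o_C|s_C,[C]_I). \tag{23}$$
Similarly, we have
$$p(\kappa(A,B),o_C|s_A,s_B,s_C,[C]_I) = p(\kappa(A,B),o_C|s_C,[C]_I),$$
which together with Eq. (23) implies
$$p(\kappa(A,B)|s_A,s_B,s_C,o_C,[C]_I) = p(\kappa(A,B)|s_C,o_C,[C]_I).$$
Substituting this in Eq. (22), we obtain
$$p(o_A,o_B,o_C|s_A,s_B,s_C,[C]_I) = p(o_C|s_C,[C]_I)\Big[\sum_{\kappa(A,B)} p(\kappa(A,B)|s_C,o_C,[C]_I)\,p(o_A,o_B|s_A,s_B,s_C,o_C,\kappa(A,B),[C]_I)\Big], \tag{26}$$
with
$$\sum_{\kappa(A,B)} p(\kappa(A,B)|s_C,o_C,[C]_I) = 1,$$
where the probability distributions $p(o_A,o_B|s_A,s_B,s_C,o_C,\kappa(A,B),[C]_I)$ describe bipartite processes for Alice and Bob for every fixed value of $(s_C,o_C)$. The assumption of causality implies conditions for these processes too.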
They must respect the no-signaling constraints imposed by the causal configuration $\kappa(A,B)$ they are conditioned on: the first one must involve no signaling from Bob to Alice, the second one must involve no signaling from Alice to Bob, and the third one must involve no signaling between Alice and Bob in either direction. This follows from the fact that
$$p(o_A,o_B|s_A,s_B,s_C,o_C,\kappa(A,B),[C]_I) = \frac{p(\kappa(A,B),o_A,o_B,o_C|s_A,s_B,s_C,[C]_I)}{p(\kappa(A,B),o_C|s_C,[C]_I)},$$
and the observation that since only the numerator on the right-hand side depends on $s_A$, $o_A$, $s_B$, and $o_B$, the respective no-signaling constraints on the quantity on the left-hand side follow from the requirement that the numerator is compatible with Eq. (2). Notice that the probabilities $p(o_C|s_C,[C]_I)$ in Eq. (26) define a reduced monopartite process for Charlie, $W^C$, while the probabilities enclosed by the square brackets define a conditional bipartite process $W^{A,B|C}_c$, which is causal (indicated by the subscript c) for every fixed $(s_C,o_C)$. In a compact form, this can be written
$$W^{[C]_I} = W^{A,B|C}_c \bullet W^{C}. \tag{29}$$
The form (29) is necessary for a causal process for which all causal configurations that have non-zero probabilities respect $[C]_I$ (in that case, a causal process of the general form (21) reduces to the term $W^{[C]_I}$). It is also sufficient, because this form provides an explicit joint probability distribution $p(\kappa(A,B,C),o_A,o_B,o_C|s_A,s_B,s_C)$ (equal to $p(o_C|s_C,[C]_I)\,p(\kappa(A,B)|s_C,o_C,[C]_I)\,p(o_A,o_B|s_A,s_B,s_C,o_C,\kappa(A,B),[C]_I)$ when $\kappa(A,B,C)$ is the configuration compatible with $[C]_I$ and $\kappa(A,B)$, and to zero otherwise) for which condition (2) is satisfied with respect to every party. Indeed, condition (2) is satisfied with respect to C since the probability for any party being in the causal past or causal elsewhere of C is zero. It is also satisfied with respect to A (similarly for B) since the no-signaling constraints respected by [...]

[...] where the probabilities $p(o_B,o_C|s_B,s_C,[B,C]_I)$ in Eq. (30) define a reduced bipartite process that involves no signaling between B and C, and the probabilities in the square brackets describe a conditional process for A. The fact that there is no signaling between B and C in the first process follows easily from Proposition II.3.
It turns out that the decomposition over different causal configurations does not yield any nontrivial conditions on the probabilities of the conditional process enclosed in the square brackets, i.e., we have the simpler form
$$W^{[B,C]_I} = W^{A|BC} \bullet W^{B,C}_{ns},$$
where $W^{B,C}_{ns}$ is a non-signaling bipartite process for Bob and Charlie, and $W^{A|BC}$ is a monopartite process for Alice conditional on the events in the laboratories of Bob and Charlie.

[...] (33)) is compatible with the case $[C]_I$ in which $C\prec B\prec A$, since the only constraints in that case are that Alice cannot signal to Bob and Charlie, and that Bob cannot signal to Charlie, which are satisfied by the probabilities in Eq. (32). Similarly, $W^{[B,C]_I}$ is compatible with $[C]_I$. A process $W^{[A,B,C]_I}$ is compatible with any causal configuration since it does not involve signaling between any of the parties. These observations suggest that we can group (in a generally non-unique way) the terms in the probabilistic mixture (21) so as to obtain a mixture of three processes
$$W^{A,B,C}_c = p_A\,W^{B,C|A}_c \bullet W^{A} + p_B\,W^{A,C|B}_c \bullet W^{B} + p_C\,W^{A,B|C}_c \bullet W^{C}, \qquad p_A, p_B, p_C \geq 0, \quad p_A + p_B + p_C = 1. \tag{36}$$
Obviously, the existence of a convex decomposition (36) is both necessary and sufficient for a tripartite process to be causal, since any process of the form (35) [...]. Therefore, these probabilities also form a polytope, and so do the probabilities of the form (36). The nontrivial facets of the polytope of probabilities (36) would define tripartite causal inequalities, whose violation indicates incompatibility with causal order. Examples of tripartite causal inequalities for binary inputs and outputs can be found in Refs. [7,9] (we have not investigated whether these are facets of the respective causal polytope).
The extension of the conditions for causality of a process to the case of n parties can be defined iteratively. The following theorem provides the generalization of condition (35):

Theorem II.1. A process for a set of parties S = {1, · · · , n}, n ≥ 1, is causal if and only if it can be written in the form
$$W^{1,\cdots,n}_c = \sum_{X \subset S,\ X \neq \emptyset} p_X\,W^{S\backslash X|X}_c \bullet W^{X}_{ns}, \tag{40}$$
where the sum is over all nonempty subsets X of the local experiments S, $p_X$ are suitable probability weights (which can be interpreted as the probability for X to be first, $p_X = p([X]_I)$), $S\backslash X$ denotes the relative complement of X in S, $W^{X}_{ns}$ is a non-signaling reduced process for X, and the conditional process $W^{S\backslash X|X}_c$ is either the trivial process (when X = S) or otherwise can be written in the same form (40) for every given value of the possible events in X. Proof S3 is given in the Appendix.
As in the bipartite and tripartite cases, we can simplify the conditions for an n-partite process to be causal by noticing that the constraints on a process compatible with a given set of k (1 ≤ k ≤ n) parties being first are compatible with the constraints on a process compatible with the case in which only a single one of the k parties is first. Therefore, by an argument analogous to the one in the tripartite case, we obtain the following alternative formulation of the conditions.

Theorem II.2. (Canonical causal decomposition): A causal process for n parties is one that can be written in the (generally non-unique) form
$$W^{1,\cdots,n}_c = \sum_{i=1}^{n} p_i\,W^{1,\cdots,i-1,i+1,\cdots,n|i}_c \bullet W^{i}, \tag{41}$$
with
$$p_i \geq 0, \qquad \sum_{i=1}^{n} p_i = 1,$$
where the (n − 1)-partite conditional process $W^{1,\cdots,i-1,i+1,\cdots,n|i}_c$ is either trivial (when n = 1) or has the form (41) for every value of the event in i.
Theorem II.2 (alternatively Theorem II.1) gives iteratively formulated necessary and sufficient conditions for a process to be causal in the general multipartite case. It can be understood as describing an 'unraveling' of the different possible sequences of operations in steps: first, the party that is first and his/her monopartite process are selected at random based on some probability distribution; next, the party that is second and his/her monopartite process are selected at random from some probability distribution that most generally can depend on the first party's setting and outcome; next, the party that is third and his/her monopartite process are selected from some probability distribution that most generally can depend on the settings and outcomes of the first two parties, and so on. We refer to this intuitive decomposition as the canonical causal decomposition of a causal process.
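The stepwise 'unraveling' can be made concrete with a toy tripartite process with dynamical causal order. The mechanism below is our own assumption, chosen for illustration: Charlie is always first and observes a uniformly random bit `oC`; the bit leaving his laboratory, `oC XOR sC`, decides whether Alice or Bob comes second; the party that comes second observes 0 and its setting is forwarded as the input of the party that comes third.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)

def joint(sA, sB, sC):
    """p(order, oA, oB, oC | sA, sB, sC) as a dictionary, built step by step."""
    dist = {}
    for oC in BITS:                 # step 1: Charlie is first, uniform input bit
        control = oC ^ sC           # bit leaving Charlie's laboratory
        if control == 0:            # step 2: the control bit picks who is second
            order, oA, oB = "C<A<B", 0, sA   # Alice sees 0, Bob sees Alice's setting
        else:
            order, oA, oB = "C<B<A", sB, 0   # Bob sees 0, Alice sees Bob's setting
        dist[(order, oA, oB, oC)] = Fraction(1, 2)
    return dist

def marginal(dist, idx):
    """Marginal of a joint distribution on the event components listed in idx."""
    out = {}
    for event, prob in dist.items():
        key = tuple(event[i] for i in idx)
        out[key] = out.get(key, 0) + prob
    return out

settings = list(product(BITS, repeat=3))

# The order of Alice and Bob, jointly with Charlie's outcome, depends on sC only.
order_depends_only_on_charlie = all(
    marginal(joint(*s), (0, 3)) == marginal(joint(0, 0, s[2]), (0, 3))
    for s in settings)
order_depends_on_sC = (marginal(joint(0, 0, 0), (0, 3))
                       != marginal(joint(0, 0, 1), (0, 3)))

# The full process signals both from Alice to Bob and from Bob to Alice.
a_to_b = marginal(joint(0, 0, 0), (2,)) != marginal(joint(1, 0, 0), (2,))
b_to_a = marginal(joint(0, 0, 0), (1,)) != marginal(joint(0, 1, 0), (1,))

print(order_depends_only_on_charlie, order_depends_on_sC, a_to_b, b_to_a)
```

In this sketch the overall table signals in both directions between Alice and Bob, yet for every fixed event in Charlie's laboratory the conditional process for Alice and Bob has a fixed order, which is the structure the canonical decomposition describes.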
By an argument analogous to the one in the tripartite case, one easily sees from Theorem II.2 that for any fixed number of settings and outcomes for each party, the causal probabilities for n parties form a polytope, provided that the causal probabilities for (n−1) parties form a polytope. By induction, this implies a polytope structure for the general multipartite case. The nontrivial facets of such a polytope define causal inequalities. Examples of n-partite causal inequalities, where n = 2k + 1, for binary inputs and outputs can be found in Refs. [7,9]. It would be interesting to check if these inequalities are facets of the respective causal polytope.

A. General quantum processes
The quantum process framework introduced in Ref. [4] is a particular theory within the general operational framework for pre-selected processes discussed in the previous section. It is based on a set of assumptions about the local operations of the parties and the joint probabilities for their outcomes, which we review next.
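The first main assumption is that of local quantum mechanics [4], which says that each local experiment is described as in standard quantum mechanics. Specifically, let $X_1$ and $X_2$ denote the input and output systems of a local experiment X. It is assumed that these systems are associated with Hilbert spaces $\mathcal{H}^{X_1}$ and $\mathcal{H}^{X_2}$ of dimensions $\dim\mathcal{H}^{X_1} = d_{X_1}$ and $\dim\mathcal{H}^{X_2} = d_{X_2}$, respectively. The set of operations that can be performed between the input and output systems is the set of standard quantum operations (or quantum instruments [35]). A quantum operation has a set of outcomes labeled by j = 1, . . . , n. Each outcome induces a specific transformation from the input to the output, which corresponds to a completely positive (CP) map $\mathcal{M}^X_j : \mathcal{L}(\mathcal{H}^{X_1}) \to \mathcal{L}(\mathcal{H}^{X_2})$, where $\mathcal{L}(\mathcal{H})$ denotes the space of linear operators over $\mathcal{H}$. Each such map can be written in the Kraus form $\mathcal{M}^X_j(\sigma) = \sum_{k=1}^{m} E_{jk}\,\sigma\,E_{jk}^{\dagger}$ with operators $E_{jk} : \mathcal{H}^{X_1} \to \mathcal{H}^{X_2}$. The set of CP maps $\{\mathcal{M}^X_j\}_{j=1}^{n}$ corresponding to all possible outcomes of a quantum operation has the property that $\sum_{j=1}^{n}\mathcal{M}^X_j$ is CP and trace-preserving (CPTP), which is equivalent to the condition $\sum_{j=1}^{n}\sum_{k=1}^{m} E_{jk}^{\dagger}E_{jk} = \mathbb{1}^{X_1}$. The second main assumption is that the joint probabilities for the outcomes of the operations of a set of parties, Alice, Bob, Charlie, · · · , are a non-contextual function of the local CP maps,
$$p(\mathcal{M}^A_i,\mathcal{M}^B_j,\mathcal{M}^C_k,\cdots) = \omega(\mathcal{M}^A_i,\mathcal{M}^B_j,\mathcal{M}^C_k,\cdots). \tag{43}$$
The requirement that local procedures agree with standard quantum mechanics implies that the function ω should be linear in the local CP maps [4].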
Such a linear function can be written in a convenient form by expressing each local CP map as a positive semidefinite operator using a version of the Choi-Jamiołkowski (CJ) isomorphism [29,30]. In this isomorphism, the CJ operator $M^{A_1A_2}_i \in \mathcal{L}(\mathcal{H}^{A_1}\otimes\mathcal{H}^{A_2})$ of a map $\mathcal{M}^A_i$ is defined [...] [31]. For the purposes of the present paper, the latter basis can be an arbitrary fixed basis. We note, however, that within the time-symmetric generalization of the framework developed in Ref. [16], this basis has a nontrivial physical significance related to the transformation of time reversal. Specifically, in that formulation, the Hilbert space $\mathcal{H}^{A_2}$ on which the CJ operator is defined is not interpreted as the original output Hilbert space of the CP map, but a time-reversed copy of it. In this paper, we will not be concerned with that formulation, but will simply regard the CJ representation of CP maps, defined for an arbitrary choice of basis, as a mathematical convenience. Using the CJ representation, the joint probabilities (43) can be written in the form
$$p(\mathcal{M}^A_i,\mathcal{M}^B_j,\mathcal{M}^C_k,\cdots) = \mathrm{Tr}\big[W^{A_1A_2B_1B_2C_1C_2\cdots}\big(M^{A_1A_2}_i\otimes M^{B_1B_2}_j\otimes M^{C_1C_2}_k\otimes\cdots\big)\big], \tag{44}$$
where $W^{A_1A_2B_1B_2C_1C_2\cdots}$ is an operator on $\mathcal{H}^{A_1}\otimes\mathcal{H}^{A_2}\otimes\mathcal{H}^{B_1}\otimes\mathcal{H}^{B_2}\otimes\mathcal{H}^{C_1}\otimes\mathcal{H}^{C_2}\otimes\cdots$.

The last main assumption behind the quantum process framework is that the local operations of the parties can be extended to act on input ancillas $A'_1$, $B'_1$, $C'_1$, · · · , that are allowed to be prepared in an arbitrary quantum state ρ [4]. The requirement that the probabilities are non-negative for any combination of local CP maps $\mathcal{M}^A_i$, $\mathcal{M}^B_j$, $\mathcal{M}^C_k$, · · · , on the extended systems requires that
$$W^{A_1A_2B_1B_2C_1C_2\cdots} \geq 0. \tag{45}$$
In addition, since the probabilities should sum up to 1 for a complete set of local outcomes, we have the condition that
$$\mathrm{Tr}\big[W^{A_1A_2B_1B_2C_1C_2\cdots}\big(M^{A_1A_2}\otimes M^{B_1B_2}\otimes M^{C_1C_2}\otimes\cdots\big)\big] = 1, \quad \forall\, M^{X_1X_2} \geq 0,\ \mathrm{Tr}_{X_2}M^{X_1X_2} = \mathbb{1}^{X_1},\ X = A, B, C, \cdots, \tag{46}$$
where $\mathrm{Tr}_{X_2}$ denotes partial trace over $X_2$. Here, we have used the fact that a linear map $\mathcal{M}^X$ is CPTP if and only if its CJ operator satisfies $M^{X_1X_2} \geq 0$ and $\mathrm{Tr}_{X_2}M^{X_1X_2} = \mathbb{1}^{X_1}$. An operator $W^{A_1A_2B_1B_2C_1C_2\cdots}$ that satisfies conditions (45) and (46) is called a process matrix [4]. Knowing the process matrix, by Eq. (44) we have the probabilities for the outcomes of any combination of local operations of the parties, i.e., the process matrix provides a complete description of a process. (Here, the set $S_X$ of possible settings of a given party is the set of quantum operations with the respective input and output systems.)

The process matrix can be expanded in a Hilbert-Schmidt basis of orthogonal matrices on the Hilbert spaces of the input and output systems of the parties, which is helpful in analyzing different properties of the correlations that the process allows. A Hilbert-Schmidt basis of $\mathcal{L}(\mathcal{H}^X)$ is given by a set of Hermitian operators $\{\sigma^X_\mu\}_{\mu=0}^{d_X^2-1}$, with $\sigma^X_0 = \mathbb{1}^X$, $\mathrm{Tr}\,\sigma^X_\mu\sigma^X_\nu = d_X\delta_{\mu\nu}$, and $\mathrm{Tr}\,\sigma^X_j = 0$ for $j = 1, \ldots, d_X^2 - 1$. In such a basis, a process matrix can be written
$$W^{A_1A_2B_1B_2C_1C_2\cdots} = \sum_{i,j,k,l,m,n,\cdots} w_{ijklmn\cdots}\,\sigma^{A_1}_i\otimes\sigma^{A_2}_j\otimes\sigma^{B_1}_k\otimes\sigma^{B_2}_l\otimes\sigma^{C_1}_m\otimes\sigma^{C_2}_n\otimes\cdots, \qquad w_{ijklmn\cdots} \in \mathbb{R},\ \forall\, i,j,k,l,m,n,\cdots. \tag{47}$$
It turns out that many properties of process matrices can be formulated entirely as statements about the nonzero terms in the above expansion [4]. For this purpose, it is convenient to introduce the following terminology. Non-zero terms proportional to $\sigma^{A_1}_i\otimes\mathbb{1}^{rest}$ (i ≥ 1) will be called terms of type $A_1$, non-zero terms proportional to $\sigma^{A_2}_i\otimes\sigma^{B_1}_j\otimes\mathbb{1}^{rest}$ (i, j ≥ 1) will be called terms of type $A_2B_1$, etc. Every process matrix also contains a non-zero term proportional to the identity operator on all systems. This term will be referred to as of type $\mathbb{1}$, or as the identity term.
In Ref. [4], it was shown that, in the bipartite case, an operator $W^{A_1A_2B_1B_2}$ satisfies condition (46) if and only if it contains at most terms from the following types: $\mathbb{1}$, $A_1$, $B_1$, $A_1B_1$, $A_2B_1$, $A_1B_2$, $A_1A_2B_1$, $A_1B_1B_2$. This rule also includes the monopartite case, which is obtained when the input and output systems of one of the parties are trivial (the one-dimensional Hilbert space $\mathbb{C}^1$). Specifically, a monopartite operator $W^{A_1A_2}$ satisfies condition (46) if and only if it contains at most terms of type $\mathbb{1}$ and $A_1$. The types of allowed terms can be generalized to the n-partite case as follows.
Proposition III.1. An operator of the form (47) satisfies condition (46) if and only if in addition to the identity term it contains at most terms in which there is a nontrivial σ operator on X 1 and a trivial one (the identity operator) on X 2 for some party X ∈ {A, B, C, · · · }.
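The rule in Proposition III.1 is easy to enumerate. The sketch below encodes a type of term as one pair of flags per party, (nontrivial on X1, nontrivial on X2), and counts the types the rule permits; the encoding and the function names are ours.

```python
from itertools import product

def allowed(term):
    """Proposition III.1: the identity term, or any term with a nontrivial
    operator on X1 and the identity on X2 for at least one party X."""
    if all(flags == (0, 0) for flags in term):
        return True
    return any(flags == (1, 0) for flags in term)

def allowed_types(n):
    """All allowed types of terms for n parties."""
    party_states = list(product((0, 1), repeat=2))
    return [t for t in product(party_states, repeat=n) if allowed(t)]

def label(term, names="ABCDEFGH"):
    """Readable name of a type, e.g. ((1, 1), (1, 0)) -> 'A1A2B1'."""
    parts = [f"{names[i]}{k + 1}"
             for i, flags in enumerate(term)
             for k, flag in enumerate(flags) if flag]
    return "".join(parts) or "1"

counts = {n: len(allowed_types(n)) for n in (1, 2, 3)}
bipartite = sorted(label(t) for t in allowed_types(2))
print(counts, bipartite)
```

For three parties the count is 38 out of 64, in agreement with the number of types quoted for Table II, and for one party only the identity and A1 remain.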
In the Appendix, we present Proof S4 of the above proposition for the case of three parties; the general case follows accordingly. From the analysis in Proof S4 we see that a general operator $W^{A_1A_2B_1B_2C_1C_2}$ can contain up to 64 types of terms. The condition for normalization of probabilities (46) narrows the types of terms to the 38 types listed in Table II. The positive semidefiniteness condition (45) does not limit any further the allowed types of terms, because one can conceive of a positive semidefinite matrix containing nonzero terms of any chosen type (this can be ensured by taking the nontrivial σ terms with non-zero coefficients of sufficiently small magnitude relative to the weight of the identity term, which is always fixed). Thus, an operator $W^{A_1A_2B_1B_2C_1C_2}$ is a valid tripartite process matrix, i.e., it satisfies conditions (45) and (46), if and only if it satisfies condition (45) and contains only terms of the types listed in Table II, where the identity term comes with the weight $1/(d_{A_1}d_{B_1}d_{C_1})$. In a similar way, one derives the allowed types of terms in the general n-partite case. (For an alternative formulation of the conditions for an operator to be a valid process matrix, see Ref. [19].)

The types of terms that appear in the expansion of a process matrix are closely related to the signaling between the parties that the process allows. For example, a bipartite process involves signaling from Bob to Alice if and only if the process matrix contains terms of type $A_1B_2$ or $A_1B_1B_2$ [4]. To state the condition for (no) signaling in the multipartite case, it is convenient to introduce the following terminology (see also Ref. [19]). Consider a Hilbert-Schmidt term $\sigma^{A_1}_i\otimes\sigma^{A_2}_j\otimes\sigma^{B_1}_k\otimes\sigma^{B_2}_l\otimes\sigma^{C_1}_m\otimes\sigma^{C_2}_n\otimes\cdots$ as in Eq. (47). The restriction of this term onto, say, subsystems [...]

Proposition III.2. An n-partite process matrix for a set of parties {1, · · · , n} does not permit signaling from, say, (1 and 2 and · · · and k) to (k + 1 and k + 2 and · · · and n) if and only if it contains only terms whose restrictions onto $1_1 1_2\cdots k_1 k_2$ are of the allowed types for a process matrix on {1, · · · , k} as described in Proposition III.1. Proof S5 is given in the Appendix.
As an example, a tripartite quantum process that is causal and compatible with a situation in which Charlie is first (Fig. 3) should involve no signaling from Alice and Bob to Charlie, and hence it can only contain the types of terms listed in Table III. These constraints on the allowed types of terms imposed by causal order will turn out to play an important role in the characterization of the so-called causally separable quantum processes, which we define in the next subsection.
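The criterion of Proposition III.2 can be enumerated in the same spirit. The sketch below uses our own encoding of a type of term as one pair of flags per party, (nontrivial on X1, nontrivial on X2), and takes the restriction of a type onto a subset of parties to mean keeping only the flags of those parties; this reading of 'restriction' is an assumption of the sketch.

```python
from itertools import product

def allowed(term):
    """Allowed types per Proposition III.1 for the parties present in `term`."""
    if all(flags == (0, 0) for flags in term):
        return True
    return any(flags == (1, 0) for flags in term)

def label(term, names):
    """Readable name of a type, e.g. ((1, 0), (0, 1)) -> 'A1B2'."""
    parts = [f"{names[i]}{k + 1}"
             for i, flags in enumerate(term)
             for k, flag in enumerate(flags) if flag]
    return "".join(parts) or "1"

def types_without_signaling(n, sources):
    """Allowed n-partite types whose restriction onto the parties in `sources`
    is itself an allowed type, i.e. no signaling from `sources` to the rest."""
    party_states = list(product((0, 1), repeat=2))
    return [t for t in product(party_states, repeat=n)
            if allowed(t) and allowed(tuple(t[i] for i in sources))]

# Bipartite check (A = party 0, B = party 1): no signaling from Bob to Alice.
all_bipartite = {label(t, "AB") for t in product(list(product((0, 1), repeat=2)), repeat=2)
                 if allowed(t)}
kept = {label(t, "AB") for t in types_without_signaling(2, [1])}
excluded = all_bipartite - kept

# Tripartite case: no signaling from Alice and Bob (parties 0, 1) to Charlie.
charlie_first = types_without_signaling(3, [0, 1])
print(sorted(excluded), len(charlie_first))
```

In the bipartite case the excluded types are exactly A1B2 and A1B1B2, matching the statement quoted from Ref. [4]; the tripartite list is what this encoding yields for a process with no signaling from Alice and Bob to Charlie.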

B. Causally separable quantum processes
Given that quantum processes have a simple description in terms of process matrices, it is natural to ask whether the property of causality can also be expressed in terms of simple conditions on these matrices. Consider a bipartite quantum process for Alice and Bob, and assume that it is a fixed-order process compatible with the causal configuration $A \prec B$. In that case, as argued earlier, the only constraint imposed by causal order is that the process should involve no signaling from Bob to Alice. As pointed out in the previous subsection, there can be signaling from Bob to Alice if and only if the process matrix $W^{A_1A_2B_1B_2}$ contains terms of type $A_1B_2$ or $A_1B_1B_2$. Therefore, a process matrix is compatible with $A \prec B$ if and only if none of these types of terms appear in its expansion. This means that such a process matrix has the form
$$W^{A_1A_2B_1B_2} = W^{A_1A_2B_1} \otimes \mathbb{1}^{B_2}, \qquad (48)$$
where $W^{A_1A_2B_1}$ contains at most terms of type $1$, $A_1$, $B_1$, $A_1B_1$, $A_2B_1$, $A_1A_2B_1$. (This is equivalent to saying that $W^{A_1A_2B_1}$ is a valid process matrix for the case where Bob has a trivial output system, $\mathcal{H}^{B_2} = \mathbb{C}^1$.) Similarly, in the case where $A \nprec\nsucc B$, the process matrix has the form
$$W^{A_1A_2B_1B_2} = W^{A_1B_1} \otimes \mathbb{1}^{A_2B_2}, \qquad (49)$$
where $W^{A_1B_1} \geq 0$, $\mathrm{Tr}\,W^{A_1B_1} = 1$. Such a process is realized in a situation in which Alice and Bob receive input systems in a joint quantum state with a density matrix $W^{A_1B_1}$, and their output systems are discarded. We can unify these two conditions to write down the form of a process matrix compatible with $B \npreceq A$,
$$W^{B \npreceq A} = W^{A_1A_2B_1} \otimes \mathbb{1}^{B_2}, \qquad (50)$$
which is identical to (48), where $W^{A_1A_2B_1}$ is a valid process matrix for the case where $\mathcal{H}^{B_2} = \mathbb{C}^1$.
As shown in Ref. [37] within a different framework, all process matrices of the type (50) can be realized by embedding the experiments of Alice and Bob in a quantum circuit, so that Bob's experiment does not precede Alice's experiment in the order of the circuit composition. Most generally, this corresponds to providing Alice with an input system that is entangled with an ancilla, then sending Alice's output together with the ancilla through a quantum channel into Bob's input, and then discarding Bob's output. Such a process is referred to as a quantum 'channel with memory'.
As we have seen earlier, a bipartite causal process is one that can be written in the form (16), where $\mathcal{W}^{A \npreceq B}$ and $\mathcal{W}^{B \npreceq A}$ are two processes compatible with $A \npreceq B$ and $B \npreceq A$, respectively. It is then tempting to conjecture that the class of causal quantum processes might be those whose process matrices can be written in the form
$$W^{A_1A_2B_1B_2} = q\,W^{A \npreceq B} + (1-q)\,W^{B \npreceq A}, \quad q \in [0,1], \qquad (51)$$
where $W^{A \npreceq B}$ and $W^{B \npreceq A}$ have the form defined in Eq. (50). Certainly, since the probabilities for the outcomes in the quantum process framework are linear functions of the process matrix, a process matrix of the form (51) describes a causal process. However, the condition for a process to be causal (Eq. (16)) does not imply that $\mathcal{W}^{A \npreceq B}$ and $\mathcal{W}^{B \npreceq A}$ in the convex decomposition of the process should themselves be quantum processes; only their convex mixture needs to be. While it is conceivable that the structure of quantum processes might imply the form (51) (indeed, this has been shown to hold for a limited class of bipartite quantum processes [14]), there is no obvious reason to expect this to hold in the general case. In fact, we will see that the natural generalization of condition (51) to the multipartite case is not equivalent to the condition that a process is causal (the same holds also for other possible generalizations that we will discuss later). Very recently, the same was shown to hold also in the bipartite case, by Feix, Araújo, and Brukner [39].
A bipartite quantum process that admits the decomposition (51) was called causally separable [4]. One way to think of the relation between causal and causally separable quantum processes is in analogy with the relation between Bell-local and separable (non-entangled) quantum states. Given an arbitrary multipartite quantum state with a density matrix $\rho^{AB\cdots}$, the probabilities for the outcomes of a set of local POVM measurements are given by
$$p(i, j, \cdots) = \mathrm{Tr}\left[\rho^{AB\cdots}\left(M_i^A \otimes M_j^B \otimes \cdots\right)\right]. \qquad (52)$$
A Bell-local state is one for which the joint probabilities for the outcomes of any combination of local measurements admit a local hidden variable description (and hence such a state cannot be used to violate any Bell inequality [40]), i.e., the joint distribution can be written as a probabilistic mixture of factorizing local distributions,
$$p(o_A, o_B, \cdots | s_A, s_B, \cdots) = \sum_\lambda p(\lambda)\, p(o_A|s_A, \lambda)\, p(o_B|s_B, \lambda) \cdots, \qquad (53)$$
where $\lambda$ is some variable with a probability distribution $p(\lambda)$, $s_A, s_B, \cdots$ are the local measurement settings (each corresponding to a specific local POVM measurement $\{M_i^A\}_{i\in O_A}$, $\{M_j^B\}_{j\in O_B}$, $\cdots$), and $o_A, o_B, \cdots$ are their outcomes (corresponding to $i, j, \cdots$ in the expression (52)). A separable quantum state is one for which each of the local distributions $p(o_A|s_A, \lambda)$, $p(o_B|s_B, \lambda)$, $\cdots$ in Eq. (53) itself can be thought of as arising from the respective local measurement being applied on a local quantum state, which means that the density matrix of the state can be written as
$$\rho^{AB\cdots} = \sum_\lambda p(\lambda)\, \rho_\lambda^A \otimes \rho_\lambda^B \otimes \cdots. \qquad (54)$$
A separable quantum state is clearly Bell local, but the reverse is known not to be true [41]. The relation between causal (16) and causally separable (51) bipartite quantum processes can be seen in an analogous way: a causally separable process is one for which the processes into which we decompose the process are themselves valid quantum processes.
Here, we propose to extend the notion of causal separability to the multipartite case based on this analogy.
Definition III.1. (Causally separable quantum process): A quantum process is called causally separable if and only if it can be decomposed in the canonical form given by Theorem II.2, with the additional condition that each process on the right-hand side of Eq. (41) is a quantum process. (Note that since the canonical form is defined iteratively, the latter is understood to hold for all conditional processes in this definition.) By a direct analogy, causally separable processes can be defined for any theory formulated in the process framework, but here we will be interested specifically in quantum processes. The process matrix of a causally separable quantum process will be called a causally separable process matrix.
C. Non-equivalence between causal and causally separable multipartite processes: a tripartite example
We now give an example of a tripartite quantum process that is causal but causally non-separable, which demonstrates that these two concepts are not equivalent, at least in the case of more than two parties. A similar conclusion based on the same example has been obtained independently by Costa and is presented in Ref. [19].
The example is inspired by the idea of superposition of causally ordered quantum circuits by means of the so-called quantum switch technique [3], where the order of two black-box quantum operations is made to depend on the value of a quantum control bit prepared in a superposition of the two possible logical values. Each of the input and output systems of Alice and Bob in our example will be assumed to be a two-dimensional (qubit) system. We can imagine that this is the spin degree of freedom of a spin-1/2 particle, which enters each laboratory, interacts with the devices inside, and leaves. The particle could be prepared so as to go in superposition along two different possible paths: along one path, it goes first through Alice's laboratory and then through Bob's, whereas along the other path it goes first through Bob's laboratory and then through Alice's. For simplicity, we can imagine that the experiment is arranged in such a way that the particle would always go through Bob's laboratory at a fixed time, but depending on the value of the control bit, it would go through Alice's laboratory before or after that. It is assumed that independently of the time at which the system may go through Alice's laboratory in a given run, Alice would apply the same operation on it. To understand the effect of such a setup, consider first the case in which Alice and Bob each apply a unitary operation on the system, $U_A$ and $U_B$, respectively. Let us denote the Hilbert space of the control qubit (path degree of freedom) by $\mathcal{H}^c$, and that of the system (spin degree of freedom) by $\mathcal{H}^s$.
If $|0\rangle^c$ corresponds to the path in which Alice is before Bob and $|1\rangle^c$ to the path in which Bob is before Alice, and we initially prepare the particle in the state, say, $\rho^{cs}_{in} = |\Psi\rangle\langle\Psi|^{cs}_{in}$, where $|\Psi\rangle^{cs}_{in} = (\alpha|0\rangle^c + \beta|1\rangle^c)|\psi\rangle^s$, at the end it will be in the state $\rho^{cs}_{fin} = |\Psi\rangle\langle\Psi|^{cs}_{fin}$, where $|\Psi\rangle^{cs}_{fin} = \alpha|0\rangle^c\, U_B U_A|\psi\rangle^s + \beta|1\rangle^c\, U_A U_B|\psi\rangle^s$. If a third party, Charlie, performs an operation on the joint system $\mathcal{H}^c \otimes \mathcal{H}^s$ subsequently, he can distinguish this situation from a situation in which the order between the operations of Alice and Bob is conditioned on a classical bit (e.g., modeled by the initial state of the control qubit being in a 'classical' mixture of the two possible values, $|\alpha|^2|0\rangle\langle 0|^c + |\beta|^2|1\rangle\langle 1|^c$, instead of a coherent superposition) by performing a suitable measurement. In fact, it was shown in Ref. [6] that by exploiting such a coherent strategy, Charlie can perfectly distinguish between a pair of unitaries $U_A$ and $U_B$ that commute or anticommute by using each of the unitaries only once, which is impossible if the order of the unitaries is conditioned on a classical bit. An experimental demonstration of this effect was recently reported in Ref. [17].
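The commuting versus anticommuting discrimination can be illustrated with a few lines of plain Python. This is a minimal sketch under stated assumptions, not the protocol of Ref. [6] as written there: it takes $\alpha = \beta = 1/\sqrt{2}$, lets Charlie measure only the control qubit in the $|\pm\rangle$ basis, and uses Pauli matrices as example unitaries. The function names and the test state are hypothetical choices.

```python
X = ((0, 1), (1, 0))
Z = ((1, 0), (0, -1))


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def apply(m, v):
    """Apply a 2x2 matrix to a 2-component state vector."""
    return tuple(sum(m[i][k] * v[k] for k in range(2)) for i in range(2))


def norm_sq(v):
    return sum(abs(c) ** 2 for c in v)


def coherent_control(u_a, u_b, psi):
    """(P(+), P(-)) for the control qubit prepared in (|0> + |1>)/sqrt(2)."""
    v0 = apply(matmul(u_b, u_a), psi)  # control |0>: Alice acts first
    v1 = apply(matmul(u_a, u_b), psi)  # control |1>: Bob acts first
    plus = tuple((a + b) / 2 for a, b in zip(v0, v1))
    minus = tuple((a - b) / 2 for a, b in zip(v0, v1))
    return norm_sq(plus), norm_sq(minus)


def classical_control(u_a, u_b, psi):
    """Same measurement when the control is an equal classical mixture."""
    v0 = apply(matmul(u_b, u_a), psi)
    v1 = apply(matmul(u_a, u_b), psi)
    p_plus = 0.5 * (0.5 * norm_sq(v0)) + 0.5 * (0.5 * norm_sq(v1))
    return p_plus, 1.0 - p_plus


psi = (0.6, 0.8j)  # arbitrary normalized spin state
print(coherent_control(Z, Z, psi))   # commuting pair
print(coherent_control(X, Z, psi))   # anticommuting pair
print(classical_control(X, Z, psi))  # classically controlled order
```

With coherent control the outcome $|+\rangle$ occurs with certainty for the commuting pair and $|-\rangle$ for the anticommuting pair, whereas with a classical control bit the two outcomes are equally likely and carry no information about the pair.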
In the general case, the operations of Alice and Bob need not be unitary and may have different possible outcomes. Every such operation, however, can be seen as the result of a joint unitary on the input system and a local ancilla, such that the outcome remains stored on the local ancilla in a particular basis. Similarly, any local 'choice' of operation can be modeled by a larger unitary on all systems involved plus a local ancilla that carries the 'choice' variable. Thus, we can have Alice and Bob perform general operations in this setup by purifying their local operations to unitaries and deferring the reading of their outcomes to the end of the whole experiment. (Note that in order not to destroy the superposition, the whole experiment needs to be performed coherently, which may be unrealistic for local operations performed by macroscopic devices, but is in principle compatible with standard quantum mechanics.) In our example, we will take $\alpha = \beta = 1/\sqrt{2}$, and we will assume, as described above, that Charlie can operate on both the path and spin degrees of freedom of the particle after it has interacted with Alice and Bob. In other words, Charlie's input system will be four dimensional, and we will formally decompose it into two qubit subsystems, $\mathcal{H}^{C_1} = \mathcal{H}^{C_1^c} \otimes \mathcal{H}^{C_1^s}$, where $\mathcal{H}^{C_1^c}$ and $\mathcal{H}^{C_1^s}$ correspond to the path and spin degrees of freedom, respectively. Since Charlie operates last, we do not need to introduce a non-trivial output system for him, i.e., his output system will be assumed one-dimensional. The process matrix relating the local experiments of Alice, Bob, and Charlie in this setup can easily be obtained by describing the experiment in the form of a circuit in which Alice's operation is represented by two controlled operations at two possible times, such that one of them would act nontrivially depending on the state of the control qubit (left diagram on Fig. 5).
Using the CJ representation of the channels connecting the different boxes, we obtain
$$W^{A_1A_2B_1B_2C_1C_2} = |w\rangle\langle w|, \qquad (55)$$
where
$$|w\rangle = \frac{1}{\sqrt{2}}\left(|\psi\rangle^{A_1}|\Phi^+\rangle^{A_2B_1}|\Phi^+\rangle^{B_2C_1^s}|0\rangle^{C_1^c} + |\psi\rangle^{B_1}|\Phi^+\rangle^{B_2A_1}|\Phi^+\rangle^{A_2C_1^s}|1\rangle^{C_1^c}\right), \qquad (56)$$
with $|\Phi^+\rangle = |00\rangle + |11\rangle$. It can be verified that $W^{A_1A_2B_1B_2C_1C_2}$ contains only allowed terms. This process matrix is a rank-one projector, and hence it cannot be written as a convex mixture of different process matrices. Therefore, if it is causally separable, it must be of one of the types $W^{(A,B)\npreceq C}$, $W^{(B,C)\npreceq A}$, or $W^{(A,C)\npreceq B}$. But each of these types of process matrices should permit no signaling from two of the parties to the third one (e.g., in the first case there can be no signaling from Alice and Bob to Charlie). However, the above process matrix permits signaling to any of the parties from some of the other parties. Indeed, to see that there can be signaling from Alice and Bob to Charlie, imagine that Alice and Bob choose to perform the unitary operations $U_A$ and $U_B$. In this case, Charlie will receive the state $\left(|0\rangle^{C_1^c}\, U_B U_A|\psi\rangle^{C_1^s} + |1\rangle^{C_1^c}\, U_A U_B|\psi\rangle^{C_1^s}\right)/\sqrt{2}$, which can be different for different choices of the unitaries of Alice and Bob, and can therefore yield different probabilities for the outcomes of some measurement of Charlie. To see that we can have signaling from Alice to Bob or vice versa, notice first that there can be no signaling from Charlie to Alice and Bob (Charlie has a trivial output system). This means that we have a well-defined reduced process for Alice and Bob, whose process matrix is
$$W^{A_1A_2B_1B_2} = \frac{1}{2}\,|\psi\rangle\langle\psi|^{A_1}\otimes|\Phi^+\rangle\langle\Phi^+|^{A_2B_1}\otimes\mathbb{1}^{B_2} + \frac{1}{2}\,|\psi\rangle\langle\psi|^{B_1}\otimes|\Phi^+\rangle\langle\Phi^+|^{B_2A_1}\otimes\mathbb{1}^{A_2}.$$
This is a causally separable bipartite process matrix that can be interpreted as describing an equally weighted probabilistic mixture of two fixed-order processes: the first one describes a situation in which the input state $|\psi\rangle$ is sent into Alice's input, her output is sent into Bob's input through the identity channel, and Bob's output is discarded; the second one describes the analogous situation with the roles of Alice and Bob interchanged.
Clearly, since in the first situation there is an ideal channel from Alice to Bob, there can be signaling from Alice to Bob in this process (even if imperfect on average), and similarly from Bob to Alice. Therefore, the process matrix given by Eqs. (55) and (56) is not causally separable. The fact that the process is causal follows immediately from the fact that the reduced process for Alice and Bob is causally separable (and hence also causal). Specifically, the tripartite process is the composition of Charlie's conditional process with the causally separable reduced process of Alice and Bob, which is the form of a causal process. This observation suggests how the probabilities of Alice, Bob, and Charlie can be simulated without using a quantum switch, if we allow the parties to have larger input and output systems. Since the reduced probabilities of Alice and Bob can be realized by conditioning their order on a classical random bit, all that is needed in order for the tripartite process to be reproduced in this way is for Charlie to receive the information about the settings and outcomes of Alice and Bob so as to produce the necessary $p(o_C|s_A, o_A, s_B, o_B, s_C)$. Therefore, if in addition to the qubit system that goes between Alice and Bob there is another (possibly infinite-dimensional) system on which each party writes down his/her setting and outcome (right diagram on Fig. 5), and this system at the end enters Charlie's laboratory (or, alternatively, the state on Charlie's original input system is prepared based on this information), the process can be simulated using classically random causal configurations.
By a similar argument we can construct a large class of multipartite processes that are causal but not causally separable. Consider a situation in which the order of all but one of the parties is conditioned on the state of a control system prepared in superposition, and subsequently all systems on which these parties have operated together with the control system are sent into the input of the last party. If all systems were initially prepared in a pure state and all channels are unitary ones, the process matrix will have rank 1, and unless the process is fixed-order causal, it cannot be causally separable. Yet, it will be causal because the reduced process for all parties except for the last one will be causally separable (and hence causal) due to the fact that when we trace out the control system, the process for these parties would be a classical probabilistic mixture of fixed-order processes. Since the full process is obtained by multiplying the conditional process of the last party with the reduced process of the previous ones, the full process is causal. It can be simulated using classical control of the order of the parties by allowing larger input and output systems by which the settings and outcomes of all other parties are made available to the last one.
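The claim that tracing out Charlie's systems leaves Alice and Bob with an equal mixture of two fixed-order processes can be checked numerically for the tripartite example. The sketch below assumes the vector $|w\rangle$ of the switch process is $\frac{1}{\sqrt{2}}\big(|\psi\rangle^{A_1}|\Phi^+\rangle^{A_2B_1}|\Phi^+\rangle^{B_2C_1^s}|0\rangle^{C_1^c} + |\psi\rangle^{B_1}|\Phi^+\rangle^{B_2A_1}|\Phi^+\rangle^{A_2C_1^s}|1\rangle^{C_1^c}\big)$ with unnormalized $|\Phi^+\rangle = |00\rangle + |11\rangle$, which is our reading of the circuit description; the state $|\psi\rangle$ and all function names are hypothetical.

```python
from itertools import product
from math import sqrt

BITS = (0, 1)
psi = (0.6, 0.8j)  # arbitrary normalized input state (illustrative choice)


def conj(z):
    return complex(z).conjugate()


def w_amp(a1, a2, b1, b2, cc, cs):
    """Amplitude of |w> on the basis state |a1 a2 b1 b2>|cc>|cs>."""
    if cc == 0:  # Alice before Bob: A2 -> B1 and B2 -> Charlie's spin input
        return psi[a1] * (a2 == b1) * (b2 == cs) / sqrt(2)
    # Bob before Alice: B2 -> A1 and A2 -> Charlie's spin input
    return psi[b1] * (b2 == a1) * (a2 == cs) / sqrt(2)


def reduced_ab(row, col):
    """Matrix element of Tr_C |w><w| on Alice's and Bob's systems."""
    return sum(w_amp(*row, cc, cs) * conj(w_amp(*col, cc, cs))
               for cc, cs in product(BITS, repeat=2))


def mixture(row, col):
    """Equal mixture of the two fixed-order bipartite process matrices."""
    a1, a2, b1, b2 = row
    c1, c2, d1, d2 = col
    alice_first = (psi[a1] * conj(psi[c1])
                   * (a2 == b1) * (c2 == d1) * (b2 == d2))
    bob_first = (psi[b1] * conj(psi[d1])
                 * (b2 == a1) * (d2 == c1) * (a2 == c2))
    return 0.5 * (alice_first + bob_first)


configs = list(product(BITS, repeat=4))
max_diff = max(abs(reduced_ab(r, c) - mixture(r, c))
               for r in configs for c in configs)
trace_w = sum(abs(w_amp(*bits)) ** 2 for bits in product(BITS, repeat=6))
print(max_diff, trace_w)
```

The cross terms between the two branches vanish because they carry orthogonal control states, which is why only the classical mixture survives the partial trace.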

D. Non-causality can be activated by shared entanglement
We now show another peculiar property of the concepts of causality and causal separability of quantum processes. One of the key assumptions in the derivation of the quantum process matrix framework is that every process can be extended by supplying the parties with ancillary input systems in an arbitrary quantum state, yielding another valid process. Intuitively, since a joint input state is a non-signaling process that is compatible with any causal configuration, one may expect that adding such a state to a causal quantum process would again yield a causal process. We now show that this is not the case. We refer to this effect as activation of non-causality.
We give a particular example of a tripartite causal quantum process matrix, constructed on the basis of the bipartite process matrix presented in Ref. [4] (referred to below as the process matrix (58)). Conditionally on any event in Charlie's laboratory, which is most generally described by some CP map with CJ operator $M^{C_2} \geq 0$, Alice and Bob are left with a bipartite process. Its process matrix is a linear combination of the identity and terms containing only $\sigma_z$ operators on different subsystems, i.e., it is diagonal in a given local basis (the $\{|0\rangle, |1\rangle\}$ basis for each subsystem). It was shown in Ref. [4] that all such bipartite process matrices are causally separable (though we remark that the same was shown not to hold for multipartite processes [9]). Imagine now that we supply Bob and Charlie with the entangled input state $\frac{1}{2}|\Phi^+\rangle\langle\Phi^+|^{C_1'B_1'}$, which yields the new process matrix (62), the tensor product of the original process matrix with this state. If Charlie performs the identity unitary channel from $C_1'$ to $C_2$ in his laboratory, which is described by $M^{C_1'C_2} = |\Phi^+\rangle\langle\Phi^+|^{C_1'C_2}$, Alice and Bob are left with the bipartite process matrix (63). This can be easily seen from the fact that taking the partial trace of $W^{A_1A_2B_1B_1'B_2C_1'C_2}$ with the operator $|\Phi^+\rangle\langle\Phi^+|^{C_1'C_2}$ is formally identical (up to a normalization) to a local projection in a quantum-state teleportation protocol [38], which amounts to 'teleporting' the part of the matrix on $C_2$ onto $B_1'$. (Note that the standard notion of teleportation is defined for quantum states and not process matrices, and the protocol requires a correcting operation on the receiver's side since a projection of the kind above, which does not require correction, cannot be accomplished deterministically [38].) The process matrix (63) is similar to (58), except that the local operators on $B_1$ in the nontrivial $\sigma$ terms in Eq. (58) are now on $B_1'$, and there is a $\sigma_z$ operator on $B_1$ in each such term. This process matrix is non-causal, because it allows Alice and Bob to obtain any correlations that they could obtain using the non-causal process matrix (58).
This can be done as follows. Alice always performs the same operations that she would perform with the process matrix (58). Bob performs a measurement on system $B_1$ in the $\{|0\rangle, |1\rangle\}$ basis. If he obtains the outcome $|0\rangle$, then it is as if Alice and Bob share the process matrix (58) with $B_1'$ in the place of $B_1$. He will then apply any operation from $B_1'$ to $B_2$ that he would apply from $B_1$ to $B_2$ with the process matrix (58), which yields the same joint probabilities for Alice and Bob as those with the process matrix (58). If Bob obtains the outcome $|1\rangle$ for his measurement on $B_1$, then it is as if Alice and Bob share the same process matrix as (58) with $B_1'$ in the place of $B_1$ but with a minus sign in front of each of the two nontrivial $\sigma$ terms. This process matrix is equivalent to the previous one under a change of basis by the unitary $\sigma_y^{B_1'}$. Therefore, Bob can simply apply from $B_1'$ to $B_2$ the same operations he would apply from $B_1$ to $B_2$ with the process matrix (58) but transformed by the unitary transformation $\sigma_y^{B_1'}$. Again, this yields the same joint probabilities for Alice and Bob as with the process matrix (58). In particular, Alice and Bob can use this strategy to violate the causal inequality described in Ref. [4]. The process matrix (63) is thus non-causal, and so is the tripartite process matrix (62).
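The basis-change step in Bob's strategy relies on conjugation by $\sigma_y$ flipping the sign of both $\sigma_z$ and $\sigma_x$, so that a minus sign in front of every nontrivial term on $B_1'$ can be undone by a local unitary. A short check of this Pauli identity, as a sketch with hypothetical helper names:

```python
X = ((0, 1), (1, 0))
Y = ((0, -1j), (1j, 0))
Z = ((1, 0), (0, -1))
I2 = ((1, 0), (0, 1))


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(tuple(complex(a[j][i]).conjugate() for j in range(2))
                 for i in range(2))


def conjugate_by_y(op):
    """Return sigma_y * op * sigma_y^dagger."""
    return matmul(matmul(Y, op), dagger(Y))


def negate(a):
    return tuple(tuple(-entry for entry in row) for row in a)


print(conjugate_by_y(Z) == negate(Z))  # sigma_z changes sign
print(conjugate_by_y(X) == negate(X))  # sigma_x changes sign
print(conjugate_by_y(I2) == I2)        # the identity term is unaffected
```

Because the identity term is left unchanged while both nontrivial operators on $B_1'$ change sign, the two conditional process matrices are related by this local change of basis.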
It is not known at present whether non-causal processes can be realized in agreement with the known laws of quantum mechanics without resorting to post-selection. We have seen in the previous subsection that we can realize causally non-separable processes, which are nevertheless causal. Here, we see that certain causal processes can become non-causal when supplied with shared entanglement. The ability to extend a process with shared entanglement seems natural to expect for any experimentally realizable process. From this perspective, this result suggests that either non-causal processes may be possible, or that there may exist causally separable processes, as defined above, that cannot be realized in practice.

E. Extensibly causal and extensibly causally separable quantum processes
The fact that according to our definition of causal separability there exist causal processes that may be activated to non-causal ones by shared entanglement naturally suggests the definition of the following classes of processes that do not have this counterintuitive property.
Definition III.2. (Extensibly causal quantum process): A quantum process that is causal and remains causal under extension with input systems in an arbitrary joint quantum state is called extensibly causal.
Definition III.3. (Extensibly causally separable (ECS) quantum process): A quantum process that is causally separable and remains causally separable under extension with input systems in an arbitrary joint quantum state is called extensibly causally separable (ECS).
The process matrices of these types of processes will also be referred to as extensibly causal and ECS process matrices, respectively.
Note. These definitions can be formulated analogously for more general process theories that permit composite local systems.
Do these classes of processes correspond to something easy to describe in practice, and are they different at all? It is immediate to see the following facts.
Observation 1: All bipartite causally separable processes are ECS. This is because, if we add an arbitrary joint input ancilla to a process matrix of the form (51), we again obtain a process matrix of the same form. Therefore, the notion of extensible causal separability can be seen as another possible multipartite extension of the bipartite notion of causal separability, which, however, is linked in a less direct way to the theory-independent notion of causality.
Observation 2: Extensibly causal and ECS processes are not equivalent in general. Indeed, the causally non-separable tripartite process (55) based on the quantum switch is also extensibly causal (our proof that it is causal applies also if the parties share entangled input ancillas).
Comment: Recently, Feix, Araújo, and Brukner gave an example of a bipartite quantum process that is causal but not extensibly causal [39], proving that causality and extensible causality are different in the bipartite case too. While in the tripartite case we have seen that extensible causality is also different from causal separability, it is currently an open problem whether the same holds in the bipartite case.
In the next subsection, we derive a characterization of the tripartite ECS processes in terms of conditions on the form of the process matrix which generalize the conditions in the bipartite case (Eqs. (50), (51)).

F. Structure of tripartite ECS process matrices
Recalling the definition of a causally separable process, let us first state an obvious consequence of this definition for the structure of causally separable (though not necessarily ECS) process matrices. Since the probabilities of a quantum process are linear in the process matrix, the requirement that a causally separable process decomposes as in Theorem II.2 where all processes on the right-hand side of Eq. (41) are valid quantum processes means that a causally separable process matrix is one that can be written in the form
$$W^{1,\cdots,n}_{cs} = \sum_{i=1}^{n} q_i\, W^{(1,\cdots,i-1,i+1,\cdots,n)\npreceq i}, \quad q_i \geq 0, \quad \sum_{i=1}^{n} q_i = 1, \qquad (64)$$
where $W^{(1,\cdots,i-1,i+1,\cdots,n)\npreceq i}$ is a process matrix which describes a process $\mathcal{W}^{(1,\cdots,i-1,i+1,\cdots,n)\npreceq i}$ with the property
$$\mathcal{W}^{(1,\cdots,i-1,i+1,\cdots,n)\npreceq i} = \mathcal{W}^{1,\cdots,i-1,i+1,\cdots,n|i}_{cs} \circ \mathcal{W}^{i}, \qquad (65)$$
where for $n > 1$ the conditional process $\mathcal{W}^{1,\cdots,i-1,i+1,\cdots,n|i}_{cs}$ is a causally separable process for every value of the event in $i$, and for $n = 1$ it is the trivial process. Note that the requirement that $\mathcal{W}^{(1,\cdots,i-1,i+1,\cdots,n)\npreceq i}$ is a quantum process that permits no signaling from the rest of the parties to $i$ guarantees that both the reduced and the conditional process on the right-hand side of Eq. (65) are valid quantum processes (this can be seen from the (no) signaling condition in Proposition III.2).
In the case of two parties, we have seen that the process matrices $W^{A\npreceq B}$, whose processes obey $\mathcal{W}^{A\npreceq B} = \mathcal{W}^{A|B}_{cs} \circ \mathcal{W}^{B}$ (note that any monopartite process is trivially causally separable and ECS), are those that can be written in the form $W^{A\npreceq B} = W^{B_1B_2A_1} \otimes \mathbb{1}^{A_2}$, and the general form of bipartite causally separable process matrices is (51). As noted already, this is also the general form of the bipartite ECS process matrices. Our goal is to obtain a similar condition for tripartite ECS processes.
First, let us consider a process of the form $\mathcal{W}^{(A,B)\npreceq C} = \mathcal{W}^{A,B|C}_{cs} \circ \mathcal{W}^{C}$, where $\mathcal{W}^{C}$ is a monopartite quantum process and $\mathcal{W}^{A,B|C}_{cs}$ is a bipartite conditional process which is causally separable for each possible event in $C$. Since in particular there should be no signaling from Alice and Bob to Charlie in such a process, its process matrix, which we will denote by $W^{A_1A_2B_1B_2C_1C_2}_{(A,B)\npreceq C}$, can at most contain the types of terms listed in Table III. These are the terms that do not permit signaling from Alice and Bob to Charlie according to Proposition III.2.
We will first obtain necessary and sufficient conditions for such a process to be ECS. Note that we have not proven yet that a general ECS process matrix should have the form (64) where each of the terms $W^{(1,\cdots,i-1,i+1,\cdots,n)\npreceq i}$ is itself ECS. This will be shown later.
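Every event in Charlie's laboratory is described by some CP map with CJ operator $M^{C_1C_2} \geq 0$, $\mathrm{Tr}\,M^{C_1C_2} \leq d_{C_1}$. Conditionally on such an event, Alice and Bob are left with the process matrix
$$W^{A_1A_2B_1B_2}_{M^{C_1C_2}} = \frac{\mathrm{Tr}_{C_1C_2}\left[W^{A_1A_2B_1B_2C_1C_2}_{(A,B)\npreceq C}\left(\mathbb{1}^{A_1A_2B_1B_2}\otimes M^{C_1C_2}\right)\right]}{p(M^{C_1C_2})},$$
where $p(M^{C_1C_2})$ is the probability for the event $M^{C_1C_2}$ to occur in Charlie's laboratory (given the appropriate setting), which is independent of the operations performed by Alice and Bob since the process involves no signaling from Alice and Bob to Charlie. More specifically, $p(M^{C_1C_2}) = \mathrm{Tr}\left[W^{C_1C_2} M^{C_1C_2}\right]$, where $W^{C_1C_2}$ is the process matrix of the reduced process of Charlie. The requirement that the conditional process for Alice and Bob is causally separable means that for all $M^{C_1C_2}$,
$$W^{A_1A_2B_1B_2}_{M^{C_1C_2}} = q_{M^{C_1C_2}}\, W^{A\npreceq B}_{M^{C_1C_2}} + \left(1 - q_{M^{C_1C_2}}\right) W^{B\npreceq A}_{M^{C_1C_2}},$$
where $W^{A\npreceq B}_{M^{C_1C_2}}$ and $W^{B\npreceq A}_{M^{C_1C_2}}$ are valid quantum processes compatible with $A\npreceq B$ and $B\npreceq A$, respectively, and $q_{M^{C_1C_2}} \in [0, 1]$ (all objects generally depend on $M^{C_1C_2}$). For convenience, we will write this simply in the form $W^{A_1A_2B_1B_2}_{M^{C_1C_2}} = \tilde{W}^{A_1B_1B_2}_{M^{C_1C_2}} \otimes \mathbb{1}^{A_2} + \tilde{W}^{A_1A_2B_1}_{M^{C_1C_2}} \otimes \mathbb{1}^{B_2}$, where $\tilde{W}^{A_1B_1B_2}_{M^{C_1C_2}} \geq 0$ and $\tilde{W}^{A_1A_2B_1}_{M^{C_1C_2}} \geq 0$ absorb the weights, and the whole operator is a valid process matrix, i.e., it contains only allowed terms and is properly normalized.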
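A sufficient condition for this to hold is that
$$W^{A_1A_2B_1B_2C_1C_2}_{(A,B)\npreceq C} = \tilde{W}^{A_1B_1B_2C_1C_2} \otimes \mathbb{1}^{A_2} + \tilde{W}^{A_1A_2B_1C_1C_2} \otimes \mathbb{1}^{B_2}, \qquad (71)$$
where $\tilde{W}^{A_1B_1B_2C_1C_2} \geq 0$ and $\tilde{W}^{A_1A_2B_1C_1C_2} \geq 0$ are some positive semidefinite operators, whose sum gives a properly normalized quantum process matrix containing only the types of terms listed in Table III. (We remark that each of $\tilde{W}^{A_1B_1B_2C_1C_2}$ and $\tilde{W}^{A_1A_2B_1C_1C_2}$ may contain terms that are forbidden in a process matrix, such as terms of type $C_2$, but these terms have to cancel in the sum.) Indeed, we have
$$\mathrm{Tr}_{C_1C_2}\left[W^{A_1A_2B_1B_2C_1C_2}_{(A,B)\npreceq C}\left(\mathbb{1}^{A_1A_2B_1B_2}\otimes M^{C_1C_2}\right)\right] = \tilde{W}^{A_1B_1B_2}_{M^{C_1C_2}} \otimes \mathbb{1}^{A_2} + \tilde{W}^{A_1A_2B_1}_{M^{C_1C_2}} \otimes \mathbb{1}^{B_2},$$
where $\tilde{W}^{A_1B_1B_2}_{M^{C_1C_2}} = \mathrm{Tr}_{C_1C_2}\left[\tilde{W}^{A_1B_1B_2C_1C_2}\left(\mathbb{1}^{A_1B_1B_2}\otimes M^{C_1C_2}\right)\right] \geq 0$ and $\tilde{W}^{A_1A_2B_1}_{M^{C_1C_2}} = \mathrm{Tr}_{C_1C_2}\left[\tilde{W}^{A_1A_2B_1C_1C_2}\left(\mathbb{1}^{A_1A_2B_1}\otimes M^{C_1C_2}\right)\right] \geq 0$, and it is easy to see that since $W^{A_1A_2B_1B_2C_1C_2}_{(A,B)\npreceq C}$ contains only the types of terms listed in Table III, the operator on the left-hand side can only contain allowed terms.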
It is immediate to see that this condition is sufficient also for the process matrix to be ECS. This is because if $\rho$ is a density matrix of ancillary input systems, the extended process matrix $W^{A_1A_2B_1B_2C_1C_2}_{(A,B)\npreceq C} \otimes \rho$ also has these properties. We now show that the form (71) is also a necessary condition for an ECS process matrix compatible with $(A, B)\npreceq C$, which we will denote by $W^{A_1A_2B_1B_2C_1C_2}_{ecs;(A,B)\npreceq C}$. The proof makes use of the 'teleportation' technique that we used in showing the activation of non-causality. Imagine that we supply Alice and Charlie respectively with ancillary systems $A_1'$ and $C_1'$ of dimension $d_{C_1}d_{C_2}$ each, which are prepared in the maximally entangled state $|\phi^+\rangle^{A_1'C_1'} \propto \sum_i |i\rangle^{A_1'}|i\rangle^{C_1'}$. Conditionally on Charlie performing a suitable operation and obtaining an outcome with CP map $M^{C_1C_2C_1'} \propto |\phi^+\rangle\langle\phi^+|^{(C_1C_2)C_1'}$, Alice and Bob will be left sharing a process matrix which, up to a normalization factor, has an identical form to that of $W^{A_1A_2B_1B_2C_1C_2}_{ecs;(A,B)\npreceq C}$ but with $A_1'$ in the place of $C_1C_2$. The requirement that this is a causally separable bipartite process matrix means that $W^{A_1A_2B_1B_2C_1C_2}_{ecs;(A,B)\npreceq C}$ must be of the form (71).
So far, we have only obtained necessary and sufficient conditions for an ECS process matrix $W^{A_1A_2B_1B_2C_1C_2}_{ecs;(A,B)\npreceq C}$ compatible with $(A, B)\npreceq C$ (and similarly for permutations of $A$, $B$, $C$). We next prove the general case.
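Proposition III.3. Every tripartite ECS process matrix can be written in the form
$$W^{A_1A_2B_1B_2C_1C_2}_{ecs} = q_1 W^{A_1A_2B_1B_2C_1C_2}_{ecs;(A,B)\npreceq C} + q_2 W^{A_1A_2B_1B_2C_1C_2}_{ecs;(B,C)\npreceq A} + q_3 W^{A_1A_2B_1B_2C_1C_2}_{ecs;(A,C)\npreceq B}, \quad q_i \geq 0, \quad \sum_{i=1}^{3} q_i = 1,$$
where $W^{A_1A_2B_1B_2C_1C_2}_{ecs;(A,B)\npreceq C}$ contains only terms from Table III and has the form (71), and analogously for the other two terms by permutation. Proof S6 is given in the Appendix.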
The extension of this form to an arbitrary number of parties is left for future investigation.

G. Processes realizable by classically controlled quantum circuits
Bipartite ECS processes have a clear experimental realization. This raises the question of whether multipartite ECS processes can also be realized in practice, and if so, whether they correspond to a natural class of experimental procedures. (Note that in the bipartite case, ECS processes are equivalent to causally separable processes, but we have already seen that there are multipartite causally separable processes that can become non-causal under extension with entangled ancillas, and these do not have a known experimental realization.) Here, we will show that a particular class of processes which can be realized in practice, referred to as classically controlled quantum circuits, belongs to the class of ECS processes, which is the smallest class of causal quantum processes that we have considered so far. Based on certain considerations, we furthermore conjecture that all ECS processes can be realized in this way (this is certainly true in the bipartite case).
The idea of a classically controlled circuit can be thought of as falling within the paradigm of quantum lambda calculus with classical control [42,43]. If we regard the local experiments of the parties as black-box operations, we may think that they are called, only once each, as part of a computation where at every time step a quantum operation is applied on some part of a quantum register depending on a classical protocol that may use as a variable the outcomes of past operations. If black-box operations are involved in such a computation, their outcomes cannot be directly used (they remain 'inside the box' until the end), but the order of subsequent operations of the circuit may nevertheless depend indirectly on the event inside such a black box, since it can be decided based on a measurement on the output system.
More concretely, we define such a process to have the following general realization. We begin with some sufficiently large quantum system (or 'register') in a given quantum state. We perform a quantum operation on it and conditionally on the outcome of that operation we determine which party will be first, which subsystem of the register will be his/her input system, and what operation will be applied after the black box of that party, all according to some specified rule. We apply the black-box operation of the first party on the decided subsystem, perform the decided operation after it, and depending on its outcome and the outcome of our first operation decide which party will be second, and so on. This continues until all parties are called (by definition, the protocol is such that each party is called exactly once). This model can be formalized in different equivalent ways, which may be suitable for different purposes, and we will consider some simplifications below when we discuss a tripartite example. The fact that this model gives rise to valid quantum processes can be seen from the fact that if we formally write the operation inside each box and calculate the joint probabilities for the outcomes of all boxes using the standard rules of quantum mechanics for all possible outcomes of the protocol, we see that they are linear and non-contextual functions of the respective CP maps of the parties. The same holds if we introduce ancillary systems prepared in an arbitrary state and consider extended operations of the parties that act on parts of them.
In the case of only two parties, we know that any (extensibly) causally separable process can be implemented in this way, since it most generally corresponds to embedding at random the local experiments of Alice and Bob into one of two possible fixed circuits, which can be chosen conditionally on the outcome of a measurement on some state at the very beginning. Since after the first party is chosen there is only one possible choice for the second party, no measurement after the first party is needed. Conversely, any bipartite process that we may obtain via this model has the form of an ECS process. First, notice that the process is independent of the operation applied after the last party. Also, the outcome of any operation after the first party can be ignored since there is only one choice for the last party, i.e., that operation can be assumed deterministic. Finally, the outcomes of the operation before the first party can be grouped into two coarse-grained outcomes such that conditionally on one of them the first party is Alice and on the other one it is Bob. But since after the outcome of that operation and before the input of the first party the quantum register is in some particular quantum state, the rest of the experiment simply corresponds to a deterministic circuit in which Alice and Bob are embedded in a particular order. Therefore, the process realized by such a procedure is just a probabilistic mixture of the processes of two fixed-order circuits, which is the claimed form.
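For reference, the probabilistic mixture described in this paragraph can be written schematically as follows. The notation is an assumption modeled on the bipartite definition of Ref. [4]: $W^{B \npreceq A}$ stands for the process matrix of a fixed-order circuit with no signaling from Bob to Alice (Alice first), $W^{A \npreceq B}$ for the opposite order, and $q$ is the probability of the coarse-grained outcome that selects Alice as the first party.

```latex
W^{A_1A_2B_1B_2} \;=\; q\, W^{B \npreceq A} \;+\; (1-q)\, W^{A \npreceq B},
\qquad 0 \le q \le 1 .
```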
In the case of more than two parties, the equivalence between the two concepts is less obvious, but we can easily argue that all processes obtained by classically controlled circuits are ECS. First, it is clear that depending on the outcome of the first measurement (which has a probability independent of any future operations and therefore of the settings of the parties), there will be one party that is first and hence the subsequent process that results from the protocol can involve no signaling from the rest of the parties to that first party. Therefore, the subsequent process has a well-defined reduced process for the first party. Taking into account all possible outcomes of the first measurement, the whole process will be just a probabilistic mixture of processes of this kind where one party is first, which is Eq. (64). But conditionally on the outcome of the first party, the procedure for the rest of the parties is analogous, so Eq. (65) holds too, i.e., the process is causally separable. Including ancillas onto which the operations of the parties can be extended does not change anything in this argument. Therefore, every process realizable with a classically controlled quantum circuit is ECS.
We conjecture that the reverse also holds. We provide some partial considerations that support this conjecture, based on analysis of the restrictions on the allowed terms in processes realized by classically controlled quantum circuits in the tripartite case. We will focus on the question of implementing by a classically controlled quantum circuit an ECS process matrix of the type $W^{(A,B)\npreceq C}_{\mathrm{ecs}}$, which has the form (71). Implementability of a matrix of this kind is both necessary and sufficient for the implementability of a general tripartite ECS process matrix as described in Proposition III.3, since by using a suitable measurement at the beginning we can select with the right probability which of the three process matrices in the mixture on the right-hand side of Eq. (75) to realize subsequently. The protocol begins most generally with some quantum system prepared in a state $\rho$. After Charlie operates on some subsystem, we apply some operation based on whose outcome we determine who is second, on what subsystem he/she would act, and what operation will be applied after that. Note that without loss of generality we may assume that there is a pre-specified subsystem on which the second party will operate since any subsystem of the same dimension can be mapped onto the designated subsystem by a unitary transformation that can be absorbed as part of the definition of the present operation. Also, without loss of generality we may assume that this operation has only two outcomes, since we can group the outcomes into those for which Alice will be next, and those for which Bob will be next, and any conditioning of the operation following the next party on the fine-grained outcome within each group can be equivalently done by a single future operation acting on a larger system that includes some subsystem on which the classical information about the outcome at this step is copied (still something that we can include as part of the definition of the operation at this step).
Since there is only a single possibility for the last party, the operation after the second party can be regarded as a deterministic operation (or a CPTP map) from all systems to the input of the last party. We leave the possibility that this last operation may be defined conditionally on the first outcome rather than absorb the conditioning on that outcome into a larger operation, in order to avoid complications arising from the fact that the different parties may have input and output systems of different dimensions. The outlined procedure is sketched in Fig. 6, where the two possible sequences of transformations arising from the two possible outcomes of our first operation are depicted in blue and green, respectively. The two CP maps corresponding to the outcomes of the operation after Charlie must sum up to a CPTP map, since they correspond to the two possible outcomes of a standard quantum operation.
Each of the two possible developments (blue and green) of this protocol is a non-deterministic linear supermap [44] from the local CP maps of the parties into the real numbers, the result of which equals the probability for the particular sequence of events. This can be written in a similar form as the formula for the probabilities of the outcomes of the parties in a valid process, except that in the place of the process matrix we would have an operator $\widetilde{W}_i^{A_1A_2B_1B_2C_1C_2} \geq 0$, where $i = 1, 2$ labels the particular development, which generally would not be a valid process matrix. However, the sum $\widetilde{W}_1^{A_1A_2B_1B_2C_1C_2} + \widetilde{W}_2^{A_1A_2B_1B_2C_1C_2}$ would be a valid process matrix realized through this classically controlled quantum circuit.
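Written out, the statement of this paragraph takes the following form. This is a sketch in assumed notation: $M_{o_X}^{X_1X_2}$ denotes the CJ operator of the CP map associated with outcome $o_X$ of party $X$, and $p_i$ denotes the joint probability that development $i$ occurs and the parties obtain the outcomes $o_A, o_B, o_C$. Neither symbol is fixed by the passage itself.

```latex
p_i(o_A, o_B, o_C) \;=\; \mathrm{Tr}\!\left[\widetilde{W}_i^{A_1A_2B_1B_2C_1C_2}
\left(M_{o_A}^{A_1A_2}\otimes M_{o_B}^{B_1B_2}\otimes M_{o_C}^{C_1C_2}\right)\right],
\qquad i = 1, 2,
\\[4pt]
W^{A_1A_2B_1B_2C_1C_2} \;=\; \widetilde{W}_1^{A_1A_2B_1B_2C_1C_2} + \widetilde{W}_2^{A_1A_2B_1B_2C_1C_2} .
```

Each $\widetilde{W}_i$ alone yields subnormalized probabilities, which is why it need not be a valid process matrix, whereas the probabilities obtained from the sum are normalized.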
Consider now just one of the two possible developments, say, the blue one, in which Alice is second and Bob is last (labeled by 1). One can see that since Bob is last and his output system is discarded, we have $\widetilde{W}_1^{A_1A_2B_1B_2C_1C_2} = \widetilde{W}_1^{A_1A_2B_1C_1C_2} \otimes \mathbb{1}^{B_2}$. Notice that if the transformation $N_{1,\mathrm{CPTP}}$ after Alice was not required to be CPTP but could be any CP map $N_{1,\mathrm{CP}}$, for a suitable choice of the initial state $\rho$ and of the CP maps $M_{1,\mathrm{CP}}$ and $N_{1,\mathrm{CP}}$ we could realize any $\widetilde{W}_1^{A_1A_2B_1C_1C_2} \geq 0$. This is simply because we can choose the density operator $\rho^{C_1C'}$ proportional to $\widetilde{W}_1^{A_1A_2B_1C_1C_2}$, where the part of the operator on the systems other than $C_1$ is stored on $C'$, and we can 'teleport' this part of the operator onto its desired subsystem by using CP maps $M_{1,\mathrm{CP}}$ and $N_{1,\mathrm{CP}}$ that have CJ operators proportional to projectors on maximally entangled states as needed to realize the 'teleportation' (the traces of these CP maps can be chosen to ensure the overall trace of the resultant operator $\widetilde{W}_1^{A_1A_2B_1C_1C_2}$). However, the restriction that the transformation after Alice is trace-preserving, $N_{1,\mathrm{CPTP}}$, places constraints on what kind of $\widetilde{W}_1^{A_1A_2B_1C_1C_2}$ can be obtained. Indeed, the CJ operator of $N_{1,\mathrm{CPTP}}$ cannot contain terms of type $A_2$, $A'A_2$, and $A'$. Considering the calculation of $\widetilde{W}_1^{A_1A_2B_1C_1C_2}$ based on the CJ operators of $\rho$, $M_{1,\mathrm{CP}}$ and $N_{1,\mathrm{CPTP}}$, we see that the lack of these types of terms in $N_{1,\mathrm{CPTP}}$ implies the lack of any term with a nontrivial $\sigma$ on $A_2$ and the identity on $B_1$. This is the only constraint on the possible types of terms in $\widetilde{W}_1^{A_1A_2B_1C_1C_2}$. The possible types of terms are exactly those allowed in the operator $W^{A_1A_2B_1C_1C_2}$ in Eq. (71). Similarly, we see that the allowed terms in $\widetilde{W}_2^{A_1B_1B_2C_1C_2}$ (Bob second, Alice last) are the same as those in $W^{A_1B_1B_2C_1C_2}$ in Eq. (71). These are the terms allowed in a process matrix compatible with Charlie being first, except that both $\widetilde{W}_1$ and $\widetilde{W}_2$ may contain terms of type $C_2$ and $C_1C_2$.
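The role of trace preservation in this argument can be checked numerically on a small example. The following Python sketch is an illustration under stated assumptions, not part of the construction in the text. It uses a qubit amplitude-damping channel as a stand-in for a CPTP map and one of its Kraus branches as a stand-in for a CP map that is not trace preserving. The helper names are hypothetical. The check is that the CJ operator of the CPTP map, once the output is traced out, equals the identity on the input, so it has no term that is nontrivial on the input and trivial on the output, while the CP branch does have such a term.

```python
import math


def apply_map(kraus, x):
    """Return sum_k K x K^T for real Kraus matrices K (so K^dagger = K^T)."""
    d_out, d_in = len(kraus[0]), len(kraus[0][0])
    out = [[0.0] * d_out for _ in range(d_out)]
    for k in kraus:
        for r in range(d_out):
            for c in range(d_out):
                out[r][c] += sum(k[r][i] * x[i][j] * k[c][j]
                                 for i in range(d_in) for j in range(d_in))
    return out


def reduced_cj_on_input(kraus):
    """Tr_out of the CJ operator sum_ij |i><j| (x) N(|i><j|), a d_in x d_in matrix."""
    d_in = len(kraus[0][0])
    red = [[0.0] * d_in for _ in range(d_in)]
    for i in range(d_in):
        for j in range(d_in):
            e_ij = [[1.0 if (r, c) == (i, j) else 0.0 for c in range(d_in)]
                    for r in range(d_in)]
            image = apply_map(kraus, e_ij)
            red[i][j] = sum(image[r][r] for r in range(len(image)))
    return red


def sigma_z_weight(red):
    """Tr[red sigma_z]: weight of the term sigma_z (input) (x) identity (output)."""
    return red[0][0] - red[1][1]


g = 0.3  # damping probability, arbitrary illustrative value
K0 = [[1.0, 0.0], [0.0, math.sqrt(1 - g)]]
K1 = [[0.0, math.sqrt(g)], [0.0, 0.0]]

cptp = reduced_cj_on_input([K0, K1])  # full channel: trace preserving
cp_only = reduced_cj_on_input([K1])   # single branch: CP but not trace preserving
```

For the full channel the reduced operator is the identity and the input-only weight vanishes, whereas for the single branch it does not, which mirrors the distinction between $N_{1,\mathrm{CPTP}}$ and a general CP map drawn in the text.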
The fact that these terms should cancel in the sum $\widetilde{W}_1 + \widetilde{W}_2$ follows from the fact that this is a valid ECS process, and can be seen to be ensured by the requirement that $M_{1,\mathrm{CP}} + M_{2,\mathrm{CP}}$ is CPTP.
The only restriction on the operators $\widetilde{W}_1$ and $\widetilde{W}_2$ imposed by this model, apart from their positive semidefiniteness and the normalization of their sum, seems to be the absence of the forbidden terms in each of them, as well as of the forbidden terms in their sum. If this is indeed the case, then any ECS process could be realized by a suitable classically controlled quantum circuit. A strictly rigorous proof requires showing that apart from the lack of these forbidden terms, there can be no other hidden constraints on the pair of operators $\widetilde{W}_1$ and $\widetilde{W}_2$ (which, of course, are guaranteed to be properly normalized). One way of doing it could be by exhibiting an explicit constructive procedure for implementing any given ECS process, which would be of additional interest in its own right. We leave this question, and the multipartite case, for future investigation.

IV. CONCLUSION
In this paper, we proposed a rigorous definition of causality in the process framework [4], which takes into account the fact that the causal order between a set of local experiments may in general be random and correlated with the settings of some of them. We derived the structure of causal processes permitting such 'dynamical' causal order in the general multipartite case, which is captured by an iteratively formulated canonical form expressed in terms of reduced and conditional processes. The canonical form can be interpreted as an unraveling of the process into a sequence of local experiments, which agrees with the condition that the order and outcomes of the experiments prior to a given step are independent of the settings of future experiments. We showed that for any fixed number of settings and outcomes for each party, the probabilities of causal processes form a polytope, referred to as the causal polytope. The facets of this polytope define causal inequalities, whose violation by a given process can be interpreted as demonstrating the non-existence of causal order between the local experiments.
We investigated this concept and the related concept of causal separability in the quantum process theory introduced in Ref. [4], whose properties were detailed here in the multipartite case. We proposed a definition of causal separability, which reduces to the one for the case of two parties [4], based on the canonical form of causal processes. Specifically, a causally separable quantum process was defined as a causal quantum process that has a causal decomposition such that the different processes appearing in this decomposition are themselves valid quantum processes. We showed that the set of causally separable quantum processes is strictly within the set of causal quantum processes, by exhibiting an example of a tripartite process that is causal but not causally separable. Very recently, the same was shown to hold also in the bipartite case [39]. We also gave an example of a causally separable (and hence also causal) process that becomes non-causal when extended by supplying the parties with an entangled ancillary state. Based on this observation, we proposed two extended notions of causality and causal separability called extensible causality and extensible causal separability, which require preservation of the respective property under extending the process with entangled input ancillas. Although they are different in the general case, the sets of causally separable and ECS processes are equivalent in the bipartite case. We showed that the sets of extensibly causal and causally separable processes are different in general via the same tripartite example that we used to show that causal and causally separable processes are different. At present we do not know if the same separation holds in the bipartite case. However, it was recently shown that causal and extensibly causal processes are different in the bipartite case, similarly to the multipartite case [39]. 
Finally, we derived a simple characterization of the ECS quantum processes in the tripartite case in terms of conditions on the form of their process matrices, which extends the conditions for (extensibly) causally separable process matrices in the bipartite case. We conjectured that the set of ECS processes is equivalent to the processes that can be obtained within the paradigm of classically controlled quantum circuits and provided evidence for this based on analysis of the restrictions that this paradigm imposes on the tripartite process matrices it can create. The ECS processes and the processes obtainable by classically controlled quantum circuits are equivalent in the bipartite case.
Our present understanding of the relation between all these different classes of quantum processes is illustrated for the general multipartite case and for the bipartite case in Fig. 7a and Fig. 7b, respectively. An obvious open problem is whether the gray segments in these figures are empty or not.
Another problem of fundamental importance is to understand the class of quantum processes that are physically admissible in agreement with the known laws of quantum mechanics, and where this class stands with respect to all of the above classes. Are the processes that can be realized by classically controlled quantum circuits all the physically admissible causally separable processes? Where does the class of quantum-controlled quantum circuits stand? At present, this is the most general operationally feasible paradigm that we are aware of and all known processes realizable through it seem to be extensibly causal. Could the class of extensibly causal processes be equivalent to quantum-controlled quantum circuits? And most intriguingly, are there physically admissible non-causal processes?
The implications of our results are not limited to the subject of indefinite causal order in quantum mechanics. They can be useful also for the problem of inferring causal structure [24], both in classical and quantum theory [45]. The subject of causal inference concerns many disciplines, from philosophy and machine learning to sociology and medicine. Our formulation of a background-independent operational notion of causality that admits dynamical causal relations opens the road to a more general paradigm for causal inference than the one assuming deterministic underlying variables and static causal relations [24]. The decomposition of causal processes derived here implies constraints on the possible causal orders compatible with given setting-outcome correlations, which can serve as a basis for developing more sophisticated causal inference tools.
Appendix: Causal and causally separable processes

Proof S1. Proposition II.2. The 'only if' part is contained in Proposition II.1 itself. To prove the 'if' part, take an arbitrary experiment, say, 1. Let $\{2, \cdots, k\}$, up to relabeling, be the set of local experiments that are in the causal past or causal elsewhere of 1, and $\{k+1, \cdots, n\}$ be the set of local experiments that are in the causal future of 1. Since the causal configuration of the local experiments is assumed fixed, the condition for the process to be causal reduces to the requirement that for every such 1, we have $p(o_2, \cdots, o_k | s_1, s_2, \cdots, s_n) = p(o_2, \cdots, o_k | s_2, \cdots, s_n)$. But from the transitivity and anti-symmetry of causal order it follows that none of the experiments $\{1, \cdots, k\}$ is in the causal future of any of the experiments $\{k+1, \cdots, n\}$. This implies that we have a reduced $k$-partite process for $\{1, \cdots, k\}$, i.e., $p(o_1, \cdots, o_k | s_1, s_2, \cdots, s_n) = p(o_1, \cdots, o_k | s_1, \cdots, s_k)$. The desired condition then follows from Proposition II.1 applied to the $k$-partite process.
Proof S2. Proposition II.3. First, observe that the property (18) holds for the case where the specified $K$ consecutive sets exhaust all local experiments $\{1, \cdots, n\}$. This is because, in this case, each of the local experiments in the $K$th consecutive set is causally preceded by or causally independent from every other local experiment. Hence, the definition of causality (2) directly implies the desired relation. The general case follows by induction from this special case and the following Lemma.
Lemma S1. Let the property (18) hold for $K = K' + \mathrm{I}$, where $K' \geq 1$. Then it also holds for $K = K'$.
What remains to be shown is that this probability cannot depend on the settings $s_{(g+1)_K}, \cdots, s_{n_K}$.
Note that here we cannot apply straightforwardly the causality condition (2) as we did in the case when the first $K$ consecutive sets were assumed to contain all local experiments. This is because for a particular causal configuration $\kappa^*(1_{\mathrm{I}}, \cdots, n_K)$ compatible with $[1_{\mathrm{I}}, \cdots, n_{\mathrm{I}}]_{\mathrm{I}}, \cdots, [1_K, \cdots, n_K]_K$, it is generally not the case that
\[
p(\kappa^*(1_{\mathrm{I}}, \cdots, n_K), [1_{\mathrm{I}}, \cdots, n_{\mathrm{I}}]_{\mathrm{I}}, \cdots, [1_K, \cdots, n_K]_K, o_{1_{\mathrm{I}}}, \cdots, o_{g_K} | s_1, \cdots, s_n) = p(\kappa^*(1_{\mathrm{I}}, \cdots, n_K), o_{1_{\mathrm{I}}}, \cdots, o_{g_K} | s_1, \cdots, s_n). \tag{S3}
\]
Indeed, in order for the first $K$ consecutive sets to be the specified ones, it is necessary and sufficient that: 1) the local experiments in the specified $K$ consecutive sets have a causal configuration compatible with these sets, and 2) each of the local experiments that are not in the specified $K$ consecutive sets is in the causal future of at least one of the local experiments in the $K$th consecutive set. [In the case where the $K$ sets were assumed to contain all local experiments, only condition 1) was relevant and hence the equality (S3) held.] Consider a particular causal configuration $\kappa^*(1_{\mathrm{I}}, \cdots, n_K)$ compatible with $[1_{\mathrm{I}}, \cdots, n_{\mathrm{I}}]_{\mathrm{I}}, \cdots, [1_K, \cdots, n_K]_K$ (when the causal configuration $\kappa(1_{\mathrm{I}}, \cdots, n_K)$ in the probability on the left-hand side of Eq. (18) is not compatible with the specified consecutive sets, that probability is trivially zero). Let us denote by $1_{\mathrm{rest}}, \cdots, l_{\mathrm{rest}}$, $l = n - \sum_{K'=\mathrm{I}}^{K} n_{K'}$, the rest of the local experiments, i.e., those that do not belong to the assumed first $K$ consecutive sets. We have
\[
p(\kappa^*(1_{\mathrm{I}}, \cdots, n_K), [1_{\mathrm{I}}, \cdots, n_{\mathrm{I}}]_{\mathrm{I}}, \cdots, [1_K, \cdots, n_K]_K, o_{1_{\mathrm{I}}}, \cdots, o_{g_K} | s_1, \cdots, s_n) = p(\kappa^*(1_{\mathrm{I}}, \cdots, n_K), (1_K \prec 1_{\mathrm{rest}} \vee \cdots \vee n_K \prec 1_{\mathrm{rest}}), \cdots, (1_K \prec l_{\mathrm{rest}} \vee \cdots \vee n_K \prec l_{\mathrm{rest}}), o_{1_{\mathrm{I}}}, \cdots, o_{g_K} | s_1, \cdots, s_n). \tag{S4}
\]
We will show that the probability on the right-hand side can be written as a linear combination of probabilities for which the condition of causality (2) straightforwardly implies independence of $s_{(g+1)_K}, \cdots, s_{n_K}$.
which, using the expansion of the process matrix, becomes
\[
d_{A_1} d_{B_1} d_{C_1} \Big( w_{000000} + \sum_{n>0} w_{00000n}\, c_n + \sum_{mn>0} w_{0000mn}\, p_{mn} \Big) = 1, \quad \forall c_n, p_{mn} \in \mathbb{R}. \tag{S14}
\]

Likewise, by fixing $M^{A_1A_2} = \mathbb{1}^{A_1A_2}/d_{A_2}$ and $M^{C_1C_2} = \mathbb{1}^{C_1C_2}/d_{C_2}$, and considering an arbitrary $M^{B_1B_2}$ of the form (S10), we obtain $w_{000l00} = w_{00kl00} = 0$ for all $k, l > 0$, while by fixing $M^{B_1B_2} = \mathbb{1}^{B_1B_2}/d_{B_2}$ and $M^{C_1C_2} = \mathbb{1}^{C_1C_2}/d_{C_2}$, and considering an arbitrary $M^{A_1A_2}$ of the form (S10), we obtain $w_{0j0000} = w_{ij0000} = 0$ for all $i, j > 0$.

Now, if we fix only $M^{A_1A_2} = \mathbb{1}^{A_1A_2}/d_{A_2}$, and we use the previously obtained constraints, we obtain $w_{000l0n} = w_{00kl0n} = w_{000lmn} = w_{00klmn} = 0$ (each of these coefficients can be shown to vanish by suitably choosing the parameters in $M^{B_1B_2}$ and $M^{C_1C_2}$ in order to select only the term with that coefficient). Then, if we fix $M^{B_1B_2} = \mathbb{1}^{B_1B_2}/d_{B_2}$, we obtain $w_{0j000n} = w_{0j00mn} = w_{ij000n} = w_{ij00mn} = 0$.
Finally, we impose condition (46) for arbitrary $M^{A_1A_2}$, $M^{B_1B_2}$, and $M^{C_1C_2}$ of the form (S10). Using the constraints obtained from the special cases above, we obtain $w_{0j0l0n} = w_{0j0lmn} = w_{0jkl0n} = w_{0jklmn} = w_{ij0l0n} = w_{ij0lmn} = w_{ijkl0n} = w_{ijklmn} = 0$. Thus, we have shown that all coefficients $w_{ijklmn}$, except for $w_{000000}$, that may appear in the result of taking the trace of $W^{A_1A_2B_1B_2C_1C_2}$ with a general combination of $M^{A_1A_2}$, $M^{B_1B_2}$, $M^{C_1C_2}$ of the form (S10), must vanish. This is also a sufficient condition for the normalization condition (46) to hold. All these forbidden terms for a process matrix are listed in Table (S1).
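The mechanism by which a forbidden term spoils normalization can be seen in a deliberately reduced toy case. The following Python sketch makes several assumptions that are not in the text: it considers a single party with qubit input $A_1$ and qubit output $A_2$ instead of three parties, it uses only 'discard and prepare' channels, whose CJ operators have the product form $\mathbb{1}^{A_1} \otimes \sigma^{A_2}$, and all helper names are hypothetical. It compares an operator containing only the identity term with one that also contains a term of type $A_2$, the single-party analogue of the coefficients $w_{0j0000}$.

```python
def tr_prod(x, y):
    """Tr[x y] for square matrices given as nested lists."""
    n = len(x)
    return sum(x[i][k] * y[k][i] for i in range(n) for k in range(n))


I2 = [[1.0, 0.0], [0.0, 1.0]]
SZ = [[1.0, 0.0], [0.0, -1.0]]
P0 = [[1.0, 0.0], [0.0, 0.0]]  # |0><0|
P1 = [[0.0, 0.0], [0.0, 1.0]]  # |1><1|


def total_probability(w_terms, m_in, m_out):
    """Tr[W (m_in (x) m_out)] for W = sum_t c_t X_t (x) Y_t on A1 (x) A2."""
    return sum(c * tr_prod(x, m_in) * tr_prod(y, m_out) for c, x, y in w_terms)


eps = 0.25  # arbitrary illustrative weight of the extra term
w_valid = [(0.5, I2, I2)]                      # identity term only, w_00 = 1/d_{A1}
w_forbidden = w_valid + [(0.5 * eps, I2, SZ)]  # extra term of type A2

# CJ operators 1^{A1} (x) sigma^{A2} of channels that discard the input and
# prepare sigma; they satisfy Tr_{A2} M = 1^{A1}, i.e. they are CPTP.
channels = {"prepare0": (I2, P0), "prepare1": (I2, P1)}
valid = {name: total_probability(w_valid, *m) for name, m in channels.items()}
broken = {name: total_probability(w_forbidden, *m) for name, m in channels.items()}
```

With the identity term alone the total probability is 1 for both channels, whereas the extra term shifts it to $1 + \epsilon$ or $1 - \epsilon$ depending on the prepared state, so normalization for all CPTP maps forces the coefficient of such a term to vanish.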