Quantum superpositions of "common-cause" and "direct-cause" causal structures

The constraints arising for a general set of causal relations, both classically and quantumly, are still poorly understood. As a step in exploring this question, we consider a coherently controlled superposition of "direct-cause" and "common-cause" relationships between two events. We propose an implementation involving the spatial superposition of a mass and general relativistic time dilation. Finally, we develop a computationally efficient method to distinguish such genuinely quantum causal structures from classical (incoherent) mixtures of causal structures and show how to design experimental verifications of the nonclassicality of a causal structure.


I. INTRODUCTION
The deeply rooted intuition that the basic building blocks of the world are cause-effect-relations goes back over a thousand years [1][2][3] and yet still puzzles philosophers and scientists alike.
In physics, general relativity provides a theoretical account of the causal relations that describe which events in spacetime can influence which other events. For two (infinitesimally close) events separated by a time-like or light-like interval, one event is in the future light cone of the other, such that there could be a direct cause-effect relationship between them. When a space-like interval separates two events, neither event can influence the other. The causal relations in general relativity are dynamical, since they are imposed by the dynamical light cone structure [4].
Incorporating the concept of causal structure in the quantum framework leads to novelties: it is expected that such a notion will be both dynamical, as in general relativity, as well as indefinite, due to quantum theory [5]. One might then expect indefiniteness with respect to the question of whether an interval between two events is time-like or space-like, or even whether event A is prior to or after event B for time-like separated events. Yet, finding a unified framework for the two theories is notoriously difficult and the candidate models still need to overcome technical and conceptual problems.
One possibility to separate conceptual from technical issues is to consider more general, theory-independent notions of causality. The causal model formalism [6,7] is such an approach, which has found applications in areas as diverse as medicine, social sciences and machine learning [8]. The study of its quantum extension, allowing for non-local correlations [9][10][11][12] or including new information-theoretic principles [13][14][15] might provide intuitions and insights that are currently missing from the theory-laden take at combining quantum mechanics with general relativity.
Recently, it was found that it is possible to formulate quantum mechanics without any reference to a global causal structure [16]. The resulting framework, the process matrix formalism, allows for processes which are incompatible with any definite order between operations. One particular case of such a process is the "quantum switch", where an auxiliary quantum system can coherently control the order in which operations are applied [17]. This results in a quantum controlled superposition of the processes "A causing B" and "B causing A". The quantum switch can also be realized through a preparation of a massive system in a superposition of two distinct states, each yielding a different but definite causal structure for future events [18,19]. Furthermore, it provides computational [20] and communication [21,22] advantages over standard protocols with a fixed order of events. The first experimental proof-of-principle demonstration of the switch has been reported recently [23].
Given that one can implement superpositions of two different causal orders, one may ask if and how one could realize situations in which two events are in superpositions of being in "common-cause" (A does not cause B directly) and "direct-cause" (A and B share no common cause) relationships. Here we show that such superpositions exist and how to verify them.
We develop a framework for the computationally efficient verification of coherent superpositions of "direct-cause" and "common-cause" causal structures. We propose a natural physical realization of a quantum causal structure with the spatial superposition of a mass and general relativistic time dilation using the approach developed in Refs. [18,19]. Finally, using the process matrix formalism, we define a degree of "nonclassicality of causal structures" and show how to design experimental verifications thereof using a semidefinite program [24].

II. QUANTUM CAUSAL MODELS
To formalize the pre-theoretic notion of causality, the standard approach is to use causal models [6,7], consisting of (i) a causal network and (ii) model parameters. The causal network is represented by a directed graph, whose nodes are variables and whose directed edges represent causal influences between variables. The causal influence from A to B is identified with the possibility of signaling from A to B. To exclude the possibility of causal loops, one imposes the condition that the graph should be acyclic (a "DAG"), which induces a partial order ("causal order") over the variables. The model parameters then determine how the probability distribution of each variable or set of variables is to be computed as a function of the value of its parent nodes.
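As a minimal sketch of the causal network described above, the following Python snippet encodes a small directed graph and extracts a causal order from it. The three-variable network and the helper names `causal_order` and `has_causal_loop` are illustrative assumptions, not part of the formalism of Refs. [6,7]; the standard-library `graphlib` module is used only as a convenient way to test acyclicity.

```python
from graphlib import TopologicalSorter, CycleError

def causal_order(parents):
    """Return one ordering of the variables compatible with the causal network.

    `parents` maps each variable to the set of its direct causes.
    Raises CycleError if the graph contains a causal loop.
    """
    return list(TopologicalSorter(parents).static_order())

def has_causal_loop(parents):
    """True if the directed graph is not acyclic (i.e., not a DAG)."""
    try:
        causal_order(parents)
        return False
    except CycleError:
        return True

# Hypothetical network (assumption): A influences B and C, B influences C.
network = {"A": set(), "B": {"A"}, "C": {"A", "B"}}
order = causal_order(network)

# A graph in which A and B influence each other is excluded as a causal model.
loop = {"A": {"B"}, "B": {"A"}}
loop_detected = has_causal_loop(loop)
```

For this particular network the causal order is a total order; in general a DAG only fixes a partial order, and `static_order` returns just one compatible ordering.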
Fully characterizing the causal model requires information which is available only through "interventions", where the value of one or more variables is set to take a specific value, independently of the values of the rest of the variables. In the resulting causal network, the connections from all its parents are eliminated. Intervening on all relevant variables is sufficient to completely reconstruct the full causal model [7]. Since this is often practically impossible, it is crucial to investigate the possibilities of causal inference from a limited set of interventions.
Moving to quantum causal models, we will define variables as results of generalized quantum operations applied to incoming quantum systems ("local operations"). Formally, a local operation $\mathcal{M}_A$ is a completely positive (CP) map from operators on the input Hilbert space $\mathcal{H}^{A_I}$ to operators on the output Hilbert space $\mathcal{H}^{A_O}$. The Choi-Jamiołkowski (CJ) isomorphism [25,26] provides a convenient representation of the local map as a positive operator $M^{A_I A_O}$ on $\mathcal{H}^{A_I} \otimes \mathcal{H}^{A_O}$ (see Appendix A).
The quantum causal structure, which is the quantum analogue of the classical causal network, maps the aforementioned local operations to a probability distribution. It can be thought of as a higher order operator and can be formally represented in the "superoperator", "quantum comb" or "process matrix" formalisms [16,27][28][29][30][31].
We will focus on quantum causal structures with three laboratories (three nodes in the graph) A, B and C compatible with the causal order "A is not after B, which is not after C" (A ≺ B ≺ C). This means that there are no causal influences from B and C to A, nor from C to B (see Fig. 1).
In the process matrix formalism, the quantum causal structure is represented by the matrix $W$ on $A_I \otimes A_O \otimes B_I \otimes B_O \otimes C_I$ [16,32]. The probabilities of observing the outcomes $i, j, k$ at A, B, C (corresponding to implementing the completely positive (CP) maps $M^i_A$, $M^j_B$, $M^k_C$ respectively) are given by the generalized Born rule:
$$p(M^i_A, M^j_B, M^k_C) = \mathrm{tr}\left[ W \left( M_i^{A_I A_O} \otimes M_j^{B_I B_O} \otimes M_k^{C_I} \right) \right]. \quad (1)$$
The quantum causal structure and local operations should generate only meaningful (that is, positive and normalized) probability distributions. In addition, we require the probability distributions to be compatible with the causal order A ≺ B ≺ C. Note that both "common-cause" and "direct-cause" relationships between A and B are compatible with this causal order.
In terms of process matrices, these conditions are equivalent to requiring that $W$ satisfies [32]:
$$W \geq 0, \qquad W = L_{A \prec B \prec C}(W), \quad (2)$$
$$\mathrm{tr}\, W = d_{A_O} d_{B_O} =: d_O. \quad (3)$$
$L_{A \prec B \prec C}(\cdot)$ is the projection onto processes compatible with the causal order A ≺ B ≺ C, defined in Appendix B. Eq. (2) defines a convex cone $\mathcal{W}$, Eq. (3) a normalization constraint.
Following the standard DAG terminology, a purely "direct-cause" process $W_{\mathrm{dc}}$ contains only a direct cause-effect relation between A and B, excluding any form of common cause between A and B. Any correlation between A and B is therefore caused by A alone (Fig. 1 (a) and Fig. 2 (a)). Tracing out $C_I$ and $B_O$, the process matrix is a tensor product $\rho^{A_I} \otimes W^{A_O B_I}$. In our scenario, it will prove natural to extend this definition to include convex mixtures of direct-cause processes, i.e.,
$$\mathrm{tr}_{C_I B_O} W_{\mathrm{dc}} = \sum_i p_i\, \rho_i^{A_I} \otimes W_i^{A_O B_I}, \quad (4)$$
where $p_i \geq 0$, $\sum_i p_i = 1$, $\rho_i^{A_I}$ are arbitrary states and $W_i^{A_O B_I}$ arbitrary valid channels between Alice's output and Bob's input, representing direct cause-effect links between A and B.
Such a process can be interpreted as a probability distribution over states entering $A_I$ and corresponding channels from $A_O$ to $B_I$. In the DAG framework, such probability distributions can be obtained from a graph with an additional latent node that acts as a common cause for all the observed nodes, or simply from ignorance of which graph is implemented. Every channel from A to B with classical memory can be decomposed in this way; see Appendix G for details.
On the other hand, a purely "common-cause" process $W_{\mathrm{cc}}$ does not include any direct causal influence between A and B (Fig. 1 (b) and Fig. 2 (b)). This implies that there is no channel between $A_O$ and $B_I$. Therefore, when $B_O$ and $C_I$ are traced out, the process factorizes as
$$\mathrm{tr}_{B_O C_I} W_{\mathrm{cc}} = \sigma^{A_I B_I} \otimes \mathbb{1}^{A_O}, \quad (5)$$
where $\sigma^{A_I B_I}$ is an arbitrary (possibly entangled, possibly mixed) state, representing the common cause influencing A and B.
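To make the generalized Born rule concrete, here is a minimal numerical sketch for a simplified direct-cause scenario. It is an illustration under stated assumptions rather than the paper's setup: laboratory C and Bob's output are dropped, the process is a state $\rho$ entering $A_I$ followed by an identity channel from $A_O$ to $B_I$, Alice measures in the computational basis and re-prepares the state she found, and Bob measures in the same basis. The input state `rho` and all function names are hypothetical.

```python
import numpy as np

d = 2
basis = np.eye(d)

def proj(i):
    """Projector |i><i| in the computational basis."""
    return np.outer(basis[i], basis[i])

# Unnormalized maximally entangled vector |I>> = sum_j |jj>; its projector
# is the CJ representation of an identity channel from A_O to B_I.
I_vec = np.eye(d).reshape(d * d)
identity_channel = np.outer(I_vec, I_vec)

# Hypothetical state entering Alice's laboratory (assumption of this example).
rho = np.array([[0.7, 0.2], [0.2, 0.3]])

# Process matrix on A_I (x) A_O (x) B_I.
W = np.kron(rho, identity_channel)

def alice_cj(i):
    """CJ operator for 'observe outcome i, then re-prepare |i>'.

    With the transposed CJ convention this is |i><i| (x) (|i><i|)^T.
    """
    return np.kron(proj(i), proj(i).T)

def bob_cj(j):
    """CJ operator of Bob's measurement outcome j (Bob has no output here)."""
    return proj(j)

# Generalized Born rule: p(i, j) = tr[W (M_i (x) M_j)].
p = np.array([[np.trace(W @ np.kron(alice_cj(i), bob_cj(j))).real
               for j in range(d)] for i in range(d)])
```

In this sketch Bob's outcome always coincides with Alice's, since the only link between them is the channel from $A_O$ to $B_I$, and Alice's outcome statistics are the diagonal of `rho`.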

III. CLASSICAL AND QUANTUM SUPERPOSITIONS OF CAUSAL STRUCTURES
One possibility of combining direct-cause and common-cause processes consists in allowing for classical mixtures thereof: imagine that flipping a (possibly biased) coin determines which process will be realized in an experimental run. Formally, this is described by a process $W_{\mathrm{conv}}$ which can be decomposed as a convex combination:
$$W_{\mathrm{conv}} = q\, W_{\mathrm{dc}} + (1 - q)\, W_{\mathrm{cc}}, \quad (6)$$
where $0 \leq q \leq 1$, $W_{\mathrm{dc}}$ satisfies (4) and $W_{\mathrm{cc}}$ satisfies (5). Note that such a classical mixture was experimentally implemented in Ref. [33].
Can there be causal structures exhibiting genuine quantum coherence, i.e., that cannot be decomposed as a classical mixture of direct-cause and common-cause processes (while respecting the causal order A ≺ B ≺ C)?
We now give an example of such a coherent superposition. It is analogous to the "quantum switch" [17], which coherently superposes two causal orders A ≺ B ≺ C and B ≺ A ≺ C, where the causal structure is entangled to a "control" system $C_I^{(0)}$ added to C's input space. To keep the notation simple, we define it in the "pure" CJ-vector notation (see Appendix A): where $|I\rangle\rangle := \sum_{j=1}^{d} |jj\rangle$ represents a non-normalized maximally entangled state, the CJ-representation of an identity channel. The corresponding superposition of circuits is shown in Fig. 3. $W_{\mathrm{coherent}}$ satisfies neither the direct-cause condition (4) nor the common-cause condition (5) and is a projector on a pure vector, so it cannot be decomposed into any nontrivial convex combination, in particular not a mixture of direct-cause and common-cause processes. This proves that the process's causal structure is nonclassical.
FIG. 3. Coherent superposition of a direct-cause and a common-cause process, implementing the causal structure $W_{\mathrm{coherent}}$ of (7).

IV. PHYSICAL IMPLEMENTATION OF THE QUANTUM CAUSAL STRUCTURE
The causal structure $W_{\mathrm{coherent}}$ would not be of particular interest if it were a mere theoretical artifact. We now give an explicit and plausible physical scenario to realize the quantum causal structures in models which respect the principles of general relativistic time dilation and quantum superposition. We utilize the approach recently developed for the "gravitational quantum switch" to realize a superposition and entanglement of two different causal orders [18,19].
Consider two observers, Alice and Bob, who have initially synchronized clocks. We define the events in the respective laboratories with respect to the local clocks. Bob's local operation will always be applied at his local time $\tau_B$, while Alice's is applied at her local time $\tau_A$. We will consider two configurations, which will be controlled by a quantum system. The state of the control system is given by the position of a massive body. In the first configuration, all masses are sufficiently far away such that the parties are in an approximately flat spacetime. The events in the two laboratories are chosen such that the event B is outside of A's light cone and the common-cause causal relationship is implemented. The coordinate times of the two events, as measured by a local clock of a distant observer, are $t_A \approx \tau_A$ and $t_B \approx \tau_B$ (Fig. 4 (a)).
In the second configuration, a mass $M$ is put closer to Bob's laboratory than to Alice's such that his clock runs slower with respect to hers due to gravitational time dilation. With a suitable choice of mass and distance between Alice and Bob, the event B, which is defined by his clock showing local time $\tau_B$, will be inside A's future light cone. In terms of coordinate times one now has $t_A = \tau_A / \sqrt{-g_{00}(A)}$ and $t_B = \tau_B / \sqrt{-g_{00}(B)}$, where $g_{00}(A)$ and $g_{00}(B)$ are the "00" components of the metric tensor at the position of the laboratories. This configuration can implement the direct-cause relationship (Fig. 4 (b)).
If the mass $M$ is initially in a coherent spatial superposition of a position close and a position far away from Bob, the quantum superposition of causal structures $W_{\mathrm{coherent}}$, as given in (7), is implemented. The position of the mass acts as the control system $C_I^{(0)}$ (with the state $|0\rangle$ corresponding to the mass being far away from Bob and the state $|1\rangle$ corresponding to the mass being close to Bob); it can be received by Charlie, who can manipulate it further (in particular, measure it in the superposition basis). Any possible information about the causal structure (direct cause or common cause) encoded in the degrees of freedom of the laboratories, such as for example in the clocks of the labs, must be erased, possibly using the methods of Ref. [19].
Note that, in contrast to the superposition of different causal orders [18,19], the time dilation necessary to "move B in or out" of the light cone can, in principle, be made arbitrarily small, if Bob can define $\tau_B$ and thus the event B with a sufficiently precise clock.
To give an idea of the orders of magnitude involved: for a spatial superposition of the order of $\Delta x = 1$ mm and a mass of $M = 1$ g, Bob's clock should resolve one part in $10^{27}$ to be able to certify the nonclassicality of the causal structure. This regime is still quite far from experimental implementation, since the best molecule interferometers [35] do not go beyond $M = 10^{5}$ amu, $\Delta x = 10^{-6}$ m, while the best atomic lattice clocks achieve a precision of one part in $10^{18}$ [36]. An additional difficulty consists in avoiding significant entanglement between the position of the mass and systems other than the local clocks. Nonetheless this regime is still far away from the Planck scale that is usually assumed to be relevant for quantum gravity effects.
We also stress that the process $W_{\mathrm{coherent}}$, although it cannot be decomposed as a convex combination of a common-cause and a direct-cause process, is still compatible with the causal order A ≺ B ≺ C and, as such [37], can be realized as a quantum circuit, as shown in Fig. 5.
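The quoted order of magnitude can be checked with a rough back-of-the-envelope estimate. The sketch below assumes the weak-field limit, in which a clock at distance $r$ from a mass $M$ is slowed by a fractional amount of about $GM/(rc^2)$ relative to a clock far away, and it further assumes that the relevant distance between the mass and Bob's clock is of the order of $\Delta x$. Both are assumptions of this illustration, not statements from the text, and the function name is hypothetical.

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C_LIGHT = 2.998e8    # speed of light, m/s
AMU = 1.6605e-27     # atomic mass unit, kg

def fractional_time_dilation(mass_kg, distance_m):
    """Weak-field estimate G M / (r c^2) of the fractional clock slowdown
    at distance r from a mass M, relative to a clock far away."""
    return G * mass_kg / (distance_m * C_LIGHT**2)

# Scenario from the text: M = 1 g, with r taken to be ~ Delta x = 1 mm (assumption).
shift_gram = fractional_time_dilation(1e-3, 1e-3)

# Current molecule interferometry scale: M = 1e5 amu, Delta x = 1e-6 m.
shift_molecule = fractional_time_dilation(1e5 * AMU, 1e-6)

print(f"1 g at 1 mm:        {shift_gram:.2e}")
print(f"1e5 amu at 1e-6 m:  {shift_molecule:.2e}")
```

Under these assumptions the first estimate comes out at roughly $7 \times 10^{-28}$, consistent with the requirement that Bob's clock resolve about one part in $10^{27}$; the second is many orders of magnitude smaller still.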

V. VERIFYING THE NONCLASSICALITY OF CAUSAL STRUCTURES
We now provide an experimentally accessible and efficiently computable measure of the nonclassicality of causality.
Let us first define the set $\mathcal{S}$ of operators which are positive on any convex combination $W_{\mathrm{conv}}$ of direct-cause and common-cause processes (i.e., processes satisfying (6)):
$$\mathcal{S} := \{ S \mid \mathrm{tr}[S\, W_{\mathrm{conv}}] \geq 0 \ \ \forall\, W_{\mathrm{conv}} \}. \quad (8)$$
If $S$ is positive on all convex combinations of direct-cause and common-cause process matrices, then it is also positive on all direct-cause ($\mathrm{tr}[S\, W_{\mathrm{dc}}] \geq 0$) and common-cause ($\mathrm{tr}[S\, W_{\mathrm{cc}}] \geq 0$) processes individually.
Since $W_{\mathrm{dc}}$ is a direct-cause process (4) if and only if the operator $\mathrm{tr}_{C_I B_O} W_{\mathrm{dc}}$ is separable with respect to the bipartition $(A_I, A_O B_I)$, we effectively require $S$ to be an entanglement witness [38,39] of the reduced process for the bipartition $(A_I, A_O B_I)$. The full characterization of the set of entanglement witnesses is known to be computationally hard [40]. Instead, we will use the positive partial transpose [41,42] criterion as a relaxation to define an efficiently computable measure of nonclassicality.
Enforcing that $S$ is positive on common-cause process matrices in terms of semidefinite constraints is straightforward: since the condition (5) for $W_{\mathrm{cc}}$ to be a common-cause process is already a semidefinite constraint, the "dual" constraint for $S$ to be positive on all common-cause process matrices is semidefinite as well.
The operators in the set $\mathcal{S}_{\mathrm{SDP}}$ (explicitly constructed in Appendix D) are defined as those that obey both the condition of having a positive partial transpose and being positive on all common-cause process matrices. Every $S \in \mathcal{S}_{\mathrm{SDP}}$ has non-negative trace with any $W_{\mathrm{conv}}$. Conversely, $\mathrm{tr}[S W] < 0$ certifies that the process $W$ is a genuinely nonclassical causal structure; the operators $S$ can therefore be used as nonclassicality of causality witnesses.
It is crucial to realize that for every given genuinely quantum $W$, one can efficiently optimize (the optimization is a semidefinite program [24]) over the set of nonclassicality witnesses to find the one that has minimal trace with $W$:
$$\min\ \mathrm{tr}[S W] \quad \text{s.t.} \quad S \in \mathcal{S}_{\mathrm{SDP}}, \quad \mathbb{1}/d_O - S \in \mathcal{W}^*, \quad (9)$$
where $\mathcal{W}^*$ is the dual cone of $\mathcal{W}$, given in Appendix C.
The normalization condition $\mathbb{1}/d_O - S \in \mathcal{W}^*$ is necessary for the optimization to reach a finite minimum and confers an operational meaning to $C(W) := -\mathrm{tr}[S_{\mathrm{opt}} W]$: it is the amount of "worst-case noise" the process can tolerate before its quantum features stop being detectable by witnesses in $\mathcal{S}_{\mathrm{SDP}}$ (in analogy to the "generalized robustness of entanglement" [43]). Because of its ability to certify the quantum nonclassicality of causal structures, we will refer to $C(\cdot)$ as the "nonclassicality of causality". Note that $C(\cdot)$ satisfies the natural properties of convexity and monotonicity under local operations (see Appendix E).
To experimentally verify the properties of a process like $W_{\mathrm{coherent}}$, one can use the semidefinite program (9) to compute the optimal nonclassicality of causality witness $S_{\mathrm{opt}}$ for $W_{\mathrm{coherent}}$. The nonclassicality of causality $C(W_{\mathrm{coherent}})$ can be measured by decomposing $S_{\mathrm{opt}}$ in a convenient basis of local operations. In general, this is as demanding as performing a full "causal tomography" [15,32,33].
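The positive partial transpose criterion used as a relaxation here can be illustrated on the simplest bipartite case. The following sketch is a generic two-qubit example, not the reduced process of the paper: it computes the partial transpose on the first subsystem and checks the sign of its smallest eigenvalue. The function names and the two test states are assumptions chosen for illustration.

```python
import numpy as np

def partial_transpose(rho, dims):
    """Transpose the first subsystem of an operator on H_A (x) H_B."""
    dA, dB = dims
    T = rho.reshape(dA, dB, dA, dB)        # indices: (a, b, a', b')
    T = T.transpose(2, 1, 0, 3)            # swap a <-> a'
    return T.reshape(dA * dB, dA * dB)

def min_pt_eigenvalue(rho, dims):
    """Smallest eigenvalue of the partial transpose (Hermitian, so eigvalsh applies)."""
    return np.linalg.eigvalsh(partial_transpose(rho, dims)).min()

def is_ppt(rho, dims, tol=1e-12):
    """True if the partial transpose is positive semidefinite."""
    return min_pt_eigenvalue(rho, dims) >= -tol

# Maximally entangled two-qubit state (|00> + |11>)/sqrt(2): fails the PPT test.
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
rho_entangled = np.outer(bell, bell)

# Product state |0><0| (x) 1/2: passes the PPT test.
rho_product = np.kron(np.diag([1.0, 0.0]), np.eye(2) / 2)

entangled_is_ppt = is_ppt(rho_entangled, (2, 2))
product_is_ppt = is_ppt(rho_product, (2, 2))
```

A negative eigenvalue of the partial transpose certifies entanglement, while a positive partial transpose does not in general guarantee separability; this one-sidedness is what makes the resulting measure a relaxation.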

VI. CAUSAL INFERENCE UNDER EXPERIMENTAL CONSTRAINTS
There are two reasons to consider witnesses that are subject to certain additional restrictions. First, there might be various technical limitations arising from the experimental setup [23,33], which make full tomography impractical. Second, in analogy to the classical case, it is of conceptual interest to investigate the power of quantum causal inference mechanisms working on limited data. In particular, one might want to investigate differences between quantum and classical causal inference algorithms under such constraints.
As an application of this method, we will examine witnesses for the process $W_{\mathrm{coherent}}$. In the following, we will consider qubit input and output spaces, i.e., $\dim A_I = \dim A_O = \dim B_I = \dim C_I^{(0,1,2)} = 2$ for simplicity and computational speed. The optimal witness for $W_{\mathrm{coherent}}$, obtained from the optimization (9) using YALMIP [44] with the solver MOSEK [45], leads to a nonclassicality of causality of $C(W_{\mathrm{coherent}}) = -\mathrm{tr}[S_{\mathrm{opt}} W_{\mathrm{coherent}}] \approx 0.2278$.
An intriguing feature of quantum causal models is that direct-cause correlations (Fig. 1 (a)) and common-cause correlations (Fig. 1 (b)) can be distinguished through a restricted class of informationally symmetric operations [31], sometimes called "observations" [33,46], that are non-demolition measurements (we refer the reader to Appendix H for certain issues with this definition). We can constrain a witness $S^{\mathrm{ndmeas}}$ to consist of linear combinations of such non-demolition measurements through an additional condition to the semidefinite program (9), given in Appendix F. Surprisingly, purely "observational" witnesses are sufficient not only to distinguish common-cause from direct-cause processes, but also to distinguish a classical mixture of direct-cause and common-cause processes from a genuine quantum superposition, since $-\mathrm{tr}[S^{\mathrm{ndmeas}}_{\mathrm{opt}} W_{\mathrm{coherent}}] \approx 0.0732$.
Since measurements and repreparations and even non-demolition measurements are often challenging to implement [47], it can also be useful to consider a nonclassicality of causality witness $S^{\mathrm{unitary}}$ which can be decomposed into unitary operations for A and B, and arbitrary measurements for C. The requirement can also easily be translated into a semidefinite constraint, given in Appendix F. One finds that $-\mathrm{tr}[S^{\mathrm{unitary}}_{\mathrm{opt}} W_{\mathrm{coherent}}] \approx 0.1686$. A summary of the different constraints mentioned in this section can be found in Appendix F.

VII. CONCLUSIONS
We presented a three-event quantum causal model compatible with the causal order A ≺ B ≺ C which is a quantum controlled coherent superposition between common-cause and direct-cause models, not a classical mixture thereof.
The experimental implementation we proposed is of conceptual interest, since it relies both on general relativity and the quantum superposition principle, two elements we expect to feature in a full theory unifying quantum theory and general relativity. Interestingly, both the mass of the object and the separation between the two amplitudes can be arbitrarily small, as long as Bob has access to a sufficiently precise clock to define the instant of his event B.
In order to experimentally certify a genuinely quantum causal structure, we introduced and characterized nonclassicality of causality witnesses and provided a semidefinite program to efficiently compute them. Experimental and conceptual constraints are readily included in the framework.
The potential of quantum causal structures as a quantum information resource was recently demonstrated in terms of query complexity [20] and communication complexity [21,22], but is still poorly understood. It would be interesting to understand which advantages could be obtained from the coherent superpositions of common-cause and direct-cause processes.
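Remark. In the final stages of completing this manuscript, a related work by MacLean et al. [34] appeared independently. The difference in the definitions of direct-cause processes between the two papers and its implications are discussed in Appendix G.

The CJ representation of a local operation $\mathcal{M}_A$ is defined as
$$M^{A_I A_O} := \left[ (\mathcal{I} \otimes \mathcal{M}_A)\left( |I\rangle\rangle\langle\langle I| \right) \right]^T, \quad (A1)$$
where $\mathcal{I}$ is the identity map, $|I\rangle\rangle := \sum_{j=1}^{d_{\mathcal{H}_I}} |jj\rangle \in \mathcal{H}_I \otimes \mathcal{H}_I$ is a non-normalized maximally entangled state and $T$ denotes matrix transposition in the computational basis.
The inverse transformation is then defined as:
$$\mathcal{M}_A(\rho) = \left( \mathrm{tr}_{A_I}\left[ (\rho \otimes \mathbb{1}^{A_O})\, M^{A_I A_O} \right] \right)^T. \quad (A2)$$
For operations which have a single Kraus operator ($\mathcal{M}_A(\rho) = A \rho A^\dagger$), one can also define a "pure CJ-isomorphism" [48,49], which maps the operation to a vector:
$$|A^*\rangle\rangle := (\mathbb{1} \otimes A^*)\, |I\rangle\rangle. \quad (A3)$$
The usual CJ-representation of such an operation is simply the projector onto the CJ-vector:
$$M^{A_I A_O} = |A^*\rangle\rangle\langle\langle A^*|. \quad (A4)$$

We first introduce a shorthand that we will use throughout the following appendices:
$${}_X W := \frac{\mathbb{1}^X}{d_X} \otimes \mathrm{tr}_X W, \quad (B1)$$
where $d_X$ is the dimension of the Hilbert space $X$.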
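The CJ definitions of Appendix A can be checked numerically. The sketch below assumes the transposed convention $M^{A_I A_O} = [(\mathcal{I} \otimes \mathcal{M}_A)(|I\rangle\rangle\langle\langle I|)]^T$, the inverse $\mathcal{M}_A(\rho) = (\mathrm{tr}_{A_I}[(\rho \otimes \mathbb{1}) M^{A_I A_O}])^T$ and the pure CJ vector $(\mathbb{1} \otimes A^*)|I\rangle\rangle$. The amplitude-damping channel, the unitary and the test state are arbitrary choices made for this illustration, and all function names are hypothetical.

```python
import numpy as np

def cj_operator(kraus, d_in):
    """CJ operator [(id (x) M)(|I>><<I|)]^T of the map rho -> sum_K K rho K^dagger."""
    d_out = kraus[0].shape[0]
    J = np.zeros((d_in * d_out, d_in * d_out), dtype=complex)
    for i in range(d_in):
        for j in range(d_in):
            E = np.zeros((d_in, d_in), dtype=complex)
            E[i, j] = 1.0                                  # matrix unit |i><j|
            out = sum(K @ E @ K.conj().T for K in kraus)   # M(|i><j|)
            J += np.kron(E, out)
    return J.T                                             # transposition convention

def apply_via_cj(M, rho, d_in, d_out):
    """Recover M(rho) from the CJ operator: (tr_in[(rho (x) 1) M])^T."""
    X = (np.kron(rho, np.eye(d_out)) @ M).reshape(d_in, d_out, d_in, d_out)
    return np.trace(X, axis1=0, axis2=2).T

def cj_vector(A):
    """Pure CJ vector (1 (x) A^*)|I>> of a single Kraus operator A."""
    d_in = A.shape[1]
    I_vec = np.eye(d_in).reshape(d_in * d_in)              # |I>> = sum_j |jj>
    return np.kron(np.eye(d_in), A.conj()) @ I_vec

# Example channel (assumption): amplitude damping with damping parameter 0.3.
g = 0.3
kraus = [np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex),
         np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex)]
rho = np.array([[0.6, 0.2 - 0.1j], [0.2 + 0.1j, 0.4]])

direct = sum(K @ rho @ K.conj().T for K in kraus)
via_cj = apply_via_cj(cj_operator(kraus, 2), rho, 2, 2)

# Example unitary (assumption) with complex entries, to exercise the conjugation.
U = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)
v = cj_vector(U)
projector = np.outer(v, v.conj())
```

The complex entries in `rho` and `U` are deliberate: with real matrices the transposition and complex conjugation in the convention would not be tested.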
In this paper, we consider three parties, where C's output space $C_O$ can be disregarded. The process matrix $W \in A_I \otimes A_O \otimes B_I \otimes B_O \otimes C_I$, which encodes the quantum causal model, is defined on the dual space to the tensor products of the maps. Since both the "common-cause" and the "direct-cause" scenarios are compatible with the causal order A ≺ B ≺ C, we can also represent the process matrix $W$ as a circuit (see Fig. 5). For instance, the coherent superposition of common cause and direct cause, defined in (7), would consist of $|\psi\rangle = |\phi^+\rangle \otimes (|0\rangle + |1\rangle)/\sqrt{2}$, with $W_1$ and $W_2$ being control-SWAPs (where the control is the last qubit, initially in the state $(|0\rangle + |1\rangle)/\sqrt{2}$).
We now define the projection $L_{A \prec B \prec C}(\cdot)$ onto the linear subspace of process matrices compatible with the causal order A ≺ B ≺ C, which can be derived from the conditions given in Ref. [32]: $W_{A \prec B \prec C}$ is compatible with the causal order A ≺ B ≺ C if and only if $W_{A \prec B \prec C} = L_{A \prec B \prec C}(W_{A \prec B \prec C})$ holds. The projection onto the subspace of common-cause process matrices $L_{\mathrm{cc}}(\cdot)$ is given by composing the projection $L_{A \prec B \prec C}$ with the projection onto processes which have no channel from $A_O$ to $B_I$:

Given the definition (2) of the cone $\mathcal{W}$, we can characterize the dual cone $\mathcal{W}^*$ of all operators whose product with operators in $\mathcal{W}$ has positive trace. Remember that $\mathcal{W}$ is the intersection of the cone of positive operators $\mathcal{P}$ with a linear subspace defined by the conditions for causal order: $\mathcal{W} := \mathcal{P} \cap L_{A \prec B \prec C}$.
The dual of the linear subspace $L_{A \prec B \prec C}$ is its orthogonal complement [24,32], i.e., the space of operators with a support that is orthogonal to the original subspace:
$$L^*_{A \prec B \prec C} = L^{\perp}_{A \prec B \prec C}. \quad (C1)$$
Additionally, the dual of the intersection of two closed convex cones containing the origin is the convex union of their duals [24,32], so that
$$\mathcal{W}^* = (\mathcal{P} \cap L_{A \prec B \prec C})^* = \mathrm{conv}(\mathcal{P}^* \cup L^*_{A \prec B \prec C}). \quad (C2)$$
Since the cone of positive operators is self-dual ($\mathcal{P}^* = \mathcal{P}$), we can combine (C1) and (C2) into $\mathcal{W}^* = \mathrm{conv}(\mathcal{P} \cup L^{\perp}_{A \prec B \prec C})$. Explicitly, this means that any operator $Q \in \mathcal{W}^*$ can be decomposed as
$$Q = Q_1 + Q_2, \qquad Q_1 \geq 0, \qquad L_{A \prec B \prec C}(Q_2) = 0. \quad (C3)$$

Appendix D: Nonclassicality of causality witnesses
We will now explicitly construct the set of nonclassicality of causality witnesses $\mathcal{S}_{\mathrm{SDP}}$.
The semidefinite relaxation of the direct-cause constraint (4) in terms of positive partial transposition is (using the shorthand introduced in (B1)):
$$\left( {}_{B_O C_I} W \right)^{T_{A_I}} \geq 0. \quad (D1)$$
The dual cone (D2) to the cone of relaxed direct-cause processes defined by the intersection of $\mathcal{W}$ with the cone defined in (D1) and the dual cone (D3) to the cone of common-cause processes defined by the intersection of $\mathcal{W}$ with the linear subspace (5) can be constructed in the same way as in Appendix C.
The set of witnesses positive on all positive partial transpose operators is a subset of entanglement witnesses. Every witness belonging to this set satisfies: If $\mathrm{tr}[S_{\mathrm{dc}} W] < 0$, this implies that $W$ is not a direct-cause process as defined in Eq. (4). Note that since we are only considering a subset of entanglement witnesses, the converse does not hold.
We can now turn to the requirement that $S$ is positive on common-cause processes. Since condition (5) (corresponding to (B3) together with positivity) defines a convex cone, we can use the techniques of Appendix C to construct the dual cone, of which the witness will be an element. This leads us to write $S$ as
$$S_{\mathrm{cc}} = S_P + S^{\perp}, \qquad S_P \geq 0, \qquad L_{\mathrm{cc}}(S^{\perp}) = 0, \quad (D3)$$
where the projection onto the common-cause subspace $L_{\mathrm{cc}}$ is defined in Appendix B. $W$ is not a common-cause process as defined in (5) if and only if there exists an $S_{\mathrm{cc}}$ such that $\mathrm{tr}[S_{\mathrm{cc}} W] < 0$.
Now, combining both conditions, we can construct a set of operators positive on all mixtures of direct-cause and common-cause processes only in terms of semidefinite constraints. To test whether an arbitrary process $W$ is of this type, we can run the following semidefinite program (SDP) [24]:
$$\min\ \mathrm{tr}[S W] \quad \text{s.t.} \quad S \in \mathcal{S}_{\mathrm{SDP}}, \quad \mathbb{1}/d_O - S \in \mathcal{W}^*. \quad (D4)$$
The last condition, where $\mathcal{W}^*$ is the cone dual to $\mathcal{W}$ (see Appendix C), imposes a normalization on $S$. It gives the nonclassicality of causality $C(W) = -\mathrm{tr}[S_{\mathrm{opt}} W]$ the operational meaning of "generalized robustness", quantifying resistance of the nonclassicality detectable by $\mathcal{S}_{\mathrm{SDP}}$ to worst possible noise [32,43]. This becomes more intuitive from the dual SDP, given by s.t. $W + \Omega = W_{\mathrm{cc}} + W_{\mathrm{dc}}$, The process $\Omega \cdot d_O / \mathrm{tr}[\Omega]$ can be interpreted as worst-case noise with respect to the optimal witness $S_{\mathrm{opt}}$, resulting from the SDP (D4).
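As a hedged illustration of the projection onto the subspace of ordered processes, the sketch below implements the "trace and replace" operation ${}_X W = \frac{\mathbb{1}^X}{d_X} \otimes \mathrm{tr}_X W$ and one candidate form of the projector. The explicit expression $L(W) = W - {}_{C_I}W + {}_{B_O C_I}W - {}_{B_I B_O C_I}W + {}_{A_O B_I B_O C_I}W$ is an assumption of this example: it follows from the standard conditions for a process with fixed order A ≺ B ≺ C and trivial output of C, namely ${}_{C_I}W = {}_{B_O C_I}W$ and ${}_{B_I B_O C_I}W = {}_{A_O B_I B_O C_I}W$, and is not quoted from the source. All qubit dimensions, test operators and function names are likewise illustrative.

```python
import numpy as np

DIMS = [2, 2, 2, 2, 2]                      # A_I, A_O, B_I, B_O, C_I (qubits)
A_I, A_O, B_I, B_O, C_I = range(5)

def trace_replace(W, systems, dims=DIMS):
    """Trace out each listed subsystem and replace it by identity / dimension."""
    n = len(dims)
    T = W.reshape(dims + dims)              # n row indices followed by n column indices
    for k in systems:
        reduced = np.trace(T, axis1=k, axis2=n + k)          # partial trace over k
        T = np.multiply.outer(reduced, np.eye(dims[k]) / dims[k])
        T = np.moveaxis(T, [-2, -1], [k, n + k])             # put the identity back at slot k
    D = int(np.prod(dims))
    return T.reshape(D, D)

def project_ordered(W):
    """Candidate projector onto processes compatible with A < B < C (assumed form)."""
    return (W
            - trace_replace(W, [C_I])
            + trace_replace(W, [B_O, C_I])
            - trace_replace(W, [B_I, B_O, C_I])
            + trace_replace(W, [A_O, B_I, B_O, C_I]))

# Ordered example: state into A_I, identity channels A_O -> B_I and B_O -> C_I.
I_vec = np.eye(2).reshape(4)                # |I>> = |00> + |11>
channel = np.outer(I_vec, I_vec)
rho = np.diag([0.7, 0.3])
W_ordered = np.kron(np.kron(rho, channel), channel)

# Counterexample: a fixed non-maximally-mixed operator on B_O is not an ordered process.
one = np.eye(2)
W_bad = np.kron(np.kron(np.kron(np.kron(one, one), one), np.diag([1.0, 0.0])), one)

in_subspace = np.allclose(project_ordered(W_ordered), W_ordered)
bad_in_subspace = np.allclose(project_ordered(W_bad), W_bad)
```

Under the same assumptions, the additional common-cause requirement (no channel from $A_O$ to $B_I$) would correspond to the further linear condition ${}_{B_O C_I}W = {}_{A_O B_O C_I}W$, which can be composed with `project_ordered` in the same style.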

Appendix E: Convexity and monotonicity
Here we prove that the nonclassicality of causality defined as C(W ) := − tr[S opt W ], which results from the SDP (D4), satisfies the natural properties of convexity and monotonicity, following analogous proofs of Ref. [32].
Convexity means that $C(\sum_i p_i W_i) \leq \sum_i p_i C(W_i)$, for any $p_i \geq 0$, $\sum_i p_i = 1$. Take $S_{W_i}$ to be the optimal witness for $W_i$. Any other witness, in particular the optimal witness $S_W$ for $W := \sum_i p_i W_i$, will be less robust to noise with respect to $W_i$:
$$-\mathrm{tr}[S_W W_i] \leq -\mathrm{tr}[S_{W_i} W_i]. \quad (E1)$$
Averaging over $i$ we have
$$-\mathrm{tr}[S_W W] \leq -\sum_i p_i\, \mathrm{tr}[S_{W_i} W_i], \quad (E2)$$
which is exactly the statement of convexity for $C$.
Monotonicity under local operations means that $C(W) \geq C(\$(W))$, where $\$(\cdot)$ is the composition of $W$ with local operations.
We wish to show that $-\mathrm{tr}[S_{\$(W)}\, \$(W)] \leq -\mathrm{tr}[S_W W]$. By duality, this is equivalent to
$$-\mathrm{tr}[\$^*(S_{\$(W)})\, W] \leq -\mathrm{tr}[S_W W], \quad (E3)$$
where $\$^*(\cdot)$ is the map dual to $\$(\cdot)$. Eq. (E3) is satisfied if $\$^*(S_{\$(W)})$ is a witness, i.e., is positive on all mixtures of direct-cause and common-cause operators ($\mathrm{tr}[\$^*(S_{\$(W)})\, W_{\mathrm{mix}}] \geq 0$), and is normalized appropriately ($\mathbb{1}/d_O - \$^*(S_{\$(W)}) \in \mathcal{W}^*$). The first condition can be seen to hold by applying duality and using the fact that local operations map any mixture of direct-cause and common-cause processes to a mixture of direct-cause and common-cause processes. The second condition is equivalent to
$$\mathrm{tr}\left[ \left( \mathbb{1}/d_O - \$^*(S_{\$(W)}) \right) \Omega \right] \geq 0$$
for every process matrix $\Omega$. We apply duality and linearity of the trace to find that
$$\mathrm{tr}\left[ \left( \mathbb{1}/d_O - S_{\$(W)} \right) \$(\Omega) \right] \geq 0.$$
This relation holds because $\$(\cdot)$ maps normalized ordered process matrices to normalized ordered process matrices and $\mathbb{1}/d_O - S_{\$(W)} \in \mathcal{W}^*$ is the normalization condition for the SDP (D4).
The condition of discrimination (or faithfulness), which would mean that $C(W) > 0$ if and only if the process matrix is not a mixture of direct-cause and common-cause processes (6), is not satisfied. Since we relied on a relaxation of the direct-cause condition by using the positive partial transpose criterion, there are processes which are not a mixture satisfying (6) but for which the nonclassicality of causality is zero.
Therefore, the nonclassicality of causality is not a faithful measure of the nonclassicality of the causal structure. This is reasonable, since finding such a measure would be equivalent to finding a fully general entanglement criterion, a problem known to be computationally hard [40].

For a witness $S$ subject to additional constraints, the value $-\mathrm{tr}[S_{\mathrm{opt}} W_{\mathrm{coherent}}]$ can be interpreted as the amount of noise tolerated before the constrained set of witnesses becomes incapable of detecting the nonclassicality of causality of $W_{\mathrm{coherent}}$.
A simple example of a restriction simplifying the experimental implementation consists in disregarding the space $C_I^{(1,2)}$, i.e., to have $S = {}_{C_I^{(1,2)}} S$ as an additional constraint. The nonclassicality of causality is unaffected by this restriction, which shows that the input spaces $C_I^{(1,2)}$ do not carry any additional information about the nonclassicality of causality.
The constraint for the witness to consist only of non-demolition measurements is: where $\sigma_k$ ($k = 1, 2, 3$) are the qubit Pauli matrices and $E_l$, $l = 1, \dots, 8$, is an arbitrary basis of projectors on $C_I$'s three qubits.
The constraint for the witness to only consist of unitary operations for A and B is:

Since Ref. [34] considers the two-party case, we can merge B and C to make our scenario comparable to the one of Ref. [34]. More precisely, $B_I$ and $C_I$ are relabeled as $B_I$ and $B_O$ is disregarded, eliminating the necessity to trace over $B_O$ and $C_I$. The condition for direct-cause processes (4) then becomes
$$W_{\mathrm{dc}} = \sum_i p_i\, \rho_i^{A_I} \otimes W_i^{A_O B_I}, \quad (G1)$$
which implies that the states given to A and the channel connecting A and B can be classically correlated.
In the terminology of DAGs this convex mixture would correspond to tracing over a (hidden) classical common cause between A and B. An alternative, more restricted definition would exclude such classical correlations, i.e.,
$$W_{\mathrm{dc}} = \rho^{A_I} \otimes W^{A_O B_I}. \quad (G2)$$
It is used in Ref. [34]. To make the difference apparent, consider the convex mixture of two direct-cause processes between A and B (here, dim where the tensor products between the Hilbert spaces are implicit. $W_{\mathrm{mem}}$ classically correlates the channel between $A_O$ and $B_I$ (a classical channel with or without a bit flip) to the state in $A_I$, as shown in Fig. 6. It is of the type (G1) but not of the type (G2).
FIG. 6. Quantum causal models respecting the extended "direct-cause" condition (G1) can be thought of as a general channel with classical memory (left), or equivalently as a convex combination of direct-cause processes with no memory (right). $W$ and $W_i$ are general quantum channels, $|\psi\rangle$ an arbitrary quantum state and the gray square represents a fully dephasing channel (in an arbitrary basis).
In Ref. [34], (G3) is not considered to be a direct-cause process, nor a convex mixture (called "probabilistic mixture") of direct-cause and common-cause processes. It is instead termed a "physical mixture" of common-cause and direct-cause processes.
We instead use the broader definition (G1) because we ultimately intend to study convex combinations of common-cause and direct-cause processes (6), which means we should also allow for convex combinations of direct-cause processes. The restricted definition (G2) for direct-cause processes would lead one to regard a convex combination of a direct-cause and a common-cause process as a "probabilistic mixture", but not a convex combination of two direct-cause (cause-effect) processes.
Finally, note that the class of processes which, when post-selected on CP maps being implemented at $B_I$, result in an entangled conditional process on $A_I A_O$ is defined to be "coherent mixtures" in Ref. [34]. All of these "coherent mixtures" are nonclassical in our terminology (the processes that can be decomposed as (6) never result in an entangled conditional process on $A_I A_O$). It is not clear whether the converse is true.
Appendix H: Issues in defining a quantum "observational scheme"
Ried et al. [33] define the "observational scheme" (as opposed to the "interventionist scheme") on a quantum causal structure as composed of operations satisfying the "informational symmetry principle". We examine the subtleties and issues involved in this definition, in particular regarding the dependence on the initially assigned state.
Ref. [33] assumes that, before the observation, one assigns the (epistemic) state $\rho_{A_I}$ to the system coming into $A$'s laboratory. A quantum operation (described by the Choi-Jamiołkowski representation of the quantum instrument [50] $\{M_i^A\}$, where $i$ labels the outcome) is applied. This updates the information about the outgoing state $\rho^{(i)}_{A_O}$ but also (through retrodiction) about the incoming state $\rho^{(i)}_{A_I}$. These states are found by applying the update rules of Ref. [31]. The informational symmetry principle holds if and only if, after the operation, the states assigned to the incoming and outgoing systems are the same: $\rho^{(i)}_{A_I} = \rho^{(i)}_{A_O}$. For Ried et al., an instrument for which this informational symmetry holds is defined to be an "observation" [33]. In this sense, there can obviously be "nonpassive" observations such as non-demolition measurements. Any non-demolition measurement in a basis in which the initially assigned state $\rho_{A_I}$ is diagonal will be an observation in this sense. This matches the intuition that a classical measurement only reveals information and does not disturb the system.
If one wishes to implement measurements in arbitrary bases, the only initially assigned state that results in informational symmetry is the maximally mixed state $\rho_{A_I} = \mathbb{1}/d$ [33]. This shows how problematic the definition of the observational scheme is: it depends crucially on an initial (epistemic) assignment $\rho_{A_I}$, and only one such assignment allows all measurements to be "observations", a requirement that tolerates no amount and no type of noise. In this sense, as soon as the experimenter changes her beliefs about the incoming state in any way, she will be intervening on the system, not merely observing it.
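The dependence on the initially assigned state can be made concrete for a qubit. The sketch below is illustrative only: it assumes a projective non-demolition measurement, the standard state-update rule for the outgoing state, and the retrodiction rule $\rho^{(i)}_{A_I} \propto \sqrt{\rho}\, E_i \sqrt{\rho}$ for the incoming state. This choice of rule and the function names are assumptions and may differ in convention from the update rules cited above.

```python
import numpy as np

def psd_sqrt(rho):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(rho)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T

def updated_states(rho, projector):
    """Return (retrodicted incoming state, outgoing state) for one outcome
    of a projective non-demolition measurement (assumed update rules)."""
    p = np.trace(projector @ rho).real            # outcome probability
    out_state = projector @ rho @ projector / p   # usual projection postulate
    root = psd_sqrt(rho)
    in_state = root @ projector @ root / p        # assumed retrodiction rule
    return in_state, out_state

def is_symmetric(rho, projector):
    in_state, out_state = updated_states(rho, projector)
    return np.allclose(in_state, out_state)

P_z0 = np.array([[1.0, 0.0], [0.0, 0.0]])         # |0><0|, Z basis
P_xplus = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])  # |+><+|, X basis

biased = np.diag([0.8, 0.2])      # diagonal in the Z basis
mixed = np.eye(2) / 2             # maximally mixed state

print(is_symmetric(biased, P_z0))     # measurement basis diagonalizes rho
print(is_symmetric(biased, P_xplus))  # measurement basis does not
print(is_symmetric(mixed, P_xplus))   # maximally mixed: any basis works
```

With these assumed rules, symmetry holds when the measurement basis diagonalizes the assigned state or when that state is maximally mixed, and fails for the biased state measured in a rotated basis.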
Leaving aside these interpretative difficulties, it is interesting to realize that unitary operations also turn out to be "observations" if the initially assigned state is $\rho_{A_I} = \mathbb{1}/d$: for a unitary operation, $\rho^{(i)}_{A_O} = \rho_{A_I} = \mathbb{1}/d$. The unitary provides exactly the same information about input and output states, namely none.
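A short numerical check of the unitary case, as a sketch under stated assumptions: a unitary is treated as a single-outcome instrument, so the outgoing state is $U \rho U^\dagger$ and, since there is no outcome to condition on, the state assigned to the incoming system is assumed to stay $\rho$. The function name and test states are hypothetical.

```python
import numpy as np

def unitary_is_observation(rho, U):
    """Informational symmetry for a unitary: compare the (unchanged) incoming
    assignment rho with the outgoing state U rho U^dagger."""
    out_state = U @ rho @ U.conj().T
    in_state = rho          # assumption: no outcome, hence no retrodictive update
    return np.allclose(in_state, out_state)

hadamard = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)

# A random unitary from the QR decomposition of a complex Gaussian matrix.
rng = np.random.default_rng(1)
gauss = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
random_unitary, _ = np.linalg.qr(gauss)

biased = np.diag([0.8, 0.2])
mixed = np.eye(2) / 2

print(unitary_is_observation(mixed, hadamard))        # symmetric
print(unitary_is_observation(mixed, random_unitary))  # symmetric for any unitary
print(unitary_is_observation(biased, hadamard))       # a biased assignment breaks it
```

The maximally mixed assignment is invariant under every unitary, which is why input and output assignments coincide while carrying no information.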
Finally, note that both the framework of Ref. [33] and the one we developed rely on the assumption that quantum theory is valid and that the correct operations were implemented; that is, the analysis is device-dependent. This means that any "quantum advantage" in inference will not be based on mere correlations in the sense of a conditional probability distribution of outputs given inputs. This makes the comparison with the power of classical causal models somewhat problematic.