
Updating the Born rule

Published 4 May 2018 © 2018 The Author(s). Published by IOP Publishing Ltd on behalf of Deutsche Physikalische Gesellschaft
Citation: Sally Shrapnel et al 2018 New J. Phys. 20 053010, DOI 10.1088/1367-2630/aabe12


1367-2630/20/5/053010

Abstract

Despite the tremendous empirical success of quantum theory there is still widespread disagreement about what it can tell us about the nature of the world. A central question is whether the theory is about our knowledge of reality, or a direct statement about reality itself. Current interpretations of quantum theory, regardless of their stance on this question, regard the Born rule as fundamental and add an independent state update (or 'collapse') rule to describe how quantum states change upon measurement. In this paper we present an alternative perspective and derive a unified probability rule that subsumes both the Born rule and the collapse rule. We show that this more fundamental probability rule can provide a rigorous foundation for informational, or 'knowledge-based', interpretations of quantum theory. Our result requires an assumption of instrument non-contextuality, a key notion that generalises previous approaches to non-contextuality. Therefore, the framework also permits one to consider non-contextuality in scenarios with arbitrary causal structure.


Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

Introduction

Knowledge-based, or informational, views of quantum theory are popular for a variety of reasons. Perhaps one of the strongest motivations for this perspective comes from the conceptual difficulties that surround quantum state collapse upon measurement. If quantum states are a direct description of reality then this seems to demand that collapse is a nonlinear, stochastic and temporally ill-defined physical process [1–4]. From a 'knowledge' perspective, however, collapse is seen merely as a form of information update, no more problematic than classical probabilistic conditioning [5–12]. Whilst compelling, there is an obvious problem with this kind of approach. Classical probabilistic conditioning treats multiple consecutive events on a single system on exactly the same footing as multiple events on distinct systems: joint probabilities are defined in exactly the same way in each case. Thus, classical joint probabilities can be assigned to events in a manner that is independent of the spatio-temporal relationships between those events. In quantum mechanics, however, the Born rule does not assign joint probabilities to consecutive events [13] (see figure 1). This means that knowledge-based interpretations, where one argues that the Born rule is fundamental and the state update rule 'merely a case of probabilistic conditioning', are deeply unsatisfactory. Both rules have to be introduced and justified separately.

Figure 1. Quantum probability rules. (a) The Born rule assigns probabilities to measurements on distinct systems: for a state ρ, and measurement operators $E^{A}$, $E^{B}$, the probability is $P({E}^{A},{E}^{B})=\mathrm{Tr}[({E}^{A}\otimes {E}^{B})\cdot \rho ]$. (b) For two consecutive measurements on the same system, one cannot apply the Born rule without first updating the state. The state update rule, defined as $\rho \rightsquigarrow{\widetilde{\rho }}_{{{ \mathcal M }}^{A}}={{ \mathcal M }}^{A}(\rho )/\mathrm{Tr}{{ \mathcal M }}^{A}(\rho )$ for a completely positive map ${{ \mathcal M }}^{A}$ describing the first measurement, is typically introduced as an independent axiom in the theory.


In this paper we aim to provide a solution to this problem and breathe new life into the knowledge-based view of quantum theory. We present a proof of a unified quantum probability rule that subsumes both the Born rule and the state update rule. This rule is useful in a variety of contexts, from quantum information [14–19] to quantum causal modelling [20–23], and non-Markovian dynamics [24–27]. Dubbed the 'Quantum Process Rule', we prove that one can derive this higher-order, generalised form of the standard quantum probability rule from the structure of quantum operations and a single non-contextuality assumption. This approach is analogous to Gleason's [28] and related [29–32] derivations of the ordinary Born rule. We also show that using this more fundamental approach, where one assigns joint probabilities to arbitrary quantum events, it is possible to derive both the Born rule and the state update rule. A key conceptual advantage is that state update, or 'collapse', need no longer be viewed as an ad hoc ingredient, independent and estranged from the core of the theory.

Results

Measurements and the Born rule

In order to introduce the fewest possible assumptions, we take an explicitly operational perspective. Operational theories can be phrased in terms of events, which define the results of measurements. Each time a measurement is performed on a system, a number of possible events can be observed. The set of all events that can result from a specific measurement is called a context. It is natural when constructing such a theory to assume measurement non-contextuality [6, 30]. This means that operationally indistinguishable events should have the same mathematical representation in the theory$^{2}$. Clearly, any probabilistic theory can be formulated in a non-contextual way by appropriate relabelling of the mathematical objects describing events.

In this setting, the minimal task of a physical theory is to non-contextually assign probabilities to such measurement events. In essence this is the 'probability rule' of the theory and also defines the relevant state-space. One can represent any such non-contextual probability rule (the Born rule being a prime example) by means of a frame function. This is a function that associates a probability to every event, independently of the context to which it belongs, such that probabilities for all events in a given context sum up to one$^{3}$. Crucially, the frame function is not a probability distribution over the space of all events, as that would require a normalised measure over the entire space.

As was shown in [29–32], the notion of a frame function can be used to derive the Born rule as the appropriate non-contextual probability rule to apply when one identifies events with the results of a measurement on a quantum system. In this approach, events are identified with quantum effects: for a d-level quantum system, the full set of quantum effects is defined as ${{ \mathcal E }}_{d}:= \{E\in { \mathcal L }({{ \mathcal H }}_{d}),0\leqslant E\leqslant {\mathbb{1}}\}$, where ${ \mathcal L }({{ \mathcal H }}_{d})$ is the space of linear operators on a d-dimensional Hilbert space ${{ \mathcal H }}_{d}$. Contexts are described by positive-operator-valued measures (POVMs), lists of effect operators $\{{E}_{1},{E}_{2},\,\mathrm{...}\}$ that sum up to the identity, ${\sum }_{j}{E}_{j}={\mathbb{1}}$. The subscript j labels the list of possible measurement outcomes for a given context.

Assuming measurement non-contextuality here means that the probability of a particular quantum effect (equivalently, measurement outcome) is assumed to be independent of the context (POVM) to which it belongs. Operationally, this means that the probability assigned to a given event does not depend on any extra information regarding how it was achieved.

A frame function for quantum effects is defined as a mapping from the set of all effects to the unit interval:

$$f:{\mathcal E}_{d}\to [0,1], \tag{1}$$

satisfying

$$\sum_{E_j\in X} f(E_j)=1 \tag{2}$$

for any set $X=\{{E}_{1},{E}_{2},\,\ldots \}$, ${E}_{j}\in {{ \mathcal E }}_{d}$ such that

$$\sum_{E_j\in X} E_j={\mathbb{1}}. \tag{3}$$

Using this definition, the task then is to prove that for each frame function, f, there is a unit-trace positive operator ρ such that $f({E}_{j})=\mathrm{Tr}(\rho {E}_{j})$ $^{4}$.

The proof in [30] follows three simple steps. First, one proves linearity of the frame function over the field of non-negative rational numbers, then extension to full linearity is obtained by proving continuity of the frame function. Then, as the frame function has been proved to be linear, it can be recast as arising from an inner product. In particular, using the Hilbert–Schmidt inner product on the operator space ${ \mathcal L }({ \mathcal H })$, the frame function can be written as $f(E)=\mathrm{Tr}(\rho E)$ for some positive semidefinite, unit-trace operator ρ. This both characterises the Born rule and also defines the density operator as the appropriate object to represent the quantum state.
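As a small numerical illustration of what a frame function of the form $f(E)=\mathrm{Tr}(\rho E)$ does, the following sketch checks normalisation over two different contexts that share one effect. The qubit state `rho` and the two POVMs are hypothetical choices made only for this example; they do not come from [30].

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def born(rho, effect):
    """Frame function f(E) = Tr(rho E)."""
    return trace(matmul(rho, effect))

# Hypothetical qubit density matrix (unit trace, positive semidefinite).
rho = [[0.75, 0.2], [0.2, 0.25]]

E0 = [[1.0, 0.0], [0.0, 0.0]]        # effect shared by both contexts
E1 = [[0.0, 0.0], [0.0, 1.0]]
half_E1 = [[0.0, 0.0], [0.0, 0.5]]

context_1 = [E0, E1]                 # two-outcome projective measurement
context_2 = [E0, half_E1, half_E1]   # three-outcome POVM containing the same E0

total_1 = sum(born(rho, E) for E in context_1)
total_2 = sum(born(rho, E) for E in context_2)
p_E0 = born(rho, E0)                 # depends on E0 only, not on the context
```

Both totals equal one, and the probability of `E0` is computed without any reference to the POVM it was drawn from, which is the non-contextuality property in this simple case.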

As we have noted, the above proof does not tell us how to assign probabilities to consecutive events. That is, assuming we know the state of a quantum system prior to measurement, the Born rule alone does not tell us how to update this state following measurement. To remedy this situation, we now wish to provide a similar proof for a probability rule that can subsume both the Born rule and the state update rule.

Instruments and the quantum process rule

We consider more general operational primitives than those of [30] and instead consider local regions where one can perform actions that are associated with outcomes. The class of allowed local actions is broad: one can perform measurements, realise transformations, or even add and discard ancillary systems. Such actions can also be associated with local outcomes and we define a particular single case outcome, associated to a given action, as the relevant event. The event thus now labels not only the outcome but also any concurrent transformation to the local system.

Just as with effects in the traditional approaches, we assume a minimal operational labelling for transformations: different interactions of the system with an environment, that cannot be distinguished by looking at the system alone, will be assigned the same label.

If we consider a particular run of an experiment there will in general be a collection of such events that occur, one for each local region. One can associate a joint probability to this set of events, and, given enough runs of an experiment, one can empirically verify probability assignments for each possible permutation of events.

Formally, an event in region A is represented by a completely positive trace-non-increasing (CP) map ${{ \mathcal M }}^{A}:{A}_{I}\to {A}_{O}$, where input and output spaces are the spaces of linear operators over input and output Hilbert spaces of the local region, ${A}_{I}\equiv { \mathcal L }({{ \mathcal H }}^{{A}_{I}})$, ${A}_{O}\equiv { \mathcal L }({{ \mathcal H }}^{{A}_{O}})$ respectively (here identified with the corresponding matrix spaces) [36], see figure 2. We write ${L}^{A}:= { \mathcal L }({A}_{I},{A}_{O})$ for the set of linear maps from $A_{I}$ to $A_{O}$. We denote by ${{\rm{CP}}}^{X}\subset {L}^{X}$ the set of CP maps associated to each region $X$.

Figure 2. Local region. A local region A is defined by an input (${{ \mathcal H }}^{{A}_{I}}$) and an output (${{ \mathcal H }}^{{A}_{O}}$) Hilbert space. An event is represented by a completely positive map ${{ \mathcal M }}^{A}$.


We demand complete positivity because in principle it should be possible to perform arbitrary quantum operations in the local region. This includes performing operations on a subsystem that is part of a larger system. Complete positivity means that, for arbitrary dimensions of an ancillary system $A^{\prime} $, the map ${{ \mathcal I }}^{A^{\prime} }\otimes {{ \mathcal M }}^{A}$ transforms positive operators into positive operators, where ${{ \mathcal I }}^{A^{\prime} }$ is the identity map on $A^{\prime} $. Trace non-increasing means that $\mathrm{Tr}{ \mathcal M }(\rho )\leqslant \mathrm{Tr}\rho $ for all operators ρ. A CP map can be decomposed as ${ \mathcal M }(\rho )={\sum }_{j=1}^{m}{K}_{j}\rho {K}_{j}^{\dagger }$, where the Kraus operators ${K}_{j}:{{ \mathcal H }}^{{A}_{I}}\to {{ \mathcal H }}^{{A}_{O}}$, j = 1, ..., m, satisfy ${\sum }_{j=1}^{m}{K}_{j}^{\dagger }{K}_{j}\leqslant {\mathbb{1}}$ for a trace non-increasing map [37, 38].

The context for each set of CP maps is now no longer a POVM but rather a quantum instrument [39]. An instrument thus represents the collection of all possible events that can be observed given a specific choice of local action. Given a local region A, an instrument is formally defined as a set ${{\mathfrak{I}}}^{A}$ of CP maps that sum up to a completely positive trace-preserving (CPTP) map:

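$$\mathfrak{I}^{A}=\{\mathcal{M}^{A}_{j}\}_{j=1}^{m}\quad \text{such that}\quad \mathrm{Tr}\Big[\sum_{j=1}^{m}\mathcal{M}^{A}_{j}(\rho)\Big]=\mathrm{Tr}\,\rho \ \text{ for all } \rho. \tag{4}$$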
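The instrument condition can be checked numerically in a simple case. The sketch below builds a two-outcome qubit instrument in which each outcome is a CP map with a single Kraus operator; the particular operators `K0`, `K1`, the angle `theta` and the state `rho` are hypothetical choices for illustration only. It verifies that each outcome map is trace non-increasing, while the sum over outcomes is trace preserving.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def apply_cp(kraus_ops, rho):
    """M(rho) = sum_j K_j rho K_j^dagger."""
    out = [[0.0] * len(rho) for _ in rho]
    for k in kraus_ops:
        out = add(out, matmul(matmul(k, rho), dagger(k)))
    return out

theta = 0.3                                   # hypothetical measurement strength
c, s = math.cos(theta), math.sin(theta)
K0 = [[c, 0.0], [0.0, s]]
K1 = [[s, 0.0], [0.0, c]]
instrument = [[K0], [K1]]                     # one list of Kraus operators per outcome

rho = [[0.75, 0.2], [0.2, 0.25]]              # hypothetical input state

# Tr M_j(rho) for each outcome j: each is at most Tr rho = 1.
outcome_probs = [trace(apply_cp(ks, rho)) for ks in instrument]

# Sum of K^dagger K over the whole instrument: identity for a CPTP sum.
kraus_sum = [[0.0, 0.0], [0.0, 0.0]]
for ks in instrument:
    for k in ks:
        kraus_sum = add(kraus_sum, matmul(dagger(k), k))
```

Here `outcome_probs` sums to one and `kraus_sum` is the identity matrix up to rounding, which is the Kraus-operator form of the requirement that the maps in an instrument sum to a CPTP map.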

We are now in a position to define the relevant frame function and derive the appropriate probability rule for this scenario. Just as the Born rule tells us how to calculate the probability of a particular outcome given the relevant measurement operator, the Quantum Process Rule should tell us how to assign a joint probability to each possible collection of local events given the relevant instruments. We assume 'instrument' non-contextuality, rather than 'measurement' non-contextuality. That is, the joint probability for a set of events, one for each region, is independent of the particular context (set of instruments) to which they belong, see figure 3.

Figure 3. Instrument non-contextuality. Operations are performed in distinct local regions. Operation ${{ \mathcal M }}^{A}$ in region A corresponds to a shared outcome of two different instruments, ${{\mathfrak{I}}}_{1}$ and ${{\mathfrak{I}}}_{2};$ ${{ \mathcal M }}^{B}$ in region B to a shared outcome of instruments ${{\mathfrak{I}}}_{3}$ and ${{\mathfrak{I}}}_{4}$. Instrument non-contextuality implies the joint probability $P({{ \mathcal M }}^{A},{{ \mathcal M }}^{B})$ for the two events is independent of whether instrument ${{\mathfrak{I}}}_{1}$ or ${{\mathfrak{I}}}_{2}$ was used in Region A, and whether instrument ${{\mathfrak{I}}}_{3}$ or ${{\mathfrak{I}}}_{4}$ was used in region B.


As for [30], the non-contextuality assumption is formalised by requiring that probabilities are given by a frame function. Each 'frame' is now a collection of instruments, one per region, rather than a single POVM.

Definition 1. A frame function, f, for a set of local non-intersecting regions X = A, B, C, ..., is defined by:

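  • 1.  
    f is a function from the Cartesian product of the sets of CP maps associated to each region, ${{\rm{CP}}}^{X}\subset {L}^{X}$, to the unit interval:
    $$f:{\rm CP}^{A}\times {\rm CP}^{B}\times {\rm CP}^{C}\times \cdots \to [0,1]. \tag{5}$$
  • 2.  
    f is normalised for all sets of CP maps, ${{ \mathcal M }}^{X}$, that form instruments ${{\mathfrak{I}}}^{X}$,
    $$\sum_{\mathcal{M}^{A}\in \mathfrak{I}^{A}}\ \sum_{\mathcal{M}^{B}\in \mathfrak{I}^{B}}\cdots\ f(\mathcal{M}^{A},\mathcal{M}^{B},\ldots)=1. \tag{6}$$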

We now show that this definition is sufficient to derive the new probability rule. As in [30] we first prove linearity of the frame function.

Theorem 1. The frame function f is a convex-multilinear functional on ${{\rm{CP}}}^{A}\times {{\rm{CP}}}^{B}\times {{\rm{CP}}}^{C}\times ...$

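By convex-multilinear we mean:

$$f\big(p\,\mathcal{M}^{A}+(1-p)\,\mathcal{N}^{A},\mathcal{M}^{B},\ldots\big)=p\,f(\mathcal{M}^{A},\mathcal{M}^{B},\ldots)+(1-p)\,f(\mathcal{N}^{A},\mathcal{M}^{B},\ldots),\qquad p\in[0,1],$$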

and similarly for all other regions $B,C,\,...$

Proof. We fix instruments at all regions, except for region A, to be instruments with a single CPTP map each: ${\overline{{ \mathcal M }}}^{B},\,{\overline{{ \mathcal M }}}^{C},\,...$

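Consider two instruments applied in region A:

$$\mathfrak{I}^{A}_{1}=\{\mathcal{M}^{A}_{1},\mathcal{M}^{A}_{2},\mathcal{M}^{A}_{3}\},\qquad \mathfrak{I}^{A}_{2}=\{\mathcal{M}^{A}_{1}+\mathcal{M}^{A}_{2},\mathcal{M}^{A}_{3}\}.$$

The frame function constraints imply:

$$f(\mathcal{M}^{A}_{1},\overline{\mathcal{M}}^{B},\ldots)+f(\mathcal{M}^{A}_{2},\overline{\mathcal{M}}^{B},\ldots)+f(\mathcal{M}^{A}_{3},\overline{\mathcal{M}}^{B},\ldots)=1=f(\mathcal{M}^{A}_{1}+\mathcal{M}^{A}_{2},\overline{\mathcal{M}}^{B},\ldots)+f(\mathcal{M}^{A}_{3},\overline{\mathcal{M}}^{B},\ldots).$$

Therefore

$$f(\mathcal{M}^{A}_{1}+\mathcal{M}^{A}_{2},\overline{\mathcal{M}}^{B},\ldots)=f(\mathcal{M}^{A}_{1},\overline{\mathcal{M}}^{B},\ldots)+f(\mathcal{M}^{A}_{2},\overline{\mathcal{M}}^{B},\ldots),$$

and thus we have additivity for CP, trace non-increasing maps.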

We next prove homogeneity of the frame function for the rational numbers between 0 and 1. Take two integers $1\leqslant n\leqslant m\in {\mathbb{N}}$, and a CP, trace-non-increasing map ${ \mathcal M }$. By converting multiplications by integers into sums, from additivity of the frame function we have:

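$$f\Big(\frac{n}{m}\mathcal{M},\ldots\Big)=n\,f\Big(\frac{1}{m}\mathcal{M},\ldots\Big)=\frac{n}{m}\,m\,f\Big(\frac{1}{m}\mathcal{M},\ldots\Big)=\frac{n}{m}\,f(\mathcal{M},\ldots). \tag{7}$$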

Therefore, having additivity and homogeneity, we have proved the convex linearity of f for rational numbers between 0 and 1.

Convex linearity of the frame function on the real interval [0, 1] can be established using the 'squeeze theorem' of elementary calculus [40]. Define two sequences of positive rationals, $\{{a}_{n}\}$ increasing and $\{{b}_{n}\}$ decreasing with ${a}_{n}\lt {b}_{n}\leqslant 1$, that converge to the same real number c. Then, for any CP map ${{ \mathcal M }}^{A}$, the map ${{ \mathcal N }}_{n}^{A}:= (c-{a}_{n}){{ \mathcal M }}^{A}$ is also CP. Thus, fixing all maps in other regions to be CPTP, we have

$$f(c\,\mathcal{M}^{A},\ldots)=f(a_{n}\mathcal{M}^{A}+\mathcal{N}^{A}_{n},\ldots)=f(a_{n}\mathcal{M}^{A},\ldots)+f(\mathcal{N}^{A}_{n},\ldots)\geqslant f(a_{n}\mathcal{M}^{A},\ldots).$$

Similarly, we have that $f(c{{ \mathcal M }}^{A},\,\mathrm{...})\leqslant f({b}_{n}{{ \mathcal M }}^{A},\,\mathrm{...})$. Together with homogeneity over the rationals, this implies

$$a_{n}\,f(\mathcal{M}^{A},\ldots)\leqslant f(c\,\mathcal{M}^{A},\ldots)\leqslant b_{n}\,f(\mathcal{M}^{A},\ldots). \tag{8}$$

Because ${a}_{n}f({{ \mathcal M }}^{A},\,\mathrm{...})$ and ${b}_{n}f({{ \mathcal M }}^{A},\,\mathrm{...})$ both converge to ${cf}({{ \mathcal M }}^{A},\,\mathrm{...})$, equation (8) implies

$$f(c\,\mathcal{M}^{A},\ldots)=c\,f(\mathcal{M}^{A},\ldots) \tag{9}$$

by the 'squeeze theorem'.

We have thus proved that f is linear on ${\rm CP}^{A}$ and, with similar steps, linearity can be proved for ${\rm CP}^{B}$, ${\rm CP}^{C}$, ..., which concludes the proof. □

Just as in ordinary quantum mechanics a state is defined as a linear functional over effects (POVM elements), we can define a multilinear functional over sets of events (CP maps) as a process, in accordance with the terminology of [20, 22, 41–48].

We next use the fact that a linear functional can be expressed by means of an inner product. This enables us to derive a new probability rule using our frame function, and also gives the appropriate form for the matrix representation of a process.

First consider that because each convex space ${\rm CP}^{X}$ contains a basis of the entire linear space $L^{X}$, X = A, B, ..., the frame function f can be extended by linearity to a multilinear function on the spaces $L^{A}$, $L^{B}$, $L^{C}$, ... Again by linear extension, this defines a unique linear function on the product space ${L}^{A}\otimes {L}^{B}\otimes {L}^{C}\otimes ...$

Next, it is easy to show that a natural inner product between any two linear maps ${{ \mathcal M }}^{A}$, ${{ \mathcal N }}^{A}\in L^{A}$ is defined as follows (see the appendix for details):

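$$(\mathcal{M}^{A},\mathcal{N}^{A})=\sum_{\mu =0}^{d^{2}-1}\mathrm{Tr}\big[\mathcal{M}^{A}(\tau_{\mu})^{\dagger}\,\mathcal{N}^{A}(\tau_{\mu})\big], \tag{10}$$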

where ${\{{\tau }_{\mu }\}}_{\mu =0}^{{d}^{2}-1}$ is a Hilbert–Schmidt basis for the d-dimensional input space: ${\tau }_{\mu }\in { \mathcal L }({{ \mathcal H }}^{{A}_{I}})$, ${\tau }_{\mu }={\tau }_{\mu }^{\dagger }$, $\mathrm{Tr}({\tau }_{\mu }{\tau }_{\nu })={\delta }_{\mu \nu }$.

One can also represent this inner product in a more convenient (and familiar) form by representing the CP maps associated to each region as Choi–Jamiolkowski (CJ) matrices [49, 50]. Recall, a CP map associated to a region A, where input and output spaces are the spaces of linear operators over input and output Hilbert spaces, ${A}_{I}\equiv { \mathcal L }({{ \mathcal H }}^{{A}_{I}})$, ${A}_{O}\equiv { \mathcal L }({{ \mathcal H }}^{{A}_{O}})$, respectively, can be represented as a matrix$^{5}$:

Equation (11)

where T denotes transposition in a chosen basis and ${\{| j\rangle \}}_{j=1}^{{d}_{{A}_{I}}}$ is an orthonormal basis in ${{ \mathcal H }}^{{A}_{I}}$. We show in the appendix that the inner product (10) can be expressed as

Equation (12)

and it is independent of the choice of Hilbert–Schmidt basis.

This inner product determines an isomorphism between elements of ${L}^{A}\otimes {L}^{B}\otimes {L}^{C}\otimes ...$ and linear functionals on the same space. We can thus define a trace rule that allows one to determine the joint probability for a set of CP maps, one for each region:

Equation (13)

where ${{ \mathcal W }}_{f}\in {L}^{A}\otimes {L}^{B}\otimes {L}^{C}\otimes ...$ is the linear map that uniquely defines f and ${W}_{f}^{{AB}...}$ is its CJ representation, called the process matrix. (In the following, we will drop the subscript f.)

Positivity and normalisation of the frame function, equations (5) and (6) respectively, impose constraints on the operators W that define valid processes. The set of process matrices, together with the expression (13) for the frame function, defines the Quantum Process Rule. As discussed in [20], the set of matrices characterised by positivity, equation (5), is further restricted if one assumes that local operations can be extended to act on additional multipartite quantum states shared among the regions. The overall result can be summarised as follows:

Theorem 2. Given a set of regions $X=A,B,\,...$ where arbitrary quantum operations can be performed, any instrument non-contextual probability assignment, expressed through a frame function as per definition 1, must be given by the Quantum Process Rule, equation (13), where the process matrix $W$ satisfies the conditions

Equation (14)

Equation (15)

and

Equation (16)

If one additionally assumes that each operation in region $X$ can be extended to act on an additional input space ${X}_{I}^{{\prime} }$, with an arbitrary multipartite state ${\rho }^{{A}_{I}^{{\prime} }{B}_{I}^{{\prime} }...}$ shared across the regions, then the process matrix must be positive semidefinite, $W\geqslant 0$, a strictly stronger condition than equation (16).

Property (15) is the CJ representation of the trace-preserving condition; therefore, the normalisation constraint says that CPTP maps can be performed with unit probability. The resulting constraint on process matrices differs from the analogous one for density matrices, $\mathrm{Tr}\rho =1$. In addition to an analogous affine constraint, $\mathrm{Tr}W={d}_{O}$ with $d_{O}$ the product of all output dimensions, W has to satisfy further linear constraints. We refer to appendix B of [42] for an explicit characterisation of such constraints.

Recovering the state update and Born rule

Let us recapitulate the rationale so far: it was shown in [30] that if we accept the structure of quantum measurements, we can identify quantum probabilities as the most general non-contextual probability assignments. Whereas this approach only considers a single measurement/event—or at most measurements of separate quantum systems—in the quantum process approach outlined above we derive a general rule to assign joint probabilities to an arbitrary number of events. The ordinary Born rule is thus recovered from the general one in the case where a single region is considered—in which case instruments reduce to POVMs and process matrices reduce to density matrices [20].

We are in particular interested in the situation where two consecutive measurements are performed on a single quantum system. Gleason-type derivations of quantum probabilities do not tell us how to assign joint probabilities to two such events: one must introduce an additional ingredient—the state update rule. If the statistics for the first measurement are described by a density matrix ρ, and the first measurement is described by a CP map ${ \mathcal M }$, one calculates the probabilities for the second measurement, given the outcome of the first is known, by applying the Born rule to the updated state [51]

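$$\rho \rightsquigarrow \widetilde{\rho}_{\mathcal{M}}=\frac{\mathcal{M}(\rho)}{\mathrm{Tr}\,\mathcal{M}(\rho)}=\frac{\sum_{j=1}^{m}K_{j}\rho K_{j}^{\dagger}}{\mathrm{Tr}\big(\sum_{j=1}^{m}K_{j}^{\dagger}K_{j}\,\rho\big)}. \tag{17}$$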

(Note that the update rule does not depend on the particular decomposition of ${ \mathcal M }$ into Kraus operators ${\{{K}_{j}\}}_{j=1}^{m}$.) In an operational perspective, rule (17) is seen as a quantum analogue of classical knowledge update. Within the quantum process framework, this is more than an analogy: the update rule is derived from the joint probability assignment.

To make the argument rigorous, we should remark again that the quantum frame function is not a normalised probability measure over the entire space of potential events. Formally, the frame function defines a parametrised probability for observing a CP map ${{ \mathcal M }}^{X}$ given an instrument ${{\mathfrak{I}}}^{X}$ in regions X = A, ...:

Equation (18)

(This defines a conditional probability if a marginal $P({{\mathfrak{I}}}^{A},\,\mathrm{...})$ is assigned.) Even though the inclusion of instruments is necessary to define (18) as a classical probability, we will omit them in the following for notational convenience.

Expression (18) defines an ordinary, classical probability measure, which lets us use all the machinery of classical probability theory. In particular, the conditional probability to observe ${{ \mathcal M }}^{B}$ in region B, given that ${{ \mathcal M }}^{A}$ is observed in region A, can be calculated from the joint probability distribution:

Equation (19)

where we introduced the conditional process matrix [44]

Equation (20)

Relevant to the ordinary state update rule is the case where A temporally precedes B, and the evolution between the two events is trivial. This scenario is described by the process matrix (see, e.g., [22])

Equation (21)

Equation (22)

where ρ is the density matrix describing the input state of region A. A straightforward calculation shows that, in this case, the conditional process matrix reduces to

Equation (23)

which is the process matrix description of region B receiving a state described by the density matrix ${\widetilde{\rho }}_{{{ \mathcal M }}^{A}}$.
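The content of this result can be illustrated numerically without constructing the process matrix explicitly. The sketch below assumes the standard expression $P(\mathcal{M}^{A},E^{B})=\mathrm{Tr}[E^{B}\mathcal{M}^{A}(\rho)]$ for the joint probability of an event in A followed, after trivial evolution, by an effect in B; this expression is taken as given here rather than derived from equations (21) and (22). The state, Kraus operators and effects are hypothetical choices. Conditioning the joint distribution classically gives the same number as applying the Born rule to the updated state.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

rho = [[0.75, 0.2], [0.2, 0.25]]            # hypothetical input state of region A

# Region A: two-outcome instrument, one Kraus operator per outcome (hypothetical).
c, s = math.cos(0.3), math.sin(0.3)
A_events = {"a0": [[c, 0.0], [0.0, s]], "a1": [[s, 0.0], [0.0, c]]}

# Region B: projective measurement in the |+>, |-> basis.
B_effects = {"+": [[0.5, 0.5], [0.5, 0.5]], "-": [[0.5, -0.5], [-0.5, 0.5]]}

# Joint probabilities P(a, b) = Tr[E_b K_a rho K_a^dagger] (assumed form).
joint = {}
for a, K in A_events.items():
    out = matmul(matmul(K, rho), dagger(K))
    for b, E in B_effects.items():
        joint[(a, b)] = trace(matmul(E, out))

# Classical conditioning on the outcome a0 observed in region A.
p_a0 = sum(joint[("a0", b)] for b in B_effects)
conditional = joint[("a0", "+")] / p_a0

# State update rule followed by the Born rule.
out_a0 = matmul(matmul(A_events["a0"], rho), dagger(A_events["a0"]))
norm = trace(out_a0)
updated = [[x / norm for x in row] for row in out_a0]
via_update = trace(matmul(B_effects["+"], updated))
```

The values `conditional` and `via_update` agree, so in this two-step scenario the update rule adds nothing beyond conditioning on the joint assignment.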

Conditioning versus updating

Some clarification at this point might be helpful regarding the distinction between probabilistic conditioning and knowledge update (see, e.g., [52] for a more detailed discussion). In classical probability theory, the rules for probabilistic conditioning simply follow from the axioms of the theory. These axioms can be assumed, as in Kolmogorov's approach, or derived from requirements on how one should consistently assign degrees of belief, for example through Dutch book arguments [53] or theorems like that of Savage [54]. A consequence of probabilistic conditioning is Bayes' theorem for inverting conditional probabilities:

$$P(A|B)=\frac{P(B|A)\,P(A)}{P(B)}.$$

Importantly, this rule does not involve updating knowledge given new information: all information is contained in the joint probability P(A, B), which is unchanged by the application of the theorem.
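A short classical example makes the point concrete. The joint table below is a hypothetical distribution over two binary variables; both sides of Bayes' theorem are computed from it, and the table itself is never modified.

```python
# Hypothetical joint distribution P(A, B) over two binary variables.
joint = {
    ("a0", "b0"): 0.3, ("a0", "b1"): 0.2,
    ("a1", "b0"): 0.1, ("a1", "b1"): 0.4,
}

def p_A(a):
    """Marginal P(A = a)."""
    return sum(p for (x, _), p in joint.items() if x == a)

def p_B(b):
    """Marginal P(B = b)."""
    return sum(p for (_, y), p in joint.items() if y == b)

def p_A_given_B(a, b):
    return joint[(a, b)] / p_B(b)

def p_B_given_A(b, a):
    return joint[(a, b)] / p_A(a)

# Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B), both sides derived from the same table.
lhs = p_A_given_B("a0", "b0")
rhs = p_B_given_A("b0", "a0") * p_A("a0") / p_B("b0")
```

Inverting the conditional is pure bookkeeping on P(A, B): no new data enter, and no belief is revised.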

Knowledge update, on the other hand, refers to the process of updating one's belief following the acquisition of new data. This process is not encoded in the axioms of probability theory and requires extra assumptions. For example, if one assumes that all data values that were not observed can be discarded, one arrives at Bayes' rule:

$$P_{\rm new}(A)=P(A|B=b),$$

where b is the observed value of the variable B. Implicit in this rule is the counterfactual assumption that values that are not observed are known to be false. Thus, applying Bayes' theorem, one arrives at the standard form for Bayes updating:

$$P_{\rm new}(A)=\frac{P(B=b|A)\,P(A)}{P(B=b)}.$$

In the quantum case, such counterfactual assumptions are known to be problematic [55]$^{6}$. In our approach, no such assumption is necessary, because the primitive object is the joint probability. From this perspective, the 'state update' rule is not an update at all, but rather an application of probabilistic conditioning. This insight distinguishes this approach from other attempts to leverage Bayesian arguments to justify an informational interpretation of quantum theory [21, 23].

Discussion

In this work we have shown that it is possible to use a Gleason-type approach to derive a quantum probability rule that subsumes both the Born rule and the state update rule. By using the structure of local quantum operations and a reasonable non-contextuality assumption we have derived both the new rule and the appropriate object to represent the arbitrary background structure, or process. The central feature of the probability rule is linearity. In contrast to previous derivations, where linearity was assumed [17, 20], here we have shown that it can be derived from the assumption of non-contextuality alone$^{7}$.

Our demonstration that the state update, or 'collapse' rule can be regarded as non-fundamental offers a new perspective on a variety of foundational questions. In particular, informational interpretations of wavefunction collapse can now be given a rigorous foundation: state update can be viewed as a case of classical probabilistic conditioning. The work here also presents the opportunity to extend no-go theorems for non-contextual hidden variable models to scenarios involving more general causal structures [56].

Finally, a key advantage of the approach presented here is that it does not presuppose any a priori distinction between space-like and time-like separated events. Therefore, it avoids conceptual difficulties associated with the non-covariant nature of the state update rule. It is thus a promising direction for research aimed at developing a fully relativistic version of the formalism that encodes space-time symmetries.

Acknowledgments

We thank Časlav Brukner, Eric Cavalcanti, Josh Combes, Chris Timpson, and Howard Wiseman for helpful discussions. This work was supported by an Australian Research Council Centre of Excellence for Quantum Engineered Systems grant (CE 110001013), and by the Templeton World Charity Foundation (TWCF 0064/AB38). FC acknowledges support through an Australian Research Council Discovery Early Career Researcher Award (DE170100712). This publication was made possible through the support of a grant from the John Templeton Foundation. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation. We acknowledge the traditional owners of the land on which the University of Queensland is situated, the Turrbal and Jagera people.

Appendix. Inner product for linear maps

Here we construct the inner product on the space of linear maps ${L}^{A}=\{{ \mathcal M }:{ \mathcal L }({{ \mathcal H }}^{{A}_{I}})\to { \mathcal L }({{ \mathcal H }}^{{A}_{O}})\}$ and derive its CJ representation. Recall that, given an inner product $\langle \psi | \phi \rangle $ on a Hilbert space ${ \mathcal H }$ and an arbitrary basis that is orthonormal with respect to this product, $\langle {e}_{j}| {e}_{k}\rangle ={\delta }_{{jk}}$, one defines the Hilbert–Schmidt scalar product for operators $\sigma ,\rho \in { \mathcal L }({ \mathcal H })$ as

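$$(\sigma,\rho)_{\rm HS}:=\mathrm{Tr}(\sigma^{\dagger}\rho)=\sum_{j}\langle \sigma(e_{j})|\rho(e_{j})\rangle, \tag{24}$$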

where we momentarily abandon the Dirac notation and represent explicitly the action of an operator on a vector as $v\in { \mathcal H }\rightsquigarrow\rho (v)\in { \mathcal H }$. Note that the Hilbert–Schmidt inner product does not depend on which basis is used in its definition, as long as it is orthonormal with respect to the underlying Hilbert space inner product.

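We move a step further and, based on the Hilbert–Schmidt inner product, define an inner product for the space $L^{A}$ of linear maps. For this purpose, we select a basis of Hermitian matrices for the input space that is orthonormal with respect to the Hilbert–Schmidt product (called Hilbert–Schmidt basis):

$$\{\tau_{\mu}\}_{\mu =0}^{d^{2}-1},\qquad \tau_{\mu}\in \mathcal{L}(\mathcal{H}^{A_{I}}),\qquad \tau_{\mu}=\tau_{\mu}^{\dagger},\qquad \mathrm{Tr}(\tau_{\mu}\tau_{\nu})=\delta_{\mu\nu}.$$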

The inner product between any two linear maps ${ \mathcal M }$, ${ \mathcal N }$ is then defined in analogy to equation (24) and coincides with the inner product introduced in the main text:

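$$(\mathcal{M},\mathcal{N})_{\rm S}:=\sum_{\mu}\big(\mathcal{M}(\tau_{\mu}),\mathcal{N}(\tau_{\mu})\big)_{\rm HS}=\sum_{\mu}\mathrm{Tr}\big[\mathcal{M}(\tau_{\mu})^{\dagger}\,\mathcal{N}(\tau_{\mu})\big], \tag{25}$$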

where the subscript S stands for 'superoperator'. Note that, just as for equation (24), the RHS of equation (25) can be rewritten as a superoperator trace and is thus independent of the choice of basis.

Next, we want to relate the superoperator inner product to the CJ representation. Reintroducing the Dirac notation, the CJ inner product between operators is defined as

Equation (26)

Equation (27)

Note that the inner product keeps the same form if definition (26) is replaced by its transpose. We can thus re-write it as

Equation (28)

To see how this relates to the superoperator inner product, we need to recall two useful facts.

Lemma 3. Given a Hilbert space ${ \mathcal H }$, the swap operator $S:{ \mathcal H }\otimes { \mathcal H }\to { \mathcal H }\otimes { \mathcal H }$, defined by its action $S| \psi \rangle | \phi \rangle =| \phi \rangle | \psi \rangle $, can be written as

$$S=\sum_{\mu}\tau_{\mu}\otimes \tau_{\mu} \tag{29}$$

for an arbitrary Hilbert–Schmidt basis $\{{\tau }_{\mu }\}\subset { \mathcal L }({ \mathcal H })$.

Proof. Viewed as an operator, S can be decomposed with respect to a basis $\{| j\rangle \}$ of the Hilbert space ${ \mathcal H }$ as $S={\sum }_{{km}}| k\rangle \langle m| \otimes | m\rangle \langle k| $. On the other hand, viewed as a vector on the linear space of operators ${ \mathcal L }({ \mathcal H }\otimes { \mathcal H })$, S can be decomposed with respect to the Hilbert–Schmidt basis as

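$$S=\sum_{\mu\nu}s_{\mu\nu}\,\tau_{\mu}\otimes \tau_{\nu}. \tag{30}$$

The components in the above representation are given by

$$s_{\mu\nu}=\mathrm{Tr}\big[(\tau_{\mu}\otimes \tau_{\nu})\,S\big]=\sum_{km}\langle m|\tau_{\mu}|k\rangle \langle k|\tau_{\nu}|m\rangle =\mathrm{Tr}(\tau_{\mu}\tau_{\nu})=\delta_{\mu\nu}.$$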

Plugging this into the decomposition (30), we obtain equation (29). □
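Lemma 3 can be checked directly for a qubit. The sketch below assumes the normalised Pauli matrices, $\{{\mathbb{1}},X,Y,Z\}/\sqrt{2}$, as one convenient Hermitian Hilbert–Schmidt basis (any other choice should give the same result) and sums the tensor products $\tau_{\mu}\otimes\tau_{\mu}$.

```python
import math

def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

r = 1.0 / math.sqrt(2.0)

# Normalised Pauli matrices: Hermitian and orthonormal under Tr(tau_mu tau_nu).
tau = [
    [[r, 0.0], [0.0, r]],            # identity / sqrt(2)
    [[0.0, r], [r, 0.0]],            # X / sqrt(2)
    [[0.0, -1j * r], [1j * r, 0.0]], # Y / sqrt(2)
    [[r, 0.0], [0.0, -r]],           # Z / sqrt(2)
]

# Sum over mu of tau_mu (x) tau_mu.
swap_from_basis = [[0.0] * 4 for _ in range(4)]
for t in tau:
    swap_from_basis = add(swap_from_basis, kron(t, t))

# Swap operator in the ordered basis |00>, |01>, |10>, |11>: it exchanges |01> and |10>.
swap_expected = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
```

The two matrices agree entry by entry up to rounding, as the lemma requires.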

This lemma can be used to prove the completeness relation

Equation (31)

Indeed, using ${\tau }_{\mu }={\tau }_{\mu }^{\dagger }$, we have

We can now re-write the superoperator inner product:

Comparing this with equation (28), we conclude that ${({ \mathcal M },{ \mathcal N })}_{{\rm{CJ}}}={({ \mathcal M },{ \mathcal N })}_{{\rm{S}}}$.

Footnotes

  • 2. In some works, a distinction is made between operational and ontological versions of non-contextuality [33–35], where the latter are used to rule out hidden-variable models. Although the expression 'measurement non-contextuality' was introduced in the ontological setting [33], we here use it in the operational sense (corresponding to the simple expression 'non-contextuality' used in [6, 30]). The addition of 'measurement' is made here in order to distinguish the notion from 'instrument non-contextuality', a term we introduce in the next section.

  • 3. The term frame function used in this paper is distinct from Gleason's original use of the term. In Gleason's terminology, we are here considering a positive frame function of weight one. This is in accordance with the definition of a frame function in [29, 30].

  • 4. Henceforth we omit the subscript j for notational convenience.

  • 5. This definition aligns with the convention in [20]. Other definitions, differing by a transpose or partial transpose, do not change the representation of the inner product.

  • 6. That is not to say that one cannot apply counterfactual reasoning successfully in the case of single quantum contexts. Indeed, it is only when one extends the requirement to a joint probability over all contexts that such counterfactual reasoning becomes problematic.

  • 7. Obtaining linearity of the probability rule via an extension of Gleason's theorem was suggested, but not proved, in the supplementary methods of [20].
