Toys can't play: physical agents in Spekkens' theory

Information is physical, and for a physical theory to be universal, it should model observers as physical systems, with concrete memories where they store the information acquired through experiments and reasoning. Here we address these issues in Spekkens' toy theory, a non-contextual epistemically restricted model that partially mimics the behaviour of quantum mechanics. We propose a way to model physical implementations of agents, memories, measurements, conditional actions and information processing. We find that the actions of toy agents are severely limited: although there are non-orthogonal states in the theory, there is no way for physical agents to consciously prepare them. Their memories are also constrained: agents cannot forget in which of two arbitrary states a system is. Finally, we formalize the process of making inferences about other agents' experiments and model multi-agent experiments like Wigner's friend. Unlike quantum theory or box world, in the toy theory there are no inconsistencies when physical agents reason about each other's knowledge.

1 Introduction and summary of the toy theory

Physical information. Physical theories that aim to describe the world at macroscopic scales should ideally be able to physically model observers, the experiments they perform, and their reasoning within the theory. For instance, the information acquired by agents must be stored in some physical form, in systems that we call memories. These can encompass biological brains but also classical and quantum computer memories, or even just measurement devices. Processing that information is ultimately a physical process, and manipulations of an agent's memory don't simply result in abstract epistemic changes of their knowledge, but also in concrete physical changes. For example, quantum measurement schemes [9] show us that the process of acquiring information about a system entangles the system with our physical memory, and Landauer's principle [1,10] tells us that forgetting that information amounts to shuffling those correlations into the environment, at a thermodynamic cost.

Abstract logic.
Another feature that physical theories should satisfy is to allow the information contained in memories to be operated on according to simple reasoning principles. Ideally we would like inferences such as "if Alice knows that a is true, and she knows that a implies b, then she knows that b is true" to hold independently of the physical origins of a and b.¹ In other words, an abstract system of epistemic logic should ideally be applicable to any physical setting within the theory [5,7].

Tension between abstract logic and physical information.
It was shown that some theories in which both of these requirements are satisfied - where agents reason about each other's knowledge and are themselves modeled as physical memories within the scope of the model - experience inconsistencies. For example, this is the case for quantum theory and generalized probabilistic theories (in particular, so-called box world), where agents applying standard logic to reason about physical experiments can come to contradictory conclusions [3,8]. Our ultimate goal is to understand which classes of theories allow for such inconsistencies.

Notation for toy bits. We can represent up to two discrete toy systems via simple grid diagrams. For higher dimensions, grid diagrams aren't as useful; the interested reader may find a review of the necessary mathematical notation in the appendix. A single system with two degrees of freedom (d = 2) has four different ontic states (labelled for example 1, 2, 3, 4). To see this, note that answering two binary questions would be sufficient to identify the ontic state of the system, for example "is the ontic state odd?" and "is the ontic state smaller than 3?". Each ontic state is represented by one of four boxes. To represent an observer's knowledge about the ontic state - that is, the epistemic state from their perspective - we colour in some of the boxes. For example, {1, 2} represents "the observer knows that the system's ontic state is either o = 1 or o = 2." For simplicity we usually omit the number labels. One can draw an analogy between toy states and quantum states, e.g. {1, 2} ∼ |0⟩, {3, 4} ∼ |1⟩, {1, 3} ∼ |+⟩ and {2, 4} ∼ |−⟩. Such two-box states are the only pure states at d = 2: they are states of maximal allowed information, according to the knowledge balance principle, for which the observer knows half of the degrees of freedom (for example, {1, 3} represents "the ontic state is odd"). The quantum analogy carries through to a stabilizer formulation of the toy theory [11].
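The knowledge balance principle lends itself to a small sketch. The following Python snippet (with hypothetical helper names `allowed_spans` and `allowed_size`) represents epistemic states as sets of ontic states and checks only the size condition implied by knowledge balance; the full theory additionally restricts which subsets of a given size are valid.

```python
# Epistemic states of toy bits as sets of ontic states (labelled 1..4 for a
# single toy bit).  Knowledge balance: a state of n toy bits with J known
# bits (0 <= J <= n) spans 2**(2n - J) ontic states.
def allowed_spans(n):
    return {2 ** (2 * n - j) for j in range(n + 1)}

def allowed_size(state, n=1):
    # Necessary (size) condition only; the theory further restricts
    # *which* subsets of a given size are valid epistemic states.
    return len(state) in allowed_spans(n)

print(allowed_spans(1))          # -> the allowed spans {2, 4}
print(allowed_size({1, 2}))      # True  (a pure state)
print(allowed_size({1, 2, 3}))   # False (spans 3 ontic states)
```

For a single toy bit this reproduces the statement in the text: valid epistemic states span either 2 or 4 ontic states.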
The fully mixed state (a state of maximal ignorance) is represented as the mixture of all four ontic states, {1, 2, 3, 4}. Unlike in quantum theory, this is the only physical mixed state at d = 2, that is, the only mixed state that can emerge as a marginal of a globally pure state (like the entangled state which we will consider in a moment). The epistemic restriction implies that for a system composed of N elementary systems, an agent is only allowed to have access to exactly 0 ≤ J ≤ N bits of information (corresponding to an epistemic state spanning 2^{2N−J} ontic states). For example, for N = 1 the valid epistemic states can span either 2 or 4 ontic states. A mixture like {1, 2} ∨ {1, 3} = {1, 2, 3}, spanning three ontic states, is not a valid epistemic state [20].²

² At the time of writing (late 2022), Spekkens and colleagues are investigating how to articulate a layer of (Bayesian) probabilistic knowledge on top of these epistemic "physical" states. If successful, this would allow for Bayesian states whose measurement statistics are identical to those of the illegal "physical" epistemic state above, at least for local measurements. We leave it as future work to study the consequences and stability of that approach.

Composing systems. When we consider two toy systems, we represent the ontic states with a 4 × 4 grid, where the rows determine the possible ontic states of system A and the columns those of system B. We omit the subsystem labels when they are clear from context. There are also global toy states which are not product states, for example the classically correlated state (a mixed state), and the pure entangled state ∼ |00⟩ + |11⟩.
Analogously to a Bell state in quantum theory, this latter state is used for toy teleportation and dense coding protocols [20]. Like their quantum analogues, entangled toy states are globally pure states with mixed marginals. In this example, the observer has maximal information about the correlations between A and B, and maximal ignorance about the reduced state of individual subsystems.
Reduced states. For discrete toy systems, taking the reduced state over one system corresponds to projecting the grid diagram onto one axis. More formally, if the global epistemic state is E_AB, the reduced (or marginal) state on system B is E_B = {b : (a, b) ∈ E_AB for some a}. In the above example, E_AB = {(1, 1), (1, 2), (3, 1), (3, 2)}, so E_B = {1, 2}.
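The projection can be written as a one-line set comprehension; `reduced_state` is our own helper name, and `E_AB` is the example state from the text.

```python
# Reduced toy states: project the set of ontic pairs onto one subsystem.
E_AB = {(1, 1), (1, 2), (3, 1), (3, 2)}

def reduced_state(E, system):
    """Marginal of a two-system epistemic state (system = 0 for A, 1 for B)."""
    return {pair[system] for pair in E}

print(reduced_state(E_AB, 1))  # E_B = {1, 2}, as in the text
print(reduced_state(E_AB, 0))  # E_A = {1, 3}
```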

Notation for transformations.
For discrete dimensions, the allowed toy transformations are permutations of ontic states, constrained to map valid epistemic states to valid epistemic states (in all subsystems). For example, consider the transformation that permutes the second and third ontic states, H = (2 3).
This permutation acts analogously to the Hadamard gate in quantum theory. We will see other examples (like a CNOT gate) further ahead; in particular, we will see that a controlled Hadamard is not a valid operation, and we will explore the implications of this for agents' free choice.
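A minimal sketch of this transformation, assuming the common labelling {1,2} ∼ |0⟩, {3,4} ∼ |1⟩, {1,3} ∼ |+⟩, {2,4} ∼ |−⟩ (the labelling is our assumption; `apply_perm` is a hypothetical helper):

```python
# The toy "Hadamard" is the transposition (2 3), written here as a dict.
H = {1: 1, 2: 3, 3: 2, 4: 4}

def apply_perm(perm, state):
    """Transformations act element-wise on the ontic states of an epistemic state."""
    return {perm[o] for o in state}

print(apply_perm(H, {1, 2}))  # {1, 3}: the analogue of H|0> = |+>
print(apply_perm(H, {3, 4}))  # {2, 4}: the analogue of H|1> = |->
print(apply_perm(H, {1, 3}))  # {1, 2}: the analogue of H|+> = |0>
```

Note that H maps every two-box state to a two-box state and fixes the fully mixed state, so it indeed sends valid epistemic states to valid ones.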
Notation for measurements. In the toy theory, a measurement is a partition of the ontic state space into valid epistemic states, called the measurement basis. The outcome of the measurement is determined by the position of the ontic state. The observer can then update their knowledge; in the toy theory the measurement update rule leads to an ontic disturbance, which we will later see emerges from the physical implementation of a measurement. The consequence for the updated epistemic state is a bit cumbersome to explain, but will become clear from the examples right ahead. In short, the updated description should guarantee that if the observer repeats the measurement they obtain the same outcome, and be compatible with their overall knowledge [20]. For example, consider the measurement M_Z = { 0: {1, 2}, 1: {3, 4} }, where the numbered partitions correspond to the epistemic states of the measurement basis. This is analogous to a Pauli-Z measurement of a single qubit. Suppose that Bob measures a toy bit in state {1, 3} in this basis, and obtains outcome 0. This allows him to deduce that the ontic state previous to the measurement was in {1, 3} ∧ {1, 2} = {1}; however, the singleton {1} is not a valid epistemic state. Rather, it is a pre- and post-selected state, a concept that has analogues in quantum theory. The smallest epistemic state compatible with his knowledge of the pre- and post-selection and with the requirement for repeated outcomes is {1, 2}. To see this, first note that there are four epistemic states compatible with Bob's knowledge: {1, 2}, {1, 3}, {1, 4} and {1, 2, 3, 4}. The requirement for repeated outcomes translates to "the post-measurement description must be a subset of the outcome partition {1, 2}, so that if I apply the same measurement again, I am guaranteed to obtain the same outcome 0." Of the four candidates, only the first epistemic state satisfies the condition.

This is analogous to the quantum measurement update rule for projective outcomes: if Bob measured |+⟩ in the Z basis and obtained outcome 0, he would henceforth describe the state as |0⟩. Now suppose that Bob makes the same measurement, but on his half B of the entangled toy Bell state {(1, 1), (2, 2), (3, 3), (4, 4)}. Upon obtaining outcome 0, the pre- and post-selected state is {(1, 1), (2, 2)}: this describes his knowledge of the ontic state of AB just before the measurement. There are two valid epistemic states that are compatible with this knowledge and live in the measurement partition 0: the state {1, 2}_A × {1, 2}_B, and the larger state where A is fully mixed. Of the two candidates, Bob picks the former (the smallest state), which is the description that makes the most use of his knowledge about the state before the measurement (picking the larger state would discard what he knew of the correlations between the two systems). The quantum analogy is when Bob makes a local Z measurement on a Bell state ∝ |00⟩ + |11⟩ and, upon seeing outcome 0, updates his global description of the state to |00⟩.
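The single-toy-bit update rule can be prototyped directly; `update` is a hypothetical helper implementing "smallest valid state containing the pre- and post-selection and contained in the outcome partition", as described above.

```python
from itertools import combinations

ONTIC = {1, 2, 3, 4}
# Valid single-toy-bit epistemic states: the six two-box states plus the
# fully mixed state (knowledge balance).
VALID = [set(c) for c in combinations(sorted(ONTIC), 2)] + [ONTIC]

def update(prior, outcome):
    """Smallest valid epistemic state F with (prior ∧ outcome) ⊆ F ⊆ outcome,
    i.e. compatible with the pre- and post-selection and guaranteeing that a
    repeated measurement yields the same outcome."""
    pre_post = prior & outcome
    candidates = [F for F in VALID if pre_post <= F <= outcome]
    return min(candidates, key=len)

# Bob measures the |+>-like state {1, 3} in the toy-Z basis, sees outcome 0:
print(update({1, 3}, {1, 2}))  # {1, 2}: the toy analogue of |+> -> |0>
```

Repeating the measurement on the updated state returns the same description, reflecting the repeatability requirement.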
2 Learning: physical implementations of measurements

Minimal settings. In theories where agents can improve their knowledge about the state of a system by posing questions or performing measurements on it, they have to update their epistemic state as a consequence. Hence, if we model agents as physical systems, the bare minimum of their description has to include the degrees of freedom of the memory entry corresponding to the system being probed by the agent. We call the process of revising the memory entry after a measurement a memory update. In this section, we consider how one can model the process of learning - measuring a system and registering the outcome in a memory - in the toy theory. It is in these settings that we find the most dramatic differences between the quantum and toy theories. At every step, we start by reviewing the quantum process, and then try to find an analogue in the toy theory.
Quantum measurements as entangling operations. In quantum mechanics, measurements are entangling processes between the system of interest and a measurement device. For example, in the Stern-Gerlach experiment, the internal spin of a particle is coupled to its position degree of freedom through the application of a magnetic field; it is the position of the particle that acts as a pointer or measurement device when it hits a screen, as the arrival position is correlated with the internal spin.
Quantum von Neumann measurement scheme. More concretely, we can measure a discrete observable Â_S = Σ_k a_k |a_k⟩⟨a_k|_S on a system S by coupling it to a measurement device (or pointer) M, through a Hamiltonian of the form Ĥ = g Â_S ⊗ B̂_M, where B̂_M is a suitable observable on M and g a tunable constant [9]. Letting the two systems evolve for some time t corresponds to the reversible evolution modelled by the unitary V = e^{−itĤ}. If S is initially in an arbitrary state |φ⟩_S = Σ_k α_k |a_k⟩_S (with unknown coefficients α_k = ⟨a_k|φ⟩), and we prepare the measurement device (or "pointer") in state |ψ_0⟩_M, then after time t the global state becomes Σ_k α_k |a_k⟩_S ⊗ |ψ_k⟩_M, where |ψ_k⟩_M = e^{−itg a_k B̂_M} |ψ_0⟩_M. This simple unitary evolution can be modified to account for noise, finite-size effects, coarse and continuous observables and other corrections, in order to cover realistic implementations of quantum measurements. A relevant remark for later is that ultimately, the observer's memory is itself a quantum system that becomes entangled with the system measured - at least from the perspective of an external agent.

In quantum theory, we can implement measurements as unitary processes, which entangle the state of the system measured with the measurement device in the basis of the observable. Here are two familiar examples. Zooming out to the perspective of an external agent who sees the experimenter as just another quantum system, we can include the experimenter's memory and lab as part of the measurement device, and use the same unitary view to model the whole measurement process.

Figure 2.1: (a) Von Neumann measurement scheme [9]. An observable Â_S = Σ_k a_k |a_k⟩⟨a_k| is measured by coupling the system of interest S to a pointer M through the Hamiltonian Ĥ = g Â_S ⊗ B̂_M. (b) A qubit S is measured in the computational basis: it is coupled to another qubit M, acting as a memory system, by a CNOT gate. The operation coherently copies the state of S to M. (c) Continuous quantum measurement. If M is a continuous system and B̂_M = P̂_M is the momentum operator, then each |ψ_k⟩_M has the original wave function of the pointer shifted by a_k: ψ_k(x) = ψ_0(x − t a_k). For strong measurements, these peaks don't overlap.

Quantum examples.
Two examples, for continuous and discrete measurements, are summarized in Figure 2.1. The simplest case is when both the system to be measured and the pointer are single qubits (Figure 2.1b): an entangling CNOT gate³ between S and M implements a strong measurement of S in the Z basis. A familiar example in continuous systems is the position measurement of a particle, which entangles the particle and pointer in the position basis. This is achieved for example by setting Â_S = X̂_S and B̂_M = P̂_M, and initializing the pointer to a well-localized state (like a Gaussian wave packet⁴). The measurement process then correlates the pointer with the particle: the pointer's wave function is shifted by an amount proportional to the particle's position.

³ An interaction Hamiltonian that implements this gate in a suitable amount of time (e.g. t = π) is for example Ĥ ∝ Z̄_S ⊗ X̄_M, where Z̄_S and X̄_M are shifted Z and X operators.

Emergence of the post-measurement state. In quantum theory, the post-measurement state of a system S emerges from the physical picture by moving the Heisenberg cut one level up and thinking of a projective measurement on the agent's memory M after the physical entangling operation between S and M. For example, the (unnormalized) post-measurement state when obtaining outcome 1 on a Z measurement of a qubit in state |ψ⟩_S can be expressed as ⟨1|_M CNOT_{SM} (|ψ⟩_S ⊗ |0⟩_M) = ⟨1|ψ⟩ |1⟩_S. For details on the general case we refer to Appendix C.1.
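The qubit example of Figure 2.1b can be checked numerically. A minimal NumPy sketch, assuming the standard CNOT matrix convention with S (the first qubit) as control:

```python
import numpy as np

# CNOT with S (first qubit) as control copies the Z-information of S into M.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

plus = np.array([1.0, 1.0]) / np.sqrt(2)  # system S in |+>
zero = np.array([1.0, 0.0])               # memory M ready in |0>

state = CNOT @ np.kron(plus, zero)        # -> (|00> + |11>)/sqrt(2)
print(state)
```

The output amplitudes are 1/√2 on |00⟩ and |11⟩: the memory is now entangled with the system in the Z basis, exactly the coherent-copy picture described above.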
Measurements in the toy theory: discrete example. We can now try to bring over the same concept of physical measurement to the toy theory. Let us consider two separate toy bit systems: the first one is the system we measure, the second is the memory where the outcome is registered. We can proceed analogously to the quantum case, as there is an allowed transformation corresponding to the CNOT gate in the original toy theory [2]. In the notation of grid diagrams it permutes the ontic states as

CNOT =
Analogously to the quantum CNOT gate, the toy CNOT transformation correlates two toy bits. Once again, the post-measurement state of the system emerges from conditioning on a projective measurement on the memory, and then finding the reduced state.
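A sketch of the toy CNOT, using one common labelling of a toy bit's ontic states by two bits (z, x); both the labelling and the bit-level action below are our assumptions, chosen to mirror the quantum CNOT's action on Z and X in the stabilizer formulation [11].

```python
# Labelling (assumed): 1=(0,0), 2=(0,1), 3=(1,0), 4=(1,1), where z answers
# "is the state in {1,2}?" and x answers "is it in {1,3}?".
TO_BITS = {1: (0, 0), 2: (0, 1), 3: (1, 0), 4: (1, 1)}
TO_LABEL = {v: k for k, v in TO_BITS.items()}

def toy_cnot(pair):
    """Toy CNOT on an ontic pair (control, target): copies the z-bit of the
    control onto the target and kicks the target's x-bit back onto the control."""
    (zc, xc), (zt, xt) = TO_BITS[pair[0]], TO_BITS[pair[1]]
    return (TO_LABEL[(zc, xc ^ xt)], TO_LABEL[(zt ^ zc, xt)])

# Control in {1,3} (~|+>), target in {1,2} (~|0>):
product_state = {(c, t) for c in (1, 3) for t in (1, 2)}
print(sorted(toy_cnot(p) for p in product_state))
# -> [(1, 1), (2, 2), (3, 3), (4, 4)]: the perfectly correlated toy Bell state
```

This reproduces the quantum analogy CNOT(|+⟩|0⟩) = (|00⟩ + |11⟩)/√2: a product epistemic state is mapped to the correlated toy Bell state.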
Measurements in the toy theory: general case. For the general case of arbitrary continuous or discrete dimensions, we can also find global transformations on the system and pointer that implement any valid measurement (Theorem B.6). The transformation is essentially analogous to the von Neumann measurement procedure, correlating the two toy subsystems in a way that reflects the properties of the observable, and recovers the post-measurement state of the system measured when we condition on the outcome. There are a few extra constraints (for example, on how the dimensions of system and pointer should match), but the main qualitative difference from quantum measurements comes from the fact that toy observables are already restricted at the abstract level, as we saw in the introduction.

3 Acting: restrictions on agents' choices

Choices start with measurements. If we model agents' actions as physical processes, we see that they can always be decomposed as a measurement followed by a conditional transformation. The process of deciding which of a series of actions to take is ultimately a measurement - of one's memory, of a randomness generator, or of any relevant external systems. We look at the weather to decide what to wear, consult our agenda to decide on appointments, and even when making random decisions we can model our source of randomness as an explicit physical system. Taking this to the extreme, when we try to implement a statement like "a system S can be prepared in one of many states {ψ_k}_k", whoever chooses the state k does so by measuring another system, obtaining an outcome k, and then applying a physical transformation on their own memory and S that prepares the state. We will see that in Spekkens' toy theory, physical agents are dramatically restricted in this respect. Indeed, only agents outside the toy theory can prepare non-orthogonal states. In other words, toys can't play.
Quantum conditional preparation scenarios. First consider a simple quantum measure-and-prepare scenario: Alice measures a state |φ⟩_R = Σ_k α_k |a_k⟩_R and, depending on her observed outcome k, prepares another system S in state |η_k⟩_S. This procedure can be described from the outside as a global two-step unitary process (Figure 3.1). First, Alice's measurement is modelled by a unitary V that couples R to Alice's memory A; this is followed by another unitary U, which implements the conditional state preparation, correlating S with A. After a strong quantum measurement (⟨ψ_k|ψ_ℓ⟩_A = δ_{kℓ}), Alice is not restricted in the states {|η_k⟩_S}_k that she can conditionally prepare. In particular, she can conditionally prepare non-orthogonal states: for example, she can prepare |0⟩_S if she observes outcome 0, and |+⟩_S if the outcome is 1. In contrast, the toy theory imposes unexpected constraints, and this kind of conditional preparation is not allowed.
Forbidden conditional preparations in the toy theory. Conditional preparations of non-orthogonal states are not allowed in the toy theory. The problem is not in Alice's measurement (which, as we have seen, is very similar to a quantum measurement), but in the second step, U, where Alice performs the conditional preparation of non-orthogonal states. This transformation would have to act on the joint state of A and S analogously to the quantum conditional preparation above. There is no allowed transformation that implements this action in the toy theory, even when we consider transformations on a larger system, which are irreversible at this scale (Corollary B.9). The generalization of this example is a dramatic restriction on agents' actions; see Theorem B.8 for the formal version of this result. The theorem applies to arbitrary system dimensions and an arbitrary number of systems, which can later be traced out.
Theorem 3.1 (Restrictions on conditional actions of agents in Spekkens' toy theory). In Spekkens' toy theory, if an agent measures a system R, obtaining outcome k, and prepares a second system S in one of several states {ψ_k}_k depending on the outcome k, then any two of these states (ψ_k, ψ_ℓ) must be either identical or orthogonal. Moreover, the number of identical states of each type must be the same.

4 Forgetting: valid expressions of ignorance
Forgetting with explicit quantum memories. The physical process of forgetting information originally stored in a memory can be modelled through an interaction between the memory and its environment; this may lead to the loss of correlation between the current memory content and the system it referred to. For example, suppose that you write your credit card number on a notepad - the notepad is correlated with your credit card. If later the notepad is smudged, erased or burned (all physical interactions with an environment), it will no longer be perfectly correlated with the credit card. The same can be said of information stored in a (quantum or classical) hard drive which is subject to noise and decoherence originating from the interaction with its environment, as expressed by the data processing inequality. In a simple example (Figure 4.1), suppose that the agent measures a bipartite system S = S_1 ⊗ S_2, storing the outcome in their bipartite memory M = M_1 ⊗ M_2, through a standard von Neumann scheme which, from an outside perspective, is modelled as a coherent copy operation entangling S and M. Now let the memory interact with an environment system E. An example is complete thermalization of the second register, in which the environment is initially in a thermal state τ_E and the interaction swaps the state of the second register M_2 with the environment. After the global state evolves under 1_S ⊗ U_{ME}, the final (mixed) state of S and the memory is one in which all the information about S_2 is lost to the environment, as can be seen from the vanishing mutual information between the memory and S_2.

Figure 4.1: Memory update. One can imagine a process where the information stored in an agent's memory is partially lost to the environment. Here, the agent first performs a memory update, writing down the outcomes of measurements on systems S_1 and S_2 in their memory systems M_1 and M_2. Due to an interaction with the environment E (for example a complete thermalization, represented here as a SWAP gate), the information contained in memory qubit M_2 is exchanged with the environment, so that correlations between the memory and S_2 are lost to the agent through this process.

Abstract uncertainty in quantum theory. In addition to this physical process of forgetting, in quantum theory mixed states ρ_S = Σ_i p_i ρ_i can be used to describe a state of knowledge where an agent has abstract uncertainty about which of the states {ρ_i}_i describes S, and their best Bayesian guess is described by a probability distribution {p_i}_i. In quantum theory the probabilities and states in these abstract mixtures can be arbitrary, and one can always find a physical forgetting process on an explicit memory that connects the physical and abstract representations. In the toy theory (as it stands at the time of writing) descriptions of uncertainty are severely limited, perhaps because there are no natural physical sources for this uncertainty in the toy world.
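The physical forgetting process can be sketched numerically for qubits: a memory perfectly correlated with S initially shares two bits of mutual information with it, and swapping the memory with a maximally mixed environment destroys them. Helper names (`trace_out`, `mutual_info`) are ours.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def trace_out(rho, keep):
    """Partial trace of a 2-qubit density matrix; keep = 0 or 1."""
    r = rho.reshape(2, 2, 2, 2)
    return np.einsum('ikjk->ij', r) if keep == 0 else np.einsum('kikj->ij', r)

def mutual_info(rho):
    return entropy(trace_out(rho, 0)) + entropy(trace_out(rho, 1)) - entropy(rho)

# Memory M perfectly correlated with system S: a Bell state.
bell = np.zeros(4); bell[0] = bell[3] = 2 ** -0.5
rho_SM = np.outer(bell, bell)
print(mutual_info(rho_SM))     # ~ 2 bits: maximal correlation

# "Forgetting": swap M with an environment in the maximally mixed state.
rho_S = trace_out(rho_SM, 0)
rho_after = np.kron(rho_S, np.eye(2) / 2)
print(mutual_info(rho_after))  # ~ 0 bits: the correlations are gone
```

The final state is a product of two maximally mixed qubits, mirroring the SWAP-with-environment picture of Figure 4.1.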
Forgetting with an implicit toy memory. In the toy theory, not all states can be mixed in a way such that the resulting state is allowed by the epirestricted picture. For example, while it is possible to mix the states {1, 2} and {3, 4} (the mixture is the fully mixed state), we are not allowed to mix states like {1, 2} and {1, 3}, whose union {1, 2, 3} is not a valid epistemic state.

This means that for certain sets of states we are not allowed to forget which state we had initially. Moreover, even when we are able to 'forget', we only forget each state with equal probability: epistemic states always constitute uniform probability distributions over the corresponding set of ontic states. One can argue that only uniform distributions over the states we choose to forget (when we can) are physical, as it is not clear how one would assign non-uniform priors to the probabilities of forgetting different states. For example, we cannot model a setting where we forget that the system is in the state {1, 2} with probability 1/3, and in the state {3, 4} with probability 2/3 - we are only allowed to forget both states with an equal probability of 1/2.
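The restriction on mixing can be checked mechanically for a single toy bit; `can_forget` is a hypothetical helper testing only whether the (necessarily uniform) mixture is a valid epistemic state.

```python
from itertools import combinations

ONTIC = {1, 2, 3, 4}
# Valid single-toy-bit epistemic states: the six two-box states + fully mixed.
VALID = [set(c) for c in combinations(sorted(ONTIC), 2)] + [ONTIC]

def can_forget(s1, s2):
    """An agent may 'forget' which of s1, s2 was prepared only if the
    uniform mixture s1 ∨ s2 is again a valid epistemic state."""
    return (s1 | s2) in VALID

print(can_forget({1, 2}, {3, 4}))  # True: mixture is the fully mixed state
print(can_forget({1, 2}, {1, 3}))  # False: {1, 2, 3} is not a valid state
```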
Forgetting as a physical process with explicit toy memories. We can now see the toy equivalent of the quantum example where we explored the physical process of forgetting. Suppose that S is a pair of toy bits, and the agent has a memory M = M_1 ⊗ M_2 originally in a pure reference state. The agent measures the two toy bits of S in the Z basis, storing the outcome of the first measurement in M_1 and the second in M_2. From the outside we can model this measurement as the entangling operation CNOT_{S1M1} ⊗ CNOT_{S2M2}. Now we simulate the decoherence process in memory M_2, by swapping it with an environment E in a fully mixed state. In the quantum analogy, this corresponds to full thermalization of M_2 with an environment at infinite temperature or with a degenerate Hamiltonian. (The grid diagram of the joint state of S_2 ⊗ M_2 ⊗ E would span three dimensions.) In this example, measuring their memory M could give the agent some information about S_1 but not about S_2 - they have forgotten about S_2. This physical picture helps us understand why forgetting in arbitrary ways isn't allowed in the toy theory: to model partial ways of forgetting, we can vary the interaction V_{ME} between memory and environment, and the initial state of the environment. Because both transformations and epistemic states are restricted in the toy theory, the final states of S ⊗ M are also restricted in form. The information an agent still remembers about S after the interaction with an environment can again be modelled through the reduced states of S conditioned on a measurement on the memory, and these states are, as we have seen, restricted.
Interpretation. Another intuition for why forgetting always results in a fully mixed state lies in how we understand knowledge in the toy theory [21]. The principle of knowledge balance imposes that we either possess information about the individual states of the systems, or information about correlations between them. After a physical measurement, the memory and the measured system are perfectly correlated, so from the outside we have no information about the reduced state of the system; erasing the memory effectively erases the information about correlations, leaving us maximally ignorant about the state of the measured system.

5 Reasoning: making inferences about other agents' experiments
In this section, we formalize the conditions under which agents can reason about measurement outcomes - their own and each other's. First, we look at what it means to obtain a certain outcome or predict it with certainty. Then, we apply this result to a particular subset of statements agents can make, namely inferential statements. Finally, we demonstrate how these rules are applied, using the examples of a Bell scenario, Wigner's friend, and the Frauchiger-Renner thought experiment in the toy theory, and discuss the differences from their quantum counterparts.
Deterministic predictions. In the following thought experiments, agents are able to reason about each other's outcomes - for deterministic statements. Let us formalize what it means to measure something with certainty or to predict something with certainty in the toy theory. In Appendix B.3 we prove Lemma B.10, which gives two conditions for an outcome of a measurement to happen with certainty. The first condition certifies that an outcome happens with certainty, while the second condition ensures that this outcome is the desired one.

Conditions for making inferences.
In thought experiments we often make inferences of the type "A = 1 =⇒ B = 1", which predict another agent's measurement outcome based on our own observation. How do we formalize the certainty of such an inference in the toy theory? The statement "A = 1 =⇒ B = 1" corresponds to the conditional probability P(B = 1 | A = 1) = 1. Lemma B.11 gives two conditions for when we can make valid inferences. Intuitively, the first condition ensures that there is an outcome of the measurement of observable B that can be inferred if the outcome A = 1 is known. The second condition ensures that the outcome that can be inferred is indeed the outcome B = 1.

Example: a Bell scenario
Quantum Bell setting. In quantum theory, if Alice and Bob share a Bell state and measure their individual qubits in the computational basis (corresponding to the observables Z_A and Z_B), they can make inferences about each other's outcomes (Figure 5.2). For example, if the shared state is ∝ |00⟩ + |11⟩ and Bob obtains outcome b = 0, he can update his description of the global post-measurement state to |00⟩_AB and infer with certainty that Alice must obtain outcome a = 0; analogously for b = 1.
Toy Bell scenario. This scenario can be reenacted in the toy theory, with similar results: Bob can make a deterministic prediction about Alice's outcome (Figure 5.2). Alice and Bob share the entangled toy Bell state {(1, 1), (2, 2), (3, 3), (4, 4)}, and Bob measures his toy bit in the toy-Z basis, updating his description as in the measurement example above. From this description, Bob can infer that if Alice now measures her system in the same basis, she will obtain outcome A = 0 with certainty; that is, "B = 0 =⇒ A = 0". Analogously, he can conclude that "B = 1 =⇒ A = 1". A formal proof can be found in Appendix B.3.
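Bob's inference can be verified mechanically at the ontic level. A minimal sketch, with the toy Bell state written as a set of ontic pairs in the grid notation (`implies` is our helper name):

```python
# The toy Bell state, as a set of ontic pairs (a, b) for systems A and B.
bell = {(1, 1), (2, 2), (3, 3), (4, 4)}
Z0, Z1 = {1, 2}, {3, 4}  # outcome partitions of the toy-Z measurement

def implies(state, cond_B, pred_A):
    """Check 'B's outcome lies in cond_B => A's outcome lies in pred_A':
    every ontic pair compatible with B's outcome must give A's outcome."""
    compatible = {(a, b) for (a, b) in state if b in cond_B}
    return bool(compatible) and all(a in pred_A for (a, _) in compatible)

print(implies(bell, Z0, Z0))  # True:  B = 0  =>  A = 0
print(implies(bell, Z1, Z1))  # True:  B = 1  =>  A = 1
print(implies(bell, Z0, Z1))  # False
```

This implements the intuition behind Lemma B.11: the inference is valid exactly when the conditioned support of the epistemic state lies entirely inside one outcome partition of the other measurement.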

Example of meta measurements: Wigner's friend
Wigner's quantum friend. The Wigner's friend experiment was first proposed by Wigner [22]. The setting involves a quantum system R and an observer A (Alice) performing a measurement on this system in a closed laboratory, as well as an outside observer, Wigner. For Alice in the lab, the outcome of the experiment is recorded in the device she is using to measure the system R, for example as the position of a pointer (or an entry in her memory). However, Wigner does not have any information about Alice obtaining a particular outcome; he describes the evolution of the closed lab as a unitary (reversible) process, and assigns an entangled state to R and A. In the language of quantum mechanics, if we assume that the pointer is initially in the state |0⟩_A and the measured system starts in a superposition, then after the measurement Wigner assigns an entangled state to R and A. Alice and Wigner thus turn out to have descriptions of the same setting which are vastly different from each other. We will not discuss the numerous conceptual implications of the original thought experiment here; a review can be found in [6]. However, we would still like to see how we can model this setting in the toy theory (here we will do so in the original epirestricted picture).
Wigner's toy friend. In the toy theory, we consider again two subsystems: the measured system R and Alice's memory register A. Alice measures R in the toy-Z basis: she describes her measurement as M_Z = { 0: {1, 2}, 1: {3, 4} } and sees a definite outcome a. Wigner, on the other hand, sees Alice's measurement as a reversible transformation (the toy CNOT), and updates his description of the joint state of R and A to the correlated toy state. Alice has the subjective experience of seeing one outcome a and writing it to her memory; the two agents thus perform different knowledge updates about the same process.

Interpretation.
In the framework of the toy theory, the difference between Alice's and Wigner's descriptions has a straightforward interpretation. By the knowledge balance principle, in a state of maximal knowledge an agent can have maximal information either about the individual systems or about how these individual systems are correlated. Hence, Alice's and Wigner's epistemic states do not contradict each other, and simply represent two different ways an agent can view a composite system (two states of knowledge about the same ontic state). In [21], it is shown that the discrepancy between Wigner's and Alice's views can be reproduced (and interpreted!) in any realistic toy model. Alice's collapsed state can simply be understood as more coarse-grained than Wigner's: the correlations of her state are beyond her level of description. She can still get away with it, though, as the correlations are not used in later dynamics - which is not the case for the next scenario we present.

Example of multi-agent paradoxes: the Frauchiger-Renner scenario
Quantum multi-agent paradox. In the Frauchiger-Renner setting [19], four quantum agents (Alice, Bob, Ursula and Wigner) perform a series of measurements and reasoning steps, reaching a logical contradiction: when both Ursula and Wigner obtain the outcome "ok" in their measurements, they can reason, based on their observations, that Bob predicted that Alice predicted that Wigner would obtain a different outcome, "fail", with certainty. The experimental protocol consists of individual steps that we covered earlier in this manuscript: two qubits R and S are initially prepared in an entangled Hardy state [23]; Alice measures R and Bob measures S; then Ursula measures Alice's lab (including R and Alice's memory A), and finally Wigner measures Bob's lab (including S and Bob's memory B). The contradiction is found for a specific choice of initial state and measurement bases, which are described in Appendix C.2. For a pedagogical discussion of the original paradox and the assumptions behind it, we refer to our previous work [24]; for discussions of broader implications for abstract logic and interpretations of quantum theory see for example [6,7,25,26]. The paradox has also been shown to arise in other physical theories, namely box world [8].
Toy multi-agent scenario. Our question is whether a similar multi-agent logical paradox can be found in Spekkens' toy theory; we will see that it cannot, partly because of the restrictions on individual operations like conditional state preparation, and partly because this is an explicitly non-contextual epistemic theory. Since the toy theory does not allow Alice to perform non-orthogonal conditional state preparation of S, we follow the version of the experiment where all relevant correlations are encoded in the initial state of RS. The global system of Alice and Bob's labs is composed of four subsystems: the two systems R and S measured by Alice and Bob, and Alice's and Bob's memory registers A and B (Figure 5.3). The order of measurements follows the original experiment, and the systems can be of arbitrary dimension k. We show that in the toy theory, there is no choice of initial state and measurements by the four agents that can lead to a logical contradiction in this experimental scenario. A formal description of the setting and proof can be found in Appendix B.4.

Discussion
Learning, reasoning and forgetting as physical processes in the toy theory. In this work, we found a way to model, within Spekkens' toy theory, the physical evolution of systems and agents' memories that implements the abstract processes of measurement and of forgetting information, analogously to how quantum measurements and information loss are modelled as explicit quantum evolutions. We found conditions on experimental settings that guarantee that agents can reason with certainty about each other's experiments, including settings where agents can measure each other.
Restrictions on free choice of agents. One can interpret the impossibility of arbitrary conditional preparation in the toy theory as a limitation on the free choice of agents. An experimenter cannot decide to prepare an arbitrary valid epistemic state depending on her observations; she is constrained to an orthogonal set of states. In addition, agents cannot set arbitrary probability distributions as inputs for future experiments (because such distributions would have to be encoded in physical systems like biased coins). Note that agents can still perform deterministic operations that entangle other systems; they just cannot make a decision about which entangling operation to apply in a way that results in anything other than uniformly-distributed orthogonal states on those systems.
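The geometric reason behind this restriction can be illustrated in a few lines of Python (our own toy-bit sketch, not part of the formal proof): valid pure states are cosets of a fixed subspace of the ontic phase space, and a shift of the valuation, which is all a conditional action can contribute, maps a coset to one that is either identical or disjoint, never partially overlapping like a pair of non-orthogonal states.

```python
from itertools import product

# A pure toy-bit state is a coset of a line in the phase space Z_2 x Z_2.
LINE = {(0, 0), (0, 1)}          # "q = 0 known": toy analogue of |0>
PLUS = {(0, 0), (1, 0)}          # "p = 0 known": toy analogue of |+>

def shift(state, s):
    """Apply a phase-space shift; a conditional action can only change
    the valuation, i.e. shift the coset."""
    return {((q + s[0]) % 2, (p + s[1]) % 2) for q, p in state}

# Every shift of LINE is either LINE itself or disjoint from it ...
for s in product((0, 1), repeat=2):
    shifted = shift(LINE, s)
    assert shifted == LINE or shifted.isdisjoint(LINE)

# ... whereas a non-orthogonal pair like toy-|0> and toy-|+> overlaps
# partially, so it can never arise from conditioning on a memory value.
assert LINE & PLUS and LINE != PLUS
```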
In the toy theory, limited knowledge is. . . limited. While in classical and quantum theories we can always lose information, in Spekkens' toy theory it is impossible to model many natural expressions of limited knowledge, like not knowing which of two non-orthogonal states a system is in; conditional state preparation of arbitrary states is also forbidden. This shows us that even aspects of logic and information theory that we take for granted and consider independent of the physical theory (like holding probabilistic knowledge) are in fact deeply physical.
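This limitation can be made concrete with a small brute-force check in Python (our own illustration, using the single toy bit of the original theory): the 50/50 mixture of the toy analogues of |0⟩ and |+⟩ is not uniform over its support, so it coincides with no epistemic state allowed by the knowledge balance principle.

```python
from fractions import Fraction

ONTIC = [(q, p) for q in (0, 1) for p in (0, 1)]

def uniform(support):
    """Uniform probability distribution over a subset of ontic states."""
    return {m: (Fraction(1, len(support)) if m in support else Fraction(0))
            for m in ONTIC}

# Valid epistemic states of one toy bit under the knowledge balance
# principle: one linear observable (q, p or q+p) known, or nothing known.
PURE_SUPPORTS = [
    {(0, 0), (0, 1)}, {(1, 0), (1, 1)},   # q known
    {(0, 0), (1, 0)}, {(0, 1), (1, 1)},   # p known
    {(0, 0), (1, 1)}, {(0, 1), (1, 0)},   # q + p known
]
VALID = [uniform(s) for s in PURE_SUPPORTS] + [uniform(set(ONTIC))]

# "Forgetting" whether the system is in toy-|0> or toy-|+>: the 50/50
# mixture of the two corresponding epistemic states.
zero, plus = uniform({(0, 0), (0, 1)}), uniform({(0, 0), (1, 0)})
mixture = {m: (zero[m] + plus[m]) / 2 for m in ONTIC}

# The mixture is not uniform over its support, hence it matches no valid
# epistemic state: an agent cannot hold this state of knowledge.
assert all(mixture != v for v in VALID)
```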

Foils of the toy theory
To understand to what extent these peculiarities are an artifact of the knowledge balance principle, one could consider possible relaxations of the principle: for example, imposing it for pure states but allowing all probabilistic mixtures of pure states. One must proceed with caution, as any such relaxation may have unintended consequences for the stability of the theory. For example, we considered the relaxation of the original formulation where we only require valid marginals, but not that validity is preserved under subsystem measurement, and tried to come up with a different measurement update rule that does not require this. However, we found that it is not possible to define such a measurement update and, additionally, found mixed epistemic states that cannot be written as a mixture of pure states. We leave the investigation of other relaxations of the theory to future work.
Forgetting in other epistemic theories. Epistemic models provide insight into how different epistemic restrictions influence the set of transformations and measurements an agent can perform, and how their memory can be modeled. We have already mentioned epistemically restricted Liouville mechanics [27], and here we have taken a look at the particular epistemic restriction of the "knowledge balance principle". However, one can imagine that these restrictions can be weakened or modified. One possible direction for future research would be to model agents' memories and reasoning for an arbitrary relation between the information contained in the epistemic and ontic states of a system, and to see in which cases the conclusions reached in quantum mechanics are reproduced. This could lead to a better understanding of which types of epistemology can be admitted by quantum mechanics, and of the essential properties of such epistemic theories. We leave the investigation of how the striking limitations on the process of forgetting information play out in other (epistemically restricted) theories to future work.
No multi-agent logical paradox in the Frauchiger-Renner setting. We proved that in settings analogous to the Frauchiger-Renner experiment there is no assignment of states and measurements that can lead to a logical paradox in the toy theory. We don't claim that our model of agents is exhaustive; in our analysis, we only capture one degree of freedom of the agent, which corresponds to the memory register for the outcome of their measurement, and for this particular model (albeit a minimally reasonable one) the paradox does not arise. We conjecture that there is no model that can lead to a paradox, in arbitrary multi-agent settings, because the theory is non-contextual; we will investigate this in future work.
Relation to contextuality. Multi-agent logical paradoxes involve chains (or possibly more general structures) of statements that cannot be simultaneously true in a consistent manner. Failures of non-contextuality can often be expressed in terms of the inability to consistently assign definite outcome values to a set of measurements [2,28]. Examples of paradoxical chains of reasoning in quantum theory [19] and box world [8] -- two contextual theories -- and the intuition of the impossibility of finding such a chain in Spekkens' toy model, as shown here, suggest the following conjecture: logical multi-agent paradoxes are proofs of contextuality, and all contextual physical theories can model multi-agent logical paradoxes. The connection between contextuality and logical contradictions has already been explored to some extent in existing research. For example, it can be shown that the patterns of reasoning used in finding a contradiction in the Liar cycles [29] are similar to the reasoning we make use of in FR-type arguments, and in [30] a connection is established between such logical cycles and contextuality. Additionally, in [31] it has been shown that every proof of a logical pre- and post-selection paradox is a proof of contextuality. The question of how proofs of contextuality relate to proofs of multi-agent logical paradoxes will be formally addressed in future work.
Weak and noisy measurements. In [32], the analysis of weak measurements and weak values is applied to the epistemically-restricted theory of Liouville classical mechanics. Weak values in that case coincide with the ones obtained for Gaussian quantum mechanics; no anomalous weak values are observed, as the theory is non-contextual. It would also be interesting to apply our analysis of physical measurements to try to implement noisy and weak measurements in generalized Spekkens' theory; we leave this as an open project.
Wigner's other friends. The thought experiment analyzed in this paper has many similarities to the thought experiments proposed by Brukner [4] and Cavalcanti [33], which also build on the original Wigner's friend scenario. However, the conclusions drawn from the latter two differ from the original FR experiment and its toy analogue discussed in this paper. While Brukner's and Cavalcanti's results provide a strengthening of Bell's theorem, considering the FR thought experiment in various theories is an exploration of what it means to be a user of a theory while also being described within that same theory, and what operational restrictions such a user might face. As the toy theory does not exhibit non-local features, and its correlations do not violate Bell's inequalities, it is not suitable for formulating Brukner's and Cavalcanti's results. We leave the interesting question of identifying fundamental connections between the assumptions used in all of these scenarios as future work.
Open questions and generalizations. We did not use the explicit form of the epistemic restriction in our proof of the non-existence of the paradox in the toy theory; the assumptions we made are not unique to classical complementarity. In principle, we could have defined a different criterion for joint knowability of observables. To preserve the general structure of the theory, this new criterion would need to satisfy some conditions: for example, we require jointly knowable variables to have a linear structure, which follows from the fact that if two variables are known, any linear combination of them can be calculated. The formalization and exploration of different epistemic restrictions, as well as the investigation of other, more general settings in which agents could reason about each other, is left for future work.

A Toy theory formalism in arbitrary dimensions
The complete review of the toy theory formalism, including its original, stabilizer and arbitrary formulations, can be found in [18].

A.1 Formalism for arbitrary dimensions.
Is this formalism necessary to understand the results of this paper? To tackle the more general case of toy systems of arbitrary dimensions, we must review heavier formalism [12,13]. Our main results are expressed in this language, but we add intuitive descriptions that convey the main message and don't require learning the formalism.
Epistemic states. In the continuous case, we represent toy systems through observables q_i and p_i for each of the n subsystems, analogous to position and momentum: for example, a toy particle moving in 3D would have n = 3. Epistemic restriction for continuous systems. The complete epistemic restriction is then given by the principle of classical complementarity: "The valid epistemic states are those wherein an agent knows the values of a set of quadrature variables that commute relative to the Poisson bracket, and is maximally ignorant otherwise." [12] Here, "maximal ignorance" means that there is a uniform probability distribution over the values of all other variables. It can be shown that this complementarity principle requires observables to be linear, i.e. quadrature observables. We represent quadrature observables as vectors f ∈ Z_d^{2n} or R^{2n}, depending on whether we consider n systems of dimension d or n continuous systems. The Poisson bracket is then defined for both the continuous and the discrete case as

{f, g} = Σ_{j=1}^{n} (f_{2j−1} g_{2j} − f_{2j} g_{2j−1}),

where f_j denotes the jth entry of f, and the sum has to be understood mod d in the discrete case [12].
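As a quick illustration, the discrete Poisson bracket (symplectic inner product) can be computed directly; the short Python sketch below is our own, and the component ordering (q_1, p_1, q_2, p_2, ...) is an assumption. It recovers the familiar commutation pattern of quadratures:

```python
def symp(f, g, d):
    """Discrete symplectic (Poisson-bracket) inner product of two
    quadrature observables f, g in Z_d^{2n}, with components ordered
    as (q1, p1, q2, p2, ...)."""
    n = len(f) // 2
    s = sum(f[2 * j] * g[2 * j + 1] - f[2 * j + 1] * g[2 * j] for j in range(n))
    return s % d

q1, p1 = (1, 0, 0, 0), (0, 1, 0, 0)   # quadratures of system 1
q2 = (0, 0, 1, 0)                     # position of system 2

assert symp(q1, p1, 2) == 1            # q1 and p1 do not commute
assert symp(q1, q2, 2) == 0            # positions of different systems commute
assert symp(q1, (1, 0, 1, 0), 2) == 0  # q1 and q1 + q2 are jointly knowable
```

A set of observables is jointly knowable exactly when this bracket vanishes pairwise, which is what the classical complementarity principle restricts.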
Composing continuous systems. To continue our example, suppose that we bring in a second 1D system, and that we know the local observable corresponding to the position of this system, for instance f_2 = q_2 = 10. The global epistemic state is then a product state. If, on the other hand, instead of q_2 we knew a global property, for example that the positions of the two systems were perfectly correlated, q_1 = q_2, we could represent this through a new observable f_3 = q_1 − q_2 = 0, making the global epistemic state a correlated state. Reversible transformations. Valid reversible transformations are symplectic transformations, represented by a pair (U, a), where a is an ontic state (throughout this paper a = 0 unless otherwise stated) and U is a symplectic matrix, that is, a matrix preserving the symplectic inner product. An ontic state that was compatible with the agent's knowledge before the transformation is mapped to an ontic state compatible with the transformed knowledge. Measurements. In the continuous case, a measurement consists of a vector space V_π of observables, which can have outcomes v_π ∈ Ω; the possible (inequivalent) outcomes also induce a partition of the ontic state space, as in the discrete case. The probability of obtaining outcome v_π if the system is in the ontic state m is given by [12] ξ(v_π|m) = δ_{V_π^⊥ + v_π}(m).

(A.4)
Intuitively, this means that we can only obtain measurement outcomes that are compatible with the ontic state of the system. We can denote a measurement V_π and its outcome v_π by the pair (V_π, v_π). With the conditional probability distribution ξ(v_π|m) we can calculate the probability of a measurement outcome given the epistemic state (V, v) [12], where µ_{(V,v)}(m) is the uniform probability distribution over all ontic states compatible with the knowledge of the observables in V having valuations v.
Example of a measurement. For example, consider a state in which the position observable (1, 0)^T is known, and suppose that we measure the position. The epistemic state is such that we know the position to be 2 · 3 = 6, so the only compatible measurement outcome is v_π = 6.
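The outcome rule ξ(v_π|m) = δ_{V_π^⊥+v_π}(m) can be checked numerically. The following Python sketch is our own, with the hypothetical choice d = 7 so that the known position value 6 fits in Z_d; it averages the deterministic outcome response over the epistemic state:

```python
d = 7                                   # assumed odd prime dimension
ontic = [(q, p) for q in range(d) for p in range(d)]

f = (1, 0)                              # position observable
known_value = 6                         # the agent knows q = 6

# Ontic states compatible with the epistemic state (uniform over them).
state = [m for m in ontic if (f[0] * m[0] + f[1] * m[1]) % d == known_value]

def outcome_prob(v):
    """Probability of outcome v when measuring the observable f,
    averaging the deterministic outcome over the epistemic state."""
    hits = [m for m in state if (f[0] * m[0] + f[1] * m[1]) % d == v]
    return len(hits) / len(state)

# Only the compatible outcome v_pi = 6 can occur.
assert outcome_prob(6) == 1.0
assert all(outcome_prob(v) == 0.0 for v in range(d) if v != 6)
```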
Post-measurement state. The post-measurement state of the system (given the information about the outcome) is as follows.
Theorem A.1 (Measurement update rule [18]). When an epistemic state (V, v) is subjected to a measurement V π , and outcome v π is obtained, the epistemic state is updated to

B Formal results and proofs B.1 Linear algebra lemmas
Here we list the results from linear algebra we used in the refinement of the generalization of Spekkens' toy theory.
Lemma B.1 ([34]). Let W ⊂ Ω be a subvector space or submodule and w ∈ Ω. Then for any a ∈ W + w it holds that W + a = W + w. Proof. Let b ∈ W + a, so that b = w_1 + a for some w_1 ∈ W. As a ∈ W + w we know that a = w_2 + w for some w_2 ∈ W. Plugging the expression for a into the expression for b we find b = w_1 + w_2 + w ∈ W + w, so W + a ⊂ W + w. Conversely, let c ∈ W + w; using the expression for a from above, c = c − a + a with c − a ∈ W, so c ∈ W + a. From these two inclusions we conclude that W + a = W + w.
Lemma B.2 ([34]). Let W, V ⊂ Ω be two subvector spaces or submodules and v, w ∈ Ω. Then if (W + w) ∩ (V + v) ≠ ∅, there exists u such that (W + w) ∩ (V + v) = (W ∩ V) + u. Proof. If (W + w) ∩ (V + v) ≠ ∅ then there exists a u ∈ (W + w) ∩ (V + v). Lemma B.1 allows us to write W + w = W + u and V + v = V + u. This means each element in V + v is of the form u_1 = v_1 + u for v_1 ∈ V, and each element in W + w is of the form u_2 = w_1 + u for w_1 ∈ W. Therefore u_1 is in W + w if and only if v_1 ∈ W, and u_2 is in V + v if and only if w_1 ∈ V. Therefore we can conclude that (W + w) ∩ (V + v) = (W ∩ V) + u.
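Both coset lemmas are easy to confirm by brute force. Here is a short Python check (our own, over the hypothetical space Z_5^2 with W = span{(1, 2)}, V = span{(0, 1)} and arbitrarily chosen representatives):

```python
d = 5

def coset(S, t):
    """The coset S + t in Z_d^2."""
    return {tuple((x + y) % d for x, y in zip(s, t)) for s in S}

W = {(k % d, (2 * k) % d) for k in range(d)}   # span{(1, 2)}
V = {(0, k) for k in range(d)}                 # span{(0, 1)}
w, v = (3, 1), (2, 0)

Ww, Vv = coset(W, w), coset(V, v)

# Lemma B.1: any representative of a coset generates the same coset.
for a in Ww:
    assert coset(W, a) == Ww

# Lemma B.2: a non-empty intersection of cosets is a coset of W ∩ V.
inter = Ww & Vv
assert inter
u = next(iter(inter))
assert inter == coset(W & V, u)
```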

Lemma B.3 ([13]). Let V, W ⊂ Ω be two subvector spaces or submodules. Then it holds that (V ⊕ W)^⊥ = V^⊥ ∩ W^⊥.
Proof. Let a ∈ (V ⊕ W)^⊥; then for all vectors u ∈ V ⊕ W it holds that a^T u = 0. In particular, this holds for all u ∈ V and all u ∈ W, as V and W are subsets of V ⊕ W. Therefore, we can conclude that a ∈ V^⊥ ∩ W^⊥. Conversely, let b ∈ V^⊥ ∩ W^⊥ and let u ∈ V ⊕ W be arbitrary; writing u as a sum of an element of V and an element of W, we find that b^T u = 0, so b ∈ (V ⊕ W)^⊥.
Lemma B.4 ([35,36]). Let V ⊂ Ω be a subspace or submodule. Then it holds that (V^⊥)^⊥ = V.
Proof. The proof for general d can be found in [35]. For general vector spaces (d prime or the continuous case) the proof can be found in [36].

B.2 Measurement as a physical process
Here you can find the proofs of the statements used to formulate the rules for the measurement process.
Example: measuring position with a continuous 1D pointer. We consider the case where both the measured system and the pointer are continuous 1D systems, characterized by the observables q_S, p_S, q_M and p_M. Note that we cannot start the pointer in the analogue of a Gaussian state, as each toy observable can only be either fully known or completely unknown. We start instead with a pointer well-localized in position space, with q_M = x_0. Suppose that we want to measure the position of the first system, S. If S starts in a state of well-defined position q, then we expect to end up in a "classically correlated" toy state analogous to |q⟩_S |x_0 + q⟩_M. If on the other hand S starts with well-defined momentum p and undefined position, we would expect the final global state to be somehow analogous to a superposition Σ_q α(q, x_0, p) |q⟩_S |x_0 + q⟩_M. Theorem B.6 shows that in the toy theory there exists a transformation that produces a final state analogous to the quantum case.
Theorem B.5 (Coherent copy of position). Let the epistemic state of the memory system be initialized as q_memory = 0, and let the initial epistemic state of the measured system be (V_information, v) with V_information spanned by a vector v_1 ∈ Z_d^2 or R^2, so that the two together fix the initial epistemic state of the composite system. Proof. We apply the transformation S to the initial state. This transformation ensures that the position of the memory system is always equal to the position of the information system. Furthermore, we can trace out the memory system, that is, take the marginal of the probability distribution over the ontic states of the memory system. The probability distribution before marginalisation is the uniform distribution over the coset determined by V_information^⊥ and the valuation v. After marginalisation this results in a uniform probability distribution over this coset with the last two entries removed. Therefore, if the second (momentum) entry of v_1 is non-zero, the traced-out state is the maximally mixed state, and thus an equal mixture of all positions of the information system. In non-prime dimensions, even if this entry is non-zero, the state does not need to be the maximally mixed state. On the other hand, if the second entry of v_1 is zero, the state is the one where the information system has a definite position.
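At the ontic level, the copy transformation can be simulated directly. The Python sketch below is our own, for the hypothetical dimension d = 5; it applies the CNOT-like symplectic map (q_1, p_1, q_2, p_2) ↦ (q_1, p_1 − p_2, q_1 + q_2, p_2) and checks both cases described in the proof:

```python
d = 5

def copy_map(m):
    """CNOT-like symplectic map: q2 += q1, p1 -= p2 (mod d)."""
    q1, p1, q2, p2 = m
    return (q1, (p1 - p2) % d, (q1 + q2) % d, p2)

def apply(state):
    return {copy_map(m) for m in state}

# Case 1: system position known (q1 = 3), memory initialized at q2 = 0.
pos_state = {(3, p1, 0, p2) for p1 in range(d) for p2 in range(d)}
after_pos = apply(pos_state)
# The memory position now equals the system position in every ontic state.
assert all(m[2] == m[0] == 3 for m in after_pos)

# Case 2: system momentum known (p1 = 0), memory at q2 = 0.
mom_state = {(q1, 0, 0, p2) for q1 in range(d) for p2 in range(d)}
after_mom = apply(mom_state)
# Tracing out the memory leaves the system maximally mixed: all d^2 pairs.
assert {(m[0], m[1]) for m in after_mom} == {(q, p) for q in range(d)
                                             for p in range(d)}
```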

Examples.
To obtain some intuition, let us look at two examples. Recall that this symplectic transformation acts on an initial state as (V, v) → ((S^T)^{−1} V, S v). Consider the initial state analogous to |φ⟩_S |x_0⟩_M, that is, the product state between an arbitrary state of S and a well-defined memory position q_M = x_0. In particular, after a quick simplification we can see that the transformation maps the initial state to a state analogous to |q⟩_S |x_0 + q⟩_M, as desired (it satisfies q_S = q and q_M = q + x_0). On the other hand, U transforms the state analogous to |p⟩_S |x_0⟩_M in such a way that the observable f of the information system in the old basis is transformed to q_1 in the new basis, and the observable v of the memory system is transformed to the position of the memory system. Therefore, the case where we copy q_1 into the position of the memory system is related by a basis change to the case where we copy f into the value of v of the memory system. By a calculation analogous to that of Theorem B.5, the transformation (S ⊗ 1_{mes_2,...,mes_N}) correlates the position of the memory system with q_1. This means that after the transformation with (S ⊗ 1_{mes_2,...,mes_N}) the state (V_tot, v') is such that V_tot contains the vector (1, 0, −1, 0, 0, . . . , 0)^T, and the valuation of this vector is zero. The last part of the transformation, (1_{mem} ⊗ S^M_{mes})(T_{mem} ⊗ 1_S), transforms (1, 0, −1, 0, 0, . . . , 0)^T → (v, −f), and the valuation of the new vector is still zero, as the transformation is symplectic. Therefore, the above transformation correlates the value of f on the information system with the value of v on the memory system.
As the transformation (1_{mem} ⊗ (S^M_{mes})^{−1})(T^{−1}_{mem} ⊗ 1_S) acts only locally on the information system and the memory system, we can first transform back the joint memory-and-system state, then trace out the memory system, and finally transform only the information system back, obtaining the same result as if we had just traced out the memory system. To determine the marginal we must consider the probability distribution over the ontic states induced by the epistemic state. This probability distribution is the uniform distribution over the ontic states in U^⊥ + u, where (U, u) is the state after the measurement update. This means that M ⊕ (1, 0, 0, ..., 0)^T is a set of commuting observables, and (1, 0, 0, ..., 0)^T is linearly independent of M if and only if (0, 1, 0, ..., 0)^T was not already contained in S^T_M W. In this case, for each value q ∈ Z_d or R there exists a valuation vector v_q such that (1, 0, 0, ..., 0)^T has the valuation q and the valuation of M is constant. This means that the marginal is a mixture of states with all possible values of q_1, except if q_1 was already known. After transforming the marginalized system back with S_M, the results we found for q_1 before the transformation hold for f. In a similar way, for every vector u in the set C one can construct a vector that has symplectic inner product 1 with u and commutes with all other vectors in C. A transformation is symplectic if its columns are such that the first two columns have symplectic inner product 1 with each other but commute with the rest, and likewise for the third and fourth columns, and so on for the following pairs of columns.
Therefore, the transformation whose odd columns are the vectors from the set C, starting with the vector w, and whose even columns are the vectors constructed from the preceding odd column in the manner described above, is by construction symplectic and has w as its first column. Proof. After the transformation, the state of the system is allowed, as V^i_S only has support on the entries of the system belonging to S and V^i_T on the rest. The marginal of the target system after the transformation is obtained with Π_T, the projection on the entries of the system belonging to T, and R_T, which removes the vectors that have support in S. As the transformation cannot depend on v^i_S, V^f_T is independent of v^i_S. Therefore, changing v^i_S can only change the valuation v^f_T of the final state. Furthermore, changing the valuation results either in orthogonal or in identical states.
If the transformation is irreversible, then we can always view it as a reversible transformation on a larger system followed by tracing out some systems. Taking the trace effectively removes vectors from V^f_T. Therefore it can make states that were orthogonal identical, but it cannot produce non-orthogonal states, as in this case too the transformation cannot depend on v^i_S. This establishes that the different marginalised states on the target system differ only in their valuations, which depend only on v^i_S. By eq. (B.21) this dependence is a linear map. Let us call this map C.
Therefore, the sets of identical marginal target states are given by Ker(C) + v^f_T, with v^f_T some valuation for a target state. Hence, each set of identical target marginal states has the same number of elements.
Note that all transformations in Spekkens' toy theory are given by a symplectic matrix S and a shift vector a ∈ Ω; the shift vector only changes the values of the different known observables by a constant, independent of the vector v^i_S. Therefore, adding shifts to transformations also does not help to obtain non-orthogonal states. Proof. This is a direct application of Theorem B.8, which is the formal version of Theorem 3.1. It also has a direct and simple proof in the stabilizer version of the toy theory; we present that proof here for pedagogical purposes, for readers familiar with the stabilizer formalism.
In the stabilizer formulation of the toy theory [11], for appropriate dimensions, each valid epistemic state can be isomorphically identified with a set S = ⟨g_1, g_2, . . .⟩ of commuting or anti-commuting Pauli operators forming a valid stabilizer. Each allowed transformation on toy states corresponds to a permutation Π on the set of stabilizers. This permutation can be represented by a (unitary or anti-unitary) permutation matrix V_Π that acts on each stabilizer g ∈ S as V_Π g V_Π^T. In this case, the transformation Π would have to map the stabilized states in a way that depends on the sign of Z_A. This dependence should be explicit in the corresponding permutation matrix V_Π (which needs to be either unitary or anti-unitary [11]). However, permutation matrices cannot depend on the sign of stabilizers [11], due to the linear nature of quantum theory. To see this, note that the permutation matrix has to act in the same way on each stabilizer, so if the transformation acts in a given way on the first state, its action on the second state is thereby already fixed.

B.3 Predictions with certainty
Here you can find the proofs of statements used to formulate rules for agents making predictions with certainty. The linear algebraic statements used in the proofs are formally justified in Appendix B.1.

Lemma B.10. Given an epistemic state (V, v) and a measurement V_π, an outcome v_π will be measured with certainty if and only if (V^⊥ + v) ⊂ (V_π^⊥ + v_π).
This condition is fulfilled if and only if the following two properties hold: 1. V_π ⊂ V, and 2. (V^⊥ + v) ∩ (V_π^⊥ + v_π) ≠ ∅. Proof. The probability of the measurement outcome v_π given that the system is in the epistemic state (V, v) is given by the condition in [18]. We are able to predict with certainty that the measurement outcome is v_π if this probability equals one. By the definition of δ_{V^⊥+v} and δ_{V_π^⊥+v_π}, this condition is equivalent to (V^⊥ + v) ⊂ (V_π^⊥ + v_π) [18]. One might wonder why the symmetry is broken between (V, v) and (V_π, v_π); the reason is that δ_{V^⊥+v} is normalized, while δ_{V_π^⊥+v_π} is not. The above condition can be further simplified with the following lemma.
Let both properties be fulfilled. Then there exists a vector w ∈ (V^⊥ + v) ∩ (V_π^⊥ + v_π), and it holds that V_π ⊂ V. From the latter, it follows that V_π^⊥ ⊃ V^⊥. This can be seen in the following way: if w ∈ V^⊥ then for any u ∈ V it holds that u^T w = 0, and as V_π ⊂ V this holds in particular for all u ∈ V_π; thus w ∈ V_π^⊥. With lemma B.1 we can then conclude that (V^⊥ + v) ⊂ (V_π^⊥ + v_π). Conversely, let the condition (V^⊥ + v) ⊂ (V_π^⊥ + v_π) be fulfilled. Then condition 2 holds, and V^⊥ ⊂ V_π^⊥. Similarly as above, it follows that (V_π^⊥)^⊥ ⊂ (V^⊥)^⊥. Since V_π, V ⊂ Ω are subspaces or submodules, it holds that (V^⊥)^⊥ = V (lemma B.4), which implies condition 1.
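Lemma B.10 can be illustrated with a two-toy-bit example in Python (our own sketch, not part of the formal proof): when the epistemic state knows q_1 and q_2, a measurement of q_1 + q_2, whose observable lies in the known set V, has a certain outcome, while a measurement of p_1 does not:

```python
d = 2
ontic = [(q1, p1, q2, p2) for q1 in range(d) for p1 in range(d)
         for q2 in range(d) for p2 in range(d)]

def val(f, m):
    """Valuation of observable f on ontic state m (components q1,p1,q2,p2)."""
    return sum(fi * mi for fi, mi in zip(f, m)) % d

# Epistemic state: q1 = 1 and q2 = 0 are known, i.e. V = span{q1, q2}.
state = [m for m in ontic
         if val((1, 0, 0, 0), m) == 1 and val((0, 0, 1, 0), m) == 0]

# Measuring q1 + q2: the measured observable lies in V, so the outcome
# is certain (lemma B.10, condition 1 holds with a compatible valuation).
outcomes = {val((1, 0, 1, 0), m) for m in state}
assert outcomes == {1}

# Measuring p1: the observable is not in V, so the outcome is uncertain.
assert {val((0, 1, 0, 0), m) for m in state} == {0, 1}
```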
Thus, lemma B.10 says that the statement "A = 1 =⇒ B = 1" can be made if and only if the two conditions of the lemma hold for the corresponding measurements; based on lemma B.1, the second condition can be simplified further. Toy Bell scenario. The example of the Bell scenario in the generalized formalism goes as follows: Alice and Bob share an entangled state, expressed through jointly known correlation observables, and Alice measures her system in her chosen basis.

B.4 No Frauchiger-Renner paradox in the toy theory
Experimental setting. We can write an arbitrary global ontic state of Alice and Bob's labs before the measurements start, and we allow for arbitrary measurements by the different agents, fixing only the range of systems measured and the order of measurements. Alice at t = 1 performs a measurement with outcomes v_A ∈ Z_d^{n_R} or R^{n_R}. Alice says she got the outcome "1" if she got the result v_{A=1}.
Bob at t = 2 performs a measurement with outcomes v_B ∈ Z_d^{n_S} or R^{n_S}. Bob says he got the outcome "1" if he got the result v_{B=1}.

Ursula at t = 3 performs the following measurement
where v_{U,1}, v_{U,2} ∈ Z_d^{n_A+n_R} or R^{n_A+n_R}. Ursula says she measured "ok" if she got the outcome v_{U,ok} and "fail" if she got the outcome v_{U,fail}. These two vectors need to correspond to distinct outcomes, i.e. v_{U,ok} − v_{U,fail} ∉ V_U^⊥.

Wigner at t = 4 performs the following measurement
where v_{W,1}, v_{W,2} ∈ Z_d^{n_B+n_S} or R^{n_B+n_S}. Wigner says he measured "ok" if he got the outcome v_{W,ok} and "fail" if he got the outcome v_{W,fail}. These two vectors need to correspond to distinct outcomes. The linear algebraic statements used in the proof are formally justified in Appendix B.1. Proof. We separate the proof into four parts. First, we find necessary conditions on the epistemic state such that predictions can be made with certainty. Second, we find conditions such that the probability that Wigner and Ursula both get "ok" is non-zero. Third, we ensure that the agents reach the desired conclusions in the reasoning chain. Finally, we show that the two measurement outcomes which would lead to a contradiction must be the same and, thus, do not lead to a contradiction.
Epistemic state that allows predictions with certainty. We follow the chain of reasoning and determine what V needs to fulfill to allow for predictions with certainty. "U = ok =⇒ B = 1": the first condition for predictions with certainty constrains V_U. "B = 1 =⇒ A = 1": the first condition for predictions with certainty constrains V_B. "A = 1 =⇒ W = fail": the first condition for predictions with certainty requires that V_W ⊂ V_{commute,A} ⊕ V_A, where V_{commute,A} denotes the subset of V that commutes with all vectors in V_A. In total, this yields the condition (B.39). P(ok, ok) is non-zero. Even if the above chain of reasoning holds, the paradox only occurs if, additionally, the probability P(ok, ok) for Ursula and Wigner to both get "ok" is non-zero. As they perform a joint measurement, lemma B.2 allows us to conclude that this condition can be expressed as a non-empty intersection of cosets; due to (V^⊥)^⊥ = V for a submodule or subspace V (lemma B.4), the resulting expression can be simplified. Getting the correct outcomes with certainty. We want to ensure that in the reasoning chain we can not only conclude with certainty, but can also make the desired conclusions. For example, the previous conditions ensure that when Bob measures he gets some outcome with certainty (which can be calculated from the epistemic state), but the condition we consider here ensures that this outcome is B = 1. "U = ok =⇒ B = 1": we apply the second condition for predictions with certainty. The two measurements are defined on two different subsystems; therefore, we can choose, without loss of generality, v_{B=1} ∈ V_U^⊥ and v_{U,ok} ∈ V_B^⊥. Additionally, it does not matter which intersection is calculated first. Therefore, we can first calculate the intersection (V_B^⊥ + v_{B=1}) ∩ (V_U^⊥ + v_{U,ok}) using lemmas B.1 and B.3 (B.45). Plugging this result into eq.
(B.44) we find a condition which, due to lemma B.1, is equivalent to condition (B.47). "B = 1 =⇒ A = 1": with the same reasoning as above we find an analogous condition. "A = 1 =⇒ W = fail": with the same reasoning as above we find an analogous condition. In total, the epistemic state (V, v) and the measurements of Alice, Bob, Ursula, and Wigner need to fulfill all of the above. Let us assume we have found a state (V, v) and measurements such that the above chain of reasoning holds, and P(ok, ok) ≠ 0. Such a state would lead to a paradox. In the following, we show that such a state and measurements cannot exist.
For all v_W there exist v_A, v_B, and v_U as in eq. (B.39). Without loss of generality, we can choose equivalent measurement outcomes such that v_{B=1} ∈ V_U^⊥ and v_{U,ok} ∈ V_B^⊥. Then it holds that v_U − v_B ∈ V_{commute,U} and v_U − v_B ∈ V_B ⊕ V_U; thus, eq. (B.47) implies a first condition. With the same argument we can find analogous conditions for the other reasoning steps. Adding up eq. (B.53) and eq. (B.50) we find a combined condition, and in particular we can conclude the following: because v_U, v_A, v_B as in eq. (B.39) can be found for all v_W, it holds that v_{W,ok} − v_{W,fail} ∈ V_W^⊥. Thus, v_{W,ok} and v_{W,fail} correspond to the same measurement outcome, as they have the same valuation for any v_W ∈ V_W. In summary, if there were a state and measurements for which the paradoxical chain of reasoning holds, the two measurement outcomes that would lead to a paradox would have to be the same outcome. Therefore, no such paradoxical chain of reasoning is possible.

C Review of quantum processes and experiments

C.1 Quantum measurements as physical processes
The usual way to describe the measurement process in quantum theory is to start with von Neumann measurements [9], which project the system into one of the eigenstates of the observable; such a measurement can be characterized by a set of projectors. However, von Neumann measurements only represent a particular class of measurements, as they extract all information about the observable. Generally, we are also interested in measurements which extract information only partially: while they reduce the uncertainty about the observable, they don't remove it completely. These generalized measurements are known as POVMs (positive operator-valued measures)¹¹. Operationally, POVMs can be implemented by introducing another quantum system, an ancilla (which can play the part of a pointer or a memory), performing a joint unitary on both systems, and then subjecting the ancilla to a von Neumann measurement. For example, the simplest way to measure a qubit in its computational basis, with projectors {|0⟩⟨0|_S, |1⟩⟨1|_S}, is to perform a joint CNOT gate on the system and the memory (Figure 2.1b). Now let us consider the memory update for the case of a continuous system. For a system H_S, we define the orthonormal basis {|x⟩_S | x ∈ ℝ} such that the basis states fulfil

X̂|x⟩_S = x|x⟩_S.    (C.1)

Let us also introduce a memory system H_M, isomorphic to the Hilbert space H_S. Our aim is to describe an operation which coherently copies the state of the system S to the system M. We define the CNOT_X gate as the transformation

CNOT_X : |x_1⟩_S ⊗ |x_2⟩_M ↦ |x_1⟩_S ⊗ |x_1 + x_2⟩_M.

We can call CNOT_X an X-memory update, as it is written w.r.t. the position basis. If we choose x_2 = 0, then the position of the first system S (the one being measured) is copied into the memory system M: after performing the memory update on the state |x_1⟩_S ⊗ |0⟩_M, the state evolves to |x_1⟩_S |x_1⟩_M.
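The qubit version of this memory update can be checked in a few lines of numpy; the amplitudes a, b below are arbitrary illustrative values.

```python
import numpy as np

ket0, ket1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# CNOT with the system S as control and the memory M as target (ordering S, M).
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

a, b = 0.6, 0.8                       # arbitrary amplitudes with a^2 + b^2 = 1
psi_S = a * ket0 + b * ket1           # system state a|0> + b|1>
joint = np.kron(psi_S, ket0)          # memory starts erased, in |0>

after = CNOT @ joint                  # coherent copy: a|00> + b|11>
assert np.allclose(after, a * np.kron(ket0, ket0) + b * np.kron(ket1, ket1))

# A subsequent von Neumann measurement of M yields 0 or 1 with Born weights.
p_m0 = after[0] ** 2 + after[2] ** 2  # components |00>, |10>
p_m1 = after[1] ** 2 + after[3] ** 2  # components |01>, |11>
assert np.isclose(p_m0, a ** 2) and np.isclose(p_m1, b ** 2)
```

Note that the basis information is copied coherently: the system is not collapsed, but entangled with the memory, and only the final von Neumann measurement of M produces an outcome.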
Physically, this can be implemented as the action of the Hamiltonian Ĥ_SM = X̂_S ⊗ P̂_M for time t = 1 (we assume ℏ = 1), which couples the two systems so that the memory is shifted by the position of the system. In principle, we can substitute the observable X̂_S on the system S with any other physical observable Â_S = Σ_k a_k |a_k⟩⟨a_k|_S. In that case, the Hamiltonian takes the form Ĥ_SM = Â_S ⊗ P̂_M, and the memory update CNOT_X acts as

CNOT_X : |a_k⟩_S ⊗ |x⟩_M ↦ |a_k⟩_S ⊗ |x + a_k⟩_M.

Suppose that the pointer M has the initial position wave function ψ_0(x), for instance a Gaussian. The interaction Hamiltonian reads Ĥ_SM = g Â_S ⊗ P̂_M (where we also add the factor g quantifying the strength of the interaction), which leads, for a system initially in Σ_k c_k |a_k⟩_S, to the final global state

Σ_k c_k |a_k⟩_S ⊗ ∫ dx ψ_0(x − gt a_k) |x⟩_M ,

where each outcome is correlated with a shift in the position of the pointer, and the weight of each peak corresponds to the probability of observing that outcome¹² (Figure 2.1c). Let us just look at two examples that will be useful later to compare to the toy theory. For simplicity, suppose that we tune the interaction Hamiltonian and interaction time such that tg = 1, that the measured system S is continuous, and that we are measuring a continuous observable, Â_S = ∫_{−∞}^{+∞} dk a_k |a_k⟩⟨a_k|_S. In particular, let us see what happens when we measure the position (Â_S = X̂_S = ∫_{−∞}^{+∞} dx x |x⟩⟨x|_S) of such a system S.

¹¹ They can be characterized by generalising the set of projectors above: suppose that we pick {Π_i}, a set of m operators with the only restriction Σ_i Π_i† Π_i = 1, where m ≤ n.
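The shifted-pointer picture can be checked numerically on a grid. In the sketch below, the eigenvalues a_k, amplitudes c_k, and pointer width are arbitrary illustrative choices; since the pointer starts in a Gaussian, each branch of the final state is the same Gaussian displaced by gt a_k (here with tg = 1).

```python
import numpy as np

x = np.linspace(-10, 10, 2001)        # discretized pointer coordinate
dx = x[1] - x[0]
sigma = 0.3                           # pointer width (illustrative)

def pointer(mu):
    """Normalized Gaussian pointer wave function centred at mu."""
    psi = np.exp(-(x - mu) ** 2 / (4 * sigma ** 2))
    return psi / np.sqrt(np.sum(np.abs(psi) ** 2) * dx)

a_k = np.array([-3.0, 0.0, 4.0])            # illustrative eigenvalues of A
c_k = np.array([0.5, 0.5, np.sqrt(0.5)])    # amplitudes, sum |c_k|^2 = 1
g = 1.0                                     # coupling strength, tg = 1

# Final state sum_k c_k |a_k>_S (x) |psi_0(x - g a_k)>_M:
# one displaced Gaussian branch per eigenvalue.
branches = np.array([c * pointer(g * a) for c, a in zip(c_k, a_k)])

# Marginal pointer distribution: the |a_k>_S are orthogonal,
# so the branches add incoherently.
p_x = np.sum(np.abs(branches) ** 2, axis=0)
assert np.isclose(np.sum(p_x) * dx, 1.0)

# The weight of the peak near g*a_k is |c_k|^2, the Born probability.
for c, a in zip(c_k, a_k):
    window = np.abs(x - g * a) < 1.5
    assert np.isclose(np.sum(p_x[window]) * dx, abs(c) ** 2, atol=1e-3)
```

For well-separated eigenvalues the peaks are distinguishable and reading the pointer position amounts to a projective measurement of Â_S; for a broad pointer (large σ) the peaks overlap, and the scheme realises a genuinely unsharp POVM.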

C.2 Frauchiger-Renner thought experiment
In this appendix, we present the technical derivation of the Frauchiger-Renner paradox in quantum theory [3]. Here, we assume that the reader has basic knowledge of the postulates and notation of quantum theory. This derivation is adapted from [24] without conditional state preparation.
Systems and initial state. We have two qubits R and S and two participants Alice and Bob, whose memory registers A and B are also modelled as one qubit each. There are two external agents Ursula and Wigner, whose quantum memories don't need to be explicitly modelled at this stage. The initial state of R, S, A, B is a Hardy state of R and S [23], and erased memories.
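The initial state and the two memory updates can be simulated directly in numpy. The concrete Hardy state below (with R = 1 correlated with S = |+⟩) and the "ok"/"fail" basis (|00⟩ ∓ |11⟩)/√2 are one common convention and may differ from the exact phases used in [24]; this is an illustrative sketch, not the paper's own code.

```python
import numpy as np

k0, k1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

def kets(*bits):
    """|b1 b2 b3 b4> as a flat amplitude vector (kron ordering R, A, S, B)."""
    v = np.array([1.0])
    for b in bits:
        v = np.kron(v, k1 if b else k0)
    return v

def apply_cnot(v, control, target):
    """CNOT on a 4-qubit state vector, copying basis info control -> target."""
    a = v.reshape([2, 2, 2, 2]).copy()
    idx = [slice(None)] * 4
    idx[control] = 1                            # act only on the control = 1 block
    t = target - (1 if target > control else 0)
    a[tuple(idx)] = np.flip(a[tuple(idx)].copy(), axis=t)
    return a.reshape(16)

# Hardy state of R and S (here: R = 1 correlated with S = |+>), A and B erased.
psi = (kets(0, 0, 0, 0) + kets(1, 0, 0, 0) + kets(1, 0, 1, 0)) / np.sqrt(3)

# Alice copies R into her memory A; Bob copies S into his memory B.
state = apply_cnot(apply_cnot(psi, 0, 1), 2, 3)
assert np.allclose(state, (kets(0, 0, 0, 0) + kets(1, 1, 0, 0)
                           + kets(1, 1, 1, 1)) / np.sqrt(3))

# Ursula measures RA and Wigner measures SB, each in the basis
# |ok> = (|00> - |11>)/sqrt(2), |fail> = (|00> + |11>)/sqrt(2).
ok = (np.kron(k0, k0) - np.kron(k1, k1)) / np.sqrt(2)
p_ok_ok = abs(np.kron(ok, ok) @ state) ** 2
assert np.isclose(p_ok_ok, 1 / 12)   # both agents obtain "ok" with probability 1/12
```

The nonzero value P(ok, ok) = 1/12 is exactly what makes the chain of deterministic statements below paradoxical.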
We see that Alice reasons that, whenever she finds R in state |1⟩_R, Wigner will obtain outcome "fail" when he measures Bob's lab. That is, "a = 1 =⇒ w = fail". Thus, chaining the statements together (the same reasoning that allowed the reader to solve the three-hats problem), we reach an apparent contradiction: w = u = ok =⇒ b = 1 =⇒ a = 1 =⇒ w = fail.
That is, when the experiment stops with u = w = ok, the agents can make deterministic statements about each other's reasoning and measurement results, concluding that Alice had predicted w = fail, and hence arriving at a logical contradiction.
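Each link of the chain, together with the nonzero P(ok, ok), can be verified numerically. As before, the Hardy state and the "ok"/"fail" basis below follow one common convention (R = 1 correlated with S = |+⟩) and are a sketch rather than the derivation in [24]; since the projectors act on disjoint subsystems, they commute and joint probabilities can be read off directly.

```python
import numpy as np

k0, k1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

def kets(*bits):
    v = np.array([1.0])
    for b in bits:
        v = np.kron(v, k1 if b else k0)
    return v

# State after Alice and Bob copied R and S into their memories (order R, A, S, B).
state = (kets(0, 0, 0, 0) + kets(1, 1, 0, 0) + kets(1, 1, 1, 1)) / np.sqrt(3)

ok = (np.kron(k0, k0) - np.kron(k1, k1)) / np.sqrt(2)
fail = (np.kron(k0, k0) + np.kron(k1, k1)) / np.sqrt(2)
I2, I4 = np.eye(2), np.eye(4)

P_u_ok = np.kron(np.outer(ok, ok), I4)               # Ursula gets "ok" on RA
P_w_ok = np.kron(I4, np.outer(ok, ok))               # Wigner gets "ok" on SB
P_w_fail = np.kron(I4, np.outer(fail, fail))         # Wigner gets "fail" on SB
P_a1 = np.kron(np.kron(I2, np.outer(k1, k1)), I4)    # Alice's memory reads a = 1
P_b1 = np.kron(np.eye(8), np.outer(k1, k1))          # Bob's memory reads b = 1

def prob(*projectors):
    """Joint probability of commuting projectors in |state>."""
    v = state
    for P in projectors:
        v = P @ v
    return float(v @ v)

assert np.isclose(prob(P_u_ok, P_b1), prob(P_u_ok))      # u = ok  =>  b = 1
assert np.isclose(prob(P_b1, P_a1), prob(P_b1))          # b = 1   =>  a = 1
assert np.isclose(prob(P_a1, P_w_fail), prob(P_a1))      # a = 1   =>  w = fail
assert np.isclose(prob(P_u_ok, P_w_ok), 1 / 12)          # yet P(ok, ok) = 1/12 > 0
```

Each implication holds with probability one conditioned on its premise, yet the outcome pair (ok, ok) that the chain forbids still occurs in one run out of twelve, which is the quantitative content of the paradox.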