1 Introduction

Bell-type theorems connect assumptions about purported hidden variables with empirical predictions for the outcome of quantum correlation experiments. They have three kinds of premises, namely: (1) the independence of outcomes from remote outcomes, called Outcome Independence, OI, (2) the independence of outcomes from remote measurement settings, called Parameter Independence, PI (or remote context independence), and (3) the independence of the measurement settings from the value taken by the purported hidden variable, called Setting Independence, SI (or measurement independence, or no conspiracy, or freedom of choice).Footnote 1

Put abstractly, the logic of a Bell-type “no go” proof is as follows:

$$\begin{aligned} { \frac{\underset{\displaystyle \textrm{non-EMP}}{(\textrm{OI} \wedge \textrm{PI} \wedge \textrm{SI}) \rightarrow \textrm{EMP}}}{(\textrm{non-OI}) \vee (\textrm{non-PI}) \vee (\textrm{non-SI})}} \end{aligned}$$

Here, the first premise is the Bell-type theorem. It is based on a formal analysis of assumptions about hidden variables (the mentioned claims OI, PI, and SI, which together make up the conditional’s antecedent) and empirical consequences EMP that can be derived from them (in the consequent). These consequences can, for example, be a Bell-type inequality, or a modal claim about the possibility or impossibility of certain outcomes. The second premise is empirical, based on experimental results that (in all cases known so far) agree with the quantum-mechanical predictions and show a violation of the empirical consequences EMP of the hidden variable assumptions, e.g., a violation of the respective Bell inequality, or the occurrence or non-occurrence of certain joint outcomes. The upshot of the proof is that at least one of the assumptions of the Bell-type theorem has to be rejected.
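
The validity of this inference scheme (modus tollens followed by De Morgan's law) can be checked mechanically. The following is a minimal sketch that treats OI, PI, SI, and EMP as plain propositional variables, which is a simplifying assumption made only for illustration; the function names are hypothetical.

```python
from itertools import product

def implies(p, q):
    """Material conditional p -> q."""
    return (not p) or q

def argument_is_valid():
    """Check every truth assignment: whenever both premises hold,
    the conclusion must hold as well."""
    for oi, pi, si, emp in product([False, True], repeat=4):
        theorem = implies(oi and pi and si, emp)       # (OI & PI & SI) -> EMP
        empirical = not emp                            # non-EMP
        conclusion = (not oi) or (not pi) or (not si)  # non-OI v non-PI v non-SI
        if theorem and empirical and not conclusion:
            return False
    return True

print(argument_is_valid())  # True
```

The check only confirms the propositional skeleton of the argument; it says nothing about which of the three assumptions should be given up.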

From among these three assumptions, SI has been the least investigated, especially when it comes to formal modeling. Our paper aims to improve upon this situation. We ask what exactly a violation of SI looks like in the context of Bell-type theorems. That is, assume that we sacrifice SI (while keeping OI and PI intact) in such a way that EMP is violated exactly as QM requires: which formal constraints follow for the functioning of hidden variables?

All three conditions, OI, PI, and SI, are independence assumptions. OI and PI together are meant to capture a notion of locality. The meaning of independence varies among different Bell-type theorems. In probabilistic Bell-type theorems, independence takes the form of statistical independence, whereas in non-probabilistic ones, independence has a purely modal character, as it concerns the joint possibility or impossibility of some given events. The three premises differ in the support they call for. It is customary to argue for independence assumptions by appealing to relativity theory, since the relevant pairs of events (outcome-outcome for OI, outcome-setting for PI, and, for SI, the creation of the entangled state at the source and the selection of settings) are modeled as being, and ideally are, space-like separated. The prohibition of faster-than-light signaling, via the no-communication theorem (Peres and Terno, 2004), then entails that there is no direct interaction between these events. In order to establish their true independence, a further step is needed to exclude indirect dependence via their common past. For OI and PI, it is natural to postulate that the hidden state in a properly chosen spatiotemporal region screens off one event from the other (e.g., one outcome from another outcome).

This move is, however, questionable for the SI condition: while the experimenter’s choice of the setting may be space-like separated from the region whose full physical state is encoded by the value \(\lambda \) of the purported hidden variable, both the physical state underlying the experimenter’s choice and the state \(\lambda \) evolved from the physical state of a region in their common past. It may then be natural to think that this past state determined both the experimenter’s choice and the state \(\lambda \). Why should the selection of the setting be independent from the hidden variable? Such considerations motivate a different independence argument: a violation of SI might encroach upon the experimenters’ freedom to select settings. We will show that this is indeed the case.

Our focus in this paper is on deterministic hidden variable models for the GHZ set-up. In a deterministic model of this sort, the hidden variable together with the measurement settings determines a unique measurement outcome. In contrast to super-deterministic models, the choice of the settings is not determined by the hidden variable. We focus on the GHZ (aka GHSZ) argument (Greenberger et al., 1989, 1990; Mermin, 1990) because of its non-probabilistic character.Footnote 2 Following Mermin (1990), deterministic hidden variables posited for the GHZ experiment are usually called “instruction sets”: each value of the hidden variable provides instructions for every possible combination of settings.

In the GHZ setup, SI together with the assumption of non-contextual instruction sets (which, as we will soon see, incorporates OI and PI) implies a contradiction with QM. Our approach will be to assume the existence of non-contextual instruction sets together with QM, so that a violation of SI will follow, and to see what this violation looks like. Additionally, we will also reflect on models with contextual instruction sets.

Our paper is organized as follows. In Section 2 we explain the salient features of our approach: the focus on a single run and the possibilities involved in the run, and the distinction between two kinds of indeterminism: Nature-induced indeterminism and agent-induced indeterminism. Section 3 provides a modest overview of debates over SI: it reports on how researchers argued for SI and what kinds of models for Bell-type experiments with a violation of SI are currently on the market. Section 4 sketches the framework of Branching Space-Times theory (BST) that we will be using, and presents an initial BST structure for GHZ. Section 5 provides a formal account of adding hidden variables in BST. The gist of the paper is in Section 6 where we analyze the costs of violating the SI condition in the framework of BST. Section 7 provides a parallel analysis assuming contextual instruction sets. We discuss our results and draw some conclusions in Section 8.

2 Modal aspects of a single run

Our modeling concerns single runs. Emphatically, we take it that a single run, apart from being an actual process, involves certain possibilities. After a run is finished, one can say which outcomes and settings actually occurred, but also which outcomes and settings might have occurred instead. We are thus after a modal analysis of the branching possibilities of a single run.

Such talk of possibilities raises the epistemological question of how one can know what is possible in a concrete run of a given experiment. One source of such information is statistical data. If each run were a wholly separate event, any statistics that one might collect would be arbitrary. But a concrete run is always one from among a collection of runs of a given experiment. In an experiment, the experimenters take care that the initial parts of these runs, in particular those involving a source, are as similar as possible. Thus, unless the runs are differentiated by some factors that the experimenters have no handle upon, what happens in other actual runs tells us what is or was possible in a given concrete run. Another source of information is quantum mechanics, since it tells us what the possible outcomes of a measurement are; tentatively, we take this information at face value. Additionally, our feeling of agency informs us that sometimes we could have done different things than what we actually did, and we use this intuition to ascribe to the experimenters the capacity to select alternative possible settings.

The modal information about a single run is captured by an initial (surface) structure that describes what the possible experimental settings and the possible outcomes are, and how separate possibilities combine into joint possibilities or impossibilities. Such a model can be unsatisfactory on theoretical grounds—for example, there may be weird correlations (e.g., a surface violation of Outcome Independence) that call out for explanation, triggering the idea of structure extensions that may remove surface indeterminism. We will provide a formal description in Section 5.

The option of wholly or partially removing indeterminism links to a central distinction in our approach. In the typical narrative of Bell-type experiments, the selection of settings is done by experimenters, in contrast to the production of the measurement outcomes, which is due to Nature. Working physicists talk about the things that they can control and the things that they cannot control (see, e.g., the comment of Aspect (2015), quoted in more detail below: “I am just pursuing my profession of experimental physics”). These observations point to a distinction between Nature-induced indeterminism and agent-induced indeterminism. While a super-deterministic structure extension removes both kinds of indeterminism, an arguably subtler approach aims to remove only Nature-induced indeterminism while preserving the agent-induced indeterminism present in the surface structure.

Our focus on a single run separates our approach from other analyses of Bell-type theorems in the philosophy of physics in which the focus is on the statistics of the experiments while, importantly, these statistics are not accounted for in terms of the modal features present in a given run, presumably because such irreducibly modal features are deemed to be philosophically suspect. In this spirit, Fine (1989) recommends getting accustomed to the weird statistics of remote correlations. Another example is the Budapest group of Rédei, Szabó and Hofer-Szabó, with their project of accounting for non-local statistical correlations by invoking principles that are weakenings of Reichenbach’s common cause principle (such as the separate-common-cause system of Hofer-Szabó (2008)). Arguably, the experimentalist’s thinking is different: knowing the statistics and other relevant data, one considers a concrete run and asks what could have gone differently in that very run. This is clearly seen, for instance, in a well-known paper by Bell (2001). While introducing his principle of local causality, he starts with an intuitive reading, which concerns token events and spatiotemporal regions in a given run (“The direct causes (and effects) of events are near by ...”). Then, after warning the reader that cleaning up this intuitive reading to be mathematically tractable may “throw out the baby with the bathwater”, he offers a probabilistic reading: “A theory will be said to be locally causal if the probabilities attached to values of local beables in a space-time region 1 are unaltered by specification of values of local beables in a space-like separated region 2 ...”. This still refers to objects in a single run, namely, “values of local beables in a region”; probabilities are ascribed to such values. In philosophical parlance, these probabilities are naturally interpreted as single-case objective probabilities. Of course, to assess them one needs the statistics of multiple runs of the experiment.Footnote 3

Finally, a remark on our take on modalities is in order. As is standard in philosophy nowadays, we will analyze alternative possibilities in terms of scenarios for a given system. If a particular measurement run has two alternative possible outcomes, we will represent it by two scenarios, each containing the same measurement event but a different outcome. This readily yields the picture of branching scenarios that share a common past, which is the core insight of BST. We call such branching scenarios “histories”, to separate our approach from possible-worlds theories, which represent the above experiment by two non-overlapping scenarios without a shared past, each containing a different, albeit very similar, measurement event.

3 SI in perspective

As was already mentioned, the task of justifying SI is somewhat different from justifying the remaining premises of Bell-type theorems. An experimenter’s choice of settings shares a common past with the source that supplies the purported value of the hidden variable. Even if one succeeds in realizing the respective events of the creation of the entangled quantum system and the experimenters’ selection of settings in non-overlapping and indeed space-like separated regions, what rules out correlations via the common past? Why assume that these events are independent?Footnote 4

A typical move is to invoke the idea of the experimenters’ freedom to perform an experiment the way they like. Somewhat in this vein, Bell (1987, p. 101) emphasizes that this is “our everyday way of looking at the world” and that it is legitimate to accommodate this view in physics. He points to theories, QM and others, that have “free external variables”, on top of the internal variables that are inherent to the theory itself. These external variables are used to represent experimental conditions. In Bell’s view they can be used to represent the results of actions of “free willed experimenters”, though it may be ambiguous “what and where the free elements are”.

A different strand of support for SI comes from observing that a failure of SI entails, in non-probabilistic contexts, that some settings could not be chosen, given that some hidden state obtained. In a probabilistic context, the analogous consequence is that a hidden state makes some settings harder to choose than others. This looks like Nature systematically hiding its secrets from the experimenters, by biasing their probing of Nature. To quote Goldstein et al. (2011), a failure of SI would amount to “some incredible conspiracy of nature (the kind of conspiracy that would make any kind of scientific inquiry impossible)”. For that reason SI is often called the “no conspiracy condition” in the literature.

The worry is thus that a failure of SI would undermine the very notion of experiment as freely probing Nature, because the cornerstone of this practice is the ability to arbitrarily choose which experiment to perform and which settings to choose. Needless to say, there are various impediments that constrain this ability in concrete situations. Yet the art and ingenuity of experimenting allows such impediments to be gradually removed. Whether they can always be fully removed is likely an article of faith. As such, it may fail. But opting for a violation of SI amounts to giving up on this idea without providing independent evidence against it. It is like saying: you may go on and carry out your experiments, but no matter how hard you try, you cannot pose certain physical questions, such as what would occur if a setting were chosen when a system is in a hidden state prohibiting exactly this setting. Aspect (2015) expresses the point as follows:

   Taken to its logical extreme, however, this argument implies that humans do not have free will, since two experimentalists, even separated by a great distance, could not be said to have independently chosen the settings of their measuring apparatuses. Upon being accused of metaphysics for his fundamental assumption that experimentalists have the liberty to freely choose their polarizer settings, Bell replied (1987, 31): “Disgrace indeed, to be caught in a metaphysical position! But it seems to me that in this matter I am just pursuing my profession of theoretical physics.” I would like to humbly join Bell and claim that, in rejecting such an ad hoc explanation that might be invoked for any observed correlation, “I am just pursuing my profession of experimental physics.”

SI has strong intuitive support from such considerations, but in Bell-type experiments, the selection of settings is always due to some piece of technology (even if, as in the “BIG Bell Test Collaboration” (Abellán et al., 2018), human choices are incorporated). In the breakthrough experiment of Aspect et al. (1982), fast-switching acousto-optical devices combined with polarizers were used. Weihs et al. (1998) improved on the periodic switching in this experiment by using physical randomness to select the settings. Recent improved Bell tests, also using physical randomness to select settings, managed to close further loopholes connected to detector efficiency (see, e.g., Hensen et al., 2015; Giustina et al., 2015; Shalm et al., 2015). As Pironio (2015) and Scarani (2019, Ch. 1.5) point out, however, the absurdity of a violation of SI may in fact be felt more strongly when the selection of settings would have to be correlated not with physical random generators, but with obviously unrelated phenomena such as bits from the Geneva phone book. Shalm et al. (2015) in fact used bits from digitized movies for setting selections. In all these experiments, quantum mechanics was vindicated, and technological uses of Bell-test-certified randomness are being developed (Bierhorst et al., 2018).

Until quite recently, questioning SI was a rare move, with researchers putting the blame for the violation of Bell-type inequalities on OI or (less typically) PI. It is a recent trend to sacrifice SI in the modeling of Bell-type experiments. A precursor of this trend was Szabó’s (2000) model for the Bell-Aspect experiment, which satisfies OI and PI and only very subtly violates SI.Footnote 5 In a similar vein, Hall (2010, 2016) showed that a small violation of SI is enough to accommodate a violation of Bell’s inequality. On the interpretational level, the “cellular automaton interpretation of QM” due to ’t Hooft (2016) features a violation of SI and thereby accounts for the violation of Bell’s inequalities. SI was also discussed from an operationalist point of view by Hermens (2019). More recently, Ciepielewski et al. (2023), in an extended framework of Bohm’s pilot wave theory, constructed a model in which SI is violated (and Bell’s inequality is violated). Purported advantages of abandoning SI are also advocated by Hossenfelder (2020) in her vocal campaign for super-determinism (see also Hossenfelder and Palmer, 2020). Relatedly, there is also Adlam’s (2018, 2022) quest for globally deterministic laws of nature.

To sum up: SI concerns the independence of a setting selection from the physical state of a nearby spatiotemporal region. It is typically defended by appealing to the experimenters’ ability to freely choose settings, and by considering the consequences of the absence of this ability. The debate over SI has so far been informal. It is therefore interesting to formally model agents’ free choices (without attempting to address Bell’s question of how and where this freedom is realized). The relevant question then is: what is so special about these selection events from a formal point of view? We will offer an analysis that addresses this question, in the hope of achieving a better understanding of the consequences of violating SI.

4 BST, modal correlations, and a BST structure for GHZ

We will carry out our analysis in the BST framework (launched by Belnap, 1992), which permits a rigorous discussion of possibilities and impossibilities in a rudimentary relativistic framework. BST can thus represent modal and spatiotemporal features of experiments. We refer the reader to Belnap et al. (2022) for a full exposition of BST. In Section 4.1 we rehearse its basic features; in Section 4.2 we sketch a BST structure for GHZ.

4.1 BST and modal correlations

The theory of BST has two primitive notions: a set of possible point-like events W, to represent our world including its modal aspects, and a strict partial ordering <, where “\(e_1 < e_2\)” is read as “event \(e_2\) can happen after event \(e_1\)”. The ordering, taken together with the theory’s axioms, is used to define consistency (co-possibility) and then to carve out the maximal consistent subsets from the basic set W, which are called “histories”. Any two histories share a common past, so a BST structure is forward-branching. In one specific class of BST structures, known as Minkowskian Branching Structures (MBS), each history is isomorphic to the Minkowski space-time of a given dimension. One can thus visualize an MBS as a stack of forward-branching copies of Minkowski space-time, differentiated by the distribution of some properties (e.g., values of some fields). For a picture, see Fig. 1. A BST structure with more than one history has distinguished events, called choice points, which are the maximal elements in the overlap of two histories. A choice point has multiple possible outcomes. Each outcome is identified with a set of histories that (1) all contain the choice point but (2) do not branch at this choice point (but do branch somewhere else).

Fig. 1

A Minkowskian Branching Structure in BST. The structure has two histories, \(h_1\) and \(h_2\), which split at the choice point c that occurs in both histories. The shaded region of \(h_2\) indicates where \(h_2\) deviates from \(h_1\). The two histories overlap everywhere outside the forward lightcone above c, as dictated by the BST axioms. Thus, for example, event x occurs in \(h_1\), but not in \(h_2\)

These last two concepts are combined to form the notion of a transition, which, in the most frugal form, is “something and then something else”. In BST we define a (basic) transition as a pair consisting of a choice point and one of its outcomes, written \(e\rightarrowtail O\). An obvious example is a concrete measurement event together with one of its possible outcomes.

On this basis, we can introduce the concept of modal correlations and modal funny business, the latter being the BST tool to represent outcomes that are individually possible, but not jointly possible, yet lack proper local reasons for this impossibility. We first define:

  • A set of transitions is consistent iff all of the transitions can occur together, i.e., the set of outcomes of these transitions has a non-empty intersection. (It follows that any singleton set of transitions is consistent.)

  • A set of transitions that is inconsistent harbors modal correlations.

To motivate the BST definition of modal funny business, we consider three cases, all based on concrete events. The first is the set consisting of two transitions: \(\{\)coin toss \(\rightarrowtail \) heads, coin toss \(\rightarrowtail \) tails\(\}\). It harbors modal correlations, but they are easily understandable, as the set consists of alternative, mutually exclusive transitions. Consider a more complex setup, which adds a second layer to our coin toss: (*) if and only if the coin lands heads, a die is cast. Then, the set \(\{\)coin toss \(\rightarrowtail \) tails, die cast \(\rightarrowtail \) six\(\}\) is modally inconsistent. But this inconsistency is not mysterious: once (*) has been taken into account, it boils down to the inconsistency of the two transitions based on the coin toss. Turning finally to the real McCoy, consider two space-like related coin tosses, say a left (L) toss and a right (R) toss, each with two possible outcomes, L heads (R heads) and L tails (R tails).Footnote 6 Now suppose that L-heads cannot occur with R-tails, and vice versa. Then the set \(\{\)L-coin toss \(\rightarrowtail \) L-heads, R-coin toss \(\rightarrowtail \) R-tails\(\}\) is modally inconsistent. This is mysterious, given the space-like separation between the tosses and the fact that there is no inconsistency in their past. Accordingly, we define:

  • A set of transitions exhibits Modal Funny Business (MFB) if it harbors modal correlations, but the choice points of the transitions are all pairwise space-like related.Footnote 7
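
The three definitions above can be illustrated on the two-coin example. The following is a minimal sketch, not part of the BST formalism itself: it assumes a toy structure with only two histories (both heads, both tails), represents each outcome directly as a set of histories, and encodes the space-like relation as an explicit set of pairs. All names (`HISTORIES`, `SLR`, `is_consistent`, `exhibits_mfb`) are hypothetical.

```python
from itertools import combinations

# Toy structure (an assumption for illustration): the correlated coins admit
# only two histories, "both heads" and "both tails".
HISTORIES = {"h_HH", "h_TT"}

# A transition is a pair (choice point, outcome); an outcome is a set of histories.
L_HEADS = ("L", frozenset({"h_HH"}))
L_TAILS = ("L", frozenset({"h_TT"}))
R_HEADS = ("R", frozenset({"h_HH"}))
R_TAILS = ("R", frozenset({"h_TT"}))

# Pairs of choice points that are space-like related (SLR).
SLR = {frozenset({"L", "R"})}

def is_consistent(transitions):
    """Consistent iff the outcomes share at least one history.
    Expects a non-empty list of transitions."""
    outcomes = [outcome for _, outcome in transitions]
    return bool(frozenset.intersection(*outcomes))

def exhibits_mfb(transitions):
    """MFB iff the set is inconsistent and its choice points are pairwise SLR."""
    pairwise_slr = all(
        frozenset({a[0], b[0]}) in SLR for a, b in combinations(transitions, 2)
    )
    return (not is_consistent(transitions)) and pairwise_slr

print(exhibits_mfb([L_HEADS, R_TAILS]))  # True: inconsistent, tosses are SLR
print(exhibits_mfb([L_HEADS, L_TAILS]))  # False: alternatives at one choice point
print(exhibits_mfb([L_HEADS, R_HEADS]))  # False: the set is consistent
```

The second call shows why the first coin-toss case above is unproblematic: the inconsistency is located at a single choice point, so the pairwise SLR condition fails.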

Can modal funny business be explained away? The idea of adding instructions somewhere in the past of the correlated transitions naturally comes to mind. If in a given run the instructions “L-heads – R-heads” are given, the outcomes L-heads and R-heads occur, and analogously with the “L-tails – R-tails” instructions. If there are neither “L-heads – R-tails” nor “L-tails – R-heads” instructions, no run can produce such a mixed pair, and there is nothing mysterious about this.

In the GHZ setup there are several “measurement \(\rightarrowtail \) outcome” transition sets that exhibit MFB. We now recall the GHZ setup and describe a BST (surface) structure of the experiment.

4.2 A BST structure for GHZ

We consider Mermin’s (1990) exposition of the 3-particle GHZ Gedankenexperiment. The setup has three measurement stations, and at each station there is an event \(c_i\) (\(i = 1, 2, 3\)) of selecting that station’s setting out of the two options x and y. After the selection, each station realizes the corresponding one of two possible alternative measurements, \(x_i\) or \(y_i\). Each measurement in each station has two possible results, “\(+\)” and “−”. The arrangement is shown in Fig. 2.

Fig. 2

Schematic drawing of the GHZ set-up, adapted from Fig. 1 of Mermin (1990)

In line with our conceptual distinction between Nature-induced indeterminism and agent-induced indeterminism, we highlight the difference between the selection of settings and the production of measurement results. We normally consider the setting selections to be due to the choices of agents; we call them C-events. The production of results, however, we consider to be due to Nature, and we call the corresponding measurement events E-events. For GHZ, we thus have the following two sets, to be modeled as choice points in BST (see Fig. 3):

$$ C = \{c_1,c_2,c_3\};\quad E = \{x_1,y_1,x_2,y_2,x_3,y_3\}. $$

Space-like relations (SLR) hold at five different levels: (i) any two C-events are SLR, (ii) every C-event is SLR to every x- and every y-event at a different station, and (iii) every C-event is SLR to every “\(+\)” and every “−” result at a different station. It follows that (iv) measurement events from E at different stations are SLR and (v) a measurement event at one station is SLR to the results (“\(+\)” or “−”) at different stations.

Modal Funny Business in a BST structure of this set-up is due to the following constraints on which combinations of results are jointly possible:

  (xxx) If all three settings are “x”, the number of “−” outcomes must be odd;

  (yy) If exactly two settings are “y”, the number of “−” outcomes must be even.

This limits the number of possible histories from the combinatorially allowable 64 (2 settings \(\times \) 2 outcomes at each of the 3 stations gives \(4^3\) combinations) to just 48 physically possible histories (4 combinations violate (xxx), and \(3\times 4\) combinations violate (yy)).
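
The count can be reproduced by brute-force enumeration. The following is a minimal sketch; the encoding of settings and results as characters and the function name `allowed` are assumptions made for illustration.

```python
from itertools import product

def allowed(settings, results):
    """Apply the (xxx) and (yy) constraints to one combination.
    settings: tuple of 'x'/'y' per station; results: tuple of '+'/'-' per station."""
    minus = results.count("-")
    if settings == ("x", "x", "x"):
        return minus % 2 == 1   # (xxx): odd number of "-" results
    if settings.count("y") == 2:
        return minus % 2 == 0   # (yy): even number of "-" results
    return True                 # all other settings are unconstrained

combinations_all = [
    (s, r) for s in product("xy", repeat=3) for r in product("+-", repeat=3)
]
histories = [c for c in combinations_all if allowed(*c)]

print(len(combinations_all))  # 64
print(len(histories))         # 48
```

The 16 excluded combinations split into 4 with settings xxx and 12 with exactly two y settings, matching the count given in the text.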

Accordingly, a BST surface structure for GHZ has to have 48 histories, all of which are isomorphic to Minkowski space-time. Any two histories share a common past, which must contain all three selection events, \(c_1\), \(c_2\), and \(c_3\). Whether the overlap of two histories contains regions above these selection events depends on the case at hand. Some pairs of histories share one, two, or even three measurement events, and some pairs of histories in addition share some results (but never all three results). The BST structure additionally has to incorporate the SLR constraints (i)–(v) mentioned above. Figure 4 depicts a fragment of the BST surface structure of GHZ, exhibiting three out of all the 48 histories.

Fig. 3

Schematic drawing of the GHZ set-up, showing the distinction between agent-induced indeterminism (C-events) and Nature-induced indeterminism (E-events)

The constraints (xxx) and (yy) on possible combinations of results entail that there will be exactly 16 cases of MFB corresponding to the violations of the (xxx) and (yy) constraints. Here are two examples, i.e., sets of transitions that exhibit MFB (“\(x_1-\)” stands for the transition from choice point \(x_1\) to its outcome “−”):

$$\begin{aligned} \{x_1-, x_2+, x_3-\}; \qquad \{x_1+, y_2-, y_3+ \}. \end{aligned}$$
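
Both examples can be checked against the (xxx) and (yy) constraints. The following is a minimal sketch in which a transition is written as a three-character label such as `"x1-"` (setting, station, result); this string encoding and the function name `violates_ghz` are assumptions made for illustration.

```python
from itertools import product

def violates_ghz(transitions):
    """True iff a set of three E-transitions, one per station, is excluded by
    the (xxx) or (yy) constraint. Labels look like 'x1-' or 'y2+'."""
    by_station = {int(t[1]): (t[0], t[2]) for t in transitions}
    if sorted(by_station) != [1, 2, 3]:
        raise ValueError("expected exactly one transition per station")
    settings = [by_station[i][0] for i in (1, 2, 3)]
    minus = sum(1 for i in (1, 2, 3) if by_station[i][1] == "-")
    if settings == ["x", "x", "x"]:
        return minus % 2 == 0   # (xxx) demands an odd number of "-"
    if settings.count("y") == 2:
        return minus % 2 == 1   # (yy) demands an even number of "-"
    return False

print(violates_ghz(["x1-", "x2+", "x3-"]))  # True
print(violates_ghz(["x1+", "y2-", "y3+"]))  # True

# Enumerate all three-station transition sets that are excluded.
mfb_sets = [
    labels
    for settings in product("xy", repeat=3)
    for results in product("+-", repeat=3)
    if violates_ghz(
        labels := [f"{s}{i}{r}" for i, (s, r) in enumerate(zip(settings, results), 1)]
    )
]
print(len(mfb_sets))  # 16
```

Since measurement events at different stations are SLR, each excluded set is inconsistent with pairwise SLR choice points, which is what the definition of MFB requires.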

Note that the sets of transitions exhibiting MFB in this structure consist solely of transitions starting with measurement events, i.e., E-events. There are no cases of MFB built on C-events or on a combination of C-events and E-events.Footnote 8

This observation will be important in what follows.

Fig. 4

Schematic BST surface structure for GHZ, showing three out of the 48 possible histories. Shading indicates overlap with \(h_1\): Histories overlap everywhere outside of the future lightcones of choice points with different outcomes

5 A formal account of adding hidden variables in BST

We emphasize that the hidden variable construction that we provide is not specific to quantum mechanics at all. We start with a BST surface structure for a single run of an experiment. This surface structure contains all modal details of that run that are known. These are: (i) the free selection of measurement settings, given by a set C of choice points, and (ii) measurement events with indeterministic outcomes, given by a different set E of choice points. Figure 5 illustrates a BST surface structure that is simpler than GHZ, as it has only two C-events (choices of settings), \(c_1\) and \(c_2\).

Fig. 5

A simple surface structure with two kinds of indeterminism, captured by the distinction between C choice points and E choice points

The motivation for extending this structure must come from instances of Modal Funny Business. We may safely assume that normally, and certainly in the GHZ set-up, Modal Funny Business within C is ruled out, since experimenters are independent agents. Furthermore, Modal Funny Business that (non-trivially) involves initials from both C and E would amount to signaling, which can be ruled out on empirical grounds: if there were such a case of MFB, there would be an outcome \(O_e\) of some \(e \in E\) and an outcome \(O_c\) of some \(c \in C\) that cannot occur together. Accordingly, an experimenter learning at her station that the outcome \(O_e\) of e occurred would thereby learn that at a remote station, \(O_c\) was not selected. So a realistic surface structure will show MFB only among E-transitions. Such a structure, that is, contains no MFB based on a non-trivial combination of C-transitions and E-transitions. We will call such a structure “C/E independent”.

Having defined what a surface structure is, we now consider how it should be extended. The most straightforward way is to consider so-called deterministic hidden variables, such that the value \(\lambda \) that a hidden variable takes, together with the settings, determines the measurement outcomes. (As remarked, this does not amount to super-determinism, as the experimenters’ settings C are not determined by \(\lambda \).) As Fine (1982, Prop. 3) has shown, deterministic hidden variable theories are as general as stochastic hidden variable theories, so that it suffices to consider the former. The value of a deterministic hidden variable can be understood as an instruction set, and we will use that terminology.

The following recipe says how to produce a new (extended) BST structure from a given initial BST (surface) structure (see Fig. 6 for illustration):Footnote 9

  (1) Taking into account the details of the surface structure, calculate the set \(\mathfrak {I}\) of instruction sets \(\lambda \) pertaining to E. (See (5) below on how \(\mathfrak {I}\) is to be calculated.)

  (2) Posit a choice point \(e^* < E\), so that in the extended structure there are copies of E and a new choice point \(e^*\) (which was not a choice point in the surface structure) below every copy of E.

  (3) Make sure that there are as many alternative future possibilities branching at \(e^*\) as there are instruction sets. Identify the different alternative future possibilities at \(e^*\) with the instruction sets \(\lambda \in \mathfrak {I}\).

  (4) Produce copies of each \(e\in E\), as many as there are instruction sets, and relate each copy to a separate instruction set. Let each instruction set \(\lambda \) determine a unique outcome of each copy of \(e\in E\). This removes the surface indeterminism at E. Accordingly, the whole modal structure changes.

  (5) Instruction sets are calculated as follows. In a non-contextual extension, \(\lambda \) is a (possibly partial) function from E to the respective outcomes: for each \(e\in E\) on which \(\lambda \) rules, \(\lambda \) specifies exactly one outcome. Given a surface structure, we calculate how many such functions there are and what they are like (see Section 6).

  (6) We also consider contextual extensions. In this case, \(\lambda \) may specify more than one outcome for some \(e\in E\), but any difference in such a specification must be connected to differences in the measurement context, i.e., differences in outcomes of choices from C (see Section 7).
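The non-contextual part of this recipe can be illustrated on a deliberately tiny example. The following Python sketch is not the BST construction itself: it assumes a hypothetical surface structure with two choice points `a` and `b` in E (no C, no spatiotemporal ordering) whose outcomes are perfectly anti-correlated, and it represents the extended structure as a plain dictionary. All names (`SURFACE_E`, `jointly_possible`, `extend`) are introduced here for illustration only.

```python
from itertools import product

# Hypothetical toy surface structure (NOT the GHZ set-up): two choice
# points in E, each with outcomes "+" and "-".
SURFACE_E = {"a": ("+", "-"), "b": ("+", "-")}


def jointly_possible(assignment):
    """Assumed surface constraint for this toy: a and b never agree."""
    return assignment["a"] != assignment["b"]


def instruction_sets(surface, allowed):
    """Steps (1) and (5): total functions from E to outcomes that
    respect the surface constraints."""
    events = sorted(surface)
    found = []
    for outcomes in product(*(surface[e] for e in events)):
        lam = dict(zip(events, outcomes))
        if allowed(lam):
            found.append(lam)
    return found


def extend(surface, allowed):
    """Steps (2)-(4): a new choice point e* with one outcome per
    instruction set; below each outcome, every e in E gets a copy
    that has exactly one possible outcome."""
    branches = {}
    for i, lam in enumerate(instruction_sets(surface, allowed), start=1):
        name = f"lambda_{i}"
        branches[name] = {f"{e}^{name}": (o,) for e, o in lam.items()}
    return {"e*": branches}


extension = extend(SURFACE_E, jointly_possible)
for name, copies in extension["e*"].items():
    print(name, copies)
```

In this toy case all indeterminism ends up at `e*`: each copy of `a` and `b` has a single outcome, as step (4) requires.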

This formal recipe allows us to introduce two types of deterministic hidden variables providing instruction sets. With respect to the conditions OI, PI, and SI of the main Bell-type argument (Section 1), we can see that OI is automatically fulfilled in both cases: by step (4), the instruction set influences each measurement outcome deterministically in a local way.Footnote 10 The non-contextual instruction sets (step 5) also provide unique instructions in any context, thus also satisfying PI. In contrast, contextual instruction sets (step 6) violate PI. By the general logic of the Bell-type argument, conforming with the predictions of quantum mechanics in a non-contextual extension thus forces a violation of SI. For the contextual extension, however, the status of SI is open. In any case, our explicit formal construction puts us in a position to specify precisely what a violation of SI in either case will look like.

One might have the impression that extending a surface structure is always possible and in fact easy. This is indeed so unless some further constraints are brought into play. We discuss one such constraint, related to the interplay of E and C, below, namely, the preservation of the C/E independence that was present in the surface structure. For the unconstrained case, we can consider the super-deterministic extension of an arbitrary BST surface structure. Super-determinism aims to remove indeterminism from both agents’ actions and the production of results in measurements. In our framework this boils down to setting \(C=\emptyset \), which means that E is the set of all choice points. This set is then used to calculate the instruction sets. Under this assumption there is a completely straightforward extension: simply push all the indeterminism down to some \(e^*\) far enough in the past of E. Then each instruction set, identified with an outcome of \(e^*\), allows for exactly one outcome of every event that is a counterpart of a choice point \(e\in E\) in the surface structure. Wherever there was indeterminism in the surface structure, it has vanished in the extended structure, in which all indeterminism resides in the multiple possible outcomes of the posited new choice event \(e^*\).Footnote 11

Fig. 6. Structure extension. Note that the indeterminism present in the surface structure (e.g., via the two transitions \(x_2+\) and \(x_2-\)) vanishes in the extended structure: instruction set \(\lambda _1\) ensures outcome “\(+\)” of the respective copy of \(x_2\), whereas instruction set \(\lambda _2\) ensures the outcome “−”.

In a super-deterministic extension, SI is violated by construction, but the type of SI violation in other non-contextual or contextual extensions is still to be determined. In such extensions, a non-trivial C/E distinction is acknowledged, so there are two kinds of indeterminism present, including experimenters’ choices (\(C\ne \emptyset \)). As a consequence, the free choices at the \(c\in C\) from the surface structure should be preserved in an extended structure. More precisely, preserving the free choices in C, and thus, preserving SI, amounts to the following two desiderata: (i) if \(c \in C\) has a given number of possible outcomes in a surface structure, its counterpart \(c^{\text {ext}} \in C^{\text {ext}}\) in an extended structure has to have the same number of possible outcomes, and the choices in \(C^\text {ext}\) must not be hampered by the working of the instruction sets, i.e., there should be no MFB between outcomes \(\lambda \) of \(e^*\) and members of \(C^{\text {ext}}\). A perhaps more subtle desideratum relates to C/E independence. Typically, a surface structure satisfies C/E independence, which means that there is no Modal Funny Business between members of C and members of E in that structure. As observed above, one motivation for this condition is the prohibition of superluminal signaling. As we said, our surface structure for GHZ satisfies C/E independence. Preserving the experimenters’ free choices then requires that there be no MFB between members of \(C^\text {ext}\) and those members of \(E^\text {ext}\) that stay indeterministic in the extension. Hence the second desideratum: (ii) if a surface structure satisfies C/E independence, an extended structure should satisfy it as well.

Now, the big question is whether a given BST surface structure can always be extended by introducing instruction sets (non-contextual or contextual extensions) while preserving the free choices in C present in the surface structure. That is: can we satisfy the two desiderata (i) and (ii) in a structure extension? We first discuss non-contextual extensions in Section 6 and then consider contextual extensions in Section 7.

6 “No-go” results for GHZ: the non-contextual case

Models with non-contextual instruction sets for GHZ (so-called local deterministic models) were already investigated by Greenberger et al. (1989, 1990) and Mermin (1990), who found that such models’ empirical consequences contradict the quantum mechanical predictions.Footnote 12 The key assumption of such models is that in each run of the experiment, the source emits a triple of particles endowed with a set of instructions prescribing, for every possible combination of settings, the outcomes of all the measurements. This assumption is meant to preserve the SI condition, since the instructions should not interfere with the settings: each instruction set should be compatible with every triple of settings \((z_1, z_2, z_3)\), where \(z_i \in \{x_i, y_i\}\). To quote Mermin on this point: “The [given] instruction set must determine the outcomes of these [other] runs as well. For who is to prevent somebody from flipping the two switches set to [y] over [x], just before the two particles arrive?”

The other constraints on instruction sets come from the space-like separatedness of the relevant events in the experiment: these relations indicate that an outcome in one station should not depend on an outcome or a setting selected at another station. Clearly, these two ideas are behind the OI and PI conditions. These two conditions entail that an instruction set \(\lambda \) provided by the particle source assigns outcomes for settings in a piecemeal way, that is, independently of other outcomes and other settings.

A non-contextual instruction set \(\lambda \) can thus be viewed as a (partial) function from E into the set of transitions based on elements of E, providing exactly one transition with initial e for each \(e\in E\) on which it is defined, and such that it maps consistent sets of initials to consistent sets of transitions (see Belnap et al., 2022, p. 248).

Ideally, instruction sets for GHZ should specify instructions for all of E. Thus, an instruction set for GHZ should look like the following example:

$$\begin{aligned} \{x_1+,y_1+,x_2+,y_2+,x_3-,y_3+\} \end{aligned}$$

This set specifies a unique outcome for each of the counterparts of events \(x_1\), \(y_1\), \(x_2\), \(y_2\), \(x_3\), and \(y_3\) in the extended structure. The same should hold for each instruction set, i.e., in each possible future of \(e^*\) in the extended structure. However, no such six-element set satisfies both the rules (xxx) and (yy) on the possible combinations of results—this is the upshot of Mermin’s (1990) argument. Our definition of non-contextual instruction sets, however, yields five-element non-contextual instruction sets. That is, a non-contextual instruction set \(\lambda \) specifies outcomes for only five of the possible measurement events. Here is an example of such a set:

$$\begin{aligned} \lambda = \{x_1+,y_1+,x_2+,x_3-,y_3+\}. \end{aligned}$$

This can also be written as a set of instructions, one for each consistent combination of settings:

$$\begin{aligned} \lambda = \{ \{x_1+,x_2+,x_3-\}, \{x_1+,x_2+,y_3+\}, \{y_1+,x_2+,x_3-\}, \{y_1+,x_2+,y_3+\} \}. \end{aligned}$$

Observe that there is no reference to \(y_2\) in this \(\lambda \). This means that on this \(\lambda \), there is no possibility of experimenter 2 choosing \(y_2\). Moreover, the problem is not limited to this particular \(\lambda \). Since the instruction sets all have five elements, there will always be one element of E missing, i.e., one setting will not be possible.
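The counting behind these claims can be checked by brute force. The sketch below rests on assumptions that are not stated in this section: outcomes are encoded as +1 and -1, rule (xxx) is read as "the product of the three x-outcomes is -1", and rule (yy) as "the product is +1 whenever exactly two y-settings are measured". These signs are inferred from the example sets given in the text. The helper names are hypothetical, and "instruction set" is modeled simply as a partial assignment whose covered setting combinations all obey the rules.

```python
from itertools import combinations, product

EVENTS = ("x1", "y1", "x2", "y2", "x3", "y3")
STATIONS = (("x1", "y1"), ("x2", "y2"), ("x3", "y3"))


def obeys_rules(triple):
    """triple maps one event per station to +1 or -1.
    Assumed rules: (xxx) product is -1; (yy) product is +1 when
    exactly two y-settings occur. Other combinations are unconstrained."""
    n_y = sum(event.startswith("y") for event in triple)
    prod = 1
    for outcome in triple.values():
        prod *= outcome
    if n_y == 0:
        return prod == -1
    if n_y == 2:
        return prod == +1
    return True


def is_instruction_set(lam):
    """lam must offer at least one setting per station, and every
    combination of settings it covers must obey the rules."""
    available = [[e for e in station if e in lam] for station in STATIONS]
    if any(not settings for settings in available):
        return False
    return all(
        obeys_rules({e: lam[e] for e in combo})
        for combo in product(*available)
    )


def instruction_sets_of_size(k):
    """All partial assignments on k events that pass is_instruction_set."""
    found = []
    for domain in combinations(EVENTS, k):
        for values in product((+1, -1), repeat=k):
            lam = dict(zip(domain, values))
            if is_instruction_set(lam):
                found.append(lam)
    return found


six = instruction_sets_of_size(6)
five = instruction_sets_of_size(5)
print("six-element sets:", len(six))
print("five-element sets:", len(five))
```

Under these assumptions the search finds no six-element set, in line with Mermin's argument, while five-element sets such as the example \(\lambda \) above do pass the check.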

The consequence is that in the extended structure based on these five-element instruction sets, there are multiple cases of Modal Funny Business between \(e^*\) and the experimenters’ choices \(c_i\), violating desideratum (i) of preserving free choices in extended structures. In the example shown above, MFB is exhibited by the set \(\{ e^*\rightarrowtail \lambda , c^\lambda _2\rightarrowtail y^\lambda _2\}\), consisting of two transitions, one from \(e^*\) to its outcome associated with \(\lambda \) and the other from the respective copy of \(c_2\) to the copy of \(y_2\). For a schematic representation, see Fig. 7.
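As a rough illustration, the blocked setting can be read off a five-element instruction set mechanically. The following sketch treats "MFB between \(e^*\) and a choice" simply as "the instruction set has no entry for that setting"; this is a simplification of the BST definition, and the names `STATIONS` and `blocked_settings` are hypothetical.

```python
# Choice points c1, c2, c3 and the settings each experimenter can select.
STATIONS = {"c1": ("x1", "y1"), "c2": ("x2", "y2"), "c3": ("x3", "y3")}


def blocked_settings(lam):
    """Return (choice point, setting) pairs for which lam holds no
    instruction, i.e. settings that cannot be selected given lam."""
    return [
        (choice, setting)
        for choice, settings in STATIONS.items()
        for setting in settings
        if setting not in lam
    ]


# The five-element example from the text.
lam = {"x1": "+", "y1": "+", "x2": "+", "x3": "-", "y3": "+"}

for choice, setting in blocked_settings(lam):
    # Each hit corresponds to an inconsistent pair of transitions.
    print(f"inconsistent: e* -> lambda together with {choice} -> {setting}")
```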

Fig. 7. MFB arising as a consequence of the non-contextual extension: Given the instruction set \(\lambda = \{x_1+,y_1+,x_2+,x_3-,y_3+\}\) as an outcome of the new choice point \(e^*\), an agent at \(c_2\) cannot select setting \(y_2\).

7 “No-go” results for GHZ: the contextual case

Contextual instruction sets relax one constraint of the non-contextual approach. A contextual instruction from a given set \(\lambda \) need not work in a piecemeal way: instead, an instruction assigns outcomes for whole triples of settings \(z_1, z_2, z_3\) (\(z_i \in \{x_i, y_i\}\)). The consequence is that two instructions from a given \(\lambda \) might assign different outcomes for a setting \(z_i\), depending on what the other settings are. To be more specific, a contextual instruction set \(\lambda \) has to specify instructions for each possible selection of settings, but the instructions for a given member of E may differ given different settings (different measurement contexts). For the GHZ set-up a typical contextual instruction set looks like the following:

$$\begin{aligned} \begin{aligned} \lambda =&\{\{x_1-,x_2-,x_3-\}, \{x_1-,x_2-,\mathbf{y_3-}\}, \{x_1-,y_2-,x_3-\}, \{x_1-,y_2-,\mathbf{y_3+}\},\\&\ \ \{y_1+,x_2-,x_3-\}, \{y_1+,x_2-,\mathbf{y_3-}\}, \{y_1+,y_2-,x_3-\}, \{y_1-,y_2-,\mathbf{y_3+}\}\}. \end{aligned} \end{aligned}$$

To exhibit the contextual aspect of this instruction set, we indicated with bold face the instructions for \(y_3\): observe that they are different for different settings at the remote location 2. Otherwise, the instruction set appears sound: each member of \(\lambda \) satisfies the rules (xxx) and (yy). One may be troubled by this feature:

$$\begin{aligned} \bigcup \lambda = \{ x_1-,x_2-,x_3- ,y_1+,y_1-,y_2-,\mathbf{y_3+}, \mathbf{y_3-}\}, \end{aligned}$$

but this is just contextuality: \(\lambda \) indeed prescribes different outcomes for \(y_3\), depending on the context. It follows that the structure extension in the \(\lambda \)-outcome of \(e^*\) retains \(y_3\) as an indeterministic event with two outcomes, “\(+\)” and “−”. Observe that every possible setting can be chosen in the given \(\lambda \)-outcome of \(e^*\). This is indeed true for every outcome of \(e^*\). There is, therefore, no MFB between a transition based on \(e^*\) and a transition based on an element of \(C^{\text {ext}}\). In contrast to the non-contextual case, condition (i) of the preservation of free choices is thus satisfied.
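These properties of the exemplary \(\lambda \) can be verified mechanically. The sketch below again assumes the +1/-1 encoding and the sign conventions inferred from the examples (product -1 for three x-settings, product +1 for exactly two y-settings); the function names are hypothetical.

```python
from itertools import product

# The contextual instruction set from the text, one dict per triple of settings.
LAMBDA = [
    {"x1": -1, "x2": -1, "x3": -1},
    {"x1": -1, "x2": -1, "y3": -1},
    {"x1": -1, "y2": -1, "x3": -1},
    {"x1": -1, "y2": -1, "y3": +1},
    {"y1": +1, "x2": -1, "x3": -1},
    {"y1": +1, "x2": -1, "y3": -1},
    {"y1": +1, "y2": -1, "x3": -1},
    {"y1": -1, "y2": -1, "y3": +1},
]


def obeys_rules(instruction):
    """Assumed rules: (xxx) product is -1; (yy) product is +1 when
    exactly two y-settings occur; other triples are unconstrained."""
    n_y = sum(event.startswith("y") for event in instruction)
    prod = 1
    for outcome in instruction.values():
        prod *= outcome
    if n_y == 0:
        return prod == -1
    if n_y == 2:
        return prod == +1
    return True


def union(lam):
    """Collect, for each event, the set of outcomes lam prescribes."""
    merged = {}
    for instruction in lam:
        for event, outcome in instruction.items():
            merged.setdefault(event, set()).add(outcome)
    return merged


def covers_all_settings(lam):
    """True if lam has exactly one instruction per triple of settings."""
    wanted = {
        frozenset(t)
        for t in product(("x1", "y1"), ("x2", "y2"), ("x3", "y3"))
    }
    return len(lam) == len(wanted) and {frozenset(i) for i in lam} == wanted


rules_ok = all(obeys_rules(i) for i in LAMBDA)
contextual = sorted(e for e, outs in union(LAMBDA).items() if len(outs) > 1)
print("rules satisfied:", rules_ok)
print("all settings covered:", covers_all_settings(LAMBDA))
print("events with context-dependent outcomes:", contextual)
```

The events reported as context-dependent are exactly those that appear with both signs in \(\bigcup \lambda \).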

A closer examination of our exemplary contextual \(\lambda \) reveals a problem, however: there is no possibility of combining \(y_3+\) with the setting \(x_2\) of choice \(c_2\). This means that there is Modal Funny Business in the extended structure between the (copy of) the transition \(y_3+\) and the (copy of) the transition from \(c_2\) to \(x_2\). Since in the surface structure there was no MFB non-trivially involving transitions based on C and based on E, C/E independence is thus not preserved, violating our second desideratum (ii). For the scenario described by the contextually extended structure, in the possible future of \(e^*\) specified by \(\lambda \) an agent’s selection of \(x_2\) at \(c_2\) is incompatible with the occurrence of result “\(+\)” of the distant measurement \(y_3\) (or vice versa). Figure 8 below depicts this predicament schematically. Again, this kind of MFB is not limited to this particular \(\lambda \): every contextual \(\lambda \) gives rise to such a case of MFB.
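A simple way to look for such clashes in a given contextual \(\lambda \) is to search for an outcome and a remote setting that never occur in the same instruction. The sketch below does this for the exemplary \(\lambda \). It is only a pairwise, combinatorial proxy for MFB in the BST sense, and the function names and the convention that the digit in an event name identifies its station are assumptions made here.

```python
# The contextual instruction set from the text (+1 / -1 encode "+" / "-").
LAMBDA = [
    {"x1": -1, "x2": -1, "x3": -1},
    {"x1": -1, "x2": -1, "y3": -1},
    {"x1": -1, "y2": -1, "x3": -1},
    {"x1": -1, "y2": -1, "y3": +1},
    {"y1": +1, "x2": -1, "x3": -1},
    {"y1": +1, "x2": -1, "y3": -1},
    {"y1": +1, "y2": -1, "x3": -1},
    {"y1": -1, "y2": -1, "y3": +1},
]


def station(event):
    """Assumed naming convention: 'x2' and 'y2' both belong to station '2'."""
    return event[1]


def incompatible_pairs(lam):
    """Pairs ((event, outcome), remote setting) that lam never realizes
    together, although each occurs somewhere in lam."""
    outcomes = {(e, v) for instruction in lam for e, v in instruction.items()}
    settings = {e for instruction in lam for e in instruction}
    pairs = []
    for event, value in sorted(outcomes):
        for setting in sorted(settings):
            if station(setting) == station(event):
                continue  # only remote settings are of interest
            together = any(
                instruction.get(event) == value and setting in instruction
                for instruction in lam
            )
            if not together:
                pairs.append(((event, value), setting))
    return pairs


pairs = incompatible_pairs(LAMBDA)
for (event, value), setting in pairs:
    sign = "+" if value > 0 else "-"
    print(f"outcome {event}{sign} never occurs with remote setting {setting}")
```

The pair consisting of \(y_3+\) and the setting \(x_2\), discussed in the text, is among those reported.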

Fig. 8. MFB arising as a consequence of the contextual extension: given the instruction set \(\lambda \) mentioned in the main text, the selection of setting \(x_2\) and obtaining result “\(+\)” in the remote measurement \(y_3\) are not jointly possible.

8 The price of violating SI: discussion and conclusions

We investigated the SI condition in the GHZ Gedankenexperiment, using the technique of structure extension in BST. After describing the surface structure of the GHZ setup, we considered three kinds of extensions: super-deterministic, non-contextual, and contextual. The extensions are produced by the addition of instruction sets for GHZ that comply with the predictions of QM in the form of the rules (xxx) and (yy), which are borne out experimentally. In line with the general logic of Bell-type “no-go” proofs, the assumption of such instruction sets entails a violation of at least one of the independence assumptions, OI, PI, or SI. Our focus was on the precise formal features of violations of SI in models that preserve the quantum predictions.

As we showed, a super-deterministic extension of the GHZ surface structure is easily available, but (as expected) leaves no room for the free choice of settings. For the two remaining kinds of extensions, which do not assume super-determinism, we separated our discussion into two parts, since the logic of the respective constructions takes two different forms. For non-contextual extensions, we added local instruction sets that satisfy OI and PI, while making sure that quantum predictions are preserved. We showed that the resulting violation of SI consists in a failure to preserve the freedom of selection of settings of the surface structure. New cases of MFB arise in the extended structure, and these violate our desideratum (i): For each purported non-contextual instruction set there is an experimenter who cannot choose a particular setting.

The logic of adding contextual instruction sets is slightly different since in this case an instruction for some measurement station may depend on the setting chosen at a distant station. Accordingly, such instruction sets violate PI. This is, however, not all that happens, as our construction showed: By producing a structure with contextual instruction sets for GHZ that satisfies the predictions of QM, we obtained a structure in which C/E independence in the form of our desideratum (ii) is violated. For contextual extensions, for each instruction set there is thus an experimenter whose choice of a particular setting is incompatible with the occurrence of a particular result of some remote measurement.

Our modal framework, with its notion of MFB, permits one to see that in both extensions, agents’ choices become modally correlated with either instructions or remote outcomes, even though they were not so correlated in the surface structure. Our results suggest that SI should be read as the prohibition of any form of MFB that constrains the freedom of agents. The subtle nature of the SI condition is not visible in less formal arguments in which SI is simply identified with the independence of settings and hidden variables.Footnote 13

From the abstract point of view of a general Bell-type argument, the option of abandoning SI is a way of making the introduction of hidden variables compatible with the empirical results of quantum mechanics. Super-determinism is one way of abandoning SI—it amounts to renouncing any freedom of choice for the experimenters, which goes against the very notion of an experiment. Other ways of abandoning SI via the introduction of deterministic hidden variables may appear to be more enticing. As our results show, however, such a move in fact undermines the very motivation for introducing hidden variables in the first place. In the terminology of BST, that motivation comes from the occurrence of MFB among measurement outcomes in the surface structure. The introduction of local hidden variables (instruction sets) was meant to remove these strange modal correlations. Our constructions show that if the surface correlations are removed—whether via non-contextual or via contextual hidden variables—then MFB reappears in such a way that it now affects the agents’ choices as well. Thus, instead of removing the troublesome occurrence of MFB, the introduction of such hidden variables leads to even more troublesome cases of MFB. Such hidden variables are, thus, a cure that is worse than the disease it was supposed to remedy. The rational response would seem to be to accept non-local quantum correlations as a brute fact about Nature, and as a resource for experiments and new technology.