Discreteness and the origin of probability in quantum mechanics

Attempts to derive the Born rule, either in the Many Worlds or Copenhagen interpretation, are unsatisfactory for systems with only a finite number of degrees of freedom. In the case of Many Worlds this is a serious problem, since its goal is to account for apparent collapse phenomena, including the Born rule for probabilities, assuming only unitary evolution of the wavefunction. For a finite number of degrees of freedom, observers on the vast majority of branches would not deduce the Born rule. However, discreteness of the quantum state space, even if extremely tiny, may restore the validity of the usual arguments.

Quantum mechanics exhibits an odd dichotomy in the time evolution of states. A quantum state undergoes deterministic, unitary evolution until a measurement causes probabilistic, non-unitary collapse. While many physicists do not feel that there is anything wrong with this standard Copenhagen picture, it seems less than economical to postulate two fundamental processes (unitary evolution and non-unitary measurement) if somehow one could suffice. Everett [1] proposed that unitary time evolution of a closed system is sufficient to account for the appearance of measurement collapse to observers inside the system (see also Hartle [2] and DeWitt and Graham [3]), in what has now become known as the Many Worlds (MW) formulation of quantum mechanics.
The MW interpretation is regarded as extravagant, and hence implausible, by many (including at least one of the authors), because of the huge multiplicity of branches of the wavefunction, each of which is presumed to be as real as the others [5]. Before the anti-MW reader abandons this paper, we note that the discussion that follows applies also to the conventional Copenhagen interpretation, with measurement collapse, and may allow a derivation of probability in quantum mechanics from a weaker initial assumption, known as the certainty assumption, along the lines of Hartle [2] (see also Farhi, Goldstone and Gutmann [6] and Coleman and Lesniewski [7]). An attractive doctrine (preferred by one of the authors) is the minimalist view outlined by Hartle [4] insisting that physics should be done without ill-defined words and slogans such as "The other worlds are just as real." Our analysis could also be read within this post-Everett or decoherent histories approach.
We focus on the Born rule in quantum mechanics, and the extent to which it can be derived. The Born rule states that given an observable $A$ with spectrum $\lambda_i$ and eigenstates $|\psi_i\rangle$, the probability of $\lambda_i$ as the outcome of a measurement on state $|\psi\rangle$ is $P_i = |\langle\psi_i|\psi\rangle|^2$. It has been claimed by Everett, Hartle, and others that this rule arises as a consequence of the assumption of unitary evolution, but as we discuss below, the derivation is unsatisfactory for any system with only a finite number of degrees of freedom. (For recent discussions of the Born rule in MW, see [8].) In a recent paper [9] we speculated that quantum gravity and related considerations may imply that quantum state space is itself discrete. We will review our argument in the next section. Here we point out that one consequence of this discreteness in state space may be the emergence of the Born rule, even in the case when the number of degrees of freedom is finite.
The original derivation of the Born rule given by Everett [1], Hartle [2], and others, is quite simple. Consider an ensemble of identically prepared states and a sequence of outcomes $S = (s_1, s_2, \ldots, s_N)$ obtained from measurements on each of the states. The probability $P(S)$ of a given sequence, or class of sequences, calculated using the Born rule, is identical to the norm (magnitude) squared of the projection of $\Psi$ onto eigenstates with the eigenvalues $(s_1, s_2, \ldots, s_N)$, namely $|\langle s_1 s_2 \ldots s_N|\Psi\rangle|^2$. As Everett noted, it follows that an improbable sequence corresponds to a component of $\Psi$ (in the eigenstate basis) with small magnitude. In the formal limit $N \to \infty$, components of $\Psi$ which do not correspond to statistically typical sequences generated by the Born rule have zero magnitude (i.e. converge to the null vector), and therefore do not correspond to physical states. From the frequentist perspective on probability, then, the Born rule is a consequence of excluding zero norm states from the Hilbert space.
To further elucidate, consider a simple example using spin states. Let $|\psi\rangle = c_+|+\rangle + c_-|-\rangle$, and define $p_\pm = |c_\pm|^2$. Then a sequence of measurement outcomes will be of the form $S = \{+ + - + \cdots\}$. If the sequence is generated by the Born rule, then in the limit of large $N$, the fraction of (+) outcomes will be $p_+$ to very good approximation. Any other value for the fraction of (+) outcomes has zero probability at infinite $N$. Correspondingly, the magnitude squared $|\langle s_1 s_2 \cdots s_N|\Psi\rangle|^2$ is zero for any state $\langle s_1 s_2 \cdots s_N|$ in which the fraction of outcomes $s_i$ equal to (+) is not $p_+$.
This can be generalized: if $\langle s_1 s_2 \cdots s_N|$ corresponds to a sequence $S = \{s_1, s_2, \ldots, s_N\}$ which is statistically atypical according to the Born rule, its overlap with $\Psi$ will vanish when $N \to \infty$. Everett referred to these branches of the wavefunction as "maverick worlds": observers on these branches would not deduce the Born rule. Below, we will repeat this discussion for those readers who prefer a more standard Copenhagen interpretation to the MW interpretation.
We can define parameters characterizing the deviation of a maverick world from the central Born value. For example, in the spin example, we might consider $f_+$ to be the frequency of (+) outcomes, so that $\delta = f_+ - p_+$ is the deviation parameter. Then any branch with non-zero $\delta$ will have vanishing norm in the large $N$ limit. When $N$ is strictly infinite all maverick worlds have zero norm. The remaining branches have outcomes $S$ which satisfy the Born rule in the frequentist sense.
The problem with this reasoning is of course that $N$ is never strictly infinite. In fact, given the finite size of the causal horizon of our universe and an ultraviolet cutoff on modes (e.g., from the Planck scale), we obtain a finite, although very large, upper limit on the number of outcomes $N$ which characterize any particular branch of the MW wavefunction. Without invoking something like the Born rule (a correspondence between probability and norm) there is no reason to exclude branches with small but non-zero norm. The problem is exacerbated by the fact that maverick worlds are generally far more numerous than non-maverick worlds. The MW wavefunction branches with each measurement, regardless of how small either of $|c_\pm|^2$ is. This leads to $2^N$ total branches after $N$ measurements. Even if, e.g., $|c_+|^2$ is much larger than $|c_-|^2$, both (+) and (−) outcomes will still occur at each branch, and the structure of the tree is independent of $c_\pm$ as long as neither is zero. The overwhelming majority of branches will have roughly equal numbers of (+) and (−) outcomes. Thus the multiplicity of maverick worlds is enormously larger than that of non-maverick worlds, although their collective magnitude is vanishingly small. Again, without assuming the Born rule, we have no a priori reason to exclude small (but non-zero) norm states.
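The contrast between counting branches and weighing them by norm can be illustrated with a short numerical sketch. The function name `branch_statistics`, the window widths, and the values $N = 200$, $p_+ = 0.9$ are illustrative assumptions, not taken from the text; the code only evaluates the binomial multiplicity and the factor $p_+^n (1-p_+)^{N-n}$ for each number $n$ of (+) outcomes.

```python
from math import comb

def branch_statistics(N, p_plus, center, window):
    """Branches whose (+) frequency f obeys |f - center| <= window.

    Returns (fraction of the 2**N branches, their summed norm squared).
    """
    count = 0
    norm_squared = 0.0
    for n in range(N + 1):
        if abs(n / N - center) <= window:
            multiplicity = comb(N, n)  # number of branches with n (+) outcomes
            count += multiplicity
            norm_squared += multiplicity * p_plus**n * (1 - p_plus)**(N - n)
    return count / 2**N, norm_squared

# Illustrative values (assumptions): N = 200 measurements, p_+ = 0.9.
near_half = branch_statistics(200, 0.9, center=0.5, window=0.1)
near_born = branch_statistics(200, 0.9, center=0.9, window=0.05)
print("f near 1/2: branch fraction %.3g, norm squared %.3g" % near_half)
print("f near p_+: branch fraction %.3g, norm squared %.3g" % near_born)
```

With these values almost all branches sit near $f = 1/2$ while carrying a negligible share of the norm, and the branches near $f = p_+$ are a negligible fraction of the total count while carrying most of the norm.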
Of course, a strict frequentist interpretation of probability requires an infinite sequence of outcomes. However, the use of probability by physicists is more Bayesian than frequentist: confronted with a finite sequence of outcomes, $S = (s_1, s_2, \ldots, s_N)$, our goal is to deduce a predictive model for subsequent outcomes. In this way, we deduce the Born rule based on the limited number of measurements thus far performed on quantum systems.
As mentioned, our discussion may be of interest even to those who do not accept MW, as it pertains to the origin of the Born rule within the Copenhagen, or measurement collapse, interpretation. In particular, it has been proposed by Hartle [2] that the Born rule can be derived from the weaker certainty assumption, stating that when a measurement of an observable $A$ is performed on an eigenstate $|a\rangle$ of $A$, the value $a$ is obtained with certainty. Taking $A$ to be, for example, the frequency operator for (+) outcomes, or any other statistical property, Hartle found that for $N$ infinite, $\Psi$ is an eigenstate of each of these statistical operators, with eigenvalues given by the Born rule.
The discussion parallels that in the MW interpretation. In the standard Copenhagen picture the state $\Psi$ is, in the eigenstate basis, a sum of $2^N$ terms, each term being in one-to-one correspondence with a MW branch or a universe. In the Copenhagen interpretation the outcomes $S$ result from measurements on an ensemble, whereas in MW they specify a particular branch or decoherent history [10] of the wavefunction of the entire universe. The mathematics is the same in either picture: maverick terms collectively have a very small norm that approaches zero as $N$ approaches infinity.
This has the same weakness as the earlier MW argument. For any finite $N$, the state $\Psi$ is only approximately an eigenstate of the frequency operator. The certainty assumption does not specify the outcome of a measurement on an approximate eigenstate, and going further requires an assumption relating the norm of a state vector to the probability of a measurement outcome, which is essentially the Born rule.

DISCRETE STATE SPACE
Consider normalized states $\Psi = \psi \otimes \cdots \otimes \psi$ and $\Psi' = \psi' \otimes \cdots \otimes \psi'$. Suppose that, due to fundamental discreteness, one cannot distinguish $\psi$ and $\psi'$ when $|\psi - \psi'| < \epsilon$. This implies that the direct product states cannot be distinguished when (assuming $\sqrt{N}\epsilon \ll 1$)
$$|\Psi - \Psi'| < \sqrt{N}\,\epsilon. \qquad (2)$$
(We have assumed that $\langle\psi|\psi'\rangle$ is real, which would be the case if $\psi'$ resulted from rotating $\psi$ slightly on the Bloch sphere. Relative phases could lead to order $N\epsilon$ terms in Eq. (2), which allow an acceptable cutoff of maverick branches for even smaller discreteness scale $\epsilon$.) Motivated by this observation, we assume that any (maverick!) components of $\Psi$ with norm less than $\sqrt{N}\epsilon$ can be removed from the wavefunction.
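The $\sqrt{N}\epsilon$ scaling can be checked numerically. The sketch below assumes, as in the text, normalized states with a real overlap, so that $|\psi - \psi'|^2 = 2 - 2\langle\psi|\psi'\rangle$ and $\langle\Psi|\Psi'\rangle = \langle\psi|\psi'\rangle^N$; the function name and the sample values of $\epsilon$ and $N$ are illustrative assumptions.

```python
import math

def product_state_distance(eps, N):
    """|Psi - Psi'| for N-fold products of normalized states with real overlap,
    given a single-copy distance |psi - psi'| equal to eps."""
    overlap = 1.0 - eps**2 / 2.0  # from |psi - psi'|^2 = 2 - 2 <psi|psi'>
    return math.sqrt(2.0 - 2.0 * overlap**N)

# Illustrative values (assumptions), chosen so that sqrt(N) * eps << 1.
eps, N = 1e-4, 10_000
distance = product_state_distance(eps, N)
estimate = math.sqrt(N) * eps
print(distance, estimate)
```

For these values the exact distance and the estimate $\sqrt{N}\epsilon$ agree up to a relative correction of order $N\epsilon^2$.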
We argued in Ref. [9] that quantum gravity suggests a discreteness scale of order $\epsilon \sim E$, where $E$ is the characteristic energy of the system described by $\psi$, in Planck units. Equivalently, $\epsilon \sim L^{-1}$, where $L$ is the characteristic size, or Compton wavelength, of the system. We can motivate this result by noting that quantum gravity seems to imply a minimal length [11] of order the Planck length. A minimal length restricts our ability to distinguish two different orientations of an experimental apparatus, such as a Stern-Gerlach device for measuring the orientation of a spin. (Rotation of the device by an angle less than $L^{-1}$ does not displace any component by more than the Planck length.) Thus, the resulting ambiguity in the spin state even after an ideal measurement is at least of order $\epsilon$ given above (see Fig. 1). There is no way to ensure that the ensemble states $\psi$ are identical to accuracy better than $\epsilon$. For example, each time we pass a spin through the Stern-Gerlach device to produce another $\psi$ there can be no guarantee that the Stern-Gerlach device remains in precisely the same orientation.
While some might consider fundamental discreteness of the space of quantum states (previously referred to in the earlier paper [9] as discrete Hilbert space [12]) to be a radical notion, we find asserting its absolute continuity in the absence of any supporting experimental evidence to be perhaps just as speculative. Consider the case of spacetime: few would claim that spacetime must be absolutely continuous (in fact, most likely it is not [11]); why should quantum state space be different?
It is worth emphasizing that the discreteness we propose has nothing to do with the dimensionality of state space. Rather, it has to do with whether the coefficients $c_i$ in an eigenstate expansion $|\psi\rangle = \sum_i c_i |i\rangle$ are continuous or can only take on a discrete set of values (see Fig. 1).
We have not specified the concrete realization of discreteness, other than to assume that states can be defined only modulo some fundamental uncertainty. There are many ways to define the evolution of a state in a discrete state space. One method would be to write the time evolution operator $e^{-iHt}$ as a product of discrete evolution operators $e^{-iH\Delta t}$ and apply this product of operators sequentially to the state, followed by the "snap to" rule ("snap to nearest lattice site"; see Fig. 1) after each step. This is equivalent to taking classical digital computer simulations literally. That is, by accepting the finite precision of the variable $\psi(x)$ in an ordinary computer program, one obtains a naive discretization of Hilbert space with the "snap to" rule implemented by simple numerical rounding. With limited numerical precision, branches of the wavefunction with very small norm are eventually discarded. This scheme leads to small violations of linear superposition, but only at the level of $\epsilon$.
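A toy version of the stepwise "snap to" evolution can be sketched for a single qubit. Everything specific here is an assumption made for illustration: the lattice is taken to be a square grid of spacing `eps` on the real and imaginary parts of each amplitude, the Hamiltonian is taken to be $\sigma_x$ so that each step is $e^{-i\theta\sigma_x}$, and the function names are hypothetical. The text does not commit to any particular lattice, and this sketch does not renormalize the state.

```python
import math

def snap(state, eps):
    """Round real and imaginary parts of each amplitude to a multiple of eps."""
    return [complex(round(c.real / eps) * eps, round(c.imag / eps) * eps)
            for c in state]

def step(state, theta):
    """Apply exp(-i * theta * sigma_x) to a qubit state [c_plus, c_minus]."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    a, b = state
    return [cos_t * a - 1j * sin_t * b, -1j * sin_t * a + cos_t * b]

def evolve(state, theta, steps, eps):
    """Alternate one small unitary step with the snap-to rule."""
    for _ in range(steps):
        state = snap(step(state, theta), eps)
    return state

# A step that generates an amplitude smaller than eps / 2 is rounded away,
# so the small component is discarded at every step and the state never moves.
frozen = evolve([1 + 0j, 0j], theta=1e-4, steps=100, eps=1e-3)

# With eps much smaller than the step, the evolution tracks the exact result
# [cos(1), -i sin(1)] up to an accumulated error of at most steps * eps.
moved = evolve([1 + 0j, 0j], theta=0.1, steps=10, eps=1e-6)
print(frozen, moved)
```

The first run shows a component of very small norm being discarded by rounding; the second shows that, away from that regime, the deviation from exact unitary evolution stays at the level of the lattice spacing.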
Interestingly, for $\epsilon \sim L^{-1}$, the condition that discreteness have only a small effect on $\Psi$, $\sqrt{N}\epsilon \ll 1$, leads to a condition on the number of degrees of freedom reminiscent of holography [13]:
$$N < L^2 \sim A,$$
where $A$ is the surface area of the region. This bound implies far fewer degrees of freedom than the usual extensive scaling $N \sim L^3$. It can be deduced as a constraint from gravitational collapse [14]. Excluding states from the Hilbert space of the $L^3$ volume which would have already caused gravitational collapse to a black hole, we find the stronger condition $N < A^{3/4} \sim L^{3/2}$.
[Fig. 1, caption fragment: Points between discs can be assigned to the nearest disc.]

NO MAVERICK WORLDS
Consider the spin example from the first section. Let $n = n_+ = f_+ N$ be the number of (+) outcomes in the sequence $S$. We suppress the + subscript in what follows. For $N \gg 1$, the function
$$P(n) = \binom{N}{n}\, p^n (1-p)^{N-n}$$
has a sharp maximum at $n = pN$ and rapidly decreases for $n$ sufficiently far from it. The maximum results from a competition between the combinatorial factor (multiplicity), which is peaked at $n = N/2$, and the product $p^n (1-p)^{N-n}$, which is peaked at either $n = 0$ or $n = N$, unless $p_-$ is extremely close to $p_+$. It follows that when calculating $P(n)$ for $n$ not too far from $pN$, we make a negligible error by assuming $n \gg 1$ and $N - n \gg 1$. The Stirling formula gives
$$P(fN) \approx \left[2\pi N f(1-f)\right]^{-1/2} \exp\left[-N\phi(f)\right],$$
where
$$\phi(f) = f \ln\frac{f}{p} + (1-f)\ln\frac{1-f}{1-p}$$
and $f = n/N$. For large $N$ this becomes sharply peaked.
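The quality of the Stirling approximation near the peak is easy to check numerically. The sketch below assumes the binomial form $P(n) = \binom{N}{n} p^n (1-p)^{N-n}$ and the standard large-$N$ form $P(fN) \approx [2\pi N f(1-f)]^{-1/2} e^{-N\phi(f)}$ with $\phi(f) = f\ln(f/p) + (1-f)\ln[(1-f)/(1-p)]$; the function names and the values $N = 1000$, $p = 0.7$ are illustrative assumptions.

```python
from math import comb, exp, log, pi, sqrt

def exact_P(n, N, p):
    """Collective norm squared of all branches with n (+) outcomes."""
    return comb(N, n) * p**n * (1 - p)**(N - n)

def phi(f, p):
    """Exponent in the Stirling form; valid for 0 < f < 1."""
    return f * log(f / p) + (1 - f) * log((1 - f) / (1 - p))

def stirling_P(n, N, p):
    """Stirling approximation to exact_P, valid for n >> 1 and N - n >> 1."""
    f = n / N
    return exp(-N * phi(f, p)) / sqrt(2 * pi * N * f * (1 - f))

# Illustrative values (assumptions).
N, p = 1000, 0.7
peak = max(range(N + 1), key=lambda n: exact_P(n, N, p))
rel_err = abs(stirling_P(peak, N, p) / exact_P(peak, N, p) - 1.0)
print(peak, rel_err)
```

The maximum of the exact distribution sits at $n = pN$, and the relative error of the Stirling form there is of order $1/N$.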
Expanding $\phi(f)$ around $f = p$, we find
$$\phi(f) = \frac{(f-p)^2}{2p(1-p)} + O\left((f-p)^3\right).$$
The collective magnitude squared of all maverick states $|\delta, N\rangle$ with frequency deviation $|\delta| = |f - p|$ greater than $\delta_0$ is
$$\sum_{|f-p| > \delta_0} P(fN) \approx 2N \int_{\delta_0}^{\infty} \frac{d\delta}{\sqrt{2\pi N p(1-p)}}\, \exp\left[-\frac{N\delta^2}{2p(1-p)}\right] = \operatorname{erfc}\left(\delta_0 \sqrt{\frac{N}{2p(1-p)}}\right).$$
One contribution to the sum comes from the range $f \in [0, p - \delta_0]$ and the other from the range $f \in [p + \delta_0, 1]$. Note that we have replaced $f(1-f)$ in the overall factor in $P(fN)$ by $p(1-p)$. The resulting error should be negligible for our purposes here. Requiring that this collective magnitude squared is less than $N\epsilon^2$ yields, to leading logarithmic accuracy,
$$\delta_0 \gtrsim \left[\frac{2p(1-p)}{N}\right]^{1/2} \left|\ln(N\epsilon^2)\right|^{1/2}.$$
The maximum deviation $\delta$ for undiscarded branches vanishes as $N \to \infty$ for fixed $p$, $\epsilon$. If, for finite $N$, an experimenter could measure all $N$ outcomes which define his branch of the wavefunction, he might find a deviation from the predicted Born frequency $f = p$ as large as $|\ln(N\epsilon^2)|^{1/2}$ standard deviations (i.e., measuring the deviation in units of $N^{-1/2}$). Note that we are working in the regime $N\epsilon^2 \ll 1$. If the discussion in Ref. [9] offers a valid guide, the number $\epsilon$ may be much smaller than $10^{-20}$, so that even if $N$ is as large as Avogadro's number, $N\epsilon^2$ will still be a small number (see example below).
However, an experimenter is unlikely to be able to measure more than a small fraction of the outcomes that determine his branch. Recall that in MW a particular branch of the wavefunction is specified by the sequence of outcomes $S = (s_1, s_2, \ldots, s_N)$. $N$ is the total number of decoherent outcomes on a branch, so it is typically enormous (at least Avogadro's number if the system contains macroscopic objects such as an experimenter). The experimental outcomes available to test Born's rule will be a much smaller number $N_* \ll N$ corresponding to a subset of the $s_i$ directly related to the experiment. Any deviation from the Born rule of order $N^{-1/2}$ will be well within the experimental statistical error of order $N_*^{-1/2}$. Therefore the Born rule will be observed to hold in all the branches which remain after truncation due to discreteness.
This would, however, not be true if we were to set $\epsilon$ to zero, in which case $|\ln(N\epsilon^2)|^{1/2}$ would be infinite.
For definiteness, consider the following numerical example. Let the discreteness scale be truly tiny: $\epsilon \sim 10^{-100}$, and let $N \sim 10^{160}$, which is the Hubble four-volume in fermis. Then $|\ln(N\epsilon^2)|^{1/2} \sim 10$, so unless experimenters can measure more than $10^{-2} N \sim 10^{158}$ quantum outcomes, they will have insufficient statistics to exclude any of the maverick branches which remain after truncation.
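The arithmetic of this example can be reproduced in a few lines. The sketch works with base-10 logarithms to keep the numbers in a comfortable range; the function name is an illustrative assumption, and the threshold $N_* > N / |\ln(N\epsilon^2)|$ follows from comparing a deviation of $|\ln(N\epsilon^2)|^{1/2} N^{-1/2}$ with a statistical error of $N_*^{-1/2}$.

```python
import math

def max_deviation_in_sigmas(log10_N, log10_eps):
    """|ln(N eps^2)|^(1/2), evaluated from base-10 logarithms."""
    log10_N_eps2 = log10_N + 2 * log10_eps
    return math.sqrt(abs(log10_N_eps2 * math.log(10)))

# Values from the example in the text: N ~ 10^160, eps ~ 10^-100.
sigmas = max_deviation_in_sigmas(160, -100)

# Number of measured outcomes N_* needed before the statistical error
# N_*^(-1/2) drops below a deviation of sigmas * N^(-1/2).
log10_N_star = 160 - 2 * math.log10(sigmas)
print(sigmas, log10_N_star)
```

The result is a deviation of roughly ten units of $N^{-1/2}$ and a required $N_*$ of order $10^{158}$, matching the estimate in the text.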

COPENHAGEN AGAIN
If we assume the Copenhagen (collapse) interpretation, our analysis describes when the Born rule can be supplanted by the weaker assumption of certainty of measurement outcome when the measured state is an eigenstate. In a discrete Hilbert space it is natural to extend the notion of eigenstate, so that states within the discreteness distance $\epsilon$ of an eigenstate will also be considered eigenstates. (More precisely, we cannot distinguish between any two such states.) As discussed in the previous section, for large (but finite) $N$, $\Psi$ is approximately an eigenstate of any statistical operator (such as the frequency operator, but also higher moments) with eigenvalue equal to the Born rule value. For example, the wavefunction is sharply peaked at the Born rule frequency value of $f = p$. If, motivated by the discreteness scale $\epsilon$, we simply modify the certainty assumption to include states which are approximate eigenstates, we will have deduced the Born rule from a more elementary assumption.
There is, however, a technical difficulty in defining how close a state $\Psi$ is to being an eigenstate of an operator such as the frequency operator. It would be natural to impose a certainty criterion as follows. Given $\Psi$ satisfying
$$|\Psi - \Psi_f| < \epsilon, \qquad (10)$$
where $\Psi_f$ is an eigenstate of the frequency operator with eigenvalue $f$, we identify $\Psi$ with $\Psi_f$ and require that a measurement of the frequency on $\Psi$ return the value $f$ with certainty. The problem arises because, for finite $N$, no choice of $\Psi = \bigotimes_{a=1}^{N} \psi^{(a)}$ is an exact eigenstate of the frequency operator (except in the trivial cases where $\psi$ is already an eigenstate such as $|+\rangle$ or $|-\rangle$, and in those cases $f$ is either zero or one). The state $\Psi_f$ does not exist, except in the limit $N \to \infty$, so the distance criterion in Eq. (10) cannot be defined. ($\Psi$ and $\Psi_f$ live in Hilbert spaces of very different dimensions.) One has to rely on some other criterion for identifying a state $\Psi$ as a frequency eigenstate.
One possibility is to use the width of $|\Psi|^2$ about the maximum, in comparison to some $\epsilon$-dependent quantity. When the width is sufficiently small, the certainty assumption is assumed to apply. Consider a self-adjoint operator $A$, its eigenvectors $\psi_i$ and eigenvalues $\lambda_i$, $A\psi_i = \lambda_i\psi_i$, $\langle\psi_i|\psi_j\rangle = \delta_{ij}$ ($i, j = 1, \ldots, n$). (For the qubit case $A$ is the spin operator and $n = 2$.) For a state $\psi = \sum_{i=1}^{n} c_i\psi_i$, projection operators $P_i$ satisfy $P_i\psi = c_i\psi_i$. This gives $\langle\psi|P_i|\psi\rangle = |c_i|^2 = p_i$ and $\sum_{i=1}^{n} p_i = 1$. Let us consider the state of $N$ copies of $\psi$, $\Psi = \bigotimes_{a=1}^{N} \psi^{(a)}$. The frequency operators for the eigenvalues $\lambda_i$ are
$$F_i = \frac{1}{N}\sum_{a=1}^{N} P_i^{(a)},$$
where $P_i^{(a)}$ acts on the $a$-th copy. We find
$$\langle\Psi|F_i|\Psi\rangle = p_i,$$
and the variances are $(\Delta F_i)^2 = N^{-1} p_i (1 - p_i)$. Consider $\psi' = \sum_{i=1}^{n} c'_i\psi_i$ close to $\psi$, and require [equation missing in source]. This gives [equation missing in source], which leads to $|\psi - \psi'|^2 < N^{-1}(n - 1)$.
This condition is satisfied if we require $|\psi - \psi'|^2 \ll \epsilon^2$, recalling that $N\epsilon^2 < 1$. It is natural to identify the two states $\psi$ and $\psi'$, and consider them both approximate eigenstates of the frequency operator.

CONCLUSIONS
We argued that attempts to derive the Born rule, either in the Many Worlds or Copenhagen interpretation, are unsatisfactory for systems with only a finite number of degrees of freedom. For Many Worlds this is a serious problem, since its goal is to account for apparent collapse phenomena, including the Born rule for probabilities, assuming only unitary evolution of the wavefunction. For a finite number of degrees of freedom, observers on the vast majority of branches would not deduce the Born rule.
However, we noted that discreteness of the quantum state space, even if extremely tiny, may restore the validity of the usual arguments. Some may regard discreteness as a radical proposal. We might argue that it is actually less speculative than absolute continuity, something that can never be experimentally verified.