Quantum theory as a description of robust experiments: derivation of the Pauli equation

It is shown that the Pauli equation and the concept of spin naturally emerge from logical inference applied to experiments on a charged particle under the conditions that (i) space is homogeneous, (ii) the observed events are logically independent, and (iii) the observed frequency distributions are robust with respect to small changes in the conditions under which the experiment is carried out. The derivation does not take recourse to concepts of quantum theory and is based on the same principles which have already been shown to lead to e.g. the Schrödinger equation and the probability distributions of pairs of particles in the singlet or triplet state. Application to Stern-Gerlach experiments with chargeless, magnetic particles provides additional support for the thesis that quantum theory follows from logical inference applied to a well-defined class of experiments.


I. INTRODUCTION
In laboratory experiments, one never has complete knowledge about the mechanisms that affect the outcome of the measurements: there is always uncertainty. In addition, the outcomes of real experiments are always subject to uncertainties with respect to the conditions under which the experiments are carried out.
If there are uncertainties about the individual events and uncertainties about the conditions under which the experiment is carried out, it is often difficult or even impossible to establish relations between individual events. However, in the case that the frequencies of these events are robust (to be discussed in more detail later) it may still be possible to establish relations, not between the individual events, but between the frequency distributions of the observed events.
The algebra of logical inference provides a mathematical framework that facilitates rational reasoning when there is uncertainty [1-5]. A detailed discussion of the foundations of logical inference, its relation to Boolean logic and the derivation of its rules can be found in the papers [1,4] and books [2,3,5]. Logical inference is the foundation for powerful tools such as the maximum entropy method and Bayesian analysis [3,5]. To the best of our knowledge, the first derivation of a non-trivial theoretical description by this general methodology of scientific reasoning appears in Jaynes' papers on the relation between information and (quantum) statistical mechanics [6,7].
A recent paper [8] shows how some of the most basic equations of quantum theory, e.g. the Schrödinger equation and the probability distributions of pairs of particles in the singlet or triplet state, emerge from the application of logical inference to (the abstraction of) robust experiments, without taking recourse to concepts of quantum theory. This logical-inference approach yields results that are unambiguous and independent of individual subjective judgement. In addition, this approach provides a rational explanation for the extraordinary descriptive power of quantum theory [8]. As the introduction of the concept of intrinsic angular momentum, called spin, is a landmark in the development of quantum theory, it is natural to ask under which circumstances this concept appears in a logical-inference treatment.
A classical review of how the concept of spin has been introduced in quantum theory is given by van der Waerden [9]. The original motivation to introduce this new concept was the discovery of the anomalous Zeeman effect and its transition to the normal Zeeman effect with increasing magnetic field (the so-called Paschen-Back effect). Pauli introduced spin in a very formal way by attributing to the electron an additional intrinsic magnetic quantum number taking the values ±1/2 [10]. Although the picture of the spin in terms of a "rotating electron model" was quickly and widely accepted, Pauli was strongly against this picture because of its purely classical-mechanics character. A few years later he suggested the Pauli equation [11], in which this intrinsic degree of freedom was introduced by replacing the single-component wavefunction that appears in Schrödinger's equation by a two-component wavefunction and "Pauli matrices". The most rigorous way to establish a relation with the idea of the rotating electron is the formal observation that these Pauli matrices satisfy the same commutation relations as the generators of the rotation group in three-dimensional space and that the two-component wavefunctions (spinors) provide a double-valued representation of this group [9].
Bohm and his followers, in the spirit of their general approach to provide a causal interpretation of quantum mechanics, tried to construct a purely classical description of spin by analogy with the hydrodynamics of a rotating liquid [12,13]. Despite the beauty of the mathematical description, the interpretation of the spin as an entity, a field, which is distributed over the whole space is rather exotic and can hardly be considered as a derivation and justification of the Pauli equation.
Bohr and Pauli suggested that spin and the related magnetic moment cannot be measured in experiments which can be interpreted in terms of classical trajectories (such as Stern-Gerlach experiments with a free-electron beam), see Ref. 14 and references therein. In an inhomogeneous magnetic field, spin effects cannot be separated from the effects of the Lorentz force due to the orbital motion of the charged particle. However, these difficulties are technical rather than conceptual as they do not consider the possibility that there are neutral particles (not subject to the Lorentz force) with magnetic moments, such as neutrons, for which a Stern-Gerlach experiment is not only possible in principle but has actually been performed [15]. It is clear now that the naive way to demonstrate the "essentially non-classical" character of the spin degree of freedom is premature.
In this paper, we show how the Pauli equation and the concept of spin naturally emerge from the logical-inference analysis of experiments on a charged particle. We carefully analyze the additional assumptions (some of them having obvious analogs in Pauli's analysis of the anomalous Zeeman effect) which are required to pass, in a model-free way, to the Pauli equation.
Conceptually, we return to the roots by first introducing "spin" as some intrinsic degree of freedom characterized by a two-valued number. We will call this two-valued property "color" (e.g. red or blue) to make clear that we leave no room for (mis)interpretations in terms of models of a rotating particle and the like. This is in sharp contrast to the interpretation of Refs. 12 and 13. Note that such a generalization of the concept of spin is very important in modern physics. One instance is the idea of isospin of elementary particles [16], which was originally introduced [17] as a way to describe constituents of atomic nuclei in terms of the same particles (nucleons) with two subspecies (neutrons and protons). Another example is the pseudospin of the charge carriers in graphene [18], used to indicate that a carrier belongs to sublattice A or B of the honeycomb crystal lattice. In both of these examples, there is nothing that is rotating! We further illustrate the power of the approach by an application to Stern-Gerlach experiments with chargeless, magnetic particles, providing additional support to the idea that quantum theory directly follows from logical inference applied to a well-defined class of experiments [8].
To head off possible misunderstandings, it is important to mention that the underlying premise of our approach is that current scientific knowledge derives, through cognitive processes in the human brain, from the discrete events which are observed in laboratory experiments and from relations between those events that we, humans, discover. As a direct consequence of this underlying premise, the validity of the results obtained in our approach does not depend on the assumption that the observed events are signatures of an underlying objective reality which is mathematical in nature (for an overview of older and new work in this direction, see Ref. 19). We take the point of view that the aim of physics is to provide a consistent description of relations between certain events that we perceive (usually with the help of some equipment) with our senses. Some of these relations express cause followed by an effect and others do not. A derivation of a quantum theoretical description from logical-inference principles does not prohibit the construction of cause-and-effect mechanisms that, when analyzed in the same manner as in real experiments, create the impression that the system behaves according to quantum theory [20-22]. Work in this direction has shown that it is indeed possible to build simulation models which reproduce, on an event-by-event basis, the results of interference/entanglement/uncertainty experiments with photons/neutrons [23-27].
The paper is organized as follows. In Section II we specify the measurement scenario and introduce the inference-probability that characterizes the observed detection events (all the elements of logical inference that are required for the purpose of the present paper are summarized in Appendix A). Then, we discuss and formalize the notion of a robust experiment. Although these three steps are similar to the ones taken in the logical-inference derivation of the Schrödinger equation [8], to make the presentation self-contained, we give a detailed account. The next three subsections address the problem of including additional knowledge about the motion of the particle in some limiting cases. In subsection II H we collect the results of the previous subsections and derive the Pauli equation. Section III shows that the same procedure leads to the quantum theoretical equation that describes the motion of an uncharged particle in a magnetic field. A discussion of the relation of the logical-inference derivation of the Pauli equation and earlier work on the hydrodynamic formulation of quantum theory is given in Section IV. A summary and discussion of more general aspects of the work presented in this paper can be found in Section V.

A. Measurement scenario
We consider $N$ repetitions of an experiment on a particle located in 3-dimensional space $\mathbf{\Omega}$. The experiment consists of sending a signal to the particle at discrete times labeled by the integer $\tau = 1, \ldots, M$. It is assumed that for each repetition, labeled by $n = 1, \ldots, N$, the particle is at the unknown position $\mathbf{X}_\tau \in \mathbf{\Omega}$. As the particle receives the signal, it responds by emitting another signal which is recorded by an array of detectors. For each signal emitted by a particle the data recorded by the detector system is used to determine the position $\mathbf{j}_{n,\tau} \in V$ where $V$ denotes the set of voxels with linear extent $[-\Delta, \Delta]/2$ that cover the 3-dimensional space $\mathbf{\Omega}$. The signal also contains additional information which is two-valued and encodes, so to speak, the "color" of the particle at the time when it responded to the signal emitted by the source. This color is represented by variables $k_{n,\tau} = \pm 1$. The frequency distribution of the $(\mathbf{j}, k)_{n,\tau}$'s changes with the applied electric and magnetic field, from which we may infer that there is some form of interaction between the electromagnetic field and the particle.
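The bookkeeping of this measurement scenario can be illustrated with a small sketch. It is not part of the derivation: the function name `simulate_counts`, the one-dimensional row of voxels, and the uniform random click model are all assumptions made purely for illustration. The sketch only shows how the events $(\mathbf{j}, k)_{n,\tau}$ of $N$ repetitions are tallied into counts per voxel, color and time step.

```python
import random
from collections import Counter


def simulate_counts(n_rep, n_steps, n_voxels, seed=0):
    """Tally detector clicks into counts c[(j, k, tau)].

    Assumptions (illustrative only): voxels are indexed by a single
    integer j, and clicks are drawn from a uniform placeholder model.
    """
    rng = random.Random(seed)
    counts = Counter()
    for tau in range(1, n_steps + 1):      # discrete times tau = 1..M
        for _ in range(n_rep):             # repetitions n = 1..N
            j = rng.randrange(n_voxels)    # voxel that registered the click
            k = rng.choice((-1, +1))       # two-valued "color" of the click
            counts[(j, k, tau)] += 1
    return counts


counts = simulate_counts(n_rep=1000, n_steps=3, n_voxels=5)
# Every repetition yields exactly one click per time step, so the counts
# at fixed tau add up to the number of repetitions N.
totals = {tau: sum(c for (j, k, t), c in counts.items() if t == tau)
          for tau in (1, 2, 3)}
```

Only the counts, not the order of the individual events, enter the frequency distributions discussed in the following subsections.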
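The result of $N$ repetitions of the experiment yields the data set
$$\{(\mathbf{j}_{n,\tau}, k_{n,\tau}) \mid \mathbf{j}_{n,\tau} \in V;\ k_{n,\tau} = \pm 1;\ n = 1, \ldots, N;\ \tau = 1, \ldots, M\}, \tag{1}$$
or, denoting the total counts of voxels $\mathbf{j}$ and color $k$ at time $\tau$ by $0 \le c_{\mathbf{j},k,\tau} \le N$, the data set can be represented as
$$D = \{c_{\mathbf{j},k,\tau} \mid \mathbf{j} \in V;\ k = \pm 1;\ \tau = 1, \ldots, M\}. \tag{2}$$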

B. Inference-probability of the data produced by the experiment
The first step is to introduce a real number $0 \le P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z) \le 1$ which represents the plausibility that we observe a detector click $(\mathbf{j}, k)$, conditional on $(\mathbf{X}_\tau, \tau, Z)$. For reasons explained in Appendix B, $P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)$ is called inference-probability (or i-prob for short) and encodes the relation between the unknown location $\mathbf{X}_\tau$ and the location $\mathbf{j}$ and color $k$ registered by the detector system at discrete time $\tau$. Except for the unknown location $\mathbf{X}_\tau$, all other experimental conditions are represented by $Z$ and are assumed to be fixed and identical for all experiments. Note that unlike in the case of parameter estimation, in the case at hand both $P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)$ and the parameters $\mathbf{X}_\tau$ are unknown.
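We make the following, seemingly reasonable assumptions:

1. Each repetition of the experiment represents an event of which the outcome is logically independent of any other such event. By application of the product rule (see Appendix B), a direct consequence of this assumption is that
$$P(\{(\mathbf{j}_{n,\tau}, k_{n,\tau})\}|\mathbf{X}_1, \ldots, \mathbf{X}_M, N, Z) = \prod_{\tau=1}^{M} \prod_{n=1}^{N} P(\mathbf{j}_{n,\tau}, k_{n,\tau}|\mathbf{X}_\tau, \tau, Z), \tag{3}$$
and hence
$$P(D|\mathbf{X}_1, \ldots, \mathbf{X}_M, N, Z) = \prod_{\tau=1}^{M} N! \prod_{\mathbf{j} \in V} \prod_{k=\pm 1} \frac{P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)^{c_{\mathbf{j},k,\tau}}}{c_{\mathbf{j},k,\tau}!}. \tag{4}$$

2. It is assumed that it does not matter where the experiment is carried out. This implies that the i-prob should have the property
$$P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z) = P(\mathbf{j} + \boldsymbol{\zeta}, k|\mathbf{X}_\tau + \boldsymbol{\zeta}, \tau, Z), \tag{5}$$
where $\boldsymbol{\zeta}$ is an arbitrary 3-dimensional vector. The relation Eq. (5) expresses the assumption that space is homogeneous.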

C. Condition for reproducibility and robustness
If the frequencies with which the detectors fire vary erratically with $\{\mathbf{X}_\tau\}$, the experiment would most likely be called "irreproducible". Excluding such experiments, it is desirable that frequency distributions of the data exhibit some kind of robustness, that is, smoothness with respect to small changes of the unknown values of $\{\mathbf{X}_\tau\}$. Unless the experimental setup is sufficiently "robust" in the sense just explained, repeating the run with slightly different values of $\{\mathbf{X}_\tau\}$ would often produce results that are very different from those of other runs and it is common practice to discard such experimental data. Therefore, a "good" experiment must be a robust experiment.
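The robustness with respect to small variations of the conditions under which the experiment is carried out should be reflected in the expression of the i-prob to observe data sets which yield reproducible averages and correlations (with the usual statistical fluctuations). The next step therefore is to determine the expression for $P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)$ which is most insensitive to small changes in $\mathbf{X}_\tau$. It is expedient to formulate this problem as a hypothesis test. Let $H_0$ and $H_1$ be the hypotheses that the same data $D$ is observed for the unknown locations $\{\mathbf{X}_\tau\}$ and $\{\mathbf{X}_\tau + \boldsymbol{\varepsilon}_\tau\}$, respectively. The evidence Ev of hypothesis $H_1$, relative to hypothesis $H_0$, is defined by [3,5]
$$\mathrm{Ev} = \ln \frac{P(D|\{\mathbf{X}_\tau + \boldsymbol{\varepsilon}_\tau\}, N, Z)}{P(D|\{\mathbf{X}_\tau\}, N, Z)} = \sum_{\mathbf{j},k,\tau} c_{\mathbf{j},k,\tau} \ln \frac{P(\mathbf{j}, k|\mathbf{X}_\tau + \boldsymbol{\varepsilon}_\tau, \tau, Z)}{P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)}, \tag{6}$$
where the logarithm serves to facilitate algebraic manipulations. If $H_1$ is more (less) plausible than $H_0$ then Ev > 0 (Ev < 0). In statistics, the r.h.s. of Eq. (6) is known as the log-likelihood function and used for parameter estimation. In contrast, in the present context, the function Eq. (6) is not used to estimate $\mathbf{X}_\tau$ but is a vehicle to express the robustness with respect to the coordinates $\mathbf{X}_\tau$. Writing Eq. (6) as a Taylor series in $\boldsymbol{\varepsilon}_\tau$ and abbreviating $P = P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)$, we have
$$\mathrm{Ev} = \sum_{\mathbf{j},k,\tau} c_{\mathbf{j},k,\tau} \frac{\boldsymbol{\varepsilon}_\tau \cdot \boldsymbol{\nabla}_\tau P}{P} - \frac{1}{2} \sum_{\mathbf{j},k,\tau} c_{\mathbf{j},k,\tau} \frac{(\boldsymbol{\varepsilon}_\tau \cdot \boldsymbol{\nabla}_\tau P)^2}{P^2} + \frac{1}{2} \sum_{\mathbf{j},k,\tau} c_{\mathbf{j},k,\tau} \frac{(\boldsymbol{\varepsilon}_\tau \cdot \boldsymbol{\nabla}_\tau)^2 P}{P} + O(\varepsilon^3), \tag{7}$$
where $\boldsymbol{\nabla}_\tau$ differentiates with respect to $\mathbf{X}_\tau$. Here and in the following we assume that $\boldsymbol{\varepsilon}_\tau$ is sufficiently small such that the third and higher order terms in the $\boldsymbol{\varepsilon}$'s can be ignored. According to our criterion of robustness, the evidence Eq. (7) should change as little as possible as $\mathbf{X}_\tau$ varies. This can be accomplished by minimizing, in absolute value, all the coefficients of the polynomial in $\boldsymbol{\varepsilon}_\tau$, for all allowed $\boldsymbol{\varepsilon}_\tau$ and $\mathbf{X}_\tau$. The clause "for all allowed $\boldsymbol{\varepsilon}_\tau$ and $\mathbf{X}_\tau$" implies that we are dealing here with an instance of a global optimization problem [28].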
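The first and third sum in Eq. (7) vanish identically if we choose $c_{\mathbf{j},k,\tau}/N = P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)$. Indeed, we have
$$\sum_{\mathbf{j},k} (\boldsymbol{\varepsilon}_\tau \cdot \boldsymbol{\nabla}_\tau)^\alpha P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z) = (\boldsymbol{\varepsilon}_\tau \cdot \boldsymbol{\nabla}_\tau)^\alpha \sum_{\mathbf{j},k} P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z) = (\boldsymbol{\varepsilon}_\tau \cdot \boldsymbol{\nabla}_\tau)^\alpha\, 1 = 0, \tag{8}$$
for $\alpha = 1, 2, \ldots$. Although this choice is motivated by the desire to eliminate contributions of order $\boldsymbol{\varepsilon}_\tau$, it follows that our criterion of robustness automatically suggests the intuitively obvious procedure to assign to $P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)$ the value of the observed frequencies of occurrences $c_{\mathbf{j},k,\tau}/N$ [3,5].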
Dropping irrelevant numerical factors and terms of $O(\varepsilon_\tau^3)$, the remaining contribution to the evidence
$$\sum_{\mathbf{j},k,\tau} \frac{[\boldsymbol{\varepsilon}_\tau \cdot \boldsymbol{\nabla}_\tau P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)]^2}{P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)} \tag{9}$$
vanishes identically (for all $\boldsymbol{\varepsilon}_\tau$) if and only if $\boldsymbol{\nabla}_\tau P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z) = 0$, in which case it is clear that we can only describe experiments for which the data does not exhibit any dependence on $\mathbf{X}_\tau$. Experiments which produce frequency distributions that do not depend on the conditions do not increase our knowledge about the relation between the conditions and the observed data. Therefore, from now on, we explicitly exclude such noninformative experiments, that is, the class of experiments for which $\boldsymbol{\nabla}_\tau P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z) = 0$.
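The clause "for all allowed $\boldsymbol{\varepsilon}_\tau$" can be eliminated using the Cauchy-Schwarz inequality. We have
$$\sum_{\mathbf{j},k,\tau} \frac{[\boldsymbol{\varepsilon}_\tau \cdot \boldsymbol{\nabla}_\tau P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)]^2}{P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)} \le \sum_{\mathbf{j},k,\tau} \frac{\boldsymbol{\varepsilon}_\tau^2\, [\boldsymbol{\nabla}_\tau P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)]^2}{P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)} \le \varepsilon^2 \sum_{\mathbf{j},k,\tau} \frac{[\boldsymbol{\nabla}_\tau P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)]^2}{P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)}, \tag{10}$$
where $\varepsilon^2 = \max_\tau \boldsymbol{\varepsilon}_\tau^2$. As the $\boldsymbol{\varepsilon}_\tau$'s are arbitrary (but small), it follows from Eq. (10) that we find the robust solution(s) by searching for the global minimum of
$$I_F = \sum_{\mathbf{j},k,\tau} \frac{1}{P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)} [\boldsymbol{\nabla}_\tau P(\mathbf{j}, k|\mathbf{X}_\tau, \tau, Z)]^2, \tag{11}$$
which is the Fisher information of the measurement scenario described above.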
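The link between the evidence and the Fisher information can be checked numerically in a stripped-down setting. The sketch below is an illustration under stated assumptions, not the authors' calculation: it uses a single time step, no color index, a one-dimensional row of voxels, and a hypothetical Gaussian-shaped i-prob `iprob` (all names are invented for this example). With the idealized frequencies $c_j = N P(j|X)$, the first-order term of the evidence drops out and the evidence for a small shift $\varepsilon$ should be close to $-\tfrac{1}{2} N \varepsilon^2 I_F$.

```python
import math


def iprob(x_center, n_voxels=41, sigma=2.0):
    """Hypothetical i-prob P(j|X): normalized Gaussian profile over voxels j."""
    weights = [math.exp(-0.5 * ((j - x_center) / sigma) ** 2)
               for j in range(n_voxels)]
    norm = sum(weights)
    return [w / norm for w in weights]


def evidence(counts, p_shifted, p_ref):
    """Ev = sum_j c_j ln[P(j|X + eps) / P(j|X)] for one time step."""
    return sum(c * math.log(p1 / p0)
               for c, p0, p1 in zip(counts, p_ref, p_shifted))


def fisher_information(x_center, h=1e-4):
    """I_F = sum_j (dP/dX)^2 / P, with dP/dX from central differences."""
    p = iprob(x_center)
    p_plus = iprob(x_center + h)
    p_minus = iprob(x_center - h)
    return sum(((a - b) / (2 * h)) ** 2 / c
               for a, b, c in zip(p_plus, p_minus, p))


N, X, eps = 10_000, 20.0, 0.01
p_ref = iprob(X)
counts = [N * p for p in p_ref]          # idealized frequencies c_j = N P(j|X)
ev = evidence(counts, iprob(X + eps), p_ref)
quadratic = -0.5 * N * eps ** 2 * fisher_information(X)
print(ev, quadratic)                     # the two numbers should nearly agree
```

In this toy setting the evidence is never positive, so a smaller $I_F$ means that the observed frequencies discriminate less between $X$ and $X + \varepsilon$, which is the sense of robustness used in the text.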

D. Continuum limit
Propositions such as "detector $(\mathbf{j}, k)$ has clicked at time $\tau$" are ultimately related to sensory experience and are therefore discrete in nature. On the other hand, the basic equations of quantum theory such as the Schrödinger, Pauli and Dirac equations are formulated in continuum space. Taking the continuum limit of the discrete formulation connects the two modes of description. Here and in the following, we use the symbols for (partial) derivatives for both the case that the continuum approximation is meaningful and the case that it is not. In the latter, operator symbols such as $\partial/\partial t$ should be read as the corresponding finite-difference operators.
Assuming that the continuum limit is well-defined, we have $V \to \mathbf{\Omega}$ and the Fisher information reads
$$I_F = \sum_{k=\pm 1} \int d\mathbf{x} \int dt\, \frac{1}{P(\mathbf{x}, k|\mathbf{X}, t, Z)} [\boldsymbol{\nabla} P(\mathbf{x}, k|\mathbf{X}, t, Z)]^2, \tag{12}$$
where $\boldsymbol{\nabla}$ denotes derivatives with respect to $\mathbf{x}$ and we have simplified the notation somewhat by writing $\mathbf{X} = \mathbf{X}_t$. We have changed derivatives with respect to $\mathbf{X}$ to derivatives with respect to $\mathbf{x}$ by assuming that $P(\mathbf{x}, k|\mathbf{X}, t, Z) = P(\mathbf{x} + \mathbf{y}, k|\mathbf{X} + \mathbf{y}, t, Z)$ holds for all $\mathbf{y}$ (see assumption 2 in Section II B). Furthermore, it is understood that integrations are over the domain defined by the measurement scenario. Technically speaking, after passing to the continuum limit, $P(\mathbf{x}|\mathbf{X}, t, Z)$ denotes the probability density, not the probability itself. However, as mentioned above, we write integration and derivation symbols for both the discrete case and its continuum limit, and as there can be no confusion about which case we are considering, we use the same symbol for the probability density and the probability.
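For later use, it is expedient to write Eq. (12) in a different form which separates the data about the position of the clicks and the associated color $k$ as much as possible. According to the product rule, we have
$$P(\mathbf{x}, k|\mathbf{X}, t, Z) = P(k|\mathbf{x}, \mathbf{X}, t, Z)\, P(\mathbf{x}|\mathbf{X}, t, Z), \tag{13}$$
which we may, without loss of generality, represent as
$$P(\mathbf{x}, k|\mathbf{X}, t, Z) = \frac{1 + k \cos\theta(\mathbf{x}, \mathbf{X}, t, Z)}{2}\, P(\mathbf{x}|\mathbf{X}, t, Z). \tag{14}$$
Substituting Eq. (14) into Eq. (12) we obtain
$$I_F = \int d\mathbf{x} \int dt \left\{ \frac{[\boldsymbol{\nabla} P(\mathbf{x}|\mathbf{X}, t, Z)]^2}{P(\mathbf{x}|\mathbf{X}, t, Z)} + [\boldsymbol{\nabla} \theta(\mathbf{x}, \mathbf{X}, t, Z)]^2 P(\mathbf{x}|\mathbf{X}, t, Z) \right\}, \tag{15}$$
which is the Fisher information for the measurement scenario described earlier. Note that up to this point, we have not assumed that the particle moves or carries a magnetic moment, nor did we assign any particular meaning to $\theta(\mathbf{x}, \mathbf{X}, t, Z)$.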
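The separation of position and color in the Fisher information can be verified in a few lines. The worked steps below are a sketch that assumes the parametrization $P(\mathbf{x}, k|\mathbf{X}, t, Z) = \tfrac{1}{2}[1 + k\cos\theta]\,P(\mathbf{x}|\mathbf{X}, t, Z)$, which is consistent with the relation $\sum_{k=\pm1} k P(\mathbf{x}, k|\mathbf{X}, t, Z) = \cos\theta\, P(\mathbf{x}|\mathbf{X}, t, Z)$ used later in the text. The shorthand $P_k$, $P$, $c = \cos\theta$ and $s = \sin\theta$ is introduced here only for brevity.

```latex
\begin{align*}
P_k &= \tfrac{1}{2}(1 + k c)\,P, \qquad
\boldsymbol{\nabla} P_k = \tfrac{1}{2}(1 + k c)\,\boldsymbol{\nabla} P
      - \tfrac{1}{2}\, k s\, P\, \boldsymbol{\nabla}\theta, \\
\frac{(\boldsymbol{\nabla} P_k)^2}{P_k}
  &= \tfrac{1}{2}(1 + k c)\,\frac{(\boldsymbol{\nabla} P)^2}{P}
   - k s\, \boldsymbol{\nabla} P \cdot \boldsymbol{\nabla}\theta
   + \frac{s^2}{2(1 + k c)}\, P\, (\boldsymbol{\nabla}\theta)^2, \\
\sum_{k=\pm 1} \frac{(\boldsymbol{\nabla} P_k)^2}{P_k}
  &= \frac{(\boldsymbol{\nabla} P)^2}{P}
   + \frac{s^2}{2}\left[\frac{1}{1 + c} + \frac{1}{1 - c}\right] P\, (\boldsymbol{\nabla}\theta)^2
   = \frac{(\boldsymbol{\nabla} P)^2}{P} + P\, (\boldsymbol{\nabla}\theta)^2 .
\end{align*}
```

The cross term cancels in the sum over $k$, and the last step uses $1/(1+c) + 1/(1-c) = 2/s^2$. Integrating over $\mathbf{x}$ and $t$ gives the two contributions, one from the click positions and one from the color ratio.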
According to the principle laid out earlier, our task is to search for the global minimum of Eq. ( 15), the Fisher information of the measurement scenario described above, thereby excluding the uninformative class of solutions.

E. Including knowledge
It is instructive to first search for the global minimum of Eq. (15) in the case that we do not know whether the particle moves or not and do not know about the effect of the applied electromagnetic field on the frequency distribution of the $(\mathbf{j}, k)_{n,\tau}$'s. In this situation, we may discard the time dependence altogether and search for the non-trivial global minimum of
$$I_F = \int d\mathbf{x} \left\{ \frac{[\boldsymbol{\nabla} P(\mathbf{x}|\mathbf{X}, Z)]^2}{P(\mathbf{x}|\mathbf{X}, Z)} + [\boldsymbol{\nabla} \theta(\mathbf{x}, \mathbf{X}, Z)]^2 P(\mathbf{x}|\mathbf{X}, Z) \right\}.$$
For pedagogical purposes, we now specialize to the case of one spatial dimension and discard the color dependence, that is, we set $\boldsymbol{\nabla}\theta(\mathbf{x}, \mathbf{X}, Z) = 0$ and assume that $\mathbf{\Omega} \to [0, L]$ where $[0, L]$ is the range covered by the detection system. With the latter assumption $P(x|X, Z) = 0$ for $x \le 0$ or $x \ge L$.
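Recalling the assumption that space is homogeneous (see Eq. (5)), we search for solutions of the form
$$P(x|X, Z) = \psi^2(x - X, Z), \tag{16}$$
and we obtain
$$I_F = 4 \int_0^L dx \left( \frac{\partial \psi(x - X, Z)}{\partial x} \right)^2. \tag{17}$$
Recall that the requirement of a global minimum entails that $I_F$ is constant, independent of the unknown position $X$ of the particle.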
The extrema of Eq. (17) are easily found by a standard variational calculation. Introducing the Lagrange multiplier $\mu$ to account for the normalization constraint $\int_0^L dx\, \psi^2(x - X, Z) = 1$, the extrema satisfy
$$\frac{\partial^2 \psi(x - X, Z)}{\partial x^2} = \frac{\mu}{4}\, \psi(x - X, Z). \tag{18}$$
For $\mu > 0$, the solutions of Eq. (18) are hyperbolic functions, a family of solutions that is not compatible with the constraint $P(x|X, Z) = 0$ for $x = 0, L$ and can therefore be ruled out. Writing $\mu = -4\nu^2$, the general solution of Eq. (18) reads
$$\psi(x - X, Z) = [c_1(Z) \cos\nu X + c_2(Z) \sin\nu X] \sin\nu x + [c_2(Z) \cos\nu X - c_1(Z) \sin\nu X] \cos\nu x, \tag{19}$$
where $c_1(Z)$ and $c_2(Z)$ are integration constants. Imposing the boundary condition $\psi(x - X, Z) = 0$ for $x = 0$ we must have $c_1(Z) \sin\nu X = c_2(Z) \cos\nu X$, hence the second term in Eq. (19) vanishes for all $x$. In addition, imposing the boundary condition $\psi(x - X, Z) = 0$ for $x = L$, we must have either $c_1(Z) \cos\nu X + c_2(Z) \sin\nu X = 0$, in which case $\psi(x - X, Z) = 0$ for all $x$, or $\nu = n\pi/L$ for $n = 1, 2, \ldots$, in which case the non-trivial solutions read
$$\psi(x - X, Z) = [c_1(Z) \cos\nu X + c_2(Z) \sin\nu X] \sin\frac{n\pi x}{L}. \tag{20}$$
Using the normalization $\int_0^L dx\, \psi^2(x - X, Z) = 1$ we find
$$[c_1(Z) \cos\nu X + c_2(Z) \sin\nu X]^2 = \frac{2}{L}, \tag{21}$$
from which
$$P(x|X, Z) = \frac{2}{L} \sin^2\frac{n\pi x}{L}, \qquad n = 1, 2, \ldots, \tag{22}$$
which are nothing but the solutions of the Schrödinger equation of a free particle in a one-dimensional box [29]. Note that the r.h.s. of Eq. (22) does not depend on $X$. In other words, from the measured data we cannot infer anything about the unknown position $X$, in concert with the notion that the particle is "free". From Eq. (20) it follows that $I_F = (2n\pi/L)^2$, independent of $X$ as it should be. Clearly, the non-trivial global minimum of $I_F$ is given by Eq. (22) with $n = 1$.
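The value of the Fisher information for the box solutions is easy to confirm numerically. The following sketch assumes the normalized form $\psi(x) = \sqrt{2/L}\,\sin(n\pi x/L)$ and evaluates $I_F = 4\int_0^L (\partial\psi/\partial x)^2\,dx$ with a midpoint rule; the function name `box_fisher_information` and the default parameters are illustrative choices.

```python
import math


def box_fisher_information(n, length=1.0, steps=20_000):
    """Midpoint-rule estimate of I_F = 4 * int_0^L (dpsi/dx)^2 dx.

    Assumes psi(x) = sqrt(2/L) * sin(n*pi*x/L), so that
    dpsi/dx = sqrt(2/L) * (n*pi/L) * cos(n*pi*x/L).
    """
    dx = length / steps
    wave_number = n * math.pi / length
    amplitude = math.sqrt(2.0 / length)
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        dpsi = amplitude * wave_number * math.cos(wave_number * x)
        total += 4.0 * dpsi * dpsi * dx
    return total


for n in (1, 2, 3):
    # Numerical estimate next to the closed form (2*n*pi/L)^2 for L = 1.
    print(n, box_fisher_information(n), (2 * n * math.pi) ** 2)
```

Since the estimate grows as $n^2$, the smallest non-trivial value is obtained for $n = 1$.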
Returning to the case that the frequency distribution of the $(\mathbf{j}, k)_{n,\tau}$'s indicates that the motion of the particle depends on the applied electric or magnetic field, we can incorporate this additional knowledge as a constraint on the global minimization problem. In general, the global minimization problems that we will consider take the form $\lambda I_F + \Lambda$ where $\lambda$ is a parameter (not a Lagrange multiplier) that "weights" the uncertainty in the conditions (represented by $I_F$) relative to the knowledge represented by the functional
$$\Lambda = \sum_{k=\pm 1} \int d\mathbf{x} \int dt\, F(\mathbf{x}, k, t, Z)\, P(\mathbf{x}, k|\mathbf{X}, t, Z), \tag{23}$$
where $F(\mathbf{x}, k, t, Z)$ is a function which encodes the additional knowledge and which does not depend on the unknown position $\mathbf{X}$.
The assumption that space is homogeneous allows us to replace derivatives with respect to X by derivatives with respect to x.This helps in searching for the global minimum of λ I F + Λ because it can be found by searching for the extrema of as a functional of the P(x, k|X,t, Z)'s.By the standard variational procedure, the extrema of λ On the other hand, the global minimum of λ I F + Λ should not depend on unknown X because if it did, it was not a global minimum and in addition, the values of λ I F + Λ would tell us something about X, a contradiction to the assumption that X is unknown.
Taking the derivative of Eq. ( 24) with respect to X (recall X = X t ) yields Comparing Eqs. ( 25) and ( 26) and recalling the constraint ∇ ∇ ∇ τ P(j, k|X τ , N, Z) = 0 used to eliminate uninformative solutions, we conclude that the extrema (and therefore also the global minimum) of Eq. ( 24) are (is) independent of X t , as required.

F. Motion of the particle
We consider the limiting case that there is no uncertainty on the position of the particle, that is $\mathbf{x} = \mathbf{X}$ for all clicks. Then the motion of the particle and the motion of the positions of the detector clicks map one-to-one, for each repetition of the experiment (by assumption).
From the data $\mathbf{x}(t)$ we can compute the vector field $\mathbf{U}(\mathbf{x}, t)$ defined by
$$\frac{d\mathbf{x}}{dt} = \mathbf{U}(\mathbf{x}, t). \tag{27}$$
In principle, $\mathbf{U}(\mathbf{x}, t)$ is fully determined by the data obtained by repeating the experiment under different (initial) conditions. In practice, however, it is unlikely that we have enough data to compute $\mathbf{U}(\mathbf{x}, t)$ for all $(\mathbf{x}, t)$.
We only consider the case in which the position of the clicks is encoded by its $(x, y, z)$-coordinates in an orthogonal frame of reference attached to the observer. Under the usual assumptions of differentiability etc., we can use the Helmholtz-like decomposition of a vector field $\mathbf{U}(\mathbf{x}, t) = \boldsymbol{\nabla} S(\mathbf{x}, t) - \boldsymbol{\nabla} \times \mathbf{W}(\mathbf{x}, t)$. We will not use this form but write [30]
$$\mathbf{U}(\mathbf{x}, t) = \boldsymbol{\nabla} S(\mathbf{x}, t) - \mathbf{A}(\mathbf{x}, t), \tag{28}$$
where $S(\mathbf{x}, t)$ is a scalar function and $\mathbf{A}(\mathbf{x}, t)$ a vector field. Clearly Eq. (28) has some extra freedom which we can remove by requiring that $\mathbf{A}(\mathbf{x}, t) = \boldsymbol{\nabla} \times \mathbf{W}(\mathbf{x}, t)$. This amounts to requiring that $\boldsymbol{\nabla} \cdot \mathbf{A} = 0$. It is convenient not to do this at this stage, so we take Eq. (28) and will impose $\boldsymbol{\nabla} \cdot \mathbf{A} = 0$ later. As mentioned earlier, if differentiability is an issue we should use the finite-difference form of the $\boldsymbol{\nabla}$ operators.
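For convenience, we drop the $(\mathbf{x}, t)$ arguments and switch to a component-wise notation in the few paragraphs that follow. From Eq. (27) and Eq. (28) it directly follows that [30]
$$\frac{d^2 x_i}{dt^2} = \frac{\partial U_i}{\partial t} + \sum_{j=1}^{3} \frac{\partial U_i}{\partial x_j} \frac{dx_j}{dt} = \frac{\partial}{\partial x_i} \left( \frac{\partial S}{\partial t} + \frac{\mathbf{U}^2}{2} \right) + \sum_{j=1}^{3} \left( \frac{\partial A_j}{\partial x_i} - \frac{\partial A_i}{\partial x_j} \right) \frac{dx_j}{dt} - \frac{\partial A_i}{\partial t}, \tag{29}$$
where $i = 1, 2, 3$ labels the coordinate of the detector clicks.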
Introducing the vector field $\mathbf{B} = \boldsymbol{\nabla} \times \mathbf{A}$, the second term in Eq. (29) can be written as
$$\sum_{j=1}^{3} \left( \frac{\partial A_j}{\partial x_i} - \frac{\partial A_i}{\partial x_j} \right) \frac{dx_j}{dt} = \left( \frac{d\mathbf{x}}{dt} \times \mathbf{B} \right)_i. \tag{30}$$
It is important to note that in order to derive Eq. (30), it is essential that the position is represented by three coordinates. Switching back to the vector notation we have
$$\frac{d^2 \mathbf{x}}{dt^2} = \boldsymbol{\nabla} \left( \frac{\partial S}{\partial t} + \frac{\mathbf{U}^2}{2} \right) - \frac{\partial \mathbf{A}}{\partial t} + \frac{d\mathbf{x}}{dt} \times \mathbf{B}. \tag{31}$$
Up to now, we have not made any assumption other than that space is three-dimensional. Next comes a crucial step in the reasoning. Let us hypothesize that there exists a scalar field $\phi = \phi(\mathbf{x}, t)$ such that
$$\frac{\partial S}{\partial t} + \frac{1}{2} (\boldsymbol{\nabla} S - \mathbf{A})^2 + \phi = 0. \tag{32}$$
Then, upon introducing the vector field $\mathbf{E} = -\boldsymbol{\nabla}\phi - \partial \mathbf{A}/\partial t$, Eq. (31) becomes
$$\frac{d^2 \mathbf{x}}{dt^2} = \mathbf{E} + \frac{d\mathbf{x}}{dt} \times \mathbf{B}. \tag{33}$$
Although Eq. (33) has the same structure as the equation of motion of a charged particle in an electromagnetic field $(\mathbf{E}, \mathbf{B})$, our derivation of Eq. (33) is solely based on the elementary observation that the data yields the vector field $\mathbf{U}(\mathbf{x}, t)$ (see Eq. (28)), some standard vector-field identities and the hypothesis that there exists a scalar field $\phi$ such that Eq. (32) holds. No reference to charged particles or electromagnetic fields enters the derivation. Put differently (and putting aside technicalities related to differentiability), if there exists a scalar field $\phi$ such that Eq. (32) holds, then mathematics alone dictates that the equation of motion must have the structure Eq. (33), with $\mathbf{E}$ and $\mathbf{B}$ having no relation to the electromagnetic field acting on a charged particle. The latter relation is established when the data shows that there is indeed an effect of the electromagnetic field on the motion of the particle, an effect from which it is inferred that the particle carries charge. This relation can be made explicit by introducing the symbols $m$ for the mass and $q$ for the charge of the particle and by replacing $\mathbf{A}$ by $q\mathbf{A}/m$ (we work with MKS units throughout this paper) and $\phi$ by $(q\phi + u)/m$ where $u$ represents all potentials that are not of electromagnetic origin. Then we have
$$m \frac{d^2 \mathbf{x}}{dt^2} = -\boldsymbol{\nabla} u + q \left( \mathbf{E} + \frac{d\mathbf{x}}{dt} \times \mathbf{B} \right), \tag{34}$$
and upon replacing $S$ by $S/m$ and writing $V = q\phi + u$,
$$\frac{\partial S}{\partial t} + \frac{1}{2m} (\boldsymbol{\nabla} S - q\mathbf{A})^2 + V = 0. \tag{35}$$
Note that we have obtained the Hamilton-Jacobi equation Eq. (35) without making any reference to a Hamiltonian, the action, contact transformations and the like. In essence, Eqs. (28)-(35) follow from Eq. (27), some mathematical identities and the crucial assumption that there exists a $V$ such that Eq. (35) holds. Summarizing: if we can find scalar fields $S$ and $V$ and a vector field $\mathbf{A}(\mathbf{x}, t)$ such that Eq. (35) holds for all $(\mathbf{x}, t)$, then the clicks of the detectors will carve out a trajectory that is completely determined by the classical equation of motion Eq. (34) of a particle in a potential and subject to electromagnetic potentials. Of course, there is nothing really new in this statement: it is just telling us what we know from classical mechanics, but there is a slight twist.
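First, given the data $\mathbf{x}(t)$ of the detector clicks, this data will not comply with the equations of classical mechanics unless we can find scalar fields $S$ (the action) and $V$ (the potential) and a vector field $\mathbf{A}(\mathbf{x}, t)$ (the vector potential) such that Eq. (35) holds. Second, in the case of interest to us here, there is uncertainty on the mapping between the particle position $\mathbf{X}(t)$ and the position of the corresponding clicks $\mathbf{x}(t)$ and there is no reason to expect that Eq. (35) will hold. Instead of requiring that Eq. (35) holds, we will require that there exist two scalar fields $V_k(\mathbf{x}, t)$ for $k = \pm 1$ such that
$$\sum_{k=\pm 1} \int d\mathbf{x} \int dt \left[ \frac{\partial S_k}{\partial t} + \frac{1}{2m} (\boldsymbol{\nabla} S_k - q\mathbf{A})^2 + V_k \right] P(\mathbf{x}, k|\mathbf{X}, t, Z) = 0, \tag{36}$$
where we regard the particles that respond with $k = +1$ or $k = -1$ as two different objects, the clicks generated by each object being described by its own Hamilton-Jacobi equation with potentials $V_k(\mathbf{x}, t)$. The next step is to disentangle as much as possible the motion of the positions of the clicks from their $k$-values. We introduce $S_k(\mathbf{x}, t) = S(\mathbf{x}, t) - kR(\mathbf{x}, t)$ for $k = \pm 1$ and after some rearrangements we obtain
$$\int d\mathbf{x} \int dt \left\{ \frac{\partial S}{\partial t} + \frac{1}{2m} (\boldsymbol{\nabla} S - q\mathbf{A})^2 + \frac{(\boldsymbol{\nabla} R)^2}{2m} + V_0 - \left[ \frac{\partial R}{\partial t} + \frac{1}{m} \boldsymbol{\nabla} R \cdot (\boldsymbol{\nabla} S - q\mathbf{A}) - V_1 \right] \cos\theta(\mathbf{x}, \mathbf{X}, t, Z) \right\} P(\mathbf{x}|\mathbf{X}, t, Z) = 0, \tag{37}$$
where $V_0(\mathbf{x}, t) = [V_{+1}(\mathbf{x}, t) + V_{-1}(\mathbf{x}, t)]/2$, $V_1(\mathbf{x}, t) = [V_{+1}(\mathbf{x}, t) - V_{-1}(\mathbf{x}, t)]/2$ and we made use of $\sum_{k=\pm 1} k P(\mathbf{x}, k|\mathbf{X}, t, Z) = \cos\theta(\mathbf{x}, \mathbf{X}, t, Z)\, P(\mathbf{x}|\mathbf{X}, t, Z)$. Omitting the terms involving $\cos\theta(\mathbf{x}, \mathbf{X}, t, Z)$ and $R(\mathbf{x}, t)$, Eq. (37) reduces to the expression of the averaged Hamilton-Jacobi equation which entered the derivation of the time-dependent Schrödinger equation [8].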
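The kinematic identity behind Eq. (31), on which the classical part of this subsection rests, can be checked numerically. The sketch below is illustrative only: the fields `S` and `A` are arbitrary smooth test functions invented for this example, all derivatives are central finite differences with a step `H`, and the helper names are hypothetical. It compares the acceleration along a trajectory obeying $d\mathbf{x}/dt = \mathbf{U} = \boldsymbol{\nabla} S - \mathbf{A}$, namely $\partial \mathbf{U}/\partial t + (\mathbf{U} \cdot \boldsymbol{\nabla})\mathbf{U}$, with $\boldsymbol{\nabla}(\partial S/\partial t + \mathbf{U}^2/2) - \partial \mathbf{A}/\partial t + \mathbf{U} \times (\boldsymbol{\nabla} \times \mathbf{A})$.

```python
import math

H = 1e-3  # finite-difference step (assumes the fields are smooth on this scale)


def S(x, t):
    # Hypothetical smooth scalar field, chosen only for illustration.
    return math.sin(x[0]) * math.cos(x[1]) + 0.5 * x[2] ** 2 * t


def A(x, t):
    # Hypothetical smooth vector field, chosen only for illustration.
    return [x[1] * x[2] * t, math.sin(x[0] + t), x[0] * x[1]]


def shift(x, i, d):
    y = list(x)
    y[i] += d
    return y


def grad(f, x, t):
    """Gradient of a scalar function f(x, t) by central differences."""
    return [(f(shift(x, i, H), t) - f(shift(x, i, -H), t)) / (2 * H)
            for i in range(3)]


def dt_scalar(f, x, t):
    return (f(x, t + H) - f(x, t - H)) / (2 * H)


def dt_vector(F, x, t):
    return [(p - m) / (2 * H) for p, m in zip(F(x, t + H), F(x, t - H))]


def jacobian(F, x, t):
    """J[i][j] = dF_i/dx_j for a vector function F(x, t)."""
    cols = [[(p - m) / (2 * H)
             for p, m in zip(F(shift(x, j, H), t), F(shift(x, j, -H), t))]
            for j in range(3)]
    return [[cols[j][i] for j in range(3)] for i in range(3)]


def U(x, t):
    g, a = grad(S, x, t), A(x, t)
    return [g[i] - a[i] for i in range(3)]


def acceleration(x, t):
    """d^2x/dt^2 = dU/dt + (U . grad) U along dx/dt = U."""
    u, J, ut = U(x, t), jacobian(U, x, t), dt_vector(U, x, t)
    return [ut[i] + sum(J[i][j] * u[j] for j in range(3)) for i in range(3)]


def lorentz_form(x, t):
    """grad(dS/dt + U^2/2) - dA/dt + U x B with B = curl A."""
    def energy(y, s):
        return dt_scalar(S, y, s) + 0.5 * sum(c * c for c in U(y, s))
    u, g, at, JA = U(x, t), grad(energy, x, t), dt_vector(A, x, t), jacobian(A, x, t)
    B = [JA[2][1] - JA[1][2], JA[0][2] - JA[2][0], JA[1][0] - JA[0][1]]
    uxB = [u[1] * B[2] - u[2] * B[1],
           u[2] * B[0] - u[0] * B[2],
           u[0] * B[1] - u[1] * B[0]]
    return [g[i] - at[i] + uxB[i] for i in range(3)]


def max_mismatch(x, t):
    return max(abs(a - b) for a, b in zip(acceleration(x, t), lorentz_form(x, t)))
```

For any smooth choice of `S` and `A`, `max_mismatch` should be of the order of the finite-difference error, in line with the remark that the Lorentz-force structure follows from vector identities alone.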

G. Including the motion of the magnetic moment
The function $\cos\theta(\mathbf{x}, \mathbf{X}, t, Z)$ determines the ratio of $k = \pm 1$ clicks and $R(\mathbf{x}, t) = (S_{-1}(\mathbf{x}, t) - S_{+1}(\mathbf{x}, t))/2$ is half of the difference between the actions of the $k = -1$ and $k = +1$ clicks. We can relate these two functions to the direction of a classical magnetic moment by imposing the constraint that when the positions of the clicks (= particle position in this case) do not move, we recover the classical-mechanical equation of motion of a magnetic moment in a magnetic field, for every $\mathbf{x}$.
In the limit that m → ∞ (corresponding to the situation that the positions of the clicks hardly change with time) we have Without loss of generality, we may assume that V 0 (x,t) = V 0 (x,t) + V 0 (x,t) where V 0 (x,t) does not depend on θ (x, X,t, Z) and R(x,t) while V 0 (x,t) may.Writing V 1 (x,t) = V 0 (x,t) + V 1 (x,t) cos θ (x, X,t, Z), searching for the extrema of Eq. ( 38) through variation with respect to cos θ (x, X,t, Z), R(x,t), S(x,t) and P(x,t) yields From Eq. ( 41) it follows that P(x|X,t, Z) does not change with time, in concert with the assumption that the positions of the clicks are stationary.Comparing Eqs. ( 39) and ( 40) with Eq. (C7), it is clear that we will recover the classical equations of motion of the magnetic moment if (i) we set V 1 (x,t) = −γm(x,t) • B(x,t) where m(x,t) is a unit vector, and (ii) make the symbolic identification z = cos θ (x, X,t, Z) and ϕ(x,t) = R(x,t)/a where a needs to be introduced to give aϕ(x,t) the dimension of S(x,t).Substituting the infered expression for V 1 (x,t) in Eq. ( 37) yields

H. Derivation of the Pauli equation
We now have all ingredients to derive the Pauli equation from the principle that logical inference applied to the most robust experiment yields a quantum theoretical description [8]. According to this principle, we should search for the global minimum of the Fisher information for the experiment, subject to the condition that when the uncertainty vanishes, we recover the equations of motion of classical mechanics [8]. Thus, we should search for the global minimum of
$$\lambda I_F + \Lambda, \tag{44}$$
where $I_F$ and $\Lambda$ are given by Eqs. (15) and (43), respectively.
In Appendix B, it is shown that the quadratic functional $Q$ which yields the Pauli equation is identical to Eq. (44) if we make the identification $V_0(\mathbf{x}, t) = q\phi(\mathbf{x}, t)$, $a = \hbar/2$, $\gamma = q/m$ and $\lambda = \hbar^2/8m$. This then completes the derivation of the Pauli equation from logical inference principles.

I. Discussion
In Section II F, we showed how to include the knowledge that in the absence of uncertainty the particle's motion is described by Newtonian mechanics. Obviously, this treatment requires the particle to have a nonzero mass. On the other hand, in our logical inference treatment of the free particle in Section II E, the notion of mass does not enter in the derivation of Eq. (22) but neither does the concept of motion. This raises the interesting question of how to inject into the logical inference treatment the notion of moving massless particles with spin. We believe that the analogy with the pseudo-spin in graphene mentioned in the introduction may provide a fruitful route to explore this issue.
The carbon atoms of ideal single-layer graphene form a hexagonal lattice with the π-band (originating from the p_z orbitals of the carbon atoms) well separated from other bands [18]. The electronic band structure of graphene has the remarkable feature that in the continuum limit, the low-energy excitations are described by the two-dimensional Dirac equation for two species of massless fermions (corresponding to two valleys, K and K′). The fact that the wave function of each of these two species is a two-component "spinor" is not related to the intrinsic spin of the electron but is a manifestation of the two-sublattice, bipartite structure of the hexagonal lattice [18]. This feature (a Dirac-like spectrum) is present already in the simplest model where only the nearest-neighbor hopping is taken into account [31] but, actually, it is robust and follows just from discrete symmetries, namely, time-reversal and inversion symmetries [18]. A generalization to a four-dimensional lattice, retaining the property that the continuum limit yields the Dirac equation, is given in Ref. 32. This is a nice illustration of the fact that the model of a rotating electron is not the only way to arrive at the concept of spin. In our derivation of the Pauli equation, we have to make the additional assumption (based on experimental observations such as the anomalous Zeeman effect) that the interaction of this intrinsic degree of freedom with an external magnetic field is described by the standard classical expression for the energy of a magnetic moment.
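The statement that the nearest-neighbor model already has a Dirac-like spectrum can be checked numerically. The following is a minimal sketch, assuming the standard two-sublattice tight-binding Hamiltonian; the hopping amplitude `t`, the carbon-carbon distance `a`, the neighbor vectors and all function names are illustrative choices, not quantities taken from Refs. 18 or 31.

```python
import numpy as np

# Hypothetical parameters (arbitrary units): hopping amplitude and C-C distance.
t = 1.0
a = 1.0

# Vectors from an A-sublattice site to its three B-sublattice neighbors.
deltas = a * np.array([[0.5, np.sqrt(3) / 2],
                       [0.5, -np.sqrt(3) / 2],
                       [-1.0, 0.0]])

def bloch_hamiltonian(k):
    """2x2 nearest-neighbor Hamiltonian acting on the (A, B) sublattice amplitudes."""
    f = np.sum(np.exp(1j * deltas @ k))
    return -t * np.array([[0.0, f], [np.conj(f), 0.0]])

def energies(k):
    """Band energies at wave vector k, in ascending order."""
    return np.linalg.eigvalsh(bloch_hamiltonian(np.asarray(k, dtype=float)))

# One of the two inequivalent Brillouin-zone corners (valleys).
K = np.array([2 * np.pi / (3 * a), 2 * np.pi / (3 * np.sqrt(3) * a)])

# The two bands touch at K ...
gap_at_K = energies(K)[1] - energies(K)[0]

# ... and the upper band rises linearly away from K with slope close to 3*t*a/2.
q = 1e-4
slope = energies(K + np.array([q, 0.0]))[1] / q
```

The two components on which the 2x2 matrix acts are the amplitudes on the two sublattices, which is the sense in which the "spinor" here is unrelated to the intrinsic spin of the electron.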
The next important step might be the derivation of the Dirac equation. The Creutz model [32] suggests that we should consider incorporating into the logical inference treatment the additional knowledge that one has objects hopping on a lattice instead of particles moving in a space-time continuum. Recall that up to Section II D, the description of the measurement scenario, robustness etc. is explicitly discrete. In Section II D, the continuum limit was taken only because our aim was to derive the Pauli equation, which is formulated in continuum space-time. Of course, the description of the motion of the particle in Section II F is entirely within a continuum description but there is no fundamental obstacle to replacing this treatment by a proper treatment of objects hopping on a lattice. Therefore it seems plausible that the logical inference approach can be extended to describe massless spin-1/2 particles moving in continuum space-time by considering the continuum limit of the corresponding lattice model. An in-depth, general treatment of this problem is beyond the scope of the present paper and we therefore leave this interesting problem for future research.
A comment on the appearance of ħ is in order. First of all, it should be noted that recent work has shown that ħ may be eliminated from the basic equations of (low-energy) physics by a re-definition of the units of mass, time, etc. [33,34]. This is also clear from the way ħ appears in the identification that we used to show that the quadratic functional Q which yields the Pauli equation (see Eq. (B4)) is the same as Eq. (44). With the MKS units adopted in the present paper, Planck's constant ħ enters because of dimensional reasons (a = ħ/2) and also controls the importance of the term that expresses the robustness of the experimental procedure (λ = ħ²/8m). The actual value of λ can only be determined by laboratory experiments. Note that the logical-inference derivation of the canonical ensemble of statistical mechanics [6,7] employs the same reasoning to relate the inverse temperature β = 1/k_B T to the average thermal energy.
We end this section by addressing a technicality. Mappings such as Eq. (45) are not one-to-one. This is clear: we can always add a multiple of 2πħ to S 1 (x,t) or S 2 (x,t), for instance. In the hydrodynamic form of the Schrödinger equation [35], the ambiguity that ensues has implications for the interpretation of the gradient of the action as a velocity field [36,37]. As pointed out by Novikov, similar ambiguities appear in classical mechanics proper if the local equations of motion (Hamilton equations) are not sufficient to characterize the system completely and the global structure of the phase space has to be taken into consideration [38]. However, for the present purpose, this ambiguity has no effect on the minimization of F because Eq. (44) does not change if we add to S 1 (x,t) or S 2 (x,t) a real number which does not depend on (x,t) (as is evident from Eq. (37)) or, equivalently, if we multiply Φ(x|X,t, Z) by a global phase factor and add a constant to ϕ(x,t).
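The non-uniqueness of such mappings is easy to see in a small numerical sketch. Eq. (45) is not reproduced here, so the code assumes a Madelung-type map in which component k of the spinor is the square root of P_k times exp(i S_k/ħ). That form, the function name `spinor` and all numerical values are assumptions made for illustration only.

```python
import numpy as np

hbar = 1.0  # units with hbar = 1, chosen for illustration only

def spinor(P1, P2, S1, S2):
    """Assumed Madelung-type map from (P_k, S_k) to a two-component spinor."""
    return np.array([np.sqrt(P1) * np.exp(1j * S1 / hbar),
                     np.sqrt(P2) * np.exp(1j * S2 / hbar)])

# Hypothetical values at a single point (x, t).
P1, P2 = 0.3, 0.7
S1, S2 = 0.4, -1.1

phi = spinor(P1, P2, S1, S2)

# Shifting S1 by 2*pi*hbar gives the same spinor: the map is not one-to-one.
phi_shifted = spinor(P1, P2, S1 + 2 * np.pi * hbar, S2)
same_spinor = np.allclose(phi, phi_shifted)

# Adding the same constant c to S1 and S2 multiplies the spinor by a global phase
# and leaves the two densities |Phi_k|^2 = P_k untouched.
c = 0.9
phi_global = spinor(P1, P2, S1 + c, S2 + c)
is_global_phase = np.allclose(phi_global, np.exp(1j * c / hbar) * phi)
densities_unchanged = np.allclose(np.abs(phi_global) ** 2, [P1, P2])
```

Under this assumed map, two different sets of phases S_k correspond to one and the same spinor, which is the ambiguity referred to in the text.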

III. STERN-GERLACH EXPERIMENT: NEUTRAL MAGNETIC PARTICLE
The Stern-Gerlach experiment with silver atoms [39] and neutrons [15] demonstrates that a magnetic field affects the motion of a neutral particle, suggesting that a minimalist theoretical description should account for the interaction of the magnetic moment of the particle and the applied magnetic field. As is clear from the definition of the Pauli Hamiltonian Eq. (B2), in the Pauli equation the magnetic field is directly linked to the charge q of the particle. Therefore, in this form the Pauli equation cannot be used to describe the motion of a neutral magnetic particle in a magnetic field.
In quantum theory, this problem is solved by the ad-hoc introduction of the intrinsic magnetic moment which is proportional to the spin and by replacing qħ/2m by the gyromagnetic ratio γ, the value of which is particle-specific.
In the logical-inference treatment, no such ad-hoc procedure is necessary. We simply set q = 0 in Eq. (43) and use Eq. (45) to find the equivalent quadratic form. The Hamiltonian that appears in this quadratic form reads where γ is the gyromagnetic ratio which, in general, is not given by q/m. As mentioned earlier, the appearance in Eq. (46) of the Pauli matrices is a direct consequence of logical inference applied to robust experiments that yield data in the form of the position and one of the two kinds of detector clicks.
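The two-valued character of the spin term can be made concrete with a short numerical sketch. Eq. (46) is not reproduced here, so the code assumes that the coupling to the field has the form −γ(ħ/2) σ·B, obtained from the classical expression −γ m·B with a = ħ/2. The function name `zeeman_term` and the numerical values of `gamma` and `B` are hypothetical.

```python
import numpy as np

# Pauli matrices.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

def zeeman_term(B, gamma, hbar=1.0):
    """Assumed spin-field coupling -gamma * (hbar/2) * sigma . B as a 2x2 matrix."""
    Bx, By, Bz = B
    return -gamma * (hbar / 2) * (Bx * sigma_x + By * sigma_y + Bz * sigma_z)

# Hypothetical gyromagnetic ratio and field at one point in space (arbitrary units).
gamma = 2.0
B = np.array([0.3, -0.4, 1.2])

# Eigenvalues in ascending order: only two values occur, whatever the direction of B.
levels = np.linalg.eigvalsh(zeeman_term(B, gamma))
expected = gamma * 0.5 * np.linalg.norm(B)
```

For any field direction the matrix has exactly the two eigenvalues ∓γ(ħ/2)|B|. In a Stern-Gerlach setup, where |B| varies in space, these two values act as two different potential energies, one for each kind of detector click.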

IV. RELATION TO EARLIER WORK
Readers familiar with the hydrodynamic formulation of quantum theory [35] and its interpretation in terms of Bohmian mechanics [40,41] undoubtedly recognize the steps which transform the quadratic functional Eq. (B4) yielding the Pauli equation Eq. (B2) into the functional Q given by Eq. (B27). In fact, the functional Q has been used as the starting point for the hydrodynamic representation [42] and a causal interpretation [12,43,44] of the Pauli equation. In this formulation, the two-component spinor can be given a classical-mechanics interpretation in terms of an assembly of very small rotating bodies which are distributed continuously in space. Within this interpretation, the spins of different bodies interact.
Clearly, the logical-inference treatment does not support this interpretation: the functional Eq. (B27) is the result of analyzing a robust experiment that yields data in the form of (x, k) where x is a 3-dimensional coordinate and k = ±1 denotes the two-valued "color", together with the requirement that on average and in special cases, the data should comply with the classical-mechanical motion.
An expression of Eq. (B27) in which the separation of the contribution of the Fisher information and the classical-field mechanical contribution is explicit has been given by Reginatto [45]. This expression is different from ours. Comparing Eq. (15) with Eqs. (6) and (7) in Ref. 45, we find that the expressions are fundamentally different due to the fact that the representation (7), when substituted in (6), does not yield Eq. (B27).

V. CONCLUSION
It is somewhat discomforting that it takes a considerable amount of symbolic manipulation to derive the Pauli equation from the combination of the measurement scenario, the notion of a robust experiment and the behavior expected in some limiting cases. Therefore, it may be worthwhile to recapitulate what has been done in simple words, without worrying too much about the technicalities.
The first step is to describe the measurement scenario. It is assumed that the object (particle) we are interested in responds to the signal that we send to probe it. The response of the object triggers a detection event. In the case at hand, the data representing the detector clicks consist of spatial coordinates and two-valued "color" indices. We assign an i-prob to the whole data set. To make progress, it is necessary to make assumptions about the data-collection procedure. We assume that each time we probe the object, the data produced by the detection system is logically independent of all other data produced by previous/subsequent probing. With this assumption, together with the assumption that it does not matter where we carry out the experiment, the notion of a robust experiment is found to be equivalent to the global minimum of the Fisher information for the corresponding measurement scenario (see Eq. (15)).
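The link between robustness and small Fisher information can be illustrated with a one-dimensional toy calculation. This is only a sketch: it uses the textbook Fisher information of a distribution with respect to a shift of its position, not the expression of Eq. (15), and the Gaussian test distributions, the grid and the function names are assumptions made for illustration.

```python
import numpy as np

def fisher_information(P, x):
    """Integral of (dP/dx)^2 / P over a uniform grid x, by finite differences."""
    dP = np.gradient(P, x)
    dx = x[1] - x[0]
    return np.sum(dP ** 2 / P) * dx

def gaussian(x, sigma):
    """Normalized Gaussian centered at zero with standard deviation sigma."""
    return np.exp(-x ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(-6.0, 6.0, 6001)

# For a Gaussian the exact value is 1 / sigma^2.
I_narrow = fisher_information(gaussian(x, 0.5), x)  # close to 4
I_wide = fisher_information(gaussian(x, 1.0), x)    # close to 1
```

The broader distribution has the smaller Fisher information: a small shift of the conditions changes it less, which is the sense in which a minimum of the Fisher information corresponds to a robust frequency distribution.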
The next step is to bring in the knowledge that in the extreme case that there is no uncertainty about the outcome of each detection event, we expect to observe data that is compliant with classical, Newtonian mechanics both for the motion of a particle as well as for the motion of its magnetic moment in the case that the particle does not move (see Eq. (43)).
The third step is to find the balance between the uncertainty in the detection events represented by Eq. (15) and the "classical mechanics" knowledge represented by Eq. (43) by searching for the global minimum of Eq. (44) for all possible unknown positions of the particle. The result of this calculation is a fairly complicated non-linear set of equations for the i-prob to observe a click.
The final step is to observe that by transformation Eq. (45), this non-linear set of equations and the Pauli equation are equivalent. The latter, being a set of linear equations, is much easier to solve than its non-linear equivalent.
In the logical inference approach, the assumption that each time we probe the object, the detection system reports a two-valued "color" index and our requirement that in the extreme case mentioned earlier we expect to see the motion of a classical magnetic moment automatically leads to the notion of a "quantized" (i.e. two-valued) intrinsic magnetic moment. The notion of spin appears as an inference, forced upon us by the (two-valued) data and our assumptions (which do not make reference to concepts of quantum theory) that the experiment is robust, etc.
From a more general perspective, it is remarkable that the logical inference approach introduces the concept of "spin" in a way which is not much different from the way real numbers are introduced. Indeed, the latter appear as a necessity to provide an answer to questions such as "what new kind of number do we have to introduce such that the square of it yields the integer n". If n = m² where m is an integer, no new concept has to be introduced but if, say, n = 2, the answer to the question is given the symbolic name √2. Similarly, in our logical-inference treatment the concept of spin naturally appears as a result of describing situations in which there is two-valued data and the requirement that in a limiting case we recover the classical equation of motion. This concept of spin only exists in our mind, in complete agreement with the fact that this concept may be put to very good use whenever there are two-valued variables that may or may not relate to (intrinsic) angular momentum, as in the theory of the electronic properties of graphene, for example [18].
It will not have escaped the reader that in the logical-inference derivation of the Pauli equation as well as in earlier work along this line [8,46] there are no postulates regarding "wavefunctions", "observables", "quantization rules", no "quantum" measurements [47], "Born's rule", etc. This is a direct consequence of the basic premise of this approach, namely that current scientific knowledge derives, through cognitive processes in the human brain, from the discrete events which are observed in laboratory experiments and from relations between those events that we, humans, discover. These discrete events are not "generated" according to certain quantum laws: instead these laws appear as the result of (the best) inference based on available data in the form of discrete events. In essence, for all the basic but fundamental cases treated so far, the machinery of quantum theory appears as a result of transforming a set of non-linear equations into a set of linear ones. The wavefunction, spinor, spin, ... are all mathematical concepts, vehicles that render a class of complicated nonlinear minimization problems into the minimization of quadratic forms. As products of our collective imagination, these concepts are extraordinarily useful but have no tangible existence, just like numbers themselves. Of course, it remains to be seen whether the logical-inference approach can be extended to e.g. many-body and relativistic quantum physics.
In summary: the Pauli equation derives from logical inference applied to robust experiments in which there is uncertainty about individual detection events which yield information about the particle position and its two-valued "color".This derivation adds another, new instance to the list of examples [8,46] for which the logical-inference approach establishes a bridge between objective knowledge gathered through experiments and their description in terms of concepts.
rule". It should be mentioned here that it is not allowed to define a plausibility for a proposition conditional on the conjunction of mutually exclusive propositions. Reasoning on the basis of two or more contradictory premises is out of the scope of the present paper.
3. P(A Ā|Z) = 0 and P(A + Ā|Z) = 1 where the "sum" A + B denotes the logical sum (inclusive disjunction) of the propositions A and B, that is the proposition A + B is true if either A or B or both are true. These two rules show that Boolean algebra is contained in the algebra of plausibilities.
The algebra of logical inference, as defined by the rules (1-3), is the foundation for powerful tools such as the maximum entropy method and Bayesian analysis [3,5]. The rules (1-3) are unique [3][4][5]. Any other rule which applies to plausibilities represented by real numbers and is in conflict with rules (1-3) will be at odds with rational reasoning and consistency, as embodied by the desiderata 1-3.
The rules (1-3) are identical to the rules by which we manipulate probabilities [5,49-51]. However, the rules (1-3) were not postulated. They were derived from general considerations about rational reasoning and consistency only. Moreover, concepts such as sample spaces, probability measures etc., which are an essential part of the mathematical foundation of probability theory [50,51], play no role in the derivation of rules (1-3). Perhaps most important in the context of quantum theory is that in the logical inference approach uncertainty about an event does not imply that this event can be represented by a random variable as defined in probability theory [51].
There is a significant conceptual difference between "mathematical probabilities" and plausibilities. Mathematical probabilities are elements of an axiomatic framework which complies with the algebra of logical inference. Plausibilities are elements of a language which also complies with the algebra of logical inference and serve to facilitate communication, in an unambiguous and consistent manner, about phenomena in which there is uncertainty.
The plausibility P(A|B) is an intermediate mental construct that serves to carry out inductive logic, that is rational reasoning, in a mathematically well-defined manner [3]. In general, P(A|B) may express the degree of belief of an individual that proposition A is true, given that proposition B is true. However, in the present paper, we explicitly exclude applications of this kind because they do not comply with our main goal, namely to describe phenomena "in a manner independent of individual subjective judgment".
To take away this subjective connotation of the word "plausibility", we will simply call P(A|B) the "inference-probability" or "i-prob" for short.
A comment on the notation used throughout this paper is in order. To simplify the presentation, we make no distinction between an event such as "detector D has fired" and the corresponding proposition "D = detector D has fired". If we have two detectors, say D x where x = ±1, we write P(x|Z) to denote the i-prob of the proposition that detector D x fires, given that proposition Z is true. Similarly, the i-prob of the proposition that two detectors D x and D y fire, given that proposition Z is true, is denoted by P(x, y|Z). Obviously, this notation generalizes to more than two propositions.
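The notation can be illustrated with a small numerical sketch for two detectors with x, y = ±1. The table of i-probs below is made up for illustration, and the product rule is used in its standard form P(x, y|Z) = P(x|y, Z) P(y|Z), which is assumed here to coincide with the corresponding rule of the algebra of logical inference.

```python
import numpy as np

# Hypothetical table of i-probs: joint[i, j] = P(x, y|Z) for x = values[i], y = values[j].
values = (+1, -1)
joint = np.array([[0.10, 0.40],
                  [0.40, 0.10]])

# P(x|Z) and P(y|Z) follow by summing over the other variable.
P_x = joint.sum(axis=1)
P_y = joint.sum(axis=0)

# P(x|y, Z): each column of the table divided by the corresponding P(y|Z).
P_x_given_y = joint / P_y

# The product rule reproduces the joint table for any such table ...
product_rule_holds = np.allclose(P_x_given_y * P_y, joint)

# ... whereas factorization P(x, y|Z) = P(x|Z) P(y|Z) is an extra assumption,
# which this particular table does not satisfy.
factorizes = np.allclose(joint, np.outer(P_x, P_y))
```

The contrast between the last two checks shows that the product rule always holds, while factorization of P(x, y|Z) has to be assumed separately, as is done when events are taken to be logically independent.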
By the standard variational argument, it follows that the Pauli equation is an extremum of the quadratic form (functional) Φ † σ z Φ = P 1 (x,t) − P 2 (x,t).(B14) Thus, we have all the expressions to write Eq. (B4) in terms of P 1 (x,t), P 2 (x,t), S 1 (x,t), and S 2 (x,t).