Obtaining tight bounds on higher-order interferences with a 5-path interferometer

Within the established theoretical framework of quantum mechanics, interference always occurs between pairs of trajectories. Higher order interferences with multiple constituents are, however, excluded by Born's rule and can only exist in generalized probabilistic theories. Thus, high-precision experiments searching for such higher order interferences are a powerful method to distinguish between quantum mechanics and more general theories. Here, we perform such a test in optical multi-path interferometers. Our results rule out the existence of higher order interference terms to an extent which is more than four orders of magnitude smaller than the expected pairwise interference, refining previous bounds by two orders of magnitude. This establishes the hitherto tightest constraints on generalized interference theories.

Within the established theoretical framework of quantum mechanics, interference always occurs between pairs of trajectories. Higher order interferences with multiple constituents are, however, excluded by Born's rule and can only exist in generalized probabilistic theories. Thus, high-precision experiments searching for such higher order interferences are a powerful method to distinguish between quantum mechanics and more general theories. Here, we perform such a test in optical multi-path interferometers. Our results rule out the existence of higher order interference terms to an extent which is more than four orders of magnitude smaller than the expected pairwise interference, refining previous bounds by two orders of magnitude. This establishes the hitherto tightest constraints on generalized interference theories.
Since arising almost a century ago, quantum mechanics has long become an established paradigm for the description of nature on a submicroscopic scale. It is at the basis of an enormous variety of present and potential future applications, such as quantum communication [1,2], quantum computation [3][4][5] and protocols like entanglement swapping [6] or teleportation [7]. However, all these applications rely ultimately on interference and entanglement, which can be alternatively explained by theories sharing only some fundamental features with quantum mechanics, such as the superposition principle or probabilistic predictions for outcomes, and yet differing from it in other aspects. In order to distinguish between quantum theory and such alternatives one needs to design dedicated experiments. The situation may be compared with the time before the first Bell test experiments had been performed. Until then one could explain all quantum mechanical phenomena with a local hidden variable theory. It was required to first state Bell's theorem [8] and then to perform dedicated experiments with space-like separated laboratories to exclude the alternative. Only last year all experimental loopholes were finally closed [9][10][11]. Another example is the experiment that distinguishes between quaternion and complex (standard) quantum theory [12,13]. Here we focus on an experimental test capable of discerning between quantum mechanics and its generalizations exhibiting higher order interference.
The probabilistic nature of quantum theory is stated by Born's rule [14], i.e. that the probability density P (r, t) for an observation of a quantum object at a certain time t and a certain position r is given by the absolute square of its wavefunction Ψ(r, t): P (r, t) = Ψ * (r, t)Ψ(r, t) = |Ψ(r, t)| 2 .
As a consequence of Born's rule and quantum superposition, interference can take place even for single particles [15]. For concreteness, consider an interferometer with multiple non-overlapping paths k = A, B, C, . . . which superpose in some output port to the final wavefunction Ψ = k Ψ k . Eq. (1) implies: with pairwise (first-order) interference terms I kl ≡ Ψ k Ψ * l + c.c., depending on the relative phase between the two paths k and l. Thus, one obtains interference terms that always originate from pairings of paths, but no higher order interferences involving more than two paths at once.
In this vein, one can use the presence or absence of higher-order interferences as an experimental probe of the current framework of quantum mechanics. First developed by Sorkin in the context of a measure theory on spacetime [16], one can define a hierarchy of interference terms. In a 3-path interferometer with individually blockable paths A, B, C, where P ABC is the probability to find a particle in the output port of the interferometer if all paths are open, P AB for only paths A and B being open, etc. The so-called second-order interference term I ABC ≡ P ABC − P AB − P AC − P BC + P A + P B + P C (3) should be zero, independent of the individual phases and powers in each interferometer arm, due to Eq. 2. Conversely, a significant deviation from I ABC = 0 would indicate the existence of higher-order interferences and contradict conventional quantum theory. Note that the definition (3) accounts for deviations from the standard theory in a model-independent way.
In any experiment with discrete particles, the probability P will be proportional to the detected particle flux p. Therefore a directly measurable quantity can be defined [17]. In this expression, for example p AB is the detected particle flux at the output when only paths A and B are open. The background term p 0 gives the measured signal when all paths are blocked, accounting for detector dark current/dark counts. For better comparison of the results with the expected behavior, one can introduce the normalized quantity κ 3 ≡ 3 /δ 3 measuring the ratio of hypothetical second-order interference to the sum of the expected first-order interference, δ 3 ≡ |I AB |+|I AC |+|I BC |. For interferometers with more than three paths, the higher (third, fourth,...)-order interference terms (κ 4 , κ 5 ,...), which of course are also zero in standard quantum theory, can be defined accordingly.
Different experiments have been realized previously to obtain an upper bound on the modulus of the secondorder interference term. These experiments were implemented in optics [1,18,19] as well as via nuclear magnetic resonance (NMR) in molecules [21], delivering results which were all in accordance with the expectation κ 3 = 0. The NMR experiment provided the hitherto tightest constraint with κ 3 = 0.001 ± 0.003.
As is the case for any such null-test experiment, the tightness of the bound and, thereby, the strength of any conclusions to be drawn about the foundations of the theory depend on the measurement uncertainties. In previous optical 3-path interferometers, the precision was mostly limited by the phase stability of the interferometer, while the accuracy suffered from detector nonlinearities [1]. In this work, we present a greatly improved multi-path experiment, namely a stabilized 5-path interferometer with single photons, with which we are not only able to tighten the bound on second-order interference by two orders of magnitude, but also measure third and fourth-order interference terms. The 5-path interferometer has the additional advantage of permitting the acquisition of more statistics for the second-and third-order interference term since it consists of ten 3-path interferometers and five 4-path interferometers. The systematic error of detector nonlinearities is taken into account by separate detector calibration and full quantum state tomography of the produced 5-dimensional qudit state.
A schematic drawing of our setup can be seen in Fig. 1. We employed light sources in three different regimes: classical (C), semi-classical (SC) and quantum (Q). In the classical regime we used a continuous-wave single frequency laser at 808 nm power-stabilized to 1 mW with relative fluctuations smaller than 0.1% over the complete measurement time of several days by using a liquid crystal noise eater (Thorlabs LCC3112). We used a single photon source to perform measurements in the other regimes. Photon pairs at 808nm are produced via type-II spontaneous parametric down conversion (SPDC) in a 10 mm long periodically poled potassium titanyl phosphate (ppKTP) crystal which is pumped by a blue laser (404 nm). The orthogonally polarized photons are separated on a polarizing beam splitter (PBS). We collect 6 × 10 5 single photons per second in each of the outputs in single mode fibers and we get 10 5 pairs per second at 4 mW pump power. One of the photons serves as a heralding photon, whereas the other is sent through our multi-path interferometer. Therefore we have two  [22] in the quantum regime. All light sources were linearly polarized.
The interferometer is a Mach-Zehnder 5-path interferometer consisting of a diffractive beam splitter (a diffractive optical element -Holoeye DE 263 -modulating the incident light via a micro-relief surface) which creates five almost equally powerful beams, collimated by a lens (f = 150 mm). A shutter assembly serves to block or unblock each of the five beams individually, phase plates (glass plates with a thickness of 0.15 mm and anti-reflection coated for 808 nm) mounted on motorized rotation stages in all of the five beams allow us to set the phase of each path independently. The absolute angular repeatability is 0.005 • , which corresponds to π/1000 in phase. An identical pair of lens and diffractive element sends the resulting beam onto a detector. The interferometer is designed in 4f -configuration and the individual beams are separated by 5 mm, therefore the overall dimensions are (60 × 2) cm 2 . For detecting single photons we used SPCM-AQRH-12-FC single photon counting modules from Perkin Elmer followed by a quTAU time-to-digital converter from qutools GmbH. This system has a deadtime of (33.85 ± 0.31) ns and (150 ± 18) dark counts per second. The laser radiation is detected by a Physimetron A139-001 photoreceiver based on a Si-photodiode (Hamamatsu S2386-18K) and a 1 MV/A transimpedance amplifier, read out by an Ag- Open Paths ilent 34410A multimeter. This detection system has a low maximum nonlinearity of less than 35 ppm [3]. One measurement set consists of the 2 5 = 32 different possible open/close combinations of the five paths. These combinations were measured in random order to reduce the influence of any memory effects of the detector and of drifts of the source. To obtain data with comprehensive statistics we recorded several thousand measurement sets within a total measurement time of several days. The whole interferometer is shielded against air motion and stray light as well as passively and actively temperature stabilized with a PI controller (Wavelength Electronics HTC1500) and heating mats to a root-mean-square fluctuation < 0.02 K /24 h (The temperature was monitored with a PT1000 resistance thermometer). Additionally, the phases are actively stabilized by optimizing the phase-plate position after 100 measurement cycles towards maximally constructive interference of all twopath combinations. This point in phase space were chosen for convenience of alignment and because small phase changes lead only in second order to deviations in output power. This results in good phase stability over the whole measurement time. By comparing the fluctuations and drifts of the single-path power with the multi-path powers (see Fig. 4 and Fig. 5 of the Supplemental Material [24]), which have the same order of magnitude, we found that phase uncertainty plays a minor role compared to power noise from the light source.
The resulting average powers of the different path combinations can be seen in Fig. 2 for the measurement with the power stabilized laser [24]. We filtered the data for extreme outliers (resulting from shutter failure) according to Grubbs' test for outliers (with a significance level of 99%) [25,26]. After that the largest relative standard deviation of the various classical signals is 0.3% for 5618 measurement sets recorded within 68 h. For the semi-classical (quantum) single photon measurement with 1912 measurement sets the largest standard deviation was measured to be 3.6% (15.5%) over a measure- Due to the anti-correlation between the numerator and the denominator δ in the definition of κ, a bias towards positive values can arise from random fluctuations in the data when calculating κ for every shutter cycle and averaging over the data sets. However, calculating the averages of numerator and denominator in the definition of κ separately, eliminates their correlations and yields an unbiased estimator of the higher-order interference terms. Indeed, one can show that error sources, which typically occur in interference experiments, such as power fluctuations of the photon source, countrate fluctuations of the detectors (Poissonian photon counting uncertainties), detector/electronic noise, coherent phase fluctuations as well as incoherence, have no systematic effect on the measurement outcome [27]. For each of the measured 32-tuples we calculate 3,4,5 and δ 3,4,5 . A histogram plot of the measured ensemble of 3 / δ 3 in the classical regime is shown in Fig. 3 [28]. After averaging across all possible path combinations one obtains the mean values and associated uncertainties presented in table I. It was recently shown that near-field effects in slit-based measurements, where the relevant dimensions are just one or two orders of magnitude larger than the wavelength λ, can lead to an apparent higher-order interference and, therefore, bias the experiment [29,30]. However, in our interferometer these effects are of negligible influence, due to the macroscopic separation of the paths and beam width, exceeding λ by 3 to 4 orders of magnitude (see inset in Fig. 1). Instead, the main systematic uncertainty in our experimental configuration arises from the nonlinearity of the detectors. Real detectors usually have a nonlinear response function, which means that the recorded value (voltage, photon counts,...) is not linear in the incident power or photon flux, but biased differently for different optical powers. This biases the value of κ, as we measure light powers varying over more than one order of magnitude. Here, the bias arises mainly due to nonlinearities in the electronics of our photoreceiver and due to deadtime in the single photon detector. To take this error into account it is useful to fully characterize our 5-path interferometer, which can be described as a 5-dimensional qudit state. Therefore, we additionally performed complete quantum state tomography [31,32]. The density matrix ρ was numerically reconstructed from single-and two-path measurements with defined phases via direct reconstruction. The phases are calibrated via scanning the classical two-path laser interference. We used the direct reconstruction instead of a maximum likelihood estimation to avoid systematic deviations in the state reconstruction, which have recently been shown to arise due to the constraint of physicality in maximum likelihood estimates [33]. The real and imaginary parts of the resulting density matrix are shown in Fig. 4. We calculated tr ρ 2 = 0.74; the deviation from 1 (a pure state with no which-path information) can be attributed to an imperfect overlap of the five beams at the second beamsplitter. While this degree of coherence in the interferometer must be determined for an accurate prediction of the influence of the nonlinearities, its actual value has no systematic impact on the Sorkin experiment. The effect of the nonlinearity on the reconstruction is negligible for two reasons: the ratio of the photon fluxes for the different measurement settings is much smaller than in the measurements contributing to the evaluation of κ. More importantly, deviations in the density matrix do not produce a systematic effect on the expected higher-order interference, as κ = 0 holds for all states in quantum theory. From the density matrix it is possible to calculate the expected powers for the different settings of the shutters in the Sorkin experiment. We found good accordance with our measurement data [34], suggesting that the tomography produces an accurate description of the interferometer. The nonlinearities of both detectors have been characterized in separate experiments [3]. Applying them to the powers/count rates predicted from the density matrix yields small corrections of these powers (rel- The density matrix ρ of the 5-dimensional qudit state in our interferometer, with tr ρ 2 = 0.74.
Final result.κ gives the difference between the experimentally measured κ and the value expected from detector nonlinearities κ th for the measurements in the three regimes. The order ofκ increases along the horizontal axis and the error bars indicate one standard deviation.
ative change < 0.03% for the laser powers and < 0.5% for the unheralded single photon rates). One can then calculate the apparent higher-order interferences κ th , which would be expected in the Sorkin measurements, from these corrected data. The differences between the experimentally measured higher-order interferences and the expected values due to the nonlinearitiesκ ≡ κ − κ th give corrected higher-order interferences as the final results, which can be found in table TABLE II.κ ≡ κ − κ th is the nonlinearity-corrected higherorder interference for all measurement regimes. All these values are within one standard deviation of the expected zero value.
Note that in case of the heralded single photon data, we did not calculate an explicit prediction for κ th because the nonlinearity model is quite involved in this case. Instead we used the model to correct the raw experimental data, in order to obtainκ [35]. A final summary of all the differentκ j values is presented in Fig. 5. One finds that all these values are within one standard deviation of the expected zero value.
The optical 5-path interferometer presented in this work permitted us to experimentally confine the allowed domain of second order interference to an uncertainty of 3 × 10 −5 in the classical light regime. This is two orders of magnitude tighter than the bounds obtained from the most precise experiments in any system to date. The uncertainties in the semi-classical and quantum regimes of 2 × 10 −4 and 2 × 10 −3 , respectively, are also much lower than what has been reported before [1,17]. This new level of precision has been reached by a range of technical improvements over previous interferometers including power stabilization, phase stabilization and increased throughput as well as a judicious analysis of detector nonlinearities, which are the dominant origin of systematic error. Furthermore, we have performed the first measurement of third-and fourth-order interference terms, with similarly small uncertainties. So far, all our experimental results showed no significant higher-order interferences and are, therefore, in full accordance with the conventional theory. The dominant sources of imprecision in our setup are the uncertainties in determining the detector nonlinearities as well as shot noise in the single photon regime. In order to narrow the bound on higherorder interference further, highly linear detection systems and brighter single photon sources or higher detection efficiency will be required. A narrower experimental bound will aid the development of new theories or constrain free parameters of existing ones. In particular, knowledge of the various higher-order terms should permit discriminating between different models for generalized theories, such as coefficients in nonlinear extensions of Born's rule [36], the theory of density cubes [37] and quartic quantum theory [38]. For example, the bounds onκ 3 translate directly to bounds on the magnitude of off-diagonal elements in the theory of density cubes [37]. All these alternative theories contain quantum theory as a subset similarly as quantum theory contains classical theory as a subset. The mechanism by which theories exhibiting higher-order intereferences reduce to standard quantum theory is called hyper-decoherence [38,39]. This mechanism would be analogous to the process of decoherence, which induces the quantum-to-classical transition. Our experiment places also a bound on the hyperdecoherence time of the potential extensions of quantum theory with second, third and fourth order interference. Such post-quantum theories are not only interesting from the foundational point of view; they could solve problems intractable even on a quantum computer [40].

S1-ROBUSTNESS OF THE SORKIN EXPERIMENT
In the following, we will outline the impact of typical error sources occurring in optical multi-path interferometers. For the sake of simplicity, we will restrict the analysis to potential second-order interference κ 3 arising across three paths A, B, C with associated photon transmission rates into the target output mode p A , p B , p C and phases φ A , φ B , φ C . An extension to yet higher orders of interference is straightforwardly possible. There are four types of errors, which we are going to consider: • Incoherence, e.g. due to phase fluctuation shorter than a single measurement or limited beam overlap on the recombining beam splitter • Coherent phase fluctuations (phase fluctuations on a time scale longer than each measurement, but shorter than a measurement cycle) • Input power fluctuations • Poissonian photon counting uncertainty (Shot noise).
It should be reasonable to assume that all sources of errors are independent from one another, such that they can be treated separately and their individual influences on the Sorkin experiment can be added.
We first exclude the counting uncertainty and keep it for later. A general three-path interferometer has respective transmissions T A,B,C from the input along either of its three paths into the output mode: where the input rates p 1,2,3 are allowed to be different random variables accounting for power fluctuations between the respective single-path measurement settings. A two-path measurement of modes A and B takes place at another instance in time and is, thus, subjected to another input power p 4 : where the phase difference φ 1 ≡ φ B − φ A can be treated as a single random variable. Similarly one can express the other two-path terms as: p BC = p 5 (T B + T C + 2 √ T B T C cos φ 2 ) and p AC = p 6 (T A + T C + 2 √ T A T C cos φ 3 ), with independent realizations of phase differences φ 2 ≡ φ C − φ B and φ 3 ≡ φ A − φ C (Of course φ 1 + φ 2 + φ 3 = 0 must hold in a stationary experiment). The two-mode interference term arises as the difference of the two-mode count rate and the two corresponding single-path rates: The other two terms are defined accordingly.
The three-path measurement is influenced by yet another input power p 7 and three random realizations of the phases φ A , φ B , φ C , which we denote by φ 4,5,6 : In this case the three phases influence the measurement at the same time, so their differences can no longer be treated as being independent from one another. Of course, the background rate p 0 can also be affected by noise. However, as its magnitude is usually much smaller than the other rates, we neglect it in the following analysis. Therefore, the second-order interference term 3 defined in Eq. (4) in the main text is modeled by seven random input powers p 1,...,7 and six random phases φ 1,..., 6 .
In the experiment we operate at the point of fully constructive interference, that is φ j = 0 for all phases. In the following we will evaluate the impact of the various error sources on the unnormalized term 3 before considering the normalization by the sum of the first-order interference terms δ 3 .

Phase fluctuations
We first consider pure phase fluctuations, that is the absence of any changes in input power. Thus p i = p in = const. for all i = 1, . . . , 7 and one obtains for the second-order interference term: (9) Whether phase fluctuations influence the experiment in a coherent or an incoherent way will depend on their time scale. We will evaluate both cases separately.

Incoherence
Incoherence can be modeled by rapid fluctuations of all phases around their mean value φ = 0. This means that during a measurement the detector integrates over these rapid fluctuations reducing the result from the ideal value cos φ = 1 to some averaged value cos φ = X, with 0 ≤ X ≤ 1 measuring the degree of coherence and, thereby, the interference visibility. If the incoherence is stationary, it will reduce the interference contrast for all measurement settings by the same amount. As evident from Eq. 9, the cosines for each pair of paths enter with opposite signs, such that reduced interference contrasts cancel out and no net effect on 3 remains. Thus, the Sorkin experiment is immune to such incoherence effects, as has been known for some time [1].

Coherent phase fluctuations
Slower phase fluctuations preserve the phase within a single measurement, but can alter the phase inbetween measurements of a cycle. This can be modeled by assigning independent random values to the phases φ 4,5,6 adhering to a Gaussian distribution with zero mean and a standard deviation of σ φ : φ 4,5,6 ∝ N 0,σ 2 φ . The phase differences φ 1 , φ 2 , φ 3 are differences of two such independent Gaussians. Hence, they are also normally distributed, but with twice the variance: φ 1,2,3 ∝ N 0,2σ 2 φ . As in the scenario of incoherence, the phase fluctuations reduce the cosine terms. However, not every cosine is reduced in the same way, but by the six independent random phases. One can straightforwardly see from Eq. (9) that there is no bias on the unnormalized second-order interference: as all terms are cosines of phase differences with identical spread, so they have identical expectation values which cancel each other out. Hence, also coherent phase fluctuations have no systematic influence on our experiment. They merely lead to random uncertainties which can be mitigated by averaging over multiple data sets. As the cosine is changing its value only in second order in φ at the maximum, one can expect a quadratic scaling of the uncertainty of 3 with such phase fluctuations. We verify this intuitive expectation by numerical simulations. To this end, the random phases φ 1,...,6 are drawn from the Gaussian distributions defined above, with an ensemble size chosen large enough to ensure at least two significant digits for all moments in each case. Fig. 6 shows simulated data for a phase uncertainty of σ φ = π/10 in a balanced interferometer (T A = T B = T C = T ; p ≡ T p in as the rate transmitted through a single path). As the cosine of a Gaussian variable centered around zero yields a highly asymmetric distribution (see subfigure (b)), the resulting distributions for the second-order interference are also asymmetric and deviate considerably from a normal distribution. Yet, the expectation value of 3 remains exactly zero, as expected from our earlier considerations (c). In order to determine, the scaling of the random errors on 3 with the level of phase noise, the procedure is repeated for a variety of phase uncertainty levels. The resulting standard deviations σ 3,phase are summarized in Table III  all cases, there is no bias. The scaling of the random uncertainty can be extracted from the table as being roughly which is a quadratic scaling, as expected in the vicinity of fully constructive interference.

Power fluctuations
Now we consider the other extreme case of perfect phase stability but random fluctuations of the input power. Consequently, one can set all cosines in Eqs. (6) and (8) to 1 and treat the rates p 1,...,7 as independent Gaussians with mean p in and standard deviation σ p . One obtains for the three-path interference term: Thus, the Sorkin experiment suffers no systematic error from power fluctuations, as long as all shutter settings receive the same expected input power. The influence of potential long-term drifts is eliminated by picking the shutter settings in random order for each measurement cycle. As Eq. (10) is linear in the random powers p 1,...,7 , the second-order interference follows also a Gaussian distribution with a standard deviation σ 3,power scaling linearly with σ p . The coefficient depends on the specific values of the transmittivities. For a balanced interferometer, one obtains

Counting Error
Finally, we take the photon counting noise into account. The number of incident photons in a given time interval follows Poissonian statistics, with equal mean photon number and variance. Hence, the count rate measured for shutter setting 'A open' has a standard deviation of √ p A , and so on. Again, the second-order interference is unbiased by these uncertainties, as the mean values of the individual rates are unaffected. The counting errors for the different shutter settings are uncorrelated, so one can add their variances and obtain In case of a balanced interferometer with absolute single path rate p one gets in absence of other error sources Other sources of noise in the detector can be modeled in the same way and also cause no bias on the result.

Combined uncertainty and normalization
The above discussion shows that none of these error sources has a systematic effect on the unnormalized second-order interference. The three sources of random error can be expected to be independent from another, yielding a total uncertainty of with the numbers in the last step applying to the case of a perfectly balanced interferometer. Evidently, for large count rates phase and power fluctuations, which enter in second and first order, respectively, tend to dominate over shot noise. Note that the phase noise does not lead to a Gaussian distribution of 3 , whereas the others do. Hence, histogram plots of 3 can reveal which type of error source is dominant.
It seems more useful to determine higher-order interference relative to first-order interference rather than in absolute terms. Therefore, 3 is normalized by the sum of the magnitudes of the first-order terms as defined in Eq. (7): Clearly, the numerator and the denominator are dependent upon each other. Consequently, fluctuations in any of the random variables can lead to an undesired bias on κ 3 . In order to overcome this problem, we take the averages for and δ separately, as stated in Eq. (5) in the main paper. Thereby, we obtain an unbiased estimate for (which is zero in the conventional theory) and normalize it by the averaged first-order interference. The latter is reduced by incoherence and phase fluctuations. The other error sources have no systematic influence on δ.
The relative random uncertainty of the numerator 3 is much larger than the one of the denominator δ 3 , due to 3 ≈ 0, but δ 3 > 0. Therefore, the standard deviation of κ can be approximated by In case of a perfectly balanced interferometer at fully constructive interference one can evaluate the denominator as δ 3 = 6p, thus:

S2-ERROR ANALYSIS OF THE DATA
We conduct a series of M measurements of and δ, cycling through all shutter settings. The resulting distribution of for some exemplary path combinations, normalized by the average of δ, is shown in Fig. 7 as well as in Fig. 3 in the main text. As discussed in the previous section, the distribution of the data can provide information on the magnitude of random uncertainty as well as on the predominant source of error. In the classical and semi-classical regime the central peak of the distributions is slightly more pronounced than expected for an ideal Gaussian distribution. This is consistent with the expected shape of a phase-noise induced distribution (cf. Fig. 6(c)). Therefore, the data suggests that both, Gaussian power and shot noise as well as non-Gaussian phase noise play a role in these regimes. For heralded single photons, on the other hand, the lower count rates render the data shot-noise dominated, causing the results to be normally distributed (see bottom row in Fig. 7).
If the measurement results (i) (i = 1, . . . , M ) are mutually uncorrelated, one can use the standard error of mean to quantify the uncertainty of the average: with σ as the measured standard deviation of the data set. The same holds for the uncertainty of κ. We verify the absence of correlations in the data by calculating their auto-correlation function: An exemplary auto-correlation for 3 measured with the laser source on paths A, B, C is shown in Fig. 8(a). Evidently, no significant correlation among the data points can be detected, not even for subsequent ones. The same observation is made for all other measurement settings and light sources. Therefore, it is fully justified to use the standard error of mean, as defined in Eq. (13), as a measure for the uncertainty of the higher-order interference.
One has to be more careful, however, when it comes to averaging across the various path combinations in the multi-path interferometer. For example, phase fluctuations in a single path will influence the interference among several path combinations simultaneously. Therefore, cross-correlations between the measured data for different path combinations (i = ABC, . . . , CDE for three paths) can be expected. Indeed, one finds non-vanishing cross-correlations (R cc ) i,j for 3 and 4 , as shown in Fig. 8(b) and (c), respectively. With this at hand, one can obtain the standard error of mean of the final result for κ 3 via error propagation: with ∆κ 3,i denoting the standard error of mean of κ 3 on path combination i. Analogously, one gets for κ 4 : These are the experimental uncertainties presented in Table I in the main text.

S3-TOMOGRAPHY DATA AND PREDICTED COUNT RATES
For completeness we show additional data/figures resulting from our measurement: The measured count rates of the different path combinations for the measurement with the laser can be seen in Fig. 10. After filtering the 5618 measurement sets recorded in a measurement time of 68 h the standard deviation was measured to be 0.3%. A bar chart with the average powers over the whole measurement duration can be seen in Fig. 2 of the main text.
The measured count rates of the different path combinations for the measurement with the single photon source for the semi-classical measurement can be seen in Fig. 10. After filtering the 1912 measurement sets recorded in a measurement time of 88 h the standard deviation was measured to be 3.6%. This higher standard deviation compared to the classical light source results mainly from shot noise and due to the fact that the power of the blue pump laser was not stabilized. Fig. 11 shows the comparison between the measured values (shown in Fig. 2 in the main text for the laser) and the expected values arising from the reconstructed density matrix for all different path combinations.

S4-NONLINEARITY MEASUREMENT AND ASSOCIATED UNCERTAINTY
Detector nonlinearities are the predominant cause of systematic deviations in our implementation of the Sorkin experiment. We deduce these nonlinearities from independent beam combination experiments [2,3]. Their expected impact on the higher-order interference term κ th is then calculated by applying the nonlinear detector response function to the various multi-path probabilities calculated from the density matrix of the interferometer. In the following, it will be discussed how and with which uncertainty the nonlinearities are reconstructed and how this translates into the theoretical prediction κ th and the error bars of this prediction.

Measured value
Direct reconstruction ABCDE  A  B  C  D  E  AB  AC  AD  AE  BC  BD  BE  CD  CE  DE  ABC  ABD  ABE  ACD  ACE  ADE  BCD  BCE  BDE  CDE  ABCD  ABCE  ABDE  ACDE  BCDE   Measured value   Direct reconstruction   ABCDE  A  B  C  D  E  AB  AC  AD  AE  BC  BD  BE  CD  CE  DE  ABC  ABD  ABE  ACD  ACE  ADE  BCD  BCE  BDE  CDE  ABCD  ABCE  ABDE  ACDE

Polynomial expansion of the transfer function
In a measurement instrument for classical light the output voltages V are related to the impinging optical powers p by some transfer function p = f (V ). A variety of effects can contribute to nonlinearities in a photodiode and the attached voltmeter, such that it is difficult to develop a specific model. However, as the nonlinearities are quite weak compared to the linear part of the transfer function (on the 10 ppm-level), one can reasonably expand f (V ) into a polynomial of degree n in the region of interest [3]: with coefficients a ≡ (a 0 , . . . , a n ) T and the residual r n (V, a) quantifying the fidelity of this approximation for a given measured voltage V and polynomial degree n.
The goal of the beam-combination method is to find a suitable polynomial for f (V ). One overlaps two incoherent beams, which can be shuttered and power-controlled individually, on a detector (see, e.g., Fig. 1 in [3] for an illustration). For a series of k = 1, . . . , M settings of optical powers, the voltages V 1,k , V 2,k for either of the paths being open, V 3,k for both path being open and V 0,k for both paths being closed are recorded. The corresponding unknown optical powers are p 0,k , . . . , p 3,k . For incoherent beams, the optical powers should be additive, i.e., ∀k : p 3,k + p 0,k = p 1,k + p 2,k .
Condition (14) leads to the following constraints on the coefficients: with r n (V k , a) = r n (V 1,k , a) + r n (V 2,k , a) − r n (V 3,k , a) − r n (V 0,k , a) measuring the combined residual for data set k.
Note that the coefficient a 0 drops out of the sum due to V 0 0,k = . . . = V 0 3,k = 1. This is to be expected, as constant off-sets should have no influence in a balanced zero-sum experiment as given by Eq. (14).
The aim is to determine the nonlinearity of the detector. The exact value of the linear slope a 1 is of no interest in this respect. Thus, it can be set to a 1 = 1 for simplicity. It is convenient to introduce the quantity With this at hand, one can reformulate the constraints (15) into the following least-square optimization problem: which is solved by finding a configuration of nonlinearity parametersā ≡ (a 2 , . . . , a n ) T that minimize R n (ā). Here, the contribution of each dataset k is inversely weighted with its variance σ 2 k . These variances are a result of random experimental uncertainties of the measured voltages and may vary across the measuring range of the instrument. Taking them into account bears the advantage that more certain parts of the data contribute more to the optimization than uncertain ones. If these data uncertainties are known, one can straightforwardly solve the least-square problem [4]: the weighted matrix elements A k,j−1 ≡ S j,k /σ k (for k = 1, . . . , M and j = 2, . . . , n) and vector components b k ≡ −S 1,k /σ k . Then, the optimal transfer function reads where the irrelevant off-set term a 0 has been set to zero for convenience.

Uncertainty of the nonlinearity
In the following, it will be explained how the variances σ 2 k can be extracted from the data and how they influence the uncertainty of the optimization. The experimental data of [3] is used for this procedure and the subsequent prediction of κ th . The optical power in the beam combination experiment is increased linearly between subsequent measurement points (see Fig. 12(a)). The uncertainty of each voltage V m,k is estimated by calculating a floating standard deviation σ V m,k of the surrounding data points (r = 101 points were taken in each case) with respect to a perfectly linear slope ( Fig. 12(b)). The uncertainty σ k can then be taken as the standard deviation of each summand in Eq. (16), i.e., as the standard deviation of the function Y (V k ,ā) ≡ S 1,k + n j=2 a j S j,k . If the 4 voltages in each dataset V k are independent from one another and if the uncertainties of the voltages are small compared to their mean values, one can express σ 2 k as [5]: In general, this depends on the measured voltages as well as on the nonlinearity coefficients. However, for a weak nonlinearity a j V j V holds for all j > 1. Hence, a j V j−1 1 and σ k is dominated by uncertainties in the linear part of the optimization problem: The resulting uncertainties are shown in Fig. 12(c) and are used as weights in the least-square optimization (16).
The variances and covariances of the resulting coefficientsā can then be readily obtained from the covariance matrix (17) [4]: The uncertainty of the reconstructed transfer function in Eq. (18) at a given voltage V can be calculated from the covariance matrix via error propagation Note that this purely data-based approach makes no assumption on the precision of the instruments or the magnitude of random fluctuations in the experimental conditions.

Order of the polynomial and resulting transfer function
The optimization procedure Eq. (16) is solved for various degrees n of the polynomial. We select the lowest possible n beyond which the quality of the optimization does not significantly increase any further. To be specific, we calculate a test quantity X(n), defined as the residual R n (ā), normalized by the remaining number of degrees of freedom in the system X(n) ≡ R n (ā)/(M − (r − 1) − n + 1), with M − (r − 1) being the number of data points remaining after calculation of the local standard deviation. This quantity has been calculated for increasing n and the result is plotted in Fig. 13(a). As one would expect, a linear function n = 1 fares worse than nonlinear ones, which suggests that indeed significant nonlinearity is present in the system. There is some improvement in the quality of the fit from n = 2 to n = 3 (X(3)/X(2) ≈ 0.98). However, beyond n = 3, the optimization improves barely any further: X(4)/X(3) ≈ 0.992 and X(5)/X(4) ≈ 0.998. This suggests that n = 3 is a reasonable choice for the order of the polynomial without overinterpreting the data. This value is chosen for the estimation of the nonlinearity in this work. The resulting parameters, which optimize (16) for the beam combination measurements of Ref. [3] are: with the covariance matrix Implications on the Sorkin experiment Now it will be discussed how the systematic influence of nonlinearity-affected detection of the interferometer's output powers on higher-order interferences can be quantified. To this end, the state of the interferometer, i.e., all amplitudes and phases of the 5 paths, is reconstructed via state tomography. From this the expected optical powers of the Sorkin terms (p A , p AB , . . .) are calculated (see Fig. 11). Due to the nonlinear dependence between these powers and their measured voltages . . an apparent non-zero higher order interference κ 3,th arises, even if no physical higher-order interference is present. Its value will be predicted by calculating the nonlinearity affected voltages via the inverse transfer function. Clearly, any uncertainty in f −1 opt (p) will translate directly into an error bar of the predicted κ th . The inverse function can be analytically calculated from f opt (V ) for low-order polynomials. However, the measurement device is mostly linear, such that With this one can approximate Then, clearly, the uncertainty of the inverse function is approximately equal to the one of the forward transfer function.
which can be calculated via (19). The inverse transfer function and its uncertainty are shown in Fig. 13(b).
Note that the tomography data itself will also be affected by the nonlinearity, so one could apply the function f opt (V ) to reconstruct the 'true 'optical powers before the density matrix is reconstructed. However, the higher-order interference must always be zero for any physical density matrix, regardless of its precise structure (if Born's rule holds). Therefore, a forward nonlinearity correction of the tomography data, leading only to minuscule changes in the density matrix, has no significant impact on κ th and can, hence, be omitted.
It is of course difficult to tell in which way the realizations of f −1 (p) for the various values of p are correlated with each other. One may consider two extreme cases: The case of 'maximum correlation', where all values of f −1 opt (p) are shifted in the same direction by one standard deviation and the completely uncorrelated case, where each value f −1 opt (p) is replaced by an independent Gaussian random variable with mean f −1 opt (p) and standard deviation σ f (p). In the maximally correlated scenario, one finds generally much smaller error intervals than in the uncorrelated case. Therefore, we resort here to the latter scenario to provide the more conservative error estimation. For such a fully independent random variation of the nonlinearity for each power term, one can calculate the uncertainty of the prediction of κ th by propagation of Gaussian errors. With this procedure one obtains for the dataset of the main paper and the corresponding tomography data, the apparent higher order interferences shown as the red symbols in Fig. 14. Similar values and uncertainties are expected for all subsets of the five interferometer-paths. In case of second-and third-order interference, one can average over these subsets to obtain, along with the prediction for the fourth-order interference on the full set of paths, the final estimates used in the main text of the paper: κ C 3,th = (9.7 ± 3.1) · 10 −5 κ C 4,th = (−1.6 ± 4.1) · 10 −5 κ C 5,th = (−3.9 ± 5.1) · 10 −5 .
The uncertainties of the measured value κ and the predicted value κ th arise from independent experiments. Therefore, it is natural to assume that there is no correlation between them, which implies that the uncertainty of the corrected higher-order interferenceκ = κ − κ th is obtained by adding the individual variances: ∆κ = ∆κ 2 + ∆κ 2 th .
These are the error bars shown in Table II and Fig. 5 in the main text.

Nonlinearity of a single photon detector
The reverse-biased avalanche diodes used as single photon detectors have an intrinsic deadtime τ . The resulting saturation of the detector for increasing count rates is the dominant nonlinear mechanism. The transfer function p = f (V ) (p and V now being count rates) and its inverse can be expressed by a simple saturation model [3]: Just as for classical light detectors, the unknown parameter τ can be determined via beam combination experiments. The additivity of the optical powers (14) translates to the following condition: The optimal solution for the deadtime is found via the nonlinear optimization problem: As before, σ 2 k is the variance of each function F (τ, V k ) and is used to weight the optimization. If one is sufficiently far away from the saturation point, the nonlinearity is weak and the variance is dominated by the uncertainty in the linear part of F (τ, V k ). Then, σ 2 k ≈ 3 m=0 σ 2 V m,k , just as for the classical detector. As before, the individual uncertainties σ V m,k can be calculated from the data via a floating standard deviation.
Using this technique, we obtain a deadtime τ = 33.9 ns with an uncertainty σ τ = 0.3 ns for the detectors used in our experiment. One can then use the saturation model (23) to calculate a prediction of higher-order interference from the tomography data, just as for the classical light measurements. In this case, however, the uncertainty in τ must influence all count rates in the same direction. Therefore, the uncertainty in the prediction can be estimated by performing it also for deadtimes τ ± σ τ . These predictions are shown together with the experimental data in Fig. 15 for the individual path combinations. Compared to the classical case, they have negligible error bars, as a very specific nonlinearity model with a single, precisely determined parameter is used. The averages over all path combinations yield: κ SC 3,th = (−11.18 ± 0.10) × 10 −4 κ SC 4,th = (−3.48 ± 0.03) × 10 −4 κ SC 5,th = (2.85 ± 0.05) × 10 −6 .