Nanomechanical test of quantum linearity

Spontaneous wavefunction collapse theories could resolve the measurement problem of quantum mechanics. However, the best experimental tests have been limited by thermal fluctuations and have operated at frequencies far below those conjectured to allow the physical origins of collapse to be identified. Here we propose to use high-frequency nanomechanical resonators to surpass these limitations. We consider a specific implementation that uses a quantum optomechanical system cooled to near its motional ground state. The scheme combines phonon counting with efficient mitigation of technical noise, including non-linear photon conversion and photon coincidence counting. It is capable of resolving the exquisitely small phonon fluxes required for a conclusive test of collapse models, and potentially of identifying their physical origin.


I. INTRODUCTION
Quantum mechanics is one of the most transformative physical theories of the 20th century. However, while the evolution of the quantum wave function is deterministically described by Schrödinger's equation, the outcome of a measurement is probabilistic, given by Born's rule. Despite recent progress [1][2][3], there is no consensus on how to reconcile these two viewpoints, as illustrated by the measurement paradox [4]. There are two conceptually distinct approaches: either the interpretative postulates must be modified [5][6][7][8][9], or quantum mechanics approximates a deeper theory yet to be discovered. The latter approach gives rise to collapse models [10,11], which postulate a stochastic nonlinear modification to Schrödinger's equation. Irrespective of whether they successfully reconcile quantum evolution and measurement theory, these collapse models are considered the only mathematically consistent, phenomenological modifications against which quantum theory can be tested [5,12].
The most universal and well-studied collapse model is Continuous Spontaneous Localization (CSL) [13,14], which serves as a framework to describe a variety of collapse mechanisms [10,11][15][16][17][18][19][20]. In CSL, a collapse noise field is introduced which couples nonlinearly to the local mass density. In its simplest form this noise is white, and the model has two parameters: the collapse rate $\lambda_c$, which determines the interaction strength with the collapse noise field, and the correlation length $r_c$, which determines the spatial resolution of the collapse process [5,14]. The correlation length is expected to be $\sim 100$ nm [21], since the behaviour of larger systems is generally adequately described by classical theories, whereas quantum mechanics appears to apply on smaller scales. Refined dissipative and coloured models introduce two additional parameters, associating a temperature and a high-frequency cut-off with the collapse noise field to ensure energy conservation and permit an identifiable physical origin of collapse [16][17][18][19][20]. Based on the assumption that the origin is of cosmological nature, and thermalised to the photon, neutrino, or gravitational wave background, the high-frequency cut-off is estimated to occur at $\Omega_\mathrm{csl}/2\pi \sim 10^{10} - 10^{11}$ Hz [20].
To date, the most stringent unambiguous upper bounds on the collapse rate at the expected correlation length are based on mechanical resonators, with signatures of spontaneous collapse expected to manifest as an anomalous temperature increase. However, the suggested lower bounds to the collapse-induced heating are lower than one phonon per day [20][21][22].
The challenge of resolving these exquisitely small collapse signatures over a large thermal noise background has precluded conclusive tests of CSL, and has also introduced significant challenges for data interpretation [23]. Even were these issues resolved, quantum backaction heating [24,25] would remain orders of magnitude larger than the predicted collapse signatures. Moreover, with sizes ranging from microns [23,26] to meters [27,28], the resonators employed to date are larger than the anticipated correlation length and have frequencies far below the expected high-frequency cut-off. As such they are unable to provide insight into the physical origin of collapse [16][17][18][19][20].
In this work we propose to test collapse theories with high-frequency nanomechanical resonators. This offers the advantages of miniaturisation to match the expected collapse correlation length, a resonance frequency around the high-frequency cut-off, and the ability both to exponentially suppress thermal phonons via passive cryogenic cooling and to apply quantum measurement techniques to improve performance. To assess the approach, we develop a specific experimental implementation that makes use of phonon counting in a nanoscale mechanical resonator. Our proposal includes new mitigation strategies for optical, thermal, electrical and quantum back-action noise that, for the first time, provide a way to bring each of these noise sources below the expected lower bounds for collapse-induced heating. We conclude that, with challenging but plausible improvements in the state-of-the-art, our approach could conclusively test CSL, closing the gap between measured upper bounds and predicted lower bounds on the collapse rate, and could also potentially identify the physical mechanism underlying the collapse process.
This provides an experimental pathway to answer one of the longest standing questions in physics, and also opens up possibilities for laboratory tests of astrophysical models of dark matter [29,30], and other exotic particles [31].

Basic protocol
Our protocol is illustrated in Fig. 1, and is based on a gigahertz nanomechanical resonator, or array thereof, within a millikelvin environment. As opposed to standard optomechanical measurement, consisting of an optical cavity linearly coupled to a mechanical resonator [32], we propose to perform phonon counting in a three-mode optomechanical system where two optical modes are coupled via a mechanical resonator with resonance frequency Ω. This allows collapse signatures to be spectrally distinguished from most noise sources. One mode, the probe mode, is excited by a continuous weak laser at its resonance frequency ω p . In the ideal case, the other, the signal mode at frequency ω s = ω p + Ω, is only excited by resonant anti-Stokes Raman scattering between collapse induced phonons and probe photons. A single-photon readout scheme minimises both absorption heating [33] and quantum back-action heating. Signal photons are spectrally separated from probe photons by a filter cavity, while dark counts are suppressed by nonlinearly downconverting signal photons to pairs and performing coincidence detection.
As a concrete example, we consider using a three-mode photonic-phononic crystal optomechanical system, such as proposed in [34][35][36][37]. We choose most parameters based on those achieved in [38], with a mechanical resonance frequency $\Omega/2\pi = 5.3$ GHz, a mechanical damping rate $\Gamma/2\pi = 108$ mHz, an effective mass $m_\mathrm{eff} = 136$ fg, and thermalisation to the base temperature of a dilution refrigerator ($T = 10$ mK). For both optical modes we use the theoretical scattering-limited intrinsic decay rate $\kappa_{p,0} = \kappa_{s,0} = 2\pi \cdot 9.2$ MHz calculated for these devices [39], where the subscripts 'p' and 's' distinguish the probe and signal mode throughout [32]. Finally, we assume a tenfold improved single-photon optomechanical coupling rate of $g_0/2\pi = 11.5$ MHz, as predicted to be feasible with optimized designs [40].
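To give a feel for why a gigahertz resonator at millikelvin temperature suppresses thermal phonons so strongly, the following minimal sketch evaluates the Bose-Einstein occupancy for the quoted frequency and temperature. The estimate of the thermal phonon flux as $\Gamma \bar{n}_\mathrm{th}$ is a standard textbook assumption rather than an expression recovered from this text, and the function name is illustrative.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant in J s (exact SI value)
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant in J/K (exact SI value)


def thermal_occupancy(freq_hz, temperature_k):
    """Bose-Einstein mean phonon number of a mode at the given frequency and temperature."""
    x = H_PLANCK * freq_hz / (K_BOLTZMANN * temperature_k)
    return 1.0 / math.expm1(x)


freq_hz = 5.3e9                   # mechanical frequency Omega/2pi quoted in the text
temperature_k = 10e-3             # dilution refrigerator base temperature
gamma = 2 * math.pi * 0.108       # mechanical damping rate in 1/s

n_th = thermal_occupancy(freq_hz, temperature_k)
thermal_flux = gamma * n_th       # assumed estimate of the thermal phonon flux

print(f"mean thermal occupancy: {n_th:.2e}")
print(f"thermal phonon flux:    {thermal_flux:.2e} per second")
```

Under these assumptions the occupancy is of order $10^{-11}$, so the thermal flux sits many orders of magnitude below one phonon per day.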

Phonon flux induced by CSL
The CSL phonon flux is $\dot{n}_c = \lambda_c D$, where $D$ is a geometrical factor that quantifies the susceptibility of the resonator to spontaneous collapse. The requirement that CSL should resolve the measurement problem introduces lower bounds on $\lambda_c$, and therefore on the phonon flux. Adler proposed $\lambda_c \geq 10^{-8\pm2}\,\mathrm{s}^{-1}$ from the postulate that collapse should account for latent image formation in photography [21], while Bassi et al. proposed $\lambda_c \geq 10^{-10\pm2}\,\mathrm{s}^{-1}$ from the presumption that collapse should occur in the human eye [20]. We estimate $D = 5.1 \cdot 10^{5}$ for our proposed device [26] (see Supplemental Material [41]), which combined with these bounds implies minimum CSL-induced phonon fluxes of $\dot{n}_c = 5.1 \cdot 10^{-3\pm2}\,\mathrm{s}^{-1}$ and $\dot{n}_c = 5.1 \cdot 10^{-5\pm2}\,\mathrm{s}^{-1}$, respectively.
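The arithmetic behind these fluxes can be checked with a short sketch. It uses only the central values of the two lower bounds and the quoted geometrical factor; the dictionary and function names are illustrative.

```python
D_GEOMETRY = 5.1e5  # geometrical factor D quoted for the proposed device

# Central values of the proposed lower bounds on the collapse rate, in 1/s
LOWER_BOUNDS = {"Adler": 1e-8, "Bassi et al.": 1e-10}


def csl_phonon_flux(collapse_rate, geometry_factor=D_GEOMETRY):
    """Phonon flux n_dot_c = lambda_c * D, in phonons per second."""
    return collapse_rate * geometry_factor


fluxes = {name: csl_phonon_flux(rate) for name, rate in LOWER_BOUNDS.items()}

for name, flux in fluxes.items():
    waiting_time_s = 1.0 / flux  # mean time between collapse-induced phonons
    print(f"{name}: {flux:.1e} phonons/s, one phonon every {waiting_time_s:.0f} s")
```

At the central value of the Bassi et al. bound this corresponds to roughly one phonon every five and a half hours, which illustrates how small the signal is before any detection inefficiency is included.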

Optomechanical dynamics and conversion efficiency
If the oscillator is initially in its ground state, with one photon in the probe mode, a phonon introduced by spontaneous collapse prepares the state $|n_b n_p n_s\rangle = |110\rangle$, where $n_b$ is the phonon number in the mechanical resonator, while $n_p$ and $n_s$ are the photon numbers in the probe and signal modes, respectively. The optomechanical conversion efficiency $\eta_\mathrm{om}$ for this state to emit a signal photon at frequency $\omega_s$ is obtained by numerically solving the Born-Markov master equation (see Methods) [...].
Figure 2: Heating rates of a $Q = \Omega/\Gamma = 10^{7}$ silica sphere resonator vs. mechanical frequency and sphere diameter. Red traces: heating due to coupling to the thermal environment at temperatures of 300, 1 and 0.01 K. Gray shaded: lower bounds on CSL heating rates for a sphere, according to Adler [21], Bassi et al. [20] and GRW [22], assuming the fundamental mechanical breathing mode frequency $\Omega = c/R$, with sphere radius $R$ and speed of sound $c = 3000$ m/s. CSL heating rates drop once the resonator becomes smaller than the noise correlation length, which is set to $r_c = 10^{-7}$ m. Green: lower bound on the heating rate predicted from classical channel gravity [42][43][44]. At high frequencies and low temperatures, collapse signatures exceed the thermal heating. Blue shaded: proposed range of $\Omega_\mathrm{csl}$.
Four classes of noise can potentially imitate a collapse signal: thermal phonons, probe photons that leak through the system, phonons introduced by the measurement process, and detector dark counts. Photons that leak through the system can be efficiently filtered using a standard laser stabilisation reference cavity [45] (see Table I and Methods), and will not be considered further here.
Figure 3 (caption fragment; panel labels: probe mode, signal mode): [...] c) Two phonons created by a counter-rotating transition followed by a resonant transition.
Measurement-induced phonons. Phonons introduced by the optomechanical measurement can imitate collapse signatures. These phonons are created by non-resonant scattering processes between the signal and probe modes, the three lowest-order examples of which are represented in Fig. 3. We calculate the probability of phonon occupancy due to these processes numerically by solving the Born-Markov master equation (see Methods). We find that each of these processes is suppressed by the square of the resolved sideband ratio $\Omega/\kappa_p$, with predicted phonon occupancies shown in Fig. 4 (a) and (b). Photoabsorptive heating can also introduce phonons. However, it adds only a negligible contribution to the measurement-induced phonon occupancy (see Table I and Methods). A measurement-induced phonon can only be converted to a collapse-imitating photon at frequency $\omega_s$ if it scatters with a second photon entering the probe mode within the lifetime $\Gamma^{-1}$ of the mechanical resonator (see Fig. 3). It is therefore possible to suppress these photons by operating with a low average photon occupancy $\bar{n}_p$. Here, we choose the photon occupancy so that the probability of a photon entering the probe mode during one mechanical oscillation lifetime is $\eta_p = \bar{n}_p \kappa_p/\Gamma \sim 1\%$. This reduces the rate of measurement-induced photons by a factor of a hundred. The cumulative probability of a probe photon generating a phonon, and a second probe photon then causing emission of a photon at frequency $\omega_s$, is shown in Fig. 4.
[...] Dark counts are suppressed by coincidence detection: uncorrelated dark counts at the single-detector rate $R_{d,1}$ coincide at a rate $R_{d,2} = R_{d,1}^{2} \tau_c$, where $\tau_c$ is the coincidence timing resolution. For commercially available photon counters with $R_{d,1} = 3.5\,\mathrm{s}^{-1}$ and $\tau_c = 30$ ps [47], we predict $R_{d,2} \sim 3.7 \cdot 10^{-10}\,\mathrm{s}^{-1}$.
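The quoted dark-coincidence rate is consistent with the usual estimate for accidental coincidences between two uncorrelated detectors, in which the rate is the square of the single-detector dark count rate times the coincidence window. That formula is an assumption here, as is the function name; the sketch below simply checks that it reproduces the quoted figure.

```python
def accidental_coincidence_rate(single_rate, window):
    """Rate of chance coincidences between two uncorrelated detectors.

    single_rate: dark count rate of each detector in 1/s (assumed equal)
    window:      coincidence timing resolution in s
    """
    return single_rate ** 2 * window


r_d1 = 3.5       # single-detector dark count rate in 1/s, quoted in the text
tau_c = 30e-12   # coincidence timing resolution in s, quoted in the text

r_d2 = accidental_coincidence_rate(r_d1, tau_c)
suppression = r_d1 / r_d2  # factor by which coincidence detection lowers dark counts

print(f"dark coincidence rate: {r_d2:.2e} per second")
print(f"suppression factor:    {suppression:.1e}")
```

With the quoted detector parameters the estimate gives about $3.7 \cdot 10^{-10}\,\mathrm{s}^{-1}$, a suppression of roughly ten orders of magnitude relative to a single detector.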

Minimum testable collapse rate
For $r_c = 10^{-7}$ m, the rate of coincidence counts attributed to collapse is $R_c = \lambda_c D \eta = 5.5 \cdot 10^{2}\,\lambda_c$, where the efficiency $\eta = \eta_p \eta_\mathrm{om} \eta_\chi \eta_d \eta_f = 1.1 \cdot 10^{-3}$ quantifies the fraction of phonons in the mechanical resonator that result in a coincidence count, $\eta_\chi = 0.95$ is the downconversion efficiency, and $\eta_d = 0.64$ is the coincidence detection efficiency (see Supplemental Material [41]). This rate must exceed the sum of the noise rates, setting the limit to the minimum observable collapse rate $\lambda_c$. For optomechanically induced phonons, probe photons leaking through the system, and thermal phonons, the noise rates are [...] and $R_\mathrm{th} = \eta \dot{n}_\mathrm{th}$, respectively, where $\eta_f = 0.56$ is the transduction efficiency through the filter and $p_f = 3.5 \cdot 10^{-10}$ the probability of a probe photon leaking through the filter (see [...]).
Fabricating an array of $N$ optomechanical cavities on a silicon wafer [48][49][50][51] (see Fig. 1), coupled to a single filter cavity, nonlinear medium and detector (or a small number of such elements), could significantly reduce the measurement time to $t_\mathrm{meas}^{(N)} = t_\mathrm{meas}/N$, and also the dark-count-limited testable collapse rate to $\lambda_{c,\mathrm{det}}^{(N)} = \lambda_{c,\mathrm{det}}/N$. We estimate that $N \sim 10^{4}$ may be feasible (see Supplemental Material [41]), essentially eliminating detector dark counts as a limit, and allowing a reduction of the measurement time to about two days.
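The efficiency budget can be laid out explicitly. In the sketch below all numerical inputs are values quoted in the text; the conversion efficiency is not quoted as a number in this section, so the code infers it from the total, and that inferred value should be read as an illustration rather than a stated result. Variable names are illustrative.

```python
ETA_TOTAL = 1.1e-3   # total efficiency eta quoted in the text
ETA_P = 0.01         # probability of a probe photon arriving within the phonon lifetime (~1%)
ETA_F = 0.56         # transduction efficiency through the filter
ETA_CHI = 0.95       # downconversion efficiency
ETA_D = 0.64         # coincidence detection efficiency
D_GEOMETRY = 5.1e5   # geometrical factor D

# Conversion efficiency implied by the other factors (inferred, not quoted)
eta_om_implied = ETA_TOTAL / (ETA_P * ETA_F * ETA_CHI * ETA_D)

# Coefficient in R_c = (D * eta) * lambda_c
rate_coefficient = D_GEOMETRY * ETA_TOTAL


def array_scaled(value, n_devices):
    """Scale a single-device measurement time or dark-count limit by 1/N."""
    return value / n_devices


print(f"implied eta_om: {eta_om_implied:.2f}")
print(f"R_c / lambda_c: {rate_coefficient:.0f}")
```

The product of the rounded inputs gives a coefficient of about 560, slightly above the quoted $5.5 \cdot 10^{2}$, which is presumably a rounding effect in the quoted efficiency.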

Feasibility and alternative parameter regimes
The only optomechanical parameters that must be improved beyond the current state-of-the-art [38] to realise our protocol are the optical linewidth, which must be reduced by a factor of $\sim 50$, as predicted by theoretical modelling based on the device realized in [39], and the single-photon coupling rate, which must be enhanced by a factor of $\sim 10$, based on theoretical modelling in [40].
Alternatively, effective enhanced single-photon coupling could be achieved by coupling to a qubit or other highly nonlinear system, e.g. as demonstrated in [52]. Given the trajectory of the field, we estimate these requirements to be likely achievable in the intermediate future.
Nevertheless, it is also useful to consider alternative realizations of the method.
Quadratic coupling. While here we consider phonon counting via an optomechanical Raman interaction, in principle the method could be implemented with any low-noise phonon-counting method applied to a high-frequency oscillator [53][54][55][56][57]. One promising approach may be quantum non-demolition measurement of phonon number using non-linear optomechanics [58]. In the regime of quadratic optomechanical coupling and resolved mechanical sidebands [32], a collapse-induced phonon imparts a frequency shift $2\bar{n}_\mathrm{cav}^{1/2} g_0^{(2)}$ on the optical resonance at frequency $\omega$, where $\bar{n}_\mathrm{cav} = \langle a^\dagger a \rangle$ is the average intracavity photon number, with $a$ the annihilation operator for the optical cavity field, and $g_0^{(2)}$ the zero-point quadratic coupling rate [58][59][60]. The shift is detectable if it is larger than the significant noise sources, which are random fluctuations in the probe frequency, absorption heating, and quantum backaction from spurious linear coupling.
Considering Bassi et al.'s proposed mechanism [20], taking $\bar{n}_\mathrm{cav} = 10^{2}$ and assuming that the probe is shot noise limited, we find that a zero-point quadratic coupling rate of [...] Hz would be sufficient for the weakest possible collapse signal to exceed the probe frequency noise using the photonic-phononic crystal considered in the protocol above [38] (see Methods). This is well within experimentally achieved values in optomechanical photonic crystals (e.g. $g_0^{(2)}/2\pi = 245$ Hz in [59]). Perhaps the most significant challenge in this approach would be to engineer a strong suppression of linear optomechanical coupling, so that the phonon flux due to quantum backaction does not exceed the predicted CSL signature. If using standard architectures, there is a fundamental limit to this suppression of linear coupling [61]. Hence, either a different architecture would have to be employed [54,60,62], or the substantially more stringent condition $g_0^{(2)} \geq \kappa$ would have to be realized. The phonon flux due to quantum back-action is given by $\dot{n}_\mathrm{ba} = 4 g_0^{2} \bar{n}_\mathrm{cav}/\kappa = 4 g^{2}/\kappa$. To resolve a potential CSL signature, $\dot{n}_c$ must be greater than $\dot{n}_\mathrm{ba}$. As a result, the linear optomechanical coupling would need to be suppressed to $g \leq \sqrt{\lambda_c D \kappa/4}$. To test $\lambda_c = 10^{-12}\,\mathrm{s}^{-1}$, we find the condition $g/2\pi \lesssim 10^{-1}$ Hz, about seven orders of magnitude lower than typical linear coupling rates in photonic-phononic crystal structures [38]. While some architectures may in principle allow for vanishing linear coupling $g$, achieving the required suppression in practice may be challenging [54,62,63].
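The back-action condition can be made concrete by requiring that the back-action flux $4g^{2}/\kappa$ stay below the collapse flux $\lambda_c D$, which gives $g \leq \sqrt{\lambda_c D \kappa}/2$. The sketch below evaluates this bound. The linewidth used, twice the quoted intrinsic decay rate (that is, a critically coupled cavity), is an assumption made for illustration, and the function name is hypothetical.

```python
import math


def max_linear_coupling(collapse_rate, geometry_factor, kappa):
    """Largest linear coupling g (rad/s) for which 4 g^2 / kappa <= lambda_c * D."""
    return 0.5 * math.sqrt(collapse_rate * geometry_factor * kappa)


collapse_rate = 1e-12                 # lambda_c in 1/s, the value targeted in the text
geometry_factor = 5.1e5               # geometrical factor D
kappa = 2 * math.pi * 2 * 9.2e6       # assumed loaded linewidth in rad/s (critical coupling)

g_max = max_linear_coupling(collapse_rate, geometry_factor, kappa)
g_max_hz = g_max / (2 * math.pi)

# Back-action flux at the bound equals the collapse flux by construction
backaction_flux = 4 * g_max ** 2 / kappa

print(f"g_max / 2pi = {g_max_hz:.2f} Hz")
```

With this assumed linewidth the bound comes out at the sub-hertz level, within an order of magnitude of the condition quoted in the text; the exact figure depends on the linewidth chosen.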
In continuous operation, with currently available technology [38], absorption heating would exceed the expected heating from collapse by about seven orders of magnitude. Even with this very large heating in the continuous domain, it may be possible to resolve the problem by operating in a pulsed regime, so that each optomechanical measurement process is completed on a timescale much shorter than the time required for absorption events to create phonons. In this case, the measurements would need to be sufficiently spaced in time to allow phonons to fully dissipate.
Figure 5 (caption fragment): [...] the observed temperature of neutron stars [72,73] (dashed black), which are, however, valid only for white-noise CSL; and cold atom interference [67] (gray region), though we note the controversy [74] on the actual size of the superposition reported in [75].
The red shaded region in Fig. 5 could be tested by our protocol as discussed above.
In the case of white-noise collapse, the protocol could for the first time fully test Bassi et al.'s proposal. If collapse noise has one of the proposed physical origins [18][19][20], the envisaged protocol would also for the first time probe Adler's prediction, which is in this case not tested by X-ray emission (black dotted line in Fig. 5). The resonance frequency is close to the frequency range in which a drastic frequency-dependent reduction of the  collapse noise stemming from a physical origin is expected [20]. To identify the physical origin of collapse, and to differentiate between collapse-induced signal and technical noise, we suggest employing a number of mechanical resonators of slightly different frequencies, or one frequency-tunable resonator [76], at frequencies around Ω/2π ∼ 10 GHz, such as reported in [39].
We also evaluate the capability of the protocol to constrain parameters in gravitational collapse models. While for the Diósi-Penrose model [77][78][79] we find that it cannot exceed existing bounds, for the classical channel gravity model in a typical parameter range [42][43][44] we predict about a one order-of-magnitude stronger bound than previously achieved [80] (see Supplemental Material [41]).
In summary, we have proposed the concept of testing quantum linearity using high-frequency mechanical oscillators. This offers the advantages of thermal noise suppression to well below expected collapse signatures, and the potential for identification of the physical origin of collapse. As a possible implementation we suggest a protocol based on a dual-cavity high-frequency optomechanical device that is passively ground-state-cooled and operates in the strong coupling regime. This design, combined with nonlinear optical techniques to reduce dark counts, is predicted to allow measurement of the minuscule phonon flux generated by collapse-induced heating. While challenging, the protocol has the potential to conclusively test CSL, and thus whether collapse mechanisms can be invoked to resolve the measurement paradox. Unlike previous proposals and experiments, it is designed to allow for identification of the physical noise field underlying CSL, and for differentiation between excess technical noise and signatures of collapse.

Born-Markov master equation
To model the dynamics of the three-mode optomechanical system we employ the Born-Markov framework for open quantum systems [81][82][83]. The interaction picture Hamiltonian for our system is [34,35,83] [...], where $b$, $a_p$ and $a_s$ are annihilation operators for the mechanical mode and the optical modes, respectively, and $a_\mathrm{in}$ is the coherent input field. The first term describes the mechanically mediated cross-coupling of the optical modes, while the second term describes the coherent excitation [32]. In the parameter regime of this work, where $g_0 \ll \Omega$ and $\Gamma \ll \kappa_p, \kappa_s, g_0$, the dynamics of the system can be described by the Born-Markov master equation as [81,82] [...], where $\rho$ is the density matrix, $\bar{n}_\mathrm{th}$ the mechanical mean thermal occupancy, and $\mathcal{D}$ the dissipating superoperator, $\mathcal{D}[A]\rho = A \rho A^\dagger - \frac{1}{2}(A^\dagger A \rho + \rho A^\dagger A)$. A weak phonon flux due to spontaneous collapse is described by $\dot{n}_c = \lambda_c D$, independent of its origin. This allows us to model the conversion of a signal phonon to a signal photon, as well as the creation of noise phonons introduced by measurement (see Supplemental Material [41]).
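The dissipating superoperator defined above is simple enough to write out directly. The sketch below implements it for small matrices in plain Python and applies it to a single phonon in a two-level truncation of the mechanical mode. The truncation, the helper names, and the choice of example are illustrative assumptions; a full simulation of the three-mode system would need a larger Hilbert space and the Hamiltonian terms.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Conjugate transpose of a nested-list matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def dissipator(op, rho):
    """D[A] rho = A rho A^dagger - (A^dagger A rho + rho A^dagger A) / 2."""
    op_dag = dagger(op)
    op_dag_op = matmul(op_dag, op)
    jump = matmul(matmul(op, rho), op_dag)
    left = matmul(op_dag_op, rho)
    right = matmul(rho, op_dag_op)
    n = len(rho)
    return [[jump[i][j] - 0.5 * (left[i][j] + right[i][j])
             for j in range(n)] for i in range(n)]


def trace(m):
    """Sum of the diagonal elements."""
    return sum(m[i][i] for i in range(len(m)))


# Two-level truncation in the basis (|0>, |1>): b lowers |1> to |0>
b = [[0, 1], [0, 0]]
rho_one_phonon = [[0, 0], [0, 1]]   # density matrix of the state |1><1|

d_rho = dissipator(b, rho_one_phonon)
print(d_rho)
```

Applied to the one-phonon state, the dissipator moves population from the excited level to the ground level and leaves the trace unchanged, which is the expected behaviour of a damping term.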

Negligible sources of noise
Probe photons leaking through the system. Probe photons passing directly from the laser through the optomechanical system, without a scattering event, could in principle imitate a signal, obscuring collapse signatures. We find that, using a standard laser stabilisation reference cavity as a filter [45], this noise can be suppressed well below both Adler's and Bassi et al.'s lower bounds [...].
Optical absorption heating. To estimate the phonon occupancy due to optical absorption, we use the model for absorption heating in silicon optomechanical crystals outlined in [33,39]. Photoabsorption creates an electronic excitation, which is then transferred to terahertz-frequency phonons. While these phonons radiate from the resonator to the environment at a geometry- and material-dependent rate $\gamma_\mathrm{THz}$, they also couple to lower-energy phonons on a generally longer timescale, potentially exciting the mechanical resonator. In [33,39], the average phonon number $\bar{n}_b$ is related to the average intracavity photon number $\bar{n}_\mathrm{cav}$ via $\bar{n}_b \propto \bar{n}_\mathrm{cav}^{1/3}$. We expect this relationship to break down when the time between photoabsorption events is long enough for the generated heat to fully dissipate, $\bar{n}_\mathrm{cav} \cdot \gamma_\mathrm{THz}/\kappa \ll 1$, where $\kappa$ is the loaded optical decay rate, as any discrete photon absorption event is expected to create a fixed amount of heat. In this case $\bar{n}_\mathrm{cav}$ determines the frequency of these events, but not the magnitude of the dissipated heat. We compute the average phonon number $\bar{n}_\gamma$ excited in the mechanical resonator by one probe photon due to photoabsorption, at a time $t_\mathrm{abs}$ at which the oscillator is in thermal equilibrium with the material but not yet with the environment, $\Gamma^{-1} \gg t_\mathrm{abs} \gg \gamma_\mathrm{THz}^{-1}$. For the proposed setup we find $p_\mathrm{abs}(t \to \infty) = 6.1 \cdot 10^{-12}$ and $R_\mathrm{abs} = 1.4 \cdot 10^{-14}\,\mathrm{s}^{-1}$ (see Supplemental Material [41] for calculation details).

Measurement-induced phonons.
A probe photon can create a noise phonon by coupling directly into the signal mode instead of the probe mode (Fig. 3 (a)). This process is suppressed by the square of the resolved-sideband ratio $\Omega/\kappa_s$. The corresponding occupancy is calculated by numerically solving the Born-Markov master equation (see Methods and Supplemental Material [41]) and shown by the dashed blue line in Fig. 4 (a).
A photon that does enter the probe mode, corresponding to the state $|n_b n_p n_s\rangle = |010\rangle$, can introduce noise by undergoing the non-resonant phonon-creating transition $|010\rangle \to |101\rangle$ (see Fig. 3 (b)). The resulting state can also resonantly transition to a two-phonon state, $|101\rangle \to |210\rangle$, as shown in Fig. 3 (c). As above, noise phonons from this process are suppressed by $\sim (\Omega/\kappa_p)^{2}$. Predicted phonon occupancies are shown in Fig. 4 (a) and (b).

Zero-point quadratic coupling rate required to fully test Bassi et al.'s lower bound
The linearised quadratic part of the optomechanical interaction Hamiltonian is $H = \hbar\, \bar{n}_\mathrm{cav}^{1/2} g_0^{(2)} (a^\dagger + a)(2 b^\dagger b + b^\dagger b^\dagger + b b)$ [60]. The term proportional to $b^\dagger b$ yields a per-phonon optical resonance frequency shift of $2\bar{n}_\mathrm{cav}^{1/2} g_0^{(2)}$. [...] The term proportional to $b^\dagger b^\dagger a$ converts a probe photon to two phonons, potentially imitating a collapse signature. However, the shift induced by two phonons is $4\bar{n}_\mathrm{cav}^{1/2} g_0^{(2)}$ and can be clearly distinguished from the collapse-induced shift caused by one phonon. Therefore, two-phonon creation can only imitate a collapse signal if it coincides with a frequency fluctuation of the probe mode $-\delta\omega \geq 2\bar{n}_\mathrm{cav}^{1/2} g_0^{(2)}$, sustained at least over the two-phonon lifetime $(\sqrt{2}\,\Gamma)^{-1}$. The low probability of such a fluctuation, together with suppression on the order of $(2\Omega/\kappa)^{2}$ due to the non-resonant nature of the interaction, makes this source of noise negligible.

DATA AVAILABILITY
The data that support the findings of this study are available within the paper and its

SUPPLEMENTARY NOTE 1. SIGNAL RATES FROM SPONTANEOUS COLLAPSE
While there exists a plethora of collapse models [1,2], they can all be formulated in terms of a stochastic nonlinear modification to Schrödinger's equation of the general form [3,4]: [...], where the operator $H$ is related to the standard Hamiltonian of the system; $\xi(t)$ is defined in terms of an increment $dW(t)$ of a stochastic Wiener process $W(t)$ through $\xi(t)\,dt = dW(t)$; $\lambda$ is a coupling that sets the strength of the collapse; and $A$ is the reduction operator, which is specific to the particular realization of the collapse mechanism. The requirements of norm preservation of the state evolution and of no superluminal signalling make these collapse models the only mathematically consistent, phenomenological modifications against which quantum theory can be tested in this context [3].
In the following we give expressions for the decoherence rates due to two commonly studied collapse mechanisms. Collapse models differ in the properties and nature of the mechanisms purported to cause the collapse and can thus be classified according to the basis in which decoherence occurs, the mathematical properties of the stochastic mechanism, and whether the mechanism has a quantum mechanical origin or is due to a modification to Schrödinger's equation from a deeper-level theory. In the models discussed here decoherence acts in the position basis with Gaussian correlations in space, see [1] and references therein. A generic expression for the diffusion term $\sqrt{\lambda} A \xi(t)\,dt$ in these models is [...] and describes the rate at which the spatial coherence is suppressed. Taking as an example the typically used Gaussian smearing function of the mass density, $G(\vec{x}) = (4\pi r_c^{2})^{-3/2} e^{-\vec{x}^{2}/4 r_c^{2}}$, the decoherence rate reads [...]. We note that this is exactly the one-particle decoherence rate of the Ghirardi-Rimini-Weber (GRW) [5] and the Continuous Spontaneous Localization (CSL) models [6,7]. Furthermore, for small superposition sizes compared to the correlation length $r_c$, $|\vec{x} - \vec{x}'| \ll r_c$, the decoherence rate becomes approximately [...], where $x_0$ is the zero-point motion (or equivalently, for spatial superpositions of massive particles, the superposition size $\Delta x$).
The most studied collapse model is the CSL model [6,7]. It considers second-quantized (albeit non-relativistic) indistinguishable particles where the collapse occurs in the particle number (Fock) basis. The key consequence of the indistinguishability of the particles is that, for multi-particle systems, the model predicts a quadratic dependence of the decoherence rate on the number of particles that are within the cutoff distance $r_c$. For comparison, the GRW model postulates collapse events, discrete in time, of the wave function of individual (and also distinguishable) particles in the position basis, which yields a linear dependence of the decoherence rate on the particle number [5]. The stochastic process in the CSL model is introduced in terms of a time-dependent Wiener noise at each point in space, coupling to the mass density smeared over some length scale $r_c$. As CSL is linear in the coupling rate $\lambda_c$ of matter to the collapse noise field, we define a dimensionless decoherence operator $D$ by [...]. For an oscillator with mass density $\rho(\vec{x})$ and direction of motion along the z-axis, the decoherence operator in the CSL model reads [8,9] [...], where $\tilde{\rho}(\vec{k}) = \int d^{3}r\, \rho(\vec{r})\, e^{-i\vec{k}\cdot\vec{r}}$ is the Fourier transform of the mass density and $u = 1.66 \cdot 10^{-27}$ kg is the atomic mass unit.
CSL in its original form predicts infinite energy increase as time goes to infinity [7,10,11]. This problem can be solved by postulating a finite temperature of the noise [12]. Furthermore, the model assumes a white noise spectrum, which cannot be identified with any physical origin [3]. In order to generalise the model to become compatible with relativity and with observations, the noise field should have a more general, i.e. non-white spectrum.
In such a case, however, the model becomes non-Markovian [13,14]. While such models are difficult to study in full generality, it has been demonstrated [13,14] that to lowest order in $\lambda$ the qualitative features of the model are the same as for the white-noise model. A common assumption that helps to lift ambiguities in defining the model is that the field underlying the collapse process has a cosmological origin. This allows one to introduce a high-frequency cutoff of the order of $\Omega_\mathrm{csl}/2\pi \approx 10^{10} - 10^{11}$ Hz [12,15], which ensures that the collapse rate is essentially as in a white-noise CSL model, but which changes the relaxation behaviour: in the coloured-noise model the system thermalises to the temperature of the noise field in the limit of long times, while in the white-noise model the system energy keeps growing with time [16].
D can be calculated analytically for simple geometries of composite test-systems [8,9].
For a sphere of radius $R$ the decoherence operator reads [...] and for the case of a cuboid with constant density $\rho$ and side lengths $L_1$, $L_2$ and $L_3$, where $L_3$ is the direction of motion, [...]. For our proposed experiment, we estimate the length, height and width of the photonic crystal beam to be $L_1, L_2, L_3 = 1.21, 0.22, 0.22$ µm, respectively, reproducing the effective motional mass of the relevant mechanical mode of 136 fg [17], where we used the density of silicon, $\rho = 2.33 \cdot 10^{3}$ kg/m$^{3}$. We choose the cuboid approximation for the mode shape following [9].
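The claim that these dimensions reproduce the effective mass can be verified with a one-line volume calculation. The sketch below treats the beam as a homogeneous cuboid, as the text does; the function name is illustrative.

```python
SILICON_DENSITY = 2.33e3   # kg/m^3, as quoted in the text


def cuboid_mass_fg(l1_um, l2_um, l3_um, density=SILICON_DENSITY):
    """Mass in femtograms of a homogeneous cuboid with side lengths in micrometres."""
    volume_m3 = (l1_um * 1e-6) * (l2_um * 1e-6) * (l3_um * 1e-6)
    mass_kg = density * volume_m3
    return mass_kg * 1e18   # 1 fg = 1e-18 kg


mass_fg = cuboid_mass_fg(1.21, 0.22, 0.22)
print(f"cuboid mass: {mass_fg:.1f} fg")
```

The result is about 136 fg, in agreement with the quoted effective motional mass.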
For the models where the collapse is assumed to have a gravitational origin, two main types of theories can be distinguished: those where decoherence arises due to an intrinsic uncertainty in the local value of the gravitational field [18,19] or, equivalently, gravitational self-interaction [20]; and those where decoherence is a consequence of the assumption that gravity is fundamentally a classical channel [21][22][23]. In both cases, for small superposition size, the resulting effect has the same general form as the corresponding regime of the CSL and GRW models: for an oscillator it is proportional to the square of the zero-point motion, and for spatial superpositions of massive particles it is proportional to the square of the superposition size. Gravity-based decoherence has also been described within the framework of CSL in [24].
In the Diósi-Penrose (DP) model the decoherence rate is quantified by the gravitational potential evaluated between superposed amplitudes of the system: [...] is the gravitational interaction between mass densities $\rho_X$, $\rho_Y$ associated with the superposed configurations $X$, $Y$ [25], with $G = 6.67 \cdot 10^{-11}\,\mathrm{m^{3}\,kg^{-1}\,s^{-2}}$ being Newton's gravitational constant. For point particles the above expression gives a divergent decoherence rate and thus a short-distance cutoff $r_\mathrm{DP}$ is needed. The decoherence operator reads [8] [...], where $a$ is the lattice constant of the composite object. Comparing the heating rate expected from Eq. (5) to measurement-induced spurious phonons, which constitute the strongest noise source in our proposed experiment (see main text), we find that short-distance cutoffs up to $r_\mathrm{DP} \approx 3.9$ fm can be excluded. For a discussion of experimental tests of classical channel gravity [21][22][23], see [26].

SUPPLEMENTARY NOTE 2. CALCULATION DETAILS: EFFICIENCY AND NOISE LEVELS OF THE OPTOMECHANICAL SYSTEM
The optomechanical system is probed by a strongly attenuated coherent source, such as described in [27], which can be stabilized to a sub-Hz linewidth $\kappa_L$. Because $\kappa_L$ is much smaller than $\kappa_p$, $\kappa_s$, and $\kappa_f$, which are the linewidths of the probe mode, the signal mode, and the filter cavity, respectively, the field $a_\mathrm{in}$ of the incoming probe laser is well approximated by a $\delta$-function: $a_\mathrm{in}(\omega) = a_\mathrm{in}\,\delta(\omega - \omega_L)$, where $\omega_L$ is the laser frequency. We set $\omega_L = \omega_p$ to maximize coupling into the probe mode $a_p$. The mechanical decay rate $\Gamma$, as well as the collapse rate, are slow compared to the timescales of the optomechanical interaction: $D, \Gamma \ll g_0, \kappa_{p/s}$, where $g_0$ is the single-photon optomechanical coupling rate [28]. In this limit, the conversion dynamics can be modelled by the reduced master equation: [...], where $\rho$ is the density matrix, the operators $a_p$ and $a_s$ are the annihilation operators for the optical probe and signal modes, respectively, and $H_\mathrm{int}$ is the interaction Hamiltonian given in the main text.
Resonant anti-Stokes scattering. If a phonon is introduced into the ground-state cooled oscillator, and a probe pulse is incident within the lifetime of the mechanical excitation, the system is in the initial state $|n_b n_p n_s\rangle = |110\rangle$. The optomechanical conversion efficiency $\eta_{om}$ is the probability that one photon in the probe mode $a_p$ scatters with one phonon in the mechanical resonator, creating a photon in the signal mode $a_s$ ($|110\rangle \rightarrow |001\rangle$) via an anti-Stokes Raman process, and that this photon is outcoupled to create a signal photon at frequency $\omega_s$. $\eta_{om}$ is obtained by numerically solving Eq. (6) and time-integrating over the emission from the signal mode, $\eta_{om} = \kappa_{s,ex} \int_0^\infty \langle a_s^\dagger(t) a_s(t) \rangle\, dt$, where $\kappa_{s,ex} = \kappa_s - \kappa_{s,0}$ is the external decay rate of the signal mode due to coupling, with $\kappa_{s,0}$ and $\kappa_s$ its intrinsic and loaded decay rates, respectively. Supplementary Figure 1 (a) shows $\eta_{om}$ as a function of the effective coupling strength $g_0/\kappa_p$ for critical coupling of the probe mode, $\kappa_{p,0} = \kappa_{p,ex}$, and equal intrinsic decay rates, $\kappa_{s,0} = \kappa_{p,0}$, for a critically coupled ($\kappa_{s,0} = \kappa_{s,ex}$) and an overcoupled ($\kappa_{s,ex}/\kappa_p = 2$ and $4$) signal mode. The efficiency $\eta_{om}$ is sensitive to the ratio $g_0/\kappa_p$, as $g_0$ sets the rate for anti-Stokes scattering and $\kappa_p$ sets the rate for the competing optical decay. Furthermore, it depends on the ratio $\kappa_{s,ex}/\kappa_p$, as higher values favour outcoupling of the signal photon. For our proposed experiment, $g_0 = \kappa_p$ and $\kappa_{s,ex}/\kappa_p = 1.8$. The probability of a phonon in the mechanical resonator translating to a coincidence count, imitating a signal, is given by $\eta = \eta_p \eta_{om} \eta_f \eta_\chi \eta_d = 1.1 \cdot 10^{-3}$, where $\eta_p$ is the probability of a photon entering the probe mode during the mechanical excitation lifetime, $\eta_f = 0.56$ is the transduction efficiency through the filter for a signal photon at frequency $\omega_s$, and $\eta_\chi = 0.95$ and $\eta_d = 0.64$ are the downconversion and coincidence detection efficiencies, respectively.
These efficiencies are analysed in more detail in the following paragraphs. As the rate of phonons created in the resonator due to spontaneous collapse is given by the collapse rate $\lambda_c D$, the rate of registered collapse signatures is $R_c = \lambda_c D \eta = 5.5 \cdot 10^{2}\, \lambda_c$.
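The efficiency budget above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the rounded values quoted in the text and back-calculates the two quantities that are not stated numerically in this section, namely the optomechanical conversion efficiency and the factor $D$. The variable names are hypothetical, and the implied values inherit the rounding of the quoted numbers.

```python
# Values quoted in the text (rounded as published).
eta_total = 1.1e-3        # overall phonon-to-coincidence probability, eta
eta_p = 0.01              # probability of a probe photon arriving in time
eta_f = 0.56              # filter transmission
eta_chi = 0.95            # downconversion efficiency
eta_d = 0.64              # coincidence detection efficiency
rate_per_lambda = 5.5e2   # R_c / lambda_c as quoted

# Back-calculated quantities (assumption: eta is the plain product of factors).
eta_om_implied = eta_total / (eta_p * eta_f * eta_chi * eta_d)
d_implied = rate_per_lambda / eta_total

print(f"implied eta_om ~ {eta_om_implied:.3f}")
print(f"implied D      ~ {d_implied:.2e}")
```

Because the inputs are rounded to two significant figures, the implied values should be read as consistency checks rather than as design parameters.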
Probe field occupancy. The average number of photons encountered by one phonon, $\bar{n}_{ph}$, is given by Eq. (7), where $\bar{n}_p$ is the average photon occupancy of the probe mode and $e^{-\Gamma t}$ is the phonon occupancy at time $t$ after one phonon is created in the mechanical resonator at time $t = 0$. In the limit $\bar{n}_p \ll \Gamma/\kappa_p$, Eq. (7) also quantifies the probability $\eta_p$ of a phonon in the mechanical resonator encountering a probe photon, $\eta_p = \bar{n}_{ph}$. Furthermore, in this limit, $\eta_p$ asymptotes to $\eta_p = \bar{n}_p \kappa_p/\Gamma$. In our protocol, the probe laser power is adjusted so that $\eta_p = 0.01$, corresponding to an average of 0.01 photons per mechanical oscillator lifetime. In the steady state, the average intracavity photon number is given by $\bar{n}_p = 4\kappa_{p,ex}\bar{n}_{in}/\kappa_p^2$ [28], and hence, to achieve a given $\eta_p$, the input field occupancy is adjusted to $\bar{n}_{in}(\eta_p) = \eta_p \Gamma \kappa_p/(4\kappa_{p,ex})$. Because the signal is proportional to $\eta_p$, and the background from measurement-induced noise phonons is proportional to $\eta_p^2$, the input field occupancy provides a handle to lower the noise at the cost of a longer measurement time, or vice versa.
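As a minimal sketch of how the input field would be set for a target $\eta_p$, the two steady-state relations above can be written as functions and checked against each other. The numerical values of $\Gamma$ and $\kappa_p$ used here are assumptions: $\Gamma/2\pi = 108$ mHz and $\kappa/2\pi = 575$ MHz are read off from the photoabsorption estimate later in this note, identifying that $\kappa$ with $\kappa_p$ is a guess, and critical coupling ($\kappa_{p,ex} = \kappa_p/2$) is assumed.

```python
import math

def input_occupancy(eta_p, gamma, kappa_p, kappa_p_ex):
    """Input field occupancy needed for a target eta_p (relation from the text)."""
    return eta_p * gamma * kappa_p / (4.0 * kappa_p_ex)

def probe_occupancy(n_in, kappa_p, kappa_p_ex):
    """Steady-state intracavity photon number of the probe mode."""
    return 4.0 * kappa_p_ex * n_in / kappa_p**2

# Assumed parameters (see lead-in); angular frequencies in rad/s.
gamma = 2.0 * math.pi * 0.108      # mechanical decay rate
kappa_p = 2.0 * math.pi * 575e6    # loaded probe linewidth (assumed)
kappa_p_ex = kappa_p / 2.0         # critical coupling (assumed)

eta_p_target = 0.01
n_in = input_occupancy(eta_p_target, gamma, kappa_p, kappa_p_ex)
n_p = probe_occupancy(n_in, kappa_p, kappa_p_ex)

# Round trip: eta_p = n_p * kappa_p / gamma should return the target.
eta_p_check = n_p * kappa_p / gamma
print(n_in, n_p, eta_p_check)
```

Under critical coupling the required input occupancy reduces to $\eta_p \Gamma/2$, independent of $\kappa_p$, so the assumed linewidth only affects the intracavity photon number.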
Transmission through the filter cavity. The efficiency $\eta_f$ of the signal passing through the filter, assuming equal input and output coupling strengths $\kappa_{f,ex} = \kappa_{f,in} = \kappa_{f,out}$, is given by [29] $\eta_f = 4\kappa_{f,ex}^2/\kappa_f^2$, where $\kappa_{f,0}$ is the intrinsic filter linewidth and $\kappa_f = \kappa_{f,0} + \kappa_{f,in} + \kappa_{f,out}$ is the loaded filter linewidth, with $\kappa_{f,in}$ and $\kappa_{f,out}$ the input and output coupling rates, respectively. Using a typical laser stabilisation filter cavity [27], $\kappa_{f,0} = 30$ kHz, and overcoupling both at the input and the output, $\kappa_{f,in} = \kappa_{f,out} = 1.5 \cdot \kappa_{f,0}$, a transmission efficiency of $\eta_f = 0.56$ is achieved.
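The quoted filter efficiency can be reproduced with the standard on-resonance transmission of a two-port cavity, $4\kappa_{in}\kappa_{out}/\kappa_f^2$. That this is the expression intended by [29] is an assumption; it is used here because it reproduces the stated value of 0.56. The function name is hypothetical.

```python
def filter_transmission(kappa_0, kappa_in, kappa_out):
    """On-resonance power transmission of a two-port cavity (assumed form)."""
    kappa_loaded = kappa_0 + kappa_in + kappa_out
    return 4.0 * kappa_in * kappa_out / kappa_loaded**2

kappa_f0 = 30e3                  # intrinsic filter linewidth from the text
kappa_f_in = 1.5 * kappa_f0      # overcoupled input port
kappa_f_out = 1.5 * kappa_f0     # overcoupled output port

eta_f = filter_transmission(kappa_f0, kappa_f_in, kappa_f_out)
print(f"eta_f = {eta_f:.4f}")
```

Only the ratios of the coupling rates to the intrinsic linewidth enter, so the result does not depend on whether the 30 kHz figure is an angular or an ordinary frequency.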
Nonlinear downconversion. After separation from the probe light at frequency $\omega_p$, signal photons at frequency $\omega_s$ are downconverted to photon pairs in a nonlinear medium. To minimize detector dark counts, we propose a nonlinear conversion process that converts each signal photon into a photon pair. A bright classical pump beam with electric field amplitude $E$ is coupled into a medium exhibiting a third-order $\chi^{(3)}$ optical nonlinearity. This yields an effective Hamiltonian in which $\gamma$ is the nonlinear coupling strength [30], $a_{f,out}$ is the mode of the signal transmitted through the cavity and filter, and $d_{1/2}$ are the modes coupled to the detectors. Given the input state $|n_{f,out}\, n_{d_1}\, n_{d_2}\rangle = |100\rangle$, the time evolution in the nonlinear medium is [30] $|\Psi(t)\rangle = \cos(|\gamma| t/\hbar)\, |100\rangle + i \sin(|\gamma| t/\hbar)\, |011\rangle$.
By setting the length of the nonlinear medium to $L = \frac{1}{2} \pi \hbar c_n |\gamma|^{-1}$, where $c_n$ is the speed of light in the medium, the output state is $|\Psi(t_{final})\rangle = |011\rangle$, corresponding to a photon in each detector mode $d_{1/2}$. It has been shown that this conversion can be performed with near-unit efficiency [30], hence we assume $\eta_\chi = 0.95$.
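The conversion dynamics can be illustrated with a two-state sketch restricted to the basis $\{|100\rangle, |011\rangle\}$. The coupled amplitude equations used below are an assumption chosen to reproduce the stated solution $\cos(gt)|100\rangle + i\sin(gt)|011\rangle$ with $g = |\gamma|/\hbar$; they are not taken from [30]. Units are arbitrary and the function name is hypothetical.

```python
import math

def evolve_pair_conversion(g, t_final, steps=2000):
    """Integrate dc100/dt = i g c011, dc011/dt = i g c100 with RK4.

    Returns the amplitudes (c100, c011) at t_final, starting from |100>.
    """
    def deriv(c100, c011):
        return 1j * g * c011, 1j * g * c100

    c100, c011 = 1.0 + 0.0j, 0.0 + 0.0j
    dt = t_final / steps
    for _ in range(steps):
        k1 = deriv(c100, c011)
        k2 = deriv(c100 + 0.5 * dt * k1[0], c011 + 0.5 * dt * k1[1])
        k3 = deriv(c100 + 0.5 * dt * k2[0], c011 + 0.5 * dt * k2[1])
        k4 = deriv(c100 + dt * k3[0], c011 + dt * k3[1])
        c100 += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        c011 += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return c100, c011

g = 1.0                          # |gamma| / hbar, arbitrary units (placeholder)
t_full = math.pi / (2.0 * g)     # interaction time for complete conversion
c100, c011 = evolve_pair_conversion(g, t_full)
p_pair = abs(c011) ** 2          # probability of finding the photon pair
print(f"pair probability at t = pi/(2g): {p_pair:.6f}")
```

The interaction time $t = \pi\hbar/(2|\gamma|)$ corresponds to a medium length of $c_n t$, which is the length condition stated in the text.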
Coincidence detection. Assuming a detection efficiency of 80% for a single detector [31], the efficiency for coincidence detection is $\eta_d = (0.80)^2 = 0.64$. The coincidence dark count rate is $R_{coincidence} = R_d^2 \cdot \tau_c$, where $R_d$ is the dark count rate of a single detector and $\tau_c$ is the coincidence timing resolution (time jitter). Coincidence counting thus suppresses dark counts quadratically in the single-detector dark count rate. For $R_d = 3.5$ Hz and $\tau_c = 30$ ps [31], the predicted coincidence dark count rate is $R_{coincidence} = 3.7 \cdot 10^{-10}\,\mathrm{s^{-1}}$.
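The coincidence numbers above follow directly from the two formulas in the text. The short sketch below evaluates them with the quoted detector parameters; the function name is hypothetical.

```python
def coincidence_dark_rate(single_rate_hz, timing_window_s):
    """Accidental coincidence rate of two independent detectors."""
    return single_rate_hz**2 * timing_window_s

single_efficiency = 0.80       # single-detector efficiency from the text
dark_rate = 3.5                # single-detector dark count rate in Hz
timing_window = 30e-12         # coincidence timing resolution in s

eta_d = single_efficiency**2
r_coincidence = coincidence_dark_rate(dark_rate, timing_window)

print(f"eta_d = {eta_d:.2f}")
print(f"R_coincidence = {r_coincidence:.3e} 1/s")
```

Halving the single-detector dark count rate would reduce the accidental coincidence rate by a factor of four, which is the quadratic suppression referred to above.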
Probe photons leaking through the system. There are two ways in which probe photons can leak through the system and potentially imitate a collapse signature. Firstly, optomechanical conversion processes can create photons at frequency $\omega_p$, or at a frequency reduced by an integer multiple $n$ of the mechanical resonance frequency, $\omega_p - n\Omega$. These processes are discussed in the following paragraphs on noise phonons. Their amplitudes are negligible due to strong suppression (see main text). Secondly, measurement noise is introduced by probe photons that are transmitted through the filter cavity and downconverted to a pair of photons in the nonlinear medium, imitating a signal. The probability $p_f$ of a probe photon with detuning $\Delta = \omega_s - \omega_p = \Omega$ being transmitted through both the near-critically coupled optomechanical system and a subsequent filter cavity of linewidth $\kappa_f$ and free spectral range $\omega_{fsr}$ follows from [29]. For the frequency of the proposed mechanical resonator of $\Omega/2\pi = 5.3$ GHz [32], a finesse of $3.16 \cdot 10^{5}$ as in a standard laser stabilisation reference cavity [27], and a cavity length $L$ of one centimeter, we find $\omega_{fsr}/2\pi = c/2L = 15$ GHz and $p_f = 3.5 \cdot 10^{-10}$. A noise photon is then downconverted to a photon pair and registered as a coincidence count with efficiency $\eta_\chi \eta_d$.
As the rate of incoming probe photons coupled into the probe mode in the steady state is $4\kappa_{p,ex}\bar{n}_{in}/\kappa_p = \bar{n}_p \kappa_p = \eta_p \Gamma$, the rate of coincidence counts due to noise photons, imitating a signal, is given by Eq. (11).
Noise phonons from direct occupation of the signal mode. A photon from the coherent probe laser can create a phonon by coupling into the signal mode $a_s$ instead of the probe mode $a_p$, which results in a direct occupation of the signal mode, $|n_b n_p n_s\rangle = |001\rangle$, as shown in Fig. 3 (a) in the main text. This process is suppressed due to the small spatiotemporal overlap $\Theta$ of the signal mode with the coherent laser beam, $\Theta = \kappa_s^2/(\kappa_s^2 + \Omega^2) \approx (\kappa_s/\Omega)^2 = 3.2 \cdot 10^{-5}$, where $\kappa_s/2\pi = 30$ MHz is the loaded decay rate of the signal mode. If the photon is outcoupled, it has a frequency of $\omega_p$ due to energy conservation and can be efficiently filtered from a signal at frequency $\omega_s$. However, a scattering process to the probe mode can create a noise phonon in mode $b$, imitating a decoherence signature. The efficiency of this Stokes scattering process is given by the probability of the initial state $|001\rangle$ causing the mechanical oscillator to be in the excited state, which equals the phonon occupancy after optical decay. $\eta_{Stokes}$ is shown as a function of $g_0/\kappa_p$, for $\kappa_{s,ex}/\kappa_p = 1$, $2$, and $4$, in Supplementary Figure 1 (b). Higher values of $\kappa_{s,ex}$ correspond to lower probabilities for this type of noise, as the decay process of rate $\kappa_{s,ex}$ competes with the Stokes scattering of rate $g_0$. For the proposed experiment with $\kappa_{s,ex}/\kappa_p = 1.8$ we find $\eta_{Stokes} = 0.17$. The probability of phonon creation due to this process, at time $t_0$ after incidence of the probe photon, is $p_{direct} = (\kappa_p/\kappa_{p,ex}) \cdot \Theta\, \eta_{Stokes} = 2.7 \cdot 10^{-5}$.
Scattering of noise phonons to signal photons. A spurious signal photon at frequency $\omega_s$ is created if a noise phonon scatters with a second photon entering the probe mode within the lifetime of the mechanical excitation. For timescales of the mechanical excitation lifetime, $t \sim \Gamma^{-1} \gg \kappa_p^{-1}, \kappa_s^{-1}$, after incidence of a probe photon, it is convenient to define the 'occupancy probabilities' of the one- and two-phonon states from the optomechanical processes described above: $p_{n_b=1}(t_0) = p_{direct} + p_{counterrot,1}$ and $p_{n_b=2}(t_0) = p_{counterrot,2}$, where $t_0$ is long compared to the timescales of the optical decay, so that all optomechanical conversions have concluded, but short compared to the mechanical decay ($\Gamma^{-1} \gg t_0 \gg \kappa_p^{-1}, \kappa_s^{-1}$). The time dynamics of these one- and two-phonon occupancy probabilities are denoted $p_{n_b=1}(t)$ and $p_{n_b=2}(t)$, respectively. The probability of these measurement-induced phononic states encountering a second probe photon is given by $\eta_p \Gamma \int_0^\infty p_{n_b=1}(t)\, dt$ and $\eta_p \Gamma \int_0^\infty p_{n_b=2}(t)\, dt$, respectively. The conversion efficiency of the one- and two-phonon state to a spurious photon outcoupled from the cavity is then given by $\eta_{om}$ and $\eta_{om,2}$, respectively, where the latter is calculated numerically and plotted as a function of $g_0/\kappa_p$ for $\kappa_{s,ex}/\kappa_p = 1$, $2$, and $4$ in Supplementary Figure 1 (c). The resulting probability of a noise photon of frequency $\omega_s$ being coupled out of the cavity due to optomechanical processes, potentially imitating a signal, is $p_{om}(t) = \eta_p \Gamma \int_0^t \left[ p_{n_b=1}(t')\, \eta_{om} + p_{n_b=2}(t')\, \eta_{om,2} \right] dt'$, (14) as shown in the main text in Fig. 4 (c). We find the asymptotic value $p_{om}(t \rightarrow \infty) = 8.4 \cdot 10^{-8}$. In analogy to Eq. (11), the rate of coincidence counts due to noise is given by $R_{om} = \eta_p \Gamma\, p_{om}(t \rightarrow \infty)\, \eta_f \eta_\chi \eta_d = 1.9 \cdot 10^{-10}\,\mathrm{s^{-1}}$.
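The quoted noise rate can be checked numerically from the values given in this note. The sketch below assumes $\Gamma/2\pi = 108$ mHz, as read off from the photoabsorption estimate that follows; this reading of the source notation is an assumption, supported by the fact that it reproduces the stated $R_{om}$. The leakage rate is computed under the further assumption that it has the analogous form $\eta_p \Gamma\, p_f\, \eta_\chi \eta_d$, which is inferred from the surrounding text rather than quoted from it.

```python
import math

# Parameters quoted in the text; gamma is an assumed reading (see lead-in).
gamma = 2.0 * math.pi * 0.108    # mechanical decay rate in 1/s
eta_p = 0.01
eta_f = 0.56
eta_chi = 0.95
eta_d = 0.64
p_om_inf = 8.4e-8                # asymptotic optomechanical noise probability
p_f = 3.5e-10                    # probe leakage probability through the filter

probe_rate = eta_p * gamma       # probe photons entering the probe mode per second

# Coincidence rate from measurement-induced noise phonons.
r_om = probe_rate * p_om_inf * eta_f * eta_chi * eta_d

# Coincidence rate from leaked probe photons (assumed analogous form).
r_leak = probe_rate * p_f * eta_chi * eta_d

print(f"R_om   ~ {r_om:.2e} 1/s")
print(f"R_leak ~ {r_leak:.2e} 1/s")
```

With these inputs the optomechanical noise rate exceeds the assumed leakage rate by roughly two orders of magnitude, consistent with the statement that measurement-induced phonons are the dominant noise source.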
Noise phonons due to photoabsorptive heating. The number of noise phonons originating from absorption heating is estimated following [33,34]. We compute the average phonon number excited by one probe photon in the mechanical resonator, $\bar{n}_{abs,1}$, due to photoabsorption at a time $t_{abs}$ at which the oscillator is in thermal equilibrium with the material, but not yet with the environment ($\Gamma^{-1} \gg t_{abs} \gg \gamma_{THz}^{-1}$, with $\gamma_{THz}$ the rate at which THz-frequency phonons radiate to the environment, see also Methods). We approximate $\bar{n}_{abs,1}$ by extrapolating from [17,34]. In this case, an average intracavity photon number $\bar{n}_{cav} = 1$ yields an average phonon number of $\bar{n}_{abs} = 10$ in the mechanical resonator. $\bar{n}_{cav} = 1$ is equivalent to $\kappa/\Gamma$ photons passing through the cavity within the lifetime $\Gamma^{-1}$ of the mechanical resonator, creating 10 phonons in total. The intrinsic optical decay is limited by photon absorption, $\kappa_0 \approx \kappa_{abs}$ and $\gamma_{THz} \approx \kappa_0/2\pi$, within the errors given. Therefore, from one photon we expect the induced occupancy $\bar{n}_{abs,1} = 10\, \Gamma/\kappa = 10 \cdot \frac{108\,\mathrm{mHz}/2\pi}{575\,\mathrm{MHz}/2\pi} \approx 1.9 \cdot 10^{-9}$. As with the optomechanical heating rates, a spurious signal will only occur if the phonons resulting from absorption heating interact with another probe photon. In analogy to Eq. (14), the probability of a probe photon creating a spurious signal photon at frequency $\omega_s$ due to absorption heating is $p_{abs}(t) = \eta_p \Gamma \int_0^t \bar{n}_{abs,1}\, e^{-\Gamma t'}\, \eta_{om}\, dt'$.
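The absorption estimate can be evaluated in closed form, since the integral over the exponential decay gives $p_{abs}(t) = \eta_p\, \bar{n}_{abs,1}\, \eta_{om} (1 - e^{-\Gamma t})$. The sketch below computes the per-photon occupancy from the quoted linewidths and evaluates this expression. The value $\eta_{om} \approx 0.32$ is an assumption, back-calculated from the overall efficiency quoted earlier, and the function name is hypothetical.

```python
import math

gamma = 2.0 * math.pi * 0.108    # mechanical decay rate in 1/s (Gamma/2pi = 108 mHz)
kappa = 2.0 * math.pi * 575e6    # optical decay rate in 1/s (kappa/2pi = 575 MHz)
phonons_per_unit_ncav = 10.0     # n_abs at n_cav = 1, from the text

# Phonons created per probe photon: 10 phonons shared among kappa/gamma photons.
n_abs_1 = phonons_per_unit_ncav * gamma / kappa

def p_abs(t, eta_p, eta_om):
    """Closed form of eta_p * Gamma * integral_0^t n_abs_1 exp(-Gamma t') eta_om dt'."""
    return eta_p * n_abs_1 * eta_om * (1.0 - math.exp(-gamma * t))

eta_p = 0.01
eta_om_assumed = 0.32            # assumption: implied by the quoted efficiency budget

p_abs_inf = p_abs(1e6, eta_p, eta_om_assumed)   # t >> 1/Gamma
print(f"n_abs_1    ~ {n_abs_1:.3e}")
print(f"p_abs(inf) ~ {p_abs_inf:.2e}")
```

Under these assumptions the asymptotic absorption-induced probability lies several orders of magnitude below the optomechanical contribution $p_{om}(t \rightarrow \infty) = 8.4 \cdot 10^{-8}$ quoted above.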
For the quadratic coupling approach, $\bar{n}_{cav} = 10^{2}$ (see main text), and the relation $\bar{n}_{abs} \propto \bar{n}_{cav}^{1/3}$ holds [33,34]; we find an average intracavity phonon number of $\bar{n}_{abs} \approx 5$. This corresponds to a noise phonon rate of $\dot{n} = \Gamma \bar{n}_{abs} \approx 3\,\mathrm{s^{-1}}$, close to seven orders of magnitude
Multiplexing. A photonic-phononic crystal, including suspension and phononic shield, requires an area of about $1000\,\mathrm{\mu m^2}$ [17]. It would thus be conceivable to fabricate a large number of them on a 4-inch wafer with an area of $\sim 2 \cdot 10^{9}\,\mathrm{\mu m^2}$. We assume $N \sim 10^{4}$, allowing more than 99% of the wafer to be reserved for waveguide coupling, fabrication tolerances, etc. In principle, all these devices may be connected to the same filter cavity, nonlinear medium, and detector pair, or to a small number of such elements.
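The area budget for multiplexing is a one-line estimate, sketched here with the figures quoted above. The variable names are hypothetical and the wafer area is taken as stated in the text.

```python
device_area_um2 = 1000.0     # footprint of one photonic-phononic crystal
wafer_area_um2 = 2e9         # usable wafer area as quoted in the text
n_devices = 10**4            # assumed number of multiplexed devices

used_fraction = n_devices * device_area_um2 / wafer_area_um2
reserved_fraction = 1.0 - used_fraction

print(f"area used by devices: {used_fraction:.2%}")
print(f"area left for coupling and tolerances: {reserved_fraction:.2%}")
```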