Noisy atomic magnetometry in real time

Continuously monitored atomic spin-ensembles allow, in principle, for real-time sensing of external magnetic fields beyond classical limits. Within the linear-Gaussian regime, thanks to the phenomenon of measurement-induced spin-squeezing, they attain a quantum-enhanced scaling of sensitivity both as a function of time, $t$, and the number of atoms involved, $N$. In our work, we rigorously study how such conclusions based on Kalman filtering methods change when inevitable imperfections are taken into account: in the form of collective noise, as well as stochastic fluctuations of the field in time. We prove that even an infinitesimal amount of noise disallows the error to be arbitrarily diminished by simply increasing $N$, and forces it to eventually follow a classical-like behaviour in $t$. However, we also demonstrate that,"thanks"to the presence of noise, in most regimes the model based on a homodyne-like continuous measurement actually achieves the ultimate sensitivity allowed by the decoherence, yielding then the optimal quantum-enhancement. We are able to do so by constructing a noise-induced lower bound on the error that stems from a general method of classically simulating a noisy quantum evolution, during which the stochastic parameter to be estimated -- here, the magnetic field -- is encoded. The method naturally extends to schemes beyond the linear-Gaussian regime, in particular, also to ones involving feedback or active control.


Introduction
Optical magnetometers based on atomic spin-ensembles [1] are considered today as stateof-the-art magnetic-field sensors competing head to head in sensitivity with SQUIDbased devices [2] without need of cryogenic cooling, while being already miniaturised to chip scales [3]. On one hand, they have been demonstrated to be capable of revolutionising medical applications [4][5][6][7], on the other, when interconnected into global networks, they are used in searches of new exotic physics [8,9].
Despite constituting a prominent example of quantum sensors [10], the nominal sensitivity of atomic magnetometers is commonly described within their "classical" regime of operation as follows [1]-also referred to as the "Equation One" [11]: where the (squared) error in sensing the value of the magnetic field, B, simply decreases (quadratically) with respect to the time over which the sensor gathers information about the field while undergoing Larmor precession-with the constant of proportionality given then by the effective gyromagnetic ratio γ. The maximal value of such probing time, T coh , is dictated by the time-scale within which the sensor can preserve its coherence, with T coh being determined by dominant noise-mechanisms exhibited by the atomic ensemble, e.g. spin-relaxation T coh = T * 1 and/or spin-decoherence T coh = T * 2 1 . Moreover, as "Equation One" applies (only) in the limit when a sufficient number of measurements are performed over the total sensing time T , it contains a division over the number of repetitions, T /T coh , in accordance with the central limit theorem. For the same reason, ∆ 2 B is further divided by the number of atoms N , as in the "classical" regime of operation the atoms within the ensemble can be treated as independent probes.
Although the derivation of "Equation One" may be considered "hand-waving", it can be formalised by following the so-called frequentist approach to estimation theory [13], which, indeed, applies in the limit of large number of repetitions (asymptotic statistics), here T T coh . In particular, the l.h.s. of equation (1) formally corresponds to the mean squared error (MSE) of an (unbiased) estimator of B built basing on all the collected measurement data [10], while its minimum can then be generally associated with the Cramér-Rao Bound (CRB) [13], which effectively represents the r.h.s. of equation (1). As a result, one can then answer fundamental questions using techniques of quantum metrology [14,15], in particular, by how much can the sensitivity (1) be improved by allowing for arbitrary quantum states of the atomic ensemble and measurements more general than the natural light-probing scheme based on the Faraday effect [16]. This has lead to the seminal observation that "Equation One" can be breached-in particular, its 1/N -behaviour commonly referred to as the Standard Quantum Limit (SQL)-by preparing the atomic ensemble in an entangled state [17], so that the MSE can in principle attain the ultimate Heisenberg Limit ∼ 1/N 2 [18].
Unfortunately, such claims about the attainable scaling of precision with N have been proven to be overoptimistic in the asymptotic N limit, whenever one accounts for decoherence-noise-whose strength does not effectively decrease with the number of atoms 2 [15,20]. In particular, the noise when affecting each atom in an uncorrelated manner constrains the quantum improvement to a constant factor beyond the SQL [21,22], while when disturbing the whole ensemble in a collective way enforces a positive lower bound on ∆ 2 B that cannot be overcome also when letting N → ∞ [15,23,24]. Nonetheless, for finite but very large atomic ensembles (N 10 5 ), multiple optical-magnetometry experiments have spectacularly demonstrated that sensitivities beyond SQL can be reached [25][26][27]. In such experiments, the spin-ensemble is prepared in an (entangled) spin-squeezed state [28] every time before using it to sense the external magnetic field, B, which importantly cannot vary over the process of performing the necessary number of experimental repetitions.
The above paradigm, however, must be abandoned if the sensing task considered requires tracking of the magnetic field in real time, and one cannot assure the same magnitude of the field to be probed sufficiently many times. This may occur whenever the field follows a predetermined time-varying waveform subject to stochastic fluctuations, or its behaviour in time is simply not known at all [29]. Still, the continuous quantum measurement theory [30] allows one, in principle, to describe then the dynamics of an optical magnetometer conditioned on the data collected in real time [31,32]-also beyond the "classical" regime-while the ability to perform quantum non-demolition measurements [33,34] of the atomic ensemble continuously in time [35][36][37] appears to be natural. Indeed, first experiments in this direction have been conducted demonstrating the capability to preserve quantum enhancement when waveforms of a known shape are being probed [38], or a fluctuating signal is sensed by a magnetometer operating in the "classical" regime [39]. Moreover, the Bayesian approach to estimation theory [13] provides analogous tools to constrain the attainable "single-shot" precision-via the Bayesian Cramér-Rao Bound (BCRB) [40] and its variations [41]-that have been also generalised to the quantum setting [42].
In case of atomic magnetometry schemes in which spin-squeezing is realised continuously in time by light-probing based on the Faraday effect [43], the Bayesian inference techniques strikingly predict the average mean squared error (aMSE) 3 to follow at short timescales the Heisenberg limit in absence of decoherence [44,45]: where in comparison to the (classical) "Equation One" (1) also the scaling in time is improved from 1/T to 1/t 3 -we use above small t instead to emphasise that it now represents the time along a single experimental trajectory. Equation (2) has been subsequently adapted to account for stochastic fluctuations of the field [46,47]. However, it has never been rigorously generalised to take into account the impact of decoherence beyond numerical considerations [45] and models that do not incorporate measurement back-action in real time [48], also when allowing for arbitrary quantum measurements to be performed at the end of the sensing procedure [49,50]. In our work we tackle this problem by including the collective noise-general anisotropic decoherence affecting the atomic spin-ensemble as a whole-into the analysis, while similarly considering the short time-scales, i.e., the linear-Gaussian regime, for which equation (2) was derived [44,45]. In particular, we focus on the canonical optical magnetometry setup, in which all the atoms are initially polarised along one direction before being continuously spin-squeezed in time via the mechanism of light-probing in a perpendicular direction [31,32,44]. We explicitly account for the presence of collective noise, as well as stochastic fluctuations of the field, and construct the optimal Bayesian estimator, i.e. the Kalman filter (KF) [51,52], whose minimal error we are able to resolve in time. Furthermore, we demonstrate how to generalise and adapt the theoretical techniques previously developed to deal with noise within the frequentist approach to quantum metrology [22,53], in particular, the classical simulation (CS) method [54], which allows us to establish fundamental upper bounds on the attainable precision dictated by the decoherence. As a result, we are able to prove that, although at short timescales, when the collective decoherence can be effectively ignored, the error ∆ 2B follows the 1/t 3 -behaviour of equation (2), it is subsequently degraded to 1/t once the impact of decoherence "kicks-in". Moreover, at large times at which the KF attains its optimal performance in tracking the fluctuating field-its steady state-the error reaches the minimal value that is determined by both the decoherence rate and the strength of field fluctuations. Focussing instead on the dependence of error on the ensemble size, we show that the Heisenberg-like behaviour 1/N 2 is similarly lost as N grows and the impact of collective noise becomes significant. In particular, the error always approaches a constant value dictated by the decoherence rate. However, our method allows us to crucially demonstrate that the precision achieved by the magnetometry setup here considered saturates the ultimate bound for long times and large ensembles, and hence should be considered as the optimal realistic scheme to track fluctuating magnetic fields in presence of collective noise.
The manuscript is organised as follows. In section 2 we describe the atomic magnetometer of interest, in particular, the corresponding experimental setup as well as the resulting dynamical model of the spin-ensemble being probed continuously in time by light. This allows us to explain in detail in subsection 2.3 the linear-Gaussian regime being considered, and discuss the impact of noise on the evolution of the spinsqueezing parameter in subsection 2.4. The section 3 is then devoted to the explanation and derivation of the Kalman filter as the optimal estimator, as well as the precision it achieves in real time. In section 4 we present how the classical simulation method can be utilised to derive the ultimate limit on precision dictated by the decoherence, stemming from the Bayesian Cramér-Rao Bound. Consequently, in subsection 4.3 we discuss the different regimes the error follows both in time and N , and give an intuitive Noisy atomic magnetometry in real time 6 explanation of their origin. Moreover, we demonstrate that the continuous estimation scheme saturates the ultimate limit derived in section 4 and, hence, may be regarded as optimal. Finally, we conclude our findings in section 5.

Setup
The optical magnetometry scheme we consider, see figure 1a, consists of an ensemble of N spin-1/2 atoms prepared in a coherent spin state (CSS) polarized along the xdirection [28], so that the initial mean and variance of the ensemble angular momentum operators, denoted by the vectorĴ(t) = (Ĵ x (t),Ĵ y (t),Ĵ z (t)) T in the Heisenberg picture, read Ĵ (0) CSS = (J, 0, 0) T and ∆ 2Ĵ (0) CSS = (0, J/2, J/2) T , respectively, where J = N/2. As shown in figure 1b, one may then naturally visualise the distribution of the angular momentum with help of the Bloch sphere representation. The aim of the scheme is to measure and estimate in real-time the magnetic field B t being directed along y, which fluctuates according to the Ornstein-Uhlenbeck (OU) stochastic process [55]: where by dW B we denote the Wiener noise of zero mean, E[dW B ] = 0, and variance The magnetic field B t induces a Larmor precession around the y-axis at the frequency ω L (t) = γB t , with γ being the effective (constant) gyromagnetic ratio. However, as discussed in more detail below, in our analysis we restrict to dynamics at small times in the presence of small magnetic fields, under which the angular momentum operatorĴ(t) deviates only slightly from pointing along the x-direction-the direction along which the atoms are initially pumped. In principle, this regime could also be achieved over larger time-scales in experiments involving stroboscopic probing [56][57][58][59]. However, such implementations would then require the atoms to be initially pumped in the direction perpendicular to the magnetic field, as depicted in figure 1.

System and measurement dynamics
The atomic ensemble is continuously monitored by exploiting the paramagnetic Faraday rotation effect [16], which twists the linear polarisation of the (off-resonant) light propagating through the ensemble in the z-direction-the light probe, see figure 1a. As a result, a quantum non-demolition measurement is realised [33,34] that is undertaken continuously in time [35][36][37], which can be effectively described by means of the homodyne-like continuous measurement [60] with the shot-noise in the measured signal taking the form of a Wiener process [43,61]. Note that, even if higher-spin atoms are considered, the following measurement model still applies as long as the contribution from the atomic polarizability tensor component can be suppressed [43,62,63]. In particular, the photocurrent being measured at time t is proportional to the mean value Noisy atomic magnetometry in real time The magnetometry scheme involves an atomic ensemble optically pumped along the x-direction (red arrow ) into a coherent spin-state (CSS). The magnetic field being sensed is directed along y, while the Faraday-rotation-based continuous measurement is performed by using the light-probe propagating along z (blue arrow ), which yields a photocurrent signal y(t) being recorded. (b) Bloch sphere representation of the angular momentum of the ensemble prepared in a CSS along x (with red and blue arrows indicating the pumping and probing directions, respectively).
of the collective atomic spin-component along the probe, Ĵ z (t) (c) , i.e.: where η is the detection efficiency, M is the measurement strength, and dW is the Wiener differential fulfilling E[dW 2 ] = dt according to the Itô lemma [55]. Importantly, due to the active influence the continuous measurement (4) exerts on the atoms, the ensemble dynamics becomes conditional-denoted by the subscript (c)and the atoms evolve differently depending on a particular trajectory of the measurement outcomes registered up to (and including) time t, y ≤t = {y(τ )} 0≤τ ≤t [61]. In particular, the ensemble is characterized at time t by the quantum state conditioned on the past measurement record, ρ (c) (t) ≡ ρ(t|y ≤t ), which undergoes further conditional dynamics described by a stochastic master equation: where the superoperators D and H are defined as The first term in (5) arises simply from the free Hamiltonian,Ĥ = γB tĴy , responsible for the Larmor precession of the ensemble spin. In contrast, both terms in (6) describe the continuous measurement ofĴ z introduced above [31,32], of which the former is responsible for measurement-induced decoherence, or the backaction, while the latter constitutes the non-linear information gain provided when observing Noisy atomic magnetometry in real time 8 a particular measurement signal in (4) [30]. Furthermore, in our analysis we model all possible, e.g. environment-induced, decoherence mechanisms affecting the ensemble (as a whole) by introducing the second term in (5) with three dissipative components being parametrised by distinct effective rates, γ α , in the three directions α ∈ {x, y, z}. Let us also emphasise that the Wiener differential, dW , appearing in (6) is the same as in the measurement dynamics (4)-a particular fluctuation of the photocurrent in (4) after being registered drives the conditional state of the ensemble according to (6).
Finally, let us highlight the difference between the fluctuations of the estimated magnetic field B t , as dictated by the OU process (3), and the collective decoherence introduced above in (5), of which only the latter should be interpreted as a form of "noise"-in the statistical sense, forcing the quantum state of ensemble to lose its purity over time. In particular, due to its presence, the expectation value ofĴ α (α = x, y, z) exponentially decreases with a rate of γ α , whose inverse can be identified as the phenomenological (ensemble) spin-decoherence time T * 2 [12]. Within the theory of open quantum systems [64], this may be associated with tracing out some environmental degrees of freedom or the environment itself monitoring continuously each component J α , whose stochastic trajectory of outcomes is inaccessible and, hence, must be averaged out. On the contrary, the above dynamical model (3)(4)(5)(6) describes the evolution along a single trajectory of the fluctuating magnetic field. As a result, the impact of field fluctuations on the performance in magnetometry should not be treated on equal grounds with the decoherence of the atomic ensemble, but rather associated with the inability to perfectly estimate the field in real time due to its stochastic nature. In fact, if one on purpose decided to ignore the field fluctuations and rather consider the effective dynamics averaged over all possible field trajectories (see Appendix F), one would recover the above model of collective decoherence in the direction of the field (here, J y ), but with a decoherence rate that accumulates as γ y ∼ t 2 over time.

Linear-Gaussian regime
In order to define the regime of interest in which the atomic sensor operates, let us first consider the unconditional dynamics of the atomic ensemble, which can be simply obtained by dropping the stochastic term in (5-6), so that it now describes the evolution of the atomic state at time t, averaged over all the stochastic trajectories of the measured outcomes, i.e. [55]: where by dy ≤t we denote the integral over all possible measurement records y ≤t = {y(τ )} 0≤τ ≤t . Using equation (7) to define the dynamics of any observableÔ(t) in the Heisenberg picture, whose mean must then obey Ô (t) = Tr Ô (t)ρ = Tr Ô ρ(t) , we observe that the (unconditional) evolution of the mean spin-component along the direction of pumping, Ĵ x (t) , is governed by the solution to the following set of coupled differential equations: of which the last (stochastic) one just corresponds to the OU process with E[dW 2 B ] = q B , describing the magnetic-field fluctuations in (3).
Although this implies that the dynamics of Ĵ x (t) is stochastic, we show in Appendix A that, if one focuses on short timescales such that ω L (t) t 1, where ω L (t) = γ|B t | is the time-averaged Larmor frequency, then the mean spin-component along the direction of the pump must decay exponentially as follows: with an effective decay rate: r = M + γ y + γ z . For this to be true, not only the timeaverage value of the magnetic field, B t = 1 t t 0 dτ B τ , must be small (|B t | 1 γt ), but also the distribution of B t should be narrow enough, which we ensure by verifying that E B t ± 2 Var B t 1 γt according to the 68-95-99.7 rule for normal distributions (see also figure 3a). We explicitly prove in Appendix A that for approximation (12) to hold it is sufficient to consider short timescales such that r t 1, while also requiring the parameters of the OU process in (11) (or (3)) to obey: Furthermore, by restricting to short timescales at which the approximation (12) is valid, we effectively deal with spin dynamics in the regime in whichĴ(t) only slightly deviates from its initial x-polarisation. This allows us then to perform the Gaussian approximation [45,65] and introduce the effective (time-dependent) quadrature operators: which satisfy the canonical commutation relationship [X(t),P (t)] = iĴ x | Ĵ x(t) | ≈ i, as long as | Ĵ x (t) | 1 [45,65], which is assured within the approximation (12) whenever J 1-recall J = N/2 10 5 in optical-magnetometry experiments [25][26][27]. As a consequence, it is enough to consider the evolution of the first and second moments of the quadratures, so that when focussing on the probed spin-component in the z-direction that fulfilsĴ z ≈ √ Je −rt/2P (t), its conditional dynamics (5-6) is completely described by the following set of equations: which assume [X(t),P (t)] ≈ i to hold-satisfied whenever J 1, rt 1 and the conditions (13) are valid. Note that equations (15)(16)(17) are linear with respect to one another, while equation (18) can be solved independently. Moreover, they involve only Gaussian (in particular, Wiener) stochastic-noise terms, and are derived based on the Gaussian approximation (14). Hence, we refer in short to the evolution described by (15)(16)(17)(18) as the (conditional) dynamics ofĴ z in the linear-Gaussian regime.

Continuous spin-squeezing
Firstly, let us note that within the linear-Gaussian regime, the equations of motion (15)(16)(17) are independent of the decoherence rate γ x , which is thus redundant. This is a consequence of the approximation (12) (see also Appendix A) and can be intuitively explained-deviations ofĴ(t) from the x-direction are then too small for the collective noise generated by the D[Ĵ x ]-term in equation (5) to have any effect on the quadratures (14). Furthermore, the decoherence rate γ z is also unnecessary, since by transforming the parameters of the continuous measurement (4) as follows: M → M − γ z , η → ηM/(M − γ z ) and y → y M/(M − γ z ); we retrieve the conditional dynamics (15)(16)(17)(18) with γ z = 0. Hence, the impact of the collective noise generated by the D[Ĵ z ]-term in equation (5) instead, can always be interpreted and incorporated into a modified form of the continuous measurement (4).
As a result, without loss of generality, we restrict the effective decay rate of Ĵ x (t) introduced in (12) to take the form r = M + γ y , when considering the (conditional) dynamics ofĴ z in (15)(16)(17). We then solve for the dynamics of ∆ 2Ĵ z (t) (c) in (18), which constitutes a fully decoupled differential equation. For the complete analytical solution we refer the reader to Appendix B, where we also show that the time-dependence of the (conditional)Ĵ z -variance (satisfying ∆ 2Ĵ z (0) (c) = ∆ 2Ĵ z (0) CSS = J/2 at t = 0) can be described by two distinct short-time and long-time behaviours: where t * = (2J M γ y η) −1 is the transition time between the two regimes.
Importantly, note that t t * implies 2Jtγ y γ y /(ηM ). Then, if also γ y < ηM , we may infer 2Jtγ y 1 and approximate 1 + 2Jtγ y ≈ 1 in (19a). As a result, we then recover the noiseless (γ y = 0) solution for the variance within the short-time regime, previously found by Geremia et al. [44], i.e.: despite the actual presence of the noise (γ y > 0). In fact, we prove also in Appendix B that ∆ 2Ĵ z (t) (c) is a non-decreasing function at t ≈ 0 if γ y ≥ ηM . Hence, the noise may be considered negligible at small times only if γ y < ηM .
The dynamics of ∆ 2Ĵ z (t) (c) enables us to study the phenomenon of spin-squeezing of the atomic ensemble [28] by directly computing the (conditional) squeezing parameter introduced by Wineland et al. [66] along the pumping x-direction: which when satisfying ξ 2 (t) < 1 indicates a gain in interferometric precision over the CSS [66]. In particular, just like we decomposed ∆ 2Ĵ z (t) (c) according to (19), we can similarly split the behaviour of the squeezing parameter in the linear-Gaussian regime, in which Ĵ x (t) ≈ Je −r t/2 , as follows From now on in all plots, in order to confirm that apart from (13) the condition rt 1 (with now r = M +γ y ) is satisfied and, hence, the linear-Gaussian approximation (12) is valid, we introduce the rescaled time t S = (M + γ y )t and limit its range to t S 1, what assures the timescale of the linear-Gaussian regime to always be consistently considered.
In figure 2 we present explicitly the exact dynamics of the squeezing parameter (21) for two important cases: (a) -when γ y < ηM and the spin-squeezing (ξ 2 (t) < 1) occurs within a finite-time window (see figure 2a); and (b) -when γ y ≥ ηM for which spinsqueezing is forbidden (see figure 2b). In both cases, it is evident that the exact solution for ξ 2 (t) very closely follows the two-regime behaviour in (22), with a clear transition at t ≈ t * . Moreover, as seen explicitly from the two-regime solution (22)-and the exact solution that can be straightforwardly derived from the exact evolution of thê J z -variance (19)  In what follows, we turn to the problem of sensing the magnetic field, B t , in real time by constructing the optimal estimator,B t , of its fluctuating value in the form of a -14 -12  (21) is plotted as a function of rescaled time t S = (M + γ y )t for the case of γ y = 10mHz < M = 100kHz (subplot (a)) and γ y = 100M Hz > M = 100kHz (subplot (b)). The exact function ξ 2 (t) (solid blue) is compared with its two different regimes ξ 2 <t * (t) and ξ 2 >t * (dashed yellow and green, respectively), as well as the noiseless solution when γ y < M (dashed red ). The other parameters used to generate the plots are: J = 10 9 , γ = 1kHz/mG, and η = 1.
Kalman filter [51,52]. However, we have already included in figure 3 the evolution of the corresponding minimal estimation error, ∆ 2B t , in order to emphasise the fact that the stochasticity of the B t -signal affects the estimation procedure, but not the spinsqueezing phenomenon. In particular, for a given magnetic-field trajectory generated according to the OU process (3)-of a magnitude that does not invalidate the linear-Gaussian regime, see figure 3a-we obtain a measurement signal y(t) from which we can infer via (4) the conditional dynamics of the mean spin-component Ĵ z (t) (c) , see figure 3b. Now, the evolution of its variance, ∆ 2Ĵ z (t) (c) , allows us to compute the squeezing parameter ξ 2 (t), which, as described above, evolves in the linear-Gaussian regime in the same manner for any particular stochastic trajectory of the magnetic field-see figure 3c. In contrast, the stochastic fluctuations of B t affect the estimation procedure, so that the Kalman filter reaches a constant error (the steady state) after a certain time despite the atomic ensemble still being spin-squeezed (ξ 2 (t) < 1). This can be directly seen by comparing the evolution of ∆ 2B t in figure 3d against the one of ξ 2 (t) in figure 3c for t S 10 −3 . The transition time when this occurs depends on the parameters of the OU process (3)-as later shown in section 4.2.

Field estimation with Kalman filtering
The goal within the magnetometry scheme considered here is to most accurately infer the true value of the magnetic field at a time t, given a particular measurement signal y ≤t recorded up to this time instance. In order to do so, we seek the optimal estimator . Dynamics along a single trajectory. Subplot (a) shows the simulated magnetic field along its time average (solid red ) and confidence intervals (orange and green) assuring validity of the linear-Gaussian approximation. Subplot (b) presents the normalized conditional evolution ofĴ z driven by the particular B-field trajectory depicted above. Subplot (c) compares the squeezing parameter with (blue) and without (orange) decoherence present. Finally, subplot (d) juxtaposes the minimal estimation error of the field (solid blue line) with its different scalings in time: noiseless-like (∝ 1/(t 3 J 2 ), dashed blue), "classical"-like (∝ 1/t predicted by the classical simulation (CS) limit, dashed black ), and the one dictated by the steady state (∝ 1, dashed red ). All the plots are made with respect to the rescaled time t S = (M + γ y )t, and the other parameters used are: χ = 0, q B = 100G 2 /s, η = 1, γ y = 10mHz, M = 100kHz, J = 10 9 , and γ = 1kHz/mG. B t (y ≤t ) of B t that minimises the average (Bayesian) mean squared error (aMSE) [13]: Noisy atomic magnetometry in real time

14
where by dB t we denote the integral over the values that the magnetic field may take at time t (only), i.e. the random variable B t with probability distribution: which for the OU process (3) can be determined once the a priori distribution p(B 0 ) is specified, e.g. a normal distribution of variance σ 2 0 (see Appendix E.1). From the Bayesian perspective, p(B 0 ) should most accurately describe the knowledge an experimentalist possesses about the field at t = 0, prior to taking any measurements. Note that, as in our work we are interested in estimation of B in real time, we will not consider the setting in which one seeks the optimal estimator for some time t < t and accounts also for "future" measurement-data collected in the time-window [t , t]. In that case, one should resort to the inference methods of Bayesian smoothing [67][68][69], rather than to the filtering ones that are relevant for our purpose.
The optimal estimator minimising the aMSE is always given by the mean of the posterior distribution [40], i.e.: where in the last step we have used p(B t |y ≤t ) = dB <t p(B ≤t |y ≤t )-the fact that the probability of the magnetic field taking at time t the value B t , given a particular measurement trajectory y ≤t , can be obtained by averaging over all past B-field stochastic trajectories up to (but not including) time t. However, since the problem of interest is described by a set of equations (15)(16)(17) that are linear and Gaussian, the posterior distribution p(B t |y ≤t ) does not have to be explicitly reconstructed. Instead, the optimal estimator (25) can be determined by solving the so-called Kalman-Bucy equation [40] and is commonly referred to as the Kalman filter (KF) [51,52]. The name "filter" originates from the fact that the optimal estimator (25) at time t can be constructed basing solely on the previous-step estimate and the current measurement outcome y t , so there is, in principle, no need to store all the measurement data.
In order to formulate the problem within the KF-framework [70,71], we first define the state vector of the variables Ĵ z (t) (c) and B t to be estimated as well as the measurement vector y t = t 0 y(τ )dτ , which takes, however, a scalar form with only one d.o.f. being continuously measured. Consequently, we then identify the corresponding noise vectors as dw t = (dW, dW B ) T and dv t = √ η dW , respectively, so that the system of equations (15)(16)(17) can be compactly rewritten as: where The self-and cross-correlations between stochastic noise-terms are then given by where are the covariance and cross-correlation matrices (or scalars) of the process and measurement noises, respectively. Let us note that we have used (31), as no cross-correlations are present between the B-field fluctuations and the measurement noise; whereas, because q B ≥ 0 and η ≥ 0, the covariances of process and measurement noises consistently satisfy Q t , R t ≥ 0. As discussed in more detail in Appendix C, the optimal estimatorx t = ( J z (t) (c) ,B t ) T that minimises the overall aMSE-formally corresponding to the trace of the covariance matrix is given by the correlated KF [70]. In particular, focussing on the optimal covariance matrix that the filter yields, its dynamics is then determined by a non-linear differential equation of the Riccati form [70]: Mt Mt  (36) for an infinitely wide prior (σ 0 → ∞) compared with its small-and long-time approximations (39), for parameters: M = 100kHz, η = 1, and γ = 1kHz/mG. dynamics of theĴ z -variance is then given by equation (20). Moreover, since no fluctuations of the field are being considered, the noise correlation matrix just reads Q t = diag(1, 0). As a result, the decoupled system of differential equations introduced in (34-35) can be analytically solved, providing the solution for the minimal aMSE in estimating the B-field (23) as where we have emphasised its dependence on the width of the prior, σ 0 , and defined: with . Note that for an infinitely wide prior, σ 0 → ∞, b(t) = c(t) and the aMSE (36) matches consistently the solution obtained in [44] with ∆ 2BHL t (∞) exhibiting the non-classical scaling in both t and J-following the Heisenberg limit (HL) ∼ 1/N 2 with N = J/2-as mentioned already in equation (2). This becomes clear after making the approximation summarised in figure 4, i.e.: where the terms (39a) and (39b) are obtained by Taylor-expanding ∆ 2BHL t (∞) in equation (36) to leading order in time t and (JM t) −1 , respectively.

Impact of Noise
In figure 3d we have already presented (solid blue line) the behaviour of the minimal aMSE, ∆ 2B t , in presence of noise (γ y > 0), which we have determined by numerically solving the Riccati equation (33) after assuming no initial knowledge about the field (σ 0 → ∞), and setting γ y < ηM . The latter condition is necessary for the minimal aMSE to initially follow the noiseless solution introduced in (39) and, hence, exhibit a "supraclassical" scaling in time at short timescales, ∼ 1/t 3 , as only then the spinsqueezing induced by the continuous measurement can occur-the conditional variance ofĴ z decreases with time, see (20) and Appendix B, so that ξ 2 (t) < 1 in equation (22), as depicted in figure 2a.
Importantly, although the spin-squeezing is maintained at larger timescales (see figure 3c), the noiseless behaviour of aMSE in figure 3d is quickly lost due to noise. In particular, the minimal aMSE firstly follows a classical-like scaling, ∼ 1/t, before attaining a constant value at timescales t S ∼ 1, at which still ξ 2 (t) < 1. In the upcoming sections we explicitly show how the presence of collective decoherence and field fluctuations causes the "supraclassical" scaling to be lost both in time t and the atom number N (J = N/2). We achieve this by giving analytical solutions to what in figure 3d we refer to as the classical simulation limit and the steady state.

No-go theorem for the Heisenberg limit: the classical simulation method
Within the Bayesian approach to estimation theory there exist various general lower bounds on the minimal aMSE, which are commonly referred to as Bayesian (or global) Cramér-Rao Bounds (BCRBs) [72]. Still, in the case of linear and Gaussian stochastic processes many of these simplify to a single form [40,41]. When interested in estimating the process at its last step-here, the magnetic field B t at the final time t-the applicable bound is the so-called marginal unconditional BCRB [41]: where J B is the Bayesian information (BI) [40] evaluated at time t: The BI can be conveniently split into two terms: where J P represents our knowledge about B t prior to any estimation, and J M accounts for the contribution of the measurement record y ≤t . Formally, the BI associated with our prior knowledge, J P , corresponds to the Fisher information (FI) [13] evaluated with respect to the estimated field value B t for the distribution p(B t ) defined in (24), i.e.: In contrast, J M reads where is now the FI of the distribution p(y ≤t |B t ) evaluated again with respect to B t . The prior contribution to the BI (43) can always be ignored by letting the prior distribution describing our knowledge about the field at t = 0, p(B 0 ), be of infinite width [40]-see Appendix E.1 where we demonstrate that J P = 0 for σ 2 0 → ∞. In contrast, the contribution of the measurement record to the BI (44) in general requires one to explicitly evaluate the FI of p(y ≤t |B t ) for every value B t may take. In this work, we avoid doing so by constructing an upper-bound on F [p(y ≤t |B t )] that is determined by the noise in the atomic magnetometry scenario, which turns out to be independent of the form of the continuous measurement and the initial state of the atoms.
In order to do so, we discretize the time evolution such that t = k δt-this does not prevent us from obtaining the results in the continuous-time limit, which can be recovered by finally letting δt → 0 [30]. As a result, the continuous-measurement record y ≤t can be described by a finite set of (time-ordered) outcomes, y k = {y 0 , y 1 , . . . , y k }, collected at the end of each time interval indexed by j = 0, 1, . . . , k. Similarly, the evolution of the magnetic field corresponds now to a discrete-time stochastic process, B k = {B 0 , B 1 , . . . , B k }, with B j representing the value taken by the B-field throughout the jth time interval. Moreover, the conditional quantum state of the ensemble at time t = kδt, previously described by ρ (c) (t) ≡ ρ(t|y ≤t ) as the solution to the conditional dynamics (6), can now be explicitly written as where the continuous measurement is formally described by a set of measurement operators 5 E y j that form a positive-operator valued measure (POVM), i.e. {E † y j E y j } y j with y j E † y j E y j = 1 for every j. In the expression (46), every Λ B j denotes the quantum channel (a completely positive trace-preserving map) describing the evolution of the atomic ensemble during the jth interval in between measurements. Such evolution will then solely incorporate the Larmor precession under the magnetic field B j (constant in that time interval), and the collective decoherence. A scheme describing this sequence of channels interspersed with measurements is depicted in figure 5a (top).
The denominator of the expression (46) should be interpreted as the probability of registering the measurement record y k given a particular (discrete) trajectory of the B-field B k , i.e.: Moreover, its marginal distribution can then be determined using the Bayes' rule as constituting the discrete-time equivalent of p(y ≤t |B t ) appearing in equation (44). The crucial step in our construction is the observation that the dynamics of the atomic ensemble in between measurements can be classically simulated (CS) [22,53,54]. In particular, each quantum channel Λ B j in equation (46) can be equivalently replaced by a probabilistic mixture of unitary evolutions such that only the classical mixing probability depends on the instantaneous B-field value B j -see figure 5b. In other words, we "shift" the encoding of the parameter B t from a quantum channel to a classical probability, which we show to be Gaussian. Namely, for any j = 0, 1, . . . , k: where ω j is an auxiliary frequency-like random variable distributed according to the Gaussian distribution: which parametrises now a noiseless Larmor precession of the ensemble within the jth time interval described by the unitary transformation: with the HamiltonianĤ =Ĵ y consistent with the dynamics (5). A step by step demonstration of this statement is presented in Appendix E.2. Moreover, within the setting here considered, the random variable B j follows a (discrete-time) OU process (3), while the initial B 0 is drawn from a Gaussian prior distribution of variance σ 2 0 . As a consequence, by now defining the set of particular frequencies ω j chosen in each time interval, ω k = {ω 0 , ω 1 , . . . , ω k }, we can interpret the conditional distribution of the measurement record (47) as a probabilistic average over the frequency-like parameters: where q(ω k |B k ) = k j=0 q(ω j |B j ) is just a product distribution and is now the conditional probability of detecting a set of measurement records y k , given a particular set of frequencies ω k dictating subsequent unitary evolutions of the atomic ensemble within each time interval. The decomposition (52)  , where the mixing probability q(ω j |B j ) has a B j -dependence, the whole scheme can effectively be interpreted as if the consecutive field-values B j (shaded in yellow ) are first encoded into the mixing probabilities (a, bottom), which only then determine the unitary channels to be applied in the sequence. As a result, any inference strategy with access to all the random variables ω j (shaded in red ) can only perform better than the one based on the actual measurement outcomes y j (shaded in green).
unitary evolutions, whose frequencies ω k are drawn from a distribution q(ω k |B k ) containing all the information about the field trajectory B k . We represent this equivalence schematically in figure 5a.
Focussing on the marginal distribution of interest (48), we substitute the decomposition (52) into it, in order to obtain where we can now identify S ω k →y k [ • ] = dω k p(y k | ω k ) • as a stochastic map independent of the estimated B k , and define the probability distribution function: Noisy atomic magnetometry in real time 21 which effectively describes the information about B k contained within q(ω k |B k ). Crucially, as the FI is generally contractive (monotonic) under the action of stochastic maps [13,73], we can now upper-bound F [p(y ≤t |B t )] in (45) as follows and evaluate the FI of P B k (ω k ), which we denote as P Bt (ω <t ) upon letting δt → 0 (with t = kδt) to recover the continuous-time limit. In Appendix E.3, we explicitly show that for the case of infinitely wide (σ 0 → ∞) Gaussian prior distribution p(B 0 ), the FI reads and being independent of the estimated B t directly sets an upper bound in J M , i.e.: Hence, by using also the fact that J P = 0 when σ 0 → ∞, we conclude that where the last inequality follows from x coth(x) ≥ 1 for any x > 0 with the CS limit, ∆ 2BCS t (q B ), monotonically increasing with q B -as expected, the error is predicted to deteriorate with an increase in the strength of the field fluctuations.
Although we differ the full proof of the upper bound (59) to Appendix E.3, let us note that in the simplest scenario with the magnetic field being time-invariant, the CS limit can be straightforwardly derived and takes the form ∆ 2BCS t (0) as in equation (62). For a constant B-field, the probability distribution (55) simplifies to the product P B (ω k ) = q(ω k |B) = k j=0 q(ω j |B), whose FI is now evaluated with respect to B and reads: so that the CS limit (62) then directly follows.

22
In general, the CS limit ∆ 2BCS t (q B ) defined in (61) can always be approximated independently of the value of q B by splitting the timescales into two regimes, as follows: where the expression (64a) implies that the CS limit (61) always simplifies at short times to its form applicable in absence of field fluctuations, i.e. ∆ 2BCS t (0) in (62). Finally, let us highlight that the derivation of the CS limit is independent of the measurement dynamics and the initial state, and depends only on the form of the noisy quantum channel describing the evolution of the system in each discretised step between subsequent measurements.

Steady-state solution of the Kalman filter
Returning to the particular measurement model and the estimation strategy considered, we determine the steady state (SS) solution of the KF derived in section 3, as it may then be directly compared with the fundamental CS limit determined above. Still, for simplicity, we focus here only on the SS solution when the magnetic field fluctuates according to the Wiener process [55], i.e. the special case of the OU process (3) with χ = 0. The complete solution for χ > 0 is presented in Appendix D.
In general, the SS is attained by the KF when its covariance matrix Σ t no longer changes in time, so that dΣt dt = 0 in equation 33. This corresponds to the solution of equating the r.h.s. of (33) to zero-solving the corresponding (non-linear) algebraic Riccati equation [70]. In our case, finding the SS solution becomes much easier after noting that for t t * (guaranteed as t → ∞ for SS) ∆ 2Ĵ z (t) (c) ≈ V >t * (t), as in (19b). Then, the SS solution for ∆ 2B t = [Σ t ] 22 can be shown (see Appendix D) to read whose second term in the parenthesis-the one surviving in the absence of noise (γ y = 0)-has been derived previously in [46]. In contrast, it is the first term arising due to the decoherence considered here (γ y > 0) that always dominates for large J.
In particular, at the relevant timescales t (M +γ y ) −1 whenever J γ γy q B /(ηM ), the second term within the brackets in (65) becomes negligible and the SS solution (65) coincides with the CS limit (64b) valid at long timescales, i.e.: which moreover implies that-see also the main plot of figure 6-∆ 2BSS t ≈ ∆ 2BCS t (q B ), as long as further t 1 γ γ y /q B for which the CS limit takes the form (64b).

23
Note that this proves that in presence of decoherence (γ y > 0) for large enough ensemble size (J = N/2 1) the chosen type of continuous measurement (homodynelike model) and the initial state of the atomic ensemble (CSS) always give the best possible precision in estimating the fluctuating magnetic field (q B > 0) in the SS regime-bearing in mind that the linear-Gaussian approximation made throughout our work must hold. Crucially, this conclusion holds for large atomic ensembles also at short timescales at which the SS solution does not apply, as we will now explicitly show.

Different regimes exhibited by the estimation error
Once the CS limit and the SS solution of the KF have been explicitly derived, we finally have all the information we need to fully understand the evolution of the estimation error in figure 3d, which we supplement now with additional plots of the aMSE with respect to time and ensemble size (J = N/2) in figures 6 and 7, respectively.

Estimation error as a function of time
We depict the time evolution of the estimation error, i.e. the aMSE of the KF, in figure 6. Importantly, as shown in the main plot, for "large" ensembles (we quantify "large" below) the aMSE attains the fundamental CS limit (61) that holds for any potential measurement and estimation strategy, hence, proving that the magnetometry scheme achieves then the ultimate precision dictated by the decoherence.
In such a setting, the behaviour of error as a function of time can be intuitively explained. Firstly, at very short timescales for which the atomic decoherence can be ignored, the aMSE follows the noiseless-like (or supra-classical) scaling of 1/t 3 . Once the impact of decoherence "kicks in", its behaviour transitions to a "classical"-like regime exhibiting 1/t scaling, as determined by the CS limit in (64a). At long times, at which the field fluctuations are optimally compensated by the KF, it reaches its steady state (SS) and the error saturates at a constant value √ q B γ y /γ. However, let us emphasise again that this value arises due to the presence of decoherence (γ y > 0), which in this case dominates the SS solution of the KF in equation (66) and, in fact, assures the error to attain the ultimate CS limit, in particular, its long-time behaviour (64b). In contrast, for "small" ensembles, as presented in the inset, the aMSE does not reach the CS limit. It is so, as the impact of the atomic decoherence can then be actually neglected in comparison to the field fluctuations. As a consequence, the KF reaches quickly with time its SS solution, which is now effectively independent of the decoherence, i.e. ∆ 2BSS t ≈ 4 q 3 B /(M ηγ 2 J 2 ) in equation (65). One can then interpret the magnetometer to operate in a "perfect" manner, with its precision being dictated fully by just the characteristics of the field.
Note that the above two settings are formally distinguished by whether the SS solution in (65) attains or not the long-time CS limit (64b)-or, in other words, whether the green line can cross the black line in figure 6. Hence, this condition defines naturally what constitutes a "large" or "small" ensemble above, i.e., J J CS or J J CS ,   (68), between the noiseless-like (∝ 1/t 3 , rosé), "classical"-like (∝ 1/t, orange) and steady-state (= √ q B γ y /γ, green) regimes. Within the latter two, the CS limit (61) is importantly saturated. In contrast, for "small" ensembles of J γ γy q B /(ηM ) (see the inset) the aMSE of the KF saturates, directly after exhibiting the noiseless-like behaviour, the SS solution (65) at t SS defined in (68), without attaining the CS limit at all. Other parameters are set to: M = 100kHz, γ = 1kHz/mG, q B = 100G 2 /s, γ y = 100mHz, χ = 0 and η = 1, while the rescaled time is defined as t S = (M + γ y )t. respectively, where follows from the derivation of equation (66). Moreover, by evaluating the times at which the dominant behaviours of the error appearing in figure 6 cross one another, we explicitly determine the transition times between all the different regimes: which are valid as long as t (M + γ y ) −1 (and the linear-Gaussian approximation holds). In particular, t CS marks the transition time of the aMSE from the noiseless-like to the "classical"-like regime, whose later transition to the steady-state regime occurs   Figure 7. Estimation error as a function of the ensemble size at "short" (main plot) and "long" (inset) timescales: here at t S = 10 −4 and t S = 10 −2 rescaled times, respectively. The Solid blue lines represent the average mean squared error (aMSE) of the Kalman filter (KF), while the dashed lines of different colours denote: the noiseless solution (red ), the classical simulation (CS) limit (black ), the CS limit in the absence of field fluctuations (orange), and the steady-state (SS) solution of the KF (green). At "short" timescales of t γ y /(γ 2 q B ) (main plot) the SS solution does not apply even for J → ∞. The aMSE, after initially following with J the Heisenberg-like behaviour (∝ 1/J 2 , shaded in rosé), saturates around J CS (defined in (69) and marked with a black dashed vertical line) at a constant value-given by the CS limit γ y /(γ 2 t) derived in (64a). In contrast, at "long" timescales (inset) the aMSE of the KF is explicitly given by its SS solution (65) for all J above J SS defined in (69) (green-shaded area). As a result, the aMSE, after following with J initially the Heisenberg-like behaviour up to J SS , firstly scales as 1/ √ J before saturating around J CS defined in (67) again at the CS limit, which is now given by the expression (64b) instead. As in figure 6, we set the other parameters to: M = 100kHz, γ = 1kHz/mG, q B = 100G 2 /s, γ y = 100mHz, χ = 0 and η = 1, while the rescaled time is defined as then at t CS . Similarly, for "small" ensembles t SS indicates when the aMSE reaches its steady state.

Estimation error as a function of the number of atoms
The aMSE as a function of the ensemble size (J = N/2) is presented in figure 7, where we show two distinct scenarios differentiated by the timescales considered. Still, we observe that the ultimate CS limit dictated by the decoherence is attained in both cases as long as a sufficiently large ensemble is considered-proving then the measurement scheme and the estimation procedure to be optimal.
In particular, for "short" timescales (main plot), the error as a function of J initially follows the Heisenberg-like scaling 1/J 2 before attaining the short-time CS limit (64a) once the effect of the collective noise becomes significant. For "long" timescales (inset), the KF optimally compensates for the field fluctuations and its SS solution applies as long as J is sufficiently large. Still, the long-time CS limit (64b) is achieved as J → ∞, as it dictates in this case the SS solution (66) affected by the collective noise (γ y > 0). The time threshold differentiating between these two cases, t CS , is then given by (64)coinciding consistently with the definition in (68). Importantly, both when the aMSE (solid blue line) attains the CS limit at "short" timescales t t CS (the main plot of figure 7), or at "long" timescales t t CS (the inset of figure 7), it saturates at a constant value as J is increased. In each case, that occurs at different ensemble sizes: J CS and J CS , respectively. In the latter case, one can also identify a threshold value J SS , above which the aMSE exhibits a 1/ √ J-scaling before saturating at J CS . All these can be determined by comparing the dominant behaviours of the error within each regime, and read: while J CS coincides consistently with the definition (67).

Conclusions
We have studied the problem of sensing a magnetic field in real time within the canonical atomic magnetometry setting-a polarised spin-ensemble is being continuously probed in the perpendicular direction to induce spin-squeezing of the atoms, so that a quantumenhanced precision in estimating the field can be maintained. Within our model we have importantly incorporated both the stochastic fluctuations of the field (in the form of an Ornstein-Uhlenbeck process) as well as collective decoherence (affecting the ensemble as a whole) into the magnetometer conditional dynamics, depending on a particular measurement record collected continuously in time.
As a result, while considering the magnetometer evolution at short timescales within the linear-Gaussian regime, we have computed explicitly the optimal estimator-the Kalman filter-and studied the behaviour of its error both in time, t, as well as the effective size of the atomic ensemble, J = N/2 with N being the total number of atoms. Moreover, we have developed a classical simulation method thanks to which we have established a fundamental limit on the precision that is induced solely by the decoherence. Crucially, as the so-obtained limit applies to any type of statepreparation and continuous-measurement scheme-as long as the conditional dynamics of the magnetometer in between subsequent measurements is not changed-it has allowed us to prove that the continuous measurement model of our interest may often be considered to be optimal in presence of any, even infinitesimal, noise.
In particular, the corresponding average mean squared error (aMSE) of the Kalman filter, which at short timescales always follows the quantum-enhanced behaviour in both time and the number of atoms, i.e. 1/t 3 and 1/N 2 , respectively, is bound to saturate eventually the limit dictated by the noise. From the perspective of the time dependence, the noise constrains the aMSE to follow a "classical"-like 1/t scaling before the Kalman filter reaches its steady-state solution-which, in fact, coincides with the ultimate longtime precision induced by the decoherence for large atomic ensembles. If instead we focus on the dependence of the aMSE with respect to the number of atoms N , we observe that the impact of the collective noise is even more drastic, as the aMSE (after following the Heisenberg-like behaviour 1/N 2 ) saturates eventually at a constant value-whether or not the Kalman filter operates within the steady-state regime. Furthermore, our analysis allows to straightforwardly estimate both the times and the numbers of atoms at which the transitions between such regimes occur.
Our work paves the way for finding new methods of incorporating effects of decoherence in real-time sensing protocols, while stemming from Bayesian inference techniques combined with tools previously developed within noisy quantum metrology. In particular, as the classical simulation method we have invoked relies on properties of the effective quantum channel describing the dynamics, and not the design of the continuous-measurement scheme under study, e.g. any adaptive control operations that it incorporates, it should also be directly applicable to sensing protocols involving quantum feedback [74]. On the other hand, although within our work we have focussed on the estimation task in which only past measurement data may be used for inference (filtering), we believe that our results can be naturally extended to smoothing protocols [67][68][69] that include also retrodiction of data, and have been recently implemented experimentally [58,59]. Moreover, although we have dealt here with the setting of atomic magnetometry, let us stress that the techniques we have presented can also be applied to other real-time sensing platforms, e.g.: cavitybased experiments with cold atoms incorporating feedback [75,76], requiring similar continuous-measurement theory [77]; or optomechanical devices [78][79][80] and levitated nanoparticles [81,82] that naturally evolve respecting Gaussian dynamics, and hence directly require Kalman-filtering and alike techniques.

Appendix A. Unconditional dynamics of Ĵ x (t) with field fluctuations
The unconditional evolution of Ĵ x (t) can be computed from the stochastic set of differential equations (9)(10)(11), where B t follows the OU process described in (3). Although the full system of differential equations can in principle be solved numerically, we would like to find approximate analytical solutions valid in particular parameter regimes.
In particular, we focus on timescales short enough, such that we can assure ω L (t) t 1, where ω L (t) = γB t is the instantaneous Larmor frequency following the fluctuations of B t . In order to identify and ensure such timescales, we enforce ω L (t) t 1, where ω L (t) = γB t is now the time-average over the duration t with Replacing B t by B t within the system of differential equations (9-11) and treating it as a constant, we solve them for Ĵ x (t) to get Then, by expanding the above expression to leading order in B t , we obtain where in (A.4) we have further assumed t (M − γ x + γ z ) −1 , so that e (M −γx+γz)t/2 ≈ 1 + 1 2 (M − γ x + γ z )t + 1 8 (M − γ x + γ z ) 2 t 2 holds up to the second order in t. Hence, it seems that we may approximate the dynamics of the mean value of the spin-componentĴ x as which constitutes the basis for the linear-Gaussian approximation introduced in equation (12), as long as: ; of which the latter condition allows us to neglect the term in the parenthesis in (A.4). Moreover, as the noise-parameter γ x does not enter the dynamics any more, after letting γ x ≈ 0 and reparametrising the measurement strength as in section 2.4, M → M −γ z , the condition (i) simplifies to t 1/M and is naturally satisfied by the more stringent requirement on the rescaled time t S = (M + γ y )t to always obey t S 1, which we ensure throughout the main text.
However, we must be more careful when ensuring the validity of condition (ii). As B t follows a stochastic process, the time-average B t defined in (A.1) is a random variable in itself. Therefore, we must further assure that the condition (ii) holds for almost all stochastic trajectories of B t . We achieve this by applying the 68-95-99.7 rule, which allows us to state that if then the approximation (A.5) of (A.4) is valid with 95% probability. Evaluating the mean of B t defined in (A.1), we obtain where E[B τ ] = B 0 e −χt for the OU process (3) [55]. Similarly, we calculate the variance of the time-averaged value of the magnetic field, B t , as using the expression for the two-time correlation function of the OU process (3), i.e. [55]: and evaluating Although when constructing the optimal estimator (KF) of B t at finite time t > 0 we do not want to assume any prior knowledge about the value of the initial field B 0 , for the purpose of this calculation we take the magnetic field to always be initialised in B 0 = 0, so that it is solely the presence of the OU process (3) that invalidates the   condition (A.6) over time. In such a case, we have where the above approximation holds as long as t hold. Throughout our work we assure these by focussing on the first one and considering the rescaled time t S = (M + γ y )t to always fulfil t S 1 (as also done in figure A1). However, we must then also require the field fluctuations to be small enough such that the other two conditions are satisfied for any t S 1. In particular, this is achieved by considering the decay parameter and the fluctuation strength of the OU process (3) to fulfil for any t (M + γ y ) −1 : which we more generally state for r = M + γ y + γ z in equation (13)  where α = 2J η γ y M /(M + γ y ) and β = αe −(M +γy)t/2 . The behaviour of the solution (B.2) can be better understood when broken down into different regimes.
In order to do so, the first step is to expand the modified Bessel functions and the regularized confluent hypergeometric functions around infinity up to leading ordersuch an approximation is assured due to α 1 and β 1. The relevant expansions are shown in table B1.
√ πα Table B1. Series expansions of the Bessel functions for 1/α and 1/β around 1/α 0 = 0 and 1/β 0 = 0, stated to leading order.    Figure B1. In subfigure (a) with γ y = 10mHz < M , the exact variance solution ∆ 2Ĵ z (t) (c) is compared to the approximated functions V <t * (t) and V >t * (t) (dashed green and yellow, respectively). The transition time t * between these two regimes is marked with a dotted black vertical line. In subfigure (b) with γ y = 100M Hz > M , the two different regimes V <t * (t) (dashed green) and V >t * (t) (dashed yellow) are superimposed with the exact solution of ∆ 2Ĵ z (t) (c) (in solid blue). Notation t S refers to a rescaled time, t S = t(M + γ y ). All plots have been generated with M = 100kHz, γ = 1kHz/mG, η = 1, and J = 10 9 .
defined with help of (positive semi-definite) covariances Q t , R t ≥ 0. Moreover, the cross-correlations between dw t and dv t are specified by the matrix S t : which is not necessarily symmetric. Note that the matrices Q t , R t and S t are not fully independent: since the covariance of the "overall" noise dT = dw t ⊕ dv t , i.e.
must be positive semi-definite by definition (constituting an outer product of a vector), the form that matrices Q t , R t and S t can take is then constrained. Next, it is convenient to rewrite the system of differential equations (C.1-C.2) as [70]: where D(x t , t) can be set arbitrarily, since dy t − C(x t , t)dt − dv t = 0. By regrouping the terms in equation (C.7) we obtain with dZ t = B(x t , t)dw t − D(x t , t)dv t being now the effective process noise. This allows us to ensure that the correlations between the process and measurement noises, dZ t and dv t , respectively, are vanishing-and after setting D(x t , t) = B(x t , t)S t R −1 t without loss of generality we have As a result, we obtain a set of stochastic differential equations that are equivalent to (C.1-C.2): (C.10) where the process noise dU t = B(x t , t) −1 dZ t = dw t − S t R −1 t dv t is now uncorrelated from the measurement noise, E dU t dv T t = 0, but the dynamics of the state x t explicitly depend on the observation y t in (C.11). Moreover, in contrast to (C.3-C.4), the selfcorrelations of process and measurement noises now read Now, in case of a linear model, the dynamical matrices in (C.1-C.2) simplify to: so that the system of equations (C.10-C.11) reads In such a special case, the optimal estimatorx t minimising the overall aMSE, which is defined as the trace of the covariance matrix is referred to as the Kalman filter (KF) and given by the solution of the so-called Kalman-Bucy equation [70]: where Γ t is the Kalman gain Nonetheless, let us note that in order to evaluate the Kalman gain (C.20), one must determine beforehand the (optimal) covariance matrix Σ t , which evolves according to the non-linear differential equation in the Riccati form [70]: From the practical perspective, however, the solution to equation (C.22) can be determined (often only numerically) and stored in advance, so that in real-life applications the construction of the KF,x t , as the solution to equation (C. 19) can still performed fast and "on the fly", while the observations y t are constantly gathered.
Appendix D. Steady-state solution of the Kalman filter for χ = 0 Let us consider the Riccati differential equation (C.22) (equivalent to (33)), which specifies the evolution of the covariance matrix Σ t for the KF with the dynamical matrices F t , B t , and H t , as well as the noise-correlation matrices Q t , R t , and S t defined in equations (28) and (32) of the main text, respectively. For simplicity, we rename the elements of the covariance matrix as so that by resorting to the evolution of Σ t in (C.22) and setting dΣ dt = 0-with (C.22) formally constituting then a continuous-time algebraic Riccati equation (CARE) [70]we obtain a system of equations describing the steady state as where we have used the fact that the variance ∆ 2Ĵ z (t) (c) for t t * (and, hence, in the steady state) equals V >t * (t) as defined in equation (19b).
Being interested only in the steady-state (SS) solution for the aMSE of the magnetic field, ∆ 2BSS t ≡ z(t), one may explicitly solve for z(t) in (D.2) to obtain which for J 1 can be approximated by performing the Taylor expansion in 1/J to first order, as follows: In contrast, by letting χ → 0 in equation (D.3), we obtain the exact steady-state solution for the aMSE, when the magnetic field follows a Wiener (rather than the OU (3)) process: which is the expression stated in equation (65) of the main text.
Appendix E. Derivation of the CS limit for a fluctuating field After discretising the dynamics of the magnetic field B t and the measurement outcomes y t into k = t/δt steps, their particular trajectories follow discrete-time stochastic processes being described by (time-ordered) sets: respectively, while the marginal conditional BCRB (40) after the kth step then reads .
In the first subsection of this appendix, we compute the prior contribution to the BI, i.e. J P above in (E.2). In the second subsection, we show how the quantum channel describing the dynamics of the atomic ensemble in the absence of continuous measurement can be decomposed into a probabilistic mixture of unitary channels. Finally, in the last subsection, we calculate the FI of P B k (ω k ) defined in (55), which allows us to upper-bound the contribution of the measurement records to the BI, i.e. J M above in (E.2).

Appendix E.1. Prior contribution to the Bayesian information
The marginal probability density function of the magnetic field at time t-or equivalently after the time-step k = t/δt-can be written as: where the initial field B 0 is drawn from a Gaussian distribution which effectively specifies our a priori knowledge about the field at t = 0, i.e.: while each transition probability p(B j |B j−1 ) for all j = 1, 2, . . . , k is given by the OU process (3) as a Gaussian distribution [55]: Hence, after explicitly evaluating the multiple integrals in (E.3), we arrive at the marginal Gaussian distribution for B k : whose mean is zero and its variance which in the case of a magnetic field following a Wiener process-a special case of the OU process (3) with χ = 0-simplifies to lim χ→0 V (k) P = kδtq B + σ 2 0 . In general, the FI evaluated with respect to a Gaussian distribution, e.g. the marginal distribution (E.7), corresponds to the inverse of its variance. Hence, the prior contribution to the BI defined in equation (43) of the main text simply reads , (E.9) and in the continuous-time limit of δt → 0 with k = t/δt, it becomes .

(E.10)
In the special case of a Wiener process (χ → 0), equation (E.10) reduces to lim χ→0 J P = (q B t + σ 2 0 ) −1 . On the other hand, for any χ, q B , t ≥ 0, expression (E.10) vanishes if we do not possess any prior knowledge about the initial field B 0 , i.e. when we consider σ 0 → ∞ in equation (E.4), for which lim σ 0 →∞ J P = 0. Theorem 1. The quantum map Λ τ obtained by averaging over ξ after a time τ , i.e.: corresponds to the solution of the master equation: where the effective time-dependent frequency and decay parameters read, respectively: Proof. By explicitly differentiating ρ τ defined in (E.13) with respect to τ , we obtain: where by using relations for the moments of a Gaussian distribution we can further simplify the following expressions: As a result, we can write the dynamics (E.17) as which is the desired form stated above in equation (E.15).
Corollary. For the case of N spin-1/2 particles evolving for a time τ according to the dynamics (E.15) withĤ =Ĵ y , ω(τ ) = γB and Γ(τ ) = γ y , i.e.: with B being constant over the time τ , the effective quantum map describing the evolution Λ τ in (E.13) can be interpreted as mixture of unitary channels with the Gaussian mixing probability (E.12) of mean µ τ = γB and standard deviation σ τ = γy τ .
Noisy atomic magnetometry in real time

40
The above statement can be straightforwardly verified by explicitly computing the effective time-dependent frequency and decay parameters in (E.16) for the choice of µ τ = γB and σ τ = γy τ , which consistently simplify to: Appendix E.3. Fisher information of P Bt (ω <t ) As discussed in the main text, upper-bounding the measurement-record contribution to the BI, i.e. J M defined in equation (44), corresponds effectively-see inequality (59)-to computing the FI: for the probability distribution defined in equation (55): Note that P B k (ω k ) contains a Gaussian marginal distribution p(B k ) specified in (E.7), as well as products of Gaussian distributions: p(B k ) = k j=1 p(B j |B j−1 )p(B 0 ) and q(ω k |B k ) = k j=0 q(ω j |B j ), (E. 26) where p(B j |B j−1 ) is the Gaussian transition probability of the OU process defined in (E.5) with variance V P given by (E.6), p(B 0 ) is the prior Gaussian distribution with zero mean and variance σ 2 0 , while each q(ω j |B j ) is the mixing probability introduced within the CS method in equation (50), which is also a Gaussian with mean γB j and variance It follows from the definitions in (E.26) that the integral in (E.25) can be rewritten as a set of nested integrals, i.e.: dB k−1 p(B k ) q(ω k |B k ) = dB k−1 k j=1 p(B j |B j−1 )q(ω j |B j )p(B 0 )q(ω 0 |B 0 ) = q(ω k |B k ) dB k−1 p(B k |B k−1 )q(ω k−1 |B k−1 ) . . . dB 1 p(B 2 |B 1 )q(ω 1 |B 1 ) dB 0 p(B 1 |B 0 )q(ω 0 |B 0 )p(B 0 ), (E. 28) which we would like to simplify. To do so, we first need to prove the following lemma: Lemma 2. Let us consider a recurrence relation between P j (B j ) and P j−1 (B j−1 ) for j = 0, 1, 2, . . . as a generalised convolution of Gaussian distributions: where V P , V Q ≥ 0 and P 0 (B 0 ) = C 0 e − (B 0 −µ 0 ) 2 2V 0 with some fixed C 0 , V 0 ≥ 0 and µ 0 ∈ R. Then, for all j ≥ 1: where the parameters C j , µ j and V j are given as the solution to the following (coupled) recurrence relations: Proof. As for any recurrence problem, it is sufficient to prove that the solution (E.30) holds for j = 0, and that the recurrence relation (E.29) is fulfilled for any j ≥ 1. The first part is trivially satisfied by definition, while to prove the latter we substitute P j−1 (B j−1 ), defined according to (E.30), into (E.29) and explicitly perform the integration, i.e: (E. 34) where α and β are constants independent of B j and B j−1 , and equal to Hence, by 'completing the square' we rewrite (E.34) as (E. 37) and, after substituting for α and β from (E.35-E.36), we arrive at the expression (E. 30) for P j (B j ) with C j , µ j and V j specified by the recurrence relations (E.31-E.33). Now, using the above lemma we may rewrite equation (E.28) as dB k−1 p(B k ) q(ω k |B k ) = q(ω k |B k )P k (B k ), (E. 38) with P k (B k ) being now defined according to equation (E.30) with variances V P and V Q in (E.31-E.33) specified in our case by equations (E.6) and (E.27), respectively. Consequently, the FI of P B k (ω k ) in equation (E.24) reads where the expression (E.40) follows from the fact that we are dealing with a product (and quotient) of Gaussian distributions within log(. . . ) in (E.39), so that the FI becomes just the sum (and difference) of the inverses of their respective variances. In particular, V Q /γ 2 is the variance of q(ω k |B k ) when treating B k as the random variable, V (k) P is the variance of p(B k ) specified in (E.8); while V k is the variance of P k (B k ) given by the recurrence relation (E.33) that must still be solved.

Appendix F. Effective dynamics averaged over the field fluctuations
We use the results obtained above in Appendix E.2, in order to answer the question of what would the effective ensemble dynamics be, if one was not to use inference techniques such as the Kalman filter in order to track the magnetic field in real time, but rather ignore and, hence, average over the field fluctuations. For that purpose, let us ignore the impact of continuous measurement on the atomic ensemble, and focus on the ensemble dynamics dictated only by the unitary evolution: where B t follows a Wiener process dB t = √ q B dW t , such that E[dW 2 t ] = dt. Given that the ensemble is initially prepared in a pure state |ψ 0 , its state at time t then reads |ψ t = e −iγĴy t 0 Bτ dτ |ψ 0 . (F.2) The integral of a Wiener process, t 0 W (τ ) dτ with W (t) := t 0 dW t , constitutes a random variable distributed normally with zero mean and variance t 3 /3 [55]. Hence, upon defining being averaged over all potential stochastic trajectories of the variable Z t . Now, inspecting Theorem 1 and noticing that Equation (F.5) is just a special case of Equation (E.13), we may directly conclude from (E.15) that the average dynamics is described by the following master equation with the resulting decoherence rate given by which is obtained by substituting into Equation (E.16) σ 2 t = γ 2 q B t/3, i.e. the variance of the random variable Z t .
The above short derivation demonstrates that averaging over field fluctuationsinstead of following a single trajectory, as done by resorting to, e.g., the Kalman filtereffectively leads to a collective noise in the direction of magnetic field (here, in the eigenbasis ofĴ y ), whose decoherence rate (F.7), however, increases quadratically with time, Γ(t) ∼ t 2 . This contrasts the collective noise model considered throughout the manuscript, whose decoherence rate is constant, γ y = 1/T * 2 in Equation (5), being determined by the phenomenological (ensemble) spin-decoherence time T * 2 .