Result of the MICROSCOPE weak equivalence principle test

The space mission MICROSCOPE dedicated to the test of the equivalence principle (EP) operated from April 25, 2016 until the deactivation of the satellite on October 16, 2018. In this analysis we compare the free-fall accelerations ($a_A$ and $a_B$) of two test masses in terms of the Eötvös parameter $\eta(A,B) = 2\,\frac{a_A - a_B}{a_A + a_B}$. No EP violation has been detected for two test masses, made from platinum and titanium alloys, in a sequence of 19 segments lasting from 13 to 198 h, down to the limit of the statistical error, which is smaller than $10^{-14}$ for $\eta(\mathrm{Ti}, \mathrm{Pt})$. Accumulating data from all segments leads to $\eta(\mathrm{Ti}, \mathrm{Pt}) = [-1.5 \pm 2.3\,(\mathrm{stat}) \pm 1.5\,(\mathrm{syst})] \times 10^{-15}$, showing no EP violation at the level of $2.7 \times 10^{-15}$ if we combine stochastic and systematic errors quadratically. This represents an improvement of almost two orders of magnitude with respect to the previous best such test performed by the Eöt-Wash group. The reliability of this limit has been verified by comparing the free falls of two test masses of the same composition (platinum), leading to a null Eötvös parameter with a statistical uncertainty of $1.1 \times 10^{-15}$.


Introduction
The Equivalence Principle (EP) is the foundation stone on which Einstein built his new theory of gravitation, General Relativity (GR) [1,2]. GR has become an essential element in our description of the macroscopic universe, from the big bang to black holes and gravitational waves. GR has passed with flying colours many stringent experimental tests (for reviews, see, e.g., [3] and chapter 21 in [4]).
However, fundamental physics is facing several conundrums which suggest the need to extend our present theoretical framework. On the gravity side, the missing mass problem [5,6] and the acceleration of the cosmic expansion [7,8] have motivated the search for modifications of GR. On the particle-physics side, the peculiar structure of the Standard Model (SM), the hierarchy of particle masses, the observed preponderance of matter over antimatter, and the presence of several different gauge symmetries with a curious symmetry-breaking pattern are some of the puzzles that motivate the search for extensions of the SM (notably supersymmetric ones [9]).
Most of the attempts to go beyond GR or beyond the SM, including the attempts to unify all interactions, have suggested the existence of new particles and of new interactions. In many cases, these new interactions give rise to apparent violations of the EP by predicting additional long-range feeble forces that do not couple, as Einsteinian gravity does, to the total mass-energy of a body. For instance, many theories including those with extra dimensions, from the Kaluza-Klein model [10,11] up to string theories [12], suggest the existence of a light spin-0, dilaton-like, particle. Such a light scalar field can be made compatible with current solar system tests if some screening mechanism is at work [13-21]. The coupling to matter of a dilaton-like particle is expected to violate the EP at a small level (see, e.g., Refs. [14,17,22,23]). Another possibility is the existence of a very light spin-1 U boson, related to an extension of the SM gauge group, mediating a new EP-violating force [24,25].
The EP, or more precisely the weak equivalence principle (WEP), states that two bodies of different compositions and/or masses fall at the same rate in the same gravitational field (universality of free fall, UFF); equivalently, it states the equivalence of the "inertial" and "gravitational" masses. Since its use by Einstein in 1907 as a starting point of GR, it has been experimentally tested with higher and higher precision. Tests of the WEP are usually presented in terms of the Eötvös ratio $\eta$ [26], defined as the normalised difference of accelerations (or equivalently, as the normalised difference of gravitational-to-inertial mass ratios) of two test bodies in the same gravitational field [3]:
$$\eta(2,1) = 2\,\frac{a_2 - a_1}{a_2 + a_1} = 2\,\frac{m_{G2}/m_{I2} - m_{G1}/m_{I1}}{m_{G2}/m_{I2} + m_{G1}/m_{I1}}, \tag{1}$$
where $a_j$ is the acceleration of the $j$th test body, and $m_{Gj}$ and $m_{Ij}$ are its gravitational and inertial masses. Since previous experiments have shown that $m_G/m_I$ does not differ from 1 by more than about $10^{-13}$, the quantity
$$\delta(2,1) = \frac{m_{G2}}{m_{I2}} - \frac{m_{G1}}{m_{I1}}, \tag{2}$$
which we directly measure, differs from the Eötvös parameter $\eta(2,1)$ only by terms of order $[\eta(2,1)]^2$, and may in practice be identified with it.
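As a quick numerical illustration of the definition of the Eötvös ratio and of the quadratic combination of errors quoted in the abstract, the following minimal sketch may help. The function names and the toy accelerations are assumptions made for this example only; they are not part of the mission software.

```python
import math


def eotvos(a_a: float, a_b: float) -> float:
    """Normalised difference of two free-fall accelerations."""
    return 2.0 * (a_a - a_b) / (a_a + a_b)


def combine_quadratically(stat: float, syst: float) -> float:
    """Root-sum-square of two independent error contributions."""
    return math.hypot(stat, syst)


# Hypothetical accelerations (m/s^2) differing by one part in 1e9.
# Real violations, if any, are far too small to be resolved this way:
# the experiment measures the difference directly, not two absolute values.
a_b = 7.9
a_a = a_b * (1.0 + 1.0e-9)
eta_toy = eotvos(a_a, a_b)

# Statistical and systematic errors from the abstract, in units of 1e-15.
total_error = combine_quadratically(2.3, 1.5)
print(f"eta_toy = {eta_toy:.3e}, combined error = {total_error:.2f}e-15")
```

The combined error rounds to the $2.7 \times 10^{-15}$ level quoted in the abstract.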
Tests of the UFF have a long history, starting with Galileo Galilei (1638) and Newton (1687), and continuing to the end of the 20th century after Fischbach et al. [27] revived the interest in experimental searches for new, WEP-violating interactions. The state-of-the-art experiments have measured $|\eta| <$ a few $10^{-13}$ (see Ref. [3] for a historical account of tests of the WEP): (i) the Eöt-Wash group used a high-precision torsion pendulum in the Earth and Sun gravitational fields [28,29], and (ii) Lunar Laser Ranging has been used to monitor the motions of the Moon and the Earth around the Sun [30,31], leading to a slightly better accuracy; but in this case $\eta$ tests a combination of effects due to composition differences between Earth and Moon (related to the WEP), and effects due to the self-gravity of each body (related to the strong EP).
Concepts for an EP test in space were first developed at Stanford (STEP project) to cope with the limitations of ground experiments [32,33]. MICROSCOPE was the first space experiment to test the WEP and hence also UFF. Test masses in orbit follow quasi-infinite and purer free falls in a quieter environment free of seismic disturbances and anthropogenic electromagnetic perturbations. The satellite was launched into a low-Earth, 710 km sun-synchronous orbit by a Soyuz rocket from Kourou on April 25, 2016. It delivered data for more than two years.
The satellite carries the Twin Space Accelerometers for Gravitation Experiment (T-SAGE) payload (Fig. 1). T-SAGE is composed of two sensor units called SUREF (Sensor Unit for Reference) and SUEP (Sensor Unit for the Equivalence Principle test). Each sensor unit includes two inertial sensors (or accelerometers), each one controlling one test mass. The test masses are concentric, co-axial hollow cylinders. Choosing cylindrical shapes for the test masses makes it possible (i) to nest them with their centres of mass at the same position, (ii) to approximate their tensor of inertia by that of a sphere with an appropriate choice of dimensions (although, unlike for a sphere, the moments of order > 2 are not null), and (iii) to optimise the capacitive sensing along the cylinders' common axis [34]. SUREF's test masses are made of the same material (PtRh10), while SUEP's are made of different materials (PtRh10 for the inner mass, Ti alloy for the outer mass [35]). The PtRh10 platinum-rhodium alloy contains 90% by mass of Pt (A = 195.1, Z = 78) and 10% of Rh (A = 102.9, Z = 45). The isotopic composition of Pt has been measured by PTB on a sample of flight material [35]. SUEP's outer test mass is made of 90% titanium (A = 47.9, Z = 22), 6% aluminium (A = 27.0, Z = 13) and 4% vanadium (A = 50.9, Z = 23). The choice of the materials is a trade-off between machining laboratory know-how and theoretical motivation [36]. Titanium and platinum differ mainly in the neutron excess over the atomic mass, $(N-Z)/A$, and, to a smaller extent, in the nuclear-electrostatic-energy parameter $Z(Z-1)/(N+Z)^{1/3}$. The instrument works by measuring the electrostatic force required to equilibrate all other "natural" forces in order to keep the test masses motionless with respect to electrodes fixed to the satellite [34]. The measured electrostatic force divided by the known mass is commonly called "measured acceleration" (this is the opposite of the acceleration which would be undergone by the test mass in the absence of electrostatic force) and this terminology will be used in this paper.
The driving idea of the experiment is to compare the measured accelerations of the two test masses within each sensor unit to verify whether they have the same free fall. Along the X axis, parallel to the cylinder axis (Fig. 1, right panel and Fig. 2, right panel), capacitance changes are caused by the variation of the overlap between the test mass and its surrounding electrodes fixed to the satellite. Along the Y and Z axes the capacitance changes through variation of the gap. This allows for better sensitivity, removes electrostatic instability, and gives complete linearity along X. This is why the X-axis will be used in our analyses.
The MICROSCOPE satellite was designed to provide an environment as stable as possible. It was finely controlled along its six degrees of freedom with a Drag-Free and Attitude Control System (DFACS) described in [37]. The DFACS allows several modes of operation: inertial pointing, or spin mode with a choice between several rotation rates. In all cases the instrument's X-axis is kept parallel to the orbital plane (Fig. 2, left panel). In inertial pointing mode the axes of the spacecraft and instrument are maintained pointing in a fixed direction; the direction of the Earth gravity field projected onto the X-axis therefore varies at the orbital frequency, and hence $f_{EP} = f_{orb}$. In the spin mode the satellite is rotated about the instrument Y-axis, which is orthogonal to the orbital plane, at a frequency $f_{spin}$, and in this case $f_{EP} = f_{orb} + f_{spin}$. The measurements analysed in this paper were obtained with two different spin rates referred to as $f_{spin_2}$ and $f_{spin_3}$. The values of the frequencies of interest in this paper are given in Table 1 according to [38]. The higher frequency, $f_{spin_3}$, is a trade-off between minimisation of instrument noise, which would favour a higher frequency, and the capabilities of the micro-propulsion system; the smaller value, $f_{spin_2}$, has the advantage of being farther from the limits of the propulsion system, and saves on gas usage.
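The relation between the pointing mode and the frequency at which an EP violation would appear can be sketched as follows. The orbital period is the value quoted later in the text (5946 s); the half-integer spin ratios used here (q = 9 and q = 35, with $f_{spin} = (q/2) f_{orb}$) are illustrative assumptions, since Table 1 is not reproduced in this section, and the function name is hypothetical.

```python
T_ORB_S = 5946.0  # orbital period in seconds, as quoted in the session description


def f_ep(f_orb: float, f_spin: float = 0.0) -> float:
    """Frequency of a potential EP signal: f_orb (inertial) or f_orb + f_spin (spin)."""
    return f_orb + f_spin


f_orb = 1.0 / T_ORB_S

# ASSUMPTION: odd integers q such that f_spin = (q / 2) * f_orb.
# These values are illustrative and are not taken from the text above.
ASSUMED_Q = {"spin 2": 9, "spin 3": 35}

f_ep_inertial = f_ep(f_orb)
f_ep_by_mode = {
    name: f_ep(f_orb, 0.5 * q * f_orb) for name, q in ASSUMED_Q.items()
}

print(f"inertial: f_EP = {f_ep_inertial:.4e} Hz")
for name, value in f_ep_by_mode.items():
    print(f"{name}: f_EP = {value:.4e} Hz")
```

Spinning the satellite moves the signal of interest to a higher frequency, which is the motivation given above for the choice of spin rates.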
In [39], we used 7% of the available data to provide first results. No evidence for a violation of the EP was found at the $1.3 \times 10^{-14}$ level in terms of the Eötvös parameter, also providing improved constraints on additional new long-range forces [40-42]. Since then, the use of all data has allowed us to improve significantly the statistical error, and a thorough analysis of systematic errors has been conducted [43] using additional calibration sessions. The present paper describes these efforts and their result. In Sect. 2, we review the available data at our disposal, grouped into scientific sessions. In Sect. 3, we explain the physical parametric model used to analyse the measurements. In Sect. 4 we show that perturbed behaviours of the measured accelerations lasting one or several orbits require the division of some sessions into several disjoint segments. We list the characteristics of the segments analysed in this paper and describe the detection strategy for glitches (short singular events). The analysis is performed first on individual segments, as shown in Sect. 5, and then using the data from all segments gathered for a single estimation, as presented in Sect. 6. We conclude in Sect. 7. MICROSCOPE is simple in its principle but each component of the mission has been pushed to its limits given the external constraints (e.g. size of the satellite and global cost); for a more detailed presentation of MICROSCOPE the reader is referred to other papers of this volume (Refs. [34,37,38,43-48]).

Scientific sessions and available data
The MICROSCOPE observations are divided into different measurement sessions. A session represents a time span during which the satellite and the instrument keep the same configuration (spin, drag-free mode, etc.). Sessions are numbered by increasing integers and are described in the mission scenario [45]. Some of these sessions (called "EP sessions") are directly devoted to the test of the Equivalence Principle while others ("calibration sessions") are used to calibrate or characterise the experimental apparatus [46]. EP sessions are the longest, most of them spanning 120 orbital periods (about 8 days), while calibration sessions typically span a few orbits. All sessions were performed with SUEP as well as with SUREF. Sessions are characterised by:
• The sensor unit (SU) used: as explained in [34], the payload is composed of 2 SUs, each SU enclosing 2 co-axial concentric test masses. SUREF has 2 test masses with the same composition and is used as a null check of the experiment; SUEP aims at comparing the free falls of a test mass in platinum and of a test mass in titanium. During most of the sessions only one SU is on, with both operating simultaneously only during the EP sessions 430, 452 and 454.
• Which combination of the accelerometer outputs is used by the DFACS [37]: the DFACS uses micro-thrusters servo-controlled by the outputs of one or several inertial sensors in order to cancel their measured acceleration and to stabilise the rotation; it can be controlled by the output of one of the two inertial sensors (labelled IS1 for the inner mass and IS2 for the outer mass) within the SU in operation. A common-mode use of both accelerometers is also possible. In practice almost all sessions used IS2, except sessions 358 and 406 which used the common mode. When both SUs are working, the DFACS is controlled by the accelerometers within only one of the SUs: SUEP for session 430 and SUREF for sessions 452 and 454.
• The session duration: sessions were planned to last as long as possible, subject to operational constraints [45]: (i) periodic pointing updates were required to remain in specification due to possible onboard clock drift; this limited the maximum duration to 120 orbits, each of duration $T_{orb} = 5946$ s; (ii) roughly once a month the star trackers pointed towards the bright Moon and the fine attitude control had to be interrupted, so that some sessions had to be shortened. In addition, a few sessions were interrupted due to technical problems.
The first EP session after the commissioning is session 120, performed with SUREF. Note that the first calibration sessions revealed significant non-linearities in SUEP, which were solved by modifying the parameters of the proportional-integral-derivative (PID) controller in the servo-loop of SUEP; its behaviour was nominal from session 210 onwards. None of the prior sessions are used in our analysis. Session 430 was interrupted by an anomaly and was discarded. Finally we are left with 9 EP sessions performed with SUREF (Table 2) and 18 with SUEP (Table 3). Tables 2 and 3 also list the sensors' minimum and maximum temperatures. The typical temperature of the SUEP was about 10 °C while it was about 19 °C for the SUREF. The higher temperature in SUREF was due to two defective capacitors used for housekeeping data [34]. Note also the higher temperatures during sessions 452 and 454, when the two SUs ran simultaneously.
All sessions come with the following data used directly to estimate the Eötvös parameter [47]:
• The measured accelerations (with the meaning explained in section 1) for each test mass of the operating SU at a sampling rate of 4 Hz with the associated time stamping. The difference of acceleration $\Gamma_x$ between the 2 test masses along the most sensitive axis X (see [38] and [34] for the description of the axes), which was analysed in this work, is directly computed from these data.
• The attitude, angular velocity and angular acceleration of the satellite with respect to the inertial reference frame J2000 [37]. These data are given at exactly the same time stamps as the accelerometer measurements.
• The position and velocity of the centre of mass of the satellite in the J2000 frame, sampled every minute.
Additionally, housekeeping data (sampled at 1 Hz) are used to monitor the behaviour of the experiment and to estimate systematic errors [43,46]:
• The variation of position of each test mass as measured by the capacitive sensors; the residual displacements are very small (less than $10^{-12}$ m at $f_{EP}$) and the corresponding acceleration is negligible compared to our needs (Sect. 5.5.4).
• The temperature, which is measured by several probes inside the mechanical and electrical subsystems of each SU [34]. The corresponding systematic effects are estimated in [43].

The measurement model
A detailed explanation of the measurement model is given in [38]. Here we only summarise its main aspects. As explained in the introduction, we look for a difference of free fall between two co-axial concentric test masses of the same SU (SUREF or SUEP) by analysing the difference $\vec{\Gamma}^{(d)}$ of their measured accelerations. In a perfect experiment, this would correspond exactly to the applied differential acceleration $\vec{\gamma}^{(d)}$. Additional terms must be considered to account for the real experiment:
• A mapping matrix $\tilde{A}^{(c)}$ between the applied and the measured acceleration; $\tilde{A}^{(c)}$ is close to the identity matrix and takes into account scale factors and coupling between axes, as well as a rotation reflecting the fact that our model for $\vec{\gamma}^{(d)}$ is not exactly expressed in the instrument frame, which is imperfectly known;
• A residual projection of the measured common-mode acceleration, formed from the noise-free measured accelerations of the two test masses, which in practice we approximate with $\vec{\Gamma}^{(i)}$;
• A coupling with the angular acceleration $\dot{\vec{\Omega}}$.
The origin of these terms is related both to detailed instrumental characteristics [34] and to the implementation of the instrument in its environment [38]. We end up with a model combining these terms; it is fully similar to Eq. (19) of Ref. [38], although written with a notation shortcut. The derivation of this model, and in particular how we get a mixing between the applied differential acceleration and the measured common-mode acceleration, is detailed in [38]. Note also that other potential disturbing effects (non-linearity, thermal effects, stiffnesses and others) are not included in this model but are characterised in [46] and [43] and contribute to the assessment of the systematic effects [43].
The difference of accelerations applied to the test masses derives directly from simple dynamics. First, each mass experiences Earth gravity; the gravitational force applied to each mass is slightly different because of (i) a gravity gradient due to their small difference of positions $\vec{\Delta}$ (called offcentring in the following, right panel of Fig. 2), and (ii) a possible intrinsic difference of free fall parametrised by the Eötvös parameter. Second, since the acceleration is expressed in the instrument frame co-rotating with the satellite at an angular velocity matrix $[\Omega]$, the corresponding inertial acceleration must be taken into account. Third, additional small perturbations on the test masses (local gravity, magnetic effects, radiation pressure, radiometric effect and others) are gathered in the physical bias $\vec{b}_1^{(d)}$. The detailed derivation [38] yields the expression of $\vec{\gamma}^{(d)}$, Eq. (4), where
• $\delta(2,1)$ is defined in Eq. (2) (note that Eq. (4) involves $\delta(2,1)$ instead of $\delta(1,2)$ because the measured differential acceleration is opposite to the difference of gravity accelerations),
• $\vec{g}(O_{sat})$ is the gravity acceleration computed at the centre of the satellite,
• $[T]$ is the gravity gradient tensor computed at the centre of the satellite,
• $[\mathrm{In}] = [\dot{\Omega}] + [\Omega][\Omega]$ is the gradient of inertia matrix.
In this analysis, we will use only the measurement along the X-axis, which is an order of magnitude more sensitive than those along Y and Z. This leads to an approximate expression for $\Gamma^{(d)}_x$. As explained in [47], some of its terms have been demonstrated to be negligible and others are corrected. Finally, after calibration and correction [47] (see also Sect. 5.1 for a summary), we get the fundamental equation which will be used for our analysis [47]:
$$\Gamma^{(d)}_{x,\mathrm{corr}} = b^{(d)}_x + \delta_x\, g_x + \delta_z\, g_z + \Delta_x\, S_{xx} + \Delta_z\, S_{xz} + n_x, \tag{6}$$
where
• $b^{(d)}_x$ is a bias which is almost constant but may slowly drift over time due to thermal effects;
• $\delta_x = \tilde{a}_{c11}\,\delta$ is very close to the Eötvös parameter since $|\tilde{a}_{c11} - 1| < 2 \times 10^{-2}$, whereas the potential contribution of the Eötvös parameter to $\delta_z = \tilde{a}_{c13}\,\delta$ should be much smaller because $|\tilde{a}_{c13}| < 2.6 \times 10^{-3}$ rad from manufacturing;
• $\Delta_x$ and $\Delta_z$ are effective combinations of the components of the offcentrings between the 2 test masses [47];
• $n_x$ is the measurement noise.
The structure of this equation is very simple:
• $g_x$, $g_z$, $S_{xx}$ and $S_{xz}$ are time-varying deterministic signals which can be computed accurately [47] knowing the position and the pointing of the satellite as well as its angular velocity and acceleration, which are all delivered by CNES [38] with an accuracy better than the requirements [37];
• $b^{(d)}_x$ is taken into account by estimating a polynomial trend;
• The parameters $\delta_x$, $\delta_z$, $\Delta_x$ and $\Delta_z$ are estimated.
These quantities are computed in the instrument frame, in which the varying signals of Eq. (4) have very different frequency patterns [36]:
• $g_x$ and $g_z$ are essentially periodic signals of frequency $f_{EP}$ and are in phase quadrature,
• $S_{xx}$ and $S_{xz}$ have dominant components at DC and $2f_{EP}$, and the variations of $S_{xx}$ and $S_{xz}$ at $2f_{EP}$ are in phase quadrature,
• $b^{(d)}_x$ is at very low frequency.
As a consequence, these signals are almost uncorrelated.
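The regression implied by this structure (a low-order polynomial bias plus the four deterministic signals $g_x$, $g_z$, $S_{xx}$, $S_{xz}$) can be sketched on synthetic data as follows. This is only a toy illustration under stated assumptions: the noise is white, there are no gaps, and the signal amplitudes, parameter values and spin ratio are invented for the example; ordinary least squares stands in for the estimators actually used in the paper.

```python
import numpy as np


def design_matrix(t, g_x, g_z, s_xx, s_xz, degree=3):
    """Columns: polynomial bias terms, then g_x, g_z, S_xx, S_xz."""
    tau = (t - t[0]) / (t[-1] - t[0])  # rescaled time in [0, 1] for conditioning
    poly = np.vander(tau, degree + 1, increasing=True)
    return np.column_stack([poly, g_x, g_z, s_xx, s_xz])


def fit(a, y):
    """Ordinary least squares; columns are normalised before solving."""
    norms = np.linalg.norm(a, axis=0)
    coeffs, *_ = np.linalg.lstsq(a / norms, y, rcond=None)
    return coeffs / norms


# --- Synthetic toy data (all numerical values below are assumptions) ---
rng = np.random.default_rng(0)
t_orb = 5946.0                       # s, orbital period quoted in the text
f_ep = 18.5 / t_orb                  # Hz, assumes f_spin = (35/2) f_orb
t = np.arange(0.0, 10 * t_orb, 0.25)  # 10 orbits sampled at 4 Hz
phase = 2.0 * np.pi * f_ep * t

g_x, g_z = 7.9 * np.cos(phase), 7.9 * np.sin(phase)   # m/s^2, in quadrature
s_xx = 1.0e-6 * (0.5 + 1.5 * np.cos(2.0 * phase))     # s^-2, DC and 2 f_EP
s_xz = 1.5e-6 * np.sin(2.0 * phase)                   # s^-2, 2 f_EP

a = design_matrix(t, g_x, g_z, s_xx, s_xz)
bias_true = np.array([1.0e-9, 2.0e-11, -1.0e-11, 5.0e-12])  # rescaled-time units
delta_x_true, delta_z_true = 1.0e-13, 0.0   # deliberately exaggerated delta_x
offc_x_true, offc_z_true = 20.0e-6, -5.0e-6  # m, offcentrings
theta_true = np.concatenate(
    [bias_true, [delta_x_true, delta_z_true, offc_x_true, offc_z_true]]
)
y = a @ theta_true + 1.0e-11 * rng.standard_normal(t.size)

theta_hat = fit(a, y)
delta_x_hat, offc_x_hat = theta_hat[4], theta_hat[6]
print(f"delta_x = {delta_x_hat:.2e}, Delta_x = {offc_x_hat:.2e} m")
```

Because the signals sit at distinct frequencies (very low frequency, $f_{EP}$, and $2f_{EP}$), the columns of the design matrix are only weakly correlated and the parameters separate cleanly.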
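The bias, which encompasses the low-frequency trend, is modelled with a degree-3 polynomial: $b^{(d)}_x(t) = \sum_{j=0}^{3} \alpha_j (t - t_0)^j$. Substituting it in Eq. (6), we finally get
$$\Gamma^{(d)}_{x,\mathrm{corr}}(t) = \sum_{j=0}^{3} \alpha_j (t - t_0)^j + \delta_x\, g_x(t) + \delta_z\, g_z(t) + \Delta_x\, S_{xx}(t) + \Delta_z\, S_{xz}(t) + n_x(t). \tag{7}$$

Handling of singular events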

Segmentation
During some EP sessions, sudden changes in the local mean of the measured acceleration can be noted (Fig. 3). The typical macroscopic manifestation in the raw data (first panel of the figure) is a leap in the values of the measurements. These leaps are observed on the SUREF instrument only. They are not well understood as they are unpredictable, rare and not correlated with other observable events. Applying a low-pass filter and then zooming in on the leap (third panel) shows that it does not consist of a simple Heaviside step function but is a complex mixture of erratic oscillations. We are faced with at least a few hundred seconds of unusable data. More rarely, some sessions have been stopped after the detection of technical problems (for example an instability in the drag-free loop) and there was a delay between the occurrence of the problem and the interruption. The data associated with the occurrence of such problems were discarded. Thus, some sessions cannot be analysed in full, but only in parts referred to as "segments" in the following. There can be a single segment if only the end of the session is corrupted or if the other potentially usable parts are too short to bring a significant contribution; of course, if no problem is detected in a session, the segment corresponds to the whole session. We can also extract several segments from one session, as is the case for session 380 (Fig. 3).
The driving principle is to end up with segments as long as possible including an even number of orbital periods: $T = 2n\,T_{orb} = 2n/f_{orb}$. Since $f_{spin}$ has been chosen such that $f_{spin} = (q/2) f_{orb}$ ($q$ odd integer), all potential signals at frequencies $f_i = k_i f_{orb} + p_i f_{spin}$ (with $k_i$ and $p_i$ any integers) combining the orbital frequency $f_{orb}$ and the spin frequency $f_{spin}$ are such that $f_i T = (2k_i + q\,p_i)\,n$ is an integer. This means that $f_i$ corresponds to a sampling frequency of the discrete Fourier transform, and the correlation between two signals at frequencies $f_i$ and $f_j$ respecting the above property is null in theory and very low in practice. Tables 4 and 5 show the segments selected for our analysis. This comprises 13 segments totalling 598 orbits for SUREF and 19 segments totalling 1362 orbits for SUEP.
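The reasoning behind the choice of an even number of orbits can be checked with exact rational arithmetic. In this minimal sketch the function names are hypothetical and the value q = 35 is an illustrative assumption; the point is only that every combination $k_i f_{orb} + p_i f_{spin}$ completes a whole number of cycles over an even number of orbits, which fails for an odd number.

```python
from fractions import Fraction


def cycles_in_segment(k_i: int, p_i: int, q: int, n_orbits: int) -> Fraction:
    """Number of cycles f_i * T for f_i = k_i f_orb + p_i f_spin,
    with f_spin = (q / 2) f_orb and T = n_orbits / f_orb."""
    return (k_i + p_i * Fraction(q, 2)) * n_orbits


def all_on_dft_grid(q: int, n_orbits: int, max_order: int = 3) -> bool:
    """True if every combination up to max_order has an integer cycle count."""
    orders = range(-max_order, max_order + 1)
    return all(
        cycles_in_segment(k, p, q, n_orbits).denominator == 1
        for k in orders
        for p in orders
    )


Q_ASSUMED = 35  # ASSUMPTION: illustrative odd integer, not taken from the text

even_ok = all_on_dft_grid(Q_ASSUMED, n_orbits=20)  # T = 2n T_orb with n = 10
odd_ok = all_on_dft_grid(Q_ASSUMED, n_orbits=7)    # odd number of orbits

print(even_ok, odd_ok)
```

With an odd number of orbits, any combination with odd $p_i$ lands halfway between two DFT bins, so its power would leak into neighbouring frequencies.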

Glitches
4.2.1. General characteristics of glitches. When looking closely at the temporal evolution of the measured accelerations on several test masses, we can see short (a few seconds) and significant (1-10 nm s$^{-2}$) variations which appear at the same time on both masses of the operating SU, and even on the 4 masses when the 2 SUs are operating simultaneously (Fig. 4). The simultaneous appearance of these features for all masses proves that these events have a common external source. Events of this kind have already been observed in other space missions carrying accelerometers [49] and are called "twangs" or "glitches". Glitches in MICROSCOPE have been extensively studied in a dedicated paper [48]. Here, we recall their main characteristics:
• As seen through the transfer functions of the drag-free system and of the instrument, glitches look like exponentially damped sines with a large first ramp in one direction followed by a smoother oscillation in the opposite direction; their mean shape has been computed in [48] and is shown in Fig. 5; even if the source events probably do not last more than a few milliseconds, they affect the measured acceleration for a dozen seconds.
• The amplitude of the corresponding measured acceleration can reach up to a few $10^{-8}$ m s$^{-2}$ but can be much smaller; there are probably also glitches which are masked by the measurement noise.
• The rate of detected glitches typically ranges from 0.02 to 0.06 s$^{-1}$.
• Although they can occur at any time, their probability of occurrence is affected by two periodicities: the orbital period of the satellite and its spin period.
Their most likely origins are crackling of the MLI (Multi-Layer Insulation) of the satellite and, more rarely, clangs of the gas tanks used for the micro-propulsion. Predicting the exact occurrence, form and amplitudes of the glitches seems out of reach. Moreover, although very similar, the responses of the different test masses to these sudden events are not perfectly identical and the corresponding signal is not fully cancelled in the differential acceleration [48]. Thus, due to the time distribution of the glitches, they could generate a tiny signal at the $f_{EP}$ frequency. Consequently, the chosen strategy is to detect and eliminate them.
4.2.2. Detection and elimination of the glitches. Glitches are detected with a two-step procedure that is applied twice [45,47,48]: (i) we use a standard recursive σ-clipping technique (e.g. Ref. [50]) to extract outliers (4.5σ) from the measured differential accelerations on the three axes (X, Y, Z) simultaneously, before (ii) flagging data points in the second preceding each outlier and the 15 seconds following it (typically, a single glitch builds from the noise within 0.5 seconds and dies off within 5 to 10 seconds from its peak, depending on its signal-to-noise ratio). We thus define a first mask $M_1$ made of zeros in the intervals characterised as glitches and ones elsewhere. The same two steps are then performed on the high-frequency-filtered differential acceleration (using a 2nd-order Butterworth filter of critical frequency 0.01 Hz), which allows us to detect low signal-to-noise glitches and define a second mask $M_2$ (a 3σ threshold is used during these steps). The final mask is the product of both masks, $M = M_1 \times M_2$, so that a data point is discarded as soon as it is flagged in either of them. The percentage of masked data for each session is indicated in Tables 4 and 5. It is typically 20%, but is less than 10% for some sessions and reaches 40% for session 176.
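The masking procedure described above can be sketched as follows. This is a simplified, single-axis illustration under stated assumptions: the function names are hypothetical, the clipping is applied to one time series rather than to the three axes simultaneously, and the Butterworth filtering that precedes the second pass is omitted (the second pass is simply run with the lower threshold on the same toy series).

```python
import numpy as np


def sigma_clip_outliers(x, n_sigma, max_iter=20):
    """Recursive sigma clipping: re-estimate mean and standard deviation on
    the retained points until the set of outliers stops changing."""
    keep = np.ones(x.size, dtype=bool)
    for _ in range(max_iter):
        mu, sd = x[keep].mean(), x[keep].std()
        new_keep = np.abs(x - mu) <= n_sigma * sd
        if np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return ~keep


def glitch_mask(x, fs=4.0, n_sigma=4.5, before_s=1.0, after_s=15.0):
    """Mask with zeros from before_s seconds ahead of each outlier to
    after_s seconds after it, and ones elsewhere."""
    n_before, n_after = int(round(before_s * fs)), int(round(after_s * fs))
    mask = np.ones(x.size, dtype=int)
    for i in np.flatnonzero(sigma_clip_outliers(x, n_sigma)):
        mask[max(0, i - n_before): i + n_after + 1] = 0
    return mask


# Toy series: 1000 s at 4 Hz, a slow oscillation plus one large spike.
fs = 4.0
t = np.arange(4000) / fs
x = np.sin(2.0 * np.pi * 0.01 * t)
x[2000] += 50.0  # hypothetical glitch

m1 = glitch_mask(x, fs=fs, n_sigma=4.5)
m2 = glitch_mask(x, fs=fs, n_sigma=3.0)  # second pass (filtering omitted here)
m = m1 * m2                              # point kept only if kept by both masks
print(f"masked fraction: {1.0 - m.mean():.3%}")
```

Multiplying the masks keeps a sample only when neither pass has flagged it, which is why a modest glitch rate can translate into a sizeable fraction of discarded data.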

Separated analysis of each segment
As explained in Section 4.1 we have 13 segments for SUREF and 19 segments for SUEP.
A first important objective is to analyse these segments separately since, as discussed in Refs. [35,39], the analysis of a single segment already resulted in an accuracy of estimation of the Eötvös parameter 10 times better than that of the previous experiments. Moreover, this analysis will provide some insights into the data before a global analysis of all segments.

Main steps of the analysis
The analysis of individual segments is performed in the following steps:
(i) Calibration correction: thanks to dedicated calibration sessions, it was possible to estimate the values of the instrumental parameters $\Delta_y$, $a_{d11}$, $a_{d12}$ and $a_{d13}$ throughout the mission [43]; using the values of $S_{ij}$ and $\dot{\Omega}_i$ precisely computed for each measurement date and of the common-mode measured acceleration $\Gamma^{(c)}_i$, $\Gamma_x$ is corrected for the terms $\Delta_y (S_{xy} + \dot{\Omega}_z)$ and $2 \sum_i a_{d1i}\, \Gamma^{(c)}_i$.
(ii) Detection of the glitches: we detect glitches and define a corresponding mask according to the algorithm described in section 4.2.2. The union of this mask with the mask generated by the few points (typically a dozen per session) tagged directly on board is used during the analysis (see Sect. 5.2) to discard the corresponding points. Note that this detection is performed only for EP sessions since we have checked that this operation has no impact on the parameters estimated using the calibration sessions.
(iii) Estimation: the parameters $\alpha_j$, $\delta_x$, $\delta_z$, $\Delta_x$ and $\Delta_z$ are estimated by fitting the corrected measured differential acceleration to the model (7).
Since the measurement noise in MICROSCOPE is coloured (Fig. 6 and [35]), the optimal estimation of the parameters requires use of the characteristics of the noise. However, masking introduces gaps into the data, which are no longer regularly sampled. In this case, straightforward techniques (periodograms) to compute the Power Spectral Density (PSD) are no longer effective: a significant leakage of the signal power from high to low frequency bands leads to a substantial overestimation of the standard deviation of the estimated parameters [51,52]. To solve this problem, we performed our estimation with two very different techniques described below: a method of maximisation of the likelihood in the time domain called M-ECM (Modified Expectation Conditional Maximisation) and a weighted least-squares regression in the Fourier domain named ADAM (Accelerometric Data Analysis for MICROSCOPE).

Estimation of the parameters in the time domain using the M-ECM analysis
M-ECM [52] is an inference algorithm designed to perform linear regression of gapped data. Although M-ECM may find applications in many areas, it was developed by the MICROSCOPE team during the mission design for the purpose of processing the mission data. Assuming a linear model for the signal and a stationary Gaussian distribution for the noise, it maximises the likelihood by iterating the following two steps. The first one is the conditional expectation step, which computes the expected likelihood (or rather, its logarithm $l_Y(\theta)$) conditionally on the observed data. This process amounts to estimating missing data within gaps. The second one is the maximisation step, which maximises the expected likelihood over the regression parameters $\theta$. This step amounts to computing the generalised least-squares estimate of the parameters. It also includes estimating the noise PSD (the inclusion of this step motivates the qualification "modified" in the algorithm's name, as it is not present in standard ECM algorithms [53,54]). The use of the M-ECM algorithm allows one to avoid spectral leakage effects due to data gaps by restoring the statistics of the noise in the frequency domain.
In the case of complete data, and under the stationary assumption, we can write the logarithm of the likelihood, up to an additive constant, as
$$l_Y(\theta) = -\frac{1}{2}\left(\hat{Y} - \hat{A}\theta\right)^{\dagger} [\gamma]^{-1} \left(\hat{Y} - \hat{A}\theta\right) - \frac{1}{2}\ln\det[\gamma], \tag{9}$$
where $\hat{Y}$ is the vector of $N$ Fourier-transformed measurements, $\theta$ is the vector of parameters to estimate and $\hat{A}$ is the matrix of derivatives of the model with respect to these parameters. The matrix $[\gamma]$ denotes the covariance of the noise in the Fourier domain. It is approximately diagonal (see e.g. Fig. 1 of Ref. [47]) and its diagonal elements are equal to the noise PSD. If all measurements are available, we solve the estimation problem by maximising the likelihood with respect to $\theta$.
When data points are missing, directly maximising Eq. (9) can be computationally cumbersome because we cannot approximate $[\gamma]$ as a diagonal matrix. Instead, the estimation is broken down into two steps. In the first step, we compute the expectation of the log-likelihood given the observed data $Y_o$ and the value of the parameters at the present iteration, $E[\,l_Y(\theta) \mid Y_o, \theta, \gamma\,]$. This computation is called the expectation step (E). It requires computing the expectation of the full data vector conditionally on the observed data, $E[\hat{Y} \mid Y_o, \theta, \gamma]$, along with its second-order moment $E[\hat{Y}\hat{Y}^{\dagger} \mid Y_o, \theta, \gamma]$. We can compute these quantities by using the conditional mean and covariance formulas for Gaussian processes (see [52] for more details). The second step is similar to the maximisation one would do for complete data, except that now we maximise the expected likelihood over $\theta$. This step is the maximisation (M) step. Note that solving for $\theta$ is done for a given noise PSD $\gamma$. In the M-ECM algorithm, we assume that $\gamma$ is unknown and that it depends on some noise parameters $\beta$. Hence, we also fit for the PSD by performing a pseudo-maximisation over $\beta$ conditionally on $\theta$. Finally, we iterate E and M steps until $\theta$ converges. The final solution is the value that maximises the likelihood with respect to the observed data, $l_{Y_o}(\theta)$. In summary, M-ECM first computes the expectation of the likelihood through data reconstruction and then maximises it over the parameters. Hence, the maximisation takes advantage of the fast Fourier transform applied to the regularly sampled reconstructed time series, which is computationally more efficient than the direct maximisation of the gapped data likelihood. Values drawn from the missing data conditional distribution (also called reconstructed data in the following) are a useful by-product of M-ECM.
The resulting algorithm is unbiased after several iterations because it converges to the same solution as the one obtained with a direct (but costly) maximisation of the gapped, time-domain data likelihood.M-ECM is also approximately optimal in the statistical sense, as it yields the solution with nearly minimal variance, provided that the PSD estimation is sufficiently accurate [52].
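The conditional-mean formula used in the E step can be illustrated on a toy example. The sketch below is not the M-ECM implementation: it assumes a known covariance of the form rho**|i-j| (an AR(1)-like choice made here purely for illustration), and the function name `conditional_mean` is hypothetical. It only shows how a missing sample is filled with its expectation given the observed ones.

```python
import numpy as np

def conditional_mean(y_obs, mean, cov, observed):
    """Fill missing entries of a Gaussian vector with E[y_missing | y_observed]."""
    observed = np.asarray(observed, dtype=bool)
    missing = ~observed
    c_oo = cov[np.ix_(observed, observed)]   # covariance of observed samples
    c_mo = cov[np.ix_(missing, observed)]    # cross-covariance missing/observed
    filled = np.array(mean, dtype=float)
    filled[observed] = y_obs
    # E[y_m | y_o] = mu_m + C_mo C_oo^{-1} (y_o - mu_o)
    residual = y_obs - mean[observed]
    filled[missing] = mean[missing] + c_mo @ np.linalg.solve(c_oo, residual)
    return filled

# Toy stationary covariance (assumption): C_ij = rho**|i-j|
n, rho = 5, 0.5
idx = np.arange(n)
cov = rho ** np.abs(idx[:, None] - idx[None, :])

observed = np.array([True, True, False, True, True])  # sample 2 is a gap
y_obs = np.array([1.0, 2.0, 2.0, 1.0])
filled = conditional_mean(y_obs, np.zeros(n), cov, observed)
print(filled[2])  # reconstructed value at the gap
```

For this particular covariance the gap value depends only on its two neighbours, rho/(1+rho**2) times their sum, which gives a simple check of the linear-algebra step. In the real problem the observed-data covariance is large, which is why the E step is the costly part of the algorithm.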

Estimation of the parameters in the Fourier domain using the ADAM software
ADAM was developed by the MICROSCOPE team during the mission preparation for the purpose of processing the data of the experiment. For the estimation with ADAM, we use both the original data remaining after the masking operation and the data reconstructed by M-ECM as described in the previous section. This means that we now have data without gaps, i.e. regularly sampled. This allows us to apply a discrete Fourier transform (DFT), converting the whole system of measurement equations into the frequency domain: whereas in the time domain each equation is associated with a time t i , each equation of the transformed system is associated with a frequency f j . This leads to several interesting properties, in particular:
• For periodic signals, the energy is concentrated in a small number of frequencies, corresponding to a small number of equations in the frequency domain; this is the case for the gravity acceleration and the gravity gradient. Moreover, thanks to our choice to impose f spin = kf orb /2 and to perform the analysis over 2nT orb (k and n integers), the frequencies f orb , f spin and f EP each correspond precisely to a frequency of the DFT: f q = q/T analysis = qf orb /(2n) with q orb = 2n, q spin = kn and q EP = (k + 2)n.
• With long enough data streams, the covariance is approximately diagonal, with a squared error inversely proportional to the data size [55]. In our application, the deviation is below the percent level. Thus a diagonal weighting matrix composed of the elements w(f k ) = 1/γ(f k ), where γ(f k ) is the PSD of the noise at the frequency f k [47], is almost optimal.
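The bin bookkeeping in the first property can be checked with exact rational arithmetic. The following sketch assumes f EP = f orb + f spin, which is what q EP = q orb + q spin implies; the values k = 9 and n = 60 are illustrative choices, not taken from the text, and the function names are hypothetical.

```python
from fractions import Fraction

def dft_bins(k, n):
    """DFT bin indices (q_orb, q_spin, q_EP) for f_spin = k*f_orb/2 and T = 2n orbits."""
    return 2 * n, k * n, (k + 2) * n

def bin_frequency(q, n, f_orb):
    """f_q = q / T_analysis = q * f_orb / (2n)."""
    return Fraction(q, 2 * n) * f_orb

k, n = 9, 60                      # illustrative values (assumption)
f_orb = Fraction(1, 5946)         # Hz, from the orbital period of about 5946 s
q_orb, q_spin, q_ep = dft_bins(k, n)

f_spin = Fraction(k, 2) * f_orb
f_ep = f_orb + f_spin             # assumed definition of the EP frequency
print(q_orb, q_spin, q_ep)
```

Because every frequency of interest is an integer multiple of 1/T analysis, the corresponding signals fall on single DFT bins and do not leak into neighbouring ones.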
Details of the procedure are described in [47] and we recall here the main steps:
(i) The series Γ x,corr (t i ), (t i − t 0 ) j , g i (t i ), g z (t i ), S xx (t i ) and S xz (t i ) are transformed into the frequency domain by application of a DFT: the N observation equations in the time domain (corresponding to Eq. (7) at N different times) are transformed into N observation equations in the frequency domain.
(ii) Potentially, we select only a subset of equations in the frequency domain, which is equivalent to selecting frequency bands.For the standard analysis, the frequency band around f EP is selected to estimate δ x and δ z , and the frequency band around 2f EP is selected to estimate ∆ x and ∆ z .We choose a bandwidth large enough to encompass all the relevant signals (gravity acceleration and gravity gradient with their significant harmonics as well as the rotational terms): 8 × 10 −4 Hz for sessions in spin V2 and 2 × 10 −3 Hz for sessions in spin V3.The whole spectrum is used only to estimate the parameters of the low frequency trend described above.
(iii) These observation equations are used to estimate the parameters of Eq. ( 7) and the associated statistical errors using a weighted least-squares inversion.
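Steps (i) to (iii) can be sketched on a synthetic one-parameter problem. This is not the ADAM code: the signal bin, the bandwidth, the white-noise PSD and all variable names are assumptions chosen for illustration; only the 4 Hz sampling rate is taken from the text.

```python
import numpy as np

fs = 4.0                               # sampling rate in Hz
n_samples = 4096
t = np.arange(n_samples) / fs
q_sig = 32                             # hypothetical DFT bin carrying the signal
f_sig = q_sig * fs / n_samples
template = np.sin(2 * np.pi * f_sig * t)   # model column for unit amplitude

rng = np.random.default_rng(0)
delta_true = 3.0
y = delta_true * template + rng.normal(scale=1.0, size=n_samples)

# (i) transform the observations and the model with a unitary DFT
Y = np.fft.rfft(y) / np.sqrt(n_samples)
A = np.fft.rfft(template) / np.sqrt(n_samples)

# (ii) keep only a band of bins around the signal frequency
band = slice(q_sig - 8, q_sig + 9)
Yb, Ab = Y[band], A[band]

# weights: inverse of the noise PSD (white noise of unit variance assumed here)
psd = np.ones(Yb.shape)
w = 1.0 / psd

# (iii) weighted least squares for one real parameter from complex equations
den = np.sum(w * np.abs(Ab) ** 2)
delta_hat = np.sum(w * np.real(np.conj(Ab) * Yb)) / den
# real and imaginary parts each carry half of the per-bin noise variance
sigma_hat = np.sqrt(1.0 / (2.0 * den))
print(delta_hat, sigma_hat)
```

With a coloured noise PSD the array `psd` would vary across the band, and bins where the noise is larger would simply be down-weighted.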
Fundamentally, no new information is expected from the ADAM analysis, which performs the parameter estimation in the Fourier domain using the reconstructed data provided by M-ECM. However, it is interesting to cross-check results from different methods. Moreover, ADAM is much faster than M-ECM: for a segment of 120 orbits M-ECM needs about 12 hours while ADAM takes only a few minutes. This is because in the E-step, M-ECM has to solve a large linear system where the system matrix is the covariance of the observed data in the time domain. The preconditioning is also memory-expensive. Although these disadvantages are not critical for a single run, they become a serious problem for the numerous tests required to strengthen the quality and the robustness of the estimation analysis.

Results
The estimates of the parameters δ x and δ z and their standard deviations are listed in Tables 6 and 7. As noted in Sect. 3, δ z = ãc13 δ is almost three orders of magnitude smaller than δ, i.e. far below what is detectable; thus the estimated values of δ z and of its standard deviation are indicators of statistical or systematic effects. Figs. 7 and 8 give an overview of the estimates of the Eötvös parameter with their time distribution. The values of the estimated offcentring are reported and commented on in [43]. Several comments are in order.
(i) Although being very different algorithms, M-ECM and ADAM provide very similar results both for the values of δ x and σ.The only appreciable difference is for δ z estimated for segment 748 which is a SUEP session in spin V2 lasting 24 orbits; but this difference is statistically insignificant since it is smaller than the standard deviation computed by M-ECM.
(ii) As expected given the frequency dependence of the noise visible in Fig. 6, the standard deviation is smaller in spin V3. Furthermore it logically depends on the duration of the analysis segment: it is shown in [46] that when normalised to the same duration (or equivalently, expressed in terms of PSD) all sessions with the same SU and the same spin have quite similar noise.
(iii) As expected before the launch, SUREF provided more accurate data than SUEP, mainly thanks to its heavier outer test-mass. Indeed the area-to-mass ratio is a driving factor of the error budget.
In terms of the Eötvös parameter, these results are statistically compatible with a null value: all absolute values are smaller than 2σ except for sessions 294 and 380 where they exceed 3σ but remain below 5σ. Accounting for systematic errors (see Table 10) further downplays the significance of these values. The value of σ depends on the segment, and in particular on its length, but is typically 10 −14 or smaller for the longest segments of 120 orbits.

Robustness tests
We conducted a series of complementary analyses to test the robustness of our results and quantify how they are impacted by the settings of the analysis and by some terms considered as negligible according to the specifications. For reasons of CPU time, all these tests have been performed using the ADAM software.

Impact of the frequency bandwidth
As explained above and in [47], we select a frequency band around f EP and 2f EP to compute the parameters of the model using a weighted least-squares estimation in the frequency domain. We have checked that dividing the bandwidth used around these frequencies by a factor 4 does not change the values of the parameters by more than a few percent (a few 10 −16 for δ x ); however the associated standard deviation can change by up to 10%, which is understandable because if we reduce the number of data (in the frequency domain) its estimation is less precise. In the same spirit, if we no longer use a selected frequency band but the whole domain, the parameters are not noticeably modified and the estimated standard deviation generally increases by 10 to 20%; indeed, in this case, we could have unmodelled high-frequency effects which contribute to increasing the standard deviation. This is also probably why σ estimated by M-ECM (which implicitly uses the whole frequency domain) is generally slightly larger than the one obtained with ADAM.

Impact of the degree of the fitted polynomial
In the actual analysis, we estimate a polynomial of degree 3 in order to absorb the low frequency trend due to temperature variations [47]. We have also tested other degrees, from 1 to 5. While there is no significant modification with degrees 2, 4 or 5, the standard deviation is slightly increased when using degree 1.
We have also compared different strategies to estimate the coefficients of the polynomial: (i) a prior estimation (without weighting) in the time domain, (ii) a prior estimation (with weighting according to the estimated power spectral density of the noise) in the Fourier domain (using the whole spectrum), (iii) estimation at the same time as the other parameters using the whole spectrum but the frequency bands around f EP and 2f EP .
The final results for the estimated Eötvös parameter and the associated standard deviation are fully equivalent: the differences are much smaller than the error and than our ideal objective of 10 −15 .
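The effect of the polynomial degree can be illustrated on a synthetic drift. This is a toy example, not the mission data: the cubic drift, the normalised time axis and the helper `detrend` are assumptions. It only shows that a trend of degree 3 is fully absorbed by fits of degree 3 or higher, while a degree-1 fit leaves a residual that would inflate the noise.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Normalised time axis and a hypothetical cubic low-frequency drift
t = np.linspace(-1.0, 1.0, 201)
drift = 0.5 - 1.0 * t + 0.3 * t**2 + 2.0 * t**3

def detrend(y, t, degree):
    """Subtract the least-squares polynomial of the given degree."""
    coeffs = P.polyfit(t, y, degree)      # coefficients, lowest order first
    return y - P.polyval(t, coeffs)

# RMS of what is left after removing polynomials of degree 1 to 5
rms = {deg: float(np.sqrt(np.mean(detrend(drift, t, deg) ** 2)))
       for deg in range(1, 6)}
for deg, value in rms.items():
    print(deg, value)
```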

Impact of the angular velocity and angular acceleration
Angular velocity (included in the matrix [S]) and angular acceleration are used in the actual analysis to correct the measured linear acceleration for gradient effects due to the offcentring $\vec{\Delta}$. The stability of the attitude of MICROSCOPE has been specified to limit the effect of this gradient at the f EP and 2f EP frequencies. To check that this is indeed the case, we have analysed the data without these corrections. Again, this resulted in negligible changes in the estimated values of the Eötvös parameter: less than 5% of its standard deviation, except for 3 segments for which the change was about 10%.

Impact of the residual variations of position of the test masses
Fundamentally, the full dynamical equation of the test masses includes terms taking into account the motion of the masses with respect to the satellite (see Eq. (1) in [38]): the kinematic acceleration $\ddot{\vec{\Delta}}$ and the Coriolis acceleration $2\,\vec{\Omega}\times\dot{\vec{\Delta}}$. Since the principle of the accelerometer is to nullify the displacement of the test masses, these terms are not included in the actual analysis. Nevertheless position measurements are available in the telemetry data. Due to the limited quantity of data that can be transmitted to the ground, they are sampled at 1 Hz. Thus, in order to compute the previous terms and correct the electrostatic acceleration, we have interpolated the positions at 4 Hz and differentiated them numerically. The modification of the estimated δ x amounts to a few percent of its standard deviation.

Testing the whole process with a simulated EP signal
It is essential to check that the global analysis process, including the detection and elimination of the glitches, is able to preserve and retrieve a potential EP-violation signal. To this aim we added a fake signal to the real measurements before any preprocessing and applied our complete chain of analysis, going from the masking described in Sect. 4.2.2 to the estimation of the Eötvös parameter as explained above. More precisely, we conducted two series of tests with two levels of added simulated signal: one corresponding to an Eötvös parameter of 3.4 × 10 −14 (which is large compared to the objective of MICROSCOPE) and a second corresponding to an Eötvös parameter of 3.4 × 10 −15 (which is more or less the limit of detection for SUEP as confirmed by Eq. 19). We then performed the analysis of these data and subtracted the Eötvös parameter estimated with the original data. Tables 8 and 9 show these differences, which can be compared to the simulated signal. The standard deviations corresponding to these analyses are very close to those of the initial analysis (Tables 6 and 7) and are not repeated here. Tables 8 and 9 also show the bias (i.e. the difference quoted above minus the simulated value) divided by the standard deviation of the Eötvös parameter. The absolute value of this ratio is smaller than 2% for all segments of the SUEP and for most of the segments of the SUREF. The worst case is for segment 778-1 (SUREF) with a relative error of 18% for a fake signal of 3.4 × 10 −14 ; but even in this case the absolute error is smaller than 10 −15 .
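The bias ratio reported in Tables 8 and 9 follows directly from the definition given above. The sketch below uses made-up numbers in units of 10 −15; they are not values from the tables, and the function name `relative_bias` is hypothetical.

```python
def relative_bias(est_with_fake, est_original, injected, sigma):
    """Bias of the recovered fake signal, in units of the standard deviation.

    recovered = estimate on data with fake signal - estimate on original data
    bias      = recovered - injected value
    """
    recovered = est_with_fake - est_original
    return (recovered - injected) / sigma

# Hypothetical segment, all quantities in units of 1e-15 (not from the tables)
est_original = -2.0
est_with_fake = 31.9
injected = 34.0
sigma = 6.0

ratio = relative_bias(est_with_fake, est_original, injected, sigma)
print(f"bias / sigma = {ratio:.3f}")
```

Here the recovered increment is 33.9 for an injected 34.0, so the bias is a small fraction of the standard deviation, which is the behaviour the test is designed to confirm.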

Systematic errors
The evaluation of systematic errors is a major topic that was addressed during the preparation and specification of the mission and has been revisited since the end of the mission in 2018 using the actual measurements and characterisation of the experiment. Systematic errors are estimated in detail in [43] (see Table 15 therein for a summary). They can be divided into seven main contributors, the maximum impact of each of which was estimated at the f EP frequency.
(i) Residual gravitational effects either due to imperfect knowledge of the Earth's gravity field and of the position and orientation of the instrument, or due to local effects coming from the satellite or the instrument itself (first three rows of Table 15 in [43]).Those residuals are due either to errors in the correction of the Earth's gravity gradients after calibration of the offcentrings, or to errors in the estimation of local gravity fluctuations (performed with finite element analysis before the launch).
(ii) Clock errors: the most stringent requirement in terms of time stamping comes from the need to compute the gravity gradient tensor with the correct position and orientation of the satellite, in order to correct the effects of the gravity gradient.
Since the dominant contributions of the gravity gradient are at DC and 2f EP , this constrains the absolute clock errors at f EP and 3f EP ‡; they have been specified to be smaller than 1 ms, leading to an error smaller than 2 × 10 −16 m s −2 on the gravity gradient effect. The maximum effective errors were 2.1 µs and 0.7 µs respectively at f EP and 3f EP in inertial pointing and even smaller in spin mode. The maximum total error (including bias, harmonic errors and drift) with respect to UTC was specified to be smaller than 50 ms and was always smaller than 41.3 ms. In order to limit the effect of the drift, the on-board clock was regularly synchronised (outside the scientific sessions).
(iii) Uncorrected inertial effects due to imperfect estimation of angular velocity and angular acceleration.This contribution is computed via the estimated performance of the DFACS [37,45], which allows estimation of the angular error contribution at f EP for each session.
(iv) Time variation of the instrument parameters (satellite pointing, common mode test-mass alignment and angular-to-linear acceleration couplings).These variations are not due to temperature variation at f EP (which are taken into account in the thermal effect below).They are computed by combining the satellite alignment variation issued from the DFACS control with the residual continuous differential acceleration at the differential measurement output.
(v) The drag-free control residuals.In complementarity with the previous item, instrument parameters (estimated from calibration sessions) are assumed constant during a given session.The DFACS-related systematic error is computed from the residual accelerations at f EP issued from the DFACS performance report.
(vi) Magnetic effects.They are computed with a finite element model partially adjusted to measurements made on the instrument magnetic shielding and based on the knowledge of all electronics units characteristics.
(vii) Thermal effects induced by the tiny variations of temperature at the f EP frequency. These are the major contributors to the systematic error budget. Their estimation results from a detailed analysis of the thermal behaviour of the instrument and of the satellite. It should be noted that the temperature variations at f EP in the SU were reduced to a fraction of a µK in the worst case, which strongly reduced the thermal contribution compared to that of [39].

‡ The combinations f EP with DC, f EP with 2f EP and 3f EP with 2f EP can all produce effects at f EP .
(viii) Non-linearities, described by a quadratic term in the measurement equation. Note that Equation (5) ignores this term because it is not used in the data processing but only in the a posteriori analysis of errors [43].

Table 10. Main systematic effects for SUREF sessions: the dominant thermal effects and non-linear (quadratic) effects, and the total computed as the quadratic sum of all effects detailed in [43]. The last column repeats, for comparison, the stochastic error obtained with ADAM presented in Table 6. All values are given in units of 10 −15 for the Eötvös parameter.
According to [43], the last two terms are dozens of times larger than the others for SUEP.In the case of SUREF, non-linear effects are smaller and thermal effects dominate all the others.These various effects are unlikely to be correlated and are quadratically added at the end.
The main systematics (in terms of Eötvös parameter) as well as their total contribution S l for each segment l are summarised in Tables 10 and 11. Following [43],

$$ S_l = \frac{1}{g}\sqrt{\sum_k s_{k,l}^2}, $$

where s k,l are the maximum systematic errors of acceleration corresponding to each source k, and g ≃ 7.9 m s −2 is the gravity acceleration for MICROSCOPE. The standard deviation issued from the least-squares regression with ADAM presented in Section 5.4 is also recalled for comparison. This shows that stochastic errors are clearly dominant for SUEP and marginally dominant for SUREF.
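The quadratic sum of independent contributions and its conversion to an Eötvös-parameter equivalent can be sketched as follows. The acceleration errors are made-up values, and g ≈ 7.9 m s −2 is the value implied by the conversion 10 −15 g ≈ 7.9 × 10 −15 m s −2 quoted in the caption of Table 10; the function name is hypothetical.

```python
import math

G_MICROSCOPE = 7.9  # m s^-2, gravity acceleration at the satellite (see lead-in)

def systematic_eotvos(acc_errors, g=G_MICROSCOPE):
    """Quadratic sum of acceleration errors (m s^-2), expressed as an Eotvos parameter."""
    return math.sqrt(sum(s * s for s in acc_errors)) / g

# Two hypothetical, uncorrelated contributions in m s^-2 (not from Table 10)
acc_errors = [3 * 7.9e-15, 4 * 7.9e-15]
s_l = systematic_eotvos(acc_errors)
print(s_l / 1e-15)  # total systematic error in units of 1e-15
```

With these inputs the individual contributions are 3 and 4 in units of 10 −15, so the quadratic sum is 5 in the same units.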

Estimation of the Eötvös parameter using the combination of sessions
The results from the analysis of the individual segments show that stochastic errors are larger than systematic errors. In order to improve the signal-to-noise ratio we have gathered, in a global analysis, all segments included in Tables 6 and 7. The polynomial coefficients α j and the offcentrings ∆ x and ∆ z are specific to each segment, but the parameters δ x and δ z are common. This has been achieved in the Fourier domain (with the ADAM software) as described in [47] and recalled in Appendix A. Due to excessively large gaps between segments, we cannot simply concatenate the corresponding measurements without being overwhelmed by leakage effects, even when applying dedicated algorithms like M-ECM. Instead, we apply a discrete Fourier transform to each segment and cumulate the transformed equations in the Fourier domain as detailed in Appendix A.
As a result we get δ x = (0.0 ± 1.1) × 10 −15 for SUREF and δ x = (−1.5 ± 2.3) × 10 −15 for SUEP, where the errors given above are statistical errors at 1σ. As was to be expected from the results of individual segments, SUEP suffers from a larger statistical error than SUREF despite the larger number of segments used in the combined solution.

In the same spirit, we have analysed the cumulated segments containing fake signals as described in Sect. 5.6. For the SUREF the estimated increment on the Eötvös ratio is 3.38 × 10 −15 when 3.40 × 10 −15 was simulated and 34.01 × 10 −15 when 34.00 × 10 −15 was simulated. For the SUEP the results are respectively 3.37 × 10 −15 and 33.99 × 10 −15 .

Fig. 9 shows the histograms of the weighted residual accelerations in the frequency band around f EP after estimation of the parameters and subtraction of the model (7). We have checked that they are compatible with Gaussian statistics.

Note that if, instead of gathering all measurements to provide the global solution, we compute the weighted mean of the solutions for individual segments l as

$$ \delta_{x,M} = \frac{\sum_l \delta_{x,l}/\sigma_l^2}{\sum_l 1/\sigma_l^2} $$

and the associated variance

$$ \sigma_M^2 = \frac{1}{\sum_l 1/\sigma_l^2}, $$

we get the very similar results δ x,M = (−0.4 ± 1.1) × 10 −15 for SUREF and δ x,M = (−1.8 ± 2.2) × 10 −15 for SUEP. This is expected if the observations of the different segments are sufficiently independent. The same weighting is used to combine the systematic errors S l associated with individual segments (given in column 4 of Tables 10 and 11) in order to get the systematic error S M associated with the global solution [43]:

$$ S_M = \frac{\sum_l S_l/\sigma_l^2}{\sum_l 1/\sigma_l^2}, $$

which leads to the systematics S M = 2.3 × 10 −15 for SUREF and S M = 1.5 × 10 −15 for SUEP.

Putting all together, and remembering that the conventional Eötvös parameter η can be practically identified with the parameter δ x measured in this experiment, we end up for SUREF with

η(Pt, Pt) = [0.0 ± 1.1 (stat) ± 2.3 (syst)] × 10 −15 at 1σ.

As a null Eötvös parameter is expected in this case, this result gives a good indication that there is no important anomaly in the whole chain going from the measurements to the analysis and including the modelling. We finally get for SUEP

η(Ti, Pt) ≃ δ(Ti, Pt) = [−1.5 ± 2.3 (stat) ± 1.5 (syst)] × 10 −15 at 1σ.
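The segment-wise combination described in this section can be sketched with a standard inverse-variance weighting, which is assumed here to be the weighting meant by "weighted mean". The per-segment numbers are made up for illustration (in units of 10 −15) and the function names are hypothetical.

```python
import math

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its standard deviation."""
    weights = [1.0 / s**2 for s in sigmas]
    wsum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / wsum
    return mean, math.sqrt(1.0 / wsum)

def weighted_systematic(systematics, sigmas):
    """Combine per-segment systematic errors with the same weights."""
    weights = [1.0 / s**2 for s in sigmas]
    return sum(w * s for w, s in zip(weights, systematics)) / sum(weights)

# Hypothetical per-segment estimates, in units of 1e-15 (not from Tables 6 and 7)
deltas = [1.0, 3.0]
sigmas = [1.0, 1.0]
systs = [2.0, 4.0]

delta_m, sigma_m = weighted_mean(deltas, sigmas)
s_m = weighted_systematic(systs, sigmas)
print(delta_m, sigma_m, s_m)
```

With equal standard deviations the weighted mean reduces to the plain average and the combined standard deviation shrinks by the square root of the number of segments, whereas the combined systematic error is an average and does not shrink.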
This final result indicates that there is no visible violation of the WEP in the full data of the MICROSCOPE mission.

Conclusion
We have analysed the measurements provided by the payload T-SAGE flown on the MICROSCOPE satellite: these are the differences of accelerations of two test-masses made of the same material (PtRh10) for the sensor SUREF and two test-masses made of different materials (PtRh10 for the inner mass, Ti alloy for the outer mass) for the sensor SUEP. This involves 13 segments (i.e. sequences of continuous measurements sampled at 4 Hz) totalling 598 orbits for SUREF and 19 segments totalling 1362 orbits for SUEP. This represents accumulated free falls in the Earth's gravity field of about 41 days for SUREF and 94 days for SUEP. In the data analysis we have compared the measurements to a model including many effects, in particular those of the gravity gradients and of the gradient of inertia due to the tiny difference of positions of the test masses, together with a hypothetical signal of violation of the Equivalence Principle. Before these computations we have corrected and calibrated the instrumental parameters estimated during the dedicated calibration sessions. We have moreover detected and discarded the measurements affected by glitches. As this process breaks the regularity of the sampling, we had to use appropriate algorithms in order to prevent the effects of leakage.
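The accumulated free-fall durations quoted above follow from the orbit counts and the orbital period of about 5946 s given in the caption of Table 4. A short check (the helper name is hypothetical):

```python
ORBITAL_PERIOD_S = 5946      # seconds, from the caption of Table 4
SECONDS_PER_DAY = 86400

def orbits_to_days(n_orbits, period_s=ORBITAL_PERIOD_S):
    """Convert a number of orbits into days of accumulated measurement."""
    return n_orbits * period_s / SECONDS_PER_DAY

days_suref = orbits_to_days(598)    # 13 SUREF segments
days_suep = orbits_to_days(1362)    # 19 SUEP segments
print(f"SUREF: {days_suref:.1f} days, SUEP: {days_suep:.1f} days")
```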
In a first step the analysis has been performed separately on single segments using two different methods: M-ECM, operating in the time domain, which has been designed for optimal estimation using irregularly sampled data and is also able to reconstruct the most likely data where they are missing, and ADAM, operating in the frequency domain. The two methods give consistent estimates of the parameter δ = η + O(η 2 ). In particular, the values estimated with the SUEP accelerometer on the different segments are consistent with 0 at less than 2σ for most of them, and at 2.2σ for one of them. This distribution is compatible with Gaussian statistics. The value of σ depends on the segment, and in particular on its length, but is typically 10 −14 or smaller for the longest segments of 120 orbits. The systematic errors, analysed in [43] for the same segments, are significantly smaller.
In a final step we have gathered the data coming from all the segments for each SU in a single analysis, with the aim of obtaining the best signal-to-noise ratio from our full set of data. This led again to no detection of violation of the WEP since we obtained η(Ti, Pt) = [−1.5 ± 2.3 (stat) ± 1.5 (syst)] × 10 −15 for the SUEP. The result obtained for the SUREF, η(Pt, Pt) = [0.0 ± 1.1 (stat) ± 2.3 (syst)] × 10 −15 , confirmed the absence of bias in the whole analysis, a null value being expected in this case. It is common [29,56] to add quadratically the statistical and systematic errors. Doing this we conclude that the MICROSCOPE experiment does not see evidence for any difference of free fall between titanium and platinum test masses at a level of sensitivity of 2.7 × 10 −15 . This represents an improvement of almost two orders of magnitude with respect to the constraint before the launch of MICROSCOPE.
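The quoted sensitivity of 2.7 × 10 −15 is the quadratic sum of the SUEP statistical and systematic errors. The arithmetic can be checked as follows; the SUREF line applies the same rule to the numbers given above and is shown only for comparison.

```python
import math

def combine_quadratically(stat, syst):
    """Quadratic (root-sum-square) combination of two independent errors."""
    return math.hypot(stat, syst)

# Errors in units of 1e-15, taken from the results quoted above
suep_total = combine_quadratically(2.3, 1.5)
suref_total = combine_quadratically(1.1, 2.3)
print(f"SUEP:  {suep_total:.2f}e-15")
print(f"SUREF: {suref_total:.2f}e-15")
```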
Although this new upper bound on the WEP allows for improved bounds on beyond-GR models (see e.g. [40–42] for bounds obtained after the first MICROSCOPE results [35,39]), the challenges faced by fundamental physics remain as pressing as ever and call for still more precise experiments. New tests in space could improve MICROSCOPE's state-of-the-art measurement by two orders of magnitude in the next decades [57].
simply correspond to the signal associated to each parameter, sampled at the epochs of the measured acceleration.
The N measurements are assumed to be regularly sampled at a frequency f e over a duration T. In order to solve the problem in the Fourier domain, we take the Fourier transform of Eq. (7). To this aim, we make use of the DFT operator [F]. The DFT operator being unitary, the signal energy content is preserved by the transformation. The new system can be simply written

$$ \hat{Y} = \hat{A}\,\theta + n. \qquad (A.2) $$

The original quantities being real, the new system can be reduced to N useful real equations. These new equations can be grouped in pairs (related to the real and imaginary parts of the DFT), corresponding to the frequencies f k = k/T. Since each measurement projected in the Fourier domain can be associated with a discrete frequency, the corresponding weight is

$$ w(f_k) = \frac{1}{\gamma(f_k)}, \qquad (A.3) $$

where γ(f k ) is the PSD of the noise at the frequency f k .
As shown in Ref. [38], the MICROSCOPE mission was designed to concentrate the useful signal at specific frequencies (i.e., the gravity acceleration peaks at f EP , the gravity gradient signal at 2f EP and calibration signals at f cal ). This holds so well in the real data that a very simple analysis such as synchronous detection could lead to reasonable results. However, we use a more flexible method: we limit our least-squares inversion to the frequency bands containing the relevant signals. In practice, this is equivalent to extracting a subsystem of Eq. (A.2) by selecting the relevant equations to get the truncated system

$$ \hat{A}_r\,\theta + n = \hat{Y}_r. \qquad (A.4) $$

The data from disjoint segments can then be cumulated by gathering the corresponding matrices and vectors:

$$ \begin{pmatrix} \hat{A}_r^{(1)} \\ \vdots \\ \hat{A}_r^{(m)} \end{pmatrix} \theta + \begin{pmatrix} n^{(1)} \\ \vdots \\ n^{(m)} \end{pmatrix} = \begin{pmatrix} \hat{Y}_r^{(1)} \\ \vdots \\ \hat{Y}_r^{(m)} \end{pmatrix}, $$

where m is the number of sessions considered. Then an appropriately weighted least-squares technique can be used to solve the concatenated system, the weight associated with each frequency for each segment being chosen according to (A.3).
In the simplest cases, all parameters are common to all segments.If some parameters are specific to each segment (as could be the case for the polynomial coefficients or for the offcentring), the corresponding column of the design matrix related to the other segments is simply set to zero.
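The zero-padding of segment-specific columns can be sketched as follows. The array shapes, the constant fill values and the function name `stack_segments` are illustrative assumptions; the point is only the block structure of the concatenated design matrix, with common parameters sharing columns and segment-specific parameters getting their own.

```python
import numpy as np

def stack_segments(common_cols, specific_cols):
    """Build the concatenated design matrix for m segments.

    common_cols[l]   : (n_l, p) columns of parameters shared by all segments
    specific_cols[l] : (n_l, s) columns of parameters specific to segment l
    Columns belonging to other segments are set to zero.
    """
    m = len(common_cols)
    s = specific_cols[0].shape[1]
    rows = []
    for l, (common, specific) in enumerate(zip(common_cols, specific_cols)):
        blocks = [common]
        for j in range(m):
            # segment l only sees its own specific parameters
            blocks.append(specific if j == l else np.zeros((common.shape[0], s)))
        rows.append(np.hstack(blocks))
    return np.vstack(rows)

# Two hypothetical segments with 3 and 4 equations, 1 common and 2 specific parameters
seg_common = [np.ones((3, 1)), np.ones((4, 1))]
seg_specific = [np.full((3, 2), 2.0), np.full((4, 2), 3.0)]
A = stack_segments(seg_common, seg_specific)
print(A.shape)
print(A)
```

The resulting matrix has one column for the common parameter followed by one pair of columns per segment, so a single weighted least-squares solve returns the common parameters together with every segment-specific one.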

Figure 1. The MICROSCOPE satellite (left panel) and the T-SAGE instrument with its two cylindrical sensor units (right panel).

Figure 2. Left: the 4 test-masses orbiting around the Earth (credits CNES / Virtual-IT 2017). Right: reference frames for the satellite and for one pair of masses. The (X sat , Y sat , Z sat ) triad defines the satellite frame; the reference frames (X k , Y k , Z k , k = 1, 2) are attached to the test-masses (black for the inner mass k = 1, red for the outer mass k = 2); the X k axes are the longitudinal axes of the test-mass cylinders and define the direction of the WEP test measurement; the Y k axes are normal to the orbital plane, and define the rotation axis when the satellite spins; the Z k axes complete the triads. The 7 µm gold wires connecting the test-masses to the common Invar sole plate are shown as yellow lines. ∆ represents the test-masses offcentring. The centres of mass have been approximately identified with the origins of the corresponding sensor-cage-attached reference frames.

Figure 3. Differential acceleration for session 380. The first graph (top left) shows raw data in blue and filtered data (using a running average over 60 s) in red. The red curve reveals leaps which are not clearly visible in the raw data: this appears more clearly by zooming in on the ordinate axis as shown in the second graph (top right). Finally the third graph (bottom), which zooms in on the first leap, shows that this is not a simple step but that there are disturbed measurements in the neighbourhood. Consequently, in the actual analysis, we use only data belonging to the 2 segments represented in the top right panel.

Figure 4. Example of superposition of the acceleration measured (after removal of a low frequency trend) by the 4 test-masses at the same time. The large peaks appear on all masses but with slightly different amplitudes.

Figure 6. Amplitude Spectral Density of the sensor differential acceleration along the X-axis for SUREF during session 176 (upper panel) and for SUEP during session 236 (lower panel). Note the peak at frequency 2f EP coming from the Earth's gravity gradient due to the offcentring of the two test masses.

Figure 7. Eötvös parameter estimates for each SUREF segment and their 68% confidence error bars. Blue circles show M-ECM's estimates and orange ones ADAM's.

Figure 9. Histograms of the residuals, in the frequency band around f EP , of the measured acceleration after fitting of the model (7) in the Fourier domain for SUREF (left panel) and SUEP (right).

(A. 4 )
One can cumulate the data from disjoint segments by just gathering the corresponding matrices and vectors:

Table 1. Main frequencies of interest.

Table 2. List of analysed sessions dedicated to the EP test with SUREF and their characteristics. Column 3 gives the calendar date of the beginning of each session, whereas columns 4 and 5 give the beginning and the end in terms of orbit number.

Table 3. Same as Table 2 for sessions performed with SUEP.

Table 4. Characteristics of the segments selected for our analysis of the SUREF data. The segment number corresponds to the session number extended by an index when there is more than one segment in the session. The duration is given as a multiple of orbital periods, remembering that this period is about 5946 s. The position of the segment in the session is indicated by the first and the last orbits which are included in the segment. The fourth column indicates the percentage of data eliminated from each segment during the pre-processing (see Sect. 4.2.2).

Table 5. Same as Table 4 but for SUEP.

Table 6. Values of the Eötvös parameter δ x with their associated standard deviation estimated from SUREF measurements over individual segments. The table also reports the value of the component δ z in phase quadrature. The values followed by (M) result from the M-ECM analysis while the values followed by (A) come from ADAM. Segments marked by an asterisk * correspond to sessions in spin V3 and the others to sessions in spin V2.

Table 7. Same as Table 6 but for SUEP.

Table 8. Estimation of a simulated fake EP signal to test the robustness of the analysis process for SUREF. Column groups: fake Eötvös parameter 3.40 × 10 −15 and fake Eötvös parameter 34.00 × 10 −15 .

Table 9. Estimation of a simulated fake EP signal to test the robustness of the analysis process for SUEP. Column groups: fake Eötvös parameter 3.40 × 10 −15 and fake Eötvös parameter 34.00 × 10 −15 .

Table 10 (caption, continued). For example, systematic errors in m s −2 have been divided by 10 −15 g ≃ 7.9 × 10 −15 m s −2 to express them in units of 10 −15 for the Eötvös parameter. Segments marked by an asterisk * correspond to sessions with spin V3 and the others to sessions with spin V2.

Table 11. Same as Table 10 but for SUEP.