Unsigned magnetic flux proxy from solar optical intensity spectra

The photospheric unsigned magnetic flux has been shown to be highly correlated with radial velocity (RV) variations caused by solar surface activity. This activity indicator is therefore a prime candidate to unlock the potential of RV surveys to discover Earth twins orbiting Sun-like stars. We show for the first time how a precise proxy of the unsigned magnetic flux ($\Delta\alpha B^2$) can be obtained from Sun-as-a-star intensity spectra by harnessing the magnetic information contained in over 4000 absorption lines in the wavelength range from 380 to 690 nm. This novel activity proxy can thus be obtained from the same spectra from which RVs are routinely extracted. We derived $\Delta\alpha B^2$ from 500 randomly selected spectra from the HARPS-N public solar data set, which spans from 2015 to 2018. We compared our estimates with the unsigned magnetic flux values from the Solar Dynamics Observatory (SDO) finding excellent agreement (median absolute deviation: 4.9 per cent). The extracted indicator $\Delta\alpha B^2$ correlates with SDO's unsigned magnetic flux estimates on the solar rotational timescale (Pearson correlation coefficient 0.67) and on the three-year timescale of our data set (correlation coefficient 0.91). We find correlations of $\Delta\alpha B^2$ with the HARPS-N solar RV variations of 0.49 on the rotational timescale and 0.78 on the three-year timescale. The Pearson correlation of $\Delta\alpha B^2$ with the RVs is found to be greater than the correlation of the classical activity indicators with the RVs. For solar-type stars, $\Delta\alpha B^2$ therefore represents the best simultaneous activity proxy known to date.


INTRODUCTION
A planet causes the radial velocity (RV) of its host star to change periodically over time.Yet, Doppler-like signals caused by the star itself, linked to the interplay between the evolving magnetic field and stellar surface convection, can drown out and mimic planetary signals.These manifestations of stellar magnetic activity represent a major obstacle to detecting planetary-induced RVs below 1 m s −1 (see Crass et al. 2021), with only very few measurements below this threshold (e.g.Faria et al. 2022).To date, Earth-like planets orbiting solar-type stars in the habitable zone are out of reach, as they produce RV signals with semi-amplitudes of the order of 10 cm s −1 .It is therefore essential to disentangle planetary and stellar RV components to obtain a clean planetary RV curve.
Stellar activity subsumes a range of phenomena including stellar magnetic cycles (Lanza 2010;Costes et al. 2021), starspots (Saar & Donahue 1997;Desort et al. 2007;Lagrange et al. 2010), faculae and plages (Saar & Donahue 1997;Saar 2003Saar , 2009;;Meunier et al. 2010a,b), meridional flows (Meunier & Lagrange 2020), granulation (Dravins 1982;Dumusque et al. 2011;Meunier et al. 2015; ★ E-mail: fl386@cam.ac.ukCegla et al. 2019), super-granulation (Rieutord et al. 2010;Rincon & Rieutord 2018;Meunier & Lagrange 2019), and p-mode oscillations (Mayor et al. 2003;Medina et al. 2018;Yu et al. 2018;Chaplin et al. 2019).These phenomena act on different timescales and have different impacts on the RVs.An effect of particular importance is the suppression of convective blueshift (Meunier et al. 2010a).For solar-type stars, the emission emanating from convective upflows dominates over the downflows and leads to a net blueshift of the stellar spectrum.However, this effect is modulated by the magnetic field inhibiting stellar surface convection (e.g.Hanslmeier et al. 1991).Since the magnetic field is spatially inhomogeneous, regions with suppressed convection rotate in and out of view as the star rotates, leading to a varying Doppler shift and variations in the shape of the absorption lines.In addition, the magnetic field evolves in time, and thus the overall effect also varies in time beyond the rotational timescale.
The hemispherically averaged unsigned magnetic flux | Bobs | has been shown experimentally to be an excellent proxy for variations in solar RV (Haywood et al. 2016(Haywood et al. , 2022)).This finding is supported by analyses of Dopplergrams and magnetograms from the Michelson Doppler Imager (Scherrer et al. 1995) presented in Meunier et al. (2010b).They showed that suppression of convective blueshift is pro-nounced where the magnetic field is strong.In addition, simulations by Meunier et al. (2010a) showed that the attenuation of the convective blueshift is indeed the dominant contributor to star-induced RV variations.The analyses in Meunier et al. (2010a) indicate that the attenuation of the convective blueshift leads to a long-term RV signal with an amplitude of about 8 m s −1 and thus impedes the detection of Earth twins orbiting solar-type stars.The photometrically induced RV variations due to bright active regions and dark starspots rotating in an out of view, on the other hand, partially cancel out and are of lesser concern.
The strength and evolution of stellar magnetic fields are challenging to measure, though.The earliest measurements of the solar magnetic field date back to 1908, with Hale (1908) ) observing Zeeman splitting (Zeeman 1897) in sunspot spectra.Most of the methods that exist to date are either only applicable to highly active stars, require polarimetric data, measurements at infrared wavelengths or a combination of these (Saar & Linsky 1985;Valenti et al. 1995;Johns-Krull et al. 1999;Reiners & Basri 2006).An overview of magnetic field estimation methods is provided in Sections 2.3 and 2.4.
In this study, we show how a proxy of | Bobs | can be derived from intensity spectra in the visible wavelength range of weakly active to moderately active stars.These stars have been, and continue to be, prime targets of RV surveys.This work represents an extension of, and builds on, the Multi-Mask Least-Squares Deconvolution (MM-LSD) method 1 presented in Lienhard et al. (2022).They analysed the performance of Least-Squares Deconvolution (LSD; Donati et al. 1997) as a tool to extract RV information from spectra of FGK-type stars and the dependence of the measured RV on various parameters.For this purpose, Lienhard et al. (2022) developed a pipeline that continuum-normalises deblazed echelle order spectra, partially corrects for telluric absorption lines, masks problematic wavelength regions, and finally extracts the RV via LSD.The preprocessing steps implemented in this pipeline and the convolution approach are reused in the present analysis.
In Section 2, we explain the basics of Zeeman splitting, describe different magnetic field diagnostics based on this effect, as well as the in this context commonly used LSD method.In Section 3, we describe the data products used in this analysis.The proposed magnetic flux proxy is presented in Section 4. We describe the application on HARPS-N solar spectra in Section 5. Lastly, we show and discuss our results in Section 6, and conclude in Section 7.

Least-Squares Deconvolution
The LSD method relies on physical information about the absorption lines, such as line depth and wavelength, to model the spectra at hand on the basis of few assumptions.The objective of using a simple model is to describe and model the bulk of the absorption lines, rather than to precisely model single lines.In this way, one can tease out line information from all absorption lines and extract information that is otherwise hidden in the noise.The LSD model for a spectrum is generated by convolving a common profile with a line list consisting of scaled delta functions at the rest wavelengths of the known absorption lines.The delta functions are scaled depending on what needs to be modelled.To model intensity spectra, for instance, 1 Available on github: https://github.com/florian-lienhard/MM-LSD.
we scale the delta functions by the expected relative depths of the relevant spectral absorption lines (e.g.Lienhard et al. 2022).
By applying least-squares fitting, the best-fitting common profile can be determined.Similarly to the Cross-Correlation Function (CCF; Baranne et al. 1996;Pepe et al. 2002), this common profile represents the average shape of the absorption lines.Analogously to the CCF, one can extract the stellar RV from this common profile.Other applications include modelling Stokes  spectra to estimate the stellar magnetic flux (e.g.Donati et al. 1997).
There are two main assumptions that LSD is based on.One is that absorption lines add up linearly.This assumption is valid for weak absorption lines only (Kochukhov et al. 2010).Secondly, LSD defines the common profile in velocity space.It is thus assumed that the absorption lines have the same shape in velocity space and only scale by a wavelength-specific factor.This standard assumption in LSD (Donati et al. 1997;Kochukhov et al. 2010;Lienhard et al. 2022) is based on the fact that the conditions on the stellar surface are similar, all lines are rotationally broadened, and the dominant atomic absorbers for FGK-type stars, such as Fe, Ni, Cr, and Ti, all have similar atomic masses and hence similar thermal broadening.The width component due to thermal broadening is where  is the plasma temperature,  the speed of light,  the Boltzmann constant,  the atomic mass, and  0 the rest wavelength of the absorption line (e.g.Bellot Rubio & Orozco Suárez 2019).Since this expression scales linearly with wavelength, the width contribution in the velocity domain remains constant 2 .Lastly, micro-and macro-turbulent broadening are the same for all species.
The LSD method can therefore be a useful tool if we find the same general shape at the wavelengths of the absorption lines and we can model this shape as a profile fixed in velocity space scaling with factors that depend on the lines' physical properties.

Zeeman effect
In this Section, we summarise the theoretical background of Zeeman splitting and the relevant equations describing its effect on absorption lines.A magnetic field splits an absorption line involving a magnetically sensitive state into multiple absorption lines at slightly offset wavelengths.This effect is called Zeeman splitting and is due to the external magnetic field splitting an initially degenerate energy level with angular momentum  into 2 + 1 sublevels.The energy difference between the original degenerate state and the split components is proportional to the magnetic field strength and their respective magnetic quantum number .Transitions involving these split states are therefore shifted in wavelength.For a review on Zeeman splitting in stellar spectra, see e.g.Reiners (2012); Stenflo (2013); Bellot Rubio & Orozco Suárez (2019).
For electric dipole transitions, the selection rules allow for Δ  = 0 and Δ  = ± 1.The former produce the unshifted central  component, while the latter produce the shifted  components.A simple triplet of lines is created when  of one of the involved states is equal to 0 or if the Landé factors of both states are equal.More complicated patterns can still be treated as a triplet by computing an effective Landé factor  eff for the transition.The polarisation of the 2 The conversion can be made with the standard approximation Δ  = Δ

𝑐
where Δ is the observed Doppler shift due to the source moving with velocity Δ relative to the observer.
individual components depends on the direction of the magnetic field relative to the observer.Crucially, the  components are circularly polarised with opposite orientation if the magnetic field vector points towards or away from the observer.Since the  components are shifted in wavelength with the shift proportional to the magnetic field strength, this generates a characteristic signal in Stokes  and enables the estimation of the stellar magnetic flux from polarised spectra, as outlined in Section 2.3.For transverse fields, on the other hand, the three components are linearly polarised producing signals in Stokes  and  (Kochukhov et al. 2011).
The wavelength difference between the  and the  components is equal to which translates to a velocity shift of with the rest frame wavelength  0 in Å,  eff the dimensionless effective Landé factor, the magnetic field strength  in kG, and the velocity shift in km s −1 .It follows that lines at longer wavelengths exhibit stronger Zeeman splitting.A field of 1 kG strength typically shifts an absorption line in the visible range by about 1 km s −1 .This is orders of magnitudes larger than the typical RV shift due to a planet.However, the Zeeman signal manifests very differently in the absorption lines, as it leads to a varying shift of a varying fraction of the absorption lines rather than a velocity shift affecting the entire spectrum uniformly.Furthermore, only a tiny surface fraction (less than a few per cent for the Sun, as can be derived from Milbourne et al. 2021;Haywood et al. 2022) of weakly active stars is affected by such strong fields.The Zeeman signal is therefore washed out and intermixed with weaker splitting patterns.Since the width of a typical absorption line is much greater than 1 km s −1 , Zeeman splitting generally leads to slightly broadened lines in the optical for FGK-type stars, rather than separated Zeeman triplets.For non-saturated lines Zeeman splitting does not alter the equivalent width.However, the splitting expands the saturation region of saturated absorption lines and thereby increases their equivalent widths.This effect is called Zeeman intensification (e.g.Saar et al. 1992;Basri & Marcy 1994;Kochukhov et al. 2020).
In the following two sections, we describe how polarised and unpolarised stellar spectra are affected differently by Zeeman splitting and how this relates to the techniques used to characterise magnetic fields.

Polarimetric measurements
The 4th component of the Stokes vector, Stokes , is defined as the difference between right and left-handed circular polarisation (for a review see Stenflo 2013).The  components are oppositely circularpolarized when the magnetic field vector is parallel to the line of sight, as mentioned in Section 2.2.This leads to a characteristic signal in Stokes  thus encoding the strength and orientation of the large-scale magnetic field.In the weak field regime, this information can be extracted from multiple lines simultaneously using LSD, as described in Donati et al. (1997).
The LSD method is used, for instance, to compute surface magnetic maps through Zeeman Doppler Imaging (ZDI; Semel 1989;Donati et al. 1989;Kochukhov & Wade 2016).ZDI capitalises on the fact that Stokes  signatures of active regions are blueshifted as they first appear on the visible stellar hemisphere and then progressively shift towards longer wavelengths as the star rotates.Given a series of observations at different times, surface magnetic maps can be reconstructed by finding the maximum-entropy magnetic field geometry that produces the observed Stokes  time series (e.g.Skilling & Bryan 1984;Donati et al. 2006;Folsom et al. 2018).
Methods based on Stokes  permit the extraction of magnetic field diagnostics for rapidly rotating stars, but they have some inherent disadvantages.Mainly, Stokes  signals from adjacent stellar surface regions of opposite polarity can cancel out if the respective polarised components are not sufficiently separated in wavelength (e.g.Saar 1988;Donati et al. 1997;Reiners 2012).This leads to an underestimation of the magnetic field strength.Linear polarisation signals are much weaker, can be significantly affected by magnetooptical effects (Landolfi & Landi Degl'Innocenti 1982), and are also subject to cancellation effects (Saar 1988;Reiners 2012).For instance, Kochukhov et al. (2011) find the linear polarisation signal to be 10 times weaker than the circular polarisation signal.Lastly, taking polarimetric measurements requires additional equipment, poses technical challenges, and generally uses up more observation time to collect the same number of photons as compared to Stokes I measurements.

Stokes I measurements
Extraction of magnetic field information from intensity spectra is fraught with technical complications but in principle has significant advantages over polarimetric methods.It is important to note that, conversely to Stokes  measurements, intensity spectra are not affected by cancellation effects due to regions of opposite polarity.Also, there are many high-resolution spectrographs designed for radial velocity studies on solar-type stars producing extensive time series of Stokes I spectra, but only few that provide polarimetric data.The capability to simultaneously measure the evolution of the instantaneous magnetic flux and the RV from intensity spectra is expected to lead to the discovery of smaller planets and improve mass measurements of known planets in the vast amount of existing and upcoming data.
Zeeman splitting measurements in the visible range are challenging since the line profile changes are very small in Sun-like stars and can be confused with other line broadening effects, such as thermal broadening (see e.g.Reiners 2012;Bellot Rubio & Orozco Suárez 2019).Since Zeeman splitting is proportional to the wavelength squared (see Eq. 2), many intensity-based methods therefore focus on extracting information from one suitable line in the infrared at very high spectral resolution.For instance, a few studies are based on the Mg I line at 12.32 m (e.g.Brault & Noyes 1983;Zirin & Popp 1989;Bruls & Solanki 1995).The infrared domain poses instrumental problems, however, and is riddled with telluric absorption lines that can lead to a higher RV error (Cunha et al. 2014).Water absorption lines are especially problematic as the precipitable water vapour content is spatially inhomogeneous and variable (e.g.Leet et al. 2019;Cretignier et al. 2021).Furthermore, most of the spectrographs designed for RV studies record stellar intensity spectra in the visible wavelength range.For solar-type stars, this wavelength range is optimal because there is a large number of absorption lines and the SNR is highest as the stellar flux peaks in the visible.For example, HARPS (Mayor et al. 2003) and HARPS-N (Cosentino et al. 2012) measure spectra from 383 to 690 nm, EXPRES from 390 to 780 nm (Jurgenson et al. 2016), and ESPRESSO from 378.2 to 788.7 nm (Pepe et al. 2021).Other instruments, such as CARMENES (520-960 and 960-1710 nm, Quirrenbach et al. 2016) or NEID (380-930 nm, Halverson et al. 2016) also record parts of the near-infrared spectrum.However, these instruments still do not reach the wavelength regime where cleanly split Zeeman diagnostic lines are found.Hence, there is a need for a magnetic flux proxy for intensity spectra in the visible wavelength range.
There are some studies describing Stokes I magnetic field fitting techniques.Stenflo & Lindegren (1977) fit 402 unblended unpolarised Fe 1 absorption lines in the visible wavelength range observed at the solar disk centre.They fit the line widths at different depths and were able to estimate an upper limit for the average magnetic flux of 110 G for the Sun.Kochukhov et al. (2020) have extracted magnetic flux estimates for Sun-like stars via Zeeman intensification.However, their error bars are much larger than the typical average magnetic flux variations of less than 1 G that are relevant for stellar activity mitigation.In this context, we present the first method able to produce sufficiently precise magnetic flux time series.

VALD3
To model the magnetic response of the absorption lines, we require the stellar absorption lines' wavelength, depth, and their effective Landé factor.This information can be retrieved from the Vienna Atomic Line Database (VALD3; Ryabchikova et al. 2015).Since we analyse solar spectra in this work, we set the stellar microturbulence to 1.1 km s −1 , the effective temperature ( eff ) to 5833 K, and surface gravity (log ) to 4.44, and the chemical composition to solar values.These stellar parameters were estimated as outlined in Lienhard et al. (2022), section 2.3.Our estimate for the solar effective temperature is marginally higher than the recommended value of 5772 K (Prša et al. 2016).We kept the value that we derived from the HARPS-N spectra to keep the analysis consistent.We do not expect this to have any measurable impact on our results.We only included lines with relative depth greater than 0.2 to exclude very weak lines which are often affected by noise.Furthermore, we excluded all molecular absorption lines in the VALD3 list to have a more homogeneous set.This removes about 12 per cent of the lines in our list and very marginally improves our results.About 50 per cent of the remaining lines are due to Fe 1.

HARPS-N
HARPS-N is a pressure and temperature-stabilised cross-dispersed echelle spectrograph operational since 2012.It has a resolving power of R = 115,000 in the visible range from 383 to 690 nm over 69 spectral orders.In addition to the nightly observations, HARPS-N is outfitted with a solar telescope to record disk-integrated spectra of the Sun at 300-second cadence, and has been doing so for several hours on most days since 2015 (Cosentino et al. 2014;Dumusque et al. 2015;Phillips et al. 2016;Collier Cameron et al. 2019;Dumusque et al. 2021).
For this study, we randomly selected one spectrum from each observing day contained in the set of three years of high-quality HARPS-N solar observations presented in Dumusque et al. (2021). 3e note that HARPS-N had a cryostat leak requiring periodic interventions.As a result, the spectra within 5 days from an intervention were excluded from this dataset because they can be affected by flux variations and are not representative of HARPS-N's usual performance.
In total, we selected 500 spectra.The exposure time was 300 seconds for each of them.The first spectrum was recorded on 29 July 2015, and the last spectrum on 18 May 2018.The airmass of the exposures ranges between 1 and 2.9, with the median being around 1.3.The minimum signal-to-noise ratio (SNR) at 550 nm is about 250, the maximum is around 460, and the median is equal to 380.These SNR values are very high compared to nightly HARPS-N observations with a typical SNR between 50 and 200.As we show in Section 6.2, our approach does not require a very high SNR.
For each spectrum, we extracted MM-LSD RVs as well as CCF RVs.Furthermore, the Data Reduction System (DRS) also computes several activity indices.For this study, we used the Full Width at Half Maximum (FWHM), contrast and bisector inverse slope (BIS) of the CCF.The RV, FWHM, and contrast values were corrected for effects of Solar System motions as detailed in Collier Cameron et al. (2019).Furthermore, the log  ′  index was computed directly from the HARPS-N spectra, yielding values between -5.03 and -4.96.This indicator quantifies the emission in the cores of the Ca II H (3968.47 Å) and K (3933.66Å) spectral lines which is induced by magnetic activity.First, the S-index is computed standardly by weighing the emission within these bands with a triangular response function with width 1.09 Å and dividing by the reference bands with a width of 20 Å at 3900 and 4000 Å (Gomes da Silva et al. 2011;Dumusque et al. 2021).The emission within the line cores was susceptible to contamination due to effects related to the cryostat leak.The leak led to the build-up of humidity over time, increasing the reflectivity in the detector and producing local flux variations called ghosts.The impact of these ghosts on the extracted S-index is corrected as described in Dumusque et al. (2021).The S-index is then converted to log  ′  following Noyes et al. (1984).

SDO
To validate our results, we compare with data from the Helioseismic and Magnetic Imager (HMI) instrument onboard the Solar Dynamics Observatory (SDO; Schou et al. 2012;Pesnell et al. 2012;Couvidat et al. 2016).HMI measures the line-of-sight magnetic flux through the magnetically sensitive Fe I line at 6173.3 Å.We downloaded4 the 720 s magnetograms and intensitygrams using a six-hour cadence spanning our full HARPS-N time range.The absolute value of the SDO line-of-sight magnetic fluxes were intensity-weighted and summed over all pixels as outlined in Haywood et al. (2016) to compute | Bobs | .To estimate the filling factors of active regions, we used the same thresholds as in Haywood et al. (2016) and Milbourne et al. (2019) to distinguish between faculae, sunspots, and quiet photosphere.More specifically, the magnetic field was assumed to be radial.Any pixel with foreshortening-corrected magnetic flux below 24 G was assumed to measure quiet photosphere.Pixels above this threshold were divided into sunspots and faculae using an intensity threshold of 0.89 times the mean pixel intensity corrected for limb-darkening as in Yeo et al. (2013).The filling factor and unsigned magnetic flux time series can alternatively be obtained using SolAster, presented in Ervin et al. (2022).
There is a minor time difference between the HARPS-N observations and the SDO magnetic flux measurements.This time difference is smaller than 6 hours (mean absolute difference: 2 hours) for all our measurements.Since | Bobs | and the RVs only marginally evolve over this timescale, the time difference only minimally influences our results.By interpolating the SDO data to the timestamps of the HARPS-N spectra, we achieve correlations about 0.005 higher than those reported in this analysis.

Tapas
The Transmissions of the AtmosPhere for AStromomical data database (TAPAS; Bertaux et al. 2014) provides transmission spectra for the Earth's atmosphere.We use one arbitrarily chosen transmittance spectrum (La Palma Roque de los Muchachos Canarias Spain, 2018/3/30, 01:45:07, airmass 1.03) to identify and exclude spectral regions impacted by deep tellurics.More information is provided in Section 5.1 and in Lienhard et al. (2022).

EXTRACTING THE UNSIGNED MAGNETIC FLUX
In this Section, we first describe our model for the difference between a Zeeman-split and an unsplit absorption line and where this model is valid.We subsequently show how the difference between our spectra and a master spectrum can be fit using the LSD approach to extract a proxy for | Bobs | .

Residual model
For simplicity, we assumed that the magnetic field strength distribution on the stellar surface can be captured by two components: the quiet surface and the active regions with magnetic field strength roughly three orders of magnitude higher.The exact values do not have to be fixed for the algorithm described below.The intensity profile of a Zeeman-split line can then be modelled as in Title & Tarbell (1975), Robinson (1980), and Marcy (1982): The parameter  captures the emission from the shifted  components,  quantifies the emission from unsplit lines and the unshifted  component, Δ is the Zeeman shift proportional to the effective Landé factor and the magnetic field strength as in Eq. 2, and  describes the shape of the line.Both values  and  finally depend on the orientation of the magnetic field relative to the observer, as explained in Section 2.2.The model in Eq. 4 relies on the assumption that the  and the  components have the same profile and do not interact.The validity of this assumption is based on the shifted  components having opposite polarity.This greatly simplifies the radiative-transfer problem, as it allows diagonalising the transfer matrix leading to non-interaction between the polarisation components (e.g.Stenflo et al. 1984).Note also that we assumed that the magnetic field strength does not vary radially within the active regions.We, therefore, assume that all lines are exposed to the same magnetic field independent of their formation height.
Assuming that the lines' equivalent widths remain the same, Eq. 4 can be simplified: (5) In the following, we assume that the ratio of  to  components for the active regions remains about the same.In this case, the parameter  in Eq. 5 is directly proportional to the filling factor of active regions on the visible stellar hemisphere. 5We note that the strength of the  to  components for a given active region depends on the angle between the line of sight and the magnetic field vector (e.g.Seares 1913;Marcy 1982;Skumanich & López Ariste 2002).By the Seares' relation, the intensity of one  component at a given position on the stellar surface is equal to where  0 is the total intensity of all three components and  is the angle between the line-of-sight and the magnetic field vector.The strength of the  component is then: For disk-integrated spectra, Marcy (1982) assumed a radial field and estimated the average  to 34 • .Since we are mainly interested in the evolution of the magnetic flux, we do not need to estimate an average field-line to line-of-sight angle.However, we assume that the active regions are homogeneously distributed such that in the disk-integrated spectra the ratio remains about constant.This means that the factor capturing the disk-averaged value of 0.25(1 − cos 2 ) is absorbed in .
Lastly, we assume that the line profiles are all Gaussian.With few exceptions, this is a valid assumption for optical absorption lines of main-sequence FGK-type stars given our resolution and precision.The intensity profile of an absorption line with line depth , central wavelength , and width   is then in the absence of magnetic flux.The line profile emerging from the active region is equivalently: where  L is equal to 1 +  and  R is equal to 1 −  with  representing the Zeeman-induced velocity shift (see Eq. 3) divided by the speed of light : Subtracting the magnetic line from the non-magnetic line, we get with The expression in Eq. 13 is equal to the difference between two Gaussian absorption lines that are shifted relative to each other.It can be decomposed in terms of the Hermite-Gaussian polynomials as in Holzer et al. (2021): where   are the Hermite-Gaussian functions defined as in Eq.A5 and   are coefficients (Eq.A3 and A4) that depend on the shift of the two Gaussians relative to each other.For more details, we refer to Appendix A or Holzer et al. (2021).By Taylor-expanding the relevant coefficients in , a simple expression emerges, as the terms that are odd in  cancel out due to the symmetry of Zeeman splitting.The odd components  1  1 and  3  3 can be neglected for the same reason.Thus, we get the following convenient expression: The parameter   denotes the standard deviation of the Gaussian absorption line profile in wavelength space.We can translate this to the velocity space and assume that the width of all lines in this space is about the same, as in LSD.
then translates to    and with  0 the absolute radial velocity of the star: Note that keeping the dominant components of Eq.16, we recover the shape of the second derivative of the absorption line (cf.Eq.B4).

Residual approximation
We included only the dominant quadratic terms of the Taylor expansion in our estimation of the residual shape in Eq. 16.Quartic and later terms only become relevant if Our approximation is valid if the width (  ) of the absorption lines in velocity space is larger than the Zeeman shift 1.4 × 10 −4  0  eff .While the condition above resembles the weak-field condition, it only refers to the validity of the Taylor expansion.
Note that the expression in Eq. 16 is equally valid for local as well as for disk-integrated spectra, given our assumptions.The reason for this lies in the fact that we model absorption lines as Gaussian functions and rotational and macroturbulent broadening are well described by a convolution with a Gaussian kernel (Takeda & UeNo 2017;Sheminova 2019).A Gaussian  (,  2 ) convolved with a Gaussian kernel  (0,  2  ) results in  (,  2 +  2  ) and therefore remains Gaussian in shape.Our residual shape is expressed as a sum of Gaussian functions.To translate this residual expression from the local to the disk-integrated spectra, it is convolved with a Gaussian kernel.Since the convolution is distributive, the Gaussian functions in the sum can be individually convolved with the broadening kernel.Therefore, the residual shape is equally a sum of Gaussian functions for the local and disk-integrated spectra.The only difference is the line width and line depth.Since we assumed constant   for the line profiles, the line widths are equally constant for the local profiles.By convolving with a Gaussian kernel, the line depths change by a multiplicative factor.Therefore, the extracted residual shape amplitudes scale by a factor that depends on the broadening kernels.A multiplicative factor is of no concern for the purpose of RV detrending, however, and can remain unaccounted for.For our purposes, we investigate disk-integrated residual spectra, and therefore we need to ensure the Taylor expansion is a good fit to those spectra.Therefore, we set   to the line width measured from these spectra.
Our approximation is generally valid for absorption lines of weakly active FGK-type stars in the optical wavelength range.For the Sun,  is about 3 km s −1 and a 1 kG magnetic field produces a typical Zeeman shift of about 1 km s −1 , which therefore comfortably lies within our range of validity.The typical magnetic field strength of plage regions on the Sun, however, is about 1.5 kG with some areas reaching up to 2 kG (e.g.Rueedi et al. 1992;Martínez Pillet et al. 1997;Buehler et al. 2015).Similarly, magnetic field concentration in the solar network reach the same magnetic flux strengths (Buehler et al. 2019).In the following, we investigate the behaviour of the residual profile for a magnetically sensitive line for a magnetic field strength of 1 and 2 kG.In Fig. 1, we show the residual profile computed from the original Gaussian expression in Eq. 11 in yellow, as well as the dominant Hermite-Gaussian components and our approximation as in Eq. 16.
For this example, we chose an absorption line at 5000 Å, with  equal to 3 km s −1 , relative depth 0.3,  eff of 2,  set to 0.1 and the magnetic field strength in the active region to 2 kG.This results in a Zeeman shift of 2.8 km s −1 which is just within our validity range of 3 km s −1 in this case.For comparison, we display the approximations for a 1 kG field keeping the other line parameters the same in Fig. 2. Since the residual profiles scale with  2 , the amplitude is reduced by a factor of 4 in Fig. 2. Note that only small fractions of the solar surface are affected by such strong fields (e.g.Haywood et al. 2016) and only about 3 per cent of the included lines have an effective Landé factor greater than 2.
For both cases,  0  0 + 2  2 is already very close to the numerical solution.The difference between our Taylor approximation as in Eq. 16 and the Hermite-Gaussian approximations  0  0 +  2  2 and  0  0 +  2  2 +  4  4 lies mainly in the quartic components.Given that the example shown in Fig. 1 treats a case that is at the very edge of the validity range and given the noise present in the spectra, including quartic components is not warranted.Also, the quartic components break the linearity of the problem that allows us to use the LSD approach in Section 4.3.
Apart from the question of when our Taylor expression in Eq. 16 is valid, we also need to consider when the underlying parametrisation in Eq. 9 is valid on the local level where the lines are narrower.The parametrisation in Eq. 9 is based on the weak-field approximation which breaks down for absorption lines with very high Landé factors in active regions with high magnetic field strengths (Jefferies et al. 1989;Lehmann et al. 2015).The violation of this condition may have a larger impact on the extraction of Δ 2 for more active stars but overall holds for solar-type stars.For now, we thus recommend applying our model to the spectra of solar-type stars.

Gaussian absorption lines
In the derivation, we assumed a line is well characterised by a Gaussian profile.Deviations from Gaussian profiles can arise due to pressure broadening leading to more prominent line wings.These absorption lines are generally better characterised by Voigt profiles.The residual profile emerging from these lines is still symmetric and broadly follows our approximation such that Zeeman-induced variability can still be captured.Another factor leading to deviations from the Gaussian line shape is stellar surface convection (see e.g.Gray 2005).The latter produces slightly asymmetric absorption lines in disk-integrated spectra (Gray 2005;Cegla et al. 2019).More specifically, the asymmetry is due to granulation.As we observe a star, we record both the blue-shifted light emitted from upwards-flowing hot matter in the granules and the redshifted light emitted from the downwards-flowing cooler matter in the surrounding intergranular lanes.Averaged over the stellar hemisphere, this leads to C-shaped bisectors.A bisector is defined as the line connecting the mid-points of each horizontal segment of an absorption line (see e.g.Gray 2005, p.297).For our purposes, the variation of the line shapes in time is more important than the line shapes themselves.To investigate the residual shapes caused by bisector variations, we analysed the CCF variations of HD 166435 6 recorded with the ELODIE spectrograph (Baranne et al. 1996).This star is known to have large bisector variations that mimic the presence of a planet (Queloz et al. 2001).The CCF variations are unsurprisingly not symmetric relative to the line centre but anti-symmetric instead.This result is expected because the line cores shift RV-like leading to shapes that are well characterised by the odd first Hermite-Gaussian function but are not expected to interfere when fitting an even function that also naturally places most of the statistical weight on the line centres.We leave the detailed analysis of strong line shape variations on our residual approach to future work.
In our derivation, we assumed that the equivalent width is conserved.This is not the case for saturated lines displaying Zeeman intensification.Saturation is an issue when absorption lines are to be modelled.However, as we model residual spectra and the Zeemaninduced line variations are very small, the impact of saturation is reduced.Nevertheless, we expect improved results if saturation can be incorporated into the residual model.
Blended lines are also known to pose problems to magnetic field extraction techniques.For this reason, unblended lines are selected in most works (e.g.Stenflo & Lindegren 1977;Giampapa et al. 1983;Saar 1988).As we include a large number of absorption lines, we expect the blend effects to be strongly diluted.We did, however, remove heavily blended regions and heavily saturated lines, as outlined in Section 5.1.

Zeeman triplets
We modelled the residual profiles as triplets, despite most absorption lines not being simple triplets.As explained in Section 2.2, an energy level with angular momentum  splits into 2 + 1 sublevels, which is of course the case for both states involved in a transition.If both states have non-zero angular momentum and different Landé factors, we see multiple  components at different wavelength shifts.The effective Landé factor is defined as the factor that captures the shift of the centre of gravity of the red-shifted  components (Landi Degl'Innocenti 1982).
Since the  components are defined to originate from the transitions with Δ  = ± 1, they come in pairs of a red-shifted and a blue-shifted component, as long as there is a state to populate, with the  component in between.However, triplets can also be shifted relative to the non-magnetic case.This shift is proportional to the difference between the Landé factors of the upper and the lower state and produces a distribution of  components that is symmetric relative to the non-magnetic transition wavelength.
As long as the Landé factors of both involved states are very similar, we can treat them as a superposition of unshifted triplets with different strengths.Therefore, these non-triplets produce a superposition of residual profiles with the same shape but different amplitudes.To first order, the effective Landé factor squared captures the amplitude of this residual profile.Thus, we can also model non-triplets with the triplet model by using the effective Landé factor, as done in this study. 6The spectra are available on the ELODIE archive: http://atlas.obs-hp.fr/elodie/.We investigated the dependence of our results on the non-triplet transitions.We could remove up to 60 per cent of the absorption lines without noticeably deteriorating the results.This cut corresponds to excluding all absorption lines for which the Landé factors of the upper and the lower state differ by more than 0.1.This shows that, on the one hand, non-triplets do not interfere with our extraction.On the other hand, it shows that the magnetic information in the non-triplet lines is not yet fully harnessed.

Convolution model
In this Section, we show how the line model can be applied on multiple lines simultaneously to boost the residual signal.For this, we assume that line residuals add up linearly, which allows us to model the residual spectrum as a convolution using the residual profile scaled by the expected amplitude of the signal, 2 2 .Note that the amplitude of the signal can be split into a line-specific component multiplied by a general part  2 for all absorption lines.We now have a profile that is constant in velocity space and only scales with a line-specific amplitude and a multiplicative factor.Therefore, we can use the LSD approach and model the convolution via matrix multiplication.For this, we need the following definitions: •   : wavelength of pixel i.
•   : central rest frame wavelength of absorption line .
•   : depth of absorption line .
•   : velocity grid point  of residual profile .
•   =    −    : radial velocity   which shifts   to   .With these definitions, the convolution can be expressed as the matrix multiplication using where Δ is the velocity increment, and Λ is defined as in Lienhard et al. (2022): To find the best-fitting model to the data, we compute the value of  2 that minimises where the matrix  contains the inverse squared flux uncertainties on the diagonal.

Non-magnetic profile
Above we estimated the residual profile arising from subtracting a non-magnetic line from a line affected by Zeeman splitting.However, we do not have an exact model for the unsplit line because the magnetic field is non-zero even in the quiet regions of the solar photosphere and we measure disk-integrated spectra.Robinson (1980) chose lines with low Landé factor as a model for the unsplit lines.However, for the line comparison to be meaningful, the formation heights of the split and the unsplit lines must be similar and the equivalent width must be comparable (Robinson 1980).Alternatively, the spectra of another magnetically quiet star of the same spectral type may be used as a comparison spectrum as in Giampapa et al. (1983).However, this approach comes with some complications, such as accounting for residual differences between the stars and estimating the magnetic flux of the quiet star (Saar 1988).Another option consists in comparing the absorption lines to a reference spectrum of the same star (Saar 1988;Thompson et al. 2017Thompson et al. , 2020)).This means that each absorption line is compared to a line with the same formation height and practically identical equivalent width.Instead of choosing one spectrum as the reference, we compare each spectrum to the average spectrum computed from a selection of spectra.This reduces the extent of spurious variations due to telluric absorption lines, the continuum correction, and photon noise in the master profile leading to a cleaner comparison profile.We found this averaging procedure to be crucial as it significantly suppresses the photon noise in the template and therefore also reduces the scatter in the extracted values of  2 .
To generate the master spectrum, we stack 100 normalised solar spectra in the barycentric reference frame and fit a univariate spline.The exact selection of solar spectra is negligible for this process.The residual profile evolves smoothly across the full velocity grid spanning 20 km s −1 with the two minima being about 10 km s −1 apart.This is orders of magnitudes larger than the expected RV shift due to planets.The extracted amplitude of the residual shape is thus not affected by planet-induced RV variations and  0 in Eq. 17 can be set constant.Furthermore, Doppler-shift-induced residuals have a very different shape as compared to broadening-induced residuals.The extracted  2 is therefore largely independent of the presence of planets.Nevertheless, we shift the master profile to the RV of the individual spectra to match the derivation of the residual profiles.
Since we compare the absorption lines to their individual average profiles, we extract the change in magnetic flux strength Δ 2 , rather than the absolute  or  2 : This expression is valid under the assumption that what actually evolves is the filling factor of active regions on the visible stellar hemisphere rather than the unsigned magnetic flux within the magnetically active regions.To verify this, consider a triplet as in Eq. 9. Keeping all parameters the same but varying only , we see that the mean of any number of triplets can also be modelled as a triplet with well-defined parameters  and .Within our toy model, the difference between the master profile and a non-magnetic line can therefore be modelled with Eq. 12, as we did in Eq. 25.However, our method does not rely critically on this assumption.Injection-recovery tests show that we can vary  within the defined validity range (cf.Section 4.2) and keep  constant and still recover Δ 2 with a mere constant offset which does not interfere with the linear correlations.Note that we measure Δ 2 relative to the master profile which means that the difference in Δ 2 between two spectra is equal to the difference between their respective  2 and thus an overall offset is not worrisome.
Measuring the absolute evolution of the magnetic flux requires two measurements to calibrate  2 and determine the offset.These measurements can be carried out with the established magnetic field estimation approaches.The technique presented in this study consequently also provides a gateway to get precise and cost-effective absolute unsigned magnetic flux time series.
Lastly, we assumed in Eq. 9 that the magnetic field strength is radially constant within the active and quiet regions.This assumption is not critical for our application because we are only interested in how the residual profiles evolve.In fact, this assumption is of no concern if the magnetic flux within both regions is constant and only the filling factor of active regions evolves in time.

Direct 𝛼𝐵 2 extraction from the spectra
We investigated applying an extended version of LSD modelling the absorption lines as triplets to the spectra themselves rather than the residual spectra.However, the combined problem consisting of the problematic LSD-intrinsic line addition assumptions and imperfect absorption line depths represented a Gordian Knot that we could only cut by applying the weak-field splitting approach on the residual spectra, at the expense of getting the variation in  2 rather than  2 itself.As mentioned, this does not impact RV mitigation and can be overcome by doing two calibration measurements.For the Sun, this correlation is mainly driven by faculae (Pearson correlation coefficient: 0.98) as the Sun is a faculae-dominated star.
The correlation of the filling factor with the emission reversal in the Ca II H and K lines, as measured by log  ′  , and the correlation between log  ′  and the average magnetic flux was shown in Meunier (2018) and is also discussed in Haywood et al. (2022).
Assuming no magnetic field in the quiet regions and constant magnetic field strength in the active regions, the only factor that evolves in time is .This factor includes variations in the ratio of transverse to longitudinal field components, which we assume to be negligible in the disk-averaged spectra.In this case,  is directly proportional to the filling factor of active regions.Since we assumed a constant magnetic field strength in the active regions,  2 is directly proportional to the filling factor of active regions multiplied by their magnetic field strength.The latter is equal to | Bobs | in this idealised two-component model.
A two-component model is supported by evidence that plages exhibit fairly tight distributions of the magnetic field strength (e.g.Rueedi et al. 1992;Martínez Pillet et al. 1997;Buehler et al. 2015).This is due to the efficient concentration of the small-scale magnetic fields in flux tubes through the convective collapse mechanism (Parker 1978;Spruit 1979).For sunspots, there is a wider distribution of magnetic field strengths between spots of different sizes and within the individual spots themselves.The peak magnetic field strength is found in the umbra reaching 2000 -3700 G and decreases towards the periphery of the spot to 700 -1000 G (Solanki 2003).Since all of these components contribute to the amplitude of the residual signal described in this study, we expect the spread of the magnetic field strengths to impact the scaling of Δ 2 with | Bobs | for spot-dominated stars.Therefore, we expect the direct proportionality between Δ 2 and | Bobs | to hold for less active stars, as they are expected to be plage-dominated (e.g.Radick et al. 2018).Skumanich & López Ariste (2002) applied Principal Component Analysis (PCA) on intensity spectra.They found the first component to be equal to the first derivative of the profiles and their score to correlate with the RVs.This finding is in perfect agreement with the results from Holzer et al. (2021) who find that the difference between two shifted Gaussians is well-modelled by one profile with the shape of the first Hermite-Gaussian polynomial scaled with the stellar RV.Skumanich & López Ariste (2002) also find the second derivative to scale with the filling factor multiplied by the magnetic field strength squared.This agrees with our findings that the Zeeman signature can also be modelled by one profile that scales with  and the squared unsigned magnetic field strength of the active regions.This result is also apparent from Eq. 10 in Stenflo (2013).

Comparison of methodology to other techniques
For the range of validity defined in 4.2, the difference between a line and a line broadened by a factor 1 +  while preserving the equivalent width can be expressed as By Taylor expanding this expression for small width variations, i.e. for  near 0, it can be seen that this residual profile is practically identical to the expression in Eq. 16 for  equal  2  2  2  .Since the amplitude of the residual signal in Eq. 16 scales linearly with  2 , the residual profile in Eq. 28 scales linearly with  for small width variations.The derivation is shown in detail in Appendix B. The emergent residual profile is therefore not unique to Zeeman splitting.This also means we can model other broadening mechanisms by adding the same residual profile scaled by the appropriate factor.Also, if Zeeman splitting is modelled as a line-broadening effect, we expect to get the same scaling behaviour.Indeed, Stenflo & Lindegren (1977) fit the absorption line width of 402 unblended Fe 1 absorption lines as a polynomial expression with the magnetic factor scaling with  2 .Lehmann et al. (2015) applied a Principal Component Analysis (PCA) approach on the moderately active star  Eridani.They find one eigenprofile in good agreement with the second derivative of the line profiles and extract  2 via the respective principal component score.We do not extract the magnetic flux profiles from the spectra themselves.Instead, we derive the shape and scaling behaviour from our weak-field triplet splitting model and thus assume the residual profiles to be known a priori.As also pointed out in Lehmann et al. (2015) and following directly from the Hermite-Gaussian expansion, the purely RV-induced residual variations are orthogonal to the Zeeman splitting induced shape variations.These components, therefore, interfere negligibly which makes the modelling of the RV effect in the residuals unnecessary.Lehmann et al. (2015) choose 30 spectral lines with Landé factor greater than 1.59.They generate a calibration mapping from the extracted  2 values to the average magnetic flux by comparison with synthetic line profiles.Such a mapping is challenging for our technique because we include over 4000 absorption lines.The magnetic flux values of Lehmann et al. (2015) vary by a few tens of Gauss from spectrum to spectrum, with the average magnetic flux being 186 G.This error ratio may inhibit stellar activity mitigation.
There is direct observational evidence for the existence of the features that we derive from Zeeman splitting in the present analysis.Thompson et al. (2017) compared HARPS spectra of  Cen B to investigate the impact of stellar activity on absorption lines.Their data set spans a range of log  ′  values from -5 to -4.82 and covers a sizeable fraction of  Cen B's activity cycle of about 8.1 years (Ayres 2014).This range of log  ′  is larger than for the solar data used in this work implying a stronger variation of the magnetic flux B (Schrĳver et al. 1989).They generated a low-activity stellar template by stacking the spectra recorded during a night in 2008 when  Cen B was most inactive within their data set.Thompson et al. (2017) then divided the nightly stacked spectra from 2010, when  Cen B was more active, by this template to investigate the differences.By visual inspection, they found features that closely resemble the Zeeman-induced residual shapes derived in the present analysis.The fact that Thompson et al. (2017) used division while we used subtraction does not significantly affect the morphological similarity.However, the Hermite-Gaussian approach we use only applies to residual spectra produced by subtraction.
By simulating specific absorption lines, Thompson et al. (2017) deduced that magnetically active regions can produce the observed residual shapes.They furthermore found that the strength of the central component of the residual shape correlates with log  ′  .The residual shapes were also seen in a follow-up study in which Thompson et al. (2020) stacked daily HARPS-N solar spectra to discover a number of the same features as in  Cen B. They also found that the strength of those features correlated well with log  ′  and with the facular filling factor which agrees with our findings.We do not visually see the Zeeman residual shapes, however, since we analyse individual spectra of the Sun.We can only deduce the magnetic flux proxy by combining the information from thousands of lines.This approach does make it possible to use this proxy as a simultaneous indicator for RV variations.As Thompson et al. (2017) see a strong correlation for single lines, we expect K-dwarfs such as  Cen B to be exquisite targets for our multi-line approach too, as long as the extent of Zeeman splitting is within our range of validity and they are still plage-dominated.

APPLICATION TO HARPS-N SOLAR SPECTRA
In this Section, we describe how we applied the residual model to our data set of 500 HARPS-N solar spectra to extract Δ 2 .We subsequently outline the results from injection-recovery tests using the solar spectra and describe a combination of the magnetic field modelling with RV modelling.

Preprocess spectra
For the present analysis, we used the deblazed 2-dimensional echelle order spectra, their associated uncertainties, and barycentric wavelengths contained in the spectral files that we selected as described in Section 3.2.We continuum normalised the spectra with RAS-SINE Cretignier et al. (2020), corrected for residual cryostat leak effects, and divided by a simple telluric model.Details can be found in Lienhard et al. (2022), section 4.
We excluded any wavelength range in the barycentric frame that is affected by a telluric line deeper than 1 per cent in any of the spectra.Such a strict threshold is warranted since we measure very small signals that could easily be distorted by telluric absorption lines (Cunha et al. 2014;Ulmer-Moll et al. 2019).
In Lienhard et al. (2022), the spectra were modelled by convolving the best-fitting common profile, representing the average line profile, with a line list containing the wavelength and depth of the absorption lines.For this, the velocity grid on which to evaluate the common profile had to be defined.We adopt the same velocity grid centred at the stellar RV of the first spectrum.The width of the velocity grid was set to 3 times the FWHM of the first common profile and the velocity increment to 0.82 km s −1 .The latter is equal to the average velocity increment per physical pixel on the HARPS-N CCD.This results in a grid width of about 20 km s −1 .Moreover, we excluded regions of the spectrum containing fluxes deviating by more than 0.5 in relative depth from the convolution model.This essentially removes lines that are heavily blended or absorption lines with inaccurately estimated depths in the VALD3 list.Furthermore, we included only atomic lines with relative depth greater than 0.2 as per the VALD3 list.We excluded all lines deeper than 0.9 and spectral regions near such absorption lines to avoid including heavily saturated absorption lines which are not well modelled by our residual model.This leaves us with 4636 absorption lines with mean effective Landé factor of 1.17.The distribution of the absolute effective Landé factors and the line depths is shown in the histograms in Fig. 3.We show the absolute value of the Landé factors because Zeeman splitting in Stokes I does not depend on the sign.
The correlations found in this study are negligibly dependent on the exact choice of the velocity grid width, the model-spectrum deviation, or the minimal and maximal depth of the included lines.Note that, as in the MM-LSD technique, we ensure that the same stellar absorption lines are included for all spectra.

Extraction from S2D spectra
We ran the residual fitting technique described in Section 4 on all echelle order spectra individually and combined the extracted Δ 2 values by computing the weighted mean of the extracted values of each order.The weight of each order was set to the sum of the inverse squared uncertainties of all included fluxes.The same weights were used in Lienhard et al. (2022) to combine the common profiles of the individual orders.We also tested running the extraction code on all orders simultaneously.This approach yielded marginally lower correlations with | Bobs | .
Overall we find a good correlation of the extracted Δ 2 with | Bobs | from SDO for each order except order 65 (around 6650 Å) where there are only very few included absorption lines.The Pearson correlation coefficients of Δ 2 with | Bobs | for each order are displayed in Fig. 4. The central orders between 5000 and 5500 Å, where we find a high number of isolated absorption lines at high SNR, correlate best with | Bobs | and also get the highest statistical weight.For lower orders, the absorption lines are more often blended, affected by noise, and the continuum is expected to be less precise.At wavelengths beyond 5500 Å the number of included absorption lines can reach low values leading to high scatter in the extracted order Δ 2 .Since these orders have a very low statistical weight, their impact on the final Δ 2 is minor.
We found it necessary to correct for an observational effect unique to the Sun.As the Earth revolves around the Sun on a slightly ec- centric orbit is inclined relative to the Sun's axis of rotation, the observed rotational line broadening evolves over the course of a year.The width variation is present in the HARPS-N solar spectra and can be modelled as a double-sinusoid (Collier Cameron et al. 2019).This signal results in a strong 182-day peak in the periodogram of the width measurements.Our method, being susceptible to line width variations, is susceptible to this viewing angle effect too.However, spectra of other stars are not affected and therefore we remove this signal to get a more realistic estimate of the expected proxy performance for other stars.To remove the double-sinusoid signal, we find the scaling factor that eliminates the 182-day signal from the time series when we subtract the expected line width variation times this scaling factor from the Δ 2 time series.This is achieved by minimising the power of the respective peak in the Bayesian generalised Lomb-Scargle periodogram (Mortier et al. 2015).We show the model and the difference between Δ 2 and | Bobs | in Fig. 5.Note that we do not use | Bobs | to compute this correction factor, but the additional signal is most obvious once the magnetic signal has been removed.The correlation coefficients of the individual orders increase by on average 0.06 after applying this correction.The overall correlation of Δ 2 with | Bobs | increases from 0.8 to 0.914 after the removal of this spurious signal.The CCF contrast and FWHM that we use in Section 6 have also been corrected for the viewing angle effect following Collier Cameron et al. (2019).

Injection-recovery test
We generated mock spectra to validate our approach given our assumptions.Each of these mock spectra is based on our real spectra.This means that the wavelength solution, the uncertainties and the SNR of the i th real spectrum correspond to those of the i th mock spectrum.To generate the mock fluxes, we produced a new line list by splitting each VALD3 line into a  and two  components and convolved this list with a normalised Gaussian profile scaled by the depth of the respective component as in Eq. 9. We set the field strength in the active regions to 1.5 kG and assumed a factor  between 0.005 and 0.015.We made the same assumptions as in Section 4. This means that we assume a homogeneous distribution of active regions making the average line-of-sight angle constant and absorbing this factor into .Given the angle-dependent factor in Eq. 6 and assuming a characteristic angle ⟨⟩ of 34 • as in Marcy (1982), the values of  chosen here correspond to about a filling factor of active regions of about 2 per cent which is within the plage filling factors presented in Milbourne et al. (2021).
We added Gaussian noise, with the standard deviation of each flux equal to its associated uncertainty estimate, to the spectra.As shown in Fig. 6, the LSD extraction of Δ 2 based on the Taylor-expanded Hermite-Gaussian expression successfully retrieves the injected values with no systematic differences and minimal scatter.

RV extraction
It is tempting to simultaneously model the RV and the magnetic effect within the residual profiles to obtain an RV estimate that is less affected by (1) stellar RV effects and (2) effects specific to the RV extraction.The RV impact of stellar activity is expected to be reduced if we allow the spectral component emerging from the active regions to vary in relative strength, to be Zeeman broadened, and to be shifted in RV.RV extraction effects, on the other hand, are conceivable to originate from the Zeeman-induced width variations of individual lines leading to a varying degree of line blending, for example.This can cause shape variations of the LSD common profile or the CCF because neither of these methods perfectly models line blends.Zeeman intensification can furthermore influence the weight of magnetically sensitive ab-sorption lines in the computation of the common profile.This makes the contribution of these lines to the common profile dependent on the magnetic field and thus produces an activity-dependent contribution to the extracted RV.Correcting for these RV extraction effects cannot remove the magnetically-driven convective RV signals or RV variations caused by the inhomogeneous brightness distribution on the stellar surface.Equally, modelling the photometric effect requires an additional step even if we allow the emission from active regions to vary in strength and Doppler shift within our model.
The difference between two absorption lines that are Doppler shifted relative to each other is well described by the first Hermite-Gaussian polynomial and scales linearly with the RV, as shown in Holzer et al. (2021).Using the weak-field triplet splitting model, it can be shown that the RV extracted using the first Hermite-Gaussian polynomial is largely unaffected by Zeeman splitting.This may partially explain the lower RV scatter obtained in Holzer et al. (2021).With our line selection including partially blended lines and imperfect line depth estimates, we were able to extract RVs using this method.However, the RV semi-amplitudes were reduced which prohibited meaningful scatter analyses without discarding more data and degrading Δ 2 .The reason for the suppressed RV amplitude can be deduced directly from Eq. 18 in Holzer et al. (2021).This expression assumes line additivity which is problematic for the partially blended lines which we included in our analysis.

DISCUSSION
In this Section, we present the magnetic flux proxy Δ 2 as extracted from the HARPS-N solar spectra and compare it with other available activity indicators and RVs.

Comparisons of activity indicators
In Fig. 7, we show the complete time series of Δ 2 and log  ′  extracted from the HARPS-N S2D spectra and | Bobs | from SDO.The overall evolution of the magnetic flux and the quasiperiodic variations on the solar rotational timescale are well traced by all three indicators.
We further evaluated how well our new indicator, Δ 2 , traces the HARPS-N RV variations extracted using the CCF technique and compare it to the SDO | Bobs | and other standard activity indicators ( log  ′  , BIS, FWHM, contrast).For this, we split the data into chunks of a given duration and computed the Pearson correlation coefficient between the RVs and the indicators within these chunks.
To divide the data with timestamps  1 ,  2 ...   into intervals of  days, we proceeded as follows.As the first chunk of data, we select all measurements taken between  1 and  1 + .We then computed the Pearson correlation coefficient if this interval contained at least  3 measurements and if these covered at least 90 per cent of the duration .Next, we selected all measurements between  1 + /3 and  1 + /3 + , proceeding the same way as above.We thus get a list of Pearson correlation coefficients for data chunks of duration .In the following Figures 8, 9, and 10, we display the absolute value of the mean of the Pearson correlation coefficients for each time scale .We included all chunks of data irrespective of whether stellar activity impacted the data significantly.Excluding chunks with RV RMS below 1 m s −1 did not significantly alter our results.The correlation analyses were also computed including only chunks of data with RV RMS greater than 1.5 m s −1 .For these chunks of data, stellar activity is expected to be the dominant contributor to the RV variability with the photon noise contribution being around 0.2 m s −1 and the long-term instrumental stability of HARPS-N being near 1 m s −1 .The respective results are shown in Appendix C.
In Fig. 8, we show the absolute value of the mean of the correlation coefficients computed for these chunks of fixed duration.We computed the absolute value because the contrast is anti-correlated with | Bobs | .This has also been noted in Costes et al. (2021).The lower correlation coefficients towards shorter timescales can partially be explained by phase offsets between the RV and the indicators, mentioned e.g. in Collier Cameron et al. (2019).These offsets are believed to originate from geometric effects and are therefore expected to be a function of the stellar rotation period.Higher correlations can be achieved by accounting for the time delay between the RVs and the activity indicators.
Unsurprisingly, | Bobs | traces the RV variations best as it originates from a dedicated magnetic field measurement with extremely high SNR resulting from combining the information from all relevant pixels, and is not affected by the Earth's atmosphere.Δ 2 shows a consistently higher correlation with the RV variations as compared to the classical activity indicators.The contrast shows high correlations with the RVs too, but lower correlations with | Bobs | as compared to Δ 2 .It is not surprising that the contrast and Δ 2 show some similarities because Zeeman broadening alters the depth of absorption lines where the signal-to-noise is highest.Altogether, his makes Δ 2 the best known activity indicator that can be extracted from Stokes I spectra simultaneously with the RV for solar-type stars.
One particular advantage of Δ 2 is that it can easily be extended to include other absorption lines and is not restricted to particular wavelength ranges of detectors.log  ′  , in contrast, depends on collecting enough photons where the line reversal emerges around 4000 Å.Furthermore, log  ′  weighs the contributions of surface areas differently to the RV extraction because the used wavelength regime, and therefore the limb-darkening, differ.Δ 2 , on the other hand, weighs the individual echelle orders equivalently to the RV extraction.Another difference between Δ 2 and log  ′  is that the flux reversal in the Ca II H and K lines forms above the photosphere, whereas the Zeeman splitting signature traces the magnetic fields in the photosphere where the absorption lines form.This also leads to log  ′  essentially seeing a different field distribution since the magnetic field expands with height (Wiegelmann et al. 2014).Lastly, log  ′  shows a non-linear correlation with the magnetic flux (e.g.Schrĳver et al. 1989;Chatzistergos et al. 2019), whereas we expect Δ 2 to scale linearly within our model assumptions.
The same comparison for the MM-LSD RVs in Fig. 9 shows generally lower correlation coefficients.Strikingly, the BIS is significantly less correlated with the MM-LSD RVs than with the CCF RVs.This can be partially due to the BIS being calculated from the CCFs themselves rather than the MM-LSD common profiles.However, this would indicate that the BIS is method-specific and therefore less useful as a general diagnostic for the stellar surface conditions.Alternatively, this may indicate that the line selection for the CCF method is more susceptible to bisector variations.We performed a simple linear detrending of the RVs with Δ 2 to estimate the minimal RV RMS improvement as linear detrending is a crude method (for an overview on stellar activity mitigation, see Zhao et al. 2022).The RMS of the RVs decreased from 2.01 m s −1 to 1.25 m s −1 .The RMS of the MM-LSD RVs reduced from 1.65 m s −1 to 1.22 m s −1 .The very similar final RMS values indicate that MM-LSD already picks up less stellar activity as compared to the CCF method.This is in agreement with evidence presented in Lienhard et al. (2022). In

Signal-to-noise ratio dependence
The extracted Δ 2 does not depend crucially on the SNR of the spectra.Neglecting all but photon noise, we can add normally distributed noise to the spectra to simulate spectra at lower SNR.For each flux value, we randomly sampled from a Gaussian distribution with a mean of 0 and the variance equal to the squared uncertainty multiplied by a factor.To double the noise and simulate spectra with half of the original SNR, we set the mentioned factor to 1.The uncertainty estimates were adjusted accordingly.
By doubling the noise, we get a median SNR of 190, which is at the lower end of the expected SNR for very bright targets with visual magnitude of about 6.For these spectra, we still get a Pearson correlation coefficient of 0.91 for Δ 2 with | Bobs | overall.This is virtually identical to the Pearson correlation coefficient of 0.914 that we computed for the original spectra with an SNR of about 360.On the rotational timescale of 30 days, the mean correlation coefficient with | Bobs | reduces negligibly from 0.67 to 0.66 as we double the noise, whereas the mean correlation coefficient with the CCF RVs reduces from 0.49 to 0.48.
For simulated spectra with median SNR equal to 60, we get a very good overall correlation with | Bobs | of 0.89.The mean correlation with the | Bobs | on the rotational timescale is slightly reduced from 0.67 to 0.56.The mean correlation with the CCF RVs reduces from 0.49 to 0.42.It follows that the described approach is negligibly affected by white noise within the typical SNR range achieved for RV studies.

CONCLUSIONS
We showed that the difference between a Zeeman-split and an unsplit absorption line is well modelled by Hermite Gaussian polynomials with Taylor-expanded coefficients.The expansion provided a residual model scaling with line-specific factors that allowed us to use the LSD framework to condense the information contained in over 4000 absorption lines and extract Δ 2 .This indicator represents the hemispherically averaged unsigned magnetic flux variations.We found a correlation of Δ 2 with | Bobs | from SDO of 0.91 overall and about 0.67 on the rotational timescale.Importantly, we find a minimal dependence of Δ 2 on the SNR of the spectra, which makes it a prime activity index even for faint stars.Also, we show in Section 6 that Δ 2 correlates better with activity-induced RV variations than the classical activity indicators, on the rotational timescale and on the overall timescale of 3 years.For solar-type stars, Δ 2 is thus the best activity tracer known to date that can be simultaneously extracted with the RVs.
We expect this method to perform better if longer wavelength regions are included, provided that the telluric lines are properly corrected and there is a sufficient number of absorption lines.This is due to the Zeeman effect being more pronounced at longer wavelengths.The proxy for | Bobs | presented in this study provides many potential avenues for extensions and additional applications.For instance, it can provide a magnetic field estimator for stars for which the Calcium H and K lines are too weak, contaminated, or affected by instrumental effects.Also, this provides a magnetic flux estimator for instruments that do not cover the wavelength range required for extracting log  ′  .Furthermore, it provides an independent estimate of the evolution of the magnetic flux.The computation of this indicator can be done on readily available spectra that were recorded for RV purposes.No additional equipment or dedicated observation strategies are required.
Several potential modifications of the present approach are conceivable.For instance, additional line-broadening signals could be modelled simultaneously if temperature and magnetic relations are known.Regarding the Zeeman signal, the Doppler shift of the various components could be modelled as the magnetic and velocity fields are not independent (e.g.Cegla et al. 2013).Active regions can thus be Doppler shifted relative to the quiet photospheric regions.On the technical side, instead of comparing the stellar spectra to an averaged stellar spectrum of the same star, simulated spectra could be used.
Lastly, there are many applications and extensions for the presented indicator.The magnetic flux at different heights within the photosphere may be probed by making line selections.For our next steps, we intend to generalise this method to other stars and assess the performance of the indicator for stellar activity modelling.

Figure 1 .Figure 2 .
Figure 1.Upper panel: Residual profile emerging from subtracting a Zeemansplit absorption line exposed to a 2 kG field from a non-split line (yellow) and approximations thereof.Lower panel: Difference between the residual profile and the approximations.

4. 5
Relation to hemispherically averaged unsigned magnetic flux | Bobs | As shown in the preceding sections, we extract Δ 2 from our spectra.SDO data shows that the changes in | Bobs | are mainly driven by the variation of the filling factor of active regions rather than the magnetic flux within those regions.In fact, in our SDO test dataset containing one SDO observation at 6-hour cadence from July 2015 to September 2021, we find the overall filling factor of active regions to correlate almost perfectly with | Bobs | (Pearson correlation coefficient: 0.99).Similarly high correlations between the filling factor and | Bobs | were found for the data set analysed in Ervin et al. (2022).

Figure 3 .
Figure 3. Top panel: Histogram of the effective Landé factors of the included absorption lines.Bottom panel: Histogram of the absorption lines depths from VALD3.

Figure 4 .
Figure 4.The red (yellow) stars show the Pearson correlation of the order Δ 2 values with | Bobs | before (after) correcting for the Sun-Earth viewing angle effect.The purple dots indicate the number of included absorption lines per order.

Figure 5 .Figure 6 .
Figure 5. Difference between the scaled Δ 2 and | Bobs | (orange stars) and the model for the HARPS-N line width variations (black dots) caused by the varying viewing angle on the Sun.

Figure 7 .
Figure 7. Evolution of Δ 2 and log  ′  , extracted from the HARPS-N solar spectra, and | Bobs | from HMI/SDO from July 2015 to May 2018.

Figure 8 .Figure 9 .
Figure 8. Absolute value of the mean correlation coefficient of the six activity indices with the heliocentric CCF RVs for data chunks covering a fixed time span.

Figure 10 .
Figure 10.Absolute mean Pearson correlation coefficient of the five activity indices derived from HARPS-N spectra with | Bobs | for data chunks covering a fixed time span.

Figure C1 .Figure C2 .Figure C3 .
Figure C1.Absolute mean Pearson correlation coefficient of the six activity indices with the heliocentric CCF RVs for data chunks covering a fixed time span.