Pyodine: An open, flexible reduction software for iodine-calibrated precise radial velocities

For existing and future projects dedicated to measuring precise radial velocities (RVs), we have created an open-source, flexible data reduction software to extract RVs from \'echelle spectra via the iodine (I$_2$) absorption cell method. The software, called $pyodine$, is completely written in Python and has been built in a modular structure to allow for easy adaptation to different instruments. We present the fundamental concepts employed by $pyodine$, which build on existing I$_2$ reduction codes, and give an overview of the software's structure. We adapted $pyodine$ to two instruments, Hertzsprung SONG located at Teide Observatory (SONG hereafter) and the Hamilton spectrograph at Lick Observatory (Lick hereafter), and demonstrate the code's flexibility and its performance on spectra from these facilities. Both for SONG and Lick data, the $pyodine$ results generally match the RV precision achieved by the dedicated instrument pipelines. Notably, our code reaches a precision of roughly $0.69 \,m\,s^{-1}$ on a short-term solar time series of SONG spectra, and confirms the planet-induced RV variations of the star HIP~36616 on spectra from SONG and Lick. Using the solar spectra, we also demonstrate the capabilities of our software in extracting velocity time series from single absorption lines. A probable instrumental effect of SONG is still visible in the $pyodine$ RVs, despite being a bit damped as compared to the original results. With $pyodine$ we prove the feasibility of a highly precise, yet instrument-flexible I$_2$ reduction software, and in the future the code will be part of the dedicated data reduction pipelines for the SONG network and the Waltz telescope project in Heidelberg.


Introduction
The radial velocity (RV) method is the second-most successful technique for the detection of exoplanets today, accounting for more than 1000 discoveries, and many more confirmations of planet candidates discovered first through other methods 1 . This success has been possible thanks to immense progress on instruments and continuing development of advanced data analysis methods. Stellar RVs are measured by detecting Doppler-induced shifts of the absorption lines of a star in high-resolution spectra, usually obtained with échelle spectrographs. Two different methods for the computation of RVs have been developed in the last decades: The first one aims to keep the instrument in a highly stabilized environment, which allows one to directly make use of the wavelength solution produced by a calibration source (either simultaneously or before and after the observations). The small Doppler shifts of the stellar features over time are then measured ⋆ Radial velocity data of σ Draconis and HIP 36616 are only available in electronic form at the CDS via anonymous ftp to cdsarc.cds. unistra.fr (130.79.128.5) or via https://cdsarc.cds.unistra. fr/cgi-bin/qcat?J/A+A/. 1 https://exoplanetarchive.ipac.caltech.edu/ either by using a cross-correlation technique with a mask (see e.g., Baranne et al. 1996;Pepe et al. 2002), or through forward modeling of the observed spectra with a template (Anglada-Escudé & Butler 2012;Zechmeister et al. 2018). Today, instruments such as CARMENES (Quirrenbach et al. 2016), HARPS (Mayor et al. 2003), and ESPRESSO (Pepe et al. 2021) reach an internal RV precision below ∼ 1 m s −1 using methods based on this principle, and several high-performance, open-source analysis software packages for the extraction of RVs exist (e.g., CERES, SERVAL, and WOBBLE: Brahm et al. 2017;Zechmeister et al. 2018;Bedell et al. 2019).
For less stable instruments, where the wavelength solution and line-spread function (LSF) of the spectrograph vary considerably over time, the iodine cell method has proven its merit as the second successful technique for measuring precise RVs. In this method, a temperature-stabilized glass cylinder filled with gaseous molecular iodine (I 2 ) is inserted into the light path between the telescope and the spectrograph; the recorded spectra thus show a combination of stellar and I 2 absorption features, the latter of which can be used as a precise wavelength reference. The RVs of the star are then found by forward-modeling the ob-Article number, page 1 of 13 arXiv:2306.13615v1 [astro-ph.SR] 23 Jun 2023 A&A proofs: manuscript no. main servations from stellar and I 2 template spectra with a Doppler shift applied to the stellar template.
While the general idea of using a superimposed reference spectrum for precise RV measurements and first applications of this technique date back to Griffin (1973) and Campbell & Walker (1979), Marcy & Butler (1992) established I 2 as the gas of choice for the absorption cell and further developed the technique in the early 1990s to reach a RV precision of ∼ 3 m s −1 (Valenti et al. 1995;Butler et al. 1996). More recently, Butler et al. (2017) have demonstrated a precision at the ∼ 1 m s −1 level on data collected in a 20-year survey with HIRES/Keck using the I 2 cell method, and a similar precision is reached on shorter time scales by Hertzsprung SONG on Tenerife, Spain, on some stars (Andersen et al. 2016).
Due to its less strict requirements on instrument stability, the I 2 cell method offers a possibility for measuring precise RVs even for smaller telescope projects with constrained financial budgets. We therefore chose this technique for our own Waltz telescope project at Landessternwarte (LSW) Heidelberg (Tala et al. 2016), which aims to continue the RV survey of evolved stars originally performed at Lick Observatory (Frink et al. 2001;Reffert et al. 2015). However, in contrast to the RV measurement method on highly stabilized instruments, to our knowledge no open-source analysis software exists for the I 2 cell method. Furthermore, I 2 cell codes such as the one by Butler et al. (1996) (called Butler code hereafter) or the SONG reduction code, called iSONG (Corsaro et al. 2012;Antoci et al. 2013;Grundahl et al. 2017), are tailored to specific instruments and thus lack the flexibility to be easily applicable to other projects.
We therefore decided to develop a new software package for the extraction of precise RVs from spectra obtained with the I 2 cell method, with the goals of (i) reaching the RV precision achieved by other analysis codes, and (ii) allowing a quick adaptation to different instruments. In the development, we took guidance from the iSONG code, and from a code maintained by Debra Fischer at Yale University. The resulting software, called pyodine, is written in Python 3, built with a modular, objectoriented approach, and publicly available under MIT license 2 . It has been adapted to and tested on spectra from two different instruments thus far, namely the Hertzsprung SONG node in Tenerife (SONG hereafter), and the Hamilton spectrograph at Lick Observatory (Lick hereafter), and in both cases the precision of the dedicated instrument pipelines (Butler code and iSONG) is generally met.
In this work, we briefly describe the mathematical concepts used in the code (Section 2) and explain the fundamental structure and the workflow of the software (Section 3). Furthermore, in Section 4 we present some examples of results for the RVs generated with pyodine from SONG and Lick spectra, showcasing the performance and flexibility of the software. Finally, we summarize our work and give an outlook on future development on the code (Section 5).

Theoretical background
The general mathematical description of the analysis follows algorithms presented in Marcy & Butler (1992), Valenti et al. (1995), and Butler et al. (1996); for convenience, we summarize all of the necessary equations in the following subsections.

Modeling observations
The analysis is performed on reduced échelle spectra of the target star, with orders already extracted and a rough wavelength scale, typically based on Thorium-Argon (ThAr) calibration spectra, already known. Each observation of the star must have been performed with an I 2 cell in the light path, so that the resulting spectrum is a combination of the stellar and I 2 absorption lines. As the I 2 spectral features are mostly present in the wavelength range 5000 Å < λ < 6300 Å, only orders covering these parts of the spectrum are used. Each order is then divided into chunks of roughly 2 Å in length, and each chunk is modeled individually, in order to reduce the overall complexity of the model and account for spatial variations in the instrumental line spread function (LSF).
As the observed spectrum contains both stellar and I 2 spectral features, high-resolution template spectra for the I 2 and the observed star are required. The I 2 template spectrum, T I2 (λ), is typically obtained by scanning the I 2 cell in a Fourier-Transform Spectrometer (FTS) of very high resolving power (for a detailed discussion and analysis of I 2 template spectra, compare Wang et al. 2020). Retrieving the stellar template spectrum T * (λ) usually involves a more complex routine, which is explained in Section 2.2.
We denote the observed spectrum over pixels x of a given chunk as I obs (x). This is then fitted in a non-linear least squares approach with the model which incorporates the following sub-models: k(x), a model of the continuum flux values in the given chunk (Section 2.1.2); T I2 λ(x) , the part of the I 2 template spectrum corresponding to the modeled chunk; λ(x), a model of the wavelength scale in the chunk (Section 2.1.2); T * λ(x) · (1 + z) , the part of the stellar template spectrum corresponding to the modeled chunk, taking the (relativistic) Doppler-shift by the relative RV between observation and template z = [(1+β)/(1−β)] 1/2 into account, with β = v/c, where v is the RV estimate of that chunk (including the barycentric velocity of the observatory; Section 2.2); and L(x), a model of the instrumental line-spread function (LSF; Section 2.1.1).
The philosophy behind the combined model presented in Equation 1 is that spatial and temporal variations of the instrumental LSF are directly taken into account, given that the chosen LSF description L(x) is flexible enough (for a more thorough discussion compare Marcy & Butler 1992;Valenti et al. 1995). The (relative) RV of an observation is finally computed through a weighted mean of the best-fit velocities v of all chunks (see Section 2.3).
We note that the combined model is first built on an oversampled pixel grid (usually by a factor 4 to 6 with respect to the observation spectrum), to ensure good convolution and to preserve the high-frequency information from the iodine cell; for evaluation by the fitting routine the model is then re-binned to the observation pixels.

The LSF model
In its simplest form, the instrumental LSF can be described by a model consisting of a single Gaussian function: where the width c is a free model parameter. For most instruments, this model will not deliver a very good description of the actual spectrograph LSF, as asymmetries cannot be accounted for. However, it can be useful in order to determine an estimate of other free parameters (e.g., for the continuum and wavelength models), which can then be used as initial guesses for a more complex modeling. A more flexible parametrization of a spectrograph LSF has been developed by Valenti et al. (1995), who use a model built from a central Gaussian and n/2 satellite Gaussians on either side, called Multigaussian LSF hereafter: Here, the positions b i and widths c i of all Gaussians as well as the central amplitude a 0 are usually fixed, leaving n free model parameters, namely the amplitudes of the satellite Gaussians a i 0 . Typically, n = 10 offers enough flexibility to account even for small variations and asymmetries. Also note that each LSF model is normalized to unit area, through L norm (x) = L(x)/ x L(x).

The continuum & wavelength models
Both the continuum flux and the wavelength scale of an échelle spectrum exhibit complex, large-scale variations along the orders, attributable for example to the blaze-function of the spectrograph, grating anamorphism, optical distortions etc. By modeling the spectrum in individual chunks over rather short pixel ranges, that complexity can be greatly reduced and the continuum flux and wavelengths within a chunk can typically both be described by a simple linear model with a slope p slope and an intercept p intercept . In the case of the wavelength model λ(x), the variable parameters p intercept and p slope represent the wavelength at the central pixel of the chunk and the dispersion within the chunk, respectively, while for the continuum model k(x) they describe the amplitude (again at the central chunk pixel) and slope of the continuum flux values. Additionally, we also implemented a quadratic model in pyodine, which can be easily activated and used as an alternative for the wavelength and/or continuum models. In our general adaptation of pyodine to Lick and SONG spectra however, the quadratic model did not show any significant improvements in any of the key result metrics (e.g., chunk velocity scatter, residuals between observed and model spectrum), while increasing the computation time per observation by about 20% (when using a quadratic instead of linear model both for wavelengths and continuum). In our standard implementation for these two instruments, we therefore keep the linear models as default; for other instruments (or when using larger chunks, compare Section 4.1.1), the additional complexity of the quadratic model nevertheless can lead to better results. Also, thanks to the modular design of pyodine, it is straightforward for users to add their own model descriptions if required.

The deconvolved stellar template
While the I 2 template spectrum can be acquired directly with an FTS, this process is not a viable solution to produce the stellar template, as it would result in too low signal-to-noise ratios S /N for most stars. However, when using the same spectrograph as for the observations, but without the I 2 cell in the light path, the stellar spectrum is still affected by the instrumental LSF. Marcy & Butler (1992) therefore developed a routine where the stellar template spectrum is obtained by "cleaning" an I 2 -free observation spectrum of the star from the LSF in a deconvolution algorithm.
To achieve this, spectra of hot and rapidly rotating stars are obtained with the I 2 cell in the light path directly before and after the I 2 -free stellar observation spectrum. As these stars, typically of spectral class O or B, show essentially no absorption lines in the wavelength range of interest, the recorded spectra only contain the LSF-affected I 2 features; that is to say, these stars act as a continuum source with the benefit that the light travels (nearly) the same path through the instrument as for all other observations. When combined and split into chunks, each chunk spectrum I(x) can then be modeled by: Thus we receive a description of the LSF within each chunk, L(x), which we can now use in a deconvolution of the I 2 -free stellar observation I * (x): We are interested in the "true" stellar object spectrum T * (λ) which satisfies where N(x) is the noise in the observation. For deconvolving the spectrum, Butler et al. (1996) used a modified version of the Jansson technique which involves an iterative process, following Gilliland et al. (1992); in pyodine, we incorporate a slightly different recipe developed by Crilly et al. (2002) on the basis of the algorithm by Agard et al. (1989), which uses the same iterative steps, but pre-filters the input data to accelerate convergence. The deconvolved stellar template chunks are then saved to file, along with the fitted wavelength parameters of each chunk and the computed barycentric velocity at the time of acquiring the I 2 -free stellar template spectrum. When modeling the observation spectra as described in Section 2.1, the chunks in these observations are chosen such that their wavelengths roughly correspond to the template chunk wavelengths after taking the relative barycentric velocities between them into account -this means that the chunks are co-moving in wavelength space with the barycentric velocity. This ensures that the observation and template chunks will mostly overlap in pixel space, and the only relative shift is due to the true relative RV between both (i.e., corrected for relative barycentric velocity), which is usually small (tens of m s −1 for planet-induced RVs, a few km s −1 for close stellar binaries).

Combining fitted velocities to an RV time series
The modeling process of an observation, as described in Section 2.1, results in n ch individual best-fit velocities v i j for each observation i (where n ch is the number of chunks within one spectrum). Since the chunks do not overlap, all these velocities can be treated as independent estimates of the true RV of the star at the time of the observation, and a simple mean or median over all v j would be the easiest way to compute an overall RV estimate. However, as Butler et al. (1996) discuss in detail, different chunks contain vastly different Doppler information because the number and depth of stellar absorption lines within each chunk varies greatly. This leads to some chunks contributing a better estimate of the stellar RV, and the overall RV determination will thus greatly improve if we assign a larger weight to these chunks. Furthermore, by weighting chunks differently we can weaken the contribution of chunks that are corrupted (e.g., through instrumental defects).
The RV time series of a star is therefore computed taking all chunk velocities from all n obs observations v i j into account, which is a 2D-matrix of shape (n obs , n ch ). Fig. 1 illustrates the velocity combination algorithm for a simulated "true" signal and "measured" chunk velocities (six observations, five chunks with different noise properties, see box a in Fig. 1). Within each observation, the velocities are first centered around 0 by subtracting an outlier-resistant mean 3 of v i j over the chunks j, mean(v i j , j) ( Fig. 1b): From the centered velocitiesv i j , we can now compute the scatter within each chunk time series σ j by taking the robust standard deviation std() over observations i (Fig. 1c, top): Now, for each chunk we can compute the deviation δ i j from the mean of its chunk time series (i.e., over all observations) in terms of the overall scatter within that chunk time series σ j . These deviations are then again centered around 0 by subtracting the mean of all deviations within an observation (Fig. 1c, bottom): The chunk deviationsd i j can be used to compute weights for the chunks. To increase the contrast between good chunks (small absoluted i j ) and bad chunks (large absoluted i j ), we use a reweight function, inspired by Stetson (1989), of the form: where the constants a, s and β can be set by the user in a configuration file (compare Chapter 3.2 in Stetson 1989, for a detailed explanation of these constants). We found that a = 1.8, s = 2 and β = 8 deliver good results. The resulting weights ω i j are distributed in the interval [0, 1], where chunks with small absolutē d i j are close to 1 (see Fig. 1d, top). To arrive at the final weights for the velocity combination w i j , we scale the ω i j by the chunk time series scatter σ j (Fig. 1d, bottom): Before we weight the chunk velocities, we aim to center each chunk time series around 0; therefore we compute chunk time series offsets from the observation means c j , and again center them around 0 to arrive atc j : Now we can correct the original chunk velocities v i j by the chunk time series offsetsc j (Fig. 1e): Finally, we use these corrected chunk velocities to compute the weighted mean for each observation, which results in the RV time series of the star: For each observation, we determine an uncertainty estimate according to: where N i is the number of chunks with good velocities (non-NaN, i.e., the fit converged successfully) in observation i. Fig. 1f shows the RV time series for the simulated velocities. The RV estimate RV i in Equation 13 contains contributions both from the observed star as well as the barycentric motion of the observatory along the line-of-sight. To arrive at the final RV time series with only stellar velocity contributions RV * i , we use the open-source Python package barycorrpy (Kanodia & Wright 2018). The package follows the routines outlined in Wright & Eastman (2014) to correct measured redshifts z meas i for barycentric velocities at the 1 cm s −1 -level: We note that absolute measured redshifts are required for the barycentric correction to work at highest precision (compare Equation 10 in Wright & Eastman 2014). In pyodine, the RV estimates RV i are relative to the template observation, which is usually affected by some non-zero Doppler shift with respect to laboratory wavelengths itself. Therefore, we allow to incorporate a constant RV offset RV off (usually the template velocity offset, estimated through cross-correlation of the stellar template with a reference spectrum, see Section 3.2), to arrive at the absolute measured redshifts: Be aware that the template velocity offset is only accurate on the order of a few 100 m s −1 , as it is determined through a simple cross-correlation. However, the method outlined above suffices to achieve a barycentric correction better than 1 m s −1 in all cases, and much better in most.

Code implementation
3.1. Core modules The mathematical concepts described above are implemented within the pyodine software package, which is completely written in Python 3 and available for download and use under MIT license 4 . All components of the algorithm (including models, sub-models, observations, template spectra, and routines for chunk-definition, deconvolution, and fitting) are represented as Python objects, encapsulating for instance instrument-specific code within each instrument class. The philosophy of pyodine, as a framework for I 2 reduction, is that all these components should be easily swappable. Some open-source Python packages are used for specific tasks, for example lmfit for fitting and barycorrpy (Kanodia & Wright 2018) for the computation of barycentric velocities. Additionally, pyodine comes with a number of convenience functions for the creation of analysis plots or loading and saving of results. Instrument-and user-specific code is largely bundled in separate directories, called utilities modules hereafter. These contain functions which define how to load input spectra and required meta-data, for instance times of observations or name of the observed object, which differs from instrument to instrument. Similarly, some fixed parameter values for the I 2 cell code may differ for different instruments, such as the pixel width of chunks or the positions of satellite Gaussians in the Multigaussian LSF (Equation 3); for a given instrument, all these parameters are therefore defined in parameter input files which reside inside the instrument's utilities module.
So far, we have adapted pyodine to work with spectra from two instruments, the SONG Hertzsprung spectrograph in Tenerife/Spain and the Hamilton spectrograph at Lick observatory/USA. For each of these instruments a utilities module exists with all required parameters and functions (called utilities_song and utilities_lick, respectively), and a third module utilities_waltz has been set up for the so-far untested new spectrograph at the Waltz telescope. In Section 4, we briefly describe the most important parameters employed and present examples of test results on data from these two instruments.

Typical workflow
Following the philosophy of pyodine, the main routines containing the workflows for template creation, observation modeling, and the combination and weighting of chunk velocities are generalized for all instruments; all differences between instruments that affect the major workflow can be controlled through the parameters in the utilities modules. Fig. 2 schematically presents the main steps of the workflow for modeling an observation, and we briefly explain it here: 1. The observed spectrum and corresponding stellar template (I obs and T * ) are loaded from the disk. 2. A Normalizer object is initialized with a high-S /N reference spectrum, and a first RV guess of the observation relative to the template is computed by cross-correlating each of the two with the reference spectrum (using orders that lie outside the I 2 -affected wavelength region). At the moment pyodine comes with atlases of Arcturus and the Sun (Hinkle et al. 2000) as reference spectra, but users can also add their own choices (through minimum changes to the code and the respective utilities modules). 3. The observation is split into chunks that roughly correspond to the template chunks in wavelength space (after correcting for the relative barycentric velocity shift between both, see Section 2.2), and pixel weights are computed -in this step information from a bad pixel mask or a telluric atlas can optionally be used. 4. The modeling is performed, optionally in multiple runs, where the exact choice of (sub-)models can be varied between runs, and parameter results from one modeling run can be used to change starting values or constraints of parameters in a later one. Each run works as follows: (a) Set up the spectrum model (with the sub-models desired for this run), and use it to initialize the fitter object. (b) Choose appropriate starting values and possibly constraints for the model parameters. If desired, fit results from an earlier run can be used here. (c) Loop over the spectrum chunks and model each one.
Store the fit results in an internal data object. (d) Possibly save this run's relevant model information and fit results for all chunks to file, as well as automatically created analysis plots. For example, in our tests on SONG and Lick spectra, a modeling procedure in two runs proved to deliver good results, where a Singlegaussian LSF model is used in the first run and the results serve as starting guesses for a second run with a Multigaussian LSF. The number of runs and their model and parameter setup are controlled through the parameter files in the utilities modules.
From the saved fit results of a number of observations, the velocities can later be loaded and combined to a RV time series using a routine that follows the steps described in Section 2.3.
In the process of creating a deconvolved stellar template, the modeling of the hot-star observation works largely as outlined above, with a few differences: -No stellar template is required, and no prior RV guess is computed. stellar template and the later observations, as they all correspond in (barycentric-corrected) wavelength space. pyodine offers different chunking algorithms for the user: One can either pack as many chunks of a given size as possible into the orders, thus making full use of the accessible spectral range -this should normally be used for precise RVs. However, we also offer the possibility of only chunking specific wavelength regions -the user can thus track velocity variations in specific absorption lines for example, which might prove valuable in asteroseismic measurements (compare Section 4.1.1).
-After the last run (item 4. in above flow outline), the bestfit results of the hot-star model are used to divide into chunks and deconvolve the input stellar template observation (I * ), and the resulting deconvolved template chunks are then saved to file.
The code was tested extensively on a desktop PC with a 3.1 GHz processor, and the workflow described above required typical computation times of 2.5 to 3 min for the modeling of a single SONG observation, and 4 to 5 min for a Lick spectrum. To accelerate the analysis of large time series of observations, pyodine offers the possibility to parallelize the modeling of spectra; this is currently done using the Python package pathos (McKerns et al. 2011). In our tests we ran the software on 12 cores of a desktop computer, and the reduction of the overall 2061 SONG observations of σ Draconis presented in Section 4.1 thus took 11.1 hr, while the Lick time series of 91 spectra of HIP 36616 (Section 4.2) was modeled in 33 min. Note that the subsequent combination and weighting of the chunk velocities of all modeled observations, as outlined in Section 2.3, computes within a few seconds (for short time series) to some minutes (for very long time series of O(10 4 ) observations). 5 5 The bottleneck here is not the actual computation itself, but the reading of the model results from the disk.

SONG spectra
The SONG instrument is a high-resolution, cross-dispersed échelle spectrograph, fed through the coudé path by the 1 m robotic Hertzsprung telescope at Teide Observatory on the island of Tenerife, Spain (Andersen et al. 2014; Fredslund Andersen et al. 2019a). It was built as part of an initiative to set up a world-wide network of spectrographs dedicated to asteroseismic measurements through RVs (Grundahl et al. 2007), and has been in operation since 2014. With the typically used slit the spectrograph reaches a resolving power of ∼ 90, 000, with a maximum of ∼ 115, 000 for the narrowest slit.
The spectral orders are extracted and wavelength-calibrated using a code based on the IDL package REDUCE (Piskunov & Valenti 2002). RVs are computed through the I 2 cell method with the IDL code iSONG, and for that 24 échelle orders of 2048 pixels each are used, covering the wavelength range from roughly 4994 to 6208 Å. They are split into 22 chunks of 91 pixels in width per order, resulting in a total of 528 chunks for each spectrum. Typically the Multigaussian model is used for fitting the LSF (Equation 3), and in the construction of the model an oversampling factor of 6 is employed.
To evaluate the performance of pyodine and compare it to the iSONG results, we tested our code on SONG observations of several different targets, while using exactly the same parameter setup as iSONG. The following two sections show results from a series of solar observations, which test the internal precision of the software, and from observations of the star σ Draconis, which illustrate an instrumental artifact of SONG.

Solar data
For the Solar-SONG initiative, the SONG spectrograph is equipped with an optical multi-mode fiber whose one end is mounted on a solar tracker outside the telescope dome, while the other end is focused on the entrance slit of the spectrograph  Fig. 3 (green). Right bottom: Power spectra of the full 2 hr time series of chunk-9 velocities (blue) and the combined detrended RVs (green). (Pallé et al. 2013;Fredslund Andersen et al. 2019b;Breton et al. 2022). SONG thus allows to take precise RV data of the Sun during daytime, which contributes to a deeper understanding of the solar interior.
Additionally, due to the extraordinary high S /N that is achieved in observations of the Sun even with short exposure times of 1 s and less, the short-term RV precision of the instrument and reduction code can be tested on time series of very high-cadence spectra. To probe the precision of pyodine and compare it to the results produced by the dedicated reduction pipeline iSONG, we modeled a series of 2000 spectra of the Sun obtained over the course of 2 hr on May 18, 2018 (on average more than 16 spectra per minute). As stellar template, we used a spectral atlas of the Sun (Hinkle et al. 2000), which was transformed into the required format.
The fitted chunk velocities were then combined as described in Section 2.3, but without correcting for barycentric movement, to create the RV time series shown in the left plot of Fig. 3. A long trend is seen, caused by the Earth's rotation, which we fit with a 3 rd -degree polynomial to remove it from the data. In the detrended time series (top right graph of Fig. 3) oscillations induced by solar p-mode pulsations are clearly visible with a peakto-peak amplitude around 4 m s −1 , leading to a RV scatter (standard deviation) of 1.11 m s −1 . Qualitatively these results are well in line with an analysis performed by Fredslund Andersen et al. (2019b) on a much larger data set of Solar-SONG observations (compare Fig. 2 in Fredslund Andersen et al. 2019b).
To exclude the systematics of the p-mode pulsations and get to the bottom of the intrinsic RV precision, we used a LOWESS filter (locally weighted regression and smoothing ;Cleveland 1979) to smooth the data. We tested different filter window sizes between 0.4 and 2.4% of the whole data set, corresponding to time windows between 0.5 and 3 min, which reduced the standard deviation of the filtered RV time series to values between 0.64 and 0.77 m s −1 . For the bottom right graph of Fig. 3, we chose to adopt a window size of 0.8% (roughly 1 min), which is short enough to completely filter out the dominant solar pmode pulsations with periods between 5 and 6 min, but still averages over sufficiently many observations to probe the pointto-point scatter. The filter reduces the standard deviation of the RVs to 0.69 m s −1 , which is a little smaller than the mean of the individual RV uncertainties as computed by pyodine,σ RV = 0.81 ± 0.07 m s −1 .
The results from pyodine for this 2 hr-time series of solar spectra match the original RVs produced by the dedicated SONG pipeline nearly perfectly: When performing the same analysis steps as presented above on the iSONG RVs, the detrended time series scatters with 1.12 m s −1 , and the LOWESS-filtered time series (window size of 0.8%) with 0.69 m s −1 (see Fig. A.1 in Appendix A for the respective results plots).
We furthermore used the same Solar-SONG observations to test the flexibility of pyodine with respect to chunking and more complex modeling: By providing appropriate start and stop wavelengths, we created 28 chunks centered around prominent absorption features, such as the Na D 1 + D 2 and several strong Fe, Mg, Mn and Ti lines. These chunks have varying widths, with those around broader lines being much wider than the usual 91 pixels / 2 Å in order to still include sufficient parts of the continuum; therefore, when using these chunks to model the same 2000 solar spectra as described above, we adopted quadratic wavelength and continuum models to account for non-linearities in the wavelength scale and blaze function. Fig. 4, left, shows the data and model results of chunk 9, centered around three closely spaced Fe I-lines around 5195.5 Å, for one solar observation. The relative (outlier-resistant) residuals between data and model have an rms of 0.94%, which corresponds to the median of this chunk's residuals over all observations. While the width of this chunk has been chosen quite conservatively to be nearly exactly 2 Å, some wider chunks (up to 5 Å in width) centered around other absorption lines show simi- The top right-hand graph in Fig. 4 shows a 30 min window of the best-fit velocity time series of chunk 9, along with a smoothed version of the data (using the same LOWESS filter as described above), and the detrended combined RVs from above. The single-chunk velocities vary much more than the combined RVs and their point-to-point scatter is much larger. Accordingly, the power spectrum of the chunk velocities of all 2000 observations (Fig. 4, right bottom, in blue) shows a higher noise floor compared to the one of the combined RVs (same graph, in green), with a visible power excess around the p-mode induced signal frequencies between roughly 2500 and 3700 µHz visible in the combined RVs (compare also Fig. 3 in Fredslund Andersen et al. 2019a). This is the case for most of our tested chunks centered around single absorption lines. We expect that the noise can be reduced significantly by including more observations; a similar analysis like this performed on a larger data set can then yield additional scientific value to a simple RV time series of combined chunk velocities (see Section 5 for further discussion).

Sigma Draconis
The main-sequence star σ Draconis (σ Dra, HD 185144) has long served as RV standard star (spectral type K0V, Keenan & McNeil 1989), and has been observed extensively by the SONG telescope since the beginning of operations in April 2014: Until early October 2021, a total of 2061 spectra were obtained with the standard I 2 cell to monitor the long-term precision of the instrument. The star has a V magnitude of 4.68 (Oja 1984), and typical exposure times with SONG were 5 -10 min.
The RV time series of these observations, as computed by the iSONG pipeline and shown in the top-most plot of Fig. 5, is modulated by a signal of roughly yearly period and amplitude of ∼ 20 m s −1 , leading to a scatter of roughly 5 m s −1 . A generalized Lomb-Scargle periodogram (GLS, Zechmeister & Kürster 2009) of the data shows a strong peak around a period of 1 yr, along with additional smaller peaks at roughly half and one-third of that period probably caused by harmonics of the 1 yr-signal (blue line in Fig. 5, bottom). Similarly, the RVs computed with pyodine show clear variations, albeit smaller in amplitude (overall scatter 4.5 m s −1 ), especially in some parts of the time series (e.g., between 2015 and mid-2017, see Fig. 5 second plot from top). Consequently, the GLS periodogram of the pyodine RVs displays less prominent peaks, particularly around the yearly period, and the half-year period even completely vanished (orange line in Fig. 5, bottom).
The cause of these RV variations are not completely clear at the moment. While Vogt et al. (2014) detect a potential planet around σ Dra with a 308 d-period, they report a small RV amplitude < 2 m s −1 ; it is thus implausible that a planetary modulation is the cause for the observed SONG RV variations. Furthermore, we have observed similar modulations with 1 yr-periods (and higher frequencies as well) in other targets observed over long time-scales with SONG, and thus we ascribe these variations to an instrumental effect. However, at present we do not fully understand the origin of this as our efforts so far have not yielded a conclusive cause. We have confirmed that our barycentric correction works correctly, and for the star σ Dra we obtain a similar result when using two different I 2 cells. Paul Butler has kindly performed a completely independent reduction (including spectral extraction) of the data and also finds a similar modulation. Furthermore, on March 18, 2018, we have attempted a 90 • rotation of the detector (marked by the green-dotted line in Fig. 5), with no effect.
The results presented in Fig. 5 suggest that there is some dependence on the software used as the modulation seen in the pyodine time series is less pronounced -although they are still clearly visible. Our present working hypothesis is that the root cause comes from a low-level problem in the detector readout, and the 1 yr modulation as well as shorter-period signals originate from the time-varying blending of the stellar and I 2 absorption lines due to the orbital motion of the Earth.

Lick spectra: HIP 36616
The Hamilton spectrograph at Lick observatory was one of the first instruments used in the search of extrasolar planets, and it served as the testbed in the development of the I 2 cell method (Vogt 1987;Butler et al. 1996). It is a cross-dispersed échelle spectrograph with a resolving power of 62, 000, fed by both the 3m Shane telescope and the 0.6m Coudé Auxiliary Telescope (CAT). When used for RV measurements, the Butler code served as the standard reduction pipeline.
From an RV survey of G-and K-giant stars conducted at Lick observatory between 1999 and 2012, we possess a large database of reduced spectra obtained with the combination of the Hamilton spectrograph and CAT, and their RVs were determined through the Butler code (see e.g., Frink et al. 2001;Reffert et al. 2015). This data allowed the detection of several (multi-)planetary and stellar binary systems, amongst them the first discovery of an exoplanet orbiting an evolved star (ι Dra, Frink et al. 2002).
In the development of pyodine, we made use of these data to test the performance of the software and compare it to the Article number, page 9 of 13 A&A proofs: manuscript no. main results from the Butler code. Using information from Butler et al. (1996) and result products in our data base, we aimed to reconstruct the parameters employed in the Butler code and setup our software accordingly: We use 16 orders of 1851 pixels each, covering a total wavelength range between roughly 5016 and 5872 Å. Each order is split into 44 chunks of width 40 pixels, resulting in a total of 704 chunks over all orders. The model is constructed with an oversampling factor of 4, and using the Multigaussian LSF model in the final run.
Overall, pyodine seems to perform similarly as the Butler code, as for a number of targets no significant differences could be registered between the RV time series from the two reductions. In the following, we present some examples of results for observations of one of our targets, HIP 36616, where some small differences are visible.
HIP 36616 (HD 59686) is a single-lined stellar binary system, with the main component HIP 36616 A being a horizontalbranch (HB) star of spectral type K2 III (Reffert et al. 2015) and V = 5.45 mag (van Leeuwen 2007). It is part of our Lick RV survey of evolved stars, and based on 88 RVs Ortiz et al. (2016) constrained the orbit of the invisible stellar companion HIP 36616 B and discovered RV variations indicative of a planetary companion orbiting the main component. In a dynamical analysis Trifonov et al. (2018) later showed that despite the high eccentricity of the stellar binary (e B ∼ 0.73) long-term stable orbits of the proposed planet exist.
We used the overall 91 I 2 observations of HIP 36616 in our data base, covering the years 1999 until 2011, to test pyodine on Lick spectra and compare the resulting RVs to the original ones determined through the Butler code. Fig. 6, top-most graph, shows the RV time series as determined by Butler from the Lick observations (blue data points), and the pyodine RVs computed from the same spectra (orange, offset by −1500 m s −1 ). The pyodine data set consists of 85 RVs, three less than the Butler reduction, as the modeling failed for some observations; this is at least partly caused by the fact that the extracted Lick spectra in our data base are not wavelength calibrated yet, and our construction of wavelength solutions (using a self-built pipeline based on CERES routines, see Brahm et al. 2017) returned bad results in some cases due to low quality of the respective ThAr calibration spectra. However, pyodine also successfully reduced three observations for which we do not possess any Butler RVs.
In both reductions, the high-amplitude, long-period variation caused by the stellar companion as well as the low-amplitude, short-period signal induced by the planet are visible. To allow comparison, we modeled each time series with a double-Keplerian model, similarly as described in Ortiz et al. (2016), using the capabilities of the modeling software for RV and photometry data Exostriker (Trifonov 2019). In each model, a stellar jitter estimate was added quadratically to the individual measurement errors; it was chosen such that the χ 2 red of the model becomes 1, leading to jitter values of 19.8 m s −1 for the Butler time series, and 19.6 m s −1 for the pyodine RVs. The best-fit models for both time series are plotted as dashed and dasheddotted lines in the top plot of Fig. 6, and the respective best-fit orbital periods of the two companions are printed in the legend (with uncertainties taken from the covariance matrices of the fits). The periods for each companion agree within their 2σerrors, which indicates a high similarity between the two time series.
The bottom two plots of Fig. 6 display the residuals between the RVs of the two time series and their respective best-fit models. The standard deviation of the Butler residuals is 19.5 m s −1 , as compared to 20.1 m s −1 for the pyodine RVs. Two distinct differences in the residuals are visible: First, in the Butler time series a number of RV data points around JD 2 454 500 follow a systematic downward trend from roughly 80 m s −1 to −55 m s −1 around the best-fit model, while in pyodine the residuals at that time are much smaller. This trend in the Butler time series is addressed explicitly by Trifonov et al. (2018), where these exact RVs constrain a dynamical model of mutually inclined orbits between the stellar and planetary companion. However, the authors argue that these data points could as well be noise-induced outliers. The second significant difference arises at the very end of the time series, where a few pyodine RVs deviate much more from their best-fit model as compared to the Butler RVs. This could be caused by inaccurate wavelength solutions as discussed above, but it might as well be noise, especially as the last pyodine data point, which shows the largest difference, is not even present in the Butler time series. Still, these differences between the two data sets illustrate the merit of using a second reduction software on the data, as it allows to assess the plausibility of individual RVs.
Furthermore, from March 2015 onward we continued to observe HIP 36616 with the SONG telescope, resulting in another 108 spectra. This allows to take advantage of the flexibility offered by pyodine and compute RVs for the SONG observations as well, which enables us to check the consistency with the Keplerian fit to the Lick RVs. Fig. 7, top, displays the time series determined through pyodine from both Lick and SONG spectra, along with the best double-Keplerian model to the combined RVs. With a jitter value of 17.3 m s −1 applied to both data sets the χ 2 red of the fit became 1. The residuals between the RVs and the model are 20.3 m s −1 and 16.1 m s −1 for the Lick and SONG data, respectively, indicating a higher instrumental precision of SONG as compared to Lick.
The best-fit parameters are printed in Table 1, and nearly all results fall within the 3σ-uncertainties of the original model from Ortiz et al. (2016). These results further strengthen the planet hypothesis as a cause of the observed RV variations, and it would be interesting to repeat the dynamical analysis performed by Trifonov et al. (2018) on the complete pyodine time series of Lick and SONG data.

Conclusions and outlook
In this work we presented the mathematical algorithms, structure and first results of pyodine, a Python-based code package for the computation of precise RVs from extracted spectra using the I 2 cell method. We demonstrated the flexibility of the software by applying it to data from two different instruments, namely the SONG project and the Hamilton spectrograph at Lick observatory. In both cases, pyodine reaches an overall RV precision that matches the ones by the dedicated reduction packages (iSONG and the Butler code, respectively), and we presented RV results of several interesting SONG and Lick targets.
Particularly on SONG observations of the Sun, pyodine proves its extremely good performance on high-S /N spectra and reaches an estimated short-term precision of 0.69 m s −1 . On a long-term time series of the star σ Dra however, a strong modulation with roughly yearly period is visible; the modulation is also present (and even more pronounced) in the time series computed with the dedicated instrument software iSONG, and is a known problem of SONG for some stars. It is most probably caused by an instrumental effect and has even been observed in a reduction of the σ Dra spectra with the Butler code, so we are safe to assume that it does not point at any problems of pyodine itself.
For a few Lick spectra of the star HIP 36616, we find significant differences between the pyodine and Butler RVs, with each code performing apparently better on some of these spectra, and worse on the others; whether this is caused by noise in the observations that is handled differently by the two reductions, or whether it points at an astrophysical phenomenon, cannot be answered at this point. However, from the whole RV time series of HIP 36616 and results from other Lick targets not presented in this work we can conclude that pyodine roughly matches the RV precision of the Butler code.
Additionally, we presented model results for the Solar-SONG observations of a single chunk centered around three Fe Ilines. The velocity time series of that single chunk, while being quite noisy, shows clear power excess close to the solar p-mode oscillation frequencies, and this is also true for most other singlechunk velocity time series that we tested. Inspecting single-line velocities such as demonstrated here -but on larger data sets covering longer baselines to reduce the noise -might offer additional scientific value: In helio-/asteroseismic analyses this could allow to probe different depths within the atmospheres of the observed targets, possibly aiding a better understanding of their compositions; similarly, by focusing on lines that are strongly affected by stellar activity, this method could help to disentangle Keplerian RV signals from stellar variability. Furthermore, centering chunks specifically around absorption lines can help optimize the efficiency of the code both with respect to RV content and computing power, as parts of the spectrum with low Doppler-content can be neglected and the total number of chunks thus brought down. Particularly for metal-poor stars, which have only few absorption lines and where many of the otherwise automatically generated chunks are poorly constrained, this method might also benefit the final RV precision.
Due to its demonstrated performance and usability and implementation as a Python package, pyodine will be adopted as the standard reduction software for SONG in the future. Furthermore, we plan to implement several upgrades, additions and improvements to the software: -Already in the current version, we have included an experimental routine to compute I 2 -free spectra from the stellar observations, following the basic recipe described in Díaz et al. (2019). This can be useful for instance to compute activity indices such as bi-sector spans, and we plan to improve the usability and functionality of the existing routine. -Investigating the wavelength-dependence of RVs can help distinguish signals caused by stellar activity from those induced by companions (often called Chromatic Index or CRX, compare Zechmeister et al. 2018). In pyodine, we have incorporated a similar routine which fits the slope of chunk velocities against their central wavelengths to return