Stellar Spectroscopy in the Near-infrared with a Laser Frequency Comb

The discovery and characterization of exoplanets around nearby stars is driven by profound scientific questions about the uniqueness of Earth and our Solar System, and the conditions under which life could exist elsewhere in our Galaxy. Doppler spectroscopy, or the radial velocity (RV) technique, has been used extensively to identify hundreds of exoplanets, but with notable challenges in detecting terrestrial mass planets orbiting within habitable zones. We describe infrared RV spectroscopy at the 10 m Hobby-Eberly telescope that leverages a 30 GHz electro-optic laser frequency comb with nanophotonic supercontinuum to calibrate the Habitable Zone Planet Finder spectrograph. Demonstrated instrument precision<10 cm/s and stellar RVs approaching 1 m/s open the path to discovery and confirmation of habitable zone planets around M-dwarfs, the most ubiquitous type of stars in our Galaxy.


Introduction
Measurements of the periodic Doppler shift of a star using the spectroscopic radial velocity (RV) technique provide evidence of an unseen orbiting exoplanet and its minimum mass [1]. Complemented by photometric measurements of the transiting exoplanet [2], one can obtain the mass and radius (and density) of the exoplanet, which are critical parameters for classification and assessment of habitability. While the RV technique has been used extensively, the highest precision measurements have been limited to the visible region of the spectrum <700 nm, and not the infrared [3]. Thus, 70% of the stars in our galaxy-M-dwarfs, which primarily emit in the near infrared (NIR)-have largely been outside the efficient spectral grasp of the current generation of proven RV instruments [4]. Due to their proximity, abundance, and small radius and mass, M-dwarfs are very attractive targets in the search for habitable zone rocky planets with NIR spectroscopy. The habitable zone [5] of an M-dwarf is close-in to the star, such that an orbiting Earth-mass planet within this zone induces an RV shift on the order of 1 m/s [6]. This is ten times larger than the RV signature of Earth around the Sun, and significantly increases the detectability of such an Earth-mass planet. Stellar convection and magnetic activity (e.g. granulation and starspots) can obfuscate the small RV signature of terrestrial planets, but many of these noise sources are suppressed in the NIR compared to the optical [7]. High RV precision measurements of bright stars in the visible will benefit from complementary high RV precision NIR observations that better discriminate stellar activity from real planet signals [7,8].
While these motivations are well known, a combination of factors has limited the precision of even the best infrared RVs to 5-10 m/s over timescales of months [9,10]. Low-noise silicon detector arrays are not efficient at wavelengths beyond ~900 nm, requiring infrared detectors and cryogenic instruments. Existing wavelength calibration sources in the infrared are not yet as effective as iodine cells and thorium-argon lamps in the visible. Fiber-fed spectrographs with the required intrinsic stability in the infrared are only now being built and tested [11][12][13]. We overcome many technological and analysis hurdles that have impeded NIR RV measurements from reaching the ~1 m/s level, and we introduce a complete suite of infrared spectroscopic tools and techniques that provide the necessary precision for discovering and characterizing planetary systems around M-dwarf stars (see Fig. 1). We have developed a 30 GHz optical frequency comb spanning 700-1600 nm built on integrated electro-optic (EO) modulators [14][15][16] and high-efficiency nano-photonic nonlinear waveguides [17,18] to provide a robust calibrator for long-term operation at the telescope. This comb-based calibrator provides tailored light to the stable Habitable Zone Planet Finder (HPF) spectrograph, designed and built from the ground up for precision infrared RVs [19]. The frequency comb and spectrograph have been installed at the 10 m Hobby-Eberly Telescope, and we have demonstrated differential stellar RVs at 1.53 m/s (RMS) over months-long timescales, as well as shown that the combcalibrated HPF can support RV precision as low as 6 cm/s.
To the best of our knowledge, this is the first time such RV precision has been realized in the near infrared wavelength region beyond the spectral grasp of silicon detector arrays. This interdisciplinary result is an important step on the path to discovery and characterization of Earth-mass planets in the habitable zones of the nearest stars. The higher planet-to-star radius ratio in these M dwarf systems will boost the transit signal by almost two orders of magnitude compared to Sun-like stars, making atmospheric studies of Earth-like planets feasible with current technology. Moreover, discovering and characterizing such targets is critical for the near-future detection of biomarkers in transiting terrestrial planetary atmospheres via spectroscopy with the James Webb Space Telescope [20]. The frequency comb is generated via electro-optic modulation of a continuous wave (CW) laser followed by nonlinear spectral broadening and amplitude filtering stages to tailor the spectrum. (B) The spectral envelopes recorded with a low-resolution grating spectrometer at different points in the setup: Green -at the output of the SiN chip waveguide, Blue -after a static amplitude filter, and Purple -after a programmable amplitude filter. The two insets are high-resolution recordings of the 30 GHz comb modes centered at 910 and 1210 nm respectively.
Our frequency comb is built around a combination of electro-optic and integrated-photonic technologies to address the challenges of bandwidth, mode spacing, and robustness; see Fig. 2A. In addition, we leverage recent advances with electrooptic frequency combs to mitigate the impact of multiplicative microwave phase noise on the comb tooth linewidth and coherence [16,18]. The frequency comb begins with 1064 nm continuous wave (CW) light from a semiconductor laser that feeds waveguide electro-optic modulators driven by a 30 GHz microwave source. This results in a comb of approximately 100 teeth spaced exactly by the microwave drive frequency. The CW laser, microwave source, and all other frequencies in the system are referenced to a GPS-disciplined clock that provides absolute traceability to the SI second. The initial comb is next passed through a 30 GHz Fabry-Perot cavity that acts to filter and reduce the impact of broad bandwidth electronic thermal noise [16]. The frequency comb is then amplified in an ytterbium fiber amplifier, spectrally broadened and temporally compressed to a pulse width of 70 fs, and finally focused into a 25 mm long nonlinear silicon nitride (SiN) waveguide [17,18]. The experimental setup is covered in more detail in Supplement 1.
The waveguide is 750 nm wide and 690 nm thick, which provides the combination of tight confinement and engineered dispersion to achieve a supercontinuum spectrum spanning 700-1600 nm with only 525 mW of incident average power (18 pJ of pulse energy). This low power reduces the thermal loading and aides the long-term operability. Following spectral generation, a combination of static and programmable amplitude filters [30] are used to tailor the spectral envelope. The output of each amplitude filtering stage is shown in Fig. 2B along with high resolution insets that show the resolved 30 GHz comb modes at 910 and 1210 nm. Optical heterodyne measurements between the 30 GHz comb and a frequency-stabilized reference laser confirm that the stability of the comb matches that of the GPS-disciplined clock with fractional uncertainty of 2x10 -11 at 1 s, and <3×10 -13 for timescales of 1 day and longer. Significantly, electro-optic frequency combs could be employed for coverage from <600 nm to >2500 nm [17,18].
The entire frequency comb has been assembled on a 60 cm × 152 cm breadboard and installed at the Hobby-Eberly telescope at McDonald Observatory, together with the HPF spectrograph. The HPF is a vacuum-housed crossdispersed echelle spectrograph with dimensions of approximately 1.5 m × 3 m and resolving power of R=λ/Δλ~53,000. The spectrograph is designed and optimized for stability, with its optics platform cryogenically cooled and temperature stabilized at the milliKelvin level [19,31]. The 28 echelle orders cover the wavelength range 810-1280 nm, and the spectra are recorded on a 2048×2048 pixel Hawaii-2 (H2RG) Mercury-Cadmium-Telluride (HgCdTe) infrared detector array that has a 1.7 μm long wavelength cutoff. The HPF entrance slit is fed by 3 optical fibers that can be simultaneously illuminated with star, sky, and calibration light to track the instrument drift. A more detailed description of experimental methods is provided in Supplement 1.

Spectrograph Calibration and Radial Velocity Measurements
The frequency comb was integrated into the HPF calibration system at the end of February 2018 and has been operating continuously and autonomously since early May 2018. The light from the comb system is coupled to the HPF calibration source selector system (see Kanodia et al. [32] for a layout) with a 50 μm diameter multimode fiber. This selector couples the light from any source (including the comb) into a 300 μm dynamic fiber agitator which is coupled to one port of a twoinch integrating sphere. We have shown [33] that the combination of temporal scrambling and an integrating sphere is effective at minimizing modal noise to deliver a constant illumination. The dynamic fiber-agitator is a customizable commercial system from Giga Concept that applies torsional oscillations to the fiber (under compression) at a few Hz. An identical second fiber agitator is also used to couple the calibration light to the calibration fiber coupled to the telescope facility calibration unit.
On-sky stellar observations of stable M-dwarfs and stars with known orbiting planets have been employed to characterize the system's present RV precision. Figure 3A shows example stellar and comb spectra as recorded on the HgCdTe array, illustrating the HPF's spectral range and the uniformity of the comb across the 28 orders. Using the comb as a reference, a wavelength solution is derived for each order that is subsequently used to calibrate the corresponding stellar spectra. Benchmarking the full instrumental chain (from telescope to comb to detector), as well as spectral extraction, RV measurement algorithms, and corrections of barycentric radial velocity requires on-sky observations of a star that is intrinsically stable, or has a known RV signal. Barnard's star is a bright nearby star of spectral type ~M4. While its RVs show a low-amplitude signal attributed to an exoplanet [35], it is still among the most stable M stars known. Figure 3B shows the residual RVs of Barnard's star from 118 high signal-to-noise measurements over a 3-month period. Observations were limited to 5-minute exposures to prevent saturation of the NIR detector. The scatter of individual (5-minute) RV measurements is 2.83 m/s, and when data within a ~1-hour HET observation window (or track) are binned to increase the signal to noise, the scatter is reduced to 1.53 m/s. This RV precision is unprecedented in the near infrared, and approaches that of the best measurements for this star with visible-band spectrometers (e.g. 1.23 m/s using the HARPS instrument [34]). A more detailed description of experimental methods is provided in Supplement 1.
Additional experiments allow us to probe the intrinsic calibration stability of the HPF spectrograph independent of stellar observations. To accomplish this, we take advantage of an auxiliary optical fiber that transmits the laser comb light to the primary focus of the telescope, where it is diffused and sent to the HPF spectrograph through both the "science" and "sky" fibers; Fig. 4A. In this arrangement, two sets of comb calibration spectra are recorded by HPF and processed to provide RV wavelength solutions from both fiber channels. As the stable frequency comb is common to the fibers, the retrieved RVs will reveal the slow drift of the HPF in each channel. However, the difference of the two channels should ideally be zero within the uncertainty of the photon noise. As such, this measurement allows us to carefully probe non-common-mode variations between the light paths from the two fibers to the detector array that would ultimately limit the RV precision. Such instabilities could arise from independent drifts in the optical fiber feeds, the spectrograph optics, systematics and noise in the HgCdTe detector array, or data reduction issues.
Measurements were taken over multiple nights and the absolute drift of the HPF is recorded in both channels every 5 or 10 minutes, with the results shown in Fig. 4B. As seen in these data, the drift of the HPF was about 5-7 m/s during a 5-10  Table S1 in the Supplement for the binned RVs along with a listing of equivalent exposure times).
hour measurement window, with larger daily excursions due to the liquid nitrogen fills. The difference of the RVs of the two channels is shown in 4C, revealing that the scatter of a single 10 minute differential drift measurement is at the 20 cm/s level. We further see that if we bin multiple measurements, the RV precision improves approximately as the square root of total measurement time to as low as 6 cm/s at 300 minutes, Fig. 4D. Furthermore, we observe no statistically significant drift in the difference of the RVs retrieved from the two fibers over 6 days.
These data support the assertion that the comb-calibrated HPF has the intrinsic stability to support infrared spectroscopy with precision below 10 cm/s. With the HET this was a particularly harsh test since the calibration light splays across a significant fraction of the 30 arcmin focal plane-leading to fiber modal noise which is known to be worse in the NIR than the optical. A fiber-based mode scrambler was used to mitigate modal noise, but it likely does not fully solve this issue. Our measured RV scatter is currently about 5 times greater than the photon limited precision (3.7 cm/s in 10 minutes). Further improvements towards the photon limited precision should be possible with algorithmic improvements, better characterization of the NIR detectors, and better modal scrambling. The laser frequency comb light is coupled via the HPF calibration bench and a fiber scrambler into an auxiliary optical fiber whose output is diffused across the HET focal plane to illuminate the HPF "science" and "sky" fibers. These fibers carry the light to the HPF spectrograph, where it is dispersed and detected on the HgCdTe detector array. (B) Absolute drifts of the HPF measured in the two fiber channels over 4 nights of tests. Each point is a 10 min exposure, except for data on 03-05, where the exposure time is 5 minutes. Note, the RVs from the "science" fiber are offset by 2 m/s in the plot for clarity. The total daily peak-to-peak drift of HPF is15-20m/s, which is primarily caused by daily liquid nitrogen fills. Dashed lines show the time of these daily fills. The HPF was also exponentially approaching thermal stability in March when this test was performed. (C) The difference of the drifts between the RVs recovered from the "sky" and the "science" fibers. (D) Characterization of the RV stability by successive binning of measurements. This demonstrates that binning to effective averaging time of 300 minutes results in RV stability below 10 cm/s.

Discussion and Conclusion
In summary, we have introduced a new frequency comb platform in the NIR (700-1600 nm) that leverages the reliability of telecom electro-optic modulators and efficient chip-integrated nonlinear waveguides to provide an ultrastable calibrator for the HPF spectrograph. Since May 2018, the frequency comb and HPF instruments have been running autonomously at the 10 m Hobby-Eberly telescope, enabling high cadence RV measurements on one of the largest optical telescopes in the world. This combination of instrumentation forms a unique and powerful set of tools for exoplanet science focused on M-dwarfs. Our stellar RV measurements with scatter of 1.53 m/s are presently the most precise achieved in the NIR with HgCdTe detector arrays, and now approach the best RV measurements with more mature visible wavelength silicon detectors and associated technology. Within this context, the instrumentation and techniques we introduce represent a significant step towards the long-desired goal of RV spectroscopy in the NIR with precision at (and below) 1 m/s, as will be critical for the discovery and characterization of Earth-mass planets in the habitable zones of the nearest stars.
Finally, during the review of this paper, Ribas et al. [35] announced the discovery of a 3.2 Earth mass (RV amplitude = 1.2 m/s) exoplanet candidate orbiting Barnard's star with a period of 233 days. While our HPF RVs are not inconsistent with the orbit proposed by Ribas et al., our observations extending over 86 days coincided with the least dynamic region of the orbital phase curve, resulting in the flat time series shown in Figure 3C. Nonetheless, it is significant that our first RVs from the comb-calibrated HPF are at a level of precision that makes them relevant for state-of-the-art exoplanet astrophysics. Further discussion of our RVs in the context of this exoplanet candidate is found in Supplement 1.

SUPPLEMENT 1 1. LASER FREQUENCY COMB
The laser frequency comb is arranged on a 2.5' x 5' optical breadboard that is covered by a light-tight optical enclosure. The breadboard and enclosure are housed inside a temperature controlled 'calibration' room which resides in the basement of the Hobby-Eberly telescope (HET). This room also houses the HPF calibration bench and source selector. Support and diagnostic equipment, including the computer that controls and monitors the frequency comb operation, are located directly outside of the calibration room inside a standard 19" electronics rack. Custom designed software and an internet connection allow remote monitoring and control of the frequency comb from off-site locations. Similarly, the Habitable Zone Planet Finder (HPF) spectrograph is housed in a separate temperature-controlled room and has its own electronics rack and computer system for control. The computer systems of the frequency comb and HPF are fully integrated into the HET's internal network and can be accessed remotely by the resident astronomer during operation. A detailed diagram of the frequency comb setup is shown in Fig. S1 and the basic architecture draws from recentlyintroduced techniques with electro-optics frequency combs [1][2][3]. A continuous wave (CW) laser at 1064 nm is amplified and then modulated at 30 GHz by three phase (PM) and one intensity modulator (IM) to generate ~100 comb lines spaced at 30 GHz. The number of PMs (three in this case) is chosen to provide the spectral bandwidth required to drive the subsequent nonlinear optical expansion of the spectrum. Only one IM is required, and it serves to carve out pulses with ∼50% duty cycle from the phase-modulated light. When the phase of the 30 GHz signal applied to the IM and PM are correctly aligned in time, the IM will carve the light only when the chirp from the PMs is mostly linear, hence providing a relatively flat and easily compressed spectrum [1].
The CW laser is phase-locked to a home-built 125 MHz mode-locked laser (MLL) [4]. The MLL and 30 GHz RF signal are both phase-locked to a GPS-disciplined Rb clock to ensure the absolute frequency and spacing between the 30 GHz comb lines remain stable. The 30 GHz pulse train is then temporally compressed using pure quadratic phase from a programmable phase filter to achieve ~330 fs pulses. A linear Fabry-Perot (FP) optical cavity with finesse ~600 (linewidth of 50 MHz) is then used to reduce the multiplied thermal noise of the 30 GHz microwave source [2]. To align the comb and FP cavity for maximum transmission, a portion of the CW laser is split off from the main light path and modulated with PM sidebands to generate a Pound Drever Hall (PDH) error signal that is used to lock the cavity length to the CW light. The CW light is sent through the cavity in the reverse direction and on an orthogonal polarization state with respect to the main frequency comb light. The absolute frequency of the CW light relative to the 125 MHz reference comb is adjusted to maximize transmission of the 30 GHz comb. The 30 GHz comb output from the cavity is then amplified with ytterbium gain fiber to 2 Watts average power and output coupled to free space. Up to this point in the setup all components have been fiber-coupled and utilized polarization maintaining fiber.
The free space output from the amplifier is coupled into a 10 m length of normal dispersion photonic crystal fiber (PCF) to broaden the spectrum. Normal dispersion fiber was used to limit the effect of modulation instability and to provide a smooth spectral phase. At the output of the PCF the comb bandwidth spans ~50 nm and the smooth spectral phase allows subsequent recompression to a near band-limited duration of ~70 fs. A portion of the compressed pulse train (~525 mW) is focused into a silicon nitride (SiN) waveguide which produces a spectrum spanning from 700-1600 nm. The waveguide is silica cladded silicon nitride. The silicon nitride core has a cross section of 750 nm by 690 nm. These dimensions produce a region of anomalous dispersion around 1064 nm and appropriate phase matching to generate the dispersive waves near 700 nm and 1560 nm. The waveguides are tapered to 200 nm within 300 μm of the input and output to expand the mode profile for greater free space coupling efficiency. Ligentec fabricated the 25 mm long, low-loss waveguides. The resulting devices had up to 60% total throughput coupling using a diffraction limited input coupling lens.
We verify the stability and the linewidth of the 30 GHz comb following spectral broadening via heterodyne with independent stable CW lasers at 1070 nm, 1319 nm and 1565 nm. At these three wavelengths we observe comb tooth linewidths of 0.56 MHz, 0.955 MHz, and 1.4 MHz, respectively. The 0.56 MHz linewidth is consistent with the linewidth of the 1064 nm CW laser locked to the auxiliary 125 MHz comb, and the modest increase to 1.4 MHz at a frequency span of 91 THz (approximately 3000 comb teeth from the initial 1064 nm CW laser) is to be expected from microwave noise multiplication. However, this linewidth is much below the 50 MHz width of the Fabry-Perot filter cavity, indicating the efficacy of the high frequency cavity filtering on the ultimate comb linewidth. With the same optical heterodyne, we further verify that the center frequencies of the comb lines are precisely tied to the 30 GHz mode spacing and frequency of the 1064 nm laser with uncertainty better than 2x10 -11 for all times greater that 1 second. This is limited not by the frequency comb itself, but by the GPS-disciplined rubidium clock that serves as the master system frequency reference.
A static dielectric band reject filter is used to cut out the region of high optical power near the center of the broadened comb. The comb is then further tailored using a custom-built programable spectral amplitude filter. The filter utilizes a 2dimensional liquid-crystal on silicon (LCoS) Spatial Light Modulator (SLM) and transmission grating arranged in a traditional 4-f pulse shaping setup [5]. The SLM has the ability to provide dynamic real-time control to update the filter mask to compensate for comb fluctuations. However, due to the relatively stable nature of the comb we have not yet utilized this feature and instead program the SLM with a static filter mask.
Light from the frequency comb is fed directly into the HPF calibration bench via a multimode optical fiber with a 62.5 µm diameter. From here, the comb light can be directed either to the HET Facility Calibration Unit [6] on the telescope through a Giga Concept fiber scrambler and auxiliary calibration fiber and used to illuminate the sky and science fibers of HPF or coupled through a second Giga Concept fiber scrambler which feeds an integrating sphere, with the HPF calibration fiber attached to one of the integrating sphere ports. These compact fiber scramblers help reduce modal noise from the highly coherent laser comb lines.
Our frequency comb architecture permits unsupervised operation with minimal remote intervention. Instead of nightly turn-on/turn-off procedures, the comb is kept on and ready to use at all times, and has an uptime of over 98% since continuous operation began more than 6 months ago. The majority of the downtime was due to a recurring computer issue that compromised the use of the comb for a total of 3 nights, but that has since been resolved. Building on the passive stability of the front-end components and the low-power supercontinuum, we have implemented automated computer control of servo set points for the comb, as well as control of the supercontinuum spectrum. Control software monitors the condition of each servo loop and automatically changes secondary parameters (e.g. temperature and powers) to keep the locks continuously in range of the primary servo output. We have also implemented slow feedback loops on the bias of the IM and the power injected into the SiN waveguide to maintain the spectral profile of the supercontinuum. With this, we have achieved long term relative amplitude fluctuations at the 10% level across the majority of the HPF bandwidth.

On-Sky Observations
The unique tilted Arecibo-style design of the HET allows observations of any given star only during one or two short tracks (~1hr) during the night. During such a track the constantly changing pupil of the HET leads to the variation in both collecting aperture and illumination. The use of octagonal fibers, with specially designed high-performance ball-lens scramblers [7] is necessary to achieve high RV precision. The LFC calibration light does not see this variable illumination. All observations at the HET are executed in a queue [8] by the HET resident astronomers.
As part of HPF early testing and commissioning to assess the on-sky RV stability of HPF, we obtained 191 spectra of Barnard's star (spectral type M4V, J band magnitude of 5.24) from February 25 to July 21 2018. Of these 191 spectra, we excluded a total of 25 spectra with a) a signal-to-noise ratio (SNR) <150 evaluated as the mean SNR at λ~1.07 micron (HPF order 57) since this was significantly lower than the median SNR (545) and indicated a problem with telescope guiding, acquisition or weather, b) spectra showing an excess sky-background and/or another stellar spectrum in the sky fiber, and c) spectra where a readout glitch introduced by a known issue with the Teledyne SAM H2RG GigE readout caused those spectra to be unusable.
This resulted in a total of 166 spectra with a median SNR of 545 per 1D extracted pixel at 1.07 micron. We used these 166 spectra for the generation of the master template, and performed a further cut of the spectra to present in the RV stream in Figure 3C, as is further described in the subsections below Figure 3C For the RVs presented in Figure 3C, we performed a further cut on the selection of the 166 spectra for the following reasons. First, the 166 spectra were taken in two overall different instrument and calibration system configurations which we describe here. Spectra taken from February 25-April 20 (45 spectra) were acquired as part of early HPF and LFC commissioning with varying instrument configurations (e.g., variations and upgrades to the calibration system fiber feed). Spectra taken from April 25-July 21 (121 spectra) were taken in the final configuration of both HPF and the calibration system. This time period is when we have the highest quality spectra, the most uniform drift calibration from the LFC, and regular flat field frames. We therefore only used spectra from the following time period for the RVs presented in Figure 3C.

Spectra used for the RV stream in
Second, we only use tracks (the unique design of the HET only enables acquisition of targets during a discrete east or west track, ~1hr long for Barnard's star) that had 4 or more individual high-quality exposures per track to allow us to bin down RVs from observations in each track, with cumulative exposure time of at least 20 minutes (shown as the red points in Figure 3).
These additional cuts resulted in a selection of 118 spectra presented in the RV stream in Figure 3C. These spectra had a mean SNR of ~600 per 1D extracted pixel evaluated at λ~1.07 micron (HPF order 57). These spectra were obtained with two main exposure time lengths of either ~315s (99 spectra) or ~157.5s (19 spectra), corresponding to 30 and 15 up-the-ramp non-destructive reads on the HPF H2RG detector (10.45s for each read), respectively. The shorter exposure times were only performed on two nights (April [28][29], where the goal was to conduct higher cadence observations of Barnard's star while minimizing saturation. For all subsequent spectra, we adopted the longer exposure time for consistency, with the goal to achieve an SNR of 500-600 per spectrum per 1D extracted pixel at 1.07 micron.

Spectra used for the master RV template
To generate the highest SNR master template, we used the set of all 166 spectra from February 25-July 21, 2018. Although the early spectra from February 25-April 20 were not taken in the final configuration of the instrument, being at significantly different barycentric velocities than the later spectra, using them in the generation of the master template helped both generate a more accurate representation of the master template in the telluric regions, along with yielding an overall higher SNR template.

Generation of 1D extracted spectra
The full frame HPF H2RG array reads out non-destructively during exposures at a cadence of 10.45 seconds without any resets. Before fitting a slope to estimate e-/s flux from this up-the-ramp data, we corrected for various H2RG detector noise and artifacts. The largest source of correlated noise is bias fluctuations in the four HPF readout channels. Using the lightinsensitive reference pixels and correlated signal in the light sensitive pixels between orders, we subtracted out the bias fluctuations to significantly below the readout noise. The bias-corrected data were then corrected for non-linearity using nonlinearity curves we derived in laboratory testing. Detailed description of these algorithms can be found in Ninan et al. [9]. The 2D e-/s flux image calculated from the up-the-ramp data was then flat fielded to correct for pixel-to-pixel quantum efficiency differences. The echelle order position, as well as the profile on the array, was traced using a continuum source spectrum. This trace and profile were used for rectification and the optimal extraction of the flux to create a 1D spectrum [10][11][12]. For correcting the chromatic pixel sensitivity as well as other instrumental response across the spectrum, we divided the extracted spectrum using the de-blazed 1D extracted spectrum of the continuum source.
Wavelength calibration was done by fitting a smooth dispersion solution to the centroids of the LFC lines. The centroids were estimated by fitting individual Gaussian profile models with linear background. In typical LFC spectra, the background level is about 1% of the peak flux, and is primarily due to grating scattering. Our fitting algorithms, as described here, are largely robust against the background continuum at the present RV levels. For precise barycentric velocity as well as instrumental drift correction, flux weighted midpoints of the exposure time of each echelle orders were calculated using the differential flux in the raw up-the-ramp data.
For precise instrumental drift correction, we created a high signal-to-noise ratio template of the LFC from the epochs near the one used for solving the wavelength dispersion solution. This template was then transformed in pixel space by a polynomial model to fit the LFC spectrum at different epochs. The fit was performed by least square minimization of each echelle order individually. The residuals were weighted proportionally to the flux. However, to prevent the brightest LFC lines from dominating the minimization, we de-weighted the brightest 2 percentile pixels to the same level as the 98 percentile pixels. The instrumental drift measured from daily LFC comb calibration was then linearly interpolated to the flux-weighted epochs of stellar observations. This drift correction was then applied to the wavelength dispersion solution of each individual echelle order.
True instrumental RV precision is measured by calculating how well the instrumental drift can be corrected across the fibers in HPF. The measurement setup shown in Figure 4 illuminated both the science and the sky fiber of HPF simultaneously with the LFC. The instrumental drift was estimated separately from both the fibers as explained in the previous paragraph. The drift estimates were then subtracted from each other to measure any residual drift. Scatter in our estimate from a 10-minute exposure is 20 cm/s; this is larger than the expected photon noise level scatter of 3.7 cm/s. We have identified a few possible sources of errors like modal noise, and inter pixel sensitivity variation that need to be mitigated to further improve this measurement precision.

RV reduction
The RVs in Figure 3C were extracted with a least-square template-matching RV fitting method described by Anglada-Escude & Butler [13]. This algorithm is based on minimizing the differences of the observed spectrum against a high signal-to-noise master template constructed from a co-addition of all available observations of the target star. This method has been shown to be particularly powerful when extracting RVs for mid-to-late M-dwarf spectra, where the stellar continuum is not locally smooth due to the large number of molecular absorption lines [13]. To extract RVs from the HPF 1D spectra, we adapted the publicly available SpEctrum Radial Velocity Analyzer (SERVAL) code [14], a Python implementation of the RV templatematching algorithm. Here we provide an overview of how we extracted the HPF RVs, focusing on a brief description of the core algorithm along with highlighting our main changes to the base SERVAL code. We direct the interested reader to the SERVAL paper [14]. The template-matching algorithm in SERVAL calculates the RVs in three main steps. First, in the 'pre-RV' step, all of the spectra are first brought to the reference frame of the Solar system barycenter by correcting for the barycentric motion of the Earth. To calculate the barycentric velocity of the Earth we use the openly available barycorrpy package [15], which has been shown to be precise at the ~1cm/s level. After the barycentric correction, the highest SNR spectrum is used as an initial template to calculate the maximum likelihood 'pre-RV' through minimizing the differences between the observed spectra and the initial template. For the second step, the algorithm uses these 'pre-RVs' together with the barycentric correction to bring all of the individual spectra to a more accurate common reference frame to then coadd the individual spectra to create a high-SNR master template. As the third and final step, this master template is used to derive a more robust set of final RVs. For our RV reduction, we skipped the first 'pre-RV' step, i.e., we generated the template assuming a pre-RV of 0 for all of the spectra observed.
Although HPF has a total of 28 echelle orders covering the information-rich z, Y and J NIR bands, for our RV extraction we focused on using the 9 orders least affected by telluric absorption in the HPF bandpass. Specifically, for the extraction presented here we used spectra corresponding to grating orders 72-69 (covering the 841.15-889.5nm region in the z-band), and 61-57 (covering the 993.3-1076.7nm region in the Y-band). We are actively working on incorporating the other HPF orders for the RV extraction, to leverage the larger information content offered in doing so. However, more work remains on optimally extracting the RVs in the presence of large-scale telluric absorption regions.
To acquire the highest precision RVs, both lines due to telluric absorption and/or sky emission lines present in the atmosphere either need to be modeled out or explicitly masked. We follow the SERVAL implementation of both telluric and sky-emission lines masks: any areas affected by variable telluric and/or sky-emission are either severely down-weighted (in the creation on the master template spectrum), or fully masked out (in calculation of final RVs).
To generate a binary telluric mask, we first created a finely-sampled master telluric spectrum using the open source TelFit package [16]. TelFit is a Python wrapper to the well-known Line-By-Line Radiative Transfer Model package [17] which is capable of calculating a telluric-absorption spectrum given a number of different atmospheric input parameters. For our telluric mask, we calculated a synthetic telluric spectrum across the full HPF bandpass, assuming the following default TelFit values: a humidity of 50% at the HET latitude (30.6 degrees) and altitude (2100 m). We then used this output TelFit spectrum to generate a thresholded binary mask where we masked everything below a transmission of 99.5% as a telluric. We experimented with different thresholding levels from 99% to 99.9%, and use 99.5% for the best RV reduction, though in practice, the final standard deviation of the RVs did not vary substantially between these threshold levels. We elected to use the native high-resolution telluric spectrum from TelFit-i.e., without degrading the spectral resolution to the HPF R~53,000 mean resolution-to better help mask out smaller less-deep telluric regions. After creating the threshold binary mask, we further broadened the mask by 17 wavelength resolution elements (each resolution element in the mask being 0.027A). The exact choice of broadening did not impact the final RVs substantially. Table S1. Binned radial velocity measurements and associated 1-sigma uncertainties from our observations of Barnard's star. These RVs are plotted in Figure 3C as the red points. The errors are the associated weighted-average errors from the 1-sigma unbinned errors from SERVAL (blue errorbars in Figure 3C), along with a fixed fiber-to-fiber LFC calibration error of 20cm/s per 5minute bins added in quadrature. BJD  We created a sky-emission line mask in a similar manner to the steps outlined above for the telluric masking with some slight differences. To create the mask we used a master sky-background frame taken by staring at a blank patch of the night sky for 10 minutes on the night of March 29th 2018 (moon phase of 98%). After subtracting the background continuum of this deep sky spectrum, we derived the locations of the stellar lines by thresholding anything above 5 sigma as a sky emission line. In addition to masking out the telluric and sky emission lines, we masked out stellar lines. We specifically masked out the 3 Ca II infrared triplet lines (Ca IRT) centered at 8498.02A, 8542.09A, and 8662.14A (air wavelengths), which are known to be highly sensitive to stellar activity [18]. In addition to these lines, we masked out the following 8 deep (>30% deeper than surrounding pseudo-continuum) atomic lines (8426.504A, 8434.959A, 8435.650A, 8468.4069A, 10496.122A, 10661.628A, 10677.053A, and 10726.392A; air wavelengths) that were either close to the edge of an order, and/or showed excess correlated noise (at the ~0.5-1% level) in the residuals. Although Barnard's star has been shown to exhibit low levels of stellar activity [19], we performed tests with (σRV,binned=1.53m/s) and without these lines masked (σRV,binned=1.77m/s). We additionally masked out three other regions-corresponding to 10140.2-10142.7A, 10527.7-10531.3A, and 10739.4-10742.2A (in the barycentric rest-frame)-that did not correspond to any deep lines but showed excess correlated noise in the residuals which we attribute as bad-pixel regions on the HPF detector. We continue to explore the reasons why these lines and bad-pixel regions add noise in an effort to disambiguate between detector and extraction induced issues and astrophysical noise. We surmise that improved flat fielding has the potential to further increase the overall RV precision presented here. These challenges, some from the unique design of the HET Facility Calibration Unit [8], remain to be overcome.
We have calculated the expected photon noise for the spectrum in the bandpass used using the formalism presented by [20]. We calculate the quality factor (Q) from the high signal-to-noise master RV template. For each observed spectrum we mask out tellurics and sky emission regions, along with the noisy stellar lines. Following [20], using the Q calculated from the template for the unmasked regions, we estimate the photon noise limited stellar RV precision (including detector read noise and digitization noise) per observation by taking a weighted average of the photon noise RV precision for each individual order. In doing so, we estimate a median photon noise RV precision of 1.68m/s compared to the as-observed scatter of 2.83m/s for the unbinned data. Similarly, for the binned data, we estimate a median photon noise error of 0.71m/s compared to the as-observed scatter of 1.53m/s. In both cases, we see that the as-observed scatter is larger by a factor of ~2, suggesting that our observations are limited by other noise sources. Table S1 lists the weighted-average RVs (red points in Figure 3C), where each point represents the weighted-average radial velocity from all RVs from one HET track with a cumulative exposure time of at least 20 minutes. The errors are the associated weighted-average errors from the 1-sigma un-binned errors from SERVAL (blue errorbars in Figure 3C), along with a fixed fiber-to-fiber LFC calibration error of 20cm/s per 5-minute bins added in quadrature. The binning of RV observations obtained inside the same HET track (<1hr of each other) serves to increase the SNR and has historically also been used to beat down any short-term stellar activity and oscillations. We chose to acquire individual 5-minute observations and bin the RVs to avoid detector saturation or large non-linearity that would result from longer exposures. Figure S2: a) HPF observations (gold) with the proposed 233-day exoplanet orbit from Ribas et al [21] shown as a solid blue line. We have fit and subtracted a zero-point offset to the HPF velocities. b) Residuals of the HPF RVs after subtracting the exoplanet model. The residuals have an RMS scatter of 1.59 m/s, statistically indistinguishable from our baseline 1.53 m/s stability. c) HPF RVs phased to the proposed 233-day exoplanet period. Figures generated using RadVel [22].

Consistency of HPF RVs with the Barnard's star exoplanet candidate
Ribas et al. [21] reported the detection of an exoplanet candidate with orbital period 233 days and RV amplitude 1.2 m/s around Barnard's star. While our RV stability should be sufficient to detect such a planet, our 86 day observing baseline coincided with the flattest part of the proposed eccentric orbit for Barnard b, resulting in the stable RV curve shown in Figure  3c. In Figure S2, we show our HPF RVs alongside the Ribas et al. orbital model. We have modeled and subtracted a zeropoint offset to the HPF RVs, but have not otherwise modified the RVs or the planet model. When comparing our observations to the phased orbit model, we see that we do not cover the RV maximum, where the candidate planet's eccentric (e = 0.3) orbit creates the greatest deviation from a flat line. Hence, our stability results are essentially identical regardless of the RV model adopted; we find an RMS of 1.53 m/s around the flat-line model, and RMS = 1.59 m/s around the Ribas et al. exoplanet model.