An Improved Lyman‐Alpha Composite

The hydrogen Lyman‐alpha (Lyman α) line at 121.56 nm is the strongest solar vacuum ultraviolet emission line. Especially because of the impacts on planetary atmospheres, long‐term data sets of Lyman α are important for understanding solar and atmospheric processes. A revised composite data set of daily Lyman α values beginning in 1947 is constructed using measurements of Lyman α from Atmospheric Explorer E, Solar Mesospheric Explorer, Upper Atmosphere Research Satellite, and Solar Radiation and Climate Experiment. Gaps are filled using proxy models based on the Magnesium II index and the 10.7‐ and 30‐cm solar radio fluxes (F10.7 and F30).


Introduction
The hydrogen Lyman-alpha (Lyman α) line at 121.567 nm is the strongest solar vacuum ultraviolet emission line and the main excitation source for atomic hydrogen resonant scattering in cool material in the solar system (e.g., Emerich et al., 2005). Measurements of Lyman α are useful to examine and model solar and solar irradiance processes as well as the terrestrial atmospheric response and processes. Because Lyman α irradiance is absorbed in the Earth's atmosphere above 70 km, it is an important component of the forcing of the thermospheric-ionospheric system (e.g., Tobiska et al., 1997). On Mars, Lyman α measurements have been used to help define the atmosphere (e.g., Bougher et al., 2017; Chaffin et al., 2018). Additionally, Lyman α can serve as a proxy measurement for other solar emissions (e.g., Barth et al., 1990; Schöll et al., 2016), including its first use for solar irradiance modeling in the SERF2 models and later extensions to the EUV90, EUV91, and SOLAR2000 models.
The Lyman α line is produced in the solar transition region and radiated into the upper chromosphere where coherent scattering results in spectral broadening (Roussel-Dupré, 1993; Vernazza et al., 1981). The line profile has an approximately 0.1-nm width with a central self-reversal dip of about 30% (Kretzschmar et al., 2018; Lemaire et al., 2015). The Lyman α variability within a 27-day period is <40%, defined by the ratio of the maximum to the minimum flux in a 27-day window, while the variability over a solar cycle is about a factor of 2.
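As a rough illustration of the 27-day variability measure described above, the following sketch computes the maximum-to-minimum ratio in each 27-day window of a daily series. The function name is hypothetical, and expressing the result as (max/min - 1) in percent is an assumption about how the "<40%" figure is quoted.

```python
def rotational_variability_percent(flux, window=27):
    """Percent variability, (max/min - 1) * 100, for each full window of days.

    `flux` is a sequence of positive daily irradiance values with no gaps.
    The percent convention is an assumption made for this sketch.
    """
    if len(flux) < window:
        return []
    return [
        (max(flux[i:i + window]) / min(flux[i:i + window]) - 1.0) * 100.0
        for i in range(len(flux) - window + 1)
    ]
```

A series shorter than one window returns an empty list, so the caller can tell "no full window" apart from "no variability".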
Composite data sets of Lyman α, covering multiple solar cycles and longer periods than available from just one instrument, have been developed by Tobiska et al. (1997), Woods and Rottman (1997), and Woods et al. (2000). The composites of Tobiska et al. (1997) and Woods et al. (2000) were referenced to the Lyman α measurements from the SOLar STellar Irradiance Comparison Experiment (SOLSTICE) aboard the Upper Atmosphere Research Satellite (UARS; Woods et al., 1996), while the composite of Woods and Rottman (1997) was scaled by 95% to represent the average of the measurements from UARS SOLSTICE and the UARS Solar Ultraviolet Spectral Irradiance Monitor (Brueckner et al., 1993). Multiwavelength solar spectral irradiance composites include the SOLID composite of Haberreiter et al. (2017), which models from 0.5 to 1991.5 nm based on a probabilistic approach and 20 instruments, and the composite of DeLand and Cebula (2008), which covers the 120- to 400-nm range.
Lyman α composites are especially useful for long-term studies of solar forcing on the Earth's atmosphere and the development of solar spectral irradiance models. As examples of the broad use of Lyman α composites in trending studies, the Woods et al. (2000) composite has been used in the study of solar forcing of noctilucent clouds (Fiedler et al., 2017; Robert et al., 2010) and the study of the stability of the interstellar hydrogen inflow (Koutroumpa et al., 2017). This composite has also been used to develop and validate solar spectral irradiance modeling based on solar surface features for a variety of models including, as examples, Naval Research Laboratory's solar variability models (Coddington et al., 2019) and the ADAPT model described by Henney et al. (2015, and references therein).
The work described below is a major update to the Woods et al. (2000) composite. In this paper, we create a new composite data set, known as Version 4 of the University of Colorado Laboratory for Atmospheric and Space Physics (LASP) Lyman α composite, where we take advantage of multiple extended data sets with improved calibrations. This new composite is available with daily updates from the LASP. Major improvements include the use of SOLSTICE aboard the Solar Radiation and Climate Experiment (SORCE; Snow et al., 2005) as the reference, updated Lyman α observations, and different proxy models. In this paper, section 2 describes the previous three versions of the LASP Lyman α composite. Section 3 describes the new Version 4 composite. Section 4 discusses the results, and section 5 presents conclusions.

Previous LASP Lyman α Composites (Versions 1-3)
The original LASP Lyman α composite (Version 1) is described by Woods et al. (2000). It provides daily values of Lyman α irradiance for a 1-nm bandwidth and is based on measurements of Lyman α from three satellites and two proxy models. The Version 1 composite includes Lyman α measurements from Atmospheric Explorer E (AE-E; 1977-1980; Hinteregger et al., 1981), Solar Mesospheric Explorer (SME; through 1989; Barth et al., 1983), and UARS SOLSTICE (from 1991). Data gaps of less than 5 days are linearly interpolated, while longer gaps are filled in with the proxy models. The preferred proxy model is based on the Mg II core-to-wing ratio (Mg II index) (Heath & Schlesinger, 1986; Viereck et al., 2001). The second model (used when the Mg II index is not available) is based on the 10.7-cm solar radio flux (F10.7) measurements made near Ottawa and Penticton (Canada) (Tapping, 2013). The Lyman α composite goes back to 1947 with the use of the F10.7 proxy. Both proxy model linear equations have three terms: a constant term, a term for the proxy, and a term for an 81-day average of the proxy; these represent the quiet sun, short-term, and long-term variations, respectively. The 81-day averages are intended to remove modulation due to the 27-day solar rotation period. The F10.7 proxy is based on the square root of the F10.7, because the square root of the F10.7 has a better correlation with the Lyman α measurements. The models were fit empirically via linear regression. Gaps in the proxies of 5 days or less are interpolated, and longer gaps are filled via fast Fourier transform analysis. A 3-day smooth is applied to the entire composite.
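The three-term functional form described above (constant, proxy, and 81-day average of the proxy) can be sketched as an ordinary least-squares fit. This is a minimal illustration using NumPy, not the authors' code: the function names are hypothetical, the 81-day average is assumed to be a centered running mean that shortens at the ends of the series, and no published coefficients are reproduced here.

```python
import numpy as np

def running_mean(x, window=81):
    """Centered running mean; windows are truncated at the series edges."""
    x = np.asarray(x, dtype=float)
    half = window // 2
    out = np.empty_like(x)
    for i in range(x.size):
        lo, hi = max(0, i - half), min(x.size, i + half + 1)
        out[i] = x[lo:hi].mean()
    return out

def fit_three_term(proxy, lyman_alpha, window=81):
    """Least-squares fit of lya = a + b*proxy + c*<proxy>_81; returns (a, b, c)."""
    p = np.asarray(proxy, dtype=float)
    y = np.asarray(lyman_alpha, dtype=float)
    # Columns: quiet-sun constant, short-term term, long-term term.
    design = np.column_stack([np.ones_like(p), p, running_mean(p, window)])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

def apply_three_term(coeffs, proxy, window=81):
    """Evaluate the fitted three-term model on a proxy series."""
    p = np.asarray(proxy, dtype=float)
    a, b, c = coeffs
    return a + b * p + c * running_mean(p, window)
```

For the F10.7 case, the proxy passed in would be the square root of the daily F10.7 values, for example `fit_three_term(np.sqrt(f107), lyman_alpha)`, where `f107` and `lyman_alpha` are gap-free daily arrays of equal length.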
In 2007, the LASP Lyman α composite was updated to Version 2 to include the latest versions of Lyman α measurements from the Thermosphere Ionosphere Mesosphere Energetics and Dynamics (TIMED) Solar EUV Experiment (SEE) (Woods et al., 1998) and SORCE SOLSTICE. The TIMED and SORCE data are used after 2003 and are scaled to match the UARS reference level. For this revision, developed only several years into the SORCE mission, it was determined that to match UARS, the SORCE signal should be reduced by 4%. During the years 2001 to 2003, the composite uses the F10.7 proxy model instead of UARS and TIMED measurements, because there were unresolved degradation issues for both of those instruments in this period.
After the 2010 launch of the Solar Dynamics Observatory (SDO), Version 3 of the LASP composite was created, which filled gaps in the SORCE SOLSTICE Lyman α observations with Lyman α measurements from the Multiple Extreme ultraviolet Grating Spectrographs P (Hock et al., 2012) instrument of the Extreme Ultraviolet Variability Experiment (EVE) (Woods et al., 2010). In the Version 3 composite, TIMED SEE, SORCE SOLSTICE, and SDO EVE data are all scaled to UARS SOLSTICE and are not smoothed. Version 3 of the LASP composite, shown in Figure 1, contains eight components and covers more than six solar cycles and more than 70 years.
In the 10 years since the creation of Version 3, discrepancies discovered in the composite and a revision to UARS SOLSTICE (Version 18) have motivated a revision of the composite. Discrepancies at the junctures between components in the composite can be observed by plotting the composite against proxy measurements made by a single instrument (e.g., F10.7) during the same period. As examples of issues, in 1989, the composite jumps up by 10% as it switches from SME to the Mg II model, and prior to 1966, the F10.7 proxy was incorrectly scaled with an extra 1-AU correction. Also, the Mg II model in the composite is too low for periods in 1991 where the Mg II index had several month-long gaps that were interpolated. Finally, the 4% adjustment of SORCE SOLSTICE and SDO EVE Lyman α values to match UARS SOLSTICE was determined to be unnecessary.

Revised Composite
This paper discusses a revised LASP Lyman α composite (Version 4), which differs from the previous composite in several ways. First, the reference for the composite is now Lyman α measurements from SORCE SOLSTICE because it is better calibrated and has lower measurement uncertainty than UARS SOLSTICE. Second, the new composite uses different and updated measurements, proxy data, and models and excludes two short data sets (TIMED SEE and SDO EVE) that cause jumps in the time series. Finally, the composite is not smoothed except for the AE-E and SME time periods which are scaled from the previous composite.
As shown in Figure 2, the new composite has seven components. The composite is based on direct measurements of Lyman α from AE-E, SME, UARS SOLSTICE, and SORCE SOLSTICE, with proxy models based on F10.7, 30-cm radio flux (F30), and the Mg II index used to fill gaps. The Mg II index proxy is also used to create functions to transfer the scaling from one period to another. Sections 3.1 through 3.3 describe each of these components further and Section 3.4 describes their uncertainties.

10.1029/2019EA000648
Earth and Space Science

SORCE SOLSTICE and UARS SOLSTICE Lyman α
SORCE SOLSTICE (Level 3, Version 16) data are the primary reference for the daily Lyman α measurements for the Version 4 composite. SORCE was launched into low Earth orbit in 2003. SOLSTICE consists of a pair of monochromators and makes full solar disk measurements from 115 to 320 nm. Observations of Lyman α are made for several minutes on most orbits as well as throughout one full orbit. SORCE SOLSTICE was photometrically calibrated at the National Institute of Standards and Technology Synchrotron Ultraviolet Radiation Facility III prior to launch. SOLSTICE was designed to use stellar measurements of very stable stars to quantify instrument degradation and maintain calibrations in space. SORCE SOLSTICE combines three measurements (with a 0.033-nm step size) to achieve 0.1-nm resolution at far ultraviolet (FUV) wavelengths. The Level 3 data used for the Lyman α composite are binned to 1 nm. SORCE SOLSTICE differs from its predecessor UARS SOLSTICE in several ways. SORCE SOLSTICE has separate grating drives for the FUV and middle ultraviolet measurements, which allow different types of scans to be made simultaneously for the two wavelength regions, while UARS SOLSTICE had one shared grating drive. SORCE SOLSTICE also has larger detectors which provide a higher signal-to-noise ratio. The detector degradation has been extrapolated since 2011 when SORCE SOLSTICE stopped making stellar observations due to battery degradation. The SORCE SOLSTICE Lyman α measurements have an initial 1-sigma (1σ) calibration uncertainty of 5%. The degradation of SORCE SOLSTICE is corrected, but the uncertainty of the degradation rate correction increases in time, and this so-called trending uncertainty is about 0.3%/year. The SORCE SOLSTICE Lyman α measurements are included in the composite without any modification.
The 4% adjustment to SORCE SOLSTICE in the earlier version of the composite was based on limited SORCE SOLSTICE data; with a longer SORCE data set now available, the need for such scaling is not substantiated based on comparisons of the UARS and SORCE SOLSTICE data sets with the Mg II proxy model used for the Version 4 Lyman α composite.
The UARS SOLSTICE Lyman α (Version 18) data are the primary observation used in the Version 4 composite from 1991 through 2000. The FUV channel on this instrument takes measurements in 0.1-nm steps and with an effective resolution of 0.3 nm. The Level 3 data used for the composite were binned to 1-nm resolution. The uncertainties for these data are an initial calibration uncertainty of 5% and a trending uncertainty of 1%/year. Degradation was extrapolated after failure of the star tracker in 1999. The UARS SOLSTICE Lyman α measurements are included in the composite without any modification.

Proxy Models of Lyman α
A proxy model creates a time series of estimated measurements by applying a function to another measurement which has similar time series characteristics. An ideal proxy for Lyman α would have the same physical origins in the transition region and the chromosphere as the Lyman α emissions, have daily data with no or minimal gaps, and have low measurement uncertainty. In the Version 4 composite, proxy models are used to fill in gaps where there are no measurements from any of the four Lyman α instruments used in the composite and are also used to scale measurements between different instruments that do not overlap in time.
Comparisons to proxy models are also used to help validate data sets used in the composite; a measurement anomaly is suggested where a data set suddenly deviates from the proxy model. For this type of validation, the use of multiple independent Lyman α observations or proxies ensures that such anomalies are not due to a discrepancy in the proxy such as a jump when the sources switch for a proxy that itself is a composite (such as the Mg II index). The primary proxies considered for the Version 4 Lyman α composite were the Mg II index and solar radio fluxes.
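The validation idea above, looking for places where a data set suddenly departs from a proxy model, can be sketched as a simple ratio test. The function name, the use of the median ratio as the baseline, and the 5% tolerance are all assumptions chosen for illustration; the text does not specify a detection criterion.

```python
from statistics import median

def flag_deviations(measured, modeled, tolerance=0.05):
    """Return indices where measured/modeled departs from the median ratio
    by more than `tolerance` (a fraction, e.g. 0.05 for 5%)."""
    ratios = [m / p for m, p in zip(measured, modeled)]
    center = median(ratios)
    return [i for i, r in enumerate(ratios) if abs(r / center - 1.0) > tolerance]
```

Running the same check against a second, independent proxy would help separate an instrument anomaly from a jump in the proxy itself, in the spirit of the paragraph above.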
The Mg II h and k resonance lines near 280 nm are emitted in the lower chromosphere and have been used since the 1960s to understand chromospheric structure (e.g., Ayres & Linsky, 1976). The Mg II core-to-wing ratio, also known as the Mg II index, is defined as the ratio of the irradiance of the chromospheric Mg II doublet (line cores) to the irradiance of the photospheric wings and was created to be a proxy for EUV irradiance (Heath & Schlesinger, 1986). In a comparison of multiple proxies on different time scales, Dudok de Wit et al. (2009) found that the Mg II index was one of the best proxies for Lyman α. Kretzschmar et al. (2018) showed that the chromospheric Mg II index has a strong correlation with Lyman α, notably >95% within 0.04 nm of the line center. The advantage of an index that is a ratio of irradiances is that it has reduced sensitivity to instrument effects because uncorrected degradation that is linear in wavelength cancels out in the ratio. Additionally, as a ratio, the Mg II indices calculated for different instruments with differing resolution can be linearly scaled to one another if they have an overlapping time period, and this has allowed for the creation of long-term composites of the Mg II index (Snow et al., 2014).
Solar radio flux measurements can serve as indicators of solar activity and proxies for ultraviolet emissions. As described by White (1999)
Multiple proxies, including hybrids of measurements as well as different functional forms, were examined. The best proxy models were those with good proxy data availability and the lowest standard deviation of the regression of the proxy with the SORCE SOLSTICE Lyman α measurements. Potential proxies were assessed based on linear regression of the proxies to 15 years of SORCE SOLSTICE Lyman α measurements. Comparisons of some of the potential proxy data to SORCE SOLSTICE Lyman α measurements are shown in Figure 3.
As with the earlier versions of the composite, the chromospheric Mg II index is the best proxy for Lyman α, while among the solar radio wavelengths considered (3.2, 8, 10.7, 15, and 30 cm), F30 was found to be the best proxy for Lyman α. The standard deviation of regression of Mg II to the SORCE SOLSTICE Lyman α is 1.8%. The 1σ standard deviations were 2.3% for the regression with F30 and 4.2% for the regression with F10.7, while other available radio wavelengths had larger standard deviations. We also considered Ca II (Keil et al., 1997), also derived from chromospheric emissions, but its regression has a much larger standard deviation of 5.3%.
For the Version 4 Lyman α composite, the Mg II proxy model shown in equation (1) is constructed with a linear regression of SORCE SOLSTICE Lyman α from 2003 through 2017 with the Mg II index composite (Version 5) from the University of Bremen (as described by Snow et al., 2014). A composite of Mg II measurements from multiple instruments is used as a proxy because it covers a much longer time span than any individual instrument; continuous measurements of the Mg II index go back to 1978, covering four solar cycles. The Bremen Mg II composite has a mean 1σ calibration uncertainty of 0.3% (where the uncertainties are determined from pairwise comparison of the components during periods where they overlap) and has fewer interpolation-filled gaps than the NOAA Mg II composite used for Version 3 of the Lyman α composite. The two-term proxy model shown in equation (1)
The ratio of the Mg II proxy model to the SORCE SOLSTICE Lyman α is shown in Figure 4. The ratio deviates from unity by up to ±5% and has lowest variability around the solar minimum in 2008. Although a better fit to the SORCE SOLSTICE data can be obtained using more complex functions or doing the fit over just the rising or falling part of the solar cycle, these proxy models have much worse standard deviations from the Lyman α measurements when compared to pre-SORCE periods, such as for UARS SOLSTICE. The Mg II proxy model, based on the Bremen Mg II composite, is used to fill in gaps in the Lyman α composite back to 1978.
Solar radio flux proxies are used to fill the remaining gaps in the composite. Radio fluxes are measured from the ground, and well-calibrated daily measurements of F10.7 began in 1947, before the rocket and satellite era. Continuous measurements of solar radio emissions at other centimetric frequencies have been made since the 1950s by the Toyokawa and Nobeyama radio observatories (Japan) (Tanaka, 1967). In the original composite, F10.7, available since 1947, was used to fill all gaps prior to the advent of Mg II measurements in 1978. The F10.7 flux measurements are of the total emission at a wavelength of 10.7 cm over the whole solar disk over a 1-hr period in solar flux units, where 1 solar flux unit = 10⁻²² W m⁻² Hz⁻¹ (Tapping, 2013).
In Version 4 of the composite, we fill data gaps with an F10.7 proxy from 1947 to 1957 and with an F30 proxy from its earliest available date in 1957 to 1978 (when the Mg II proxy became available). These proxy models are created with linear regression over the SORCE SOLSTICE time period and are defined by equations (2) and (3). For the F10.7 proxy, the three-term linear functional form of √F10.7 was retained from the Version 3 composite, with new coefficients derived based on linear regression with SORCE SOLSTICE. Similar to what was found earlier for UARS SOLSTICE, we found empirically that the linear regression to Lyman α is better for √F10.7 than for F10.7 (standard deviations 3.8% and 4.2%, respectively) and that the inclusion of a √F10.7 81-day average term improves the fit even further (standard deviation of 3.1%). A simpler linear equation is used for the F30 proxy (equation (2)), because additional terms or using √F30 did not produce smaller standard deviations.
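The choice of gap-filling proxy by era can be summarized in a small lookup. This is only a sketch: the text gives the transition years (1957 and 1978) but not the exact dates, so the 1 January boundaries below are assumptions, as is the function name.

```python
from datetime import date

# Boundary dates are assumptions; the text states only the years.
COMPOSITE_START = date(1947, 1, 1)
F30_START = date(1957, 1, 1)
MG_II_START = date(1978, 1, 1)

def gap_fill_proxy(day):
    """Name of the proxy model used to fill a gap on the given day."""
    if day < COMPOSITE_START:
        raise ValueError("the composite begins in 1947")
    if day < F30_START:
        return "F10.7"
    if day < MG_II_START:
        return "F30"
    return "Mg II"
```

For example, `gap_fill_proxy(date(1965, 3, 1))` selects the F30 model, while any date from 1978 onward selects the Mg II model.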

SME and AE-E Lyman α
For Version 1 of the composite, Woods et al. (2000) reprocessed the SME and AE-E Lyman α data to remove anomalous periods and to adjust the data to match UARS SOLSTICE via proxy models. They scaled SME by a factor of 0.69 over its full measurement period and AE-E by a variable factor that ranged from 1.0 to 0.69. The original uncertainties were 40% for SME and 30% for AE-E, but these corrections resulted in improved uncertainties of 10% for both instruments in the Version 1 composite. These "cleaned" SME and AE-E Lyman α data were used in Versions 1-3 of the LASP Lyman α composite.
For the Version 4 composite, we used the Mg II proxy model to rescale the "cleaned" SME and AE-E data provided with the previous Version 3 of the LASP Lyman α composite. Our Mg II proxy fully overlaps the SME period (but not the AE-E period), and so we derived a linear regression function (f_v3_to_v4) which scales the Version 3 LASP composite during the SME period to the level of our Mg II proxy:
This scaling function was applied to both the SME and AE-E Lyman α data in the Version 3 composite to bring the SME and AE-E Lyman α irradiance for the Version 4 composite to the absolute scale of the SORCE SOLSTICE Lyman α irradiance while retaining the corrections to remove the anomalous period in the AE-E data. The 1σ standard deviation of the regression of SME to the Mg II model is 3.5%.

Uncertainties
Uncertainty estimates for data composites are difficult to determine due to the scaling of data sets. The coefficients in a proxy function depend on the exact time period used for its definition. This means that scaling based on short data sets (in this case the 15-year SORCE SOLSTICE period) may have trending errors that increase with time relative to the reference data set. Uncertainty estimates for the elements in the composite are summarized in Tables 1 and 2. Table 1 shows the 1σ uncertainties for the proxy models in the Version 4 Lyman α composite. All proxy models were linearly regressed to SORCE SOLSTICE Lyman α measurements over the years 2003 through 2017. SORCE SOLSTICE has an instrument calibration uncertainty of 5% plus a 0.3%/year instrument degradation correction uncertainty. The total uncertainty for each proxy model is due to two terms: a mean calibration (or systematic) uncertainty of the SORCE SOLSTICE instrument, including its trending uncertainty over 2003 to 2017, and a random (or precision) uncertainty given by the 1σ standard deviation uncertainty of the regression of the proxy to SORCE due to measurement noise and physical differences in the variability of the proxy and Lyman α irradiance. Because the proxy data are rescaled to SORCE SOLSTICE, we can ignore contributions from the calibration uncertainty of the proxy data itself in the total uncertainty of the model. The total uncertainty for the model components is determined with the residual sum of squares and is provided in the Version 4 data files.
Table 1 note. The systematic (calibration) uncertainty for SORCE SOLSTICE of 5.8% is the mean total uncertainty for the period of the fit based on the calibration uncertainty of 5% and a trending uncertainty of 0.3%/year. The random uncertainty is the uncertainty of the proxy model fit.
Table 2 provides the uncertainties in the Lyman α measurements used in the Version 4 composite. Listed separately are the reported instrument calibration uncertainties, random uncertainties, and an uncertainty in instrument trends, if known. Again, as with the model components, the total uncertainty for the independent observations is the residual sum of squares of the separate uncertainty terms and is provided in the data files. Because the Mg II proxy model was used to scale SME and AE-E records, we use its total uncertainty (from Table 1) as an estimate of the calibration uncertainty of the SME and AE-E records. The random uncertainty for SME is the standard deviation of f_v3_to_v4 relative to SME. The random uncertainty for AE-E is unknown and is assumed to be the same as that for SME.
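As a worked illustration of combining separate uncertainty terms, the sketch below assumes the terms are independent and adds them in quadrature (the square root of the sum of squares). That interpretation is an assumption of this sketch, and the function name is hypothetical; the values stored in the Version 4 data files remain the authoritative ones.

```python
import math

def total_uncertainty(*terms_percent):
    """Combine independent 1σ uncertainty terms (in percent) in quadrature."""
    return math.sqrt(sum(term * term for term in terms_percent))

# Example inputs taken from the text: 5.8% systematic (SORCE SOLSTICE)
# and 1.8% random (Mg II regression). The result is illustrative only.
mg_ii_total = total_uncertainty(5.8, 1.8)
```

Under this assumption the systematic term dominates, so the total for the Mg II model is only slightly larger than the 5.8% SORCE SOLSTICE calibration term.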

Results
The improvements and changes in the Version 4 composite are seen in the ratio of Version 4 to Version 3 that is shown in Figure 5. There is an overall upward shift by 4% for Version 4 relative to Version 3, because Version 4 uses an unadjusted SORCE SOLSTICE as the reference scale, while Version 3 reduced SORCE SOLSTICE by 4% in an attempt to scale it to UARS SOLSTICE. In the F10.7-model period, annual oscillations are seen, because we corrected the misapplication of a double 1-AU correction in the F10.7 used for the earlier composite versions. Differences in proxies, the switch to the F30 proxy, the scaling of SME and AE-E, and the removal of SDO EVE and TIMED SEE are also all apparent in this ratio. Although both composites use UARS Version 18 data, since the ratio of the composites is not 1 for the UARS period, it is apparent that there was an undocumented revision of Version 18 data at some point.
A comparison of 81-day smoothed data for the Versions 3 and 4 Lyman α composites is shown in Figure 6. The deep and protracted solar minimum in 2008, observable in the Lyman α irradiance, has been suggested as a cause of low thermospheric density (Solomon et al., 2011).

Conclusions and Future Lyman α Measurements
A new Version 4 of the Lyman α composite has been constructed using SORCE SOLSTICE as the primary reference scale. The revisions, including the addition of an F30 proxy, have fixed many of the discrepancies in the previous versions of the composites. This composite should be useful to examine other solar data sets and to model both solar and terrestrial atmospheric emissions and processes. Version 4 of the solar Lyman α composite is available from the LASP Interactive Solar Irradiance Datacenter (http://lasp.colorado.edu/lisird).
The SORCE mission is planned to end in December 2019. After this, the Version 4 composite will transition the primary reference Lyman α data set from SORCE SOLSTICE to the Extreme Ultraviolet and X-Ray Irradiance Sensors (EXIS) on the Geostationary Operational Environmental Satellite-R (GOES-R) series of four satellites; these data are not yet publicly available and so cannot be included in the composite at present. The first two of these satellites are GOES-16, which launched in November 2016, and GOES-17, which launched in March 2018. The standard product from the Extreme Ultraviolet Sensor in EXIS includes Lyman α measurements at a 30-s cadence with 1-nm resolution. Inclusion of GOES Lyman α in the Version 4 composite will use a daily average that excludes the reduced nighttime signals obtained while the satellite observes the Sun through the geocorona. The EXIS Lyman α measurements agree well with SORCE SOLSTICE, which will enable the future inclusion of operational EXIS Lyman α measurements into the composite with minimal scaling.
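The daily averaging described above can be sketched as a mean over the high-cadence samples that are not affected by geocoronal absorption. How affected samples are identified is not specified in the text, so the per-sample boolean flag used here is a hypothetical input, as is the function name.

```python
from statistics import mean

def daily_average(samples, in_geocorona):
    """Mean of one day's samples, skipping those flagged as viewed through
    the geocorona. Returns None if no unflagged samples remain."""
    kept = [s for s, flagged in zip(samples, in_geocorona) if not flagged]
    if not kept:
        return None
    return mean(kept)
```

Returning None for a day with no usable samples leaves that day as a gap, which the proxy models could then fill in the same way as for other missing measurements.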
The use of the Lyman α composites for solar irradiance modeling may increase after the end of the SORCE mission. SORCE SOLSTICE measures from 115 to 320 nm. After the end of the mission, wavelengths in the middle ultraviolet above 200 nm will continue to be measured by the Spectral Irradiance Monitor aboard the Total and Spectral Solar Irradiance Sensor (Harder et al., 2005). However, except for the GOES-R measurements at Lyman α, there will be a gap in spectral solar irradiance measurements in the FUV (100 to 200 nm) that will need to be filled with models, and these may be based in part on the Lyman α observations by GOES-R satellites.