CLIMCAPS—A NASA Long‐Term Product for Infrared + Microwave Atmospheric Soundings

The Community Long‐term Infrared Microwave Combined Atmospheric Product System (CLIMCAPS) spans more than two decades of soundings from infrared and microwave instrument pairs on Aqua, Suomi‐NPP, and NOAA‐20. CLIMCAPS builds on three decades of research within the National Aeronautics and Space Administration Atmospheric Infrared Sounder Science Team (AST) to retrieve profiles of atmospheric temperature, water vapor, ozone, and many other trace gases as well as cloud and surface properties. In this paper, we highlight four innovative aspects of the AST retrieval methodology that allow the CLIMCAPS record (2002 to present) to bridge technology differences, maintain global coverage, and stay up to date through fast reprocessing. CLIMCAPS improves on the AST method with rigorous optimal estimation uncertainty quantification, which clarifies the signal‐to‐noise in sounding retrievals and supports their use in scientific research.

(the interferogram was truncated to accommodate downlink limitations). But late in 2015, the full spectral resolution CrIS mode became fully operational to provide a better ability to measure carbon monoxide (CO) and upper tropospheric humidity, as well as improved calibration of the shortwave band. A second CrIS + ATMS pair was launched on the JPSS-1 platform (known as NOAA-20 in operations), and a third pair on JPSS-2 (NOAA-21) late in 2022. Two additional pairs are scheduled for launch in the next two decades. With this, NASA and NOAA are committed to collaboratively supporting a long-term sounding record for a wide range of science and applications concerning Earth's weather and climate systems.
Section 2 provides an overview of how CLIMCAPS builds on the AST retrieval methodology to continue the AIRS + AMSU record with sounding retrievals from CrIS + ATMS. We outline our CLIMCAPS product objectives in Section 2.1 and in Section 2.2 discuss the innovative aspects of the AST method that help CLIMCAPS achieve the stated objectives. Section 3 gives a brief overview of the range of values available in the CLIMCAPS record with conclusion and future work in Section 4.

Overview of CLIMCAPS Soundings
The AIRS and CrIS instruments measure top-of-atmosphere (TOA) radiance emitted from the Earth's surface and atmosphere in very narrow spectral intervals along the full infrared (IR) spectral range (650-2,680 cm⁻¹). These IR radiance (or emitted energy) measurements are highly mixed signals of many different physical properties and gaseous constituents that manifest all along the atmospheric column, from Earth surface to top of atmosphere. Retrieval methods extract a finite set of easy-to-use discrete pressure-dependent geophysical variables (soundings) from these measurements. In this section we focus on hyperspectral IR measurements because they contribute most of the information content CLIMCAPS requires to retrieve the atmospheric state. CLIMCAPS combines co-located microwave measurements (from AMSU on Aqua and ATMS on SNPP and the JPSS series) with IR channels, but only when retrieving atmospheric temperature and water vapor as well as the three surface variables (emissivity, reflectivity and skin temperature).
In Figure 2, we illustrate what we mean by hyperspectral IR measurements being a mixed signal of many overlapping spectral features. We do this for seven atmospheric gases as well as temperature. We calculated each feature as the absolute difference in brightness temperature (BT), given a perturbation in the target variable. We derived BT from radiances simulated with the CLIMCAPS a-priori atmospheric state from [9.0°S, 89.7°W] on 17 October 2015 at 19h23m UT. We perturbed each variable along all 100 pressure layers as follows: temperature = 1.0 K, H₂Ovap = 10%, O₃ = 10%, CH₄ = 10%, CO = 30%, HNO₃ = 100%, N₂O = 5.0%, and CO₂ = 6 ppm. Our forward model in this experiment is the same one used in the CLIMCAPS Bayesian retrieval steps (Smith & Barnet, 2020), namely the Standalone AIRS Radiative Transfer Algorithm (SARTA) (Strow et al., 2003). In practice, the overlapping spectral signals such as those for AIRS in Figure 2 mean that we cannot retrieve the absolute quantities of each gas or the exact temperature at specific pressure levels from an instantaneous measurement. In technical terms, the retrieval problem is under-determined and ill-posed. The measurements do not contain enough information to retrieve a geophysical variable along the entire atmospheric column with absolute accuracy. A robust retrieval method has to extract the signal and weigh it against all possible sources of uncertainty and noise to separate the radiance measurement into a set of Earth system variables that can be used in downstream applications. For this reason, there can be large variations across retrieval methods, and ultimately, we design retrieval algorithms to meet the signal-to-noise requirements of target applications.
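The feature calculation described above (absolute BT difference under a perturbation) can be sketched as follows. This is a minimal illustration only: it uses a blackbody Planck function as a hypothetical stand-in for the SARTA forward model, and the function names, the 280 K scene, and the five sample wavenumbers are assumptions made for the example, not part of CLIMCAPS.

```python
import numpy as np

# Radiation constants for wavenumber units (cm^-1).
C1 = 1.191042e-5  # mW / (m^2 sr cm^-4)
C2 = 1.4387769    # K cm

def planck_radiance(wavenumber, temperature):
    """Blackbody radiance in mW / (m^2 sr cm^-1)."""
    return C1 * wavenumber**3 / np.expm1(C2 * wavenumber / temperature)

def brightness_temperature(wavenumber, radiance):
    """Invert the Planck function: radiance -> brightness temperature (K)."""
    return C2 * wavenumber / np.log1p(C1 * wavenumber**3 / radiance)

# Hypothetical stand-in for a forward model: a blackbody scene at 280 K,
# sampled at a few wavenumbers across the 650-2,680 cm^-1 range.
wavenumbers = np.linspace(650.0, 2680.0, 5)
baseline = planck_radiance(wavenumbers, 280.0)
perturbed = planck_radiance(wavenumbers, 281.0)  # 1.0 K temperature perturbation

# Spectral feature: absolute BT difference between perturbed and baseline runs.
feature = np.abs(brightness_temperature(wavenumbers, perturbed)
                 - brightness_temperature(wavenumbers, baseline))
```

For a blackbody the feature is 1.0 K in every channel; with a real radiative transfer model the response varies by channel, which is what produces the overlapping signatures discussed above.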
We defined five general objectives for the CLIMCAPS Level 2 product suite to support scientific research of medium- to large-scale atmospheric processes and enable informed product inter-comparisons. It should be noted that CLIMCAPS is a Principal Investigator-led algorithm and we define these objectives based on our knowledge of existing data gaps and requirements. Our overarching goal is to generate a product that minimizes clear-sky bias in large-scale process studies and enables scientific research with rigorous signal-to-noise quantification.

CLIMCAPS Product Objectives
We designed CLIMCAPS to build on the AST legacy with a long-term, consistent record of atmospheric soundings across multiple satellite platforms. To that end, the system should:
1. Exist for any modern-era hyperspectral IR and advanced MW instrument pairs. The algorithm should be written in such a way as to minimize instrument-specific effects, exploit instrument information content, and discriminate between physical correlations (e.g., the climate sensitivity of H2Ovap∕ ) as well as spectral correlations in the measurements (e.g., spectroscopic variability of H2Ovap∕ ). The goal here is to enable a long-term continuous record of atmospheric soundings from multiple instruments.
2. Have minimal dependence on things we cannot know with reasonable accuracy. To first order, these include clouds as well as uncertainty due to background state variables not being retrieved in a given step. Cloud parameters (e.g., particle sizes and shapes, optical properties) are highly non-linear and can change quickly over short time and space scales. This makes cloud parameters difficult to retrieve, especially from IR (insensitive to microphysical cloud properties) and MW (insensitive to non-precipitating clouds) measurements alone. As demonstrated in Figure 2, the IR measurement is complex, with overlapping signals from different parts of the atmosphere and other absorbers. If interfering signals are not minimized and explicitly accounted for, they can obscure atmospheric variability in the target retrieval variable and result in a low-quality retrieval.
3. Have global coverage and availability day and night for all seasons across clear and partly cloudy skies. The system cannot rely on regional or highly tailored a-priori estimates that would restrict background knowledge to target areas or specific weather regimes. This would render the retrieval product inferior in all regions except the one(s) defined by the a-priori. To allow consistent quality in products across the globe and under most environmental conditions, the retrieval algorithm should avoid dependence on ancillary datasets that are unavailable for or unskilled in remote regions, such as the Pacific Ocean or Antarctica.
4. Quantify and propagate all known error sources, random and systematic, so that the product information content can be known and traceable. Averaging kernel matrices quantify the signal-to-noise of a Bayesian retrieval system and can be used to separate the retrieved from the assumed (a-priori) information. Error traceability is achieved when a system fully characterizes and propagates instrument noise, radiative transfer model biases, uncertainty in a-priori estimates, as well as the inter-correlations among atmospheric, cloud, and surface properties.
5. Be very fast. This enables easy reprocessing of the full record every time a significant algorithm upgrade is made. The modern era of sounding instruments employs a large number of spectral channels and already spans two decades. This record will continue to grow for at least the next two decades if JPSS-3 and -4 launch successfully. IR retrieval systems typically do not retrieve variables in isolation but instead retrieve multiple variables to properly account for interfering signals. An upgrade in one component means a rerun of the entire system to preserve the long-term continuity and stability of the sounding record. A fast system has the added benefit that it can function in real-time if necessary (i.e., generate sounding products within an hour of satellite overpass). By "fast," we mean a system that maintains the same rate of processing per CPU (central processing unit) as the rate at which the measurements were made. For example, the CrIS instrument measures 30 fields of regard (or 3 × 3 fields of view) along each scanline in 8 s. A Level 1B granule with 45 scanlines takes 6 min to measure. We, therefore, define a fast retrieval system as one that can retrieve the full atmospheric state (Figure 3 and Table 1) from each field of regard in 0.27 s.

Figure 3. A flow diagram of the Community Long-term Infrared Microwave Combined Atmospheric Product System (CLIMCAPS) retrieval algorithm that starts with (top left) reading infrared (IR) Level 1B radiance files and ends with (bottom right) Level 2 geophysical retrieval and cloud cleared radiance files. CLIMCAPS applies Hamming apodization only to the Cross-track Infrared Sounder (CrIS) Level 1B IR radiances. A degree of local-angle correction (LAC) is needed for each instrument and CLIMCAPS applies LAC to each field-of-regard to remove view-angle differences among the 3 × 3 fields-of-view. The IR radiances, irrespective of instrument type, are cloud cleared in Step 1 to remove cloud radiative effects ahead of geophysical retrievals. Temperature is retrieved only after the Earth's surface within the field-of-regard is known, and retrieved a second time once all the interfering species are known. CLIMCAPS retrieves geophysical variables sequentially using optimal estimation and cloud-clears the IR radiances a final time using the retrieved cloud-free state. All the trace gas species, O₃, CO, HNO₃, CO₂, N₂O, CH₄, and SO₂, are retrieved using subsets of IR channels from AIRS or CrIS and are represented by the "IR OE" solid boxes. CLIMCAPS combines collocated Level 1B microwave measurements with the IR channel subsets for the retrieval of atmospheric temperature (Tp) and water vapor (H₂Ovap) as well as the three Earth surface variables, namely skin temperature (Ts), emissivity, and reflectance. These retrieval steps are represented by the "IR + MW OE" dashed boxes.
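The throughput definition of "fast" given above follows from simple arithmetic on the CrIS scan geometry, worked through in the short sketch below. The constants come from the text; the variable names and the `keeps_pace` helper are hypothetical and only illustrate the bookkeeping.

```python
FORS_PER_SCANLINE = 30       # fields of regard per CrIS scanline
SECONDS_PER_SCANLINE = 8.0   # time to measure one scanline
SCANLINES_PER_GRANULE = 45   # scanlines in one Level 1B granule

# Time budget per field of regard if retrieval keeps pace with measurement.
seconds_per_for = SECONDS_PER_SCANLINE / FORS_PER_SCANLINE

# Time to measure one granule, and the number of retrievals it contains.
granule_minutes = SCANLINES_PER_GRANULE * SECONDS_PER_SCANLINE / 60.0
fors_per_granule = SCANLINES_PER_GRANULE * FORS_PER_SCANLINE

def keeps_pace(retrieval_seconds_per_for):
    """True if a single CPU retrieves at least as fast as the instrument measures."""
    return retrieval_seconds_per_for <= seconds_per_for
```

The budget works out to 8 s / 30 ≈ 0.27 s per field of regard, or 1,350 full-state retrievals in the 6 min it takes to measure one granule.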

CLIMCAPS Adaptation of the AIRS Science Team Methodology
The design of CLIMCAPS benefits from many years of involvement in the AIRS Science Team, extensive experience within the NOAA operational environment as well as with real-time direct broadcast systems, collaboration with National Weather Service forecasters, and the development of application-specific products (Berndt et al., 2020; Maddy & Barnet, 2008, 2011; Maddy et al., 2009; Nalli et al., 2013; Smith et al., 2015; Susskind et al., 2003; Weaver et al., 2019; Weisz et al., 2013). We gave detailed technical and scientific descriptions of CLIMCAPS in a series of publications (Smith & Barnet, 2019; Smith et al., 2021). Thrastarson et al. (2021) contrasted the AIRS V7.0 and CLIMCAPS V2.0 algorithms in their Table 2 as well as their Figures 11 and 12. They identify a few components of AIRS V7.0 not present in CLIMCAPS V2.0, namely a neural network regression a-priori as well as stochastic cloud clearing without explicit error propagation to complete its sequential Bayesian retrieval. In this section, we focus on how the five objectives we set for CLIMCAPS (Section 2.1) translate into an algorithm design that is different from AIRS V7.0 and with products that are suitable for in-depth scientific research. With this, we aim to motivate why we adopted a few key components of the original AST retrieval method and why we introduced novel elements. Figure 3 provides a simplified outline of the main algorithm design.
Four components of the AST retrieval method (Susskind et al., 2003) make up an integral part of the CLIMCAPS algorithm. First, "cloud clearing" is used to remove cloud spectral signals from the IR radiances. Cloud clearing is a simple, highly linear, fast method that requires no prior knowledge of clouds at the time of measurement. It does not depend on complex radiative transfer calculations through clouds nor co-location with imager-based measurements. Cloud clearing requires the aggregation of every 3 × 3 cluster of IR measurements (or 9 fields of view) to harness spatial information content and retrieve a single cloud-free IR measurement for all subsequent retrievals. This method was developed more than four decades ago (Chahine, 1974, 1977) and is still relevant today. The cloud clearing method allows CLIMCAPS to retrieve accurate atmospheric soundings in partly cloudy atmospheres and thus achieve near global coverage. Most importantly, the AST cloud clearing method enables CLIMCAPS to quantify the error introduced by clouds so that it can propagate this error into subsequent retrievals and thus distinguish between natural variability (signal) and spectral noise.

Note. The full set is retrieved at every available field-of-regard (FOR; ∼50 km at nadir) except the cloud fraction that is retrieved at every field-of-view (FOV; ∼13 km at nadir). Cloud variables are retrieved from Level 1B radiances, while atmospheric and Earth surface variables are retrieved from cloud cleared infrared (IR) radiances. The variables that depend on additional microwave (MW) channel sets (indicated by †) use cloud cleared IR channels and Level 1B MW channels. The MW radiances are not cloud-cleared. All the variables listed here are available in the Level 2 product file (see Data Availability Statement Section for details). * Products with two-dimensional averaging kernel matrices at every retrieval field-of-regard. † Products with two retrievals each, using a MW and a MW + IR channel set respectively. a Products retrieved using only IR channels. ‡ Products retrieved using only MW channels.
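The linearity that makes cloud clearing fast can be illustrated with a deliberately simplified sketch. It assumes only two fields of view that see the same cloud formation with different cloud fractions, plus an idealized, noise-free clear-sky estimate in a handful of channels; the operational method works on all nine fields of view and propagates errors, none of which is shown here. All names and numbers below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels = 50

# Synthetic "truth" (arbitrary radiance units): clear and fully cloudy spectra.
r_clear = rng.uniform(60.0, 100.0, n_channels)
r_cloud = r_clear - rng.uniform(5.0, 40.0, n_channels)

def observe(cloud_fraction):
    """Radiance of a field of view partly covered by one cloud formation."""
    return (1.0 - cloud_fraction) * r_clear + cloud_fraction * r_cloud

# Two adjacent fields of view; the cloud fractions are NOT used by the retrieval.
r1 = observe(0.2)
r2 = observe(0.6)

# Assumption: an independent clear-sky estimate exists for a few channels.
idx = np.arange(0, n_channels, 5)
clear_estimate = r_clear[idx]  # idealized, noise-free

# Least-squares fit of the extrapolation parameter eta in those channels.
contrast = r1[idx] - r2[idx]
eta = np.sum((clear_estimate - r1[idx]) * contrast) / np.sum(contrast**2)

# Linear extrapolation to the cloud-cleared radiance in ALL channels.
r_cloud_cleared = r1 + eta * (r1 - r2)
```

In this idealized case eta equals 0.2 / (0.6 - 0.2) = 0.5 and the extrapolation recovers the clear spectrum exactly, without any radiative transfer through clouds. The method relies on radiance contrast between fields of view, so it has nothing to extrapolate from when all of them are equally cloudy.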
Second, CLIMCAPS uses a step-wise, sequential approach that employs both microwave and infrared measurements as depicted in Figure 3. A simultaneous retrieval system would have to define each retrieval vector as being made up of nine profile variables (temperature, water vapor, CO, CH₄, O₃, SO₂, N₂O, HNO₃, and CO₂) combined with spectral emissivity and a host of surface and cloud variables. This can impose enormous time constraints when using optimal estimation (OE; Rodgers, 2000), which requires iterative radiative transfer calculations at every retrieval footprint with multiple two-dimensional matrix inversions. The error covariance matrices in simultaneous OE retrievals are very large. A sequential retrieval, on the other hand, presents a number of advantages. These are the (a) use of shorter state vectors (i.e., only one profile/spectral vector at a time) and smaller, simpler error covariance matrices that do not depend on knowledge of complex cross-correlations, (b) ability to carefully propagate errors introduced by interfering signals (Smith & Barnet, 2019) from one step to the next, and (c) selection of smaller subsets of spectral channels to maximize the signal-to-noise ratio for each target variable.
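A rough count makes the size argument concrete. The sketch below assumes, purely for illustration, that each of the nine profile variables is represented on 100 levels and ignores the surface and cloud variables, so the simultaneous case is a lower bound. The cubic cost of a dense matrix inversion is a generic rule of thumb, not a measured CLIMCAPS timing.

```python
N_PROFILES = 9   # temperature, water vapor, and seven trace gases
N_LEVELS = 100   # assumed number of levels per profile

# Simultaneous retrieval: one long state vector, one large covariance matrix.
simultaneous_dim = N_PROFILES * N_LEVELS
simultaneous_cov_entries = simultaneous_dim ** 2

# Sequential retrieval: one profile at a time, nine small covariance matrices.
sequential_dim = N_LEVELS
sequential_cov_entries = N_PROFILES * sequential_dim ** 2

# Dense inversion cost scales roughly with the cube of the matrix dimension.
inversion_cost_ratio = simultaneous_dim ** 3 / (N_PROFILES * sequential_dim ** 3)
```

Under these assumptions the simultaneous covariance matrix holds 810,000 entries against 90,000 for the nine sequential ones, and each inversion pass is roughly 81 times more expensive.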
Third, CLIMCAPS projects state variables onto vertical functions colloquially known as "trapezoids" (Maddy & Barnet, 2008; Smith & Barnet, 2020). These trapezoidal functions more closely approximate the actual vertical information content of the instrument and thus stabilize the retrieval of state variables even in highly unstable atmospheric conditions. As mentioned earlier, the retrieval problem is under-determined in that we wish to retrieve more information than is present in the measurements. Modern-era radiative transfer models such as SARTA simulate TOA hyperspectral IR radiances using state parameters defined on 100 pressure levels, but the IR measurements do not contain 100 pieces of information for each retrieval variable. For this reason, OE methods rely on a-priori estimates to fill the null space (i.e., areas where the measurements lack information). We employ trapezoids in CLIMCAPS OE to further stabilize the retrieval by reducing the dimensionality of state variables (instead of 100 values for temperature, we use, say, 31 coarser layers as defined by the trapezoidal shape) while simultaneously allowing SARTA forward calculations on the 100-level vertical grid. Projecting state variables onto lower-order vertical functions has the additional benefits of improved execution time for Jacobians and acting as a smoothing constraint that reduces an otherwise strong dependence on a-priori estimates.
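The idea of projecting a 100-level profile onto a smaller set of overlapping trapezoids can be sketched as below. The basis construction here is an assumption made for illustration (evenly spaced flat tops joined by linear ramps on the level index); the actual CLIMCAPS trapezoid definitions are given in the cited papers and are not reproduced. The function name and the synthetic profile are hypothetical.

```python
import numpy as np

def trapezoid_basis(n_levels, n_functions):
    """Overlapping trapezoids on the level index that sum to 1 at every level."""
    levels = np.arange(n_levels, dtype=float)
    # Two knots per function mark the start and end of its flat top.
    knots = np.linspace(0.0, n_levels - 1.0, 2 * n_functions)
    basis = np.empty((n_levels, n_functions))
    for j in range(n_functions):
        values = np.zeros(knots.size)
        values[2 * j:2 * j + 2] = 1.0  # flat top; linear ramps to the neighbors
        basis[:, j] = np.interp(levels, knots, values)
    return basis

# 100 fine levels represented by 31 trapezoid coefficients.
F = trapezoid_basis(100, 31)

# Synthetic smooth temperature-like profile on the fine grid (K).
fine_profile = 220.0 + 70.0 * np.linspace(0.0, 1.0, 100) ** 2

# Project onto the coarse functions, then map back to the fine grid.
coarse, *_ = np.linalg.lstsq(F, fine_profile, rcond=None)
smoothed = F @ coarse
```

The retrieval would then solve for the 31 coefficients in `coarse`, while the forward model still receives the 100-level profile `F @ coarse`. Because the functions sum to one at every level, a uniform offset in the coefficients maps to the same uniform offset on the fine grid.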
Lastly, CLIMCAPS uses singular value decomposition (SVD) of the measurement information content to derive a scene-dependent regularization parameter. SVD allows the separation of spectral information from noise at every retrieval footprint (Smith & Barnet, 2020) and helps account for the variability in measurement information content as conditions change from scene to scene (e.g., due to cloud cover, surface type, temperature lapse rate, or amount of boundary layer water vapor). By using dynamic SVD-based regularization, instead of static regularization based on a climatological background error covariance matrix, CLIMCAPS moves away from an a-priori estimate only where the measurement contains enough information. This means that the CLIMCAPS product can be stable even where measurement information content is low.
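A generic way to see how an SVD separates signal from noise is the truncated update sketched below. It is a simplified stand-in, not the CLIMCAPS regularization scheme (see Smith & Barnet, 2020, for that): directions of the noise-scaled Jacobian whose singular value exceeds a threshold are updated from the measurement, and all others stay at the a-priori. The function name, the threshold of 1.0, and the synthetic scene are assumptions.

```python
import numpy as np

def svd_regularized_update(jacobian, noise_std, residual, threshold=1.0):
    """Return (state update, number of retained directions)."""
    # Scale each channel by its noise so singular values act as signal-to-noise.
    whitened_jacobian = jacobian / noise_std[:, None]
    whitened_residual = residual / noise_std
    u, s, vt = np.linalg.svd(whitened_jacobian, full_matrices=False)
    keep = s > threshold  # scene-dependent: depends on this footprint's Jacobian
    coefficients = (u.T @ whitened_residual)[keep] / s[keep]
    update = vt[keep].T @ coefficients  # zero along discarded directions
    return update, int(keep.sum())

# Synthetic scene with three well-measured and three poorly measured directions.
rng = np.random.default_rng(1)
n_channels, n_state = 40, 6
u0, _ = np.linalg.qr(rng.normal(size=(n_channels, n_state)))
v0, _ = np.linalg.qr(rng.normal(size=(n_state, n_state)))
singular_values = np.array([50.0, 20.0, 5.0, 0.5, 0.1, 0.01])
jacobian = u0 @ np.diag(singular_values) @ v0.T
noise_std = np.ones(n_channels)

true_departure = rng.normal(size=n_state)  # true state minus a-priori
residual = jacobian @ true_departure       # noise-free observed minus calculated

update, n_kept = svd_regularized_update(jacobian, noise_std, residual)
```

Here three directions are retained, and the update equals the true departure projected onto them; along the remaining directions the solution does not move from the a-priori, which is the behavior the paragraph above describes for scenes with low information content.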
CLIMCAPS has novel components not present in later versions of the AIRS retrieval product (Susskind et al., 2014) that help us meet requirements for a consistent record across multiple instruments and satellites. In brief, these are (a) a state-of-the-art reanalysis model as a-priori for the primary thermodynamic variables, (b) propagation of two-dimensional a-priori error covariance matrices through all steps (Figure 3), and (c) the availability of two-dimensional averaging kernel matrices for every profile retrieval at every scene. See Smith & Barnet (2019) for more details. It is worth highlighting that CLIMCAPS has the ability to functionally emulate the real-time NOAA Unique Combined Atmospheric Processing System (NUCAPS, STAR NUCAPS Team, 2021). This is possible because NUCAPS and CLIMCAPS share some of the same AIRS Science Team retrieval components and differ primarily in the applications they support. NUCAPS is model-independent and supports real-time monitoring with a linear regression a-priori state that does not require runtime collocation with other datasets. In contrast, CLIMCAPS uses a reanalysis model as a-priori to minimize instrument dependence and has enhanced two-dimensional error propagation to support continuity across Aqua, S-NPP, and NOAA-20/21 for climate science and applications. CLIMCAPS has built-in capability to generate (and upgrade) the look-up-tables necessary to run a real-time instance of NUCAPS for scientific purposes (e.g., field campaigns). This means CLIMCAPS can be used to develop, test and provide scientific recommendations to NOAA for future operational upgrades. A companion paper in this special issue (Berndt et al., 2023) outlines how we were able to support NOAA weather forecasting by running an off-line instance of NUCAPS (using the CLIMCAPS system) on AIRS Level 1B products received in real-time via direct-broadcast antennas. 
Paired with operational NUCAPS soundings from NOAA-20, we were able to demonstrate the value of having multiple sounding observations in quick succession (hyperspectral IR sounders have yet to make it onto a geostationary platform). The ability to run NUCAPS and CLIMCAPS using the same code base has the added advantage of diagnosing and validating retrieval information content and sensitivity to a-priori assumptions.

Table 1 lists the complete set of variables CLIMCAPS retrieves from every cloud cleared radiance spectrum. These variables characterize the three-dimensional atmospheric state with a spatial resolution of ∼50 km at nadir (∼150 km at the edge of the scan). Over and above the range of values that can be derived to support scientific studies and applications, such as relative humidity and stratospheric intrusions (Berndt & Folmer, 2018), the CLIMCAPS retrieval set can be used to calculate outgoing longwave radiation for a better understanding of large-scale processes (Moy et al., 2010; Peterson et al., 2019; Susskind et al., 2012).

Geophysical and Diagnostic Data Products
For each retrieved variable in Table 1, CLIMCAPS quantifies a range of error and uncertainty metrics (see Data Availability Statement Section). Given the nature of sounding observations and the degree to which retrieval methods can vary, such metrics are important for data interpretation and inter-comparisons. Figure 4a shows a global map of CLIMCAPS mid-tropospheric CO retrievals averaged across all ascending orbits in April 2018. Unlike in situ observations of boundary layer air quality, CLIMCAPS retrieves CO in the free troposphere, which quantifies long-range transport of polluted air from sources such as megafires or industrial activity. Figures 4b-4h depict the metrics one can use to better understand the quality of CLIMCAPS retrieved CO transport signals. Details are in the caption and also in Smith et al. (2021).

Conclusions and Future Work
CLIMCAPS is built on decades of NASA investment in sounding instruments and retrieval methodologies. Our design philosophy for CLIMCAPS reflects this awareness, and we continue to make a conscious effort to acknowledge and encourage community interaction through research support, collaboration, and targeted product improvements. We think of CLIMCAPS as a system built by the NASA community for the global research community. Our goal with CLIMCAPS is to maintain this critical NASA capability and improve its sounding capability where necessary to continue serving the growing user community in an age where we need to better understand, monitor, and manage environmental change at all levels of society. An example of our commitment to this idea is the CLIMCAPS Science Application Guides that we developed for CLIMCAPS V2 in response to the need for practical, easy-to-understand documentation about algorithm components (such as channel selection, averaging kernels, a-posteriori error or trapezoid state functions) and examples of data applications with methods documented as open-source code. We expect new applications to continue to emerge as the sounding research community grows and evolves. For example, CLIMCAPS water vapor retrievals can help improve forecast models in simulating (or assimilating) extreme events as well as diurnal heat and convection (Elsaesser et al., 2019). CLIMCAPS could also improve atmospheric river forecasting, specifically in predicting landfall and intensity (Reynolds et al., 2019). There is also potential to use CLIMCAPS in drought research (Hobbins et al., 2016). Finally, there are many potential applications for the trace gas products: CO₂, CH₄, and N₂O are all important in monitoring greenhouse gas emissions; ozone and HNO₃ are important in understanding polar dynamics and chemistry; and CO is used in detecting smoldering combustion and characterizing long-range wildfire and urban pollution. 
We remain committed to working with the research community in evaluating CLIMCAPS products and, where necessary, identifying improvements or needs for tailored products, especially as data services like the NASA Goddard Earth Sciences (GES) Data and Information Services Center (DISC) transition to a "data cloud" infrastructure. In our experience, it takes clear documentation, rigorous demonstrations, and an open science approach to cultivate a user community with the knowledge and confidence to correctly and effectively use satellite sounder products in their applications and research.

Data Availability Statement
The CLIMCAPS Level 2 Version 2 continuity product described in this paper is available at the NASA GES DISC. It uses a combined IR + MW retrieval approach across three distinct periods from instrument pairs on multiple satellite platforms, as summarized below.
• 2002/09/01-2016/08/31: AIRS + AMSU (Barnet, 2019a, 2019b)
• 2016/09/01-2018/01/30: CrIS + ATMS on S-NPP (Sounder SIPS & Barnet, 2020c, 2020d)
• 2018/02/01-present: CrIS + ATMS on NOAA-20 (Sounder SIPS & Barnet, 2020a, 2020b)

CLIMCAPS Level 2 products are additionally available for an AIRS-only (Barnet, 2019c, 2019d) as well as a CrIS nominal spectral resolution configuration (Sounder SIPS & Barnet, 2019a, 2019b). While we will not regenerate these products when we upgrade to Version 3 in 2024, we do maintain the ability to run CLIMCAPS in IR-only mode or for instruments with different spectral configurations when needed. All error and uncertainty metrics are reported in the Level 2 product. Specifically, the "aux" subgroup of the Level 2 netCDF file contains the output from each threshold test that CLIMCAPS performs to determine whether the final retrieved state should be considered "failed" or "successful." We explain and demonstrate how to use these metrics in the CLIMCAPS Science Application Guides. Of note are the following quantities that can be employed to refine data filters for target applications.
• The random and systematic error uncertainty introduced by the cloud clearing retrieval is quantified as "etarej," "ampl_eta," and "aeff_end."
• Each atmospheric retrieval (Table 1) is associated with the following metrics:
  1) An error profile, which is the square root of the diagonal vector from the optimal estimation a-posteriori covariance matrix as indicated by the suffix "_err." This error profile quantifies the degree to which CLIMCAPS reduced the a-priori error for the target variable.
  2) The degrees of freedom (DOF) as indicated by the suffix "_dof," which summarizes the signal-to-noise that CLIMCAPS achieved for the target variable given the measurement information content and all known sources of error and uncertainty.
  3) The chi-square error estimate as indicated by the suffix "_chi2." This metric quantifies the difference between the observed (Level 1B measurement) and expected (top of atmosphere radiance simulated using the retrieved state) for each channel set used in retrieving the target variable.
  4) Two-dimensional averaging kernel matrices at each retrieval footprint (field of regard) as identified for each atmospheric variable in the subgroup "ave_kern."
• The retrieved cloud top pressure and cloud fraction can aid analyses with information on scene-specific cloudiness.
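How the diagnostics listed above relate to one another can be sketched with the textbook linear optimal estimation formulas (Rodgers, 2000). This is not the CLIMCAPS implementation and does not read a product file: the Jacobian, noise, and a-priori covariance are synthetic, the function names are hypothetical, and the per-channel averaging used in the chi-square helper is an assumed normalization.

```python
import numpy as np

def oe_diagnostics(jacobian, noise_cov, prior_cov):
    """Averaging kernel, error profile, and degrees of freedom for a linear OE step."""
    noise_inv = np.linalg.inv(noise_cov)
    information = jacobian.T @ noise_inv @ jacobian
    posterior_cov = np.linalg.inv(information + np.linalg.inv(prior_cov))
    averaging_kernel = posterior_cov @ information
    # Error profile: square root of the a-posteriori covariance diagonal.
    error_profile = np.sqrt(np.diag(posterior_cov))
    # Degrees of freedom for signal: trace of the averaging kernel matrix.
    dof = np.trace(averaging_kernel)
    return averaging_kernel, error_profile, dof

def chi_square(observed, simulated, noise_std):
    """Noise-normalized mean squared misfit over a channel set (assumed form)."""
    return np.mean(((observed - simulated) / noise_std) ** 2)

# Synthetic example: 30 channels, 8 state elements.
rng = np.random.default_rng(2)
n_channels, n_state = 30, 8
jacobian = rng.normal(size=(n_channels, n_state))
noise_cov = 0.5**2 * np.eye(n_channels)
prior_cov = 2.0**2 * np.eye(n_state)

averaging_kernel, error_profile, dof = oe_diagnostics(jacobian, noise_cov, prior_cov)
```

In this framework the retrieved error never exceeds the a-priori error (2.0 here), and the degrees of freedom lie between zero (all a-priori) and the number of state elements (all measurement), which is why these two quantities are useful as data filters.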