Fast retrieval of XCO2 over east Asia based on Orbiting Carbon Observatory-2 (OCO-2) spectral measurements

The increase in greenhouse gas concentrations, particularly CO2, has significant implications for global climate patterns and various aspects of human life. Spaceborne remote sensing satellites play a crucial role in high-resolution monitoring of atmospheric CO2. However, the next generation of greenhouse gas monitoring satellites is expected to face challenges, particularly in terms of computational efficiency in atmospheric CO2 retrieval and analysis. To address these challenges, this study focuses on improving the speed of retrieving the column-averaged dry-air mole fraction of carbon dioxide (XCO2) using spectral data from the Orbiting Carbon Observatory-2 (OCO-2) satellite while still maintaining retrieval accuracy. A novel approach based on neural network (NN) models is proposed to tackle the non-linear inversion problems associated with XCO2 retrievals. The study employs a data-driven supervised learning method and explores two distinct training strategies. Firstly, training is conducted using experimental data obtained from the inversion of the operational optimization model, which is released as the OCO-2 satellite products. Secondly, training is performed using a simulated dataset generated by an accurate forward calculation model. The inversion performance and prediction performance of the machine learning model for XCO2 are compared, analyzed, and discussed for the observed region over east Asia. The results demonstrate that the model trained on simulated data accurately predicts XCO2 in the target area. Furthermore, when compared to OCO-2 satellite product data, the developed XCO2 retrieval model not only achieves rapid predictions (< 1 ms) with good accuracy (1.8 ppm or approximately 0.45 %) but also effectively captures sudden increases in XCO2 plumes near industrial emission sources.
The accuracy of the machine learning model retrieval results is validated against reliable data from Total Carbon Column Observing Network (TCCON) sites, demonstrating its ability to effectively capture CO2 seasonal variations and annual growth trends.


Introduction
Since the Industrial Revolution, human activities have released large amounts of greenhouse gases, primarily carbon dioxide, into the atmosphere. This continual increase in emissions has led to global warming and has disrupted human societies and ecosystems (Zehr, 2015). Accurately estimating atmospheric carbon fluxes is critical for implementing effective emission reduction strategies at national and regional levels. However, precise carbon flux estimates require assimilating carbon dioxide concentration data across regions using measurements of the atmospheric column-averaged dry-air mole fraction of carbon dioxide (XCO2; Jin et al., 2021).

Direct measurement methods like balloons or aircraft have difficulty obtaining global-scale data. Currently, the main monitoring approach uses spectrometers to record spectra in CO2 absorption bands, followed by inversion algorithms to derive XCO2. The two primary monitoring methods are ground-based monitoring stations and satellite remote sensing.

Published by Copernicus Publications on behalf of the European Geosciences Union.
F. Xie et al.: Fast retrieval of XCO2 over east Asia based on OCO-2 spectral measurements

The Total Carbon Column Observing Network (TCCON) provides ground-based monitoring of atmospheric carbon dioxide through a global network of high-precision Fourier transform spectrometers (Wunch et al., 2011, 2015). However, TCCON sites are sparsely distributed and cannot be deployed in regions with unfavorable geography or harsh climates. Consequently, the network lacks the extensive spatial coverage required for comprehensive global carbon monitoring and carbon cycle analysis. Nevertheless, the ultrahigh spectral resolution of TCCON spectrometers enables highly accurate retrievals of XCO2. Under clear-sky conditions, TCCON precision can reach 0.1 % (< 0.4 ppm). Under relatively clear conditions with minimal clouds and aerosols, precision remains within 0.25 % (< 1 ppm; Messerschmidt et al., 2011). Due to such high precision and accuracy, TCCON data are invaluable for validating satellite-based XCO2 products (Cogan et al., 2012; Wunch et al., 2017; Liang et al., 2017) and comparing them to carbon cycle models. However, the spatial limitations of the network underscore the need for satellite remote sensing to provide regular global measurements of atmospheric carbon dioxide.
High-spectral-resolution greenhouse gas monitoring satellites employ spectrometers in orbit to measure solar radiation spectra after interaction with the Earth's atmosphere and ground surface (Meng et al., 2022). Unlike ground monitoring, satellite remote sensing offers broader spatial coverage and more flexible temporal observation globally. Consequently, satellite remote sensing has become vital for greenhouse gas monitoring worldwide. Notable ongoing passive CO2 observation missions include China's TanSat (Liu et al., 2018), Japan's Greenhouse Gases Observing Satellite (GOSAT, 2009) and GOSAT-2 (2018) (Hamazaki et al., 2005; Kuze et al., 2009; Imasu et al., 2023), and the United States' OCO-2 (2014) and OCO-3 (2018; Crisp et al., 2017; Eldering et al., 2019). Upcoming missions are France's MicroCarb by CNES (Cansot et al., 2023), ESA's CO2M (Sierk et al., 2021), and GOSAT-GW (Matsunaga and Tanimoto, 2022). The next-generation greenhouse gas monitoring satellites mainly address the challenge of improving the spatial and temporal resolutions of observations. However, single satellites still have resolution, coverage, and meteorological limitations for regional emission monitoring. Enhancing satellite sensor performance alone cannot produce datasets sufficient for monitoring carbon sources and sinks. Improving the accuracy and efficiency of satellite data inversion is also crucial. Integrating data from multiple satellites into a coordinated system is necessary to fully capture dynamic changes in regional carbon sources and sinks. Developing new high-precision, high-throughput inversion methods to efficiently derive accurate greenhouse gas concentration distributions from satellite data is a key challenge needing attention.
The mainstream inversion algorithms (O'Dell et al., 2012; Crisp et al., 2012; Yoshida et al., 2013) for retrieving greenhouse gas concentrations from high-spectral-resolution satellite measurements are based on nonlinear Bayesian optimization theory (Rodgers, 2000) and full physics models. In essence, these algorithms operate by iteratively adjusting estimated gas concentration profiles and other atmospheric and surface parameters in a radiative forward model to minimize the mismatch between simulated and observed spectra. More specifically, the inversion process starts with the a priori atmospheric state, including trace gas concentration profiles as functions of pressure/altitude. Radiative-transfer equations are then solved to simulate the top-of-atmosphere radiance spectrum observed by the satellite for this atmospheric state. The simulated spectrum is compared to the actual observed spectrum, calculating the difference, covariance, and cost function. The input gas profiles and model parameters are iteratively adjusted to reduce the cost function over multiple rounds of radiative-transfer simulations. Once simulated spectra closely match observations, the model state is output as the retrieved concentration profile. However, executing these complex optimizations requires computationally expensive interpolation of high-spectral-resolution gas absorption reference data and solving the radiative-transfer equations in each iteration. Running the radiative forward model repeatedly for every adjusted atmospheric state also leads to slow overall inversion. Consequently, optimization-based retrievals struggle to match increasing satellite observation volumes and throughput needs. This inherent inefficiency has become a major obstacle to operational greenhouse gas monitoring using current and planned high-resolution spectrometers. While rigorous, standard nonlinear optimization retrievals lack the speed and scalability required for high-precision real-time or near-real-time satellite-based greenhouse gas mapping. Overcoming this bottleneck necessitates new inversion approaches that can ingest high-resolution spectral data and retrieve concentrations with both accuracy and computational efficiency.
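The iterative scheme described above can be illustrated with a deliberately small sketch. It assumes a scalar state, a toy Beer-Lambert forward model, and Gaussian prior and noise statistics; the function names, absorption coefficients, and uncertainties are hypothetical and are not taken from any operational retrieval code. The point is only that each Gauss-Newton step needs a fresh forward-model run, which is the costly part in a full-physics retrieval.

```python
import math

def forward(x, k):
    """Toy forward model: channel transmittances exp(-k_i * x)."""
    return [math.exp(-ki * x) for ki in k]

def jacobian(x, k):
    """Analytic derivative dF_i/dx of the toy forward model."""
    return [-ki * math.exp(-ki * x) for ki in k]

def retrieve(y, k, x_a, sigma_a, sigma_y, max_iter=20, tol=1e-8):
    """Gauss-Newton iteration for a scalar state with a Gaussian prior.

    Returns the retrieved state and the number of forward-model runs.
    """
    x = x_a                      # start from the a priori state
    n_forward = 0
    for _ in range(max_iter):
        f = forward(x, k)        # the expensive step in a real retrieval
        K = jacobian(x, k)
        n_forward += 1
        # Scalar form of the optimal-estimation update (prior + measurement terms)
        num = sum(Ki * (yi - fi + Ki * (x - x_a))
                  for Ki, yi, fi in zip(K, y, f)) / sigma_y ** 2
        den = sum(Ki * Ki for Ki in K) / sigma_y ** 2 + 1.0 / sigma_a ** 2
        x_new = x_a + num / den
        converged = abs(x_new - x) < tol
        x = x_new
        if converged:
            break
    return x, n_forward

# Hypothetical channel absorption coefficients and a noise-free synthetic observation
k = [0.1, 0.5, 1.0, 2.0]
y_obs = forward(1.2, k)
x_hat, n_runs = retrieve(y_obs, k, x_a=1.0, sigma_a=0.5, sigma_y=0.01)
print(f"retrieved state = {x_hat:.4f} after {n_runs} forward-model runs")
```

In an operational setting the state is a vector and every forward run involves a full radiative-transfer calculation, so the per-sounding cost scales with the number of iterations.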
In recent years, machine learning has demonstrated exceptional performance across various research fields, with the discovery of potential nonlinear relationships between data as one of its fundamental and crucial applications. Regarding the important applications of carbon dioxide (CO2) retrieval, Carvalho et al. (2010) attempted to retrieve the vertical CO2 profiles using spectral data from SCIAMACHY's 6 channels (1000-1700 nm). The overall precision and bias of the retrieved results were estimated to be approximately 1.0 % and less than 3.0 %, respectively. Gribanov et al. (2010) developed a two-hidden-layer multilayer perceptron (MLP) model to retrieve CO2 vertical concentrations from reflected solar radiation measured by the GOSAT Thermal and Near infrared Sensor for carbon Observation - Fourier Transform Spectrometer (TANSO-FTS) sensor, achieving an inversion accuracy better than 1 ppm for CO2 column-averaged values and better than 4 ppm for surface CO2 concentrations for the test samples. In the study conducted by Zhao et al. (2022), a two-step machine learning approach was developed for retrieving atmospheric XCO2 using spectral data from the GOSAT weak-CO2 band. They established a direct one-dimensional line-by-line forward model to simulate GOSAT's observed spectra within the 6180-6280 cm⁻¹ spectral interval, forming the foundation for training their machine learning model. The retrieval model operates by initially obtaining the atmospheric spectral optical thickness, followed by extracting XCO2 from these optical thickness spectra. As a proof of concept, the method was tested in Australia under clear-sky conditions using GOSAT's spectra, demonstrating an accuracy of approximately 3 ppm for XCO2 retrieval. The study also discussed potential enhancements to further refine the accuracy of this retrieval method. Keely et al. (2023) employed the machine learning method of Extreme Gradient Boosting (Chen and Guestrin, 2016) to develop a nonlinear bias correction approach for the OCO-2 version 10 product, significantly reducing systematic errors in CO2 measurements and improving data quality, with an increase in sounding throughput by 14 %. David et al. (2021) and Bréon et al. (2022) attempted to establish correlations between XCO2 in the European Centre for Medium-Range Weather Forecasts CAMS (Copernicus Atmosphere Monitoring Service) database and OCO-2 satellite monitoring spectra using multilayer perceptron artificial neural network models. However, their recent research (Bacour et al., 2023) indicates that when the test dataset extends beyond the time range covered by the training dataset, the predicted results show a slight bias, approximately 2.5 ppm yr⁻¹. Practical deployment of machine learning techniques for remote sensing demands additional research into the generalization performance of models on new observational data distributions beyond those encountered during training.
In the present paper, a proof-of-concept study demonstrates a novel machine learning strategy to accurately and efficiently retrieve atmospheric XCO2 values from OCO-2 satellite spectral measurements. The model rapidly retrieves XCO2 directly from OCO-2 spectral data, eliminating the need for repetitive radiative-transfer simulations required by traditional nonlinear optimization retrieval algorithms. Additionally, the model enables the prediction of future XCO2 values. The method was validated by comparing the retrieved XCO2 against OCO-2 satellite version 10r products and ground-based TCCON measurements, confirming the accuracy of our efficient spectral-inversion approach. The model also successfully demonstrated its ability to detect local plume features, indicating its potential utility in monitoring and analyzing specific emission sources. A major innovation in the present study is using accurate radiative-transfer simulations to generate the training data rather than relying solely on experimental data products. This simulation-based training approach could help overcome limitations in existing experimental data. Additionally, our neural network model achieves XCO2 retrieval speeds orders of magnitude faster than traditional methods, reducing computation time from multiple seconds to less than 1 ms. This dramatic improvement in retrieval efficiency could enable real-time processing of the massive data volumes expected from next-generation greenhouse gas monitoring satellites. Importantly, our model achieves a precision better than 1.8 ppm, competitive with the current state of the art in retrieval accuracy. We also demonstrate the ability to accurately capture temporal variations and trends in XCO2 by validating against reliable TCCON ground-based data. This verifiable performance offers an effective solution for rapid inversion of large-scale high-spectral-resolution remote sensing data in the future.
2 The machine-learning-based XCO2 retrieval model

Targeted area and data screening
This proof-of-concept study aims to develop and validate an accurate and efficient machine-learning-based XCO2 retrieval model applied to the long OCO-2 time series for the east Asian region. Currently, similar global XCO2 retrieval models rely on computationally intensive physical models. Our goal is to demonstrate a more efficient data-driven approach using MLP neural networks.
Before developing the machine-learning-based fast-retrieval model, we implemented several preprocessing steps on the OCO-2 observational dataset (OCO-2 Science Team et al., 2020a) for the target east Asian area spanning between 20-45°N and 110-145°E, as shown in Fig. 1. Specifically, we filtered the data both spatially and temporally to focus only on observations within the geographic region and time period of interest (2016-2021). Additionally, we filtered the data to only include nadir mode observations marked as "good" based on the quality flag indicator ("xco2_quality_flag" = 0 in OCO-2 Lite v10r files; OCO-2 Science Team et al., 2020b), as these represent the highest quality OCO-2 measurements.
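The screening steps above amount to a simple per-sounding predicate. The following is a minimal sketch, assuming each sounding has already been read into a dictionary; apart from "xco2_quality_flag", the key names ("latitude", "longitude", "year", "operation_mode") and the nadir code of 0 are illustrative assumptions rather than a documented file layout.

```python
LAT_RANGE = (20.0, 45.0)     # degrees north, from the text
LON_RANGE = (110.0, 145.0)   # degrees east, from the text
YEAR_RANGE = (2016, 2021)    # period of interest, from the text
NADIR = 0                    # assumed code for nadir-mode soundings

def keep_sounding(s):
    """Return True if a sounding passes the spatial, temporal, and quality screens."""
    in_region = (LAT_RANGE[0] <= s["latitude"] <= LAT_RANGE[1]
                 and LON_RANGE[0] <= s["longitude"] <= LON_RANGE[1])
    in_period = YEAR_RANGE[0] <= s["year"] <= YEAR_RANGE[1]
    good_quality = s["xco2_quality_flag"] == 0
    nadir_mode = s["operation_mode"] == NADIR
    return in_region and in_period and good_quality and nadir_mode

# Hypothetical soundings used only to exercise the filter
soundings = [
    {"latitude": 31.9, "longitude": 117.2, "year": 2018,
     "xco2_quality_flag": 0, "operation_mode": 0},   # kept
    {"latitude": 31.9, "longitude": 117.2, "year": 2018,
     "xco2_quality_flag": 1, "operation_mode": 0},   # rejected: quality flag
    {"latitude": 10.0, "longitude": 117.2, "year": 2018,
     "xco2_quality_flag": 0, "operation_mode": 0},   # rejected: outside region
    {"latitude": 36.0, "longitude": 140.1, "year": 2019,
     "xco2_quality_flag": 0, "operation_mode": 1},   # rejected: not nadir
]
selected = [s for s in soundings if keep_sounding(s)]
print(len(selected), "of", len(soundings), "soundings kept")
```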
Several TCCON ground stations located in this region (e.g., Hefei, Saga, Tsukuba, Xianghe, Anmyeondo, and Rikubetsu), as shown in Fig. 1, provide valuable ground-truth XCO2 data for validating the MLP model predictions. If the model can accurately reproduce the TCCON observations from corresponding OCO-2 measurements, it suggests that the model has learned meaningful relationships between the satellite data and underlying CO2 concentrations.
Furthermore, the successful demonstration of accurate XCO2 retrieval over east Asia is a first step toward expanding this approach globally. The model could be retrained or supplemented with additional regional data to extend coverage. By combining reliable regional MLP models, global XCO2 maps could be retrieved. This "jigsaw puzzle" strategy would further validate the feasibility of global-scale machine-learning-based XCO2 retrievals from satellite observations.

https://doi.org/10.5194/amt-17-3949-2024 Atmos. Meas. Tech., 17, 3949-3967, 2024

The artificial neural network architecture
This study introduces a multilayer perceptron (MLP) neural network model for estimating XCO2 from OCO-2 satellite observations. Inspired by David et al. (2021) and Bréon et al. (2022), the MLP-XCO2 model input layer is designed based on the measurement principles of OCO-2 and atmospheric radiative-transfer effects on the observed spectra; the artificial neural network architecture is shown in Fig. 2. Specifically, the MLP model input layer consists of spectral information, surface pressure, the corresponding year, and geographical observation information, as summarized in Table 1 and explained below.

Spectral information
The OCO-2 satellite instrument measures high-resolution spectra in three spectral bands centered around 0.76, 1.6, and 2.0 µm, referred to as the O2-A, weak CO2 (WCO2), and strong CO2 (SCO2) bands, respectively (OCO-2 Science Team et al., 2019). However, only the WCO2 and SCO2 bands are used as inputs for current XCO2 retrievals. The O2-A band is excluded as it lacks significant information needed to directly estimate XCO2 based on radiative-transfer principles. Instead, the O2-A band is primarily used in OCO-2's operational full-physics algorithm for rapid cloud and aerosol screening prior to CO2 retrieval (O'Dell et al., 2012), effectively excluding observational cases that potentially lead to poor retrieval quality, thus saving substantial computational costs. Each OCO-2 spectral band is sampled by 1024 detector pixels. However, over time some detectors have degraded or become unstable in the space environment, resulting in pixels being flagged as "bad samples" in quality filters (Marchetti et al., 2019). To maximize high-quality training data, additional preprocessing is performed on the WCO2 and SCO2 bands. Initially, the beginning and ending spectral ranges corresponding to the most degraded detectors are removed. The remaining spectra are resampled into 525 and 755 wavelength points for the WCO2 and SCO2 bands, respectively (spectral points in wavelength are detailed in Table 2). To enhance the CO2 absorption line information, each input spectrum is normalized by dividing by the mean radiance within a nearby spectrally transparent window lacking absorption features (1.6056-1.6059 µm using 10 points for WCO2; 2.0602-2.0607 µm using 15 points for SCO2). Additionally, as shown in Fig. 3, we noticed that some isolated pixels within the main CO2 absorption bands still consistently exhibited poor radiance quality. To address this issue, a "bad sample filter" was implemented, which utilizes a binary record from the OCO-2 L1B database (0 indicates spectra derived from good-quality pixels, and 1 indicates pixels with defects or derived from poor-quality interpolations). The settings of this filter are determined solely by the historic records and the version of the bad-pixel map, ensuring refined data quality and consistency across different versions of the map. To further address bad samples resulting from natural degradation, we have implemented a dropout layer between the initial and the first intermediate MLP layer, thus enhancing the model generalizability with the remaining spectral inputs.
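The continuum normalization step can be sketched as follows. This is a minimal illustration, assuming the spectrum is available as parallel lists of wavelengths (µm) and radiances; the function name and the synthetic four-point spectrum are hypothetical, while the two window limits are the ones quoted in the text.

```python
WCO2_WINDOW = (1.6056, 1.6059)   # µm, transparent window for the weak CO2 band
SCO2_WINDOW = (2.0602, 2.0607)   # µm, transparent window for the strong CO2 band

def normalize_by_window(wavelengths, radiances, window):
    """Divide a spectrum by the mean radiance inside a transparent window."""
    lo, hi = window
    ref = [r for w, r in zip(wavelengths, radiances) if lo <= w <= hi]
    if not ref:
        raise ValueError("no spectral points fall inside the reference window")
    mean_ref = sum(ref) / len(ref)
    return [r / mean_ref for r in radiances]

# Synthetic example: two of the four points lie inside the WCO2 window
wl = [1.6050, 1.6057, 1.6058, 1.6100]
rad = [1.0, 4.0, 6.0, 2.5]
norm = normalize_by_window(wl, rad, WCO2_WINDOW)
print(norm)
```

After normalization, the window points average to one, so absorption features are expressed relative to the local continuum rather than in absolute radiance units.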

Geographical information
The model is designed to accept two key observation geometry angles that are determined by the relative positions of the Sun, the satellite, and the ground observation point. These include the solar zenith angle and relative azimuth angle. The solar zenith angle (SZA) features prominently as a cosine term in the radiative-transfer equation that defines the atmospheric radiative processes. Thus, SZA is pre-converted to its cosine form for model input. The relative azimuth angle is a comprehensive angle that jointly combines the solar azimuth angle and the satellite azimuth angle. It is important to emphasize that the satellite zenith angle is not used in this study. Our current research is based on the nadir mode of the OCO-2 satellite observation. In the nadir observing mode, the line of sight is assumed to be nearly perpendicular to the Earth's surface, so the satellite zenith angle theoretically approaches 0°.

Table 2. Wavelength spacing of the input spectra. Columns: Band; Spectral range (µm); Spectral points (µm).

Other parameters
In addition to the primary inputs, two other parameters play critical roles in the MLP-XCO2 model: the surface pressure and the corresponding year (2016, 2017, etc.). In traditional retrieval algorithms based on iterative optimization, accurate surface pressure and a reliable prior CO2 profile are crucial. The importance of this has been highlighted by the averaging kernel utilized in the OCO-2 retrieval algorithm (Nguyen et al., 2014), which indicates a higher sensitivity near the surface compared to the stratosphere. To prevent the retrieval of unrealistic CO2 profiles, the prior covariance matrix imposes significantly stricter constraints in the stratosphere than in the troposphere (O'Dell et al., 2012). In cases where the prior CO2 profile is inadequate, it can lead to poor results, with minimal or even opposite updates in the stratospheric CO2 profile during the inversion process (Iwasaki et al., 2019). Additionally, in order to achieve the best agreement between observed and estimated spectra, the retrieval process may inaccurately estimate tropospheric CO2 profiles. To tackle these challenges, our investigation suggests that incorporating additional information such as the year can offer valuable context for XCO2 retrieval. This conservative approach provides a simple means to enhance prior CO2 information without directly specifying XCO2 prior values.
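Putting the inputs described in this section together, one possible way to assemble a single input vector is sketched below. The ordering of the features, the function name, the absence of any feature scaling, and the pressure units are assumptions made for illustration; only the 525 and 755 spectral points, the cosine of the SZA, the relative azimuth angle, the surface pressure, and the year come from the text.

```python
import math

N_WCO2, N_SCO2 = 525, 755   # resampled spectral points per band, from the text

def build_input_vector(wco2, sco2, surface_pressure, year, sza_deg, rel_azimuth_deg):
    """Concatenate spectra and auxiliary inputs into one flat feature vector."""
    if len(wco2) != N_WCO2 or len(sco2) != N_SCO2:
        raise ValueError("unexpected number of spectral points")
    # The solar zenith angle enters as its cosine; the azimuth is passed as given
    geometry = [math.cos(math.radians(sza_deg)), rel_azimuth_deg]
    return list(wco2) + list(sco2) + [surface_pressure, float(year)] + geometry

# Placeholder spectra of ones; real inputs would be the normalized radiances
x = build_input_vector([1.0] * N_WCO2, [1.0] * N_SCO2,
                       surface_pressure=1013.0, year=2018,
                       sza_deg=60.0, rel_azimuth_deg=120.0)
print(len(x))
```

Under these assumptions the input layer has 525 + 755 + 4 = 1284 entries.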
3 Satellite-product-data-based machine learning model

We first developed the MLP-XCO2 model using the OCO-2 v10r product dataset. The primary goal was to optimize the hyperparameters of the MLP-XCO2 network. On one hand, we aimed to confirm whether the "slow bias" shown in Bacour et al. (2023) is a universal issue across machine learning models with similar architecture. On the other hand, by fixing the hyperparameters of the MLP-XCO2 network structure, we sought to develop a comparable model using simulated data. In theory, MLP models using identical hyperparameters should possess the same fitting and generalization abilities. By first presenting results from a model trained solely on satellite product data, we can demonstrate the limitations of these satellite-data-based models. This then motivates the development of new machine learning strategies to overcome these limitations, as discussed in later sections.
Following the target areas and data screening methods discussed previously, observational data and lists of bad pixels were obtained from the OCO-2 v10r L1B database. Additionally, retrieved surface pressure and XCO2 data were obtained from the L2 std database. Specifically, we obtained data from March, June, September, and December spanning the years 2016 to 2020. This time frame was chosen to provide a comprehensive training and testing set for our analysis. In total, the dataset encompassed 194 150 samples collected over this 5-year period. The year-wise distribution of the samples is as follows: 38 626 samples from 2016, 39 850 from 2017, 35 945 from 2018, 36 452 from 2019, and 43 277 from 2020.
After completing the data collection, we proceeded to construct the MLP-XCO2 model. To balance model complexity and performance, the MLP-XCO2 architecture (Fig. 2) comprises five hidden layers, with 1000, 500, 300, 100, and 20 nodes, respectively. All hidden layers use Rectified Linear Unit (ReLU) activation functions. The output layer contains a single node to predict XCO2 values, with a linear activation function. Upon developing the MLP-XCO2 model architecture as described in this section, we independently trained two versions of the MLP-XCO2 model, each based on the aforementioned model structure but with different training and testing datasets.
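To make the architecture concrete, the sketch below implements a plain forward pass for an MLP with ReLU hidden layers and a linear output node, using only the standard library. It is an illustration rather than the authors' implementation: the input width of 1284 (525 + 755 spectral points plus four auxiliary inputs), the dropout rate argument, and all function names are assumptions, and no training loop is shown.

```python
import random

# Hidden-layer widths are from the text; the 1284-wide input is an assumption
LAYER_SIZES = (1284, 1000, 500, 300, 100, 20, 1)

def count_parameters(sizes):
    """Number of weights and biases in a fully connected network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

def relu(v):
    return [max(0.0, a) for a in v]

def dense(W, b, x):
    """Affine layer: W is a list of rows, one row per output node."""
    return [sum(w * a for w, a in zip(row, x)) + bias for row, bias in zip(W, b)]

def input_dropout(x, rate, rng):
    """Inverted dropout on the inputs; intended for training only."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in x]

def mlp_forward(params, x, drop_rate=0.0, rng=None):
    """Forward pass: optional input dropout, ReLU hidden layers, linear output."""
    if drop_rate > 0.0:
        x = input_dropout(x, drop_rate, rng or random.Random(0))
    for W, b in params[:-1]:
        x = relu(dense(W, b, x))
    W, b = params[-1]
    return dense(W, b, x)[0]      # single linear output node (XCO2)

# Tiny hand-built network (2 -> 2 -> 1) to show the calling convention
tiny = [([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
        ([[1.0, 1.0]], [0.5])]
print(mlp_forward(tiny, [2.0, -3.0]))
print(count_parameters(LAYER_SIZES))
```

With the assumed input width, the network has roughly 1.97 million trainable parameters, most of them in the first hidden layer, which is why a single forward pass is so much cheaper than repeated radiative-transfer runs.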

Historical data training
The first MLP-XCO2 model based on OCO-2 product data was trained using historical XCO2 data collected from 2016 to 2018. The test set for this model comprised product data from the years 2019 and 2020. This setup allowed us to assess the model's predictive performance using a straightforward historical data approach.

Skipped-year training
The second version of the model was trained using data from the years 2016, 2018, and 2020. The test set for this MLP-XCO2 model included the skipped years 2017 and 2019. This unique approach enabled a clearer and more direct comparison of the potential limitations of relying solely on historical data for future predictions.
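The two splitting strategies can be compared directly from the year-wise sample counts reported earlier. The short sketch below only performs that bookkeeping; the dictionary and function names are illustrative.

```python
# Year-wise sample counts as reported for the 2016-2020 product dataset
SAMPLES_PER_YEAR = {2016: 38626, 2017: 39850, 2018: 35945, 2019: 36452, 2020: 43277}

def split_sizes(train_years, test_years, counts=SAMPLES_PER_YEAR):
    """Return (n_train, n_test) for a split defined by calendar years."""
    n_train = sum(counts[y] for y in train_years)
    n_test = sum(counts[y] for y in test_years)
    return n_train, n_test

historical = split_sizes(train_years=(2016, 2017, 2018), test_years=(2019, 2020))
skipped = split_sizes(train_years=(2016, 2018, 2020), test_years=(2017, 2019))
print("historical split:", historical)
print("skipped-year split:", skipped)
```

Both strategies use all 194 150 samples; the difference is that the skipped-year test set lies inside the time range spanned by the training years, whereas the historical test set lies entirely beyond it.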
Figure 4 presents the results for the two trained MLP-XCO2 models on their respective 10 % out-of-sample testing datasets. Panel (a) illustrates the predictions of the historical data training model from the 2016-2018 data, and panel (b) shows similar predictions for the skipped-year training model. Both models achieve high accuracy on these testing datasets, with a root mean square error (RMSE) close to 1 ppm and an R-squared score (R²) larger than 0.9. These results demonstrate the robust interpolation capabilities of both models within their respective training periods, indicating their effectiveness in handling known observed scenarios.
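For reference, the two scores quoted above can be computed as in the following sketch, which uses the usual definitions of RMSE and the coefficient of determination. The three-sample example values are made up purely to show the calculation.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical XCO2 values in ppm
reference = [410.0, 412.0, 414.0]
predicted = [411.0, 412.0, 413.0]
print(f"RMSE = {rmse(reference, predicted):.3f} ppm, R2 = {r2_score(reference, predicted):.2f}")
```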
Figure 5 evaluates the generalization capabilities of each MLP-XCO2 model on testing sets comprising years not included in their respective training datasets. These test sets represent periods outside the range of years used for training. Here, we observed a noticeable positive bias solely in the predictions from the historical data training model. In contrast, the skipped-year training model did not exhibit this bias. Its performance remains highly accurate for these out-of-range points, further validating the model's robustness for XCO2 prediction within skipped years.
Globally, the average XCO2 in the atmosphere shows a stable annual increase, with an observed rise of approximately 2-3 ppm yr⁻¹. However, despite the inclusion of the corresponding year in the input layer as a high-correlation parameter, there remains a limitation in capturing the potential rising trend of the atmospheric CO2. This highlights the limitations of models trained solely on historical satellite data, motivating the development of new techniques to incorporate external information about temporal CO2 dynamics.

Simulation-data-based machine learning model
In the previous section, the MLP-XCO2 model showed excellent interpolation within the training data range but exhibited bias when predicting outside this period. To eliminate this bias, we propose using an accurate forward model to simulate representative training data that cover future atmospheric conditions. If we can pre-generate atmospheric profiles that capture possible future states, and simulate the corresponding spectral radiance using an accurate forward model, the MLP-XCO2 model can pre-learn future satellite observations. This could prevent incremental annual bias and enable accurate XCO2 prediction. The effectiveness of this approach depends on the forward model accuracy and representativeness of the simulated atmospheres (Zhao et al., 2022).
It is therefore critical to select an appropriate radiative-transfer forward model with proven reliability in simulating spectral radiance under varying atmospheric conditions. The model must precisely capture the relationship between trace gas concentrations, meteorological states, and the resulting spectral signatures. With accurate simulations, the machine learning model can generalize robustly to future atmospheric scenarios. The representative training data should span the expected range of atmospheric variability in XCO2 and interfering species like water vapor. A broad sampling of the state space is key for the model to learn a robust mapping to XCO2 across multiple atmospheric regimes. The following sections describe our approach for accurate spectral radiative-transfer simulations and possible (realistic) atmospheric profile generation.

Forward model
In this study, we developed a forward radiative-transfer calculation model using the ReFRACtor (Reusable Framework for Retrieval of Atmospheric Composition) software (McDuffie et al., 2020). ReFRACtor is an extensible framework for multi-instrument atmospheric radiative transfer and retrieval, originally derived from the operational OCO-2 retrieval program. Although ReFRACtor contains both radiative transfer and retrieval capabilities, we only used the radiative-transfer component. Specifically, we configured ReFRACtor to simulate top-of-atmosphere radiance spectra that would be observed by OCO-2. These simulated observations were then used to generate a large training dataset for our machine learning model, MLP-XCO2.
The OCO-2 satellite primarily observes the radiative spectra in the shortwave infrared (SWIR) band. Over the range of SWIR, the impact of thermal emission can be ignored when simulating the spectra (Crisp et al., 2021). To simulate OCO-2's observed spectra in the WCO2 band around 1.6 µm and the SCO2 band around 2.06 µm, the ReFRACtor model numerically solves the radiative-transfer equation (RTE; Modest and Mazumder, 2021), given as Eq. (1), in which I_η is the observed spectra, µ is the cosine of the observation zenith angle (i.e., µ = cos θ), τ is the vertical optical depth that can be column-integrated from the molecular absorption coefficients and optical path, φ is the azimuthal angle relative to the observation point for the satellite and the Sun, and J represents the scattering components and inhomogeneous source term describing both single-scattering and multiple-scattering contributions. The term J in the RTE is given by Eq. (2), in which ω is the single-scattering albedo, P is the scattering-phase function, µ′ and φ′ are the cosine and azimuth angle of the incident direction, µ0 is the cosine of the solar zenith angle, and I0 is the solar intensity at the top of the atmosphere. The ReFRACtor model uses a hybrid model to solve the RTE. Specifically, the radiative-transfer software Linearized Discrete Ordinate Radiative Transfer (LIDORT; Spurr, 2008) is applied for the scalar and Jacobian computation. Concurrently, a two-order scattering model (Natraj and Spurr, 2007) is utilized for the additional radiance correction. Within this framework, the ReFRACtor model comprehensively considers five types of scatter particles for each sounding: two types of clouds, two types of tropospheric aerosols, and one type of stratospheric aerosol. The single-scattering optical properties for each cloud and aerosol particle, including cross-section, single-scattering albedo, and scattering-phase matrix, have been pre-computed and tabulated for the forward calculations. Furthermore, the model determines surface reflectance as a quadratic spectral albedo for each band, which is derived from the bidirectional reflectance distribution function (BRDF).
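For orientation, the textbook plane-parallel form of the scalar RTE with a direct solar source, written with the symbols defined in the paragraph above, is sketched below. This is the standard form found in discrete-ordinate treatments and is offered as an assumption about, not a transcription of, Eqs. (1) and (2); the solar azimuth φ0 is a symbol introduced here, and τ is taken to increase downward from the top of the atmosphere.

```latex
% Assumed standard form of Eq. (1): transfer of diffuse radiance
\mu \frac{\partial I_\eta(\tau,\mu,\phi)}{\partial \tau}
  = I_\eta(\tau,\mu,\phi) - J(\tau,\mu,\phi)

% Assumed standard form of Eq. (2): multiple-scattering integral plus
% single scattering of the attenuated direct solar beam
J(\tau,\mu,\phi)
  = \frac{\omega}{4\pi} \int_{0}^{2\pi}\!\!\int_{-1}^{1}
      P(\mu,\phi;\mu',\phi')\, I_\eta(\tau,\mu',\phi')\,
      \mathrm{d}\mu'\, \mathrm{d}\phi'
  + \frac{\omega}{4\pi}\, P(\mu,\phi;-\mu_0,\phi_0)\, I_0\, e^{-\tau/\mu_0}
```

In this form the first term of J couples the radiance in every incident direction (µ′, φ′) into the viewing direction, while the second term carries the solar beam attenuated along its slant path, which is where the cosine of the solar zenith angle enters.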
An essential step in developing the forward calculation model is referencing the pre-computed lookup table of H2O and CO2 to obtain the required spectral absorption coefficients. In this study, the ABSCO v5.1 database (absorption coefficient table; Payne et al., 2020) was applied for this purpose. Additionally, we identified and corrected an overestimation of the solar continuum in ReFRACtor compared to the OCO-2 Level 2 algorithm (Crisp et al., 2021). Without this correction, there would have been an approximately 3 % overestimation in the 1.6 µm band and 6.5 % in the 2.06 µm band. By reducing the solar continuum, our forward model aligned better with the OCO-2 spectral measurements. These configurations of the absorption coefficients and solar continuum were essential to accurately simulate OCO-2 spectra for generating training data across a variety of observation conditions.
To assess the performance of the forward model, we selected four distinct global locations in the year 2017. The goal was to replicate the OCO-2-observed spectra for both the WCO2 1.6 µm absorption band and the SCO2 2.06 µm absorption band at the four locations. By accessing the OCO-2 L2 std database, we acquired atmospheric conditions and pertinent geographical data (including spectral albedo, surface pressure, and observation angles) specific to these chosen locations. The outcomes of our simulations for these four locations are visually depicted in Figs. 6 and 7, respectively, for the two bands, with accompanying residual plots displayed in the lower subpanels. The simulated results exhibit a high level of agreement with the observed OCO-2 spectra; the relative error remains under 1 %, underlining the robustness of the established forward model in replicating the satellite observations from OCO-2. As a result, this forward model serves as a reliable tool for the development of machine learning models trained using simulated spectral data.
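A check of this kind can be sketched in a few lines. The example below assumes the relative error is the simulated-minus-observed residual expressed as a percentage of the observed radiance, which is one plausible definition and may differ from the one used for the residual plots; the function names and sample values are hypothetical.

```python
def relative_residual_percent(observed, simulated):
    """Per-channel residual (simulated - observed) as a percentage of the observation."""
    return [100.0 * (s - o) / o for o, s in zip(observed, simulated)]

def within_tolerance(observed, simulated, tol_percent=1.0):
    """True if the largest absolute relative residual stays below the tolerance."""
    residuals = relative_residual_percent(observed, simulated)
    return max(abs(r) for r in residuals) < tol_percent

# Hypothetical normalized radiances for three channels
obs = [1.000, 0.800, 0.500]
sim_good = [1.005, 0.796, 0.502]   # residuals of about 0.5 %, -0.5 %, 0.4 %
sim_poor = [1.005, 0.796, 0.520]   # last channel is off by about 4 %
print(within_tolerance(obs, sim_good), within_tolerance(obs, sim_poor))
```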

Training data generation
To optimize the training of the MLP-XCO 2 model, it is essential that the input training vectors cover a wide range of realistic variations. Although randomizing all input parameters to enhance diversity might appear attractive, simulating satellite spectra involves managing a multitude of interdependent variables. In addition to the CO 2 vertical profile, factors such as surface pressure, temperature profile, water vapor, aerosols, and observation geometry must be accurately represented. Randomizing all of these parameters would require an impractical amount of data and could produce combinations that have no real-world relevance. For example, the four viewing angles determined by the Sun, the observation point, and the OCO-2 satellite occur only in fixed combinations during the satellite's regular operation, so randomly selected angle combinations lack practical significance. To ensure that the training data cover valid variations, we analyzed historical OCO-2 retrievals. This analysis revealed consistent seasonal patterns and year-to-year trends in most parameters, which supports selecting representative samples from statistical distributions rather than relying on complete randomization. By carefully considering the relationships between parameters and the realities of satellite observations, we can create a reasonably sized training dataset that effectively captures the range of expected predictions.

https://doi.org/10.5194/amt-17-3949-2024 Atmos. Meas. Tech., 17, 3949-3967, 2024

Generation of the vertical CO 2 profile is especially critical among all input parameters, because this dataset theoretically determines the generalization domain of the MLP-XCO 2 model. In the forward model based on ReFRACtor, the atmospheric CO 2 profile is segmented into 20 sub-layers by pressure. Based on a statistical analysis of the OCO-2-retrieved CO 2 profiles in the target east Asia area in 2016-2018, the boxplots of atmospheric CO 2 concentration in each sub-layer are shown in Fig. 8a, and the historic XCO 2 results from the OCO-2 product data are shown in Fig. 8b. From the upper atmosphere down to the ground surface, the variability in CO 2 concentrations gradually increases. This challenges the model's ability to standardize the atmospheric CO 2 profiles, particularly closer to the Earth's surface. Fortunately, a consistent year-on-year rise in CO 2 concentrations has been observed in each sub-layer. Consequently, we propose a method for generating subsequent CO 2 atmospheric profiles: starting from the 2016 OCO-2-retrieved CO 2 vertical profile, we increase the CO 2 concentration by 2.5 ppm per year. This approach ensures that we encompass a range of plausible atmospheric CO 2 distributions with realistic shapes, enabling the generation of simulated spectra for the designated training years.
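The profile generation step can be sketched as below. This is a minimal illustration, assuming that the same 2.5 ppm yr^-1 offset is applied to every one of the 20 sub-layers (the text states the annual increment but not how it is distributed vertically); the function name and the base profile values are hypothetical.

```python
ANNUAL_INCREMENT_PPM = 2.5  # yearly CO2 increase stated in the text
BASE_YEAR = 2016            # reference year of the retrieved profiles
N_LEVELS = 20               # number of pressure sub-layers in the forward model


def shift_profile(base_profile_ppm, target_year):
    """Return a CO2 profile for target_year derived from a 2016 profile.

    Assumption: the offset is uniform over all sub-layers, so the profile
    shape is preserved and only its overall level changes.
    """
    if len(base_profile_ppm) != N_LEVELS:
        raise ValueError(f"expected {N_LEVELS} sub-layers")
    if target_year < BASE_YEAR:
        raise ValueError("target_year must not precede the base year")
    offset = ANNUAL_INCREMENT_PPM * (target_year - BASE_YEAR)
    return [c + offset for c in base_profile_ppm]


# Made-up 2016 profile, layer 1 (top of atmosphere) to layer 20 (near surface).
base_2016 = [400.0 + 0.2 * i for i in range(N_LEVELS)]
profile_2019 = shift_profile(base_2016, 2019)  # offset of 3 * 2.5 = 7.5 ppm
```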
In addition to the CO 2 profile, Fig. 9 illustrates the year-to-year trends of various observed parameters required by the forward calculation model in the east Asian region. Although these parameters vary seasonally, they consistently exhibit annually cyclic patterns. Given that the OCO-2 satellite conducts global observations in cycles of approximately half a month (15-16 d), this study used the observation parameters and a priori atmospheric profiles (except for CO 2 ) from 2016 as a reference. These reference data were reused to generate simulations for subsequent years. For the quadratic spectral albedo, the constant term in the training data samples is uniformly set to 1 (to be normalized before being processed by the neural network). The slope and the quadratic coefficient are stochastically sampled within the range of values found in the retrieval results of the OCO-2 L2 products.
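The albedo sampling described above can be sketched as follows. This is only an illustration: the uniform distribution, the coefficient ranges, and the wavelength-offset variable `x` are assumptions, since the text specifies only that the constant term is 1 and that the other two coefficients are drawn within the range of the OCO-2 L2 retrievals.

```python
import random


def sample_albedo_coefficients(slope_range, quad_range, rng):
    """Draw one set of quadratic albedo coefficients (c0, c1, c2).

    c0 is fixed to 1 as in the text; c1 and c2 are drawn uniformly
    (assumption) from the supplied ranges.
    """
    slope = rng.uniform(*slope_range)
    quad = rng.uniform(*quad_range)
    return (1.0, slope, quad)


def albedo(coeffs, x):
    """Evaluate c0 + c1*x + c2*x**2, with x a hypothetical offset from a band reference."""
    c0, c1, c2 = coeffs
    return c0 + c1 * x + c2 * x * x


rng = random.Random(42)       # seeded for reproducibility
SLOPE_RANGE = (-0.5, 0.5)     # hypothetical range, not taken from OCO-2 products
QUAD_RANGE = (-0.1, 0.1)      # hypothetical range, not taken from OCO-2 products
coeffs = sample_albedo_coefficients(SLOPE_RANGE, QUAD_RANGE, rng)
```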
Based on 60 000 uniformly sampled observation data points taken exclusively from OCO-2 satellite observations throughout 2016, we randomly separated the dataset into six sets of 10 000 data points each. The six sets represent CO 2 profiles for the years from 2016 to the end of 2021, with the yearly increase of 2.5 ppm added to the original data to reflect projected future profiles. The forward model was used to generate the corresponding simulated spectra for each set. These simulated samples served as the foundational dataset for training the new MLP-XCO 2 machine learning model. It is important to note that this new model relies solely on the data recorded by the OCO-2 satellite in 2016 as its reference. However, real-world observations by the OCO-2 satellite involve parameters that cannot be predetermined in simulations of future years, such as the empirical orthogonal function (EOF) parameters, the signal-to-noise ratio (SNR), bad-sample lists, and the degradation of grating pixels. Therefore, our new model is trained not only on the 60 000 simulated data points but also on the 2016 historical data. According to the data selection criteria outlined in Sect. 2.1, we identified a total of 38 626 sets of historical data in 2016, comprising spectral measurements from OCO-2 and the corresponding XCO 2 products. These historical experimental datasets are integrated with the simulated data, enriching the training datasets. This dual combination and data augmentation ensure that the model is equipped to handle both potential future atmospheric conditions and the current realities of instrument and spectral measurement capabilities. The result is a more comprehensive training strategy that captures anticipated future scenarios, allowing the model to accurately and efficiently perform XCO 2 retrieval for the "future" years from 2017 to the end of 2020.
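The bookkeeping of the training set can be sketched as below. This is a minimal illustration, assuming that each of the six random subsets is assigned to one year from 2016 to 2021; the function name and the use of integer sample identifiers are hypothetical.

```python
import random


def assign_years(sample_ids, years, rng):
    """Randomly split sample_ids into equally sized, disjoint subsets, one per year."""
    ids = list(sample_ids)
    if len(ids) % len(years) != 0:
        raise ValueError("samples cannot be split evenly across years")
    rng.shuffle(ids)
    size = len(ids) // len(years)
    return {year: ids[i * size:(i + 1) * size] for i, year in enumerate(years)}


rng = random.Random(0)                 # seeded for reproducibility
years = list(range(2016, 2022))        # 2016 to the end of 2021: six years
subsets = assign_years(range(60_000), years, rng)

n_simulated = sum(len(ids) for ids in subsets.values())  # 60 000 simulated spectra
n_historical = 38_626                  # 2016 OCO-2 samples reported in the text
n_total = n_simulated + n_historical   # combined training pool before any hold-out
```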

Comparison with the OCO-2 satellite product data
To evaluate the retrieval capability of the MLP-XCO 2 model trained on a combined dataset of simulated data and historical 2016 OCO-2 satellite data, the neural network architecture and hyperparameters were intentionally kept identical to those of the previous model trained solely on actual OCO-2 satellite product data. Keeping these factors constant isolates the training data source as the only major difference between the models. This enables a direct, like-for-like comparison of how the training data affect model performance.
Figure 10a shows the retrieval results on the 10 % out-of-sample testing data that were excluded from model training. Setting aside this test subset is a standard technique for evaluating model performance on new examples. The accurate predictions of the MLP-XCO 2 model for the test data suggest that the model has learned generalizable patterns rather than overfitting to the training data. Figure 10b shows the retrieval results of the MLP-XCO 2 model for real OCO-2 satellite spectral observations in 2016. Figure 11 displays XCO 2 predictions from 2017 to 2020 using test data consistent with Figs. 4 and 5. As the simulated training data were generated based on 2016 OCO-2 measurements, testing on 2017-2020 data evaluates the model's ability to make predictions beyond the time frame of the training data. The scatter plots demonstrate that the MLP-XCO 2 model trained on simulated data can accurately and stably predict the annual XCO 2 growth trend, maintaining an RMSE of less than 1.8 ppm (0.45 %). Compared to models trained solely on historical satellite product data, the key advantage is the ability to make reasonable forecasts of future atmospheric XCO 2 levels.

Table 3 offers a detailed spatiotemporal comparison of the results presented in Fig. 11, showing the MLP-XCO 2 model's performance across distinct subregions within east Asia. The table divides the broad east Asian longitude and latitude range into four subregions, delimited at 35°N and 130°E: the northeast (NE), northwest (NW), southeast (SE), and southwest (SW) regions. The results demonstrate that, regardless of the distribution of sample sizes across these subregions and their varied surface characteristics (land or ocean), the model maintains a consistent and stable performance in each subregion. Furthermore, the error metrics for the individual subregions align closely with the overall regional errors, indicating uniform predictive accuracy and reliability across different spatial scales within east Asia.

[...]tion that makes the model's outputs scientifically sound. By training on synthetic data spanning potential future scenarios, the model learns robust representations not tightly coupled to specifics of the training data time period. This enables high-fidelity inversion and prediction of XCO 2 even for future time periods beyond available measurements.
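The subregion split at 35°N and 130°E and the per-subregion RMSE can be sketched as follows. This is a minimal illustration: the treatment of points lying exactly on a boundary, the function names, and the sample values are assumptions and do not reproduce Table 3.

```python
import math
from collections import defaultdict

LAT_SPLIT_DEG = 35.0    # 35°N boundary from the text
LON_SPLIT_DEG = 130.0   # 130°E boundary from the text


def subregion(lat_deg, lon_deg):
    """Return 'NE', 'NW', 'SE' or 'SW'; boundary points go north/east (assumption)."""
    ns = "N" if lat_deg >= LAT_SPLIT_DEG else "S"
    ew = "E" if lon_deg >= LON_SPLIT_DEG else "W"
    return ns + ew


def rmse(predicted, reference):
    """Root-mean-square error between two equally long sequences (ppm)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


def rmse_by_subregion(samples):
    """samples: iterable of (lat, lon, predicted_ppm, reference_ppm)."""
    grouped = defaultdict(lambda: ([], []))
    for lat, lon, pred, ref in samples:
        preds, refs = grouped[subregion(lat, lon)]
        preds.append(pred)
        refs.append(ref)
    return {name: rmse(p, r) for name, (p, r) in grouped.items()}


# Made-up samples for illustration only.
samples = [
    (40.0, 140.0, 411.0, 410.0),
    (41.0, 135.0, 409.0, 410.0),
    (30.0, 120.0, 412.0, 410.0),
]
per_region = rmse_by_subregion(samples)
```

Comparing each entry of `per_region` with the RMSE over all samples is the kind of consistency check the paragraph above describes.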

Detecting plume features from the OCO-2 observations
To further analyze the ability of our MLP-XCO 2 model to capture key XCO 2 information from spectral data, we focused on plume detection at sites of potentially high emissions, such as thermal power plants, in our target regions from 2017 to 2020. Using the data from Li et al. (2023), we sourced test samples from multiple instances of XCO 2 enhancements detected by the OCO-2 satellite in nadir mode observations. These samples were located in close proximity to known large power plants, providing an ideal scenario for assessing retrieval accuracy in detecting localized emission sources.
Figure 12 presents a geographical map that highlights XCO 2 predictions in the test samples from the MLP-XCO 2 model and compares them with results retrieved by the OCO-2 v10r product. This map clearly marks power plants with red triangles, establishing a visual link between industrial emission sources and observed points where elevated XCO 2 levels are detected. Figure 13 further explores this relationship by presenting a longitude-based comparison of XCO 2 results. This figure plots the same data points from Fig. 12 against their corresponding longitude coordinates. This arrangement facilitates a direct and intuitive comparison of the trends in XCO 2 enhancements captured by our model and reported by the OCO-2 product.
In both figures, it is visually evident that observation points near power plants show sudden increases in XCO 2 values that align with the trend in the OCO-2 v10r product. This trend is particularly pronounced when compared to points farther away from these emission sources. Considering that these samples are nearly identical in terms of observation angles and times, such consistency is strong evidence of our model's ability to retrieve genuine atmospheric XCO 2 from OCO-2 spectral data.

Comparison with the TCCON data
The comparison with the OCO-2 satellite retrievals showed that the RMSE of our MLP-XCO 2 model relative to the OCO-2 product was around 2 ppm. This figure alone cannot indicate whether our results are closer to or farther from the true XCO 2 than the OCO-2 product, so a further comparison with ground-based measurements is required. To further validate the accuracy of the MLP-XCO 2 model, we compared the XCO 2 retrievals from the OCO-2 v10r nadir mode products, the MLP-XCO 2 model outputs, and ground-based measurements from five TCCON sites within the study region (Fig. 1). As summarized in Table 4, spatiotemporal screening was applied to the TCCON and OCO-2 data to obtain comparable observations. The five TCCON sites included were Tsukuba (Morino et al., 2022b), Saga (Shiomi et al., 2022), Hefei (Liu et al., 2022), Xianghe (Zhou et al., 2022), and Rikubetsu (Morino et al., 2022a). The Anmyeondo site was excluded from this analysis because its XCO 2 data had not been updated in the TCCON GGG2020 database and were only available until early 2018 in the GGG2014 database. Figure 14(a-1)-(e-1) presents time series comparisons of XCO 2 retrievals from the different TCCON sites, the MLP-XCO 2 model, and OCO-2 nadir observations. Figure 14(a-2)-(e-2) displays boxplots of the differences of the MLP-XCO 2 model results and the OCO-2 products from the TCCON site data. The plots at each of the five TCCON sites demonstrate that the MLP-XCO 2 model trained on simulated data accurately predicts XCO 2 from the OCO-2 spectra. The model successfully captures seasonal variations and the long-term XCO 2 growth trend over the 4-year study period. The reliable performance over time and across multiple TCCON sites further supports the conclusion that the model has learned generalizable representations of carbon cycle processes rather than overfitting to specifics of the simulated training data. Using realistic future simulations for training, the model provides robust and unbiased XCO 2 retrievals across a range of atmospheric conditions.
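The boxplot statistics used for the TCCON comparison (median, 25th-75th percentile box, 5th-95th percentile whiskers, as described in the caption of Fig. 14) can be sketched as follows. This is a minimal illustration, assuming the model and TCCON values have already been matched by the spatiotemporal screening; the function name, the interpolation method, and the values are assumptions.

```python
from statistics import quantiles


def boxplot_summary(retrieved_ppm, tccon_ppm):
    """Percentiles of the differences (retrieved minus TCCON) for matched pairs.

    Assumption: linear interpolation between order statistics
    (method="inclusive"); the paper does not state which rule was used.
    """
    if len(retrieved_ppm) != len(tccon_ppm) or len(retrieved_ppm) < 2:
        raise ValueError("need at least two matched pairs")
    diffs = [r - t for r, t in zip(retrieved_ppm, tccon_ppm)]
    pct = quantiles(diffs, n=100, method="inclusive")  # 99 cut points: 1st..99th
    return {
        "p5": pct[4],
        "p25": pct[24],
        "median": pct[49],
        "p75": pct[74],
        "p95": pct[94],
    }


# Made-up matched pairs for illustration only.
tccon = [410.1, 411.3, 412.0, 409.8, 410.6]
model = [410.9, 410.7, 412.8, 410.1, 409.9]
summary = boxplot_summary(model, tccon)
```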

Retrieval efficiency
In this study, the ReFRACtor forward model required 12.16 s per simulation case (two absorption bands) on an AMD Ryzen-7 5800X computer. The OCO-2 retrieval based on Bayesian optimization typically needs more than three iterations to converge, requiring at least 36.48 s (3 × 12.16 s) per retrieval. In contrast, the MLP-XCO 2 model demonstrated remarkable efficiency on the same hardware. It required just 1.14 s in total to retrieve XCO 2 from 6642 OCO-2 test spectra across all five TCCON sites, averaging 0.17 ms per sample with an RTX 3080Ti GPU. This rapid inversion drastically reduces processing times compared to traditional methods. While machine learning models need significant upfront time for training data generation and hyperparameter tuning, the prediction is extremely fast once deployed. This enables near-real-time processing that is ideal for operational satellite data streams. Furthermore, the precision and efficiency of neural networks make them well suited to meet the future demands of high-resolution global greenhouse gas monitoring, enabling millisecond-scale XCO 2 retrievals suitable for large-scale satellite analysis.
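The timing figures quoted above follow from simple arithmetic, reproduced in the sketch below. The speed-up ratio is an illustrative derived quantity, not a number reported in the text, and it compares a CPU-bound forward model with GPU inference, so it should be read as an order-of-magnitude estimate.

```python
FORWARD_MODEL_S = 12.16   # one ReFRACtor simulation (two bands), from the text
MIN_ITERATIONS = 3        # lower bound on iterations used in the text
N_SPECTRA = 6642          # OCO-2 test spectra across the five TCCON sites
MLP_TOTAL_S = 1.14        # total MLP-XCO2 inference time, from the text

# Lower bound on one iterative retrieval: 3 * 12.16 s = 36.48 s.
iterative_s = FORWARD_MODEL_S * MIN_ITERATIONS

# Average MLP-XCO2 time per spectrum in milliseconds (about 0.17 ms).
mlp_ms_per_sample = 1000.0 * MLP_TOTAL_S / N_SPECTRA

# Rough per-retrieval speed-up (derived here, not reported in the text).
speedup = iterative_s / (MLP_TOTAL_S / N_SPECTRA)
```

With these inputs the ratio is on the order of 10^5 per retrieval.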

Conclusions
This proof-of-concept study uses the efficient regression capability of machine learning to develop models, trained on simulated atmospheric radiative-transfer data, for the efficient inversion of satellite-observed spectra to retrieve XCO 2 . This helps overcome the low efficiency of traditional optimization-based iterative algorithms for XCO 2 retrievals. In the present study, XCO 2 inversion models using both satellite-product-based data and simulation-based data were developed, trained, and tested. Long time series inversion and prediction of OCO-2 observations over east Asia were also performed using the developed models. The results were compared with OCO-2 and TCCON retrievals, showing that the simulation-data-based machine learning models can effectively eliminate lagging biases while achieving millisecond-level (< 1 ms) inversion efficiency, good accuracy (RMSE less than 1.8 ppm), local emission source capture, and long-term prediction stability. It should be noted that our current MLP-XCO 2 model does not provide direct uncertainty estimates; estimating prediction intervals is an important next step for future improvements. Additionally, our results suggest that integrating additional contextual information, such as the year, can provide good prior information for XCO 2 retrieval while preventing the model from focusing solely on interpolation rather than learning about actual CO 2 increases within the spectra. However, the underlying mechanisms behind this improvement require further investigation.

Figure 1. The target area in the east Asia region, the distribution of observation points (from OCO-2 L2 std v10r files) of the OCO-2 nadir mode in January 2016, and the distribution of TCCON sites in this area.

Figure 2. Schematic diagram of the MLP-XCO 2 model. The input layer includes the interpolated radiance values of the two bands (WCO 2 and SCO 2 ) filtered through a bad-pixel filter, geographical observation information, surface pressure, and the corresponding year. A dropout layer with a dropout rate of 0.1 is added between the input layer and the first hidden layer.

Figure 3. Visualization of the OCO-2 satellite data quality across interpolated wavelength grid indices. The map illustrates the bad-sample list extracted from OCO-2 Level 1B files for all test cases. On the horizontal axis, sample numbers range from 0 to 194 150, while the vertical axis represents the wavelength grid indices ranging from 0 to 1280. Red coloration indicates problematic data pixels.

Figure 4. Comparison of the 10 % out-of-sample XCO 2 testing cases predicted by the MLP-XCO 2 model versus results retrieved by the OCO-2 v10r product. Panel (a) is the historical data training model, while panel (b) is the skipped-year training model. The solid red lines correspond to perfect agreement, and the shaded areas around the solid red lines represent ±1 % XCO 2 deviations.

Figure 5. Comparison of XCO 2 results predicted by the MLP-XCO 2 model versus results retrieved by the OCO-2 v10r product on test sets consisting of years not included in the training periods. Panels (a-1) and (a-2) are the historical data training model using the 2019 and 2020 test sets, respectively. Panels (b-1) and (b-2) are the skipped-year training model using the 2017 and 2019 test sets, respectively. The solid red lines correspond to perfect agreement, and the shaded areas around the solid red lines represent ±1 % XCO 2 deviations.

Figure 6. Comparisons of the OCO-2 observed spectra with the simulated ones from the modified ReFRACtor forward calculation model in the WCO 2 band. The lower subpanel shows the relative error between the spectrum observed by the OCO-2 satellite and that simulated by the forward calculation model. Panels (a)-(d) correspond to test samples from four different regions. The input vectors for the ReFRACtor model were derived from OCO-2 L2std retrieved results.

Figure 7. Comparisons of the OCO-2 observed spectra with the simulated ones from the corrected ReFRACtor forward calculation model in the SCO 2 band. The lower subpanel shows the relative error between the spectrum observed by the OCO-2 satellite and that simulated by the forward calculation model. Panels (a)-(d) correspond to test samples from four different regions. The input vectors for the ReFRACtor model were derived from OCO-2 L2std retrieved results.

Figure 8. Panel (a) is the boxplot of the vertical distribution of CO 2 profiles (from OCO-2 L2std files) retrieved by the OCO-2 satellite over east Asia in nadir mode from 2016 to 2018. The horizontal axis represents the atmospheric layers from layer 1 (top of the atmosphere) to layer 20 (near surface). The upper and lower bounds of each whisker show the maximum and minimum CO 2 concentrations recorded within that layer for each year. Panel (b) is the scatter plot of historic XCO 2 results retrieved by the OCO-2 inversion program (from L2std files).

Figure 9. Scatter plots of atmospheric parameters required for forward calculation models (excluding CO 2 profiles) from 2016 to 2020, sourced from the OCO-2 L2 product. Panel (a) is the surface pressure, (b) is the surface temperature, (c) is the near-surface water vapor concentration, (d) is the solar zenith angle, (e) is the Sun-Earth distance, and (f) is the Earth-satellite relative velocity.

Figure 10. Comparison of XCO 2 results predicted by the MLP-XCO 2 model for the 10 % of test data not involved in training. Panel (a) shows the predicted XCO 2 values for the test data derived from the simulated dataset, and panel (b) shows those for the test data derived from OCO-2 2016 L2 XCO 2 data.

Figure 13. Longitude-based scatter comparison of XCO 2 predicted by the MLP-XCO 2 model versus results retrieved by the OCO-2 v10r product. The potential plume enhancements were screened in nadir mode OCO-2 observations as reported in the work of Li et al. (2023). ME represents the mean value of XCO 2 within the longitude range shown in the figure.

Figure 14. Comparisons of XCO 2 results from 2017 to 2020 across five TCCON sites. Panels (a-1)-(e-1) show the time series comparisons of XCO 2 retrievals from the different TCCON sites, the MLP-XCO 2 model, and OCO-2 L2Lite nadir observations for the Tsukuba, Saga, Hefei, Xianghe, and Rikubetsu sites, with data screening conditions as defined in Table 4. Panels (a-2)-(e-2) present the boxplots depicting the differences (ΔXCO 2 ) of the MLP-XCO 2 model and OCO-2 products from the TCCON results for each year. The boxes show the middle half of the data, from the 25th to 75th percentiles. The median (50 %) is represented by the line within each box. The whiskers encompass the central 90 % of the data, extending from the 5th to 95th percentiles.

Table 1. A detailed list of the input parameters for the MLP-XCO 2 model.

Table 4. Spatiotemporal screening conditions for TCCON sites and OCO-2 satellite nadir mode observations.