Retrieving Atmospheric Gas Profiles Using FY-3E/HIRAS-II Infrared Hyperspectral Data by Neural Network Approach

Abstract: The observed radiation data from the second-generation Hyperspectral Infrared Atmospheric Sounder (HIRAS-II) on the Fengyun-3E (FY-3E) satellite contain useful vertical atmospheric information that can be used to distinguish and retrieve vertical profiles of atmospheric gas components.


Introduction
Ozone (O3), carbon monoxide (CO), and methane (CH4) are important atmospheric gas components that strongly influence atmospheric radiative transfer, regional air quality, and global climate change [1]. In the stratosphere, O3 is a powerful absorber of solar ultraviolet radiation and is critical in safeguarding the Earth's biosphere. In the troposphere and near the surface, O3 is a greenhouse gas and air pollutant that affects human health and the ecological environment [2]. When nitrogen oxides and hydrocarbons reach a certain concentration, O3 can form through photochemical reactions, resulting in secondary pollution [3]. Additionally, rising concentrations of O3 and CH4 in the atmosphere contribute significantly to global warming. Examining the vertical distribution of these gas components offers insight into the distribution of atmospheric chemical components and their impacts on the atmosphere and ecosystems.
The development of meteorological satellites has provided precise observations of atmospheric conditions and improved weather forecasting models, and it plays a critical role in monitoring the rapidly changing composition of stratospheric and tropospheric gases [4]. At present, infrared hyperspectral atmospheric vertical sounding instruments on weather satellites can reveal the vertical distribution of thermodynamic variables (temperature and water vapor), surface characteristics (surface temperature and emissivity), and atmospheric gas composition profiles or column concentrations [5]. Because the infrared spectral region contains many gas absorption bands, and infrared hyperspectral instruments have narrow weighting functions and high vertical resolution, such instruments are highly sensitive to the gas composition at specific levels. Detecting vertically resolved atmospheric composition information is therefore one of the advantages of infrared hyperspectral atmospheric vertical sounding instruments [6]. The retrieval of atmospheric temperature and humidity profiles from satellite infrared hyperspectral data has become relatively mature, but the retrieval of gas composition profiles still needs further exploration. The following paragraphs introduce the current methods for extracting atmospheric composition information from infrared hyperspectral data.
Gases absorb spectral radiation with different strengths at different wavelengths [7]. For example, O3 has its strongest absorption band at 9.6 µm, CO shows strong absorption at 4.67 µm, and CH4 has strong absorption bands at 3.31 µm and 7.66 µm [8]. To account for the differing sensitivities of gas components across spectral bands, channel selection is typically conducted before retrieval. Current channel selection methods can be roughly divided into two groups. The first group uses a weighting function-based selection approach, such as the data accuracy matrix method and the Jacobian method. These methods primarily focus on the sensitivity of each channel to atmospheric parameters but may not fully consider the effects of channel noise, background fields, and specific retrieval techniques. The second group involves channel selection methods based on information capacity, including the degrees of freedom and information content analysis method, the constant iteration method, and the atmospheric retrievable index method [9]. Some international scholars have used infrared hyperspectral instruments such as CrIS [10], IASI [11], and AIRS [12] to retrieve gas composition profiles or column concentrations. The related products have been validated against ground-based observations and applied to some numerical forecast models [13][14][15][16]. Currently, algorithms for retrieving atmospheric profiles from satellite infrared hyperspectral data comprise physical retrieval methods and statistical regression algorithms such as deep learning [6]. Commonly used physical retrieval methods include the onion peeling algorithm and the optimization method. Although it is fast, the onion peeling algorithm is prone to accumulating errors from the upper layers, leading to lower retrieval accuracy.
The optimization algorithm, on the other hand, requires a precise calculation of the radiative transfer model and the Jacobian matrix, which can be time-consuming, and it needs prior information as input. The accuracy of this prior information directly affects the accuracy of the retrieval result.
Remote Sens. 2023, 15, 2931
The channel selection methods and physical retrieval algorithms described above have been widely used in infrared hyperspectral studies. Rodger [17] proposed an information-based hyperspectral remote sensing channel selection method that combines prior knowledge about atmospheric composition profiles with observed data to obtain optimal estimates of the true profile and its error covariance. Cyril [18] developed the optimal sensitivity profile method for AIRS, which selects 43 channels for CO2 retrieval and has demonstrated its effectiveness for other trace gases, such as CO and CH4. Using the Reference Forward Model (RFM), Zhong [19] simulated the weighting function of the volume mixing ratio of atmospheric pollution gases in the limb detection mode of a hyperspectral instrument. Meanwhile, Li [20] and Zhang [21] employed information entropy analysis to select channels from the spectrum and a one-dimensional variational retrieval method to obtain atmospheric temperature and humidity profiles from HIRAS observation data. Wang [22] combined the weighting functions of various ozone channels and their interfering components with an information entropy method to perform channel selection, and estimated ozone profiles using the optimal estimation method. Zhang [9] proposed a channel selection method based on peak sampling, considering both channel sensitivity and weighting function characteristics, for retrieving CO profiles from hyperspectral infrared data. The RMSE of the retrieved CO profiles in the Alxa region during winter was 3.07 × 10⁻⁸ kg/kg. Wang [23] used the empirical orthogonal function method to retrieve the vertical profile of atmospheric CO from CrIS infrared hyperspectral data, and the results were consistent with the verification set.
Noel [24] combined the onion peeling algorithm with the Weighting Function Modified Differential Optical Absorption Spectroscopy (WFM-DOAS) algorithm, using the 1.559~1.671 µm band to retrieve the stratospheric CH4 profile. The retrieval accuracy was between 5 and 10 percent. Zhang [25] proposed a model for estimating methane profiles using the Empirical Orthogonal Function (EOF) method based on spaceborne hyperspectral infrared observations, with a relative RMSE of less than 2.5%. Zhou [26] and Song [27] quantified the errors associated with CH4 measurements in the infrared spectrum and highlighted that precise estimates of temperature, and of the gases whose absorption overlaps that of CH4, can enhance the accuracy of CH4 retrieval. Deng [28] implemented an effective and precise forward modeling retrieval algorithm based on several sensitivity studies, and most CH4 retrieval errors were under 1%. The core of the optimization method in physical retrieval algorithms is a forward model based on a fast radiative transfer model. To retrieve a single observation sample, the radiative transfer model must calculate the simulated satellite radiance and the more time-consuming Jacobian matrix. Furthermore, the entire calculation process is time-consuming because satellite forward calculations must be based on multiple forward models. In contrast, convolutional neural networks have gradually been introduced into numerical weather forecasting and remote sensing satellite retrieval due to their adaptive, self-organizing, and real-time learning features. They can learn an optimal model relating satellite observation data to gas profile information without relying on complex atmospheric radiative transfer calculations [29]. Currently, there is abundant literature on using neural networks to retrieve temperature and humidity profiles from infrared hyperspectral data, but only limited literature on the retrieval of gas profiles.
Zhang [30] and Liu [31] employed artificial neural networks (ANN) to retrieve atmospheric temperature and humidity profiles in their early studies and found that the neural network method yielded higher retrieval accuracy than the eigenvector statistical method. Huang [32] combined an artificial neural network algorithm with an improved one-dimensional variational algorithm to retrieve temperature profiles at various atmospheric pressure layers from the Geostationary Interferometric Infrared Sounder of FengYun-4A (FY-4A/GIIRS). Yao [33] established CNN and U-NET networks based on GIIRS to retrieve temperature and humidity profiles. The results show that the U-NET algorithm significantly improved retrieval at all altitudes compared with the CNN algorithm. Xue [34] developed 1D-CNN and 3D-CNN models based on GIIRS to retrieve temperature and humidity profiles. The retrieval accuracy was lower near the ground and gradually improved with increasing altitude. Neural networks have also been used for the retrieval and prediction of gas column concentrations [35][36][37]. Moreover, Jarosawski [38] demonstrated that a neural network retrieval of a 10-layer O3 profile agreed better with site data than Umkehr's method. In summary, although physical retrieval and neural network methods have both achieved results in infrared hyperspectral applications, few studies on gas profile retrieval are available in the literature. Further exploring the capability of infrared hyperspectral data for this purpose is therefore of great significance. In addition, such retrievals can furnish precise initial values for numerical prediction models and serve as a reference for producing Fengyun satellite products.
This study used FY-3E/HIRAS-II data to retrieve atmospheric component profiles and to conduct a preliminary exploration of the changes, distribution, and concentration of important gas components, including O3, CH4, and CO, throughout the atmosphere. Since providing accurate initial values for numerical prediction models is one of the tasks of satellite atmospheric retrieval, the retrieved component profiles were compared with forecast data produced by current advanced international numerical prediction models to verify the accuracy of the retrieval results. Additionally, the retrieval results were compared with products from similar instruments currently in orbit to verify their consistency.

Datasets
In this experiment, we used the infrared hyperspectral data of FY-3E/HIRAS-II acquired during in-orbit testing for atmospheric composition retrieval. Given the scarcity of actual measurements of the vertical profiles of component gases, the limited sample size had to be considered when constructing the neural network. To ensure the accuracy of the training set output, ERA5 (O3) and EAC4 (CO, CH4) reanalysis data released by the European Centre for Medium-Range Weather Forecasts (ECMWF) were used as label data. The retrieval results were verified by comparison with GFS forecast data (O3), WACCM climate data (CO, CH4), and component profile products from AIRS (O3, CO, CH4) and IASI (O3, CO). Comparing against several independent sources makes the evaluation of the model's performance more reliable. The relevant datasets are listed in Table 1.
FY-3E carries HIRAS-II [39], an infrared hyperspectral atmospheric detection instrument developed by the Shanghai Institute of Technical Physics, Chinese Academy of Sciences [40]. In this experiment, we used FY-3E/HIRAS-II Level 1 satellite data (http://satellite.nsmc.org.cn/portalsite/default.aspx, accessed on 15 April 2023). The area selected for the experiment is the sea area south of India (25°N~25°S, 45°E~100°E), and the selected time interval is the on-orbit testing period during winter, from 21 December 2021 to 18 January 2022. Table 2 shows the spectral characteristics and related parameters of FY-3E/HIRAS-II Level 1 (the data obtained from the website are unapodized, and more details can be found at https://satellite.nsmc.org.cn/PortalSite/StaticContent/FileDownload.aspx?CategoryID=1&LinkID=553, accessed on 15 April 2023).
FY-3E/HIRAS-II has three bands (longwave, mediumwave, and shortwave) with a total of 3041 spectral channels covering 650~2550 cm⁻¹. The instrument is capable of continuous hyperspectral detection over a wide infrared spectrum as well as high-precision calibration. Each detector array contains 9 detector elements that observe the target region simultaneously. Each element has an opening angle of 1°, corresponding to an instantaneous field of view of about 14 km at nadir [41][42][43].
The training set labels were determined from the selected gas datasets. Specifically, the O3 dataset was obtained from the ERA5 reanalysis (https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-pressure-levels, accessed on 15 April 2023), which is post-processed and combines different types of observation data. The dataset comprises 37 layers and has a time resolution of 1 h. The CH4 and CO datasets were derived from the EAC4 reanalysis (https://ads.atmosphere.copernicus.eu/cdsapp#!/dataset/cams-global-reanalysis-eac4, accessed on 15 April 2023) and were used as the training output in this experiment. EAC4 is the fourth generation of ECMWF global atmospheric composition reanalysis data; it is mainly based on physical and chemical models of atmospheric radiation and incorporates global observations. Like ERA5, it provides post-processed global data, in this case every 3 h on 25 pressure levels. The retrieval layers of O3, CO, and CH4 in this experiment are consistent with the levels of the corresponding ERA5 and EAC4 gas profiles.
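As a small illustration of the channel layout described above, the following sketch builds a wavenumber grid for the 3041 channels. It assumes that the channels are uniformly spaced and contiguous across 650~2550 cm⁻¹, which is an assumption made here for illustration; the function names are hypothetical and not part of any HIRAS-II software.

```python
import numpy as np

# Assumption: 3041 uniformly spaced, contiguous channels over 650-2550 cm^-1.
WN_MIN, WN_MAX, N_CHANNELS = 650.0, 2550.0, 3041

wavenumber = np.linspace(WN_MIN, WN_MAX, N_CHANNELS)  # channel centres, cm^-1
spacing = wavenumber[1] - wavenumber[0]                # implied channel spacing


def channel_index(target_wn):
    """Return the index of the channel closest to target_wn (cm^-1)."""
    return int(np.argmin(np.abs(wavenumber - target_wn)))


def wavelength_um_to_wavenumber(wavelength_um):
    """Convert a wavelength in micrometres to a wavenumber in cm^-1."""
    return 1.0e4 / wavelength_um
```

Under this assumption the implied spacing is 1900/3040 = 0.625 cm⁻¹, and a band quoted by wavelength, such as the 9.6 µm O3 band, can be located on the grid with `channel_index(wavelength_um_to_wavenumber(9.6))`.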

WACCM Forecast Data
The WACCM climate model data use the CESM (Community Earth System Model) from the National Center for Atmospheric Research (NCAR) as their numerical framework and incorporate observations and modeling of the upper atmosphere from HAO (High Altitude Observatory), middle atmosphere observations and modeling from ACOM (Atmospheric Chemistry and Modeling), and tropospheric modeling from Climate and Global Dynamics (CGD). The dataset has 88 pressure layers and focuses on atmospheric vertical information. Some studies use the data produced by this climate model as the background field or as the first guess for physical retrieval methods. This study primarily used the CO and CH4 background forecast data in the WACCM climate model data (https://rda.ucar.edu/datasets/ds313.6/dataaccess/, accessed on 15 April 2023) to assess the accuracy of the retrieval results. The spatial resolution of the dataset is 0.9° × 1.25°, and the temporal resolution is 6 h. In this paper, we selected the O3, CH4, and CO profiles of WACCM at times close to those of the test set for comparison with the retrieval results.

GFS Forecast Data
The GFS forecast data (https://www.ncei.noaa.gov/data/global-forecast-system/access/grid-004-0.5-degree/analysis, accessed on 15 April 2023) produced by the National Centers for Environmental Prediction (NCEP) are used in this experiment. The data have a forecast time of 6 h and a spatial resolution of 0.5°. The GFS O3 profile at times close to those of the test set was selected for comparison with the retrieval results.

AIRS Product Data
The main purpose of the dataset (available at https://disc.gsfc.nasa.gov/datasets/AIRS2RET_7.0/summary?keywords=AIRS2RET_7.0, accessed on 15 April 2023) is to provide AIRS secondary products, including temperature and humidity profiles as well as O3, CH4, and CO profiles. The dataset includes two sets of retrieval data, one for the ascending orbit and one for the descending orbit, which are updated daily. The standard data products are issued 72 h after the L1 (Level 1) data, and the spatial resolution is 1° × 1°. Some parameters are calculated using optimal estimation in the physical retrieval algorithm. In this study, we selected the O3, CH4, and CO profile parameters from the AIRS L2 (Level 2) products for comparison with the retrieval results. The time frame of the dataset matches that of the test set.

IASI Product Data
The primary objective of IASI (Infrared Atmospheric Sounding Interferometer) is to acquire atmospheric emission spectra and high-resolution secondary product data (https://archive.eumetsat.int/usc/UserServicesClient.html, accessed on 15 April 2023) that provide accurate temperature and humidity distribution information. The instrument is also capable of detecting trace gases, including O3, N2O, CO2, and CH4, and of obtaining data on land and ocean surface temperature, emissivity, and cloud characteristics. In this study, we selected the O3 and CO profiles from the IASI secondary products for comparison with the retrieval outputs. The time frame of the dataset matches that of the test set.
Figure 1 shows the framework for the neural network (CNN and UNET) retrieval of atmospheric component gas profiles in this experiment. The retrieval relies on the preprocessing of several data sources, including observations from HIRAS-II, reanalysis data, forecast background data, and secondary products from AIRS and IASI. The specific preprocessing steps include the apodization of HIRAS-II observation data, the selection of clear sky ocean sample points, and the interpolation of the reanalysis data and AIRS products. These steps support the neural network retrieval of atmospheric component gas profiles.

Data Preprocessing
(1) Spectral Apodization
The channel spectral response function of hyperspectral detectors is difficult to measure directly in the laboratory because of the narrow detection band of each channel. Instead, a sinc-like function is commonly used to simulate the channel spectral response function of the interferometer without apodization. However, the sinc-like function produces sidelobes on both sides of the main peak, leading to inaccuracies in spectral simulation. To reduce this effect, apodization is required on the L1 data of HIRAS-II. Apodization of the interferogram is equivalent to multiplying it by a smoothly tapering Hamming window function, while in the spectral domain it is equivalent to a smoothing operation. Therefore, a Hamming function was selected as the apodization function for HIRAS-II L1 data [44]. The calculation formula is as follows:

$$\mathrm{Rad}_n = 0.23\,\mathrm{Rad}_{n-1} + 0.54\,\mathrm{Rad}_n + 0.23\,\mathrm{Rad}_{n+1} \qquad (1)$$

where Rad_n is the radiation value at index n in a spectrum sample.
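The three-point filter in Equation (1) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the operational processing: the function name is hypothetical, and leaving the first and last channels unchanged is an assumption, since the text does not say how the spectrum edges are handled.

```python
import numpy as np


def hamming_apodize(rad):
    """Apply the 3-point Hamming filter (0.23, 0.54, 0.23) to one spectrum.

    Assumption: the two edge channels, which lack a neighbour on one side,
    are returned unchanged.
    """
    rad = np.asarray(rad, dtype=float)
    out = rad.copy()
    out[1:-1] = 0.23 * rad[:-2] + 0.54 * rad[1:-1] + 0.23 * rad[2:]
    return out
```

Because the three weights sum to 1, a flat spectrum passes through unchanged, while an isolated spike is spread over its two neighbours, which is the smoothing behaviour described above. If the three bands are stored separately, it may be preferable to apply the filter to each band on its own so that values are not mixed across band edges.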
Figure 2 depicts the spectrum of FY-3E/HIRAS-II with and without apodization. The solid black line shows the spectral brightness temperature without apodization, while the solid red line shows the spectral brightness temperature after apodization. As shown in the figure, apodization results in a smoother spectrum [21].
(2) Brightness Temperature Conversion
The radiation value is converted into a brightness temperature value using Planck's formula, in which u(λ, T) represents the total energy of radiation with wavelength λ and thermodynamic temperature T, k is the Boltzmann constant, h is the Planck constant, and c is the speed of light.
(3) Clear Sky Area Screening
The high intensity of infrared background radiation from clouds significantly affects the accuracy of gas profile retrieval in the infrared spectral region. Therefore, cloud removal is necessary for the spectral data. In this research, cloud-free samples are selected by screening part of the longwave spectrum of FY-3E/HIRAS-II, which differs from the usual approach of employing a cloud detection algorithm on the same platform. Clear sky samples are identified from the observations of five representative infrared channels (810 cm⁻¹, 830 cm⁻¹, 850 cm⁻¹, 870 cm⁻¹, and 890 cm⁻¹) in the longwave window region. The selected samples must have a spectral brightness temperature greater than 290 K to ensure that each sample point is a clear sky sample [21,45].
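The two steps above can be sketched together as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it uses the wavenumber form of Planck's law, assumes radiances in mW/(m² sr cm⁻¹), and assumes that all five window channels must exceed the 290 K threshold. The function names are hypothetical.

```python
import numpy as np

H = 6.62607015e-34  # Planck constant, J s
C = 2.99792458e8    # speed of light, m/s
K = 1.380649e-23    # Boltzmann constant, J/K

# Radiation constants for wavenumber in cm^-1 and radiance in mW/(m^2 sr cm^-1).
C1 = 2.0 * H * C**2 * 1.0e11  # mW/(m^2 sr cm^-4)
C2 = H * C / K * 1.0e2        # K cm

WINDOW_WN = (810.0, 830.0, 850.0, 870.0, 890.0)  # window channels from the text


def planck_radiance(wn, temp):
    """Blackbody radiance at wavenumber wn (cm^-1) and temperature temp (K)."""
    return C1 * wn**3 / np.expm1(C2 * wn / temp)


def brightness_temperature(wn, rad):
    """Invert Planck's law: radiance -> brightness temperature (K)."""
    return C2 * wn / np.log1p(C1 * wn**3 / rad)


def is_clear_sky(wavenumber, bt, threshold=290.0):
    """Assumption: a sample is clear only if ALL five window channels exceed threshold."""
    idx = [int(np.argmin(np.abs(wavenumber - w))) for w in WINDOW_WN]
    return bool(np.all(bt[idx] > threshold))
```

A fixed 290 K threshold suits the warm tropical ocean scenes used here; a different region or season would likely need a different value.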
(4) Spatio-Temporal Matching
The final training and test sets were built from the pixels that passed clear sky screening. Because the reanalysis, forecast, secondary satellite product, and HIRAS-II observation data differ in temporal and spatial resolution, the time, latitude, and longitude of the observation samples were used as the reference for interpolation. Spatio-temporal matching and pressure layer interpolation were then performed on the data from the other sources.
Specifically, (1) in temporal matching: because the ERA5, EAC4, GFS, and WACCM datasets are uniformly distributed in time and space, we obtained time-matched data by linear interpolation between the time steps immediately before and after the sample point. However, since the AIRS and IASI instruments revisit the same area only a limited number of times per day, we collected data in the vicinity of the sample point as approximate time-matched data; (2) in spatial matching: four commonly used spatial interpolation methods are bilinear interpolation, nearest neighbor interpolation, inverse distance weighting, and cubic spline interpolation (Cubic). The nearest neighbor method is prone to causing discontinuities in the data because of its sawtooth effect, whereas the remaining three methods are more appropriate for spatial interpolation. In this experiment, the reanalysis data, forecast data, AIRS data, and IASI data were spatially interpolated by cubic spline interpolation (Cubic) to the latitude and longitude of the HIRAS-II sample points; (3) in vertical interpolation: based on the stratification of the reanalysis data, the O3 profiles were linearly interpolated to the 37 ERA5 layers and the CH4 and CO profiles to the 25 EAC4 layers. For data whose vertical range is smaller than that of the reanalysis data, interpolation was performed only within the available range.
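The temporal and vertical matching steps can be sketched with plain NumPy. This is a minimal sketch under stated assumptions: the function names are hypothetical, the vertical interpolation is taken to be linear in pressure (the text says only "linearly"), and target levels outside the source range are returned as NaN to reflect that interpolation is performed only within the available range. The cubic spatial interpolation is not shown.

```python
import numpy as np


def interp_time(t, t0, prof0, t1, prof1):
    """Linearly interpolate a profile between two bracketing times t0 <= t <= t1."""
    w = (t - t0) / (t1 - t0)
    return (1.0 - w) * np.asarray(prof0, float) + w * np.asarray(prof1, float)


def interp_levels(p_target, p_source, profile):
    """Interpolate a profile to target pressure levels (assumed linear in pressure).

    Levels outside the source pressure range are set to NaN instead of
    being extrapolated.
    """
    p_source = np.asarray(p_source, float)
    profile = np.asarray(profile, float)
    order = np.argsort(p_source)  # np.interp requires increasing x values
    return np.interp(p_target, p_source[order], profile[order],
                     left=np.nan, right=np.nan)
```

For example, a product that reaches only 200 hPa would yield NaN at a 100 hPa target level, and those levels can then be masked out when computing comparison statistics.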

Channel Selection
The channel sensitivity analysis of HIRAS-II was carried out by perturbing each gas by an amount based on the retrieval accuracy currently reported internationally for that gas. The spectrally sensitive positions of each gas were determined first, and then the improved OSP (Optimal Sensitivity Profile) channel selection algorithm was employed to optimize the channels. In this experiment, the channel selection is based on the noise estimate DS NEDT (Deep-space Noise Equivalent Delta Temperature) obtained during the FY-3E/HIRAS-II in-orbit test and on the gas Jacobian matrices simulated by RTTOV; DS NEDT was derived from the FY-3E/HIRAS-II L1 dataset.
Using the FY-3E/HIRAS sensor coefficient file provided with RTTOV, the brightness temperature of each infrared channel of HIRAS was calculated. On this basis, small perturbations were applied according to the current retrieval accuracy of each related quantity [46]: CO (10%), N2O (2%), CH4 (10%), H2O (20%), O3 (10%), CO2 (1%), T (1 K), and Tsurf (1 K). These perturbations show how strongly each channel responds to changes in the atmospheric composition parameters. The change in the simulated brightness temperature is used to represent the response of each channel to the perturbation of an atmospheric parameter, according to the following equation:

$$\Delta BT_j(\nu) = BT(X_0 + \delta X_j,\ \nu) - BT(X_0,\ \nu)$$

In the formula, BT represents the simulated brightness temperature calculated by RTTOV, X_0 represents the original atmospheric composition information from the reanalysis data, δX_j represents the perturbation of atmospheric parameter j, and the change in the simulated observed brightness temperature ΔBT_j(ν) represents the sensitivity of the channel at wavenumber ν to that parameter [47].
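The perturbation experiment can be sketched as a finite difference around the unperturbed state. The sketch below is illustrative only: `toy_forward` is a hypothetical three-channel stand-in for the radiative transfer model (it is not RTTOV and has no physical meaning), and the state is a plain dictionary of profiles. Only the fractional gas perturbations are shown; the 1 K temperature perturbations would be additive instead.

```python
import numpy as np

# Fractional perturbations quoted in the text for the gas amounts.
PERTURBATION = {"CO": 0.10, "N2O": 0.02, "CH4": 0.10,
                "H2O": 0.20, "O3": 0.10, "CO2": 0.01}


def channel_response(forward_model, state, name, fraction):
    """Return BT(X0 + dX_j) - BT(X0) per channel for a fractional change in `name`."""
    perturbed = dict(state)
    perturbed[name] = np.asarray(state[name], float) * (1.0 + fraction)
    return forward_model(perturbed) - forward_model(state)


def toy_forward(state):
    """Hypothetical stand-in for the forward model: 3 channels, one gas each."""
    return 290.0 - np.array([state["O3"].sum(),
                             state["CO"].sum(),
                             state["CH4"].sum()])


state = {"O3": np.array([1.0, 2.0]),
         "CO": np.array([0.5, 0.5]),
         "CH4": np.array([2.0, 2.0])}
delta_bt_o3 = channel_response(toy_forward, state, "O3", PERTURBATION["O3"])
```

In this toy setup only the first channel responds to the O3 perturbation, which mirrors how the sensitivity spectra are used to locate the spectral region where each gas dominates.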
The absorption positions of several gases in the infrared spectral region, and the brightness temperature changes caused by perturbing the gas content, are shown in Figure 3.
After determining the spectral position of the sensitive channels, the improved OSP algorithm was used for channel optimization; it combines the DS NEDT of HIRAS-II with the Jacobian matrices of the different gases obtained from the partial derivatives. After the lateral selection of the spectral position and the longitudinal selection of the sensitive gas layer, the optimal channels for each gas are determined. The improved OSP channel selection works as follows. First, the algorithm selects the channel with the largest Jacobian peak in each pressure layer as the first channel. Second, it calculates the signal-to-noise ratio (SNR) of each channel and excludes channels in which the target gas signal is smaller than the DS NEDT. Third, the algorithm uses the SNR of the first channel in each pressure layer as a threshold and excludes channels whose Jacobian peaks at the same height but whose SNR is lower than the threshold. This excludes similar channels and avoids information redundancy between channels.
As shown in Figure 4, (a) the channels selected for O3 are concentrated between 1000 cm⁻¹ and 1080 cm⁻¹, where interference from other gases is low; (b) the preferred channels for CO are located between 2080 cm⁻¹ and 2200 cm⁻¹; and (c) the channels selected for CH4 are primarily located between 1240 cm⁻¹ and 1360 cm⁻¹. Both CO and CH4 are strongly affected by water vapor absorption. Eventually, we selected 96 O3 absorption channels, 76 CO absorption channels, and 150 CH4 absorption channels in the experiment, with the detailed bands shown in Table A1.
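One possible reading of the three selection steps is sketched below. It is an interpretation for illustration, not the authors' implementation: the function name is hypothetical, the SNR is assumed to be the Jacobian peak magnitude divided by the channel's DS NEDT, and "signal smaller than the DS NEDT" is taken to mean SNR < 1.

```python
import numpy as np


def select_channels(jacobian, nedt):
    """Sketch of an OSP-style selection.

    jacobian: (n_channels, n_levels) brightness temperature response in K.
    nedt:     (n_channels,) noise equivalent delta temperature in K.
    Returns the sorted indices of the retained channels.
    """
    jac = np.abs(np.asarray(jacobian, float))
    nedt = np.asarray(nedt, float)
    peak_level = jac.argmax(axis=1)  # pressure layer where each channel peaks
    peak_value = jac.max(axis=1)     # size of that peak
    snr = peak_value / nedt          # assumed SNR definition
    candidate = snr >= 1.0           # step 2: drop channels below the noise

    selected = []
    for level in np.unique(peak_level[candidate]):
        members = np.flatnonzero(candidate & (peak_level == level))
        # Step 1: the first channel has the largest Jacobian peak in this layer.
        first = members[np.argmax(peak_value[members])]
        # Step 3: keep channels whose SNR reaches that of the first channel.
        keep = members[snr[members] >= snr[first]]
        selected.extend(keep.tolist())
    return sorted(selected)
```

With this reading, a channel with a smaller Jacobian peak survives only if its noise is low enough that its SNR matches or exceeds that of the layer's first channel.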

Neural Network Model and Experimental Process
This paper develops network models for O3, CO, and CH4 using widely used fully convolutional neural networks (CNN and UNET).
(1) CNN Model
A convolutional neural network (CNN) provides an end-to-end learning model whose parameters can be trained by gradient descent, and it can learn the deep features of the samples. The CNN model structure [33,48] is displayed in Figure 5 and includes one input layer (brightness temperature data), four convolutional layers, two pooling layers (using average pooling), one fully connected layer, and one regression output layer. The input data are the brightness temperatures of the FY-3E/HIRAS-II infrared hyperspectral channels retained after channel optimization. The first three convolutional layers each comprise convolution, batch normalization, and a rectified linear unit (ReLU) activation function. Batch normalization stabilizes the data following the convolution operation, and the nonlinearity of the activation function is used for feature extraction; in particular, the activation function enhances the network's nonlinear fitting capacity. While the initial convolutional layer learns shallower features, the higher-level convolutional layers capture more abstract feature information. Finally, the labels for the regression output layer in the training set are the reanalysis profiles from ERA5 and EAC4.
The specific network parameters of CNN are detailed in Table 3, which provides the values of Nin and Nout representing the input and output channel dimensions, respectively.
(2) UNET Model
In a traditional CNN, features are extracted through the convolution and pooling layers, and the final parameters are determined through backpropagation. In this process, shallow features are gradually discarded and deep features are mined. By contrast, the feature extraction steps of the U-shaped network model (UNET) are more complex and can be divided into an encoder and a decoder. Through the skip connections, the shallow features of the samples are retained alongside the deep features.
The UNET network model structure [49,50] is provided in Figure 6. The original structure uses two-dimensional layers; in this paper, the convolutional layers, pooling layers, and other functional layers were reduced in dimension. Accordingly, we adopted a shorter structure with shorter convolution layers and path lengths. The 1D-Unet integrates two paths: a contraction path for localized feature extraction and an expansion path for precise segmentation. The contraction path is mainly composed of convolutional and pooling layers, while the expansion path includes upsampling and convolutional layers. This encoder-decoder structure encodes the information and decodes it for output in a compressed and denoised manner. Meanwhile, the skip connection structure helps restore information lost during the convolution and pooling process. In practice, this structure improves the accuracy of the retrieval.
Figure 6. UNET Network Model.
The specific network parameters of UNET are detailed in Table 4, which provides the values of Nin and Nout, representing the input and output channel dimensions, respectively.
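To make the building blocks named above concrete, here is a minimal pure-Python sketch of a one-dimensional convolution, ReLU, average pooling, upsampling, and a single skip connection. It is not the architecture of Tables 3 and 4: the function names, kernel values, and the single-channel, one-level layout are illustrative assumptions, and batch normalization and the fully connected layer are omitted.

```python
from typing import List

Vector = List[float]


def conv1d(x: Vector, kernel: Vector) -> Vector:
    """Zero-padded 'same' 1D convolution (cross-correlation form, odd kernel length)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(padded[i + j] * w for j, w in enumerate(kernel)) for i in range(len(x))]


def relu(x: Vector) -> Vector:
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in x]


def avg_pool(x: Vector, size: int = 2) -> Vector:
    """Non-overlapping average pooling; a trailing partial window is dropped."""
    return [sum(x[i:i + size]) / size for i in range(0, len(x) - size + 1, size)]


def upsample(x: Vector, factor: int = 2) -> Vector:
    """Nearest-neighbour upsampling: repeat each value `factor` times."""
    return [v for v in x for _ in range(factor)]


def tiny_unet_1d(x: Vector, k_enc: Vector, k_mid: Vector, k_up: Vector, k_skip: Vector) -> Vector:
    """One-level encoder-decoder with a skip connection (input length must be even)."""
    shallow = relu(conv1d(x, k_enc))                # contraction path: shallow features
    deep = relu(conv1d(avg_pool(shallow), k_mid))   # deeper features at half resolution
    restored = upsample(deep)                       # expansion path: back to input length
    # Skip connection: treat [restored, shallow] as two input channels of one
    # convolution, i.e. convolve each with its own kernel and sum the results.
    from_deep = conv1d(restored, k_up)
    from_skip = conv1d(shallow, k_skip)
    return [a + b for a, b in zip(from_deep, from_skip)]


# Illustrative input and kernels (arbitrary values, not trained weights).
spectrum = [1.0, 2.0, 4.0, 3.0, 0.0, -1.0, 2.0, 5.0]
identity = [0.0, 1.0, 0.0]
output = tiny_unet_1d(spectrum, [0.25, 0.5, 0.25], identity, identity, identity)
```

Without the `from_skip` term, the output could only contain what survived pooling; the skip path is what lets full-resolution detail reach the decoder.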
Throughout the training of the convolutional neural networks, the training input consists of satellite-observed brightness temperatures from the preferred channels, the labels come from the corresponding reanalysis data, and the output is the retrieved profile. The test set inputs are satellite observations that are independent of the training samples and were recorded later than the training data. During training, the loss function measures the difference between the network output and the label. Training proceeds by iteratively backpropagating the loss, updating the weights using its derivatives, and thereby continually reducing the loss. Specifically, the experiment used the RMSE as the loss function.
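The training loop described above can be sketched on a toy problem. The example below fits a two-parameter linear model by gradient descent with the RMSE as the loss; the model, data, learning rate, and the name `train_linear` are illustrative assumptions and stand in for the much larger networks of the experiment.

```python
import math
from typing import List, Tuple


def rmse(pred: List[float], target: List[float]) -> float:
    """Root mean square error between predictions and labels."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def train_linear(xs: List[float], ys: List[float], lr: float = 0.02,
                 epochs: int = 2000) -> Tuple[float, float, List[float]]:
    """Fit y = w*x + b by gradient descent on the RMSE loss."""
    w, b = 0.0, 0.0
    history: List[float] = []
    n = len(xs)
    for _ in range(epochs):
        pred = [w * x + b for x in xs]
        loss = rmse(pred, ys)
        history.append(loss)
        if loss < 1e-12:  # perfect fit; also avoids dividing by zero below
            break
        # Chain rule: d(RMSE)/d(theta) = sum((pred - y) * d(pred)/d(theta)) / (n * RMSE)
        grad_w = sum((p - y) * x for p, y, x in zip(pred, ys, xs)) / (n * loss)
        grad_b = sum(p - y for p, y in zip(pred, ys)) / (n * loss)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, history


# Toy data generated from y = 2x + 1 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b, history = train_linear(xs, ys)
```

Because the RMSE gradient is the MSE gradient divided by twice the RMSE, its magnitude does not shrink near the optimum, so a fixed learning rate leaves a small residual oscillation; practical training usually pairs such a loss with a decaying or adaptive step size.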

Analytical Method
In this experiment, we selected clear-sky sample data from FY-3E/HIRAS-II over the sea area south of India (25°N~25°S, 45°E~100°E) during on-orbit operation and divided them chronologically into a training set and a test set. Figure 7a shows the distribution of the training samples, plotted as the FY-3E/HIRAS-II brightness temperature at 890 cm⁻¹ after clear-sky pixel selection. Figure 7b shows the distribution of the test samples, plotted as the observed brightness temperature at 890 cm⁻¹ after pixel selection. The input data of the neural network are the FY-3E/HIRAS-II brightness temperatures in the selected spectral channels. The label data are the gas composition profiles from the ERA5 and EAC4 reanalysis data, matched in time and space with the FY-3E/HIRAS-II observations. The output data are the gas composition profiles obtained through neural network retrieval. For the training set, we selected FY-3E/HIRAS-II brightness temperature data from 21 December 2021 to 9 January 2022, a total of 67,472 spectral samples. We randomly set aside 20% of these samples as a validation set for performance evaluation during training. For the test set, we selected FY-3E/HIRAS-II brightness temperature data from 10 to 18 January 2022, totaling 15,315 samples.
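The split described above, chronological between training and test and random within the training period, can be sketched as follows. The record layout (`id`, `time`), the function names, and the one-sample-per-day demo data are hypothetical; only the date boundaries and the 20% fraction come from the text.

```python
import random
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Sample = Dict[str, object]


def chronological_split(samples: List[Sample], test_start: datetime) -> Tuple[List[Sample], List[Sample]]:
    """Samples observed before `test_start` go to training, the rest to test."""
    train = [s for s in samples if s["time"] < test_start]
    test = [s for s in samples if s["time"] >= test_start]
    return train, test


def random_holdout(train: List[Sample], val_fraction: float = 0.2,
                   seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Randomly set aside a fraction of the training samples for validation."""
    rng = random.Random(seed)
    shuffled = list(train)
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical demo data: one sample per day from 21 Dec 2021 to 18 Jan 2022.
first_day = datetime(2021, 12, 21)
samples = [{"id": i, "time": first_day + timedelta(days=i)} for i in range(29)]

train_all, test = chronological_split(samples, datetime(2022, 1, 10))
fit, val = random_holdout(train_all, val_fraction=0.2)
```

Keeping the test period strictly later than the training period means the test score reflects performance on unseen days, whereas the random validation subset shares its days with the training data.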
This experiment applied the MPE and RMSE to compare and analyze the accuracy of the retrieval outcomes, and used the determination coefficient (R²) to evaluate the effectiveness of the model. In the calculation formulas, ŷ_i denotes the retrieved value, y_i represents the actual value, and N is the number of samples.
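A minimal sketch of the three metrics follows, assuming the common textbook definitions: MPE = (100/N) Σ (ŷ_i − y_i)/y_i, RMSE = sqrt((1/N) Σ (ŷ_i − y_i)²), and R² = 1 − Σ (y_i − ŷ_i)² / Σ (y_i − ȳ)². The exact form of the MPE used in the experiment is an assumption here, and the function names are illustrative.

```python
import math
from typing import List


def mpe(pred: List[float], actual: List[float]) -> float:
    """Mean percentage error in percent (assumed form); undefined if any actual value is 0."""
    return 100.0 * sum((p - a) / a for p, a in zip(pred, actual)) / len(actual)


def rmse(pred: List[float], actual: List[float]) -> float:
    """Root mean square error, in the same units as the data."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))


def r_squared(pred: List[float], actual: List[float]) -> float:
    """Determination coefficient: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(pred, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (not experiment data).
actual = [1.0, 2.0, 4.0]
pred = [1.1, 1.8, 4.4]
scores = {"MPE": mpe(pred, actual), "RMSE": rmse(pred, actual), "R2": r_squared(pred, actual)}
```

Because the MPE divides by the actual value, it grows very large where concentrations approach zero, which is worth keeping in mind when reading percentage errors at levels with little gas.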

Evaluation of Model Training and Test
In this study, network models were constructed for each gas, and the different models were used to retrieve the corresponding gas composition profiles. The profiles of O3, CO, and CH4 were then verified against relevant data. The accuracy of the retrieval results was analyzed in two ways: through the model validation performance and through comparison of the retrieved gas composition with other datasets.
In this paper, CNN and UNET models were built for the three gas components. We preprocessed sea-surface clear-sky samples in the target area, trained the networks with the training and validation sets, and used the test set to run the retrieval calculations. Figures 8 to 10 show scatter plots of the retrieved concentration of each of the three gas components against the label data (ERA5 and EAC4). As shown in Figure 8, in subfigure (a) the x-axis and y-axis represent the model output for the validation set (drawn from the training set) and the corresponding label data, respectively; in subfigure (b) they represent the model output for the test set and the corresponding label data. The coordinates are gas concentrations in kg/kg. The blue points are the CNN results, and the green points are the UNET results.
As shown in Figure 8, because the O3 concentration differs greatly between levels, the samples appear stratified, and most of the data are concentrated near the red line in both the training set and the test set. Taken over all levels, the CNN and UNET models performed similarly for O3, with determination coefficients greater than 0.998 on the validation set and greater than 0.996 on the test set. These results suggest strong generalization ability without clear signs of overfitting. The generalization ability of the CNN for O3 retrieval is slightly better than that of UNET.
When comparing the scatter plots in Figure 9, it becomes apparent that some CO sample points diverge more from the line. For the test set within the range of 0-700 hPa, the CNN and UNET models have determination coefficients (R²) of 0.920 and 0.912, respectively, which fall somewhat short of the validation set results. Additionally, Table 5 presents the evaluation indexes of the retrieval data at various levels. For the test set in the 0-700 hPa range, the determination coefficients exceed 0.9 and the RMSE is below 1 × 10⁻⁸ kg/kg. Table 3 shows that each index degrades in the middle and lower troposphere, from 700 hPa to 1000 hPa, near the surface. This degradation could be attributed to the influence of water vapor; in addition, the fitting of CO is affected by the large concentration changes in the lower atmospheric layers, including the surface state.
Figure 10 demonstrates that the scatter of the CH4 test set samples is relatively concentrated. The R² values for the CNN and UNET models are 0.9814 and 0.9767, respectively, comparable to the results of the validation set. The overall ME and MPE indices of both models are very low, and an MPE of less than 0.1% suggests uniformity in the CH4 retrieval data, indicating that the overall results of the models are relatively ideal.
Overall, the performance of the typical CNN and of the UNET model, which links deep and shallow features, is comparable. The results suggest that, while UNET is marginally better than CNN at extracting sample features, its generalization ability is slightly inferior to that of CNN; both models achieved good results. The neural network retrieval of the O3 and CH4 profiles performed well. The retrieval of CO also performed well in the range of 0 to 700 hPa, but the results in the range of 700 to 1000 hPa were worse and diverged considerably from the true values.

Analysis of O3 Retrieval Results
The retrieval results over the experimental area are averaged at each pressure level, showing how the O3 concentration varies with height across the region. Figure 11 shows that between 100 and 1000 hPa the average O3 concentration is less than 10⁻⁷ kg/kg. The O3 concentration changes more markedly above 80 hPa and peaks near the 10 hPa level. Overall, from the top to the bottom of the atmosphere, the O3 concentration first increases rapidly, then decreases rapidly, and finally settles at a low and stable value. In the figure, the green line represents the O3 reanalysis profile from the ERA5 label data, the blue lines with different markers depict the O3 profiles of two distinct satellite instrument products, and the purple lines with different markers represent the O3 profiles of two forecast datasets. The colors and markers have the same meanings in the subsequent figures. The retrieval results are broadly consistent with the trend over the entire layer in both the satellite product data and the forecast data.
Figure 11. O3 profiles by different datasets.
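The level-wise regional averaging used for these profile plots can be sketched in a few lines. The data layout (one list of level values per sample), the function names, and the numbers are illustrative assumptions in arbitrary units, not experiment data.

```python
from typing import List


def mean_profile(profiles: List[List[float]]) -> List[float]:
    """Average a set of profiles level by level (all profiles share the same levels)."""
    n = len(profiles)
    return [sum(level_values) / n for level_values in zip(*profiles)]


def peak_pressure(pressures: List[float], profile: List[float]) -> float:
    """Pressure level at which the profile reaches its maximum."""
    i_max = max(range(len(profile)), key=lambda i: profile[i])
    return pressures[i_max]


# Illustrative values: 2 samples on 4 pressure levels (hPa), arbitrary units.
pressures = [1.0, 10.0, 100.0, 1000.0]
profiles = [
    [1.0, 8.0, 2.0, 0.5],
    [3.0, 10.0, 4.0, 1.5],
]
regional_mean = mean_profile(profiles)
peak = peak_pressure(pressures, regional_mean)
```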

Comparison of O3 between Retrieval Results and Forecast Data
One of the primary objectives of atmospheric composition retrieval is to produce precise initial values for global numerical prediction models, which incorporate atmospheric models and regional climate models. This experiment therefore assessed the O3 retrieval results against advanced forecast model data.
As shown in Figure 12, compared with ERA5, WACCM has positive deviations in most layers: the absolute MPE over all layers is 18.69% and the maximum absolute MPE is 45.48%, while the RMSE over all layers is 1.74 × 10⁻⁷ kg/kg and the maximum RMSE is 1.23 × 10⁻⁶ kg/kg. GFS has negative deviations in most levels: the absolute MPE over all layers is 15.08% and the maximum absolute MPE is 27.13%, while the RMSE over all layers is 2.52 × 10⁻⁷ kg/kg and the maximum RMSE is 1.92 × 10⁻⁶ kg/kg. For the CNN and UNET retrievals relative to ERA5, the absolute MPEs over all layers were 7.59% and 7.06%, respectively, with the maximum absolute MPE below 15%, and the RMSEs over all layers were 1.33 × 10⁻⁷ kg/kg and 1.43 × 10⁻⁷ kg/kg, with the maximum RMSE below 7.5 × 10⁻⁷ kg/kg. The deviation of the O3 profile from the neural network models was lower than that of both forecast models.

Comparison of O3 between Retrieval Results and Similar Satellite Products
To improve the credibility and reliability of the retrieval results, they were compared with the Level-2 products of AIRS and IASI. The O3 profile products were chosen from overpasses of the same experimental area as HIRAS-II within the same period. The profiles were interpolated to 37 pressure levels. Finally, the retrieved O3 profiles were compared with the Level-2 products of AIRS and IASI, respectively.
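Bringing products onto a common set of pressure levels can be sketched as below. The text states only the number of levels, so the interpolation scheme here (linear in the logarithm of pressure, with values held constant outside the source range) is an assumption, as are the function name and the demo values.

```python
import math
from bisect import bisect_left
from typing import List


def interp_log_pressure(p_src: List[float], v_src: List[float], p_dst: List[float]) -> List[float]:
    """Interpolate a profile linearly in log-pressure.

    `p_src` must be strictly increasing and positive. Targets outside the
    source range take the nearest end value (no extrapolation).
    """
    x = [math.log(p) for p in p_src]
    out: List[float] = []
    for p in p_dst:
        t = math.log(p)
        if t <= x[0]:
            out.append(v_src[0])
        elif t >= x[-1]:
            out.append(v_src[-1])
        else:
            j = bisect_left(x, t)  # x[j-1] < t <= x[j]
            weight = (t - x[j - 1]) / (x[j] - x[j - 1])
            out.append(v_src[j - 1] + weight * (v_src[j] - v_src[j - 1]))
    return out


# Illustrative source profile on three levels (hPa), arbitrary units.
p_src = [10.0, 100.0, 1000.0]
v_src = [1.0, 2.0, 3.0]
p_dst = [5.0, 10.0, math.sqrt(10.0 * 100.0), 100.0, 2000.0]
demo = interp_log_pressure(p_src, v_src, p_dst)
```

Interpolating in log-pressure rather than pressure treats the thin upper levels and the thick lower levels more evenly, which is one common reason for choosing it.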
In Figure 13, the absolute MPE over all layers and the maximum absolute MPE for AIRS compared with ERA5 were 11.90% and 31.35%, while the RMSE over all layers and the maximum RMSE were 2.21 × 10⁻⁷ kg/kg and 1.14 × 10⁻⁶ kg/kg. For IASI, the absolute MPE over all layers and the maximum absolute MPE were 10.96% and 30.86%, while the RMSE over all layers and the maximum RMSE were 2.52 × 10⁻⁷ kg/kg and 9.20 × 10⁻⁷ kg/kg. The retrieval results were mostly superior to the AIRS and IASI satellite products within the pressure range of 0~100 hPa. Within the pressure range of 100~1000 hPa, where the O3 concentrations are smaller and closer to zero, the MAE percentage and the RMSE of these datasets were acceptable.

Analysis of CO Retrieval Results
As before, averaging the retrieval results at each pressure level within the experimental area reflects the concentration trend of the CO profile at different levels in the region. The result is shown in Figure 14: the CO concentration remained low at the top of the atmosphere, between 0 and 100 hPa. Between 150 and 700 hPa, the CO concentration increased to around 7 × 10⁻⁸ kg/kg and remained relatively constant, while near the surface, from 700 hPa to 1000 hPa, the CO concentration was relatively high. The two satellite products agree with the retrieval and the other datasets in order of magnitude, but differences exist in some pressure layers. By contrast, the retrieval results are much closer to the reanalysis and forecast data.

Figure 15 shows that the concentration of CO in the upper atmosphere is low, which can lead to large percentage errors. After excluding the 0-50 hPa data, the absolute MPE over all layers and the maximum absolute MPE for WACCM compared with EAC4 were 14.69% and 30.64%, while the RMSE over all layers and the maximum RMSE were 3.34 × 10⁻⁸ kg/kg and 9.69 × 10⁻⁸ kg/kg. For the CNN and UNET retrievals relative to EAC4, the absolute MPEs over all layers were 10.70% and 6.93%, with the maximum absolute MPE below 18%, and the RMSEs over all layers were 3.69 × 10⁻⁸ kg/kg and 3.77 × 10⁻⁸ kg/kg, with the maximum RMSE below 1.4 × 10⁻⁷ kg/kg. The RMSE of both the retrieval results and the forecast data increased gradually from 700 hPa to 1000 hPa, while the absolute MPE fluctuated only slightly overall. In this range, as in the range from 200 to 500 hPa, the retrieval results are not as good as WACCM. This also indicates that the sample values there differ more from the reanalysis data than at other pressure levels. A possible reason is the substantial variation in the concentration of near-surface CO.
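The effect of excluding the uppermost levels can be illustrated with a small sketch that computes per-level errors and then summarizes them. Reading "over all layers" as the mean of the per-level values is an assumption, as are the function name, the data layout, and the demo numbers.

```python
import math
from typing import Dict, List


def level_stats(pressures: List[float], retrieved: List[List[float]],
                reference: List[List[float]], p_min: float = 50.0,
                p_max: float = 1000.0) -> Dict[str, float]:
    """Summarize per-level |MPE| (percent) and RMSE over levels within [p_min, p_max] hPa."""
    abs_mpe: List[float] = []
    rmse: List[float] = []
    for k, p in enumerate(pressures):
        if not (p_min <= p <= p_max):
            continue  # skip excluded levels, e.g. the 0-50 hPa range
        pairs = [(r[k], t[k]) for r, t in zip(retrieved, reference)]
        mpe_k = 100.0 * sum((r - t) / t for r, t in pairs) / len(pairs)
        rmse_k = math.sqrt(sum((r - t) ** 2 for r, t in pairs) / len(pairs))
        abs_mpe.append(abs(mpe_k))
        rmse.append(rmse_k)
    return {
        "mean_abs_mpe": sum(abs_mpe) / len(abs_mpe),
        "max_abs_mpe": max(abs_mpe),
        "mean_rmse": sum(rmse) / len(rmse),
        "max_rmse": max(rmse),
    }


# Illustrative values: 2 samples on 3 levels (hPa); the 10 hPa level has a
# small reference value, so a modest absolute error is a large percentage.
pressures = [10.0, 100.0, 500.0]
reference = [[1.0, 2.0, 4.0], [1.0, 2.0, 4.0]]
retrieved = [[2.0, 2.2, 4.0], [2.0, 2.2, 3.6]]

stats_filtered = level_stats(pressures, retrieved, reference)            # 50-1000 hPa
stats_all = level_stats(pressures, retrieved, reference, p_min=0.0)      # all levels
```

In the demo, the maximum absolute MPE drops from about 100% to about 10% once the 10 hPa level is excluded, while the RMSE summary is much less sensitive to that level's small reference value.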

Comparison of CO between Retrieval Results and Similar Satellite Products
Since AIRS and IASI lack data at some levels, the missing levels are not shown in Figure 16. Consistent with the prior approach, the 0 to 50 hPa data are excluded. In the range of 50 to 900 hPa, the absolute MPE over all layers and the maximum absolute MPE for AIRS compared with EAC4 were 29.70% and 45.63%, while the RMSE over all layers and the maximum RMSE were 4.01 × 10⁻⁸ kg/kg and 1.20 × 10⁻⁷ kg/kg. For IASI, the absolute MPE over all layers and the maximum absolute MPE compared with EAC4 were 36.90% and 79.02%, while the RMSE over all layers and the maximum RMSE were 4.26 × 10⁻⁸ kg/kg and 8.65 × 10⁻⁸ kg/kg. In the same range, the absolute MPEs over all layers for the CNN and UNET retrievals relative to EAC4 were 10.42% and 7.93%, with the maximum absolute MPE below 18%, while the RMSEs over all layers were 2.21 × 10⁻⁸ kg/kg and 2.22 × 10⁻⁸ kg/kg, with the maximum RMSE below 7.5 × 10⁻⁸ kg/kg. The CO retrieval results in the range of 50 hPa to 900 hPa were therefore better than the AIRS and IASI products. It should also be noted that the errors of AIRS and IASI increase toward the surface, indicating that retrieving CO there is very challenging for both pattern retrieval and statistical regression.
Figure 17 shows that the CH4 concentration between 100 and 1000 hPa fluctuates stably around an average of about 1.0 × 10⁻⁶ kg/kg, while the concentration in the upper layer, 0~100 hPa, is relatively low. The forecast data and the satellite product data show higher methane concentrations than the retrieval and reanalysis data.

Comparison of CH4 between Retrieval Results and Forecast Data
As shown in Figure 18, the methane concentration obtained by model retrieval is generally lower than that from WACCM. After excluding the 0-50 hPa data, in the range of 50 to 900 hPa, the absolute MPE over all layers and the maximum absolute MPE for WACCM compared with EAC4 were 6.83% and 7.42%, while the RMSE over all layers and the maximum RMSE were 6.94 × 10⁻⁸ kg/kg and 8.77 × 10⁻⁸ kg/kg. For the CNN and UNET retrievals relative to EAC4, the absolute MPEs over all layers were 0.62% and 0.70%, with the maximum absolute MPE below 1%, and the RMSEs over all layers were both 1.43 × 10⁻⁸ kg/kg, with the maximum RMSE below 5.0 × 10⁻⁸ kg/kg. The CH4 retrieval is thus a significant improvement over the WACCM forecast data.

Comparison of CH4 between Retrieval Results and Similar Satellite Products
In Figure 19, similar to the comparison with WACCM, the CH4 concentration obtained by the model retrieval was generally lower than that obtained by AIRS. After excluding the 0~50 hPa data, in the range of 50 to 900 hPa, the absolute MPE over all layers and the maximum absolute MPE of AIRS relative to the EAC4 data were 5.30% and 8.07%, while the RMSE over all layers and the maximum RMSE were 5.36 × 10⁻⁸ kg/kg and 8.09 × 10⁻⁸ kg/kg. The CH4 retrieval is therefore markedly more accurate than the AIRS data.

Discussion
The FY-3E satellite is the fifth satellite in China's second-generation polar-orbiting meteorological satellite series and the world's first civil operational meteorological satellite in a sun-synchronous dawn-dusk orbit. The infrared hyperspectral payload HIRAS-II is a crucial component of FY-3E and is of great significance for developing its atmospheric remote sensing applications. This study employs FY-3E/HIRAS-II data to retrieve atmospheric gas profiles of ozone (O3), carbon monoxide (CO), and methane (CH4). The physical method requires solving the complicated atmospheric radiative transfer equation and extensive auxiliary data, which is time-consuming and laborious and limits its capacity for quick retrieval. Neural networks, by contrast, are increasingly used in atmospheric remote sensing because they possess the characteristics of self-adaptation, self-organization, and real-time learning. This experiment demonstrates that atmospheric composition profiles can be retrieved quickly from FY-3E/HIRAS-II data. We compared the results with internationally advanced forecast data and with international atmospheric products from similar instruments. The O3 and CH4 profiles retrieved by the neural network model show clear accuracy advantages in most pressure layers over both the forecast data (WACCM) and the satellite products (AIRS and IASI). For CO, the retrieval error was higher than that of the forecast data at 200~500 hPa but lower than that of similar satellite products at most pressure levels. Based on these preliminary results, the retrieval experiment achieves high accuracy.
Several aspects of the retrieval results merit discussion. One factor that affects the retrieval results is the sensitive channel of each gas. As depicted in Figure 4, the CH4 channel close to 1306.2 cm⁻¹ is minimally affected by other gases or water vapor, which yields the most accurate retrieval of the three gases, with a maximum percentage error of less than 1%. Similarly, the O3-sensitive channel range between 1000 cm⁻¹ and 1080 cm⁻¹ shows minimal interference from other gas signals, and the O3 sensitivity is relatively high, which produces favorable retrieval results, especially in the high-concentration range of 0-50 hPa. In the other pressure layers, where some concentrations are near zero, the relative errors remain within acceptable limits. However, the influence of water vapor on CO remains significant in the sensitive channel range from 2080 cm⁻¹ to 2200 cm⁻¹. Although higher spectral resolution increases the CO information content, relatively large retrieval errors can still occur. In addition, analyzing the distribution and errors of the gas profiles from the top of the atmosphere to the surface reveals that the retrieval results are poorer near the surface, especially for CO. Crevoisier [48] offered an explanation for this problem: although infrared sounders provide comprehensive spatio-temporal coverage and contribute significantly to our understanding of the three-dimensional atmosphere, they are still limited in their sensitivity to the lower troposphere near the surface.
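As an illustration of the sensitive spectral ranges mentioned above, the following minimal sketch picks candidate channels from a wavenumber grid. The grid (650 to 2550 cm⁻¹ in 0.625 cm⁻¹ steps) and the helper names are assumptions made for this example only, and the sketch is not the improved channel selection method used in this study.

```python
# Spectral ranges quoted in the discussion (cm^-1).
O3_WINDOW = (1000.0, 1080.0)
CO_WINDOW = (2080.0, 2200.0)
CH4_CENTER = 1306.2  # only a single channel position is quoted for CH4

def channels_in_window(wavenumbers, low, high):
    """Return indices of channels with low <= wavenumber <= high."""
    return [i for i, v in enumerate(wavenumbers) if low <= v <= high]

def nearest_channel(wavenumbers, center):
    """Return the index of the channel closest to the given wavenumber."""
    return min(range(len(wavenumbers)), key=lambda i: abs(wavenumbers[i] - center))

# Hypothetical channel grid (an assumption, not taken from the text above).
grid = [650.0 + 0.625 * i for i in range(3041)]

o3_candidates = channels_in_window(grid, *O3_WINDOW)
co_candidates = channels_in_window(grid, *CO_WINDOW)
ch4_channel = nearest_channel(grid, CH4_CENTER)
```

Such windows only delimit candidate channels; any further screening, for example by noise level or by interference from water vapor, would reduce the candidates to a smaller final set.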
Due to the absence of sounding data for O3, CO, and CH4, this study employs the reanalysis datasets ERA5 (O3) and EAC4 (CO, CH4) for comparison. Although no measured data are therefore used, the test and training sets are independent of one another, and the test set is later in time than the training set, which is a reasonable experimental design and has yielded positive retrieval results. However, this study has a few limitations that must be addressed in future research.
(1) In this experiment, we selected sea surface data from a fixed geographic area, so the samples have limited breadth in time and space. Seasonal weather changes may alter the sample spectra in the region over time. Additionally, the concentration distributions of certain gases vary with latitude, which requires continued study. In the future, cloud computing could be leveraged to perform distributed retrievals of global gas profiles, revealing the spatio-temporal variation of these gases and improving the accuracy and speed of retrieval.
(2) Compared with the traditional physical retrieval method, statistical regression methods, including neural networks, retrieve faster. However, it is challenging to construct an efficient two-dimensional image structure because cloud-contaminated, non-clear-sky samples must be eliminated. Retrieving single clear-sky samples in isolation loses spatial characteristics, producing maps with noisy points and insufficiently smooth retrievals across the region. Addressing this issue and optimizing the spatial structure of the samples to refine the retrieval is a topic that merits further exploration.
The results of this experiment show that the gas profiles retrieved by the neural network model combine high precision with a fast response time, which can provide a reference for operational production of related products and a basis for improving algorithms for follow-up instruments. The FY-3E/HIRAS-II retrieval model proposed in this experiment therefore has broad application prospects. Additionally, in future experiments, combining data from multiple instruments can improve the temporal resolution achievable with polar-orbiting meteorological satellites, which is significant for the continuous observation of gas components.

Conclusions
The neural network model in this experiment was built on data from HIRAS-II, the new-generation hyperspectral infrared sounder on the FY-3E polar-orbiting satellite. The model was trained using satellite observation data and reanalysis data from 21 December 2021 to 9 January 2022, while the test set for the retrieval model covered 10 to 18 January 2022. After clear-sky pixel screening in the experimental area, an improved channel selection method was used for channel optimization, and 96 O3 channels, 76 CO channels, and 150 CH4 channels were selected based on their noise signals, which are higher than HIRAS-II cold air. Overall, the performance of the typical CNN and of the representative UNET model, which links deep and shallow features, is comparable. Retrieval calculations of O3, CO, and CH4 were carried out to obtain the profile concentration maps of each gas. These results indicate how each gas varies across the different atmospheric levels. In the scatter plots and statistics from model testing, the R² difference between the two methods is less than 0.01, and the generalization ability of the CNN is slightly higher than that of the UNET. The retrieval results show that both the CNN and UNET models achieve good retrievals, and the retrieval accuracy of the CNN is slightly higher than that of the UNET (0.53% for O3, 3.77% for CO, and 0.08% for CH4). When only spectral structure information is used, there is no need to deepen the network further; a CNN is sufficient to meet the retrieval needs. This experiment demonstrates high retrieval accuracy, which can provide a reference for improving numerical forecast accuracy. The atmospheric composition profiles obtained through the neural network retrieval are, to some extent, better than the forecast data and similar international satellite products.
Many gas molecules absorb in the infrared spectral region. Whether infrared hyperspectral data can be used to retrieve the content of some atmospheric gases with higher precision and speed has long been one of the directions for applying such data. This experiment presents a new approach to retrieval from infrared hyperspectral data that can broaden their application prospects. The methods and results of this experiment can provide a reference for applications that exploit the capabilities of FY-3E/HIRAS-II.
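The time-separated training and test periods stated above can be sketched as a simple date-based split. This is only an illustration under the assumption that each sample carries its observation date; the `(observation_date, payload)` tuple layout and the function name are hypothetical.

```python
from datetime import date

# Periods stated in the text.
TRAIN_START, TRAIN_END = date(2021, 12, 21), date(2022, 1, 9)
TEST_START, TEST_END = date(2022, 1, 10), date(2022, 1, 18)

def split_by_date(samples):
    """Split (observation_date, payload) pairs into training and test lists.

    Samples outside both periods are dropped, so every test sample is
    strictly later in time than every training sample.
    """
    train = [s for s in samples if TRAIN_START <= s[0] <= TRAIN_END]
    test = [s for s in samples if TEST_START <= s[0] <= TEST_END]
    return train, test
```

Splitting by date rather than at random keeps the test set later than the training set, so test samples cannot come from the same days as training samples.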