Spectral replacement using machine learning methods for continuous mapping of Geostationary Environment Monitoring Spectrometer (GEMS)

Abstract. Earth radiance in the form of hyperspectral data contains useful information on atmospheric constituents and aerosol properties. The Geostationary Environment Monitoring Spectrometer (GEMS) is an environmental sensor measuring such hyperspectral data in the ultraviolet and visible (UV/VIS) spectral range over the Asia-Pacific region. After the in-orbit test of GEMS was successfully completed in October 2020, bad pixels remained as a calibration issue requiring follow-up treatment. Currently, one-dimensional interpolation in the spatial direction is performed operationally to replace the erroneous pixels of GEMS, which causes high interpolation error for wider defect areas on the detector array. To resolve this issue, this study proposes machine learning methods based on an artificial neural network (ANN) and multivariate linear regression (Linear) to fill in the spectral gap of defective spectra. The machine learning models are trained with normal measurements to emulate the spectral relations between input and output radiances in a spectrum. For efficient training, the dimensionality of the input radiances is reduced with principal component analysis (PCA) prior to the training process. The results show that the defect area at the wavelengths of strong absorption lines is better replaced with PCA-ANN, with an error of 5 %, while PCA-Linear is better at reproducing radiances that are strongly correlated with the input radiances. The shorter the spectral range of the output radiances, the smaller the prediction error with PCA-Linear (0.5–5 %). Spectral and spatial discontinuities caused by real bad pixels can be significantly reduced with the trained machine learning models, especially for wide defect areas.
This study verifies that spectral relations of radiances in the UV/VIS spectrum are successfully reproduced with a simple machine learning model, which has high potential to be investigated further for enhancing measurement quality of environmental satellite measurements.


The analysis applied in the first draft could not quantitatively prove the validity of the suggested methods. Following the two referees' recommendations, we targeted an area including each defective region (Defects 1-3) and its surroundings (100 indices toward both the north and the south) where actual measurements (regarded as 'true') could be obtained. The following part has been inserted into Section 3.2.

Spatial and spectral inspection
For the quantitative evaluation of the reproduced spectra, certain areas are targeted which include each defect (Defects 1-3) and its surroundings where actual measurements regarded as 'true' could be obtained. The evaluation is made with the data measured on 10 March 2021 (06 UTC), which are not used for the model training. The center longitude of the areas is set to 128° E, which is identical to the sub-nadir longitude of GK-2B. Along the spectral direction, we focus on a specific spectral range within the whole spectral gap of Defects 1-3, as shown in Table 3. Specifying the range helps to closely analyze the spectral patterns of absorption lines of trace gases and cloud properties. Table 3 presents the spectral ranges of Defects 1-3 and the target wavelengths for the analysis.
Table 1 The spectral range of Defects 1-3 and target wavelengths for the analysis. The third column presents GEMS retrieval products of which each fitting window overlaps with Defects 1-3.
For the evaluation, actual GEMS radiances and the radiances reproduced with machine learning methods are directly compared, hereafter called GEMS radiances and ML radiances. Each column shows the GEMS radiances, the ML radiances and the difference, while the first and second rows show the representative wavelengths for the smallest and the largest difference, respectively. Figure 11 shows the comparison results for the Defect 3 area, which shows the best performance compared to the Defect 1-2 areas. The difference in Fig. 11 is close to zero (within ± 0.5%) because the spectral gap of Defect 3 is narrower than those of Defects 1-2. The narrower the spectral range of the output radiances, the more information can be obtained from the input radiances. For Defect 3, there is no scene dependence over the output wavelengths, and the difference shows noise-like features except for a spatial dependence which might originate from instrument artifacts.
Figure 1 The GEMS radiances, ML radiances and the difference, from left to right, at the wavelengths presenting (a) the smallest and (b) the largest difference for the Defect 3 area. Bad pixels are marked in dark gray and the difference is calculated as (ML-GEMS)/GEMS in percent. The color bar range for the difference is ± 0.5%, and the RMSE is normalized by the mean radiance and given in percent.
Figure 12 shows the Defect 1 area, where the ML radiances are within about 5% of the GEMS radiances. It also shows that dark targets (clear sky with small radiances) show a positive difference, while bright targets (mostly cloudy sky with large radiances) show the opposite tendency. These tendencies are also found in the ML radiances on other dates with different angle conditions such as SZA and VZA. It seems that the applied machine learning model (PCA-Linear) might not be fully trained to resolve the different atmospheric conditions and radiances, which causes a scene-dependent bias.
For the Defect 2 area, it is clear that the information from valid radiances at wavelengths longer than 400 nm is insufficient to effectively reproduce the spectral features at shorter wavelengths (consistent with the results in Figs. 8-9). The output spectral lengths of Defects 2-3 are nearly identical at around 100 nm, but it seems that radiances near 300 nm need more information to be successfully reproduced. The striping feature found in Fig. 12b is significant at 312 nm for the ML radiances, while it is not at 357.2 nm in Fig. 12a. The striping feature seems to be added during the reproducing process, especially for shorter wavelengths, and the reason is still unclear. Another distinct feature found in Fig. 12 is that the difference in the northern part is very large, with a difference of 10%. We suspect that the reason might be the VZA effect, considering that VZA increases toward the northern part of the area. Without the angle conditions among the input parameters of the model, the difference doubles at 312 nm, with patterns similar to the difference in Fig. 12b. This indicates that the angle effect can be emulated in the model by applying VZA and SZA as input parameters, but it is not fully resolved, especially for the radiances at shorter wavelengths.
A closer inspection is performed to analyze the general spectral features over the target wavelengths. For each defect area in Figs. 11-13, the collected spectra are divided into four groups depending on the scene brightness, considering that the ML radiances could have different systematic biases depending on the scene. With these data, the mean difference is calculated for each wavelength. As found in Fig. 11, Fig. 14a shows that the ML radiances over dark scenes have a positive bias, while brighter scenes have a negative bias. It is interesting that the scene dependence is significant only for Defect 1. Figure 14b indicates that the ML radiances are overestimated except for the very bright scenes. It should be noted that the y-axis range of Fig. 14b is wider than in the figures for Defects 1 and 3. From these results, it can be deduced that the complicated atmospheric effects at the shorter wavelengths are difficult to emulate and that instrument artifacts such as stray light would also affect the reproducing process. Figure 14c shows a relatively large difference at the spectral peaks, but the difference is generally smaller than 0.2%.
Besides the shorter wavelengths of Defect 2, the mean ML radiance and its difference from the GEMS radiances are presented by targeting the Fraunhofer lines from 390 to 400 nm (see Fig. 14). The Ring effect caused by rotational Raman scattering can be found over the two peaks in Fig. 14a, which is generally known to be very small and largely affected by clouds (Joiner et al., 1995). Figure 14b shows that PCA-ANN reproduces the dominant features at the peaks very well, within 0.6% on average, but it seems that the difference increases for darker scenes, where the Ring effect becomes stronger. This indicates that the ML radiances would need additional information to successfully reproduce the exact spectral features, especially for very small signals such as the Ring effect.
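The brightness-grouped comparison used in this inspection can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing code behind the figures: the function name `quartile_mean_difference` is hypothetical, scene brightness is assumed to be the mean measured radiance of each spectrum, and the four groups are assumed to be split at the quartiles of that brightness.

```python
import numpy as np


def quartile_mean_difference(gems, ml):
    """Mean relative difference (ML-GEMS)/GEMS in percent for four brightness groups.

    gems, ml: arrays of shape (n_spectra, n_wavelengths) holding the measured
    and the reproduced radiances at the same wavelengths.
    Returns an array of shape (4, n_wavelengths); row 0 is the darkest group.
    """
    gems = np.asarray(gems, dtype=float)
    ml = np.asarray(ml, dtype=float)

    # Relative difference in percent, as defined in the figure captions.
    diff_pct = 100.0 * (ml - gems) / gems

    # Assumption: scene brightness is the mean measured radiance per spectrum.
    brightness = gems.mean(axis=1)
    q1, q2, q3 = np.percentile(brightness, [25, 50, 75])

    # Group index 0..3: below Q1, Q1 to Q2, Q2 to Q3, at or above Q3.
    group = np.digitize(brightness, [q1, q2, q3])

    # Average the difference over the spectra of each group, per wavelength.
    return np.stack([diff_pct[group == g].mean(axis=0) for g in range(4)])
```

A positive value in the first row together with a negative value in the last row would correspond to the scene dependence described for Defect 1 (dark scenes overestimated, bright scenes underestimated).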

PCA-based spectral analysis
As applied in the pre-processing step of our research, PCA is a very useful tool for capturing the meaningful variance along the spectral direction, and it has been widely used to retrieve environmental and surface properties (Horler and Ahern, 1986; Joiner et al., 2016; Li et al., 2013, 2015). To further investigate the spectral patterns, we apply PCA to the GEMS radiances (except for bad pixels) at the target wavelengths (see Table 3) collected within each area in Figs. 11-13. With PCA, various spectral patterns are compressed, and a spectrum can be projected onto the PC subspace by multiplying it with the constructed PC matrix (eigenvector matrix). This implies that if a spectrum has disparate spectral patterns, its projected PC scores will also have distinct values compared with the PC scores of the GEMS radiances.
Figure 15 presents the results of projecting both GEMS and ML radiances with PCA. For the inspection, the Defect 3 area is presented, which has the wider defective width along the north-south direction. Because the first PC scores represent the mean radiances, the second PC scores are used for the analysis. As we assumed, bad pixels in Fig. 15a show disparate values because the spectral patterns of the interpolated spectra are inconsistent with the GEMS radiances. The ML radiances in Fig. 15b show spatially homogeneous PC scores, which indicates that the machine learning methods could properly reproduce the dominant spectral patterns, in this case those of the second PC.
The dominant patterns for each PC are presented in Fig. 16 with GEMS radiances for the target wavelengths of Defects 1-3. Each color indicates the eigenvector of one of the first to sixth PCs, which determines how each PC score of a spectrum contributes to the original spectrum. Li et al. (2015) verified that the leading PC scores from the UV/VIS backscattered radiation (shorter than 360 nm) are significantly correlated with dominant absorption features and surface properties. The trailing PC scores might be associated with instrument artifacts and other unresolved spectral features. Figure 16 shows that the first PC corresponds to the mean spectrum, as discussed in Sect. 4 with the correlation coefficient. The results indicate that the mean spectral feature (the first PC) and some dominant patterns (the second and third PCs) could be well reproduced with the suggested models, but for other spectral features, such as the fourth PC for Defect 2, it is difficult to obtain valid information from the input radiances for accurate reproduction. The magnitude of radiance from the major PCs other than the first PC might not be large, considering that even the leading PCs have a small explained variance ratio for hyperspectral data in the UV/VIS spectrum. However, it would be enough to determine the exact spectral signals, which are mostly related to the important information for the retrieval process.
"Besides, how can the spectral sampling of input/output (0.1 nm) be finer than the original GEMS data (0.2 nm)? More detailed descriptions about this are recommended. Overall, I suggest this manuscript be reconsidered after major revisions."

Response #2:
As for the spectral intervals of GEMS spectra used in the training process, the response to this comment is given below, as a similar comment appears in the following section.
• Line 82: The authors refer to each of ~700 east-west pixels as a "scan," but probably this term is not accurate. Isn't the whole ~700 pixels considered to be in one scan? Also, can GEMS cover the entire field of regard by one scan? It seems that is what the authors are implying.
The sentences have been revised as follows: "For earth measurements, GEMS measures the backscattered radiation from east to west about 700 times by moving a scan mirror, and for each scan, a total of 2048 pixels are obtained along the north-south direction. All measurements at each scan position are combined to cover the full field of regard (FOR) of GEMS."
• Line 84: Do the CCD pixel numbers presented here represent those for only photoactive pixels?
The pixel numbers provided refer to photoactive pixels by design. However, signals from some pixels at the edges of the CCD are known to be invalid, and these are flagged as low-quality pixels. This point has been added to the revised manuscript.
• Line 89: The general description of the bad pixel detection method is informative. But how about presenting how long the GEMS integration time is (by adding another sentence)?
The integration time of GEMS is 69.996409 milliseconds. This information has been added to the manuscript.
• Line 99: This sentence sounds as if the results of 1-D interpolation were presented earlier, which is not true. How about rephrasing this sentence, using a verb like "imply" instead of "indicate"?
We agree with this point. It has been corrected.
• Line 104: The subject affected by the defective pixels is the quality of ozone retrieval, not the ozone properties themselves.
Indeed. It has been corrected.
• Line 148: How can the spectral interval of input and output (0.1 nm) be narrower than that of original GEMS measurements (0.2 nm)? How are the GEMS measurement spectra sampled onto the finer grids? Please give more details here.
The detailed description of the spectral interval of input and output has been added to Line 175: For the training process, each measured spectrum is linearly interpolated onto a grid with a sampling interval of 0.1 nm, and the radiances of each spectrum are divided into input and output radiances based on the spectral ranges specified in Table 2. The training datasets must be sampled on identical spectral grids, and for that purpose each spectrum is interpolated in a pre-processing step. After the prediction, each replaced spectrum is interpolated back onto its original spectral grid. During the interpolation processes, intrinsic information in a spectrum could be lost, and thus finer spectral grids are applied to minimize interpolation errors by preserving radiances at a finer interval than the original (about 0.2 nm).
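The two interpolation steps described in this response can be sketched as follows. This is a minimal illustration assuming NumPy's `numpy.interp` for the linear interpolation; the function names `to_common_grid` and `to_native_grid` and the example wavelength values are hypothetical and are not taken from the GEMS processing code.

```python
import numpy as np


def to_common_grid(wavelength, radiance, start, stop, step=0.1):
    """Linearly interpolate one spectrum onto a regular grid (default 0.1 nm).

    wavelength: increasing native wavelengths of the pixel (about 0.2 nm apart).
    radiance:   radiances measured at those wavelengths.
    Returns the common grid and the radiances sampled on it.
    """
    n_steps = int(round((stop - start) / step))
    grid = start + step * np.arange(n_steps + 1)
    return grid, np.interp(grid, wavelength, radiance)


def to_native_grid(native_wavelength, grid, grid_radiance):
    """Interpolate a (predicted) spectrum from the common grid back to the native grid."""
    return np.interp(native_wavelength, grid, grid_radiance)


# Illustrative values only: a native grid sampled every 0.2 nm with a small
# offset, standing in for the slightly different grid of each spatial pixel.
native = 300.03 + 0.2 * np.arange(50)
spectrum = 2.0 * native + 1.0
grid, on_grid = to_common_grid(native, spectrum, start=300.1, stop=309.8)
restored = to_native_grid(native, grid, on_grid)
```

Note that `numpy.interp` holds the edge value constant outside the range of the supplied grid, so in this sketch only native wavelengths lying inside the common grid are restored meaningfully.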
• Line 149: Did you investigate how much the results changed when trained without solar zenith angle (SZA) and viewing zenith angle (VZA)? Please describe the impacts of including these variables.
The impact of angle conditions as input has been analyzed and added to Line 175:
Thanks for the correction. It is the radiance at 310 nm, and the caption has been corrected accordingly.
• Line 264: How can we tell if spectra look "reasonable"? This statement is vague. Please consider changing Figs. 11-12 to include any reference (known, good, measured) spectra for the reconstructed parts.
The response to this comment is given in the previous section.
• Line 269: I believe the term "noise" itself implies randomness, which would not necessarily be canceled in the normalized radiance. Please consider replacing the term with another, e.g., error, bias, artifact, etc.
Indeed, "artifacts" is the more appropriate term. It has been updated.
• Please consider re-writing the units in the figures as W cm-3 sr-1
Corrected.
• Please consider minor English corrections below.
These comments have been addressed in the revised manuscript.

Figure 2 Same as Fig. 11 for the Defect 1 area.

Figure 3 Same as Fig. 11 for the Defect 2 area.

Figure 4 Mean difference between ML and GEMS radiances within the target area of (a) Defect 1, (b) Defect 2 and (c) Defect 3. Each color indicates the average for each quartile, and Q1, Q2 and Q3 represent the first, second and third quartile, respectively. The difference is calculated as (ML-GEMS)/GEMS in percent.
Figure 5 (a) Mean ML radiances and (b) the difference from GEMS radiances at the Fraunhofer lines for the Defect 2 area. Each color indicates the average for each quartile, and Q1, Q2 and Q3 represent the first, second and third quartile, respectively. The difference is calculated as (ML-GEMS)/GEMS in percent.

Figure 15 The second PC of (a) actual measurements and (b) reproduced spectra on the target area for Defect 3. The PC is scaled for clarity of presentation.

Figure 6 Eigenvectors of the first to sixth PCs applied to GEMS radiances for the target wavelengths of (a) Defect 1, (b) Defect 2 and (c) Defect 3. All eigenvectors are scaled (min-max scaling) and shifted for clarity of presentation.

Figure 5 presents the convergence of the PCA-ANN model for Defect 2 with different optimizers, with and without the SZA and VZA conditions. Adding the angle conditions as input parameters speeds up the model convergence and yields a smaller MSE because, without the angle parameters, the angular information would have to be elicited implicitly during the optimization process. The model converges at 44, 98 and 33 epochs for Adam, SGD and RMSprop, respectively. Adam converges to the smallest MSE, while SGD converges to the highest MSE. RMSprop shows an unstable validation loss and converges to a higher MSE than Adam.

Figure 7 Training and validation losses for Defect 2 (a) with and (b) without the angle conditions as input parameters. The results are obtained with different optimizers: Adam (black), SGD with a gradient clipping value of 0.5 (blue) and RMSprop (orange).

Table 2 Correlation coefficient of PC scores of reproduced and actual measurements for Defects 1-3.
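The correlation coefficients of this kind can be computed along the following lines. This is a minimal sketch under stated assumptions rather than the procedure used for the table: PCA is implemented here with a mean-centered singular value decomposition in NumPy, the eigenvectors are fitted on the actual GEMS radiances only, and both the GEMS and the ML radiances are projected with the same eigenvector matrix. The function names are hypothetical, and the centering and scaling applied in the manuscript may differ.

```python
import numpy as np


def fit_pca(spectra, n_components=6):
    """Fit PCA on spectra of shape (n_spectra, n_wavelengths) via SVD.

    Returns the mean spectrum and the leading eigenvectors (one per row).
    """
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return mean, vt[:n_components]


def pc_scores(spectra, mean, components):
    """Project spectra onto the PC subspace by multiplying with the eigenvectors."""
    return (np.asarray(spectra, dtype=float) - mean) @ components.T


def pc_score_correlation(gems, ml, n_components=6):
    """Correlation between the PC scores of measured and reproduced spectra, per PC."""
    mean, components = fit_pca(gems, n_components)
    s_gems = pc_scores(gems, mean, components)
    s_ml = pc_scores(ml, mean, components)
    return np.array([
        np.corrcoef(s_gems[:, k], s_ml[:, k])[0, 1]
        for k in range(n_components)
    ])
```

Fitting the eigenvectors on the measured radiances alone keeps the PC subspace fixed, so a low correlation for a given PC points to a spectral pattern that the reproduced spectra do not follow.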