Use of multi-spectral visible and near-infrared satellite data for timely estimates of the Earth’s surface reﬂectance in cloudy and aerosol loaded conditions: Part 1 - application to RGB image restoration over land with GOME-2

Space-based quantitative passive optical remote sensing of the Earth’s surface typically involves the detection and elimination of cloud-contaminated pixels as an initial processing step. We explore a fundamentally di ↵ erent approach; we use machine learning with cloud contaminated satellite multi-spectral data to estimate underlying terrestrial surface reﬂectances at red, green, and blue (RGB) wavelengths. The NN reproduces land RGB reﬂectances with high ﬁdelity even in scenes with moderate to high cloud optical thicknesses. This implies that spectral features of the Earth’s surface can be detected and distinguished in the presence of clouds, even when they are partially obscured by clouds; the NN is able to separate the spectral ﬁngerprint of the Earth’s surface from that of the clouds, aerosols, gaseous absorption, and Rayleigh scattering, provided that there are adequately di ↵ erent spectral features and that the clouds are not completely opaque. Once trained, the NN enables rapid estimates of RGB reﬂectances with little computational cost. Aside from the training data, there is no requirement of prior information regarding the land surface spectral reﬂectance, nor is there need for radiative transfer calculations. We test di ↵ erent wavelength windows for reconstruction of surface reﬂectance. This work provides an initial example of a general approach that has many potential applications in land and ocean remote sensing as well as other practical uses such as in search and rescue, precision agriculture, and change detection.


Introduction
Surface properties derived from satellite solar backscattered ultraviolet (UV) through near-and short-wave infrared (NIR, SWIR) reflectances have a multitude of uses for studies on marine and terrestrial biogeochemistry, including the response of ecosystems to climate variability, change, and feedback processes as well as time-sensitive applications such as drought or other hazard detection (e.g., AghaKouchak et al., 2015;Anyamba and Tucker, 2012;Frouin et al., 2019;Groom et al., 2019;Yuan et al., 2014).A cornerstone of many satellite-based scientific studies involving the Earth's surface is atmospheric correction, the process of removing the e↵ects of the atmosphere and other factors, when processing satellite radiance data.The corrections need to account for the e↵ects of absorbing and scattering gases and particles in the Earth's atmosphere.Additional corrections may be needed in order to remove unwanted surface signals such as from ocean glitter.Atmospheric correction remains an active area of research for Earth remote sensing (e.g., Frouin et al., 2019).
Atmospherically-corrected reflectances over land, such as from the MODerateresolution Imaging Spectroradiometer (MODIS) or similar satellite instruments, provide measurements at several broad wavelength bands from blue to the NIR or SWIR.There are several available MODIS land surface reflectance data sets, including some that have undergone detailed processing in the time domain to quantify bidirectional reflectance distribution functions (BRDFs) that describe This is a preprint and has not been peer reviewed.
how the surface reflectance varies with the sun-satellite geometry.These include the nadir BRDF-adjusted reflectance (NBAR), also known as the MODIS MCD43 product (Schaaf et al., 2002;Wang et al., 2018), and the multi-angle implementation of atmospheric correction (MAIAC), known as the MCD19 product (Lyapustin et al., 2011a(Lyapustin et al., ,b, 2012)).Deriving the terrestrial BRDF and optionally information regarding aerosol properties necessitates complex algorithms that use data acquired over a window of time with assumptions regarding variability over the considered time period.As a result of the processing required over the time window (typically 16 days), data sets may not be immediately available after the satellite overpass.In addition, these algorithms also detect and remove pixels that are a↵ected by clouds; this can significantly reduce the available spatial coverage (e.g., Banks and Mélin, 2015;Mercury et al., 2012).
In this work, we examine whether it is possible to reconstruct red, green, and blue (RGB) land surface reflectances in cloudy conditions using nearly all-sky satellite-based radiances measured by multi-spectral instruments.The satellite instrument that we use is the Global Ozone Monitoring Experiment 2 (GOME-2) that was designed primarily for atmospheric composition measurements.GOME-2 o↵ers continuous spectral coverage from the ultraviolet (UV) through near-infrared (NIR, up to ⇠800 nm).Specifically in this work, we test whether the spectral measurements from GOME-2 can be used to accurately estimate the underlying land surface reflectances in cloudy pixels with machine learning trained on NBARs from the MCD43 data set.In addition, we test the approach using discrete wavelength windows to determine the most important ranges for reconstruction of red, green, and blue (RGB) surface reflectances in cases of light to moderate amounts of cloud and heavy aerosol loading.
The basic assumption behind our approach is that a machine learning method, such as an artificial neural network (NN), is capable of disentangling and extracting a surface spectral fingerprint (in this case, as it relates to RGB land reflectances) from observed spectral reflectances.These observations are a↵ected by scattering and absorption from air molecules (Rayleigh and Raman scattering and absorption from gases such as O 2 , H 2 O, NO 2 , and O 3 ), aerosols, and clouds.A principal component analysis is used to precondition the spectral inputs to the NN.Similar methods have been employed in ocean remote sensing in the presence of aerosol, thin clouds, and sun glint using simulated data for training (e.g., Gross-Colzy et al., 2007a,b;Schroeder et al., 2007).Our approach uses a large sample of measured spectra from a satellite instrument rather than simulated data for training inputs.The resulting trained NN is then applied globally for one year for evaluation under a wide range of conditions.In addition, we test the approach for observations with optically thicker clouds than previously attempted (clouds that appear white to the human eye) as well as cases of heavy absorbing aerosol loading.
Our approach is fundamentally di↵erent than other image restoration methods that rely on either the availability of temporally adjacent clear sky reference images (temporal methods), estimates of the surface reflectance (spatial methods), or remaining parts of an image (non-complementation methods) (e.g., Li et al., 2019;Wang et al., 2019;Zhang et al., 2018, and references therein).Spectral This is a preprint and has not been peer reviewed.
methods have been employed in image dehazing that has far ranging applications in Earth science, search and rescue, event recognition, and aerial surveillance (e.g., Mehta et al., 2020, and references therein) and have also been employed for ocean remote sensing (e.g., Frouin et al., 2014;Frouin and Gross-Colzy, 2016;Steinmetz et al., 2011).Here, we describe a spectral method that encompasses but goes beyond dehazing and can be described as spectral cloud clearing.Beyond the RGB information provided in the training data set, our approach does not require any additional a priori information about the Earth's surface or atmospheric absorption or scattering, nor does it require radiative transfer calculations or look up tables.The ultimate success of the method does however depend heavily upon the quality of the training data.

Materials and Methods
We use reflectance measurements from GOME-2 (Munro et al., 2016), a satellite instrument with spectral coverage from the UV to the NIR including the socalled red edge from approximately 670 nm through the start of the strong O 2 A band near 758 nm.This instrument has a spectral resolution of the order of 0.5 nm at the wavelengths of interest (⇠400-800 nm).It flies on the European Meteorological Satellites operational (EUMETSAT MetOp) series.Here, we use data from GOME-2A (aboard MetOp-A) in 2018.MetOp-A is in a polar low Earth orbit (LEO) with a local equator crossing time near 09:30.We use data from GOME-2 bands 3 and 4 covering 397-604 nm and 593-790 nm, respectively.In 2018, GOME-2 was collecting data in a reduced swath mode (960 km swath), with pixels sizes approximately 40⇥40 km 2 at nadir.While this spatial resolution is considered low for many applications, the algorithm developed here and tested with GOME-2 can be implemented and evaluated with higher spatial resolution multi-spectral imagers.

MODIS MCD43 and other land data sets
We use NBAR data from the collection 6 MODIS MCD43C data set (Schaaf, 2015;Schaaf et al., 2002;Wang et al., 2018).MODIS instruments fly on the National Aeronautics and Space Administration (NASA) Terra and Aqua satellites in late morning (10:30 equator crossing time) and early afternoon (13:30 equator crossing time) polar orbits, respectively.Among the available MODIS bands, here, we use bands 1, 4, and 3 corresponding to red (R, 620-670 nm), green (G, 545-565 nm), and blue (B, 459-479 nm), respectively, that fall within the GOME-2 spectral range.MCD43 NBARs are reported on a daily basis.These data are constructed from clear-sky reflectance data observed over a 16 day window, computed on a daily basis and weighted towards the reported day.Therefore, these data are not true daily data, but rather can be considered as somewhat smoothed over a rolling 16 day window.The MCD43C NBAR data are averaged on a grid of 0.05 latitude by 0.05 longitude.
We collocate MCD43C data to GOME-2 footprints by averaging of all NBARs This is a preprint and has not been peer reviewed.
for gridboxes whose centers fall within the rectangular area defined by the corners of a given GOME-2 pixel.We similarly compute the fraction of snow in a GOME-2 pixel according to the Interactive Multi-sensor Snow and ice mapping system (IMS) data set (U.S. National Ice Center, 2008) in the northern hemisphere and the Near-real-time Ice and Snow Extent (NISE) data set (Brodzik and Stewart, 2016) in the southern hemisphere.We also calculate the fraction of water-cover within a GOME-2 pixel as in Qin et al. (2019).

Neural network architecture
Figure 1 describes the general approach of training a neural network (NN) to estimate RGB land surface reflectances (henceforth denoted RGB G2 or R G2 , G G2 , B G2 ) from all-sky GOME-2 spectra.The collocated MODIS MCD43-based RGB NBARs (denoted RGB M or R M , G M , B M ) are used as the predicted or target variables and GOME-2 multi-spectral reflectances, ⇢ G2 , are the predictors along with the optional sun-satellite geometry that can be described by the cosines of the solar zenith, view zenith, and phase angles, ✓ 0 , ✓, and , respectively (the phase angle is defined as the angle at a given point between the sun and satellite), i.e., In order to reduce the dimensions of the spectral measurements used as predictors, we perform an principal component analysis (or eigen-decomposition) of the input spectra, in the form of a covariance matrix, derived from several days encompassing all seasons (Jan.2-3, Mar. 2-3, May 2-3, Jul. 6-7, Sep. 3-4, and Nov. 2-3 of 2018).The dates were chosen to avoid days when GOME-2A was operating in a more narrow swath mode.The two days in each chosen month were selected to provide good coverage over all land masses.We then compute coe cients of the leading eigen-modes in order to reconstruct each spectrum.These coe cients can then be thought of as pseudo-observations to be used as predictors or features in the NN training rather than the full spectral complement.The NN training is performed with half the available samples from the dates used above for the eigen-decomposition.Evaluation is performed using all independent GOME-2 samples (i.e., not used for training) on the same days or other days not used in the training.
The spectral signature of the snow at the wavelengths used here may be confused with that of clouds.In addition, the MCD43 algorithm may not produce appropriate daily reflectances for training in the presence of partial or rapidly changing snow conditions.During initial training, we found that results improved when we removed snow-contaminated data.We also found that eliminating mixed land/water pixels improved performance as water reflectance properties also may not be well-captured within MCD43.Therefore, we focus here only on land pixels, removing training data with the following criteria: 1) Collocated snow fraction from either the IMS or NISE data > 5% or G M -(1.5 B M -0.02)This is a preprint and has not been peer reviewed.This is a preprint and has not been peer reviewed.
< 0, an empirical approach found to remove snow-contaminated data; 2) Red reflectance, ⇢ G2 , > 0.7 (to remove opaque clouds); 3) Water fraction > 5%; 4) SZA > 70 ; 5) poor reconstruction of spectra as follows.For each spectrum, the maximum error in the reconstructed spectrum is computed.If this absolute error is more than 5 greater than the mean error computed over the spectra in the training set, then that spectrum is not used; this flagging may identify damaged spectra, such as those with large noise found in the south Atlantic Anomaly area and potentially saturation in glint conditions (Gorkavyi et al., 2021).
While we have eliminated both snow and water contaminated pixels here, we note that this does not mean that our approach is not applicable to either snowor water-covered pixels.For example, the use of short-wave infrared bands (not available on GOME-2) along with an appropriate training data set over snow may enable the approach to be applied over snow-covered surfaces.Similarly, applications over water surfaces should also be possible with appropriate training data.
We employ the same general NN architecture as that used by Joiner and Yoshida (2020) for estimation of gross primary production (GPP) from MODIS NBARs.Briefly, a configuration within the Interactive Data Language (IDL) software package consists of a three layer feed-forward artificial NN with two hidden layers and 2N nodes in each layer, where N is the number of inputs.The output layer has three nodes, one each for R, G, and B NBAR; this produced similar results as when separate networks were created for each of the output bands.For activation functions we use a soft-sign for the first layer, a logistic (sigmoid) for the second layer, and a bent identity for the third layer.An adaptive moment estimation optimizer minimizes the error function with a learning rate of 0.1.Inputs and outputs are both scaled to produce zero means and unit standard deviations.Similar results were reproduced with Python codes and di↵erent architectures.
We use the following quality assurance (QA) checks in order to flag potentially erroneous reconstructed RGB data.These checks are similar to or identical to most of the checks applied during the training and described above.They include: 1) snow presence: G M -(1.5 B M -0.02) < 0; 2) optically thick clouds: ⇢ G2 > 0.7; 3) any corners of the GOME-2 pixel extending over the ocean; 4) SZA > 70 ; 5) poor reconstruction of spectra as described above.

Results
Figure 2 shows a random sample of GOME-2 spectra under various conditions ranging from mostly clear (those with lowest reflectances) to highly cloudy (those with the highest reflectances).Several gaseous absorption bands are seen including the O 2 B band near 685 nm and the O 2 A band near ⇠760 nm.The rapid rise in reflectance between about 685 and 760 nm, known as the red edge, is apparent particularly in the mostly clear sky pixels.In general, clouds tend to flatten out the spectra.The e↵ect of increased Rayleigh scattering at shorter (bluer) wavelengths is also apparent in the all sky reflectances; this is manifested This is a preprint and has not been peer reviewed.as increasing reflectances from green to blue wavelengths although the surface reflectance is generally lower in the blue as compared with the green.

Dependence on inputs
We perform a series of NN trainings and evaluations using di↵erent wavelength ranges along with optional smoothing and resampling of the spectra to simulate other instruments.With the full spectral range, resolution, and sampling of GOME-2, we can simulate and compare the potential performance of various current and future satellite instruments.Table 1 summarizes the evaluations with statistics comparing reconstructed RGB G2 with the target RGB M .
We find that 14 coe cients of the leading eigenvectors is close to optimal for RGB estimation using the full wavelength range of 403-795 nm.There is no apparent benefit from using additional eigenvectors (results not shown) and some degradation with less than 14.The leading eigenvectors (principal components) are shown in Figs.S1-S3.Correlations and root mean squared di↵erence (RMSD) are highest for the red band and lowest for blue which may reflect di↵erences in the signal to noise ratios in the di↵erent bands.
The reconstruction of NBARs includes not only atmospheric correction and This is a preprint and has not been peer reviewed.This is a preprint and has not been peer reviewed.
cloud and/or aerosol clearing, but also the BRDF adjustment to nadir viewing.This use of the three angles that define the sun-satellite geometry as predictors helps in this respect.We conducted another training without using these angles as predictors.The results of exp #2 in Table 1 without angles as predictors show a relatively small degradation as compared with results that used the sun-satellite angles in exp #1.This is an indication that the NN is learning about the canopy scattering and shadowing directly from the spectra.The NN also appears to learn how to deal with complex cloud and aerosol bi-directional e↵ects from the spectra.
Nearly identical results are obtained using the full wavelength range of GOME-2 bands 3 and 4 and a reduced range that removes wavelengths in the O 2 A band (403-758 nm) (not shown).A small degradation is seen with the reduced wavelength range of 403-680 nm where red edge wavelengths are removed (exp #3 of Table 1).Removing wavelengths in the blue region (using 500-758 nm in exp #4) does not produce substantial degradation for the red and green bands.Using only wavelengths in the blue range (403-500 nm) produces noticeable degradation to the reconstruction of blue reflectance (exp #5), although the overall RMSD may be acceptable for some applications.This wavelength range would be applicable to the Ozone Monitoring Instrument (OMI) that is used to estimate column amounts of trace gases such as NO 2 and does not cover green or red wavelengths.Using red through the red edge (603-758 nm, exp #6) produces some degradation in statistics for the red band as compared to results obtained with the green through red wavelength range.
We simulate RGB reconstruction that would be achieved using a lower spectral resolution multi-spectral sensor such as the planned NASA Plankton, Aerosol, Cloud, ocean Ecosystem (PACE) Ocean Color Instrument (OCI) that will have spectral coverage from the UV through the NIR at approximately 5 nm spectral resolution and at 1 km spatial resolution (Werdell et al., 2019).We averaged together every 25 GOME-2 samples (GOME-2 sampling is approximately 0.21 nm) to produce PACE-like spectra.We obtain negligible degradation in RGB estimates as compared with those obtained using full GOME-2 spectral resolution with the same wavelength range (not shown).Further testing by averaging every 50 GOME-2 samples produced similar results (not shown), indicating that high spectral resolution is not required in order to achieve good surface reflectance estimates in cloudy conditions.
We computed the full error covariance of the GOME-2 surface RGB reconstruction.We find that the errors for all bands are substantially correlated.This bodes well for applications that rely upon ratios or di↵erences of bands, such as in the computation of some vegetation indices.The error correlation between red and blue is 0.73, red and green is 0.85, and green and blue is 0.89.We list statistics for band di↵erences in Table 1 (last grouping, for exp #1) that shows, as expected based on the error correlations, higher correlations for surface reflectance band di↵erences as compared with some of the individual band surface reflectances.

Dependence on cloudiness
Figure 3A shows two dimensional (2D) density plots comparing the target R M with the reconstructed R G2 using the range 400-795 nm on 2 August 2018, a day not used in the NN training.The r 2 value is 0.96, and the overall bias is negligible.However, a small bias is shown at lower values of R M .
Figure 3B shows R di↵erences (R G2 -R M ) as a function of the cloud e↵ective fraction, f c , a quantity used in solar backscatter trace-gas retrievals in thin and/or broken cloud conditions.The independent pixel approximation (IPA) expresses the top-of-atmosphere (TOA) measured radiance as a function of wavelength , I m ( ), as where I g and I c are the clear sky and cloudy sky sub-pixel radiances, respectively, computed in an atmosphere with Rayleigh scattering and atmospheric absorption (all radiance quantities here are normalized by the solar irradiance).In the commonly used mixed Lambertian-equivalent reflectivity (MLER) model, I g and I c are modeled with the assumption of Lambertian surfaces with equivalent reflectivity, LER (dimensionless), that can be computed using where I 0 is the radiance contributed by the atmosphere in the presence of a black surface, T is the total amount of irradiance (direct plus di↵use) reaching the surface converted to the ideal Lambertian reflected radiance (divided by ⇡) in the direction an observer then multiplied by the transmittance between surface and top-of-atmosphere that includes the e↵ects of atmospheric absorption and scattering, and S b is the di↵use flux reflectivity of the atmosphere for its isotropic illumination from below.In the MLER model, the cloud LER is assumed to be 0.8, a number that well reproduces Rayleigh scattering in partly cloudy and thin cloud scenes over a range of di↵erent conditions (e.g., Ahmad et al., 2004).An f c value of 1.0 indicates optically thick cloud completely covering a pixel.
We estimate f c by first using Eq. 3 to compute I g (with reconstructed surface reflectance) and I c .Then, we invert Eq. 2 with the observed reflectance and the assumption of a Rayleigh scattering atmosphere (i.e., aerosol and trace-gas absorption are not included).We use parameters I 0 , T , and S b computed with the vector linearized discrete ordinate radiative transfer (VLIDORT) code (Spurr, 2006).
Figure 3B shows a small increase in the R G2 standard deviation with increased cloudiness.It should be noted that the bulk of the data samples have f c < 0.2.There is a small positive bias in R G2 for f c > 0.1, and the bias grows noticeably for f c > 0.6.
Figure 3C shows that estimated R G2 errors increase with R G2 from 0 to ⇠0.25, then decrease at higher values.Fractionally, the errors are largest at low R G2 values (⇠20% at R G2 =0.05) and decrease with increasing R G2 (12% at R G2 =0.1,This is a preprint and has not been peer reviewed.6% at R G2 =0.2, and 2.5% at R G2 =0.4).The estimated R G2 uncertainty shows the expected increase with f c in Figure 3D.Interestingly, errors increase for 0 < f c < 0.2 from about 0.009 to 0.014, remain relatively flat for 0.2 < f c < 0.6, and increase sharply for f c > 0.6.

Spatial dependence
Figure 4A shows the observed all-sky red reflectance, ⇢ R , on 2 August 2018 along with the reconstructed R G2 (Figure 4B), the di↵erence between the target and reconstructed R (R M R G2 , Figure 4C), and estimated R uncertainties from the NN fitting (Figure 4D).At this wavelength with typical non-desert land surface reflectances, a ⇢ R of 0.4 corresponds approximately to cloud optical thickness, ⌧ , of 5, ⇢ R = 0.6 corresponds roughly to ⌧ = 10, and ⇢ R = 0.7 corresponds to ⌧ =⇠ 20 (Kujanpää and Kalakoski, 2015).So even at high values of observed ⇢ R , a small fraction of incoming sunlight reaches the surface and is then reflected from the surface and penetrates through the clouds to reach a satellite-borne sensor.Even though the fraction of surface-reflected light may be quite small, its spectral signature still may be distinguished from that of the clouds and atmosphere in observations with su cient signal-to-noise ratios.The estimated R G2 uncertainties shown in Figure 3D display generallly larger errors for more heavily clouded conditions as expected.In the presence of small amounts of undetected snow or ice within a GOME-2 pixel, R G2 would likely be underestimated as the trained NN may confuse snow for clouds and subsequently try to reconstruct the snow-or ice-free part of the pixel, i.e., snow clearing.Some R di↵erences may result in part from the fact that the underlying MCD43 R M is estimated using a moving weighted 16 day window and is not a true daily product.In addition, the MCD43 product may occasionally be a↵ected by cloud and/or aerosol contamination.Note that we have included some pixels that are flagged as highly uncertain by the MCD43 QA to show data under as many conditions as possible.In addition, here we somewhat relax our own QA filtering from the strict training conditions to show results under a variety of challenging conditions.For example, larger errors are shown over the great lakes (not flagged here) in the US upper midwest as well as for some pixels along coastlines that may have some water contamination.
Overall, R G2 provides nearly complete coverage over the GOME-2 swaths, capturing the major spatial features in the R M training data set.On this day, the absorbing aerosol index (AAI) from the ozone mapping and profiler suite (OMPS) on the Suomi national polar partnership (SNPP) satellite shows high values across the Sahara (> 2) and MiddleEast (see Fig. S4) presumably owing to the transport of dust.High AAI values are also seen over other areas, likely from smoke transported from fires including over southern Africa, the western US, and Siberia.Di↵erences between target and predicted R are not substantially higher in these regions as compared with other areas, indicating that the training appears to work reasonably well in the presence of high absorbing aerosol loading.
Figure 5 shows GOME-2 all sky, reconstructed, and MCD43 target surface RGB images for 2 August 2018.GOME-2 pixels are substantially impacted by This is a preprint and has not been peer reviewed.
This is a preprint and has not been peer reviewed.clouds as well as Rayleigh scattering.The GOME-2 atmospherically-corrected, nadir-adjusted, cloud-cleared reconstructed RGB image of the Earth's land surface by eye is almost indistinguishable from that of the target MCD43 data.A close examination of the images shows some apparent artifacts in the reconstructed GOME-2 data on the easternmost part of the swath that crosses over Southeast Asia.However, the GOME-2 data show a bit more uniformity in the surface RGB over the highly cloudy Indian subcontinent region.Note that all RGB images shown here are selectively scaled using the IDL ScaleModis procedure supplied with the coyote library from Fanning Software Consulting that is based on code originally developed by the MODIS rapid response team for image display.
Figure 6 provides a zoomed in view of the Saharan region that was impacted by both dust and clouds.The whitish and grayish pixels in the top panel indicate cloud contamination, particularly in the lower right corner of the panels where ⇢ R exceeded 0.7 and is removed from the images.Here, the increase in green vegetation towards the bottom of the lower two images can be seen clearly although some noise is present in the eastermost orbit at this transition which is also seen to be cloudy.Many of the details from the MODIS data are well captured by GOME-2 despite the heavy aerosol loading across the region that is present on this day.Figure S5 shows additional RGB imagery for a date in a di↵erent season (1 February 2018) where the NN has apparently performed snow clearing as well as cloud clearing along with the AAI on that day in Fig. S6.

Temporal dependence
We processed one year of GOME-2 data (2018) to analyze time series of the reconstructed reflectances at di↵erent sites.Our QA checks caught many but not all outliers.We therefore applied an additional simple outlier check by checking whether any points at a given location fell outside the range of the annual mean ± 4 .We selected locations near eddy covariance flux towers.Here we show one example and provide additional time series in Figs.S7-S18.Figure 7 shows one example over southeast Asia where there is substantial seasonal variability.The time series of the reconstructed RGBs are noisier in general than those of the collocated MCD43 target data.The reconstructed surface reflectances capture the overall seasonal variations from higher values at the beginning of the year through a transition to lower values at the middle to late part of the year.In this example, about 30% of data are filtered by the snow check, 4% by the high cloudiness check and 0% by the reconstruction check.One low value around day 103 is not flagged by the 4 check (possibly due to unflagged snow) but could be detected by a more sophisticated time series continuity check.This is a preprint and has not been peer reviewed.This is a preprint and has not been peer reviewed.
Figure 7: Time series of target (MCD43, black line with stars) and reconstructed (from GOME-2, colored + with extended lines in the y direction to indicate 1 uncertainties) R, G, and B surface reflectances in top three panels, respectively, and observed ⇢ R , ⇢ G , and ⇢ B from GOME-2 in the bottom panel for pixels near the MAGIM eddy covariance site.Bias and standard deviation of the reconstructed reflectances with respect to the target are provided for each wavelength (color).The number of outliers found by the 4 outlier check is indicated above the third panel (in this case zero).Points filtered by the quality assurance (QA) checks (percentages given above the fourth panel) including snow (red diamonds), extremely high cloudiness (blue diamonds), or poorly reconstructed spectra from the leading 14 principal components (none in this case).

Interpretation of NN learning within the MLER framework
The e↵ective cloud fraction, f c , should not be confused with the geometrical cloud fraction; f c used within the MLER model is a construct designed to produce the observed amount of scattering and absorption for a pixel of thin and/or broken clouds and/or aerosol without having to employ complex cloud/aerosol models with many parameters.While the MLER model may seem a too simplistic representation of clouds, it is found to well reproduce the complex radiative transfer within a cloudy atmosphere including Rayleigh scattering and atmospheric absorption (see the review of Stammes et al., 2008, and references within).
In the MLER model, f c is formally wavelength dependent.Cloud shadowing and other bi-directional e↵ects and their interaction with Rayleigh scattering as well as unaccounted for particle or gaseous absorption contribute to f c wavelength dependence.Gupta et al. (2016) conducted 1D radiative transfer simulations in complex cloudy conditions using three di↵erent cloud models and a range of geometrical cloud fractions to show that f c is fairly wavelength invariant over a large spectral range from the UV to the NIR.This implies that in the absence of gaseous, cloud, or aerosol absorption, the radiative transfer in a complex scene can be modeled with approximately one parameter: f c .If f c , and optionally its wavelength dependence, can be distinguished from the spectral dependence of the underlying surface, then it is possible to reconstruct a surface RGB image in cloudy conditions.We have utilized machine learning to accomplish this task with a large, representative training set.We may then compute f c at the training wavelengths using the reconstructed reflectances as described above and examine its wavelength dependence.
Figure 8A shows f c derived at the red wavelength band.Note that here we have used the reconstructed NBAR as a proxy for the LER.A more exact estimate of f c should account for the BRDF-dependence of the reflectance (Vasilkov et al., 2017(Vasilkov et al., , 2018)).For our purpose of roughly estimating f c and its wavelength dependence, use of the reconstructed NBAR as the surface LER should su ce; however, note that neglect of the surface BRDF may contribute to wavelength and cross track dependence in f c . Figure 8B shows the di↵erence between f c computed for the red and blue wavelengths.The presence of absorbing aerosol (e.g., from airborne dust over the Sahara and surrounding region, see SI for maps of the aerosol index on this day), not considered in the calculations of f c , produces lower than expected f c at the strongly absorbing blue wavelength.There is a small bias overall between f c derived at red and blue wavelengths.
Larger spread and bias is seen in the comparison of f c from red and blue wavelengths in Figure 8C as compared with red and green wavelengths in Figure 8D.There is only a small mean di↵erence of 0.01 between f c at red and green wavelengths.Rayleigh scattering, which is much stronger at blue wavelengths, reduces the e↵ects of cloud shadowing as well as cloud and surface bi-directional e↵ects, causing larger di↵erences between red and blue wavelength This is a preprint and has not been peer reviewed.This is a preprint and has not been peer reviewed.f c .Negative values are shown here and may be due to a combination of factors including shadowing and GOME-2 calibration error.The NN is able to adjust for calibration error in GOME-2 relative to MODIS.
The near wavelength invariance of f c provides an explanation of how the NN is able to separate the surface and cloud spectral signatures by extracting f c , or similar information, and its relatively small wavelength dependence.The near spectral invariance of f c also suggests that our cloud-clearing method may be able to fill spectral gaps in the RGB training data set in order to estimate more complex surface spectral signatures such as from chlorophyll absorption.Finally, this example gives an indication as to why the errors in the reconstructed RGBs are correlated; an error in the derived f c should produce spectrally correlated errors in estimated surface reflectances.Other works have similarly exploited the spectral di↵erences between ocean surface reflectance and spectrally smooth clouds, aerosol, and ocean glint (e.g., Frouin et al., 2014;Frouin and Gross-Colzy, 2016;Steinmetz et al., 2011, and references therein).

Potential short-comings of the approach
We studied a sample of the outlier cases produced with GOME-2 that did not agree well with collocated MODIS data.In some of these, we found excess filling-in of solar Fraunhofer lines that is indicative of additive e↵ects in the observed radiances as discussed by e.g., Gorkavyi et al. (2021).Scattered light, dark current, non-linearity, bad or dead pixels, changes in the spectral response function from inhomogeneous slit illumination or temperature changes, eclipses, detector memory e↵ects, saturation, and cloud three dimensional (3D) e↵ects (light scattered from clouds or aerosol outside the area of the ground footprint) are example of e↵ects that may lead to additive radiance errors in GOME-2 and similar instruments (e.g., Gorkavyi et al., 2021;Lichtenberg et al., 2006).These additive error could in turn result in problems reconstructing surface reflectances from cloudy spectra.
While cloud and aerosol 3D e↵ects are less common in large pixel sensors such as GOME-2 as compared with imagers such as MODIS, they may occur under certain conditions and lead to additive errors.In cases of broken but optically thick clouds with a homogeneous surface underneath, GOME-2 may be able to reconstruct the surface reflectance from the clear parts of a scene.However, for imagers that resolve such clouds, the signals from underneath clouds may be spectrally and spatially scrambled, leading to a spatially smeared reconstructed image.Future studies will apply our approach to higher spatial resolution imagers under di↵erent conditions.Ultimately, the results of such studies will help to determine whether our spectral image reconstruction approach will be accurate enough for a given application under di↵erent conditions.
Finally, we stress that the success of the method depends critically on the quality of the training data set.In this study, we utilize high quality daily NBARs that can be accurately averaged over a larger GOME-2 pixel.An advantage of our approach is that a large training sample can be constructed over a wide range of conditions; it does not depend upon simulating a multitude of situations for This is a preprint and has not been peer reviewed.
training including the complex conditions that occur with real observations.Our approach is also robust with respect to a constant calibration error.However, for application to other higher spatial resolution instruments, similar high quality training data may not be available.

Additional remote sensing applications
There are a host of potential applications of the approach developed here to extract information about the Earth's surface, over both land and ocean.This study focuses on land surfaces; subsequent studies will address the ocean surface.We plan to test the approach with other higher order level 2 data products as target output variables.The desired precision and accuracy as well as availability of adequate training data sets will be considerations that factor into whether or not this approach will be feasible for a given application.

Application to other sensors
While our approach was implemented with GOME-2 that provides complete spectral coverage from the 400-800 nm, there will be many more instruments available for such applications in the future that have enhanced capabilities.For example, the results obtained here are applicable to PACE and other multispectral instruments that will provide substantially higher spatial resolution.The NASA geostationary tropospheric emissions: monitoring of pollution (TEMPO) (Zoogman et al., 2017) (expected launch in the 2023 time frame) will provide nearly complete spectral coverage of UV through part of the red edge (740 nm) over much of North America at an hourly time step at ⇠5 km spatial resolution.Other instruments to be launched over the next several years may provide spectral coverage appropriate for certain applications.For example, the European Space Agency (ESA) FLuORescence Imaging Spectrometer (FLORIS) on the FLuorescence EXplorer (FLEX) mission will fly in low Earth orbit with approximately 300 m spatial resolution and spectral coverage from 500-780 nm (i.e., green through NIR) over a 250 km swath (Coppo et al., 2017).The NASA surface biology and geology (SBG) and the German Environmental Mapping and Analysis Program (EnMAP) (Storch et al., 2020) will produce multi-spectral imagery using wavelengths from the visible through SWIR.These instruments are particularly well suited for applications related to land biochemical processes.The methods developed here are also applicable to multi-spectral imagery acquired by instruments on the international space station as well as aircraft and other suborbital platforms.

Supplementary Figures
In this supporting information, we provide plots of the leading eigen-vectors of GOME-2 reflectance spectra.We also provide similar plots as in the main manuscript but for an additional dates in another season (1 February 2018) as well as additional time series at di↵erent locations worldwide.
Figures S1-S3 show the leading fourteen principal components (PCs) or eigenvectors of GOME-2 reflectance spectra collected globally on the collection of training days in 2018 listed in the main manuscript.The oxygen and water vapor bands feature prominently in most of these modes.The red edge (increasing surface reflectance from 700-760 nm) can be seen in the mirror image of the third PC.More subtle low frequency patterns are seen in other PCs.Note that a small discontinuity between the GOME-2 bands 3 and 4 can be seen, particularly in the ninth PC near 600 nm.
Figure S4 shows the aerosol index, an indicator of absorbing aerosol for 2 August 2018.Heavy absorbing aerosol can be seen across Africa.The heavy loading across Saharan Africa typically results from blowing dust.Plumes can be seen extending towards the east over the Atlantic, northwards over Spain, and eastward reaching the Saudi Arabian peninsula and the Indian subcontinent.A second plume over central Africa typically results from biomass burning.Another plume over the western US is likely due to smoke from forest fires in California.
Figures S5-S6 are similar to plots shown for 2 August 2018, but are for a day in another season (1 February 2018).Performance of the reconstructed reflectances is similar for both dates.The reconstructed RGB in Figure S5 shows how the NN has performed snow clearing over partially snow covered surfaces (this may include woody or evergreen vegetation present above the snow) at high northern latitudes.Heavy aerosol is seen across Africa on this date (Figure S6) although not as widespread as on 2 August 2018.
Figures S7-S18 show additional time series at sites around the world of MCD43, GOME-2 reconstructed, and GOME-2 observed reflectances at red, This is a preprint and has not been peer reviewed.
green, and blue wavelengths similar to Figure 7. Details relevant to each figure are given in the captions.This is a preprint and has not been peer reviewed.This is a preprint and has not been peer reviewed.

Red
Green Blue R > 0.7, 2.5% This is a preprint and has not been peer reviewed.

Figure 1 :
Figure 1: Flow diagram showing how RGB surface reflectances and their uncertainties are estimated with machine learning (a neural network, NN) using continuous multi-spectral measurements from instruments such as GOME-2 trained on MODIS RGB (RGB M ) surface reflectances.

Figure 2 :
Figure 2: Random sampling of GOME-2 spectra (colors chosen at random) on various days in 2018 with major atmospheric gaseous absorption bands labeled above and MODIS bands indicated as labeled above.

Figure 3 :
Figure 3: Density plots using (independent) red band data (denoted "R") from 2 August 2018 with colors indicating the number of points in a bin as indicated along the top; bins with a single point are indicated as a dot rather than a filled box; A) scatter diagram of the predicted surface red reflectance from MODIS MCD43 (R M ) versus that reconstructed from all sky GOME-2 data (R G2 ) with statistics including the number of points (N) and standard deviation (std); B) R G2 -R M as a function of the e↵ective cloud fraction f c computed with the GOME-2 R band (higher f c values indicate higher geometrical cloud fractions and optical thicknesses).Red diamonds indicate binned means and red vertical bars the binned standard deviations; C) predicted R G2 errors as a function of R G2 ; D) predicted R G2 errors as a function of f c computed with the red (R) band.

Figure 5 :
Figure 5: Top: GOME-2 RGB image from 2 August 2018; Middle: Reconstructed surface RGB image from GOME-2; Bottom: Surface RGB from MODIS MCD43 averaged over GOME-2 pixels.Snow/ice covered pixels are not masked out.Orbital gaps as well as pixels with reflectances > 0.7 or not passing the reconstruction check are masked out.Ocean data are colored as blue and are not considered in this study.

Figure 6 :
Figure 6: Similar to Figure 5 but zoomed in on northern Africa; Top: GOME-2 RGB image from 2 August 2018 ; Middle: Reconstructed surface RGB image from GOME-2; Bottom: Surface RGB from MODIS MCD43 averaged over GOME-2 pixels.

Figure 8 :
Figure 8: f c computed at di↵erent wavelengths from 2 August 2018 GOME-2 data: A) f c computed at the red band wavelength; B) map of di↵erence between computed f c at red and blue wavelengths; C) density plot of f c at red and blue wavelengths; D) density plot of f c at red and green wavelengths.

Figure S5 :
Figure S5: Similar to Figure 5 but for 1 February 2018; Top: GOME-2 RGB; Middle: Reconstructed surface RGB image from GOME-2; Bottom: Surface RGB from MODIS MCD43 averaged over GOME-2 pixels.Snow/ice covered pixels are not masked out in the reconstructed and MCD43 images.Data are masked where R reflectances > 0.7 and those few pixels that failed the reconstruction quality assurance check.Data over water surfaces are colored as blue and are not considered in this study.

Figure S7 :
Figure S7: Similar to Figure 7 but for a di↵erent location in Australia.In this example, larger di↵erences and uncertainties are shown for cloudier times of year, particularly around day 50.

Table 1 :
Statistical comparison of with 131,544 GOME-2 independent (not used in training) data points from training/evaluation days listed above.Statistics include the root mean squared di↵erence (RMSD), bias (mean of RGB G2 -RGB M ), and variance explained (r 2 ).The table is segmented into di↵erent experiments, denoted by "exp #".The last grouping is for band di↵erences.
a Sun-satellite geometry (3 angles) not included as predictors.
PC % of total: 98.5994; cumulative % of total: 98.5994Figure S1: Leading principal components (PCs) of GOME-2 global reflectance spectraThis is a preprint and has not been peer reviewed.