Assessing Landsat-8 atmospheric correction schemes in low to moderate turbidity waters from a global perspective

ABSTRACT Atmospheric correction is one of the major challenges in ocean color remote sensing, thus threatening comprehensive evaluation of water quality within aquatic environments. In this study, five state-of-the-art atmospheric correction (AC) processors (i.e. Acolite, C2RCC, iCOR, L2gen, and Polymer) were applied to Operational Land Imager (OLI) Landsat-8 scenes and evaluated against in situ measurements across various types of waters worldwide. A total of 262 matchups between in situ measured and satellite-derived remote sensing reflectance (R rs) at 20 sites were obtained between August 2013 and August 2021. Classification of optical water types (OWTs) was carried out using in situ measurements with matched satellite observations. OWT-specific analysis demonstrated that L2gen produced the most accurate R rs with R 2 ≥ 0.74 and root mean squared error (RMSE) ≤ 0.0018 sr–1 for the four visible bands of OLI, followed by Polymer, C2RCC, iCOR, and Acolite. In terms of R rs spectral similarity, C2RCC yielded the lowest spectral angle (SA) of 8.55°, followed by L2gen (SA = 9.20°). The advantage and disadvantage of each AC scheme were discussed. Recommendations to improve the accuracy for atmospheric correction were made, such as polarization observations and concurrent aerosol and ocean color measurements.


Introduction
Although coastal waters occupy only 7% of the total ocean surface, they serve as essential hubs for human and wildlife activities and are critical components of global ecosystems (Osterholz et al. 2021).Given their importance in human habitation, resource provisioning, cultural practice, and carbon sequestration, they play a significant role in achieving the Sustainable Development Goals (SDGs).However, global trends point to continued deterioration of coastal waters caused by pollution and eutrophication (Chen et al. 2017).Therefore, it is very important to conduct systematic, large-scale, and continuous researches on them from a long-term perspective (Jamet et al. 2011).With the advantage of synoptic views of marine biosphere over large spatial and temporal scales, ocean color remote sensing represents a unique tool to provide vital insights into the linkage between human activity and coastal marine environments.Its products can serve for comprehensive evaluation of water quality and greatly facilitate assessment of coastal ecosystems (Marrari, Hu, and Daly 2006;Volpe et al. 2007;Malenovsky et al. 2012;Platt 2008;Mouw et al. 2015).
It is well-known that top-of-atmosphere signals received by ocean color satellite sensors contained approximately 90% atmospheric and 10% oceanic information (Gordon et al. 1985).Therefore, atmospheric correction (AC) is one of the key procedures in the application of quantitative ocean color remote sensing.The black pixel assumption algorithm performs well and can reach the goal of 5% uncertainty for remote sensing reflectance (R rs , unit sr -1 ) in the blue and green wavelengths for open oceanic waters (Gordon and Wang 1994;Mélin 2019).However, AC over coastal waters is challenged by the presence of continental aerosols, bottom reflection, and adjacency of land (Renosh et al. 2020;Siegel et al. 2000;Shanmugam 2012).Numerous efforts have been made to minimize those perturbing effects (Xue et al. 2021).For example, wavelengths in the ultraviolet (He et al. 2012) or in the short-wave infrared (SWIR) (Wang 2007;Wang and Shi 2007) were introduced to improve the determination of the aerosol model in highly turbid waters.Fixed reflectance models and spatial homogeneity of aerosol were also proposed for moderately turbid waters (Hu, Carder, and Muller-Karger 2000;Ruddick, Ovidio, and Rijkeboer 2000).With the availability of powerful computation facilities, numerical and statistical optimization algorithms were developed (Chomko and Gordon 1998;Guanter et al. 2010;Son and Wang 2012;Wang et al. 2020).To ensure the continuity and consistency with past and present ocean color missions, it is essential to validate and assess the existent AC schemes across various types of waters with different hydrodynamic and atmospheric conditions.
In this study, we aim to compare five state-of-the-art AC processors for OLI commonly used in the field of ocean color, i.e.Acolite, C2RCC, iCOR, L2gen, and Polymer.Extensive datasets over global oceans are involved, including Europe, Africa, North America, South America, Asia, and Oceania.The most suitable AC method for each region based on precision or spectral similarity needs will be proposed.To achieve the goal, in situ measurements across global waters from different sources are utilized, including Aerosol Robotic Network-Ocean Color (AERONET-OC), Sea-WiFS bio-optical archive and storage system (SeaBASS), Marine Optical BuoY (MOBY), National Satellite Ocean Application Service of China (NSOAS), and our research cruises.Optical classification for different water bodies is implemented using R rs spectra.

In situ observation
In situ observations from a total of 20 sites around the world, as shown in Figure 1, were collected between August 2013 and August 2021, coinciding with OLI/Landsat 8 overpasses.Table 1 lists the information for those 20 stations.Figure 2 shows OLI-derived true color composites for those stations across various types of waters with different hydrodynamic conditions.AERONET-OC provides normalized water-leaving radiance (nL w ) at globally distributed offshore sites.Quality-assured nL w (Level 2.0) data from 12 AERONET-OC sites were used.Equation (1) was used to convert nL w to R rs .SeaBASS is a well moderated and documented archive for bio-optical data, which is operated by National Aeronautics and Space Administration (NASA).SeaBASS data were downloaded from https://seabass.gsfc.nasa.gov/.MOBY, anchored off Lanai, Hawaii, has provided vicarious calibration data in near real time for ocean color sensors since 1996.MOBY-measured L w data were acquired from the National Oceanic and Atmospheric Administration (NOAA).NSOAS deployed an ocean color instrument, named Cruise for Apparent Optical Properties (CruiseAOP), to collect R rs at a fixed station off Muping, China, which became operational since January 2020 and is about 25    kilometers away from the coastline with a water depth of about 12 m.Two field campaigns were conducted in May-June and August 2021 in the South China Sea (SCS), respectively, during which CruiseAOP was also used for R rs measurements.
The CruiseAOP instrument was equipped with three pre-calibrated radiometers (TriOS RAMSES-ARC-VIS), one for irradiance and the other two for radiance.The instrument has a rotatable component which can automatically control the observation geometry of the sensors.The total above-surface radiance (L t ), downwelling sky radiance (L s ), and downwelling incident irradiance (E s ) in the spectral range of 325-944 nm were simultaneously measured at an interval of 15 minutes.For each time stamp, 15 spectral scans were recorded for L t , L s , and E s , respectively.The averaged spectra were then obtained.All measurements followed the NASA ocean optics protocols to minimize sun glint perturbations (Mueller et al. 2003).According to  1. Lee et al. (2016), R rs was calculated from where ρ s is the Fresnel reflectance at the air-sea interface and was determined following Mobley (1999).F 0 represents the solar irradiance of perpendicular incident outside the atmosphere at mean sun-Earth distance.Δ is related to a second-order correction and determined by setting the spectrally averaged R rs between 900 and 920 nm to 0 here.

Satellite data processing
On 11 February 2013, the Landsat-8 satellite was launched as a joint initiative between the U.S. Geological Survey (USGS) and NASA, carrying the Operational Land Imager (OLI) and the Thermal Infrared Sensor (TIRS).OLI measures top of atmosphere (TOA) radiance in 9 wavebands in the visible to shortwave infrared (SWIR) domain at a spatial resolution of 30 m. Landsat-8 OLI collection 1 level 1 data were downloaded from USGS.Five AC algorithms, namely Acolite, C2RCC, iCOR, L2gen, and Polymer, were implemented.Acolite, C2RCC and L2gen directly output R rs while iCOR and Polymer yielded dimensionless water-leaving reflectance (ρ w ) that was transformed into R rs via R rs = ρ w /π.R rs at 443, 482, 561, and 655 nm was generated.

Optical water types
To meet the need for an assessment of AC processors across widely variable coastal water conditions, the fuzzy C-means (FCM) clustering algorithm proposed by Vantrepotte et al.
(2012) was employed for the classification of optical water types (OWTs).As shown in Figure 3, five OWTs were achieved for all data involved in this study based on their spectral shape and m.They were numbered sequentially from blue, oligotrophic waters to moderately eutrophic, sediment-rich waters with an overall trend of increasing chlorophyll and optical complexity.
To enable direct inter-comparison among the AC processors for each OWT, the following steps were carried out.First of all, pairs of matched in situ and satellite-derived R rs were defined.If R rs retrieval from any AC algorithm fails, the data pair would be rejected.13 pairs of data were retained for OWT 1, 109 for OWT 2, 81 for OWT 3, 32 for OWT 4, and 12 for OWT 5. Secondly, the five algorithms were ranked based on statistical measures.Finally, the comprehensive ranking of each AC algorithm for each band was obtained.

Match-up analysis
The flowchart of match-up analysis between in situ measurements and satellite observations is shown in Figure 4.For some in situ measurements, R rs at certain bands was not available and thus discarded.Duplicated data were removed since different data sources were involved.The spatial and temporal windows were set to 5 × 5 and ±2 h, respectively.The matching exercise followed the approach proposed by Bailey and Werdell (2006).For each matched pair, the median of R rs (λ) over a 5 × 5 pixel window centered at the station was used.A median is better than a mean because the former is less sensitive to spikes.Several flags were checked for satellite-derived R rs , including cloud shadow, stray light, and sun glint.Finally, 262 pairs of matched field measurements and OLI observations were obtained, with 184 from AERONET-OC, 21 from SeaBASS, 33 from MOBY, 11 from Muping, and 13 from the SCS.

Statistical analysis
Along with slope, intercept, and determination coefficient (R 2 ), the following statistical measures, i.e. bias, absolute error (AE), and root mean square error (RMSE), were used for the performance  assessment of the five AC algorithms: bias where n is the number of match-ups, R rs sat denotes satellite-derived R rs using a specific AC algorithm, and R rs meas denotes field measured R rs .Spectral Angle (SA) (Keshava 2004) was introduced to quantify the similarity between satellite-derived and in situ measured R rs spectra.SA is insensitive to amplitude and calculated from where i denotes the ith band of OLI.Smaller SA indicates higher similarity.If R rs for an OLI wavelength is not available, data for the nearest wavelength were used instead.

Description for AC algorithms
TOA radiance received by ocean color satellite sensors (L toa ) can be decomposed into several terms as follows: L R (λ) is the Rayleigh scattering of atmospheric molecules.L A (λ) is the aerosol scattering in the atmosphere.L ra (λ) is the multiple scattering of aerosol-molecular interaction.L g (λ) is the reflection of sun glint on the sea surface entering the sensor field of view, also known as specular reflectance.L b is the reflection from bottom.L wc (λ) is the reflection of white bubble cloud entering the sensor field of view, also known as white cap reflection.L w (λ) is water-leaving radiance.T(λ) is the direct transmittance, also known as beam transmittance.t d (λ) is the atmospheric attenuation coefficient between sea surface and sensor, also known as atmospheric diffuse transmittance.Acolite (version 20210114) is a generic image-base processor developed by the Royal Belgian Institute of Natural Sciences (RBINS).It does not need external inputs, such as aerosol optical thickness (AOT).The default dark spectrum fitting (DSF) approach was utilized (Vanhellemont 2019).This scheme assumes a homogeneous atmosphere.An automated method is performed to predict atmospheric path reflectance based on multiple dark pixels selected in a scene (such as ground surface shadows or water pixels).This approach requires the presence of SWIR bands to work over scenes with only turbid waters (Vanhellemont and Ruddick 2018).
The Case 2 Regional Coast Color (C2RCC) processor involves a multi-sensor per-pixel artificial neural network to perform the inversion of L w from L toa , as well as the retrieval of inherent optical properties (IOPs) (Doerffer and Schiller 2007;Brockmann et al. 2016).Characterization of optically complex waters through its IOPs as well as of coastal atmospheres is performed to parameterize radiative transfer models for water and atmosphere.A large dataset for reflectance at water surface is produced, which is then used as lower boundary conditions for the radiative transfer calculation in the atmosphere.A database of 5 million cases is generated, which forms the basis for neural network training.It can be imbedded into the SeNtinel Application Platform (SNAP, V6.0) software.The default settings were used for all images.
Image correction for atmospheric effects (iCOR) offers the ability of AC over inland, coastal, and transitional waters (Guanter, Gonzalez-Sanpedro, and Moreno 2007).It first identifies land and water pixels using a band threshold approach.Scene-specific aerosol optical thickness (AOT) is obtained following the SCAPE-M algorithm (Guanter, Gonzalez-Sanpedro, and Moreno 2007).An adjacency correction can optionally be applied using the SIMilarity Environmental Correction (SIMEC) approach (Sterckx et al. 2015).A MODerate resolution atmospheric TRANsmission (MODTRAN 5) look-up table is built (Berk et al. 2006), which requires ancillary information as inputs, such as solar and viewing angles and digital elevation model.If the above steps fail, Rayleigh scattering correction is automatically initiated with a default AOT of 0.1.Proper terrestrial pixel distributions are needed to obtain the best results.iCOR can be implemented as a plugin in the SNAP software (V6.0).In this study, the default settings were applied and the adjacency correction was implemented.
Level 2 generator (L2gen) is a combination of NASA atmospheric correction algorithms.It employs bio-optical models and resolves AC in an iterative process (Gordon and Wang 1994;Bailey, Franz, and Werdell 2010).An iterative scheme is exploited to estimate the aerosol radiance at the near-infrared (NIR) and/or shortwave infrared (SWIR) bands to relax the limitation of the 'dark pixel' assumption.In this study, OLI bands centered at 865 and 2201 nm were exploited.This NIR-SWIR band combination has been proven to yield the most robust R rs in moderately turbid waters (Pahlevan et al. 2017).L2gen is a built-in program for SeaWiFS Data Analysis System (SeaDAS, V7.5).
The Polynomial based algorithm applied to MERIS (Polymer) is specially designed for oceanic waters with and without the presence of sun glint (Steinmetz, Deschamps, and Ramon 2011).Compared with other AC algorithms, it can thus improve valid data availability.It employs a spectramatching optimization method on a pixel-by-pixel basis.It relies on two models, one for spectral reflectance of atmosphere and sun glint and the other for water reflectance in the visible wavebands.The bio-optical water reflectance model is extended to NIR wavebands using the similarity spectrum for turbid waters (Ruddick et al. 2006).The parameters for the two models are tuned iteratively to produce the best fit.Default settings for Polymer (V4.14) were used.

Validation of R rs from the five AC schemes
Figure 5 shows the scatter plots of in situ measured against OLI-derived R rs in the visible region from the five AC schemes.The statistical results are summarized in Table 2.The histograms for AE, bias, RMSE, and R 2 are displayed in Figure 6.It can be seen that the performance of the five AC algorithms indicates discrepancies for different wavelengths.For the 443 nm band, L2gen demonstrates the best performance with the largest R 2 of 0.74, a slope closest to 1, and the smallest RMSE, AE and bias, followed by Polymer, C2RCC, iCOR, and Acolite.For the 482 nm band, Polymer produces the best result with the largest R 2 of 0.84 and the smallest RMSE.For the green band centered at 561 nm, L2gen also yields the best match between in situ and satellite-derived R rs with a R 2 of 0.97, a RMSE of 0.0013 sr -1 , and a bias of -7%.In terms of R 2 and bias, C2RCC illustrates better performance than Polymer.The worst performance is found from iCOR.The performance of Acolite is ranked the fourth among the five algorithms.With respect to the red band centered at 655 nm, R rs data from L2gen show the best agreement with in situ measurements with a slope closest to 1 and the smallest bias.Polymer and C2RCC have similar performance.Relatively large deviation from the 1:1 relationship is observed for R rs from Acolite and iCOR while the latter agrees better with field measurements than the former.For all AC matchups, the scatter points are relatively dispersed for R rs at 443 nm, which is consistent with previous findings (Pahlevan et al. 2021).
In general, R rs for all bands is overestimated by Acolite and iCOR.AE and bias for R rs from Acolite and iCOR are much larger than from other three algorithms.Note that L2gen exhibits near-zero biases at 482, 561, and 655 nm, which corroborates the findings of previous studies (Pahlevan et al. 2021).As shown in Table 3, L2gen has the best performance at 443 and 482 nm with the smallest AE while Polymer has the best performance at 561 and 655 nm.Polymer-derived versus in situ R rs points are evenly scattered around the 1:1 line with R 2 s > 0.7 for all bands.L2gen produces the lowest bias, AE and RMSE for four bands while tends to overestimate R rs at 443 and 482 nm, which is consistent with recent studies (Ilori, Pahlevan, and Knudby 2019;Xu et al. 2020).To further evaluate algorithm performance, radar charts displaying statistical measures for each band are shown in Figure 7.Each vertex of the hexagon represents a statistical parameter.The length of a spoke is proportional to the magnitude of a statistical measure.Results further away from the center represent higher metric values, demonstrating better performance than those closer to it.In this scenario, all statistical parameters can be superimposed to objectively assess the performance of each AC algorithm from a global perspective.It can be seen that the area for L2gen is the largest for all bands, indicative of the best comprehensive performance.The performance ranking is followed by Polymer, C2RCC, iCOR, and Acolite in order.Therefore, it can be concluded that L2gen has the best performance among the five AC schemes.Figure 9 shows the comparison between averaged R rs spectra from in-situ and satellite observations using different AC algorithms for the five OWTs.L2gen-generated R rs spectra agree well with in situ measurements.In general, Acolite-obtained R rs spectra deviate the most from in situ data for all OWTs except for OWT 5.This further verifies that Acolite is suitable for the treatment of highly turbid water bodies.The spectral shape of Polymer-derived R rs is in good agreement with in situ measurements of OWTs 1 to 3, especially at 561 and 655 nm.Nevertheless, its performance degrades for OWTs 4 to 5. R rs from iCOR is generally overestimated.Table 4 lists the most suitable OWT-specific AC algorithm for each wavelength based on spectral shape preference via calculating SA.It can be seen that the spectral similarity for R rs from Polymer is the highest for OWT 1 with an SA of 1.15°.C2RCC has the smallest SAs for OWTs 2 and 4 with SAs of 4.92°and 1.31°, respectively.L2gen-derived R rs has the most similar spectral shapes to the in-situ data for OWTs 3 and 5 with SAs of 3.94°and 1.57°, respectively.These findings are consistent with those shown in Figure 9.

Comparison of R rs spectra for different stations
In situ measured R rs spectra with matched satellite measurements for six sites from five continents, three in the northern hemisphere and three in the southern hemisphere, respectively, Table 4. AC algorithm assessment for different optical water types (OWTs) using SA and the optimal algorithm is highlighted in bold.The best band-specific algorithm for each OWT is also presented.The number in brackets in the first column indicates the number of matched in situ and OLI-measured R rs for each OWT.across various types of waters are shown in Figure 10.R rs from the Gulf of Finland (GoF) was generally lower than 0.008 sr -1 and peaked in the green region with a minimum in the blue region < 0.0015 sr -1 .These spectral features were attributed to high contents of colored dissolved organic matter (CDOM), whose absorption at 412 nm varied between 0.8 and 1.6 m -1 (Kowalczuk 1999).R rs spectra from Muping and Mississippi River Estuary were similar and characterized by peaks at 561 nm that varied in the range of 0.003-0.02sr -1 and 0.0026-0.016sr -1 for the two stations, respectively.Note that there were several cases when R rs peaked at 482 nm and was < 0.003 sr -1 for both stations, which was caused by the landward movement of clear blue waters around them.For the Adriatic Sea and Great Barrier Reef stations, the peak wavelength of R rs spectra varied between 482 and 561 nm.The peak values also demonstrated relatively large ranges, almost an order of magnitude.R rs spectra from Muping, Mississippi River Estuary, Adriatic Sea, and Great Barrier Reef were typical for coastal waters influenced by both freshwater discharge and local ocean circulation (Mélin et al. 2016).R rs spectra from the La Plata-Parana River were characterized by high values in the red domain.R rs spectral peak at 561 gradually disappeared and the entire spectra were uplifted, which was associated with high loads of suspended sediment and agreed well with the findings in the Yangtze River Estuary (Pan, Shen, and Wei 2018).
SAs computed for all stations over global oceans are illustrated in Figure 11.L2gen and C2RCC achieve the lowest median SA of 8°while the former has a smaller interquartile range and thus tighter distribution than the latter.Averaged in situ R rs spectra and the matched satellite observations for the six stations are shown in Figure 12.The best-performing AC algorithm for all stations in terms of SA and average of AE (AAE) is summarized in Table 3. Apparently, great discrepancies can be recognized for different AC schemes.For example, C2RCC demonstrates the best performance for the Muping station with an SA of 8.75°and an AAE of 0.002 sr -1 .For the Great Barrier Reef station, R rs spectra from Polymer have the highest similarity to in situ data.Among the five algorithms, R rs spectra from Acolite generally deviate the most from in situ measurements.On the other hand, it deserves noting that the performance of AC algorithms varies for different bands.lists the optimal AC algorithm for each band at each station.Except for several cases where L2gen or Polymer produces the best R rs retrievals for all bands, mixed AC solutions are observed.Note that Acolite yields the best R rs at 443 nm in the very turbid La Plata-Prana River.It may therefore be advantageous to seek a fit-for-purpose AC that estimates satisfactory R rs retrievals in only one or two individual bands for specific applications.

Inter-comparison of satellite-derived R rs spectra
The quality assurance (QA) evaluation system developed by Wei, Lee, and Shang (2016) is introduced here to further assess the quality of R rs from each AC algorithm in terms of spatial distribution for different OWTs.It was applied to a satellite scene on 20 February 2016 with clear sky conditions covering the region that receives freshwater and nutrients via two large rivers, namely the Mississippi River (MR) and the Atchafalaya River (AR), as illustrated in Figure 13.A previous study showed that riverine nitrogen input accounted for 80% of the total nitrogen load on this shelf (Xue et al. 2013), making the MR-AR system a major source of nutrients to support local primary production (Joshi and D'Sa 2020).The unique absorption, scattering, and fluorescence properties of optically active constituents affect light fields, resulting in variations of R rs spectra.The yellowish-brown color (areas A and D in Figure 13) represents mineral-rich water with a R rs peak at 655 nm.R rs for area B has a higher rate of change than for area A in the visible domain.In this dynamic mixing zone where CDOM and sediment-rich plume water mixed with oligotrophic offshore waters, apparent discrepancies in water color can be observed.The green color over area C in Figure 13 represents waters with more contents of phytoplankton than other areas with R rs peaks in the green region (e.g.561 nm).For area E, R rs peaks at 561 nm while an order of magnitude smaller than those for area C.  Figure 14 shows the comparison of QA scores for R rs derived from different AC processors.It can be seen that R rs retrieved from all algorithms has higher accuracy in offshore waters than in nearshore waters, which is likely associated with high loads of suspended sediment in nearshore water.Note that R rs over areas with high sediment was masked in the L2gen algorithm although L2gen-produced R rs generally demonstrates high quality.In the vicinity of our sampling site (around area C), QA score is high.C2RCC-derived R rs demonstrates the best spatial smoothness than those from other algorithms.In clear blue waters, C2RCC and Polymer has good performance, followed by Acolite.In contrast, QA scores for R rs from L2gen and iCOR are even lower than 0.3 in some areas.These findings are related to the mechanism of AC algorithms mentioned above.

Uncertainties of matchup analysis
Although field data are usually considered as 'ground truth', they also contain some uncertainties, like incomplete calibration, non-standard instrument operation, ship shadow, white cap, sun glint, and the Fresnel reflection of sky radiance (Mobley 1999).A recent study reported that radiometric measurements at water surface may have an uncertainty of 5-50% in the visible region (Alikas et al. 2020).Another study showed that even the same measurement technique was used with well-calibrated instruments, the results differed by up to 10% from AERONET-OC (Tilstone et al. 2020).This phenomenon can be explained by the following reasons.Firstly, some uncontrollable factors, such as sea wave and underwater mixing process, caused rapid changes in the aquatic light field and introduced uncertainties (Ackleson 2003).Secondly, variations in water depth and optical properties of the water column may result in inaccurate measurements (Mishra et al. 2007).Thirdly, in situ data from different sources were used in this study.They were determined with hyperspectral radiometers from different manufacturers with different calibration histories and data processing procedures also differed.All these factors would introduce potential errors to field data.The spatial difference between field and satellite observations is well-known since the latter covers much larger areas than the former.Furthermore, the diversity of benthic substrates is a potential source of differences between them since variations in optical properties for different benthic habitats can be anticipated (Karpouzli and Malthus 2003;Kutser, Dekker, and Skirving 2003).In optically shallow waters, different types of benthic habitats (like seagrass, bare sand, and coral) can affect the accuracy of in situ measurements (Zeng et al. 2022).For example, sites in the Great Barrier Reef in Figure 10 show greater variations in blue wavelengths than in other wavelengths, which may be due to uncertainty in the measurement of surface radiation (i.e.changes in sky conditions) or strong absorption of benthic corals in blue wavelengths.
Temporal dynamics are one of the key characteristics that ocean and atmosphere differ from land (Cui et al. 2014).The time difference between in situ and satellite measurements has also persisted.Although in situ data collected within ±2 h of satellite overpass were used to represent the matching results, ocean color satellite data are often biased in turbid coastal waters of complex physical and bio-optical properties.For example, rapid changes in sediment re-suspension caused by wind, tide, river plume, or precipitation during the 2 h period can affect the optical properties of water column.Moreover, the extent to which wind and tidal changes affect sediment re-suspension depends on the local bottom topography and varies for different coastal regions (Choi et al. 2012;Shi and Wang 2009;Nezlin and DiGiacomo 2005).An alternative is to use matching records with the same collection time.However, the 16-day revisit frequency of Landsat 8 poses limitations.
Hence, increasing the amount of available data may provide a feasible solution, which can decrease the potential uncertainties of in situ measurements during inappropriate conditions.Time series collection of in situ data can be made through a large number of regular and continuous field work.This requires various shared ocean color field observations (such as SeaBASS and MOBY) to harmonize data inputs from various sources, expand data reserves, unify data formats, and simplify the process of data submission and acquisition.These efforts would greatly facilitate future atmospheric correction assessments.Using a combination of various methods to create a large data pool may minimize systematic errors in the performance evaluation (Pahlevan et al. 2021).Therefore, it is possible to characterize the optical properties and their spatiotemporal variations over oceans, which, in turn, can improve the accuracy of AC.

Acolite
The most recent version of Acolite updated the look-up table (LUT) for dark spectral fitting, using the default continental and maritime aerosol models, and also supports the addition of additional aerosol models (Vanhellemont 2019).Acolite has been validated over Pléiades images in turbid coastal waters in Zeebrugge and the results showed that R rs products from the newly developed DSF algorithm were superior to those from the SWIR method.But this algorithm can hardly obtain valid values in clear water (Vanhellemont and Ruddick 2018).DSF is similar to 'black NIR' or 'black SWIR' (Gordon and Wang 1994).Involving land pixels in the processing window provides better performance than using any type of water pixels.Therefore, there is a reasonable explanation for its poor performance for OWTs 1 to 4 while good for OWT 5 since OWT 5 is closer to land.In addition, the inaccurate aerosol calculation is a potential cause of errors.A key step of DSF is to find all 'dark' pixels in the entire scene.Vanhellemont and Ruddick (2018) emphasized that using samples with dark pixels would greatly facilitate AC.However, Acolite dynamically identifies the darkest pixels for each band and does not implement adjacency correction.Therefore, land pixels may contaminate R rs of adjacent water pixels.

C2RCC
C2RCC generally produced good performance in various regions and bands, ranked the third behind L2gen and Polymer.Unlike Acolite and L2gen, C2RCC had almost no pixel windows rejected since the dependence of the algorithm on input parameters is relatively low.However, C2RCC tended to highly underestimate R rs for OWTs 1 and 5.This could be related to the fact that the optical conditions of the training set (Soriano-González et al. 2022) were closer to those of OWTs 2 to 4. However, the optical conditions of OWTs 1 and 5 included in this study may differ from those in the training dataset for C2RCC, which limited the use of this processor in eutrophic and oligotrophic waters (Ligi et al. 2017).
However, the present version of the C2RCC processor does not correct for the effect of sun glint.Through feedbacks from the scientific community to characterize the scope and limitation, C2RCC is constantly improved and the future version is expected to improve the accuracy of the blue band and correct for sun glint effect.Since the good performance of C2RCC is related to its capability of capturing slight effects during calibration, new machine-learning solutions should be considered (Pahlevan et al. 2021).Feasible measures include improving forward radiative transfer modeling and exploring the latest architectures (e.g.optimization functions, loss functions, activation functions) or models (e.g.Convolutional Neural Network, Recurrent Neural Network).

iCOR
iCOR is an image-based algorithm.It does not require profound knowledge of radiative transfer and absolute calibration of sensors, which makes it easy to implement.However, this imagebased process requires the radiance from cloud shadow over water and land pixels to derive relevant parameters for the correction, which are not always present.As shown in Figure 9, iCOR generally overestimated reflectance for five OWTs due to limited land pixels for aerosol estimation.The extrapolation of land pixels probably underestimated atmospheric path radiation in coastal waters.It is very likely that these corrections returned default values of aerosol optical thickness (Warren et al. 2019).This may help to explain the poor performance of iCOR at offshore stations, such as Gulf of Finland and Muping.Note that adjacency effect influenced by land cover type and its seasonal variation, topography and other environmental conditions can affect the estimation of aerosol types and thickness, especially in inland and coastal waters, and further influence the accuracy of AC (Pahlevan et al. 2021).At present, iCOR is the only processor that corrects adjacency effect.

L2gen
As illustrated in Figure 5 and Table 2, L2gen achieved the best agreement with field measurements and was the most stable and balanced among the five algorithms for all OWTs.This can be explained by the following reasons: (1) Aerosol data used in L2gen were mainly obtained from AERONET-OC (Ahmad et al. 2010) that is also the major source of in situ data for matchup in this study; (2) The built-in bio-optical model and iterative scheme with NIR and SWIR switching options contribute to the stability of retrieval; (3) The effective standard flagging system can remove low-quality pixels and improve the retrieval quality.However, L2gen also showed systematic underestimation for all OWTs except for OWT 3, as shown in Figures 6 and 12.This phenomenon could be related to adjacency effects or uncertainties in estimating the aerosol contribution because of low signal-to-noise ratio propagation into the blue band when extrapolated from NIR and SWIR bands.They both led to overestimation of aerosols and retrieval of negative or low R rs (Ibrahim et al. 2018).Therefore, uncertainties still exist in L2gen-retrieved R rs data.L2gen requires an accurate understanding of aerosol types to obtain high-quality R rs retrievals.Hence, uncertainties for L2genderived R rs are mainly determined by uncertainties in aerosol LUTs, spectral extrapolations, and absorbing aerosols (Gordon, Du, and Zhang 1997;Franz et al. 2015).
In view of the abovementioned potential sources of uncertainties, Pahlevan et al. (2014) and Franz et al. (2015) investigated vicarious calibration gains for OLI, which apparently improved the accuracy of R rs .However, alternative calibration gains also introduced some uncertainties as a result of algorithm failure caused by residual effects from cloud shadow or overcorrection for aerosol contribution in one or more visible band.Overcorrection usually occurs when water-leaving radiance is not negligible for bands used to estimate aerosol contributions (Amin et al. 2009).Pahlevan et al. (2014) found that alternative calibration gains could also result in overcorrection of L2gen-derived R rs at 443 nm.

Polymer
The Polymer algorithm assumes that the Angstrom coefficient of aerosol scattering is 1.0 (Zhang et al. 2018).From the perspective of global statistics, Angstrom coefficient usually varies between 0.5 and 1.3 (Hu, Lee, and Franz 2012).For clear open oceanic waters, it is close to 1.0.Nevertheless, it will deviate from 1.0 for coastal waters and estuaries, resulting in incorrect spectral shape of aerosol reflectance and then leading to errors in the retrieved R rs .As reported by Steinmetz, Deschamps, and Ramon (2011), inappropriate estimation of wind speed at each pixel would yield negative reflectivity.Furthermore, Polymer simplifies that reflectance can be modeled as a function of two parameters, i.e. chlorophyll-a concentration and a coefficient (g) to account for backscattering variability (Zhang et al. 2018).In clear open oceanic waters where optical properties are mainly determined by chlorophyll-a, like OWTs 1 to 3, the algorithm demonstrated good performance, especially at 561 and 655 nm.Obvious overestimation can be clearly seen for R rs at 443 and 482 nm.This is related to the following reasons: (1) Insufficient correction for strong Rayleigh scattering in the blue to green spectral domain; (2) The spectral optimization did not account for contributions of CDOM to IOPs, other than chlorophyll-a (Steinmetz and Ramon 2018).In turbid coastal waters, like OWTs 4 to 5, changes in optical properties are affected not only by phytoplankton and their appendage, but also by other substances, such as suspended inorganic particles and CDOM (Babin et al. 2003;Siegel and Michaels 1996;Jiang et al. 2014).The assumptions made in Polymer are not applicable.Polymer would thus greatly underestimate R rs as different IOP models produced different phytoplankton concentration outputs (Windle et al. 2022).

Potential improvements towards AC processors
As discussed above, each AC scheme has their own advantage and disadvantage.The choice of optimal AC for specific region varies, especially for coastal turbid waters.Relative large errors may be expected when data outside the range of those used for algorithm development are involved.Therefore, optical classification of OWTs is essential to improve the accuracy of different AC algorithms.In addition, there exists other potential improvements toward different AC processors.
Dependent upon sun-target-sensor geometry, sun glint may pose a major challenge for ocean color remote sensing from space.Although it can provide information to extract surface features, like oil spill and internal wave, it hinders quantitative retrievals of biogeochemical variables from satellite observations.It was usually masked by defining a threshold during AC, like Acolite and L2gen, resulting in reduction of valid satellite data.To date, Polymer is the only processor capable of correcting sun glint.With more studies providing insights into the mechanism of sun glint, it is promising to address concerns caused by sun glint in the near future.Adjacency effect is another source of uncertainty to be addressed during AC.In the atmosphere-land system, adjacency effect always happens due to the presence of scattering atmosphere (Frouin et al. 2019).For in situ observations close to land, there would inevitably be interference of land pixels in the correction process.Among the five algorithms evaluated here, only iCOR considered this effect.
Complex aerosol properties can be significantly influenced by different continental and monsoon climates, which help absorb various anthropogenic particles, as well as desert dust in certain seasons (Mélin et al. 2010).Therefore, it is necessary to develop regional aerosol models suitable for different climate conditions in different regions in the future to describe the characteristics of different aerosols.At the same time, accurately calibrated paired datasets need to be expanded to validate and improve the accuracy and stability of aerosol models.In complex coastal environments, combination of multi-source remote sensing data can be considered.For example, the vertical structure of aerosols can be inferred from active remote sensing data such as LIDAR (Tian et al. 2009).In the future design of ocean color satellite sensors, payloads for concurrent measurements of aerosol and ocean color can be considered to provide more accurate aerosol information for AC.More detailed information is also conducive to the optimization of AC algorithm.
Polarization observation is also valuable for the improvement of AC accuracy.Some scientific research has identified that atmospheric products related to both clouds and aerosols can only be obtained with a multi-angle polarimeter which provides information for the simultaneous retrieval of R rs and aerosol properties (Dubovik et al. 2011;Hasekamp, Litvinov, and Butz 2011;Peers et al. 2015).Polarization measurements of scattered radiance in the NIR and SWIR domain provide determination of aerosol physical properties, i.e. size distribution and refractive index.Hence, the sensitivity of polarized radiance to aerosol types has the potential to improve AC, particularly in turbid coastal waters.He et al. (2014) provided further evidence of advantages of polarimetry for atmospheric correction over oceans.Polarization capabilities were added in the Pre-Aerosol, Clouds, and ocean Ecosystem (PACE) mission in addition to and complementary with the Ocean Color Instrument (OCI) (Waluschka et al. 2021).
Additionally, it is deemed necessary to apply an OWT classification before AC.This procedure would assist in better evaluating the performance of AC algorithms in local or global scales.Otherwise, the results can be affected.For instance, data out of the range for algorithm development could easily lead to failure, or the magnitude of adjacency effect in the visible bands could be underestimated (Bulgarelli and Zibordi 2018;Pereira-Sandoval et al. 2019).Furthermore, it would be helpful to have IOP data, like chlorophyll-a and CDOM, to improve the classification.

Other considerations
As stated above, uncertainties may appear even when executing the optimal AC for OLI images.In addition to the aforementioned reasons, these uncertainties can also be explained by the following reasons.Firstly, it is found that the inaccuracy during the removal of aerosol contribution is still the main source of error in R rs inversion (Warren et al. 2019).Aerosol models used by the AC processors do not necessarily accurately represent aerosol types over land and coastal waters.Secondly, environmental factors affect the accuracy of R rs retrieval.Ilori, Pahlevan, and Knudby (2019) investigated the influence of AOT, solar zenith angle (SZA), and wind speed.They found that AOT had no significant effect on the performance of L2gen and Acolite while SZA had a significant effect on the quality of L2gen-and Acolite-derived R rs .AOT and SZA were reported to reduce the quality of water leaving radiance obtained by SeaWiFS and MODIS sensors (Zibordi, Mélin, and Berthon 2006).The effect of wind speed on R rs is similar to that of SZA.Finally, ancillary information for the operation of atmospheric correction processors is a very important criterion, such as computational efficiency and output data size.The averaged time for each algorithm to process an entire image (computational efficiency) and the averaged size of R rs products are recorded and displayed in Table 5.It can be seen that L2gen-yielded R rs has the smallest size.Although the five algorithms were implemented on different configurations and environments, Acolite seems to have the largest computational efficiency, followed by L2gen.It should be noted that the data size of R rs data is related to the choice of output and the calculation efficiency is related to CPU and RAM.

Conclusion
AC is a prerequisite for quantitative applications of ocean color satellite data.This study provides an assessment of five AC algorithms (Acolite, C2RCC, iCOR, L2gen, and Polymer) for Landsat-8 OLI imagery over a variety of water types in global oceans.This is also the first time to match the most applicable AC algorithm for OLI based on accuracy or spectral similarity on a global scale.L2genproduced R rs demonstrated the best agreement with in situ measurements among the five algorithms, followed by Polymer, C2RCC, iCOR, and Acolite in order.Classification of OWTs was carried out using in situ measurements with matched satellite observations.OWT-specific analysis showed that L2gen produced the most accurate R rs for the five OWTs.Potential improvement for future AC algorithms was recommended, especially to address major concerns, such as sun glint and adjacency effect.Synoptic views from space-borne satellite sensors provide a unique vintage point to document and determine changes in marine environments and their drivers.Satellite observations play a significant role in supporting the framework for Sustainable Development Goals (SDGs), like Clean Water and Sanitation (SDG 6), Sustainable Cities and Communities (SDG 11), Climate Action (SDG 13), Life below Water (SDG 14), and Life on Land (SDG 15) (Sudmanns et al. 2020;Guo 2021).Accurate AC approaches, especially over turbid waters, is critical to leverage the maturity of existing operational ocean color algorithms and further pave the road for assessing the effects of climate change and anthropogenic activities on marine ecosystems.Our findings can provide deeper insights into the improvement of future AC schemes and thus our knowledge on longterm variations in oceans can be strengthened to contribute to the development of Digital Earth.

Figure 1 .
Figure1.Map of 20 sites (solid circles) with matched in situ radiometric measurements within ± 2 h of Landsat-8 overpasses.Dots of the same color represent sites given the same name listed in Table1.Names of six locations whose remote sensing reflectance (R rs ) spectra are shown in Figure7are annotated.Please refer to Table1for details. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Figure 3 .
Figure 3. Averaged R rs spectra for the five optical water types (OWTs).

Figure 4 .
Figure 4. Flowchart for matchup analysis and satellite data processing in this study.

Figure 5 .
Figure 5. Scatter plots of satellite-derived versus in situ measured R rs for visible bands of OLI/Landsat-8.The orange dashed lines refer to the 1:1 relationship and other colored lines represent the best-fitted relationships.Note that the scale for each band is different.

Figure 8
Figure8shows relative performance assessments of AC algorithms for the five OWTs.Overall, L2gen demonstrates good performance for all five OWTs, especially at 443 and 482 nm.In contrast, among the five algorithms, Acolite generally produces the poorest results for all five OWTs except for OWT5.It appears that Acolite is suitable for sediment-rich waters, consistent with previous findings(Renosh et al. 2020;Pereira-Sandoval et al. 2019).Polymer performs well for OWTs 1 to 3, especially at 561 nm, where it always illustrates the best results.Note that the performance of Polymer compromises in turbid coastal waters, such OWTs 4 and 5. C2RCC-derived R rs shows the highest accuracy for OWT 2 among the five algorithms and indicates the best performance at 655 nm.This can be attributed to the precise handling of low R rs values in the red bands by the neural network of C2RCC.The performance of iCOR has low ranking for OWTs 1 to 4. However, iCOR shows good performance for R rs at 443 and 482 nm for OWT 5, ranking the first and second, respectively.Figure9shows the comparison between averaged R rs spectra from in-situ and satellite observations using different AC algorithms for the five OWTs.L2gen-generated R rs spectra agree well with in situ measurements.In general, Acolite-obtained R rs spectra deviate the most from in situ data for all OWTs except for OWT 5.This further verifies that Acolite is suitable for the treatment of highly turbid water bodies.The spectral shape of Polymer-derived R rs is in good agreement with in situ measurements of OWTs 1 to 3, especially at 561 and 655 nm.Nevertheless, its performance degrades for OWTs 4 to 5. R rs from iCOR is generally overestimated.Table4lists the most suitable OWT-specific AC algorithm for each wavelength based on spectral shape preference via calculating SA.It can be seen that the spectral similarity for R rs from Polymer is the highest for OWT 1 with an SA of 1.15°.C2RCC has the smallest SAs for OWTs 2 and 4 with SAs of 4.92°and 1.31°, respectively.L2gen-derived R rs has the most similar spectral shapes to the in-situ data for OWTs 3 and 5 with SAs of 3.94°and 1.57°, respectively.These findings are consistent with those shown in Figure9.

Figure 8 .
Figure 8. Performance assessment of five AC algorithms for five optical water types (OWTs).Different colors indicate their rankings for the visible bands of OLI/Landsat 8. Warm color indicates better performance.(For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Figure 9 .
Figure 9.Comparison between averaged R rs spectra from in-situ and satellite observations using different AC algorithms for five OWTs.

Figure 10 .
Figure 10.In situ measured R rs spectra at six sampling sites.Only those with matched satellite observations are shown here.

Figure 11 .
Figure 11.Spectral angle for each AC processor.Smaller spectral angle would imply more similar shape.The red line represents the median and the boxes cover the 25-75% range.The upper and lower whiskers represent values outside the middle 50% (excluding outliers), defined as 1.5 × interquartile range (1.5 IQR).Box plot in depicts median (horizontal line), interquartile range (box), 1.5 × interquartile range (error bars), and outliers (black dots) which fall more than 1.5 times the interquartile range.(For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Figure 12 .
Figure 12.Comparison between averaged R rs spectra from field measurements and satellite observations using different atmospheric correction algorithms.

Figure 13 .
Figure 13.True color composite for 10 February 2016 over the area affected by the Atchafalaya River and the Mississippi River.Landsat-8 OLI-derived R rs spectra for five locations, i.e.A-E, representing different water conditions are shown.The colored line denotes the averaged R rs for the 30 × 30 pixels and the shaded color denotes standard deviation.(For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Figure 14 .
Figure 14.Quality assurance score (QAS) for OLI/Landsat 8-measured R rs from different atmospheric correction processors.The scene was collected on 10 February 2016.The higher QAS, the better data quality.

Table 1 .
. Names of six locations whose remote sensing reflectance (R rs ) spectra are shown in Figure7are annotated.Please refer to Table1for details. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)Information for locations with matched in situ radiometric and Landsat 8 OLI measurements, including station number, station name and their abbreviations, latitude, and longitude.

Table 2 .
Statistical results for remote sensing reflectance (R rs ) at four visible bands of Operational Lands Imager (OLI) Landsat 8 using five atmospheric correction (AC) algorithms, including Acolite, C2RCC, iCOR, L2gen, and Polymer.Statistical measures for comparison between in situ and satellite measurements include slope, intercept, R 2 , root mean square error (RMSE), absolute error (AE), bias, spectral angle (SA), quality assurance (QA) score.N denotes the number of data samples.Best metrics for each band are highlighted in bold.
Figure 6.Histograms for absolute error (AE), bias, root mean square error (RMSE), and R 2 for satellite-derived vs in situ measured R rs at visible bands of OLI/Landsat-8.

Table 3 .
ACs with the best performance for each station in terms of SA and average of AE (AAE) between in situ and satellite measured R rs .The number in brackets in the second column indicates the number of matched in situ and OLI measured R rs .The sixth to ninth columns represent ACs with the best performance for each visible band of Landsat 8 OLI for each station.
Figure 7. Radar map for each visible band of OLI in terms of bias, slope, intercept, R 2 , RMSE, and AE.4.2.Relative performance assessments for five optical water types

Table 5 .
Comparison among the five AC algorithms in terms of computation efficiency and output data size.