Validation of ESA Sentinel-2 L2A Aerosol Optical Thickness and Columnar Water Vapour during 2017–2018

This study presents a validation of aerosol optical thickness (AOT) and integrated water vapour (IWV) products provided by the European Space Agency (ESA) from multi-spectral imager (MSI) measurements on board the Sentinel-2 satellite (ESA-L2A). For that purpose, data from 94 Aerosol Robotic Network (AERONET) stations over Europe and adjacent regions, covering a wide geographical region with a variety of climate and environmental conditions and during the period between March 2017 and December 2018 have been used. The comparison between ESA-L2A and AERONET shows a better agreement for IWV than the AOT, with normalized root mean square errors (NRMSE) of 5.33% and 9.04%, respectively. This conclusion is also reflected in the values of R2, which are 0.99 and 0.65 for IWV and AOT, respectively. The study period was divided into two sub-periods, before and after 15 January 2018, when the Sentinel-2A spectral response functions of bands 1 and 2 (centered at 443 and 492 nm) were updated by ESA, in order to investigate if the lack of agreement in the AOT values was connected to the use of incorrect spectral response functions. The comparison of ESA-L2A AOT with AERONET measurements showed a better agreement for the second sub-period, with root mean square error (RMSE) values of 0.08 in comparison with 0.14 in the first sub-period. This same conclusion was attained considering mean bias error (MBE) values that decreased from 0.09 to 0.01. The ESA-L2A AOT values estimated with the new spectral response functions were closer to the correspondent reference AERONET values than the ones obtained using the previous spectral response functions. IWV was not affected by this change since the retrieval algorithm does not use bands 1 and 2 of Sentinel-2. Additionally, an analysis of potential uncertainty sources to several factors affecting the AOT comparison is presented and recommendations regarding the use of ESA-L2A AOT dataset are given.


Introduction
Aerosols and water vapour are two of the most important atmospheric constituents of the Earth-atmosphere system because they attenuate solar radiation through scattering and absorption processes, and also modulate terrestrial radiation mainly by absorption and emission. The effect of aerosols and water vapour on short wave (SW) irradiance was analysed by Obregón et al. [1], who showed that the combined effect of those quantities was not simply the sum of their individual effects. The effect of each constituent has been also studied by Alexandri et al. [2]. Atmospheric aerosols also Sentinel-2 MSI Level-1C products are atmospherically corrected through the implementation of Sen2Cor method [18], obtaining Level-2A bottom of the atmosphere (BOA) reflectance products (L2A). The method was developed for application over land surfaces but it can be also applied over water surfaces. However, it should be noted that the processor does not consider water surface effects such as sunglint. L2A products include BOA reflectance images, maps of AOT at 550 nm, IWV and scene classification (SCL), among others. Sen2Cor estimates AOT at 550 nm on the basis of the dense dark vegetation (DDV) pixel method introduced by Kaufman and Sendra [19]. AOT is derived from a correlation of short wave infra-red (SWIR) reflectance (band 12, see Table 1) with the blue (band 2, see Table 1) and red (band 4, see Table 1) reflectances. If there are no DDV pixels in the image, the AOT cannot be estimated and Sen2Cor uses a fallback solution. This solution is to perform the atmospheric correction with a constant AOT, which is specified by the start (default) visibility of 40 km, corresponding to an AOT of approximately 0.2 at sea level. IWV is retrieved from a Level-1C image by means of a Sentinel-2 adapted atmospheric pre-corrected differential absorption (APDA) algorithm [20], which uses spectral bands 8A and 9 (see Table 1). Band 8a is the reference Remote Sens. 2019, 11, 1649 4 of 15 channel in an atmospheric window region, whereas band 9 is the measurement channel in the water vapour absorption region. The total Sun-ground-sensor and path radiances are calculated for both bands, assuming that the surface reflectance is the same in both. The radiance ratio obtained between the measurement channel (water vapour absorption) and the reference channel (not influenced by water vapour) is then a measure of the water vapour column content (IWV). The scene classification is separated into the 11 classes presented in Table 2. The objective of this product is not to classify the land cover in a strict sense, but to distinguish between cloudy, clear and water pixels, to be used internally in Sen2Cor for atmospheric correction purposes. The accuracy requirement limits defined by ESA [21] for AOT and IWV equal 10% of the reference AOT value plus 0.03, and 10% of the reference IWV value plus 0.2, respectively. ESA provides L2A products based on an implementation of the Sen2Cor algorithm [13], but also provides an offline version of the Sen2Cor to produce L2A products [22]. In this study, the data used belong to the first option, and more specifically to the ESA-L2A products that are systematically generated at the ground segment over Europe and adjacent areas since March 2017. The Copernicus Open Access Hub provides complete, free and open access to Sentinel-2 data that are available to users in SENTINEL-SAFE format (https://scihub.copernicus.eu/dhus/#/home). The Sentinel-2 tiles (granules of 100 km × 100 km) were downloaded with cloud cover conditions lower than 90%. AOT and IWV products are provided with a spatial resolution of 20 m. Only pixels with SCL of dark area, vegetation or bare soils were considered.

AERONET
The aerosol robotic network (AERONET) products (https://aeronet.gsfc.nasa.gov/) were used here as ground-based reference values to validate Sentinel-2 ESA-L2A AOT at 550 nm and IWV. It is very common to use this reference data because: (a) The sunphotometer provides a near-direct measurement of the extinction, which is therefore much more reliable than the satellite measurements; (b) AERONET provides a globally distributed long-term and accessible public database of these products; (c) AERONET has a high level of standardization of instruments, calibration, processing and distribution [23]. Specifically, data from 94 AERONET stations distributed throughout Europe, western Asia and northern Africa have been used. Figure 1 displays the geographical location of the AERONET sites considered, located in various regions, coastal and inland, at different altitudes, latitudes and longitudes, representing a great variety of climate and environmental conditions. This constitutes a valuable set of reference data to assess the accuracy of Sentinel-2 level 2A AOT and IWV atmospheric products.
2018, when most stations used already had Level 2.0 data available when the study began.
The ESA-L2A AOT product is provided at the reference wavelength of 550 nm, however this is not a direct product of AERONET. Therefore, it is necessary to convert AERONET AOT at λ wavelength to AOT at 550 nm. The conversion has been performed using the Ångström power law (Equation (1)): (1) Where α is the Ångström exponent in the wavelength range of 440-870 nm and λ is the wavelength. The sites with less than 10 data points are represented by the grey dots. The location of the three stations used by Djamai et al. [14] are shown with a black star. The sites with less than 10 data points are represented by the grey dots. The location of the three stations used by Djamai et al. [14] are shown with a black star.

Methodology
All ground-based stations are equipped with CIMEL CE-318 sunphotometers, the standard instrument of the AERONET network. This instrument performs direct sun measurements in 9 channels covering the spectral range between 340 and 1020 nm. In particular, the 940 nm channel is used to retrieve IWV, since water vapour has a strong absorption at this wavelength. The estimated uncertainty in AOT under cloud-free conditions is 0.01 for wavelengths higher than 440 nm and 0.02 for shorter wavelengths [23,24]. For IWV, the uncertainty is less than 12% [23]. A more detailed description about the CIMEL sunphotometer, as well as the protocol applied by AERONET to process all radiance measurements and provide AOT, IWV and inversion products have been documented by Holben et al. [23]. AERONET version 3 algorithm processing includes three different quality levels (1.0, 1.5 and 2.0). Levels 1.0 and 1.5 (cloud-screened and quality controlled) are provided in near real-time, whereas level 2.0 (quality-assured) may present a delay of more than 12-months due to final calibration and manual inspection, ensuring the highest data quality. In this study, AOT and IWV from version 3 level 2.0 have been used. This means that the data at this level are not available until the sunphotometers are subjected to calibration which is usually done once a year. Hence, the period of data used in this study is from March 2017 (beginning of the ESA-L2A database) until December 2018, when most stations used already had Level 2.0 data available when the study began. The ESA-L2A AOT product is provided at the reference wavelength of 550 nm, however this is not a direct product of AERONET. Therefore, it is necessary to convert AERONET AOT at λ wavelength to AOT at 550 nm. The conversion has been performed using the Ångström power law (Equation (1)): where α is the Ångström exponent in the wavelength range of 440-870 nm and λ is the wavelength.

Methodology
The ESA-L2A retrieved AOT and IWV products were compared with independent space-timecollocated AERONET AOT and IWV quantities retrieved from the sunphotometer measurements. For this purpose, subsets of 9 km × 9 km, around the coordinates of each AERONET site were extracted from the MSI tiles, following the procedures of other validation studies (e.g., [12,14]). The procedure was limited at each site by the existence of AERONET level 2.0 data with a temporal difference of less than ±1 h from the satellite time overpass. The next step was to identify the valid pixels within the subsets for the validation process. Each of the subset pixels, with a spatial resolution of 20 m, is assigned a value of AOT, IWV and scene classification, among other products. In this study, only those pixels with scene classifications of a dark area (codified as 2), vegetation (codified as 4) or bare soils (codified as 5) were considered. Once these pixels were identified, their percentage was calculated, finally selecting those subsets with percentages greater than 50%. It is noted that pixels classified as water (scene classification 6), have not been taken into account in the calculation of this percentage. Subsequently, the AOT and IWV averages were computed for each subset with the pixels that met the defined criteria. The closest AOT and IWV AERONET data to the satellite overpass were selected for each station, considering a maximum difference of ±1 h from the satellite overpass, as mentioned before. AERONET data fulfilling this condition were then selected and linearly interpolated to estimate the value of these products at the exact time of the satellite overpass. The existence of AERONET AOT and IWV products, both in the previous and subsequent hour, is a necessary condition to perform such an interpolation.
After these collocation steps, the two initial databases (ESA-L2A and AERONET Level 2.0) are reduced to two composed of spatial and temporal coincident cases in a total of 2398 pairs of data. AOT and IWV of both databases have been compared in order to evaluate the accuracy of AOT and IWV ESA-L2A products. Common statistical indicators have been used in this study to obtain an appropriate comparative assessment of both datasets. The mean bias error (MBE), which is a measure of the overall error or systematic error, was calculated as follows: where N is the size of the sample corresponding to the total number of data, X are ESA-L2A products data and Xref are the ground-based reference values, i.e., AERONET Level 2.0 data. The root mean square error (RMSE) gives a measure of the error existing between ESA-L2A products data and AERONET Level 2.0 products data. Small values of the RMSE indicate that the regression fit is acceptable. This parameter is expressed as: Also, normalized values of the previous statistic indicators (NMBE and NRMSE) have been calculated to compare AOT and IWV.
Another statistical indicator is the determination coefficient (R2), which indicates the accuracy of the obtained regression fit from the comparison. This parameter, which shows how much the model is able to describe the data, should be close to 1. Figure 2 presents the scatter plots of the 2398 ESA-L2A retrievals of AOT (a) and IWV (b) versus the coincident reference AERONET Level 2.0 quantities retrieved from the ground-based measurements. This figure also shows the regression line, MBE, NMBE, RMSE, NRMSE and R2 values obtained for each quantity. The intercomparison of ESA-L2A and AERONET derived quantities showed a better agreement for IWV than for the AOT, with NRMSE of 5.33 and 9.04%, respectively. This conclusion was also reflected in the values of R 2 , which are 0.99 and 0.65 for IWV and AOT, respectively. From this figure, it was extracted that the NMBE corresponding to AOT was 3.75%, while for IWV, it presented a negative value of −3.24%. These results indicated that ESA-L2A AOT data overestimated the reference values, while the opposite occurred with IWV. The value of MBE for AOT (0.05) was lower than the value of MBE reported in Djamai et al. [14] (0.07). However, the value of MBE for IWV (−0.14 cm), although with the same sign, presented a higher absolute value than that reported by Djamai et al. [14] (−0.02 cm). It should be taken into account that these authors only used five data points belonging to three AERONET stations from the south of Europe (see Figure 1) during the period April 2017-October 2017. This was a reduced dataset compared with the 94 AERONET sites covering a wide geographical region with a variety of climate and environmental conditions considered in the present study. quantities showed a better agreement for IWV than for the AOT, with NRMSE of 5.33 and 9.04%, respectively. This conclusion was also reflected in the values of R², which are 0.99 and 0.65 for IWV and AOT, respectively. From this figure, it was extracted that the NMBE corresponding to AOT was 3.75%, while for IWV, it presented a negative value of −3.24%. These results indicated that ESA-L2A AOT data overestimated the reference values, while the opposite occurred with IWV. The value of MBE for AOT (0.05) was lower than the value of MBE reported in Djamai et al. [14] (0.07). However, the value of MBE for IWV (−0.14 cm), although with the same sign, presented a higher absolute value than that reported by Djamai et al. [14] (−0.02 cm). It should be taken into account that these authors only used five data points belonging to three AERONET stations from the south of Europe (see Figure  1) during the period April 2017-October 2017. This was a reduced dataset compared with the 94 AERONET sites covering a wide geographical region with a variety of climate and environmental conditions considered in the present study. (a) (b) Figure 2 also shows the accuracy requirement limits for AOT and IWV defined by ESA [21], represented with solid blue lines. As it can be observed, IWV meets the accuracy requirements better than AOT. Specifically, from the 2398 data points used in this study, 2167 IWV data points (90.4%) were within the limits defined by ESA, while, in the case of AOT, only 875 (36.5%) met the requirement. Another aspect to be highlighted in this figure was that the minimum value of ESA-L2A AOT was 0.06. This value is greater than the minimum value of AERONET AOT, which is close to zero. In fact, 26.9% of AERONET AOT data were lower than 0.06. With respect to IWV, it was also observed that the minimum value of ESA-L2A IWV, 0.15, was higher than the minimum value of AERONET IWV, which was 0.04. However, in this case only sixteen AERONET IWV data were lower than 0.15.

Results
According to information from ESA (https://earth.esa.int/web/sentinel/missions/sentinel-2/news/; last accessed 21/05/2019), a new version of the Sentinel-2A spectral response functions is used from 15 January 2018 onwards, mainly correcting the responses for bands 1 and 2 due to inaccurate spectral responses provided previously and used in the above retrievals before 15 January 2018. Band 2, centered at 492 nm, is used to estimate AOT, thus one of the reasons for the poor Figure 2 also shows the accuracy requirement limits for AOT and IWV defined by ESA [21], represented with solid blue lines. As it can be observed, IWV meets the accuracy requirements better than AOT. Specifically, from the 2398 data points used in this study, 2167 IWV data points (90.4%) were within the limits defined by ESA, while, in the case of AOT, only 875 (36.5%) met the requirement. Another aspect to be highlighted in this figure was that the minimum value of ESA-L2A AOT was 0.06. This value is greater than the minimum value of AERONET AOT, which is close to zero. In fact, 26.9% of AERONET AOT data were lower than 0.06. With respect to IWV, it was also observed that the minimum value of ESA-L2A IWV, 0.15, was higher than the minimum value of AERONET IWV, which was 0.04. However, in this case only sixteen AERONET IWV data were lower than 0.15. According to information from ESA (https://earth.esa.int/web/sentinel/missions/sentinel-2/news/; last accessed 21/05/2019), a new version of the Sentinel-2A spectral response functions is used from 15 January 2018 onwards, mainly correcting the responses for bands 1 and 2 due to inaccurate spectral responses provided previously and used in the above retrievals before 15 January 2018. Band 2, centered at 492 nm, is used to estimate AOT, thus one of the reasons for the poor agreement between ESA-L2A AOT estimates and AERONET AOT data could be the use of incorrect spectral response functions to estimate ESA-L2A AOT. This is investigated by dividing the data into two sub-periods, before and after the spectral response function update, and inter-comparing ESA-L2A AOT and AERONET AOT in the two sub-periods. The first sub-period includes the data obtained with the old spectral response functions, i.e., from the beginning of the study period (March 2017) until 14 January, 2018 and the second sub-period includes the data obtained with the new spectral response functions, from 15 January to December 2018. Figure 3 presents the scatter plots of AOT from ESA-L2A retrievals versus coincident AERONET Level 2.0 data for the two sub-periods. As it can be seen in this figure, the inter-comparison of ESA-L2A AOT and AERONET showed a better agreement for the second sub-period than for the first one, with RMSE values of 0.14 and 0.08, respectively for the first and second sub-periods. Therefore, ESA-L2A AOT values estimated with the new spectral response functions are closer to the reference values than the ones obtained with the previous spectral response functions. This conclusion was also verified through the calculation of the MBE values, obtaining 0.01 for the period after 15 January, 2018 compared with the 0.09 value obtained for the period before 15 January, 2018. The values of RMSE and MBE for the two sub-periods have also been compared with the values obtained for the whole period, displayed in Table 3. This indicated that the two statistical indicators have been drastically reduced for the sub-period with the new spectral response functions, compared with the entire period, being even smaller than the ones reported by Djamai et al. [14] of 0.07 (MBE) and 0.09 (RMSE). It can be concluded that with the use of these new spectral response functions, the quality of the ESA-L2A AOT estimates has significantly improved. The results of this study are compared to those reported by Bilal et al. [25], regarding the evaluation of AOT retrievals from MODIS using AERONET, in the same region. Despite the different spatial resolutions considered in both studies, the results obtained were very similar in terms of RMSE and MBE values. agreement between ESA-L2A AOT estimates and AERONET AOT data could be the use of incorrect spectral response functions to estimate ESA-L2A AOT. This is investigated by dividing the data into two sub-periods, before and after the spectral response function update, and inter-comparing ESA-L2A AOT and AERONET AOT in the two sub-periods. The first sub-period includes the data obtained with the old spectral response functions, i.e., from the beginning of the study period (March 2017) until 14 January, 2018 and the second sub-period includes the data obtained with the new spectral response functions, from 15 January to December 2018. Figure 3 presents the scatter plots of AOT from ESA-L2A retrievals versus coincident AERONET Level 2.0 data for the two sub-periods. As it can be seen in this figure, the inter-comparison of ESA-L2A AOT and AERONET showed a better agreement for the second sub-period than for the first one, with RMSE values of 0.14 and 0.08, respectively for the first and second sub-periods. Therefore, ESA-L2A AOT values estimated with the new spectral response functions are closer to the reference values than the ones obtained with the previous spectral response functions. This conclusion was also verified through the calculation of the MBE values, obtaining 0.01 for the period after 15 January, 2018 compared with the 0.09 value obtained for the period before 15 January, 2018. The values of RMSE and MBE for the two sub-periods have also been compared with the values obtained for the whole period, displayed in Table 3. This indicated that the two statistical indicators have been drastically reduced for the sub-period with the new spectral response functions, compared with the entire period, being even smaller than the ones reported by Djamai et al. [14] of 0.07 (MBE) and 0.09 (RMSE). It can be concluded that with the use of these new spectral response functions, the quality of the ESA-L2A AOT estimates has significantly improved. The results of this study are compared to those reported by Bilal et al. [25], regarding the evaluation of AOT retrievals from MODIS using AERONET, in the same region. Despite the different spatial resolutions considered in both studies, the results obtained were very similar in terms of RMSE and MBE values.    (Figure 4a), no seasonality was observed, and the statistical values for each season were very similar. It was also observed that ESA-L2A AOT values were higher than AERONET AOT values in all seasons. However, when the sub-periods were analysed separately, the behaviors were moderately different. For example, in Figure 4b, corresponding to the sub-period before 15 January, 2018, seasonality was observed in the AOT values of the two databases, obtaining the highest values in summer, and the lowest in winter. The small overlap between the two databases in winter was due to the minimum value of ESA-L2A AOT, which was 0.06, while the minimum value of AERONET was lower. However, the differences between the values of ESA-L2A and AERONET estimates were greater in this sub-period (for each season and the entire time) than when the entire study period was considered (Figure 4a). It should be noticed that in Figure 4b the boxes characterizing the inter-quartile ranges only overlap in autumn, whereas in Figure 4a, they overlap in all seasons and for the whole time period. On the other hand, if the second sub-period is used (Figure 4c) a greater overlap of both databases is obtained, which means that ESA-L2A AOT and AERONET AOT values show a better agreement when using the new spectral response functions in the ESA AOT retrievals. Moreover, the AOT values also showed seasonality, with the highest values in summer and the lowest in winter. The high AOT values in summer are due to a combination of factors that occur in this period of the year. One of them is the lack of precipitation, favoring the presence of aerosols in the atmosphere. Another factor is the arrival of desert-dust aerosols from the Saharan region and the summer forest fires, which are important sources of aerosols in the stations of the Mediterranean region [26,27] (South Europe and North Africa/Middle East). In this study, stations of the Mediterranean region predominate with respect to those of northern Europe, hence, these aerosol sources have a great influence on summer AOT values. This seasonal variation of AOT was also observed by other authors [28][29][30]. When the entire study period was considered (Figure 4a), no seasonality was observed, and the statistical values for each season were very similar. It was also observed that ESA-L2A AOT values were higher than AERONET AOT values in all seasons. However, when the sub-periods were analysed separately, the behaviors were moderately different. For example, in Figure 4b, corresponding to the sub-period before 15 January, 2018, seasonality was observed in the AOT values of the two databases, obtaining the highest values in summer, and the lowest in winter. The small overlap between the two databases in winter was due to the minimum value of ESA-L2A AOT, which was 0.06, while the minimum value of AERONET was lower. However, the differences between the values of ESA-L2A and AERONET estimates were greater in this sub-period (for each season and the entire time) than when the entire study period was considered (Figure 4a). It should be noticed that in Figure 4b the boxes characterizing the inter-quartile ranges only overlap in autumn, whereas in Figure 4a, they overlap in all seasons and for the whole time period. On the other hand, if the second sub-period is used (Figure 4c) a greater overlap of both databases is obtained, which means that ESA-L2A AOT and AERONET AOT values show a better agreement when using the new spectral response functions in the ESA AOT retrievals. Moreover, the AOT values also showed seasonality, with the highest values in summer and the lowest in winter. The high AOT values in summer are due to a combination of factors that occur in this period of the year. One of them is the lack of precipitation, favoring the presence of aerosols in the atmosphere. Another factor is the arrival of desert-dust aerosols from the Saharan region and the summer forest fires, which are important sources of aerosols in the stations of the Mediterranean region [26,27] (South Europe and North Africa/Middle East). In this study, stations of the Mediterranean region predominate with respect to those of northern Europe, hence, these aerosol sources have a great influence on summer AOT values. This seasonal variation of AOT was also observed by other authors [28][29][30].

Analysis of Potential Uncertainty Sources
Since the correlation between ESA-L2A AOT estimates and AERONET AOT data is not very good, it improves when the correct spectral response functions are applied. Hence, a sensitivity

Analysis of Potential Uncertainty Sources
Since the correlation between ESA-L2A AOT estimates and AERONET AOT data is not very good, it improves when the correct spectral response functions are applied. Hence, a sensitivity analysis to key factors, as the scene classification class, the region and the aerosol type, has also been performed.
As mentioned, in this study only pixels with scene classifications classes 2 (Dark_Area_PIXELS), 4 (Vegetation) and 5 (Bare_soils) were considered. In order to estimate the sensitivity of the intercomparison to scene classification, the study was performed individually for each scene classifications class. The results obtained are shown in Table 4, both for the whole period and for the sub-period after 15 January, 2018. From this table, the best agreement between the two AOT datasets was obtained for the scene class 4 (Vegetation), with RMSE of 0.08 and 0.10, respectively. This conclusion was also reflected in the values of R 2 , which were 0.74 and 0.81 for the whole period and for the sub-period after 15 January, respectively. Regional variation is also an important factor to take into account in the assessment of Sentinel-2 AOT values, since each region presents different environmental conditions and is typically influenced by different aerosol types. For this reason, the stations have been grouped into three regions (North Europe, South Europe, and North Africa/Middle East) in order to investigate the effect of regional variations on the ESA-L2A AOT product accuracy. The results obtained are presented in Table 5 and show that there are regional differences, with the best agreement obtained for the North Europe region and the worst agreement in North Africa/Middle East region, both for the whole period and for the sub-period after 15 January, 2018. The reason for the poor correlation between the two AOT datasets in the North Africa and Middle East regions is due to the assignment of AOT values of 0.2 (corresponding to the initial visibility assumed by the algorithm), as illustrated in Figure 5, which occurs when the scene does not contain reference pixels for the calculation. The sensitivity of the comparison to different aerosol types was analysed using the classification proposed by Elias et al. [31], who presented a criterion to classify the aerosol turbidity situations based on both the values of AOT and the Ångström exponent. The aerosol types proposed are clean (CL), continental (CO), maritime (MA), forest fires emissions (FF) and desert dust (DD). Table 6 shows the values of MBE, RMSE and R2 for each aerosol type, both for the whole period and for the sub-period after 15 January, 2018. From this table, CO type presented the best agreement between ESA-L2A AOT estimates and AERONET AOT data, with the lowest MBE and RMSE values. However, DD type presented the worst correlation due to the existence of a large number of cases with ESA-L2A AOT value equal to 0.2, showing the relationship between this aerosol type and the North Africa region. The most negative value of MBE corresponded to FF aerosol type. This means that there is an underestimation of ESA-L2A AOT in relation to AERONET AOT during these episodes, characterized by very high AOT values. Followed by these values of MBE were those corresponding to CL and MA types, although with positive values which indicated an overestimation of ESA-L2A AOT. This was because the minimum value of ESA-L2A AOT was 0.06, which was much higher than most of the lowest values of AERONET AOT, characterizing CL and MA aerosol types. Also the highest values of AOT were not well represented by ESA-L2A AOT product, as illustrated in Figure 5.      The previous sensitivity analyses allows the conclusion that ESA-L2A AOT estimates equal to 0.2 and AERONET AOT values lower than 0.06 and higher than 0.4 worsen the correlations between ESA-L2A AOT estimates and AERONET AOT data. Therefore, in the following analysis, AERONET AOT values lower than 0.06, higher than 0.4 and data corresponding to stations that present more than 20% of the ESA-L2A AOT values equal to 0.2 have not been considered. This analysis has been performed only for the sub-period after 15 January, 2018. The results obtained are presented in Figure 6 and constitute the recommended dataset to use for quantitative purposes. This figure shows that the correlation between ESA-L2A AOT estimates and AERONET AOT data is better (R 2 equal to 0.84) than that obtained without the exclusions described and shown in Figure 3 (R 2 equal to 0.72). The value of RMSE also decreased, from 0.08 to 0.07. This new AOT dataset meets the ESA accuracy requirements better than the complete AOT dataset presented in Figure 3 (without the exclusions described). Specifically, from the 838 data points used, 526 AOT data points (62.8%) were within the error limits defined by ESA, whereas previously only 36.5% of the AOT data met this requirement. The previous sensitivity analyses allows the conclusion that ESA-L2A AOT estimates equal to 0.2 and AERONET AOT values lower than 0.06 and higher than 0.4 worsen the correlations between ESA-L2A AOT estimates and AERONET AOT data. Therefore, in the following analysis, AERONET AOT values lower than 0.06, higher than 0.4 and data corresponding to stations that present more than 20% of the ESA-L2A AOT values equal to 0.2 have not been considered. This analysis has been performed only for the sub-period after 15 January, 2018. The results obtained are presented in Figure  6 and constitute the recommended dataset to use for quantitative purposes. This figure shows that the correlation between ESA-L2A AOT estimates and AERONET AOT data is better (R 2 equal to 0.84) than that obtained without the exclusions described and shown in Figure 3 (R 2 equal to 0.72). The value of RMSE also decreased, from 0.08 to 0.07. This new AOT dataset meets the ESA accuracy requirements better than the complete AOT dataset presented in Figure 3 (without the exclusions described). Specifically, from the 838 data points used, 526 AOT data points (62.8%) were within the error limits defined by ESA, whereas previously only 36.5% of the AOT data met this requirement. Therefore, it can be concluded that ESA-L2A AOT product should be used with caution in two cases: (a) Regions where the appropriate AOT value is often not estimated (due to the lack of DDV regions) and the value of 0.2 is assigned, and (b) regions with extreme visibility values (high or low), that is, sites that normally have very low or high AOT values. It would be advisable to use more than one default value (start visibility), according to the geographic area and season. These reference values could be obtained from a climatological analysis of ground-based or satellite reference data (AERONET for example). This would be important because in the cases with no DDV pixels available to obtain ESA-L2A AOT estimates, the AOT corresponding to the start visibility is used, which causes important errors as demonstrated here. Therefore, it can be concluded that ESA-L2A AOT product should be used with caution in two cases: (a) Regions where the appropriate AOT value is often not estimated (due to the lack of DDV regions) and the value of 0.2 is assigned, and (b) regions with extreme visibility values (high or low), that is, sites that normally have very low or high AOT values. It would be advisable to use more than one default value (start visibility), according to the geographic area and season. These reference values could be obtained from a climatological analysis of ground-based or satellite reference data (AERONET for example). This would be important because in the cases with no DDV pixels available to obtain ESA-L2A AOT estimates, the AOT corresponding to the start visibility is used, which causes important errors as demonstrated here.

Conclusions
This study contributes to the validation of AOT and IWV products provided by ESA from MSI measurements on board the Sentinel-2 satellite, respectively AOT ESA-L2A and IWV ESA-L2A. These products have been validated by comparison with data from 94 AERONET stations over Europe and adjacent regions and during the period between March 2017 and December 2018. The results indicated that ESA-L2A AOT data overestimated the reference values with NMBE of 3.75%, while the opposite occurred with IWV presenting a NMBE of −3.24%. The results showed a good agreement for IWV with NRMSE of 5.33% and R 2 above 0.99. However, a larger disagreement was obtained for AOT, with NRMSE of 9.04% and R 2 equal to 0.65. One possible cause for this lack of agreement in the AOT values could be due to the use of inaccurate spectral response functions, which impacted the retrieval of AOT ESA-L2A product from MSI measurements (the most affected were bands 1 and 2 and AOT retrieval uses band 2, among others). A new version of the Sentinel-2A spectral response functions is used from 15 January, 2018 onwards. Therefore, the study period was divided into two sub-periods, before and after 15 January, 2018. The inter-comparison of ESA-L2A AOT with AERONET AOT showed a better agreement for the second sub-period than for the first one, with RMSE (and MBE) values of 0.08 (0.01) for the second sub-period and 0.14 (0.09), for the first one. These values of RMSE and MBE were also smaller than the ones reported by Djamai et al. [14], of 0.09 and 0.07, respectively. The seasonal differences between AOT ESA-L2A and AOT AERONET data for the entire study period, the first sub-period and the second sub-period have also been analysed. The results showed no seasonality for the entire period. However, when the sub-periods were analysed separately, seasonality was observed, obtaining the highest AOT values in summer and the lowest in winter. For the first sub-period, the differences between the ESA-L2A and AERONET AOT estimates were greater in this sub-period than for the entire study period for all seasons. On the other hand, if the second sub-period is considered, a greater overlap of both databases is obtained for all seasons. Therefore, it can be concluded that with the use of these new spectral response functions the quality of the ESA-L2A AOT estimates has significantly improved. An analysis of potential uncertainty sources showed that the algorithm accuracy highly depended on the AOT first guess and that this was a critical issue over bright surfaces. It is recommended to update the algorithm in order to consider different initial visibility values, based for example, on reference climatologies.