Ground-based validation of the Metop-A and Metop-B GOME-2 OClO measurements

This paper reports on ground-based validation of the atmospheric OClO data record produced within the framework of EUMETSAT’s Satellite Application Facility on Atmospheric Chemistry Monitoring (AC SAF) using the GOME-2A and GOME-2B instrument measurements covering the 2007-2016 and 2013-2016 periods, respectively. OClO slant column densities are compared to correlative measurements collected from 9 NDACC Zenith-Scattered-Light DOAS (ZSL-DOAS) instruments distributed in both the Arctic and Antarctic. Sensitivity tests are performed on the ground-based data to estimate the impact of the different OClO DOAS analysis settings. On this basis, we infer systematic uncertainties of about 25% (i.e. about 3.75x10^13 molec/cm^2) between the different ground-based data analyses, leading to total uncertainties ranging from about 26% to 33% for the different stations (i.e. around 4 to 5x10^13 molec/cm^2). Time-series at the different sites show good agreement between satellite and ground-based data, both for the inter-annual variability and the overall OClO seasonal behaviour. GOME-2A results are found to be noisier than those of GOME-2B, especially after 2011, probably due to instrumental degradation.


Introduction
The increase of chlorine and bromine species in the stratosphere, due to the anthropogenic release of long-lived halogenated compounds, has led to dramatic ozone losses in the polar winter stratosphere starting in the eighties (e.g. Solomon et al., 1988, 1990; Solomon, 1999).
In polar regions, the chemical destruction of ozone is strongly influenced by the polar vortex, which results from the large-scale descent of cold air masses during winter. The polar vortex is also associated with strong Coriolis-related circumpolar winds that prevent air mixing with lower latitudes. In the Northern Hemisphere, due to the inhomogeneous distribution of land masses, disturbances of the Arctic vortex by vertical propagation of planetary waves are frequent, while the Antarctic vortex usually remains stable and more or less symmetric until at least late spring (November).
During winter, temperatures inside the vortex can drop below the threshold for the formation of polar stratospheric clouds (PSCs), and heterogeneous reactions on PSC particles convert ozone-inert chlorine reservoirs (mainly ClONO2 and HCl) into ozone-destroying species (active chlorine, mainly Cl, ClO and ClOOCl), see, e.g., Solomon (1999). This chlorine activation is the prerequisite for ozone destruction by catalytic cycles like the ClO-ClO and the ClO-BrO cycles (McElroy et al., 1986; Molina and Molina, 1987) after the return of sunlight in the polar spring. OClO is mostly created by the reaction between ClO and BrO (ClO + BrO -> OClO + Br) (Solomon et al., 1987; Toumi, 1994; Renard et al., 1997). OClO has a very short lifetime of a few seconds in the sunlit atmosphere due to its photolysis (OClO + hν -> ClO + O), which prevents the build-up of significant amounts until large solar zenith angles (SZAs) are reached. Nighttime and twilight OClO are thus a good indicator of chlorine activation (Sessler et al., 1995; Renard et al., 1997; Tørnkvist et al., 2002). Although OClO is only formed in sizeable quantities during the night, solar backscatter measurements of OClO columns can be performed from space near the terminator, where the photolysis efficiency is reduced.
The emission of long-lived chlorine- and bromine-containing substances has been regulated since 1987, after the implementation of the Montreal Protocol and its amendments. As a result, atmospheric levels of the ozone-destroying precursor substances have decreased over the last decades. Monitoring of stratospheric chlorine and bromine contents remains important to assess the effectiveness of the regulatory measures taken, in particular in the context of climate change and its impact on ozone recovery.
Halogen oxides such as BrO and OClO can be measured using the Differential Optical Absorption Spectroscopy (DOAS) method (Platt and Stutz, 2008) owing to their structured absorption cross-sections in the UV and visible parts of the spectrum.
The first OClO retrievals from nadir satellite data were performed using the Global Ozone Monitoring Experiment (GOME) by Wagner et al. (2001, 2002), Burrows et al. (1999), Kühl et al. (2004) and Richter et al. (2005). This was followed by measurements from the Scanning Imaging Absorption Spectrometer for Atmospheric Chartography (SCIAMACHY; Kühl et al., 2006), the

The GOME-2 GDP 4.8 OClO retrieval algorithm is fully described in the corresponding Algorithm Theoretical Basis Document (Valks et al., 2019a) and detailed information about the development of the analysis can be found in Richter et al. (2015).
The DOAS retrieval is performed in the UV wavelength range 345-389 nm, which was found to minimise both bias and noise in the retrieved OClO slant columns. The fit includes NO2, O3, O2-O2 and the Ring effect (see Table 1). The GOME-2 key data parameter Eta (Valks et al., 2019a) is included as another effective cross-section to correct for residual polarization errors in the level-1 product. This inclusion significantly improves the OClO fitting residuals. Two empirical correction functions (derived from mean DOAS-fit residuals) are also included as additional (pseudo-)absorption cross-sections in the DOAS fit: a mean residual and a scan angle correction function. These two empirical functions correct for positive offsets and scan angle dependencies in the OClO columns. Remaining biases in the OClO columns (e.g. non-zero OClO columns over areas without chlorine activation), with temporal drifts observed mainly in the OClO data from GOME-2A (see Richter et al., 2015), need to be treated using an additional offset correction. A simple normalization is thus applied on an orbital basis. The mean OClO slant column for the area between 50°N and 50°S (a latitude region without chlorine activation) is determined for each GOME-2 orbit and subtracted from the retrieved OClO slant columns for the complete orbit, leading to normalized OClO slant columns (SCD). Typically, the offset amounts to a few (~1-4) x10^13 molec/cm^2.
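The orbital normalization described above can be illustrated with a minimal sketch. It is not the GDP 4.8 implementation: the function name `normalize_orbit`, the input layout (one list of SCDs and one list of latitudes per orbit) and the synthetic values are assumptions made for illustration only.

```python
def normalize_orbit(scd, lat, lat_limit=50.0):
    """Return (normalized SCDs, offset) for the pixels of one orbit.

    The offset is the mean SCD of all pixels between 50°S and 50°N,
    a region where no chlorine activation is expected.
    """
    reference = [s for s, la in zip(scd, lat) if abs(la) <= lat_limit]
    if not reference:
        raise ValueError("no pixels inside the reference latitude band")
    offset = sum(reference) / len(reference)
    return [s - offset for s in scd], offset


# Synthetic orbit (SCDs in units of 1e13 molec/cm^2): a constant offset of 2
# everywhere, plus an enhanced OClO signal at 70°S.
lat = [-70.0, -40.0, 0.0, 30.0, 75.0]
scd = [12.0, 2.0, 2.0, 2.0, 2.0]
normalized, offset = normalize_orbit(scd, lat)
print(offset, normalized)
```

In this synthetic case the offset falls within the ~1-4 x10^13 molec/cm^2 range quoted in the text, and the low-latitude pixels are brought back to zero while the polar enhancement is preserved.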
Table 1. DOAS settings used for the GOME-2 OClO retrieval in GDP 4.8.

As OClO photolyses rapidly, it can only be observed at large solar zenith angles close to the terminator. Under these circumstances, the calculation of an AMF and a vertical column is not trivial. It is complicated by rapid photolysis, the change in SZA along the line of sight, and also the uncertainty in the OClO vertical profile (Richter et al., 2005; Oetjen et al., 2011). Therefore, as done in previous studies, the GOME-2 GDP data product only contains (normalized) OClO slant column densities (SCD).

A flag indicates when valid (enhanced) OClO column values can be expected from the GOME-2 data. The OClO flag is set to 1 for daylight measurements with large solar zenith angle (85° < SZA < 89°) and to 2 for measurements during twilight (89° < SZA < 92°); see Valks et al. (2019b).
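As a minimal sketch of this flagging logic (the function name `oclo_flag` and the value 0 for measurements outside both SZA ranges are assumptions, and the treatment of the exact boundaries simply follows the strict inequalities quoted above):

```python
def oclo_flag(sza_deg):
    """Return the OClO flag for a solar zenith angle given in degrees."""
    if 85.0 < sza_deg < 89.0:
        return 1  # daylight measurement at large SZA
    if 89.0 < sza_deg < 92.0:
        return 2  # twilight measurement
    return 0      # assumed value: no valid OClO column expected


for sza in (80.0, 87.0, 90.5):
    print(sza, oclo_flag(sza))
```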
Figure 2 illustrates the GOME-2A and GOME-2B datasets by presenting the daily 90° SZA OClO SCD averages of both instruments, separated by hemisphere. As expected, OClO levels in the Southern Hemisphere are usually larger than in the Northern Hemisphere, and the year-to-year variability is larger in the latter. For example, lower chlorine activation levels are found in 2009 and 2013 in the Northern Hemisphere compared to other years. Outside the chlorine activation period, values should be very close to 0 in both hemispheres. This is usually the case in the first years of measurements of each instrument, especially in the Northern Hemisphere, although some negative or positive offsets (of up to 4 to 5 x10^13 molec/cm^2) and drifts appear for some of the years (e.g. 2010 in the Northern Hemisphere, or 2011, 2012 and 2013 in the Southern Hemisphere for GOME-2A).
In particular, GOME-2A in the Northern Hemisphere starts with a baseline close to 0 for the first 3 years, then jumps up in 2010 before slowly drifting down again to a 0 baseline in 2016. In the Southern Hemisphere, GOME-2A starts negative, drifts up until it is positive in 2010/2011, then jumps straight down again in 2011/2012 and stays negative. These results suggest that there is still room for improvement in the current GOME-2 analysis.

Comparison data and method

Ground-based NDACC ZSL-DOAS data
As stated in the introduction, OClO columns have been retrieved from the ground since 1986 using the DOAS technique.
For this study we selected 8 stations operating Zenith-Scattered-Light (ZSL) DOAS UV-Visible spectrometers from the Network for the Detection of Atmospheric Composition Change (NDACC, https://www.ndaccdemo.org/, last access on 28 June 2021), located above 60° latitude in both hemispheres and performing OClO SCD data retrievals. The geographical distribution of these instruments is represented in Fig. 3 and a more extensive description of the sites is given in Appendix A1. This dataset provides good temporal coverage, with some of the stations reporting observations over the whole Metop-A operation period (2007-2016). Good coverage of the Arctic and Antarctic regions is also achieved, with half of the stations in the Northern Hemisphere and half in the Southern Hemisphere. This ensemble of stations was also recently used for the validation of TROPOMI OClO SCDs (Pinardi et al., 2020).
Specific details on the OClO SCD analysis are given in Table 2. As further described in Sect. 3.2, ground-based measurements are extracted at the solar zenith angle of the recorded GOME-2 pixels, for optimal photochemical coincidence with satellite observations. A fixed reference spectrum selected outside of the activated vortex period ensures that no OClO contribution comes from the reference, providing in this way absolute slant columns. For the UToronto instrument in Eureka, some instrumental instabilities prevented the use of one yearly fixed spectrum for the analysis of some of the years, leading to a reduced temporal coverage of the comparisons (see Fig. 13 and 14).
From Table 2, it is clear that the ensemble of ground-based datasets is an aggregate of existing measurements and that there is no harmonization in the retrieval choices of the different groups processing the OClO data. Different wavelength regions were used by each group for the OClO analysis, depending mainly on the spectral range covered by the respective instruments (see Table A1 for the instrumental details). In most cases, retrievals were performed in the UV region between 345 and 392 nm. One exception is NIWA, who analysed its data in the visible spectral range (404-425 nm; Kreher et al., 1996). An illustration of the different OClO bands used in the different intervals is presented in Figure 4.
Another important difference is related to the OClO cross-section used, and its temperature. Most of the groups use the Kromminga et al. (2003) cross-sections, while IUPB adopted the Kromminga et al. (1999) dataset and UToronto the Wahner et al. (1987) dataset at 204 K. Moreover, among the groups having adopted the Kromminga et al. (2003) data, most used the 213 K dataset, while INTA and IUPH used the 233 K dataset.
Depending on the selected DOAS interval, the different groups include in their DOAS fit several other trace gas cross-sections (NO2, O3, BrO, O4) in addition to OClO. They also treat the Ring effect as a pseudo-absorber. Not all the absorbers are necessarily needed, especially when a small wavelength interval is considered. For example, the Ny-Ålesund IUPB analysis (365-388 nm) does not include O3 and BrO, while the Kiruna MPIC analysis (372-392 nm) does not include BrO. For the NIWA visible interval these two gases are also not necessary, while the water vapour cross-section should be considered.
In order to assess the uncertainties related to the use of different OClO DOAS fit settings by the different groups, we performed a series of sensitivity tests that are reported in the next subsection.

SCD error estimation
In this section, we summarize the ground-based SCD error estimation. The random component of the uncertainty is evaluated using results from DOAS retrievals performed by each group, and, for the systematic uncertainty, we perform sensitivity tests to evaluate the impact of applying different retrieval settings, as presented in Table 2. The details of the different sensitivity tests are presented in Appendix A2, and the results are summarized here and in the different tables.

Random errors:
Random errors on SCDs are estimated by each group as part of their DOAS analysis. As summarized in Table 3, median values for the different datasets range from 6 to 22% (i.e., between 1 and 3.3 x10^13 molec/cm^2) for SCD values of about 15±2 x10^13 molec/cm^2 (representative of OClO measurements in activated conditions, with median SZA values between 86° and 90°, depending on the station). These values are globally consistent with past literature estimates (about 2 x10^13 molec/cm^2 for Neumayer and Arrival Heights (Frieß et al., 2005), 4-10% at 90° SZA for NIWA at Arrival Heights (Kreher et al., 1996) and 20% for Ny-Ålesund data at 90° SZA (Oetjen et al., 2011)).

Systematic errors:
Systematic errors on OClO SCDs are estimated based on sensitivity tests performed using spectra recorded with the IUPB instrument in Ny-Ålesund during a few days in February 2014. As presented in Appendix A2, we investigated the impact of the main differences that can be identified in Table 2, i.e., first, the choice of the OClO cross-section source and its temperature, and second, the different wavelength ranges.
The estimated systematic errors related to the OClO cross-section range between 2 and 15% (see Fig. A1), with a total uncertainty of about 17% (Table A2). The values corresponding to each group's choice are indicated in the first column of the systematic uncertainty contributions in Table 3.
The errors due to the different groups' retrieval choices are estimated through regression analysis of each setting with respect to the median OClO SCD values of all the settings together (see Fig. A2). The results show compact regressions with RMS values generally smaller than 2x10^13 molec/cm^2, except for IUPH and MPIC. As discussed in Appendix A2, results for the latter two cases are likely biased due to the limited wavelength range (up to about 390.4 nm) of the Ny-Ålesund spectra.
All intercepts except for IUPH are small (smaller than 1x10^13 molec/cm^2, see Fig. A2), and the differences between the measurements reside mostly in the slope, meaning that the observed bias is mostly multiplicative. The values corresponding to each group's choice are indicated in the second column of the systematic uncertainty contributions in Table 3. The largest impact on the slope is obtained for the MPIC and UToronto cases, leading to a difference between all cases of about 18.5% (see Table A2). This value is considered as the maximum systematic uncertainty on the retrieval choice for the systematic uncertainty contribution in Table A2, leading to a total maximum systematic uncertainty of about 25% (i.e. about 3.75 x10^13 molec/cm^2 for an SCD value of about 15x10^13 molec/cm^2) when adding the contribution related to the OClO cross-section source.
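The regression of each retrieval setting against the median of all settings can be sketched as follows. This is only an illustration of the principle: the setting names, the synthetic SCD values, the use of ordinary least squares and the definition of the spread as the difference between the largest and smallest slopes are all assumptions, not the procedure of Appendix A2.

```python
import statistics


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def regress_against_median(scd_by_setting):
    """Regress the SCDs of each setting against the per-spectrum median."""
    rows = list(scd_by_setting.values())
    median = [statistics.median(values) for values in zip(*rows)]
    results = {}
    for name, scd in scd_by_setting.items():
        slope, intercept = linear_fit(median, scd)
        residuals = [s - (slope * m + intercept) for s, m in zip(scd, median)]
        rms = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
        results[name] = (slope, intercept, rms)
    return results


# Synthetic SCDs (units of 1e13 molec/cm^2) for three hypothetical settings
# applied to the same four spectra.
scd_by_setting = {
    "setting_a": [4.5, 9.0, 13.5, 18.0],
    "setting_b": [5.0, 10.0, 15.0, 20.0],
    "setting_c": [5.5, 11.0, 16.5, 22.0],
}
results = regress_against_median(scd_by_setting)
slopes = [slope for slope, _, _ in results.values()]
spread_pct = 100.0 * (max(slopes) - min(slopes))
print(results, spread_pct)
```

With purely multiplicative differences between settings, as in this synthetic case, the intercepts and RMS values vanish and the whole discrepancy shows up in the slopes.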
Expected systematic bias against GOME-2:
The expected systematic bias due to differences between each group's analysis and the GOME-2 OClO retrieval settings is investigated in a third test. This test (presented in Fig. A3) uses a methodology similar to that of the second test presented above, but the median OClO SCDs are replaced by the SCDs obtained by applying the GOME-2 settings defined in Table 1 to the Ny-Ålesund spectra, so that the scatter plots compare the SCDs obtained with each group's DOAS settings to those obtained with the GOME-2 settings. For each group, the total expected systematic bias on OClO SCDs consists of a first component due to the difference in the OClO cross-section used compared to Kromminga et al. (2003) (reported as the first number of the last column of Table 3) and a second component coming from the impact of the other settings, as obtained in Fig. A3. The total expected systematic bias on OClO SCDs with respect to the GOME-2 analysis ranges between 4% and 16% for the different stations (i.e., between 0.6 and 2.4x10^13 molec/cm^2 for an SCD value of about 15x10^13 molec/cm^2).

Table 3. Error estimates for the different OClO analyses at each station, in percent. The random uncertainty is estimated from the DOAS fit uncertainty for an OClO SCD of 15±2 x10^13 molec/cm^2. The systematic uncertainty is evaluated considering the impact of using different OClO cross-sections as well as different retrieval settings (see text and Fig. A1, A2, Table A2). The total uncertainty is calculated as the quadrature sum of random and systematic contributions. An estimate of the expected systematic bias with respect to the GOME-2 analysis settings is given in the last column (see text and Fig. A3).

The total uncertainty of the ground-based OClO SCDs, calculated as the sum in quadrature of the random uncertainty at each station and the maximum systematic uncertainty (25%; see Table A2), thus ranges from 26% to 33%, i.e. between 4 and 5x10^13 molec/cm^2.
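The quadrature sum can be checked numerically with the values quoted in the text (random uncertainties of 6 to 22%, a maximum systematic uncertainty of 25%, and a reference SCD of 15x10^13 molec/cm^2). The function name `total_uncertainty` is an illustrative choice.

```python
import math

SCD_REF = 15e13         # reference SCD in molec/cm^2, as used in the text
SYSTEMATIC_PCT = 25.0   # maximum systematic uncertainty in percent


def total_uncertainty(random_pct, systematic_pct=SYSTEMATIC_PCT):
    """Quadrature sum of random and systematic uncertainties, in percent."""
    return math.hypot(random_pct, systematic_pct)


for random_pct in (6.0, 22.0):
    total_pct = total_uncertainty(random_pct)
    absolute = total_pct / 100.0 * SCD_REF
    print(f"random {random_pct:.0f}% -> total {total_pct:.1f}% "
          f"({absolute:.2e} molec/cm^2)")
```

The two extremes of the random uncertainty reproduce, after rounding, the 26% and 33% bounds given above.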

SCD offset correction
The OClO SCD measurements used in this study are obtained using a fixed reference spectrum selected outside of the activated period, to make sure that no residual OClO is contained in this reference. Nevertheless, OClO SCD offsets are often observed in actual measurements, due either to instrumental effects leading to systematic spectral interferences with OClO absorption structures (e.g. thermal instabilities leading to changes in the instrumental spectral response) or to possible unknown atmospheric effects interfering with the OClO retrieval.
Such effects generally lead to a systematic bias on the retrieved OClO SCDs that can vary in time, but usually with a time constant that exceeds the duration of a twilight period.
To further mitigate the impact of such biases, an empirical correction was designed and systematically applied to the ground-based data sets.
The principle of this correction relies on the assumption that OClO bias sources are constant during a twilight period and therefore lead to an offset on the retrieved OClO SCDs. For each morning and evening twilight, we draw a Langley plot, i.e. a plot of the SCDs reported as a function of the OClO air mass factor (AMF). One example of such a plot is represented in Fig. 5, for the Harestua station on 13 January 2013. The AMF used for this purpose was empirically estimated from observed OClO SCDs recorded during a series of chlorine activation events of various strengths (see Fig. 6). The AMF is here defined as the ratio of the measured slant column to the vertical column estimated at 70° SZA, assuming that at this solar elevation a simple geometrical AMF can be used. The grey area in Fig. 6 indicates the range of the measured OClO AMFs, while the blue and green curves show photochemical AMFs calculated using the DISORT radiative transfer model coupled to PSC-Box and initialized with SLIMCAT 3D chemical transport model simulations, as explained in Hendrick et al. (2007). The red line represents the median value of the measured AMFs, which was used as input for the present analysis.
As can be seen in Fig. 5, a linear relationship is obtained between the empirical AMFs and the measured SCDs over a large range of SZA values.We also note that although the reference spectrum used to analyse these data was recorded well outside the activated period (in late April in this case) and therefore does not contain any sizeable OClO amount, the observed SCDs present an offset, i.e. the measured SCDs do not converge to zero for low AMF values.This offset is necessarily an artefact and should be removed to restore physically consistent SCD values.
It must be noted here that this approach is only applicable for observations covering a sufficiently large range of SZAs. The limit on the minimum solar zenith angle has been empirically set to 86°. For high-latitude observations during polar night conditions, when the SZA constantly exceeds 86°, an estimate of the offset was obtained by fitting a polynomial function to offsets derived during the illuminated periods.
Despite its empirical nature, this offset correction, which was derived independently for morning and evening data on each day, can be considered as objective as a) it is not linked to the satellite data and b) it is not based on subjective criteria such as the smoothness of the OClO timeseries.
This correction was applied to all ground-based datasets used in this study, except for NIWA measurements in Arrival Heights. At this site, the method could not be used due to the unavailability of daily sequences of OClO measurements covering a suitable range of SZAs.
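The Langley-plot offset correction described in this section can be sketched as a straight-line fit of the SCDs against the empirical AMFs for one twilight, whose intercept is then subtracted. This is a minimal illustration, not the code used for the study: the function names, the ordinary least-squares fit and the synthetic twilight data are assumptions.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def langley_offset_correction(amf, scd):
    """Fit SCD = slope * AMF + offset and return (corrected SCDs, offset).

    The intercept at AMF = 0 is interpreted as a spurious offset, assumed
    constant over the twilight period, and is removed from all SCDs.
    """
    _, offset = linear_fit(amf, scd)
    return [s - offset for s in scd], offset


# Synthetic twilight (SCDs in units of 1e13 molec/cm^2): a slope of 1.5 per
# unit AMF with a spurious offset of 3.
amf = [2.0, 4.0, 6.0, 8.0, 10.0]
scd = [1.5 * a + 3.0 for a in amf]
corrected, offset = langley_offset_correction(amf, scd)
print(offset, corrected)
```

After the correction, the SCDs converge to zero for low AMF values, which is the physical consistency criterion invoked in the text.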

Comparison method
For the comparison of GOME-2 and ZSL-DOAS data, a method similar to Richter et al. (2015

Comparisons of the daily coincidences are performed at each station for the whole available time-series. It should be noted that the number of points at SZA > 85° is not constant throughout the year at some stations. This is even more the case after the reduced swath configuration was adopted for GOME-2A in July 2013. During several periods of the year (depending on the location) no valid OClO SCD can be found, and such periods tend to be longer after 2013.
The approach of comparing slant columns (instead of vertical columns) relies on the assumption that satellite nadir and ground-based zenith-sky light paths are comparable at large SZA (Oetjen et al., 2011). In other words, satellite AMFs (AMF_sat_nadir) and ground-based AMFs (AMF_gb_zenith) are assumed to be similar. Oetjen et al. (2011) calculated differences of up to 4% for the two observation geometries between 89° and 91° SZA and of 13% at 80° SZA in Ny-Ålesund.
Zenith and nadir AMF calculations for one OClO activated day were performed here for conditions corresponding to 65°N, as shown in Figure 8. The simulations were performed using an implementation of the DISORT radiative transfer code accounting for the impact of photochemical enhancements along the light path at twilight (Hendrick et al., 2007). They confirm the Oetjen et al. (2011) results, with differences of up to 13% for SZA between 80° and 88°, and differences of up to -8% between 88.5° and 92° SZA. On average, over the 85° to 92° SZA range, the AMF difference is close to zero.

The presence of larger OClO columns in the austral winter and spring compared to the Northern Hemisphere was highlighted in past satellite studies (Wagner et al., 2001, 2002; Wittrock et al., 1999). Above the Antarctic, high OClO SCDs are usually observed after mid-May, with a large increase within a few days, reaching a maximum by mid-September and then quickly decreasing until the chlorine activation stops by late October (Wagner et al., 2001; Richter et al., 2005). Due to a less stable polar vortex, the year-to-year variability of OClO is larger in the Northern Hemisphere, so that only a few years are characterised by large activation events (Richter et al., 2005). The yearly variability in OClO SCDs is anti-correlated with the temperature variations and modulated by PSC formation (Weber et al., 2003).

Note that some GOME-2A daily mean points lie below the lower x-axis limit of -1x10^14 molec/cm^2 in Fig. 9, reaching down to -3.5x10^14 molec/cm^2 (e.g., in the case of Neumayer, this represents 27 data points out of a total of 1536). This is especially the case from 2011 onward, when the data are more negative, as also seen in Fig. 2 and discussed at the end of Sect. 2.
Each year, an enhanced OClO signal (from 20 up to 40 and 60 x10^13 molec/cm^2) is observed in August and September, followed by a decrease. The largest OClO columns are measured at Arrival Heights in 2012, 2013 and 2015, at Neumayer in 2013, 2014 and 2015, and at Belgrano in 2011, 2014 and 2015. There is some variability in the strength of the signal from year to year, but the daily variations are sampled in a coherent way from the ground and from space, with a general tendency for smaller (sometimes negative, especially for GOME-2A) OClO SCDs retrieved by the satellites during November to April, outside of the chlorine activation period.
A gap in the GOME-2A data is observed in October at the Neumayer station since 2013, due to the reduced swath of the satellite instrument. There are no satellite measurements within 200 km for both sensors between May and the end of July, which results in missing the start of the chlorine activation. Some more pronounced negative slant columns appear in the GOME-2A dataset after mid-2011, probably related to the degradation of the instrument. A quantitative comparison for different GOME-2A periods is also shown in Fig. 15 and discussed later on.
In Marambio, an enhanced OClO signal is observed in June, August and September, with a data gap in July. A day-to-day variability of several 10 x10^13 molec/cm^2 is visible in GOME-2B data (Figure 10) during the activated period. This behaviour is related to the intermittent probing of air masses that are on the edge of the Antarctic polar vortex. The ground-based data seem more sensitive to these rapid changes, resulting in higher peaks than observed with GOME-2A and GOME-2B. For this station, the averaging of the satellite data within 200 km could mix air from inside and outside the vortex. Tests with a smaller co-location radius were performed for this station, but led to similar results with fewer co-located points.
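The spatial co-location within 200 km mentioned above can be illustrated with a short sketch that averages the satellite pixels falling inside a given radius around a station. The haversine distance, the pixel layout `(lat, lon, scd)`, the function names and the coordinates used below are assumptions for illustration and do not reproduce the actual co-location code.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def colocated_mean(station_lat, station_lon, pixels, radius_km=200.0):
    """Mean SCD of the pixels (lat, lon, scd) within radius_km, or None."""
    selected = [scd for lat, lon, scd in pixels
                if great_circle_km(station_lat, station_lon, lat, lon) <= radius_km]
    return sum(selected) / len(selected) if selected else None


# Illustrative station position and synthetic pixels (SCDs in 1e13 molec/cm^2).
pixels = [(-64.2, -56.6, 10.0), (-65.0, -56.6, 20.0), (-70.0, -56.6, 100.0)]
print(colocated_mean(-64.2, -56.6, pixels))        # default 200 km radius
print(colocated_mean(-64.2, -56.6, pixels, 50.0))  # smaller co-location radius
```

Reducing `radius_km` limits the mixing of air masses from inside and outside the vortex, at the price of fewer co-located points, which is the trade-off noted for Marambio.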

Arctic
Comparisons at the four Arctic stations are shown in Fig. 13 and 14. It should be noted that Eureka and Ny-Ålesund are in the polar night until about February/March, so that ground-based measurements can only be made up to April/May.
After that period, SZAs are too low (smaller than 88°) to perform ground-based measurements of OClO.
At all stations, GOME-2A, GOME-2B and the zenith-sky DOAS instruments capture the seasonal cycle of the OClO SCDs similarly, as well as its day-to-day variations. Differences from year to year and station to station exist, but typically

The statistical analysis (presented in Fig. 11 and Fig. 12) leads to correlation coefficients from 0.51 (Harestua) to 0.88 (Ny-Ålesund) for GOME-2A and 0.74 to 0.81 for GOME-2B daily comparisons, with linear regression slopes around 0.79-0.92 and 0.65-0.98 for GOME-2A and GOME-2B, respectively.

Comparison summary
We now consider all the stations and focus only on the activated periods (July-August-September in the Southern Hemisphere and January-February-March in the Northern Hemisphere). Figure 16 summarizes the biases (offsets) between GOME-2 and ground-based ZSL-DOAS time-series using box-whisker plots of their differences at each site. Stations are ordered by latitude, from the Arctic (top) to the Antarctic (bottom). It is worth mentioning that although Eureka and Ny-Ålesund are close to each other in latitude (80°N and 79°N), they are far apart in longitude (Canada and northern Europe), which implies very different positions with respect to the polar vortex. This is also true for Arrival Heights and Belgrano, which are both at a latitude of 78°S but located on opposite sides of the Antarctic continent (see map in Figure 3). The figure indicates a general negative bias (up to around -8 x10^13 molec/cm^2) for both GOME-2 instruments at most stations, except for Kiruna and Marambio. The differences between GOME-2A and GOME-2B are of a few 10^13 molec/cm^2. Differences of the same order of magnitude are found, e.g., between the two Arrival Heights instruments. The median bias statistics of the individual comparisons are reported in Table 4 for each station and for both hemispheres, together with regression analysis statistics. In relative values, the station biases range from -53% to 8% for GOME-2A and from -78% to 13% for GOME-2B, the extremes being found at Eureka and Marambio, respectively.
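The bias statistics summarized by the box-whisker plots can be sketched as below. The absolute bias follows the median (SAT-GB) definition used in Table 4; the relative bias is here taken as the median difference divided by the median ground-based SCD, which is an assumption since the text does not spell out its definition. The function name and the synthetic data are also illustrative.

```python
import statistics


def bias_statistics(sat, gb):
    """Median and quartiles of SAT-GB, plus a relative bias in percent."""
    diff = [s - g for s, g in zip(sat, gb)]
    q1, _, q3 = statistics.quantiles(diff, n=4, method="inclusive")
    median_bias = statistics.median(diff)
    return {
        "median_bias": median_bias,
        "q1": q1,
        "q3": q3,
        # Assumed definition of the relative bias (see lead-in).
        "relative_bias_pct": 100.0 * median_bias / statistics.median(gb),
    }


# Synthetic daily means (units of 1e13 molec/cm^2) for one station.
gb = [10.0, 12.0, 14.0, 16.0, 18.0]
sat = [8.0, 9.0, 12.0, 13.0, 15.0]
stats = bias_statistics(sat, gb)
print(stats)
```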
Table 4. Summary of the regression parameters and bias between GOME-2A and GOME-2B and zenith-sky OClO SCD daily mean comparisons for the active months (January-February-March for the Northern Hemisphere and July-August-September for the Southern Hemisphere). Intercept, RMS and absolute biases (median (SAT-GB)) are in x10^13 molec/cm^2.

Figure 17 presents the results as a scatter plot, with GOME-2A values in red and GOME-2B values in green. It can be seen that GOME-2A results are slightly noisier than GOME-2B, with several outliers, a smaller correlation coefficient (0.8 compared with 0.87) and larger RMS values. As already mentioned, this is likely related to instrumental degradation effects and/or the different empirical corrections used for GOME-2A. Regression slopes are about 0.64 for GOME-2A and 0.72 for GOME-2B, with an intercept of about 2 x10^13 molec/cm^2 for GOME-2A and half of it for GOME-2B. Fig. 18 presents the same data but colour-coded according to the different stations. The small intercepts are representative of small additive biases, while the slopes smaller than unity are the largest contributors to the negative multiplicative bias. The small intercept can potentially be explained by the GOME-2 normalization correction (see Sect. 2), which subtracts any remaining positive OClO SCD in regions where no OClO is expected. The slope can potentially be explained by the different GOME-2 and ground-based DOAS fit settings and the corresponding SCD uncertainties (see Sect. 3.1.1). For GOME-2 there is, e.g., the impact of the mean residual or the scan angle empirical correction functions (see Sect. 2). The impact of the AMF differences highlighted in Fig. 8 also has a multiplicative effect. The smaller satellite SCDs found here for valid flags (i.e. SZA > 85°), compared to the ground-based ones, could potentially be compensated in the VCD by the AMF. However, Fig. 8 shows that AMF_sat is smaller than AMF_gb only for SZA > 88°.

Concentrating on the slopes of daily linear regressions at each station (Table 4), values around or better than 0.7 are found for GOME-2B, and often slightly smaller for GOME-2A. The intercepts are generally smaller than 2 x10^13 molec/cm^2, except at Kiruna (for both instruments) and at Neumayer for GOME-2A. RMS values are generally larger for Antarctic stations.
These results are to be put in perspective with the systematic bias estimated in Sect. 3.1.1 and summarized in Table 3. Some stations have larger expected biases than others (e.g. Eureka up to 15%) due to their DOAS settings choices, and in general, there is a total uncertainty within the ground-based datasets of about 26 to 33%, which is close to the remaining 36 and 28% multiplicative biases from the slope (slope values of 0.64 and 0.72 for GOME-2A and GOME-2B, respectively). When considering results grouped by hemisphere, the slope is larger in the Northern Hemisphere for GOME-2A (0.85 versus 0.61), while for GOME-2B results are more coherent (0.76 and 0.7). For GOME-2B the relative bias is very similar in both hemispheres (around -24%), while for GOME-2A it is about -5% in the Northern Hemisphere and -30% in the Southern Hemisphere. These numbers are within the EUMETSAT AC SAF GDP OClO product target accuracy of 50% and close to the optimal accuracy of 30% (Hovila and Hassinen, 2021).
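The link between the regression slopes and the multiplicative biases quoted above (0.64 and 0.72 corresponding to 36 and 28%) can be made explicit with a short sketch. The regression summary below uses ordinary least squares, which is an assumption since the regression type is not specified in the text; the function names and the synthetic data are also illustrative.

```python
import math


def regression_summary(gb, sat):
    """OLS fit sat = slope * gb + intercept, with correlation and RMS."""
    n = len(gb)
    mean_x, mean_y = sum(gb) / n, sum(sat) / n
    sxx = sum((x - mean_x) ** 2 for x in gb)
    syy = sum((y - mean_y) ** 2 for y in sat)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gb, sat))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    correlation = sxy / math.sqrt(sxx * syy)
    rms = math.sqrt(sum((y - (slope * x + intercept)) ** 2
                        for x, y in zip(gb, sat)) / n)
    return slope, intercept, correlation, rms


def multiplicative_bias_pct(slope):
    """Multiplicative bias implied by a regression slope, in percent."""
    return 100.0 * (slope - 1.0)


# Slopes reported in the text for GOME-2A and GOME-2B.
for name, slope in (("GOME-2A", 0.64), ("GOME-2B", 0.72)):
    print(name, round(multiplicative_bias_pct(slope)))

# Synthetic, noise-free example (units of 1e13 molec/cm^2): sat = 0.72 * gb + 1.
gb = [0.0, 10.0, 20.0, 30.0]
sat = [0.72 * g + 1.0 for g in gb]
slope, intercept, correlation, rms = regression_summary(gb, sat)
```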
To summarize, we can conclude that:
- The variability of the OClO column, from day-to-day fluctuations to the annual cycle, is captured consistently by all instruments.

Conclusions
We investigated the quality of the GOME-2A (2007-2016) and GOME-2B (2012-2016) OClO GDP 4.8 slant column datasets by comparing them to ground-based ZSL-DOAS measurements at a selection of 8 stations located in the Arctic and Antarctic. For the ground-based instruments, OClO spectral analyses were performed using fixed noon spectra recorded at low SZA in the absence of chlorine activation. Different DOAS analysis settings are used by different instrument teams, and the impact of these differences is quantified through dedicated sensitivity tests. This leads to an estimation of systematic uncertainties of about 25% maximum. Depending on the instrument, the random noise error was estimated to be between 6 and 22%. The total uncertainty of each ground-based dataset is estimated to be between 26 and 33%, depending on the site.
At each station, daily comparisons were performed by selecting satellite and ground-based SCD data pairs corresponding to similar SZA conditions, assuming similar AMFs in both nadir and zenith geometries. Using radiative transfer simulations, this assumption was shown to be valid within the SZA range of the measurements, confirming estimations from previous studies.
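As an illustration of the pairing step described above, the following Python sketch matches each satellite observation with the ground-based measurement closest in SZA. It is a minimal sketch only: the function name, the tuple layout and the 1° tolerance are hypothetical and are not taken from the paper, and the numbers are made up.

```python
def pair_by_sza(satellite, ground, max_sza_diff=1.0):
    """Pair each satellite observation with the ground-based one closest in SZA.

    satellite, ground: lists of (sza_deg, scd) tuples for one day at one station.
    max_sza_diff: hypothetical tolerance in degrees (not a value from the paper).
    Returns a list of (satellite_scd, ground_scd) pairs.
    """
    pairs = []
    if not ground:
        return pairs
    for sat_sza, sat_scd in satellite:
        # Ground-based measurement with the smallest SZA difference.
        gb_sza, gb_scd = min(ground, key=lambda obs: abs(obs[0] - sat_sza))
        if abs(gb_sza - sat_sza) <= max_sza_diff:
            pairs.append((sat_scd, gb_scd))
    return pairs


# Made-up values in molec/cm^2, for illustration only.
satellite = [(88.2, 1.9e14), (84.0, 1.1e14)]
ground = [(88.0, 2.3e14), (89.5, 2.9e14), (80.0, 0.6e14)]

pairs = pair_by_sza(satellite, ground)
print(pairs)
```

Here only the first satellite observation finds a ground-based counterpart within the tolerance; the second one is discarded rather than paired with a measurement taken under different SZA conditions.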
Daily mean OClO SCD time-series show that satellite and ground-based observations agree well at all stations, and display consistent seasonal and inter-annual variabilities. GOME-2A tends to be noisier than GOME-2B, especially after 2011, which is likely related to instrumental degradation effects combined with the possible impact of the different instrumental corrections applied to the two instruments.
Daily scatterplots based on data selected during chlorine activated periods give correlation coefficients of 0.8 for GOME-2A and 0.87 for GOME-2B, and regression slopes are 0.64 for GOME-2A and 0.72 for GOME-2B. These results fulfill the GOME-2 accuracy requirements for OClO, as stated in the EUMETSAT AC SAF Product Requirement Document, i.e. a target accuracy of 50% and an optimal accuracy of 30%.
Biases at each station are generally negative and close to -8x10^13 molec/cm^2 in the worst case (Arrival Heights IUPH).
Those biases do not seem to originate from the ground-based datasets since these were also used recently for TROPOMI OClO

- IUP-Heidelberg and NIWA jointly operate a UV-VIS spectrometer at Arrival Heights (77.83°S, 166.65°W), part of the New Zealand station Scott Base on Ross Island, since 1998 (Frieß et al., 2005). Another instrument was present at the station, operated by NIWA (Kreher et al., 1996), but stopped measurements in 2017. Both instruments provide OClO SCDs since 2007.

Systematic errors:
In a first test, OClO SCD analyses are performed in the 345-389nm range (as for the GOME-2 analysis window), with varying OClO cross-section sources (using the Wahner et al. (1987), the Kromminga et al. (1999) and the Kromminga et al. (2003) cross-sections at several temperatures), and fixing the other inputs, as summarized in Table A2. With respect to Kromminga et al. (2003) at 213K (used for GOME-2 analysis), regression analysis reveals slopes of 1.02 for Kromminga et al. (2003) at 233K, 0.97 for Kromminga et al. (1999), also at 213K, and 0.85 for Wahner et al. (1987) at 204K (see Fig. A1), i.e. a deviation of up to about 15% with respect to what is used for GOME-2 retrievals. This is consistent with Kromminga et al. (2003) reporting cross-section band peaks about 8% smaller than Wahner et al. (1987).
Considering the largest impact between results obtained with the different OClO cross-sections, we come to a difference of about 17% (corresponding to slopes ranging from 0.85 to 1.02).This value is used to quantify the first component of the systematic uncertainty in Table A2.The expected bias for each group's OClO cross-section choice is also reported for each station in Table 3.
For the second test (see Table A2), we fixed the OClO cross-section to Kromminga et al. (2003) at 213K and varied the other DOAS fit parameters in an attempt to match the different settings used by each group (wavelength interval, interfering species and their cross-section references, as in Table 2). Unfortunately, the Ny-Ålesund instrument does not cover the visible range: its spectra stop at 390.4nm, so the MPIC wavelength interval (372-392nm) cannot be entirely covered, and no analysis could be done in the visible interval used by NIWA.
Results of the regression analysis for each group's choice with respect to the median OClO SCD values are presented in Figure A2. In most cases, the regression is compact (correlation R larger than 0.945), except for MPIC (R=0.893). The RMS is also generally smaller than 2x10^13 molec/cm^2, except for IUPH and MPIC. Results for the latter two cases are likely biased by the limited wavelength range of the Ny-Ålesund spectra (up to about 390.4nm), which leaves the upper part of their wavelength intervals uncovered. Depending on the setting choices, the difference compared to the median OClO SCD can take the form of a multiplicative bias (slope different from 1) and/or an additive bias (non-zero intercept). In the tested cases, all intercepts except for IUPH are smaller than 1x10^13 molec/cm^2, so the observed bias is mostly multiplicative.
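The distinction between multiplicative and additive bias can be made concrete with a small regression example. The Python sketch below uses an ordinary least-squares fit on synthetic numbers; this is an assumption made for illustration, since the regression method used for Figure A2 is not specified here, and none of the values are measured data.

```python
import math


def linear_regression(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus correlation R."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Synthetic SCDs in units of 1e13 molec/cm^2 (illustrative values only).
reference = [1.0, 5.0, 10.0, 15.0, 20.0]
# A test retrieval with a 10% multiplicative bias and a 0.5 additive offset.
test_retrieval = [0.9 * v + 0.5 for v in reference]

slope, intercept, r = linear_regression(reference, test_retrieval)
print(f"slope={slope:.3f} intercept={intercept:.3f} R={r:.3f}")
```

A slope below 1 with a near-zero intercept corresponds to the "mostly multiplicative" situation described above, while a sizeable intercept would point to an additive offset between the two analyses.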
The largest impacts on the slope are obtained for the MPIC case (slope of 0.925) and for UToronto (1.11), leading to a difference between all cases of about 18.5% (slopes from 0.925 to 1.11). This value is taken as the maximum systematic uncertainty related to the retrieval settings in Table A2, leading to a total maximum systematic uncertainty of about 25% (see Table A2).
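The quoted totals are consistent with adding the individual contributions in quadrature. The Python sketch below checks this with the numbers given in the text (17% for the cross-section choice, 18.5% for the retrieval settings, 6 to 22% for the random noise); the quadrature sum itself is an assumption that happens to reproduce the quoted figures, not a statement of the exact procedure used.

```python
import math


def combine_in_quadrature(*components):
    """Root-sum-square of independent uncertainty components (in percent)."""
    return math.sqrt(sum(c ** 2 for c in components))


cross_section_spread = 17.0  # percent, slopes from 0.85 to 1.02
settings_spread = 18.5       # percent, slopes from 0.925 to 1.11

# Total systematic uncertainty from the two sensitivity tests.
systematic = combine_in_quadrature(cross_section_spread, settings_spread)

# Total uncertainty for the lowest and highest random noise estimates.
total_low = combine_in_quadrature(systematic, 6.0)
total_high = combine_in_quadrature(systematic, 22.0)

print(round(systematic), round(total_low), round(total_high))
```

Rounded to the nearest percent, this gives about 25% for the systematic part and about 26% to 33% for the total, in line with the values reported for the different stations.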
Expected systematic bias against GOME-2: A third test has been carried out (see Table A2), comparing each group's analysis to the OClO SCD obtained using the GOME-2 data retrieval settings (345-389nm range, see Table 1), as illustrated in Fig. A3.From this sensitivity test, the expected systematic bias for each group is estimated in comparison to the GOME-2 retrieval settings, ranging between 4% and 16% for the different stations.
Author contributions. GP carried out the validation analysis and the associated investigations, and wrote the manuscript. MVR and FH contributed input and advice at all stages of the scientific discussions and of the manuscript writing. MVR prepared the ground-based offset
mean residual and scan angle correction
An illustration of OClO SCD maps for the Arctic in February 2011 and the Antarctic in August 2015 is given in Fig. 1.

Figure 3. Geographical distribution and measurement time-periods of the UV-visible NDACC ZSL-DOAS instruments providing the correlative OClO measurements.

Figure 7 presents an illustration of the impact of the correction for the Neumayer ground-based dataset time-series. The original data is displayed in light grey and the corrected one in black. The same data set is also represented as a function of the SZA in the lower panel. As can be seen, in this case, the main impact of the offset correction is to reduce the apparent noise on the low values of the OClO SCD. During periods of strong activation, changes are generally minor.

Figure 5. Illustration of the Langley plot method used to estimate offset artefacts on OClO SCD measurements. This case was obtained in Harestua on 13 January 2013.

Figure 6. Illustration of the AMFs used for the Langley plots. The grey area indicates the range of the measured OClO AMFs, the red curve their median value, while the blue and green curves are AMFs calculated using the DISORT radiative transfer model coupled to PSC-Box and initialized with SLIMCAT 3D chemical transport model simulations.

Figure 7. Illustration of the offset correction impact on Neumayer data: (a) the time-series and (b) the SZA dependence.

Figure 8. OClO AMF calculations for 60°N from ground-based zenith and satellite nadir geometries.

Figure 9. Time series of GOME-2A (red) OClO daily mean slant column data co-located with ground-based (black) measurements performed at each Antarctic station. Lighter/transparent red color is used for GOME-2A when there are no ground-based measurements.

Figure 10. Time series of GOME-2B (green) OClO daily mean slant column data co-located with ground-based (black) measurements performed at each Antarctic station. Lighter/transparent green color is used for GOME-2B when there are no ground-based measurements.

Figure 11. Scatter plot of GOME-2A OClO slant column data co-located with ground-based measurements at each station.

Figure 12. Scatter plot of GOME-2B OClO slant column data co-located with ground-based measurements at each station.

Figure 13. Time series of GOME-2A (red) OClO daily mean slant column data co-located with ground-based (black) measurements performed at each Arctic station. Lighter/transparent red color is used for GOME-2A when there are no ground-based measurements.

Figure 15. Scatter plot between daily GOME-2A (red) and GOME-2B (green) GDP 4.8 satellite data and ground-based data at Neumayer station for all data (left) and the first 4 years of operation (right) of each satellite.

Figure 16. Box and whisker plot of the difference between all the GOME-2 and ZSL-DOAS OClO SCD pairs during active period months. Stations are ordered by decreasing latitude (South at the bottom). The box and whisker plots are defined as follows: crosses and lines for the mean and median values, boxes for the 25th and 75th percentile and dashed lines for the 9th and 91st percentile. Numbers on the right correspond to the number of days considered in the analysis.

Figure 17. Scatter plot between daily GOME-2A (red) and GOME-2B (green) GDP 4.8 satellite data and ground-based data for all the stations included in the study, during the active period months.

Figure 18. Scatter plot between daily GOME-2A (left) and GOME-2B (right) GDP 4.8 satellite data and ground-based data at the different stations included in the study, for the activated months (JAS for stations in the SH and JFM for stations in the NH). The stations are color-coded, and the total regression statistics are given as inset.

Figure A1. Regression analysis of OClO SCD retrieved from a common set of Ny-Ålesund spectra to investigate the sensitivity of OClO results to the cross-sections used. The different DOAS analyses correspond to those described in Table A2 for test 1, with respect to OClO values obtained using the Kromminga et al. (2003) cross-section at 213K, as in GOME-2.

Figure A2. Regression analysis of OClO SCD retrieved from a common set of Ny-Ålesund spectra to investigate the sensitivity of OClO results to different settings. The different DOAS analyses correspond to those used by each group for their own station analysis, as described in Tables 2 and A2 (test 2). Each set of OClO SCD is compared against median OClO values and regression statistics are given as inset in each plot.

Figure A3. Regression analysis of OClO SCD retrieved from a common set of Ny-Ålesund spectra to investigate the sensitivity of OClO results to different settings. The different DOAS analyses correspond to those used by each group for their own station analysis, as described in Tables 2 and A2 (test 3). Each set of OClO SCD is compared against the OClO values obtained using the GOME-2 retrieval settings described in Table 1 and regression statistics are given as inset for each plot.

Table A1. Information on ground-based DOAS instruments.