Diurnal , seasonal and long-term variations of global formaldehyde columns inferred from combined OMI and GOME-2 observations

We present the new version (v14) of the BIRAIASB algorithm for the retrieval of formaldehyde (H2CO) columns from spaceborne UV–visible sensors. Applied to OMI measurements from Aura and to GOME-2 measurements from MetOp-A and MetOp-B, this algorithm is used to produce global distributions of H2CO representative of midmorning and early afternoon conditions. Its main features include (1) a new iterative DOAS scheme involving three fitting intervals to better account for the O2–O2 absorption, (2) the use of earthshine radiances averaged in the equatorial Pacific as reference spectra, and (3) a destriping correction and background normalisation resolved in the across-swath position. For the air mass factor calculation, a priori vertical profiles calculated by the IMAGES chemistry transport model at 09:30 and 13:30 LT are used. Although the resulting GOME-2 and OMI H2CO vertical columns are found to be highly correlated, some systematic differences are observed. Afternoon columns are generally larger than morning ones, especially in mid-latitude regions. In contrast, over tropical rainforests, morning H2CO columns significantly exceed those observed in the afternoon. These differences are discussed in terms of the H2CO column variation between mid-morning and early afternoon, using ground-based MAX-DOAS measurements available from seven stations in Europe, China and Africa. Validation results confirm the capacity of the combined satellite measurements to resolve diurnal variations in H2CO columns. Furthermore, vertical profiles derived from MAX-DOAS measurements in the Beijing area and in Bujumbura are used for a more detailed validation exercise. In both regions, we find an agreement better than 15 % when MAX-DOAS profiles are used as a priori for the satellite retrievals. Finally, regional trends in H2CO columns are estimated for the 2004–2014 period using SCIAMACHY and GOME-2 data for morning conditions, and OMI for early afternoon conditions. Consistent features are observed, such as an increase of the columns in India and central–eastern China, and a decrease in the eastern US and Europe. We find that the higher horizontal resolution of OMI combined with a better sampling and a more favourable illumination at midday allow for more significant trend estimates, especially over Europe and North America. Importantly, in some parts of the Amazonian forest, we observe with both time series a significant downward trend in H2CO columns, spatially correlated with areas affected by deforestation.

The authors provide a clear description of the products and tools used and the analysis applied. The paper is very well written and easy to read. Accompanying their analysis the authors give a good demonstrations of the limitations of the products and their interpretation while at the same time highlighting the value and possible applications of the data set. This is very important information for users of satellite derived HCHO products.
Their analysis also yields some interesting scientific results that reflect the changes in human activity on tropospheric chemistry such as emission reductions in Western Europe and the Eastern US or changes due to deforestation in the Amazon.
We thank the reviewer for their supportive comments. We have provided detailed responses to comments raised and have adjusted the manuscript accordingly where appropriate.
I have a few comments and questions I appreciate if the authors could address: (1) I am confused whether the SCIAMACHY time series was also processed with the new algorithm. In Section 6 GOME-2 and SCIAMACHY are used combined for analysis, so I assume both rely on the same retrieval algorithms? A good agreement between the two products is mentioned on Page 12265, line 5 with reference to Fig 14-16, but to me the graphs do not seem to make this clear. Have there been any more comprehensive comparisons to ensure consistency between the different products?
The SCIAMACHY and GOME-2 time series were not retrieved using exactly the same algorithm, because it is not possible. More precisely, the pre-fit of O4 and BrO is not possible with SCIAMACHY because of a spectral polarisation structure around 360 nm. However, H 2 CO SCD are finally retrieved in the same interval (328.5-346 nm), and using the same external datasets (as mentioned on page 12250, lines 10-13). The air mass factor calculation is done using the same algorithm. The consistency between the SCIAMACHY and GOME-2 retrievals has been presented in details in De 2012 (Figure 12).
In figures 14-16, the morning time series combine SCIAMACHY (2003SCIAMACHY ( -2011, GOME-2A (2007GOME-2A ( -2013 and B (from 2013) measurements. Therefore, the morning observations from 2004 to 2006 are only composed by SCIAMACHY. We used the mean of SCIA and GOME2 columns between 2007 and 2011. From 2012 onwards, only GOME-2 data can be used, as the SCIAMACHY time series stops in Mar.2012. These details have been added in the description of the figures.
(2) Also related to Fig 14-16 I wonder if the authors have an explanation for why the trend series over California is not significant for OMI and only significant for SCIA/GOME-2. For all other regions either both or only OMI show significance which as they mention is reasonable giving the higher HCHO columns at the OMI overpass.
The afternoon OMI time series does not show this positive trend, which is more in line with what is expected from anthropogenic emissions. Simulations with a 3D CTM are needed in order to assess the impact of, for example, fire events or temperature changes on biogenic NMVOC emissions (Stavrakou et al., 2014). We cannot go further in the interpretation of the observations at this stage.  Thanks. Corrected.
(4) Section 3.2: I could not quite follow how this correction is applied. It is stated that the median column over the Pacific is subtracted from the slant columns together with a polynomial latitudinal fit and then replaced by the latitudinal dependence of modelled HCHO columns (the same model as used for a priori I assume). Would one expect that over the Pacific the HCHO columns are then nearzero? And why are the corrected columns larger than the uncorrected columns. It would help if this part is rewritten to describe the individual steps in a very clear way.
The reference sector correction has been described more in details in our previous papers (De Smedt et al., 2008 andalso in De Smedt et al., 2014).
As daily radiance spectra (selected in the Equatorial Pacific) are used as background spectra in the DOAS fit, the retrieved slant columns actually represent the difference in slant column with respect to the slant column contained in the reference spectra. The SCD ( ) are therefore expected to be equal to 0 in the Equatorial Pacific region. However, this is not always the case, because of spectral artefacts, that can result in positive or negative biases in the SCD, and that depend on the latitude and season.
In the equation: = ∆ + ,0, ∆ are indeed forced to 0 in the Pacific Ocean, by the reference sector correction. Then, the vertical columns of the IMAGES model in the reference sector ( ,0, ) are added. This background value of H 2 CO, due to methane oxidation, ranges from 1 to 4x10 15 molec.cm -², depending on the latitude and on the season. It presents a smooth variation with the latitudes.

Response to reviewer #2
This paper provides a consistent long time record of satellite HCHO data from OMI and GOME-2 using an updated version of retrieval algorithm. The authors then validate the new HCHO retrievals using global ground-based DOAS HCHO observations regarding total HCHO columns, HCHO vertical profiles as well as seasonal and diurnal variations in HCHO columns. Finally, interesting global and regional trends in satellite HCHO columns are examined and identified based on the validated retrievals. The paper is generally well written and describes space-based observations and trend clearly and thoroughly. I think it is certainly appropriate for publication subject to the following additions/modifications. We thank the reviewer for their supportive comments. We have provided detailed responses to comments raised and have adjusted the manuscript accordingly where appropriate.
1. It needs to clarify whether aerosol scattering has been taken into consideration in computing scattering weights. If not, why? Along the same line of scattering weights, could the authors comment on uncertainties due to cloud? And to what extent could (changing) aerosol load impact the retrievals and eventually affect the HCHO trend seen from space in China and India?
Aerosol scattering is not taken into account in the scattering weight LUT, because aerosol scattering effects are very difficult to separate from cloud effects, and because the needed information on aerosol properties is not available at the global scale. A full treatment of aerosols in radiative transfer would be possible if clouds are aerosols were represented separately as scattering layers and if detailed information on aerosol optical properties was available at the global scale. However, the aerosol impact is partially accounted for by the cloud correction scheme. Indeed, to a large extent, the effect of the non-absorbing part of the aerosol extinction is implicitly included in the cloud correction, because the LER cloud algorithm is expected to overestimate the cloud fraction in the presence of aerosols (Boersma et al., 2004;2011). Furthermore, observations with cloud fractions exceeding 40% are filtered out, excluding a large part of the high AOD scenes (Theys et al., 2015).
In Leitao et al. (2009), the aerosol effects on the retrieval of tropospheric trace gases using UV/visible nadir measurements were investigated. They showed that, when realistic vertical profiles are applied, the aerosol effect has a relatively small impact. Theys et al. (2015) have also performed simulations with LIDORT using anthropogenic SO 2 and aerosols profiles retrieved from ground-based MAX-DOAS measurements performed in Beijing and Xianghe (Clémer et al., 2010). The results confirmed the limited effect of the aerosols on the air mass factors (maximum 15% in average, for very high aerosol loading). The explanation is a compensation of enhancement and reduction of the trace gas signal due to aerosols, because the vertical profiles of tropospheric gases and aerosols are mixed.
Finally, a trend in aerosols should result in a trend in the cloud fractions because the LER cloud product is sensitive to scattering aerosols. We have done a sensitivity trend study on the cloud fractions and on the tropospheric AMFs. We find a positive trend in cloud fractions over Beijing and India, but also a positive trend in the cloud altitudes. These effects have respectively a positive and a negative impact on the tropospheric AMFs, and no trend is observed in AMFs. Furthermore, the positive trends are already present in the H 2 CO slant columns.
In section 3.3.1, the following sentence has been added: "No explicit correction is applied for aerosols but the cloud correction scheme accounts for a large part of their scattering effect (Boersma et al., 2011). The uncertainty related to aerosol effects is estimated to be lower than 15% in average (Leitao et al., 2008;Theys et al., 2015). 2. In my opinion, it would be great to have some comparison between the updated HCHO retrievals from this paper and other available HCHO products, e.g., OMI HCHO from González Abad et al.
[2015] and OMPS HCHO from Li et al. [2015]. Some numbers in hotspot regions and background from different products are enough. Those 2 recent papers include comparisons between our OMI HCHO product and respectively the SAO OMI and OMPS HCHO products (Gonzalez Abad et al., 2015b) and the NASA OMPS HCHO product (Li et al., 2015). We provide here a summary of their comparison results. References to their papers have been added in the manuscript. However, we would prefer to keep a more in depth comparison of the different satellites products for another work/publication. Figure 6 in the recent paper of Gonzalez Abad et al. (2015b) show comparisons between BIRA OMI, and SAO OMI H 2 CO columns (here in Table 1). Seasonal changes and spatial patterns are similar between the two OMI products. The quantitative agreement is within 25% during peak seasons, while relative differences are larger for mid to low H 2 CO columns, from 50 to 80% in Indonesia. Figure 3 and Table S1 in Li et al. (2015) compares the seasonal patterns of BIRA OMI and OMPS PCA H 2 CO (here in Table 2). In general, the two H 2 CO retrievals demonstrate similar seasonal changes and spatial patterns. Quantitatively, the OMPS PCA H 2 CO retrievals are about 15-20% smaller than those from BIRA OMI in several source regions during the peak season. For example, the regional mean H 2 CO for the southeast U.S. for August 2013 is ~20% smaller than BIRA OMI H 2 CO. For months with lower H 2 CO loading, the two retrievals generally agree better in the tropics than in the middle latitudes. This may reflect greater uncertainty in both retrievals under more challenging conditions during winter such as larger solar zenith angle and weaker signals. Despite these differences, the overall agreement between the OMPS and OMI retrievals is encouraging and should lend confidence to both products, given that they are independently produced from two different instruments using fundamentally different approaches (PCA versus DOAS). For example, OMPS has lower spatial and spectral resolution but higher signal-to-noise ratio than OMI. The input data are different between the three algorithms, including the a priori H 2 CO profiles, surface reflectivity, cloud data, and also background H 2 CO corrections. More detailed analyses will be necessary to reconcile the quantitative differences seen between the data sets, but the level of agreement is within the reported error bars of the products. Aug.2013 15.00 ± 3.00 11.00 ± 3.00 -0.27 ± 0.29 3. I don't understand why GOME and SCIAMACHY HCHO retrievals are necessarily consistent with new OMI and GOME-2 retrievals. GOME and SCIAMACHY data were retrieved using old algorithm, if I understand correctly. Will difference in retrieval algorithms lead to uncertainties when combine all the data in looking at the long-term trend?
 The algorithms are as consistent as possible between sensors. The main difference is that 3 intervals are used for OMI and GOME2, O 4 and BrO are pre-fitted, while HCHO is retrieved in 328.5-346 nm. For GOME and SCIAMACHY, the quality of the recorded spectra does not allow to use the pre-fit windows, but HCHO is retrieved is the same window as for OMI and GOME-2 (328.5-346). The other retrieval parameters are the same (cross-sections, albedo, a priori profiles, …). GOME and SCIAMACHY HCHO datasets have been reprocessed several times since the 2008 publication (De Smedt et al, 2008).  Remaining differences between HCHO slant columns are corrected by the reference sector correction step. This removes global offsets, artificial zonal dependencies, and global longterm degradation effects.  Consistency between GOME, SCIAMACHY and GOME-2 is addressed in De Smedt et al., 2012.
It is not perfect everywhere but very satisfactory.  To our point of view, the main reason for inconsistencies between the satellite columns is the large differences in spatial resolution. GOME pixels being much larger than the other sensors.  In this paper, GOME is not included in our trend analysis. SCIAMACHY/GOME-2 and OMI datasets are treated separately. Furthermore, in the trend model, offset terms between SCIAMACHY and GOME-2 are fitted per region to account for possible effects of spatial resolution between the two time series (De Smedt et al., 2010;Hilboll et al., 2013). For these reasons, we think that if the trends are found consistent between morning and afternoon observations, they can be trusted.
4. GOME-2 has a larger footprint that OMI. Does this mean that GOME-2 is easier to be affected/contaminated by cloud? If so, it may not be fair to directly compare OMI and GOME-2 retrievals. This might also lead to bias in getting reliable diurnal HCHO profile, which is one of the main purposes of this paper, bracketed by GOME-2 and OMI. Can the authors comment on this? We have performed several sensitivity tests to obtain the diurnal variations from GOME-2 and OMI columns. We have compared the cloud free observations to the cloud corrected columns. The differences observed between GOME-2 and OMI are almost identical, all the more so as the averaging period is several years. As stated in the paper, cloud effects are more random than systematic, and are not thought to influence systematically the averaged observed diurnal variations.
It is clearly acknowledged in the paper that uncertainties on the satellites H 2 CO columns are large and therefore also the errors on their differences (as reflected in Table 4). Nevertheless, the sign of the differences between GOME-2 and OMI agrees well with both MAX-DOAS and FTIR measurements. Early afternoon values are almost always equal to or larger than mid-morning values, except in Bujumbura where morning columns are larger. 5. I don't understand why 100 km is used in getting average HCHO columns. Some back-ofenvelope calculations based on HCHO lifetime and local annual wind speed will be appreciated. A radius of 100 km is used to compare satellite columns to ground-based measurements. For this kind of comparison, and unlike inversion exercises using models, the smallest possible radius should be used. Using 100 km is a pragmatic choice, because using a smaller radius does not include enough satellite pixels to reduce sufficiently the noise of the data, particularly in the case of GOME-2. It is a trade-off between the spatial resolution and the quality of the H 2 CO time series. We added the following sentence in section 5: "Although larger than the typical length of air masses sampled by a MAX-DOAS spectrometer, which is less than a few tens of kilometres (Gomez et al., 2014), this radius allows including enough satellite pixels to ensure significant analysis." Gomez, L., Navarro-Comas, M., Puentedura, O., Gonzalez, Y., Cuevas, E., and Gil-Ojeda, M.: Long-path averaged mixing ratios of O3 and NO2 in the free troposphere from mountain MAX-DOAS, Atmos. Meas. Tech., 7, 3373-3386, doi:10.5194/amt-7-3373-2014, 2014 6. I'm not totally convinced by some potential driving factors of HCHO trend proposed in the paper.
Temperature should be the dominated driver of the interannual variability (IAV) in HCHO columns. The authors may want to acknowledge more that trends in temperature would have played a role in the changes/trends observed. The authors may also want to clarify why the instrument aging only plays a negligible role in the observed trend.
We agree, and have tried to make it clearer in the revised version of the paper.
Formaldehyde columns are mainly formed by oxidation of VOCs, from biogenic, biomass burning and anthropogenic sources. Column inter annual variabilities are mainly driven by fire events and temperature changes (Millet et al., 2008;Barkley et al., 2009;Stavrakou et al., 2014). However, over industrialised regions, changes in anthropogenic emissions have also been identified as drivers of observed H 2 CO column trends (De Smedt et al., 2010;Zhu et al., 2014;Khokhar et al., 2015;Mahajan et al., 2015;Stroud et al., 2015).
The instrumental degradation effects play a role in the long-term observations, which we try to take into account. The reference sector correction cancels any global variation of the H 2 CO columns. In the case of OMI, we also take into account the change in sampling along the years, by using "sampling-corrected" OMI columns. Finally, the VCD errors are used in the trend analysis, and errors are provided with trend estimates. The results of our trend analysis are displayed only when they have been found statistically significant, and the fact that equivalent trends are observed independently with both datasets gives confidence in our H2CO column long-term variation estimates.
Some minor comments: Page 12243, Line 17-18 "constrain NMVOC emissions in top-down inversion approaches" Not all of cited work involve inversion approaches, some of them instead simply assume linear relationship between HCHO columns and NMVOCs from a CTM to constraint NMVOC emissions. Right. We then removed "in top-down inversion approaches".
In the discussion paper, it is not RMA regression but a simple least squares solution, indeed forced to zero. This has been changed in the revised version. We now use a cubic least square regression, where the line is fit by minimizing both x-and y-residuals simultaneously for weighted data points. This analysis works well in Beijing/Xianghe, where we have good correlations both for GOME-2 and OMI, but not in Bujumbura, where the correlations are too low. Therefore, we now only provide mean differences in Bujumbura. A Table has been added with more detailed numbers for both stations.  Page 12263, Line 11-12, "bringing the satellites and ground-based observations to a satisfactory agreement within 15%." I'm not convinced. It seems to me that you have used information from observations in getting vertical profile. Of course you can get a better agreement.
Using the max-doas profile shapes to re-calculate the satellite AMFs is equivalent to smoothing the max-doas profiles with the satellite total column averaging kernels. It allows removing from the comparison the error associated to the a priori profile shapes (Eskes and Boersma, 2003). However, only the vertical distribution of the a priori profile impact the satellite AMFs, not their integrated columns (Palmer et al., 2001). It means that with the same max-doas profile shapes but integrated columns twice larger, the impact on the satellite AMFs would be the same, but the final column agreement would not be good. Therefore, if successful, this type of comparison validates the satellite slant columns, and the satellite AMF dependencies with albedo, clouds, … (all factors expect vertical profile shapes).
We added this precision in the manuscript: "It must be noted that the retrieved MAX-DOAS profiles also have their own uncertainties (Vlemmix et al., 2014), however using them to re-calculate the satellite AMFs allows to remove from the comparison the error associated to the a priori profile shapes (Eskes and Boersma, 2003). Indeed, only the shape of the a priori profiles impact the satellite AMFs, not their total columns (Palmer et al., 2001). The satellite averaging kernels (AKs) are much closer in shape to the FTIR AKs than to the MAX-DOAS retrievals, which may explain the better agreement of the columns (Vigouroux et al., 2009)." Page 12263, Line 24-26, "The effect of the rather coarse resolution of the global CTM on the modelled profiles (here 2.5°× 2.5°) needs to be further investigated, as well as possible other effects of vertical transport and chemical processes." Just a comment on this. The coarse CTM may be doing OK for regional background from biogenic sources (e.g, HCHO from isoprene) but may never be able to model urban/point source right on top of biogenic background. Agreed. Regional models would be more appropriate. We added to the sentence: "The effect of the rather coarse resolution of the global CTM on the modelled profiles (here 2°x2.5°) needs to be further investigated, particularly for anthropogenic sources, as well as possible other effects of vertical transport and chemical processes."