Seven years of global retrieval of cloud properties using space-borne data of GOME

We present a global and regional multi-annual (June 1996–May 2003) analysis of cloud properties (spherical cloud albedo – CA, cloud optical thickness – COT and cloud top height – CTH) of optically thick (COT > 5) clouds, derived using measurements from the GOME instrument on board the ESA ERS-2 space platform. We focus on cloud top height, which is obtained from top-of-atmosphere backscattered solar light measurements in the O2 A-band using the Semi-Analytical CloUd Retrieval Algorithm SACURA. The physical framework relies on the asymptotic equations of radiative transfer. The dataset has been validated against independent ground- and satellite-based retrievals and is intended to support trace-gas retrievals as well as to create a robust long-term climatology together with the subsequent SCIAMACHY and GOME-2 retrievals. We observed the El Niño-Southern Oscillation anomaly in the 1997–1998 record through CTH values over the Pacific Ocean. The global average CTH as derived from GOME is 5.6 ± 3.2 km, for a corresponding average COT of 19.1 ± 3.9.


Introduction
Clouds play an important role in the Earth climate system (Stephens, 2005; Heintzenberg and Charlson, 2009). The amount of radiation reflected by the Earth-atmosphere system into outer space depends not only on the cloud cover and the total amount of condensed water in the atmosphere but also on the size of droplets and their thermodynamic state. The information about microphysical properties, cloud top height and spatial distributions of terrestrial clouds on a global scale can be obtained optimally with satellite remote sensing systems. The amount of reflected solar light depends both on geometrical and microphysical characteristics of clouds. In particular, it is often assumed that clouds can be represented by homogeneous and (in horizontal direction) infinitely extended plane-parallel slabs. The range of applicability of such an assumption for real clouds is limited because 3-D effects are not taken into account and multi-layer cloud systems can occur. However, some properties can still be derived and valuable information can be retrieved.
At present, a number of relevant datasets of global cloud properties are available. They have been derived from different instruments and platforms, each with their own spatial, temporal and spectral characteristics, which are summarized in Table 1. Some of them are compared in the Global Energy and Water Experiment (GEWEX) activity in the framework of the World Climate Research Programme (WCRP) (Stubenrauch et al., 2009). Either backscattered ultraviolet and visible radiation or scattered/emitted infrared radiation is measured by passive satellite imagers. The following sensors have been used to infer cloud properties: the High resolution Infrared Sounder (HIRS, Wylie et al., 1994; Wylie and Menzel, 1999), the International Satellite Cloud Climatology Project (ISCCP, Rossow and Garder, 1993; Rossow and Schiffer, 1999), the Advanced Very High Resolution Radiometer (AVHRR, Jacobowitz et al., 2003), the Global Retrieval of ATSR (Along-Track Scanning Radiometer) Cloud Parameters and Evaluation (GRAPE, Poulsen et al., 2011; Sayer et al., 2011), the Multiangle Imaging SpectroRadiometer (MISR, Diner et al., 1989; Moroney et al., 2002), the Moderate Resolution Imaging Spectroradiometer (MODIS, Platnick et al., 2003; Menzel et al., 2008), the Atmospheric Infrared Sounder (AIRS, Stubenrauch et al., 2010), the Ozone Monitoring Instrument (OMI, Sneep et al., 2008; Vasilkov et al., 2008; Joiner et al., 2012), and the Scanning Imaging Absorption Spectrometer for Atmospheric Chartography (SCIAMACHY, Bovensmann et al., 1999; Kokhanovsky et al., 2007d). Existing cloud datasets derived from measurements in the O2 A-band are the Fast Retrieval Scheme for Clouds from the Oxygen A-band (FRESCO, Wang et al., 2008, see http://www.temis.nl/fresco/) and the Retrieval of Cloud Information using Neural Network (ROCINN, Loyola et al., 2010). New perspectives for cloud properties retrieval are offered by active sensors such as the Cloud Profiling Radar (CPR, Stephens et al., 2008) onboard CloudSat and the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP, Winker et al., 2007) onboard the CALIPSO platform. In these systems, the high vertical resolution is counterbalanced by the limited spatial coverage.
Though the main scientific objective of GOME (Global Ozone Monitoring Experiment, Burrows et al., 1999) is the retrieval of trace gases (Coldewey-Egbers et al., 2005; Meijer et al., 2006; Van Roozendael et al., 2012), its measurements are also relevant for the study of cloud parameters. GOME is a space-borne spectrometer that flew on the European Remote Sensing Satellite 2 (ERS-2) from April 1995; the ERS-2 payload has been switched off since July 2011. GOME measured reflected solar radiation in the wavelength range between 240 and 790 nm at a spectral resolution of 0.2 to 0.4 nm (see Table 2).
Clouds affect the path of light through the atmosphere and therefore change the depth of a gaseous absorption band as seen in reflected light. They act as reflectors and absorbers and their influence can be summarized in three components: firstly, they shield part of the troposphere, hiding the gas columns below; secondly, they enhance the absorption above and inside a cloud (due to light path enhancement), yielding an increased band depth; finally, they cause multiple scattering, as photons travel inside. The cloud properties that need to be known are cloud albedo, optical thickness and top height.
The aim of this paper is to describe the retrieval of such properties with SNGome (SACURA, the Semi-Analytical CloUd Retrieval Algorithm, Next Generation for GOME) and to assess the quality of the dataset through validation and comparison with other algorithms, based on different physical approaches. The structure of the paper is as follows. In Sect. 2 the algorithm is described. The solution of the forward and inverse problem is sketched as well as the extension to global processing. Section 3 is devoted to validation, both synthetic error analysis and against radar measurements and other retrieval techniques. In Sect. 4 we show results and global cloud patterns. In the final section we draw some conclusions.

SNGome algorithm description
It has been extensively proven that cloud top height can be retrieved from measurements in the oxygen A-band (758–778 nm) (Yamamoto and Wark, 1961; Saiedy et al., 1967; Fischer and Grassl, 1991; Kuze and Chance, 1994; Koelemeijer et al., 2001; Kuji and Nakajima, 2002; Rozanov and Kokhanovsky, 2004). When a cloud is idealized as a perfect reflector, every photon striking the cloud top is scattered back and is not absorbed by O2 molecules within or below the cloud. The depth of the absorption line therefore decreases as the cloud altitude increases, because most of the oxygen is located under the cloud.
In reality, two further aspects must be considered. First, the assumption of a cloud as a Lambertian diffuser with zero transmittance and fixed plane albedo leads to the underestimation of height, because smaller top-of-atmosphere (TOA) reflectances in the oxygen absorption band are misinterpreted as lower cloud layers (first noted by Saiedy et al., 1967). Gaseous absorption takes place throughout a cloud layer and does not stop at the cloud top. This effect has been proven in the context of the Optical Centroid Pressure (OCP) for OMI retrievals (Vasilkov et al., 2008; Sneep et al., 2008; Joiner et al., 2012). Second, it has been shown that the sole retrieval of top height will be biased low if no attempt is made to account for multiple scattering and its value will be closer to the altitude of the middle of the cloud (Ferlay et al., 2010).

The SNGome algorithm is based on the Semi-Analytical CloUd Retrieval Algorithm (SACURA, Kokhanovsky et al., 2003; Rozanov and Kokhanovsky, 2004). SACURA was originally developed at IUP Bremen for the application to SCIAMACHY measurements (Gottwald and Bovensmann, 2010; Burrows et al., 2011; Kokhanovsky et al., 2011). It consists of two parts: a forward semi-analytical parameterization of the cloud TOA reflection function and a numerical minimization for the retrieval. An extensive description can be found in Kokhanovsky et al. (2003), Rozanov and Kokhanovsky (2004), Kokhanovsky and Rozanov (2004) and Kokhanovsky and Nauss (2006).

Table 3 (fragment). Wavelengths: 335, 380, 416, 440, 463, 494.5, 555, 610, 670, 758, 772 nm; spectral resolution: 1 nm; spatial resolution: 1° × 1°.

The forward problem
The spectral top-of-atmosphere (TOA) reflectance is defined as

$$R = \frac{\pi I}{\mu_0 E_0}, \qquad (1)$$

where $I$ is the spectral radiance, $E_0$ is the spectral solar irradiance and $\mu_0$ the cosine of the solar zenith angle. The geo-referenced, calibrated and degradation-corrected spectral radiances $I$ are extracted from L0 GOME data with the aid of the GOME Data Processor 4 (Slijkhuis and Loyola, 2009). Due to the coarse GOME spatial resolution (i.e. 320 × 40 km²), two corrections are introduced to address the issue of broken cloudiness. It has been shown (Kokhanovsky et al., 2007a) that, as long as the cloud top height retrieval incorporates spectral ratios, the horizontal photon transport is of minor importance. Hence, if the cloud fraction value is known from an independent source, then it is reasonable to scale partially cloudy scenes to fully cloudy cases with the Independent Pixel Approximation (IPA) (Marshak et al., 1995) and to calculate the cloud TOA reflectance $R_{cl}$ from

$$R_{cl} = \frac{R_{mes} - (1 - c_f) R_s}{c_f}, \qquad (2)$$

where $R_{mes}$ is the measured reflectance. The value of cloud fraction $c_f$, defined as the fraction of the GOME pixel occupied by a cloud, is delivered by DLR (Deutsches Zentrum für Luft- und Raumfahrt) together with the GOME radiances and is based on the analysis of Polarization Measuring Device (PMD) records (Loyola and Ruppert, 1998; Loyola, 2004). The clear sky reflectance $R_s$ is substituted by a Minimum Lambert-Equivalent Reflectivity (MLER) value taken from the global database Tropospheric Emission Monitoring Internet Service (TEMIS, Koelemeijer et al., 2003, see Table 3). This climatological value has been derived from 5.5 yr of GOME observations. The TEMIS sub-pixels are co-located (see Eqs. 1 and 2 in Kokhanovsky et al., 2007c) and averaged. Second, the influence of the surface reflection on the top-of-atmosphere reflection of the cloudy scene is treated assuming that

$$R_{mes} = R_\infty - t K_0(\mu) K_0(\mu_0), \qquad (3)$$

where $t$ is the cloud transmissivity, $K_0(\mu)$ and $K_0(\mu_0)$ are the escape functions, $\mu$ and $\mu_0$ the cosines of viewing and solar zenith angles and $R_\infty$ the reflection function of an infinite layer, respectively. Arguments in $R_\infty$ and $R_{mes}$ are omitted for simplicity. The escape function can be approximated as

$$K_0(\mu) = \frac{3}{7}(1 + 2\mu), \qquad (4)$$

with an accuracy of 2 % at $\mu \geq 0.2$ (Kokhanovsky, 2006).
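The first steps of the forward problem can be illustrated with a minimal Python sketch. It assumes the standard reflectance definition R = πI/(µ0 E0), the IPA mixing relation R_mes = c_f R_cl + (1 − c_f) R_s solved for R_cl, and the escape-function approximation K0(µ) ≈ 3/7 (1 + 2µ). The function names and the numbers in the usage note are hypothetical and are not part of the SNGome code.

```python
import math


def toa_reflectance(radiance, solar_irradiance, sza_deg):
    """TOA reflectance R = pi * I / (mu0 * E0) for a given solar zenith angle."""
    mu0 = math.cos(math.radians(sza_deg))
    return math.pi * radiance / (mu0 * solar_irradiance)


def cloud_reflectance_ipa(r_mes, r_clear, cloud_fraction):
    """Scale a partially cloudy scene to a fully cloudy one (IPA).

    Solves r_mes = c_f * r_cl + (1 - c_f) * r_clear for r_cl.
    """
    if not 0.0 < cloud_fraction <= 1.0:
        raise ValueError("cloud fraction must be in (0, 1]")
    return (r_mes - (1.0 - cloud_fraction) * r_clear) / cloud_fraction


def escape_function(mu):
    """Approximate escape function K0(mu) = 3/7 * (1 + 2 * mu)."""
    return 3.0 / 7.0 * (1.0 + 2.0 * mu)
```

For instance, a scene with measured reflectance 0.3, clear-sky reflectance 0.1 and cloud fraction 0.5 would be scaled to a cloud reflectance of 0.5 under these assumptions.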
The value of $t$ is related to the cloud optical thickness (COT) $\tau$ by

$$t = \frac{1}{\alpha + 0.75\,\tau\,(1 - g)}. \qquad (5)$$

The asymmetry parameter g depends on the chosen phase function. We will assume that g = 0.859 (i.e. water clouds, Kokhanovsky, 2006). The parameter α is almost independent of the microphysics of clouds and is set equal to 1.07 (Kokhanovsky, 2006). The optical thickness τ is then calculated from the continuum outside the absorption band, at wavelength 758 nm, where almost no sensitivity to cloud top height is expected. Then it follows from Eqs. (3) and (5): where β = α α−1. This technique, also used in King (1987), applies to clouds with τ > 5. For the validation of cloud optical thickness, a set of MODIS Terra measurements was ingested and compared with two other algorithms (ATSK3/JAXA and MOD06/NASA, both based on a look-up-table approach; details are given in Nauss et al., 2005). SACURA retrievals exhibit a slightly higher mean than MOD06 and ATSK3 (18.5 versus 15.9 and 16.9, respectively) and deviate ±18 % on average from MOD06, with a stability index r² of 0.99. Since the intercomparison has been performed on the same measurement set, co-registration and scenario issues can be ruled out, and the discrepancies among the algorithms can be traced back to the different theoretical and algorithmic approaches.
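How the optical thickness follows from the continuum reflectance can be sketched in a few lines of Python. The sketch assumes the asymptotic relations t = 1/(α + 0.75 τ (1 − g)) and R = R∞ − t K0(µ) K0(µ0) for a non-absorbing cloud over a black surface; this may differ in detail from the closed form used in the paper, and the value of R∞ must be supplied by the caller (it is not computed here). All function names are hypothetical.

```python
ALPHA = 1.07      # value quoted in the text
G_WATER = 0.859   # asymmetry parameter assumed for water clouds


def transmittance(tau, g=G_WATER, alpha=ALPHA):
    """Cloud transmittance t = 1 / (alpha + 0.75 * tau * (1 - g))."""
    return 1.0 / (alpha + 0.75 * tau * (1.0 - g))


def optical_thickness(t, g=G_WATER, alpha=ALPHA):
    """Invert the transmittance relation for tau."""
    if not 0.0 < t <= 1.0 / alpha:
        raise ValueError("transmittance outside the range of the asymptotic formula")
    return (1.0 / t - alpha) / (0.75 * (1.0 - g))


def transmittance_from_reflectance(r_cl, r_inf, mu, mu0):
    """Assumed asymptotic form: r_cl = r_inf - t * K0(mu) * K0(mu0)."""
    k0 = lambda x: 3.0 / 7.0 * (1.0 + 2.0 * x)  # escape-function approximation
    return (r_inf - r_cl) / (k0(mu) * k0(mu0))
```

Chaining `transmittance_from_reflectance` and `optical_thickness` at 758 nm gives a continuum estimate of τ that is independent of cloud top height, which is the property the retrieval relies on.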
The values of geometrical cloud height h and thickness l are derived from measurements around the oxygen absorption centered at 761 nm (whose depth as seen in reflected light depends also on τ), with the nominal GOME spectral resolution and sampling (67 spectral points were used). In this case, the modeled reflectance $R_{TOA}$ is modified accounting for both gaseous absorption and multiple light scattering inside and below the cloud and has the following form

$$R_{TOA} = R_0 + T_1 R_b T_2, \qquad (7)$$

where $R_0$ gives the reflection function of the part of atmosphere above the cloud in the single scattering approximation. The Rayleigh and aerosol scattering and absorption coefficients are considered. The aerosol properties are taken from MODTRAN 2/3-LOWTRAN 7 (Kneizys et al., 1996) and correspond to a tropospheric model with ground visibility 23 km and boundary layer humidity 70 %, while the stratospheric aerosol is the so-called background aerosol (Kneizys et al., 1996). $R_b$ is the reflection function of the cloud-underlying atmosphere system together with the surface contribution, while the multipliers $T_{1,2}$ are the transmission coefficients from the Sun to a cloud and from the cloud to a satellite, respectively. Accounting in $T_{1,2}$ only for gaseous absorption (without aerosol and molecular extinction) diminishes the total extinction along the light path and results in the increase of the second term of the right-hand side of the above equation. This procedure makes it possible to account for multiple scattering above the cloud (Kokhanovsky and Rozanov, 2004). Moreover, the contribution of the atmospheric layer below the cloud is not neglected. Kokhanovsky and Rozanov (2004) illustrate how the aerosol-gaseous medium under the cloud and the underlying surface can be approximated by an effective Lambertian surface with albedo A.
The oxygen absorption within the cloud layer is taken into account in the term $R_b$. The main parameter is the atmospheric single scattering albedo (SSA) $\omega_0$, which changes in the presence of the cloud and depends on height inside the gaseous absorption band (Kokhanovsky and Rozanov, 2004). It can be written as

$$\omega_0 = 1 - \frac{\sigma_{abs}^{O_2}}{\sigma_{ext}}, \qquad (8)$$

where $\sigma_{abs}^{O_2}$ and $\sigma_{ext}$ are the oxygen absorption and the total extinction coefficients, respectively. Aerosol and cloud absorption in the visible are neglected. The value of the SSA for the effective homogeneous cloud layer is then calculated iteratively (the formulae are in Appendix A in Kokhanovsky and Rozanov, 2004). The accuracy of this approach is given in Yanovitskij (1997).

The inverse problem
The retrieval block of SNGome relies on the minimisation of the difference between the modelled and the observed TOA reflectances in the wavelength range 758–772 nm. It is assumed that the reflection function $R$ can be expanded in a Taylor series around the a priori value of the cloud top height $h_0$ as

$$R(h) = \sum_{i=0}^{\infty} \frac{R^{(i)}(h_0)}{i!}\,(h - h_0)^i, \qquad (9)$$

with $R^{(i)}(h_0)$ being the i-th derivative of $R$ corresponding to cloud top height $h_0$. It was found that the function $R(h)$ is close to a linear one over a broad interval of the argument (Rozanov and Kokhanovsky, 2004). Therefore, neglecting nonlinear terms in Eq. (9), it follows that

$$R(h) = R(h_0) + R'(h_0)\,(h - h_0). \qquad (10)$$

Having set the value of $h_0$ equal to 1.0 km, a value typical for low-level clouds, the actual cloud top height $h$ is calculated by minimizing the cost function

$$F = \left\| R_{cl} - R(h_0) - R'(h_0)\,(h - h_0) \right\|^2, \qquad (11)$$

where $R_{cl}$ is the TOA reflectance calculated with Eq. (2).
The retrieval of the pair (h, l) is accomplished by writing a vectorial form of the above equation and performing a two-parameter minimization (Rozanov and Kokhanovsky, 2004). Tests have shown that the retrieval is almost insensitive to different start values of $h_0$ and $l_0$. This is due to the fact that the solution for the two-parameter inverse problem is performed iteratively. In particular, the following values are set: $h_0$ = 1 km, $l_0$ = 100 m. The minimum difference $\delta(h_k, l_k)$ between the forward-calculated spectrum $R$ and the measured spectrum $R_{cl}$ is sought iteratively along the whole absorption band, where the index k = 0 ... N is the iteration number.
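The iterative linearization described above can be illustrated with a one-parameter sketch in Python. It is not the SNGome forward model: the toy function `forward`, its scale height and its absorption strengths are hypothetical stand-ins, and only the cloud top height h is retrieved (the operational scheme fits h and l together). Each iteration linearizes the spectrum around the current h and solves the least-squares problem for the height increment.

```python
import math

SCALE_HEIGHT_KM = 8.0               # hypothetical O2 scale height
ABSORPTION = [0.2, 1.0, 2.0, 4.0]   # hypothetical per-channel absorption strengths


def forward(h_km):
    """Toy spectrum: absorption by the O2 column above a cloud top at h_km."""
    column = math.exp(-h_km / SCALE_HEIGHT_KM)
    return [0.5 * math.exp(-a * column) for a in ABSORPTION]


def retrieve_height(r_cl, h0=1.0, max_iter=20, tol=1e-6, dh=1e-3):
    """Repeated linear least-squares update of h, starting from h0 (km)."""
    h = h0
    for _ in range(max_iter):
        modelled = forward(h)
        # Central finite difference for dR/dh in every spectral channel.
        deriv = [(up - down) / (2.0 * dh)
                 for up, down in zip(forward(h + dh), forward(h - dh))]
        # Minimizer of sum_j (r_cl_j - R_j(h) - R'_j(h) * step)^2 over step.
        step = (sum(d * (obs - mod) for d, obs, mod in zip(deriv, r_cl, modelled))
                / sum(d * d for d in deriv))
        h += step
        if abs(step) < tol:
            break
    return h
```

Calling `retrieve_height(forward(6.0))` starts at the a priori value of 1 km and should approach the height used to generate the synthetic spectrum, which mirrors the reported weak dependence on the start value.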
The retrieval of the cloud geometrical thickness l enables the calculation of cloud bottom height (CBH). It reflects the transmission of light through a single-layered cloud and must be allowed to vary. The error analysis for CBH has been reported in Rozanov and Kokhanovsky (2004) and Lelli et al. (2011), where a black and a moderately bright surface (albedo 0.2) have been considered. What we see is that the values are accurate and stable for CTH and CBH values in the range 1–16 km and τ values in range .
The correlated k-distribution accounts for the high-frequency oscillations of the oxygen molecular absorption coefficients. They are reproduced adopting the method of the "exponential sum fitting coefficients": five precalculated profiles of molecular oxygen cross-section (T-, P- and λ-dependent) are employed, multiplied by tabulated constants, and summed up to give the convolved wavelength-dependent monochromatic TOA reflection function (Buchwitz et al., 2000). In this work, a wavelength step of 0.05 nm is used. This method enables fast calculations with an accuracy within 2 % as compared with line-by-line calculations (Buchwitz et al., 2000). The temperature and pressure dependence of molecular absorption coefficients for a given location of measurements is accounted for using the standard atmosphere model (Brühl and Crutzen, 1993).
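The idea behind exponential sum fitting can be sketched generically in Python: a spectrally averaged quantity is approximated by a weighted sum of a few monochromatic calculations, one per fitted absorption coefficient. The five weights and coefficients below are hypothetical placeholders, not the tabulated constants of Buchwitz et al. (2000), and a simple Beer-Lambert transmittance stands in for the full reflection function.

```python
import math

# Hypothetical five-term fit; the weights are chosen to sum to one.
WEIGHTS = [0.35, 0.30, 0.20, 0.10, 0.05]
K_COEFFS = [0.01, 0.1, 1.0, 10.0, 100.0]


def band_average(monochromatic, weights=WEIGHTS, k_coeffs=K_COEFFS):
    """Weighted sum of monochromatic results, one evaluation per k term."""
    return sum(w * monochromatic(k) for w, k in zip(weights, k_coeffs))


def band_transmittance(absorber_column):
    """Band-averaged transmittance as a sum of exponentials in the column amount."""
    return band_average(lambda k: math.exp(-k * absorber_column))
```

Only five monochromatic evaluations are needed per spectral interval, which is where the speed-up over a line-by-line calculation comes from.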
Finally, the cloud spherical albedo r is calculated with the aid of Eq. (5), taking into account that t = 1 − r, if one neglects absorption processes. The error for r has been estimated to be smaller than 10 % at τ ≥ 6 and below 3 % at τ ≥ 10.

Table 4 (fragment). Retrieval flags: no convergence; 5: cloud top and bottom height convergence.
The technique has been validated by comparing retrieved values of r with airborne measurements, showing remarkable agreement (Kokhanovsky et al., 2007b).
For global processing we employ the digital elevation model SRTM30 (Earth Resources Observation and Science (EROS, USGS) Center, 2000). The fundamental sample spacing of 3 arc-seconds in latitude and longitude (≈90 m at the equator) has been down-sampled to 0.5 arc-minute in both coordinates (≈1 km at the equator).
The algorithm flags each retrieval in ascending order (see Table 4), depending on the quality of the simultaneous fits of cloud top and bottom height, given the value of cloud optical thickness calculated in the continuum outside the band.
In summary, the strengths of the algorithm are the semi-analytical forward parametrization of the TOA reflectances in the wavelength range of the oxygen A-band (but suitable to the broader range 0.4–2.4 µm) for clouds with τ > 5 and solar zenith angles ≤ 75°, the inclusion of molecular and aerosol scattering in clear sky condition and multiple scattering inside the cloud. In this way we avoid the common look-up-table approach.

Synthetic error analysis: single-layered cloud
The theoretical error investigation has been carried out generating forward spectra with the radiative transfer software package SCIATRAN (v. 3.1, Rozanov et al., 2012) in the Scalar Discrete Ordinate Method (S-DOM) mode. This is because polarization effects play a very small role in the O2 A-band. The calculated reflectances were ingested in SNGome. In the first case study, a single-layered water cloud of fixed geometrical thickness 1 km and optical thickness in the range angle equal to 0°. The absolute error of the retrieved top altitude is shown in Fig. 1. The error is in the expected value range (±0.5 km), in line with previous findings (Rozanov and Kokhanovsky, 2004), where the authors already pointed out the decreased sensitivity to oxygen absorption for high and thin clouds. However, such clouds cannot be detected by GOME anyway, as reported in Rozanov et al. (2006). Moreover, GOME is a UV-VIS instrument, lacking spectral coverage in the short-wave IR. This limitation means that no information on the cloud phase function is available beforehand for the retrievals, because only very weak absorption by condensed water takes place in the UV-VIS; hence no cloud particle size can be inferred. For this reason errors are introduced, because the phase function can only be guessed and the solar illumination geometry varies appreciably. Yet, in order to test the algorithm under realistic operational conditions, the choice has been to maintain a slight difference between the asymmetry parameters g in Eq. (5) used in the forward (g = 0.846) and inverse (g = 0.859) problem. This effect is depicted in the cloud optical thickness retrieval, whose relative error is shown in Fig. 2. Given the geometry in Fig. 2, a solar zenith angle of 38° corresponds to a scattering angle of 142°. Referring to Kokhanovsky (2006, Fig. 37, p. 152), we are in the region of the rainbow. An analytical error propagation study for a single spectral channel has been presented earlier by King (1987) and Kokhanovsky et al. (2003). For values of solar zenith angles → 90°, τ will be overestimated as a consequence of the increased light path through the atmosphere, which weakens the assumption of the plane-parallel geometry of our approach. In Fig. 3 the impact of bow regions is less evident. This error mitigation is due to the fact that only reflectances normalized to the average value of these functions outside the band (at λ = 758 nm) are ingested in the algorithm and that the retrieval itself is performed along the oxygen absorption band using 67 spectral points.
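The first-order effect of the asymmetry-parameter mismatch on the retrieved optical thickness can be estimated with a short Python sketch. It assumes the asymptotic transmittance relation t = 1/(α + 0.75 τ (1 − g)) with α = 1.07 and ignores the angular (rainbow-region) effects of the phase function discussed above, so it is an order-of-magnitude check rather than a reproduction of Fig. 2. The function name is hypothetical.

```python
def relative_cot_error(tau_true, g_forward=0.846, g_inverse=0.859, alpha=1.07):
    """Relative error in tau when the inversion assumes a different g."""
    # Forward step: transmittance of the "true" cloud.
    t = 1.0 / (alpha + 0.75 * tau_true * (1.0 - g_forward))
    # Inverse step: same relation, but with the retrieval's asymmetry parameter.
    tau_retrieved = (1.0 / t - alpha) / (0.75 * (1.0 - g_inverse))
    return (tau_retrieved - tau_true) / tau_true
```

Under this relation the error reduces to (1 − g_forward)/(1 − g_inverse) − 1, about 9 % for the two values quoted in the text, independently of τ.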

Synthetic error analysis: double-layered cloud
To address the issue of multi-layer clouds, which are likely to be sensed by GOME due to its spatial resolution, we have run synthetic tests for a two-layer system with a lower water cloud of COT = 10, CBH = 3 km, CTH = 4 km; the results are shown in Fig. 4. In the first case, the upper water layer was fixed at heights 13–15 km; in the second case, an ice cloud was simulated with a fractal crystal model (Kokhanovsky, 2006) of 50 µm side length and placed at 13–15 km as well. This height value has been chosen from the CALIPSO dataset (Sassen et al., 2008). The solar zenith angle was set equal to 30°, consistent with tropical latitudes. With increasing optical thickness of the upper layer, the curves show the retrieved cloud bottom (red curve) and top (blue curve) heights of the lower layer. The lower panel shows the total COT retrieved for both layers.
Inspecting the retrieval flags, we notice that the operational limit of geometrical thickness (11 km) is met at COT water = 4 and COT ice = 2 of the upper layer. Beyond that value, CTH and CBH are constrained and all successive retrievals are flagged 3. Given that our model assumes single-layered clouds, we would then reject retrievals flagged 3, above a limit height of 5 km.

Fig. 4 (caption fragment). COT 10 (lower layer), cloud C1 model (Deirmendjian, 1969), effective radius 6 µm. Upper ice layer parameters (lower panels): crystal fractal model, 50 µm side length (Kokhanovsky, 2006).
When looking at the lower plot of Fig. 4, the presence of an ice layer does not hinder the retrieval. However, we are not able, at the present stage, to discriminate the thermodynamic phase due to the lack of spectral coverage in the infrared by GOME. Besides, we do not process L1 reflectances lower than 0.15, therefore cirrus clouds are excluded and the algorithm is not triggered. Therefore only low-level ice clouds are present in the retrievals. In the single-layer approximation, we expect a stronger sideward scattering for an ice cloud as compared to a water cloud, due to the irregular shape of ice crystals as compared to water droplets. This implies an increase in the reflection function in the oxygen A-band at nadir, which means an overestimation of CTH. This effect can be seen by comparing the two plots. With increasing optical thickness of the upper layer, the retrieved CTH curve grows faster in the ice case scenario than in the water one. This leads to the increase of the global mean of CTH.

Comparison with other datasets
In order to test the soundness of SNGome cloud top height retrievals, we compare the results with ground-based measurements and with two different and independent space-based algorithms. The ground-based data are collected at three different Atmospheric Radiation Measurement (ARM) climate research facilities (Clothiaux et al., 2000) and at the Chilbolton Facility for Atmospheric and Radio Research (CFARR). The satellite-based retrievals come from the GRAPE (Poulsen et al., 2011; Sayer et al., 2011) dataset, made freely available via the British Atmospheric Data Center (http://badc.nerc.ac.uk/data/grape/), and from the Retrieval of Cloud Information using Neural Network (ROCINN, Loyola et al., 2007) dataset, operationally deployed by DLR. The basic idea behind this comparison is to gain insight into the strengths and weaknesses of each technique. They rely on three different physical principles: the ARM data are based on active measurements of a millimeter-wave cloud radar, the GRAPE data on the passive thermal measurements of ATSR-2, and the ROCINN data on the O2 A-band technique in the framework of the neural network approach. Clearly, different parts of the cloud are sensed and the intercomparison is not straightforward.

Satellite-and ground-based data
The Along-Track Scanning Radiometer (ATSR-2) (Stricker et al., 1995) is a dual-view sounder onboard the ESA ERS-2 space platform. It is the natural choice for comparison with GOME, since there is no temporal lag between the two instruments and the same cloud scene can be assumed. Even so, the limited across-nadir swath (≈500 km) of ATSR-2 reduces the number of co-registered retrievals of SNGome, resulting in a decreased spatial coverage. The description of the radar facilities used in this evaluation is given in Table 5.
The physical principle of the GRAPE algorithm is the cloud infrared (IR) brightness temperature as observed by ATSR-2 (Poulsen et al., 2011). Clouds located higher up in the atmosphere are generally colder. Local temperature profiles are used to match the derived cloud-top temperature with the equivalent cloud altitude. The main assumptions in the GRAPE retrieval scheme are: look-up-tables of atmospheric transmittance and reflectance (DISORT as radiative transfer code and MODTRAN for the gaseous absorption part); Lambertian surface (MODIS albedo product for 2002); cloud model as a single layer; pressure, temperature and H2O profiles according to ECMWF (ERA-40 dataset). More details on the algorithm can be found in Poulsen et al. (2011), while an evaluation of the data, and the criteria for data selection, are given in Sayer et al. (2011). SNGome data selection and properties are as follows: the ground-based site is inside the GOME pixel at a maximum of half of its size; the quality flags are 2 and 5; no restriction on fractional cloud cover has been made. Hence cloud fraction is in the range [0.17–1] for the investigated scenes (i.e. no overcast clouds).
The scenes are additionally classified as "deep clouds" if the top of the cloud is higher than 3 km and the vertical extent of the cloud is greater than half of its height, and as "shallow clouds" otherwise. We stress that "deep" clouds do not refer to the customary deep convective systems; the term instead emphasizes the vertically heterogeneous extent of the sensed scene, as can be seen in Figs. 9 and 10 in Sayer et al. (2011, pp. 3924–3925). This distinction has been made in view of the fact that vertically heterogeneous clouds might occur, in contrast to single-layered homogeneous ones, and has been adopted here for consistency with the results given in Sayer et al. (2011). A total of 51 co-registered overpasses have been selected for the deep cloud scenario and 15 overpasses have been matched for the shallow cloud scenario; the respective plots are given in Figs. 5 and 6. The statistics are shown in Table 6. First, the findings confirm what has already been explained by Sherwood et al. (2004), Rozanov et al. (2006) and Sayer et al. (2011). Infrared sounding techniques are affected by a systematic bias, as a consequence of the assumption that a cloud is a blackbody radiator in the IR; for that reason the profile matches at a higher temperature, placing the cloud too low. This effect can be seen in both cloud field types. Especially for deep clouds the simultaneous retrieval of top and bottom altitude seems to be more suitable, despite the fact that a single-layered cloudiness is assumed in the model. It has been shown that inference of both parameters, using the full spectral information in the A-band (Rozanov et al., 2004) or multi-angular measurements (Ferlay et al., 2010), mitigates this uncertainty. We recall here that, in order to account for the vertical photon penetration depth in GRAPE, a first-order correction was introduced in Sayer et al. (2011) and it resulted in a better comparison (smaller bias). However, this correction was not applied to GRAPE in the present study.
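The scene classification and the comparison statistics can be written down compactly. The following Python sketch assumes that "half of its height" refers to half of the cloud top height and that the average bias is the mean of radar minus satellite values, as in the Table 6 caption; the function names and any input arrays are hypothetical.

```python
import math


def classify_scene(cth_km, cbh_km):
    """Label a scene 'deep' if the top is above 3 km and the vertical extent
    exceeds half of the top height (assumed reading), 'shallow' otherwise."""
    extent = cth_km - cbh_km
    return "deep" if cth_km > 3.0 and extent > 0.5 * cth_km else "shallow"


def bias_and_correlation(radar, satellite):
    """Average bias (radar - satellite) and Pearson correlation coefficient."""
    n = len(radar)
    mean_r = sum(radar) / n
    mean_s = sum(satellite) / n
    cov = sum((a - mean_r) * (b - mean_s) for a, b in zip(radar, satellite))
    var_r = sum((a - mean_r) ** 2 for a in radar)
    var_s = sum((b - mean_s) ** 2 for b in satellite)
    return mean_r - mean_s, cov / math.sqrt(var_r * var_s)
```

A positive bias under this convention means the satellite places the cloud top lower than the radar does.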
In the shallow cloud case plotted as a function of radar facility (Fig. 7), the outliers originate from the site TWP-Nauru. From the climatological viewpoint, this site frequently exhibits westward downwind cloud trails (Henderson et al., 2006), which are, in turn, linked to aerosol production. It is therefore likely that, on the GOME pixel scale, the assumption of a single-layer cloud is not appropriate.

Table 6. Cloud top height (km), correlation coefficient r and average bias (Radar − Satellite, km) for 51 matches of deep (Fig. 5) and 15 matches of shallow (Fig. 6) clouds. The fractional cloud cover for the GOME pixels is in the range [0.17–1].
As an example, the radar reflectivity profile for 5 July 2001 has been plotted. Given a mean wind speed of 5 m s⁻¹ and a westward direction, the scene sensed by GOME is highly heterogeneous (see Fig. 8). We see three distinct layers. At the overpass time of GOME, the radar CTH was 7.4 km, this being the intermediate layer. SNGome CTH was 13.02 km (COT 10.26). GRAPE placed the cloud at 4.82 km (COT 2.2), which is the layer of radiative cloud height.
Clearly the uppermost layer was retrieved by SNGome, handling the space between layers as if it were a single cloud slab.
Overall, where the satellite retrievals deviate from radar top height, they exhibit opposite signs, backing the idea of synergistic use of oxygen A-band and infrared techniques. Therefore, the profiling capabilities of the former together with the radiative sounding of the latter can result in value-added datasets and should not be rejected for future instruments' design.

Satellite-based data
The Retrieval of Cloud Information using Neural Network (ROCINN) algorithm (Loyola, 2004; Loyola et al., 2007) uses the oxygen absorption band and a combination of look-up-tables of forward reflectivities and a neural network to deliver cloud top height (pressure) and albedo, with the same cloud fraction used in SNGome and calculated with OCRA (Optical Cloud Recognition Algorithm, Loyola and Ruppert, 1998; Loyola et al., 2007). We compare the same dataset as described in the work of Rozanov et al. (2006). Four separate GOME orbits (15 453, 16 910, 18 366, 19 537, for a total pixel number of 2422) were selected, which are considered to be representative of climatological and geometrical illumination conditions. Such orbits have been operated in enhanced narrow observation mode (i.e. ground pixel size 80 × 40 km²), thus the results can be extended to instruments with equivalent spatial resolution such as GOME-2 (80 × 40 km²) and SCIAMACHY (30 × 60 km²).
For the large GOME pixel size, an error in cloud fraction impacts the cloud top height retrieval. Assuming the ATSR cloud fraction as the true one (due to the better spatial resolution), we show in Fig. 9 the CTH bias of the two O2 A-band algorithms versus the CF bias (defined as ATSR CF − OCRA CF) shared by them. OCRA itself slightly overestimates CF compared with ATSR (as already reported in Tuinder et al., 2004). However, there is no evidence of a CTH bias cluster in the plot against IR retrievals for both O2 A-band algorithms. This is an indirect corroboration of the validity of the independent pixel approximation.
The comparison between SNGome and ROCINN discloses a cluster of retrievals where CF underestimation leads to a slight CTH overestimation. This cluster corresponds to the low-level clouds of 2-3 km height in Fig. 10. All other parameters being equal for SNGome and ROCINN, this bias can be explained by the enhancement of radiation backscattered to the platform because of the higher fractional cloud cover. Only in this scenario does the assumption of a Lambertian cloud model lead to a CTH overestimate with respect to a model where multiple scattering is taken into account.
Overall, ROCINN tends to underestimate CTHs with respect to SNGome (the statistics of the four orbits are given in Table 7). A negative bias of −0.63 ± 1.46 km (Loyola et al., 2007) and, more recently, of −0.44 ± 1.26 km (Loyola et al., 2010) have been found, where the same record of CTHs from GOME and METEOSAT were compared. The difference likely arises from the assumption in the ROCINN forward model that a cloud is a perfect Lambertian reflector, hence not accounting for multiple scattering of light inside the cloud. Scatterplots between the three CTH products for the four orbits are given in Fig. 10. In general, SNGome shows high correlations with both ATSR (0.81) and ROCINN (0.86). ROCINN itself exhibits an excellent correlation (0.95) with ATSR. We underline that the ROCINN algorithm is based on a neural network approach, which relies on the prior training of its components and offers a limited space of solutions, whereas SNGome makes no assumption about the sensed scene. SNGome agrees better with IR retrievals for low and mid-level (CTH < 7 km) clouds than for high-level clouds. The scatter in the high-level retrievals likely arises from the presence of ice or mixed-phase clouds, whose unknown phase function (and lower asymmetry parameter g) enhances light scattering in the sideward direction.
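The bias and correlation figures quoted above can be illustrated with a minimal sketch. It assumes that the bias is the mean difference between two collocated CTH samples, that the quoted spread is the sample standard deviation of the differences, and that the correlation is the Pearson coefficient; these definitions, the function name and the sample values are illustrative assumptions, not the processing used for Table 7.

```python
import math

def bias_and_correlation(reference, other):
    """Return (mean bias, std of differences, Pearson r) of `other` w.r.t. `reference`."""
    n = len(reference)
    if n != len(other) or n < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    diffs = [o - r for r, o in zip(reference, other)]
    bias = sum(diffs) / n
    # Sample standard deviation of the pixel-by-pixel differences.
    spread = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    mean_r = sum(reference) / n
    mean_o = sum(other) / n
    cov = sum((r - mean_r) * (o - mean_o) for r, o in zip(reference, other))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_o = sum((o - mean_o) ** 2 for o in other)
    return bias, spread, cov / math.sqrt(var_r * var_o)

# Hypothetical CTH values in km for five collocated pixels (not GOME data).
cth_a = [2.0, 4.0, 6.0, 8.0, 10.0]
cth_b = [1.5, 3.5, 5.5, 7.5, 9.5]
bias, spread, r = bias_and_correlation(cth_a, cth_b)
```

In this constructed case the second sample is uniformly 0.5 km lower, so the bias is negative while the correlation stays perfect, which shows why both quantities are reported side by side.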

Results
This section is devoted to the analysis of cloud top height derived from GOME observations during the period from June 1996 through May 2003; later data are not considered because global coverage was lost after June 2003. We consider zonal averages and inter-annual variations from 70° N to 70° S. The single-pixel GOME spatial resolution is 320 × 40 km², with a total orbit swath of approximately 960 km. Global coverage is reached in 3 days. While the raw data are available with the nominal sampling, for the zonal analysis the data have been re-gridded with a latitudinal and longitudinal spacing of 1.5°.
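The re-gridding step can be sketched as follows. This is a simplified illustration that assigns each pixel to a 1.5° cell by its centre coordinates and averages the values per cell; the actual procedure (for instance, how the 320 km wide footprint is distributed over several cells) is not specified here, and the function name and input layout are assumptions.

```python
import math
from collections import defaultdict

def regrid_mean(pixels, spacing_deg=1.5):
    """Average pixel values on a regular latitude/longitude lattice.

    `pixels` is an iterable of (lat, lon, value) with lat in [-90, 90]
    and lon in [-180, 180). Returns {(lat_index, lon_index): mean value}.
    Assumption: a pixel belongs to the cell that contains its centre.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, value in pixels:
        cell = (math.floor((lat + 90.0) / spacing_deg),
                math.floor((lon + 180.0) / spacing_deg))
        sums[cell] += value
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}

# Hypothetical pixel centres with CTH in km (not GOME data).
pixels = [(0.2, 10.1, 4.0), (0.9, 10.4, 6.0), (45.0, -120.0, 2.0)]
grid = regrid_mean(pixels)
```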

Data selection
In Fig. 11 we plot the occurrences of the quality flags (see Table 4 for their meaning) for the complete dataset as a function of cloud top height. Accounting for these flags is a crucial step for the extraction of realistic cloud scenarios. In this analysis, retrievals flagged 0, 1 and 4 are discarded. Specifically, the peak of flag 1 is the consequence of the CTH underestimation introduced by the model (see Fig. 1). For this reason clouds with height < 1 km might be underrepresented in the record. We make use of retrievals flagged 2, 3 and 5. Data flagged 2 appear when the 2-parameter minimization of Sect. 2.2 converges only for cloud top height and not for cloud bottom height. In view of the synthetic study presented in Sect. 3.2, we notice that retrievals flagged 3 appear when the upper layer becomes optically thick enough to generate a multi-layer cloud system. This situation contributes to the second mode of the green curve of Fig. 11. Thus we reject such retrievals above a limit height of 5 km.
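The selection rule described above can be written compactly. The sketch below encodes only what the text states (keep flags 2, 3 and 5; drop flag 3 above 5 km); the function and constant names are hypothetical, and the treatment of a flag-3 retrieval at exactly 5 km is an assumption.

```python
KEPT_FLAGS = {2, 3, 5}        # flags 0, 1 and 4 are discarded
FLAG3_MAX_CTH_KM = 5.0        # limit height for flag-3 retrievals

def keep_retrieval(flag, cth_km):
    """Return True if a retrieval passes the quality-flag screening."""
    if flag not in KEPT_FLAGS:
        return False
    # Flag 3 above the limit height points to a multi-layer scene.
    # Assumption: a retrieval at exactly 5 km is still kept.
    if flag == 3 and cth_km > FLAG3_MAX_CTH_KM:
        return False
    return True

# Hypothetical (flag, CTH in km) pairs.
sample = [(0, 2.0), (1, 0.5), (2, 7.0), (3, 4.0), (3, 9.0), (4, 3.0), (5, 11.0)]
selected = [item for item in sample if keep_retrieval(*item)]
```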

Global geographical analysis
We focus on geographical cloud top height distributions. Our aim is to highlight regional trends and annual distributions. For this purpose, the year 2001 is plotted for the four seasons in Fig. 12. The maps have been projected onto a lattice of 0.5° × 0.5° after a pixel-counted average of daily composites. However, the ungridded retrievals are available as original data at the IUP Bremen website (http://www.iup.uni-bremen.de/~sciaproc/SNGome/). The main features of global cloudiness are already known and have been studied by other satellite groups (Rossow and Schiffer, 1999; Wylie et al., 2005; Chang and Li, 2005; Stubenrauch et al., 2006; Jensen et al., 2008; Loyola et al., 2010). Nevertheless, it is worth mentioning that, in the presented maps, some world regions over ocean (and sometimes over portions of the coast) are characterized by specific cloud systems. A cloud system may be represented by one or several interacting cloud structures, and even with the coarse spatial resolution of the instrument we are able to detect some of them on a global scale.
Namely, over the North Atlantic at mid-latitudes, "extratropical cyclones" form from late autumn through the winter months and can reach altitudes of ≈9 km. Such cloud systems are detected by SNGome. In particular, the seasonality of the monsoon (stretching from South-East Asia to the Arabian Sea) is well pictured, together with the appearance of the typhoons' cloud structures in the late summer and in the autumn in the far east region bordered by Japan to the north and Taiwan to the south. The habitual cloud structures termed "marine stratocumulus clouds" can be seen over the south Pacific, close to the south Peruvian and Chilean coast. Their accumulation is mainly due to the cold Humboldt sea current, the high mountainous coast and winds from the Andes. They reach 1.5-2 km, rarely exceeding such altitudes.
This region resembles the Benguela region, situated over the south Atlantic, where the formation of cloud cells is mainly due to the cold sea current and the warm winds from the continent. Another feature is the seasonal cloudiness over the Caribbean Sea, where hurricanes are observed in the late summer and in the autumn.
In Fig. 13 we present zonally averaged seasonal vertical distributions of relative cloud amount for the year 2001 for the same data as in Fig. 12. Data are normalized in such a way that for each latitude belt (1.5° increment) the sum of all CTH occurrences is equal to 100 %. The seasonality is again well reproduced and the structure of the Intertropical Convergence Zone (ITCZ) with high clouds near the tropopause is depicted. In Stubenrauch et al. (2010, Fig. 8, p. 7207), datasets from CALIPSO, AIRS-LMD and the radar-lidar GEOPROF are compared for the years 2007-2008, boreal winter and summer, and similar plots are presented. Notwithstanding the different temporal coverage, we observe a similar shift of the maximum around the equator. This maximum is placed by SNGome at ≈12-13 km: lower than CALIPSO and GEOPROF and similar to AIRS-LMD. This behavior is expected because, in the case of a thick layer underneath a thin one, SNGome detects the former.
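The per-belt normalization can be sketched in a few lines. The input layout (a mapping from latitude belt to a list of occurrence counts per height bin) and the function name are assumptions made for illustration; empty belts are returned as zeros rather than dividing by zero.

```python
def normalize_per_belt(counts):
    """Convert CTH occurrence counts to percentages that sum to 100 per belt.

    `counts` maps a latitude belt (e.g. its southern edge in degrees)
    to a list of occurrences per height bin.
    """
    result = {}
    for belt, row in counts.items():
        total = sum(row)
        if total == 0:
            result[belt] = [0.0] * len(row)   # belt without any retrieval
        else:
            result[belt] = [100.0 * c / total for c in row]
    return result

# Hypothetical counts for two 1.5-degree belts and three height bins.
relative = normalize_per_belt({0.0: [1, 3, 4], 1.5: [0, 0, 0]})
```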
As a further investigation, cloud distributions are analyzed with respect to season, hemisphere and underlying surface. Retrievals are binned with 0.25 km spacing and normalized to the total number of counted cloudy pixels. Additionally, we filter cloud fractions smaller than 0.3 in order to screen occasional dust events and to be consistent with the analysis of Joiner et al. (2012). The distributions are plotted in Fig. 14 for the year 2001 and the disentanglement of the frequency distributions is plotted in Fig. 15.
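A minimal sketch of this binning, under the assumption that each retrieval is a pair of cloud top height and cloud fraction and that the normalization uses the number of pixels remaining after the cloud fraction filter (the function name and input layout are hypothetical):

```python
import math

def cth_frequency(retrievals, bin_km=0.25, min_cf=0.3):
    """Relative frequency of CTH in bins of width `bin_km`.

    `retrievals` is an iterable of (cth_km, cloud_fraction). Pixels with
    cloud fraction below `min_cf` are skipped. Returns a dictionary
    {lower bin edge in km: fraction of the counted pixels}.
    """
    counts = {}
    total = 0
    for cth, cf in retrievals:
        if cf < min_cf:
            continue
        idx = math.floor(cth / bin_km)
        counts[idx] = counts.get(idx, 0) + 1
        total += 1
    return {idx * bin_km: n / total for idx, n in sorted(counts.items())}

# Hypothetical retrievals; the last one is screened out by the CF filter.
freq = cth_frequency([(1.1, 0.5), (1.2, 0.9), (5.6, 0.4), (3.0, 0.1)])
```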
Since we plot cloud distributions with respect to northern seasons, the behavior shown in Fig. 15a is expected. Starting from the winter season, when more low-level clouds are observed, the response to increased heating is a shift of the main mode toward higher values during spring and summer. Likewise, the high peaks in boreal cold seasons have to be linked to austral warmer seasons. In particular, in the Southern Hemisphere we find again the hallmark of the persistent low-level cloud structures which contribute to the first modes, as seen in Fig. 15d. It is evident from Figs. 14, 15a and 15c that cloud top heights over land follow a bimodal distribution, whereas over water, in both hemispheres, the distributions appear broader and even trimodal. Bearing in mind that averaged global cloud distributions hide short-time fluctuations, we found good agreement with the shape of the distributions for July 2007 derived from CloudSat profiles and presented in Joiner et al. (2012, Fig. 13, p. 540).

Zonal analysis
Average plots of cloud top height over years minimize the influence of short-time variations. Nevertheless, in Fig. 16 a shift in the maximum can be observed during the period 1997-1998. If one considers cloud height as a proxy for atmospheric dynamics and radiative processes, there might be a link to the development of the El Niño-Southern Oscillation (ENSO). In 1997, when the ENSO had its first appearance within this record, a single maximum of zonal CTH at [...] Pacific Ocean. The combination with the longitudinal Walker circulation and Earth rotation had the net effect of strengthening convection loops along the equator and of changing heat distribution maps at the surface.
Cloud cover trends, retrieved in the O2 A-band, have been found to be positively correlated with sea surface temperature (SST) (Wagner et al., 2005). Moreover, SST anomalies over the Pacific Ocean have been found to be negatively correlated with O2 absorption (Wagner et al., 2008). Thus an increase in SST implies a shallower O2 band, that is, higher CTHs. This effect could be observed in ISCCP records during the ENSO episode back in 1987-1988: a change in SST of 2 °C for temperatures > 26 °C lowered cloud top pressure by ≈25 hPa (Bony et al., 1997), which means an increment in CTH of ≈0.6 km, therefore matching our retrievals when the maxima of 1997 and 1998 at 3-5° S are compared.
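The conversion from a cloud top pressure change to a height change can be checked with the hypsometric relation for an isothermal layer, Δz = (R T / g) ln(p1 / p2). The sketch below is an order-of-magnitude check only: the pressure level (300 hPa) and the layer temperature (230 K) are assumptions chosen for a typical tropical high cloud and are not taken from Bony et al. (1997) or from the retrievals.

```python
import math

R_DRY_AIR = 287.05   # specific gas constant of dry air, J kg^-1 K^-1
GRAVITY = 9.80665    # standard gravity, m s^-2

def height_change_km(p_before_hpa, p_after_hpa, layer_temp_k):
    """Height gain (km) when cloud top pressure drops from p_before to p_after.

    Uses the hypsometric equation with a constant (assumed) layer temperature.
    """
    scale_height_km = R_DRY_AIR * layer_temp_k / GRAVITY / 1000.0
    return scale_height_km * math.log(p_before_hpa / p_after_hpa)

# Assumed scenario: cloud top pressure lowered by 25 hPa, starting at 300 hPa.
dz = height_change_km(300.0, 275.0, 230.0)
```

Under these assumptions the result is of the order of 0.6 km; a lower starting pressure or a warmer layer would give a larger increment for the same 25 hPa.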
More recently, Larson and Hartmann (2003) numerically probed the response of tropical clouds and water vapor to SST anomalies. Their findings suggest that the occurrence of high clouds rises relative to that of middle or low clouds. We focus on the tropical Pacific region (7.5° S-10° N, 100° E-280° E), as specified in Cess et al. (2001). High clouds (HC) are defined as clouds with h > 6.5 km, middle clouds (MC) with 3.2 km < h < 6.5 km and low clouds (LC) with h < 3.2 km (Stubenrauch et al., 2010); in Fig. 17 their monthly relative averages are plotted. The seasonality, which is more pronounced in the HC and visible from mid 1998 until December 2002, is broken during the ENSO anomaly. In the time window February 1997-September 1998, the high cloud abundance never drops below 65 %, and middle and low clouds do not exhibit any periodicity either. This confirms the role played by enhanced convection, linking the oceanic coupled system of non-dispersive Kelvin and off-equatorial non-dispersive Rossby waves (Dijkstra, 2002) with clouds in the tropics.
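The height classes and their relative amounts can be sketched as follows. The thresholds are those quoted above; assigning retrievals at exactly 3.2 km or 6.5 km to the middle class is an assumption, since the definitions use strict inequalities, and the function names are hypothetical.

```python
def cloud_class(cth_km):
    """Classify a cloud top height (km) as high (HC), middle (MC) or low (LC)."""
    if cth_km > 6.5:
        return "HC"
    if cth_km < 3.2:
        return "LC"
    return "MC"   # assumption: boundary values 3.2 and 6.5 km fall here

def relative_amounts(cth_values):
    """Percentage of retrievals in each class for one month of data."""
    counts = {"HC": 0, "MC": 0, "LC": 0}
    for h in cth_values:
        counts[cloud_class(h)] += 1
    total = len(cth_values)
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: 100.0 * n / total for name, n in counts.items()}

# Hypothetical monthly sample of CTH values in km.
shares = relative_amounts([1.0, 2.0, 4.0, 8.0, 9.0, 12.0, 7.0, 10.0])
```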
We also present the multi-annual global distribution of zonal mean cloud top height observed by GOME in the boreal winter and summer (upper panel), together with their difference (lower panel), in Fig. 18. Qualitatively, the CTH maximum is located in the ITCZ region centered at 5° N-10° N in summer, while in winter the ITCZ moves southward, displacing the maximum to 5° S-7.5° S. In terms of hemispheric averages, winters clearly exhibit lower CTHs, at 22° N-25° N in the boreal belt and at 16° S-20° S in the austral belt. In the opposite season (i.e. summer), this minimum vanishes and the average CTHs increase. These changes are related to changes in the atmospheric circulation over the annual cycle, that is, in the tropical Hadley cell and mid-latitude Ferrel cells and their intervening ITCZ (Mokhov and Schlesinger, 1993), as shown by the sinusoid in the lower panel of Fig. 18. For polar regions, the anomalously high peak during the austral winter can be related to a missing snow/ice screening in the algorithm. In the case of clouds occurring over bright surfaces, due to missing contrast, the sensitivity of the COT retrieval decreases (Pincus et al., 1995; Kokhanovsky et al., 2003) and the retrieved total optical thickness (typically greater than 100) will be the sum of snow OT plus cloud OT. Similar to the two-layer system presented in Sect. 3.2, the retrieved CTH will be biased high.
In Fig. 19 we plot the pixel-counted multi-annual average of daily composites of zonal CTH, COT and CA with the 95 % confidence interval. The results are compared with ROCINN retrievals. The ROCINN curves presented here are slightly lower than the ones published in Loyola et al. (2010). We do not have enough information about the applied data selection. Especially for CTH, the maximum in the tropics is ≈1 km higher. Even so, the bias between the two datasets for CTH and COT can be explained as follows. CTH depends, to a certain extent, on the COT values, because the depth of the O2 A-band around 760 nm changes as a function of COT (see Fig. 8 in Kokhanovsky and Rozanov, 2004, p. 46). Therefore, if the independent piece of information of actual COT values is not used as input for the forward radiative transfer calculations along the band, the resulting CTH can be biased low.
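As an illustration of a pixel-counted average with a confidence interval, the sketch below weights each daily composite by its number of contributing pixels and attaches a normal-approximation 95 % interval. How the interval in Fig. 19 was actually computed is not stated in the text, so the 1.96 standard-error rule, the function names and the sample values are assumptions.

```python
import math

def pixel_counted_mean(daily_means, pixel_counts):
    """Mean of daily composites, each weighted by its number of pixels."""
    total = sum(pixel_counts)
    return sum(m * n for m, n in zip(daily_means, pixel_counts)) / total

def ci95_halfwidth(values):
    """Half-width of a 95 % confidence interval (normal approximation)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return 1.96 * math.sqrt(variance / n)

# Hypothetical zonal CTH composites (km) for three days and their pixel counts.
daily_cth = [4.0, 6.0, 8.0]
n_pixels = [10, 30, 10]
mean_cth = pixel_counted_mean(daily_cth, n_pixels)
half_width = ci95_halfwidth(daily_cth)
```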
In general, the high values of CA calculated by SNGome can be understood in this way: the asymptotic relations used in this work hold only for clouds with τ > 5, therefore thinner clouds do not contribute to the global statistics. Another limitation is the GOME spatial resolution. Horizontal and vertical variability of clouds can introduce systematic biases in cloud albedo (Pincus et al., 1999; Oreopoulos et al., 2007). A heterogeneous cloud, which is likely to be sensed by GOME, always has a lower albedo than its homogeneous counterpart with the same optical thickness. Thus, treating real clouds as plane-parallel slabs leads to higher albedos. On the other hand, we speculate that a positive trend in aerosol optical thickness (AOT) over ocean, as reported by Thomas et al. (2010, Table 4, p. 4861), impacts cloud albedo through a decrease in mean cloud droplet radius. This effect has already been seen for weak volcanic eruptions over ocean (Gassó, 2008). The negative correlations between aerosol optical thickness and effective radius shown in Bulgin et al. (2008) also corroborate this hypothesis. However, these results pertain only to oceanic regions that are affected by continental aerosol outflows. Note that the AOT signals in Bulgin et al. (2008) and Thomas et al. (2010) are derived from ATSR-2 measurements, therefore temporal and spatial co-registration with GOME is not an issue.
Overall, the global average cloud top height derived from GOME measurements for the period June 1996-May 2003 is 5.6 ± 3.2 km in the belt of ±70° latitude, for a corresponding average cloud optical thickness of 19.1 ± 13.9 and average cloud albedo of 0.63 ± 0.10. We underline that the given average cloud top height is not weighted by the respective average cloud optical thickness. The overview of regional statistics of the retrieved cloud properties is given in Table 8.

Conclusions
We have presented properties of a seven-year global cloud dataset (see http://www.iup.uni-bremen.de/~sciaproc/SNGome/) from the Global Ozone Monitoring Experiment GOME using the semi-analytical cloud retrieval algorithm SACURA, hereafter termed SNGome. The retrieval is based on an optimal estimation approach applied to radiances around the 760 nm O2 absorption A-band. Auxiliary data used in the calculation are the minimum Lambert-equivalent reflectivity values from TEMIS and the cloud cover from OCRA-DLR, both derived from GOME measurements. [...] In this way we aim at supporting GOME ozone and other trace-gas retrievals as well as long-term trend analysis of global and regional cloudiness with datasets from SCIAMACHY and GOME-2. We have found that retrieved CTH values are quantitatively comparable to altitudes derived by other algorithms and techniques. Our approach shows a smaller bias with respect to co-registered ground-based retrievals, pointing to the utility of using COT as independent information for the concurrent retrieval of cloud top and base altitude. The algorithm's quality flags identify multi-layered scenes and enable their removal from the statistics presented in this work, in compliance with the single-layer cloud model used in the forward calculations. Yet two limitations have to be noted. Firstly, the algorithm tends to deliver unreliable results in proximity to the ground, and fewer retrievals become available for very low clouds. Secondly, at the moment, the algorithm cannot discriminate the thermodynamic phase of water, and the presence of low-level ice clouds might affect the statistics. These restrictions can be lifted when the algorithm is applied to instruments with higher spatial resolution equipped with SWIR channels. On the global scale, the distinctive features of cloudiness are reproduced satisfactorily in the dataset, as is the El Niño-Southern Oscillation (ENSO) event in the period 1997-1998, which spatially and temporally correlates with CTH values over the Pacific Ocean. For the seven-year record (June 1996-May 2003), in the belt of ±70° latitude, the average CTH has been found to be 5.6 ± 3.2 km, with a corresponding average cloud optical thickness of 19.1 ± 13.9 and average cloud albedo of 0.63 ± 0.10.

Fig. 1. Absolute error (in km) in the cloud top height retrieval with SNGome as a function of cloud top height and optical thickness. Input parameters: solar zenith angle 60°, nadir view, cloud geometrical thickness 1 km. The contour lines indicate the specified error levels (in km).

Fig. 7. Upper panel: comparison of retrieved CTH as a function of ground-based facility for the shallow cloud case of Fig. 6. Lower panel: CTH bias between SNGome and Radar.

Fig. 9. Scatterplots of CTH bias versus CF bias between IR-based and O2 A-band-based retrievals (upper panel) and the same between the O2 A-band-based retrievals. Original dataset presented in Rozanov et al. (2006).

Fig. 10. Scatterplots and correlations among the three CTH products derived from satellite measurements for four GOME orbits. The original dataset was presented in Rozanov et al. (2006).

Fig. 11. Quality flag counts as a function of cloud top height for the 7 yr of the SNGome dataset. The meaning of the flags is given in Table 4.

Table 5. Location of the radar facilities with elevation above mean sea level and number of matches for deep and shallow clouds.

Table 7. Average values of all orbits in Rozanov et al. (2006) for three retrieval algorithms. Bias values are given w.r.t. SNGome. Total number of pixels: 2422.