CLARA-SAL : a global 28 yr timeseries of Earth ’ s black-sky surface albedo

We present a novel 28 yr dataset of Earth’s blacksky surface albedo, derived from AVHRR instruments. The dataset is created using algorithms to separately derive the surface albedo for different land use areas globally. Snow, sea ice, open water and vegetation are all treated independently. The product features corrections for the atmospheric effect in satellite-observed surface radiances, a BRDF correction for the anisotropic reflectance properties of natural surfaces, and a novel topography correction of geolocation and radiometric accuracy of surface reflectance observations over mountainous areas. The dataset is based on a homogenized AVHRR radiance timeseries. The product is validated against qualitycontrolled in situ observations of clear-sky surface albedo at various BSRN sites around the world. Snow and ice albedo retrieval validation is given particular attention using BSRN sites over Antarctica, Greenland Climate Network stations on the Greenland Ice Sheet (GrIS), as well as sea ice albedo data from the SHEBA and Tara expeditions. The product quality is found to be comparable to other previous long-term surface albedo datasets from AVHRR.


Introduction
Accurate determination of Earth's surface albedo is required for both characterizing the atmosphere-surface boundary layer in atmospheric studies, and resolving the radiative budget at surface level for climate monitoring studies.Most such studies require the surface albedo to be known over a large area with continuous monitoring.Global climate studies naturally require global coverage, which is difficult to attain us-ing airborne or in situ observation methods.Satellite-based surface albedo estimation methods provide both continuous monitoring and fully global coverage from polar orbits.
A limiting factor from the viewpoint of climate studies is the temporal coverage of satellite data.Homogeneous surface albedo datasets often cover only the lifespan of one or two satellites, while climate studies would benefit more from data records of longer length (Trenberth et al., 2006).The Advanced Very High Resolution Radiometer (AVHRR) instrument has flown on all polar-orbiting National Oceanic and Atmospheric Administration (NOAA) weather satellites, providing the longest continuous timeseries of observations on the reflectance properties of Earth.Once intercalibrated, this timeseries provides a basis for generating a long-term global dataset on surface albedo.
In this paper, we introduce and validate a new 28year timeseries ) of Earth's surface albedo, derived from AVHRR/2 and AVHRR/3 sensors.The timeseries has been created in the Satellite Application Facility on Climate Monitoring (CM SAF) project of EUMET-SAT as a part of the CMSAF cLouds, Albedo and RAdiation (CLARA) first release product family (Karlsson et al., 2013).This timeseries, specifically called CLARA-A1-SAL (from AVHRR 1st Release -Surface ALbedo), describes the global broadband (0.25-2.5 µm) directional-hemispherical reflectance (DHR) of terrestrial surfaces -or black-sky albedo as it is also known.In addition to describing the dataset characteristics and the processing algorithm behind it, we will also present results from an extensive validation effort aimed at determining the albedo retrieval accuracy.

A. Riihelä et al.: Global 28 yr timeseries of surface albedo
This paper is organized as follows.We will first provide a review of current surface albedo datasets covering at least several years and briefly discuss their major characteristics.Then we proceed to an overview of the CLARA-A1-SAL product and its main properties, including spatial and temporal coverage.The discussion then goes deeper into the SAL algorithm itself as we describe the process of retrieving surface albedo from the AVHRR measurements.The algorithm description is followed by an overview of the validation results to date and a discussion on the main findings, including the perceived strengths and weaknesses of the dataset.For reference, we also provide a comparison of CLARA-A1-SAL to the often-used Moderate Resolution Imaging Spectroradiometer (MODIS) MCD43C3 albedo products.We finish with a brief outlook on future development directions for next releases of the CLARA-A1-SAL dataset, and conclusions.

On surface albedo terminology
The surface albedo of natural surfaces is determined by the inherent optical properties of the surface material, as well as the properties of the atmosphere affecting the incoming and reflected radiation fluxes.Therefore, the definition of surface albedo changes, depending on whether or not atmospheric effects are included.In the literature, the surface albedo where the incoming radiation flux is unidirectional without any atmospheric effects is called the directional-hemispherical reflectance (DHR), often also called "black-sky albedo".If the incoming radiation flux is completely diffuse, the surface albedo is defined as the bi-hemispherical reflectance (BHR), or "white-sky albedo".Under natural illumination conditions, the incoming radiation flux at Earth surface is a varying combination of direct and diffuse radiation fluxes.Therefore, the observable albedo at Earth surface level is also a weighted combination of DHR and BHR according to cloudiness, atmospheric characteristics, and sun zenith angle (SZA).This albedo is often called "blue-sky albedo".For the remainder of this paper, we shall use this terminology to describe the various albedo quantities in accordance with, e.g.Schaepman-Strub et al. (2006) and Lucht et al. (2000).

Previous long-term surface albedo timeseries
Several research teams have produced long-term timeseries of surface albedo from various satellite instruments.Here we will briefly review the datasets that are 10+ yr in length.As new datasets are continuously developed and existing ones expanded, the reader should not consider this listing to be exhaustive.Wang and Key (2005) described the AVHRR Polar Pathfinder dataset, which contains broadband directionalhemispheric apparent (blue-sky) surface albedo from the po-lar regions between 1982-1999.This dataset has since been extended to cover 1982-2004 in the APP-X version and analyzed for trends (Wang et al., 2012).Lucht et al. (2000) introduced the MODIS surface albedo algorithm, which has since been applied to MODIS data from the Terra and Aqua satellites to produce a global surface albedo timeseries from the year 2000 to the present day.The MODIS dataset (MCD43C3, other versions also exist) provides both black-sky and white-sky albedo.The MODIS albedo products have been shown to be stable and accurate in both radiometric and geolocation aspects (Cescatti et al., 2012;Wolfe et al., 2002), and the spatial resolution of MODIS is considerably better than AVHRR (GAC).
Onboard the same satellite platforms and utilizing MODIS data as background information, the CERES instrument has been used to derive global broadband clear-sky (blue-sky) albedo from the year 2000 onwards (Rutan et al., 2009;Yan et al., 2011;Hudson et al., 2010).CERES albedo/flux products (EBAF for top Of atmosphere and SYN for surface) have global coverage and feature albedo retrievals of ocean and sea ice as well as land surfaces.The EBAF product also adjusts surface irradiances to create a net energy balance at surface.A new CERES instrument is now also observing from the Suomi NPP satellite.Loew and Govaerts (2010) describe a surface albedo timeseries spanning 1982-2006 generated from the geostationary Meteosat First Generation (MFG) imager data (MSA product).There are also current efforts underway to extend the algorithm to other geostationary weather satellite data (Govaerts et al., 2008).These geostationary imager-based albedo products enjoy the advantage of having a large number of observations per grid cell per day, enabling large sample sizes for concurrent albedo and aerosol retrievals.
The NASA/GEWEX surface radiation budget project has produced surface radiative fluxes spanning 1983-2007 by utilizing a combination of ISCCP DX satellite radiances (Rossow and Schiffer, 1999) and various reanalysis datasets, resulting in the SRB dataset.More details may be found in Stackhouse et al. (2011).The International Satellite Cloud Climatology Project (ISCCP) project has also produced their own radiative fluxes from which various albedo quantities may be computed (Zhang et al., 2004).The ISCCP and GEWEX fluxes have been shown to be in close agreement, and have been extensively validated against in situ data (Stackhouse et al., 2011).
The CLARA-A1-SAL dataset promises to add value to this group of datasets by combining global coverage with a higher spatial resolution than SRB or ISCCP, a comparable retrieval accuracy but improved intercalibration relative to APP-X, and with a longer temporal coverage than any of the aforementioned datasets.Good coverage of Earth's cryosphere, including sea ice, combined with the intercalibrated AVHRR data record make this dataset well-suited for studies involving the albedo of snow and ice.For the reader's reference, we have constructed a comparative table of the long-term albedo datasets reviewed here as Table 1.The parameters shown are by no means an exhaustive list; we encourage the reader to study the relevant literature for in-depth analyses and applicability studies.
3 The characteristics of the CLARA-SAL dataset

Dataset coverage
The dataset coverage is global over a pentad (five days) or one month period.Swath-level SAL processing takes place at the nominal GAC resolution (4.4 km at nadir), after which the GAC resolution data are projected to a 0.05 degree latitude-longitude grid (5 km EASE-grid for polar subsets).This resolution was chosen because it is close to the nominal GAC resolution (at nadir), and therefore the majority of the original observations are still represented in the reprojected grid.In this step, each observation is mapped to only one pixel (i.e.no averaging, interpolation or nearest-neighbour filling is applied), so that the weighting of observations does not change.These reprojected grids are then used to calculate the spatiotemporal mean albedo by averaging all the (projected) observations that lie within each end product grid cell (0.25 degree or 25 km for polar subsets) and time interval (pentad or month).An example of the Arctic monthly mean product is shown in Fig. 1, and a composite of the 2007 global monthly mean products is shown in Fig. 2.
The choice of 0.25 degree/25 km as the end product resolution was motivated by compatibility with other global meteorological and climate datasets frequently offered at resolutions of 0.25-1 degree (e.g.Sheffield et al., 2006), and our desire to avoid the effects of sporadic navigation errors in the early AVHRR data.
Along with the temporal mean albedo, each product also contains the number of observations from which the mean is calculated (per grid cell), and the standard deviation of the retrievals in each grid cell.Temporal gaps in the data due to satellite data unavailability may be identified using the number of observations-data field.We recommend that users consider and utilize these data as a quality assurance measure when using the CLARA-SAL products.We have also calculated histograms of the albedo retrievals over the globe with a 0.02 bin size.These histograms provide additional statistical information about the distribution of individual SAL retrievals for, e.g.obtaining information about outlier retrievals.They are available to users upon request.All products are provided free of charge to any interested users.

The CLARA-SAL retrieval algorithm
The CLARA-SAL algorithm follows a sequential approach, first correcting the TOA reflectances from AVHRR channels 1 and 2 for topography (where necessary) and atmospheric effects (for all pixels), and then utilizing different algorithms according to each pixel's land cover to convert the spectral surface reflectances into a broadband surface albedo.
The CLARA-SAL albedo retrieval is based on a homogenized TOA radiance data record across the entire AVHRR instrument family timeline.The radiance data record is constructed using the method described by Heidinger et al. (2010).This homogenization of the underlying radiance data record significantly diminishes albedo variation resulting from differences in calibration across the 11 different AVHRR instruments.This radiance data record is common to all CLARA-A1 products (Karlsson et al., 2013).
Here we briefly list the main stages of CLARA-SAL retrievals.More details on the process may be found in Appendix A, and the CLARA-SAL Algorithm Theoretical Basis Document (ATBD).After the initial topography correction for geolocation and radiometry, the atmospheric effects are corrected for using the Simplified Method for Atmospheric Correction (SMAC) after Rahman and Dedieu (1994).AOD and O 3 content of the atmosphere are kept constant at 0.1 and 0.35 [atm cm], respectively.The subsequent effect on retrievals is explored in Sect.7.
Next, anisotropic reflectance (over vegetated/barren surfaces) is corrected and the surface reflectances are converted into spectral albedo after Wu et al. (1995) and Roujean et al. (1992).Broadband albedo is retrieved according to land cover type.Over vegetation, the spectral albedos are processed using a narrow-to-broadband conversion algorithm by Liang (2000).Water surface albedo is presently derived from a lookup table (LUT) by Jin et al. (2004).As wind speed or chlorophyll data was not included in this first version, the water surface albedo is in fact a constant 0.0676.We have preferred the use of a LUT-based approach since it circumvents issues such as an accurate AOD correction  over oceans, subpixel cloud contamination in AVHRR data and anisotropic effects of water reflectance over areas experiencing high surface winds.Surface conditions such as chlorophyll concentrations affect broadband ocean albedo only slightly (Jin et al., 2004).
For snow and ice surfaces, the image swath processing is intended to only yield bidirectional broadband snow/ice reflectances.At swath level, the GAC reflectances are corrected for atmospheric effects and the broadband conversion is performed after Xiong et al. (2002).As described in the publication, the algorithm automatically tunes itself to match different snow/ice types, therefore we do not need to differentiate between snow and sea ice in the retrieval.
The swath-level data are projected to 0.05 degree / 5 km grids, as described in the previous section.During the production of the pentad and monthly mean SAL products, the (projected) bidirectional snow/ice reflectances that sample the viewing hemisphere are spatiotemporally averaged to produce a black-sky snow/ice broadband albedo estimate at each end product grid cell.In essence, we exploit the varying AVHRR viewing geometry, high data availability over polar regions, and averaging to a coarser end product grid to sam-ple the reflectance hemisphere densely enough to retrieve an albedo estimate.

Validation against ground truth
The validation of this first release of the CLARA-SAL dataset was done using in situ albedo observations from the SHEBA and Tara ice camps, and the Baseline Surface Radiation Network (BSRN) and Greenland Climate Network (GC-Net) station networks as reference data (Ohmura et al., 1998;Steffen et al., 1996).Fifteen (15) BSRN stations and three GC-Net stations were used as validation sites, plus one Finnish Meteorological Institute site (Sodankylä) and the two ice camps.The sites are listed in Table 2 and their locations are shown in Fig. 2. The in situ albedo data covers vegetated terrain, desert, coastal sites, seasonal and perennial snow cover, and sea ice.The ice camp data are compared only against pentad SAL products.
For the validation to be accurate, the satellite and in situ data should have similar spectral coverage, be temporally and spatially colocated, and the in situ observations should be spatially representative for the satellite footprint area.We address each of these requirements in turn.
The albedo measurements at the BSRN sites and Sodankylä are performed using broadband albedometers (typical coverage 0.285-2.8µm), therefore we do not need to adjust their data to achieve a spectral match with CLARA-SAL.The GC-Net observations are taken using LI-COR 200 SZ pyranometers that observe a narrower band (0.4-1.1 µm).Stroeve et al. (2001) reported differences of up to 4 % relative to concurrently measured broadband albedo to result from the narrower observation band in these sensors.We include this additional uncertainty in the GC-Net in situ data in the validation accuracy considerations.
We need to also consider the atmospheric effects in the in situ measurements.Albedo sensors (pyranometer pairs) measure incoming and reflected radiation fluxes simultaneously to determine the surface albedo (the so-called blue sky albedo).The incoming radiation flux in real world conditions consists of both direct and diffuse radiation fluxes.To match this data perfectly to the CLARA-SAL black-sky albedo, we would need to remove all diffuse flux contributions (resulting from atmospheric scattering), and also account for any possible absorption effects in the direct flux.However, Liu et al. (2009) showed in their study that the difference between black-sky and blue-sky albedo is very small in cloud-free conditions when SZA is less than 70 degrees.Manninen et al. (2012) showed that the difference between black-sky and blue-sky albedo measurements on the ground at Cabauw in the Netherlands is typically no more than 5 %, although the difference increases with increasing AOD.Considering these results, we choose not to adjust the in situ measurements for any blue-sky/black-sky albedo difference.This decision is also supported by the fact that our validation strategy is based on exclusive selection of confirmed clear-sky in situ data for validation reference, based on the CM SAF cloud mask product (Karlsson et al., 2013).Sites where typical AOD concentrations are large will most likely show poorer correspondence in any case owing to the constant AOD in the atmospheric correction.
For the first group of BSRN sites and the GC-Net sites (1st and 3rd block in Table 2), temporal colocation is achieved by storing the date-time stamps of cloudfree AVHRR overpasses at our validation sites and then exclusively selecting and averaging corresponding in situ observations to form the mean reference albedo for the pentad or month in question.For the second BSRN site group and the ice camps, AVHRR overpass timestamps were not recorded.For these sites, we first discard all samples with either unphysical radiation fluxes, an SZA over 70 degrees, or having a standard deviation of irradiance that is larger than 1 % of the irradiance itself.This last step is designed to exclude cloudy sky measurements.

A. Riihelä et al.: Global 28 yr timeseries of surface albedo
The surviving samples are then averaged over the SAL product timeperiod to form the mean reference albedo.
For the SHEBA campaign where albedo measurements were taken every two-three days over a 200 m transect, we first averaged the transect data to form a daily mean albedo and then average all daily means fitting each SAL pentad.SHEBA cloudiness observations (Intrieri et al., 2002) were studied to identify cloudy skies situations for exclusion.However, the data showed that the vast majority of measurement days had cloud cover on-site for large parts of the day.Culling the reference dataset to include only clear-sky data would therefore also render it too small to draw conclusions about the performance of the satellite retrieval.We thus chose to include all measured transects into the validation, noting that cloudiness typically increases the broadband albedo of sea ice by 5-10 %, depending on cloud thickness and the existence of snow cover and/or melt ponds on the ice (Key et al., 2001;Grenfell and Perovich, 1984).
The in situ albedo observations at Tara ice camp took place with a fixed instrument measuring continuously as the schooner Tara drifted with the sea ice across the Arctic Ocean.We have utilized hourly all-sky albedo measurements to compose the in situ pentad mean albedo in order to maintain cohesion with the SHEBA validation.
Spatial colocation and representativeness of the in situ measurement for the GAC footprint are challenging issues, as the in situ albedo observations are practically point measurements when compared to the satellite retrieval, especially after aggregation of satellite data.Validation using such a pair will produce results which rely not only on the satellite algorithm accuracy, but also on the representativeness of the in situ measurement for its surrounding area.We will next discuss a method to quantitatively assess the site representativeness for validating satellite-based albedo measurements.

Using semivariogram estimators to assess site representativeness
In geostatistics, semivariograms are used to describe the spatial autocorrelation of measurements.Semivariograms illustrate the similarity of measurements at increasing distance (lag) relative to a reference point and therefore quantify the representativeness of a single site for a larger area.Calculation of an empirical semivariogram from measurement data requires that the data are continuous, which is generally true for optical satellite observations of snow-free (or completetely snow-covered) Earth's surface.The empirical semivariogram (also called semivariogram estimator) may be calculated as (Matheron, 1963) where N(h) is the number of data pairs (z(x i + h), z(x i )) at distance h from each other.
Recently, Susaki et al. (2007) and Román et al. (2009) have applied semivariogram analysis for assessing site representativeness in MODIS albedo product validation over rice paddies and forests.Their work was based on calculating albedo estimates from 30 m resolution Landsat ETM+ images and then applying theoretical semivariogram models to analyze their variability within MODIS spatial resolution scales.We followed the general approach of their studies, but provide a simpler method of quantifying the site representativeness.
We used Landsat TM/ETM+ near-infrared (band 4) surface reflectance images at our validation sites over summer to analyze site representativeness.We calculated the empirical omnidirectional semivariogram at each site up to a maximum lag distance of half of the CLARA-SAL 0.25 degree pixel diagonal, using lag distance multiples of 30 m (TM/ETM+ resolution).Rather than fitting a semivariogram model to this empirical data, we then calculated and used its area integral as an indicator of site representativeness for albedo measurements.In essence, we replaced the physical interpretation of a theoretical semivariogram model (range and sill) with an empirical measure of the cumulative variability of nearinfrared surface reflectance with respect to the site location.
We chose to use near-infrared surface reflectance in lieu of calculating surface albedo estimates to simplify the calculations by removing the need for a BRDF correction and a narrow-to-broadband conversion.Also, the near-infrared surface reflectance tends to dominate the determination of broadband AVHRR surface albedo based on wide visible and NIR spectral bands (Liang, 2000).Therefore, application of NIR surface reflectances as a proxy to actual surface albedo data is acceptable for vegetated sites over snow-free periods.As stated by Román et al. (2009), accurate determination of satellite field-of-view on the ground is challenging.In AVHRR-LAC data, geolocation uncertainty can be half a pixel or more (Lucht et al., 2000).We therefore simplified the analysis by calculating the semivariograms centered on the validation site.We also calculated the semivariogram omnidirectionally to better observe the overall variability of surface reflectance in the region surrounding the site.
To quantitatively assess and compensate for site representativeness effects in the CLARA-A1-SAL validation results, we used the following method.First, the area integrals of the semivariograms are normalized by the maximum lag length to compensate for variation in the projected size of the CLARA-SAL 0.25 degree grid cell.The normalized semivariogram integrals were compared to the pentad mean RMSE of the June-July-August period to verify the correlation between low integral values and low RMSE.Finally, the inverted semivariogram integrals were used as weights in determining the mean albedo retrieval accuracy of the CLARA-A1-SAL dataset over land areas.This method gives greatest weight in the algorithm accuracy evaluation to sites where in situ albedo measurements describe the surrounding area best, and allows for at least partial separation of atmospheric correction-related retrieval errors from cases where the in situ measurement cannot provide a reasonable estimate of the mean albedo of the area.

Results for land and snow
To maintain clarity and brevity, we do not show full validation result details here for all 19 land sites.We discuss the results in general, and highlight BSRN sites in the US (Southern Great Plains) and Central Europe (Payerne), and one GC-Net site with perennial snow cover (Summit camp, Greenland) for a better overall analysis.
Validation results for all sites are collected in Table 3. Overall, a wide variety of fits between satellite retrievals and in situ observations can be seen.However, as already stated earlier in this study and also by other authors (Lucht et al., 2000), in situ observations of surface albedo often cannot provide the terrain albedo at the spatial scale of the satellite.This leads us to consider the retrieval errors shown in Table 3 as being composed of two factors: the systematic errors resulting from an imperfect retrieval algorithm, and the random errors resulting from changes in the capability of an in situ observation to represent its surrounding (satellite-scale) area.
Apart from the highlighted site analysis to follow, certain aspects of the validation results deserve mentioning.CLARA-A1-SAL shows clear seasonal retrieval accuracy cycles at Bondville and Goodwin Creek sites, which correspond with observed aerosol optical depth cycles at these sites (Augustine et al., 2008).The AOD maxima at midsummer also coincide with best agreement between SAL and in situ data.The following implication that CLARA-A1-SAL retrieves negatively biased cropland/grassland albedo is offset by the stable validation results from the Southern Great Plains (SGP) site, which has been shown to have both a homogenous land cover and typically low aerosol loading (Michalsky et al., 2010).Furthermore, the study by Augustine et al. (2008) showed low AOD and a less pronounced seasonal cycle at Fort Peck and Desert Rock.In accordance, we observed a very good correspondence between in situ and CLARA-A1-SAL data, also noting that the area around the Desert Rock site has a fairly homogeneous land cover.The Ny-Ålesund is located very near the coast with mountains nearby; considering this against the AVHRR resolution and geolocation accuracy, large under-or overestimations of the site albedo can occur often.
It is clear that both aerosol effects and spatial representativeness effects play a role in the CLARA-A1-SAL validation results.In Sect.7, we will attempt to use the semivariogram method described in Sect.4.2 to reduce the weight of the (random) site representativeness errors and identify the mean albedo retrieval accuracy of the dataset, including aerosol effects.

Southern Great Plains BSRN site
The Southern Great Plains site, located in central Oklahoma and operated by the Atmospheric Radiation Measurement (ARM) program of the US government, offers one of the longest continuous in situ measurement timeseries in our study.The site is located in a rural area with negligible topography.Aside from sporadic snowfall events in the December-January timeframe, the albedo of the site is quite stable around 0.2.
We show the relative retrieval error of CLARA-SAL products between 1994-2009 at SGP in Fig. 3.The variability of retrieval error is larger in the early part of the timeseries when satellite data is more sparse, and pentad means in general are more vulnerable to errors induced by partial snow cover affecting either in situ observation but not the satellite product (or vice versa), as well as cloud masking errors.
The October-March monthly means between 1994 and 1999 show retrieval errors that are typically larger than twice the standard deviation of satellite retrievals (2σ ), and can therefore be considered significant in the statistical sense.We found that several of them resulted from sporadic snowfall events either on-site or in the CLARA-A1-SAL aggregation area, as mentioned above.Others are likely a result of having less available AVHRR data during the 1990s (compared to the 2000s).
We display the overall achieved retrieval accuracy as RMSE units and mean relative retrieval errors over yearquarters in Table 3.Both pentad and monthly mean RMSEs are low (≈ 0.04), and the relative retrieval errors are on the order of −10 % regardless of season.The seasonally invariant retrieval accuracy is a result of the (intra-annually) largely invariant in situ albedo of the site.The interannual variability of the in situ albedo at SGP is also small (not shown).The fairly homogeneous cropland surrounding the site suggests good spatial representativeness of the in situ albedo over the satellite footprint; we will return to the topic in detail in Sect.7.

Payerne BSRN site
The Payerne BSRN site is located in western Switzerland and is operated by MeteoSwiss.The site is located at the border of the town of Payerne; outside the town agricultural land cover dominates.The pyranometers used in this study are located 2 m above a grass field.
Like SGP, Payerne also experiences occasional snowfall during the winter period.A notable difference to SGP is that the grid-averaged CLARA-SAL products are substantially negatively biased.As the validation results in Table 3 show, RMSE of both pentad and monthly means is considerably larger here than at SGP, and relative retrieval errors of −40 % or more are common.However, we have analysed the instantaneous albedo retrievals at the higher GAC resolution and found a considerably smaller bias (RMSE 0.055).This points towards spatial representativeness issues at this validation site.
The overall retrieval accuracy during the validation period is illustrated in Fig. 4 and expressed numerically in Table 3.We will show in Sect.5.3 that problems in site representativeness are a likely cause of the observed negative bias.

Summit camp, Greenland
The Greenland Environmental Observatory, informally Summit camp, is situated near the highest point of the Greenland Ice Sheet.The ice sheet under the camp is over 3000 m thick.The altitude combined with the large distance from coast create conditions in which the perennial on-site snow cover does not melt even during midsummer.The albedo over the central part of the ice sheet is therefore very stable.This site has been used extensively in past MODIS snow albedo retrieval validation by Stroeve et al. (2005Stroeve et al. ( , 2006)).
The full validation period retrieval accuracy is illustrated in Fig. 5 and listed in Table 3.
The results show a good and stable retrieval accuracy over the entire period.The results are similar to the initial validation results for this retrieval method shown by Riihelä et al. (2010).It should be noted that the instantaneous CLARA-SAL retrievals are bidirectional broadband surface reflectance retrievals, therefore still containing reflectance anisotropy effects.The pentad and monthly means provide a good estimate of the surface albedo, although minor overestimations occur during spring and summer.
We further note that the good retrieval accuracy is also a result of well-matched atmospheric parameters.The atmosphere over Greenland is typically dry and has few aerosols (Stroeve et al., 1997).Therefore, the constant AOD of 0.1 used in this release of CLARA-SAL data is an acceptable approximation of prevailing conditions.The outliers showing significant underestimates are nearly always a result of cloud mask misclassifications over a bright snow surface.

Validation results for sea ice
For assessing CLARA-SAL retrieval quality over sea ice, we utilize surface broadband albedo observations from the Tara (Gascard et al., 2008) and Surface Heat Budget of the Arctic Ocean (SHEBA) (Perovich et al., 2002) floating ice camps as reference data.The changing location of the ice camps is accounted for when selecting CLARA-SAL grid cells for validation.To better assess the capacity of CLARA-SAL in tracking the changing albedo of Arctic sea ice over polar summer, we focus our validation on pentad mean products.

SHEBA validation
Our validation study utilizes SHEBA surface broadband albedo data from May 1998 to September 1998 to match Figure 6 shows the achieved retrieval accuracy at pentad mean level against SHEBA observations.CLARA-SAL clearly tracks the overall evolution of the sea ice pack from its dry snow phase through the snowmelt onset and melt pond formation phases.CLARA-SAL performance during late summer decreases as the sun zenith angles grow large, causing problems for both cloud identification and albedo estimation from fewer available data.Achieved RMSE for the pentad mean retrievals is 0.081.

Tara validation
Because Tara was icebound between September 2006 and January 2008, we focus our study on polar summer 2007 when SAL is available.Figure 7 shows the retrieval accuracy of CLARA-SAL relative to the albedo observations at Tara ice camp.The RMSE of the retrieval over this dataset is 0.090, similar to the SHEBA results.In a previous study, Riihelä et al. (2010) compared Tara albedo data to the higher-resolution operational SAL product of CM SAF.They found a better retrieval accuracy (RMSE of 0.045) when comparing only against confirmed clear-sky in situ data from Tara, although the exclusion also limited the dataset significantly compared to this study.Since cloudy conditions were dominant during the Tara study period, we tested the effect of the estimated +0.05 difference between sea ice albedo under cloudy and clear sky.Assuming the Tara in situ data to correspond mainly to cloudy sky albedo, we subtracted 0.05 from each sample and recalculated the RMSE, improving it to 0.067.As the CLARA-SAL grid is coarser than that used by Riihelä et al. (2010), we can hypothesize that much of the remaining difference is due to increased heterogeneity of the satellite observations in the larger CLARA-SAL grid cell over Tara.

Site representativeness analysis results
The validation results from land-based sites in Table 3 show considerable variability.To investigate how much of this variability may be explained by surface albedo heterogeneity within the CLARA-SAL grid cell, we performed a semivariogram analysis over 12 of our validation sites located over www.atmos-chem-phys.net/13/3743/2013/Atmos.Chem.Phys., 13, 3743-3762, 2013 vegetated areas.The sites were all the non-perennial snow sites in Table 2, excluding Alert and Boulder (BOS).The land cover of the sites is shown in Table 3.In subplot (e) we show the comparison between the normalized area integrals of the semivariograms and the mean RMSE of the June-July-August period CLARA-SAL pentad products.The JJA RMSE was chosen as a reference metric since the analyzed Landsat images of the sites were mainly taken between late May and early August, and because snowfree conditions are necessary for this analysis.The results show a statistically significant correlation (1-p > 0.99) between RMSE and the normalized semivariogram integral.A linear fit to the data has R 2 of 0.68, and the regression line has a low zero offset, which should be the case if the basis of this method is physically sound.We used 14 Landsat surface reflectance images from the 12 sites for this comparison.The data was processed at the Global Land Cover Facility of the University of Maryland after Masek et al. (2006).
The semivariograms show features that also correlate plausibly to actual terrain on-site.For example, the clear increase in the semivariogram estimator at Payerne when lag distance is close to 300 Landsat pixels is most likely due to the influence of Lac de Neuchâtel, a large lake northwest of Payerne.At Barrow, the coastal location of the site and the presence of several smaller water bodies nearby have been known to cause challenges for coarse-resolution imagers during surface parameter retrieval (Niu and Pinker, 2011).The calculated semivariogram estimator can increase or decrease with increasing lag length, depending on how poorly or well the average reflectance of Landsat pixels at that specific lag length resembles the reflectance at the validation site (i.e. at lag length 0).
The method we presented has some sources of uncertainty.The available Landsat images are widely dispersed over the summer months, which causes variance in the semivariogram integrals because of phenological changes in vegetation reflectance.Also, as we use surface reflectance imagery, the associated atmospheric correction has some uncertainty; we refer the reader to Masek et al. (2006) for a detailed discussion.For our purposes, we selected two Landsat images at Payerne -one from early May and another from mid-Julyand compared their normalized semivariogram integrals.The difference between the integrals was 25.8 %, which we take as an empirical estimate for the semivariogram integral uncertainty.Subplot (e) of Fig. 8 shows these error bars for each calculated integral.As described in Sect.4.2, we finally inverted the normalized semivariogram integrals and used them as weights in estimating the overall retrieval accuracy of CLARA-A1-SAL.For the non-perennial snow sites using all seasons, we obtained weighted pentad and monthly mean RMSE of 0.085 and 0.082, respectively.This weighted monthly mean RMSE equates to a mean relative retrieval error of −11.1 %.As over-and underestimations both exist in the retrievals, it is appropriate to consider this value as an estimate of retrieval uncertainty regardless of sign.If only the summer period (JJA) is considered, the weighted pentad and monthly mean RMSEs are 0.048 and 0.033.
We conclude that the 11% weighted uncertainty is the sum of contributions from the systematic error sources within the algorithm (inherent and aerosol-related under average atmospheric conditions of these sites).Even allowing for the observed 25 % variation in the weights, this systematic retrieval uncertainty does not exceed 15 %.The retrieval errors listed in Table 3 follow from this value through the additional effects of unaccounted-for aerosols and the spatial heterogeneity of the landscape around the validation site.For the perennial snow sites, where we may assume the landscape to be uniform, no weighing for spatial representativeness is necessary.The average RMSE for both pentad and monthly CLARA-A1-SAL over snow and ice is 0.101.

Comparison of the dataset with MCD43C3 products
In situ validation yields valuable information about retrieval quality at individual sites, but offers little opportunity to evaluate the product at larger scales.To accomplish this, we compared CLARA-SAL to the MODIS MCD43C3 blacksky surface albedo product over the year 2009.The MODIS albedo product and its validation efforts have been extensively described in the literature (e.g.Schaaf et al., 2002;Jin et al., 2003;Cescatti et al., 2012).To match the CLARA-SAL products to the MODIS 16-day albedo means, the following procedures were followed.First, The MCD43C3 product was coarsened to 0.25-degree resolution by averaging 5 × 5 pixel regions with the assumption that the spatial matching is thus sufficiently good to allow a general comparison of the products.We then created a MODIS-equivalent CLARA-SAL product by averaging the pentads which fit within the MODIS 16-day period, plus minus one day.Three pentads were generally available for averaging per MODIS product.
Before the comparison, we excluded areas where either CLARA-SAL or MODIS was not retrieved.We also excluded coastal areas, mountainous areas where the products are not comparable due to the topography correction in CLARA-SAL, and any areas where the MODIS retrieval quality flag indicates sub-optimal retrieval.Also, we did not consider any areas where the annual mean AOD is over 0.4 (based on the MISR annual mean AOD product, see Fig. 13 for illustration).This final step allows for us to remove areas where AOD effects dominate the difference, permitting study of algorithm-related differences.
The comparison results are shown in Fig. 9.In subplot (a), we see that in the overall annual cycle the mean albedo of land/snow surfaces is remarkably similar between the products, although CLARA-SAL is consistently 10-15 % higher.Considering that the systematic retrieval uncertainty of CLARA-A1-SAL may be estimated at 11 % (see previous section), the remaining difference is attributable mainly to differences in aerosol and anisotropy correction differences.It should be kept in mind that this result is an average difference across all land/snow surfaces, and should not be used to infer anything about MCD43C3 product accuracy.As CLARA-A1-SAL tends to retrieve higher forest and perennial snow albedos than MCD43C3, the average difference is weighed positive as a result.
The difference may also be studied as a function of time and land cover class, as is shown in subplot (b).The land cover dataset used is the MCD12C1 majority land cover dataset (IGBP classification) for 2009.We display the median relative difference, which is more representative of typical differences than the arithmetic average in this case.Agreement is best for perennial snow surfaces, barren ground and open shrublands.Over these terrain types, CLARA-A1-SAL and MCD43C3 agree within their inherent uncertainties.The forest classes in general exhibit larger disagreement.The largest differences occur over evergreen broadleaf forests, which consist mainly of the world's rainforests.Disagreement over rainforest regions may be expected as the aerosol loading over these areas tends to be high and subpixel cloud contamination is common at AVHRR resolution, both of which create a tendency for CLARA-A1-SAL to overestimate the rainforest albedo.The average difference over most land cover classes tends to increase 5-10 % during midsummer as a result of the summer AOD maximum over the Northern Hemisphere.
We may also look at the differences directly as a function of AOD; these results are shown in subplot (c).The Monitoring Atmospheric Composition and Climate (MACC) AOD reanalysis dataset (Morcrette et al., 2009;Benedetti et al., 2009) was resampled spatially and averaged temporally to fit the resolution of this analysis.The median of the relative differences of the CLARA-SAL/MCD43C3 comparison for each AOD bin (sized 0.01) is shown.We chose the median in lieu of the mean to avoid giving too much weight to singular grid cells with unrealistically large differences.AOD values below 0.05 occur only rarely and are therefore discarded from the analysis as unrepresentative.
The large landmass of Northern Eurasia, where smaller AOD values of 0.1 to 0.15 occur frequently, is excluded from the analysis during the earliest and latest time periods of 2009 due to sun zenith angle cutoff.This can enhance the apparent seasonal effect in subplot (c) at low AOD values.Overall, the median difference as a function of AOD does not clearly separate into classes.Differences remain between 10 and 20 % at mean AOD values up to 0.3.
It is noteworthy that both products agree within 10 % over Sahara and other desert regions, despite the differences in aerosol handling between the products.Li and Garand (1994) observed that aerosol effects on surface albedo retrievals are smallest when surface albedo is near 0.5, which is approximately the albedo of the brightest regions of Sahara.Considering also that the AVHRR solar channel radiances used for CLARA-A1-SAL have been recalibrated using the MODIS radiance record, we can explain why MOD43C3 and CLARA-A1-SAL agree very well over high-reflectance deserts; differences in atmospheric correction inputs have minimal effect, and the underlying radiance data records are linked.

Product stability and retrieval uncertainty considerations
Evaluation of the long-term stability of the CLARA-SAL timeseries requires study of a target with minimal natural variability to separate algorithm-related changes from actual natural albedo changes.We decided to use highelevation ice sheets for this purpose.We chose the central  part of the Greenland Ice Sheet (GrIS) (75 • N, −42.5 • E) and the widely used Dome-C calibration site on Antarctica (−75.1 • N, 123.4 • E).To minimize sporadic cloud misclassfication errors, a 0.5 × 0.5 degree study area was averaged for analysis at both sites.
We calculated the regional mean albedo from each monthly mean product between 1982-2009 and also analyzed the regional means of the number of observationsdatafield and the standard deviation of the monthly mean albedo.The calculated timeseries for GrIS is shown in Fig. 10.
The 28 yr mean albedo over central GrIS is 0.844, which is very well in line with the commonly cited albedo of dry fresh snow over GrIS (≈0.84) (Konzelmann and Ohmura, 1995).The maximum deviation from the 28 yr mean is 6.8 %.If our assumption that the GrIS interior has a constant albedo, this deviation may be attributed solely to CLARA-SAL retrieval uncertainty.How good is that assumption?We have included only the highest regions of central GrIS to avoid any regions that experience surface melt (Fettweis et al., 2011), but we note that there is recent evidence of a negative albedo trend over the last decade even at the high-elevation parts of the GrIS (Box et al., 2012).This implies that the Greenland site albedo may actually not be naturally stable from the year 2000 onwards.
The Dome-C results are similar and therefore not shown in detail; the 28 yr mean albedo is 0.89 with a maximum deviation of 5.5%.Deviations up to 5 % are expected from natural variation in snow grain size (Jin et al., 2008).The value of the 28 yr mean albedo somewhat overestimates ground measurements, which have retrieved 0.8 as the typical clear-sky albedo at Dome-C (Pirazzini, 2004).This is understandable given that the atmospheric correction inputs to CLARA-A1-SAL overestimate the very thin, dry and ozone-poor atmosphere over Dome-C.We also note that seasonal variation in solar zenith angles will cause some variation in the monthly mean CLARA-SAL albedo, as the data is unnormalized in this first release.
Product stability can also be studied through its anomalies.Figure 11 shows the deaseasonalized land/snow surface albedo anomalies vs. 1982-1998 mean as zonal averages across the dataset.The strongest feature in the figure is the positive anomaly at midlatitudes (0-20 • N) between mid-1991 to 1994.This feature correlates strongly with a sizable increase in atmospheric mid-latitude AOD observed in the TOMS aerosol timeseries (Torres et al., 2002), identified as a result of the Mt.Pinatubo eruption in June 1991.The continuation of the mid-latitude anomaly into 1994 is likely also a result of documented large Saharan dust outbreaks during that year (Díaz et al., 2001).We further note that smaller positive and negative mid-latitude albedo anomalies between 1982-1990 appear to correlate with measurements of tradewind aerosols at Barbados, directly downstream of the Saharan dust plume (Prospero and Lamb, 2003), although retrieval variance over, e.g.tropical rainforests, will also affect the SAL latitudinal anomalies.Albedo anomalies in the subpolar regions result mainly from year-to-year variations in snow cover.
The effect of the assumption of constant AOD = 0.1 (at 550 nm) on the surface albedo product was analyzed in varying atmosphere using the SPCTRAL2 albedo model by Bird and Riordan (1986).AOD values corresponding to the AVHRR channel 1 and 2 wavebands were chosen to vary between 0.1 and 0.4.The water vapour values used were 0.5, 2 and 3.5 mm and the ozone values were 0.25, 0.35 and 0.5 atm cm.The surface spectra were the same 87 as in the paper by Manninen et al. (2012)  values.Figure 12 shows the relative difference of the albedo determined using the constant AOD assumption of the SAL product and the true black-sky albedo.Obviously the SAL albedo is mostly overestimating the true albedo, but typically between 5-10 %.Only for large incidence angles the SAL albedo tends to be lower than the black-sky albedo, but the relative difference is smaller than 10 %.It should be noted that Fig. 12 depicts the additional retrieval error from using a constant AOD of 0.1, not the total effect of aerosols on albedo retrieval.Rutan et al. (2009) found a similar range of variation (8-10 %, their Table 1) for their radiative transfer model-based sensitivity study of CERES albedo retrievals, using a similar AOD variation range.
Figure 13 shows the annual mean AOD at 555 nm for the year 2010 from the MISR instrument (Martonchik et al., 1998), highlighting regions where annual mean AOD exceeds 0.25.Sahara, the Amazon rainforest, the Middle East and Southeast Asia are regions where we may expect substantial retrieval errors to occur as a result of the constant AOD in CLARA-SAL; elsewhere the additional error is acceptable.It should also be noted that the constant AOD does create larger retrieval errors in the visible surface reflectances than in broadband albedo.However, over vegetated surfaces the near-infrared AVHRR channel dominates the narrow-tobroadband albedo conversion used in CLARA-A1-SAL.This lessens the influence of AOD variation, since solar radiation at near-infrared wavelengths is scattered considerably less effectively by aerosols.Over the snow-covered polar regions the aerosol effect could be substantially larger, but in situ measured aerosol concentrations at 500 nm have typically been between 0.05 and 0.2 (Tomasi et al., 2007), matching our AOD treatment and thus ameliorating the issue.
The algorithm uncertainties have been discussed in detail in Riihelä et al. (2010) for the operational SAL product over snow and ice, which mostly apply to this study as well.The narrow-to-broadband conversion algorithms have reported uncertainties of 5-10 % (Liang, 2000;Xiong et al., 2002).The BRDF correction and spectral albedo calculation for vegetation (Roujean et al., 1992;Wu et al., 1995) is estimated to have an uncertainty in the same order of magnitude.Cloud mask misclassifications cause the largest retrieval errors, but they occur only sporadically so that their effect on the temporal means is generally low.Polar regions are an exception; users are encouraged to screen grid cells with only a few averaged overpasses.
Another retrieval uncertainty issue which needs to be mentioned is the evolution of land cover.As the CLARA-SAL BRDF correction uses local land cover as an input variable, natural and man-made changes in both sub-and superpixel land cover can effect retrieval quality in a systematic way.In this first release, we utilized a single land use cover (LUC) dataset (USGS classfication of 1992-1993) as the basis for calculations.Updates to LUC datasets occur infrequently, but it is possible that, e.g., high temporal and spatial resolution datasets of vegetation phenology, could be used to update land cover data for albedo calculations along the lines of Stöckli et al. (2011).
Uncertainties in the radiance intercalibration of the AVHRR data record would directly affect SAL accuracy.Heidinger et al. (2010) validated the calibration to be accurate within 2-3 % relative to MODIS.Molling et al. (2010) made a thorough review of past AVHRR calibration work and compared the various currently developed calibrations, finding that differences of around 10% are common.They also discussed the problems in AVHRR calibration at length, including the choice of reference targets, solar input changes, and instrument degradation, to name a few.The reader is referred to these publications for a complete treatment of the topic.

Discussion
Our study of the first edition of the CLARA-SAL dataset has provided us with information concerning its strengths and weaknesses.While the overall albedo retrieval capability is comparable to previous AVHRR-based datasets, we are aware of certain issues which require improvement in future releases.Firstly, the accuracy of the atmospheric correction needs to be improved over high-aerosol regions.AOD datasets covering 1982 onwards are scarce, the ones used for the ERA-40 and ERA-Interim reanalysis datasets being the most analyzed (Tanré et al., 1984;Tegen et al., 1997).However, as the use of aerosol climatologies as satellite retrieval input tends to cause unwanted internal correlation between them, we are also examining methods to perform the atmospheric correction based solely on satellite data.
We are also planning improvements to the snow/ice and ocean albedo calculations.Utilization of long-term sea ice concentration datasets will remove cloud masking errors.In this first version of the dataset, cloud masking errors over sea ice usually occur at the edges of persistent cloud fronts and/or regions with high SZA.Users are strongly recommended to utilize the number of observations-datafield to establish an appropriate screening of data for their application.We are also investigating the inclusion of a ocean wind speed dataset to improve ocean albedo representativeness.

Conclusions
We have developed, processed and validated a timeseries describing the global black-sky broadband surface albedo between 1982-2009.In this paper we have described the processing algorithm and validation strategy, highlighted representative validation results and discussed the stability of the dataset, as well as its comparability to the well-known MODIS black-sky albedo product.The main results of our work are as follows: 1. Validation results from 14 land/seasonal snow sites combined with a quantitative analysis of site albedo representativeness show that the mean systematic retrieval uncertainty of CLARA-A1-SAL monthly mean products is 11 % relative to ground truth, given average aerosol conditions (AOD less than 0.4) Instantaneous SAL product Fig.A1.The SAL algorihm process as a flowchart.
albedo retrieval.We omit numerical details in this description.
First, any cloud-contaminated pixels are excluded from processing, as well as any pixels for which the sun zenith angle (SZA) is larger than 70 degrees or the viewing zenith angle (VZA) is larger than 60 degrees.The snow/ice flags in the cloud mask are used to identify snow cover and sea ice.
The topography affects the satellite image in first order in two ways: (1) the altitude difference with respect to sea level will cause the geolocation of the pixel to be shifted, and (2) the inclination of the slopes of the terrain within a pixel will alter its reflectance value.As the BRDF calculations are based on a horizontal plane assumption, erroneous values will be obtained for inclined slopes.In addition, the slope distribution of the terrain covered by the pixel may contain slopes that are not seen at all by the sun or the satellite.
The topography correction for geolocation considers the fact that latitude/longitude coordinates of a satellite pixel on the ground are calculated in AVHRR preprocessing assuming a flat Earth at sea level; at the swath edges where satellite viewing angle is large, actual location of a satellite pixel may differ considerably if the terrain is elevated.Where elevation is sufficiently high in the swath, the elevated pixel reflectance is moved onto its neighbour in the range direction, in fact replacing the original pixel.The GTOPO30 Digital Elevation Model (DEM) is used to determine pixel elevation.
The topography correction for radiometry considers the fact that AVHRR surface reflectances over sloped terrain are affected by the distribution and orientation of terrain slopes in the subpixel scale.Some slopes are not visible to the satellite, others are not illuminated by the sun, and some are neither.Yet, they would still contribute to the overall albedo of that area, and therefore their presence has to be corrected for.First, a high-resolution DEM from SRTM is fitted into the satellite data and viewing/illumination geometry data are applied to compute the number of SRTM slopes within the (GAC-resolution) satellite pixel that are illuminated, in shadow, or illuminated/viewable but not within the BRDF model validity range.BRDF calculations are repeated for each slope (as viewing/illumination geometry changes for each), assuming reflectance signature shapes to be identical for all slopes in the pixel.The calculated slope reflectances are averaged, and the average is adjusted to include the nonvisible slopes' effect.This average sloped terrain reflectance is finally compared to the observed AVHRR reflectance to obtain the correction factor.We only consider this correction for regions where mean slope exceeds five degrees.For more details on the topic, the reader is referred to Manninen et al. (2011).
The next processing step is to remove the atmospheric scattering and absorption effects from the imaged reflectances, transforming them from TOA reflectances into surface reflectances.We utilize the Simplified Method for Atmospheric Correction (SMAC) after Rahman and Dedieu (1994) to do this.The SMAC algorithm is a version of the Second Simulation of a Satellite Signal in the Solar Spectrum (6S) radiative transfer (RT) code, where some parts of the RT calculations have been parameterized to speed up the algorithm.The calculation is made separately for both AVHRR channels using continental coefficients.
The atmospheric correction requires four inputs: (1) aerosol optical depth (AOD) at 550 nm, (2) total ozone content of the atmosphere [atm cm −1 ], (3) total column water vapour [g cm −2 ], and (4) surface pressure [hPa].Surface pressure and total column water vapour are taken from the daily ERA-Interim reanalysis dataset.Ozone content of the atmosphere has typically a very small effect on visible and near-infrared radiances; therefore we use a constant value of 0.35 atm cm −1 .
After the atmospheric correction, the processing separates according to surface type.For non-snow covered land surfaces, we correct for anisotropy in the surface reflectance prior to the conversion into spectral albedo.The process uses the kernel-based BRDF/albedo model by Roujean et al. (1992), using land cover-specific BRDF coefficients from Wu et al. (1995).Landcover information is obtained with a nearest-neighbour retrieval from a 1 km USGS land cover dataset for each GAC overpass.While it is unavoidable that terrain heterogeneity will cause some errors in land cover assignment, their effect is not critical.Our BRDF correction model utilizes NDVI as its main parameter and does not typically introduce drastic corrections into the data.Also, the land cover information is aggregated into archetypal classes (forest, grassland, cropland, barren) prior to the correction, effectively ameliorating the breadth of the problem.
The derived spectral land surface albedos are finally converted into a single broadband surface albedo using the linear regression equations of Liang (2000).
For water, we employ the lookup-table (LUT) approach of Jin et al. (2004).In this first release of CLARA-SAL, the water albedo is very simplistic: The four inputs of the LUT are set to constants; SZA to 60 degrees, wind speed at surface is set to 10 m s −1 , AOD is kept at the processing constant of 0.1, and chlorophyll content of the water surface layer is set to 0.15 mg m −3 .Therefore ocean albedo is a constant 0.067 throughout the dataset.
For snow and sea ice, we keep the approach described and validated by Riihelä et al. (2010).The spectral surface reflectances are converted into broadband bidirectional reflectances using the empirical equations of Xiong et al. (2002).The conversion algorithm also accounts for ponding effects, which are important for late summer sea ice albedo retrievals.Instead of attempting an instantaneous reflection anisotropy correction, the broadband reflectances are averaged over the product period per grid cell, sampling the snow BRDF to create a snow broadband albedo estimate.Sufficient sampling of the viewing hemisphere is required to make the estimate accurate.Fortunately, AVHRR overpasses occur frequently over the high-latitude regions with significant snow cover.Figure A2 illustrates a typical distribution of surface reflectance retrievals in the viewing zenith/viewing azimuth angle space at Summit camp (72.58 • N, −38.5 • E) during the observable part of year 2007.Between April and August the viewing hemisphere is well sampled (between 110 and 291 retrievals per month) at this site, despite cloud cover and the viewing angle cut-off.Similar distributions have been observed at all of our high-latitude validation sites.

Fig. 1 .
Fig. 1.An example of the Arctic CLARA-SAL monthly mean product from August 2007.Grid cell size is 25 km.

Fig. 2 .
Fig. 2.An averaged composite of the global lat-long grid CLARA-A1-SAL monthly mean products from 2007.The grid cell resolution is 0.25 degrees.Validation sites are overlaid using black cross-in-circle markers.

Fig. 3 .
Fig. 3.The relative retrieval error of CLARA-SAL at SGP between 1994-2009.Upper subplot indicates color-coded relative retrievals errors of monthly means; the lower subplot shows the same for pentad means.The grey color indicates a retrieval error within ± 5 %.Cells with white color indicate non-available satellite or in situ data.

Figure 8
Figure8illustrates the calculated semivariograms and their comparison to CLARA-A1-SAL validation results.Each of the subplots (a)-(d)  shows the semivariogram at one example validation site as a function of lag length (distance from validation site coordinates, as multiples of 30 m).In subplot (e) we show the comparison between the normalized area integrals of the semivariograms and the mean RMSE of the June-July-August period CLARA-SAL pentad products.The JJA RMSE was chosen as a reference metric since the analyzed Landsat images of the sites were mainly taken between late May and early August, and because snowfree conditions are necessary for this analysis.The results show a statistically significant correlation (1-p > 0.99) between RMSE and the normalized semivariogram integral.A linear fit to the data has R 2 of 0.68, and the regression line has a low zero offset, which should be the case if the basis of this method is physically sound.We used 14 Landsat surface reflectance images from the 12 sites for this comparison.The data was processed at the Global Land Cover Facility of the University of Maryland afterMasek et al. (2006).The semivariograms show features that also correlate plausibly to actual terrain on-site.For example, the clear increase in the semivariogram estimator at Payerne when lag distance is close to 300 Landsat pixels is most likely due to the influence of Lac de Neuchâtel, a large lake northwest of Payerne.At Barrow, the coastal location of the site and the presence of several smaller water bodies nearby have been known to cause challenges for coarse-resolution imagers during surface parameter retrieval(Niu and Pinker, 2011).The calculated semivariogram estimator can increase or decrease with increasing lag length, depending on how poorly or well the average reflectance of Landsat pixels at that specific lag length resembles the reflectance at the validation site (i.e. at lag length 0).The method we presented has some sources of uncertainty.The available Landsat images are widely dispersed over the

Fig. 8 .
Fig. 8. Semivariogram estimators of Landsat channel 4 surface reflectance for (a) Barrow, (b) Southern Great Plains, (c) Payerne, (d) Sodankylä.(e) shows the correlation between the mean RMSE error of the pentad CLARA-SAL products from the June-July-August period, and the normalized area integrals of the semivariograms.
Fig. 9. (a): CLARA-SAL and MCD43C3 mean retrievable blacksky albedo (in %, left axis) and relative difference (in %, right axis) over land/snow surfaces through 2009.(b): Relative median difference (CLARA-SAL minus MCD43C3) of products as a function of land use class and time through 2009, only land/snow surfaces considered.(c): Relative median difference (CLARA-SAL minus MCD43C3) of products as a function of aerosol optical depth and time through 2009.Only land/snow surfaces considered in all plots.White color in subplots (b) and (c) marks no data.

Fig. 10 .
Fig. 10.(a) illustrates the monthly means of surface albedo over 28 years, averaged over the central part of GrIS.The thick green line shows the interannual mean albedo of the region.(b) shows the mean number of AVHRR overpasses in each monthly mean in (a), averaged over the study region.Finally, (c) shows the mean standard deviation of each monthly mean in (a), averaged over the study region.

Fig. 12 .
Fig. 12.The relative difference in albedo retrieval, resulting from using a constant AOD of 0.1 versus true AOD (up to 0.4) for a range of surface spectra and sun zenith angles.Results shown as a function of true black-sky albedo of the terrain.

Fig. 13 .
Fig. 13.The global annual mean AOD at 555 nm from the MISR AOD product for the year 2010.Color coding set to highlight regions where the annual mean exceeds 0.25; full range of data not shown.

Fig. A2 .
Fig. A2.Sampling of the viewing hemisphere in AVHRR surface reflectance retrievals at Summit camp on the Greenland Ice Sheet during 2007.Radial direction indicates increasing viewing zenith angle; the azimuth direction indicates increasing viewing azimuth angle.

Table 2 .
The validation sites used in this study.Sites listed with in the 2nd and 4th block use an alternative method for formulation of the site's temporal mean albedo.See text for details.

Table 3 .
Validation results over the full validation period at each site.Pentad and monthly mean RMSE are defined for albedo values in the range of 0-1.