Quantification of model uncertainty in aerosol optical thickness retrieval from Ozone Monitoring Instrument (OMI) measurements



Introduction
Many ongoing studies aim for a better understanding of atmospheric aerosol properties such as size distribution, type, optical properties, formation, and transport. The remote sensing of atmospheric aerosols from space enables the monitoring of aerosols on both regional and global scales. Satellite measurements are widely used together with ground-based and airborne measurements to provide data for important atmospheric aerosol studies related to, for example, climate change, energy budget, air quality and cloud properties. Atmospheric aerosols have been monitored for years from several satellite instruments including the Ozone Monitoring Instrument (OMI), the Moderate Resolution Imaging Spectroradiometer (MODIS), the Global Ozone Monitoring Experiment-2 (GOME-2), the Multi-angle Imaging SpectroRadiometer (MISR), the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO), the Scanning Imaging Absorption spectroMeter for Atmospheric Chartography (SCIAMACHY) and the Polarization and Directionality of the Earth's Reflectances (POLDER). The instrument characteristics vary in spatial resolution, wavelengths, polarization and view angle. The aerosol retrieval algorithms have been developed and gradually improved as the knowledge of satellite sensors has increased. Developments in retrieval algorithms have even made it possible to retrieve aerosol properties from instruments that were not originally designed for this.
The determination of aerosol properties from satellite measurements is an ill-posed inverse problem, as the limited information content in the observations does not allow for a complete determination of the aerosol properties. Prior information, such as assumed surface conditions and the selection of aerosol properties for pre-calculated radiative transfer models, is an essential part of the retrieval process. For the solution of the inverse problem, various assumptions and simplifications are needed.
The forward problem in aerosol retrieval is based on radiative transfer calculations which depend on various aerosol properties. Currently, these calculations are too time consuming to be performed simultaneously with the retrieval inversion, and many operational algorithms are based on pre-calculated look-up tables (LUT) for a selection of aerosol types. In this paper the aerosol model selection problem refers to this choice of the LUT, or several LUTs, that are most representative of the current aerosol type. The atmospheric aerosol column above the ground pixel can be a mixture of several aerosol types, which complicates the choice of the correct aerosol type. One important reason for the disagreement in the results derived from different satellite instruments for the same location and time is the difference in the algorithms and in the assumption of the underlying aerosol model (Kokhanovsky et al., 2010; Li et al., 2009; Livingston et al., 2009). The choice of an appropriate aerosol model is thus a significant part of the aerosol optical thickness retrieval.
The aim of this paper is to provide statistical tools that account for the aerosol model selection uncertainty when quantifying the uncertainty in the retrieved aerosol optical thickness (AOT). The aerosol optical thickness (also called aerosol optical depth) is a dimensionless measure of the amount of light absorbed or scattered by the aerosols. In addition, another source of uncertainty that is usually neglected in satellite retrieval analyses is related to model error. The aerosol models contain results of radiative transfer calculations for various aerosol physical properties. A simple LUT-based forward model vastly simplifies the actual atmospheric conditions. We use tools from Bayesian model selection methodology to weight the models according to their predictive abilities (MacKay, 1992; Spiegelhalter et al., 2002; Robert, 2007) and combine information about the AOT by averaging over the best fitting models (Hoeting et al., 1999). The model discrepancy is modelled using Gaussian processes that define the allowed deviations between modelled and observed reflectance by a suitable covariance structure that lets model residuals correlate depending on their wavelength distance (Kennedy and O'Hagan, 2001; Rasmussen and Williams, 2006). The error model for this discrepancy has been built by empirically exploring a set of residuals of aerosol model fits to the observed reflectances.
This methodology is applied to OMI measured reflectances at the top of the atmosphere (TOA) using the aerosol models of the OMI multi-wavelength aerosol algorithm OMAERO. The operational OMAERO product uses a look-up table (LUT) based technique for the retrieval of aerosol optical properties in the ultraviolet and visible wavelength region. The multidimensional LUT contains pre-calculated microphysical aerosol models having specific optical properties such as aerosol optical thickness (AOT) and single scattering albedo (SSA). The aerosol models represent four main types of aerosols: desert dust, biomass burning, weakly absorbing and volcanic aerosols (Torres et al., 2002, 2007; Livingston et al., 2009). The methodology is also applicable to other instruments.
This study is a part of the PP-TROPOMI project funded by the Finnish Technology and Innovation Agency (TEKES), where one aim was to improve the existing model selection algorithm of OMI to the benefit of the future TROPOMI instrument algorithm development (Veefkind et al., 2012). The next section introduces the OMAERO algorithm. Section 3 describes the Bayesian model selection technique used to choose the aerosol model based on satellite observations with associated uncertainty. The characteristics of the model error are determined in Sect. 4. Finally, aerosol model selection in different atmospheric cases is exemplified in Sect. 5.

Data and operational OMI multi-wavelength aerosol algorithm OMAERO
In this study we have used reflected solar radiation measurements from OMI on board NASA's Earth Observing System (EOS) Aura satellite, launched in July 2004. The Aura spacecraft is in a polar sun-synchronous orbit at an altitude of 705 km, providing daily global coverage with 14 orbits. The OMI instrument has been built in cooperation between Finland and the Netherlands. OMI is a nadir-viewing solar backscatter spectrometer measuring in the ultraviolet (UV) and visible (VIS) regions between 270 and 500 nm. The ground pixel size is 13 × 24 km² at nadir. The quantities retrieved from the OMI-measured Earth radiance and solar irradiance spectra at high spatial resolution include aerosol characteristics, surface UV, cloud information and atmospheric trace gases including ozone, NO2, SO2, HCHO, BrO and OClO. The retrievals are used in studies of air quality, ozone trends, and the relation between atmospheric chemical composition and climate change (Levelt et al., 2006a, b; Torres et al., 2007).
The operational OMI multi-wavelength aerosol algorithm OMAERO has been developed to retrieve aerosol optical properties for cloud-free scenes using the reflectance spectrum in the near UV and visible wavelength range between 331 and 500 nm. The OMAERO Level-2 data are publicly available from the NASA GSFC Earth Sciences (GES) Data and Information Services Center (DISC) (http://disc.gsfc.nasa.gov/Aura/OMI/omaero_v003.shtml). The available data period is from 1 October 2004 to the present. The OMAERO Level-2 product provides aerosol properties including aerosol type, AOT, SSA, aerosol absorption indices and other related data (Torres et al., 2002, 2007). The principal component analysis applied to the OMAERO algorithm by Veihelmann et al. (2007) shows that OMI reflectance measurements have two to four degrees of freedom of signal.
Over land pixels, the current version (V003) of the operational OMAERO product uses a surface albedo climatology based on five years of OMI observations. Over oceans, the spectral bidirectional reflectance distribution function is calculated by means of an ocean model that accounts for wind speed and chlorophyll concentration climatology.

Aerosol optical thickness retrieval
For the OMAERO algorithm, the radiative transfer calculations in the cloud-free scene have been done in advance for a range of aerosol physical properties and sun-satellite geometries, generating the corresponding microphysical aerosol models (Torres et al., 2002, 2007). These aerosol models are divided into four main types: desert dust, biomass burning, weakly absorbing and volcanic aerosols. The main types are divided into subtypes according to aerosol size distribution, refractive index and vertical profile, ending up with about fifty aerosol models in total. The content of the aerosol models are [...]
The operational OMAERO algorithm uses TOA reflectance measurements R_obs(λ) to find the aerosol model and the value of the AOT, τ, that best match the observations. As the spectral shape of the AOT is fixed by any given aerosol model input configuration, there is only one parameter to be fitted, the AOT at the reference wavelength (500 nm), τ = τ(λ_ref).
In the fitting procedure, a subset of aerosol models is preselected for the current pixel according to a priori knowledge of the regional and seasonal distribution of aerosols. The fitting is done using the least squares criterion by minimizing
$$
\chi^2_{\mathrm{mod}}(\tau) = \sum_{i=1}^{L} \left( \frac{R_{\mathrm{obs}}(\lambda_i) - R_{\mathrm{mod}}(\tau, \lambda_i)}{\sigma(\lambda_i)} \right)^2, \tag{1}
$$
where L is the number of wavelength bands, σ(λ_i) is the uncertainty in the measured reflectance given as a standard deviation and R_mod(τ, λ_i) is the reflectance from the aerosol LUT model (Torres et al., 2002, 2007). The best fitted model is selected according to the test value χ²_mod and is used to determine the spectral AOT. The operational product also provides the precision of the AOT. In addition, a maximum of ten models, for which the root mean square of the residual reflectance is below a given threshold value, are delivered with the related AOT and SSA (Torres et al., 2007; Livingston et al., 2009).
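The one-parameter fit described above can be sketched as follows. This is only an illustration: the linear toy forward model `toy_r_mod`, the grid search over τ and all numeric values are assumptions made for the example, whereas the operational algorithm obtains the modelled reflectance from the OMAERO look-up tables. The misfit is the sum of squared residuals between observed and modelled reflectance, each scaled by the measurement standard deviation.

```python
def chi2_mod(tau, r_obs, sigma, r_mod):
    """Noise-normalised least-squares misfit for one aerosol model.

    r_mod(tau, i) returns the modelled reflectance at band i; it is a
    hypothetical callable standing in for a look-up-table interpolation.
    """
    return sum(((r_obs[i] - r_mod(tau, i)) / sigma[i]) ** 2
               for i in range(len(r_obs)))


def fit_tau(r_obs, sigma, r_mod, tau_grid):
    """Return (tau, chi2) with the smallest misfit over candidate AOT values."""
    return min(((tau, chi2_mod(tau, r_obs, sigma, r_mod)) for tau in tau_grid),
               key=lambda pair: pair[1])


# Toy forward model (assumption): reflectance grows linearly with AOT.
BASE = [0.10, 0.09, 0.08]
SLOPE = [0.050, 0.040, 0.030]


def toy_r_mod(tau, i):
    return BASE[i] + SLOPE[i] * tau


tau_grid = [k * 0.01 for k in range(301)]        # candidate AOT values 0.00 ... 3.00
r_obs = [toy_r_mod(0.8, i) for i in range(3)]    # noise-free synthetic observation
sigma = [r / 500.0 for r in r_obs]               # noise level for SNR = 500
tau_hat, chi2_min = fit_tau(r_obs, sigma, toy_r_mod, tau_grid)
```

Repeating `fit_tau` for every preselected model and keeping the model with the smallest misfit mirrors the selection by the test value χ²_mod.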

Reflectance
The aerosol optical thickness τ = τ(λ_ref) is retrieved from the TOA reflectance spectrum. The TOA spectral reflectance R_obs(λ) is calculated as the ratio of the observed OMI Level 1b Earth radiance E(λ) to the observed OMI Level 1b solar irradiance spectrum F(λ) by
$$
R_{\mathrm{obs}}(\lambda) = \frac{\pi E(\lambda)}{\cos(\theta_{\mathrm{sun}})\, F(\lambda)}, \tag{2}
$$
where θ_sun is the solar zenith angle (Levelt et al., 2006b; Torres et al., 2007).
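As a small worked illustration, the TOA reflectance can be computed band by band as π times the radiance divided by the irradiance and the cosine of the solar zenith angle. The function name and the input lists are hypothetical, and the radiance and irradiance are assumed to be given in mutually consistent units.

```python
import math


def toa_reflectance(radiance, irradiance, sza_deg):
    """Return pi * E / (cos(theta_sun) * F) for each wavelength band.

    radiance and irradiance are equally long sequences in consistent units;
    sza_deg is the solar zenith angle in degrees.
    """
    if not 0.0 <= sza_deg < 90.0:
        raise ValueError("solar zenith angle must be in [0, 90) degrees")
    mu0 = math.cos(math.radians(sza_deg))
    return [math.pi * e / (mu0 * f) for e, f in zip(radiance, irradiance)]
```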

Bayesian model choice
There are various sources of uncertainty affecting the accuracy of the retrieved AOT values, and the selection of the correct LUT for the modelled reflectance calculations is only one factor. Others are related to the size of the OMI pixels, sub-pixel cloud contamination, aerosol horizontal inhomogeneity, etc. One large source of uncertainty comes from the use of surface albedo climatologies. In this study, we use Bayesian model selection tools to select the most appropriate LUT and to quantify the related uncertainty. Secondly, we need to take into account the other sources of uncertainty that might cause systematic model discrepancies. This is done by modelling the model discrepancy with Gaussian processes, as described in Sect. 4. We want to choose, from a set of models, the aerosol model that provides the best explanation of the observed reflectance at each OMI pixel. As several models might be equally good, an important task here is to quantify the uncertainty coming from the model selection procedure. We will use tools from Bayesian statistical inference. We utilise model choice, model averaging and modelling of the model error, which naturally account for the different sources of uncertainty. Bayesian analysis will provide the solution to the estimation problem as a posterior probability density that is a measure of uncertainty in the quantity of interest after accounting for the uncertainties in the modelling procedure.
Model selection in general is a delicate problem that cannot be solved by statistical reasoning only. For a given data set, there will be an infinite number of different models that fit the data equally well. Here we deal with the specific problem of choosing the most suitable model from a given set of candidate models. We acknowledge the fact that none of the models might give an adequate fit to the observations and want to have a measure for that situation, too.

Bayesian parameter estimation and model comparison
A typical statistical parameter estimation procedure proceeds in steps, where first a given model is fitted to the observations to get a parameter estimate and its uncertainty. Then the model residuals (i.e. the differences between the modelled and the observed values) are studied to see if the assumptions on the residuals are met. This is called model diagnostics, where one typically checks for any systematic features in the residuals, which signal inadequacy in the model formulation, and for the form of the distribution of the residuals, which can signal problems in the statistical assumptions. When we have several possible models, as in the OMI case, one can fit all the models, one by one, and see which provides the best fit according to some chosen criterion, such as minimum least squares.
We recall and outline Bayesian parameter estimation and model selection in the current framework of finding the posterior distribution of the AOT parameter τ using the OMAERO algorithm. The posterior distribution for the uncertainty in τ after observing R_obs is given by Bayes' formula
$$
p(\tau \mid R_{\mathrm{obs}}, m) = \frac{p(R_{\mathrm{obs}} \mid \tau, m)\, p(\tau \mid m)}{p(R_{\mathrm{obs}} \mid m)}, \tag{4}
$$
where the likelihood p(R_obs|τ, m) and the prior distribution p(τ|m) depend on the model m. This gives valid posterior inferences about τ given the observed and modelled reflectance and the prior distribution on τ, assuming that m is the correct model. As the posterior p(τ|R_obs, m) is a probability distribution and the denominator p(R_obs|m) does not depend on τ, the latter must be a constant that normalizes the numerator,
$$
p(R_{\mathrm{obs}} \mid m) = \int p(R_{\mathrm{obs}} \mid \tau, m)\, p(\tau \mid m)\, \mathrm{d}\tau. \tag{5}
$$
For model selection this constant has an important use. It is the probability of observing R_obs given the model. This value is sometimes called the evidence. Basically, we could select the model that has the largest evidence with respect to the observations. There are some caveats on using the evidence for model choice pointed out in the statistical literature (Robert, 2007). In this particular case, where a one-dimensional parameter is fitted with a selection of possible models, we find basic Bayesian model selection very useful, provided that we can account for the model error as done in Sect. 4. The least squares criterion in Eq. (1) has a direct counterpart within Bayesian inference, as it appears exactly in the likelihood function for Gaussian observation error,
$$
p(R_{\mathrm{obs}} \mid \tau, m) \propto \exp\left( -\frac{1}{2} \sum_{i=1}^{L} \left( \frac{R_{\mathrm{obs}}(\lambda_i) - R_{\mathrm{mod}}(\tau, \lambda_i)}{\sigma(\lambda_i)} \right)^2 \right), \tag{6}
$$
where we assume the measurement noise standard deviations σ(λ) to be known. If we assume an uninformative prior for τ, i.e. p(τ|m) = 1, the least squares estimate, the maximum likelihood estimate (MLE) and the Bayesian maximum a posteriori (MAP) estimate are all equal.
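Because τ is one-dimensional, the posterior and the evidence can be approximated by simple numerical integration over a grid of τ values. The sketch below is one possible way to do this and is not taken from the operational code: the callables `r_mod` and `log_prior` are hypothetical, and the constant factor of the Gaussian likelihood is dropped, which leaves the comparison between models unchanged as long as all models share the same σ(λ).

```python
import math


def log_likelihood(tau, r_obs, sigma, r_mod):
    """Gaussian log-likelihood up to an additive constant: minus half the misfit."""
    chi2 = sum(((r_obs[i] - r_mod(tau, i)) / sigma[i]) ** 2
               for i in range(len(r_obs)))
    return -0.5 * chi2


def trapezoid(y, x):
    """Trapezoidal rule for samples y on an increasing grid x."""
    return sum(0.5 * (y[k] + y[k + 1]) * (x[k + 1] - x[k])
               for k in range(len(x) - 1))


def posterior_and_log_evidence(r_obs, sigma, r_mod, tau_grid, log_prior):
    """Return (posterior density on tau_grid, log evidence) for one model."""
    log_u = [log_likelihood(t, r_obs, sigma, r_mod) + log_prior(t)
             for t in tau_grid]
    shift = max(log_u)                          # subtract the maximum to avoid underflow
    u = [math.exp(v - shift) for v in log_u]
    integral = trapezoid(u, tau_grid)           # unnormalised evidence, rescaled by exp(shift)
    posterior = [v / integral for v in u]
    return posterior, shift + math.log(integral)
```

With a flat prior (`log_prior` returning 0.0) the grid point with the highest posterior density coincides with the least squares estimate, as stated above.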
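To compare models, we use a method based on the posterior model probabilities. For a model m and measurements R_obs we use Bayes' theorem again to obtain
$$
p(m \mid R_{\mathrm{obs}}) = \frac{p(R_{\mathrm{obs}} \mid m)\, p(m)}{p(R_{\mathrm{obs}})}, \tag{7}
$$
where p(m) is the prior probability that model m is the correct one. This formula gives the probability of model m given the measurements, assuming that the measurements have been generated by one of the models considered. The evidence term from Eq. (5) appears here as the marginal likelihood p(R_obs|m) of the observed data within model m. The denominator is again a normalizing constant, defined as a sum over all the n models considered,
$$
p(R_{\mathrm{obs}}) = \sum_{i=1}^{n} p(R_{\mathrm{obs}} \mid m_i)\, p(m_i). \tag{8}
$$
As we are going to deal with a relatively small number of different models, this term is easily calculated, provided we can calculate the individual evidences.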
When all models are equally likely a priori, the model comparison and the calculation of relative weights for each model simplify to calculating the relative evidences:
$$
p(m \mid R_{\mathrm{obs}}) = \frac{p(R_{\mathrm{obs}} \mid m)}{\sum_{i=1}^{n} p(R_{\mathrm{obs}} \mid m_i)}. \tag{9}
$$
Consequently, in this case the model with the highest evidence is the best among the models involved. We can compare models to see if one is clearly the best with respect to the other models, or if there are several that are almost as plausible. However, having the largest evidence among a set of models does not in itself guarantee a good, or even adequate, fit.
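The posterior model probabilities follow from normalising the evidences, weighted by the model priors, over the candidate models. A minimal sketch is given below; it assumes the evidences are available as logarithms, which is a choice made here for numerical stability rather than something prescribed by the method, and the function name is hypothetical.

```python
import math


def model_probabilities(log_evidences, priors=None):
    """Posterior model probabilities from log evidences.

    priors must be strictly positive; if omitted, all models are taken to be
    equally likely a priori and the result reduces to the relative evidences.
    """
    n = len(log_evidences)
    if priors is None:
        priors = [1.0 / n] * n
    log_w = [le + math.log(p) for le, p in zip(log_evidences, priors)]
    shift = max(log_w)                       # log-sum-exp trick against underflow
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return [v / total for v in w]
```

For example, two models whose evidences differ by a factor of three receive the weights 0.75 and 0.25 under equal priors.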

Bayesian model averaging
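In practice, several aerosol models can provide an equally good explanation of the measurements, and the particular one with the highest evidence may have obtained it just by chance. If there is uncertainty in the model selection, it should be accounted for in the inference about the quantity of interest. A Bayesian model averaging technique (Hoeting et al., 1999; Robert, 2007) enables shared inference about an unknown that appears in several alternative models. Bayesian model averaging uses a combined posterior distribution defined by weighting the individual posteriors by their evidence-based weights,
$$
p_{\mathrm{avg}}(\tau \mid R_{\mathrm{obs}}) = \sum_{i=1}^{n} p(\tau \mid R_{\mathrm{obs}}, m_i)\, p(m_i \mid R_{\mathrm{obs}}). \tag{10}
$$
If different models give rise to different values for the unknown, then the uncertainty in the averaged posterior distribution p_avg(τ|R_obs) can be larger than it is with any single model. This means that the uncertainty in model selection has been incorporated into the result (Hoeting et al., 1999; Robert, 2007).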
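The averaging step is a weighted sum of the per-model posteriors evaluated on a common τ grid. The following sketch illustrates it together with a helper for the mean and standard deviation of a gridded density; the function names and the shared-grid representation are assumptions made for the example.

```python
import math


def trapezoid(y, x):
    """Trapezoidal rule for samples y on an increasing grid x."""
    return sum(0.5 * (y[k] + y[k + 1]) * (x[k + 1] - x[k])
               for k in range(len(x) - 1))


def averaged_posterior(posteriors, weights):
    """Weighted sum of per-model posterior densities given on a shared grid."""
    if len(posteriors) != len(weights):
        raise ValueError("one weight per model is required")
    n_grid = len(posteriors[0])
    return [sum(w * p[k] for w, p in zip(weights, posteriors))
            for k in range(n_grid)]


def mean_and_std(density, grid):
    """Mean and standard deviation of a normalised density on a grid."""
    mean = trapezoid([t * p for t, p in zip(grid, density)], grid)
    var = trapezoid([(t - mean) ** 2 * p for t, p in zip(grid, density)], grid)
    return mean, math.sqrt(var)
```

When two equally weighted models give narrow posteriors centred at different AOT values, the standard deviation of the averaged density exceeds that of either model alone, which is the widening effect described above.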

Modeling the model uncertainty
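In the presence of model error, the observed reflectance can be written as the sum of the modelled reflectance, a model discrepancy term η(λ) and the measurement noise ε_obs(λ),
$$
R_{\mathrm{obs}}(\lambda) = R_{\mathrm{mod}}(\tau, \lambda) + \eta(\lambda) + \epsilon_{\mathrm{obs}}(\lambda). \tag{11}
$$
As before, we assume that the spectral measurement uncertainty due to instrument noise is known and Gaussian,
$$
\epsilon_{\mathrm{obs}}(\lambda) \sim N\left(0, \sigma^2(\lambda)\right). \tag{12}
$$
We wish to build a statistical model for the remaining model discrepancy term η(λ). In order to see how this discrepancy behaves, we studied residuals of model fits, i.e. the differences between the observed and the modelled reflectances,
$$
R_{\mathrm{res}}(\lambda) = R_{\mathrm{obs}}(\lambda) - R_{\mathrm{mod}}(\tau, \lambda), \tag{13}
$$
at wavelengths λ for an ensemble of residuals representing varying atmospheric situations (see Fig. 2). The modelled reflectances were calculated from the aerosol models that were the most appropriate according to the operational OMAERO product. We found that the residuals typically have very similar systematic behaviour that could be modelled by a suitable correlation structure. By using standard tools from spatial statistics, we estimate this correlation structure and use it to build a model for the model error.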

Gaussian process
Following Kennedy and O'Hagan (2001), we use a Gaussian process to model the model discrepancy η(λ) between the aerosol model generated reflectance and the observations. A Gaussian process is a stochastic process for which every finite set of its realizations has a joint Gaussian distribution (Rasmussen and Williams, 2006). It is a theoretical tool that provides a general and flexible framework for constructing the error model. As we only deal with finite representations, in practice we can work with random variables and covariance matrices.
A Gaussian process is defined by its mean and covariance function, and the essential part of the implementation is the determination of the covariance function and the estimation of its parameters. We will model the model discrepancy as a zero mean Gaussian process η(λ) ∼ GP(0, C), where the covariance function C quantifies the correlation properties of the discrepancy. As there typically is no direct data available about the covariance, one proceeds by assuming a certain parameterized functional form. Following Banerjee et al. (2004), we derived the covariance function C using a Gaussian variogram model. The covariance depends only on the wavelength distance |λ_i − λ_j| and is defined as
$$
C(\lambda_i, \lambda_j) =
\begin{cases}
\sigma_0^2 + \sigma_1^2, & \lambda_i = \lambda_j, \\
\sigma_1^2 \exp\left( -\dfrac{(\lambda_i - \lambda_j)^2}{l^2} \right), & \lambda_i \neq \lambda_j,
\end{cases} \tag{14}
$$
where l is the correlation length parameterizing the distance between two wavelengths over which the residuals are still correlated. The parameter σ_0² represents the non-spatial diagonal variance and σ_1² corresponds to the spatial variance. These three parameters σ_0², σ_1² and l are the essential characteristics of the covariance function to be determined. In the next section we show how we estimated the covariance function empirically from the wavelength dependent correlation structure of the residuals of model fits.
After the model discrepancy term has been estimated, the theoretical covariance function is used to form the corresponding covariance matrix C defined for the range of wavelength bands of the observations. Then it can be incorporated into the likelihood function (Eq. 6) as an additional error covariance,
$$
p(R_{\mathrm{obs}} \mid \tau, m) \propto \exp\left( -\frac{1}{2} R_{\mathrm{res}}^{T} \left( C + \mathrm{diag}\left(\sigma^2(\lambda)\right) \right)^{-1} R_{\mathrm{res}} \right), \tag{15}
$$
where R_res is the residual of the model fit (Eq. 13). The joint covariance matrix in Eq. (15) now consists of two elements: C is the covariance matrix for the model discrepancy (Eq. 14) and diag(σ²(λ)) is the diagonal matrix having the measurement error variances σ²(λ) as its diagonal elements.
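To make the construction concrete, the sketch below builds a discrepancy covariance matrix for the wavelength bands and evaluates the Gaussian log-likelihood of a residual vector under the joint covariance. The squared-exponential form exp(−(Δλ/l)²) with a diagonal nugget term is an assumption of this example (a Gaussian variogram model in the parameterization of Banerjee et al., 2004), and the small pure-Python Cholesky factorization is used only to keep the example self-contained. The band list is the one used in the Results section.

```python
import math

# The 14 OMI wavelength bands (nm) used in the Results section.
OMI_BANDS_NM = [342.5, 367.0, 376.5, 388.0, 399.5, 406.0, 416.0,
                425.5, 436.5, 442.0, 451.5, 463.0, 477.0, 483.5]


def discrepancy_covariance(wavelengths, sigma0_sq, sigma1_sq, corr_len):
    """Covariance matrix with an assumed squared-exponential correlation."""
    n = len(wavelengths)
    cov = [[sigma1_sq * math.exp(-((wavelengths[i] - wavelengths[j]) / corr_len) ** 2)
            for j in range(n)] for i in range(n)]
    for i in range(n):
        cov[i][i] += sigma0_sq              # non-spatial variance on the diagonal
    return cov


def cholesky(a):
    """Lower-triangular factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def log_likelihood_with_discrepancy(residual, cov_model, sigma):
    """Log density of the residual under N(0, C + diag(sigma^2))."""
    n = len(residual)
    total = [[cov_model[i][j] + (sigma[i] ** 2 if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    low = cholesky(total)
    z = []                                  # forward substitution: low * z = residual
    for i in range(n):
        z.append((residual[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    quad = sum(v * v for v in z)            # residual^T total^{-1} residual
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (quad + log_det + n * math.log(2.0 * math.pi))
```

Setting the discrepancy covariance to zero recovers the independent Gaussian likelihood with measurement noise only, so the two cases compared in the Results section differ only in the covariance matrix passed to the same function.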
By choosing a suitable representation for the model error covariance matrix C we allow a smooth departure of the model from the observed reflectance. The covariance function parameters define this allowed smoothness. As a consequence we achieve a more realistic, although wider, uncertainty estimate of the AOT.

Empirical semivariogram
The wavelength dependent correlation structure of the residuals can be estimated by means of the empirical semivariogram. The relationship between theoretical variogram models and the covariance functions of Gaussian processes gives a way to determine the covariance function of the model discrepancy (Banerjee et al., 2004). The empirical semivariogram for a particular distance d between wavelengths λ_i and λ_j is given as
$$
\hat{\gamma}(d) = \frac{1}{2\, n(d)} \sum_{|\lambda_i - \lambda_j| = d} \left( R_{\mathrm{res}}(\lambda_i) - R_{\mathrm{res}}(\lambda_j) \right)^2, \tag{16}
$$
where n(d) is the number of pairs of wavelengths with the same distance d. For a particular distance d, the sum of squared residual differences is taken over the set of wavelength pairs with that distance. The variance of the difference between residuals at any two wavelengths is assumed to depend only on the wavelength distance.
We have calculated the empirical semivariogram (Eq. 16) for the ensemble of residuals from different orbits. The empirical semivariogram at different wavelength distances d is plotted as circles in Fig. 3. This figure shows the wavelength dependent correlation structure of the residual differences. The residuals are similar at nearby wavelengths, while the variance of the residual differences increases for wavelength pairs that are farther apart.
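The empirical semivariogram of an ensemble of residual spectra can be computed by pooling the squared residual differences of all wavelength pairs. Because the wavelength bands are not equally spaced, the sketch below groups the pairs into distance bins of width `bin_width`; this binning, like the function name, is an assumption of the example and not necessarily the procedure used for Fig. 3.

```python
def empirical_semivariogram(wavelengths, residual_sets, bin_width):
    """Return {bin centre: semivariance}, pooled over all residual spectra.

    residual_sets is a list of residual vectors, one per fitted pixel, each
    given on the same wavelength bands.
    """
    sums, counts = {}, {}
    n = len(wavelengths)
    for res in residual_sets:
        for i in range(n):
            for j in range(i + 1, n):
                d = abs(wavelengths[i] - wavelengths[j])
                b = int(d // bin_width)                  # index of the distance bin
                sums[b] = sums.get(b, 0.0) + (res[i] - res[j]) ** 2
                counts[b] = counts.get(b, 0) + 1
    # Half the mean squared difference within each bin.
    return {(b + 0.5) * bin_width: sums[b] / (2.0 * counts[b])
            for b in sorted(sums)}
```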
Next we estimate the parameters of a theoretical parametric semivariogram model that fits the empirical semivariogram. In the literature there are several predefined parametric forms for the semivariogram (Banerjee et al., 2004). The commonly used Gaussian variogram model used here is given as
$$
\gamma(d) =
\begin{cases}
\sigma_0^2 + \sigma_1^2 \left( 1 - \exp\left( -\dfrac{d^2}{l^2} \right) \right), & d > 0, \\
0, & \text{otherwise}
\end{cases} \tag{17}
$$
(Banerjee et al., 2004). The correlation length l defines a scale for the distance between wavelengths over which the residuals are still correlated. The parameters l, σ_0² and σ_1² are tuning parameters of the variogram model that correspond exactly to those of the covariance function in Eq. (14). The fitted Gaussian semivariogram model is plotted in Fig. 3 as a solid curve.
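One simple way to fit a Gaussian variogram model to empirical semivariogram values is sketched below. It is an assumption of this example, not the fitting procedure documented for the study: for each candidate correlation length on a grid, the model is linear in σ_0² and σ_1², so these two are obtained by ordinary least squares, and the correlation length with the smallest squared error is kept. The variogram form with exp(−(d/l)²) is likewise assumed, and no positivity constraint is imposed on the fitted variances.

```python
import math


def gaussian_variogram(d, sigma0_sq, sigma1_sq, corr_len):
    """Gaussian variogram model; zero at distance zero."""
    if d == 0:
        return 0.0
    return sigma0_sq + sigma1_sq * (1.0 - math.exp(-(d / corr_len) ** 2))


def fit_gaussian_variogram(distances, gammas, corr_len_grid):
    """Return (sigma0_sq, sigma1_sq, corr_len) minimising the squared error.

    distances must be positive; gammas are the empirical semivariances.
    """
    n = len(distances)
    y_mean = sum(gammas) / n
    best = None
    for corr_len in corr_len_grid:
        g = [1.0 - math.exp(-(d / corr_len) ** 2) for d in distances]
        g_mean = sum(g) / n
        sxx = sum((gi - g_mean) ** 2 for gi in g)
        if sxx == 0.0:
            continue                                   # degenerate regressor
        sxy = sum((gi - g_mean) * (yi - y_mean) for gi, yi in zip(g, gammas))
        sigma1_sq = sxy / sxx                          # least-squares slope
        sigma0_sq = y_mean - sigma1_sq * g_mean        # least-squares intercept
        sse = sum((sigma0_sq + sigma1_sq * gi - yi) ** 2
                  for gi, yi in zip(g, gammas))
        if best is None or sse < best[0]:
            best = (sse, sigma0_sq, sigma1_sq, corr_len)
    if best is None:
        raise ValueError("no usable correlation length in the grid")
    return best[1], best[2], best[3]
```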
To illustrate the covariance function parameters, Fig. 4 shows how the averaged posterior probability (Eq. 10) changes when the correlation length l in the covariance function (Eq. 14) is increased from 20 to 200. The averaged posterior probability of τ is the weighted mean of the posteriors within the best models. Between any two wavelength bands separated by the chosen correlation length, the modelled reflectance is allowed to diverge smoothly from the measured reflectance instead of fitting closely at the intervening wavelength bands. That is, the higher the value of the correlation length, the more smoothly the modelled spectral reflectance is allowed to deviate from the measurements. This is related to the higher uncertainty from the model discrepancy, which in our case increases the uncertainty in the AOT retrieval.

Results
The aerosol model selection, model averaging and model discrepancy modelling are demonstrated here by four examples representing different atmospheric aerosol situations where we expect different dominant main aerosol types. In the examples we have tested the method in two cases: without the model discrepancy term (Eq. 6) and with the model discrepancy term included (Eq. 15). Table 1 lists the examples with the appropriate information. The selected pixels are cloud-free and over land.
The basis of our work is the OMI multi-wavelength algorithm OMAERO (Torres et al., 2002) introduced in Sect. 2. We used spectral measurements from 14 wavelength bands: 342.5, 367.0, 376.5, 388.0, 399.5, 406.0, 416.0, 425.5, 436.5, 442.0, 451.5, 463.0, 477.0 and 483.5 nm. There were some differences in our experimental retrieval algorithm compared to the operational OMAERO. We have taken the surface reflectivity at the given location and date from a database based on TOMS and MODIS data, whereas the current OMAERO product (V003) uses, over land, a surface albedo climatology based on OMI measurements spanning five years. For this study we have defined the measurement noise standard deviation σ(λ) by assuming a signal-to-noise ratio SNR = 500, i.e. σ(λ) = R_obs(λ)/SNR. We used fifty OMAERO aerosol model look-up tables (LUT), and the modelled TOA reflectance R_mod was calculated as in Eq. (3).
The size of the covariance matrix C in Eq. (15) depends on the number of wavelength bands involved. In our case the dimension of C is 14 × 14, which is quite moderate for the matrix operations needed. The empirical semivariogram model described in Sect. 4.2 was used to estimate the parameters defining the covariance matrix C as l = 90, σ_0² = 0 and σ_1² = 0.0004. An important aspect of Bayesian analysis, the specification of prior distributions, has not been discussed so far. As we are mainly performing a feasibility and method development study, we have used rather conventional choices. For each individual model fit the prior distribution for the AOT parameter τ was set to log-Gaussian with mean value 2 and 700 % standard deviation. This ensured the positivity of the estimated AOT values and was only weakly informative in all of the test cases. For the model choice, a uniform prior was used for p(m), i.e. all the models were a priori equally likely.

Greece forest fires 2007
During summer 2007 there were massive forest fires in many parts of Greece (Kaskaoutis et al., 2011). We considered two days, 16 and 25 August 2007, at approximately the same location in the Peloponnese (Table 1). The latter date represents the time when the fires were in their most disastrous phase in that area.
Figure 5 shows observed and modelled reflectances on the left panel for 16 August 2007. The observed reflectance is marked with blue dots and the measurement [...] not been involved. The five most likely models are of the weakly absorbing type (models with "1" as the first digit) and the biomass burning type ("2" as the first digit). The model "1213" has the largest support in explaining the observed reflectance. The averaged posterior distribution (Eq. 10), plotted as a thick red line, is spread over the posteriors of τ within these five models. The sharply peaked and narrow posterior probabilities indicate a low uncertainty of the retrieved aerosol optical thickness τ. We expected that this posterior underestimates the true uncertainty. The lower panel shows the results when the model discrepancy has been acknowledged in the fitting procedure. Now there are ten models almost as likely in the averaged posterior distribution of τ. It appears that the uncertainty averaged over models is very wide when the model discrepancy is involved. Also, the single posterior distributions of τ within models are clearly broader in this case.
On 25 August, all the best models are of the biomass burning type (see Fig. 6). Again, when the model discrepancy is not included, the upper panel of the figure gives the impression of a low uncertainty of the retrieved AOT. In addition, there is clearly only one best model according to the relative posterior model probability. When the model discrepancy is included (Fig. 6, bottom panels), there are seven models almost as likely. This can also be seen from the mean posterior curve, as the support is spread over the seven most likely models. When comparing the results of these two days, the aerosol load is larger on the latter day, leading to different aerosol models being chosen and to higher AOT estimates.

Russian wildfires 2010
There were several wildfires in the western part of Russia from the end of July until August 2010 (Mei et al., 2011; Mielonen et al., 2011). The sample ground pixel from 8 August 2010 is located near Moscow (Table 1). Figure 7 shows the reflectances and AOT estimates of the best fitted models when the model discrepancy is not included (upper panel) and when the model discrepancy is included (lower panel).
In both cases the two best fitted aerosol models are the same biomass burning type models. When the model discrepancy is not included, the model "2122" is clearly the most likely and the second best model does not have much weight. Because of this, the averaged posterior (in red) completely covers the posterior curve of τ within model "2122". When the model is fitted to the measured reflectance with the model discrepancy term included, the ranking between the two best models is no longer so clear.
The posterior distributions of τ under the best models are broad and now they overlap each other.

Sahara sand storm 2011
In April 2011 there was a strong Saharan dust storm (Preißler et al., 2011). At that time, favourable weather conditions helped the dust to be transported a long way across the North Atlantic and Europe (http://earthobservatory.nasa.gov/NaturalHazards/view.php?id=50123). We consider here the date of 5 April 2011 (Table 1). The best fitted aerosol models are of the desert dust type (Fig. 8). With or without model discrepancy, the same two best models have the largest evidence. Also, the best model "3212" has almost the same relative evidence in both cases, as seen from the relative posterior model probability percentage values in the legend boxes. However, when the model [...]

Conclusions
Our aim was to study the additional retrieval uncertainty originating from the need to select one look-up-table-based aerosol model from a set of pre-calculated models. We utilized Bayesian statistical methodologies that are general in scope and applicable to a wide range of similar problems. As a particular application example we used operational OMI reflectance measurements from NASA's Aura satellite and a modified operational OMAERO aerosol algorithm to estimate the aerosol optical thickness (AOT) parameter. In OMI, the amount of information in the measurements is known to be too limited to select the correct aerosol type accurately. Also, in practice there may be several models that explain the observations equally well.
The use of Bayesian statistical inference provides a unified approach for the quantification of uncertainties originating from the model choice and from parameter estimation. Here Bayes' formula is applied twice: first, when defining the posterior distribution of the unknown AOT within each aerosol model, and second, when comparing these models to select the most appropriate aerosol model. In our particular case there is only one unknown aerosol model parameter and the actual statistical calculations are rather simple. The obtained posterior probability weights of the models are used to build an averaged model that accounts for the uncertainty in the selection procedure.
The aerosol model represents some aerosol type with a certain size distribution, refractive index and aerosol layer height, and is an approximation of the reality which [...]
The applied error model is just one example of the different possible ways to explain the systematic departures. The selected Gaussian process approach allows the modelled reflectances to have smooth deviations from the observed reflectances, and in our studies it was able to account for the typical systematic features in the model residual. In our case, one global model error covariance matrix, once estimated, was used for all the test cases we considered. If needed, it would be possible to set up a table of model error covariance parameters depending, for instance, on the geographical distribution climatology of the models, or even to estimate the error model parameters individually for each orbit. Instead of using observed deviations, one way to study the model error would be to perform radiative transfer simulations for some fixed atmospheric states and then estimate the model deviations in these situations.
In our examples, all the available aerosol models were equally probable a priori. Because of the limited information in the measured reflectance, a prior selection of aerosol models for a certain location and time would be necessary in practice. Prior information about the background aerosol conditions is especially important in situations where the amount of aerosols is small, as the different models would then be indistinguishable based on the observed reflectance alone. In practice, these prior weights could be based on aerosol distribution climatologies.
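The effect of non-uniform prior weights can be illustrated with a short sketch. The log-evidence values and the "climatological" prior weights below are made-up numbers, and the function name is hypothetical; the only ingredient taken from the text is that the posterior model probability is proportional to the evidence times the prior weight.

```python
import numpy as np

def posterior_model_probabilities(log_evidences, prior_weights):
    """p(m_i | R_obs) proportional to p(R_obs | m_i) p(m_i), in log space."""
    log_evidences = np.asarray(log_evidences, dtype=float)
    prior_weights = np.asarray(prior_weights, dtype=float)
    prior_weights = prior_weights / prior_weights.sum()
    with np.errstate(divide="ignore"):       # allow zero prior weights
        log_post = log_evidences + np.log(prior_weights)
    log_post -= log_post.max()               # guard against underflow
    post = np.exp(log_post)
    return post / post.sum()

# Three nearly indistinguishable models (hypothetical log-evidences)
log_ev = [-10.0, -10.1, -10.2]
uniform = posterior_model_probabilities(log_ev, [1, 1, 1])
informed = posterior_model_probabilities(log_ev, [0.1, 0.1, 0.8])
```

With equal prior weights the first model is marginally preferred, whereas the assumed prior that favours the third model makes it the most probable one. This is the situation described above for low aerosol loads, where the measurements alone hardly separate the models.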
Our motivation was to improve the model choice process by acknowledging the uncertainties from model selection together with the measurement uncertainty and the aerosol model discrepancy, taking advantage of statistical methodologies. We have demonstrated that, with a relatively simple additional calculation, the existing OMAERO algorithm can be improved to include the model selection uncertainty in the retrieval uncertainty estimates. Further quantifying the benefits of these additional calculations would require more refined validation and comparisons of different aerosol retrieval products, both satellite and ground-based. However, we feel that our study has already demonstrated the importance and the added value of more careful model error modelling.

For a model $m$ and measurements $R_{\mathrm{obs}}$ we use Bayes' theorem again to obtain

$$ p(m \mid R_{\mathrm{obs}}) = \frac{p(R_{\mathrm{obs}} \mid m)\, p(m)}{p(R_{\mathrm{obs}})}, $$

chance. If there is uncertainty in the model selection, it should be accounted for in the inference about the quantity of interest. A Bayesian model averaging technique (Hoeting et al., 1999; Robert, 2007) enables shared inference about an unknown appearing in several alternative models. Bayesian model averaging uses a combined posterior distribution defined by weighting the individual posteriors by their evidence-based weights:

$$ p_{\mathrm{avg}}(\tau \mid R_{\mathrm{obs}}) = \sum_{i=1}^{n} p(\tau \mid R_{\mathrm{obs}}, m_i)\, p(m_i \mid R_{\mathrm{obs}}). $$
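One consequence of the averaged posterior is worth spelling out. Writing $w_i = p(m_i \mid R_{\mathrm{obs}})$, and letting $\mu_i$ and $\sigma_i^2$ denote the posterior mean and variance of $\tau$ under model $m_i$ (notation introduced here only for this illustration), the law of total variance gives the mean and variance of the averaged distribution as

$$ \bar{\mu} = \sum_{i=1}^{n} w_i\, \mu_i, \qquad \mathrm{Var}_{\mathrm{avg}}(\tau \mid R_{\mathrm{obs}}) = \sum_{i=1}^{n} w_i\, \sigma_i^2 + \sum_{i=1}^{n} w_i\, (\mu_i - \bar{\mu})^2 . $$

The first term is the weighted within-model uncertainty. The second term is the spread between the models, which vanishes only if all models with non-zero weight agree on the AOT estimate; it is through this term that the model selection uncertainty enters the averaged result.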
In Fig. 1, reflectance spectra from one OMI pixel are shown together with two fitted OMAERO models. The models represent two different aerosol main types, weakly absorbing (model "1212") and biomass burning (model "2223"). They both fit the observed reflectance equally well, and the modelled reflectance curves deviate from the observed reflectance curve in a similar but opposite way. Both models can explain the observations within the individual observation error-bar uncertainties, but there is a significant systematic bias. Next, we want to model this additional uncertainty caused by model discrepancy. To acknowledge the model discrepancy, or model error, we use an additional error term $\eta(\lambda)$ and write the general model equation as

$$ R_{\mathrm{obs}}(\lambda) = R_{\mathrm{mod}}(\tau, \lambda) + \eta(\lambda) + \epsilon_{\mathrm{obs}}(\lambda). $$

where $d = |\lambda_i - \lambda_j|$ is the particular distance between wavelengths. In spatial statistics parameter σ

length bands

uncertainty as error-bars for 2σ standard error. The posterior distributions of τ on the right panel describe the uncertainty of the retrieved AOT, assuming that the associated aerosol model is correct. The legend shows the relative posterior model probability percentage values for each of the aerosol models involved. The upper panel represents the results of model comparison and AOT estimation when the model error has

discrepancy term is included (lower panel), the posterior curves indicate higher uncertainty in the retrieved AOT τ value. The reflectance curves (left panel) show visible systematic errors in both of the models. The inclusion of model error shifts both posterior curves to the right and widens the uncertainty (right panel).
seldom matches the simplifying assumptions used in the model calculation. This causes additional uncertainty in the retrieval. This model discrepancy is taken into consideration by applying a Gaussian process model to describe the characteristics of the model error. The covariance function defining the model discrepancy is estimated empirically from an ensemble of residuals of model fits. Adding this model discrepancy term together with the measurement errors in the aerosol model fitting procedure allows the model to deviate more widely from the observed spectral reflectance.
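How a Gaussian process discrepancy term enters the fit can be sketched as follows. The squared-exponential covariance in the wavelength distance, its parameter values, and the function names are assumptions chosen for illustration; they are not the covariance function or parameters estimated in the study.

```python
import numpy as np

def discrepancy_covariance(wavelengths, sigma, length_scale):
    """Assumed squared-exponential covariance in d = |lambda_i - lambda_j|."""
    d = np.abs(wavelengths[:, None] - wavelengths[None, :])
    return sigma**2 * np.exp(-0.5 * (d / length_scale) ** 2)

def gaussian_log_likelihood(r_obs, r_mod, cov):
    """Log density of N(r_mod, cov) evaluated at r_obs (Cholesky based)."""
    resid = r_obs - r_mod
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, resid)         # whitened residual
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    n = resid.size
    return -0.5 * (z @ z + log_det + n * np.log(2.0 * np.pi))

# Hypothetical set-up: 9 bands (nm), a smooth systematic offset of 0.02
wavelengths = np.linspace(340.0, 500.0, 9)
r_mod = np.full(wavelengths.shape, 0.20)
r_obs = r_mod + 0.02
sigma_obs = np.full(wavelengths.shape, 0.005)

cov_obs = np.diag(sigma_obs**2)              # measurement error only
cov_total = cov_obs + discrepancy_covariance(wavelengths, sigma=0.02,
                                             length_scale=100.0)

ll_without = gaussian_log_likelihood(r_obs, r_mod, cov_obs)
ll_with = gaussian_log_likelihood(r_obs, r_mod, cov_total)
```

In this toy set-up the smooth offset is heavily penalized when only the measurement error is considered, while the combined covariance treats it as a plausible systematic deviation, so the log-likelihood is higher with the discrepancy term. In a retrieval, the same likelihood would be evaluated for each candidate τ and each aerosol model.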

Fig. 5. Fig. 6. Fig. 7.
Fig. 5. Case Greece 16 August 2007. Upper panel: the best five models when model discrepancy is not included. Bottom panel: the best ten models when model discrepancy is included. Observed and modelled reflectances are on the left, and posterior probability distributions for the AOT parameter τ are on the right. The reflectance observations are marked with blue dots and error bars corresponding to 2 × standard error uncertainty. The modelled reflectance curves on the left match the colours of the individual posterior distributions on the right, although the curves overlie each other. On the right, the thicker red curve is the averaged posterior distribution over the best models that account for at least 80 % of the total posterior weight of all the models.

Fig. 8. Case Sahara dust storm 2011. Upper panel: the best two models when the model discrepancy is not included. Bottom panel: the best two models when the model discrepancy is included. See Fig. 5 for more explanation.

Table 1. Orbits, dates and locations of example cases.