Bayesian statistical modeling of spatially correlated error structure in atmospheric tracer inverse analysis

We present and discuss the use of Bayesian modeling and computational methods for atmospheric chemistry inverse analyses that incorporate evaluation of spatial structure in model-data residuals. Motivated by problems of refining bottom-up estimates of source/sink fluxes of trace gases and aerosols based on satellite retrievals of atmospheric chemical concentrations, we address the need for formal modeling of spatial residual error structure in global-scale inversion models. We do this using analytically and computationally tractable conditional autoregressive (CAR) spatial models as components of a global inversion framework. We develop Markov chain Monte Carlo methods to explore and fit these spatial structures in an overall statistical framework that simultaneously estimates source fluxes. Additional aspects of the study extend the statistical framework to utilize priors on source fluxes in a physically realistic manner, and to formally address and deal with missing data in satellite retrievals. We demonstrate the analysis in the context of inferring carbon monoxide (CO) sources constrained by satellite retrievals of column CO from the Measurement of Pollution in the Troposphere (MOPITT) instrument on the TERRA satellite, paying special attention to evaluating performance of the inverse approach using various statistical diagnostic metrics. This is developed using synthetic data generated to resemble MOPITT data to define a proof-of-concept and model assessment, and then in analysis of real MOPITT data. These studies demonstrate the ability of these simple spatial models to substantially improve over standard non-spatial models in terms of statistical fit, ability to recover sources in synthetic examples, and predictive match with real data.

Correspondence to: C. Mukherjee (chiranjit@stat.duke.edu)


Model and inference setting
Bayesian statistical techniques are increasingly being used in atmospheric chemistry inverse modeling studies to refine bottom-up trace gas and aerosol source/sink flux estimates. Past inverse studies have generally focused on analysis of surface and airborne in situ measurements from geographically distributed sites for species such as CO2, CO, and CH4 (e.g., Enting et al., 1995; Hein et al., 1997; Houweling et al., 1999; Bergamaschi et al., 2000; Bousquet et al., 2000, 2006; Gurney et al., 2002, 2003; Kasibhatla et al., 2002; Petron et al., 2002; Peylin et al., 2002; Gerbig et al., 2003; Palmer et al., 2003; Rodenbeck et al., 2003; Fletcher et al., 2004; Michalak et al., 2004; Patra et al., 2005; Rayner et al., 2005; Baker et al., 2006; Mueller et al., 2008; Gourdji et al., 2010). More recently, inverse studies based on synthetic and real satellite retrievals of tropospheric trace gas concentration fields have identified the potential for these new measurements to further improve our understanding of trace gas fluxes at regional and sub-regional scales (e.g., Rayner and O'Brien, 2001; Jones et al., 2003; Arellano et al., 2004, 2006; Heald et al., 2004; Houweling et al., 2004; Petron et al., 2004; Chevallier et al., 2005a,b, 2007, 2009a,b; Stavrakou and Mueller, 2006; Meirink et al., 2008; Feng et al., 2009; Kopacz et al., 2009, 2010). To fully exploit the information content in these high-dimensional, spatially dense satellite data sets, we must address questions about the nature of spatial dependencies among observations that are not predicted by existing models, and about how to appropriately integrate those dependencies to ensure robust and unbiased inverse analyses. We show here how both modeling and computational issues can be addressed via Bayesian analysis of conditional autoregressive (CAR) spatial models that characterize spatial observation error fields.
Published by Copernicus Publications on behalf of the European Geosciences Union.

C. Mukherjee et al.: Bayesian spatial modeling in atmospheric inverse analysis
The general aims of the paper are to introduce and illustrate CAR models in this context, discussing computational implementations and giving examples in analyses of synthetic and real data. Our model and analysis have additional practically relevant features, including the use of constrained priors on source fluxes; the overall Bayesian analysis also defines inference on spatial dependence and variance parameters as well as on fluxes. We regard this work as a proof-of-concept, illustrating the ability to fit, assess and compare initial classes of spatial models in this context and, using formal statistical methods, to highlight the nature and extent of improvements over non-spatial models with synthetic and real data. While the class of CAR spatial models used here is simple and relatively limited in scope for modeling very diverse spatial patterns, it represents a first step towards more elaborate and adaptive models that may become more relevant for inverse analysis at higher resolutions and with access to increasingly rich satellite data.
We begin with the canonical model

y = K x + ε,    (1)

where y is an m × 1 vector of atmospheric concentration measurements for a particular species, x is an n × 1 vector of corresponding fluxes with individual elements x_i representing source/sink categories (e.g. fossil-fuel combustion, biomass burning, etc.) and/or geographical regions, and K is an m × n Jacobian matrix derived from an atmospheric chemistry transport model (CTM) that describes the relationships between discretized atmospheric concentrations and the fluxes corresponding to source/sink categories. The random m × 1 vector ε accounts for errors associated with the measurement technique and the chemical transport model, as well as representativeness errors arising from differences in resolution between the measurements and the model-calculated concentration fields. The vast majority of inverse modeling applications in atmospheric chemistry are based on this formulation under the assumption of linearity of atmospheric transport for unreactive species such as CO2, with additional linearizing assumptions regarding chemistry for reactive species such as CO and CH4. We also note that while the atmospheric concentration measurements are spatially and temporally resolved, the vector y is constructed by stacking measurements indexed by CTM grid cells and time. Global-scale atmospheric chemistry inverse modeling studies involving real or synthetic satellite retrievals have generally focused on analyzing monthly or weekly mean measurements that are spatially aggregated to the CTM grid resolution (typically 200-500 km in the horizontal). Satellite atmospheric concentration retrievals typically consist of vertically averaged information, which can be accounted for in Eq. (1) by appropriately modifying K based on the specific instrument characteristics. Past studies have generally focused on estimating regionally- and monthly-aggregated fluxes, though there is increasing interest in estimating fluxes at higher temporal and spatial resolution. As a result, m ≈ 10^5-10^6 and n ≈ 10^2-10^5 for applications involving a year's worth of data. Bayesian analysis generates the posterior density function p(x | y) ∝ p(y | x) p(x) using the likelihood function p(y | x) induced by the model (Eq. 1) and a specified prior p.d.f. p(x). In the context of atmospheric tracer inverse modeling, p(x) represents prior knowledge of fluxes from independent, bottom-up estimates. In atmospheric chemistry inverse modeling applications it has typically been assumed that p(x) and p(ε) are multivariate normal distributions, defined by x ∼ N(x_a, S_a) and ε ∼ N(0, S_ε), where x_a and S_a are the prior mean vector and covariance matrix for x, and S_ε is the observation error covariance matrix. Then, for known x_a, S_a and S_ε, inferences are defined by the resulting posterior (x | y) ∼ N(x_p, S_p), where

x_p = x_a + G (y − K x_a),  S_p = S_a − G K S_a,    (2)

with G = S_a K^T (K S_a K^T + S_ε)^{−1}. Here x_p and S_p are the posterior mean and covariance, respectively. The Gaussian, linear assumptions underlie analytic tractability in defining the closed-form posterior here, as well as extensions to time series of data (e.g., Prado and West, 2010), and so continue to be important in enabling applications.
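As a concrete illustration, the closed-form posterior of Eq. (2) can be computed directly for small problems. The sketch below (in Python with numpy; the function name is ours) uses dense linear algebra, which is only feasible for modest m; for the dimensions quoted above, inverting the m × m matrix directly is impractical, which is part of the computational motivation for the structured approaches of this paper.

```python
import numpy as np

def gaussian_posterior(y, K, x_a, S_a, S_e):
    """Closed-form posterior mean/covariance for the linear-Gaussian model
    y = K x + eps, with prior x ~ N(x_a, S_a) and eps ~ N(0, S_e)."""
    G = S_a @ K.T @ np.linalg.inv(K @ S_a @ K.T + S_e)  # gain matrix
    x_p = x_a + G @ (y - K @ x_a)                        # posterior mean
    S_p = S_a - G @ K @ S_a                              # posterior covariance
    return x_p, S_p
```

With K = S_a = S_e = I, each posterior mean element is simply the average of the prior mean and the corresponding observation, a useful sanity check.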
With some exceptions noted in Sect. 1.2, previous applications in atmospheric chemistry have generally assumed a diagonal structure for S_ε, due to the lack of effective and computationally efficient approaches to identifying and integrating relevant spatial structures. This eliminates the computational burden associated with calculating the matrix inverse of S_ε in Eq. (2). The equations clearly show, however, that if spatial dependencies in the model errors exist and can be captured by a relevant non-diagonal, structured covariance matrix S_ε, this will affect the posterior flux estimates x_p as well as the associated measures of uncertainty in S_p. The impact can be substantial, as demonstrated by earlier studies (e.g., Chevallier, 2007) and our examples below.

Application context and previous approaches
While the assumption of uncorrelated observational errors may be reasonable for inverse studies based on surface measurements from a limited number of geographically scattered locations, it is increasingly untenable for geographically dense satellite measurements. This is especially true when mid- and upper-tropospheric tracer concentrations (where transport is relatively fast) contribute disproportionately, relative to surface and lower-tropospheric concentrations, to satellite weighted column-average retrievals.
A few global-scale inverse modeling studies have considered observation error correlations and their impact on posterior flux estimates. Chevallier (2007) considered the problem of estimating CO2 fluxes from synthetic OCO measurements with known error correlations, and demonstrated that posterior source estimates and the corresponding uncertainties are sensitive to the treatment of observation error correlations in the inverse analysis. This study also showed that the effect of neglecting observation error correlations can be partially compensated for by techniques such as observation thinning and error variance inflation, but at the expense of not fully utilizing the information content of the measurements.
To avoid this loss of information, one can attempt to explicitly account for observation error correlations in the inverse analysis. For example, it has been proposed that spatial correlations associated with the CTM component of the observation error can be approximated by the spatial error covariance structure from pairs of short-term chemical forecast simulations with different forecast starting times (Jones et al., 2003). This approach to characterizing the CTM forecast error is intuitively appealing, but suffers from the drawback that running multiple forward chemical model simulations over seasonal and inter-annual time scales is computationally expensive. It is also often impractical to perform simulations with independent CTMs on a routine basis to more fully characterize chemical model transport errors.
An alternative approach involves statistically modeling spatial observation error structures and determining the associated parameters of the statistical error model as part of the inverse analysis. A key challenge in this context is computation. Traditional spatial modeling using standard Gaussian processes based on a spatial distance-based correlation function (e.g., Rue and Held, 2005) has been explored to a degree (e.g., Michalak et al., 2004; Mueller et al., 2008; Gourdji et al., 2010) for characterizing spatial error structure in the prior. In the context of modeling observation error structures in a fully Bayesian framework, this approach is severely limited computationally by the need to perform multiple matrix inversions on covariance matrices of order m. Our interest in exploring alternatives that do not involve approximation short-cuts addresses these scale-up issues head-on while evaluating the flexibility of the class of conditional autoregressive spatial models.

Overview
We discuss an approach that uses alternative spatial structures that (a) recognize and exploit the fact that the data are inherently grid-based, and (b) provide access to effective statistical computation using Bayesian simulation methods, specifically Markov chain Monte Carlo (MCMC) analysis (e.g., Gelman et al., 2004; Prado and West, 2010, chapter 1). We show how this allows direct and appropriate modeling of spatial dependencies in observation errors in an analysis that integrates evaluation of the spatial field structure together with inference on source fluxes. Our analysis involves additional modeling advances for the inverse problem, including the use of non-normal priors for fluxes to properly reflect the fact that the sources of interest in this study are positive, and the integration of missing-data analysis to account for and infer missing retrievals. We further note that computer code (in Matlab) for all the analyses reported here is available for others to explore and use.
Models of spatial structure in S_ε involve additional unknown parameters that define the spatial dependencies; denote these by θ. Further, since satellite retrievals are subject to substantial missing data, we explicitly recognize this: we denote by M the set of indices of missing retrievals, M ⊂ {1 : m}, while H is the set of indices of observed retrievals. Thus the observed data are y_H and the missing data y_M. Then, for a given prior p(x), the formal Bayesian inference problem is to compute and summarize the posterior p(x, θ, y_M | y_H). We do this using custom development of standard Bayesian statistical simulation methods based on MCMC; some summary aspects are mentioned here and in the Appendix, with full technical details provided in the supplementary material on statistical computation.

Spatial error structure: conditional autoregressive (CAR) model formulation
An approach based on Gaussian conditional autoregressive (CAR) spatial models is able, as we show, to define realistic and appropriate spatial structures for geographically dense satellite retrieval data on a lattice, while leading to a computationally tractable methodology for atmospheric tracer inverse modeling. The approach takes advantage of the fact that, under certain conditions, it is possible to statistically model the precision matrix S_ε^{−1} as a very sparse matrix defined by a very small number of parameters, and that these parameters can be efficiently inferred using MCMC algorithms.
In the basic model of Eq. (1), y represents the vectorized set of retrievals from the original global rectangular lattice, or grid. Suppose that y represents the vector of retrievals for a single month, and that ε_CAR (the notation change is to explicitly reflect the assumption of a CAR spatial structure) refers to the corresponding errors. Specification of a CAR model starts with an m × m proximity matrix, W, that assigns weights to the neighbors of each grid cell. In this application, we define the elements of W in terms of the distance δ_ij ≥ 0 between the centroids of grid cells i and j, where (lat_i, long_i) denotes the centroid of cell i (Eq. 3). The CAR model introduces spatial dependencies through the complete conditional distributions of the elements of ε_CAR; the error in cell i depends on its neighbors as

ε_i | ε_{−i} ∼ N( ρ Σ_{j≠i} (w_ij / w_i+) ε_j , τ_c^2 / w_i+ ),    (4)

where w_i+ = Σ_{j=1}^m w_ij. It is clear from the form of these conditional distributions that the spatial dependence parameter ρ defines association via the implied linear regression of each cell on its neighbors. In tandem, the scale parameter τ_c controls the level of variation in these conditional distributions. Global correlations between separated grid cells are induced as a result; cell i depends on a more distant cell k transitively, through its neighbors that link to neighbors of cell k, for example. Both parameters ρ, τ_c are to be estimated.
In the examples in this paper, we adopt a first-order neighborhood approach in which the 8 cells that are physical neighbors (N, NE, E, SE, S, SW, W, NW) of grid cell i, and only those, are neighbors in the model. It turns out that this first-order dependence structure is capable of capturing spatial patterns sufficient to reflect much of the residual dependency we see in MOPITT data, though more elaborate neighborhoods could be examined using the same Bayesian approach; the changes would simply use a different proximity matrix.
Define the m × m matrix D_w = diag(w_1+, w_2+, ..., w_m+) and the spatial precision matrix U = τ_c^{−2} (D_w − ρ W). It follows from the specification of the CAR model that the joint distribution of all m error values is

ε_CAR ∼ N(0, U^{−1}).    (5)

That is, the error covariance matrix S_ε is replaced by the spatially structured CAR covariance matrix U^{−1}, which has non-zero pairwise correlations between cells over larger distances, induced by the local neighborhood dependencies, even though U itself has zero entries between cells that are not neighbors. Model fitting and inference rely on posterior estimation of U through estimation of θ = (ρ, τ_c) as uncertain parameters. Some of the computational tractability in dealing with the spatial structure as an ingredient of the inverse analysis comes from the fact that the precision matrix U is sparse; i.e., the elements U_ij are non-zero only when cells i, j are neighbors. CAR models are capable of representing spatial structure that has traditionally been modeled via spatial distance-based correlation functions, referred to in the statistical literature as Gaussian processes (GP) (e.g., Rue and Held, 2005). One often-used form is the exponential kernel, in which the correlation between grid cells i, j is exp(−d_ij / L), where d_ij is the great-circle distance between the centroids of the cells and L is the range parameter. The appropriate way to compare with CAR models is to examine the conditional regression coefficients implied by such a GP model alongside those that define the CAR model, which are simply the ρ w_ij / w_i+ terms of Eq. (4).
Figure 1 shows such a comparison for several relevant values of L, each plotted side-by-side with a corresponding CAR model, on a 5 × 5 grid beyond which the GP coefficients are negligible. This ability to adequately match local spatial patterns is a general feature of CAR models, and this, combined with their computational accessibility, makes them a clearly dominant choice over GP models for all but very small problems.
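For readers who wish to experiment, a minimal sketch of the sparse precision construction U = τ_c^{−2}(D_w − ρW) with first-order 8-cell neighborhoods is given below. For simplicity it uses unit weights w_ij = 1 (an illustrative assumption; the paper's W depends on inter-centroid distances), and the function name is ours.

```python
import numpy as np
import scipy.sparse as sp

def car_precision(nrow, ncol, rho, tau, wrap_lon=False):
    """Sparse CAR precision U = tau^{-2} (D_w - rho * W) on an nrow x ncol
    lattice with first-order (8-cell) neighborhoods and unit weights."""
    m = nrow * ncol
    rows, cols = [], []
    for i in range(nrow):
        for j in range(ncol):
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    if di == 0 and dj == 0:
                        continue
                    ni, nj = i + di, j + dj
                    if wrap_lon:           # optionally wrap in longitude
                        nj %= ncol
                    if 0 <= ni < nrow and 0 <= nj < ncol:
                        rows.append(i * ncol + j)
                        cols.append(ni * ncol + nj)
    W = sp.csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(m, m))
    D_w = sp.diags(np.asarray(W.sum(axis=1)).ravel())
    return (D_w - rho * W) / tau**2
```

For |ρ| < 1 this U is strictly diagonally dominant and hence positive definite, and its sparsity (at most 8 off-diagonal entries per row) is what makes the MCMC computations tractable at the problem sizes considered here.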
Finally, note that we may aggregate measurements over a series of time epochs (e.g. from multiple months). In this case, the variance matrix of the extended ε_CAR will be block-diagonal, with the number of diagonal blocks U^{−1} equal to the number of time epochs considered in the analysis.

Prior specification for fluxes
To date, inverse applications in atmospheric chemistry have relied heavily on the linear-Gaussian theory associated with multivariate normal priors on x and the resulting analytic closed-form posterior summaries in Eq. (2). However, the fluxes of interest are often strictly non-negative, as, for example, when CO sources are being estimated. In exploring posterior inferences using the traditional normal priors, we routinely encounter posterior densities that give appreciable probability to negative flux values; this is a purely technical issue as, in such cases, the data:prior synthesis is surely consistent with very low or zero flux, whereas the mathematical assumption of normal priors leads to unconstrained posteriors that support practically meaningless negative values. Modern Bayesian computational methods allow us to do away with such physically unrealistic assumptions that have historically been made for purely mathematical tractability reasons. Here we simply adapt the usual assumptions and utilize priors for strictly positive fluxes that are the usual normal distributions, now truncated at a small chosen lower bound to ensure scientifically relevant posterior inferences that disallow negative values. The traditional normal prior specification based on bottom-up fluxes x_a = (x_a,i) has been to adopt a multivariate normal prior with S_a diagonal, with ith diagonal variance element S_a,i for each source i = 1, ..., n. In particular, we define S_a,i = c_a^2 x_a,i^2 based on a specified coefficient of variation c_a. The direct modification to ensure non-negative fluxes above a lower value is to take the prior as a product of independent priors p(x_i) over sources, where each is defined by

p(x_i) ∝ N(x_i | m_a,i, v_a,i) I(x_i > t_i).    (6)

Here I(·) is the indicator function, and t_i is a pre-specified small lower bound on realistic flux levels.
Given the prior flux estimates x_a,i and prior flux variances S_a,i, we can numerically match the expectation and variance of the prior distribution given by Eq. (6) with x_a,i and S_a,i to determine the values of the required prior parameters m_a,i and v_a,i. In the examples below, we use this specification with the lower bound on fluxes defined as t_i = x_a,i / 4.
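This moment-matching step can be carried out with a standard root-finder. The sketch below (function name ours, not from the paper's Matlab code) solves for the underlying normal location and scale so that the truncated distribution has the specified mean and standard deviation.

```python
import numpy as np
from scipy import optimize, stats

def match_truncated_normal(x_a, sd_a, t):
    """Find (m, s) so that N(m, s^2) truncated to (t, inf) has mean x_a
    and standard deviation sd_a."""
    def moments(p):
        m, log_s = p
        s = np.exp(log_s)               # parametrize log-scale to keep s > 0
        a = (t - m) / s                 # standardized lower truncation point
        d = stats.truncnorm(a, np.inf, loc=m, scale=s)
        return [d.mean() - x_a, d.std() - sd_a]
    m, log_s = optimize.fsolve(moments, [x_a, np.log(sd_a)])
    return m, np.exp(log_s)
```

For example, with a bottom-up estimate x_a = 100, coefficient of variation 0.5 (so sd_a = 50) and t = x_a/4 = 25, the solved (m, s) differ modestly from (100, 50) because the truncation removes part of the lower tail.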

Accounting for missing retrieval data
Satellite retrievals are inherently subject to missing data. This translates into an index set M for the CO retrievals that are missing, while those indexed in the set H are recorded. Writing the observed data sub-vector as y_H and the missing data sub-vector as y_M, we include the information that y_M is missing in the analysis. This is done in standard Bayesian fashion: y_M is included as part of the inference problem in an extended analysis that computes and summarizes aspects of the posterior p(x, θ, y_M | y_H).
The missing rows K(i, *) of the transport matrix, for each i ∈ M, are linearly interpolated using the neighboring grid cells, since the instrument characteristics required to calculate the corresponding elements of K are unavailable. The corresponding unknowns y_i, i ∈ M, are assigned values that are repeatedly updated via simulations from the relevant conditional posterior predictive distributions in the Bayesian MCMC analysis, as noted in the next section and detailed further in the Appendix.
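A simple neighbor-averaging stand-in for this interpolation might look like the sketch below; the exact interpolation scheme used in the paper is not specified here, so this is illustrative only (function name and row ordering are our assumptions, with K rows ordered cell-major over the lattice).

```python
import numpy as np

def fill_missing_rows(K, missing, nrow, ncol):
    """Replace rows of K at missing retrieval cells by the average of the
    rows at their observed first-order lattice neighbors."""
    K = K.copy()
    missing = set(missing)
    for idx in missing:
        i, j = divmod(idx, ncol)                   # lattice coordinates
        neigh = [(i + di) * ncol + (j + dj)
                 for di in (-1, 0, 1) for dj in (-1, 0, 1)
                 if not (di == 0 and dj == 0)
                 and 0 <= i + di < nrow and 0 <= j + dj < ncol
                 and (i + di) * ncol + (j + dj) not in missing]
        if neigh:                                  # average observed neighbors
            K[idx] = K[neigh].mean(axis=0)
    return K
```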

Bayesian computation
Iterative posterior simulation using MCMC has been the de facto standard for Bayesian statistical computation in the statistical community for some years (Gelman et al., 2004). Recalling that θ stands for the unknown parameters in the error variance S_ε, the full joint posterior distribution of all unknowns {x, θ, y_M} conditional on y_H is evaluated by simulating a large Monte Carlo sample and basing inferences on numerical summaries of that sample. MCMC analysis performs this simulation iteratively, successively updating each of the unknowns by simulation from a relevant conditional distribution that may involve some of the most recently simulated, or imputed, values of the other unknowns. Our MCMC strategy for the current context is outlined in the Appendix, with further technical details provided in the supplemental documentation.
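As a schematic of one standard ingredient of such samplers, useful for parameters such as ρ whose conditionals lack closed form, a generic random-walk Metropolis update can be sketched as follows. This is purely illustrative and is not the paper's sampler, whose details are in the Appendix.

```python
import numpy as np

def metropolis(log_post, theta0, step, n_iter, rng=None):
    """Random-walk Metropolis targeting the density exp(log_post)."""
    rng = np.random.default_rng(rng)
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    samples = np.empty((n_iter, theta.size))
    for k in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.size)  # propose
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:               # accept/reject
            theta, lp = prop, lp_prop
        samples[k] = theta
    return samples
```

Within a Gibbs-style scheme, one such update per parameter block per sweep, conditioned on the current values of the other unknowns, yields the overall MCMC described above.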

Synthetic data studies
We first demonstrate the approach with an extensive set of synthetic data analyses that parallel the problem of estimating CO sources from MOPITT data considered by Arellano et al. (2004). We utilize synthetic data in order to provide a context where the true sources x are known, and to compare the CAR spatial model with a non-spatial (NS) statistical model so as to evaluate the effect of neglecting "true" spatial error correlations on the inverse source estimates. The NS model is the special case of the CAR model obtained when ρ = 0; in that special case, we denote the scale parameter by τ_n rather than τ_c. We pay special attention to evaluating the performance of the inverse approach using various statistical diagnostic metrics.

Generation of synthetic data with spatially correlated errors
The inverse problem (Arellano et al., 2004) consists of using Level 2 V3 MOPITT daytime column CO retrievals from April-December 2000 to estimate annual CO emissions for n = 15 source categories consisting of: (a) fossil fuel/biofuel (FFBF) combustion in 7 geographical regions, (b) biomass burning (BIOM) in 7 geographical regions, and (c) oxidation of biogenic isoprene and monoterpenes on a global scale (BIOG). The geographical extent of each of the FFBF and BIOM regions is shown in Fig. 2. CO production from methane oxidation is not estimated as part of the inversion, but is taken into account by pre-subtracting its contribution to the MOPITT retrievals. The Jacobian matrix K is constructed by applying MOPITT averaging kernels to gridded CO fields calculated using an offline, tagged-tracer version of the GEOS-Chem CTM at a resolution of 4 × 5 degrees. The gridded, quality-controlled MOPITT dataset used in the original analysis spans 50° N-50° S, yielding a lattice of 26 × 72 grid cells that vectorizes to 1872 observations on a monthly basis. Combining data from April-December 2000 yields a complete retrievals vector y of length m = 16 848 and K of dimension 16 848 × 15. The original analysis of Arellano et al. (2004) used only those MOPITT retrievals satisfying certain quality-control metrics. We modified these to require at least 5 days of observations each month for a site to be considered valid; sites not meeting this requirement are treated as having missing data, yielding 139 missing values. Further details on MOPITT data processing and construction of K are given in Arellano et al. (2004); they also give the prior source vector x_a and the diagonal matrix S_a with ith diagonal element S_a,i = c_a^2 x_a,i^2 for a constant coefficient of variation c_a = 0.5. We use these values as the basis for positively constrained priors x_i ∼ N(m_a,i, v_a,i) I(x_i > t_i), taking t_i = x_a,i / 4, such that the prior mean of each x_i is the specified bottom-up value x_a,i and the prior variance is S_a,i.
We generate a synthetic MOPITT CO retrieval data set with spatially correlated errors by first generating a "true" source vector, x̃, by simply sampling from the truncated normal priors; we use a tilde to denote synthetic quantities throughout. We next construct a "true" error covariance matrix, S̃_ε, that includes off-diagonal terms representing spatial error correlations for cells within the same month; we assume no between-month dependencies. Individual elements of this matrix are specified using an exponentially decaying correlation kernel; thus, for grid cells i, j, the covariance element within each month is S̃_ε,ij = σ² exp(−d_ij / L), where d_ij is the great-circle distance between the cells and L is the range parameter. We further take the constant observation error σ as 20 % of the global, annual-mean MOPITT-equivalent CO columns from the CTM with prior CO source estimates. Corresponding error terms are then simulated from ε̃ ∼ N(0, S̃_ε), and the synthetic CO observation vector is calculated directly from the model, Eq. (1), i.e., ỹ = K x̃ + ε̃. Finally, the grid cells for which the real MOPITT data are missing are masked as missing, defining the index sets M and H. We explore 6 versions of this synthetic data using L = 100, 200, 500, 1000, 2000 and 5000 km, respectively; this generates 6 synthetic data sets reflecting varying degrees of spatial error correlation.
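A sketch of the error-generation step, assuming cell centroid coordinates in degrees (function name ours; great-circle distances via the spherical law of cosines):

```python
import numpy as np

def correlated_errors(lat, lon, sigma, L, rng=None):
    """Draw errors eps ~ N(0, S) with S_ij = sigma^2 exp(-d_ij / L),
    d_ij the great-circle distance (km) between cell centroids."""
    rng = np.random.default_rng(rng)
    R = 6371.0                                   # Earth radius, km
    phi, lam = np.radians(lat), np.radians(lon)
    # pairwise great-circle distance matrix
    c = (np.sin(phi)[:, None] * np.sin(phi)[None, :]
         + np.cos(phi)[:, None] * np.cos(phi)[None, :]
         * np.cos(lam[:, None] - lam[None, :]))
    d = R * np.arccos(np.clip(c, -1.0, 1.0))
    S = sigma**2 * np.exp(-d / L)
    # Cholesky factor (small jitter for numerical stability), then colored draw
    chol = np.linalg.cholesky(S + 1e-10 * np.eye(len(lat)))
    return chol @ rng.standard_normal(len(lat))
```

As L grows, neighboring cells' errors become nearly identical; as L → 0, the draws approach independent N(0, σ²) noise.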
We repeat this exercise to generate 1000 replicate simulations in order to quantify Monte Carlo variability and the resulting accuracy of reconstruction of the true source fluxes. Figures 3 and 4 show spatial plots of elements of the simulated ε̃ and ỹ for the month of December 2000, for one particular realization of x̃ randomly drawn from the set of 1000 replicates.

Results: model adequacy and model comparisons
For each value of L, we compare the posterior mean estimates of CO fluxes with the known "true" fluxes for each of the 1000 synthetic data sets. Figures 5 and 6 show the results via scatter plots of estimated CO flux versus the "truth" for one FFBF and one BIOM source category, respectively. Similar comparisons for the remaining source categories are shown in the supplementary material. For one randomly selected synthetic data set, Figs. 7 and 8 display the estimated 95 % posterior credible intervals for each of the 15 source categories in both the CAR and NS model analyses; also shown are the corresponding 95 % prior credible intervals and the values of the "true" CO fluxes. It is readily evident from these figures that the performance of the NS model is comparable to that of the CAR model when the degree of spatial error correlation is very low (i.e., L is rather small). The CAR model is clearly superior at higher values of L over the range of "true" fluxes considered in this analysis. Results (see supplementary material) are similar across the range of synthetic data sets considered here.
To summarize performance across the 1000 synthetic datasets, and to provide further insight into the relative performance of the CAR and NS approaches in reconstructing source fluxes, we calculate two metrics for each CO source category using the MCMC-sampled posterior distributions:

Success Rate = (# times the "true" CO flux falls within the posterior 95 % interval) / 1000,    (7)

Learning Ratio = average of (prior 95 % interval length / posterior 95 % interval length).
Figure 9 shows a combined plot of these two metrics for each CO category for different values of L. Notice that the Learning Ratio is > 1 for each source category under both models, indicating that both models learn significant information from the data; the ratios differ across sources, reflecting the fact that the measurements provide varying degrees of information on the magnitudes of the different sources. The Success Rates demonstrate the precision of the analysis in the standard statistical coverage sense; Fig. 9 supports the point already noted that at low spatial dependence the NS and CAR models have comparable accuracy, whereas the CAR model very substantially outperforms the NS model when spatial structure becomes practically meaningful.
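The two summary metrics above can be computed directly from stored MCMC draws; a sketch (function name ours) for one source category across replicate datasets:

```python
import numpy as np

def coverage_metrics(post_samples, prior_lo, prior_hi, truth, level=0.95):
    """Success Rate and Learning Ratio for one source category.
    post_samples: (n_datasets, n_mcmc) posterior draws per replicate dataset;
    truth: (n_datasets,) the "true" flux used to generate each dataset."""
    alpha = (1.0 - level) / 2.0
    lo = np.quantile(post_samples, alpha, axis=1)       # interval endpoints
    hi = np.quantile(post_samples, 1.0 - alpha, axis=1)
    success_rate = np.mean((truth >= lo) & (truth <= hi))
    learning_ratio = np.mean((prior_hi - prior_lo) / (hi - lo))
    return success_rate, learning_ratio
```

A well-calibrated analysis should give a Success Rate near the nominal 0.95; a Learning Ratio well above 1 indicates the data substantially sharpen the prior.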
In this synthetic context, and with the spatial model based on a single spatial dependence parameter, posterior distributions and the resulting credible intervals tend to be quite precise due to the large amount of data. This is perfectly appropriate and a consequence of model assumptions and the data-model match. In this specific model the results serve as a first, proof-of-concept example of the approach for accounting for spatial error correlation structures; the high level of posterior precision about fluxes may be reduced in more elaborate spatial models for higher resolution data, and further work will aim to explore this in new case studies. Further substantiation in favor of spatial modeling with the CAR approach comes from formal statistical summaries for model comparison. A key, standard measure of the relative fit of two models is the Bayes factor (an integrated variant of a likelihood ratio); to compare the CAR with the NS approach, this is

BF(CAR : NS) = p_CAR(y_H) / p_NS(y_H),

where p_*(y_H) is the marginal probability density function of the observed data y_H under the assumptions of model * = CAR or * = NS, respectively; p_*(y_H) is otherwise referred to as the marginal likelihood or evidence for model *, and the ratio form of the Bayes factor measures relative evidence on a likelihood scale; see Bernardo and Smith (1994, chapter 6) and West and Harrison (1997, chapters 11 and 12), for example. A Bayes factor BF(CAR : NS) > 1 indicates that the data favor the CAR over the NS model, with values of 100 or more indicating very substantial evidence indeed. From the MCMC analysis of each synthetic data set we can estimate the values of p_CAR(y_H) and p_NS(y_H), and hence estimate the Bayes factor for that particular data set. Table 1 reports the averages of these values over the 1000 simulated data sets, together with the associated 95 % intervals. We see significantly positive values of log(BF(CAR : NS)) for larger L, indicating extremely strong evidence in favor of the spatial model over the non-spatial model.
We illustrate the effectiveness of the CAR model in capturing these synthetic (non-CAR) spatial dependencies by comparing the synthetic data with samples from the posterior predictive distribution. Exploring posterior predictions is a traditional statistical method for both informal and formal evaluation of model fit; here we simply present graphical summaries of predictions from the model. For any of the posterior MCMC draws of {x, θ, y_M} we can, using these values, directly simulate additional synthetic data y from the model; such simulations generate random draws from the posterior predictive distribution, i.e., synthetic representations of what data will look like if the model is true. Often, simply exploring graphical and numerical summaries of posterior predictive simulated data sets can highlight ways in which the model is inadequate when compared to the real data (West and Harrison, 1997; Gelman et al., 2004). Figures 10 and 11 show representative posterior predictive samples from the NS and CAR models for L = 100 km and L = 5000 km, respectively. It is clear that the spatial patterns of posterior predictive samples from the CAR model are visually similar to the synthetic data, while those from the NS model are clearly noisier at higher spatial dependence.
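The mechanics of posterior predictive simulation can be sketched as follows for the NS model, with hypothetical dimensions and placeholder "posterior draws" standing in for actual MCMC output: for each retained draw of (x, τ²), simulate a replicate data field y_rep = Kx + ε with ε ~ N(0, τ²I).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical problem sizes and a placeholder Jacobian K.
n_obs, n_src, n_draws = 50, 4, 100
K = rng.uniform(0.0, 1.0, size=(n_obs, n_src))

# Stand-ins for retained MCMC draws of the source vector and the
# observation-error variance (illustrative values, not real output).
x_draws = rng.normal(10.0, 0.5, size=(n_draws, n_src))
tau2_draws = rng.uniform(0.1, 0.3, size=n_draws)

# One replicate data set per posterior draw: a sample from the
# posterior predictive distribution under the NS error model.
y_rep = np.array([K @ x + rng.normal(0.0, np.sqrt(t2), size=n_obs)
                  for x, t2 in zip(x_draws, tau2_draws)])
print(y_rep.shape)  # (100, 50)
```

Spatial maps of individual rows of y_rep can then be displayed side by side with the data, as in Figs. 10 and 11; under the CAR model the error draw would instead come from the fitted spatial covariance.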

Analysis of real MOPITT retrievals
We now consider the analysis applied to MOPITT retrievals of CO columns, paralleling the study of Arellano et al. (2004) but now including spatial CAR structure, modified priors, and formal treatment of missing data. Figure 12 displays 95 % posterior credible intervals from the CAR and NS models for each source category, and Table 2 presents detailed posterior summaries. We observe significant differences between the two analyses for several of the CO source categories considered here. In particular, the CAR analysis suggests that, for several of the FFBF and BIOM source categories, the top-down estimates are not as inconsistent with the bottom-up estimates as is suggested by the NS analysis. Again, as with the synthetic data study above, we note relatively precise posterior intervals based on the MCMC approach; these are accurate summaries of uncertainties conditional on the assumed form of the model. Model comparison using Bayes factors yields log(BF(CAR:NS)) > 10073, simply overwhelming statistical evidence that the CAR model substantially improves the model:data match due to the presence of significant spatial residual structure. To further indicate the ability of the spatial model to reflect realistic spatial structure, Fig. 13 displays two randomly selected posterior predictive samples from the NS and CAR model analyses, together with the actual data for December 2000. The spatial dependence patterns in the CAR samples are visually similar to those in the real data, while the NS samples are again noisier. These preliminary comparisons suggest that accounting for spatial error structures in the real data is important in the context of constraining CO sources using spatially-dense satellite measurements. Further investigations at higher spatial and temporal resolution with the latest version of the MOPITT dataset, as well as with retrievals from other satellite instruments, are required to more fully characterize and account for these spatial error patterns and to refine top-down CO source estimates.

Concluding remarks
The fast-expanding ability to access increasingly high-resolution atmospheric data using satellite imagery raises exciting opportunities for substantial advances in data synthesis in inverse modeling. Capitalizing on this opportunity will involve increased attention to core challenges that are inherently statistical in nature. The work presented here reflects this view and exemplifies the potential to address rather basic yet challenging problems of very large-scale spatial modeling, coupled with refined prior specifications and treatment of missing data, in inverse studies of atmospheric trace gas source/sink flux estimation. To date, although the broader field of atmospheric chemistry inverse modeling has become heavily invested in statistical methods, there has been limited development of what are standard statistical approaches utilizing Bayesian simulation methods, including MCMC. The work here demonstrates the utility of the Bayesian perspective, and of the enabling computational methodology it provides, for extending inverse modeling frameworks to incorporate relevant spatial stochastic structure. Our analysis framework explicitly paralleled the earlier work of Arellano et al. (2004) in order to demonstrate the extensions with spatial CAR modeling and other statistical innovations. Based on this proof-of-concept in studies of both synthetic and real MOPITT retrieval data, some next steps include expanding the model framework to explicitly represent time dependencies in the data and models. The extension of spatial modeling into a temporal framework is standard, and this, together with the extension to model potential time-variations in source fluxes, will rely on additional Bayesian computational methods that are well-developed in other areas of multivariate time series analysis (West and Harrison, 1997; Prado and West, 2010). Additional important directions include the application of these techniques to carbon dioxide and methane as high-quality satellite measurements of these climatically important gases become available, especially in the context of source/sink estimation at high spatial resolution.
With regard to computational demands for higher resolution problems, we note that the running time for the MCMC algorithm increases roughly linearly with n, the size of the source vector. This is based on evaluations of computational cost using our current Matlab code, and the approximately linear scaling holds for FLOP counts, stopwatch time (tic/toc) and elapsed CPU time (cputime). This is the best one can expect, and indicates that the analysis and computational approach is indeed scalable to much larger and higher resolution problems. Moreover, the matrix multiplications that constitute much of the computational burden with increasing m can be trivially parallelized on multi-core machines or clusters, or via exploitation of GPU parallelization, suggesting the opportunity for very substantial gains with larger m.
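The linear-scaling claim can be made concrete with a back-of-envelope cost model (our illustration, not a measured profile of the actual code): the dominant per-iteration cost is dense products with the m × n Jacobian-type matrix, roughly 2mn floating-point operations per matrix-vector multiply, so doubling the source dimension doubles the cost.

```python
# Rough FLOP count for a dense matrix-vector product with an m x n
# matrix: ~n multiplies and ~n adds per output element, m elements.
def matvec_flops(m, n):
    return 2 * m * n

# Doubling the source dimension n doubles the per-multiply cost,
# consistent with the roughly linear MCMC running times observed.
base = matvec_flops(10_000, 15)
print(matvec_flops(10_000, 30) / base)  # 2.0
```

This is of course a lower-bound style argument; constant factors, memory traffic and the Cholesky steps matter in practice, which is why the elapsed-time measurements are the more direct evidence.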
Finally, we note again that the analysis presented is intended as a first, proof-of-principle analysis and example of the opportunity to integrate spatial structure into inversions. The single-dependence-parameter CAR model is likely overly simplistic as a representation of spatial errors that combine model misfit and natural local dependencies in satellite retrieval data. To represent additional complexity in spatial structure, especially with regard to the potential opportunities to fit higher resolution models that can capture more refined structure, extensions or alternatives to CAR models will be needed. Indeed, one of our current research projects is to extend the general strategy of the paper to models that permit changes in the local dependency parameter across the spatial region; importantly, such extensions will also need to come through precision matrix models, rather than covariance models, both for the feasibility of computations and, we believe, for practical realism. Our work in this paper demonstrates the ability of simple CAR models to substantially improve over standard non-spatial models in terms of statistical fit, source flux recovery in synthetic examples, and predictive match with the real data; this gives us a basis to move ahead with the development of more general, flexible and likely more realistic spatial structures in future work.

Appendix A Posterior computation
The Markov chain Monte Carlo posterior simulator successively re-simulates values of all of the unknowns {x, θ, y_M} to draw a large Monte Carlo sample from the full joint posterior p(x, θ, y_M | y_H). Initializing at (essentially arbitrary) starting values of {θ, y_M}, the MCMC proceeds through many iterations to revise the full set of unknowns, at each iterate stepping through the stages below to stochastically update x conditional on the latest values of {θ, y_M}, then θ conditional on the latest values of {x, y_M}, and then y_M conditional on the latest values of {x, θ}. The specific distributions used for each of these three stages are summarized here, with more technical details in the supplemental documentation. We give summary details for MCMC in both the non-spatial and CAR model contexts. As described in Sect. 2.3, analysis is based on the use of truncated normal priors for sources, that for the i-th source being x_i ~ N(m_a,i, v_a,i) I(x_i > t_i), where m_a,i and v_a,i are numerically specified so that the prior has mean x_a,i and variance S_a,i, with t_i = x_a,i/4.
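The numerical specification of (m_a,i, v_a,i) to match the target prior mean x_a,i and variance S_a,i under the truncation x_i > t_i amounts to a two-equation root-finding problem; the following is a sketch with hypothetical target values (the root-finder and parameterization are our choices, not necessarily those used in the paper).

```python
import numpy as np
from scipy.stats import truncnorm
from scipy.optimize import fsolve

# Find (m_a, v_a) so that the truncated normal N(m_a, v_a) I(x > t)
# has given mean x_a and variance S_a, with t = x_a/4.
# Target values here are hypothetical.
x_a, S_a = 10.0, 25.0
t = x_a / 4.0

def moment_gap(params):
    m, log_v = params                 # work with log-variance for positivity
    s = np.sqrt(np.exp(log_v))
    a = (t - m) / s                   # standardized lower truncation point
    mean, var = truncnorm.stats(a, np.inf, loc=m, scale=s, moments="mv")
    return [mean - x_a, var - S_a]

m_a, log_v_a = fsolve(moment_gap, x0=[x_a, np.log(S_a)])
v_a = np.exp(log_v_a)
# The fitted truncated normal reproduces the target moments.
a_fit = (t - m_a) / np.sqrt(v_a)
print(truncnorm.stats(a_fit, np.inf, loc=m_a, scale=np.sqrt(v_a), moments="mv"))
```

Because the truncation point sits below the target mean, m_a ends up slightly below x_a and v_a slightly above S_a, compensating for the probability mass removed by the constraint.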

A1 Single epoch data
We first give summaries for the analysis of a single epoch of data, i.e., a single monthly retrieval y in the case of MOPITT data. The extension to the context analyzed in our examples, paralleling Arellano et al. (2004) with multi-month retrievals and time-invariant fluxes, is summarized in the following subsection.
- Resample values of the missing data vector from the conditional posterior predictive distribution (y_M | x, τ²_n, K) ~ N(K_(M,*) x, τ²_n I).
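This imputation step can be sketched directly, with hypothetical sizes and stand-in current MCMC values: K_(M,*) holds the rows of K corresponding to the missing cells, and the draw is an independent normal around the model prediction at those cells.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical sizes and stand-ins for current MCMC values.
n_obs, n_src = 40, 5
K = rng.uniform(0.0, 1.0, size=(n_obs, n_src))
x = rng.normal(10.0, 1.0, size=n_src)     # current draw of the source vector
tau2_n = 0.25                             # current error-variance draw
missing = np.array([3, 7, 19, 22])        # index set M of missing cells

# (y_M | x, tau2_n, K) ~ N(K_(M,*) x, tau2_n I): draw the missing cells.
K_M = K[missing, :]                       # rows of K at the missing cells
y_M = K_M @ x + rng.normal(0.0, np.sqrt(tau2_n), size=missing.size)
print(y_M.shape)  # (4,)
```

Under the NS model the missing cells are conditionally independent given x, which is what makes this step so cheap; the CAR analogue below is more involved.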

A1.2 Posterior computation for the CAR model
The prior is τ²_c ~ IG(α_c, λ_c) with known prior mean E(τ²_c) = 8σ² (this sets an unbiased prior when there is no spatial dependence, i.e. ρ = 0) and coefficient of variation set at 0.5; this implies α_c = 6 and λ_c = 40σ². Further, we adopt a uniform prior for ρ on 0 < ρ < 1. The MCMC algorithm alternately samples from the following conditional distributions:
- For i = 1,...,n, resample the i-th source element x_i from its conditional posterior distribution.
- Resample the missing data vector y_M from its conditional posterior distribution given the latest values of the remaining unknowns. Note here how the spatial covariance structure in U plays a key role in determining the relative weightings of cells having observed data via the current values of the multiple regression coefficients (in the regression of y_M on y_H). Actual sampling from this distribution does not in fact require any matrix inversions, only a Cholesky decomposition of a square matrix whose dimension is the number of missing values |M| (137 × 137 in our MOPITT study context) (Rue, 2001; Rue and Held, 2005). As a result, these successive imputations of missing data, which represent the posterior estimates and uncertainties about y_M, do not add measurably to the overall computational burden of the MCMC.
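The inversion-free sampling alluded to here is an instance of a generic technique for Gaussians specified through a precision matrix (Rue, 2001); the following small sketch (our generic illustration with a toy 2 × 2 precision, not the paper's exact conditional) shows how a Cholesky factorization plus one triangular solve yields exact draws from N(μ, Q⁻¹).

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

rng = np.random.default_rng(3)

# Draw v ~ N(mu, Q^{-1}) without forming Q^{-1}: factor Q = L L'
# (Cholesky), solve L' z = w with w ~ N(0, I); then z has covariance
# (L')^{-1} L^{-1} = Q^{-1}, so v = mu + z is the desired draw.
Q = np.array([[2.0, -1.0], [-1.0, 2.0]])  # toy precision matrix
mu = np.array([1.0, 2.0])

L = cholesky(Q, lower=True)               # Q = L L'
w = rng.standard_normal((2, 200_000))     # many independent draws at once
z = solve_triangular(L.T, w, lower=False) # solve L' z = w
v = mu[:, None] + z

print(np.cov(v))  # approaches inv(Q) = [[2/3, 1/3], [1/3, 2/3]]
```

In the MOPITT context the same idea applies with Q the |M| × |M| conditional precision of the missing cells, so each imputation costs only one modest Cholesky factorization.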

A2 Multi-epoch data with time-invariant fluxes
To parallel Arellano et al. (2004), consider now the case of several epochs (e.g., months) of retrieval data. In epoch t = 1,...,T, retrievals follow the model of Eq. (1), where we now index by t, viz. y_t = K_t x + ε_t for t = 1,...,T. We can simply stack the epoch-specific equations into a single linear system of the same form and apply the single-epoch analysis.
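Assuming the epochs are handled by stacking, as is standard for linear models with a shared parameter vector, the construction can be sketched as follows with hypothetical dimensions: the per-epoch observation vectors and forward operators are concatenated vertically into one system of the single-epoch form.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical sizes: T epochs, n_obs retrievals per epoch, and a
# time-invariant source vector x shared across all epochs.
T, n_obs, n_src = 3, 20, 5
x = rng.normal(10.0, 1.0, size=n_src)

# Per-epoch forward operators K_t and data y_t = K_t x + eps_t.
K_t = [rng.uniform(0.0, 1.0, size=(n_obs, n_src)) for _ in range(T)]
y_t = [K @ x + rng.normal(0.0, 0.5, size=n_obs) for K in K_t]

# Stacked system y = K x + eps of the single-epoch form.
K_stacked = np.vstack(K_t)       # (T * n_obs, n_src)
y_stacked = np.concatenate(y_t)  # (T * n_obs,)
print(K_stacked.shape, y_stacked.shape)
```

Under this stacking, the error covariance becomes block-structured across epochs, with the spatial (CAR or NS) structure repeated within each block.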

Fig. 1 .
Fig. 1. Left: conditional regression coefficients from Gaussian process (GP) models with exponential decay correlation kernel exp(−d/L) for several values of the range length L. Here d is the centroid-to-centroid distance between cells, and the regression coefficients plotted are those for the regression of the central cell (3,3) on the rest. The mesh size is not constant over the entire lattice; assuming the mean radius of the Earth to be 6371 km, the average size of a 4° latitude × 5° longitude cell is 445 km × 556 km. Right: conditional regression coefficients from the CAR model using ρ values fitted to match the regressions in the corresponding GP model.

Fig. 3 .
Fig. 3. Spatial images of randomly selected realizations of the synthetic MOPITT CO column observation errors ε̃ (in units of 10^18 molecules CO cm^-2) for December 2000 for different values of L. The black cells represent missing observations.

Fig. 4 .
Fig. 4. Spatial images of synthetic MOPITT CO column measurements ỹ (in units of 10^18 molecules CO cm^-2) corresponding to the errors in Fig. 3. The black cells represent missing observations.

Fig. 5 .
Fig. 5. Scatter plots of "true" x_i versus estimated x_i (in units of Tg CO yr^-1) for the FFBF North America (FFBF-NAM) source category, computed from 1000 synthetic data sets. RMSE is the root mean square error of the estimated x_i relative to the "true" x_i.

Fig. 7 .
Fig. 7. Plots of 95 % posterior credible intervals for the NS and CAR model analyses for L = 100, 200 and 500 km, showing summary inferences of source magnitude for all 15 CO source categories for one synthetic data set (see Fig. 2 for the definition of source category numbers). Posterior means for both models are marked with dots inside the corresponding intervals. Alongside, we plot 95 % prior credible intervals for the corresponding source and indicate the "true" CO source with a square.

Fig. 9 .
Fig. 9. Success Rate (top panels) and Learning Ratio (bottom panels) for all CO source categories for different values of L. See Fig. 2 for definition of source category numbers.

Fig. 10 .
Fig. 10. Two samples of column CO fields (in units of 10^18 molecules CO cm^-2) from the posterior predictive distributions of the NS (lower two left panels) and CAR (lower two right panels) models for December 2000 for L = 100 km; the top panels show the corresponding synthetic data.

Fig. 12 .
Fig. 12. Plots of 95 % prior and posterior credible intervals for the real MOPITT data inversion. Posterior means for both models are marked with dots inside the corresponding intervals. The squares on the prior credible intervals represent the corresponding prior means. See Fig. 2 for the definition of source category numbers.

Fig. 13 .
Fig. 13. Two samples of column CO fields (in units of 10^18 molecules CO cm^-2) from the posterior predictive distributions of the NS (lower two plots, left panels) and CAR (lower two plots, right panels) models for December 2000; the top panels show the corresponding MOPITT retrieval data.

Table 1 .
Posterior means and 95 % credible intervals for log(BF(CAR:NS)) from the analyses of 1000 synthetic data sets.

Table 2 .
Summary of posterior inferences for the CO fluxes (in units of Tg CO yr^-1) and model parameters from the analysis of actual MOPITT retrieval data. See Fig. 2 for the definition of source category numbers.