Comparing the CarbonTracker and TM5-4DVar data assimilation systems for CO2 surface flux inversions

Data assimilation systems allow for estimating surface fluxes of greenhouse gases from atmospheric concentration measurements. Good knowledge about fluxes is essential to understand how climate change affects ecosystems and to characterize feedback mechanisms. Based on the assimilation of more than 1 year of atmospheric in situ concentration measurements, we compare the performance of two established data assimilation models, CarbonTracker and TM5-4DVar (Transport Model 5 – Four-Dimensional Variational model), for CO2 flux estimation. CarbonTracker uses an ensemble Kalman filter method to optimize fluxes on ecoregions. TM5-4DVar employs a 4-D variational method and optimizes fluxes on a 6° × 4° longitude–latitude grid. Harmonizing the input data allows for analyzing the strengths and weaknesses of the two approaches by direct comparison of the modeled concentrations and the estimated fluxes. We further assess the sensitivity of the two approaches to the density of observations and operational parameters such as the length of the assimilation time window. Our results show that both models provide optimized CO2 concentration fields of similar quality. In Antarctica CarbonTracker underestimates the wintertime CO2 concentrations, since its 5-week assimilation window does not allow for adjusting the distant surface fluxes in response to the detected concentration mismatch. Flux estimates by CarbonTracker and TM5-4DVar are consistent and robust for regions with good observation coverage, whereas regions with low observation coverage reveal significant differences. In South America, the fluxes estimated by TM5-4DVar suffer from limited representativeness of the few observations. For the North American continent, mimicking the historical increase of the measurement network density shows improving agreement between CarbonTracker and TM5-4DVar flux estimates for increasing observation density.


Introduction
Sources and sinks of atmospheric carbon dioxide (CO2) largely control future climate change (Schimel, 2007). Anthropogenic emissions release roughly 10 Gt C into the atmosphere per year (Peters et al., 2013), part of which gets taken up by the biosphere and oceans. The fraction of emitted CO2 which remains in the atmosphere is the largest driver of climate change (Stocker et al., 2013, chapter 8.5.1), but the distribution and strength of carbon sources and sinks on the surface is hard to measure directly. Methods for observing the fluxes directly require either eddy covariance measurements at multiple height levels (Foken et al., 2012) or measurements of concentration changes in a sealed volume of air. However, such bottom-up approaches are only representative for a given collection of vegetation types in a limited geographic area.
Published by Copernicus Publications on behalf of the European Geosciences Union.
Inverse modeling therefore uses CO2 concentration gradients observed in the Earth's atmosphere to quantify the spatiotemporal distribution of the net CO2 surface fluxes (e.g. Enting, 2000; Peters et al., 2007; Chevallier et al., 2010; Feng et al., 2011; Peylin et al., 2013). To this end, various data assimilation (DA) techniques have been developed. These DA approaches differ in four major ways: first, they ingest different observational constraints, for example in situ concentration measurements at different sites. Second, they represent sources and sinks of carbon differently, for example by binning them by vegetation type or on a longitude-latitude grid. Third, they relate sources and sinks to observed atmospheric abundances using different air-mass transport models (Gurney et al., 2004, estimate their impact on fluxes). And fourth, they use different inverse methods that find the best estimate of the source-sink distribution using the transport model, the observational constraints, the representation of sources and sinks and a prior estimate of the sources and sinks. Differences in these characteristics contribute to the differences in flux estimates from different studies. To analyze the impact of the representation of sources and sinks and of the inverse method, it is therefore necessary to harmonize the observational constraints, the transport model, and the prior estimates of concentrations, fluxes and flux covariances between the approaches being compared.
There are two main classes of assimilation techniques for complex inversions, variational methods and ensemble methods (Lahoz et al., 2007; Lahoz and Schneider, 2014). Both approaches are approximate variants of the general Bayesian optimal estimation scheme (e.g. Rodgers, 2000) which aims at balancing prior or background information with actual measurement information to derive robust parameter estimates. Approximations are necessary to render the inverse problem computationally feasible since real-world CO2 surface flux inversions typically involve thousands of concentration measurements and millions of unknown flux parameters. Both schemes can either treat the entire considered assimilation period at once or divide it into shorter periods to be treated sequentially. Ensemble methods approximate the exact solution from an ensemble of model runs, while variational methods approach the optimal solution step by step (e.g. Juhász and Bölöni, 2007; Gilbert and Lemaréchal, 1989).
The performance of ensemble methods and variational methods has been evaluated previously for numerical weather prediction (e.g. Kalnay, 2005; Fairbairn et al., 2013) and direct optimization of atmospheric gas abundances (Skachko et al., 2014). Chatterjee and Michalak (2013) were the first to evaluate the performance of the two methods for the purpose of CO2 surface flux estimation. They use a synthetic setup with simulated observations and a one-dimensional transport model, which has the advantage that the true fluxes are known and that a direct Bayesian inversion is computationally feasible. In particular they find that under constraints on model runtime and resource use, the estimated surface fluxes are more realistic with their variational implementation than with their ensemble method, and that for both models small-scale fluxes (flux aggregation spanning up to 5 % of the model size) are very sensitive to the data coverage and distribution.
Here, we focus on evaluating the performance of an ensemble method and a variational method used for real atmospheric CO2 flux inversion problems. We focus on a case study for the period from 2009 to 2010, and use observational constraints collected by an in situ measurement network and compiled by the NOAA Environmental Sciences Division and Oak Ridge National Laboratory (2013, exact version: "obspack" PROTOTYPE v1.0.2 2013-01-28). Our ensemble method is the ensemble square root filter (EnSRF, Whitaker and Hamill, 2002) as employed by the CarbonTracker modeling system (Peters et al., 2007), a variant of the ensemble Kalman filter. The variational method is the TM5-4DVar (Transport Model 5 – Four-Dimensional Variational model) package described by Meirink et al. (2008) and Basu et al. (2013).
Besides the mathematical treatment of the inversion, CarbonTracker and TM5-4DVar differ in the design of the state vector. CarbonTracker optimizes fluxes binned by regions with similar vegetation (such as cropland or boreal forest) and separated by geographic regions following the Transcom basemap (Gurney et al., 2000). TM5-4DVar adjusts the fluxes on a grid (6° × 4°, longitude × latitude) with correlations which decay exponentially in time and space.
Our goal is to evaluate the impact of the inverse method (including the flux representation) on the estimated surface fluxes. Therefore, we must make sure that the other components of the DA systems (the observations to be assimilated, the transport model and the prior assumptions) are the same. After a short summary of the general CarbonTracker and TM5-4DVar methodology in Sect. 2, Sect. 3 describes how we harmonize these other components of the two DA systems, mostly focusing on the observation input and the prior assumptions, since CarbonTracker and TM5-4DVar both operate on the same transport model, the Transport Model 5 (TM5, Krol et al., 2005). In Sect. 4 we compare the performance of the two inverse methods by evaluating the mismatch between modeled and measured concentration fields. The comparison to assimilated observations verifies that the schemes work as expected. Building on these results, Sect. 5 analyzes the estimated surface fluxes and tests their sensitivity to observation density.

Inverse methods and setup
The DA systems aim at inferring a state vector x that contains spatially and temporally binned surface fluxes or a related quantity such as scaling factors for an initial guess flux field. To this end, the systems exploit measurements of the atmospheric concentration collected in an observation vector y. Fluxes and measured concentrations are linked through the transport and observation operator H, which is linear for the case of our CO2 flux inversions but in general could be nonlinear, such as for CH4 flux inversions. Typically, the inverse problem of estimating x from a set of observations y is ill posed. Due to sparse observational coverage, measurement errors or measurement configuration, the observations contain insufficient information to determine all components of x independently. A background flux estimate $x_b$ from biosphere and ocean models is used to provide a constraint that fills the null-space where measurement information is insufficient. Accordingly, the state vector of fluxes x is determined by minimizing a cost function J that typically consists of two terms, the mismatch between measured and modeled observations and the mismatch between the fluxes to be estimated and the background estimate:
$$J(x) = \frac{1}{2}\,(y - Hx)^T R^{-1} (y - Hx) + \frac{1}{2}\,(x - x_b)^T B^{-1} (x - x_b), \tag{1}$$
with R the observation covariance and B the background flux covariance. R and B define the relative weights of the measurement and background mismatch.
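To make the two terms of the cost function J concrete, the following minimal Python sketch evaluates it for a toy linear problem. The operator, the numbers and the helper name `cost` are illustrative assumptions and are not taken from either DA system. R and B are assumed diagonal here, so the inverse covariances reduce to element-wise divisions, and the conventional factor 1/2 is assumed.

```python
def cost(x, x_b, y, H, r_var, b_var):
    """Two-term cost: observation mismatch plus background mismatch.

    H is a list of rows (one per observation); r_var and b_var hold the
    diagonal entries of R and B (assumed diagonal for this toy example).
    """
    # Modeled observations H x
    modeled = [sum(h_ij * x_j for h_ij, x_j in zip(row, x)) for row in H]
    j_obs = 0.5 * sum((y_k - m_k) ** 2 / r
                      for y_k, m_k, r in zip(y, modeled, r_var))
    j_bg = 0.5 * sum((x_j - xb_j) ** 2 / b
                     for x_j, xb_j, b in zip(x, x_b, b_var))
    return j_obs + j_bg


# Hypothetical setup: two flux scaling factors, three observations.
H = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
x_b = [1.0, 1.0]
y = [1.2, 1.0, 1.6]
r_var = [0.25, 0.25, 0.25]
b_var = [0.64, 0.64]

j_prior = cost(x_b, x_b, y, H, r_var, b_var)         # background term is zero
j_trial = cost([1.1, 0.9], x_b, y, H, r_var, b_var)  # trades both terms
print(j_prior, j_trial)
```

In this example the trial state lowers the observation mismatch by more than it raises the background mismatch, so its total cost is lower than that of the background state.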
In general, minimization of Eq. (1) can be solved by means of matrix algebra (Rodgers, 2000), yielding optimized fluxes and their error covariances, with $\hat{x}$ the a posteriori state vector and $\hat{B}$ the respective covariance matrix. The equivalence of Eqs.
While theoretically the minimization of Eq. (1) reduces to a matrix inversion for linear systems like CO2 flux inversion (e.g. Rodgers, 2000), the large number of parameters to be estimated and the number of measurements to be ingested require approximate methods such as EnSRF and 4DVar which are numerically efficient.
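For a single flux and a single observation, the matrix algebra collapses to scalar arithmetic, which makes the balance between measurement and background easy to inspect. The sketch below is the textbook optimal-estimation update under that assumption. The function name and all numbers are hypothetical, and it is not code from either DA system.

```python
def scalar_update(x_b, b, y, h, r):
    """Optimal estimate for one unknown x and one observation y = h*x + noise.

    b and r are the background and observation error variances.
    """
    gain = b * h / (h * b * h + r)        # weight given to the observation
    x_hat = x_b + gain * (y - h * x_b)    # a posteriori state
    b_hat = (1.0 - gain * h) * b          # a posteriori variance
    return x_hat, b_hat


x_hat, b_hat = scalar_update(x_b=1.0, b=0.64, y=1.5, h=1.0, r=0.16)
# The same variance follows from adding the information of both sources.
b_hat_info = 1.0 / (1.0 / 0.64 + 1.0 ** 2 / 0.16)
print(x_hat, b_hat, b_hat_info)
```

With a background variance four times the observation variance, the estimate moves 80 % of the way from the background value towards the observation.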

CarbonTracker: EnSRF based data assimilation
CarbonTracker is an inverse modeling framework based on the ensemble square root filter (EnSRF) developed by Peters et al. (2005). Instead of solving the minimization problem in one step, the EnSRF determines optimized surface fluxes sequentially in a time stepping approach with $x_t$ defining a subset of x for a certain time window. In our standard setup x contains scaling factors for the surface fluxes for 96 weeks, while $x_t$ only spans 5 weeks.
Commonly, a gain matrix G is defined as
$$G = B H^T \left(H B H^T + R\right)^{-1}.$$
Equations (2) and (4) then read
$$\hat{x}_t = x_{b,t} + G_t \left(y_t - H_t x_{b,t}\right), \tag{8}$$
$$\hat{B}_t = \left(I - G_t H_t\right) B_t, \tag{9}$$
with the gain matrix
$$G_t = B_t H_t^T \left(H_t B_t H_t^T + R_t\right)^{-1}, \tag{10}$$
where subscript t indicates quantities of reduced dimensions, for the time step under investigation. Once Eqs. (8) and (9) are solved for time slice t, the solution of the scaling factors $\hat{x}_t$ is used as the background estimate $x_{b,t+1}$ for the next time slice t + 1, assuming that a simple persistence forecast is adequate for our CO2 flux inversion problem,
$$x_{b,t+1} = \hat{x}_t. \tag{11}$$
The covariance $B_{t+1}$ is prescribed at each time step as described in Peters et al. (2005). Given an initial guess for the first background state, this strategy allows for sequentially calculating the complete state vector x.
To estimate the gain matrix $G_t$, the EnSRF uses an ensemble approach. The ensemble members $x^i_{b,t} = x_{b,t} + \Delta x^i_{b,t}$ (i = 1, ..., E, with E the ensemble size) of the background state are drawn such that their mean and covariance are consistent with the background state $x_{b,t}$ and background covariance matrix $B_t$, respectively, so that
$$B_t = \frac{1}{E-1} \sum_{i=1}^{E} \Delta x^i_{b,t} \left(\Delta x^i_{b,t}\right)^T, \tag{12}$$
with each of the vectors $\Delta x^1_{b,t}, \Delta x^2_{b,t}, \ldots, \Delta x^E_{b,t}$ defining the deviations from the mean state.
Then, the terms $H_t B_t H_t^T$ and $B_t H_t^T$ required for calculating $G_t$ following Eq. (10) can be approximated using the results from an ensemble run of the possibly nonlinear transport model H,
$$H_t B_t H_t^T \approx \frac{1}{E-1} \sum_{i=1}^{E} \left(H_t \Delta x^i_{b,t}\right) \left(H_t \Delta x^i_{b,t}\right)^T, \tag{13}$$
$$B_t H_t^T \approx \frac{1}{E-1} \sum_{i=1}^{E} \Delta x^i_{b,t} \left(H_t \Delta x^i_{b,t}\right)^T, \tag{14}$$
where the approximation becomes more exact with increasing ensemble size E. The EnSRF method yields robust results with nonlinear transport operators H as long as the transport model is close to linear for small perturbations ($H(x + \Delta x) \approx Hx + H\Delta x$). Using Eqs. (13) and (14), the gain matrix $G_t$ can be calculated from Eq. (10), finally to update the state estimate $\hat{x}_t$ via Eq. (8). Peters et al. (2005) describe in detail how to estimate the state covariance $\hat{B}_t$ by separately updating the ensemble deviations $\Delta x^i_{b,t}$ while avoiding the costly evaluation of Eq. (10) and circumventing spurious underestimation of $\hat{B}_t$. Overall, CarbonTracker's EnSRF approach requires running the transport model H for E ensemble members over the time period covered by all time steps t. At each time step t the transport model is sampled at all measurement instances within the time step and the above methodology is followed.
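The ensemble approximation of the gain matrix can be illustrated with a toy problem in plain Python. The sketch below assumes a single scalar observation, a linear operator given as one row `H`, and an uncorrelated background with equal standard deviation for every state element. The function name `ensemble_gain` and all numbers are hypothetical. It only estimates the gain and omits the separate update of the ensemble deviations described by Peters et al. (2005).

```python
import random


def ensemble_gain(H, b_std, r_var, n_members, seed=0):
    """Approximate the gain for one scalar observation from an ensemble.

    H is a single row of the observation operator, b_std the background
    standard deviation of each (uncorrelated) state element and r_var the
    observation error variance. All inputs are toy assumptions.
    """
    rng = random.Random(seed)
    n_state = len(H)
    # Draw deviations from the mean state and remove their sample mean.
    devs = [[rng.gauss(0.0, b_std) for _ in range(n_state)]
            for _ in range(n_members)]
    means = [sum(d[j] for d in devs) / n_members for j in range(n_state)]
    devs = [[d[j] - means[j] for j in range(n_state)] for d in devs]
    # Propagate each deviation through the (here linear) operator.
    h_devs = [sum(h * dx for h, dx in zip(H, d)) for d in devs]
    norm = 1.0 / (n_members - 1)
    hbht = norm * sum(hd * hd for hd in h_devs)            # H B H^T
    bht = [norm * sum(d[j] * hd for d, hd in zip(devs, h_devs))
           for j in range(n_state)]                        # B H^T
    return [v / (hbht + r_var) for v in bht]


H = [1.0, 0.5]
# Exact gain for B = 0.64 * I and R = 0.25, for comparison.
exact = [0.64 * h / (0.64 * (1.0 ** 2 + 0.5 ** 2) + 0.25) for h in H]
approx = ensemble_gain(H, b_std=0.8, r_var=0.25, n_members=20000)
print(exact, approx)
```

The ensemble size used here is far larger than what is affordable with a full transport model. It is chosen only so that the sampling noise of the toy estimate is small.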
CarbonTracker uses a refined approach for stepping through the entire time period considered. CarbonTracker's state vector $x_t$ is subdivided into five 1-week bins (five cycles) resulting in an assimilation window of 5 weeks (Peters et al., 2005, chapter 2.3). At each optimization step the oldest cycle at the "end" of the state vector drops out of the state vector and is used as an a posteriori flux estimate while a new cycle is added to the "beginning" of the state vector according to Eq. (11). As such, each 1-week cycle experiences a number of optimization steps equal to the number of weeks in the assimilation time window. The choice of assimilation time window, here 5 weeks, also implies that CarbonTracker can adjust surface fluxes only when their effects are observed at a site within 5 weeks of atmospheric transport. In the zonal direction, this limitation is of little consequence, because typical global transport timescales are on the order of weeks. However, in the meridional direction and especially for interhemispheric transport where the transport timescales are on the order of months, this choice needs to be taken into account when interpreting flux results. The time stepping also defines the temporal binning of 1-week fluxes.
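The bookkeeping of the moving assimilation window can be sketched in a few lines of Python. The function below is a hypothetical illustration that only counts how often each weekly cycle is part of an optimization step. It performs no optimization, and the name `run_lagged_window` and the handling of the last weeks (the run simply stops) are assumptions.

```python
from collections import deque


def run_lagged_window(n_weeks, lag=5):
    """Step a window of `lag` weekly cycles through `n_weeks` weeks.

    Returns how often each week was optimized and the order in which
    weeks left the window as final (a posteriori) estimates.
    """
    window = deque()
    counts = [0] * n_weeks
    finalized = []
    for new_week in range(n_weeks):
        window.appendleft(new_week)         # new cycle enters at the "beginning"
        for week in window:                 # every cycle in the window is optimized
            counts[week] += 1
        if len(window) == lag:
            finalized.append(window.pop())  # oldest cycle leaves at the "end"
    return counts, finalized


counts, finalized = run_lagged_window(n_weeks=8)
print(counts)      # weeks that left the window were optimized `lag` times
print(finalized)
```

Weeks still inside the window when the run ends have seen fewer optimization steps, which is one reason to discard the end of the assimilation period as spin-down.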
The spatial binning of CarbonTracker's state vector follows the Transcom regions (Gurney et al., 2000), further categorized into land regions with similar ecosphere following Olson et al. (1992) and ocean regions following the ocean inversion fluxes approach (Jacobson et al., 2007b) as described in the documentation of CarbonTracker North America. In total, there are 240 flux ecoregions to be optimized, which is significantly fewer than the number of grid cells of the transport model operating on 6° × 4° (longitude × latitude). The fluxes to be optimized are further separated into three categories: biosphere/ocean, fire and fossil fuel. Only the category biosphere/ocean is optimized; the others are imposed from their priors following the assumption that fossil fuel fluxes are known with much higher precision than biosphere and ocean fluxes and that fire fluxes cannot easily be distinguished from biosphere fluxes, so they could not be interpreted separately. Altogether, temporal and spatial binning results in a state vector $x_t$ with 240 × 5 = 1200 elements.
In the Northern Hemisphere, the background covariance $B_t$ is a diagonal matrix with a variance of 0.64 (80 % standard deviation) in units of dimensionless flux scaling factors. In tropical and many Southern Hemisphere regions, the ecosystems are coupled with exponentially decreasing covariance, selected such that the total covariance in the Transcom region matches the variance in Northern Hemisphere regions. The covariance for ocean regions uses the results of the ocean inversion by Jacobson et al. (2007a). Temporal covariance in CarbonTracker stems from processing observations multiple times in the time-stepping approach. The observation covariance R is assumed diagonal.
The version of CarbonTracker used here is derived from version 1.0 of the code maintained by Wageningen University with the same state vector as CarbonTracker North America (as used in Peters et al., 2007) and without a zoom region.

TM5-4DVar: variational data assimilation
Whereas the EnSRF in CarbonTracker reduces the dimension of the minimization problem of Eq. (1) by solving sequentially for time-sliced state vectors, the 4DVar method in TM5-4DVar leaves the dimension of the state vector intact and approximates the solution using a limited set of search directions, corresponding to the dominant singular vectors of the inverse problem, to approach the minimum of the cost function step by step. The iterative minimization of Eq. (1) in TM5-4DVar is described in detail by Chevallier et al. (2005) and Meirink et al. (2008). It employs the conjugate gradient algorithm (Navon and Legler, 1987), which is equivalent to the Lanczos method (Lanczos, 1950; Fisher and Courtier, 1995) and requires calculation of the cost function gradient
$$\nabla J(x_n) = H^T R^{-1} \left(H x_n - y\right) + B^{-1} \left(x_n - x_b\right),$$
where subscript n indicates the nth iterative step. The adjoint formulation of TM5 allows for the calculation of the cost function gradient using a single run of the transport model and its adjoint (Errico, 1997; Chevallier et al., 2005).
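The iterative minimization can be illustrated with a plain conjugate gradient loop on a toy problem. The sketch below is not the Lanczos-based implementation of TM5-4DVar: it assumes a small dense matrix `H` standing in for the transport model (with its transpose standing in for the adjoint), diagonal R and B, a cost function with the conventional factor 1/2, and a stopping rule based on the ratio of gradient norms. It works in a preconditioned variable chi with x = x_b + L chi and B = L L^T. All names and numbers are hypothetical, and the extraction of eigenpairs is omitted.

```python
import math


def matvec(M, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def minimize_cg(H, x_b, y, r_var, b_var, eta=1e-9, max_iter=50):
    """Minimize the quadratic cost in the preconditioned variable chi."""
    l_diag = [math.sqrt(b) for b in b_var]   # B = L L^T with diagonal L
    Ht = [list(col) for col in zip(*H)]      # transpose, stands in for the adjoint

    def hessian_times(chi):
        # (I + L^T H^T R^-1 H L) chi: one forward and one adjoint application
        dx = [l * c for l, c in zip(l_diag, chi)]
        dy = [v / r for v, r in zip(matvec(H, dx), r_var)]
        back = matvec(Ht, dy)
        return [c + l * g for c, l, g in zip(chi, l_diag, back)]

    departures = [(y_k - m_k) / r
                  for y_k, m_k, r in zip(y, matvec(H, x_b), r_var)]
    rhs = [l * g for l, g in zip(l_diag, matvec(Ht, departures))]

    chi = [0.0] * len(x_b)
    residual = rhs[:]                        # minus the gradient at chi = 0
    direction = residual[:]
    norm0 = math.sqrt(dot(residual, residual))
    n_iter = 0
    while n_iter < max_iter and math.sqrt(dot(residual, residual)) > eta * norm0:
        a_dir = hessian_times(direction)
        alpha = dot(residual, residual) / dot(direction, a_dir)
        chi = [c + alpha * d for c, d in zip(chi, direction)]
        new_residual = [r - alpha * a for r, a in zip(residual, a_dir)]
        beta = dot(new_residual, new_residual) / dot(residual, residual)
        direction = [r + beta * d for r, d in zip(new_residual, direction)]
        residual = new_residual
        n_iter += 1
    x_opt = [xb + l * c for xb, l, c in zip(x_b, l_diag, chi)]
    return x_opt, n_iter


H = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
x_opt, n_iter = minimize_cg(H, x_b=[1.0, 1.0], y=[1.2, 1.0, 1.6],
                            r_var=[0.25, 0.25, 0.25], b_var=[0.64, 0.64])
print(x_opt, n_iter)
```

Each iteration needs one forward and one adjoint application of the operator, which mirrors the cost structure described above. For a state of dimension two, the loop reaches the minimum after about two iterations.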
The conjugate gradient algorithm further provides the leading eigenvalues and eigenvectors of the preconditioned Hessian matrix, which is the second derivative of the cost function J with respect to the dimensionless preconditioned state χ defined as $x = L\chi + x_b$, where L is the preconditioning matrix with $B = LL^T$. This can be used to construct the inverse of the state covariance $\hat{B}^{-1}$ as defined in Eq. (4). After n steps, corresponding to n runs of the forward and the adjoint model, the minimization algorithm yields an optimized state estimate $\chi_n$ and the first n eigenvalues $\lambda_i$ ($\lambda_i > 1$) and eigenvectors $v_i$ (i = 1, ..., n) for the eigensystem of the preconditioned Hessian matrix. The latter can be used to construct an approximate error covariance matrix,
$$\hat{B}_n = B + \sum_{i=1}^{n} \left(\frac{1}{\lambda_i} - 1\right) \left(L v_i\right) \left(L v_i\right)^T.$$
With an increasing number of iterations, the optimized state vector $\hat{x}_n$ approaches the optimal state vector $\hat{x}$ at the minimum of the cost function and the approximate state covariance $\hat{B}_n$ approaches $\hat{B}$ from above, so that the estimated uncertainty is always larger than the analytical value (Basu et al., 2013). For practical purposes the iteration is stopped when the gradient norm reduction exceeds a threshold, i.e.
$$\frac{\lVert \nabla J(x_n) \rVert}{\lVert \nabla J(x_0) \rVert} \le \eta,$$
with the constant chosen to be η = 10^−9 here. TM5-4DVar's state vector x is binned temporally in monthly fluxes and spatially on the transport model grid scale, i.e. 6° × 4°, longitude × latitude. Fluxes are categorized into biosphere, ocean, fire and fossil fuel. To create a setup comparable to CarbonTracker, only biosphere and ocean fluxes are optimized. The background covariance B of the state vector is characterized by a global temporal and spatial correlation length. By default TM5-4DVar uses an exponential decay with a temporal and spatial length scale of 1 month and 200 km for biosphere fluxes and 3 months and 1000 km for ocean fluxes. As such, the temporal binning of TM5-4DVar's state vector containing monthly bins is about a factor of 4 coarser than the temporal binning of CarbonTracker's weekly bins. TM5-4DVar's spatial binning has a different overall structure. Whereas CarbonTracker's prior fluxes are fully correlated inside the 240 ecoregions and mostly uncorrelated between different ecoregions, the correlation of TM5-4DVar's fluxes falls off exponentially around each grid box. The exponential decay in TM5-4DVar's temporal background correlation limits the effects of observations in time. However, TM5-4DVar has no strict limit on the time window during which observations can be linked to fluxes but rather reduces the strength of the influence with temporal lag. TM5-4DVar can adjust surface fluxes in response to any observation during the entire considered time period given that the transport model reveals a link between fluxes and observations. As for CarbonTracker, the observation covariance R is assumed diagonal.
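The exponential decay of the prior correlation can be written down directly from the quoted length scales. The following sketch assumes that the spatial and temporal decays multiply (a separable form), which the text does not state explicitly. The function name and the dictionary layout are hypothetical.

```python
import math

# Length scales quoted in the text: (months, km) per flux category.
LENGTH_SCALES = {"biosphere": (1.0, 200.0), "ocean": (3.0, 1000.0)}


def prior_correlation(distance_km, lag_months, category):
    """Correlation between two flux elements of the same category.

    Assumes a separable product of exponential decays in space and time.
    """
    t_scale, d_scale = LENGTH_SCALES[category]
    return (math.exp(-distance_km / d_scale)
            * math.exp(-abs(lag_months) / t_scale))


print(prior_correlation(200.0, 0, "biosphere"))  # one spatial length scale apart
print(prior_correlation(200.0, 0, "ocean"))      # ocean fluxes stay more correlated
print(prior_correlation(0.0, 3, "biosphere"))    # same grid box, three months apart
```

Under this assumption, two biosphere grid boxes 200 km apart in the same month have a prior correlation of exp(−1), while the correlation never drops to exactly zero at any distance or lag.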

Setup of the comparison
Given the setup of the CarbonTracker and TM5-4DVar modeling systems, we aim at comparing the performance of their data assimilation concepts for the purpose of CO2 surface flux estimation when assimilating atmospheric CO2 concentration records. To avoid affecting conclusions about the inverse methodology, care must be taken that model input such as transport parameters, background estimates, initial concentration fields and assimilated observations are harmonized as far as possible. However, as outlined in Sect. 2, conceptual differences between the models prevent us from making the model setup exactly identical.

Transport model and observation operator
To connect concentration measurements and surface fluxes, CarbonTracker and TM5-4DVar use a transport model which transports the CO2 tracer using meteorological fields. Both models use the Transport Model 5 (TM5) as described by Krol et al. (2005) which utilizes meteorological data from the European Centre for Medium-Range Weather Forecasts (ECMWF, 2013). For CarbonTracker, we follow the setup used by Peters et al. (2007). For TM5-4DVar our setup differs from the setup used by Basu et al. (2013) in one main aspect to be consistent with CarbonTracker: the CO2 concentration field is sampled in the second model layer (≈ 980 hPa ≈ 170 m) or higher instead of in the first model layer (≈ 994 hPa ≈ 50 m) or higher. Except for this adjustment and some minor differences due to different interfaces of the inverse methods, the versions of TM5 used by the CarbonTracker and TM5-4DVar systems we are using are the same.

Background flux and initial guess
CarbonTracker and TM5-4DVar use the same background fluxes and initial concentration fields. The biosphere fluxes are taken from the Simple Biosphere model using the Carnegie-Ames-Stanford Approach (SIBCASA, as described by Schaefer et al., 2008). SIBCASA is a carbon cycle model that represents the uptake of CO2 by different types of vegetation and its subsequent transfer back to the atmosphere through autotrophic and heterotrophic respiration. Its mechanistic description of the processes involved is driven by a combination of high-resolution weather data and satellite remote sensing products and includes interactions between the carbon, water, and energy cycles of the land surface. For the oceans both models use ocean inversion fluxes, the output from an ocean inversion which assumes that the uptake of anthropogenic CO2 increases proportionally to the mismatch between atmospheric and oceanic CO2 partial pressure. Fire fluxes are taken from the Global Fire Emissions Database version 2 (GFEDv2, van der Werf et al., 2010). Fossil fuel fluxes are taken from the Miller data set as described in Peters et al. (2007) and its supplement.
www.atmos-chem-phys.net/15/9747/2015/ Atmos. Chem. Phys., 15, 9747-9763, 2015
The initial concentration field is generated from the output of a previous CarbonTracker run which ended on 1 January 2007.The field for 2009 is derived by increasing the concentration by 1.9 parts per million (ppm) per year.The value 1.9 ppm was chosen based on tests of the fit to observation sites in the first month of 2009.
The covariance of the fluxes is defined in the models as described in Sects. 2.1 and 2.2. We harmonize the overall covariance by adjusting the prior flux uncertainty in TM5-4DVar to 172.59 % of the flux for ocean grid boxes and to 199.17 % for land grid boxes to match the uncertainty of a CarbonTracker run with a monthly cycle for global and continental aggregates. Due to the different ways of specifying the state vector x and its covariance B in CarbonTracker (weekly with ecoregions) and TM5-4DVar (monthly gridded with global covariance parameters), it is not possible to get an exact match of the flux uncertainties. This is a result of comparing real-world systems used for flux estimation in order to capture not only theoretical effects but also differences which show in practical use. While making the comparison more complex, this choice allows gaining a better understanding of the uncertainties due to the large number of implementation decisions which have to be taken for a production system. The remaining mismatches in the prior flux uncertainty can have an effect on the estimated fluxes. This effect has to be taken into account when interpreting a posteriori flux differences. Section 5.1.1 includes an example of such an analysis. The remaining mismatches in the flux uncertainty per Transcom region and month are provided in the Supplement.

Observations and observation errors
Both DA systems use the same observations from the obspack compilation of in situ CO2 concentration measurements (Masarie et al., 2014; NOAA Environmental Sciences Division and Oak Ridge National Laboratory, 2013, version: PROTOTYPE v1.0.2 2013-01-28). Discrete (e.g. one sample per week) measurements from surface flask sites, in situ continuous (and semi-continuous) measurements from surface sites and towers, and aircraft campaign measurements are collected, aggregated and quality screened to make them suitable for inverse flux estimation. At many but not all of the continuous measurement sites, the measurements are averaged to provide afternoon or nighttime averages (depending on the type of site, e.g., continental planetary boundary layer site or mountain site), using intra-day averaging periods representative of large scale fluxes and discarding single measurements outside the respective averaging periods. For our baseline CarbonTracker and TM5-4DVar runs, we exclude 21 measurement sites from the assimilation to use them as validation sites.
Additionally we take out five sites which have more than 1000 measurements in the assimilation period.This is to keep the TM5-4DVar results representative of TM5-4DVar runs which use the native TM5-4DVar input.When using these five sites with the CarbonTracker preprocessing, TM5-4DVar shows strong gradients between neighboring grid cells in North America which it does not show when processing its native set of observations.In addition to these 26 excluded sites, there are 24 further sites from which the default run of CarbonTracker uses no data or only a subset of the observations.The reasons for not using some of the observation data of a site include that the data is assumed not representative of its grid cell or recorded in aircraft campaigns.
Measurement uncertainty is set to a fixed value for each site accounting for the measurement errors and for representativeness errors. The latter originate from using the in situ samples to represent the CO2 concentration in a transport model grid box of 6° longitude and 4° latitude. Concentration uncertainties range from 0.75 ppm for marine boundary layer sites, through 2.5 ppm for land sites, to 7.5 ppm for sites which experience variable meteorological conditions. Table 1 in the Supplement lists the observation records used in our study. Figure S1 in the Supplement shows the global distribution of observation sites together with a visual representation of their weight due to sampling frequency and representativeness error. In our setup CarbonTracker and TM5-4DVar use the same representativeness errors.

A posteriori concentration fields
As a first step, we compare and validate the performance of CarbonTracker and TM5-4DVar by evaluating the difference between measured and modeled CO2 concentration fields at the location of various ground sampling stations. Comparing concentration fields at the assimilated sites in Sect. 4.1 provides a way of verifying that data assimilation works in both systems. Comparing measured and modeled concentrations at non-assimilated sites in Sect. 4.2 indicates to what extent the data assimilation approaches yield improvements where observational constraints are distant in space and/or time. CarbonTracker and TM5-4DVar are both run with the baseline setup (as described in Sect. 3) for a 23-month period starting on 1 February 2009.

Assimilated sites
As an example for an assimilated site, Fig. 1 shows a time series of measured and modeled CO 2 concentrations at Mauna Loa (MLO), Hawaii, located at 3399 m a.s.l. in the Pacific.
For the period from 1 February 2009 to 30 December 2010, the models assimilate 94 weekly flask measurements. We compare the observations to a posteriori and a priori model concentrations. The a posteriori concentrations are sampled using the a posteriori surface fluxes estimated by CarbonTracker or TM5-4DVar. The prior model concentrations are sampled using the background (prior) flux estimate common to both models. The Mauna Loa record demonstrates that the a posteriori concentrations produced by both models match the observations within the uncertainty estimate and that the match is substantially better than for the prior concentration fields. Differences between CarbonTracker and TM5-4DVar are much smaller than the representativeness error of the measurements at Mauna Loa (0.75 ppm) over the entire period. This is consistent with the results at other sites.
The mismatch between measured and modeled CO2 concentrations for all assimilated measurements is shown in Fig. 2, with the prior concentrations, the a posteriori concentrations optimized by CarbonTracker, and the a posteriori concentrations optimized by TM5-4DVar. The concentration mismatch is normalized by the representativeness error of the observations such that a (unitless) mismatch of 1 corresponds to a mismatch with the magnitude of the representativeness error. Unlike the time series for Mauna Loa, the histograms only integrate over the 1-year period of 3 April 2009 to 2 April 2010 in order to be consistent with the analysis of the a posteriori surface fluxes in Sect. 5. This time period gives the models sufficient spin-up and spin-down time, given that the initial concentration is already well-optimized by a previous CarbonTracker run.
The concentrations from the prior forward run in Fig. 2 reveal an overall bias in the normalized (unitless) mismatch of 0.37 with a standard deviation of 1.09. Tentatively, the prior fields show a dipole pattern with peaks around −1 and 1 which can be traced back to the prior for the Northern Hemisphere, generally overestimating the observations, and the prior for the Southern Hemisphere, generally underestimating the observations. The CarbonTracker and TM5-4DVar histograms show small biases of 0.006 and 0.025 with standard deviations of 0.727 and 0.650, respectively. Compared to the prior, both DA systems improve the overall bias and they substantially reduce the spread of the observation-model mismatch. Normalized standard deviations smaller than 1 indicate that the mismatch is on average smaller than the estimated representativeness error, which points to a conservative choice of representativeness errors and consequently a stronger than optimal influence of the prior flux estimate. However, avoiding this would require using the output of the assimilation systems to adjust their input parameters which could lead to transient errors in the result.
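The normalized statistics quoted above can be reproduced for any set of samples with a few lines of Python. The sketch below assumes the sign convention model minus observation and a population standard deviation (division by the number of samples). Neither choice is stated explicitly in the text, and the function name and all sample values are hypothetical.

```python
import math


def normalized_mismatch_stats(modeled, observed, errors):
    """Bias and standard deviation of (model - observation) / error.

    The sign convention (model minus observation) is an assumption.
    """
    z = [(m - o) / e for m, o, e in zip(modeled, observed, errors)]
    bias = sum(z) / len(z)
    spread = math.sqrt(sum((v - bias) ** 2 for v in z) / len(z))
    return bias, spread


# Hypothetical concentrations in ppm with per-sample representativeness errors.
modeled = [388.0, 389.5, 390.0, 391.5]
observed = [387.25, 389.5, 390.75, 390.0]
errors = [0.75, 2.5, 0.75, 1.5]
bias, spread = normalized_mismatch_stats(modeled, observed, errors)
print(bias, spread)
```

A spread below 1 in these units means that the typical mismatch is smaller than the assumed representativeness error of the samples.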
The histograms for a posteriori CarbonTracker and TM5-4DVar concentrations reveal some non-Gaussian behavior, with long tails toward greater mismatch and with a narrow peak at the center. The tails most likely stem from temporally varying contributions to the representativeness error which our input data assumes constant in time. The narrow peak likely stems from two sources: first, sites with high-frequency measurements are assumed uncorrelated in the models and as such provide a stronger constraint than sites with low-frequency measurements. Second, an already well-optimized prior which is close to the observations causes the models to stick to the prior in a sparse observation network.
In summary, both models show similar performance for assimilated sites, and the assimilation substantially reduces the mismatch between modeled and measured concentrations at assimilated sites.

Non-assimilated sites
Next, we evaluate the performance of the DA systems for sites whose observations are not assimilated. These sites provide independent validation of the results. Figure 3 shows a time series of flask measurements in Guam, Mariana Islands (GMI), located in the western Pacific. In contrast to Mauna Loa, the measurements are taken at sea level and are not assimilated by the CarbonTracker and TM5-4DVar inverse models. The observation error in Guam is 1.5 ppm, and the modeled concentrations agree well with measurements taken at the site. CarbonTracker and TM5-4DVar reproduce the measurements similarly well, with respective biases of 0.12 and 0.02 ppm. Their standard deviations of 0.79 and 0.82 ppm are greater than the standard deviation at Mauna Loa, our selected example for assimilated sites. The prior concentrations, on the other hand, deviate substantially from the measurements, with a bias and standard deviation of 0.89 and 1.24 ppm, respectively. The histograms of model-observation mismatch are shown in Fig. 4 for the concentrations of a prior forward run and of the a posteriori CarbonTracker and TM5-4DVar runs. Many of the non-assimilated measurements come from continuous sampling sites and aircraft campaigns which provide a high number of measurements. Normalized bias and standard deviation of the prior mismatch aggregated for all sites are 0.66 and 1.03, respectively. The normalized biases of the mismatch for CarbonTracker and TM5-4DVar are 0.097 and 0.004, respectively, and the standard deviations of the histograms are 0.835 and 0.839, indicating that assimilating observations with the DA systems substantially improves the match to independent data when compared to the prior performance. The spread of the a posteriori model-observation mismatch, however, is somewhat greater than for the comparison to assimilated measurements. This is as expected and indicates a slightly worse performance of both methods for non-assimilated than for assimilated sites.

Robustness of the result
A posteriori CarbonTracker concentrations show a larger bias for non-assimilated measurements (0.097) than for assimilated measurements (0.006). TM5-4DVar biases are more similar for non-assimilated (0.004) and assimilated measurements (0.025). In order to investigate whether these differences are likely to be an artefact of our selection of validation sites, we conduct a resampling experiment. Out of the 50 sites for which there are non-assimilated observations (our 26 validation sites, aircraft measurements and sites for which only a given measurement method is assimilated), we randomly select subsets of 25 sites and recalculate the statistical model-observation bias for non-assimilated measurements. Then we repeat the exercise 9 times and examine the distribution of the resampled CarbonTracker and TM5-4DVar biases. Figure 5 shows that the normalized biases for the CarbonTracker baseline run consistently scatter around 0.08 with a standard deviation of 0.04, while the TM5-4DVar average bias and standard deviation are −0.04 and 0.07, respectively. So, while a posteriori CarbonTracker concentrations appear offset from the (non-assimilated) observations, TM5-4DVar does not show a significant overall bias but does show greater station-to-station variability for the model-observation mismatch.
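The resampling experiment described above can be sketched in a few lines. The sketch below assumes that the bias of each draw is computed from the pooled normalized residuals of the selected sites; the site names, the synthetic residuals and the function name are hypothetical and only serve to make the example runnable.

```python
import random
from statistics import mean, pstdev


def resampled_biases(residuals_by_site, subset_size, n_repeats, seed=0):
    """Draw random site subsets and return the pooled bias of each draw."""
    rng = random.Random(seed)
    sites = sorted(residuals_by_site)
    biases = []
    for _ in range(n_repeats):
        chosen = rng.sample(sites, subset_size)  # sites drawn without replacement
        pooled = [r for site in chosen for r in residuals_by_site[site]]
        biases.append(mean(pooled))
    return biases


# Synthetic normalized residuals for 50 hypothetical sites (not real data).
data_rng = random.Random(42)
residuals_by_site = {
    f"site{i:02d}": [data_rng.gauss(0.1, 0.8) for _ in range(20)]
    for i in range(50)
}

biases = resampled_biases(residuals_by_site, subset_size=25, n_repeats=9)
center, scatter = mean(biases), pstdev(biases)
```

The values `center` and `scatter` play the role of the average bias and its standard deviation over the resampled subsets.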

Impact of the CarbonTracker assimilation window length
In order to investigate whether the robust bias our resampling found for CarbonTracker can be due to the choice of the EnSRF assimilation time window, we vary CarbonTracker's lag and cycle parameters. Figure 6 S1 in the Supplement. The baseline run (dots) is compared to a CarbonTracker run with the assimilation period extended to 5 × 20 days (×) instead of 5 × 7 days and 10 × 7 days (+).
Syowa measurements by flux adjustment unless they account for far- and long-reaching correlations between concentrations and fluxes. While TM5-4DVar allows for such connections, CarbonTracker's baseline assimilation window strictly limits these to 5 weeks, which is shorter than the transport timescales from strong flux regions to Antarctica. Therefore, the baseline CarbonTracker run shows a small but systematic underestimation of the CO2 concentration by up to 0.5 ppm observed in Syowa in summer and fall 2009, while a posteriori TM5-4DVar concentrations match well (not shown).
Increasing or decreasing CarbonTracker's assimilation window length respectively improves or deteriorates the match to Syowa observations, showing that the assumed temporal correlations play a role. For sites which are closer to biosphere regions, this effect could manifest as flux misattribution, which would appear as a mismatch to non-assimilated stations. Figure 5 illustrates the resulting biases for our resampling assessment when CarbonTracker is run with an assimilation window of 10 × 7 days or 5 × 20 days instead of 5 × 7 days. For 10 × 7 the average normalized bias reduces to 0.03 with a standard deviation of 0.03, and for 5 × 20 the average normalized bias reduces to −0.01 with a standard deviation of 0.03. Both are consistent with TM5-4DVar's performance and better than the run with 5 × 7 days. This suggests that a longer assimilation window adds valuable information to CarbonTracker's DA system. It is unclear, though, whether this improved match to validation measurements translates into improved flux estimates, since transport model errors might have a larger impact for the longer assimilation windows. In Sect. 5.1 we discuss additional effects from a larger bin size which may make a long assimilation window undesirable, despite the better match to validation measurements.
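As a reading aid for the "lag × cycle" notation, the total window length is the number of cycles multiplied by the cycle length in days. The short sketch below only restates that arithmetic; the dictionary keys and function name are illustrative and not CarbonTracker configuration options.

```python
def window_days(n_cycles, cycle_days):
    """Total assimilation window length in days for n_cycles of cycle_days each."""
    return n_cycles * cycle_days


# The three setups named in the text, written as (number of cycles, cycle length).
setups = {"5 x 7": (5, 7), "10 x 7": (10, 7), "5 x 20": (5, 20)}
window_lengths = {name: window_days(*cfg) for name, cfg in setups.items()}

for name, days in window_lengths.items():
    print(f"{name}: {days} days ({days / 7:.1f} weeks)")
```

The baseline of 5 × 7 days corresponds to the 5-week window mentioned for the default setup.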

Comparison of a posteriori surface fluxes
Section 4 shows that the methods are of similar quality when comparing the a posteriori concentrations with assimilated and non-assimilated observations. Here, we turn to evaluating the a posteriori surface fluxes delivered by CarbonTracker and TM5-4DVar.
As a first step, we describe the results of the baseline runs. Then we analyze detectable features and the effect of a longer assimilation window in CarbonTracker.

Surface fluxes of the baseline run
For the baseline CarbonTracker and TM5-4DVar runs, Table 1 shows the globally aggregated a posteriori fluxes for the biosphere and oceans from 3 April 2009 to 2 April 2010. CarbonTracker and TM5-4DVar estimate a global carbon sink (due to the biosphere and oceans) which is stronger than the prior estimate by 1.42 and 1.35 Pg C a⁻¹, respectively. We only show the uncertainty for the prior and TM5-4DVar, which is calculated as described by Basu et al. (2013), because for CarbonTracker the aggregation of uncertainties from the weekly to the yearly scale requires assumptions about the temporal correlation of the uncertainties. Due to these assumptions, the yearly uncertainties of TM5-4DVar and CarbonTracker would not be comparable, even if we adopted existing schemes such as the one employed by Peters et al. (2005). The differences in the uncertainties would not be representative of actual differences in the models. Therefore, we use the uncertainties from TM5-4DVar as a metric for comparisons. Different from the Monte Carlo-based uncertainty calculation which Chatterjee and Michalak (2013) used, the error propagation employed in TM5-4DVar always approaches uncertainties from above: the aggregated errors are larger than the analytical uncertainties at the exact minimum of the cost function.
Due to this, we expect our uncertainties to overestimate the real uncertainties from measurement and representativeness errors. With this caveat, the sink estimates of the two models are consistent within the TM5-4DVar uncertainties and also match previous findings for CarbonTracker (Peters et al., 2007). Examining the time series of globally aggregated surface fluxes in Fig. 7 confirms that the two DA systems are consistent on the global scale, both showing stronger summer uptake than the prior.
Figure 8 illustrates the a posteriori biogenic and oceanic fluxes aggregated over the one-year time period on continental-scale regions. Agreement between CarbonTracker and TM5-4DVar is found for North America, Africa, Europe, and Australia, and for all oceans except for the Indian Ocean. The optimized fluxes in these regions differ by less than the yearly uncertainties estimated from TM5-4DVar's statistical error aggregation (see Basu et al., 2013). On the other hand, the modeled fluxes from CarbonTracker and TM5-4DVar differ by more than their uncertainty in South America, Asia and the Indian Ocean. In South America they differ by roughly 2 times the estimated uncertainty; therefore, we take a more detailed look at this discrepancy.
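The agreement criterion used above, a flux difference smaller than the TM5-4DVar yearly uncertainty, can be written as a small check. The following is a sketch under stated assumptions: the function names are invented for illustration and the regional numbers are hypothetical placeholders, not values from Fig. 8.

```python
def difference_in_sigma(flux_a, flux_b, sigma):
    """Absolute flux difference in units of the reference uncertainty."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return abs(flux_a - flux_b) / sigma


def is_consistent(flux_a, flux_b, sigma):
    """True if the two estimates differ by no more than one sigma."""
    return difference_in_sigma(flux_a, flux_b, sigma) <= 1.0


# Hypothetical yearly fluxes in Pg C per year (negative = sink) and the
# reference uncertainty: (estimate A, estimate B, sigma).
regions = {
    "region A": (-0.90, -1.00, 0.40),
    "region B": (0.10, 0.90, 0.40),
}
report = {name: is_consistent(*values) for name, values in regions.items()}
```

In this toy input, "region B" differs by 2 sigma and would be flagged for a closer look, analogous to the South American case discussed in the text.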

TM5-4DVar's flux anomaly in South America
The time series of South American surface fluxes in Fig. 9 reveals that the flux differences in South America stem from particularly large emission estimates in summer 2009 in TM5-4DVar. The temporal structure of TM5-4DVar fluxes for the Indian and Pacific oceans suggests that ocean uptake compensates for the large South American source to match the hemispheric flux budget. South America suffers from sparseness of observational constraints such that validation of the estimated surface fluxes via comparison of measured and modeled atmospheric CO2 concentrations is difficult. Aircraft measurements regularly conducted in South America do not provide deeper insight, because they have a data gap in the critical time between June and August 2009. The only other site that is close to the South America flux region is Arembepe in Brazil (ABP, 12.77° S, 38.17° W), a ground sampling station which is used as a constraint within our data assimilation exercise.
To check its impact on the fluxes, we perform a sensitivity run without assimilating Arembepe. In this run both models are similarly good at matching modeled a posteriori and measured CO2 concentrations in Arembepe and mostly follow the prior (see Fig. 10). When assimilating observations from Arembepe, however, TM5-4DVar closely follows the observations in spring 2009 while CarbonTracker only moves halfway from the prior to the observations. This can be explained by the outlier rejection in CarbonTracker: when the difference between the model and a measurement is more than 3 times the estimated representativeness error of the measurement, CarbonTracker ignores the measurement as an outlier. As a marine boundary layer site, Arembepe is assigned a representativeness error of only 0.75 ppm, so CarbonTracker ignores most measurements before May 2009. The aggregated fluxes in Fig. 8 show that assimilating the measurements in Arembepe has a significant effect on the a posteriori fluxes of TM5-4DVar. When taking out Arembepe from the baseline run, TM5-4DVar's attribution of fluxes shifts: the sinks in the Pacific and Indian oceans weaken while the strong source in South America disappears. The time series in Fig. 9 provide a temporal fingerprint of the flux difference due to removing Arembepe from the assimilation, which identifies the changes in the Pacific and Indian oceans as compensation for the removal of the strong source in South America.
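The rejection rule described above can be illustrated with a short sketch. It only encodes the stated threshold of 3 times the representativeness error; the function name and the sample concentrations are hypothetical, and the sketch is not CarbonTracker's actual implementation.

```python
def is_rejected(modeled, measured, repr_error, factor=3.0):
    """True if the model-measurement difference exceeds factor * repr_error."""
    return abs(modeled - measured) > factor * repr_error


# With the 0.75 ppm error assigned to Arembepe the threshold is 2.25 ppm.
threshold_abp = 3.0 * 0.75

# Hypothetical concentrations in ppm, for illustration only.
rejected_far = is_rejected(modeled=388.0, measured=391.0, repr_error=0.75)
rejected_near = is_rejected(modeled=388.0, measured=390.0, repr_error=0.75)
# The same 3 ppm difference is kept at a site with a 1.5 ppm error.
rejected_loose = is_rejected(modeled=388.0, measured=391.0, repr_error=1.5)
```

The example makes the dependence on the assigned error explicit: a small representativeness error tightens the threshold, so a prior that is far from the observations leads to many rejections.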
The flux changes in CarbonTracker when assimilating Arembepe are within the estimated uncertainties in the yearly aggregated fluxes and in the time series. Disabling the outlier rejection in CarbonTracker causes the modeled concentrations to follow the observations much more closely. CarbonTracker specifies the flux uncertainty relative to the total flux, which in April and May 2009 yields a lower uncertainty than that from TM5-4DVar. This can cause the flux to change less than in TM5-4DVar in those months, leading to the strong reaction of the outlier rejection. However, as shown in Fig. 9, CarbonTracker does not show the additional source seen in TM5-4DVar between July and August 2009, where the flux uncertainty of both models differs by less than 10 %. It also does not show the compensation fluxes TM5-4DVar gives in the oceans.
The fluxes induced by assimilating Arembepe show that TM5-4DVar is more susceptible than CarbonTracker to the effect of single measurement sites in regions with very low observation density.

CarbonTracker with longer assimilation window
Figure 8 shows that when increasing the assimilation time window of CarbonTracker to 5 × 20 days, CarbonTracker yields roughly the same aggregated flux for Asia as TM5-4DVar.
The time series in Fig. 11 suggests that the change in the CarbonTracker estimate of Asian fluxes when going to the longer assimilation window originates from high-frequency corrections to the prior fluxes. If the biosphere model needs to be corrected for only 1 week, the run with weekly flux bins can adjust that week separately, while the run with 20-day flux bins has to adjust a full 20-day period. To test this theory, we verified that a run with an assimilation window consisting of ten 1-week cycles yields a similar Asian sink as the run with five 1-week cycles (1.84 instead of 1.61 Pg C a⁻¹), which does not increase further when going to fifteen 1-week cycles (not shown), while a run with three 20-day cycles yields a similar Asian sink as the run with five 20-day cycles (2.22 instead of 2.25 Pg C a⁻¹).
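The bin-size argument can be illustrated with a deliberately simplified toy model. It assumes an additive flux correction that is constant within each bin and fitted by least squares (the bin mean), which is not how CarbonTracker's scaling factors work in detail; all names and numbers are hypothetical.

```python
def fit_binned(correction, bin_days):
    """Least-squares fit of one constant per bin, i.e. the mean over each bin."""
    fitted = []
    for start in range(0, len(correction), bin_days):
        chunk = correction[start:start + bin_days]
        fitted.extend([sum(chunk) / len(chunk)] * len(chunk))
    return fitted


def rms_misfit(a, b):
    """Root-mean-square difference between two equally long series."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


# Daily correction needed by a hypothetical prior: nonzero for one week only.
needed = [0.0] * 140
for day in range(63, 70):
    needed[day] = 1.0

weekly = fit_binned(needed, 7)   # 7-day bins can isolate the affected week
coarse = fit_binned(needed, 20)  # 20-day bins smear it over the full bin

misfit_weekly = rms_misfit(needed, weekly)
misfit_coarse = rms_misfit(needed, coarse)
```

In this toy setting the weekly bins reproduce the one-week correction exactly, while the 20-day bins spread it over 20 days and leave a residual misfit.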
For a quantitative discussion of the propagation of aggregation errors see Turner and Jacob (2015). Our findings suggest that there is an impact of roughly 0.5 Pg C a⁻¹ from high-frequency mismatches between the prior model and the measured concentrations during the Asian summer which cannot be corrected accurately with a bin size of 20 days or more.
In summary we see good agreement for the baseline fluxes between CarbonTracker and TM5-4DVar on a global scale and for most continents and oceans.The mismatch of the fluxes in South America, the Indian Ocean and Asia can be traced back to two distinct effects: a different flux response in regions with very limited observation coverage and using weekly (CarbonTracker) or monthly (TM5-4DVar) adjustments to account for mismatches on shorter timescales.

Sensitivity to observation coverage
In order to assess the importance of data density and coverage for the two DA systems, we follow the approach which Bruhwiler et al. (2011) used to analyze the performance of their initial version of a fixed-lag ensemble Kalman smoother (Bruhwiler et al., 2005). We carry out five "historical" model runs where we increase the number of assimilated observation sites stepwise, mostly following the historical availability of data. The first run, termed "2/cont", assimilates observations from up to two stations per continent. It represents an extremely sparse observation network with different sampling frequencies per site. The runs "1988" and "2000" assimilate observations from all sites that were active in the years 1988 and 2000. The 2000 run assimilates roughly the same number of observations as our baseline run. The "2010" run uses all stations which were active in the year 2010 except for Arembepe. We exclude Arembepe from the 2010 run because, as shown in Sect. 5.1, the different treatment of the observations there would dominate the flux changes and as such mask other effects. Figure S2 illustrates the observation density and coverage for the different historical runs, while Table S1 lists the sites included for all the historical runs.

Figure 12 shows the globally aggregated prior and a posteriori fluxes for the baseline setup and each of the historical runs. All the historical runs for both models, CarbonTracker as well as TM5-4DVar, yield consistent estimates of the global (biospheric and oceanic) carbon sink. The results differ by a few tenths of a Pg C a⁻¹, which is well below the TM5-4DVar uncertainty estimate of about 1 Pg C a⁻¹. This consistency is expected since the global carbon sink is well constrained by the trend in global background concentrations. Compared to the prior, all runs indicate a stronger sink by more than 1 Pg C a⁻¹. The global flux estimate is robust against changes in the observation coverage and against the choice of the inverse method. Global-scale fluxes are also consistent with the 2013B estimates from CarbonTracker North America (NOAA, ESRL). NOAA shows a global sink of 6.79 ± 6.86 Pg C for 2009, while we see values between 6.37 and 7.03 Pg C for April 2009 to April 2010, depending on the observation data we assimilate.
On the continental scale we take a closer look at North America, since changes in observation density are historically most pronounced there. Figure 13 shows that TM5-4DVar and CarbonTracker fluxes for North America become more similar the denser the observation network becomes, with almost the same flux estimate in the 2010 setup in which the DA systems assimilate more than 15 sites on the North American continent (see Fig. 14). This good match of both methods suggests that the density of observation sites in North America suffices to optimize continental-scale fluxes with some degree of certainty. Separating the fluxes of the two North America Transcom regions (Fig. 13) shows that for the more homogeneous Transcom region in boreal North America, the results from both methods have already converged with the observation coverage in the 1988 run, while in the more heterogeneous North American temperate region with many agricultural regions, the methods only converge in the 2010 setup.
The stronger land sink seen by TM5-4DVar for 2/cont stems from assimilating only two sites: a site in West Branch, Iowa, USA (WBI, 41.7° N, 91.4° W), in the US Corn Belt, and a site on Sable Island, Nova Scotia, Canada (WSA, 43.9° N, 60.0° W). In TM5-4DVar, the strong summer sink near West Branch dominates the North America fluxes and increases the sink from roughly 1 Pg C a⁻¹ in the 2010 run to more than 1.6 Pg C a⁻¹ in the 2/cont run. CarbonTracker is less susceptible to this effect than TM5-4DVar because its ecoregion approach enforces a correlation between the fluxes for all regions in the Corn Belt as well as for all regions with grassland; both region types span the area from southern North America up to the US-Canada border. This makes it more likely that a potential flux adjustment is constrained by more than one site, which gives it a stronger meridional coupling. Since meridional mixing is much slower than zonal mixing, stronger meridional coupling forces a larger region to change in the same way. For example adjusting the flux
On the other hand, the overall North American sink of 0.65 Pg C a⁻¹ estimated by CarbonTracker in the 1988 run is 30 % lower than the sink of 0.95 Pg C a⁻¹ in the 2010 run, while in TM5-4DVar the 1988 and the 2010 run differ only by 10 % (0.1 Pg C a⁻¹). The difference between the results for the 2000 and the 2010 runs in North America is on the order of 0.1 Pg C a⁻¹ for both models, but in different directions. So with low observation coverage, the quality of the inversion in either system depends on the exact distribution of the observations. This suggests that with the coverage from 2000, we need to assume a minimum uncertainty of 0.25 Pg C a⁻¹ from only the choice of the inverse method. For 2010 this is down to less than 0.1 Pg C a⁻¹.
The strong reduction of the uncertainty estimate in the North America fluxes of TM5-4DVar in the 2/cont run, despite assimilating only 2 sites in North America, shows the sensitivity of these estimates to the raw number of assimilated observations.It proves that the actual structure of the observational network has to be taken into account when interpreting the reduction of model-estimated uncertainty.
Overall our results show that the current observation coverage in North America allows estimating robust fluxes on continental scales and on the scales of Transcom regions.The improving agreement with increasing observation coverage between both models for the aggregated North American fluxes and the two Transcom regions in North America suggests that increasing the observation coverage allows obtaining robust fluxes on even smaller scales.

Conclusions
Our study evaluates the performance of the data assimilation models CarbonTracker and TM5-4DVar by comparing their a posteriori CO2 concentration fields to measurements and by comparing their a posteriori surface fluxes. We test the sensitivity of the a posteriori CO2 fluxes to model parameters and data coverage. To analyze the impact of the inverse method and the flux representation, the models run in setups which are close to their production settings but use harmonized input data, tracer transport model, prior flux and prior flux covariance estimates. A caveat applies since prior fluxes and prior flux uncertainties cannot be made identical due to differences in how the state vectors of the two methods are set up: CarbonTracker optimizes weekly ecosystem-wide fluxes while TM5-4DVar optimizes monthly fluxes on a regular longitude-latitude grid.
Both inverse models yield CO2 concentration fields of comparable quality. We show that increasing the length of the assimilation time window of CarbonTracker to five bins of 20 days or ten bins of 7 days yields good agreement with observations in Antarctica which are underestimated in summer when using the default setup with an assimilation window of only 5 weeks. With these longer windows, the difference of the bias of the models at non-assimilated measurement sites is lower than the uncertainty of the bias due to the limited number of non-assimilated sites. This has two implications: first, the differences between the a posteriori fluxes provide a lower estimate of the uncertainty due to the choice of the optimization method, and second, a choice between the two systems may be reduced to practical considerations, such as (a) CarbonTracker is easily parallelizable because of the ensemble structure, but (b) TM5-4DVar yields defined uncertainties over long-term flux integrals which have to be approximated in CarbonTracker, or (c) TM5-4DVar requires an adjoint of the transport model, CarbonTracker does not.
The a posteriori fluxes from both models are in good agreement on a global scale, but on continental scales they show significant differences, most noticeably in South America, which has very sparse coverage of observation sites. Investigating the flux time series allows tracing these differences back to spurious flux adjustments in TM5-4DVar for South America due to assimilating observations from a single site in Arembepe, Brazil, along with compensating fluxes in the oceans. We also see a difference in the adjustment of Asian fluxes, but an additional CarbonTracker run with a coarser temporal flux adjustment bin size of 20 days gives similar fluxes in Asia as TM5-4DVar. Here, the flux time series reveal that part of the weaker sink in CarbonTracker with smaller bin size stems from high-frequency changes which cannot be represented with the monthly binning of flux adaptations in TM5-4DVar and the CarbonTracker run with bins of 20 days. The impact of this effect on the fluxes in Asia is 0.5 Pg C a⁻¹.
To better analyze the sensitivity of both models to the observation coverage, we run the models with collections of measurement sites selected by historical availability. In North America, where the change of observation coverage is most pronounced, fluxes estimated with the observation network from 2000 differ by 0.25 Pg C a⁻¹, which can serve as a lower limit for the uncertainty due to changing the method. With the measurement network from 2010, the difference reduces to 0.1 Pg C a⁻¹.
TM5-4DVar has a stronger response to the data coverage than CarbonTracker.This shows that the ecoregion approach in CarbonTracker with its stronger meridional coupling of fluxes and observations makes CarbonTracker less susceptible to changes in the distribution and density of observations than the simple global flux covariance in TM5-4DVar.As such it might be useful to reuse CarbonTracker's spatial flux correlation structure in TM5-4DVar.
Generally, we see sensitivity of the optimized fluxes to the density and distribution of observations which might be particularly important for using satellite data, in which the coverage of observations changes with cloud cover.The improved agreement between both models when adding observation sites indicates that the coverage of observation sites in North America should be sufficient to yield robust fluxes on a continental scale when only considering the uncertainty from the inverse methods and the flux representation.
The Supplement related to this article is available online at doi:10.5194/acp-15-9747-2015-supplement.

Figure 1. Time series of measured and modeled CO2 concentrations from CarbonTracker and TM5-4DVar at Mauna Loa, Hawaii (assimilated weekly flasks), NOAA site code MLO. Also shown are the concentrations obtained from a forward run of the transport model using the a priori background flux estimates.

Figure 2. Histograms of the mismatch between measured and modeled CO2 concentrations for all assimilated measurements using prior fluxes, CarbonTracker optimized fluxes and TM5-4DVar optimized fluxes. The histograms show residuals for one year (3 April 2009 to 2 April 2010) which are normalized by the estimated representativeness error. The line overlying the histograms is a fit of a Gauss function to the histogram. The parameters in the top left show the bias and standard deviation of the Gaussian. The bottom right shows the number of measurements which were accumulated into the histogram.

Figure 4. Histograms of the mismatch between measured and modeled CO2 concentrations for all non-assimilated samples using prior fluxes, CarbonTracker optimized fluxes and TM5-4DVar optimized fluxes. The histograms show residuals for one year (3 April 2009 to 2 April 2010) which are normalized by the estimated representativeness error. The line on top of the histograms is a fit of a Gauss function to the histogram. The parameters in the top left show the bias and standard deviation of the histogram. The bottom right shows the number of measurements which were accumulated into the histogram.

Figure 6. Time series of measured and modeled CO2 concentrations at Syowa Station, Antarctica, for CarbonTracker runs with different lengths of the assimilation time window. The baseline run uses an assimilation window of 5 × 7 days. Color coding of shorter and longer assimilation windows follows the legend (lag × cycle in days).

Figure 7. Global fluxes from the baseline runs of TM5-4DVar and CarbonTracker. The prior is shown in the binning from CarbonTracker. The uncertainties shown for CarbonTracker are aggregated spatially but not temporally. As such they represent the uncertainty of the estimated fluxes, calculated directly from the ensemble. These uncertainties are excluded from the annually aggregated graphs, because there is no method for temporally aggregating the uncertainties in a way which is comparable to the uncertainties estimated by TM5-4DVar.

Figure 8. Fluxes from TM5-4DVar and CarbonTracker aggregated on continental scale. The uncertainties for TM5-4DVar are calculated following Basu et al. (2013). The error bars for the prior are taken from TM5-4DVar. As written in Sect. 5.1, we show no uncertainties for CarbonTracker, because the aggregation of uncertainties from the weekly to the yearly scale is not clearly defined.

Figure 9. CO2 surface fluxes from April 2009 to April 2010 in South America and the Indian and Pacific oceans. Only the time series for South America shows the CarbonTracker noreject run, because it follows the CarbonTracker baseline in the other regions. The uncertainties shown for CarbonTracker are aggregated spatially but not temporally. See the caption of Fig. 7 for details.

Figure 11. CO2 surface fluxes during summer 2009 in Asia. The prior forward run shows the prior fluxes aggregated to the bin size of the weekly CarbonTracker scaling factors.

Figure 12. Globally aggregated surface fluxes estimated by the model runs indicated in the legend. In all aggregated flux bar charts, the uncertainties are estimated by TM5-4DVar.

Figure 13. Fluxes for CarbonTracker and TM5-4DVar from April 2009 to April 2010 separated into the two Transcom regions in North America.

Figure 14. Visualization of the weight of the measurement sites which are assimilated in North America in the respective runs.
assimilated observations yields an estimate of how the DA systems succeed in modeling CO 2 concentration fields in regions where the methods do not assimilate observations.

Table 1. Yearly global CO2 fluxes and uncertainty (standard deviation) from the prior forward run and from the baseline runs of TM5-4DVar and CarbonTracker. The § column lists important notes.
Figure 10. Time series of CO2 concentration in Arembepe, Brazil. The two "without ABP" runs show the concentrations when the models do not assimilate data from the Arembepe site. CarbonTracker noreject shows the concentrations for CarbonTracker with disabled outlier rejection. The time series ends after January 2010 because data at Arembepe are only available in obspack PROTOTYPE v1.0.2 2013-01-28 from NOAA Environmental Sciences Division and Oak Ridge National Laboratory (2013) until that point.