Carbon cycling of European croplands: A framework for the assimilation of optical and microwave Earth observation data

Worldwide, cropland ecosystems play a significant role in the global carbon (C) cycle. However, quantifying and understanding the cropland C cycle are complex, due to variable environmental drivers, varied management practices and often highly heterogeneous landscapes. Efforts to upscale processes using simulation models must resolve these challenges. In this study we show how data assimilation (DA) approaches can link C cycle modelling to Earth observation (EO) and reduce uncertainty in upscaling. We evaluate a framework for the assimilation of leaf area index (LAI) time-series, derived from EO optical and radar sensors, for state-updating a model of crop development and C fluxes. Sensors are selected with fine spatial resolutions (20–50 m) to resolve variability across field sizes typically used in European agriculture (1.5–97.6 ha). Sequential DA is used to improve the canopy development simulation, which is validated by comparing time-series of net ecosystem exchange (NEE) predictions to independent eddy covariance observations at multiple European cereal crop sites. From assimilating all EO LAI estimates, results indicated adjustments in LAI and, through an enhanced representation of C exchanges, the predicted at-harvest cumulative NEE was improved for all sites by an average of 69% when compared to the model without DA. However, using radar sensors, being relatively unaffected by cloud cover and more sensitive to the structural properties of crops, further improvements were achieved when compared to the combined, and individual, use of optical data. Specifically, when assimilating radar LAI estimates only, the cumulative NEE estimation was improved by 79% when compared to the simulation without DA. Future developments would include the assimilation of additional state variables, such as soil moisture.


Introduction
Agricultural intensification over the past 40 to 50 years, achieved by 'Green Revolution' technologies and an increase in cropland area (Foley et al., 2005), has resulted in an approximate doubling in world grain harvests (Tilman et al., 2001). Through changes to carbon (C) storage and emissions, associated with management activities, croplands also provide opportunities for climate change mitigation (Power, 2010). The European Union , with around half of the land area occupied by croplands (EU, 2009), presents a mosaic of crop varieties, phenologies and growth periods due to spatiotemporal variations in soil and climatic conditions, together with local and regional production requirements. The resulting broad range of cropland management techniques (e.g. tilling intensity, use of fertilisers and irrigation) causes uncertainty when generalising the impact of specific activities on crop C budgets (Osborne, Saunders, Walmsley, Jones, & Smith, 2010).
There is considerable uncertainty involved in quantifying C dynamics, particularly when identifying whether, and under what conditions, landscapes act as sources or sinks for C (Quaife et al., 2008). Flux towers provide measurements of net ecosystem exchange (NEE) at local scales (~1 km 2 ) via eddy covariance (EC) technique (Baldocchi, 2003), which directly measures biosphere-atmosphere CO 2 exchanges. However, complex terrain and heterogeneous spatial distributions of vegetation within the sensor 'footprint' undermine assumptions of the EC technique, introducing uncertainty (Hollinger & Richardson, 2005). Furthermore, towers are sparsely distributed and data-gaps are always present. Therefore, a complete analysis of crop C dynamics and yield relies on simulations using process-based models, often linked to C flux observations for validation. The models require reliable input parameters, including management interventions, plant traits, meteorological driving data and soil properties at points within the model domain. Therefore parameter estimates are the largest source of model uncertainty (Launay & Guerif, 2005) and a particular challenge is to derive these parameters across the model spatial and temporal domains.
Earth observation (EO) data can be combined with models to provide objective updates of state variables describing crop condition over landscape scales. This model-data fusion can be achieved via data assimilation (DA) algorithms, which assume that estimates from neither observations nor models are perfect but a combination of the two, weighted by a specified uncertainty, will produce more realistic model updates (Williams, Schwarz, Law, Irvine, & Kurpius, 2005). Researchers have demonstrated how DA can link regional-scale models with moderate spatial resolution EO sensors (250 m to 1 km, e.g. de Wit & van Diepen, 2007;Sus, Heuer, Meyers, & Williams, 2012;Wu et al., 2012). These sensors have high temporal resolutions (daily to weekly) that can capture the key developmental stages of crop growth (Launay & Guerif, 2005). However, this spatial resolution is typically insufficient for retrieving biophysical variables at field sizes b 25 ha (Doraiswamy et al., 2004), which are typical in Europe. Smaller fields require finer resolution sensors (e.g. 20 m), which is at the expense of lower temporal resolution that potentially leads to gaps in acquisitions during critical growth stages. Moreover, cloud cover affects the availability of optical EO imagery, resulting in further reductions in observations. The spatial/temporal resolution trade-off can be addressed by also using Synthetic Aperture Radar (SAR) sensors. SAR provides fine-scale data and is unaffected by cloud cover. Furthermore, as optical instruments are sensitive to biochemical properties of crops, SAR sensors are more responsive to crop water content and structural elements, such as the size and shape of leaves (Shang, McNairn, Champagne, & Jiao, 2009).
In this paper we demonstrate a framework for the assimilation of leaf area index (LAI) observations, retrieved from optical and SAR sensors, to update the simulated LAI of a process-based model of cereal crop C budgets for European landscapes. Our specific objectives were to: (1) determine the potential of the DA technique for improving the simulated daily NEE fluxes and the at-harvest cumulative NEE of wheat crops at the field-scale. The accuracy of the DA, when assimilating optical and SAR LAI estimates individually and synergistically, is evaluated by comparing model outputs to independent observations from flux towers at European sites. (2) Establish if the same methodology is applicable to a series of field sites across Europe, to improve the relationship between simulated and observed values, thereby providing a proof-of-concept for future spatial upscaling activities. Innovations of this study include the sequential DA of data derived from high resolution optical and SAR sensors, thus increasing the number of available observations. It is hypothesised that the multi-sensor approach improves the model performance at the field-scale by more effectively tracking the development of cereal crops, which is critical for seasonal carbon balances .

Study sites and data
This study investigates one winter wheat (Triticum aestivum) growing season at six different European field sites (Fig. 1). These sites were selected from the CarboEurope Integrated Project ecosystem database (CarboEurope-IP, www.carboeurope.org), which include those used by Sus et al. (2010) and Wattenbach et al. (2010). They are located in France (Auradé, Lamasquère and Grignon), Germany (Klingenberg and Gebesee) and Switzerland (Oensingen). As well as different management techniques, they also vary in latitude (43.5 to 51.1°N) and longitude (1.1 to 13.5°E) and show significant variations in temperature (annual average 6 to 11°C) and precipitation (mean annual values from 327 to 1051 mm). Consequently, different growing periods were observed at each site, in terms of both the sowing dates and the overall length of the growing season (sowing to harvest), ranging from 245 to 346 days (Table 1). Field sizes vary from 1.5 to 97.6 ha, and the terrain across each field site can be considered level to very gently sloping.
Field data available from CarboEurope-IP included daily NEE flux measurements using the EC technique during the growing season at each site. These NEE flux data consisted of aggregates of half-hourly observations. Gap-filling was applied to the original data using the Marginal Distribution Sampling method (Reichstein et al., 2005). However this dataset was filtered such that only days that consisted of aggregated original data (i.e. days consisting of 48 observations without gap-filling) were used in this analysis, consequently between 15% and 45% of the values were rejected. Meteorological observations collected at each site, used to drive a cropland C cycle model, included hourly radiation, temperature, wind, humidity and precipitation. Cartographic data and descriptions of the physical extents of the sites from CarboEurope-IP were digitised and used in the processing and extraction of EO observations. Additional site information, recorded at dates during the crop growing seasons, include soil data on texture (i.e. clay/ sand ratio), site management (e.g. sowing/harvest dates, applied fertilisers and yield) and biophysical measurements, such as the leaf area index (LAI) used to calibrate EO data. These time-series observations of LAI were available for all sites except for Gebesee.

Earth observation data and pre-processing
Based on a completely cloud-free coverage over the field sites, 18 SPOT (Système Pour l'Observation de la Terre) images and 28 ERS-2 (European Remote Sensing) SAR scenes (PRI data format) were selected to cover all sites ( Table 1). The level 2A processed SPOT data consisted of a combination of SPOT-2 and SPOT-4 data. Multi-temporal SPOT scenes were sourced for the growing seasons at all sites with the exception of Oensingen where sufficiently cloud-free images were unavailable. Each SPOT scene had a spatial resolution of 20 m and included multi-spectral measurements, centred on green (500-590 nm), visible red (610-680 nm) and near-infrared (790-890 nm) wavelengths.
Pre-processing of the SPOT scenes included geometric correction of one scene per site to the Universal Transverse Mercator (UTM) coordinate system, using the digitised field boundaries to within one SPOT pixel (20 m) and the nearest-neighbour resampling algorithm. Using the geo-referenced scene, image-to-image geometric registration was applied to the remaining multi-temporal scenes.
The SPOT image pixel values were converted to apparent at-sensor reflectance using the absolute calibration coefficients provided in the header file of the scenes. The time-series of images for each site were normalised to reduce the effects of variable sun angle, which involved dividing the pixel values by the sine of the solar elevation angle also reported in the header of the image data. Atmospheric normalisation was then applied to the multi-temporal images. Since no field-based atmospheric data were available, the relative image-based correction technique of pseudo-invariant targets was applied (Lu, Mausel, Brondizio, & Moran, 2002). Per site, this method involved normalising all multi-temporal images to a standard reference scene, chosen as the most cloud-free. Invariant targets, selected using the criteria outlined in Eckhardt, Verdin, and Lyford (1990), included man-made features and inland water bodies. Once the target features were identified in the reference scene, linear regression was carried out to correct for the remaining scenes band-by-band.
Pre-processing of the ERS-2 SAR (C-band: VV-polarisation) scenes included derivation of the backscatter coefficient (σ°) expressed in decibels (dB), using processing steps described in Laur et al. (2004) and accurate to within ±0.4 dB. This process includes corrections for range spreading losses, application of absolute calibration constants and terrain corrections to derive local incident angle using the SRTM (Shuttle Radar Topography Mission) digital elevation model. The ERS scenes were geometrically registered to UTM, to match the boundaries of the cropland sites to within one ERS-2 PRI pixel (12.5 m).

LAI retrieval from Earth observation data
Simplified empirical retrieval algorithms were used for the estimation of LAI from EO data. The mean within-field reflectance value, with pixel sample sizes between 226 (Grignon) to 695 (Klingenberg), was extracted from the SPOT scenes for each band and used to calculate the Weighted Difference Vegetation Index (WDVI, Clevers, 1988). The WDVI is an orthogonal index used to reduce the effect of soil reflectance, which influences the relationship between the scene reflectance and LAI. This relationship is specifically related to moisture, as reflectance decreases with increasing soil moisture content. However this decrease is independent of wavelengths between 400 and 1000 nm (Clevers, 1988). For each SPOT scene, the WDVI was calculated using: The R NIR and R VIS parameters correspond to the reflectance values in the near-infrared and visible red sensor wavebands respectively. γ is the ratio of reflectance in the near-infrared and visible red wavebands (R NIR :R VIS ) for bare soil (i.e. before crop emergence). Across all cropland sites the γ value ranged from 0.75 to 1.98.
The LAI was then estimated empirically based on a linear relationship between WDVI and the ground measured LAI at all sites. Therefore, a single calibration was determined that was valid for estimating LAI from WDVI for all sites, thus allowing for the potential use in broader spatial upscaling studies of European croplands. This calibration involved matching each WDVI measurement to an in-situ measurement made on a date corresponding to the SPOT acquisition. A sensitivity analysis was carried out to investigate any temporal disparity of development stages between the measured LAI and SPOT WDVI dates to within ±10, ±7 and ±5 days. It was found that the inclusion of more measurements from accepting temporal difference of ±10 days reduced the root-mean-square error (RMSE) of SPOT estimated LAI compared to ground measurements from 0.70 (±5 days) to 0.24 (±10 days), with the corresponding negative bias being less than 1.31 m 2 m −2 in both cases.
The mean within-field σ°value was extracted from the calibrated ERS scenes. The number of pixels used in this averaging procedure depended on the size of the field sites and varied significantly from 49 (Oensingen) to 4845 (Gebesee). The LAI values were then estimated from this mean σ°by empirically modelling the relationship between σ°and the corresponding measured LAI value ±10 days of ERS acquisition dates. Table 1 List of study sites including the field sizes, length of growing period (from sowing to harvest), average annual temperature (Av. temp.), mean annual precipitation (Precip.) for wheat crop seasons during the years 2005 to 2007 and the number of available multi-temporal SPOT-2/4 and ERS-2 scenes (adapted from Sus et al. (2010)

. Ecosystem model description
The C cycle was simulated at the cropland sites using the Soil-Plant-Atmosphere (SPA) model (Williams et al., 1996, Williams, Law, Anthoni, & Unsworth, 2001, with modifications for carbon allocation (Williams et al., 2005) and croplands . SPA simulates the ecosystem carbon cycle and water-balance at the point-scale at fine temporal (half-hourly) and vertical scales (ten canopy and twenty soil layers). Leaf-scale processes, such as photosynthesis and transpiration, are scaled up to make canopy-scale predictions, linked to a radiative transfer scheme tracking absorption, reflectance and transmittance of direct and diffuse irradiance. Photosynthesis and transpiration are linked at leaf level by a model of stomatal conductance. Stomatal conductance is varied to optimise C uptake, but also to maintain leaf water potential above a minimum value, explicitly linking vapour phase losses with hydraulic transport. Modifications were made to SPA, to develop SPA v2-Crop (referred to henceforth as 'SPAc'), which involved defining a crop-specific C partitioning scheme based on empirical observations of crop growth cycles (Penning de Vries, Jansen, ten Berge, & Bakema, 1989). The C partitioning scheme describes the allocation of C amongst the roots, leaves, stem and storage organ (i.e. grain) pools as a function of development stage. The development stage is calculated as the accumulation of daily development rates, a function of temperature, photoperiod and vernalisation (until emergence). SPAc has previously been tested and parameterised by Sus et al. (2010) over the same European sites (Fig. 1) and growth seasons using field data.

Assimilation algorithm
By assimilating EO LAI values, the model can propagate this information throughout the model state vector, according to error covariance, and forward to subsequent time-steps when EO data is unavailable. The ensemble Kalman Filter (EnKF, Evensen, 2003;Williams et al., 2005) DA algorithm is used to produce a probabilistic estimate of LAI by combining the forecast and observed values, which are weighted according to the relative uncertainty quantified and assigned to the modelled and EO LAI values. This updated LAI is then used to update the full model state vector containing all above and below ground biometric variables. The EnKF approach represents the model and observation error statistics with a Monte Carlo ensemble (i.e. probability distribution) of state variables, where the mean of the ensemble is the best estimate and the error covariance is determined by the variance of the state variables. For each ensemble member the basic analysis steps for the EnKF (Eq. 2, de Wit & van Diepen, 2007) can be written as: A a represents the analysed state vector updated by the forecasted state A f . P e and R e represent the model and observation covariance matrices. H is the observation operator, which consists of a probability matrix that relates the model state vector to the data, and (D − HA) represents the innovation vectors.

Ecosystem model setup and determination of uncertainty
Initial simulations with the SPAc model were undertaken without DA (referred to henceforth as the 'forward mode') using only the input vegetation and soil parameters and meteorological driver data available for each site. A detailed overview of the parameters fitted in SPAc, along with nominal values and references, can be found in Sus et al. (2010).
Experimentation was then carried out using the EnKF algorithm (referred to hereafter as 'EnKF DA') to assimilate the EO LAI estimates into SPAc at times-steps corresponding to 12 noon on the same day as the EO acquisitions (i.e. no other assimilations were performed in the remaining 23 h of the day).
As the EnKF technique is based on the assumption that both SPAc and EO data are uncertain descriptions of the cropland ecosystems, it was necessary to ascertain the relative uncertainties of the model and EO retrieved LAI values. Ideally, observation error variances are identified from the instrument specifications (Williams et al., 2005), however pre-processing techniques applied to the SPOT and ERS LAI estimates, including atmospheric and radiometric corrections, introduced additional uncertainty into the EO data (Quaife et al., 2008). The uncertainty of the EO data was characterised based on the random error determined from the comparison of the ground measured LAI against the modelled LAI from fits to SPOT and ERS data. The normalised RMSE from the model misfit was calculated for both sensors, being 38% for the SPOT and 24% for ERS data. The model variance was then quantified by an iterative procedure where the prescribed value was adjusted until at least 68% of the ground measured LAI were within ±1 standard deviation of the daily LAI predicted by the EnKF DA (Williams et al., 2005).
The influence of the ensemble size representing the mean LAI values was evaluated by assessing the outputs from using 50, 100 and 250 members based on the RMSE between the observed and modelled daily NEE. It was found that an ensemble size of 50 members reduced the RMSE between observed and modelled NEE with little or no improvements noticed beyond this size. This observation is also consistent with the study by Sus et al. (2012) for the assimilation for MODIS LAI and with de Wit and van Diepen (2007) for the assimilation of soil moisture estimates. Therefore, the EnKF experiments were carried out using model outputs from 50 members only.
Comparisons were carried out between the LAI values simulated by SPAc, in the forward mode and with DA, to individual ground measurements of LAI available at each site. Further experimental analysis was conducted to assess the DA of EO measurements for improving the simulation of daily C fluxes, at-harvest cumulative NEE and yield at each site, including the assimilation of ERS LAI and SPOT LAI estimates both synergistically and individually, by comparison to independent eddy flux data.

LAI retrieval results
There was a significant correlation (R 2 = 0.62, P b 0.05) between all multi-temporal WDVI values at Auradé, Grignon, Lamasquère and Klingenberg and the corresponding (within ±10 days) ground measured LAI values (Fig. 2a). For the other sites SPOT imagery was unavailable (Oensingen) or ground measured LAI was not recorded (Gebesee). Overall, the LAI values derived from the WDVI (RMSE 0.60) showed a slightly negative bias when compared to measured values with a mean bias (i.e. mean SPOT LAI − ground measured) of 0.05 m 2 m − 2 . However, the mean error for Klingenberg (0.87 m 2 m − 2 ) had a much stronger positive bias when compared to the other sites.
A significant exponential relationship (R 2 = 0.76, P b 0.05) was modelled between all mean ERS σ°values and ground LAI measurements within ± 10 days of the ERS acquisition (Fig. 2b). Using the coefficients derived from this exponential fit the LAI was then estimated: where A = 0.087 and B = −0.257. As was the case with the WDVI calculation, this exponential fit was determined globally between all σ°a nd measured LAI values. Overall, the LAI values retrieved from the ERS σ°using Eq. (3) were overestimated, with a mean bias of 0.15 m 2 m −2 (RMSE 0.54), compared to measured LAI. The LAI and σ°relationship weakens and becomes negatively biased with decreasing σ°below around −14 dB (Fig. 2b). This change is significant for Grignon, where the σ°is −14.8 dB, and the estimated LAI value is around 1.6 m 2 m −2 less than that of the measured.

Ecosystem model results: forward mode
Some clear differences in the magnitude of the simulated peak LAI values in the forward mode can be seen when compared to the ground data (Fig. 3). Particularly, this can be seen for Auradé where the simulated LAI is overestimated by 1.3 m 2 m −2 and Grignon where the LAI is underestimated by 1.6 m 2 m −2 .
The overall timings of the peak LAI when simulated by SPAc in the forward mode generally closely matched the ground measured LAI. The exception was Klingenberg, where the simulated peak LAI value was around 25 days later than the ground measurements. However, this apparent discrepancy in timing could be due to a lack of field measurements around the time of maximum LAI.
The overall seasonal trends and timings of the observed NEE fluxes of crops were reproduced in the SPAc forward mode simulation (Fig. 4). For all sites there is a progressive decrease in NEE from early in the season onwards in response to an increase in C uptake (i.e. increase in sink strength) as the crops develop. The date of minimum NEE (i.e. peak C uptake) varies between different sites ranging from early May (Auradé (a), Lamasquère (c) and Oensingen(e)) to early-mid June (Klingenberg (d)). After the maximum C uptake the observed and simulated values show a relatively sharp increase in NEE, corresponding to crop maturity, and become a source of C at or around the harvest date.

Simulated LAI
The DA decreases the simulated peak LAI values for Auradé (a), Grignon (b), Lamasquère (c) and Oensingen (e) by an average of 0.18 m 2 m − 2 . However, for sites Klingenberg (d) and Gebesee (f), where the EO estimated peak LAI was higher than the simulated value, the DA significantly increased this maximum value by a mean of 1.84 m 2 m − 2 .
The timing of the peak LAI was also adjusted by the DA when compared to the forward mode. This adjustment was most notable for Lamasquère, with the maximum LAI being 12 days later with DA, and Klingenberg and Gebesee being 9 and 7 days earlier respectively.
With adjustments in both the magnitude and timing of the simulated LAI, a linear fit (R 2 ) between the observed and simulated LAI ( Table 2) showed that the EnKF DA improved the overall modelled and ground measured LAI relationship by an average of 43% when compared to the forward mode. This improvement was noted for all sites with the exception of Auradé where the DA reduced the strength of the relationship by 19%.

Simulated C fluxes
The assimilation of all EO LAI estimates resulted in some clear adjustments in the magnitudes of daily fluxes (Fig. 4). Specifically, based on an analysis of the NEE residuals (i.e. observed-modelled) the sink strength is reduced at Grignon, Oensingen and Gebesee by an average of 1.85 gC m −2 per day. Furthermore, the residuals show a progressive increase in the difference between the forward mode and DA estimates (i.e. forward mode − DA values) from the beginning of the year to an average of 1.12 gC m −2 at the date of maximum C uptake, whereas this average difference was only 0.02 gC m −2 60 days earlier.
With regard to the R 2 between the observed and modelled NEE values (Table 2), the DA strengthened this relationship by an average of 6% for sites Auradé, Grignon, Lamasquère and Klingenberg. However, the observed-modelled relationship was weakened by an average of 4% for Oensingen and Gebesee. The slope of this linear fit was also adjusted by DA, and for the majority of these sites, with a value less than 1, this value was increased by 9%, from 0.69 to 0.75 with DA.

Estimated yield
The simulated yield statistic (i.e. total mass of C allocated to the storage organ at harvest) was underestimated at all sites when compared to observed values (Table 3). When comparing the average difference across the sites, the magnitude of this underestimation with the EnKF DA (47%) was greater than that in the forward mode (38%). Specifically, in the DA simulation there was an improvement of 10% between measured and modelled yield for Klingenberg, however this was offset by the yield being further underestimated by 35% at Aurade when compared to the forward mode.

Synergistic and individual DA comparison of ERS and SPOT results
Assimilating the ERS and SPOT LAI estimates synergistically reduced the RMSE of the forward mode daily NEE simulations by an average of 10% for three out of the six sites (Table 4). However, the assimilation of ERS LAI estimates alone improved the simulation for five out of six sites, by an average of 13%. Gebesee was the exception to this as the assimilation of EO LAI in all cases appears to increase the RMSE when compared to the forward mode.
With the application of assimilation, the increase in agreement between the model and observations is also reflected by the cumulative NEE at harvest (Table 5). The cumulative NEE in the forward mode was lower than observed for nearly all sites by an average of  107 gC m −2 (27%). The combined assimilation of all EO LAI estimates showed an average difference between the measured and modelled cumulative NEE of 33 gC m −2 (8%), representing an overall error reduction of 69% when compared to the forward mode. Furthermore, although the RMSE of the daily simulations of NEE at Gebesee was not reduced with DA, the synergistic assimilation of both ERS and SPOT LAI improves the estimated cumulative NEE by over 50% when compared to the forward mode at this site. However, with the individual assimilation of ERS LAI alone the average cumulative NEE was only 22 gC m −2 (6%) lower than measured, thereby providing a greater improvement of 79% when compared to the forward mode estimate. Whereas, with the assimilation of SPOT LAI alone the cumulative NEE was 75 gC m −2 (19%) lower than measured, representing an overall improvement of only 30%.

LAI retrieval and assimilation
With the image processing techniques applied in this research, a reasonable linear relationship was established between all WDVI values and LAI ground measurements within ±10 days of the SPOT  acquisitions. This relationship suggests that a single empirical approximation can adequately describe the relationship between WDVI and LAI across multiple European wheat cropland sites.
The relationship between all ERS σ°and ground measured LAI (within ±10 days) was approximated by a single exponential function. This strong exponential relationship is consistent with research by  Macelloni, Paloscia, Pampaloni, Marliani, and Gai (2001) for narrow leaf crops. Furthermore, the RMSE between the σ°estimated LAI and measured LAI was reduced by 11% when compared to the LAI estimated from the WDVI. This improvement highlights some of the operational advantages of SAR over optical sensors for multi-temporal analysis. Specifically, SAR sensors are unaffected by localised atmospheric conditions, thus making them less site specific. Additionally, since it was not necessary to apply atmospheric corrections to the ERS data, this prevents the inclusion of the potential uncertainties involved with this pre-processing step.
The higher sensitivity of σ°to LAI can be attributed to the frequency and polarisation of the ERS sensor. Specifically, C-band frequency (5.3 GHz) provides significant amounts of σ°even for moderate crop growth, i.e. early in the growth season (Baronti et al., 1993). Paloscia (2002) mentions that VV-polarised σ°from small-leaf crops (e.g. wheat) decreases with increasing biomass. This is partly due to the attenuation of the VV-polarised signal by the predominantly vertical orientation of the wheat crop structure, including the stem and ears (Ferrazzoli, Guerriero, Quesney, Taconet, & Wigneron, 1999).
The use of EO LAI for adjusting the peak simulated LAI value is clearly dependant on the timing of EO acquisitions. This sensitivity was particularly the case for Grignon and Klingenberg where the day number of peak EO LAI approximately coincides with the maximum simulated LAI value. For Grignon, the maximum LAI in the forward mode is similar to that of the ERS LAI acquired around the same day, therefore the peak LAI value remains the same between the forward mode and the EnKF DA. The SPOT LAI value for Klingenberg, derived around the same date as the forward mode simulated peak LAI results in an increase in maximum LAI when the EnKF is applied.
At sites where the forward mode peak LAI values and corresponding NEE sink strengths were higher than observed, and were subsequently reduced by assimilating EO LAI estimates, it would have been expected that the forward mode simulation overestimated the grain yield. However, although the DA technique does reduce the predicted yield at these sites, in some cases this had the consequence of underestimating this value even further when compared to observations. This suggests weaknesses in the structural representation and calibration of yield formation in SPAc. As recommended by Sus et al. (2010), further constraints should be made to the C allocation parameterisation, particularly for the allocation to the roots.

The quality of simulated C fluxes
Generally, the modelled daily NEE matched the magnitude of the observed values more closely with the assimilation of EO LAI estimates. For three out of the six sites the overall representation of wheat C flux dynamics by SPAc was improved with a mean reduction in RMSE of 10%. However, across the majority of sites the assimilation of all EO LAI estimates was more significant at reducing the simulated sink strength, suggesting that SPAc is slightly negatively biased when compared to observations. These findings are similar to those reported in Sus et al. (2010) and also have the consequence of a lower cumulative NEE (i.e. overestimate of net C uptake) at harvest for most sites. However, for the majority of sites, an increase in the slope value of the measured-modelled regression line with the assimilation of all EO LAI estimates suggests that the assimilation technique is also successful at reducing model biases.
A greater improvement in the simulation was achieved with the assimilation of ERS LAI estimates alone, as opposed to synergistically (i.e. with SPOT estimates). This improvement is likely due to a stronger correlation between the ERS σ°and LAI, when compared to that of the WDVI and LAI. This enhanced simulation when using the ERS LAI only is reflected both in the RMSE of daily fluxes and the at-harvest cumulative NEE when compared to observations. The extent to which the difference between modelled and observed values are minimised, particularly around the period of peak C uptake, is also dependant on the timings of the assimilated LAI estimates. This is evident for sites Grignon, Lamasquère and Klingenberg where LAI values are assimilated on days that approximately correspond to the time of simulated peak LAI. Therefore this maximum LAI value is adjusted, which then varies the magnitude of daily NEE values accordingly. This was also discussed in Launay and Guerif (2005), where the model performance for crop yield estimates was improved when the timings of assimilated EO acquisitions coincided with growth stages in the vegetative phase when crop condition was shown through canopy development.

Is the model framework valid for multiple cropland sites?
An overall improvement in daily C flux modelling was achieved for up to five out of six European cropland sites. This suggests that the techniques reported here, including parameterisation and LAI retrieval calibrations, are sufficiently accurate and can reliably enhance the forecasting of wheat crop C fluxes at multiple European sites under different environmental conditions. A specific example of this is demonstrated by the precision of the simulated NEE at Lamasquère and Klingenberg. These sites represented the largest variation in latitude and average temperature, along with growth seasons in different years. However, with the DA of ERS LAI alone, the RMSE of the forward mode simulated NEE fluxes is reduced by 5 and 10% for Lamasquère and Klingenberg respectively, whereas that of the at-harvest cumulative NEE was improved by 49 and 43%. Further proof of concept regarding the multiple site applicability of this framework, including the derivation of LAI, is demonstrated at Gebesee. Ground measured LAI was not available for this site, therefore LAI values were estimated from EO data only using retrieval algorithms calibrated using measurements from the remaining sites. The synergistic assimilation of ERS and SPOT estimates had the result of improving the at-harvest cumulative NEE value by around 50% when compared to the forward mode.

Recommendations for further Earth observation data and ecosystem model developments
As this study evaluates the assimilation of LAI estimated from ERS and SPOT EO sensor data, an intrinsic area of development would be to assess the model accuracy with measurements from alternative sensors. Moreover, it is expected that the model framework, including the EnKF DA and LAI retrieval techniques, is sufficiently versatile to facilitate measurements from sensors operating at different spatial and temporal resolutions to those used in this study. This multi-sensor framework would also be appealing for future satellite missions, for instance ESA's Sentinel-1 (Torres et al., 2012). Research should also focus on updating additional model state variables to further improve the precision of the model. Specifically, soil moisture measurements could enhance the simulation of water-balance, as demonstrated in de Wit and van Diepen (2007), and improve the simulation of root allocation. Soil moisture estimates can also be retrieved from dedicated EO sensors, such as the Soil Moisture and Ocean Salinity (SMOS) sensor (Kerr et al., 2012) and the planned Soil Moisture Active Passive (SMAP) mission (Entekhabi et al., 2010).
The model framework was successfully calibrated and applied at the point-scale using within-field mean LAI estimates. Future research could involve investigating the spatial implementation of this modelling approach. Furthermore, if the current model framework was applied spatially over large areas, where input data regarding spatially heterogeneous soil and meteorological conditions are typically unavailable, it is anticipated that the integration of EO observations would become more valuable for updating state-variables.

Conclusions
A technique for simulating cropland C dynamics has been presented and evaluated over six European cropland sites with varying environmental conditions. The framework consisted of successfully deriving LAI estimates from SPOT and ERS satellite measurements using empirical retrieval algorithms that were calibrated using ground measured values. Generally, when compared to the WDVI values calculated from SPOT imagery, a stronger exponential relationship existed between all ERS σ°and ground measured LAI when applied across all sites.
The LAI simulated by the SPAc model of crop C fluxes was updated using EO LAI estimates via the EnKF sequential DA algorithm. The modelled outputs were compared to ground LAI measurements and NEE observations available at the study sites. For three out of the six study sites, the assimilation of all EO LAI estimates (i.e. ERS and SPOT) resulted in an average decrease in RMSE of around 10% when comparing the simulated and observed daily NEE values. However, with the assimilation of ERS estimated LAI only, the RMSE value is reduced by an average of 13% for the majority of the sites.
Further improvements to the simulation were strongly indicated by the increased accuracy of the cumulative NEE at harvest value. For most sites, without DA this value was consistently lower than observed by an average of 107 gC m −2 (27%) (observed-modelled) and so the overall sink strength was overestimated. However, with assimilating all LAI estimates this value was overestimated by 33 gC m −2 (8%) and only 22 gC m −2 (6%) when assimilating the ERS LAI values individually. The result highlights weaknesses in the SPAc parameterisation related to allocation to roots and storage organs; nonetheless it is concluded that this DA approach, particularly the use of radar sensors alone, provided a superior means of quantifying the overall extents to which croplands are sources or sinks of C at harvest.
Specific refinements should be made to the C allocation scheme as a function of development, with the overall aim of improving the prediction of harvested yield. Such changes would also allow for an improved crop C simulation, not only in the contexts of C cycling and climate change, but also crop yield forecasting and food security. Future studies could also focus on improving the modelling of C fluxes by assimilating additional state variables from different EO satellite sensors. Furthermore, owing to the intrinsically high variability in soil and meteorological conditions over large areas, it is expected that a technique allowing for the spatial implementation of the current framework would rely more heavily on the assimilated EO measurements.