Impacts of reduced model complexity and driver resolution on cropland ecosystem photosynthesis estimates

Landscape and regional estimates of crop photosynthesis are required to support research into food security, carbon (C) cycling and land surface processes. Quantifying C uptake by cropland ecosystems is complicated by spatial heterogeneity. A major challenge is to upscale the detailed understandings embodied in process models that have been validated at specific sites with high resolution inputs. At landscape scales the input requirements for such complex models are generally unavailable (e.g. site specific parameters, hourly meteorological data), and the computing demands are prohibitive. We demonstrate a simplified crop C aggregated canopy model (ACM) predicting daily photosynthesis, requiring minimal parameters. This simple model emulates a high resolution model (SPAc, half-hourly time-steps; simulating leaf to canopy processes) whilst using coarser-scale (daily) drivers. Based on the SPAc model outputs, Bayesian inference is used to calibrate the simple photosynthesis model scalar coefficients at eight European cereal crop sites. We test whether a single calibration, generated from only four of the sites (i.e. calibration sites), is effective across all sites (i.e. including independent validation sites). We further investigate the error introduced by using regional meteorological drivers over local observations. We show that, compared to photosynthesis estimated from eddy covariance at the sites, the simple model produced comparable results to the complex model: both models explained a similar proportion of daily variability in photosynthesis (mean R² = 0.78 for ACM, 0.77 for SPAc), and had similar model error (mean RMSE = 2.89 g m⁻² d⁻¹ for ACM, 3.20 g m⁻² d⁻¹ for SPAc). Thus, the simple model, which has much reduced computational requirements, shows no reduction in model reliability and offers a simple means to upscale a critical process. We discuss the importance of the simple model in regional to continental-scale data assimilation schemes.
Crown Copyright © 2015 Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Croplands, together with pastures, account for around 38% of the Earth's ice-free land area (Foley et al., 2005, 2011). Crop ecosystems are also entirely managed, with farming practices being applied on a range of temporal and spatial scales. The variability of human intervention causes significant uncertainty when investigating feedbacks between climate and the crop C balance (Porter and Semenov, 2005; Reichstein et al., 2013). However, our understandings of feedbacks between C fluxes, anthropogenic drivers and climate are limited (Smith et al., 2010). The availability of more accurate estimates of C exchanges between the terrestrial biosphere and the atmosphere at regional to global scales strategies (Ciais et al., 2011; Osborne et al., 2013; Challinor et al., 2014).
Discrepancies between modelled and observed fluxes are due to errors in data (including eddy covariance (EC) measurements and meteorological drivers) and model uncertainties, such as poorly calibrated parameters, errors in initial state estimates and uncertainties in the representation of ecosystem processes (Kuppel et al., 2012). Errors in EC data can be attributed to complex terrain and heterogeneous spatial distributions of vegetation within the sensor 'footprint' (Hollinger and Richardson, 2005). Croplands have the advantage of generally being located in more level terrain where mechanisation is possible. While fields are relatively homogeneous in advanced agriculture, European field sizes are small enough that the sample flux footprint may overlap with several fields. For crop models, the most sensitive parameters related to C exchange are photosynthesis-related and development-related parameters. Agricultural production is strongly influenced by climate (Hansen, 2002); therefore, errors in meteorological drivers lead to uncertainties in model C budget estimates (Ciais et al., 2011). Crop C modelling approaches, such as the Soil-Plant-Atmosphere Crop model (SPAc, Sus et al., 2010) and ORCHIDEE (Krinner et al., 2005), have been applied and evaluated at relatively data-rich sites with fine temporal scale (e.g. half-hourly) drivers. However, at global scales, the number of sites with available fine scale meteorology observations is grossly inadequate; therefore, given their complex demands for inputs, the practical application of these models is limited (Sheffield et al., 2006).
When compared to standard land-surface models, detailed crop models simulating leaf-level processes over multiple canopy layers typically require a large number of input parameters (Valade et al., 2013). Since exact parameter values are difficult to specify they are often based on some expert knowledge (Newlands et al., 2012), but uncertainties associated with prior parameter estimates can result in large variations in simulated C fluxes (Knorr and Heimann, 2001; Ziehn et al., 2012). Calibrated parameters can be overly tuned to particular sites (Kuppel et al., 2012), presenting complications when scaling-up models for providing regional estimates (Fox et al., 2009; Spadavecchia et al., 2011; Newlands et al., 2012). Additionally, parameterising complex models that run at fine temporal scales is often prohibited by computational processing time (Valade et al., 2013), particularly when optimising parameters through an ensemble of model runs over large areas where parameters may be expected to vary with space.
Here we aim to address the limitations associated with spatially upscaling crop C models, specifically those related to model complexity, meteorological driver requirements, and computational demand. We first use Bayesian inference to calibrate a simple model of photosynthesis from a validated complex model. Second, we explore the impacts of using gridded meteorological driver data instead of local observations. We compare photosynthesis estimates from the aggregated canopy model (ACM, Williams et al., 1997) to the SPAc model. Our main objective is to determine the viability of using a simple model (ACM), with a single calibration of photosynthesis, when driven by atmospheric re-analysis data. We hypothesise that the increase in uncertainty linked to model and driver simplification is uncorrelated with, and similar in magnitude to, the uncertainty in driving the more complex SPAc model with sparse driver data. We focus on the following questions: (1) How does model complexity influence estimates of photosynthesis? (2) How do single-site and multi-site photosynthesis calibrations compare across European cereal field sites? (3) How do the complex and simple model photosynthesis estimates compare when driven by atmospheric re-analysis data?
The novelty of this research is the investigation of parameter and driver uncertainty on model estimates of crop production. Furthermore, the associated reductions in model complexity and temporal resolution allow ACM to run at higher computational speeds, thus increasing the efficiency for future experimentation, and application in data assimilation frameworks where large model ensembles are required.

Study sites and data
This study investigates one winter wheat (Triticum aestivum) growing season at eight European sites (Fig. 1). These sites and crop growing seasons, selected from the FLUXNET database (fluxnet.ornl.gov), are located in France (Auradé, Avignon, Grignon and Lamasquere), Belgium (Lonzee), Germany (Klingenberg and Gebesee) and Switzerland (Oensingen). To support the analysis of a multi-site model calibration, these sites were equally divided into calibration and validation sites. Specifically, Auradé, Klingenberg, Lonzee and Oensingen were selected as calibration sites as they broadly covered the spatial extents of all eight sites. The remaining four sites (Grignon, Lamasquere, Gebesee and Avignon) were used for validating the multi-site calibrated model. With a range in latitude (43.5-51.1 °N) and longitude (1.1-13.5 °E), the locations of all eight sites span a large area of western-central Europe. Consequently, the sites also show variability in the overall length of the growing period (from sowing to harvest), ranging from 245 days (Auradé) to 342 days (Klingenberg), and in seasonal average temperatures, from 7 °C (Klingenberg) to 14 °C (Avignon).
Site data available from the FLUXNET database consisted of in situ daily and half-hourly meteorological observations, which were used to drive ACM and the more complex SPAc model, respectively. Daily gross primary productivity (GPP), derived from aggregated half-hourly EC data (Baldocchi et al., 2001), were used for validating photosynthesis estimates from both models. Additional FLUXNET data used in this analysis consisted of soil texture (i.e. clay/sand ratio) and management information (sowing and harvest dates).
To evaluate the use of atmospheric re-analysis data for driving the two photosynthesis models, we used the Princeton dataset with applied bias corrections (see Sheffield et al., 2006). Princeton is a 3-hourly 1.0° resolution dataset developed by the Land Surface Hydrology Research Group at Princeton University. The Princeton dataset also has near-global coverage and thus could supply the driving data for any regional application of a crop C cycle model.

Photosynthesis models
The SPAc model (see Sus et al., 2010 for a full description and evaluation) simulates cropland ecosystem photosynthesis and water-balance at point-scales over fine temporal (half-hourly) and vertical scales (ten canopy and twenty soil layers). Leaf-level processes are scaled up to make canopy-scale predictions. Furthermore, the leaf and canopy-scale simulations are linked to a radiative transfer scheme that tracks absorption, reflectance and transmittance of direct and diffuse irradiance. Photosynthesis, simulated using the Farquhar model (see Farquhar and von Caemmerer, 1982), and transpiration, determined using the Penman-Monteith equation (Jones, 1992), are linked at leaf-level by a model of stomatal conductance. The stomatal conductance, based on that detailed in Meinzer and Grantz (1991), varies to optimise C uptake whilst maintaining leaf water potential above a minimum value, explicitly linking vapour phase losses with hydraulic transport using the parameterisations summarised in Sus et al. (2010) for winter wheat crops.

Aggregated canopy model (ACM)
ACM generates photosynthesis from daily inputs of irradiance, atmospheric CO₂, daylength, leaf area index (LAI), soil water availability, and minimum and maximum temperature. To generate GPP from these drivers, ACM uses a series of aggregation equations that are designed to reproduce the daily GPP estimates made by SPAc. The equations use a set of 10 unitless coefficients (listed in Table 1) that are fitted to create a response surface. This response surface scales the daily accumulation of half-hourly SPAc photosynthesis estimates in order to predict whole-canopy photosynthesis using only coarse-scale (daily) driving data (see further details on the ACM governing equations in Appendix A). In essence, ACM is designed to capture and emulate the detailed behaviour of the SPAc photosynthesis routines whilst operating at a reduced temporal scale and, as a consequence, higher computational speeds.
In SPAc, photosynthesis is restricted when soil moisture is unavailable, either from drought, or from freezing conditions. ACM does not simulate the energy balance and temperature of soils, and so we implement a simple switch so that photosynthesis occurs only when daily average temperature > 0.0 °C (i.e. GPP = 0.0 g m⁻² d⁻¹ otherwise). This temperature-linked switch acts as an ecological constraint on C accumulation during cold days that typically coincide with key winter crop developmental stages, including tillering and stem extension.
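The temperature switch described above can be illustrated with a minimal Python sketch. The function name and the use of the mean of minimum and maximum temperature as the daily average are assumptions made for illustration; the text does not state how the daily average is computed, and the unconstrained GPP value stands in for the output of the ACM aggregation equations, which are not reproduced here.

```python
def apply_temperature_switch(gpp_unconstrained, t_min_c, t_max_c):
    """Return daily GPP (g C m-2 d-1) after applying the cold-day switch.

    gpp_unconstrained: GPP from the ACM equations before the switch (placeholder input).
    t_min_c, t_max_c: daily minimum and maximum air temperature (deg C).
    """
    # Assumption: daily average temperature approximated as (Tmin + Tmax) / 2.
    t_mean_c = 0.5 * (t_min_c + t_max_c)
    # Photosynthesis occurs only when the daily average temperature exceeds 0 deg C.
    return gpp_unconstrained if t_mean_c > 0.0 else 0.0


# Hypothetical example days: one frozen day and one mild day.
cold_day_gpp = apply_temperature_switch(5.0, -4.0, 2.0)
mild_day_gpp = apply_temperature_switch(5.0, -1.0, 9.0)
```

In this sketch a day whose average temperature is exactly 0.0 °C is treated as a cold day, consistent with the strict inequality in the text.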

Data Assimilation Linked Ecosystem Carbon crop (DALECc) model
The DALECc model provides the half-hourly and daily LAI inputs to both the SPAc and ACM photosynthesis models, respectively, and simulates C mass-balance and allocation when driven by the GPP estimates (as illustrated in Fig. 2). Specifically, the DALECc structure consists of C pools/stores that are linked by allocation fluxes (i.e. rate of C allocated to plant tissues) or litterfall fluxes (i.e. rate of C removed from tissues). The model includes a crop-specific C allocation scheme that consists of a look-up table defining the C allocation to plant organs (foliage, stem, storage and root) based on empirical observations (see Penning de Vries et al., 1989; see further details on the structure of DALECc in Appendix B). Allocation fractions assigned at each time-step are a function of developmental stage (DS), ranging from −1 (sowing) to 2 (maturity). The DS is calculated based on the accumulation of daily development rates, which are determined from the key developmental responses: daily temperature, photoperiod and vernalisation (until emergence only) (Wang and Engel, 1998; Sus et al., 2010).

Fig. 2. Experimental design for the study. Rectangles show models; rhombuses are datasets; solid lines are inputs; dashed lines are inter-comparisons. The daily photosynthesis model (ACM, left-hand side) can be driven by either climate reanalyses data (Princeton meteorological data) or daily aggregated local observations (FLUXNET meteorological data). The half-hourly photosynthesis model (SPAc, right-hand side) can be driven by either temporally downscaled estimates of the reanalyses data, or directly from the local half-hourly FLUXNET meteorology. A single crop development and carbon cycle model (DALECc) can be driven by either daily (i.e. from ACM) or half-hourly (i.e. from SPAc) independent estimates of photosynthesis. DALECc provides daily or half-hourly LAI updates, for ACM or SPAc, respectively, in order to generate successive photosynthesis estimates. Experimental tests include inter-comparisons between downscaled reanalyses data with FLUXNET meteorology, along with an evaluation of ACM (multi-site calibration) and SPAc GPP with independent FLUXNET GPP estimates.

Table 1
List of ACM scalar coefficients, including prior minimum and maximum bounds, single-site mean and multi-site calibrated values. Brackets shown next to the single-site mean calibrations show the range in values across the eight sites. The multi-site calibrated coefficients were derived from four of the sites (i.e. calibration sites). The coefficients are used in the series of aggregation equations (see Appendix A2) designed to scale the fine-scale (half-hourly) to coarse-scale (daily) photosynthesis estimates.

ACM cropland photosynthesis calibration
In this research we calibrated the 10 ACM coefficients based on the daily simulation of SPAc photosynthesis for winter cereal crops. This calibration was applied on a single-site basis (i.e. at each of the eight sites individually) and then we merged the datasets from the four calibration sites to develop a single multi-site calibration. The calibration steps we carry out can be summarised as: (1) run SPAc once at each site, using the local half-hourly FLUXNET drivers, to generate daily outputs of GPP; (2) use the FLUXNET daily datasets to produce ACM meteorological drivers: minimum and maximum temperature, irradiance and atmospheric CO₂ (fixed at 393 ppm). For the LAI values, also required to drive ACM, we used the daily accumulation of LAI estimates that were generated by SPAc driving DALECc in the previous step. We assumed that soil moisture was not limiting at any location or time (we chose years when the recorded drought stress was not significant) and so we set the same soil moisture parameter for ACM in all cases; (3) use SPAc GPP estimates to calibrate the ACM constants.
We use a Metropolis-Hastings Markov Chain Monte Carlo (MHMCMC) approach (e.g. Xu et al., 2006; Hill et al., 2012; Ziehn et al., 2012, amongst others) to calibrate the ACM cereal crop coefficients. The likelihood function for ACM coefficient x given SPAc GPP values c, p(c|x), can be expressed as follows:

$$p(c|x) \propto \exp\left(-\frac{1}{2}\sum_{i}\frac{\left(M_i(x) - c_i\right)^2}{\sigma_{spa}^2}\right)$$

where M(x) is the ACM GPP based on coefficient combination x, the index i runs over the daily GPP values, and σ_spa is the Gaussian uncertainty in SPAc GPP: σ_spa was set to 2 g C m⁻² d⁻¹, which approximates the mean relative uncertainty previously quantified for SPAc (see Revill et al., 2013). In accordance with Bayes' theorem, based on the likelihood function p(c|x) the probability density function (PDF) of x given SPAc GPP values c, p(x|c), can be expressed as follows:

$$p(x|c) \propto p(c|x)\,p(x)$$

where p(x) is the prior probability of x. For each ACM coefficient we prescribe a log-uniform prior value and min/max range (see Table 1); these were determined from some preliminary runs whereby the bounds were progressively increased until the accepted coefficient space was unconstrained. To determine p(x|c), we use the MHMCMC to draw 2 × 10⁶ samples of x, from which the probability distribution p(x|c) can be adequately approximated: a full description of the MHMCMC algorithm used in this study can be found in Bloom and Williams (2015).
To avoid correlations between subsequent samples only every 10th iteration was used to estimate the posterior coefficient distributions (Ziehn et al., 2012) and so a total of 2 × 10⁵ samples remained. The MHMCMC algorithm was applied five times (i.e. five chains), each with randomly selected initial prior values, in order to verify convergence between the p(x|c) distributions of each ACM constant. We also considered a burn-in time for each chain, defined here as the cut-off time before convergence to the PDF maximum (Ziehn et al., 2012). We discard the first 50% of accepted values as burn-in time. The calibrated values were selected from the union of the remaining values in all five chains based on the most likely value assigned (i.e. the coefficient set x with the highest corresponding p(x|c)). We test convergence of the five MHMCMC chains of accepted constant values using the Gelman-Rubin (G-R) diagnostic method (Gelman and Rubin, 1992).
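The calibration procedure described above (Gaussian likelihood, log-uniform prior, thinning to every 10th iteration, 50% burn-in, five chains and a Gelman-Rubin check) can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the algorithm of Bloom and Williams (2015): a hypothetical one-coefficient linear model stands in for the 10-coefficient ACM, synthetic values stand in for SPAc GPP, the chain length is far shorter than 2 × 10⁶, burn-in is taken as the first half of the thinned samples, and the proposal step size and random seed are arbitrary.

```python
import math
import random

SIGMA_SPA = 2.0  # g C m-2 d-1, Gaussian uncertainty assigned to the target GPP


def log_likelihood(x, drivers, target, model):
    """Gaussian log-likelihood (up to a constant) of target GPP given coefficient x."""
    return -0.5 * sum(((model(x, d) - c) / SIGMA_SPA) ** 2
                      for d, c in zip(drivers, target))


def run_chain(drivers, target, model, lo, hi, n_iter, rng, step=0.05, thin=10):
    """Random-walk Metropolis in log(x); a log-uniform prior is flat in log-space."""
    log_lo, log_hi = math.log(lo), math.log(hi)
    log_x = rng.uniform(log_lo, log_hi)  # random start drawn from the prior
    ll = log_likelihood(math.exp(log_x), drivers, target, model)
    samples = []
    for i in range(n_iter):
        prop = log_x + rng.gauss(0.0, step)
        if log_lo <= prop <= log_hi:  # proposals outside the bounds have zero prior
            ll_prop = log_likelihood(math.exp(prop), drivers, target, model)
            if rng.random() < math.exp(min(0.0, ll_prop - ll)):
                log_x, ll = prop, ll_prop
        if (i + 1) % thin == 0:  # keep every 10th iteration
            samples.append((math.exp(log_x), ll))
    return samples[len(samples) // 2:]  # discard the first 50% as burn-in


def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains of one coefficient."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    between = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    within = sum(sum((v - mu) ** 2 for v in c) / (n - 1)
                 for c, mu in zip(chains, means)) / m
    return math.sqrt(((n - 1) / n * within + between / n) / within)


def toy_model(x, driver):
    """Hypothetical stand-in for ACM: GPP proportional to a single daily driver."""
    return x * driver


rng = random.Random(1)
drivers = [rng.uniform(5.0, 25.0) for _ in range(30)]  # synthetic daily driver
target = [0.6 * d for d in drivers]                    # synthetic stand-in for SPAc GPP
chains = [run_chain(drivers, target, toy_model, 0.01, 10.0, 5000, rng)
          for _ in range(5)]
# Most likely value from the union of all chains; with a prior that is flat in
# log-space, ranking by likelihood matches ranking by posterior density in log(x).
best_x, best_ll = max((s for c in chains for s in c), key=lambda s: s[1])
r_hat = gelman_rubin([[x for x, _ in c] for c in chains])
```

A value of `r_hat` close to 1 indicates that the five chains sample the same distribution, which is the criterion applied to each ACM coefficient in the text.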

Gridded meteorological driver disaggregation
The use of gridded meteorological products with regional to global coverage is essential to support and evaluate the spatial application of the photosynthesis models. Therefore, to complement the FLUXNET site-scale meteorological data, we constructed half-hourly and daily drivers (for the SPAc and ACM models, respectively) from the Princeton data. Temporal downscaling (i.e. to half-hourly resolutions) through cubic spline interpolation was first applied to the reanalysis datasets of temperature, precipitation, atmospheric pressure, wind speed, specific humidity and shortwave radiation.
The vapour pressure deficit (VPD), as required by SPAc, was estimated by first calculating the saturation vapour pressure (SVP) based on an empirical relationship to the interpolated temperature (see Monteith and Unsworth, 1990). Second, using the interpolated specific humidity and atmospheric pressure, we estimated the partial pressure (pp) of water vapour (see Roberts, 2010). We then estimated the relative humidity (RH = pp/SVP) and VPD was expressed as follows:

$$\mathrm{VPD} = \mathrm{SVP}\,(1 - \mathrm{RH})$$

Table 2
Summary statistics evaluating half-hourly predictions of solar irradiance and temperature used as driver datasets for SPAc and produced from temporally disaggregating 3-hourly Princeton reanalysis data. Comparisons are made against half-hourly FLUXNET site-scale observations from sowing to harvest across eight European crop sites. Metrics include root-mean-square-error (RMSE) and normalised mean bias (NMB).

In this research we considered the 3-hourly temporal coverage of the Princeton radiation to be too sparse for a reliable interpolation that could be used directly by SPAc. Therefore, we first constructed half-hourly estimates of the extraterrestrial radiation: a function of latitude, day of year and time (see Allen et al., 1998). The relative shortwave radiation (i.e. ratio of actual to clear sky solar radiation) was then calculated as the fraction of the half-hourly interpolated Princeton values to the extraterrestrial radiation and thus used to express atmospheric attenuation (i.e. cloudiness). The half-hourly extraterrestrial radiation values were then multiplied by the daily averages of these half-hourly ratios. Essentially, this daily averaged ratio was used to scale the half-hourly potential radiation accordingly to reflect the degree of cloudiness. Daily drivers for ACM (minimum/maximum temperature and daily radiation) were then determined from the disaggregated half-hourly datasets.
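The VPD and radiation-scaling steps described above can be sketched in Python as below. The specific formulas are assumptions chosen for illustration and may differ from those in the cited references: a Tetens-type expression for SVP, the standard conversion from specific humidity to vapour partial pressure, and an instantaneous extraterrestrial irradiance based on a solar constant of 1367 W m⁻² with simple declination and Earth-Sun distance terms. Local solar time is assumed, and each half-hourly ratio is capped at 1 as a guard against very small denominators near sunrise and sunset.

```python
import math


def saturation_vapour_pressure(t_air_c):
    """Tetens-type SVP (kPa) from air temperature (deg C); assumed form."""
    return 0.6108 * math.exp(17.27 * t_air_c / (t_air_c + 237.3))


def vapour_partial_pressure(q, pressure_kpa):
    """Water vapour partial pressure (kPa) from specific humidity (kg kg-1)."""
    return q * pressure_kpa / (0.622 + 0.378 * q)


def vapour_pressure_deficit(t_air_c, q, pressure_kpa):
    """VPD (kPa) = SVP * (1 - RH), with RH = pp / SVP capped at saturation."""
    svp = saturation_vapour_pressure(t_air_c)
    rh = min(vapour_partial_pressure(q, pressure_kpa) / svp, 1.0)
    return svp * (1.0 - rh)


def extraterrestrial_radiation(lat_deg, day_of_year, hour):
    """Instantaneous top-of-atmosphere irradiance (W m-2) at local solar time."""
    phi = math.radians(lat_deg)
    d_r = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    delta = 0.409 * math.sin(2.0 * math.pi * day_of_year / 365.0 - 1.39)
    omega = math.pi / 12.0 * (hour - 12.0)  # hour angle, zero at solar noon
    cos_zenith = (math.sin(phi) * math.sin(delta)
                  + math.cos(phi) * math.cos(delta) * math.cos(omega))
    return max(0.0, 1367.0 * d_r * cos_zenith)  # zero when the sun is below the horizon


def scale_by_daily_ratio(interp_sw, lat_deg, day_of_year):
    """Scale half-hourly extraterrestrial radiation by the daily mean ratio.

    interp_sw: 48 half-hourly interpolated shortwave values (W m-2) for one day.
    """
    hours = [0.25 + 0.5 * i for i in range(48)]  # mid-points of each half-hour
    ra = [extraterrestrial_radiation(lat_deg, day_of_year, h) for h in hours]
    # Ratio of interpolated to extraterrestrial radiation, daylight steps only.
    ratios = [min(sw / r, 1.0) for sw, r in zip(interp_sw, ra) if r > 0.0]
    mean_ratio = sum(ratios) / len(ratios) if ratios else 0.0
    return [mean_ratio * r for r in ra]


# Hypothetical inputs: 20 deg C, 8 g kg-1 specific humidity, 101.3 kPa pressure.
example_vpd = vapour_pressure_deficit(20.0, 0.008, 101.3)
```

Daily ACM drivers could then be taken from the resulting half-hourly series, for example the daily minimum and maximum of temperature and the daily sum of radiation.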

Approaches for evaluating model performance
We analysed outputs from the temporal disaggregation routine applied to the Princeton data when generating both the half-hourly and daily drivers. However, we focus on irradiance and temperature estimates only as these variables are considered as the major environmental factors determining winter wheat development (Streck et al., 2003). The disaggregated data were compared to FLUXNET site-level observations, and metrics were calculated: root-mean-square-error (RMSE) describing the average estimated-measured differences and the normalised mean bias (NMB) quantifying model over- or under-predictions. We also compute the traditional R² regression statistic (least-squares coefficient of determination).
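For reference, the three evaluation metrics named above can be computed as in the following Python sketch. The exact definitions used in the study are not given in the text, so the forms here are assumptions: RMSE as the root of the mean squared difference, NMB as the summed difference divided by the summed observations (in percent), and R² as the squared Pearson correlation, which equals the coefficient of determination of a simple least-squares linear fit.

```python
import math


def rmse(estimates, observations):
    """Root-mean-square error between paired estimates and observations."""
    n = len(observations)
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(estimates, observations)) / n)


def nmb_percent(estimates, observations):
    """Normalised mean bias (%); positive values indicate over-prediction."""
    return 100.0 * sum(e - o for e, o in zip(estimates, observations)) / sum(observations)


def r_squared(estimates, observations):
    """Squared Pearson correlation (R2 of a least-squares linear fit)."""
    n = len(observations)
    mean_e = sum(estimates) / n
    mean_o = sum(observations) / n
    s_eo = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimates, observations))
    s_ee = sum((e - mean_e) ** 2 for e in estimates)
    s_oo = sum((o - mean_o) ** 2 for o in observations)
    return s_eo ** 2 / (s_ee * s_oo)


# Hypothetical example: estimates that are uniformly 10% too high.
obs = [1.0, 2.0, 3.0, 4.0]
est = [1.1, 2.2, 3.3, 4.4]
example_metrics = (rmse(est, obs), nmb_percent(est, obs), r_squared(est, obs))
```

The example shows why the metrics are reported together: a uniform multiplicative bias leaves R² at 1 while producing a non-zero NMB and RMSE.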
This study evaluates the calibration of a simple photosynthesis model when compared to estimates made by a more complex model. Therefore, where Sus et al. (2010) compared SPAc to independent EC data at the cereal crop sites, here we primarily focus our analysis on the comparison between the ACM calibrations (single-site and multi-site) and SPAc outputs. We first compare ACM and SPAc estimates from using the local FLUXNET drivers (i.e. daily and half-hourly for ACM and SPAc, respectively), where disparities between the models were summarised statistically. For the multi-site calibration, we further extend this analysis to compare the ACM and SPAc photosynthesis relationship at the calibration and validation sites. Since reanalysis data have not been previously used to drive SPAc, we then compared ACM (multi-site calibration only) and SPAc outputs, with both models driven using the disaggregated Princeton data, to GPP derived from the FLUXNET EC data.

Irradiance
There was a significant correlation (mean R² = 0.76) between the half-hourly disaggregated irradiance estimates from the Princeton data and FLUXNET site-level observations, reported in W m⁻² (Table 2 and Auradé example Fig. 3). Furthermore, across all sites there was a relatively small range in R² (0.71 ≤ R² ≤ 0.85). From a linear fit, the sites showed positive intercept and slope values < 1 suggesting similar biases, along with an NMB range from 13% to 54% (mean NMB = 27%). However, the Lonzee site, which had a slope value > 1 and the most bias (NMB = 54%), was a notable exception to this. Across all sites, the RMSE of the half-hourly irradiance estimates ranged from 96 to 134 W m⁻² (mean RMSE = 111 W m⁻²).
The daily irradiance estimates, as derived from sampling the half-hourly disaggregated values, compared to the daily FLUXNET observations, reported in MJ m⁻² d⁻¹ (Table 3 and Auradé example Fig. 4), show a similar degree of bias (mean NMB = 22%) to that of the half-hourly estimates. There was also a similarly strong correlation (0.59 ≤ R² ≤ 0.78). Across all sites, the RMSE of the daily irradiance estimates ranged from 3.62 to 5.03 MJ m⁻² d⁻¹ (mean RMSE = 4.17 MJ m⁻² d⁻¹).

Table 3
Summary statistics evaluating daily average predictions of solar irradiance and temperature, used as driver datasets for ACM, produced from sampling the half-hourly time-series of disaggregated Princeton 3-hourly reanalysis data. Comparisons are made against daily FLUXNET site-scale observations from sowing to harvest across eight European crop sites. Metrics include root-mean-square-error (RMSE) and normalised mean bias (NMB).

Temperature
The half-hourly time-series of disaggregated temperature estimates explained an average of 66% of the variability recorded across all observations (Table 2 and Fig. 3). The overall correlation range (0.52 ≤ R² ≤ 0.79) of the half-hourly temperature estimates to the FLUXNET site-level observations was larger when compared to that of the half-hourly irradiance values. Based on a linear fit, Lonzee and Gebesee both had negative intercepts and slopes > 1, whereas the remaining sites had positive intercepts and slopes < 1, suggesting the degree of biases in the temperature was not consistent across all sites. Although the average bias was low (mean NMB = 0.49%), the range in NMB values (−32% ≤ NMB ≤ 26%) across all sites was large. The RMSE of the half-hourly temperature estimates ranged from 3.35 to 5.95 °C (mean RMSE = 4.66 °C).
Similarly to the half-hourly values, the analysis of the daily temperature estimates when compared to the FLUXNET observations across all sites (Table 3 and Fig. 4) shows a relatively low bias (mean NMB = 6%). However, the range in NMB values (−14% ≤ NMB ≤ 24%) was smaller when compared to the half-hourly analysis. Furthermore, the daily estimates have a generally stronger correlation to the observations (0.55 ≤ R² ≤ 0.88) when compared to the half-hourly values, and the RMSE had a smaller magnitude, from 2.24 to 4.36 °C (mean RMSE = 3.30 °C).

Convergence analysis
For a qualitative determination of optimisation convergence between ACM and SPAc we examined the five MHMCMC chains, for both the single-site and multi-site calibrations. For the majority of constants the interquartile ranges in accepted values across the five chains are both similar in magnitude and share a degree of overlap. Furthermore, the G-R test values for each coefficient (1.00-1.12) were all close to 1, indicating convergence (Xu et al., 2006).

Single-site calibration
ACM was run using a local calibration of constants (listed in Table 1) and local meteorology drivers (i.e. FLUXNET) for all days within each crop growth season (i.e. sowing to harvest). From evaluating the time-series GPP estimates by comparing to SPAc (Fig. 5 and Table 4), for all eight sites there was a significant correlation between ACM and SPAc estimates (mean R² = 0.97), and the range in R² values (0.95 ≤ R² ≤ 0.98) was also small. The RMSE ranged from 0.87 g m⁻² d⁻¹ (Gebesee) to 1.22 g m⁻² d⁻¹ (Oensingen) with a mean RMSE of 1.09 g m⁻² d⁻¹. The slope of the linear fit ranged from 0.95 to 1.23; however, for the majority of sites this value was > 1, indicating some positive biases (mean NMB = 6%).

Multi-site calibration
Similarly to the single-site calibration, from comparing the ACM GPP generated using the multi-site calibrated constants (listed in Table 1) to SPAc estimates, with both models using the FLUXNET drivers (Fig. 5 and Table 4), a high correlation (mean R² = 0.96) was achieved between the two models at all sites. The range in R² values (0.93 ≤ R² ≤ 0.97) was relatively small. The RMSE results between ACM and SPAc were also comparable, ranging from 0.98 g m⁻² d⁻¹ (Auradé) to 1.48 g m⁻² d⁻¹ (Oensingen) with a mean value of 1.16 g m⁻² d⁻¹. When compared to the single-site ACM constants, the use of the multi-site calibration showed a slight reduction in the biases of estimates, demonstrated by a decrease in the mean slope (from 1.09 to 1.05) and an increase in the intercept (from −0.13 to −0.06 g m⁻² d⁻¹). Moreover, although differences in the degree of biases exist at individual sites, the average NMB values for the single-site (6%) and multi-site (4%) calibrations were very similar.
When evaluating the performance of the multi-site ACM calibration specifically at the validation sites, the mean correlation to SPAc (mean R² = 0.96) was the same as that for the calibration sites. The mean RMSE values were also similar in magnitude, being 1.15 g m⁻² d⁻¹ and 1.18 g m⁻² d⁻¹ for the calibration and validation sites, respectively. However, the mean NMB indicated an increase in positive bias between the calibration (mean NMB = 0%) and validation sites (mean NMB = 7%).

Local versus disaggregated meteorological drivers
We compare differences between the multi-site calibrated ACM and SPAc model GPP predictions when both models are driven by the disaggregated meteorology datasets (Fig. 5 and Table 4). For the majority of sites, there was a strong correlation between GPP predictions from the two models (0.64 ≤ R² ≤ 0.98). However, with an R² value of 0.64, this correlation for Lonzee was significantly weaker when compared to other sites. Furthermore, with an intercept value of 3.07 g m⁻² d⁻¹, a linear regression indicated biases in the Lonzee predictions. Compared to the relationship between the two models when using local drivers, the range in RMSE values across the sites was relatively large: from 0.82 g m⁻² d⁻¹ (Auradé) to 3.78 g m⁻² d⁻¹ (Lonzee). This corresponds to a large inter-site range in NMB values (−33% ≤ NMB ≤ 67%).

Fig. 5. Plots (shown for Auradé only) comparing ACM and SPAc GPP estimates (sowing to harvest) including ACM calibrations: single-site (a, b) and multi-site (c, d) using local meteorological drivers. ACM (multi-site calibration) and SPAc estimates, both models using disaggregated drivers, are also shown (e, f). Time-series consist of ACM (black line; grey shading indicating 5/95% confidence interval), SPAc (dashed black line) and FLUXNET estimates (black asterisks), including a black arrow indicating harvest (H) date. Scatter plots compare ACM and SPAc estimates, including 1:1 line (grey line) and metrics: root-mean-square-error (RMSE) and normalised mean bias (NMB).

Model comparison with FLUXNET photosynthesis
Using the disaggregated gridded drivers, we evaluate ACM (using the multi-site calibration) and SPAc predictions when compared to GPP estimates derived from FLUXNET EC data (Fig. 6 and Table 5). For both models, overall there was a consistent and similarly strong correlation to the FLUXNET data across all sites: ACM (0.61 ≤ R² ≤ 0.88) and SPAc (0.52 ≤ R² ≤ 0.88). The overall degree of biases in estimates from ACM (mean NMB = 32%) and SPAc (mean NMB = 35%) were also comparable. However, the range in SPAc biases (−45% ≤ NMB ≤ 88%) was larger when compared to the ACM estimates (3% ≤ NMB ≤ 59%).
The range in RMSE between ACM and FLUXNET GPP (from 1.91 to 3.94 g m⁻² d⁻¹) is smaller when compared to that between ACM and SPAc (from 0.82 to 3.78 g m⁻² d⁻¹). From the linear fit there was an average slope of 1.10, indicating an overall positive bias in ACM GPP predictions compared to FLUXNET. For the ACM estimates at Lonzee, although having a relatively weak correlation and large biases when compared to SPAc (R² = 0.64, NMB = 67%), the correlation was stronger when comparing to FLUXNET estimates at this site (R² = 0.78, NMB = 10%).
Similarly to the comparison between ACM and SPAc, when comparing ACM photosynthesis to the FLUXNET data there was a consistently high correlation at the calibration (mean R² = 0.79) and validation sites (mean R² = 0.78). The error was also similar between these two groups of sites with a mean RMSE of 2.96 g m⁻² d⁻¹ and 2.83 g m⁻² d⁻¹ for the calibration and validation sites, respectively. However, the estimates at the validation sites were more positively biased (mean NMB = 40%) when compared to those at the calibration sites (mean NMB = 24%).

Reduced model complexity
The application of ACM when driven with site-level meteorological data had a consistently high correlation to SPAc GPP estimates for both single-site and multi-site calibrations (Table 4). Therefore, a reduction in model complexity, including temporal resolution (i.e. half-hourly to daily time-steps), does not significantly diminish the overall accuracy of photosynthesis estimates at daily timescales.

Single-site versus multi-site calibrations
The comparison of the ACM coefficient MHMCMC chains, derived from SPAc GPP estimates, indicated convergence across the eight sites for both the single-site and multi-site calibrations. Therefore, the overall ACM coefficient calibration setup used here, including sample size, was sufficient for searching the available space (i.e. prior upper and lower bounds) defined for each of the 10 ACM constants.
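As an illustration of the kind of bounded coefficient search described here, the following sketch implements a basic random-walk Metropolis-Hastings sampler with a uniform prior between lower and upper bounds. It is not the MHMCMC configuration used in the study: the one-parameter toy model, the Gaussian likelihood, the step size, the burn-in length and all names are assumptions chosen for brevity.

```python
import math
import random

def log_likelihood(coef, drivers, target, sigma=0.5):
    """Gaussian log-likelihood of a toy linear model y = coef * x (hypothetical)."""
    return -0.5 * sum(((coef * x - y) / sigma) ** 2 for x, y in zip(drivers, target))

def metropolis_hastings(drivers, target, lower, upper, n_iter=5000, step=0.1, seed=1):
    """Random-walk Metropolis-Hastings with a uniform prior on [lower, upper]."""
    rng = random.Random(seed)
    current = rng.uniform(lower, upper)
    current_ll = log_likelihood(current, drivers, target)
    chain = []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        # Proposals outside the prior bounds have zero prior probability: reject.
        if lower <= proposal <= upper:
            proposal_ll = log_likelihood(proposal, drivers, target)
            # Accept with probability min(1, likelihood ratio).
            if rng.random() < math.exp(min(0.0, proposal_ll - current_ll)):
                current, current_ll = proposal, proposal_ll
        chain.append(current)
    return chain

# Synthetic "fine-scale model" output generated with a known coefficient of 2.0.
drivers = [1.0, 2.0, 3.0, 4.0, 5.0]
target = [2.0 * x for x in drivers]

chain = metropolis_hastings(drivers, target, lower=0.0, upper=10.0, seed=1)
chain_b = metropolis_hastings(drivers, target, lower=0.0, upper=10.0, seed=2)
burn_in = 1000  # assumed burn-in length
posterior_mean = sum(chain[burn_in:]) / len(chain[burn_in:])
posterior_mean_b = sum(chain_b[burn_in:]) / len(chain_b[burn_in:])
```

Comparing the post-burn-in means of chains started from different seeds is one simple way to check that independent chains have settled on the same region of the bounded search space.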
When comparing all the single-site and multi-site ACM calibrations with SPAc (Table 4) the accuracy and biases in ACM GPP were consistent in magnitude. Similar results in Kuppel et al. (2012), albeit for a deciduous broadleaf forest application, demonstrated that NEE estimates using a multi-site coefficient optimisation were also as good as those achieved using a single-site optimisation.
Although an increase in bias was observed at the validation sites, the correlation of the multi-site calibrated ACM to SPAc was equally high when compared to that of the calibration sites. Furthermore, with an increase in mean model error of only 0.02 g m⁻² d⁻¹, the overall effectiveness of the model was not significantly reduced when applied at the validation sites. Consequently, using only the four calibration sites selected in this analysis, we have produced a generic and robust ACM calibration for generating winter wheat photosynthesis estimates when compared to outputs from both a site-specific calibration and a more complex model. However, we acknowledge that the crop seasons and locations selected in this analysis were not considered to be drought-stressed and soil moisture was assumed to be fixed across all sites.

Performance of spatially aggregated drivers
From the temporal disaggregation procedure we applied to the Princeton reanalysis data (i.e. from 3-hourly to half-hourly estimates), the temperature and irradiance estimates generally had a high agreement with the independent FLUXNET observations. However, there were biases between the two datasets across all sites. This was particularly the case for the half-hourly temperature estimates, which showed a large range of positive and negative biases (−32% ≤ NMB ≤ +26%) in the Princeton data. We evaluated the use of temporally disaggregated reanalysis data for driving SPAc. ACM (multi-site calibration) was then driven with the daily aggregates (e.g. minimum and maximum temperature) of the half-hourly time-series of estimates. The GPP estimates from both models demonstrated a high agreement (Table 4). This observation indicates that the uncertainty associated with a reduction in model complexity is uncorrelated with that of a more complex model when driven with disaggregated meteorological data. And so, the use of disaggregated drivers satisfies our previous hypothesis: the propagation of driver data uncertainty impacts the two models to a similar degree.
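The aggregation of half-hourly drivers to the daily inputs used by ACM can be sketched as follows. This is a minimal illustration only: the set of daily variables beyond minimum and maximum temperature, the unit conversion and the function name are assumptions, and the synthetic series does not come from the reanalysis product.

```python
import math

STEPS_PER_DAY = 48       # half-hourly time steps
SECONDS_PER_STEP = 1800

def daily_aggregates(temps_c, irradiance_w_m2):
    """Collapse one day of half-hourly drivers into daily values."""
    if len(temps_c) != STEPS_PER_DAY or len(irradiance_w_m2) != STEPS_PER_DAY:
        raise ValueError("expected 48 half-hourly values")
    t_min, t_max = min(temps_c), max(temps_c)
    return {
        "t_min": t_min,
        "t_max": t_max,
        "t_mean": 0.5 * (t_min + t_max),  # average from the daily extremes
        "t_range": t_max - t_min,
        # Integrate W m-2 over the day and convert J to MJ.
        "irradiance_mj_m2_d": sum(irradiance_w_m2) * SECONDS_PER_STEP / 1e6,
    }

# Synthetic day: sinusoidal temperature, daylight between 06:00 and 18:00.
hours = [0.5 * i for i in range(STEPS_PER_DAY)]
temps = [5.0 + 5.0 * math.sin(2 * math.pi * (h - 9.0) / 24.0) for h in hours]
light = [max(0.0, 400.0 * math.sin(math.pi * (h - 6.0) / 12.0)) for h in hours]
day = daily_aggregates(temps, light)
```

Because the daily values are derived from the same half-hourly series that drives the fine-scale model, any bias in the disaggregated drivers reaches both models, only at different temporal resolutions.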
Although selected as a calibration site, Lonzee was a notable exception to the high correlation between the ACM and SPAc GPP estimates. This can be attributed to the ACM temperature-linked photosynthesis switch, which prevents photosynthesis when the daily average temperature is <0.0 °C. By preventing photosynthesis and, hence, C accumulation during cold days, this ACM modification was effective at delaying crop development. However, the Lonzee growth season had a large number of days where the average temperature was <0.0 °C; furthermore, these days coincided with key crop developmental stages. On the other hand, SPAc uses half-hourly drivers to simulate leaf-level processes within multiple canopy layers, and thus resolves photosynthesis at much finer temporal resolutions. Although the daily average temperatures used by ACM were <0.0 °C, a large proportion of the disaggregated half-hourly time-series was >0.0 °C; therefore SPAc continued to simulate photosynthesis for some of the half-hourly periods during these days.
The bias range across individual site estimates was much larger for SPAc than for ACM (Table 5). We deduce that this is a consequence of biases in the original Princeton reanalysis product, which was temporally (3-hourly) and spatially (1.0°) aggregated. These biases would have propagated into the SPAc model at a higher frequency than into ACM, resulting in larger biases in SPAc photosynthesis estimates. In spite of this, the overall ACM and SPAc relationships to FLUXNET estimates were similar in terms of accuracy and biases. As was the case for the ACM and SPAc comparison, there was an increase in the mean bias when applying the multi-site calibration at the validation sites. However, comparisons of ACM to FLUXNET estimates at the calibration and validation sites also showed a similar correlation and error. And so, a simpler model can produce reliable estimates of photosynthesis even when driven with coarse-scale meteorological data.
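The contrast between a daily temperature switch and sub-daily resolution can be illustrated with a toy calculation. The sketch below is not the ACM or SPAc code: the 0.0 °C threshold and the use of the daily minimum and maximum to form the daily average follow the text, while the synthetic temperature series and the function names are hypothetical.

```python
FREEZE_THRESHOLD_C = 0.0

def daily_average_from_extremes(temps_c):
    """Daily average temperature estimated from the daily minimum and maximum."""
    return 0.5 * (min(temps_c) + max(temps_c))

def daily_switch_allows_photosynthesis(temps_c):
    """Daily-model view: photosynthesis is off when the daily average is below the threshold."""
    return daily_average_from_extremes(temps_c) >= FREEZE_THRESHOLD_C

def fraction_of_steps_above_threshold(temps_c):
    """Sub-daily view: share of time steps warm enough for photosynthesis."""
    return sum(1 for t in temps_c if t > FREEZE_THRESHOLD_C) / len(temps_c)

# Synthetic half-hourly day: cold night (-4 degC, 32 steps), mild afternoon (+3 degC, 16 steps).
temps = [-4.0] * 32 + [3.0] * 16

daily_mean = daily_average_from_extremes(temps)
allowed_daily = daily_switch_allows_photosynthesis(temps)
warm_fraction = fraction_of_steps_above_threshold(temps)
print(daily_mean, allowed_daily, warm_fraction)
```

In this constructed day the daily switch turns photosynthesis off entirely, whereas a third of the half-hourly steps are above freezing and would still contribute in a model resolving sub-daily time steps.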

Limitations and implications for future research
We acknowledge that the sites/seasons selected for analysis in this study are by no means representative of winter cereal crop conditions as a whole. Furthermore, although we equally split the eight sites for calibration and validation, due to the scarcity of European field-scale observations of cereal crop meteorology and photosynthesis, the multi-site ACM calibration could not be substantially validated against data from independent field sites. Nonetheless, we hypothesise that a similar accuracy in photosynthesis predictions could be achieved if the multi-site ACM calibration was applied at alternative western-central European winter wheat sites.
Generally, outputs from driving ACM with estimates from the applied disaggregation routine were promising. Given the wide-scale (global) coverage of the Princeton 1.0°, 3-hourly reanalysis product used here, this shows potential for the spatial upscaling of ACM. It is also worth noting that errors existing in the SPAc model, due to parameter uncertainty and inadequacies in process understanding (see evaluation in Sus et al., 2010), would have invariably transferred to ACM through calibration. However, we anticipate a reduction in this uncertainty and improvements in model predictions by combining ACM with additional observations within a model-data fusion framework. For instance, previous research using SPAc has demonstrated that the sequential assimilation of LAI estimates, derived from Earth observation (EO) data, improves C flux estimates (see Revill et al., 2013; Sus et al., 2013). The simplicity of ACM compared to SPAc, particularly in terms of computational demand, also offers a more practical means of updating state variables through such data assimilation schemes involving a large ensemble of model runs.
Table 5 Summary statistics of ACM (multi-site coefficient calibration) and SPAc GPP estimates compared to FLUXNET GPP, when both models are driven with the disaggregated meteorological data. Sites used for calibrating (c) and validating (v) the multi-site calibration of ACM are indicated. Metrics include root-mean-square error (RMSE) and normalised mean bias (NMB).

Conclusions
Previous approaches to simulating the crop C cycle have used detailed models operating at fine spatial and temporal scales, with extensive and often uncertain parameterisations in order to resolve leaf-level processes. As a result, the spatial upscaling of these models is highly susceptible to errors and constraints stemming from fine temporal scale meteorological driver data and site-specific parameterisations. The computational costs of complex models also prohibit ensemble crop C cycle analyses at continental-scales. To this end, we evaluated the use of a simplified model framework that simulates aggregated canopy processes using comparatively coarse temporal scale meteorological data. We further reduced model complexity by applying a generic multi-site photosynthesis calibration and used disaggregated drivers instead of local observations.
Outputs from the simplified model using a multi-site calibration closely reproduced (range in RMSE 0.98-3.78 g m⁻² d⁻¹) those of the more complex SPAc model when both models were driven with either local or disaggregated data. This strong relationship between the two models also existed when the multi-site calibrated model was evaluated at independent sites. Similar results were achieved when comparing the two models to site-level EC data. However, due to parameter uncertainty and meteorological driver availability, we argue that simpler models with reduced parameterisation are more favourable for further studies involving the spatial upscaling of crop C simulations. Additionally, the increased computational efficiency, as a consequence of a decrease in model complexity, is more applicable for model-data fusion experiments that would potentially enhance the representation of cropland C fluxes.

Acknowledgements
We acknowledge the work of the FLUXNET data providers and the organisers of the GHG-Europe project database. We are also grateful to the three anonymous reviewers for providing supportive and constructive feedback on an earlier version of this manuscript. The writing of this paper was partially carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.

Appendix A.

A2. ACM derivation and equations
Using the 10 scalar coefficients and fixed variables (listed in Table 1 of the manuscript and Appendix Table A1, respectively), ACM consists of aggregation equations, which are solved in sequence, in order to fit daily photosynthesis estimated by the fine-scale model. From Williams et al. (1997), the first governing equation assumes a linear relationship between GPP and total canopy nitrogen, which is estimated from the average foliar nitrogen (N) and leaf area index (LAI), also including the impacts of temperature on the metabolic processes:
where $P_N$ is the total canopy nitrogen, $T$ is the average daily temperature (°C, determined from the daily minimum and maximum temperatures), and $a_1$ and $a_8$ are the nitrogen use efficiency and temperature calibration coefficients, respectively. The daily canopy conductance ($g_c$), which determines the rate of carbon (C) fixation, was calculated as a function of the daily temperature range ($T_d$) and the soil-canopy water potential gradient ($\psi_d$, MPa, the difference between the minimum leaf water potential and soil water potential) balanced by the total soil-plant hydraulic resistance ($R_{tot}$):
where $a_{10}$ and $a_6$ are the water potential and hydraulic scalar coefficients, respectively. Using the $P_N$ and $g_c$ values, the internal CO2 concentration ($C_i$) was then determined as a function of ambient atmospheric CO2 ($C_a$):
where q = Â − k and $p = P_N / g_c$. The rate of diffusion of atmospheric CO2 to the point of C fixation ($P_D$) is calculated as a function of $g_c$ and the difference between $C_a$ and $C_i$:
Since the diffusive constraints vary with irradiance ($I$), a two-step calculation was applied in order to calculate the light limitation. First, the canopy-level quantum yield ($E_0$) was calculated based on LAI:

$$E_0 = \frac{a_7\,\mathrm{LAI}^2}{a_9 + \mathrm{LAI}^2} \qquad \text{(A2.5)}$$

where $a_7$ and $a_9$ are the maximum canopy quantum yield and LAI-canopy quantum yield coefficients, respectively. The light limitation ($P_I$) is then calculated as:
The final calculation of daily GPP ($P_T$) is then made, which is a function of $P_I$:

$$P_T = P_I\,(a_5 D_{ms} + a_2) \qquad \text{(A2.7)}$$

where $D_{ms}$ is the number of days (absolute) from the summer solstice (22 June/Julian day 173 in the Northern Hemisphere), and $a_5$ and $a_2$ are the daylength constant and daylength coefficients.
Fig. B1. Schematic of the Data Assimilation Linked Ecosystem Carbon crop (DALECc) model structure, including a carbon (C) allocation scheme based on crop developmental stage, calculated from daily accumulations of effective temperature, photoperiod and vernalisation (until emergence). The GPP used to drive DALECc is estimated either from the daily photosynthesis model (ACM) or the half-hourly photosynthesis model (SPAc). The calculated C allocation fractions (A) set the C allocation to the five C pools. Allocated C is removed from the system either by exporting harvest or through heterotrophic respiration of the crop litter and soil organic matter (SOM) C pools.
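The sequence of ACM calculations described in this appendix can be sketched in code. This is a hedged reconstruction, not the authors' implementation: only the forms of (A2.5) and (A2.7) are legible in the text, so the remaining steps are assumed to follow the expressions of Williams et al. (1997) as commonly stated. The symbol printed as "Â" is read here as a CO2 compensation point (`theta`), which is also an assumption. All coefficient and driver values are hypothetical placeholders, not the calibrated values from the study.

```python
import math

def acm_gpp(lai, foliar_n, t_mean, t_range, psi_d, r_tot, c_a, irradiance,
            d_ms, a, theta, k):
    """Daily GPP from the aggregated canopy equations, solved in sequence.

    `a` maps a coefficient index (1..10) to its value. Steps marked "assumed"
    follow Williams et al. (1997) because they are not legible in the text.
    """
    p_n = a[1] * foliar_n * lai * math.exp(a[8] * t_mean)             # A2.1, assumed
    g_c = abs(psi_d) ** a[10] / (0.5 * t_range + a[6] * r_tot)         # A2.2, assumed
    p = p_n / g_c
    q = theta - k
    b = c_a + q - p
    c_i = 0.5 * (b + math.sqrt(b * b - 4.0 * (c_a * q - p * theta)))   # A2.3, assumed
    p_d = g_c * (c_a - c_i)                                            # A2.4, assumed
    e_0 = a[7] * lai ** 2 / (a[9] + lai ** 2)                          # A2.5, from the text
    p_i = e_0 * irradiance * p_d / (e_0 * irradiance + p_d)            # A2.6, assumed
    return p_i * (a[5] * d_ms + a[2])                                  # A2.7, from the text

# Hypothetical placeholder coefficients and drivers, for illustration only.
coefficients = {1: 10.0, 2: 1.0, 5: -0.001, 6: 0.38, 7: 7.2, 8: 0.011, 9: 2.1, 10: 0.79}
inputs = dict(lai=3.0, foliar_n=1.5, t_mean=15.0, t_range=10.0, psi_d=-2.0,
              r_tot=1.0, c_a=400.0, irradiance=20.0, d_ms=30.0,
              a=coefficients, theta=4.2, k=209.0)
gpp = acm_gpp(**inputs)
```

Solving the steps in this fixed order keeps the daily model cheap to evaluate, which matters when the function is called many thousands of times within a calibration or ensemble.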