Optimality-based modelling of climate impacts on global potential wheat yield

Evaluation of potential crop yields is important for global food security assessment because it represents the biophysical ‘ceiling’ determined by variety, climate and ambient CO2. Statistical approaches have limitations when assessing future potential yields, while large differences between results obtained using process-based models reflect uncertainties in model parameterisations. Here we simulate the potential yield of wheat across the present-day wheat-growing areas, using a new global model that couples a parameter-sparse, optimality-based representation of gross primary production (GPP) to empirical functions relating GPP, biomass production and yield. The model reconciles the transparency and parsimony of statistical models with a mechanistic grounding in the standard model of C3 photosynthesis, and seamlessly integrates photosynthetic acclimation and CO2 fertilization effects. The model accurately predicted the CO2 response observed in FACE experiments, and captured the magnitude and spatial pattern of EARTHSTAT ‘attainable yield’ data in 2000 CE better than process-based models in ISIMIP. Global simulations of potential yield during 1981–2016 were analysed in parallel with global historical data on actual yield, in order to test the hypothesis that environmental effects on modelled potential yields would also be shown in observed actual yields. Higher temperatures are thereby shown to have negatively affected (potential and actual) yields over much of the world. Greater solar radiation is associated with higher yields in humid regions, but lower yields in semi-arid regions. Greater precipitation is associated with higher yields in semi-arid regions. The effect of rising CO2 is reflected in increasing actual yield, but trends in actual yield are stronger than the CO2 effect in many regions, presumably because they also include effects of crop breeding and improved management. We present this hybrid modelling approach as a useful addition to the toolkit for assessing global environmental change impacts on the growth and yield of arable crops.


Introduction
Wheat accounts for a quarter of the world's cereal supply (FAOSTAT 2020) and provides 20% of human caloric intake (Shiferaw et al 2013), thus prediction of wheat yield is essential for global food security assessments contributing to the sustainable development goal of 'No hunger' (United-Nations 2015, Wang et al 2020c. Potential yield, representing the biophysical 'ceiling' on yield, is an important component of such assessments. Potential yield is defined as the yield obtained when growth is determined only by crop variety, climate and ambient CO 2 , assuming that any yield reduction due to water or nutrient limitation, or caused by biotic stress, has been effectively controlled through management practices (Van Ittersum et al 2013). Potential yield is the most relevant benchmark for crop systems where water supply is adequate, such as in wet regions or areas where irrigation can be implemented. However, water deficits may not be fully compensated for in rainfed systems in semi-arid and arid regions. In such regions, it is more relevant to estimate the water-limited potential yield (Van Ittersum et al 2013).
The gap between potential and actual yields reflects the space for yield improvement (Van Ittersum et al 2013, Fischer 2015. To meet increasing food demand due to population and economic growth (Godfray et al 2010, Tilman et al 2011, research has focused on increasing actual yield through management (Mueller et  However, wheat yield can be substantially reduced by unfavourable climate conditions (Ray et al 2015, Tack et al 2015, Asseng et al 2018, Ortiz-Bobea et al 2019, even when the fertilization effect of elevated atmospheric CO 2 is taken into account (Attavanich andMcCarl 2014, Liu et al 2018). Climate conditions can have a major effect on food security, particularly in less developed regions where management interventions are limited (Hasegawa et al 2018). Thus, it is important and timely to assess climate and CO 2 impacts on potential wheat yield (Godfray et al 2010, Tilman et al 2011, Ray et al 2013, Hasegawa et al 2018. Statistical modelling (Licker et al 2010, Neumann et al 2010, Guilpart et al 2017 and process-based crop models (Gobbett et al 2016, Rosa et al 2018 have both been used to estimate potential yield. Statistical models are extensively used to understand climatic impacts on crops (Urban et al 2012, Tack et al 2017, Schlenker and Roberts 2019, and to benchmark process-based model predictions (Van Ittersum et al 2013). However, statistical estimations of potential yield are based on the current highest actual yield observed in a given climate (Licker et al 2010, Mueller et al 2012, and so have limited capacity to predict the impacts of environmental change (Van Ittersum et al 2013, Roberts et al 2017. Moreover, this approach does not distinguish irrigated and rainfed crops and thus could overestimate the potential wheat yield where water for irrigation is not available (Van Ittersum et al 2013). Statistical predictions are also strongly dependent on the training data, which are less abundant in less developed regions.
Process-based models are mathematical representations of current understanding of biological processes (phenology, carbon assimilation, assimilate partitioning, transpiration) and their responses to environmental factors. Although this approach provides assessments of both geographical patterns and the impacts of changing environmental conditions on yield (Gobbett et al 2016), existing models differ in their structure and in values of their many parameters, and show large differences in their predictions of potential wheat yield (Asseng et al 2013, Tao et al 2018, Wang et al 2020a. Moreover, process-based models have largely ignored the acclimation of photosynthetic capacities to long-term climate change (Challinor et al 2009, Peng et al 2020. Given the limitations of statistical approaches and the uncertainties of current process model-based results, a more robust way to estimate potential crop yields globally as a function of their growth environment is needed. Recent advances in modelling natural ecosystems have shown that eco-evolutionary optimality (EEO) concepts can provide a basis for developing robust models since they account for physiological acclimation and provide a universal constraint on ecosystem processes, thus dramatically reducing parameter requirements without compromising realism (Lin et al 2015, Harrison et al 2021, Tan et al 2021. The PC model (Qiao et al 2020) is a hybrid model for wheat growth and yield, which combines an EEO-based model of primary production with empirical functions relating production to yield. This model reproduced the magnitude and interannual variations of wheat yield for different regions of China, showing that EEO approaches can be applied successfully in agro-ecosystems (Qiao et al 2020).
In this paper, we apply an extended version of the PC model to simulate potential wheat yield and to assess the impacts of climate and CO 2 changes on potential wheat yield globally. We first develop a scheme to account for water limitation in rainfed regions. We then test model predictions of potential wheat yields at site and global grid scales. Finally, we assess the impacts of climate and CO 2 changes on potential wheat yields during the period 1981-2016, and compare these to the impacts on actual yields.

The PC model
The PC model (Qiao et al 2020) predicts wheat yield with two separate modules for photosynthesis and allocation (figure 1). The first module uses a universal gross primary production (GPP) model (the P model: Wang et al 2017, Stocker et al 2020), combined with a scheme to predict leaf area index (LAI) dynamics using the mass-balance principle, to derive GPP from climatic variables and CO 2 concentration. The P model is based on the Farquhar-von Caemmerer-Berry C 3 photosynthesis model for instantaneous biochemical processes (Farquhar et al 1980) combined with two EEO hypotheses concerning photosynthetic acclimation: the 'coordination' and 'leastcost' hypotheses (Prentice et al 2014, Wang et al 2017. These account for the spatial and temporal acclimation of carboxylation and stomatal conductance to environmental variations at weekly to monthly time scales (Bloomfield et   . LAI dynamics are predicted from a phenology scheme adopted from the LPJmL4 global vegetation model (Schaphoff et al 2018) and the carbonbalance constrained maximum LAI. The phenology scheme allows maximum LAI to be interpolated to daily LAI, with leaf-area development determined by the accumulated heat units and total heat requirement between planting and harvest. Here we have introduced an additional water-balance constraint on LAI, such that maximum LAI in rainfed regions is the lesser of a carbon-and a water-limited value. The maximum LAI in irrigated regions is only limited by carbon balance, however, because we assume full irrigation and therefore no water deficit there.
The second module is a data-driven scheme predicting the allocation of GPP to aboveground biomass (AB) and hence to yield. The PC model was initially developed using data from sites in China where wheat was grown under optimal irrigation and fertilization (Qiao et al 2020). Allocation of GPP to AB was treated as a fixed fraction, accounting for maintenance respiration and the allocation from net primary production to AB (Campioli et al 2015, Huang et al 2018. Here, to facilitate application to rainfed crops in dry regions, allocation to AB is treated as a function of soil moisture (indexed by an estimated ratio of actual to equilibrium evapotranspiration). In irrigated regions this index is set to its maximum value of 1.26. Grain yield is treated as a saturating function of AB with wheat variety as a random effect.
LAI and biomass allocation can be impacted by other factors, including nutrient supply and biotic stress. In the current application we focus on potential yield, which assumes that appropriate management practices have been adopted to counteract such impacts. Details of the PC model, and the derivation of the empirical functions relating GPP, AB and potential yield, are given in the appendix.

Data sources 2.2.1. Site-level data sets
A global dataset of experimental information on wheat growth and yield was compiled from the Chinese National Ecosystem Research Network (CNERN: www.cnern.org.cn), the International Maize and Wheat Improvement Center (CIMMYT: www.cimmyt.org), the Agricultural Model Intercomparison and Improvement Project (AgMIP: www.agmip.org), the Kellogg Biological Station (KBS: www.lter.kbs.msu.edu), and published papers (table S2, available online at stacks.iop.org/ERL/16/ 114013/mmedia). The dataset includes 118 agricultural sites located in all the major wheat growing regions of the world, and contains 319 site-years of observations (figure S2). Fourteen sites (76 site-years) of the 118 sites provide site-level information on meteorology and management practices. We used these 14 sites, which are all intensively managed environments (USA, Europe, China, Australia and New Zealand), to test our simulations of potential wheat yield at site-level. The remaining data (104 sites, 243 siteyears) were used for model calibration by means of a quantile regression. Details of the model calibration procedure are given in the appendix.
We derived observations of the response of wheat to elevated CO 2 from eight sites in Ainsworth and Long (2020). The experiments were run in different years and, in some cases, involved additional treatments (water, nitrogen, warming). There is also variation in the level of CO 2 enhancement. In view of this, we used the median value of the observed changes in wheat yield across the experiments for comparisons with the simulations.

Global model input data sets
We obtained climate data from the WFDE5 data set globally at a spatial resolution of 0.5 • (Cucchi et al 2020) between 1981 and 2016, including 2 m air temperature (T, • C), atmospheric pressure (P air , Pa), 2 m specific humidity (SH, kg kg -1 ), downward shortwave radiation flux (R sw , W m -2 ) and precipitation (rain and snowfall) rate (P, kg m -2 s -1 ). Daily values were calculated as means of the hourly data in WFDE5. These climatic variables were converted to the variables required by the PC model. Daily 2 m specific humidity was used to calculate vapour pressure deficit. Daily photosynthetically active radiation (PAR) was derived from downward shortwave radiation as PAR = 3600 × 24 × 10 -6 R sw k EC , where k EC = 2.04 µmol J -1 (Meek et al 1984) converts energy units (W m -2 ) to photon flux units (µmol m -2 s -1 ). Daily variables were pre-processed to yield model inputs on a seven-day time step. Global annual average CO 2 concentrations for 1981-2016 were obtained from the US National Oceanic and Atmospheric Administration (www.esrl.noaa.gov).
Wheat-growing areas (Monfreda et al 2008) were defined using EARTHSTAT (www.earthstat.org), a dataset created by merging census and satellitederived datasets. The MIRCA2000 data set (Portmann et al 2010) was used to determine which wheatgrowing regions were irrigated and which were rainfed, and to provide information on the wheat calendar (planting and harvest dates). The MIRCA2000 data set used time-series data from 1998 to 2002 to derive the average month of planting and of harvesting in 2000 CE for 0.5 • grid cells. In regions where multiple wheat crops are grown in a year, for example where there is both winter-planted and spring-planted wheat, the MIRCA2000 data set provides planting dates for both types. We used the wheat calendar dates associated with the largest area of wheat planted in that grid cell (figure S3) in our analyses. There are some inaccuracies regarding the distribution of irrigated wheat-growing areas (figure S4) in the MIRCA2000 dataset, particularly for some countries in Africa and South America, so this information was modified according to countryspecific information from published papers (Tadesse et al 2019). In grid cells where both irrigated and rainfed wheat are grown, we used the dominant type for our analyses. Thus if the ratio of irrigated to rainfed wheat in any one grid cell was >0.5, we assumed for modelling purposes that the wheat planted in that grid cell was fully irrigated. Table S1 provides a detailed description of all input data sets.

Global data on potential and actual yield
EARTHSTAT provides actual and potential wheat yield at 0.083 • resolution globally for 2000 CE, where potential yield was estimated from the global gridded data set of actual wheat yield (Monfreda et al 2008) by climate zone. Grid cells in wheat-growing areas were classified into climate zones based on growing degree days and a soil moisture index, and the potential ('attainable') yield was estimated as the 95% quantile of all yield data in each climate zone without distinguishing wheat type or water management (spring or winter wheat; irrigated or rainfed wheat). We aggregated the EARTHSTAT attainable yield data to the 0.5 • WFDE5 grid as a benchmark for our model predictions of potential yield.
The Inter-Sectoral Impact Model Intercomparison Project-2a (ISIMIP: www.isimip.org) provides historical simulations of wheat yield driven by global climate data at a spatial resolution of 0.5 • under six scenarios: fully irrigated or no-irrigation under default conditions, fully irrigated or no-irrigation with harmonized seasons, and fully irrigated or noirrigation with harmonized seasons and no nitrogen constraints. We used outputs from the scenario with full irrigation and harmonized seasons with no nitrogen constraints for irrigated wheat regions, and the scenario with no-irrigation and harmonized seasons with no nitrogen constraints for rainfed wheat regions. We used outputs for 2000 CE from nine wheat models (ORCHIDEE-crop, pAP-SIM, pEGASUS, pDSSAT, EPIC-BOKU, EPIC-IIASA, EPIC-TAMU, CLM-crop, GEPIC) for comparison with the PC model and EARTHSTAT data.
The Global Dataset of Historical Yields (GDHY: Iizumi and Sakai 2020)) provides actual crop yields for years from 1981 to 2016 at 0.5 • resolution. The dataset was created by combining agricultural census data, satellite remote sensing and information on crop calendar and crop harvested area (table S1). It was used here to detect the effects of climate variability and year on actual yields over the recent period, for comparison with the effects of climate variability and year on potential yield as simulated by the PC model.

Model application and evaluation
We simulated potential wheat yield in irrigated areas assuming that water and nutrient limitations and biotic stress are counteracted by appropriate management practices. We simulated potential yield in rainfed wheat areas assuming water limitation, but with no other limitations. We evaluated PC model performance at both site and global scales (figure S1). At site level, the model was run using observed climate and CO 2 concentration to predict wheat yield at 14 sites (figure S2) where wheat was grown under close to optimal management conditions, and the observed yield can therefore be assumed to be comparable to the potential simulated yield. We also used eight wheat FACE experiments to test the simulated site-level response to increased CO 2 concentration. At the global level, because there are no direct observations of management practices, we ran the PC model for the year 2000 CE, assuming optimal management, to predict potential yield. We compared these results with statistical estimates of 'attainable' yield for 2000 CE from EARTHSTAT, and with simulated potential yield from the ISIMIP models.
We then applied the PC model to assess the impacts of climate and CO 2 concentration on potential yield over the interval from 1981 to 2016. We calculated partial Pearson correlations of potential yield against mean growing season temperature, total growing season precipitation and accumulated growing-season PAR. Year was included as a predictor to account for CO 2 fertilization, as CO 2 increased year-on-year during this period. We also calculated partial Pearson correlations for actual yield over the same interval using the GDHY data (Iizumi and Sakai 2020), thus testing the hypothesis that climatic effects on modelled potential yield would be reflected in actual yield. The effect of year in the data-based correlation analysis implicitly represents all non-climatic trends, such as improvements in management, as well as changes in atmospheric CO 2 .
The dataset of planting and harvest dates (Portmann et al 2010) is only resolved to the nearest month and to administrative unit (often country) level and probably has unrealistically sharp boundaries between units. The dates are multi-year averages and do not reflect year-to-year variability. We quantified the uncertainty in simulated potential yield arising from these limitations by randomly sampling planting and harvest dates assuming a uniform distribution of dates within a ±15 d window around the mid-month. We used this approach to run a 100member ensemble of simulations with other inputs unchanged.

Results
The PC model reproduced well the observed AB and yield at experimental sites (figures 2 and S7). Observed and predicted yields were well correlated (r = 0.59) with the slope of the linear regression through the origin close to unity (0.97). Simulated AB was correlated (r = 0.32) with observed AB ( figure S7) and also has a slope close to unity. These correlations are weaker than those obtained by Qiao et al (2020) for irrigated sites from China (r = 0.89), perhaps reflecting incomplete representation of water limitation at rainfed sites. The poorer correlation may also reflect our assumption that irrigated areas were not subject to any water limitation, although this may not be true everywhere.
The PC model predicts that potential wheat yield increased by 11.68 ± 6.41% across the FACE sites in response to a CO 2 increase from ∼380 ppm to ∼550 ppm compared to the observed change in wheat yield (median ± sd, 11.16% ± 8.21%) reported by Ainsworth and Long (2020). The larger spread in the observations may reflect the effects of different cultivars on wheat yield, not considered in the model.
The PC model captures both the spatial pattern and the magnitude of potential wheat yields in 2000 CE (figures 3 and S8). Potential yields in central Asia, central USA and Australia were low (<3 t ha -1 ), reflecting water limitation in these regions of predominantly rainfed agriculture. Potential yields in Europe, northern China, and the northeastern USA were high (>6 t ha -1 ), consistent with the favourable climate conditions or intensive irrigation in these regions. Simulated potential yields are higher than implied by the EARTHSTAT dataset in some regions, for example the western USA, northwestern China and southern Africa (figure 3). These are regions with relatively small harvested areas (Monfreda et al 2008) and where the crops are irrigated. The EARTHSTAT calculation of potential yield as the 95% quantile by climate zone means that yields in regions with a low harvested area are under-represented; this could explain the discrepancy with the simulations. The observed yields from these regions are indeed higher than the potential yield implied by EARTHSTAT (Monfreda et al 2008).
The ISIMIP crop models generally fail to capture the spatial pattern and the magnitude of potential yield shown by EARTHSTAT. Some models underestimate the potential wheat yields regionally (e.g. EPIC-BOKU, EPIC-IIASA, EPIC-TAMU, GEPIC) or globally (e.g. ORCHIDEE-crop, CLM-crop); some models (e.g. pAPSIM, pEGASUS, pDSSAT) capture the spatial pattern, but overestimate the potential Thus, the PC model performs better than any of the ISIMIP models.
Uncertainty about the length of the wheat growing season had some impact on the simulated potential wheat yield (figure S10), but the resulting variations in wheat yields were less than 1 t ha -1 over 95% of the wheat-growing areas. This is a relatively small change in percentage terms (<10%). Thus, uncertainty due to the planting and harvest date has generally little effect on the estimation of potential yield. These simulations however only consider uncertainty in timing of the dominant wheat crop in any grid cell. This could be another source of uncertainty in the estimates of potential yield, if for example the sub-dominant type were irrigated or had higher overall yields. However, the impact of this is likely to be small compared the differences in the length of the growing season for the dominant crop, which can be assumed to have been planted at a time best adapted to local climate conditions. There are some places where more than one wheat crop is planted in the same year, and our approach will therefore underestimate the total yield. However, this situation occurs rarely. It is more common for wheat to be followed by a different crop-which has no impact on our estimates of potential wheat yield.
The impacts of changes in climate and CO 2 concentration on actual and potential wheat yield are shown in figure 4. Rising temperature has tended to reduce wheat yields in almost all regions, although a positive effect of warming is seen in regions (e.g. northeastern America and northern China) where growing-season temperature is lower and water supply adequate. The impacts of solar radiation on wheat yields differ regionally. Positive effects are seen in northeastern America, much of western Europe, Africa, India, and China. These regions are either wet or irrigated, so wheat growth is not limited by water and high PAR is beneficial for carbon assimilation. However, there are also negative impacts of solar radiation on wheat yields, for example, in the northernmost part of the wheat-growing area of North America. Negative impacts are also seen in Iberia, Ukraine and the adjacent part of southern Russia, much of the wheat-growing area of South America and in Australia. Many of these regions are rainfed and water stress is the major limitation on wheat growth; increased solar radiation is then expected to increase water stress, due to more rapid soil water depletion. The spatial pattern of the precipitation impacts on wheat yield is generally opposite to that of solar radiation. Negative impacts are seen in northeastern America, much of western Europe and some parts of China. These negative precipitation impacts may be associated with the damages causing by heavy rain events and waterlogging (Li et al 2019). By contrast, increased precipitation lessens the impact of water stress on wheat growth in dry rainfed regions.
The impacts of year on actual yield as seen in the historical data includes the effects of rising CO 2 concentration and advances in breeding and management, whereas the impacts on potential yield only reflect rising CO 2 . The similarities between the two suggests that rising CO 2 has had a non-negligible positive impact on wheat yield, although the stronger response in actual yield shows there have been additional effects from management and plant breeding, particularly in North America and western Europe.  (Iizumi and Sakai 2020); potential yields (PY) were simulated by the PC model using climate data from WFDE5 and changes in CO2 concentration from NOAA. Panels (a) and (e) show partial correlations with mean growing season temperature where growing season is defined by daily temperature above 5 • C: (b) and (f) show partial correlations with the annual mean precipitation: (c) and (g) show partial correlations with the accumulated growing season photosynthetically active radiation, where growing season is defined by daily temperature above 5 • C; (d) and (h) show partial correlations with year.

Impacts of climate and CO 2 on wheat yields
Our analyses show that warming has a negative effect on wheat yield in most wheat-growing areas. This negative impact of increasing temperature has been reported in many previous studies (Asseng et al 2011, Zhao et al 2016, Agnolucci et al 2020, Wang et al 2020c. Warmer temperatures accelerate the phenological development of wheat, shorten its growing season, and thereby decrease yield (Zabel et al 2021). Wheat growth and consequently yield may also be reduced through more frequent high temperature stresses (Zampieri et al 2017).
Our analyses suggest that increasing CO 2 concentrations have had a positive effect on wheat yields which compensated, to some extent, for the negative impact of increasing temperature. This effect has been observed in FACE experiments (Ainsworth and Long 2020). Other experimental studies have also suggested that CO 2 fertilization may offset the negative impacts of increasing temperatures (Fitzgerald et al 2016, Macabuhay et al 2018. Moreover, increased CO 2 also implies increased water use efficiency (Manderscheid et al 2018) and this could reduce the need for irrigation. However, the consequent reduction in transpiration could increase surface temperatures and increase heat stress on wheat growth in warmer regions (Kadam et al 2014).
There are negative relationships between historical wheat yield against year in a few regions, for example, Zimbabwe and parts of northern China.
The decline in wheat yield in Zimbabwe has been reported previously (Govere et al 2020) and presumably reflects unstable social conditions in recent years. The apparent decline in wheat yield in northern China is puzzling, however, since previous studies have shown increasing wheat yields in this region over recent decades (Khan et al 2009, Qin et al 2015, consistent with our model predictions. This discrepancy between actual and potential yields may reflect uncertainties in the GDHY dataset, as also noted by Iizumi and Sakai (2020).

Comparison between the PC model and other modelling approaches
The PC model predicts both the magnitude and the spatial patterns of potential yield better than more complex process-based models (figures 3 and S9). Although these models incorporate a detailed treatment of wheat growth dynamics, they require information on a large number of poorly known parameters which reduces their reliability and robustness (Prentice et al 2015, Tao et al 2018, Xiong et al 2019. The success of the hybrid PC model in predicting potential wheat yield at both site scale and grid scale shows that EEO approaches can lead to simpler models to predict vegetation processes even in highly coupled humanand-environmental systems with constraints imposed by artificial selection. The PC model predictions of potential yield in 2000 CE are in reasonable agreement with the EARTHSTAT data. The EARTHSTAT data set does not distinguish spring from winter wheat or irrigated from rainfed wheat, whereas the simulated potential yields only account for the dominant crop type and management in each grid cell. The spatial pattern of potential yield is not significantly affected by these different assumptions, but these differences could help to explain some of the discrepancies between the EARTHSTAT estimates and simulated yield. The largest discrepancies are in wellwatered regions where cropped area is small (e.g. some regions in southern Africa), and may reflect underestimation of potential yields by EARTHSTAT. In regions where the dominant crop is rainfed (e.g. Ukraine, Belarus and Romania), discrepancies may reflect overestimation of potential yields by EARTHSTAT.
Overall, the PC model has several advantages over purely statistical approaches in estimating potential yields under future conditions. First, the model realistically captures the CO 2 effect, as shown in FACE experiments. Second, it allows the separate impacts on growth processes of correlated climate variables, such as temperature and light, to be simulated in a process-based framework. The model also accounts for the impact of water limitation in areas of rainfed wheat.

Future directions
Several outstanding issues might be addressed in future model development. Our current analysis does not consider the effects of differences between cultivars explicitly because information on the cultivars planted is not available globally. However, some model parameters, including the intrinsic quantum yield, the light extinction coefficient and the ratio of AB to yield, vary among cultivars (Sukumaran et al 2015, Martin et al 2018. Given the current structure of the model, it would be possible to replace the universal parameter values used in the current analyses by values appropriate for specific cultivars. The current version can also be used to assess the effects of cultivar selection through analyses treating cultivar as a random effect on these parameters, as demonstrated by Qiao et al (2020).
The current version of the PC model simulates potential wheat yields by assuming that growth stresses are counteracted by appropriate management practices. While the accuracy of the simulations shows this is appropriate to first order, in reality suboptimal nutrient supply could be an important limitation of yields. It would be feasible to extend the approach used here to take account of water limitation to allow nutrient mass balance to be considered in the prediction of LAI dynamics and biomass allocation. The impact of other factors that lower yields, such as pests and diseases, could be incorporated via empirical relationships provided that field data are available to allow such relationships to be determined. However, it would then be necessary to use modelling approaches that predict the incidence of pest and disease outbreaks as a function of environmental factors in order to be able to predict the likely influence of such factors on future crop yields.
The PC model was designed to run at a weekly time step, which is an appropriate timescale to describe the acclimation of photosynthetic processes. Short-lived weather events, particularly extreme high temperatures, are known to have a negative effect on growth and to reduce crop yield (Lobell et al 2013, Schlenker and Roberts 2019, Perry et al 2020. Recent work (Mengoli et al 2021) has shown that the P model can simulate carbon fluxes at half-hourly resolution by separating the timescales for instantaneous and acclimated photosynthetic responses. A similar approach could be used to examine the impact of daily and sub-daily weather conditions, including temperature extremes, on wheat yields.
In common with more complex crop models, we used fixed planting and harvest dates for our simulations using an observation-based monthly climatology. Although we have shown that this only induces a small uncertainty in estimated wheat yields for recent decades, the use of fixed dates is not realistic for future simulations-since farmers are likely to adapt to changing environmental conditions by shifting the timing of planting. It would be useful to explore EEO approaches to determine the optimal timing of planting and harvest under changing environmental conditions.
The satisfactory performance of the PC model in predicting wheat yields both at research sites and globally demonstrates the usefulness of EEO approaches to simulate vegetation dynamics in coupled human-and-environmental systems with constraints imposed by artificial selection. It would be useful to explore this kind of hybrid modelling for other arable crops.

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary files). (No. 20213080023) and National Key R&D Program of China (No. 2018YFA0605400). I C P and S P H are supported by the High-End Foreign Expert program of the China State Administration of Foreign Expert Affairs at Tsinghua University (G20190001075, G20200001064). SPH also acknowledges the support from the ERC-funded project GC2.0 (Global Change 2.0: Unlocking the past for a clearer future, No. 694481). I C P also acknowledges support from the European Research Council under the European Union's Horizon 2020 research and innovation programme (Grant Agreement No: 787203 REALM). This research is a contribution to the Land Ecosystem Models based On New Theory, obseRvations and ExperimEnts project funded through the generosity of Eric and Wendy Schmidt by recommendation of the Schmidt Futures program and to the Imperial College initiative on Grand Challenges in Ecosystems and the Environment.

Author contributions
S C Q carried out the analyses and prepared the display items with contributions from H W, I C P and S P H. S C Q wrote the first draft. H W, I C P and S P H contributed to the revision. All authors contributed to the design of this study and the interpretation of the results.
where k = 0.5 is the light extinction coefficient. In reality k can vary among wheat cultivars in a range from 0.3 to 0.8, and has shown a trend from large values in historical cultivars to smaller values in modern varieties (Zhang et al 2014, Pradhan et al 2018).
Modelled daily LAI is the product of maximum LAI (LAI max ) and a multiplier (0 ⩽ f LAI ⩽ 1): where f LAI depends on accumulated heat units, following the LPJmL model formulation for wheat (Schaphoff et al 2018). LAI max is determined as the lesser of a carbon-and a water-limited value. The carbon-limited value reflects the requirement that the carbon demand to construct a given LAI be matched by the carbon supply from assimilation. The waterlimited value reflects the requirement that transpiration (closely linked to GPP via stomatal behavior) be matched by the water supply from precipitation. The carbon balance is simulated as: where TB is total biomass production (g biomass m -2 ); ρ is the conversion factor between moles of carbon and grams of biomass, set at 2.5 (g biomass g -1 C); ΣGPP is the accumulated gross primary production (GPP) over the growing season (g C m -2 ); BPE is biomass production efficiency, set at 0.5, the mean value for wheat according to Campioli et al (2015); LMA is leaf mass per area (g biomass m -2 ); and γ is the fractional allocation of total biomass to leaves. We define the growing season as the days with a temperature above a threshold value of 0 • C. We define δ as an intermediate variable: where ΣPAR is the accumulation of the incident PAR over growing season. Substituting equations (1a), (1b), (2), (4a) and (4c) into (4b) yields equation (4d): where LAI c is the carbon-limited maximum LAI and W 0 is the principal branch of the Lambert W function, implemented in the R package lamW.
The water balance is simulated as: where ET is the total evapotranspiration during the growing season (mm); Pre is the total precipitation during the growing season (mm); f ET,pre is the fraction of precipitation that accrues to ET, with an assumed constant value of 0.9; T is the total transpiration during the growing season (mm); f T,ET is the ratio of T to ET, assigned a constant value of 0.7; g s is the canopy conductance to CO 2 ; and χ is the ratio of the leafinternal to ambient CO 2 partial pressure. Substituting equations (1a), (1b), and (5a)-(5c) into (5d) yields equation (5e): where f APARw is the water-limited value of f APAR . Substituting f APARw into equation (2) yields the waterlimited LAI (LAI w ): Theoretically, the maximum value of f APARw is 1, however this might lead to an invalid log transformation of zero. Therefore, the maximum value of f APARw is set at 0.99, which occurs when water supply is ample. Maximum LAI is then taken as the smaller of LAI c and LAI w : LAI max = min (LAI c , LAI w ) .

Biomass allocation
We re-derived the allocation equation of Qiao et al (2020) based on experimental data from the major wheat planting areas (figure S2). Although Qiao et al (2020) assumed that the proportion of accumulated assimilation over growing season allocated to AB was constant, analysis of the larger global data set showed that it varies with moisture, a greater fraction of biomass being allocated below ground under water-stressed conditions ( figure S5). This finding is consistent with findings from several other studies (Franco et al 2011, Poorter et al 2012, Fry et al 2018.
Defining f un as the fraction of total biomass allocated below ground, we fitted a linear regression: where α is a moisture index given by the ratio of actual evapotranspiration to equilibrium evapotranspiration during the growing season, as simulated by the SPLASH model (Davis et al 2017) using climate data from bias-adjusted ERA5 reanalysis data (WFDE5: (Cucchi et al 2020)) and soil information from the Harmonized World Soil Database (HSWD: (Fischer et al 2008)). We assume that moisture is not limiting in irrigated areas, and set α to the maximum value of 1.26 there. Analysis of the data showed that grain yield increases non-linearly with AB (figure S6), consistent with the findings of Qiao et al (2020) for China. We therefore fitted a saturating function for yield (Y) against AB as in Qiao et al (2020), using a mixedeffects model with wheat variety imposed as a random effect: where ε is a generic multiplier, µ i is the random effect of variety, and c, d are fitted parameters.
To obtain potential yield, we assumed that nutrition stresses for wheat growth have been counteracted by fertilization; that water stress for irrigated wheat has been counteracted by full irrigation; and that other factors, including pests and diseases, that can reduce wheat yield have been mitigated by appropriate management practices. However, not all of the data used to derive the saturation function correspond to growth under optimal conditions. To minimize the impact of this uncertainty on the saturation function, we fitted the relationship using the R package qrNLMM to perform quantile regression, and used the 90% quantile to derive the final equation for potential yield: where PY is potential yield (g m −2 ) and AB is aboveground biomass (g m −2 ). The constant 1138.4 represents the saturating value with no growth stresses. The effect of cultivar can be accounted via this constant by refitting the value for specific cultivar, or by considering cultivar as a random effect.