An AgMIP framework for improved agricultural representation in integrated assessment models

Integrated assessment models (IAMs) hold great potential to assess how future agricultural systems will be shaped by socioeconomic development, technological innovation, and changing climate conditions. By coupling with climate and crop model emulators, IAMs have the potential to resolve important agricultural feedback loops and identify unintended consequences of socioeconomic development for agricultural systems. Here we propose a framework to develop robust representation of agricultural system responses within IAMs, linking downstream applications with model development and the coordinated evaluation of key climate responses from local to global scales. We survey the strengths and weaknesses of protocol-based assessments linked to the Agricultural Model Intercomparison and Improvement Project (AgMIP), each utilizing multiple sites and models to evaluate crop response to core climate changes including shifts in carbon dioxide concentration, temperature, and water availability, with some studies further exploring how climate responses are affected by nitrogen levels and adaptation in farm systems. Site-based studies with carefully calibrated models encompass the largest number of activities; however, they are limited in their ability to capture the full range of global agricultural system diversity. Representative site networks provide more targeted response information than broadly-sampled networks, with limitations stemming from difficulties in covering the diversity of farming systems. Global gridded crop models provide comprehensive coverage, although with large challenges for calibration and quality control of inputs. Diversity in climate responses underscores that crop model emulators must distinguish between regions and farming systems while recognizing model uncertainty.
Finally, to bridge the gap between bottom-up and top-down approaches we recommend the deployment of a hybrid climate response system employing a representative network of sites to bias-correct comprehensive gridded simulations, opening the door to accelerated development and a broad range of applications.


Introduction
Integrated assessment models (IAMs) examine the interactions between human systems and the natural environment. IAMs thus explore how societal changes, such as global policies, population growth, socioeconomic development, greenhouse gas emissions, and technological advances, affect land, air, and water resources, as well as the repercussions when these natural resources are strained (Füssel et al 2010, Clarke et al 2014). Agriculture has long been central to the relationship between society and natural systems, providing vital foods, fiber, and energy while drawing heavily on land and water resources. IAMs have traditionally represented agricultural sector changes as exogenous yield changes provided via scenarios aggregated to national or regional level production using current harvested area weights (Müller and Robertson 2014, Nelson et al 2014, Wiebe et al 2015); however, these draw from only a small subset of cutting-edge crop model assessments. A more direct coupling of agricultural responses within IAMs is facilitated by the application of crop model emulators, defined here as computationally-efficient representations of crop model results that capture fundamental responses to climate conditions. Crop model emulators may take several forms (e.g. lookup tables), each estimating yield as a function of climate variables with varying degrees of non-linearity and detail about the specific crop variety, farm environment, weather extremes, and crop model emulated. As these emulators become more complex, the gain in computational efficiency (compared to running the crop model itself) is reduced, and ultimately a crop model emulator is limited by the performance of the crop model or crop model ensemble that it emulates.
Emulators are distinct from statistical crop models, which are trained upon observational data; one advantage of emulators is that they may use principles of biophysical process response to explore environments that have not been observed (such as future climate and land use change). The exact specifications and desired detail of a crop model emulator depend on the IAM to which it is coupled, the intended applications, and the capabilities and coverage of the underlying crop model assessments.
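To make the lookup-table form of an emulator concrete, the following is a minimal sketch in Python. It assumes a small table of relative yield changes indexed by temperature and precipitation change and uses bilinear interpolation between table entries. The axis values, the yield numbers, and the function names are all hypothetical and are not taken from any AgMIP activity.

```python
from bisect import bisect_right

# Hypothetical lookup table: relative yield change (%) from a crop model,
# indexed by temperature change (deg C, rows) and precipitation change (%, columns).
# All numbers are invented for illustration only.
T_AXIS = [0.0, 2.0, 4.0]
W_AXIS = [-50.0, 0.0, 50.0]
YIELD_CHANGE = [
    [-30.0, 0.0, 10.0],    # dT = 0
    [-38.0, -8.0, 3.0],    # dT = 2
    [-50.0, -20.0, -8.0],  # dT = 4
]


def _bracket(axis, x):
    """Return (lower index, interpolation weight), clamping x to the axis range."""
    x = min(max(x, axis[0]), axis[-1])
    i = min(bisect_right(axis, x) - 1, len(axis) - 2)
    w = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, w


def emulate_yield_change(d_temp, d_precip):
    """Bilinear interpolation of the lookup table at (d_temp, d_precip)."""
    i, wt = _bracket(T_AXIS, d_temp)
    j, ww = _bracket(W_AXIS, d_precip)
    low = YIELD_CHANGE[i][j] * (1 - ww) + YIELD_CHANGE[i][j + 1] * ww
    high = YIELD_CHANGE[i + 1][j] * (1 - ww) + YIELD_CHANGE[i + 1][j + 1] * ww
    return low * (1 - wt) + high * wt


if __name__ == "__main__":
    print(emulate_yield_change(1.0, 25.0))
```

Requests outside the table are clamped to its edges here, which is one simple way to avoid extrapolating beyond the conditions the underlying crop model was run for; a real emulator would need a deliberate choice on that point.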
IAMs have much to gain by better incorporating crop responses to changes in carbon dioxide concentration ([CO2]), temperature, water, nitrogen, and adaptation (CTWNA). CTWNA sensitivity simulations can be more useful than projections driven by global climate models (GCMs) as they provide the information basis to construct crop model emulators for use in IAMs in conjunction with climate emulators (e.g. Meinshausen et al 2011, Castruccio et al 2014, Hartin et al 2015). Figure 1 illustrates how this powerful combination improves agricultural sector representation by allowing IAM land use changes and emissions of greenhouse gases and aerosols to influence regional temperature and precipitation changes (using the climate emulator), affecting crop production and requirements (using the crop model emulator) that feed back into the IAM. This also captures agricultural feedback loops, where societal or environmental changes alter the climate and shift agricultural production in a manner that reinforces or diminishes those changes, and unintended consequences when policies in another sector or region impact distant farming systems (potentially through climate responses or through independent mechanisms such as trade).
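The coupling described above can be sketched as a simple loop. The following Python toy is only an illustration of the information flow (IAM land use and emissions, then climate emulator, then crop model emulator, then back to the IAM). Every function, coefficient, and unit in it is a hypothetical placeholder rather than a component of any actual IAM or emulator.

```python
def climate_emulator(cumulative_emissions):
    """Toy climate emulator: warming (deg C) proportional to cumulative emissions."""
    return 0.002 * cumulative_emissions  # hypothetical sensitivity


def crop_emulator(d_temp):
    """Toy crop model emulator: fractional yield change, linear in warming."""
    return -0.05 * d_temp  # hypothetical 5% loss per deg C


def iam_step(yield_factor, demand=100.0, base_yield=2.0, emis_per_area=1.5):
    """Toy IAM rule: cropland area needed to meet demand, and resulting emissions."""
    area = demand / (base_yield * yield_factor)
    return area, emis_per_area * area


def run_coupled(n_steps=5):
    """Iterate the loop and record (step, area, warming, yield factor)."""
    cumulative = 0.0
    yield_factor = 1.0
    history = []
    for step in range(n_steps):
        area, emissions = iam_step(yield_factor)       # IAM: land use and emissions
        cumulative += emissions
        d_temp = climate_emulator(cumulative)          # climate emulator
        yield_factor = 1.0 + crop_emulator(d_temp)     # crop emulator feeds back
        history.append((step, area, d_temp, yield_factor))
    return history


if __name__ == "__main__":
    for row in run_coupled():
        print(row)
```

In this toy configuration the loop is reinforcing: lower yields require more cropland, which raises emissions and warming and lowers yields further. A diminishing loop would arise under different assumed signs or mechanisms.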
The Agricultural Model Intercomparison and Improvement Project (AgMIP; Rosenzweig et al 2013, 2015) was launched in 2010 to provide a common framework and systematic approach for analysis of agricultural challenges. AgMIP connects climate, crop, livestock, and economic models at local, regional, and global scales, allowing multi-model, multi-discipline, multi-scale assessments of agricultural development and food security (Rosenzweig et al 2016, Antle et al 2015). AgMIP mainly utilizes process-based crop models that represent biophysical processes and their responses to genetics, environment, and management over the course of a growing season, with statistical models also included in some efforts. Integrated assessment modelers examining previous crop modeling studies have been challenged to make sense of differing assumptions, methods, and models in addition to the under-representation of agricultural systems beyond the mid-latitude, high-input breadbaskets (White et al 2011, Challinor et al 2014a). AgMIP facilitates more robust and transferable findings based on common simulation protocols, multi-model ensembles, the tracking of uncertainty, and an emphasis on under-simulated farm systems. Great strides in computational power are opening new doors for agricultural model development and application, raising the ceiling for multi-model analyses, new scales of decision support, and more accurate crop model emulators for IAM applications. This article takes stock of the methods used by AgMIP to capture the response of agricultural productivity to changing climate conditions, examining the relative strengths and weaknesses of site, network, and gridded modeling approaches to inform IAMs and related crop model emulators. We then provide a framework for coordinated development and application of agricultural responses drawing value from local to global approaches and linking biophysical and integrated assessments. We conclude with recommendations for priority future work and applications.
Figure caption (fragment): note that studies cover many major production regions, while GGCMI activities simulate the entire land surface.

Survey of crop model outputs germane to IAM emulators
Although AgMIP conducts more than 30 activities (Rosenzweig et al 2015), here we survey activities that (a) test for sensitivity to some or all of the CTWNA factors and utilize (b) multiple agricultural models, (c) multiple sites, and (d) common protocols. These are described in the accompanying tables. Figure 2 presents the geographic coverage of these site, network, and gridded activities.

Site-based approaches
The overwhelming majority of studies in the large literature on crop impacts are site-based studies (White et al 2011, Challinor et al 2014a), but inconsistent protocols, assumptions, geographic sampling, and methods make generalized interpretation of the results difficult. AgMIP's emphasis on model intercomparison and exploration of climate responses drove initial research activities toward species-based assessment at a small number of carefully selected sites. These 'pilot' projects were organized around the application of multiple models on high-quality field datasets (Boote et al 2015, Kersebaum et al 2015) to expose differences in model structure, process responses, data requirements, and input/output formats.
The first crop pilot was organized by the AgMIP Wheat Team, in which 27 modeling groups ran historical simulations and 30 year sensitivity tests for [CO2], temperature, and nitrogen response at sites in the Netherlands, India, Argentina, and Australia (Asseng et al 2013, Martre et al 2015). The Wheat Pilot was open to all interested modeling groups as long as their models were published in peer-reviewed articles.

Activity summary table (surviving rows; ranges listed in CTWNA order)
AgMIP-Livestock and Grasslands Phase 2. Scale: sites. C: 330 to 900. T: −1 to +8. W: −50 to +50. N: -. A: -. Common protocols for single model tests at 14 sites. Seven models contributed yield and GHG balance results.
AgMIP Regional Integrated Assessments. Scale: sites. C: 360 to 720. T: −2 to +8. W: −75 to +100. N: 0 to 210 kg N ha−1. A: -. Two models each for ten sites, multiple crops at many of the sites.
MACSUR-IRS Phase 1. Scale: sites. C: -. T: −2 to +9. W: −50 to +50. N: -. A: -. 26 wheat models at four sites in Europe.
MACSUR-Crop Rotation. (Row values missing.)

Crop pilot table (surviving rows)
AgMIP-Wheat
Phase 1: Protocol-based multi-model intercomparison at diverse, high-quality sites. Included limited information and full information calibration settings.
Phase 2a (Asseng et al 2015a, 2015b): Protocol-based multi-model analysis of temperature response at Hot Serial Cereals artificial heating experiment in Arizona and temperature responses in Mexico.
Phase 2b (Asseng et al 2015a, 2015b, Liu et al 2016): Intercomparison of temperature responses across 30 sites selected as a representative network of well-watered wheat production regions around the world.
Phase 3 (in progress): Intercomparison of temperature responses across 60 sites selected to represent both well-watered and water-limited wheat production regions around the world.
AgMIP-Maize
Phase 1 (Bassu et al 2014): Protocol-based multi-model intercomparison at diverse, high-quality sites. Included limited information and full information calibration settings.
Phase 2 (Durand et al 2017): Protocol-based multi-model intercomparison at Free-Air Carbon Enrichment (FACE) site in Germany.
AgMIP-Rice
Phase 1 (Li et al 2015): Protocol-based multi-model intercomparison at diverse, high-quality sites. Included limited information and full information calibration settings.
Phase 2 (in progress): Protocol (entry truncated).

Subsequent pilots and phases are summarized in table 2. CTWN sensitivity experiments also form a key component of AgMIP's regional integrated assessments at sites across South Asia and Sub-Saharan Africa (Rosenzweig et al 2017).

Strengths and weaknesses of site-based approaches
Intensive, multi-model intercomparisons at high-quality pilot field sites are a critical first component of model evaluation, yielding valuable insight into process responses, structural biases, data requirements, and performance across contrasting systems. These analyses are anchored in field data that enable validation of state variables (e.g. leaf-area index; above-ground biomass and N contents; plant-available soil moisture) across a number of phenological stages as well as end-of-season characteristics (e.g. grain yield and protein content, harvest index). This allows evaluation of the mechanisms by which plants respond to environmental changes, highlighting sensitive biophysical processes and growth stages that in turn help focus climate projections on fundamental stresses (e.g. drought in reproductive stages; heat stress at anthesis).
Results demonstrate that multi-model ensembles consistently outperform individual models when evaluated across variables and sites. Site-based assessments from the initial AgMIP Pilots are limited in their application to IAMs as they cover only a small number of sites and farming systems. As expected, crops responded differently at the selected sites owing to unique soils, weather, cultivars, and farm management. Additional careful sampling of interactions across the broader CTWNA space is needed.

Network-based approaches
As AgMIP protocols were developed and tested on individual sites, the next step scaled up these approaches through larger networks of sites coordinated to ensure adherence to a common protocol that enables direct comparison. C3MP's open, voluntary submission process produced an unprecedented number and diversity of contributed simulation sets but also challenges in analyses. The result is a network of voluntary 'crowdsourced' responses rather than a designed plan of geographic coverage, representative sites, or multi-model analyses. Nevertheless, C3MP's wide ad hoc network covers most major agricultural lands and features models calibrated with site-specific information (figure 2). Sampling across all submitted results for a given category of system (e.g. rainfed maize) provides CTW response surfaces isolating the common yield response across a broad sampling of sites and systems as well as uncertainty stemming from model, soil, baseline climate, cultivar, and farming system differences (McDermid et al 2015a). Recognizing that IAMs typically track major crops (wheat and rice) and commodity groups (e.g. oil seeds, coarse grains, sugar crops, fruits and vegetables), C3MP's relatively large number of crop species also reduces the amount of crop response mapping that is required to represent climate responses across the diversity of agricultural commodities. C3MP is particularly useful in distinguishing responses within a commodity group (for example, differentiating between millet, sorghum, and maize responses for coarse grains).

Wide ad hoc network approach
Aggregation of the C3MP archive to global production responses is challenging given geographic gaps and under-represented systems, and vetting is difficult given its reliance on prior model calibration and a skew toward common crop models (challenges also encountered in the Challinor et al 2014a meta-analysis). We recommend that C3MP analyses exclude simulation sets that use antiquated model versions, as well as a small percentage of flagged sites where low historical yields indicate farming systems that are not presently viable. In some cases, these were conducted as tests of land uses that may become viable in wetter and high-[CO2] futures, but they must be considered distinct from broader CTW analyses. C3MP remains an open process, and each new submission increases the robustness of ensemble statistics and analyses.

Representative network approach
AgMIP Wheat Phase 2b created a global network of 30 well-watered sites selected to represent major wheat systems and regional production areas (irrigated and high-rainfall wheat crops contribute ∼70% of global production; see figure 2) (Asseng et al 2015a). 30 wheat models are configured for simulation of CT responses at each site, allowing robust ensemble projections and uncertainty analyses (Wallach et al 2015, 2016).

Strengths and weaknesses of representative networks
The AgMIP Wheat Team network is distinct from C3MP's ad hoc network in that its design allows multi-model assessment on major regional production systems that together generate the large majority of global wheat production (table 2). Simulated relative impacts are applied to recent FAO country production statistics associated with each simulated location to upscale to global production impacts (Asseng et al 2015a, Liu et al 2016). Even with 30 sites, the network is limited in its spatial coverage and individual sites may not reflect conditions in the broader production regions they represent. The network is concentrated in high-production zones and is likely to miss important responses in areas that were not simulated (AgMIP-Wheat Phase 3 will fill some of these gaps for water-stressed systems). As a simple metric of comprehensiveness of coverage, figure 3 shows how the rainfed and irrigated wheat networks from C3MP and AgMIP-Wheat Phase 2 cover wheat-growing climate conditions as compared with the global Monthly Irrigated and Rainfed Crop Area dataset. Both networks are most dense in climate zones that are prominent for wheat production; however, the larger C3MP network also includes less common climates for rainfed wheat and samples more from the tails of the irrigated wheat distribution than does AgMIP-Wheat. By simulating more of the cool and wet tails, it is likely that C3MP captures more farms that potentially benefit from increases in temperature or are less vulnerable to decreases in precipitation.
Regions with high levels of diversity are difficult to capture given limitations in representative site networks. Sentinel crop modeling sites are often calibrated with data from field experiment datasets designed to highlight potential genetic, fertilizer, water, or pest control treatments, and therefore may not be representative of prevailing agricultural systems within that production region. These site networks tend to be more useful when examining the percentage yield response to a given climate change; this metric has proven robust even in the face of persistent bias in mean regional yields (Challinor et al 2014b, Asseng et al 2015a).
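The upscaling step used with the representative network, in which simulated relative impacts are applied to the production associated with each simulated location, amounts to a production-weighted average. The Python sketch below illustrates that arithmetic under the assumption that each site carries a single associated production figure; the function name and the numbers are hypothetical.

```python
def global_production_change(sites):
    """Production-weighted relative change in total production.

    sites: list of (relative_yield_change, associated_production) tuples,
    where relative_yield_change is a fraction (e.g. -0.06 for a 6% loss) and
    associated_production is the production attributed to that site.
    """
    total = sum(production for _, production in sites)
    changed = sum(production * (1.0 + change) for change, production in sites)
    return (changed - total) / total


if __name__ == "__main__":
    # Hypothetical sites: (relative impact, associated production in Mt)
    example = [(-0.06, 100.0), (-0.02, 50.0), (0.01, 50.0)]
    print(global_production_change(example))
```

Because only relative impacts enter the calculation, a persistent bias in simulated mean yield at a site does not propagate directly into the aggregate, which is consistent with the robustness of the percentage yield response metric noted above.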

Global approaches
Advances in high-performance computing have allowed crop models to enter a new phase of development that is nearly unconstrained by computational limitations. While IAMs are typically run on desktop computers or simple clusters, the 18 modeling groups participating in AgMIP's Global Gridded Crop Modeling Intercomparison (GGCMI; table 3) use parallel computing and advanced data processing pipelines to conduct protocol-based simulations on a 0.5° × 0.5° global grid (Rosenzweig et al 2014, Elliott et al 2015), with higher resolution gridded studies in the works. These outputs therefore form a desirable basis for more computationally-efficient IAM application through emulators. GGCMI Phase 2 performs a systematic analysis of CTWNA sensitivities for rainfed and irrigated maize, rice, wheat, and soybean with consistent climate information and harmonized planting dates. Adaptation is examined by shifting cultivars to maintain the growing period even as warmer temperatures accelerate phenological development, thus offsetting some yield losses from climate change. GGCMI's fast-track results (table 3) are harder for crop model emulators to use: it is difficult to disentangle fundamental responses from these outputs given the many types of changing and interacting climate conditions (e.g. mean temperatures and rainfall; subseasonal variations; extreme events). Emulation is also complicated by the inclusion of responsive adaptations allowing management to evolve with climate change in some participating models (Rosenzweig et al 2014, supplementary).

Strengths and weaknesses of global approaches
GGCMI Phase 2 findings indicate considerable spatial variation in CTWNA response across different environments and farm systems, exemplified by the response of rainfed maize to higher [CO2] and temperature in the parallel-DSSAT crop model (pDSSAT; Elliott et al 2014) (figure 4). These results provide a convenient basis for the construction of crop model emulators, and can also be connected to economic and/or resource availability drivers from IAMs to dynamically characterize the evolution of socioeconomic yield gap factors such as fertilizer use, irrigation, and adaptation potential.
In contrast to the site networks, GGCMs rely on gridded soil, genetic, management, and weather datasets designed to capture spatially-averaged conditions rather than conditions on a particular farm (Elliott et al 2015). While the 0.5° × 0.5° spatial resolution used within GGCMI is finer than many GCMs, a grid cell on the equator represents >310 000 ha and thus poses a challenge for comprehensive farm system calibration.
GGCM results are often evaluated using regional yield and production reports, with trend adjustment recommended in recognition of technological development and processes that are not explicitly modeled such as pests, diseases, and widespread flooding (Müller et al 2017). Analogously, statistical crop response models are occasionally fitted to similar aggregate yield data that may reflect embedded abiotic factors (e.g. Lobell et al 2011). Bias-adjustment is recommended for GGCM application in IAMs, similar to common practices for climate model output applications.

Emergent characteristics and opportunities from CTWNA simulations
AgMIP site, network, and gridded results demonstrate that multi-model ensembles outperform individual models when analyzed across multiple sites and evaluation variables (e.g. Asseng et al 2013). Liu et al (2016) found relative agreement in wheat response to a 1 °C rise in global temperature, with multi-model ensembles in the well-watered AgMIP Wheat network, GGCMI's ISIMIP fast-track, and several statistical model approaches finding 4.1%-6.4% declines in global production.
Uncertainties in input data indicate that there is still room for harmonization that will improve consistency, as illustrated by a comparison of growing seasons at the well-watered AgMIP-Wheat network sites and corresponding GGCMI grid cells (figure 5). Uncertainty owing to model structure and parameters remains substantial, and differences in CTWNA responses by two modelers using the same DSSAT model within the MACSUR IRS and AgMIP-Wheat Phase 1 also highlight the potential role of modeler uncertainty stemming from assumptions and subjective decisions made in the absence of supporting data (Pirttioja et al 2015, Confalonieri et al 2016). We therefore advise applications to recognize the uncertainty in model-based responses through the use of emulators derived from multiple models or an imposed error term scaled to model-based uncertainty. Evidence across AgMIP activities also recommends avoidance of universal yield functions in favor of yield response functions fitted to broad agro-ecological zones and farming systems (e.g. defined by fertilizer and irrigation inputs).
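One way to act on the advice above is to carry an explicit error term alongside the central emulator response. The Python sketch below assumes that each crop model has already been reduced to a simple quadratic temperature response, and it takes the ensemble mean as the central estimate and a multiple of the between-model standard deviation as the error term. The coefficients, model names, and the scaling parameter `k` are hypothetical choices for illustration.

```python
from statistics import mean, stdev

# Hypothetical emulator coefficients for three crop models: yield change (%)
# as a quadratic in warming, dy = a * dT + b * dT**2. Values are illustrative.
MODEL_COEFFS = {
    "model_a": (-4.0, 0.0),
    "model_b": (-6.0, 0.0),
    "model_c": (-5.0, -0.5),
}


def model_responses(d_temp):
    """Yield change (%) from each model-specific emulator at a given warming."""
    return [a * d_temp + b * d_temp ** 2 for a, b in MODEL_COEFFS.values()]


def ensemble_emulator(d_temp, k=1.0):
    """Return (lower, central, upper) yield change (%).

    The central value is the multi-model mean; the bounds add and subtract
    k times the between-model sample standard deviation (the error term).
    """
    values = model_responses(d_temp)
    centre = mean(values)
    spread = k * stdev(values)
    return centre - spread, centre, centre + spread


if __name__ == "__main__":
    for d_temp in (1.0, 2.0, 3.0):
        print(d_temp, ensemble_emulator(d_temp))
```

In practice a separate set of coefficients would be held for each agro-ecological zone and farming system rather than one universal function, in line with the recommendation above.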

AgMIP framework for improved agricultural representation in IAMs
A cascading pathway of development underlies agricultural representation in IAMs, forming a framework that may be used to drive coordinated development of 'simulation levels', here defined as common communities of development including site-based crop models, network and gridded models, crop model emulators, and eventual IAM applications (figure 6). Close collaboration and regular updates between site, network, and gridded crop modelers, emulation experts, and IAM groups are needed to keep agricultural impact applications on the cutting edge, to facilitate the use of multiple models, to incorporate understanding from multiple modeling groups, and to avoid the propagation of known biases.
Figure 6. AgMIP framework for improved agricultural representation in IAMs. The core agricultural response development and application pathway (green arrows) spans several levels of model applications (dark blue boxes) and recognizes that site-based crop models are the backbone of model networks and grids, which feed into IAMs either directly or through crop model emulators built upon a hybrid system blending network and gridded CTWNA responses. Improvement in each level of model development requires access to data for evaluation and configuration (gray boxes) as well as methodological advances (light blue boxes). Agricultural applications also inform development up the framework chain, with IAMs providing critical information about the economic viability of changing land use patterns, emulators helping to isolate aggregate CTWNA responses, and networks and grids testing site-based models in more diverse settings.
Each simulation level in the AgMIP Framework benefits from improved data access and innovations in core methodologies. Investment in research and development is well served by matching the design, capabilities, and development priorities of models and tools at each level in figure 6. In particular, new biophysical process understanding is best developed within site-based models using field experiment data, particularly for under-sampled agro-ecological zones, crop species, and farming systems under various intensifications (Challinor et al 2015, Maiorano et al 2017).
Networks and gridded models gain from new datasets that allow extensive configuration for many sites and systems, and have tremendous potential to apply advanced bias-correction and aggregation approaches (Challinor et al 2014, van Bussel et al 2016, Hoffmann et al 2015, 2016, Zhao et al 2015, 2016). Crop model emulators advance with improved statistical efficiencies and the availability of observed agricultural response data for evaluating strengths and weaknesses. In addition to the potential benefit of adding improved crop model emulators, IAM simulations of long-term shifts in agricultural production are furthered by good data on current systems and advanced representation of the implications of agricultural investment and technological development.
The AgMIP framework for improved agricultural representation in IAMs is non-linear, as lower simulation levels build upon advances higher up in the framework and higher levels also receive critical feedback from downstream simulation levels. Among the pathways of upstream improvement, assessments of improved models on established grids and networks provide vital feedback for site-based model development on diverse sites. Likewise, emulators often spotlight key sensitivities and uncertainties that may spur further site-based model development and the creation of more representative networks. Network and gridded studies examine the biophysical viability of various simulated farm systems to determine land use pressures, but benefit tremendously by incorporating information on economic viability and resource constraints that IAMs can provide. It is also important to note that many of these simulation levels have extensive applications beyond agricultural representation in IAMs, and that the key bottleneck for one application may differ from another's crucial development priority.

Priority future development and applications
Analysis of the multi-model, multi-site climate sensitivity datasets reviewed in this study suggests that IAMs and other large-scale applications would be well served by the creation and systematic development of a hybrid CTWNA response system that blends the strengths of network and gridded approaches (as noted in figure 6). This hybrid response system would be rooted in (1) detailed process understanding across a representative network of well-calibrated field sites (ideally using field data from prevailing management systems) combined with (2) comprehensive CTWNA coverage from gridded models. Baseline responses generated by these gridded models could initially be compared against the corresponding representative network simulations to assess methodological uncertainty and calculate bias-correction factors. Bias-corrected gridded results could then provide an information basis for crop model emulators and IAM applications, characterizing different farming systems using nitrogen and water components of the CTWNA analysis. Table 1 highlights that progress toward the creation of this hybrid response system is most advanced for wheat, given the AgMIP-Wheat Phase 2b representative network and spring and winter wheat simulations within GGCMI Phase 2. In contrast, soybean is simulated in GGCMI but has not yet been the focus of site- or network-based CTWNA analysis, and a number of other important commodities merit inclusion. Coordinated and systematic development of the hybrid response system would foster rapid iterative improvements, as research groups improve the hybrid framework by contributing new process understanding, field sites, model runs, regional configuration information, or statistical approaches.
An expanded representative network of models and a fully configured high-resolution gridded (or geo-referenced polygon) model will eventually be interchangeable; however, this hybrid response system provides current state-of-the-art responses and a practical roadmap for applications.
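The bias-correction step of the hybrid response system can be illustrated with a short Python sketch. It assumes a simple multiplicative correction: for each region that contains a representative network site, the factor is the ratio of the network baseline yield to the gridded baseline yield, and that factor is then applied to the gridded scenario results. The multiplicative form, the region keys, and the fallback factor of 1.0 for regions without a site are assumptions made for illustration; the text above does not prescribe a specific correction method.

```python
def bias_correction_factors(network_baseline, gridded_baseline):
    """Ratio of network to gridded baseline yield for regions present in both."""
    return {
        region: network_baseline[region] / gridded_baseline[region]
        for region in network_baseline
        if region in gridded_baseline
    }


def apply_correction(gridded_scenario, factors, default=1.0):
    """Scale gridded scenario yields; regions without a site keep `default`."""
    return {
        region: value * factors.get(region, default)
        for region, value in gridded_scenario.items()
    }


if __name__ == "__main__":
    # Hypothetical baseline yields (t/ha) by region
    network = {"A": 6.0, "B": 3.0}
    gridded = {"A": 5.0, "B": 4.0, "C": 2.0}
    factors = bias_correction_factors(network, gridded)

    # Hypothetical gridded yields under one CTWNA perturbation
    scenario = {"A": 4.5, "B": 3.6, "C": 1.8}
    print(apply_correction(scenario, factors))
```

Leaving uncovered regions uncorrected is the simplest possible choice here; borrowing factors from sites in a similar agro-ecological zone or farming system would be a natural refinement as the representative network expands.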
Coordination across AgMIP activities supports the development of linked global and regional assessments to address agricultural sector challenges and food security (Rosenzweig et al 2016). Inclusion of IAMs would bring these to a new level, although it is critical that these account for lingering model uncertainty and data gaps even as these are addressed through the coordinated development of agricultural response in linked models.