Simulating long-term carbon nitrogen and phosphorus biogeochemical cycling in agricultural environments Science of the Total Environment

• A C-N-P model simulating transition from natural to agricultural land-use • The model is compared with data from long-termexperimentalplotsinEurope. • Agriculture causes signi ﬁ cant changes to nutrient cycling. • Modern C and N stocks remain strongly in ﬂ uenced by initial weatherable P. • Results indicate a missing process or source of P in current understanding. Understandinghowagriculturalpracticesalterbiogeochemicalcyclesisvitalformaintaininglandproductivity,food security, and other ecosystem services such as carbon sequestration. However, these are complex, highly coupled long-term processes that are dif ﬁ cult to observe or explore through empirical science alone. Models are required that capture the main anthropogenic disturbances, whilst operating across regions and long timescales, simulating both natural and agricultural environments, and shifts among these. Many biogeochemical models neglect agricul- tureor interactionsbetween carbonandnutrientcycles,whichissurprisinggiven thescaleofinterventioninnitro-gen and phosphorus cycles introduced by agriculture. This gap is addressed here, using a plant-soil model that simulates integrated soil carbon, nitrogen and phosphorus (CNP) cycling across natural, semi-natural and agricul-turalenvironments.Themodelisrigorouslytestedbothspatiallyandtemporallyusingdatafromlong-termagricul- turalexperiments acrosstemperateenvironments.Themodelprovedcapableofreproducing themagnitudeofand trends in soil nutrient stocks, and yield responses to nutrient addition. The model has potential to simulate anthro- pogenic effects on biogeochemical cycles across northern Europe, for long timescales (centuries) without site-speci ﬁ c calibration, using easily accessible input data. The results demonstrate that weatherable P from parent ma- terial has a considerable effect on modern pools of soil C and N, despite signi ﬁ cant perturbation of nutrient cycling from agricultural practices, highlighting the need to integrate both geological and agricultural processes to under- stand effects of land-use change on food security, C storage and nutrient sustainability. The results suggest that an important process or source of P is currently missing in our understanding of agricultural biogeochemical cycles. The model could not explain how yields were sustained in plots with low P fertiliser addition. We suggest that


H I G H L I G H T S
• A C-N-P model simulating transition from natural to agricultural land-use • The model is compared with data from long-term experimental plots in Europe. • Agriculture causes significant changes to nutrient cycling. • Modern C and N stocks remain strongly influenced by initial weatherable P. • Results indicate a missing process or source of P in current understanding.

G R A P H I C A L A B S T R A C T
a b s t r a c t a r t i c l e i n f o

Introduction
Agriculture has been a major long-term driver of biogeochemical cycle change. Globally, approximately 37% of the land surface is used for agriculture, and much of this land has been farmed for hundreds of years following the steep rise in demand for food coupled with industrialisation in the 17th and 18th centuries (FAO, 2017). Land-use change and the introduction of new crops, rotations and mechanisation during this time have significant impact on the cycling of carbon, and other nutrients. These disturbances have accelerated following the Green Revolution in the 1950s and the widespread introduction of nitrogen and phosphorus fertilisers, and increased conversion of land to agriculture (FAO, 2018).
Developing our understanding of how agriculture has modified biogeochemical cycles is critical as these alterations have consequences for plant productivity and food security, climate regulation and water quality and availability. If we are to manage and mitigate these consequences, there is a need to develop understanding that: i) Considers interactions between nutrient cyclesparticularly carbon, nitrogen and phosphorus (C, N and P) -as all three have been highly modified by anthropogenic land-use change and land management practices ii) Addresses long-term and large-scale effects, given the widespread nature of agriculture, and that the majority of this land will have been experiencing disturbances for decades/centuries iii) Encompasses both land-use change and agricultural management practices Here, we present a new modelling framework that addresses these three aspects, and we apply this model to a range of long-term sites across Northern Europe providing a test of the model and an exploration of the long-term effects of agriculture on C, N and P cycles.
There has been a great deal of research focusing on C cycle changes in agricultural systems, driven by the importance of carbon in climate regulation, and the links made between soil C and crop productivity (Lal, 2005). The need for improved understanding and quantification of C cycles in agricultural systems continues to grow with the rise of initiatives such as '4 per mille' that aim to increase C stocks in agricultural soils to mitigate climate change (Chambers et al., 2016;Van Groenigen et al., 2017;Poulton et al., 2018). However, given the interaction between C, N and P cycles, and the magnitude of perturbation of N and P in agricultural systems, understanding the control that these nutrients have over the C-cycle is important.
Many empirical studies have observed significant reductions in soil C as a result of conversion of land to agricultural uses (Lal, 2004;Smith et al., 2016;Wei et al., 2014), but there is considerable variability in the response of soil C due to varying soil properties, land management practices, and climate regimes (Bruun et al., 2013;Wei et al., 2014;Doetterl et al., 2015). Long-term experiments can provide valuable insights into long-term effects, but there are relatively few long-term datasets, and multiple drivers at these sites are often changing at once. Modelling can help understand and decouple drivers, provide valuable insights into controls on long-term change for these purposes.
Biogeochemical modelling of agricultural systems to date has predominantly focused on the cycling of C (RothC - Coleman and Jenkinson, 1996) or integration of C and N (DAISY - Hansen et al., 2012;DNDC -Li et al., 1992;SPACSYS -Wu et al., 2007). Models that integrate C-N-P cycling provide a more comprehensive means of estimating future changes in soil biogeochemistry, and are required to understand feedback processes and system response to environmental change (Achat et al., 2016). However, existing C-N-P models mainly focus on natural and semi-natural ecosystems (N14CP - Davies et al., 2016a;CASACNP -Wang et al., 2010;ECOSSE -Smith et al., 2010;JSBACH-CNP -Goll et al., 2012;CLM-CNP -Yang et al., 2014). Biogeochemical models of heavily modified agroecosystems exist, however the high temporal resolution of these models limits their application to long-term simulations (of 100 s of years or more) due to computational constraints (Crop-DNDCdaily, Zhang et al., 2002;EPICdaily, Causarano et al., 2008;DAISYhourly or finer, Hansen et al., 2012) making them unsuitable for assessing the impacts of land-use changes. The latest versions of models such as CENTURY (Parton et al., 1998); an integrated C-N-P model with a later version including agricultural representation (CENTURY.v4 -Metherell et al., 1993) and Roth-CNP model (Muhammed et al., 2018) meet these requirements, yet to-date applications of these models do not simulate land-use change from natural to agricultural use (Bortolon et al., 2011;Cong et al., 2014;Probert et al., 1995).
Evaluation of models with observational data is of key importance to provide confidence in model outputs. Several previous studies of soil biogeochemistry either test models spatially (Davies et al., 2016a) or temporally (Davies et al., 2016b;Li et al., 2017;Zhang et al., 2016) only. Few studies combine both spatial and temporal model testing (examples include Cong et al., 2014), although these are often limited to a small number of sites, and short temporal scales. If regional-scale estimates of future biogeochemical change are the aim, then model evaluation through space and time is important.
In this study, we aim to address these gaps in our ability to understand and predict long-term and large-scale changes in biogeochemical cycles resulting from agricultural land use change and management practices. We present an integrated C-N-P model with the ability to simulate both natural and agricultural land uses and transitions among these. The new N14CP-Agri model includes representations of agricultural practices and estimation of crop yields. We assess the performance of this model through blind-comparison with yield, soil organic carbon, nitrogen and Olsen-P data pertaining to 62 plots from 11 long-term experimental arable and improved grassland sites situated in Northern Europe. This provides a test of the model's ability to simulate nutrient cycle interactions, and its suitability for use over long timescales and regional spatial scales. It also provides an opportunity to formally test the ability of our current conceptual understanding of C-N-P cycles to account for plant-soil observations across a wide range of management, climatic and soil conditions and histories, which we address in our discussion.

Extending the N14CP model for agricultural ecosystems
The N14CP model, implemented within MATLAB, originally developed by (Davies et al., 2016a;Tipping et al., 2012) integrates C, N and P cycling in semi-natural environments. The model tracks C, N and P pools in total vegetation biomass, topsoil (upper 15 cm of soil) and subsoil layers on a quarterly (3 monthly) time-step. It has previously been applied in temperate semi-natural systems (broadleaf and coniferous forests, rough grasslands and heathlands) at site-scale (Davies et al., 2016a), and national scale to investigate the long-term effects of atmospheric deposition of N in semi-natural system on C storage and NPP (Tipping et al., 2017(Tipping et al., , 2019, and to a grazed unfertilised site (Davies et al., 2016b). An overview of the model is provided in Fig. 1.
The model simulates net primary productivity based on the most limiting factor out of temperature, precipitation, available N or available P. The nutrient availability as defined by Davies et al. (2016a), is modified to take into account increased nitrogen fixation in a number of agricultural PFTs and fertiliser additions.
For legume crops a representation of N fixation by root nodules was included. Fixation of N is a complex process controlled by soil temperature, moisture, and available nutrients (Wu & Mcgechan, 1999;Vitousek et al., 2002). Soil N fixation within N14CP is currently determined as a function of P availability, and is down-regulated by N deposition. Nitrogen fixation rates by legumes were selected based on literature values (see supplementary information S2, tables 6 & 7, Cowling, 1982;Crush et al., 1982;Archer, 1988;Bremer et al., 1988;Peoples et al., 1995;Ledgard, 2001;Carlsson and Huss-Danell, 2003;Crawley et al., 2005;Hauggaard-Nielsen et al., 2009;Jennings, 2010). As improved grasslands contain leguminous species, representation of N fixing within this PFT has also been introduced within this study based on literature values (see supplementary information S2).
Fertiliser addition, alongside other land management practices such as tillage, grazing, and harvesting were incorporated in the model on a quarterly timestep. Fertilisers were added directly to the available pool (see Fig. 1). The effects of tillage on organic matter decomposition were represented by increasing decomposition rate constants for fast, slow, and passive pools by a factor KPlough. Grazing animals were assumed to consume 60% of the above ground biomass, with 25% of C returning to the soil and 75% of N and P returned (similarly to Ball and Ryden, 1984;Tyson et al., 1990;Jarvis, 1993;Lemaire and Chapman, 1996;Haygarth et al., 1998;Soussana et al., 2010).
For crops that are harvested, a fixed proportion of the biomass is assumed to be removed based on typical harvest indexes (HI) for the crop, which represent the proportion of above-ground biomass that is harvested. Values for HI for each arable plant functional type grouping (Cereal, Root, Legume and Oilseed) were estimated by collating HI indices from literature values and averaging across plant type (supplementary information S1 table 5, Bélanger et al., 2001;Donald and Hamblin, 1976;Hay, 1995;Jefferies and Mackerron, 1993;Li et al., 2014;Lobell et al., 2002;Osaki et al., 2012;Prince et al., 2001). The remainder of the plant is assumed to die after harvest and is added to the litter pool. We then estimate yield from the biomass that was harvested. Previous studies have estimated net primary productivity (NPP) from yield (Hicke et al., 2004;Jaafar and Ahmad, 2015;Li et al., 2014;Lobell et al., 2002;Monfreda et al., 2008;Prince et al., 2001). Yield can be estimated from NPP (g C m −2 yr −1 ) in the same way: Where yield is in g m −2 yr −1 in dry mass, AGB is the Above-Ground Biomass fraction (or below-ground biomass fraction for root crops), and FC is the C content of dry matter. FC was set as 0.45 g C g −1 dry matter (Hicke et al., 2004;Jaafar and Ahmad, 2015;Monfreda et al., 2008). For arable crops, the model separates nutrients assimilated through NPP into those available for removal via cropping (above ground biomass, or all biomass for root crops) and those left in the soil system (as litter) post-harvest. To ensure mass balance of nutrients within the model, yield is estimated as: Where C Removed is the total C removed through harvest.

Data for model calibration and testing
Observational data for soil organic C, total N and P pools, soil pH and crop yield were collated from several long-term agricultural soil experiments in the UK and Northern Europe ( Fig. 2 and supplementary information S3 table 8). Each site has multiple plots with varying land management e.g. fertiliser application rate. Experimental sites were selected to include a range of temperature and precipitation patterns representative of the region (see Fig. 2). The sites included tilled arable and non-tilled grassland treatments. Average annual fertiliser application rates ranged from 0-40.5 and 0-35.0 g m −2 yr −1 of N and P respectively. Crop types include cereals (wheats, barley, rye, oats), roots (potatoes, beetroot, turnip), oilseed (oilseed rape) and legumes (peas, clover). In total, 62 plots from 11 experimental sites were simulated, plots selected had a minimum of two observations of topsoil C, N, and yield. Several plots also had observed topsoil P, measured using the Olsen P method (representing inorganic P and a portion of the organic P), meaning available topsoil P data from the sites was not directly comparable to modelled outputs.
Mean temperature and total precipitation data at quarterly timestep for each site from 1901-2015 were calculated using monthly CRU TS4.00 (Climatic Research Unit Time-series version 4.00) modelled data (Harris et al., 2014;Harris and Jones, 2017). Mean annual temperature (MAT) prior to 1901 was temporally varied using an anomaly based on Davis et al., 2003(similarly to Davies et al., 2016a, and mean annual precipitation (MAP) was held constant at the 30 year average for 1901-1931. Observed N deposition data was used for sites where this was available, and for other sites this was estimated from EMEP modelled deposition (MET Norway, 2016). A sequence of historic variation from the mean was applied (similar to Schöpp et al., 2003) to estimate past N deposition.
Land-use histories for the simulation period (10,000 BCE to present) were determined for each site using site information available in the literature. Where histories were incomplete, forest clearance was assumed to have occurred when b50% of usable land remained forested, as per estimates by Kaplan et al. (2009) based on population size and land suitability for agriculture and pasture.
Harvesting, tillage, and fertiliser application at each site were also set based on literature information (see supplementary information table 8 for further details). For sites where the timing of arable harvests was not recorded it was assumed that harvest occurred in the 3rd quarter of the year, and that 95% of above-ground biomass (accounting for 5% remaining as stubble) was removed unless otherwise stated. Three of the experimental sites (Askov, Darmstadt and Broadbalk) were arable before the start of the long-term experiments, but lack information on manuring rate. It was assumed that arable land during these periods would have received fertiliser in the form of animal or human manure, so a constant rate of N and P addition (0.72 and 0.44 g m −2 yr −1 respectively) was applied to these periods similarly to Muhammed et al., 2018.

Model parameterization and testing
Parameter values for processes that were included in the original N14CP were unchanged (Davies et al., 2016a). Parameters for new PFTs (including C:N:P ratios, above and below ground biomass fractions and harvest indices) were set based on literature values (see 2.1 above and supplementary information table 4).
Whilst several agro-ecosystem models simulate the impact of tillage on soil physical properties, few models consider changes in biogeochemical properties (Maharjan et al., 2018). The KPlough parameter within the agricultural representation was set at a value of 3.0 (i.e. simulated decomposition rates in tilled soils were increased threefold) based on the DNDC model (Li et al., 1994) which  (Birkhofer et al., 2008;Christensen, 1988Christensen, , 1997Christensen et al., 2006;Elliott and Thomas, 1934;Esperschütz et al., 2007;Fließbach and Mäder, 2000;Heinze et al., 2010;Heitkamp et al., 2009;Hopkins et al., 2009, Jacobs et al., 2009Jensen et al., 2017;Kidd et al., 2017;Kirchmann et al., 1994;Kirkham et al., 2014;Korsaeth, 2012;Korsaeth and Eltun, 2008;Maqsood et al., 2013;Natural England, 2015 simulates increased decomposition rates of up to 3 times where ploughing occurs. The initial pool of weatherable P, PWeath 0 , which is an initial condition was at first set to the value used for non-podzols and non-rankers in Davies et al. (2016a) and Tipping et al. (2017). Within the model, P enters the plant-soil system from this fixed initial weatherable pool (PWeath 0 ) based on an annual release (ΔPWeath) which is determined by a temperature dependent first-order rate constant (k PWeath ); Where F T is the fraction of the year with a mean quarterly temperature above 0 (see Davies et al., 2016a for further details). ΔPWeath enters the topsoil layer and is spread evenly over the quarterly time steps, contributing to available soil water P. In semi-natural applications of the model, Davies et al., 2016a demonstrated that contemporary stocks of soil organic C, N and P are strongly conditioned by PWeath 0 . To investigate whether this also applied to agricultural settings, a second parameterization was performed whereby PWeath 0 was allowed to vary between 50 and 1000 g m −2 for each site, to minimise a cost function penalising the error between model outputs and observations. Since the PWeath 0 parameter represents initial weatherable P when the soil begins to develop, in principle the value should be similar for all plots on a given site, so this parameter was calibrated on a site basis rather than a plot basis. We compared the model against all long-term experiment data described in 2.2, analysing model outputs within MATLAB. Where the default values of PWeath 0 are used, this essentially forms a blind test of the model as parameters are not calibrated to the site data. Where PWeath 0 is varied, this is not a blind-test, but an exploration of the relative control that this initial condition can have over multiple outputs across C, N and P cycles.

Results
The performance of the model was assessed through comparison with data across each of the 62 plots, from the 11 experimental sites. Here, we present results in 3 ways in order to address the aims of the paper. Firstly, comparison by output variable (across all plots and time points) to assess the ability of the model to consider interactions between nutrient cycles. Secondly, by land management practice, to evaluate the performance of the model across a range of agricultural management types. Finally, temporally, to evaluate the model performance simulating soil change over time. 3.1. Model results for all sites: default P weathering initial conditions The left column of Fig. 3 shows the observed values and model simulated values for key output variables (topsoil organic C, total N, C:N ratio and dry matter yield) for all data points prior to site-based calibration of the PWeath 0 . Mean observed and simulated values are of the same magnitudes, and R 2 values are all significant (p b 0.001) with the exception of C:N ratios. Observed values of Olsen P show a significant correlation with simulated topsoil organic P (R 2 0.18 p b 0.001, but a stronger correlation with simulated topsoil inorganic P (R 2 0.45 p b 0 .001, supplementary information S5). However, it is important to note that it is not possible to directly compare these outputs with values obtained by Olsen P.

Model results for all sites: site-specific P weathering initial conditions
After calibrating the parameter PWeath 0 by site, the model performance across soil nutrient pools markedly improved (Fig. 3 right column). Performance of variables topsoil C and N improve markedly; R 2 values increase from 0.23 to 0.66 and 0.67 to 0.81 respectively, RMSE and intercepts of the correlations for these variables were also reduced, illustrating the reduction of error in the simulated values. Performance of variables C:N and yield also show improvement after calibration, but not as significant. The site-calibrated PWeath 0 values were found to vary between 50 and 1000 g m −2 with an average of 255 g m −2 .

Model results by land management
The model results broken down by land management type in Table 1 and discussed in turn in here.
Over half of the plots were fertilised with both N and P (n = 37). As the model is based on Liebig's law, NPP is simulated based on the single most limiting factor (of N, P, temperature or precipitation). The majority of the 37 plots fertilised with both N and P were calculated to be limited by N, with 7 showing temperature/precipitation limitations (sometimes fluctuating between N/temperature/precipitation limited), and 3 displaying P (or fluctuations between P and N) limitation. Table 1 shows that the performance of these plots is better than that of the average than those fertilised with only N or P.
Of the 62 plots, 10 had no additions of N or P fertiliser. The model indicates that NPP at these unfertilised sites was limited by N or P. For all the plots where P was added without N (n = 5) NPP was also limited by N. When N is added without P (n = 10) simulated NPP became P limited. Model performance at these 10 sites is below average, particularly for yields; average yields across these plots are underestimated by 77%, yet all other plots are underestimated by only 15%.

Temporal results
The data collected from the long-term agricultural experimental sites enables an assessment of model performance over time. To evaluate the long-term model performance across all sites in simulating trends in both C and N, the first and last observation from each site was used to estimate a change in C and N over the observation period (ΔC and ΔN). Initial observed values of C across all plots ranged from 1746-9387 g C m −2 , with change between −3019 and + 3238 gC m −2 during the period of observation (1-167 years). Observed and simulated ΔC show a positive relationship, however, this was not statistically significant (R 2 b 0.01, p N 0.05). When ΔC is calculated as percentage of soil C the relationship between observed and simulated change is significant (R 2 = 0.08 p b 0.05, see supplementary information S6). Initial observed values of soil N across all plots ranged from 156-851 gN m −2 , with change between −258 and + 329 gN m −2 during the period of observation. The model accuracy is greater for N (ΔN R 2 0.01 p N 0.05, ΔN as a percentage of soil N R 2 = 0.21 p b 0.01). As noted in section 3.1, the observational Olsen P data was not comparable to simulated values.
To provide an example of time-series performance, Fig. 4 shows the simulated soil C, N, and yield values for plots from the Askov site (Denmark) with varying fertiliser applications. Averaged observed values for each site, and averages of the simulation outputs, corresponding to the observations are shown in Table 2. The Askov site is an arable site with a crop rotation of cereal, root, cereal, grass/clover mix (oscillations in yields are due to crop rotation). The change in land use in 1795 from broadleaf forest to arable can be observed in the 'spike' in soil C and N caused by a short-term increase in litter as a result of forest clearance. The model simulates the magnitude of soil C and N and the temporal trends with reasonable accuracy. Additionally, the model captures the impacts of the fertiliser treatments on soil C and N pools.
As also indicated by the overall statistics shown in Fig. 3, the model typically underestimates yields, but the degree to which it does so varies with nutrient availability (yield performance is best in fertilised systems, as the results in section 3.3 describe). The observed inter-annual variability of crop yields is not well-captured by the model largely due to the use of averaged climate data, resulting in the omission of temperature and precipitation variability. Additional factors such as pests and diseases resulting in inter-annual variability of observed yields are also not considered in the model. For these reasons, correlations between observed and simulated crop yields were low. However, as Table 1 Comparison of observed and model simulated results by land management type. Asterisks denote statistical significance at *p b 0.05 and p** b 0.01 levels. shown in Fig. 4 and Table 1, the variation in yield between plots as a result of fertiliser applications is captured reasonably well by the model. To provide a grassland example, Fig. 5 shows the time series for Park Grass (UK). The magnitudes of soil C and N are comparable to those observed. Grass yields (dry matter of cut hay) are predicted by the model with a lower degree of accuracy than arable crop yields. At the control plot and the high N addition plot, simulated NPP was limited by P due to exhaustion of available P (see supplementary information S7. This is similar to all other sites where N was added without P, as described in section 3.3). Abrupt changes in the simulations reflect changes in harvest or fertiliser regimes (for example the N and P addition plot yields increase in 2013 due to N addition change from 0 to 14.4 g m −2 yr −1 ).

Overall performance
The N14CP-Agri model developed and tested here simulates observed C, N and yield values for a range of sites across northern Europe over long timescales. The results indicate the ability of the model to simulate both spatial and temporal trends. Even prior to calibration of the PWeath 0 parameter, model performance statistics shown in Fig. 3 indicated statistically significant relationships between simulated and observed soil C, N, C:N, pH and crop yields. This is notable given the small number of model parameters, which are set to universal values across all plots and sites, with no site specific calibration, making the model widely applicable. After adjusting the initial amount of weatherable P present at a site level (but not the plot level), the model performance across multiple output variables could be improved considerably.
Overall, when compared to the site-scale results of N14CP model for broadleaf, rough grassland, shrubland and conifer sites across Northern Europe (Davies et al., 2016a), the model performance (determined through R2 and RMSE between observed and simulated values) is better for agricultural sites than for semi-natural sites. This may be attributable to the fact that soil C-N-P in agricultural sites are conditioned by human inputs of fertilisers, tillage and harvesting, the timings and magnitudes of which are well characterised by experimental records. For seminatural sites, processes such as atmospheric deposition and weathering of nutrients are relatively more significant in determining soil biogeochemistry, and these fluxes and processes are more difficult to quantify. Additionally, one would expect that plant stoichiometry of monocultural sites (agricultural) should be better characterised by the PFT parameters, and hence better represented within the model.
The temporal resolution of the model (quarterly) will inevitably limit the degree to which variability of yield simulated; average Fig. 4. Time series plots for Askov site B3. Plots include from top to bottom control, low, medium and high N and P fertiliser additions. Left column shows simulated soil carbon (black) and nitrogen (red) in g m −2 yr −1 from 1700 to the end of the simulation. Change in land use in 1795 from broadleaf forest to arable is indicated by spike in soil C and N. Right column shows simulated and observed yields (dry matter yields g m −2 yr −1 ). Yield oscillation (in observed and simulated) is due to crop type rotation.

Table 2
Mean observed and simulated values for topsoil carbon and nitrogen (in g m −2 yr −1 ) and dry matter yields (g m −2 yr −1 ) for the Askov B3 sites(arable control, low, medium and high NPK fertiliser plots), and Park Grass site (grassland control, low N, P only, N and P, and high N fertiliser quarterly temperature will not capture temperature extremes or diurnal variations, relevant for crop growth and yield. Arable crops such as wheat are particularly sensitive to the timing of temperature variations within the growing season, with temperatures during certain plant growth stages influencing grain yield (Porter and Gawith, 1999). Other factors that affect yield but are not included in the model include pests and diseases, and the effects of crop protection and higher yielding varieties (e.g. Christensen et al., 2006). However, the model is intended to simulate and explore long-term and large-scale effects of agriculture on biogeochemical cycling and productivity, and as such does not aim for the level of accuracy that can be obtained with a detailed crop model and fine-resolution temporal input data. Yield magnitudes, and the responses of both arable and grassland yields to addition of fertilisers (and hence changing soil nutrient availability), were simulated with reasonable accuracy, and the limited parameter set makes the model widely applicable. The model presented here is developed and tested here to simulate plant-soil nutrient cycling in temperate ecosystems across Northern Europe. Soil development in the model is based on the Walker Syers concept (Walker and Syers, 1976). We assume soil development begins at −10,000 years BCE at the start of the Holocene. Extension of the model to other regions would require conceptualisation of P sources in older soils; global application of the model would require consideration of soil development over much longer timescales (in the region of 10 6 years). As the initial pool of weatherable P declines, replenishment of P sources through tectonic uplift, and erosion would need to be included in these soils. Furthermore, Plant Functional Types (both natural and agricultural) are typical of temperate regions. New PFTs and their associated stoichiometric ratios would need to be included to extend the model to other ecosystem types. The current model does not apply to ombrotrophic peat, which differs from other soils due to waterlogging, the burial of carbon in the anoxic catotelm. A modified version of N14CP to describe peats is in preparation. The model presented here is a model of plant-soil system biogeochemistry and includes representation of nutrient losses in leachate. Whilst we acknowledge that surface runoff and associated erosional/depositional processes will redistribute sediment and associated nutrients, this is not within the scope of this study.

P weathering
The response to P inputs via weathering demonstrates that agricultural systems are not completely decoupled from prior natural conditions, despite significant amendments to nutrient cycling through the addition of fertilisersin some cases for many decades. The parameter PWeath 0 (representing the initial pool of weatherable P) has a surprisingly large effect on contemporary pools of C, N and P despite large nutrient additions and modification through cropping and tillage (see supplementary information S8for details), indicating that weathering of P is fundamental to the development to agricultural ecosystems. The average calibrated value of PWeath 0 for agricultural sites in this study was 75% higher than that applied for semi-natural sites by Davies et al., 2016a (255 and 145 g m −2 respectively). Higher values of this parameter for agricultural sites may be expected due to the potential for tillage to influence mineral weathering rates and P release (Eriksson et al., 2016;Hartmann et al., 2013), or perhaps that naturally fertile sites with rich mineral inputs are more likely to have been chosen for agricultural use. Simulations suggest that despite high additions of N and P over long time periods, the modelled contemporary pools of C-N-P are still sensitive to initial conditions. Previous studies have indicated the importance of P weathering for semi-natural ecosystems (Anderson, 1988;Davies et al., 2016a), but this study is the first to explore this in an agricultural context. Therefore, to understand how global change will influence ecosystems and cycling of C, N and P in both agricultural and natural land uses, we need to consider geology and weathering processes.

Performance by land use/management
The model evaluation presented in Table 1 indicates the accuracy of model performance in both arable and grassland systems and across a range of management practices, showing statistically significant relationships in each management type for the simulated variables. Table 1 also highlights that simulated variables are less accurate for unfertilised plots, in particular yields are under-predicted in unfertilised Fig. 5. Time series plots for the Park Grass site. Plots include from top to bottom control, low N application, P only, N and P, and high N application. Left column shows simulated soil carbon (black) and nitrogen (red) in g m −2 yr from 1860 to the end of the simulated soil carbon (black) and nitrogen (red) in g m −2 yr from 1860 to the end of the simulation. Right column simulated and observed yields (as dry matter g m −2 yr −1 ). plots or those with very little P compared with N additions. In these plots, the availability of P becomes severely low after a number of years of continued cropping or grazing without P application, resulting in very low NPP estimates in the model (see supplementary information S7 for further details). This reduction in yield however, is not witnessed in the observations. Adjustment of PWeath 0 does not resolve this as the value is selected based on minimising the cost function across several variables (C, N, C:N ratio and yield) at several plots with and without N and P fertilisers. It is difficult to understand whether poor performance in non-fertilised agricultural settings is characteristic of this model or a gap in our understanding more widely, as other agricultural models that include nutrient cycles have not been tested under such conditions. Nonetheless, as use of either inorganic or organic fertilisers is standard in most agricultural settings, and the model performs well under these conditions, we can conclude that the model is suitable for most agricultural settings in Northern temperate Europe.
However, these experimental plots with extremely low (or zero) P fertiliser additions provide more than a mere curiosity in terms of management. The collapse of yields in simulation of these environments raises an interesting question as to how yields have been sustained for so long on long-term experimental plots where P is not added, and where P is exported in harvested material and not replaced?
To help us understand where this P may be deriving from (e.g. from a weatherable source, P deposition or other plant accessible sources) and what processes or sources may be missing or underrepresented in the model, we estimated the magnitude of extra P needed in these simulations to sustain yields for these instances. We found that the extra P required ranged from 0.15-2.5 g m −2 yr −1 . This is a large amount of P, akin to typical fertiliser rates, which typically range from 1.2-6 g m −2 yr −1 in the experiments drawn together in this study. This additional P is in excess of a number of natural sources; the lower end of this range is in excess of inputs via globally observed weathering rates (Hartmann et al., 2013) suggesting that enhanced weathering due to agricultural activity or plant activity is not a likely source. This range also does not align well with likely P deposition rates; a meta-analysis of observational data by Tipping et al., 2014 report measured total P in the range of~0.001 to 0.1 g m −2 yr −1 , with a mean of 0.027 g m −2 yr −1 . We experimented with adding both P deposition and enhancing/sustaining higher weathering fluxes in simulations, however these contributions cumulatively over time were not sufficient to prevent P exhaustion and collapse of yields in these plots. Hence, we suggest that representation of these sources and processes in the model are not the primary issue here.
Flexible plant stoichiometries are another potential mechanism by which yields may be sustained in these plots: plants may adapt their C:P ratios to sustain growth in low P conditions. We experimented with setting the C:P ratio to the highest observed values from the literature to test if this would allow yields to be sustained, with no success, again suggesting that this mechanism is not enough to explain how yields are maintained under these conditions. Further empirical data on nutrient content of crops on these long-term sites would be helpful here.
Unaccounted for legacy P, not captured by our known history of fertiliser application is another potential explanation. Previous excessive P fertiliser additions have been shown to accumulate in soils and sustain yields after addition of P is discontinued for several years (Condron et al., 2013;Liu et al., 2014;Rowe et al., 2016;Zhu et al., 2018). Historical P fertiliser additions prior to the start of the experiment may not have been recorded and so were not represented in the model.
Another explanation for where this P may be being made available in the field, as mentioned above, is via plant access to organic forms of P in the soil. Our representation of plant access to organic P as discussed in Davies et al., 2016a is basic; sources of P for NPP are used preferentially in order of P retained within the plant, readily available inorganic P (equal to the sum of P from decomposition, rot, desorption and weathering), followed by less accessible organic forms. Whilst the importance of phosphatase enzymes as a mechanism for P assimilation in plants starved of P is well recognised, quantification of their contribution is not known (Johnson et al., 2003;Raghothama and Karthikeyan, 2005). This lack of empirical data on the rate of plant access does not permit us to comment on whether this mechanism, if differently parameterised or conceptualised, could meet the extra P demand suggested by a maintenance of yields. Hence, this process remains a possibility and our research highlights a major gap in our understanding of nutrient cycling and sustainability in agricultural systems. More empirical research is needed to provide insights into rates of plant access to soil organic phosphorus, and combining these with models can help robustly test hypotheses on plant P access under low P fertiliser conditions.

Conclusions
The N14CP-Agri model presented in this study provides longterm simulation of integrated C-N-P cycling in temperate soils for agricultural environments and allows transitions between agricultural and natural land uses to be simulated. The model addresses the need to simulate the long-term effects of land-use change from seminatural to agricultural land uses, and subsequent management practices on soil biogeochemistry. The results presented demonstrate the ability of the model to reproduce both inter-site variability and temporal change and produce results applicable across a broad range of temperate fertilised agricultural environments. These simulations also allow us to explore the relative importance of natural and anthropogenic nutrient flows in determining contemporary C storage, and the role of nutrients in sustaining crops.
Including simulation of crop yields within the model provides a means of estimating the impacts of soil biogeochemistry on food production. The model captures the response of crop yields as a result of varying climate and location, land management practices and soil biogeochemistry. The resultant model is applicable to a range of scenarios to evaluate longer-term implications for yield, terrestrial C storage, and gaseous and dissolved fluxes of C, N and P, with relevance for food security, climate change, water quality and biodiversity.
The study has highlighted that even in heavily modified agricultural soils, weatherable P still strongly influences the development of soils and contemporary stores of C, N and P in agricultural soils. Model accuracy for soil C and N significantly improved when this parameter was allowed to vary on a site basis. This is an important finding since it demonstrates that despite large modifications to nutrient fluxes by humans, the state of agricultural systems and their pools of C and N are not fully decoupled from prior natural conditions. Finally, the model testing also highlighted a gap in our current understanding of agricultural nutrient cycling and sustainability. Relatively poor modelled yield performance in plots with low P fertiliser additions suggests an important process or source of P is missing from our understanding of agricultural environments. Multiple potential sources and processes were explored, but none, given present levels of knowledge, could account for the high level of P required to sustain modelled yields in these situations. We suggest that plant access to organic P and 'occluded' phosphates are key uncertain processes that may help explain how the observed yields are sustained under low P conditions. Further empirical and modelling studies should focus on addressing this knowledge gap, particularly given that more agricultural systems are likely to be managed with low P inputs in future due to sustainability and resilience concerns surrounding use of rock sources of P fertiliser.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.