Evaluation of soil carbon dynamics after forest cover change in CMIP6 land models using chronosequences

Land surface models are used to provide global estimates of soil organic carbon (SOC) changes after past and future land use change (LUC), in particular re-/deforestation. To evaluate how well the models capture decadal-scale changes in SOC after LUC, we provide the first consistent comparison of simulated time series of LUC by six land models, all of which participated in the coupled model intercomparison project phase 6 (CMIP6), with soil carbon chronosequences (SCCs). For this comparison we use SOC measurements of adjacent plots at four high-quality data sites in temperate and tropical regions. We find that initial SOC stocks differ among models due to different approaches to represent SOC. Models generally meet the direction of SOC change after reforestation of cropland but the amplitude and rate of changes vary strongly among them. The normalized root mean square errors of the multi-model mean range from 0.5 to 0.8 across sites and 0.1–0.7 when excluding outliers. Further, models simulate SOC losses after deforestation for crop or grassland too slowly due to the lack of crop harvest impacts in the models or an overestimation of the SOC recovery on grassland. The representation of management, especially nitrogen levels, is important to capture drops in SOC after land abandonment for forest regrowth. Crop harvest and fire management are important to match SOC dynamics but more difficult to quantify as SCCs rarely report on these events. Based on our findings, we identify strengths and propose potential improvements of the applied models in simulating SOC changes after LUC.


Introduction
Soils store the largest amount of carbon in the global land carbon cycle (Jackson et al 2017) and are of interest for climate mitigation measures (Fargione et al 2018, Bossio et al 2020). At present, more than two thirds of the global soils have experienced land cover change (alteration of surface vegetation) or land use change (alteration of land management; LUC), with the resulting soil organic carbon (SOC) changes contributing approximately one third of the historical LUC-related emissions (Houghton et al 2012, Quéré et al 2015, Sanderman et al 2017). However, such estimates come with great uncertainties (Gasser et al 2020). This study aims at identifying the sources of uncertainty, helping to reduce the spread across land models in simulating soil carbon changes after LUC.
The SOC content depends on the balance of input fluxes from litter (leaves, branches, wood and roots), root exudates and input due to damage (fire and wind break) and harvest residues and output fluxes from decomposition. Litter fluxes depend on plant productivity and plant type, while decomposition rates are governed by the ability of microbes to digest the organic matter in soils. Therefore, the quality and quantity of the litter, its degree of stabilization and the prevailing climate are of great importance for the output fluxes (e.g. Kuzyakov and Domanski 2000). LUC which alters the land cover and management practices can disturb the balance between input and output fluxes in a way that soils can either become a carbon source or sink (e.g. Ostle et al 2009). A schematic illustration of the land carbon cycle is shown in figure 1.
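The balance between input and output fluxes described above can be illustrated with a minimal one-pool sketch. It assumes first-order decomposition, dC/dt = I - k*C, which is a common textbook simplification and not the formulation of any of the models evaluated here. The function name, the input fluxes and the decay rate are hypothetical values chosen only for illustration.

```python
def simulate_soc(c0, litter_input, decay_rate, years):
    """Integrate dC/dt = I - k*C with yearly explicit Euler steps.

    c0           : initial SOC stock (kg m-2), hypothetical
    litter_input : constant carbon input I (kg m-2 yr-1), hypothetical
    decay_rate   : first-order decomposition rate k (yr-1), hypothetical
    """
    soc = [c0]
    for _ in range(years):
        current = soc[-1]
        soc.append(current + litter_input - decay_rate * current)
    return soc


# Hypothetical cropland soil in equilibrium (C* = I / k = 0.2 / 0.02).
cropland_equilibrium = 0.2 / 0.02

# After a land cover change the input rises: the soil becomes a carbon sink
# and approaches the new equilibrium 0.3 / 0.02.
sink = simulate_soc(cropland_equilibrium, litter_input=0.3, decay_rate=0.02, years=200)

# If the input drops instead, the same soil becomes a carbon source.
source = simulate_soc(cropland_equilibrium, litter_input=0.1, decay_rate=0.02, years=200)
```

In this sketch the direction of the SOC change depends only on whether the new input exceeds the current decomposition flux, while the decay rate sets how quickly the new equilibrium is approached.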
This study considers only LUC from or to forest cover. Afforestation of crop land is generally characterized by higher SOC stocks as the permanently regrowing biomass adds more carbon to the soil than the annually harvested crop (Grünzweig et al 2004, Vesterdal et al 2007, Laganière et al 2010, Nyawira et al 2017), although SOC in the surface 10 cm layer could decrease during the first decade after transition (Paul et al 2002). The magnitude and time horizon of soil restoration crucially depend on sustainable management practices (Berthrong et al 2009). Forests are assumed to accumulate more organic matter in the forest floor while management practices of arable land such as tillage, fertilization and residue management can lead to substantial amounts of carbon added to the soils (Nyawira et al 2016). As found by Li et al (2018), soils of grassland can store similar amounts of carbon compared to forests or crops (Vos et al 2019) since grasses produce substantial amounts of root litter. The mineral layer of the soil is less susceptible to LUC, climate change or management and may thus store carbon on longer time scales (Vesterdal et al 2007, Vos et al 2019). SOC dynamics strongly depend on the initial SOC content (Bellamy et al 2005, Chen et al 2015), the quality and quantity of the new litter input and the climatic ecoregion (Poeplau et al 2011, Li et al 2018).
Land models are key to estimate past and future global land carbon changes including vegetation and SOC dynamics (Ito et al 2020). However, land models diverge in simulating SOC changes due to their implementation of LUC, parametrizations related to LUC emissions and representation of land management options (Houghton et al 2012, Wilkenskjeld et al 2014). Understanding the underlying SOC processes and the ability to adequately simulate them with land models is crucial to estimate overall LUC-associated carbon fluxes, especially when related to anticipated climate mitigation techniques (e.g. land conversion to biomass plantations or re- and afforestation; Fargione et al 2018, Bossio et al 2020). In this study, we compare observational records with simulated SOC dynamics from six land models participating in the coupled model intercomparison project phase 6 (CMIP6) (Eyring et al 2016). The four sites include three temperate sites with afforestation of cropland (n = 2) and a change from forest to cropland (n = 1) as well as one tropical site with forest to pasture conversion. The observational records are based on soil carbon chronosequences (SCCs), which are SOC content measurements taken at adjacent plots of different age after LUC and compared to an unchanged reference plot (space-for-time approach). The time series-like SCCs in this study cover a period of at least 15 years, and a detailed LUC history and past climate conditions are trackable (Poeplau et al 2011). Plot scale measurements are generally not representative for processes simulated at grid-cell scale in land models (Thurner et al 2016) but SCCs cover larger areas than single sites and are therefore suitable for this comparison.
Table 1. Characteristics of selected SCC sites. Mean annual temperature (MAT) at 2 m height and mean annual precipitation (MAP) from publications and from the model forcing data in parentheses (GSWP3; Dirmeyer et al 2006, Kim et al 2012).
We conduct transient simulations to analyze the dynamics of SOC changes including the direction and timing of SOC changes. We also pay attention to the representation of management practices (e.g. crop harvest or fire prevention) and climate change effects on SOC. We aim to provide detailed insights into observed and simulated SOC dynamics after LUC. We therefore evaluate the capabilities of, and identify possible improvements to, state-of-the-art models to capture the SOC dynamics in relation to LUC in order to reduce the spread across the model ensemble for future simulations. This is, to our knowledge, the first study systematically shedding light on underlying SOC change processes after LUC using a set of transient multi-model simulations, which makes it highly relevant for the analysis of ongoing CMIP6 simulations.

Sites
From over 200 available site collections on SOC data we chose four high-quality SCCs, three at temperate sites and one at a tropical site (Poeplau et al 2011). The selection criteria included, among others, stand ages of more than 15 years, multiple depth increments, bulk density measurements and representativeness for the temperate and tropical zone with regard to the mean over all available sites in this region. To guarantee high quality of data and to avoid this source of bias, we excluded those sampling sites within the individual SCCs which had strongly divergent soil properties (Kalinina et al 2009).
The four sites are: (a) Valday (V) is located in the northern temperate zone of the European part of Russia. LUC included abandonment of cropland and [...] The only tropical SCC site (CR) is located close to the Atlantic Ocean in Costa Rica. In 1974, pasture land replaced tropical rainforest. Details of the sites are provided in table 1 and section S1.

Experimental set up
We use a set of transient experiments that include two contrasting setups: a simulation with the original land cover continued throughout the whole period (VegTr) and a simulation with a LUC transition included (LuTr) at a given year (see table 1). VegTr captures the climate and CO2 change effects on the original vegetation cover, while LuTr additionally includes changes in natural vegetation and LUC and is therefore directly comparable to the SCC records. Both simulations are run with climatic forcing from the GSWP3 v2 dataset from 1901 to 2014 (Dirmeyer et al 2006, Kim et al 2012) for historical data on temperatures, precipitation and CO2 (table 1). Figure S2 (available online at stacks.iop.org/ERL/16/074030/mmedia) provides the meteorological records of temperature and precipitation for all four sites. Note that, while all other models started their simulations in 1850, JSBACH, LPJ-GUESS and ISBA started their simulations in 1901 and therefore the reference period at Valday is 1901-1910 (table 2).

Models
Six land models participated in this study: JSBACH (Reick et     This model-specific transition, in addition to adjustment of soil parameters such as the specific soil type or soil depth, is selected to resemble the study site as much as possible. Other relevant processes, including wildfire, harvest, nitrogen cycling and the description of soil pools, follow standard approaches within each model (table 2). LPJ-GUESS, CLM and JSBACH simulate fire activity on natural land. Fires are prevented on crop land (and pastures in LPJ-GUESS) assuming that land management suppresses fires. Fires burn all above ground biomass and litter and release the CO2 to the atmosphere (Thonicke et al 2001). In JSBACH, fire activity depends on the population density, the availability, quality and humidity of fuel material, and the occurrence of lightning (Lasslop et al 2014). In LPJ-GUESS, fire probability depends on the fire season length and the quality and humidity of fuel material (Thonicke et al 2001). Similarly to JSBACH, fires in CLM depend on demographic and economic conditions and lightning (Li et al 2013). ISBA, ORCHIDEE and JULES do not simulate fires in this setup.
Crop harvest is simulated by JSBACH, ORCHIDEE, JULES and LPJ-GUESS and wood harvest by ORCHIDEE. Harvest removes different shares of the above ground biomass, which is released to the atmosphere as CO2 after one year without increasing SOC (table 2, section 3.3). ISBA and CLM do not simulate crop or wood harvest.
Deforestation is implemented differently across models by total removal of tree material (ISBA and JULES) or decay of residuals on site (ORCHIDEE, LPJ-GUESS, JSBACH and CLM). The complete clearance of forest land for agricultural purposes seems adequate and in accordance with the SCC records.
Models simulate two or more vertically discretized soil layers for moisture and temperature profiles (table 2). Accounting for specific soil types could be essential (Vesterdal et al 2007, Vos et al 2019) as it influences the SOC development after LUC: microbes and soil structure define the fate of organic matter. All models simulate fast, medium and slow soil carbon pools with decay rates in the order of one year, decades and centuries, respectively; however, they do not account for the vertical structure of SOC (table 2). The explicit simulation of SOC layers has little influence on the assessment of the LUC impacts on SOC as up to 90% of the impact happens within the first 30 cm of soil depth (Poeplau and Don 2013).
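The fast, medium and slow pool structure described above can be sketched as a simple cascade. This is an illustrative toy, not the pool scheme of any of the six models: the turnover times only follow the orders of magnitude given in the text, and the transfer fraction between pools, the litter inputs and all function names are hypothetical.

```python
# Hypothetical turnover times in years (fast, medium, slow), following the
# orders of magnitude given in the text: one year, decades, centuries.
TURNOVER = (1.0, 30.0, 300.0)
# Hypothetical fraction of decomposed carbon passed on to the next pool;
# the remainder is assumed to be respired.
TRANSFER = 0.3


def steady_state(litter_input):
    """Pool sizes at which input and decomposition balance for each pool."""
    pools = []
    inflow = litter_input
    for tau in TURNOVER:
        pools.append(inflow * tau)
        inflow *= TRANSFER
    return pools


def step(pools, litter_input, dt=0.1):
    """Advance all pools by dt years with an explicit Euler step."""
    new_pools = []
    inflow = litter_input * dt
    for carbon, tau in zip(pools, TURNOVER):
        decomposed = carbon / tau * dt
        new_pools.append(carbon + inflow - decomposed)
        inflow = TRANSFER * decomposed
    return new_pools


def run(pools, litter_input, years, dt=0.1):
    """Integrate the cascade for a number of years."""
    for _ in range(round(years / dt)):
        pools = step(pools, litter_input, dt)
    return pools


# Start in equilibrium, then halve the litter input (hypothetical values).
start = steady_state(0.5)
after_five_years = run(start, 0.25, years=5)
```

Under these assumptions the fast pool has almost fully adjusted to the reduced input after five years, whereas the slow pool has barely changed, which is the separation of time scales the pool structure is meant to represent.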
JSBACH, LPJ-GUESS, JULES and CLM simulate the nitrogen cycle. However, none of these models has activated management practices such as the application of fertilizers. In JULES and JSBACH, however, crops are parametrized in a way that imitates perfect nutrient supply.

Data analysis
We here present time series of SOC changes and other related carbon pools and fluxes. SOC changes include those from the litter pool (e.g. dead leaves, branches, and roots) since SCC records include the forest floor (i.e. litter). Models separate between long, medium and short-lived soil and litter pools based on the quality of the litter, which are mingled in the SCC records covering the forest floor and the organic layer. Further, heterotrophic respiration (RH) (figure 1) represents respiration during the decomposition of both soil organic and litter materials.
We apply a 10 year, not centered moving average for SOC and all other reported variables. We focus on the direction and dynamics of SOC changes relative to the time before LUC and not on the absolute amount of SOC since measurement and simulated soil depths are not aligned. To assess each model's congruence with the observations, a normalized root mean square error (NRMSE) is calculated as a fraction of the data range of the observations at each site:
$$\mathrm{NRMSE} = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{model}_i - \mathrm{observations}_i\right)^2}}{\max(\mathrm{observations}) - \min(\mathrm{observations})}, \qquad (1)$$
where n is the number of observations at the site. We assess the timing with regard to the year when changes in SOC exceed 50% (T50) relative to the total change during the observation period to infer the temporal dynamics. For models and observations that show a local minimum or maximum after land conversion we only count the years after soil C is increasing or decreasing again, respectively. This way we can evaluate the model and SCC dynamics from similar starting points, assuming that the initial offset could be improved in the future. For the SCC records we apply an exponential fit and, in the case of Gejlvang and Costa Rica, also a linear fit due to ambiguous last data points. As we do not know whether these points are reliable or introduced by an artificial feature of chronosequences, we report both exponential and linear fits. During our analysis, we found that some models produce results far off from observations and expected responses to LUC. This affects JULES, which simulates too slow forest regrowth (Valday and Gejlvang sites) but fast regrowth on pasture (Costa Rica). The ISBA model is affected at the SW France site due to the missing crop harvest, leading to a strong increase in SOC after deforestation. We are reporting results from all models, as the models are well-documented and the results provide insights into the model response to historical and future changes in CMIP6. In the results section, we present the multi-model mean with and without these models.
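The smoothing and the normalized error described above can be sketched in a few lines. The function names are hypothetical, the moving average is read here as a trailing (backward-looking) window, and the handling of the first years, where fewer than ten values are available, is an assumption since the text does not specify it.

```python
import math


def trailing_mean(values, window=10):
    """Not centered moving average: each value is the mean of the current
    and the preceding window-1 values (shorter at the start, an assumption)."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def nrmse(model, observations):
    """Root mean square error divided by the data range of the observations."""
    squared = [(m - o) ** 2 for m, o in zip(model, observations)]
    rmse = math.sqrt(sum(squared) / len(observations))
    return rmse / (max(observations) - min(observations))


# Hypothetical SOC changes (kg m-2) at three chronosequence ages.
observed = [0.0, 2.0, 4.0]
simulated = [1.0, 2.0, 4.0]
error = nrmse(simulated, observed)
```

Normalizing by the observed range makes errors comparable between sites with small and large SOC changes; a value of 1 means the typical deviation is as large as the full observed range.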
Table 3 and figure 2 show the development of SOC at all four sites relative to the 10 year period before LUC. The effects of climate and CO2 change in the VegTr simulation can be found in the supplementary section S3.
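Expressing SOC relative to the 10 year period before LUC amounts to subtracting a reference mean from each series. A minimal sketch follows, assuming yearly values; the function name and the example series are hypothetical and do not reproduce any data from this study.

```python
def relative_to_reference(years, soc, luc_year, ref_length=10):
    """Return SOC minus its mean over the ref_length years before luc_year."""
    reference = [c for y, c in zip(years, soc)
                 if luc_year - ref_length <= y < luc_year]
    if not reference:
        raise ValueError("no data in the reference period")
    baseline = sum(reference) / len(reference)
    return [c - baseline for c in soc]


# Hypothetical series: constant SOC until the LUC year, linear loss afterwards.
years = list(range(1900, 1925))
soc = [10.0] * 10 + [10.0 - 0.1 * i for i in range(15)]
delta_soc = relative_to_reference(years, soc, luc_year=1910)
```

Working with such differences removes the offset in absolute SOC stocks between models and observations, so that only the direction and dynamics of the change are compared.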

Development of carbon pools and fluxes
For the temperate site of Valday (figure 2(a)), the SCC shows an initial decrease of SOC which is due to the mineralization of former crop debris with short turnover times and the still low litter input from the regrowing trees. Only after one to two decades does SOC increase in the SCC as the regrowing forest accumulates carbon in the forest floor, building up a humus layer. SOC increases by 50% within 30 years once the initial drop has passed (T50, table 3).
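One possible reading of the T50 measure, including the rule that years are only counted once an initial drop (or rise) has passed, is sketched below. The function name is hypothetical, the turning point is taken as the global extremum of the series, and the total change is measured from that turning point to the last year; both choices are assumptions, not the documented procedure.

```python
def t50(soc, direction):
    """Years after the turning point until half of the total change is reached.

    soc       : yearly SOC values starting in the year of LUC
    direction : "gain" (count from the minimum) or "loss" (count from the maximum)
    """
    if direction == "gain":
        start, sign = soc.index(min(soc)), 1
    elif direction == "loss":
        start, sign = soc.index(max(soc)), -1
    else:
        raise ValueError("direction must be 'gain' or 'loss'")
    total = sign * (soc[-1] - soc[start])
    if total <= 0:
        return None  # no change in the requested direction
    for year in range(start, len(soc)):
        if sign * (soc[year] - soc[start]) >= 0.5 * total:
            return year - start
    return None


# Hypothetical afforestation case: two years of decline, then recovery.
gain_series = [10.0, 9.0, 8.0, 9.0, 10.0, 11.0, 12.0]
# Hypothetical deforestation case: a brief increase, then decline.
loss_series = [10.0, 11.0, 9.0, 7.0, 6.0, 5.0]
```

With these inputs, `t50(gain_series, "gain")` counts from the minimum in year 2 and `t50(loss_series, "loss")` counts from the maximum in year 1, so series with different initial offsets are compared from similar starting points.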
Accounting for this shape of the SCC curve, ORCHIDEE, ISBA and CLM meet this dynamic to different degrees. CLM simulates almost congruent SOC changes between 10 and 55 years and, with 42 years, produces the T50 closest to observations (NRMSE of 0.3). After this period the CLM simulation diverges from the SCC record by showing higher increase rates. To meet the observed dynamics better, CLM would have to simulate a stronger decrease in the initial period (e.g. based on a stronger difference in the carbon to nitrogen ratios for crop versus woody material) and a stronger increase after 50 years, when it now simulates smaller litter fluxes (figures S5(a) and S6(a)) due to slower vegetation regrowth in the second half of the century (figure S4). ISBA and ORCHIDEE both do not simulate the nitrogen cycle but capture the dynamics well for very different reasons. ORCHIDEE overestimates the initial SOC loss, causing an offset of 2 kg m−2 throughout the simulation period: the drop in net primary productivity (NPP) (figure 3) due to the slow growth of trees is enhanced even more by an initial increase in RH (figure S7) from the decay of crop residues during the first decade after LUC. Interestingly, ORCHIDEE simulates the strongest vegetation C increase across models (figure S4(a)), which can only be attributed to an increase in plant respiration (not shown). After 1920, ORCHIDEE simulates the SOC dynamics in accordance with the SCC (T50 of 46 years; NRMSE of 0.7). The shape of the curve is mainly defined by the litter fluxes (figures S5(a) and S6(a)) as soil carbon alone (figure 2(a), dashed line) is missing the initial drop and the continuous increase thereafter. ISBA simulates the dynamics well (NRMSE of 0.5) but misses the timing of the decline in litter fluxes (figure S6(a)) and pools (figure S3(a)).
Although trees are growing similarly fast as in ORCHIDEE (figure S4(a)) and NPP changes are positive right after LUC, litter inputs from roots stay small and previous crop residues still decompose, leading to RH increases. Only after three decades does the litter input from below- and above-ground litter cause an increase in SOC that is, however, offset by 1 kg m−2 and with a T50 of 46 years. JULES meets the development only during the first three years after land conversion but overestimates it thereafter because forest growth is very slow (figure S4(c)) and the subsequent litter flux (figure S6(a)) is very small. LPJ-GUESS reaches similar amounts of SOC allocation (1.9 kg m−2, NRMSE of 0.5) as in the SCC (1.3 kg m−2) with forest regrowth by the end of the simulation period, resulting from the highest NPP and RH increases across models (figures 3(a) and S7(a)). JSBACH simulates the direction and the observed saturation during the last decade, but overestimates SOC allocation (5.7 kg m−2) due to the strongest decrease in RH (figure S7(a)) and highest increase in litter C (figure S3) across models (NRMSE of 1.5). Both models simulate results closer to the observations when litter C is excluded (1.4 and 2.5 kg m−2, respectively; figure 2, dashed lines) and fail to simulate the observed initial drop in SOC as litter fluxes increase immediately (figures S3(a), S5(a) and S6(a)), leading to poorer T50 of 54 and 60 years, respectively.
Table 3. Results for initial total soil carbon and changes until last year with/without LUC. The error is given by the normalized root mean square error (NRMSE). Timing (T50) denotes the time needed to exceed total SOC changes by 50%. The arrows point into the direction of change; * denotes that the time is counted only from when an initial decrease/increase is overcome. T50 for the SCC records was estimated by applying an exponential fit, but we also provide a linear fit for Gejlvang and Costa Rica (in parentheses).
The high litter C contents throughout the simulation period are dominated by woody, above-ground litter, which might be overestimated with regard to the vegetation C pool (i.e. the ratio of vegetation C to litter C). Therefore, the initial decrease in RH (figure S7), which is based on the slow adjustment of too low nitrogen levels for the decomposition of woody materials, is overcompensated by high litter inputs.
The observed SCC dynamic is different at Gejlvang (Denmark) despite similar environmental and historical conditions: SOC stays constant for four years after crop land abandonment before a rapid increase of 3.5 kg m−2 within 41 years occurs. SOC increases by 50% after 31 years using an exponential fit, and after 20 years when applying a linear fit. Carbon accumulation happens mainly in the forest floor with forests adding litter C, but also in the former ploughing horizon of the nutrient-poor cultivated land, which still might contain remains of fertilizer that enhance C uptake until a new equilibrium is reached. The land models simulate a similar SOC dynamic as described at Valday but on faster time scales due to a warmer and warming climate (figure S1(b)). Here, JSBACH (2.8 kg m−2 in 2001; T50 of 21 years, NRMSE of 0.2) and LPJ-GUESS (1.6 kg m−2 in 2001; T50 of 25 years, NRMSE of 0.2) simulate a good linear fit with the observation, also when including the litter C pool (however, not accounting for the second last point). Due to the ambiguous second last point, we cannot assess whether an exponential or linear increase prevails in the SCC. The other models stay too low, as the decrease and recovery of SOC take too long compared to the SCC. ORCHIDEE, however, simulates a good recovery of SOC after eliminating the initial drop, with a T50 of 27 years (NRMSE of 0.6).
The multi-model mean (excluding JULES) is dominated by the dip in SOC simulated by ORCHIDEE, CLM, and ISBA; therefore the magnitude of SOC change is underestimated, while the timing fits well after overcoming the initial decrease (T50 of 28 years, NRMSE of 0.4).
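Reporting the multi-model mean with and without individual models can be sketched as a year-by-year average over the retained series. The function name, the placeholder model labels and the numbers below are hypothetical and are not results from this study.

```python
def multi_model_mean(series_by_model, exclude=()):
    """Average equally long yearly series over all models not in exclude."""
    kept = [series for name, series in series_by_model.items()
            if name not in exclude]
    if not kept:
        raise ValueError("all models were excluded")
    n_years = len(kept[0])
    return [sum(series[i] for series in kept) / len(kept)
            for i in range(n_years)]


# Hypothetical SOC changes (kg m-2) for three placeholder models.
delta_soc = {
    "model_a": [0.0, -0.5, 0.5],
    "model_b": [0.0, 0.5, 1.5],
    "outlier": [0.0, 3.0, 7.0],
}

mean_all = multi_model_mean(delta_soc)
mean_without_outlier = multi_model_mean(delta_soc, exclude=("outlier",))
```

Because a single strongly deviating series can dominate an unweighted mean, as in this toy case, reporting both variants shows how much of the ensemble signal depends on that model.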
At the SW France site, forests are cleared in 1962 for the establishment of managed crop land. SOC decreases within the observation period of 30 years by 14 kg m−2, which is already reached after 19 years. SOC decreases by 50% after 16 years. This demonstrates that SOC losses usually happen much faster than SOC gains (Poeplau et al 2011). JSBACH reproduces a similarly shaped decline after an initial SOC increase which lasts for about 15 years (−8.2 kg m−2 30 years after the peak increase was reached) and a T50 of 11 years. In JSBACH, NPP decreases strongest across models, taking not only the change in vegetation cover into account but also the increasing temperatures in combination with lower precipitation in the 1980s. The initial increase is also seen to a lesser degree in ORCHIDEE, CLM and LPJ-GUESS and originates from the transfer of below ground residues of the original vegetation to the SOC and increased RH (figure S7(c)) from the decomposition of formerly more and mainly woody material (with higher C:N ratios in the case of JSBACH, CLM and LPJ-GUESS), leading to a faster decomposition of less and nonwoody crop residues. Again, accounting for litter C is important at this site as soil C changes alone are too small compared to observations. In ORCHIDEE, CLM and JULES, SOC is increasing after one to two decades, and from the beginning in ISBA, as the NPP (figure 3(c)) and the litter C flux (figure S6(c)) from crops become greater than those of the trees before. Here, the positive effects of increasing temperatures on NPP may be overestimated without accounting for comparatively low precipitation levels (figure S2). Crop harvest is not represented in this version of ISBA, greatly overestimating the litter flux. In JULES, forest density is low at this site and therefore the NPP is lower than for crops or pastures. The multi-model mean (excluding ISBA) captures the direction, but greatly underestimates both the magnitude of SOC change and timing (NRMSE of 0.7).
After deforestation in 1974 for pasture cultivation at the tropical site, the SCC record shows an almost linear decline of SOC for 10 years which is smaller than at the French site. SOC declines by 50% after 3 or 8 years when applying either an exponential fit (ignoring the last measurement point) or a linear fit, respectively. The last measurement point reveals a carbon increase, probably due to the SOC build up via very productive grass roots. Models diverge strongly already before the year of LUC due to decadal climate change effects. The conversion to pasture instead of crop land leads to a similarly shaped development in JSBACH and ORCHIDEE with T50 of 5 and 2 years, respectively, with an initial increase followed by a strong decrease in SOC exceeding that of the SCC. At this location, ISBA simulates results closest to the observations (disregarding the last measurement point; NRMSE of 0.5) with moderate decreases in NPP (figure 3(d)) and litter fluxes (figure S6(d)). JULES is the only model simulating both the initial drop and recovery but the pace and magnitude of the change do not match the observations. The initial drop of SOC only occurs in the year after deforestation but pastures soon produce even more litter than the replaced, less densely growing trees, enriching the SOC. The low tree density could be subject to improvement as growing conditions allow for a dense forest to grow. LPJ-GUESS and CLM simulate similar increases in SOC but for different reasons. While NPP decreases in both models (figure 3(d)), the litter flux (figure S6(d)) and RH (figure S7(d)) both increase in LPJ-GUESS and vice versa in CLM. The litter flux decrease is an intuitive behavior for CLM as less plant productivity leads to less litter. However, the litter C pool increases in size due to the lower RH, leading to the buildup of a litter layer which is unrealistic for the tropics.
In LPJ-GUESS, the change in litter quality and the lower RH rate of leaf and fine root litter compared to woody litter cause a higher transfer of carbon to the soil. The subsequent decrease observed in soil C alone (figure 2, dashed line) is because the woody litter C pool shrinks in size and pasture litter decays faster (figure 2(d)). The multi-model mean (excluding JULES) is dominated by the (initial) increase in SOC simulated by JSBACH, ORCHIDEE, LPJ-GUESS and CLM and therefore the timing is only met after the first decade (T50 of 3 years, NRMSE of 1.1). This could mean that the last point of the SCC record for the tropical site indeed indicates a complete recovery of SOC, which is however not met by any model.
Although LPJ-GUESS, ORCHIDEE and ISBA share the same soil carbon model structure (CENTURY, Parton et al 1993), their SOC dynamics vary strongly due to the interaction with vegetation dynamics and climate. While LPJ-GUESS simulates the final change in SOC better, ORCHIDEE and in some cases also ISBA simulate the dynamics (i.e. the shape of the trajectory) more correctly.
The rather coarse definition of vegetation PFTs in JSBACH performs as well as the more specified PFTs in LPJ-GUESS but the amount of litter C is overestimated for forest regrowth. The influence of vegetation dynamics and the coupling to climatic conditions remains a crucial component in the simulation of soil C dynamics. The spread in simulated NPP is particularly large, especially within the first decades after LUC, contributing to the spread in SOC (figure 3).

Initial SOC stocks and absolute changes of SOC
Absolute SOC contents vary strongly across models (figure S1, table 3). At the time of LUC, CLM is closest to the SCC record at Valday, LPJ-GUESS at Gejlvang, JSBACH at SW France and ISBA at Costa Rica. CLM performs second best at Gejlvang and SW France (table 3). Reasons for these differences in initial SOC between models and observations include the usually shallower measurement depths of 20-30 cm compared to simulated single-layer SOC depths by models, different model parametrizations and structures.
SCCs show the largest SOC change in SW France where initial SOC contents are at least four times higher compared to the other sites. This behavior was reported before (Chen et al 2015, Cherchi et al 2018). Models do not show this behavior; for example, while CLM and LPJ-GUESS simulate a good fit with the observed SOC changes at Valday, Gejlvang and SW France, they underestimate the SOC decrease at SW France by far. JSBACH simulates too high levels of initial SOC at Gejlvang but meets the change of SOC well, while at SW France its initial SOC stock matches that of the SCC record best but changes remain too small. At the tropical site, all models overestimate the initial SOC stock. Here, ISBA (deviation 40%) simulates the best SOC change after LUC but without capturing the recovery. Indeed, ORCHIDEE (deviation 350%) matches the SOC loss during the first decade but greatly overestimates it thereafter. To improve the simulated initial SOC stocks and the resulting SOC changes, more observations on the initial SOC stocks are necessary to benchmark model processes (e.g. RH).

Management impacts
Fires are not reported at any SCC site. JSBACH, CLM and LPJ-GUESS suppress fires on crop land and, for the latter two models, on pastures. They do simulate wildfires, but these are negligible at the afforested sites (figure S8, top) under the given climate conditions. Only at SW France and Costa Rica does LPJ-GUESS emit 1.4 and 1.2 kg m−2 from the original forest, respectively. In CLM, human or natural ignition is absent and in JSBACH, trees burn less than grasses and pastures.
The presence of harvest is reported at all three SCC sites covered by crops but not quantified. Harvests actively remove carbon from the site on a regular, mostly annual basis, which is then not available for the soil carbon pool. JULES, LPJ-GUESS, ORCHIDEE and JSBACH account for harvest ranging from 0.2 to 1.5 kg m−2 yr−1 (figure S9). JSBACH, JULES and LPJ-GUESS further account for grazing on pastures with approximately 0.2-0.8 kg m−2 yr−1 C removed. ORCHIDEE also applies harvest to forests at Gejlvang with a carbon removal of about 0.7 kg m−2 yr−1, which could otherwise add positively to the SOC build up. The specific distinction between grass and crop parametrizations (e.g. NPP) and management (e.g. harvest) should be a step forward for all models to capture the observed SOC losses under LUC for crop land and recovering SOC gains associated with conversions to grassland (Poeplau et al 2011).
The simulation of the nitrogen cycle is included in JSBACH, LPJ-GUESS and CLM but, without explicit management practices and without separate simulations that exclude the nitrogen cycle, no significant effects can be detected. The simulation of ploughing could further improve the simulation of SOC drops after crop land abandonment. Simulating management practices can bring model results closer to observations (Nyawira et al 2016) and should be a way forward in land use modeling (Pongratz et al 2018).

Conclusions
For the first time, we compare transient SOC changes of LUC provided by six land models with four high-quality SCC records. We found that simulated carbon dynamics vary strongly across models and the type of LUC. However, we could identify some strengths of the applied models. Models can capture the overall direction of change at the afforestation sites but for different reasons. The initial drop of SOC after LUC at Valday, caused by declining nitrogen levels of a formerly fertilized crop field, is best simulated by models without nitrogen cycle (ORCHIDEE, ISBA) due to an increase in RH from the decomposition of original plant material left on site after LUC. As nitrogen fertilization was not accounted for in the experiment, models with nitrogen cycle, except CLM, are not able to simulate this initial drop in SOC. At Gejlvang, such an initial SOC drop is not observed and models with nitrogen cycle simulate good results (JSBACH, LPJ-GUESS). These results call for the inclusion of management practices (fertilization) in the experimental setup of afforestation simulations with land surface models that account for nitrogen-carbon coupling.
In contrast, we found that SOC losses after deforestation for cropland are generally simulated much too slowly. The impact of litter C removal through harvest, explicit crop types (i.e. distinct from grassland) and the complete removal of forest material could improve results. Only ORCHIDEE simulates both the fast decrease of SOC after tropical deforestation and the subsequent recovery of SOC provided by growing grassland afterwards, but at much too slow time scales. The initial SOC stock was overestimated at most sites and models did not reproduce the linear relationship between initial SOC stocks and SOC changes as found in observations. Sensitivity to climate and CO2 change varies strongly among the models.
We found that the detailed specification of PFTs, soil types and layers was not essential to capture the observed dynamics when comparing results of different models. To date, the model spread of SOC changes after LUC is large and even the multi-model mean, although more accurate at most sites, should be treated with caution due to opposing processes within and across models.
A main limitation to using SCC data for evaluating transient model performance on a global scale (Nyawira et al 2016) is a shortage of high-quality data. Including more high-quality data sites covering deforestation for crops and pastures and vice versa, as well as different management practices including reported fire events, harvest amounts and carbon-nitrogen ratios, will provide more rigorous constraints for model evaluation. In particular, more observational constraints on SOC changes during the first few decades after LUC, such as the SOC drop after land abandonment at the Valday site, would be very informative for the models to get a proper balance between different processes.

Data availability
The data that support the findings of this study are available at the following URL: http://hdl.handle.net/21.11116/0000-0007-95BE-B and can be obtained by contacting publications@mpimet.mpg.de.