Benchmarking carbon fluxes of the ISIMIP2a biome models

The purpose of this study is to evaluate the eight ISIMIP2a biome models against independent estimates of long-term net carbon fluxes (i.e. Net Biome Productivity, NBP) over terrestrial ecosystems for the recent four decades (1971–2010). We evaluate modeled global NBP against 1) the updated global residual land sink (RLS) plus land use emissions (ELUC) from the Global Carbon Project (GCP), presented as R + L in this study by Le Quéré et al (), and 2) the land CO2 fluxes from two atmospheric inversion systems: Jena CarboScope s81_v3.8 and CAMS v15r2, referred to as FJena and FCAMS respectively. The model ensemble-mean NBP (that includes seven models with land-use change) is higher than but within the uncertainty of R + L, while the simulated positive NBP trend over the last 30 yr is lower than that from R + L and from the two inversion systems. ISIMIP2a biome models well capture the interannual variation of global net terrestrial ecosystem carbon fluxes. Tropical NBP represents 31 ± 17% of global total NBP during the past decades, and the year-to-year variation of tropical NBP contributes most of the interannual variation of global NBP. According to the models, increasing Net Primary Productivity (NPP) was the main cause for the generally increasing NBP. Significant global NBP anomalies from the long-term mean between the two phases of El Niño Southern Oscillation (ENSO) events are simulated by all models (p < 0.05), which is consistent with the R + L estimate (p = 0.06), also mainly attributed to NPP anomalies, rather than to changes in heterotrophic respiration (Rh). The global NPP and NBP anomalies during ENSO events are dominated by their anomalies in tropical regions impacted by tropical climate variability. Multiple regressions between R + L, FJena and FCAMS interannual variations and tropical climate variations reveal a significant negative response of global net terrestrial ecosystem carbon fluxes to tropical mean annual temperature variation, and a non-significant response to tropical annual precipitation variation. According to the models, tropical precipitation is a more important driver, suggesting that some models do not capture the roles of precipitation and temperature changes adequately.


Introduction
Continuing and widespread environmental changes strongly impact the carbon cycling of terrestrial ecosystems, and thus land-atmosphere CO 2 fluxes. Terrestrial ecosystems have sequestrated 24%À33% of the anthropogenic carbon dioxide (CO 2 ) emissions in the last five decades (1965À2014; Le Quéré et al 2015), providing a negative feedback in the carbon-climate system (Friedlingstein et al 2006). Recent environmental changes have led to an increase in the global land carbon uptake. The global land carbon sink increased from 0.2 ± 0.5 Gt C yr À1 in 1960s to 2.1 ± 0.7 Gt C yr À1 in the last decade (2005À2014; Le Quéré et al 2015). The increasing net land carbon sequestration could be due to enhanced productivity caused by rising atmospheric CO 2 concentration (Cramer et al 2001, Smith et al 2016, nitrogen addition (including nitrogen fertilization and deposition), ecosystem management practices, and growing season extension in northern regions (e.g. Myneni et al 1997, Tucker et al 2001, Piao et al 2007.
Here, the global net land carbon flux includes land-usechange emissions (E LUC ) and the 'residual' land sink (RLS) in the Global Carbon Project carbon budget from Le Quéré et al (2015). While data-driven assessments can only provide estimates of historical changes, it is essential to know how and to what extent the terrestrial carbon cycle will change in the future. In the past decades, global biome models have been developed to address these questions. Those models implement various processes to understand the global terrestrial carbon cycle, and have potential to project its future changes (e.g. Friedlingstein et al 2006, Sitch et al 2008. However, to increase the confidence in the associated future projections it is crucial to evaluate performance of the models.
The land CO 2 sink does not only show an increasing trend in the past five decades, but also an important interannual variability, of 1.2 Gt C yr À1 (standard deviation of the net land carbon flux in Le Quéré et al 2015) contributing to the observed fluctuations of the atmospheric carbon dioxide growth rate (CGR; Alden et al 2010, Baker et al 2006, Lee et al 1998, Le Quéré et al 2015. Therefore, besides evaluating model performances in estimating mean distribution and trends of terrestrial carbon fluxes, evaluation of the representation of interannual variability is also critical as it allows for testing our process understanding. The variation of terrestrial carbon fluxes is mainly influenced by climate, with extreme climate events being generally associated with abnormal carbon sources in many observational casestudies of droughts and heat-waves (e.g. Zscheischler . Climate anomalies (sometimes become extremes) cause a shift in the carbon balance of ecosystems mainly through their impacts on photosynthesis, respiration, fire occurrence and emissions, plants mortality and recruitment, insect outbreaks, and soil physical and biogeochemical changes (e.g. erosion, carbon, nutrients, etc.) (Reichstein et al 2013, Frank et al 2015. These disturbances by climate anomalies generally reduce the terrestrial carbon sink or cause net carbon sources from terrestrial ecosystems during a few months, but can lead to recovery sinks in the following years. At global scale, the El Niño Southern Oscillation (ENSO) is the largest-known global mode of interannual climate variability (Dawson and O'Hare, 2000). The ENSO-driven fluctuations of carbon fluxes over tropical land areas play the most important role in explaining global CGR and carbon flux variations. Therefore, a specific focus in this study is the evaluation of modeled global and tropical carbon flux anomalies during ENSO events.
We present a global evaluation of the eight ISIMIP2a biome models for their ability to capture (1) the current mean value and historical trends of the net land carbon flux, (2) carbon flux anomalies during ENSO events, and (3) the sensitivity of carbon fluxes to tropical climate variability.

ISIMIP2a biome models and the simulation protocol
The models involved in ISIMIP2a biome sector simulations are: CARAIB (Warnant et al 1994, Gérard et al 1999, Dury et al 2011, DLEM (Tian et al 2015a), JULES , LPJ-GUESS (Smith et al 2014), LPJmL (Bondeau et al 2007), ORCHIDEE (Krinner et al 2005), VEGAS (Zeng et al 2005), and VISIT (Ito and Inatomi 2012). All models simulated the carbon cycles of terrestrial ecosystem in response to climate change and rising atmospheric CO 2 concentration, with varying degree of detailed representation of vegetation types.
The same climate forcing (two different datasets at daily time step and at a spatial resolution of 0.5°Â 0.5°) and global atmospheric CO 2 concentration (Keeling and Whorf 2005, data continue to be updated) data were used by all models. Historical land-use data provided by ISIMIP2a see text S1 stacks.iop.org/ERL/ 12/045002/mmedia of supplementary material for detail) were used by 7 models, CARAIB being the exception because its simulations did not prescribed land-use (i.e. run with potential natural vegetation). The ISIMIP2a climate forcing data differs from TRENDY  and MsTMIP , since these experiments used only CRU-NCEP data (Viovy 2014   respectively. In total 16 simulations (8 models Â 2 climate forcing datasets) were used in this study. According to the simulation protocol (www.isimip.org/gettingstarted/#simulation-protocol), each simulation comprised of (i) a 'spin-up' run to equilibrate the vegetation and soil carbon pools using cycled climate forcing of early decades of the 20th century (e.g. 1901-1930), pre-industrial CO 2 concentration, and land-use status before the beginning of 20th century; and (ii) followed by a 'transient' run since 1901 forced by historical climate, CO 2 concentration and land-use change. The model outputs for the period 1971-2010 were used in this study. Two models, DLEM and LPJ-GUESS, included Carbon-Nitrogen interaction, thus were able to respond to time variable nitrogen deposition ( 2.2. The global NBP and component fluxes NBP represents the net carbon flux between terrestrial ecosystem and atmosphere, and is calculated as: where C harvest is carbon exported from ecosystems as harvested biomass and/or forest products; C fire is the carbon lost due to fire emissions; NEP is net ecosystem productivity calculated by the difference between NPP and Rh: It should be noted that some models did not include all the above elements of NBP: C harvest was not simulated by CARAIB, JULES, and LPJ-GUESS, and C fire was not considered in DLEM, JULES, and ORCHIDEE. For models accounting for forest harvest (i.e. VEGAS and VISIT), the harvested wood is allocated to pools with different decay rates (e.g. 1 yr, 10 yr and 100 yr for VISIT) to calculate C harvest from forest products. The key model processes impacting NBP components are described in table S1.
To our knowledge, there is no direct global observation of the land carbon balance (i.e. NBP in biome models' simulations), except for the global forest sink on decadal scale (Pan et al 2011). Due to the varying degree of detail in the representation of vegetation types among models, the forest carbon balance dataset cannot be used to benchmark modeled NBP. The GCP carbon budget reported by Le Quéré et al (2015) provided an estimate of the land carbon balance decomposed into two components: net landuse change emissions (E LUC ) and the 'residual' land sink (RLS; see Text S2 of supplementary material for detail information on E LUC and RLS). Given that landuse change was taken into account in most ISIMIP2a biome model simulations, the simulated gridded NBP from them was aggregated to a global mean to be comparable to the sum of RLS and E LUC (hereafter referred to as R þ L).

Materials used in the benchmarking analysis
In this study, several materials other than R þ L were used in the benchmarking analysis (section 2.4). They are (1) the Multivariate ENSO Index (MEI) used to define 'moderate to strong' ENSO events (Wolter and Timlin 2011); (2) the NOAA/AVHRR composite GIMMS-NDVI data (Tucker et al 2005); and (3) the total land fluxes from two inversion systems: the Jena CarboScope s81_v3.8 (Rödenbeck et al 2003(Rödenbeck et al , 2006 hereafter referred to as F Jena ), and CAMS v15r2 (Chevallier et al 2005, Chevallier 2013, hereafter referred to as F CAMS ). The detail descriptions of these materials were presented in the Text S3 to S5 of supplementary material. In this study, we focused on carbon fluxes over vegetated land, defined as the land grid cells with mean NDVI larger than 0.1 over the period 1982-2010.
2.4. Benchmarking analysis 2.4.1. Carbon flux changes during ENSO events To evaluate modeled carbon fluxes anomalies in ENSO events, we separated modeled NBP detrended anomalies (hereafter referred to as NBP var , where the detrended anomaly, i.e. the interannual variability is indicated by the 'var' superscript) and R þ L detrended anomalies (R þ L var ) during El Niño and La Niña years (section 2.3). A two-tailed Student's t-test with a 0.05 significance level was used to detect the carbon flux deviations between the two phases of ENSO events. Furthermore, the spatial pattern of NBP var in ENSO events was compared to that of NDVI detrended anomalies (NDVI var ) to test the hypothesis that above-ground biomass/productivity dynamics may drive the NBP negative (source anomaly). To generate a model-dependent but forcing-independent response, the anomalies from the model output driven by the two climate forcings were used for each biome model (i.e. 80 anomalies for each model, 40 yr per forcing Â 2 climate forcings).

Response of carbon fluxes to tropical climate variations
We used a multiple regression approach to diagnose, from the model simulations, the response of carbon flux (Gross primary productivity (GPP) or NBP) to tropical climate variability over the period 1981À2010: where y is a global carbon flux (i.e. GPP, NBP, F Jena , F CAMS , or R þ L); T is the tropical mean annual temperature; P is the tropical annual precipitation; SWis the tropical downward short-wave radiation. The tropical climate values were calculated as the spatial average over vegetated tropical lands (23°N to 23°S) with mean NDVI larger than 0.1 for the period 1982-2010 The fitted regression coefficients g int , d int , and ' int are apparent carbon flux (GPP, NBP, F Jena , F CAMS , or R þ L) sensitivities as defined in Piao et al (2013) to interannual variations in tropical mean annual temperature, annual precipitation and solar radiation respectively, and e the residual error term. As emphasized in Piao et al (2013), g int , d int , and ' int are not the 'true' sensitivities of carbon fluxes because of (i) temperature, precipitation and solar radiation all covary over time; and (ii) other climate drivers discarded in equation (3), such as humidity, and wind speed may also contribute to the variability of carbon fluxes. The regression coefficients are maximum likelihood estimates. The uncertainty in g int , d int , and ' int was represented as the standard error of the corresponding regression coefficients. To generate a model-dependent but forcing-independent response, the anomalies from the model output driven by the two climate forcings were combined for each regression (i.e. for each model, 80 anomalies, 40 yr per forcing Â 2 climate forcings, of model output and the corresponding climate anomalies were used). We have excluded the years that immediately follow major volcanic eruptions (i.e. El Chichón, Mexico: 1982Mexico: -1983andPinatubo, Philippine: 1991-1993), since during these years the carbon cycle may have been significantly perturbed by changes in radiation quantity and diffuse fraction (Robock 2000) that were not included in our climate forcing datasets. For regression of F var Jena , F var CAMS , and data-based R þ L var , we used mean annual temperature data from the Climatic Research Unit (CRU) at the University of East Anglia (Mitchell and Jones 2005), precipitation data from CRU and GPCC, solar radiation from CRU-NCEP v4 dataset (http://dods.extra.cea.fr/data/ p529viov/cruncep/), which were the same as the datasets used by Piao et al (2013).

NBP estimation and comparison with observation
There are large differences among the modeled NBP (that includes 7 models with land-use change) over the period 1971-2010, ranging from 0.3 Pg C yr À1 (LPJmL-PGFv2) to 2.6 Pg C yr À1 (VISIT-GSWP3) with a year-to-year variability ranging from 0.6 Pg C yr À1 (DLEM-PGFv2) to 1.6 Pg C yr À1 (LPJ-GUESS-GSWP3). The model ensemble-mean NBP is 1.3 Pg C yr À1 and 1.4 Pg C yr À1 driven by PGFv2 and GSWP3 forcings respectively, which is higher than the R þ L  ; table 1). However, the model ensemble-mean NBP trend from ISIMIP2a models was lower than the trends estimated from R þ L and the two inversion systems during the same period (table 1). For the analysis of the interannual variability in modeled NBP, the models generally show good agreement between the NBP variability and the variability of R þ L (r ¼ 0.6 ± 0.1, with all correlation significant at p < 0.05 for the period 1971-2010; figure 2), and that of the land sink from both atmospheric inversions (r ¼ 0.6 ± 0.1 and r ¼ 0.6 ± 0.1, with all correlation significant at p < 0.05 for the period 1981-2010 for Jena CarboScope s81_v3.8 and for CAMS v15r2 respectively). The interannual variation of R þ L and the land carbon flux from the two atmospheric inversions are very similar (r > 0.9 for any two of the dataset). Model ensemble-mean NBP from tropical (23°S--23°N), North Hemisphere extra-tropical (NH extratropical: 23°N-90°N), and South Hemisphere extratropical (SH extra-tropical: 90°S-23°S) ecosystems represent 31 ± 17%, 64 ± 19% and 5 ± 4% of global NBP respectively for the period of 1971-2010, across all ISIMIP2a models. The year-to-year variations of tropical NBP are 0.9 ± 0.3 Pg C yr À1 (table 2), which is much higher than that of NBP in NH (0.4 ± 0.2 Pg C yr À1 ) and SH extra-tropical ecosystems (0.2 ± 0.1 Pg C yr À1 ). The correlation (r) between the interannual variation of global NBP and tropical NBP reaches 0.90 ± 0.05, which is higher than that between the interannual variation of global NBP and extra-tropical NBP (r ¼ 0.3 ± 0.2 between NH extra-tropical and global NBP; r ¼ 0.5 ± 0.2 between SH extra-tropical and global NBP).
More than 85% (i.e. 12 out of 14) of simulations considering land-use change agree on the existence of significant carbon sink over 42% of vegetated land (i.e. 47 Â 10 6 km 2 ), including strong carbon sinks (positive NBP larger than 20 gC m À2 yr À1 ) for intact tropical forest regions (in South America, Africa and Southeast Asia), eastern United States, south China, southeast Asia, as well as weak carbon sinks in high latitudes and northeast Australia (figure 3(a); figure S1). A negative ensemble-mean NBP (carbon source) is simulated in fewer regions, namely southeast South America, Sub-Saharan Africa, and Middle East. However, these negative NBP regions, covering 23 Â 10 6 km 2 , are not consistent in the different simulations. Significant negative NBP (i.e. a net CO 2 source) is only consistent across most models (> 85% of simulations) over 1.4 Â 10 6 km 2 (i.e. only 1.3% of vegetated land; figure 3(a); figure S1). The model ensemble-mean trend of NBP shows distinct spatial patterns. A significant decrease of NBP is simulated with a good agreement between most models (> 85% of simulations) over 8.1 Â 10 6 km 2 of land, concentrated over South America (including the Amazon forest), western North America, southeastern Africa, northeastern India, southeast Asia, northeastern China, and some grid cells in central Australia (figure 3(b). The NBP positive trends simulated across most models (> 85% of simulations) are significant and in close agreement over 20.6 Â 10 6 km 2 of vegetated land (e.g. Alaska, northeastern North America, high-latitude Eurasia, sub-Saharan western Africa, Angola, and southern China). Environ. Res. Lett. 12 (2017) 045002

NBP anomalies during ENSO events
The simulated NBP by all models are significantly (p < 0.05) larger (stronger carbon sink or weaker carbon source) during La Niña years than that during El Niño years, consistent with the data-based R þ L (even though only significant at p ¼ 0.06; figure 4). The modeled global composite NBP anomaly difference between the two phases of ENSO events (hereafter ΔNBP LÀE ¼ NBP LaNiña À NBP ElNiño ) is 1.5 ± 0.5 Pg C yr À1 , which is larger than that estimated from R þ L (1.0 Pg C yr À1 ). This NBP difference is in the models mainly due to the NPP differences (hereafter ΔNPP LÀE ¼ NPP LaNiña À NPP ElNiño, of 1.6 ± 0.4 Pg C yr À1 , i.e. 3.0 ± 0.8% of modeled global mean annual NPP) rather than being the result of Rh differences (ΔRh LÀE ¼ Rh LaNiña À Rh ElNiño, 0.3 ± 0.4 Pg C yr À1 , i.e. 0.6 ± 0.8% of modeled global mean annual Rh). It is noteworthy that positive ΔRh LÀE Table 1. Mean NBP and NBP trend estimation from ISIMIP2a models, R þ L, inversions and TRENDY models, and the correlation of NBP interannual variability among them. Values for different periods are shown depending on the time span of datasets used to compare. The ensemble-mean NBP and its trend from ISIMIP2a models account for 7 models with land-use change (i.e. excluding CARAIB), while the correlation on interannual variability account for all 8 ISIMIP2a models. Var. in ensemble-mean NBP from ISIMIP2a models indicates the ensemble-mean year-to-year variation ± the standard deviation among models. are simulated by 6 out of 8 models, and negative ΔRh LÀE are simulated by CARAIB and LPJ-GUESS. The residual ΔNBP LÀE can be attributed to anomalies of C fire or C harvest (equations (1) and (2)) caused by the ENSO events. These processes are only taken into account in some models (section 2.2), which precludes a more systematic analysis. NDVI data also shows a higher mean value during La Niña years than that during El Niño years (ΔNDVI LÀE ¼ 0.0017, i.e. 0.4% of global mean annual NDVI), whereby the difference is not significant (p ¼ 0.53). It is also noteworthy that the ΔNBP LÀE and ΔNPP LÀE over tropical region (23°SÀ23°N) are 1.4 ± 0.6 Pg C yr À1 and 1.3 ± 0.4 Pg C yr À1 respectively, and contributed to most of the modeled global NBP and NPP differences between La Niña and El Niño events (figures 4(a) and (b)). During El Niño years, strong negative NBP anomalies (<À20 g C m À2 yr À1 ) are simulated in most tropical and sub-tropical regions (30°SÀ30°N) except in southern Brazil and eastern Africa, while positive NBP anomalies during El Niño years are simulated in western temperate North America, southern Brazil, and in some regions of temperate and boreal central Asia (figure 5). During La Niña years, in general opposite NBP anomaly patterns are simulated. More than 87.5% of the simulations (i.e. 14 out of 16) agree well on the sign (negative/positive) of NBP/NPP anomalies in regions where strong mean anomalies are simulated ( figure 5).
The spatial patterns of simulated mean NPP anomalies during both phases of ENSO (figures 5(c) and (d)) are very similar to those of mean NBP anomalies (figures 5(a) and (b)). Generally, the simulated NPP mean anomalies during both phases of ENSO are similar to observed NDVI anomalies. Significant NPP anomalies (negative and positive) are consistently simulated by 14 out of 16 simulations over 39% (i.e. 44 Â 10 6 km 2 ) and 40% (i.e. 46 Â 10 6 km 2 ) of vegetated land during El Niño years and during La Niña years respectively. Within those areas with  3.3. Variation of global GPP and NBP coupled with tropical temperature, precipitation and solar radiation variability The interannual variation of modeled global GPP is significantly and negatively correlated with tropical temperature variation in 2 out of 8 models (JULES and VISIT; all time series are detrended), but the magnitude of the responses is largely different and even the sign of g int GPP can be different among models (e.g. positive interannual correlation is found for DLEM; figure 6(a)). The GPP interannual sensitivity to tropical precipitation variation is always positive, resulting in a mean sensitivity of 1.0 ± 0.4 Pg C yr À1 per 100 mm (response is significant at level p < 0.05 for the 4 out of 8 models; figure 6(b)). The correlation between modeled GPP variation and tropical solar radiation variation is insignificant for all models (figure 6(c)).
In contrast to the response of global GPP to tropical temperature, significantly negative responses Environ. Res. Lett. 12 (2017) 045002 of modeled NBP variation to tropical temperature variation are found in all ISIMIP2a biome models, i.e. reduced uptake or larger sources (À3.1 ± 1.4 Pg C yr À1 K À1 , all p < 0.05). g int NBP ranges from À1.0 ± 0.4 Pg C yr À1 K À1 for DLEM to À5.0 ± 0.8 Pg C yr À1 K À1 for CARAIB (figure 6(a). The apparent Rþ L sensitivity to tropical temperature variation ðg int RþL ¼ À5:2 ± 1:0 Pg C yr À1 K À1 ; p < 0:01Þ is larger than all, g int NBP ; g int F Jena ðÀ3:6 ± 0:8 Pg C yr À1 K À1 ; p < 0:01Þ and g int F CAMS ðÀ3:7 ± 0:9 Pg Cyr À1 K À1 ; p < 0:01Þ. The F Jena , F CAMS , data-based R þ L is not significantly correlated with tropical precipitation variation (negative correlations), while modeled NBP variation shows significantly positive response to tropical precipitation variation in 3 out of 8 models (DLEM, LPJmL and ORCHIDEE; figure 6(b)). Similar to ' int GPP , no significant ' int NBP is found for the apparent NBP sensitivity to tropical solar radiation variation among models, and the results are the same for ' int R þ L , ' int

Net carbon fluxes of the ISIMIP2a biome models
The model ensemble-mean NBP (across the 7 models with land-use change) is higher than, but within the uncertainty of R þ L for the period 1971-2010 (figure 1), indicating that the models overestimate the observed terrestrial net carbon flux. The reasons could include but are not limited to inadequate representation of some processes leading to carbon loss from ecosystems in the models. For example, crop harvest is not taken into account in JULES and LPJ-GUESS; fire emissions are not included in DLEM, JULES and ORCHIDEE; wood harvest is only considered in VEGAS and VISIT. However, NBPs from models that include those 3 processes (i.e. VEGAS and VISIT; with NBP of 1.7 ± 0.9 Pg C yr À1 ) do not show a better agreement with R þ L. In addition, shifting cultivation is not considered in all simulations, which could bring uncertainties on the modeled land-use emission (e.g. Stocker et al 2014). The model ensemble-mean NBP from ISIMIP2a models is between the independent model ensemble-mean NBP from TRENDY (1.0 ± 0.4 Pg C yr À1 with year-to-year variation of 1.0 ± 0.2 Pg C yr À1 , including Land-Use Change Emissions and Terrestrial Sink simulated by TRENDY models) and MsTMIP (1.5 ± 1.4 Pg C yr À1 with year-to-year variation of 0.5 ± 0.1 Pg C yr À1 ; derived from SG3 runs with changing CO 2 , climate, and land-use; Huntzinger et al 2013) for the same period . The difference could come from the different models participating in ISIMIP2a (although some common models such as ORCHIDEE and VISIT participated in all projects), from the different model versions, as well as from the climate forcings (e.g. CRU-NCEP for TRENDY and MsTMIP) and/or land-use/land-cover changes forcings (e.g. prescribed in MsTMIP). For example, large differences in simulated NBPare found in common models (but different model version) driven  (table S2). For the period 1981-2010, the model ensemble-mean NBP is larger than R þ L, but in-between the land flux estimates from two inversion systems (table 1). Inversion estimates include anthropogenic and natural CO 2 land uptake, the latter part including the carbon cycle of aquatic continuum from land to ocean (e.g. inland waters; Regnier et al 2013), which can explain why their land sink is larger than the R þ L. R þ L only account for the overall perturbation of the global carbon cycle caused by anthropogenic activities, (i.e. anthropogenic CO 2 sink only; Le Quéré et al 2015). But models only estimate the anthropogenic CO 2 sink if their spin-up was correctly performed, so they should be comparable with R þ L. In addition, it should be kept in mind that even though the R þ L is, to our knowledge, the most comprehensive assessment of the terrestrial net carbon flux, the component RLS and E LUC were reported to have an uncertainty of ± 0.8 Pg C yr À1 and ± 0.5 Pg C yr À1 respectively ( Le Quéré et al 2015). The insignificant correlation between global GPP and NBP across the ISIMIP2a biome models (r ¼ 0.34, p ¼ 0.23 across the 14 simulations considering land-use change; figure 1) indicates that larger modeled GPP does not necessarily result in larger NBP. This finding is consistent with the previous result derived from another model ensemble reported by Piao et al (2013). The differences in carbon residence times and/or in GPP trends may explain the differences in NBP among models. For example, we found that global NBP is significantly correlated with the trend of GPP across the 14 simulations that consider land-use change (r ¼ 0.39, p < 0.05).

Regional contributions to the NBP interannual variation
The ISIMIP2a biome models capture the interannual variation ofglobal net terrestrial ecosystem carbon fluxes well, given the high correlation between NBP variability and the variability in the R þ L (figure 2) or the two inversion systems (table 1), where NBP driven Environ. Res. Lett. 12 (2017) 045002 by GSWP3 forcing shows higher correlation coefficient than that driven by PGFv2 forcing. Though tropical NBP only represents 31 ± 17% of global total NBP during the recent decades, the year-to-year variation of tropical NBP is a major contributor to the variation of global NBP (table 2), which could be seen from 1) synchronous variation between the modeled global NBP and tropical NBP (r ¼ 0.90 ± 0.05; section 3.1), and 2) the similar magnitude of variation of tropical NBP (0.9 ± 0.3 Pg C yr À1 ) compared to variation of global NBP (1.1 ± 0.3 Pg C yr À1 ). The major contribution of tropical NBP variation to global NBP variation in ISIMIP2a models is consistent with the earlier findings (Cox et al 2013. The significant NPP anomalies over the SH extratropical regions, with the same sign as the global NPP anomalies, suggest that the SH extra-tropical ecosystems also play an important role in controlling the variation of global NPP and thus global NBP (table 2, figure S2), which is consistent with the finding of Poulter et al (2014). In contrast, the NPP anomalies over NH extra-tropical regions show opposite phase, which partly compensate the carbon fluxes interannual variation of other regions (table 2).

Driving factors of NBP and its trend
The simulated global carbon sink (i.e. positive NBP; over 42% of vegetated land) and its positive trend (over 18% of vegetated land) in models is partly derived from the larger global NPP increase than the Rh increase in the past decades (figure S3 and S4; also see table 3 for the period of 1990-2009; equation (2)), suggesting that NBP in the models is more driven by the change of NPP than by the change of Rh. The environmental changes in the recent decades including elevated CO 2 concentration, climate change, nitrogen fertilization (including nitrogen deposition which only DLEM and LPJ-GUESS account for in this study), and anthropogenic land-use change have caused deviations from zero of the terrestrial carbon balance. The nonsynchronous evolution of NPP and Rh causes the positive NEP (and further increase NBP; figure 2). For the past two decades (1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009), the ensemble-mean trends in global NPP and Rh (driven by PGFv2 and GSWP3 climate forcings, excluding CARAIB; table 3) are lower than the trends derived from TRENDY model ensembles (NPP trend of 0.22 ± 0.08 Pg C yr À2 and Rh trend of 0.16 ± 0.05 Pg C yr À2 ; S_L2 run: changing CO 2 and climate, and time-invariant present-day land use mask; Sitch et al 2015), which was driven by CRU-NCEP climate forcing (Viovy 2014) and did not consider land-use change in S_L2 run . The simulated NBP trend driven by GSWP3 forcing (0.060 ± 0.024 Pg C yr À2 ) is larger than (p ¼ 0.05 with paired Student's t-test) that driven by PGFv2 forcing (0.044 ± 0.022 Pg C yr À2 ), while the NBP trend derived from TRENDY model ensembles (0.055 ± 0.030 Pg C yr À2 ) is in-between ISIMIP2a NBPs when both forcings were considered, which implies an uncertainty of NBP trend due to uncertain trends of climate forcings, an often overlooked factor when attributing trends of carbon cycle variables to climate change (e.g. Sitch et al 2015). The ensemblemean NBP trends (including ISIMIP2a and TRENDY)  (table 3)  and S5) is found for the model ensemble-mean in many regions, such as tropical forest in northern South America, Africa and Asia, eastern Brazil, southeast Asia, south China, and the middle-to-high latitude regions in north hemisphere (also see figure S5).
The widespread increase of NPP is mainly due to the elevated CO 2 concentration (see Sitch et al 2015, with 4 mutual models in ISIMIP2a and TRENDY). It has the dual effect of increasing leaf photosynthesis and reducing stomatal conductance, thus increasing water-use efficiency (Rotter and Van de Geijn, 1999, Keenan et al 2013. A CO 2 fertilization effect on photosynthesis for C3 (Farquhar et al 1980) and C4 species (Collatz et al 1992) is included in all ISIMIP2a biome models. The magnitude of these effects is, however, heavily debated, for example as population dynamics, which are not represented in great detail in all ISIMIP2a models, might undo physiological effects at the stand scale (e.g. Hickler et al 2015). For temperate and high-latitude (cold) ecosystems in northern hemisphere ( figure S4(a)), NPP could be further enhanced through gradually growing season extension due to the warming trend (e.g. Myneni et al 1997, Tucker et al 2001, Piao et al 2007. In addition, nitrogen fertilization (including nitrogen deposition) may enhance vegetation productivity in the models accounting for nitrogen interactions (i.e. DLEM and LPJ-GUESS in this study). The goal of ISIMIP2a was not to separate the impact of these factors, while their effects could be revealed through other studies with common models regarding NPP-related variables such as LAI (Zhu et al 2016) and evapotranspiration (Mao et al 2015).
A carbon sink is also simulated by a majority of simulations in the Amazon basin and eastern Australia, where NPP is higher than Rh (thus results in positive NBP; data not shown), despite a near-zero or negative NPP trend simulated in some areas (e.g. south Amazon basin, and eastern Australia; figure S4 (a)). However, this carbon sink became weaker in the recent decades (negative NBP trend; figure 3(b)), possibly and partly due to a combination of climate and land-use change in these regions. For example, 13 out of 14 simulations considering land-use change showed a negative NBP trend over intact forest in the Amazon basins (with decreasing rate of À0.50 ± 0.54 gC m À2 yr À2 and À1.08 ± 0.56 gC m À2 yr À2 driven by PGFv2 and GSWP3 forcing respectively), due to the larger increase in Rh than in NPP. Here, intact forest was defined as the grid cells with less than 2% changes in the agricultural fraction of land. In regions that experienced significant agricultural expansion over Amazon basin (i.e. grid cells with more than 2% increase in agricultural fraction during 1971-2010), all simulations showed a negative NBP trend (with decreasing rate of À0.76 ± 0.42 gC m À2 yr À2 and À1.00 ± 0.55 gC m À2 yr À2 driven by PGFv2 and GSWP3 forcing respectively), which could be a result of increased Rh due to deforestation and smaller increase in NPP, or even decreased NPP in 5 out of 14 simulations.
Anthropogenic land-use change is another important factor causing the positive or negative NBP simulated by model ensembles (figure 3(a)). Deforestation for agriculture through wood harvest or burning due to fire could cause intensive carbon loss in reality (e.g. South America, Sub-Saharan Africa, and Southeast Asia in figure 3, figure S6, and table S3; Guo and Gifford 2002, Van der Werf et al 2009), but the fate of carbon after forest loss is represented differently in the biome models where harvested biomass can go either into litter pools (increase Rh; all models), product pools (DLEM) or directly to the atmosphere through burning (DLEM). In addition, a carbon source can be sustained by the removal of above-ground biomass by agricultural practices such as harvest (i.e. positive C harvest in equation (1)). In contrast to the agriculture expansion over tropical regions, there are land-use changes characterized by the abandonment of the agricultural land and possible reforestation in eastern United States (table S3), Europe, eastern Russia and eastern China (figure S6). The abandonment of the agricultural land eliminates the carbon export as agricultural products, increasing the carbon input into soil, and may increase biomass carbon stock by aforestation. All of these processes could induce positive NBP in these regions (figure 3 and S2).

Biome model response to ENSO events
A stronger global carbon sink during the La Niña years than that during the El Niño years is found for the R þ L estimate, and is also simulated by all biome models in this study, implying consistent model capability in capturing the net terrestrial carbon fluxes deviation between the two phases of ENSO at global scale. Model simulations reveal that the NBP deviation during ENSO events could be mainly attributed to the NPP deviation. Rh deviation partly offsets the NPP deviation in 6 out of 8 models (range from 13% for LPJmL to 52% for JULES), CARAIB and LPJ-GUESS being exceptions. ΔRh LÀE is negative for CARAIB (i.e. enhance the NBP deviation), and marginal for LPJ-GUESS. This suggests that the climate anomalies during ENSO events cause the same direction of deviation on NPP and Rh, though that may not always be the case. The very similar spatial pattern between NBP and NPP anomalies during ENSO events confirms again that the NBP deviation is mainly due to the NPP deviation in the biome models. Furthermore, the NDVI anomalies during ENSO events agree in the sign of the modeled NPP anomalies for most regions, globally (section 3.2; figure 5), suggesting that the ISIMIP2a biome models generally are able to capture the NPP anomaly during ENSO events. In the lower part of the Amazon basin and tropical forest in Africa, the opposite sign of anomalies found between modeled NPP and NDVI during the La Niña years could be the result of poor climate data input based on few stations. Apart from the NPP induced NBP deviation during ENSO events, anomalies of fire emissions could also contribute to the NBP deviation. For example, the tropical fire emission anomaly caused by strong 1997-1998 El Niño events was estimated to have a significant contribution to the net terrestrial carbon fluxes anomaly and thus the CO 2 growth rate in that period (Page et al 2002, van der Werf et al 2004, Betts et al 2016. During El Niño years, ISIMIP2a models simulate negative NPP anomalies (lower NPP) over eastern and northern Canada and western Siberia (dominantly boreal ecosystems), but positive NPP anomaly for these regions during the La Niña years, which could be primarily due to the temperature anomalies. Higher mean annual temperature during the La Niña years (figure 5(d) and S3(b)) has the potential to extend the growing season in these regions, possibly partly through advancing the snow-melt (Kirdyanov et al 2003, Piao et al 2007 and enhanced photosynthesis, and with the opposite effect during El Niño years (figure 5(c) and S3(a)). It is noteworthy that the impacts could be diverse for different types of El Niño events (e.g. Capotondi et al 2015), which were not analyzed separately in this study.
For tropical and sub-tropical regions, the positive NPP anomaly generally coincides with a positive precipitation anomaly and a negative temperature anomaly, and vice versa (figures 5(c) and (d), S7(a)À(d)). Higher temperature plus lower precipitation increase water deficit (possibly causing drought) over tropical and sub-tropical regions, which tends to reduce productivity in savannas (a water-limit ecosystem) and forests (Tribuzy 2005, Doughty andGoulden 2008), and increase tree mortality in tropical forest (Allen et al 2010, Phillips et al 2010. Conversely, tropical vegetation generally prospers in the years with lower temperature (negative anomaly) and higher precipitation (positive anomaly; figures S7(b) and (d)), where higher NPP is simulated by models (figures 5(c) and (d)). The ΔNBP LÀE and ΔNPP LÀE over tropical regions (23°S-23°N) dominate the global anomaly differences between ENSO events (section 3.2; figures 4(a) and (b)), which again implies that the interannual variation of global NBP is mainly caused by the tropical NBP variation in models. Furthermore, considering the tight coincidence between tropical NBP anomalies and tropical climate anomalies, the global terrestrial carbon flux anomalies may be strongly driven by the tropical climate anomalies.

Biome model response to tropical climate
The ISIMIP2a biome models responses to tropical climate are generally consistent with the earlier findings (e.g. Baker et al 2006, Cox et al 2013, Wang et al 2014, which suggest that yearto-year variations in global carbon fluxes are strongly connected with the tropical climate variation. Tropical solar radiation variation does not significantly impact the global net carbon fluxes variation, given the fact that ' int NBP ; ' int R þ L ; ' int F Jena and ' int F CAMS are all not significant statistically. However, the different responses to tropical temperature and precipitation derived from modeled NBP and R þ L and inversions (section 3.3) suggest that some models may be undersensitive to tropical temperature variations, but oversensitive to tropical precipitation variations for their simulated net terrestrial carbon fluxes (i.e. NBP), which is consistent with the findings from TRENDY models . It should be noted that the sensitivities obtained in this study only indicates the response of the global spatially averaged carbon fluxes interannual variability to the climate variability averaged for tropical region. The regional (or local) carbon fluxes interannual variability could have different sensitivities to the local climate variability (e.g. stronger response to precipitation and/or radiation variations). For example, Ahlstrom et al (2015) found that global NBP interannual variability becomes more correlated with precipitation (almost as strong as correlation with temperature) at higher levels of disaggregation of climate variables. There is significant correlation between and g int NBP and d int NBP ðr ¼ 0:71; p < 0:05Þ across the model ensemble, which means that high (or low) sensitivity to tropical temperatures may compensate for low (or high) sensitivity to tropical precipitations (i.e. with sensitivity compensation). Furthermore, high positive correlation between d int NBP and d int GPP ðr ¼ 0:80; p < 0:05Þ, and between d int NBP and d int Rh ðr ¼ 0:65; p < 0:08Þ between across model ensemble suggests that the model differences in global NBP response to tropical precipitation variation depend on both NPP and Rh.

Concluding remarks
In this study, we evaluate the NBP from the eight ISIMIP2a biome models against the net global terrestrial ecosystem carbon fluxes that include the land-use-change emissions (E LUC ) and the 'residual' land sink (RLS) derived from the GCP carbon budget analysis by Le Quéré et al (2015) for the period of 1971-2010. We focus on the mean value, spatial distribution, trend and interannual variability of NBP, and investigate the carbon fluxes deviation due to ENSO events. A special attention is paid to the sensitivity of global NBP to tropical climate variability. We found that: 1. the ensemble-mean annual NBP (sink) is higher than but within the uncertainty of R þ L, while simulated NBP trends are lower than that from R þ L; models capture the interannual variability of NBP well; 2. tropical NBP represents 31 ± 17% of global total NBP during the past decades, and the year-toyear variation of tropical NBP is the major contributor to variation of global NBP.
3. different ensemble-mean NBP trends when driven by different climate forcings implies a significant uncertainty of NBP trend due to uncertain trends of climate forcings, an often overlooked factor when attributing trends of carbon cycle variables to climate.
4. all models simulate significant global NBP deviation between El Niño and La Niña years and mainly due to NPP deviation. NDVI shows similar, but not as distinct as significant, deviation between El Niño and La Niña years.
5. the different global net land carbon flux sensitivities to tropical temperature and precipitation derived from modeled NBP, R þ L and inversions indicate that some models may be undersensitive to tropical temperature variation, but over-sensitive to tropical precipitation variation for their simulated NBP.