Estimating the grassland aboveground biomass in the Three-River Headwater Region of China using machine learning and Bayesian model averaging

Spatially and temporally explicit information on the biomass in terrestrial ecosystems is essential to better understand the carbon cycle and achieve vegetation resource conservation. As a climate-sensitive critical ecological function area, accurate monitoring of the spatiotemporal variation in the grassland aboveground biomass (AGB) is important in the Three-River Headwater Region (TRHR) of China. In this study, based on field observation, remote sensing, meteorological and topographical data, we estimated the grassland AGB in the TRHR and analyzed its spatiotemporal change and response to climatic factors. Four machine learning (ML) models (random forest (RF), cubist, artificial neural network and support vector machine models) were constructed and compared for AGB simulation purposes. The AGB results estimated with the four ML models were then applied in integrated analysis via Bayesian model averaging (BMA) to obtain more accurate and stable estimates. Our results demonstrated that the RF model performed better among the four ML models (testing dataset: correlation coefficient (r) = 0.84; root mean squared error = 76.99 g m−2), and BMA improved grassland AGB prediction based on the multimodel results. The spatial distribution of the grassland AGB in the TRHR was heterogeneous, with higher values in the southeast and lower values in the northwest. The interannual variation in the grassland AGB in most areas of the TRHR exhibited nonsignificant increasing trends from 2000 to 2018, and the sensitivity of the AGB to the annual precipitation was obviously modulated by regional water conditions. This study provides a more precise method for grassland AGB estimation, and these findings are expected to enable improved assessments to obtain a greater grassland AGB understanding.


Introduction
Grasslands are some of the most widely distributed ecosystems on Earth, covering approximately 20% of the land surface area (Scurlock 1998, Suttie et al 2005. As an important part of terrestrial ecosystems, grasslands play a crucial role in carbon cycling, biodiversity conservation and forage production (Akiyama 2007, Fu et al 2014, Mowll et al 2015. The Three-River Headwater Region (TRHR) is located on the northeastern Tibetan Plateau, which is the source of the Yangtze, Yellow (Huang He), and Lantsang (Mekong) rivers (Mao et al 2016), with a mean elevation over 4000 m. Moreover, as an important grazing economic zone, the main natural vegetation type in the TRHR is alpine grassland. Due to its unique geographic location, distinguished ecological function and abundant natural resources, the TRHR is a crucial ecological function conservation area in China (Jiang 2015). However, the grassland vegetation in this region is relatively fragile and extremely sensitive to climate change, considering the high altitude and unique natural conditions of the alpine zone (Luo et al 2002, Yang et al 2008. Therefore, the TRHR grassland has received much attention from researchers, who have assessed grassland conditions to better understand the complex interaction between the grassland ecosystem and climate change (Yang et al 2009, Zhang et al 2011, Mao et al 2014, Xia et al 2018.
The aboveground biomass (AGB) is a key parameter to reflect carbon accumulation and vegetation activity (Flombaum et al 2007. Accurate quantification of AGB and its spatiotemporal variations is important to reveal the carbon cycle of grassland ecosystems and to protect pasture resources (Anaya et al 2009). However, it is impossible to directly assess the AGB over a large area, as field measurements require excessive manpower and are time consuming. Alternatively, two main types of spatial methods have been applied in AGB simulation: process-based models and data-driven empirical models. Process-based ecosystem models estimate the biomass via simulation of the biophysical processes of vegetation (Fan et al 2010, Zhuang et al 2010, Yi et al 2014; hence, they are suitable tools for the development of a mechanistic understanding of ecosystems. However, the simulation results from different process models are highly different due to the various assumed model mechanisms and input variables (Liang et al 2017). Xia et al (2018) indicated that the maximum and minimum grassland AGB simulation values in the Tibetan Plateau estimated with processbased models differ by a factor of 6. In addition, due to their simplicity and effectiveness, data-driven empirical models have been widely adopted in grassland AGB estimation at different scales and in different regions (Piao et al 2007, Ali et al 2017, Wang et al 2017. For example, empirical regression models have been proposed based on remote sensing normalized difference vegetation index (NDVI) (Xu et al 2007, Gu et al 2013, Xia et al 2014, enhanced vegetation index (EVI) (Han et al 2017), or soil-adjusted vegetation index data (Huete et al 2002). However, the accuracy of regression models based on a single VI is often limited due to the influence of external environmental factors and the ability of VI to characterize vegetation status (Qi et al 1994, Huete et al 2000. Machine learning (ML) is a more powerful empirical method, and over the past two decades, it has been applied in AGB estimation, such as the random forest (RF) (Su et al 2016), artificial neural network (ANN)  and support vector machine (SVM) (Dusseux et al 2014) models.
Several previous studies suggested that ML models are more applicable to ecosystem research than are regression models because they exhibit a relatively complex structure and more accurately describe the nonlinear relationships between predictive and objective variables (Ramoelo et al 2015, Wang et al 2017. With the development of computing technology and the accumulation of field observation data, various ML algorithms have been applied to biomass estimation in various regions and at different scales. On the Tibetan Plateau, Zeng et al (2019) found that the RF model performed well in grassland AGB simulations and could explain 86% of the observed data variation. In the TRHR grassland, Yang et al (2018) constructed a model based on the ANN model for AGB estimation and demonstrated that the established ANN model performed better than statistical regression models. Based on these diverse studies, ML algorithms have been successfully applied grassland AGB estimation. However, the advantages and disadvantages of these ML algorithms must be examined to select suitable models for alpine grassland AGB simulation. In regard to the estimation of the alpine grassland AGB, a comparison of the performance of these ML algorithms has not yet been reported in the recent literature.
In this study, four ML models (the RF, cubist, ANN and SVM models) were considered and compared in terms of grassland AGB estimation in the TRHR based on remote sensing VI data and gridded meteorological, positional and topographic data. Ensemble analysis is an approach to reduce the uncertainty in predicted results based on the variation in the available output of simulation models (Kovalchuk et al 2017). This approach has been employed in certain prediction studies, such as vegetation classification (Du et al 2012) and gross primary productivity estimation (Jung et al 2011). Among ensemble analysis methods, Bayesian model averaging (BMA) has been verified to be a good integration analysis method to generate more accurate and stable estimates (Chen et al 2015. Zhu et al (2016) predicted terrestrial evapotranspiration across northern China via BMA based on multimodel results and achieved better estimates. However, this integration analysis approach has not been investigated in alpine grassland AGB estimation and mapping. Therefore, the estimation results obtained from the different models considered in this study were selected as input data for ensemble analysis via BMA to yield more robust and accurate results.
The primary objectives of this study were to (a) develop, evaluate, and compare the performance of four ML models (RF, cubist, ANN and SVM models) in grassland AGB estimation in the TRHR; (b) apply BMA to integrate the AGB estimates obtained with the various ML methods to generate a highly accurate AGB map in the TRHR from 2000 to 2018; and (c) analyze the spatial and temporal dynamics in the grassland AGB and its response to climate change across the TRHR.

Study area
The TRHR is located in the south of Qinghai Province, China, and is bounded by latitudes 31 • 39 ′ ∼ 37 • 10 ′ N and longitudes 89 • 24 ′ ∼ 102 • 27 ′ E (figure 1). The TRHR is part of the Tibetan Plateau, which is the largest and highest plateau worldwide, with an average altitude above 4000 m, and exhibits a complex and diverse topography and geomorphology. The TRHR belongs to the alpine climate zone, with an average annual temperature range from −0.56 • C to 3.8 • C. Precipitation mostly occurs from June to September, with an average annual rainfall ranging from 262.2 to 772.8 mm. Based on the ChinaCover database map , the TRHR contains an approximate grassland area of 28.24 × 10 4 km 2 , corresponding to 69.08% of the total area (figure 1). The predominant grassland types include alpine meadows, alpine steppes, and sparse grasslands, covering 23.22%, 26.23%, and 19.63%, respectively, of the total TRHR area.

Field observations of the grassland AGB
The field sample data of the grassland AGB (a total of 362 sample points) analyzed in this study were collected from two sources: (a) field measurements and (b) documentation . Surveys were performed across the TRHR from 2005 to 2016, when the standing biomass reached its maximum level (exactly in the middle of August) during the growing season. At each survey spot (30 × 30 m), the AGB was sampled in three independent subplots (1 × 1 m). All of the vegetation in each subplot was harvested and then dried in a dryer at 65 • C until the weight remained the same. The dry weights in the subplots at each survey site were averaged to yield an AGB value. Then, any AGB values in the entire sample point data exceeding the mean value ±3 × the standard deviation (SD) were excluded as outliers. Moreover, the sample points in the grassland area were selected based on the land cover map (ChinaCover).

Spatial data
MOD13Q1 (V006, NDVI and EVI included) was adopted for AGB estimation purposes in this study (http://earthdata.nasa.gov/). The data with a 250 m resolution at 16 d intervals covered the TRHR area (h25v05 and h26v05) from 2000 to 2018. The moderate resolution imaging spectroradiometer (MODIS) reprojection tool was applied to mosaic two MODIS tiles and project images onto the Albers Equal Area projection. To further reduce the effect of clouds and atmospheric contamination, the original time series NDVI and EVI were smoothed using the double logistic curve fit in the TIMESAT software (Lund University, Lund, Sweden) (Jönsson et al 2004). Then, the 16 d NDVI and EVI data were then analyzed to calculate the growing season average, expressed as the growing season NDVI (GSN) and growing season EVI (GSE), respectively.
Moreover, gridded meteorological datasets were downloaded from the National Ecosystem Research Network of China (www.cnern.org.cn), including temperature and precipitation maps with a 250 m resolution at 8 d intervals. The mean annual precipitation (MAP) and mean annual temperature (MAT) were then calculated for AGB estimation and further analysis. A digital elevation model (DEM) was obtained from shuttle radar topography mission (SRTM) images (version 004, 90 m). To ensure consistency with the available data, the SRTM DEM was resampled to a 250 m resolution. The slope was then calculated with the DEM in ArcGIS software (ESRI, USA).

AGB estimation model construction
Four ML algorithms (RF, cubist, ANN and SVM) were evaluated and compared in the present study. RF is a decision-tree-based ML algorithm developed by Breiman (2001). The bootstrap sampling method was adopted in the RF algorithm to perform random sampling with replay on samples. The results obtained in each sampling were then applied to generate the final predictions (Chan et al 2008). Previous studies have suggested that RF accurately describes the complex relationships between predictive and objective variables (Powell et al 2010, Belgiu et al 2016, Wang et al 2017, Xia et al 2018. The cubist model, a regression tree algorithm based on rule sets, was developed by Quinlan (1996). This algorithm generates rule-based models with one or more regulations, with each rule comprising a set of criteria related to a multivariate linear submodel. Therefore, the cubist model attains a high efficiency and is easy to understand (Huang et al 2003, Gu et al 2015, John et al 2018. The ANN model is also an invaluable ML algorithm in remote sensing (Mas et al 2008, Xie et al 2009, which simulates the human learning process by establishing links between predictive and objective variables (Were et al 2015). Various ANN algorithms have been developed and applied, such as radial basis function networks (Niang et al 2006), Hopfield neural networks (Nguyen et al 2006) and back-propagation ANNs (BP-ANNs) (Xie et al 2009). Among them, the BP-ANN model may be the most popular model in biomass estimation , which was selected in this study. The SVM is a statistical learning algorithm that relies on kernel functions to project input data onto a higher-dimensional space where complex nonlinear decision boundaries between classes are linearized (Wang et al 2010). In the new hyperspace, the SVM constructs the optimal hyperplane to separate classes and create maximum gaps between their data or to fit data and obtain predictions under the minimal empirical risk and complexity of the modeling function .
Estimation models of the grassland AGB were constructed based on the AGB field observation data and GSE, GSN, MAT, MAP, DEM, slope, longitude and latitude values of corresponding location and year. All the different combinations of these variables were considered to build models, of which the subset with the highest accuracy was selected. In addition, the model considering all ten variables as inputs served as a comparison. Each model was evaluated via tenfold cross-validation. Specifically, the grassland AGB observation data were randomly divided into ten groups in each analysis: nine groups (approximately 326 points) were reserved as training sample datasets, and the remaining group (36 points) was reserved as the testing dataset. This procedure was repeated ten times until each group was employed as both training and testing datasets. The model performance was then evaluated based on the ten times averaged correlation coefficient (r) and root mean squared error (RMSE). Training and testing of all ML models were performed using MATLAB, R3.5.2, with its as-in randomForest, Cubist, e1071, sp, raster and rgdal packages, and ArcGIS.

Ensemble analysis
In this study, ensemble analysis was adopted as a tool to reduce uncertainty and increase the robustness of the model results. In the case where the various AGB estimation models produced a comparable simulation accuracy but different results, ensemble analysis was applied to combine the outputs. BMA is a statistical technique that combines the predictions of multiple models to yield a more robust and reliable probability ensemble (Duan et al 2007, Tenneson et al 2018. In this study, the BMA scheme was adopted to improve the ML predictions by adjusting the predictive probability density function to obtain a good fitness to the field observations. As a comparison, mean-based ensemble analysis was also applied to calculate the mean of multiple estimation results.

Spatial distribution and dynamic changes in the grassland AGB
The integrated AGB estimates in the TRHR during the 19 year study period (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017)(2018) were averaged to obtain a spatial distribution map of the AGB covering this period. The temporal trend of the grassland AGB from 2000 to 2018 was determined via linear regression of the annual AGB value against the year. In addition, to analyze the response of the interannual variation in the AGB to the temperature and precipitation, the sensitivity of the AGB to the MAT and MAP was calculated through multiple linear regression, in which the MAT and MAP were defined as independent variables and the AGB was defined as the dependent variable in each pixel. At a MAT slope higher than 0 g m −2 • C −1 , the grassland AGB was considered to rise with the temperature.

Descriptive statistics of the grassland AGB
The grassland AGB field observations in the TRHR and corresponding factors (corresponding to the same year) are summarized in table 1. The mean grassland AGB was 223.12 g m −2 and ranged from 12.00 to 670.90 g m −2 . The SD and coefficient of variation (CV) reached 149.81% and 67.15%, respectively, indicating a moderate variability in the grassland AGB in the TRHR. Regarding all predictor variables, the mean and median values were similar, indicating normality of the data distribution. The MAP varied between 280.76 and 895.58 mm in the semiarid, semihumid and humid regions. The correlation matrix between the AGB and the 8 model variables is shown in figure 2 (calculated using data of all 362 plots). Among them, the EVI attained the highest correlation with the AGB (r = 0.7), suggesting that this remote sensing VI provided a good characterization of the grassland AGB. Next, the MAP exhibited a correlation coefficient of 0.6 with the AGB, indicating that the grassland AGB distribution in the TRHR yielded a good positive response to precipitation. Moreover, the altitude was negatively correlated with the AGB (r = −0.4), which demonstrated that the grassland AGB in this area decreased with increasing altitude. In addition, there existed a strongly positive correlation between the EVI and NDVI (r = 0.9). The MAP exhibited a positive correlation with the EVI and NDVI (r = 0.7 and r = 0.7).

Performance of the AGB estimation models
The four ML models (RF, cubist, ANN, and SVM) achieved comparable results, with training r values ranging from 0.81 to 0.88 and testing r values ranging from 0.77 to 0.84 (table 2). In general, the performance of the various ML models based on the input variables selected with the best-fit approach was slightly better than that of the ML models considering all ten variables as inputs. For example, the ANN model based on GSE, MAP, DEM and longitude, which was selected via the best-fit approach, performed better than the ANN model considering all 10 variables. Among the four ML algorithms, the RF model based on GSE, GSN and MAP achieved the best results (training dataset: r = 0.88, RMSE = 72.56 g m −2 ; testing dataset: r = 0.84, RMSE = 76.99 g m −2 ).
To obtain a better performance, the AGB estimation results retrieved from the different ML models were applied in ensemble analysis. The results demonstrated that ensemble analysis yielded higher r and lower RMSE values than those yielded by each single AGB estimation model ( figure 3). The fit of the two combinations was very close to the 1:1 line, which indicated that the predicted values were highly consistent with the measured values. Moreover, BMA (r = 0.88, RMSE = 71.60 g m −2 ) performed better than did the mean-based combination.

Spatiotemporal dynamics of the grassland AGB
Grassland AGB maps in this region from 2000 to 2018 were produced via BMA ensemble analysis, where figure 4 shows the AGB averaged over the studied period. The spatial distribution of the grassland AGB in the TRHR exhibited obvious heterogeneity, which was higher in the southeast and lower in the northwest. The observed pattern was similar to that of the field observations (figure 1). However, more details were revealed that could not be obtained from the limited field observations. In particular, parts of the northwestern TRHR attained very low AGB values (<80 g m −2 ). This conforms to the high altitude and overall cold climate. Most parts of the central TRHR exhibited relatively high AGB values (80-240 g m −2 ). The eastern areas attained the highest AGB values (>240 g m −2 ), consistent with this region functioning as the main grazing area in the TRHR.
During the 19 year study period, the MAT in the TRHR notably increased by 0.02 • C yr -1 (P < 0.05, figure 5). Moreover, a high interannual variation in the MAP occurred in this region, alternating between dry and wet years. Coupled with the interannual changes in the MAT and MAP, the grassland AGB revealed a slight and nonsignificant increasing trend from 2000 to 2018 ( figure 5(a)), at a growth rate of 0.26 g m −2 yr −1 . The lowest grassland AGB value     increasing AGB trend. As shown in figure 6, the AGB attained significant increasing trends (at the 95% confidence level) in ∼11.93% of the grassland area in the TRHR, mainly located in the north (e.g. Tongde and Maduo counties). The interannual trends of the grassland AGB from 2000 to 2018 exhibited a decrease in parts of the southwest area, including Zaduo and Nangqian. Only 3.73% of the area showed a statistically significant grassland decline.

Response of the grassland AGB to climate change in the TRHR
To analyze the interaction between the temperature and precipitation conditions in regard to the grassland AGB distribution, the average AGB was calculated considering temperature and precipitation surfaces (figure 7). Interaction defines multidimensional response surfaces, in which the response of the AGB to the MAT and MAP was basically a ridge, with the maximum AGB value occurring at above average levels of the MAT and MAP. A higher AGB value (>200 g m −2 ) was mainly concentrated in the area with a MAP value above 500 mm and a MAT value higher than −1 • C. The average AGB value at each MAT (1 • C) and MAP (50 mm) level was also calculated. The pattern of the grassland AGB in the TRHR was unimodal in response to the temperature. As shown in figure 7, 0 • C represented a turning point, below which the AGB significantly increased with the MAT (R 2 = 0.92, P < 0.05). Moreover, the spatial distributions of precipitation and the grassland AGB exhibited a positive correlation (R 2 = 0.96, P < 0.05) across the whole region. However, at MAP values above 600 mm, the grassland AGB slightly changed (R 2 = 0.01, P > 0.05).
The temperature sensitivity of the AGB was positive in most of the grassland area in the TRHR (approximately 67%), especially in the northeastern and central parts of the western plateau ( figure 8(a)). The temperature sensitivity exceeded 5 g m −2 • C −1 , indicating that an increase in temperature of 1 • C corresponded to an AGB increase of at least 5 g m −2 in nearly 38% of the grassland area. The results indicated that temperature increase in the TRHR imposed a good promoting effect on the grassland productivity. The spatial pattern of the interannual  AGB variation sensitivity to precipitation was strikingly different from that of the temperature. According to the precipitation contours shown in figure 8(b), the precipitation sensitivity of the grassland AGB was higher in areas where the MAP ranged from 400 to 700 mm, with the precipitation sensitivity exceeding 0.1 g m −2 mm −1 . In areas with a precipitation higher than 700 mm, certain pixels exhibited negative sensitivity values lower than 0.05 g m −2 mm −1 , indicating that an increase in the MAP of 10 mm corresponded to an AGB decrease of 0.5 g m −2 .

Estimation of the grassland AGB
Both the NDVI and EVI are remote sensing VIs widely applied in grassland AGB prediction and mapping. However, the NDVI has often been criticized for reaching saturation in areas with a high grassland coverage  and is sensitive to the atmosphere and soil background (Huete et al 2002). In general, the EVI is an improved index over the NDVI, which attempts to obtain more reasonable VI values considering different vegetation density distributions . However, studies have suggested that the EVI is not necessarily superior to the NDVI . For example, Liang et al (2016) reported that the regression results between the MODIS-derived NDVI and grassland AGB were better than those between the EVI and AGB in the pastoral area of southern Qinghai, China. Previous studies have presented conflicting opinions regarding the performance of the NDVI and EVI in terms of grassland AGB estimation. In this study, the correlation coefficient between the grassland AGB and GSE was 0.69, higher than that between the grassland AGB and GSN (r = 0.65). Moreover, among the AGB estimation models based on ML algorithms, both the NDVI and EVI were selected as input variables (table 2). The RF best-fist model with the input variables of GSE, GSN, and MAP achieved the highest accuracy. Therefore, we propose that more than one VI could be introduced in AGB estimation, thus taking advantage of the synergistic effects between their respective benefits to obtain better estimation results.
Here, four ML algorithms (RF, cubist, ANN and SVM models) were constructed and compared in terms of AGB estimation. The results verified that all ML models produced encouraging results and attained a higher accuracy than that of linear regression models, which is consistent with the results of previous studies . However, regarding which ML algorithm is the optimum method for biomass estimation, different studies based on various datasets in several regions have reported varying conclusions. Compared to SVM and RF, Zhang et al (2018) found that the ANN achieved the best sawgrass marsh biomass estimation results in the coastal Everglades. In our study, RF achieved the best performance among the four ML algorithms in alpine grassland AGB estimation in the TRHR.
Due to differences between their algorithms, the various ML models may predict the AGB with varying degrees of success considering different grassland types and factorial data. RF is a combination of decision tree predictors, where each tree depends on the values of a random vector independently sampled. Therefore, RF functions well with high-dimensional data and attains a high effectiveness and stability. Similarly, the cubist model relies on a decision tree structure. In contrast to RF, the terminal of the cubist model contains multiregression functions, thus making it easier to understand. However, both of these decision tree-based models require many training samples, which must be representative of the overall dataset to achieve reasonable simulation results. The ANN is a relatively complex algorithm that learns complicated schemes and achieves generalization in a noisy context. However, the obtained results are often difficult to interpret, and the results may vary greatly between multiple experiments. The SVM algorithm searches for the optimal hyperplane to minimize training errors and suitably generalizes a given model with limited training samples, but it suffers from parameter allocation problems that seriously affect the results.
As we adopted various estimation algorithms yielding different predictions, the power of BMA ensemble analysis in AGB prediction was also explored. The results indicated that BMA ensemble analysis could be implemented to improve the accuracy of estimations and obtain more robust predictions based on appropriate outcomes retrieved from different models. Empirical models were constructed based on the association between the AGB observation data and remote sensing and/or contextual variables. Therefore, the accuracy of these models was largely limited by the observation data. The availability of abundant and accurate ground observation data for training sets is important in the construction of grassland AGB estimation models. Moreover, the spatial distribution of the field sites is also a crucial factor to obtain reasonable predictions (Zeng et al 2019). Ensemble analysis also provided a solution to spatial uncertainty mapping by calculating the difference between the estimations obtained with the ensemble analysis model and those obtained with each individual model. Uncertainty mapping clearly reveals the spatial errors in regions, which cannot be obtained via statistical result analysis.

Comparison between the estimated grassland AGB and previous studies
We compared our grassland AGB simulation results to previously published AGB estimations in the TRHR to further evaluate our approach (table 3). In this study, the estimated grassland AGB was 158.29 g m −2 from 2000-2018, which is consistent with the predicted AGB value of 169.25 g m −2 reported by Han et al (2017). The average AGB values in the alpine meadows and steppes and sparse grasslands in the TRHR in this study were 235.23 and 136.39 g m −2 , respectively, and those in the study of Zeng et al (2017) were almost the same, at 214.81 and 130.07 g m −2 , respectively. The interannual variation in the grassland AGB in this study revealed an increasing trend from 2000 to 2018, consistent with most existing studies on the grasslands in the TRHR (Fan et al 2010, Han et al 2017.

Effect of the climate on the spatiotemporal pattern of the grassland AGB
The variation in the regional vegetation productivity is considered to mainly depend on energy-related factors. In most parts of the world, an increased vegetation productivity has been linked to global warming (Xia et al 2014). The interannual variation in the grassland AGB in the TRHR exhibited an overall increasing trend, consistent with those reported in other grasslands worldwide (Xia et al 2014, Moreover, the grassland AGB demonstrated a positive response to the temperature in most areas of the TRHR. An increase in temperature may enhance the photosynthetic rate of vegetation and thus increase its productivity (Piao et al 2007, Yang et al 2009. In addition, an increase in temperature may lengthen the growth period of vegetation and enhance its carbon sequestration capacity. Piao et al (2007) reported that the growing season of alpine meadows in China was extended by approximately 0.9 d yr -1 between 1982 and 1999 due to the increase in temperature, with the greenness correspondingly increasing. However, it should be noted that the response of the vegetation productivity to the temperature was nonlinear, and the different response rates of the various regions were obviously distinct. For example, the response rate of the grassland AGB to the temperature in the eastern part of the TRHR was obviously higher than that in the western and southwestern parts ( figure 8(a)). Due to the poor moisture conditions in the western part of the TRHR, a rise in temperature may increase vegetation evapotranspiration and result in intensification of the vegetation growth limitation attributed to moisture. Analysis of the spatial pattern of the AGB considering the temperature gradient also showed that the productivity did not increase linearly with increasing temperature (figure 7). When the MAT was higher than 0 • C but lower than 3 • C, the AGB exhibited a decreasing trend with increasing temperature. This suggests that the response of vegetation to the temperature may be modulated or may vary with many other factors involved, such as precipitation, insect occurrence, snow cover, overgrazing, and changes in nutrient availability (Piao et al 2014).
It has been reported that the vegetation productivity in most regions worldwide is often limited or colimited by water availability conditions (Knapp et al 2002, Fang et al 2005, Yang et al 2008, Felton et al 2021. Evidence has also suggested that the global extent of water limitation is increasing (Babst et al 2019), driven by temperature rise, which has correspondingly increased the magnitude of droughts via an increase in the atmospheric evaporation demand. Thus, the water availability is projected to exert an increasingly dominant influence on terrestrial ecosystem functioning, especially the biomass in the future (Green et al 2019). The TRHR is an important water source ecological function area and the source of major rivers. However, the central and western areas of the TRHR contain many mountains, which play a major role in limiting soil moisture. Moreover, the MAP in most areas of the TRHR was below 600 mm, most areas occurred under semiarid and semihumid conditions, and the vegetation productivity was limited by water conditions. There existed a positive response relationship between the AGB and MAP in most grassland areas of the TRHR ( figure 8(b)). Precipitation remains an important source of available water for vegetation growth in this region, influencing the spatial-temporal distribution of the AGB.
Our analysis demonstrated that the sensitivity (response rate) of the AGB to the MAP in the grasslands of the TRHR is modulated by regional precipitation conditions ( figure 8(b)). In the areas with a relatively low precipitation (i.e. MAP < 400 mm) and a higher risk of drought, the sensitivity of the grassland AGB to the MAP was lower and exhibited little variability. This may be affected by the higher resistance, as caused by the long-term adaptation of vegetation to these arid and semiarid environments (Gao et al 2019). Therefore, reducing the productivity loss due to drought is a conservation pattern suitable for grassland resources (Hu et al 2018). When the MAP varied between 400 and 700 mm, vegetation developed a higher AGB sensitivity to precipitation. The increase in precipitation imposed a good promoting effect on the AGB in this area. In contrast, in the wetter areas (i.e. MAP > 700 mm), the sensitivity of the AGB to the MAP was lower and even negative, indicating that the increase in precipitation in this area yielded no obvious promoting effect on the AGB or even a negative effect. Sufficient precipitation could promote the growth of grassland leaves and the leaf-level photosynthetic capacity and thus improve the grassland productivity. However, too much precipitation, either saturation or even excess conditions, may influence soil nutrients, light energy absorption and other factors, resulting in an increase in precipitation not significantly contributing to productivity or even causing a productivity decline.

Conclusions
Spatial AGB quantification is important to better understand the carbon cycle and realize pastoral agricultural management in the TRHR. Our study constructed and compared four ML models for grassland AGB estimation in this area based on 362 AGB observation points, remote sensing VI data and gridded meteorological, terrain and position data. Among the four ML algorithms, the RF model with GSE, GSN, and MAP as input variables attained the best performance. We propose that more than one VI could be considered in AGB estimation. Via ensemble analysis, the results of the four ML models were combined through BMA, which yielded a higher AGB estimation accuracy than that obtained with a single estimation model. The grassland AGB exhibited a heterogeneous spatial distribution in the TRHR, with higher values in the southeast and lower values in the northwest. The interannual variation in the grassland AGB in most areas of the TRHR revealed increasing trends from 2000 to 2018. The sensitivity (response rate) of the AGB to the annual precipitation in the grasslands of the TRHR was modulated by regional water conditions.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors.