Introduction

The sand area in north China has the characteristics of the large area, wide distribution, and spanning several different bio-climatic zones1. The hazards of sandstorms and land desertification have seriously hindered ecological restoration and economic development in northern China2. To prevent desertification and realize the restoration of ecology, China has established artificial sand-binding vegetation on 6000,000 ha of windblown sand hazard area, which effectively improved the ecological environment, curbed the spread of land desertification, and decreased wind-sand hazards1,3. The practice has proven that establishing artificial sand-binding vegetation is an effective way for preventing the hazard of wind-sand and promoting regional ecological restoration. However, a series of ecological problems (e.g. new desertification appearing in previously re-vegetated desert regions, groundwater levels beginning to decline and large areas of artificial sand-binding vegetation degrading over the decades, etc.) have arisen with the implementation of a series of ecological projects1. Different from natural vegetation, artificial sand-binding vegetation is established with the clear purpose and function, the stability of re-vegetated ecosystems requires appropriate manual intervention, and maintenance in time3. Therefore, evaluating the stability of artificial sand-binding vegetation timely and accurately can provide evidence for the management and maintenance of re-vegetated ecosystems4.

Evaluating the stability of the ecosystem is one of the core topics in ecology, many researchers have proposed numerous methods to evaluate the stability of ecosystem1,2,5,6,7,8,9. In general, these methods can be mainly divided into dynamical system models1,2,7 and empirical models5,6,8,9. Dynamical system models, including the Lotka–Volterra model10,11, vegetation pattern model12,13, soil moisture coupled with dynamic change of vegetation2,14, etc., were employed to investigate the stability of the ecosystem from the perspective of dynamical system theory; The empirical methods such as fuzzy comprehensive evaluation method5, analytic hierarchy process6, Godron method15, variable weight evaluation method16, etc., comprehensively evaluated the stability of the ecosystem by using the stability indexes that closely related to the structure and function of the ecosystem. Both dynamical system models and empirical methods can be employed to evaluate the stability of ecosystems, but the theoretical results obtained by dynamical system models are difficult to be verified in reality10,11,12,13,14, and the empirical methods have the drawbacks that the weights of evaluation indexes are scored by experts5,6,15,16, the uncertainty of parameters may lead to inconsistency of evaluation results.

Stability is one of the basic characteristics of the ecosystem, and the definition of stability is diversified. E.g., the concepts including resistance, resilience, persistence, variability, local stability, global stability, structural stability, trajectory stability, constancy, relative stability, and set stability, etc., are used to describe ecosystem stability from different perspectives6. As a result, there is still no uniform standard for the measurement of ecosystem stability. This study aims to construct a data-driven hybrid evaluation model by integrating the Bootstrap technique, Monte Carlo simulation, Tagaki–Sugeno fuzzy neural network (T–S FNN), and Kruskal–Wallis (K–W) test to evaluate the stability of re-vegetated ecosystem in southeastern margins of the Tengger Desert under the assumption that the natural vegetation system is stable. The evaluation indexes are selected from vegetation, soil, and soil moisture since the vegetation, soil, and soil moisture are decisive factors for the stability of artificial sand-binding vegetation. The evaluation model is data driven and the requirement of sample size is not high, which can be applied to evaluate the stability of the re-vegetated ecosystems in other areas.

Materials and methods

Study site

The study was conducted in the southeast edges of Tengger Desert (37° 32′ N, 105° 02′ E). The densely distributed trellis dunes are the main landscape type in this area (Fig. 1A). The average annual precipitation is 180.6 mm and the annual average evaporation is 2520.4 mm, the mean monthly temperatures are − 6.9 °C in January and 24.3 °C in July1,6. To ensure the operation of the Shapotou section of the Baotou-Lanzhou railway, the artificially re-vegetated belts were established in 1956a, 1964a, 1981a, and 1987a without irrigation, the moving dunes were fixed, plant species and the vegetation coverage have significantly increased (Fig. 1B), and a biological protective system was eventually established with a length of 16 km and a width of 200–1000 m (Fig. 1A)1,2,6,17,18. However, there are many "activated spots" (Fig. 1C) appeared in the artificially re-vegetated belts due to drought, wind erosion, sand burial, and other factors in the long process of succession, which seriously affect the stability and sustainability of the stability and sustainability of the re-vegetated ecosystems. To ensure the artificial sand-binding vegetation plays the best ecological and economic benefits continuously and stably, it is necessary to accurately evaluate the stability of the re-vegetated ecosystems.

Figure 1
figure 1

Schematic description of the study area. (A) The aerial map of the study area; (B) the fixed duns; (C) the activated spots in the artificially re-vegetated belt. (Using ArcGIS v. 10.8 software; Powered by ESRI “Environmental Systems Research Institute”, www.esri.com).

Evaluation index and data collection

Ten 10 m × 10 m quadrats were set in each artificial sand-binding vegetation belt established in 1956a, 1964a, 1981a, and 1987a as well as in the adjacent natural vegetation belt for a total of 50 quadrats, the vegetation (woody + herbaceous) coverage (%), species, crust and soil thickness (0–5 cm), soil bulk density (%), soil moisture (%), and maximum water holding capacity of surface soil (%) were observed and recorded in September 2020. The Shannon–Wiener index in the observed areas was computed by using \(H = - p_{i} \ln p_{i}\). The measurement methods of other evaluation indicators are omitted since the detailed descriptions can be found in19. Table 1 shows the mean and standard deviation of the evaluation indexes.

Table 1 The mean and standard deviation of the evaluation indexes.

Methods

Bootstrap self-sampling technique

Bootstrap self-sampling is generally used to obtain the robust estimation of the mean and standard deviation of the population by repeating random sampling from observed samples, parametric and non-parametric self-sampling are two ways to obtain Bootstrap samples, where parametric self-sampling requires the population distribution to be known in advance, but it is usually difficult to obtain the population distribution based on a limited sample. The non-parametric Bootstrap self-sampling technique can effectively overcome the defects of parametric sampling and obtain the robust estimation of parameters from small samples. In this study, the non-parametric bootstrap technique was used to estimate the mean and standard deviation of each evaluation index based on the observed data of fixed samples, that is, the mean \(\mu_{i,j}\) and standard deviation \(\sigma_{i,j}\) of each evaluation index in natural vegetation area and the artificial re-vegetated belt in different years (1956a, 1964a, 1981a, and 1987a) were estimated with the sample mean \(\overline{X}_{ij}\) and sample standard deviation \(S_{ij}\).

Monte Carlo simulation

Monte Carlo simulation is a well know random simulation method that can be used to generate pseudo-random numbers of a given distribution. As the measures taken by the artificial sand-binding vegetation engineering within a certain range are the same, we assumed that each evaluation index follows the uniform distribution according to the spatial self-similarity of re-vegetated ecosystems in the Monte Carlo simulation processes1,2,19. On the other hand, it was also reasonable to assume that each evaluation index follows normal distribution by the central limit theorem because the artificial sand-fixing vegetation may be disturbed by various random factors (e.g. micro-topography, moisture, nutrient, etc.) in the long-term succession process. Therefore, Monte Carlo simulation is a feasible and effective method to obtain sufficient samples based on the results of Bootstrap. In the Monte Carlo simulation processes, the mean and standard deviation of generated pseudo-random numbers are consistent with the estimated results of Bootstrap.

Tagaki–Sugeno fuzzy neural network

T–S FNN was proposed by Takagi and Sugeno in 1985 based on fuzzy set theory and fuzzy "if–then" rules20, it combines the advantages of a neural network model and fuzzy inference system, and has the strong adaptive ability and robustness, which is widely used in control theory, water resource assessment, and environmental management5,8,21. However, few scholars have applied T–S FNN to evaluate the stability of the re-vegetated ecosystems.

T–S FNN consists of four layers, including the input layer, fuzzy layer, fuzzy rule calculation layer, and output layer. The input layer is connected to the input vector \(X = [x_{1} ,x_{2} ,\)\(\cdots ,x_{k} ]^{T}\), and each component of the input vector \(X\) is a fuzzy variable, which is defined in the domain \(U_{i}\) with the value \(A_{j}^{i} ,i = 1,2, \cdots ,n,j = 1,2, \cdots ,k\). The fuzzy "if–then" rule is

$$ R^{i} :\,{\text{If}}\,x_{j} \,{\text{is}}\,A_{j}^{i} ,\,j = 1,2, \ldots ,k,\,{\text{then}}\,y = p_{0}^{i} + p_{1}^{i} x_{1} + p_{2}^{i} x_{2} + \cdots + p_{k}^{i} x_{k} $$

The membership function of the input value \(x_{i}\) is determined by

$$ \mu (A_{j}^{i} ) = \exp \left\{ { - (x_{j} - c_{j}^{i} )^{2} /b_{j}^{i} } \right\},\,\;i = 1,2, \ldots ,n,\;j = 1,2, \ldots ,k $$
(1)

where \(c_{j}^{i}\) and \(b_{j}^{i}\) denote the center and width of the membership function, respectively. The applicability \(w_{i}\) of each fuzzy rule is calculated by using the continuous multiplication operator, that is

$$ w_{i} = \mu_{{A_{j}^{1} }} (x_{1} )\mu_{{A_{j}^{2} }} (x_{{2}} ) \ldots \mu_{{A_{j}^{k} }} (x_{k} ),\;i = 1,2, \cdots ,n. $$
(2)

The output of T–S FNN is

$$ y_{i} = \sum\limits_{i = 1}^{n} {w^{i} (p_{0}^{i} + p_{1}^{i} x_{1} + p_{2}^{i} x_{2} + \cdots + p_{k}^{i} x_{k} )/} \sum\limits_{i = 1}^{n} {w^{i} } ,i = 1,2, \cdots ,n. $$
(3)

which can be regarded as evaluation results of the stability of artificial sand-binding vegetation, the smaller the \(y_{i}\) value, the more stable the system is.

The error function is

$$ E = \frac{1}{2}\sum\limits_{i = 1}^{r} {(y_{di} - y_{i} )^{2} } , $$
(4)

where \(y_{di}\) and \(y_{i}\) denote the expected output and the actual output, respectively. The adjustment parameters, including \(p_{j}^{i}\), \(c_{j}^{i}\), and \(b_{j}^{i}\), are computed by using the error backpropagation algorithm and the first order gradient optimization algorithm,

$$ p_{j}^{i} (k + 1) = p_{j}^{i} (k) - \beta \frac{\partial E}{{\partial p_{j}^{i} }}, $$
(5)
$$ c_{j}^{i} (k + 1) = c_{j}^{i} (k) - \beta \frac{\partial E}{{\partial c_{j}^{i} }}, $$
(6)
$$ b_{j}^{i} (k + 1) = b_{j}^{i} (k) - \beta \frac{\partial E}{{\partial b_{j}^{i} }}, $$
(7)

where

$$ \frac{\partial E}{{\partial p_{j}^{i} }} = (y_{di} - y_{i} )w^{i} x_{j} /\sum\limits_{i = 1}^{n} {w^{i} } ,\;i = 1,2, \cdots ,n,\;j = 1,2, \cdots ,k, $$
(8)

and \(\beta > 0\) is the learning rate of T–S FNN. Figure 2 shows the diagram of T–S FNN.

Figure 2
figure 2

The diagram of T–S FNN.

To overcome the drawbacks of insufficient training and over-training in the neural network model, the Nash–Sutcliffe coefficient of efficiency (NSCE), mean absolute percentage error (MAPE), root mean squared error (RMSE), and mean absolute error (MAE), were used to determine the optimal training times of T–S FNN. Table 2 shows the definition of NSCE, MAPE, RMSE, and MAE, where NSCE was a positive indicator, and MAPE, RMSE, and MAE were negative indicators. The closer the value of NSCE is to 1, and the smaller the value of MAPE, RMSE, and MAE is, the better the training effect of the model will be. The trained T–S FNN is used to evaluate the stability of the artificial sand-binding vegetation.

Table 2 The definition of NSCE, MAPE, RMSE, and MAE.

Kruskal–Wallis test

The K–W test is a non-parametric test with the advantage of without the need to meet the normality assumption22. The null hypothesis of the K–W test is that the evaluation results have no significant difference with the significance level \(\alpha = 0.05\). If the P value of the K–W test is less than the significance level \(\alpha\), the null hypothesis is rejected; Otherwise, accept the null hypothesis. In this study, the K–W test was used to determine the significant difference in the evaluation results.

Hybrid evaluation model

In this study, the Bootstrap technique, Monte Carlo simulation, T–S FNN, and K–W test were integrated to construct the hybrid evaluating model to evaluate the stability of re-vegetated ecosystem based on the observed data of fixed quadrats in the artificially re-vegetated belt established in different years (1956a,1964a,1981a, and 1987a) and natural vegetation area in the southeastern margin of the Tengger Desert. As the vegetation, soil, and soil moisture determine the stability of non-irrigated artificial sand-binding vegetation when the annual average rainfall is unchanged1,2, the vegetation coverage (%), Shannon–Wiener index, crust and soil thickness (cm), soil bulk density (%), soil moisture (%), and maximum water holding capacity of surface soil (%) were selected as the evaluation indexes to represent vegetation, soil and soil moisture in the evaluation process. In addition, We assumed that the natural vegetation system is stable because the natural vegetation system has adapted to the regional climatic and soil conditions in the long course of vegetation succession. The corresponding observation indexes of natural vegetation were regarded as the benchmarks to determine the membership degree of the evaluation index. MATLAB software (R2019a, MathWorks, USA) is utilized to implement all the computing processes. The main steps of the data-driven evaluating model are as follows:

  • Step 1. Selecting the evaluation indexes from the aspects including the vegetation, soil, and soil moisture since these factors determine the stability of artificial sand-binding vegetation.

  • Step 2. The Bootstrap technique was used to obtain robust estimates of the mean and variance of the population of evaluation indexes based on the observed data.

  • Step 3. Monte Carlo simulation was employed to generate the training and testing set of T–S FNN under the assumption that each evaluation index obeys uniform distribution or normal distribution, respectively.

  • Step 4. The indexes of undisturbed natural vegetation in the same area were taken as the reference to determine the membership degree of the evaluation index, and the T–S FNN was used to evaluate the stability of the artificial sand-binding vegetation.

  • Step 5. The K–W test is used to determine whether there is significant differences in the evaluation results of T–S FNN.

Results

Table 3 shows the Bootstrap estimation of the population mean and standard deviation of the evaluation indexes, where the sampling number is 1000. As the distribution of each evaluation index is unknown, we assumed that each evaluation index obeys uniform distribution and normal distribution, respectively. Monte Carlo simulation was used to randomly generate 400 samples (Fig. 3) that obey uniform distribution and normal distribution, respectively (Supplementary information file). The mean and standard deviation of generated pseudo-random numbers were determined according to Table 3. The first 350 samples were used to train T–S FNN, and the last 50 samples were used for testing. The ratio between the training and testing sets was 7:1. The input node of T–S FNN was 6, the number of hidden layer nodes was 12, and the output node was 1.

Table 3 The Bootstrap estimation of the population mean and standard deviation of each evaluation index.
Figure 3
figure 3

The Monte Carlo simulation results of evaluation indexes. (A) The evaluation indexes are uniformly distributed; (B) The evaluation indexes are normally distributed.

To determine the optimal training times and prevent insufficient training or over-training of T–S FNN, the training times of T–S FNN were set as 500, 1000, 1500, and 2000, respectively. The NSCE, MAPE, RMSE, and MAE were used to evaluate the training effect. The results of NSCE, MAPE, RMSE, and MAE at different training times under the assumption that each evaluation index obeys uniform or normal distribution are shown in Table 4. The trained T–S FNN was employed to evaluate the stability of artificially re-vegetated belts in different years (1956a, 1964a, 1981a, and 1987a). The evaluation results of different training times are shown in Tables 5 and 6, respectively.

Table 4 The NSCE, MAPE, RMSE and MAE of T–S FNN with different training times.
Table 5 The evaluation results of T–S FNN under different training times under the assumption that each evaluation index is uniformly distributed.
Table 6 The evaluation results of T–S FNN under different training times under the assumption that each evaluation index is normally distributed.

As shown in Table 4, when the training times of TS-FNN are 1500, NSCE reaches the maximum, and MAPE, RMSE, and MAE close to the maximum, indicating that the optimal training times of T–S FNN is 1500. Tables 5 and 6 show that the evaluation results of artificial sand-binding vegetation in different years became accurate with the increase in training times, and the standard deviation of the evaluation results is the smallest if the training number is 1500. The mean of the stability evaluation results of artificial sand-binding vegetation in different years (1956a,1964a,1981a, and 1987a) are 1.0215, 1.0937, 2.3138, and 3.0077 under the assumption that each evaluation index is uniformly distributed (Table 5); The mean of evaluation results of the artificial sand-binding vegetation with the same number of training times are 1.2205, 1.3735, 2.3873, and 2.8405 under the assumption that each evaluation index is normally distributed (Table 6). Therefore, we can conclude that the stability of artificial sand-binding vegetation in different years is: artificial sand-binding vegetation established in 1956a > artificial sand-binding vegetation established in 1964a > artificial sand-binding vegetation established in 1981a > artificial sand-binding vegetation established in 1987a, suggesting that the stability of the artificially re-vegetated belt established in different years (1956a, 1964a, 1981a, and 1987a) tend to be stable with the increase of sand fixation years.

K–W test was employed to determine whether there is significant differences in the evaluation results of artificial sand-binding vegetation established in different years (1956a, 1964a, 1981a, and 1987a) (Tables 5, 6) with the significance level \(\alpha\) of 0.05. The P value of the evaluation results of artificial sand-binding vegetation belts established in 1956a and 1964a are 0.2568 and 0.3643 under the assumption that the population distribution of evaluation indexes are uniformly distributed or normally distributed (Table 7), which are all greater than the significance level \(\alpha\), indicating that there is no difference in the stability evaluation results of the artificial sand-binding vegetation belts between 1956 and 1964a, that is, the artificial sand-binding vegetation belts of 1956a and 1964a have almost the same stability; The P value of the evaluation results of artificial sand-binding vegetation belts established in 1981a and 1987a are all close to 0.0002, which significantly less than the significance level \(\alpha\), indicating that the stability of artificial sand-binding vegetation belts established in 1981a and 1987a have significant difference (Table 7).

Table 7 The results of the K–W test.

Discussion

Model testing

  • The traditional evaluation methods can be mainly divided into dynamical system models and empirical models. However, the theoretical results obtained by dynamical system models are difficult to be verified in reality, and the empirical methods have the drawbacks that the weights of evaluation indexes are scored by experts, and the uncertainty of parameters may lead to inconsistency of evaluation results. In addition, evaluating the stability of an ecosystem comprehensively and systematically requires a large number of observed variables and data, which will lead to huge costs. Therefore, how to evaluate the stability of re-vegetated ecosystems with limited observational data is a challenging problem.

In this study, the bootstrap technique was employed to obtain the robust estimation of the mean and standard deviation of each evaluation index based on the observed data, which provides a standard for Monte Carlo simulation. As mentioned above, Monte Carlo simulation was employed to generate enough pseudo-random numbers for training the T–S FNN under the different assumptions, which makes it possible to evaluate the stability of artificial sand-binding vegetation systems by using the machine learning model. As T–S FNN combines the advantages of the neural network model and fuzzy inference system, and the evaluating results clearly show the stability score in each quadrat in each artificial sand-binding vegetation belt established in different years, which provides a basis for the precision management of re-vegetated ecosystems in the study area. Finally, the K–W test was employed to determine the difference in stability of artificial sand-binding vegetation established in different years (1956a, 1964a, 1981a, and 1987a), and the result show that the artificial sand-binding vegetation belts established in 1956a and 1964a have almost the same stability, but the stability of artificial sand-binding vegetation belts established in 1981a and 1987a have a significant difference. This conclusion is consistent with the views of other scholars1,2,6,7.

Although the application of the proposed evaluation model requires a large number of pseudo-random data, and the evaluation results are completely determined by the input variables and the training times of T–S FNN, the proposed evaluation model has its advantages. E.g., compared with the traditional evaluation methods, the proposed evaluation model is data-driven, which effectively overcomes the drawbacks that the theoretical results of dynamical system models are difficult to be verified in reality, and the weight of evaluation index in empirical methods exists in the uncertainty. The proposed evaluation model has good universality, which can be employed to evaluate the stability of artificial sand-binding vegetation with limited observational data in other bioclimatic zones.

The stability of the revegetated ecosystems

The stability of the artificial sand-binding vegetation is a necessary condition for the sustainability of the re-vegetated ecosystems, which determines the rise and fall of the re-vegetated ecosystem, and relates to the prospective function and skopos. Before the establishment of the artificial sand-binding vegetation system, the native shrub coverage in the southeastern margin of the Tengger Desert was below 1%. With the establishment of the artificial sand-binding vegetation, the shrub coverage reached 33% at most after 15a, and the coverage of the herbs did not exceed 5%. With the increase of sand fixation years and the continuous colonization of herbaceous species, shrub coverage gradually decreased to 6%-10%, while herbaceous species increased by over 30%1,2,6,17,18. The main artificial sand-binding shrubs (e.g. Caragana korshinskii, Caragana microphylla, Calligonum mongolicum, Hedysarum scoparium, Atraphaxis Bracteata and Artemisia ordosica, etc.) have been gradually replaced by natural herbs (e.g. Bassia dasyphylla, A. capillaries, Aristida Adscensionis, Eragrostis minor, Salsola Ruthenica, Setaria Viridis, Stipa Glareosa, Artemisia blepharolepis, Corpermum declinatum steph. ex Iljin, Agriophyllum, Echinops gmelinii, Scorzonera divargos, Allium mongolicum, Euohorbia humifusa Willd etc.)18. This succession can be regarded as the result of competition between herbs and shrubs for limited water resources.

Soil moisture is the driving force and key a-biotic limiting factor for the succession of artificial sand fixation vegetation. Soil water controls the ecological process of artificial sand-binding vegetation in arid sand areas1,2,3. The establishment of artificial sand-binding vegetation changed the original eco-hydrological process of mobile dunes and promoted the water-holding capacity of surface soil1,7,17,18. With the increase of sand fixation years, the dynamic change of soil moisture changed the distribution pattern of vegetation. The coverage of sand fixation shrubs and deep soil moisture reached a new equilibrium state1,2. According to the niche differentiation theory2,6,17, deep soil moisture restricts shrub coverage, while shallow soil moisture affects herb coverage.

The restoration of soil is the most fundamental indicator to measure the success of ecological reconstruction and restoration in arid sand areas. After the mobile dune was fixed with grass squares, the biological soil crust was successively formed on the sand surface with cyanobacteria, lichens, and mosses as the dominant crust, the formation of topsoil was effectively promoted by the extensive colonization of biological soil crust7,17. With the increase of sand fixation years, the content of organic matter in soil increased. The improvement of surface soil provides a suitable habitat for the settlement and reproduction of soil microorganisms and soil micro-fauna1,18. Due to the continuous accumulation of atmospheric falling dust and humus on the surface soil, the formation of the surface soil in sand areas is promoted, the aggregate structure of the surface soil is increased, the bulk density of soil is reduced, and the water retention ability of soil is improved1,2,6,7. The effective water that can be utilized by the shallow root herb is increased, and a complex community with multiple layers including shrubs, herbs, mosses, lichens and algae is gradually formed in the artificial sand-binding vegetation area6,17,19.

Conclusions

Constructing data-driven hybrid models can overcome the defects of mathematical models and empirical models effectively. In this paper, a data-driven evaluation model based on the Bootstrap technique, Monte Carlo simulation, T–S FNN, and K–W test was proposed to evaluate the stability of re-vegetated ecosystems in the Shapotou section of Baotou-Lanzhou railway under the assumption that the undisturbed natural vegetation is stable. The evaluation results show that the stability of the artificial sand-binding vegetation belts established in different years (1956a, 1964a, 1981a, and 1987a) tends to be stable with the increase of sand fixation years, and the artificially re-vegetated belt of 1956a and 1964a have almost the same stability, but the stability of the artificially re-vegetated belt between 1981 and 1987a have a significant difference. The evaluation model is data-driven, and the evaluation results depend on the inherent structure of the model. Therefore, the research method in this paper is also applicable to evaluate the stability of other ecosystems.