Multi-station artificial intelligence based ensemble modeling of suspended sediment load

In this study, artificial intelligence (AI) models, along with ensemble techniques, were employed to predict the suspended sediment load (SSL) via single-station and multi-station scenarios. Feed Forward Neural Networks (FFNNs), Adaptive Neuro-Fuzzy Inference System (ANFIS), and Support Vector Regression (SVR) were the employed AI models, and simple averaging (SA), weighted averaging (WA), and neural averaging (NA) were the ensemble techniques developed to combine the outputs of the individual AI models and obtain more accurate estimates of the SSL. For this purpose, twenty years of observed streamflow and SSL data from three gauging stations located in the Missouri and Upper Mississippi regions were used at both daily and monthly scales. The results of both scenarios indicated the superiority of the ensemble techniques over the single AI models. The neural ensemble performed more reliably than the other ensemble techniques. For instance, in the first scenario, the ensemble technique improved the prediction performance by up to 20% in the verification phase of the daily and monthly modeling, and by up to 5 and 8% in the verification step of the second scenario.


INTRODUCTION
The total amount of sediment carried out of a watershed by rivers and streams is described as sediment load (SL) and can be categorized into two groups: bed load (BL) and suspended sediment load (SSL). Fine sediments fall within SSL and can be moved long distances before deposition, even under base and low-flow conditions (Salih et al. 2019). SL prediction is a crucial task in hydro-environmental issues and water resources management, as it can provide useful information about watershed erodibility and the deposition of sediment caused by the scouring phenomenon. Moreover, sediment affects water quality by (i) creating visual pollution through increased turbidity, which also raises the cost of purifying drinking and domestic water; (ii) carrying contaminants such as pesticides, nutrients, and other chemicals; and (iii) reducing the dissolved oxygen of the water and consequently limiting aquatic life. It also has a significant effect on the design and maintenance of hydraulic structures, such as the dead volume of a dam, stable channels, and river-bed equilibrium. Furthermore, scouring is one of the destructive phenomena caused by sediment transport, and it is often dominant around bridge piers (Singh et al. 2020) and at culvert outlets (Najafzadeh & Kargar 2019; Pandey & Azamathulla 2021).
Considering all these impacts, the accurate prediction of SL is crucial, yet it is a highly challenging task because of its large temporal and spatial variability. Owing to this, several approaches have been introduced to model and address this issue (Verstraeten & Poesen 2001; Ward et al. 2009), which are usually categorized into three main groups: physically based, conceptual, and black-box models (Nourani & Mano 2007). Conceptual and physically based models deal with the equations and rules involved in the phenomenon and can be beneficial for gaining knowledge of the process, but they have their limitations and complexities (Li 1979; Asselman 1999; Xu 2002; John et al. 2021; Pu et al. 2021). The main obstacle for physically based and conceptual models is that they require high-resolution and high-quality flow and sediment data, which are often not accessible (Sivakumar 2006); when these data are measured directly, the measurement is expensive and any subtle error may affect the modeling results. Direct measurement can also be time-consuming, which encourages researchers to employ data-driven (black-box) models in cases where accurate forecasting is much more important than physical interpretation. In recent decades, artificial intelligence (AI) methods have received great attention and have been utilized in a wide spectrum of hydrologic processes owing to their practical advantages and reliable results (ASCE 2000; Sahoo et al. 2006; Solgi et al. 2014). Since sedimentation and its transport process are highly complex, dynamic, and nonlinear in both spatial and temporal scales, employing AI approaches often results in dependable outcomes (Jain 2001; Kisi 2004; Zhu et al. 2007; Kisi et al. 2008; Partal & Cigizoglu 2008; Nourani 2009a; Mirbagheri et al. 2010). Nagy et al. (2002) predicted suspended sediment concentration (SSC) employing an artificial neural network (ANN) using stream data collected from several reliable sources.
The network was developed based on several input variables, including the Reynolds number, Froude number, stream width ratio, and mobility number, with the SSC as the output parameter. When the ANN model and various commonly used sediment discharge formulas were compared on observed data, the ANN demonstrated better results in model testing than the formulas. Cigizoglu (2004) developed a multi-layer perceptron (MLP) network for the estimation of daily SSL, and the results showed that MLPs capture the complex non-linear behavior of the SSL series much better than conventional models. Azamathulla et al. (2013) employed an extension of genetic programming (GP) called gene expression programming (GEP), along with the adaptive neuro-fuzzy inference system (ANFIS) and regression models, to predict the SSL relation of three Malaysian rivers. The results indicated the outperformance of the GEP approach. Nourani (2014) reviewed AI-based models employed for modeling SSL, investigated the advantages and disadvantages of both AI-based and hybrid models such as GP, and indicated that ANN is more reliable than GP. Nourani & Andalib (2015) used a wavelet-based least-squares support vector machine (WLSSVM) along with a wavelet-based ANN (WANN) to predict the SSL at daily and monthly scales. Comparison of the results of both approaches revealed that WLSSVM is more reliable and robust than WANN for the estimation of sediment loads.
Considering the uncertainties involved in the sedimentation process on the one hand and the ability of the fuzzy concept to handle uncertainties on the other hand, different neuro-fuzzy (NF) and ANFIS models have been utilized for sediment load estimation. Rajaee et al. (2009) modeled SSL using ANNs, NF, Multi Linear Regression (MLR), and the conventional sediment rating curve (SRC). The results demonstrated that NF performs better than the others owing to its nonlinear nature and its ability to handle uncertainties. To predict the functional relationship of sediment transport in sewer pipe systems, Azamathulla et al. (2012) used ANFIS and demonstrated its robustness for practitioners. Support vector machine (SVM), as another kind of AI model, is a relatively recently developed model that demonstrates reliable performance and can be utilized as an alternative to ANN, since SVM, instead of minimizing the error, considers the operational risk as the objective function and minimizes it. Nourani et al. (2016) performed spatio-temporal modeling of monthly SSL employing the SVM model and determined the non-linear relationships among the SSL of three stations located on the Ajichay River. Zounemat-Kermani et al. (2020) modeled both SSL and BL using conventional methods, including SRC and MLR, and AI models, including ANFIS, support vector regression (SVR), and their versions integrated with a genetic algorithm (GA-ANFIS, GA-SVR). The major findings of the research pointed out the outperformance of AI models over conventional models regardless of the input combinations, and the prediction results improved when the integrated versions of the AI models were utilized.
Although AI-based models have been successfully applied in numerous studies to model SL and the results have proved their efficiency, it is a common problem that employing different models to solve the same problem may lead to distinct outcomes. Overall, as can be understood from the presented literature review of the AI models applied in SL prediction, there is almost no consensus among researchers on a single leading model for the estimation of SSL, since these models are case sensitive and their performance may depend on the period of the data employed. For instance, in a time series forecasting task, one model may efficiently capture the maximum values while another model may represent the lower values well (i.e. different performances of a single model for different seasons of the year). Hence, various features of the underlying pattern of a phenomenon can be captured more exactly by combining different models through the ensemble technique. The ensemble approach has already been discussed and applied in different fields of engineering, and SL is not an exception (Shamseldin et al. 1997; Zhang 2003; Sharghi et al. 2018; Nourani et al. 2019, 2021; Sharafati et al. 2020). As an example, Alizadeh et al. (2017) used least-square ensemble models as well as an ANN model linked to different wavelet families (Wavelet-ANN models) to estimate the SL of the Toutle River in Washington state, in which the ensemble models outperformed the single models. However, to the best of the authors' knowledge, the sediment process has not been investigated through the multi-station approach together with the ensemble concept. The aim of this study is therefore to utilize ensemble techniques to predict the SSL in the current time step through single-station and multi-station scenarios. For this purpose, FFNN, SVR, and ANFIS were employed to model the SSL of three stations located in the Missouri and Upper Mississippi regions at daily and monthly scales.
Although there are numerous AI models that can be employed for modeling SSL, the motivations behind choosing these three models are as follows: FFNN is one of the fundamental models used in this field; SVR tries to minimize the operational risk; and ANFIS employs fuzzy logic, which is essential for handling the uncertainties of the modeling results. The multi-station scenario was included in this study because it can handle the uncertainty and non-linearity of SSL modeling by exploring the interaction of the stations' information and recognizing the spatial and temporal variabilities, and can consequently provide more robust results. Moreover, the multi-station scenario makes it possible to estimate the SSL of the downstream station using upstream data, so that measuring SSL values at the downstream station may become unnecessary in the future when this model is used. Afterward, to enhance the prediction efficiency, the outputs of the mentioned AI models for each scenario were used as inputs to develop the ensemble techniques. In this regard, two linear ensemble techniques, simple averaging (SA) and weighted averaging (WA), and one nonlinear technique, neural averaging (NA), were used for each scenario. The objectives of this study can thus be summarized as: (i) AI-based modeling of SSL; (ii) improving the AI-based modeling via the ensemble approach; and (iii) discovering the multi-station relationships.

Hydrometric stations and data sets
Three stations on the Missouri and Mississippi rivers, located in three distinct states (i.e. Nebraska, Illinois, Missouri), were considered to carry out this research. The stations belong to three discrete basins and subbasins of the Missouri and Upper Mississippi regions. The first station is located in Nebraska City and belongs to the Missouri-Nishnabotna basin. The other two stations belong to the Upper Mississippi-Salt and Upper Mississippi-Meramec basins and are respectively located (i) below Grafton, after the confluence of the Mississippi and Illinois Rivers, and (ii) in St. Louis, after the confluence of the Mississippi and Missouri Rivers.
To have an overview of the whole characteristics of the basins, Figure 1 shows the spatial locations of all three stations and Table 1 presents some general information of the stations and land use statistics of the related basins.
According to Figure 1 and Table 1, stations A and B are located upstream on the Missouri and Mississippi rivers, respectively, and station C is located downstream on the Mississippi River. Concerning the drainage area, station C has the largest and station B the smallest. Although the state soils of all stations are subject to severe erosion, as demonstrated in Table 1, the major land uses of the watershed of station B are agriculture and forest lands, whereas station A has fewer green areas than the other two stations. This may cause the basin of station A to be subject to severe soil erosion and consequently to produce a larger amount of SSL compared with station B. Accordingly, it is expected that station A has more interaction with station C owing to its unsuitable land cover and land use states.
Mean daily and monthly streamflow and SSL data for a 20-year period (from October 1997 to September 2017), obtained from the United States Geological Survey (USGS) website (https://waterdata.usgs.gov/nwis), were used to assess the proposed methodology of this study. It is worth noting that the first 75% of the data were assigned to the calibration dataset and the remaining 25% were used as the verification dataset. To facilitate training and to improve the accuracy of the models, all input data were normalized prior to modeling by:

$$S_{norm} = \frac{S_{(t)} - S_{min(t)}}{S_{max(t)} - S_{min(t)}}$$

where $S_{norm}$ is the normalized value of $S_{(t)}$, and $S_{max(t)}$ and $S_{min(t)}$ are the maximum and minimum values of the observed data, respectively. For a better understanding of the flow discharge and SSL conditions of the catchments, Figure 2 depicts the daily time series of SSL and discharge at station C, and the statistical analysis of the daily data is summarized in Table 2.
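The data preparation described above can be illustrated with a minimal sketch. It assumes the common min-max form of normalization (scaling by the observed minimum and maximum) and a chronological 75/25 split; the function names and the sample values are hypothetical, and whether the minimum and maximum were taken over the whole record or over the calibration part only is not stated in the text.

```python
def min_max_normalize(series):
    """Scale a sequence to [0, 1] using its own minimum and maximum."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max normalized")
    return [(s - lo) / (hi - lo) for s in series]


def chronological_split(series, calibration_fraction=0.75):
    """Return (calibration, verification) parts, preserving the time order."""
    cut = int(len(series) * calibration_fraction)
    return series[:cut], series[cut:]


# Hypothetical SSL values, for illustration only (not data from the study).
ssl = [120.0, 80.0, 200.0, 40.0, 160.0, 100.0, 60.0, 180.0]
ssl_norm = min_max_normalize(ssl)
calibration, verification = chronological_split(ssl_norm)
```

The split is taken in time order rather than at random, so the verification part always follows the calibration part, as in the 75%/25% division described above.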

Uncorrected Proof
An inspection of the SSL and discharge time series of the data set (Figure 2) for trend and randomness indicates that the flow and erosion conditions did not undergo any significant change from 1997 to 2017; the Mann-Kendall test likewise showed no trend or randomness in the data set.
As Table 2 reports, stations A and B have similar mean SSL values, while the mean flow discharge of station B is almost three times larger than that of station A, which indicates that considerable sedimentation occurs in the Missouri region. Not surprisingly, both the flow discharge and the SSL of station C are remarkably larger than those of the other two stations, since it is located after the confluence of the Missouri and Mississippi rivers. Furthermore, the higher values of the coefficient of variation (CV) for SSL compared with the flow discharge data show the higher deviation of the SSL data from the mean and describe SSL as a more sporadic phenomenon than the flow rate series itself. It is worth noting that when some uncertainties are present in the dataset, the ANFIS model can handle some of them owing to its use of the fuzzy concept, and the ensemble technique can reduce the uncertainties of the applied models.
The correlation coefficient and mutual information (MI) are two common measures employed for detecting the relation between different variables. Unlike the correlation coefficient, which describes the linear relation between variables, MI captures their nonlinear relation. The MI value between random variables X and Y can be written in the form (Yang et al. 2000):

$$MI(X, Y) = H(x) + H(y) - H(x, y)$$

where x and y are the probability distributions of variables X and Y; H(x) and H(y) are respectively the entropies of distributions x and y, and H(x, y) is their joint entropy:

$$H(x, y) = -\sum_{x \in X} \sum_{y \in Y} p_{XY}(x, y) \log p_{XY}(x, y)$$

where $p_{XY}(x, y)$ is the joint distribution. In this study, to assess the feasibility of applying the multi-station scenario, MI was used to explore the relation of the discharge and SSL of stations A and B and the discharge of station C with the SSL time series of station C (see Table 3). In Table 3, $S^i_t$ denotes the SSL of station i and $Q^i_t$ denotes the flow discharge of station i, all in the current time step. As can be seen from Table 3, both stations A and B have strong interdependency with the SSL of station C. However, the discharge of station B has a slightly larger MI value, which could be due to the shorter distance between stations B and C.
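As a rough illustration of how MI can be estimated from two observed series, the sketch below bins each series into equal-width classes and applies the entropy identity MI(X, Y) = H(X) + H(Y) - H(X, Y). The binning scheme, the number of bins, and all function names are assumptions made for this example; the text does not state how the probability distributions were estimated.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def discretize(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant series
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Estimate MI(X, Y) = H(X) + H(Y) - H(X, Y) from binned samples."""
    bx, by = discretize(x, bins), discretize(y, bins)
    joint = list(zip(bx, by))  # each pair is one joint outcome
    return entropy(bx) + entropy(by) - entropy(joint)
```

With this estimator, a series compared with itself returns its own entropy, while two series whose binned values are independent return a value close to zero. Histogram-based estimates are sensitive to the number of bins, so values computed this way are only comparable when the same binning is used throughout.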

Proposed methodology
In this study, three different AI-based black-box models, including ANN (a commonly used AI method), ANFIS (an AI method which deals with the uncertainties involved in the process), and SVR, were developed to predict the SSL of the study stations. Then, to improve the accuracy and decrease the uncertainty of the predictive models, the outputs of these single models were fed into the ensemble unit, comprising three ensemble techniques: the SA, WA, and NA methods. The modeling was carried out through two scenarios: single-station and multi-station modeling.
(i) First scenario. In the first scenario, single-station modeling, the aim was to estimate the SSL of each station in the current time step (t) using its own historical data. For this purpose, the prediction of each station's SSL can be formulated as:

$$S^i_t = f\left(S^i_{t-1}, S^i_{t-2}, \ldots, S^i_{t-n}, Q^i_t, Q^i_{t-1}, Q^i_{t-2}, \ldots, Q^i_{t-m}\right)$$

where i stands for the modeling station, and the SSL of the ith station at time step t is considered as a function (f) of the ith station's SSL values at previous time steps (t − 1, t − 2, …) up to lag time n, and of the discharge values at time t and previous time steps (t − 1, t − 2, …) up to lag time m. It is worth noting that the most efficient lags (n, m) and inputs can be determined through a trial and error procedure.
(ii) Second scenario. In the second scenario, multi-station modeling, the aim was to predict the SSL of station C using the discharge and SSL values of stations A and B, for use when SSL is not measured at station C because of technical and/or financial shortages. Owing to the particular spatial locations of these three stations, as can be seen in Figure 1, this is expected to be a practical technique for eliminating the need for SSL measurement at station C. Hence, the general mathematical formulation for this scenario can be written as:

$$S^C_t = f\left(S^A_{t-n}, S^B_{t-m}, Q^A_{t-k}, Q^B_{t-z}, Q^C_t\right)$$

where $S^C_t$ denotes the SSL of station C in the current time step; $S^A_{t-n}$, $S^B_{t-m}$, $Q^A_{t-k}$ and $Q^B_{t-z}$ are the SSL and discharge of stations A and B at different time steps; and $Q^C_t$ is the discharge value of station C at the current time step. As in the first scenario, the dominant input combination and lags can be specified using a trial and error procedure.
It is worth noting that, as can be seen in Figure 1, in addition to the main rivers of the study area (i.e. Missouri and Mississippi) there are several other tributaries that may have some effect on the SSL of the studied rivers. The effect of these tributaries has been included by developing black-box models, which employ the historical data and therefore contain the related information about the upstream conditions.
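To make the input structure of the single-station scenario concrete, the following sketch arranges two aligned series into input-output patterns of the form [S(t-1), ..., S(t-n), Q(t), Q(t-1), ..., Q(t-m)] with target S(t). The function name, argument names, and sample values are hypothetical and serve only as an illustration of the lagging step.

```python
def build_lagged_patterns(ssl, flow, n_lags_ssl, n_lags_flow):
    """Build (inputs, targets) for predicting S(t) from lagged SSL and flow.

    Each input row is [S(t-1), ..., S(t-n), Q(t), Q(t-1), ..., Q(t-m)].
    """
    if len(ssl) != len(flow):
        raise ValueError("ssl and flow must have the same length")
    start = max(n_lags_ssl, n_lags_flow)  # first t with all lags available
    inputs, targets = [], []
    for t in range(start, len(ssl)):
        row = [ssl[t - k] for k in range(1, n_lags_ssl + 1)]
        row += [flow[t - k] for k in range(0, n_lags_flow + 1)]
        inputs.append(row)
        targets.append(ssl[t])
    return inputs, targets


# Hypothetical daily series, for illustration only.
ssl = [10.0, 12.0, 15.0, 11.0, 9.0]
flow = [100.0, 110.0, 130.0, 105.0, 95.0]
X, y = build_lagged_patterns(ssl, flow, n_lags_ssl=2, n_lags_flow=1)
```

A multi-station pattern could be assembled in the same way by concatenating lagged columns from stations A and B with the current discharge of station C. Trying several values of the lags and comparing the verification scores corresponds to the trial and error procedure mentioned above.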

Feed forward neural network (FFNN)
Originally inspired by biological neural networks, ANN is a family of statistical learning algorithms that can be used to estimate or approximate nonlinear functions with an arbitrary number of inputs. Feed-forward neural networks (FFNNs) with the back-propagation (BP) learning algorithm are widely used for dealing with various engineering issues. The FFNN consists of interconnected processing elements known as nodes, with unique information-processing characteristics such as learning, nonlinearity, noise tolerance, and generalization capability. The FFNN structure contains three layers, namely the input, hidden, and output layers (see Figure 3). In the FFNN, the inputs presented to the input layer's neurons are propagated in a forward direction, and a nonlinear function known as the activation function is used to compute the output vector. It has already been shown that an FFNN trained by the BP algorithm with only three layers (input, hidden, and output) can be satisfactorily employed in different areas of water resources engineering (ASCE 2000; Nourani 2017).
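The forward propagation described above can be sketched for a network with one hidden layer and a single output. The example assumes tangent sigmoid (tanh) activations in both the hidden and output layers, as reported later for the trained networks; the function name and the weight layout are hypothetical, and the BP training step is not shown.

```python
import math


def ffnn_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer network with tanh activations.

    w_hidden[j] holds the weights from all inputs to hidden neuron j,
    and w_out[j] is the weight from hidden neuron j to the output.
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + b)
        for weights, b in zip(w_hidden, b_hidden)
    ]
    return math.tanh(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


# Hypothetical weights for a network with 2 inputs and 2 hidden neurons.
output = ffnn_forward(
    x=[0.5, 0.25],
    w_hidden=[[0.1, -0.2], [0.3, 0.4]],
    b_hidden=[0.0, 0.1],
    w_out=[0.5, -0.5],
    b_out=0.05,
)
```

Because tanh maps to the interval (-1, 1), inputs and targets are usually scaled before training, which is consistent with the normalization step applied to the data in this study.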

Adaptive neural fuzzy inference system (ANFIS)
The ANFIS model combines the merits of both neural networks and fuzzy control systems by unifying ideas from the two. The fuzzy structure has a strong inference system but no learning ability, while the neural network has a powerful learning ability. ANFIS offers these two desired features in a single model. A fuzzy database, fuzzifier, and defuzzifier are the main components of every fuzzy system (Nourani & Komasi 2013). The fuzzy database comprises an inference engine and a fuzzy rule base. As illustrated by Jang et al. (1997), the fuzzy rule base involves rules that are based on fuzzy propositions, and the fuzzy inference applies operation analysis. The ANFIS model architecture consists of five layers, with layer 1 representing the input layer; layer 2 representing the input membership functions; layer 3 representing the rules; layer 4 representing the output MFs; and layer 5 representing the output, configured as illustrated in Figure 4. To show the typical mechanism by which ANFIS creates the target function f, for instance with two input vectors x and y, the first-order Sugeno inference engine may be applied to two fuzzy if-then rules as (Aqil et al. 2007):

$$\text{Rule 1: if } x \text{ is } A_1 \text{ and } y \text{ is } B_1 \text{ then } f_1 = p_1 x + q_1 y + r_1$$

$$\text{Rule 2: if } x \text{ is } A_2 \text{ and } y \text{ is } B_2 \text{ then } f_2 = p_2 x + q_2 y + r_2$$

in which $A_1$, $A_2$ and $B_1$, $B_2$ are respectively the MFs of the inputs (i.e., x and y), and $p_1$, $q_1$, $r_1$ and $p_2$, $q_2$, $r_2$ are the target function parameters. The operation of an ANFIS can be briefly described as follows:
Layer 1: Each node in this layer produces the membership grades of an input variable. The output of the ith node in layer k is denoted as $Q^k_i$. Assuming a generalized bell function (gbellmf) as the membership function (MF), the output $Q^1_i$ can be computed as (Jang et al. 1997):

$$Q^1_i = \mu_{A_i}(x) = \frac{1}{1 + \left|\frac{x - c_i}{a_i}\right|^{2 b_i}}$$

where $\{a_i, b_i, c_i\}$ is the premise parameter set.
Layer 2: Each node in this layer multiplies the incoming signals as:

$$Q^2_i = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y), \quad i = 1, 2$$

Layer 3: Node i in this layer computes the normalized firing strength:

$$Q^3_i = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2$$

Layer 4: In this layer, the contribution of the ith rule towards the target is determined as (Jang et al. 1997):

$$Q^4_i = \bar{w}_i f_i = \bar{w}_i \left(p_i x + q_i y + r_i\right)$$

where $\bar{w}_i$ is the output of layer 3 and $\{p_i, q_i, r_i\}$ is the parameter set.
Layer 5: Finally, the output of the ANFIS model is computed by the sole node of this layer as (Jang et al. 1997):

$$Q^5 = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$$

To calibrate the premise parameter set $\{a_i, b_i, c_i\}$ and the consequent parameter set $\{p_i, q_i, r_i\}$ of the ANFIS, a combination of the least squares and gradient descent methods is used as a hybrid calibration algorithm (Aqil et al. 2007). Back propagation (Kurian et al. 2006) and hybrid learning algorithms (Ebtehaj & Bonakdari 2014) are two common algorithms widely used in various applications to train ANFIS, and both give reasonably accurate predictions. Among the fuzzy inference engines, the Sugeno FIS was employed in this study (Jang 1993).

Support vector regression (SVR)
For analyzing data with a categorical output variable, SVM is a suitable alternative, as it is a machine learning algorithm that develops hyperplanes for identifying different classes. Unlike other black-box models, SVM minimizes the operational risk instead of minimizing the error between observed and estimated values. The SVR model, rather than performing classification, is employed for regression analysis with a continuous numeric output variable. It is employed to obtain an approximate function from given complex sample data. The main idea is to first map the non-linearly separable data into a higher dimensional, linearly separable feature space and then to use this feature space for computation and linear programming. Given a set of data $\{(x_i, d_i)\}_{i=1}^{N}$ ($x_i$ is the input vector, $d_i$ is the actual value and N is the total number of data patterns), the general SVR function is (Wang et al. 2013):

$$y = f(x) = w\,\varphi(x_i) + b$$

where $\varphi(x_i)$ indicates the feature space, non-linearly mapped from the input vector x. The regression parameters b and w may be determined by assigning positive values to the slack parameters ξ and ξ* and minimizing the objective function:

$$\text{Minimize: } \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \left(\xi_i + \xi_i^*\right)$$

$$\text{Subject to: } \begin{cases} w\,\varphi(x_i) + b - d_i \le \varepsilon + \xi_i^* \\ d_i - w\,\varphi(x_i) - b \le \varepsilon + \xi_i \\ \xi_i, \xi_i^* \ge 0 \end{cases} \quad i = 1, 2, \ldots, N$$

where $\frac{1}{2}\|w\|^2$ is the weights vector norm and C is the regularization constant specifying the trade-off between the empirical error and the regularization term. ε is called the tube size and is equivalent to the approximation accuracy placed within the training data points. The mentioned optimization problem can be changed to the dual quadratic optimization problem by defining Lagrange multipliers $\alpha_i$ and $\alpha_i^*$. Vector w in Equation (6) can be computed after solving the quadratic optimization problem as (Wang et al. 2013):

$$w = \sum_{i=1}^{N} \left(\alpha_i - \alpha_i^*\right) \varphi(x_i)$$

The final form of SVR can be presented as [37]:

$$f(x, \alpha_i, \alpha_i^*) = \sum_{i=1}^{N} \left(\alpha_i - \alpha_i^*\right) k(x, x_i) + b$$

where $\alpha_i$ and $\alpha_i^*$ are Lagrange multipliers, N is the number of data patterns, $k(x, x_i)$ is the kernel function performing the non-linear mapping into the feature space, and b is the bias term. One commonly used kernel function is the Gaussian Radial Basis Function (RBF):

$$k(x_1, x_2) = \exp\left(-\gamma \|x_1 - x_2\|^2\right)$$

where γ is the kernel parameter. Figure 5 shows the architecture of SVM algorithms. For more details about SVR, the readers are referred to Wang et al. (2013) and Raghavendra & Deka (2014).
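The final SVR form and the RBF kernel can be illustrated with a short sketch. It assumes the standard Gaussian kernel exp(-γ‖x1 - x2‖²) and a prediction of the form Σ(α_i - α_i*) k(x, x_i) + b; the function names are hypothetical, and the dual coefficients are taken as given, since solving the quadratic optimization problem is outside the scope of this example.

```python
import math


def rbf_kernel(x1, x2, gamma):
    """Gaussian RBF kernel: exp(-gamma * ||x1 - x2||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-gamma * sq_dist)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i*) k(x, x_i) + b.

    dual_coefs[i] holds the difference (alpha_i - alpha_i*).
    """
    return bias + sum(
        coef * rbf_kernel(x, sv, gamma)
        for coef, sv in zip(dual_coefs, support_vectors)
    )


def epsilon_insensitive_loss(observed, predicted, epsilon):
    """Errors inside the epsilon tube cost nothing; larger ones grow linearly."""
    return max(0.0, abs(observed - predicted) - epsilon)
```

In practice the three quantities C, ε, and γ are the parameters that are tuned for each input combination; larger γ makes the kernel more local, while larger ε widens the tube within which errors are ignored.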

Ensemble unit
In modeling peak values, or in cases where different models perform better under different conditions and intervals, the accuracy of a predicted time series is expected to improve by combining (ensembling) the outputs of several prediction models. In this regard, using combined predictions is less risky than depending on an individual method. Several studies in various fields of engineering have confirmed that combining the results of different models can improve the overall prediction accuracy (e.g. see Zhang & Berardi 2001; Kasiviswanathan et al. 2013; Nourani et al. 2020a).
In this study, two types of technique were employed to ensemble the outcomes of the utilized models and enhance the modeling results: (i) linear ensemble techniques, which include SA (Equation (9)) and WA (Equation (10)), and (ii) a nonlinear ensemble technique (i.e. NA).
SA is done as:

$$\bar{S}(t) = \frac{1}{N} \sum_{i=1}^{N} S_i(t)$$

where $\bar{S}(t)$ is the outcome of SA, N is the number of single models (in this study, N = 3) and $S_i(t)$ stands for the outcome of the ith model (i.e. ANN, ANFIS and SVR).
The WA is formulated as:

$$\bar{S}(t) = \sum_{i=1}^{N} w_i S_i(t)$$

where $w_i$ is the weight imposed on the output of the ith method, which may be computed on the basis of the performance measure of the ith method as:

$$w_i = \frac{DC_i}{\sum_{i=1}^{N} DC_i}$$

where $DC_i$ measures the performance of the ith model (such as the determination coefficient, DC).
Unlike the linear ensemble techniques, in the nonlinear ensemble technique an FFNN is trained which takes the outputs of the single AI models as inputs and provides the ensemble output (Sharghi et al. 2019).

In the NA, the outputs of the individual models (i.e. FFNN, ANFIS, and SVR) are used as the inputs of a new FFNN model, which is trained to produce the ensemble output (see Figure 6). It is worth mentioning that other AI models such as ANFIS and SVR can be similarly employed for the NA.
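The two linear ensemble techniques can be sketched as follows. The example assumes that the WA weights are each model's DC divided by the sum of the DCs of all models, and that all DC values are positive; the function names and the numbers are hypothetical. The NA is not shown, since it requires training a separate network on the single-model outputs.

```python
def simple_average(predictions):
    """SA: mean of the single-model outputs at each time step.

    predictions[i] is the output series of the ith model.
    """
    n_models = len(predictions)
    return [sum(step) / n_models for step in zip(*predictions)]


def weighted_average(predictions, dc_scores):
    """WA: weight each model by w_i = DC_i / sum(DC) (assumes positive DCs)."""
    total = sum(dc_scores)
    weights = [dc / total for dc in dc_scores]
    return [
        sum(w * p for w, p in zip(weights, step))
        for step in zip(*predictions)
    ]


# Hypothetical outputs of three models over two time steps.
model_outputs = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
calibration_dc = [0.8, 0.6, 0.6]
sa = simple_average(model_outputs)
wa = weighted_average(model_outputs, calibration_dc)
```

When all models have the same DC, the WA reduces to the SA; the two differ only to the extent that the single models differ in performance.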

Efficiency criteria
The Determination Coefficient (DC), also known as the Nash-Sutcliffe efficiency (Nash & Sutcliffe 1970; Nourani 2009b), and the Root Mean Square Error (RMSE) were utilized in this study for evaluating the performance of the models as (Nourani 2017):

$$DC = 1 - \frac{\sum_{i=1}^{n} \left(S_{obs_i} - S_{com_i}\right)^2}{\sum_{i=1}^{n} \left(S_{obs_i} - \bar{S}_{obs}\right)^2}$$

$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} \left(S_{obs_i} - S_{com_i}\right)^2}{n}}$$

where n, $S_{obs_i}$, $\bar{S}_{obs}$ and $S_{com_i}$ are the number of data, the observed data, the averaged value of the observed data and the calculated values, respectively. DC ranges between −∞ and 1, with a perfect score of 1. In addition, the Bayesian information criterion (BIC) (Rissanen 1978) is calculated for evaluating the models as:

$$BIC = m \ln(RMSE) + npar \ln(m)$$

where m is the number of input-output patterns and npar is the number of parameters to be identified. While the RMSE is expected to progressively improve as more parameters are added to the model, the BIC penalizes the model for having more parameters and therefore tends to result in more parsimonious models. The smaller the BIC value, the better the model performs.
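The efficiency criteria can be computed with a few lines of code. The sketch below uses the usual Nash-Sutcliffe and RMSE definitions and assumes the BIC form m ln(RMSE) + npar ln(m), which is one common way of writing this criterion; the function names are hypothetical.

```python
import math


def determination_coefficient(observed, computed):
    """DC (Nash-Sutcliffe efficiency): 1 - SSE / total variance of observations."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - c) ** 2 for o, c in zip(observed, computed))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def rmse(observed, computed):
    """Root mean square error between observed and computed values."""
    sse = sum((o - c) ** 2 for o, c in zip(observed, computed))
    return math.sqrt(sse / len(observed))


def bic(observed, computed, n_parameters):
    """Assumed form: m * ln(RMSE) + npar * ln(m), with m patterns.

    Undefined for a perfect fit, since ln(0) does not exist.
    """
    m = len(observed)
    return m * math.log(rmse(observed, computed)) + n_parameters * math.log(m)
```

A model that always predicts the observed mean has a DC of exactly 0, which gives a convenient reference point: any model with a negative DC performs worse than the mean of the observations.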

RESULTS AND DISCUSSION
Results of the first scenario (single station modeling)
In scenario 1, all individual stations were modeled using their own historical data with the FFNN, ANFIS, SVR, and ensemble techniques. For this purpose, different lagged time series of streamflow and SSL were used to construct various input combinations to be fed into the input layers of the models, since streamflow and SSL demonstrate Markovian behavior, meaning that the value of a parameter at the current time step can be related to its values at previous time steps. It is worth mentioning that, since the soil type of each station is not a dynamic characteristic of the system in the single station scenario, the black-box models implicitly learn and include these features (i.e. soil type and land use). However, one of the major goals of this study is to observe the effect of the different soil types and land uses of the stations on the results of the multi-station scenario, as the soil types and land uses differ from one station to another. To check the relations between each station's SSL and the discharge and SSL of previous time steps, their MI values were computed and are tabulated in Table 4. As is clear from Table 4, the SSL time series of each station is strongly related to its previous SSL and discharge time series, but a trial and error procedure should be employed to determine which variables, when used together, lead to the best result. Identifying and selecting the best input combination for the models is one of the crucial tasks of any modeling study. As presented in Table 5, different combinations were developed and utilized to estimate the SSL value at the current time step (S(t)) at both daily and monthly time scales for the first scenario.
It is worth mentioning that, at the monthly scale, S(t−12) (the SSL value of the same month in the previous year) was also considered in the input layer to capture the seasonality of the process. To avert the overfitting problem in training the FFNN, selecting a proper network architecture (i.e. the number of hidden layer neurons and the number of training epochs) is an essential step in the modeling. Thus, the ranges of 10-100 and 1-10 were examined for the epoch number and the number of hidden layer neurons, respectively. Accordingly, the FFNNs were trained by employing the tangent sigmoid as the activation function of the hidden and output layers and using the gradient descent BP training algorithm (Haykin 1994), and the best structure for modeling the SSL of each individual station was specified via a trial and error procedure.
Owing to its ability to handle uncertainties and complex nonlinear behavior through the fuzzy concept, the ANFIS model was used in this study as another AI model. In training the ANFIS, the Sugeno fuzzy inference system was applied, and the Gaussian, trapezoidal, and triangular-shaped membership functions (MFs) were considered, as they have exhibited better results than other MFs for modeling hydroclimatic processes (Nourani et al. 2020a). In addition, a constant MF was used in the output layers of the models. The number of training epochs was assessed along with the number of MFs in order to attain the best ANFIS model. To this end, the ranges of 5-100 and 2-4 were respectively examined to find the optimum numbers of epochs and MFs.
Afterward, SVR models were developed based on the RBF kernel which shows more reliable performance compared to other kernels (e.g. sigmoid and polynomial kernels) by considering smoothness assumption (Noori et al. 2011). By adjusting the RBF kernel parameters, optimal SVR was obtained for every input combination for each station. The attained results of the best structure for the AI models in daily and monthly scale are summarized in Table 6.
It can be seen from Table 6 that Comb. 3, in all models at the daily time scale, led to slightly better results than the other two combinations, which could be related to the incorporation of the streamflow value at the current time step (Q(t)) in the input layer. The information provided by Table 6 indicates the outperformance of Comb. 6 in all models at the monthly time scale, as it includes the current month streamflow value (Q(t)) and the SSL of the same month in the previous year (S(t−12)), which enable the model to capture the seasonality characteristics of the sedimentation process as well as its autoregressive property. Moreover, the success of Comb. 6 demonstrates the strong correlation between Q(t), S(t−12), and S(t).
In single-station modeling, as Table 6 shows, the overall outputs indicate the superiority and robustness of daily-scale modeling compared to monthly modeling, owing to the larger amount of data involved in daily-scale training. Various studies have applied AI models to SL modeling and reported different performances. Ghani et al. (2011) predicted the SL of three Malaysian rivers employing FFNN, and their findings indicated the superiority of FFNN over traditional methods. Azamathulla et al. (2010) used SVM for modeling the SL and proved its [...] Kumar et al. (2019) to predict the current-day runoff and SSL in the Godavari basin, and the outcomes demonstrated that ANFIS outperformed ANN. Comparing the studies reviewed in the literature, it can be understood that the accuracy of AI models may vary between case studies and time scales. This is due to the stochasticity of the SSL data of each considered catchment as well as the capacity of the constructed AI-based models to handle the non-stationarity and nonlinearity in the data set (Salih et al. 2019).
As in the previous studies reported in the literature, the AI models demonstrated different performances for different stations in this study. The obtained results reveal that FFNN showed slightly better results for most of the stations. ANFIS, despite its ability to handle uncertainties, demonstrated robust performance for some stations but poor results for others, which could be related to the large number of ANFIS rules and parameters involved in the modeling and to the nature of each station's data. For instance, it could not properly model station B at the daily scale and station C at the monthly scale, yet for the other stations its results were as reliable as those of the other models. Moreover, SVR showed the weakest performance among the techniques, except for station C at the monthly scale, which could be related to the RBF kernel used, which is not as sensitive as the sigmoid activation function used in the FFNN. From this explanation, it can be understood that it is not easy to choose one model that provides the best prediction results for all stations and at all temporal scales. Consequently, combining the outcomes of all three models seems practical: it could resolve the problem of selecting the best model while also improving the prediction results.
For the next step of modeling via scenario 1, the three previously described ensemble techniques were employed to combine the outputs of the single AI models for the best input combination of each time scale. The ensemble technique has been employed successfully in numerous fields of hydrology as a model combination technique (Elkiran et al. 2018; Sharghi et al. 2018; Choubin et al. 2019; Nourani et al. 2020b). Only the verification dataset was used to obtain the variables of both the WA and NA techniques and, like the single FFNN model, the tangent sigmoid was considered as the activation function for the hidden and output layers in the NA technique. The RMSE has no dimension since all data are normalized.
Then the neural ensemble network was trained using the scaled conjugate gradient scheme of the BP algorithm, and the best structure and epoch number of the network were determined through a trial-and-error procedure. The obtained results of the daily and monthly ensemble models are tabulated in Table 7. For better comparison, Figures 7-9 present the observed versus computed daily SSL time series (computed by the NA ensemble and the single models) of the verification phase, along with the verification-step scatter plots of the NA ensemble technique, for all stations at the daily scale.
As is evident from Table 7 and Figures 7-9, the performance of the NA ensemble technique for stations A and C is slightly better than for station B at the daily scale, whereas at the monthly scale all stations showed almost similar accuracy. According to the results presented in Table 7, the NA ensemble outperformed the other ensemble techniques for most of the stations. The overall performance of the different ensemble techniques is almost the same, but NA performed slightly better than the SA and WA ensemble techniques.
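The two linear ensemble techniques compared above can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the weighting rule shown (weights proportional to each model's efficiency score, normalized to sum to one) is an assumed scheme for illustration, since the weighting formula is not restated in this section.

```python
def simple_average(predictions):
    """SA ensemble: arithmetic mean of the single-model outputs per time step.

    predictions: list of equally long series, one per model.
    """
    n_models = len(predictions)
    return [sum(values) / n_models for values in zip(*predictions)]

def weighted_average(predictions, scores):
    """WA ensemble: weights proportional to a per-model score (assumed rule)."""
    total = sum(scores)
    weights = [score / total for score in scores]
    return [
        sum(w * v for w, v in zip(weights, values))
        for values in zip(*predictions)
    ]

# Placeholder outputs of two hypothetical models over two time steps.
model_outputs = [[1.0, 2.0], [3.0, 4.0]]
sa = simple_average(model_outputs)
wa = weighted_average(model_outputs, scores=[1.0, 3.0])
```

Both results always lie between the smallest and largest single-model output at each time step, whereas a nonlinear (neural) combiner is not restricted in this way.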

Moreover, as shown by Figures 7-9, almost all models presented reliable outcomes, but a model that captured the maximum values well at one station might instead capture the minimum values well at another. In short, depending on the specific hydrologic features and spatial zone, the performance of a model may fluctuate significantly or slightly. A more widely spread dataset has a higher coefficient of variation (CV), which makes the process harder for a model to learn and predict. So when the CV increases, the performance of AI models would decrease. Comparing the performances of the different stations with their CV values, it can be seen that in daily modeling all stations exhibited almost similar performance, as they have the same CV values according to Table 2. Yet in monthly modeling, station B demonstrated slightly poorer performance since its CV is greater than the CV values of stations A and C.
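For reference, the CV discussed above is the ratio of the standard deviation to the mean. The sketch below computes it with the Python standard library; the function name is hypothetical, and the use of the population standard deviation (rather than the sample one) is an assumption, since the exact convention behind Table 2 is not stated in this section.

```python
import statistics

def coefficient_of_variation(series):
    """CV = standard deviation / mean (population standard deviation assumed)."""
    return statistics.pstdev(series) / statistics.mean(series)

# Placeholder values only; not observed SSL data.
cv_example = coefficient_of_variation([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

For the placeholder series the mean is 5 and the population standard deviation is 2, so the CV is 0.4; a series with the same mean but a wider spread would give a larger CV.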
To better compare the models' performance and to check the results for overfitting, the BIC values of the single models for the best combinations, as well as those of the NA model, are tabulated in Table 8.
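As a reminder of how such a criterion penalizes model complexity, the sketch below uses one common least-squares form of the BIC, n·ln(MSE) + k·ln(n), where n is the number of observations and k the number of model parameters. This form, the function name, and the sample values are assumptions for illustration; the exact expression used to produce Table 8 may differ.

```python
import math

def bic(observed, predicted, n_params):
    """One common least-squares form: n * ln(MSE) + k * ln(n).

    Requires a non-zero mean squared error (a perfect fit gives ln(0)).
    """
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return n * math.log(mse) + n_params * math.log(n)

# Placeholder series with unit error at every step, so ln(MSE) = 0.
obs = [0.0, 0.0, 0.0, 0.0]
pred = [1.0, 1.0, 1.0, 1.0]
bic_small_model = bic(obs, pred, n_params=2)
bic_large_model = bic(obs, pred, n_params=5)
```

With identical errors, the model with more parameters receives the higher (worse) BIC, which is why the criterion is useful for flagging overfitted structures.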
According to Table 8, the obtained BIC values are close to each other in both the calibration and verification phases. However, in most cases the NA model has a lower BIC value and leads to better results compared to the single models.
The main goal of modeling through the second scenario was to estimate the SSL of station C through a multi-station approach, meaning that the discharge and SSL of the other two upstream stations were used in the SSL modeling of station C (Sc(t)). The spatial locations of these three stations indicate a high correlation among them. To this end, different input combinations based on different physical interpretations were developed and thoroughly investigated. For instance, one combination considered only the data of station A as the determining station, another considered only station B, and another considered both stations. Table 9 presents the considered input combinations of the multi-station scenario for the daily and monthly scales.

The optimum lag times of the combinations were obtained via a trial-and-error process and, as Table 9 indicates, the discharge and SSL of station A were considered with greater lags than those of station B, which could be related to the remoteness of station A. In other words, it takes more time for the discharge and sediment load of station A to reach station C. Each combination was modeled by all methods, and then the dominant input combination was obtained by ranking the modeling performance. The results of FFNN, ANFIS, and SVR modeling for the best input combinations at both the daily and monthly temporal scales are given in Table 10.
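The idea of giving the more remote station a longer lag can be sketched as follows. The function name, the series names `Q_A` and `Q_B`, and the lag values are hypothetical and chosen only for illustration; the lags actually selected are those listed in Table 9.

```python
def build_multistation_inputs(series, lags, target):
    """Pair lagged upstream series with the downstream target Sc(t).

    series : dict mapping a series name to its list of values
    lags   : dict mapping a series name to the lags applied to it
    target : downstream SSL series (same length as the input series)
    """
    max_lag = max(k for name in lags for k in lags[name])
    rows = []
    # Start at max_lag so that the longest lag always has a valid index.
    for t in range(max_lag, len(target)):
        row = [series[name][t - k] for name in lags for k in lags[name]]
        rows.append((row, target[t]))
    return rows

# Placeholder data: the remote station A gets a longer (assumed) lag than B.
upstream = {"Q_A": [0.0, 1.0, 2.0, 3.0, 4.0], "Q_B": [10.0, 11.0, 12.0, 13.0, 14.0]}
assumed_lags = {"Q_A": [2], "Q_B": [1]}
sc = [100.0, 101.0, 102.0, 103.0, 104.0]
rows = build_multistation_inputs(upstream, assumed_lags, sc)
```

In a trial-and-error search, this construction would be repeated for each candidate set of lags and the resulting models ranked by their verification performance.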
The main reason for examining various combinations in the modeling was to provide a tool for the physical interpretation of the outcomes, to evaluate the degree of interaction between the stations, and to see which river transports the major SSL to station C. Throughout the modeling with different input combinations, it was found that using the data of station A alone or of station B alone (input combinations 2 vs 4 and 10 vs 12) yielded almost similar performance. This indicates that stations A and B are similarly connected to station C, contrary to the initial expectation that station A would be less correlated with station C (compared to station B) because of the long distance between them. This anomalous outcome could be related to the poor condition of the land cover, which adds to the sediment load through soil erosion. Furthermore, considering the verification-phase DC values obtained with the different input combinations, Comb. 6 and Comb. 8 in daily modeling and Comb. 14 and Comb. 16 at the monthly scale showed better performance than the other input combinations. Although Comb. 6 and Comb. 8 outperformed the other two input combinations, the 8th and 14th input combinations were chosen as the best data fusion for daily and monthly modeling, respectively, since they have fewer parameters and no SSL data were involved in their formulation. It is worth noting that, unlike single-station modeling, in which daily modeling exhibited better results, in the multi-station scenario the monthly outcomes were more reliable than the daily ones, since at the monthly scale the stations interact with each other over a longer lag time. The prediction results for Comb. 8 and Comb. 16 were fed into the ensemble models, following the ensemble methodology of scenario 1.
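The DC and RMSE values used to rank the input combinations can be computed as in the sketch below. It assumes the usual definitions (DC as the Nash-Sutcliffe-type determination coefficient and RMSE as the root mean squared error); the function names and sample values are hypothetical.

```python
import math

def dc(observed, predicted):
    """Determination coefficient: 1 - SSE / sum of squared deviations from the mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

def rmse(observed, predicted):
    """Root mean squared error (dimensionless when the data are normalized)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

# Placeholder series: a perfect model and a model that always predicts the mean.
obs_demo = [1.0, 2.0, 3.0, 4.0]
dc_perfect = dc(obs_demo, obs_demo)
dc_mean_only = dc(obs_demo, [2.5, 2.5, 2.5, 2.5])
```

A DC of 1 corresponds to a perfect fit, while a DC of 0 means the model is no better than predicting the observed mean, which gives a convenient scale for comparing combinations across stations.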
Table 11 shows the obtained results of the ensemble approach for the chosen input combinations and Table 12 shows the BIC values of single models and NA ensemble for better comparison of the models' performances. Additionally, Figures 10 and 11 respectively demonstrate daily and monthly observed versus computed values of SSL for multi-station modeling of station C in the verification step, and scatter plots of all proposed models for station C are presented in Figure 12.
According to Table 11, the ensemble techniques enhanced the prediction results in both the calibration and verification phases, and the results indicated the superiority of the NA ensemble over the other two ensemble techniques. The NA ensemble has been successfully employed as a multi-model combination technique in hydrological modeling (e.g. Elkiran et al. 2018; Sharghi et al. 2018; Nourani et al. 2021). According to Figures 10-12, almost all models presented acceptable estimations, considering that in this scenario none of the models employed the observed SSL values of station C as input data and that the combinations selected for the ensemble were based only on streamflow data. Accordingly, this multi-station scenario could be employed when the SSL data of station C are not available due to financial or technical issues. The multi-station modeling highlighted the high correlation among the stations and indicated that the physical aspects and geomorphological features of the study area (e.g. land use and soil type) should be taken into consideration in SSL and erosion modeling. For instance, as Table 1 states, the upstream basin of station A has less forest and agricultural land than that of station B, which subjects the basin of station A to severe soil erosion and causes it to carry more sediment to station C. This was confirmed by the multi-station modeling scenario of this study, as stations A and B demonstrated almost the same correlation with the SSL value of station C, despite the great distance of station A.
To provide a visual comparison of the models' performances in the verification step, radar diagrams of station C at the daily scale for both modeling scenarios are presented in Figure 13.

CONCLUSIONS
In this research, due to the practical restrictions of physical and conceptual models, the soft computing methods of FFNN, ANFIS, and SVR, as well as ensemble techniques, were employed for predicting the SSL. To this end, discharge and SSL data of three gauging stations in two different hydrological regions of the United States were used via two distinct modeling scenarios. The first scenario, single-station modeling, was developed to predict the SSL in the current time step (S(t)) for all stations using each station's own data. In the second scenario, multi-station modeling, observed data of the two upstream stations (i.e. A and B) were utilized to estimate the SSL of the downstream station C (Sc(t)) without using the sediment record of station C but employing its streamflow data.
Several input combinations were examined for both scenarios, and the performance of each input combination was evaluated through modeling. Then the outcomes of the best input combination were fed into the ensemble units, which consist of the SA, WA, and NA ensemble techniques, to increase the modeling efficiency and to address the difficulty of choosing one single model that fits all stations best in every circumstance. Since every single model has its own merits and demerits, opting for the best method is not an easy task, and this was confirmed by the outcomes of this study. The results indicated that each AI method may exhibit different accuracies depending on the spatial and temporal variations of the stations. In the individual daily modeling of station B, for example, ANFIS showed poor performance in the verification step compared to the FFNN and SVR models, but for stations A and C it performed as well as the other methods. All in all, the obtained results of this study indicated that FFNN outperformed the other AI models.
For both modeling scenarios, employing ensemble techniques improved the prediction efficiency, and the NA ensemble showed more reliable performance than the SA and WA techniques, since linear averaging always gives a result that lies between the minimum and maximum values in the set. Ensemble models affected the monthly-scale estimation more than the daily modeling. Ensemble techniques could increase the prediction accuracy of single-station modeling by up to 7% in the training phase and up to 20% in the verification phase of the daily modeling, and by up to 10 and 20% in the training and verification phases of the monthly modeling, respectively. In multi-station modeling, they improved the prediction results by up to 5% in the training and verification steps of the daily time scale and by up to 8% in both the training and verification steps of the monthly modeling. Furthermore, in this study, daily modeling exhibited more robust outcomes in the single-station scenario, while in the multi-station scenario the monthly scale outperformed the daily modeling, which highlights the importance of modeling SSL at various time scales. The findings of this study could pave the way for reliable prediction of SSL by employing AI-based ensemble models and facilitate the evaluation of rivers' fluvial and morphological processes. Moreover, the findings provide technical support for river hydraulics and environment as well as for river management.
Every model has its own limitations and disadvantages. In this case, the main disadvantage of the AI models applied in this study is that they are black-box models, in which the computing system is opaque and consequently does not allow the user to sequentially eliminate possible explanatory variables that do not contribute to model fitting (Kisi et al. 2012). Moreover, the AI models are case-sensitive, meaning that they demonstrate different performances for different case studies. They are also highly dependent on the quality and quantity of the dataset used. Another limitation of this study is the use of only black-box models in developing the ensemble unit. For future studies, it is therefore suggested to use physically based models along with AI models to develop a comparative study and to exploit the advantages of both, as well as to model other tributaries of the study area in order to strengthen the credibility of the proposed methodology. It is also suggested to employ other AI methods such as genetic programming (GP) and to ensemble AI models with conceptual methods.

CONFLICTS OF INTEREST
There is no conflict of interest.

AVAILABILITY OF DATA AND MATERIAL
The data will be available upon request.

CODE AVAILABILITY
The code will be available upon request.

DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.