Suspended sediment load prediction using sparrow search algorithm-based support vector machine model

Prediction of suspended sediment load (SSL) in streams is important in hydrological modeling and water resources engineering. A consistent and accurate sediment prediction model is needed, yet it is difficult to develop in practice because sediment transport is highly non-linear and is governed by several variables such as rainfall, strength of flow, and sediment supply. Artificial intelligence (AI) approaches have become prevalent in water resources engineering for solving multifaceted problems like sediment load modelling. The present work proposes a robust model that combines a support vector machine with the novel sparrow search algorithm (SVM-SSA) to compute SSL at the Tilga, Jenapur, Jaraikela and Gomlai stations in the Brahmani River basin, Odisha State, India. Five different scenarios are considered for model development. Model performance is assessed on the basis of mean absolute error (MAE), root mean squared error (RMSE), coefficient of determination (R²), and Nash–Sutcliffe efficiency (E_NS). The outcomes of the SVM-SSA model are compared with those of three hybrid models, namely SVM-BOA (butterfly optimization algorithm), SVM-GOA (grasshopper optimization algorithm), and SVM-BA (bat algorithm), and with a benchmark SVM model. The findings reveal that the SVM-SSA model estimates SSL more accurately than the alternatives for scenario V, with sediment (3-month lag) and discharge (current time-step and 3-month lag) as inputs, giving RMSE = 15.5287, MAE = 15.3926, and E_NS = 0.96481. The conventional SVM model performed the worst in SSL prediction. These findings support the suitability of the employed approach for modelling SSL in rivers precisely and reliably. The prediction model delivers precise forecasts while significantly decreasing computing time, and its precision satisfies the demands of realistic engineering applications.


Problem statement and literature
Given the importance of sediment load movement in sculpting the Earth's surface, the challenges associated with sediment transport prediction have attracted a great deal of interest 33. It is essential to use a well-founded tool to assess the suspended sediment load. Even with the advancement of modern numerical models, the movement of river sediment load is still difficult to understand. For example, a direct technique that necessitates the installation of a hydrometric station for sample collection and monitoring can be expensive and time-consuming, particularly in remote locations 34. Although indirect approaches are less costly, the susceptibility of sediment particles to various environmental variables makes it difficult to reconcile theoretical results with observations 35. Furthermore, the majority of experimentally validated equations cover only a limited range of environmental conditions, which limits their applicability 36. Alternatively, the emergence of artificial intelligence algorithms, including ANN and SVM, has revolutionized many time series forecasting tasks, including silt transport prediction, by avoiding the computation of the complex sediment transport rate. One of the key benefits of this strategy is that it does not require knowledge of the physical mechanism underlying the complex sediment transport process 37.
Despite their broad application, ML algorithms still have various flaws [38][39][40][41]. For predicting hydrological variables, it is necessary to train neural network models. The training stage finds the best values for the connection weights, bias values, number of hidden layers, and number of neurons. Conventional training algorithms, such as backpropagation, tend to get stuck in local minima. In addition, slow convergence can hamper the effectiveness of such techniques 42,43. Even though AI techniques are widely used to predict different hydrological variables, they require fine-tuning by training algorithms, and the complete AI model must be tuned before the final network training is finished. Recently, optimization algorithms such as the genetic algorithm (GA), bat algorithm (BA), grey wolf optimization (GWO), shark algorithm (SA), particle swarm optimization (PSO), and firefly algorithm (FA) have been utilized for training soft computing models and determining their best parameter values [44][45][46][47][48][49][50][51].
Rajaee et al. 52 applied sediment rating curve (SRC), multilinear regression (MLR), ANN, and wavelet-ANN (W-ANN) models for daily SSL modelling at the Iowa River gauge station (US). They concluded that the W-ANN model showed better agreement with the collected SSL values and outperformed the other considered models. Kisi et al. 53 investigated the accuracy of ANFIS-GA, ANFIS-PSO, ANFIS-ACO, and ANFIS-BOA models in drought forecasting using monthly precipitation data from the Biarjmand, Ebrahim-Abad, and Abbasabad stations in Iran. Adnan et al. 54 proposed an alternative tool named dynamic evolving neural fuzzy inference system (DENFIS) for estimating SSL based on historical sediment and streamflow values recorded at the Guangyuan and Beibei stations in China. The results were compared with those of two other models, and the DENFIS model was found to generate improved SSL predictions. Hassanpour et al. 55 applied an SVR-FCM hybrid model and compared it with SRC, ANN, ANFIS, and SVR models for predicting daily SSL in the River Sistan, Iran. They found that the SVR-FCM model predicts SSL more accurately in the specified study region. Banadkooki et al. 42 proposed hybrid ANN-BA, ANN-PSO, and ANN-ALO models for SSL prediction in the Goorganrood basin, Iran. Based on a comparison of the results, they observed that the ANN-ALO model generated the most accurate predictions in the study region. In another study, Ehteram et al. 56 applied ANN-WA, ANN-PSO, and ANN-BA to optimize the performance of ANN in predicting SSL in the Goorganrood basin, Iran. They found that ANN-WA performed best, with accurate SSL predictions. Nhu et al. 57 used random subspace (RS), SVM-RBF (radial basis function) kernel, random forest (RF),

Objective of this study
As discussed in the related literature, optimization algorithms enhance the convergence speed of conventional ML models and increase their accuracy 48,49,59. The SSA is a population-based optimization algorithm that was proposed on the basis of the foraging and anti-predatory behaviors of sparrow populations and that builds upon existing swarm intelligence algorithms such as GWO, ALO, and PSO. It presents certain advantages in terms of stability, convergence accuracy, and convergence speed. To the best of the authors' knowledge, no previous effort has been made to apply an SSA-based SVM model for SSL prediction in the Brahmani River basin, which is the objective of the present study. By dividing the sparrow population into three groups (discoverers, entrants, and guards), the thresholds and input weights of the SVM are optimized. Moreover, various scenarios have been adopted to model the input-output architecture in order to achieve the best prediction accuracy for SSL. Lastly, a complete comparative assessment has been conducted to evaluate the performance of the novel SVM-SSA model against other AI algorithms. The outcomes show that the proposed SSA increases convergence speed and efficiently prevents the optimization procedure from falling into a local optimum.

Study area
The River Brahmani flows in the eastern part of India between 20° 30′ 10″ and 23° 36′ 42″ N latitude and 83° 52′ 55″ and 87° 00′ 38″ E longitude (Fig. 1, generated using the ArcGIS software environment). The Mahanadi basin lies to the right of the basin and the Baitarani basin to the left; the total catchment area is 39,313.50 km². The climate of the Brahmani basin is tropical, with moderately cold winters and fairly hot summers. The average annual rainfall in the basin is 1305 mm, and most of the rain occurs under the influence of the southwest monsoon, i.e., between June and October. In summer, the maximum temperature goes as high as 47 °C,

Bat algorithm (BA)
Yang 63 introduced BA, which emulates the echolocation behaviour of bats. In nature, there are several types of bats. They differ in weight and size but behave similarly when they navigate and hunt. Microbats make broad use of echolocation, which helps them seek prey and avoid obstacles in complete darkness 64. In BA, artificial bats have a velocity vector, a frequency vector, and a position vector, which are updated over the iterations. BA explores the search space using the velocity and position vectors (Fig. 3).
Every bat has a frequency F_i, velocity V_i, and position X_i in a d-dimensional search space. The position, frequency, and velocity vectors are updated at each iteration.
In the update equations, Gbest is the best solution obtained thus far, and F_i is the frequency of the ith bat, which is updated during each iteration using β, a random number drawn from a uniform distribution between 0 and 1. A random walk is employed in BA to improve its exploitation capability; it uses ε, a random number between −1 and 1, and A, the loudness (intensity of the produced sound). The pulse emission rate (r_i) and loudness (A_i) are updated at each iteration using the constants α and γ, which lie between 0 and 1. Pseudocode of BA is given below.
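As a reference sketch, the update rules described above are commonly written as follows in the standard bat algorithm formulation. The frequency bounds F_min and F_max, the iteration index t, and the initial pulse rate r_i^0 are assumptions taken from that standard formulation and are not defined in the text; A^t is read here as the loudness at iteration t.

```latex
\begin{align}
F_i &= F_{\min} + \left(F_{\max} - F_{\min}\right)\beta, \\
V_i^{t+1} &= V_i^{t} + \left(X_i^{t} - G_{best}\right) F_i, \\
X_i^{t+1} &= X_i^{t} + V_i^{t+1}, \\
X_{new} &= X_{old} + \varepsilon\, A^{t}, \\
A_i^{t+1} &= \alpha\, A_i^{t}, \qquad
r_i^{t+1} = r_i^{0}\left[1 - \exp(-\gamma t)\right].
\end{align}
```

With these rules the loudness decays geometrically while the pulse rate rises towards r_i^0, so the search gradually shifts from exploration to local refinement around Gbest.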

Grasshopper optimization algorithm (GOA)
Saremi et al. 65 proposed a robust metaheuristic optimisation algorithm called GOA, which mimics the swarming behaviour of grasshoppers, comprising adults (with wings) and nymphs (without wings). Adults are used for globally searching the entire search space (exploration) and finding better food source areas 66. In contrast, nymphs are used for exploiting a specific neighborhood of a given location (exploitation). GOA efficiently balances exploitation and exploration with a comparatively simple algorithm configuration (Fig. 4).
In nature, the behaviour of grasshopper swarms seeking food sources is described by a position update equation, Eq. (7), in which X_i(t + 1) is the position of the ith grasshopper at iteration t + 1 and c is a reduction coefficient that balances the exploitation and exploration phases. The coefficient c is computed from c_min and c_max, the minimum and maximum values of the c(t) parameter, and from t_max and t, the maximum and current number of iterations. In Eq. (7), lb_d and ub_d are the lower and upper bounds of the dth dimension of the search space, d_ij is the distance between the ith and jth grasshoppers, and T_d is the location of the solution with the best fitness. Lastly, s(d) denotes the social forces, computed from f, the attraction intensity, and l_s, the attractive length scale. Further details can be found in 67,68.
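For orientation, the three relations described above are usually written as follows in the standard GOA formulation. This is a sketch based on that formulation rather than a reproduction of the paper's own equations; N (the number of grasshoppers) and the component notation x_i^d are assumptions introduced here.

```latex
\begin{align}
X_i^{d}(t+1) &= c \left( \sum_{\substack{j=1 \\ j \neq i}}^{N}
  c\, \frac{ub_d - lb_d}{2}\,
  s\!\left( \left| x_j^{d}(t) - x_i^{d}(t) \right| \right)
  \frac{x_j(t) - x_i(t)}{d_{ij}} \right) + T_d, \\
c(t) &= c_{\max} - t\, \frac{c_{\max} - c_{\min}}{t_{\max}}, \\
s(d) &= f\, e^{-d / l_s} - e^{-d}.
\end{align}
```

Because c(t) decreases linearly from c_max to c_min, the interaction term shrinks over the iterations and the swarm converges towards the target T_d.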

Butterfly optimization algorithm (BOA)
Arora and Singh 69 proposed a new bionic optimization algorithm, BOA, by simulating the mating and foraging behaviour of butterflies. The underlying operational process of BOA is based on the observation that, during food search, butterflies produce fragrances related to their fitness, and that the fitness of a butterfly changes as it travels from one search area to another. Fragrance propagates during the search, and a butterfly can sense variations in the fragrance of other butterflies 70. In global search, butterflies travel towards the butterfly with the strongest fragrance. In local search, butterflies move randomly in search of food when they cannot sense the fragrance of other butterflies. The distinctive feature of BOA is that every butterfly has its own fragrance, expressed in Eq. (10) in terms of f, the perceived magnitude of the fragrance (that is, how strongly other butterflies perceive it); c, the sensory modality; I, the stimulus intensity; and a, the power exponent, which depends on the modality and accounts for the varying degree of absorption. BOA has two key steps: the global and local search phases. In global search, a butterfly takes a step towards the fittest solution/butterfly g*, as expressed by Eq. (2), where x_i^t is the solution vector x_i of the ith butterfly in iteration t; g* is the current best solution among all solutions in the present iteration; r is a random number between 0 and 1; and f_i is the fragrance of the ith butterfly.
In local search, the update uses x_j^t and x_k^t, the jth and kth butterflies from the solution space. Figure 5 provides a basic flow diagram of the algorithm.
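As a hedged reference, the fragrance relation and the two search moves described above take the following form in the standard BOA formulation of Arora and Singh; the symbols are those defined in the text, with I read as the stimulus intensity.

```latex
\begin{align}
f &= c\, I^{a}, \\
x_i^{t+1} &= x_i^{t} + \left( r^{2}\, g^{*} - x_i^{t} \right) f_i
  && \text{(global search)}, \\
x_i^{t+1} &= x_i^{t} + \left( r^{2}\, x_j^{t} - x_k^{t} \right) f_i
  && \text{(local search)}.
\end{align}
```

In the standard algorithm a switch probability (not discussed in the text above) decides at each step which of the two moves a butterfly performs.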

Sparrow search algorithm (SSA)
Xue and Shen 71 proposed SSA based on the anti-predation and foraging behavior of sparrows. SSA is novel and has advantages such as fast convergence and strong optimisation capability. SSA mainly simulates the foraging process of sparrows. The process follows a producer-joiner model, on which an early warning and detection mechanism is overlaid 72. Producers are the individual sparrows that find food easily, and the other individuals are joiners. At the same time, a certain number of sparrows in the population are chosen for early warning and scouting. If danger is detected, food is abandoned, since safety is the priority.
Mathematical model of the sparrow search algorithm. In SSA, individuals can be categorized as discoverers, participants, or alerters. The discoverer is in charge of organizing the population's search and locating food. The participants follow the discoverer in order to obtain food. When environmental dangers arise, the alerter notifies the sparrow population to flee to a safe location. In what follows, discoverers are also called producers or finders, participants are also called joiners, and alerters are also called scouts or warners.
The following rules are adopted to simplify the behavior of the sparrows so that the foraging process can be represented by a mathematical model:
i. The fitness evaluated by the objective function determines the quality of each sparrow's position in the population, and the fitness of the discoverer is greater than that of the participants.
ii. The discoverer and the participants have an internal competitive relationship. In an attempt to increase their own energy, some participants monitor the discoverer's behavior in order to compete for food.
iii. Less energetic sparrows may relocate in search of more energy.
iv. Sparrows have flexible individual behavioral strategies that allow them to switch between the participant and discoverer roles, so that any sufficiently fit sparrow can become a discoverer; however, the proportions of participants and discoverers in the population do not change.
v. When the alarm value of the sparrow population exceeds the safety threshold, the discoverer leaves its current location and guides the population to a safe spot. Alerters in the population raise the alarm when they perceive an external environmental threat.
vi. To minimize their own risk of predation, alerters take the initiative in escaping when they detect external environmental threats or natural enemies. An alerter at the center of the population randomly switches from a feeding state to an active state, while an alerter at the edge of the population moves closer to the center.
Step 1: Construct and initialize the solution. At this point, the population size, the maximum number of iterations, the producer ratio (PD), and the ratio of sparrows on alert (PV) are known. Equation (13) gives the initial positions of the sparrow population, which are generated at random. In Eq. (13), n is the number of sparrows and d is the dimension of the decision variables. Equation (14) is used to evaluate the fitness of each individual for the subsequent operations. Each row of F_X represents the fitness of one individual, and n in Eq. (14) denotes the number of sparrows.
Step 2: In SSA, producers with better fitness values are given priority in obtaining food. Because they are in charge of locating food and guiding the movement of the entire population, producers are able to search a wider area for food than the joiners. The discoverer's location update formula involves the following quantities: t is the current iteration number, and X_i holds the position information of the ith sparrow; a is a random number in [0, 1]; S (S ∈ [0.5, 1]) and R (R ∈ [0, 1]) are the safety and warning parameters, respectively, where R is a random number and S is a specified constant. When R < S, the search environment is safe, there is no danger to the population, and the discoverer can search over a broad range. When R ≥ S, scouts have detected a threat, so the search strategy is adjusted and the population rapidly moves to a safe region. Q is a random number following a normal distribution, and L is an all-ones matrix of dimension 1 × d.
The joiner's position update formula involves X_b, the best location currently occupied by a producer, and X_r, the current global worst location. A is a matrix of dimension 1 × d in which every element is 1 or −1, and A⁺ = Aᵀ(AAᵀ)⁻¹. The scout's position update formula involves X_B, the current global optimal position; β, the step-length control parameter, a random number drawn from a normal distribution with mean 0 and variance 1; K, a random number in [−1, 1]; f_i, the fitness value of the current sparrow; f_R and f_B, the current worst and global best fitness values, respectively; and ε, a tiny constant. At the end of the iterations, the optimisation result is output. The flowchart of SSA is given in Fig. 6.
Algorithm 4. Algorithm SSA
    for i = 1 : PD
        Update location of sparrow using Eq. (13);
    end for
    for i = (PD + 1) : n
        Update location of sparrow using Eq. (14);
    end for
    for l = 1 : SD
        Update location of sparrow using Eq. (15);
    end for
    Get the current new location
    If the new location is better than before, update it;
    t = t + 1
end while
return Xbest, fg
During the iteration procedure, if the new position of a sparrow is better than its preceding position, the position is updated, and the global optimal fitness and global optimal position are found. Also, the role of each sparrow is continuously updated and alternated during this phase. Any sufficiently well-adapted sparrow can be a finder; however, the proportions of joiners and finders in the population are constant.
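The loop described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the update rules follow the commonly published form of SSA, the function names (`ssa_minimize`, `sphere`), the default ratios, and the toy objective are hypothetical, and the term A⁺·L is collapsed to a scalar step, which is what Aᵀ(AAᵀ)⁻¹ yields for a 1 × d matrix of ±1 entries.

```python
import math
import random


def sphere(x):
    """Toy objective (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def ssa_minimize(objective, dim, lower, upper, n=20, max_iter=100,
                 pd_ratio=0.2, sd_ratio=0.1, safety=0.8, seed=0):
    """Minimise `objective` with a simplified sparrow search loop."""
    rng = random.Random(seed)
    eps = 1e-12  # tiny constant that guards divisions

    def clip(x):
        return [min(max(v, lower), upper) for v in x]

    pop = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n)]
    fit = [objective(x) for x in pop]
    n_disc = max(1, int(pd_ratio * n))   # discoverers (producers)
    n_scout = max(1, int(sd_ratio * n))  # scouts (alerters)
    history = [min(fit)]

    for _ in range(max_iter):
        # Rank the population: the fittest sparrows act as discoverers.
        order = sorted(range(n), key=lambda k: fit[k])
        pop = [pop[k] for k in order]
        fit = [fit[k] for k in order]
        best_x, best_f = pop[0], fit[0]
        worst_x, worst_f = pop[-1], fit[-1]
        cand = [x[:] for x in pop]

        # Discoverers: contract the position when safe (R < S), else jump by Q.
        r = rng.random()
        for i in range(n_disc):
            if r < safety:
                a = max(rng.random(), eps)
                scale = math.exp(-(i + 1) / (a * max_iter))
                cand[i] = clip([v * scale for v in pop[i]])
            else:
                q = rng.gauss(0.0, 1.0)
                cand[i] = clip([v + q for v in pop[i]])
        lead = min(cand[:n_disc], key=objective)  # best producer position

        # Joiners: the worse half flies elsewhere, the rest follow the leader.
        for i in range(n_disc, n):
            if i + 1 > n / 2:
                q = rng.gauss(0.0, 1.0)
                cand[i] = clip([q * math.exp((w - v) / (i + 1) ** 2)
                                for v, w in zip(pop[i], worst_x)])
            else:
                step = sum(abs(v - p) * rng.choice((-1, 1))
                           for v, p in zip(pop[i], lead)) / dim
                cand[i] = clip([p + step for p in lead])

        # Scouts: randomly chosen sparrows react to danger.
        for i in rng.sample(range(n), n_scout):
            if fit[i] > best_f:
                cand[i] = clip([b + rng.gauss(0.0, 1.0) * abs(v - b)
                                for v, b in zip(pop[i], best_x)])
            else:
                k = rng.uniform(-1.0, 1.0)
                # abs() keeps the denominator positive; K is symmetric anyway.
                denom = abs(fit[i] - worst_f) + eps
                cand[i] = clip([v + k * abs(v - w) / denom
                                for v, w in zip(pop[i], worst_x)])

        # Greedy acceptance: keep a new position only if it is better.
        for i in range(n):
            f_new = objective(cand[i])
            if f_new < fit[i]:
                pop[i], fit[i] = cand[i], f_new
        history.append(min(fit))

    best = min(range(n), key=lambda k: fit[k])
    return pop[best], fit[best], history


best_x, best_f, history = ssa_minimize(sphere, dim=3, lower=-10.0, upper=10.0)
print("best fitness:", best_f)
```

In an SVM-SSA setting the objective would instead return a validation error of the SVM for a candidate parameter vector (for example its RMSE); that coupling is an assumption here and is not shown. The greedy acceptance step guarantees that the recorded best fitness never increases from one iteration to the next.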

Performance criteria
Four performance indices, namely R², RMSE, MAE, and E_NS, are considered in this study for measuring the accuracy of the applied models 73,74. The statistical measures are expressed in terms of O_i, F_i, Ō and F̄, which denote the observed, predicted, average observed and average predicted values, respectively. R² is a statistical index used in regression that shows how well the predictive models fit the real data sets. An R² value of 1 means that the predicted values perfectly fit the observed values, whereas a value of 0 indicates no linear relationship. RMSE is a quadratic scoring rule that measures the average error magnitude: it is the square root of the average of the squared differences between predicted and actual observations. MAE, in contrast, measures how close the predicted values are to the actual observations: it is the average magnitude of the errors between predicted and observed values, irrespective of the direction of the error. Low values of MAE and RMSE indicate high confidence in the model predictions 69. RMSE has the advantage of penalizing large errors more heavily and can therefore be more suitable in certain circumstances. On the other hand, MAE is easier to interpret. In addition, E_NS is one of the standardized
Table 1 gives the input parameter combinations, in which the discharge and suspended sediment load parameters are given at the current time step (Q_t and SSL_t) along with the previous monthly lags. Table 1 shows that five different scenarios were considered for estimating SSL using different input combinations of the SSL_{t−1}, SSL_{t−2}, SSL_{t−3}, Q_t, Q_{t−1}, Q_{t−2}, Q_{t−3} parameters. It is worth noting that the scenarios were selected based on the correlation between the Q and SSL variables.
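The four performance indices described in this section can be computed with a few lines of Python. The sketch below assumes the usual definitions: R² is taken as the squared Pearson correlation (consistent with the use of both the average observed and the average predicted value), and E_NS as one minus the ratio of the squared-error sum to the variance sum of the observations. The function names and the sample data are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def mae(obs, pred):
    """Mean absolute error."""
    return _mean([abs(o - p) for o, p in zip(obs, pred)])


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(_mean([(o - p) ** 2 for o, p in zip(obs, pred)]))


def r_squared(obs, pred):
    """Coefficient of determination as the squared Pearson correlation."""
    o_bar, p_bar = _mean(obs), _mean(pred)
    cov = sum((o - o_bar) * (p - p_bar) for o, p in zip(obs, pred))
    var_o = sum((o - o_bar) ** 2 for o in obs)
    var_p = sum((p - p_bar) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


def nash_sutcliffe(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus error sum over observed variance sum."""
    o_bar = _mean(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - o_bar) ** 2 for o in obs)
    return 1.0 - num / den


# Hypothetical data: predictions with a constant bias of +1.
observed = [1.0, 2.0, 3.0, 4.0]
biased = [2.0, 3.0, 4.0, 5.0]
print(mae(observed, biased), rmse(observed, biased))
print(r_squared(observed, biased), nash_sutcliffe(observed, biased))
```

For this constant-bias example R² equals 1 while E_NS is only about 0.2, which illustrates why a correlation-based index is reported together with error-based ones.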

Modeling results and analysis
Four hybrid SVM models, obtained by integrating SVM with four different metaheuristic algorithms (MAs), namely BA, BOA, GOA, and SSA, are compared for modeling SSL using data from the River Brahmani basin. The performance of the applied models was then assessed against each other and against the conventional SVM model using statistical measures and graphical interpretations. This section provides the outcomes of the comparisons and the effectiveness of all the models at the four gauge stations. Two major influencing variables, Q and SSL, were used to predict SSL. The outcomes of the training and testing periods are provided in Tables 2, 3, 4, 5 and 6, which show the prediction performance of all applied models on the basis of the R², RMSE, E_NS, and MAE criteria. The results indicate that these input parameters estimate SSL effectively.
The performance statistics of SVM1 during the testing phase at Jaraikela station, when the SSL and discharge of the current month are considered, are R² = 0.8996, E_NS = 0.8941, MAE = 40.6325, and RMSE = 40.768. Next, a combination of SSL_{t−1} and Q_{t−1} (one-month lag) is considered. Here the SVM2 model (R² = 0.9.32, E_NS = 0.8975, MAE = 39.341, RMSE = 39.4767) performs better than the SVM1 model. Similarly, when a combination of SSL_{t−1}, SSL_{t−2}, SSL_{t−3}, Q_t, Q_{t−1}, Q_{t−2}, Q_{t−3} (one-month, two-month and three-month lags and the current month) is considered, the performance is R² = 0.90819, E_NS = 0.90449, MAE = 35.9001, RMSE = 36.0338. It can be observed that, as the lag time is increased, there is a gradual performance improvement in the case of SVM as
Table 1. Implemented models and their input parameters (columns: input parameters; SVM-based models).

Thus, for SVM-BA, SVM-GOA, and SVM-BOA too, the last scenario performs better than the other four scenarios. The detailed results are shown in Tables 3, 4 and 5. According to Table 6, considering SSL_{t−1}, SSL_{t−2}, SSL_{t−3}, Q_t, Q_{t−1}, Q_{t−2}, Q_{t−3} provides the best results, i.e., R² = 0.97014, E_NS = 0.96481, MAE = 15.3926, RMSE = 15.5287, followed by SSL_{t−1}, SSL_{t−3}, SSL_{t−2}, SSL_{t−3}, Q_{t−3} (R² = 0.9691, E_NS = 965, MAE = 16.0031, RMSE = 16.1372), and the worst performance was observed when SSL_t, Q_t is considered (R² = 0.9636, E_NS = 0.9578, MAE = 18.36, RMSE = 18.4958). The performance statistics of the SVM-SSA method are the best among all hybrid models for all stations during both the training and testing phases. Across the selected stations, all the models performed best at Jaraikela, followed by Tilga, Jenapur and Gomlai. Tables 2, 3, 4, 5 and 6 provide the statistical assessment measures on the training and testing datasets for the Tilga, Jenapur, Jaraikela and Gomlai stations under the five scenarios. They show a general trend in which performance (R², E_NS, MAE, RMSE) tends to improve when more input features are included, which holds for all proposed models at all the stations used for SSL prediction. Larger values of E_NS or R² indicate better results, whereas smaller values of RMSE or MAE indicate better results. The final results in Table 6 reveal that the proposed ML models were adequately trained and verified. The outcomes of the testing phase reflect the performance of the predictive models more faithfully. As stated before, four hybrid SVM models and the conventional SVM model were employed for SSL prediction; every model had an equal number of MFs. An analysis of the figure makes clear that the considered algorithms BA, GOA, BOA, and SSA enhanced the performance of the conventional SVM during the training and testing stages. During the training period, SVM-SSA showed the best performance, with R², RMSE, MAE, and E_NS values of 0.99616, 0.14994, 0.01578, and 0.99195, respectively. Similarly, during the testing period, the R², RMSE, MAE, and E_NS values for SVM-SSA are 0.97014, 15.5287, 15.3926, and 0.96481, respectively.
The scatter plots of the data predicted by the SVM, SVM-BA, SVM-GOA, SVM-BOA, and SVM-SSA models against the actual data during the training and testing phases are shown in Fig. 7. Generally, a model performs better when the scatter points lie closer to the 45° line. The regression values between the actual and predicted data differed among the approaches. The scatter points of the SVM-SSA model are more concentrated near the 45° line than those of the other four models. According to the graphical comparison of observed and predicted sediment load values, the SVM-SSA model achieved the highest correlation for all input combinations, followed by the SVM-BOA, SVM-GOA, SVM-BA and ordinary SVM models. This is shown by the ability of the SVM-SSA model to capture the varied sediment load observations at all four gauge stations. Using the SSA optimizer thus enhanced the performance of SVM-SSA relative to the other hybrid models and the conventional SVM model in all input scenarios.
To visually compare the SSL computed by the applied models (for the best input combination) with the available observed data, a time-series plot of predicted against observed SSL values is presented in Fig. 8 for all stations. The plot illustrates that the hybridized ML algorithms have superior prediction capability, particularly in capturing the peak SSL values, which is a significant improvement over the conventional model. As illustrated in the figure, the values predicted by the SVM model follow the fluctuation of the actual values only to a certain degree. The predictions of the SVM-BA and SVM-GOA models do not differ much from each other, and their overall deviation from the real values is quite high. The SVM-BOA model generated predictions with slightly smaller deviations from the real values. The SVM-SSA model shows the smallest overall deviation: apart from some differences during the validation phase, its predicted values are extremely close to the real values. The figure shows that, in comparison with the other predictive models, the SVM-SSA predictions matched the actual SSL values more accurately. Based on the time series of modeled and observed SSL in Fig. 8, peak SSL values are well predicted by the SVM-SSA algorithm, and there is only a minor difference between the time series of modeled and observed SSL.
To visually evaluate the performance of the models in replicating the probability distribution of the actual SSL data, violin plots were prepared. Violin plots of the actual and model-predicted SSL data are shown in Fig. 9. A closer resemblance in the shape of the violins indicates more similar distributions of the observed and simulated SSL data. Figure 9 shows a better similarity between the actual and SVM-SSA simulated SSL at all four sites. The figure also shows that the violin shapes of the hybridized models were more similar to the shape of the actual violin at all locations. The maximum disparity was observed for the standalone SVM, followed by SVM-BOA, SVM-GOA, and SVM-BA. An assessment of the outcomes at the four locations showed improved performance of all hybrid models in simulating the observed SSL distributions. The reason is that, for a long-term forecasting task, it becomes much more difficult for a forecasting technique to capture the dynamic changes of SSL because of the many uncertain factors involved in the complex hydrological process. Therefore, the proposed method, which uses SSA to optimize the parameters of the SVM model, can generate satisfactory forecasts. Based on the different statistical indicators, the performance of the hybrid models was found to fluctuate slightly between sites. The consistency of the outcomes reveals a strong dominance of the SVM-SSA model in replicating SSL in the selected study region.
To better understand the estimation accuracy of the five employed models (SVM, SVM-BA, SVM-GOA, SVM-BOA, and SVM-SSA), the predicted and observed SSL values are compared over different ranges. Histogram plots of the predicted and actual SSL values are shown in Fig. 10. The prediction of SSL at Jenapur station illustrates that, for the highest (100000-200000 µg/l) and lowest (300000-400000 µg/l) ranges of SSL, the frequency (number of events) predicted by the SVM-SSA5 model agrees better with the frequency of the actual values than the frequencies predicted by the other models. The performance of the SVM-SSA, SVM-BOA, and SVM-GOA models is fairly similar for middle-range values (200000-800000 µg/l); SVM-SSA shows a slight improvement over SVM-BOA and SVM-GOA and a larger improvement over the SVM-BA and SVM models. Overall, for all stations, the

Discussion
Modelling of river sedimentation is one of the most complicated hydrological modelling problems. The transport of suspended sediment is a dynamic non-linear system, and changes in inflow and sediment load introduce significant uncertainty into river hydrological modelling. In this context, robust methods must be employed for modelling SSL in rivers. Based on the assessment provided in the previous sections, the developed SVM-SSA model effectively modeled SSL in this study by using an optimization algorithm to find the optimum parameter values of the conventional SVM. The hybrid SVM-BOA, SVM-GOA, and SVM-BA models fail to estimate extreme SSL values accurately, whereas the robust SVM-SSA model predicts the maximum and minimum values with smaller errors. The forecasting results of SVM-BA, SVM-GOA, SVM-BOA and SVM-SSA show slight differences in the four statistical metrics, indicating the importance of selecting an appropriate optimization algorithm for model parameter calibration.
Standard SVM, which uses the structural risk minimization principle, can achieve good generalization performance. However, the performance of SVM generally depends on the optimization algorithm used to calibrate its parameters. Even though BA, GOA, and BOA have been successfully used in solving optimization problems, all these algorithms are prone to premature convergence. As a newly proposed optimization algorithm, SSA has strong global In addition, SVM-SSA utilizes a high race optimum procedure that can learn the stochastic phenomena of SSL. It must be emphasized that field engineers are keen on using less complicated tools for practical use. Because the SVM-SSA model incorporates fewer input constraints in its architecture, it can be regarded as an economical model for SSL prediction. The importance of this study lies in the use of the sparrow search algorithm and its application to sediment load prediction. SSA has robust optimization capability, fast convergence, and broader applicability than conventional heuristic search techniques. These advantages attract researchers to apply SSA to major issues like sediment load estimation, which is essential for monitoring and damage mitigation. Also, high loads of suspended sediment in streams are known to have unfavorable impacts on river water quality, potable water sources, reservoir or dam operations, and irrigation activities.
Even though this research makes several contributions, certain drawbacks remain. One drawback is that only a single case study (the Brahmani River basin) is considered. Further investigation will include testing the applied approach on other streams. The authors also plan to evaluate the use of input variables produced from weather station data through numerical rainfall-runoff modeling as a substitute for input variables measured in situ. This will facilitate the adaptation of the applied prediction models (specifically where onsite data are limited) and will probably further enhance the performance of SSL estimation. Another drawback is that, because the database is incomplete, certain possible influencing factors might not have been identified. To further enhance the stability and accuracy of the predictive models, the integration of more robust optimization algorithms is also worth considering. In future efforts, the model can also be applied to prediction problems in several other fields of study.

Conclusions
Prediction of river SSL is significant in planning, operating, and preserving water structures. Sediment transport in a river exhibits random behaviour, which makes the sediment load difficult to estimate. This study predicted sediment loads using the hybrid SVM-SSA, SVM-BOA, SVM-GOA, and SVM-BA approaches and the conventional SVM approach. Monthly discharge and SSL measured at the Tilga, Jenapur, Jaraikela, and Gomlai stations were used to set up the prediction models. The prediction accuracy of the applied models was assessed using statistical performance measures (RMSE, MAE, R2, and ENS) for different arrangements of input parameters. Graphical comparisons based on scatter plots, time-series plots, boxplots, and histogram plots were also used to identify the best prediction model. The results showed that SSA is the leading algorithm, with fast convergence and higher accuracy. The results indicated that the best SVM-based model has the lowest MAE and RMSE values and the highest ENS values. This was achieved with 5 inputs after hybridization with the SSA algorithm. The SVM-SSA hybrid model can precisely capture extreme SSL values, signifying its robustness for application to hydrological and water resource problems. The SVM-SSA model generated the best SSL predictions, as confirmed by RMSE values of 15.6992, 15.9143, 15.5287, and 16.01885 for the testing data set at the Tilga, Jenapur, Jaraikela, and Gomlai stations of the Brahmani River. In contrast, the SVM model performed worst, as verified by RMSE values of 36.457 (Tilga), 36.6975 (Jenapur), 36.0338 (Jaraikela), and 37.1004 (Gomlai) during the testing phase. Using the proposed models to predict SSL while taking other variables into consideration (e.g., precipitation intensity, temperature, runoff volume) could improve the present investigation. The current study relies primarily on black-box models within the hybrid and ensemble SSL modeling framework, which is a notable limitation. Therefore, to increase the robustness of SSL modeling in future research, it is suggested to include process-based models and
Initialize the bat population Xi (i = 1, 2, 3, ..., n) and Vi
Define pulse frequency Fi
Initialize pulse rate ri and the loudness Ai
while (t < maximum number of iterations)
    Generate new solutions by adjusting frequency and updating velocities and positions [Eqs. 1 to 3]
    if (rand > ri)
        Select a solution among the best solutions randomly
        Generate a local solution around the selected best solution
    end if
    Generate a new solution by flying randomly
    if (rand < Ai and f(Xi) < f(Gbest))
        Accept the new solution
        Increase ri and reduce Ai
    end if
    Rank the bats and find the current Gbest
end while
Algorithm 1. Bat algorithm (BA).
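The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: Eqs. 1 to 3 are assumed to be the standard bat-algorithm frequency, velocity, and position updates; the local step is taken around the current global best rather than a randomly chosen good solution; and the function name `bat_algorithm`, its parameters (`f_min`, `f_max`, `alpha`, `gamma`, `sigma`), and their default values are hypothetical choices made for the example.

```python
import math
import random


def bat_algorithm(objective, bounds, n_bats=20, n_iter=200,
                  f_min=0.0, f_max=2.0, alpha=0.9, gamma=0.9,
                  sigma=0.1, seed=0):
    """Minimise `objective` inside the box `bounds` (list of (low, high))."""
    rng = random.Random(seed)
    dim = len(bounds)
    span = [hi - lo for lo, hi in bounds]

    def clip(x):
        # Keep every coordinate inside its bounds.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    # Initialise positions Xi, velocities Vi, loudness Ai and pulse rate ri.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_bats)]
    vel = [[0.0] * dim for _ in range(n_bats)]
    fit = [objective(x) for x in pos]
    loud = [1.0] * n_bats
    rate0 = [rng.random() for _ in range(n_bats)]  # assumed limiting rates
    rate = [0.0] * n_bats

    g = min(range(n_bats), key=lambda i: fit[i])
    best, best_fit = pos[g][:], fit[g]

    for t in range(1, n_iter + 1):
        mean_loud = sum(loud) / n_bats
        for i in range(n_bats):
            # Assumed Eqs. 1 to 3: frequency, velocity and position updates.
            freq = f_min + (f_max - f_min) * rng.random()
            vel[i] = [v + (x - b) * freq
                      for v, x, b in zip(vel[i], pos[i], best)]
            cand = clip([x + v for x, v in zip(pos[i], vel[i])])

            # Local solution around the best solution (simplified: Gbest).
            if rng.random() > rate[i]:
                cand = clip([b + sigma * s * mean_loud * rng.uniform(-1.0, 1.0)
                             for b, s in zip(best, span)])

            cand_fit = objective(cand)
            # Accept only if loud enough and better than the current Gbest.
            if rng.random() < loud[i] and cand_fit < best_fit:
                pos[i], fit[i] = cand, cand_fit
                loud[i] *= alpha                                   # reduce Ai
                rate[i] = rate0[i] * (1.0 - math.exp(-gamma * t))  # raise ri
                # An accepted candidate beats Gbest, so it is the new Gbest.
                best, best_fit = cand[:], cand_fit
    return best, best_fit
```

For example, `bat_algorithm(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 2)` searches for the minimum of a two-dimensional sphere function. In an SVM-BA setting, `objective` would instead return a validation error for a candidate set of SVM parameters; that coupling is not shown here.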
histogram plots show that, during the training and testing periods, the probability distribution of the predicted values is closer to that of the observed values (better agreement) for the hybrid models than for the simple SVM model.

Figure 11. Comparison plot of MAE for (a) training and (b) testing phase.

Figure 12. Comparison plot of ENS for (a) training and (b) testing phase.
Figure 13.
measures to assess the model precision, whose value lies between zero and one. An ENS value of 1 indicates the best agreement, while a value of 0 indicates no agreement. The ENS measure is highly sensitive to extreme values because it uses squared differences. For each model scenario, Table
Scientific Reports | (2024) 14:12889 | https://doi.org/10.1038/s41598-024-63490-1
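The performance measures named in this work (MAE, RMSE, R2, and ENS) can be computed as in the short sketch below. The formulas are the commonly used textbook definitions and are an assumption here, since the exact expressions applied in the study are not reproduced in this section; in particular, R2 is taken as the squared Pearson correlation between observed and simulated values, and the function names are illustrative.

```python
import math


def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs, sim):
    """Root mean squared error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def r_squared(obs, sim):
    """Squared Pearson correlation (assumed definition of R2)."""
    mo = sum(obs) / len(obs)
    ms = sum(sim) / len(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov * cov / (var_o * var_s)


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SS_residual / SS_about_observed_mean."""
    mo = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mo) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical values, for illustration only (not data from this study).
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.2, 3.8]
scores = {
    "MAE": mae(observed, simulated),
    "RMSE": rmse(observed, simulated),
    "R2": r_squared(observed, simulated),
    "ENS": nash_sutcliffe(observed, simulated),
}
```

Because the residuals are squared in `nash_sutcliffe`, a few large errors at peak values dominate the score, which is the sensitivity to extreme values noted above. Under this standard definition, a simulation equal to the observed mean scores 0, and one that is worse than the mean scores below 0.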

Table 3. Performance of SVM-BA models for SSL estimation for all data.

Table 4. Performance of SVM-GOA models for SSL estimation for all data.

Table 5. Performance of SVM-BOA models for SSL estimation for all data.

Table 6. Performance of SVM-SSA models for SSL estimation for all data.