Using machine learning to improve predictions and provide insight into fluvial sediment transport

A thorough understanding of fluvial sediment transport is critical to addressing many environmental concerns such as exacerbated flooding, degradation of aquatic habitat, excess nutrients, and the economic challenges of restoring aquatic systems. Fluvial sediment samples are integral for addressing these environmental concerns but cannot be collected at every river and time of interest. Therefore, to gain a better understanding for rivers where direct measurements have not been made, extreme gradient boosting machine learning (ML) models were developed and trained to predict suspended sediment and bedload from sampling data collected in Minnesota, United States (U.S.), by the U.S. Geological Survey. Approximately 400 watershed (full upstream area), catchment (nearby landscape), near‐channel, channel, and streamflow features were retrieved or developed from multiple sources, reduced to approximately 30 uncorrelated features, and used in the final ML models. The results indicate suspended sediment and bedload ML models explain approximately 70% of the variance in the datasets. Important features used in the models were interpreted with Shapley additive explanation (SHAP) plots, which provided insight into sediment transport processes. The most important features in the models were developed to normalize streamflow by the 2‐year recurrence interval and quantify the rate of change in streamflow (slope), which helped account for sediment hysteresis. Generally, this study also showed a combination of mostly watershed and catchment geospatial features were important in ML models that predict sediment transport from physical samples. This study is a promising step forward in making fluvial sediment transport predictions using machine learning models trained by physically collected samples. The approach developed here can be used wherever similar datasets exist and will be useful for landscape and water management.

sediment can shift a river into disequilibrium. Excess suspended sediment can impair rivers by adversely affecting aquatic habitats, degrading water quality, transporting harmful contaminants, diminishing recreational opportunities, and depositing sediment in navigable waterways (Alexander et al., 2012; Minnesota Pollution Control Agency, 2021a). In contrast, dams and other river modifications can reduce sediment transport and contribute to loss of native fish species and riparian ecosystems, subsidence and loss of wetlands, and decreased nutrient delivery to downstream receiving waters (Draut et al., 2011; Kondolf, 1997; Schmidt & Wilcock, 2008). Therefore, accurate and cost-effective estimates of sediment loading are needed to manage riverine sediment transport at a multitude of scales (Ellison et al., 2016); also needed are methods to estimate sediment transport at sites where few or no physical samples have been collected (Gray & Simões, 2008).
Physically collected sediment samples provide the most accurate data to inform understanding of fluvial sediment processes and transport. The most accurate suspended sediment sampling methods are equal-width increment (EWI) or equal-discharge increment (EDI) and use depth-integrating, isokinetic samplers (Davis, 2005) to represent nearly the entire water column in a river cross section (Edwards & Glysson, 1999). The most accurate laboratory method analyzes the entire suspended sample for suspended-sediment concentration (SSC; American Society for Testing and Materials, 2000; Guy, 1969) and accounts for both fines and sands. Pressure-difference bedload (BL) samples collect larger sized particles (sands, gravel, and cobbles) moving along the bottom of the water column, and one method used to sample BL transport is the single EWI (SEWI; Edwards & Glysson, 1999).
However, EWI and EDI sampling analysed for SSC and SEWI sampling analysed for BL are time-consuming and costly, requiring specialized training and equipment to collect samples. In fact, the number of U.S. Geological Survey (USGS) daily-record sediment-monitoring stations was reduced by more than 67% from 1981 to 2005 (Larsen et al., 2010). The number of BL stations is even smaller (U.S. Geological Survey, 2021b).
Less accurate grab samples only represent the top of the water column at one location and are analysed for total suspended solids (TSS; Clesceri et al., 1998). Unlike SSC, TSS is determined from an extracted sub-sample. These methods are inexpensive and faster to collect, require less specialized equipment and training, and meet regulatory requirements for evaluating sediment-related impairments established by the U.S. Environmental Protection Agency (Minnesota Pollution Control Agency, 2021b). However, TSS methods have been shown to substantially underestimate the suspended sand component in streams and rivers (Ellison et al., 2014; Gray et al., 2000; Groten & Johnson, 2018). TSS methods can only inform understanding of suspended fines, whereas EWI and EDI sampling analysed for SSC and SEWI sampling for BL transport can inform the transport and processes for total sediment load (fines, sands, gravels, and cobbles) in rivers.
Sediment rating curves (SRCs) are empirical relations developed between streamflow and a set of discrete physical sediment samples and are used to provide estimates of sediment transport and loads when samples are unavailable. Past research has found that correlations between the SRCs' slope and intercept parameters, river basin morphology, and climate helped define physical controls on sediment loads in rivers (Syvitski et al., 2000). More recent research has used machine learning (ML) and geospatial datasets to predict SRCs' parameters in order to make inferences on sediment transport controls (e.g., Vaughan, Belmont, et al., 2017) or to make predictions at ungauged locations (e.g., Atieh et al., 2015). Having accurate estimates of the total sediment will provide insight on the processes controlling transport to help diagnose and restore fluvial systems.
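As an illustration of the SRC concept described above, the following minimal sketch fits a power-law rating curve, C = a·Q^b, by ordinary least squares on log10-transformed values. The power-law form, the function names, and the sample values are assumptions made for illustration; they are not taken from this study, and no retransformation bias correction is applied.

```python
import math

def fit_rating_curve(streamflow, concentration):
    """Fit C = a * Q**b by least squares on log10-transformed data.

    Assumes all values are strictly positive. Returns (a, b).
    """
    log_q = [math.log10(q) for q in streamflow]
    log_c = [math.log10(c) for c in concentration]
    n = len(log_q)
    mean_q = sum(log_q) / n
    mean_c = sum(log_c) / n
    sxx = sum((x - mean_q) ** 2 for x in log_q)
    sxy = sum((x - mean_q) * (y - mean_c) for x, y in zip(log_q, log_c))
    b = sxy / sxx                    # slope in log-log space (exponent)
    a = 10 ** (mean_c - b * mean_q)  # intercept back-transformed (coefficient)
    return a, b

def predict_concentration(a, b, q):
    """Estimate sediment concentration at streamflow q from the fitted curve."""
    return a * q ** b

# Hypothetical samples that follow C = 2 * Q**0.5 exactly
flows = [10.0, 100.0, 1000.0]
concs = [2.0 * q ** 0.5 for q in flows]
a, b = fit_rating_curve(flows, concs)
```

The fitted slope and intercept are the two SRC parameters that the studies cited above relate to basin morphology, climate, or geospatial predictors.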
Bankfull streamflows are the most geomorphically active streamflows in the channel before streamflow spills over its banks and loses energy to the floodplain (Biedenharn et al., 2008; Lane, 1955). However, bankfull streamflows can be difficult to determine, and a thorough site evaluation is often needed (Leopold et al., 1964; Rosgen, 1994, 1996). Generally, bankfull streamflows have a recurrence interval (RI) ranging from 1 to 2 years depending on the site (Simon et al., 2004). Previous sediment transport studies have used bankfull streamflow to normalize data while developing dimensionless sediment rating curves (DSRCs; Ellison et al., 2016; Rosgen, 2010).
Streamflow and sediment transport are not always strongly correlated, and hysteresis in the relation can cause inaccurate predictions.
Hysteresis occurs when the relation between sediment and streamflow changes based upon the history of the system such that different measured values of suspended sediment or BL can occur in the same stream at the same streamflow but at different times. There are multiple types of hysteresis, and only a selection of the possible types will be presented. The two most common types of hysteresis are clockwise (type-1) and counterclockwise (type-3), with clockwise hysteresis being more common (Gellis, 2013). With clockwise hysteresis there are higher values of sediment transport on the rising limb than the falling limb of the hydrograph. Clockwise hysteresis can be caused by a source of in-stream sediment that is readily mobilized during the rising limb of the hydrograph and corresponding increase in shear stress, which then becomes exhausted on the falling limb as sources are depleted (Gellis, 2013; Smith & Dragovich, 2009). Alternatively, counterclockwise hysteresis is the opposite and can be caused by a delayed delivery from a sediment source. The source of sediment can be made available from an upstream tributary or due to a saturated riverbank collapse after the river receded (Gellis, 2013; Kelly & Belmont, 2018). Therefore, SRCs should be carefully examined for stream locations with available data before applying methods to estimate sediment at stream locations of interest that have similar watershed characteristics and lack physical data. Some studies have tried to account for hysteresis in SRCs by using categorical variables to classify the rising and falling limbs or developing an SRC based on where the sample was collected on the streamflow hydrograph (Asselman, 2000; Vaughan, Belmont, et al., 2017).
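The direction of a hysteresis loop can be illustrated with a short sketch. The approach below is an assumption chosen for illustration (it is not a method used in this study): it treats one event as a closed path in the streamflow-sediment plane, with streamflow on the x-axis, and uses the sign of the shoelace (signed polygon) area to tell clockwise from counterclockwise loops. Function names and values are hypothetical.

```python
def loop_direction(streamflow, sediment):
    """Classify one event's streamflow-sediment loop by its signed area.

    Points must be in time order; the path is closed back to the first point.
    With streamflow on the x-axis and sediment on the y-axis, a negative
    shoelace area means the path is traversed clockwise.
    """
    n = len(streamflow)
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n  # next point, wrapping around to close the loop
        twice_area += streamflow[i] * sediment[j] - streamflow[j] * sediment[i]
    if twice_area < 0:
        return "clockwise"
    if twice_area > 0:
        return "counterclockwise"
    return "none"

# Hypothetical event: streamflow rises to a peak and recedes
q = [1.0, 2.0, 3.0, 2.0, 1.0]
ssc_rising_high = [10.0, 40.0, 50.0, 20.0, 10.0]   # more sediment on rising limb
ssc_falling_high = [10.0, 20.0, 50.0, 40.0, 10.0]  # more sediment on falling limb

direction_a = loop_direction(q, ssc_rising_high)
direction_b = loop_direction(q, ssc_falling_high)
```

A single signed area cannot describe figure-eight or more complex loops, because areas of opposite sign partly cancel, so this sketch only covers the two common types named above.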
The application of different ML approaches to estimate sediment transport has grown over the past two decades (Afan et al., 2016). ML has multiple benefits over traditional approaches, such as SRCs, with increased prediction accuracy of suspended sediment at specific sites while having the ability to learn complex linear and non-linear relations (Cisty et al., 2021; Francke et al., 2008; Khan et al., 2021; Zounemat-Kermani et al., 2020). Another benefit of ML is the ability to interpret these complex relations with the important features used in the model (Breiman, 2001; Cutler et al., 2007). Vaughan, Belmont, et al. (2017) (Breiman, 2001; Chen & Guestrin, 2016). Machine learning uses computer algorithms to learn complex interactions among linear and nonlinear data without the user needing to program the exact interaction (Bortnik & Camporeale, 2021). Random forest and XGBoost models learn by building many decision trees (learners) on random subsets of the full dataset and combine them to estimate the target outcome in a repeated process (ensembles). Random forest and XGBoost models can use many features as input to make predictions because the averaging of trees reduces the risk of overfitting and makes the models robust to noise (Bortnik & Camporeale, 2021; Breiman, 2001; Fox et al., 2017; Hastie et al., 2009). Both models can calculate permutation feature importance by removing features from the model to determine whether the model error increases when a feature is omitted (Breiman, 2001).
There were three primary objectives of this study. The first objective was to use representative physical samples, streamflow, and publicly available geospatial datasets that describe watershed, catchment, near-channel, and channel features to develop methods to provide estimates of total sediment at stream locations where few or no physical samples have been collected.
The second objective was to develop features from dimensionless streamflow to better account for hysteresis in the streamflow-sediment relation because streamflow data are more commonly available than sediment data. The third objective was to interpret XGBoost models with SHAP values to assess how predictions were made while making connections to known processes controlling sediment transport.
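To make the idea behind SHAP values concrete, the sketch below computes exact Shapley values for a single prediction by enumerating every feature subset. It is a brute-force illustration under stated assumptions: features that are "absent" from a subset are set to a single baseline value, the toy model and its three inputs are hypothetical, and this is not the tree-based algorithm that SHAP software uses for XGBoost models.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    predict  : function taking a list of feature values, returning a number
    x        : feature values of the sample being explained
    baseline : reference values used when a feature is "absent" (assumption)
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for subsets of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                without_i = [x[j] if j in present else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this subset
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi

# Hypothetical toy model: one additive term plus one interaction term
def toy_model(v):
    return 3.0 * v[0] + v[1] * v[2]

sample = [1.0, 2.0, 3.0]
reference = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, sample, reference)
```

The values sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additive property that makes SHAP plots interpretable feature by feature. The interaction term is split equally between the two features involved.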

| Study area
Minnesota has a complex glacial history, which resulted in diverse landforms and surface water conditions (Figure 1; Ellison et al., 2016; Ojakangas & Matsch, 1982; Sims & Morey, 1972). The southwest (SW) region received drained water from glacial Lake Agassiz (not shown) approximately 10 000 years ago, which resulted in incised valleys and highly erodible knickpoints that influence the region's current sediment regimes (Gran et al., 2009; Minnesota Pollution Control Agency, 2011). The southeast (SE) karst region was relatively untouched by the glaciers and has higher relief than the other regions of Minnesota (Figure 1; Lively, 2020). The predominantly forested northeast (NE) region has shallow bedrock, and steeper gradient rivers flow toward Lake Superior (Ojakangas & Matsch, 1982; Sims & Morey, 1972). Lakes and wetlands dominate the middle (MID) region, while intensively cultivated lands cover the southern, western, and northwest (NW) regions (Ellison et al., 2016). Overall, Minnesota has diverse landscapes and complex sediment transport processes with differing sediment transport regimes, supplies, and controls that are representative of low-relief glaciated regions around the world (Ellison et al., 2016; Vaughan, Belmont, et al., 2017).

| DATA AND METHODS
Supervised ML models were developed with physically collected sediment samples, streamflow, and geospatial datasets (Figure 2) to predict SSC in milligrams per litre (mg/L) and BL discharge in tons per day (tons/day; Edwards & Glysson, 1999). Two new streamflow features were calculated to better account for hysteresis (Section 2.1.1).
Approximately 400 features were tested for correlation before the final selected features were included in the models, as described in Section 2.2.1. Model development is
Geological Survey, 2021b). EWI and EDI SSC samples were primarily used along with 85 grab samples that were collected during low flows (water velocities less than 2 ft per second), when velocities were too slow for isokinetic samplers and EWI and EDI methods (Edwards & Glysson, 1999). Two samples greater than 6400 mg/L were omitted from model development to reduce bias because they were collected at sites impacted by major floods and were deemed extreme sediment transport events not representative of the entire dataset. Fourteen SSC samples were removed from the dataset because they contained more than 80% sand, likely due to field crews inadvertently sampling the streambed. A total of 1382 SSC samples from 56 sites and 638 bedload samples (collected with SEWI methods [Edwards & Glysson, 1999]) from 43 sites were included in the final dataset.

| Development of streamflow slope
The dimensionless streamflow dataset was used to calculate two slope features that provide the magnitude and direction of the rate of change of streamflow (Figure 3). These slope features help ML models recognize changing streamflow conditions around the time of sample collection and possibly better account for sediment hysteresis. The first slope feature was calculated from dimensionless streamflow during sample collection and dimensionless streamflow 24 hours (h) before (hereafter referred to as 'streamflow slope [24-h before]'). The second slope feature was calculated from dimensionless streamflow during sample collection and dimensionless streamflow 24 h after (hereafter referred to as 'streamflow slope [24-h after]'). A 24-h interval was selected to support consistency at sites that had either daily mean or instantaneous streamflow.
FIGURE 3 Example of streamflow slope features calculated from instantaneous streamflow 24-h before (a) and 24-h after (b), and streamflow slope features calculated from daily mean streamflow 24-h before (c) and 24-h after (d) at the same river over the same period
(Table 1) were included in the initial dataset.
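The two slope features can be sketched as follows. The text does not give the exact formula, so this is an assumption: each slope is taken as the change in dimensionless streamflow (streamflow divided by the 2-year RI streamflow) over the 24-h interval, expressed per day, so that rising streamflow is positive and falling streamflow is negative. Function names and values are hypothetical.

```python
def dimensionless_streamflow(streamflow, q_2yr):
    """Normalize streamflow by the site's 2-year recurrence interval streamflow."""
    return streamflow / q_2yr

def streamflow_slopes(q_before, q_sample, q_after, q_2yr, interval_days=1.0):
    """Return (slope_24h_before, slope_24h_after) in dimensionless units per day.

    q_before : streamflow 24 h before sample collection
    q_sample : streamflow during sample collection
    q_after  : streamflow 24 h after sample collection
    Assumed form: difference in dimensionless streamflow over the interval.
    """
    d_before = dimensionless_streamflow(q_before, q_2yr)
    d_sample = dimensionless_streamflow(q_sample, q_2yr)
    d_after = dimensionless_streamflow(q_after, q_2yr)
    slope_before = (d_sample - d_before) / interval_days  # > 0 means rising into the sample
    slope_after = (d_after - d_sample) / interval_days    # < 0 means falling after the sample
    return slope_before, slope_after

# Hypothetical site with a 2-year RI streamflow of 200 (any consistent unit)
slope_before, slope_after = streamflow_slopes(
    q_before=50.0, q_sample=100.0, q_after=80.0, q_2yr=200.0
)
```

In this hypothetical case the sample was collected near a peak: streamflow rose before collection (positive slope) and fell afterward (negative slope).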

| Removing geospatial features from the dataset
While ML models are robust to datasets including many input features (Breiman, 2001; Fox et al., 2017), it can be difficult to assess feature importance if there are multiple highly correlated features (Molnar, 2019). To reduce correlation in our dataset and improve model interpretation, two Pearson correlation matrices were built.
The first correlation matrix was sorted manually to remove geospatial features that described similar characteristics.
The R caret package (Kuhn, 2008, 2021; R Core Team, 2021) was used to search a second correlation matrix and remove features with an absolute-value pair-wise correlation greater than 0.50. The remaining features were used in model development as described in Section 2.3.
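The correlation-based feature reduction can be sketched in a few lines. The version below is a simplified greedy filter written for illustration, not a reproduction of the caret routine used in the study: it repeatedly finds the most correlated pair above the cutoff and drops the member of that pair with the larger mean absolute correlation to the remaining features. Feature names and values are hypothetical, and constant features (zero variance) are assumed to have been removed beforehand.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def drop_correlated(features, cutoff=0.50):
    """Return feature names kept after removing highly correlated features.

    features : dict mapping feature name -> list of values
    cutoff   : absolute pair-wise correlation above which one feature is dropped
    """
    kept = list(features)
    corr = {(a, b): abs(pearson(features[a], features[b]))
            for a in kept for b in kept if a != b}

    def mean_abs(name):
        others = [o for o in kept if o != name]
        return sum(corr[(name, o)] for o in others) / len(others)

    while True:
        pairs = [(corr[(a, b)], a, b)
                 for i, a in enumerate(kept) for b in kept[i + 1:]
                 if corr[(a, b)] > cutoff]
        if not pairs:
            return kept
        _, a, b = max(pairs)  # most correlated remaining pair
        kept.remove(a if mean_abs(a) >= mean_abs(b) else b)

# Hypothetical features: percent clay and percent silt are nearly collinear
example = {
    "pct_clay": [1.0, 2.0, 3.0, 4.0, 5.0],
    "pct_silt": [2.0, 4.0, 6.0, 8.0, 10.5],
    "channel_slope": [3.0, 1.0, 4.0, 1.0, 5.0],
}
kept = drop_correlated(example, cutoff=0.50)
```

Dropping the feature that is, on average, more correlated with everything else tends to keep the feature that carries more independent information, which helps later interpretation of feature importance.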

| Model development
XGBoost ML models were trained and tested in R using the XGBoost package (R Core Team, 2021). Data were split into training (80% of the data) and testing (20% of the data) datasets using a stratified random split to capture equal proportions of samples per site in each dataset. The training data were used in a grid search to build over 2000 5-fold cross validation (CV) XGBoost models to determine the best set of tuning parameters. The final evaluation error for a CV XGBoost model is the average of the five folds, so all the training data were used to validate performance. The CV XGBoost models were compiled by organizing their evaluation errors from lowest to highest.
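A stratified 80/20 split by site can be sketched as below. This is an illustrative assumption about how such a split might be done, not the study's code: rows are shuffled within each site and roughly 20% of each site's rows are assigned to the testing set. The function name, seed, and site labels are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(site_ids, test_fraction=0.20, seed=42):
    """Split row indices into training and testing sets within each site.

    site_ids : list giving the site label of every sample (row)
    Returns (train_idx, test_idx) as sorted lists of row indices.
    """
    rng = random.Random(seed)
    rows_by_site = defaultdict(list)
    for idx, site in enumerate(site_ids):
        rows_by_site[site].append(idx)

    train_idx, test_idx = [], []
    for site in sorted(rows_by_site):
        rows = rows_by_site[site]
        rng.shuffle(rows)
        n_test = round(len(rows) * test_fraction)  # per-site share for testing
        test_idx.extend(rows[:n_test])
        train_idx.extend(rows[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical dataset: 10 samples at site A and 20 samples at site B
sites = ["A"] * 10 + ["B"] * 20
train_idx, test_idx = stratified_split(sites)
```

Sites with very few samples may contribute no rows to the testing set under this rounding rule, so a minimum per-site count could be enforced if that matters for a given dataset.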
The top 10 sets of tuning parameters from the CV XGBoost model grid search were used to build 10 XGBoost models to find the best overall set of tuning parameters for the final model. The learning rate was reduced from 0.10 to 0.01 to slow down the model, force it to be more conservative, and help prevent overfitting (Chen & Guestrin, 2016). A watchlist metric from the XGBoost package was used to calculate the error statistic for the testing dataset on the trained model after every tree addition. Early stopping rounds were set to 10 to stop the addition of trees and abort the model training once the evaluation error stopped improving. The model with the best evaluation error from testing data was selected, and its optimal number of trees was used for the final model.
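The early stopping rule described above can be illustrated with a generic sketch. It is not the XGBoost implementation; it only mimics the logic of halting once the evaluation error has failed to improve for a set number of consecutive rounds (10 here) and reporting the best round. The error values are hypothetical, and lower error is assumed to be better.

```python
def early_stopping(eval_errors, patience=10):
    """Scan per-round evaluation errors and stop after `patience` rounds
    without improvement.

    Returns (best_round, best_error, rounds_run), with rounds counted from 1.
    """
    best_round, best_error = 0, float("inf")
    rounds_run = 0
    for rnd, err in enumerate(eval_errors, start=1):
        rounds_run = rnd
        if err < best_error:
            best_round, best_error = rnd, err
        elif rnd - best_round >= patience:
            break  # no improvement for `patience` consecutive rounds
    return best_round, best_error, rounds_run

# Hypothetical error trace: improves for 3 rounds, stalls, then would improve again
errors = [5.0, 4.0, 3.0] + [3.5] * 12 + [1.0]
best_round, best_error, rounds_run = early_stopping(errors, patience=10)
```

In this trace training halts at round 13, so the later improvement is never seen. That is the trade-off of a small patience value: less overfitting risk and shorter training, at the cost of possibly stopping before a late improvement.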
Machine learning regression models can have biased results as they typically overpredict on the low end and underpredict on the high end (Belitz & Stackelberg, 2021). Three goodness-of-fit (GOF) statistics were used as evaluation errors during the CV grid search process to test whether one statistic could train less biased and more accurate models than the others. The first GOF statistic used was root mean squared error (RMSE):
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - x_i^{*}\right)^{2}}$$
where RMSE is the root mean squared error, $x_i$ are the measured values, $x_i^{*}$ are the predicted values, and $n$ is the number of observations used. The second GOF statistic used was the Nash-Sutcliffe efficiency (NSE):
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(x_i - x_i^{*}\right)^{2}}{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}}$$
where NSE is the Nash-Sutcliffe efficiency value, $n$ is the number of observations used, $x_i$ is the measured value for observation $i$ (SSC in mg/L or BL in tons/day), $x_i^{*}$ is the predicted value for observation $i$ (SSC in mg/L or BL in tons/day), and $\bar{x}$ is the mean of the measured values (SSC in mg/L or BL in tons/day). The third GOF statistic used was the bias-corrected correlation coefficient (bR²):
$$bR^{2} = \begin{cases} |b|\,R^{2}, & b \le 1 \\ |b|^{-1}\,R^{2}, & b > 1 \end{cases}$$
where bR² is the bias-corrected correlation coefficient, $b$ is the slope of the regression line between predicted and measured values, and $R^{2}$ is the coefficient of determination.
TABLE 1 Categories, data type, and definitions used to describe features used in machine learning models.
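The three GOF statistics can be computed with a short sketch. RMSE and NSE follow their standard definitions. For bR², two assumptions are made here that the text does not confirm: the slope b comes from an ordinary least-squares regression of predicted on measured values, and the statistic takes the commonly used piecewise form |b|·R² when |b| ≤ 1 and R²/|b| otherwise. Function names and values are hypothetical.

```python
import math

def rmse(measured, predicted):
    """Root mean squared error."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

def nse(measured, predicted):
    """Nash-Sutcliffe efficiency (1 is a perfect fit)."""
    mean_m = sum(measured) / len(measured)
    residual = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    total = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - residual / total

def br2(measured, predicted):
    """Slope-weighted coefficient of determination (assumed piecewise form)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    sxx = sum((m - mean_m) ** 2 for m in measured)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    sxy = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    b = sxy / sxx               # slope of predicted regressed on measured (assumption)
    r2 = sxy ** 2 / (sxx * syy)  # coefficient of determination
    return abs(b) * r2 if abs(b) <= 1.0 else r2 / abs(b)

# Hypothetical measured values and three sets of predictions
measured = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]   # constant offset of +1
halved = [0.5, 1.0, 1.5, 2.0]    # systematic underprediction
```

The halved predictions are perfectly correlated with the measured values (R² = 1), yet bR² drops to 0.5, which shows why a slope-weighted statistic can flag bias that R² alone would miss.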

| Comparison of streamflow control and final models
To test the accuracy gained with the development of streamflow features (dimensionless streamflow and streamflow slopes), 'control models' were developed for SSC and BL. In each control model, dimensionless streamflow was replaced with the corresponding streamflow value, and streamflow slopes were replaced with categorical features to describe the rising limb (1) or falling limb (0) (Landers et al., 2016). The sediment surrogates were used to compare against the ML models because of the strong relations between the surrogates and SSCs. The sites that used turbidity as a surrogate for SSC had R² greater than 0.95 (Groten, 2017a, 2017b, 2017c). The site that used sediment corrected backscatter as a surrogate for SSC had an R² greater than 0.8 (Groten et al., 2019). The cumulative daily SSLs estimated from the site-specific ML models were considered to be relatively validated if they were between the corresponding upper and lower 90% prediction intervals from each sediment surrogate site.

| Streamflow normalization
Bankfull streamflow values were available for 30 of the 56 sites, so these bankfull streamflows were compared to the 1.5- and 2-year RIs to select the RI most representative of bankfull streamflow. The 2-year RI was selected to normalize streamflow because it had a stronger relation (higher R²) with bankfull streamflow values (Figure 4). Normalizing streamflow gives a systematic way to compare the magnitude of streamflow across sediment regions and river size classes because a value of 1 now represents the 2-year RI for every site (Figure 5).
Normalizing streamflow allowed for larger datasets to train and test the SSC and BL ML models rather than having to subgroup datasets into separate models (i.e., sediment regions or river size class).
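The selection of a recurrence interval for normalization can be sketched as a comparison of R² values. The sketch assumes R² is the squared Pearson correlation between bankfull streamflow and each candidate RI streamflow across sites; the text does not state the regression form, and all names and values below are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

def select_recurrence_interval(bankfull, candidates):
    """Pick the candidate RI whose streamflows relate most strongly to bankfull.

    bankfull   : bankfull streamflow at each site
    candidates : dict mapping RI label -> RI streamflow at the same sites
    Returns (best_label, scores) where scores maps each label to its R².
    """
    scores = {label: r_squared(bankfull, flows) for label, flows in candidates.items()}
    best_label = max(scores, key=scores.get)
    return best_label, scores

# Hypothetical streamflows at four sites
bankfull_q = [100.0, 200.0, 300.0, 400.0]
ri_candidates = {
    "1.5-year": [80.0, 210.0, 250.0, 390.0],
    "2-year": [110.0, 220.0, 330.0, 440.0],
}
best_ri, ri_scores = select_recurrence_interval(bankfull_q, ri_candidates)
```

Once an RI is chosen, dividing each site's streamflow by its RI streamflow puts all sites on a common scale, which is what allows one model to be trained across regions and river sizes.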

| Summary statistics
The sediment transport dataset represented different river sizes in Minnesota's five sediment regions. Summary statistics demonstrate the range of sediment transport across the state (

| Model goodness-of-fit
Three different evaluation error methods were tested when training the models, and the final models were selected for having the best combination of GOF statistics (RMSE, NSE, bR²). The final SSC XGBoost model was trained for RMSE because it was more accurate (lower RMSE, higher NSE) and less biased (higher bR²) than models trained with the other GOF statistics (Table 4). The final BL XGBoost model was trained for bR² because it was more accurate (higher NSE) and less biased (higher bR²), while RMSEs were similar (Table 4).

| Extreme gradient boosting model results
The final SSC and BL XGBoost ML models were able to predict sediment transport at a variety of river sizes across the state of Minnesota. The SSC and BL ML models overpredicted on the low end (approximately less than 10 for SSC and BL) and underpredicted on the high end (approximately greater than 1000 for SSC and BL;
The two most important features were streamflow slope (24-h before) and dimensionless streamflow. When streamflow slope was near zero (streamflow is stable) or negative (streamflow is falling), the feature had a negative SHAP value, indicating lower SSC transport
percent evergreen forest land cover, and mean soil erodibility on agricultural land, respectively). One sample-specific feature ranked 10 (month). One near-channel feature ranked 3 (river size class).
The BL SHAP dependence plot for dimensionless streamflow showed a similar shape to the SSC plot but showed different values on the x-axis as breakpoints (Figures 10b and 8b). When dimensionless streamflow was near zero, the feature generally had a negative SHAP value, similar to the SSC model. When dimensionless streamflow was above 0.75, SHAP values were positive and increased until about two and a half times the 2-year RI, indicating higher BL transport (Figure 10b). Like the SSC dependence plot, the streamflow slope (24-h before) had several high SHAP values that correspond to positive streamflow slopes (Figure 10e), indicating that rising streamflow was

| Comparison of cumulative daily suspended-sediment loads
Comparison of site-specific ML model output to in-situ sediment surrogate model outputs (Table 6) provided the opportunity to validate this ML modelling approach. The site-specific ML cumulative daily SSLs were within the sediment surrogate 90% prediction intervals at all four sites (Table 6). On shorter time intervals (e.g., one to multiple months) the site-specific ML model predicted higher SSLs than the surrogate's upper 90% prediction interval at the Knife River in 2017 and 2018, the Zumbro River in 2017, and the Minnesota River in 2018 (Figure 11). However, the site-specific ML model estimates were within the surrogate's 90% prediction intervals after the shorter time intervals ended (greater than one to multiple months) for the previously mentioned sites and years.

| DISCUSSION
Advancements in data science and ML allowed for enhanced data-driven sediment transport modelling, prediction accuracy, and inter- (Cisty et al., 2021; Francke et al., 2008; Zounemat-Kermani et al., 2020). XGBoost's ability to add custom evaluation metrics and additional parameter tuning helped reduce variance and bias (Table 4)

| Normalizing streamflow
Normalizing streamflow reduced variability in the dataset and allowed for the development of one model for each constituent (SSC and BL) rather than developing multiple models (e.g., developing models for each sediment region or river size class). Developing one model for each constituent allowed for larger datasets to be used for training and testing. The dimensionless streamflow SHAP dependence plots showed the highest SHAP values were near the 2-year RI, which indicates higher sediment transport near bankfull streamflows (Figures 8b and 10b). These results are consistent with bankfull streamflows being the most geomorphically active streamflows in the channel before streamflow spills over its banks and loses energy to the floodplain (Biedenharn et al., 2008; Lane, 1955). Additionally, the observed decrease in SHAP values at very high streamflows could be connected to sediment deposition in the channel and floodplain caused by the river overbanking and subsequent loss of energy to transport sediment. The observed decrease in SHAP values might also be explained by the depletion of upstream sediment sources after long flood durations (Gellis, 2013; Smith & Dragovich, 2009).

| Accounting for hysteresis with streamflow slope
Streamflow slopes (24-h before and after) were features developed from the dimensionless streamflow dataset to provide the model with more insight into the complex relation between streamflow and sediment transport. The streamflow slope provided a rate and direction (rise is a positive value and fall is a negative value) of streamflow in the channel rather than just classifying streamflow's position on the hydrograph with simple categorical features. The SSC and BL ML models learned from the streamflow slope by differentiating the complex rate of change in streamflow events. A streamflow event that is quickly or slowly changing could be related to storm intensity, measured as peakflow divided by total runoff, which is positively correlated with sediment transport (Guy, 1964). As seen from the control model tests (Table 5), the final models' GOF statistics improved considerably compared to the control models.
The results from the ML models suggest that the streamflow slope features helped to reduce uncertainty between streamflow and sediment transport across varying river sizes and sediment regions by supplying the ML models with more information to understand complex relations between sediment source and transport dynamics. This finding directly supports objective number two and connects with previous work that shows ML can make more accurate predictions than SRCs (Cisty et al., 2021;Francke et al., 2008;Zounemat-Kermani et al., 2020).

| Geospatial datasets
Geospatial features in the SSC and BL ML models can be used to make inferences about sediment sources and sediment transport processes. The modelling framework facilitated the use of geospatial features without fully relying on them to predict sediment transport, and highly correlated features were removed to aid in the interpretation
(Figure 8c), which could be connected to other landscape and land-use features such as increased streamflow from tile drainage and erosion (Belmont et al., 2011; Schottler et al., 2014). An unexpected catchment feature in the SSC ML model was percent deciduous forest land cover because the SSC SHAP dependence plot showed an increase in SSC transport with an increase in percent deciduous forest land cover (Figure 8d). A possible explanation is that a higher percentage of deciduous forest land cover only represents the local catchment area around the site, so an increase in SSC transport could be due to other surrounding land-use features represented by the upstream watershed.
Looking more closely at potential sediment sources and grain sizes, BL SHAP values showed that a lower percentage of coarse-textured glacial outwash and glacial lake sediments (Figure 10d) in the catchment was associated with higher BL transport, while the SSC SHAP values showed that a high percentage of clay in the watershed was associated with higher SSC transport (Figure 8e). The percent of coarse-textured glacial outwash and glacial lake sediments in the catchment could be connected to potential sources of BL in the channel; however, a possible explanation is difficult to determine without having better geospatial datasets representing bed material type in the channel. The percent clay in the watershed could be teaching the SSC model about a potential suspended sediment source since fine particles can make up a considerable amount of SSCs. Altogether, these results show that geospatial predictors are helping the ML models account for complex sediment source and transport processes at various scales, which are difficult to account for with SRCs and DSRCs (Atieh et al., 2015; Ellison et al., 2016; Francke et al., 2008; Vaughan, Belmont, et al., 2017).

| Possible model improvements
Further development of features and additional sites and samples could improve ML models. Improvements to ML models could include calculation of higher resolution in-channel or near-channel features that are known to be sediment transport controls. The current models were developed from publicly available datasets that cover the entire state, and their resolution could be too coarse. A more efficient method of locating important features from the vast amount of available geospatial data while accounting for correlation to other features could help model interpretation and prediction accuracy. Additional continuous time-series datasets could be used to add more features like gridded rainfall patterns, precipitation intensities, and antecedent soil moisture calculated for the upstream catchment and watershed (Essou et al., 2016). Alternative methods to calculate dimensionless streamflow could be explored. Using different streamflow slope time intervals could increase prediction accuracy since varying river sizes respond differently (streamflow rising and falling at different time intervals) to snowmelt and storm events. Additional analyses of streamflow data could include a time-since-last-event feature that could teach the model sediment source and storage controls to help better account for hysteresis (Gellis, 2013; Smith & Dragovich, 2009).
ML is a complex and ever-changing field of study; additional work could be done exploring other methods including artificial neural networks (ANN) and, more specifically, hybrid wavelet and neural networks (WANN), which produced accurate results in sediment transport prediction studies (Afan et al., 2016; Khan et al., 2021).
Lastly, the models could be improved with additional sites and additional samples to better represent sediment transport.

| Comparison of streamflow feature accuracy and loads
In-situ sediment surrogates are a proven and accurate method to estimate SSC and were used to relatively validate ML model outputs in this study. This validation showed that this ML approach can be used at sites that did not have physically collected samples available to train the ML models. The results validated the use of the ML models to estimate cumulative daily SSLs. The ML models predicted higher SSLs during shorter time intervals at some sites. These time intervals generally had lower streamflow; ML models tend to overpredict at lower streamflow, as described in Section 3.3.3 (Figure 6), and the surrogate models tend to underpredict at lower streamflow. Future work could include using these sediment surrogate model outputs to test if the ML model is better at accounting for the processes controlling sediment transport as more accurate and representative features are calculated. The sediment surrogate model outputs could also be used to calculate a hysteresis index to quantify rising and falling limb hysteresis trends (Lloyd et al., 2016; Vaughan, Bowden, et al., 2017) to improve the ML models.

| CONCLUSIONS
This research was possible because local, state, and federal natural resource managers realized the importance of collecting physical sediment samples across the state of Minnesota. This study elucidates the potential of supervised ML models paired with geospatial datasets and more accurate streamflow features to increase prediction accuracy and provide a better understanding of the relative roles that landscape, near-channel, and in-stream conditions play in sediment transport. This was achieved through three objectives.
The first objective was achieved by comparing SSLs from trained ML models to SSL datasets from in-situ sediment surrogates at four sites across Minnesota that were not included in model training.
Results from ML models were mostly within the 90% prediction intervals of the surrogates at all four sites, supporting the idea that ML can learn from complex relations and apply those relations to sites with few to no physical samples. The second objective was achieved by the normalization of streamflow by the 2-year RI, which trained the models to learn where the samples were collected in relation to a geomorphically active streamflow, and the streamflow slopes allowed the ML models to learn from changing streamflow conditions before and after sample collection. These developed streamflow features helped account for hysteresis and improved the prediction accuracy of the ML models. The third objective was achieved by using SHAP values to interpret how the ML models were making predictions and learning from the complex relations between sediment transport, streamflow, watershed, catchment, and near-channel features. Finally, these findings are useful for natural resource managers, stream practitioners, and anyone interested in fluvial sediment transport because they can help improve sediment load estimations, enhance restoration design and priority, identify streams that depart from reference conditions, and help evaluate effectiveness of sediment reduction strategies.