Abstract
Water resources in arid and semi-arid regions are sensitive to changes in hydro-climatic variables, especially under climate change, which makes runoff simulation more challenging. This study aims to simulate the inflow to a dam reservoir in an arid region under changing climatic conditions using three data-mining algorithms, namely Artificial Neural Networks (ANNs), Support Vector Machine (SVM), and Gene Expression Programming (GEP), together with the conceptual HYMOD model. Three variables (precipitation and maximum and minimum temperature) were obtained from 30 Global Climate Models (GCMs) of the Coupled Model Intercomparison Project Phase 5 (CMIP5) for the future period (2020–2040) under the high-end RCP8.5 scenario. The Long Ashton Research Station Weather Generator (LARS-WG) was used as the downscaling method. The Gamma test and the M test (which tracks the Gamma statistic as the number of data points grows, to find how much data is needed for a stable estimate of the error variance) were applied to identify the best combination of input parameters and the amount of training data, respectively. Among the 29 candidate input parameters, the Gamma test identified 11 as the most useful for simulating runoff. Based on the M test's estimates of model error variance, the data were partitioned into 75% for training and 25% for testing. A comparison of the runoff simulations showed that the SVM model outperformed the ANN, GEP, and HYMOD models by 3, 5, and 14%, respectively. The SVM model projected a 25% decrease in the mean inflow to the dam reservoir for 2020–2040 relative to the study period (2000–2019). This result highlights the need for sustainable adaptation strategies to protect future water resources in the basin.
Availability of Data and Material
The authors confirm that all data supporting the findings of this study are available from the corresponding author by request.
Code Availability
The authors confirm that the model and code used in this study can be shared upon request to the corresponding author.
References
Amiri-Ardakani Y, Najafzadeh M (2021) Pipe break rate assessment while considering physical and operational factors: a methodology based on global positioning system and data-driven techniques. Water Resour Manage 35:3703–3720. https://doi.org/10.1007/S11269-021-02911-6
Bayram S, Al-Jibouri S (2016) Efficacy of estimation methods in forecasting building projects’ costs. J Constr Eng Manag. https://doi.org/10.1061/(ASCE)CO.1943-7862.0001183
Behzad M, Asghari K, Eazi M, Palhang M (2009) Generalization performance of support vector machines and neural networks in runoff modeling. Expert Syst Appl 36:7624–7629. https://doi.org/10.1016/j.eswa.2008.09.053
Chakrabortty R, Pal SC, Janizadeh S et al (2021) Impact of climate change on future flood susceptibility: an evaluation based on deep learning algorithms and GCM model. Water Resour Manage 35:4251–4274
Choubin B, Khalighi-Sigaroodi S, Malekian A et al (2014) Drought forecasting in a semi-arid watershed using climate signals: a neuro-fuzzy modeling approach. J Mt Sci 11:1593–1605. https://doi.org/10.1007/s11629-014-3020-6
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297. https://doi.org/10.1007/bf00994018
Darabi H, Mohamadi S, Karimidastenaei Z et al (2021) Prediction of daily suspended sediment load (SSL) using new optimization algorithms and soft computing models. Soft Comput 25:7609–7626
Dawood T, Elwakil E, Novoa HM, Delgado JFG (2021) Toward urban sustainability and clean potable water: Prediction of water quality via artificial neural networks. J Clean Prod 291:125266
Efron B, Tibshirani RJ (1994) An introduction to the bootstrap. CRC Press
Fan YR, Huang W, Huang GH et al (2015) A PCM-based stochastic hydrological model for uncertainty quantification in watershed systems. Stoch Env Res Risk Assess 29:915–927
Ghaith M, Siam A, Li Z, El-Dakhakhni W (2020) Hybrid hydrological data-driven approach for daily streamflow forecasting. J Hydrol Eng 25:04019063. https://doi.org/10.1061/(ASCE)HE.1943-5584.0001866
Hill T, Marquez L, O’Connor M, Remus W (1994) Artificial neural network models for forecasting and decision making. Int J Forecast 10:5–15. https://doi.org/10.1016/0169-2070(94)90045-0
Hosseinzadehtalaei P, Tabari H, Willems P (2020a) Climate change impact on short-duration extreme precipitation and intensity–duration–frequency curves over Europe. J Hydrol 590:125249
Hosseinzadehtalaei P, Tabari H, Willems P (2020b) Satellite-based data driven quantification of pluvial floods over Europe under future climatic and socioeconomic changes. Sci Total Environ 721:137688. https://doi.org/10.1016/j.scitotenv.2020.137688
Islam ARMT, Talukdar S, Mahato S et al (2021) Flood susceptibility modelling using advanced ensemble machine learning models. Geosci Front 12:101075
Jafarzadeh A, Pourreza-Bilondi M, Siuki AK, Moghadam JR (2021) Examination of various feature selection approaches for daily precipitation downscaling in different climates. Water Resour Manage 35:407–427. https://doi.org/10.1007/S11269-020-02701-6
Karandish F, Mousavi SS, Tabari H (2017) Climate change impact on precipitation and cardinal temperatures in different climatic zones in Iran: Analyzing the probable effects on cereal water-use efficiency. Stoch Env Res Risk Assess 31:2121–2146. https://doi.org/10.1007/s00477-016-1355-y
Khan MS, Coulibaly P, Dibike Y (2006) Uncertainty analysis of statistical downscaling methods using Canadian Global Climate Model predictors. Hydrol Process 20:3085–3104. https://doi.org/10.1002/hyp.6084
Kharin V, Flato GM, Zhang X et al (2018) Risks from climate extremes change differently from 1.5°C to 2.0°C depending on rarity. Earth’s Future 6:704–715. https://doi.org/10.1002/2018EF000813
Kundzewicz ZW, Krysanova V, Benestad RE et al (2018) Uncertainty in climate change impacts on water resources. Environ Sci Policy 79:1–8
Loveridge M, Rahman A (2021) Effects of probability-distributed losses on flood estimates using event-based rainfall-runoff models. Water 13:2049
Makkeasorn A, Chang NB, Zhou X (2008) Short-term streamflow forecasting with global climate change implications - a comparative study between genetic programming and neural network models. J Hydrol 352:336–354. https://doi.org/10.1016/j.jhydrol.2008.01.023
Malik A, Kumar A, Kisi O, Shiri J (2019) Evaluating the performance of four different heuristic approaches with Gamma test for daily suspended sediment concentration modeling. Environ Sci Pollut Res 26:22670–22687
Meng E, Huang S, Huang Q et al (2021) A hybrid VMD-SVM model for practical streamflow prediction using an innovative input selection framework. Water Resour Manage 35:1321–1337. https://doi.org/10.1007/S11269-021-02786-7
Mohammadi AA, Yousefi M, Soltani J et al (2018) Using the combined model of gamma test and neuro-fuzzy system for modeling and estimating lead bonds in reservoir sediments. Environ Sci Pollut Res 25:30315–30324
Mohanta A, Pradhan A, Mallick M, Patra KC (2021) Assessment of shear stress distribution in meandering compound channels with differential roughness through various artificial intelligence approach. Water Resour Manage 35:4535–4559. https://doi.org/10.1007/S11269-021-02966-5
Quan Z, Teng J, Sun W et al (2015) Evaluation of the HYMOD model for rainfall–runoff simulation using the GLUE method. Proc Int Assoc Hydrol Sci 368:180–185. https://doi.org/10.5194/piahs-368-180-2015
Ravindran SM, Bhaskaran SKM, Ambat SKN (2021) A deep neural network architecture to model reference evapotranspiration using a single input meteorological parameter. Environ Process 8:1567–1599. https://doi.org/10.1007/S40710-021-00543-X
Remesan R, Shamim MA, Han D, Mathew J (2009) Runoff prediction using an integrated hybrid modelling scheme. J Hydrol 372:48–60. https://doi.org/10.1016/J.JHYDROL.2009.03.034
Rezaeianzadeh M, Stein A, Tabari H et al (2013) Assessment of a conceptual hydrological model and artificial neural networks for daily outflows forecasting. Int J Environ Sci Technol 10:1181–1192
Roy DK (2021) Long short-term memory networks to predict one-step ahead reference evapotranspiration in a subtropical climatic zone. Environ Process 8:911–941
Shoaib M, Shamseldin AY, Melville BW, Khan MM (2015) Runoff forecasting using hybrid Wavelet Gene Expression Programming (WGEP) approach. J Hydrol 527:326–344. https://doi.org/10.1016/j.jhydrol.2015.04.072
Singh VK, Kumar D, Kashyap PS et al (2020) Modelling of soil permeability using different data driven algorithms based on physical properties of soil. J Hydrol 580:124223. https://doi.org/10.1016/j.jhydrol.2019.124223
Tabari H (2020) Climate change impact on flood and extreme precipitation increases with water availability. Sci Rep 10:13768
Tabari H, Kisi O, Ezani A, Talaee PH (2012) SVM, ANFIS, regression and climate based models for reference evapotranspiration modeling using limited climatic data in a semi-arid highland environment. J Hydrol 444:78–89
Tabari H, Willems P (2018) Seasonally varying footprint of climate change on precipitation in the Middle East. Sci Rep 8:2–11
Tayfur G (2021) Empirical, numerical, and soft modelling approaches for non-cohesive sediment transport. Environ Process 8:37–58
Vijay S, Kamaraj K (2021) Prediction of water quality index in drinking water distribution system using activation functions based ann. Water Resour Manage 35:535–553. https://doi.org/10.1007/S11269-020-02729-8
Wang W, Du Y, Chau K et al (2021) An ensemble hybrid forecasting model for annual runoff based on sample entropy, secondary decomposition, and long short-term memory neural network. Water Resour Manage 2021:1–32. https://doi.org/10.1007/S11269-021-02920-5
Wang Y, Tabari H, Xu Y et al (2019) Unraveling the role of human activities and climate variability in water level changes in the Taihu plain using artificial neural network. Water 11:720
Winsemius HC, Aerts JCJH, van Beek LPH et al (2015) Global drivers of future river flood risk. Nat Clim Change 6:381–385. https://doi.org/10.1038/nclimate2893
YoosefDoost A, Asghari H, Abunuri R, Sadeghian MS (2018a) Comparison of CGCM3, CSIRO MK3 and HADCM3 Models in estimating the effects of climate change on temperature and precipitation in Taleghan Basin. Am J Environ Protect 6:28–34. https://doi.org/10.12691/env-6-1-5
YoosefDoost A, YoosefDoost I, Asghari H, Sadegh Sadeghian MS (2018b) Comparison of HadCM3, CSIRO Mk3 and GFDL CM2.1 in prediction the climate change in Taleghan River Basin. Am J Civil Eng Architect 6:93–100. https://doi.org/10.12691/ajcea-6-3-1
YoosefDoost A, Sadeghian MS, NodeFarahani M, Rasekhi A (2017) Comparison between performance of statistical and Low Cost ARIMA Model with GFDL, CM2. 1 and CGM 3 atmosphere-ocean general circulation models in assessment of the effects of climate change on temperature and precipitation in Taleghan Basin. Am J Water Resour 5:92–99. https://doi.org/10.12691/ajwr-5-4-1
Yousefi Malekshah M, Ghazavi R, Sadatinejad SJ (2019) Evaluating the effect of climate changes on runoff and maximum flood discharge in the dry area (Case Study : Tehran-Karaj Basin). Ecopersia 7:211–221 (In Farsi)
Zhang W et al (2021) Increasing precipitation variability on daily-to-multiyear time scales in a warmer world. Sci Adv 7(31):eabf8021
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Author information
Contributions
All authors contributed to the study's conception and design. Material preparation, data collection, and analysis were performed by Icen Yoosefdoost, Abbas Khashei Siuki, Hossein Tabari, and Omolbani Mohammadrezapour. The first draft of the manuscript was written by Icen Yoosefdoost, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflicts of Interest
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A
1.1 GCM Selection
The projected monthly precipitation and minimum and maximum temperatures from all GCMs under the RCP8.5 scenario for the base period are compared with observations in Fig. A1. The red dashed and continuous lines show the 95% and 99% confidence intervals of the observations, respectively. A GCM simulation is considered acceptable when it falls within the confidence interval of the observed data. The left column of Fig. A1 compares the mean monthly climate projections of all 30 GCMs, shown as gray shading (see Table A1 for the GCM names), with the confidence intervals of the observations. The results show that the accuracy of the GCM simulations varies with the variable (precipitation, maximum and minimum temperature) and with the month. The GCMs whose simulations lie within the confidence intervals of the observations are selected for further analysis and are shown in the right column of Fig. A1.
The projected changes in precipitation and minimum and maximum temperatures between the historical and future periods are shown in Fig. A2. The highest increases in maximum and minimum temperatures are seen in April (1.6 °C) and December (1.3 °C), respectively. The lowest increases are projected for January (0.5 °C), May (0.5 °C), and November (0.6 °C). The largest uncertainties in the maximum and minimum temperature projections are seen in April and June, and in October.
Table A1 All GCMs used in this research. The GCMs selected for precipitation and temperature projections are highlighted in bold with the index T (temperature only), P (precipitation only), or T, P (both temperature and precipitation)
Models | | | |
---|---|---|---|
\({\textbf{MPI-ESM-LR}}_\textbf{(T,P)}\) | \({\textbf{BCC-CSM1.1}}_\textbf{(T,P)}\) | CESM1(CAM5) | \({\textbf{EC-EARTH}}_\textbf{(T)}\) |
MPI-ESM-MR | BCC-CSM1.1(m) | \({\textbf{CESM1(WACCM)}}_\textbf{(T,P)}\) | \({\textbf{IPSL-CM5A-LR}}_\textbf{(P)}\) |
\({\textbf{HadGEM2-ES}}_\textbf{(P)}\) | BNU-ESM | \({\textbf{NorESM1-M}}_\textbf{(P)}\) | IPSL-CM5A-MR |
MRI-CGCM3 | CanESM2 | NorESM1-ME | FGOALS-g2 |
GISS-E2-H | CNRM-CM5 | HadGEM2-AO | MIROC-ESM |
\({\textbf{GISS-E2-R}}_\textbf{(T)}\) | CSIRO-Mk3.6.0 | GFDL-CM3 | MIROC-ESM-CHEM |
CCSM4 | FIO-ESM | GFDL-ESM2G | |
GFDL-ESM2M | HadGEM2-AO | MIROC5 | |
Precipitation is expected to decrease in most months, with the largest decline of about 30% in May. An increase is projected for February and for the last three months of the year, with the largest increase of +9% in February. The largest uncertainty in the precipitation projections occurs in July, where the changes range from a 30% decline to a 20% increase among GCMs.
1.2 Trend Analysis
To determine the trend in the data, two statistical analyses are used. First, the presence of a monotonic rising or declining trend is examined with the nonparametric Mann–Kendall test. Then, the slope of the linear trend is estimated with the nonparametric Sen's method. Sen's statistics for the precipitation and temperature time series are shown in Figs. A3, A4, A5, and A6. These figures confirm the presence of trends in the annual and monthly time series of minimum temperature (all months except November and December) and of maximum temperature in April and June. These series of monthly averages show monotonically decreasing trends. The residuals appear to follow a random distribution, indicating that a linear model is appropriate. The statistical calculations give a high significance level, with narrow angles between the confidence lines. Precipitation, maximum temperature (except in April and June), and minimum temperature in November and December are three cases for which neither of the two methods is statistically suitable, although their slopes are negative at the studied time scales. Because these series fluctuate strongly at monthly and annual scales, their trends are neither linear nor monotonic (Figs. A3, A4, A5, and A6).
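The two steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `mann_kendall` and `sen_slope` and the sample series are hypothetical, the Mann–Kendall variance omits the correction for tied values, and the p-value uses the usual normal approximation.

```python
import math
from itertools import combinations


def mann_kendall(x):
    """Two-sided Mann-Kendall test; returns (S, Z, p). No tie correction."""
    n = len(x)
    # S counts concordant minus discordant pairs (i < j).
    s = sum((x[j] > x[i]) - (x[j] < x[i]) for i, j in combinations(range(n), 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


def sen_slope(x):
    """Sen's slope: the median of all pairwise slopes (x[j] - x[i]) / (j - i)."""
    slopes = sorted((x[j] - x[i]) / (j - i)
                    for i, j in combinations(range(len(x)), 2))
    m = len(slopes)
    mid = m // 2
    return slopes[mid] if m % 2 else 0.5 * (slopes[mid - 1] + slopes[mid])


# Hypothetical annual series with a steady decline (illustration only).
series = [10.0, 9.5, 9.1, 8.4, 8.0, 7.7, 7.1, 6.8, 6.2, 6.0]
s_stat, z_stat, p_value = mann_kendall(series)
slope = sen_slope(series)
```

A negative `s_stat` with a small `p_value` indicates a significant declining trend, and `slope` gives its magnitude per time step.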
1.3 Reference Evapotranspiration Estimation
The ultimate goal of this research was to forecast runoff under climate change conditions. The input variables include daily precipitation, minimum and maximum temperatures, reference evapotranspiration (ET0), and runoff for 20 years (2020–2040). The ET0 estimates were obtained using the Penman, Penman–Monteith, Wright–Penman, Blaney–Criddle, radiation balance, and Hargreaves models. The results show that the Penman–Monteith model gives the most reliable ET0 estimates in this area. However, the Penman–Monteith model is not applicable to the future period because relative humidity, net radiation, and wind speed data are lacking. Therefore, future ET0 was estimated from temperature using an empirically computed relationship (Fig. A7) between historical ET0 and temperature data.
Figure A7 shows the correlations and equations obtained for different input–output combinations using various trend-line regressions. Green, red, and black lines illustrate the linear, quadratic, and cubic relations of minimum, maximum, and average temperature with ET0, respectively. The highest correlation is obtained between maximum temperature and ET0 using a quadratic polynomial function.
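A quadratic relation of this kind can be fitted with an ordinary least-squares polynomial. The sketch below assumes NumPy and uses made-up temperature and ET0 values purely for illustration; the coefficients it produces are not those of Fig. A7, and the names `tmax`, `et0`, and `predict_et0` are hypothetical.

```python
import numpy as np

# Hypothetical maximum temperature (deg C) and ET0 (mm/day), illustration only.
tmax = np.array([2.0, 8.0, 14.0, 20.0, 26.0, 32.0, 38.0])
et0 = np.array([0.6, 1.3, 2.3, 3.6, 5.2, 7.1, 9.3])

# Fit ET0 = a*Tmax^2 + b*Tmax + c by least squares.
coeffs = np.polyfit(tmax, et0, deg=2)
predict_et0 = np.poly1d(coeffs)

# Coefficient of determination of the fitted curve.
fitted = predict_et0(tmax)
ss_res = np.sum((et0 - fitted) ** 2)
ss_tot = np.sum((et0 - et0.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

# Estimate ET0 for a projected future maximum temperature.
future_et0 = float(predict_et0(30.0))
```

Once fitted on the historical record, `predict_et0` can be applied to downscaled future temperatures, which is the role the empirical relationship plays in the text.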
1.4 Inputs Selection
Choosing appropriate input data is a prerequisite for developing a forecasting model. Because access to climatic data is limited, the candidate inputs for predicting daily runoff under climate change scenarios are defined from the existing parameters. The Gamma test inputs are Tave, Tmin, Tmax, P (precipitation), ET0 (labelled EV in Table A2), and R (runoff), together with each of these six parameters delayed by one, two, three, and four time steps, denoted by the indices 1 to 4 appended to each parameter symbol.
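As an illustration of how such a candidate input set can be assembled, the sketch below builds the delayed copies of each series with NumPy. It assumes that the undelayed runoff R is the prediction target, which leaves 6 × 5 − 1 = 29 candidate inputs, consistent with the 29 parameters listed in Table A2. The helper `make_lagged` and the random data are hypothetical.

```python
import numpy as np


def make_lagged(series, name, max_lag=4):
    """Return {column name: array} holding the series and its 1..max_lag delays."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    # Rows start at index max_lag so that every delayed value exists.
    cols = {name: series[max_lag:]}
    for lag in range(1, max_lag + 1):
        cols[f"{name}{lag}"] = series[max_lag - lag: n - lag]
    return cols


# Hypothetical daily records, ten time steps long (illustration only).
rng = np.random.default_rng(0)
raw = {v: rng.random(10) for v in ["Tave", "Tmax", "Tmin", "ET0", "P", "R"]}

columns = {}
for var_name, values in raw.items():
    columns.update(make_lagged(values, var_name))

target = columns.pop("R")  # current runoff is the output, not an input
X = np.column_stack(list(columns.values()))
```

Each row of `X` then holds the 29 candidate inputs for one time step, and `target` holds the runoff to be predicted at that step.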
According to Table A2, the maximum Gamma statistic is obtained for the 27th combination (in which R3 is eliminated). Since removing an informative input raises the Gamma statistic, R3 is the most important parameter in the combination. After R3, the parameters Tmax1, Tmin3, Tmin4, Tave1, Tmax2, Tmin1, ET0, and Tave2 follow in order of importance.
Table A2 Selected combinations with higher Gamma amounts for the initial input
Combination number | Rank | Absent parameter | Gamma | Gradient | Standard Error | V-Ratio |
---|---|---|---|---|---|---|
0 | 10 | - | 0.018234 | 0.042148 | 0.007177 | 0.072935 |
1 | 14 | Tave4 | 0.018097 | 0.043687 | 0.006757 | 0.072388 |
2 | 13 | Tave3 | 0.018162 | 0.043621 | 0.006622 | 0.072647 |
3 | 9 | Tave2 | 0.018452 | 0.043108 | 0.007137 | 0.073808 |
4 | 5 | Tave1 | 0.019758 | 0.041836 | 0.006242 | 0.079033 |
5 | 27 | Tave | 0.012948 | 0.050928 | 0.006395 | 0.051791 |
6 | 15 | Tmax4 | 0.017814 | 0.04411 | 0.004546 | 0.071256 |
7 | 24 | Tmax3 | 0.014837 | 0.048172 | 0.004714 | 0.059347 |
8 | 6 | Tmax2 | 0.019243 | 0.042887 | 0.006142 | 0.076971 |
9 | 2 | Tmax1 | 0.021062 | 0.040848 | 0.007463 | 0.084246 |
10 | 26 | Tmax | 0.013296 | 0.050814 | 0.007465 | 0.053183 |
11 | 4 | Tmin4 | 0.019819 | 0.042261 | 0.007033 | 0.079278 |
12 | 3 | Tmin3 | 0.020647 | 0.041046 | 0.007229 | 0.082588 |
13 | 21 | Tmin2 | 0.016109 | 0.046281 | 0.008033 | 0.064434 |
14 | 7 | Tmin1 | 0.019152 | 0.043338 | 0.007156 | 0.076607 |
15 | 20 | Tmin | 0.016149 | 0.049204 | 0.004846 | 0.064597 |
16 | 11 | EV4 | 0.018234 | 0.042148 | 0.007177 | 0.072935 |
17 | 12 | EV3 | 0.018234 | 0.042148 | 0.007177 | 0.072935 |
18 | 16 | EV2 | 0.017709 | 0.044322 | 0.007342 | 0.070836 |
19 | 19 | EV1 | 0.016451 | 0.047765 | 0.004804 | 0.065803 |
20 | 8 | EV | 0.018541 | 0.04441 | 0.00705 | 0.074164 |
21 | 28 | P4 | 0.012765 | 0.056347 | 0.004487 | 0.051061 |
22 | 17 | P3 | 0.017331 | 0.047929 | 0.005202 | 0.069323 |
23 | 25 | P2 | 0.014061 | 0.05073 | 0.005285 | 0.056245 |
24 | 23 | P1 | 0.015254 | 0.047209 | 0.005479 | 0.061016 |
25 | 30 | P | 0.007831 | 0.067974 | 0.005675 | 0.031324 |
26 | 22 | R4 | 0.015703 | 0.050782 | 0.004229 | 0.062812 |
27 | 1 | R3 | 0.026969 | 0.044033 | 0.006039 | 0.107875 |
28 | 18 | R2 | 0.016623 | 0.050932 | 0.006062 | 0.066494 |
29 | 29 | R1 | 0.010764 | 0.05958 | 0.005647 | 0.043055 |
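For readers unfamiliar with the quantities in Table A2, the following is a minimal sketch of the standard Gamma test, written with NumPy and a brute-force neighbour search. It is not the software used in the study. For each of the first p nearest neighbours it computes the mean squared input distance δ(k) and half the mean squared output difference γ(k), then regresses γ on δ: the intercept is the Gamma statistic (an estimate of the output noise variance), the slope is the gradient, and the V-ratio is Gamma divided by the output variance. The function name, the choice p = 10, and the synthetic data are assumptions.

```python
import numpy as np


def gamma_test(X, y, p=10):
    """Return (gamma, gradient, v_ratio) using the p nearest neighbours."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Squared Euclidean distances between all pairs of input vectors.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    order = np.argsort(d2, axis=1)[:, :p]  # indices of the p nearest neighbours
    delta = np.take_along_axis(d2, order, axis=1).mean(axis=0)
    gamma = 0.5 * ((y[order] - y[:, None]) ** 2).mean(axis=0)
    gradient, intercept = np.polyfit(delta, gamma, 1)
    return float(intercept), float(gradient), float(intercept / y.var())


# Synthetic check: a smooth function plus noise of variance 0.2**2 = 0.04.
rng = np.random.default_rng(42)
X_demo = rng.random((600, 2))
y_demo = X_demo[:, 0] + X_demo[:, 1] ** 2 + rng.normal(0.0, 0.2, 600)
gamma_stat, gradient, v_ratio = gamma_test(X_demo, y_demo)
```

On the synthetic data the intercept should land near the injected noise variance. Repeating the call with one input column removed at a time mirrors the leave-one-out comparison reported in Table A2.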
Next, the amount of training data is determined. Seventy percent of all available data is used for learning and the rest for testing and evaluation. The minimum number of data required for learning was calculated using the M test (Fig. A8).
The results show that both the Gamma and standard error curves reach a plateau at approximately 2750 data points. To achieve a more accurate result, the M test is checked through simulations with different amounts of training data, using Local Linear Regression (LLR) as the control model. The simulation was run for every model using the combinations selected in the previous analysis. Table A3 compares the simulations based on three criteria: R2, RMSE, and CORR.
The results indicate that the values of R2 and CORR remain approximately constant beyond 2450 training data. The minimum RMSE and the maximum R2 and CORR values are obtained using 2800 training data. Based on the results, the optimum number of training data is 2737 (75% of all data), and the rest (912) are used for evaluation and testing. To choose the best input combination, the Gamma statistics obtained from the 29 input combinations are compared with the case that includes all input parameters. Table A3 presents the findings for some of the input combinations. The number of data required for training was determined using the M test with LLR as the control model.
Table A3 Summary of modelling results from the LLR control model to evaluate the M test
Number of training data | Test RMSE | Test R2 | Test CORR | Train RMSE | Train R2 | Train CORR |
---|---|---|---|---|---|---|
500 | 0.3925 | 0.9208 | 0.9365 | 0.6055 | 0.6042 | 0.7262 |
2400 | 0.3192 | 0.979 | 0.9691 | 0.3731 | 0.863932 | 0.8885 |
2450 | 0.3156 | 0.9856 | 0.9721 | 0.3551 | 0.881487 | 0.8985 |
2700 | 0.3044 | 0.9878 | 0.972095 | 0.3410 | 0.89665 | 0.9076 |
2750 | 0.3029 | 0.9881 | 0.9730 | 0.3310 | 0.907729 | 0.9135 |
2800 | 0.2915 | 0.9884 | 0.9733 | 0.3183 | 0.920364 | 0.920 |
2850 | 0.3004 | 0.9872 | 0.9728 | 0.3228 | 0.9140 | 0.9170 |
3000 | 0.3082 | 0.9881 | 0.9728 | 0.3478 | 0.888332 | 0.902409 |
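The split sizes and the three criteria reported in Table A3 can be reproduced in outline as follows. The text does not define R2 and CORR, so this sketch assumes the conventional definitions (coefficient of determination and Pearson correlation), and those may differ from the ones the authors used. The function names and the small observed and simulated arrays are hypothetical.

```python
import numpy as np


def split_counts(n_total, train_fraction=0.75):
    """Number of training and testing records for a given training fraction."""
    n_train = int(round(train_fraction * n_total))
    return n_train, n_total - n_train


def evaluate(obs, sim):
    """Return (RMSE, R2, CORR) under the conventional definitions (assumed)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    rmse = float(np.sqrt(np.mean((obs - sim) ** 2)))
    r2 = float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))
    corr = float(np.corrcoef(obs, sim)[0, 1])
    return rmse, r2, corr


# 2737 + 912 = 3649 records, the totals quoted in the text.
n_train, n_test = split_counts(3649)

# Hypothetical observed and simulated runoff values (illustration only).
rmse, r2, corr = evaluate([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

With 3649 records, a 75% training fraction gives 2737 training and 912 testing records, matching the numbers stated above.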
Table A4 Comparison of LLR and gamma test input combinations
Combination number | Combination type | Training RMSE | Training R2 | Training CORR | Test RMSE | Test R2 | Test CORR |
---|---|---|---|---|---|---|---|
0 | 11111111111111111111111111111 | 0.1557 | 0.9117 | 0.9471 | 9.6246 | 0.0434 | 0.1048 |
1 | 01111111111111111111111111111 | 0.1505 | 0.9838 | 0.9369 | 2.1047 | 0.1075 | 0.3289 |
2 | 10111111111111111111111111111 | 0.1641 | 0.9923 | 0.9314 | 0.5164 | 0.6586 | 0.8125 |
3 | 11011111111111111111111111111 | 0.1813 | 0.9681 | 0.9396 | 1.2207 | 0.1989 | 0.4026 |
4 | 11101111111111111111111111111 | 0.2084 | 0.9711 | 0.9369 | 1.923 | 0.0947 | 0.3049 |
5 | 11110111111111111111111111111 | 0.1639 | 0.9903 | 0.9302 | 3.4479 | 0.073 | 0.2313 |
6 | 11111011111111111111111111111 | 0.1642 | 0.9745 | 0.9327 | 0.9039 | 0.4256 | 0.661 |
7 | 11111101111111111111111111111 | 0.1555 | 0.9923 | 0.9251 | 1.3766 | 0.1957 | 0.4146 |
8 | 11111110111111111111111111111 | 0.1735 | 0.9764 | 0.9292 | 1.2166 | 0.2839 | 0.5304 |
9 | 11111111011111111111111111111 | 0.1714 | 0.9734 | 0.9277 | 1.4732 | 0.3054 | 0.5529 |
10 | 11111111101111111111111111111 | 0.2368 | 0.972 | 0.9218 | 0.3648 | 0.799 | 0.9089 |
11 | 11111111110111111111111111111 | 0.1432 | 0.9492 | 0.9269 | 9.624 | 0.0151 | 0.1006 |
12 | 11111111111011111111111111111 | 0.156 | 0.9529 | 0.9174 | 2.0766 | 0.0936 | 0.3175 |
13 | 11111111111101111111111111111 | 0.1582 | 0.9502 | 0.917 | 0.5164 | 0.6464 | 0.8275 |
14 | 11111111111110111111111111111 | 0.1675 | 0.9448 | 0.9183 | 1.2037 | 0.1567 | 0.4119 |
15 | 11111111111111011111111111111 | 0.1886 | 0.9347 | 0.9181 | 1.9077 | 0.0844 | 0.3214 |
16 | 11111111111111101111111111111 | 0.149 | 0.9519 | 0.9192 | 3.4294 | 0.0381 | 0.2293 |
17 | 11111111111111110111111111111 | 0.1534 | 0.9435 | 0.9159 | 0.9019 | 0.4008 | 0.6653 |
18 | 11111111111111111011111111111 | 0.1607 | 0.948 | 0.9175 | 1.3765 | 0.1538 | 0.4314 |
19 | 11111111111111111101111111111 | 0.1502 | 0.9434 | 0.9235 | 1.1905 | 0.2551 | 0.5277 |
20 | 11111111111111111110111111111 | 0.1702 | 0.9435 | 0.9212 | 1.4699 | 0.2899 | 0.5506 |
21 | 11111111111111111111011111111 | 0.2152 | 0.92 | 0.9068 | 0.3568 | 0.7844 | 0.9026 |
22 | 11111111111111111111101111111 | 0.155 | 0.9701 | 0.9817 | 3.4204 | 0.0379 | 0.2225 |
23 | 11111111111111111111110111111 | 0.1551 | 0.9698 | 0.98 | 0.9163 | 0.4019 | 0.6594 |
24 | 11111111111111111111111011111 | 0.1615 | 0.9679 | 0.9811 | 1.3895 | 0.1592 | 0.4093 |
25 | 11111111111111111111111101111 | 0.1564 | 0.9675 | 0.9829 | 1.1946 | 0.2549 | 0.5156 |
26 | 11111111111111111111111110111 | 0.1712 | 0.9629 | 0.9838 | 1.4735 | 0.297 | 0.5614 |
27 | 11111111111111111111111111011 | 0.1488 | 0.9431 | 0.9695 | 0.3737 | 0.7892 | 0.9159 |
28 | 11111111111111111111111111101 | 0.1563 | 0.967 | 0.9815 | 2.0849 | 0.0934 | 0.3093 |
29 | 11111111111111111111111111110 | 0.1637 | 0.9684 | 0.9846 | 0.3289 | 0.8301 | 0.9104 |
30 | 00110001101101000001000001110 | 0.2326 | 0.9045 | 0.8935 | 0.3288 | 0.8355 | 0.9169 |
Table A4 compares the LLR results for the Gamma test input combinations. The 30th combination is less accurate in the training step but shows the best accuracy during evaluation. In contrast, combination No. 1 is accurate in the training phase but has unacceptable accuracy in the test step.
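The "Combination type" strings in Table A4 read as inclusion masks, with 1 meaning that an input is kept and 0 that it is dropped. The sketch below decodes such a mask under the assumption that the 29 positions follow the order of the parameters in Table A2 (Tave4 to Tave, Tmax4 to Tmax, Tmin4 to Tmin, ET0 with delays 4 to 0, P4 to P, then R4 to R1). This ordering is inferred from the tables rather than stated in the text, and the helper `decode_mask` is hypothetical.

```python
# Assumed position order, following the rows of Table A2 (ET0 is labelled EV there).
INPUT_NAMES = [
    f"{var}{lag}" if lag else var
    for var in ["Tave", "Tmax", "Tmin", "ET0", "P"]
    for lag in (4, 3, 2, 1, 0)
] + ["R4", "R3", "R2", "R1"]


def decode_mask(mask):
    """Return the names of the inputs flagged with '1' in a 29-character mask."""
    if len(mask) != len(INPUT_NAMES):
        raise ValueError("mask length must match the number of candidate inputs")
    return [name for name, bit in zip(INPUT_NAMES, mask) if bit == "1"]


# The 30th combination of Table A4 (the Gamma test selection).
selected = decode_mask("00110001101101000001000001110")
```

Under this assumed ordering the 30th mask decodes to eleven inputs: Tave2, Tave1, Tmax2, Tmax1, Tmin4, Tmin3, Tmin1, ET0, R4, R3, and R2.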
In general, the behaviour of the generated models is not predictable. In the 29th combination, both the training and test steps show appropriate accuracy. The combination selected using the Gamma test gives the best performance in the simulations. A multiple linear regression approach (step-by-step approach) was used to evaluate the Gamma test findings, and the input parameters are selected in this way. According to Table A5, the best combination obtained using the multiple linear regression method is presented in Eq. (6):
The simulation results of the LLR approach with inputs chosen by the progressive selection method are compared with those of the LLR with the Gamma test inputs. Table A5 shows the outcomes of simulations by the local linear regression approach and the optimum combination based on the progressive selection method.
Table A5 Results of multiple linear regression applying the progressive method (step-by-step approach)
Number | Variables | R2 (%) | Standard error of estimate |
---|---|---|---|
1 | R1 | 64.5300 | 0.4235 |
2 | R1,R2 | 66.1813 | 0.4217 |
3 | R1,R2,R3 | 69.6481 | 0.4114 |
4 | R1,R2,R3,R4 | 70.0429 | 0.4109 |
5 | R1,R2,R3,R4,Tmax | 76.9536 | 0.3991 |
6 | R1,R2,R3,R4,Tmax,Tmax1 | 78.0523 | 0.3921 |
7 | R1,R2,R3,R4,Tmax,Tmax1,Tmax2 | 79.7664 | 0.3897 |
8 | R1,R2,R3,R4,Tmax,Tmax1,Tmax2,Tmax4 | 80.0629 | 0.3672 |
9 | R1,R2,R3,R4,Tmax,Tmax1,Tmax2,Tmax4,Tmin4 | 82.3377 | 0.3575 |
10 | R1,R2,R3,R4,Tmax,Tmax1,Tmax2,Tmax4,Tmin4,Tmin3 | 84.0826 | 0.3547 |
11 | R1,R2,R3,R4,Tmax,Tmax1,Tmax2,Tmax4,Tmin4,Tmin3,Tmin | 85.7705 | 0.3509 |
12 | R1,R2,R3,R4,Tmax,Tmax1,Tmax2,Tmax4,Tmin4,Tmin3,Tmin,ET0 | 85.7739 | 0.3431 |
13 | R1,R2,R3,R4,Tmax,Tmax1,Tmax2,Tmax4,Tmin4,Tmin3,Tmin,ET0,ET01 | 85.7792 | 0.336 |
14* | R1,R2,R3,R4,Tmax,Tmax1,Tmax2,Tmax4,Tmin4,Tmin3,Tmin,ET0,ET01,Tave | 89.0485 | 0.3357 |
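Table A5 adds one variable per step and reports the resulting R2, which is the pattern produced by forward stepwise selection. The sketch below shows one common form of that procedure with NumPy: at each step it adds the candidate that gives the largest R2 of an ordinary least-squares fit. The entry criterion used in the study is not stated, so this greedy R2 rule, the function names, and the synthetic data are all assumptions.

```python
import numpy as np


def r_squared(X, y):
    """R2 of an ordinary least-squares fit of y on X with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(1.0 - resid @ resid / np.sum((y - y.mean()) ** 2))


def forward_selection(X, y, names, n_steps):
    """Greedy forward selection; returns [(chosen names, R2)] after each step."""
    chosen, history = [], []
    remaining = list(range(X.shape[1]))
    for _ in range(n_steps):
        best = max(remaining, key=lambda j: r_squared(X[:, chosen + [j]], y))
        chosen.append(best)
        remaining.remove(best)
        history.append(([names[j] for j in chosen], r_squared(X[:, chosen], y)))
    return history


# Synthetic data: y depends strongly on column "a" and weakly on column "c".
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
y_demo = 3.0 * X_demo[:, 0] + 1.0 * X_demo[:, 2] + rng.normal(0.0, 0.1, 200)
steps = forward_selection(X_demo, y_demo, ["a", "b", "c", "d"], n_steps=3)
```

As in Table A5, the R2 values in `steps` can only stay level or rise as variables are added, which is why a separate test set is needed to judge the selected combination.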
Table A6 shows that the model generated with the LLR progressive selection method was more accurate in the training step than the LLR with the Gamma test inputs. In the testing phase, however, the Gamma test input combination was more accurate than the one produced by the progressive selection approach. Hence, the Gamma test provides the most suitable input combination for the model. Remesan et al. (2009) applied the Gamma test to determine the best input combination for runoff modelling in the Brue catchment, England. They found that the combination of five parameters, R3, R2, R1, P, and P1, was the best input combination for the model. Based on the M test, they also used 1056 of the 2236 available data for the training step. Therefore, in this study, Tmax1, Tmin3, Tmin4, Tave1, Tmax2, Tmin1, Tave2, R4, R2, R3, and ET0 were used as input variables, and Qt was the variable predicted by the data-mining models.
Moreover, in the conceptual HYMOD model, daily precipitation, temperature, ET0, and discharge were used as inputs. To enhance the performance of the HYMOD model, the optimum parameters calculated for this region, presented in Table A7, were used. Based on the M test, the number of data used for learning and calibration in all four models (SVM, ANN, GEP, and HYMOD) was 3030 (≃75% of all data), and the remaining 1010 data (≃25% of all data) were used for testing and evaluation.
Table A6 Comparison of results of LLR with the optimal combination obtained from the progressive selection approach and LLR with the optimal combination obtained from the Gamma test
Model | Combination type | Evaluation RMSE | Evaluation R2 | Evaluation CORR | Training RMSE | Training R2 | Training CORR |
---|---|---|---|---|---|---|---|
Gamma | 00110001101101000001000001110 | 0.3288 | 0.8354 | 0.9168 | 0.2326 | 0.9045 | 0.8934 |
LLR | 00001101111100100011000001111 | 0.3537 | 0.7858 | 0.8903 | 0.0091 | 0.9702 | 0.9598 |
Table A7 Parameters of the HYMOD model calculated by the optimization algorithms for the Karaj river basin
Parameter (unit) | Min | Max | Optimal value |
---|---|---|---|
The maximum amount of moisture in the basin (mm) | 1 | 500 | 500 |
Spatial changes in soil moisture storage | 0.1 | 2 | 0.45 |
B distribution factor of the two moisture reservoirs | 0.1 | 0.99 | 0.1 |
Retention time in slow flow tank (day) | 0.001 | 1 | 0.01 |
Retention time in rapid flow tank (day) | 0.1 | 0.99 | 0.1 |
Cite this article
Yoosefdoost, I., Khashei-Siuki, A., Tabari, H. et al. Runoff Simulation Under Future Climate Change Conditions: Performance Comparison of Data-Mining Algorithms and Conceptual Models. Water Resour Manage 36, 1191–1215 (2022). https://doi.org/10.1007/s11269-022-03068-6
DOI: https://doi.org/10.1007/s11269-022-03068-6