Runoff Simulation Under Future Climate Change Conditions: Performance Comparison of Data-Mining Algorithms and Conceptual Models

Yoosefdoost, Icen; Khashei-Siuki, Abbas; Tabari, Hossein; Mohammadrezapour, Omolbani

doi:10.1007/s11269-022-03068-6

Published: 04 March 2022

Water Resources Management, Volume 36, pages 1191–1215 (2022)


Abstract

Water resources in arid and semi-arid regions are sensitive to changes in hydro-climatic variables, especially under climate change, which makes runoff simulation more challenging. This study simulates the inflow to a dam reservoir in an arid region under changing climatic conditions using three data-mining algorithms, namely Artificial Neural Networks (ANNs), Support Vector Machine (SVM), and Gene Expression Programming (GEP), together with the conceptual HYMOD model. Three variables (precipitation and maximum and minimum temperature) were simulated from 30 Coupled Model Intercomparison Project Phase 5 (CMIP5) Global Climate Models (GCMs) for the future period (2020–2040) under the high-end RCP8.5 scenario. The Long Ashton Research Station Weather Generator (LARS-WG) was used for downscaling. The Gamma test and the M test (which tracks the Gamma statistic as the number of data points grows) were applied to identify the best combination of input parameters and the number of training data, respectively. Among the 29 input parameters defined for the models, the Gamma test identified 11 as the most useful for simulating runoff. Based on the reliability of the model error variance estimated by the M test, the data were partitioned into 75% for learning and 25% for testing. A comparison of the runoff simulations showed that the SVM model outperformed the ANN, GEP, and HYMOD models by 3, 5, and 14%, respectively. The SVM model projected a 25% decrease in the mean runoff into the dam reservoir for 2020–2040 compared with the study period (2000–2019). This result underlines the need for sustainable adaptation strategies to protect future water resources in the basin.


Availability of Data and Material

The authors confirm that all data supporting the findings of this study are available from the corresponding author upon request.

Code Availability

The models and code used in this study are available from the corresponding author upon request.

References

Amiri-Ardakani Y, Najafzadeh M (2021) Pipe break rate assessment while considering physical and operational factors: a methodology based on global positioning system and data-driven techniques. Water Resour Manage 35:3703–3720. https://doi.org/10.1007/S11269-021-02911-6
Bayram S, Al-Jibouri S (2016) Efficacy of estimation methods in forecasting building projects’ costs. J Constr Eng Manag. https://doi.org/10.1061/(ASCE)CO.1943-7862.0001183
Behzad M, Asghari K, Eazi M, Palhang M (2009) Generalization performance of support vector machines and neural networks in runoff modeling. Expert Syst Appl 36:7624–7629. https://doi.org/10.1016/j.eswa.2008.09.053
Chakrabortty R, Pal SC, Janizadeh S et al (2021) Impact of climate change on future flood susceptibility: an evaluation based on deep learning algorithms and GCM model. Water Resour Manage 35:4251–4274
Choubin B, Khalighi-Sigaroodi S, Malekian A et al (2014) Drought forecasting in a semi-arid watershed using climate signals: a neuro-fuzzy modeling approach. J Mt Sci 11:1593–1605. https://doi.org/10.1007/s11629-014-3020-6
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297. https://doi.org/10.1007/bf00994018
Darabi H, Mohamadi S, Karimidastenaei Z et al (2021) Prediction of daily suspended sediment load (SSL) using new optimization algorithms and soft computing models. Soft Comput 25:7609–7626
Dawood T, Elwakil E, Novoa HM, Delgado JFG (2021) Toward urban sustainability and clean potable water: Prediction of water quality via artificial neural networks. J Clean Prod 291:125266
Efron B, Tibshirani RJ (1994) An introduction to the bootstrap. CRC Press
Fan YR, Huang W, Huang GH et al (2015) A PCM-based stochastic hydrological model for uncertainty quantification in watershed systems. Stoch Env Res Risk Assess 29:915–927
Ghaith M, Siam A, Li Z, El-Dakhakhni W (2020) Hybrid hydrological data-driven approach for daily streamflow forecasting. J Hydrol Eng 25:04019063. https://doi.org/10.1061/(ASCE)HE.1943-5584.0001866
Hill T, Marquez L, O’Connor M, Remus W (1994) Artificial neural network models for forecasting and decision making. Int J Forecast 10:5–15. https://doi.org/10.1016/0169-2070(94)90045-0
Hosseinzadehtalaei P, Tabari H, Willems P (2020a) Climate change impact on short-duration extreme precipitation and intensity–duration–frequency curves over Europe. J Hydrol 590:125249
Hosseinzadehtalaei P, Tabari H, Willems P (2020b) Satellite-based data driven quantification of pluvial floods over Europe under future climatic and socioeconomic changes. Sci Total Environ 721:137688. https://doi.org/10.1016/j.scitotenv.2020.137688
Islam ARMT, Talukdar S, Mahato S et al (2021) Flood susceptibility modelling using advanced ensemble machine learning models. Geosci Front 12:101075
Jafarzadeh A, Pourreza-Bilondi M, Siuki AK, Moghadam JR (2021) Examination of various feature selection approaches for daily precipitation downscaling in different climates. Water Resour Manage 35:407–427. https://doi.org/10.1007/S11269-020-02701-6
Karandish F, Mousavi SS, Tabari H (2017) Climate change impact on precipitation and cardinal temperatures in different climatic zones in Iran: Analyzing the probable effects on cereal water-use efficiency. Stoch Env Res Risk Assess 31:2121–2146. https://doi.org/10.1007/s00477-016-1355-y
Khan MS, Coulibaly P, Dibike Y (2006) Uncertainty analysis of statistical downscaling methods using Canadian Global Climate Model predictors. Hydrol Process 20:3085–3104. https://doi.org/10.1002/hyp.6084
Kharin V, Flato GM, Zhang X et al (2018) Risks from climate extremes change differently from 1.5°C to 2.0°C depending on rarity. Earth’s Future 6:704–715. https://doi.org/10.1002/2018EF000813
Kundzewicz ZW, Krysanova V, Benestad RE et al (2018) Uncertainty in climate change impacts on water resources. Environ Sci Policy 79:1–8
Loveridge M, Rahman A (2021) Effects of probability-distributed losses on flood estimates using event-based rainfall-runoff models. Water 13:2049
Makkeasorn A, Chang NB, Zhou X (2008) Short-term streamflow forecasting with global climate change implications - a comparative study between genetic programming and neural network models. J Hydrol 352:336–354. https://doi.org/10.1016/j.jhydrol.2008.01.023
Malik A, Kumar A, Kisi O, Shiri J (2019) Evaluating the performance of four different heuristic approaches with Gamma test for daily suspended sediment concentration modeling. Environ Sci Pollut Res 26:22670–22687
Meng E, Huang S, Huang Q et al (2021) A hybrid VMD-SVM model for practical streamflow prediction using an innovative input selection framework. Water Resour Manage 35:1321–1337. https://doi.org/10.1007/S11269-021-02786-7
Mohammadi AA, Yousefi M, Soltani J et al (2018) Using the combined model of gamma test and neuro-fuzzy system for modeling and estimating lead bonds in reservoir sediments. Environ Sci Pollut Res 25:30315–30324
Mohanta A, Pradhan A, Mallick M, Patra KC (2021) Assessment of shear stress distribution in meandering compound channels with differential roughness through various artificial intelligence approach. Water Resour Manage 35:4535–4559. https://doi.org/10.1007/S11269-021-02966-5
Quan Z, Teng J, Sun W et al (2015) Evaluation of the HYMOD model for rainfall–runoff simulation using the GLUE method. Proc Int Assoc Hydrol Sci 368:180–185. https://doi.org/10.5194/piahs-368-180-2015
Ravindran SM, Bhaskaran SKM, Ambat SKN (2021) A deep neural network architecture to model reference evapotranspiration using a single input meteorological parameter. Environ Process 8:1567–1599. https://doi.org/10.1007/S40710-021-00543-X
Remesan R, Shamim MA, Han D, Mathew J (2009) Runoff prediction using an integrated hybrid modelling scheme. J Hydrol 372:48–60. https://doi.org/10.1016/J.JHYDROL.2009.03.034
Rezaeianzadeh M, Stein A, Tabari H et al (2013) Assessment of a conceptual hydrological model and artificial neural networks for daily outflows forecasting. Int J Environ Sci Technol 10:1181–1192
Roy DK (2021) Long short-term memory networks to predict one-step ahead reference evapotranspiration in a subtropical climatic zone. Environ Process 8:911–941
Shoaib M, Shamseldin AY, Melville BW, Khan MM (2015) Runoff forecasting using hybrid Wavelet Gene Expression Programming (WGEP) approach. J Hydrol 527:326–344. https://doi.org/10.1016/j.jhydrol.2015.04.072
Singh VK, Kumar D, Kashyap PS et al (2020) Modelling of soil permeability using different data driven algorithms based on physical properties of soil. J Hydrol 580:124223. https://doi.org/10.1016/j.jhydrol.2019.124223
Tabari H (2020) Climate change impact on flood and extreme precipitation increases with water availability. Sci Rep 10:13768
Tabari H, Kisi O, Ezani A, Talaee PH (2012) SVM, ANFIS, regression and climate based models for reference evapotranspiration modeling using limited climatic data in a semi-arid highland environment. J Hydrol 444:78–89
Tabari H, Willems P (2018) Seasonally varying footprint of climate change on precipitation in the Middle East. Sci Rep 8:2–11
Tayfur G (2021) Empirical, numerical, and soft modelling approaches for non-cohesive sediment transport. Environ Process 8:37–58
Vijay S, Kamaraj K (2021) Prediction of water quality index in drinking water distribution system using activation functions based ANN. Water Resour Manage 35:535–553. https://doi.org/10.1007/S11269-020-02729-8
Wang W, Du Y, Chau K et al (2021) An ensemble hybrid forecasting model for annual runoff based on sample entropy, secondary decomposition, and long short-term memory neural network. Water Resour Manage 2021:1–32. https://doi.org/10.1007/S11269-021-02920-5
Wang Y, Tabari H, Xu Y et al (2019) Unraveling the role of human activities and climate variability in water level changes in the Taihu plain using artificial neural network. Water 11:720
Winsemius HC, Aerts JCJH, van Beek LPH et al (2015) Global drivers of future river flood risk. Nat Clim Change 6:381–385. https://doi.org/10.1038/nclimate2893
YoosefDoost A, Asghari H, Abunuri R, Sadeghian MS (2018a) Comparison of CGCM3, CSIRO MK3 and HADCM3 Models in estimating the effects of climate change on temperature and precipitation in Taleghan Basin. Am J Environ Protect 6:28–34. https://doi.org/10.12691/env-6-1-5
YoosefDoost A, YoosefDoost I, Asghari H, Sadegh Sadeghian MS (2018b) Comparison of HadCM3, CSIRO Mk3 and GFDL CM2.1 in prediction the climate change in Taleghan River Basin. Am J Civil Eng Architect 6:93–100. https://doi.org/10.12691/ajcea-6-3-1
YoosefDoost A, Sadeghian MS, NodeFarahani M, Rasekhi A (2017) Comparison between performance of statistical and Low Cost ARIMA Model with GFDL, CM2. 1 and CGM 3 atmosphere-ocean general circulation models in assessment of the effects of climate change on temperature and precipitation in Taleghan Basin. Am J Water Resour 5:92–99. https://doi.org/10.12691/ajwr-5-4-1
Yousefi Malekshah M, Ghazavi R, Sadatinejad SJ (2019) Evaluating the effect of climate changes on runoff and maximum flood discharge in the dry area (Case Study: Tehran-Karaj Basin). Ecopersia 7:211–221 (In Farsi)
Zhang W et al (2021) Increasing precipitation variability on daily-to-multiyear time scales in a warmer world. Sci Adv 7(31):eabf8021


Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Author information

Authors and Affiliations

Department of Water Engineering, University of Birjand, Birjand, Iran
Icen Yoosefdoost & Abbas Khashei-Siuki
Department of Civil Engineering, KU Leuven, Leuven, Belgium
Hossein Tabari
Department of Sciences and Water Engineering, Gorgan University of Agriculture Science and Natural Resources, Gorgan, Iran
Omolbani Mohammadrezapour


Contributions

All authors contributed to the study’s conception and design. Material preparation, data collection and analysis were performed by Icen Yoosefdoost, Abbas Khashei-Siuki, Hossein Tabari and Omolbani Mohammadrezapour. The first draft of the manuscript was written by Icen Yoosefdoost, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Abbas Khashei-Siuki.

Ethics declarations

Conflicts of Interest

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

1.1 GCM Selection

The monthly precipitation and minimum and maximum temperatures simulated by all GCMs under the RCP8.5 scenario for the base period are compared with observations in Fig. A1. The red dashed and continuous lines show the 95% and 99% confidence intervals of the observations, respectively. A GCM simulation is considered acceptable when it falls within the confidence interval of the observed data. The left column of Fig. A1 compares the mean monthly climate projections of all 30 GCMs, shown as gray shading (see Table A1 for GCM names), with the confidence intervals of the observations. The results show that the accuracy of the GCM simulations varies with the variable (precipitation, maximum and minimum temperature) and the month. The GCMs whose simulations lie within the confidence intervals of the observations are selected for further analysis and shown in the right column of Fig. A1.

The projected changes in precipitation and minimum and maximum temperatures between the historical and future periods are shown in Fig. A2. The highest increases in maximum and minimum temperatures are seen in April (1.6 °C) and December (1.3 °C), respectively. The lowest increases are projected for January (0.5 °C), May (0.5 °C), and November (0.6 °C). The largest uncertainties in the maximum and minimum temperature projections are seen in April, June, and October.

Table A1 All GCMs used in this research. The GCMs selected for precipitation and temperature projections are marked with the index (T) for temperature only, (P) for precipitation only, or (T,P) for both temperature and precipitation

Models
MPI-ESM-LR (T,P)	BCC-CSM1.1 (T,P)	CESM1(CAM5)	EC-EARTH (T)
MPI-ESM-MR	BCC-CSM1.1(m)	CESM1(WACCM) (T,P)	IPSL-CM5A-LR (P)
HadGEM2-ES (P)	BNU-ESM	NorESM1-M (P)	IPSL-CM5A-MR
MRI-CGCM3	CanESM2	NorESM1-ME	FGOALS-g2
GISS-E2-H	CNRM-CM5	HadGEM2-AO	MIROC-ESM
GISS-E2-R (T)	CSIRO-Mk3.6.0	GFDL-CM3	MIROC-ESM-CHEM
CCSM4	FIO-ESM	GFDL-ESM2G
GFDL-ESM2M	HadGEM2-AO	MIROC5

Precipitation is expected to decrease in most months, with the largest decline of about 30% in May. In the last three months of the year and in February, a precipitation increase is projected, with the largest increase of 9% in February. The largest uncertainty in precipitation projections is found in July, where the changes range from a 30% decline to a 20% increase among GCMs.

1.2 Trend Analysis

To determine the trend of the data, two statistical analyses are used. First, the presence of a monotonic increasing or decreasing trend is examined with the nonparametric Mann–Kendall test. Then, the slope of the linear trend is estimated with the nonparametric Sen’s method. Sen’s statistics for the precipitation and temperature time series are shown in Figs. A3, A4, A5, and A6. These figures confirm the presence of trends in the annual and monthly time series of minimum temperature (all months except November and December) and in maximum temperature in April and June. These time series consist of monthly averages with monotonically decreasing trends. The residuals appear to be randomly distributed, which indicates that a linear model is appropriate. The statistical calculations give a high significance level, with narrow angles between the confidence lines. For precipitation, maximum temperature (except in April and June), and minimum temperature in November and December, neither of the two methods is statistically suitable, although the estimated slopes are negative at the studied time scales. Because these series fluctuate strongly at monthly and annual scales, their trends are neither linear nor monotonic (Figs. A3, A4, A5, and A6).
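The two tests named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the basic Mann–Kendall variance formula without corrections for ties or serial correlation, and the function names and the temperature values are hypothetical.

```python
import math
from statistics import median


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) for the Mann-Kendall trend test.

    Assumption: no correction for tied values or serial correlation.
    """
    n = len(series)
    # S counts increasing pairs minus decreasing pairs over all i < j.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p


def sen_slope(series):
    """Median of all pairwise slopes (x_j - x_i) / (j - i) for i < j."""
    n = len(series)
    slopes = [
        (series[j] - series[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    ]
    return median(slopes)


# Hypothetical annual mean temperatures (deg C), for illustration only.
temps = [14.1, 14.3, 14.2, 14.6, 14.5, 14.8, 14.9, 15.1, 15.0, 15.3]
s, z, p = mann_kendall(temps)
print(f"S={s}, Z={z:.2f}, p={p:.4f}, Sen slope={sen_slope(temps):.3f} per step")
```

A small p-value indicates a monotonic trend, and the sign of the Sen slope gives its direction.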

1.3 Reference Evapotranspiration Estimation

The ultimate goal of this research was to forecast runoff under climate change conditions. The input variables include daily precipitation, minimum and maximum temperatures, reference evapotranspiration (ET0), and runoff for 20 years (2020–2040). The ET0 estimates are obtained using the Penman, Penman–Monteith, Wright–Penman, Blaney–Criddle, radiation balance, and Hargreaves models. The results show that the Penman–Monteith model gives the most reliable ET0 estimates in this area. However, it cannot be applied to the future period because relative humidity, net radiation, and wind speed data are lacking. Therefore, future ET0 was estimated from temperature using an empirical relationship (Fig. A7) derived from historical ET0 and temperature data.

Figure A7 shows the correlations and equations obtained for different input–output combinations using several trend-line regressions. Green, red, and black lines illustrate linear, quadratic, and cubic relations of min, max, and average temperature with ET0, respectively. The highest correlation is obtained between maximum temperature and ET0 with a quadratic polynomial function.
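A quadratic relation of this kind can be fitted by ordinary least squares. The sketch below is only an illustration under stated assumptions: the solver (normal equations with Gaussian elimination), the function names, and the Tmax and ET0 values are hypothetical and do not reproduce the coefficients in Fig. A7.

```python
def fit_polynomial(x, y, degree=2):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ..., c_degree] of c0 + c1*x + c2*x^2 + ...
    """
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[float(sum(xi ** (i + j) for xi in x)) for j in range(n)] for i in range(n)]
    b = [float(sum(yi * xi ** i for xi, yi in zip(x, y))) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


# Hypothetical historical pairs (Tmax in deg C, ET0 in mm/day), illustration only.
tmax = [5, 10, 15, 20, 25, 30, 35, 40]
et0 = [1.0, 1.6, 2.5, 3.6, 4.9, 6.5, 8.2, 10.1]
coeffs = fit_polynomial(tmax, et0, degree=2)
print("ET0 at Tmax = 28:", round(predict(coeffs, 28), 2))
```

Once fitted on historical data, the same polynomial can be evaluated on downscaled future temperatures, which is the role the empirical relationship plays in the text.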

1.4 Inputs Selection

Choosing appropriate input data is a prerequisite for developing a forecasting model. Because access to climatic data is limited, the candidate inputs for predicting daily runoff under climate change scenarios are defined from the available parameters. The Gamma test inputs are Tave, Tmin, Tmax, P (precipitation), ET0, and R (runoff), together with all six parameters delayed by one, two, three, and four time steps over the 20-year record; the delays are shown as indices 1 to 4 after each parameter symbol.
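The candidate input set can be enumerated with a short Python sketch. It assumes that the indices 1 to 4 denote lags in time steps and that the unlagged runoff R is excluded because it is the quantity being predicted; the function names are hypothetical.

```python
def lagged_names(variables, max_lag, target):
    """List candidate inputs: each variable at lag 0..max_lag, skipping the
    unlagged target, which is what the model predicts."""
    names = []
    for var in variables:
        for lag in range(max_lag + 1):
            if var == target and lag == 0:
                continue
            names.append(var if lag == 0 else f"{var}{lag}")
    return names


def lagged_rows(series, max_lag):
    """Build one dict per time step t >= max_lag from equal-length series.

    Key 'X' holds series['X'][t]; key 'Xk' holds series['X'][t - k].
    """
    length = len(next(iter(series.values())))
    rows = []
    for t in range(max_lag, length):
        row = {}
        for var, values in series.items():
            for lag in range(max_lag + 1):
                key = var if lag == 0 else f"{var}{lag}"
                row[key] = values[t - lag]
        rows.append(row)
    return rows


variables = ["Tave", "Tmax", "Tmin", "ET0", "P", "R"]
candidates = lagged_names(variables, max_lag=4, target="R")
print(len(candidates))  # 6 variables x 5 lags, minus the target itself: 29
```

Under these assumptions the count matches the 29 candidate inputs examined by the Gamma test.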

According to Table A2, the maximum Gamma statistic is obtained for the 27th combination (which omits R3), indicating that R3 is the most important parameter in the combination: omitting an input raises the Gamma statistic most when that input carries the most information about runoff. After R3, the inputs in order of importance are Tmax1, Tmin3, Tmin4, Tave1, Tmax2, Tmin1, ET0, and Tave2.

Table A2 Selected combinations with higher Gamma values for the initial inputs

Combination number	Rank	Absent parameter	Gamma	Gradient	Standard Error	V-Ratio
0	10	-	0.018234	0.042148	0.007177	0.072935
1	14	Tave4	0.018097	0.043687	0.006757	0.072388
2	13	Tave3	0.018162	0.043621	0.006622	0.072647
3	9	Tave2	0.018452	0.043108	0.007137	0.073808
4	5	Tave1	0.019758	0.041836	0.006242	0.079033
5	27	Tave	0.012948	0.050928	0.006395	0.051791
6	15	Tmax4	0.017814	0.04411	0.004546	0.071256
7	24	Tmax3	0.014837	0.048172	0.004714	0.059347
8	6	Tmax2	0.019243	0.042887	0.006142	0.076971
9	2	Tmax1	0.021062	0.040848	0.007463	0.084246
10	26	Tmax	0.013296	0.050814	0.007465	0.053183
11	4	Tmin4	0.019819	0.042261	0.007033	0.079278
12	3	Tmin3	0.020647	0.041046	0.007229	0.082588
13	21	Tmin2	0.016109	0.046281	0.008033	0.064434
14	7	Tmin1	0.019152	0.043338	0.007156	0.076607
15	20	Tmin	0.016149	0.049204	0.004846	0.064597
16	11	EV4	0.018234	0.042148	0.007177	0.072935
17	12	EV3	0.018234	0.042148	0.007177	0.072935
18	16	EV2	0.017709	0.044322	0.007342	0.070836
19	19	EV1	0.016451	0.047765	0.004804	0.065803
20	8	EV	0.018541	0.04441	0.00705	0.074164
21	28	P4	0.012765	0.056347	0.004487	0.051061
22	17	P3	0.017331	0.047929	0.005202	0.069323
23	25	P2	0.014061	0.05073	0.005285	0.056245
24	23	P1	0.015254	0.047209	0.005479	0.061016
25	30	P	0.007831	0.067974	0.005675	0.031324
26	22	R4	0.015703	0.050782	0.004229	0.062812
27	1	R3	0.026969	0.044033	0.006039	0.107875
28	18	R2	0.016623	0.050932	0.006062	0.066494
29	29	R1	0.010764	0.05958	0.005647	0.043055
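The columns of Table A2 can be illustrated with a minimal pure-Python sketch of the Gamma test. It assumes the common near-neighbour formulation, in which half the mean squared output difference is regressed on the mean squared input distance over the first p_max neighbours; the function name, the p_max default, and the synthetic data are illustrative assumptions, not the authors' software or settings. In Table A2 the V-Ratio column equals Gamma divided by about 0.25, which is consistent with the usual definition V-Ratio = Gamma / Var(y).

```python
import math
import random


def gamma_test(x, y, p_max=10):
    """Return (gamma, gradient, v_ratio) for inputs x (list of tuples) and
    outputs y (list of floats), using a brute-force neighbour search."""
    n = len(x)
    delta = [0.0] * p_max  # mean squared input distance to the p-th neighbour
    gamma = [0.0] * p_max  # half the mean squared output difference
    for i in range(n):
        # Squared distances from point i to every other point, nearest first.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(x[i], x[j])), j)
            for j in range(n) if j != i
        )
        for p in range(p_max):
            d2, j = dists[p]
            delta[p] += d2 / n
            gamma[p] += 0.5 * (y[j] - y[i]) ** 2 / n
    # Least-squares line gamma = gradient * delta + intercept; the intercept
    # is the Gamma statistic (an estimate of the output noise variance).
    mean_d = sum(delta) / p_max
    mean_g = sum(gamma) / p_max
    sxx = sum((d - mean_d) ** 2 for d in delta)
    sxy = sum((d - mean_d) * (g - mean_g) for d, g in zip(delta, gamma))
    gradient = sxy / sxx
    intercept = mean_g - gradient * mean_d
    mean_y = sum(y) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    return intercept, gradient, intercept / var_y


# Synthetic example: a smooth function plus Gaussian noise (illustration only).
random.seed(0)
xs = [(random.uniform(0.0, 2.0 * math.pi),) for _ in range(300)]
ys = [math.sin(v[0]) + random.gauss(0.0, 0.1) for v in xs]
g, a, v = gamma_test(xs, ys)
print(f"Gamma={g:.4f}, gradient={a:.4f}, V-ratio={v:.4f}")
```

Running the test once per candidate subset, each with one input left out, gives a ranking in the spirit of Table A2: the larger the Gamma statistic after an input is removed, the more that input mattered.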

Next, the number of training data is determined. Seventy percent of all available data is used for learning and the rest for testing and evaluation. The minimum number of data required for learning was calculated using the M test (Fig. A8).

The results show that both the Gamma and standard error curves reach a plateau at approximately 2750 data points. To obtain a more accurate result, the M test is applied to simulations with different numbers of learning data, using Local Linear Regression (LLR) as the control model. The simulation was run for every model using the combinations selected in the previous analysis. Table A3 compares the simulations based on three criteria: R², RMSE, and CORR.
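For reference, the three criteria can be computed as follows. The source does not give their formulas, so this sketch assumes the usual definitions: root mean square error, the coefficient of determination 1 − SS_res/SS_tot, and the Pearson correlation coefficient. The function names are hypothetical.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def r_squared(obs, sim):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def corr(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    mean_o = sum(obs) / len(obs)
    mean_s = sum(sim) / len(sim)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    norm_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    norm_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim))
    return cov / (norm_o * norm_s)


# Hypothetical observed and simulated runoff values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
print(rmse(observed, simulated), r_squared(observed, simulated), corr(observed, simulated))
```

The example shows why the criteria are reported together: a constant bias leaves CORR at 1 while RMSE and R² both degrade.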

The results indicate that the values of R² and CORR remain approximately constant beyond about 2450 training data. The minimum RMSE and the maximum R² and CORR values are obtained using 2800 data for training. Based on the results, the optimum number of training data is 2737 (75% of all data), and the remaining 912 are used for evaluation and testing. To choose the best input combination, the Gamma statistics obtained from the 29 input combinations are compared with the case that includes all input parameters. Table A3 presents the findings for some of the input combinations. The number of data required for training was determined using the M test with LLR as the control model.
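The split sizes quoted above follow from simple arithmetic, as this short sketch shows. It assumes a chronological split (earliest records for training), which the source does not state explicitly; the function name is hypothetical.

```python
def chronological_split(records, train_fraction=0.75):
    """Split records into leading training and trailing test portions.

    Assumption: the earliest records are used for training.
    """
    n_train = round(len(records) * train_fraction)
    return records[:n_train], records[n_train:]


# 2737 + 912 = 3649 records, as implied by the counts in the text.
train, test = chronological_split(list(range(3649)))
print(len(train), len(test))  # 2737 912
```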

Table A3 Summary of modelling results from the LLR control model to evaluate the M test

Number of training data	Test RMSE	Test R²	Test CORR	Train RMSE	Train R²	Train CORR
500	0.3925	0.9208	0.9365	0.6055	0.6042	0.7262
2400	0.3192	0.979	0.9691	0.3731	0.863932	0.8885
2450	0.3156	0.9856	0.9721	0.3551	0.881487	0.8985
2700	0.3044	0.9878	0.972095	0.3410	0.89665	0.9076
2750	0.3029	0.9881	0.9730	0.3310	0.907729	0.9135
2800	0.2915	0.9884	0.9733	0.3183	0.920364	0.920
2850	0.3004	0.9872	0.9728	0.3228	0.9140	0.9170
3000	0.3082	0.9881	0.9728	0.3478	0.888332	0.902409
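The statement that 2800 training data give the best scores can be checked directly against the test columns of Table A3. The values below are copied from the table; the variable names are hypothetical.

```python
# (number of training data, test RMSE, test R2, test CORR) from Table A3.
table_a3 = [
    (500, 0.3925, 0.9208, 0.9365),
    (2400, 0.3192, 0.979, 0.9691),
    (2450, 0.3156, 0.9856, 0.9721),
    (2700, 0.3044, 0.9878, 0.972095),
    (2750, 0.3029, 0.9881, 0.9730),
    (2800, 0.2915, 0.9884, 0.9733),
    (2850, 0.3004, 0.9872, 0.9728),
    (3000, 0.3082, 0.9881, 0.9728),
]

best_by_rmse = min(table_a3, key=lambda row: row[1])[0]
best_by_r2 = max(table_a3, key=lambda row: row[2])[0]
best_by_corr = max(table_a3, key=lambda row: row[3])[0]
print(best_by_rmse, best_by_r2, best_by_corr)  # 2800 2800 2800
```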

Table A4 Comparison of LLR and gamma test input combinations

Number of combinations	Combination type	Training RMSE	Training R²	Training CORR	Test RMSE	Test R²	Test CORR
0	11111111111111111111111111111	0.1557	0.9117	0.9471	9.6246	0.0434	0.1048
1	01111111111111111111111111111	0.1505	0.9838	0.9369	2.1047	0.1075	0.3289
2	10111111111111111111111111111	0.1641	0.9923	0.9314	0.5164	0.6586	0.8125
3	11011111111111111111111111111	0.1813	0.9681	0.9396	1.2207	0.1989	0.4026
4	11101111111111111111111111111	0.2084	0.9711	0.9369	1.923	0.0947	0.3049
5	11110111111111111111111111111	0.1639	0.9903	0.9302	3.4479	0.073	0.2313
6	11111011111111111111111111111	0.1642	0.9745	0.9327	0.9039	0.4256	0.661
7	11111101111111111111111111111	0.1555	0.9923	0.9251	1.3766	0.1957	0.4146
8	11111110111111111111111111111	0.1735	0.9764	0.9292	1.2166	0.2839	0.5304
9	11111111011111111111111111111	0.1714	0.9734	0.9277	1.4732	0.3054	0.5529
10	11111111101111111111111111111	0.2368	0.972	0.9218	0.3648	0.799	0.9089
11	11111111110111111111111111111	0.1432	0.9492	0.9269	9.624	0.0151	0.1006
12	11111111111011111111111111111	0.156	0.9529	0.9174	2.0766	0.0936	0.3175
13	11111111111101111111111111111	0.1582	0.9502	0.917	0.5164	0.6464	0.8275
14	11111111111110111111111111111	0.1675	0.9448	0.9183	1.2037	0.1567	0.4119
15	11111111111111011111111111111	0.1886	0.9347	0.9181	1.9077	0.0844	0.3214
16	11111111111111101111111111111	0.149	0.9519	0.9192	3.4294	0.0381	0.2293
17	11111111111111110111111111111	0.1534	0.9435	0.9159	0.9019	0.4008	0.6653
18	11111111111111111011111111111	0.1607	0.948	0.9175	1.3765	0.1538	0.4314
19	11111111111111111101111111111	0.1502	0.9434	0.9235	1.1905	0.2551	0.5277
20	11111111111111111110111111111	0.1702	0.9435	0.9212	1.4699	0.2899	0.5506
21	11111111111111111111011111111	0.2152	0.92	0.9068	0.3568	0.7844	0.9026
22	11111111111111111111101111111	0.155	0.9701	0.9817	3.4204	0.0379	0.2225
23	11111111111111111111110111111	0.1551	0.9698	0.98	0.9163	0.4019	0.6594
24	11111111111111111111111011111	0.1615	0.9679	0.9811	1.3895	0.1592	0.4093
25	11111111111111111111111101111	0.1564	0.9675	0.9829	1.1946	0.2549	0.5156
26	11111111111111111111111110111	0.1712	0.9629	0.9838	1.4735	0.297	0.5614
27	11111111111111111111111111011	0.1488	0.9431	0.9695	0.3737	0.7892	0.9159
28	11111111111111111111111111101	0.1563	0.967	0.9815	2.0849	0.0934	0.3093
29	11111111111111111111111111110	0.1637	0.9684	0.9846	0.3289	0.8301	0.9104
30	00110001101101000001000001110	0.2326	0.9045	0.8935	0.3288	0.8355	0.9169

Table A4 compares the LLR and Gamma test input combinations. The 30th combination is less accurate in the training step but is the most accurate during evaluation. In contrast, combination No. 1 is accurate in the training phase but unacceptably inaccurate in the test step.
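The 29-character strings in the "Combination type" column of Table A4 can be read as inclusion masks. The sketch below assumes that the bit order follows the "Absent parameter" column of Table A2 (combinations 1 to 29) and that "EV" there denotes ET0; this ordering is an inference, not something the source states, although it does reproduce the 11 inputs that the study reports as selected. The names are hypothetical.

```python
# Assumed bit order: the "Absent parameter" column of Table A2, rows 1 to 29,
# with "EV" written as ET0.
INPUT_ORDER = [
    "Tave4", "Tave3", "Tave2", "Tave1", "Tave",
    "Tmax4", "Tmax3", "Tmax2", "Tmax1", "Tmax",
    "Tmin4", "Tmin3", "Tmin2", "Tmin1", "Tmin",
    "ET04", "ET03", "ET02", "ET01", "ET0",
    "P4", "P3", "P2", "P1", "P",
    "R4", "R3", "R2", "R1",
]


def decode_mask(mask, names=INPUT_ORDER):
    """Return the input names whose bit is '1' in the mask string."""
    if len(mask) != len(names):
        raise ValueError("mask length must equal the number of inputs")
    return [name for bit, name in zip(mask, names) if bit == "1"]


# Combination 30 of Table A4 (the Gamma test selection).
selected = decode_mask("00110001101101000001000001110")
print(selected)
```

Under this reading, combinations 1 to 29 each switch off exactly one input, and combination 30 keeps 11 of them.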

Generally, it can be stated that the generated model does not have a predictable nature. In the 29th combination, both the training and test steps have appropriate accuracy. The combination selected by the Gamma test gives the best performance in the simulations. A stepwise multiple linear regression was used to evaluate the Gamma test findings and to select the input parameters in an alternative way. According to Table A5, the best combination obtained with the multiple linear regression method is given in Eq. (6):

$$\begin{aligned}R&=0.373787\times R3+0.070545\times R4+0.163553\times R2 \\&+0.087728\times R1+0.042036\times Tmin-0.02445\times Tmin4 \\&+0.032388\times Tmin3-0.04031\times Tmax+0.006369\times Tmax1 \\&-0.01492\times Tmax2+0.015357\times Tmax4+0.020101\times {ET}_{0}1 \\&-0.011170\times {ET}_{0}-0.023045\times Tave+0.031988\end{aligned}$$

(6)
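Equation (6) can be evaluated directly once the 14 inputs are available. The sketch below copies the coefficients from the equation; the dictionary layout and function name are hypothetical, and the units of the inputs are not given in the source.

```python
# Coefficients of Eq. (6); "ET01" is ET0 delayed by one step.
EQ6_COEFFS = {
    "R3": 0.373787, "R4": 0.070545, "R2": 0.163553, "R1": 0.087728,
    "Tmin": 0.042036, "Tmin4": -0.02445, "Tmin3": 0.032388,
    "Tmax": -0.04031, "Tmax1": 0.006369, "Tmax2": -0.01492,
    "Tmax4": 0.015357, "ET01": 0.020101, "ET0": -0.011170,
    "Tave": -0.023045,
}
EQ6_INTERCEPT = 0.031988


def runoff_eq6(inputs):
    """Evaluate Eq. (6) for a dict that maps each input name to its value."""
    return EQ6_INTERCEPT + sum(c * inputs[name] for name, c in EQ6_COEFFS.items())


# With every input set to zero, only the intercept remains.
zeros = {name: 0.0 for name in EQ6_COEFFS}
print(runoff_eq6(zeros))  # 0.031988
```

The lagged runoff terms carry the largest coefficients, in line with the ranking of R3 as the most important input.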

The LLR simulations based on the inputs chosen by the progressive (stepwise) selection method are compared with the LLR simulations based on the Gamma test inputs. Table A5 shows the results of the progressive selection method and the optimum combination it produces.

Table A5 Results of multiple linear regression applying the progressive method (step-by-step approach)

Number	Variables	R² (%)	Standard error of estimate
1	R1	64.5300	0.4235
2	R1,R2	66.1813	0.4217
3	R1,R2,R3	69.6481	0.4114
4	R1,R2,R3,R4	70.0429	0.4109
5	R1,R2,R3,R4,Tmax	76.9536	0.3991
6	R1,R2,R3,R4,Tmax,Tmax1	78.0523	0.3921
7	R1,R2,R3,R4,Tmax,Tmax1,Tmax2	79.7664	0.3897
8	R1,R2,R3,R4,Tmax,Tmax1,Tmax2,Tmax4	80.0629	0.3672
9	R1,R2,R3,R4,Tmax,Tmax1,Tmax2,Tmax4,Tmin4	82.3377	0.3575
10	R1,R2,R3,R4,Tmax,Tmax1,Tmax2,Tmax4,Tmin4,Tmin3	84.0826	0.3547
11	R1,R2,R3,R4,Tmax,Tmax1,Tmax2,Tmax4,Tmin4,Tmin3,Tmin	85.7705	0.3509
12	R1,R2,R3,R4,Tmax,Tmax1,Tmax2,Tmax4,Tmin4,Tmin3,Tmin,ET0	85.7739	0.3431
13	R1,R2,R3,R4,Tmax,Tmax1,Tmax2,Tmax4,Tmin4,Tmin3,Tmin,ET0,ET01	85.7792	0.336
14*	R1,R2,R3,R4,Tmax,Tmax1,Tmax2,Tmax4,Tmin4,Tmin3,Tmin,ET0,ET01,Tave	89.0485	0.3357

*Best input combination

Table A6 shows that the model built with the LLR progressive selection method is more accurate in the training step than the LLR model built with the Gamma test inputs. In the testing phase, however, the Gamma test input combination is more accurate than the one produced by progressive selection. Hence, the Gamma test provides the most suitable input combination for the model. Remesan et al. (2009) applied the Gamma test to determine the best input combination for runoff modelling in the Brue catchment, England. They found that a combination of five parameters (R3, R2, R1, P, and P1) was the best input combination for the model and, based on the M test, used 1056 of the 2236 available data for training. Therefore, in this study, Tmax1, Tmin3, Tmin4, Tave1, Tmax2, Tmin1, Tave2, R4, R2, R3, and ET0 were used as input variables, and Q_t was the variable predicted by the data-mining models.

Moreover, in the conceptual HYMOD model, daily precipitation, temperature, ET0, and discharge were used as inputs. To enhance the performance of the HYMOD model, the optimum parameters calculated for this region, presented in Table A7, were used. Based on the M test, 3030 data (≃75% of all data) were used for learning and calibration in all four models (SVM, ANN, GEP, and HYMOD), and the remaining 1010 data (≃25% of all data) were used for testing and evaluation.

Table A6 Comparison of results of LLR with the optimal combination obtained from the progressive selection approach and LLR with the optimal combination obtained from the Gamma test

Model	Combination type	Evaluation RMSE	Evaluation R²	Evaluation CORR	Training RMSE	Training R²	Training CORR
Gamma	00110001101101000001000001110	0.3288	0.8354	0.9168	0.2326	0.9045	0.8934
LLR	00001101111100100011000001111	0.3537	0.7858	0.8903	0.0091	0.9702	0.9598

Table A7 Calculated parameters by the optimization algorithms for the HYMOD model for the Karaj river basin

Parameter (unit)	Min	Max	Optimal value
The maximum amount of moisture in the basin (mm)	1	500	500
Spatial changes in soil moisture storage	0.1	2	0.45
B distribution factor of the two moisture reservoirs	0.1	0.99	0.1
Retention time in slow flow tank (day)	0.001	1	0.01
Retention time in rapid flow tank (day)	0.1	0.99	0.1


About this article

Cite this article

Yoosefdoost, I., Khashei-Siuki, A., Tabari, H. et al. Runoff Simulation Under Future Climate Change Conditions: Performance Comparison of Data-Mining Algorithms and Conceptual Models. Water Resour Manage 36, 1191–1215 (2022). https://doi.org/10.1007/s11269-022-03068-6


Received: 30 September 2020
Accepted: 10 January 2022
Published: 04 March 2022
Issue Date: March 2022
DOI: https://doi.org/10.1007/s11269-022-03068-6
