Runoff Simulation Under Future Climate Change Conditions: Performance Comparison of Data-Mining Algorithms and Conceptual Models

Water Resources Management

Abstract

Water resources in arid and semi-arid regions are sensitive to changes in hydro-climatic variables, especially under climate change, which makes runoff simulation more challenging. This study simulates the inflow to a dam reservoir in an arid region under changing climatic conditions using three data-mining algorithms, namely Artificial Neural Networks (ANNs), Support Vector Machine (SVM), and Gene Expression Programming (GEP), together with the conceptual HYMOD model. Three variables (precipitation, maximum temperature, and minimum temperature) were obtained from 30 Global Climate Models (GCMs) of the Coupled Model Intercomparison Project Phase 5 (CMIP5) for the future period (2020–2040) under the high-end RCP8.5 scenario. The Long Ashton Research Station Weather Generator (LARS-WG) was used as the downscaling method. The Gamma test and the M test (which tracks the Gamma statistic as the number of data points increases, to show how many data are needed for a reliable estimate of the error variance) were applied to identify the best combination of input parameters and the required number of input data, respectively. Among the 29 input parameters defined for the models, the Gamma test identified the 11 parameters best suited to simulating runoff. Based on the reliability of the model error variance estimated by the M test, the data were partitioned into 75% for learning and 25% for testing and verification. A comparison of the runoff simulations showed that the SVM model outperformed the ANN, GEP, and HYMOD models by 3, 5, and 14%, respectively. The SVM model projected a 25% decrease in the mean inflow to the dam reservoir for the 2020–2040 period compared with the study period (2000–2019). This result underscores the need for sustainable adaptation strategies to protect future water resources in the basin.


Availability of Data and Material

The authors confirm that all data supporting the findings of this study are available from the corresponding author by request.

Code Availability

The authors confirm that the model and code used in this study can be shared on request to the corresponding author.

References

  • Amiri-Ardakani Y, Najafzadeh M (2021) Pipe break rate assessment while considering physical and operational factors: a methodology based on global positioning system and data-driven techniques. Water Resour Manage 35:3703–3720. https://doi.org/10.1007/S11269-021-02911-6

  • Bayram S, Al-Jibouri S (2016) Efficacy of estimation methods in forecasting building projects’ costs. J Constr Eng Manag. https://doi.org/10.1061/(ASCE)CO.1943-7862.0001183

  • Behzad M, Asghari K, Eazi M, Palhang M (2009) Generalization performance of support vector machines and neural networks in runoff modeling. Expert Syst Appl 36:7624–7629. https://doi.org/10.1016/j.eswa.2008.09.053

  • Chakrabortty R, Pal SC, Janizadeh S et al (2021) Impact of climate change on future flood susceptibility: an evaluation based on deep learning algorithms and GCM model. Water Resour Manage 35:4251–4274

  • Choubin B, Khalighi-Sigaroodi S, Malekian A et al (2014) Drought forecasting in a semi-arid watershed using climate signals: a neuro-fuzzy modeling approach. J Mt Sci 11:1593–1605. https://doi.org/10.1007/s11629-014-3020-6

  • Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297. https://doi.org/10.1007/bf00994018

  • Darabi H, Mohamadi S, Karimidastenaei Z et al (2021) Prediction of daily suspended sediment load (SSL) using new optimization algorithms and soft computing models. Soft Comput 25:7609–7626

  • Dawood T, Elwakil E, Novoa HM, Delgado JFG (2021) Toward urban sustainability and clean potable water: Prediction of water quality via artificial neural networks. J Clean Prod 291:125266

  • Efron B, Tibshirani RJ (1994) An introduction to the bootstrap. CRC Press

  • Fan YR, Huang W, Huang GH et al (2015) A PCM-based stochastic hydrological model for uncertainty quantification in watershed systems. Stoch Env Res Risk Assess 29:915–927

  • Ghaith M, Siam A, Li Z, El-Dakhakhni W (2020) Hybrid hydrological data-driven approach for daily streamflow forecasting. J Hydrol Eng 25:04019063. https://doi.org/10.1061/(ASCE)HE.1943-5584.0001866

  • Hill T, Marquez L, O’Connor M, Remus W (1994) Artificial neural network models for forecasting and decision making. Int J Forecast 10:5–15. https://doi.org/10.1016/0169-2070(94)90045-0

  • Hosseinzadehtalaei P, Tabari H, Willems P (2020a) Climate change impact on short-duration extreme precipitation and intensity–duration–frequency curves over Europe. J Hydrol 590:125249

  • Hosseinzadehtalaei P, Tabari H, Willems P (2020b) Satellite-based data driven quantification of pluvial floods over Europe under future climatic and socioeconomic changes. Sci Total Environ 721:137688. https://doi.org/10.1016/j.scitotenv.2020.137688

  • Islam ARMT, Talukdar S, Mahato S et al (2021) Flood susceptibility modelling using advanced ensemble machine learning models. Geosci Front 12:101075

  • Jafarzadeh A, Pourreza-Bilondi M, Siuki AK, Moghadam JR (2021) Examination of various feature selection approaches for daily precipitation downscaling in different climates. Water Resour Manage 35:407–427. https://doi.org/10.1007/S11269-020-02701-6

  • Karandish F, Mousavi SS, Tabari H (2017) Climate change impact on precipitation and cardinal temperatures in different climatic zones in Iran: Analyzing the probable effects on cereal water-use efficiency. Stoch Env Res Risk Assess 31:2121–2146. https://doi.org/10.1007/s00477-016-1355-y

  • Khan MS, Coulibaly P, Dibike Y (2006) Uncertainty analysis of statistical downscaling methods using Canadian Global Climate Model predictors. Hydrol Process 20:3085–3104. https://doi.org/10.1002/hyp.6084

  • Kharin V, Flato GM, Zhang X et al (2018) Risks from climate extremes change differently from 1.5°C to 2.0°C depending on rarity. Earth’s Future 6:704–715. https://doi.org/10.1002/2018EF000813

  • Kundzewicz ZW, Krysanova V, Benestad RE et al (2018) Uncertainty in climate change impacts on water resources. Environ Sci Policy 79:1–8

  • Loveridge M, Rahman A (2021) Effects of probability-distributed losses on flood estimates using event-based rainfall-runoff models. Water 13:2049

  • Makkeasorn A, Chang NB, Zhou X (2008) Short-term streamflow forecasting with global climate change implications - a comparative study between genetic programming and neural network models. J Hydrol 352:336–354. https://doi.org/10.1016/j.jhydrol.2008.01.023

  • Malik A, Kumar A, Kisi O, Shiri J (2019) Evaluating the performance of four different heuristic approaches with Gamma test for daily suspended sediment concentration modeling. Environ Sci Pollut Res 26:22670–22687

  • Meng E, Huang S, Huang Q et al (2021) A hybrid VMD-SVM model for practical streamflow prediction using an innovative input selection framework. Water Resour Manage 35:1321–1337. https://doi.org/10.1007/S11269-021-02786-7

  • Mohammadi AA, Yousefi M, Soltani J et al (2018) Using the combined model of gamma test and neuro-fuzzy system for modeling and estimating lead bonds in reservoir sediments. Environ Sci Pollut Res 25:30315–30324

  • Mohanta A, Pradhan A, Mallick M, Patra KC (2021) Assessment of shear stress distribution in meandering compound channels with differential roughness through various artificial intelligence approach. Water Resour Manage 35:4535–4559. https://doi.org/10.1007/S11269-021-02966-5

  • Quan Z, Teng J, Sun W et al (2015) Evaluation of the HYMOD model for rainfall–runoff simulation using the GLUE method. Proc Int Assoc Hydrol Sci 368:180–185. https://doi.org/10.5194/piahs-368-180-2015

  • Ravindran SM, Bhaskaran SKM, Ambat SKN (2021) A deep neural network architecture to model reference evapotranspiration using a single input meteorological parameter. Environ Process 8:1567–1599. https://doi.org/10.1007/S40710-021-00543-X

  • Remesan R, Shamim MA, Han D, Mathew J (2009) Runoff prediction using an integrated hybrid modelling scheme. J Hydrol 372:48–60. https://doi.org/10.1016/J.JHYDROL.2009.03.034

  • Rezaeianzadeh M, Stein A, Tabari H et al (2013) Assessment of a conceptual hydrological model and artificial neural networks for daily outflows forecasting. Int J Environ Sci Technol 10:1181–1192

  • Roy DK (2021) Long short-term memory networks to predict one-step ahead reference evapotranspiration in a subtropical climatic zone. Environ Process 8:911–941

  • Shoaib M, Shamseldin AY, Melville BW, Khan MM (2015) Runoff forecasting using hybrid Wavelet Gene Expression Programming (WGEP) approach. J Hydrol 527:326–344. https://doi.org/10.1016/j.jhydrol.2015.04.072

  • Singh VK, Kumar D, Kashyap PS et al (2020) Modelling of soil permeability using different data driven algorithms based on physical properties of soil. J Hydrol 580:124223. https://doi.org/10.1016/j.jhydrol.2019.124223

  • Tabari H (2020) Climate change impact on flood and extreme precipitation increases with water availability. Sci Rep 10:13768

  • Tabari H, Kisi O, Ezani A, Talaee PH (2012) SVM, ANFIS, regression and climate based models for reference evapotranspiration modeling using limited climatic data in a semi-arid highland environment. J Hydrol 444:78–89

  • Tabari H, Willems P (2018) Seasonally varying footprint of climate change on precipitation in the Middle East. Sci Rep 8:2–11

  • Tayfur G (2021) Empirical, numerical, and soft modelling approaches for non-cohesive sediment transport. Environ Process 8:37–58

  • Vijay S, Kamaraj K (2021) Prediction of water quality index in drinking water distribution system using activation functions based ANN. Water Resour Manage 35:535–553. https://doi.org/10.1007/S11269-020-02729-8

  • Wang W, Du Y, Chau K et al (2021) An ensemble hybrid forecasting model for annual runoff based on sample entropy, secondary decomposition, and long short-term memory neural network. Water Resour Manage 2021:1–32. https://doi.org/10.1007/S11269-021-02920-5

  • Wang Y, Tabari H, Xu Y et al (2019) Unraveling the role of human activities and climate variability in water level changes in the Taihu plain using artificial neural network. Water 11:720

  • Winsemius HC, Aerts JCJH, van Beek LPH et al (2015) Global drivers of future river flood risk. Nat Clim Change 6:381–385. https://doi.org/10.1038/nclimate2893

  • YoosefDoost A, Asghari H, Abunuri R, Sadeghian MS (2018a) Comparison of CGCM3, CSIRO MK3 and HADCM3 Models in estimating the effects of climate change on temperature and precipitation in Taleghan Basin. Am J Environ Protect 6:28–34. https://doi.org/10.12691/env-6-1-5

  • YoosefDoost A, YoosefDoost I, Asghari H, Sadegh Sadeghian MS (2018b) Comparison of HadCM3, CSIRO Mk3 and GFDL CM2.1 in prediction the climate change in Taleghan River Basin. Am J Civil Eng Architect 6:93–100. https://doi.org/10.12691/ajcea-6-3-1

  • YoosefDoost A, Sadeghian MS, NodeFarahani M, Rasekhi A (2017) Comparison between performance of statistical and Low Cost ARIMA Model with GFDL, CM2. 1 and CGM 3 atmosphere-ocean general circulation models in assessment of the effects of climate change on temperature and precipitation in Taleghan Basin. Am J Water Resour 5:92–99. https://doi.org/10.12691/ajwr-5-4-1

  • Yousefi Malekshah M, Ghazavi R, Sadatinejad SJ (2019) Evaluating the effect of climate changes on runoff and maximum flood discharge in the dry area (Case Study: Tehran-Karaj Basin). Ecopersia 7:211–221 (In Farsi)

  • Zhang W et al (2021) Increasing precipitation variability on daily-to-multiyear time scales in a warmer world. Sci Adv 7(31):eabf8021

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Author information

Contributions

All authors contributed to the study's conception and design. Material preparation, data collection, and analysis were performed by Icen Yoosefdoost, Abbas Khashei Siuki, Hossein Tabari, and Omolbani Mohammadrezapour. The first draft of the manuscript was written by Icen Yoosefdoost, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Abbas Khashei-Siuki.

Ethics declarations

Conflicts of Interest

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

1.1 GCM Selection

The monthly precipitation and the minimum and maximum temperatures simulated by all GCMs under the RCP8.5 scenario for the base period are compared with observations in Fig. A1. The red dashed and continuous lines show the 95% and 99% confidence intervals of the observations, respectively. A GCM simulation is considered acceptable when it falls within the confidence interval of the observed data. The left column of Fig. A1 compares the mean monthly simulations of all 30 GCMs, shown as gray shading (see Table A1 for the GCM names), with the confidence intervals of the observed data. The results show that the accuracy of the GCM simulations varies with the variable (precipitation, maximum temperature, and minimum temperature) and with the month. The GCMs whose simulations lie within the confidence intervals of the observations are selected for further analysis and are shown in the right column of Fig. A1.

Fig. A1 Comparison of the observed data and their confidence intervals with the GCM simulations for the period 1985–2005
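The acceptance rule described above can be illustrated with a short sketch. The source does not say how the confidence intervals of the observations were computed or how many months a GCM must pass, so the normal-approximation interval, the function names, and the sample values below are all assumptions made for illustration.

```python
from statistics import NormalDist, mean, stdev

def confidence_interval(samples, level):
    """Normal-approximation confidence interval for the mean of `samples`.

    Assumption: the appendix does not state which interval was used.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(samples) / len(samples) ** 0.5
    centre = mean(samples)
    return centre - half_width, centre + half_width

def fraction_of_months_within(obs_by_month, gcm_monthly_mean, level=0.99):
    """Share of months in which the GCM mean lies inside the observed interval."""
    hits = 0
    for month, yearly_values in obs_by_month.items():
        low, high = confidence_interval(yearly_values, level)
        hits += low <= gcm_monthly_mean[month] <= high
    return hits / len(obs_by_month)

# Hypothetical observed monthly means for two months over five years.
observed = {1: [9.0, 10.0, 11.0, 12.0, 13.0], 2: [19.0, 20.0, 21.0, 22.0, 23.0]}
print(fraction_of_months_within(observed, {1: 11.5, 2: 25.0}))
```

The function returns a fraction rather than a yes/no decision, which leaves the acceptance threshold to the analyst.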

The projected changes in precipitation and in the minimum and maximum temperatures between the historical and future periods are shown in Fig. A2. The highest increases in maximum and minimum temperatures are seen in April (1.6 °C) and December (1.3 °C), respectively. The lowest increases are projected for January (0.5 °C), May (0.5 °C), and November (0.6 °C). The largest uncertainties in the maximum and minimum temperature projections are seen in April, June, and October.

Table A1 All GCMs used in this research. The GCMs selected for the precipitation and temperature projections are highlighted in bold with the index T (temperature only), P (precipitation only), or T, P (both temperature and precipitation)

| Models | | | |
|---|---|---|---|
| \({\textbf{MPI-ESM-LR}}_\textbf{(T,P)}\) | \({\textbf{BCC-CSM1.1}}_\textbf{(T,P)}\) | CESM1(CAM5) | \({\textbf{EC-EARTH}}_\textbf{(T)}\) |
| MPI-ESM-MR | BCC-CSM1.1(m) | \({\textbf{CESM1(WACCM)}}_\textbf{(T,P)}\) | \({\textbf{IPSL-CM5A-LR}}_\textbf{(P)}\) |
| \({\textbf{HadGEM2-ES}}_\textbf{(P)}\) | BNU-ESM | \({\textbf{NorESM1-M}}_\textbf{(P)}\) | IPSL-CM5A-MR |
| MRI-CGCM3 | CanESM2 | NorESM1-ME | FGOALS-g2 |
| GISS-E2-H | CNRM-CM5 | HadGEM2-AO | MIROC-ESM |
| \({\textbf{GISS-E2-R}}_\textbf{(T)}\) | CSIRO-Mk3.6.0 | GFDL-CM3 | MIROC-ESM-CHEM |
| CCSM4 | FIO-ESM | GFDL-ESM2G | |
| GFDL-ESM2M | HadGEM2-AO | MIROC5 | |

Precipitation is expected to decrease in most months, with the largest decline of about 30% in May. In February and the last three months of the year, a precipitation increase is projected, with the largest increase of 9% in February. The largest uncertainty in the precipitation projections is found in July, when the changes vary among GCMs from a 30% decline to a 20% increase.

Fig. A2 Projected changes in monthly precipitation and temperature by the selected GCMs under the RCP8.5 scenario

1.2 Trend Analysis

Two statistical analyses are used to determine the trends in the data. First, the presence of a monotonic rising or declining trend is examined with the nonparametric Mann–Kendall test. Then, the slope of the linear trend is estimated with the nonparametric Sen's method. Sen's statistics for the precipitation and temperature time series are shown in Figs. A3, A4, A5, and A6. These figures confirm the presence of trends in the annual and monthly time series of minimum temperature (all months except November and December) and of maximum temperature in April and June. These time series consist of monthly averages with monotonically decreasing trends. The residuals appear to follow a random distribution, which indicates that a linear model is appropriate. The statistical calculations give a high significance level, with narrow angles between the confidence lines. For precipitation, maximum temperature (except in April and June), and minimum temperature in November and December, neither of the two methods is statistically suitable, although the fitted slopes are negative at the studied time scales. Because these series fluctuate strongly at the monthly and annual scales, their trends are neither linear nor monotonic (Figs. A3, A4, A5, and A6).
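The two trend statistics named above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it uses the textbook Mann–Kendall variance without a correction for tied values, a two-sided normal approximation for the p-value, and function names chosen here for convenience.

```python
import math
from itertools import combinations
from statistics import median

def mann_kendall(series):
    """Return (S, Z, two-sided p-value) of the Mann-Kendall trend test.

    Assumption: no tied values, so the variance has no tie correction.
    """
    n = len(series)
    # S counts later-minus-earlier sign differences over all pairs i < j.
    s = sum((later > earlier) - (later < earlier)
            for earlier, later in combinations(series, 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p_value

def sens_slope(series):
    """Sen's slope: the median of all pairwise slopes (unit time spacing)."""
    return median((series[j] - series[i]) / (j - i)
                  for i, j in combinations(range(len(series)), 2))

# Hypothetical annual series, used only to exercise the functions.
example = [14.2, 14.0, 13.9, 13.5, 13.6, 13.1, 12.8, 12.9, 12.4, 12.0]
print(mann_kendall(example), sens_slope(example))
```

A negative `S` together with a small p-value would indicate a significant decreasing trend, and Sen's slope gives its magnitude per time step.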

Fig. A3 Precipitation monthly time series and statistical trend

Fig. A4 Maximum temperature monthly time series and statistical trend

Fig. A5 Minimum temperature monthly time series and statistical trend

Fig. A6 Annual time series and statistical trends of minimum and maximum temperature and precipitation

1.3 Reference Evapotranspiration Estimation

The ultimate goal of this research was to forecast runoff under climate change conditions. The input variables include daily precipitation, minimum and maximum temperatures, reference evapotranspiration (ET0), and runoff for 20 years (2020–2040). The ET0 estimates were obtained using the Penman, Penman–Monteith, Wright–Penman, Blaney–Criddle, Radiation balance, and Hargreaves models. The results show that the Penman–Monteith model performs most reliably in this area. However, the Penman–Monteith model cannot be applied to the future period because relative humidity, net radiation, and wind speed data are lacking. Therefore, future ET0 was estimated from temperature using an empirically computed relationship (Fig. A7) between historical ET0 and temperature data.

Figure A7 shows the correlations and equations obtained from different input–output combinations using various trend-line regressions. Green, red, and black lines show the linear, quadratic, and cubic relations of minimum, maximum, and average temperature with ET0, respectively. The highest correlation is found between maximum temperature and ET0 using a quadratic polynomial function.

Fig. A7 Correlation between average, minimum, and maximum temperature and ET0 estimated by FAO-Penman–Monteith method
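A quadratic relation between maximum temperature and ET0 of the kind shown in Fig. A7 can be fitted by ordinary least squares. The sketch below solves the 3×3 normal equations directly. The fitted coefficients of the study are not given in this appendix, so the temperature and ET0 values here are synthetic and the function name is hypothetical.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    power_sums = [sum(xi ** k for xi in x) for k in range(5)]            # sum of x^k
    moments = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    # Augmented normal equations for the unknowns (c, b, a).
    m = [[power_sums[r + col] for col in range(3)] + [moments[r]] for r in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= factor * m[col][k]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        known = sum(m[r][k] * coef[k] for k in range(r + 1, 3))
        coef[r] = (m[r][3] - known) / m[r][r]
    c, b, a = coef
    return a, b, c

# Synthetic Tmax (deg C) and ET0 (mm/day) pairs, not data from the study.
tmax = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0]
et0 = [0.002 * t ** 2 + 0.1 * t + 0.5 for t in tmax]
a, b, c = fit_quadratic(tmax, et0)
print(a, b, c)
```

Once `a`, `b`, and `c` are fitted on historical data, future ET0 follows from the downscaled maximum temperature alone, which is the reason the text gives for using a temperature-based relation.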

1.4 Inputs Selection

Choosing appropriate input data is a prerequisite for developing a forecasting model. Because access to climatic data is limited, the candidate inputs for predicting daily runoff under climate change scenarios are defined from the existing parameters. The gamma test inputs are Tave, Tmin, Tmax, P (precipitation), ET0, and R (runoff), together with each of these six parameters delayed by one, two, three, and four time steps over the 20-year record. The delays are shown as indices 1 to 4 after each parameter symbol.

According to Table A2, the maximum gamma statistic is obtained for the 27th combination (R3 eliminated), indicating that R3 is the most important parameter in the combination. After R3, the parameters Tmax1, Tmin3, Tmin4, Tave1, Tmax2, Tmin1, ET0, and Tave2 follow in order of importance.
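The candidate input set described above (six variables, each also delayed by one to four steps, with the current runoff kept as the target) gives 6 × 5 − 1 = 29 inputs. A minimal sketch of building that matrix follows; the function name and the toy series are assumptions made for illustration.

```python
def build_lagged_inputs(series, max_lag=4, target="R"):
    """Build lagged candidate inputs from equal-length daily series.

    `series` maps a variable name to a list of values. The unlagged target
    is excluded from the inputs and returned separately.
    """
    columns = [(var, lag)
               for var in series
               for lag in range(max_lag + 1)
               if not (var == target and lag == 0)]
    names = [var if lag == 0 else f"{var}{lag}" for var, lag in columns]
    n = len(series[target])
    # The first usable time step is max_lag, so every lag is available.
    rows = [[series[var][t - lag] for var, lag in columns]
            for t in range(max_lag, n)]
    targets = series[target][max_lag:]
    return names, rows, targets

# Toy series of ten time steps per variable, not data from the study.
toy = {name: [float(i + offset) for i in range(10)]
       for offset, name in enumerate(["Tave", "Tmin", "Tmax", "P", "ET0", "R"])}
names, rows, targets = build_lagged_inputs(toy)
print(len(names), names[:6])
```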

Table A2 Selected combinations with higher Gamma values for the initial inputs

| Combination number | Rank | Absent parameter | Gamma | Gradient | Standard Error | V-Ratio |
|---|---|---|---|---|---|---|
| 0 | 10 | - | 0.018234 | 0.042148 | 0.007177 | 0.072935 |
| 1 | 14 | Tave4 | 0.018097 | 0.043687 | 0.006757 | 0.072388 |
| 2 | 13 | Tave3 | 0.018162 | 0.043621 | 0.006622 | 0.072647 |
| 3 | 9 | Tave2 | 0.018452 | 0.043108 | 0.007137 | 0.073808 |
| 4 | 5 | Tave1 | 0.019758 | 0.041836 | 0.006242 | 0.079033 |
| 5 | 27 | Tave | 0.012948 | 0.050928 | 0.006395 | 0.051791 |
| 6 | 15 | Tmax4 | 0.017814 | 0.04411 | 0.004546 | 0.071256 |
| 7 | 24 | Tmax3 | 0.014837 | 0.048172 | 0.004714 | 0.059347 |
| 8 | 6 | Tmax2 | 0.019243 | 0.042887 | 0.006142 | 0.076971 |
| 9 | 2 | Tmax1 | 0.021062 | 0.040848 | 0.007463 | 0.084246 |
| 10 | 26 | Tmax | 0.013296 | 0.050814 | 0.007465 | 0.053183 |
| 11 | 4 | Tmin4 | 0.019819 | 0.042261 | 0.007033 | 0.079278 |
| 12 | 3 | Tmin3 | 0.020647 | 0.041046 | 0.007229 | 0.082588 |
| 13 | 21 | Tmin2 | 0.016109 | 0.046281 | 0.008033 | 0.064434 |
| 14 | 7 | Tmin1 | 0.019152 | 0.043338 | 0.007156 | 0.076607 |
| 15 | 20 | Tmin | 0.016149 | 0.049204 | 0.004846 | 0.064597 |
| 16 | 11 | EV4 | 0.018234 | 0.042148 | 0.007177 | 0.072935 |
| 17 | 12 | EV3 | 0.018234 | 0.042148 | 0.007177 | 0.072935 |
| 18 | 16 | EV2 | 0.017709 | 0.044322 | 0.007342 | 0.070836 |
| 19 | 19 | EV1 | 0.016451 | 0.047765 | 0.004804 | 0.065803 |
| 20 | 8 | EV | 0.018541 | 0.04441 | 0.00705 | 0.074164 |
| 21 | 28 | P4 | 0.012765 | 0.056347 | 0.004487 | 0.051061 |
| 22 | 17 | P3 | 0.017331 | 0.047929 | 0.005202 | 0.069323 |
| 23 | 25 | P2 | 0.014061 | 0.05073 | 0.005285 | 0.056245 |
| 24 | 23 | P1 | 0.015254 | 0.047209 | 0.005479 | 0.061016 |
| 25 | 30 | P | 0.007831 | 0.067974 | 0.005675 | 0.031324 |
| 26 | 22 | R4 | 0.015703 | 0.050782 | 0.004229 | 0.062812 |
| 27 | 1 | R3 | 0.026969 | 0.044033 | 0.006039 | 0.107875 |
| 28 | 18 | R2 | 0.016623 | 0.050932 | 0.006062 | 0.066494 |
| 29 | 29 | R1 | 0.010764 | 0.05958 | 0.005647 | 0.043055 |
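For readers unfamiliar with the quantities in Table A2, the sketch below computes the Gamma statistic, the gradient, and the V-ratio in the usual near-neighbour formulation of the Gamma test: for each of the p nearest neighbours, the mean squared input distance is regressed against half the mean squared output difference, the intercept is Gamma, the slope is the gradient, and the V-ratio is Gamma divided by the output variance. The brute-force neighbour search, the choice p = 10, and the synthetic data are assumptions for illustration and are not taken from the study.

```python
import math
import random

def gamma_test(inputs, outputs, p=10):
    """Return (Gamma, gradient, V-ratio) for input vectors and scalar outputs."""
    n = len(inputs)
    delta = [0.0] * p   # mean squared input distance to the k-th nearest neighbour
    gamma = [0.0] * p   # half the mean squared output difference to that neighbour
    for i in range(n):
        # Brute-force search: (squared distance, index) pairs, nearest first.
        neighbours = sorted(
            (sum((a - b) ** 2 for a, b in zip(inputs[i], inputs[j])), j)
            for j in range(n) if j != i
        )[:p]
        for k, (dist2, j) in enumerate(neighbours):
            delta[k] += dist2 / n
            gamma[k] += 0.5 * (outputs[j] - outputs[i]) ** 2 / n
    # Least-squares line gamma = Gamma + gradient * delta.
    mean_d = sum(delta) / p
    mean_g = sum(gamma) / p
    gradient = (sum((d - mean_d) * (g - mean_g) for d, g in zip(delta, gamma))
                / sum((d - mean_d) ** 2 for d in delta))
    big_gamma = mean_g - gradient * mean_d
    mean_y = sum(outputs) / n
    var_y = sum((y - mean_y) ** 2 for y in outputs) / n
    return big_gamma, gradient, big_gamma / var_y

# Synthetic example: a smooth function plus noise of variance 0.09.
random.seed(0)
xs = [[i / 199] for i in range(200)]
ys = [math.sin(2 * math.pi * x[0]) + random.gauss(0, 0.3) for x in xs]
demo_gamma, demo_gradient, demo_v_ratio = gamma_test(xs, ys)
print(demo_gamma, demo_gradient, demo_v_ratio)
```

In Table A2 every V-ratio equals the Gamma value divided by the same constant (for example 0.018234 / 0.072935 ≈ 0.25), which is what this definition predicts when all combinations share one output series.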

Then, the number of input data is defined. Seventy percent of all available data is used for learning and the rest for testing and evaluation. The minimum number of data required for learning was calculated using the M test (Fig. A8).

Fig. A8 The results of the M test

The results show that both the Gamma and the standard error curves reach a plateau at approximately 2750 data points. To obtain a more accurate result, the M test is applied to simulations with various numbers of learning data, using Local Linear Regression (LLR) as the control model. The simulation was applied for every model using the selected combinations from the previous analysis. Table A3 compares the simulations based on three criteria: R2, RMSE, and CORR.

The results indicate that the values of R2 and CORR remain approximately constant beyond 2450 training data. The minimum RMSE and the maximum R2 and CORR values are obtained using 2800 data for training. Based on the results, the optimum number of training data is 2737 (75% of all data), and the remaining 912 are used for evaluation and testing. To choose the best input combination, the gamma statistics obtained from 29 input combinations are compared with the case that includes all input parameters. Table A3 presents the findings for some of the input combinations. The number of data required for training was determined using the M test with LLR as the control model.

Table A3 Summary of modelling results from the LLR control model to evaluate the M test

| Number of training data | Test RMSE | Test R2 | Test CORR | Train RMSE | Train R2 | Train CORR |
|---|---|---|---|---|---|---|
| 500 | 0.3925 | 0.9208 | 0.9365 | 0.6055 | 0.6042 | 0.7262 |
| 2400 | 0.3192 | 0.979 | 0.9691 | 0.3731 | 0.863932 | 0.8885 |
| 2450 | 0.3156 | 0.9856 | 0.9721 | 0.3551 | 0.881487 | 0.8985 |
| 2700 | 0.3044 | 0.9878 | 0.972095 | 0.3410 | 0.89665 | 0.9076 |
| 2750 | 0.3029 | 0.9881 | 0.9730 | 0.3310 | 0.907729 | 0.9135 |
| 2800 | 0.2915 | 0.9884 | 0.9733 | 0.3183 | 0.920364 | 0.920 |
| 2850 | 0.3004 | 0.9872 | 0.9728 | 0.3228 | 0.9140 | 0.9170 |
| 3000 | 0.3082 | 0.9881 | 0.9728 | 0.3478 | 0.888332 | 0.902409 |
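The three criteria reported in Table A3 can be computed as in the sketch below. The appendix does not define them, so the definitions here are assumptions: RMSE as the root mean squared error, R2 as the coefficient of determination (1 − SSE/SST), and CORR as the Pearson correlation coefficient. The authors' R2 may be defined differently, since the tabulated R2 values are not the squares of the CORR values.

```python
import math

def rmse(observed, simulated):
    """Root mean squared error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated))
                     / len(observed))

def r_squared(observed, simulated):
    """Coefficient of determination, 1 - SSE/SST (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - sse / sst

def corr(observed, simulated):
    """Pearson correlation coefficient."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    norm_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    norm_s = math.sqrt(sum((s - mean_s) ** 2 for s in simulated))
    return cov / (norm_o * norm_s)

# Hypothetical observed and simulated runoff values.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.1, 1.9, 3.2, 3.8]
print(rmse(obs, sim), r_squared(obs, sim), corr(obs, sim))
```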

Table A4 Comparison of LLR and gamma test input combinations

| Number of combinations | Combination type | Training RMSE | Training R2 | Training CORR | Test RMSE | Test R2 | Test CORR |
|---|---|---|---|---|---|---|---|
| 0 | 11111111111111111111111111111 | 0.1557 | 0.9117 | 0.9471 | 9.6246 | 0.0434 | 0.1048 |
| 1 | 01111111111111111111111111111 | 0.1505 | 0.9838 | 0.9369 | 2.1047 | 0.1075 | 0.3289 |
| 2 | 10111111111111111111111111111 | 0.1641 | 0.9923 | 0.9314 | 0.5164 | 0.6586 | 0.8125 |
| 3 | 11011111111111111111111111111 | 0.1813 | 0.9681 | 0.9396 | 1.2207 | 0.1989 | 0.4026 |
| 4 | 11101111111111111111111111111 | 0.2084 | 0.9711 | 0.9369 | 1.923 | 0.0947 | 0.3049 |
| 5 | 11110111111111111111111111111 | 0.1639 | 0.9903 | 0.9302 | 3.4479 | 0.073 | 0.2313 |
| 6 | 11111011111111111111111111111 | 0.1642 | 0.9745 | 0.9327 | 0.9039 | 0.4256 | 0.661 |
| 7 | 11111101111111111111111111111 | 0.1555 | 0.9923 | 0.9251 | 1.3766 | 0.1957 | 0.4146 |
| 8 | 11111110111111111111111111111 | 0.1735 | 0.9764 | 0.9292 | 1.2166 | 0.2839 | 0.5304 |
| 9 | 11111111011111111111111111111 | 0.1714 | 0.9734 | 0.9277 | 1.4732 | 0.3054 | 0.5529 |
| 10 | 11111111101111111111111111111 | 0.2368 | 0.972 | 0.9218 | 0.3648 | 0.799 | 0.9089 |
| 11 | 11111111110111111111111111111 | 0.1432 | 0.9492 | 0.9269 | 9.624 | 0.0151 | 0.1006 |
| 12 | 11111111111011111111111111111 | 0.156 | 0.9529 | 0.9174 | 2.0766 | 0.0936 | 0.3175 |
| 13 | 11111111111101111111111111111 | 0.1582 | 0.9502 | 0.917 | 0.5164 | 0.6464 | 0.8275 |
| 14 | 11111111111110111111111111111 | 0.1675 | 0.9448 | 0.9183 | 1.2037 | 0.1567 | 0.4119 |
| 15 | 11111111111111011111111111111 | 0.1886 | 0.9347 | 0.9181 | 1.9077 | 0.0844 | 0.3214 |
| 16 | 11111111111111101111111111111 | 0.149 | 0.9519 | 0.9192 | 3.4294 | 0.0381 | 0.2293 |
| 17 | 11111111111111110111111111111 | 0.1534 | 0.9435 | 0.9159 | 0.9019 | 0.4008 | 0.6653 |
| 18 | 11111111111111111011111111111 | 0.1607 | 0.948 | 0.9175 | 1.3765 | 0.1538 | 0.4314 |
| 19 | 11111111111111111101111111111 | 0.1502 | 0.9434 | 0.9235 | 1.1905 | 0.2551 | 0.5277 |
| 20 | 11111111111111111110111111111 | 0.1702 | 0.9435 | 0.9212 | 1.4699 | 0.2899 | 0.5506 |
| 21 | 11111111111111111111011111111 | 0.2152 | 0.92 | 0.9068 | 0.3568 | 0.7844 | 0.9026 |
| 22 | 11111111111111111111101111111 | 0.155 | 0.9701 | 0.9817 | 3.4204 | 0.0379 | 0.2225 |
| 23 | 11111111111111111111110111111 | 0.1551 | 0.9698 | 0.98 | 0.9163 | 0.4019 | 0.6594 |
| 24 | 11111111111111111111111011111 | 0.1615 | 0.9679 | 0.9811 | 1.3895 | 0.1592 | 0.4093 |
| 25 | 11111111111111111111111101111 | 0.1564 | 0.9675 | 0.9829 | 1.1946 | 0.2549 | 0.5156 |
| 26 | 11111111111111111111111110111 | 0.1712 | 0.9629 | 0.9838 | 1.4735 | 0.297 | 0.5614 |
| 27 | 11111111111111111111111111011 | 0.1488 | 0.9431 | 0.9695 | 0.3737 | 0.7892 | 0.9159 |
| 28 | 11111111111111111111111111101 | 0.1563 | 0.967 | 0.9815 | 2.0849 | 0.0934 | 0.3093 |
| 29 | 11111111111111111111111111110 | 0.1637 | 0.9684 | 0.9846 | 0.3289 | 0.8301 | 0.9104 |
| 30 | 00110001101101000001000001110 | 0.2326 | 0.9045 | 0.8935 | 0.3288 | 0.8355 | 0.9169 |
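The 29-character "combination type" strings can be read as inclusion masks over the candidate inputs. The appendix does not state the bit order, so the order used below is an inference: reading the bits in the row order of Table A2 (Tave4 to Tave, Tmax4 to Tmax, Tmin4 to Tmin, EV4 to EV, P4 to P, R4 to R1) reproduces the 11 inputs that the text lists for the gamma test selection, and it also reproduces the 14 variables of the best stepwise combination in Table A5. The sketch assumes that EV denotes ET0, and the function name is hypothetical.

```python
# Parameter order inferred from the rows of Table A2 (an assumption).
PARAMETERS = [
    f"{var}{lag}" if lag else var
    for var in ("Tave", "Tmax", "Tmin", "EV", "P", "R")
    for lag in (4, 3, 2, 1, 0)
]
PARAMETERS.remove("R")  # the unlagged runoff is the target, not an input

def decode_mask(mask):
    """Return the names of the inputs switched on ('1') in a mask string."""
    if len(mask) != len(PARAMETERS):
        raise ValueError("mask must have one character per candidate input")
    return [name for bit, name in zip(mask, PARAMETERS) if bit == "1"]

gamma_selection = decode_mask("00110001101101000001000001110")
print(len(gamma_selection), gamma_selection)
```

Applied to the mask of combination 30, this yields Tave2, Tave1, Tmax2, Tmax1, Tmin4, Tmin3, Tmin1, EV, R4, R3, and R2, which is the same set of 11 inputs the text reports for the data-mining models.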

Table A4 compares the LLR results for the gamma test input combinations. The 30th combination is less accurate in the training step, but it gives the best accuracy during evaluation. In contrast, combination No. 1 has acceptable accuracy in the training phase but unacceptable accuracy in the test step.

Generally, it can be stated that the generated model does not have a predictable nature. In the 29th combination, both the training and the test steps have acceptable accuracy. The combination selected using the gamma test gives the best performance for the simulations. A stepwise (step-by-step) multiple linear regression approach was used to evaluate the gamma test findings and to select the input parameters. According to Table A5, the best combination obtained using the multiple linear regression method is presented in Eq. (6):

$$\begin{aligned}R&=0.373787\times R3+0.070545\times R4+0.163553\times R2 \\&+0.087728\times R1+0.042036\times Tmin-0.02445\times Tmin4 \\&+0.032388\times Tmin3-0.04031\times Tmax+0.006369\times Tmax1 \\&-0.01492\times Tmax2+0.015357\times Tmax4+0.020101\times {ET}_{0}1 \\&-0.011170\times {ET}_{0}-0.023045\times Tave+0.031988\end{aligned}$$
(6)
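Equation (6) is a plain linear combination, so it can be evaluated directly. The sketch below copies the coefficients from Eq. (6). The dictionary keys are names chosen here (for example `ET0_1` for the ET0 term with index 1), and the input values are hypothetical, since the appendix does not state the units or scaling of the regression variables.

```python
# Coefficients copied from Eq. (6); key names are illustrative.
EQ6_COEFFICIENTS = {
    "R3": 0.373787, "R4": 0.070545, "R2": 0.163553, "R1": 0.087728,
    "Tmin": 0.042036, "Tmin4": -0.02445, "Tmin3": 0.032388,
    "Tmax": -0.04031, "Tmax1": 0.006369, "Tmax2": -0.01492, "Tmax4": 0.015357,
    "ET0_1": 0.020101, "ET0": -0.011170, "Tave": -0.023045,
}
EQ6_INTERCEPT = 0.031988

def runoff_eq6(inputs):
    """Evaluate Eq. (6) for a mapping from variable name to value."""
    return EQ6_INTERCEPT + sum(coef * inputs[name]
                               for name, coef in EQ6_COEFFICIENTS.items())

# Hypothetical inputs: every variable set to 1 in whatever units Eq. (6) expects.
unit_inputs = {name: 1.0 for name in EQ6_COEFFICIENTS}
print(runoff_eq6(unit_inputs))
```

The lagged runoff terms carry the largest coefficients (R3 alone has 0.373787), which is consistent with the gamma test ranking R3 as the most important input.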

The simulation findings based on the LLR approach selected by a progressive selection method are compared with the results of the LLR and the gamma test. Table A5 shows the outcomes of simulations by the local linear regression approach and the optimum combination based on the progressive selection method.

Table A5 Results of multiple linear regression applying the progressive method (step-by-step approach)

| Number | Variables | R2 | Standard Estimation Error |
|---|---|---|---|
| 1 | R1 | 64.5300 | 0.4235 |
| 2 | R1, R2 | 66.1813 | 0.4217 |
| 3 | R1, R2, R3 | 69.6481 | 0.4114 |
| 4 | R1, R2, R3, R4 | 70.0429 | 0.4109 |
| 5 | R1, R2, R3, R4, Tmax | 76.9536 | 0.3991 |
| 6 | R1, R2, R3, R4, Tmax, Tmax1 | 78.0523 | 0.3921 |
| 7 | R1, R2, R3, R4, Tmax, Tmax1, Tmax2 | 79.7664 | 0.3897 |
| 8 | R1, R2, R3, R4, Tmax, Tmax1, Tmax2, Tmax4 | 80.0629 | 0.3672 |
| 9 | R1, R2, R3, R4, Tmax, Tmax1, Tmax2, Tmax4, Tmin4 | 82.3377 | 0.3575 |
| 10 | R1, R2, R3, R4, Tmax, Tmax1, Tmax2, Tmax4, Tmin4, Tmin3 | 84.0826 | 0.3547 |
| 11 | R1, R2, R3, R4, Tmax, Tmax1, Tmax2, Tmax4, Tmin4, Tmin3, Tmin | 85.7705 | 0.3509 |
| 12 | R1, R2, R3, R4, Tmax, Tmax1, Tmax2, Tmax4, Tmin4, Tmin3, Tmin, ET0 | 85.7739 | 0.3431 |
| 13 | R1, R2, R3, R4, Tmax, Tmax1, Tmax2, Tmax4, Tmin4, Tmin3, Tmin, ET0, ET01 | 85.7792 | 0.336 |
| 14* | R1, R2, R3, R4, Tmax, Tmax1, Tmax2, Tmax4, Tmin4, Tmin3, Tmin, ET0, ET01, Tave | 89.0485 | 0.3357 |

*Best input combination

Table A6 shows that the model generated with the LLR progressive selection method was more accurate during the training step than the LLR model using the gamma test inputs. In the testing phase, however, the gamma test input combination was more accurate than the one generated by the progressive selection approach. Hence, the gamma test provides the most suitable input combination for the model. Remesan et al. (2009) applied the gamma test to determine the best input combination for runoff modelling in the Brue catchment, England. Their findings showed that a combination of five parameters (R3, R2, R1, P, and P1) was the best input combination for the model. They also used 1056 of the 2236 available data for the training step, based on the M test. In this study, Tmax1, Tmin3, Tmin4, Tave1, Tmax2, Tmin1, Tave2, R4, R2, R3, and ET0 were used as input variables, and Qt was the variable predicted by the data-mining models.

Moreover, in the conceptual HYMOD model, daily precipitation, temperature, ET0, and discharge were used as inputs. To enhance the performance of the HYMOD model, the optimum parameter values calculated for this region, presented in Table A7, were used. Based on the M test, the number of data used for learning and calibration in all four models (SVM, ANN, GEP, and HYMOD) was 3030 (≃75% of all data), and the remaining 1010 data (≃25% of all data) were used for testing and evaluation.

Table A6 Comparison of results of LLR with the optimal combination obtained from the progressive selection approach and LLR with the optimal combination obtained from the Gamma test

| Model | Combination type | Evaluation RMSE | Evaluation R2 | Evaluation CORR | Training RMSE | Training R2 | Training CORR |
|---|---|---|---|---|---|---|---|
| Gamma | 00110001101101000001000001110 | 0.3288 | 0.8354 | 0.9168 | 0.2326 | 0.9045 | 0.8934 |
| LLR | 00001101111100100011000001111 | 0.3537 | 0.7858 | 0.8903 | 0.0091 | 0.9702 | 0.9598 |

Table A7 Calculated parameters by the optimization algorithms for the HYMOD model for the Karaj river basin

| Parameter (unit) | Min | Max | Optimal value |
|---|---|---|---|
| The maximum amount of moisture in the basin (mm) | 1 | 500 | 500 |
| Spatial changes in soil moisture storage | 0.1 | 2 | 0.45 |
| B distribution factor of the two moisture reservoirs | 0.1 | 0.99 | 0.1 |
| Retention time in slow flow tank (day) | 0.001 | 1 | 0.01 |
| Retention time in rapid flow tank (day) | 0.1 | 0.99 | 0.1 |

Cite this article

Yoosefdoost, I., Khashei-Siuki, A., Tabari, H. et al. Runoff Simulation Under Future Climate Change Conditions: Performance Comparison of Data-Mining Algorithms and Conceptual Models. Water Resour Manage 36, 1191–1215 (2022). https://doi.org/10.1007/s11269-022-03068-6
