Abstract
Flood routing models are vital for predicting floods so that precautions can be taken in flood-prone regions, preventing loss of life and property and protecting agricultural areas. This study compares the performance of several machine learning (ML) models, namely Bagged Tree, Gradient-Boosted Machine, Random Forest, K-Nearest Neighbor, Support Vector Machine and Extreme Gradient Boosting, for flood routing prediction in Ankara, Eskişehir and Sivas. In addition, the predictive success of tree-based algorithms built with optimized parameters was compared with that of the same algorithms built with default parameters. For this purpose, flood data from 2013, 2014 and 2015 recorded at the discharge observation stations D12A242-D12A126 in Ankara, D12A170-D12A172 in Eskişehir and D15A290-E15A035 in Sivas were used. When establishing the ML models, 80% of the data were used for training and 20% for testing. Model performance was evaluated with statistical indicators such as root mean square error, mean absolute error and the coefficient of determination. The Gradient-Boosted Machine was found to be the most successful model in estimating flood routing. In addition, the K-Nearest Neighbor model with three nearest neighbors achieved high prediction success with the lowest error rates in Ankara. The findings are important for flood management and for taking the necessary precautions before a flood occurs.
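The evaluation workflow summarized in the abstract (an 80%/20% train/test split, a three-nearest-neighbor predictor, and scoring by root mean square error, mean absolute error and the coefficient of determination) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The discharge pairs are made-up values, the split is assumed to be taken in record order without shuffling, the single upstream-discharge input is an assumption, and the coefficient of determination is assumed to take the form 1 − SS_res/SS_tot. All function names are hypothetical.

```python
import math


def train_test_split_ordered(rows, train_fraction=0.8):
    """Split rows into a leading training part and a trailing test part.

    Assumption: no shuffling; the abstract only states the 80/20 proportions.
    """
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def rmse(observed, predicted):
    """Root mean square error."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(observed))


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def knn_predict(train, upstream, k=3):
    """Average the downstream discharge of the k training rows whose
    upstream discharge is closest to the query value."""
    nearest = sorted(train, key=lambda row: abs(row[0] - upstream))[:k]
    return sum(row[1] for row in nearest) / k


# Hypothetical (upstream, downstream) discharge pairs in m^3/s.
# These are illustrative values, not data from the study.
pairs = [(10.0, 8.0), (12.0, 9.5), (15.0, 12.0), (20.0, 16.5), (26.0, 21.0),
         (30.0, 24.5), (27.0, 22.0), (23.0, 18.5), (17.0, 14.0), (13.0, 10.5)]

train, test = train_test_split_ordered(pairs)
observed = [down for _, down in test]
predicted = [knn_predict(train, up, k=3) for up, _ in test]

print("RMSE:", rmse(observed, predicted))
print("MAE:", mae(observed, predicted))
print("R2:", r_squared(observed, predicted))
```

Reporting the three indicators side by side is useful because root mean square error penalizes large misses on flood peaks more heavily than mean absolute error does, while the coefficient of determination expresses the error relative to the variability of the observed discharge.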
Data availability
Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgements
The data used in the study were obtained from the General Directorate of State Hydraulic Works Rasatlar Branch Office and the DSI Regional Directorates. Figure 1 was obtained from http://cografyaharita.com/haritalarim/2eturkiyenin-akarsular-gollar-haritasi3.png (accessed 27 July 2022).
Funding
No funding was received for conducting this study.
Author information
Contributions
O. M. Katipoğlu contributed to the data analysis, findings, and conclusions. M. Sarıgöl contributed to the data collection, the literature review, and the writing of the methods. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Ethical approval
The manuscript complies with all ethical requirements. The paper has not been published in any other journal.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Katipoğlu, O.M., Sarıgöl, M. Prediction of flood routing results in the Central Anatolian region of Türkiye with various machine learning models. Stoch Environ Res Risk Assess 37, 2205–2224 (2023). https://doi.org/10.1007/s00477-023-02389-1
DOI: https://doi.org/10.1007/s00477-023-02389-1