Identification of Critical Flood Prone Areas in Data-Scarce and Ungauged Regions: A Comparison of Three Data Mining Models

Water Resources Management

Abstract

Floods are among the most devastating natural disasters, with significant socio-economic consequences. Preparing a flood-prone area (FPA) map is therefore essential for flood disaster management and for planning further development activities. The main goal of this study is to investigate new applications of the evidential belief function (EBF), random forest (RF), and boosted regression trees (BRT) models for identifying FPAs in the Galikesh region, Iran. The research was conducted in three main stages: data preparation; flood susceptibility mapping using the EBF, RF, and BRT models; and validation of the constructed models using the receiver operating characteristic (ROC) curve. First, a flood inventory map was prepared from documentary sources of the Iranian Water Resources Department (IWRD) and extensive field surveys. In total, 63 flood locations were identified in the study area. Of these, 47 (75%) were randomly selected for training (model building), and the remaining 16 (25%) were used for validation. The flood conditioning factors considered were altitude, slope aspect, slope angle, topographic wetness index, plan curvature, geology, land use, distance from rivers, drainage density, and soil texture. Subsequently, FPA maps were prepared using the EBF, RF, and BRT models in a GIS environment. Finally, the results were validated using the ROC curve and area under the curve (AUC) analysis. The analysis showed that the EBF (AUC = 78.67%) and BRT (AUC = 78.22%) models performed better than the RF model (AUC = 73.33%). The resulting FPA maps can therefore be useful to researchers and planners in developing flood mitigation strategies.
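The split and validation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `split_inventory` and `roc_auc`, the random seed, the placeholder location IDs, and the toy labels and scores are all assumptions made for illustration. It also assumes that validation compares flooded sites against non-flooded sites, which the abstract does not state.

```python
import random


def split_inventory(locations, train_fraction=0.75, seed=0):
    """Randomly split flood locations into training and validation sets.

    Hypothetical helper; the seed and fraction are illustrative defaults.
    """
    shuffled = list(locations)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a
    flooded site (label 1) receives a higher susceptibility score than
    a non-flooded site (label 0). Ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 63 inventory locations, represented here by placeholder IDs.
flood_ids = list(range(1, 64))
train_ids, valid_ids = split_inventory(flood_ids)
print(len(train_ids), len(valid_ids))

# Toy validation data (made up): 1 = flooded, 0 = non-flooded.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2%}")
```

With 63 locations and a 0.75 training fraction, the split yields 47 training and 16 validation locations, matching the counts reported above. The pairwise AUC formulation is equivalent to the area under the ROC curve and avoids any external dependency.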

Acknowledgements

We thank the reviewers for their many useful comments. We also greatly appreciate the comments of Prof. George P. Tsakiris, Editor-in-Chief, and the Associate Editor.

Corresponding author

Correspondence to Omid Rahmati.

Cite this article

Rahmati, O., Pourghasemi, H.R. Identification of Critical Flood Prone Areas in Data-Scarce and Ungauged Regions: A Comparison of Three Data Mining Models. Water Resour Manage 31, 1473–1487 (2017). https://doi.org/10.1007/s11269-017-1589-6

  • DOI: https://doi.org/10.1007/s11269-017-1589-6
