Abstract
Landslide susceptibility prediction is a crucial step in landslide risk assessment and supports sound land-use planning. The primary aim of this study is to investigate different machine learning methods and to develop a framework for training and validating landslide susceptibility prediction models with the help of various statistical techniques. The Kullu–Rohtang pass transport corridor was selected as the study area. A landslide inventory was first prepared from different sources, and nine landslide triggering features were used for further study. All landslide locations in the study area were randomly divided in a 67:33 ratio to train and test the landslide susceptibility prediction models. The best triggering features were chosen using the information gain ratio (IGR), which quantifies the predictive capability of each triggering feature. Five landslide susceptibility prediction models were then constructed using a decision tree, K-nearest neighbour (KNN), Gaussian Naïve Bayes, a support vector machine (SVM) and a multilayer perceptron (MLP). The resulting models were compared and validated using the receiver operating characteristic (ROC) curve, the kappa index and other statistical methods. The results show that all models have outstanding predictive capability: the decision tree, Gaussian Naïve Bayes, SVM and MLP models each scored 100%, and the KNN model scored 99.9%. The results also indicate statistical differences among the models. The validation results demonstrate perfect agreement between the expected and predicted landslides along the transport corridor.
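To make the quantities named in the abstract concrete, the following is a minimal sketch of a 67:33 random split, the information gain ratio of a discrete feature, and Cohen's kappa index. It is not the authors' implementation: the function names (`split_67_33`, `gain_ratio`, `cohen_kappa`), the toy data and the assumption that each triggering feature has already been discretised into classes are all hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def split_67_33(items, seed=0):
    """Randomly split items into roughly 67% training and 33% testing (illustrative)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * 0.67)
    return shuffled[:cut], shuffled[cut:]


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of a discrete feature divided by its split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after partitioning the samples by feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # a constant feature cannot separate the classes
    return (entropy(labels) - conditional) / split_info


def cohen_kappa(observed, predicted):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(observed)
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n
    obs_counts, pred_counts = Counter(observed), Counter(predicted)
    p_e = sum(obs_counts[k] * pred_counts[k] for k in obs_counts) / (n * n)
    if p_e == 1:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Hypothetical toy data: 1 = landslide, 0 = no landslide.
    landslide = [1, 1, 1, 0, 0, 0, 1, 0]
    slope_class = ["steep", "steep", "steep", "gentle",
                   "gentle", "gentle", "steep", "gentle"]
    predicted = [1, 1, 0, 0, 0, 0, 1, 0]

    train, test = split_67_33(range(100))
    print(len(train), len(test))
    print(gain_ratio(slope_class, landslide))
    print(cohen_kappa(landslide, predicted))
```

A kappa of 1 corresponds to the "perfect agreement" wording used in the abstract, while a value near 0 means the agreement is no better than chance. Continuous triggering features would need to be binned, or handled with threshold splits, before a gain ratio of this form applies.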
Data availability
Data will be made available on request.
Acknowledgements
We are thankful to the anonymous reviewers for their insightful and constructive comments and suggestions, which helped to improve the overall quality of the manuscript. Nirbhav would like to thank the University Grants Commission (UGC), Government of India, for providing a senior research fellowship to carry out this research. In addition, the authors thank the National Disaster Management Authority (NDMA), Government of India; the Border Roads Organisation (BRO), Manali; and the Public Works Department (PWD), Kullu, for providing various data sets used in this research.
Funding
The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Author information
Contributions
Data collection: [Nirbhav]; Writing—original draft preparation: [Nirbhav, Maheshwar]; Conceptualization: [Nirbhav, Maheshwar, AM, AS]; Methodology: [Nirbhav, Maheshwar]; Formal analysis and investigation: [Nirbhav, Maheshwar, AM, AS]; Writing—review and editing: [Nirbhav, Maheshwar, AM, MP, AS, NTL]; Supervision: [AM, MP].
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Nirbhav, Malik, A., Maheshwar et al. A comparative study of different machine learning models for landslide susceptibility prediction: a case study of Kullu-to-Rohtang pass transport corridor, India. Environ Earth Sci 82, 167 (2023). https://doi.org/10.1007/s12665-023-10846-x
DOI: https://doi.org/10.1007/s12665-023-10846-x