Prediction of Phishing Websites Using Stacked Ensemble Method and Hybrid Features Selection Method

Pandey, Mithilesh Kumar; Singh, Munindra Kumar; Pal, Saurabh; Tiwari, B. B.

doi:10.1007/s42979-022-01387-4

Original Research
Published: 25 September 2022

Volume 3, article number 488, (2022)
SN Computer Science

Mithilesh Kumar Pandey¹, Munindra Kumar Singh¹, Saurabh Pal¹ (ORCID: orcid.org/0000-0001-9545-7481) & B. B. Tiwari²

167 Accesses
2 Citations

Abstract

Phishing is a major concern in this age of data and digital technologies because of its significant impact on the banking and online retail industries. Cybercriminals target all economic activity on the Internet, so it is critical to take security precautions to safeguard assets. One of the first steps in building a safe cyberspace is to prevent phishing attacks before they happen. Detection mechanisms for these attacks have been built using machine learning and other methods, but there is still room for improvement in detection accuracy. This paper proposes the optimization of an ensemble classification algorithm for phishing website (PW) detection. The proposed technique was optimized using a hybrid feature selection method (Chi-square, extra tree, and heatmap) and by tuning the parameters of several machine learning (ML) methods, including random forest, naive Bayes, J48, and KNN. The tuned classifiers were then ranked, and the top-ranked ones were selected as the base of the proposed technique. Results across all experiments show that the optimized stacking ensemble approach outperforms previous ML-based detection methods. The level of precision attained was 99.7%.
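
The Chi-square step of the feature selection can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not describe the dataset or the exact procedure, so the feature names, the toy data, and the helper functions `chi_square_score` and `rank_features` are all assumptions made for illustration. The sketch computes the Pearson Chi-square statistic between each categorical feature and the class label, then ranks the features by that score.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic between one categorical feature and the labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))   # counts per (feature value, label) cell
    feature_totals = Counter(feature)          # row totals of the contingency table
    label_totals = Counter(labels)             # column totals of the contingency table
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            # Cell count expected if the feature and the label were independent.
            expected = f_count * l_count / n
            diff = observed.get((f_value, l_value), 0) - expected
            score += diff * diff / expected
    return score


def rank_features(columns, labels):
    """Return (name, score) pairs sorted from most to least label-dependent."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 1 = phishing, 0 = legitimate. Feature names are invented.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "has_ip_in_url": [1, 1, 1, 1, 0, 0, 0, 0],  # matches the label exactly
    "long_url":      [1, 1, 1, 0, 1, 0, 0, 0],  # partly related to the label
    "uses_https":    [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}

ranking = rank_features(columns, labels)
for name, score in ranking:
    print(f"{name}: {score:.1f}")
```

A higher score means the feature's distribution differs more between the two classes. In a hybrid scheme like the one the abstract names, such a ranking would presumably be combined with tree-based importances and a correlation heatmap before the classifiers are trained, but how the paper combines them is not stated in the preview.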

Author information

Authors and Affiliations

Department of Computer Applications, VBS Purvanchal University, Jaunpur, Uttar Pradesh, 222001, India
Mithilesh Kumar Pandey, Munindra Kumar Singh & Saurabh Pal
Department of Electronics and Communication, VBS Purvanchal University, Jaunpur, Uttar Pradesh, 222001, India
B. B. Tiwari

Corresponding author

Correspondence to Saurabh Pal.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Advances in Computational Approaches for Artificial Intelligence, Image Processing, IoT and Cloud Applications” guest edited by Bhanu Prakash K N and M Shivakumar.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pandey, M.K., Singh, M.K., Pal, S. et al. Prediction of Phishing Websites Using Stacked Ensemble Method and Hybrid Features Selection Method. SN COMPUT. SCI. 3, 488 (2022). https://doi.org/10.1007/s42979-022-01387-4

Received: 31 January 2022
Accepted: 25 August 2022
Published: 25 September 2022
DOI: https://doi.org/10.1007/s42979-022-01387-4
