Fuzzy K-Means with M-KMP: a security framework in pyspark environment for intrusion detection

Begum, Gousiya; Ul Huq, S. Zahoor; Kumar, A. P. Siva

doi:10.1007/s11042-024-18180-5

Published: 13 February 2024

Multimedia Tools and Applications (2024)

Gousiya Begum¹,², S. Zahoor Ul Huq³ & A. P. Siva Kumar⁴

Abstract

In recent times, the IDS (Intrusion Detection System) has become a significant tool for improving network security by distinguishing abnormal data from normal data. It is vital because it permits one to identify and respond to incoming malicious traffic. As data volumes have grown, intruders have also increased the attacks they mount on systems. Concurrently, ML (Machine Learning) algorithms can learn from the data they are given. As new data is provided, the accuracy and efficacy of an ML model's decisions improve with training. However, with the evolution of big data, conventional ML has become incapable of handling the interpretation of huge data volumes, which has left most conventional systems with high FP (False Positive) rates and low accuracy. This gave rise to pyspark, which serves as a platform for addressing the issues that conventional ML methods fail to solve. ML in pyspark is scalable and easy to use. Considering this, the present research proposes ML-based algorithms for intrusion detection in a pyspark environment. The study proposes a security framework named Fuzzy K-Means with M-KMP (Modified-Knuth Morris Pratt), in which clustering is accomplished by Fuzzy K-Means, which can handle data points that potentially relate to multiple clusters. M-KMP, in turn, performs information matching on the clustered data to assess how often given information occurs in the allocated threat data, which will assist security developers in attack prevention. The efficiency of the proposed work is confirmed through the results.
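
The two building blocks named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not describe how M-KMP modifies Knuth-Morris-Pratt or how the clustering is distributed over pyspark, so the code below shows only the textbook fuzzy membership formula (with an assumed fuzzifier `m = 2`) and the classic, unmodified KMP search. The function names, the sample centers, and the sample strings are all hypothetical.

```python
from typing import List


def fuzzy_memberships(point: List[float], centers: List[List[float]],
                      m: float = 2.0) -> List[float]:
    """Textbook fuzzy c-means membership of one point in every cluster.

    Unlike hard K-means, each point gets a degree of membership in all
    clusters; the degrees sum to 1. `m` > 1 is the fuzzifier (assumed 2.0).
    """
    # Euclidean distance from the point to each cluster center.
    dists = [sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
             for center in centers]
    # A point lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
            for d_i in dists]


def kmp_failure(pattern: str) -> List[int]:
    """Failure table: length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text: str, pattern: str) -> List[int]:
    """Classic (unmodified) KMP: start index of every occurrence."""
    if not pattern:
        return []
    fail = kmp_failure(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return matches
```

For example, `fuzzy_memberships([1.0, 0.0], [[0.0, 0.0], [4.0, 0.0]])` assigns the point to both clusters with unequal weights rather than to a single one, and `len(kmp_search(record, signature))` counts how often a hypothetical threat signature occurs in a record, which is the kind of occurrence assessment the abstract attributes to the matching stage.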

Funding

There is no funding for this study.

Author information

Authors and Affiliations

Research Scholar, Department of Computer Science and Engineering, JNTUA Anantapur, Anantapuramu, Andhra Pradesh, India
Gousiya Begum
Department of Computer Science and Engineering, Mahatma Gandhi Institute of Technology, Gandipet, Hyderabad, India
Gousiya Begum
Department of Computer Science and Engineering, G. Pulla Reddy Engineering College, Kurnool, Andhra Pradesh, India
S. Zahoor Ul Huq
Department of Computer Science and Engineering, JNTU Anantapur, Anantapuramu, Andhra Pradesh, India
A. P. Siva Kumar

Corresponding author

Correspondence to Gousiya Begum.

Ethics declarations

Conflict of interest

There is no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Begum, G., Ul Huq, S.Z. & Kumar, A.P.S. Fuzzy K-Means with M-KMP: a security framework in pyspark environment for intrusion detection. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18180-5

Received: 25 January 2023
Revised: 06 October 2023
Accepted: 05 January 2024
Published: 13 February 2024
DOI: https://doi.org/10.1007/s11042-024-18180-5
