
Bot detection using unsupervised machine learning

  • Technical Paper
Microsystem Technologies

Abstract

This research focuses on bot detection using traffic analysis, unsupervised machine learning, and similarity analysis between benign traffic data and bot traffic data. We experimented with several clustering algorithms and recorded their accuracy on our prepared datasets. The best-performing clustering algorithm was then used for the remaining steps of the methodology: determining the majority cluster (the cluster with the most flows), removing duplicate flows, and performing similarity analysis. For the duplicate-removal stage, the results report how many flows each majority cluster contains and how many duplicate flows were removed from it. For the similarity analysis, the results give the value of the similarity coefficient for each comparison between datasets (the bot datasets and the benign dataset). From these results we draw heuristic conclusions for determining possible bot infection on a given host.
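The post-clustering steps described above can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the flow features, or the similarity coefficient, so everything specific here is an assumption: cluster labels are taken as already computed, each flow is a hypothetical `(packet_count, byte_count)` tuple, and the Jaccard coefficient stands in for whichever coefficient the paper actually uses.

```python
from collections import Counter


def majority_cluster(flows, labels):
    """Return the flows in the cluster that holds the most flows."""
    top_label, _ = Counter(labels).most_common(1)[0]
    return [flow for flow, label in zip(flows, labels) if label == top_label]


def remove_duplicates(flows):
    """Drop repeated flows, keeping first-seen order.

    Returns the unique flows and the number of duplicates removed.
    """
    unique = list(dict.fromkeys(flows))
    return unique, len(flows) - len(unique)


def jaccard(a, b):
    """Jaccard coefficient of two flow collections (assumed metric)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Hypothetical toy data: (packet_count, byte_count) per flow,
# with cluster labels assumed to come from an earlier clustering step.
bot_flows = [(10, 500), (10, 500), (12, 640), (3, 90), (10, 500)]
bot_labels = [0, 0, 0, 1, 0]
benign_flows = [(10, 500), (40, 9000), (41, 9100), (2, 60)]
benign_labels = [0, 0, 0, 1]

bot_unique, bot_removed = remove_duplicates(majority_cluster(bot_flows, bot_labels))
benign_unique, benign_removed = remove_duplicates(
    majority_cluster(benign_flows, benign_labels)
)
similarity = jaccard(bot_unique, benign_unique)
```

In this toy run the bot majority cluster holds four flows, two of which are duplicates, and the two de-duplicated majority clusters share one of four distinct flows. How a coefficient value maps to a verdict about a host is a heuristic decision that the sketch does not attempt to reproduce.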

Acknowledgements

The authors would like to thank Pratik Narang and Babak Rahbarinia for their immense help and insights. We also sincerely thank the anonymous reviewers and our shepherd for their constructive comments and valuable feedback, which helped to improve this paper.

Author information

Corresponding author

Correspondence to Hung-Min Sun.

About this article

Cite this article

Wu, W., Alvarez, J., Liu, C. et al. Bot detection using unsupervised machine learning. Microsyst Technol 24, 209–217 (2018). https://doi.org/10.1007/s00542-016-3237-0

  • DOI: https://doi.org/10.1007/s00542-016-3237-0
