Abstract
This research addresses bot detection using traffic analysis, unsupervised machine learning, and similarity analysis between benign and bot traffic data. We evaluated several clustering algorithms and recorded their accuracy on our prepared datasets. The best-performing algorithm was then used for the remaining steps of the methodology: determining the majority cluster (the cluster with the most flows), removing duplicate flows, and performing similarity analysis. For the duplicate-removal stage, the results show how many flows each majority cluster contains and how many duplicate flows were removed from it. For the similarity analysis, the results give the value of the similarity coefficient for the comparisons between all datasets (the bot datasets and the benign dataset). From these results we draw heuristic conclusions for determining possible bot infection on a given host.
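The post-clustering steps named in the abstract (majority cluster, duplicate removal, similarity coefficient) can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: flows are represented as tuples of feature values, cluster labels come from some clustering algorithm that is not shown, and the Jaccard coefficient stands in for the similarity coefficient. All function names and the toy data are hypothetical.

```python
from collections import Counter


def majority_cluster(flows, labels):
    """Return the flows assigned to the cluster that holds the most flows."""
    top_label, _ = Counter(labels).most_common(1)[0]
    return [flow for flow, label in zip(flows, labels) if label == top_label]


def remove_duplicates(flows):
    """Drop repeated flows, keeping first occurrences; also report how many were removed."""
    unique = list(dict.fromkeys(flows))
    return unique, len(flows) - len(unique)


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| (an assumed choice of similarity coefficient)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Toy flows as (protocol, destination port, bytes) tuples; purely illustrative values.
bot_flows = [(6, 80, 120), (6, 80, 120), (6, 80, 300), (17, 53, 60)]
bot_labels = [0, 0, 0, 1]          # labels as some clustering step might assign them
benign_flows = [(6, 80, 300), (6, 443, 900), (6, 443, 900)]
benign_labels = [0, 1, 1]

bot_major, bot_removed = remove_duplicates(majority_cluster(bot_flows, bot_labels))
benign_major, benign_removed = remove_duplicates(majority_cluster(benign_flows, benign_labels))
similarity = jaccard(bot_major, benign_major)

print(len(bot_major), bot_removed, len(benign_major), benign_removed, similarity)
```

Under these assumptions, a high coefficient between a host's majority cluster and a known bot dataset would be one heuristic signal of possible infection; the thresholds and the actual coefficient used are left to the full paper.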
Acknowledgements
The authors would like to thank Pratik Narang and Babak Rahbarinia for their immense help and insights. We also sincerely thank the anonymous reviewers and our shepherd for their constructive comments and valuable feedback, which helped to improve this paper.
Cite this article
Wu, W., Alvarez, J., Liu, C. et al. Bot detection using unsupervised machine learning. Microsyst Technol 24, 209–217 (2018). https://doi.org/10.1007/s00542-016-3237-0
DOI: https://doi.org/10.1007/s00542-016-3237-0