Abstract
This contribution analyzes the usefulness of selected parameters of a distributed information system for early detection of anomalies in its operation. Statistical analysis and machine learning (ML) can incur high computational complexity and require transferring large amounts of data from the monitored system’s elements. In practice, this restricts monitoring to major components only (e.g., the access link, key machine components, or filtered subsets of traffic parameters). To overcome this limitation, a model is proposed in which an arbitrary number of elements can be monitored using microservices. This requires determining the sampling threshold value and the influence of sampling coarseness on the quality of anomaly detection. To validate the proposed approach, the ST4000DM000 (disk failure) and CICIDS2017 (DDoS) datasets were used to study how limiting the number of parameters and reducing the sampling rate affect the detection performance of selected classic ML algorithms. Moreover, an example microservice architecture for coarse network anomaly detection at a network node is presented.
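As a rough illustration of how sampling coarseness can affect detection, the following minimal sketch subsamples a synthetic metric at different steps and checks whether a short anomalous burst is still caught. It is not the authors' method and uses neither of the datasets named above: the series, the burst position, the z-score detector (a simple stand-in for the classic ML algorithms), and all function names and thresholds are assumptions made for this example.

```python
import random
from statistics import mean, stdev

# Hypothetical burst location, chosen only for this illustration.
BURST_START = 1210
BURST_LEN = 60


def make_series(n=2000, seed=0):
    """Synthetic metric: Gaussian noise around 100 with one elevated burst."""
    rng = random.Random(seed)
    series = [rng.gauss(100.0, 5.0) for _ in range(n)]
    for i in range(BURST_START, BURST_START + BURST_LEN):
        series[i] += 40.0
    return series


def coarse_sample(series, step):
    """Keep every step-th sample as (original_index, value) pairs."""
    return [(i, series[i]) for i in range(0, len(series), step)]


def detect(samples, baseline, z_threshold=4.0):
    """Return indices of samples whose z-score against the baseline exceeds the threshold."""
    mu, sigma = mean(baseline), stdev(baseline)
    return [i for i, v in samples if abs(v - mu) / sigma > z_threshold]


def burst_detected(series, step):
    """True if at least one flagged sample falls inside the injected burst."""
    # Baseline: the anomaly-free prefix, sampled at the same coarse rate.
    baseline = series[:1000:step]
    flagged = detect(coarse_sample(series, step), baseline)
    return any(BURST_START <= i < BURST_START + BURST_LEN for i in flagged)


if __name__ == "__main__":
    data = make_series()
    for step in (1, 10, 100):
        print(f"step={step:>3}  burst detected: {burst_detected(data, step)}")
```

With a step of 100, the retained samples fall at indices 1200 and 1300, so no sample lands inside the burst and it cannot be detected. In this toy setting, the largest step that still places a sample inside every anomaly of interest plays the role of the sampling threshold.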
Acknowledgement
This project is financed by the Minister of Education and Science of the Republic of Poland within the “Regional Initiative of Excellence” program for years 2019–2022. Project number 027/RID/2018/19, amount granted 11 999 900 PLN.
This work has been supported by the joint research project “Agent Technologies in Dynamics Environments” under the agreement on scientific cooperation between the University of Novi Sad, the University of Craiova, SRI PAS and the Warsaw University of Technology, as well as by the joint research project “Novel methods for development of distributed systems” under the agreement on scientific cooperation between the Polish Academy of Sciences and the Romanian Academy for the years 2019–2021. Finally, support from the Bulgarian Academy of Sciences and the Polish Academy of Sciences (bilateral grant agreement between BAS and PAS) is acknowledged.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Bădică, A. et al. (2022). Cascaded Anomaly Detection with Coarse Sampling in Distributed Systems. In: Sachdeva, S., Watanobe, Y., Bhalla, S. (eds) Big-Data-Analytics in Astronomy, Science, and Engineering. BDA 2021. Lecture Notes in Computer Science, vol 13167. Springer, Cham. https://doi.org/10.1007/978-3-030-96600-3_13
DOI: https://doi.org/10.1007/978-3-030-96600-3_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96599-0
Online ISBN: 978-3-030-96600-3
eBook Packages: Computer Science, Computer Science (R0)