Cascaded Anomaly Detection with Coarse Sampling in Distributed Systems

  • Conference paper
Big-Data-Analytics in Astronomy, Science, and Engineering (BDA 2021)

Abstract

This contribution analyzes the usefulness of selected parameters of a distributed information system for the early detection of anomalies in its operation. Statistical analysis or machine learning (ML) can entail high computational complexity and require transferring large amounts of data from the monitored system’s elements. In practice, this restricts monitoring to major components only (e.g., the access link, key machine components, or a filtered subset of traffic parameters). To overcome this limitation, a model is proposed in which an arbitrary number of elements can be monitored using microservices. This requires determining the sampling threshold value and the influence of sampling coarseness on the quality of anomaly detection. To validate the proposed approach, the ST4000DM000 (disk failure) and CICIDS2017 (DDoS) datasets were used to study how limiting the number of parameters and reducing the sampling rate affect the detection performance of selected classic ML algorithms. Moreover, an example microservice architecture for coarse network anomaly detection at a network node is presented.
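The trade-off between sampling coarseness and detection quality described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic metric, the z-score detector, the thresholds, and every function name below are assumptions chosen for illustration, and the two-stage `cascade` function is only one possible reading of "cascaded" detection (a cheap scan over coarse samples, followed by a full-resolution check near each flag).

```python
import random
import statistics


def make_series(n=2000, anomaly_start=1200, anomaly_len=200, seed=0):
    """Synthetic metric (assumption): noise around 10, shifted to 14 during the anomaly."""
    rng = random.Random(seed)
    series = []
    for t in range(n):
        in_anomaly = anomaly_start <= t < anomaly_start + anomaly_len
        series.append((14.0 if in_anomaly else 10.0) + rng.gauss(0.0, 1.0))
    return series


def coarse_sample(series, step):
    """Keep every `step`-th reading as (original_index, value) pairs."""
    return [(i, series[i]) for i in range(0, len(series), step)]


def fit_baseline(series, train_len=1000):
    """Estimate mean and standard deviation from an anomaly-free prefix."""
    train = series[:train_len]
    return statistics.fmean(train), statistics.stdev(train)


def detect(samples, mean, std, z_threshold=3.0):
    """Return original indices of samples whose z-score exceeds the threshold."""
    return [i for i, v in samples if abs(v - mean) / std > z_threshold]


def detection_delay(flags, anomaly_start):
    """Time steps from anomaly onset to the first flag at or after it (None if missed)."""
    hits = [i for i in flags if i >= anomaly_start]
    return hits[0] - anomaly_start if hits else None


def cascade(series, step, mean, std, window=50):
    """Hypothetical two-stage check: stage 1 scans coarse samples; stage 2
    confirms a flag only if the full-resolution window around it is shifted."""
    confirmed = []
    for i in detect(coarse_sample(series, step), mean, std):
        lo, hi = max(0, i - window), min(len(series), i + window)
        local_mean = statistics.fmean(series[lo:hi])
        if abs(local_mean - mean) / std > 2.0:
            confirmed.append(i)
    return confirmed


def evaluate(steps=(1, 10, 50, 100), anomaly_start=1200):
    """Compare data volume and detection delay across sampling steps."""
    series = make_series(anomaly_start=anomaly_start)
    mean, std = fit_baseline(series)
    results = {}
    for step in steps:
        samples = coarse_sample(series, step)
        flags = detect(samples, mean, std)
        results[step] = {
            "samples_transferred": len(samples),
            "flags": len(flags),
            "delay": detection_delay(flags, anomaly_start),
        }
    return results


if __name__ == "__main__":
    for step, row in evaluate().items():
        print(step, row)
```

Running the script prints, for each sampling step, how many readings would have to be transferred and how long after onset the anomaly is first flagged. In this toy setting a larger step reduces the transferred data proportionally, while the detection delay can only be resolved to multiples of the step; the second stage in `cascade` is meant to discard isolated spikes that a coarse sample happens to land on.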

Acknowledgement

This project is financed by the Minister of Education and Science of the Republic of Poland within the “Regional Initiative of Excellence” program for the years 2019–2022. Project number 027/RID/2018/19, amount granted 11 999 900 PLN.

This work has been supported by the joint research project “Agent Technologies in Dynamics Environments” under the agreement on scientific cooperation between the University of Novi Sad, the University of Craiova, SRI PAS, and the Warsaw University of Technology, as well as by the joint research project “Novel methods for development of distributed systems” under the agreement on scientific cooperation between the Polish Academy of Sciences and the Romanian Academy for the years 2019–2021. Finally, support from the Bulgarian Academy of Sciences and the Polish Academy of Sciences (bilateral grant agreement between BAS and PAS) is acknowledged.

Author information

Corresponding author

Correspondence to Marek Bolanowski.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Bădică, A. et al. (2022). Cascaded Anomaly Detection with Coarse Sampling in Distributed Systems. In: Sachdeva, S., Watanobe, Y., Bhalla, S. (eds) Big-Data-Analytics in Astronomy, Science, and Engineering. BDA 2021. Lecture Notes in Computer Science, vol 13167. Springer, Cham. https://doi.org/10.1007/978-3-030-96600-3_13

  • DOI: https://doi.org/10.1007/978-3-030-96600-3_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-96599-0

  • Online ISBN: 978-3-030-96600-3

  • eBook Packages: Computer Science, Computer Science (R0)
