Abstract
Network performance monitoring is crucial for maintaining reliable and efficient communication between hosts and clusters. It is becoming more relevant as companies progressively move towards cloud-native environments, where hyperconnected islands are deployed. While monitoring of individual clusters and components is widely deployed, inter-domain metrics have not yet been added to existing solutions. In this paper, we present MONCHi, an open-source-based custom monitoring tool designed to collect and analyse traditional and custom metrics for network links within and among Kubernetes clusters. MONCHi consists of a flexible design, implemented on top of a conventional monitoring tool (Prometheus in this case), in which several custom scripts run in separate containers within a single pod and continuously collect and store metrics. These metrics are then exposed and visualised in an analysis tool, allowing easy monitoring of network performance in clusters and scalable multi-domain deployments based on multiple Prometheus instances. We also discuss how the Prometheus configuration is modified to support the new MONCHi endpoint. Finally, we present the results of several performance tests, focusing mainly on Round-Trip Time (RTT) and bandwidth, conducted to validate that MONCHi accurately collects and analyses network performance metrics. Overall, MONCHi provides an effective and easy-to-customise solution for monitoring cloud-native multi-domain environments.
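To make the idea of a custom collector script more concrete, the following is a minimal sketch, not MONCHi's actual code. It assumes that RTT can be approximated by the duration of a TCP handshake and that the result is published on a `/metrics` path in the Prometheus text exposition format. The metric name `monchi_sketch_rtt_seconds`, the `TARGETS` list, and the port `8000` are hypothetical and chosen only for illustration.

```python
import socket
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical targets: (label, host, port) of peers in the same or another cluster.
TARGETS = [("local", "127.0.0.1", 9100)]


def tcp_rtt_seconds(host, port, timeout=1.0):
    """Approximate RTT as the time to complete a TCP handshake; None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        return None


def render_metrics(samples):
    """Format {target_label: rtt_or_None} in the Prometheus text exposition format."""
    lines = [
        "# HELP monchi_sketch_rtt_seconds Approximate RTT to a target (hypothetical metric name).",
        "# TYPE monchi_sketch_rtt_seconds gauge",
    ]
    for target, rtt in sorted(samples.items()):
        if rtt is not None:  # unreachable targets are simply omitted in this sketch
            lines.append(f'monchi_sketch_rtt_seconds{{target="{target}"}} {rtt:.6f}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves a fresh measurement on every scrape of /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        samples = {name: tcp_rtt_seconds(host, port) for name, host, port in TARGETS}
        body = render_metrics(samples).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the endpoint; blocks until interrupted."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` inside a container would start the endpoint. A Prometheus instance could then scrape it through an additional scrape job pointing at the pod, which is the kind of configuration change the abstract refers to. Measuring on each scrape keeps the sketch short; a collector that measures continuously in the background and stores the latest value would be closer to the behaviour described above.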
This paper has been partially supported by the European H2020 FISHY Project (grant agreement 952644) and the TRUE5G project funded by the Spanish National Research Agency (PID2019-108713RB-C52/AEI/10.13039/501100011033).
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
de M. Artalejo, D.N., Vidal, I., Valera, F., Nogales, B. (2023). MONCHi: MONitoring for Cloud-native Hyperconnected Islands. In: Iacono, M., Scarpa, M., Barbierato, E., Serrano, S., Cerotti, D., Longo, F. (eds) Computer Performance Engineering and Stochastic Modelling. EPEW 2023, ASMTA 2023. Lecture Notes in Computer Science, vol 14231. Springer, Cham. https://doi.org/10.1007/978-3-031-43185-2_20
DOI: https://doi.org/10.1007/978-3-031-43185-2_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43184-5
Online ISBN: 978-3-031-43185-2
eBook Packages: Computer Science, Computer Science (R0)