Abstract
As networks grow more complex, the ability to track almost all flows is becoming paramount, because it makes it possible to detect transient events that affect only a subset of the traffic. Solutions for flow monitoring exist, but producing accurate estimates for every <flowID,counter> tuple is increasingly difficult given the memory constraints of commodity programmable switches. As networks grow in size, more flows have to be tracked, which increases the number of tuples to be recorded. At the same time, end-host virtualization requires more specific flowIDs, which enlarges the memory cost of every entry. Finally, the available memory has to be shared with other important functions (e.g., load balancing, forwarding, ACLs).
To address these issues, we present FlowLiDAR (Flow Lightweight Detection and Ranging), a new solution that tracks almost all the flows in the network while requiring only a modest amount of data plane memory, independent of the size of the flowIDs. We implemented the scheme in P4, tested it using real ISP traffic, and compared it against four state-of-the-art solutions: FlowRadar, NZE, PR-sketch, and Elastic Sketch. While those can reconstruct at most 60% of the tuples, FlowLiDAR tracks 98.7% of them with the same amount of memory.
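The memory pressure described above can be made concrete with a back-of-the-envelope sketch. It is not part of FlowLiDAR. It only estimates what storing one exact <flowID,counter> tuple per flow would cost. The 32-bit counter width, the flow counts, and the helper name `table_bytes` are illustrative assumptions. The 5-tuple widths follow from the standard IPv4/IPv6 address, port, and protocol field sizes.

```python
# Field widths in bits for a classic 5-tuple flowID:
# source address, destination address, source port, destination port, protocol.
IPV4_5TUPLE_BITS = 32 + 32 + 16 + 16 + 8
IPV6_5TUPLE_BITS = 128 + 128 + 16 + 16 + 8

# Assumption for illustration: one 32-bit packet counter per flow.
COUNTER_BITS = 32


def table_bytes(num_flows, flow_id_bits, counter_bits=COUNTER_BITS):
    """Bytes needed to store one exact <flowID,counter> tuple per flow."""
    return num_flows * (flow_id_bits + counter_bits) // 8


if __name__ == "__main__":
    # Hypothetical flow counts, chosen only to show how the cost scales.
    for flows in (100_000, 1_000_000):
        v4 = table_bytes(flows, IPV4_5TUPLE_BITS)
        v6 = table_bytes(flows, IPV6_5TUPLE_BITS)
        print(f"{flows:>9} flows: IPv4 {v4 / 1e6:.1f} MB, IPv6 {v6 / 1e6:.1f} MB")
```

Under these assumptions the cost grows linearly with both the number of flows and the flowID width, which is the effect a scheme with memory independent of flowID size is meant to avoid.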
Index Terms
- Lightweight Acquisition and Ranging of Flows in the Data Plane