ABSTRACT
Cellular networks are increasingly adopting virtualized radio access networks (vRANs), in which operators replace the traditional specialized hardware for RAN processing with software running on commodity servers. Today's vRAN deployments lack resilience: they do not support vRAN failover or upgrades without long service interruptions. Enabling these features for vRANs is challenging because of their strict real-time latency requirements and black-box nature. Slingshot is a new system that transparently provides resilience for the vRAN's most performance-critical layer, the physical layer (PHY). We design new techniques for real-time workload migration using fast RAN protocol middleboxes, and for real-time RAN failure detection. A key insight in our design is to treat the transient disruptions that resilience events cause to RAN computation state and I/O like regular wireless signal impairments, and to leverage the inherent resilience of cellular networks to such impairments. Experiments on a state-of-the-art 5G vRAN testbed show that Slingshot handles PHY failover with no disruption to video conferencing and under 110 ms of disruption to a TCP connection, and that it also enables zero-downtime upgrades.
Index Terms
- Resilient Baseband Processing in Virtualized RANs with Slingshot