research-article

ORL-SDN: Online Reinforcement Learning for SDN-Enabled HTTP Adaptive Streaming

Published: 09 August 2018

Abstract

In designing an HTTP adaptive streaming (HAS) system, the bitrate adaptation scheme in the player is a key component to ensure a good quality of experience (QoE) for viewers. We propose a new online reinforcement learning optimization framework, called ORL-SDN, targeting HAS players running in a software-defined networking (SDN) environment. We leverage SDN to facilitate the orchestration of the adaptation schemes for a set of HAS players. To reach a good level of QoE fairness in a large population of players, we cluster them based on a perceptual quality index. We formulate the adaptation process as a Partially Observable Markov Decision Process and solve the per-cluster optimization problem using an online Q-learning technique that leverages model predictive control and parallelism via aggregation to avoid a per-cluster suboptimal selection and to accelerate the convergence to an optimum. This framework achieves maximum long-term revenue by selecting the optimal representation for each cluster under time-varying network conditions. The results show that ORL-SDN delivers substantial improvements in viewer QoE, presentation quality stability, fairness, and bandwidth utilization over well-known adaptation schemes.
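As a rough illustration of the online Q-learning idea mentioned in the abstract, the sketch below shows a plain tabular Q-learning update with epsilon-greedy selection over representation indices. It is not the ORL-SDN algorithm: the paper's POMDP formulation, clustering, model predictive control, and aggregation are not reproduced here. The state and action counts, the reward value, and the function names `q_update` and `choose_action` are all assumptions made for this example.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice of a representation index for the given state."""
    if rng.random() < epsilon:
        # Explore: pick a random representation.
        return rng.randrange(len(q[state]))
    # Exploit: pick the representation with the highest Q-value.
    row = q[state]
    return max(range(len(row)), key=row.__getitem__)


# Hypothetical setup (assumed, not from the paper):
# 3 coarse network states and 4 representations (bitrate levels).
N_STATES, N_ACTIONS = 3, 4
q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
```

In a setting like the one the abstract describes, one such table could be kept per cluster of players, with the reward derived from a perceptual quality measure; that mapping is likewise an assumption of this sketch.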

    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 14, Issue 3
      August 2018
      249 pages
      ISSN: 1551-6857
      EISSN: 1551-6865
      DOI: 10.1145/3241977

      Copyright © 2018 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 9 August 2018
      • Accepted: 1 April 2018
      • Revised: 1 February 2018
      • Received: 1 September 2017

      Qualifiers

      • research-article
      • Research
      • Refereed
