ABSTRACT
In this paper we study content placement in the high-dimensional regime: there are n servers and O(n) distinct types of content. Each server can store and serve O(1) types at any given time. Demands for these content types arrive and must be served online; over time, there are O(n) demands in total. We consider the algorithmic task of content placement: determining which types of content should be on which server at any given time, in the setting where the demand statistics (i.e., the relative popularity of each type of content) are not known a priori but must be inferred from the very demands we are trying to satisfy. We call this the high-dimensional regime because this scaling (everything being O(n)) prevents consistent estimation of the demand statistics; it models many modern settings in which large numbers of users, servers, and videos or webpages interact in this way.
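To make the estimation difficulty concrete, the following is a minimal sketch under an assumption that is not from the paper: popularity is uniform over n content types and exactly n demands are observed. In that case a constant fraction of types (about 1/e, since (1 - 1/n)^n tends to 1/e) is never requested at all, so empirical frequencies cannot converge to the true popularities however large n becomes. The function name `unseen_fraction` is hypothetical.

```python
import random


def unseen_fraction(n, seed=0):
    """Draw n demands uniformly over n content types.

    Returns the fraction of types that were never requested.
    Uniform popularity is an illustrative assumption only.
    """
    rng = random.Random(seed)
    seen = {rng.randrange(n) for _ in range(n)}
    return 1 - len(seen) / n


if __name__ == "__main__":
    # The unseen fraction does not shrink as n grows.
    for n in (100, 10_000, 100_000):
        print(n, round(unseen_fraction(n), 3))
```

Under a skewed (for example Zipf-like) popularity the exact fraction differs, but the qualitative point is the same: with O(n) samples spread over O(n) types, many types are observed only a handful of times or not at all.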
We characterize the performance of any scheme that separates learning and placement (i.e., one that uses a portion of the demands to estimate the demand statistics and then uses that estimate to place content for the remaining demands), showing that it is order-wise strictly suboptimal. We then study a simple adaptive scheme, which myopically attempts to store the most recently requested content on idle servers, and show that it outperforms schemes that separate learning and placement. Our results also generalize to the setting where the demand statistics change with time. Overall, our results demonstrate that separating the estimation of demand from the subsequent use of that estimate is strictly suboptimal.
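The two families of schemes described above can be contrasted in a toy simulation. The sketch below is not the paper's model or algorithm: the slotted arrivals, fixed service time, loss of unmatched demands, Zipf-like popularity with exponent `beta`, one stored type per server, and proportional placement after the learning phase are all assumptions chosen for illustration, and the names `simulate`, `learn_frac`, and the policy labels are hypothetical.

```python
import random
from collections import Counter


def simulate(n, policy, learn_frac=0.25, beta=0.8, seed=0):
    """Toy loss model: fraction of demands served by a matching idle server.

    All modelling choices here are illustrative assumptions.
    """
    rng = random.Random(seed)
    m = n                       # number of content types: O(n)
    total = 4 * n               # total number of demands: O(n)
    service = max(1, n // 2)    # steps a server stays busy per served demand
    weights = [1.0 / (i + 1) ** beta for i in range(m)]  # Zipf-like popularity
    demands = rng.choices(range(m), weights=weights, k=total)

    stored = [rng.randrange(m) for _ in range(n)]  # one type per server: O(1)
    busy_until = [0] * n
    counts = Counter()
    learn_end = max(1, int(learn_frac * total))
    served = 0

    for t, c in enumerate(demands):
        if policy == "learn_then_place":
            if t < learn_end:
                counts[c] += 1  # learning phase: record the observed demand
            elif t == learn_end:
                # Placement phase: replicate types in proportion to the
                # empirical counts, then keep this placement fixed.
                types = list(counts)
                stored = rng.choices(
                    types, weights=[counts[x] for x in types], k=n
                )

        idle = [s for s in range(n) if busy_until[s] <= t]
        match = next((s for s in idle if stored[s] == c), None)
        if match is not None:
            busy_until[match] = t + service
            served += 1
        elif policy == "myopic" and idle:
            # Myopic rule: copy the just-requested type onto one idle server.
            stored[rng.choice(idle)] = c

    return served / total


if __name__ == "__main__":
    for policy in ("learn_then_place", "myopic"):
        print(policy, round(simulate(200, policy), 3))
```

The numbers this prints depend entirely on the assumed parameters and should not be read as reproducing the paper's order-wise results; the sketch only shows the structural difference between freezing a placement after a learning phase and updating placement on every unmatched demand.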