DOI: 10.1145/2591971.2591978
research-article

Serving content with unknown demand: the high-dimensional regime

Published: 16 June 2014

ABSTRACT

In this paper we look at content placement in the high-dimensional regime: there are n servers and O(n) distinct types of content. Each server can store and serve O(1) types at any given time. Demands for these content types arrive and have to be served in an online fashion; over time, there are a total of O(n) of these demands. We consider the algorithmic task of content placement: determining which types of content should be on which server at any given time, in the setting where the demand statistics (i.e., the relative popularity of each type of content) are not known a priori but have to be inferred from the very demands we are trying to satisfy. This is the high-dimensional regime because this scaling (everything being O(n)) prevents consistent estimation of demand statistics; it models many modern settings where large numbers of users, servers, and videos/webpages interact in this way.
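
To make the estimation obstacle concrete, the following is a minimal sketch, not taken from the paper, that draws demands for n content types and counts how many types are never requested. It assumes uniform popularity purely for simplicity (the function name `unseen_fraction` and all parameters are illustrative). Under that assumption, with c·n demands a fraction of roughly e^(-c) of the types is expected to go unobserved, so the empirical frequencies cannot pin down the full popularity profile when the number of demands is only proportional to n.

```python
import random
from collections import Counter


def unseen_fraction(n_types, n_demands, seed=0):
    """Fraction of content types that receive no demand at all.

    Assumption (illustrative only): every type is equally popular.
    """
    rng = random.Random(seed)
    # Count how often each type is requested.
    counts = Counter(rng.randrange(n_types) for _ in range(n_demands))
    # Types absent from the counter were never requested.
    return 1.0 - len(counts) / n_types


if __name__ == "__main__":
    n = 10_000
    for c in (1, 2, 20):
        frac = unseen_fraction(n, c * n)
        print(f"demands = {c} * n: unseen fraction = {frac:.4f}")
```

With demands proportional to n, the unseen fraction stays bounded away from zero as n grows; only when the number of demands grows much faster than n does every type get sampled often enough for its frequency to be estimated reliably.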

We characterize the performance of any scheme that separates learning and placement (i.e., one that uses a portion of the demands to gain some estimate of the demand statistics, and then uses that estimate for the remaining demands), showing it is order-wise strictly suboptimal. We then study a simple adaptive scheme, which myopically attempts to store the most recently requested content on idle servers, and show it outperforms schemes that separate learning and placement. Our results also generalize to the setting where the demand statistics change with time. Overall, our results demonstrate that separating the estimation of demand from the subsequent use of that estimate is strictly suboptimal.
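
The myopic idea can be illustrated with a toy simulation. This is a sketch under assumptions that are not in the abstract: time is slotted with one demand per slot, each server stores exactly one content type, a service occupies a server for a fixed number of slots, popularity is Zipf-like, and a demand that finds no idle server holding its content is counted as unserved. The function name `simulate_myopic` and its parameters are hypothetical, and the paper's actual model and policy may differ in these details.

```python
import random
from itertools import accumulate


def simulate_myopic(n_servers, n_types, n_demands, service_time=5, seed=0):
    """Return the fraction of demands served under a toy myopic policy."""
    rng = random.Random(seed)
    # Assumed Zipf-like popularity: weight of type i is 1 / (i + 1).
    cum_weights = list(accumulate(1.0 / (i + 1) for i in range(n_types)))
    types = range(n_types)
    # Arbitrary initial placement: one random type per server.
    stored = [rng.randrange(n_types) for _ in range(n_servers)]
    busy_until = [0] * n_servers  # slot at which each server is idle again
    served = 0

    for t in range(n_demands):
        c = rng.choices(types, cum_weights=cum_weights)[0]
        idle = [s for s in range(n_servers) if busy_until[s] <= t]
        holders = [s for s in idle if stored[s] == c]

        if holders:
            # Serve the demand on an idle server that already holds c.
            s = holders.pop(0)
            busy_until[s] = t + service_time
            idle.remove(s)
            served += 1

        # Myopic step: if no idle server still holds c, overwrite the
        # content of one idle server (chosen at random here) with c.
        if not holders and idle:
            stored[rng.choice(idle)] = c

    return served / n_demands


if __name__ == "__main__":
    print(simulate_myopic(n_servers=50, n_types=100, n_demands=5000))
```

The policy never estimates popularity explicitly: recently requested types are replicated onto idle servers as a side effect of the demands themselves, which is the contrast with schemes that first learn and then place.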

        Published in

        SIGMETRICS '14: The 2014 ACM international conference on Measurement and modeling of computer systems
        June 2014
        614 pages
        ISBN: 9781450327893
        DOI: 10.1145/2591971

        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        SIGMETRICS '14 Paper Acceptance Rate: 40 of 237 submissions, 17%
        Overall Acceptance Rate: 459 of 2,691 submissions, 17%