DOI: 10.1145/1400751.1400789
research-article

The stretched exponential distribution of internet media access patterns

Published: 18 August 2008

ABSTRACT

The commonly accepted Zipf-like access pattern of Web workloads is based mainly on Internet measurements taken when text-based content dominated Web traffic. However, with the dramatic increase of media traffic on the Internet, a number of studies have observed inconsistencies between the access patterns of media objects and the Zipf model. An insightful understanding of media access patterns is essential to guide Internet system design and management, including resource provisioning and performance optimization.

In this paper, we study a large variety of media workloads collected from both the client side and the server side in different media systems with different delivery methods. Through extensive analysis and modeling, we find that: (1) the object reference ranks of all these workloads follow the stretched exponential (SE) distribution despite their different media systems and delivery methods; (2) one parameter of this distribution characterizes the media file sizes well, while the other characterizes the aging of media accesses; (3) some biased measurements may lead to Zipf-like observations on media access patterns; and (4) the deviation of the media access pattern from the Zipf model in these workloads increases with the workload duration.
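The abstract does not state the functional form of the SE distribution. A minimal sketch, assuming the common parameterization in which the complementary CDF is exp(-(x/x0)^c), so that in rank form the reference count y_i of the i-th ranked object satisfies y_i^c = b - a ln(i): the functions below generate rank curves for that form and for a Zipf-like form y_i = y_1 / i^alpha. All names and parameter values (`c`, `a`, `b`, `alpha`, `n`) are hypothetical illustrations, not values fitted in the paper.

```python
import math


def se_rank_counts(n, c, a, b):
    """Counts y_i for ranks 1..n under the assumed form y_i**c = b - a*ln(i)."""
    counts = []
    for i in range(1, n + 1):
        y_pow_c = b - a * math.log(i)
        # Clamp at zero so a poorly chosen b cannot produce a negative base.
        counts.append(max(y_pow_c, 0.0) ** (1.0 / c))
    return counts


def zipf_rank_counts(n, alpha, y1):
    """Counts y_i = y1 / i**alpha for ranks 1..n (Zipf-like model)."""
    return [y1 / i ** alpha for i in range(1, n + 1)]


if __name__ == "__main__":
    n, c, a = 1000, 0.3, 1.0       # hypothetical values, chosen for illustration
    b = 1.0 + a * math.log(n)      # makes the least popular object have count 1
    se = se_rank_counts(n, c, a, b)
    zipf = zipf_rank_counts(n, alpha=1.0, y1=se[0])
    for rank in (1, 10, 100, 1000):
        print(rank, round(se[rank - 1], 1), round(zipf[rank - 1], 1))
```

Under this assumed form, an SE curve is a straight line when y^c is plotted against log rank, whereas a Zipf-like curve is a straight line when log y is plotted against log rank.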

We have further analyzed the effectiveness of media caching with a mathematical model. Compared with Web caching under the Zipf model, media caching under the SE model is far less effective unless the cache size is enormously large. This indicates that many previous studies based on a Zipf-like assumption have potentially overestimated the media caching benefit, while an effective media caching system must be able to scale its storage size to accommodate the increase of media content over a long time. Our study provides an analytical basis for applying a P2P model rather than a client-server model to build large scale Internet media delivery systems.
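The caching comparison can be explored numerically with a toy calculation. The sketch below is not the paper's mathematical model: it assumes an ideal cache that holds the k most popular objects, ignores object sizes, and measures the fraction of requests served from the cache. The SE rank form (y_i^c = b - a ln(i)) and every parameter value are assumptions made for illustration.

```python
import math


def se_counts(n, c, a, b):
    """Rank-ordered counts under the assumed SE form y_i**c = b - a*ln(i)."""
    return [max(b - a * math.log(i), 0.0) ** (1.0 / c) for i in range(1, n + 1)]


def zipf_counts(n, alpha):
    """Rank-ordered counts under a Zipf-like form y_i = 1 / i**alpha."""
    return [1.0 / i ** alpha for i in range(1, n + 1)]


def hit_ratio(counts, k):
    """Fraction of requests served by an ideal cache holding the top-k objects.

    `counts` must be sorted from most to least popular.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(counts[:k]) / total


if __name__ == "__main__":
    n = 1000                                     # hypothetical catalog size
    se = se_counts(n, c=0.3, a=1.0, b=1.0 + math.log(n))
    zipf = zipf_counts(n, alpha=1.0)
    for k in (10, 100, 500):
        print(k, round(hit_ratio(se, k), 3), round(hit_ratio(zipf, k), 3))
```

Which curve comes out higher depends on the parameters chosen; the paper's conclusion rests on its fitted workloads and its own model, not on these toy numbers.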

      Published in

        PODC '08: Proceedings of the twenty-seventh ACM symposium on Principles of distributed computing
        August 2008
        474 pages
        ISBN: 9781595939890
        DOI: 10.1145/1400751

        Copyright © 2008 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 740 of 2,477 submissions, 30%
