DOI: 10.1145/1989323.1989358
research-article

Predicting cost amortization for query services

Published: 12 June 2011

ABSTRACT

Emerging providers of online services offer access to data collections. Such data service providers need to build data structures, e.g., materialized views and indexes, to offer better performance for user query execution. The cost of such structures is charged to the user as part of the overall query service cost. To ensure the economic viability of the provider, the building and maintenance cost of new structures has to be amortized over a set of prospective query services that will use them. This work proposes a novel stochastic model that predicts the extent of cost amortization in time and number of services. The model is complemented by a novel method that regresses query traffic statistics and provides input to the prediction model. To demonstrate the effectiveness of the prediction model, we study its application to an extension of an existing economy model for the management of a cloud DBMS. A thorough experimental study shows that the prediction model ensures the economic viability of the cloud DBMS while enabling the offer of fast and cheap query services.
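
To make the notion of amortization "in time and number of services" concrete, the following is a minimal sketch and not the paper's model, which the abstract does not specify. It assumes, purely for illustration, that queries using a structure arrive as a Poisson process and that each such query contributes a fixed charge toward the structure's cost. All function and parameter names (`expected_amortization`, `prob_amortized_by`, `build_cost`, `maintenance_rate`, `charge_per_query`, `arrival_rate`) are hypothetical.

```python
import math


def expected_amortization(build_cost, maintenance_rate, charge_per_query, arrival_rate):
    """Expected (time, number of queries) needed to recover a structure's cost.

    Illustrative assumption: queries arrive at a constant average rate and
    each one pays `charge_per_query` toward the structure. Returns None when
    income never outpaces the ongoing maintenance cost.
    """
    net_income_rate = arrival_rate * charge_per_query - maintenance_rate
    if net_income_rate <= 0:
        return None
    time_needed = build_cost / net_income_rate
    return time_needed, arrival_rate * time_needed


def prob_amortized_by(build_cost, maintenance_rate, charge_per_query, arrival_rate, horizon):
    """Probability that the cost is fully amortized within `horizon` time units.

    Illustrative assumption: the number of queries in the horizon is
    Poisson-distributed with mean arrival_rate * horizon.
    """
    total_cost = build_cost + maintenance_rate * horizon
    needed = math.ceil(total_cost / charge_per_query)
    mean = arrival_rate * horizon
    # P(N >= needed) = 1 - sum_{k < needed} exp(-mean) * mean**k / k!
    term = math.exp(-mean)  # Poisson probability of exactly 0 queries
    cdf = 0.0
    for k in range(needed):
        cdf += term
        term *= mean / (k + 1)  # advance to the probability of k + 1 queries
    return max(0.0, 1.0 - cdf)


if __name__ == "__main__":
    print(expected_amortization(100, 1, 2, 3))
    print(prob_amortized_by(100, 1, 2, 3, horizon=40))
```

Under these assumptions a provider could compare the probability of amortization within a planning horizon against a risk threshold before deciding to build a structure. The paper's own stochastic model and its regression of query traffic statistics may differ substantially from this sketch.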


Published in

SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
June 2011, 1364 pages
ISBN: 9781450306614
DOI: 10.1145/1989323

Copyright © 2011 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 785 of 4,003 submissions, 20%
