ABSTRACT
Emerging providers of online services offer access to data collections. Such data service providers need to build data structures, e.g., materialized views and indexes, to improve the performance of user query execution. The cost of these structures is charged to the user as part of the overall query service cost. To ensure the economic viability of the provider, the building and maintenance cost of new structures has to be amortized over a set of prospective query services that will use them. This work proposes a novel stochastic model that predicts the extent of cost amortization in terms of time and number of services. The model is complemented by a novel method that regresses query traffic statistics and provides input to the prediction model. To demonstrate the effectiveness of the prediction model, we study its application to an extension of an existing economy model for the management of a cloud DBMS. A thorough experimental study shows that the prediction model ensures the economic viability of the cloud DBMS while enabling it to offer fast and cheap query services.
Index Terms
- Predicting cost amortization for query services