
Cost models for overlapping and multiversion structures

Published: 01 September 2002

Abstract

Overlapping and multiversion techniques are two popular frameworks that transform an ephemeral index into a multiple logical-tree structure in order to support versioning databases. Although both frameworks have produced numerous efficient indexing methods, their performance analysis is rather limited; as a result, there is no clear understanding of how the alternative structures behave or of which one is best for given data and query characteristics. Furthermore, query optimization based on these methods is currently impossible. These are serious problems because overlapping and multiversion techniques are being incorporated into several traditional (e.g., financial) and emerging (e.g., spatiotemporal) applications. In this article, we reduce the performance analysis of overlapping and multiversion structures to that of the corresponding ephemeral structures, thus simplifying the problem significantly. This reduction leads to accurate cost models that predict the sizes of the trees, the number of node/page accesses, and the selectivity of queries. Furthermore, the models offer significant insight into the behavior of the structures and provide guidelines for selecting the most appropriate method in practice. Extensive experimentation shows that the proposed models yield errors below 5% for uniform data and below 15% for nonuniform data.



Published in

ACM Transactions on Database Systems, Volume 27, Issue 3
September 2002
114 pages
ISSN: 0362-5915
EISSN: 1557-4644
DOI: 10.1145/581751

Copyright © 2002 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States
