Abstract
Overlapping and multiversion techniques are two popular frameworks that transform an ephemeral index into a multiple logical-tree structure in order to support versioning databases. Although both frameworks have produced numerous efficient indexing methods, their performance analysis is rather limited; as a result, there is no clear understanding of the behavior of the alternative structures or of how to choose the best one for given data and query characteristics. Furthermore, query optimization based on these methods is currently impossible. These are serious problems because overlapping and multiversion techniques are incorporated in several traditional (e.g., financial) and emerging (e.g., spatiotemporal) applications. In this article, we reduce performance analysis of overlapping and multiversion structures to that of the corresponding ephemeral structures, thus simplifying the problem significantly. This reduction leads to accurate cost models that predict the sizes of the trees, the node/page accesses, and the selectivity of queries. Furthermore, the models offer significant insight into the behavior of the structures and provide guidelines for selecting the most appropriate method in practice. Extensive experimentation shows that the proposed models yield errors below 5% for uniform data and below 15% for nonuniform data.
Index Terms
- Cost models for overlapping and multiversion structures