Abstract
Indexing schemes are redundant structures offered by DBMSs to speed up complex queries. Two types of indices are available: mono-attribute indices (B-tree, bitmap, hash, etc.) and multi-attribute indices (join indices, bitmap join indices). In relational data warehouses, bitmap join indices (BJIs) are bitmap indices that optimize star join queries through bit-wise operations. They can be used to avoid actual joins of tables, or to greatly reduce the volume of data that must be joined, by executing restrictions in advance. BJIs are defined using non-key dimension attributes and fact key attributes. Selecting these indices is difficult because a large number of candidate attributes (defined on dimension tables) could participate in building them. To reduce this complexity, we propose an approach that first prunes the search space using data mining techniques and then, on the reduced search space, uses a greedy algorithm to select BJIs that minimize the cost of executing a set of queries while satisfying a storage constraint. The main peculiarity of our pruning approach, compared to existing ones (which use only the frequency with which indexable attributes appear in queries as a pruning metric), is that it also uses other parameters, such as the size of the dimension tables, the length of each tuple, and the size of a disk page.
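To make the idea of answering a star join through bit-wise operations concrete, the following is a minimal sketch of a bitmap join index over a toy star schema. It is an illustration only, not the authors' implementation: the tables (`customers`, `products`, `sales`), the helper names (`build_bji`, `rows_from_bitmap`), and the use of Python integers as bitmaps are all assumptions made for this example.

```python
# Toy star schema (hypothetical names and data, for illustration only).
# Dimension tables map a key to one non-key attribute value.
customers = {1: "Paris", 2: "Lyon", 3: "Paris"}   # customer_id -> city
products = {1: "book", 2: "pen"}                  # product_id -> type

# Fact rows: (customer_id, product_id, amount).
sales = [(1, 1, 10), (2, 1, 20), (3, 2, 5), (1, 2, 7), (2, 2, 3)]


def build_bji(fact_rows, fk_pos, dimension):
    """Build one bitmap per dimension attribute value.

    Bit i of a bitmap is set when fact row i references a dimension
    row carrying that attribute value. Python ints serve as bitmaps.
    """
    index = {}
    for i, row in enumerate(fact_rows):
        value = dimension[row[fk_pos]]
        index[value] = index.get(value, 0) | (1 << i)
    return index


def rows_from_bitmap(bitmap):
    """Return the positions of the set bits, lowest first."""
    positions = []
    i = 0
    while bitmap:
        if bitmap & 1:
            positions.append(i)
        bitmap >>= 1
        i += 1
    return positions


# One index per indexed non-key dimension attribute.
city_bji = build_bji(sales, 0, customers)
type_bji = build_bji(sales, 1, products)

# "Sales of pens to customers in Paris": both restrictions are
# evaluated with a single bit-wise AND, without joining any tables.
matching = city_bji["Paris"] & type_bji["pen"]
rows = rows_from_bitmap(matching)
total = sum(sales[i][2] for i in rows)
```

Only the fact rows selected by the combined bitmap are read, which is the sense in which restrictions are executed in advance. The storage cost of such an index grows with the number of distinct attribute values, which is why the choice of indexed attributes is subject to a storage constraint.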
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bellatreche, L., Missaoui, R., Necir, H., Drias, H. (2007). Selection and Pruning Algorithms for Bitmap Index Selection Problem Using Data Mining. In: Song, I.Y., Eder, J., Nguyen, T.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2007. Lecture Notes in Computer Science, vol 4654. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74553-2_20
DOI: https://doi.org/10.1007/978-3-540-74553-2_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74552-5
Online ISBN: 978-3-540-74553-2
eBook Packages: Computer Science, Computer Science (R0)