
Selection and Pruning Algorithms for Bitmap Index Selection Problem Using Data Mining

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4654)

Abstract

Indexing schemes are redundant structures offered by DBMSs to speed up complex queries. Two types of indices are available: mono-attribute indices (B-tree, bitmap, hash, etc.) and multi-attribute indices (join indices, bitmap join indices). In relational data warehouses, bitmap join indices (BJIs) are bitmap indices that optimize star join queries through bit-wise operations. They can be used to avoid actual joins of tables, or to greatly reduce the volume of data that must be joined, by executing restrictions in advance. BJIs are defined using non-key dimension attributes and fact key attributes. Selecting these indices is difficult because a large number of candidate attributes (defined on dimension tables) could participate in building them. To reduce this complexity, we propose an approach that first prunes the search space using data mining techniques and then, on the reduced search space, uses a greedy algorithm to select BJIs that minimize the cost of executing a set of queries while satisfying a storage constraint. The main peculiarity of our pruning approach, compared to existing ones (which use only the appearance frequencies of indexable attributes in queries as a pruning metric), is that it uses other parameters such as the size of the dimension tables, the length of each tuple, and the size of a disk page.
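The two-phase idea in the abstract (prune the candidate attributes, then select greedily under a storage budget) can be illustrated with a minimal sketch. This is not the paper's algorithm: the `Candidate` fields, the attribute names, and all numbers are hypothetical, pruning here uses only appearance frequency (the baseline metric the abstract contrasts against, not the richer metric the authors propose), and each candidate carries a fixed benefit estimate instead of being re-evaluated by a cost model at every step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate indexable attribute. All fields are illustrative assumptions."""
    name: str
    frequency: int   # number of workload queries that reference the attribute
    storage: int     # estimated index size in pages (must be > 0)
    benefit: float   # estimated reduction in query cost if indexed


def prune(candidates, min_frequency):
    """Phase 1: keep attributes referenced by at least min_frequency queries."""
    return [c for c in candidates if c.frequency >= min_frequency]


def greedy_select(candidates, storage_budget):
    """Phase 2: take candidates in order of benefit per page while they fit."""
    ranked = sorted(candidates, key=lambda c: c.benefit / c.storage, reverse=True)
    chosen, used = [], 0
    for c in ranked:
        if used + c.storage <= storage_budget:
            chosen.append(c)
            used += c.storage
    return chosen, used


# Hypothetical workload statistics for four dimension attributes.
candidates = [
    Candidate("Customer.city", frequency=8, storage=40, benefit=120.0),
    Candidate("Product.category", frequency=5, storage=10, benefit=50.0),
    Candidate("Time.month", frequency=6, storage=30, benefit=60.0),
    Candidate("Store.region", frequency=1, storage=5, benefit=100.0),
]

pruned = prune(candidates, min_frequency=2)
chosen, used = greedy_select(pruned, storage_budget=60)
print([c.name for c in chosen], used)
```

In this toy run `Store.region` is discarded by the frequency threshold even though its estimated benefit is high, which is the kind of loss a frequency-only pruning metric can cause and one reason to consider additional parameters such as table size, tuple length, and page size.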




Editor information

Il Yeal Song, Johann Eder, Tho Manh Nguyen


Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bellatreche, L., Missaoui, R., Necir, H., Drias, H. (2007). Selection and Pruning Algorithms for Bitmap Index Selection Problem Using Data Mining. In: Song, I.Y., Eder, J., Nguyen, T.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2007. Lecture Notes in Computer Science, vol 4654. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74553-2_20


  • DOI: https://doi.org/10.1007/978-3-540-74553-2_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74552-5

  • Online ISBN: 978-3-540-74553-2

  • eBook Packages: Computer Science, Computer Science (R0)
