Skip to main content

\(\mathcal{F}\)&\(\mathcal{A}\): A Methodology for Effectively and Efficiently Designing Parallel Relational Data Warehouses on Heterogenous Database Clusters

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6263))

Abstract

In this paper we propose a comprehensive methodology for designing Parallel Relational Data Warehouses (PRDW) over database clusters, called \(\mathcal{F}\) ragmentation&\(\mathcal{A}\) llocation (\(\mathcal{F}\)&\(\mathcal{A}\)). \(\mathcal{F}\)&\(\mathcal{A}\) assumes that cluster nodes are heterogeneous in processing power and storage capacity, contrary to traditional design approaches that assume that cluster nodes are instead homogeneous, and fragmentation and allocation phases are performed in a simultaneous manner, contrary to traditional design approaches that instead perform these phases in an isolated manner. Also, a naive replication algorithm that takes into account the heterogeneous characteristics of our reference architecture is proposed. Finally, our proposal is experimentally assessed and validated against the widely-known data warehouse benchmark APB-1 release II.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Bellatreche, L., Benkrid, S.: A joint design approach of partitioning and allocation in parallel data warehouses. In: Pedersen, T.B., Mohania, M.K., Tjoa, A.M. (eds.) DAWAK 2009. LNCS, vol. 5691, pp. 99–110. Springer, Heidelberg (2009)

    Chapter  Google Scholar 

  2. Bellatreche, L., Boukhalfa, K.: An evolutionary approach to schema partitioning selection in a data warehouse environment. In: Tjoa, A.M., Trujillo, J. (eds.) DaWaK 2005. LNCS, vol. 3589, pp. 115–125. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  3. Bellatreche, L., Boukhalfa, K., Richard, P.: Data partitioning in data warehouses: Hardness study, heuristics and oracle validation. In: Song, I.-Y., Eder, J., Nguyen, T.M. (eds.) DaWaK 2008. LNCS, vol. 5182, pp. 87–96. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  4. OLAP Council. Apb-1 olap benchmark, release ii (1998), http://www.olapcouncil.org/research/bmarkly.htm

  5. Cuzzocrea, A., Darmont, J., Mahboubi, H.: Fragmenting very large XML data warehouses via k-means clustering algorithm. International Journal of Business Intelligence and Data Mining 4(3-4), 301–328 (2009)

    Article  Google Scholar 

  6. Cuzzocrea, A., Kumar, A., Russo, V.: Experimenting the query performance of a grid-based sensor network data warehouse. In: Hameurlain, A. (ed.) Globe 2008. LNCS, vol. 5187, pp. 105–119. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  7. Cuzzocrea, A., Serafino, P.: LCS-hist: taming massive high-dimensional data cube compression. In: 12th International Conference on Extending Database Technology, EDBT 2009 (2009)

    Google Scholar 

  8. Davis, L.D.: Bit-climbing, representational bias, and test suite design. In: Proceedings of the 4th International Conference on Genetic Algorithms (ICGE 1991), pp. 18–23 (March 1991)

    Google Scholar 

  9. DeWitt, D.J.D., Madden, S., Stonebraker, M.: How to build a high-performance data warehouse, http://db.lcs.mit.edu/madden/high_perf.pdf

  10. Eadon, G., Chong, E.I., Shankar, S., Raghavan, A., Srinivasan, J., Das, S.: Supporting table partitioning by reference in oracle. In: SIGMOD 2008 (2008)

    Google Scholar 

  11. Furtado, P.: Experimental evidence on partitioning in parallel data warehouses. In: DOLAP, pp. 23–30 (2004)

    Google Scholar 

  12. Gupta, H.: Selection and maintenance of views in a data warehouse. Ph.d. thesis, Stanford University (September 1999)

    Google Scholar 

  13. Karlapalem, K., Pun, N.M.: Query driven data allocation algorithms for distributed database systems. In: Tjoa, A.M. (ed.) DEXA 1997. LNCS, vol. 1308, pp. 347–356. Springer, Heidelberg (1997)

    Chapter  Google Scholar 

  14. Lima, A.B., Furtado, C., Valduriez, P., Mattoso, M.: Improving parallel olap query processing in database clusters with data replication. Distributed and Parallel Database Journal (2009) (to appear)

    Google Scholar 

  15. Navathe, S.B., Ra, M.: Vertical partitioning for database design: a graphical algorithm. In: ACM SIGMOD, pp. 440–450 (1989)

    Google Scholar 

  16. Özsu, M.T., Valduriez, P.: Principles of Distributed Database Systems, 2nd edn. Prentice Hall, Englewood Cliffs (1999)

    Google Scholar 

  17. Röhm, U., Böhm, K., Schek, H.-J.: Olap query routing and physical design in a database cluster. In: Zaniolo, C., Grust, T., Scholl, M.H., Lockemann, P.C. (eds.) EDBT 2000. LNCS, vol. 1777, pp. 254–268. Springer, Heidelberg (2000)

    Chapter  Google Scholar 

  18. Röhm, U., Böhm, K., Schek, H.-J.: Cache-aware query routing in a cluster of databases. In: Proceedings of the International Conference on Data Engineering (ICDE), pp. 641–650 (2001)

    Google Scholar 

  19. Saccà, D., Wiederhold, G.: Database partitioning in a cluster of processors. ACM Transactions on Database Systems 10(1), 29–56 (1985)

    Article  MATH  Google Scholar 

  20. Sarawagi, S.: Indexing olap data. IEEE Data Engineering Bulletin 20(1), 36–43 (1997)

    Google Scholar 

  21. Stöhr, T., Märtens, H., Rahm, E.: Multi-dimensional database allocation for parallel data warehouses. In: Proceedings of the International Conference on Very Large Databases, pp. 273–284 (2000)

    Google Scholar 

  22. Stöhr, T., Rahm, E.: Warlock: A data allocation tool for parallel warehouses. In: Proceedings of the International Conference on Very Large Databases, pp. 721–722 (2001)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bellatreche, L., Cuzzocrea, A., Benkrid, S. (2010). \(\mathcal{F}\)&\(\mathcal{A}\): A Methodology for Effectively and Efficiently Designing Parallel Relational Data Warehouses on Heterogenous Database Clusters. In: Bach Pedersen, T., Mohania, M.K., Tjoa, A.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2010. Lecture Notes in Computer Science, vol 6263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15105-7_8

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-15105-7_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15104-0

  • Online ISBN: 978-3-642-15105-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics