ScaMMDB: Facing Challenge of Mass Data Processing with MMDB

  • Conference paper
Advances in Web and Network Technologies, and Information Management (APWeb 2009, WAIM 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5731)

Abstract

Main memory databases (MMDBs) deliver much higher performance than disk-resident databases (DRDBs), but hardware architecture limits how far memory capacity can scale. In OLAP applications, main memory capacity is small relative to the data volume and is hard to extend. This paper proposes the ScaMMDB prototype to address MMDB scalability. A multi-node structure lets the system adjust its total main memory capacity dynamically as nodes join or leave. ScaMMDB is based on the open-source MonetDB, a typical column-store MMDB; a column data transmission module, a column data distribution module, and a query execution plan rewriting module are developed directly in MonetDB. Any node in ScaMMDB can respond to user requests, and SQL statements are transformed automatically into extended column operation commands, comprising local commands and remote call commands. An operation on a column is pushed to the node where that column is stored; the current node acts as a temporary mediator that issues the remote calls and assembles the results of the per-column operations. ScaMMDB is a test bed for MMDB scalability and can be extended to an MMDB cluster, an MMDB replication server, or even a peer-to-peer OLAP server in further applications.
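The mediator idea in the abstract can be illustrated with a small sketch. It is not ScaMMDB's or MonetDB's actual code or API: the names `ColumnOp`, `Command`, `rewrite_plan`, and `redistribute` are hypothetical, and the round-robin placement is an assumption chosen only for simplicity, since the abstract does not say how columns are distributed. The sketch shows a plan of column operations being rewritten into local commands and remote call commands according to where each column is stored, and the placement being recomputed when a node joins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnOp:
    """One column-level operation in a hypothetical execution plan."""
    op: str      # e.g. "select" or "sum"
    column: str  # e.g. "t.a"


@dataclass(frozen=True)
class Command:
    """A rewritten command: local, or a remote call to the owning node."""
    kind: str    # "local" or "remote"
    node: str
    op: str
    column: str


def redistribute(columns, nodes):
    """Toy round-robin placement (an assumption, not the paper's strategy).

    Recomputed from scratch whenever the set of nodes changes.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    ordered = sorted(nodes)
    return {col: ordered[i % len(ordered)]
            for i, col in enumerate(sorted(columns))}


def rewrite_plan(plan, placement, current_node):
    """Push each column operation to the node that stores the column.

    The current node plays the mediator: operations on columns it owns
    become local commands, all others become remote call commands.
    """
    commands = []
    for step in plan:
        owner = placement[step.column]  # KeyError if the column is unplaced
        kind = "local" if owner == current_node else "remote"
        commands.append(Command(kind, owner, step.op, step.column))
    return commands


if __name__ == "__main__":
    placement = redistribute(["t.a", "t.b", "t.c"], ["n1", "n2"])
    plan = [ColumnOp("select", "t.a"), ColumnOp("sum", "t.b")]
    for cmd in rewrite_plan(plan, placement, current_node="n1"):
        print(cmd)
```

A real system would also have to move column data when the placement changes and combine the partial results at the mediator; both are left out here to keep the rewriting step visible.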

Supported by the National Natural Science Foundation of China under Grant Nos. 60473069 and 60496325; the joint research of HP Lab China and the Information School of Renmin University (Large Scale Data Management); the joint research of the Beijing Municipal Commission of Education and the Information School of Renmin University (Main Memory OLAP Server); and the Renmin University of China Graduate Science Foundation under Grant No. 08XNG040.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, Y., Xiao, Y., Wang, Z., Ji, X., Huang, Y., Wang, S. (2009). ScaMMDB: Facing Challenge of Mass Data Processing with MMDB. In: Chen, L., et al. Advances in Web and Network Technologies, and Information Management. APWeb 2009, WAIM 2009. Lecture Notes in Computer Science, vol 5731. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03996-6_1

  • DOI: https://doi.org/10.1007/978-3-642-03996-6_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03995-9

  • Online ISBN: 978-3-642-03996-6

  • eBook Packages: Computer Science (R0)
