ScaMMDB: Facing Challenge of Mass Data Processing with MMDB

Zhang, Yansong; Xiao, Yanqin; Wang, Zhanwei; Ji, Xiaodong; Huang, Yunkui; Wang, Shan

doi:10.1007/978-3-642-03996-6_1

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5731)

Abstract

A main memory database (MMDB) performs far better than a disk resident database (DRDB), but hardware architecture limits how far memory capacity can scale. In OLAP applications, main memory capacity is small compared with the data volume and is hard to extend. This paper proposes the ScaMMDB prototype, which targets the scalability of MMDB. A multi-node structure lets the system adjust its total main memory capacity dynamically as nodes join or leave. ScaMMDB is based on the open source MonetDB, a typical column-storage MMDB; a column data transmission module, a column data distribution module and a query execution plan rewriting module are developed directly in MonetDB. Any node in ScaMMDB can respond to user requests, and SQL statements are transformed automatically into extended column operation commands, comprising local commands and remote call commands. An operation on a given column is pushed to the node where that column is stored; the current node acts as a temporary mediator that issues the remote calls and assembles the results of the column operations. ScaMMDB is a test bed for MMDB scalability, and it can be extended into an MMDB cluster, an MMDB replication server, or even a peer-to-peer OLAP server for further applications.
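
The plan rewriting step described in the abstract can be illustrated with a minimal sketch. It is not ScaMMDB's or MonetDB's actual code: the `ColumnOp` type, the `rewrite_plan` function, the catalog layout and the node and column names are all hypothetical, chosen only to show how each column operation might be tagged as a local command or a remote call according to where the column is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnOp:
    """One column-level operation in a query plan (hypothetical type)."""
    op: str       # e.g. "select" or "sum"
    column: str   # qualified column name, e.g. "lineorder.revenue"


def rewrite_plan(plan, catalog, current_node):
    """Tag each operation as local, or as a remote call to the owning node.

    `catalog` maps a column name to the node holding that column in memory.
    The node running this function plays the mediator role.
    """
    rewritten = []
    for step in plan:
        owner = catalog[step.column]
        kind = "local" if owner == current_node else "remote"
        rewritten.append((kind, owner, step))
    return rewritten


# Assumed column placement, for illustration only.
catalog = {
    "date.year": "node1",
    "lineorder.discount": "node2",
    "lineorder.revenue": "node1",
}
plan = [
    ColumnOp("select", "date.year"),
    ColumnOp("select", "lineorder.discount"),
    ColumnOp("sum", "lineorder.revenue"),
]
rewritten = rewrite_plan(plan, catalog, current_node="node1")
for kind, owner, step in rewritten:
    print(kind, owner, step.op, step.column)
```

In this sketch the mediator would run the `local` steps itself, send each `remote` step to the node named in the tuple, and then assemble the per-column results.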

Supported by the National Natural Science Foundation of China under Grant Nos. 60473069 and 60496325; the joint research of HP Lab China and Information School of Renmin University (Large Scale Data Management); the joint research of Beijing Municipal Commission of Education and Information School of Renmin University (Main Memory OLAP Server); and the Renmin University of China Graduate Science Foundation, No. 08XNG040.

Author information

Authors and Affiliations

Key Laboratory of the Ministry of Education for Data Engineering and Knowledge Engineering, Renmin University of China, Beijing, 100872, China
Yansong Zhang, Yanqin Xiao, Zhanwei Wang, Xiaodong Ji, Yunkui Huang & Shan Wang
School of Information, Renmin University of China, Beijing, 100872, China
Yansong Zhang, Yanqin Xiao, Zhanwei Wang, Xiaodong Ji, Yunkui Huang & Shan Wang
Department of Computer Science, Harbin Financial College, Harbin, 150030, China
Yansong Zhang
Computer Center, Hebei University, Baoding, Hebei, 071002, China
Yanqin Xiao

Editor information

Editors and Affiliations

Hong Kong University of Science and Technology, Hong Kong
Lei Chen
Swinburne University of Technology, Melbourne, Australia
Chengfei Liu
Renmin University of China, China
Xiao Zhang
Renmin University of China, China
Shan Wang
Dept. of Industrial Economics and Technology Management, NTNU, Norway
Darijus Strasunskas
NTNU, Norway
Stein L. Tomassen
AOL, China
Jinghai Rao
SAP Research China, China
Wen-Syan Li
Comp. Sci. and Eng. Dept., Arizona State University, 85287, Tempe, AZ
K. Selçuk Candan
Dickson Computer Systems, 7A Victory Avenue 4th floor, Homantin, Kln, P.O. Box, Hong Kong
Dickson K. W. Chiu
Zhejiang Gongshang University, China
Yi Zhuang
University of Colorado at Boulder, USA
Clarence A. Ellis
Kyonggi University, Korea
Kwang-Hoon Kim

Cite this paper

Zhang, Y., Xiao, Y., Wang, Z., Ji, X., Huang, Y., Wang, S. (2009). ScaMMDB: Facing Challenge of Mass Data Processing with MMDB. In: Chen, L., et al. Advances in Web and Network Technologies, and Information Management. APWeb WAIM 2009. Lecture Notes in Computer Science, vol 5731. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03996-6_1

DOI: https://doi.org/10.1007/978-3-642-03996-6_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03995-9
Online ISBN: 978-3-642-03996-6
eBook Packages: Computer Science, Computer Science (R0)
