research-article

Efficient distributed memory management with RDMA and caching

Authors:
Qingchao Cai, National University of Singapore
Wentian Guo, National University of Singapore
Hao Zhang, National University of Singapore
Divyakant Agrawal, University of California at Santa Barbara
Gang Chen, Zhejiang University
Beng Chin Ooi, National University of Singapore
Kian-Lee Tan, National University of Singapore
Yong Meng Teo, National University of Singapore
Sheng Wang, National University of Singapore

Proceedings of the VLDB Endowment, Volume 11, Issue 11, pp. 1604–1617. https://doi.org/10.14778/3236187.3236209

Published: 01 July 2018

Abstract

Recent advances in high-performance network interconnects have significantly narrowed the performance gap between intra-node and inter-node communication, opening up opportunities for distributed memory platforms to enforce cache coherence among distributed nodes. To this end, we propose GAM, an efficient distributed in-memory platform that provides a directory-based cache coherence protocol over remote direct memory access (RDMA). GAM manages the free memory distributed among multiple nodes to provide a unified memory model, and supports a set of user-friendly APIs for memory operations. To remove writes from critical execution paths, GAM allows a write to be reordered with the reads and writes that follow it, and hence enforces partial store order (PSO) memory consistency. A lightweight logging scheme is designed to provide fault tolerance in GAM. We further build a transaction engine and a distributed hash table (DHT) atop GAM to show the ease of use and applicability of the provided APIs. Finally, we run extensive micro-benchmarks to evaluate the read/write/lock performance of GAM under various workloads, and macro-benchmarks on the transaction engine and the DHT. The results show that GAM outperforms existing distributed memory platforms.
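
To make the idea of a directory-based cache coherence protocol more concrete, the following is a minimal single-process sketch in Python. It is not GAM's API or its actual protocol: the names `Cluster` and `DirectoryEntry`, the write-invalidate policy, and the message counter standing in for RDMA transfers are all assumptions made for illustration. PSO write reordering and logging are not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DirectoryEntry:
    """Home-node bookkeeping for one memory block (hypothetical layout)."""
    value: int = 0                                # last value written back to the home node
    sharers: set = field(default_factory=set)     # nodes holding a read-only copy
    owner: Optional[int] = None                   # node holding the exclusive (dirty) copy


class Cluster:
    """Toy write-invalidate directory protocol; network traffic is only counted."""

    def __init__(self, num_nodes: int):
        self.directory = {}                               # addr -> DirectoryEntry
        self.caches = [dict() for _ in range(num_nodes)]  # per-node cache: addr -> value
        self.messages = 0                                 # simulated network messages

    def _entry(self, addr: int) -> DirectoryEntry:
        return self.directory.setdefault(addr, DirectoryEntry())

    def read(self, node: int, addr: int) -> int:
        cache = self.caches[node]
        if addr in cache:
            return cache[addr]            # cache hit: served locally, no message
        entry = self._entry(addr)
        self.messages += 1                # read request to the home node
        if entry.owner is not None:
            # Fetch the dirty copy from the owner and downgrade it to a sharer.
            entry.value = self.caches[entry.owner][addr]
            entry.sharers.add(entry.owner)
            entry.owner = None
            self.messages += 1
        entry.sharers.add(node)
        cache[addr] = entry.value
        return entry.value

    def write(self, node: int, addr: int, value: int) -> None:
        entry = self._entry(addr)
        if entry.owner != node:
            self.messages += 1            # ownership request to the home node
            if entry.owner is not None:
                # The previous owner gives up its exclusive copy.
                self.caches[entry.owner].pop(addr, None)
                self.messages += 1
            for sharer in entry.sharers - {node}:
                # Invalidate every other cached copy before the write proceeds.
                self.caches[sharer].pop(addr, None)
                self.messages += 1
            entry.sharers.clear()
            entry.owner = node
        self.caches[node][addr] = value   # write lands in the owner's cache


if __name__ == "__main__":
    cluster = Cluster(num_nodes=3)
    cluster.read(0, 0x10)                 # node 0 caches the block
    cluster.write(2, 0x10, 42)            # node 2 takes ownership, node 0 is invalidated
    print(cluster.read(0, 0x10), cluster.messages)
```

In this sketch, repeated reads of a cached block cost no messages, while a write pays once to invalidate other copies. That trade-off is what makes caching attractive when remote access is cheap but still slower than local memory.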


Published in
Proceedings of the VLDB Endowment, Volume 11, Issue 11, July 2018, 507 pages
ISSN: 2150-8097
Editors: Sihem Amer-Yahia (University of Grenoble Alpes, CNRS), Jian Pei (Simon Fraser University)
Publisher: VLDB Endowment