
Efficient distributed memory management with RDMA and caching

Published: 01 July 2018

Abstract

Recent advances in high-performance network interconnects have significantly narrowed the performance gap between intra-node and inter-node communication, opening up opportunities for distributed memory platforms to enforce cache coherence across nodes. To this end, we propose GAM, an efficient distributed in-memory platform that provides a directory-based cache coherence protocol over remote direct memory access (RDMA). GAM manages the free memory distributed among multiple nodes to provide a unified memory model, and supports a set of user-friendly APIs for memory operations. To remove writes from critical execution paths, GAM allows a write to be reordered with the reads and writes that follow it, and hence enforces partial store order (PSO) memory consistency. A lightweight logging scheme provides fault tolerance in GAM. We further build a transaction engine and a distributed hash table (DHT) atop GAM to demonstrate the ease of use and applicability of the provided APIs. Finally, we conduct an extensive microbenchmark to evaluate the read/write/lock performance of GAM under various workloads, and a macrobenchmark on the transaction engine and DHT. The results show that GAM outperforms existing distributed memory platforms.
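
To make the idea of a directory-based cache coherence protocol concrete, the following is a minimal single-process sketch in Python. It is not GAM's protocol or API: the class names, the three states, and the message names (`invalidate`, `writeback`, `writeback_invalidate`) are assumptions chosen for illustration, and network transfers such as RDMA operations are replaced by entries appended to a list. It only shows the general technique, in which a home-node directory tracks which nodes cache a line and invalidates or downgrades those copies before granting access.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class State(Enum):
    UNSHARED = "unshared"  # no remote node caches the line
    SHARED = "shared"      # one or more nodes hold read-only copies
    DIRTY = "dirty"        # exactly one node holds a writable copy


@dataclass
class DirectoryEntry:
    state: State = State.UNSHARED
    sharers: set = field(default_factory=set)  # node ids with read-only copies
    owner: Optional[int] = None                # node id holding the dirty copy


class Directory:
    """Toy home-node directory (hypothetical, not GAM's implementation)."""

    def __init__(self):
        self.entries = {}   # cache line id -> DirectoryEntry
        self.messages = []  # simulated protocol messages, in send order

    def _entry(self, line):
        return self.entries.setdefault(line, DirectoryEntry())

    def read(self, node, line):
        e = self._entry(line)
        if e.state is State.DIRTY:
            if e.owner == node:
                return  # the requester already holds the latest copy
            # The owner writes the data back and keeps a read-only copy.
            self.messages.append(("writeback", e.owner, line))
            e.sharers = {e.owner}
            e.owner = None
        e.sharers.add(node)
        e.state = State.SHARED

    def write(self, node, line):
        e = self._entry(line)
        if e.state is State.DIRTY:
            if e.owner == node:
                return  # the requester already has exclusive access
            # The previous owner writes back and drops its copy.
            self.messages.append(("writeback_invalidate", e.owner, line))
        # Every other read-only copy must be invalidated before the write.
        for sharer in sorted(e.sharers - {node}):
            self.messages.append(("invalidate", sharer, line))
        e.sharers = set()
        e.owner = node
        e.state = State.DIRTY
```

In this sketch, if nodes 1 and 2 read a line and node 3 then writes it, the directory issues one invalidation to each reader before recording node 3 as the owner. A later read by node 1 triggers a write-back from node 3 and leaves both nodes as sharers.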


Published in

Proceedings of the VLDB Endowment, Volume 11, Issue 11
July 2018
507 pages
ISSN: 2150-8097

Publisher

VLDB Endowment

Qualifiers

research-article
