Portable SHMEMCache: A High-Performance Key-Value Store on OpenSHMEM and MPI

Fu, Huansong; Venkata, Manjunath Gorentla; Imam, Neena; Yu, Weikuan

doi:10.1007/978-3-319-73814-7_8

Huansong Fu¹⁶,
Manjunath Gorentla Venkata¹⁷,
Neena Imam¹⁷ &
…
Weikuan Yu¹⁶

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 10679))

Included in the following conference series:

Workshop on OpenSHMEM and Related Technologies

452 Accesses

Abstract

The integration of Big Data frameworks and HPC capabilities has drawn enormous interests in recent years. SHMEMCache is a distributed key-value store built on the OpenSHMEM global address space. It has solved several practical issues in leveraging OpenSHMEM’s one-sided operations for a distributed key-value store and providing efficient key-value operations on both commodity machines and supercomputers. However, being based solely on OpenSHMEM, SHMEMCache cannot leverage one-sided operations from a variety of software packages. This results in several limitations for SHMEMCache. First, we cannot make SHMEMCache available to a wider range of platforms. Second, an opportunity for potential performance improvement is missed. Third, there is a lack of deep understanding about how different one-sided operations can fit in with SHMEMCache and other distributed key-values in general. For example, the one-sided operations in OpenSHMEM and MPI have many differences in their interfaces, memory semantics and synchronization methods, all of which can have distinct implications and also increase the complexity in supporting both OpenSHMEM and MPI for SHMEMCache. Therefore, we have taken on an effort on leveraging different one-sided operations for SHMEMCache and proposed a design of portable SHMEMCache. Based on this new framework, we have supported both OpenSHMEM and MPI for SHMEMCache. We have also conducted an extensive set of experiments to compare the performance of the two versions on both commodity machines and the Titan supercomputer.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 44.99; Price excludes VAT (USA)

Softcover Book: USD 60.00; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Memcached. https://memcached.org/downloads
MVAPICH. http://mvapich.cse.ohio-state.edu/
OpenMPI. https://www.open-mpi.org/
Redis. http://redis.io/
Titan Supercomputer. https://www.olcf.ornl.gov/titan/
Aniszczyk, C.: Caching with twemcache (2012)
Google Scholar
Appavoo, J., Waterland, A., Da Silva, D., Uhlig, V., Rosenburg, B., Van Hensbergen, E., Stoess, J., Wisniewski, R., Steinberg, U.: Providing a cloud network infrastructure on a supercomputer. In: Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, pp. 385–394. ACM (2010)
Google Scholar
Chapman, B., Curtis, T., Pophale, S., Poole, S., Kuehn, J., Koelbel, C., Smith, L.: Introducing OpenSHMEM: SHMEM for the PGAS community. In: Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model, p. 2. ACM (2010)
Google Scholar
Chen, Y., Wei, X., Shi, J., Chen, R., Chen, H.: Fast and general distributed transactions using RDMA and HTM. In: Proceedings of the Eleventh European Conference on Computer Systems, p. 26. ACM (2016)
Google Scholar
UPC Consortium: UPC language specifications v1. 2. Lawrence Berkeley National Laboratory (2005)
Google Scholar
Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.: Benchmarking cloud serving systems with YCSB. In: Proceedings of the 1st ACM Symposium on Cloud Computing, pp. 143–154. ACM (2010)
Google Scholar
Dinan, J., Balaji, P., Buntinas, D., Goodell, D., Gropp, W., Thakur, R.: An implementation and evaluation of the MPI 3.0 one-sided communication interface. Concurr. Comput. Pract. Exp. 28, 4385–4404 (2016)
Article Google Scholar
Dragojević, A., Narayanan, D., Castro, M., Hodson, O.: Farm: fast remote memory. In: 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14), pp. 401–414 (2014)
Google Scholar
Dragojević, A., Narayanan, D., Nightingale, E.B., Renzelmann, M., Shamis, A., Badam, A., Castro, M.: No compromises: distributed transactions with consistency, availability, and performance. In: Proceedings of the 25th Symposium on Operating Systems Principles, pp. 54–70. ACM (2015)
Google Scholar
Fu, H., SinghaRoy, K., Venkata, M.G., Zhu, Y., Yu, W.: SHMemCache: enabling memcached on the OpenSHMEM global address model. In: Gorentla Venkata, M., Imam, N., Pophale, S., Mintz, T.M. (eds.) OpenSHMEM 2016. LNCS, vol. 10007, pp. 131–145. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50995-2_9
Chapter Google Scholar
Fu, H., Venkata, M.G., Choudhury, A.R., Imam, N., Yu, W.: High-performance key-value store on OpenSHMEM. In: Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 559–568. IEEE Press (2017)
Google Scholar
Geist, A., Gropp, W., Huss-Lederman, S., Lumsdaine, A., Lusk, E., Saphir, W., Skjellum, T., Snir, M.: MPI-2: extending the message-passing interface. In: Bougé, L., Fraigniaud, P., Mignotte, A., Robert, Y. (eds.) Euro-Par 1996. LNCS, vol. 1123, pp. 128–135. Springer, Heidelberg (1996). https://doi.org/10.1007/3-540-61626-8_16
Google Scholar
Gropp, W., Thakur, R.: An evaluation of implementation options for MPI one-sided communication. In: Di Martino, B., Kranzlmüller, D., Dongarra, J. (eds.) EuroPVM/MPI 2005. LNCS, vol. 3666, pp. 415–424. Springer, Heidelberg (2005). https://doi.org/10.1007/11557265_53
Chapter Google Scholar
Hammond, J.R., Ghosh, S., Chapman, B.M.: Implementing OpenSHMEM using MPI-3 one-sided communication. In: Poole, S., Hernandez, O., Shamis, P. (eds.) OpenSHMEM 2014. LNCS, vol. 8356, pp. 44–58. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-05215-1_4
Chapter Google Scholar
Huang, J., Ouyang, X., Jose, J., Wasi-ur Rahman, M., Wang, H., Luo, M., Subramoni, H., Murthy, C., Panda, D.K.: High-performance design of HBase with RDMA over infiniband. In: 2012 IEEE 26th International Parallel & Distributed Processing Symposium (IPDPS), pp. 774–785. IEEE (2012)
Google Scholar
Jiang, W., Liu, J., Jin, H.-W., Panda, D.K., Gropp, W., Thakur, R.: High performance MPI-2 one-sided communication over infiniband. In: IEEE International Symposium on Cluster Computing and the Grid, CCGrid 2004, pp. 531–538. IEEE (2004)
Google Scholar
Jose, J., Subramoni, H., Kandalla, K., Wasi-ur Rahman, M., Wang, H., Narravula, S., Panda, D.K.: Scalable memcached design for infiniband clusters using hybrid transports. In: 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp. 236–243. IEEE (2012)
Google Scholar
Jose, J., Subramoni, H., Luo, M., Zhang, M., Huang, J., Wasi-ur Rahman, M., Islam, N.S., Ouyang, X., Wang, H., Sur, S., et al.: Memcached design on high performance RDMA capable interconnects. In: 2011 International Conference on Parallel Processing (ICPP), pp. 743–752. IEEE (2011)
Google Scholar
Jose, J., Zhang, J., Venkatesh, A., Potluri, S., Panda, D.K.: A comprehensive performance evaluation of OpenSHMEM libraries on InfiniBand clusters. In: Poole, S., Hernandez, O., Shamis, P. (eds.) OpenSHMEM 2014. LNCS, vol. 8356, pp. 14–28. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-05215-1_2
Chapter Google Scholar
Liu, J., Wu, J., Panda, D.K.: High performance RDMA-based MPI implementation over infiniband. Int. J. Parallel Prog. 32(3), 167–198 (2004)
Article MATH Google Scholar
Mitchell, C., Geng, Y., Li, J.: Using one-sided RDMA reads to build a fast, CPU-efficient key-value store. In: USENIX Annual Technical Conference, pp. 103–114 (2013)
Google Scholar
Nieplocha, J., Palmer, B., Tipparaju, V., Krishnan, M., Trease, H., Aprà, E.: Advances, applications and performance of the global arrays shared memory programming toolkit. Int. J. High Perform. Comput. Appl. 20(2), 203–231 (2006)
Article Google Scholar
Nishtala, R., Fugal, H., Grimm, S., Kwiatkowski, M., Lee, H., Li, H.C., McElroy, R., Paleczny, M., Peek, D., Saab, P., et al.: Scaling memcache at facebook. In: Presented as Part of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13), pp. 385–398 (2013)
Google Scholar
Numrich, R.W., Reid, J.: Co-array Fortran for parallel programming. ACM SIGPLAN Fortran Forum 17, 1–31 (1998). ACM
Article Google Scholar
Pophale, S., Nanjegowda, R., Curtis, T., Chapman, B., Jin, H., Poole, S., Kuehn, J.: OpenSHMEM performance and potential: a NPB experimental study. In: The 6th Conference on Partitioned Global Address Space Programming Models (PGAS12). Citeseer (2012)
Google Scholar
Shamis, P., Venkata, M.G., Poole, S., Welch, A., Curtis, T.: Designing a high performance OpenSHMEM implementation using universal common communication substrate as a communication middleware. In: Poole, S., Hernandez, O., Shamis, P. (eds.) OpenSHMEM 2014. LNCS, vol. 8356, pp. 1–13. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-05215-1_1
Chapter Google Scholar
Shankar, D., Lu, X., Islam, N., Wasi-Ur-Rahman, M., Panda, D.K.: High-performance hybrid key-value store on modern clusters with RDMA interconnects and SSDs: non-blocking extensions, designs, and benefits. In: 2016 IEEE International Parallel and Distributed Processing Symposium, pp. 393–402. IEEE (2016)
Google Scholar
Shipman, G.M., Woodall, T.S., Graham, R.L., Maccabe, A.B., Bridges, P.G.: Infiniband scalability in Open MPI. In: Proceedings 20th IEEE International Parallel & Distributed Processing Symposium, p. 10-pp. IEEE (2006)
Google Scholar
Wang, Y., Que, X., Yu, W., Goldenberg, D., Sehgal, D.: Hadoop acceleration through network levitated merge. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, p. 57. ACM (2011)
Google Scholar
Wang, Y., Zhang, L., Tan, J., Li, M., Gao, Y., Guerin, X., Meng, X., Meng, S.: HydraDB: a resilient RDMA-driven key-value middleware for in-memory cluster computing. In: SC 2015, p. 22. ACM (2015)
Google Scholar

Download references

Acknowledgment

This work was supported in part by a contract from Oak Ridge National Laboratory and the National Science Foundation awards 1561041 and 1564647.

This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725.

Author information

Authors and Affiliations

Florida State University, Tallahassee, USA
Huansong Fu & Weikuan Yu
Oak Ridge National Laboratory, Oak Ridge, USA
Manjunath Gorentla Venkata & Neena Imam

Authors

Huansong Fu
View author publications
You can also search for this author in PubMed Google Scholar
Manjunath Gorentla Venkata
View author publications
You can also search for this author in PubMed Google Scholar
Neena Imam
View author publications
You can also search for this author in PubMed Google Scholar
Weikuan Yu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Huansong Fu .

Editor information

Editors and Affiliations

Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Manjunath Gorentla Venkata
Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Neena Imam
Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Swaroop Pophale

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Fu, H., Venkata, M.G., Imam, N., Yu, W. (2018). Portable SHMEMCache: A High-Performance Key-Value Store on OpenSHMEM and MPI. In: Gorentla Venkata, M., Imam, N., Pophale, S. (eds) OpenSHMEM and Related Technologies. Big Compute and Big Data Convergence. OpenSHMEM 2017. Lecture Notes in Computer Science(), vol 10679. Springer, Cham. https://doi.org/10.1007/978-3-319-73814-7_8

Download citation

DOI: https://doi.org/10.1007/978-3-319-73814-7_8
Published: 10 January 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73813-0
Online ISBN: 978-3-319-73814-7
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics