Abstract
The increasing popularity of multi-core processors has made MPI intra-node communication, including intra-node RMA (Remote Memory Access) communication, a critical component in high-performance computing. The MPI-2 RMA model includes one-sided data transfer and synchronization operations. However, existing designs in widely used MPI stacks do not provide truly one-sided intra-node RMA communication: they are built on top of two-sided send-receive operations and therefore suffer from two-sided communication overheads and a dependency on the remote side. In this paper, we enhance existing shared-memory mechanisms to design truly one-sided synchronization. In addition, we design truly one-sided intra-node data transfer using two kernel-based direct-copy alternatives: a basic kernel-assisted approach and an I/OAT-assisted approach. The new design eliminates both the overhead of two-sided operations and the involvement of the remote side. We also propose a series of benchmarks to evaluate various performance aspects on multi-core architectures (Intel Clovertown, Intel Nehalem and AMD Barcelona). The results show that the new design achieves up to 39% lower latency for small and medium messages and up to 29% higher bandwidth for large messages. Moreover, it provides better scalability, fewer cache misses, higher resilience to process skew and increased computation/communication overlap. Finally, up to 10% performance benefit is demonstrated for a real scientific application, AWM-Olsen.
Additional information
This research is supported in part by DOE grants #DE-FC02-06ER25749 and #DE-FC02-06ER25755; NSF grants #CNS-0403342, #CCF-0702675, #CCF-0833169, #CCF-0916302 and #OCI-0926691; grants from Intel, Mellanox, Cisco systems, QLogic and Sun Microsystems; and equipment donations from Intel, Mellanox, AMD, Appro, Chelsio, Dell, Fujitsu, Fulcrum, Microway, Obsidian, QLogic, and Sun Microsystems.
Lai, P., Sur, S. & Panda, D.K. Designing truly one-sided MPI-2 RMA intra-node communication on multi-core systems. Comput Sci Res Dev 25, 3–14 (2010). https://doi.org/10.1007/s00450-010-0115-3