
Designing truly one-sided MPI-2 RMA intra-node communication on multi-core systems

  • Special Issue Paper
Computer Science - Research and Development

Abstract

The increasing popularity of multi-core processors has made MPI intra-node communication, including intra-node RMA (Remote Memory Access) communication, a critical component in high performance computing. The MPI-2 RMA model includes one-sided data transfer and synchronization operations. Existing designs in widely used MPI stacks do not provide truly one-sided intra-node RMA communication: they are built on top of two-sided send-receive operations and therefore suffer from the overheads of two-sided communication and a dependency on the remote side. In this paper, we enhance existing shared memory mechanisms to design truly one-sided synchronization. In addition, we design truly one-sided intra-node data transfer using two kernel-based direct-copy alternatives: a basic kernel-assisted approach and an I/OAT-assisted approach. Our new design eliminates both the overhead of two-sided operations and the involvement of the remote side. We also propose a series of benchmarks to evaluate various performance aspects on multi-core architectures (Intel Clovertown, Intel Nehalem and AMD Barcelona). The results show that the new design obtains up to 39% lower latency for small and medium messages and a 29% improvement in large-message bandwidth. Moreover, it provides better scalability, fewer cache misses, higher resilience to process skew and increased computation and communication overlap. Finally, a performance benefit of up to 10% is demonstrated for a real scientific application, AWM-Olsen.
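
For readers unfamiliar with the MPI-2 RMA interface discussed above, the following minimal sketch (not taken from the paper) shows a passive-target put between two processes on the same node. The window size, datatype and the choice of lock/unlock synchronization are illustrative assumptions; the point is that with a truly one-sided design such as the one proposed here, rank 0 can complete this transfer without any MPI activity on rank 1.

```c
/* Illustrative sketch of MPI-2 one-sided communication with
 * passive-target (lock/unlock) synchronization. Not from the paper;
 * window size and datatypes are arbitrary choices. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 1024

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Each process exposes a window of N doubles. */
    double *win_buf = malloc(N * sizeof(double));
    MPI_Win win;
    MPI_Win_create(win_buf, N * sizeof(double), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    if (rank == 0 && nprocs > 1) {
        double *src = malloc(N * sizeof(double));
        for (int i = 0; i < N; i++) src[i] = (double)i;

        /* One-sided put into rank 1's window; a truly one-sided
         * implementation completes this without rank 1 making
         * any MPI calls to progress the transfer. */
        MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
        MPI_Put(src, N, MPI_DOUBLE, 1, 0, N, MPI_DOUBLE, win);
        MPI_Win_unlock(1, win);   /* completes the put at the origin and target */
        free(src);
    }

    MPI_Barrier(MPI_COMM_WORLD);  /* ensure the put happened before reading */
    if (rank == 1) printf("win_buf[10] = %f\n", win_buf[10]);

    MPI_Win_free(&win);
    free(win_buf);
    MPI_Finalize();
    return 0;
}
```

Running the sketch with two processes placed on a single node (e.g. mpirun -np 2) exercises exactly the intra-node RMA path that the paper targets.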

Author information

Corresponding author

Correspondence to Ping Lai.

Additional information

This research is supported in part by DOE grants #DE-FC02-06ER25749 and #DE-FC02-06ER25755; NSF grants #CNS-0403342, #CCF-0702675, #CCF-0833169, #CCF-0916302 and #OCI-0926691; grants from Intel, Mellanox, Cisco Systems, QLogic and Sun Microsystems; and equipment donations from Intel, Mellanox, AMD, Appro, Chelsio, Dell, Fujitsu, Fulcrum, Microway, Obsidian, QLogic, and Sun Microsystems.


About this article

Cite this article

Lai, P., Sur, S. & Panda, D.K. Designing truly one-sided MPI-2 RMA intra-node communication on multi-core systems. Comput Sci Res Dev 25, 3–14 (2010). https://doi.org/10.1007/s00450-010-0115-3
