
Building Efficient HPC Cloud with SR-IOV-Enabled InfiniBand: The MVAPICH2 Approach

Chapter in Research Advances in Cloud Computing

Abstract

Single Root I/O Virtualization (SR-IOV) technology has been steadily gaining momentum for high-speed interconnects such as InfiniBand. SR-IOV-enabled InfiniBand has been widely used in modern HPC clouds with virtual machines and containers. While SR-IOV can deliver near-native I/O performance, recent studies have shown that locality-aware communication schemes play an important role in achieving high I/O performance on SR-IOV-enabled InfiniBand clusters. To discuss how to build efficient HPC clouds, this chapter presents a novel approach using the MVAPICH2 library. We first propose locality-aware designs inside the MVAPICH2 library to achieve near-native performance on HPC clouds with virtual machines and containers. Then, we propose advanced designs with cloud resource managers such as OpenStack and Slurm to make it easier for users to deploy and run their applications with the MVAPICH2 library on HPC clouds. Performance evaluations with benchmarks and applications on an OpenStack-based HPC cloud (i.e., the NSF-supported Chameleon Cloud) show that MPI applications with our designs achieve near bare-metal performance on HPC clouds under different virtual machine and container deployment scenarios. Compared to running default MPI applications on Amazon EC2, our design can deliver much better performance. The MVAPICH2 over HPC Cloud software package presented in this chapter is publicly available from http://mvapich.cse.ohio-state.edu.
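To give a flavor of what "locality-aware communication" means here, the following is a minimal, hypothetical sketch of the channel-selection idea: two MPI ranks in the same VM or container can use shared memory (e.g., via CMA), co-resident VMs on the same physical host can use an inter-VM shared-memory region (IVShmem), and only ranks on different hosts need to go through the SR-IOV virtual function on the InfiniBand HCA. All names and the three-way classification below are illustrative; the actual selection logic is internal to the MVAPICH2 library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankLocation:
    """Where an MPI rank runs: a physical host and a VM/container on it.

    Purely illustrative; MVAPICH2 discovers this information internally.
    """
    host: str  # physical node identifier
    vm: str    # VM or container instance on that node


def select_channel(a: RankLocation, b: RankLocation) -> str:
    """Pick the fastest plausible communication path between two ranks.

    Hypothetical sketch of a locality-aware policy, not the library's API.
    """
    if a.host != b.host:
        # Different physical nodes: traffic must cross the fabric,
        # via the SR-IOV virtual function on the InfiniBand HCA.
        return "sr-iov"
    if a.vm == b.vm:
        # Same VM/container: plain intra-node shared memory (e.g., CMA).
        return "shared-memory"
    # Same host, different VMs: inter-VM shared memory (IVShmem).
    return "ivshmem"
```

A locality-unaware library would take the "sr-iov" path even for co-resident ranks, paying virtualized-I/O latency where a memory copy would do; the chapter's designs avoid exactly that.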



Author information

Corresponding author: Xiaoyi Lu.



Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Lu, X., Zhang, J., Panda, D.K. (2017). Building Efficient HPC Cloud with SR-IOV-Enabled InfiniBand: The MVAPICH2 Approach. In: Chaudhary, S., Somani, G., Buyya, R. (eds) Research Advances in Cloud Computing. Springer, Singapore. https://doi.org/10.1007/978-981-10-5026-8_6


  • DOI: https://doi.org/10.1007/978-981-10-5026-8_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-5025-1

  • Online ISBN: 978-981-10-5026-8

