Utilizing Multiple Xeon Phi Coprocessors on One Compute Node

Dong, Xinnan; Chai, Jun; Yang, Jing; Wen, Mei; Wu, Nan; Cai, Xing; Zhang, Chunyuan; Chen, Zhaoyun

doi:10.1007/978-3-319-11194-0_6

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8631)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

Future exascale systems are expected to adopt compute nodes that incorporate many accelerators. This paper therefore investigates how to program multiple Xeon Phi coprocessors that reside in a single compute node. Besides a standard MPI-OpenMP programming approach, which belongs to the symmetric usage mode, two offload-mode programming approaches are considered. The first offload approach is conventional and uses compiler pragmas, whereas the second is new and combines Intel’s APIs for the coprocessor offload infrastructure (COI) and the symmetric communication interface (SCIF) for low-latency communication. While the pragma-based approach allows simpler programming, the COI-SCIF approach has three advantages: (1) lower overhead when launching offloaded code, (2) higher data transfer bandwidth, and (3) more advanced asynchrony between computation and data movement. The low-level COI-SCIF approach is also shown to have benefits over its MPI-OpenMP counterpart. All the programming approaches are tested on a real-world 3D application, for which the COI-SCIF approach outperforms the others on a Tianhe-2 compute node with three Xeon Phi coprocessors.

Author information

Authors and Affiliations

School of Computer Science, National University of Defense Technology, Changsha, Hunan, 410073, China
Xinnan Dong, Jun Chai, Jing Yang, Mei Wen, Nan Wu, Chunyuan Zhang & Zhaoyun Chen
Simula Research Laboratory, P.O. Box 134, 1325, Lysaker, Norway
Xing Cai
Department of Informatics, University of Oslo, P.O. Box 1080, Blindern, 0316, Oslo, Norway
Xing Cai

Editor information

Editors and Affiliations

Department of Computer Science, Illinois Institute of Technology, 60616-3793, Chicago, IL, USA
Xian-he Sun
School of Computer Science and Technology, Dalian Maritime University, 1 Linghai Road, 116026, Dalian, China
Wenyu Qu
SEECS, University of Ottawa, 8, King Edward Ave, K1N 6N5, Ottawa, ON, Canada
Ivan Stojmenovic
Deakin University, 221 Burwood Highway, 3125, Burwood, VIC, Australia
Wanlei Zhou
Dalian Maritime University, No. 1 Linghai Road, Dalian, 116026, China
Zhiyang Li
BeiHang University, XueYuan Road No.37, HaiDian District, Beijing, China
Hua Guo
University of Bradford, BD7 1DP, Bradford, West Yorkshire, United Kingdom
Geyong Min
Dalian Maritime University, No. 1 Linghai Road, Dalian, 116026, China
Tingting Yang
Computer Network Information Center, Chinese Academy of Sciences, 100190, Beijing, China
Yulei Wu
Shandong University, 27 Shanda Nanlu, 250100, Jinan City, Shandong Province, China
Lei Liu

About this paper

Cite this paper

Dong, X. et al. (2014). Utilizing Multiple Xeon Phi Coprocessors on One Compute Node. In: Sun, Xh., et al. Algorithms and Architectures for Parallel Processing. ICA3PP 2014. Lecture Notes in Computer Science, vol 8631. Springer, Cham. https://doi.org/10.1007/978-3-319-11194-0_6

DOI: https://doi.org/10.1007/978-3-319-11194-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11193-3
Online ISBN: 978-3-319-11194-0
eBook Packages: Computer Science, Computer Science (R0)