Skip to main content

From Describing to Prescribing Parallelism: Translating the SPEC ACCEL OpenACC Suite to OpenMP Target Directives

  • Conference paper
  • First Online:
High Performance Computing (ISC High Performance 2016)

Abstract

Current and next generation HPC systems will exploit accelerators and self-hosting devices within their compute nodes to accelerate applications. This comes at a time when programmer productivity and the ability to produce portable code has been recognized as a major concern. One of the goals of OpenMP and OpenACC is to allow the user to specify parallelism via directives so that compilers can generate device specific code and optimizations. However, the challenge of porting codes becomes more complex because of the different types of parallelism and memory hierarchies available on different architectures. In this paper we discuss our experience with porting the SPEC ACCEL benchmarks from OpenACC to OpenMP 4.5 using a performance portable style that lets the compiler make platform-specific optimizations to achieve good performance on a variety of systems. The ported SPEC ACCEL OpenMP benchmarks were validated on different platforms including Xeon Phi, GPUs and CPUs. We believe that this experience can help the community and compiler vendors understand how users plan to write OpenMP 4.5 applications in a performance portable style.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    https://www.spec.org/fairuse.html#Comparisons.

  2. 2.

    https://www.spec.org/fairuse.html#Academic.

References

  1. Github repository for the extended Clang implementation supporting OpenMP 4.0 (2016). https://github.com/clang-omp/clang_trunk

  2. Agathos, S.N., Papadogiannakis, A., Dimakopoulos, V.V.: Targeting the parallella. In: Träff, J.L., Hunold, S., Versaci, F. (eds.) Euro-Par 2015. LNCS, vol. 9233, pp. 662–674. Springer, Heidelberg (2015). doi:10.1007/978-3-662-48096-0_51

    Chapter  Google Scholar 

  3. Bertolli, C., Antao, S.F., Bercea, G.T., Jacob, A.C., Eichenberger, A.E., Chen, T., Sura, Z., Sung, H., Rokos, G., Appelhans, D., O’Brien, K.: Integrating GPU support for OpenMP offloading directives into clang. In: Proceedings of 2nd Workshop on the LLVM Compiler Infrastructure in HPC, LLVM 2015, NY, USA, pp. 5:1–5:11. ACM, New York (2015). http://doi.acm.org/10.1145/2833157.2833161

  4. Bertolli, C., Antao, S.F., Eichenberger, A.E., O’Brien, K., Sura, Z., Jacob, A.C., Chen, T., Sallenave, O.: Coordinating GPU threads for OpenMP 4.0 in LLVM (2014)

    Google Scholar 

  5. Calore, E., Schifano, S.F., Tripiccione, R.: On portability, performance and scalability of an MPI OpenCL lattice Boltzmann code. In: Lopes, L., et al. (eds.) Euro-Par 2014. LNCS, vol. 8806, pp. 438–449. Springer, Heidelberg (2014). doi:10.1007/978-3-319-14313-2_37

    Google Scholar 

  6. Cray: Cray Compiling Environment Release: Overview and Installation Guide (Document: S-5212-84) (2015)

    Google Scholar 

  7. Foundation, F.S.: GCC 6 Release Series: Changes, New Features, and Fixes (2016). https://gcc.gnu.org/gcc-6/changes.html

  8. GCC Wiki: Offloading Support in GCC. https://gcc.gnu.org/wiki/Offloading

  9. Herdman, J.A., Gaudin, W.P., Perks, O., Beckingsale, D.A., Mallinson, A.C., Jarvis, S.A.: Achieving portability and performance through OpenACC. In: Proceedings of 1st Workshop on Accelerator Programming Using Directives, WACCPD 2014, pp. 19–26. IEEE Press, Piscataway (2014). http://dx.doi.org/10.1109/WACCPD.2014.10

  10. Intel Corporation: Intel\(\textregistered \) C++ Compiler 16.0 User and Reference Guide: OpenMP* Support (2015)

    Google Scholar 

  11. Juckeland, G., Grund, A., Nagel, W.E.: Performance portable applications for hardware accelerators: lessons learned from SPEC ACCEL. In: 2015 IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW), pp. 689–698, May 2015

    Google Scholar 

  12. Juckeland, G., et al.: SPEC ACCEL: a standard application suite for measuring hardware accelerator performance. In: Jarvis, S.A., Wright, S.A., Hammond, S.D. (eds.) PMBS 2014. LNCS, vol. 8966, pp. 46–67. Springer, Heidelberg (2015). http://dx.doi.org/10.1007/978-3-319-17248-4_3

    Google Scholar 

  13. Liao, C., Yan, Y., Supinski, B.R., Quinlan, D.J., Chapman, B.: Early experiences with the OpenMP accelerator model. In: Rendell, A.P., Chapman, B.M., Müller, M.S. (eds.) IWOMP 2013. LNCS, vol. 8122, pp. 84–98. Springer, Heidelberg (2013). http://dx.doi.org/10.1007/978-3-642-40698-0_7

    Chapter  Google Scholar 

  14. Lin, P.H., Liao, C., Quinlan, D.J., Guzik, S.: Experiences of using the OpenMP accelerator model to port DOE stencil applications. In: Terboven, C., de Supinski, B.R., Reble, P., Chapman, B.M., Müller, M.S. (eds.) IWOMP 2015. LNCS, vol. 9342, pp. 45–59. Springer, Berlin (2015)

    Chapter  Google Scholar 

  15. Martineau, M., McIntosh-Smith, S., Boulton, M., Gaudin, W.: An evaluation of emerging many-core parallel programming models. In: Proceedings of 7th International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM 2016, NY, USA pp. 1–10 (2016)

    Google Scholar 

  16. Mitra, G., Stotzer, E., Jayaraj, A., Rendell, A.P.: Implementation and optimization of the OpenMP accelerator model for the TI Keystone II architecture. In: DeRose, L., Supinski, B.R., Olivier, S.L., Chapman, B.M., Müller, M.S. (eds.) IWOMP 2014. LNCS, vol. 8766, pp. 202–214. Springer, Heidelberg (2014)

    Google Scholar 

  17. Müller, M.S., et al.: SPEC OMP2012 — an application benchmark suite for parallel systems using OpenMP. In: Chapman, B.M., Massaioli, F., Müller, M.S., Rorro, M. (eds.) IWOMP 2012. LNCS, vol. 7312, pp. 223–236. Springer, Heidelberg (2012). http://dx.doi.org/10.1007/978-3-642-30961-8_17

    Chapter  Google Scholar 

  18. Müller, M.S., van Waveren, M., Lieberman, R., Whitney, B., Saito, H., Kumaran, K., Baron, J., Brantley, W.C., Parrott, C., Elken, T., Feng, H., Ponder, C.: SPEC MPI2007 - an application benchmark suite for parallel systems using MPI. Concurr. Comput.: Pract. Exper. 22(2), 191–205 (2010). http://dx.doi.org/10.1002/cpe.v22:2

  19. Newburn, C.J., Dmitriev, S., Narayanaswamy, R., Wiegert, J., Murty, R., Chinchilla, F., Deodhar, R., McGuire, R.: Offload compiler runtime for the Intel Xeon Phi™ coprocessor. In: 2013 IEEE 27th International Parallel and Distributed Processing Symposium Workshops and Ph.D. Forum (IPDPSW), pp. 1213–1225 (2013)

    Google Scholar 

  20. OpenMP Architecture Review Board: OpenMP Application Program Interface. Version 4.0, July 2013. http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf

  21. OpenMP Architecture Review Board: OpenMP Application Program Interface. Version 4.5, November 2015. http://www.openmp.org/mp-documents/openmp-4.5.pdf

  22. Oracle: Oracle\({\textregistered }\) Solaris Studio 12.4: OpenMP API User’s Guide (2014). http://docs.oracle.com/cd/E37069_01/pdf/E37081.pdf

  23. PathScale: PathScale ENZO 2015 (2015). http://www.pathscale.com/enzo

  24. Pennycook, S.J., Jarvis, S.A.: Developing Performance-Portable Molecular Dynamics Kernels in OpenCL. In: 2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC), pp. 386–395 (2012)

    Google Scholar 

  25. Sabne, A., Sakdhnagool, P., Lee, S., Vetter, J.S.: Evaluating performance portability of OpenACC. In: Brodman, J., Tu, P. (eds.) LCPC 2014. LNCS, vol. 8967, pp. 51–66. Springer, Heidelberg (2015). http://dx.doi.org/10.1007/978-3-319-17473-0_4

    Google Scholar 

  26. Strohmeier, E., Simon, H., Dongarra, J., Meurer, M.: The 46th top. 500 list, November 2015. http://top500.org/list/2015/11/

  27. Wienke, S., Terboven, C., Beyer, J.C., Müller, M.S.: A pattern-based comparison of OpenACC and OpenMP for accelerator computing. In: Silva, F., Dutra, I., Santos Costa, V. (eds.) Euro-Par 2014 Parallel Processing. LNCS, vol. 8632, pp. 812–823. Springer, Heidelberg (2014). http://dx.doi.org/10.1007/978-3-319-09873-9_68

    Google Scholar 

  28. Wong, M.: The future of GPU/accelerator programming models. In: Keynote at the 2nd Workshop on the LLVM Compiler Infrastructure in HPC (2015). https://llvm-hpc2-workshop.github.io/slides/Wong.pdf

  29. Woolley, C.: Profiling and tuning OpenACC code. http://on-demand.gputechconf.com/gtc/2012/presentations/S0517B-Monday-Programming-GPUs-OpenACC.pdf

Download references

Acknowledgments

The authors thank Cloyce Spradling for his work on the SPEC harness as well as the SPEC POWER group for their work on enabling the integration of power measurements into other SPEC suites.

SPEC®, SPEC ACCEL™, SPEC CPU™, SPEC MPI®, and SPEC OMP® are registered trademarks of the Standard Performance Evaluation Corporation (SPEC).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Guido Juckeland .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Juckeland, G. et al. (2016). From Describing to Prescribing Parallelism: Translating the SPEC ACCEL OpenACC Suite to OpenMP Target Directives. In: Taufer, M., Mohr, B., Kunkel, J. (eds) High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science(), vol 9945. Springer, Cham. https://doi.org/10.1007/978-3-319-46079-6_33

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-46079-6_33

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46078-9

  • Online ISBN: 978-3-319-46079-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics