Proceedings of the 16th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 25

Machine Learning and High-Performance Computing Hybrid Systems, a New Way of Performance Acceleration in Engineering and Scientific Applications

DOI: http://dx.doi.org/10.15439/2021F004

Citation: Proceedings of the 16th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 25, pages 2736

Abstract. Machine learning is one of the hottest topics in the IT industry as well as in academia. Some IT leaders and scientists believe that it is going to totally revolutionise the industry. This transformation is happening on two fronts: one is the application and software paradigm, the other is the hardware and system level. At the same time, the High-Performance Computing (HPC) segment is striving to reach Exascale performance. Meeting that level of performance while keeping system cost and power consumption at a reasonable level is clearly not a trivial task. In this article, we look at a potential solution to these problems and discuss a new approach to building systems and software that meets these challenges: on the one hand, the growing demand for computing power in HPC systems, and on the other, readiness for a new type of workload, including Artificial Intelligence applications.
