Abstract
We present and compare different approaches for using multiple Graphics Processing Units (GPUs) in the simulation of physical systems. As benchmarks we consider the time required to update a single spin of the 3D Heisenberg spin glass model, using both the Over-relaxation and the Heat Bath algorithms, and the solution of a Poisson equation by a finite-difference method. The results show that a suitable combination of techniques makes it possible to hide the communication overhead almost completely by using the CPU as a communication coprocessor of the GPU. Large-scale simulations on clusters of GPUs can be carried out efficiently by following the same approach in other applications where there is a clear-cut separation between bulk and boundary data.
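The separation between bulk and boundary data can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 1D Jacobi iteration for -u'' = f as a stand-in for the 3D finite-difference Poisson solver, two subdomains with one halo cell each as a stand-in for two GPUs, and a plain list copy as a stand-in for the transfer that the CPU would handle while the GPU updates the bulk. All names (`jacobi_sweep`, `split_sweep`, `update`) are hypothetical.

```python
def update(u, f, i, h):
    """Jacobi update of point i for -u'' = f with grid spacing h."""
    return 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])


def jacobi_sweep(u, f, h):
    """Reference: one sweep over the whole grid; the two end points stay fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = update(u, f, i, h)
    return new


def split_sweep(left, right, f_left, f_right, h):
    """One sweep on two subdomains (assumed layout, for illustration only).

    left  = [fixed end, owned points..., halo]   halo mirrors right[1]
    right = [halo, owned points..., fixed end]   halo mirrors left[-2]
    """
    new_left, new_right = left[:], right[:]
    bl, br = len(left) - 2, 1  # owned points adjacent to the cut

    # Step 1: update the boundary points first, since the neighbour needs them.
    new_left[bl] = update(left, f_left, bl, h)
    new_right[br] = update(right, f_right, br, h)

    # Step 2: halo exchange. In a real multi-GPU code this transfer would be
    # started here and proceed while step 3 runs; here it is a plain copy.
    new_left[-1] = new_right[br]
    new_right[0] = new_left[bl]

    # Step 3: update the bulk, which does not depend on the exchanged data.
    for i in range(1, bl):
        new_left[i] = update(left, f_left, i, h)
    for i in range(br + 1, len(right) - 1):
        new_right[i] = update(right, f_right, i, h)
    return new_left, new_right


n, m = 17, 8                      # grid size and position of the cut
h = 1.0 / (n - 1)
f = [1.0] * n
reference = [0.0] * n
left, right = reference[:m + 1], reference[m - 1:]
f_left, f_right = f[:m + 1], f[m - 1:]

for _ in range(50):
    reference = jacobi_sweep(reference, f, h)
    left, right = split_sweep(left, right, f_left, f_right, h)

combined = left[:-1] + right[1:]  # drop the halo cells and rejoin
print(max(abs(a - b) for a, b in zip(combined, reference)))
```

Because the bulk update in step 3 never reads the values exchanged in step 2, the two steps can run concurrently without changing the result, which is what allows the communication cost to be hidden behind computation.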
Cite this article
Bernaschi, M., Bisson, M., Fatica, M. et al. An introduction to multi-GPU programming for physicists. Eur. Phys. J. Spec. Top. 210, 17–31 (2012). https://doi.org/10.1140/epjst/e2012-01635-x