ABSTRACT
We present an error-controlled, highly scalable FMM implementation for long-range interactions in particle systems with open, 1D, 2D, and 3D periodic boundary conditions. We highlight three aspects of fast summation codes that are not fully addressed in most articles: memory consumption, error control, and runtime minimization. This poster aims to contribute to all three points in the context of modern large-scale parallel machines. In particular, we discuss the underlying data structures, the parallelization approach, and the precision-dependent parameter optimization.
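To make the idea of precision-dependent parameter optimization concrete, the sketch below picks the smallest multipole expansion order p that satisfies a requested error tolerance, using the classical Greengard–Rokhlin truncation bound for a well-separated box. This is an illustration only; the function name, its parameters, and the specific bound are assumptions and not the error-control scheme actually used in the poster's code.

```python
def min_expansion_order(ratio, q_total, r_minus_a, tol):
    """Smallest multipole order p such that the classical truncation bound
    q_total / (r - a) * (a/r)**(p+1) drops below tol.

    ratio     -- a/r, box radius over evaluation distance (must be < 1)
    q_total   -- sum of absolute source strengths in the box
    r_minus_a -- separation margin r - a
    tol       -- requested absolute error bound

    Hypothetical helper for illustration; not the poster's actual scheme.
    """
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio a/r must lie in (0, 1)")
    p = 0
    while q_total / r_minus_a * ratio ** (p + 1) > tol:
        p += 1
    return p

# Example: a/r = 0.5, unit total charge, unit margin, 1e-6 tolerance
print(min_expansion_order(0.5, 1.0, 1.0, 1e-6))  # -> 19
```

Tightening the tolerance by a factor of ten raises p by roughly log(10)/log(r/a) terms, which is why the cost of a requested precision depends so strongly on the separation criterion.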
The current code computes all mutual long-range interactions of more than three trillion particles on 294,912 BG/P cores within a few minutes for an expansion up to quadrupoles. The maximum memory footprint of such a computation has been reduced to less than 45 bytes per particle. The code employs a one-sided, non-blocking parallelization approach with small communication overhead.
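The 45-bytes-per-particle figure can be put in perspective with a back-of-the-envelope budget. The layout below (double-precision position and charge) is an assumption for illustration, not the poster's actual data structure:

```python
# Hypothetical per-particle memory budget under the stated 45-byte cap.
# Assumed layout: double-precision coordinates and source strength.
coords_bytes = 3 * 8          # x, y, z as 64-bit floats
charge_bytes = 1 * 8          # source strength
budget_bytes = 45             # stated maximum footprint per particle

remainder = budget_bytes - coords_bytes - charge_bytes
print(remainder)  # -> 13 bytes left per particle
```

Under this assumed layout, only about 13 bytes per particle remain for everything else (tree keys, multipole coefficients amortized over box populations, and communication buffers), which illustrates how tight the stated budget is.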
Poster: Passing the three trillion particle limit with an error-controlled fast multipole method