GPU-based shear–shear correlation calculation
Introduction
Current cosmological observations focus on surveying very large and deep regions of the sky in order to study the large-scale structure of the Universe and its evolution.
Several observational probes [1] have been identified to tackle the study of the accelerated expansion of the Universe [2]. Among them, one of the most promising turns out to be the analysis of the small deflections that large masses produce on the light travelling from distant galaxies. This phenomenon is known as gravitational lensing. Given the very small distortions involved, usually it is the statistical properties of the observed distribution of galaxy shapes which are studied [3]. The overall effect created by the gravitational lenses on these shapes is the cosmic shear. For many years this observational technique has been burdened by very large instrumental errors. Nevertheless, first results were possible in the late 1990s [4], [5], [6], [7]. Only recently have the first measurements in wide areas been carried out, delivering promising results for the future of the field [8]. This has paved the way for present and future surveys, such as the Dark Energy Survey [9], the Kilo-Degree Survey [10] and Euclid [11], [12], to exploit this observational channel by increasing statistics by almost two orders of magnitude by the next decade as well as reducing systematic errors.
In this context, cosmologists will have to deal with very large amounts of data (10^8 objects) to extract these measurements. In particular, the so-called shear–shear correlation function estimation requires computing the auto- and cross-correlation functions of the ellipticities of galaxies in different samples at varying redshifts. This algorithm has a high computational cost, which scales with the square of the number of objects, O(N^2), when the best precision is to be achieved.
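To make the quadratic cost concrete, a brute-force estimator of the shear–shear correlation functions must visit every galaxy pair, rotate each ellipticity into the tangential/cross frame defined by the pair, and accumulate the products into angular bins. The following is a minimal flat-sky sketch in Python; the code described in this paper is a CUDA implementation, and the function name, binning scheme and sign conventions below are illustrative assumptions, not the paper's code:

```python
import numpy as np

def shear_shear_xi(ra, dec, e1, e2, bins):
    """Brute-force xi_+/xi_- estimator over all galaxy pairs (O(N^2)).

    ra, dec: flat-sky coordinates; e1, e2: ellipticity components;
    bins: monotonically increasing angular-bin edges.
    """
    n = len(ra)
    xip = np.zeros(len(bins) - 1)
    xim = np.zeros(len(bins) - 1)
    npairs = np.zeros(len(bins) - 1)
    for i in range(n):
        for j in range(i + 1, n):
            dra = ra[j] - ra[i]
            ddec = dec[j] - dec[i]
            theta = np.hypot(dra, ddec)   # flat-sky separation
            phi = np.arctan2(ddec, dra)   # position angle of the pair
            # rotate both ellipticities into the tangential/cross frame
            c, s = np.cos(2 * phi), np.sin(2 * phi)
            et_i = -(e1[i] * c + e2[i] * s)
            ex_i = e1[i] * s - e2[i] * c
            et_j = -(e1[j] * c + e2[j] * s)
            ex_j = e1[j] * s - e2[j] * c
            b = np.searchsorted(bins, theta) - 1
            if 0 <= b < len(xip):
                xip[b] += et_i * et_j + ex_i * ex_j
                xim[b] += et_i * et_j - ex_i * ex_j
                npairs[b] += 1
    good = npairs > 0
    xip[good] /= npairs[good]
    xim[good] /= npairs[good]
    return xip, xim, npairs
```

For a survey with 10^8 galaxies, this double loop would visit about 5 × 10^15 pairs, which is what motivates either tree-based approximations or massively parallel hardware.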
The calculation of correlation function estimators has already been addressed in the case of large-scale structure studies, for the computation of the auto-correlation function of positions of galaxies and clusters (instead of shapes). In this case, systematic errors play a lesser role, but the probe itself is less sensitive for determining cosmological parameters. Several codes have harnessed the computational power of graphics processing units (GPUs) and other hardware platforms to carry out the job (Section 2). In the case of shear–shear correlations (involving the shape of the galaxies, Section 3), several codes have been implemented using the kd-trees approach, which simplifies the problem at the cost of precision (Section 2).
To the authors’ knowledge, no code has so far taken advantage of the capabilities of GPUs to handle this specific problem. The calculation is of particular relevance given the rapid advent of larger datasets in the next few years. GPUs are able to deal with the shear correlation calculation of very large galaxy surveys within a reasonable execution time.
Beyond the initial adaptation of the problem to the GPU platform and the subsequent verification of the results, the code undergoes a series of optimizations. Among other techniques, more efficient use of the on-chip memory through increased data locality is explored, as well as compilation options that modify the use of the cache memory. Finally, a concurrent computing scenario (a hybrid OpenMP–CUDA implementation) is presented.
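The concurrency in such a hybrid setup can be pictured as partitioning the triangular space of galaxy pairs into rectangular tiles, each an independent unit of work that an OpenMP thread can dispatch to a GPU. A toy Python sketch of one possible partition follows; this tiling is an illustrative assumption, not the paper's exact decomposition:

```python
def pair_tiles(n, tile):
    """Cover the upper-triangular pair space {(i, j): i < j < n} with
    rectangular tiles (i0, i1, j0, j1); each tile can be processed
    independently, e.g. by one OpenMP thread driving one CUDA device."""
    tiles = []
    for i0 in range(0, n, tile):
        # j-tiles start at i0 so every pair with i < j is covered once
        for j0 in range(i0, n, tile):
            tiles.append((i0, min(i0 + tile, n), j0, min(j0 + tile, n)))
    return tiles
```

Each tile touches only two contiguous chunks of the catalogue, which also helps data locality in on-chip memory.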
This paper is organized as follows. Section 2 summarizes the related work and previous efforts. In Section 3.1, the concepts of gravitational lensing and shear are described, together with the equations to be implemented in the code. The statistical support for the analysis is described in Section 3.2. The hardware used in this work is presented in Section 3.3. Results are presented and analysed in Section 4. Finally, Section 5 contains the conclusions of this work. Details about the GPU architecture and the CUDA programming model are included in the Appendix.
Section snippets
Related work
Previous efforts implement some kind of mechanism to reduce the computational cost of the point-to-point correlation estimation, for example, the widely used ATHENA code.1 This is a powerful tool based on kd-trees [13], which allows the precision of the estimation to be controlled by means of a parameter termed the opening angle, measured in radians. This parameter regulates the minimum angle at which two kd-tree nodes must ‘see’ each other
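The role of the opening-angle parameter can be illustrated with a generic dual-tree pruning test: two nodes may be treated as single points when the angle each subtends at the other falls below the threshold. The Python sketch below is an illustrative assumption about how such a criterion works, not ATHENA's exact rule:

```python
def can_merge_node_pair(size_a, size_b, distance, opening_angle):
    """Return True when two tree nodes of angular sizes size_a and size_b,
    separated by `distance` (all in radians), subtend an angle smaller
    than `opening_angle` and can be correlated as single points."""
    if distance <= 0.0:
        return False  # overlapping nodes must always be opened
    return max(size_a, size_b) / distance < opening_angle
```

A smaller opening angle forces more node pairs to be opened, approaching the brute-force result at higher computational cost; a larger one trades precision for speed.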
The shear–shear correlation function
Cosmological information, such as the dark matter distribution at different epochs, the amount of matter, and the expansion history, is contained in the so-called shear–shear correlation function. A thorough review of gravitational lensing can be found in [18]. The value of the shear field can be conveniently estimated from the complex ellipticity, ε, of a particular galaxy. Here, ε is defined as ε = (a − b)/(a + b) · e^(2iφ) for an ellipse with semi-axes a ≥ b and position angle φ.
Given that each galaxy has an orientation
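Under the convention ε = (a − b)/(a + b) · e^(2iφ), the complex ellipticity of a galaxy image with semi-axes a ≥ b and position angle φ can be computed directly. The following Python sketch assumes that convention (the function name is illustrative):

```python
import cmath

def complex_ellipticity(a, b, phi):
    """Complex ellipticity eps = (a - b)/(a + b) * exp(2i*phi) of an
    ellipse with semi-major axis a, semi-minor axis b and position
    angle phi (radians); a circle (a == b) gives eps = 0."""
    return (a - b) / (a + b) * cmath.exp(2j * phi)
```

The factor 2φ reflects the spin-2 nature of the shear: an ellipse is mapped onto itself by a rotation of 180 degrees.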
Baseline implementation
In the Appendix, an overview of the GPU architecture and the CUDA programming model is presented. In the following section, the CUDA baseline implementation of the shear–shear correlation code is described, while mentioning the technical aspects that affect the performance.
Conclusions
In this paper, the first GPU code for the computation of the shear–shear correlation function is presented. In the past, the computational cost of the problem has prevented a brute-force implementation, and approximations (kd-trees) were used, aiming to reduce the execution time. By using GPU computing, the shear–shear correlation function estimation without any simplifications can be achieved in a reasonable timescale. In this work, an implementation is shown where a 68-fold improvement in
Acknowledgements
The authors would like to thank Martin Kilbinger for permission to reproduce a figure from their paper.
IS would like to thank Tim Eifler for useful comments regarding the cosmology-related aspects of this work.
The authors would like to thank the Spanish Ministry of Science and Innovation (MICINN) for funding support through grants AYA2009-13936 and through the Consolider Ingenio-2010 program, under project CSD2007-00060.
CB is also supported by project 2009SGR1398 from the Generalitat de Catalunya.
References (30)
- et al., Cosmological calculations on the GPU, Astronomy and Computing (2013)
- et al., Weak gravitational lensing, Physics Reports (2001)
- et al., Report of the dark energy task force (2006)
- et al., Dark energy and the accelerating universe, Annual Review of Astronomy and Astrophysics (2008)
- et al., Very weak lensing in the CFHTLS wide: cosmology from cosmic shear in the linear regime, Astronomy and Astrophysics (2008)
- et al., Detection of weak gravitational lensing by large-scale structure, Monthly Notices of the Royal Astronomical Society (2000)
- Nick Kaiser, Gillian Wilson, Gerard A. Luppino, Large-scale cosmic shear measurements (2000)
- et al., Detection of correlated galaxy ellipticities on CFHT data: first evidence for gravitational lensing by large-scale structures, Astronomy and Astrophysics (2000)
- et al., Detection of weak gravitational lensing distortions of distant galaxies by cosmic dark matter at large scales, Nature (2000)
- et al., CFHTLenS: the Canada–France–Hawaii telescope lensing survey, Monthly Notices of the Royal Astronomical Society (2012)
- Euclid definition study report, Report Number: ESA/SRE(2011)12
- Fast algorithms and efficient statistics: N-point correlation functions, ESO Astrophysics Symposia