An improved generalized conjugate residual squared algorithm suitable for distributed parallel computing

https://doi.org/10.1016/j.cam.2014.04.009
Under an Elsevier user license
open archive

Abstract

In this paper, based on the GCRS algorithm of Zhang and Zhao (2010) and the ideas of Gu et al. (2007), we present an improved generalized conjugate residual squared (IGCRS) algorithm designed for distributed parallel environments. The improved algorithm reduces the two global synchronization points of the GCRS algorithm to one by reordering the computation so that all inner products in an iteration are independent of one another, which allows the communication time required for the inner products to be overlapped with useful computation. Theoretical analysis and a numerical comparison based on isoefficiency analysis show that the IGCRS method has better parallelism and scalability than the GCRS method, and that the parallel performance can be improved by a factor of about 2. Finally, numerical experiments show that the IGCRS method achieves better parallel performance and higher scalability than the GCRS method, with an improvement in communication of up to 52.19% on average, which agrees with our theoretical analysis.
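The general idea of merging synchronization points can be illustrated with a small sketch. It is not the IGCRS algorithm itself, whose recurrences are not given here; it only shows how two independent inner products over a vector split across ranks can share a single global reduction. The `FakeComm` class, its `allreduce_sum` method, and the functions `separate_dots` and `fused_dots` are hypothetical names introduced for illustration, and the "ranks" are simulated in one process with NumPy.

```python
import numpy as np


class FakeComm:
    """Hypothetical stand-in for a communicator; it counts global reductions."""

    def __init__(self):
        self.reductions = 0

    def allreduce_sum(self, per_rank_values):
        # One global synchronization point: sum the contribution of every rank.
        self.reductions += 1
        return np.sum(per_rank_values, axis=0)


def separate_dots(comm, chunks_x, chunks_y, chunks_z):
    # One reduction per inner product: two synchronization points.
    a = comm.allreduce_sum([np.dot(x, y) for x, y in zip(chunks_x, chunks_y)])
    b = comm.allreduce_sum([np.dot(x, z) for x, z in zip(chunks_x, chunks_z)])
    return a, b


def fused_dots(comm, chunks_x, chunks_y, chunks_z):
    # Each rank packs both local partial sums into one short vector,
    # so a single reduction delivers both inner products.
    local = [np.array([np.dot(x, y), np.dot(x, z)])
             for x, y, z in zip(chunks_x, chunks_y, chunks_z)]
    a, b = comm.allreduce_sum(local)
    return a, b
```

Fusing is only possible when neither inner product depends on the result of the other, which is the independence property the abstract describes. In a real distributed code the single reduction could additionally be issued as a non-blocking operation so that local work proceeds while it completes; that part is not modelled in this sketch.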

MSC

65F10
65F15

Keywords

Sparse nonsymmetric linear systems
IGCRS algorithm
Krylov subspace methods
Global communication

The research of this author is supported by the NSFC Tianyuan Mathematics Youth Fund (11226337), NSFC (61202098, 61203179, 61170309, 61033009, 91130024 and 11171039), the Aeronautical Science Foundation of China (2013ZD55006), the Project of Youth Backbone Teachers of Colleges and Universities in Henan Province (2013GGJS-142), the Major Project of the Development Foundation of Science and Technology of CAEP (2012A0202008), the Scientific and Technological Key Project of the Education Department of Henan Province (12B110028, 13B430355), the Basic and Advanced Technological Research Project of Henan Province (132300410373) and the School Youth Fund (2012113004).