A Fast GPU Implementation for Solving Sparse Ill-Posed Linear Equation Systems

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6067)

Abstract

Image reconstruction, in general a very compute-intensive process, can often be reduced to large linear equation systems represented as sparse under-determined matrices. Solvers for these equation systems (which are not restricted to image reconstruction) spend most of their time in sparse matrix-vector multiplications (SpMV). In this paper we present a GPU-accelerated scheme for a Conjugate Gradient (CG) solver, with a focus on the SpMV. We discuss and quantify the optimizations employed to meet a soft real-time constraint, and we compare against alternative solutions relying on FPGAs, the Cell Broadband Engine, a highly optimized SSE-based software implementation, and other GPU SpMV implementations.
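
To illustrate why SpMV dominates the run time of such a solver, the following is a minimal CPU sketch in Python, not the paper's GPU implementation. It assumes the matrix is stored in compressed sparse row (CSR) form and uses the CGLS variant of CG, which applies CG to the normal equations and so also works for under-determined systems; the abstract names neither the storage format nor the CG variant, so both are assumptions. All function and variable names (`csr_spmv`, `csr_spmv_transpose`, `cgls`) are hypothetical.

```python
def dot(u, v):
    """Dense dot product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


def csr_spmv(row_ptr, col_idx, vals, x):
    """y = A x for a CSR matrix A (assumed storage format)."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y.append(acc)
    return y


def csr_spmv_transpose(n_cols, row_ptr, col_idx, vals, x):
    """y = A^T x, computed by scattering each CSR row of A."""
    y = [0.0] * n_cols
    for i in range(len(row_ptr) - 1):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[col_idx[k]] += vals[k] * x[i]
    return y


def cgls(n_cols, row_ptr, col_idx, vals, b, max_iters=50, tol=1e-12):
    """CGLS: CG on A^T A x = A^T b without forming A^T A.

    Each iteration costs one SpMV with A and one with A^T;
    everything else is cheap dense vector work.
    """
    x = [0.0] * n_cols
    r = list(b)                                   # r = b - A x, with x = 0
    s = csr_spmv_transpose(n_cols, row_ptr, col_idx, vals, r)
    p = list(s)
    gamma = dot(s, s)
    for _ in range(max_iters):
        if gamma <= tol:                          # gradient small enough: stop
            break
        q = csr_spmv(row_ptr, col_idx, vals, p)   # SpMV with A
        alpha = gamma / dot(q, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = csr_spmv_transpose(n_cols, row_ptr, col_idx, vals, r)  # SpMV with A^T
        gamma_new = dot(s, s)
        beta = gamma_new / gamma
        p = [si + beta * pi for si, pi in zip(s, p)]
        gamma = gamma_new
    return x


# Toy under-determined system (2 equations, 3 unknowns):
# A = [[1, 0, 2],
#      [0, 3, 0]],  b = [5, 6]
row_ptr = [0, 2, 3]
col_idx = [0, 2, 1]
vals = [1.0, 2.0, 3.0]
b = [5.0, 6.0]

x = cgls(3, row_ptr, col_idx, vals, b)
residual = [bi - yi for bi, yi in zip(b, csr_spmv(row_ptr, col_idx, vals, x))]
print(x, residual)
```

Started from the zero vector, CGLS converges to the minimum-norm solution of a consistent under-determined system, which for this toy matrix is approximately `[1, 2, 2]`. The two SpMV calls per iteration are the only operations that touch the matrix, which is why an accelerated solver would concentrate its optimization effort there.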

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Stock, F., Koch, A. (2010). A Fast GPU Implementation for Solving Sparse Ill-Posed Linear Equation Systems. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Wasniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2009. Lecture Notes in Computer Science, vol 6067. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14390-8_48

  • DOI: https://doi.org/10.1007/978-3-642-14390-8_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14389-2

  • Online ISBN: 978-3-642-14390-8

  • eBook Packages: Computer Science, Computer Science (R0)