Parallel Algorithm for Quasi-Band Matrix-Matrix Multiplication

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9573)

Abstract

Sparse matrices arise in many practical scenarios. As a result, support for efficient operations such as multiplication of sparse matrices (spmm) is considered to be an important research area. Often, sparse matrices also exhibit particular characteristics that can be used towards better parallel algorithmics. In this paper, we focus on quasi-band sparse matrices that have a large majority of the non-zeros along the diagonals. We design and implement an efficient algorithm for multiplying two such matrices on a many-core architecture such as a GPU.

Our implementation outperforms the corresponding library implementation by a factor of 2 on average over a wide variety of quasi-band matrices from standard datasets. We also analyze its performance on synthetic quasi-band matrices.
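To make the notion of a quasi-band matrix concrete, the following is a minimal sketch that measures what fraction of a sparse matrix's non-zeros lie near the main diagonal. The abstract does not give the paper's exact definition or any threshold, so the function name `band_fraction`, the `half_bandwidth` parameter, and the coordinate-triple input format are all illustrative assumptions rather than the authors' method.

```python
def band_fraction(entries, half_bandwidth):
    """Return the fraction of non-zeros with |row - col| <= half_bandwidth.

    `entries` is a list of (row, col, value) triples (coordinate format).
    Both the name and the band criterion are assumptions for illustration.
    """
    nonzeros = [(r, c) for r, c, v in entries if v != 0]
    if not nonzeros:
        return 0.0
    in_band = sum(1 for r, c in nonzeros if abs(r - c) <= half_bandwidth)
    return in_band / len(nonzeros)


# A 5x5 tridiagonal matrix (13 non-zeros) plus two stray off-band entries.
entries = [(i, j, 1.0) for i in range(5) for j in range(5) if abs(i - j) <= 1]
entries += [(0, 4, 2.0), (4, 0, 3.0)]

# 13 of the 15 non-zeros fall inside the band of half-width 1.
print(band_fraction(entries, 1))
```

A matrix for which this fraction is close to 1 at a small `half_bandwidth` matches the informal description above; where to draw the line is left open here.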

Author information

Corresponding author

Correspondence to Dharma Teja Vooturi.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Vooturi, D.T., Kothapalli, K. (2016). Parallel Algorithm for Quasi-Band Matrix-Matrix Multiplication. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K., Kitowski, J., Wiatr, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2015. Lecture Notes in Computer Science, vol 9573. Springer, Cham. https://doi.org/10.1007/978-3-319-32149-3_11

  • DOI: https://doi.org/10.1007/978-3-319-32149-3_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-32148-6

  • Online ISBN: 978-3-319-32149-3

  • eBook Packages: Computer Science, Computer Science (R0)
