DOI: 10.1145/3587828.3587834
research-article

GPU Sparse Matrix Vector Multiplication Optimization Based on ELLB Storage Format

Published: 20 June 2023

ABSTRACT

The ELLPACK (ELL) sparse matrix storage format suffers from high storage consumption and low efficiency in sparse matrix-vector multiplication (SpMV). To address these problems, we propose ELLPACK-Block (ELLB), an efficient sparse matrix storage format for the Graphics Processing Unit (GPU). Building on the original ELL format, ELLB adaptively divides the matrix into blocks according to the average number of non-zero elements per row, and uses auxiliary matrices to improve the efficiency of SpMV. We use the ELLB format to solve the SpMV problem for a range of matrices. The experimental results show that, compared with the Perfect Compressed Sparse Row (PCSR) format, ELLB saves 50% of the memory space and increases SpMV efficiency by an average of 7 times; compared with the Effective Compressed Sparse Row (ECSR) format, ELLB uses 25% more memory space but increases SpMV efficiency by an average of 7.65 times.
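To make the storage problem concrete, the following is a minimal Python sketch of the baseline ELL format only. It is not the authors' ELLB format or GPU kernel, whose block layout and auxiliary matrices are not specified in the abstract. The function names (`to_ell`, `ell_spmv`, `padding_ratio`), the use of `-1` as a padding column index, and the example matrix are all assumptions made for illustration.

```python
def to_ell(dense):
    """Convert a dense matrix (list of row lists) to plain ELL arrays.

    Every row is padded to the length of the longest row; padded slots
    use column index -1 and value 0.0 (an illustrative convention).
    """
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    width = max((len(r) for r in rows), default=0)  # longest row sets the width
    cols = [[j for j, _ in r] + [-1] * (width - len(r)) for r in rows]
    vals = [[v for _, v in r] + [0.0] * (width - len(r)) for r in rows]
    return cols, vals, width


def ell_spmv(cols, vals, x):
    """Compute y = A @ x from ELL arrays, skipping padded slots."""
    y = []
    for row_cols, row_vals in zip(cols, vals):
        acc = 0.0
        for j, v in zip(row_cols, row_vals):
            if j >= 0:  # -1 marks padding
                acc += v * x[j]
        y.append(acc)
    return y


def padding_ratio(cols):
    """Fraction of stored slots that are padding rather than real entries."""
    total = sum(len(r) for r in cols)
    padded = sum(1 for r in cols for j in r if j < 0)
    return padded / total if total else 0.0


# Hypothetical example: one long row forces padding in the two short rows.
A = [
    [4.0, 0.0, 0.0, 0.0],
    [1.0, 2.0, 3.0, 5.0],
    [0.0, 0.0, 6.0, 0.0],
]
cols, vals, width = to_ell(A)
y = ell_spmv(cols, vals, [1.0, 1.0, 1.0, 1.0])
```

In this example a single four-element row sets the width for every row, so half of the stored slots are padding. Overhead of this kind, driven by uneven row lengths, is what grouping rows into blocks by their non-zero count aims to reduce.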

Published in

ICSCA '23: Proceedings of the 2023 12th International Conference on Software and Computer Applications
February 2023
385 pages
ISBN: 9781450398589
DOI: 10.1145/3587828

Copyright © 2023 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States