Enhancing the Sparse Matrix Storage Using Reordering Techniques

  • Conference paper
High Performance Computing (CARLA 2023)

Abstract

Sparse linear algebra kernels are memory-bound routines, and their performance varies significantly with the nonzero pattern of the sparse matrix operands. The computing power and memory bandwidth of modern massively parallel devices encourage researchers to develop sparse linear algebra kernels that exploit these platforms efficiently. One main line of work improves the storage of matrices to optimize the communication between memory and cores. In previous work, a strategy combining delta encoding with matrix reordering compressed the indexing data of the matrix, saving storage and communication. This work presents an algorithm that improves the reordering strategy and the resulting compression of the indexing data. The results show that this strategy yields significant storage savings, which can also reduce data movement between main memory and the processors.
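
As a rough illustration of why reordering can help delta encoding of the indexing data, the sketch below builds a small CSR pattern, delta-encodes its column indices, and compares the bit width needed before and after a symmetric reordering. It is not the algorithm proposed in the paper: it assumes a plain reverse Cuthill-McKee ordering, stores the first index of each row relative to the diagonal, and uses one fixed signed width for all deltas. The function names and the 8x8 example pattern are hypothetical.

```python
from collections import deque


def to_csr(n, entries):
    """Build CSR row pointers and sorted column indices from (row, col) pairs."""
    rows = [set() for _ in range(n)]
    for r, c in entries:
        rows[r].add(c)
    row_ptr, col_idx = [0], []
    for cols in rows:
        col_idx.extend(sorted(cols))
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx


def delta_encode(row_ptr, col_idx):
    """Store each column index as its gap to the previous one in the row.

    Assumption: the first entry of row r is stored relative to r (the diagonal).
    """
    deltas = []
    for r in range(len(row_ptr) - 1):
        prev = r
        for k in range(row_ptr[r], row_ptr[r + 1]):
            deltas.append(col_idx[k] - prev)
            prev = col_idx[k]
    return deltas


def delta_decode(row_ptr, deltas):
    """Invert delta_encode by accumulating the gaps row by row."""
    col_idx = []
    for r in range(len(row_ptr) - 1):
        prev = r
        for k in range(row_ptr[r], row_ptr[r + 1]):
            prev += deltas[k]
            col_idx.append(prev)
    return col_idx


def signed_width(values):
    """Bits for a fixed-width signed field that holds every value."""
    return max(abs(v) for v in values).bit_length() + 1


def rcm_order(n, entries):
    """Reverse Cuthill-McKee: BFS from low-degree vertices, then reverse."""
    adj = [set() for _ in range(n)]
    for r, c in entries:
        if r != c:
            adj[r].add(c)
            adj[c].add(r)
    visited, order = [False] * n, []
    for start in sorted(range(n), key=lambda v: len(adj[v])):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in sorted(adj[v], key=lambda u: (len(adj[u]), u)):
                if not visited[w]:
                    visited[w] = True
                    queue.append(w)
    return order[::-1]


# Hypothetical pattern: a chain of 8 unknowns with scrambled labels, plus the diagonal.
n = 8
chain = [0, 5, 2, 7, 1, 6, 3, 4]
entries = [(i, i) for i in range(n)]
for a, b in zip(chain, chain[1:]):
    entries += [(a, b), (b, a)]

row_ptr, col_idx = to_csr(n, entries)
deltas_before = delta_encode(row_ptr, col_idx)
width_before = signed_width(deltas_before)

# Apply the same permutation to rows and columns (symmetric reordering).
order = rcm_order(n, entries)
new_label = {old: new for new, old in enumerate(order)}
permuted = [(new_label[r], new_label[c]) for r, c in entries]
row_ptr_p, col_idx_p = to_csr(n, permuted)
deltas_after = delta_encode(row_ptr_p, col_idx_p)
width_after = signed_width(deltas_after)

print("bits per delta before reordering:", width_before)
print("bits per delta after reordering:", width_after)
```

In this toy case the reordering turns the pattern into a tridiagonal one, so every delta is small and a narrower field suffices. Real matrices rarely compress this cleanly, and the savings depend on the encoding and reordering actually used.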

Acknowledgments

This work is partially funded by the UDELAR CSIC-INI project CompactDisp: Formatos dispersos eficientes para arquitecturas de hardware modernas. The authors also thank PEDECIBA Informática and the University of the Republic, Uruguay.

Author information

Corresponding author

Correspondence to Manuel Freire.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Freire, M., Marichal, R., Gonzaga de Oliveira, S.L., Dufrechou, E., Ezzatti, P. (2024). Enhancing the Sparse Matrix Storage Using Reordering Techniques. In: Barrios H., C.J., Rizzi, S., Meneses, E., Mocskos, E., Monsalve Diaz, J.M., Montoya, J. (eds) High Performance Computing. CARLA 2023. Communications in Computer and Information Science, vol 1887. Springer, Cham. https://doi.org/10.1007/978-3-031-52186-7_5

  • DOI: https://doi.org/10.1007/978-3-031-52186-7_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-52185-0

  • Online ISBN: 978-3-031-52186-7

  • eBook Packages: Computer Science, Computer Science (R0)
