10 November 2022 Transplantation and optimization of SpMV algorithm based on DCU accelerator
Mengyu Bo, Junli He, Yuanyuan Yue, Jiandong Shang, Lin Han, Pu Han
Proceedings Volume 12331, International Conference on Mechanisms and Robotics (ICMAR 2022); 123314X (2022) https://doi.org/10.1117/12.2653051
Event: International Conference on Mechanisms and Robotics (ICMAR 2022), 2022, Zhuhai, China
Abstract
To exploit the strengths of the DCU accelerator and address the limited bandwidth, load imbalance, and uncoalesced memory access that hamper SpMV (sparse matrix-vector multiplication), this paper proposes SCSR (Static Compressed Sparse Row), an SpMV algorithm for the DCU accelerator built on the CSR (Compressed Sparse Row) storage format. The algorithm statically assigns the same number of rows to each thread block, with that number determined by the average number of non-zero elements per row, to avoid unnecessary computation. It also reduces the amount of storage requested by reusing the on-chip high-speed LDS (Local Data Share) memory, which improves CU (Compute Unit) occupancy. Experiments on 15 sparse matrices from different fields show that SCSR achieves an average speedup of 4.83x over the SpMV implementation in the hipSPARSE library.
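The static row assignment described in the abstract can be illustrated with a minimal CPU-side sketch in Python. This is not the authors' DCU kernel: the function name `csr_spmv_static_blocks`, the `block_size` parameter, and the rule `rows_per_block = block_size // ceil(average non-zeros per row)` are assumptions made for illustration, since the abstract does not state the exact formula. LDS reuse is not modeled.

```python
from math import ceil


def csr_spmv_static_blocks(row_ptr, col_idx, vals, x, block_size=256):
    """Compute y = A @ x for a CSR matrix, walking rows in equal-sized chunks.

    Each chunk stands in for one GPU thread block. Hypothetical helper,
    not taken from the paper.
    """
    n_rows = len(row_ptr) - 1
    nnz = row_ptr[-1]

    # Assumption: a block is given as many rows as would, on average,
    # supply about `block_size` non-zeros. The paper's actual rule may differ.
    avg_nnz = max(1, ceil(nnz / n_rows)) if n_rows else 1
    rows_per_block = max(1, block_size // avg_nnz)

    y = [0.0] * n_rows
    # Outer loop: one iteration per simulated thread block.
    for block_start in range(0, n_rows, rows_per_block):
        block_end = min(block_start + rows_per_block, n_rows)
        # Every block gets the same number of rows (the last may get fewer).
        for row in range(block_start, block_end):
            acc = 0.0
            for k in range(row_ptr[row], row_ptr[row + 1]):
                acc += vals[k] * x[col_idx[k]]
            y[row] = acc
    return y, rows_per_block


# Usage: A = [[1, 0, 2], [0, 3, 0], [4, 0, 5]], x = [1, 2, 3]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
y, rows_per_block = csr_spmv_static_blocks(
    row_ptr, col_idx, vals, [1.0, 2.0, 3.0], block_size=4
)
print(y, rows_per_block)
```

Because the partition depends only on the row count and the total number of non-zeros, it can be computed once before the kernel launch, which is presumably what "static" refers to here.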
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Mengyu Bo, Junli He, Yuanyuan Yue, Jiandong Shang, Lin Han, and Pu Han "Transplantation and optimization of SpMV algorithm based on DCU accelerator", Proc. SPIE 12331, International Conference on Mechanisms and Robotics (ICMAR 2022), 123314X (10 November 2022); https://doi.org/10.1117/12.2653051
KEYWORDS
Data storage

Copper

Optimization (mathematics)

Matrices

Wavefronts

Algorithm development

Data processing
