Computer Science > Distributed, Parallel, and Cluster Computing
[Submitted on 11 Jul 2023]
Title:MG3MConv: Multi-Grained Matrix-Multiplication-Mapping Convolution Algorithm toward the SW26010 Processor
View PDFAbstract:As the core of artificial intelligence applications, the research of convolution has become a hot topic in high performance computing. With the rapid development of the emerging SW26010 processor in artificial intelligence, there is an urgent need for high-performance convolution algorithms on the processor. However, the current support of convolution on SW26010 is still rudimentary. The only studies provide sufficient runtime peak performance but lack the adaptability to various convolution scenes. To perfect convolution algorithms on SW26010, we propose a multi-grained matrix-multiplication-mapping convolution algorithm called MG3MConv, which targets the architectural features of SW26010. MG3MConv supports diversified mapping schemes of convolution tasks based on the concept of the thread block proposed in this paper. All the architecture-oriented optimization methods are elaborately designed from four levels to fully exploit the hardware efficiency of SW26010. The experiments show that the hardware efficiency of MG3MConv can reach 84.78% in max, which is 1.75 times compared with that of cuDNN based on NVIDIA K80m GPU. Moreover, MG3MConv can overperform cuDNN in most convolution scenes. We also use six representative CNNs as real-world cases, and the hardware efficiency of MG3MConv reaches up to 67.04% on the VGG network model, which is 1.37 times and 1.96 times that of cuDNN and swDNN, respectively.
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.