ABSTRACT
Convolution is the most time-consuming part of the computation in convolutional neural networks (CNNs). Due to complex data dependencies and the growing size of models, convolution suffers from high data-movement overhead. This work provides a comprehensive analysis and methodologies for minimizing the communication incurred by convolutions in CNNs. Through an in-depth analysis of I/O complexity under the red-blue pebble game model, we develop a general communication lower bound theory for composite algorithms consisting of several different sub-computations. Based on this theory, we establish data movement lower bounds for three main convolution algorithms in CNNs: direct convolution, the im2col method, and the Winograd algorithm. Furthermore, guided by these I/O lower bounds, we design near communication-optimal strategies for each of the three algorithms by fully exploiting data reuse. Our analysis demonstrates that these designs nearly reach the minimum communication in a two-level memory hierarchy.
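To make the im2col method concrete, the sketch below shows how it lowers a 2D convolution to a single matrix multiplication, compared against a direct convolution. This is a minimal illustrative implementation (single channel, stride 1, no padding), not the optimized strategy developed in the paper; the function names are our own.

```python
import numpy as np

def im2col(x, kh, kw):
    """Unfold every kh x kw patch of a 2D image x into one column.

    The resulting matrix has one column per output position, so the
    convolution reduces to a vector-matrix product. The patch data is
    replicated, which is exactly the extra data movement the paper's
    lower-bound analysis accounts for.
    """
    H, W = x.shape
    oh, ow = H - kh + 1, W - kw + 1
    cols = np.empty((kh * kw, oh * ow))
    for i in range(oh):
        for j in range(ow):
            cols[:, i * ow + j] = x[i:i + kh, j:j + kw].ravel()
    return cols

def conv2d_im2col(x, k):
    """Convolution via im2col: one GEMV over the unfolded patches."""
    oh = x.shape[0] - k.shape[0] + 1
    ow = x.shape[1] - k.shape[1] + 1
    return (k.ravel() @ im2col(x, *k.shape)).reshape(oh, ow)

def conv2d_direct(x, k):
    """Direct convolution: slide the kernel and accumulate in place."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out
```

Both paths compute the same result; they differ in how operands move through the memory hierarchy, which is what the communication bounds compare.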