Abstract
Overfitting is one of the most challenging problems in deep neural networks with large numbers of trainable parameters. To prevent networks from overfitting, dropout, a strong regularization technique, has been widely used in fully-connected neural networks (FCNNs). In several state-of-the-art convolutional neural network (CNN) architectures for object classification, however, dropout was applied only partially or not at all, since its accuracy gain was relatively insignificant in most cases. The batch normalization technique has also reduced the need for dropout because of its own regularization effect. In this paper, we show that conventional element-wise dropout can be ineffective for convolutional layers. We found that dropout between channels in a CNN is functionally similar to dropout in an FCNN, and that spatial dropout can be an effective way to exploit the dropout technique for regularization. To support these points, we conducted several experiments on the CIFAR-10 and CIFAR-100 databases. For comparison, we replaced only the dropout layers with spatial dropout layers and kept all other hyperparameters and methods intact. DenseNet-BC with spatial dropout showed promising results (a 3.32% error rate on CIFAR-10 with 3.0 M parameters) compared with other existing competitive methods.
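The core distinction the abstract draws is between element-wise dropout, which masks individual activations, and spatial dropout, which drops entire feature-map channels. A minimal NumPy sketch of inverted spatial dropout (an illustrative helper, not the paper's actual implementation) makes the difference concrete: one Bernoulli variable is sampled per (sample, channel) pair, and surviving channels are rescaled by 1/(1-p).

```python
import numpy as np

def spatial_dropout(x, p, rng):
    """Zero out entire channels of a (N, C, H, W) activation tensor.

    Unlike element-wise dropout, which samples a mask per activation,
    spatial dropout samples one keep/drop decision per (sample, channel)
    pair, so strongly correlated activations within a channel are
    dropped together. Survivors are scaled by 1/(1-p) (inverted dropout),
    so no rescaling is needed at test time.
    """
    n, c, _, _ = x.shape
    keep = (rng.random((n, c, 1, 1)) >= p).astype(x.dtype)
    return x * keep / (1.0 - p)

rng = np.random.default_rng(0)
x = np.ones((2, 8, 4, 4))          # toy activations, all ones
y = spatial_dropout(x, p=0.5, rng=rng)
# Every channel of y is either all zeros (dropped) or uniformly 2.0 (kept).
```

In framework terms, this corresponds to replacing an element-wise dropout layer with a channel-wise one (e.g., PyTorch's `nn.Dropout2d`) while leaving the rest of the network unchanged, which is the substitution the experiments perform.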
Acknowledgements
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (2017R1E1A2A01079495).
Cite this article
Lee, S., Lee, C. Revisiting spatial dropout for regularizing convolutional neural networks. Multimed Tools Appl 79, 34195–34207 (2020). https://doi.org/10.1007/s11042-020-09054-7