
DropCircuit: A Modular Regularizer for Parallel Circuit Networks

Published in: Neural Processing Letters

Abstract

How to design and train increasingly large neural network models has been an active research topic for several years. However, while there are many studies on training deeper and/or wider models, there is relatively little systematic research on the effective use of wide, modular neural networks. Addressing this gap, and aiming to reduce lengthy training times, we previously proposed Parallel Circuits (PCs), a biologically inspired architecture based on the design of the retina. In that earlier work we showed that the approach achieves substantial speed gains but fails to maintain generalization performance. To address this issue, and motivated by the way dropout prevents node co-adaptation, in this paper we propose an improvement that extends dropout to the parallel-circuit architecture. The paper provides empirical evidence and several insights into this combination. Experiments show promising results: improved error rates in most cases, while maintaining the speed advantage of the PC approach.
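As a rough illustration of the idea described in the abstract, the sketch below applies dropout at the granularity of whole parallel circuits rather than individual units: each circuit is an independent narrow sub-network whose outputs are combined, and during training entire circuits are randomly dropped while the survivors are rescaled as in inverted dropout. This is only a minimal sketch based on the abstract, not the authors' implementation; the class name ParallelCircuitBlock, the summation of circuit outputs, and the n_circuits and drop_prob parameters are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): dropout applied to whole parallel
# circuits rather than to individual units. Each circuit is an independent
# narrow sub-network; their outputs are summed, and during training entire
# circuits are randomly dropped and the survivors rescaled (inverted dropout).
# ParallelCircuitBlock, n_circuits and drop_prob are illustrative assumptions.
import torch
import torch.nn as nn


class ParallelCircuitBlock(nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim, n_circuits=4, drop_prob=0.5):
        super().__init__()
        # Each "circuit" is a small independent sub-network over the same input.
        self.circuits = nn.ModuleList(
            nn.Sequential(
                nn.Linear(in_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, out_dim),
            )
            for _ in range(n_circuits)
        )
        self.drop_prob = drop_prob

    def forward(self, x):
        # Stack circuit outputs: shape (n_circuits, batch, out_dim).
        stacked = torch.stack([circuit(x) for circuit in self.circuits], dim=0)
        if self.training:
            # Bernoulli mask over whole circuits; rescale kept circuits by 1/(1-p).
            keep = (torch.rand(len(self.circuits), device=x.device) > self.drop_prob).float()
            stacked = stacked * (keep / (1.0 - self.drop_prob)).view(-1, 1, 1)
        return stacked.sum(dim=0)


# Usage: a toy forward pass on random data.
model = ParallelCircuitBlock(in_dim=784, hidden_dim=64, out_dim=10)
logits = model(torch.randn(32, 784))  # -> shape (32, 10)
```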





Acknowledgements

The project was sponsored by the Crop for the Future Research Center (CFFRC).

Author information


Corresponding author

Correspondence to Kien Tuong Phan.


About this article


Cite this article

Phan, K.T., Maul, T.H., Vu, T.T. et al. DropCircuit: A Modular Regularizer for Parallel Circuit Networks. Neural Process Lett 47, 841–858 (2018). https://doi.org/10.1007/s11063-017-9677-4

