
DropCircuit: A Modular Regularizer for Parallel Circuit Networks

Published in: Neural Processing Letters

Abstract

How to design and train increasingly large neural network models has been an active research topic for several years. However, while there are many studies on training deeper and/or wider models, there is relatively little systematic research on the effective use of wide, modular neural networks. Addressing this gap, and aiming to reduce lengthy training times, we previously proposed Parallel Circuits (PCs), a biologically inspired architecture based on the design of the retina. In that earlier work we showed that the approach achieves substantial speed gains but fails to maintain generalization performance. To address this issue, and motivated by the way dropout prevents node co-adaptation, in this paper we propose an improvement that extends dropout to the parallel-circuit architecture. The paper provides empirical evidence and several insights into this combination. Experiments show promising results: improved error rates in most cases, while maintaining the speed advantage of the PC approach.
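As a rough illustration of the idea described in the abstract, the sketch below applies dropout at the granularity of whole parallel circuits rather than individual units: each circuit is an independent narrow sub-network whose outputs are combined, and during training entire circuits are randomly dropped while the survivors are rescaled as in inverted dropout. This is only a minimal sketch based on the abstract, not the authors' implementation; the class name ParallelCircuitBlock, the summation of circuit outputs, and the n_circuits and drop_prob parameters are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): dropout applied to whole parallel
# circuits rather than to individual units. Each circuit is an independent
# narrow sub-network; their outputs are summed, and during training entire
# circuits are randomly dropped and the survivors rescaled (inverted dropout).
# ParallelCircuitBlock, n_circuits and drop_prob are illustrative assumptions.
import torch
import torch.nn as nn


class ParallelCircuitBlock(nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim, n_circuits=4, drop_prob=0.5):
        super().__init__()
        # Each "circuit" is a small independent sub-network over the same input.
        self.circuits = nn.ModuleList(
            nn.Sequential(
                nn.Linear(in_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, out_dim),
            )
            for _ in range(n_circuits)
        )
        self.drop_prob = drop_prob

    def forward(self, x):
        # Stack circuit outputs: shape (n_circuits, batch, out_dim).
        stacked = torch.stack([circuit(x) for circuit in self.circuits], dim=0)
        if self.training:
            # Bernoulli mask over whole circuits; rescale kept circuits by 1/(1-p).
            keep = (torch.rand(len(self.circuits), device=x.device) > self.drop_prob).float()
            stacked = stacked * (keep / (1.0 - self.drop_prob)).view(-1, 1, 1)
        return stacked.sum(dim=0)


# Usage: a toy forward pass on random data.
model = ParallelCircuitBlock(in_dim=784, hidden_dim=64, out_dim=10)
logits = model(torch.randn(32, 784))  # -> shape (32, 10)
```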





Acknowledgements

The project was sponsored by the Crop for the Future Research Center (CFFRC).

Author information


Corresponding author

Correspondence to Kien Tuong Phan.


About this article


Cite this article

Phan, K.T., Maul, T.H., Vu, T.T. et al. DropCircuit: A Modular Regularizer for Parallel Circuit Networks. Neural Process Lett 47, 841–858 (2018). https://doi.org/10.1007/s11063-017-9677-4

