Abstract—Deep artificial neural networks (ANNs) achieve state-of-the-art performance in many computer vision tasks; however, their industrial applicability is significantly hindered by their high computational complexity. In this paper, we propose a model of an ANN classifier with a cascade architecture, which lowers the average computational complexity of the system by classifying simple input samples without performing the full volume of calculations. We propose a method for the joint optimization of all ANNs in the cascade. We introduce a joint loss function that contains a term responsible for the complexity of the model and makes it possible to control the trade-off between the precision and speed of the resulting system. We train the model on the CIFAR-10 dataset with the proposed method and show that the resulting model is a Pareto improvement (with respect to speed and precision) over a model trained in the traditional way.
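The abstract does not spell out the joint loss, so the following is only an illustrative sketch of one plausible form: a classification term (cross-entropy averaged over cascade stages) plus a complexity term weighted by a coefficient. Here the "continue to the next stage" probability is taken as one minus the first stage's softmax confidence; the function name `cascade_joint_loss`, the confidence-based gating, and the weight `lam` are all assumptions for illustration, not the authors' actual formulation.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class axis.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cascade_joint_loss(logits_stages, labels, costs, lam=0.1):
    """Illustrative joint loss for a two-stage cascade.

    logits_stages: list of (N, C) logit arrays, one per cascade stage.
    labels:        (N,) integer class labels.
    costs:         per-stage computational cost (e.g., relative FLOPs).
    lam:           trade-off weight on the expected-complexity term.
    """
    n = labels.shape[0]
    # Classification term: cross-entropy averaged over all stages.
    ce = 0.0
    for logits in logits_stages:
        p = softmax(logits)
        ce += -np.log(p[np.arange(n), labels] + 1e-12).mean()
    ce /= len(logits_stages)

    # Complexity term: stage 1 always runs; stage 2 runs with a
    # probability modeled as "first stage is not confident".
    p1 = softmax(logits_stages[0])
    continue_prob = 1.0 - p1.max(axis=1)
    expected_cost = costs[0] + continue_prob.mean() * costs[1]

    return ce + lam * expected_cost
```

Raising `lam` penalizes samples that fall through to the expensive stage, pushing the optimizer toward a faster (if slightly less precise) operating point on the speed/precision Pareto front.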
Notes
The classes used were aircraft, car, bird, cat, deer, dog, frog, horse, ship, and truck.
Funding
This work was supported in part by the Russian Foundation for Basic Research, project nos. 17-29-03514 and 16-07-01167.
Translated by E. Chernokozhin
Cite this article
Teplyakov, L.M., Gladilin, S.A., Shvets, E.A. et al. Training of Neural Network-Based Cascade Classifiers. J. Commun. Technol. Electron. 64, 846–853 (2019). https://doi.org/10.1134/S1064226919080254