A novel hardware-oriented ultra-high-speed object detection algorithm based on convolutional neural network

  • Original Research Paper
  • Published in: Journal of Real-Time Image Processing (2020)

Abstract

This paper describes a hardware-oriented two-stage algorithm that can be deployed on a resource-limited field-programmable gate array (FPGA) for fast object detection and recognition without external memory. The first stage proposes bounding boxes with a conventional object detection method, and the second performs convolutional neural network (CNN)-based classification to improve accuracy. Frequent access to external memory significantly degrades the execution efficiency of object classification. Unfortunately, existing CNN models with large numbers of parameters are difficult to deploy on FPGAs with limited on-chip memory resources. In this study, we designed a compact CNN model and performed hardware-oriented quantization of its parameters and intermediate results. As a result, CNN-based ultra-fast object classification was realized with all parameters and intermediate results stored on chip. Several evaluations were performed to demonstrate the performance of the proposed algorithm. The object classification module consumes only 163.67 Kbits of on-chip memory for ten regions of interest (ROIs), which makes it suitable for low-end FPGA devices. In terms of accuracy, our method achieves a correctness rate of 98.01% on the open-source MNIST data set and over 96.5% on three self-built data sets, which is distinctly better than conventional ultra-high-speed object detection algorithms.
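To make the compact-model idea concrete, the following is a minimal PyTorch sketch of a small CNN classifier for single-channel ROIs, together with post-training fixed-point quantization of its parameters. The layer sizes, the 8-bit symmetric per-tensor scheme, and the names CompactCNN and quantize_fixed_point are illustrative assumptions for this sketch, not the exact architecture or quantization rule defined in the paper.

import torch
import torch.nn as nn

class CompactCNN(nn.Module):
    """A small CNN whose parameters could plausibly fit on chip (assumed sizes)."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),   # 8 small filters
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 28x28 -> 14x14
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(16 * 7 * 7, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def quantize_fixed_point(t: torch.Tensor, bits: int = 8) -> torch.Tensor:
    """Symmetric uniform quantization of a tensor to signed `bits`-bit fixed point."""
    qmax = 2 ** (bits - 1) - 1
    scale = t.abs().max().clamp_min(1e-8) / qmax  # per-tensor scale (assumption)
    q = torch.clamp(torch.round(t / scale), -qmax, qmax)
    return q * scale  # dequantized values for software evaluation

model = CompactCNN()
with torch.no_grad():
    for p in model.parameters():  # quantize all weights and biases in place
        p.copy_(quantize_fixed_point(p, bits=8))

logits = model(torch.randn(1, 1, 28, 28))  # one MNIST-sized grayscale ROI
print(logits.shape)  # torch.Size([1, 10])

On hardware, the integer values q and the scale would be stored on chip and the multiply-accumulates performed in fixed point; the dequantized form above is only a convenient way to check classification accuracy in software.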

Acknowledgements

This work was partly supported by the National Natural Science Foundation of China (Grant No. 61673376).

Author information

Correspondence to Qingyi Gu.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, J., Long, X., Hu, S. et al. A novel hardware-oriented ultra-high-speed object detection algorithm based on convolutional neural network. J Real-Time Image Proc 17, 1703–1714 (2020). https://doi.org/10.1007/s11554-019-00931-5
