
Octave convolution-based vehicle detection using frame-difference as network input

Original article · The Visual Computer

Abstract

Vehicle detection in video frames has typically been treated the same way as detection in an isolated image. However, models designed for isolated images are blind to fast-moving vehicles and cannot localize moving targets that are partially occluded in the scene. We therefore combine a classic moving-target detection method with a neural-network approach. First, we feed a three-frame difference image into the YOLOv3 network as a second input; it carries motion information from the preceding and following frames and helps detect partially occluded vehicles. Second, we rebuild the network with Octave Convolution, which reduces memory and computational cost while improving accuracy. Experiments show that, compared with the original YOLOv3 on the UA-DETRAC data set, combining these methods increases AP by 2.31%, recall by 4.01%, and precision by 3.10%, demonstrating that the proposed method is indeed effective.
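As a rough illustration of the motion cue described above (not the authors' exact pipeline; the function name and threshold are assumed placeholders), a three-frame difference mask can be sketched in Python with NumPy:

```python
import numpy as np

def three_frame_difference(prev_f, cur_f, next_f, thresh=25):
    """Combine |cur - prev| and |next - cur| so that only pixels
    changing across both frame pairs are marked as moving."""
    d1 = np.abs(cur_f.astype(np.int16) - prev_f.astype(np.int16))
    d2 = np.abs(next_f.astype(np.int16) - cur_f.astype(np.int16))
    # A pixel counts as motion only if it changed in both differences.
    motion = np.logical_and(d1 > thresh, d2 > thresh)
    return (motion * 255).astype(np.uint8)
```

In a setup like the paper's, a mask of this kind (or the raw difference image) would be stacked alongside the current RGB frame and passed to the detector as a second input stream.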





Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61562009), the Open Fund Project in Semiconductor Power Device Reliability Engineering Center of Ministry of Education (No. ERCMEKFJJ2019-06), and the Guizhou University Introduced Talent Research Project (No. 2015-29).

Author information


Corresponding author

Correspondence to Benliang Xie.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Open access

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Hu, J., Liu, R., Chen, Z. et al. Octave convolution-based vehicle detection using frame-difference as network input. Vis Comput 39, 1503–1515 (2023). https://doi.org/10.1007/s00371-022-02425-1

