Abstract
Object detection is a major sub-domain of computer vision that deals with identifying objects of pre-defined classes. Object recognition is essential for identifying the relevant objects in an image or video. Several deep learning and machine learning based techniques are used for object detection in digital images and videos. This paper presents a comparative study of some deep learning based object detection frameworks, analysed using the benchmark metric mean Average Precision (mAP). The selected models are evaluated on the PASCAL VOC 2007 dataset, a standard image dataset for object class identification and recognition. Among the selected detection models, PVANet has the highest mAP (84.9) at 21.7 FPS and is considered the best object detection method.
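For readers unfamiliar with the metric, the following is a minimal sketch of how mAP is commonly computed under the PASCAL VOC 2007 protocol, which uses 11-point interpolated average precision per class. It is not the authors' evaluation code. The function names are hypothetical, and the sketch assumes each detection has already been labelled as a true or false positive by matching it to ground truth (conventionally at an IoU threshold of 0.5).

```python
from typing import Dict, Sequence


def average_precision_11pt(
    scores: Sequence[float],
    is_true_positive: Sequence[bool],
    num_ground_truth: int,
) -> float:
    """11-point interpolated AP for one class (VOC 2007 style).

    Assumes the TP/FP label of every detection was decided upstream.
    """
    # Rank detections by descending confidence.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = 0
    fp = 0
    precisions = []
    recalls = []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # Average the best precision reachable at recall >= t
    # for t in {0.0, 0.1, ..., 1.0}.
    ap = 0.0
    for k in range(11):
        threshold = k / 10
        reachable = [p for p, r in zip(precisions, recalls) if r >= threshold]
        ap += max(reachable, default=0.0) / 11
    return ap


def mean_average_precision(per_class_ap: Dict[str, float]) -> float:
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap.values()) / len(per_class_ap)
```

A detector's mAP on the dataset is then the mean of `average_precision_11pt` over all object classes. Reported figures are usually this value multiplied by 100.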
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Kurian, E., Mathew, J. (2020). Comparative Study on Deep Learning Frameworks for Object Detection. In: Smys, S., Senjyu, T., Lafata, P. (eds) Second International Conference on Computer Networks and Communication Technologies. ICCNCT 2019. Lecture Notes on Data Engineering and Communications Technologies, vol 44. Springer, Cham. https://doi.org/10.1007/978-3-030-37051-0_9
DOI: https://doi.org/10.1007/978-3-030-37051-0_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37050-3
Online ISBN: 978-3-030-37051-0
eBook Packages: Engineering, Engineering (R0)