Smart Recognition System for Business Predictions (You Only Look Once – V3) Unified, Real-Time Object Detection

Susmitha, Allumallu Veera Venkata

doi:10.1007/978-3-030-32530-5_9

Part of the book series: EAI/Springer Innovations in Communication and Computing (EAISICC)

Abstract

Object detection is considered one of the most challenging problems in computer vision, as it combines object classification and object localization within a scene. Recently, deep neural networks (DNNs) have been shown to achieve better object detection performance than other approaches, with YOLOv2 (an improved You Only Look Once model) being one of the state-of-the-art DNN-based object detection methods in both speed and accuracy. Although YOLOv2 can achieve real-time performance on a powerful GPU, it remains very challenging to use this approach for real-time object detection in video on embedded computing devices with limited computational power and limited memory. In this paper, we propose a new framework called Fast YOLOv3, a fast You Only Look Once framework that accelerates YOLOv2 so that it can perform object detection in video on embedded devices in real time. First, we use the evolutionary deep intelligence framework to evolve the YOLOv3 architecture and produce an optimized architecture (O-YOLOv3) that has 2.8× fewer parameters with only an ~2% drop in IOU. To further reduce power consumption on embedded devices while maintaining performance, a motion-adaptive inference method is introduced into the proposed Fast YOLO framework to reduce the frequency of deep inference with O-YOLOv3 based on temporal motion characteristics.
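The motion-adaptive inference idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the frame-differencing test, the thresholds, and the names `motion_fraction` and `MotionAdaptiveDetector` are assumptions introduced here, and `detect` stands in for any detector (for example, a call into an optimized YOLO network). The sketch runs deep inference only when enough pixels have changed since the last frame on which the detector ran, and otherwise reuses the cached detections.

```python
from typing import Callable, List, Optional, Sequence

# Assumption: a frame is a grayscale image given as rows of 0-255 intensities.
Frame = Sequence[Sequence[int]]


def motion_fraction(prev: Frame, curr: Frame, pixel_delta: int = 25) -> float:
    """Return the fraction of pixels whose intensity changed by more than pixel_delta."""
    changed = 0
    total = 0
    for prev_row, curr_row in zip(prev, curr):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(p - c) > pixel_delta:
                changed += 1
    return changed / total if total else 0.0


class MotionAdaptiveDetector:
    """Hypothetical wrapper that skips deep inference on low-motion frames."""

    def __init__(self, detect: Callable[[Frame], List],
                 motion_threshold: float = 0.05) -> None:
        self.detect = detect                      # the expensive detector call
        self.motion_threshold = motion_threshold  # assumed tuning parameter
        self.reference: Optional[Frame] = None    # last frame that was fully processed
        self.cached: Optional[List] = None        # detections from that frame
        self.inference_calls = 0

    def process(self, frame: Frame) -> List:
        """Run the detector only if the frame differs enough from the reference."""
        needs_inference = (
            self.reference is None
            or motion_fraction(self.reference, frame) > self.motion_threshold
        )
        if needs_inference:
            self.cached = self.detect(frame)
            self.reference = frame
            self.inference_calls += 1
        return self.cached
```

Comparing each frame against the last fully processed frame, rather than the immediately preceding one, is a design choice of this sketch: slow drift accumulates until it crosses the threshold and triggers a fresh inference.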

Author information

Authors and Affiliations

National Chiao Tung University, Hsinchu, Taiwan
Allumallu Veera Venkata Susmitha

Editor information

Editors and Affiliations

Department of CSE, Vel Tech Rangarajan Dr Sagunthala R&D, Institute of Science and Technology, Chennai, India
G. R. Kanagachidambaresan
Department of EEE, Amrita Vishwa Vidyapeetham University, Bangalore, India
R. Anand
Department of Mechanical Engineering, Vel Tech Rangarajan Dr Sagunthala R&D, Institute of Science and Technology, Chennai, India
E. Balasubramanian
Department of ECE, Vel Tech Rangarajan Dr Sagunthala R&D, Institute of Science and Technology, Chennai, India
V. Mahima

About this chapter

Cite this chapter

Susmitha, A.V.V. (2020). Smart Recognition System for Business Predictions (You Only Look Once – V3) Unified, Real-Time Object Detection. In: Kanagachidambaresan, G., Anand, R., Balasubramanian, E., Mahima, V. (eds) Internet of Things for Industry 4.0. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-32530-5_9

DOI: https://doi.org/10.1007/978-3-030-32530-5_9
Published: 29 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32529-9
Online ISBN: 978-3-030-32530-5
eBook Packages: Engineering, Engineering (R0)
