Abstract
Object detection is considered one of the most challenging problems in computer vision, as it involves the combination of object classification and object localization within a scene. Recently, deep neural networks (DNNs) have been shown to achieve superior object detection performance compared to other approaches, with YOLOv2 (an improved You Only Look Once model) being one of the state-of-the-art DNN-based object detection methods in terms of both speed and accuracy. Although YOLOv2 can achieve real-time performance on a powerful GPU, it remains very challenging to use this approach for real-time object detection in video on embedded computing devices with limited computational power and limited memory. In this paper, we propose a new framework called Fast YOLOv3, a fast You Only Look Once framework that accelerates YOLOv2 so that it can perform object detection in video on embedded devices in real time. First, we use the evolutionary deep intelligence framework to evolve the YOLOv3 architecture and produce an optimized architecture (referred to here as O-YOLOv3) that has 2.8× fewer parameters with only a ~2% IOU drop. To further reduce power consumption on embedded devices while maintaining performance, a motion-adaptive inference method is introduced into the proposed Fast YOLO framework to reduce the frequency of deep inference with O-YOLOv3 based on temporal motion characteristics.
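The motion-adaptive inference idea can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes that motion is measured as the mean absolute pixel difference against the last frame on which the detector ran, and the names `mean_abs_difference`, `motion_adaptive_detect`, and `motion_threshold` are hypothetical. The detector is passed in as a plain callable, so any model could stand in for it.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Frame = Sequence[Sequence[float]]          # grayscale image as rows of pixel intensities
Box = Tuple[float, float, float, float]    # (x_min, y_min, x_max, y_max)


def mean_abs_difference(a: Frame, b: Frame) -> float:
    """Average absolute per-pixel difference between two same-sized frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def motion_adaptive_detect(
    frames: Sequence[Frame],
    detector: Callable[[Frame], List[Box]],
    motion_threshold: float = 5.0,  # assumed value; would need tuning per camera
) -> Tuple[List[List[Box]], int]:
    """Run `detector` only on frames that differ enough from the reference frame.

    Returns the per-frame detections and the number of detector calls made.
    """
    results: List[List[Box]] = []
    reference: Optional[Frame] = None  # last frame on which deep inference ran
    cached: List[Box] = []
    calls = 0
    for frame in frames:
        if reference is None or mean_abs_difference(frame, reference) > motion_threshold:
            cached = detector(frame)   # deep inference, the expensive step
            reference = frame
            calls += 1
        results.append(list(cached))   # otherwise reuse the previous detections
    return results, calls
```

With a mostly static video, the detector is called on only a fraction of the frames, which is where the savings in computation and power would come from. The threshold trades responsiveness against the number of skipped inferences.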
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Susmitha, A.V.V. (2020). Smart Recognition System for Business Predictions (You Only Look Once – V3) Unified, Real-Time Object Detection. In: Kanagachidambaresan, G., Anand, R., Balasubramanian, E., Mahima, V. (eds) Internet of Things for Industry 4.0. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-32530-5_9
DOI: https://doi.org/10.1007/978-3-030-32530-5_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32529-9
Online ISBN: 978-3-030-32530-5
eBook Packages: Engineering, Engineering (R0)