
Smart Recognition System for Business Predictions (You Only Look Once – V3) Unified, Real-Time Object Detection

Internet of Things for Industry 4.0

Abstract

Object detection is considered one of the most challenging problems in computer vision, as it combines object classification and object localization within a scene. Recently, deep neural networks (DNNs) have been shown to achieve superior object detection performance through a variety of approaches, with YOLOv2 (an improved You Only Look Once model) being among the state of the art in DNN-based object detection methods in both speed and accuracy. Although YOLOv2 can achieve real-time performance on a powerful GPU, it remains very challenging to use this approach for real-time object detection in video on embedded computing devices with limited computational power and limited memory. In this paper, we propose a new framework called Fast YOLOv3, a fast You Only Look Once framework that accelerates YOLOv2 so that it can perform object detection in video on embedded devices in real time. First, we use the evolutionary deep intelligence framework to evolve the YOLOv3 network architecture and produce an optimized architecture that has 2.8× fewer parameters with only a ~2% drop in IOU. To further reduce power consumption on embedded devices while maintaining performance, a motion-adaptive inference method is introduced into the proposed Fast YOLO framework to reduce the frequency of deep inference with O-YOLOv3 based on temporal motion characteristics.
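The two quantities the abstract relies on, IOU and motion-gated inference, can be illustrated with a minimal Python sketch. This is not the chapter's implementation: the mean absolute frame difference, the `threshold` value, the `MotionGatedDetector` class, and the `detect` callable (standing in for the deep detector) are all assumptions chosen for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Frame = Sequence[Sequence[float]]        # grayscale image as rows of pixel values


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_abs_diff(prev: Frame, curr: Frame) -> float:
    """Average absolute pixel change between two equally sized frames."""
    total, count = 0.0, 0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            total += abs(c - p)
            count += 1
    return total / count if count else 0.0


class MotionGatedDetector:
    """Hypothetical wrapper: rerun the detector only when the scene has changed."""

    def __init__(self, detect: Callable[[Frame], List[Box]], threshold: float = 5.0):
        self.detect = detect            # assumed stand-in for the deep detector
        self.threshold = threshold      # assumed motion threshold, in pixel units
        self.ref_frame: Optional[Frame] = None
        self.cached: List[Box] = []
        self.inference_calls = 0

    def __call__(self, frame: Frame) -> List[Box]:
        moved = (
            self.ref_frame is None
            or mean_abs_diff(self.ref_frame, frame) > self.threshold
        )
        if moved:
            # Enough motion since the last inference: run the detector again.
            self.cached = self.detect(frame)
            self.ref_frame = frame
            self.inference_calls += 1
        # Otherwise reuse the detections from the last inferred frame.
        return self.cached
```

Comparing each frame against the last frame that was actually inferred, rather than the immediately preceding one, keeps slow drift from accumulating unnoticed. A real system would tune the threshold per camera and would likely use a more robust motion measure than a global mean.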

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this chapter

Susmitha, A.V.V. (2020). Smart Recognition System for Business Predictions (You Only Look Once – V3) Unified, Real-Time Object Detection. In: Kanagachidambaresan, G., Anand, R., Balasubramanian, E., Mahima, V. (eds) Internet of Things for Industry 4.0. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-32530-5_9

  • DOI: https://doi.org/10.1007/978-3-030-32530-5_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32529-9

  • Online ISBN: 978-3-030-32530-5

  • eBook Packages: Engineering, Engineering (R0)
