
Monocular Visual Navigation Algorithm for Nursing Robots via Deep Learning Oriented to Dynamic Object Goal

  • Short Paper
  • Published in: Journal of Intelligent & Robotic Systems

Abstract

Robot navigation systems struggle with the relative localization of the robot and object goals in three-dimensional (3D) dynamic environments. In particular, most object detection algorithms adopted for navigation suffer from large resource consumption and low computation rates. Hence, this paper proposes a lightweight, PyTorch-based, monocular-vision 3D-aware object goal navigation system for nursing robots, which relies on a novel pose-adaptive algorithm for inverse perspective mapping (IPM) to recover 3D information of an indoor scene from a monocular image. First, the system detects objects and combines their locations with the bird's-eye view (BEV) information from the improved IPM to estimate each object's orientation, distance, and dynamic collision risk. Additionally, the 3D-aware object goal navigation network utilizes an improved spatial pyramid pooling strategy, which introduces an average-pooling branch alongside the max-pooling branch, better integrating local and global features and thus improving detection accuracy. Finally, a novel pose-adaptive algorithm for IPM, called the adaptive IPM algorithm, is proposed; it introduces a voting mechanism that adaptively compensates for the monocular camera's pose variations to further improve the accuracy of the recovered depth information. Several experiments demonstrate that the proposed navigation algorithm has lower memory consumption, is computationally efficient, and improves ranging accuracy, thus meeting the requirements for autonomous collision-free navigation.
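To make the two key components of the abstract concrete, the sketches below illustrate them in Python (the paper states the system is PyTorch-based). Neither is the authors' released code; all layer sizes, kernel sizes, and function and variable names are illustrative assumptions.

The first sketch shows a dual-branch spatial pyramid pooling module in the spirit described above: the standard max-pooling pyramid is complemented by an average-pooling pyramid, so sharp local responses and smoother global context are concatenated before a 1x1 projection.

```python
import torch
import torch.nn as nn

class DualBranchSPP(nn.Module):
    """Illustrative sketch of an SPP block with both max- and average-pooling
    branches (a plausible reading of the abstract, not the paper's exact
    configuration)."""

    def __init__(self, in_ch: int, out_ch: int, pool_sizes=(5, 9, 13)):
        super().__init__()
        mid = in_ch // 2
        self.reduce = nn.Conv2d(in_ch, mid, kernel_size=1)
        # Stride-1 pooling with 'same' padding keeps the spatial size fixed.
        self.max_pools = nn.ModuleList(
            nn.MaxPool2d(k, stride=1, padding=k // 2) for k in pool_sizes)
        self.avg_pools = nn.ModuleList(
            nn.AvgPool2d(k, stride=1, padding=k // 2) for k in pool_sizes)
        self.project = nn.Conv2d(mid * (1 + 2 * len(pool_sizes)), out_ch,
                                 kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.reduce(x)
        feats = [x]
        feats += [p(x) for p in self.max_pools]  # local, peak-like responses
        feats += [p(x) for p in self.avg_pools]  # smoother global context
        return self.project(torch.cat(feats, dim=1))
```

The second sketch shows the geometric core of flat-ground IPM: back-projecting the pixel at an object's ground-contact point onto the ground plane to obtain its lateral offset and forward distance. The paper's adaptive IPM additionally estimates the camera pitch online via a voting mechanism; here the pitch is simply passed in as an argument.

```python
import numpy as np

def ipm_ground_point(u, v, K, pitch, cam_height):
    """Map pixel (u, v) of a ground-contact point to ground-plane coordinates.

    Assumes a flat ground plane, a camera at height cam_height (metres), and
    pitch (rad) measured as downward tilt from horizontal. This is a generic
    flat-world IPM sketch, not the paper's exact formulation.
    """
    # Viewing ray in camera coordinates (z forward, y down) for pixel (u, v).
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    # Undo the downward pitch so the y-axis points straight down to the ground.
    c, s = np.cos(pitch), np.sin(pitch)
    R = np.array([[1.0, 0.0, 0.0],
                  [0.0,   c,   s],
                  [0.0,  -s,   c]])
    ray_w = R @ ray
    if ray_w[1] <= 0:
        raise ValueError("pixel is above the horizon; no ground intersection")
    # Intersect the ray with the ground plane y = cam_height (camera at origin).
    t = cam_height / ray_w[1]
    ground = t * ray_w
    return ground[0], ground[2]  # lateral offset and forward distance (metres)
```

With per-frame pitch estimates fed into such a mapping, detected bounding boxes can be projected into a bird's-eye view and used for the distance and collision-risk estimates described above.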


Data Availability

The data presented and the datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.


Acknowledgements

This research is supported by the National Natural Science Foundation of China (62003222), the Liaoning Provincial Department of Education Service Local Project (JYTMS20231207), and the Ministry of Education Spring Program (HZKY0415).

Author information


Contributions

The authors worked together to complete this research. All authors contributed to the study's conception and design. Guoqiang Fu and Yina Wang performed the research, analyzed the data, and were involved in writing the manuscript. Junyou Yang and Shuoyu Wang collected, analysed, and interpreted the data from different research articles to formulate a summary and to create a roadmap for future researchers. Material preparation was performed by Guang Yang.

Corresponding author

Correspondence to Yina Wang.

Ethics declarations

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Consent to participate

Not applicable.

Consent for publication

We, the authors, give consent for the publication of identifiable details, which can include photographs or details within the text (“Material”), in the above Journal, and we understand that anyone can therefore read the Material published in the Journal.

Conflicts of Interest

The authors declare no conflict of interest.

Institutional Review Board Statement

The study in the paper did not involve humans or animals.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Fu, G., Wang, Y., Yang, J. et al. Monocular Visual Navigation Algorithm for Nursing Robots via Deep Learning Oriented to Dynamic Object Goal. J Intell Robot Syst 110, 6 (2024). https://doi.org/10.1007/s10846-023-02024-9


Keywords

Navigation