DOI: 10.1145/3595916.3626456
research-article

Monocular 3D Pose Estimation of Very Small Airplane in the Air

Published: 01 January 2024

ABSTRACT

This paper proposes a novel pose estimation algorithm designed specifically for maneuvering airplanes in the air. The algorithm consists of two main stages. The first stage performs semantic segmentation of a monocular input image of a flying airplane; because the airplane is typically small in the image, the entire segmented area serves as its feature points. The second stage estimates the 3D pose from the segmented image using projective registration. Since airplanes have unique characteristics and airplane-specific datasets are scarce, a custom dataset is generated for the experiments using Unreal Engine 4, a 3D computer graphics game engine renowned for its realistic simulations. Experimental results demonstrate the suitability of the algorithm for 3D pose estimation of airplanes, providing valuable information for studying autonomous control of airplanes.
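The abstract does not say how the projective registration stage is implemented. As a rough illustration only, the sketch below assumes one common reading of the idea: project a known 3D model under candidate poses and keep the pose whose projected silhouette best overlaps the segmentation mask, scored by intersection over union. The pinhole camera, the toy point-cloud model, the single-angle grid search, and every function name here are assumptions made for illustration, not the authors' method.

```python
import numpy as np


def rotation_zyx(yaw, pitch, roll):
    """Rotation matrix built as Rz(yaw) @ Ry(pitch) @ Rx(roll), angles in radians."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    return rz @ ry @ rx


def project_mask(points, rotation, translation, focal, size):
    """Project 3D model points with a pinhole camera into a size x size boolean mask."""
    cam = points @ rotation.T + translation
    cam = cam[cam[:, 2] > 0]  # keep only points in front of the camera
    u = np.round(focal * cam[:, 0] / cam[:, 2] + size / 2).astype(int)
    v = np.round(focal * cam[:, 1] / cam[:, 2] + size / 2).astype(int)
    keep = (u >= 0) & (u < size) & (v >= 0) & (v < size)
    mask = np.zeros((size, size), dtype=bool)
    mask[v[keep], u[keep]] = True
    return mask


def iou(a, b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0


def register_yaw(target_mask, points, translation, focal, candidates):
    """Grid search over candidate yaw angles; return the best angle and its IoU."""
    size = target_mask.shape[0]
    scores = [
        iou(target_mask,
            project_mask(points, rotation_zyx(y, 0.0, 0.0), translation, focal, size))
        for y in candidates
    ]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]


def make_toy_airplane():
    """Hypothetical stand-in model: an asymmetric fuselage line plus a wing line."""
    fuselage = np.stack(
        [np.linspace(-1.0, 2.0, 120), np.zeros(120), np.zeros(120)], axis=1)
    wings = np.stack(
        [np.zeros(80), np.linspace(-1.0, 1.0, 80), np.zeros(80)], axis=1)
    return np.vstack([fuselage, wings])
```

In a full pipeline the target mask would come from the segmentation stage, and the search would cover all three rotation angles (and possibly translation), most likely with an optimizer rather than an exhaustive grid.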

Published in

MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia
December 2023
745 pages
ISBN: 9798400702051
DOI: 10.1145/3595916

Copyright © 2023 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 59 of 204 submissions, 29%
