ABSTRACT
This paper proposes a novel pose estimation algorithm for airplanes maneuvering in the air. The algorithm consists of two main stages. The first stage performs semantic segmentation of a monocular input image of a flying airplane; because the airplane typically occupies only a small region of the image, the entire segmented region serves as its feature points. The second stage estimates the 3D pose from the segmented image using projective registration. Because airplanes have unique characteristics and airplane-specific datasets are scarce, a custom dataset is generated for the experiments using Unreal Engine 4, a 3D computer graphics game engine renowned for its realistic simulations. Experimental results demonstrate that the algorithm is suitable for 3D pose estimation of airplanes and provides valuable information for studying autonomous control of airplanes.
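The second stage can be illustrated with a toy sketch. The abstract does not specify how the projective registration is carried out, so everything below is an assumption made for illustration: a pinhole camera with focal length `f`, a flat T-shaped point model standing in for the airplane, a known translation `t`, and a brute-force search over rotation about the optical axis only, scored by mask overlap (IoU). All function names are hypothetical, and a real system would search the full 6-DoF pose.

```python
import numpy as np

def rotation_z(yaw):
    """Rotation by `yaw` radians about the camera's optical (z) axis."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def project_to_mask(points, R, t, f, shape):
    """Project 3D model points with a pinhole camera and rasterize to a binary mask."""
    cam = points @ R.T + t            # model frame -> camera frame
    cam = cam[cam[:, 2] > 0]          # keep points in front of the camera
    h, w = shape
    u = np.round(f * cam[:, 0] / cam[:, 2] + w / 2).astype(int)
    v = np.round(f * cam[:, 1] / cam[:, 2] + h / 2).astype(int)
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    mask = np.zeros(shape, dtype=bool)
    mask[v[ok], u[ok]] = True
    return mask

def iou(a, b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0

def estimate_yaw(seg_mask, points, t, f, candidates):
    """Return the candidate yaw whose projected model best overlaps the segmentation mask."""
    scores = [
        iou(seg_mask, project_to_mask(points, rotation_z(y), t, f, seg_mask.shape))
        for y in candidates
    ]
    return candidates[int(np.argmax(scores))]

# Toy, assumed airplane model: a fuselage along x and a wing along y at x = 0.3.
fuselage = np.stack([np.linspace(-1, 1, 81), np.zeros(81), np.zeros(81)], axis=1)
wing = np.stack([np.full(65, 0.3), np.linspace(-0.8, 0.8, 65), np.zeros(65)], axis=1)
model = np.vstack([fuselage, wing])

t = np.array([0.0, 0.0, 10.0])        # assumed known distance to the camera
f, shape = 100.0, (64, 64)            # assumed focal length (pixels) and image size
candidates = np.deg2rad(np.arange(0, 360, 30))

# Stand-in for the stage-one segmentation output: the model rendered at 60 degrees.
true_mask = project_to_mask(model, rotation_z(np.deg2rad(60)), t, f, shape)
best = estimate_yaw(true_mask, model, t, f, candidates)
print(np.rad2deg(best))
```

The exhaustive search keeps the example short; with a small segmented region, a coarse-to-fine search or a gradient-free optimizer over all rotation angles would be a natural replacement.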
Index Terms
- Monocular 3D Pose Estimation of Very Small Airplane in the Air