Abstract
Autonomous landing in non-cooperative environments is a key step toward full autonomy for unmanned aerial vehicles (UAVs). Existing studies address this problem by searching for a flat ground area to use as the landing site; however, these methods generalize poorly and ignore the semantic features of the terrain. In this paper, we propose a carefully designed binocular-camera-LiDAR sensor system and a robust terrain understanding model that overcome these deficiencies. Our model infers both the morphological and the semantic features of the ground by performing depth completion and semantic segmentation simultaneously. Moreover, during inference it self-evaluates the accuracy of the predicted depth map and dynamically selects the LiDAR accumulation time to ensure accurate depth prediction, which greatly improves the robustness of the UAV in completely unknown environments. Extensive experiments on our collected low-altitude aerial image dataset and on real UAVs verify that our model effectively learns the two tasks simultaneously and outperforms existing depth-estimation-based landing methods. Furthermore, the UAV can robustly select a safe landing site in several complex environments with about 98% accuracy.
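The adaptive accumulation step described above can be sketched as a simple selection loop: the model rates its own predicted depth map and extends the LiDAR accumulation time only when confidence is insufficient. This is an illustrative sketch, not the paper's implementation; the function names, candidate times, and threshold are all assumptions.

```python
def select_accumulation_time(evaluate_depth_confidence,
                             candidate_times=(0.1, 0.2, 0.5, 1.0),
                             threshold=0.9):
    """Return the shortest LiDAR accumulation time (in seconds) whose
    resulting depth prediction the model itself rates as accurate enough.

    evaluate_depth_confidence: hypothetical callback mapping an
    accumulation time to the model's self-evaluated confidence in [0, 1].
    """
    for t in candidate_times:
        if evaluate_depth_confidence(t) >= threshold:
            return t
    # No candidate was confident enough: fall back to the longest
    # accumulation, which yields the densest point cloud.
    return candidate_times[-1]


# Toy self-evaluation for demonstration: confidence grows with
# accumulation time as more LiDAR points are collected.
toy_confidence = lambda t: min(1.0, 0.5 + t)
print(select_accumulation_time(toy_confidence))  # 0.5
```

Selecting the shortest sufficient time keeps the depth map responsive during descent while still falling back to longer accumulation in sparse or ambiguous terrain.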
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61903216, 62073185).
Cite this article
Chen, L., Xiao, Y., Yuan, X. et al. Robust autonomous landing of UAVs in non-cooperative environments based on comprehensive terrain understanding. Sci. China Inf. Sci. 65, 212202 (2022). https://doi.org/10.1007/s11432-021-3429-1