Abstract
Despite promising SLAM research in both the vision and robotics communities, which fundamentally sustains the autonomy of intelligent unmanned systems, visual challenges still severely threaten robust operation. Existing SLAM methods usually focus on specific challenges and address them with sophisticated enhancement or multi-modal fusion. However, lacking a quantitative understanding and awareness of challenges, they are largely confined to particular scenes, which leads to significant performance declines with poor generalization and/or redundant computation from inflexible mechanisms. To push the frontier of visual SLAM, we propose CEMS (Challenge Evaluation Module for SLAM), a fully computational and reliable evaluation module for general visual perception, built on a clear definition and systematic analysis of challenges. It decomposes various challenges into several common aspects and evaluates the degradation of each with corresponding indicators. Extensive experiments demonstrate the feasibility and superior performance of our method: the proposed module achieves a high consistency of 88.298% with annotated ground truth and a strong correlation of 0.879 with SLAM tracking performance. Moreover, we present a prototype SLAM system based on CEMS with better performance, and the first comprehensive CET (Challenge Evaluation Table) for common SLAM datasets (EuRoC, KITTI, etc.) with objective and fair evaluations of various challenges. We make it available online on our website to benefit the community.
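To make the pipeline sketched in the abstract concrete, the following is a minimal, illustrative sketch of how a CEMS-style module might score per-frame degradation. The aspect names (illumination, texture, blur), the indicator formulas, and the normalization constants are assumptions made for illustration only; they are not the paper's actual indicators, aspects, or weights.

```python
import numpy as np

# Minimal sketch of a CEMS-style per-frame challenge evaluation.
# Assumption: three illustrative degradation aspects (illumination,
# texture, blur) with hand-picked normalization constants; the paper's
# actual aspect decomposition and indicators may differ.

def illumination_score(gray: np.ndarray) -> float:
    """Penalize over-/under-exposure: 0 (benign) .. 1 (severe)."""
    mean = gray.mean() / 255.0
    return float(min(1.0, 2.0 * abs(mean - 0.5)))  # distance from mid-gray

def texture_score(gray: np.ndarray) -> float:
    """Penalize texture-less scenes via mean gradient magnitude."""
    gy, gx = np.gradient(gray.astype(np.float64))
    grad = np.hypot(gx, gy).mean()
    return float(np.clip(1.0 - grad / 20.0, 0.0, 1.0))  # 20.0: assumed scale

def blur_score(gray: np.ndarray) -> float:
    """Penalize blur via variance of a finite-difference Laplacian."""
    g = gray.astype(np.float64)
    lap = (-4.0 * g[1:-1, 1:-1] + g[:-2, 1:-1] + g[2:, 1:-1]
           + g[1:-1, :-2] + g[1:-1, 2:])
    return float(np.clip(1.0 - lap.var() / 500.0, 0.0, 1.0))  # 500.0: assumed scale

def evaluate_frame(gray: np.ndarray) -> dict:
    """Aggregate per-aspect degradation indicators for one frame."""
    scores = {
        "illumination": illumination_score(gray),
        "texture": texture_score(gray),
        "blur": blur_score(gray),
    }
    scores["overall"] = float(np.mean(list(scores.values())))  # equal weights assumed
    return scores

# Example: a flat, dark frame should score as highly challenging.
frame = np.full((480, 640), 20, dtype=np.uint8)
print(evaluate_frame(frame))
```

Per-sequence scores produced this way could then be correlated against SLAM tracking error (e.g., with the Pearson coefficient, as in the 0.879 correlation reported above) to check that the indicators actually predict degradation.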
Data Availability
All data and materials generated or analysed during this study are included in this published article and its supplementary information files.
Code Availability
The code generated during the current study will be refined and then made available on GitHub.
References
Chen, B.M.: On the trends of autonomous unmanned systems research. Engineering 12, 20–23 (2021)
Bujanca, M., Shi, X., Spear, M., Zhao, P., Lennox, B., Luján, M.: Robust slam systems: are we there yet? In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5320–5327 (2021)
Garforth, J., Webb, B.: Visual appearance analysis of forest scenes for monocular slam. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 1794–1800 (2019)
Park, S., Schöps, T., Pollefeys, M.: Illumination change robustness in direct visual slam. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 4523–4530 (2017)
CVPR 2020 SLAM Challenge. https://sites.google.com/view/vislocslamcvpr2020/slam-challenge
Liu, X., Gao, Z., Chen, B.M.: Ipmgan: integrating physical model and generative adversarial network for underwater image enhancement. Neurocomputing 453, 538–551 (2021)
Rahman, S., Li, A.Q., Rekleitis, I.: Svin2: an underwater slam system using sonar, visual, inertial, and depth sensor. In: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1861–1868 (2019)
Zhou, L., Huang, G., Mao, Y., Wang, S., Kaess, M.: Edplvo: efficient direct point-line visual odometry. In: 2022 International Conference on Robotics and Automation, pp. 7559–7565 (2022)
DeTone, D., Malisiewicz, T., Rabinovich, A.: Superpoint: self-supervised interest point detection and description. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 337–33712 (2018)
Sarlin, P.-E., DeTone, D., Malisiewicz, T., Rabinovich, A.: Superglue: learning feature matching with graph neural networks. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4937–4946 (2020)
Joo, K., Oh, T.-H., Kweon, I.S., Bazin, J.-C.: Globally optimal inlier set maximization for atlanta world understanding. IEEE Trans. Pattern Anal. Mach. Intell. 42(10), 2656–2669 (2020)
Yunus, R., Li, Y., Tombari, F.: Manhattanslam: robust planar tracking and mapping leveraging mixture of manhattan frames. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 6687–6693 (2021)
Qiu, Y., Wang, C., Wang, W., Henein, M., Scherer, S.: Airdos: dynamic slam benefits from articulated objects. In: 2022 International Conference on Robotics and Automation, pp. 8047–8053 (2022)
Tomasi, J., Wagstaff, B., Waslander, S.L., Kelly, J.: Learned camera gain and exposure control for improved visual feature detection and matching. IEEE Robotics and Automation Letters 6(2), 2028–2035 (2021)
Brunner, C., Peynot, T., Underwood, J.: Towards discrimination of challenging conditions for ugvs with visual and infrared sensors. In: ARAA Australasian Conference on Robotics and Automation, Sydney, Australia (2009)
Brunner, C., Peynot, T.: Visual metrics for the evaluation of sensor data quality in outdoor perception. In: Proceedings of the 10th Performance Metrics for Intelligent Systems Workshop, pp. 1–8 (2010)
Brunner, C., Peynot, T., Vidal-Calleja, T.: Combining multiple sensor modalities for a localisation robust to smoke. In: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 2489–2496 (2011)
Brunner, C., Peynot, T., Vidal-Calleja, T., Underwood, J.: Selective combination of visual and thermal imaging for resilient localization in adverse conditions: day and night, smoke and fire. Journal of Field Robotics 30(4), 641–666 (2013)
Brunner, C., Peynot, T.: Perception quality evaluation with visual and infrared cameras in challenging environmental conditions. In: Experimental Robotics: The 12th International Symposium on Experimental Robotics, pp. 711–725 (2014). Springer
Kim, P., Coltin, B., Alexandrov, O., Kim, H.J.: Robust visual localization in changing lighting conditions. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 5447–5452 (2017)
DARPA Subterranean (SubT) Challenge. www.darpa.mil/program/darpa-subterranean-challenge
Tranzatto, M., Miki, T., Dharmadhikari, M., Bernreiter, L., Kulkarni, M., Mascarich, F., Andersson, O., Khattak, S., Hutter, M., Siegwart, R., et al.: Cerberus in the darpa subterranean challenge. Sci. Robot. 7(66), 9742 (2022)
Carrillo, H., Reid, I., Castellanos, J.A.: On the comparison of uncertainty criteria for active slam. In: 2012 IEEE International Conference on Robotics and Automation, pp. 2080–2087 (2012)
Agha, A., Otsu, K., Morrell, B., Fan, D.D., Thakker, R., Santamaria-Navarro, A., Kim, S.-K., Bouman, A., Lei, X., Edlund, J., et al.: Nebula: quest for robotic autonomy in challenging environments; team costar at the darpa subterranean challenge. (2021). arXiv:2103.11470
Santamaria-Navarro, A., Thakker, R., Fan, D.D., Morrell, B., Agha-mohammadi, A.-a.: Towards resilient autonomous navigation of drones. In: Robotics Research: The 19th International Symposium ISRR, pp. 922–937 (2022). Springer
Kramer, A., Stahoviak, C., Santamaria-Navarro, A., Agha-Mohammadi, A.-A., Heckman, C.: Radar-inertial ego-velocity estimation for visually degraded environments. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 5739–5746 (2020). IEEE
Palieri, M., Morrell, B., Thakur, A., Ebadi, K., Nash, J., Chatterjee, A., Kanellakis, C., Carlone, L., Guaragnella, C., Agha-mohammadi, A.-a.: Locus: a multi-sensor lidar-centric solution for high-precision odometry and 3d mapping in real-time. IEEE Robotics and Automation Letters 6(2), 421–428 (2021)
Tagliabue, A., Tordesillas, J., Cai, X., Santamaria-Navarro, A., How, J.P., Carlone, L., Agha-mohammadi, A.-a.: Lion: lidar-inertial observability-aware navigator for vision-denied environments. In: Experimental Robotics: The 17th International Symposium, pp. 380–390 (2021). Springer
Ebadi, K., Chang, Y., Palieri, M., Stephens, A., Hatteland, A., Heiden, E., Thakur, A., Funabiki, N., Morrell, B., Wood, S., Carlone, L., Agha-mohammadi, A.-a.: Lamp: large-scale autonomous mapping and positioning for exploration of perceptually-degraded subterranean environments. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 80–86 (2020)
Ebadi, K., Palieri, M., Wood, S., Padgett, C., Agha-mohammadi, A.-a.: Dare-slam: degeneracy-aware and resilient loop closing in perceptually-degraded environments. Journal of Intelligent & Robotic Systems 102, 1–25 (2021)
Rouček, T., Pecka, M., Čížek, P., Petříček, T., Bayer, J., Šalanský, V., Heřt, D., Petrlík, M., Báča, T., Spurný, V., et al.: Darpa subterranean challenge: multi-robotic exploration of underground environments. In: Modelling and Simulation for Autonomous Systems: 6th International Conference, MESAS 2019, Palermo, Italy, October 29–31, 2019, Revised Selected Papers 6, pp. 274–290 (2020). Springer
Zhang, L., Zhang, L., Mou, X., Zhang, D.: Fsim: a feature similarity index for image quality assessment. IEEE Trans. Image Process. 20(8), 2378–2386 (2011)
Moorthy, A.K., Bovik, A.C.: Blind image quality assessment: from natural scene statistics to perceptual quality. IEEE Trans. Image Process. 20(12), 3350–3364 (2011)
Ma, K., Liu, W., Zhang, K., Duanmu, Z., Wang, Z., Zuo, W.: End-to-end blind image quality assessment using deep neural networks. IEEE Trans. Image Process. 27(3), 1202–1213 (2018)
Zhu, H., Li, L., Wu, J., Dong, W., Shi, G.: Metaiqa: deep meta-learning for no reference image quality assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14143–14152 (2020)
Cheon, M., Yoon, S.-J., Kang, B., Lee, J.: Perceptual image quality assessment with transformers. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 433–442 (2021)
Yang, N., Zhong, Q., Li, K., Cong, R., Zhao, Y., Kwong, S.: A reference-free underwater image quality assessment metric in frequency domain. Signal Processing: Image Communication 94, 116218 (2021)
Xiang, T., Yang, Y., Guo, S.: Blind night-time image quality assessment: subjective and objective approaches. IEEE Trans. Multimedia 22(5), 1259–1272 (2020)
Liu, W., Zhou, F., Lu, T., Duan, J., Qiu, G.: Image defogging quality assessment: real-world database and method. IEEE Trans. Image Process. 30, 176–190 (2021)
Li, X.: Blind image quality assessment. In: 2002 IEEE International Conference on Image Processing, vol. 1 (2002)
Mier, J.C., Huang, E., Talebi, H., Yang, F., Milanfar, P.: Deep perceptual image quality assessment for compression. In: 2021 IEEE International Conference on Image Processing, pp. 1484–1488 (2021)
Ma, K., Zeng, K., Wang, Z.: Perceptual quality assessment for multi-exposure image fusion. IEEE Trans. Image Process. 24(11), 3345–3356 (2015)
Dendi, S.V.R., Channappayya, S.S.: No-reference video quality assessment using natural spatiotemporal scene statistics. IEEE Trans. Image Process. 29, 5612–5624 (2020)
Zhang, J., Kaess, M., Singh, S.: On degeneracy of optimization-based state estimation problems. In: 2016 IEEE International Conference on Robotics and Automation (ICRA), pp. 809–816 (2016)
Zhang, J., Singh, S.: Enabling aggressive motion estimation at low-drift and accurate mapping in real-time. In: IEEE International Conference on Robotics and Automation, pp. 5051–5058 (2017)
Thakker, R., Alatur, N., Fan, D.D., Tordesillas, J., Paton, M., Otsu, K., Toupet, O., Agha-mohammadi, A.-a.: Autonomous off-road navigation over extreme terrains with perceptually-challenging conditions. In: Experimental Robotics: The 17th International Symposium, pp. 161–173 (2021). Springer
Szeliski, R.: Computer Vision: Algorithms and Applications. Springer, Cham (2022)
Handa, A., Whelan, T., McDonald, J., Davison, A.J.: A benchmark for rgb-d visual odometry, 3d reconstruction and slam. In: 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 1524–1531 (2014)
Cepeda-Negrete, J., Sanchez-Yanez, R.E.: Gray-world assumption on perceptual color spaces. In: Image and Video Technology: 6th Pacific-Rim Symposium, PSIVT 2013, Guanajuato, Mexico, October 28-November 1, 2013. Proceedings 6, pp. 493–504 (2014). Springer
Tranzatto, M., Mascarich, F., Bernreiter, L., Godinho, C., Camurri, M., Khattak, S., Dang, T., Reijgwart, V., Loeje, J., Wisth, D.: Cerberus: autonomous legged and aerial robotic exploration in the tunnel and urban circuits of the darpa subterranean challenge. (2022). arXiv:2201.07067
Mur-Artal, R., Montiel, J.M.M., Tardós, J.D.: Orb-slam: a versatile and accurate monocular slam system. IEEE Trans. Rob. 31(5), 1147–1163 (2015)
Gadkari, D.: Image quality analysis using glcm (2004)
ITU-R: Methodologies for the subjective assessment of the quality of television images. Recommendation ITU-R BT.500-14 (10/2019). ITU, Geneva, Switzerland (2020)
Burri, M., Nikolic, J., Gohl, P., Schneider, T., Rehder, J., Omari, S., Achtelik, M.W., Siegwart, R.: The euroc micro aerial vehicle datasets. The International Journal of Robotics Research 35(10), 1157–1163 (2016)
Sturm, J., Engelhard, N., Endres, F., Burgard, W., Cremers, D.: A benchmark for the evaluation of rgb-d slam systems. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 573–580 (2012)
Geiger, A., Lenz, P., Stiller, C., Urtasun, R.: Vision meets robotics: the kitti dataset. The International Journal of Robotics Research 32(11), 1231–1237 (2013)
Ferrera, M., Creuze, V., Moras, J., Trouvé-Peloux, P.: Aqualoc: an underwater dataset for visual–inertial–pressure localization. The International Journal of Robotics Research 38(14), 1549–1559 (2019)
Shah, S., Dey, D., Lovett, C., Kapoor, A.: Airsim: high-fidelity visual and physical simulation for autonomous vehicles. In: International Symposium on Field and Service Robotics (2017)
HoYoverse: Genshin Impact - Step Into a Vast Magical World of Adventure (2023). https://genshin.hoyoverse.com/en
Schönberger, J.L., Frahm, J.-M.: Structure-from-motion revisited. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Schönberger, J.L., Zheng, E., Pollefeys, M., Frahm, J.-M.: Pixelwise view selection for unstructured multi-view stereo. In: European Conference on Computer Vision (ECCV) (2016)
Zhao, X.: The Genshin Impact Dataset (GID) for SLAM. https://github.com/zhaoxuhui/Genshin-Impact-Dataset
Benesty, J., Chen, J., Huang, Y., Cohen, I.: Pearson correlation coefficient. In: Noise Reduction in Speech Processing, pp. 1–4. Springer (2009)
Zhang, Z., Scaramuzza, D.: A tutorial on quantitative trajectory evaluation for visual(-inertial) odometry. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 7244–7251 (2018). IEEE
Engel, J., Koltun, V., Cremers, D.: Direct sparse odometry. IEEE Trans. Pattern Anal. Mach. Intell. 40(3), 611–625 (2018)
Forster, C., Pizzoli, M., Scaramuzza, D.: Svo: fast semi-direct monocular visual odometry. In: 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 15–22 (2014)
Campos, C., Elvira, R., Rodríguez, J.J.G., Montiel, J.M.M., Tardós, J.D.: Orb-slam3: an accurate open-source library for visual, visual-inertial, and multimap slam. IEEE Trans. Rob. 37(6), 1874–1890 (2021)
Teed, Z., Deng, J.: Droid-slam: deep visual slam for monocular, stereo, and rgb-d cameras. Adv. Neural. Inf. Process. Syst. 34, 16558–16569 (2021)
Moore, D.S.: Statistics: Concepts and controversies. (1980)
Wang, W., Zhu, D., Wang, X., Hu, Y., Qiu, Y., Wang, C., Hu, Y., Kapoor, A., Scherer, S.: Tartanair: A dataset to push the limits of visual slam. In: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4909–4916 (2020)
Jiao, J., Wei, H., Hu, T., Hu, X., Zhu, Y., He, Z., Wu, J., Yu, J., Xie, X., Huang, H., Geng, R., Wang, L., Liu, M.: Fusionportable: a multi-sensor campus-scene dataset for evaluation of localization and mapping accuracy on diverse platforms. In: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3851–3856 (2022)
Houston, J., Zuidhof, G., Bergamini, L., Ye, Y., Chen, L., Jain, A., Omari, S., Iglovikov, V., Ondruska, P.: One thousand and one hours: self-driving motion prediction dataset. In: Conference on Robot Learning, pp. 409–418 (2021). PMLR
Caesar, H., Bankiti, V., Lang, A.H., Vora, S., Liong, V.E., Xu, Q., Krishnan, A., Pan, Y., Baldan, G., Beijbom, O.: Nuscenes: a multimodal dataset for autonomous driving. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11618–11628 (2020)
Acknowledgements
This research is partially supported by the Wuhan University - Huawei Geoinformatics Innovation Laboratory. Some numerical calculations in this paper were performed on the supercomputing system at the Supercomputing Center of Wuhan University.
Funding
This research is partially supported by the National Natural Science Foundation of China Major Program (Grant No. 42192580, 42192583), Hubei Province Natural Science Foundation (Grant No. 2021CFA088 and 2020CFA003), and the Science and Technology Major Project (Grant No. 2021AAA010, 2021AAA010-3).
Author information
Authors and Affiliations
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Xuhui Zhao and Zhi Gao. The first draft of the manuscript was written by Xuhui Zhao, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Conflict of Interest
The authors have no relevant financial or non-financial interests to disclose.
Ethics Approval
The authors declare that no human or animal subjects are involved in the study.
Consent to Participate
Not applicable; the study involved no human participants.
Consent for Publication
Not applicable; no patient data or photographs are included in this study.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhao, X., Gao, Z., Li, H. et al. How Challenging is a Challenge? CEMS: a Challenge Evaluation Module for SLAM Visual Perception. J Intell Robot Syst 110, 42 (2024). https://doi.org/10.1007/s10846-024-02077-4