Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation

Wang, Zihan; Li, Xiangyang; Yang, Jiahao; Liu, Yeqi; Hu, Junjie; Jiang, Ming; Jiang, Shuqiang

arXiv:2404.01943 (cs)

[Submitted on 2 Apr 2024]

Abstract: Vision-and-language navigation (VLN) requires an agent to navigate to a remote location in a 3D environment by following a natural language instruction. At each navigation step, the agent selects one of several candidate locations and then moves to it. For better navigation planning, the lookahead exploration strategy aims to evaluate the agent's next action by accurately anticipating the future environment at each candidate location. To this end, some existing works predict RGB images of future environments, but this strategy suffers from image distortion and high computational cost. To address these issues, we propose a pre-trained hierarchical neural radiance representation model (HNR) that produces multi-level semantic features for future environments, which are more robust and efficient than pixel-wise RGB reconstruction. Furthermore, with the predicted future environmental representations, our lookahead VLN model can construct a navigable future path tree and select the optimal path via efficient parallel evaluation. Extensive experiments on the VLN-CE datasets confirm the effectiveness of our method.
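
The path-tree selection described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Node` class, the per-node `score` field (standing in for how well a predicted future location matches the instruction), and the mean-score rule for ranking paths are all assumptions made for illustration, and the parallel evaluation is replaced here by a plain loop over every path.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A candidate location in a hypothetical future path tree."""
    name: str
    score: float  # assumed instruction-matching score, not from the paper
    children: list["Node"] = field(default_factory=list)


def enumerate_paths(root: Node) -> list[list[Node]]:
    """Return every path from below the root to a leaf (root excluded)."""
    paths = []
    stack = [(child, [child]) for child in root.children]
    while stack:
        node, path = stack.pop()
        if not node.children:
            paths.append(path)
        for child in node.children:
            stack.append((child, path + [child]))
    return paths


def select_next_action(root: Node) -> tuple[str, float]:
    """Score all paths and return the first step of the best one.

    The ranking rule (mean node score along the path) is an assumption.
    """
    paths = enumerate_paths(root)
    if not paths:
        raise ValueError("the current location has no candidate locations")
    scored = [(sum(n.score for n in p) / len(p), p) for p in paths]
    best_score, best_path = max(scored, key=lambda item: item[0])
    return best_path[0].name, best_score


# Toy tree: two candidates, one of which has two lookahead locations.
root = Node("current", 0.0, [
    Node("A", 0.5, [Node("A1", 1.0), Node("A2", 0.0)]),
    Node("B", 0.6),
])
print(select_next_action(root))
```

In this toy tree, candidate `B` has the higher immediate score, but looking one step further makes the path through `A` preferable, which is the intuition behind lookahead exploration.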

Comments:	Accepted by CVPR 2024. The code is available at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Cite as:	arXiv:2404.01943 [cs.CV]
	(or arXiv:2404.01943v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2404.01943

Submission history

From: Zihan Wang
[v1] Tue, 2 Apr 2024 13:36:03 UTC (5,737 KB)
