Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments

Date

2024

Authors

Hong, H.
Wang, S.
Huang, Z.
Wu, Q.
Liu, J.

Type

Conference paper

Citation

Proceedings of the 32nd ACM International Conference on Multimedia (MM '24), 2024, pp. 7639-7648

Statement of Responsibility

Haodong Hong, Sen Wang, Zi Huang, Qi Wu, Jiajun Liu

Conference Name

32nd ACM International Conference on Multimedia (MM) (28 Oct 2024 - 1 Nov 2024 : Melbourne, VIC, Australia)

Abstract

Real-world navigation often involves dealing with unexpected obstructions such as closed doors, moved objects, and unpredictable entities. However, mainstream Vision-and-Language Navigation (VLN) tasks typically assume that instructions perfectly align with fixed, predefined navigation graphs free of obstructions. This assumption overlooks potential discrepancies between actual navigation graphs and given instructions, which can cause major failures for both indoor and outdoor agents. To address this issue, we integrate diverse obstructions into the R2R dataset by modifying both the navigation graphs and visual observations, introducing an innovative dataset and task, R2R with UNexpected Obstructions (R2R-UNO). R2R-UNO contains various types and numbers of path obstructions to generate instruction-reality mismatches for VLN research. Experiments on R2R-UNO reveal that state-of-the-art VLN methods inevitably encounter significant challenges when facing such mismatches, indicating that they rigidly follow instructions rather than navigate adaptively. Therefore, we propose a novel method called ObVLN (Obstructed VLN), which includes a curriculum training strategy and virtual graph construction to help agents effectively adapt to obstructed environments. Empirical results show that ObVLN not only maintains robust performance in unobstructed scenarios but also achieves a substantial performance advantage with unexpected obstructions. The source code is available at https://github.com/honghd16/ObstructedVLN.

Rights

© 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
