ForeSI: Success-Aware Visual Navigation Agent

Kazemi Moghaddam, M.; Abbasnejad, E.; Wu, Q.; Qinfeng Shi, J.; Van Den Hengel, A.

doi:10.1109/WACV51458.2022.00346

Date

2022

Authors

Kazemi Moghaddam, M.

Abbasnejad, E.

Wu, Q.

Qinfeng Shi, J.

Van Den Hengel, A.

Type

Conference paper

Citation

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2022), 2022, pp. 3401-3410

Statement of Responsibility

Mahdi Kazemi Moghaddam, Ehsan Abbasnejad, Qi Wu, Javen Qinfeng Shi and Anton Van Den Hengel

Conference Name

IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (4 Jan 2022 - 8 Jan 2022 : Waikoloa, Hawaii)

DOI

10.1109/WACV51458.2022.00346

Abstract

In this work, we present a method to improve the efficiency and robustness of previous model-free Reinforcement Learning (RL) algorithms for the task of object-goal visual navigation. Although those approaches achieve state-of-the-art results, one of their major drawbacks is that they are model-free: they lack a forward model that informs the agent about the potential consequences of its actions. We augment model-free RL with such a forward model, which can predict, from the beginning of a navigation episode, a representation of a future state that the agent would reach if the episode were to be successful. Furthermore, for efficient training, we develop an algorithm that integrates a replay buffer into the model-free RL and alternates between training the policy and the forward model. We call our agent ForeSI; ForeSI is trained to imagine a future latent state that leads to success. By explicitly imagining such a state during navigation, our agent is able to take better actions, leading to two main advantages: first, in the absence of an object detector, ForeSI presents a more robust policy, i.e., it leads to about 5% absolute improvement in Success Rate (SR); second, when combined with an off-the-shelf object detector to help better distinguish the target object, our method leads to about 3% absolute improvement in SR and about 2% absolute improvement in Success weighted by inverse Path Length (SPL), i.e., it presents higher efficiency.
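
The abstract reports results in SR and SPL without defining them. The sketch below shows how these two metrics are commonly computed in the embodied-navigation literature, assuming the standard definition of SPL (success weighted by the ratio of the shortest path length to the path length actually taken). The function names and the sample episode values are hypothetical and are not taken from the paper; whether the paper uses exactly this formulation is an assumption.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes that ended in success (SR)."""
    return sum(1 for s in successes if s) / len(successes)


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by inverse Path Length, averaged over episodes.

    Assumes the common definition: each successful episode contributes
    l / max(p, l), where l is the shortest path length to the goal and
    p is the length of the path the agent actually took. Failed
    episodes contribute 0.
    """
    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, path_lengths):
        denom = max(p, l)
        # Guard against a degenerate episode with zero-length paths.
        if s and denom > 0:
            total += l / denom
    return total / len(successes)


# Hypothetical episodes: two successes (one optimal, one twice as long
# as the shortest path) and one failure.
episode_success = [True, False, True]
episode_shortest = [2.0, 3.0, 4.0]
episode_taken = [4.0, 5.0, 4.0]

sr_value = success_rate(episode_success)
spl_value = spl(episode_success, episode_shortest, episode_taken)
```

Under this definition SPL can never exceed SR, since each successful episode contributes at most 1; an agent that succeeds often but takes long detours scores a high SR and a low SPL, which is why the abstract reads the SPL gain as higher efficiency.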

Published Version

https://ieeexplore.ieee.org/xpl/conhome/9706406/proceeding

Persistent link to this record

https://hdl.handle.net/2440/135905
