ForeSI: Success-Aware Visual Navigation Agent

dc.contributor.author: Kazemi Moghaddam, M.
dc.contributor.author: Abbasnejad, E.
dc.contributor.author: Wu, Q.
dc.contributor.author: Qinfeng Shi, J.
dc.contributor.author: Van Den Hengel, A.
dc.contributor.conference: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (4 Jan 2022 - 8 Jan 2022 : Waikoloa, Hawaii)
dc.date.issued: 2022
dc.description.abstract: In this work, we present a method to improve the efficiency and robustness of the previous model-free Reinforcement Learning (RL) algorithms for the task of object-goal visual navigation. Despite achieving state-of-the-art results, one of the major drawbacks of those approaches is the lack of a forward model that informs the agent about the potential consequences of its actions, i.e., being model-free. In this work, we augment the model-free RL with such a forward model that can predict a representation of a future state, from the beginning of a navigation episode, if the episode were to be successful. Furthermore, for efficient training, we develop an algorithm to integrate a replay buffer into the model-free RL that alternates between training the policy and the forward model. We call our agent ForeSI; ForeSI is trained to imagine a future latent state that leads to success. By explicitly imagining such a state during navigation, our agent is able to take better actions, leading to two main advantages: first, in the absence of an object detector, ForeSI presents a more robust policy, i.e., it leads to about 5% absolute improvement on the Success Rate (SR); second, when combined with an off-the-shelf object detector to help better distinguish the target object, our method leads to about 3% absolute improvement on the SR and about 2% absolute improvement on Success weighted by inverse Path Length (SPL), i.e., presents higher efficiency.
dc.description.statementofresponsibility: Mahdi Kazemi Moghaddam, Ehsan Abbasnejad, Qi Wu, Javen Qinfeng Shi and Anton Van Den Hengel
dc.identifier.citation: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2022), 2022, pp.3401-3410
dc.identifier.doi: 10.1109/WACV51458.2022.00346
dc.identifier.isbn: 9781665409155
dc.identifier.issn: 2472-6737
dc.identifier.orcid: Kazemi Moghaddam, M. [0000-0001-6544-1120]
dc.identifier.orcid: Wu, Q. [0000-0003-3631-256X]
dc.identifier.orcid: Van Den Hengel, A. [0000-0003-3027-8364]
dc.identifier.uri: https://hdl.handle.net/2440/135905
dc.language.iso: en
dc.publisher: IEEE
dc.publisher.place: Online
dc.relation.ispartofseries: IEEE Winter Conference on Applications of Computer Vision
dc.rights: ©2021 IEEE
dc.source.uri: https://ieeexplore.ieee.org/xpl/conhome/9706406/proceeding
dc.subject: Vision for Robotics Multimedia Applications; Vision and Languages; Vision Systems and Applications; Visual Reasoning; Analysis and Understanding
dc.title: ForeSI: Success-Aware Visual Navigation Agent
dc.type: Conference paper
pubs.publication-status: Published
