LoST? Appearance-Invariant Place Recognition for Opposite Viewpoints using Visual Semantics

Garg, S.; Suenderhauf, N.; Milford, M.

doi:10.15607/RSS.2018.XIV.022

dc.contributor.author	Garg, S.
dc.contributor.author	Suenderhauf, N.
dc.contributor.author	Milford, M.
dc.contributor.conference	Robotics: Science and Systems XIV (RSS) (26 Jun 2018 - 30 Jun 2018 : Pittsburgh, PA, USA)
dc.contributor.editor	Kress-Gazit, H.
dc.contributor.editor	Srinivasa, S.
dc.contributor.editor	Howard, T.
dc.contributor.editor	Atanasov, N.
dc.date.issued	2018
dc.description.abstract	Human visual scene understanding is so remarkable that we are able to recognize a revisited place when entering it from the opposite direction it was first visited, even in the presence of extreme variations in appearance. This capability is especially apparent during driving: a human driver can recognize where they are when traveling in the reverse direction along a route for the first time, without having to turn back and look. The difficulty of this problem exceeds any addressed in past appearance- and viewpoint-invariant visual place recognition (VPR) research, in part because large parts of the scene are not commonly observable from opposite directions. Consequently, as shown in this paper, the precision-recall performance of current state-of-the-art viewpoint- and appearance-invariant VPR techniques is orders of magnitude below what would be usable in a closed-loop system. Current engineered solutions predominantly rely on panoramic camera or LIDAR sensing setups; an eminently suitable engineering solution but one that is clearly very different to how humans navigate, which also has implications for how naturally humans could interact and communicate with the navigation system. In this paper we develop a suite of novel semantic- and appearance-based techniques to enable for the first time high performance place recognition in this challenging scenario. We first propose a novel Local Semantic Tensor (LoST) descriptor of images using the convolutional feature maps from a state-of-the-art dense semantic segmentation network. Then, to verify the spatial semantic arrangement of the top matching candidates, we develop a novel approach for mining semantically-salient keypoint correspondences.
On publicly available benchmark datasets that involve both 180 degree viewpoint change and extreme appearance change, we show how meaningful recall at 100% precision can be achieved using our proposed system where existing systems often fail to ever reach 100% precision. We also present analysis delving into the performance differences between a current and the proposed system, and characterize unique properties of the opposite direction localization problem including the metric matching offset. The source code is available online at https://github.com/oravus/lostX.
dc.description.statementofresponsibility	Sourav Garg, Niko Suenderhauf and Michael Milford
dc.identifier.citation	Proceedings of Robotics: Science and Systems XIV (RSS 2018), 2018 / Kress-Gazit, H., Srinivasa, S., Howard, T., Atanasov, N. (ed./s), vol.2018, pp.1-10
dc.identifier.doi	10.15607/RSS.2018.XIV.022
dc.identifier.isbn	9780992374747
dc.identifier.issn	2330-765X
dc.identifier.orcid	Garg, S. [0000-0001-6068-3307]
dc.identifier.uri	https://hdl.handle.net/2440/138399
dc.language.iso	en
dc.publisher	Robotics: Science and Systems Foundation
dc.publisher.place	California, USA
dc.relation.grant	http://purl.org/au-research/grants/arc/CE140100016
dc.relation.grant	http://purl.org/au-research/grants/arc/FT140101229
dc.rights	Copyright status unknown
dc.source.uri	https://www.roboticsfoundation.org/
dc.title	LoST? Appearance-Invariant Place Recognition for Opposite Viewpoints using Visual Semantics
dc.type	Conference paper
pubs.publication-status	Published online

Collections

Australian Institute for Machine Learning publications
