Sequence searching with deep-learnt depth for condition- and viewpoint-invariant route-based place recognition

Files

RA_hdl_107514.pdf (793.6 KB)
  (Restricted Access)

Date

2015

Authors

Milford, M.
Lowry, S.
Sunderhauf, N.
Shirazi, S.
Pepperell, E.
Upcroft, B.
Shen, C.
Lin, G.
Liu, F.
Cadena, C.

Type

Conference paper

Citation

IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2015, vol. 2015-October, pp. 18-25

Statement of Responsibility

Michael Milford, Stephanie Lowry, Niko Sunderhauf, Sareh Shirazi, Edward Pepperell, Ben Upcroft, Chunhua Shen, Guosheng Lin, Fayao Liu, Cesar Cadena, Ian Reid

Conference Name

Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (7 Jun 2015 - 12 Jun 2015 : Boston, MA)

Abstract

Vision-based localization on robots and vehicles remains unsolved when extreme appearance change and viewpoint change are present simultaneously. The current state-of-the-art approaches to this challenge either deal with only one of these two problems, for example FABMAP (viewpoint invariance) or SeqSLAM (appearance invariance), or use extensive training within the test environment, an impractical requirement in many application scenarios. In this paper we significantly improve the viewpoint invariance of the SeqSLAM algorithm by using state-of-the-art deep learning techniques to generate synthetic viewpoints. Our approach is different to other deep learning approaches in that it does not rely on the ability of the CNN to learn invariant features, but only on its ability to produce good enough depth images from day-time imagery. We evaluate the system on a new multi-lane day-night car dataset specifically gathered to simultaneously test both appearance and viewpoint change. Results demonstrate that the use of synthetic viewpoints improves the maximum recall achieved at 100% precision by a factor of 2.2 and maximum recall by a factor of 2.7, enabling correct place recognition across multiple road lanes and significantly reducing the time between correct localizations.

Rights

Copyright © 2015, IEEE