Type: Conference paper
Title: Multi-modal auto-encoders as joint estimators for robotics scene understanding
Author: Cadena, C.
Dick, A.
Reid, I.D.
Citation: Proceedings of Robotics: Science and Systems, 2016 / Hsu, D., Amato, N., Berman, S., Jacobs, S. (eds.), vol. 12, pp. 1-9
Publisher: MIT Press
Issue Date: 2016
ISBN: 978-0-9923747-2-3
ISSN: 2330-765X
Conference Name: Robotics: Science and Systems (18 Jun 2016 - 22 Jun 2016 : Michigan, USA)
Editor: Hsu, D.
Amato, N.
Berman, S.
Jacobs, S.
Statement of Responsibility: Cesar Cadena, Anthony Dick and Ian D. Reid
Abstract: We explore the capabilities of Auto-Encoders to fuse the information available from cameras and depth sensors, and to reconstruct missing data, for scene understanding tasks. In particular we consider three input modalities: RGB images; depth images; and semantic label information. We seek to generate complete scene segmentations and depth maps, given images and partial and/or noisy depth and semantic data. We formulate this objective of reconstructing one or more types of scene data using a Multi-modal stacked Auto-Encoder. We show that suitably designed Multi-modal Auto-Encoders can solve the depth estimation and the semantic segmentation problems simultaneously, in the partial or even complete absence of some of the input modalities. We demonstrate our method using the outdoor dataset KITTI that includes LIDAR and stereo cameras. Our results show that as a means to estimate depth from a single image, our method is comparable to the state-of-the-art, and can run in real time (i.e., less than 40ms per frame). But we also show that our method has a significant advantage over other methods in that it can seamlessly use additional data that may be available, such as a sparse point-cloud and/or incomplete coarse semantic labels.
Rights: Copyright status unknown
DOI: 10.15607/RSS.2016.XII.041
Appears in Collections: Aurora harvest 8
Computer Science publications

Files in This Item:
Description: Restricted Access
Size: 2.2 MB
Format: Adobe PDF
