Real-time monocular object instance 6D pose estimation
Date
2019
Authors
Do, T.
Pham, T.
Cai, M.
Reid, I.
Type
Conference paper
Citation
Proceedings of the 29th British Machine Vision Conference (BMVC 2018), 2019, pp.1-12
Statement of Responsibility
Thanh-Toan Do, Trung Pham, Ming Cai, Ian Reid
Conference Name
British Machine Vision Conference (BMVC) (3 Sep 2018 - 6 Sep 2018 : Newcastle upon Tyne, UK)
Abstract
In this work, we present LieNet, a novel deep learning framework that simultaneously detects and segments multiple object instances and estimates their 6D poses from a single RGB image, without requiring additional post-processing. Our system is accurate and fast (∼10 fps), making it well suited for real-time applications. In particular, LieNet detects and segments object instances in the image analogously to modern instance segmentation networks such as Mask R-CNN, but contains a novel additional sub-network for 6D pose estimation. LieNet estimates the rotation matrix of an object by regressing a Lie algebra based rotation representation, and estimates the translation vector by predicting the distance of the object to the camera center. Experiments on two standard pose benchmarking datasets show that LieNet greatly outperforms other recent CNN based pose prediction methods when they are used with monocular images and without post-refinement.
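To illustrate the two ideas named in the abstract, the sketch below (not the authors' implementation) shows how a regressed so(3) vector can be mapped back to a rotation matrix via the exponential map (Rodrigues' formula), and how a translation vector can be recovered from a predicted object-to-camera distance and a 2D detection centre. The function names, the intrinsic matrix values, and the assumption that the predicted distance is depth along the optical axis are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a Lie-algebra rotation parameterisation and
# distance-based translation recovery (assumptions noted in comments).
import numpy as np

def so3_exp(omega: np.ndarray) -> np.ndarray:
    """Exponential map from an so(3) vector (axis-angle) to a 3x3 rotation matrix."""
    theta = np.linalg.norm(omega)
    if theta < 1e-8:
        return np.eye(3)
    axis = omega / theta
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    # Rodrigues' formula: R = I + sin(theta) K + (1 - cos(theta)) K^2
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def translation_from_distance(u: float, v: float, depth: float, K_cam: np.ndarray) -> np.ndarray:
    """Back-project the detected 2D centre (u, v) at the predicted distance.

    Assumption: the predicted distance is treated as depth along the optical
    axis; the paper's exact translation recovery may differ.
    """
    ray = np.linalg.inv(K_cam) @ np.array([u, v, 1.0])  # ray has z-component 1
    return depth * ray

# Example usage with a hypothetical regressed so(3) vector and intrinsics.
omega_pred = np.array([0.1, -0.3, 0.25])
R = so3_exp(omega_pred)
K_cam = np.array([[572.4, 0.0, 325.3],
                  [0.0, 573.6, 242.0],
                  [0.0, 0.0, 1.0]])
t = translation_from_distance(320.0, 240.0, depth=0.95, K_cam=K_cam)
print(R)
print(t)
```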
Rights
© 2018. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms.