Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/131501
Full metadata record
dc.contributor.author: Bian, J.
dc.contributor.author: Zhan, H.
dc.contributor.author: Wang, N.
dc.contributor.author: Li, Z.
dc.contributor.author: Zhang, L.
dc.contributor.author: Shen, C.
dc.contributor.author: Cheng, M.-M.
dc.contributor.author: Reid, I.
dc.date.issued: 2021
dc.identifier.citation: International Journal of Computer Vision, 2021; 129(9):2548-2564
dc.identifier.issn: 0920-5691
dc.identifier.issn: 1573-1405
dc.identifier.uri: http://hdl.handle.net/2440/131501
dc.description.abstract: We propose SC-Depth, a monocular depth estimation method that requires only unlabelled videos for training and enables scale-consistent prediction at inference time. Our contributions include: (i) we propose a geometry consistency loss, which penalizes the inconsistency of predicted depths between adjacent views; (ii) we propose a self-discovered mask to automatically localize moving objects that violate the underlying static-scene assumption and cause noisy signals during training; (iii) we demonstrate the efficacy of each component with a detailed ablation study and show high-quality depth estimation results on both the KITTI and NYUv2 datasets. Moreover, thanks to the capability of scale-consistent prediction, our monocular-trained deep networks are readily integrated into the ORB-SLAM2 system for more robust and accurate tracking. The proposed hybrid Pseudo-RGBD SLAM shows compelling results on KITTI, and it generalizes well to the KAIST dataset without additional training. Finally, we provide several demos for qualitative evaluation. The source code is released on GitHub.
dc.description.statementofresponsibility: Jia-Wang Bian, Huangying Zhan, Naiyan Wang, Zhichao Li, Le Zhang, Chunhua Shen, Ming-Ming Cheng, Ian Reid
dc.language.iso: en
dc.publisher: Springer Nature
dc.rights: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021
dc.source.uri: http://dx.doi.org/10.1007/s11263-021-01484-6
dc.subject: Unsupervised depth estimation; scale consistency; visual SLAM; pseudo-RGBD SLAM
dc.title: Unsupervised scale-consistent depth learning from video
dc.type: Journal article
dc.identifier.doi: 10.1007/s11263-021-01484-6
dc.relation.grant: http://purl.org/au-research/grants/arc/CE140100016
dc.relation.grant: http://purl.org/au-research/grants/arc/FL130100102
pubs.publication-status: Published
dc.identifier.orcid: Bian, J. [0000-0003-2046-3363]
dc.identifier.orcid: Reid, I. [0000-0001-7790-6423]
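The geometry consistency loss and self-discovered mask summarized in the abstract can be sketched as follows. This is a simplified, hypothetical illustration over flattened per-pixel depth lists, assuming the inconsistency measure |D_a^b - D_b'| / (D_a^b + D_b'); the paper's actual formulation operates on warped and interpolated depth maps inside a differentiable training loop.

```python
def depth_inconsistency(d_warped, d_interp):
    """Per-pixel depth inconsistency between two adjacent views.

    d_warped: depths of view A projected (warped) into view B's frame.
    d_interp: view B's predicted depths sampled at the same pixels.
    Each value lies in [0, 1); 0 means perfectly scale-consistent.
    """
    return [abs(a - b) / (a + b) for a, b in zip(d_warped, d_interp)]


def geometry_consistency_loss(d_warped, d_interp):
    # Mean inconsistency over all pixels; penalizing this during
    # training encourages scale-consistent depth across views.
    diff = depth_inconsistency(d_warped, d_interp)
    return sum(diff) / len(diff)


def self_discovered_mask(d_warped, d_interp):
    # Per-pixel weight in (0, 1]: pixels with large inconsistency
    # (e.g. moving objects that violate the static-scene assumption)
    # are down-weighted in the photometric loss.
    return [1.0 - d for d in depth_inconsistency(d_warped, d_interp)]
```

With perfectly consistent depths the loss is zero and the mask keeps every pixel at full weight; a pixel whose warped and interpolated depths disagree (say 1.0 vs. 3.0) gets inconsistency 0.5 and a correspondingly reduced mask weight.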
Appears in Collections: Aurora harvest 8
Computer Science publications

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.