Please use this identifier to cite or link to this item:
Full metadata record
DC FieldValueLanguage
dc.contributor.advisorShen, Chunhua-
dc.contributor.advisorLiu, Lingqiao-
dc.contributor.authorCao, Yuanzhouhan-
dc.description.abstractDepth is an important source of information in computer vision. However, depth is usually discarded in most vision tasks. In this thesis, we study the tasks of estimating depth from single monocular images, and incorporating depth for object detection and semantic segmentation. Recently, a significant number of breakthroughs have been introduced to the vision community by deep convolutional neural networks (CNNs). All of our algorithms in this thesis are built upon deep CNNs. The first part of this thesis addresses the task of incorporating depth for object detection and semantic segmentation. The aim is to improve the performance of vision tasks that are only based on RGB data. Two approaches for object detection and two approaches for semantic segmentation are presented. These approaches are based on existing depth estimation, object detection and semantic segmentation algorithms. The second part of this thesis addresses the task of depth estimation. Depth estimation is often formulated as a regression task due to the continuous property of depths. Deep CNNs for depth estimation are trained by iteratively minimizing regression errors between predicted and ground-truth depths. A drawback of regression is that it predicts depths without confidence. In this thesis, we propose to formulate depth estimation as a classification task which naturally predicts depths with confidence. The confidence can be used during training and post-processing. We also propose to exploit ordinal depth relationships from stereo videos to improve the performance of metric depth estimation. By doing so we propose a Relative Depth in Stereo (RDIS) dataset that is densely annotated with relative depths.en
dc.subjectdepth estimationen
dc.subjectobject detectionen
dc.subjectsemantic segmentationen
dc.titleDeep learning based RGB-D vision tasksen
dc.contributor.schoolSchool of Computer Scienceen
dc.provenanceThis electronic version is made publicly available by the University of Adelaide in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. This thesis may incorporate third party material which has been used by the author pursuant to Fair Dealing exceptions. If you are the owner of any included third party copyright material you wish to be removed from this electronic version, please complete the take down form located at:
dc.description.dissertationThesis (Ph.D.) -- University of Adelaide,School of Computer Science , 2018en
Appears in Collections:Research Theses

Files in This Item:
File Description SizeFormat 
Cao2018_PhD.pdf2.85 MBAdobe PDFView/Open

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.