Deep Learning for Robotic Scene Understanding

Sun, Libo

Files

Sun2022_PhD.pdf (28.64 MB)

(Thesis)

Date

2022

Authors

Sun, Libo

Advisors

Shen, Chunhua
Liu, Yifan
Pang, Guansong

Type:

Thesis

Abstract

Scene understanding is a complex yet essential task for intelligent robots, and achieving it reliably remains a challenging problem. Deep learning has been applied successfully across many areas and has produced numerous breakthroughs. In this thesis, we investigate how deep learning-based methods can significantly improve the scene understanding ability of robots. Specifically, our work involves four fundamental robotic scene understanding subtasks: road detection, semantic segmentation, depth estimation, and visual odometry (VO). We present the details of how the proposed deep learning-based approaches improve performance on these subtasks. To begin with, because drivable area detection is critically important for autonomous driving and robotics, we propose a road detection method that reduces device reliance while maintaining performance. Unlike previous road detection methods that rely on LiDAR, our method obtains state-of-the-art performance with RGB images only. In our framework, we generate pseudo-LiDAR from monocular depth estimation and propose a feature fusion network to fuse RGB and pseudo-LiDAR information. To optimize the network architecture and improve its efficiency, we propose a method to search for the information propagation paths. Additionally, we design a modality distillation strategy that significantly reduces network parameters and inference time. Furthermore, because autonomous vehicles and robots are commonly equipped with stereo cameras that capture binocular images, we propose a stereo vision-based semantic segmentation framework that enables current monocular architectures to exploit stereo image data to improve semantic segmentation performance. The improvements are obtained via two approaches: label generation and pre-training, and stereo vision-based information fusion. Comprehensive experiments with several well-known semantic segmentation architectures on different datasets demonstrate the efficacy of our method. Finally, to obtain better 3D scene understanding, we propose a framework that exploits monocular depth estimation to improve monocular VO. The core of this framework is a monocular depth estimation module with strong generalization capability for diverse scenes. The module has two separate working modes that assist localization and mapping. Given a single monocular image as input, it predicts a relative depth that helps the localization module improve its accuracy. Given a sparse depth map and an RGB image as input, it generates accurate, scale-consistent depth for dense mapping. Compared with current learning-based VO methods, our method demonstrates stronger generalization to diverse scenes. More significantly, our framework boosts the performance of existing geometry-based VO methods by a large margin.

School/Discipline

School of Computer Science

Dissertation Note

Thesis (Ph.D.) -- University of Adelaide, School of Computer and Mathematical Sciences, 2022

Provenance

This electronic version is made publicly available by the University of Adelaide in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. This thesis may incorporate third party material which has been used by the author pursuant to Fair Dealing exceptions. If you are the owner of any included third party copyright material you wish to be removed from this electronic version, please complete the take down form located at: http://www.adelaide.edu.au/legals.

Persistent link to this record

https://hdl.handle.net/2440/138259
