Dynamic Scene Understanding with Applications to Traffic Monitoring

Hu, Qichang

Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/119678

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Shen, Chunhua	-
dc.contributor.author	Hu, Qichang	-
dc.date.issued	2017	-
dc.identifier.uri	http://hdl.handle.net/2440/119678	-
dc.description.abstract	Many breakthroughs have been witnessed in the computer vision community in recent years, largely due to deep Convolutional Neural Networks (CNNs) and large-scale datasets. This thesis investigates dynamic scene understanding from images. The problem of dynamic scene understanding involves simultaneously solving several sub-tasks, including object detection, object recognition, and segmentation. Successfully completing these tasks will enable us to interpret the objects of interest within a scene. Vision-based traffic monitoring is one of many fast-emerging areas in intelligent transportation systems (ITS). In this thesis, we focus on the following problems in traffic scene understanding: 1) How to detect and recognize all the objects of interest in street view images? 2) How to employ CNN features and semantic pixel labelling to boost the performance of pedestrian detection? 3) How to enhance the discriminative power of CNN representations to improve fine-grained car recognition? 4) How to learn an adaptive color space to represent vehicle images for vehicle color recognition? For the first task, we propose a single learning-based detection framework to detect three important classes of objects (traffic signs, cars, and cyclists). The proposed framework consists of a dense feature extractor and detectors for these three classes. The advantage of using one common framework is that detection is much faster, since all dense features need to be evaluated only once and are then shared by all detectors. The proposed framework introduces spatially pooled features as a part of aggregated channel features to enhance robustness to noise and image deformations. We also propose an object subcategorization scheme as a means of capturing the intra-class variation of objects. 
To address the second problem, we show that by re-using the convolutional feature maps (CFMs) of a deep CNN model as visual features to train an ensemble of boosted decision forests, we are able to remarkably improve the performance of pedestrian detection without using specially designed learning algorithms. We also show that semantic pixel labelling can be simply combined with a pedestrian detector to further boost the detection performance. Fine-grained details of objects usually contain highly discriminative information that is crucial for fine-grained object recognition. Conventional pooling strategies (e.g. max-pooling, average-pooling) may discard these fine-grained details and hurt the recognition performance. To remedy this problem, we propose a spatially weighted pooling (swp) strategy which considerably improves the discriminative power of CNN representations. The swp pools the CNN features under the guidance of its learnt masks, which measure the importance of the spatial units in terms of discriminative power. In image color recognition, visual features are extracted from image pixels represented in one color space. The choice of color space may influence the quality of the extracted features and thus the recognition performance. We propose a color transformation method that converts image pixels from the RGB space to a learnt space to improve the recognition performance. Moreover, we propose ColorNet, which optimizes the architecture of AlexNet and embeds a mini-CNN for color transformation, for vehicle color recognition.	en
dc.language.iso	en	en
dc.subject	Traffic scene perception	en
dc.subject	Object subcategorization	en
dc.subject	Traffic sign detection	en
dc.subject	Car detection	en
dc.subject	Cyclist detection	en
dc.subject	Pedestrian detection	en
dc.subject	Fine-grained recognition	en
dc.subject	Car model classification	en
dc.subject	Vehicle color recognition	en
dc.subject	Deep learning	en
dc.title	Dynamic Scene Understanding with Applications to Traffic Monitoring	en
dc.type	Thesis	en
dc.contributor.school	School of Computer Science	en
dc.provenance	This electronic version is made publicly available by the University of Adelaide in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. This thesis may incorporate third party material which has been used by the author pursuant to Fair Dealing exceptions. If you are the owner of any included third party copyright material you wish to be removed from this electronic version, please complete the take down form located at: http://www.adelaide.edu.au/legals	en
dc.description.dissertation	Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 2017	en
Appears in Collections:	Research Theses

Files in This Item:

File	Description	Size	Format
Hu2017_PhD.pdf		9.82 MB	Adobe PDF

Adelaide Research & Scholarship