Deep Anomaly Detection in Computer Vision and Medical Imaging
Date
2022
Authors
Tian, Yu
Advisors
Carneiro, Gustavo
Singh, Rajvinder
Verjans, Johan
Type
Thesis
Abstract
Anomaly detection is a fundamental problem in computer vision and medical imaging. It aims to detect unseen abnormal data instances (i.e., not present in the training set) that deviate from the distribution of the seen normal instances (i.e., present in the training set). Deep neural networks are the dominant model behind current solutions, which have achieved great success in different application domains. Anomaly detection can be formulated as: (i) unsupervised anomaly detection (UAD), developed as a one-class classification method that uses only normal training data; (ii) few-shot anomaly detection, which uses a small amount of abnormal training data together with a large amount of normal training data; and (iii) weakly supervised video anomaly detection, which uses video-level labels without any indication of where the anomaly occurs within the video sequence. Despite the remarkable achievements of current approaches, many challenges remain to be explored to advance the field. Traditional reconstruction-based UAD methods train generative models to reconstruct normal training images, on the assumption that these models will reconstruct unseen abnormal images with larger error than normal images. However, this assumption often fails because modern generative models, such as autoencoders (AE) and generative adversarial networks (GAN), can generalise well to unseen abnormal images and yield low reconstruction errors, particularly for hard anomalies (i.e., subtle abnormal samples that look similar to normal instances). Thus, this thesis first targets the low reconstruction error that generative models produce for hard anomalies. We design several new reconstruction-based UAD methods that explicitly constrain the generative model to reconstruct only normality patterns, reducing its ability to reconstruct unseen abnormal cases and consequently improving unsupervised anomaly detection accuracy.
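As a rough illustration of the reconstruction-based assumption described above, the following minimal sketch scores samples by their squared reconstruction error. It is not one of the thesis methods: the "generative model" is replaced by a toy stand-in (projection onto the leading principal direction of the normal data), and the names fit_direction, reconstruct, anomaly_score and the NORMAL data are hypothetical and chosen only for this example.

```python
import math

# Toy 2-D "normal" training data lying close to the line y = x (made up for illustration).
NORMAL = [[0.0, 0.0], [1.0, 1.1], [2.0, 1.9], [3.0, 3.05], [4.0, 4.0]]


def fit_direction(normal, iters=50):
    """Toy stand-in for a generative model trained on normal data only:
    returns the mean and the leading principal direction (power iteration)."""
    dim = len(normal[0])
    mean = [sum(x[i] for x in normal) / len(normal) for i in range(dim)]
    centred = [[x[i] - mean[i] for i in range(dim)] for x in normal]
    v = [1.0] * dim
    for _ in range(iters):
        # Apply the (unnormalised) covariance matrix to v: sum over x of (x . v) x
        w = [0.0] * dim
        for x in centred:
            dot = sum(a * b for a, b in zip(x, v))
            for i in range(dim):
                w[i] += dot * x[i]
        norm = math.sqrt(sum(a * a for a in w))
        v = [a / norm for a in w]
    return mean, v


def reconstruct(x, mean, v):
    """Project x onto the line through `mean` with direction `v`."""
    centred = [a - m for a, m in zip(x, mean)]
    dot = sum(a * b for a, b in zip(centred, v))
    return [m + dot * b for m, b in zip(mean, v)]


def anomaly_score(x, mean, v):
    """Squared reconstruction error; larger values suggest an anomaly."""
    r = reconstruct(x, mean, v)
    return sum((a - b) ** 2 for a, b in zip(x, r))


mean, v = fit_direction(NORMAL)
for name, sample in [("normal", [2.0, 2.0]),
                     ("hard anomaly", [2.0, 2.5]),
                     ("obvious anomaly", [2.0, -2.0])]:
    print(name, round(anomaly_score(sample, mean, v), 4))
```

In this toy setting the sample that sits close to the normal data receives a much lower score than the obvious anomaly, which mirrors the difficulty with hard anomalies that the abstract describes.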
Moreover, we argue that another major issue that may reduce UAD accuracy is the inadequate feature representation obtained from pre-trained models designed to solve general classification tasks rather than UAD tasks. To address this issue, we propose new self-supervised pre-training methods designed specifically for downstream UAD tasks. When used to pre-train off-the-shelf anomaly classifiers, our self-supervised methods are shown to yield substantial improvements in anomaly detection accuracy. We also observe that the accuracy of UAD methods can be improved by leveraging a few labelled abnormal samples during training, used in addition to the normal samples, to facilitate the classification of normal and abnormal instances. This idea led us to propose a new few-shot anomaly detection method that improves anomaly detection accuracy. Furthermore, we propose a new video anomaly detection approach that relies on weak video-level annotations. One of the major challenges of weakly supervised video anomaly detection (WVAD) is how to accurately identify anomalous frames or snippets in abnormal videos during training. Our solution for WVAD involves the design of a new temporal feature learning method and a novel transformer-based multiple instance learning framework. Finally, we propose a simple and effective anomaly segmentation model that targets the pixel-wise anomaly detection task in complex urban driving scenes. This method aims to address the fundamental problem that current semantic segmentation models often misclassify unexpected road anomalies. We conduct our experiments on public anomaly detection and segmentation benchmarks, and most of the methods presented in this thesis achieve state-of-the-art (SOTA) performance on various natural image and medical image analysis datasets.
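To make the multiple instance learning setting with video-level labels more concrete, the sketch below shows a generic top-k ranking objective of the kind commonly used in weakly supervised video anomaly detection. It is only an illustration under stated assumptions, not the transformer-based framework proposed in the thesis: each video is treated as a bag of per-snippet anomaly scores, and the function names, the value of k and the margin are hypothetical.

```python
def topk_mean(scores, k):
    """Video-level score: mean of the k highest snippet scores in the bag."""
    top = sorted(scores, reverse=True)[:k]
    return sum(top) / len(top)


def mil_ranking_loss(abnormal_scores, normal_scores, k=2, margin=1.0):
    """Hinge ranking loss that pushes the top-k score of an abnormal video
    above the top-k score of a normal video by at least `margin`.
    Only video-level labels are needed: which bag is abnormal, not which snippet."""
    gap = topk_mean(abnormal_scores, k) - topk_mean(normal_scores, k)
    return max(0.0, margin - gap)


# Per-snippet scores from a hypothetical scoring network (made-up values).
abnormal_video = [0.1, 0.9, 0.8, 0.2]   # contains anomalous snippets somewhere
normal_video = [0.1, 0.2, 0.1, 0.3]     # contains no anomalous snippets

print(mil_ranking_loss(abnormal_video, normal_video))
```

The loss reaches zero once the highest-scoring snippets of the abnormal video exceed those of the normal video by the margin, so training implicitly has to decide which snippets in an abnormal video are the anomalous ones.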
School/Discipline
School of Computer Science
Dissertation Note
Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 2022
Provenance
This electronic version is made publicly available by the University of Adelaide in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. This thesis may incorporate third party material which has been used by the author pursuant to Fair Dealing exceptions. If you are the owner of any included third party copyright material you wish to be removed from this electronic version, please complete the take down form located at: http://www.adelaide.edu.au/legals