Interpretable Deep Learning for Medical Imaging and Computer Vision
Files
(Library staff access only.)
Date
2024
Authors
Wang, Chong
Advisors
Carneiro, Gustavo
Frazer, Helen (St Vincent's Hospital, Melbourne)
Type
Thesis
Abstract
Deep learning networks have gained popularity in computer vision and medical imaging, excelling across application domains thanks to their exceptional capabilities in automatic feature extraction and discrimination. Despite these achievements, existing methods are often viewed as black-box models, making it hard to understand how they arrive at specific predictions from input images. This black-box problem has motivated the development of new interpretable methods for deep learning networks. One of the most successful solutions is the prototypical-part network (ProtoPNet), a deep-learning classification method that explains its own predictions by associating decisions on a test image with learnable prototypes representing object parts from training images. A particularly interesting application of ProtoPNet models is in medical image analysis tasks, although these models present some challenges. For instance, while ProtoPNet models offer good interpretability, they often fall short of the accuracy achieved by black-box networks. Also, in multi-label tasks, which are a practical scenario in medical imaging, ProtoPNet tends to fail due to the high degree of entanglement among the learned prototypes. Moreover, ProtoPNet relies on single-level prototypes that cannot fully represent complex visual class patterns of varying size and appearance, such as those shown by lesions in medical images. To address these issues and improve classification and interpretation performance on medical image analysis tasks, we propose several approaches that use knowledge distillation, reciprocal learning, disentangled prototype learning, and hierarchical prototypes.
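To make the prototype-based decision process concrete, the following is a minimal NumPy sketch of ProtoPNet-style scoring. The function name `prototype_logits` and all array shapes are illustrative assumptions, not the thesis's implementation: each prototype is compared against every spatial patch of a convolutional feature map, the best-matching patch is kept via max-pooling, and a linear layer turns the similarity scores into class logits.

```python
import numpy as np

def prototype_logits(feature_map, prototypes, class_weights):
    """Illustrative ProtoPNet-style scoring.

    feature_map:   (H, W, D) convolutional features of one image
    prototypes:    (P, D) learned prototype vectors (object parts)
    class_weights: (C, P) linear layer mapping similarities to class logits
    """
    H, W, D = feature_map.shape
    patches = feature_map.reshape(-1, D)                    # (H*W, D)
    # squared L2 distance from every spatial patch to every prototype
    d2 = ((patches[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)  # (H*W, P)
    # log-activation: large when some patch lies close to the prototype
    sim = np.log((d2 + 1.0) / (d2 + 1e-4))
    # max-pool over spatial locations: best-matching patch per prototype
    scores = sim.max(axis=0)                                # (P,)
    # linear combination of prototype similarities gives class logits
    return class_weights @ scores                           # (C,)
```

Because each logit is a weighted sum of per-prototype similarities, a prediction can be traced back to the training-image parts that each prototype represents, which is the source of the model's self-explanation.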
Another application of ProtoPNet is in computer vision tasks, where current models mostly rely on point-based prototype learning with limited representation power, often producing trivial (easy-to-learn) prototypes from the most salient object parts. Such issues cause relatively low classification accuracy in natural image recognition and difficulties in detecting out-of-distribution (OoD) inputs. We tackle these problems with our novel learning of support prototypes, combined with the trivial prototypes, to achieve enhanced and complementary interpretations. Furthermore, we propose a new generative training paradigm that learns prototype distributions, together with a novel prototype mining strategy inspired by the game-theoretic horse-racing problem of Tian Ji. These innovations enable both interpretable image classification and trustworthy recognition of OoD samples. Results on standard fine-grained classification benchmarks show the effectiveness and advantages of the proposed methods.
School/Discipline
School of Computer and Mathematical Sciences
Dissertation Note
Thesis (Ph.D.) -- University of Adelaide, School of Computer and Mathematical Sciences, 2024
Provenance
This thesis is currently under embargo and not available.