The Utility of Validation Sets for Meta-learning Methods for Noisy-Label and Imbalanced Learning Problems
Date
2023
Authors
Hoang, Dung
Advisors
Carneiro, Gustavo (Friedrich-Alexander-Universität Erlangen-Nürnberg FAU)
Type
Thesis
Abstract
In recent years, the world has witnessed remarkable progress in solving visual learning tasks, including image classification, object detection, and semantic segmentation. In large part, this success is due to the introduction of sophisticated deep learning models. Unfortunately, these models often require a massive amount of annotated training data to achieve acceptable performance. Annotating such a large amount of data is not only time-consuming and costly, but also impractical or even impossible in many scenarios. Such problems have motivated the development of more affordable solutions, e.g., crowd-sourcing the data annotation process, which is a less expensive way to collect and annotate data, but may result in training datasets that are contaminated with label noise. Unfortunately, deep neural networks, with their high capacity, can easily overfit to such training samples, leading to a deterioration in prediction performance. Moreover, the presence of noisy-labeled data can aggravate label distribution imbalances in such training sets. Consequently, the field has worked intensively on the development of methods to address the issues produced by imbalanced noisy-label datasets in the training of deep learning models. Many approaches have been proposed to handle training datasets with label noise and imbalanced label distributions. Among those, meta-learning has proven to be one of the most successful. Conventionally, a clean and balanced validation set is required to train a traditional meta-learning model. However, obtaining such a validation set can be expensive, or even impossible for certain datasets, particularly when the number of classes is of the order of 10³ or more.
Such difficulties in building a clean dataset have motivated the development of meta-learning methods that automatically select validation samples that are likely to have clean labels and a balanced class distribution. The aim is to form an “informative” validation set whose samples are not only clean and class-balanced, but also of high utility for the meta-learning algorithm. This is, however, missing from the majority of existing studies in the meta-learning literature. In addition, a common problem with these methods is that, when the level of label noise is high, most prior meta-learning methods are prone to overfitting due to their inability to select truly clean samples for the validation set. The main focus of this thesis is, therefore, the proposal of a new meta-learning method that is robust to training sets with imbalanced class distributions and noisy labels, without requiring a clean and balanced validation set. The main technical contribution is an “informativeness” measure derived from a theoretical observation about the meta-learning approach called Learning to Re-weight (L2W), which allows us to define a sample informativeness measure. Using this theoretical observation, the proposed method can automatically build a validation set of highly informative samples that are likely to have clean labels and whose class distribution is balanced. Empirical evaluation is then carried out on publicly available noisy-label benchmarks that explore all common types of label noise, such as symmetric, asymmetric, instance-dependent, closed-set, and open-set, on both synthetic and real-world datasets. The proposed method shows state-of-the-art performance on the majority of these benchmarks and outperforms all previous meta-learning approaches by a large margin.
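To make the L2W mechanism referred to above concrete, the following is a minimal sketch of its core re-weighting step. It is not the thesis's implementation: logistic regression stands in for a deep network, the learning rate and data are illustrative assumptions, and a clean validation batch is assumed to be given (whereas the thesis's contribution is to select that set automatically). After one virtual SGD step, each training example is weighted by how much up-weighting it would reduce the validation loss.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_per_example(theta, X, y):
    # Per-example gradient of the logistic loss w.r.t. theta, shape (n, d).
    p = sigmoid(X @ theta)
    return (p - y)[:, None] * X

def l2w_weights(theta, X_train, y_train, X_val, y_val, lr=0.1):
    # One L2W-style meta-step (sketch): virtual update, then weight each
    # training example by the alignment of its gradient with the
    # validation gradient at the virtually updated parameters.
    g = grad_per_example(theta, X_train, y_train)          # (n, d)
    theta_virtual = theta - lr * g.mean(axis=0)            # virtual SGD step
    g_val = grad_per_example(theta_virtual, X_val, y_val).mean(axis=0)
    # dL_val/dw_i = -lr * (g_i . g_val); keep only examples whose
    # up-weighting decreases the validation loss.
    raw = np.maximum(0.0, lr * (g @ g_val))
    total = raw.sum()
    return raw / total if total > 0 else np.full(len(raw), 1.0 / len(raw))
```

On a toy batch containing one example with a flipped label, that example's gradient tends to point against the validation gradient, so it receives a near-zero weight, which is the behavior that motivates using L2W quantities to judge how informative a sample is.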
In summary, the newly proposed meta-learning method replaces the manual data collection and annotation required to form a validation set with an automatic mechanism, while substantially boosting prediction performance on several benchmarks. Despite its affordability and effectiveness, the proposed method still has some drawbacks, especially overfitting under extremely high label noise. This weakness will be investigated as part of my future work.
School/Discipline
School of Computer and Mathematical Sciences
Dissertation Note
Thesis (MPhil) -- University of Adelaide, School of Computer and Mathematical Sciences, 2023
Provenance
This electronic version is made publicly available by the University of Adelaide in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. This thesis may incorporate third party material which has been used by the author pursuant to Fair Dealing exceptions. If you are the owner of any included third party copyright material you wish to be removed from this electronic version, please complete the take down form located at: http://www.adelaide.edu.au/legals