Benchmarking In-the-Wild Multimodal Disease Recognition and A Versatile Baseline
Date
2024
Authors
Wei, T.
Chen, Z.
Huang, Z.
Yu, X.
Type
Conference paper
Citation
Proceedings of the 32nd ACM International Conference on Multimedia, 2024, pp. 1593-1601
Statement of Responsibility
Tianqi Wei, Zhi Chen, Zi Huang and Xin Yu
Conference Name
ACM International Conference on Multimedia (MM) (28 Oct 2024 - 1 Nov 2024 : Melbourne, Australia)
Abstract
Existing plant disease classification models have achieved remarkable performance in recognizing in-laboratory diseased images. However, their performance often degrades significantly when classifying in-the-wild images. Furthermore, we observe that in-the-wild plant images may exhibit similar appearances across different diseases (i.e., small inter-class discrepancy), while images of the same disease may look quite different (i.e., large intra-class variance). Motivated by this observation, we propose an in-the-wild multimodal plant disease recognition dataset that not only contains the largest number of disease classes but also provides text-based descriptions for each disease. In particular, the text descriptions supply rich information in the textual modality and facilitate in-the-wild disease classification in the presence of small inter-class discrepancy and large intra-class variance. Our proposed dataset can therefore be regarded as an ideal testbed for evaluating disease recognition methods in the real world. In addition, we present a strong yet versatile baseline that models text descriptions and visual data through multiple prototypes for a given class. By fusing the contributions of multimodal prototypes in classification, our baseline can effectively address the small inter-class discrepancy and large intra-class variance issues. Remarkably, our baseline model can not only classify diseases in the standard setting but also recognize diseases in few-shot or training-free scenarios. Extensive benchmarking results demonstrate that our proposed in-the-wild multimodal dataset poses many new challenges for the plant disease recognition task and leaves substantial room for improvement in future work.
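The abstract describes the baseline only at a high level, so the following is a minimal illustrative sketch of multi-prototype, multimodal classification, not the authors' implementation. It assumes that image and text embeddings already live in a shared vector space, that each class keeps several visual prototypes and several text prototypes, that a class is scored by its best-matching prototype in each modality, and that the two modality scores are fused with a weight `alpha`. The function names, the max-over-prototypes rule, the `alpha` parameter, and the toy vectors and labels are all hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def class_score(query, visual_protos, text_protos, alpha=0.5):
    """Fuse the best visual match and the best text match for one class.

    Assumption: taking the maximum over several prototypes lets a class
    cover visually different appearances of the same disease.
    """
    best_visual = max(cosine(query, p) for p in visual_protos)
    best_text = max(cosine(query, p) for p in text_protos)
    return alpha * best_visual + (1.0 - alpha) * best_text


def classify(query, prototypes, alpha=0.5):
    """Return the highest-scoring label and the per-class fused scores."""
    scores = {
        label: class_score(query, p["visual"], p["text"], alpha)
        for label, p in prototypes.items()
    }
    return max(scores, key=scores.get), scores


# Toy 3-dimensional prototypes with hypothetical labels and values.
PROTOTYPES = {
    "disease_a": {
        "visual": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # two appearances
        "text": [[1.0, 1.0, 0.0]],
    },
    "disease_b": {
        "visual": [[0.0, 0.0, 1.0]],
        "text": [[0.0, 0.0, 1.0]],
    },
}

label, scores = classify([0.0, 1.0, 0.1], PROTOTYPES)
```

In this sketch, lowering `alpha` shifts the decision toward the text prototypes, which is one way to picture a setting with few or no labeled images; how the paper actually builds and fuses its prototypes is not stated in the abstract.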
Rights
© 2024 ACM.