AIML at VQA-Med 2020: Knowledge inference via a skeleton-based sentence mapping approach for medical domain visual question answering

dc.contributor.author: Liao, Z.
dc.contributor.author: Wu, Q.
dc.contributor.author: Shen, C.
dc.contributor.author: Van Den Hengel, A.
dc.contributor.author: Verjans, J.
dc.contributor.conference: International Conference of the CLEF Initiative (CLEF) (22 Sep 2020 - 25 Sep 2020 : virtual online)
dc.contributor.editor: Cappellato, L.
dc.contributor.editor: Eickhoff, C.
dc.contributor.editor: Ferro, N.
dc.contributor.editor: Névéol, A.
dc.date.issued: 2020
dc.description: Session - ImageCLEF: Multimedia Retrieval in Medicine, Lifelogging, and Internet.
dc.description.abstract: In this paper, we describe our contribution to the 2020 ImageCLEF Medical Domain Visual Question Answering (VQA-Med) challenge. Our submissions ranked first on the VQA challenge leaderboard and first on the associated Visual Question Generation (VQG) challenge leaderboard. Our VQA approach was developed using a knowledge inference methodology called Skeleton-based Sentence Mapping (SSM). Using all the questions and answers, we derived a set of classifiable tasks and inferred the corresponding labels. As a result, we were able to transform the VQA task into a multi-task image classification problem, which allowed us to focus on the image modelling aspect. We further propose a class-wise and task-wise normalization that facilitates the optimization of multiple tasks in a single network. This enabled us to apply a multi-scale and multi-architecture ensemble strategy for robust prediction. Lastly, we positioned the VQG task as a transfer learning problem using the models trained on the VQA task. The VQG task was also solved using classification.
dc.description.statementofresponsibility: Zhibin Liao, Qi Wu, Chunhua Shen, Anton van den Hengel, and Johan Verjans
dc.identifier.citation: CEUR Workshop Proceedings, 2020 / Cappellato, L., Eickhoff, C., Ferro, N., Névéol, A. (ed./s), vol.2696, pp.1-14
dc.identifier.issn: 1613-0073
dc.identifier.orcid: Liao, Z. [0000-0001-9965-4511]
dc.identifier.orcid: Wu, Q. [0000-0003-3631-256X]
dc.identifier.orcid: Van Den Hengel, A. [0000-0003-3027-8364]
dc.identifier.orcid: Verjans, J. [0000-0002-8336-6774]
dc.identifier.uri: https://hdl.handle.net/2440/132209
dc.language.iso: en
dc.publisher: CEUR-WS
dc.publisher.place: online
dc.relation.ispartofseries: CEUR Workshop Proceedings; 2696
dc.rights: Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
dc.source.uri: http://ceur-ws.org/Vol-2696
dc.subject: Visual Question Answering; Visual Question Generation; Knowledge Inference; Deep Neural Networks; Skeleton-based Sentence Mapping; Class-wise and Task-wise Normalization
dc.title: AIML at VQA-Med 2020: Knowledge inference via a skeleton-based sentence mapping approach for medical domain visual question answering
dc.type: Conference paper
pubs.publication-status: Published

Files

Original bundle
Name: hdl_132209.pdf
Size: 501.4 KB
Format: Adobe Portable Document Format
Description: Published version