AIML at VQA-Med 2020: Knowledge inference via a skeleton-based sentence mapping approach for medical domain visual question answering

Liao, Z.; Wu, Q.; Shen, C.; Van Den Hengel, A.; Verjans, J.

dc.contributor.author	Liao, Z.
dc.contributor.author	Wu, Q.
dc.contributor.author	Shen, C.
dc.contributor.author	Van Den Hengel, A.
dc.contributor.author	Verjans, J.
dc.contributor.conference	International Conference of the CLEF Initiative (CLEF) (22 Sep 2020 - 25 Sep 2020 : virtual online)
dc.contributor.editor	Cappellato, L.
dc.contributor.editor	Eickhoff, C.
dc.contributor.editor	Ferro, N.
dc.contributor.editor	Névéol, A.
dc.date.issued	2020
dc.description	Session - ImageCLEF: Multimedia Retrieval in Medicine, Lifelogging, and Internet.
dc.description.abstract	In this paper, we describe our contribution to the 2020 ImageCLEF Medical Domain Visual Question Answering (VQA-Med) challenge. Our submissions placed first on the VQA challenge leaderboard and also first on the associated Visual Question Generation (VQG) challenge leaderboard. Our VQA approach was developed using a knowledge inference methodology called Skeleton-based Sentence Mapping (SSM). Using all the questions and answers, we derived a set of classifiable tasks and inferred the corresponding labels. As a result, we were able to transform the VQA task into a multi-task image classification problem, which allowed us to focus on the image modelling aspect. We further propose a class-wise and task-wise normalization that facilitates optimization of multiple tasks in a single network. This enabled us to apply a multi-scale and multi-architecture ensemble strategy for robust prediction. Lastly, we positioned the VQG task as a transfer learning problem using the models trained on the VQA task. The VQG task was also solved using classification.
dc.description.statementofresponsibility	Zhibin Liao, Qi Wu, Chunhua Shen, Anton van den Hengel, and Johan Verjans
dc.identifier.citation	CEUR Workshop Proceedings, 2020 / Cappellato, L., Eickhoff, C., Ferro, N., Névéol, A. (ed./s), vol.2696, pp.1-14
dc.identifier.issn	1613-0073
dc.identifier.orcid	Liao, Z. [0000-0001-9965-4511]
dc.identifier.orcid	Wu, Q. [0000-0003-3631-256X]
dc.identifier.orcid	Van Den Hengel, A. [0000-0003-3027-8364]
dc.identifier.orcid	Verjans, J. [0000-0002-8336-6774]
dc.identifier.uri	https://hdl.handle.net/2440/132209
dc.language.iso	en
dc.publisher	CEUR-WS
dc.publisher.place	online
dc.relation.ispartofseries	CEUR Workshop Proceedings; 2696
dc.rights	Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
dc.source.uri	http://ceur-ws.org/Vol-2696
dc.subject	Visual Question Answering; Visual Question Generation; Knowledge Inference; Deep Neural Networks; Skeleton-based Sentence Mapping; Class-wise and Task-wise Normalization
dc.title	AIML at VQA-Med 2020: Knowledge inference via a skeleton-based sentence mapping approach for medical domain visual question answering
dc.type	Conference paper
pubs.publication-status	Published

Files

Original bundle

Name: hdl_132209.pdf
Size: 501.4 KB
Format: Adobe Portable Document Format
Description: Published version

Collections

Australian Institute for Machine Learning publications