AIML at VQA-Med 2020: Knowledge inference via a skeleton-based sentence mapping approach for medical domain visual question answering

Files

hdl_132209.pdf (501.4 KB)
  (Published version)

Date

2020

Authors

Liao, Z.
Wu, Q.
Shen, C.
Van Den Hengel, A.
Verjans, J.

Editors

Cappellato, L.
Eickhoff, C.
Ferro, N.
Névéol, A.

Type

Conference paper

Citation

CEUR Workshop Proceedings, 2020 / Cappellato, L., Eickhoff, C., Ferro, N., Névéol, A. (eds.), vol. 2696, pp. 1-14

Statement of Responsibility

Zhibin Liao, Qi Wu, Chunhua Shen, Anton van den Hengel, and Johan Verjans

Conference Name

International Conference of the CLEF Initiative (CLEF) (22 Sep 2020 - 25 Sep 2020 : virtual online)

Abstract

In this paper, we describe our contribution to the 2020 ImageCLEF Medical Domain Visual Question Answering (VQA-Med) challenge. Our submissions placed first on the VQA challenge leaderboard and first on the associated Visual Question Generation (VQG) challenge leaderboard. Our VQA approach was developed using a knowledge inference methodology called Skeleton-based Sentence Mapping (SSM). Using all the questions and answers, we derived a set of classifiable tasks and inferred the corresponding labels. As a result, we were able to transform the VQA task into a multi-task image classification problem, which allowed us to focus on the image modelling aspect. We further propose a class-wise and task-wise normalization that facilitates the optimization of multiple tasks in a single network. This enabled us to apply a multi-scale and multi-architecture ensemble strategy for robust prediction. Lastly, we positioned the VQG task as a transfer learning problem using the models trained on the VQA task. The VQG task was also solved using classification.
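The abstract does not spell out how the class-wise and task-wise normalization is computed. The following is only a minimal sketch of one plausible reading, in which per-sample losses are first averaged within each (task, class) pair, then within each task, and finally across tasks, so that neither a frequent class nor a large task dominates the total. The function names, the task names, and the averaging scheme itself are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import defaultdict


def softmax_nll(logits, label):
    """Negative log-likelihood of `label` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def normalized_multitask_loss(batch):
    """Combine per-sample losses from several classification tasks.

    `batch` is a list of (task_name, logits, label) tuples.

    Assumed scheme (illustrative only, not the paper's formulation):
      1. class-wise: average losses over samples sharing a (task, class) pair;
      2. task-wise: average those class means within each task;
      3. average the task losses so every task contributes equally.
    """
    per_class = defaultdict(list)
    for task, logits, label in batch:
        per_class[(task, label)].append(softmax_nll(logits, label))

    per_task = defaultdict(list)
    for (task, _label), losses in per_class.items():
        per_task[task].append(sum(losses) / len(losses))

    task_losses = {t: sum(v) / len(v) for t, v in per_task.items()}
    total = sum(task_losses.values()) / len(task_losses)
    return total, task_losses


# Hypothetical batch: task names and logits are made up for the example.
batch = [
    ("modality", [0.0, 0.0], 0),
    ("modality", [0.0, 0.0], 0),
    ("modality", [0.0, 0.0], 0),
    ("modality", [0.0, 0.0], 1),
    ("plane", [0.0, 0.0, 0.0], 2),
]
total, task_losses = normalized_multitask_loss(batch)
```

In this toy batch the "modality" task has four samples and the "plane" task has one, yet each task contributes half of the total, which is the balancing effect the sketch is meant to illustrate.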

Description

Session - ImageCLEF: Multimedia Retrieval in Medicine, Lifelogging, and Internet.

Rights

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
