Visual Question Answering: a tutorial

Files

hdl_116146.pdf (4.5 MB)
  (Accepted version)

Date

2017

Authors

Teney, D.
Wu, Q.
Van Den Hengel, A.

Type

Journal article

Citation

IEEE Signal Processing Magazine, 2017; 34(6):63-75

Statement of Responsibility

Damien Teney, Qi Wu, and Anton van den Hengel

Abstract

The task of visual question answering (VQA) is receiving increasing interest from researchers in both the computer vision and natural language processing fields. Tremendous advances have been made in computer vision thanks to the success of deep learning, particularly on low- and mid-level tasks such as image segmentation and object recognition. These advances have fueled researchers' confidence in tackling more complex tasks that combine vision with language and high-level reasoning, and VQA is a prime example of this trend. This article presents the ongoing work in the field and the current approaches to VQA based on deep learning. VQA constitutes a test of deep visual understanding and a benchmark for general artificial intelligence (AI). While the field has seen recent successes, VQA remains a largely unsolved task.
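
As context for the deep-learning approaches the abstract refers to, the sketch below shows a minimal joint-embedding VQA baseline of the kind common in this period: a question encoder (word embeddings plus an LSTM), precomputed CNN image features, element-wise fusion, and a classifier over a fixed answer vocabulary. This is an illustrative assumption about the general model family, not the architecture proposed in the article; all module names and dimensions are invented for the example.

# Minimal joint-embedding VQA sketch (illustrative only, not the article's model).
import torch
import torch.nn as nn

class JointEmbeddingVQA(nn.Module):
    def __init__(self, vocab_size, num_answers, embed_dim=300,
                 hidden_dim=512, image_dim=2048):
        super().__init__()
        # Question encoder: word embeddings followed by an LSTM.
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Project precomputed CNN image features into the same space.
        self.image_proj = nn.Linear(image_dim, hidden_dim)
        # Classifier over a fixed vocabulary of candidate answers.
        self.classifier = nn.Linear(hidden_dim, num_answers)

    def forward(self, image_features, question_tokens):
        # image_features: (batch, image_dim) features from a pretrained CNN
        # question_tokens: (batch, seq_len) word indices
        _, (h, _) = self.lstm(self.word_embed(question_tokens))
        q = h[-1]                         # final LSTM hidden state
        v = torch.relu(self.image_proj(image_features))
        fused = q * v                     # element-wise fusion of the two vectors
        return self.classifier(fused)     # scores over candidate answers

# Example usage with random inputs (shapes only):
model = JointEmbeddingVQA(vocab_size=10000, num_answers=3000)
logits = model(torch.randn(4, 2048), torch.randint(0, 10000, (4, 14)))
print(logits.shape)  # torch.Size([4, 3000])

The element-wise product is only one of several fusion operators used in practice; concatenation and bilinear pooling are common alternatives in the VQA literature.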

Rights

© 2017 IEEE
