Visual Question Answering: a tutorial

dc.contributor.author: Teney, D.
dc.contributor.author: Wu, Q.
dc.contributor.author: Van Den Hengel, A.
dc.date.issued: 2017
dc.description.abstract: The task of visual question answering (VQA) is receiving increasing interest from researchers in both the computer vision and natural language processing fields. Tremendous advances have been seen in the field of computer vision due to the success of deep learning, in particular on low- and midlevel tasks, such as image segmentation or object recognition. These advances have fueled researchers' confidence for tackling more complex tasks that combine vision with language and high-level reasoning. VQA is a prime example of this trend. This article presents the ongoing work in the field and the current approaches to VQA based on deep learning. VQA constitutes a test for deep visual understanding and a benchmark for general artificial intelligence (AI). While the field of VQA has seen recent successes, it remains a largely unsolved task.
dc.description.statementofresponsibility: Damien Teney, Qi Wu, and Anton van den Hengel
dc.identifier.citation: IEEE: Signal Processing Magazine, 2017; 34(6):63-75
dc.identifier.doi: 10.1109/MSP.2017.2739826
dc.identifier.issn: 1053-5888
dc.identifier.issn: 1558-0792
dc.identifier.orcid: Teney, D. [0000-0003-2130-6650]
dc.identifier.orcid: Wu, Q. [0000-0003-3631-256X]
dc.identifier.orcid: Van Den Hengel, A. [0000-0003-3027-8364]
dc.identifier.uri: http://hdl.handle.net/2440/116146
dc.language.iso: en
dc.publisher: IEEE
dc.rights: © 2017 IEEE
dc.source.uri: https://doi.org/10.1109/msp.2017.2739826
dc.title: Visual Question Answering: a tutorial
dc.type: Journal article
pubs.publication-status: Published

Files

Original bundle
Name: hdl_116146.pdf
Size: 4.5 MB
Format: Adobe Portable Document Format
Description: Accepted version