Aligned Dual Channel Graph Convolutional Network for Visual Question Answering

Qingbao Huang; Jielong Wei; Yi Cai; Changmeng Zheng; Junying Chen; Ho-fung Leung; Qing Li

doi:10.18653/v1/2020.acl-main.642

Abstract

Visual question answering aims to answer a natural language question about a given image. Existing graph-based methods focus only on the relations between objects in an image and neglect the syntactic dependency relations between words in a question. To capture both kinds of relations simultaneously, we propose a novel dual channel graph convolutional network (DC-GCN) to better combine visual and textual advantages. The DC-GCN model consists of three parts: an I-GCN module to capture the relations between objects in an image, a Q-GCN module to capture the syntactic dependency relations between words in a question, and an attention alignment module to align image representations and question representations. Experimental results show that our model achieves performance comparable to state-of-the-art approaches.
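The three-part structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the row-normalized adjacency, the identity weight matrices, the dot-product attention, and every function name and input value below are assumptions chosen only to show how two graph channels and an alignment step might fit together.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Add self-loops, then row-normalize so each node averages over itself and its neighbours."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in with_loops]

def gcn_layer(adj, feats, weight):
    """One generic graph convolution: ReLU(A_hat @ X @ W). Assumed form, not taken from the paper."""
    hidden = matmul(matmul(normalize_adjacency(adj), feats), weight)
    return [[max(0.0, v) for v in row] for row in hidden]

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def align(question_nodes, image_nodes):
    """Hypothetical alignment: each word attends over objects using dot-product scores."""
    image_t = [list(col) for col in zip(*image_nodes)]
    scores = matmul(question_nodes, image_t)
    attn = [softmax(row) for row in scores]
    return matmul(attn, image_nodes)

# Toy inputs (made up): 3 image objects in a chain, 2 question words joined by one dependency edge.
object_adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
object_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
word_adj = [[0, 1], [1, 0]]
word_feats = [[0.5, 0.5], [1.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned weights

image_repr = gcn_layer(object_adj, object_feats, identity)   # image channel (I-GCN role)
question_repr = gcn_layer(word_adj, word_feats, identity)    # question channel (Q-GCN role)
aligned = align(question_repr, image_repr)                   # one image summary per word
```

Here `aligned` holds one attention-weighted summary of the object representations per question word. A real model would learn the weight matrices and derive the two graphs from an object detector and a dependency parser.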

Anthology ID: 2020.acl-main.642
Volume: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month: July
Year: 2020
Address: Online
Editors: Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 7166–7176
URL: https://aclanthology.org/2020.acl-main.642
DOI: 10.18653/v1/2020.acl-main.642
Cite (ACL): Qingbao Huang, Jielong Wei, Yi Cai, Changmeng Zheng, Junying Chen, Ho-fung Leung, and Qing Li. 2020. Aligned Dual Channel Graph Convolutional Network for Visual Question Answering. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7166–7176, Online. Association for Computational Linguistics.
Cite (Informal): Aligned Dual Channel Graph Convolutional Network for Visual Question Answering (Huang et al., ACL 2020)
PDF: https://aclanthology.org/2020.acl-main.642.pdf
Video: http://slideslive.com/38928979
