STL-CQA: Structure-based Transformers with Localization and Encoding for Chart Question Answering

Hrituraj Singh, Sumit Shekhar


Abstract
Chart Question Answering (CQA) is the task of answering natural language questions about visualisations in a chart image. Recent solutions, inspired by visual question answering (VQA) approaches, rely on image-based attention for question answering while ignoring the inherent chart structure. We propose STL-CQA, which improves question answering by sequentially applying element localization, question encoding, and then a structural transformer-based learning approach. We conduct extensive experiments and propose pre-training tasks, a methodology, and an improved dataset with more complex and balanced questions of different types. The proposed methodology shows a significant accuracy improvement over state-of-the-art approaches on various chart Q/A datasets, and even outperforms the human baseline on the DVQA dataset. We also demonstrate interpretability by examining different components of the inference pipeline.
Anthology ID:
2020.emnlp-main.264
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Editors:
Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
3275–3284
URL:
https://aclanthology.org/2020.emnlp-main.264
DOI:
10.18653/v1/2020.emnlp-main.264
Cite (ACL):
Hrituraj Singh and Sumit Shekhar. 2020. STL-CQA: Structure-based Transformers with Localization and Encoding for Chart Question Answering. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 3275–3284, Online. Association for Computational Linguistics.
Cite (Informal):
STL-CQA: Structure-based Transformers with Localization and Encoding for Chart Question Answering (Singh & Shekhar, EMNLP 2020)
PDF:
https://aclanthology.org/2020.emnlp-main.264.pdf
Optional supplementary material:
 2020.emnlp-main.264.OptionalSupplementaryMaterial.zip
Video:
 https://slideslive.com/38938832