Human Attention in Visual Question Answering: Do Humans and Deep Networks look at the same regions?

Abhishek Das, Harsh Agrawal, Larry Zitnick, Devi Parikh, Dhruv Batra


Anthology ID:
D16-1092
Volume:
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2016
Address:
Austin, Texas
Editors:
Jian Su, Kevin Duh, Xavier Carreras
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
932–937
URL:
https://aclanthology.org/D16-1092
DOI:
10.18653/v1/D16-1092
Cite (ACL):
Abhishek Das, Harsh Agrawal, Larry Zitnick, Devi Parikh, and Dhruv Batra. 2016. Human Attention in Visual Question Answering: Do Humans and Deep Networks look at the same regions? In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 932–937, Austin, Texas. Association for Computational Linguistics.
Cite (Informal):
Human Attention in Visual Question Answering: Do Humans and Deep Networks look at the same regions? (Das et al., EMNLP 2016)
PDF:
https://aclanthology.org/D16-1092.pdf
Attachment:
D16-1092.Attachment.pdf
Data
Visual Question Answering