Probing Neural Network Comprehension of Natural Language Arguments

Timothy Niven, Hung-Yu Kao


Abstract
We are surprised to find that BERT’s peak performance of 77% on the Argument Reasoning Comprehension Task reaches just three points below the average untrained human baseline. However, we show that this result is entirely accounted for by exploitation of spurious statistical cues in the dataset. We analyze the nature of these cues and demonstrate that a range of models all exploit them. This analysis informs the construction of an adversarial dataset on which all models achieve random accuracy. Our adversarial dataset provides a more robust assessment of argument comprehension and should be adopted as the standard in future work.
Anthology ID:
P19-1459
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
4658–4664
URL:
https://aclanthology.org/P19-1459
DOI:
10.18653/v1/P19-1459
Cite (ACL):
Timothy Niven and Hung-Yu Kao. 2019. Probing Neural Network Comprehension of Natural Language Arguments. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4658–4664, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Probing Neural Network Comprehension of Natural Language Arguments (Niven & Kao, ACL 2019)
PDF:
https://aclanthology.org/P19-1459.pdf
Code:
IKMLab/arct2