Trick Me If You Can: Human-in-the-Loop Generation of Adversarial Examples for Question Answering

Eric Wallace, Pedro Rodriguez, Shi Feng, Ikuya Yamada, Jordan Boyd-Graber


Abstract
Adversarial evaluation stress-tests a model’s understanding of natural language. Because past approaches expose superficial patterns, the resulting adversarial examples are limited in complexity and diversity. We propose human-in-the-loop adversarial generation, where human authors are guided to break models. We aid the authors with interpretations of model predictions through an interactive user interface. We apply this generation framework to a question answering task called Quizbowl, where trivia enthusiasts craft adversarial questions. The resulting questions are validated via live human–computer matches: Although the questions appear ordinary to humans, they systematically stump neural and information retrieval models. The adversarial questions cover diverse phenomena from multi-hop reasoning to entity type distractors, exposing open challenges in robust question answering.
Anthology ID:
Q19-1029
Volume:
Transactions of the Association for Computational Linguistics, Volume 7
Year:
2019
Address:
Cambridge, MA
Editors:
Lillian Lee, Mark Johnson, Brian Roark, Ani Nenkova
Venue:
TACL
Publisher:
MIT Press
Pages:
387–401
URL:
https://aclanthology.org/Q19-1029
DOI:
10.1162/tacl_a_00279
Cite (ACL):
Eric Wallace, Pedro Rodriguez, Shi Feng, Ikuya Yamada, and Jordan Boyd-Graber. 2019. Trick Me If You Can: Human-in-the-Loop Generation of Adversarial Examples for Question Answering. Transactions of the Association for Computational Linguistics, 7:387–401.
Cite (Informal):
Trick Me If You Can: Human-in-the-Loop Generation of Adversarial Examples for Question Answering (Wallace et al., TACL 2019)
PDF:
https://aclanthology.org/Q19-1029.pdf
Code:
https://github.com/Eric-Wallace/trickme-interface
Data:
LAMBADA, ReCoRD