End-to-End Open-Domain Question Answering with BERTserini

Wei Yang, Yuqing Xie, Aileen Lin, Xingyu Li, Luchen Tan, Kun Xiong, Ming Li, Jimmy Lin


Abstract
We demonstrate an end-to-end question answering system that integrates BERT with the open-source Anserini information retrieval toolkit. In contrast to most question answering and reading comprehension models today, which operate over small amounts of input text, our system integrates best practices from IR with a BERT-based reader to identify answers from a large corpus of Wikipedia articles in an end-to-end fashion. We report large improvements over previous results on a standard benchmark test collection, showing that fine-tuning pretrained BERT with SQuAD is sufficient to achieve high accuracy in identifying answer spans.
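The retrieve-then-read design described in the abstract can be illustrated with a minimal Python sketch. It does not call Anserini or BERT: the term-overlap retriever, the stub reader, the toy corpus, and the interpolation weight `mu` used to combine the two scores are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter

# Toy passages standing in for Wikipedia text (hypothetical data).
CORPUS = [
    "Ottawa is the capital city of Canada.",
    "Toronto is the largest city in Canada.",
    "Paris is the capital of France.",
]


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [t.strip(".,?!").lower() for t in text.split() if t.strip(".,?!")]


def retrieve(question, corpus, k=2):
    """Rank passages by term overlap with the question.

    A stand-in for a real IR toolkit; returns the top-k (score, passage) pairs.
    """
    q = Counter(tokenize(question))
    scored = []
    for passage in corpus:
        p = Counter(tokenize(passage))
        overlap = sum(min(q[t], p[t]) for t in q)
        scored.append((overlap / max(len(q), 1), passage))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def stub_reader(question, passage):
    """Placeholder for a fine-tuned reader model.

    Returns the first passage word that does not occur in the question,
    with a dummy confidence of 1.0 (or an empty span with 0.0).
    """
    q_terms = set(tokenize(question))
    for word in passage.split():
        cleaned = word.strip(".,?!")
        if cleaned and cleaned.lower() not in q_terms:
            return cleaned, 1.0
    return "", 0.0


def answer_question(question, corpus, k=2, mu=0.5, reader=stub_reader):
    """Retrieve k passages, read each one, and return the best (span, score).

    The combined score interpolates retriever and reader scores with
    weight mu; this weighting is an assumption of the sketch.
    """
    candidates = []
    for retriever_score, passage in retrieve(question, corpus, k):
        span, reader_score = reader(question, passage)
        combined = (1 - mu) * retriever_score + mu * reader_score
        candidates.append((span, combined))
    return max(candidates, key=lambda pair: pair[1])
```

Calling `answer_question("What is the capital of Canada?", CORPUS)` on this toy corpus selects the span "Ottawa". In a real system the `reader` argument would wrap a fine-tuned model and `retrieve` would query an index over the full corpus.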
Anthology ID:
N19-4013
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Waleed Ammar, Annie Louis, Nasrin Mostafazadeh
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
72–77
URL:
https://aclanthology.org/N19-4013
DOI:
10.18653/v1/N19-4013
Cite (ACL):
Wei Yang, Yuqing Xie, Aileen Lin, Xingyu Li, Luchen Tan, Kun Xiong, Ming Li, and Jimmy Lin. 2019. End-to-End Open-Domain Question Answering with BERTserini. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations), pages 72–77, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
End-to-End Open-Domain Question Answering with BERTserini (Yang et al., NAACL 2019)
PDF:
https://aclanthology.org/N19-4013.pdf
Code:
rsvp-ai/bertserini
Data:
SQuAD