A Smart System to Generate and Validate Question Answer Pairs for COVID-19 Literature

Rohan Bhambhoria, Luna Feng, Dawn Sepehr, John Chen, Conner Cowling, Sedef Kocak, Elham Dolatabadi


Abstract
Automatically generating question-answer (QA) pairs from the rapidly growing coronavirus-related literature is of great value to the medical community. High-quality QA pairs would allow researchers to build models that answer scientific queries whose answers are not readily available, in support of the ongoing fight against the pandemic. QA pair generation is, however, a tedious and time-consuming task that requires domain expertise for annotation and evaluation. In this paper, we present our contribution to addressing some of the challenges of building a QA system without gold data. We first present a method to create QA pairs from a large semi-structured dataset using transformer and rule-based models. Next, we propose a means of engaging subject matter experts (SMEs) to annotate the QA pairs through a web application. Finally, we report experiments showing the effectiveness of active learning in designing a high-performing model with substantially lower annotation effort from the domain experts.
Anthology ID:
2020.sdp-1.4
Volume:
Proceedings of the First Workshop on Scholarly Document Processing
Month:
November
Year:
2020
Address:
Online
Editors:
Muthu Kumar Chandrasekaran, Anita de Waard, Guy Feigenblat, Dayne Freitag, Tirthankar Ghosal, Eduard Hovy, Petr Knoth, David Konopnicki, Philipp Mayr, Robert M. Patton, Michal Shmueli-Scheuer
Venue:
sdp
Publisher:
Association for Computational Linguistics
Pages:
20–30
URL:
https://aclanthology.org/2020.sdp-1.4
DOI:
10.18653/v1/2020.sdp-1.4
Cite (ACL):
Rohan Bhambhoria, Luna Feng, Dawn Sepehr, John Chen, Conner Cowling, Sedef Kocak, and Elham Dolatabadi. 2020. A Smart System to Generate and Validate Question Answer Pairs for COVID-19 Literature. In Proceedings of the First Workshop on Scholarly Document Processing, pages 20–30, Online. Association for Computational Linguistics.
Cite (Informal):
A Smart System to Generate and Validate Question Answer Pairs for COVID-19 Literature (Bhambhoria et al., sdp 2020)
PDF:
https://aclanthology.org/2020.sdp-1.4.pdf
Video:
https://slideslive.com/38940713
Data:
BioASQ, CORD-19, PubMedQA