Name Variation in Community Question Answering Systems

Anietie Andy, Satoshi Sekine, Mugizi Rwebangira, Mark Dredze


Abstract
Community question answering systems are forums where users can ask and answer questions in various categories. Examples are Yahoo! Answers, Quora, and Stack Overflow. A common challenge with such systems is that a significant percentage of asked questions are left unanswered. In this paper, we propose an algorithm to reduce the number of unanswered questions in Yahoo! Answers by reusing the answer to the most similar past resolved question on the site. Semantically similar questions can be worded differently, which makes it difficult to find questions that share a need. For example, “Who is the best player for the Reds?” and “Who is currently the biggest star at Manchester United?” share a need but are worded differently; in addition, “Reds” and “Manchester United” both refer to the soccer team Manchester United Football Club. In this research, we focus on question categories that contain a large number of named entities and entity name variations. We show that in these categories, entity linking can be used to identify relevant past resolved questions that share a need with a given question, by disambiguating named entities and matching questions based on the disambiguated entities, the identified entities, and knowledge base information related to these entities. We evaluated our algorithm on a new dataset constructed from Yahoo! Answers. The dataset contains annotated question pairs, (Qgiven, [Qpast, Answer]). We carried out experiments on several question categories and show that an entity-based approach gives good performance when searching for similar questions in entity-rich categories.
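The entity-based matching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the alias table `ALIASES`, the functions `link_entities`, `jaccard`, and `best_past_answer`, and the use of Jaccard overlap as the similarity score are all assumptions made for illustration. A real system would use an entity linker that disambiguates in context, and would also draw on knowledge base information, which this sketch omits.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical alias table standing in for an entity linker plus knowledge base.
# It maps surface forms (name variations) to one canonical entity name.
ALIASES: Dict[str, str] = {
    "reds": "Manchester United F.C.",
    "manchester united": "Manchester United F.C.",
    "man utd": "Manchester United F.C.",
}


def link_entities(question: str) -> Set[str]:
    """Return the canonical entities whose aliases occur in the question."""
    # Lowercase, replace punctuation with spaces, and collapse whitespace.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in question.lower())
    padded = " " + " ".join(cleaned.split()) + " "
    # Match whole-word aliases only (context-free, unlike a real linker).
    return {entity for alias, entity in ALIASES.items() if f" {alias} " in padded}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two entity sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_past_answer(
    q_given: str, past: List[Tuple[str, str]]
) -> Optional[str]:
    """Return the answer of the past question sharing the most entities."""
    given_entities = link_entities(q_given)
    best_score, best_answer = 0.0, None
    for q_past, answer in past:
        score = jaccard(given_entities, link_entities(q_past))
        if score > best_score:
            best_score, best_answer = score, answer
    return best_answer


# Toy (Qpast, Answer) pairs; the answers are placeholders, not real data.
past_questions = [
    ("Who is currently the biggest star at Manchester United?", "answer A"),
    ("What is the best pizza topping?", "answer B"),
]
result = best_past_answer("Who is the best player for the Reds?", past_questions)
```

Here the two example questions from the abstract match because “Reds” and “Manchester United” resolve to the same canonical entity, even though they share no entity surface form. A question with no linked entities returns `None` rather than a low-confidence match.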
Anthology ID:
W16-3909
Volume:
Proceedings of the 2nd Workshop on Noisy User-generated Text (WNUT)
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Bo Han, Alan Ritter, Leon Derczynski, Wei Xu, Tim Baldwin
Venue:
WNUT
Publisher:
The COLING 2016 Organizing Committee
Pages:
51–60
URL:
https://aclanthology.org/W16-3909
Cite (ACL):
Anietie Andy, Satoshi Sekine, Mugizi Rwebangira, and Mark Dredze. 2016. Name Variation in Community Question Answering Systems. In Proceedings of the 2nd Workshop on Noisy User-generated Text (WNUT), pages 51–60, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Name Variation in Community Question Answering Systems (Andy et al., WNUT 2016)
PDF:
https://aclanthology.org/W16-3909.pdf