Named Entity Disambiguation for Noisy Text

Yotam Eshel, Noam Cohen, Kira Radinsky, Shaul Markovitch, Ikuya Yamada, Omer Levy


Abstract
We address the task of Named Entity Disambiguation (NED) for noisy text. We present WikilinksNED, a large-scale NED dataset of text fragments from the web, which is significantly noisier and more challenging than existing news-based datasets. To capture the limited and noisy local context surrounding each mention, we design a neural model and train it with a novel method for sampling informative negative examples. We also describe a new way of initializing word and entity embeddings that significantly improves performance. Our model significantly outperforms existing state-of-the-art methods on WikilinksNED while achieving comparable performance on a smaller newswire dataset.
Anthology ID:
K17-1008
Volume:
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)
Month:
August
Year:
2017
Address:
Vancouver, Canada
Editors:
Roger Levy, Lucia Specia
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Pages:
58–68
URL:
https://aclanthology.org/K17-1008
DOI:
10.18653/v1/K17-1008
Cite (ACL):
Yotam Eshel, Noam Cohen, Kira Radinsky, Shaul Markovitch, Ikuya Yamada, and Omer Levy. 2017. Named Entity Disambiguation for Noisy Text. In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), pages 58–68, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
Named Entity Disambiguation for Noisy Text (Eshel et al., CoNLL 2017)
PDF:
https://aclanthology.org/K17-1008.pdf
Code:
yotam-happy/NEDforNoisyText
Data:
YAGO