Named Entity Recognition for Social Media Texts with Semantic Augmentation

Yuyang Nie, Yuanhe Tian, Xiang Wan, Yan Song, Bo Dai


Abstract
Existing approaches for named entity recognition suffer from data sparsity problems when conducted on short and informal texts, especially user-generated social media content. Semantic augmentation is a potential way to alleviate this problem. Given that rich semantic information is implicitly preserved in pre-trained word embeddings, they are potential ideal resources for semantic augmentation. In this paper, we propose a neural-based approach to NER for social media texts where both local (from running text) and augmented semantics are taken into account. In particular, we obtain the augmented semantic information from a large-scale corpus, and propose an attentive semantic augmentation module and a gate module to encode and aggregate such information, respectively. Extensive experiments are performed on three benchmark datasets collected from English and Chinese social media platforms, where the results demonstrate the superiority of our approach to previous studies across all three datasets.
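The abstract names three steps without giving details: retrieving augmented semantics from pre-trained embeddings, encoding them with attention, and aggregating them with a gate. The sketch below is one possible reading of that pipeline, not the authors' implementation. Everything in it is an assumption made for illustration: the toy embedding table, cosine similarity as the retrieval criterion, dot-product attention scores, and the function names `nearest_words`, `attend`, and `gate`. In a trained model the gate logits would be computed from learned parameters; here they are passed in by hand.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def nearest_words(word, embeddings, k):
    """Hypothetical retrieval step: the k words closest to `word` in embedding space."""
    target = embeddings[word]
    scored = [(cosine(target, vec), w) for w, vec in embeddings.items() if w != word]
    scored.sort(reverse=True)
    return [w for _, w in scored[:k]]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(hidden, neighbour_vecs):
    """Hypothetical attention step: weight each neighbour by its dot product with `hidden`."""
    scores = [sum(h * e for h, e in zip(hidden, vec)) for vec in neighbour_vecs]
    weights = softmax(scores)
    return [sum(w * vec[d] for w, vec in zip(weights, neighbour_vecs))
            for d in range(len(hidden))]

def gate(hidden, augmented, gate_logits):
    """Hypothetical gate: g * hidden + (1 - g) * augmented, with g = sigmoid(logit)."""
    fused = []
    for h, a, z in zip(hidden, augmented, gate_logits):
        g = 1.0 / (1.0 + math.exp(-z))
        fused.append(g * h + (1.0 - g) * a)
    return fused

# Toy data, invented for this example only.
embeddings = {
    "nyc": [1.0, 0.1],
    "newyork": [0.9, 0.2],
    "manhattan": [0.8, 0.3],
    "pizza": [0.0, 1.0],
}
neighbours = nearest_words("nyc", embeddings, k=2)
hidden = [0.5, 0.5]  # stand-in for the local (running-text) representation of "nyc"
augmented = attend(hidden, [embeddings[w] for w in neighbours])
fused = gate(hidden, augmented, gate_logits=[0.0, 0.0])
```

With zero gate logits the gate weighs both sources equally, so `fused` is the plain average of the local and augmented vectors. Other logits shift the balance toward one source or the other, per dimension.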
Anthology ID:
2020.emnlp-main.107
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Editors:
Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
1383–1391
URL:
https://aclanthology.org/2020.emnlp-main.107
DOI:
10.18653/v1/2020.emnlp-main.107
Cite (ACL):
Yuyang Nie, Yuanhe Tian, Xiang Wan, Yan Song, and Bo Dai. 2020. Named Entity Recognition for Social Media Texts with Semantic Augmentation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1383–1391, Online. Association for Computational Linguistics.
Cite (Informal):
Named Entity Recognition for Social Media Texts with Semantic Augmentation (Nie et al., EMNLP 2020)
PDF:
https://aclanthology.org/2020.emnlp-main.107.pdf
Video:
https://slideslive.com/38939305
Code:
cuhksz-nlp/SANER
Data:
WNUT 2016 NER, WNUT 2017, Weibo NER