Improving Low-Resource Named Entity Recognition using Joint Sentence and Token Labeling

Canasai Kruengkrai, Thien Hai Nguyen, Sharifah Mahani Aljunied, Lidong Bing


Abstract
Exploiting sentence-level labels, which are easy to obtain, is one plausible way to improve low-resource named entity recognition (NER), where token-level labels are costly to annotate. Current models for jointly learning sentence and token labeling are limited to binary classification. We present a joint model that supports multi-class classification and introduce a simple variant of self-attention that allows the model to learn scaling factors. Our model improves F1 over the BiLSTM-CRF baseline by 3.78%, 4.20%, and 2.08% on e-commerce product titles in three low-resource languages: Vietnamese, Thai, and Indonesian, respectively.
Anthology ID:
2020.acl-main.523
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
5898–5905
URL:
https://aclanthology.org/2020.acl-main.523
DOI:
10.18653/v1/2020.acl-main.523
Cite (ACL):
Canasai Kruengkrai, Thien Hai Nguyen, Sharifah Mahani Aljunied, and Lidong Bing. 2020. Improving Low-Resource Named Entity Recognition using Joint Sentence and Token Labeling. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5898–5905, Online. Association for Computational Linguistics.
Cite (Informal):
Improving Low-Resource Named Entity Recognition using Joint Sentence and Token Labeling (Kruengkrai et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.523.pdf
Video:
http://slideslive.com/38929237