Masked Conditional Random Fields for Sequence Labeling

Tianwen Wei, Jianwei Qi, Shenghuan He, Songtao Sun


Abstract
Conditional Random Field (CRF) based neural models are among the most performant methods for solving sequence labeling problems. Despite its great success, CRF has the shortcoming of occasionally generating illegal sequences of tags, e.g. sequences containing an “I-” tag immediately after an “O” tag, which is forbidden by the underlying BIO tagging scheme. In this work, we propose Masked Conditional Random Field (MCRF), an easy-to-implement variant of CRF that imposes restrictions on candidate paths during both the training and decoding phases. We show that the proposed method thoroughly resolves this issue and brings significant improvement over existing CRF-based models at near-zero additional cost.
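To make the idea of restricting candidate paths concrete, the following is a minimal sketch of one way to mask illegal BIO transitions in a CRF transition-score matrix. It is an illustration under stated assumptions, not the authors' implementation: the function names `is_legal_transition` and `mask_transitions`, the list-of-lists matrix layout, and the penalty value are all hypothetical, and the constraint on the first tag of a sequence is left out for brevity.

```python
from typing import List


def is_legal_transition(prev_tag: str, next_tag: str) -> bool:
    """Return True if next_tag may directly follow prev_tag under BIO.

    The only rule enforced here: "I-X" must follow "B-X" or "I-X".
    """
    if not next_tag.startswith("I-"):
        return True  # "O" and "B-*" tags may follow any tag
    entity = next_tag[2:]
    return prev_tag in (f"B-{entity}", f"I-{entity}")


def mask_transitions(
    tags: List[str],
    transitions: List[List[float]],
    penalty: float = -1e4,  # assumed value; any large negative score works
) -> List[List[float]]:
    """Copy the score matrix, replacing illegal transitions with `penalty`.

    transitions[i][j] is the score of moving from tags[i] to tags[j].
    """
    return [
        [
            score if is_legal_transition(prev_tag, next_tag) else penalty
            for next_tag, score in zip(tags, row)
        ]
        for prev_tag, row in zip(tags, transitions)
    ]


# Hypothetical tag set and an all-zero score matrix for illustration.
tags = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]
scores = [[0.0] * len(tags) for _ in tags]
masked = mask_transitions(tags, scores)
```

If the same masked matrix is used both when computing the training loss and when decoding, paths that contain an illegal transition receive a very low score in both phases, which is the kind of restriction the abstract describes.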
Anthology ID:
2021.naacl-main.163
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Editors:
Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
2024–2035
URL:
https://aclanthology.org/2021.naacl-main.163
DOI:
10.18653/v1/2021.naacl-main.163
Cite (ACL):
Tianwen Wei, Jianwei Qi, Shenghuan He, and Songtao Sun. 2021. Masked Conditional Random Fields for Sequence Labeling. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2024–2035, Online. Association for Computational Linguistics.
Cite (Informal):
Masked Conditional Random Fields for Sequence Labeling (Wei et al., NAACL 2021)
PDF:
https://aclanthology.org/2021.naacl-main.163.pdf
Video:
https://aclanthology.org/2021.naacl-main.163.mp4
Code:
DandyQi/MaskedCRF
Data:
ATIS, CoNLL 2003