AlpacaTag: An Active Learning-based Crowd Annotation Framework for Sequence Tagging

Bill Yuchen Lin, Dong-Ho Lee, Frank F. Xu, Ouyu Lan, Xiang Ren


Abstract
We introduce AlpacaTag, an open-source web-based data annotation framework for sequence tagging tasks such as named-entity recognition (NER). AlpacaTag offers three distinctive advantages: 1) Active intelligent recommendation: dynamically suggesting annotations and sampling the most informative unlabeled instances with a back-end actively learned model; 2) Automatic crowd consolidation: improving inter-annotator agreement in real time by merging inconsistent labels from multiple annotators; 3) Real-time model deployment: letting users deploy their models in downstream systems while new annotations are being made. Together, these make AlpacaTag a comprehensive solution for sequence labeling, from rapid tagging with recommendations powered by active learning, through auto-consolidation of crowd annotations, to real-time model deployment.
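The first two points of the abstract can be illustrated with a minimal sketch. This is not AlpacaTag's actual implementation: the function names `least_confident` and `majority_vote` are hypothetical, the sampling rule (pick sentences whose weakest token prediction is least certain) is one common active learning heuristic assumed here for illustration, and plain token-level majority voting stands in for whatever consolidation method the framework really uses.

```python
from collections import Counter
from typing import Dict, List, Sequence


def least_confident(token_conf: Dict[str, List[float]], k: int) -> List[str]:
    """Return the ids of the k sentences the model is least sure about.

    token_conf maps a sentence id to the model's confidence (probability of
    the predicted tag) for each token. Assumption: a sentence is scored by
    its single least confident token.
    """
    scores = {sid: min(conf) for sid, conf in token_conf.items()}
    # Lowest score first, so the most uncertain sentences are sampled first.
    return sorted(scores, key=scores.get)[:k]


def majority_vote(annotations: Sequence[Sequence[str]]) -> List[str]:
    """Merge several annotators' tag sequences for one sentence, token by token."""
    if len({len(a) for a in annotations}) != 1:
        raise ValueError("all annotators must tag the same tokens")
    merged = []
    for tags in zip(*annotations):
        # most_common(1) returns the most frequent tag; ties go to the tag
        # seen first, i.e. the earliest annotator in the input order.
        merged.append(Counter(tags).most_common(1)[0][0])
    return merged


if __name__ == "__main__":
    conf = {"s1": [0.90, 0.95], "s2": [0.40, 0.99], "s3": [0.60, 0.70]}
    print(least_confident(conf, 2))
    crowd = [
        ["B-PER", "O", "O"],
        ["B-PER", "B-LOC", "O"],
        ["O", "B-LOC", "O"],
    ]
    print(majority_vote(crowd))
```

In a real annotation loop, the sampled sentences would be shown to annotators next, and the merged tags would feed the next round of model training.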
Anthology ID:
P19-3010
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Marta R. Costa-jussà, Enrique Alfonseca
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
58–63
URL:
https://aclanthology.org/P19-3010
DOI:
10.18653/v1/P19-3010
Cite (ACL):
Bill Yuchen Lin, Dong-Ho Lee, Frank F. Xu, Ouyu Lan, and Xiang Ren. 2019. AlpacaTag: An Active Learning-based Crowd Annotation Framework for Sequence Tagging. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 58–63, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
AlpacaTag: An Active Learning-based Crowd Annotation Framework for Sequence Tagging (Lin et al., ACL 2019)
PDF:
https://aclanthology.org/P19-3010.pdf