Semi-Automated Resolution of Inconsistency for a Harmonized Multiword Expression and Dependency Parse Annotation

King Chan, Julian Brooke, Timothy Baldwin


Abstract
This paper presents a methodology for identifying and resolving various kinds of inconsistency in the context of merging dependency and multiword expression (MWE) annotations, to generate a dependency treebank with comprehensive MWE annotations. Candidates for correction are identified using a variety of heuristics, including an entirely novel one which identifies violations of MWE constituency in the dependency tree, and resolved by arbitration with minimal human intervention. Using this technique, we identified and corrected several hundred errors across both parse and MWE annotations, representing changes to a significant percentage (well over 10%) of the MWE instances in the joint corpus.
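As a rough illustration of what a check for MWE constituency violations in a dependency tree could look like, the sketch below flags any MWE whose tokens do not form a single connected unit in the tree. This is an assumption made for illustration, not the paper's actual heuristic: the function names `mwe_is_connected` and `find_violations`, the head-map representation (token index mapped to head index, with 0 as the root), and the connectivity criterion itself are all hypothetical.

```python
from typing import Dict, Iterable, List, Set


def mwe_is_connected(heads: Dict[int, int], mwe_tokens: Set[int]) -> bool:
    """Return True if the MWE tokens form one connected piece of the tree.

    Hypothetical criterion (not taken from the paper): in a tree, a set of
    tokens is connected exactly when one and only one of them has its head
    outside the set.
    """
    # Tokens whose head lies outside the MWE are the "tops" of its fragments.
    tops = [t for t in mwe_tokens if heads[t] not in mwe_tokens]
    return len(tops) == 1


def find_violations(
    heads: Dict[int, int], mwes: Iterable[Set[int]]
) -> List[Set[int]]:
    """Collect the MWEs that are split across the tree (correction candidates)."""
    return [m for m in mwes if not mwe_is_connected(heads, m)]


# Toy sentence: 1=He 2=kicked 3=the 4=bucket 5=yesterday (0 is the root).
good_parse = {1: 2, 2: 0, 3: 4, 4: 2, 5: 2}
# Same sentence with "the" wrongly attached to "yesterday".
bad_parse = {1: 2, 2: 0, 3: 5, 4: 2, 5: 2}
kick_the_bucket = {2, 3, 4}
```

With these toy parses, `find_violations(good_parse, [kick_the_bucket])` returns an empty list, while the same call on `bad_parse` returns the MWE as a candidate for human arbitration. Whether the parse or the MWE annotation is at fault is left undecided by such a check.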
Anthology ID:
W17-1726
Volume:
Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Stella Markantonatou, Carlos Ramisch, Agata Savary, Veronika Vincze
Venue:
MWE
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
187–193
URL:
https://aclanthology.org/W17-1726
DOI:
10.18653/v1/W17-1726
Cite (ACL):
King Chan, Julian Brooke, and Timothy Baldwin. 2017. Semi-Automated Resolution of Inconsistency for a Harmonized Multiword Expression and Dependency Parse Annotation. In Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017), pages 187–193, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Semi-Automated Resolution of Inconsistency for a Harmonized Multiword Expression and Dependency Parse Annotation (Chan et al., MWE 2017)
PDF:
https://aclanthology.org/W17-1726.pdf
Code:
eltimster/HAMSTER
Data:
English Web Treebank, Universal Dependencies