If you’ve seen some, you’ve seen them all: Identifying variants of multiword expressions

Caroline Pasquer, Agata Savary, Carlos Ramisch, Jean-Yves Antoine


Abstract
Multiword expressions, especially verbal ones (VMWEs), show idiosyncratic variability, which is challenging for NLP applications, hence the need for VMWE identification. We focus on the task of variant identification, i.e. identifying variants of previously seen VMWEs, whatever their surface form. We model the problem as a classification task. Syntactic subtrees with previously seen combinations of lemmas are first extracted, and then classified on the basis of features relevant to morpho-syntactic variation of VMWEs. Feature values are both absolute, i.e. hold for a particular VMWE candidate, and relative, i.e. based on comparing a candidate with previously seen VMWEs. This approach outperforms a baseline by 4 percentage points of F-measure on a French corpus.
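The two-step idea in the abstract (extract candidate subtrees that contain a previously seen combination of lemmas, then describe each candidate with absolute and relative features) can be illustrated with a minimal sketch. It is not the authors' implementation: the sentence encoding as (lemma, head index) pairs, the function names `extract_candidates` and `features`, and the single "gap" feature are all assumptions chosen for illustration, and the classifier itself is left out.

```python
from itertools import product

def extract_candidates(sentence, seen_lemma_sets):
    """Return index tuples of tokens whose lemmas match a seen VMWE
    and which form a connected subtree of the dependency tree.

    sentence: list of (lemma, head_index) pairs, head_index == -1 for the root
              (hypothetical encoding, not a format from the paper).
    seen_lemma_sets: iterable of frozensets of lemmas seen in training.
    """
    candidates = []
    for lemma_set in seen_lemma_sets:
        # Positions of every occurrence of each component lemma.
        positions = [
            [i for i, (lemma, _) in enumerate(sentence) if lemma == target]
            for target in sorted(lemma_set)
        ]
        for combo in product(*positions):
            chosen = set(combo)
            # In a tree, k nodes are connected iff exactly k - 1 of them
            # have their head inside the chosen set.
            edges = sum(1 for i in chosen if sentence[i][1] in chosen)
            if edges == len(chosen) - 1:
                candidates.append(tuple(sorted(chosen)))
    return candidates

def features(candidate, seen_gaps):
    """One absolute and one relative feature (both illustrative assumptions)."""
    # Absolute: number of tokens inserted between the first and last component.
    gap = candidate[-1] - candidate[0] - (len(candidate) - 1)
    # Relative: was this gap size already observed for the seen VMWE?
    return {"gap": gap, "gap_seen_before": gap in seen_gaps}

# "il prend une grande décision" with lemmas and dependency heads.
sentence = [("il", 1), ("prendre", -1), ("un", 4), ("grand", 4), ("décision", 1)]
seen = [frozenset({"prendre", "décision"})]
candidates = extract_candidates(sentence, seen)
candidate_features = [features(c, seen_gaps={0, 1}) for c in candidates]
```

In this sketch the feature dictionaries would be passed to any binary classifier that decides whether a candidate is a true VMWE occurrence or a coincidental co-occurrence of the same lemmas.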
Anthology ID:
C18-1219
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Emily M. Bender, Leon Derczynski, Pierre Isabelle
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
2582–2594
URL:
https://aclanthology.org/C18-1219
Cite (ACL):
Caroline Pasquer, Agata Savary, Carlos Ramisch, and Jean-Yves Antoine. 2018. If you’ve seen some, you’ve seen them all: Identifying variants of multiword expressions. In Proceedings of the 27th International Conference on Computational Linguistics, pages 2582–2594, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
If you’ve seen some, you’ve seen them all: Identifying variants of multiword expressions (Pasquer et al., COLING 2018)
PDF:
https://aclanthology.org/C18-1219.pdf