Delexicalized Word Embeddings for Cross-lingual Dependency Parsing

Mathieu Dehouck, Pascal Denis


Abstract
This paper presents a new approach to the problem of cross-lingual dependency parsing, aiming at leveraging training data from different source languages to learn a parser in a target language. Specifically, this approach first constructs word vector representations that exploit structural (i.e., dependency-based) contexts, while considering only the morpho-syntactic information associated with each word and its contexts. These delexicalized word embeddings, which can be trained on any set of languages and capture features shared across languages, are then used in combination with standard language-specific features to train a lexicalized parser in the target language. We evaluate our approach through experiments on a set of eight different languages that are part of the Universal Dependencies Project. Our main results show that using such delexicalized embeddings, trained in either a monolingual or a multilingual fashion, achieves significant improvements over monolingual baselines.
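The core idea in the abstract can be illustrated with a small sketch: replace each word by its morpho-syntactic signature (POS plus morphological features), collect dependency-based contexts from the arcs of a treebank, and derive vectors for the signatures. The toy sentences, tag names, and the PPMI weighting below are all illustrative assumptions; the paper itself trains embeddings with a word2vec-style objective rather than explicit PPMI vectors.

```python
import math
from collections import Counter, defaultdict

# Toy delexicalized treebank (illustrative, not from the paper's data).
# Each token is reduced to its morpho-syntactic signature and linked to
# its head by a labeled arc: (signature, head_index, deprel); -1 = root.
sentences = [
    [("NOUN|Case=Nom", 1, "nsubj"), ("VERB|Tense=Past", -1, "root"),
     ("NOUN|Case=Acc", 1, "obj")],
    [("DET|Definite=Def", 1, "det"), ("NOUN|Case=Nom", 2, "nsubj"),
     ("VERB|Tense=Pres", -1, "root")],
]

# Dependency-based contexts: for each arc, the dependent sees
# "label->head_signature" and the head sees "label<-dep_signature".
cooc = defaultdict(Counter)
for sent in sentences:
    for sig, head, rel in sent:
        if head < 0:
            continue  # root token has no head context
        head_sig = sent[head][0]
        cooc[sig][f"{rel}->{head_sig}"] += 1
        cooc[head_sig][f"{rel}<-{sig}"] += 1

# PPMI-weighted count vectors stand in here for trained embeddings.
total = sum(c for ctxs in cooc.values() for c in ctxs.values())
sig_tot = {s: sum(c.values()) for s, c in cooc.items()}
ctx_tot = Counter()
for ctxs in cooc.values():
    ctx_tot.update(ctxs)

contexts = sorted(ctx_tot)  # fixed context vocabulary -> vector dimensions

def ppmi_vector(sig):
    vec = []
    for ctx in contexts:
        n = cooc[sig][ctx]
        if n == 0:
            vec.append(0.0)
        else:
            pmi = math.log((n * total) / (sig_tot[sig] * ctx_tot[ctx]))
            vec.append(max(0.0, pmi))
    return vec

embeddings = {sig: ppmi_vector(sig) for sig in cooc}
```

Because the signatures and arc labels are language-independent (Universal Dependencies tags), the same vectors can be estimated from treebanks in any mix of source languages and then reused as features when training a parser in the target language.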
Anthology ID:
E17-1023
Volume:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Mirella Lapata, Phil Blunsom, Alexander Koller
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
241–250
URL:
https://aclanthology.org/E17-1023
Cite (ACL):
Mathieu Dehouck and Pascal Denis. 2017. Delexicalized Word Embeddings for Cross-lingual Dependency Parsing. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pages 241–250, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Delexicalized Word Embeddings for Cross-lingual Dependency Parsing (Dehouck & Denis, EACL 2017)
PDF:
https://aclanthology.org/E17-1023.pdf
Data
Universal Dependencies