Geometry-aware domain adaptation for unsupervised alignment of word embeddings

Pratik Jawanpuria, Mayank Meghwanshi, Bamdev Mishra


Abstract
We propose a novel manifold-based geometric approach for learning an unsupervised alignment of word embeddings between the source and target languages. Our approach formulates alignment learning as a domain adaptation problem over the manifold of doubly stochastic matrices. This viewpoint arises from the aim of aligning the second-order information of the two language spaces. The rich geometry of the doubly stochastic manifold allows us to employ an efficient Riemannian conjugate gradient algorithm for the proposed formulation. Empirically, the proposed approach outperforms a state-of-the-art optimal transport based approach on the bilingual lexicon induction task across several language pairs. The performance improvement is more significant for distant language pairs.
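For readers unfamiliar with the constraint set named in the abstract, the following is a minimal sketch of what a doubly stochastic matrix is: a matrix with positive entries whose rows and columns each sum to one. It uses Sinkhorn-style alternating normalization, which is a standard technique swapped in here purely for illustration; it is not the Riemannian conjugate gradient algorithm the paper proposes. The function name `sinkhorn_normalize`, its parameters, and the random input are all assumptions made for this example.

```python
import numpy as np

def sinkhorn_normalize(A, n_iters=1000, tol=1e-9):
    """Scale a strictly positive square matrix so that rows and columns sum to 1.

    Illustrative only: hypothetical helper, not the method from the paper.
    """
    P = np.asarray(A, dtype=float).copy()
    for _ in range(n_iters):
        P /= P.sum(axis=1, keepdims=True)  # normalize every row to sum to 1
        P /= P.sum(axis=0, keepdims=True)  # normalize every column to sum to 1
        # Columns are exact after the last step; stop once rows agree too.
        if np.abs(P.sum(axis=1) - 1.0).max() < tol:
            break
    return P

# Assumed toy input: a random strictly positive 5 x 5 matrix.
rng = np.random.default_rng(0)
P = sinkhorn_normalize(rng.random((5, 5)) + 0.1)
```

The resulting `P` can be read as a soft matching between five source items and five target items, which is the kind of object an alignment over doubly stochastic matrices optimizes.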
Anthology ID:
2020.acl-main.276
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3052–3058
URL:
https://aclanthology.org/2020.acl-main.276
DOI:
10.18653/v1/2020.acl-main.276
Cite (ACL):
Pratik Jawanpuria, Mayank Meghwanshi, and Bamdev Mishra. 2020. Geometry-aware domain adaptation for unsupervised alignment of word embeddings. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 3052–3058, Online. Association for Computational Linguistics.
Cite (Informal):
Geometry-aware domain adaptation for unsupervised alignment of word embeddings (Jawanpuria et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.276.pdf
Software:
 2020.acl-main.276.Software.zip
Video:
 http://slideslive.com/38929319