Automated WordNet Construction Using Word Embeddings

Mikhail Khodak, Andrej Risteski, Christiane Fellbaum, Sanjeev Arora


Abstract
We present a fully unsupervised method for automated construction of WordNets based upon recent advances in distributional representations of sentences and word-senses combined with readily available machine translation tools. The approach requires very few linguistic resources and is thus extensible to multiple target languages. To evaluate our method, we construct two 600-word testsets for word-to-synset matching in French and Russian using native speakers and compare the performance of our method with that of several other recent approaches. Our method exceeds the best language-specific and multi-lingual automated WordNets in F-score for both languages. The databases we construct for French and Russian, both languages without large publicly available manually constructed WordNets, will be publicly released along with the testsets.
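As a rough illustration of the evaluation described in the abstract, F-score for word-to-synset matching can be computed by comparing the synsets a method assigns to each word against the synsets annotators accepted. The sketch below is not taken from the paper or its code release: the function name `f_score`, the dictionary layout, the synset identifiers, and the choice of micro-averaging over all word-synset pairs are assumptions made for the example.

```python
from typing import Dict, Set


def f_score(predicted: Dict[str, Set[str]],
            gold: Dict[str, Set[str]]) -> Dict[str, float]:
    """Micro-averaged precision, recall and F1 over word-synset pairs.

    Both arguments map a target-language word to a set of synset IDs.
    Averaging over pairs (rather than per word) is an assumption here.
    """
    tp = fp = fn = 0
    for word, gold_synsets in gold.items():
        pred_synsets = predicted.get(word, set())
        tp += len(pred_synsets & gold_synsets)   # correct matches
        fp += len(pred_synsets - gold_synsets)   # spurious matches
        fn += len(gold_synsets - pred_synsets)   # missed matches
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data; the synset IDs are illustrative only.
gold = {"chien": {"dog.n.01"}, "banque": {"bank.n.01", "bank.n.02"}}
predicted = {"chien": {"dog.n.01", "frank.n.02"}, "banque": {"bank.n.01"}}
scores = f_score(predicted, gold)
```

With this toy data there are two correct matches, one spurious match and one missed match, so precision and recall are both 2/3.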
Anthology ID:
W17-1902
Volume:
Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Jose Camacho-Collados, Mohammad Taher Pilehvar
Venue:
SENSE
Publisher:
Association for Computational Linguistics
Pages:
12–23
URL:
https://aclanthology.org/W17-1902
DOI:
10.18653/v1/W17-1902
Cite (ACL):
Mikhail Khodak, Andrej Risteski, Christiane Fellbaum, and Sanjeev Arora. 2017. Automated WordNet Construction Using Word Embeddings. In Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications, pages 12–23, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Automated WordNet Construction Using Word Embeddings (Khodak et al., SENSE 2017)
PDF:
https://aclanthology.org/W17-1902.pdf
Code:
mkhodak/pawn