Wordnet-based Evaluation of Large Distributional Models for Polish

Maciej Piasecki, Gabriela Czachor, Arkadiusz Janz, Dominik Kaszewski, Paweł Kędzia


Abstract
This paper presents the construction of large-scale test datasets for word embeddings on the basis of a very large wordnet. The datasets were then applied to evaluate word embedding models and to assess and compare the usefulness of different word embeddings extracted from a very large corpus of Polish. We also analysed and compared several publicly available models described in the literature. In addition, we present several large word embedding models built on the basis of a very large Polish corpus.
Anthology ID:
2018.gwc-1.26
Volume:
Proceedings of the 9th Global Wordnet Conference
Month:
January
Year:
2018
Address:
Nanyang Technological University (NTU), Singapore
Editors:
Francis Bond, Piek Vossen, Christiane Fellbaum
Venue:
GWC
SIG:
SIGLEX
Publisher:
Global Wordnet Association
Pages:
229–238
URL:
https://aclanthology.org/2018.gwc-1.26
Cite (ACL):
Maciej Piasecki, Gabriela Czachor, Arkadiusz Janz, Dominik Kaszewski, and Paweł Kędzia. 2018. Wordnet-based Evaluation of Large Distributional Models for Polish. In Proceedings of the 9th Global Wordnet Conference, pages 229–238, Nanyang Technological University (NTU), Singapore. Global Wordnet Association.
Cite (Informal):
Wordnet-based Evaluation of Large Distributional Models for Polish (Piasecki et al., GWC 2018)
PDF:
https://aclanthology.org/2018.gwc-1.26.pdf