Predicting Concreteness and Imageability of Words Within and Across Languages via Word Embeddings

Nikola Ljubešić, Darja Fišer, Anita Peti-Stantić


Abstract
The notions of concreteness and imageability, traditionally important in psycholinguistics, are gaining significance in semantically oriented natural language processing tasks. In this paper we investigate the predictability of these two concepts via supervised learning, using word embeddings as explanatory variables. We perform predictions both within and across languages by exploiting collections of cross-lingual embeddings aligned to a single vector space. We show that the notions of concreteness and imageability are highly predictable both within and across languages, with a moderate loss of up to 20% in correlation when predicting across languages. We further show that cross-lingual transfer via word embeddings is more efficient than simple transfer via bilingual dictionaries.
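The core within-language setup the abstract describes — a supervised regressor that maps a word's embedding to a human rating such as concreteness — can be sketched in a few lines. The following is a minimal illustration with synthetic embeddings and ratings (the paper uses real embeddings and human-annotated norms; the dimensions, ridge penalty, and noise level here are arbitrary assumptions for demonstration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for word embeddings: 200 "words" in a 50-dimensional space.
n_words, dim = 200, 50
X = rng.normal(size=(n_words, dim))

# Simulate concreteness ratings as a noisy linear function of the embedding,
# mirroring the assumption that ratings are recoverable from the vector space.
true_w = rng.normal(size=dim)
y = X @ true_w + rng.normal(scale=0.5, size=n_words)

# Hold out the last 50 "words" for evaluation.
train, test = slice(0, 150), slice(150, None)

# Closed-form ridge regression: w = (X'X + lambda*I)^(-1) X'y
lam = 1.0
A = X[train].T @ X[train] + lam * np.eye(dim)
w = np.linalg.solve(A, X[train].T @ y[train])

# Evaluate by correlating predicted and "gold" ratings on held-out words,
# analogous to the correlation-based evaluation mentioned in the abstract.
pred = X[test] @ w
r = np.corrcoef(pred, y[test])[0, 1]
print(f"Pearson r on held-out words: {r:.2f}")
```

For the cross-lingual case, the same trained regressor would simply be applied to embeddings of another language that have been aligned into the shared vector space, with no retraining.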
Anthology ID: W18-3028
Volume: Proceedings of the Third Workshop on Representation Learning for NLP
Month: July
Year: 2018
Address: Melbourne, Australia
Editors: Isabelle Augenstein, Kris Cao, He He, Felix Hill, Spandana Gella, Jamie Kiros, Hongyuan Mei, Dipendra Misra
Venue: RepL4NLP
SIG: SIGREP
Publisher: Association for Computational Linguistics
Pages: 217–222
URL: https://aclanthology.org/W18-3028
DOI: 10.18653/v1/W18-3028
Cite (ACL): Nikola Ljubešić, Darja Fišer, and Anita Peti-Stantić. 2018. Predicting Concreteness and Imageability of Words Within and Across Languages via Word Embeddings. In Proceedings of the Third Workshop on Representation Learning for NLP, pages 217–222, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal): Predicting Concreteness and Imageability of Words Within and Across Languages via Word Embeddings (Ljubešić et al., RepL4NLP 2018)
PDF: https://aclanthology.org/W18-3028.pdf
Code: clarinsi/megahr-crossling