Discriminating Homonymy from Polysemy in Wordnets: English, Spanish and Polish Nouns

Arkadiusz Janz, Marek Maziarz


Abstract
We propose a novel method of homonymy-polysemy discrimination for three Indo-European languages (English, Spanish and Polish). Support vector machines and LASSO logistic regression were successfully used in this task, outperforming baselines. The feature set utilised lemma properties, gloss similarities, graph distances and polysemy patterns. The proposed ML models performed equally well for English and the other two languages (which constituted the testing data sets). The algorithms not only ruled out most cases of homonymy but were also efficacious in distinguishing between closer and indirect semantic relatedness.
Anthology ID:
2021.gwc-1.7
Volume:
Proceedings of the 11th Global Wordnet Conference
Month:
January
Year:
2021
Address:
University of South Africa (UNISA)
Editors:
Piek Vossen, Christiane Fellbaum
Venue:
GWC
SIG:
SIGLEX
Publisher:
Global Wordnet Association
Pages:
53–62
URL:
https://aclanthology.org/2021.gwc-1.7
Cite (ACL):
Arkadiusz Janz and Marek Maziarz. 2021. Discriminating Homonymy from Polysemy in Wordnets: English, Spanish and Polish Nouns. In Proceedings of the 11th Global Wordnet Conference, pages 53–62, University of South Africa (UNISA). Global Wordnet Association.
Cite (Informal):
Discriminating Homonymy from Polysemy in Wordnets: English, Spanish and Polish Nouns (Janz & Maziarz, GWC 2021)
PDF:
https://aclanthology.org/2021.gwc-1.7.pdf