Building Sense Representations in Danish by Combining Word Embeddings with Lexical Resources

Ida Rørmann Olsen, Bolette Pedersen, Asad Sayeed


Abstract
Our aim is to identify suitable sense representations for NLP in Danish. We investigate sense inventories that correlate with human interpretations of word meaning and ambiguity, as typically described in dictionaries and wordnets, and that are well reflected distributionally, as expressed in word embeddings. To this end, we study a number of highly ambiguous Danish nouns and examine the effectiveness of sense representations constructed by combining vectors from a distributional model with the information from a wordnet. We establish representations based on centroids obtained from wordnet synsets and example sentences, as well as representations established via [...]; these are tested in a word sense disambiguation task. We conclude that the more information is extracted from the wordnet entries (example sentence, definition, semantic relations), the more successful the sense representation vector.
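The centroid idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 3-dimensional vectors, the sense labels, and the helper names (`centroid`, `cosine`, `build_sense_vectors`, `disambiguate`) are all hypothetical and chosen only for illustration. The example uses the Danish noun "skade", which can mean either "magpie" or "injury". Each sense vector is the average of the embeddings of words associated with that sense (as one might collect them from a synset or an example sentence), and a context is assigned to the sense whose centroid is closest by cosine similarity.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only.
# A real setup would load vectors from a trained distributional model.
EMBEDDINGS = {
    "fugl": [0.9, 0.1, 0.0],      # bird
    "rede": [0.8, 0.2, 0.1],      # nest
    "flyve": [0.7, 0.0, 0.2],     # to fly
    "sår": [0.1, 0.9, 0.1],       # wound
    "ulykke": [0.0, 0.8, 0.3],    # accident
    "hospital": [0.2, 0.7, 0.2],  # hospital
}

# Hypothetical sense inventory for "skade": words that might be drawn from
# a wordnet entry (synset members, definition, example sentence).
SENSE_WORDS = {
    "skade_bird": ["fugl", "rede"],
    "skade_injury": ["sår", "ulykke"],
}


def centroid(words):
    """Average the embeddings of all words that have a vector."""
    vectors = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    if not vectors:
        raise ValueError("no known words to build a centroid from")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_sense_vectors(sense_words):
    """Map each sense label to the centroid of its associated words."""
    return {sense: centroid(words) for sense, words in sense_words.items()}


def disambiguate(context_words, sense_vectors):
    """Pick the sense whose centroid is most similar to the context centroid."""
    context_vector = centroid(context_words)
    return max(sense_vectors, key=lambda s: cosine(context_vector, sense_vectors[s]))


SENSE_VECTORS = build_sense_vectors(SENSE_WORDS)
```

Calling `disambiguate(["flyve", "fugl"], SENSE_VECTORS)` selects the bird sense in this toy setup, while a context such as `["hospital", "ulykke"]` selects the injury sense. Adding more words per sense (for example from definitions or semantic relations) only changes the contents of `SENSE_WORDS`, not the procedure.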
Anthology ID:
2020.globalex-1.8
Volume:
Proceedings of the 2020 Globalex Workshop on Linked Lexicography
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Ilan Kernerman, Simon Krek, John P. McCrae, Jorge Gracia, Sina Ahmadi, Besim Kabashi
Venue:
GLOBALEX
Publisher:
European Language Resources Association
Pages:
45–52
Language:
English
URL:
https://aclanthology.org/2020.globalex-1.8
Cite (ACL):
Ida Rørmann Olsen, Bolette Pedersen, and Asad Sayeed. 2020. Building Sense Representations in Danish by Combining Word Embeddings with Lexical Resources. In Proceedings of the 2020 Globalex Workshop on Linked Lexicography, pages 45–52, Marseille, France. European Language Resources Association.
Cite (Informal):
Building Sense Representations in Danish by Combining Word Embeddings with Lexical Resources (Rørmann Olsen et al., GLOBALEX 2020)
PDF:
https://aclanthology.org/2020.globalex-1.8.pdf