Towards a Swedish Roget-Style Thesaurus for NLP

Niklas Zechner, Lars Borin


Abstract
Bring’s thesaurus (Bring) is a Swedish counterpart of Roget, and its digitized version could make a valuable language resource for use in many and diverse natural language processing (NLP) applications. From the literature we know that Roget-style thesauruses and wordnets have complementary strengths in this context, so both kinds of lexical-semantic resource are good to have. However, Bring was published in 1930, and its lexical items are in the form of lemma–POS pairings. In order to be useful in our NLP systems, polysemous lexical items need to be disambiguated, and a large amount of modern vocabulary must be added in the proper places in Bring. The work presented here describes experiments aiming at automating these two tasks, at least in part, where we use the structure of an existing Swedish semantic lexicon – Saldo – both for disambiguation of ambiguous Bring entries and for addition of new entries to Bring.
Anthology ID:
2020.globalex-1.9
Volume:
Proceedings of the 2020 Globalex Workshop on Linked Lexicography
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Ilan Kernerman, Simon Krek, John P. McCrae, Jorge Gracia, Sina Ahmadi, Besim Kabashi
Venue:
GLOBALEX
Publisher:
European Language Resources Association
Pages:
53–60
Language:
English
URL:
https://aclanthology.org/2020.globalex-1.9
Cite (ACL):
Niklas Zechner and Lars Borin. 2020. Towards a Swedish Roget-Style Thesaurus for NLP. In Proceedings of the 2020 Globalex Workshop on Linked Lexicography, pages 53–60, Marseille, France. European Language Resources Association.
Cite (Informal):
Towards a Swedish Roget-Style Thesaurus for NLP (Zechner & Borin, GLOBALEX 2020)
PDF:
https://aclanthology.org/2020.globalex-1.9.pdf