A Parallel WordNet for English, Swedish and Bulgarian

Krasimir Angelov


Abstract
We present the parallel creation of a WordNet resource for Swedish and Bulgarian that is tightly aligned with the Princeton WordNet. The alignment holds not only at the synset level but also at the word level, where words are matched with their closest translations in each language. We argue that this tighter alignment is essential for machine translation and natural language generation. About one-fifth of the lexical entries are also linked to the corresponding Wikipedia articles. In addition to the traditional WordNet semantic relations, we integrate morphological and morpho-syntactic information. The resource comes with a corpus in which examples from the Princeton WordNet are translated into Swedish and Bulgarian and aligned at the word and phrase level. The new resource is open-source, and its development relied only on existing open-source resources.
Anthology ID:
2020.lrec-1.368
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
3008–3015
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.368
Cite (ACL):
Krasimir Angelov. 2020. A Parallel WordNet for English, Swedish and Bulgarian. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 3008–3015, Marseille, France. European Language Resources Association.
Cite (Informal):
A Parallel WordNet for English, Swedish and Bulgarian (Angelov, LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.368.pdf