An Evaluation Benchmark for Testing the Word Sense Disambiguation Capabilities of Machine Translation Systems

Alessandro Raganato, Yves Scherrer, Jörg Tiedemann


Abstract
Lexical ambiguity is one of the many challenging linguistic phenomena involved in translation: an ambiguous word must be translated with its correct sense. Previous work has shown that the translation quality of neural machine translation systems can be improved by explicitly modeling the senses of ambiguous words. Recently, several evaluation test sets have been proposed to measure the word sense disambiguation (WSD) capability of machine translation systems. However, to date, these test sets do not include training data that would provide a fair setup for measuring the sense distributions present within the training data itself. In this paper, we present an evaluation benchmark on WSD for machine translation for 10 language pairs, comprising training data with known sense distributions. Our approach to constructing the benchmark builds upon the wide-coverage multilingual sense inventory of BabelNet, the multilingual neural parsing pipeline TurkuNLP, and the OPUS collection of translated texts from the web. The test suite is available at http://github.com/Helsinki-NLP/MuCoW.
Anthology ID:
2020.lrec-1.452
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
3668–3675
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.452
Cite (ACL):
Alessandro Raganato, Yves Scherrer, and Jörg Tiedemann. 2020. An Evaluation Benchmark for Testing the Word Sense Disambiguation Capabilities of Machine Translation Systems. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 3668–3675, Marseille, France. European Language Resources Association.
Cite (Informal):
An Evaluation Benchmark for Testing the Word Sense Disambiguation Capabilities of Machine Translation Systems (Raganato et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.452.pdf
Code:
Helsinki-NLP/MuCoW