The Ubiqus English-Inuktitut System for WMT20

François Hernandez, Vincent Nguyen


Abstract
This paper describes Ubiqus’ submission to the WMT20 English-Inuktitut shared news translation task. Our main system, and only submission, is based on a multilingual approach, jointly training a Transformer model on several agglutinative languages. The English-Inuktitut translation task is challenging at every step, from data selection, preparation and tokenization to quality evaluation down the line. Difficulties arise both from the peculiarities of the Inuktitut language and from the low-resource context.
Anthology ID:
2020.wmt-1.21
Volume:
Proceedings of the Fifth Conference on Machine Translation
Month:
November
Year:
2020
Address:
Online
Editors:
Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Yvette Graham, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
213–217
URL:
https://aclanthology.org/2020.wmt-1.21
Cite (ACL):
François Hernandez and Vincent Nguyen. 2020. The Ubiqus English-Inuktitut System for WMT20. In Proceedings of the Fifth Conference on Machine Translation, pages 213–217, Online. Association for Computational Linguistics.
Cite (Informal):
The Ubiqus English-Inuktitut System for WMT20 (Hernandez & Nguyen, WMT 2020)
PDF:
https://aclanthology.org/2020.wmt-1.21.pdf