CUNI Submission for Low-Resource Languages in WMT News 2019

Tom Kocmi, Ondřej Bojar


Abstract
This paper describes the CUNI submission to the WMT 2019 News Translation Shared Task for the low-resource language pairs Gujarati-English and Kazakh-English. We participated in both language pairs and in both translation directions. Our system combines transfer learning from a different high-resource language pair with subsequent training on backtranslated monolingual data. Because we train both directions simultaneously, we can iterate the backtranslation process. We use the Transformer model in a constrained submission.
Anthology ID:
W19-5322
Volume:
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Marco Turchi, Karin Verspoor
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
234–240
URL:
https://aclanthology.org/W19-5322
DOI:
10.18653/v1/W19-5322
Cite (ACL):
Tom Kocmi and Ondřej Bojar. 2019. CUNI Submission for Low-Resource Languages in WMT News 2019. In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pages 234–240, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
CUNI Submission for Low-Resource Languages in WMT News 2019 (Kocmi & Bojar, WMT 2019)
PDF:
https://aclanthology.org/W19-5322.pdf
Poster:
W19-5322.Poster.pdf