Deep Blue Sonics’ Submission to IWSLT 2020 Open Domain Translation Task

Enmin Su, Yi Ren


Abstract
This report presents our submission to the IWSLT 2020 Open Domain Translation Task. We built a data pre-processing pipeline that efficiently handles large, noisy web-crawled corpora and boosts the BLEU score of a widely used Transformer model on this translation task. To address the open-domain nature of the task, we apply back-translation to further improve translation performance.
Anthology ID:
2020.iwslt-1.16
Volume:
Proceedings of the 17th International Conference on Spoken Language Translation
Month:
July
Year:
2020
Address:
Online
Editors:
Marcello Federico, Alex Waibel, Kevin Knight, Satoshi Nakamura, Hermann Ney, Jan Niehues, Sebastian Stüker, Dekai Wu, Joseph Mariani, Francois Yvon
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
Association for Computational Linguistics
Pages:
140–144
URL:
https://aclanthology.org/2020.iwslt-1.16
DOI:
10.18653/v1/2020.iwslt-1.16
Cite (ACL):
Enmin Su and Yi Ren. 2020. Deep Blue Sonics’ Submission to IWSLT 2020 Open Domain Translation Task. In Proceedings of the 17th International Conference on Spoken Language Translation, pages 140–144, Online. Association for Computational Linguistics.
Cite (Informal):
Deep Blue Sonics’ Submission to IWSLT 2020 Open Domain Translation Task (Su & Ren, IWSLT 2020)
PDF:
https://aclanthology.org/2020.iwslt-1.16.pdf
Video:
http://slideslive.com/38929592