POSTECH-ETRI’s Submission to the WMT2020 APE Shared Task: Automatic Post-Editing with Cross-lingual Language Model

Jihyung Lee, WonKee Lee, Jaehun Shin, Baikjin Jung, Young-Kil Kim, Jong-Hyeok Lee


Abstract
This paper describes POSTECH-ETRI’s submission to WMT2020 for the shared task on automatic post-editing (APE) for 2 language pairs: English-German (En-De) and English-Chinese (En-Zh). We propose APE systems based on a cross-lingual language model, which jointly adopts translation language modeling (TLM) and masked language modeling (MLM) training objectives in the pre-training stage; the APE models then utilize jointly learned language representations between the source language and the target language. In addition, we created 19 million new sythetic triplets as additional training data for our final ensemble model. According to experimental results on the WMT2020 APE development data set, our models showed an improvement over the baseline by TER of -3.58 and a BLEU score of +5.3 for the En-De subtask; and TER of -5.29 and a BLEU score of +7.32 for the En-Zh subtask.
Anthology ID:
2020.wmt-1.82
Volume:
Proceedings of the Fifth Conference on Machine Translation
Month:
November
Year:
2020
Address:
Online
Editors:
Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Yvette Graham, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Note:
Pages:
777–782
Language:
URL:
https://aclanthology.org/2020.wmt-1.82
DOI:
Bibkey:
Cite (ACL):
Jihyung Lee, WonKee Lee, Jaehun Shin, Baikjin Jung, Young-Kil Kim, and Jong-Hyeok Lee. 2020. POSTECH-ETRI’s Submission to the WMT2020 APE Shared Task: Automatic Post-Editing with Cross-lingual Language Model. In Proceedings of the Fifth Conference on Machine Translation, pages 777–782, Online. Association for Computational Linguistics.
Cite (Informal):
POSTECH-ETRI’s Submission to the WMT2020 APE Shared Task: Automatic Post-Editing with Cross-lingual Language Model (Lee et al., WMT 2020)
Copy Citation:
PDF:
https://aclanthology.org/2020.wmt-1.82.pdf
Video:
 https://slideslive.com/38939561