The Transference Architecture for Automatic Post-Editing

Santanu Pal, Hongfei Xu, Nico Herbig, Sudip Kumar Naskar, Antonio Krüger, Josef van Genabith


Abstract
In automatic post-editing (APE) it makes sense to condition post-editing (pe) decisions on both the source (src) and the machine-translated text (mt) as input. This has led to multi-encoder based neural APE approaches. A research challenge now is the search for architectures that best support the capture, preparation and provision of src and mt information and its integration with pe decisions. In this paper we present an efficient multi-encoder based APE model, called transference. Unlike previous approaches, it (i) uses a transformer encoder block for src, (ii) followed by a decoder block, but without masking for self-attention on mt, which effectively acts as a second encoder combining src → mt, and (iii) feeds this representation into a final decoder block generating pe. Our model outperforms the best performing systems by 1 BLEU point on the WMT 2016, 2017, and 2018 English–German APE shared tasks (PBSMT and NMT). Furthermore, the results of our model on the WMT 2019 APE task using NMT data show performance comparable to the state-of-the-art system. The inference time of our model is similar to that of the vanilla transformer-based NMT system, although our model deals with two separate encoders. We further investigate the importance of our newly introduced second encoder and find that using too few layers in it hurts performance, while reducing the number of layers of the decoder does not matter much.
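The three-block data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes single-head attention with no learned weights, layer normalization, or feed-forward sublayers, and the function names (`attention`, `encode_src_mt`, `decode_pe`) are hypothetical. It only shows which block attends to what, and where the causal mask is applied.

```python
import math

def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention over lists of vectors (toy, no learned weights)."""
    d = len(keys[0])
    out = []
    for i, q in enumerate(queries):
        # With causal=True, position i may only look at positions <= i.
        visible = range(i + 1) if causal else range(len(keys))
        scores = [sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d) for j in visible]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w / total * values[j][k] for w, j in zip(weights, visible))
            for k in range(len(values[0]))
        ])
    return out

def add(x, y):
    """Residual connection: element-wise sum of two sequences of vectors."""
    return [[a + b for a, b in zip(u, v)] for u, v in zip(x, y)]

def encode_src_mt(src, mt):
    # (i) encoder block for src: unmasked self-attention.
    enc_src = add(src, attention(src, src, src))
    # (ii) decoder-style block on mt, but WITHOUT a causal mask on self-attention,
    # followed by cross-attention to the src encoding. It acts as a second encoder.
    h = add(mt, attention(mt, mt, mt, causal=False))
    return add(h, attention(h, enc_src, enc_src))

def decode_pe(enc_src_mt, pe_prefix):
    # (iii) final decoder block: causal self-attention on the pe prefix,
    # then cross-attention to the combined src -> mt representation.
    g = add(pe_prefix, attention(pe_prefix, pe_prefix, pe_prefix, causal=True))
    return add(g, attention(g, enc_src_mt, enc_src_mt))

# Toy 2-dimensional "embeddings" (made-up values, for illustration only).
src = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mt = [[0.5, 0.5], [1.0, 0.0]]
pe_prefix = [[1.0, 0.0], [0.0, 1.0]]
pe_states = decode_pe(encode_src_mt(src, mt), pe_prefix)
```

Because the second block is unmasked, every mt position can see the whole machine translation, whereas each pe position in the final block sees only earlier pe positions.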
Anthology ID:
2020.coling-main.524
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Donia Scott, Nuria Bel, Chengqing Zong
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
5963–5974
URL:
https://aclanthology.org/2020.coling-main.524
DOI:
10.18653/v1/2020.coling-main.524
Cite (ACL):
Santanu Pal, Hongfei Xu, Nico Herbig, Sudip Kumar Naskar, Antonio Krüger, and Josef van Genabith. 2020. The Transference Architecture for Automatic Post-Editing. In Proceedings of the 28th International Conference on Computational Linguistics, pages 5963–5974, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
The Transference Architecture for Automatic Post-Editing (Pal et al., COLING 2020)
PDF:
https://aclanthology.org/2020.coling-main.524.pdf
Data:
WMT 2016, eSCAPE