Terminology-Constrained Neural Machine Translation at SAP

Miriam Exel, Bianka Buschbeck, Lauritz Brandt, Simona Doneva


Abstract
This paper examines approaches to bias a neural machine translation model to adhere to terminology constraints in an industrial setup. In particular, we investigate variations of the approach by Dinu et al. (2019), which uses inline annotation of the target terms in the source segment plus source factor embeddings during training and inference, and compare them to constrained decoding. We describe the challenges with respect to terminology in our usage scenario at SAP and show how far the investigated methods can help to overcome them. We extend the original study to a new language pair and provide an in-depth evaluation including an error classification and a human evaluation.
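To illustrate what inline annotation with source factors can look like, here is a minimal sketch. It is not the authors' implementation. It assumes whitespace tokenization, single-token source terms, a hypothetical term dictionary, and a factor convention of 0 for ordinary source tokens, 1 for a source term that is kept, and 2 for an inserted target term. The function name `annotate` and the `mode` parameter are illustrative only.

```python
from typing import Dict, List, Tuple


def annotate(
    tokens: List[str],
    term_dict: Dict[str, str],
    mode: str = "append",
) -> Tuple[List[str], List[int]]:
    """Insert target terms inline and return a parallel stream of source factors.

    Assumed factor values (illustrative convention):
      0 = ordinary source token
      1 = source term (only emitted in "append" mode)
      2 = target term inserted into the source segment
    """
    if mode not in ("append", "replace"):
        raise ValueError("mode must be 'append' or 'replace'")

    out_tokens: List[str] = []
    factors: List[int] = []
    for tok in tokens:
        target = term_dict.get(tok)
        if target is None:
            # No terminology entry: copy the token with factor 0.
            out_tokens.append(tok)
            factors.append(0)
            continue
        if mode == "append":
            # Keep the source term and mark it with factor 1.
            out_tokens.append(tok)
            factors.append(1)
        # Add the target term (possibly several tokens) with factor 2.
        for t in target.split():
            out_tokens.append(t)
            factors.append(2)
    return out_tokens, factors


if __name__ == "__main__":
    source = "Please open the file".split()
    terms = {"file": "Datei"}  # hypothetical English-German entry
    print(annotate(source, terms, mode="append"))
    print(annotate(source, terms, mode="replace"))
```

In a setup like the one described, the token stream and the factor stream would be fed to the model together, with the factors mapped to embeddings that are combined with the word embeddings.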
Anthology ID:
2020.eamt-1.29
Volume:
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation
Month:
November
Year:
2020
Address:
Lisboa, Portugal
Editors:
André Martins, Helena Moniz, Sara Fumega, Bruno Martins, Fernando Batista, Luisa Coheur, Carla Parra, Isabel Trancoso, Marco Turchi, Arianna Bisazza, Joss Moorkens, Ana Guerberof, Mary Nurminen, Lena Marg, Mikel L. Forcada
Venue:
EAMT
Publisher:
European Association for Machine Translation
Pages:
271–280
URL:
https://aclanthology.org/2020.eamt-1.29
Cite (ACL):
Miriam Exel, Bianka Buschbeck, Lauritz Brandt, and Simona Doneva. 2020. Terminology-Constrained Neural Machine Translation at SAP. In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, pages 271–280, Lisboa, Portugal. European Association for Machine Translation.
Cite (Informal):
Terminology-Constrained Neural Machine Translation at SAP (Exel et al., EAMT 2020)
PDF:
https://aclanthology.org/2020.eamt-1.29.pdf