Integrating Domain Terminology into Neural Machine Translation

Elise Michon, Josep Crego, Jean Senellart


Abstract
This paper extends existing work on terminology integration into Neural Machine Translation, a common industrial practice for dynamically adapting translation to a specific domain. Our method, based on placeholders complemented with morphosyntactic annotation, efficiently taps into the neural network's ability to handle symbolic knowledge, surpassing the surface generalization shown by alternative techniques. We compare our approach to state-of-the-art systems and benchmark them within a well-defined evaluation framework, focusing on the actual application of terminology and not just on overall performance. Results indicate the suitability of our method for the use case where terminology is applied in a system trained on generic data only.
Anthology ID:
2020.coling-main.348
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Donia Scott, Nuria Bel, Chengqing Zong
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
3925–3937
URL:
https://aclanthology.org/2020.coling-main.348
DOI:
10.18653/v1/2020.coling-main.348
Cite (ACL):
Elise Michon, Josep Crego, and Jean Senellart. 2020. Integrating Domain Terminology into Neural Machine Translation. In Proceedings of the 28th International Conference on Computational Linguistics, pages 3925–3937, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
Integrating Domain Terminology into Neural Machine Translation (Michon et al., COLING 2020)
PDF:
https://aclanthology.org/2020.coling-main.348.pdf