Combining domain and topic adaptation for SMT

Eva Hasler, Barry Haddow, Philipp Koehn


Abstract
Recent years have seen increased interest in adapting translation models to test domains that are known in advance as well as using latent topic representations to adapt to unknown test domains. However, the relationship between domains and latent topics is still somewhat unclear and topic adaptation approaches typically do not make use of domain knowledge in the training data. We show empirically that combining domain and topic adaptation approaches can be beneficial and that topic representations can be used to predict the domain of a test document. Our best combined model yields gains of up to 0.82 BLEU over a domain-adapted translation system and up to 1.67 BLEU over an unadapted system, measured on the stronger of two training conditions.
Anthology ID:
2014.amta-researchers.11
Volume:
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track
Month:
October 22-26
Year:
2014
Address:
Vancouver, Canada
Editors:
Yaser Al-Onaizan, Michel Simard
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
139–151
URL:
https://aclanthology.org/2014.amta-researchers.11
Cite (ACL):
Eva Hasler, Barry Haddow, and Philipp Koehn. 2014. Combining domain and topic adaptation for SMT. In Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track, pages 139–151, Vancouver, Canada. Association for Machine Translation in the Americas.
Cite (Informal):
Combining domain and topic adaptation for SMT (Hasler et al., AMTA 2014)
PDF:
https://aclanthology.org/2014.amta-researchers.11.pdf