Content Word Aware Neural Machine Translation

Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita


Abstract
Neural machine translation (NMT) encodes the source sentence in a universal way to generate the target sentence word-by-word. However, NMT does not consider the importance of individual words to the sentence meaning; for example, some words (i.e., content words) express more of the meaning than others (i.e., function words). To address this limitation, we first utilize word frequency information to distinguish between content and function words in a sentence, and then design a content word-aware NMT to improve translation performance. Empirical results on the WMT14 English-to-German, WMT14 English-to-French, and WMT17 Chinese-to-English translation tasks show that the proposed methods can significantly improve the performance of Transformer-based NMT.
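The abstract's frequency-based distinction between content and function words can be sketched roughly as follows. This is a minimal illustration, not the paper's actual method: the cutoff `k` and the toy corpus are assumptions made for the example, since function words (e.g., "the", "of") tend to dominate frequency counts while content words are rarer.

```python
from collections import Counter

def split_by_frequency(corpus, k):
    """Return a labeling function that tags the k most frequent
    word types as 'function' words and the rest as 'content' words.
    This is a rough frequency heuristic, not the paper's exact setup."""
    counts = Counter(word for sentence in corpus for word in sentence)
    function_words = {word for word, _ in counts.most_common(k)}

    def label(word):
        return "function" if word in function_words else "content"

    return label

# Toy corpus (illustrative only): "the" is the most frequent type.
corpus = [["the", "cat", "sat", "on", "the", "mat"],
          ["the", "dog", "ran"]]
label = split_by_frequency(corpus, k=1)
# label("the") -> "function"; label("cat") -> "content"
```

In practice such a threshold would be tuned on the training corpus; the paper's contribution is to make the NMT model aware of this distinction rather than the heuristic itself.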
Anthology ID:
2020.acl-main.34
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
358–364
URL:
https://aclanthology.org/2020.acl-main.34
DOI:
10.18653/v1/2020.acl-main.34
Cite (ACL):
Kehai Chen, Rui Wang, Masao Utiyama, and Eiichiro Sumita. 2020. Content Word Aware Neural Machine Translation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 358–364, Online. Association for Computational Linguistics.
Cite (Informal):
Content Word Aware Neural Machine Translation (Chen et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.34.pdf
Video:
http://slideslive.com/38928867