Neural Machine Translation with Supervised Attention

Lemao Liu, Masao Utiyama, Andrew Finch, Eiichiro Sumita


Abstract
The attention mechanism is appealing for neural machine translation, since it is able to dynamically encode a source sentence by generating an alignment between a target word and source words. Unfortunately, it has been shown to be worse than conventional alignment models in terms of alignment accuracy. In this paper, we analyze and explain this issue from the point of view of reordering, and propose a supervised attention mechanism that is learned with guidance from conventional alignment models. Experiments on two Chinese-to-English translation tasks show that the supervised attention mechanism yields better alignments, leading to substantial gains over standard attention-based NMT.
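To make the core idea concrete, below is a minimal sketch of how attention weights can be supervised by alignments from a conventional aligner (e.g., a GIZA++-style IBM-model aligner). This is not the paper's exact objective: the function name, the cross-entropy form of the penalty, and the row-normalized alignment matrix are illustrative assumptions.

```python
import numpy as np

def supervised_attention_loss(attn, align, eps=1e-8):
    """Cross-entropy between the model's attention weights and a
    reference alignment distribution, averaged over target words.

    attn  : (tgt_len, src_len) attention weights from the NMT decoder;
            each row sums to 1.
    align : (tgt_len, src_len) reference alignments from a conventional
            aligner (hypothetical preprocessing: row-normalize the
            alignment links into distributions).
    """
    return float(-np.mean(np.sum(align * np.log(attn + eps), axis=1)))

# Toy example: 2 target words attending over 3 source words.
attn = np.array([[0.7, 0.2, 0.1],
                 [0.1, 0.3, 0.6]])
align = np.array([[1.0, 0.0, 0.0],   # target word 1 aligned to source word 1
                  [0.0, 0.0, 1.0]])  # target word 2 aligned to source word 3
print(supervised_attention_loss(attn, align))  # ~= (-log 0.7 - log 0.6) / 2
```

In training, a penalty of this kind would typically be added to the standard negative log-likelihood of the translation, weighted by a hyperparameter, so that the attention layer is pulled toward the conventional alignments while the rest of the network is still optimized for translation quality.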
Anthology ID:
C16-1291
Volume:
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Yuji Matsumoto, Rashmi Prasad
Venue:
COLING
Publisher:
The COLING 2016 Organizing Committee
Pages:
3093–3102
URL:
https://aclanthology.org/C16-1291
Cite (ACL):
Lemao Liu, Masao Utiyama, Andrew Finch, and Eiichiro Sumita. 2016. Neural Machine Translation with Supervised Attention. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 3093–3102, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Neural Machine Translation with Supervised Attention (Liu et al., COLING 2016)
PDF:
https://aclanthology.org/C16-1291.pdf