Neural Hidden Markov Model for Machine Translation

Weiyue Wang, Derui Zhu, Tamer Alkhouli, Zixuan Gan, Hermann Ney


Abstract
Attention-based neural machine translation (NMT) models selectively focus on specific source positions to produce a translation, which brings significant improvements over pure encoder-decoder sequence-to-sequence models. This work investigates NMT with the attention component replaced. We study a neural hidden Markov model (HMM) consisting of neural network-based alignment and lexicon models, which are trained jointly using the forward-backward algorithm. We show that the attention component can be effectively replaced by the neural network alignment model, and that the neural HMM approach achieves performance comparable to state-of-the-art attention-based models on the WMT 2017 German↔English and Chinese→English translation tasks.
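The model outlined in the abstract admits a compact dynamic-programming formulation. Below is a minimal sketch of the forward algorithm that marginalizes over alignments, assuming the neural alignment and lexicon models have already been evaluated into log-probability tables; the function name, the uniform initial-alignment prior, and the random stand-in tables are illustrative assumptions, not the authors' implementation (the paper additionally trains both networks jointly via the full forward-backward algorithm).

```python
# Sketch of the HMM forward pass for p(e_1^I | f_1^J), NOT the authors' code.
# Assumed inputs (hypothetical network outputs):
#   log_lexicon[i, j]        = log p(e_i | f_j)           (lexicon model)
#   log_transition[i, j, jp] = log p(a_i = j | a_{i-1} = jp)  (alignment model)
# Marginalizing over all alignment paths costs O(I * J^2).

import numpy as np
from scipy.special import logsumexp


def forward_log_likelihood(log_lexicon, log_transition):
    """Return log p(e_1^I | f_1^J) by summing over all alignments a_1^I."""
    I, J = log_lexicon.shape
    # Uniform prior over the first alignment position (an assumption here;
    # the paper parameterizes the initial alignment as well).
    alpha = log_lexicon[0] - np.log(J)                  # shape (J,)
    for i in range(1, I):
        # alpha[j] <- log p(e_i | f_j) + log sum_{j'} p(j | j') * exp(alpha[j'])
        alpha = log_lexicon[i] + logsumexp(
            log_transition[i] + alpha[None, :], axis=1  # sum over previous j'
        )
    return logsumexp(alpha)                             # sum over final position


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    I, J = 4, 5  # target / source sentence lengths
    # Random stand-ins for network outputs: lexicon entries are probabilities
    # of the observed target word; transitions normalize over the new position j.
    lexicon = rng.uniform(0.05, 1.0, size=(I, J))
    transition = rng.uniform(size=(I, J, J))
    transition /= transition.sum(axis=1, keepdims=True)
    print(forward_log_likelihood(np.log(lexicon), np.log(transition)))
```

Working in log space keeps the recursion numerically stable for long sentences, where products of per-position probabilities would otherwise underflow.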
Anthology ID:
P18-2060
Volume:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Iryna Gurevych, Yusuke Miyao
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
377–382
URL:
https://aclanthology.org/P18-2060
DOI:
10.18653/v1/P18-2060
Cite (ACL):
Weiyue Wang, Derui Zhu, Tamer Alkhouli, Zixuan Gan, and Hermann Ney. 2018. Neural Hidden Markov Model for Machine Translation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 377–382, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Neural Hidden Markov Model for Machine Translation (Wang et al., ACL 2018)
PDF:
https://aclanthology.org/P18-2060.pdf
Presentation:
P18-2060.Presentation.pdf
Video:
https://aclanthology.org/P18-2060.mp4