Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation

Arya D. McCarthy, Xian Li, Jiatao Gu, Ning Dong


Abstract
This paper proposes a simple and effective approach to address the problem of posterior collapse in conditional variational autoencoders (CVAEs). The approach improves the performance of machine translation models trained on noisy or monolingual data, as well as models in conventional settings. Extending the Transformer and conditional VAEs, our proposed latent variable model measurably prevents posterior collapse by (1) using a modified evidence lower bound (ELBO) objective that promotes mutual information between the latent variable and the target, and (2) guiding the latent variable with an auxiliary bag-of-words prediction task. As a result, the proposed model yields improved translation quality compared to existing variational NMT models on WMT Ro↔En and De↔En. With latent variables being effectively utilized, our model demonstrates improved robustness over a non-latent Transformer in handling uncertainty: exploiting noisy source-side monolingual data (up to +3.2 BLEU), and training with weakly aligned web-mined parallel data (up to +4.7 BLEU).
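As a rough illustration of two quantities the abstract refers to, the sketch below computes the KL term of a Gaussian latent variable (which falls to zero when the posterior collapses onto the prior) and an order-independent bag-of-words loss over target tokens. It is not the authors' implementation: the function names, the diagonal-Gaussian posterior, the standard-normal prior, and the toy inputs are all assumptions made for illustration, and the paper's modified ELBO with its mutual information term is not reproduced here.

```python
import math


def gaussian_kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions.

    Assumption: diagonal-Gaussian posterior and standard-normal prior.
    """
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar))


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def bag_of_words_nll(vocab_logits, target_token_ids):
    """Negative log-likelihood of the target tokens under ONE softmax over the
    vocabulary, ignoring word order.

    Assumption: `vocab_logits` stands in for a projection of the latent
    variable z onto the vocabulary (hypothetical; the projection is omitted).
    """
    log_probs = log_softmax(vocab_logits)
    return -sum(log_probs[t] for t in target_token_ids)


# A posterior identical to the prior carries no information about the input.
collapsed_kl = gaussian_kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])

# A posterior that differs from the prior has a strictly positive KL term.
active_kl = gaussian_kl_to_standard_normal([1.0, -0.5], [-1.0, 0.2])

# Toy vocabulary of 4 types with uninformative logits and 3 target tokens.
uniform_bow = bag_of_words_nll([0.0, 0.0, 0.0, 0.0], [1, 3, 3])

print(collapsed_kl, active_kl, uniform_bow)
```

Under these assumptions, a KL term that stays near zero throughout training is one common symptom of posterior collapse, while an auxiliary loss of this kind can only be reduced if the quantity it is predicted from actually encodes which target words appear.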
Anthology ID:
2020.acl-main.753
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
8512–8525
URL:
https://aclanthology.org/2020.acl-main.753
DOI:
10.18653/v1/2020.acl-main.753
Cite (ACL):
Arya D. McCarthy, Xian Li, Jiatao Gu, and Ning Dong. 2020. Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 8512–8525, Online. Association for Computational Linguistics.
Cite (Informal):
Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation (McCarthy et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.753.pdf
Video:
http://slideslive.com/38929049