Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation

Shun-Po Chuang, Tzu-Wei Sung, Alexander H. Liu, Hung-yi Lee


Abstract
Speech translation (ST) aims to learn transformations from speech in the source language to text in the target language. Previous works show that multitask learning improves ST performance: the recognition decoder generates text in the source language, and the translation decoder produces the final translations based on the recognition decoder's output. Because whether the recognition decoder's output carries the correct semantics is more critical than its accuracy, we propose to improve the multitask ST model by using word embeddings as the intermediate representation.
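As a rough illustration of the idea described in the abstract, the following PyTorch sketch wires a shared speech encoder to two branches: a recognition branch that regresses toward pretrained source-language word embeddings (the proposed intermediate) and a translation decoder that consumes those predicted embeddings to emit target-language text. All module choices, dimensions, names, and the loss weighting are assumptions made for illustration, not the authors' exact configuration.

```python
import torch
import torch.nn as nn


class MultitaskSTWithEmbeddingIntermediate(nn.Module):
    """Hypothetical sketch of a multitask ST model whose recognition
    branch predicts word embeddings instead of discrete source tokens.
    Sizes and layer types are illustrative assumptions."""

    def __init__(self, feat_dim=80, hid_dim=256, emb_dim=300, tgt_vocab=8000):
        super().__init__()
        # Shared speech encoder (assumed: a 2-layer BiLSTM over filterbank features).
        self.encoder = nn.LSTM(feat_dim, hid_dim, num_layers=2,
                               batch_first=True, bidirectional=True)
        # Recognition branch: map encoder states to the word-embedding space.
        self.to_word_emb = nn.Linear(2 * hid_dim, emb_dim)
        # Translation branch: decode target tokens, attending over the
        # predicted embedding sequence as its memory.
        self.translator = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model=emb_dim, nhead=6,
                                       batch_first=True),
            num_layers=2)
        self.tgt_embed = nn.Embedding(tgt_vocab, emb_dim)
        self.tgt_out = nn.Linear(emb_dim, tgt_vocab)

    def forward(self, speech_feats, tgt_in):
        enc_out, _ = self.encoder(speech_feats)      # (B, T, 2*hid_dim)
        pred_emb = self.to_word_emb(enc_out)         # (B, T, emb_dim)
        dec_out = self.translator(self.tgt_embed(tgt_in), pred_emb)
        return pred_emb, self.tgt_out(dec_out)       # embeddings, target logits


def multitask_loss(pred_emb, ref_emb, tgt_logits, tgt_ref, alpha=0.5):
    """Combine an embedding-regression loss (recognition branch) with
    cross-entropy on the translation output; assumes pred_emb has been
    aligned to the reference embedding sequence, and alpha is an
    assumed interpolation weight."""
    rec_loss = nn.functional.mse_loss(pred_emb, ref_emb)
    tr_loss = nn.functional.cross_entropy(
        tgt_logits.reshape(-1, tgt_logits.size(-1)), tgt_ref.reshape(-1))
    return alpha * rec_loss + (1 - alpha) * tr_loss
```

The point of the sketch is the intermediate: the recognition target is a continuous embedding sequence rather than token IDs, so the translation decoder conditions on semantic vectors instead of a possibly error-prone transcript.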
Anthology ID:
2020.acl-main.533
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
5998–6003
URL:
https://aclanthology.org/2020.acl-main.533
DOI:
10.18653/v1/2020.acl-main.533
Cite (ACL):
Shun-Po Chuang, Tzu-Wei Sung, Alexander H. Liu, and Hung-yi Lee. 2020. Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5998–6003, Online. Association for Computational Linguistics.
Cite (Informal):
Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation (Chuang et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.533.pdf
Video:
http://slideslive.com/38929303