Amharic-English Speech Translation in Tourism Domain

Michael Melese, Laurent Besacier, Million Meshesha


Abstract
This paper describes speech translation from Amharic to English, in particular Automatic Speech Recognition (ASR) with a post-editing feature and Amharic-English Statistical Machine Translation (SMT). The ASR experiment is conducted using a morpheme language model (LM) and a phoneme acoustic model (AM). Likewise, SMT is conducted using words and morphemes as units. Morpheme-based translation shows a 6.29 BLEU score at 76.4% recognition accuracy, while word-based translation shows a 12.83 BLEU score at 77.4% word recognition accuracy. Further, after post-editing the Amharic ASR output using a corpus-based n-gram, the word recognition accuracy increased by 1.42%. Since the post-editing approach reduces error propagation, the word-based translation accuracy improved by 0.25 BLEU (1.95%). We are now working towards further reducing propagated errors through different algorithms at each component of the speech translation cascade.
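The 1.95% figure in the abstract appears to be the 0.25 BLEU gain expressed relative to the 12.83 word-based baseline. The short sketch below reproduces that arithmetic under this assumption. The helper name `relative_gain_percent` is hypothetical, and the post-edit score of 13.08 is derived here from the two reported numbers rather than quoted from the abstract.

```python
def relative_gain_percent(baseline: float, absolute_gain: float) -> float:
    """Return an absolute gain as a percentage of the baseline score."""
    return 100.0 * absolute_gain / baseline


# Figures reported in the abstract for word-based translation.
baseline_bleu = 12.83   # BLEU before ASR post-editing
absolute_gain = 0.25    # BLEU improvement after post-editing

# Derived values (assumption: the gain is additive on the baseline).
post_edit_bleu = baseline_bleu + absolute_gain
relative_gain = relative_gain_percent(baseline_bleu, absolute_gain)

print(f"post-edit BLEU: {post_edit_bleu:.2f}")
print(f"relative gain:  {relative_gain:.2f}%")
```

Read this way, the improvement is small in absolute terms, which matches the abstract's framing that post-editing reduces, rather than eliminates, error propagation from ASR into SMT.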
Anthology ID:
W17-4608
Volume:
Proceedings of the Workshop on Speech-Centric Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Nicholas Ruiz, Srinivas Bangalore
Venue:
WS
Publisher:
Association for Computational Linguistics
Pages:
59–66
URL:
https://aclanthology.org/W17-4608
DOI:
10.18653/v1/W17-4608
Cite (ACL):
Michael Melese, Laurent Besacier, and Million Meshesha. 2017. Amharic-English Speech Translation in Tourism Domain. In Proceedings of the Workshop on Speech-Centric Natural Language Processing, pages 59–66, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Amharic-English Speech Translation in Tourism Domain (Melese et al., 2017)
PDF:
https://aclanthology.org/W17-4608.pdf