Low-Resource Machine Transliteration Using Recurrent Neural Networks of Asian Languages

Ngoc Tan Le, Fatiha Sadat


Abstract
Grapheme-to-phoneme models are key components in automatic speech recognition and text-to-speech systems. They are particularly useful for low-resource language pairs that lack well-developed pronunciation lexicons. These models are based on initial alignments between source grapheme and target phoneme sequences. Inspired by sequence-to-sequence recurrent neural network-based translation methods, the current research presents an approach that applies an alignment representation for input sequences and pre-trained source and target embeddings to overcome the transliteration problem for a low-resource language pair. We participated in the NEWS 2018 shared task on English-Vietnamese transliteration.
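The abstract describes an encoder-decoder recurrent architecture for character-level transliteration. The sketch below is a minimal, hypothetical illustration of that general design (a GRU encoder and decoder in PyTorch), not the authors' system: all class names, dimensions, and vocabulary sizes are assumptions, and the alignment representation and pre-trained embeddings mentioned above are only indicated in comments.

```python
# Illustrative sketch only: a character-level sequence-to-sequence RNN of the
# general kind described in the abstract, not the authors' exact system.
# Vocabulary sizes, dimensions, and the toy batch below are placeholders.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, src_vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        # In the paper's setting, this embedding could be initialized from
        # pre-trained source (grapheme) embeddings.
        self.embedding = nn.Embedding(src_vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src):
        embedded = self.embedding(src)      # (batch, src_len, emb_dim)
        _, hidden = self.rnn(embedded)      # hidden: (1, batch, hid_dim)
        return hidden

class Decoder(nn.Module):
    def __init__(self, tgt_vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        # Likewise, pre-trained target (phoneme) embeddings could be used here.
        self.embedding = nn.Embedding(tgt_vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab_size)

    def forward(self, tgt_input, hidden):
        embedded = self.embedding(tgt_input)        # (batch, tgt_len, emb_dim)
        output, hidden = self.rnn(embedded, hidden)
        return self.out(output), hidden             # logits over target symbols

if __name__ == "__main__":
    # Toy forward pass with made-up vocabulary sizes and random index batches.
    src_vocab, tgt_vocab = 60, 50                   # e.g. English graphemes -> Vietnamese phonemes
    encoder, decoder = Encoder(src_vocab), Decoder(tgt_vocab)
    src = torch.randint(0, src_vocab, (2, 10))      # 2 source sequences, length 10
    tgt = torch.randint(0, tgt_vocab, (2, 8))       # teacher-forced target inputs, length 8
    logits, _ = decoder(tgt, encoder(src))
    print(logits.shape)                             # torch.Size([2, 8, 50])
```

The alignment representation described in the paper would enter as a preprocessing step on the source sequences before they reach the encoder; it is not reproduced here.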
Anthology ID:
W18-2414
Volume:
Proceedings of the Seventh Named Entities Workshop
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Nancy Chen, Rafael E. Banchs, Xiangyu Duan, Min Zhang, Haizhou Li
Venue:
NEWS
Publisher:
Association for Computational Linguistics
Pages:
95–100
URL:
https://aclanthology.org/W18-2414
DOI:
10.18653/v1/W18-2414
Cite (ACL):
Ngoc Tan Le and Fatiha Sadat. 2018. Low-Resource Machine Transliteration Using Recurrent Neural Networks of Asian Languages. In Proceedings of the Seventh Named Entities Workshop, pages 95–100, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Low-Resource Machine Transliteration Using Recurrent Neural Networks of Asian Languages (Le & Sadat, NEWS 2018)
PDF:
https://aclanthology.org/W18-2414.pdf