Frustratingly Easy Multilingual Grapheme-to-Phoneme Conversion

Nikhil Prabhu, Katharina Kann


Abstract
In this paper, we describe two CU-Boulder submissions to the SIGMORPHON 2020 Task 1 on multilingual grapheme-to-phoneme conversion (G2P). Inspired by the high performance of a standard transformer model (Vaswani et al., 2017) on the task, we improve on this approach with two modifications: (i) instead of training exclusively on G2P, we additionally create examples for the opposite direction, phoneme-to-grapheme conversion (P2G), and perform multi-task training on both tasks; (ii) we produce ensembles of our models via majority voting. Our approaches, though conceptually simple, result in systems that place 6th and 8th among 23 submitted systems and obtain the best results of all systems on Lithuanian and Modern Greek, respectively.
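The two modifications named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the task tags `<g2p>` and `<p2g>`, the function names, and the first-seen tie-breaking rule are all assumptions made for illustration, since the abstract does not specify how the direction is marked or how ties are resolved.

```python
from collections import Counter


def make_multitask_examples(pairs):
    """Turn (graphemes, phonemes) pairs into G2P and P2G training examples.

    Each pair yields two (source, target) examples, one per direction.
    The direction tags "<g2p>" and "<p2g>" are hypothetical markers
    prepended to the source so one model can tell the tasks apart.
    """
    examples = []
    for graphemes, phonemes in pairs:
        examples.append((["<g2p>"] + list(graphemes), list(phonemes)))
        examples.append((["<p2g>"] + list(phonemes), list(graphemes)))
    return examples


def majority_vote(predictions):
    """Return the output sequence predicted by the most ensemble members.

    `predictions` holds one predicted sequence per model for a single input.
    Ties go to the earliest model's prediction (an assumed rule).
    """
    counts = Counter(tuple(p) for p in predictions)
    best = max(counts.values())
    for p in predictions:
        if counts[tuple(p)] == best:
            return list(p)


if __name__ == "__main__":
    data = [(list("cat"), ["k", "æ", "t"])]
    for source, target in make_multitask_examples(data):
        print(source, "->", target)
    print(majority_vote([["k", "æ", "t"], ["k", "a", "t"], ["k", "æ", "t"]]))
```

Voting here is over whole output sequences rather than individual symbols; the abstract does not say which granularity the submitted systems used, so treat that as another assumption.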
Anthology ID:
2020.sigmorphon-1.13
Volume:
Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology
Month:
July
Year:
2020
Address:
Online
Editors:
Garrett Nicolai, Kyle Gorman, Ryan Cotterell
Venue:
SIGMORPHON
SIG:
SIGMORPHON
Publisher:
Association for Computational Linguistics
Pages:
123–127
URL:
https://aclanthology.org/2020.sigmorphon-1.13
DOI:
10.18653/v1/2020.sigmorphon-1.13
Cite (ACL):
Nikhil Prabhu and Katharina Kann. 2020. Frustratingly Easy Multilingual Grapheme-to-Phoneme Conversion. In Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 123–127, Online. Association for Computational Linguistics.
Cite (Informal):
Frustratingly Easy Multilingual Grapheme-to-Phoneme Conversion (Prabhu & Kann, SIGMORPHON 2020)
PDF:
https://aclanthology.org/2020.sigmorphon-1.13.pdf