Bootstrapping New Language ASR Capabilities: Achieving Best Letter-to-Sound Performance under Resource Constraints

Jim Talley


Abstract
One of the most critical components in building automatic speech recognition (ASR) capabilities for a new language is the lexicon, or pronouncing dictionary. For practical reasons, it is desirable to create only a minimal lexicon manually, using the available native-speaker phonetic expertise, and then to use the resulting seed lexicon for machine-learning-based induction of a high-quality letter-to-sound (L2S) model that generates pronunciations for the remaining words of the language. This paper examines the viability of this scenario, specifically investigating three possible strategies for selecting lexemes (words) for manual transcription: choosing the most frequent lexemes of the language, choosing lexemes randomly, and selecting lexemes via an information-theoretic diversity measure. The relative effectiveness of these three strategies is evaluated as a function of the number of lexemes transcribed to create a bootstrapping lexicon. Generally, the newly developed orthographic-diversity-based selection strategy outperforms the others in this scenario, where only a limited number of lexemes can be transcribed. The experiments also give generally useful insight into the L2S accuracy that can be expected to be sacrificed as training set size decreases.
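The three selection strategies named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not define its diversity measure, so the sketch assumes one plausible form (greedily maximizing the Shannon entropy of character bigram counts over the selected words). All function names, the `freqs` word-frequency mapping, and the `#` boundary marker are hypothetical.

```python
import math
import random
from collections import Counter


def char_ngrams(word, n=2):
    """Character n-grams of a word padded with '#' boundary markers (assumed convention)."""
    padded = "#" + word + "#"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def entropy(counts):
    """Shannon entropy, in bits, of a Counter of n-gram counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def select_most_frequent(freqs, k):
    """Strategy 1: the k most frequent lexemes (ties broken alphabetically)."""
    return sorted(freqs, key=lambda w: (-freqs[w], w))[:k]


def select_random(freqs, k, seed=0):
    """Strategy 2: k lexemes drawn uniformly at random, without replacement."""
    return random.Random(seed).sample(sorted(freqs), k)


def select_diverse(freqs, k, n=2):
    """Strategy 3 (assumed form): greedily add the word that maximizes the
    n-gram entropy of the selection so far. Word frequency is ignored here."""
    chosen, counts = [], Counter()
    remaining = sorted(freqs)
    for _ in range(min(k, len(remaining))):
        best = max(
            remaining,
            key=lambda w: entropy(counts + Counter(char_ngrams(w, n))),
        )
        chosen.append(best)
        counts.update(char_ngrams(best, n))
        remaining.remove(best)
    return chosen
```

Each selector takes the same inputs, a word-to-frequency mapping and a transcription budget `k`, so the three resulting seed lexicons can be compared at equal size, as in the evaluation the abstract describes. The greedy loop is quadratic in vocabulary size and is meant only to make the idea concrete.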
Anthology ID:
L06-1257
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/436_pdf.pdf
Cite (ACL):
Jim Talley. 2006. Bootstrapping New Language ASR Capabilities: Achieving Best Letter-to-Sound Performance under Resource Constraints. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Bootstrapping New Language ASR Capabilities: Achieving Best Letter-to-Sound Performance under Resource Constraints (Talley, LREC 2006)