Computer-Aided Quality Assurance of an Icelandic Pronunciation Dictionary

Martin Jansche


Abstract
We propose a model-driven method for ensuring the quality of pronunciation dictionaries. The key ingredient is computing an alignment between letter strings and phoneme strings, a standard technique in pronunciation modeling. The novel aspect of our method is the use of informative, parametric alignment models which are refined iteratively as they are tested against the data. We discuss the use of alignment failures as a signal for detecting and correcting problematic dictionary entries. We illustrate this method using an existing pronunciation dictionary for Icelandic. Our method is completely general and has been applied in the construction of pronunciation dictionaries for commercially deployed speech recognition systems in several languages.
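The core idea in the abstract, treating a failed letter-to-phoneme alignment as a signal that a dictionary entry needs review, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `ALLOWED` table, the toy English-like entries, and the function names `align` and `flag_entries` are hypothetical, and a hard constraint table stands in for the parametric alignment models the paper describes.

```python
from functools import lru_cache

# Hypothetical constraint table (not from the paper): each letter chunk maps
# to the phoneme sequences it is allowed to produce. An empty tuple means the
# chunk may be silent.
ALLOWED = {
    "c": {("k",), ("s",)},
    "a": {("ae",), ("ey",)},
    "t": {("t",)},
    "s": {("s",), ("z",)},
    "h": {("h",), ()},
    "sh": {("sh",)},
    "i": {("ih",), ("ay",)},
    "p": {("p",)},
}
MAX_CHUNK = max(len(chunk) for chunk in ALLOWED)


def align(spelling, phonemes):
    """Return a list of (letters, phonemes) pairs, or None if no alignment exists."""
    phonemes = tuple(phonemes)

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Letters and phonemes must be consumed together.
        if i == len(spelling):
            return () if j == len(phonemes) else None
        for n in range(1, MAX_CHUNK + 1):
            chunk = spelling[i:i + n]
            if len(chunk) < n:
                break  # ran past the end of the spelling
            for phones in sorted(ALLOWED.get(chunk, ())):
                k = j + len(phones)
                if phonemes[j:k] == phones:
                    rest = solve(i + n, k)
                    if rest is not None:
                        return ((chunk, phones),) + rest
        return None

    result = solve(0, 0)
    return None if result is None else list(result)


def flag_entries(lexicon):
    """Return the entries whose spelling cannot be aligned with their pronunciation."""
    return [(word, phones) for word, phones in lexicon if align(word, phones) is None]


if __name__ == "__main__":
    toy_lexicon = [
        ("cat", ["k", "ae", "t"]),
        ("ship", ["sh", "ih", "p"]),
        ("cat", ["k", "ae", "p"]),  # deliberately wrong transcription
    ]
    for word, phones in flag_entries(toy_lexicon):
        print("needs review:", word, " ".join(phones))
```

In a sketch like this, a flagged entry points to one of two things: an error in the entry itself, or a gap in the table. Extending the table and re-running the check mirrors, in a much simplified form, the iterative refinement the abstract mentions.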
Anthology ID:
L14-1299
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2111–2114
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/339_Paper.pdf
Cite (ACL):
Martin Jansche. 2014. Computer-Aided Quality Assurance of an Icelandic Pronunciation Dictionary. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 2111–2114, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
Computer-Aided Quality Assurance of an Icelandic Pronunciation Dictionary (Jansche, LREC 2014)