Gaëlle Vidal


2022

pdf bib
Techniques de synthèse vocale neuronale à l’épreuve des données d’apprentissage non dédiées : les livres audio amateurs en français [Neural speech synthesis techniques put to the test with non-dedicated training data: amateur French audio books]
Aghilas Sini | Lily Wadoux | Antoine Perquin | Gaëlle Vidal | David Guennec | Damien Lolive | Pierre Alain | Nelly Barbot | Jonathan Chevelu | Arnaud Delhay
Traitement Automatique des Langues, Volume 63, Numéro 2 : Traitement automatique des langues intermodal et multimodal [Cross-modal and multimodal natural language processing]

2018

pdf bib
SynPaFlex-Corpus: An Expressive French Audiobooks Corpus dedicated to expressive speech synthesis.
Aghilas Sini | Damien Lolive | Gaëlle Vidal | Marie Tahon | Élisabeth Delais-Roussarie
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2012

pdf bib
Vers une annotation automatique de corpus audio pour la synthèse de parole (Towards Fully Automatic Annotation of Audio Books for Text-To-Speech (TTS) Synthesis) [in French]
Olivier Boëffard | Laure Charonnat | Sébastien Le Maguer | Damien Lolive | Gaëlle Vidal
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 1: JEP

2008

pdf bib
Automatic Phone Segmentation of Expressive Speech
Laure Charonnat | Gaëlle Vidal | Olivier Boeffard
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In order to improve the flexibility and the precision of an automatic phone segmentation system for a type of expressive speech, the dubbing into French of fiction movies, we developed both the phonetic labeling process and the alignment process. The automatic labelling system relies on an automatic grapheme-to-phoneme conversion including all the variants of the phonetic chain and on HMM modeling. In this article, we will distinguish three sets of phone models: a set of context independent models, a set of left and right context dependant models and finally a mixing of the two that combines phone and triphone models according to the precision of alignment obtained for each phonetic broad-class. The three models are evaluated on a test corpus. On the one hand we notice a little decrease in the score of phonetic labelling mainly due to pauses insertions, but on the other hand the mixed set of models gives the best results for the score of precision of the alignment.