Word Boundaries in French: Evidence from Large Speech Corpora

Rena Nemoto, Martine Adda-Decker, Jacques Durand


Abstract
This paper investigates French word segmentation strategies using phonemic and lexical transcriptions together with prosodic and part-of-speech annotations. Average fundamental frequency (f0) profiles and phoneme duration profiles are measured on 13 hours of broadcast news speech to study prosodic regularities of French words. Several influential factors are taken into account in the f0 and duration measurements: word length in syllables, word-final schwa, and part of speech. The average f0 profiles confirm word-final syllable accentuation, and the average duration profiles show lengthening of the word-final syllable. Both are common tendencies in French. In noun phrases, the average f0 profiles show a higher f0 on the first syllable of a noun following a determiner. Inter-vocalic duration profiles show a long inter-vocalic duration between the determiner vowel and the vowel of the preceding word. These results reveal measurable cues that contribute to locating word boundaries. Further studies will include more detailed within-syllable f0 patterns, as well as other speaking styles and languages.
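As a rough illustration of what an average f0 profile by word length could look like in practice, the sketch below groups words by their number of syllables and averages f0 at each syllable position. It is a minimal sketch and not the authors' pipeline: the function name `average_f0_profiles`, the input layout (one list of per-syllable mean f0 values per word), and the toy values are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def average_f0_profiles(words):
    """Average f0 per syllable position, computed separately for each word length.

    `words` is an iterable of lists; each list holds one mean f0 value (Hz)
    per syllable of a word. This input layout is a hypothetical choice.
    Returns {n_syllables: [mean f0 at syllable 1, ..., mean f0 at syllable n]}.
    """
    by_length = defaultdict(list)
    for syllable_f0 in words:
        if syllable_f0:  # skip words with no f0 measurements
            by_length[len(syllable_f0)].append(syllable_f0)

    profiles = {}
    for n, group in by_length.items():
        # Average across words at each syllable position.
        profiles[n] = [mean(word[i] for word in group) for i in range(n)]
    return profiles


# Toy values invented for illustration; they are not measurements from the paper.
toy_words = [
    [100.0, 120.0],
    [110.0, 130.0],
    [100.0, 105.0, 125.0],
]
profiles = average_f0_profiles(toy_words)
```

With real data, the same grouping could be repeated per part of speech or for words with and without a final schwa, which are the factors the abstract lists.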
Anthology ID:
L10-1264
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/386_Paper.pdf
Cite (ACL):
Rena Nemoto, Martine Adda-Decker, and Jacques Durand. 2010. Word Boundaries in French: Evidence from Large Speech Corpora. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Word Boundaries in French: Evidence from Large Speech Corpora (Nemoto et al., LREC 2010)