Arabic Speech Rhythm Corpus: Read and Spontaneous Speaking Styles

Omnia Ibrahim, Homa Asadi, Eman Kassem, Volker Dellwo


Abstract
Databases for studying speech rhythm and tempo exist for numerous languages. The present corpus was built to allow comparisons between the speech rhythm of Arabic and that of other languages. Ten Egyptian speakers (gender-balanced) produced speech in two speaking styles (read and spontaneous). The design of the reading task replicates the methodology used to create the BonnTempo Corpus (BTC). In the spontaneous task, speakers talked freely for more than one minute about their daily life and/or their studies, and then gave directions to the university from a well-known nearby location, using a map as a visual stimulus. The database has been time-labeled both manually and automatically, which makes it feasible to perform quantitative analyses of the rhythm of Arabic in both Modern Standard Arabic (MSA) and the Egyptian dialectal variety. The database serves as a phonetic resource that allows researchers to examine various aspects of Arabic suprasegmental features. It can be used for forensic phonetic research, comparison of different speakers, analysis of variability across speaking styles, and automatic speech and speaker recognition.
Anthology ID:
2020.lrec-1.657
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
5337–5342
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.657
Cite (ACL):
Omnia Ibrahim, Homa Asadi, Eman Kassem, and Volker Dellwo. 2020. Arabic Speech Rhythm Corpus: Read and Spontaneous Speaking Styles. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 5337–5342, Marseille, France. European Language Resources Association.
Cite (Informal):
Arabic Speech Rhythm Corpus: Read and Spontaneous Speaking Styles (Ibrahim et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.657.pdf