Regional Bias in the Broad Phonetic Transcriptions of the Spoken Dutch Corpus

Evie Coussé, Steven Gillis


Abstract
In this paper, we assess an aspect of the quality of the broad phonetic transcriptions in the Spoken Dutch Corpus (CGN). The corpus contains speech from native speakers of Dutch originating from the Netherlands and the Dutch-speaking part of Belgium. The phonetic transcriptions were made by transcribers from both regions. In previous research, we identified regional differences in the transcribers' behaviour. In this paper, we explore the precise sources of the regional bias in the CGN transcriptions and evaluate its impact on the phonetic transcriptions. More specifically, (1) we critically analyse the regional bias in the canonical transcriptions that served as the basis for the transcribers' verification task, and (2) we verify experimentally the regional bias introduced by the transcribers themselves. The possible effects of this inherent regional bias in the CGN transcriptions on subsequent linguistic analyses are briefly discussed.
Anthology ID:
L06-1185
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/323_pdf.pdf
Cite (ACL):
Evie Coussé and Steven Gillis. 2006. Regional Bias in the Broad Phonetic Transcriptions of the Spoken Dutch Corpus. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Regional Bias in the Broad Phonetic Transcriptions of the Spoken Dutch Corpus (Coussé & Gillis, LREC 2006)