Slips and errors in spoken data transcription

Isabella Chiari


Abstract
This work presents the main results of an experiment on errors and repairs in spoken language transcription, with significant relevance for evaluating the validity, reliability and correctness of transcriptions of several different types of speech intended for the annotation of spoken corpora. In particular, we deal with errors and repair strategies that appear in the first drafts of the transcription process and that are not easily detectable with automatic post-editing procedures. Twenty participants were asked to give an accurate transcription of 22 short utterances, repeated from one to four times, belonging to non-spontaneous (10) and spontaneous conversation (10). Error analysis suggests a general preference for meaning preservation even after alteration of the original form, and a preference for certain error patterns and repair strategies.
Anthology ID:
L06-1431
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/692_pdf.pdf
Cite (ACL):
Isabella Chiari. 2006. Slips and errors in spoken data transcription. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Slips and errors in spoken data transcription (Chiari, LREC 2006)