Can Crowdsourcing be used for Effective Annotation of Arabic?

Wajdi Zaghouani, Kais Dukes


Abstract
Crowdsourcing has recently been used by many natural language processing groups as an alternative to traditional, costly annotation. In this paper, we explore the use of Amazon Mechanical Turk (AMT) to assess the feasibility of having AMT workers (also known as Turkers) perform linguistic annotation of Arabic. We used a gold-standard data set taken from the Quran corpus project, annotated with part-of-speech and morphological information. An Arabic language qualification test was used to filter out unqualified participants. Two experiments were performed: a part-of-speech tagging task, in which annotators were asked to choose the correct word category from a multiple-choice list, and a case-ending identification task. The results obtained so far show that annotating Arabic grammatical case is harder than POS tagging, and that crowdsourcing Arabic linguistic annotation that requires expert annotators may not be as effective as other crowdsourcing experiments requiring less expertise and fewer qualifications.
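The abstract does not include the authors' task setup, but a comparable AMT workflow can be sketched with the boto3 MTurk client: create an Arabic qualification test, then post a multiple-choice POS-tagging HIT restricted to workers who pass it. All file names, XML payloads, scores, and reward amounts below are illustrative placeholders, not values from the paper.

```python
"""Minimal sketch of an AMT setup like the one described in the abstract.
Assumptions: QuestionForm XML files for the qualification test and the
POS-tagging question exist locally; the sandbox endpoint is used."""
import boto3

# Sandbox endpoint; drop endpoint_url to post to the live marketplace.
mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# Hypothetical QuestionForm XML for an Arabic qualification test and its
# answer key (the real test content is not given in the abstract).
with open("arabic_qualification_test.xml") as f:
    test_xml = f.read()
with open("arabic_qualification_answers.xml") as f:
    answer_xml = f.read()

qual = mturk.create_qualification_type(
    Name="Arabic language proficiency (demo)",
    Description="Screening test for Arabic annotation tasks.",
    QualificationTypeStatus="Active",
    Test=test_xml,
    AnswerKey=answer_xml,
    TestDurationInSeconds=1800,
)

# Hypothetical QuestionForm XML showing one word in context with a
# multiple-choice list of part-of-speech tags.
with open("pos_tagging_question.xml") as f:
    question_xml = f.read()

hit = mturk.create_hit(
    Title="Choose the part of speech of the highlighted Arabic word",
    Description="Select the correct word category from a list.",
    Reward="0.05",                      # per-assignment payment (placeholder)
    MaxAssignments=3,                   # collect multiple judgments per item
    LifetimeInSeconds=7 * 24 * 3600,
    AssignmentDurationInSeconds=600,
    Question=question_xml,
    QualificationRequirements=[{
        # Only workers scoring at least 80 on the test can see the HIT.
        "QualificationTypeId": qual["QualificationType"]["QualificationTypeId"],
        "Comparator": "GreaterThanOrEqualTo",
        "IntegerValues": [80],
    }],
)
print("HIT group:", hit["HIT"]["HITGroupId"])
```

Collecting several assignments per item (MaxAssignments above) is what allows agreement with the gold standard to be measured, as in the experiments the paper reports.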
Anthology ID:
L14-1370
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
224–228
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/431_Paper.pdf
Cite (ACL):
Wajdi Zaghouani and Kais Dukes. 2014. Can Crowdsourcing be used for Effective Annotation of Arabic?. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 224–228, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
Can Crowdsourcing be used for Effective Annotation of Arabic? (Zaghouani & Dukes, LREC 2014)
PDF:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/431_Paper.pdf