Morphological Analysis and Disambiguation for Gulf Arabic: The Interplay between Resources and Methods

Salam Khalifa, Nasser Zalmout, Nizar Habash


Abstract
In this paper, we present the first full morphological analysis and disambiguation system for Gulf Arabic. We use an existing state-of-the-art morphological disambiguation system to investigate the effects of different data sizes and different combinations of morphological analyzers for Modern Standard Arabic, Egyptian Arabic, and Gulf Arabic. We find that in very low-resource settings, morphological analyzers help boost performance on the full morphological disambiguation task. However, as the size of the resources increases, the value of the morphological analyzers decreases.
Anthology ID:
2020.lrec-1.480
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
3895–3904
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.480
Cite (ACL):
Salam Khalifa, Nasser Zalmout, and Nizar Habash. 2020. Morphological Analysis and Disambiguation for Gulf Arabic: The Interplay between Resources and Methods. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 3895–3904, Marseille, France. European Language Resources Association.
Cite (Informal):
Morphological Analysis and Disambiguation for Gulf Arabic: The Interplay between Resources and Methods (Khalifa et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.480.pdf
Data:
The Annotated Gumar Corpus