ASIREM Participation at the Discriminating Similar Languages Shared Task 2016

Wafia Adouane, Nasredine Semmar, Richard Johansson


Abstract
This paper presents the system built by the ASIREM team for the Discriminating between Similar Languages (DSL) Shared Task 2016. The system uses character-based and word-based n-grams separately. ASIREM participated in both sub-tasks (sub-task 1 and sub-task 2) and in both the open and closed tracks. In sub-task 1, which deals with discriminating between similar languages and national language varieties, the system achieved an accuracy of 87.79% on the closed track, ranking ninth (the best result being 89.38%). In sub-task 2, which deals with Arabic dialect identification, the system achieved its best performance using character-based n-grams (49.67% accuracy), ranking fourth in the closed track (the best result being 51.16%), and an accuracy of 53.18%, ranking first in the open track.
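As a rough illustration of the two feature types named in the abstract, the following minimal sketch extracts character-based and word-based n-gram counts from a text. The function names, the whitespace tokenization, and the sample string are assumptions made for illustration only; the abstract does not specify the n-gram orders, preprocessing, or classifier used by the ASIREM system.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count overlapping character n-grams (hypothetical helper)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(text, n):
    """Count overlapping word n-grams, splitting on whitespace (an assumption)."""
    tokens = text.split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Illustrative input only; not taken from the shared task data.
sample = "the cat sat on the mat"
char_features = char_ngrams(sample, 3)   # character trigram counts
word_features = word_ngrams(sample, 1)   # word unigram counts
```

Keeping the two feature sets in separate counters mirrors the abstract's statement that the two n-gram types are used separately rather than combined.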
Anthology ID:
W16-4821
Volume:
Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3)
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Preslav Nakov, Marcos Zampieri, Liling Tan, Nikola Ljubešić, Jörg Tiedemann, Shervin Malmasi
Venue:
VarDial
Publisher:
The COLING 2016 Organizing Committee
Pages:
163–169
URL:
https://aclanthology.org/W16-4821
Cite (ACL):
Wafia Adouane, Nasredine Semmar, and Richard Johansson. 2016. ASIREM Participation at the Discriminating Similar Languages Shared Task 2016. In Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), pages 163–169, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
ASIREM Participation at the Discriminating Similar Languages Shared Task 2016 (Adouane et al., VarDial 2016)
PDF:
https://aclanthology.org/W16-4821.pdf