Arabic Dialect Identification Using BERT Fine-Tuning

Moataz Mansour, Moustafa Tohamy, Zeyad Ezzat, Marwan Torki


Abstract
In the last few years, deep learning has proved to be a very effective paradigm for discovering patterns in large data sets. However, training deep models on small data sets is often not the best option, because traditional machine learning algorithms frequently achieve better scores. With transfer learning, a neural network can instead be trained on a large data set and then fine-tuned on a smaller one. In this paper, we present our system for the NADI shared task on country-level dialect identification. Our system is based on fine-tuning BERT; it achieves an F1-score of 22.85 on the test set and ranks 5th out of 18 teams.
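To make the reported metric concrete, the following is a minimal sketch of how an F1-score over several dialect classes could be computed. It assumes macro averaging (the unweighted mean of per-class F1) and that the reported figure is this value expressed as a percentage; the abstract itself does not state either. The function name `macro_f1` and the toy country labels are hypothetical and are not taken from the authors' code.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores.

    Assumption: macro averaging over all labels seen in gold or pred.
    """
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true class but was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy labels, for illustration only
gold = ["Egypt", "Egypt", "Iraq", "Morocco"]
pred = ["Egypt", "Iraq", "Iraq", "Egypt"]
score = macro_f1(gold, pred)
print(f"macro F1 = {100 * score:.2f}")
```

Under macro averaging every class counts equally, so a class the system never predicts correctly pulls the average down as much as a frequent class would.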
Anthology ID:
2020.wanlp-1.33
Volume:
Proceedings of the Fifth Arabic Natural Language Processing Workshop
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Imed Zitouni, Muhammad Abdul-Mageed, Houda Bouamor, Fethi Bougares, Mahmoud El-Haj, Nadi Tomeh, Wajdi Zaghouani
Venue:
WANLP
Publisher:
Association for Computational Linguistics
Pages:
308–312
URL:
https://aclanthology.org/2020.wanlp-1.33
Cite (ACL):
Moataz Mansour, Moustafa Tohamy, Zeyad Ezzat, and Marwan Torki. 2020. Arabic Dialect Identification Using BERT Fine-Tuning. In Proceedings of the Fifth Arabic Natural Language Processing Workshop, pages 308–312, Barcelona, Spain (Online). Association for Computational Linguistics.
Cite (Informal):
Arabic Dialect Identification Using BERT Fine-Tuning (Mansour et al., WANLP 2020)
PDF:
https://aclanthology.org/2020.wanlp-1.33.pdf
Code:
zeyad3ezzat/nadi-shared-task