AdelaideCyC at SemEval-2020 Task 12: Ensemble of Classifiers for Offensive Language Detection in Social Media

Mahen Herath, Thushari Atapattu, Hoang Anh Dung, Christoph Treude, Katrina Falkner


Abstract
This paper describes the systems our team (AdelaideCyC) developed for SemEval-2020 Task 12 (OffensEval 2020) to detect offensive language in social media. The challenge comprises three subtasks: offensive language identification (subtask A), offense type identification (subtask B), and offense target identification (subtask C). Our team participated in all three subtasks, developing ensembles of machine learning and deep learning models. We achieved F1-scores of 0.906, 0.552, and 0.623 in subtasks A, B, and C, respectively. While our scores are promising for subtask A, the results show that subtasks B and C remain challenging.
Anthology ID:
2020.semeval-1.198
Volume:
Proceedings of the Fourteenth Workshop on Semantic Evaluation
Month:
December
Year:
2020
Address:
Barcelona (online)
Editors:
Aurelie Herbelot, Xiaodan Zhu, Alexis Palmer, Nathan Schneider, Jonathan May, Ekaterina Shutova
Venue:
SemEval
SIG:
SIGLEX
Publisher:
International Committee for Computational Linguistics
Pages:
1516–1523
URL:
https://aclanthology.org/2020.semeval-1.198
DOI:
10.18653/v1/2020.semeval-1.198
Cite (ACL):
Mahen Herath, Thushari Atapattu, Hoang Anh Dung, Christoph Treude, and Katrina Falkner. 2020. AdelaideCyC at SemEval-2020 Task 12: Ensemble of Classifiers for Offensive Language Detection in Social Media. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 1516–1523, Barcelona (online). International Committee for Computational Linguistics.
Cite (Informal):
AdelaideCyC at SemEval-2020 Task 12: Ensemble of Classifiers for Offensive Language Detection in Social Media (Herath et al., SemEval 2020)
PDF:
https://aclanthology.org/2020.semeval-1.198.pdf