Ssn_nlp at SemEval 2020 Task 12: Offense Target Identification in Social Media Using Traditional and Deep Machine Learning Approaches

Thenmozhi D., Nandhinee P.r., Arunima S., Amlan Sengupta


Abstract
Offensive language identification (OLI) in user-generated text is the automatic detection of any profanity, insult, obscenity, racism or vulgarity addressed towards an individual or a group. Social media, owing to its immense growth and usage, has an extensive reach and impact on society. OLI is helpful for hate speech detection, flame detection and cyberbullying detection, and is therefore used to prevent abuse and harm. In this paper, we present state-of-the-art machine learning approaches for OLI. We follow several approaches, which include traditional classifiers such as Naive Bayes and Support Vector Machine (SVM), and deep learning approaches such as Recurrent Neural Network (RNN) and Masked LM (MLM). The approaches are evaluated on the OffensEval@SemEval2020 dataset, and our team, ssn_nlp, submitted runs for the third task of the OffensEval shared task. The best run of ssn_nlp, which uses BERT (Bidirectional Encoder Representations from Transformers) to train the OLI model, obtained an F1 score of 0.61. The model achieves an accuracy of 0.80 and an evaluation loss of 1.0828, with a precision of 0.72 and a recall of 0.58.
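The abstract reports accuracy, precision, recall and F1 together. As a minimal sketch of how such figures relate for a multi-class task, the following computes them on made-up predictions. It assumes macro averaging, which the abstract does not state; the labels `IND`, `GRP`, `OTH` and the toy data are illustrative only and are not taken from the paper's results. Under this assumption, the averaged F1 is the mean of the per-class F1 scores, so it need not equal the harmonic mean of the averaged precision and recall.

```python
def per_class_prf(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)}, computed one-vs-rest."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of per-class precision, recall and F1."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy gold labels and predictions (hypothetical, for illustration only).
LABELS = ["IND", "GRP", "OTH"]
y_true = ["IND", "IND", "IND", "IND", "GRP", "GRP", "GRP", "OTH", "OTH", "OTH"]
y_pred = ["IND", "IND", "IND", "IND", "GRP", "GRP", "IND", "IND", "GRP", "OTH"]

macro_p, macro_r, macro_f1 = macro_average(per_class_prf(y_true, y_pred, LABELS))
harmonic_of_macros = 2 * macro_p * macro_r / (macro_p + macro_r)

print(f"accuracy={accuracy(y_true, y_pred):.2f}")
print(f"macro P={macro_p:.4f}  macro R={macro_r:.4f}  macro F1={macro_f1:.4f}")
print(f"harmonic mean of macro P and macro R={harmonic_of_macros:.4f}")
```

On imbalanced data, accuracy can sit well above the averaged F1 because the majority class dominates the former while every class counts equally in the latter.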
Anthology ID:
2020.semeval-1.286
Volume:
Proceedings of the Fourteenth Workshop on Semantic Evaluation
Month:
December
Year:
2020
Address:
Barcelona (online)
Editors:
Aurelie Herbelot, Xiaodan Zhu, Alexis Palmer, Nathan Schneider, Jonathan May, Ekaterina Shutova
Venue:
SemEval
SIG:
SIGLEX
Publisher:
International Committee for Computational Linguistics
Pages:
2155–2160
URL:
https://aclanthology.org/2020.semeval-1.286
DOI:
10.18653/v1/2020.semeval-1.286
Cite (ACL):
Thenmozhi D., Nandhinee P.r., Arunima S., and Amlan Sengupta. 2020. Ssn_nlp at SemEval 2020 Task 12: Offense Target Identification in Social Media Using Traditional and Deep Machine Learning Approaches. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 2155–2160, Barcelona (online). International Committee for Computational Linguistics.
Cite (Informal):
Ssn_nlp at SemEval 2020 Task 12: Offense Target Identification in Social Media Using Traditional and Deep Machine Learning Approaches (D. et al., SemEval 2020)
PDF:
https://aclanthology.org/2020.semeval-1.286.pdf