IITP-AINLPML at SemEval-2020 Task 12: Offensive Tweet Identification and Target Categorization in a Multitask Environment

Soumitra Ghosh, Asif Ekbal, Pushpak Bhattacharyya


Abstract
In this paper, we describe the participation of the IITP-AINLPML team in the SemEval-2020 Shared Task 12 on Offensive Language Identification and Target Categorization in English Twitter data. Our proposed model learns to extract textual features using a BiGRU-based deep neural network supported by a Hierarchical Attention architecture to focus on the most relevant areas in the text. We leverage the effectiveness of multitask learning while building our models for sub-tasks A and B. We undersample the over-represented classes in sub-tasks A and C as needed. During training, we consider a threshold of 0.5 as the separation margin between the instances belonging to classes OFF and NOT in sub-task A and UNT and TIN in sub-task B. For sub-task C, the class corresponding to the maximum score among the given confidence scores of the classes (IND, GRP and OTH) is considered as the final label for an instance. Our proposed model obtains macro F1-scores of 90.95%, 55.69% and 63.88% in sub-tasks A, B and C, respectively.
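The labelling rules described in the abstract (a 0.5 threshold for sub-tasks A and B, and the maximum confidence score for sub-task C) can be illustrated with a minimal sketch. This is not the authors' code. The function names `label_binary` and `label_target` are hypothetical, and two details are assumptions the abstract does not settle: that the score is the confidence of the first-named class (OFF or UNT), and that a score exactly equal to 0.5 falls on that side.

```python
from typing import Dict


def label_binary(score: float, positive: str, negative: str,
                 threshold: float = 0.5) -> str:
    """Map a confidence score to one of two labels.

    Assumption: `score` is the confidence of `positive`, and a score
    equal to the threshold counts as `positive`.
    """
    return positive if score >= threshold else negative


def label_target(scores: Dict[str, float]) -> str:
    """Return the class whose confidence score is largest (sub-task C)."""
    return max(scores, key=scores.get)


# Sub-task A: offensive (OFF) vs. not offensive (NOT).
label_a = label_binary(0.73, positive="OFF", negative="NOT")

# Sub-task B: untargeted (UNT) vs. targeted insult (TIN).
label_b = label_binary(0.21, positive="UNT", negative="TIN")

# Sub-task C: individual (IND), group (GRP) or other (OTH).
label_c = label_target({"IND": 0.15, "GRP": 0.70, "OTH": 0.15})
```

The example scores are made up for illustration and do not come from the task data.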
Anthology ID:
2020.semeval-1.261
Volume:
Proceedings of the Fourteenth Workshop on Semantic Evaluation
Month:
December
Year:
2020
Address:
Barcelona (online)
Editors:
Aurelie Herbelot, Xiaodan Zhu, Alexis Palmer, Nathan Schneider, Jonathan May, Ekaterina Shutova
Venue:
SemEval
SIG:
SIGLEX
Publisher:
International Committee for Computational Linguistics
Pages:
1983–1991
URL:
https://aclanthology.org/2020.semeval-1.261
DOI:
10.18653/v1/2020.semeval-1.261
Cite (ACL):
Soumitra Ghosh, Asif Ekbal, and Pushpak Bhattacharyya. 2020. IITP-AINLPML at SemEval-2020 Task 12: Offensive Tweet Identification and Target Categorization in a Multitask Environment. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 1983–1991, Barcelona (online). International Committee for Computational Linguistics.
Cite (Informal):
IITP-AINLPML at SemEval-2020 Task 12: Offensive Tweet Identification and Target Categorization in a Multitask Environment (Ghosh et al., SemEval 2020)
PDF:
https://aclanthology.org/2020.semeval-1.261.pdf
Data
OLID