Aggressive language in an online hacking forum

Andrew Caines, Sergio Pastrana, Alice Hutchings, Paula Buttery


Abstract
We probe the heterogeneity in levels of abusive language in different sections of the Internet, using an annotated corpus of Wikipedia page edit comments to train a binary classifier for abuse detection. Our test data come from the CrimeBB Corpus of hacking-related forum posts and we find that (a) forum interactions are rarely abusive, (b) the abusive language which does exist tends to be relatively mild compared to that found in the Wikipedia comments domain, and tends to involve aggressive posturing rather than hate speech or threats of violence. We observe that the purpose of conversations in online forums tends to be more constructive and informative than that of Wikipedia page edit comments, which are geared more towards adversarial interactions, and that this may explain the lower levels of abuse found in our forum data than in Wikipedia comments. Further work remains to be done to compare these results with other inter-domain classification experiments, and to understand the impact of aggressive language in forum conversations.
Anthology ID:
W18-5109
Volume:
Proceedings of the 2nd Workshop on Abusive Language Online (ALW2)
Month:
October
Year:
2018
Address:
Brussels, Belgium
Editors:
Darja Fišer, Ruihong Huang, Vinodkumar Prabhakaran, Rob Voigt, Zeerak Waseem, Jacqueline Wernimont
Venue:
ALW
Publisher:
Association for Computational Linguistics
Pages:
66–74
URL:
https://aclanthology.org/W18-5109
DOI:
10.18653/v1/W18-5109
Cite (ACL):
Andrew Caines, Sergio Pastrana, Alice Hutchings, and Paula Buttery. 2018. Aggressive language in an online hacking forum. In Proceedings of the 2nd Workshop on Abusive Language Online (ALW2), pages 66–74, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Aggressive language in an online hacking forum (Caines et al., ALW 2018)
PDF:
https://aclanthology.org/W18-5109.pdf