Detecting East Asian Prejudice on Social Media

Bertie Vidgen, Scott Hale, Ella Guest, Helen Margetts, David Broniatowski, Zeerak Waseem, Austin Botelho, Matthew Hall, Rebekah Tromble


Abstract
During COVID-19, concerns have heightened about the spread of aggressive and hateful language online, especially hostility directed against East Asia and East Asian people. We report on a new dataset and the creation of a machine learning classifier that categorizes social media posts from Twitter into four classes: Hostility against East Asia, Criticism of East Asia, Meta-discussions of East Asian prejudice, and a neutral class. The classifier achieves a macro-F1 score of 0.83. We then conduct an in-depth ground-up error analysis and show that the model struggles with edge cases and ambiguous content. We provide the 20,000-tweet training dataset (annotated by experienced analysts), which also contains several secondary categories and additional flags. We also provide the 40,000 original annotations (before adjudication), the full codebook, annotations of 1,000 hashtags for COVID-19 relevance, East Asian relevance, and stance, and the final model.
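For readers unfamiliar with the metric reported in the abstract, macro-F1 is the unweighted mean of the per-class F1 scores, so each of the four classes counts equally regardless of how many tweets it contains. The following is a minimal sketch of that computation, not the authors' evaluation code. The short class names and the toy labels are assumptions made for illustration and are not taken from the dataset.

```python
from collections import Counter

# Hypothetical short names for the four classes named in the abstract.
CLASSES = ["hostility", "criticism", "meta_discussion", "neutral"]


def macro_f1(gold, pred, classes=CLASSES):
    """Return the unweighted mean of per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # Assumption: a class with no gold and no predicted items scores 0.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(classes)


# Toy labels invented for illustration only (not from the dataset).
gold = ["hostility", "criticism", "neutral", "neutral", "meta_discussion", "hostility"]
pred = ["hostility", "neutral", "neutral", "neutral", "meta_discussion", "criticism"]
score = macro_f1(gold, pred)
print(round(score, 3))
```

Because the average is unweighted, a rare class that the model handles poorly (here the toy "criticism" class) pulls the score down as much as a frequent one would.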
Anthology ID:
2020.alw-1.19
Volume:
Proceedings of the Fourth Workshop on Online Abuse and Harms
Month:
November
Year:
2020
Address:
Online
Editors:
Seyi Akiwowo, Bertie Vidgen, Vinodkumar Prabhakaran, Zeerak Waseem
Venue:
ALW
Publisher:
Association for Computational Linguistics
Pages:
162–172
URL:
https://aclanthology.org/2020.alw-1.19
DOI:
10.18653/v1/2020.alw-1.19
Cite (ACL):
Bertie Vidgen, Scott Hale, Ella Guest, Helen Margetts, David Broniatowski, Zeerak Waseem, Austin Botelho, Matthew Hall, and Rebekah Tromble. 2020. Detecting East Asian Prejudice on Social Media. In Proceedings of the Fourth Workshop on Online Abuse and Harms, pages 162–172, Online. Association for Computational Linguistics.
Cite (Informal):
Detecting East Asian Prejudice on Social Media (Vidgen et al., ALW 2020)
PDF:
https://aclanthology.org/2020.alw-1.19.pdf
Optional supplementary material:
 2020.alw-1.19.OptionalSupplementaryMaterial.zip
Video:
 https://slideslive.com/38939526