Disentangling Document Topic and Author Gender in Multiple Languages: Lessons for Adversarial Debiasing

Erenay Dayanık; Sebastian Padó

Abstract

Text classification is a central tool in NLP. However, when the target classes are strongly correlated with other textual attributes, text classification models can pick up “wrong” features, leading to bad generalization and biases. In social media analysis, this problem surfaces for demographic user classes such as language, topic, or gender, which influence the generated text to a substantial extent. Adversarial training has been claimed to mitigate this problem, but a thorough evaluation is missing. In this paper, we experiment with text classification of the correlated attributes of document topic and author gender, using a novel multilingual parallel corpus of TED talk transcripts. Our findings are: (a) individual classifiers for topic and author gender are indeed biased; (b) debiasing with adversarial training works for topic, but breaks down for author gender; (c) gender debiasing results differ across languages. We interpret the results in terms of feature space overlap, highlighting the role of linguistic surface realization of the target classes.

Anthology ID: 2021.wassa-1.6
Volume: Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
Month: April
Year: 2021
Address: Online
Editors: Orphee De Clercq, Alexandra Balahur, Joao Sedoc, Valentin Barriere, Shabnam Tafreshi, Sven Buechel, Veronique Hoste
Venue: WASSA
Publisher: Association for Computational Linguistics
Pages: 50–61
URL: https://aclanthology.org/2021.wassa-1.6
Cite (ACL): Erenay Dayanik and Sebastian Padó. 2021. Disentangling Document Topic and Author Gender in Multiple Languages: Lessons for Adversarial Debiasing. In Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, pages 50–61, Online. Association for Computational Linguistics.
Cite (Informal): Disentangling Document Topic and Author Gender in Multiple Languages: Lessons for Adversarial Debiasing (Dayanik & Padó, WASSA 2021)
PDF: https://aclanthology.org/2021.wassa-1.6.pdf
