Expressively vulgar: The socio-dynamics of vulgarity and its effects on sentiment analysis in social media

Isabel Cachola, Eric Holgate, Daniel Preoţiuc-Pietro, Junyi Jessy Li


Abstract
Vulgarity is a common linguistic expression and is used to perform several linguistic functions. Understanding their usage can aid both linguistic and psychological phenomena as well as benefit downstream natural language processing applications such as sentiment analysis. This study performs a large-scale, data-driven empirical analysis of vulgar words using social media data. We analyze the socio-cultural and pragmatic aspects of vulgarity using tweets from users with known demographics. Further, we collect sentiment ratings for vulgar tweets to study the relationship between the use of vulgar words and perceived sentiment and show that explicitly modeling vulgar words can boost sentiment analysis performance.
Anthology ID:
C18-1248
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Venue:
COLING
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
2927–2938
URL:
https://www.aclweb.org/anthology/C18-1248
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
https://www.aclweb.org/anthology/C18-1248.pdf