Multi-Level Sentiment Analysis of PolEmo 2.0: Extended Corpus of Multi-Domain Consumer Reviews

Jan Kocoń, Piotr Miłkowski, Monika Zaśko-Zielińska


Abstract
In this article we present an extended version of PolEmo – a corpus of consumer reviews from 4 domains: medicine, hotels, products and school. Current version (PolEmo 2.0) contains 8,216 reviews having 57,466 sentences. Each text and sentence was manually annotated with sentiment in 2+1 scheme, which gives a total of 197,046 annotations. We obtained a high value of Positive Specific Agreement, which is 0.91 for texts and 0.88 for sentences. PolEmo 2.0 is publicly available under a Creative Commons copyright license. We explored recent deep learning approaches for the recognition of sentiment, such as Bi-directional Long Short-Term Memory (BiLSTM) and Bidirectional Encoder Representations from Transformers (BERT).
Anthology ID:
K19-1092
Volume:
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)
Month:
November
Year:
2019
Address:
Hong Kong, China
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Note:
Pages:
980–991
URL:
https://www.aclweb.org/anthology/K19-1092
DOI:
10.18653/v1/K19-1092
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
https://www.aclweb.org/anthology/K19-1092.pdf
Attachment:
 K19-1092.Attachment.zip