Zyy1510 Team at SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text with Sub-word Level Representations

Yueying Zhu, Xiaobing Zhou, Hongling Li, Kunjie Dong


Abstract
This paper reports the zyy1510 team’s work in the International Workshop on Semantic Evaluation (SemEval-2020) shared task on Sentiment Analysis for Code-Mixed (Hindi-English, English-Spanish) Social Media Text. The task is to determine the polarity of a text, assigning it one of three labels: positive, negative, or neutral. To this end, we propose an ensemble of a word n-gram-based Multinomial Naive Bayes (MNB) model and an LSTM over sub-word level representations (Sub-word LSTM) to identify the sentiment of Hindi-English and English-Spanish code-mixed data. The ensemble combines the rich sequential patterns and intermediate post-convolution features captured by the LSTM model with the keyword polarity captured by the MNB model to obtain the final sentiment score. We tested our system on the Hindi-English and English-Spanish code-mixed social media data sets released for the task. Our model achieves F1 scores of 0.647 on the Hindi-English task and 0.682 on the English-Spanish task.
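As a rough illustration of the ensemble idea in the abstract, the sketch below implements a small word n-gram Multinomial Naive Bayes classifier in plain Python and mixes its class probabilities with those of a second model. It is not the authors' implementation. The names `word_ngrams`, `NGramMNB`, and `ensemble`, the add-one smoothing, the bigram limit, and the weighted-average combination rule are all assumptions made for illustration (the abstract does not say how the two scores are combined), and the Sub-word LSTM is represented only by a hand-supplied probability dictionary.

```python
import math
from collections import Counter, defaultdict

LABELS = ("negative", "neutral", "positive")


def word_ngrams(text, n_max=2):
    """Return word n-grams (n = 1..n_max) of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


class NGramMNB:
    """Multinomial Naive Bayes over word n-gram counts (add-one smoothing assumed)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.feature_counts[label].update(word_ngrams(text))
        self.vocab = {f for counts in self.feature_counts.values() for f in counts}
        return self

    def predict_proba(self, text):
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for label in LABELS:
            # Smoothed log prior, so a class absent from training stays finite.
            score = math.log((self.class_counts[label] + 1) / (total_docs + len(LABELS)))
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            for feat in word_ngrams(text):
                if feat in self.vocab:  # n-grams never seen in training are skipped
                    score += math.log((counts[feat] + 1) / denom)
            log_scores[label] = score
        # Normalize log scores into probabilities (softmax with max-shift for stability).
        top = max(log_scores.values())
        exps = {k: math.exp(v - top) for k, v in log_scores.items()}
        z = sum(exps.values())
        return {k: v / z for k, v in exps.items()}


def ensemble(p_mnb, p_other, weight=0.5):
    """Weighted average of two class-probability dicts (assumed combination rule)."""
    mixed = {k: weight * p_mnb[k] + (1 - weight) * p_other[k] for k in LABELS}
    return max(mixed, key=mixed.get), mixed


# Invented toy data, for illustration only (not taken from the task data sets).
train_texts = ["movie bahut accha tha", "bahut bura movie", "kal movie dekhenge"]
train_labels = ["positive", "negative", "neutral"]
mnb = NGramMNB().fit(train_texts, train_labels)

p_mnb = mnb.predict_proba("accha movie")
# Stand-in for the Sub-word LSTM output; a real system would compute this.
p_lstm = {"negative": 0.2, "neutral": 0.2, "positive": 0.6}
label, mixed = ensemble(p_mnb, p_lstm)
print(label, mixed)
```

The `weight` parameter controls how much the keyword-driven MNB probabilities count relative to the sequence model; in practice it would be tuned on held-out data.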
Anthology ID:
2020.semeval-1.183
Volume:
Proceedings of the Fourteenth Workshop on Semantic Evaluation
Month:
December
Year:
2020
Address:
Barcelona (online)
Editors:
Aurelie Herbelot, Xiaodan Zhu, Alexis Palmer, Nathan Schneider, Jonathan May, Ekaterina Shutova
Venue:
SemEval
SIG:
SIGLEX
Publisher:
International Committee for Computational Linguistics
Pages:
1354–1359
URL:
https://aclanthology.org/2020.semeval-1.183
DOI:
10.18653/v1/2020.semeval-1.183
Cite (ACL):
Yueying Zhu, Xiaobing Zhou, Hongling Li, and Kunjie Dong. 2020. Zyy1510 Team at SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text with Sub-word Level Representations. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 1354–1359, Barcelona (online). International Committee for Computational Linguistics.
Cite (Informal):
Zyy1510 Team at SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text with Sub-word Level Representations (Zhu et al., SemEval 2020)
PDF:
https://aclanthology.org/2020.semeval-1.183.pdf