BB_twtr at SemEval-2017 Task 4: Twitter Sentiment Analysis with CNNs and LSTMs

Mathieu Cliche


Abstract
In this paper we describe our attempt at producing a state-of-the-art Twitter sentiment classifier using Convolutional Neural Networks (CNNs) and Long Short Term Memory (LSTMs) networks. Our system leverages a large amount of unlabeled data to pre-train word embeddings. We then use a subset of the unlabeled data to fine tune the embeddings using distant supervision. The final CNNs and LSTMs are trained on the SemEval-2017 Twitter dataset where the embeddings are fined tuned again. To boost performances we ensemble several CNNs and LSTMs together. Our approach achieved first rank on all of the five English subtasks amongst 40 teams.
Anthology ID:
S17-2094
Volume:
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
Month:
August
Year:
2017
Address:
Vancouver, Canada
Editors:
Steven Bethard, Marine Carpuat, Marianna Apidianaki, Saif M. Mohammad, Daniel Cer, David Jurgens
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Note:
Pages:
573–580
Language:
URL:
https://aclanthology.org/S17-2094
DOI:
10.18653/v1/S17-2094
Bibkey:
Cite (ACL):
Mathieu Cliche. 2017. BB_twtr at SemEval-2017 Task 4: Twitter Sentiment Analysis with CNNs and LSTMs. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 573–580, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
BB_twtr at SemEval-2017 Task 4: Twitter Sentiment Analysis with CNNs and LSTMs (Cliche, SemEval 2017)
Copy Citation:
PDF:
https://aclanthology.org/S17-2094.pdf