EmoIntens Tracker at SemEval-2018 Task 1: Emotional Intensity Levels in #Tweets

The „Affect in Tweets” task is centered on emotions categorization and evaluation matrix using multi-language tweets (English and Spanish). In this research, SemEval Affect dataset was preprocessed, categorized, and evaluated accordingly (precision, recall, and accuracy). The system described in this paper is based on the implementation of supervised machine learning (Naive Bayes, KNN and SVM), deep learning (NN Tensor Flow model), and decision trees algorithms.


Introduction
Emotion Analysis is still a challenging task in NLP (Natural Language Processing); the researchers try to recognize not only emotions "generally speaking" but also their intensity. Because the level of subjectivity is particularly high in this matter, prediction of emotions, mainly in text, demands for continuous research and improvement strategies. SemEval competition has already a tradition in developing tasks to address this subject. This year proposed an even more challenging task: emotion intensity prediction in tweets.
Within this context, this present study aims to develop a system that can not only detect emotions but also their intensities, namely emotion intensity regression and emotion intensity ordinal classification tasks for fear, joy, sadness and anger. Better results are finally provided thanks to the combination between neural network and proper decision tree algorithms.
From the four tasks, namely: EI-reg, EI-oc, VAD-reg, and VAD-oc, the system focuses on the first two of them, for both English and Spanish datasets, according to their relation with the emotion concept.
While tweets annotation, emotion intensity regression and emotion intensity ordinal classification is an active field of emotion analysis, we believe that a supervised machine learning (Naive Bayes, KNN and SVM), deep learning approach (NN Tensor Flow model), and decision trees would increase the effectiveness of this system.

State of the Art
Some of the most used technologies which led to considerable results and step forwards in the domain of sentiment analysis are mainly represented by machine learning algorithms which already have led to impressive results can be thoroughly analyzed in several studies such as: the work of Bac Huy Nguyen (2015); latent semantic analysis (LSA) by Andrew et al. (2014), support vector machines in the work of Rohini S. Rahate and Emmanuel M, (2013), grammatical dependency relations, Support Vector Regression (SVR), and Neural Networks.

Abstract
The "Affect in Tweets" task is centered on emotions categorization and evaluation matrix using multi-language tweets (English and Spanish). In this research, SemEval Affect dataset was preprocessed, categorized, and evaluated accordingly (precision, recall, and accuracy). The system described in this paper is based on the implementation of supervised machine learning (Naive Bayes, KNN and SVM), deep learning (NN Tensor Flow model), and decision trees algorithms.
An important work has been done by the tool IZU-NLP at EmoInt-2017 (Yuanye He et al., 2017), meant to determine numerical values that would represent the emotion intensity in a tweet. There are researchers who prefer to combine several methods in order to achieve better results. For instance, in the paper of Sreekanth Maunendra and Sankar Desarkar (2017) there were implemented three regression methods: (1) content-based features (ex. hashtags, emoticons); (2) training based on word and character n-grams; and finally (3) lexicons, word embeddings, word n-grams and character n-grams all together.
Best-Worst Scaling (BWS) was highly valued in the work of Saif M. Mohammad and Felipe Bravo-Marquez (2017) when producing the first datasets of tweets annotated for sentiment intensities (anger, fear, joy, and sadness).
Some existing tools and resources that enlarged the perspective and built the basis of sentiment analysis are: Emo-Int2017, NRC Emotion lexicon, Best-Worst Scaling resources; VADER-Sentiment-Analysis, SentiWordNet (Andrea Esuli et al., 2010), NLTK Sentiment Analyze, and Affective Tweets. All these works prove the interest of researchers on this subject and the fast evolution of this specific domain in time.
In the context of the Semeval Competition, we developed a system for emotion intensity and ordinal classification of the subtasks already stated above.

Data set
The SemEval affect dataset used in this work contains an annotated set multi-language tweets (English and Spanish). For each emotion (anger, fear, sadness, joy) we had 3 sets of data (for train, developing and test). The English data set was revised by Hardik Meisheri Dec 5, 2017 consisting of ~100 million English tweet ids and for Spanish the data set released on Dec 5, 2017 containing ~1.2 million Spanish tweet ids.

Method
This research is oriented towards the first two tasks of SemEval so it will contain two components, one for Task EI-reg and the other one for Task EI-oc. The first step was to preprocess the development data set in multiple stages as follows: basic cleaning (Ids, useless stopwords, emoticons), tokenization and parsing to make data less repetitive. Then we apply the NN Tensor Flow model, the basic one, offered by Python with 600 neurons and with a layer of 1 to 1000. The neural network was trained separately for each language using the same configuration. Once we obtained the results, we applied a Decision Tree Algorithm in order to refine them. For all subtasks we use Neural Network Tensor Flow (NNTF): Analyzing Tweet's Sentiment with Character-Level LSTMs NN Tensor Flowpython implemented neuronal network with the same parameters for the EI-oc subtask, we improved our results by implementing also the classifiers from pattern.vector. The algorithms are based on three big approaches -they implement the Naive Bayes, KNN and SVM classifiers respectively. Even though machine learning and neural networks gave decent result, the difficulty in controlling the actual reasoning implied the necessity of adding a refining algorithm that would improve final results. An algorithm that would meet this condition is the Decision Tree Classification, a Weka J48 implementation that was improved by adding an algorithm used to generate same decision trees: check for base classification (this classification should be done by the first method described in this paper -NN Tensor Flow); for each score/class; find the normalized score; best_score will be the highest normalized score, this will be the root; create a decision node that splits on best_score; search on the sublists obtained by splitting on best_score; add those nodes as child of node.
For development and training we used the results from the first method for both English and French.

Results and Observations
In both sets of results we will notice that the best results were obtained in the single positive feeling dataset -the one for joy. For EI-reg, the lowest result was registered for anger (accuracy), while the highest was the recall for joy. For EI-oc the best result is in the precision of joy, the lowest result being again in accuracy, but this time for sadness.
The implementation of the Decision Tree Algorithm leads to growth and uniformization of the results in the EI-reg subtask, while it lowers those from EI-oc.
This being said, it becomes clear that for ordinal classification (EI-oc) NNTF is preferred, while the additional Decision Tree Algorithm helps the improvement of intensity detection.

Conclusions
Within our project, we succeeded to obtain relevant results as participants in Semeval-Task 1 Affect in Tweets, by implementing machine learning and decision tree algorithms.
A constant concern relates to the modalities of sentiment summarization and visualization. When the results of sentiment analysis tasks need to be presented to an end user, a corresponding level of uncertainty should be taken into account (uncertain results shown as certain may lead to incorrect conclusions). Of course it is clear that we may increase it by identifying manually certain patterns and indices about intensity and/or their classification.
It would have been rather interesting to have a balance between the negative and positive emotions (the Semeval Competition providing us with three negative and only one positive emotion.) Much and interesting work is to be done as we speak about such a subjective part and manifestation of human mindemotions.

Acknowledgments
This survey was published with the support by two grants of the Romanian National Authority for Scientific Research and Innovation, UEFISCDI, project number PN-III-P2-2.1-BG-2016-0390, contract 126BG/2016 and project number PN-III-P1-1.2-PCCDI-2017-0818, contract 73PCCDI/2018 within PNCDI III, and partially by the README project "Interactive and Innovative application for evaluating the readability of texts in Romanian Language and Table 1. NNTF and Classification Algoithm Results (a-accuracy, r-recall, p-precission) Table 2. Results obtained after implementing the Decission Tree Algorithm (a-accuracy, r-recall, p-precission)