Identification and Classification of Emotional Key Phrases from Psychological Texts



Introduction
Human emotions are among the most complex and unique features to describe. If we ask someone about emotion, he or she will simply reply that it is a 'feeling'. The obvious next question concerns the definition of feeling, and it is observed that such terms are difficult to define and even more difficult to understand completely. Ekman (1980) proposed six basic emotions (anger, disgust, fear, guilt, joy and sadness) that have a shared meaning at the level of facial expressions across cultures (Scherer, 1997; Scherer and Wallbott, 1994). Psychological texts contain a huge number of emotional words because psychology and emotions are intertwined, though they are different (Brahmachari et al., 2013). A phrase that contains more than one word can be a better way of representing emotions than a single word. Thus, the identification of emotional phrases and their classification from text have great importance in Natural Language Processing (NLP).
In the present work, we have extracted seven different types of emotional statements (anger, disgust, fear, guilt, joy, sadness and shame) from the psychological corpus. Each emotional statement was tokenized; the tokens were grouped into trigrams and considered as Context Vectors. These Context Vectors were POS tagged, and the corresponding TF and TF-IDF scores were measured to decide whether they should be considered as important features. In addition, an Affinity Score was calculated for each pair of Context Vectors based on different distance metrics (Chebyshev, Euclidean and Hamming). These features were then used with different classification methods such as NaiveBayes, J48, DecisionTree and BayesNet, and the results were compared.
The route map for this paper is the Related Work (Section 2), the Data Preprocessing Framework (Section 3), followed by the Feature Analysis and Classification Framework (Section 4) and the result analysis (Section 5) along with the improvement due to ranking. Finally, we conclude the discussion (Section 6).

Related Work
Strapparava and Valitutti (2004) developed WORDNET-AFFECT, a lexical resource that assigns one or more affective labels such as emotion, mood, trait, cognitive state, physical state, behavior, attitude and sensation to a number of WORDNET synsets. A detailed annotation scheme that identifies key components and properties of opinions and emotions in language has been described in (Wiebe et al., 2005). The authors in (Kobayashi et al., 2004) also developed an opinion lexicon out of their annotated corpora. Takamura et al. (2005) extracted the semantic orientation of words according to the spin model, where the semantic orientation of words propagates in two possible directions like electrons. Esuli and Sebastiani's (2006) approach to developing SentiWordNet is an adaptation of synset classification based on the training of ternary classifiers for deciding positive and negative (P-N) polarity. Each of the ternary classifiers is generated using semi-supervised rules.
On the other hand, Mohammad et al. (2010) performed an extensive analysis of annotations to better understand the distribution of emotions evoked by terms of different parts of speech. The authors in (Bandyopadhyay, 2009, 2010) created an emotion lexicon and systems for the Bengali language. The development of SenticNet (Cambria et al., 2010) later inspired (Poria et al., 2013), whose authors enriched SenticNet with affective information by assigning emotion labels. Similarly, ConceptNet (http://conceptnet5.media.mit.edu/) is a multilingual knowledge base representing the words and phrases that people use and the common-sense relationships between them. Balahur et al. (2012) showed that the task of emotion detection from texts such as those in the ISEAR corpus (where little or no lexical clue of affect is present) can best be tackled using approaches based on commonsense knowledge. In this sense, EmotiNet, apart from being a precise resource for classifying emotions in such examples, has the advantage of being extendable with external sources, thus increasing the recall of the methods employing it. Patra et al. (2013) adopted the Potts model for the probability modeling of a lexical network constructed by connecting each pair of words in which one of the two words appears in the gloss of the other.
In contrast to the previous approaches, the present task comprises classifying emotional phrases by forming Context Vectors; experimentation with simple features such as POS, TF-IDF and Affinity Score, followed by the computation of similarities based on different distance metrics, helps in making decisions to correctly classify the emotional phrases.

Corpus Preparation
The emotional statements were collected from the ISEAR (International Survey on Emotion Antecedents and Reactions) database. Each emotion class contains the emotional statements given by the respondents as answers to some predefined questions. Student respondents, both psychologists and non-psychologists, were asked to report situations in which they had experienced each of the 7 major emotions (anger, disgust, fear, guilt, joy, sadness, shame). The final data set contains reports from 3000 respondents in 37 countries. The statements were split into sentences and tokenized into words, and the statistics are presented in Table 1. It is found that 1096 statements belong to each of the anger, disgust, sadness and shame classes, whereas the fear, guilt and joy classes contain 1095, 1093 and 1094 statements, respectively. Since each statement may contain multiple sentences, it is observed after sentence tokenization that the anger and fear classes contain the maximum numbers of sentences. Similarly, the anger class contains the maximum number of tokenized words. The tokenized words were grouped into trigrams in order to grasp the roles of the previous and next tokens with respect to the target token. Thus, each trigram was considered as a Context Window (CW) from which to acquire emotional phrases. The updated version of the standard word lists of WordNet Affect (Strapparava and Valitutti, 2004) was collected; it contains a total of 2,958 affect words.
It is considered that, in each Context Window, the first word is a non-affect word, the second an affect word, and the third a non-affect word (<NAW1>, <AW>, <NAW2>). The CW statistics in Table 2 show that the anger class contains the maximum number of trigrams (20,785) and the joy class the minimum (15,743), whereas the fear class contains the maximum number of trigrams (1,573) that follow the CW pattern. A few example CWs that follow the (<NAW1>, <AW>, <NAW2>) pattern are "advices, about, problems" (anger), "already, frightened, us" (fear), "always, joyous, one" (joy), "acted, cruelly, to" (disgust), "adolescent, guilt, growing" (guilt), "always, sad, for" (sadness) and "and, sorry, just" (shame).
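The pattern-matching step above can be sketched in a few lines. This is a minimal illustration, not the paper's code: the tiny `AFFECT_WORDS` set stands in for the 2,958-word WordNet Affect list.

```python
# Sketch of Context Window (CW) extraction: trigrams whose middle token is an
# affect word and whose outer tokens are not. AFFECT_WORDS is a toy stand-in
# for the WordNet Affect lexicon used in the paper.
AFFECT_WORDS = {"frightened", "joyous", "angry", "sad", "guilt", "sorry", "cruelly"}

def context_windows(tokens, affect_words=AFFECT_WORDS):
    """Return the trigrams that follow the <NAW1, AW, NAW2> pattern."""
    windows = []
    for i in range(len(tokens) - 2):
        naw1, aw, naw2 = tokens[i], tokens[i + 1], tokens[i + 2]
        if aw in affect_words and naw1 not in affect_words and naw2 not in affect_words:
            windows.append((naw1, aw, naw2))
    return windows

print(context_windows("he already frightened us when".split()))
# [('already', 'frightened', 'us')]
```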
It was observed that stop words are mostly present in the <NAW1, AW, NAW2> pattern, where similar and dissimilar NAWs appear before and after the corresponding affect words. In the case of fear, a total of 979 stop words were found in the NAW1 position and 935 stop words in the NAW2 position. It is also observed that, for fear, similar NAWs occur before and after the affect word only 22 times, in contrast to 1,551 dissimilar occurrences. Table 3 presents the statistics of similar and dissimilar NAWs along with their appearances as stop words.

Context Vector Formation
In order to identify whether the Context Windows (CWs) play any significant role in classifying emotions, we have mapped the Context Windows into a vector space by representing them as vectors. We have tried to find the semantic relation or similarity between a pair of vectors using an Affinity Score, which in turn takes different distances into consideration. Since a CW follows the pattern (NAW1, AW, NAW2), the vector for each Context Window of each emotion class was formed based on the following quantities:
T = total count of CWs in an emotion class
#NAW1 = total occurrences of a non-affect word in the NAW1 position
#NAW2 = total occurrences of a non-affect word in the NAW2 position
#AW = total occurrences of an affect word in the AW position
It was found that, in the case of the anger emotion, the CW (always, angry, about) corresponds to the vector <0.29, 10.69, 1.47>.
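The exact combining formula is not legible in this copy, so the sketch below is only one plausible reading: each vector component is the positional count of the word divided by T. It does not reproduce the example values <0.29, 10.69, 1.47>, which suggest an additional scaling not recoverable here.

```python
from collections import Counter

# Hedged sketch of Context Vector formation (the paper's exact formula is not
# reproduced): each component is the positional frequency of the word,
# normalised by T, the total number of CWs in the emotion class.
def context_vector(cw, class_cws):
    T = len(class_cws)
    naw1_counts = Counter(w[0] for w in class_cws)  # occurrences in NAW1 slot
    aw_counts = Counter(w[1] for w in class_cws)    # occurrences in AW slot
    naw2_counts = Counter(w[2] for w in class_cws)  # occurrences in NAW2 slot
    naw1, aw, naw2 = cw
    return (naw1_counts[naw1] / T, aw_counts[aw] / T, naw2_counts[naw2] / T)

cws = [("a", "angry", "b"), ("a", "angry", "c")]
print(context_vector(("a", "angry", "b"), cws))  # (1.0, 1.0, 0.5)
```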

Affinity Score Calculation
We assume that each Context Vector in an emotion class is represented in the vector space at a specific distance from the others; thus, some affinity or similarity must exist between each pair of Context Vectors. An Affinity Score was calculated for each pair of Context Vectors (p_u, q_v), where u = {1, 2, ..., n} and v = {1, 2, ..., n} for the n vectors of each emotion class. The final score is calculated using the following gravitational formula, as described in (Poria et al., 2013):

Score(p, q) = (p · q) / dist(p, q)^2
The score of any two context vectors p and q of an emotion class is the dot product of the vectors divided by the square of the distance (dist) between p and q. This score was inspired by Newton's law of gravitation and reflects the affinity between the two context vectors: a higher score implies a higher affinity between p and q.
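The gravitation-style score can be sketched directly from the definition above; the Euclidean helper here is just one of the distance choices discussed later.

```python
import math

def affinity_score(p, q, dist):
    """Affinity of two context vectors: dot product divided by the squared
    distance between them (Newton's-law-of-gravitation analogy)."""
    d = dist(p, q)
    if d == 0:
        return float("inf")  # identical vectors: maximal affinity
    return sum(pi * qi for pi, qi in zip(p, q)) / d ** 2

def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

p, q = (1.0, 2.0, 0.5), (0.5, 1.5, 0.5)
print(affinity_score(p, q, euclidean))  # 3.75 / 0.5 = 7.5
```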
Apart from the score values, we also calculated the median, standard deviation and interquartile range (IQR), and a context window was considered only if its IQR value was greater than a cutoff value selected during the experiments.

Affinity Scores using Distance Metrics
In the vector space, it is necessary to calculate how close the context vectors are in order to classify them better into their respective emotion classes. The score values were calculated for all the emotion classes with respect to different distance (dist) metrics, viz. Chebyshev, Euclidean and Hamming. The distance was calculated for each context vector with respect to all the vectors of the same emotion class. The distance formulas are given below:
a. Chebyshev distance: C_d = max_i |x_i - y_i|, where x and y are two vectors.
b. Euclidean distance: E_d = ||x - y||_2 for vectors x and y.
c. Hamming distance: H_d = (c_01 + c_10) / n, where c_ij is the number of positions k < n in the boolean vectors x and y with x[k] = i and y[k] = j. The Hamming distance denotes the proportion of disagreeing components in x and y.
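The three metrics are small enough to write out directly; a minimal pure-Python sketch:

```python
def chebyshev(x, y):
    """C_d = max_i |x_i - y_i|."""
    return max(abs(a - b) for a, b in zip(x, y))

def euclidean(x, y):
    """E_d = ||x - y||_2."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

def hamming(x, y):
    """H_d = proportion of disagreeing components of x and y."""
    return sum(a != b for a, b in zip(x, y)) / len(x)

print(chebyshev((1, 5), (4, 3)))      # max(3, 2) = 3
print(euclidean((0, 3), (4, 0)))      # 5.0
print(hamming((1, 0, 1), (1, 1, 1)))  # 1 disagreement out of 3
```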

Feature Selection and Analysis
Feature selection always plays an important role in building a good pattern classifier.

POS Tagged Context Windows and Windows (PTCW and PTW)
The sentences were POS tagged using the Stanford POS Tagger, and the POS tagged Context Windows were extracted and termed PTCWs. Similarly, the POS tag sequence of each PTCW was extracted and named a POS Tagged Window (PTW). It is observed that the fear emotion class has the maximum number of CWs and unique PTCWs, whereas the anger class contains the maximum number of unique PTWs. Figure 1 presents the counts of CWs, unique PTCWs and unique PTWs: the total number of CWs is 8,967, the total number of unique PTCWs is 7,609 and that of unique PTWs is 3,117. Naturally, the number of unique PTCWs is less than the number of CWs, and the number of unique PTWs less than the number of unique PTCWs, because of the uniqueness constraint. In Figure 2, the total counts of CWs, PTCWs and PTWs are shown. Some sample PTW patterns that occur with the maximum frequencies in three emotion classes are "VBD/RB_JJ_IN" (anger), "NN/VBD_VBN_NN" (disgust) and "VBD_VBN/JJ_IN/NN" (fear).
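The PTCW/PTW distinction can be sketched as follows. A toy tag lookup stands in for the Stanford POS Tagger named above, so the tags here are illustrative only.

```python
# Hedged sketch of PTCW/PTW construction. TOY_TAGS is a stand-in for the
# Stanford POS Tagger; unknown words default to "NN".
TOY_TAGS = {"already": "RB", "frightened": "JJ", "us": "PRP"}

def ptcw_and_ptw(cw, tags=TOY_TAGS):
    ptcw = tuple((w, tags.get(w, "NN")) for w in cw)  # word/tag pairs
    ptw = tuple(t for _, t in ptcw)                   # tag sequence only
    return ptcw, ptw

print(ptcw_and_ptw(("already", "frightened", "us")))
# ((('already', 'RB'), ('frightened', 'JJ'), ('us', 'PRP')), ('RB', 'JJ', 'PRP'))
```

Because many CWs share a tag sequence, distinct CWs collapse onto fewer unique PTCWs and still fewer unique PTWs, which matches the 8,967 / 7,609 / 3,117 counts reported above.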

TF and TF-IDF Measure
The Term Frequencies (TFs) and Inverse Document Frequencies (IDFs) of the CWs were calculated for each emotion class. In order to identify the different ranges of the TF and TF-IDF scores, the minimum and maximum values of TF and the variance of TF were calculated for each emotion class. It was observed that guilt has the maximum scores for Max_TF and variance, whereas emotions such as anger and disgust have the lowest Max_TF scores, as shown in Figure 3. Similarly, the minimum, maximum and variance of the TF-IDF values were calculated for each emotion class separately. Again, it is found that the guilt emotion has the highest Max_TF-IDF and the disgust emotion the lowest Max_TF-IDF, as shown in Figure 4. The TF and TF-IDF scores were calculated not only for the Context Windows (CWs) but also for the POS Tagged Context Windows (PTCWs) and POS Tagged Windows (PTWs) with respect to each emotion, and similar results were observed. Variance, or the second moment about the mean, is a measure of the variability (spread or dispersion) of the data: a large variance indicates that the data is spread out, whereas a small variance indicates that it is clustered closely around the mean. The variance of the TF-IDF of guilt is 0.0000456874. A few slight differences were found in the results for PTWs while calculating Max_TF, Min_TF and variance, as shown in Figure 3: the fear emotion has the highest Max_TF and anger the lowest Max_TF, whereas the variance of TF for guilt is 0.0002435522. Similarly, Figure 4 shows that fear has the highest Max_TF-IDF and anger the lowest Max_TF-IDF, and the variance of TF-IDF for fear is 0.000922226.
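One way to compute these class-level scores, treating each emotion class as a "document" (an assumption; the paper does not spell out its document unit):

```python
import math

# Sketch of TF / TF-IDF over Context Windows, one "document" per emotion class.
def tf_idf(class_to_cws):
    n_classes = len(class_to_cws)
    df = {}  # document frequency: in how many classes each CW appears
    for cws in class_to_cws.values():
        for cw in set(cws):
            df[cw] = df.get(cw, 0) + 1
    scores = {}
    for label, cws in class_to_cws.items():
        total = len(cws)
        for cw in set(cws):
            tf = cws.count(cw) / total          # within-class frequency
            idf = math.log(n_classes / df[cw])  # rarity across classes
            scores[(label, cw)] = (tf, tf * idf)
    return scores

scores = tf_idf({"anger": ["a", "a", "b"], "joy": ["b"]})
print(scores[("anger", "a")])  # high TF, nonzero TF-IDF (unique to anger)
print(scores[("anger", "b")])  # TF-IDF is 0: "b" appears in every class
```

Min, max and variance of the resulting per-class TF (or TF-IDF) values then give the ranges discussed above.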

Ranking Score of CW
It was found that some Context Windows appear more than once in the same emotion class; the duplicates were removed and a ranking score was calculated for each context window. Each word in a context window was searched for in the SentiWordNet lexicon and, if found, its positive and/or negative scores were considered. The summation of the absolute scores of all the words in a Context Window was returned, and the returned scores were sorted so that each context window obtained a rank in its corresponding emotion class. All the ranks were calculated for each emotion class successively. This rank is useful in finding the important emotional phrases in the list of CWs. Some examples from the list of the top 12 important context windows according to their rank are "much anger when" (anger), "whom love after" (happy), "felt sad about" (sadness), etc.
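The ranking step can be sketched as below. A toy (positive, negative) lexicon stands in for SentiWordNet, and the illustrative scores are invented for the example, not taken from the real lexicon.

```python
# Hedged sketch of the CW ranking score: sum of absolute positive and negative
# lexicon scores over the words of a Context Window, then sort descending.
TOY_LEXICON = {"anger": (0.0, 0.625), "sad": (0.0, 0.75), "love": (0.625, 0.0)}

def rank_score(cw, lexicon=TOY_LEXICON):
    score = 0.0
    for word in cw:
        pos, neg = lexicon.get(word, (0.0, 0.0))  # words not in lexicon add 0
        score += abs(pos) + abs(neg)
    return score

def rank(cws, lexicon=TOY_LEXICON):
    """Order the CWs of a class from most to least emotionally loaded."""
    return sorted(cws, key=lambda cw: rank_score(cw, lexicon), reverse=True)

print(rank([("much", "anger", "when"), ("felt", "sad", "about")]))
# [('felt', 'sad', 'about'), ('much', 'anger', 'when')]
```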

Result Analysis
The accuracies of the classifiers were obtained by employing user-defined test data and 10-fold cross validation. It is observed that, when the Euclidean distance is considered, the BayesNet classifier gives 100% accuracy on the test data and 97.91% accuracy under 10-fold cross validation. The J48 classifier achieves 77% accuracy on the test data and 83.54% under 10-fold cross validation, whereas the NaiveBayesSimple classifier obtains 92.30% accuracy on the test data and 27.07% under 10-fold cross validation. For NaiveBayesSimple with 10-fold cross validation, the average Recall, Precision and F-measure values are 0.271, 0.272 and 0.264, respectively. The DecisionTree classifier obtains 98.30% and 98.10% accuracies on the test data and 10-fold cross validation data, respectively. The comparative results are shown in Figure 5. Overall, it is observed from Figure 5 that the BayesNet classifier achieves the best results on the score data prepared using the Euclidean distance. In contrast, BayesNet achieved 99.30% accuracy on the test data and 96.92% under 10-fold cross validation when the Hamming distance was considered. Similarly, the J48 and NaiveBayesSimple classifiers produce 93.05% and 85.41% accuracies on the test data and 87.95% and 39.50% under 10-fold cross validation, respectively.
From Figure 6, it is observed that the DecisionTree classifier produces the best accuracy on the score data obtained using the Hamming distance. When the score values are computed using the Chebyshev distance, the BayesNet classifier obtains 100% accuracy on the test data and 97.57% under 10-fold cross validation. Similarly, J48 achieves 84.82% accuracy on the test data and 82.75% under 10-fold cross validation, whereas NaiveBayes achieves 80% and 29.85% and DecisionTable achieves 98.62% and 96.93% accuracies on the test data and 10-fold cross validation data, respectively.
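The Weka classifiers evaluated above are not reproduced here; as a deliberately simplified, self-contained illustration of classifying affinity-score vectors, the sketch below uses a nearest-centroid classifier (not BayesNet, J48, NaiveBayesSimple or DecisionTable).

```python
# Minimal nearest-centroid classifier over score vectors: train() averages the
# vectors of each class, predict() picks the class with the closest centroid.
def centroid(vectors):
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def train(labelled):
    """labelled: {class_label: [vector, ...]} -> {class_label: centroid}"""
    return {label: centroid(vs) for label, vs in labelled.items()}

def predict(model, v):
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: sqdist(model[label], v))

model = train({"anger": [(1.0, 0.0)], "joy": [(0.0, 1.0)]})
print(predict(model, (0.9, 0.2)))  # anger
```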
Based on Figure 7, the DecisionTree classifier performs better than all the other classifiers on the affinity score data prepared using the Chebyshev distance.

Conclusions and Future Works
In this paper, vector formation was done for each of the Context Windows, and the TF and TF-IDF measures were calculated. The affinity score, computed from the distance values, was inspired by Newton's law of gravitation. To classify the CWs, the BayesNet, J48, NaiveBayesSimple and DecisionTable classifiers were employed.
In future, we would like to incorporate more number of lexicons to identify and classify emotional expressions. Moreover, we are planning to include associative learning process to identify some important rules for classification.