SINAI: Syntactic Approach for Aspect-Based Sentiment Analysis

This paper describes the participation of the SINAI research group in the Aspect Based Sentiment Analysis task of the SemEval 2015 workshop. We propose a syntactic approach for identifying the words that modify each aspect, with the aim of classifying the sentiment expressed towards each attribute of an entity.


Introduction
Opinion Mining (OM), also known as Sentiment Analysis (SA), is the discipline that focuses on the computational treatment of opinion, sentiment and subjectivity in texts (Pang and Lee, 2008). OM is currently a popular task in Natural Language Processing, mainly because of the proliferation of user-generated content and the interest of consumers and businesses in knowing what people think.
Most systems developed so far analyse opinion at document level (Pang et al., 2002; Turney, 2002) or at sentence level (Wilson et al., 2005; Yu and Hatzivassiloglou, 2003); that is, they determine the overall sentiment expressed by the reviewer about the topic, product or person under study. However, a positive overall sentiment towards a product does not mean that the author considers every aspect of it positive, nor does a negative overall sentiment imply that everything about the product is bad. For this reason, users and companies are not satisfied with knowing only the overall sentiment of a product or service; they seek more detailed knowledge. Consequently, part of the scientific community is working on SA at aspect level (Quan and Ren, 2014; Marcheggiani et al., 2014; Lu et al., 2011; Thet et al., 2010), and a competition on this topic was launched last year (Pontiki et al., 2014) at the International Workshop on Semantic Evaluation 2014 (SemEval 2014).
This year, the 2015 edition of SemEval has also proposed a task for SA at aspect level. The SemEval-2015 Aspect Based Sentiment Analysis task is a continuation of SemEval-2014 Task 4 (Pontiki et al., 2014). The aim of this task is to identify the attributes of an entity that are being reviewed and the sentiment expressed for each one. It is divided into three slots. Slot 1 focuses on identifying every entity E and attribute A pair (E#A) towards which an opinion is expressed in the given text. Slot 2 proposes to determine the expression used in the text to refer to the reviewed entity, that is, the Opinion Target Expression (OTE). Finally, the goal of Slot 3 is to classify the sentiment expressed about each category (E#A pair) as positive, negative or neutral. We have participated in the slot related to sentiment polarity (Slot 3).
Because OM is a domain-dependent task, the organization proposes the three slots in different domains: two known in advance (restaurants and laptops) and one unknown until the evaluation (hotels). A fuller description of the task and the dataset can be found in the task description paper (Pontiki et al., 2015).
The rest of the paper is organized as follows. Section 2 describes the system developed and the resources that we have used. Section 3 presents the results obtained and their analysis.
2 System description - Slot 3
As mentioned above, we have taken part in Slot 3. The aim of this slot is to identify the polarity of each category or each <category, OTE> pair on which an opinion is expressed in a given review. This task has been carried out on two known domains and one unknown domain. For each of the known domains, restaurants and laptops, the organization has provided a dataset for training, whereas for the unknown domain no information was given until the test set was released. Therefore, we have used a supervised method for the restaurants and laptops domains and we have developed an unsupervised method for the unknown domain.

Slot 3 - Restaurant domain ABSA
The training data for the restaurants domain contains 254 reviews. Each review is composed of several sentences annotated with opinion tuples. Each opinion tuple has information about the Opinion Target Expression (OTE), the Entity and Attribute pair (E#A category) towards which the opinion is expressed, the polarity (positive, negative or neutral) and the position of the OTE in the text (from-to). Using this information we have run different experiments for polarity prediction. In all of them an SVM classifier of type C-SVC with a linear kernel and the default configuration has been trained, and 10-fold cross-validation has been used for the assessment (Table 1).
The features that provided the best results in training, and that we have used for our participation in this slot, are the following. For each <category, OTE, polarity> tuple of the training data, we have used the polarity value as label and, as features, the words that modify the OTE, their PoS tags, their syntactic relations and their polarity according to three lexicons (taking negation into account): SentiWordNet (Baccianella et al., 2010), MPQA (Wilson et al., 2005) and eBLR (an enriched version of the Bing Liu Lexicon (Hu and Liu, 2004)). [...] information has been obtained. Thereby, each <category, OTE> tuple of the test data is classified using its feature vector and the trained SVM model.

Features
Words that modify the OTE
We call words that modify an OTE those words that have been used in the review specifically to discuss the OTE. In order to determine what these words are, we use the Stanford Dependencies Parser. This parser was designed to provide a simple description of the grammatical relationships that can appear in a sentence, so that it can be easily understood and effectively used by people without linguistic expertise who want to extract textual relations (De Marneffe and Manning, 2008). It represents all sentence relationships uniformly as typed dependency relations. In this experiment, we have considered the main relationships for expressing opinion about a noun or nominal expression: an adjectival modifier ("amod"), an active or passive verb ("nsubj", "nsubjpass"), a noun compound modifier ("nn") or a dependency relation with another word ("dep"). In this way, for each OTE of a review, we use these relationships to extract all the words that modify the aspect of the entity that has been reviewed and we use them as features. If no word is related to the aspect through these relationships, the word preceding the OTE and the four words following it are used.
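The extraction step described above can be sketched as follows. This is a minimal illustration, not the system's actual code: it assumes the parser output is already available as (relation, governor index, dependent index) triples over 0-based token positions, and it assumes that a word "modifies" the OTE when it sits at either end of one of the listed relations. The function name `modifying_words` and the `"window"` tag for fallback words are hypothetical.

```python
from typing import List, Tuple

# Relations the paper lists as opinion-bearing for a noun or nominal expression.
OPINION_RELATIONS = {"amod", "nsubj", "nsubjpass", "nn", "dep"}

# Assumed input format: (relation, governor_index, dependent_index), 0-based.
Dependency = Tuple[str, int, int]


def modifying_words(tokens: List[str],
                    ote_span: Tuple[int, int],
                    dependencies: List[Dependency]) -> List[Tuple[str, str]]:
    """Return (word, relation) pairs for the words that modify the OTE.

    ote_span is (start, end) in token positions, end exclusive.
    """
    start, end = ote_span
    ote = set(range(start, end))
    found = set()
    for rel, gov, dep in dependencies:
        if rel not in OPINION_RELATIONS:
            continue
        # Keep the word at the other end of any listed arc touching the OTE
        # (assumption: direction of the arc does not matter).
        if gov in ote and dep not in ote:
            found.add((dep, rel))
        elif dep in ote and gov not in ote:
            found.add((gov, rel))
    if found:
        return [(tokens[i], rel) for i, rel in sorted(found)]
    # Fallback from the paper: the previous word and the following four words.
    window = list(range(max(0, start - 1), start))
    window += list(range(end, min(len(tokens), end + 4)))
    return [(tokens[i], "window") for i in window]


tokens = ["The", "pizza", "was", "delicious"]
deps = [("det", 1, 0), ("nsubj", 3, 1), ("cop", 3, 2)]
print(modifying_words(tokens, (1, 2), deps))
```

For the sentence "The pizza was delicious" with "pizza" as OTE, only the "nsubj" arc qualifies, so the sketch returns "delicious" with its relation; with no qualifying arc it falls back to the fixed window.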
PoS tag
In addition, for each of the words that modify an aspect we get its Part of Speech tag (noun, verb, adjective, etc.).

Syntactic relations
As mentioned above, the syntactic relation of each modifying word with the OTE has also been used as a feature.

Polarity
The last feature of our SVM classifier is the polarity of each modifying word according to three lexicons: SentiWordNet, MPQA and eBLR. In addition, the fixed window method has been used for the treatment of negation: if any of the preceding or following 3 words is a negative particle ("not", "n't", "no", "never", etc.), the polarity of the modifying word is reversed (positive -> negative, negative -> positive, neutral -> neutral).
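The fixed window treatment of negation can be illustrated with a short sketch. The particle list below contains only the four examples named in the text (the paper's full list is not given), and the function name `apply_negation` is hypothetical.

```python
from typing import List

# Only the particles quoted in the paper; the complete list is not specified.
NEGATION_PARTICLES = {"not", "n't", "no", "never"}

# Reversal table exactly as stated in the text.
FLIP = {"positive": "negative", "negative": "positive", "neutral": "neutral"}


def apply_negation(tokens: List[str], index: int, polarity: str,
                   window: int = 3) -> str:
    """Reverse the polarity of tokens[index] if a negative particle occurs
    among the `window` preceding or following tokens."""
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    neighbours = tokens[lo:index] + tokens[index + 1:hi]
    if any(tok.lower() in NEGATION_PARTICLES for tok in neighbours):
        return FLIP[polarity]
    return polarity


sentence = ["The", "food", "was", "not", "good"]
print(apply_negation(sentence, 4, "positive"))
```

Here "good" is positive in isolation, but "not" falls inside the 3-word window, so the sketch returns the reversed label.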
SentiWordNet is a lexical resource that assigns to each synset of WordNet (Miller, 1995) three sentiment scores (positivity, negativity and objectivity) that describe how positive, negative and objective the terms contained in the synset are.
MPQA is a subjectivity lexicon formed by over 8000 subjectivity clues. For each word, it has information about its prior polarity, its part of speech tag and its grade of subjectivity (strong or weak).
Finally, eBLR is an enriched version of the Bing Liu Lexicon that we explain below. As is well known in the SA research community, the semantic orientation of a word is domain-dependent. Therefore, we decided to generate a list of opinion words for the restaurant domain, taking the Bing Liu Lexicon as baseline and using the training data for the restaurant domain supplied by the organization. For this, we have employed a corpus-based approach following the methodology of Molina-González et al. (2013), which uses a sentiment-labeled corpus to select the most frequent positive and negative words. A word is added to the list of positive opinion words if it appears only in positive reviews and its frequency exceeds a certain threshold. The same process is followed for negative words. A word that appears in both positive and negative reviews is considered a positive (negative) opinion word if its frequency in positive (negative) reviews exceeds its frequency in negative (positive) reviews by a certain threshold.
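The enrichment procedure can be sketched as below. The thresholds `min_freq` and `min_margin` are hypothetical parameters, since the paper does not report the values used, and the margin is assumed here to be a difference of raw frequencies (it could equally be a ratio). Conflicts between a newly selected word and the opposite base list are not handled in this sketch.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def enrich_lexicon(base_pos: Iterable[str], base_neg: Iterable[str],
                   pos_reviews: List[List[str]], neg_reviews: List[List[str]],
                   min_freq: int = 1, min_margin: int = 2
                   ) -> Tuple[Set[str], Set[str]]:
    """Extend a base opinion lexicon with frequent words from labeled reviews.

    pos_reviews / neg_reviews are tokenized reviews labeled positive / negative.
    min_freq and min_margin are assumed thresholds, not the paper's values.
    """
    pos_counts = Counter(w for review in pos_reviews for w in review)
    neg_counts = Counter(w for review in neg_reviews for w in review)
    positive, negative = set(base_pos), set(base_neg)
    for word in set(pos_counts) | set(neg_counts):
        p, n = pos_counts[word], neg_counts[word]
        if n == 0:
            # Appears only in positive reviews.
            if p > min_freq:
                positive.add(word)
        elif p == 0:
            # Appears only in negative reviews.
            if n > min_freq:
                negative.add(word)
        elif p - n > min_margin:
            positive.add(word)
        elif n - p > min_margin:
            negative.add(word)
    return positive, negative


pos = [["tasty", "food"], ["tasty", "service"], ["tasty", "food", "slow"]]
neg = [["slow", "food"], ["slow", "rude"], ["slow", "rude"], ["slow"]]
print(enrich_lexicon({"good"}, {"bad"}, pos, neg))
```

In this toy corpus "tasty" occurs only in positive reviews and is frequent enough to be added as positive, "rude" is added as negative, "slow" is added as negative because its negative frequency dominates by more than the margin, and "food" is left out because the two frequencies are too close.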

Slot 3 - Laptops domain ABSA
The training data for the laptops domain contains 277 reviews. Each review has several sentences annotated at aspect level with the Entity and Attribute pair (E#A category) towards which each opinion is expressed and the polarity (positive, negative or neutral). In this case no information about the OTE is provided and thus we have followed a different approach from that used in the restaurant domain. We have again run different experiments with an SVM classifier of type C-SVC with a linear kernel and the default configuration, and we have again used 10-fold cross-validation for the assessment, but with different features (Table 2). For this domain we have submitted two runs, one constrained (using only the provided training data) and another unconstrained (using additional resources for training). These are the experiments that provided the best results with the training data, and we have used them for our participation in this domain.
• SINAI B Lap 1 (Exp 1 - constrained). For each <category (E#A pair), polarity> tuple of the training data we have used the polarity as label and, as features, the entity and the specific attribute of this entity that is being reviewed, and all the words of the sentence with their Part of Speech tags.
• SINAI B Lap 2 (Exp 3 - unconstrained). In this case, the features that we have selected for each <category (E#A pair), polarity> tuple of the training data are the entity and the attribute that is being reviewed, all the words of the sentence and the number of positive and negative opinion words according to eBLL. eBLL is an enriched version of the Bing Liu Lexicon for the laptops domain. It has been built using the training data supplied by the organization for the laptops domain, in the same way as the eBLR Lexicon.
Thus, given a category of the test data, it is classified using its features vector and the trained SVM model.
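As an illustration of the unconstrained run, the feature set could be assembled as a sparse dictionary along the following lines. The feature key names, the use of word counts rather than binary indicators, and the example category string are assumptions made for this sketch; the paper does not specify the encoding passed to the SVM.

```python
from typing import Dict, Iterable, List


def laptop_features(category: str, tokens: List[str],
                    pos_lexicon: Iterable[str],
                    neg_lexicon: Iterable[str]) -> Dict[str, int]:
    """Build a sparse feature dictionary for one <category, polarity> tuple.

    category is an "ENTITY#ATTRIBUTE" string; the lexicons stand in for eBLL.
    """
    entity, attribute = category.split("#", 1)
    pos_set, neg_set = set(pos_lexicon), set(neg_lexicon)
    features = {"entity=" + entity: 1, "attribute=" + attribute: 1}
    lowered = [tok.lower() for tok in tokens]
    for tok in lowered:
        key = "word=" + tok
        features[key] = features.get(key, 0) + 1  # assumed: raw counts
    # Number of positive and negative opinion words in the sentence.
    features["n_pos"] = sum(1 for tok in lowered if tok in pos_set)
    features["n_neg"] = sum(1 for tok in lowered if tok in neg_set)
    return features


example = laptop_features("LAPTOP#GENERAL",
                          ["Great", "laptop", "but", "bad", "battery"],
                          {"great"}, {"bad"})
print(example)
```

A dictionary of this kind can then be vectorized and passed to any linear SVM implementation; the constrained run would replace the two lexicon counts with PoS-tagged word features.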

Slot 3 - Out of domain ABSA
For the last domain, the organization has not provided any information until the test set has been released. We only knew that we had to assign a polarity value for each <OTE, category> tuple present in the test data. In this case we have followed an unsupervised approach that we present below.
To classify the sentiment expressed about each OTE, it is important to determine the words that have been used in the review to discuss the aspect. For this, we have employed the Stanford Dependencies Parser and the main relationships for expressing opinion about a noun or nominal expression: "amod", "nsubj", "nsubjpass", "nn" and "dep" (explained in Subsection 2.1). In this way, for each OTE of a review, we use these relationships to extract all the words that modify it and we use them to determine the sentiment expressed about the OTE. If no word is related to the aspect through these relationships, the word preceding the aspect and the four words following it are used.
We calculate the polarity of each OTE through a voting system based on three classifiers: Bing Liu Lexicon, SentiWordNet and MPQA. To do this we determine, with each classifier individually, the polarity of an OTE using the words that modify it. According to the Bing Liu Lexicon, we count the number of positive (pw) and negative (nw) words that modify the OTE and tag it following Equation 1. We use MPQA as a classifier following the same approach, but in this case we take into account the PoS of the modifying words in order to get their polarity. Finally, we employ SentiWordNet, also comparing the number of positive and negative words; however, as this lexicon assigns three sentiment scores to each synset, we calculate the polarity of each modifying word using the Denecke method (Denecke, 2008): we average the positivity, negativity and objectivity scores of all the synsets of the word with the same PoS and assign the word the polarity with the highest average.
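The two scoring rules used by the individual classifiers can be sketched as follows. Equation 1 is not reproduced in this text, so `count_rule` encodes an assumed reading of it (the majority of positive versus negative modifying words decides, ties are neutral). `denecke_polarity` follows the averaging described above, taking the synset scores as plain (positivity, negativity, objectivity) triples and mapping a dominant objectivity score to "neutral", which is also an assumption. Both function names are hypothetical.

```python
from typing import List, Tuple


def count_rule(pw: int, nw: int) -> str:
    """Assumed form of Equation 1: compare positive and negative word counts."""
    if pw > nw:
        return "positive"
    if nw > pw:
        return "negative"
    return "neutral"


def denecke_polarity(synset_scores: List[Tuple[float, float, float]]) -> str:
    """Polarity of a word from the scores of its synsets with the same PoS.

    Each triple is (positivity, negativity, objectivity). The three scores are
    averaged over the synsets and the label with the highest average wins.
    """
    if not synset_scores:
        return "neutral"  # assumed behaviour for words missing from the lexicon
    n = len(synset_scores)
    averages = {
        "positive": sum(s[0] for s in synset_scores) / n,
        "negative": sum(s[1] for s in synset_scores) / n,
        "neutral": sum(s[2] for s in synset_scores) / n,
    }
    # On exact ties, max() keeps the first label in the order listed above.
    return max(averages, key=averages.get)


print(count_rule(2, 1))
print(denecke_polarity([(0.75, 0.0, 0.25), (0.5, 0.25, 0.25)]))
```

In the second call the averaged scores are 0.625, 0.125 and 0.25, so the word is treated as positive; the resulting word-level labels would then be counted and passed to the counting rule, as with the other two lexicons.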
Therefore, an OTE is positive (negative) if at least two classifiers tag it as positive (negative), and neutral otherwise. An OTE may also be affected by negation, so if any of the preceding or following 3 words is a negative particle ("not", "n't", "no", "never", etc.), the OTE polarity is reversed (positive -> negative, negative -> positive, neutral -> neutral).
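The voting and negation steps for the unsupervised system can be combined in a short sketch. The three input labels stand for the outputs of the Bing Liu, MPQA and SentiWordNet classifiers; the particle list again contains only the examples quoted in the text, and the function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

NEGATION_PARTICLES = {"not", "n't", "no", "never"}  # examples from the text only
FLIP = {"positive": "negative", "negative": "positive", "neutral": "neutral"}


def vote(labels: List[str]) -> str:
    """Positive/negative if at least two classifiers agree, else neutral."""
    counts = Counter(labels)
    for label in ("positive", "negative"):
        if counts[label] >= 2:
            return label
    return "neutral"


def classify_ote(labels: List[str], tokens: List[str],
                 ote_span: Tuple[int, int], window: int = 3) -> str:
    """Vote over the three classifier labels, then reverse the result if a
    negative particle occurs within `window` tokens before or after the OTE."""
    start, end = ote_span  # token positions, end exclusive
    polarity = vote(labels)
    context = tokens[max(0, start - window):start] + tokens[end:end + window]
    if any(tok.lower() in NEGATION_PARTICLES for tok in context):
        polarity = FLIP[polarity]
    return polarity


tokens = ["The", "room", "was", "not", "clean"]
print(classify_ote(["positive", "positive", "neutral"], tokens, (1, 2)))
```

In the example two of the three classifiers say positive, but "not" lies within three tokens of the OTE "room", so the final label is reversed.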

Analysis of results
This section shows the results reached in the evaluation of the task using the system described in Section 2. Table 3 presents the official results of our submissions. We also include the results of the best team and the average of all participants for comparison.
A clear difference between the results obtained by our team and the average may be seen in Table 3. Furthermore, the results in the restaurants and laptops domains are worse than those achieved in the training phase (Table 1 and Table 2). Therefore, we have calculated the confusion matrix of each experiment for a deeper analysis (Table 4, Table 5, Table 6 and Table 7).
Table 4, Table 5 and Table 6 show that, in the restaurants and laptops domains, the system has failed mainly in the classification of negative and neutral opinions: it has classified most of them as positive. We think that one of the reasons may be that the training data for the restaurants and laptops domains is unbalanced (Table 8). For restaurants, the number of positive opinions is almost three times the number of negative opinions. Another possible reason, in the restaurants domain, is that we have only taken into account the scope (words that modify the OTE) and not the whole context (all words present in the review). In future work, we will run experiments balancing the datasets in order to test how the system behaves. Furthermore, we will take into account the whole context in the restaurants domain to see whether that improves the system.
Regarding the unsupervised system, which has been tested on the hotels domain, there are also differences with respect to the mean accuracy of all teams (Table 3). This is a first approach that can be improved by considering other relationships to determine which words modify the OTE and by a more exhaustive treatment of negation. In future work we will consider these possible improvements.
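For readers who want to reproduce this kind of error analysis, a confusion matrix over the three polarity labels can be computed with a few lines. The gold and predicted labels below are toy values chosen only to illustrate the bias towards "positive"; they are not the results reported in the tables.

```python
from collections import Counter
from typing import Dict, List

LABELS = ("positive", "negative", "neutral")


def confusion_matrix(gold: List[str],
                     predicted: List[str]) -> Dict[str, Dict[str, int]]:
    """Return matrix[gold_label][predicted_label] = number of instances."""
    counts = Counter(zip(gold, predicted))
    return {g: {p: counts[(g, p)] for p in LABELS} for g in LABELS}


# Toy labels for illustration only (not data from the paper).
gold = ["positive", "negative", "neutral", "negative"]
pred = ["positive", "positive", "positive", "negative"]
matrix = confusion_matrix(gold, pred)
for g in LABELS:
    print(g, matrix[g])
```

Reading the rows for "negative" and "neutral" shows directly how many of those opinions were absorbed by the "positive" class.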