Context-aware Embedding for Targeted Aspect-based Sentiment Analysis

Attention-based neural models have been employed to detect the different aspects and sentiment polarities of the same target in targeted aspect-based sentiment analysis (TABSA). However, existing methods do not specifically pre-train reasonable embeddings for targets and aspects in TABSA. As a result, targets or aspects may share the same vector representation across different contexts, losing context-dependent information. To address this problem, we propose a novel method to refine the embeddings of targets and aspects. This pivotal embedding refinement utilizes a sparse coefficient vector to adjust the embeddings of target and aspect from the context. Hence the embeddings of targets and aspects can be refined from the highly correlative words instead of using context-independent or randomly initialized vectors. Experimental results on two benchmark datasets show that our approach yields state-of-the-art performance on the TABSA task.


Introduction
Targeted aspect-based sentiment analysis (TABSA) aims at detecting aspects according to a specific target and inferring the sentiment polarities corresponding to different target-aspect pairs simultaneously (Saeidi et al., 2016). For example, in the sentence "location1 is your best bet for secure although expensive and location2 is too far.", the sentiment polarity for target "location1" is positive towards aspect "SAFETY" but negative towards aspect "PRICE", while "location2" only expresses a negative polarity about aspect "TRANSIT-LOCATION". This can be seen in Figure 1, where opinions on the aspects "SAFETY" and "PRICE" are expressed for target "location1" but not for target "location2", whose corresponding aspect is "TRANSIT-LOCATION". An interesting phenomenon here is that the "Positive" opinion towards aspect "SAFETY" expressed for target "location1" would change if "location1" and "location2" were exchanged. That is to say, the representations of target and aspect should take full account of context information rather than use a context-independent representation. Aspect-based sentiment analysis (ABSA) is a basic subtask of TABSA, which aims at inferring the sentiment polarities of different aspects in the sentence (Ruder et al., 2016; Chen et al., 2017; Gui et al., 2017; Peng et al., 2018; Ma et al., 2018a). Recently, attention-based neural models have achieved remarkable success in ABSA (Fan et al., 2018; Wang et al., 2016; Tang et al., 2016). For the TABSA task, an attention-based sentic LSTM (Ma et al., 2018b) was proposed to tackle the challenges of both aspect-based sentiment analysis and targeted sentiment analysis by incorporating external knowledge. As a neural model improvement, a delayed memory was proposed to track and update the states of targets at the right time with external memory (Liu et al., 2018).
Despite the remarkable progress made by previous works, they usually utilize context-independent or randomly initialized vectors to represent targets and aspects, which loses semantic information and ignores the interdependence among the specific target, the corresponding aspects, and the context. This matters because targets themselves express no sentiment: the opinions in a given sentence are generally conveyed by words highly correlated with the targets. For example, if the words "price" and "expensive" appear in the sentence, it probably expresses a "Negative" sentiment polarity about aspect "PRICE".
To alleviate the problems above, we propose a novel embedding refinement method to obtain context-aware embeddings for TABSA. Specifically, we use a sparse coefficient vector to select highly correlated words from the sentence, and then adjust the representations of target and aspect to make them more valuable. The main contributions of our work can be summarized as follows:
• We reconstruct the vector representation of the target from the context by means of a sparse coefficient vector, so that the representation of the target is generated from highly correlative words rather than a context-independent or randomly initialized embedding.
• The aspect embedding is fine-tuned to be close to the highly correlated target and away from irrelevant targets.
• Experimental results on SentiHood and Semeval 2015 show that our proposed method can be directly incorporated into embedding-based TABSA models and achieves state-of-the-art performance.

Methodology
In this section, we describe the proposed method in detail. The framework of our proposed method is shown in Figure 2. We represent the word sequence of a given sentence as an embedding matrix X ∈ R m×n, where n is the length of the sentence and m is the dimension of the embeddings. Each word is represented as an m-dimensional embedding x ∈ R m; these include the embedding of the target t ∈ R m, which is randomly initialized, and the embedding of the aspect a ∈ R m, which is the average of its constituent word embeddings or a single word embedding. The sentence embedding matrix X is fed as input into our model to obtain the sparse coefficient vector u via a fully connected layer and a step function successively. The hidden output u is then utilized to compute the refined representations of the target t̃ ∈ R m and the aspect ã ∈ R m. Afterwards, the squared Euclidean distances d(t̃, t) and d(ã, t̃, t̃′) are iteratively minimized to obtain the refined embeddings of target and aspect.
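To make the pipeline concrete, the following minimal sketch (our own illustration, not the authors' code; the function name, the sigmoid choice for the non-linearity, and the mean threshold inside the step function are assumptions) shows how a sparse coefficient vector u could be produced from the sentence matrix X and used to build a context-aware target vector:

```python
import numpy as np

def sparse_coefficients(X, W, b):
    # Fully connected layer with a sigmoid non-linearity, followed by a
    # step function that zeroes entries below the mean, yielding a
    # sparse vector u over the n words of the sentence.
    # X: (m, n) sentence embedding matrix; W: (m,); b: (n,).
    v = 1.0 / (1.0 + np.exp(-(X.T @ W + b)))  # one score per word
    return np.where(v >= v.mean(), v, 0.0)    # keep only salient words

rng = np.random.default_rng(0)
m, n = 4, 6                                   # embedding dim, sentence length
X = rng.normal(size=(m, n))                   # toy sentence embedding matrix
u = sparse_coefficients(X, rng.normal(size=m), rng.normal(size=n))
t_refined = X @ u                             # weighted sum of correlated words
```

The element-wise product and vector addition in Figure 2 correspond to this weighted sum over the surviving (non-zero) word positions.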

Task Definition
Given a sentence consisting of a sequence of words s = {w 1, w 2, . . ., LOC, . . ., w n}, where LOC is a target in the sentence, there will be one or two targets in the sentence, each corresponding to several aspects. There is a pre-identified set of targets T and a fixed set of aspects A. The goal of TABSA can be regarded as producing a fine-grained sentiment expression as a tuple (t, a, p), where p refers to the polarity associated with aspect a, and the aspect a belongs to a target t.
The objective of the TABSA task is to detect the aspect a ∈ A and classify the sentiment polarity p ∈ {Positive, Negative, None} according to a specific target t and the sentence s.
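As a concrete illustration of this tuple formulation, the example sentence from the introduction can be annotated as follows (a toy Python sketch; the helper function is ours, and pairs not listed implicitly carry the label None):

```python
# Gold annotation for the introduction's example sentence, written as
# (target, aspect, polarity) tuples.
sentence = ("location1 is your best bet for secure although expensive "
            "and location2 is too far.")
aspects = ["GENERAL", "PRICE", "TRANSIT-LOCATION", "SAFETY"]
gold = [
    ("location1", "SAFETY", "Positive"),
    ("location1", "PRICE", "Negative"),
    ("location2", "TRANSIT-LOCATION", "Negative"),
]

def label(target, aspect, annotation):
    # Look up the polarity for a (target, aspect) pair; any pair that is
    # not annotated receives the label "None".
    for t, a, p in annotation:
        if (t, a) == (target, aspect):
            return p
    return "None"
```

So the classifier must output one of three labels for every (t, a) pair, e.g. `label("location2", "PRICE", gold)` is "None".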

Target Representation
The idea of target representation is to reconstruct the target embedding from a given sentence according to the highly correlated words in the context. By this means we can extract the correlation between target and context. The target representation is computed as:

t̃ = Xu,

where t̃ is the representation of the target and u ∈ R n is a sparse coefficient vector indicating the importance of different words in the context, defined as:

u = Φ(v),

where Φ is a step function applied to each real value:

Φ(v i) = v i if v i ≥ mean(v), and 0 otherwise,

where mean(·) is an average function, and the vector v is computed by a non-linear function of the basic embedding matrix X:

v = f(X⊤W + b),

where f is a non-linear function such as the sigmoid, and W ∈ R m and b ∈ R n denote the weight vector and bias respectively. The target representation is inspired by the recent success of embedding refinement (Yu et al., 2017). For each target, our reconstruction operation aims to obtain a contextually relevant embedding by iteratively minimizing the squared Euclidean distance between the target and the highly correlative words in the sentence. The objective function is defined as:

min u ∥Xu − t∥ 2 2 + λ Σ i u i,

where the term λu i controls the sparseness of the vector u. Through this iterative procedure, the vector representation of the target is updated until the number of non-zero elements of u satisfies k ≤ c, where k is the number of non-zero elements of u and c is a threshold controlling the number of non-zero entries.
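The iterative minimization described above can be sketched as projected gradient descent over u (our own choice of optimizer for illustration, not necessarily the authors'; the function name, learning rate, and initialization are assumptions):

```python
import numpy as np

def refine_target(X, t, lam=0.1, lr=0.01, c=4, steps=500):
    # Minimize ||X u - t||^2 + lam * sum(u_i) over a non-negative
    # coefficient vector u, stopping once the number of non-zero
    # entries k drops to the threshold c.
    n = X.shape[1]
    u = np.full(n, 1.0 / n)                 # start from uniform weights
    for _ in range(steps):
        grad = 2 * X.T @ (X @ u - t) + lam  # gradient of the objective
        u = np.maximum(u - lr * grad, 0.0)  # project back to u_i >= 0
        if np.count_nonzero(u) <= c:        # at most c correlated words left
            break
    return X @ u, u                         # refined target and coefficients

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 10))                # m=8 dimensions, n=10 words
t = rng.normal(size=8)                      # initial target embedding
t_refined, u = refine_target(X, t)
```

The non-negativity projection plays the role of the step function here: words whose coefficients are driven to zero drop out of the reconstruction.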

Aspect Representation
Generally, the words of the aspects themselves carry the most important semantic information. In coordination with the aspect itself, the context can also reflect the aspect; for example, the word "price" in a sentence is probably relevant to aspect "PRICE". To this end, we refine the aspect representation according to the target representation. By incorporating highly correlated words into the representation of the aspect, every element of the fine-tuned aspect embedding ã is calculated as:

ã i = a i + α(Xu) i,

where α is a parameter that controls the influence of the context on the aspect. For each aspect, the fine-tuning method aims to move the aspect embedding closer to the homologous target and further away from the irrelevant one by iteratively minimizing the squared Euclidean distance. The objective function is thus divided into two parts:

min ã ∥ã − t̃∥ 2 2 − β∥ã − t̃ ′∥ 2 2,

where t̃ is the homologous target and t̃ ′ is the irrelevant one, and β is a parameter that controls the distance from the irrelevant target.
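One way to realize this push-pull fine-tuning is plain gradient descent on the two-part objective. The sketch below is our own illustration (the function name, optimizer, learning rate, and step count are assumptions); α = 1 and β = 0.5 match the values used in our experiments, and with β < 1 the objective is convex, so the iteration converges:

```python
import numpy as np

def refine_aspect(a, context_sum, t_hom, t_irr, alpha=1.0, beta=0.5,
                  lr=0.01, steps=1000):
    # Start from the context-augmented aspect, then iteratively pull it
    # toward the homologous target t_hom and push it away from the
    # irrelevant target t_irr by descending on
    # ||a~ - t_hom||^2 - beta * ||a~ - t_irr||^2.
    a_tilde = a + alpha * context_sum
    for _ in range(steps):
        grad = 2 * (a_tilde - t_hom) - 2 * beta * (a_tilde - t_irr)
        a_tilde = a_tilde - lr * grad
    return a_tilde
```

Setting the gradient to zero shows the fixed point is (t_hom − β t_irr) / (1 − β): the refined aspect lies on the t_hom side of the line through the two targets, away from t_irr.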

Experiments
This section evaluates several deep neural models based on our proposed embedding refinement method for TABSA.
Dataset. Two benchmark datasets, SentiHood (Saeidi et al., 2016) and Task 12 of Semeval 2015 (Pontiki et al., 2015), are used to evaluate our proposed method. SentiHood contains annotated sentences with one or two location target mentions. The whole dataset contains 5,215 sentences: 3,862 sentences containing a single location and 1,353 sentences containing multiple (two) locations.
Location target names are masked by LOCATION1 and LOCATION2 in the whole dataset. Following (Saeidi et al., 2016), we only consider the top four aspects ("GENERAL", "PRICE", "TRANSIT-LOCATION" and "SAFETY") when evaluating aspect detection and sentiment classification. To show the generalizability of our method, we also evaluate our work on another dataset: the restaurants domain of Task 12 for TABSA from Semeval 2015. We remove sentences containing no targets as well as NULL targets, following (Ma et al., 2018b). The whole dataset contains 1,197 targets in the training set and 542 targets in the testing set.
Experiment setting. We use GloVe (Pennington et al., 2014) to initialize the word embeddings in our experiments, and the target embeddings (location1 and location2) are randomly initialized, as are W and b. The parameters c, α and β in our experiments are set to 4, 1 and 0.5 respectively. Given a unit of text s, a list of labels (t, a, p) is provided correspondingly, so the overall task of TABSA can be defined as a three-class classification task for each (t, a) pair with labels Positive, Negative and None. We use macro-average F1, strict accuracy (Acc.) and AUC for aspect detection, and Acc. and AUC for sentiment classification, ignoring the class None, which indicates that a sentence does not contain an opinion for the target-aspect pair (t, a).
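Strict accuracy as we read the evaluation protocol can be sketched as follows (our own minimal implementation, not the official scorer; the instance-keyed dictionary format is an assumption): an instance counts as correct only when every aspect label for it is predicted exactly.

```python
def strict_accuracy(gold, pred):
    # gold, pred: dict mapping an instance id (e.g. "sentence:target")
    # to a dict of aspect -> label. An instance is correct only if its
    # full label dict matches the gold annotation exactly.
    correct = sum(1 for key in gold if pred.get(key, {}) == gold[key])
    return correct / len(gold)
```

Macro-average F1 and AUC are then computed per aspect over the same (t, a) pairs and averaged.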
Comparison methods. We compare our method with several typical baseline systems (Saeidi et al., 2016) and strong models proposed for the TABSA task (Ma et al., 2018b; Liu et al., 2018).
(2) LSTM-Loc (Saeidi et al., 2016): A bidirectional LSTM model that takes the output representation at the index corresponding to the location target.

Comparative Results of SentiHood
The experimental results are shown in Table 1. The classifiers based on our proposed method (RE+Delayed-memory, RE+SenticLSTM) achieve better performance than competitor models for both aspect detection and sentiment classification. In comparison with the previous best-performing model (Delayed-memory), our best model (RE+Delayed-memory) significantly improves aspect detection (by 2.9% in strict accuracy, 2.5% in macro-average F1 and 2.4% in AUC) and sentiment classification (by 1.8% in strict accuracy and 1.4% in AUC) on SentiHood.
The comprehensive results show that incorporating refined context-aware embeddings of targets and aspects into the neural models can substantially improve the performance of aspect detection. This indicates that the refined representation is more learnable and is able to capture the interdependence between an aspect and the corresponding target in the context. On the other hand, the performance of sentiment classification is also clearly improved in comparison with the strong baselines (Delayed-memory and SenticLSTM). This indicates that our context-aware embeddings can capture sentiment information better than models using traditional embeddings, even those incorporating external knowledge.

Comparative Results of Semeval 2015
To illustrate the robustness of our proposed method, a comparative experiment was conducted on Semeval 2015. As shown in Table 2, our embedding refinement method achieves better performance for both aspect detection and sentiment classification than the two original embedding-based models, for aspect detection in particular. Consequently, our method is capable of achieving state-of-the-art performance on different datasets.

Visualization
To qualitatively demonstrate how the proposed embedding refinement improves the performance of both aspect detection and sentiment classification in TABSA, we visualize the proposed context-aware aspect embeddings ã and the original aspect embeddings a, learned with the Delayed-memory and SenticLSTM models, via t-SNE (Maaten and Hinton, 2008). As shown in Figure 3, compared with randomly initialized embeddings, we observe a significantly clearer separation between the different aspects represented by our proposed context-aware embeddings. This indicates that different representations of aspects can be distinguished from the context, and that the commonality of a specific aspect can also be effectively preserved. Hence the model can extract different semantic information according to different aspects, particularly when detecting multiple aspects in the same sentence. The results verify that encoding aspects by leveraging context information is more effective for aspect detection and sentiment classification in the TABSA task.

Conclusion
In this paper, we proposed a novel method for refining the representations of targets and aspects. The proposed method selects a set of highly correlated words from the context via a sparse coefficient vector and then adjusts the representations of targets and aspects. Hence, the interdependence among the specific target, the corresponding aspect, and the context can be extracted to generate superior embeddings. Experimental results demonstrated the effectiveness and robustness of the proposed method on two benchmark datasets for the TABSA task. In future work, we will explore extending this approach to other tasks.

Figure 1: Example of the TABSA task. Highly correlative words and corresponding aspects are in the same color. Entity names are masked by location1 and location2.

Figure 2: The framework of our refinement model. ⊗ is element-wise product, ⊕ is vector addition, and Φ is the step function.

Figure 3: Visualization of the intermediate embeddings learned by the embedding-based models. Different colors represent different aspects.

Table 1: Experimental results on SentiHood. † denotes average score over 10 runs; best scores are in bold.