Topic-Specific Sentiment Analysis Can Help Identify Political Ideology

The ideological leanings of an individual can often be gauged by the sentiment one expresses about different issues. We propose a simple framework that represents a political ideology as a distribution of sentiment polarities towards a set of topics. This representation can then be used to detect the ideological leanings of documents (speeches, news articles, etc.) based on the sentiments expressed towards different topics. Experiments performed using a widely used dataset show the promise of our proposed approach, which achieves performance comparable to other methods despite being much simpler and more interpretable.


Introduction
The ideological leanings of a person within the left-right political spectrum are often reflected in how one feels about different topics and in one's preferences among various choices on particular issues. For example, a left-leaning person would prefer nationalization and state control of public services (such as healthcare), whereas privatization would often be preferred by people who lean towards the right. Likewise, a left-leaning person would often be supportive of immigration and would often talk about immigration in a positive manner, citing examples of the benefits of immigration for a country's economy. A right-leaning person, on the other hand, would often have a negative opinion about immigration.
Most of the existing work on political ideology detection from text has focused on utilizing bag-of-words and other syntactic features to capture variations in language use (Sim et al., 2013; Biessmann, 2016; Iyyer et al., 2014). We propose an alternative mechanism for political ideology detection based on sentiment analysis. We posit that adherents of a political ideology generally have similar sentiment toward specific topics (for example, right-wing followers are often positive towards free markets, lower tax rates, etc.) and thus, a political ideology can be represented by a characteristic sentiment distribution over different topics (Section 3). This topic-specific sentiment representation of a political ideology can then be used for automatic ideology detection by comparing it against the topic-specific sentiments expressed by the content in a document (news article, magazine article, collection of social media posts by a user, utterances in a conversation, etc.).
In order to validate our hypothesis, we consider exploiting the sentiment information towards topics from archives of political debates to build a model for identifying the political orientation of speakers as either right or left leaning, corresponding to Republicans and Democrats respectively within the context of US politics. This is inspired by our observation that the political leanings of debaters are often expressed in debates by way of speakers' sentiments towards particular topics. Parliamentary or Senate debates often bring ideological differences to centre stage, though somewhat indirectly. Heated debates in such forums tend to focus on choices proposed by the executive that are in sharp conflict with the preference structure of the opposition members. Due to this inherent tendency of parliamentary debates to focus on topics of disagreement, the sentiments expressed in debates hold valuable cues for identifying the political orientation of the participants.
We develop a simple classification model that uses a topic-specific sentiment summarization for republican and democrat speeches separately. Initial results of experiments conducted using a widely used dataset of US Congress debates (Thomas et al., 2006) are encouraging and show that this simple model compares well with classification models that employ state-of-the-art distributional text representations (Section 4).

Political Ideology Detection
Political ideology detection is a relatively new field of research within the NLP community. Most of the previous efforts have focused on capturing variations in language use in text representing content of different ideologies. Biessmann (2016) employs bag-of-words features for ideology detection in different domains such as speeches in the German parliament, party manifestos, and Facebook posts. Sim et al. (2013) use a labeled corpus of political writings to infer lexicons of cues strongly associated with different ideologies. These "ideology lexicons" are then used to analyze political speeches and identify their ideological leanings. Iyyer et al. (2014) adopted a recursive neural network architecture to detect the ideological bias of single sentences. In addition, topic models have also been used for ideology detection by identifying latent topic distributions across different ideologies (Lin et al., 2008; Ahmed and Xing, 2010). Gerrish and Blei (2011) connected the text of legislation to the voting patterns of legislators from different parties.

Sentiment Analysis for Controversy Detection
Sentiment analysis has proved to be a useful tool for detecting controversial topics, as it can help identify topics that evoke different feelings among people on opposite sides of an argument. Mejova et al. (2014) analyzed language use in controversial news articles and found that a writer may choose to highlight the negative aspects of the opposing view rather than emphasize the positive aspects of one's own view. Lourentzou et al. (2015) utilize the sentiments expressed in social media comments to identify controversial portions of news articles. Given a news article and its associated comments on social media, they link comments with each sentence of the article (by using a sentence as a query and retrieving comments using the BM25 score). For all the comments associated with a sentence, a sentiment score is then computed, and sentences with large variations in positive and negative comments are identified as controversial. Choi et al. (2010) go one step further and identify controversial topics and their sub-topics in news articles.

Using Topic Sentiments for Ideology Detection
Let D = {. . . , d, . . .} be a corpus of political documents such as speeches or social media postings, and let L = {. . . , l, . . .} be the set of ideology class labels. Typical scenarios would have just two class labels (i.e., |L| = 2), but we outline our formulation for the general case. For a document d ∈ D, l_d ∈ L denotes its class label. Our method relies on a set of topics, each of which is most commonly represented as a probability distribution over the vocabulary. The set of topics over D, which we denote by T, may be identified using a topic modeling method such as LDA (Blei et al., 2003) unless a pre-defined set of handcrafted topics is available. Given a document d and a topic t, our method relies on identifying the sentiment expressed by the content in d towards the topic t. The sentiment could be estimated as a categorical label such as one of positive, negative, and neutral (Haney, 2013). Within our modelling, however, we adopt a more fine-grained sentiment labelling, whereby the sentiment for a topic-document pair is a probability distribution over a plurality of ordinal polarity classes ranging from strongly positive to strongly negative. Let s_{d,t} denote the topic-sentiment polarity vector of d towards t, such that s_{d,t}(x) is the probability of the polarity class x. Stacking the topic-sentiment vectors for all topics yields a document-specific topic-sentiment matrix (TSM):

    S_{d,T} = [ s_{d,t_1} ; s_{d,t_2} ; ... ; s_{d,t_|T|} ]

Each row in the matrix corresponds to a topic within T, with each element quantifying the probability associated with the sentiment polarity class x for the topic t within document d. The topic-sentiment matrix may thus be regarded as a sentiment signature of the document over the topic set T.
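As a concrete illustration, stacking per-topic polarity distributions into a TSM can be sketched as follows; the topic names and probabilities are invented for illustration, not taken from the paper:

```python
import numpy as np

# A document's TSM, assuming 3 topics and 5 ordinal polarity classes
# (strongly negative ... strongly positive); each per-topic distribution
# s_{d,t} below is an invented probability vector.
s_dt = {
    "healthcare":  [0.05, 0.10, 0.20, 0.40, 0.25],
    "immigration": [0.30, 0.35, 0.20, 0.10, 0.05],
    "taxes":       [0.10, 0.20, 0.40, 0.20, 0.10],
}

topics = sorted(s_dt)                      # fix a row ordering over T
tsm = np.array([s_dt[t] for t in topics])  # shape: (|T|, #polarity classes)
print(tsm.shape)  # (3, 5)
```

Each row is a probability distribution, so the rows sum to one; the matrix shape is (number of topics) x (number of polarity classes).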

Determining Topic-specific Sentiments
In constructing TSMs, we make use of topic-specific sentiment estimations as outlined above. Typical sentiment analysis methods (e.g., the NLTK sentiment analyzer, http://text-processing.com/demo/sentiment/) are designed to determine the overall sentiment of a text segment; using such methods to determine topic-specific sentiments is not necessarily straightforward. We adopt a simple keyword-based approach for the task. For every document-topic pair (d, t), we extract the sentences from d that contain at least one of the top-k keywords associated with the topic t. We then collate these sentences in the order in which they appear in d to form a mini-document d_t. This mini-document is passed to a conventional sentiment analyzer, which estimates the sentiment polarity as a probability distribution over sentiment polarity classes; this distribution forms the s_{d,t}(.) vector. We use k = 5 and the RNN-based sentiment analyzer of Socher et al. (2013) in our method.
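A minimal sketch of the mini-document construction, assuming a naive regex sentence splitter and an illustrative keyword list (the actual pipeline would use LDA topic keywords and the RNN sentiment analyzer):

```python
import re

def topic_minidoc(document, topic_keywords, k=5):
    """Build the mini-document d_t: collate, in original order, the
    sentences of `document` that mention any of the topic's top-k
    keywords. The regex sentence splitter is a simplifying assumption."""
    keywords = {w.lower() for w in topic_keywords[:k]}
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    hits = [s for s in sentences
            if keywords & set(re.findall(r"[a-z']+", s.lower()))]
    return " ".join(hits)

doc = ("The bill expands healthcare coverage. I oppose the tax provision. "
       "Hospitals need this funding.")
print(topic_minidoc(doc, ["healthcare", "hospitals", "coverage",
                          "care", "insurance"]))
```

The returned string would then be fed to a conventional sentiment analyzer to obtain the s_{d,t}(.) distribution.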

Nearest TSM Classification
We now outline a simple classification model that uses summaries of TSMs. Given a labeled training set of documents, we would like to find the prototypical TSM corresponding to each label. This can be done by identifying the matrix that minimizes the cumulative deviation from the TSMs of the documents bearing that label:

    S_{l,T} = argmin_M Σ_{d ∈ D : l_d = l} || S_{d,T} − M ||_F^2

where ||M||_F denotes the Frobenius norm. It turns out that such a label-specific signature matrix is simply the mean of the topic-sentiment matrices corresponding to the documents that bear the respective label:

    S_{l,T} = (1 / |{d : l_d = l}|) Σ_{d : l_d = l} S_{d,T}
For an unseen (test) document d′, we first compute its TSM S_{d′,T} and assign it the label whose prototype TSM is most proximal to S_{d′,T} under the Frobenius distance.
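The prototype computation and nearest-TSM assignment can be sketched as follows; the toy TSMs are synthetic and only illustrate the mechanics:

```python
import numpy as np

def prototype_tsms(tsms, labels):
    """Per-label prototype TSM: the mean of the training TSMs bearing
    that label (the Frobenius-norm minimizer described in the text)."""
    return {l: np.mean([m for m, y in zip(tsms, labels) if y == l], axis=0)
            for l in set(labels)}

def nearest_tsm_label(tsm, prototypes):
    """Assign the label whose prototype TSM is closest in Frobenius norm."""
    return min(prototypes, key=lambda l: np.linalg.norm(tsm - prototypes[l]))

# Toy example: 2 topics x 3 polarity classes, with synthetic class structure.
rng = np.random.default_rng(0)
train = [np.eye(2, 3) + 0.1 * rng.random((2, 3)) for _ in range(4)] + \
        [np.ones((2, 3)) / 3 + 0.1 * rng.random((2, 3)) for _ in range(4)]
labels = ["R"] * 4 + ["D"] * 4
protos = prototype_tsms(train, labels)
print(nearest_tsm_label(np.eye(2, 3), protos))  # -> R
```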

Logistic Regression Classification
In two-class scenarios with labels such as {left, right} or {democrat, republican}, as in our dataset, TSMs can be flattened into a vector and fed to a logistic regression classifier that learns weights, i.e., coefficients for each (topic, sentiment polarity class) combination. These weights can then be used to estimate the label of a new document by applying them to its TSM.
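A sketch of the flattening-plus-logistic-regression idea, using a plain numpy gradient-descent fit and synthetic TSMs whose Dirichlet parameters are invented so that the two classes lean in opposite sentiment directions (not the paper's data or implementation):

```python
import numpy as np

# Synthetic flattened-TSM features: 20 "R" and 20 "D" documents over
# 50 topics x 5 polarity classes.
rng = np.random.default_rng(42)
n_topics, n_classes = 50, 5
tsm_R = rng.dirichlet([1.0, 1.0, 2.0, 3.0, 4.0], size=(20, n_topics))
tsm_D = rng.dirichlet([4.0, 3.0, 2.0, 1.0, 1.0], size=(20, n_topics))
X = np.vstack([tsm_R, tsm_D]).reshape(40, -1)  # one flattened TSM per row
y = np.array([1] * 20 + [0] * 20)              # 1 = R, 0 = D

# Plain gradient-descent logistic regression: one coefficient per
# (topic, polarity class) combination.
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

train_acc = np.mean(((X @ w + b) > 0) == (y == 1))
print(w.shape)  # (250,)
```

Each of the 250 learned weights corresponds to one (topic, polarity class) pair, which is what makes the model directly interpretable.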

Dataset
We used the publicly available Convote dataset (Thomas et al., 2006) for our experiments. The dataset provides transcripts of debates in the House of Representatives of the U.S. Congress for the year 2005. Each file in the dataset corresponds to a single, uninterrupted utterance by a speaker in a given debate. We combine all the utterances of a speaker in a given debate into a single file to capture the speaker's different opinions and viewpoints about the debate topic. We call this document the view point document (VPD), representing the speaker's opinion about different aspects of the issue being debated. The dataset also provides the political affiliations of all the speakers: Republican (R), Democrat (D), and Independent (I). With only six documents for the Independent class (four in training, two in test), we excluded that class from our evaluation. Table 1 summarizes the statistics of the dataset and the distribution of the classes. We obtained 50 topics using the LDA implementation in Mallet run over the training dataset. The topic-sentiment matrices were obtained using the Stanford CoreNLP sentiment API, which provides probability distributions over a set of five sentiment polarity classes.

Methods
In order to evaluate our proposed TSM-based methods, viz., nearest class (NC) and logistic regression (LR), we compare them against the following baselines in our empirical evaluation.

1. GloVe-d2v: We use pre-trained GloVe (Pennington et al., 2014) word embeddings to compute a vector representation of each VPD by averaging the GloVe vectors of all the words in the document. A logistic regression classifier is then trained on the vector representations thus obtained.
2. GloVe-d2v+TSM: A logistic regression classifier trained on the GloVe features as well as TSM features.
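The GloVe-d2v averaging step can be sketched as follows; the two-dimensional toy embeddings stand in for real 300-dimensional pre-trained GloVe vectors:

```python
import numpy as np

def doc_vector(tokens, embeddings, dim):
    """Average pre-trained word vectors over a document (the GloVe-d2v
    representation); out-of-vocabulary words are simply skipped."""
    vecs = [embeddings[w] for w in tokens if w in embeddings]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

# Toy 2-d "embeddings" standing in for pre-trained GloVe vectors.
emb = {"tax": np.array([1.0, 0.0]), "cuts": np.array([0.0, 1.0])}
print(doc_vector("tax cuts help".split(), emb, dim=2))  # [0.5 0.5]
```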

Results
Table 2 reports the classification results for the different methods described above. TSM-NC, the method that performs simple nearest-class classification over TSM vectors, achieves an overall accuracy of 57%. A logistic regression classifier trained on TSM vectors as features, TSM-LR, achieves a significant improvement with an overall accuracy of 65.04%. The word-embedding baseline, GloVe-d2v, achieves slightly lower performance with an overall accuracy of 64.30%. However, we note that the per-class performance of the GloVe-d2v method is more balanced, with about 64% accuracy for both classes, whereas TSM-LR achieves about 76% for the R class but only 52% for the D class. The results obtained are promising and lend weight to our hypothesis that the ideological leanings of a person can be identified through fine-grained sentiment analysis of the viewpoints the person holds towards different underlying topics.

Discussion
Towards analyzing the significance of the results, we begin by drawing attention to the format of the data used in the TSM methods. The document-specific TSMs do not contain any information about the topics themselves, but only about the sentiment in the document towards each topic; recall that s_{d,t}(.) quantifies the strength of the sentiment in d towards topic t. Thus, in contrast to distributional embeddings such as doc2vec, TSMs contain only the information that directly relates to sentiment towards specific topics learnt from across the corpus. The results indicate that TSM methods achieve performance comparable to doc2vec-based methods despite using only this small slice of information, which points to the importance of sentiment information in determining political leanings from text. We believe that leveraging TSMs along with distributional embeddings, in a manner that combines the best of both views, would improve the state of the art in political ideology detection. Next, we also studied whether some topics are more polarizing than others and how different topics impact classification performance. We identified polarizing topics, i.e., topics that invoke opposite sentiments across the two classes (ideologies), using the following equation.
    dist(t) = || s_{R,t} − s_{D,t} ||

Here, s_{R,t} and s_{D,t} represent the sentiment vectors for topic t for the Republican and Democrat classes, i.e., the rows corresponding to topic t in the TSMs of the two classes. Table 3 lists the five topics with the largest distance, i.e., the most polarizing topics (top), and the five topics with the smallest distance, i.e., the least polarizing topics (bottom), as computed by the equation above. Note that the topics are represented using their top keywords according to the topic's probability distribution. We observe that the most polarizing topics include topics related to healthcare (H3, H4), military programs (H5), and administration processes (H1 and H2). The least polarizing topics include topics related to worker safety (L3) and energy projects (L2). One counter-intuitive observation is that a topic related to gun control (L4) is amongst the least polarizing topics. This anomaly could be attributed to the small number of speeches on this issue in the training set (only 23 out of 1175 speeches mention guns), which prevents a reliable estimate of the probability distributions. We observed similarly low occurrences for other low-distance topics, indicating the potential for improving the computation of topic-specific sentiment representations with more data. In fact, performing nearest-class classification (TSM-NC) with only the top-10 most polarizing topics improved classification accuracy from 57% to 61%, suggesting that with more data, TSM representations could be learned that better discriminate between ideologies.
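Ranking topics by the distance between the class-level sentiment vectors can be sketched as follows (toy numbers, not the paper's topics):

```python
import numpy as np

# Class-level TSMs for 2 toy topics x 5 polarity classes: on topic 0 the
# two classes lean in opposite directions, on topic 1 both are flat.
tsm_R = np.array([[0.1, 0.1, 0.2, 0.3, 0.3],
                  [0.2, 0.2, 0.2, 0.2, 0.2]])
tsm_D = np.array([[0.3, 0.3, 0.2, 0.1, 0.1],
                  [0.2, 0.2, 0.2, 0.2, 0.2]])

dist = np.linalg.norm(tsm_R - tsm_D, axis=1)  # per-topic L2 distance
ranking = np.argsort(-dist)                   # most polarizing topics first
print(ranking.tolist())  # [0, 1]
```

Topic 0, where the two class distributions diverge, ranks as more polarizing than the flat topic 1.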

Conclusions
We proposed to exploit topic-specific sentiment analysis for the task of automatic ideology detection from text. We described a simple framework for representing political ideologies and documents as a matrix capturing sentiment distributions over topics and used this representation for classifying documents based on their topic-sentiment signatures. Empirical evaluation over a widely used dataset of US Congressional speeches showed that the proposed approach performs on a par with classifiers using distributional text representations. In addition, the proposed approach offers simplicity and easy interpretability of results making it a promising technique for ideology detection. Our immediate future work will focus on further solidifying our observations by using a larger dataset to learn better TSMs for different ideologies. Further, the framework easily lends itself to be used for detecting ideological leanings of authors, social media users, news websites, magazines, etc. by computing their TSMs and comparing against the TSMs of different ideologies.