CU-GWU Perspective at SemEval-2016 Task 6: Ideological Stance Detection in Informal Text

We present a supervised system that uses lexical features, sentiment, semantic dictionaries, and latent and frame semantic features to identify the stance of a tweeter towards an ideological target. We evaluate the performance of the proposed system on Subtask A of SemEval-2016 Task 6: “Detecting Stance in Tweets”. The system yields an average F-score (β = 1) of 63.6% on the task’s test set and was ranked 6th by the task organizers out of 19 judged systems.


Introduction & Background
Automatically identifying a person's ideological stance (or perspective) from written text is a challenging research problem with applications ranging from improving advertisement targeting to planning political campaigns. From a computational viewpoint, perspective detection is closely related to subjectivity and sentiment analysis. A person's perspective normally influences his/her sentiment towards different topics or targets. Conversely, identifying the sentiment of a person towards multiple targets can serve as a cue for identifying his/her perspective. From a social-science viewpoint, the notion of "perspective" is related to the concept of "framing". Framing involves making some topics (or some aspects of the discussed topics) more prominent in order to promote the views and interpretations of the writer (communicator). The communicator can make these framing decisions either consciously or unconsciously (Entman, 1993).
In this paper, we present a system that employs lexical and semantic features to identify the stance of a person ("Favor", "Against" or "None") on topics that are often backed up by one's ideology or belief system, such as abortion, climate change and feminism. We evaluate the performance of the proposed system by participating in Sub-Task A "Supervised Framework" of SemEval Task 6 "Detecting Stance in Tweets" (Mohammad et al., 2016). We build on our previous work on automatic perspective detection (Elfardy et al., 2015) by exploring the use of sentiment, Linguistic Inquiry and Word Count (LIWC) dictionaries, and frame and latent semantics, in addition to standard lexical features, to automatically classify a given tweet according to its stance on the topic of interest.
Most current computational linguistics research on supervised stance detection performs document-level (or post-level) stance classification, whether binary or multiway, and uses a variety of lexical, syntactic and semantic features to identify the stance of a post towards a specific contentious topic/target such as the Israeli-Palestinian conflict, abortion, creationism, gun rights, gay rights, healthcare, legalization of marijuana and the death penalty (Lin et al., 2006; Klebanov et al., 2010; Somasundaran and Wiebe, 2010; Hasan and Ng, 2012; Hasan and Ng, 2013; Elfardy et al., 2015). For a detailed literature review, we direct the reader to our *SEM paper (Elfardy et al., 2015).

Shared Task Description
SemEval Task 6 "Detecting Stance in Tweets" (Mohammad et al., 2016) aims at identifying the stance of a tweeter on several contentious targets. Sub-Task A "Supervised Framework" of the task, in which we participate, focuses on five targets: "Atheism", "Climate Change is a Real Concern", "Feminist Movement", "Hillary Clinton", and "Legalization of Abortion". The training data has a total of 2,814 tweets and the test data has 1,249 tweets. While each tweet can have one of three class labels ("Favor", "Against" or "None"), the official metric calculates the performance of each system as the average F-score (β = 1) of the first two class labels only. Table 1 shows the distribution of the training and test data for each one of the five targets.
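To make the scoring rule concrete, the following is a minimal sketch of a metric that averages the F-score (β = 1) of the "Favor" and "Against" classes while leaving "None" out of the average. The function names and the upper-case label strings are assumptions made for illustration; this is not the organizers' official scoring script.

```python
def f1_for_class(gold, predicted, label):
    """F-score with beta = 1 for one class, from parallel lists of labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0  # no correct prediction for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f1(gold, predicted, labels=("FAVOR", "AGAINST")):
    """Average the per-class F-scores of the scored classes only.

    "NONE" is not averaged in, although mislabelling a "NONE" tweet still
    lowers the precision of the class it was assigned to.
    """
    return sum(f1_for_class(gold, predicted, l) for l in labels) / len(labels)


# Toy labels (made up for illustration, not task data).
gold = ["FAVOR", "AGAINST", "AGAINST", "NONE", "FAVOR"]
predicted = ["FAVOR", "AGAINST", "NONE", "NONE", "AGAINST"]
print(average_f1(gold, predicted))
```

Because only two of the three classes are averaged, a system that predicts "None" for every tweet would score zero under this sketch.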

Approach
Our goal is to see how well lexical and semantic features can help in identifying the stance of a tweeter. We split the data into five subsets according to the target ("Atheism", "Climate Change is a Real Concern", "Feminist Movement", "Hillary Clinton", and "Legalization of Abortion") and train a separate classifier for each one of these targets.

Lexical Features
For lexical features, we use standard n-grams. We apply basic preprocessing to the text by removing all punctuation and numbers and converting all words to lower case. Converting the text to lower case is intended to reduce the sparseness of the data, while excluding punctuation and numbers is meant to avoid overfitting the training data. We use n-grams of length 1 to 3 and exclude the ones that occur in only one training instance.
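The preprocessing and n-gram selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact pipeline used in the system: whitespace tokenization, keeping ASCII letters only, and binary presence features are all assumptions, and every function name is hypothetical.

```python
import re
from collections import Counter


def preprocess(text):
    """Lower-case the text, drop punctuation and digits, split on whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # assumption: keep ASCII letters only
    return text.split()


def ngrams(tokens, min_n=1, max_n=3):
    """Return the set of n-grams (joined by spaces) with min_n <= n <= max_n."""
    return {
        " ".join(tokens[i:i + n])
        for n in range(min_n, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def build_vocabulary(train_tweets, min_doc_freq=2):
    """Keep n-grams that occur in at least min_doc_freq training tweets."""
    doc_freq = Counter()
    for tweet in train_tweets:
        doc_freq.update(ngrams(preprocess(tweet)))  # a set, so one count per tweet
    return sorted(g for g, df in doc_freq.items() if df >= min_doc_freq)


def featurize(tweet, vocabulary):
    """Binary presence vector over the vocabulary (assumed weighting)."""
    present = ngrams(preprocess(tweet))
    return [1 if g in present else 0 for g in vocabulary]


# Toy tweets (made up for illustration).
train = [
    "Climate change is real!!! 100%",
    "Climate change is a hoax.",
    "We need action on climate now",
]
vocab = build_vocabulary(train)
print(vocab)
print(featurize("Is climate change real?", vocab))
```

Setting `min_doc_freq=2` implements the rule of discarding n-grams seen in only one training instance.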

Latent Semantics
Latent semantic models map text from a high-dimensional space such as n-grams to a low-dimensional one such as topics. Most of these models assign a semantic profile to each given sentence (or document) by considering the observed words and assuming that each given document has a distribution over K topics. We apply Weighted Textual Matrix Factorization (WTMF) (Guo and Diab, 2012) to each tweet. The advantage WTMF offers over standard topic models is that, in addition to modeling observed words, it models missing ones. WTMF defines missing words as the whole vocabulary of the training data minus the ones observed in the given tweet/post. Modeling missing words is particularly useful when the input is very short because, in such cases, only a limited context of observed words is present. We use the distributable version of WTMF, which is trained on WordNet (Fellbaum, 2010), Wiktionary definitions and the Brown Corpus and sets the number of topics (K) to 100.
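The role of missing words can be illustrated with a weighted matrix-factorization loss in the spirit of Guo and Diab (2012): observed cells of the term-by-document matrix receive full weight, while missing cells receive a small weight instead of being ignored. The sketch below is our reading of that general formulation, not the distributed WTMF code; the function name and the default values of `missing_weight` and `reg` are illustrative assumptions and are not taken from this paper.

```python
def wtmf_loss(X, P, Q, missing_weight=0.01, reg=20.0):
    """Weighted squared reconstruction error plus L2 regularization.

    X: term-by-document matrix as a list of rows (e.g. TF-IDF values).
    P: one K-dimensional latent vector per term (row of X).
    Q: one K-dimensional latent vector per document (column of X).
    """
    loss = 0.0
    for i, row in enumerate(X):
        for j, x_ij in enumerate(row):
            # Observed words get weight 1; missing words get a small weight.
            weight = 1.0 if x_ij != 0 else missing_weight
            dot = sum(p * q for p, q in zip(P[i], Q[j]))
            loss += weight * (dot - x_ij) ** 2
    loss += reg * sum(v * v for vec in P for v in vec)
    loss += reg * sum(v * v for vec in Q for v in vec)
    return loss


# Toy example with two terms, two documents and K = 1 (made-up values).
X = [[1.0, 0.0], [0.0, 2.0]]
P = [[1.0], [1.0]]
Q = [[1.0], [1.0]]
print(wtmf_loss(X, P, Q, missing_weight=0.01, reg=0.0))
```

Minimizing such a loss over P and Q yields one K-dimensional vector per document, which is the kind of low-dimensional profile used as a feature vector for each tweet.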

Sentiment
As mentioned earlier, sentiment analysis is closely related to stance detection. The ideological stance of a person normally influences his/her sentiment towards different ideological topics. Accordingly, we identify the overall sentiment of each sentence in the given tweet and use this sentiment as a possible indicator of a tweeter's stance. Since tweets are inherently short, we assume that the expressed sentiment targets the ideological topic the tweet is discussing. We use Stanford's sentiment analysis system (Socher et al., 2013) to identify the number of positive, negative and neutral sentences in a given tweet and use these three counts as features for our classifier.
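Turning per-sentence sentiment labels into the three count features can be sketched as below. The sketch does not call the Stanford system; it assumes the sentence labels have already been produced by some sentiment analyzer. The collapsing of a finer-grained scale ("very positive", "very negative") into three classes is an assumption of this example, as are all names.

```python
from collections import Counter

# Assumption: finer-grained labels are collapsed into three coarse classes.
COLLAPSE = {
    "very positive": "positive",
    "positive": "positive",
    "neutral": "neutral",
    "negative": "negative",
    "very negative": "negative",
}


def sentiment_count_features(sentence_labels):
    """Return [#positive, #negative, #neutral] sentences for one tweet."""
    counts = Counter(COLLAPSE[label.lower()] for label in sentence_labels)
    return [counts["positive"], counts["negative"], counts["neutral"]]


# Hypothetical labels for a tweet with three sentences.
print(sentiment_count_features(["Negative", "Very negative", "Neutral"]))
```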

Linguistic Inquiry & Word Count (LIWC)
LIWC (Tausczik and Pennebaker, 2010) uses a set of dictionaries to assign words in a given text to a set of psychologically meaningful categories such as death, science, emotion, space, family, swear words and many others. We apply LIWC to each given tweet in order to estimate the percentage of words belonging to each of the following categories: Feeling, Biological Processes, Body, Health, Sexual, Ingestion, Relativity, Motion, Space, Time, Work, Achievement, Leisure, Home, Money, Religion, Death and Assent.
These categories convey the topics discussed in the given tweet. Table 2 shows the coverage of each of these categories. The coverage for each category corresponds to the percentage of tweets, out of all tweets, that have at least one word in this LIWC category. We calculate the coverage of the used LIWC categories on both the training and test sets for all five targets. By analyzing the numbers in Table 2, we find that the "Atheism" dataset has the highest coverage for the "Religion" category, "Abortion" has the highest coverage for the "Death" category, and "Climate Change" has the highest coverage for "Motion". Moreover, for "Abortion", 25.8% of the tweets opposing it use words in the "Death" category, as opposed to only 8.6% of the tweets favoring it. For "Atheism", the use of the "Religion" category is high among both opposing and favoring tweets: 78.26% of the tweets that favor atheism and 80.26% of the ones that oppose it use words in this category. While LIWC can potentially help in identifying the stance from which a given tweet (or text in general) is written, its main drawback is that it only performs a dictionary look-up on each word, hence judging words in isolation from their context.
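The two quantities discussed above, the per-tweet category percentages used as features and the per-category coverage reported in Table 2, can be sketched as follows. The tiny lexicon is a hypothetical stand-in, since the actual LIWC dictionaries are licensed and much larger, and exact word matching is a simplification of this example.

```python
# Hypothetical toy lexicon standing in for the LIWC dictionaries.
TOY_LEXICON = {
    "religion": {"god", "church", "faith", "pray"},
    "death": {"kill", "murder", "die", "dead"},
}


def category_percentages(tokens, lexicon=TOY_LEXICON):
    """Percentage of a tweet's tokens that belong to each category."""
    total = len(tokens)
    result = {}
    for category, words in lexicon.items():
        hits = sum(1 for t in tokens if t in words)
        result[category] = 100.0 * hits / total if total else 0.0
    return result


def coverage(tweets, lexicon=TOY_LEXICON):
    """Percentage of tweets with at least one token in each category."""
    n = len(tweets)
    result = {}
    for category, words in lexicon.items():
        covered = sum(1 for tokens in tweets if any(t in words for t in tokens))
        result[category] = 100.0 * covered / n if n else 0.0
    return result


# Toy tokenized tweets (made up for illustration).
tweets = [
    ["god", "is", "dead"],
    ["pray", "for", "faith"],
    ["no", "match", "here"],
]
print(category_percentages(tweets[0]))
print(coverage(tweets))
```

Both functions look each token up in isolation, which mirrors the context-insensitivity noted above.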

Frame Semantics
Frame semantics assembles the meanings of different elements in a given piece of text to model the meaning of the whole text (Baker et al., 1998). The basic semantic unit in frame semantics theory is the "frame". A frame is a conceptual structure that refers to a group of related concepts (or elements), where understanding one of these concepts requires an understanding of the whole structure. When any of these concepts is present, it automatically triggers all of the other ones in the reader's mind (Fillmore, 2006). The word that evokes a frame is called the "frame-target". For example, the target for the frame "Killing" can be any kill verb, and the frame elements will include the killer and the victim. We use SEMAFOR, a publicly available frame-semantic parser, to identify all the semantic frames in each given tweet. For example, in the tweet "Because I want young American women to be able to be proud of the 1st woman president.", SEMAFOR identifies the following frames: "Leadership Target:president", "Capability Target:able", "Origin Target:American", "Desiring Target:want", "People Target:women" and "Age Target:young". We create a list of all the frames that occur in the training data and use binary features to indicate the presence/absence of each of them in each given tweet. This set of features provides yet another abstraction for inferring the topics discussed in the given text.

Table 4: Held-Out Test Set Results (measured in average F-score, β = 1, of the "Favor" and "Against" classes)
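The binary frame-presence features can be sketched as below. The sketch assumes that a frame-semantic parser has already produced a list of frame names for each tweet; it does not invoke SEMAFOR, and the function names are hypothetical.

```python
def build_frame_inventory(train_frames):
    """Sorted list of all frame names seen in the training tweets."""
    return sorted({frame for frames in train_frames for frame in frames})


def frame_features(tweet_frames, inventory):
    """Binary vector: 1 if the frame occurs in the tweet, else 0.

    Frames that never occurred in the training data are ignored.
    """
    present = set(tweet_frames)
    return [1 if frame in present else 0 for frame in inventory]


# Frame names for the first tweet follow the example in the text; the
# second training tweet and the test tweet are made up for illustration.
train_frames = [
    ["Leadership", "Capability", "Origin", "Desiring", "People", "Age"],
    ["Killing", "People"],
]
inventory = build_frame_inventory(train_frames)
print(inventory)
print(frame_features(["Killing", "Unseen_frame"], inventory))
```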

Classifier Training
Using the WEKA toolkit (Hall et al., 2009) and the derived lexical and semantic features, we train an SVM classifier for each one of the five targets. We use a radial basis function (RBF) kernel and set the cost parameter C to 1.
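For reference, the radial basis function kernel compares two feature vectors through their squared Euclidean distance. The sketch below shows the standard kernel definition only, not WEKA's implementation; the value of `gamma` is not reported in the text, so the one used here is purely illustrative.

```python
import math


def rbf_kernel(x, y, gamma):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Identical vectors have similarity 1; it decays towards 0 with distance.
print(rbf_kernel([1.0, 0.0], [1.0, 0.0], gamma=0.5))
print(rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5))
```

The cost parameter C then controls the trade-off between a wide margin and misclassified training examples.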

Experiments & Results
For tuning, we split the training data for each target into two subsets: 90% training and 10% development. We evaluate different configurations of the proposed features on the development set and apply the ones that yield the best tuning results to the held-out test set. Specifically, we experiment with the following configurations: (1) Lexical Features (n-grams), (2) WTMF, (3) Sentiment, (4) Frames, (5) LIWC, (6) All Semantic features, (7) All features. We compare the proposed approach against a majority baseline which assigns all tweets to the most frequent class in the data ("Against"). Table 3 shows the results on the development set. N-grams outperform each of the semantic feature sets used separately, which is expected given how well n-grams generally perform on text classification tasks. Moreover, WTMF outperforms all other semantic features except on the "Climate Change" dataset. Overall, we find that "Climate Change" shows different trends and much lower performance than the other four targets. We attribute this to the small size of this set and to its very skewed distribution: only 3.8% of its tweets belong to the "Against" class, which is the most frequent class in the whole dataset. Combining all semantic features yields better results than using any of them separately for three of the five targets ("Abortion", "Climate Change" and "Feminist Movement"), while for the other two targets using WTMF alone outperforms the full set of semantic features. For the same two targets ("Atheism" and "Hillary Clinton"), using only n-grams outperforms combining n-grams with all semantic features. Overall, combining lexical and semantic features yields the best results on the majority of tweets. Accordingly, we used this setup in our official task submission.
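The tuning setup can be sketched as a 90/10 split of each target's training data together with a majority-class baseline. Whether the original split was random or stratified is not stated, so the seeded random shuffle below is an assumption, as are the function names and the toy labels.

```python
import random
from collections import Counter


def split_train_dev(examples, dev_fraction=0.1, seed=0):
    """Shuffle with a fixed seed and hold out dev_fraction as development data."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # assumption: plain random split
    n_dev = max(1, round(len(shuffled) * dev_fraction))
    return shuffled[n_dev:], shuffled[:n_dev]


def majority_baseline(train_labels, n_predictions):
    """Predict the most frequent training label for every tweet."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    return [majority_label] * n_predictions


# Toy data: 20 labelled examples (made up for illustration).
examples = [(i, "AGAINST" if i % 3 else "FAVOR") for i in range(20)]
train, dev = split_train_dev(examples)
baseline = majority_baseline([label for _, label in train], len(dev))
print(len(train), len(dev), baseline)
```

On a target where "Against" is rare, such as "Climate Change", this baseline predicts the wrong class for almost every stance-bearing tweet, which is one way to see why a skewed distribution makes that target harder.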
We apply the top four development-set configurations ("n-grams", "WTMF", "all semantic" and "all features") to the test set. Table 4 shows the held-out test results. While using the combination of lexical and all semantic features yields the best overall performance, using only the semantic features achieves close to the best results. Additionally, using only WTMF yields the best performance on two targets: "Atheism" and "Climate Change". We also find that while the best configuration for each target varies across the development and test sets, the best overall configuration for the majority of tweets remains the same.

Conclusion
In this paper, we explore the use of lexical and several semantic feature sets within a supervised framework to identify the stance of a tweet towards an ideological topic. We evaluate the performance of the proposed approach by participating in SemEval-2016 Task 6: "Detecting Stance in Tweets", which aims at identifying the stance of a given tweet towards five targets: "Abortion", "Atheism", "Climate Change", "Hillary Clinton" and "Feminist Movement". We find that WTMF, a latent semantics model that incorporates both observed and missing words, yields the best results among semantic features. Moreover, combining lexical and all semantic features results in the best performance.