Toward Stance Classification Based on Claim Microstructures

Claims are the building blocks of arguments and the reasons underpinning opinions; analyzing claims is thus important for both argumentation mining and opinion mining. We propose a framework for representing claims as microstructures, which express the beliefs, judgments, and policies about the relations between domain-specific concepts. In a proof-of-concept study, we manually build microstructures for over 800 claims extracted from an online debate. We test the resulting microstructures on the task of claim stance classification, achieving considerable improvements over text-based baselines.


Introduction
In online discussions, users express their opinions using more or less well-structured arguments. The building blocks of these arguments are claims: statements that are in dispute and that we are trying to support with reasons (Govier, 2013). Claims can support or attack other claims, giving rise to complex argumentative structures. Thus, the ability to identify and analyze claims in text is a crucial part of argumentation mining (Moens, 2014; Lippi and Torroni, 2016). Outside the realm of well-structured argumentation, the ability to analyze claims is crucial for tasks such as stance classification (Anand et al., 2011; Hasan and Ng, 2013) and fine-grained opinion analysis (Stoyanov and Cardie, 2008; Yang and Cardie, 2013), as well as the converging task of argument-based opinion mining (Clos et al., 2014; Boltužić and Šnajder, 2014), which aims to uncover the reasons underpinning the opinions.
Previous research has tackled the claim detection task for diverse domains, including legal documents (Palau and Moens, 2009), microtexts (Peldszus and Stede, 2015), Wikipedia articles (Rinott et al., 2015), student essays (Stab and Gurevych, 2017), and user-generated web discourse (Habernal and Gurevych, 2015). Boltužić and Šnajder (2015) addressed the task of identifying prominent claims in online debates, while Boltužić and Šnajder (2016) analyzed the implicit premises between two claims. Recently, Bar-Haim et al. (2017) introduced the claim stance classification task, where classification is done at the claim rather than document level.
In this paper, we address the task of claim analysis from a different angle. While prior work has dealt with claims as textual fragments, we study the possibility of a more precise, domain-specific analysis of claims based on their internal logical structure. The work closest to ours is that of Wyner and Van Engers (2010) and Wyner et al. (2016), who explored normalizing claims from the policy making domain by translating them to Attempto Controlled English (Fuchs et al., 2008), and then mapping them to propositions. In contrast, we propose a framework for representing claims as microstructures: structures expressing the relations between the domain-specific concepts, reflecting the beliefs, value judgments, or desired policies of the claim author. We present a preliminary proof-of-concept study, where we use the proposed framework to manually create microstructures for over eight hundred claims extracted from an online debate.
We envisage that claim microstructures could play an important role in a variety of opinion mining and argument mining tasks, including stance classification, extraction of argumentative structures, analysis of implicit premises, fine-grained opinion mining, identifying prominent claims, and claim matching. To demonstrate the viability of claim microstructures for a downstream task, we look into supervised claim stance classification and show that, even with a simple encoding of microstructures as features, we get substantial improvements on this task over text-based baselines.
The contribution of our work may be summed up as follows: (1) we investigate the feasibility of using microstructures for representing claims, (2) we demonstrate the use of microstructures for stance classification, and (3) to promote further research, we make available the dataset annotated with claim paraphrases and microstructures. 1

Claim Microstructures
We introduce a framework for representing claims from text using logical microstructures whose purpose is to capture the gist of a claim. The initial motivation came from the analysis of our dataset (cf. Section 4), which revealed that a large majority of claims can be conceived of as expressing relations between concepts using a certain modality. Figure 1 shows a claim microstructure bringing together these three elements.
Relations. Many claims can be represented as expressing a relation between two concepts. For example, on the topic of gay rights, the relations may be 'promotes(GayMarriage, Depopulation)' or 'purpose(Love, Procreation)'. There are also comparatively fewer claims that can be expressed via higher-order relations, e.g., 'entails(Constitution, allow(State, GayMarriage))'. Each relation can be negated, e.g., '¬promotes(GayMarriage, Depopulation)' expresses that gay marriage does not cause depopulation.
Concepts. The relations are established between concepts, expressed by noun phrases. For ease of access, these can be arranged into a small, domain-specific taxonomy of concepts. For instance, "gay marriage", "heterosexual marriage", and "religious marriage" all belong under the concept of "marriage". The taxonomic relations could also be useful for later computational processing. Unlike relations, concepts are domain dependent and need to be defined for each new topic.
Modalities. We furthermore observed that the claims express different modalities, which can roughly be categorized into beliefs, value judgments, and policies. We formalize this via unary relations 'believes', 'approves', and 'desires', corresponding to beliefs (factual, religious, and opinion-based), positive value judgment, and desired policy (desired state of affairs), respectively. The three modalities act as a wrapper on the propositional content of the claim, effectively modulating what is being claimed. For instance, 'believes(purpose(Love, Procreation))' expresses the belief that love serves procreation, while 'desires(¬allow(State, GayMarriage))' expresses the wish for the state not to allow gay marriages. Finally, we observed that in a fair number of cases the claims are supported by a reference to a second opinion holder (e.g., the Bible, the state). We represent this by introducing one additional modality layer with the opinion holder as an additional modifier. For instance, 'believes(believes[State](promotes(Marriage, Advancement)))' corresponds to the belief that the state believes gay marriages lead to an advancement. By convention, the opinion holder of the first modality is always the author of the post.

1 http://takelab.fer.hr/claim-micro
Let R, C, and M denote the sets of relations, concepts, and modalities, respectively. Formally, we define a claim microstructure as a quadruple (m₁, m₂, o₂, r), where m₁ ∈ M and (optionally) m₂ ∈ M ∪ {∅} are the modalities, o₂ ∈ C ∪ {∅} is the (optional) second opinion holder, and r = (t, c₁, c₂) ∈ R is the (possibly higher-order) relation between two concepts or relations c₁, c₂ ∈ C ∪ R, conveyed by the relation type t. Table 1 defines the relation types used in this work.
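To make the formal definition concrete, the quadruple can be sketched as a small data structure. This is only an illustration under stated assumptions: the class and field names are hypothetical, relation types are left as free strings because Table 1 is not reproduced here, and the check that the second modality and the second opinion holder always occur together is our reading of the text rather than a stated constraint.

```python
from dataclasses import dataclass
from typing import Optional, Union

# The three modalities named in the text.
MODALITIES = {"believes", "approves", "desires"}


@dataclass(frozen=True)
class Relation:
    """r = (t, c1, c2): relation type t over two concepts or nested relations."""
    rel_type: str
    arg1: Union[str, "Relation"]
    arg2: Union[str, "Relation"]
    negated: bool = False

    def __str__(self) -> str:
        prefix = "¬" if self.negated else ""
        return f"{prefix}{self.rel_type}({self.arg1}, {self.arg2})"


@dataclass(frozen=True)
class Microstructure:
    """(m1, m2, o2, r); m2 and o2 are None when no second opinion holder exists."""
    m1: str
    relation: Relation
    m2: Optional[str] = None
    o2: Optional[str] = None

    def __post_init__(self) -> None:
        if self.m1 not in MODALITIES:
            raise ValueError(f"unknown modality: {self.m1}")
        if self.m2 is not None and self.m2 not in MODALITIES:
            raise ValueError(f"unknown modality: {self.m2}")
        # Assumption: the second modality layer always names its opinion holder.
        if (self.m2 is None) != (self.o2 is None):
            raise ValueError("m2 and o2 must be given together")

    def __str__(self) -> str:
        inner = str(self.relation)
        if self.m2 is not None:
            inner = f"{self.m2}[{self.o2}]({inner})"
        return f"{self.m1}({inner})"


# Two of the examples from the text, rebuilt with the sketch.
policy = Microstructure("desires", Relation("allow", "State", "GayMarriage", negated=True))
second_holder = Microstructure(
    "believes", Relation("promotes", "Marriage", "Advancement"), m2="believes", o2="State"
)
```

Printing `policy` and `second_holder` reproduces the notation used in the examples above, and nesting one `Relation` inside another covers the higher-order case.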
It should be noted that, unlike approaches that consider as claims only the statements that directly support or contest the debating topic, we consider all statements with propositional content. For example, in the context of gay rights, 'believes(purpose(Life, Love))' is a valid claim in our framework, although it neither supports nor contests the topic, i.e., the stance of that claim is neutral.

Data Annotation
We adopted the dataset of Hasan and Ng (2014), which contains user posts from online two-sided debates on a number of topics. For reasons of feasibility, in this study we consider only one topic: "Gay rights". We sampled 100 posts (50 for and 50 against) from this topic. The manual annotation was carried out in two steps. In the first step, the annotators segmented out the individual claims from user posts and paraphrased them into well-articulated claims. In the second step, the annotators translated each paraphrased claim into the corresponding logical microstructure. While in principle the claim microstructures could have been built directly from segments, we chose to introduce the additional step of claim paraphrasing for three reasons. First, we assumed that paraphrasing would help in identifying the segments corresponding to individual claims, since paraphrasing demonstrates understanding. In that respect, our work is similar to that of Wyner and Van Engers (2010), who used a controlled language for paraphrasing the claims. Second, we assumed that paraphrases would make overt the logical structure of claims, making their translation into microstructures easier. Lastly, we assumed that paraphrases could help in identifying the prominent concepts for the domain-specific taxonomy.

Claim Segments and Paraphrases
The purpose of this step was to extract claim segments from user posts, thus separating argumentative from non-argumentative content, and to paraphrase the claims into simple, well-articulated statements. This involves two non-trivial tasks: segmentation and paraphrasing. Arguably, there are many ways in which a post can be segmented into claims, and even more ways in which each segment could be paraphrased. We hypothesize that much of this ambiguity can be reduced by considering these two tasks jointly, and by adopting certain paraphrasing principles aimed at obtaining simplifying paraphrases, i.e., paraphrases that express the essence of the claims devoid of superfluous words and phrases. To this end, we adopted the following nine paraphrasing principles: (1) Argumentativeness: only argumentative text should be paraphrased; (2) Atomicity: a claim should convey a single thought; (3) Authority: experts in claims from expert opinion should be made explicit in the paraphrase; (4) Brevity: paraphrases should keep only the relevant argumentative content; (5) Canonicity: canonical terms and phrases are preferred over idiomatic language; (6) Contextuality: claims should be paraphrased by considering their local and topical context; (7) Declarativity: paraphrases should be in declarative form; (8) Dereferencing: pronouns and nominal references should be resolved; and (9) Explicitness: only explicitly stated information should be paraphrased, and not whatever might be implied by the claim.
The annotation was carried out by one trained annotator and took 25 hours. The 100 user posts yielded 920 claim segments and the same number of paraphrases. Table 2 gives an example. Note that generally the claim segments may overlap, though this is not the case in this example. Overall, the segments covered 79.6% of the text, while the remaining 20.4% may be considered non-argumentative.

Claim Microstructures
In the second step, we asked two annotators (A1 and A2) to translate each of the 920 paraphrases into claim microstructures. The annotators were provided with a domain-specific taxonomy on "Gay rights", compiled based on a manual analysis of claim paraphrases. The taxonomy consists of 150 concepts arranged into a tree of a maximum depth of four. The annotators were instructed to use the existing concepts from the taxonomy, and introduce new ones only if they could not find a suitable one in the taxonomy. They were also instructed not to use microstructures of order higher than two.

User post: Men should fall in love with women that's why they where created and women should get married to men because it makes everything easier.

Segment 1: Men should fall in love with women.
Paraphrase: People of opposite sex should fall in love.
Microstructure: desires(entails(OppositeSex, FallingInLove))

Segment 2: that's why they where created
Paraphrase: Men and women are created to pair.
Microstructure: believes(purpose(MenAndWomen, Procreation))

Segment 3: women should get married to men because it makes everything easier.
Paraphrase: Heterosexual marriages make everything easier.
Microstructure: believes(entails(HeterosexualMarriage, Normal))

Table 2: An example of a user post segmented into three claim segments, each paraphrased and translated into the corresponding claim microstructure.
Out of 920 claim paraphrases, annotator A1 managed to translate 882 claims into 707 distinct microstructures, while annotator A2 translated 842 claims into 767 distinct microstructures. The average annotation effort was 33 hours. The number of claims for which both annotators provided a microstructure is 819 (89%), while the number of claims for which both provided an identical microstructure is only 58 (6.3%). The annotators introduced a total of 157 new concepts, indicating that the initial taxonomy was of too limited a scope. The low annotator agreement and the relatively large number of newly added concepts suggest that a fair amount of ambiguity exists in translating paraphrases to microstructures. Our analysis revealed that, in the majority of cases, the ambiguity is genuine, and in such cases having more candidate microstructures for a single claim can be considered advantageous.
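The coverage and exact-match figures above can be computed with a few lines of code. The sketch below assumes a hypothetical data layout in which each annotator's output is a dictionary from claim identifier to microstructure string, with untranslated claims simply absent; the toy annotations are invented for illustration only.

```python
def agreement_stats(a1, a2, n_claims):
    """Count claims translated by both annotators and those translated identically."""
    both = [cid for cid in a1 if cid in a2]
    identical = [cid for cid in both if a1[cid] == a2[cid]]
    return {
        "both": len(both),
        "both_pct": round(100 * len(both) / n_claims, 1),
        "identical": len(identical),
        "identical_pct": round(100 * len(identical) / n_claims, 1),
    }


# Toy annotations (hypothetical): four claims, one left untranslated by A2.
a1 = {
    1: "believes(purpose(Love, Procreation))",
    2: "desires(¬allow(State, GayMarriage))",
    3: "believes(promotes(GayMarriage, Depopulation))",
    4: "believes(entails(HeterosexualMarriage, Normal))",
}
a2 = {
    1: "believes(purpose(Love, Procreation))",
    2: "desires(¬allow(Government, GayMarriage))",
    3: "believes(promotes(GayMarriage, Depopulation))",
}
toy_stats = agreement_stats(a1, a2, n_claims=4)

# The percentages reported in the text follow from the raw counts.
reported_both_pct = round(100 * 819 / 920, 1)
reported_identical_pct = round(100 * 58 / 920, 1)
```

Exact string match is a strict criterion: two microstructures that differ in a single concept, as for claim 2 in the toy data, count as a full disagreement.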
The analysis also revealed that 'believes' is the most frequent modality, used for about 79% of claims. For A1, 'entails' is by far the most common relation (61%), while A2 made a more balanced use of relations, with the top two being 'has' (21%) and 'entails' (15%). The concepts most frequently used by A1 are homosexuality, homosexual people, and marriage, while for A2 these are The Bible, homosexual people, and government interest.

Table 4: Stance classification macro-averaged F1-score using segments (seg), paraphrases (par), and microstructures (ms) as features. The best result in each group is shown in boldface.
addition of the in-between options (f and a) for indicating implicit or indirect stance. Table 3 shows some examples. On the five classes, we observe a moderate inter-annotator agreement of 0.53 Cohen's κ (Cohen, 1960). The aggregation was done by first removing 16 instances on which the annotators disagreed in stance polarity, and then averaging and rounding the two labels by treating them as numbers from the [−2, +2] interval.
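The aggregation step can be sketched as follows. Several details are assumptions made for illustration: the numeric coding (against = −2, implicit against a = −1, neutral = 0, implicit for f = +1, for = +2), the reading of "disagreement in stance polarity" as labels of opposite sign, and the rule of rounding halves away from zero, which the text does not specify.

```python
import math


def aggregate(label1, label2):
    """Merge two stance labels from [-2, +2]; return None on polarity disagreement."""
    if label1 * label2 < 0:
        # Opposite signs: the instance is removed from the dataset.
        return None
    mean = (label1 + label2) / 2
    # Assumption: halves are rounded away from zero, e.g. 1.5 -> 2 and -0.5 -> -1.
    return int(math.copysign(math.floor(abs(mean) + 0.5), mean))


def aggregate_all(pairs):
    """Aggregate a list of (label1, label2) pairs, dropping removed instances."""
    merged = (aggregate(l1, l2) for l1, l2 in pairs)
    return [label for label in merged if label is not None]
```

For example, the pair (+2, +1) is merged into +2 under this tie rule, whereas the pair (+1, −1) is discarded. A different tie rule would shift some borderline instances toward the neutral or implicit classes.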
Baselines. We compare against two baselines, which, to the best of our knowledge, are considered state of the art for stance classification: (1) a sum of skip-gram vectors (Mikolov et al., 2013) for each word and (2) a tf-idf unigram and bigram representation of a claim. For baselines, we use these representations on claim segments (seg). For the sake of completeness, we also run the baselines on claim paraphrases (par), but note that this serves only as a reference, as obtaining paraphrases is arguably a task that is more difficult to automate than obtaining microstructures.
Microstructures. To represent the claim microstructures (ms), we adopted a simple one-hot encoding scheme: we use one one-hot vector for each of the modalities, relations, relation negations, concepts, and opinion holders, concatenating the vectors into a single feature vector (onehot). In addition, to leverage the taxonomical relations between concepts, we experimented with encoding for each concept its ancestors in the taxonomy, by encoding the nodes along the path leading from the root to the concept (path).
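The two encodings can be illustrated with a minimal sketch. The dictionary layout of a microstructure, the field names, and the toy taxonomy are all hypothetical; the sketch only shows the general idea of concatenating one indicator block per field (onehot) and of adding the ancestors of each concept (path).

```python
FIELDS = ("modalities", "relations", "negated", "concepts", "holders")


def path_to_root(concept, parent):
    """Return the taxonomy nodes from the root down to the given concept."""
    path = [concept]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]


def expand_concepts(ms, parent):
    """Path variant: replace each concept by all nodes on its root-to-concept path."""
    expanded = dict(ms)
    expanded["concepts"] = [n for c in ms["concepts"] for n in path_to_root(c, parent)]
    return expanded


def build_vocabs(items):
    """Map every value seen in each field to a fixed position in that field's block."""
    return {
        f: {v: i for i, v in enumerate(sorted({v for ms in items for v in ms[f]}))}
        for f in FIELDS
    }


def encode(ms, vocabs):
    """Concatenate one indicator block per field into a single feature vector."""
    vec = []
    for f in FIELDS:
        block = [0] * len(vocabs[f])
        for v in ms[f]:
            if v in vocabs[f]:
                block[vocabs[f][v]] = 1
        vec.extend(block)
    return vec


# Toy microstructures: believes(¬promotes(GayMarriage, Depopulation)) and
# desires(allow(State, GayMarriage)).
ms1 = {"modalities": ["believes"], "relations": ["promotes"], "negated": ["promotes"],
       "concepts": ["GayMarriage", "Depopulation"], "holders": []}
ms2 = {"modalities": ["desires"], "relations": ["allow"], "negated": [],
       "concepts": ["State", "GayMarriage"], "holders": []}
vocabs = build_vocabs([ms1, ms2])
onehot_vec = encode(ms1, vocabs)

# Hypothetical taxonomy fragment: GayMarriage is a child of Marriage.
parent = {"GayMarriage": "Marriage"}
path_items = [expand_concepts(ms, parent) for ms in (ms1, ms2)]
path_vec = encode(path_items[0], build_vocabs(path_items))
```

With the path variant, two claims about different kinds of marriage share the `Marriage` feature even when their leaf concepts differ, which is the intended benefit of encoding the taxonomy.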
Models. We used support vector machine (SVM) classification and regression models with an RBF kernel, as implemented in the LibSVM library of Chang and Lin (2011). We trained and evaluated the models on 803 claim instances (either segments, paraphrases, or microstructures) using a 5×3 nested cross-validation, using grid search to optimize hyperparameters C and γ.
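The 5×3 nested cross-validation protocol can be sketched independently of the learner. In the sketch below, `evaluate` is a hypothetical callback that would train an RBF-kernel SVM on the given training indices with the given hyperparameters and return a score on the test indices; the grid values and the unshuffled fold assignment are assumptions made only for illustration.

```python
from itertools import product


def k_folds(indices, k):
    """Split indices into k disjoint, nearly equal folds (no shuffling)."""
    return [indices[i::k] for i in range(k)]


def nested_cv(n_items, grid, evaluate, outer_k=5, inner_k=3):
    """Return the mean outer-fold score, tuning hyperparameters on inner folds only."""
    indices = list(range(n_items))
    outer_scores = []
    for held_out in k_folds(indices, outer_k):
        held = set(held_out)
        train = [i for i in indices if i not in held]

        def inner_score(params):
            # Mean validation score of one hyperparameter setting over the inner folds.
            scores = []
            for val in k_folds(train, inner_k):
                val_set = set(val)
                fit = [i for i in train if i not in val_set]
                scores.append(evaluate(fit, val, params))
            return sum(scores) / len(scores)

        best = max(grid, key=inner_score)
        # The held-out fold is used once, with the setting chosen on the inner folds.
        outer_scores.append(evaluate(train, held_out, best))
    return sum(outer_scores) / len(outer_scores)


# Hypothetical search grid over the two RBF-SVM hyperparameters.
GRID = [{"C": c, "gamma": g} for c, g in product([1, 10, 100], [0.01, 0.1, 1])]
```

The point of the nesting is that the outer test fold never influences the choice of C and γ, so the averaged outer score is not biased by the grid search.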
Tasks. We considered four tasks: (1) a 5-way regression setup, in which the model is trained to predict the numeric stance score, but afterwards the predictions are rounded and mapped to labels, (2) a 5-way classification task, (3) a 3-way classification task in which the implicit labels (a and f) are mapped to neutral (3-way-N), and (4) a 3-way classification task in which the implicit labels are mapped to explicit for and against labels (3-way-E). The last two tasks are easier, so we expected the models to perform better on these tasks.

Results
Table 4 shows the classification results in terms of macro-averaged F1-score. As expected, the 3-way classification tasks are easier than the 5-way classification tasks. Furthermore, the 5-way regression model performs better than the 5-way classifier, suggesting that using a distance-sensitive loss is beneficial for this task. In all four tasks, the claim microstructures considerably outperform both segment-based baselines, yielding between 9 and 25 points of improvement in F1-score, depending on the task. All differences between the baseline and the microstructure model are statistically significant at p<0.05 (tested using a two-tailed permutation test (Yeh, 2000)). By comparing with claim paraphrases as a reference, we find that microstructures give comparable performance for 5-way and 3-way-N classification tasks (the differences are not statistically significant at p<0.05), while for 5-way regression and 3-way-E classification tasks the microstructures outperform paraphrase representations. Finally, the performance difference between one-hot encoded microstructures and microstructures with path-encoded concepts is not statistically significant at p<0.05, suggesting that stance classification did not profit from encoding taxonomical relations.
In the above experiments, the results for microstructures were obtained on annotations of A1. The models trained on annotations of A2 gave consistently lower performance, albeit still better than the baseline, with statistically significant differences.
We conclude the experimental section by noting that microstructures improve claim stance classification performance over a segment-based baseline by a maximum of 50.7% F1-score for a 3-way classification setup.
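The label mappings behind the four task setups and the evaluation measure can be sketched as follows. The numeric coding of labels (against = −2, a = −1, neutral = 0, f = +1, for = +2) is an assumption, as are the function names; the macro-averaged F1 is computed over the union of gold and predicted labels.

```python
def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def to_3way_n(label):
    """3-way-N: implicit labels (a, f) are mapped to neutral."""
    return sign(label) if abs(label) == 2 else 0


def to_3way_e(label):
    """3-way-E: implicit labels (a, f) are mapped to explicit against / for."""
    return sign(label)


def regression_to_label(score):
    """5-way regression: round the predicted score and clip it to [-2, +2]."""
    return max(-2, min(2, round(score)))


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    f1_scores = []
    for lab in sorted(set(gold) | set(pred)):
        tp = sum(g == lab and p == lab for g, p in zip(gold, pred))
        fp = sum(g != lab and p == lab for g, p in zip(gold, pred))
        fn = sum(g == lab and p != lab for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)
```

Because macro-averaging weights every class equally, errors on infrequent stance labels affect the score as much as errors on frequent ones, which matters when the implicit classes are small.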

Conclusion and Future Work
We presented a framework for representing the microstructures of claims. A microstructure expresses the relations between domain-specific concepts, and is intended to capture the beliefs, value judgments, and desired policies conveyed by claims. In the proof-of-concept study, we manually annotated microstructures for one debating topic. Both annotators were able to translate 89% of claims into microstructures, which supports the viability of the approach. We next demonstrated the usefulness of microstructures on the task of claim stance classification, where a simple encoding of microstructures yielded notable performance improvements over segment-based baselines. This in turn suggests that a claim microstructure does a good job of capturing the argumentative gist of the claim.
We note, however, that this is a preliminary study, which has left aside some important practical issues. While our results are promising, the major question now is how to automatically extract the microstructures from text. In our study, the claims were segmented and paraphrased by human annotators; an end-to-end system would need to both segment out the claims and extract the corresponding microstructures. We believe that one way to tackle this problem might be to frame it as an information extraction task.
In our preliminary study, the annotators managed to translate most of the claims into microstructures. However, the low agreement rate (6.3%) suggests that the annotation workflow could perhaps be improved.
Another issue worth investigating is the application of the framework to a new domain: the tedious work of deriving a domain-specific taxonomy of concepts and the microstructures could perhaps be alleviated using active learning methods.
Finally, it would of course be interesting to investigate the use of microstructures in other opinion mining and argument mining tasks, including tasks that could profit from analyzing the logical links between claims. We intend to pursue some of these directions in future work.