Enhancing Bias Detection in Political News Using Pragmatic Presupposition

Usage of presuppositions in social media and news discourse can be a powerful way to influence the readers as they usually tend to not examine the truth value of the hidden or indirectly expressed information. Fairclough and Wodak (1997) discuss presupposition at a discourse level where some implicit claims are taken for granted in the explicit meaning of a text or utterance. From the Gricean perspective, the presuppositions of a sentence determine the class of contexts in which the sentence could be felicitously uttered. This paper aims to correlate the type of knowledge presupposed in a news article to the bias present in it. We propose a set of guidelines to identify various kinds of presuppositions in news articles and present a dataset consisting of 1050 articles which are annotated for bias (positive, negative or neutral) and the magnitude of presupposition. We introduce a supervised classification approach for detecting bias in political news which significantly outperforms the existing systems.


Introduction
In today's situation where we see several instances of social media being used to interfere with politics in controversial ways, the platforms that have been considered as sources of information are now often seen as politically biased. Especially in newspapers and news websites, reporters sometimes selectively emphasize particular viewpoints and present biased information which is aligned with their personal political ideology. This can lead to widespread alteration of mass political opinion and impact the decisions of voters.
In this paper, we aim to establish a correlation between presupposition and bias in political news articles, and use the knowledge of presupposition to enhance the task of automatic bias detection.
Presuppositions are linguistic tools whose function is to enable us to take some information for granted without actually asserting it. For instance, consider the utterance "Sam will visit California again". This utterance presupposes that Sam has visited California before, and asserts that he will visit once again in future.
Based on their function in discourse, Alcarza (1999) classifies presuppositions into two levels: Semantic and Pragmatic. The propositions which the reader or listener assumes to be true come under the class of Semantic Presuppositions. On the other hand, he defines pragmatic presupposition as "the proposition that a writer or a speaker has taken its truth value for granted in his statement. It consists of previous information about the knowledge, beliefs, ideology and scale of values that the reader or listener must be acquainted with in order to understand the meaning".
The notion of pragmatic presupposition is highly useful in analysing media and political discourse such as news articles and election campaign speeches. Using them in an article or a speech could be an indicator of some hidden intentions and strategies, such as avoiding some key information, or manipulating the audience to focus on certain aspects which favour the speaker by indirectly suggesting that they are true.
Similar to this classification is another popular dichotomy which is widely used for studying implicature as Conventional or Conversational. This idea is extended to the context of presuppositions, based on how they arise (Simons, 2013). Karttunen and Peters (1979) define a presupposition as conventional when the presuppositional content arises due to the properties of lexical items present in a sentence. In their view, "certain lexical items have, in addition to their truth content, a special presuppositional content, which is carried through the compositional process to produce a propositional presupposition." On the other hand, there can be some presuppositions which do not contain any lexical triggers. Stalnaker (1974) defines them as Conversational Presuppositions. He suggests that they are the "inferences which are licensed by general conversational principles, in combination with the truth conditions of the presupposing utterance".

Related Work
Though there have been several speculations in the linguistic research community about the extra linguistic information provided by the use of presuppositions, very few of them are backed up with proper surveys and observations. The initial direction towards such research was motivated by Van Dijk's idea that in Critical Discourse Analysis, one should closely look at the propositions which in turn suggest some other propositions to be true, but in fact are either not true or controversial. He pointed out some examples from Opinion Discourse (Van Dijk, 1995). For instance, the editorial sections of news usually contain a lot of such propositions which aid in persuading the reader to agree to the given interpretation of some news in the editorial. Wang (2010) conducted a study on how presuppositions can make newspaper advertisements more effective by compensating for the small place occupied by them. He argued that when an advertisement directs the readers to infer some data which is not directly mentioned, they tend to pay more attention to the product being advertised.
Bekalu (2006) took a small sample of data from 5 newspapers and manually analysed the use of presuppositions in the articles, showing how they can help differentiate between the styles of reporting associated with the pro-government and anti-government stances of the newspapers.
However, none of these studies have tested the validity of their claims on a large corpus and no computational work has been done in this domain so far.
Moreover, all of the above research was carried out for English news, and there has been little work on politics and news discourse in Telugu, which is a low-resource language. Mukku et al. (2016) applied ML techniques for Sentiment Analysis of Telugu news articles. Kameswari and Mamidi (2018) carried out a case study on political influence through linguistic choices on a corpus of election campaign speeches. Gangula et al. (2019) proposed an attention mechanism for the detection of bias in Telugu news articles. To our knowledge there has been no work on presupposition in Telugu to date.
Our research is the first of its kind which proposes guidelines to identify presuppositions in political news and use that information to enhance the computational methods to detect bias in political news articles.

Corpus Creation and Annotation
To validate our idea computationally, we need a large dataset of news articles which have been annotated for their bias and magnitude of presupposition. There is no such dataset which captures both the features, so we took the corpus created by Gangula et al. (2019) and modified it. It consists of 1329 articles collected from various newspapers in Telugu, a Dravidian language spoken widely in Telangana and Andhra Pradesh in India. Each article was annotated with one of the 6 labels they chose: BJP, TDP, Congress, TRS, YCP and None. The first five labels represent the bias towards or against those parties (marked by "Positive" or "Negative" in their dataset), and "None" denotes that the article is unbiased.
We created a modified version of the corpus according to our requirement as follows. The original corpus consisted of 218 unbiased articles and 1111 articles which had bias towards some party. Out of those, it was found that some were very short, and some had very little or no mention of any political parties or events. Such articles were filtered out and we were left with 1050 articles, of which 850 were biased and 200 were unbiased. Since our main aim was to see the contribution of presupposition to the biased content in the article, we did not keep the existing labels of "Positive" and "Negative" to denote the direction of bias. All the biased articles were labelled with a bias label of 1 and unbiased articles with 0.
Our annotated corpus is publicly available to ensure reproducibility of the results and to facilitate further research in this domain.

Annotating for presuppositions
For our purpose, there is a need for a systematic way to identify and quantify the presuppositions in an article. After discussions and observation of several articles, we came up with a novel annotation scheme and guidelines.

Annotation Scheme
Each article is split into individual sentences. Each sentence is given a score of 1 if it contains any pragmatic presupposition which the reader is not expected to know. If no such presuppositions are present, the sentence is given a score of 0. After evaluating all the individual sentences, the score of the article is calculated as the mean of all the individual sentence scores.
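The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming the per-sentence 0/1 scores have already been assigned by an annotator; the function name and the example scores are hypothetical and do not come from the corpus.

```python
from statistics import mean


def article_presupposition_value(sentence_scores):
    """Return the mean of the per-sentence 0/1 presupposition scores."""
    if not sentence_scores:
        raise ValueError("an article needs at least one scored sentence")
    if any(score not in (0, 1) for score in sentence_scores):
        raise ValueError("sentence scores must be 0 or 1")
    return mean(sentence_scores)


# Hypothetical article with five sentences, two of which were marked 1.
example_scores = [1, 0, 0, 1, 0]
example_value = article_presupposition_value(example_scores)
```

Because every sentence score is 0 or 1, the article value is simply the fraction of sentences marked as carrying a presupposition, so it always lies between 0 and 1.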

Annotation guidelines
To ensure consistency in annotation as well as to capture the linguistic information at both semantic and pragmatic levels, we propose the following annotation guidelines:
1. Coreference: If an article contains multiple references to an entity such as a person or an organization, each sentence containing such reference is marked as 1 if it is not expected to be known by the reader or requires additional background information. In other cases, the sentence is marked as 0.
2. Deixis: If any person, place, time or discourse deixis is observed in a sentence, we recursively go to the previous reference of the entity in the article. If there is sufficient context in the article to resolve deixis, the sentence is marked as 0. However, if all the previous references are marked as 1, the sentence is marked as 1.
3. Presence of certain verbal suffixes: If there is any reference to the events in the past/present or some party policies which were not described and do not fall under the minimum knowledge the reader is expected to have, then the sentence is marked as 1. In Telugu, such references are generally identified by morphological suffixes such as -ina, -ani, -tuna, -unTE, etc.
e.g: Dilli lO ErpATu cEsina dharnA "The strike organised in Delhi"
4. Verbal Nouns: If a sentence contains one or more verbs in nominal form indicating change or continuation of state, then it is marked as 1.
e.g: telaNGANA dEsam lO agrasthAnam lO konasAgaDam "Telangana continuing being in the first position in the country"
5. Rhetorical Questions: If a sentence contains some rhetorical question which is suggestive of some action which is not common knowledge, then the sentence is marked as 1.
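As an illustration of the verbal-suffix guideline, the sketch below flags tokens that end in one of the suffixes listed above. It is only a hypothetical pre-filter for a human annotator, not the annotation procedure used for the corpus: the guideline also requires judging whether the referenced event is within the reader's expected knowledge, which a suffix match cannot decide. The sketch assumes whitespace-separated text in the same case-sensitive transliteration as the examples, and the function name is an assumption.

```python
# Suffixes named in the verbal-suffix guideline, written in the same
# transliteration as the examples in the guidelines (case-sensitive).
TRIGGER_SUFFIXES = ("ina", "ani", "tuna", "unTE")


def candidate_tokens(sentence, suffixes=TRIGGER_SUFFIXES):
    """Return the tokens that end in a listed suffix and are longer than it."""
    flagged = []
    for token in sentence.split():
        if any(token.endswith(s) and len(token) > len(s) for s in suffixes):
            flagged.append(token)
    return flagged


# Example phrase taken from the guideline: "The strike organised in Delhi".
example_flags = candidate_tokens("Dilli lO ErpATu cEsina dharnA")
```

A sentence with at least one flagged token would still need manual review before being scored 1; a plain suffix match will also fire on words that merely happen to end in the same letters.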

Experiments
Our goal is to detect political bias in an article with and without the presupposition information, and compare the results. For this purpose, similar to Gangula et al. (2019), we frame political bias detection as a classification problem. The presence or absence of bias (1 or 0) is treated as the label, and the task is to assign an appropriate label to a news article.
The first step is to represent each article as a vector. Since each vector can be extremely large and sparse, a chi-square feature selection algorithm is applied, which reduces the size of the vector to 10000.
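To make the feature selection step concrete, the following sketch scores each term with the standard chi-square statistic for a 2x2 table (term present or absent against biased or unbiased) and keeps the k highest-scoring terms. The source does not state which vectorisation or chi-square implementation was used, so the binary term-presence representation, the function names and the toy documents here are assumptions for illustration only.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Score each term by the 2x2 chi-square statistic of presence vs. label.

    docs is a list of token lists; labels holds 1 (biased) or 0 (unbiased).
    """
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    present_pos, present_neg = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        target = present_pos if label == 1 else present_neg
        target.update(set(tokens))  # count each term once per document
    scores = {}
    for term in set(present_pos) | set(present_neg):
        a = present_pos[term]  # biased documents containing the term
        b = present_neg[term]  # unbiased documents containing the term
        c = n_pos - a          # biased documents without the term
        d = n_neg - b          # unbiased documents without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k terms with the highest score (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Toy documents with placeholder tokens; in the paper's setting k would be 10000.
toy_docs = [["a", "x"], ["a", "y"], ["b", "x"], ["b", "y"]]
toy_labels = [1, 1, 0, 0]
toy_scores = chi_square_scores(toy_docs, toy_labels)
toy_selected = select_top_k(toy_scores, 2)
```

In the toy data the terms "a" and "b" each occur in only one class and receive the highest score, while "x" and "y" are spread evenly across both classes and score zero, which is the behaviour that makes the statistic useful for discarding uninformative dimensions.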
We performed experiments with the following six classifiers:
1. Bernoulli Naive Bayes: Naive Bayes (NB) classifier is a probabilistic classifier which uses Bayes Theorem. It evaluates the probability of an event given the probability of another event which has previously occurred. Bernoulli Naive Bayes is a binomial model, particularly useful if the feature vectors are binary (i.e., 0s and 1s).
2. Multinomial Naive Bayes
4. Support Vector Machines (SVM): SVM is a non-probabilistic classifier which constructs a set of hyperplanes in a high-dimensional space separating the data into classes. We implemented SVM with radial basis function as the choice of kernel.
5. Random Forest Classifier: Random Forest (RF) is an ensemble of Decision Trees, which are structures that use a tree-like model for the decisions and likely outcomes. Random Forests construct multiple decision trees and take each of their predictions into consideration for giving the final output.
6. Multi Layer Perceptrons (MLP): A multilayer perceptron (MLP) is a feed-forward artificial neural network model which maps input data sets onto an appropriate set of outputs.
Table 1: Average Accuracy (Acc) in percentage and F1 score for each experiment
For training, Scikit-learn (Pedregosa et al., 2011) implementations were used for all the classifiers with default hyperparameters. We conducted each experiment four times, each differing in the input given to the classifier. The four categories of inputs were:
1. Article
2. Headline + Article
3. Article + Presupposition Value
4. Headline + Article + Presupposition Value
In categories 2, 3 and 4, the entities were concatenated to form a final vector which was given to the classifier. In all the experiments, 10-fold cross validation was carried out. The accuracy and F1 scores for each experiment were calculated.
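The input construction and the cross-validation split can be sketched as follows. This is an illustration in plain Python rather than the Scikit-learn code used in the experiments; the function names are hypothetical, the concatenation order is a guess, and category 3 is taken here to be the article vector plus the presupposition value, which is what the comparison of categories 1 and 3 in the results implies. The folds are contiguous, with no shuffling or stratification, because the source does not specify either.

```python
def build_input(article_vec, headline_vec=None, presupposition=None):
    """Concatenate the available parts into one flat feature vector."""
    features = []
    if headline_vec is not None:
        features.extend(headline_vec)
    features.extend(article_vec)
    if presupposition is not None:
        features.append(presupposition)
    return features


def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists for k contiguous folds."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and the number of items")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


# The four input categories for one hypothetical article.
article, headline, value = [0.1, 0.2], [0.3], 0.4
category_1 = build_input(article)
category_2 = build_input(article, headline_vec=headline)
category_3 = build_input(article, presupposition=value)
category_4 = build_input(article, headline_vec=headline, presupposition=value)
```

With the 1050 articles of the corpus and k = 10, each fold holds 105 test articles and the remaining 945 are used for training.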

Observations and Results
After assigning a presupposition value to each article, we calculated the mean presupposition value for each category. Biased articles have a mean presupposition value of 0.46, whereas for unbiased articles it was found to be 0.15. Figure 1 shows the distribution of articles in our dataset in terms of their bias and presupposition values. It can be seen that, for unbiased articles, the density of articles decreases as the presupposition value increases, with most of them lying in the 0.15 to 0.3 range and none having a value higher than 0.6. On the other hand, there were many biased articles with relatively higher values, most of them in the range 0.4 to 0.7, and the maximum value observed was 1.0. Based on the average value of presupposition and the plot in Figure 1, we conclude that biased articles usually tend to have higher presupposition content.
The experimental results are shown in Table 1. Adding the headline yields only a small improvement, whereas adding the presupposition information yields a significant improvement in the performance of each classifier. This is seen by comparing the difference in performance between categories 1 and 2 with the difference in performance between categories 1 and 3 in Table 1. From this, we understand that the knowledge about presupposition contributes more to the detection of bias than the headline. The highest performance for each classifier is observed in category 4, where each classifier has information about the headline, article and the presupposition value. Figure 2 shows the performance of each classifier for the four categories of inputs discussed in Section 4. The highest performance is achieved by the Multi Layer Perceptron classifier, with an accuracy of 96.49% and an F1 score of 95.68. We can observe that MLP and SVM with RBF kernel, which are non-linear, perform better than all other models. In our task, MLP achieves an improvement of 6.95% over the Attention Network proposed by Gangula et al. (2019), which had the previous best performance in the task of political bias detection. This is an example of improving performance by incorporating sophisticated linguistic features, to the point where a simple multilayer perceptron extended with such features performs better than the state-of-the-art attention-based model.
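For reference, accuracy and F1 can be computed from predicted and true labels as in the sketch below. It assumes a binary F1 with the biased class (label 1) as the positive class; the source does not state which averaging was used, and the function name and toy labels are hypothetical.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, with `positive` as the target class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)


# Toy labels, not taken from the experiments.
toy_accuracy, toy_f1 = accuracy_and_f1([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

Since 850 of the 1050 articles are biased, a classifier that always predicts the biased class would already be correct on about 81% of the articles, which is why F1 is a useful companion to accuracy on this corpus.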

Conclusions and Future Work
In this paper, we identified a correlation between bias and presupposition in news articles. We showed that pragmatic presupposition contributes towards bias in a news article. Using this information, we developed a supervised method for automatic detection of bias in news articles, along with guidelines to identify and annotate presuppositions and a manually annotated dataset to enable further research. The results of our experiments show that our model significantly outperforms all the previous models.
Though we used only news articles for our experiments, our idea is also applicable to other forms of opinion discourse such as Social Media texts, reviews, blogs, etc. where bias in text could lead to spread of misinterpreted information at various levels.

Future Work
Continuing this work, we plan to come up with an improved scheme for classifying presuppositions into various categories and modified guidelines to annotate accordingly. Subsequently we wish to develop tools to automate the process of presupposition annotation and extend our idea to check whether we can predict the polarity of bias (positive/negative) by the kind of presuppositions present in the text.
We would also like to extend our annotated corpus to accommodate English and other Indian languages by using other corpora such as NELA-GT-2018 (Nørregaard et al., 2019) and come up with better multilingual deep learning models.