Understanding Negation in Positive Terms Using Syntactic Dependencies

This paper presents a two-step procedure to extract positive meaning from verbal negation. We first generate potential positive interpretations by manipulating syntactic dependencies. Then, we score them according to their likelihood. Manual annotations show that positive interpretations are ubiquitous and intuitive to humans. Experimental results show that dependencies are better suited than semantic roles for this task, and that automation is possible.


Introduction
Negation is a complex phenomenon present in all human languages, allowing for the uniquely human capacities of denial, contradiction, misrepresentation, lying, and irony (Horn and Wansing, 2015). Despite negation always being marked (in the absence of a negation cue, statements are positive), acquiring and understanding sentences that contain negation is more challenging than those that do not. Children acquire negation after learning to communicate (Nordmeyer and Frank, 2013), and adults take longer to process negated statements than positive ones (Clark and Chase, 1972).
In any given language, humans communicate in positive terms most of the time, and use negation to express something unusual or an exception (Horn, 1989). Although most sentences are affirmative, negation is ubiquitous (Morante and Sporleder, 2012): in scientific papers, 13.76% of statements contain a negation (Szarvas et al., 2008); in product reviews, 19% (Councill et al., 2010); and in Conan Doyle stories, 22.23% (Morante and Daelemans, 2012). In OntoNotes (Hovy et al., 2006), 10.15% of statements contain a verb negated with not, n't or never.
From a theoretical point of view, it is accepted that negation conveys positive meaning (Rooth, 1992; Huddleston and Pullum, 2002). For example, when reading (1) John didn't order the right parts, humans intuitively understand that (1a) John ordered something, or more specifically, (1b) John ordered the wrong parts. Interpretation (1a) can be obtained after determining that n't does not negate verb order, but its THEME, i.e., the right parts. Interpretation (1b) can be obtained after determining that n't is actually negating right, an adjective modifying parts.
Determining which words are intended to be negated (i.e., identifying the foci of negation, thereby revealing positive interpretations) is challenging. First, as exemplified in (1a, 1b), there is a granularity continuum yielding interpretations that entail each other, e.g., (1b) entails (1a). Second, a single negation often yields several positive interpretations, e.g., from (2) John doesn't eat meat, we can extract that (2a) John eats something other than meat and (2b) Some people eat meat, but not John.
This paper presents a methodology to extract positive interpretations from verbal negation. The main contributions are: (1) a deterministic procedure to generate potential interpretations by manipulating syntactic dependencies; (2) an analysis showing that dependencies yield finer-grained interpretations and better results than previous work using semantic roles; (3) a corpus of negations and their positive interpretations; and (4) experimental results with gold-standard and predicted linguistic information.

Terminology, Scope and Focus
Negation is well understood in grammars, which detail the valid ways to form a negation (Quirk et al., 2000; van der Wouden, 1997). Negation can be expressed by verbs (e.g., avoid running), nouns (e.g., the absence of evidence), adjectives (e.g., it is pointless), adverbs (e.g., I never tried Persian food before), prepositions (e.g., you can exchange it without a problem), determiners (e.g., the new law has no direct implications), pronouns (e.g., nobody will keep election promises), and others. In this paper, we focus on verbal negation, i.e., when the negation mark (usually an adverb such as never and not) is grammatically associated with a verb.

Positive Interpretations. In philosophy and linguistics, it is generally accepted that negation conveys positive meaning (Horn, 1989). This positive meaning ranges from implicatures, i.e., what is suggested in an utterance even though neither expressed nor strictly implied (Blackburn, 2008), to entailments. Other terms used in the literature include implied meanings (Mitkov, 2005), implied alternatives (Rooth, 1985) and semantically similars (Agirre et al., 2013). Our work does not fit strictly into any of this terminology; instead, we reveal positive interpretations as intuitively done by humans when reading text.

Scope and Focus
From a theoretical perspective, it is accepted that negation has scope and focus, and that the focus (not just the scope) yields positive interpretations (Horn, 1989; Rooth, 1992; Taglicht, 1984). Scope is "the part of the meaning that is negated" and focus "the part of the scope that is most prominently or explicitly negated" (Huddleston and Pullum, 2002).
Consider the following statement in the context of the recent refugee crisis: (2) Mr. Haile was not looking for heaven in Europe. By definition, scope refers to "all elements whose individual falsity would make the negated statement strictly true", and focus is "the element of the scope that is intended to be interpreted as false to make the overall negative true" (Huddleston and Pullum, 2002). The falsity of any of the truth conditions below makes statement (2) true:
[...] [THEME of looking, heaven]
2d. Somebody was looking for something in Europe. [LOCATION of looking, in Europe]

Determining the focus is almost always more challenging than determining the scope. The challenge lies in determining which of the truth conditions (2a-2d) is intended to be interpreted as false to make the negated statement true: all of them qualify, but some are more likely. A natural reading of statement (2) suggests that Mr. Haile was looking for something (a regular life, a job, etc.) in Europe, but not heaven. Determining that the focus is heaven, i.e., that everything in statement (2) is positive except the THEME of looking, is the key to revealing the intended positive interpretation. Note that scope on its own does not identify positive interpretations, and other foci yield unlikely positive interpretations, e.g., Mr. Haile was looking for heaven somewhere, but not in Europe.
It is worth noting that while scope is defined from a logical standpoint, in most negations there are several possible foci and corresponding positive interpretations. For example, given (3) Most jobs now don't last for decades, the following are valid positive interpretations: (3a) Few jobs now last for decades, (3b) Most jobs in the past lasted for decades, and (3c) Most jobs now last for a few years.

Granularity of Focus. The definition of focus does not provide guidelines about identifying the element of the scope that is the focus. The larger the focus, the more generic the corresponding positive interpretation; and the smaller the focus, the more specific the corresponding positive interpretation. Let us consider statement (3) again. A possible focus is Most jobs, yielding the positive interpretation Something now lasts for decades, but not most jobs. Another possible focus is Most, yielding the interpretation Few (not most) jobs now last for decades. We argue that the latter is preferable, as it yields a more specific interpretation and it entails the former: if some jobs last for decades, then something lasts for decades, but not the other way around.
We use the term coarse-grained focus to refer to foci that include all tokens belonging to an argument of a verb (e.g., Most jobs above), and fine-grained focus to refer to foci that do not (e.g., Most above).

Previous Work
Within computational linguistics, approaches to processing negation are either shallow or target scope and focus detection. Popular semantic representations such as semantic roles (Palmer et al., 2005; Baker et al., 1998) or AMR (Banarescu et al., 2013) do not reveal the positive interpretations we target in this paper. Shallow approaches are usually application-specific. In sentiment and opinion analysis, negation has been reduced to marking as negated all words between a negation cue and the first punctuation mark (Pang et al., 2002), or within a five-word window of a negation cue (Hu and Liu, 2004). The examples throughout this paper show that these techniques are insufficient to reveal implicit positive interpretations.

Scope Annotations and Detection
Scope of negation detection has received a lot of attention, mostly using two corpora: BioScope in the medical domain (Szarvas et al., 2008) and CD-SCO (Morante and Daelemans, 2012). BioScope annotates negation cues and linguistic scopes exclusively in biomedical texts. CD-SCO annotates negation cues, scopes, and negated events or properties in selected Conan Doyle stories.
There have been several supervised proposals to detect the scope of negation using BioScope and CD-SCO (Özgür and Radev, 2009;Øvrelid et al., 2010). Automatic approaches are mature (Abu-Jbara and Radev, 2012): F-scores are 0.96 for negation cue detection, and 0.89 for negation cue and scope detection (Velldal et al., 2012;Li et al., 2010). Fancellu et al. (2016) present the best results to date using CD-SCO, and analyze the main sources of errors. Outside BioScope and CD-SCO, Reitan et al. (2015) present a negation scope detector for tweets, and show that it improves sentiment analysis. As shown in Section 2, scope detection is insufficient to reveal positive interpretations from negation.

Focus Annotation and Detection
While focus of negation has been studied for decades in philosophy and linguistics (Section 2), corpora and automated tools are scarce. Blanco and Moldovan (2011) annotate focus of negation in the 3,993 negations marked with the ARGM-NEG semantic role in PropBank (Palmer et al., 2005). Their annotations, PB-FOC, were used in the *SEM-2012 Shared Task (Morante and Blanco, 2012). Their guidelines require annotators to choose as focus the semantic role that "is most prominently negated" or the verb. If several roles may be the focus, they prioritize "the one that yields the most meaningful implicit [positive] information", but do not specify what most meaningful means. Their approach has two limitations. First, because they select one focus per negation, they only extract one positive interpretation per negation. Second, because they select as focus a semantic role, they only consider coarse-grained foci. Consider again statement (3) from Section 2.1. By design, their approach is limited to extracting a single interpretation even though interpretations (3a-3c) are all valid. Similarly, their approach is limited to selecting as focus Most jobs (all tokens belonging to a semantic role), although Most yields a "more meaningful" interpretation: Something now lasts for decades (generic, worse) vs. Few jobs now last for decades (specific, better).
Blanco and Sarabi (2016) present a complementary approach to extract and score several positive interpretations from a single verbal negation. Their methodology is grounded on semantic roles and does not consider fine-grained foci. In this paper, we improve upon their work: we extract both coarse- and fine-grained interpretations, and also extract several interpretations from one negation.

Anand and Martell (2012) reannotate PB-FOC and argue that positive interpretations arising from scalar implicatures and neg-raising predicates should be separated from those arising from focus detection. They argue that 27.4% of negations with a focus annotated in PB-FOC do not have one. In this paper, we are not concerned with annotating foci per se, but with extracting positive interpretations from negation, as intuitively done by humans.
Automatic systems to detect the focus of negation yield modest results. Blanco and Moldovan (2011) obtain an accuracy of 65.5 using supervised learning and features derived from gold-standard linguistic information. With predicted linguistic information, Rosenberg and Bergler (2012) report an F-measure of 58.4 using 4 linguistically sound heuristics, and Zou et al. (2014) an F-measure of 65.62 using contextual discourse information. Blanco and Sarabi (2016) obtain a Pearson correlation of 0.642 ranking coarse-grained interpretations. Unlike the work presented here, none of these systems extract fine-grained interpretations from a single negation.

Corpus Creation
Our goal is to create a corpus of negations and their positive interpretations. We put a strong emphasis on automation and simplicity. First, we deterministically generate potential positive interpretations from verbal negations by manipulating syntactic dependencies (Section 4.1). Second, we ask annotators to score potential positive interpretations (Section 4.2). Positive interpretations and their scores are later used to learn models to rank potential interpretations automatically (Section 6). Generating potential interpretations deterministically prior to scoring them proved very beneficial. After pilot experiments, it became clear that asking annotators to propose positive interpretations complicates the annotation effort (lower agreements) as well as learning.
We decided to work on top of OntoNotes (Hovy et al., 2006) instead of plain text or other corpora for several reasons. First, OntoNotes includes gold linguistic annotations such as part-of-speech tags, parse trees and semantic roles. Second, unlike BioScope, CD-SCO and PB-FOC (Section 3.2), OntoNotes includes sentences from several genres, e.g., newswire, broadcast news and conversations, magazines, the web. We transformed the parse trees in OntoNotes into syntactic dependencies using Stanford CoreNLP (Manning et al., 2014).

Manipulating Syntactic Dependencies to Generate Potential Positive Interpretations
OntoNotes contains 63,918 sentences. Annotating all positive interpretations from all negations is outside the scope of this paper. Instead, we target selected representative negations.

Selecting Negations. We first select all verbal negations by retrieving all tokens whose syntactic head is a verb and whose dependency type is neg. Then, we discard negations from sentences that contain two negations, conditionals, commas or questions. Finally, we discard negations if the negated verb is to be or it does not have a subject (dependency nsubj or nsubjpass).
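The selection rules above can be sketched as a small filter. This is only an illustration under stated assumptions: the token format (dicts with id, form, pos, head, dep), the function name select_negated_verb, the list of forms of to be, and the surface cues used to approximate "conditionals" are all hypothetical and not taken from the paper.

```python
# Hypothetical token format: one dict per token, with 1-based "id",
# surface "form", part-of-speech "pos", head id "head" (0 = root) and
# Stanford-style dependency label "dep" (e.g., neg, nsubj, aux).

BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being",
            "'s", "'re", "'m"}
SUBJECT_DEPS = {"nsubj", "nsubjpass"}
# Assumption: these surface cues approximate "conditionals, commas or questions".
DISCARD_FORMS = {",", "?", "if", "unless"}


def select_negated_verb(tokens):
    """Return the id of the negated verb, or None if the sentence is discarded."""
    by_id = {t["id"]: t for t in tokens}

    # Keep only sentences with exactly one negation cue.
    cues = [t for t in tokens if t["dep"] == "neg"]
    if len(cues) != 1:
        return None

    # Verbal negation: the syntactic head of the cue must be a verb.
    verb = by_id.get(cues[0]["head"])
    if verb is None or not verb["pos"].startswith("VB"):
        return None

    # Discard sentences with commas, questions or (approximated) conditionals.
    if any(t["form"].lower() in DISCARD_FORMS for t in tokens):
        return None

    # Discard negated forms of "to be" and verbs without a subject.
    if verb["form"].lower() in BE_FORMS:
        return None
    has_subject = any(
        t["head"] == verb["id"] and t["dep"] in SUBJECT_DEPS for t in tokens
    )
    return verb["id"] if has_subject else None
```

Returning the id of the verb rather than a boolean keeps the result directly usable by later steps, which all start from the negated verb.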
Converting Negated Statements into their Positive Counterparts. We apply 3 steps inspired by the grammatical rules to form negation detailed by Huddleston and Pullum (2002, Ch. 9):
1. Remove the negation mark by deleting the token with syntactic dependency neg.
2. Remove auxiliaries, expand contractions, and fix third-person singular and past tense. For example (before: after), doesn't go: goes, didn't go: went, won't go: will go. We loop through the tokens whose head is the negated verb with dependency aux, and use a list of irregular verbs and grammar rules to convert to third-person singular and past tense.
3. Rewrite negatively-oriented polarity-sensitive items. For example (before: after), anyone: someone, any longer: still, yet: already, at all: somewhat. We use the correspondences between negatively-oriented and positively-oriented polarity-sensitive items by Huddleston and Pullum (2002, p. 831).

Selecting Relevant Tokens. Verbal negation often occurs in multi-clause sentences. In order to identify the relevant (syntactically negated) eventuality, we simplify the original statement by including only the negated verb and all tokens that are dependents of the verb, i.e., tokens reachable from the negated verb traversing dependencies. For example, from Individuals familiar with the Justice Department's policy said that Justice officials hadn't any knowledge of the IRS's actions in the last week, after getting the positive counterpart and selecting relevant tokens, we obtain Justice officials had some knowledge of the IRS's actions in the last week.

Generating Interpretations. Given the simplified positive counterpart, generating all combinations of tokens as potential foci would result in 2^t potential positive interpretations for t tokens. To avoid a brute-force approach that generates many nonsensical potential interpretations, we define a procedure grounded on syntactic dependencies.
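The three conversion steps for obtaining the positive counterpart can be sketched as follows. This is a minimal illustration, not the authors' implementation: the lookup tables are tiny hypothetical stand-ins for the list of irregular verbs and the polarity-item correspondences, only single-token polarity items are handled, and treating do/does/did as the only auxiliaries to delete (while keeping modals such as will) is our reading of the examples.

```python
# Tiny illustrative lookup tables (hypothetical; a real system needs full lists).
IRREGULAR_PAST = {"go": "went", "have": "had", "eat": "ate", "do": "did"}
IRREGULAR_3SG = {"go": "goes", "have": "has", "do": "does"}
CONTRACTED_AUX = {"wo": "will", "ca": "can", "sha": "shall"}
DO_SUPPORT = {"do": "base", "does": "3sg", "did": "past"}
POLARITY_ITEMS = {"anyone": "someone", "any": "some", "yet": "already"}


def third_singular(verb):
    """Inflect a base-form verb for third-person singular (rough rules)."""
    if verb in IRREGULAR_3SG:
        return IRREGULAR_3SG[verb]
    if verb.endswith(("s", "sh", "ch", "x", "z")):
        return verb + "es"
    if len(verb) > 1 and verb.endswith("y") and verb[-2] not in "aeiou":
        return verb[:-1] + "ies"
    return verb + "s"


def past_tense(verb):
    """Inflect a base-form verb for past tense (rough rules)."""
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    if verb.endswith("e"):
        return verb + "d"
    if len(verb) > 1 and verb.endswith("y") and verb[-2] not in "aeiou":
        return verb[:-1] + "ied"
    return verb + "ed"


def positive_counterpart(tokens, verb_id):
    """Return the word forms of the positive counterpart of a negated clause.

    tokens: dicts with "id", "form", "head" and "dep" (hypothetical format).
    """
    # Find do-support attached to the negated verb; it carries the tense.
    tense = None
    for t in tokens:
        low = t["form"].lower()
        if t["head"] == verb_id and t["dep"] == "aux" and low in DO_SUPPORT:
            tense = DO_SUPPORT[low]

    words = []
    for t in tokens:
        form, low = t["form"], t["form"].lower()
        attached = t["head"] == verb_id
        if attached and t["dep"] == "neg":
            continue                                # step 1: drop negation mark
        if attached and t["dep"] == "aux":
            if low in DO_SUPPORT:
                continue                            # step 2: drop do-support
            form = CONTRACTED_AUX.get(low, form)    # step 2: e.g., wo -> will
        elif t["id"] == verb_id:
            if tense == "3sg":
                form = third_singular(low)          # step 2: doesn't go -> goes
            elif tense == "past":
                form = past_tense(low)              # step 2: didn't go -> went
        else:
            form = POLARITY_ITEMS.get(low, form)    # step 3: polarity items
        words.append(form)
    return words
```

The tense is collected in a first pass so that the verb can be inflected regardless of where the auxiliary appears relative to it.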
Table 1 (excerpt): Positive counterpart
Step 1: The report claims that underclass youth do have those opportunities.
Step 2: The report claims that underclass youth have those opportunities.
Step 3: The report claims that underclass youth have those opportunities.

The main idea is to run a modified breadth-first traversal of the dependency tree to select subtrees that are potential foci. We start the traversal from the negated verb and stop it at depth 3, selecting as potential foci the subtrees rooted at all tokens except those whose syntactic dependency is aux, auxpass or punct (auxiliary, passive auxiliary and punctuation). Additionally, we discard potential foci that consist only of (1) the determiners the, a and an, or (2) a single token with part-of-speech tag TO, CC, UH, POS, XX, IN, WP or dependency relation prt. These rules were defined after manually observing several examples and concluding that the corresponding positive interpretation was useless. For example, from the negated statement And our credit standards haven't changed one iota, we avoid generating the useless potential interpretation Our credit standards X changed one iota, but not have changed (focus would be have, with dependency aux). Similarly, from It is not supported by the text or history of the Constitution, we avoid generating potential interpretation It is supported by X text or history of the Constitution, but not by the text or history of the Constitution (focus would be the); and from You don't want to get yourself too upset about these things, potential interpretation You want X get yourself too upset about these things, but not to get (focus would be to, with part-of-speech tag TO).
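The traversal and filtering rules described above can be sketched as follows. This is a hedged illustration rather than the exact procedure: the token format and function names are hypothetical, the input is assumed to be the simplified positive counterpart (no negation mark), the negated verb on its own is assumed to be a potential focus at depth 0, and "depth 3" is read as including tokens three dependency arcs away from the verb.

```python
from collections import deque

SKIP_DEPS = {"aux", "auxpass", "punct"}
SKIP_DETERMINERS = {"the", "a", "an"}
SKIP_SINGLE_POS = {"TO", "CC", "UH", "POS", "XX", "IN", "WP"}
MAX_DEPTH = 3  # assumption: the verb is at depth 0, its dependents at depth 1


def subtree_ids(root_id, children):
    """Return the sorted ids of all tokens in the subtree rooted at root_id."""
    ids, stack = [], [root_id]
    while stack:
        current = stack.pop()
        ids.append(current)
        stack.extend(child["id"] for child in children.get(current, []))
    return sorted(ids)


def potential_foci(tokens, verb_id):
    """Breadth-first selection of potential foci below the negated verb.

    tokens: dicts with "id", "form", "pos", "head", "dep" (hypothetical format).
    Returns a list of {"ids": [...], "depth": int} in traversal order.
    """
    by_id = {t["id"]: t for t in tokens}
    children = {}
    for t in tokens:
        children.setdefault(t["head"], []).append(t)

    # Assumption: the verb alone (not its whole subtree) is a potential focus.
    foci = [{"ids": [verb_id], "depth": 0}]
    queue = deque((child, 1) for child in children.get(verb_id, []))
    while queue:
        token, depth = queue.popleft()
        if depth < MAX_DEPTH:
            queue.extend((c, depth + 1) for c in children.get(token["id"], []))
        if token["dep"] in SKIP_DEPS:
            continue  # auxiliaries and punctuation are never foci
        ids = subtree_ids(token["id"], children)
        if len(ids) == 1:
            only = by_id[ids[0]]
            if (only["form"].lower() in SKIP_DETERMINERS
                    or only["pos"] in SKIP_SINGLE_POS
                    or only["dep"] == "prt"):
                continue  # single-token foci deemed useless by the rules
        foci.append({"ids": ids, "depth": depth})
    return foci
```

Applied to a dependency analysis of You are paying me for my overtime work (our own hand-built analysis, not the corpus one), the sketch yields seven potential foci: the verb, You, me, for my overtime work, my overtime work, my, and overtime.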
Once potential foci are selected, we generate positive interpretations by rewriting each focus with "someone/some people/something/etc." and appending "but not text of focus" at the end. Additionally, if the first token of the focus is a preposition, we include it to improve readability, e.g., didn't leave [by noon]: left by sometime, but not by noon.
Note that potential interpretations obtained from foci that are direct syntactic dependents of the negated verb are coarse-grained interpretations, and the rest are fine-grained interpretations. Table 1 exemplifies the procedure step by step.
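The rewriting of a focus into a plain-text interpretation can be sketched as below. The function name render_interpretation and its arguments are hypothetical, and the choice of placeholder (someone, something, sometime, etc.) is left to the caller because the text does not specify how it is selected; the sketch also assumes the focus is a contiguous span.

```python
def render_interpretation(words, pos_tags, focus, placeholder="something"):
    """Rewrite the focus with a placeholder and append ", but not <focus>".

    words, pos_tags: parallel lists for the simplified positive counterpart.
    focus: 0-based indices of the tokens chosen as focus (assumed contiguous).
    placeholder: replacement text; choosing it is outside this sketch.
    """
    focus = sorted(focus)
    focus_set = set(focus)
    first = focus[0]

    # If the focus starts with a preposition, keep it for readability.
    replacement = [placeholder]
    if pos_tags[first] == "IN" and len(focus) > 1:
        replacement = [words[first], placeholder]

    rewritten = []
    for index, word in enumerate(words):
        if index == first:
            rewritten.extend(replacement)
        elif index not in focus_set:
            rewritten.append(word)

    focus_text = " ".join(words[i] for i in focus)
    return " ".join(rewritten) + ", but not " + focus_text


if __name__ == "__main__":
    print(render_interpretation(
        ["He", "left", "by", "noon"], ["PRP", "VBD", "IN", "NN"],
        focus=[2, 3], placeholder="sometime"))
    # prints: He left by sometime, but not by noon
```

Keeping the preposition in both the rewritten clause and the "but not" suffix mirrors the left by sometime, but not by noon example in the text.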

Scoring Potential Positive Interpretations
After generating potential positive interpretations automatically, we asked annotators to score them. Annotators had access to the original negated sentence, the previous and next sentence as context, and one potential positive interpretation at a time. The interface asked: "Given the three sentences [previous sentence, negated sentence and next sentence] above, do you think the statement [positive interpretation] below is true?" Annotators were forced to answer with a score from 0 to 5, where 0 means absolutely disagree and 5 means absolutely agree. We did not provide descriptions for intermediate scores or use categorical labels. These simple guidelines were sufficient to reliably score the automatically generated potential positive interpretations (Section 5).

Corpus Analysis
The procedure described in Section 4.1 generates 9729 potential positive interpretations (5865 coarse-grained and 3864 fine-grained) from 1671 verbal negations. Out of all these potential positive interpretations, we annotate 1700 (1008 coarse- and 692 fine-grained). Overall, the mean score is 3.20, and the standard deviation is 1.66. Table 2 shows basic statistics for potential foci, where dependency indicates the dependency from the potential focus to a token outside the potential focus. Most foci are nsubj, dobj and pobj, and the mean scores and standard deviation are similar for most dependencies.
Annotation Quality. In order to ensure annotation quality, we calculated Pearson correlation. Kappa and other measures designed for categorical labels are ill-suited for our annotations, since not all disagreements between numeric scores are the same, e.g., 4 vs. 5 should be counted as higher agreement than 1 vs. 5. Overall Pearson correlation was 0.75.

Table 3 presents 2 statements that contain verbal negation, the list of positive interpretations automatically generated and the annotated scores. Example (1) is a simple negated clause, yet we generate 7 potential positive interpretations and 3 of them receive high scores (4 or 5). Given You're not paying me for my overtime work and the previous statement, it is reasonable to believe that the author is in an employee-employer relationship, and the employer is not fair to the employee. Interpretations 1.1, 1.4 and 1.6 are implicit positive interpretations intuitively understood by humans when reading the original negated statement. Namely, Interpretation 1.1: You (the employer) are nickel-and-diming me for my overtime work (focus is paying), Interpretation 1.4: You (the employer) are paying me for something (focus is my overtime work), and Interpretation 1.6: You (the employer) are paying me for my regular work (focus is overtime). These interpretations show the benefits of fine-grained interpretations: Interpretation 1.6 is a refinement of Interpretation 1.4, and the former is more desirable than the latter as it reveals more specific positive knowledge. The remaining interpretations are legible, but do not make sense given the negated statement, e.g., Interpretation 1.2: Somebody (but not the employer) pays me for my overtime (focus is You).
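The agreement measure used for annotation quality, Pearson correlation between two annotators' numeric scores, can be computed as in the sketch below. The scores are made up for illustration and are not taken from the corpus.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 items)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return covariance / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    annotator_a = [5, 4, 1, 0]   # hypothetical scores on the 0-5 scale
    near_misses = [4, 5, 0, 1]   # always one point away from annotator_a
    far_misses = [1, 0, 5, 4]    # always four points away from annotator_a
    print(pearson(annotator_a, near_misses))  # strongly positive
    print(pearson(annotator_a, far_misses))   # strongly negative
```

With exact-match categorical agreement, both comparison lists would count as complete disagreement with annotator_a, whereas the correlation separates small deviations from large ones, which is the motivation given above for preferring it over Kappa.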

Annotation Examples
Example (2) is also a simple negated clause, and 4 out of 5 interpretations receive high scores, capturing valid positive meaning. Specifically, Interpretation 2.1: Those concerns are avoided in public (focus is expressed), Interpretation 2.2: Something is expressed in public (focus is Those concerns), Interpretation 2.4: Some concerns (but not problematic or secret concerns) are expressed in public (focus is Those), and Interpretation 2.5: Those concerns are expressed in private (focus is in public).

Type    Name                Description
Basic   neg mark            word form of negation mark
        verb                word form and part-of-speech tag of verb
        coarse or fine      flag indicating whether interpretation is coarse- or fine-grained
Path    syn path dep        syntactic path from focus to verb (concatenation of dependencies)
        syn path pos        syntactic path from focus to verb (concatenation of part-of-speech tags)
        syn path last dep   last syntactic dependency in syn path dep (direct dependent of verb)
        syn path last pos   last part-of-speech tag in syn path pos (direct dependent of verb)
Focus   focus length        number of words in subgraph chosen as focus
        focus first word    word form and part-of-speech tag of first word in focus
        focus last word     word form and part-of-speech tag of last word in focus
        focus direction     flag indicating whether focus occurs before or after verb
        focus head word     word form of head of focus
        focus head pos      part-of-speech tag of head of focus
        focus head rel      syntactic dependency of head of focus
Table 4: Features used to score potential positive interpretations automatically generated.

Syntactic Dependencies vs. Semantic Roles
The procedure presented in Section 4.1 is not the first to generate potential positive interpretations from negation (Section 3.2). Our approach has 2 advantages with respect to those grounded on semantic roles (Blanco and Sarabi, 2016): (1) it generates both coarse- and fine-grained interpretations, and (2) learning to score interpretations is easier because state-of-the-art tools extract dependencies more reliably than semantic roles. To support claim (1), we compare the interpretations generated with our procedure and previous work using semantic roles. 96.12% of interpretations generated using roles are also generated using syntactic dependencies. Also, using dependencies allows us to generate 67.9% additional (fine-grained) interpretations not obtainable with roles.
To support claim (2), we compare interpretations generated with gold and predicted linguistic information (roles or dependencies). The overlap with semantic roles is 70.1%, and with syntactic dependencies, 92.8%. Syntactic dependencies are thus better in a realistic scenario because they allow us to automatically generate (and score) most interpretations.

Supervised Learning to Score Potential Positive Interpretations
We follow a standard supervised machine learning approach. The 1,700 potential positive interpretations along with their scores become instances, and we divide them into training (80%) and test splits (20%) making sure that all interpretations generated from a sentence are assigned to either the training or test splits. Note that splitting instances randomly would not be sound: training with some interpretations generated from a negation, and testing with the rest of interpretations generated from the same negation would be an unfair evaluation. We train a Support Vector Machine for regression with RBF kernel using scikit-learn (Pedregosa et al., 2011), which in turn uses LIBSVM (Chang and Lin, 2011). SVM parameters (C and γ) were tuned using 10-fold cross-validation with the training set, and results are calculated using the test set. Table 4 presents the full feature set. Features are relatively simple and characterize the verbal negation from which a potential interpretation was generated, as well as the interpretation per se, i.e., the dependency subgraph chosen as potential focus.
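The sentence-level split described above can be sketched as follows. The instance format (dicts with a sentence_id key) and the function name are hypothetical; the point is only that all interpretations generated from one sentence land in the same split.

```python
import random


def split_by_sentence(instances, test_fraction=0.2, seed=0):
    """Split instances into train/test without splitting any sentence.

    instances: dicts that carry a hypothetical "sentence_id" key identifying
    the negated sentence each interpretation was generated from.
    """
    sentence_ids = sorted({inst["sentence_id"] for inst in instances})
    rng = random.Random(seed)
    rng.shuffle(sentence_ids)

    # The fraction is applied to sentences, so the share of instances in the
    # test split is only approximately test_fraction.
    n_test = max(1, round(len(sentence_ids) * test_fraction))
    test_ids = set(sentence_ids[:n_test])

    train = [inst for inst in instances if inst["sentence_id"] not in test_ids]
    test = [inst for inst in instances if inst["sentence_id"] in test_ids]
    return train, test
```

Because sentences yield different numbers of interpretations, a split made this way rarely puts exactly 20% of the instances in the test set.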

Feature Selection
Basic features account for the negation mark, the negated verb (word form and part-of-speech tag) and a binary flag indicating whether we are scoring a coarse- or fine-grained interpretation.
Path features are derived from the syntactic path between the subgraph selected as focus and the verb. We include the actual path (concatenation of dependencies and up/down symbols), and the modified path using part-of-speech tags. Additionally, we also include the last dependency and part-of-speech tag, i.e., the ones closest to the verb in the path.
Focus features characterize the dependency subgraph chosen as focus to generate the potential interpretation. Specifically, we include the number of tokens, word form and part-of-speech tags of the first and last tokens, and whether the focus occurs before or after the verb. We also include features derived from the head of the focus, which we define as the token whose syntactic head is outside the focus. We include the word form and part-of-speech tag of the focus head, as well as its dependency.
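The basic, path and focus features can be sketched as a function that maps one potential focus to a feature dictionary. This is an illustration under assumptions: the token format is hypothetical, the "^" separator stands in for the up/down symbols, token ids are assumed to follow word order, and the case where the verb itself is the focus is left out.

```python
def extract_features(tokens, verb_id, neg_form, focus_ids):
    """Build a feature dict for one potential focus below the negated verb.

    tokens: dicts with "id", "form", "pos", "head", "dep" (hypothetical format).
    neg_form: word form of the negation mark, e.g., "not" or "n't".
    """
    by_id = {t["id"]: t for t in tokens}
    focus = set(focus_ids)
    verb = by_id[verb_id]

    # Head of the focus: the token whose syntactic head is outside the focus.
    head = next(by_id[i] for i in sorted(focus) if by_id[i]["head"] not in focus)

    # Walk up from the focus head to the verb, collecting labels and tags.
    deps, tags = [], []
    current = head
    while current["id"] != verb_id:
        deps.append(current["dep"])
        tags.append(current["pos"])
        current = by_id.get(current["head"])
        if current is None:
            raise ValueError("focus is not below the negated verb")
    if not deps:
        raise ValueError("verb-as-focus is not covered by this sketch")

    first, last = by_id[min(focus)], by_id[max(focus)]
    return {
        # Basic features
        "neg_mark": neg_form.lower(),
        "verb": verb["form"].lower() + "/" + verb["pos"],
        "coarse_or_fine": "coarse" if len(deps) == 1 else "fine",
        # Path features ("^" marks an upward step; the separator is our choice)
        "syn_path_dep": "^".join(deps),
        "syn_path_pos": "^".join(tags),
        "syn_path_last_dep": deps[-1],
        "syn_path_last_pos": tags[-1],
        # Focus features
        "focus_length": len(focus),
        "focus_first_word": first["form"].lower() + "/" + first["pos"],
        "focus_last_word": last["form"].lower() + "/" + last["pos"],
        "focus_direction": "before" if max(focus) < verb_id else "after",
        "focus_head_word": head["form"].lower(),
        "focus_head_pos": head["pos"],
        "focus_head_rel": head["dep"],
    }
```

A dictionary of mostly categorical values like this would still need to be turned into numeric vectors (for instance by one-hot encoding) before being passed to a regression model.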

Experiments and Results
We report results obtained with several combinations of features in Table 5. We detail results obtained with features extracted from gold-standard and predicted linguistic annotations (part-of-speech tags and syntactic dependencies) as annotated in the gold and auto files from the CoNLL-2011 Shared Task release of OntoNotes (Pradhan et al., 2011). All models are trained with gold-standard linguistic annotations, and tested with either gold-standard or predicted linguistic annotations.

Testing with gold-standard POS tags and syntactic dependencies. Training with the word form of the negation mark is virtually useless: it yields a Pearson correlation of −0.109. Basic features (negation mark, verb and flag indicating coarse- or fine-grained interpretation) are also ineffective to score potential interpretations (Pearson: 0.033). Including features derived from the syntactic path yields higher correlation, 0.474, even though these features only capture the syntactic relationship between the focus from which the interpretation was generated and the verb. Finally, adding focus features yields the best results (Pearson: 0.53, +11.8%).

Testing with predicted POS tags and syntactic dependencies. We selected 20% of positive interpretations in our corpus as test instances, totalling 379 interpretations (Section 6). When executing the procedure to generate potential interpretations (Section 4.1) with predicted linguistic information, however, we are unable to generate all of them due to incorrect and missing syntactic dependencies. Specifically, 352 of the 379 interpretations are generated (92.8%). While we do not generate 7.2% of instances, this percentage is substantially lower than in previous work grounded on semantic roles (Section 5.2).
Pearson correlations with predicted linguistic information are calculated using the 352 instances that were also generated with gold dependencies (and thus assigned a score during the manual annotations). Correlations are slightly higher and follow a similar trend to the correlations obtained with gold-standard linguistic information. These results should be taken with a grain of salt: the test instances are not exactly the same, and the 352 test instances in this scenario are presumably easier to score than the remaining 27, as dependencies were predicted correctly.

Conclusions
Humans intuitively extract positive meaning from negation when reading text. This paper presents an automated procedure to generate potential positive interpretations from verbal negation, and score them according to their likelihood. Our procedure is grounded on syntactic dependencies, allowing us to extract fine-grained interpretations beyond semantic roles (67.9% additional interpretations). Additionally, because dependencies are extracted automatically more reliably than semantic roles, we generate 92.8% of all potential interpretations when using predicted linguistic information, as opposed to 70.1% with semantic roles.
On average, we generate 6.4 potential interpretations per verbal negation (coarse-grained: 3.8, fine-grained: 2.6). Manual annotations show that potential interpretations are deemed likely. The mean score is 3.20 (out of 5.0), thus we extract a substantial amount of positive meaning.
The work presented in this paper is not tied to any existing semantic representation. While we rely heavily on syntactic dependencies, positive interpretations are generated in plain text, and they could be processed, along with the original negated statement, with any NLP pipeline.