Opposition Relations among Verb Frames

In this paper we propose a scheme for annotating opposition relations among verb frames in lexical resources. The scheme is tested on the T-PAS resource, an inventory of typed predicate argument structures for Italian, conceived for both linguistic research and computational tasks. After discussing opposition relations from a linguistic point of view and listing the tags we decided to use, we report the results of the experiment we performed to test the annotation scheme, in terms of inter-annotator agreement and linguistic analysis of the annotated data.


Introduction
Several studies have been carried out on the definition and classification of oppositions in linguistics, philosophy, cognitive science and psychology. Our notion of opposition is based on lexical semantic studies by Lyons (1977), Cruse (1986; 2002; 2011), and Pustejovsky (2000). In the presentation that follows, we draw from the synthesis of these studies reported in Jezek (2015), and focus on oppositions among verb frames.
Traditionally, the study of semantic relations among verbs or verb frames has focused on the manner relation, the cause relation, and the relation of lexical entailment (see, for example, the classification in Fellbaum (1998)). In the computational field, several initiatives have proposed annotation schemas both for the internal structure of events (see, for instance, Aguilar et al. (2014), Fokkens et al. (2013)) and for relations among events, including for instance temporal relations as proposed in the TimeML scheme (Pustejovsky et al., 2003). However, fewer works have systematically addressed the relation of opposition for verbs.
From a general point of view, the category of opposites can be said to include pairs of terms that contrast with each other with respect to one key aspect of their meaning, such that together they exhaust this aspect completely. Examples include the following pairs: to open / to close, to rise / to fall. Paradoxically, the first step in the process of identifying a relationship of opposition often consists in identifying something that the meanings of the words under examination have in common. A second step is to identify a key aspect in which the two meanings oppose each other. Opposites cannot be true of the same entity at the same time; for example, a price cannot be said to rise and fall at exactly the same point in time. A basic test to identify an opposition is "It is both X and Y". Based on this test, "*The price is both rising and falling" is ruled out as odd because to rise and to fall are opposites in the sense of being mutually exclusive. The test, however, does not tell us what kind of opposition it is.
Among the various types of oppositions that can be said to exist among verbs, we focus here on antonymy, complementarity, converseness and reversiveness, which appear to recur frequently across the vocabulary and have been discussed at length in the literature, with some points of divergence.
Two verbs are antonyms when they denote a change in property (to increase / to decrease) that has the characteristic of being gradual from a conceptual point of view. Two antonyms, therefore, oppose each other in relation to a scale of values for a given property, of which they may specify the two poles (or bounds). For this reason in the case of antonyms one may also speak of polar (Pustejovsky, 2000) or scalar opposition. From a logical point of view, antonyms are contraries, not contradictories; the negation of one term is not equivalent to the opposite term. For example, not increased does not necessarily mean decreased.
In the world's languages it is easy to find series of terms that identify very refined gradations of a specific property, for example with temperature: freeze, cool, warm up, boil. Potentially, along a scale of this type we could have very many terms lexicalizing different degrees along the scale. In reality, as a rule, we have a few, and use degree modifiers (such as a bit or slightly) to refine the concept; for example, we say "The weather warmed slightly".
Two verbs are complementary (to accept / to reject; to succeed / to fail) when they oppose each other with regard to a distinction that is not polar but binary; in other words, complementaries partition a conceptual domain into mutually exclusive compartments. For this reason, this opposition can also be called binary opposition (Pustejovsky, 2000). Complementary terms exclude each other and there is never an intermediate term. Therefore, a binary opposition corresponds to the relationship "X is equivalent to non-Y": accept is equivalent to non-reject, fail is equivalent to non-succeed, and so on. There is no underlying scale of values.
Converses (to lend / to borrow) are terms whose meaning necessarily involves a relation between at least two elements. That is, a person can lend something only if there is a borrower, and so forth. Therefore, converse terms are inherently relational. The underlying relation is asymmetrical, that is, it is seen from the point of view of one of the two participants:
(1) x lends something to y
    y borrows something from x
The characteristic of two converse terms is that each expresses the underlying relation in the opposite way from the other. Therefore, not all relational terms are converses, but only those with reversed or converted roles.
Finally, terms which denote reversive actions or events, such as build / destroy, assemble / disperse, wrap / unwrap, are reversives. It has been proposed (Cruse, 2011) that reversives include two main subtypes: directional opposites, defined as verbs denoting movement in opposite directions between two terminal states (such as rise / fall or enter / leave), and "more abstract examples" denoting change in opposite directions between two states (such as all the examples above). According to Cruse (2011), in the case of reversives, the manner of the process and the details of the path do not count; it is the effective direction from origin to goal that matters. Compare tie and untie: the two are different actions, but the states at the beginning and at the end are the same for both. Fellbaum (1998) has noted that the relation between the verbs in these pairs seems less one of contrast than one of lexical entailment (Fellbaum, 1998, p. 75); for example, one can only unwrap something which has been previously wrapped. We will consider them as opposites with a temporal entailment.
It is an open question whether opposition is a semantic relation or a lexical relation (Murphy, 2010; Fellbaum, 1998); what is clear is that a predicate that is considered the opposite of another predicate does not activate this relation for all its senses. Our schema, as referenced above, will in any case apply to patterns and not to verbs.
Finally, let us look at how opposition relations are encoded in lexical resources. WordNet 3.1 (Miller et al., 1990) has one single label, antonymy, to identify opposition relations among senses for verbs; for example, increase#1 is in an antonymy relation with decrease#1, diminish#1, lessen#1, fall#11. Antonymy in WordNet subsumes all the categories discussed above: complementaries (as in succeed#1 / fail#1), converses (as in buy#1 / sell#1) and reversives (as in tie#1 / untie#1). In FrameNet (Ruppenhofer et al., 2010), on the other hand, despite the attention given to relations among frames in the resource, no relation of opposition is considered, not even converseness, as we can see from Figure 1, where commerce buy and commerce sell, both specializations of the commerce good-transfer frame, are indirectly related by the "perspective on" relation, but are not related to each other by a direct converse relation.
Having presented the main categories of opposites introduced in the literature and inspected whether and how they are implemented in lexical resources such as WordNet and FrameNet, in the next Section we introduce the resource on which the annotation is being performed, before illustrating our annotation scheme of opposition relations among frames.

The T-PAS Resource and Oppositions among Patterns
The T-PAS resource (Jezek et al., 2014) is an inventory of typed predicate argument structures for Italian (tpas.fbk.eu). T-PAS is the first resource for Italian in which semantic selection properties and sense-in-context distinctions of verbal predicates are characterized fully on empirical grounds. In the resource, the acquisition of T-PAS is entirely corpus-driven. We discover the most salient verbal patterns using a lexicographic procedure called Corpus Pattern Analysis (CPA) (Hanks, 2004), which relies on the analysis of co-occurrence statistics of syntactic slots in concrete examples found in corpora.
Important reference points for the T-PAS project are FrameNet (Ruppenhofer et al., 2010) and VerbNet (Schuler, 2005). They differ from T-PAS because the structures they identify are not acquired from corpora following a systematic procedure. Another important resource is PDEV (Hanks and Pustejovsky, 2005), a pattern dictionary of English verbs which is the main product of the CPA procedure applied to English. As for Italian, a complementary project is LexIt (Lenci et al., 2012), a resource providing automatically acquired distributional information about verbs, adjectives and nouns. Unlike T-PAS, LexIt does not provide an inventory of patterns, and the categories used for classifying the semantics of arguments are not corpus-driven. Sense inventories such as MultiWordNet (Pianta et al., 2002) and Senso Comune (Oltramari et al., 2013) are resources to which T-PAS can be linked, with the goal of populating them with corpus-driven, pattern-based sense distinctions for verbs.
The first release contains 1000 analyzed verbs of average polysemy, selected by randomly extracting 1000 lemmas from the total set of fundamental lemmas of Sabatini Coletti 2008 (Sabatini and Coletti, 2007), according to the following proportions: 10% 2-sense verbs, 60% 3-5-sense verbs, 30% 6-11-sense verbs.
The resource consists of three components:
1. a repository of corpus-derived T-PAS linked to lexical units (verbs);
2. an inventory of about 230 corpus-derived semantic types (STs) for nouns (HUMAN, ARTIFACT, EVENT, etc.), relevant for disambiguation of the verb in context, which was obtained by applying the CPA procedure to the analysis of concordances for ca. 1500 English and Italian verbs;
3. a corpus of sentences that instantiate T-PAS, tagged with lexical unit (verb) and pattern number.
The reference corpus is a reduced version of ItWAC (Baroni and Kilgarriff, 2006).
Pattern acquisition and ST tagging involve the following steps:
1) choose a target verb and create a sample of 250 concordances in the corpus;
2) while browsing the corpus lines, identify the variety of relevant syntagmatic structures corresponding to the minimal contexts where all words are disambiguated;
3) identify the typing constraint of each argument slot of the structure by inspecting the lexical set of fillers: such constraints are crucial to distinguish among the different senses of the target verb in context. Each semantic class of fillers corresponds to a category from the inventory the analyst is provided with. If none of the existing ones captures the selectional properties of the predicate, the analyst can propose a new ST or, if no generalization is possible, list a lexical set;
4) once the structures and the typing constraints are identified, register the patterns in the resource using the Pattern Editor.
Each pattern has a unique identification number and a description of its sense, expressed in the form of an implicature linked to the typing constraints of the pattern; for example, the T-PAS in Figure 2 has the implicature [[Human]] legge [[Document]] con grande interesse.
5) assign the instances of the sample to the corresponding patterns, as shown in Figure 3.
In this phase, the analyst annotates the corpus line by assigning it the same number associated with the pattern. Concordances containing tagging errors are annotated as x, and verb uses that do not come close to matching any of the normal patterns are tagged u (unclassifiable). All the above-mentioned steps are explained in detail in the Guidelines, which are provided to the analysts before they start the annotation.
At present, patterns are stored in the resource as a flat list, in the sense that they are not linked by any semantic relation. In the following Section, we describe the motivation for extending the resource by adding opposition relations among patterns, then illustrate the annotation scheme we elaborated for this task and its evaluation.

Motivation and Background
Detecting oppositions, both among words and among portions of text, is a fundamental requirement for any approach in Computational Linguistics aiming at deep language understanding. Indeed, textual opposition plays a crucial role in applications such as machine translation, discourse understanding, summarization and information retrieval.
On the lexical side, most of the computational work has focused on approaches for the automatic acquisition of oppositions from corpora. Saif et al. (2013) propose an automatic method to identify contrasting word pairs that is based on the contrast hypothesis, i.e. that if a pair of words, A and B, are contrasting, then there is a pair of opposites, C and D, such that A and C are strongly related and B and D are strongly related. For example, there exists the pair of opposites hot and cold such that tropical is related to hot, and freezing is related to cold.
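The contrast hypothesis can be illustrated with a toy sketch. The seed pair and the relatedness lexicon below are invented for the example (they reuse the hot / cold illustration from the text), and the function names are hypothetical; this is not the method or the data of the cited work, which estimates relatedness automatically.

```python
# Toy illustration of the contrast hypothesis. All data below is invented
# for the example and is not taken from any corpus or thesaurus.

# Seed pairs of known opposites (C, D).
SEED_OPPOSITES = {("hot", "cold")}

# Hypothetical "strongly related" lexicon: word -> set of related words.
RELATED = {
    "tropical": {"hot"},
    "freezing": {"cold"},
    "giant": {"big"},
}


def is_related(a: str, b: str) -> bool:
    """Symmetric relatedness lookup; a word counts as related to itself."""
    return a == b or b in RELATED.get(a, set()) or a in RELATED.get(b, set())


def are_contrasting(a: str, b: str) -> bool:
    """True if some seed pair (C, D) links A to C and B to D, in either order."""
    for c, d in SEED_OPPOSITES:
        if (is_related(a, c) and is_related(b, d)) or (
            is_related(a, d) and is_related(b, c)
        ):
            return True
    return False
```

With this toy lexicon, tropical and freezing come out as contrasting because each is related to one member of the seed pair, whereas tropical and giant do not.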
With a similar goal, Santus et al. (2014) apply Distributional Semantic Models to detect pairs of antonyms from corpora in an unsupervised manner. Under the hypothesis that antonym words share a salient contrasting dimension of meaning, this dimension can be used to discriminate antonyms from synonyms. For example, size is the salient dimension of meaning for the words giant and dwarf and it is expected that while giant occurs more often with words such as big, huge, etc., dwarf is more likely to occur in contexts such as small and hide. Accordingly, this work predicts that synonyms share a number of salient contexts that is significantly higher than the one shared by antonyms.
At the textual level, i.e. oppositions between portions of text, de Marneffe (2012) has investigated automatic methods for detecting contradictions in text pairs, based on the pragmatic definition that contradiction occurs when two sentences are extremely unlikely to be true simultaneously. It is worth noting that one outcome of this work is that event coreference plays a crucial role in detecting textual oppositions, very much as similarity features are relevant to establishing opposition at the lexical level.
The Recognizing Textual Entailment initiative (Dagan et al., 2009) addressed contradiction under the so-called "three-way" evaluation schema (i.e. entailment, contradiction, unknown). Specific techniques for detecting contradiction include the use of "negative alignments" among portions of text and methods for detecting the polarity of predicates (Lotan et al., 2013).
As far as applications are concerned, there is an increasing interest in detecting various kinds of oppositions in large document repositories. A few examples include recent approaches that address inconsistencies in Wikipedia (Cabrio et al., 2014), approaches to estimate the truth of a certain fact (Martinez-Gomez et al., 2014), and the automatic reconstruction of consistent story-lines on a certain topic of interest.

Annotation Schema for Opposite Relations
For Italian, to the best of our knowledge, there are no annotation schemas that identify different types of opposition applied to verbal frames. In general, lexical resources such as dictionaries of synonyms and antonyms list semantic oppositions under the cover term antonymy or contraries. By contrast, we aim to develop an annotation schema that specifies the type of opposition between frames, maintaining all the semantic and syntactic information that frames may contain. Following the classification we described in the Introduction, we propose guidelines for the annotation of oppositions among frame structures where we distinguish:
• Antonymy (tag: ANT)
• Complementarity (tag: COMPL)
• Converseness (tag: CONV)
• Reversiveness (tag: REV)
The standard tests to determine whether two words are antonyms are the following: "neither X nor Y"; "It X moderately / lightly / a bit". For example: "The water neither cooled nor warmed (up)"; "The weather has warmed moderately". The "neither X nor Y" test verifies whether it is possible to negate both terms simultaneously, and whether there is a neutral interval with respect to the two terms. The second test verifies whether the terms of the opposition express a scalable dimension.
The same test can be used for complementaries ("*He was neither accepted nor rejected"; "*He neither failed nor succeeded"). Complementary terms fail the test because the opposition they encode is exclusive, in the sense that the assertion of one term entails the negation of the other (and vice versa); there are no intermediate cases. It is not possible to negate both terms simultaneously.
Converses describe the same action from opposite perspectives with regard to the participant roles. Provided the necessary syntactic changes are made, converses can be substituted for each other without affecting the meaning of the sentence (see (1) in Section 1). Converses can be two-place predicates, where two elements are involved, or three-place predicates, where more than two elements are involved. In three-place converses, one of the arguments can be omitted.
As for reversives, a test which permits the delimitation of a coherent set of reversible verbs is the "again-test", which verifies the possibility of using unstressed again without the process denoted by the verb having happened before (Cruse, 2002). For example, the following sentences are taken as evidence that enter and leave are a reversive pair:
(2) a. The spacecraft left the earth's atmosphere.
    b. Five days later, the spacecraft entered the atmosphere again.
    c. The alien spacecraft entered the earth's atmosphere.
    d. Five days later, the spacecraft left the atmosphere again.
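As a rough illustration, the four tags and the diagnostic tests described in this Section can be written down as a small data structure. This is only a sketch under stated assumptions: the class, field and function names are hypothetical, the test outcomes are human judgments supplied by an annotator, and the mapping from outcomes to tags is a simplification of the guidelines rather than a procedure the annotators followed.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class OppositionTag(Enum):
    ANT = "antonymy"
    COMPL = "complementarity"
    CONV = "converseness"
    REV = "reversiveness"


@dataclass(frozen=True)
class DiagnosticOutcomes:
    """Annotator judgments on the linguistic tests (field names are illustrative)."""
    mutually_exclusive: bool    # "*It is both X and Y" is ruled out
    neither_nor_ok: bool        # "neither X nor Y" is acceptable
    gradable: bool              # "It X moderately / a bit" is acceptable
    role_swap_paraphrase: bool  # "x lends to y" paraphrases "y borrows from x"
    again_test_ok: bool         # unstressed "again" without a prior occurrence


def candidate_tags(t: DiagnosticOutcomes) -> set[OppositionTag]:
    """Return every tag compatible with the test outcomes (possibly several)."""
    tags: set[OppositionTag] = set()
    if t.role_swap_paraphrase:
        tags.add(OppositionTag.CONV)
    if t.again_test_ok:
        tags.add(OppositionTag.REV)
    if t.mutually_exclusive:
        if t.neither_nor_ok and t.gradable:
            tags.add(OppositionTag.ANT)    # scalar opposition with a neutral interval
        elif not t.neither_nor_ok:
            tags.add(OppositionTag.COMPL)  # binary opposition, no intermediate case
    return tags
```

The function returns a set rather than a single tag so that a pair satisfying more than one test can keep several candidate tags.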

Pilot Experiment on T-PAS
In order to determine the reliability of the opposition schema, we conducted a pilot experiment on the T-PAS resource described in Section 2. In particular, we calculated the degree of agreement between two annotators on the application of the scheme among the verbal patterns of T-PAS.
In the next Sections, we first describe the setting of the pilot experiment (Section 5.1), then the inter-annotator agreement results (Section 5.2), and finally we discuss the obtained results (Section 5.3).

Experimental Setting
We designed and ran a pilot annotation over a selected set of verbs defined in T-PAS. Specifically, a set of 25 pairs of verbs (for a total of 216 patterns) was identified which, according to human judgment, display a relation of opposition for at least one of their patterns. Consequently, such verb pairs are expected to present a high frequency of the phenomena the schema is designed for. We provided the annotators with the list of verbs and their respective patterns and implicatures. Moreover, annotated corpus-derived examples in T-PAS could be consulted.
The two annotators, both familiar with verbal pattern structures and pattern acquisition, were asked to identify and classify opposition relations between patterns following the annotation schema proposed in Section 4. For each given pair of verbs, the annotation task consists of two main steps: (i) for each pair of patterns, identify the presence or absence of an opposition relation and (ii) if an opposition relation is present, recognize which type of opposition occurs.
In both steps of the task, annotators make use of the semantic types expressed in the verb pattern. In particular, STs help annotators in interpreting the sense of the pattern and consequently in identifying which are the senses of the verbs in an opposition relation (if an opposition relation is realized). As an example, consider patterns 2 and 3 of the verb abbattere (in (3) and (4)) and pattern 1 of the verb costruire (in (5)) and their implicatures.
Pattern 2 of the verb abbattere (to demolish, to destroy) with its implicature: (3) [...]
The semantic types specified in these patterns help the annotator understand which senses of the two verbs s/he is comparing and, possibly, establish an opposition relation between (3) and (5), but not between (4) and (5).
In case of multiple semantic types for the same argument slot, annotators are allowed to mark opposition relations between patterns even if they are realized only by a subset of such STs. For instance, (3) and (5) are opposites only as far as [[Human]] is considered as the subject of the two predicates (i.e. pattern 1 of the verb costruire shows multiple semantic types, and does not select [[Event]] as subject).
Finally, annotators can match the same pattern of a verb to more than one pattern of the other verb: this is mainly because T-PAS lexicographers may have adopted different degrees of specification during pattern acquisition (Jezek et al., 2014). In total, each annotator had to judge 595 pattern pairs. The annotators took approximately two days to complete the task, including the consultation of corpus examples.
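The set of pattern pairs an annotator has to judge for one verb pair can be sketched as the cross product of the two pattern inventories. The pattern numbers used below for abbattere and costruire, the function name and the REV tag assigned at the end are all assumptions made for the example; they are not read off the T-PAS resource or the annotated data.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

Pattern = Tuple[str, int]              # (verb lemma, pattern number)
PatternPair = Tuple[Pattern, Pattern]


def pattern_pairs(
    verb_a: str, patterns_a: List[int], verb_b: str, patterns_b: List[int]
) -> List[PatternPair]:
    """Every pattern of verb_a paired with every pattern of verb_b."""
    return [((verb_a, p), (verb_b, q)) for p, q in product(patterns_a, patterns_b)]


# Hypothetical inventories: 3 patterns for one verb, 2 for the other.
pairs = pattern_pairs("abbattere", [1, 2, 3], "costruire", [1, 2])

# Step (i): None records "no opposition"; step (ii): a tag records the type.
annotation: Dict[PatternPair, Optional[str]] = {pair: None for pair in pairs}
annotation[(("abbattere", 2), ("costruire", 1))] = "REV"  # illustrative judgment
```

Because each pattern of one verb is paired with each pattern of the other, a single pattern can end up in an opposition relation with more than one pattern of the other verb.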

Inter Annotator Agreement
To calculate the agreement between the two annotators, we adopted Dice's coefficient (Rijsbergen, 1997), which measures how similar two sets are by dividing twice the number of elements shared by the two sets by the total number of elements they contain. This produces a value ranging from 1, if the two sets share all their elements, to 0, if they have no element in common.
We calculate Dice's coefficient for two configurations. In the first configuration, opposition recognition, we count an agreement if both annotators agree on recognizing opposition or non-opposition between two patterns, and a disagreement otherwise. In the second configuration, opposition category, we count an agreement only if both annotators identify exactly the same opposition relation.
Finally, for each category, we calculate the per category disagreement as the proportion of pairs where the two annotators disagree over the total pairs in which the category has been recognized.
Out of the 595 pairs of patterns used in the experiment, the two annotators agreed on whether a pair displays an opposition relation in 588 cases (44 are marked as opposites by both annotators, 544 as non-opposites): the Dice value for opposition recognition is 0.98. This result suggests that identifying opposition relations between patterns is not a controversial decision among annotators. Moreover, annotators identified the same type of opposition or agreed in recognizing non-opposition in 582 cases; thus the Dice value for type of opposition is 0.97, showing that the judgments of the two annotators have a very high degree of overlap.
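The opposition recognition figure can be checked with a short sketch that uses the standard set formulation of Dice's coefficient, 2|A ∩ B| / (|A| + |B|). It assumes that each annotator's output is modelled as a set of (pair id, decision) items, which is one possible reading of the setup and not necessarily the exact computation used in the experiment. The counts come from the text; the assignment of pair ids to groups is invented.

```python
def dice(a: set, b: set) -> float:
    """Dice's coefficient: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty sets
    return 2 * len(a & b) / (len(a) + len(b))


# Counts reported in the text; which pair ids fall in which group is invented.
N_PAIRS, BOTH_OPPOSITE, BOTH_NON_OPPOSITE = 595, 44, 544

annotator_a = set()
annotator_b = set()
for pair_id in range(N_PAIRS):
    if pair_id < BOTH_OPPOSITE:
        a_says = b_says = True            # both annotators: opposition
    elif pair_id < BOTH_OPPOSITE + BOTH_NON_OPPOSITE:
        a_says = b_says = False           # both annotators: no opposition
    else:
        a_says, b_says = True, False      # the 7 remaining pairs: disagreement
    annotator_a.add((pair_id, a_says))
    annotator_b.add((pair_id, b_says))

recognition_dice = dice(annotator_a, annotator_b)
print(f"{recognition_dice:.3f}")  # prints 0.988
```

Under this reading the two sets have the same size, so the coefficient reduces to the proportion of shared judgments, 588/595, in line with the value reported for opposition recognition.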
On the other hand, considering disagreement for each opposition category (see Table 1), results show that most cases stem from annotating the COMPL category (annotators identified this category in 16 pairs but disagreed on 6 of them) and the REV category (disagreement on 9/21 pairs); by contrast, annotators agreed more consistently on recognizing CONV pairs (just one case of disagreement).
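The per category disagreement can be sketched as follows, assuming that a category counts as recognized for a pair when at least one annotator used it. The function name and the toy label assignments are hypothetical: they are built only to reproduce the COMPL (6 out of 16) and REV (9 out of 21) counts given in the text, and every disagreement is modelled as "tag versus no opposition", which need not match the actual annotations.

```python
from collections import Counter
from typing import Dict, Optional


def per_category_disagreement(
    labels_a: Dict[int, Optional[str]], labels_b: Dict[int, Optional[str]]
) -> Dict[str, float]:
    """For each tag: pairs with disagreement / pairs where the tag was used."""
    recognized: Counter = Counter()
    disagreed: Counter = Counter()
    for pair_id in labels_a.keys() | labels_b.keys():
        tag_a, tag_b = labels_a.get(pair_id), labels_b.get(pair_id)
        for tag in {tag_a, tag_b} - {None}:
            recognized[tag] += 1
            if tag_a != tag_b:
                disagreed[tag] += 1
    return {tag: disagreed[tag] / recognized[tag] for tag in recognized}


# Toy data (invented pair ids). A missing pair id means "no opposition".
labels_a = {i: "COMPL" for i in range(16)}
labels_a.update({i: "REV" for i in range(100, 121)})
labels_b = {i: "COMPL" for i in range(10)}           # 6 COMPL pairs disputed
labels_b.update({i: "REV" for i in range(100, 112)})  # 9 REV pairs disputed

rates = per_category_disagreement(labels_a, labels_b)
```

With these toy labels, the rate is 6/16 for COMPL and 9/21 for REV, matching the proportions given in the text.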
In order to understand the reasons for these discrepancies, we adopted a reconciliation strategy among annotators. In particular, we asked annotators to motivate their choices, with the possibility of revising their selections. After the reconciliation discussion, Dice values increased to 0.99 (considering only opposition recognition) and to 0.98 (considering opposition category), and the per category disagreement decreased for every category (see

Discussion
In this Section we discuss three cases of disagreement among annotators. A first case concerns disagreement when the semantic types specified in the pattern include elements with different characteristics. In some cases, this has led annotators to diverge on whether a pattern is the opposite of another pattern. As an example, consider mettere, pattern 1, in (6) and togliere, pattern 2, in (7).
Pattern 1 of the verb mettere (to place): (6) [...]
The second case we discuss concerns disagreement on the selection of the opposition category, as observable in caricare, pattern 1, in (8) and scaricare, pattern 3, in (9).
Pattern 1 of the verb caricare (to load): (8) [...]
In this pair, one annotator recognized the two patterns as REV, as the two events describe a change in opposite directions and display a temporal relation; in contrast, the other annotator selected ANT, considering that, for both predicates, the objects of caricare and scaricare observed in the corpus samples are quantifiable, and thus the actions are in a certain way measurable.
The third case we discuss highlights disagreement due to the semantic interpretation of the verbal patterns. In these cases, it seems that one annotator focuses on the temporal entailment relation among patterns, thus marking the pair as REV, while the other mainly recognizes that the two patterns divide a conceptual domain in two (see Introduction), thus selecting COMPL. This reason for disagreement leads to an interesting possible interpretation. As detailed in our schema, reversives also hold a temporal relation: a dimension that is not captured by the other opposition relations of the schema. In that sense, the category of reversives seems not to be exclusive; in some cases it appears to be a cross relation that co-exists with other types of opposition.

Conclusions
In this paper we have presented an annotation schema of oppositions among verbal frames. In our schema, opposition relations have been classified in four categories: complementaries, antonyms, converses, reversives. We have conducted a pilot annotation experiment, selecting 25 verb pairs from the T-PAS resource, to assess the reliability of the scheme. Results show that the inter-annotator agreement is very high in the identification of opposite pattern pairs, and fair in distinguishing among categories. We also found that several pattern pairs appear to have properties pertaining to more than one kind of opposite. The experiment confirms that the annotation is doable and can be extended to all verbs in the resource, thus enriching it with opposition relations among frames. To extend the annotation to the whole T-PAS resource, we plan to adopt crowdsourcing, with a more systematic use of the corpus samples associated with each pattern in T-PAS (see Section 2). This will make T-PAS the first resource systematically enriched with opposition relations, which can potentially be exploited to investigate opposition at the textual level.