Identification of Alias Links among Participants in Narratives

Identification of distinct and independent participants (entities of interest) in a narrative is an important task for many NLP applications. This task becomes challenging because these participants are often referred to using multiple aliases. In this paper, we propose an approach based on linguistic knowledge for identification of aliases mentioned using proper nouns, pronouns or noun phrases with a common noun headword. We use a Markov Logic Network (MLN) to encode the linguistic knowledge for identification of aliases. We evaluate on four diverse history narratives of varying complexity. Our approach performs better than the state-of-the-art approach as well as a combination of standard named entity recognition and coreference resolution techniques.


Introduction
Identifying aliases of participants in a narrative is crucial for many NLP applications like timeline creation, question-answering, summarization, and information extraction. For instance, to answer the question (in the context of Table 1) When did Napoleon defeat the royalist rebels?, we need to identify Napoleon and the young lieutenant as aliases of Napoleon Bonaparte. Similarly, a timeline for Napoleon Bonaparte will be inconsistent with the text if the young lieutenant is not identified as an alias of Napoleon Bonaparte. This will further affect any analysis of the timeline (Bedi et al., 2017).
In the context of narrative analysis, we define:
• A participant as an entity of type PERSON (PER), LOCATION (LOC), or ORGANIZATION (ORG). A participant has a canonical mention, which is a standardized reference to that participant (e.g., Napoleon Bonaparte). Further, it may have several aliases, which are different mentions referring to the same participant.
• A basic participant mention can be a sequence of proper nouns (e.g., Napoleon or N. Bonaparte), a pronoun (e.g., he) or a generic NP (e.g., a short man or the young lieutenant).
• Independent basic mentions of a participant play a primary role in the narrative. Dependent basic mentions play a supporting role by qualifying or elaborating independent basic mentions. For each independent mention, we merge all its dependent mentions to create its composite mention.
Note that our notion of dependency is syntactic. A basic mention can be either dependent or independent. A basic mention is said to be dependent if its governor in the dependency parse tree is itself a participant mention; otherwise it is called an independent mention. An independent mention can be a basic mention (if it does not have any dependent mentions) or a composite mention. An independent composite mention is created by recursively merging all its dependent mentions. For instance, for the phrases his parents and parents of Napoleon, the basic participant mentions are his, Napoleon, and parents. In the dependency parse trees, parents is the governor in both cases. Hence, his and Napoleon are basic dependent mentions. The final independent composite mentions his parents and parents of Napoleon are created by merging the dependent mentions with the independent mention parents.
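The distinction between dependent and independent mentions, and the recursive merge into a composite mention, can be sketched in a few lines of Python. This is only an illustrative sketch: the `governor` map is a hand-written stand-in for a real dependency parse, and rendering the composite mention as the token span covering its members is an assumption made here for readability, not something the text specifies.

```python
def classify_mentions(mention_heads, governor):
    """Split mention headwords into independent and dependent ones.

    A mention is dependent if its governor in the parse tree is itself
    a participant mention headword; otherwise it is independent.
    """
    independent, dependent = [], []
    for m in sorted(mention_heads):
        if governor.get(m) in mention_heads:
            dependent.append(m)
        else:
            independent.append(m)
    return independent, dependent


def composite_mention(head, mention_heads, governor):
    """Recursively collect a head and every mention headword it governs."""
    members = [head]
    for m in sorted(mention_heads):
        if governor.get(m) == head:
            members.extend(composite_mention(m, mention_heads, governor))
    return sorted(members)


def render(tokens, members):
    """Assumption: show a composite mention as the span covering its members."""
    return " ".join(tokens[min(members): max(members) + 1])


# Toy parse of "parents of Napoleon" (token index -> governor index).
tokens = ["parents", "of", "Napoleon"]
governor = {0: None, 1: 2, 2: 0}   # hand-written, hypothetical parse
mention_heads = {0, 2}             # "parents" and "Napoleon"

independent, dependent = classify_mentions(mention_heads, governor)
composites = [
    render(tokens, composite_mention(h, mention_heads, governor))
    for h in independent
]
```

Here parents is the only independent mention, Napoleon is dependent on it, and the merge yields the single composite mention parents of Napoleon.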
In this paper, we focus on identification of independent mentions (basic as well as composite) for any participant in a narrative. The problem of identifying aliases of participants is challenging because even though the standard NLP toolkits work well to resolve the coreferences among pronouns and named entities, we observed that they perform poorly for generic NPs. For instance, Stanford CoreNLP does not identify the young lieutenant and Napoleon Bonaparte as the same participant (Table 1); a task we aim to do. This task can be considered as a sub-problem of the standard coreference resolution (Ng, 2017). We build upon output from any standard coreference resolution algorithm, and improve it significantly to detect the missing aliases.
Our goal is to identify the canonical mentions of all independent participants and their aliases. In this paper, we propose a linguistically grounded algorithm for alias detection. Our algorithm utilizes WordNet hypernym structure for identifying participant mentions. It encodes linguistic knowledge in the form of first order logic rules and performs inference in Markov Logic Networks (MLN) (Richardson and Domingos, 2006) for establishing alias links among these mentions.
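One plausible reading of how a hypernym structure can assign a participant type to a generic NP headword is to walk up the hypernym chain until a type-bearing ancestor is reached. The sketch below illustrates only that idea. The hypernym map and the mapping from ancestors to types are tiny hypothetical stand-ins invented for this example; they are not WordNet data, and the actual lookup procedure used by the approach is not shown in this section.

```python
# Hypothetical toy hypernym map (word -> parent concept), NOT real WordNet data.
TOY_HYPERNYMS = {
    "lieutenant": "officer",
    "officer": "person",
    "army": "organization",
    "island": "land",
    "land": "location",
    "victory": "event",
}

# Hypothetical mapping from type-bearing ancestors to participant types.
TYPE_ROOTS = {"person": "PER", "organization": "ORG", "location": "LOC"}


def participant_type(headword, hypernyms=TOY_HYPERNYMS):
    """Walk up the hypernym chain; return PER/ORG/LOC, or OTH if none applies."""
    seen = set()
    node = headword.lower()
    while node is not None and node not in seen:
        if node in TYPE_ROOTS:
            return TYPE_ROOTS[node]
        seen.add(node)              # guard against cycles in the toy map
        node = hypernyms.get(node)
    return "OTH"


lieutenant_type = participant_type("lieutenant")
victory_type = participant_type("victory")
```

With this toy map, lieutenant resolves to PER through officer and person, while victory falls through to OTH and would not be treated as a participant.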

Related Work
Traditionally, alias detection has focused on aliases of named entities that occur as proper nouns (Sapena et al., 2007; Hsiung et al., 2005), using lexical, semantic, and social network analysis. This ignores the aliases which occur as generic NPs. Even in coreference resolution, the focus has only recently come back to generic NP aliases, through the detection of mention heads (Peng et al., 2015a,b). Peng et al. (2015b) propose a notion of Predicate Schemas to capture interaction between entities at the predicate level and instantiate them using knowledge sources like Wikipedia. These instances of Predicate Schemas are then compiled into constraints in an Integer Linear Programming (ILP) based formulation to resolve coreferences. In addition to pronouns, our approach focuses on identification of common noun based aliases of a participant using MLN.
MLN has been used to solve the problem of coreference resolution (Poon and Domingos, 2008; Song et al., 2012). Our work differs from these approaches in that we build upon the output of an off-the-shelf coreference resolution system, rather than identifying aliases/coreferences from scratch. This helps in exploiting the strengths of the existing systems (such as linking pronoun mentions to their antecedents) and in overcoming their weaknesses (such as resolving generic NP mentions) by incorporating additional linguistic knowledge.
A more general and challenging problem involves resolution of bridging descriptions, which concerns relationships between a definite description and its antecedent. As noted in (Poesio et al., 1997), bridging descriptions consider many different types of relationships between a definite description (definite generic NP) and its antecedent; e.g., synonymy, hyponymy, meronymy, events, compound nouns, etc. However, in this paper we focus on identity type of relationships only. Further, WordNet has been used to identify these relationship types between definite descriptions. As described in Phase-I of Algorithm 1 (Section 3), we use WordNet for a completely different purpose, namely identifying the participant type. Gardent and Kow (2003) presented a corpus study of bridging definite descriptions and their typologies. They identified several types of bridging relations, such as set-subset and event-argument.

Our Approach
Our approach has three broad phases: (I) Identification of participants, (II) MLN based formulation to identify aliases, and (III) Composite mention creation. We use a Unified Linguistic Denotation Graph (ULDG) representation of NLP-processed sentences in the input narrative. The ULDG unifies the output of various stages of the NLP pipeline, such as dependency parsing, NER and coreference resolution; Figure 1 shows a sample ULDG.
A ULDG $G(V, E_d, E_p, E_a)$, corresponding to a set $S$ of $n$ sentences, is a vertex-labeled and edge-labeled graph. A node $u \in V$ corresponds to a token in $S$ and its label is defined as $L_u = (s, t, token, POS, p, a)$, where $s$ is the sentence index, $t$ is the token index, $token$ is the token itself, $POS$ is the part-of-speech tag of the token, $p$ denotes the participant type ($p \in \{PER, ORG, LOC, OTH\}$, where OTH stands for OTHER) if $u$ is the headword of a participant mention, and $a$ denotes the canonical participant mention of the corresponding group of aliases. There are three types of edges:
• $E_d = \{\langle u, v, dep \rangle\}$: directed dependency edges labeled with $dep$ (the dependency relation), each connecting a governor (parent) token $u$ to its dependent token $v$; e.g., $\langle$sent, parent, nsubj$\rangle$.
• $E_p = \{\langle u, v \rangle\}$: directed edges, each connecting the headword $u$ of a participant phrase to each of its constituent words $v$; e.g., $\langle$Bonaparte, Napoleon$\rangle$.
• $E_a = \{\langle u, v \rangle\}$: undirected edges, each connecting nodes $u$ and $v$ which are headwords of aliases of the same participant; e.g., $\langle$him, Bonaparte$\rangle$.
Our approach is summarized in Algorithm 1. Its input is a ULDG $G(V, E_d, E_p, E_a)$ for a set $S$ of given sentences. We initialize $V$, $E_d$, $E_p$ and $E_a$ using any standard dependency parser, NER and coreference resolution techniques.
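To make the graph definition concrete, the following is a minimal sketch of how a ULDG could be held in memory. The class names, field names and helper methods are hypothetical choices made for this illustration, and the `dobj` edge is an invented example rather than one taken from Figure 1.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional, Set, Tuple

NodeId = Tuple[int, int]  # (sentence index s, token index t)


@dataclass
class NodeLabel:
    s: int                     # sentence index
    t: int                     # token index
    token: str                 # surface form of the token
    pos: str                   # part-of-speech tag
    p: Optional[str] = None    # participant type: PER, ORG, LOC or OTH
    a: Optional[str] = None    # canonical mention of the alias group


@dataclass
class ULDG:
    nodes: Dict[NodeId, NodeLabel] = field(default_factory=dict)
    e_d: Set[Tuple[NodeId, NodeId, str]] = field(default_factory=set)  # dependencies
    e_p: Set[Tuple[NodeId, NodeId]] = field(default_factory=set)       # head -> constituent
    e_a: Set[FrozenSet[NodeId]] = field(default_factory=set)           # undirected alias links

    def add_node(self, label: NodeLabel) -> NodeId:
        key = (label.s, label.t)
        self.nodes[key] = label
        return key

    def add_dependency(self, governor: NodeId, dependent: NodeId, rel: str) -> None:
        self.e_d.add((governor, dependent, rel))

    def add_constituent(self, head: NodeId, word: NodeId) -> None:
        self.e_p.add((head, word))

    def add_alias(self, u: NodeId, v: NodeId) -> None:
        # frozenset makes the edge undirected: {u, v} == {v, u}
        self.e_a.add(frozenset((u, v)))


g = ULDG()
napoleon = g.add_node(NodeLabel(0, 0, "Napoleon", "NNP"))
bonaparte = g.add_node(NodeLabel(0, 1, "Bonaparte", "NNP", p="PER", a="Napoleon Bonaparte"))
sent = g.add_node(NodeLabel(1, 1, "sent", "VBD"))
him = g.add_node(NodeLabel(1, 2, "him", "PRP", p="PER", a="Napoleon Bonaparte"))

g.add_constituent(bonaparte, napoleon)   # E_p: <Bonaparte, Napoleon>
g.add_dependency(sent, him, "dobj")      # E_d: invented example edge
g.add_alias(him, bonaparte)              # E_a: <him, Bonaparte>
```

Storing alias edges as frozensets keeps them undirected, which matches the definition of $E_a$, while the other two edge sets keep their direction as ordered tuples.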

Algorithm 1: identify participants & aliases
Our algorithm modifies the input ULDG in place by updating node labels, $E_p$ and $E_a$. Figure 1 shows an example of initialized input ULDG, which gets transformed by our algorithm to the output ULDG shown in Figure 2.
Phase-I: In this phase, we update participant type [...]

Table 2: MLN predicates and key first-order logic rules.

| Predicates | Description |
|---|---|
| NEType(x, y) | y is entity type of participant x |
| CopulaConnect(x, y) | Participants x and y are connected through a copula verb or a "copula-like" verb in $E_d$ (e.g., become) |
| Conj(x, y) | Participants x and y are connected by a conjunction in $E_d$ |
| DiffVerbConnect(x, y) | Participants x and y are connected through a "differentiating" verb or a copula-like verb in $E_d$ (e.g., tell) |
| LexSim(x, y) | Participants x and y are lexically similar, i.e., having low edit distance |
| Alias(x, y) | Participants x and y are aliases of each other (used as a query predicate) |

| Hard rules | Description |
|---|---|
| Alias(x, x); Alias(x, y) ⇒ Alias(y, x) | Reflexivity and symmetry of aliases |
| Alias(x, y) ∧ Alias(y, z) ⇒ Alias(x, z); Alias(x, y) ∧ ¬Alias(y, z) ⇒ ¬Alias(x, z) | Transitivity of aliases |
| Alias(x, y) ⇒ (NEType(x, z) ⇔ NEType(y, z)) | If x and y are aliases, their entity types should be the same |
| Conj(x, y) ⇒ ¬Alias(x, y) | If x and y are conjuncts, then they are less likely to be aliases |

| Soft rules | Description |
|---|---|
| CopulaConnect(x, y) ⇒ Alias(x, y) | If x and y are connected through a copula or copula-like verb in $E_d$, then they are aliases of each other |
| LexSim(x, y) ⇒ Alias(x, y) | If x and y are lexically similar, then they are likely to be aliases |
| DiffVerbConnect(x, y) ⇒ ¬Alias(x, y) | If x and y are subjects / objects of a "differentiating" verb, then they are not likely to be aliases of each other |

[...] (2016), MLN gives the benefits of (i) ability to employ soft constraints, (ii) compact representation, and (iii) ease of specification of domain knowledge. The predicates and key first-order logic rules are described in Table 2. Here, Alias(x, y) is the only query predicate.
Others are evidence predicates, whose observed groundings are specified using $G$. As we use a combination of hard rules (i.e., rules with infinite weight) and soft rules (i.e., rules with finite weights), probabilistic inference in MLN is necessary to find the most likely groundings of the predicate Alias(x, y). As the goal is to minimize supervision and to avoid dependence on annotated data, we rely on domain knowledge in the current version to set the MLN rule weights.
Phase-III: In this phase, we extract an auxiliary subgraph $G'(V', E') \subset G$, where $V'$ contains only those nodes which correspond to headwords of basic participant mentions and $E'$ contains only those edges incident on nodes in $V'$ and labeled with appos or nmod. We identify each independent participant mention in $G'$ and merge its dependent mentions using depth first search (DFS) on $G'$.
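The interplay of hard and soft rules in the Phase-II formulation can be illustrated with a brute-force sketch that searches for the most likely Alias groundings on a handful of mentions. This is not how an MLN engine performs inference and it is not the implementation evaluated here: it enumerates every partition of the mentions (which enforces reflexivity, symmetry and transitivity by construction), discards partitions that violate the remaining hard rules, and scores the rest by the total weight of satisfied soft-rule groundings. All weights, the small negative prior on Alias, and the evidence pairs are hypothetical values chosen for illustration.

```python
from itertools import combinations


def partitions(items):
    """Yield every partition of a list into blocks (each block = one alias group)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part


def map_aliases(types, copula, lexsim, diffverb, conj,
                w_cop=2.0, w_lex=1.5, w_diff=1.0, w_prior=-0.1):
    """Brute-force MAP over alias partitions; weights are hypothetical."""
    def norm(pairs):
        return {frozenset(p) for p in pairs}

    copula, lexsim, diffverb, conj = map(norm, (copula, lexsim, diffverb, conj))
    best, best_score = None, float("-inf")
    for part in partitions(list(types)):
        alias = {frozenset(p) for block in part for p in combinations(block, 2)}
        # Hard rule: aliases must share the same entity type.
        if any(len({types[m] for m in block}) > 1 for block in part):
            continue
        # Hard rule (as listed in Table 2): conjuncts are not aliases.
        if alias & conj:
            continue
        score = (w_cop * len(copula & alias)        # CopulaConnect => Alias
                 + w_lex * len(lexsim & alias)      # LexSim => Alias
                 + w_diff * len(diffverb - alias)   # DiffVerbConnect => not Alias
                 + w_prior * len(alias))            # assumed weak prior against linking
        if score > best_score:
            best, best_score = part, score
    return {frozenset(block) for block in best}, best_score


types = {"Napoleon Bonaparte": "PER", "lieutenant": "PER", "Bonaparte": "PER",
         "parents": "PER", "Corsica": "LOC"}
groups, score = map_aliases(
    types,
    copula=[("Napoleon Bonaparte", "lieutenant")],
    lexsim=[("Napoleon Bonaparte", "Bonaparte")],
    diffverb=[("lieutenant", "parents")],
    conj=[("Bonaparte", "parents")],
)
```

On this toy evidence the best-scoring world links Napoleon Bonaparte, lieutenant and Bonaparte into one group and leaves parents and Corsica on their own. Enumeration grows with the Bell numbers, so it is only feasible for a few mentions; it is meant to show what the rules express, not how to scale them.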
Finally, each clique in $E_a$ represents the aliases of a unique participant. We use the earliest non-pronoun mention in text order as the canonical mention for that clique.
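The final grouping and canonical mention selection can be sketched as follows. The sketch assumes the alias edges are already transitively consistent, so that connected components coincide with cliques; it treats the Penn Treebank tags PRP and PRP$ as pronouns; and it falls back to the earliest mention when a group contains only pronouns. That fallback and the mention identifiers are assumptions made for this example.

```python
PRONOUN_TAGS = {"PRP", "PRP$"}  # Penn Treebank pronoun tags


def alias_groups(nodes, alias_edges):
    """Group mentions into connected components of the alias edges (union-find)."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in alias_edges:
        parent[find(u)] = find(v)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())


def canonical(group, nodes):
    """Earliest non-pronoun mention in text order; earliest mention as fallback."""
    non_pronouns = [n for n in group if nodes[n]["pos"] not in PRONOUN_TAGS]
    candidates = non_pronouns or group
    return min(candidates, key=lambda n: (nodes[n]["s"], nodes[n]["t"]))


# Hypothetical mention headwords with sentence index, token index and POS tag.
nodes = {
    "m1": {"s": 0, "t": 1, "text": "Napoleon Bonaparte", "pos": "NNP"},
    "m2": {"s": 1, "t": 0, "text": "He", "pos": "PRP"},
    "m3": {"s": 2, "t": 3, "text": "the young lieutenant", "pos": "NN"},
    "m4": {"s": 0, "t": 5, "text": "Corsica", "pos": "NNP"},
}
alias_edges = [("m1", "m2"), ("m2", "m3")]

groups = alias_groups(nodes, alias_edges)
canonical_of = {
    frozenset(group): nodes[canonical(group, nodes)]["text"] for group in groups
}
```

Here the three person mentions form one group whose canonical mention is Napoleon Bonaparte, and Corsica forms a singleton group that is its own canonical mention.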

Experimental Analysis
Datasets: We evaluate our approach on history narratives as they are replete with challenging cases of alias detection. We choose public narratives of varying linguistic complexity to cover a spectrum of history: (i) famous personalities: Napoleon (Nap) (Littel, 2008) and Mao Zedong (Mao) (Wikipedia, 2018), (ii) a key event: Battle of Haldighati (BoH) (Chandra, 2007), and (iii) a major phenomenon: Fascism (Fas) (Littel, 2008). We manually annotated these datasets for the independent participant mentions and their aliases. For each alias group of participant mentions, we use the earliest non-pronoun mention as its canonical mention.
We also evaluate it on the newswire subset (ACE_nw) of the standard ACE 2005 dataset (Walker et al., 2006). Entity mention annotations were transformed such that only independent entity mentions and their aliases are used. We relied on the Nap dataset to develop intuition for designing the algorithm and tuning the MLN rules. All other datasets (ACE, BoH, Fas, and Mao) are unseen, independent test datasets.
Baselines: B1 is a standard approach to this problem where the outputs of the NER and coreference components of the Stanford CoreNLP toolkit are combined to detect aliases. B2 is the state-of-the-art coreference resolution system based on Peng et al. (2015a,b). M is our proposed alias detection approach (Algorithm 1).
Evaluation: The performance of all the approaches is evaluated at two levels: all independent participant mentions (i.e., participant detection) and their links with canonical mentions (i.e., participant linking). We use the standard F1 metric to measure performance of participant detection. For participant linking, we evaluate the combined performance of participant mention identification and alias detection (Pradhan et al., 2014) using the standard evaluation metrics MUC (Vilain et al., 1995), BCUB (Bagga and Baldwin, 1998), Entity-based CEAF (CEAFe) (Luo, 2005) and their average.
Results: Results of the quantitative evaluation are summarized in Table 3. We observe that the proposed approach outperforms the baselines on all datasets.
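Of the linking metrics named above, BCUB (B-cubed) is the simplest to state: for each mention, precision is the fraction of its predicted group that is correct and recall is the fraction of its gold group that was recovered, and both are averaged over mentions. The sketch below is a simplified illustration that assumes the gold and predicted clusterings cover exactly the same mentions. It is not the reference scorer, which also handles mentions that appear on only one side.

```python
def b_cubed(key_clusters, response_clusters):
    """Simplified B-cubed precision, recall and F1.

    Assumes both clusterings cover exactly the same set of mentions.
    """
    key_of = {m: frozenset(c) for c in key_clusters for m in c}
    resp_of = {m: frozenset(c) for c in response_clusters for m in c}
    if key_of.keys() != resp_of.keys():
        raise ValueError("this sketch requires identical mention sets")

    n = len(key_of)
    precision = sum(len(key_of[m] & resp_of[m]) / len(resp_of[m]) for m in key_of) / n
    recall = sum(len(key_of[m] & resp_of[m]) / len(key_of[m]) for m in key_of) / n
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy example: gold alias groups versus a system output over mentions a-d.
key = [{"a", "b", "c"}, {"d"}]
response = [{"a", "b"}, {"c", "d"}]
precision, recall, f1 = b_cubed(key, response)
```

In this toy case the system splits one gold group and wrongly merges c with d, giving a precision of 0.75 and a recall of 2/3.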
Correct identification of generic NPs as participant mentions and accurate addition of alias edges through the MLN formulation lead to the improved performance of Algorithm 1; e.g., in Table 1, the baselines fail to detect a lieutenant as an alias for Napoleon Bonaparte, but the proposed approach succeeds as it exploits the MLN rule CopulaConnect(x, y) ⇒ Alias(x, y). As an illustration of the proposed approach, Table 4

Conclusions
Alias detection is an important and challenging NLP problem. We proposed a linguistically grounded approach to identify aliases of participants in a narrative. We observed that the WordNet hypernym tree helps in identification of participant aliases mentioned using generic NPs. MLN proved to be an effective framework to encode linguistic knowledge and achieve better alias detection performance. Our approach was evaluated on history narratives, which pose challenging alias detection cases, and demonstrated better performance than the state-of-the-art approach. Our goal in the current paper was to improve the output of off-the-shelf coreference algorithms by exploiting their strengths (such as linking pronoun mentions to their antecedents) and overcoming their weaknesses (such as resolving generic noun phrase mentions). As part of future work, we plan to enhance existing MLN frameworks for coreference resolution by integrating the proposed MLN predicates and rules.