DA-IICT Submission for PDTB-styled Discourse Parser

The CoNLL 2016 Shared Task focusses on building a Shallow Discourse Parsing system, which takes a piece of newswire text as input and returns all discourse relations in that text, each in the form of a discourse connective, its two arguments and the relation sense. We have built such a parser using a pipeline architecture, and we employ machine learning methods to train a classifier for each component in the pipeline. The system achieves an overall F1 score of 0.1065 when tested on the blind dataset provided by the task organisers. On the same dataset, it achieves an F1 score of 0.2067 for explicit relations and 0.0112 for non-explicit relations.


INTRODUCTION
Discourse Parsing is the process of assigning a discourse structure to the input provided in the form of natural language. The term "Shallow" signifies that the annotation of one discourse relation is independent of all other discourse relations, thus leaving room for a high level analysis that may attempt to connect them.
For training and testing the system, we used the PDTB (Penn Discourse Treebank), which is a discourse-level annotation on top of the PTB (Penn Treebank). The corpus provides annotation for all discourse relations present in the documents. A discourse relation is composed of a discourse connective, its two arguments and the relation sense. PDTB provides a list of 100 discourse connectives, which may indicate the presence of a relation. A discourse connective can fall into any of three categories: coordinating conjunctions (e.g. and, but), subordinating conjunctions (e.g. if, because) or discourse adverbials (e.g. however, also).

EntRel (Entity Based Coherence)
Explicit relations are marked by the presence of one of the 100 connectives pre-defined by PDTB. Implicit relations are inferred by the reader; no words explicitly indicate the relationship. Sometimes, expressions that are not among the connectives pre-defined by PDTB indicate a relationship; such relations are called AltLex relations. EntRel relations exist between two sentences in which the same entity is realised, and they do not have a sense. Some examples are given in figure 1. There, the underlined word represents the discourse connective, italicised text represents argument 1 and bold text represents argument 2. The right-indented text following each relation represents the relation sense, and the text in brackets represents the relation type.
There are many challenges associated with this task. Firstly, we need to identify when a word works as a discourse connective and when it does not. In figure 1, consider examples 1 and 3. Both contain the word "and", which is present in the list of explicit connectives. However, it acts as a discourse connective in example 1 but not in example 3, where it just links "political" and "currency" in a noun phrase. Secondly, we need to extract the arguments from the sentences. Finally, we need to identify the relation sense.
1. The agency has already spent roughly $19 billion selling 34 insolvent S&Ls, and it is likely to sell or merge 600 by the time the bailout concludes.
        Expansion.Conjunction (Explicit)
2. But it doesn't take much to get burned. Implicit = FOR EXAMPLE Political and currency gyrations can whipsaw the funds.
        Expansion.Restatement.Specification (Implicit)
3. Political and currency gyrations can whipsaw the funds. AltLex [Another concern]: The funds' share prices tend to swing more than the broader market.
        Expansion.Conjunction (AltLex)
4. Pierre Vinken, 61 years old, will join the board as a non-executive director Nov. 29. Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.
        (EntRel)

Figure 1: Examples of various types of discourse relations

The study of discourse parsing has a variety of applications in the field of Natural Language Processing. For instance, in summarisation systems, redundancy is an important aspect. We can analyse discourse relations with an Expansion sense to weed out the redundant material. Also, in Question Answering systems, we can make use of relations with Cause senses to answer "why" questions.
The report is organised as follows. Section 2 gives a brief overview of the system. Section 3 describes each component in detail, along with the features deployed to build our parser. Section 4 reports the evaluation strategy and the results achieved by our parser.

System Overview
There are five major components involved in the process of discourse parsing, as shown in figure 2. In the PDTB corpus, for explicit relations, argument 2 is always syntactically bound to the connective (i.e. it is in the same sentence as the connective). Argument 1 can be in one of the previous sentences (PS case), in the same sentence (SS case) or in a following sentence (FS case). Since FS cases occur too rarely (only 4 instances out of a total of 32000 relations), our system ignores them. The Argument Position Identifier tries to identify this relative position of argument 1 with respect to argument 2.
In the PS case, the immediately preceding sentence is taken as the sentence containing argument 1. This holds for 92% of the cases in the training data. The Argument Extractor then extracts the argument span from the sentence.
The Explicit Sense Classifier identifies the relation sense. This is important because the same connective may convey different meanings in different contexts. For example, the word "since" can be used in different senses, as shown in figure 3: in example 1 it is used in a temporal sense, while in example 2 it is used in a causal sense.
1. There have been more than 100 mergers and acquisitions within the European paper industry since the most recent wave of friendly takeovers was completed in the U.S. in 1986.
2. It was a far safer deal for lenders since NWA had a healthier cash flow and more collateral on hand.

Figure 3: Since being used in different senses
The Non Explicit Classifier tries to identify one of the non-explicit relations (Implicit, AltLex, EntRel), or otherwise NoRel (no relation), between adjacent sentences within the same paragraph.
The Non Explicit Argument Extractor tries to extract the argument spans for non-explicit relations.
For classification, our system uses the Maximum Entropy (MaxEnt) classification algorithm without smoothing.
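As a rough illustration of what a MaxEnt classifier computes at prediction time, the sketch below turns per-label feature weights into a probability distribution over labels. It is not the OpenNLP implementation used in the system; the class name MaxEntSketch, the feature strings and any weights passed to it are hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of MaxEnt prediction with binary features (illustrative only).
final class MaxEntSketch {

    // weights: label -> (feature name -> weight); features without a weight contribute 0.
    static Map<String, Double> distribution(Map<String, Map<String, Double>> weights,
                                            List<String> activeFeatures) {
        Map<String, Double> scores = new LinkedHashMap<>();
        double max = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Map<String, Double>> entry : weights.entrySet()) {
            double score = 0.0;
            for (String feature : activeFeatures) {
                score += entry.getValue().getOrDefault(feature, 0.0);
            }
            scores.put(entry.getKey(), score);
            max = Math.max(max, score);
        }
        // Subtract the maximum score before exponentiating for numerical stability.
        double normaliser = 0.0;
        for (double score : scores.values()) {
            normaliser += Math.exp(score - max);
        }
        Map<String, Double> probabilities = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : scores.entrySet()) {
            probabilities.put(entry.getKey(), Math.exp(entry.getValue() - max) / normaliser);
        }
        return probabilities;
    }
}
```

The label with the highest probability would be taken as the prediction of the component in question.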

Connective Classifier
The input to this component is free text from the documents. We scan all the words in all the documents and identify the occurrences of the predefined explicit connectives. Then, we decide whether each of these occurrences actually works as a discourse connective or not. For this task, we used the syntactic features of Pitler and Nenkova (2009). Lin et al. (2014) approached this problem using POS tags and context-based features. They also used features from the syntax tree, namely the path from the connective word to the root and the compressed path (i.e. identical consecutive nodes in the path are merged). We have used similar features, as shown in table 1. Here, C-syn features refer to the combination of the connective string with each syntactic feature, and syn-syn features refer to the pairing of one syntactic feature with a different syntactic feature.
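The compressed path and the two kinds of combined features can be sketched as follows. This is a minimal illustration, not the system's actual code: the class name ConnectiveFeatures, the "|" separator and the feature strings such as "self=CC" are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative helpers for connective-classifier features (names are hypothetical).
final class ConnectiveFeatures {

    // Merges runs of identical consecutive labels, e.g. [CC, NP, NP, S] -> [CC, NP, S].
    static List<String> compressPath(List<String> path) {
        List<String> compressed = new ArrayList<>();
        for (String label : path) {
            if (compressed.isEmpty() || !compressed.get(compressed.size() - 1).equals(label)) {
                compressed.add(label);
            }
        }
        return compressed;
    }

    // C-syn: the lower-cased connective string combined with each syntactic feature.
    static List<String> connectiveSyntaxPairs(String connective, List<String> syntacticFeatures) {
        List<String> pairs = new ArrayList<>();
        for (String feature : syntacticFeatures) {
            pairs.add(connective.toLowerCase(Locale.ROOT) + "|" + feature);
        }
        return pairs;
    }

    // syn-syn: every pair made of two different syntactic features.
    static List<String> syntaxSyntaxPairs(List<String> syntacticFeatures) {
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i < syntacticFeatures.size(); i++) {
            for (int j = i + 1; j < syntacticFeatures.size(); j++) {
                pairs.add(syntacticFeatures.get(i) + "|" + syntacticFeatures.get(j));
            }
        }
        return pairs;
    }
}
```

For a path such as CC, NP, NP, S, S, S from the connective to the root, compressPath would yield CC, NP, S.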

Argument Labeller
Here, we first identify the relative position of argument 1 with respect to argument 2. Given this position, we extract the arguments from sentences.

Argument Position Identifier
To identify the position of argument 1, we extract the features mentioned in table 2.

Argument Extractor
After predicting the position of argument 1, we employed different tactics for different positions:
• If the position is SS (that is, both arguments are in the same sentence), then we use the constituency-based approach of Kong et al., without joint inference, to extract the arguments. This consists of two steps:
  - Pruning: In the parse tree of the sentence, identify the node dominating all the connective words. From that node, move towards the root and collect all the siblings. If this node does not exactly contain the connective words, collect all its children too. These nodes are termed constituents.
  - Classification: For all these constituents, we extract the features mentioned in table 3.
• If the position is PS, then we consider the immediately previous sentence as a candidate for containing argument 1 and the sentence containing the connective string as a candidate for containing argument 2. Extracting the arguments from these sentences is a two-step process: each sentence is first split into clauses, and each clause is then classified using clause-level features.

Constituent features (partial list):
   Path of Connective String to the constituent node in syntax tree
6  Relative position of constituent node with respect to Connective String
7  Path of Connective String to the constituent node in syntax tree + whether number of left siblings of Connective String > 1

Clause features (partial list):
   First word in this clause
8  Last word in this clause
9  Last word in previous clause
10 First word in next clause
11 Last word in previous clause + First word in this clause
12 Last word in this clause + First word in next clause
13 Position of this clause in sentence: start, middle or end

Clause features (partial list):
   First 3 terms of argument 2 sentence
   First word in this clause
5  Last word in this clause
6  Last word in previous clause
7  First word in next clause
8  Last word in previous clause + First word in this clause
9  Last word in this clause + First word in next clause
10 Position of this clause in the sentence
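The pruning step described for the SS case could look roughly as follows. This is a sketch under stated assumptions: ParseNode and ConstituentPruner are hypothetical classes written for this example (the system itself works on parser output), leaves stand for POS-tagged words, and the list of connective leaves is assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal parse-tree node; leaves carry a word, internal nodes do not.
final class ParseNode {
    final String label;
    final String word;
    ParseNode parent;
    final List<ParseNode> children = new ArrayList<>();

    private ParseNode(String label, String word) {
        this.label = label;
        this.word = word;
    }

    static ParseNode leaf(String label, String word) {
        return new ParseNode(label, word);
    }

    static ParseNode node(String label, ParseNode... kids) {
        ParseNode n = new ParseNode(label, null);
        for (ParseNode kid : kids) {
            kid.parent = n;
            n.children.add(kid);
        }
        return n;
    }

    // All leaves below this node, left to right (a leaf returns itself).
    List<ParseNode> leaves() {
        List<ParseNode> out = new ArrayList<>();
        collect(this, out);
        return out;
    }

    private static void collect(ParseNode n, List<ParseNode> out) {
        if (n.children.isEmpty()) {
            out.add(n);
        } else {
            for (ParseNode c : n.children) {
                collect(c, out);
            }
        }
    }
}

final class ConstituentPruner {

    // Lowest node whose leaves include every connective leaf.
    static ParseNode lowestDominating(List<ParseNode> connectiveLeaves) {
        ParseNode node = connectiveLeaves.get(0);
        while (node != null && !node.leaves().containsAll(connectiveLeaves)) {
            node = node.parent;
        }
        return node;
    }

    // Collects candidate constituents following the pruning description above.
    static List<ParseNode> prune(List<ParseNode> connectiveLeaves) {
        ParseNode conn = lowestDominating(connectiveLeaves);
        List<ParseNode> candidates = new ArrayList<>();
        // The node covers extra words, so its children become candidates too.
        if (conn.leaves().size() != connectiveLeaves.size()) {
            candidates.addAll(conn.children);
        }
        // Walk towards the root, collecting the siblings at every level.
        for (ParseNode n = conn; n.parent != null; n = n.parent) {
            for (ParseNode sibling : n.parent.children) {
                if (sibling != n) {
                    candidates.add(sibling);
                }
            }
        }
        return candidates;
    }
}
```

Each returned node would then be passed to the classification step, which labels it using the constituent features.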

Explicit Sense Classifier
To determine the relation sense, we use Lin's as well as Pitler's features, as shown in table 5.

Non Explicit Classifier
Non-explicit relations occur between adjacent sentences within the same paragraph. We take the first sentence as the one containing argument 1 and the second as the one containing argument 2. Then, we extract the features mentioned in table 6.
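Candidate generation for this classifier can be sketched as below: every pair of adjacent sentences inside one paragraph becomes a candidate, and no pair crosses a paragraph boundary. The class name AdjacentSentencePairs and the representation of a document as a list of paragraphs are assumptions for this example; the sketch also does not filter out any pairs, which the system may or may not do.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative candidate generation for non-explicit relations (hypothetical class).
final class AdjacentSentencePairs {

    // paragraphs: each inner list holds the sentences of one paragraph, in order.
    // Returns {argument 1 sentence, argument 2 sentence} pairs.
    static List<String[]> candidates(List<List<String>> paragraphs) {
        List<String[]> pairs = new ArrayList<>();
        for (List<String> paragraph : paragraphs) {
            for (int i = 0; i + 1 < paragraph.size(); i++) {
                pairs.add(new String[] { paragraph.get(i), paragraph.get(i + 1) });
            }
        }
        return pairs;
    }
}
```

A paragraph with n sentences therefore contributes n - 1 candidate pairs, each of which would be classified as Implicit, AltLex, EntRel or NoRel.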

Argument Extractor
To extract argument spans for non-explicit relations other than EntRel, we first use the clause splitter as mentioned before and then extract the features mentioned in table 7 for each clause. For EntRel relations, we simply take the first sentence as argument 1 and the second sentence as argument 2.

System Setup
We used the training datasets provided by the CoNLL 2016 organisers (LDC2016E50). In addition, we used Brown clusters (3200 classes). For stemming, we used the Snowball stemmer, and for lemmatising, we used the Stanford CoreNLP library.
For classification, we used the Apache OpenNLP implementation of the MaxEnt classifier. We implemented the parser in Java.

Evaluation Strategy
A relation is counted as correct iff:
• The discourse connective is correctly detected (for explicit relations).
• The sense of the relation is correctly predicted.
• The text spans of the two arguments as well as their labels (Arg1 and Arg2) are correctly predicted. Partial matches are not counted as correct.
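The exact-match criterion above can be sketched as a small scorer. This is a simplification for illustration and not the official shared-task scorer, which may differ in detail: the classes Relation and ExactMatchScorer are hypothetical, arguments are compared as plain strings, and non-explicit relations are assumed to use an empty connective string.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical relation record; two relations are equal only if every field matches exactly.
final class Relation {
    final String connective; // empty string for non-explicit relations (assumption)
    final String arg1;
    final String arg2;
    final String sense;

    Relation(String connective, String arg1, String arg2, String sense) {
        this.connective = connective;
        this.arg1 = arg1;
        this.arg2 = arg2;
        this.sense = sense;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Relation)) {
            return false;
        }
        Relation r = (Relation) o;
        return connective.equals(r.connective) && arg1.equals(r.arg1)
                && arg2.equals(r.arg2) && sense.equals(r.sense);
    }

    @Override
    public int hashCode() {
        return Objects.hash(connective, arg1, arg2, sense);
    }
}

final class ExactMatchScorer {

    // Returns {precision, recall, F1}; partial matches earn no credit.
    static double[] score(Collection<Relation> gold, Collection<Relation> predicted) {
        Set<Relation> goldSet = new HashSet<>(gold);
        Set<Relation> predictedSet = new HashSet<>(predicted);
        int correct = 0;
        for (Relation r : predictedSet) {
            if (goldSet.contains(r)) {
                correct++;
            }
        }
        double precision = predictedSet.isEmpty() ? 0.0 : (double) correct / predictedSet.size();
        double recall = goldSet.isEmpty() ? 0.0 : (double) correct / goldSet.size();
        double f1 = (precision + recall == 0.0)
                ? 0.0
                : 2 * precision * recall / (precision + recall);
        return new double[] { precision, recall, f1 };
    }
}
```

Because a relation must match on every field, an error in any earlier pipeline component (connective, argument span or sense) makes the whole relation count as wrong.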

Results
Results are given in table 8. As we can see, the explicit connective classifier achieves a precision of only around 0.77, while the best team of the previous year (Wang) achieved a precision of 0.93. This is not good enough and is perhaps the major reason for errors being propagated to the subsequent components. The results for non-explicit relations were also discouraging, with an F1 score of only 0.012.

Conclusion and Further Work
This paper describes the PDTB-styled discourse parser we implemented for the CoNLL '16 shared task. We divided the system into different components and arranged them in a pipeline. We apply Maximum Entropy classification in each of these components. This is ongoing work. We plan to incorporate deep learning methods in each component to try to improve the system. We also plan to do feature selection to optimise the components of our system.