A Dual-Layer Semantic Role Labeling System

We describe a well-performing semantic role labeling system that further extracts concepts (smaller semantic expressions) from unstructured natural language sentences in a language-independent manner. A dual-layer semantic role labeling (SRL) system is built using Chinese Treebank and Propbank data. Contextual information is incorporated while labeling the predicate arguments to achieve better performance. Experimental results show that the proposed approach is superior to the best CoNLL 2009 systems and comparable to the state of the art, with the advantage that it requires no feature engineering. Concepts are further extracted according to templates formulated from the labeled semantic roles; they can serve as features in other NLP tasks to provide semantically related cues and potentially help in related research problems. We also show that it is easy to build a version of this system for a different language by actually building an English system, which performs satisfactorily.


Introduction
Semantic roles are utilized to find concepts automatically and to assure their meaningfulness. Semantic role labeling is the research problem of finding, in a given sentence, the predicates and their arguments (identification), and of further labeling the semantic relationship between predicates and arguments, that is, their semantic roles (classification). Several label sets exist. Researchers have widely adopted the semantic role labels defined in Propbank (Bonial et al., 2010), such as predicate (PRED), numbered arguments 0 to 5 (ARG0 to ARG5), and modifier arguments (ARGM-X). Finer labels are those defined in Sinica Treebank (Huang et al., 2000), such as agent, theme, and target, which are labeled on each node of the parse tree; those defined in FrameNet (Ruppenhofer et al., 2006) are the finest-grained and most expressive. Each set provides semantic information. As long as the semantic relationship between terms can be derived from their semantic role labels, we are able to determine whether they should be extracted from the current sentence to construct a concept.
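To make the two sub-tasks concrete, the following minimal sketch shows identification and classification on one toy sentence with Propbank-style labels. The sentence, spans, and the rule-based classifier are purely illustrative, not part of the proposed system.

```python
# Illustrative sketch of the two SRL sub-tasks on one sentence,
# using Propbank-style labels. All data here is made up.

sentence = ["The", "committee", "approved", "the", "budget", "yesterday"]

# Identification: find the predicate and the spans that are its arguments.
predicate = (2, "approved")                  # (token index, word)
argument_spans = [(0, 1), (3, 4), (5, 5)]    # inclusive token spans

# Classification: assign a semantic role to each identified span.
def classify(span):
    """Toy rule table: subject -> ARG0, object -> ARG1, temporal -> ARGM-TMP."""
    labels = {(0, 1): "ARG0", (3, 4): "ARG1", (5, 5): "ARGM-TMP"}
    return labels[span]

roles = {span: classify(span) for span in argument_spans}
print(roles)  # {(0, 1): 'ARG0', (3, 4): 'ARG1', (5, 5): 'ARGM-TMP'}
```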
The word concept usually refers to an abstract or general idea inferred or derived from specific instances. Therefore, the extraction of concepts from text is often defined as extracting terms that are in some way related to one another. These terms may be predefined by people in resources such as ontologies, or they may be typical words in texts. In this paper, we view concepts as the continuous or discontinuous meaningful units in a sentence, and hence they are tightly related to semantic roles. We propose a dual-layer semantic role labeling system which extracts concepts according to the labels it reports, and then demonstrate the functions of this system. Experimental results show the merit of the proposed framework.

Related Work
Previous studies related to this work fall into two groups: semantic role labeling and concept extraction. Semantic role labeling (SRL) has sparked much interest in NLP (Shen and Lapata, 2007; Liu and Gildea, 2010). The first automatic SRL system was reported by Gildea and Jurafsky (2002); since then, their ideas have dominated the field. Their approach emphasizes the selection of appropriate lexical and syntactic features for SRL, the use of statistical classifiers and their combinations, and ways to handle data sparseness. Many researchers have built on their work by augmenting and/or altering the feature set (Xue 2004), by experimenting with various classification approaches (Pradhan et al. 2004; Park and Rim 2005), and by attempting different ways to handle data sparseness (Zapirain, Agirre, and Màrquez 2007). Moreover, some researchers have extended it in novel ways. For example, Ding and Chang (2008) used a hierarchical feature selection strategy, while Jiang, Li, and Ng (2005) proposed exploiting argument interdependence, that is, the fact that the semantic role of one argument can depend on the semantic roles of other arguments.
Many researchers have tried to extract concepts from texts (Gelfand et al., 1998; Hovy et al., 2009; Villalon and Calvo, 2009; Dinh and Tamine, 2011; Torii et al., 2011). Hovy et al. narrowed the domain of interest to concepts "below" a given seed term. Villalon and Calvo extracted concepts from student essays for concept map mining, which generates a directed relational graph of the concepts extracted from an essay. In specific domains, biological and medical concepts are of greatest interest to researchers (Jonnalagadda et al., 2011). Two relatively new and related approaches are the Concept parser (Rajagopal et al. 2013), part of the SenticNet project (Cambria, Olsher, and Rajagopal 2014), and ConceptNet (Liu and Singh 2004). The former is a tool to decompose unrestricted natural language text into a bag of concepts, which is similar to our work; however, it uses a semantic knowledge base only in the final phase, to express a concept in all its different forms, and does not use any semantic knowledge during decomposition. The latter is a semantic network built on the Open Mind Common Sense (OMCS) knowledge base. As it is a knowledge base, its construction process is quite different from the work described here, which automatically extracts concepts from sentences.

System
The proposed system includes three major components: a syntactic parser, a semantic role labeler, and a concept formulation component. The framework is shown in Figure 1. The input sentence is first transformed into a syntactic parse tree through the syntactic analysis step that almost all automatic semantic role labeling systems require; here the Stanford parser (Klein and Manning 2003) is utilized. Figure 2 shows the system interface: the left part is the English system and the right part is the Chinese system. After users input a sentence, the system automatically parses it, labels the semantic roles, and reports the related concepts.
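The three-stage flow can be sketched as a simple function pipeline. The stage functions below are hypothetical stubs standing in for the Stanford parser, the SRL model, and the template-based concept formulation; only the data flow between the stages reflects the architecture described above.

```python
# Minimal sketch of the three-stage pipeline: parse -> label roles -> form concepts.
# Each stage is a toy stand-in, not the actual component.

def parse(sentence):
    # Stand-in for syntactic parsing: here we merely tokenize.
    return sentence.split()

def label_roles(tree):
    # Stand-in for the semantic role labeler: returns a fixed toy labeling.
    return {"PRED": "approved", "ARG0": "the committee", "ARG1": "the budget"}

def formulate_concepts(roles):
    # Stand-in for template-based concept formulation (one template shown).
    return [f"{roles['ARG0']} {roles['PRED']} {roles['ARG1']}"]

def srl_pipeline(sentence):
    return formulate_concepts(label_roles(parse(sentence)))

print(srl_pipeline("The committee approved the budget"))
```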

Semantic Role Labeling
To develop the SRL system, a total of 33 features, including head-word-related, target-word-related, grammar-related, and semantic-type-related features, are collected from related work (Xue, 2008; Ding and Chang, 2008; Sun and Jurafsky 2004; Gildea and Jurafsky 2002). A baseline maximum entropy system is then developed using these features (Manning and Schutze, 1999). Two data sets, Chinese Treebank 5.0 with Propbank 1.0 and Chinese Treebank 6.0 with Propbank 2.0, are each separated into training and testing sets, which are used to build models that identify and classify semantic labels and to evaluate performance, respectively. As Chinese data was selected for the experiments, the hypernyms of words from E-Hownet (http://ckip.iis.sinica.edu.tw/CKIP/conceptnet.htm), a Chinese word ontology, are utilized as the semantic types of words. When applying the whole system to data in other languages, for major languages it is not difficult to find resources that provide hypernyms; for low-resource languages, these features can simply be ignored. In our experience, doing so reduces the F-score by only 1% to 2%.
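A few of the commonly used feature types can be sketched as follows. The feature names and the dictionary representation of an argument node are illustrative assumptions, not the paper's exact 33-feature set; in a real system such feature dicts would be fed to the maximum entropy classifier.

```python
# Sketch of SRL feature extraction in the spirit of the feature groups named
# above (head-word, target-word, and grammar features). The argument and
# predicate representations here are hypothetical.

def extract_features(arg, predicate):
    """Return a feature dict for one candidate argument node."""
    return {
        "head_word": arg["head"],            # head-word-related feature
        "phrase_type": arg["phrase_type"],   # grammar-related feature
        "predicate": predicate["lemma"],     # target-word-related feature
        "position": "before" if arg["start"] < predicate["index"] else "after",
    }

arg = {"head": "committee", "phrase_type": "NP", "start": 0}
pred = {"lemma": "approve", "index": 2}
feats = extract_features(arg, pred)
print(feats["position"])  # "before"
```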
We further exploit argument interdependence to enhance performance through the dual-layer framework shown in Figure 3. Suppose that for a given predicate P in a sentence, the system has identified three potential arguments A1, A2, and A3. Next, to predict the semantic role labels of those three arguments, a critical observation made by Jiang, Li, and Ng (2005) is that the semantic roles of arguments may depend on each other; this phenomenon is known as argument interdependence. A common way to capture argument interdependence is to adopt sequence labeling, using the features extracted from the arguments around the current argument together with the features of the current one to predict the label of the current argument. For example, while predicting the label of argument A2, features extracted from arguments A1 and A3 are also used. Although window sizes can be used to set the scope of this interdependence, the window-size strategy has a practical limit: the typically large feature set necessitates the use of small window sizes (a window size of [-1,1] is common). However, small window sizes can make it impossible to capture long-distance dependency phenomena.
To overcome the limitations of the window-size strategy, we use all the surrounding arguments' predicted labels (a window size of [-∞,∞]), as opposed to their features, to predict the label of the current node. This also conforms to the rule that when a role is already taken by another argument, the current argument is less likely to take the same role. We implement this idea using the dual-layer classification framework shown in Figure 3.
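The dual-layer data flow can be sketched as follows. Both "classifiers" are toy stand-ins; what the sketch illustrates is only the feature construction: layer 2 sees the layer-1 labels of all sibling arguments, regardless of distance.

```python
# Sketch of the dual-layer idea: layer 1 predicts a label per argument from
# its own features; layer 2 re-predicts each label with the layer-1 labels of
# ALL sibling arguments added as extra features (window size [-inf, inf]).

def layer1_predict(features):
    # Stand-in for the baseline classifier (toy lookup, not a real model).
    return {"NP-subj": "ARG0", "NP-obj": "ARG1", "PP": "ARGM-LOC"}[features]

def layer2_features(own_features, sibling_labels):
    # The current argument's own features plus every sibling's layer-1 label.
    return {"own": own_features, "siblings": tuple(sorted(sibling_labels))}

args = ["NP-subj", "NP-obj", "PP"]          # A1, A2, A3 under one predicate
layer1 = [layer1_predict(f) for f in args]  # first-pass labels

layer2_inputs = []
for i, f in enumerate(args):
    siblings = [lab for j, lab in enumerate(layer1) if j != i]
    layer2_inputs.append(layer2_features(f, siblings))

# For A2, layer 2 sees the predicted labels of A1 and A3.
print(layer2_inputs[1]["siblings"])  # ('ARG0', 'ARGM-LOC')
```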
In layer 1, the baseline system is used to predict the labels of the identified nodes. Then in layer 2, the predicted labels of all surrounding arguments (in this example, A1 and A3), together with the other features of the current node (A2), are used to predict the label of the current node. Note that as this approach has no window-size limitation, the labels of all arguments under the same predicate are taken into account. Experimental results show that this strategy works better than the window-size strategy. Table 1 shows the system accuracies for the single- and dual-layer frameworks. The predicted dual-layer framework utilized the SRL labels predicted in layer 1, while the gold dual-layer framework used the gold SRL labels of the surrounding arguments as features.

To further evaluate the performance of the proposed system and offer comparisons, we applied it to Chinese Treebank 6.0 with Propbank 2.0 in the same way as the CoNLL 2009 SRL-only task data, according to the information provided by the CoNLL organizers. Table 2 shows the results of the proposed system. The CoNLL 2009 task builds dependency-based SRL systems, while the proposed system works on constituent-based parse trees. Also, the settings of the proposed system are not exactly the same as those of the CoNLL 2009 SRL systems. In CoNLL 2009, as noted in Table 2, participants could take part in open or closed challenges, and could choose whether to attempt both the syntactic and semantic labeling tasks (joint task) or only the SRL task. The setting of the proposed system is open challenge, SRL-only, while researchers working on the Chinese data selected only two other settings: closed challenge, SRL-only, and open challenge, joint task. Nevertheless, Table 2 shows that the proposed system outperforms the best CoNLL 2009 systems in terms of precision (86.89 vs. 82.66), recall (80.11 vs. 79.31), and F-score (83.36 vs. 78.50).
Moreover, dependency-based SRL has lately shown advantages over constituent-based SRL; thus we expect even better results when working on dependency-parsed data. We therefore believe the proposed system is comparable or even superior to other systems.

Concept-Formulations
Once the sentence has been annotated semantically, concepts are formulated by concept templates designed according to the Propbank SRL labels. Propbank provides semantic role labels of two types: numbered arguments Arg0 through Arg5, and modifiers with function tags, which give additional information about when, where, or how the event occurred. Tables 4 and 5 list the descriptions of the Propbank arguments utilized for concept template generation. Table 6 then lists the generated concept templates.
As shown in Table 6, the predicate and its arguments are placed in various orders to build a list of concepts according to their semantic roles. These role combinations serve as templates which capture a complete and important piece of information described in one sentence to form a concept. Additionally, the arguments (i.e., the subjects and objects of the predicate) can themselves represent useful concepts, and for this reason the arguments alone are also included in the extracted concepts. For comparison, Table 7 lists the extracted concepts alongside those from the SenticNet concept parser.
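The template mechanism can be sketched as follows. The three templates below are illustrative stand-ins for those in Table 6, assumed for this example; the sketch shows the two behaviors described above: ordered role combinations, plus each argument emitted as a concept on its own.

```python
# Sketch of template-based concept formulation from SRL labels.
# The templates are hypothetical examples, not the actual Table 6 inventory.

TEMPLATES = [("ARG0", "PRED", "ARG1"), ("PRED", "ARG1"), ("ARG0", "PRED")]

def formulate(roles):
    """Instantiate every template whose roles are all present, then add
    each argument alone as its own concept."""
    concepts = []
    for template in TEMPLATES:
        if all(r in roles for r in template):
            concepts.append(" ".join(roles[r] for r in template))
    # Arguments alone can represent useful concepts too.
    concepts.extend(v for k, v in roles.items() if k != "PRED")
    return concepts

roles = {"ARG0": "the committee", "PRED": "approved", "ARG1": "the budget"}
print(formulate(roles))
```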

Conclusion
We have presented a system that decomposes a sentence into a set of concepts through the proposed well-performing semantic role labeling system (http://doraemon.iis.sinica.edu.tw/srl-concept/), which differs from previous related attempts. We demonstrated that this dual-layer semantic role labeling framework, which exploits argument interdependence, performs slightly better than the state of the art, and that it is relatively simple, as no feature selection or engineering is required. We easily generated an English system under the same framework, which showcased the language independence of the approach. In addition, the English system reached an F-score of 0.84, which we consider satisfactory. In the future, we plan to investigate how to further represent and utilize these extracted concepts efficiently in more NLP tasks that call for deep language understanding.