ICL-HD at SemEval-2016 Task 8: Meaning Representation Parsing - Augmenting AMR Parsing with a Preposition Semantic Role Labeling Neural Network



Introduction
Progress in Natural Language Processing has led to a multitude of well-motivated tasks that each capture part of a sentence's meaning, but the resulting meaning description is spread over separate, unconnected annotations. These separate levels of semantic annotation, such as co-reference or named entities, and the lack of simple human-readable corpora that encode whole-sentence meanings led to the Abstract Meaning Representation (AMR) formalism (Banarescu et al. 2013). AMR structures capture sentence meanings as rooted, directed and labeled graphs, where sentences with the same meaning receive the same AMR. These graphs are encoded in a bracketed format and can be visualized in a human-understandable way (see Figure 1). In an AMR, nodes represent concepts and edges represent the semantic relationships that hold between these concepts. Hence, AMRs can be useful for any NLP component that relies on or exploits semantic resources. Application areas include, among others, entity linking (Pan et al. 2015), event detection (Li et al. 2015) and machine translation. An example of an AMR graph is given in Figure 1: the concept RECOMMEND-01 is the root of the graph, and the concept OFFER-01 stands in a semantic relationship to RECOMMEND-01 via the edge ARG1.
We augment the existing AMR parser CAMR (Wang, Xue, and S. Pradhan 2015a) with a preposition semantic role labeling (prepSRL) neural network with the aim of improving the accuracy of AMR graph creation. Prepositions, in conjunction with their arguments, make a crucial contribution to the meaning of sentences and are therefore an intuitive supplement to AMR parsing. For example, Figure 2 shows how the meaning of the preposition in is involved in the creation of the AMR edge :LOCATION: the preposition in expresses the agency's spatial location and therefore triggers the identically named AMR edge :LOCATION. Prepositional semantics is a knowledge resource that has not yet been exploited for AMR parsing. Moreover, CAMR has problems correctly creating AMR edges that are triggered by prepositional relations.
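To make the graph structure concrete, the following minimal Python sketch encodes only the fragment described above (the root RECOMMEND-01 and its ARG1 edge to OFFER-01) and prints it in a bracketed notation. The class name `AMRGraph`, its methods and the variable names `r` and `o` are illustrative assumptions and are not part of any AMR toolkit.

```python
from dataclasses import dataclass, field


@dataclass
class AMRGraph:
    """Toy rooted, directed, labeled graph (hypothetical helper class)."""
    root: str
    concepts: dict = field(default_factory=dict)  # variable -> concept label
    edges: list = field(default_factory=list)     # (head, relation, dependent)

    def add_edge(self, head, relation, dependent):
        self.edges.append((head, relation, dependent))

    def to_bracketed(self, var=None, seen=None):
        """Render the graph in a bracketed, PENMAN-like notation."""
        var = self.root if var is None else var
        seen = set() if seen is None else seen
        if var in seen:
            return var  # re-entrant node: repeat only the variable name
        seen.add(var)
        parts = [f"{var} / {self.concepts[var]}"]
        for head, relation, dependent in self.edges:
            if head == var:
                parts.append(f":{relation} {self.to_bracketed(dependent, seen)}")
        return "(" + " ".join(parts) + ")"


# Only the two concepts named in the text are modeled here.
graph = AMRGraph(root="r", concepts={"r": "recommend-01", "o": "offer-01"})
graph.add_edge("r", "ARG1", "o")
print(graph.to_bracketed())
```

A node that is reached a second time is printed as a bare variable, which is how the bracketed format can express graphs rather than only trees.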
Figure 1: AMR graph for the sentence "We should offer earthquake workers our full understanding."

Related Work
The first attempt to automatically generate AMR structures from sentences was the work of Flanigan et al. (2014). They used a graph-based structured prediction algorithm with two stages: the first stage is a semi-Markov model that identifies concepts; the second stage connects these concepts by finding the maximum spanning connected subgraph of a graph in which all possible relations between concepts are realized. They achieve an F-score of 0.58 on the LDC2013E117 corpus. Werling, Angeli, and Manning (2015) improve on the AMR parsing approach of Flanigan et al. (2014) by supporting the critical task of concept identification with a predefined set of actions for concept subgraph generation, which are invoked after a statistical classification step. Besides graph-based approaches, there are other strategies for AMR parsing. Peng, Song, and Gildea (2015) learn synchronous hyperedge replacement grammar rules from string-graph pairs. An Earley algorithm with cube pruning then performs string-to-AMR parsing with these rules. Pust et al. (2015) treat English and AMR as a language pair and use a machine translation approach to parse AMRs from sentences. They convert AMRs into a grammar of string-to-tree rules that can be handled by syntax-based machine translation formalisms and use these rules with a bottom-up chart decoder to parse AMRs with given local features and a language model. Wang, Xue, and S. Pradhan (2015a) use a transition-based system that transforms dependency graphs into AMR structures by invoking specific actions at each state reached while traversing the dependency tree. As can be seen, there are many different points of view on AMR parsing.
Figure 3: CAMR parse for the sentence "There is nothing sad about old shells."

Motivation
The motivation for our system design comes from an error analysis of the transition-based AMR parser CAMR of Wang, Xue, and S. Pradhan (2015a). It turns out that the parser has difficulties in correctly identifying AMR relations that involve prepositional semantics. Therefore, we have chosen to support CAMR with preposition semantic role labeling (prepSRL) in order to improve AMR parsing results. Figure 3 shows a CAMR parse error: (b) should indicate a :TOPIC edge label for the edge between the concepts SAD and SHELL. This relation is semantically expressed by the preposition about. As can be seen in

Baseline System
We used the AMR parser CAMR of Wang, Xue, and S. Pradhan (2015a) as a starting point for our idea of supporting AMR parsing with prepSRL. It converts dependency trees into AMR graphs with a transition-based technique by invoking tree-transforming actions at the transition states it reaches. In the training procedure, the tokens of the input sentence are first aligned with the nodes of its gold AMR graph using the JAMR aligner (Flanigan et al. 2014). Such aligned AMR graphs are represented as span graphs that store token spans for AMR concept nodes. With these span graphs, a greedy transition-based mechanism learns to rewrite the dependency trees into AMR graphs. In order to learn these transformations, a transition system processes the nodes of the input dependency graphs in a bottom-up, left-to-right fashion. At each configuration it reaches, it decides which action to perform next in transforming the dependency graph into an AMR span graph.
Configurations are defined as a tuple of buffers holding unprocessed nodes and unprocessed edges, together with the partial span graph parse for the current input sentence. While traversing the dependency tree, an averaged perceptron algorithm decides which actions to take by computing scores for all possible actions given specific features and a weight vector. At test time, the highest-scoring action is always chosen before moving on to the next state. During training, the algorithm updates the weight vector if it has chosen the wrong action and then continues parsing with the correct one. The core of the system is the set of actions that the algorithm can take at each state. Figure 4 shows an overview and a short description of the eight possible action types. Actions alter the dependency tree by deleting or inserting nodes, merging two nodes into one, assigning relation labels and creating or modifying arcs. In this way, the averaged perceptron can learn to perform the right transformations to end up with an AMR span graph.
Figure 4: Overview of the action types.
• NEXT-EDGE-lr: assigns a relation label to the current edge and steps on to the next edge
• SWAP-lr: swaps the dependency relation between nodes (the head becomes the dependent and vice versa)
• REATTACH_k-lr: removes an arc, reattaches the former dependent to another node and assigns a label to the new arc
• REPLACE HEAD: replaces a head with its dependent
• REENTRANCE_k-lr: links a node to another node in the subgraph and can therefore convert trees into graphs
• MERGE: merges two nodes into one node
• NEXT-NODE-lc: assigns a concept label to the current node and proceeds to the next element in the buffer
• DELETE-NODE: deletes a node and all its connections
• INFER-lc: inserts a concept node between the current node and its parent
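The action selection described above can be sketched as a generic averaged perceptron. This is a minimal illustration under stated assumptions, not CAMR's actual implementation: the class name, the feature strings and the two action labels in the toy data are hypothetical, and the lazy averaging follows a common textbook convention.

```python
from collections import defaultdict


class AveragedPerceptron:
    """Generic averaged perceptron over (feature, action) weights (sketch only)."""

    def __init__(self, actions):
        self.actions = list(actions)
        self.weights = defaultdict(float)  # (feature, action) -> current weight
        self._totals = defaultdict(float)  # accumulated weight * steps held
        self._stamps = defaultdict(int)    # step at which a weight last changed
        self.step = 0

    def score(self, features, action):
        return sum(self.weights.get((f, action), 0.0) for f in features)

    def predict(self, features):
        # Test-time behaviour: always take the highest-scoring action.
        return max(self.actions, key=lambda a: self.score(features, a))

    def update(self, features, gold_action):
        # Training: change the weights only if the predicted action is wrong.
        self.step += 1
        guess = self.predict(features)
        if guess != gold_action:
            for f in features:
                self._bump((f, gold_action), +1.0)
                self._bump((f, guess), -1.0)
        return guess

    def _bump(self, key, delta):
        self._totals[key] += (self.step - self._stamps[key]) * self.weights[key]
        self._stamps[key] = self.step
        self.weights[key] += delta

    def average(self):
        # Replace each weight by its average over all training steps.
        for key, weight in list(self.weights.items()):
            total = self._totals[key] + (self.step - self._stamps[key]) * weight
            self.weights[key] = total / max(self.step, 1)


# Hypothetical toy data: feature strings and action labels are made up.
ACTIONS = ["NEXT-EDGE-ARG0", "NEXT-EDGE-location"]
DATA = [
    (["dep=prep", "lemma=in"], "NEXT-EDGE-location"),
    (["dep=nsubj"], "NEXT-EDGE-ARG0"),
]

model = AveragedPerceptron(ACTIONS)
for _ in range(3):
    for features, gold in DATA:
        model.update(features, gold)
model.average()
print(model.predict(["dep=prep", "lemma=in"]))
```

In a transition parser, `update` would be called once per parser state, and parsing would continue with the gold action regardless of the prediction, as described in the paragraph above.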

Semantic Role Labeling Features
Wang, Xue, and S. Pradhan (2015b) successfully improve their base system described in Section 3.1 by adding SRL features to their model.⁵ We also included these features and added our prepSRL information in a similar way. The first SRL feature encodes an action's compatibility with the frameset predicted by the SRL system. For each action that predicts a concept label (NEXT-NODE-lc), the predicted SRL frameset is compared to the candidate concept labels. If both match, the value of the feature is set to true. The system is therefore biased towards choosing the predicted SRL frameset as the concept label.⁶ The second feature encodes predicted SRL argumenthood for an action's current edge. For each action that predicts edge labels, the parser has access to the information whether the current action's dependent is predicted as an argument by the semantic role labeler. Hence, the system will favor edges that are congruent with the semantic role labeler's edges.
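The two SRL features can be illustrated with a short sketch. The function name, the feature names and the set of edge-labeling actions (taken here to be the actions that carry an `lr` suffix in Figure 4) are assumptions made for illustration; the actual feature templates of CAMR may differ.

```python
# Assumption: the actions that assign an edge label are those with an "lr" suffix.
EDGE_LABELING_ACTIONS = {"NEXT-EDGE", "SWAP", "REATTACH", "REENTRANCE"}


def srl_features(action, candidate_concept=None, predicted_frameset=None,
                 dependent_is_srl_argument=False):
    """Return the SRL-based binary features that apply to one candidate action."""
    features = {}
    if action == "NEXT-NODE":
        # Feature 1: does the candidate concept equal the predicted frameset?
        features["frameset_match"] = (
            predicted_frameset is not None
            and candidate_concept == predicted_frameset
        )
    if action in EDGE_LABELING_ACTIONS:
        # Feature 2: did the semantic role labeler predict the dependent
        # of the current edge as an argument?
        features["dependent_is_srl_argument"] = bool(dependent_is_srl_argument)
    return features


print(srl_features("NEXT-NODE", candidate_concept="offer-01",
                   predicted_frameset="offer-01"))
print(srl_features("NEXT-EDGE", dependent_is_srl_argument=True))
```

Because both features are binary, a linear model such as the averaged perceptron can learn a positive weight for agreement with the semantic role labeler, which produces the bias described above.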

Preposition Semantic Role Labels
The prepSRL information consists of an attachment according to a dependency parse and a semantic role label for the prepositional phrase head, predicted by a neural network. The neural network is trained on data annotated with semantic role labels. A simple feed-forward neural network with one hidden layer is trained in a softmax regression framework on the role labels of the Penn Treebank (PTB), the SemEval 2007 Task 6 corpus and the DEFT corpus.⁷ ⁸ Additionally, a 'multi-task' neural network was trained on all three corpora simultaneously. The network architecture is sketched in Figure 5. All three prediction models share the same two hidden layers. As the labels and the number of labels vary, softmax regression is performed for each corpus separately. The neural networks are fed a combination of word embeddings⁹ and a subset of the hand-crafted features. These features include a binary indicator for token capitalization and binary vector representations of both the token's POS tag and its supersense label according to WordNet. For each sample, these features are extracted for the following tokens: the preposition token, the previous token, the preceding verb, the preceding verb/adjective/noun, the dependency head, the dependency child and a heuristic child, for which we choose the ensuing token. All corpora are parsed with the ClearNLP parser to obtain POS tags and tokenization.
⁵ They used the ASSERT SRL system described in (S. S. Pradhan et al. 2004).
⁶ Remember that concept labels in AMR can be PropBank framesets.
⁷ Used version: LDC2015E86: DEFT Phase 2 AMR Annotation R1.
⁸ All neural networks are trained with 200 hidden nodes per layer and a learning rate of 0.01 in a batch gradient descent setting. The weights are randomly initialized in ± 6 / (input dim + output dim). Tanh is used as the non-linear activation function.
⁹ The word embeddings are taken from (Pennington, Socher, and Manning 2014). For unknown words, a null vector is used.
For compatibility with the AMR parsing task, the dependency trees were extracted from parses produced by the BLLIP parser. Corpus sizes and accuracy measures for the simple neural network are given in Table 2. The neural network predicts a semantic role label for every prepositional phrase head. Comparing the different models by accuracy on these labels is difficult, as the target spaces and the number of available samples differ. A list of valid target labels for each corpus is given in Table 3. The SemEval corpus comes with a total of 155 labels, where each preposition has a number of senses. These sense labels are reduced in number to create more meaningful target labels for the prepSRL task. The mapping scheme of Srikumar and Roth (2013) [...] of 82 labels. We remove all labels with fewer than 200 samples from the corpus to ensure training quality. Given this prepSRL system, the AMR parsing results are expected to improve in the following way. In the AMR of Figure 6, the concept PLACE is currently the ARG3 of CREATE-01, but it should be the :LOCATION of CONGEST-01. Because the prepSRL feature can influence the edge creation and labeling actions of the transition system, the AMR parser can choose the correct actions.
Figure 6: AMR for the sentence "They are creating traffic congestion in new places."
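The forward pass of the simple network (one tanh hidden layer followed by a softmax over role labels) can be sketched in plain Python as follows. This is an illustration, not the authors' code: all function names are hypothetical, the biases are assumed to start at zero, and the initialization bound is taken literally from the footnote as printed, so it may differ from the original implementation. The demo uses tiny dimensions instead of the 200 hidden nodes mentioned in the text.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    # Bound copied literally from the footnote as printed (assumption).
    bound = 6.0 / (n_in + n_out)
    return [[rng.uniform(-bound, bound) for _ in range(n_in)]
            for _ in range(n_out)]


def affine(weights, bias, x):
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def make_params(n_in, n_hidden, n_labels, seed=0):
    rng = random.Random(seed)
    return {
        "W1": init_layer(n_in, n_hidden, rng), "b1": [0.0] * n_hidden,
        "W2": init_layer(n_hidden, n_labels, rng), "b2": [0.0] * n_labels,
    }


def forward(x, params):
    """Return a probability distribution over role labels for input vector x."""
    hidden = [math.tanh(v) for v in affine(params["W1"], params["b1"], x)]
    return softmax(affine(params["W2"], params["b2"], hidden))


# Toy input standing in for concatenated embeddings and binary features.
params = make_params(n_in=4, n_hidden=3, n_labels=2, seed=1)
probs = forward([0.5, -1.0, 0.0, 1.0], params)
print(probs)
```

For the multi-task variant described above, one would share the hidden-layer parameters and keep a separate output layer (`W2`, `b2`) per corpus, since each corpus has its own label set.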
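The frequency cut-off mentioned above (dropping labels with fewer than 200 samples) amounts to a simple filtering step. The following sketch assumes that samples are stored as (features, label) pairs; the function name and the toy data are hypothetical.

```python
from collections import Counter


def drop_rare_labels(samples, min_count=200):
    """Keep only samples whose label occurs at least min_count times."""
    counts = Counter(label for _, label in samples)
    return [(x, label) for x, label in samples if counts[label] >= min_count]


# Toy data with a small threshold so that the effect is visible.
toy = [("f1", "Location"), ("f2", "Location"), ("f3", "Location"),
       ("f4", "Topic")]
kept = drop_rare_labels(toy, min_count=2)
print(sorted({label for _, label in kept}))
```

Filtering by label frequency shrinks the target space of the softmax layer, so the label inventory must be rebuilt after this step.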

Experimental Setup
We first preprocessed the 16,831 sentences of the DEFT corpus training section (Knight et al. 2014) that we used for training CAMR with the prepSRL features. Preprocessing information for the AMR parser includes lemmas, POS tags, named entities and dependency parses. In addition, we preprocessed the training sentences with the ClearNLP toolkit for the training of the neural network. We used the tokenization and POS tagging components of ClearNLP and, for compatibility reasons, replaced the generated dependencies with the dependencies generated by the BLLIP parser. After preprocessing, the alignments between the AMR graphs and their sentences were created with the JAMR aligner. ASSERT-generated SRL files were provided to us by Sameer Pradhan for the training and test inputs, enabling us to run CAMR with the SRL features. Separately, the neural network for prepSRL is trained with PTB-style preposition labels. We parsed our training corpora with the resulting model and generated the feature files for the prepSRL information. We trained CAMR in four different feature settings, which are shown in Figure 8. The generated models were tested on the DEFT corpus test set, which contains 1371 sentences.

Results
We evaluated our approach of augmenting AMR parsing with a prepSRL system using Smatch, which is the standard evaluation measure for AMR parsing to date. Smatch uses semantic overlap between AMR parses to measure parsing accuracy. Results of the evaluation are given in Table 4. They reveal that the prepSRL features have a slightly negative influence on the parsing accuracy of CAMR. The Smatch F-score remains the same over all trained models, but the recall is reduced by 1% when adding the prepSRL features. The model with prepSRL achieves a Smatch score of 0.60 on the SemEval-2016 Task 8 test data. One possible explanation for the prepSRL results could be the ambiguity of prepositions:
(1) Establishing Models in Industrial Innovation.
(2) There is a travel agency in Sydney.
In (1), in does not indicate an AMR :LOCATION relation, in contrast to its occurrence in (2). At the moment, our system cannot disambiguate between the two occurrences of in with the features used.
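The overlap-based scoring behind the reported precision, recall and F-score can be illustrated with a simplified sketch. Note that this is not the Smatch algorithm itself: Smatch additionally searches for the best mapping between the variables of the two graphs, whereas the sketch below assumes that nodes are already identified by their concept labels. The triples are illustrative and only partly taken from the description of Figure 6; the ARG1 edge is an assumption.

```python
def triple_scores(predicted, gold):
    """Precision, recall and F-score over sets of (head, relation, dependent)."""
    predicted, gold = set(predicted), set(gold)
    matched = len(predicted & gold)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Illustrative triples: the parser attaches PLACE as ARG3 of CREATE-01,
# while the gold graph attaches it as :location of CONGEST-01.
gold_triples = [("create-01", "ARG1", "congest-01"),
                ("congest-01", "location", "place")]
predicted_triples = [("create-01", "ARG1", "congest-01"),
                     ("create-01", "ARG3", "place")]

precision, recall, f_score = triple_scores(predicted_triples, gold_triples)
print(precision, recall, f_score)
```

A single misattached prepositional edge costs one matching triple on both the precision and the recall side, which is why such errors are directly visible in the scores.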

Error Analysis
A quantitative error analysis of our parser's output is shown in Table 5. Compared with the previous results in Table 1, the :LOCATION relation shows a minor improvement in precision and recall, whereas all other relations either show no difference or are parsed worse than before.

Conclusion
We extended the AMR parser CAMR (Wang, Xue, and S. Pradhan 2015a) with a neural network for prepSRL but did not achieve improved AMR results with this method. In fact, the combination with prepSRL slightly reduced the recall of the system. This could be because our prepSRL neural network generates labels for all preposition occurrences without disambiguating ambiguous prepositions. Future work has to find a better way to integrate prepSRL information into the architecture of CAMR. One possibility could be to refine the neural network so that only prepositions that are likely to produce an AMR relation receive an SRL label. Despite our results, we think that the inclusion of prepositional semantics could improve AMR parsing results if used in an appropriate way.