Graph-Based Meaning Representations: Design and Processing

This tutorial is on representing and processing sentence meaning in the form of labeled directed graphs. The tutorial will (a) briefly review relevant background in formal and linguistic semantics; (b) semi-formally define a unified abstract view on different flavors of semantic graphs and associated terminology; (c) survey common frameworks for graph-based meaning representation and available graph banks; and (d) offer a technical overview of a representative selection of different parsing approaches.


Tutorial Content and Relevance
All things semantic are receiving heightened attention in recent years. Despite remarkable advances in vector-based (continuous, dense, and distributed) encodings of meaning, 'classic' (hierarchically structured and discrete) semantic representations will continue to play an important role in 'making sense' of natural language. While parsing has long been dominated by tree-structured target representations, there is now growing interest in general graphs as more expressive and arguably more adequate target structures for sentence-level grammatical analysis beyond surface syntax, and in particular for the representation of semantic structure.
Today, the landscape of meaning representation approaches, annotated graph banks, and parsing techniques into these structures is complex and diverse. Graph-based semantic parsing has been a task in almost every Semantic Evaluation (SemEval) exercise since 2014. These shared tasks were based on a variety of different corpora with graph-based meaning annotations (graph banks), which differ both in their formal properties and in the facets of meaning they aim to represent. This tutorial aims to clarify this landscape for our research community by providing a unifying view on these graph banks and their associated parsing problems, while working out similarities and differences between common frameworks and techniques.
Based on common-sense linguistic and formal dimensions established in its first part, the tutorial will provide a coherent, systematized overview of this field. Participants will be enabled to identify genuine content differences between frameworks as well as to tease apart more superficial variation, for example in terminology or packaging. Furthermore, major current processing techniques for semantic graphs will be reviewed against a high-level inventory of families of approaches. This part of the tutorial will emphasize reflections on interdependencies with specific graph flavors or frameworks, on worst-case and typical time and space complexity, as well as on what guarantees (if any) are obtained on the well-formedness and correctness of output structures.
Kate and Wong (2010) suggest a definition of semantic parsing as "the task of mapping natural language sentences into complete formal meaning representations which a computer can execute for some domain-specific application." This view brings along a tacit expectation to map (more or less) directly from a linguistic surface form to an actionable encoding of its intended meaning, e.g. a database query or even a programming language. In this tutorial, we embrace a broader perspective on semantic parsing as it has come to be viewed commonly in recent years. We will review graph-based meaning representations that aim to be application- and domain-independent, i.e. seek to provide a reusable intermediate layer of interpretation that captures, in suitably abstract form, relevant constraints that the linguistic signal imposes on interpretation.
Tutorial slides and additional materials are available at the following address: https://github.com/cfmrp/tutorial

Semantic Graph Banks
In the first part of the tutorial, we will give a systematic overview of the available semantic graph banks. On the one hand, we will distinguish graph banks with respect to the facets of natural language meaning they aim to represent. For instance, some graph banks focus on predicate-argument structure, perhaps with some extensions for polarity or tense, whereas others capture (some) scopal phenomena. Furthermore, while the graphs in most graph banks do not have a precisely defined model theory in the sense of classical linguistic semantics, there are still underlying intuitions about what the nodes of the graphs mean (individual entities and eventualities in the world vs. more abstract objects to which statements about scope and presupposition can attach). We will discuss the different intuitions that underlie different graph banks.
On the other hand, we will follow Kuhlmann and Oepen (2016) in classifying graph banks with respect to the relationship they assume between the tokens of the sentence and the nodes of the graph (called anchoring of graph fragments onto input sub-strings). We will distinguish three flavors of semantic graphs, which by degree of anchoring we will call type (0) to type (2). While we use 'flavor' to refer to formally defined sub-classes of semantic graphs, we will reserve the term 'framework' for a specific linguistic approach to graph-based meaning representation (typically cast in a particular graph flavor, of course).
Type (0) The strongest form of anchoring is obtained in bi-lexical dependency graphs, where graph nodes injectively correspond to surface lexical units (tokens). In such graphs, each node is directly linked to a specific token (conversely, there may be semantically empty tokens), and the nodes inherit the linear order of their corresponding tokens. This flavor of semantic graphs was popularized in part through a series of Semantic Dependency Parsing (SDP) tasks at the SemEval exercises in 2014-16 (Oepen et al., 2014, 2015; Che et al., 2016). Prominent linguistic frameworks instantiating this graph flavor include CCG word-word dependencies (CCD; Hockenmaier and Steedman, 2007), Enju Predicate-Argument Structures (PAS; Miyao and Tsujii, 2008), DELPH-IN MRS Bi-Lexical Dependencies (DM; Ivanova et al., 2012), and Prague Semantic Dependencies (PSD; a simplification of the tectogrammatical structures of Hajič et al., 2012).
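The defining property of this flavor can be made concrete in a few lines of code. The following minimal sketch (not any framework's actual data format; the example analysis and its DM-inspired edge labels are illustrative) shows a type (0) graph whose nodes simply are token positions, so that semantically empty tokens fall out as positions touched by no edge:

```python
from dataclasses import dataclass, field

@dataclass
class BilexicalGraph:
    """A type (0) semantic graph: nodes are (a subset of) the tokens."""
    tokens: list
    # edges as (head token index, dependent token index, label) triples
    edges: list = field(default_factory=list)

    def add_edge(self, head, dep, label):
        self.edges.append((head, dep, label))

    def semantically_empty(self):
        """Token positions that participate in no edge (e.g. punctuation)."""
        used = {i for h, d, _ in self.edges for i in (h, d)}
        return set(range(len(self.tokens))) - used

# "The dog barked ." with a hypothetical DM-style analysis
g = BilexicalGraph(["The", "dog", "barked", "."])
g.add_edge(2, 1, "ARG1")   # barked has "dog" as its first argument
g.add_edge(0, 1, "BV")     # the determiner binds the variable of "dog"
print(sorted(g.semantically_empty()))  # [3]: the period carries no meaning
```

Because nodes are identified with token positions, the linear order of nodes comes for free, which is what makes this flavor amenable to tree-parsing-style algorithms.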
Type (1) A more general form of anchored semantic graphs is characterized by relaxing the correspondence relations between nodes and tokens, while still explicitly annotating the correspondence between nodes and parts of the sentence. Some graph banks of this flavor align nodes with arbitrary parts of the sentence, including sub-token or multi-token sequences, which affords more flexibility in the representation of meaning contributed by, for example, (derivational) affixes or phrasal constructions. Some further allow multiple nodes to correspond to overlapping spans, enabling lexical decomposition (e.g. of causatives or comparatives). Frameworks instantiating this flavor of semantic graphs include Universal Conceptual Cognitive Annotation (UCCA; Abend and Rappoport, 2013; featured in a SemEval 2019 task) and two variants of 'reducing' the underspecified logical forms of Flickinger (2000) and Copestake et al. (2005) into directed graphs, viz. Elementary Dependency Structures (EDS; Oepen and Lønning, 2006) and Dependency Minimal Recursion Semantics (DMRS; Copestake, 2009). All three frameworks serve as target representations in recent parsing research (e.g. Buys and Blunsom, 2017; Chen et al., 2018; Hershcovich et al., 2018).
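To illustrate the extra freedom of type (1) anchoring, the sketch below (an invented toy analysis, not the annotation scheme of any of the frameworks above) anchors nodes to character spans rather than whole tokens: a derivational prefix gets its own node, and two nodes share the span of "singer" to decompose it lexically:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    concept: str
    anchor: tuple  # (start, end) character span in the sentence

sentence = "The unhappy singer left"
# Hypothetical type (1) analysis with sub-token and overlapping anchors:
nodes = [
    Node("neg", (4, 6)),       # the prefix "un-" contributes negation ...
    Node("happy", (6, 11)),    # ... anchored separately from "happy"
    Node("sing", (12, 18)),    # lexical decomposition of "singer": ...
    Node("person", (12, 18)),  # ... two nodes over the same span
    Node("leave", (19, 23)),
]

def anchored_text(n):
    start, end = n.anchor
    return sentence[start:end]

print([anchored_text(n) for n in nodes])
# ['un', 'happy', 'singer', 'singer', 'left']
```

Note that anchoring here is many-to-one and need not respect token boundaries, which is exactly what type (0) graphs rule out.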
Type (2) Finally, our framework review will include Abstract Meaning Representation (AMR; Banarescu et al., 2013), which in our hierarchy of graph flavors is considered unanchored, in that the correspondence between nodes and tokens is not explicitly annotated. The AMR framework deliberately backgrounds notions of compositionality and derivation. At the same time, AMR frequently invokes lexical decomposition and represents some implicitly expressed elements of meaning, such that AMR graphs quite generally appear to 'abstract' furthest from the surface signal. Since the first general release of an AMR graph bank in 2014, the framework has provided a popular target for semantic parsing and has been the subject of two consecutive tasks at SemEval 2016 and 2017 (May, 2016; May and Priyadarshi, 2017).

Processing Semantic Graphs
The creation of large-scale, high-quality semantic graph banks has driven research on semantic parsing, where a system is trained to map from natural-language sentences to graphs. There is now a dizzying array of different semantic parsing algorithms, and it is a challenge to keep track of their respective strengths and weaknesses. Different parsing approaches are, of course, more or less effective for graph banks of different flavors (and, at times, even specific frameworks). We will discuss these interactions in the tutorial and organize the research landscape on graph-based semantic parsing along three dimensions.
Decoding strategy Semantic parsers differ with respect to the type of algorithm that is used to compute the graph. These include factorization-based methods, which factorize the score of a graph into parts for smaller substructures and can then apply dynamic programming to search for the best graph, as well as transition-based methods, which learn to make individual parsing decisions for each token in the sentence. Some neural techniques also make use of an encoder-decoder architecture, as in neural machine translation.
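As a minimal sketch of the factorization-based family (with a toy hand-written scoring function standing in for a trained model): in an arc-factored model for unrestricted dependency graphs, the score of a graph is the sum of independent edge scores, so, absent structural constraints such as treeness, the highest-scoring graph simply keeps every edge whose best label scores positively:

```python
import itertools

def decode_arc_factored(tokens, score):
    """Arc-factored decoding for unrestricted bi-lexical graphs.

    With no structural constraints (unlike trees), the argmax graph
    decomposes edge-wise: keep each candidate edge whose best label
    has positive score.  Runs in O(n^2) scoring calls.
    """
    edges = []
    for h, d in itertools.permutations(range(len(tokens)), 2):
        label, s = max(score(h, d).items(), key=lambda kv: kv[1])
        if s > 0:
            edges.append((h, d, label))
    return edges

# A toy scoring function standing in for a trained model: it maps a
# (head, dependent) pair to label scores.
def toy_score(h, d):
    table = {(2, 1): {"ARG1": 1.5}, (0, 1): {"BV": 0.7}}
    return table.get((h, d), {"none": -1.0})

print(decode_arc_factored(["The", "dog", "barked"], toy_score))
# [(0, 1, 'BV'), (2, 1, 'ARG1')]
```

Imposing well-formedness constraints (connectedness, restricted node degrees, or treeness of an underlying derivation) breaks this edge-wise independence and is precisely what motivates the dynamic-programming and transition-based alternatives discussed in the tutorial.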
Compositionality Semantic parsers also differ with respect to whether they assume that the graph-based semantic representations are constructed compositionally. Some approaches follow standard linguistic practice in assuming that the graphs have a latent compositional structure and try to reconstruct it explicitly or implicitly during parsing. Others are more agnostic and simply predict the edges of the target graph without regard to such linguistic assumptions.
Structural information Finally, semantic parsers differ with respect to how structural information is represented. Some model the target graph directly, whereas others use probability models that score a tree which evaluates to the target graph (e.g. a syntactic derivation tree or a term over a graph algebra). This choice interacts with the compositionality dimension, in that tree-based models for graph parsing go together well with compositional models.
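The idea of a tree that "evaluates to" a graph can be sketched very simply. Below, a derivation tree is a nested tuple whose leaves denote small graph fragments and whose inner nodes apply an algebra operation; evaluating bottom-up yields the target graph. Real graph algebras (such as the one used in compositional AMR parsing) have more refined operations than the plain union used here, which is only a stand-in:

```python
def evaluate(term):
    """Evaluate a derivation tree (nested tuples) to a set of labeled edges."""
    op, *children = term
    if op == "leaf":
        (edges,) = children
        return set(edges)
    if op == "combine":  # graph union: a toy stand-in for a real algebra op
        result = set()
        for child in children:
            result |= evaluate(child)
        return result
    raise ValueError("unknown operation: %r" % op)

# A tiny derivation tree for "The dog barked" (fragments are invented):
tree = ("combine",
        ("leaf", [("barked", "ARG1", "dog")]),
        ("leaf", [("the", "BV", "dog")]))
print(sorted(evaluate(tree)))
# [('barked', 'ARG1', 'dog'), ('the', 'BV', 'dog')]
```

A probability model over such trees scores derivations rather than graphs directly, which is why this design pairs naturally with compositional approaches.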

Tutorial Structure
We have organized the content of the tutorial into the following blocks, which add up to a total of three hours of presentation. The references below are illustrative of the content in each block; in the tutorial itself, we will present one or two approaches per block in detail while treating others in less depth.
(1) Linguistic Foundations: Layers of Sentence Meaning
• Availability of training and evaluation data; shared tasks; state-of-the-art empirical results.

Content Breadth
Each of us has contributed research to the design of meaning representation frameworks, the creation of semantic graph banks, and/or the development of meaning representation parsing systems. Nonetheless, both the design and the processing of graph banks are highly active research areas, and our own work will not represent more than a fifth of the total tutorial content.

Participant Background
An understanding of basic parsing techniques (chart-based and transition-based) and a familiarity with basic neural techniques (feed-forward and recurrent networks, encoder-decoder) will be useful.

Presenters
The tutorial will be presented jointly by three experts with partly overlapping and partly complementary expertise. Each will contribute about one third of the content, and each will be involved in multiple parts of the tutorial.
Alexander Koller Department of Language Science and Technology, Saarland University, Germany koller@coli.uni-saarland.de http://www.coli.uni-saarland.de/koller

Alexander Koller received his PhD in 2004, with a thesis on underspecified processing of semantic ambiguities using graph-based representations. His research interests span a variety of topics including parsing, generation, the expressive capacity of representation formalisms for natural language, and semantics. Within semantics, he has published extensively on semantic parsing using both grammar-based and neural approaches. His most recent work in this field (Groschwitz et al., 2018) achieved state-of-the-art semantic parsing accuracy for AMR using neural supertagging and dependency parsing in the context of a compositional model.

Her main research topic is symbolic and statistical parsing, with a special focus on parsing into semantic graphs of various flavors. She has repeatedly chaired teams that have submitted top-performing systems to recent SemEval shared tasks and has continuously advanced both the state of the art in semantic parsing in terms of empirical results and the understanding of how design decisions in different schools of linguistic graph representations impact formal and algorithmic complexity.