Parsing Weighted Order-Preserving Hyperedge Replacement Grammars

We introduce a weighted extension of the recently proposed notion of order-preserving hyperedge-replacement grammars and prove that the weight of a graph according to such a weighted graph grammar can be computed uniformly in quadratic time (under assumptions made precise in the paper).


Introduction
The hyperedge-replacement grammar (HRG) is a well-studied formalism for describing graph languages; see, e.g., (Bauderon and Courcelle, 1987; Habel and Kreowski, 1987; Habel, 1992; Drewes et al., 1997). As argued by Jones et al. (2012), Koller (2015), and Groschwitz et al. (2015), it is also a promising candidate for modelling semantic representations of natural language such as Abstract Meaning Representation (AMR; see Banarescu et al. (2013)). However, HRGs overshoot the mark in that parsing with respect to them is computationally too expensive. Further, HRGs can express intricate structural properties whose complexity is far beyond what seems to be required to describe practically relevant languages of semantic graphs such as AMR. For example, as argued by Chiang et al. (2018), it suffices if the path languages of such graph languages are regular. In contrast, HRGs easily give rise to path languages that are not even context-free. Thus, from both perspectives, less powerful special cases should be sought if this helps to cut down on parsing complexity. Recently, such a restriction, called order preservation, was proposed and studied in (Björklund et al., 2016; Björklund et al., 2017; Björklund et al., 2018).
The present article builds upon the order-preserving HRGs (OPHGs) of Björklund et al. (2018), where it was shown that parsing for OPHGs is efficient, requiring polynomial time even in the uniform case, i.e., when the grammar is considered to be part of the input. Here, we define a weighted version of OPHGs and extend the results of Björklund et al. (2018) to show that, when the weights are taken from a commutative semiring, we can efficiently compute the weight assigned by an OPHG to any input graph. This is an important feature since applications such as semantic modelling require ways to quantify the well-formedness of a generated graph.
While providing a notion of grammars with weights may appear to be a simple task, as one only has to assign weights to the rules, doing so in a meaningful way for unrestricted HRGs is actually not simple at all. The reason is that the weights of different derivation trees generating the same graph should be summed up to obtain the weight of the graph. However, if a right-hand side of a rule has nontrivial automorphisms that interchange two or more nonterminal hyperedges, one gets spuriously distinct derivation trees that should intuitively be considered identical. At the very least, this complicates uniform parsing, as it requires preprocessing the rules to detect the automorphisms of their right-hand sides, a task for which no polynomial-time algorithm is known.
In OPHGs, only the right-hand sides of so-called duplication rules have nontrivial automorphisms, and those do not require preprocessing. These rules correspond to associative and commutative operations, which we propose to take special care of in the computation of weights by using a type of reduced derivation trees introduced for the same purpose by Courcelle (1991a); see also Courcelle and Engelfriet (2012). In these derivation trees, some nodes have a set of children, while others have them ordered in a list. After this, we show how weights can efficiently be computed, and prove the correctness of the algorithm.
Related work. Another type of restricted HRGs for semantic modelling was proposed by Chiang et al. (2013), together with a parsing algorithm and a detailed complexity analysis. The complexity is, however, exponential even in the non-uniform case. In particular, it is exponential in the maximum degree of nodes in the input graph. The same holds for the parsing algorithm for regular graph grammars presented by Gilroy et al. (2017). We also mention that another technique for efficient HRG parsing was recently developed by Drewes et al. (2015, 2017).

Preliminaries
The set of non-negative integers is N, and [k] = {1, . . . , k}. For a set S, $S^*$ is the set of strings over S, while $S^{\circledast}$ is the set of strings in $S^*$ in which no element of S occurs twice. The empty string is $\varepsilon$, and we have $S^+ = S^* \setminus \{\varepsilon\}$ and $S^{\oplus} = S^{\circledast} \setminus \{\varepsilon\}$. The length of a string w is denoted |w|. We use the terms 'string' and 'sequence' interchangeably. For a sequence $w = a_1 \cdots a_n$, every sequence $a_{i_1} \cdots a_{i_k}$ with $1 \le i_1 < \cdots < i_k \le n$ is a subsequence of w, and [w] is the set $\{a_1, \ldots, a_n\}$.

Hypergraphs
We fix a disjoint, countably infinite supply LAB of labels, such that each σ ∈ LAB has a rank rank(σ) ∈ N. A hypergraph is a structure g = (V, E, lab, att, ext) where V and E are the (finite) sets of nodes and hyperedges, lab : E → LAB is the edge labelling, att : E → V ⊕ is the edge attachment with |att(e)| = rank(lab(e))+1 for all e ∈ E, and ext ∈ V ⊕ is the sequence of external nodes.
From now on, we simply call hypergraphs graphs, and hyperedges edges. We use the graph as a subscript to identify its components. E.g., E g refers to the set of edges of g. For an edge e ∈ E g with att g (e) = v 0 · · · v k , we say that src g (e) = v 0 , tar g (e) = v 1 · · · v k , and name these the source and sequence of targets, respectively. Similarly, for ext g = v 0 · · · v l , we say that v 0 = g is the source of the graph, and v 1 · · · v l = g its sequence of targets. In this paper, we require all targets of a graph to be leaves, i.e. src g (e) / ∈ [g ] for all e ∈ E g . For a graph g, rank(g) = |g |, and for an edge e, rank(e) = rank(lab g (e)) = |tar g (e)|. Graphs g, h are isomorphic, denoted g ≡ h, if they are equal up to a bijective renaming of nodes and edges.
An alternating sequence v 1 e 1 . . . v k e k of nodes and edges is a path in g from v 1 to e k if src g (e i ) = v i and v i+1 ∈ [tar g (e i )], for each i ∈ [k]. We may optionally terminate the path at v k+1 instead of e k . In either case, the path passes all nodes and edges v i and e i for i ∈ [k]. If v 1 = g, it is a source path. A node v or edge e is reachable from s (in g) if there is a path in g from s to v (e). A node or edge is reachable in g if there is a source path to it.
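The notions of attachment, source, targets and reachability can be made concrete with a small sketch. It is only an illustration under stated assumptions: the class `Edge`, the function `reachable` and the toy edge list `EDGES` are hypothetical names that do not occur in the paper, and the start node is treated as trivially reachable from itself.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A hyperedge: `att` is the source node followed by the target nodes."""
    name: str
    label: str
    att: tuple

    @property
    def src(self):
        return self.att[0]

    @property
    def tar(self):
        return self.att[1:]


def reachable(edges, start):
    """Return the nodes and edge names reachable from `start`.

    A path enters an edge at its source and leaves it at any of its targets,
    so a breadth-first search over outgoing edges suffices.
    """
    outgoing = {}
    for e in edges:
        outgoing.setdefault(e.src, []).append(e)
    seen_nodes, seen_edges = {start}, set()
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for e in outgoing.get(v, []):
            seen_edges.add(e.name)
            for t in e.tar:
                if t not in seen_nodes:
                    seen_nodes.add(t)
                    queue.append(t)
    return seen_nodes, seen_edges


# Hypothetical toy graph: edge "h" starts at a node that the source "r" cannot reach.
EDGES = [
    Edge("e", "a", ("r", "u", "x1")),
    Edge("f", "b", ("u", "x1")),
    Edge("h", "c", ("z", "u")),
]
```

Calling `reachable(EDGES, "r")` collects everything on some source path when `"r"` plays the role of the source of the graph; the edge `"h"` is excluded because no path from `"r"` leads to its source node.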

Hyperedge replacement
Consider graphs h, f, and an edge e ∈ E_h such that rank(e) = rank(f) and att_h(e) = ext_f. Then we can use hyperedge replacement to obtain the graph h[[e : f]], in which e is replaced by f. Clearly, if rank(e) = rank( …
We divide LAB into two subsets TLAB and NLAB of terminals and nonterminals, and accordingly call edges terminal and nonterminal ones. We sometimes shorten the expressions further to just "terminals" and "nonterminals".

Hyperedge replacement grammars
A hyperedge replacement grammar (HRG) G = (Σ, N, S, R) consists of a terminal alphabet Σ ⊂ TLAB, a nonterminal alphabet N ⊂ NLAB, an initial nonterminal S ∈ N, and a set R of (HR) rules of the form A → f, where A ∈ N and f is a graph over Σ ∪ N with rank(A) = rank(f). If f has nonterminal edges, we name them $\{e_1, \ldots, e_\ell\}$ and write arity(A → f) for $\ell$.
Derivations in HRGs are context-free: Given a graph h, an edge e ∈ E h with lab h (e) = A ∈ N , and a rule (A → f ) ∈ R, we can derive the graph g = h[ [e : f ] ] from h. We call this a derivation step, and denote it h → A→f g. We also write more generally h → G g for a derivation step using any rule in R. The reflexive and transitive closure of → G is → * G . The language of G is the set L(G) of all graphs g over TLAB such that S • → * G g.

Order-Preserving Hyperedge Replacement Grammars
We now turn to order-preserving HRGs. The first ingredient is a condition called reentrancy preservation. Reentrancies are deeply entwined with the way we identify places in a graph that match the right-hand side of a given rule.

Reentrancies
Suppose we consider a subgraph h of a graph g as a candidate of a subgraph that may have been derived from a nonterminal edge e. If so, then $g = g'[[e : h]]$ where, intuitively, $g'$ is obtained from g by replacing h by e. To perform this backwards replacement, we have to determine which nodes of h are its external nodes, i.e., which ones are to be attached to e. By the very definition of hyperedge replacement, a node of h that is external in g or has an attached edge not belonging to h must be in $[\mathit{att}_{g'}(e)]$ (but not generally vice versa). In particular, all nodes in h that can be reached from the source of g without passing a node in h must be in $[\mathit{att}_{g'}(e)]$. The notion of reentrant nodes to be defined now serves to turn this inclusion into an equality (once we add $[\mathit{ext}_g] \cap V_h$ to this set) in the case where h is rooted at some node or edge x of g. Intuitively, the reentrant nodes of a node or edge x in a graph g are the first descendants of x that can also be reached on a path that avoids x. As the external nodes of a right-hand side of an HR rule are the ones that, after the replacement, are reachable from "outside" the subgraph, we also consider them as reentrant. The graph delineated by x and its reentrant nodes is the subgraph rooted at x.
Let us have a look at a simple example before defining the notion of reentrant nodes formally. The graph in Figure 1 is single-rooted, with r the root node. The reentrant nodes of r are the external targets (i.e., $x_1$, $x_2$ and $x_3$), and these are also the reentrant nodes of the edge e sourced at r.
For the edge marked f, $x_2$ is a reentrant node, and so are $v_1$ and $v_2$, as $v_2$ is reachable through the path $r\,e\,i_1\,g\,v_2$ that avoids f, and $v_1$ likewise is reachable by the path $r\,e\,i_1\,g\,i_2\,h\,v_1$, also avoiding f. For f, the set of reentrant nodes is $\{v_1, v_3\}$, as $v_3$ is also a direct target of f, making it reachable on the path $x_1$ …

Figure 1: An example graph for reentrancies.
Definition 3.1 (Reentrant node). Given a graph g and $E \subset E_g$, let $\mathit{TAR}_g(E)$ be the union of all sets of targets of edges in E, i.e., $\mathit{TAR}_g(E) = \bigcup_{e \in E} [\mathit{tar}_g(e)]$.
g be the set of all edges e ∈ E g such that all source paths to e pass x. 1 Then the set of reentrant nodes of x in g is Definition 3.2 (Rooted subgraph). Given a graph , att h and lab h are the appropriate restrictions of att g and lab g , respectively, and ext h isx followed by reent h (x) in some order.
Rooted subgraphs are strictly nested, which is proved by Björklund et al. (2018) in the form of the following lemma (where ∼ is isomorphy modulo the order of g ): Lemma 3.3 (Lemma 3.4 in (Björklund et al., 2018)). Let g be a graph, h = g↓ x for some

Reentrancy Preservation
Reentrancy preservation formalizes the property that, given a graph h and some edge e ∈ E_h with lab_h(e) = A, we can replace e by some graph f according to a rule A → f without affecting the sets of reentrant nodes. We achieve this by restricting our grammars to two types of rules, namely duplication rules and deep rules. Rules of these two kinds are called reentrancy preserving.
To define duplication rules, consider a graph $f = (\{v_0, \ldots, v_n\}, \{e_1, e_2\}, \mathit{att}, \mathit{lab}, \mathit{ext})$ where $\mathit{att}(e_1) = v_0 \cdots v_n = \mathit{att}(e_2)$, $\mathit{lab}(e_1) = \mathit{lab}(e_2) \in \mathit{NLAB}$, and ext is a subsequence of $\mathit{att}(e_1)$ starting with $v_0$. If |ext| < n then f (and every graph isomorphic to f) is a twin, and if |ext| = n then it is a clone. A rule A → f is a twin rule if f is a twin, and a clone rule if f is a clone with $\mathit{lab}(e_1) = \mathit{lab}(e_2) = A$. A duplication rule is either a clone or a twin rule.
A rule A → f is a deep rule if f fulfills the following conditions:
• all nodes in V_f are reachable from the source of f and have out-degree ≤ 1, and
• …
An HRG is reentrancy preserving if it has only reentrancy-preserving rules. We note here that Björklund et al. (2018) also permit chain rules, i.e., rules that only change the label of an edge from one nonterminal to another nonterminal, and thus violate the first condition above. In the present paper we exclude them because they can result in an infinite number of derivations of a given graph, thus making it in general unreasonable to associate a weight with such a graph.
Later on, we will also need the following generalization of duplication rules to the case where $\ell + 1$ copies of a nonterminal edge are created: given any duplication rule r = (A → f) and some $\ell \ge 1$, we denote by $r_\ell$ the rule $A \to f_\ell$, where $f_\ell$ is obtained from f by replacing its two nonterminals by $\ell + 1$ copies. Thus, $r_1 = r$.
Lemma 3.4 (Björklund et al. (2018) adapted). Let g ∈ L(G) for some reentrancy-preserving HRG G. There is a quadratic algorithm that computes, for every x ∈ V_g ∪ E_g, the set reent_g(x), and thus the subgraph g↓x.

Ordering nodes
Reentrancy preservation allows us to pinpoint the subgraphs that may have been generated by a specific nonterminal, but as shown by Björklund et al. (2016), this is not sufficient to achieve efficient parsing, as needing to guess the order of targets in subgraphs g↓x may still cause NP-hardness. Thus, we require a way to determine the order of nodes, in particular reentrant nodes. This requires an ordering relation that can be efficiently computed and fulfils some basic requirements, and a set of reentrancy-preserving rules that additionally preserves that order. Formally:
Definition 3.5 (Suitable order). For a set G of graphs, a suitable family of orders is a family …
Definition 3.6 (Order preservation). A reentrancy-preserving set R of HR rules preserves a suitable family of orders …
An order-preserving HRG (OPHG) is a reentrancy-preserving HRG (Σ, N, S, R) together with a suitable family of orders preserved by R.

Weighted Order-Preserving HR Grammars
We now add weights, taken from some semiring, to order-preserving HR grammars. For this, and throughout the rest of this paper, let S = (S, +, ·, 0, 1) be a commutative semiring, meaning that (S, +, 0) and (S, ·, 1) are two monoids over the domain S such that · distributes over +. Thus, spelled out in detail, + and · are associative binary operations on S such that
• 1 is the identity element for ·,
• 0 is the identity element for + and the absorbing one for ·,
• + and · are commutative, and
• · distributes over +.
As usual, for every a ∈ S we let a 0 = 1 and a n+1 = a · a n for all n ∈ N.
Examples of well-known semirings are the Boolean semiring, the real numbers with addition and multiplication, the tropical semiring consisting of the non-negative real numbers extended by ∞ with minimum and addition, and the Viterbi semiring over [0, 1] in which multiplication is as usual and addition is maximum. The latter is used in natural language processing to compute the likelihood of the most probable derivation. See (Goodman, 1999) for more information on the use of semirings in natural language parsing.
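As an illustration of these examples, the following sketch encodes a semiring as a small record of two operations and two constants. The class name `Semiring`, its methods `total`, `product` and `power`, and the four constants are hypothetical names chosen for this sketch; floating-point numbers merely stand in for the reals.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring given by its two operations and their identities."""
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]
    zero: Any  # identity for add, absorbing for mul
    one: Any   # identity for mul

    def total(self, xs):
        """Semiring sum of a finite collection (zero if it is empty)."""
        acc = self.zero
        for x in xs:
            acc = self.add(acc, x)
        return acc

    def product(self, xs):
        """Semiring product of a finite collection (one if it is empty)."""
        acc = self.one
        for x in xs:
            acc = self.mul(acc, x)
        return acc

    def power(self, a, n):
        """a^0 = one and a^(n+1) = a * a^n."""
        acc = self.one
        for _ in range(n):
            acc = self.mul(a, acc)
        return acc


BOOLEAN = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)
REAL = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)
# Tropical: "addition" is minimum, "multiplication" is ordinary addition.
TROPICAL = Semiring(min, lambda a, b: a + b, math.inf, 0.0)
# Viterbi: "addition" is maximum, multiplication is as usual.
VITERBI = Semiring(max, lambda a, b: a * b, 0.0, 1.0)
```

In the tropical semiring, the sum over several derivations picks the cheapest one, while in the Viterbi semiring it picks the most probable one; the same parsing procedure can therefore serve both purposes by swapping the semiring.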
A weighted OPHG computes a graph series, i.e. a mapping of graphs to S. As usual, this is achieved by assigning weights to rules. Informally speaking, if several distinct derivations can produce the same graph, we sum up the weights of the individual derivations to obtain the weight of the graph. The weight for a single derivation is the product of the weights of all the rules applied.
It is inconvenient to formalise this based on the derivations themselves because, just as in the case of ordinary context-free grammars, derivations may differ only in the order in which nonterminals are replaced, which yields distinct derivations that should be considered equivalent. A standard technique to solve this problem is to consider derivation trees instead of derivations. We can mostly use this standard technique, but we propose to take into account the fact, mentioned in the introduction, that each duplication rule has a nontrivial automorphism that interchanges the nonterminals in its right-hand side. Hence, these nonterminals are indistinguishable. Moreover, if the rule is a clone rule, then applying it to any of the nonterminals in its right-hand side yields three indistinguishable nonterminals in two different ways.
In general, suppose that a nonterminal is cloned $\ell$ times, yielding $\ell + 1$ copies which are then further derived into graphs $g_0, \ldots, g_\ell$ of weights $w_0, \ldots, w_\ell$. Then the clones can be derived by $C_\ell$ different derivation trees, where $C_\ell$ is the $\ell$-th Catalan number (i.e., the number of binary trees with $\ell + 1$ leaves). The resulting nonterminals $e_0, \ldots, e_\ell$ can be derived into the graphs $g_0, \ldots, g_\ell$ in any order, all leading to the same result. This yields $\ell!\,C_\ell$ distinct derivations, all generating the same graph g which consists of $g_0, \ldots, g_\ell$ fused at their external nodes. The weight of g would thus be $\sum_{j=1}^{\ell!\,C_\ell} w^{\ell} \prod_{i=0}^{\ell} w_i$, where w is the weight of the cloning rule. While there is nothing wrong with this in principle, the fact that we only allow for this particular type of cloning rule implies that there would be no way to avoid the sum by writing the rules of the grammar in a different way. Further, since the number of terms summed up depends on $\ell$, it cannot in general be compensated for by reducing the weights of rules. We expect this to be a limiting factor in applications, and thus propose to represent an $\ell$-fold cloning as an unordered node of rank $\ell + 1$ in the derivation tree, leading to the weight $w^{\ell} \prod_{i=0}^{\ell} w_i$.
Let us begin the process of making these notions more precise by recalling the notions of shallow graphs and siblinghoods from (Björklund et al., 2018).
for all e ∈ E_g. A siblinghood in g is a set Sib ⊆ E_g such that |Sib| ≥ 2 and $\mathit{tar}_g(e) = \mathit{tar}_g(e')$ for all $e, e' \in \mathit{Sib}$. We denote $\mathit{tar}_g(e)$, e ∈ Sib, by $\mathit{tar}_g(\mathit{Sib})$, and let g(Sib) = ({g} ∪ [tar_g(Sib)], Sib, att_g|_Sib, lab_g|_Sib, tar), where tar is the subsequence of $\mathit{tar}_g(\mathit{Sib})$ of nodes that are external in g or targets of edges outside of Sib, i.e. that belong to the set …
For siblinghoods $\mathit{Sib}, \mathit{Sib}'$, we let $\mathit{Sib} \le \mathit{Sib}'$ if $\mathit{tar}_g(\mathit{Sib})$ is a subsequence of $\mathit{tar}_g(\mathit{Sib}')$. A siblinghood of g is prime if it is maximal with respect to both ≤ and set inclusion.
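The definition of prime siblinghoods can be illustrated by a short sketch. It assumes that each edge is given only by its name and its sequence of targets; the function names `is_subsequence`, `siblinghoods` and `prime_siblinghoods` and the toy data `TARGETS` are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def is_subsequence(sub, seq):
    """True if `sub` can be obtained from `seq` by deleting elements."""
    it = iter(seq)
    return all(x in it for x in sub)


def siblinghoods(targets):
    """Group edges by target sequence; keep the groups with at least two edges.

    These are exactly the siblinghoods that are maximal under set inclusion.
    """
    groups = defaultdict(set)
    for edge, tar in targets.items():
        groups[tar].add(edge)
    return {tar: frozenset(es) for tar, es in groups.items() if len(es) >= 2}


def prime_siblinghoods(targets):
    """Keep the inclusion-maximal siblinghoods whose target sequence is not a
    proper subsequence of the target sequence of another siblinghood."""
    groups = siblinghoods(targets)
    return {
        tar: es
        for tar, es in groups.items()
        if not any(other != tar and is_subsequence(tar, other) for other in groups)
    }


# Hypothetical toy data: edge name -> sequence of targets.
TARGETS = {
    "e1": ("a", "b", "c"),
    "e2": ("a", "b", "c"),
    "e3": ("a", "c"),
    "e4": ("a", "c"),
    "e5": ("b",),
}
```

Here `{e3, e4}` is a siblinghood but not a prime one, because its target sequence `a c` is a subsequence of `a b c`; the single edge `e5` forms no siblinghood at all.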
From now on, we shall for technical simplicity assume that the considered OPHG G contains exactly one clone rule for every A ∈ N. This is not a restriction because the definition of the weight of derived graphs to be given below ensures that any number of clone rules for the same nonterminal can be replaced by a single clone rule whose weight is the sum of the weights of the individual rules. In particular, if there is no clone rule for A, this has the same effect as a single clone rule of weight 0. The weight of the unique clone rule for A ∈ N is denoted by ω(A), and we write $\to_{cl}$ for the derivation relation that exclusively uses clone rules, i.e., $g \to^*_{cl} g'$ if $g'$ is obtained from g by cloning nonterminal edges.
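To get a feeling for the counting argument about repeated cloning, the following sketch compares the weight obtained by summing over all ℓ!·C_ℓ derivations with the weight obtained from a single unordered node. It assumes the semiring of real numbers, so that the sum of identical terms becomes a multiplication by their count; the function names are hypothetical.

```python
from math import comb, factorial, prod


def catalan(n):
    """n-th Catalan number, the number of binary trees with n + 1 leaves."""
    return comb(2 * n, n) // (n + 1)


def summed_clone_weight(w, sub_weights):
    """Weight when every one of the l! * C_l derivations is counted separately.

    `w` is the weight of the clone rule and `sub_weights` are the weights
    w_0, ..., w_l of the l + 1 subderivations (real-number semiring assumed).
    """
    ell = len(sub_weights) - 1
    return factorial(ell) * catalan(ell) * w**ell * prod(sub_weights)


def unordered_clone_weight(w, sub_weights):
    """Weight w^l * w_0 * ... * w_l of a single unordered node of rank l + 1."""
    ell = len(sub_weights) - 1
    return w**ell * prod(sub_weights)
```

For example, with clone weight 0.5 and three subderivations of weight 0.5 each (ℓ = 2), the unordered node has weight 0.03125, whereas counting all 2!·C_2 = 4 derivations separately gives 0.125. The factor grows quickly with ℓ, which is the effect the unordered representation avoids.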
The following is essentially Lemma 5.3 of (Björklund et al., 2018):
Lemma 4.3. Let A ∈ N and let g be a shallow graph over N with |E_g| ≥ 2.
• If $A^\bullet \to^+ g$, then for every prime siblinghood Sib of g we either have g = g(Sib) and …
• Up to reordering of derivation steps, the derivations of these forms are the only ones deriving g from $A^\bullet$.
Hence, a derivation of a shallow graph can be broken down into an initial series of clonings followed by iterated sub-derivations, each consisting of an application of a twin rule A → f and any number of clonings of the two nonterminal edges $e_1, e_2$ of f. Note that the result of each such sub-derivation depends only on A → f and the number of clonings, since $\mathit{att}_f(e_1) = \mathit{att}_f(e_2)$. Therefore, the following definition of derivation trees uses trees in which the nodes that correspond to derivations of siblinghoods are unordered and unranked. For a tree consisting of a root labelled a and subtrees $t_1, \ldots, t_\ell$, we write $a[t_1, \ldots, t_\ell]$ or $a\langle t_1, \ldots, t_\ell \rangle$ depending on whether $t_1, \ldots, t_\ell$ is to be interpreted as an ordered or unordered list (or a multiset), respectively. We write $a(t_1, \ldots, t_\ell)$ to denote a tree in which the first level of children can be either ordered or unordered.
Definition 4.4 (derivation tree). For a weighted OPHG G = (Σ, N, S, R, ω) and A ∈ N, the set of all A-derivation trees is the smallest set of trees t belonging to one of the following three types:
(1) $t = r[t_1, \ldots, t_\ell]$ for a deep rule r = (A → f) ∈ R such that arity(A → f) = $\ell$, and $t_i$ is a $\mathit{lab}_f(e_i)$-derivation tree for every $i \in [\ell]$.
(2) $t = r_\ell\langle t_1, \ldots, t_{\ell+1} \rangle$ for a clone rule r = (A → f), where $\ell \ge 1$ and, for every $i \in [\ell + 1]$, the subtree $t_i$ is an A-derivation tree that is not of type (2).
(3) $t = r_\ell\langle t_1, \ldots, t_{\ell+1} \rangle$ for a twin rule r = (A → f), where $\ell \ge 1$ and, for every $i \in [\ell + 1]$, the subtree $t_i$ is a $\mathit{lab}_f(e_1)$-derivation tree that is not of type (2).
A more rigorous and complete treatment of various issues surrounding derivation trees of graph algebras with associative and commutative operations can be found in (Courcelle, 1991b).
We can evaluate a derivation tree to yield a graph g in the following way: Given a derivation tree $t = r(t_1, \ldots, t_\ell)$, eval(t) is defined as the right-hand side f of r, with each successive nonterminal $e_i$ replaced with the evaluation of the corresponding subtree of the derivation tree, i.e., $\mathit{eval}((A \to f)(t_1, \ldots, t_\ell)) = f[[e_1 : \mathit{eval}(t_1), \ldots, e_\ell : \mathit{eval}(t_\ell)]]$. Given a graph g, we let $DT_G(g)$ denote the set of all S-derivation trees t such that eval(t) ≡ g.
We make the following observation, whose correctness follows from the context-freeness of hyperedge replacement.
Observation 4.5. Let G = (Σ, N, S, R, ω) be an OPHG. Then it holds that …
Now, as mentioned, the weight of a graph is defined to be the sum of the weights of all its derivation trees:
Definition 4.6 (generated graph series). Let G = (Σ, N, S, R, ω) be a weighted OPHG and A ∈ N.
1. For every duplication rule r = (A → f) ∈ R and every $\ell \ge 1$, let $\omega(r_\ell) = \omega(r) \cdot \omega(\mathit{lab}_f(e_1))^{\ell - 1}$. (Note that $r_\ell$ corresponds to the application of r followed by $\ell - 1$ clonings of any of the two resulting nonterminal edges.)

The weight of an
3. The graph series $\omega_G : \mathcal{G}_\Sigma \to S$ generated by G is given by $\omega_G(g) = \sum_{t \in DT_G(g)} \omega(t)$. (The sum is finite, and thus well defined due to the commutativity of +.) Note that, given G, the language L(G) of G seen as an unweighted grammar is a superset of the support of $\omega_G$, i.e., the set of all graphs g such that $\omega_G(g) \ne 0$.
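The way weights propagate through derivation trees can be sketched as follows. The sketch rests on several assumptions that should be read as such: the weight of a tree is taken to be the weight of its root rule times the product of the weights of its subtrees (the informal description of derivation weights given earlier), a duplication rule r with ℓ + 1 children contributes ω(r_ℓ) = ω(r) · ω(lab_f(e_1))^(ℓ−1), and the semiring is that of the real numbers. The toy grammar data and all function names are hypothetical.

```python
from math import prod

# Hypothetical toy grammar data (not from the paper).
RULE_WEIGHT = {"twin": 0.5, "leaf": 0.5}
# For each duplication rule: the label lab_f(e_1) of its two nonterminal edges.
DUPLICATED_LABEL = {"twin": "B"}
# Weight of the unique clone rule of each nonterminal.
CLONE_WEIGHT = {"B": 0.25}


def root_weight(rule, n_children):
    """Weight contributed by the root: omega(r) for a deep rule, omega(r_l)
    for a duplication rule with l + 1 = n_children children."""
    w = RULE_WEIGHT[rule]
    if rule in DUPLICATED_LABEL:
        ell = n_children - 1
        w *= CLONE_WEIGHT[DUPLICATED_LABEL[rule]] ** (ell - 1)
    return w


def tree_weight(tree):
    """A tree is a pair (rule, children); child order does not affect the weight."""
    rule, children = tree
    return root_weight(rule, len(children)) * prod(tree_weight(c) for c in children)


def series_weight(trees):
    """Sum of tree weights over a given finite set of derivation trees."""
    return sum(tree_weight(t) for t in trees)


LEAF = ("leaf", ())
T1 = ("twin", (LEAF, LEAF))        # l = 1: no additional cloning
T2 = ("twin", (LEAF, LEAF, LEAF))  # l = 2: one additional cloning
```

With these numbers, `T1` has weight 0.5 · 0.5 · 0.5 = 0.125, and `T2` has weight (0.5 · 0.25) · 0.5³ = 0.015625. `series_weight` would be applied to the set of all derivation trees of one graph; an empty set yields weight 0, matching graphs outside the support.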

Computing Weights
Our algorithm builds upon the unweighted parsing algorithm by Björklund et al. (2018). We store in each node and edge nothing more than an |N|-vector of weights, which is computed in very much the same way as the sets of nonterminals computed in (Björklund et al., 2018). We use the distributivity of multiplication over addition to keep our computations efficient (assuming efficient multiplication and addition).
The algorithm exploits Lemma 3.3, i.e., the property that the subgraphs g↓x are strictly nested in all graphs derivable by an OPHG. Using this, it is possible to process the subgraphs of g in a tree-like "bottom-up" manner, marking each node and edge x with the set of all nonterminals that can generate g↓x, after all g↓y properly contained in g↓x have already been processed. Eventually, S belongs to the set which the source node of g is marked with if and only if g ∈ L(G).
Order preservation enters the picture as follows: every subgraph h of g which was derived from some nonterminal edge is of the form h = g↓x for some node or edge x of g. As shown by Björklund et al. (2018), order preservation guarantees that h is ordered by $\prec_g$. Thus, in the algorithm only those subgraphs g↓x are of interest for which the ordering of targets is uniquely determined by $\prec_g$. From now on, we will thus assume that, whenever a subgraph h = g↓x is constructed, the order of nodes in h is chosen according to $\prec_g$.
To show how ω G (g) can be computed, we describe two algorithms in one: the first computes the derivation trees of g whereas the second computes its weight by summing up over all the derivation trees. In the current paper, we mainly use the first algorithm as a tool to facilitate the correctness proof of the second. As a consequence, we do not present that first algorithm in a way which immediately yields an efficient algorithm, i.e., we only care for the efficiency of the second algorithm. The set of derivation trees computed by the first algorithm can, however, be represented in a compact fashion as a "packed forest", which is of independent usefulness and makes the algorithm efficient.
The main procedure of the algorithm computes, in the same bottom-up manner as in (Björklund et al., 2018), a set $D_x(A)$ of A-derivation trees for each x ∈ V_g ∪ E_g and every A ∈ N. More precisely, $D_x(A)$ is the set of all A-derivation trees of the input HRG G such that $A^\bullet \to^*_G g{\downarrow}_x$. As the correctness of this procedure was proved by Björklund et al. (2018) (though not explicitly in terms of derivation trees), all that remains to be shown is that the second version of the algorithm computes $\sum_{t \in D_g(S)} \omega(t)$ under the assumption that the first one is correct.
That second algorithm computes weights $W_x(A)$ instead of the sets $D_x(A)$, where $W_x(A) = \sum_{t \in D_x(A)} \omega(t)$. In the pseudocode, we always indicate the changes that must be made to obtain the second version by lines marked by "alt:". The line marked in this manner replaces its immediate predecessor. For sets of (derivation) trees $D_1, \ldots, D_\ell$ ($\ell \in \mathbb{N}$) and a rule r of arity $\ell$, we furthermore write $r(D_1, \ldots, D_\ell)$ to denote the set $\{r(t_1, \ldots, t_\ell) \mid (t_1, \ldots, t_\ell) \in D_1 \times \cdots \times D_\ell\}$ (i.e., we use that notation in both the ordered and unordered case).
A subroutine used by the algorithm is Algorithm 1, a modified version of the corresponding procedure in (Björklund et al., 2018). It takes as input a shallow graph h whose edges e are already assumed to be annotated with the respective sets $D_e(A)$. The algorithm uses Lemma 4.3 in order to assemble, in a bottom-up manner over the prime siblinghoods of h, the set $D_h(A)$. In the algorithm we say that a duplication rule A → f of G fits a siblinghood $\mathit{Sib} = \{s_1, \ldots, s_\ell\}$ of h if $f \equiv h(\{s_1, s_2\})$ when disregarding edge labels, and we denote f by $B^{\bullet\bullet}$ to indicate that the two edges in f carry the label B.
The reader should note that the result of Algorithm 1 does not depend on the choice of Sib because the prime siblinghoods Sib 1 , . . . , Sib k of h are pairwise disjoint and the replacement of Sib = Sib i by e does not affect the siblinghoods Sib j , j ∈ [k] \ {i} (though it may of course create an additional prime siblinghood).
The main procedure of the parsing algorithm is shown in Algorithm 2. In its while loop, it repeatedly chooses an x ∈ V g ∪ E g for which the sets D x (A) shall be computed, and calls PARSE V (Algorithm 3) or PARSE E (Algorithm 4) depending on whether x ∈ V g or x ∈ E g .
The function MATCHING used in line 4 of Algorithm 4 is described by Björklund et al. (2018) (using slightly different notation). It is based on the fact that, if g↓e can be derived from a deep right-hand side f, then the mapping φ of the nodes in f to their images in g↓e is uniquely determined by f and the reentrancies in g↓e, due to reentrancy and order preservation. As proved by Björklund et al. (2018), this makes it furthermore possible to compute φ = MATCHING(f, e) in linear time.

    …
    for each A ∈ N do
8:  …

Algorithm 2 Computing Derivation Trees for Order-Preserving HR Grammars
1: function PARSE(order-preserving HR grammar G = (Σ, N, S, R), graph g ∈ G R)
2:     preProcess(g)    ▷ Compute ≺_g as well as all g↓x for all x ∈ V_g ∪ E_g
3:     if g↓x is defined then D_x ← ⊥
5:     …
10:    else PARSE_E(x)
11:    return D_g(S)
       alt: return W_g(S)

Algorithm 3 Computing Derivation Trees of g↓v for nodes v ∈ V_g
1: function PARSE_V(node v such that D_e(A) ≠ ⊥ for all e ∈ E_g with src_g(e) = v)
2:     if v has out-degree 0 then
3:     …
       if φ = null then
6:     … )(lab_f(e_i))}
As mentioned above, the correctness of the computation of the sets $D_x(A)$ was essentially shown by Björklund et al. (2018), and so we take it for granted here and use that fact to show inductively that the second version of the algorithm correctly computes the weights. Below, we assume for the sake of technical simplicity that the operations of the semiring S are computable in constant time. Clearly, the efficiency of the algorithm decreases accordingly if the operations are more complex. However, since the class of polynomials is closed under composition, the computation of weights stays polynomial whenever the operations of S are computable in polynomial time with respect to the input graph and the HRG.
Theorem 5.1. Let ≺ be a suitable family of orders, and let η be a function mapping graphs to N such that both η(g) and $\prec_g$ can be computed in time η(g). 3 Then there is an algorithm which takes as input a graph g and a weighted OPHG G = (Σ, N, S, R, ω), and computes $\omega_G(g)$ in time $O(\eta(g) + |g|^2 + |G|^2)$.
Proof. With straightforward reformulations, the proof of the main theorem in (Björklund et al., 2018) shows that Algorithm 2 computes $DT_G(g)$ and runs in time $O(\eta(g) + |g|^2 + |G|^2)$ if the time required for the explicit construction of derivation trees is neglected. Together with the assumption that the operations of S can be computed in constant time, the latter means that the weight-computing version of Algorithm 2 runs in time $O(\eta(g) + |g|^2 + |G|^2)$ as well. To complete the proof, it thus suffices to prove by induction that Algorithms 1-4 maintain the invariant that $W_x(A) = \sum_{t \in D_x(A)} \omega(t)$ for those edges and nodes x and those A ∈ N such that $D_x(A) \ne \bot$.
3 The function η describes the complexity of computing $\prec_g$, and the condition that it can be executed in time η(g) corresponds to the usual requirement of time constructibility.
In the proof, for a set D of derivation trees, we abbreviate $\sum_{t \in D} \omega(t)$ by ω(D). We check the algorithms one by one. Note that the induction hypothesis states that the equation $W_x(A) = \omega(D_x(A))$ holds when the respective procedure is entered, and we have to show that it still holds afterwards. We use the fact that, by distributivity, for every rule r = (A → f) of arity $\ell$ and all sets $D_1, \ldots, D_\ell$ of derivation trees, it holds that $\omega(r(D_1, \ldots, D_\ell)) = \omega(r) \cdot \prod_{i \in [\ell]} \omega(D_i)$.
… $= \omega(D_e(A))$.
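The role of distributivity in this argument can be checked on a small numerical example. The sketch below covers the ordered case only, represents each set D_i simply by the list of weights of its trees, and works over the real numbers; both function names are hypothetical.

```python
from itertools import product
from math import isclose, prod


def weight_by_enumeration(rule_weight, weight_sets):
    """Sum omega(r) * omega(t_1) * ... * omega(t_l) over all tuples (t_1, ..., t_l),
    where weight_sets[i] lists the weights of the trees in D_{i+1}."""
    return sum(rule_weight * prod(combo) for combo in product(*weight_sets))


def weight_by_distributivity(rule_weight, weight_sets):
    """omega(r) times the product of the summed weights omega(D_i)."""
    return rule_weight * prod(sum(ws) for ws in weight_sets)
```

For `rule_weight = 0.5` and `weight_sets = [[0.25, 0.5], [1.0, 2.0, 0.5]]`, both functions return 1.3125, but the first enumerates all six tuples whereas the second needs only two sums and two multiplications. This is why storing one weight per nonterminal at each node or edge suffices, and the sets of derivation trees never have to be built.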
Procedure PARSE: Only line 6 affects some D x (A) and W x (A). These lines obviously preserve the invariant.
Procedure PARSE V : As before, line 3 respects the invariant. Concerning line 10, note that the two versions of SHALLOWPARSE return (A → D e (A)) A∈N and (A → W e (A)) A∈N , respectively, for some edge e. By induction hypothesis, W e (A) = ω(D e (A)) for all A ∈ N , which completes the argument. = ω(D) + ω(r[D φ(src f (e 1 )) (lab f (e 1 )), . . .
This completes the correctness proof of the theorem.
As indicated before, it is worth noting that the first version of the parsing algorithm computes the set $DT_G(g)$ in time $O(\eta(g) + |g|^2 + |G|^2)$ if the sets $D_x(A)$ are represented in a compact way as packed forests. This may be useful for further applications.

Conclusions
Semantic parsing is a necessary tool for the improvement of any number of natural language processing tools, and the use of graphs as semantic models is becoming a standard approach. Abstract Meaning Representation is one example. There is, however, no formal standard, and the algorithmic issues involved are largely unexplored. In particular, there are hardly any models for the formal description of weighted semantic graphs, despite the importance of probabilities and other kinds of weights in natural language processing for, e.g., resolving ambiguities. In this contribution, we have taken a step towards resolving this situation by showing that order-preserving hyperedge replacement grammars can be extended with weights without significantly affecting the complexity of analysing a graph with respect to the grammar. We thus hope to have provided a useful building block for making semantic parsing practical.
To allow for efficient parsing, order-preserving hyperedge replacement grammars allow only for restricted forms of rules. In particular, the only way to create nodes of unlimited out-degree is to use so-called clone rules. Since clone rules are associative and commutative, we have opted to view the corresponding sections of the resulting derivation trees as unordered nodes of the appropriate degree and define the weight of these substructures as $w^{\ell} \prod_{i=0}^{\ell} w_i$, where w is the weight of the cloning rule (which is applied $\ell$ times) and $w_0, \ldots, w_\ell$ are the weights of the subderivations. It may be worth noting that, in cases where this is too restrictive, one may use a commutative product valuation monoid (Droste and Meinecke, 2010) as a weight structure. Such a valuation monoid comes with an additional valuation function val which takes an arbitrary multiset of weights to a generalized product. Then the expression above may be generalized to $w^{\ell} \cdot \mathit{val}(w_0, \ldots, w_\ell)$ without making parsing more difficult.