Incremental Semantic Construction Using Normal Form CCG Derivation

This paper proposes a method of incrementally constructing semantic representations. Our method is based on Steedman's Combinatory Categorial Grammar (CCG), which has a transparent correspondence between syntax and semantics. In our method, a derivation for a sentence is constructed incrementally, and the corresponding semantic representation is derived synchronously. Unlike previous approaches, our method uses normal form CCG derivations. Previous approaches use the most left-branching derivations, called incremental derivations, but they cannot process coordinate structures incrementally. Our method overcomes this problem.


Introduction
By incremental interpretation, we mean that a sentence is analyzed from left to right and a semantic representation is assigned to each initial fragment of the sentence. These properties enable NLP systems to analyze unfinished sentences. Moreover, incremental interpretation is useful for incremental dialogue systems (Allen et al., 2001; Aist et al., 2007; Purver et al., 2011; Peldszus and Schlangen, 2012). Furthermore, in the field of psycholinguistics, incremental interpretation has been explored as a human sentence processing model. This paper proposes a method of constructing a semantic representation for each initial fragment of a sentence in an incremental fashion. The proposed method is based on Combinatory Categorial Grammar (CCG) (Steedman, 2000). CCG represents the syntactic process as a derivation, which is a tree structure. Our method constructs a CCG derivation by applying operations used in incremental phrase structure parsing. Each intermediate data structure constructed by the operations represents partial information of some derivation. Our method obtains a semantic representation from the intermediate structure. Since the obtained semantic representations conform to the CCG semantic construction, we can expect that incremental semantic interpretation can be realized by applying a CCG-based semantic analysis such as (Bos, 2008). This paper is organized as follows: Section 2 briefly explains Combinatory Categorial Grammar. Section 3 gives an overview of previous work on CCG-based incremental parsing and discusses its problems. Section 4 proposes our CCG-based method of incrementally constructing semantic representations. Section 5 reviews related work and Section 6 concludes this paper.

Combinatory Categorial Grammar
Combinatory Categorial Grammar (CCG) (Steedman, 2000) is a grammar formalism which has a transparent correspondence between syntax and semantics. Syntactic information is represented using basic categories (e.g., S, NP) and complex categories. Complex categories have the form X/Y or X\Y, where X and Y are categories. Intuitively, a category of the form X/Y receives a category Y from its right and returns a category X; in the case of X\Y, the direction is to the left. For example, the category of a transitive verb is (S\NP)/NP, which receives an object NP from its right and returns a category S\NP. The category S\NP corresponds to a verb phrase: it receives a subject NP from its left, and the result is a sentence S. Formally, categories are combined using CCG rules such as the ones shown in Figure 1. Each rule means that, when the elements on the left-hand side of the arrow are combined in this order, the result is the right-hand side. The symbol with which the arrow is subscripted designates the rule type. Each element consists of a syntactic category and a semantic representation, separated by a colon. A semantic representation is a λ-term. Each combination of syntactic categories has a corresponding semantic composition of their semantic representations. Figure 2 shows an example of a CCG derivation, taken from (Steedman, 2000). Here, we write λx_1 x_2 ··· x_n.M and M_1 M_2 M_3 ··· M_n to abbreviate the λ-terms (λx_1.(λx_2.(···(λx_n.M)···))) and ((···((M_1 M_2) M_3)···) M_n), respectively. In this example, each node has three labels: a syntactic category, a semantic representation and the rule type used to derive the node. For each leaf node, a word is assigned instead of a rule type.
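As a concrete illustration of this category–semantics pairing, forward function application can be sketched in a few lines of Python (a toy encoding of our own devising; the string categories, the `unwrap` helper and the tuple representation of meet′ are assumptions for illustration, not part of the grammar formalism):

```python
# Toy encoding of CCG forward function application (>):
#   X/Y : f    Y : a    =>    X : f a
# Categories are strings; semantic representations are Python callables
# standing in for λ-terms.

def unwrap(cat):
    """Drop one pair of outer parentheses, e.g. '(S\\NP)' -> 'S\\NP'."""
    return cat[1:-1] if cat.startswith('(') and cat.endswith(')') else cat

def forward_apply(left, right):
    """Combine two (category, semantics) pairs by forward application."""
    lcat, lsem = left
    rcat, rsem = right
    if '/' not in lcat:
        raise ValueError('left category is not of the form X/Y')
    # Split X/Y at the last slash (toy: assumes no '/' inside the argument Y).
    x, y = lcat.rsplit('/', 1)
    if unwrap(y) != unwrap(rcat):
        raise ValueError('argument category mismatch')
    return (unwrap(x), lsem(rsem))      # X : f a

# "met" := (S\NP)/NP : λx.λy.meet' x y   (object first, then subject)
met = ('(S\\NP)/NP', lambda x: lambda y: ('meet', y, x))
manny = ('NP', 'manny')

vp = forward_apply(met, manny)          # S\NP : λy.meet' manny' y
```

Applying the resulting verb-phrase semantics to a subject then yields a complete predicate–argument structure, mirroring the derivation in Figure 2.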

Incremental Parsing Based on CCG
Incremental parsing methods based on CCG have been proposed (Reitter et al., 2006; Hassan et al., 2008; Hefny et al., 2011). By exploiting the property that CCG allows non-standard constituents, previous CCG-based incremental parsers assign a syntactic category to each initial fragment of an input sentence. The obtained derivations are the most left-branching ones, which are called incremental derivations. Figure 3 shows two examples of incremental derivations. In Figure 3(a), the fragment "Anna met" is a non-phrase, but it has the syntactic category S/NP. However, Demberg (2012) has demonstrated that some kinds of sentences cannot have strictly left-branching derivations. This means that, in previous approaches, there are cases where the parser cannot assign any syntactic category to an initial fragment. Consequently, such initial fragments do not have any semantic representations.
A typical example is the coordinate structure. In CCG, a coordinate structure is derived by combining the conjuncts and a conjunction using the coordination rule. This prevents the first conjunct from combining with the constituent to its left. As an example, let us consider the incremental derivation shown in Figure 3(b). Here, the word "met" is the first conjunct of "met and might marry" and cannot be combined with "Anna". If we assign the category S/NP to the initial fragment "Anna met" as shown in Figure 3(a), the word "met" cannot be treated as a conjunct. This example demonstrates that sentences including coordinate structures cannot be represented by any strictly left-branching derivation. That is, incremental derivation approaches cannot achieve word-by-word incremental interpretation.

Incremental Semantic Construction Based on CCG
This section proposes a method of constructing semantic representations in an incremental fashion. To overcome the problem described in the previous section, our method adopts a different approach: it does not need to use incremental derivations. For each initial fragment of a sentence, our proposed method obtains a semantic representation from the normal form derivation. A normal form derivation is defined as one which uses type-raising and function composition only when they are required. (Several variants of normal form have been presented; see, for example, (Eisner, 1996) and (Hockenmaier and Bisk, 2010).) We consider a derivation as a parse tree and construct it based on incremental phrase structure parsing. For each initial fragment of a sentence, incremental parsing can construct a partial parse tree which connects all the words in the fragment. Our method obtains a semantic representation from the partial parse tree. In the constructed partial parse tree, some parts of the derivation are underspecified. Our method introduces variables to denote underspecified parts of the semantic representation. These variables are replaced with semantic representations as soon as they are determined. In the rest of this section, we first describe the incremental parsing which is the basis of our method. Next, we explain how to obtain a semantic representation from a partial parse tree constructed by incremental parsing.

Incremental Construction of CCG Derivation
Our method considers a CCG derivation as a tree structure, which we call a parse tree. Our method constructs a parse tree according to the incremental parsing formalism proposed in (Kato and Matsubara, 2009). This formalism extends the incremental parsing of (Collins and Roark, 2004) by introducing the adjoining operation used in Tree Adjoining Grammar (Joshi, 1985). This incremental parsing assigns a partial parse tree to every initial fragment of a sentence. The adjoining operation reduces local ambiguity caused by left-recursive structures and improves parsing accuracy (Kato and Matsubara, 2009). Furthermore, in the field of psycholinguistics, the adjoining operation has been introduced into human sentence processing models (e.g., (Sturt and Lombardo, 2005; Mazzei et al., 2007; Demberg et al., 2013)).

A Formal Description of Incremental Parsing
This section gives a formal description of the incremental parsing of (Kato and Matsubara, 2009). The parsing grammar consists of three types of elements: allowable tuples, allowable chains and auxiliary trees. Each allowable tuple is a 3-tuple ⟨X, Y, Z⟩, which means that the grammar allows a node labelled with Z to follow a node labelled with Y under their parent labelled with X. Each allowable chain is a sequence of labels; this corresponds to the sequence of labels on the path from a node to its leftmost descendant leaf in a parse tree. Each auxiliary tree consists of two nodes: a root and a foot. The label of the root is the same as that of the foot.
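The three grammar elements can be sketched as simple Python data structures (a minimal illustration under our own naming; the class and field names are assumptions, not taken from the paper or its implementation):

```python
from dataclasses import dataclass, field

# ⟨X, Y, Z⟩: the grammar allows a node Z to follow a node Y under parent X.
# Represented here as a plain 3-tuple of label strings.
grammar_tuples = {('S', 'NP', 'S\\NP')}

@dataclass
class AllowableChain:
    """Labels on a path from a node down to its leftmost descendant leaf."""
    labels: list = field(default_factory=list)   # root label first, leaf last

@dataclass
class AuxiliaryTree:
    """A two-node tree: a root and a foot carrying the same label."""
    root: str
    foot: str

    def __post_init__(self):
        # The formalism requires the root and foot labels to match.
        assert self.root == self.foot, 'root and foot labels must match'

chain = AllowableChain(['S', 'NP'])
aux = AuxiliaryTree('S\\NP', 'S\\NP')
```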
A parse tree is constructed by applying two operations: attaching and adjoining. The attaching operation combines a partial parse tree and an allowable chain. The operation is defined as follows:

attaching: Let σ be a partial parse tree and c be an allowable chain. Let η be the attachment site of σ. attach(σ, c) is the result of attaching c to η as the rightmost child (see Figure 4(a)).
Let X, Y and Z be the label of η, the label of the rightmost child of η and the label of the root of c, respectively. If the grammar does not have the allowable tuple ⟨X, Y, Z⟩, attach(σ, c) is not allowed by the grammar. Next, we give the definition of the adjoining operation, which inserts an auxiliary tree into a partial parse tree. The operation is defined as follows:

adjoining: Let σ be a partial parse tree and a be an auxiliary tree. Let η be the adjunction site of σ. adjoin(σ, a) is the result of splitting σ at η and combining the upper tree of σ with the root of a and the lower tree of σ with the foot of a (see Figure 4(b)). If the label of η is not the same as that of the foot of a, adjoin(σ, a) is undefined.
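The two operations can be sketched on a toy tree representation (a rough illustration under our own encoding: nodes are dicts, and the auxiliary tree is reduced to its matching root/foot label, so adjoining simply splices a new node between η and its former subtree):

```python
# Minimal sketch of attaching and adjoining (our own simplified encoding).

def node(label, children=None):
    return {'label': label, 'children': children or []}

def attach(site, chain_labels):
    """Attach an allowable chain below `site` as its rightmost child.
    Each chain node gets the next node as its leftmost (only) child."""
    top = bottom = node(chain_labels[0])
    for lab in chain_labels[1:]:
        child = node(lab)
        bottom['children'].append(child)
        bottom = child
    site['children'].append(top)
    return top

def adjoin(site):
    """Splice an auxiliary tree at `site`: the root takes the site's place
    in the upper tree, and the foot (same label) inherits the site's
    former subtree."""
    foot = node(site['label'], site['children'])
    site['children'] = [foot]
    return foot

root = node('S')
np = attach(root, ['NP'])   # leftmost child of the S node
```

A real implementation would also check the allowable tuple ⟨X, Y, Z⟩ before attaching and the foot-label match before adjoining; both checks are omitted here for brevity.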
Here, we give the definitions of the attachment site and the adjunction site. These sites are defined so that a parse tree is constructed from left to right. We say that a node η is complete if η satisfies the following conditions:

• All children of η are instantiated and complete.

• The adjoining operation is not applicable to η. By "applicable", we mean that the grammar has an auxiliary tree whose foot label is identical to that of η and the adjoining operation has not yet been applied to η.
The attachment site of σ is defined as the node η satisfying the following conditions:

• Not all children of η are instantiated.
• All instantiated children of η are complete.
The adjunction site of σ is defined as the node η satisfying the following conditions:

• All children of η are instantiated and complete.
• Adjoining operation is applicable to η.
Finally, we introduce the nil-adjoining operation, which changes not the partial parse tree but node states. When this operation is applied to a node, we deem that the adjoining operation has been applied to the node. This affects whether or not each node in the partial parse tree is complete. The symbol nil designates this operation.
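The completeness condition above is recursive and can be sketched as a small predicate (a toy encoding of our own: each node records how many children the grammar expects and whether adjoining is still applicable to it; these field names are assumptions):

```python
# Sketch of the completeness check over a toy node encoding.

def is_complete(n):
    """A node is complete iff all expected children are instantiated and
    complete, and the adjoining operation is not applicable to it."""
    children_done = (len(n['children']) == n['expected']
                     and all(is_complete(c) for c in n['children']))
    return children_done and not n['adjoinable']

leaf = {'children': [], 'expected': 0, 'adjoinable': False}
vp = {'children': [leaf], 'expected': 2, 'adjoinable': False}  # child missing
open_leaf = {'children': [], 'expected': 0, 'adjoinable': True}
```

Under this encoding, nil-adjoining is simply flipping `adjoinable` to False without touching the tree, which may turn the node (and, transitively, its ancestors) complete.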

Constructing CCG Derivations
First of all, we show an example of the incremental construction process of CCG derivations in our proposed method; see Figure 5. An attaching operation is represented as a solid arrow labelled with an allowable chain. An adjoining operation is represented as a dotted arrow labelled with an auxiliary tree. The subscript i of a node indicates that the node is instantiated at the point when the i-th word w_i is consumed. The solid boxes mean that the nodes are complete. The dotted box indicates that the adjoining operation is applicable to the node. The symbol '*' means that the annotated node is introduced by an adjoining operation (this node corresponds to the root of the auxiliary tree); we call it an adjoined node. Each node in a partial parse tree is labelled with a syntactic category and a rule type (or a word). No semantic representations are assigned, because each partial parse tree includes underspecified parts and it is impossible to determine their contents. This example demonstrates that each initial fragment has a partial parse tree which connects all the words in the fragment.

Figure 4: Attaching operation and adjoining operation.
Next, we consider the parsing grammar for CCG derivations. We do not need any allowable tuples, since the CCG rules determine the syntactic category of the node which follows a given node. For example, when a parent node is labelled with category S and rule type <, and its leftmost child is labelled with category NP, the following node must be labelled with S\NP; the rule type is arbitrary. Of course, we can also define allowable tuples to restrict the rule type.
Each node of the allowable chains and the auxiliary trees is also labelled with a category and a rule type, as shown in Figure 5. When an auxiliary tree a is adjoined to a partial parse tree at a node η, the label of η must be the same as that of the foot of a; that is, cat(η) = cat(foot(a)) and rule(η) = rule(foot(a)) hold. Here, we write cat(η) and rule(η) for the category and the rule type of a node η, respectively, and foot(a) for the foot node of an auxiliary tree a.
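The observation that a CCG rule determines the category of the following node can be made concrete with a small function (a rough sketch covering only the two function application rules; the string-based category encoding and helper names are our own assumptions):

```python
def wrap(cat):
    """Parenthesize a complex category when it becomes a subcategory."""
    return '(%s)' % cat if ('\\' in cat or '/' in cat) else cat

def next_category(parent, rule, left_child):
    """Category forced on the node that follows `left_child` under `parent`.
    Only forward (>) and backward (<) application are sketched here."""
    if rule == '<':                  # backward application:  Y  X\Y  =>  X
        # The left child is the argument Y, so the right child is parent\Y.
        return wrap(parent) + '\\' + wrap(left_child)
    if rule == '>':                  # forward application:  X/Y  Y  =>  X
        # The left child is the functor X/Y, so the right child is Y.
        x, y = left_child.rsplit('/', 1)
        return y
    raise NotImplementedError(rule)

next_category('S', '<', 'NP')        # 'S\\NP'
```

For instance, a parent S derived by < with leftmost child NP forces the following node to be S\NP, exactly as in the example above; the rule type of that node remains unconstrained.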

Incremental Semantic Construction
This section presents our incremental semantic construction procedure. For each initial fragment, our method derives a semantic representation from the partial parse tree obtained by the incremental construction process. The semantic representation is composed as follows:

• Construct a function t_i which adds the information about the word w_i to the semantic representation s_{i−1} for w_1 ··· w_{i−1}. The function is obtained from the nodes which are instantiated at the point when the word w_i is consumed.
• Apply the function t_i to the semantic representation s_{i−1}. That is, the semantic representation for w_1 ··· w_i is s_i = t_i(s_{i−1}).
We call the function t_i the semantic transition function (or transition function for short). The key point is how to construct the semantic transition function for a word, which we explain in the following.
To construct a semantic transition function t_i, our method assigns a pair ⟨α, M⟩ to each node η ∈ N_i(σ), where N_i(σ) is the set of the nodes in a partial parse tree σ which are instantiated at the point when the i-th word w_i is consumed. Here, α is a sequence of variables and M is a semantic representation. The variables in α occur in M and represent underspecified parts of the semantic representation M. The semantic representation M conveys information about the word w_i. The variables are expected to be specified in the order of α. A transition function is obtained from such a pair.

Semantic Construction without Adjoining Operation
For ease of explanation, we first describe the construction of the transition function in the case where the adjoining operation is not used. Below, arity(R) is the number of elements on the left-hand side of rule R, and C_R[M_1, ..., M_n] is the result of combining the semantic representations M_1, ..., M_n using rule R, where n must be equal to arity(R). The procedure for constructing a transition function is as follows:

1. For the leaf node η ∈ N_i(σ), if cat(η) : M is a lexical entry for w_i, assign ⟨ε, M⟩ to η.

2. For each non-leaf node η ∈ N_i(σ), let ⟨α, M⟩ be the pair assigned to its leftmost child. Let R be rule(η) and n be arity(R). Assign ⟨αy_2 ... y_n, C_R[M, y_2, ..., y_n]⟩ to η, where y_2, ..., y_n are fresh variables.
3. Let ⟨α, M⟩ be the pair assigned to the highest node in N_i(σ). The semantic transition function t_i is defined as follows:

λsα.sM
where s is a fresh variable.
By applying semantic transition functions, our method realizes incremental semantic construction. All semantic representations for initial fragments are of the form λxα′.M′, where xα′ is a sequence of variables designating underspecified parts in a semantic representation M′ (x is the first variable). By applying the semantic transition function λsα.sM, we obtain the following semantic representation:

(λsα.sM)(λxα′.M′) ↠β λαα′.M′[x := M]

The result is of the same form: the underspecified part designated by the variable x is replaced with M, which is specified by the word w_i.

As an example of our incremental semantic construction, let us consider the sentence "Anna met Manny." Figure 6 shows examples of semantic transition functions. The initial semantic representation is the identity function λx.x. For the word "Anna", the transition function shown in Figure 6(a) is constructed. By applying this function to the initial semantic representation, we obtain the following semantic representation for the initial fragment "Anna":

λy.y anna′ (1)

Next, by applying the semantic transition function for "met", which is shown in Figure 6(b), to the semantic representation (1), the following one is obtained for the initial fragment "Anna met":

(λsy.s(meet′ y))(λy.y anna′) ↠β λy.meet′ y anna′ (2)

This semantic representation captures the predicate–argument relation between anna′ and meet′. Finally, by applying the semantic transition function λs.s manny′ to the semantic representation (2), we can obtain the following one:

(λs.s manny′)(λy.meet′ y anna′) ↠β meet′ manny′ anna′ (3)

This semantic representation is the same as that of the normal form derivation.
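This worked example can be reproduced with Python closures standing in for λ-terms (a toy re-implementation; the tuple encoding of meet′, anna′ and manny′ is our own assumption, while the shapes of the transition functions follow the λ-terms above):

```python
from functools import reduce

# λ-terms as closures; meet' takes its object first, then its subject.
meet = lambda obj: lambda subj: ('meet', subj, obj)

# Semantic transition functions for "Anna met Manny"
t_anna = lambda s: (lambda y: s(y('anna')))   # λsy.s(y anna')
t_met = lambda s: (lambda y: s(meet(y)))      # λsy.s(meet' y)
t_manny = lambda s: s('manny')                # λs.s manny'

identity = lambda x: x                        # initial representation λx.x

s1 = t_anna(identity)    # (1)  λy.y anna'
s2 = t_met(s1)           # (2)  λy.meet' y anna'
s3 = t_manny(s2)         #      meet' manny' anna'

# Equivalently, the whole process is a left fold: s_i = t_i(s_{i-1}).
s3_folded = reduce(lambda s, t: t(s), [t_anna, t_met, t_manny], identity)
```

The fold makes explicit that each word contributes one function application to the running semantic representation, so the interpretation is available after every word.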

Semantic Construction Using Adjoining Operation
In this section, we extend the transition function construction procedure to allow the adjoining operation.
Figure 6: Examples of semantic transition function construction.

For a node η ∈ N_i(σ) which is a node of an allowable chain, we modify steps 1 and 2 in the transition function construction procedure as follows:
• Let ⟨α, M⟩ be the pair assigned to η in the version without the adjoining operation. If the adjoining operation is applicable to η, assign the pair ⟨αz, zM⟩ to η instead of ⟨α, M⟩, where z is a fresh variable.
The variable z is used to update the semantic representation when the adjoining operation is applied at η. When the nil-adjoining operation is applied to η, the variable z is replaced with the identity function λx.x; that is, after applying λs.s(λx.x) to the semantic representation s_{i−1}, the semantic transition function t_i is applied.
For an adjoined node η ∈ N_i(σ), the modified procedure assigns a pair to η in the following way:

• Let ⟨α, M⟩ be the pair assigned to the root node of the allowable chain which is attached under η. Let R be rule(η) and n be arity(R). If the adjoining operation is applicable to η, assign the following pair to η:

⟨αy_3 ... y_n z, λx.zC_R[x, M, y_3, ..., y_n]⟩

Otherwise, assign the following pair to η:

⟨αy_3 ... y_n, λx.C_R[x, M, y_3, ..., y_n]⟩

Here, x, y_3, ..., y_n and z are fresh variables.
The pair assignment for a node to which the adjoining operation is applicable and the one for an adjoined node work cooperatively (see Figure 7). If the adjoining operation is applicable to a node, a fresh variable z is introduced into the semantic representation. When the adjoining operation is applied to the node, this variable is replaced with a function of the form λx.C_R[x, M_2, ...], which receives the semantic representation of the first child and returns the result of the semantic composition. Figure 6(c) shows an example of constructing the transition function where the adjoining operation is applicable to the node (S\NP)/NP. Figure 6(d) shows an example of constructing the transition function where the node (S\NP)/NP is an adjoined node. The transition function is applied in the same way as in the version without the adjoining operation. Table 1 shows an example of the semantic representations constructed by our method.
As an example, let us consider the initial fragment "Anna met..." By applying the transition function shown in Figure 6(c) to the semantic representation (1), we obtain the semantic representation #2 shown in Table 1.
In the case where the next word is "Manny", the nil-adjoining operation is applied to the node (S\NP)/NP; that is, the function λs.s(λx.x) is applied to #2. The result is identical to the semantic representation (2); therefore, we obtain the semantic representation (3) for "Anna met Manny".
Next, let us consider the case where the word "and" follows the initial fragment "Anna met." In this case, the derivation is constructed as shown in the lower part of Figure 5. The semantic transition function for the word "and" is constructed as shown in Figure 6(d). By applying this function to the semantic representation #2, we obtain the semantic representation #3. Furthermore, if the word sequence "might marry Manny" follows this initial fragment, the semantic representations #4, #5 and #6 are obtained in this order. This example demonstrates that our method can incrementally construct semantic representations for sentences including coordinate structures. In comparison, incremental derivation approaches have cases where no semantic representations are assigned to initial fragments. Table 2 shows semantic representations assigned using incremental derivations; there exist initial fragments which have no semantic representations, as discussed in Section 3. (The initial fragment "Anna met" can have the semantic representation λx.meet′ x anna′ as shown in Figure 3(a). However, the derivation which has this semantic representation is not a partial structure of the incremental derivation shown in Figure 3(b); that is, it is not consistent with the derivation of "Anna met and might marry Manny.")
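The role of the variable z in this coordination example can be made concrete with closures (a toy sketch under our own encodings of meet′, marry′ and a binary and′, with might′ omitted for brevity; this illustrates the hole mechanism, not the exact terms of Table 1):

```python
# λ-terms as closures; transitive verbs take object then subject.
meet = lambda obj: lambda subj: ('meet', subj, obj)
marry = lambda obj: lambda subj: ('marry', subj, obj)

# Representation for "Anna met" with a hole z over the verb position:
# z receives meet' and may wrap it before composition continues.
s2_hole = lambda z: (lambda y: z(meet)(y)('anna'))

# Case 1: nothing adjoins. Nil-adjoining fills z with the identity λx.x,
# recovering the plain representation λy.meet' y anna'.
s2 = s2_hole(lambda x: x)

# Case 2: "and (might) marry" adjoins. z coordinates meet' with marry',
# distributing the shared object and subject over both conjuncts.
coord = lambda f: (lambda obj: lambda subj:
                   ('and', f(obj)(subj), marry(obj)(subj)))
s2_coord = s2_hole(coord)

s2('manny')        # meet' manny' anna'
s2_coord('manny')  # and'(meet' manny' anna')(marry' manny' anny')-style term
```

Because the hole is filled only when the continuation is known, the same fragment "Anna met" supports both the plain and the coordinated continuation without re-deriving its semantics.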

Related Work
Our incremental semantic construction is based on the λ-calculus. There have been several methods of incremental semantic construction using the λ-calculus. Pulman (1985) has developed an incremental parser which uses context-free rules annotated with semantic representations. The parsing process proceeds on a word-by-word basis, but its intermediate structure is a stack; that is, the parser does not assign a fully-connected semantic representation to each initial fragment. Milward (1995) has proposed an incremental semantic construction method based on Categorial Grammar. The method uses two types of transition functions: state-application and state-prediction. Our semantic transition function is similar to these functions. However, our method is more general than Milward's, which cannot produce CCG derivations since it can deal only with function application.
There are other approaches to incremental semantic construction which use different formalisms. Purver et al. (2011) have developed a dialogue system based on Dynamic Syntax (DS) (Kempson et al., 2001), which provides an incremental framework for constructing semantic representations. Peldszus and Schlangen (2012) have proposed incremental semantic construction based on Robust Minimal Recursion Semantics (RMRS) (Copestake, 2007). Sayeed and Demberg (2012) have proposed incremental semantic construction for PLTAG (Demberg et al., 2013). It is unclear how to construct a wide-coverage grammar (with semantic annotation) in these frameworks. On the other hand, our method can use a CCG-based lexicon (e.g., (Bos, 2009)) directly. Although our method requires a set of allowable chains and auxiliary trees in addition to such a lexicon, these can easily be extracted from CCGbank (Hockenmaier and Steedman, 2007) using the method proposed in (Kato and Matsubara, 2009).

Conclusion
This paper proposed a CCG-based method of incrementally constructing semantic representations. Unlike previous approaches, ours is based on normal form derivations. In this paper, we focused on the formal aspects of our method: we defined the semantic transition function to obtain a semantic representation for each initial fragment of an input sentence.
Another important issue is how to interpret intermediate semantic representations for initial fragments. To our knowledge, there is little work in this direction. In future work, we will explore a model-theoretic approach to this problem.