Abstract Categorial Parsing as Linear Logic Programming

This paper shows how the parsing problem for general Abstract Categorial Grammars can be reduced to the provability problem for Multiplicative Exponential Linear Logic. It essentially follows a similar reduction by Kanazawa, who has shown how the parsing problem for second-order Abstract Categorial Grammars reduces to datalog queries.

Introduction
Kanazawa (2007; 2011) has shown how parsing and generation may be reduced to datalog queries for a class of grammars that encompasses mildly context-sensitive formalisms. These grammars, which he calls context-free λ-term grammars, correspond to second-order abstract categorial grammars (de Groote, 2001).
In this paper, we show how Kanazawa's reduction may be carried out in the case of abstract categorial grammars of a degree higher than two. The price to pay is that we do not end up with a datalog query, but with a provability problem in multiplicative exponential linear logic (Girard, 1987). This is of course a serious difference. In particular, it is not known whether the multiplicative exponential fragment of linear logic is decidable.
The paper is organized as follows. Section 2 presents some mathematical preliminaries concerning the linear λ-calculus. We then introduce, in Section 3, the notion of abstract categorial grammar. Section 4 is the core of the paper, where we explain Kanazawa's reduction. To this end, we proceed by stepwise refinement. We first introduce an obviously correct but inefficient parsing algorithm. We then improve it by successive correctness-preserving transformations. Finally, we conclude in Section 5.

Linear λ-calculus
We assume that the reader has some acquaintance with the basic concepts of the (simply typed) λ-calculus. Nevertheless, in order to fix the terminology and the notations, we briefly recall the main definitions and properties that will be needed in the sequel. In particular, we review the notions of linear implicative type, higher-order linear signature, and linear λ-term built upon a higher-order linear signature.
Let A be a set of atomic types. The set T(A) of linear implicative types built upon A is inductively defined as follows: 1. if a ∈ A, then a ∈ T(A); 2. if α, β ∈ T(A), then (α −• β) ∈ T(A).
Given two sets of atomic types, A and B, a mapping h : T(A) → T(B) is called a type homomorphism (or a type substitution) if it satisfies the following condition:
    h(α −• β) = h(α) −• h(β)
A type substitution that maps atomic types to atomic types is called a relabeling.
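The definitions of linear implicative types and of type homomorphisms can be made concrete with a small sketch. It relies on the fact that a type homomorphism is determined by its values on atomic types. The names Atom, Arrow, extend, and show are illustrative choices made here, not notation from the paper, and the ASCII arrow "-o" stands for −•.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Atom:
    """An atomic type a in A."""
    name: str


@dataclass(frozen=True)
class Arrow:
    """The linear implication (dom -o cod)."""
    dom: "Type"
    cod: "Type"


Type = Union[Atom, Arrow]


def extend(h_atoms: Callable[[str], Type]) -> Callable[[Type], Type]:
    """Homomorphic extension of a map defined on atomic types only."""
    def h(ty: Type) -> Type:
        if isinstance(ty, Atom):
            return h_atoms(ty.name)
        # The homomorphism condition: h(a -o b) = h(a) -o h(b).
        return Arrow(h(ty.dom), h(ty.cod))
    return h


def show(ty: Type) -> str:
    """Render a type with explicit parentheses."""
    if isinstance(ty, Atom):
        return ty.name
    return f"({show(ty.dom)} -o {show(ty.cod)})"


# A relabeling: atoms are sent to atoms (every atom s0, s1, ... goes to s).
forget = extend(lambda name: Atom("s") if name.startswith("s") else Atom(name))

# A general substitution: the atom N is sent to the non-atomic type s -o s.
subst = extend(lambda name: Arrow(Atom("s"), Atom("s")) if name == "N" else Atom(name))
```

For instance, show(forget(Arrow(Atom("s7"), Atom("s0")))) renders as "(s -o s)", whereas subst is a type substitution that is not a relabeling.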
A higher-order linear signature consists of a triple Σ = ⟨A, C, τ⟩, where: 1. A is a finite set of atomic types; 2. C is a finite set of constants; 3. τ : C → T(A) is a function that assigns to each constant in C a linear implicative type in T(A). Given a higher-order linear signature Σ, we write A_Σ, C_Σ, and τ_Σ for its respective components.
The above notion of linear implicative type is isomorphic to the usual notion of simple type. Consequently, there is no technical difference between a higher-order linear signature and a higher-order signature. The only reason for using the word linear is to emphasize that we will be concerned with the typing of the linear λ-terms, i.e., the λ-terms whose typing system corresponds to the implicative fragment of multiplicative linear logic (Girard, 1987).
Let X be a countably infinite set of λ-variables. The set Λ(Σ) of linear λ-terms built upon a higher-order linear signature Σ is inductively defined as follows: 1. if c ∈ C_Σ, then c ∈ Λ(Σ); 2. if x ∈ X, then x ∈ Λ(Σ); 3. if x ∈ X, t ∈ Λ(Σ), and x occurs free in t exactly once, then (λx. t) ∈ Λ(Σ); 4. if t, u ∈ Λ(Σ), and the sets of free variables of t and u are disjoint, then (t u) ∈ Λ(Σ).
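The side conditions of this inductive definition (a bound variable occurs free exactly once in the body of its abstraction, and the two parts of an application share no free variable) are easy to check mechanically. The following is a minimal sketch; the constructors Const, Var, Lam, and App and the helper names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


Term = Union[Const, Var, Lam, App]


def free_occurrences(t: Term) -> List[str]:
    """Free variable occurrences of t, listed with repetitions."""
    if isinstance(t, Const):
        return []
    if isinstance(t, Var):
        return [t.name]
    if isinstance(t, Lam):
        return [x for x in free_occurrences(t.body) if x != t.var]
    return free_occurrences(t.fun) + free_occurrences(t.arg)


def is_linear(t: Term) -> bool:
    """Check the side conditions of the abstraction and application clauses."""
    if isinstance(t, (Const, Var)):
        return True
    if isinstance(t, Lam):
        # The bound variable must occur free in the body exactly once.
        return is_linear(t.body) and free_occurrences(t.body).count(t.var) == 1
    # The free variables of operator and operand must be disjoint.
    fv_fun = set(free_occurrences(t.fun))
    fv_arg = set(free_occurrences(t.arg))
    return is_linear(t.fun) and is_linear(t.arg) and fv_fun.isdisjoint(fv_arg)
```

With these definitions, Lam("z", App(Const("man"), Var("z"))) is accepted, while Lam("x", Lam("y", Var("x"))) and Lam("x", App(Var("x"), Var("x"))) are rejected.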
Λ(Σ) is provided with the usual notions of capture-avoiding substitution, α-conversion, β-reduction, and η-reduction (Barendregt, 1984). Let t and u be linear λ-terms. We write t ↠_β u and t =_β u for the relations of β-reduction and β-conversion, respectively. We use similar notations for the relations of reduction and conversion induced by η and βη.
Let Σ_1 and Σ_2 be two signatures. We say that a mapping h : Λ(Σ_1) → Λ(Σ_2) is a λ-term homomorphism if it satisfies the following conditions:
    h(x) = x
    h(λx. t) = λx. h(t)
    h(t u) = h(t) h(u)
Given a higher-order linear signature Σ, each linear λ-term in Λ(Σ) may possibly be assigned a linear implicative type in T(A_Σ). This type assignment obeys the following typing rules:
    ⊢_Σ c : τ_Σ(c)
    x : α ⊢_Σ x : α
    if Γ, x : α ⊢_Σ t : β, then Γ ⊢_Σ (λx. t) : (α −• β)
    if Γ ⊢_Σ t : (α −• β) and Δ ⊢_Σ u : α, then Γ, Δ ⊢_Σ (t u) : β
We end this section by reviewing some properties that will turn out to be useful in the sequel.
The set of linear λ-terms being a subset of the set of simply typed λ-terms, it inherits the universal properties of the latter (e.g., strong normalization, or existence of a principal type scheme). It also satisfies the usual subject-reduction property.
The set of simply typed λ-terms, which is not closed in general under β-expansion, is known to be closed under linear β-expansion. Consequently, the set of linear λ-terms satisfies the subject-expansion property.
The subject-reduction property also holds for the relation of βη-reduction. This is not the case, however, for the subject-expansion property. This possible difficulty may be circumvented by using the notion of η-long form.
A linear λ-term is said to be in η-long form when each of its sub-terms of functional type is either a λ-abstraction or the operator of an application. The set of linear λ-terms in η-long form is closed under both β-reduction and β-expansion. Consequently, the following proposition holds.
Proposition 3 Let t and u be λ-terms in η-long form. Then, t =_βη u if and only if t =_β u.
In the sequel, we will often assume that the linear λ-terms under consideration are in η-long form. This will allow us to only consider β-reduction and β-expansion, while using the relation of βη-conversion as the notion of equality between linear λ-terms.
Finally, it is known from a categorical coherence theorem that every balanced simple type is inhabited by at most one λ-term up to βη-conversion (see Babaev and Solov'ev, 1982; Mints, 1981). It is also known that the principal type of a pure linear λ-term is balanced (Hirokawa, 1991). Consequently, the following property holds.
Proposition 4 Let t be a pure linear λ-term (i.e., a linear λ-term that does not contain any constant), and let Γ ⊢ t : α be its principal typing. If u is a pure linear λ-term such that Γ ⊢ u : α, then t =_βη u.
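The notion of a balanced typing can be illustrated by a small check. The text does not spell out the definition, so the sketch below assumes the reading that is usual for principal typings of linear terms: in a judgement Γ ⊢ α, every atomic type occurs exactly once positively and once negatively, hypotheses counting with reversed polarity. The encoding (atoms as strings, a pair (dom, cod) for dom −• cod) and the function names are illustrative assumptions.

```python
from collections import Counter


def polarities(ty, positive=True, acc=None):
    """Count signed occurrences of atoms in a type.

    Atoms are strings; a pair (dom, cod) stands for dom -o cod.
    """
    acc = Counter() if acc is None else acc
    if isinstance(ty, str):
        acc[(ty, positive)] += 1
    else:
        dom, cod = ty
        polarities(dom, not positive, acc)  # the domain flips polarity
        polarities(cod, positive, acc)
    return acc


def is_balanced(context, conclusion):
    """Assumed reading: each atom occurs once positively and once negatively."""
    acc = Counter()
    for ty in context.values():
        polarities(ty, positive=False, acc=acc)  # hypotheses count negatively
    polarities(conclusion, positive=True, acc=acc)
    atoms = {atom for (atom, _) in acc}
    return all(acc[(a, True)] == 1 and acc[(a, False)] == 1 for a in atoms)
```

Under this reading, the judgement f : a −• b, x : a ⊢ f x : b is balanced, whereas the closed type a −• (a −• a) is not.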

Abstract Categorial Grammar
This section gives the definition of an abstract categorial grammar (ACG, for short) (de Groote, 2001).
We first define a lexicon to be a morphism between higher-order linear signatures. Let Σ_1 = ⟨A_1, C_1, τ_1⟩ and Σ_2 = ⟨A_2, C_2, τ_2⟩ be two higher-order signatures. A lexicon L : Σ_1 → Σ_2 is a realization of Σ_1 into Σ_2, i.e., an interpretation of the atomic types of Σ_1 as types built upon A_2, together with an interpretation of the constants of Σ_1 as linear λ-terms built upon Σ_2. These two interpretations must be such that their homomorphic extensions commute with the typing relations. More formally, a lexicon L from Σ_1 to Σ_2 is defined to be a pair L = ⟨F, G⟩ such that: 1. F : A_1 → T(A_2) is a function that interprets the atomic types of Σ_1 as linear implicative types built upon A_2; 2. G : C_1 → Λ(Σ_2) is a function that interprets the constants of Σ_1 as linear λ-terms built upon Σ_2; 3. the interpretation functions are compatible with the typing relation, i.e., for any c ∈ C_1, the following typing judgement is derivable:
    ⊢_{Σ_2} G(c) : F̂(τ_1(c))    (1)
where F̂ is the unique homomorphic extension of F.
Note that Condition (1) compels G(c) to be typable with respect to the empty typing environment. This means that G interprets each constant c as a closed linear λ-term. Now, defining Ĝ to be the unique homomorphic extension of G, Condition (1) ensures that the following commutation property holds for every t ∈ Λ(Σ_1):
    if ⊢_{Σ_1} t : α, then ⊢_{Σ_2} Ĝ(t) : F̂(α)
In the sequel, given such a lexicon L = ⟨F, G⟩, L(a) will stand for either F̂(a) or Ĝ(a), according to the context.
We now define an abstract categorial grammar as a quadruple G = ⟨Σ_1, Σ_2, L, S⟩, where: 1. Σ_1 and Σ_2 are two higher-order linear signatures; they are called the abstract vocabulary and the object vocabulary, respectively; 2. L : Σ_1 → Σ_2 is a lexicon from the abstract vocabulary to the object vocabulary; 3. S is an atomic type of the abstract vocabulary; it is called the distinguished type of the grammar.
Every ACG G generates two languages: an abstract language, A(G), and an object language, O(G).
The abstract language, which may be seen as a set of abstract parse structures, is the set of closed linear λ-terms built upon the abstract vocabulary and whose type is the distinguished type of the grammar. It is formally defined as follows:
    A(G) = {t ∈ Λ(Σ_1) : ⊢_{Σ_1} t : S is derivable}
The object language, which may be seen as the set of surface forms generated by the grammar, is defined to be the image of the abstract language by the term homomorphism induced by the lexicon:
    O(G) = {t ∈ Λ(Σ_2) : ∃u ∈ A(G). t = L(u)}
Both the abstract language and the object language generated by an ACG are sets of linear λ-terms. This allows more specific data structures such as strings, trees, or first-order terms to be represented. A string of symbols, for instance, can be encoded as a composition of functions. Consider an arbitrary atomic type s, and define σ = s −• s to be the type of strings. Then, a string such as 'abbac' may be represented by the linear λ-term:
    λx. a (b (b (a (c x))))
where the atomic strings 'a', 'b', and 'c' are declared to be constants of type σ. In this setting, the empty word is represented by the identity function:
    ε = λx. x
and concatenation is defined to be functional composition:
    α + β = λx. α (β x)
which is indeed an associative operator that admits the identity function as a unit.
We end this section by giving a fragment of a categorial grammar that will serve as a running example throughout the rest of this paper. The abstract vocabulary, which specifies the abstract parse structures, is given in Fig. 1. In this signature, the atomic types (N, NP_s, NP_o, S, S) must be thought of as atomic syntactic categories. The lexicon, which is given in Fig. 2, allows the abstract structures to be transformed into surface forms. These surface forms are strings that are built upon an object vocabulary, Σ_2, which includes the following atomic strings as constants of type σ: man, woman, wise, a, seeks.
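Returning to the string encoding introduced in this section, the following sketch models it with ordinary Python functions. It is only an illustration: the atomic type s is modelled as a list of tokens, and the names atom, empty, concat, and read are assumptions made here.

```python
from typing import Callable, List

# A string of type s -o s is modelled as a function on token lists:
# the atomic string 'a' prepends the token "a" to whatever follows it.
Str = Callable[[List[str]], List[str]]


def atom(token: str) -> Str:
    """An atomic string, i.e., a constant of type s -o s."""
    return lambda rest: [token] + rest


def empty() -> Str:
    """The empty word: the identity function."""
    return lambda rest: rest


def concat(u: Str, v: Str) -> Str:
    """Concatenation as functional composition: first v, then u."""
    return lambda rest: u(v(rest))


def read(u: Str) -> str:
    """Observe an encoded string by applying it to the empty token list."""
    return " ".join(u([]))


a, b, c = atom("a"), atom("b"), atom("c")
abbac = concat(a, concat(b, concat(b, concat(a, c))))
```

Here read(abbac) gives "a b b a c", and regrouping the calls to concat or inserting empty() anywhere leaves the result unchanged, which reflects associativity and the unit law.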
For such a grammar, the parsing problem consists in deciding whether a possible surface form (i.e., a term t ∈ Λ(Σ_2)) belongs to the object language of the grammar. Spelling it out: is there an abstract parse structure (i.e., a term u ∈ Λ(Σ_1) of type S) whose image through the lexicon is the given surface form (i.e., L(u) = t)?
Consider, for instance, the following string:
    a + wise + woman + seeks + a + wise + man    (2)
One can show that it belongs to the object language of the grammar. Indeed, applying the lexicon to the abstract term (3) yields a λ-term that is βη-convertible to (2). In fact, (2) is even ambiguous, in the sense that there is another abstract term (4), essentially different from (3), whose image through the lexicon yields (2).

Development of the parsing algorithm
In this section, we develop a parsing algorithm based on proof-search in the implicative fragment of linear logic. We start with a simple non-deterministic algorithm, which is rather inefficient but whose correctness and semi-completeness are obvious. Then, we proceed by stepwise refinement, preserving the correctness and semi-completeness of the algorithm. By correctness, we mean that if the parsing algorithm answers positively then it is indeed the case that the input term belongs to the object language of the grammar. By semi-completeness, we mean that if the input term belongs to the object language of the grammar, then the parsing algorithm will eventually give a positive answer.
In the present state of knowledge, semi-completeness is the best we may expect. Indeed, the ACG membership problem is known to be equivalent to provability in multiplicative exponential linear logic (de Groote et al., 2004; Yoshinaka and Kanazawa, 2005), the decidability of which is still open.

Generate and test
Our starting point is a simple generate and test algorithm:
1. derive S using the rules of implicative linear logic with the types of the abstract constants (Fig. 3) as proper axioms;
2. interpret the obtained derivation as a linear λ-term (through the Curry-Howard isomorphism);
3. apply the lexicon to the resulting λ-term, and check whether it yields a term βη-convertible to the input term.
The above algorithm is obviously correct. It is also semi-complete because it enumerates all the terms of the abstract language. Now, if the input term belongs to the object language of the grammar then its abstract parse structure(s) will eventually appear in the enumeration.
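The shape of this generate and test loop can be sketched on a toy grammar. Everything in the sketch is a simplifying assumption: SIGNATURE and LEXICON are a hypothetical first-order fragment (not the vocabulary of Fig. 1 or the lexicon of Fig. 2), abstract terms are built directly as nested tuples instead of being read off derivations, and surface forms are plain word lists, so that equality of lists plays the role of βη-convertibility.

```python
from itertools import product

# Hypothetical toy abstract vocabulary: each constant has a list of
# argument categories and a result category.
SIGNATURE = {
    "MAN": ([], "N"),
    "WOMAN": ([], "N"),
    "WISE": (["N"], "N"),
    "A": (["N"], "NP"),
    "SEEKS": (["NP", "NP"], "S"),  # subject, then object
}

# Hypothetical lexicon: each constant is realized as a function on word lists.
LEXICON = {
    "MAN": lambda: ["man"],
    "WOMAN": lambda: ["woman"],
    "WISE": lambda n: ["wise"] + n,
    "A": lambda n: ["a"] + n,
    "SEEKS": lambda subj, obj: subj + ["seeks"] + obj,
}


def terms(category, depth):
    """Steps 1-2: enumerate abstract terms of a category with height <= depth."""
    if depth == 0:
        return
    for name, (args, result) in SIGNATURE.items():
        if result != category:
            continue
        for children in product(*(list(terms(a, depth - 1)) for a in args)):
            yield (name,) + children


def realize(term):
    """Step 3: apply the lexicon homomorphically."""
    name, *children = term
    return LEXICON[name](*(realize(child) for child in children))


def parse(words, max_depth=None):
    """Iterative deepening; without max_depth this is only a semi-decision
    procedure: it loops forever when the input is not in the language."""
    depth = 1
    while max_depth is None or depth <= max_depth:
        for t in terms("S", depth):
            if realize(t) == words:
                return t
        depth += 1
    return None
```

Calling parse("a wise woman seeks a wise man".split()) returns the tuple ("SEEKS", ("A", ("WISE", ("WOMAN",))), ("A", ("WISE", ("MAN",)))). In this first-order toy fragment the sentence has a single parse, so the ambiguity of the running example is not reproduced. The max_depth parameter is only there to make negative tests terminate.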

Type-driven search
The generate and test algorithm proceeds by trial and error without taking into account the form of the input term. In order to improve our algorithm, we must focus on the construction of an abstract term whose image by the lexicon would be the input term. To this end, we take advantage of Proposition 4.
In general, the input term is not a pure λ-term. Consequently, in order to apply Proposition 4, we must consider each occurrence of a constant in the input term as a fresh free variable. Applying this idea to our example, we obtain the following principal typing, which uniquely characterizes the input string (in η-long β-normal form):
    a_1 : s_1 −• s_0, wise_1 : s_2 −• s_1, woman : s_3 −• s_2, seeks : s_4 −• s_3, a_2 : s_5 −• s_4, wise_2 : s_6 −• s_5, man : s_7 −• s_6
    ⊢ λz. a_1 (wise_1 (woman (seeks (a_2 (wise_2 (man z)))))) : s_7 −• s_0
The types assigned to the constant occurrences of the input term induce a new specialized object vocabulary, which we will call Σ^S_2. We take for granted the definition of the forgetful homomorphism | · | : Σ^S_2 → Σ_2 that projects Σ^S_2 onto Σ_2. Roughly speaking, this forgetful homomorphism simply identifies the several occurrences of the same object constant. Note that, at the level of types, this forgetful homomorphism is a relabeling because the input string has been given in η-long form. In our case, this relabeling is the following one:
    |s_i| = s    (0 ≤ i ≤ 7)
The next step is to adapt the abstract vocabulary and the lexicon to this specialized object vocabulary. We start with the abstract atomic types. Let a ∈ A_{Σ_1}, and define the set ξ(a) as follows:
    ξ(a) = {α ∈ T(A_{Σ^S_2}) : |α| = L(a)}
For instance, we have:
    ξ(N) = {s_i −• s_j : 0 ≤ i, j ≤ 7}
Then we define the set of atomic types of the specialized abstract signature as follows:
    A_{Σ^S_1} = {a_α : a ∈ A_{Σ_1} and α ∈ ξ(a)}
and we let L^S(a_α) = α. Back to our example, this means that the specialized abstract signature contains 64 copies of N:
    N_{s_i −• s_j}    (0 ≤ i, j ≤ 7)
In order to accommodate the abstract constants, we look at the lexicon. Consider the first two lexical entries. Their typing, according to the specialized object vocabulary, is as follows:
    MAN := λz. man z : s_7 −• s_6
    WOMAN := λz. woman z : s_3 −• s_2
Accordingly, we let the specialized abstract vocabulary contain the following two constants:
    MAN : N_{s_7 −• s_6}
    WOMAN : N_{s_3 −• s_2}
Consider now the third entry, the one that introduces the object constant wise. It may be specialized along two dimensions. On the one hand, the object constant wise may be replaced by its first occurrence (wise_1) or by its second one (wise_2). On the other hand, each occurrence of the atomic type s may be instantiated by one of s_0, s_1, ..., s_7. This gives rise to 8,192 a priori possibilities (2 × 8^4, the type of this entry containing four occurrences of s). These possibilities, however, do not all correspond to actual typing judgements. Filtering out the ill-typed ones (which is effective since typing is decidable), we are left with 16 new lexical entries, and we add the corresponding 16 constants to the specialized abstract vocabulary.
By proceeding in the same way with the other lexical entries, we obtain a new specialized abstract signature Σ^S_1 together with a new specialized lexicon:
    L^S : Σ^S_1 → Σ^S_2
Clearly, there exists a forgetful homomorphism between Σ^S_1 and Σ_1, and the specialized abstract signature and specialized lexicon are such that the two forgetful homomorphisms commute with the two lexicons, i.e., |L^S(t)| = L(|t|) for every t ∈ Λ(Σ^S_1).
We may now use the specialized grammar to drive the proof-search on which the generate and test algorithm is based. Remember that the specialized object type assigned to the input string is s_7 −• s_0. Our parsing problem is then reduced to the following proof-search problem: derive S_{s_7 −• s_0} using the rules of implicative linear logic with the types of the specialized abstract constants as proper axioms.
Now, suppose that we derive S_{s_7 −• s_0}, and that t ∈ Λ(Σ^S_1) is the specialized abstract linear λ-term corresponding to this derivation. By construction of the specialized grammar, we have that:
    ⊢_{Σ^S_2} L^S(t) : s_7 −• s_0    (5)
Then, by Proposition 4, we have that
    L^S(t) =_βη λz. a_1 (wise_1 (woman (seeks (a_2 (wise_2 (man z))))))    (6)
because
    ⊢_{Σ^S_2} λz. a_1 (wise_1 (woman (seeks (a_2 (wise_2 (man z)))))) : s_7 −• s_0    (7)
amounts to a principal typing. Finally, by taking t′ = |t|, we obtain a term t′ ∈ Λ(Σ_1) such that:
    L(t′) =_βη λz. a (wise (woman (seeks (a (wise (man z))))))    (8)
This shows the correctness of the algorithm.
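The occurrence typing of the input string and the count of candidate specializations for the entry of wise can be replayed mechanically. The sketch below rests on an assumption that is not displayed in the text above: the lexical entry for wise is taken to be λn. λz. wise (n z), of object type (s −• s) −• (s −• s). A type s_i −• s_j is encoded as the pair (i, j), and all names are illustrative.

```python
from itertools import product

# Occurrence typing of the input string: the k-th word (0-based) receives
# the type s_{k+1} -o s_k, encoded here as the pair (k + 1, k).
words = "a wise woman seeks a wise man".split()
occurrences = {}
for k, w in enumerate(words):
    occurrences.setdefault(w, []).append((k + 1, k))

positions = range(len(words) + 1)  # the atoms s_0, ..., s_7


def well_typed(wise_type, a1, a2, a3, a4):
    """Does  λn. λz. wise (n z)  have type (s_a1 -o s_a2) -o (s_a3 -o s_a4)?

    With n : s_a1 -o s_a2 and z : s_a3, the application (n z) requires
    a1 == a3 and has type s_a2; an occurrence wise : s_dom -o s_cod then
    requires a2 == dom, and the result type s_cod must be s_a4.
    """
    dom, cod = wise_type
    return a1 == a3 and a2 == dom and a4 == cod


# Two occurrences of wise, and 8 choices for each of the 4 atoms.
candidates = [
    (wise_type, atoms)
    for wise_type in occurrences["wise"]
    for atoms in product(positions, repeat=4)
]
entries = [(w, atoms) for (w, atoms) in candidates if well_typed(w, *atoms)]
```

Under this assumption, len(candidates) is 8192 and len(entries) is 16: eight entries for each occurrence of wise, one per choice of the atom shared by n and z.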
To establish its semi-completeness, suppose that there exists an abstract linear λ-term t′ ∈ Λ(Σ_1) such that (8) holds. From this, one can easily construct a term t ∈ Λ(Σ^S_1) of type S such that |t| = t′ and Equation (6) holds. Since the lexical entries are given in η-long form, so is L^S(t). Then, because the specialized input term is in η-long β-normal form, by Proposition 3, we have that:
    L^S(t) ↠_β λz. a_1 (wise_1 (woman (seeks (a_2 (wise_2 (man z))))))    (9)
Then, (5) follows from (7) and (9) by Proposition 2. From this, it is not too difficult to establish that t is of type S_{s_7 −• s_0}.

Proof-search in the implicative fragment of linear logic
The type-driven algorithm that we have sketched presents two serious defects. On the one hand, the construction of the specialized grammar is both time and space consuming. For our simple running example, for instance, we would obtain 6,226 specialized lexical entries. On the other hand, the reduction depends upon the input string.
In order to circumvent these difficulties, consider again the specialized lexical entries corresponding to the third lexical entry of the original grammar. In fact, all the specialized object types assigned to these lexical entries are instances of the principal typing of the corresponding lexical entry of the original lexicon. This means that if the specialized object vocabulary assigns the constant wise a type of the form (10), then the specialized abstract vocabulary should contain abstract constants obeying the corresponding type scheme (11). Writing N[j, k] for N_{s_k −• s_j} and representing (10) by the predicate wise[i, j], we may represent the dependence between (10) and (11) by a linear logic sequent. Applying the same process to the other lexical entries, we end up with the set of sequents given in Fig. 4.
We give in Fig. 6 and Fig. 7 (in the annex) the derivations corresponding to the de dicto parsing (3) and to the de re parsing (4). These two derivations use the inference rules given in Fig. 5, which are equivalent to the sequents of Fig. 4.
Figure 5: The lexicon as a set of inference rules
A preliminary version of the results reported in this paper has been presented in a talk given in June 2007 at the Colloquium in Honor of Gérard Huet on the occasion of his 60th birthday. This work has been supported by the French agency Agence Nationale de la Recherche (ANR-12-CORD-0004).