Benoit Crabbé

Also published as: Benoît Crabbé


2019

Variable beam search for generative neural parsing and its relevance for the analysis of neuro-imaging signal
Benoit Crabbé | Murielle Fabre | Christophe Pallier
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

This paper describes a method of variable beam size inference for Recurrent Neural Network Grammars (RNNG), drawing inspiration from sequential Monte-Carlo methods such as particle filtering. The paper studies the relevance of such methods for speeding up the computations of direct generative parsing for RNNG. It also studies the potential cognitive interpretation of the underlying representations built by the search method (beam activity) through analysis of neuro-imaging signal.

Using Wiktionary as a resource for WSD : the case of French verbs
Vincent Segonne | Marie Candito | Benoît Crabbé
Proceedings of the 13th International Conference on Computational Semantics - Long Papers

As opposed to word sense induction, word sense disambiguation (WSD) has the advantage of using interpretable senses, but requires annotated data, which are quite rare for most languages except English (Miller et al. 1993; Fellbaum, 1998). In this paper, we investigate which strategy to adopt to achieve WSD for languages lacking data that was annotated specifically for the task, focusing on the particular case of verb disambiguation in French. We first study the usability of Eurosense (Bovi et al. 2017), a multilingual corpus extracted from Europarl (Koehn, 2005) and automatically annotated with BabelNet (Navigli and Ponzetto, 2010) senses. Such a resource opened up the way to supervised and semi-supervised WSD for resourceless languages like French. While this perspective looked promising, our evaluation on French verbs was inconclusive and showed that the quality of the annotated senses was not sufficient for supervised WSD on French verbs. Instead, we propose to use Wiktionary, a collaboratively edited, multilingual online dictionary, as a resource for WSD. Wiktionary provides both a sense inventory and manually sense-tagged examples, which can be used to train supervised and semi-supervised WSD systems. Yet, because the distribution of senses in the lexicographic examples found in Wiktionary differs from that of natural text, we then focus on studying the impact of training data size and sense distribution on WSD. Using state-of-the-art semi-supervised systems, we report experiments on Wiktionary-based WSD for French verbs, evaluated on FrenchSemEval (FSE), a new dataset of French verbs manually annotated with Wiktionary senses.

Unlexicalized Transition-based Discontinuous Constituency Parsing
Maximin Coavoux | Benoît Crabbé | Shay B. Cohen
Transactions of the Association for Computational Linguistics, Volume 7

Lexicalized parsing models are based on the assumptions that (i) constituents are organized around a lexical head and (ii) bilexical statistics are crucial to solve ambiguities. In this paper, we introduce an unlexicalized transition-based parser for discontinuous constituency structures, based on a structure-label transition system and a bi-LSTM scoring system. We compare it with lexicalized parsing models in order to address the question of lexicalization in the context of discontinuous constituency parsing. Our experiments show that unlexicalized models systematically achieve higher results than lexicalized models, and provide additional empirical evidence that lexicalization is not necessary to achieve strong parsing results. Our best unlexicalized model sets a new state of the art on English and German discontinuous constituency treebanks. We further provide a per-phenomenon analysis of its errors on discontinuous constituents.

2017

Incremental Discontinuous Phrase Structure Parsing with the GAP Transition
Maximin Coavoux | Benoît Crabbé
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

This article introduces a novel transition system for discontinuous lexicalized constituent parsing called SR-GAP. It is an extension of the shift-reduce algorithm with an additional gap transition. Evaluation on two German treebanks shows that SR-GAP outperforms the previous best transition-based discontinuous parser (Maier, 2015) by a large margin (it is notably twice as accurate on the prediction of discontinuous constituents), and is competitive with the state of the art (Fernández-González and Martins, 2015). As a side contribution, we adapt span features (Hall et al., 2014) to discontinuous parsing.

Multilingual Lexicalized Constituency Parsing with Word-Level Auxiliary Tasks
Maximin Coavoux | Benoît Crabbé
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

We introduce a constituency parser based on a bi-LSTM encoder adapted from recent work (Cross and Huang, 2016b; Kiperwasser and Goldberg, 2016), which can incorporate a lower-level character bi-LSTM (Ballesteros et al., 2015; Plank et al., 2016). We model two important interfaces of constituency parsing with auxiliary tasks supervised at the word level: (i) part-of-speech (POS) and morphological tagging, (ii) functional label prediction. On the SPMRL dataset, our parser obtains above state-of-the-art results on constituency parsing without requiring either predicted POS or morphological tags, and outputs labelled dependency trees.

2016

Neural Greedy Constituent Parsing with Dynamic Oracles
Maximin Coavoux | Benoît Crabbé
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Boosting for Efficient Model Selection for Syntactic Parsing
Rachel Bawden | Benoît Crabbé
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We present an efficient model selection method using boosting for transition-based constituency parsing. It is designed for exploring a high-dimensional search space defined by a large set of feature templates, as is typically the case, for example, when parsing morphologically rich languages. Our method removes the need to manually define heuristic constraints, which are often imposed in current state-of-the-art selection methods. Our experiments for French show that the method is more efficient and is also capable of producing compact, state-of-the-art models.

2015

Multilingual discriminative lexicalized phrase structure parsing
Benoit Crabbé
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Dependency length minimisation effects in short spans: a large-scale analysis of adjective placement in complex noun phrases
Kristina Gulordava | Paola Merlo | Benoit Crabbé
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

An LR-inspired generalized lexicalized phrase structure parser
Benoit Crabbé
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

A discriminative parser of the LR family for phrase structure parsing (Un analyseur discriminant de la famille LR pour l’analyse en constituants) [in French]
Benoît Crabbé
Proceedings of TALN 2014 (Volume 1: Long Papers)

2013

Towards a treebank of spoken French (Vers un treebank du français parlé) [in French]
Anne Abeillé | Benoit Crabbé
Proceedings of TALN 2013 (Volume 1: Long Papers)

XMG: eXtensible MetaGrammar
Benoît Crabbé | Denys Duchier | Claire Gardent | Joseph Le Roux | Yannick Parmentier
Computational Linguistics, Volume 39, Issue 3 - September 2013

2012

Ubiquitous Usage of a Broad Coverage French Corpus: Processing the Est Republicain corpus
Djamé Seddah | Marie Candito | Benoit Crabbé | Enrique Henestroza Anguiano
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

2011

Testing the Robustness of Online Word Segmentation: Effects of Linguistic Diversity and Phonetic Variation
Luc Boruta | Sharon Peperkamp | Benoît Crabbé | Emmanuel Dupoux
Proceedings of the 2nd Workshop on Cognitive Modeling and Computational Linguistics

2010

Statistical French Dependency Parsing: Treebank Conversion and First Results
Marie Candito | Benoît Crabbé | Pascal Denis
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

2009

On Statistical Parsing of French with Supervised and Semi-Supervised Strategies
Marie Candito | Benoit Crabbé | Djamé Seddah
Proceedings of the EACL 2009 Workshop on Computational Linguistic Aspects of Grammatical Inference

Improving generative statistical parsing with semi-supervised word clustering
Marie Candito | Benoît Crabbé
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)

Cross parser evaluation : a French Treebanks study
Djamé Seddah | Marie Candito | Benoît Crabbé
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)

2006

A Constraint Driven Metagrammar
Joseph Le Roux | Benoît Crabbé | Yannick Parmentier
Proceedings of the Eighth International Workshop on Tree Adjoining Grammar and Related Formalisms

Increasing the coverage of a domain independent dialogue lexicon with VERBNET
Benoit Crabbé | Myroslava O. Dzikovska | William de Beaumont | Mary Swift
Proceedings of the Third Workshop on Scalable Natural Language Understanding

XMG - An Expressive Formalism for Describing Tree-Based Grammars
Yannick Parmentier | Joseph Le Roux | Benoît Crabbé
Demonstrations

2002

A new metagrammar compiler
Bertrand Gaiffe | Benoit Crabbé | Azim Roussanaly
Proceedings of the Sixth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+6)